Wiki10-31K

mldr.datasets::get.mldr("Wiki10-31K")

Select your download

Partitions: select your desired partitioning strategy, validation and format

Partitioning strategies: Random, Stratified, Iterative stratified
Validation schemes: Hold out, 2x5-fold cross validation, 10-fold cross validation
Formats, available for each combination of strategy and validation scheme: MULAN, MEKA, LibSVM, KEEL, mldr

Summary

Instances 20762
Attributes 132876
Inputs 101938
Labels 30938
Labelsets 20693
Single labelsets 20625
Max frequency 3
Cardinality 18.7616
Density 0.0006
Mean IR 5341.8088
SCUMBLE 0.8387
TCS 31.8094
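
The measures above can be illustrated on a small example. The following is a minimal base-R sketch, not the code used to produce this table. It assumes the usual definitions from the multi-label literature: cardinality is the mean number of active labels per instance, density is cardinality divided by the number of labels, Mean IR is the average per-label imbalance ratio, and TCS is the natural logarithm of inputs × labels × labelsets. The toy matrix `Y` and all variable names are hypothetical. The last lines check that the Density and TCS reported for Wiki10-31K are consistent with these assumed definitions.

```r
# Toy binary label matrix: 5 instances (rows) x 3 labels (columns).
Y <- matrix(c(1, 0, 0,
              1, 1, 0,
              1, 1, 0,
              1, 0, 1,
              0, 1, 0),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "c")))

# Cardinality: mean number of active labels per instance.
cardinality <- mean(rowSums(Y))

# Density: cardinality normalised by the number of labels.
density <- cardinality / ncol(Y)

# Labelsets: distinct label combinations and how often each one occurs.
labelset_ids  <- apply(Y, 1, paste, collapse = "")
labelset_freq <- table(labelset_ids)
n_labelsets        <- length(labelset_freq)     # distinct combinations
n_single_labelsets <- sum(labelset_freq == 1)   # combinations seen only once
max_frequency      <- max(labelset_freq)        # count of the most common one

# Imbalance ratio per label (most frequent label count / this label's count)
# and its mean over all labels.
label_counts <- colSums(Y)
ir_lbl  <- max(label_counts) / label_counts
mean_ir <- mean(ir_lbl)

# Consistency check against the figures reported in the summary
# (assumed definitions, see the paragraph above).
wiki_density <- 18.7616 / 30938                  # Cardinality / Labels
wiki_tcs     <- log(101938) + log(30938) + log(20693)  # ln(Inputs * Labels * Labelsets)
```

With the reported values, `round(wiki_density, 4)` gives 0.0006 and `wiki_tcs` agrees with the listed TCS to four decimal places, which suggests the summary follows these definitions. Attributes is also the sum of Inputs and Labels (101938 + 30938 = 132876).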

Citation

Bhatia, Kush; Jain, Himanshu; Kar, Purushottam; Varma, Manik; Jain, Prateek (2015). Sparse local embeddings for extreme multi-label classification. In Advances in neural information processing systems, 730--738.
@inproceedings{bhatia2015sparse,
  title={Sparse local embeddings for extreme multi-label classification},
  author={Bhatia, Kush and Jain, Himanshu and Kar, Purushottam and Varma, Manik and Jain, Prateek},
  booktitle={Advances in neural information processing systems},
  pages={730--738},
  year={2015}
}

Concurrence plot

In this concurrence plot, sectors represent labels and links between them depict label co-occurrences. SCUMBLE is a measure designed to assess the concurrence among imbalanced labels.