yahoo_social

mldr.datasets::get.mldr("yahoo_social")

Select your download

Partitions: select a partitioning strategy, a validation scheme, and a file format. Every combination is available.

Partitioning strategy: Random, Stratified, Iterative stratified
Validation: Hold out, 2x5-fold cross validation, 10-fold cross validation
Format: MULAN, MEKA, LibSVM, KEEL, mldr

Summary

Instances 12111
Attributes 52389
Inputs 52350
Labels 39
Labelsets 361
Single labelsets 179
Max frequency 4062
Cardinality 1.2793
Density 0.0328
Mean IR 257.7044
SCUMBLE 0.049
TCS 20.4181
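
Several of the summary figures are tied together by simple formulas. The R sketch below rechecks them from the published values. It assumes the definitions commonly used in the multilabel literature: attributes are inputs plus labels, density is cardinality divided by the number of labels, and TCS is the natural logarithm of inputs × labels × labelsets. The variable names are illustrative and are not part of any package.

```r
# Values copied from the Summary figures on this page.
attributes  <- 52389
inputs      <- 52350
labels      <- 39
labelsets   <- 361
cardinality <- 1.2793
density     <- 0.0328
tcs         <- 20.4181

# Attributes = input features + labels.
attributes_check <- inputs + labels

# Density = cardinality normalised by the number of labels (assumed definition).
density_check <- cardinality / labels

# TCS = natural log of inputs * labels * distinct labelsets (assumed definition).
tcs_check <- log(inputs * labels * labelsets)

# Print the recomputed values next to the published ones.
cat("attributes:", attributes_check, "published:", attributes, "\n")
cat("density:   ", round(density_check, 4), "published:", density, "\n")
cat("TCS:       ", round(tcs_check, 4), "published:", tcs, "\n")
```

Under these assumptions the recomputed values should agree with the published ones up to rounding, which is a quick sanity check after loading the dataset from a downloaded partition.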

Citation

Ueda, N.; Saito, K. (2002). Parametric mixture models for multi-labeled text. In Advances in neural information processing systems, 721--728.
@inproceedings{,
  title="Parametric mixture models for multi-labeled text",
  author="Ueda, N. and Saito, K.",
  booktitle="Advances in neural information processing systems",
  pages="721--728",
  year="2002"
}

Concurrence plot

In this concurrence plot, sectors represent labels and the links between them depict label co-occurrences. SCUMBLE measures the degree to which minority and majority labels appear together in the same instances.
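
To make the idea of label co-occurrence concrete, here is a minimal base R sketch on a small, made-up label matrix (it does not use the yahoo_social data). It counts how often each pair of labels appears in the same instance, which is the kind of quantity the links in a concurrence plot depict, and computes the per-label imbalance ratio (frequency of the most common label divided by the frequency of each label), whose average is the Mean IR. The matrix `Y` and all variable names are hypothetical.

```r
# Toy binary label matrix: rows are instances, columns are labels (made-up data).
Y <- matrix(c(1, 1, 0,
              1, 0, 0,
              1, 1, 0,
              1, 0, 1,
              1, 0, 0),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("A", "B", "C")))

# Co-occurrence counts: entry [i, j] is the number of instances
# that carry both label i and label j.
cooc <- crossprod(Y)   # same as t(Y) %*% Y
diag(cooc) <- 0        # a label paired with itself is not a co-occurrence

# Per-label imbalance ratio: count of the most frequent label
# divided by the count of each label.
counts  <- colSums(Y)
irlbl   <- max(counts) / counts
mean_ir <- mean(irlbl)

print(cooc)
print(irlbl)
print(mean_ir)
```

In this toy example the rare label C only ever appears together with the dominant label A. That is the situation SCUMBLE is meant to flag, although SCUMBLE itself is not computed here.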