yahoo_society

mldr.datasets::get.mldr("yahoo_society")
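
As a rough sketch of how the call above might be used, the snippet below loads the dataset and reads a few of its summary measures. It assumes that the `mldr.datasets` package is installed, that `get.mldr()` returns an `mldr` object (downloading the data on first use), and that the object exposes a `measures` list with the field names used by the `mldr` package. Treat those names as assumptions to check against your installed version.

```r
# Sketch only: requires the mldr.datasets package and network access
# the first time the dataset is fetched.
if (requireNamespace("mldr.datasets", quietly = TRUE)) {
  # Assumption: get.mldr() returns the dataset as an "mldr" object.
  yahoo_society <- mldr.datasets::get.mldr("yahoo_society")

  # Assumption: summary measures are stored in the `measures` list.
  m <- yahoo_society$measures
  print(m$num.instances)  # the Summary section lists 14512 instances
  print(m$num.labels)     # the Summary section lists 27 labels
  print(m$cardinality)    # the Summary section lists 1.6704
}
```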

Select your download

Partitions: select your desired partitioning strategy, validation and format

Partitioning strategies: Random, Stratified, Iterative stratified
Validation schemes: Hold out, 2x5-fold cross validation, 10-fold cross validation
Formats (offered for every strategy and validation scheme): MULAN, MEKA, LibSVM, KEEL, mldr

Summary

Instances 14512
Attributes 31829
Inputs 31802
Labels 27
Labelsets 1054
Single labelsets 624
Max frequency 4092
Cardinality 1.6704
Density 0.0619
Mean IR 302.0678
SCUMBLE 0.0957
TCS 20.6235
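
Several of the figures above are related to each other, and the following base R sketch checks those relations and then recomputes the simpler measures on a small made-up label matrix. The definitions used here are the usual ones from the multilabel literature (density as cardinality divided by the number of labels, TCS as the natural logarithm of inputs × labels × labelsets, mean IR as the average ratio between the most frequent label's count and each label's count). The page itself does not state them, so read them as assumptions. The matrix `Y` and all variable names are hypothetical.

```r
# Values copied from the Summary table above.
inputs <- 31802; labels <- 27; attributes <- 31829
labelsets <- 1054; cardinality <- 1.6704

attrs_ok    <- (inputs + labels) == attributes   # attributes = inputs + labels
density_chk <- round(cardinality / labels, 4)    # compare with the listed 0.0619
tcs_chk     <- log(inputs * labels * labelsets)  # compare with the listed 20.6235

# Toy binary label matrix (hypothetical): 6 instances x 3 labels.
Y <- matrix(c(1, 0, 0,
              1, 1, 0,
              1, 0, 0,
              0, 1, 1,
              1, 1, 0,
              0, 0, 1),
            ncol = 3, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "c")))

toy_cardinality <- mean(rowSums(Y))            # mean active labels per instance
toy_density     <- toy_cardinality / ncol(Y)   # cardinality / number of labels

# A labelset is one distinct combination of active labels.
combos               <- table(apply(Y, 1, paste, collapse = ""))
toy_labelsets        <- length(combos)         # distinct combinations
toy_single_labelsets <- sum(combos == 1)       # combinations seen exactly once
toy_max_frequency    <- max(combos)            # size of the most common combination

# Per-label imbalance ratio: most frequent label's count over this label's count.
label_counts <- colSums(Y)
toy_mean_ir  <- mean(max(label_counts) / label_counts)
```

A mean IR near 1 indicates balanced labels, so the value of 302.0678 listed for this dataset points to a strongly imbalanced label distribution.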

Citation

Ueda, N.; Saito, K. (2002). Parametric mixture models for multi-labeled text. In Advances in neural information processing systems, 721--728.
@inproceedings{ueda2002parametric,
  title="Parametric mixture models for multi-labeled text",
  author="Ueda, N. and Saito, K.",
  booktitle="Advances in neural information processing systems",
  pages="721--728",
  year="2002"
}

Concurrence plot

In this concurrence plot, sectors represent labels and links between them depict label co-occurrences. SCUMBLE is a measure designed to assess the concurrence among imbalanced labels.