ohsumed

mldr.datasets::get.mldr("ohsumed")

Select your download

Partitions: select your desired partitioning strategy, validation and format

Partitioning strategies: Random, Stratified, Iterative stratified
Validation schemes: Hold out, 2x5-fold cross validation, 10-fold cross validation
Formats (available for every strategy and validation scheme): MULAN, MEKA, LibSVM, KEEL, mldr

Summary

Instances 13929
Attributes 1025
Inputs 1002
Labels 23
Labelsets 1147
Single labelsets 578
Max frequency 1175
Cardinality 1.6631
Density 0.0723
Mean IR 7.8692
SCUMBLE 0.0688
TCS 17.0902
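
Cardinality and Density in the summary are linked: under the usual definitions, cardinality is the mean number of labels per instance and density is cardinality divided by the number of labels. The base R sketch below illustrates this on a small, hypothetical label matrix (not taken from ohsumed) and then repeats the division with the ohsumed figures listed above. It does not use the mldr package.

```r
# Toy label matrix: 4 instances (rows) x 3 labels (columns); 1 = label is relevant.
# Hypothetical data for illustration only, not taken from ohsumed.
labels <- matrix(c(1, 0, 1,
                   0, 1, 0,
                   1, 1, 0,
                   0, 0, 1),
                 nrow = 4, byrow = TRUE)

# Cardinality: mean number of relevant labels per instance.
cardinality <- mean(rowSums(labels))

# Density: cardinality normalised by the total number of labels.
density <- cardinality / ncol(labels)

# Same arithmetic applied to the ohsumed figures (Cardinality 1.6631, Labels 23).
ohsumed_density <- 1.6631 / 23

print(c(cardinality = cardinality, density = density))
print(round(ohsumed_density, 4))
```

Rounded to four decimals, the last value agrees with the Density entry in the summary.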

Citation

Joachims, Thorsten (1998). Text Categorization with Support Vector Machines: Learning with Many Relevant Features. In Proc. 10th European Conference on Machine Learning, 137-142.
@inproceedings{,
  title="Text Categorization with Support Vector Machines: Learning with Many Relevant Features",
  author="Joachims, Thorsten",
  booktitle="Proc. 10th European Conference on Machine Learning",
  pages="137--142",
  year="1998"
}

Concurrence plot

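In the concurrence plot, each sector represents a label and each link between two sectors depicts a co-occurrence of those two labels. SCUMBLE measures how strongly frequent and infrequent labels appear together in the same instances; for ohsumed its value is 0.0688, as listed in the Summary.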