medical

mldr.datasets::get.mldr("medical")

Select your download

Partitions: select your desired partitioning strategy, validation and format

Partitioning strategies: Random, Stratified, Iterative stratified
Validation schemes: Hold out, 2x5-fold cross validation, 10-fold cross validation
Formats (offered for every strategy and validation scheme): MULAN, MEKA, LibSVM, KEEL, mldr

Summary

Instances 978
Attributes 1494
Inputs 1449
Labels 45
Labelsets 94
Single labelsets 33
Max frequency 155
Cardinality 1.2454
Density 0.0277
Mean IR 89.5014
SCUMBLE 0.0471
TCS 15.6286
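
The summary figures are related to one another: attributes are inputs plus labels, and density is cardinality divided by the number of labels. The Python sketch below is an illustration only, not the repository's own code. It computes several of these measures on a small made-up label matrix, then checks the two relationships against the values listed above. The function name `summary_measures` and the toy data are hypothetical. It assumes that "Max frequency" is the count of the most frequent labelset and that Mean IR is the average, over labels, of the most frequent label's count divided by each label's count.

```python
from collections import Counter


def summary_measures(label_rows):
    """Compute basic multilabel measures from a list of 0/1 label rows.

    Hypothetical helper for illustration; definitions are assumptions
    stated in the surrounding text, not the repository's implementation.
    """
    n_instances = len(label_rows)
    n_labels = len(label_rows[0])

    # Cardinality: average number of active labels per instance.
    cardinality = sum(sum(row) for row in label_rows) / n_instances
    # Density: cardinality normalised by the number of labels.
    density = cardinality / n_labels

    # Labelsets: distinct label combinations and how often each occurs.
    labelsets = Counter(tuple(row) for row in label_rows)
    single_labelsets = sum(1 for c in labelsets.values() if c == 1)
    max_frequency = max(labelsets.values())  # assumed meaning of "Max frequency"

    # Per-label imbalance ratio: most frequent label count / this label's count.
    counts = [sum(row[j] for row in label_rows) for j in range(n_labels)]
    irlbl = [max(counts) / c for c in counts if c > 0]
    mean_ir = sum(irlbl) / len(irlbl)

    return {
        "cardinality": cardinality,
        "density": density,
        "labelsets": len(labelsets),
        "single_labelsets": single_labelsets,
        "max_frequency": max_frequency,
        "mean_ir": mean_ir,
    }


# Toy data (made up): 4 instances, 3 labels.
toy_rows = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
toy = summary_measures(toy_rows)

# Consistency checks against the summary values of the medical dataset.
attributes_ok = 1449 + 45 == 1494           # Inputs + Labels == Attributes
density_ok = round(1.2454 / 45, 4) == 0.0277  # Cardinality / Labels == Density
```

Both checks hold for the figures in the summary, which suggests the table is internally consistent.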

Citation

Crammer, K.; Dredze, M.; Ganchev, K.; Talukdar, P. P.; Carroll, S. (2007). Automatic Code Assignment to Medical Text. In Proc. Workshop on Biological, Translational, and Clinical Language Processing, Prague, Czech Republic, BioNLP07, 129--136.
@inproceedings{,
  title = "Automatic Code Assignment to Medical Text",
  author = "Crammer, K. and Dredze, M. and Ganchev, K. and Talukdar, P. P. and Carroll, S.",
  booktitle = "Proc. Workshop on Biological, Translational, and Clinical Language Processing, Prague, Czech Republic, BioNLP07",
  year = "2007",
  pages = "129--136"
}

Concurrence plot

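In this concurrence plot, each sector represents a label and each link between two sectors depicts a co-occurrence of those labels. SCUMBLE is a measure designed to assess the concurrence among imbalanced labels, that is, how far rare and frequent labels appear together in the same instances.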
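
As a rough illustration of what the plot and the SCUMBLE figure summarise, the Python sketch below counts pairwise label co-occurrences (the links of the plot) and computes a SCUMBLE-style score on a small made-up label matrix. The function names and toy data are hypothetical. The SCUMBLE formula used here is an assumption: for each instance, one minus the ratio of the geometric mean to the arithmetic mean of the imbalance ratios of its active labels, averaged over all instances, with label-free instances contributing zero. It should be checked against a reference implementation before being relied on.

```python
from collections import Counter
from itertools import combinations
from math import prod


def cooccurrences(label_rows):
    """Count how often each pair of labels is active in the same instance."""
    pairs = Counter()
    for row in label_rows:
        active = [j for j, value in enumerate(row) if value]
        pairs.update(combinations(active, 2))
    return pairs


def scumble(label_rows):
    """SCUMBLE-style score under the assumptions stated in the text above."""
    n_labels = len(label_rows[0])
    counts = [sum(row[j] for row in label_rows) for j in range(n_labels)]
    max_count = max(counts)

    scores = []
    for row in label_rows:
        # Imbalance ratios of the labels active in this instance.
        irs = [max_count / counts[j] for j in range(n_labels) if row[j]]
        if not irs:
            scores.append(0.0)  # assumption: no labels means no concurrence
            continue
        arithmetic_mean = sum(irs) / len(irs)
        geometric_mean = prod(irs) ** (1 / len(irs))
        scores.append(1 - geometric_mean / arithmetic_mean)
    return sum(scores) / len(scores)


# Toy data (made up): label 0 is frequent, labels 1 and 2 are rare.
plot_rows = [
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
links = cooccurrences(plot_rows)
score = scumble(plot_rows)
```

In the toy data only the second instance mixes a frequent label with a rare one, so it is the only instance that contributes a non-zero term to the score and the only link in `links`.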