stackex_coffee

mldr.datasets::get.mldr("stackex_coffee")

Select your download

Partitions: select your desired partitioning strategy, validation and format

Partitioning strategies: Random, Stratified, Iterative stratified
Validation schemes: Hold out, 2x5-fold cross validation, 10-fold cross validation
Formats (available for every strategy and validation combination): MULAN, MEKA, LibSVM, KEEL, mldr

Summary

Instances 225
Attributes 1886
Inputs 1763
Labels 123
Labelsets 174
Single labelsets 149
Max frequency 7
Cardinality 1.9867
Density 0.0162
Mean IR 27.2415
SCUMBLE 0.1691
TCS 17.446
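
Several of the summary figures above are related to one another, and the following worked check makes those relations explicit. It assumes the standard definitions (attributes = inputs + labels; cardinality = mean number of active labels per instance; density = cardinality divided by the number of labels); the total of about 447 label assignments is derived here and is not stated on the page.

```latex
\begin{align*}
\text{Attributes} &= \text{Inputs} + \text{Labels} = 1763 + 123 = 1886 \\
\text{Density} &= \frac{\text{Cardinality}}{\text{Labels}} = \frac{1.9867}{123} \approx 0.0162 \\
\text{Label assignments} &= \text{Cardinality} \times \text{Instances} = 1.9867 \times 225 \approx 447
\end{align*}
```

On average, then, each of the 225 instances carries about two of the 123 labels.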

Citation

Charte, Francisco; Rivera, Antonio J.; del Jesus, Maria J.; Herrera, Francisco (2015). QUINTA: A question tagging assistant to improve the answering ratio in electronic forums. In EUROCON 2015 - International Conference on Computer as a Tool (EUROCON), IEEE, 1-6.
@inproceedings{,
  title="QUINTA: A question tagging assistant to improve the answering ratio in electronic forums",
  author="Charte, Francisco and Rivera, Antonio J. and del Jesus, Maria J. and Herrera, Francisco",
  booktitle="EUROCON 2015 - International Conference on Computer as a Tool (EUROCON), IEEE",
  year="2015",
  pages="1-6",
  month="Sept"
}

Concurrence plot

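In this concurrence plot, sectors represent labels and links between them depict label co-occurrences. SCUMBLE is a measure designed to assess the concurrence among imbalanced labels, that is, how often rare and frequent labels appear together in the same instances. For this dataset, SCUMBLE is 0.1691.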
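
A plot of this kind can likely be reproduced locally with the mldr package. The snippet below is a minimal sketch, assuming that mldr and mldr.datasets are installed, that the dataset can be downloaded on first use, and that the loaded object exposes its measures under `measures`; the variable name `coffee` is only illustrative.

```r
# Assumption: the mldr and mldr.datasets packages are installed,
# and a network connection is available for the first download.
library(mldr)

# Load the dataset with the same call shown at the top of this page
coffee <- mldr.datasets::get.mldr("stackex_coffee")

# SCUMBLE as stored in the dataset measures (assumed field name);
# the summary on this page reports 0.1691
coffee$measures$scumble

# Draw the label concurrence plot ("LC" is the mldr plot type for it)
plot(coffee, type = "LC")
```

With 123 labels the plot can be crowded, so restricting it to a subset of labels may make the links easier to read.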