stackex_cs

mldr.datasets::get.mldr("stackex_cs")

Select your download

Partitions: select your desired partitioning strategy, validation and format

Partitioning strategy: Random, Stratified, Iterative stratified
Validation: Hold out, 2x5-fold cross validation, 10-fold cross validation
Format (offered for every strategy and validation combination): MULAN, MEKA, LibSVM, KEEL, mldr

Summary

Instances 9270
Attributes 909
Inputs 635
Labels 274
Labelsets 4749
Single labelsets 3679
Max frequency 119
Cardinality 2.5562
Density 0.0093
Mean IR 85.0023
SCUMBLE 0.2723
TCS 20.5324
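
Several of the summary values can be derived from the others. The sketch below is a rough consistency check, not the repository's own code. It assumes the usual definitions: attributes are inputs plus labels, density is cardinality divided by the number of labels, and TCS is the natural logarithm of inputs x labels x labelsets. The variable names are illustrative only.

```python
import math

# Values copied from the summary above.
inputs = 635
labels = 274
labelsets = 4749
cardinality = 2.5562  # mean number of active labels per instance

# Attributes: input features plus label columns.
attributes = inputs + labels

# Density: cardinality normalised by the number of labels (assumed definition).
density = cardinality / labels

# TCS: natural log of inputs * labels * labelsets (assumed definition).
tcs = math.log(inputs * labels * labelsets)

print(attributes)
print(round(density, 4))
print(round(tcs, 4))
```

Under these assumptions the three values come out as 909, 0.0093 and 20.5324, which agrees with the figures listed in the summary.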

Citation

Charte, Francisco; Rivera, Antonio J.; del Jesus, Maria J.; Herrera, Francisco (2015). QUINTA: A question tagging assistant to improve the answering ratio in electronic forums. In EUROCON 2015 - International Conference on Computer as a Tool (EUROCON), IEEE, 1-6.
@inproceedings{,
  title="QUINTA: A question tagging assistant to improve the answering ratio in electronic forums",
  author="Charte, Francisco and Rivera, Antonio J. and del Jesus, Maria J. and Herrera, Francisco",
  booktitle="EUROCON 2015 - International Conference on Computer as a Tool (EUROCON), IEEE",
  year="2015",
  pages="1-6",
  month="Sept"
}

Concurrence plot

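In this concurrence plot, sectors represent labels and the links between them depict label co-occurrences. SCUMBLE measures the concurrence among imbalanced labels, that is, how often rare and frequent labels appear together in the same instances. It ranges from 0 to 1, and this dataset scores 0.2723.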
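
To make the SCUMBLE idea concrete, here is a minimal sketch on a made-up label matrix. It assumes the common formulation: each label gets an imbalance ratio (IRLbl, the count of the most frequent label divided by the label's own count), and each instance scores 1 minus the ratio of the geometric mean to the arithmetic mean of the IRLbl values of its active labels. The function name `scumble`, the toy data, and the choice to score unlabeled instances as 0 are assumptions for illustration. This is not the mldr implementation.

```python
import math

def scumble(label_rows):
    """Mean SCUMBLE over instances; label_rows is a list of 0/1 lists."""
    n_labels = len(label_rows[0])
    counts = [sum(row[j] for row in label_rows) for j in range(n_labels)]
    max_count = max(counts)
    # IRLbl: imbalance ratio of each label relative to the most frequent one.
    # Labels that never occur get None; they are never active, so never used.
    irlbl = [max_count / c if c > 0 else None for c in counts]

    scores = []
    for row in label_rows:
        active = [irlbl[j] for j, v in enumerate(row) if v == 1]
        if not active:
            scores.append(0.0)  # assumption: unlabeled instances score 0
            continue
        arith = sum(active) / len(active)
        geom = math.exp(sum(math.log(x) for x in active) / len(active))
        # Equal ratios give geom == arith (score 0); mixed ratios push it up.
        scores.append(1.0 - geom / arith)
    return sum(scores) / len(scores)

# Hypothetical data: label 0 is frequent, label 1 is rare, label 2 in between.
toy = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 0],  # the only instance mixing a frequent and a rare label
    [0, 0, 1],
    [0, 0, 1],
]
print(scumble(toy))
```

Only the fourth instance combines labels with different imbalance ratios, so it is the only one with a non-zero score, and the dataset-level value stays low. A dataset where rare labels almost always appear next to frequent ones would score higher, which is the situation where resampling methods struggle.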