Yelp

mldr.datasets::get.mldr("Yelp")

Select your download

Partitions: select your desired partitioning strategy, validation and format

Validation                  Random                        Stratified                    Iterative stratified
Hold out                    MULAN MEKA LibSVM KEEL mldr   MULAN MEKA LibSVM KEEL mldr   MULAN MEKA LibSVM KEEL mldr
2x5-fold cross validation   MULAN MEKA LibSVM KEEL mldr   MULAN MEKA LibSVM KEEL mldr   MULAN MEKA LibSVM KEEL mldr
10-fold cross validation    MULAN MEKA LibSVM KEEL mldr   MULAN MEKA LibSVM KEEL mldr   MULAN MEKA LibSVM KEEL mldr

Summary

Instances 10806
Attributes 676
Inputs 671
Labels 5
Labelsets 32
Single labelsets 0
Max frequency 2120
Cardinality 1.6383
Density 0.3277
Mean IR 2.8756
SCUMBLE 0.0332
TCS 11.5839
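
Several of the figures above are derived from the others, so they can be cross-checked with plain arithmetic. The sketch below is not part of the mldr tooling; it assumes the usual definitions (attributes = inputs + labels, density = cardinality / labels, and TCS = natural log of inputs x labels x labelsets) and uses hypothetical variable names holding the values copied from the table.

```python
import math

# Figures copied from the Summary table (variable names are illustrative).
attributes = 676
inputs = 671
labels = 5
labelsets = 32
cardinality = 1.6383

# Attributes are assumed to be the input features plus the label columns.
attribute_total = inputs + labels

# Density is assumed to be cardinality normalised by the number of labels.
density = cardinality / labels

# TCS is assumed to be the natural log of inputs * labels * labelsets.
tcs = math.log(inputs * labels * labelsets)

print(attribute_total)    # compare with Attributes
print(round(density, 4))  # compare with Density
print(round(tcs, 4))      # compare with TCS
```

Under these assumptions the three printed values agree with the Attributes, Density and TCS rows (676, 0.3277 and 11.5839). Note also that 32 labelsets is the maximum possible for 5 labels (2^5).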

Citation

Hitesh Sajnani, Vaibhav Saini, Kusum Kumar, Eugenia Gabrielova, Pramit Choudary, Cristina Lopes (2013). The Yelp dataset challenge - Multilabel classification of Yelp reviews into relevant categories.
@online{,
  title={The Yelp dataset challenge - Multilabel classification of Yelp reviews into relevant categories},
  author={Hitesh Sajnani and Vaibhav Saini and Kusum Kumar and Eugenia Gabrielova and Pramit Choudary and Cristina Lopes},
  year={2013},
  url={https://www.ics.uci.edu/~vpsaini/}
}

Concurrence plot

In this concurrence plot, sectors represent labels and links between them depict label co-occurrences. SCUMBLE is a measure designed to assess the concurrence among imbalanced labels.