Three equally experienced annotators provided aspect-level annotations for a subset of 300 randomly selected reviews from the INEX Amazon/LibraryThing Book Corpus*. The full corpus contains 2.7 million XML files, each combining book metadata from Amazon and LibraryThing. Each file contains formal metadata (book title, author, publisher, etc.), subject metadata (library subject headings and classification codes) and user-generated content (user ratings, reviews and tags).
The reviews selected for this dataset are annotated at the sentence level with information about the aspects and categories and the sentiment associated with them.
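As a rough illustration of what a sentence-level annotation carries, the sketch below models one annotated sentence in Python and counts sentiment labels per category. It is only a sketch under stated assumptions: the class names, field names, category value and sentiment labels are all hypothetical and are not taken from the released files, whose actual format may differ, and the example sentence is made up.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class AspectAnnotation:
    # All field names here are hypothetical; the released files may differ.
    aspect: str     # the aspect mentioned in the sentence
    category: str   # the category assigned to the annotation
    sentiment: str  # the sentiment label attached to the aspect


@dataclass
class AnnotatedSentence:
    text: str
    annotations: List[AspectAnnotation] = field(default_factory=list)


def sentiment_counts_by_category(sentences):
    """Count (category, sentiment) pairs over all annotated sentences."""
    counts = Counter()
    for sentence in sentences:
        for ann in sentence.annotations:
            counts[(ann.category, ann.sentiment)] += 1
    return counts


# Made-up example sentence, not taken from the dataset.
example = [
    AnnotatedSentence(
        text="The plot was gripping but the ending felt rushed.",
        annotations=[
            AspectAnnotation("plot", "story", "positive"),
            AspectAnnotation("ending", "story", "negative"),
        ],
    )
]
counts = sentiment_counts_by_category(example)
```

Keeping the annotations as a list per sentence allows a single sentence to carry several aspects with different sentiments, which is the usual reason for annotating below the review level.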
1. The Information may only be used for research and development of natural-language processing, information-retrieval or knowledge-discovery systems.
2. Summaries, analyses and interpretations of the linguistic properties of the Information may be derived and published, provided it is not possible to reconstruct the Information from these summaries.
3. Small excerpts of the Information may be displayed to others or published in a scientific or technical context, solely for the purpose of describing the research and development and related issues. Any such use shall not infringe on the rights of any third party including, but not limited to, the authors and publishers of the excerpts.
If you use the dataset, please acknowledge its use with a citation: Tamara Álvarez-López, Milagros Fernández-Gavilanes, Enrique Costa-Montenegro, Jonathan Juncal-Martínez, Silvia García-Méndez and Patrice Bellot: A Book Reviews Dataset for Aspect Based Sentiment Analysis. To be published in: Proceedings of the 8th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics (LTC 2017).
* Koolen, Marijn, Toine Bogers, Maria Gäde, Mark Hall, Iris Hendrickx, Hugo Huurdeman, Jaap Kamps, Mette Skov, Suzan Verberne, and David Walsh, 2016. Overview of the CLEF 2016 Social Book Search Lab. In International Conference of the Cross-Language Evaluation Forum for European Languages.