THE PRODUCT COMMENTS DETECTION BENCHMARK SUITE (REVIEWBS) README
================================================================

1.Introduction
==============

Product comment detection is the task of extracting product reviews from real webshops. Like template detection and content extraction techniques, comment detection needs to be tested continually in order to improve both its results and its performance. This testing is done using sets of webshops prepared for this purpose. A benchmark suite is therefore an important requirement for measuring the performance of these techniques.

2.How to obtain REVIEWBS 1.0
============================

REVIEWBS 1.0 can be downloaded from the following URL:

3.Structure
===========

REVIEWBS 1.0 was created by downloading 195 webpages from several webshops on the Internet. Once all the webpages were downloaded, four different engineers explored the key page and the webpages accessible from it to decide which part of the webpage corresponds to the reviews section. Using the results of this experiment, each webpage was prepared for comment detection. All elements belonging to the comment section were marked with an HTML class called TECO_mainComments. Therefore, a comment detection tool can easily compare its output to the nodes belonging to this class.

REVIEWBS 1.0 is organized in directories and files. There is a directory called "mixed" which has 75 files and directories inside, a directory or file for each webpage domain. In addition, there is another directory called "fixed" which includes 4 directories for webpages with 2, 3, 4, and 5 comments.

4.How to use REVIEWBS
=====================

Installation is very simple: extract the zip file onto a hard drive, pen drive, or other media. Once extracted, it will create both directories ("mixed" and "fixed"). It is recommended to extract the file on Linux or OS X systems, because Windows-based systems do not allow the directory structure used to store the benchmarks.
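Section 3 states that the gold-standard nodes carry the class TECO_mainComments, so a tool's output can be compared against them. The following is a minimal sketch of how those nodes might be read from a benchmark page using only the Python standard library. The names GoldCommentCollector and gold_comment_text are hypothetical and are not part of REVIEWBS. The sketch assumes reasonably well-formed markup (every non-void element is closed explicitly); real webshop pages may need a more tolerant parser.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}

# Class name taken from the README; everything else here is illustrative.
GOLD_CLASS = "TECO_mainComments"


class GoldCommentCollector(HTMLParser):
    """Collects the text found inside elements marked with GOLD_CLASS."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0          # nesting depth inside a gold element (0 = outside)
        self.gold_elements = 0  # number of elements carrying the class
        self.chunks = []        # text fragments found inside gold elements

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        is_gold = GOLD_CLASS in classes
        if is_gold:
            self.gold_elements += 1
        if tag in VOID_TAGS:
            return  # no closing tag will follow, so do not change the depth
        if self.depth > 0 or is_gold:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.depth > 0 and text:
            self.chunks.append(text)


def gold_comment_text(html):
    """Returns (number of gold elements, list of text fragments inside them)."""
    collector = GoldCommentCollector()
    collector.feed(html)
    collector.close()
    return collector.gold_elements, collector.chunks


if __name__ == "__main__":
    # Hypothetical page fragment, not taken from the benchmark.
    page = ('<div id="nav">Menu</div>'
            '<div class="TECO_mainComments"><p>Nice product</p></div>')
    print(gold_comment_text(page))
```

A comment detection tool could then be scored by comparing the text it extracts with the fragments returned here, for example with precision and recall over the fragments.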
5.List of webpages
==================

5.1 Mixed webpages
==================

mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/shop.hashbro.html
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/ón-de-camisetas/diseña-tu-propia-camiseta-de-basket.html
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/www.waterstones.html
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/www.michelin.html
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/
mixed/www.waitrose.html
mixed/
mixed/
mixed/www.autozone.html

5.2 Fixed webpages
==================

fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/
fixed/2/www.anthropologie.html
fixed/2/www.dermstore.html
fixed/2/www.totto.html
fixed/2/
fixed/2/www.maccosmetics.html
fixed/2/www.maisonsdumonde.html
fixed/2/www.tesco.html
fixed/2/www.makeupalley.html
fixed/2/
fixed/2/www.urbanoutfitters.html
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/
fixed/3/www.anthropologie.html
fixed/3/www.dermstore.html
fixed/3/www.totto.html
fixed/3/
fixed/3/www.maccosmetics.html
fixed/3/www.maisonsdumonde.html
fixed/3/www.tesco.html
fixed/3/www.makeupalley.html
fixed/3/
fixed/3/www.urbanoutfitters.html
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/
fixed/4/www.anthropologie.html
fixed/4/www.dermstore.html
fixed/4/www.totto.html
fixed/4/
fixed/4/www.maccosmetics.html
fixed/4/www.maisonsdumonde.html
fixed/4/www.tesco.html
fixed/4/www.makeupalley.html
fixed/4/
fixed/4/www.urbanoutfitters.html
fixed/5/
fixed/5/www.thejewellershop.html
fixed/5/www.lulus.html
fixed/5/www.scorer.html
fixed/5/
fixed/5/
fixed/5/www.powerplanetonline.html
fixed/5/www.madridhifi.html
fixed/5/
fixed/5/
fixed/5/
fixed/5/www.gourmetfoodstore.html
fixed/5/
fixed/5/www.woodcraft.html
fixed/5/
fixed/5/www.averyaustin.html
fixed/5/
fixed/5/www.voromotors.html
fixed/5/
fixed/5/www.bonprix.html
fixed/5/www.anthropologie.html
fixed/5/www.dermstore.html
fixed/5/www.totto.html
fixed/5/
fixed/5/www.maccosmetics.html
fixed/5/www.maisonsdumonde.html
fixed/5/www.tesco.html
fixed/5/www.makeupalley.html
fixed/5/
fixed/5/www.urbanoutfitters.html
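
After extraction, the layout listed above can be checked with a short script. The following is a minimal sketch in Python, assuming the archive was extracted into a directory passed as root. The function name list_benchmarks is hypothetical and is not part of REVIEWBS; only the directory names "mixed" and "fixed/2" to "fixed/5" come from this README.

```python
from pathlib import Path


def list_benchmarks(root):
    """Maps each group ("mixed", "fixed/2", ..., "fixed/5") to its sorted entry names.

    A group whose directory is missing is mapped to an empty list.
    """
    root = Path(root)
    groups = ["mixed"] + [f"fixed/{n}" for n in (2, 3, 4, 5)]
    listing = {}
    for group in groups:
        directory = root / group
        if directory.is_dir():
            # Entries may be files or directories (one per webpage domain).
            listing[group] = sorted(entry.name for entry in directory.iterdir())
        else:
            listing[group] = []
    return listing


if __name__ == "__main__":
    # "." is a placeholder; pass the directory where the zip file was extracted.
    for group, entries in list_benchmarks(".").items():
        print(f"{group}: {len(entries)} entries")
```

The count printed for "mixed" can be compared with the 75 entries mentioned in section 3 to confirm that the extraction was complete.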