├── .gitignore
├── LICENSE
├── README.md
├── _config.yml
├── baseline-scores
    ├── allOnes.json
    ├── allZeros.json
    ├── human-01.json
    ├── lucene_paragraphs.json
    ├── lucene_world.json
    ├── raw
    │   ├── human.json
    │   ├── lucene_paragraphs-raw.json
    │   ├── lucene_world-raw.json
    │   └── simpleLR-raw.json
    └── simpleLR.json
├── multirc_materials
    ├── multirc-eval-v1.py
    ├── multirc-pr-curve-v1.py
    ├── multirc_measures.py
    ├── pr-curve-output.png
    └── threshold-tuning-raw-scores.py
└── surfaceLR-baseline
    ├── JSON.pm
    ├── JSON
        ├── backportPP.pm
        └── backportPP
        │   ├── Boolean.pm
        │   ├── Compat5005.pm
        │   └── Compat5006.pm
    ├── README.md
    ├── liblinear.zip
    ├── run_surfaceLR.sh
    ├── surfaceLR.pl
    ├── surfaceLR_predict.pl
    └── surfaceLR_train.pl

/.gitignore:
--------------------------------------------------------------------------------
1 | # Created by .ignore support plugin (hsz.mobi)
2 | baseline-scores/unpublished/
3 | baseline-scores/unpublished/*
4 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | Research and Academic Use License
2 | Cognitive Computation Group
3 | University of Illinois at Urbana-Champaign
4 |
5 | Downloading software implies that you accept the following license terms:
6 |
7 | Under this Agreement, The Board of Trustees of the University of Illinois ("University"), a body corporate and politic of the State of Illinois with its principal offices at 506 South Wright Street, Urbana, Illinois 61801, U.S.A., on behalf of its Department of Computer Science on the Urbana-Champaign Campus, provides the software ("Software") described in Appendix A, attached hereto and incorporated herein, to the Licensee identified below ("Licensee") subject to the following conditions:
8 |
9 | 1. Upon execution of this Agreement by Licensee below, the University grants, and Licensee accepts, a royalty-free, non-exclusive license:
10 | A. To use unlimited copies of the Software for its own academic and research purposes.
11 | B. To make derivative works. However, if Licensee distributes any derivative work based on or derived from the Software (with such distribution limited to binary form only), then Licensee will (1) notify the University (c/o Professor Dan Roth, e-mail: danr@cs.uiuc.edu) regarding its distribution of the derivative work and provide a copy if requested, and (2) clearly notify users that such derivative work is a modified version and not the original Software distributed by the University.
12 | C. To redistribute (sublicense) derivative works based on the Software in binary form only to third parties provided that (1) the copyright notice and any accompanying legends or proprietary notices are reproduced on all copies, (2) no royalty is charged for such copies, and (3) third parties are restricted to using the derivative work for academic and research purposes only, without further sublicensing rights.
13 | No license is granted herein that would permit Licensee to incorporate the Software into a commercial product, or to otherwise commercially exploit the Software. Should Licensee wish to make commercial use of the Software, Licensee should contact the University, c/o the Office of Technology Management ("OTM") to negotiate an appropriate license for such commercial use. To contact the OTM: otmmailaccount@ad.uiuc.edu; telephone: (217)333-3781; fax: (217) 265-5530.
14 | 2. THE UNIVERSITY GIVES NO WARRANTIES, EITHER EXPRESSED OR IMPLIED, FOR THE SOFTWARE AND/OR ASSOCIATED MATERIALS PROVIDED UNDER THIS AGREEMENT, INCLUDING, WITHOUT LIMITATION, WARRANTY OF MERCHANTABILITY AND WARRANTY OF FITNESS FOR A PARTICULAR PURPOSE, AND ANY WARRANTY AGAINST INFRINGEMENT OF ANY INTELLECTUAL PROPERTY RIGHTS.
15 | 3. Licensee understands the Software is a research tool for which no warranties as to capabilities or accuracy are made, and Licensee accepts the Software on an "as is, with all defects" basis, without maintenance, debugging, support or improvement. Licensee assumes the entire risk as to the results and performance of the Software and/or associated materials. Licensee agrees that University shall not be held liable for any direct, indirect, consequential, or incidental damages with respect to any claim by Licensee or any third party on account of or arising from this Agreement or use of the Software and/or associated materials.
16 | 4. Licensee understands the Software is proprietary to the University. Licensee will take all reasonable steps to insure that the source code is protected and secured from unauthorized disclosure, use, or release and will treat it with at least the same level of care as Licensee would use to protect and secure its own proprietary computer programs and/or information, but using no less than reasonable care.
17 | 5. In the event that Licensee shall be in default in the performance of any material obligations under this Agreement, and if the default has not been remedied within sixty (60) days after the date of notice in writing of such default, University may terminate this Agreement by written notice. In the event of termination, Licensee shall promptly return to University the original and any copies of licensed Software in Licensee's possession. In the event of any termination of this Agreement, any and all sublicenses granted by Licensee to third parties pursuant to this Agreement (as permitted by this Agreement) prior to the date of such termination shall nevertheless remain in full force and effect.
18 | 6. The Software was developed, in part, with support from the National Science Foundation, and the Federal Government has certain license rights in the Software.
19 | 7. This Agreement shall be construed and interpreted in accordance with the laws of the State of Illinois, U.S.A..
20 | 8. This Agreement shall be subject to all United States Government laws and regulations now and hereafter applicable to the subject matter of this Agreement, including specifically the Export Law provisions of the Departments of Commerce and State. Licensee will not export or re-export the Software without the appropriate United States or foreign government license.
21 | By its registration below, Licensee confirms that it understands the terms and conditions of this Agreement, and agrees to be bound by them. This Agreement shall become effective as of the date of execution by Licensee.
22 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # MultiRC: Reasoning over Multiple Sentences
2 |
3 | This repo contains a few materials used in this work. For more details about the paper, see [this page](http://cogcomp.org/page/publication_view/833) or [the dataset page](http://cogcomp.org/multirc/).
4 |
5 | ## Evaluation
6 | The evaluation script is included in the `multirc_materials/` folder.
7 |
8 | To get F1 measures:
9 |
10 | ```bash
11 | > python multirc_materials/multirc-eval-v1.py
12 | Per question measures (i.e. precision-recall per question, then average)
13 | P: 0.825211112777 - R: 0.907502623295 - F1m: 0.864402738925
14 | Dataset-wide measures (i.e. precision-recall across all the candidate-answers in the dataset)
15 | P: 0.82434611161 - R: 0.906551362683 - F1a: 0.86349665639
16 | ```
17 |
18 | ## Reasoning categories
19 | The collection of question annotations (with their reasoning phenomena) used in this work:
20 | [Google Drive Link](https://docs.google.com/spreadsheets/d/1Illoa4FrDFYPWzgNi6rSrRhCKxWYSfyrtUDLLzJgBtM/edit?usp=sharing)
21 | (see Section 4 of the aforementioned paper)
22 |
23 | ## Citation
24 | If you use this, please cite the paper:
25 |
26 | ```
27 | @inproceedings{MultiRC2018,
28 | author = {Daniel Khashabi and Snigdha Chaturvedi and Michael Roth and Shyam Upadhyay and Dan Roth},
29 | title = {Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences},
30 | booktitle = {NAACL},
31 | year = {2018}
32 | }
33 | ```
34 |
--------------------------------------------------------------------------------
/_config.yml:
--------------------------------------------------------------------------------
1 | theme: jekyll-theme-minimal
--------------------------------------------------------------------------------
/baseline-scores/allOnes.json:
--------------------------------------------------------------------------------
1 | 
[{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"1","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"2","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"5","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"6","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"7","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"8","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"9","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"10","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"11","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b5bbf3ba31e4775140f05a8b59db55b22ee3e63.txt","qid":"12","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"0","scores":[1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"1","scores":[1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"2","scores":[1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"3","scores":[1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"4","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"5","scores":[1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"6","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"7","scores":[1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"8","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"9","scores":[1,1]},{"pid":"wikiMovieSummaries/3965
765.txt","qid":"10","scores":[1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"11","scores":[1,1,1]},{"pid":"wikiMovieSummaries/3965765.txt","qid":"12","scores":[1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"5","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"7","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"8","scores":[1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"9","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"11","scores":[1,1,1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"12","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"13","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"14","scores":[1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-11.txt","qid":"15","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-5.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-5.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-5.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-5.txt","qid":"3","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-5.txt","qid":"4","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-5.txt","qid":"5","scores":[1,1,1,1,1,1,1]},{"pid":"
Fiction/gutenberg-10020.txt","qid":"0","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10020.txt","qid":"1","scores":[1,1]},{"pid":"Fiction/gutenberg-10020.txt","qid":"2","scores":[1,1]},{"pid":"Fiction/gutenberg-10020.txt","qid":"3","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10020.txt","qid":"4","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10020.txt","qid":"5","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10020.txt","qid":"6","scores":[1,1]},{"pid":"Fiction/gutenberg-10020.txt","qid":"7","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10020.txt","qid":"8","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11643-0.txt","qid":"0","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11643-0.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11643-0.txt","qid":"2","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"2","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"3","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"4","scores":[1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"5","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"6","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"8","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"9","scores":[1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"10","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.tx
t","qid":"11","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"12","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"13","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"14","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"15","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"16","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"17","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"18","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"19","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b0ff8796ddf508d1f1ef4c66ba29835ab642e7f.txt","qid":"20","scores":[1,1,1]},{"pid":"News/CNN/cnn-3b159c09888b61241afe848844510478546353d4.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b159c09888b61241afe848844510478546353d4.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b159c09888b61241afe848844510478546353d4.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b159c09888b61241afe848844510478546353d4.txt","qid":"3","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b159c09888b61241afe848844510478546353d4.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b159c09888b61241afe848844510478546353d4.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b159c09888b61241afe848844510478546353d4.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b159c09888b61241afe848844510478546353d4.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b159c09888b61241afe848844510478546353d4.txt","qid":"8","scores":[1,1,1,1,1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"0","scores":[1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid"
:"1","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"2","scores":[1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"3","scores":[1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"4","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"5","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"6","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"7","scores":[1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"8","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"9","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"10","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"11","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-106-0.txt","qid":"12","scores":[1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"4","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"6","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"7","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"8","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"9","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"10","scores":[1,1]},{"pid":"Science-textbook/science-g5-74.txt","qid":"11","scores":[1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"0","scores":[1,1
,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"1","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"2","scores":[1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"5","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"8","scores":[1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"10","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"11","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"12","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"13","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"14","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/3614683.txt","qid":"15","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"0","scores":[1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"3","scores":[1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"8","scores":[1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert 
Camus-9.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"11","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"12","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"13","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"14","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-9.txt","qid":"15","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"0","scores":[1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"1","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"3","scores":[1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"5","scores":[1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"6","scores":[1,1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"7","scores":[1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"8","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"11","scores":[1,1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"12","scores":[1,1,1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"13","scores":[1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert 
Einstein-6.txt","qid":"14","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"15","scores":[1,1,1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"16","scores":[1,1,1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"17","scores":[1,1,1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"18","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"19","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"20","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"21","scores":[1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"22","scores":[1,1,1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Einstein-6.txt","qid":"23","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"1","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"3","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"4","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"5","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"6","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"7","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"8","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"9","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"10","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"11","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"12","scores":[1,1,1]},{"pid":"Fiction-stories/mc
test-mc160.train.67-0.txt","qid":"13","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"14","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"15","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"16","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"17","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"18","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.67-0.txt","qid":"19","scores":[1,1]},{"pid":"News/CNN/cnn-3b4024de1da6390fe09f844b9318baa0f1756b4b.txt","qid":"0","scores":[1,1]},{"pid":"News/CNN/cnn-3b4024de1da6390fe09f844b9318baa0f1756b4b.txt","qid":"1","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b4024de1da6390fe09f844b9318baa0f1756b4b.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b4024de1da6390fe09f844b9318baa0f1756b4b.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b4024de1da6390fe09f844b9318baa0f1756b4b.txt","qid":"4","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b4024de1da6390fe09f844b9318baa0f1756b4b.txt","qid":"5","scores":[1,1,1]},{"pid":"News/CNN/cnn-3b4024de1da6390fe09f844b9318baa0f1756b4b.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b4024de1da6390fe09f844b9318baa0f1756b4b.txt","qid":"7","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b4024de1da6390fe09f844b9318baa0f1756b4b.txt","qid":"8","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"0","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"1","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"2","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"3","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"4","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"5","scores":[1,1,1,1]},{"pid":"Fiction/gutenbe
rg_withoutQuotes/gutenberg-11249-0.txt","qid":"6","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"7","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"8","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"9","scores":[1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"10","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"11","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"12","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"13","scores":[1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11249-0.txt","qid":"14","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"1","scores":[1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"3","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"4","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"5","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"6","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"8","scores":[1,1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"9","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"10","scores":[1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"11","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"12","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"13","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"14","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":
"15","scores":[1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"16","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"17","scores":[1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"18","scores":[1,1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"19","scores":[1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"20","scores":[1,1,1,1,1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-3-2.txt","qid":"21","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"0","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"1","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"2","scores":[1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"3","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"4","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"5","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"6","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"7","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"8","scores":[1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"9","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"10","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"11","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"12","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"13","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"14","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed52
0a6f1bf37e.txt","qid":"15","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"16","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"17","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"18","scores":[1,1,1]},{"pid":"News/CNN/cnn-3abe9f2e405be1e0b8bb42ef76ed520a6f1bf37e.txt","qid":"19","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"0","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"1","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"5","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"7","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"8","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"9","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"10","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"11","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"12","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3aad69ac9173d0db3e5f477b85d4a2ae8b26d019.txt","qid":"13","scores":[1,1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.
txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"3","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"5","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"6","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"7","scores":[1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"8","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"11","scores":[1,1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-18640799.txt","qid":"12","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10028.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg-10028.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg-10028.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10028.txt","qid":"3","scores":[1,1]},{"pid":"Fiction/gutenberg-10028.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10028.txt","qid":"5","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"5","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"6","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"8","scores":[1,1,
1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"9","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"10","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"11","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"12","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"13","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"14","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"15","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"16","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"17","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"18","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.49-0.txt","qid":"19","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b659823a1b4208f0990ed2db4a293095c576458.txt","qid":"0","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b659823a1b4208f0990ed2db4a293095c576458.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b659823a1b4208f0990ed2db4a293095c576458.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b659823a1b4208f0990ed2db4a293095c576458.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b659823a1b4208f0990ed2db4a293095c576458.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b659823a1b4208f0990ed2db4a293095c576458.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b659823a1b4208f0990ed2db4a293095c576458.txt","qid":"6","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b659823a1b4208f0990ed2db4a293095c576458.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"0","scores":[1,1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"1","scores":[1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"2","scores":[1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"3","scores":[1,1,1,1,1]},{
"pid":"wikiMovieSummaries-10376197.txt","qid":"4","scores":[1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"5","scores":[1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"7","scores":[1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"8","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"9","scores":[1,1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"10","scores":[1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"11","scores":[1,1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"12","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"13","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"14","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"15","scores":[1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"16","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"17","scores":[1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"18","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"19","scores":[1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"20","scores":[1,1,1]},{"pid":"wikiMovieSummaries-10376197.txt","qid":"21","scores":[1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"5","scores":[1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"6","scores":[1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinbu
rgh-1.txt","qid":"7","scores":[1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"8","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"10","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"11","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-1.txt","qid":"12","scores":[1,1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Firm_to_the_Poor_Needs_Help-2.txt","qid":"0","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Firm_to_the_Poor_Needs_Help-2.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Firm_to_the_Poor_Needs_Help-2.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Firm_to_the_Poor_Needs_Help-2.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Firm_to_the_Poor_Needs_Help-2.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Firm_to_the_Poor_Needs_Help-2.txt","qid":"5","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"0","scores":[1,1,1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"1","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"2","scores":[1,1,1,1,1,1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"3","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"5","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"6","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"7","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"8","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"9","scores":[1,1,1,1,1,1,1,1,1,1]},{"
pid":"Science-textbook/science-g4-80.txt","qid":"10","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"11","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"12","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"13","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-80.txt","qid":"14","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"0","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"1","scores":[1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"2","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"3","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"5","scores":[1,1,1,1,1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"6","scores":[1,1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"7","scores":[1,1,1,1,1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"8","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"9","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"10","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"11","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"12","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"13","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"14","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"15","scores":[1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMadrid-2.txt","qid":"16","scores":[1,1,1,1,1,1,1,1]},{"pid":"History
-Anthropology/oanc-HistoryMadrid-2.txt","qid":"17","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"7","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"8","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"11","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"12","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"13","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"14","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-78.txt","qid":"15","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Disaster_center-2.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Disaster_center-2.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Disaster_center-2.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Disaster_center-2.txt","qid":"3","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Disaster_center-2.txt","qid":"4","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Disaster_center-2.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Bias_on_the_Job-2.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Bias_on_
the_Job-2.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Bias_on_the_Job-2.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Bias_on_the_Job-2.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Bias_on_the_Job-2.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Bias_on_the_Job-2.txt","qid":"5","scores":[1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Bias_on_the_Job-2.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Bias_on_the_Job-2.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Bias_on_the_Job-2.txt","qid":"8","scores":[1,1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"0","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"1","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"2","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"4","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"5","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"7","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"8","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"9","scores":[1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"10","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"11","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"12","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"13","scores":[1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the 
Great-34.txt","qid":"14","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"15","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-34.txt","qid":"16","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-50.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-50.txt","qid":"1","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-50.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-50.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-50.txt","qid":"4","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-50.txt","qid":"5","scores":[1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"1","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"2","scores":[1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"3","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"4","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"5","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"6","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"7","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"8","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"9","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"10","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"11","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"12","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"13","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg-10221.txt","qid":"14","scores":[1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"1","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"2","scores":[1,1,1
,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"3","scores":[1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"8","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"10","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"11","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"12","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"13","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryItaly-1.txt","qid":"14","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-107.txt","qid":"0","scores":[1,1,1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-107.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-107.txt","qid":"2","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-107.txt","qid":"3","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-107.txt","qid":"4","scores":[1,1,1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-107.txt","qid":"5","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-107.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-107.txt","qid":"7","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-0.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-0.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-0.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Fiction-stories-masc-hote
l-California-0.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-0.txt","qid":"4","scores":[1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-0.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Fiction/mctest-mc160.dev.12-0.txt","qid":"0","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"Fiction/mctest-mc160.dev.12-0.txt","qid":"1","scores":[1,1,1,1,1,1]},{"pid":"Fiction/mctest-mc160.dev.12-0.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Fiction/mctest-mc160.dev.12-0.txt","qid":"3","scores":[1,1,1,1,1,1,1,1]},{"pid":"Fiction/mctest-mc160.dev.12-0.txt","qid":"4","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction/mctest-mc160.dev.12-0.txt","qid":"5","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b07f5102c69e3e609d73b2ccb0dc5549d4fbaf6.txt","qid":"0","scores":[1,1,1]},{"pid":"News/CNN/cnn-3b07f5102c69e3e609d73b2ccb0dc5549d4fbaf6.txt","qid":"1","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b07f5102c69e3e609d73b2ccb0dc5549d4fbaf6.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b07f5102c69e3e609d73b2ccb0dc5549d4fbaf6.txt","qid":"3","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b07f5102c69e3e609d73b2ccb0dc5549d4fbaf6.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b07f5102c69e3e609d73b2ccb0dc5549d4fbaf6.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b07f5102c69e3e609d73b2ccb0dc5549d4fbaf6.txt","qid":"6","scores":[1,1]},{"pid":"News/CNN/cnn-3b07f5102c69e3e609d73b2ccb0dc5549d4fbaf6.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"0","scores":[1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"2","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"3","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"4","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"5","scores":[1,
1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"6","scores":[1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"7","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"8","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"9","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"11","scores":[1,1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"12","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"13","scores":[1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-A_helping_hand-3.txt","qid":"14","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-18.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-18.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-18.txt","qid":"2","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-18.txt","qid":"3","scores":[1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-18.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-18.txt","qid":"5","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-18.txt","qid":"6","scores":[1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-18.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-18.txt","qid":"8","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-18.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the 
Great-18.txt","qid":"10","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries/31093623.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries/31093623.txt","qid":"1","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries/31093623.txt","qid":"2","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries/31093623.txt","qid":"3","scores":[1,1,1]},{"pid":"wikiMovieSummaries/31093623.txt","qid":"4","scores":[1,1,1]},{"pid":"wikiMovieSummaries/31093623.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/31093623.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbrecht Durer-17.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbrecht Durer-17.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbrecht Durer-17.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"0","scores":[1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"1","scores":[1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"3","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"4","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"5","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"7","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"8","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"11","scores":[1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"12","scores":[1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"13","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-138.txt","qid":"14","scores":[1,1,1,1,1]},{"pid":"Scie
nce-textbook/science-g4-138.txt","qid":"15","scores":[1,1,1]},{"pid":"News/CNN/cnn-3b13a7fecb77be63cac1e9286b83f61b5784f387.txt","qid":"0","scores":[1,1]},{"pid":"News/CNN/cnn-3b13a7fecb77be63cac1e9286b83f61b5784f387.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b13a7fecb77be63cac1e9286b83f61b5784f387.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b13a7fecb77be63cac1e9286b83f61b5784f387.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b13a7fecb77be63cac1e9286b83f61b5784f387.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b13a7fecb77be63cac1e9286b83f61b5784f387.txt","qid":"5","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b13a7fecb77be63cac1e9286b83f61b5784f387.txt","qid":"6","scores":[1,1]},{"pid":"News/CNN/cnn-3b13a7fecb77be63cac1e9286b83f61b5784f387.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"0","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"1","scores":[1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"3","scores":[1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"4","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"5","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"8","scores":[1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"9","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"10","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.tx
t","qid":"11","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"12","scores":[1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"13","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"14","scores":[1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"15","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"16","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"17","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"18","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"19","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b36c94018ae586b3a550a2850e10be30f684c0a.txt","qid":"20","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"7","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"8","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"9","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"10","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"11","scores":[1,1,1,1]},{"pid":"Fiction-stories/mcte
st-mc160.train.64-0.txt","qid":"12","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"13","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"14","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"15","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"16","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"17","scores":[1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"18","scores":[1,1,1,1]},{"pid":"Fiction-stories/mctest-mc160.train.64-0.txt","qid":"19","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g3-9.txt","qid":"0","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g3-9.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g3-9.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g3-9.txt","qid":"3","scores":[1,1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Bandura-0.txt","qid":"0","scores":[1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Bandura-0.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Bandura-0.txt","qid":"2","scores":[1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Bandura-0.txt","qid":"3","scores":[1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Bandura-0.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Bandura-0.txt","qid":"5","scores":[1,1,1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Bandura-0.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Bandura-0.txt","qid":"7","scores":[1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Bandura-0.txt","qid":"8","scores":[1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Bandura-0.txt","qid":"9","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert 
Bandura-0.txt","qid":"10","scores":[1,1,1]},{"pid":"Fiction/mctest-mc160.dev.1-0.txt","qid":"0","scores":[1,1,1]},{"pid":"Fiction/mctest-mc160.dev.1-0.txt","qid":"1","scores":[1,1,1]},{"pid":"Fiction/mctest-mc160.dev.1-0.txt","qid":"2","scores":[1,1,1]},{"pid":"Fiction/mctest-mc160.dev.1-0.txt","qid":"3","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-0.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-0.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-0.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-0.txt","qid":"3","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Fiction/mctest-mc160.test.12-0.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Fiction/mctest-mc160.test.12-0.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Fiction/mctest-mc160.test.12-0.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Fiction/mctest-mc160.test.12-0.txt","qid":"3","scores":[1,1,1,1]},{"pid":"Fiction/mctest-mc160.test.12-0.txt","qid":"4","scores":[1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-10641-0.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-10641-0.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-10641-0.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-10641-0.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"1","scores":[1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the 
Great-11.txt","qid":"5","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"8","scores":[1,1,1,1,1,1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"9","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"11","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander the Great-11.txt","qid":"12","scores":[1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-2.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-2.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-2.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-2.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-2.txt","qid":"4","scores":[1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-2.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-2.txt","qid":"6","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"0","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"3","scores":[1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"4","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"5","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"6","scores":[1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"7","scores":[1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"8","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"9","scores":[1,1,1,1]},{"pid":"Science-textbook/science
-g5-88.txt","qid":"10","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"11","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"12","scores":[1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"13","scores":[1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"14","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"15","scores":[1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"16","scores":[1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"17","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"18","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-88.txt","qid":"19","scores":[1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"3","scores":[1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"5","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"8","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"11","scores":[1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"12","scores":[1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryEdinburgh-4.txt","qid":"13","scores":[1,1,1,1]},{"pid":"History-Anth
ropology/oanc-HistoryEdinburgh-4.txt","qid":"14","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"2","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"3","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"4","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"5","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"6","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"8","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"9","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-121.txt","qid":"11","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"3","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"5","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"6","scores":[1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"8","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"9","scores":[1,1,1
,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"10","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"11","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"12","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"13","scores":[1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"14","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Butler_Co_attorneys-1.txt","qid":"15","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Attorney_gives_his_time-2.txt","qid":"0","scores":[1,1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Attorney_gives_his_time-2.txt","qid":"1","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Attorney_gives_his_time-2.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Attorney_gives_his_time-2.txt","qid":"3","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Attorney_gives_his_time-2.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Attorney_gives_his_time-2.txt","qid":"5","scores":[1,1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Attorney_gives_his_time-2.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-Attorney_gives_his_time-2.txt","qid":"7","scores":[1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-13.1-29.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-13.1-29.txt","qid":"1","scores":[1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-13.1-29.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-13.1-29.txt","qid":"3","scores":[1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-13.1-29.txt","qid":"4","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-13.1-29.txt","qid":"5","scores":[1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-13.1-29.txt","qid":"6","scores":[1,1,1,1,1,1,1,1]},{"pid":"Sept11-r
eports/oanc-chapter-13.1-29.txt","qid":"7","scores":[1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"1","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"4","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"5","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"7","scores":[1,1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"8","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"10","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"11","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"12","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"13","scores":[1,1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"14","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"15","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"16","scores":[1,1,1,1,1,1,1]},{"pid":"Fiction/gutenberg-10002.txt","qid":"17","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"0","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"1","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"2","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"3","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"4","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"5","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"6","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"
7","scores":[1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"8","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"9","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"10","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"11","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"12","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"13","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"14","scores":[1,1,1]},{"pid":"Fiction-stories/mctest-mc500.dev.43-0.txt","qid":"15","scores":[1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"0","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"1","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"3","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"4","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"5","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"6","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"8","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"10","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"11","scores":[1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"12","scores":[1,1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"13","scores":[1,1,1,1]},{"pid":"Wiki_articles/wikiAlexander II of Russia-11.txt","qid":"14","scores":[1,1]},{"pid":"Wiki_articles/wikiAlexander II of 
Russia-11.txt","qid":"15","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-142.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-142.txt","qid":"1","scores":[1,1,1]},{"pid":"Science-textbook/science-g4-142.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g4-142.txt","qid":"3","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g4-142.txt","qid":"4","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-142.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g4-142.txt","qid":"6","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-grants_fail_to_come-1.txt","qid":"0","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-grants_fail_to_come-1.txt","qid":"1","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-grants_fail_to_come-1.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-grants_fail_to_come-1.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-grants_fail_to_come-1.txt","qid":"4","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-grants_fail_to_come-1.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-grants_fail_to_come-1.txt","qid":"6","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-grants_fail_to_come-1.txt","qid":"7","scores":[1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-grants_fail_to_come-1.txt","qid":"8","scores":[1,1,1]},{"pid":"Society_Law_and_Justice/oanc-grants_fail_to_come-1.txt","qid":"9","scores":[1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-12.txt","qid":"0","scores":[1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-12.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-12.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-12.txt","qid":"3","scores":[1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert 
Camus-12.txt","qid":"4","scores":[1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-12.txt","qid":"5","scores":[1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbert Camus-12.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"0","scores":[1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"4","scores":[1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"8","scores":[1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"9","scores":[1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"10","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"11","scores":[1,1,1]},{"pid":"Science-textbook/science-g3-39.txt","qid":"12","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"3","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"5","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"7","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"8","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"10","scores":[1,1,1,1,1,1]},{"pid"
:"wikiMovieSummaries-2444093.txt","qid":"11","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"12","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"13","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"14","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"15","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"16","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"17","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"18","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-2444093.txt","qid":"19","scores":[1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-6.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-6.txt","qid":"1","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-6.txt","qid":"2","scores":[1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-6.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-6.txt","qid":"4","scores":[1,1,1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-6.txt","qid":"5","scores":[1,1]},{"pid":"Fiction-stories-masc-hotel-California-6.txt","qid":"6","scores":[1,1]},{"pid":"Fiction-stories-masc-hotel-California-6.txt","qid":"7","scores":[1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-6.txt","qid":"8","scores":[1,1,1,1]},{"pid":"Fiction-stories-masc-hotel-California-6.txt","qid":"9","scores":[1,1,1,1]},{"pid":"News/CNN/cnn-3b3301b2f6a5c4974baf18b4cf109206829dbc29.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b3301b2f6a5c4974baf18b4cf109206829dbc29.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b3301b2f6a5c4974baf18b4cf109206829dbc29.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b3301b2f6a5c4974baf18b4cf109206829dbc29.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b3301b2f6a5c4974baf18b4cf109206829dbc29.txt","qid":"4","scores":[1,1,1,1,1,1,1]},
{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"0","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"1","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"3","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"4","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"5","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"7","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"8","scores":[1,1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"9","scores":[1,1,1,1,1,1,1]},{"pid":"News/CNN/cnn-3b2f008a5acf3285163a3b32329d92c0ccde914d.txt","qid":"10","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-5_Legal_Groups-0.txt","qid":"0","scores":[1,1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-5_Legal_Groups-0.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-5_Legal_Groups-0.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-5_Legal_Groups-0.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"Society_Law_and_Justice/oanc-5_Legal_Groups-0.txt","qid":"4","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-67.txt","qid":"0","scores":[1,1,1]},{"pid":"Science-textbook/science-g5-67.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-67.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Science-textbook/science-g5-67.txt","qid":"3","scores":[1,1,1]},{"pid":"Science-textbook/science-g5-67.txt","qid":"4","scores":[1,1]},{"pid":"Science-textbook/science-
g5-67.txt","qid":"5","scores":[1,1,1]},{"pid":"Science-textbook/science-g5-67.txt","qid":"6","scores":[1,1]},{"pid":"Science-textbook/science-g5-67.txt","qid":"7","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g5-67.txt","qid":"8","scores":[1,1,1,1,1]},{"pid":"Science-textbook/science-g5-67.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMallorca-6.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMallorca-6.txt","qid":"1","scores":[1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMallorca-6.txt","qid":"2","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMallorca-6.txt","qid":"3","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMallorca-6.txt","qid":"4","scores":[1,1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMallorca-6.txt","qid":"5","scores":[1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMallorca-6.txt","qid":"6","scores":[1,1,1,1,1,1]},{"pid":"History-Anthropology/oanc-HistoryMallorca-6.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbrecht Durer-25.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbrecht Durer-25.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbrecht Durer-25.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Wiki_articles-paragraphs-wikiAlbrecht 
Durer-25.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11508-0.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11508-0.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Fiction/gutenberg_withoutQuotes/gutenberg-11508-0.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"0","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"1","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"3","scores":[1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"4","scores":[1,1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"5","scores":[1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"6","scores":[1,1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"7","scores":[1,1,1,1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"8","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"9","scores":[1,1,1,1]},{"pid":"wikiMovieSummaries-261107.txt","qid":"10","scores":[1,1,1,1,1,1]},{"pid":"Science-textbook/science-g3-28.txt","qid":"0","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g3-28.txt","qid":"1","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g3-28.txt","qid":"2","scores":[1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g3-28.txt","qid":"3","scores":[1,1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g3-28.txt","qid":"4","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Science-textbook/science-g3-28.txt","qid":"5","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qid":"0","scores":[1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qid":"1","scores":[1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qid":"2","scores":[1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qid":"3","scores":[1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qi
d":"4","scores":[1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qid":"5","scores":[1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qid":"6","scores":[1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qid":"7","scores":[1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qid":"8","scores":[1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qid":"9","scores":[1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-6-30.txt","qid":"10","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/5787450.txt","qid":"0","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/5787450.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/5787450.txt","qid":"2","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/5787450.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"wikiMovieSummaries/5787450.txt","qid":"4","scores":[1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-5.txt","qid":"0","scores":[1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-5.txt","qid":"1","scores":[1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-5.txt","qid":"2","scores":[1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-5.txt","qid":"3","scores":[1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-5.txt","qid":"4","scores":[1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-5.txt","qid":"5","scores":[1,1,1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-5.txt","qid":"6","scores":[1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-5.txt","qid":"7","scores":[1,1,1]},{"pid":"Fiction-stories-masc-A_Wasted_Day-5.txt","qid":"8","scores":[1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"0","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"1","scores":[1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"2","scores":[1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"3","scores":[1,1,1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"4","scores":[1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-
4-2.txt","qid":"5","scores":[1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"6","scores":[1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"7","scores":[1,1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"8","scores":[1,1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"9","scores":[1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"10","scores":[1,1,1,1,1,1,1]},{"pid":"Sept11-reports/oanc-chapter-2-4-2.txt","qid":"11","scores":[1,1,1,1,1,1,1]}] -------------------------------------------------------------------------------- /multirc_materials/multirc-eval-v1.py: -------------------------------------------------------------------------------- 1 | ### Evaluation script used for evaluation of baselines for MultiRC dataset 2 | # The evaluation script expects the questions, and predicted answers from separate json files. 3 | # The predicted answers should be 1s and 0s (no real-valued scores) 4 | 5 | import json 6 | 7 | from multirc_measures import Measures 8 | 9 | # this is the location of your data; has to be downloaded from http://cogcomp.org/multirc/ 10 | inputFile = '/Users/daniel/ideaProjects/hard-qa/splitv1/dev_83-with-lucene.json' 11 | 12 | measures = Measures() 13 | 14 | def main(): 15 | eval('baseline-scores/human-01.json') 16 | # eval('baseline-scores/allOnes.json') 17 | # eval('baseline-scores/allZeros.json') 18 | # eval('baseline-scores/simpleLR.json') 19 | # eval('baseline-scores/lucene_world.json') 20 | # eval('baseline-scores/lucene_paragraphs.json') 21 | 22 | # the input to the `eval` function is the file which contains the binary predictions per question-id 23 | def eval(outFile): 24 | input = json.load(open(inputFile)) 25 | output = json.load(open(outFile)) 26 | output_map = dict([[a["pid"] + "==" + a["qid"], a["scores"]] for a in output]) 27 | 28 | assert len(output_map) == len(output), "You probably have redundancies in your keys" 29 | 30 | [P1, R1, F1m] = 
measures.per_question_metrics(input["data"], output_map) 31 | print("Per question measures (i.e. precision-recall per question, then average) ") 32 | print("\tP: " + str(P1) + " - R: " + str(R1) + " - F1m: " + str(F1m)) 33 | 34 | EM0 = measures.exact_match_metrics(input["data"], output_map, 0) 35 | EM1 = measures.exact_match_metrics(input["data"], output_map, 1) 36 | print("\tEM0: " + str(EM0)) 37 | print("\tEM1: " + str(EM1)) 38 | 39 | [P2, R2, F1a] = measures.per_dataset_metric(input["data"], output_map) 40 | 41 | print("Dataset-wide measures (i.e. precision-recall across all the candidate-answers in the dataset) ") 42 | print("\tP: " + str(P2) + " - R: " + str(R2) + " - F1a: " + str(F1a)) 43 | 44 | if __name__ == "__main__": 45 | main() 46 | -------------------------------------------------------------------------------- /multirc_materials/multirc-pr-curve-v1.py: -------------------------------------------------------------------------------- 1 | import json 2 | import numpy 3 | import matplotlib.pyplot as plt 4 | 5 | # this is the location of your data; has to be downloaded from http://cogcomp.org/multirc/ 6 | inputFile = '/Users/daniel/ideaProjects/hard-qa/split/dev_83.json' 7 | 8 | 9 | def main(): 10 | myPlot("Human", '-r', 1.5, 'baseline-scores/human.json') 11 | myPlot("IR(Paragraphs)", '-.g', 1.5, 'baseline-scores/lucene_paragraphs.json') 12 | myPlot("IR(World)", '*-b', 1.5, 'baseline-scores/lucene_world.json') 13 | myPlot("SimpleLR", '^--c', 1.5, 'baseline-scores/simpleLR.json') 14 | plt.xlabel('Recall', fontsize=11) 15 | plt.ylabel('Precision', fontsize=11) 16 | plt.legend(loc="lower left", fontsize=9) 17 | plt.ylim((0.0, 1.0)) 18 | plt.xlim((0.0, 1.0)) 19 | plt.show() 20 | 21 | 22 | def avg(l): 23 | return sum(l) / len(l) 24 | 25 | 26 | def myPlot(plotname, options, lw, outFile): 27 | print("PlotName: " + plotname) 28 | input = json.load(open(inputFile)) 29 | output = json.load(open(outFile)) 30 | outputMap = dict([[a["pid"] + "==" 
+ a["qid"], a["scores"]] for a in output]) 31 | 32 | minVal = min([min(a["scores"]) for a in output]) 33 | maxVal = max([max(a["scores"]) for a in output]) 34 | num = 10.0 # number of points to measure 35 | 36 | P1 = [] 37 | R1 = [] 38 | for thr in numpy.arange(minVal - 0.1, maxVal + 0.1, (maxVal - minVal) / num): 39 | R1Tmp = [] 40 | P1Tmp = [] 41 | for p in input["data"]: 42 | for qIdx, q in enumerate(p["paragraph"]["questions"]): 43 | id = p["id"] + "==" + str(qIdx) 44 | if (id in outputMap): 45 | predictedAns = [a > thr for a in outputMap.get(id)] 46 | correctAns = [int(a["isAnswer"]) for a in q["answers"]] 47 | predictCount = sum(predictedAns) 48 | correctCount = sum(correctAns) 49 | agreementCount = sum([a * b for (a, b) in zip(correctAns, predictedAns)]) 50 | p1 = (1.0 * agreementCount / correctCount) if correctCount > 0.0 else 1.0 51 | r1 = (1.0 * agreementCount / predictCount) if predictCount > 0.0 else 1.0 52 | P1Tmp.append(p1) 53 | R1Tmp.append(r1) 54 | else: 55 | print("The id " + id + " not found . . . 
") 56 | P1.append(avg(P1Tmp)) 57 | R1.append(avg(R1Tmp)) 58 | plt.plot(P1, R1, options, linewidth=lw, label=plotname) 59 | 60 | 61 | if __name__ == "__main__": 62 | main() 63 | -------------------------------------------------------------------------------- /multirc_materials/multirc_measures.py: -------------------------------------------------------------------------------- 1 | import math 2 | 3 | 4 | class Measures: 5 | 6 | @staticmethod 7 | def per_question_metrics(dataset, output_map): 8 | P = [] 9 | R = [] 10 | for p in dataset: 11 | for qIdx, q in enumerate(p["paragraph"]["questions"]): 12 | id = p["id"] + "==" + str(qIdx) 13 | if (id in output_map): 14 | predictedAns = output_map.get(id) 15 | correctAns = [int(a["isAnswer"]) for a in q["answers"]] 16 | predictCount = sum(predictedAns) 17 | correctCount = sum(correctAns) 18 | assert math.ceil(sum(predictedAns)) == sum(predictedAns), "sum of the scores: " + str(sum(predictedAns)) 19 | agreementCount = sum([a * b for (a, b) in zip(correctAns, predictedAns)]) 20 | p1 = (1.0 * agreementCount / predictCount) if predictCount > 0.0 else 1.0 21 | r1 = (1.0 * agreementCount / correctCount) if correctCount > 0.0 else 1.0 22 | P.append(p1) 23 | R.append(r1) 24 | else: 25 | print("The id " + id + " not found . . . ") 26 | 27 | pAvg = Measures.avg(P) 28 | rAvg = Measures.avg(R) 29 | f1Avg = (2 * rAvg * pAvg / (pAvg + rAvg)) if (pAvg + rAvg) > 0.0 else 0.0 30 | return [pAvg, rAvg, f1Avg] 31 | 32 | @staticmethod 33 | def exact_match_metrics(dataset, output_map, delta): 34 | EM = [] 35 | for p in dataset: 36 | for qIdx, q in enumerate(p["paragraph"]["questions"]): 37 | id = p["id"] + "==" + str(qIdx) 38 | if (id in output_map): 39 | predictedAns = output_map.get(id) 40 | correctAns = [int(a["isAnswer"]) for a in q["answers"]] 41 | em = 1.0 if sum([abs(i - j) for i, j in zip(correctAns, predictedAns)]) <= delta else 0.0 42 | EM.append(em) 43 | else: 44 | print("The id " + id + " not found . . . 
") 45 | 46 | return Measures.avg(EM) 47 | 48 | @staticmethod 49 | def per_dataset_metric(dataset, output_map): 50 | agreementCount = 0 51 | correctCount = 0 52 | predictCount = 0 53 | for p in dataset: 54 | for qIdx, q in enumerate(p["paragraph"]["questions"]): 55 | id = p["id"] + "==" + str(qIdx) 56 | if (id in output_map): 57 | predictedAns = output_map.get(id) 58 | correctAns = [int(a["isAnswer"]) for a in q["answers"]] 59 | predictCount += sum(predictedAns) 60 | correctCount += sum(correctAns) 61 | agreementCount += sum([a * b for (a, b) in zip(correctAns, predictedAns)]) 62 | else: 63 | print("The id " + id + " not found . . . ") 64 | 65 | p1 = (1.0 * agreementCount / predictCount) if predictCount > 0.0 else 1.0 66 | r1 = (1.0 * agreementCount / correctCount) if correctCount > 0.0 else 1.0 67 | return [p1, r1, (2 * r1 * p1 / (p1 + r1)) if (p1 + r1) > 0.0 else 0.0] 68 | 69 | @staticmethod 70 | def avg(l): 71 | return sum(l) / len(l) 72 | -------------------------------------------------------------------------------- /multirc_materials/pr-curve-output.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CogComp/multirc/0e52e6f98f9d2970c90bb86502ed7488eb720fc5/multirc_materials/pr-curve-output.png -------------------------------------------------------------------------------- /multirc_materials/threshold-tuning-raw-scores.py: -------------------------------------------------------------------------------- 1 | ## A utility function to binarize the real-valued scores 2 | import json 3 | import numpy as np 4 | from multirc_measures import Measures 5 | 6 | # this is the location of your data; has to be downloaded from http://cogcomp.org/multirc/ 7 | inputFile = '/Users/daniel/ideaProjects/hard-qa/splitv1/dev_83-with-lucene.json' 8 | 9 | measures = Measures() 10 | 11 | def main(): 12 | # tune_threshold('/Users/daniel/ideaProjects/multirc/baseline-scores/raw/simpleLR-raw.json') 13 | # 
convert_to_binary('/Users/daniel/ideaProjects/multirc/baseline-scores/raw/simpleLR-raw.json', 14 | # '/Users/daniel/ideaProjects/multirc/baseline-scores/simpleLR.json', -0.76) 15 | # tune_threshold('/Users/daniel/ideaProjects/multirc/baseline-scores/raw/lucene_paragraphs-raw.json') 16 | # convert_to_binary('/Users/daniel/ideaProjects/multirc/baseline-scores/raw/lucene_paragraphs-raw.json', 17 | # '/Users/daniel/ideaProjects/multirc/baseline-scores/lucene_paragraphs.json', 0.0499) 18 | # tune_threshold('/Users/daniel/ideaProjects/multirc/baseline-scores/raw/lucene_world-raw.json') 19 | convert_to_binary('/Users/daniel/ideaProjects/multirc/baseline-scores/raw/lucene_world-raw.json', 20 | '/Users/daniel/ideaProjects/multirc/baseline-scores/lucene_world.json', 0.56) 21 | 22 | 23 | def convert_to_binary(outFile, exported_file, threshold): 24 | output = json.load(open(outFile)) 25 | for a in output: 26 | a["scores"] = binarize(a["scores"], threshold) 27 | 28 | with open(exported_file, 'w') as outfile: 29 | json.dump(output, outfile) 30 | 31 | 32 | def binarize(scores, th): 33 | return [1.0 if s > th else 0.0 for s in scores] 34 | 35 | def tune_threshold(outFile): 36 | input = json.load(open(inputFile)) 37 | output = json.load(open(outFile)) 38 | 39 | for th in np.arange(-0.5, 1.0, 0.02): 40 | input_binarized = dict([[a["pid"] + "==" + a["qid"], binarize(a["scores"], th)] for a in output]) 41 | [P1, R1, F1m] = measures.per_question_metrics(input["data"], input_binarized) 42 | print("Threshold: " + str(th) + " \tP: " + str(P1) + " - R: " + str(R1) + " - F1m: " + str(F1m)) 43 | 44 | if __name__ == "__main__": 45 | main() 46 | -------------------------------------------------------------------------------- /surfaceLR-baseline/JSON.pm: -------------------------------------------------------------------------------- 1 | package JSON; 2 | 3 | 4 | use strict; 5 | use Carp (); 6 | use Exporter; 7 | BEGIN { @JSON::ISA = 'Exporter' } 8 | 9 | @JSON::EXPORT = qw(from_json to_json 
jsonToObj objToJson encode_json decode_json); 10 | 11 | BEGIN { 12 | $JSON::VERSION = '2.97001'; 13 | $JSON::DEBUG = 0 unless (defined $JSON::DEBUG); 14 | $JSON::DEBUG = $ENV{ PERL_JSON_DEBUG } if exists $ENV{ PERL_JSON_DEBUG }; 15 | } 16 | 17 | my %RequiredVersion = ( 18 | 'JSON::PP' => '2.27203', 19 | 'JSON::XS' => '2.34', 20 | ); 21 | 22 | # XS and PP common methods 23 | 24 | my @PublicMethods = qw/ 25 | ascii latin1 utf8 pretty indent space_before space_after relaxed canonical allow_nonref 26 | allow_blessed convert_blessed filter_json_object filter_json_single_key_object 27 | shrink max_depth max_size encode decode decode_prefix allow_unknown 28 | /; 29 | 30 | my @Properties = qw/ 31 | ascii latin1 utf8 indent space_before space_after relaxed canonical allow_nonref 32 | allow_blessed convert_blessed shrink max_depth max_size allow_unknown 33 | /; 34 | 35 | my @XSOnlyMethods = qw/allow_tags/; # Currently nothing 36 | 37 | my @PPOnlyMethods = qw/ 38 | indent_length sort_by 39 | allow_singlequote allow_bignum loose allow_barekey escape_slash as_nonblessed 40 | /; # JSON::PP specific 41 | 42 | 43 | # used in _load_xs and _load_pp ($INSTALL_ONLY is not used currently) 44 | my $_INSTALL_DONT_DIE = 1; # When _load_xs fails to load XS, don't die. 45 | my $_ALLOW_UNSUPPORTED = 0; 46 | my $_UNIV_CONV_BLESSED = 0; 47 | 48 | 49 | # Check the environment variable to decide worker module. 50 | 51 | unless ($JSON::Backend) { 52 | $JSON::DEBUG and Carp::carp("Check used worker module..."); 53 | 54 | my $backend = exists $ENV{PERL_JSON_BACKEND} ? 
$ENV{PERL_JSON_BACKEND} : 1; 55 | 56 | if ($backend eq '1') { 57 | $backend = 'JSON::XS,JSON::PP'; 58 | } 59 | elsif ($backend eq '0') { 60 | $backend = 'JSON::PP'; 61 | } 62 | elsif ($backend eq '2') { 63 | $backend = 'JSON::XS'; 64 | } 65 | $backend =~ s/\s+//g; 66 | 67 | my @backend_modules = split /,/, $backend; 68 | while(my $module = shift @backend_modules) { 69 | if ($module =~ /JSON::XS/) { 70 | _load_xs($module, @backend_modules ? $_INSTALL_DONT_DIE : 0); 71 | } 72 | elsif ($module =~ /JSON::PP/) { 73 | _load_pp($module); 74 | } 75 | elsif ($module =~ /JSON::backportPP/) { 76 | _load_pp($module); 77 | } 78 | else { 79 | Carp::croak "The value of environmental variable 'PERL_JSON_BACKEND' is invalid."; 80 | } 81 | last if $JSON::Backend; 82 | } 83 | } 84 | 85 | 86 | sub import { 87 | my $pkg = shift; 88 | my @what_to_export; 89 | my $no_export; 90 | 91 | for my $tag (@_) { 92 | if ($tag eq '-support_by_pp') { 93 | if (!$_ALLOW_UNSUPPORTED++) { 94 | JSON::Backend::XS 95 | ->support_by_pp(@PPOnlyMethods) if ($JSON::Backend->is_xs); 96 | } 97 | next; 98 | } 99 | elsif ($tag eq '-no_export') { 100 | $no_export++, next; 101 | } 102 | elsif ( $tag eq '-convert_blessed_universally' ) { 103 | my $org_encode = $JSON::Backend->can('encode'); 104 | eval q| 105 | require B; 106 | local $^W; 107 | no strict 'refs'; 108 | *{"${JSON::Backend}\::encode"} = sub { 109 | # only works with Perl 5.18+ 110 | local *UNIVERSAL::TO_JSON = sub { 111 | my $b_obj = B::svref_2object( $_[0] ); 112 | return $b_obj->isa('B::HV') ? { %{ $_[0] } } 113 | : $b_obj->isa('B::AV') ? 
[ @{ $_[0] } ] 114 | : undef 115 | ; 116 | }; 117 | $org_encode->(@_); 118 | }; 119 | | if ( !$_UNIV_CONV_BLESSED++ ); 120 | next; 121 | } 122 | push @what_to_export, $tag; 123 | } 124 | 125 | return if ($no_export); 126 | 127 | __PACKAGE__->export_to_level(1, $pkg, @what_to_export); 128 | } 129 | 130 | 131 | # OBSOLETED 132 | 133 | sub jsonToObj { 134 | my $alternative = 'from_json'; 135 | if (defined $_[0] and UNIVERSAL::isa($_[0], 'JSON')) { 136 | shift @_; $alternative = 'decode'; 137 | } 138 | Carp::carp "'jsonToObj' will be obsoleted. Please use '$alternative' instead."; 139 | return JSON::from_json(@_); 140 | }; 141 | 142 | sub objToJson { 143 | my $alternative = 'to_json'; 144 | if (defined $_[0] and UNIVERSAL::isa($_[0], 'JSON')) { 145 | shift @_; $alternative = 'encode'; 146 | } 147 | Carp::carp "'objToJson' will be obsoleted. Please use '$alternative' instead."; 148 | JSON::to_json(@_); 149 | }; 150 | 151 | 152 | # INTERFACES 153 | 154 | sub to_json ($@) { 155 | if ( 156 | ref($_[0]) eq 'JSON' 157 | or (@_ > 2 and $_[0] eq 'JSON') 158 | ) { 159 | Carp::croak "to_json should not be called as a method."; 160 | } 161 | my $json = JSON->new; 162 | 163 | if (@_ == 2 and ref $_[1] eq 'HASH') { 164 | my $opt = $_[1]; 165 | for my $method (keys %$opt) { 166 | $json->$method( $opt->{$method} ); 167 | } 168 | } 169 | 170 | $json->encode($_[0]); 171 | } 172 | 173 | 174 | sub from_json ($@) { 175 | if ( ref($_[0]) eq 'JSON' or $_[0] eq 'JSON' ) { 176 | Carp::croak "from_json should not be called as a method."; 177 | } 178 | my $json = JSON->new; 179 | 180 | if (@_ == 2 and ref $_[1] eq 'HASH') { 181 | my $opt = $_[1]; 182 | for my $method (keys %$opt) { 183 | $json->$method( $opt->{$method} ); 184 | } 185 | } 186 | 187 | return $json->decode( $_[0] ); 188 | } 189 | 190 | 191 | 192 | sub true { $JSON::true } 193 | 194 | sub false { $JSON::false } 195 | 196 | sub null { undef; } 197 | 198 | 199 | sub require_xs_version { $RequiredVersion{'JSON::XS'}; } 200 | 201 | sub 
backend { 202 | my $proto = shift; 203 | $JSON::Backend; 204 | } 205 | 206 | #*module = *backend; 207 | 208 | 209 | sub is_xs { 210 | return $_[0]->backend->is_xs; 211 | } 212 | 213 | 214 | sub is_pp { 215 | return $_[0]->backend->is_pp; 216 | } 217 | 218 | 219 | sub pureperl_only_methods { @PPOnlyMethods; } 220 | 221 | 222 | sub property { 223 | my ($self, $name, $value) = @_; 224 | 225 | if (@_ == 1) { 226 | my %props; 227 | for $name (@Properties) { 228 | my $method = 'get_' . $name; 229 | if ($name eq 'max_size') { 230 | my $value = $self->$method(); 231 | $props{$name} = $value == 1 ? 0 : $value; 232 | next; 233 | } 234 | $props{$name} = $self->$method(); 235 | } 236 | return \%props; 237 | } 238 | elsif (@_ > 3) { 239 | Carp::croak('property() can take only the option within 2 arguments.'); 240 | } 241 | elsif (@_ == 2) { 242 | if ( my $method = $self->can('get_' . $name) ) { 243 | if ($name eq 'max_size') { 244 | my $value = $self->$method(); 245 | return $value == 1 ? 0 : $value; 246 | } 247 | $self->$method(); 248 | } 249 | } 250 | else { 251 | $self->$name($value); 252 | } 253 | 254 | } 255 | 256 | 257 | 258 | # INTERNAL 259 | 260 | sub __load_xs { 261 | my ($module, $opt) = @_; 262 | 263 | $JSON::DEBUG and Carp::carp "Load $module."; 264 | my $required_version = $RequiredVersion{$module} || ''; 265 | 266 | eval qq| 267 | use $module $required_version (); 268 | |; 269 | 270 | if ($@) { 271 | if (defined $opt and $opt & $_INSTALL_DONT_DIE) { 272 | $JSON::DEBUG and Carp::carp "Can't load $module...($@)"; 273 | return 0; 274 | } 275 | Carp::croak $@; 276 | } 277 | $JSON::BackendModuleXS = $module; 278 | return 1; 279 | } 280 | 281 | sub _load_xs { 282 | my ($module, $opt) = @_; 283 | __load_xs($module, $opt) or return; 284 | 285 | my $data = join("", <DATA>); # this code is from Jcode 2.xx. 
286 | close(DATA); 287 | eval $data; 288 | JSON::Backend::XS->init($module); 289 | 290 | return 1; 291 | }; 292 | 293 | 294 | sub __load_pp { 295 | my ($module, $opt) = @_; 296 | 297 | $JSON::DEBUG and Carp::carp "Load $module."; 298 | my $required_version = $RequiredVersion{$module} || ''; 299 | 300 | eval qq| use $module $required_version () |; 301 | 302 | if ($@) { 303 | if ( $module eq 'JSON::PP' ) { 304 | $JSON::DEBUG and Carp::carp "Can't load $module ($@), so try to load JSON::backportPP"; 305 | $module = 'JSON::backportPP'; 306 | local $^W; # if PP installed but invalid version, backportPP redefines methods. 307 | eval qq| require $module |; 308 | } 309 | Carp::croak $@ if $@; 310 | } 311 | $JSON::BackendModulePP = $module; 312 | return 1; 313 | } 314 | 315 | sub _load_pp { 316 | my ($module, $opt) = @_; 317 | __load_pp($module, $opt); 318 | 319 | JSON::Backend::PP->init($module); 320 | }; 321 | 322 | # 323 | # Helper classes for Backend Module (PP) 324 | # 325 | 326 | package JSON::Backend::PP; 327 | 328 | sub init { 329 | my ($class, $module) = @_; 330 | 331 | # name may vary, but the module should (always) be a JSON::PP 332 | 333 | local $^W; 334 | no strict qw(refs); # this routine may be called after JSON::Backend::XS init was called. 
335 | *{"JSON::decode_json"} = \&{"JSON::PP::decode_json"}; 336 | *{"JSON::encode_json"} = \&{"JSON::PP::encode_json"}; 337 | *{"JSON::is_bool"} = \&{"JSON::PP::is_bool"}; 338 | 339 | $JSON::true = ${"JSON::PP::true"}; 340 | $JSON::false = ${"JSON::PP::false"}; 341 | 342 | push @JSON::Backend::PP::ISA, 'JSON::PP'; 343 | push @JSON::ISA, $class; 344 | $JSON::Backend = $class; 345 | $JSON::BackendModule = $module; 346 | ${"$class\::VERSION"} = $module->VERSION; 347 | 348 | for my $method (@XSOnlyMethods) { 349 | *{"JSON::$method"} = sub { 350 | Carp::carp("$method is not supported in $module."); 351 | $_[0]; 352 | }; 353 | } 354 | 355 | return 1; 356 | } 357 | 358 | sub is_xs { 0 }; 359 | sub is_pp { 1 }; 360 | 361 | # 362 | # To save memory, the below lines are read only when XS backend is used. 363 | # 364 | 365 | package JSON; 366 | 367 | 1; 368 | __DATA__ 369 | 370 | 371 | # 372 | # Helper classes for Backend Module (XS) 373 | # 374 | 375 | package JSON::Backend::XS; 376 | 377 | sub init { 378 | my ($class, $module) = @_; 379 | 380 | local $^W; 381 | no strict qw(refs); 382 | *{"JSON::decode_json"} = \&{"$module\::decode_json"}; 383 | *{"JSON::encode_json"} = \&{"$module\::encode_json"}; 384 | *{"JSON::is_bool"} = \&{"$module\::is_bool"}; 385 | 386 | $JSON::true = ${"$module\::true"}; 387 | $JSON::false = ${"$module\::false"}; 388 | 389 | push @JSON::Backend::XS::ISA, $module; 390 | push @JSON::ISA, $class; 391 | $JSON::Backend = $class; 392 | $JSON::BackendModule = $module; 393 | ${"$class\::VERSION"} = $module->VERSION; 394 | 395 | if ( $module->VERSION < 3 ) { 396 | eval 'package JSON::PP::Boolean'; 397 | push @{"$module\::Boolean::ISA"}, qw(JSON::PP::Boolean); 398 | } 399 | 400 | for my $method (@PPOnlyMethods) { 401 | *{"JSON::$method"} = sub { 402 | Carp::carp("$method is not supported in $module."); 403 | $_[0]; 404 | }; 405 | } 406 | 407 | return 1; 408 | } 409 | 410 | sub is_xs { 1 }; 411 | sub is_pp { 0 }; 412 | 413 | sub support_by_pp { 414 | my 
($class, @methods) = @_; 415 | 416 | JSON::__load_pp('JSON::PP'); 417 | 418 | local $^W; 419 | no strict qw(refs); 420 | 421 | for my $method (@methods) { 422 | my $pp_method = JSON::PP->can($method) or next; 423 | *{"JSON::$method"} = sub { 424 | if (!$_[0]->isa('JSON::PP')) { 425 | my $xs_self = $_[0]; 426 | my $pp_self = JSON::PP->new; 427 | for (@Properties) { 428 | my $getter = "get_$_"; 429 | $pp_self->$_($xs_self->$getter); 430 | } 431 | $_[0] = $pp_self; 432 | } 433 | $pp_method->(@_); 434 | }; 435 | } 436 | 437 | $JSON::DEBUG and Carp::carp("set -support_by_pp mode."); 438 | } 439 | 440 | 1; 441 | __END__ 442 | 443 | =head1 NAME 444 | 445 | JSON - JSON (JavaScript Object Notation) encoder/decoder 446 | 447 | =head1 SYNOPSIS 448 | 449 | use JSON; # imports encode_json, decode_json, to_json and from_json. 450 | 451 | # simple and fast interfaces (expect/generate UTF-8) 452 | 453 | $utf8_encoded_json_text = encode_json $perl_hash_or_arrayref; 454 | $perl_hash_or_arrayref = decode_json $utf8_encoded_json_text; 455 | 456 | # OO-interface 457 | 458 | $json = JSON->new->allow_nonref; 459 | 460 | $json_text = $json->encode( $perl_scalar ); 461 | $perl_scalar = $json->decode( $json_text ); 462 | 463 | $pretty_printed = $json->pretty->encode( $perl_scalar ); # pretty-printing 464 | 465 | =head1 VERSION 466 | 467 | 2.97001 468 | 469 | =head1 DESCRIPTION 470 | 471 | This module is a thin wrapper for L-compatible modules with a few 472 | additional features. All the backend modules convert a Perl data structure 473 | to a JSON text as of RFC4627 (which we know is obsolete but we still stick 474 | to; see below for an option to support part of RFC7159) and vice versa. 475 | This module uses L by default, and when JSON::XS is not available, 476 | this module falls back on L, which is in the Perl core since 5.14. 
477 | If JSON::PP is not available either, this module then falls back on 478 | JSON::backportPP (which is actually JSON::PP in a different .pm file) 479 | bundled in the same distribution as this module. You can also explicitly 480 | specify to use L<Cpanel::JSON::XS>, a fork of JSON::XS by Reini Urban. 481 | 482 | All these backend modules have slight incompatibilities between them, 483 | including extra features that other modules don't support, but as long as you 484 | use only common features (most important ones are described below), migration 485 | from backend to backend should be reasonably easy. For details, see each 486 | backend module you use. 487 | 488 | =head1 CHOOSING BACKEND 489 | 490 | This module respects an environmental variable called C<PERL_JSON_BACKEND> 491 | when it decides a backend module to use. If this environmental variable is 492 | not set, it tries to load JSON::XS, and if JSON::XS is not available, it 493 | falls back on JSON::PP, and then JSON::backportPP if JSON::PP is not available 494 | either. 
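The lookup order can be sketched outside Perl. The Python function below is only an illustration of how the loader code earlier in this file interprets the `PERL_JSON_BACKEND` value (the shorthand values `1`, `0` and `2`, or a comma-separated list). The name `resolve_backend_candidates` is hypothetical and is not part of the JSON module, and the function lists candidates without loading anything.

```python
def resolve_backend_candidates(env_value=None):
    """Return the ordered list of backend module names to try.

    Hypothetical helper: mirrors the shorthand values handled by JSON.pm
    ('1', '0', '2') and the comma-separated form. It loads nothing.
    """
    shorthand = {
        "1": "JSON::XS,JSON::PP",  # default: XS first, then pure Perl
        "0": "JSON::PP",           # pure Perl only
        "2": "JSON::XS",           # XS only, no fallback
    }
    # An unset variable behaves like the value 1.
    value = "1" if env_value is None else str(env_value)
    value = shorthand.get(value, value)
    # JSON.pm removes all whitespace before splitting on commas.
    value = "".join(value.split())
    return [name for name in value.split(",") if name]


print(resolve_backend_candidates())
print(resolve_backend_candidates("Cpanel::JSON::XS, JSON::XS, JSON::PP"))
```

JSON::backportPP does not show up in the default list because, in the loader code, the switch to it happens only when loading JSON::PP itself fails.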
495 | 496 | If you never want it to fall back on pure-Perl modules, set the 497 | variable like this (C<export> may be C<setenv>, C<set> and the like, 498 | depending on your environment): 499 | 500 | > export PERL_JSON_BACKEND=JSON::XS 501 | 502 | If you prefer Cpanel::JSON::XS to JSON::XS, then: 503 | 504 | > export PERL_JSON_BACKEND=Cpanel::JSON::XS,JSON::XS,JSON::PP 505 | 506 | You may also want to set this variable at the top of your test files, in order 507 | not to be bothered with incompatibilities between backends (you need to wrap 508 | this in C<BEGIN>, and set it before actually C<use>-ing the JSON module, as it decides 509 | its backend as soon as it's loaded): 510 | 511 | BEGIN { $ENV{PERL_JSON_BACKEND}='JSON::backportPP'; } 512 | use JSON; 513 | 514 | =head1 USING OPTIONAL FEATURES 515 | 516 | There are a few options you can set when you C<use> this module: 517 | 518 | =over 519 | 520 | =item -support_by_pp 521 | 522 | BEGIN { $ENV{PERL_JSON_BACKEND} = 'JSON::XS' } 523 | 524 | use JSON -support_by_pp; 525 | 526 | my $json = JSON->new; 527 | # escape_slash is for JSON::PP only. 528 | $json->allow_nonref->escape_slash->encode("/"); 529 | 530 | With this option, this module loads its pure perl backend along with 531 | its XS backend (if available), and lets the XS backend watch whether you set 532 | a flag only JSON::PP supports. When you do, the internal JSON::XS object 533 | is replaced with a newly created JSON::PP object with the settings copied 534 | from the XS object, so that you can use JSON::PP flags (and its slower 535 | C<decode>/C<encode> methods) from then on. In other words, this is not 536 | something that allows you to hook JSON::XS to change its behavior while 537 | keeping its speed. JSON::XS and JSON::PP objects are quite different 538 | (JSON::XS object is a blessed scalar reference, while JSON::PP object is 539 | a blessed hash reference), and can't share their internals. 
540 | 541 | To avoid needless overhead (by copying settings), you are advised not 542 | to use this option and just to use JSON::PP explicitly when you need 543 | JSON::PP features. 544 | 545 | =item -convert_blessed_universally 546 | 547 | use JSON -convert_blessed_universally; 548 | 549 | my $json = JSON->new->allow_nonref->convert_blessed; 550 | my $object = bless {foo => 'bar'}, 'Foo'; 551 | $json->encode($object); # => {"foo":"bar"} 552 | 553 | JSON::XS-compatible backend modules don't encode blessed objects by 554 | default (except for their boolean values, which are typically blessed 555 | JSON::PP::Boolean objects). If you need to encode a data structure 556 | that may contain objects, you usually need to look into the structure 557 | and replace objects with alternative non-blessed values, or enable 558 | C<convert_blessed> and provide a C<TO_JSON> method for each object's 559 | (base) class that may be found in the structure, in order to let the 560 | methods replace the objects with whatever scalar values the methods 561 | return. 562 | 563 | If you need to serialise data structures that may contain arbitrary 564 | objects, it's probably better to use other serialisers (such as 565 | L<Sereal> or L<Storable> for example), but if you do want to use 566 | this module for that purpose, C<-convert_blessed_universally> option 567 | may help, which tweaks C<encode> method of the backend to install 568 | C<UNIVERSAL::TO_JSON> method (locally) before encoding, so that 569 | all the objects that don't have their own C<TO_JSON> method can 570 | fall back on the method in the C<UNIVERSAL> namespace. Note that you 571 | still need to enable C<convert_blessed> flag to actually encode 572 | objects in a data structure, and C<UNIVERSAL::TO_JSON> method 573 | installed by this option only converts blessed hash/array references 574 | into their unblessed clone (including private keys/values that are 575 | not supposed to be exposed). Other blessed references will be 576 | converted into null. 577 | 578 | This feature is experimental and may be removed in the future. 
579 | 580 | =item -no_export 581 | 582 | When you don't want to import functional interfaces from a module, you 583 | usually supply C<()> to its C statement. 584 | 585 | use JSON (); # no functional interfaces 586 | 587 | If you don't want to import functional interfaces, but you also want to 588 | use any of the above options, add C<-no_export> to the option list. 589 | 590 | # no functional interfaces, while JSON::PP support is enabled. 591 | use JSON -support_by_pp, -no_export; 592 | 593 | =back 594 | 595 | =head1 FUNCTIONAL INTERFACE 596 | 597 | This section is taken from JSON::XS. C and C 598 | are exported by default. 599 | 600 | This module also exports C and C for backward 601 | compatibility. These are slower, and may expect/generate different stuff 602 | from what C and C do, depending on their 603 | options. It's better just to use Object-Oriented interfaces than using 604 | these two functions. 605 | 606 | =head2 encode_json 607 | 608 | $json_text = encode_json $perl_scalar 609 | 610 | Converts the given Perl data structure to a UTF-8 encoded, binary string 611 | (that is, the string contains octets only). Croaks on error. 612 | 613 | This function call is functionally identical to: 614 | 615 | $json_text = JSON->new->utf8->encode($perl_scalar) 616 | 617 | Except being faster. 618 | 619 | =head2 decode_json 620 | 621 | $perl_scalar = decode_json $json_text 622 | 623 | The opposite of C: expects an UTF-8 (binary) string and tries 624 | to parse that as an UTF-8 encoded JSON text, returning the resulting 625 | reference. Croaks on error. 626 | 627 | This function call is functionally identical to: 628 | 629 | $perl_scalar = JSON->new->utf8->decode($json_text) 630 | 631 | Except being faster. 632 | 633 | =head2 to_json 634 | 635 | $json_text = to_json($perl_scalar[, $optional_hashref]) 636 | 637 | Converts the given Perl data structure to a Unicode string by default. 638 | Croaks on error. 
639 | 640 | Basically, this function call is functionally identical to: 641 | 642 | $json_text = JSON->new->encode($perl_scalar) 643 | 644 | Except being slower. 645 | 646 | You can pass an optional hash reference to modify its behavior, but 647 | that may change what C expects/generates (see 648 | C for details). 649 | 650 | $json_text = to_json($perl_scalar, {utf8 => 1, pretty => 1}) 651 | # => JSON->new->utf8(1)->pretty(1)->encode($perl_scalar) 652 | 653 | =head2 from_json 654 | 655 | $perl_scalar = from_json($json_text[, $optional_hashref]) 656 | 657 | The opposite of C: expects a Unicode string and tries 658 | to parse it, returning the resulting reference. Croaks on error. 659 | 660 | Basically, this function call is functionally identical to: 661 | 662 | $perl_scalar = JSON->new->decode($json_text) 663 | 664 | You can pass an optional hash reference to modify its behavior, but 665 | that may change what C expects/generates (see 666 | C for details). 667 | 668 | $perl_scalar = from_json($json_text, {utf8 => 1}) 669 | # => JSON->new->utf8(1)->decode($json_text) 670 | 671 | =head2 JSON::is_bool 672 | 673 | $is_boolean = JSON::is_bool($scalar) 674 | 675 | Returns true if the passed scalar represents either JSON::true or 676 | JSON::false, two constants that act like C<1> and C<0> respectively 677 | and are also used to represent JSON C and C in Perl strings. 678 | 679 | See L, below, for more information on how JSON values are mapped to 680 | Perl. 681 | 682 | =head1 COMMON OBJECT-ORIENTED INTERFACE 683 | 684 | This section is also taken from JSON::XS. 685 | 686 | The object oriented interface lets you configure your own encoding or 687 | decoding style, within the limits of supported formats. 688 | 689 | =head2 new 690 | 691 | $json = JSON->new 692 | 693 | Creates a new JSON::XS-compatible backend object that can be used to de/encode JSON 694 | strings. All boolean flags described below are by default I. 
695 | 696 | The mutators for flags all return the backend object again and thus calls can 697 | be chained: 698 | 699 | my $json = JSON->new->utf8->space_after->encode({a => [1,2]}) 700 | => {"a": [1, 2]} 701 | 702 | =head2 ascii 703 | 704 | $json = $json->ascii([$enable]) 705 | 706 | $enabled = $json->get_ascii 707 | 708 | If C<$enable> is true (or missing), then the C method will not 709 | generate characters outside the code range C<0..127> (which is ASCII). Any 710 | Unicode characters outside that range will be escaped using either a 711 | single \uXXXX (BMP characters) or a double \uHHHH\uLLLLL escape sequence, 712 | as per RFC4627. The resulting encoded JSON text can be treated as a native 713 | Unicode string, an ascii-encoded, latin1-encoded or UTF-8 encoded string, 714 | or any other superset of ASCII. 715 | 716 | If C<$enable> is false, then the C method will not escape Unicode 717 | characters unless required by the JSON syntax or other flags. This results 718 | in a faster and more compact format. 719 | 720 | See also the section I later in this document. 721 | 722 | The main use for this flag is to produce JSON texts that can be 723 | transmitted over a 7-bit channel, as the encoded JSON texts will not 724 | contain any 8 bit characters. 725 | 726 | JSON->new->ascii(1)->encode([chr 0x10401]) 727 | => ["\ud801\udc01"] 728 | 729 | =head2 latin1 730 | 731 | $json = $json->latin1([$enable]) 732 | 733 | $enabled = $json->get_latin1 734 | 735 | If C<$enable> is true (or missing), then the C method will encode 736 | the resulting JSON text as latin1 (or iso-8859-1), escaping any characters 737 | outside the code range C<0..255>. The resulting string can be treated as a 738 | latin1-encoded JSON text or a native Unicode string. The C method 739 | will not be affected in any way by this flag, as C by default 740 | expects Unicode, which is a strict superset of latin1. 
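As a rough illustration of the escaping rule behind the `ascii` and `latin1` flags, the Python sketch below keeps code points up to a chosen limit as they are and escapes everything above it as `\uXXXX`, using a surrogate pair for characters outside the BMP. The function name `escape_outside` and its `limit` parameter are assumptions made for this example. It covers only the range rule, not the quoting or control-character escaping a real JSON encoder performs.

```python
def escape_outside(text, limit):
    """Escape every character whose code point is above `limit`.

    limit=127 imitates the `ascii` flag, limit=255 the `latin1` flag.
    Hypothetical helper for illustration only.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp <= limit:
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)
        else:
            # Outside the BMP: split into a UTF-16 surrogate pair.
            cp -= 0x10000
            high = 0xD800 + (cp >> 10)
            low = 0xDC00 + (cp & 0x3FF)
            out.append("\\u%04x\\u%04x" % (high, low))
    return "".join(out)


print(escape_outside(chr(0x10401), 127))                    # \ud801\udc01
print(escape_outside("\x89\u0abc", 255) == "\x89\\u0abc")   # True
```

With a limit of 255, U+0089 passes through unchanged while U+0ABC is escaped, which is why the `latin1` form stays compact for binary-like data.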
741 | 742 | If C<$enable> is false, then the C method will not escape Unicode 743 | characters unless required by the JSON syntax or other flags. 744 | 745 | See also the section I later in this document. 746 | 747 | The main use for this flag is efficiently encoding binary data as JSON 748 | text, as most octets will not be escaped, resulting in a smaller encoded 749 | size. The disadvantage is that the resulting JSON text is encoded 750 | in latin1 (and must correctly be treated as such when storing and 751 | transferring), a rare encoding for JSON. It is therefore most useful when 752 | you want to store data structures known to contain binary data efficiently 753 | in files or databases, not when talking to other JSON encoders/decoders. 754 | 755 | JSON->new->latin1->encode (["\x{89}\x{abc}"] 756 | => ["\x{89}\\u0abc"] # (perl syntax, U+abc escaped, U+89 not) 757 | 758 | =head2 utf8 759 | 760 | $json = $json->utf8([$enable]) 761 | 762 | $enabled = $json->get_utf8 763 | 764 | If C<$enable> is true (or missing), then the C method will encode 765 | the JSON result into UTF-8, as required by many protocols, while the 766 | C method expects to be handled an UTF-8-encoded string. Please 767 | note that UTF-8-encoded strings do not contain any characters outside the 768 | range C<0..255>, they are thus useful for bytewise/binary I/O. In future 769 | versions, enabling this option might enable autodetection of the UTF-16 770 | and UTF-32 encoding families, as described in RFC4627. 771 | 772 | If C<$enable> is false, then the C method will return the JSON 773 | string as a (non-encoded) Unicode string, while C expects thus a 774 | Unicode string. Any decoding or encoding (e.g. to UTF-8 or UTF-16) needs 775 | to be done yourself, e.g. using the Encode module. 776 | 777 | See also the section I later in this document. 
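The distinction between a character string and its UTF-8 octets can be shown in a few lines. The sketch below uses Python's standard `json` module purely as a stand-in and is not the Perl API: the text form corresponds to the flag being disabled, and the explicitly encoded form corresponds to the flag being enabled.

```python
import json

data = {"name": "caf\u00e9"}

# Flag disabled: the encoder returns a character (text) string.
text = json.dumps(data, ensure_ascii=False)

# Flag enabled: the result is a sequence of UTF-8 octets.
octets = text.encode("utf-8")

# The accented letter is one character but two octets.
print(len(text), len(octets))
# Every element of the octet string fits in the range 0..255.
print(all(b <= 255 for b in octets))
# Decoding reverses the steps: octets to text, then text to data.
print(json.loads(octets.decode("utf-8")) == data)
```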
778 | 779 | Example, output UTF-16BE-encoded JSON: 780 | 781 | use Encode; 782 | $jsontext = encode "UTF-16BE", JSON->new->encode ($object); 783 | 784 | Example, decode UTF-32LE-encoded JSON: 785 | 786 | use Encode; 787 | $object = JSON->new->decode (decode "UTF-32LE", $jsontext); 788 | 789 | =head2 pretty 790 | 791 | $json = $json->pretty([$enable]) 792 | 793 | This enables (or disables) all of the C, C and 794 | C (and in the future possibly more) flags in one call to 795 | generate the most readable (or most compact) form possible. 796 | 797 | =head2 indent 798 | 799 | $json = $json->indent([$enable]) 800 | 801 | $enabled = $json->get_indent 802 | 803 | If C<$enable> is true (or missing), then the C method will use a multiline 804 | format as output, putting every array member or object/hash key-value pair 805 | into its own line, indenting them properly. 806 | 807 | If C<$enable> is false, no newlines or indenting will be produced, and the 808 | resulting JSON text is guaranteed not to contain any C. 809 | 810 | This setting has no effect when decoding JSON texts. 811 | 812 | =head2 space_before 813 | 814 | $json = $json->space_before([$enable]) 815 | 816 | $enabled = $json->get_space_before 817 | 818 | If C<$enable> is true (or missing), then the C method will add an extra 819 | optional space before the C<:> separating keys from values in JSON objects. 820 | 821 | If C<$enable> is false, then the C method will not add any extra 822 | space at those places. 823 | 824 | This setting has no effect when decoding JSON texts. You will also 825 | most likely combine this setting with C. 
826 | 827 | Example, space_before enabled, space_after and indent disabled: 828 | 829 | {"key" :"value"} 830 | 831 | =head2 space_after 832 | 833 | $json = $json->space_after([$enable]) 834 | 835 | $enabled = $json->get_space_after 836 | 837 | If C<$enable> is true (or missing), then the C<encode> method will add an extra 838 | optional space after the C<:> separating keys from values in JSON objects 839 | and extra whitespace after the C<,> separating key-value pairs and array 840 | members. 841 | 842 | If C<$enable> is false, then the C<encode> method will not add any extra 843 | space at those places. 844 | 845 | This setting has no effect when decoding JSON texts. 846 | 847 | Example, space_before and indent disabled, space_after enabled: 848 | 849 | {"key": "value"} 850 | 851 | =head2 relaxed 852 | 853 | $json = $json->relaxed([$enable]) 854 | 855 | $enabled = $json->get_relaxed 856 | 857 | If C<$enable> is true (or missing), then C<decode> will accept some 858 | extensions to normal JSON syntax (see below). C<encode> will not be 859 | affected in any way. I<Be aware that this option makes you accept invalid 860 | JSON texts as if they were valid!>. I suggest only to use this option to 861 | parse application-specific files written by humans (configuration files, 862 | resource files etc.) 863 | 864 | If C<$enable> is false (the default), then C<decode> will only accept 865 | valid JSON texts. 866 | 867 | Currently accepted extensions are: 868 | 869 | =over 4 870 | 871 | =item * list items can have an end-comma 872 | 873 | JSON I<separates> array elements and key-value pairs with commas. This 874 | can be annoying if you write JSON texts manually and want to be able to 875 | quickly append elements, so this extension accepts comma at the end of 876 | such items not just between them: 877 | 878 | [ 879 | 1, 880 | 2, <- this comma not normally allowed 881 | ] 882 | { 883 | "k1": "v1", 884 | "k2": "v2", <- this comma not normally allowed 885 | } 886 | 887 | =item * shell-style '#'-comments 888 | 889 | Whenever JSON allows whitespace, shell-style comments are additionally 890 | allowed. 
They are terminated by the first carriage-return or line-feed 891 | character, after which more white-space and comments are allowed. 892 | 893 | [ 894 | 1, # this comment not allowed in JSON 895 | # neither this one... 896 | ] 897 | 898 | =back 899 | 900 | =head2 canonical 901 | 902 | $json = $json->canonical([$enable]) 903 | 904 | $enabled = $json->get_canonical 905 | 906 | If C<$enable> is true (or missing), then the C method will output JSON objects 907 | by sorting their keys. This is adding a comparatively high overhead. 908 | 909 | If C<$enable> is false, then the C method will output key-value 910 | pairs in the order Perl stores them (which will likely change between runs 911 | of the same script, and can change even within the same run from 5.18 912 | onwards). 913 | 914 | This option is useful if you want the same data structure to be encoded as 915 | the same JSON text (given the same overall settings). If it is disabled, 916 | the same hash might be encoded differently even if contains the same data, 917 | as key-value pairs have no inherent ordering in Perl. 918 | 919 | This setting has no effect when decoding JSON texts. 920 | 921 | This setting has currently no effect on tied hashes. 922 | 923 | =head2 allow_nonref 924 | 925 | $json = $json->allow_nonref([$enable]) 926 | 927 | $enabled = $json->get_allow_nonref 928 | 929 | If C<$enable> is true (or missing), then the C method can convert a 930 | non-reference into its corresponding string, number or null JSON value, 931 | which is an extension to RFC4627. Likewise, C will accept those JSON 932 | values instead of croaking. 933 | 934 | If C<$enable> is false, then the C method will croak if it isn't 935 | passed an arrayref or hashref, as JSON texts must either be an object 936 | or array. Likewise, C will croak if given something that is not a 937 | JSON object or array. 
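The top-level restriction that `allow_nonref` lifts can be sketched as a small wrapper. The example below is an illustration in Python, not the Perl implementation: `decode_rfc4627` is a hypothetical name, and its `allow_nonref` parameter simply borrows the flag name used here.

```python
import json


def decode_rfc4627(text, allow_nonref=False):
    """Decode JSON, rejecting non-container top-level values by default.

    Hypothetical wrapper around Python's json.loads; the parameter name
    borrows the flag name from this document.
    """
    value = json.loads(text)
    if not allow_nonref and not isinstance(value, (dict, list)):
        raise ValueError("top-level value must be an object or array")
    return value


print(decode_rfc4627('["Hello, World!"]'))
print(decode_rfc4627('"Hello, World!"', allow_nonref=True))
try:
    decode_rfc4627('"Hello, World!"')
except ValueError as err:
    print("rejected:", err)
```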
938 | 939 | Example, encode a Perl scalar as JSON value with enabled C, 940 | resulting in an invalid JSON text: 941 | 942 | JSON->new->allow_nonref->encode ("Hello, World!") 943 | => "Hello, World!" 944 | 945 | =head2 allow_unknown 946 | 947 | $json = $json->allow_unknown ([$enable]) 948 | 949 | $enabled = $json->get_allow_unknown 950 | 951 | If C<$enable> is true (or missing), then C will I throw an 952 | exception when it encounters values it cannot represent in JSON (for 953 | example, filehandles) but instead will encode a JSON C value. Note 954 | that blessed objects are not included here and are handled separately by 955 | c. 956 | 957 | If C<$enable> is false (the default), then C will throw an 958 | exception when it encounters anything it cannot encode as JSON. 959 | 960 | This option does not affect C in any way, and it is recommended to 961 | leave it off unless you know your communications partner. 962 | 963 | =head2 allow_blessed 964 | 965 | $json = $json->allow_blessed([$enable]) 966 | 967 | $enabled = $json->get_allow_blessed 968 | 969 | See L for details. 970 | 971 | If C<$enable> is true (or missing), then the C method will not 972 | barf when it encounters a blessed reference that it cannot convert 973 | otherwise. Instead, a JSON C value is encoded instead of the object. 974 | 975 | If C<$enable> is false (the default), then C will throw an 976 | exception when it encounters a blessed object that it cannot convert 977 | otherwise. 978 | 979 | This setting has no effect on C. 980 | 981 | =head2 convert_blessed 982 | 983 | $json = $json->convert_blessed([$enable]) 984 | 985 | $enabled = $json->get_convert_blessed 986 | 987 | See L for details. 988 | 989 | If C<$enable> is true (or missing), then C, upon encountering a 990 | blessed object, will check for the availability of the C method 991 | on the object's class. If found, it will be called in scalar context and 992 | the resulting scalar will be encoded instead of the object. 
993 | 994 | The C method may safely call die if it wants. If C 995 | returns other blessed objects, those will be handled in the same 996 | way. C must take care of not causing an endless recursion cycle 997 | (== crash) in this case. The name of C was chosen because other 998 | methods called by the Perl core (== not by the user of the object) are 999 | usually in upper case letters and to avoid collisions with any C 1000 | function or method. 1001 | 1002 | If C<$enable> is false (the default), then C will not consider 1003 | this type of conversion. 1004 | 1005 | This setting has no effect on C. 1006 | 1007 | =head2 filter_json_object 1008 | 1009 | $json = $json->filter_json_object([$coderef]) 1010 | 1011 | When C<$coderef> is specified, it will be called from C each 1012 | time it decodes a JSON object. The only argument is a reference to the 1013 | newly-created hash. If the code references returns a single scalar (which 1014 | need not be a reference), this value (i.e. a copy of that scalar to avoid 1015 | aliasing) is inserted into the deserialised data structure. If it returns 1016 | an empty list (NOTE: I C, which is a valid scalar), the 1017 | original deserialised hash will be inserted. This setting can slow down 1018 | decoding considerably. 1019 | 1020 | When C<$coderef> is omitted or undefined, any existing callback will 1021 | be removed and C will not change the deserialised hash in any 1022 | way. 1023 | 1024 | Example, convert all JSON objects into the integer 5: 1025 | 1026 | my $js = JSON->new->filter_json_object (sub { 5 }); 1027 | # returns [5] 1028 | $js->decode ('[{}]'); # the given subroutine takes a hash reference. 1029 | # throw an exception because allow_nonref is not enabled 1030 | # so a lone 5 is not allowed. 
1031 | $js->decode ('{"a":1, "b":2}'); 1032 | 1033 | =head2 filter_json_single_key_object 1034 | 1035 | $json = $json->filter_json_single_key_object($key [=> $coderef]) 1036 | 1037 | Works remotely similar to C<filter_json_object>, but is only called for 1038 | JSON objects having a single key named C<$key>. 1039 | 1040 | This C<$coderef> is called before the one specified via 1041 | C<filter_json_object>, if any. It gets passed the single value in the JSON 1042 | object. If it returns a single value, it will be inserted into the data 1043 | structure. If it returns nothing (not even C<undef> but the empty list), 1044 | the callback from C<filter_json_object> will be called next, as if no 1045 | single-key callback were specified. 1046 | 1047 | If C<$coderef> is omitted or undefined, the corresponding callback will be 1048 | disabled. There can only ever be one callback for a given key. 1049 | 1050 | As this callback gets called less often than the C<filter_json_object> 1051 | one, decoding speed will not usually suffer as much. Therefore, single-key 1052 | objects make excellent targets to serialise Perl objects into, especially 1053 | as single-key JSON objects are as close to the type-tagged value concept 1054 | as JSON gets (it's basically an ID/VALUE tuple). Of course, JSON does not 1055 | support this in any way, so you need to make sure your data never looks 1056 | like a serialised Perl hash. 1057 | 1058 | Typical names for the single object key are C<__class_whatever__>, or 1059 | C<$__dollars_are_rarely_used__$> or C<}ugly_brace_placement>, or even 1060 | things like C<__class_md5sum(classname)__>, to reduce the risk of clashing 1061 | with real hashes. 
1062 | 1063 | Example, decode JSON objects of the form C<< { "__widget__" => <id> } >> 1064 | into the corresponding C<< $WIDGET{<id>} >> object: 1065 | 1066 | # return whatever is in $WIDGET{5}: 1067 | JSON 1068 | ->new 1069 | ->filter_json_single_key_object (__widget__ => sub { 1070 | $WIDGET{ $_[0] } 1071 | }) 1072 | ->decode ('{"__widget__": 5}') 1073 | 1074 | # this can be used with a TO_JSON method in some "widget" class 1075 | # for serialisation to json: 1076 | sub WidgetBase::TO_JSON { 1077 | my ($self) = @_; 1078 | 1079 | unless ($self->{id}) { 1080 | $self->{id} = ..get..some..id..; 1081 | $WIDGET{$self->{id}} = $self; 1082 | } 1083 | 1084 | { __widget__ => $self->{id} } 1085 | } 1086 | 1087 | =head2 max_depth 1088 | 1089 | $json = $json->max_depth([$maximum_nesting_depth]) 1090 | 1091 | $max_depth = $json->get_max_depth 1092 | 1093 | Sets the maximum nesting level (default C<512>) accepted while encoding 1094 | or decoding. If a higher nesting level is detected in JSON text or a Perl 1095 | data structure, then the encoder and decoder will stop and croak at that 1096 | point. 1097 | 1098 | Nesting level is defined by the number of hash- or arrayrefs that the encoder 1099 | needs to traverse to reach a given point or the number of C<{> or C<[> 1100 | characters without their matching closing parenthesis crossed to reach a 1101 | given character in a string. 1102 | 1103 | Setting the maximum depth to one disallows any nesting, which ensures 1104 | that the object is only a single hash/object or array. 1105 | 1106 | If no argument is given, the highest possible setting will be used, which 1107 | is rarely useful. 1108 | 1109 | =head2 max_size 1110 | 1111 | $json = $json->max_size([$maximum_string_size]) 1112 | 1113 | $max_size = $json->get_max_size 1114 | 1115 | Set the maximum length a JSON text may have (in bytes) where decoding is 1116 | being attempted. The default is C<0>, meaning no limit. 
When C<decode> 1117 | is called on a string that is longer than this many bytes, it will not 1118 | attempt to decode the string but throw an exception. This setting has no 1119 | effect on C<encode> (yet). 1120 | 1121 | If no argument is given, the limit check will be deactivated (same as when 1122 | C<0> is specified). 1123 | 1124 | =head2 encode 1125 | 1126 | $json_text = $json->encode($perl_scalar) 1127 | 1128 | Converts the given Perl value or data structure to its JSON 1129 | representation. Croaks on error. 1130 | 1131 | =head2 decode 1132 | 1133 | $perl_scalar = $json->decode($json_text) 1134 | 1135 | The opposite of C<encode>: expects a JSON text and tries to parse it, 1136 | returning the resulting simple scalar or reference. Croaks on error. 1137 | 1138 | =head2 decode_prefix 1139 | 1140 | ($perl_scalar, $characters) = $json->decode_prefix($json_text) 1141 | 1142 | This works like the C<decode> method, but instead of raising an exception 1143 | when there is trailing garbage after the first JSON object, it will 1144 | silently stop parsing there and return the number of characters consumed 1145 | so far. 1146 | 1147 | This is useful if your JSON texts are not delimited by an outer protocol 1148 | and you need to know where the JSON text ends. 1149 | 1150 | JSON->new->decode_prefix ("[1] the tail") 1151 | => ([1], 3) 1152 | 1153 | =head1 ADDITIONAL METHODS 1154 | 1155 | The following methods are for this module only. 1156 | 1157 | =head2 backend 1158 | 1159 | $backend = $json->backend 1160 | 1161 | Since 2.92, the C<backend> method returns the abstract backend module currently in use, 1162 | which should be JSON::Backend::XS (which inherits JSON::XS or Cpanel::JSON::XS), 1163 | or JSON::Backend::PP (which inherits JSON::PP), so that the actual 1164 | backend module is not monkey-patched globally. 1165 | 1166 | If you need to know which backend is actually used, use C<isa> instead of string comparison. 
1167 | 1168 | =head2 is_xs 1169 | 1170 | $boolean = $json->is_xs 1171 | 1172 | Returns true if the backend inherits JSON::XS or Cpanel::JSON::XS. 1173 | 1174 | =head2 is_pp 1175 | 1176 | $boolean = $json->is_pp 1177 | 1178 | Returns true if the backend inherits JSON::PP. 1179 | 1180 | =head2 property 1181 | 1182 | $settings = $json->property() 1183 | 1184 | Returns a reference to a hash that holds all the common flag settings. 1185 | 1186 | $json = $json->property('utf8' => 1) 1187 | $value = $json->property('utf8') # 1 1188 | 1189 | You can use this to get/set a value of a particular flag. 1190 | 1191 | =head1 INCREMENTAL PARSING 1192 | 1193 | This section is also taken from JSON::XS. 1194 | 1195 | In some cases, there is the need for incremental parsing of JSON 1196 | texts. While this module always has to keep both JSON text and resulting 1197 | Perl data structure in memory at one time, it does allow you to parse a 1198 | JSON stream incrementally. It does so by accumulating text until it has 1199 | a full JSON object, which it then can decode. This process is similar to 1200 | using C<decode_prefix> to see if a full JSON object is available, but 1201 | is much more efficient (and can be implemented with a minimum of method 1202 | calls). 1203 | 1204 | This module will only attempt to parse the JSON text once it is sure it 1205 | has enough text to get a decisive result, using a very simple but 1206 | truly incremental parser. This means that it sometimes won't stop as 1207 | early as the full parser, for example, it doesn't detect mismatched 1208 | parentheses. The only thing it guarantees is that it starts decoding as 1209 | soon as a syntactically valid JSON text has been seen. This means you need 1210 | to set resource limits (e.g. C<max_size>) to ensure the parser will stop 1211 | parsing in the presence of syntax errors. 1212 | 1213 | The following methods implement this incremental parser. 
1214 | 1215 | =head2 incr_parse 1216 | 1217 | $json->incr_parse( [$string] ) # void context 1218 | 1219 | $obj_or_undef = $json->incr_parse( [$string] ) # scalar context 1220 | 1221 | @obj_or_empty = $json->incr_parse( [$string] ) # list context 1222 | 1223 | This is the central parsing function. It can both append new text and 1224 | extract objects from the stream accumulated so far (both of these 1225 | functions are optional). 1226 | 1227 | If C<$string> is given, then this string is appended to the already 1228 | existing JSON fragment stored in the C<$json> object. 1229 | 1230 | After that, if the function is called in void context, it will simply 1231 | return without doing anything further. This can be used to add more text 1232 | in as many chunks as you want. 1233 | 1234 | If the method is called in scalar context, then it will try to extract 1235 | exactly I<one> JSON object. If that is successful, it will return this 1236 | object, otherwise it will return C<undef>. If there is a parse error, 1237 | this method will croak just as C<decode> would do (one can then use 1238 | C<incr_skip> to skip the erroneous part). This is the most common way of 1239 | using the method. 1240 | 1241 | And finally, in list context, it will try to extract as many objects 1242 | from the stream as it can find and return them, or the empty list 1243 | otherwise. For this to work, there must be no separators (other than 1244 | whitespace) between the JSON objects or arrays, instead they must be 1245 | concatenated back-to-back. If an error occurs, an exception will be 1246 | raised as in the scalar context case. Note that in this case, any 1247 | previously-parsed JSON texts will be lost. 1248 | 1249 | Example: Parse some JSON arrays/objects in a given string and return 1250 | them. 
1251 | 1252 | my @objs = JSON->new->incr_parse ("[5][7][1,2]"); 1253 | 1254 | =head2 incr_text 1255 | 1256 | $lvalue_string = $json->incr_text 1257 | 1258 | This method returns the currently stored JSON fragment as an lvalue, that 1259 | is, you can manipulate it. This I<only> works when a preceding call to 1260 | C<incr_parse> in I<scalar context> successfully returned an object. Under 1261 | all other circumstances you must not call this function (I mean it. 1262 | although in simple tests it might actually work, it I<will> fail under 1263 | real world conditions). As a special exception, you can also call this 1264 | method before having parsed anything. 1265 | 1266 | That means you can only use this function to look at or manipulate text 1267 | before or after complete JSON objects, not while the parser is in the 1268 | middle of parsing a JSON object. 1269 | 1270 | This function is useful in two cases: a) finding the trailing text after a 1271 | JSON object or b) parsing multiple JSON objects separated by non-JSON text 1272 | (such as commas). 1273 | 1274 | =head2 incr_skip 1275 | 1276 | $json->incr_skip 1277 | 1278 | This will reset the state of the incremental parser and will remove 1279 | the parsed text from the input buffer so far. This is useful after 1280 | C<incr_parse> died, in which case the input buffer and incremental parser 1281 | state is left unchanged, to skip the text parsed so far and to reset the 1282 | parse state. 1283 | 1284 | The difference to C<incr_reset> is that only text until the parse error 1285 | occurred is removed. 1286 | 1287 | =head2 incr_reset 1288 | 1289 | $json->incr_reset 1290 | 1291 | This completely resets the incremental parser, that is, after this call, 1292 | it will be as if the parser had never parsed anything. 1293 | 1294 | This is useful if you want to repeatedly parse JSON objects and want to 1295 | ignore any trailing data, which means you have to reset the parser after 1296 | each successful decode. 
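The incremental methods described above can be combined into a small feed-and-extract loop. The following is a minimal sketch, not part of the original documentation: it assumes JSON::PP (shipped with core Perl and offering the same incr_parse interface) as a stand-in for the JSON front end, and the chunk boundaries are made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP;    # assumption: core JSON::PP used in place of the JSON wrapper

my $json = JSON::PP->new;

# Hypothetical input: three JSON texts split at arbitrary points,
# as they might arrive from a socket or pipe.
my @chunks = ('[1,2', ',3]{"a"', ':true} [4]');
my @decoded;

for my $chunk (@chunks) {
    # Void context: only append the chunk to the internal buffer.
    $json->incr_parse($chunk);

    # Scalar context: extract one complete JSON text per call;
    # undef means "not enough text yet".
    while (defined(my $obj = $json->incr_parse)) {
        push @decoded, $obj;
    }
}

printf "decoded %d JSON texts\n", scalar @decoded;
```

In a long-running reader it would be sensible to also set max_size or max_depth, since the incremental parser keeps accumulating text until it sees a complete JSON value.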
1297 | 1298 | =head1 MAPPING 1299 | 1300 | Most of this section is also taken from JSON::XS. 1301 | 1302 | This section describes how the backend modules map Perl values to JSON values and 1303 | vice versa. These mappings are designed to "do the right thing" in most 1304 | circumstances automatically, preserving round-tripping characteristics 1305 | (what you put in comes out as something equivalent). 1306 | 1307 | For the more enlightened: note that in the following descriptions, 1308 | lowercase I<perl> refers to the Perl interpreter, while uppercase I<Perl> 1309 | refers to the abstract Perl language itself. 1310 | 1311 | =head2 JSON -> PERL 1312 | 1313 | =over 4 1314 | 1315 | =item object 1316 | 1317 | A JSON object becomes a reference to a hash in Perl. No ordering of object 1318 | keys is preserved (JSON does not preserve object key ordering itself). 1319 | 1320 | =item array 1321 | 1322 | A JSON array becomes a reference to an array in Perl. 1323 | 1324 | =item string 1325 | 1326 | A JSON string becomes a string scalar in Perl - Unicode codepoints in JSON 1327 | are represented by the same codepoints in the Perl string, so no manual 1328 | decoding is necessary. 1329 | 1330 | =item number 1331 | 1332 | A JSON number becomes either an integer, numeric (floating point) or 1333 | string scalar in perl, depending on its range and any fractional parts. On 1334 | the Perl level, there is no difference between those as Perl handles all 1335 | the conversion details, but an integer may take slightly less memory and 1336 | might represent more values exactly than floating point numbers. 1337 | 1338 | If the number consists of digits only, this module will try to represent 1339 | it as an integer value. If that fails, it will try to represent it as 1340 | a numeric (floating point) value if that is possible without loss of 1341 | precision. 
Otherwise it will preserve the number as a string value (in 1342 | which case you lose roundtripping ability, as the JSON number will be 1343 | re-encoded to a JSON string). 1344 | 1345 | Numbers containing a fractional or exponential part will always be 1346 | represented as numeric (floating point) values, possibly at a loss of 1347 | precision (in which case you might lose perfect roundtripping ability, but 1348 | the JSON number will still be re-encoded as a JSON number). 1349 | 1350 | Note that precision is not accuracy - binary floating point values cannot 1351 | represent most decimal fractions exactly, and when converting from and to 1352 | floating point, this module only guarantees precision up to but not including 1353 | the least significant bit. 1354 | 1355 | =item true, false 1356 | 1357 | These JSON atoms become C and C, 1358 | respectively. They are overloaded to act almost exactly like the numbers 1359 | C<1> and C<0>. You can check whether a scalar is a JSON boolean by using 1360 | the C function. 1361 | 1362 | =item null 1363 | 1364 | A JSON null atom becomes C in Perl. 1365 | 1366 | =item shell-style comments (C<< # I >>) 1367 | 1368 | As a nonstandard extension to the JSON syntax that is enabled by the 1369 | C setting, shell-style comments are allowed. They can start 1370 | anywhere outside strings and go till the end of the line. 1371 | 1372 | =back 1373 | 1374 | 1375 | =head2 PERL -> JSON 1376 | 1377 | The mapping from Perl to JSON is slightly more difficult, as Perl is a 1378 | truly typeless language, so we can only guess which JSON type is meant by 1379 | a Perl value. 1380 | 1381 | =over 4 1382 | 1383 | =item hash references 1384 | 1385 | Perl hash references become JSON objects. As there is no inherent 1386 | ordering in hash keys (or JSON objects), they will usually be encoded 1387 | in a pseudo-random order. 
This module can optionally sort the hash keys 1388 | (determined by the I flag), so the same data structure will 1389 | serialise to the same JSON text (given same settings and version of 1390 | the same backend), but this incurs a runtime overhead and is only rarely useful, 1391 | e.g. when you want to compare some JSON text against another for equality. 1392 | 1393 | =item array references 1394 | 1395 | Perl array references become JSON arrays. 1396 | 1397 | =item other references 1398 | 1399 | Other unblessed references are generally not allowed and will cause an 1400 | exception to be thrown, except for references to the integers C<0> and 1401 | C<1>, which get turned into C and C atoms in JSON. You can 1402 | also use C and C to improve readability. 1403 | 1404 | encode_json [\0,JSON::true] # yields [false,true] 1405 | 1406 | =item JSON::true, JSON::false, JSON::null 1407 | 1408 | These special values become JSON true and JSON false values, 1409 | respectively. You can also use C<\1> and C<\0> directly if you want. 1410 | 1411 | =item blessed objects 1412 | 1413 | Blessed objects are not directly representable in JSON, but C 1414 | allows various ways of handling objects. See L, 1415 | below, for details. 
1416 | 1417 | =item simple scalars 1418 | 1419 | Simple Perl scalars (any scalar that is not a reference) are the most 1420 | difficult objects to encode: this module will encode undefined scalars as 1421 | JSON C values, scalars that have last been used in a string context 1422 | before encoding as JSON strings, and anything else as number value: 1423 | 1424 | # dump as number 1425 | encode_json [2] # yields [2] 1426 | encode_json [-3.0e17] # yields [-3e+17] 1427 | my $value = 5; encode_json [$value] # yields [5] 1428 | 1429 | # used as string, so dump as string 1430 | print $value; 1431 | encode_json [$value] # yields ["5"] 1432 | 1433 | # undef becomes null 1434 | encode_json [undef] # yields [null] 1435 | 1436 | You can force the type to be a string by stringifying it: 1437 | 1438 | my $x = 3.1; # some variable containing a number 1439 | "$x"; # stringified 1440 | $x .= ""; # another, more awkward way to stringify 1441 | print $x; # perl does it for you, too, quite often 1442 | 1443 | You can force the type to be a number by numifying it: 1444 | 1445 | my $x = "3"; # some variable containing a string 1446 | $x += 0; # numify it, ensuring it will be dumped as a number 1447 | $x *= 1; # same thing, the choice is yours. 1448 | 1449 | You can not currently force the type in other, less obscure, ways. Tell me 1450 | if you need this capability (but don't forget to explain why it's needed 1451 | :). 1452 | 1453 | Note that numerical precision has the same meaning as under Perl (so 1454 | binary to decimal conversion follows the same rules as in Perl, which 1455 | can differ to other languages). Also, your perl interpreter might expose 1456 | extensions to the floating point numbers of your platform, such as 1457 | infinities or NaN's - these cannot be represented in JSON, and it is an 1458 | error to pass those in. 
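The string-versus-number rule for simple scalars can be checked directly. This is a small sketch under the assumption that JSON::PP (core Perl) behaves like the backend described here; the variable names are illustrative only.

```perl
use strict;
use warnings;
use JSON::PP;    # assumption: core JSON::PP stands in for the JSON wrapper

my $json = JSON::PP->new;

my $count     = 5;
my $as_number = $json->encode([$count]);    # scalar has only been used as a number

my $label     = "$count";                   # a separate scalar holding the string form
my $as_string = $json->encode([$label]);

my $digits = "42";                          # starts life as a string
$digits += 0;                               # numify before encoding
my $numified = $json->encode([$digits]);

print "$as_number $as_string $numified\n";
```

Copying into a separate variable, as done with `$label`, avoids relying on how a particular perl version flags the original scalar after it has been interpolated.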
1459 | 1460 | =back 1461 | 1462 | =head2 OBJECT SERIALISATION 1463 | 1464 | As for Perl objects, this module only supports a pure JSON representation 1465 | (without the ability to deserialise the object automatically again). 1466 | 1467 | =head3 SERIALISATION 1468 | 1469 | What happens when this module encounters a Perl object depends on the 1470 | C<convert_blessed> and C<allow_blessed> settings, which are used in 1471 | this order: 1472 | 1473 | =over 4 1474 | 1475 | =item 1. C<convert_blessed> is enabled and the object has a C<TO_JSON> method. 1476 | 1477 | In this case, the C<TO_JSON> method of the object is invoked in scalar 1478 | context. It must return a single scalar that can be directly encoded into 1479 | JSON. This scalar replaces the object in the JSON text. 1480 | 1481 | For example, the following C<TO_JSON> method will convert all L<URI> 1482 | objects to JSON strings when serialised. The fact that these values 1483 | originally were L<URI> objects is lost. 1484 | 1485 | sub URI::TO_JSON { 1486 | my ($uri) = @_; 1487 | $uri->as_string 1488 | } 1489 | 1490 | =item 2. C<allow_blessed> is enabled. 1491 | 1492 | The object will be serialised as a JSON null value. 1493 | 1494 | =item 3. none of the above 1495 | 1496 | If none of the settings are enabled or the respective methods are missing, 1497 | this module throws an exception. 1498 | 1499 | =back 1500 | 1501 | =head1 ENCODING/CODESET FLAG NOTES 1502 | 1503 | This section is taken from JSON::XS. 1504 | 1505 | The interested reader might have seen a number of flags that signify 1506 | encodings or codesets - C<utf8>, C<latin1> and C<ascii>. There seems to be 1507 | some confusion on what these do, so here is a short comparison: 1508 | 1509 | C<utf8> controls whether the JSON text created by C<encode> (and expected 1510 | by C<decode>) is UTF-8 encoded or not, while C<latin1> and C<ascii> only 1511 | control whether C<encode> escapes character values outside their respective 1512 | codeset range. Neither of these flags conflict with each other, although 1513 | some combinations make less sense than others. 
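The short comparison above can be made concrete by encoding the same data with different flags. This is a sketch only, assuming JSON::PP (core Perl) as the backend; the sample string is made up and contains one character below 256 (U+00E9) and one above (U+263A).

```perl
use strict;
use warnings;
use JSON::PP;    # assumption: core JSON::PP stands in for the JSON wrapper

my $data = ["caf\x{e9} \x{263a}"];

# No flags: the result is a Perl character (Unicode) string.
my $chars = JSON::PP->new->encode($data);

# utf8: the result is a string of UTF-8 encoded octets.
my $octets = JSON::PP->new->utf8->encode($data);

# ascii: everything above 127 is escaped, so the result is plain ASCII.
my $ascii = JSON::PP->new->ascii->encode($data);

printf "characters: %d, octets: %d\n", length($chars), length($octets);
print "$ascii\n";

# Decoding with the same flag that was used for encoding restores the data.
my $back = JSON::PP->new->utf8->decode($octets);
```

The length difference comes from the two non-ASCII characters, which take one character each in the unencoded string but two and three octets in UTF-8.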
1514 | 1515 | Care has been taken to make all flags symmetrical with respect to 1516 | C and C, that is, texts encoded with any combination of 1517 | these flag values will be correctly decoded when the same flags are used 1518 | - in general, if you use different flag settings while encoding vs. when 1519 | decoding you likely have a bug somewhere. 1520 | 1521 | Below comes a verbose discussion of these flags. Note that a "codeset" is 1522 | simply an abstract set of character-codepoint pairs, while an encoding 1523 | takes those codepoint numbers and I them, in our case into 1524 | octets. Unicode is (among other things) a codeset, UTF-8 is an encoding, 1525 | and ISO-8859-1 (= latin 1) and ASCII are both codesets I encodings at 1526 | the same time, which can be confusing. 1527 | 1528 | =over 4 1529 | 1530 | =item C flag disabled 1531 | 1532 | When C is disabled (the default), then C/C generate 1533 | and expect Unicode strings, that is, characters with high ordinal Unicode 1534 | values (> 255) will be encoded as such characters, and likewise such 1535 | characters are decoded as-is, no changes to them will be done, except 1536 | "(re-)interpreting" them as Unicode codepoints or Unicode characters, 1537 | respectively (to Perl, these are the same thing in strings unless you do 1538 | funny/weird/dumb stuff). 1539 | 1540 | This is useful when you want to do the encoding yourself (e.g. when you 1541 | want to have UTF-16 encoded JSON texts) or when some other layer does 1542 | the encoding for you (for example, when printing to a terminal using a 1543 | filehandle that transparently encodes to UTF-8 you certainly do NOT want 1544 | to UTF-8 encode your data first and have Perl encode it another time). 
1545 | 1546 | =item C<utf8> flag enabled 1547 | 1548 | If the C<utf8>-flag is enabled, C<encode>/C<decode> will encode all 1549 | characters using the corresponding UTF-8 multi-byte sequence, and will 1550 | expect your input strings to be encoded as UTF-8, that is, no "character" 1551 | of the input string must have any value > 255, as UTF-8 does not allow 1552 | that. 1553 | 1554 | The C<utf8> flag therefore switches between two modes: disabled means you 1555 | will get a Unicode string in Perl, enabled means you get an UTF-8 encoded 1556 | octet/binary string in Perl. 1557 | 1558 | =item C<latin1> or C<ascii> flags enabled 1559 | 1560 | With C<latin1> (or C<ascii>) enabled, C<encode> will escape characters 1561 | with ordinal values > 255 (> 127 with C<ascii>) and encode the remaining 1562 | characters as specified by the C<utf8> flag. 1563 | 1564 | If C<utf8> is disabled, then the result is also correctly encoded in those 1565 | character sets (as both are proper subsets of Unicode, meaning that a 1566 | Unicode string with all character values < 256 is the same thing as a 1567 | ISO-8859-1 string, and a Unicode string with all character values < 128 is 1568 | the same thing as an ASCII string in Perl). 1569 | 1570 | If C<utf8> is enabled, you still get a correct UTF-8-encoded string, 1571 | regardless of these flags, just some more characters will be escaped using 1572 | C<\uXXXX> than before. 1573 | 1574 | Note that ISO-8859-1-I<encoded> strings are not compatible with UTF-8 1575 | encoding, while ASCII-encoded strings are. That is because the ISO-8859-1 1576 | encoding is NOT a subset of UTF-8 (despite the ISO-8859-1 I<codeset> being 1577 | a subset of Unicode), while ASCII is. 1578 | 1579 | Surprisingly, C<decode> will ignore these flags and so treat all input 1580 | values as governed by the C<utf8> flag. If it is disabled, this allows you 1581 | to decode ISO-8859-1- and ASCII-encoded strings, as both are strict subsets of 1582 | Unicode. If it is enabled, you can correctly decode UTF-8 encoded strings. 
1583 | 1584 | So neither C<latin1> nor C<ascii> are incompatible with the C<utf8> flag - 1585 | they only govern when the JSON output engine escapes a character or not. 1586 | 1587 | The main use for C<latin1> is to relatively efficiently store binary data 1588 | as JSON, at the expense of breaking compatibility with most JSON decoders. 1589 | 1590 | The main use for C<ascii> is to force the output to not contain characters 1591 | with values > 127, which means you can interpret the resulting string 1592 | as UTF-8, ISO-8859-1, ASCII, KOI8-R or just about any character set and 1593 | 8-bit-encoding, and still get the same data structure back. This is useful 1594 | when your channel for JSON transfer is not 8-bit clean or the encoding 1595 | might be mangled in between (e.g. in mail), and works because ASCII is a 1596 | proper subset of most 8-bit and multibyte encodings in use in the world. 1597 | 1598 | =back 1599 | 1600 | =head1 BACKWARD INCOMPATIBILITY 1601 | 1602 | Since version 2.90, stringification (and string comparison) for 1603 | C<JSON::true> and C<JSON::false> has not been overloaded. It shouldn't 1604 | matter as long as you treat them as boolean values, but code that 1605 | expects them to stringify as "true" or "false" no longer works as 1606 | expected. 1607 | 1608 | if (JSON::true eq 'true') { # now fails 1609 | 1610 | print "The result is $JSON::true now."; # => The result is 1 now. 1611 | 1612 | These boolean values no longer inherit from JSON::Boolean, either. 1613 | When you need to test whether a value is a JSON boolean, use the 1614 | C<JSON::is_bool> function instead of testing whether the value inherits 1615 | from a particular boolean class. 
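The recommended boolean check might look like the sketch below. It assumes JSON::PP (core Perl) and therefore calls JSON::PP::is_bool; with the JSON front end loaded, the function named in the text above would be used instead. The sample JSON text is made up.

```perl
use strict;
use warnings;
use JSON::PP;    # assumption: core JSON::PP stands in for the JSON wrapper

my $decoded = JSON::PP->new->decode('{"ok":true,"count":1}');

# A decoded JSON true is a boolean object; a decoded 1 is a plain number.
my $ok_is_bool    = JSON::PP::is_bool($decoded->{ok})    ? 1 : 0;
my $count_is_bool = JSON::PP::is_bool($decoded->{count}) ? 1 : 0;

# Booleans should be tested for truth, not compared against a string.
my $status = $decoded->{ok} ? "enabled" : "disabled";

print "ok is boolean: $ok_is_bool, count is boolean: $count_is_bool, status: $status\n";
```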
1616 | 1617 | =head1 BUGS 1618 | 1619 | Please report bugs on backend selection and additional features 1620 | this module provides to RT or GitHub issues for this module: 1621 | 1622 | =over 4 1623 | 1624 | =item https://rt.cpan.org/Public/Dist/Display.html?Queue=JSON 1625 | 1626 | =item https://github.com/makamaka/JSON/issues 1627 | 1628 | =back 1629 | 1630 | Please report bugs and feature requests on decoding/encoding 1631 | and boolean behaviors to the author of the backend module you 1632 | are using. 1633 | 1634 | =head1 SEE ALSO 1635 | 1636 | L, L, L for backends. 1637 | 1638 | L, an alternative that prefers Cpanel::JSON::XS. 1639 | 1640 | C(L) 1641 | 1642 | =head1 AUTHOR 1643 | 1644 | Makamaka Hannyaharamitu, Emakamaka[at]cpan.orgE 1645 | 1646 | JSON::XS was written by Marc Lehmann 1647 | 1648 | The release of this new version owes to the courtesy of Marc Lehmann. 1649 | 1650 | 1651 | =head1 COPYRIGHT AND LICENSE 1652 | 1653 | Copyright 2005-2013 by Makamaka Hannyaharamitu 1654 | 1655 | This library is free software; you can redistribute it and/or modify 1656 | it under the same terms as Perl itself. 1657 | 1658 | =cut 1659 | 1660 | -------------------------------------------------------------------------------- /surfaceLR-baseline/JSON/backportPP/Boolean.pm: -------------------------------------------------------------------------------- 1 | package # This is JSON::backportPP 2 | JSON::PP::Boolean; 3 | 4 | use strict; 5 | use overload ( 6 | "0+" => sub { ${$_[0]} }, 7 | "++" => sub { $_[0] = ${$_[0]} + 1 }, 8 | "--" => sub { $_[0] = ${$_[0]} - 1 }, 9 | fallback => 1, 10 | ); 11 | 12 | $JSON::backportPP::Boolean::VERSION = '2.97001'; 13 | 14 | 1; 15 | 16 | __END__ 17 | 18 | =head1 NAME 19 | 20 | JSON::PP::Boolean - dummy module providing JSON::PP::Boolean 21 | 22 | =head1 SYNOPSIS 23 | 24 | # do not "use" yourself 25 | 26 | =head1 DESCRIPTION 27 | 28 | This module exists only to provide overload resolution for Storable and similar modules. 
See 29 | L for more info about this class. 30 | 31 | =head1 AUTHOR 32 | 33 | This idea is from L written by Marc Lehmann 34 | 35 | =cut 36 | 37 | -------------------------------------------------------------------------------- /surfaceLR-baseline/JSON/backportPP/Compat5005.pm: -------------------------------------------------------------------------------- 1 | package # This is JSON::backportPP 2 | JSON::backportPP5005; 3 | 4 | use 5.005; 5 | use strict; 6 | 7 | my @properties; 8 | 9 | $JSON::PP5005::VERSION = '1.10'; 10 | 11 | BEGIN { 12 | 13 | sub utf8::is_utf8 { 14 | 0; # It is considered that UTF8 flag off for Perl 5.005. 15 | } 16 | 17 | sub utf8::upgrade { 18 | } 19 | 20 | sub utf8::downgrade { 21 | 1; # must always return true. 22 | } 23 | 24 | sub utf8::encode { 25 | } 26 | 27 | sub utf8::decode { 28 | } 29 | 30 | *JSON::PP::JSON_PP_encode_ascii = \&_encode_ascii; 31 | *JSON::PP::JSON_PP_encode_latin1 = \&_encode_latin1; 32 | *JSON::PP::JSON_PP_decode_surrogates = \&_decode_surrogates; 33 | *JSON::PP::JSON_PP_decode_unicode = \&_decode_unicode; 34 | 35 | # missing in B module. 36 | sub B::SVp_IOK () { 0x01000000; } 37 | sub B::SVp_NOK () { 0x02000000; } 38 | sub B::SVp_POK () { 0x04000000; } 39 | 40 | $INC{'bytes.pm'} = 1; # dummy 41 | } 42 | 43 | 44 | 45 | sub _encode_ascii { 46 | join('', map { $_ <= 127 ? 
chr($_) : sprintf('\u%04x', $_) } unpack('C*', $_[0]) ); 47 | } 48 | 49 | 50 | sub _encode_latin1 { 51 | join('', map { chr($_) } unpack('C*', $_[0]) ); 52 | } 53 | 54 | 55 | sub _decode_surrogates { # from http://homepage1.nifty.com/nomenclator/unicode/ucs_utf.htm 56 | my $uni = 0x10000 + (hex($_[0]) - 0xD800) * 0x400 + (hex($_[1]) - 0xDC00); # from perlunicode 57 | my $bit = unpack('B32', pack('N', $uni)); 58 | 59 | if ( $bit =~ /^00000000000(...)(......)(......)(......)$/ ) { 60 | my ($w, $x, $y, $z) = ($1, $2, $3, $4); 61 | return pack('B*', sprintf('11110%s10%s10%s10%s', $w, $x, $y, $z)); 62 | } 63 | else { 64 | Carp::croak("Invalid surrogate pair"); 65 | } 66 | } 67 | 68 | 69 | sub _decode_unicode { 70 | my ($u) = @_; 71 | my ($utf8bit); 72 | 73 | if ( $u =~ /^00([89a-f][0-9a-f])$/i ) { # 0x80-0xff 74 | return pack( 'H2', $1 ); 75 | } 76 | 77 | my $bit = unpack("B*", pack("H*", $u)); 78 | 79 | if ( $bit =~ /^00000(.....)(......)$/ ) { 80 | $utf8bit = sprintf('110%s10%s', $1, $2); 81 | } 82 | elsif ( $bit =~ /^(....)(......)(......)$/ ) { 83 | $utf8bit = sprintf('1110%s10%s10%s', $1, $2, $3); 84 | } 85 | else { 86 | Carp::croak("Invalid escaped unicode"); 87 | } 88 | 89 | return pack('B*', $utf8bit); 90 | } 91 | 92 | 93 | sub JSON::PP::incr_text { 94 | $_[0]->{_incr_parser} ||= JSON::PP::IncrParser->new; 95 | 96 | if ( $_[0]->{_incr_parser}->{incr_parsing} ) { 97 | Carp::croak("incr_text can not be called when the incremental parser already started parsing"); 98 | } 99 | 100 | $_[0]->{_incr_parser}->{incr_text} = $_[1] if ( @_ > 1 ); 101 | $_[0]->{_incr_parser}->{incr_text}; 102 | } 103 | 104 | 105 | 1; 106 | __END__ 107 | 108 | =pod 109 | 110 | =head1 NAME 111 | 112 | JSON::PP5005 - Helper module in using JSON::PP in Perl 5.005 113 | 114 | =head1 DESCRIPTION 115 | 116 | JSON::PP calls internally. 
117 | 118 | =head1 AUTHOR 119 | 120 | Makamaka Hannyaharamitu, Emakamaka[at]cpan.orgE 121 | 122 | 123 | =head1 COPYRIGHT AND LICENSE 124 | 125 | Copyright 2007-2012 by Makamaka Hannyaharamitu 126 | 127 | This library is free software; you can redistribute it and/or modify 128 | it under the same terms as Perl itself. 129 | 130 | =cut 131 | 132 | -------------------------------------------------------------------------------- /surfaceLR-baseline/JSON/backportPP/Compat5006.pm: -------------------------------------------------------------------------------- 1 | package # This is JSON::backportPP 2 | JSON::backportPP56; 3 | 4 | use 5.006; 5 | use strict; 6 | 7 | my @properties; 8 | 9 | $JSON::PP56::VERSION = '1.08'; 10 | 11 | BEGIN { 12 | 13 | sub utf8::is_utf8 { 14 | my $len = length $_[0]; # char length 15 | { 16 | use bytes; # byte length; 17 | return $len != length $_[0]; # if !=, UTF8-flagged on. 18 | } 19 | } 20 | 21 | 22 | sub utf8::upgrade { 23 | ; # noop; 24 | } 25 | 26 | 27 | sub utf8::downgrade ($;$) { 28 | return 1 unless ( utf8::is_utf8( $_[0] ) ); 29 | 30 | if ( _is_valid_utf8( $_[0] ) ) { 31 | my $downgrade; 32 | for my $c ( unpack( "U*", $_[0] ) ) { 33 | if ( $c < 256 ) { 34 | $downgrade .= pack("C", $c); 35 | } 36 | else { 37 | $downgrade .= pack("U", $c); 38 | } 39 | } 40 | $_[0] = $downgrade; 41 | return 1; 42 | } 43 | else { 44 | Carp::croak("Wide character in subroutine entry") unless ( $_[1] ); 45 | 0; 46 | } 47 | } 48 | 49 | 50 | sub utf8::encode ($) { # UTF8 flag off 51 | if ( utf8::is_utf8( $_[0] ) ) { 52 | $_[0] = pack( "C*", unpack( "C*", $_[0] ) ); 53 | } 54 | else { 55 | $_[0] = pack( "U*", unpack( "C*", $_[0] ) ); 56 | $_[0] = pack( "C*", unpack( "C*", $_[0] ) ); 57 | } 58 | } 59 | 60 | 61 | sub utf8::decode ($) { # UTF8 flag on 62 | if ( _is_valid_utf8( $_[0] ) ) { 63 | utf8::downgrade( $_[0] ); 64 | $_[0] = pack( "U*", unpack( "U*", $_[0] ) ); 65 | } 66 | } 67 | 68 | 69 | *JSON::PP::JSON_PP_encode_ascii = \&_encode_ascii; 70 | 
*JSON::PP::JSON_PP_encode_latin1 = \&_encode_latin1; 71 | *JSON::PP::JSON_PP_decode_surrogates = \&JSON::PP::_decode_surrogates; 72 | *JSON::PP::JSON_PP_decode_unicode = \&JSON::PP::_decode_unicode; 73 | 74 | unless ( defined &B::SVp_NOK ) { # missing in B module. 75 | eval q{ sub B::SVp_NOK () { 0x02000000; } }; 76 | } 77 | 78 | } 79 | 80 | 81 | 82 | sub _encode_ascii { 83 | join('', 84 | map { 85 | $_ <= 127 ? 86 | chr($_) : 87 | $_ <= 65535 ? 88 | sprintf('\u%04x', $_) : sprintf('\u%x\u%x', JSON::PP::_encode_surrogates($_)); 89 | } _unpack_emu($_[0]) 90 | ); 91 | } 92 | 93 | 94 | sub _encode_latin1 { 95 | join('', 96 | map { 97 | $_ <= 255 ? 98 | chr($_) : 99 | $_ <= 65535 ? 100 | sprintf('\u%04x', $_) : sprintf('\u%x\u%x', JSON::PP::_encode_surrogates($_)); 101 | } _unpack_emu($_[0]) 102 | ); 103 | } 104 | 105 | 106 | sub _unpack_emu { # for Perl 5.6 unpack warnings 107 | return !utf8::is_utf8($_[0]) ? unpack('C*', $_[0]) 108 | : _is_valid_utf8($_[0]) ? unpack('U*', $_[0]) 109 | : unpack('C*', $_[0]); 110 | } 111 | 112 | 113 | sub _is_valid_utf8 { 114 | my $str = $_[0]; 115 | my $is_utf8; 116 | 117 | while ($str =~ /(?: 118 | ( 119 | [\x00-\x7F] 120 | |[\xC2-\xDF][\x80-\xBF] 121 | |[\xE0][\xA0-\xBF][\x80-\xBF] 122 | |[\xE1-\xEC][\x80-\xBF][\x80-\xBF] 123 | |[\xED][\x80-\x9F][\x80-\xBF] 124 | |[\xEE-\xEF][\x80-\xBF][\x80-\xBF] 125 | |[\xF0][\x90-\xBF][\x80-\xBF][\x80-\xBF] 126 | |[\xF1-\xF3][\x80-\xBF][\x80-\xBF][\x80-\xBF] 127 | |[\xF4][\x80-\x8F][\x80-\xBF][\x80-\xBF] 128 | ) 129 | | (.) 
130 | )/xg) 131 | { 132 | if (defined $1) { 133 | $is_utf8 = 1 if (!defined $is_utf8); 134 | } 135 | else { 136 | $is_utf8 = 0 if (!defined $is_utf8); 137 | if ($is_utf8) { # eventually, not utf8 138 | return; 139 | } 140 | } 141 | } 142 | 143 | return $is_utf8; 144 | } 145 | 146 | 147 | 1; 148 | __END__ 149 | 150 | =pod 151 | 152 | =head1 NAME 153 | 154 | JSON::PP56 - Helper module in using JSON::PP in Perl 5.6 155 | 156 | =head1 DESCRIPTION 157 | 158 | JSON::PP calls internally. 159 | 160 | =head1 AUTHOR 161 | 162 | Makamaka Hannyaharamitu, E<lt>makamaka[at]cpan.orgE<gt> 163 | 164 | 165 | =head1 COPYRIGHT AND LICENSE 166 | 167 | Copyright 2007-2012 by Makamaka Hannyaharamitu 168 | 169 | This library is free software; you can redistribute it and/or modify 170 | it under the same terms as Perl itself. 171 | 172 | =cut 173 | 174 | -------------------------------------------------------------------------------- /surfaceLR-baseline/README.md: -------------------------------------------------------------------------------- 1 | # SurfaceLR: Simple Logistic Regression Based QA 2 | 3 | This is a simple baseline that makes use of our small training set: we reimplemented and trained a logistic regression model using word-based overlap 4 | features. As described in (Merkhofer et al., 2018), this baseline takes into account the lengths of a text, question and each answer candidate, 5 | as well as indicator features regarding the (co-)occurrences of any words in them. 6 | 7 | ## Usage 8 | 9 | Follow the instructions below: 10 | 11 | - Unzip the folder which contains the liblinear files `liblinear.zip`. 12 | - Compile liblinear by running make in the `liblinear/` subdirectory. 13 | - Download the training dataset from the [website](http://cogcomp.org/multirc/) and train the system with the training data. 14 | - To run the system on the train/dev/test data, use the command `sh run_surfaceLR.sh`. 
15 | - Note: The files have to be in the same directory as the scripts, their names must end in ".json", and the training file name must contain "train" (similar pattern for test and dev). 16 | - The system will create one output .JSON file (`.withLRscores.json`) for each dev/test file in the current directory. 17 | 18 | 19 | -------------------------------------------------------------------------------- /surfaceLR-baseline/liblinear.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/CogComp/multirc/0e52e6f98f9d2970c90bb86502ed7488eb720fc5/surfaceLR-baseline/liblinear.zip -------------------------------------------------------------------------------- /surfaceLR-baseline/run_surfaceLR.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/sh 2 | 3 | if [ ! -f liblinear/train ]; then 4 | echo "Could not find LIBLINEAR executables!" 5 | echo "Please compile them in liblinear/" 6 | exit 1 7 | fi 8 | 9 | # train surfaceLR baseline 10 | echo "Training surfaceLR baseline ..." 11 | perl surfaceLR_train.pl train_456-fixedIds.json 12 | # remove temporary model/features/prediction files 13 | rm -f *.preds *.feats 14 | 15 | # run surfaceLR baseline 16 | echo "Generating surfaceLR predictions ..." 
17 | perl surfaceLR_predict.pl dev_83-fixedIds.json dev_83-fixedIds.withLRscores.json 18 | # remove temporary model/features/prediction files 19 | rm -f *.preds *.feats 20 | -------------------------------------------------------------------------------- /surfaceLR-baseline/surfaceLR.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | use lib '.'; 3 | 4 | use JSON; 5 | use strict; 6 | 7 | # helper variables 8 | my $origtext = ""; 9 | my $text = ""; 10 | my $question = ""; 11 | my $type = ""; 12 | my $answer = ""; 13 | 14 | # hash that maps feature names to ids 15 | my %features = (); 16 | # inverse hash of features 17 | my %reversef = (); 18 | 19 | # Length and overlap features proposed by MITRE: 20 | # "... the following three counts were added to the feature set: |S overlap A|, ..." 21 | foreach ("t_length_word", "t_length_char", "a_length_word", "a_length_char", "q_length_word", "q_length_char", 22 | "ta_overlap", "qa_overlap", "tq_overlap") { 23 | $features{$_} = 1 + scalar(keys %features); 24 | $reversef{$features{$_}} = $_; 25 | } 26 | 27 | # process each data file, but start with train 28 | # (features need to be defined before processing the test sets!) 29 | 30 | opendir(DIR, "."); 31 | my @files = readdir(DIR); 32 | my @devtest = (); 33 | my $train; 34 | foreach my $f (@files) { 35 | next unless ($f =~ m/\.json/); 36 | next if($f =~ m/LRscores/); 37 | 38 | if($f =~ m/train/) { 39 | $train = $f; 40 | } else { 41 | push @devtest, $f; 42 | } 43 | } 44 | 45 | if($train) { 46 | print STDERR "Training classifier ...\n"; 47 | process($train); 48 | } else { 49 | print STDERR "Could not open training data! Please put .JSON training file into this directory.\n"; 50 | exit 1; 51 | } 52 | if(@devtest) { 53 | print STDERR "Evaluating classifier ...\n"; 54 | foreach (@devtest) { 55 | process($_); 56 | } 57 | } else { 58 | print STDERR "Could not open dev/test data! 
Please put .JSON dev/test file(s) into this directory.\n"; 59 | exit 1; 60 | } 61 | 62 | # process method takes a filename as input 63 | # and selects suitable processor to create output 64 | sub process($) { 65 | my $filename = shift; 66 | processJSON($filename) if($filename =~ m/\.json$/); 67 | } 68 | 69 | # processor for JSON 70 | sub processJSON($) { 71 | my $filename = shift; 72 | my ($data) = ($filename =~ m/^([^\.]*)/); 73 | 74 | # open file 75 | open(IN, $filename) or die "Could not open file: $filename!\n"; 76 | open(OUT, ">multRC_$data.feats"); 77 | 78 | my $json; 79 | while(<IN>) { 80 | # read each line as a JSON object and extract corresponding data 81 | $json = decode_json $_; 82 | foreach (@{$json->{"data"}}) { 83 | $origtext = $_->{"paragraph"}->{"text"}; 84 | $origtext =~ s/<[^\/]*\/[^>]*>//g; 85 | $origtext =~ s/<[^>]*>//g; 86 | 87 | # read and normalize text (=lowercase, tokenize, ...) 88 | $text = normalize($origtext); 89 | 90 | foreach (@{$_->{"paragraph"}->{"questions"}}) { 91 | 92 | # read and normalize each question 93 | $question = $_->{"question"}; 94 | $question = normalize($question); 95 | 96 | foreach (@{$_->{"answers"}}) { 97 | 98 | # read and normalize each answer 99 | $answer = $_->{"text"}; 100 | $answer = normalize($answer); 101 | 102 | # build JSON output, if predictions already available 103 | if (-e "multRC_$data.preds") { 104 | my $pred = <PRED>; 105 | chomp $pred; 106 | ($pred) = ($pred =~ m/ ([^ ]*) /); 107 | 108 | $_->{"scores"}->{"simpleLR"} = $pred; 109 | } else { 110 | 111 | my $correct = $_->{"isAnswer"}; 112 | print OUT $correct?"1":"0"; 113 | # extract features and write to file (LIBLINEAR) 114 | writefeats(featurize($text, $question, $answer), $data); 115 | } 116 | } 117 | } 118 | } 119 | } 120 | close(IN); 121 | 122 | if($data =~ m/train/) { 123 | `liblinear/train -c 1 -n 20 -B 1 -s 0 multRC_$data.feats multRC.model`; 124 | } else { 125 | `liblinear/predict -b 1 multRC_$data.feats multRC.model multRC_$data.preds`; 126 | 
open(PRED, "multRC_$data.preds"); 127 | open(IN, $filename) or die "Could not open file: $filename!\n"; 128 | my $json; 129 | while(<IN>) { 130 | # read each line as a JSON object and extract corresponding data 131 | $json = decode_json $_; 132 | foreach (@{$json->{"data"}}) { 133 | foreach (@{$_->{"paragraph"}->{"questions"}}) { 134 | foreach (@{$_->{"answers"}}) { 135 | my $pred = <PRED>; 136 | chomp $pred; 137 | ($pred) = ($pred =~ m/ ([^ ]*) /); 138 | $_->{"scores"}->{"simpleLR"} = $pred; 139 | } 140 | } 141 | } 142 | } 143 | 144 | open(OUT, ">$data.withLRscores.json"); 145 | print OUT encode_json $json; 146 | close(PRED); 147 | close(IN); 148 | } 149 | close(OUT); 150 | } 151 | 152 | # writefeats takes a set of features and name of dataset as 153 | # input and then writes out a corresponding feature file in 154 | # the LIBLINEAR format (LABEL [FEATID:FEATVAL]*\n) 155 | sub writefeats($$) { 156 | my $features = shift; 157 | my $data = shift; 158 | 159 | my @f = @$features; 160 | 161 | push @f, "t_length_char"; 162 | push @f, "t_length_word"; 163 | push @f, "q_length_char"; 164 | push @f, "q_length_word"; 165 | push @f, "a_length_char"; 166 | push @f, "a_length_word"; 167 | 168 | my %feats = (); 169 | 170 | foreach (@f) { 171 | # if feature hasn't been seen before, assign feature ID 172 | # (requires current dataset to be the training data) 173 | if(!$features{$_}) { 174 | next unless ($data =~ m/train/); 175 | $features{$_} = 1 + scalar(keys %features); 176 | $reversef{$features{$_}} = $_; 177 | } 178 | 179 | # fill feature values for length/overlap features 180 | if(m/_length_/) { 181 | my $str = ""; 182 | $str = $text if(m/^t/); 183 | $str = $question if(m/^q/); 184 | $str = $answer if (m/^a/); 185 | if(m/_char$/) { 186 | $feats{$features{$_}} = length($str); 187 | } elsif(m/_word$/) { 188 | $feats{$features{$_}} = scalar(split(" ", $str)); 189 | } 190 | 191 | # default treatment for other features (just count them) 192 | } else { 193 | $feats{$features{$_}}++; 194 | } 
195 | } 196 | 197 | # write features, ordered by ID (required by LIBLINEAR) 198 | my @feats = sort {$a <=> $b} keys %feats; 199 | foreach (@feats) { 200 | print OUT " ", $_, ":", $feats{$_}; 201 | } 202 | print OUT "\n"; 203 | } 204 | 205 | # normalize takes a string as input (text, question or answer) 206 | # and produces a "normalized" output 207 | sub normalize($) { 208 | my $string = shift; 209 | 210 | # lowercase input 211 | $string = lc($string); 212 | 213 | # replace HTML quotation marks with actual quotation marks 214 | $string =~ s/&quot;/"/g; 215 | 216 | # remove punctuation and extra spaces 217 | $string =~ s/([\(\),\.\?\!\":\'\/])/ /g; 218 | $string =~ s/ +/ /g; 219 | 220 | return $string; 221 | } 222 | 223 | # featurize takes a triple as input 224 | # and extracts all applicable features from that triple 225 | sub featurize($$$) { 226 | my $t = shift; 227 | my $q = shift; 228 | my $a = shift; 229 | 230 | # tokenize by whitespace 231 | my @t = split(" ", $t); 232 | my @q = split(" ", $q); 233 | my @a = split(" ", $a); 234 | 235 | # start with an empty list of features 236 | my @features = (); 237 | 238 | # find "prominent" words, i.e. 
more often than average 239 | my %prominent = (); 240 | my $total = 0; 241 | my $avg = 0; 242 | if(@t) { 243 | foreach (@t) { 244 | $total++; 245 | $prominent{$_}++; 246 | } 247 | $avg = scalar(keys %prominent)/$total; 248 | foreach (keys %prominent) { 249 | undef $prominent{$_} unless($prominent{$_} > $avg); 250 | } 251 | } 252 | 253 | # this variable keeps track of already extracted features 254 | # => make sure that counts represent types, not tokens 255 | my %overlap = (); 256 | 257 | ########################################################## 258 | # All features below are modified adaptations from MITRE # 259 | ########################################################## 260 | 261 | # "set of words in the answer (A)" 262 | foreach (@a) { 263 | if(!$overlap{$_}) { 264 | push @features, "a".$_."_type"; 265 | if($prominent{$_}) { 266 | push @features, "t".$_."xa".$_."_prominenttype"; 267 | push @features, "ta_prominent_typeoverlap"; 268 | } 269 | } 270 | 271 | $overlap{$_}++; 272 | } 273 | 274 | # "the words common to the story and the answer ("S")" 275 | %overlap = (); 276 | foreach my $x (@t) { 277 | foreach my $y (@a) { 278 | if($x eq $y) { 279 | if(!$overlap{$x}) { 280 | push @features, "t".$x."xa".$y."_type"; 281 | push @features, "ta_overlap_type" 282 | } 283 | $overlap{$x}++; 284 | } 285 | } 286 | foreach my $y (@q) { 287 | push @features, "tq_overlap" if($x eq $y); 288 | } 289 | } 290 | 291 | # "the Cartesian product of the the(sic!) 
question and the answer (Q x A)" 292 | %overlap = (); 293 | foreach my $x (@q) { 294 | foreach my $y (@a) { 295 | if(!$overlap{$x."_x_".$y}) { 296 | push @features, "q".$x."xa".$y."_type"; 297 | push @features, "qa_overlap_type" if($x eq $y); 298 | } 299 | $overlap{$x."_x_".$y}++; 300 | } 301 | } 302 | 303 | return \@features; 304 | } 305 | 306 | 307 | -------------------------------------------------------------------------------- /surfaceLR-baseline/surfaceLR_predict.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | use utf8; 3 | use Text::Unidecode; 4 | 5 | use lib '.'; 6 | 7 | use JSON; 8 | use strict; 9 | 10 | # helper variables 11 | my $origtext = ""; 12 | my $text = ""; 13 | my $question = ""; 14 | my $type = ""; 15 | my $answer = ""; 16 | 17 | # hash that maps feature names to ids 18 | my %features = (); 19 | # inverse hash of features 20 | my %reversef = (); 21 | 22 | open(IN, "multRC.features") or die "Could not open feature file!\n"; 23 | while(<IN>) { 24 | chomp; 25 | my ($f, $i) = split(" "); 26 | $features{$f} = $i; 27 | $reversef{$i} = $f; 28 | } 29 | close(IN); 30 | 31 | my $filename = $ARGV[0]; 32 | my ($data) = ($filename =~ m/^([^\.]*)/); 33 | 34 | # open file 35 | open(IN, $filename) or die "Could not open file: $filename!\n"; 36 | open(OUT, ">multRC_$data.feats"); 37 | 38 | my $json; 39 | while(<IN>) { 40 | # read each line as a JSON object and extract corresponding data 41 | $json = decode_json $_; 42 | foreach (@{$json->{"data"}}) { 43 | $origtext = $_->{"paragraph"}->{"text"}; 44 | $origtext =~ s/<[^\/]*\/[^>]*>//g; 45 | $origtext =~ s/<[^>]*>//g; 46 | 47 | # read and normalize text (=lowercase, tokenize, ...) 
48 | $text = normalize($origtext); 49 | 50 | foreach (@{$_->{"paragraph"}->{"questions"}}) { 51 | 52 | # read and normalize each question 53 | $question = $_->{"question"}; 54 | $question = normalize($question); 55 | 56 | foreach (@{$_->{"answers"}}) { 57 | 58 | # read and normalize each answer 59 | $answer = $_->{"text"}; 60 | $answer = normalize($answer); 61 | 62 | # build JSON output, if predictions already available 63 | if (-e "multRC_$data.preds") { 64 | my $pred = <PRED>; 65 | chomp $pred; 66 | ($pred) = ($pred =~ m/ ([^ ]*) /); 67 | 68 | $_->{"scores"}->{"simpleLR"} = $pred; 69 | } else { 70 | 71 | my $correct = $_->{"isAnswer"}; 72 | print OUT $correct?"1":"0"; 73 | # extract features and write to file (LIBLINEAR) 74 | writefeats(featurize($text, $question, $answer), $data); 75 | } 76 | } 77 | } 78 | } 79 | } 80 | close(IN); 81 | 82 | `liblinear/predict -b 1 multRC_$data.feats multRC.model multRC_$data.preds`; 83 | open(PRED, "multRC_$data.preds"); 84 | open(IN, $filename) or die "Could not open file: $filename!\n"; 85 | my $json; 86 | while(<IN>) { 87 | # read each line as a JSON object and extract corresponding data 88 | $json = decode_json $_; 89 | foreach (@{$json->{"data"}}) { 90 | foreach (@{$_->{"paragraph"}->{"questions"}}) { 91 | foreach (@{$_->{"answers"}}) { 92 | my $pred = <PRED>; 93 | chomp $pred; 94 | ($pred) = ($pred =~ m/ ([^ ]*) /); 95 | $_->{"scores"}->{"simpleLR"} = $pred; 96 | } 97 | } 98 | } 99 | } 100 | 101 | open(OUT, ">$ARGV[1]"); 102 | my $output = encode_json $json; 103 | $output =~ s/simpleLR":"([^"]*)"/simpleLR":$1/g; 104 | print OUT $output; 105 | close(PRED); 106 | close(IN); 107 | close(OUT); 108 | 109 | # writefeats takes a set of features and name of dataset as 110 | # input and then writes out a corresponding feature file in 111 | # the LIBLINEAR format (LABEL [FEATID:FEATVAL]*\n) 112 | sub writefeats($$) { 113 | my $features = shift; 114 | my $data = shift; 115 | 116 | my @f = @$features; 117 | 118 | push @f, "t_length_char"; 119 | push 
@f, "t_length_word"; 120 | push @f, "q_length_char"; 121 | push @f, "q_length_word"; 122 | push @f, "a_length_char"; 123 | push @f, "a_length_word"; 124 | 125 | my %feats = (); 126 | 127 | foreach (@f) { 128 | # if feature hasn't been seen before, assign feature ID 129 | # (requires current dataset to be the training data) 130 | if(!$features{$_}) { 131 | next unless ($data =~ m/train/); 132 | $features{$_} = 1 + scalar(keys %features); 133 | $reversef{$features{$_}} = $_; 134 | } 135 | 136 | # fill feature values for length/overlap features 137 | if(m/_length_/) { 138 | my $str = ""; 139 | $str = $text if(m/^t/); 140 | $str = $question if(m/^q/); 141 | $str = $answer if (m/^a/); 142 | if(m/_char$/) { 143 | $feats{$features{$_}} = length($str); 144 | } elsif(m/_word$/) { 145 | $feats{$features{$_}} = scalar(split(" ", $str)); 146 | } 147 | 148 | # default treatment for other features (just count them) 149 | } else { 150 | $feats{$features{$_}}++; 151 | } 152 | } 153 | 154 | # write features, ordered by ID (required by LIBLINEAR) 155 | my @feats = sort {$a <=> $b} keys %feats; 156 | foreach (@feats) { 157 | print OUT " ", $_, ":", $feats{$_}; 158 | } 159 | print OUT "\n"; 160 | } 161 | 162 | # normalize takes a string as input (text, question or answer) 163 | # and produces a "normalized" output 164 | sub normalize($) { 165 | my $string = shift; 166 | $string =~ s/([^[:ascii:]]+)/unidecode($1)/ge; 167 | 168 | # lowercase input 169 | $string = lc($string); 170 | 171 | # replace HTML quotation marks with actual quotation marks 172 | $string =~ s/&quot;/"/g; 173 | 174 | # remove punctuation and extra spaces 175 | $string =~ s/([\(\),\.\?\!\":\'\/])/ /g; 176 | $string =~ s/ +/ /g; 177 | 178 | return $string; 179 | } 180 | 181 | # featurize takes a triple as input 182 | # and extracts all applicable features from that triple 183 | sub featurize($$$) { 184 | my $t = shift; 185 | my $q = shift; 186 | my $a = shift; 187 | 188 | # tokenize by whitespace 189 | my @t = split(" ", 
$t); 190 | my @q = split(" ", $q); 191 | my @a = split(" ", $a); 192 | 193 | # start with an empty list of features 194 | my @features = (); 195 | 196 | # find "prominent" words, i.e. more often than average 197 | my %prominent = (); 198 | my $total = 0; 199 | my $avg = 0; 200 | if(@t) { 201 | foreach (@t) { 202 | $total++; 203 | $prominent{$_}++; 204 | } 205 | $avg = scalar(keys %prominent)/$total; 206 | foreach (keys %prominent) { 207 | undef $prominent{$_} unless($prominent{$_} > $avg); 208 | } 209 | } 210 | 211 | # this variable keeps track of already extracted features 212 | # => make sure that counts represent types, not tokens 213 | my %overlap = (); 214 | 215 | ########################################################## 216 | # All features below are modified adaptations from MITRE # 217 | ########################################################## 218 | 219 | # "set of words in the answer (A)" 220 | foreach (@a) { 221 | if(!$overlap{$_}) { 222 | push @features, "a".$_."_type"; 223 | if($prominent{$_}) { 224 | push @features, "t".$_."xa".$_."_prominenttype"; 225 | push @features, "ta_prominent_typeoverlap"; 226 | } 227 | } 228 | 229 | $overlap{$_}++; 230 | } 231 | 232 | # "the words common to the story and the answer ("S")" 233 | %overlap = (); 234 | foreach my $x (@t) { 235 | foreach my $y (@a) { 236 | if($x eq $y) { 237 | if(!$overlap{$x}) { 238 | push @features, "t".$x."xa".$y."_type"; 239 | push @features, "ta_overlap_type" 240 | } 241 | $overlap{$x}++; 242 | } 243 | } 244 | foreach my $y (@q) { 245 | push @features, "tq_overlap" if($x eq $y); 246 | } 247 | } 248 | 249 | # "the Cartesian product of the the(sic!) 
question and the answer (Q x A)" 250 | %overlap = (); 251 | foreach my $x (@q) { 252 | foreach my $y (@a) { 253 | if(!$overlap{$x."_x_".$y}) { 254 | push @features, "q".$x."xa".$y."_type"; 255 | push @features, "qa_overlap_type" if($x eq $y); 256 | } 257 | $overlap{$x."_x_".$y}++; 258 | } 259 | } 260 | 261 | return \@features; 262 | } 263 | 264 | 265 | -------------------------------------------------------------------------------- /surfaceLR-baseline/surfaceLR_train.pl: -------------------------------------------------------------------------------- 1 | #!/usr/bin/perl 2 | use utf8; 3 | use Text::Unidecode; 4 | 5 | use lib '.'; 6 | 7 | use JSON; 8 | use strict; 9 | 10 | # helper variables 11 | my $origtext = ""; 12 | my $text = ""; 13 | my $question = ""; 14 | my $type = ""; 15 | my $answer = ""; 16 | 17 | # hash that maps feature names to ids 18 | my %features = (); 19 | # inverse hash of features 20 | my %reversef = (); 21 | 22 | # Length and overlap features proposed by MITRE: 23 | # "... the following three counts were added to the feature set: |S overlap A|, ..." 24 | foreach ("t_length_word", "t_length_char", "a_length_word", "a_length_char", "q_length_word", "q_length_char", 25 | "ta_overlap", "qa_overlap", "tq_overlap") { 26 | $features{$_} = 1 + scalar(keys %features); 27 | $reversef{$features{$_}} = $_; 28 | } 29 | 30 | my $filename = $ARGV[0]; 31 | my ($data) = ($filename =~ m/^([^\.]*)/); 32 | 33 | # open file 34 | open(IN, $filename) or die "Could not open file: $filename!\n"; 35 | open(OUT, ">multRC_$data.feats"); 36 | 37 | my $json; 38 | while(<IN>) { 39 | # read each line as a JSON object and extract corresponding data 40 | $json = decode_json $_; 41 | foreach (@{$json->{"data"}}) { 42 | $origtext = $_->{"paragraph"}->{"text"}; 43 | $origtext =~ s/<[^\/]*\/[^>]*>//g; 44 | $origtext =~ s/<[^>]*>//g; 45 | 46 | # read and normalize text (=lowercase, tokenize, ...) 
47 | $text = normalize($origtext); 48 | 49 | foreach (@{$_->{"paragraph"}->{"questions"}}) { 50 | 51 | # read and normalize each question 52 | $question = $_->{"question"}; 53 | $question = normalize($question); 54 | 55 | foreach (@{$_->{"answers"}}) { 56 | 57 | # read and normalize each answer 58 | $answer = $_->{"text"}; 59 | $answer = normalize($answer); 60 | 61 | # build JSON output, if predictions already available 62 | if (-e "multRC_$data.preds") { 63 | my $pred = <PRED>; 64 | chomp $pred; 65 | ($pred) = ($pred =~ m/ ([^ ]*) /); 66 | 67 | $_->{"scores"}->{"simpleLR"} = $pred; 68 | } else { 69 | 70 | my $correct = $_->{"isAnswer"}; 71 | print OUT $correct?"1":"0"; 72 | # extract features and write to file (LIBLINEAR) 73 | writefeats(featurize($text, $question, $answer), $data); 74 | } 75 | } 76 | } 77 | } 78 | } 79 | close(IN); 80 | close(OUT); 81 | 82 | # produce model 83 | `liblinear/train -c 1 -n 20 -B 1 -s 0 multRC_$data.feats multRC.model`; 84 | 85 | # write features to file 86 | open(OUT, ">multRC.features"); 87 | foreach(keys %features) { 88 | print OUT $_, " ", $features{$_}, "\n"; 89 | } 90 | close(OUT); 91 | 92 | # writefeats takes a set of features and name of dataset as 93 | # input and then writes out a corresponding feature file in 94 | # the LIBLINEAR format (LABEL [FEATID:FEATVAL]*\n) 95 | sub writefeats($$) { 96 | my $features = shift; 97 | my $data = shift; 98 | 99 | my @f = @$features; 100 | 101 | push @f, "t_length_char"; 102 | push @f, "t_length_word"; 103 | push @f, "q_length_char"; 104 | push @f, "q_length_word"; 105 | push @f, "a_length_char"; 106 | push @f, "a_length_word"; 107 | 108 | my %feats = (); 109 | 110 | foreach (@f) { 111 | # if feature hasn't been seen before, assign feature ID 112 | # (requires current dataset to be the training data) 113 | if(!$features{$_}) { 114 | $features{$_} = 1 + scalar(keys %features); 115 | $reversef{$features{$_}} = $_; 116 | } 117 | 118 | # fill feature values for length/overlap features 119 | 
if(m/_length_/) { 120 | my $str = ""; 121 | $str = $text if(m/^t/); 122 | $str = $question if(m/^q/); 123 | $str = $answer if (m/^a/); 124 | if(m/_char$/) { 125 | $feats{$features{$_}} = length($str); 126 | } elsif(m/_word$/) { 127 | $feats{$features{$_}} = scalar(split(" ", $str)); 128 | } 129 | 130 | # default treatment for other features (just count them) 131 | } else { 132 | $feats{$features{$_}}++; 133 | } 134 | } 135 | 136 | # write features, ordered by ID (required by LIBLINEAR) 137 | my @feats = sort {$a <=> $b} keys %feats; 138 | foreach (@feats) { 139 | print OUT " ", $_, ":", $feats{$_}; 140 | } 141 | print OUT "\n"; 142 | } 143 | 144 | # normalize takes a string as input (text, question or answer) 145 | # and produces a "normalized" output 146 | sub normalize($) { 147 | my $string = shift; 148 | $string =~ s/([^[:ascii:]]+)/unidecode($1)/ge; 149 | 150 | # lowercase input 151 | $string = lc($string); 152 | 153 | # replace HTML quotation marks with actual quotation marks 154 | $string =~ s/&quot;/"/g; 155 | 156 | # remove punctuation and extra spaces 157 | $string =~ s/([\(\),\.\?\!\":\'\/])/ /g; 158 | $string =~ s/ +/ /g; 159 | 160 | return $string; 161 | } 162 | 163 | # featurize takes a triple as input 164 | # and extracts all applicable features from that triple 165 | sub featurize($$$) { 166 | my $t = shift; 167 | my $q = shift; 168 | my $a = shift; 169 | 170 | # tokenize by whitespace 171 | my @t = split(" ", $t); 172 | my @q = split(" ", $q); 173 | my @a = split(" ", $a); 174 | 175 | # start with an empty list of features 176 | my @features = (); 177 | 178 | # find "prominent" words, i.e. 
more often than average 179 | my %prominent = (); 180 | my $total = 0; 181 | my $avg = 0; 182 | if(@t) { 183 | foreach (@t) { 184 | $total++; 185 | $prominent{$_}++; 186 | } 187 | $avg = scalar(keys %prominent)/$total; 188 | foreach (keys %prominent) { 189 | undef $prominent{$_} unless($prominent{$_} > $avg); 190 | } 191 | } 192 | 193 | # this variable keeps track of already extracted features 194 | # => make sure that counts represent types, not tokens 195 | my %overlap = (); 196 | 197 | ########################################################## 198 | # All features below are modified adaptations from MITRE # 199 | ########################################################## 200 | 201 | # "set of words in the answer (A)" 202 | foreach (@a) { 203 | if(!$overlap{$_}) { 204 | push @features, "a".$_."_type"; 205 | if($prominent{$_}) { 206 | push @features, "t".$_."xa".$_."_prominenttype"; 207 | push @features, "ta_prominent_typeoverlap"; 208 | } 209 | } 210 | 211 | $overlap{$_}++; 212 | } 213 | 214 | # "the words common to the story and the answer ("S")" 215 | %overlap = (); 216 | foreach my $x (@t) { 217 | foreach my $y (@a) { 218 | if($x eq $y) { 219 | if(!$overlap{$x}) { 220 | push @features, "t".$x."xa".$y."_type"; 221 | push @features, "ta_overlap_type" 222 | } 223 | $overlap{$x}++; 224 | } 225 | } 226 | foreach my $y (@q) { 227 | push @features, "tq_overlap" if($x eq $y); 228 | } 229 | } 230 | 231 | # "the Cartesian product of the the(sic!) question and the answer (Q x A)" 232 | %overlap = (); 233 | foreach my $x (@q) { 234 | foreach my $y (@a) { 235 | if(!$overlap{$x."_x_".$y}) { 236 | push @features, "q".$x."xa".$y."_type"; 237 | push @features, "qa_overlap_type" if($x eq $y); 238 | } 239 | $overlap{$x."_x_".$y}++; 240 | } 241 | } 242 | 243 | return \@features; 244 | } 245 | 246 | 247 | --------------------------------------------------------------------------------
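The usage steps listed in surfaceLR-baseline/README.md (unzip LIBLINEAR, compile it, place the data files next to the scripts, run `run_surfaceLR.sh`) can be sketched as a small POSIX shell helper. This is a minimal sketch and not part of the repository: the function names `missing_inputs` and `run_baseline` are hypothetical, it assumes `unzip` and `make` are installed and that `liblinear.zip` unpacks into `liblinear/`, and the data file names are the ones hard-coded in `run_surfaceLR.sh`.

```sh
#!/bin/sh
# Sketch only: pre-flight helper for the surfaceLR baseline.
# missing_inputs and run_baseline are hypothetical names, not repository code.

# Print every file required by run_surfaceLR.sh that is absent from directory $1.
missing_inputs() {
    for f in liblinear/train liblinear/predict \
             train_456-fixedIds.json dev_83-fixedIds.json; do
        [ -f "$1/$f" ] || echo "$f"
    done
    return 0
}

# Unpack and compile LIBLINEAR if needed, then run the baseline in directory $1.
# Assumes liblinear.zip unpacks into liblinear/ and that unzip and make exist.
# The body is a subshell, so the caller's working directory is left unchanged.
run_baseline() (
    cd "${1:-.}" || exit 1
    [ -d liblinear ] || unzip -q liblinear.zip || exit 1
    [ -f liblinear/train ] || make -C liblinear || exit 1
    missing=$(missing_inputs .)
    if [ -n "$missing" ]; then
        echo "Missing files:" $missing >&2
        exit 1
    fi
    sh run_surfaceLR.sh
)
```

Calling `run_baseline .` from `surfaceLR-baseline/` would then run the same training and prediction commands as `run_surfaceLR.sh`; the explicit file check is there so that a missing data file is reported before Perl is invoked.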