├── LICENSE
├── README.md
├── data
│   ├── queries.json
│   └── tagme_annotations.json
├── nordlys
│   ├── __init__.py
│   ├── config.py
│   ├── elr
│   │   ├── __init__.py
│   │   ├── field_mapping.py
│   │   ├── query_annot.py
│   │   ├── retrieval_elr.py
│   │   ├── scorer_elr.py
│   │   └── top_fields.py
│   └── retrieval
│       ├── __init__.py
│       ├── indexer.py
│       ├── lucene_tools.py
│       ├── results.py
│       ├── retrieval.py
│       └── scorer.py
├── qrels
│   ├── qrels-INEX_LD.txt
│   ├── qrels-ListSearch.txt
│   ├── qrels-QALD.txt
│   ├── qrels-SemSearch_ES.txt
│   ├── qrels-v3.9.txt
│   └── queries.txt
├── requirements.txt
└── runs
    ├── default_params(Table5)
    │   ├── fsdm(default).treceval
    │   ├── fsdm_elr(default).treceval
    │   ├── lm_elr(default).treceval
    │   ├── prms_elr(default).treceval
    │   ├── sdm(default).treceval
    │   └── sdm_elr(default).treceval
    ├── fsdm.treceval
    ├── fsdm_elr.treceval
    ├── lm.treceval
    ├── lm_elr.treceval
    ├── mlm-all.treceval
    ├── mlm-all_elr.treceval
    ├── mlm-tc.treceval
    ├── mlm-tc_elr.treceval
    ├── prms.treceval
    ├── prms_elr.treceval
    ├── sdm.treceval
    └── sdm_elr.treceval


/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2016 Faegheh Hasibi
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Entity Linking integrated Retrieval (ELR)
 2 | 
 3 | This repository contains resources developed within the following paper:
 4 | 
 5 | 	F. Hasibi, K. Balog, and S.E. Bratsberg. “Exploiting Entity Linking in Queries for Entity Retrieval”.
 6 | 	In Proceedings of the ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR ’16), Newark, DE, USA, Sep 2016.
 7 | 
 8 | You can check the [paper](http://hasibi.com/files/ictir2016-elr.pdf) and [presentation](http://www.slideshare.net/FaeghehHasibi/ictir2016-elr) for detailed information.
 9 | 
10 | The repository is structured as follows:
11 | 
12 | - `nordlys/`: Code required for running entity retrieval methods.
13 | - `data/`: Query set and data required for running the code.
14 | - `qrels/`: Qrels files for the [DBpedia-entity test collection](http://krisztianbalog.com/resources/sigir-2013-dbpedia/) (version 3.9).
15 | - `runs/`: Run files reported in the paper.
16 | 
17 | 
18 | ## Usage
19 | 
20 | Use the following command to run the code:
21 | 
22 | ```
23 | python -m nordlys.elr.retrieval_elr <model_name>
24 | ```
25 | This command produces retrieval results using the parameters recommended in the paper.
26 | For a detailed description of the options, and to set different parameters, see the help output of `python -m nordlys.elr.retrieval_elr -h`.
27 | 
28 | Python v2.7 is required for running the code.
29 | 
30 | ## Code
31 | 
32 | Check the `nordlys/elr/scorer_elr.py` file for the actual implementation of the ELR framework and the baseline methods.
33 | 
34 | 
35 | ## Data
36 | 
37 | The indices required for running this code are described in the paper. You can also contact the authors to get the indices.
38 | The following files under the `data` folder are also required for running the code:
39 | 
40 | - `queries.json`: The DBpedia-entity queries, stopped as described in the paper.
41 | - `tagme_annotations.json`: Entity annotations of the queries obtained from the [TAGME API](https://tagme.d4science.org/tagme/).
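
As a rough illustration of how `queries.json` is laid out, the sketch below groups queries by the prefix of their ID (for example `INEX_LD`, `QALD2_te`, `SemSearch_ES`). It is not part of the `nordlys` package: the helper names `load_queries` and `group_by_benchmark` are made up for this example, and the default path assumes the script is run from the repository root.

```python
import io
import json
from collections import defaultdict


def load_queries(path="data/queries.json"):
    """Load the {query_id: query_text} mapping (path is an assumption)."""
    with io.open(path, encoding="utf-8") as f:
        return json.load(f)


def group_by_benchmark(queries):
    """Group queries by the part of the ID before the last hyphen.

    For example, "SemSearch_ES-10" is filed under "SemSearch_ES".
    """
    groups = defaultdict(dict)
    for query_id, text in queries.items():
        prefix, _, _ = query_id.rpartition("-")
        groups[prefix][query_id] = text
    return dict(groups)


# A few entries copied from data/queries.json so the sketch runs standalone;
# replace `sample` with load_queries() to process the full file.
sample = {
    "INEX_LD-2009039": "roman architecture",
    "INEX_XER-128": "Bond girls",
    "QALD2_te-35": "developed Skype?",
    "QALD2_tr-62": "created Wikipedia?",
    "SemSearch_ES-16": "brooklyn bridge",
    "SemSearch_ES-33": "harry potter",
    "SemSearch_LS-34": "ratt albums",
}

groups = group_by_benchmark(sample)
for prefix in sorted(groups):
    print("%s: %d queries" % (prefix, len(groups[prefix])))
```

Splitting on the last hyphen keeps prefixes such as `QALD2_te` and `QALD2_tr` separate, which may or may not be the grouping you want when matching against the files under `qrels/`.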
42 | 
43 | 
44 | ## Citation
45 | 
46 | If you use the resources presented in this repository, please cite:
47 | 
48 | ```
49 | @inproceedings{Hasibi:2016:ELR, 
50 |    author =    {Hasibi, Faegheh and Balog, Krisztian and Bratsberg, Svein Erik},
51 |    title =     {Exploiting Entity Linking in Queries for Entity Retrieval},
52 |    booktitle = {Proceedings of ACM SIGIR International Conference on the Theory of Information Retrieval},
53 |    series =    {ICTIR '16},
54 |    year =      {2016},
55 |    pages =     {209--218},
56 |    publisher = {ACM},
57 |    DOI =       {http://dx.doi.org/10.1145/2970398.2970406}
58 | } 
59 | ```
60 | 
61 | ## Contact
62 | 
63 | Should you have any questions, please contact Faegheh Hasibi at <f.hasibi@cs.ru.nl>.
64 | 


--------------------------------------------------------------------------------
/data/queries.json:
--------------------------------------------------------------------------------
  1 | {
  2 |     "INEX_LD-2009022": "Szechwan dish food cuisine", 
  3 |     "INEX_LD-2009039": "roman architecture", 
  4 |     "INEX_LD-2009053": "finland car industry manufacturer saab sisu", 
  5 |     "INEX_LD-2009061": "france second world war normandy", 
  6 |     "INEX_LD-2009062": "social network group selection", 
  7 |     "INEX_LD-2009063": "D-Day normandy invasion", 
  8 |     "INEX_LD-2009074": "web ranking scoring algorithm", 
  9 |     "INEX_LD-2009096": "Eiffel", 
 10 |     "INEX_LD-2009111": "europe solar power facility", 
 11 |     "INEX_LD-2009115": "virtual museums", 
 12 |     "INEX_LD-2010004": "Indian food", 
 13 |     "INEX_LD-2010014": "composer museum", 
 14 |     "INEX_LD-2010019": "gallo roman architecture in paris", 
 15 |     "INEX_LD-2010020": "electricity source in France", 
 16 |     "INEX_LD-2010037": "social network API", 
 17 |     "INEX_LD-2010043": "List of films from the surrealist category", 
 18 |     "INEX_LD-2010057": "Einstein Relativity theory", 
 19 |     "INEX_LD-2010069": "summer flowers", 
 20 |     "INEX_LD-2010100": "house concrete wood", 
 21 |     "INEX_LD-2010106": "organic food advantages disadvantages", 
 22 |     "INEX_LD-20120111": "vietnam war movie", 
 23 |     "INEX_LD-20120112": "vietnam war facts", 
 24 |     "INEX_LD-20120121": "vietnam food recipes", 
 25 |     "INEX_LD-20120122": "vietnamese food blog", 
 26 |     "INEX_LD-20120131": "vietnam travel national park", 
 27 |     "INEX_LD-20120132": "vietnam travel airports", 
 28 |     "INEX_LD-20120211": "guitar chord tuning", 
 29 |     "INEX_LD-20120212": "guitar chord minor", 
 30 |     "INEX_LD-20120221": "guitar classical flamenco", 
 31 |     "INEX_LD-20120222": "guitar classical bach", 
 32 |     "INEX_LD-20120231": "guitar origin Russia", 
 33 |     "INEX_LD-20120232": "guitar origin blues", 
 34 |     "INEX_LD-20120311": "tango culture movies", 
 35 |     "INEX_LD-20120312": "tango culture countries", 
 36 |     "INEX_LD-20120321": "tango music  composers", 
 37 |     "INEX_LD-20120322": "tango music instruments", 
 38 |     "INEX_LD-20120331": "tango dance styles", 
 39 |     "INEX_LD-20120332": "tango dance history", 
 40 |     "INEX_LD-20120411": "bicycle sport races", 
 41 |     "INEX_LD-20120412": "bicycle sport disciplines", 
 42 |     "INEX_LD-20120421": "bicycle holiday towns", 
 43 |     "INEX_LD-20120422": "bicycle holiday nature", 
 44 |     "INEX_LD-20120431": "bicycle benefits health", 
 45 |     "INEX_LD-20120432": "bicycle benefits environment", 
 46 |     "INEX_LD-20120511": "female rock singers", 
 47 |     "INEX_LD-20120512": "south korean girl groups", 
 48 |     "INEX_LD-20120521": "electronic music genres", 
 49 |     "INEX_LD-20120522": "digital music notation formats", 
 50 |     "INEX_LD-20120531": "music conferences", 
 51 |     "INEX_LD-20120532": "intellectual property rights lobby", 
 52 |     "INEX_LD-2012301": "Niagara falls origin lake", 
 53 |     "INEX_LD-2012303": "Valley fever fungal infection San Joaquin", 
 54 |     "INEX_LD-2012305": "North Dakota's lowest river of another colour", 
 55 |     "INEX_LD-2012307": "July, 1850  president died Millard Fillmore sworn following day", 
 56 |     "INEX_LD-2012309": "residents small island city-state  Malay Peninsula Chinese", 
 57 |     "INEX_LD-2012311": "John Lennon Yoko Ono album Starting Over", 
 58 |     "INEX_LD-2012313": "John Turturro 1991 Coen Brothers film", 
 59 |     "INEX_LD-2012315": "Baguio Quezon City  Manila official independence 1945", 
 60 |     "INEX_LD-2012317": "daggeroso inclined to use a dagger novel Sons and Lovers", 
 61 |     "INEX_LD-2012318": "Directed Bela Glen Glenda Bride Monster Plan 9 Outer Space", 
 62 |     "INEX_LD-2012319": "1994 short story collection Alice Munro is Open", 
 63 |     "INEX_LD-2012321": "Asian port state-city Sir Stamford Raffles", 
 64 |     "INEX_LD-2012323": "Large glaciers island nation Langjokull Hofsjokull Vatnajokull", 
 65 |     "INEX_LD-2012325": "successor James G. Blaine studied law", 
 66 |     "INEX_LD-2012327": "Beloved author African-American Nobel Prize Literature", 
 67 |     "INEX_LD-2012329": "Sweden Iceland currency", 
 68 |     "INEX_LD-2012331": "Seoul Korea river name ethnic group China", 
 69 |     "INEX_LD-2012333": "Prime minister Canada nicknamed Silver-Tongued Laurier longest unbroken term", 
 70 |     "INEX_LD-2012335": "U.S. president authorise nuclear weapons against Japan", 
 71 |     "INEX_LD-2012336": "1906 territory Papua island Australian", 
 72 |     "INEX_LD-2012337": "Texas city Baylor University tornado 1953", 
 73 |     "INEX_LD-2012339": "Nelson Mandela John Dube", 
 74 |     "INEX_LD-2012341": "1997  Houston airport president", 
 75 |     "INEX_LD-2012343": "The Heart of a Woman poet's autobiography", 
 76 |     "INEX_LD-2012345": "Kennedy assassination governor of Texas seriously injured", 
 77 |     "INEX_LD-2012347": "seat Florida country Dade", 
 78 |     "INEX_LD-2012349": "Alexander Nevsky Cathedral Bulgarian city liberation Turks", 
 79 |     "INEX_LD-2012351": "Indian Cuisine dish rice dhal vegetables roti papad", 
 80 |     "INEX_LD-2012353": "country German language", 
 81 |     "INEX_LD-2012354": "greatest guitarist", 
 82 |     "INEX_LD-2012355": "England football player highest paid", 
 83 |     "INEX_LD-2012357": "prima ballerina Bolshoi Theatre 1960", 
 84 |     "INEX_LD-2012359": "Bob Ricker Executive Director the latest front group for the anti-gun movement", 
 85 |     "INEX_LD-2012361": "most famous award winning actor singer", 
 86 |     "INEX_LD-2012363": "American twins famous  American professional tennis double players", 
 87 |     "INEX_LD-2012365": "mathematician computer scientist MIT's six inaugural MacVicar Faculty Fellows", 
 88 |     "INEX_LD-2012367": "invented telescope", 
 89 |     "INEX_LD-2012369": "most famous civic-military airports", 
 90 |     "INEX_LD-2012371": "most beautiful railway stations world cities located", 
 91 |     "INEX_LD-2012372": "famous historical battlefields opponents fought", 
 92 |     "INEX_LD-2012373": "birds cannot fly", 
 93 |     "INEX_LD-2012375": "animals lay eggs mammals", 
 94 |     "INEX_LD-2012377": "allegedly caused World War I", 
 95 |     "INEX_LD-2012379": "pairs cities same language same longitude different countries", 
 96 |     "INEX_LD-2012381": "movie directors directed a block buster", 
 97 |     "INEX_LD-2012383": "famous computer scientists disappeared at sea", 
 98 |     "INEX_LD-2012385": "famous politicians vegetarians", 
 99 |     "INEX_LD-2012387": "famous river confluence dam constructed", 
100 |     "INEX_LD-2012389": "frequently visited sharks gulf Indian Ocean", 
101 |     "INEX_LD-2012390": "baseball player most homeruns national league", 
102 |     "INEX_XER-100": "Operating systems to Steve Jobs related", 
103 |     "INEX_XER-106": "Noble english person from the Hundred Years' War", 
104 |     "INEX_XER-108": "State capitals of the United States of America", 
105 |     "INEX_XER-109": "National capitals situated on islands", 
106 |     "INEX_XER-110": "Nobel Prize in Literature winners were also poets", 
107 |     "INEX_XER-113": "Formula 1 drivers that won the Monaco Grand Prix", 
108 |     "INEX_XER-114": "Formula one races in Europe", 
109 |     "INEX_XER-115": "Formula One World Constructors' Champions", 
110 |     "INEX_XER-116": "Italian nobel prize winners", 
111 |     "INEX_XER-117": "Musicians appeared in the Blues Brothers movies", 
112 |     "INEX_XER-118": "French car models in 1960's", 
113 |     "INEX_XER-119": "Swiss cantons they speak German", 
114 |     "INEX_XER-121": "US presidents since 1960", 
115 |     "INEX_XER-122": "Movies with eight or more Academy Awards", 
116 |     "INEX_XER-123": "FIFA world cup national team winners since 1974", 
117 |     "INEX_XER-124": "Novels that won the Booker Prize", 
118 |     "INEX_XER-125": "countries have won the FIFA world cup", 
119 |     "INEX_XER-126": "toy train manufacturers that are still in business", 
120 |     "INEX_XER-127": "german female politicians", 
121 |     "INEX_XER-128": "Bond girls", 
122 |     "INEX_XER-129": "Science fiction book written in the 1980", 
123 |     "INEX_XER-130": "Star Trek Captains", 
124 |     "INEX_XER-132": "living nordic classical composers", 
125 |     "INEX_XER-133": "EU countries", 
126 |     "INEX_XER-134": "record-breaking sprinters in male 100-meter sprints", 
127 |     "INEX_XER-135": "professional baseball team in Japan", 
128 |     "INEX_XER-136": "Japanese players in Major League Baseball", 
129 |     "INEX_XER-138": "National Parks East Coast Canada US", 
130 |     "INEX_XER-139": "Films directed by Akira Kurosawa", 
131 |     "INEX_XER-140": "Airports in Germany", 
132 |     "INEX_XER-141": "Universities in Catalunya", 
133 |     "INEX_XER-143": "Hanseatic league in Germany in the Netherlands Circle", 
134 |     "INEX_XER-144": "chess world champions", 
135 |     "INEX_XER-147": "Chemical elements that are named after people", 
136 |     "INEX_XER-60": "olympic classes dinghy sailing", 
137 |     "INEX_XER-62": "Neil Gaiman novels", 
138 |     "INEX_XER-63": "Hugo awarded best novels", 
139 |     "INEX_XER-64": "Alan Moore graphic novels adapted to film", 
140 |     "INEX_XER-65": "Pacific navigators Australia explorers", 
141 |     "INEX_XER-67": "Ferris and observation wheels", 
142 |     "INEX_XER-72": "films shot in Venice", 
143 |     "INEX_XER-73": "magazines about indie-music", 
144 |     "INEX_XER-74": "circus mammals", 
145 |     "INEX_XER-79": "Works by Charles Rennie Mackintosh", 
146 |     "INEX_XER-81": "Movies about English hooligans", 
147 |     "INEX_XER-86": "List of countries in World War Two", 
148 |     "INEX_XER-87": "Axis powers of World War II", 
149 |     "INEX_XER-88": "Nordic authors are known for children's literature", 
150 |     "INEX_XER-91": "Paul Auster novels", 
151 |     "INEX_XER-94": "Hybrid cars sold in Europe", 
152 |     "INEX_XER-95": "Tom Hanks movies he plays a leading role.", 
153 |     "INEX_XER-96": "Pure object-oriented programing languages", 
154 |     "INEX_XER-97": "Compilers that can compile both C and C++", 
155 |     "INEX_XER-98": "Makers of lawn tennis rackets", 
156 |     "INEX_XER-99": "Computer systems that have a recursive acronym for the name", 
157 |     "QALD2_te-1": "German cities have more than 250000 inhabitants?", 
158 |     "QALD2_te-100": "produces Orangina?", 
159 |     "QALD2_te-11": "is the Formula 1 race driver with the most races?", 
160 |     "QALD2_te-12": "all world heritage sites designated within the past five years.", 
161 |     "QALD2_te-13": "is the youngest player in the Premier League?", 
162 |     "QALD2_te-14": "all members of Prodigy.", 
163 |     "QALD2_te-15": "is the longest river?", 
164 |     "QALD2_te-17": "all cars that are produced in Germany.", 
165 |     "QALD2_te-19": "all people that were born in Vienna and died in Berlin.", 
166 |     "QALD2_te-2": "was the successor of John F. Kennedy?", 
167 |     "QALD2_te-21": "is the capital of Canada?", 
168 |     "QALD2_te-22": "is the governor of Texas?", 
169 |     "QALD2_te-24": "was the father of Queen Elizabeth II?", 
170 |     "QALD2_te-25": "U.S. state has been admitted latest?", 
171 |     "QALD2_te-27": "Sean Parnell is the governor of U.S. state?", 
172 |     "QALD2_te-28": "all movies directed by Francis Ford Coppola.", 
173 |     "QALD2_te-29": "all actors starring in movies directed by and starring William Shatner.", 
174 |     "QALD2_te-3": "is the mayor of Berlin?", 
175 |     "QALD2_te-31": "all current Methodist national leaders.", 
176 |     "QALD2_te-33": "all Australian nonprofit organizations.", 
177 |     "QALD2_te-34": "In military conflicts did Lawrence of Arabia participate?", 
178 |     "QALD2_te-35": "developed Skype?", 
179 |     "QALD2_te-39": "all companies in Munich.", 
180 |     "QALD2_te-40": "List all boardgames by GMT.", 
181 |     "QALD2_te-41": "founded Intel?", 
182 |     "QALD2_te-42": "is the husband of Amanda Palmer?", 
183 |     "QALD2_te-43": "all breeds of the German Shepherd dog.", 
184 |     "QALD2_te-44": "cities does the Weser flow through?", 
185 |     "QALD2_te-45": "countries are connected by the Rhine?", 
186 |     "QALD2_te-46": "professional surfers were born on the Philippines?", 
187 |     "QALD2_te-48": "In UK city are the headquarters of the MI6?", 
188 |     "QALD2_te-49": "other weapons did the designer of the Uzi develop?", 
189 |     "QALD2_te-5": "is the second highest mountain on Earth?", 
190 |     "QALD2_te-51": "all Frisian islands that belong to the Netherlands.", 
191 |     "QALD2_te-53": "is the ruling party in Lisbon?", 
192 |     "QALD2_te-55": "Greek goddesses dwelt on Mount Olympus?", 
193 |     "QALD2_te-57": "the Apollo 14 astronauts.", 
194 |     "QALD2_te-58": "is the time zone of Salt Lake City?", 
195 |     "QALD2_te-59": "U.S. states are in the same timezone as Utah?", 
196 |     "QALD2_te-6": "all professional skateboarders from Sweden.", 
197 |     "QALD2_te-60": "a list of all lakes in Denmark.", 
198 |     "QALD2_te-63": "all Argentine films.", 
199 |     "QALD2_te-64": "all launch pads operated by NASA.", 
200 |     "QALD2_te-65": "instruments did John Lennon play?", 
201 |     "QALD2_te-66": "ships were called after Benjamin Franklin?", 
202 |     "QALD2_te-67": "are the parents of the wife of Juan Carlos I?", 
203 |     "QALD2_te-72": "In U.S. state is Area 51 located?", 
204 |     "QALD2_te-75": "daughters of British earls died in the same place they were born in?", 
205 |     "QALD2_te-76": "List the children of Margaret Thatcher.", 
206 |     "QALD2_te-77": "was called Scarface?", 
207 |     "QALD2_te-8": "To countries does the Himalayan mountain system extend?", 
208 |     "QALD2_te-80": "all books by William Goldman with more than 300 pages.", 
209 |     "QALD2_te-81": "books by Kerouac were published by Viking Press?", 
210 |     "QALD2_te-82": "a list of all American inventions.", 
211 |     "QALD2_te-84": "created the comic Captain America?", 
212 |     "QALD2_te-86": "is the largest city in Australia?", 
213 |     "QALD2_te-87": "composed the music for Harold and Maude?", 
214 |     "QALD2_te-88": "films starring Clint Eastwood did he direct himself?", 
215 |     "QALD2_te-89": "In city was the former Dutch queen Juliana buried?", 
216 |     "QALD2_te-9": "a list of all trumpet players that were bandleaders.", 
217 |     "QALD2_te-90": "is the residence of the prime minister of Spain?", 
218 |     "QALD2_te-91": "U.S. State has the abbreviation MN?", 
219 |     "QALD2_te-92": "all songs from Bruce Springsteen released between 1980 and 1990.", 
220 |     "QALD2_te-93": "movies did Sam Raimi direct after Army of Darkness?", 
221 |     "QALD2_te-95": "wrote the lyrics for the Polish national anthem?", 
222 |     "QALD2_te-97": "painted The Storm on the Sea of Galilee?", 
223 |     "QALD2_te-98": "country does the creator of Miffy come from?", 
224 |     "QALD2_te-99": "For label did Elvis record his first album?", 
225 |     "QALD2_tr-1": "all female Russian astronauts.", 
226 |     "QALD2_tr-10": "In country does the Nile start?", 
227 |     "QALD2_tr-11": "countries have places with more than two caves?", 
228 |     "QALD2_tr-13": "classis does the Millepede belong to?", 
229 |     "QALD2_tr-15": "created Goofy?", 
230 |     "QALD2_tr-16": "the capitals of all countries in Africa.", 
231 |     "QALD2_tr-17": "all cities in New Jersey with more than 100000 inhabitants.", 
232 |     "QALD2_tr-18": "museum exhibits The Scream by Munch?", 
233 |     "QALD2_tr-21": "states border Illinois?", 
234 |     "QALD2_tr-22": "In country is the Limerick Lake?", 
235 |     "QALD2_tr-23": "television shows were created by Walt Disney?", 
236 |     "QALD2_tr-24": "mountain is the highest after the Annapurna?", 
237 |     "QALD2_tr-25": "In films directed by Garry Marshall was Julia Roberts starring?", 
238 |     "QALD2_tr-26": "bridges are of the same type as the Manhattan Bridge?", 
239 |     "QALD2_tr-28": "European countries have a constitutional monarchy?", 
240 |     "QALD2_tr-29": "awards did WikiLeaks win?", 
241 |     "QALD2_tr-3": "is the daughter of Bill Clinton married to?", 
242 |     "QALD2_tr-30": "state of the USA has the highest population density?", 
243 |     "QALD2_tr-31": "is the currency of the Czech Republic?", 
244 |     "QALD2_tr-32": "countries in the European Union adopted the Euro?", 
245 |     "QALD2_tr-34": "countries have more than two official languages?", 
246 |     "QALD2_tr-35": "is the owner of Universal Studios?", 
247 |     "QALD2_tr-36": "Through countries does the Yenisei river flow?", 
248 |     "QALD2_tr-38": "monarchs of the United Kingdom were married to a German?", 
249 |     "QALD2_tr-4": "river does the Brooklyn Bridge cross?", 
250 |     "QALD2_tr-40": "is the highest mountain in Australia?", 
251 |     "QALD2_tr-41": "all soccer clubs in Spain.", 
252 |     "QALD2_tr-42": "are the official languages of the Philippines?", 
253 |     "QALD2_tr-43": "is the mayor of New York City?", 
254 |     "QALD2_tr-44": "designed the Brooklyn Bridge?", 
255 |     "QALD2_tr-45": "telecommunications organizations are located in Belgium?", 
256 |     "QALD2_tr-47": "is the highest place of Karakoram?", 
257 |     "QALD2_tr-49": "all companies in the advertising industry.", 
258 |     "QALD2_tr-50": "did Bruce Carver die from?", 
259 |     "QALD2_tr-51": "all school types.", 
260 |     "QALD2_tr-52": "presidents were born in 1945?", 
261 |     "QALD2_tr-53": "all presidents of the United States.", 
262 |     "QALD2_tr-54": "was the wife of U.S. president Lincoln?", 
263 |     "QALD2_tr-55": "developed the video game World of Warcraft?", 
264 |     "QALD2_tr-57": "List all episodes of the first season of the HBO television series The Sopranos!", 
265 |     "QALD2_tr-58": "produced the most films?", 
266 |     "QALD2_tr-59": "all people with first name Jimmy.", 
267 |     "QALD2_tr-6": "did Abraham Lincoln die?", 
268 |     "QALD2_tr-61": "mountains are higher than the Nanga Parbat?", 
269 |     "QALD2_tr-62": "created Wikipedia?", 
270 |     "QALD2_tr-63": "all actors starring in Batman Begins.", 
271 |     "QALD2_tr-64": "software has been developed by organizations founded in California?", 
272 |     "QALD2_tr-65": "companies work in the aerospace industry as well as on nuclear reactor technology?", 
273 |     "QALD2_tr-68": "actors were born in Germany?", 
274 |     "QALD2_tr-69": "caves have more than 3 entrances?", 
275 |     "QALD2_tr-70": "all films produced by Hal Roach.", 
276 |     "QALD2_tr-71": "all video games published by Mean Hamster Software.", 
277 |     "QALD2_tr-72": "languages are spoken in Estonia?", 
278 |     "QALD2_tr-73": "owns Aldi?", 
279 |     "QALD2_tr-74": "capitals in Europe were host cities of the summer olympic games?", 
280 |     "QALD2_tr-75": "has been the 5th president of the United States of America?", 
281 |     "QALD2_tr-77": "music albums contain the song Last Christmas?", 
282 |     "QALD2_tr-78": "all books written by Danielle Steel.", 
283 |     "QALD2_tr-79": "airports are located in California, USA?", 
284 |     "QALD2_tr-8": "states of Germany are governed by the Social Democratic Party?", 
285 |     "QALD2_tr-80": "all Canadian Grunge record labels.", 
286 |     "QALD2_tr-81": "country has the most official languages?", 
287 |     "QALD2_tr-82": "In programming language is GIMP written?", 
288 |     "QALD2_tr-83": "produced films starring Natalie Portman?", 
289 |     "QALD2_tr-84": "all movies with Tom Cruise.", 
290 |     "QALD2_tr-85": "In films did Julia Roberts as well as Richard Gere play?", 
291 |     "QALD2_tr-86": "all female German chancellors.", 
292 |     "QALD2_tr-87": "wrote the book The pillars of the Earth?", 
293 |     "QALD2_tr-89": "all soccer clubs in the Premier League.", 
294 |     "QALD2_tr-9": "U.S. states possess gold minerals?", 
295 |     "QALD2_tr-91": "organizations were founded in 1950?", 
296 |     "QALD2_tr-92": "is the highest mountain?", 
297 |     "SemSearch_ES-1": "44 magnum hunting", 
298 |     "SemSearch_ES-10": "asheville north carolina", 
299 |     "SemSearch_ES-100": "YMCA Tampa", 
300 |     "SemSearch_ES-101": "ashley wagner", 
301 |     "SemSearch_ES-102": "beach flowers", 
302 |     "SemSearch_ES-103": "bounce city humble tx", 
303 |     "SemSearch_ES-104": "bourbonnais il", 
304 |     "SemSearch_ES-105": "cedar garden apartments", 
305 |     "SemSearch_ES-106": "chase masterson", 
306 |     "SemSearch_ES-107": "concord steel", 
307 |     "SemSearch_ES-108": "danielia cotton", 
308 |     "SemSearch_ES-109": "david hewlett", 
309 |     "SemSearch_ES-11": "austin powers", 
310 |     "SemSearch_ES-111": "eagle rock, ca", 
311 |     "SemSearch_ES-112": "espresso tv stands", 
312 |     "SemSearch_ES-114": "glenn frey", 
313 |     "SemSearch_ES-115": "goodwill of michigan", 
314 |     "SemSearch_ES-118": "iowa energy", 
315 |     "SemSearch_ES-119": "john elliott", 
316 |     "SemSearch_ES-12": "austin texas", 
317 |     "SemSearch_ES-120": "lawrence general hospital", 
318 |     "SemSearch_ES-123": "michael zimmerman", 
319 |     "SemSearch_ES-124": "motorola bluetooth hs850", 
320 |     "SemSearch_ES-125": "nokia e73", 
321 |     "SemSearch_ES-127": "palm tungsten e2 handheld", 
322 |     "SemSearch_ES-128": "philadelphia neufchatel cheese", 
323 |     "SemSearch_ES-129": "pizza populous detroit mi", 
324 |     "SemSearch_ES-13": "banana paper making", 
325 |     "SemSearch_ES-130": "plymouth police department", 
326 |     "SemSearch_ES-131": "scpa san diego", 
327 |     "SemSearch_ES-132": "sealy mattress co", 
328 |     "SemSearch_ES-133": "sedona hiking trails", 
329 |     "SemSearch_ES-134": "skye woods", 
330 |     "SemSearch_ES-135": "spring shoes canada", 
331 |     "SemSearch_ES-136": "sri lanka government gazette", 
332 |     "SemSearch_ES-137": "steak express", 
333 |     "SemSearch_ES-138": "syracuse spca", 
334 |     "SemSearch_ES-139": "the big texan steak house", 
335 |     "SemSearch_ES-14": "ben franklin", 
336 |     "SemSearch_ES-140": "toledo bend realty", 
337 |     "SemSearch_ES-141": "ventura county court", 
338 |     "SemSearch_ES-142": "windsor hotel philadelphia", 
339 |     "SemSearch_ES-15": "bradley center", 
340 |     "SemSearch_ES-16": "brooklyn bridge", 
341 |     "SemSearch_ES-17": "butte montana", 
342 |     "SemSearch_ES-18": "canasta cards", 
343 |     "SemSearch_ES-19": "carl lewis", 
344 |     "SemSearch_ES-2": "B. F. Skinner", 
345 |     "SemSearch_ES-20": "carolina", 
346 |     "SemSearch_ES-21": "charles darwin", 
347 |     "SemSearch_ES-22": "city of charlotte", 
348 |     "SemSearch_ES-23": "city of virginia beach", 
349 |     "SemSearch_ES-24": "coastal carolina", 
350 |     "SemSearch_ES-25": "david suchet", 
351 |     "SemSearch_ES-26": "disney orlando", 
352 |     "SemSearch_ES-27": "earl may", 
353 |     "SemSearch_ES-28": "el salvador", 
354 |     "SemSearch_ES-29": "ellis college", 
355 |     "SemSearch_ES-3": "Bookwork", 
356 |     "SemSearch_ES-30": "eloan line of credit", 
357 |     "SemSearch_ES-31": "emery", 
358 |     "SemSearch_ES-32": "fitzgerald auto mall chambersburg pa", 
359 |     "SemSearch_ES-33": "harry potter", 
360 |     "SemSearch_ES-34": "harry potter movie", 
361 |     "SemSearch_ES-35": "hospice of cincinnati", 
362 |     "SemSearch_ES-36": "imdb batman returns", 
363 |     "SemSearch_ES-37": "jack johnson", 
364 |     "SemSearch_ES-38": "jack the ripper", 
365 |     "SemSearch_ES-39": "james caldwell high school", 
366 |     "SemSearch_ES-4": "NAACP Image Awards", 
367 |     "SemSearch_ES-40": "james clayton md", 
368 |     "SemSearch_ES-41": "joan of arc", 
369 |     "SemSearch_ES-42": "john maxwell", 
370 |     "SemSearch_ES-45": "keith urban", 
371 |     "SemSearch_ES-47": "king arthur", 
372 |     "SemSearch_ES-48": "la scala restaurant philadelphia", 
373 |     "SemSearch_ES-49": "laura bush", 
374 |     "SemSearch_ES-5": "Scott County", 
375 |     "SemSearch_ES-50": "laura steele bob and tom", 
376 |     "SemSearch_ES-51": "lexus of maplewood", 
377 |     "SemSearch_ES-52": "lincoln park", 
378 |     "SemSearch_ES-53": "lynchburg virginia", 
379 |     "SemSearch_ES-54": "marc anthony", 
380 |     "SemSearch_ES-55": "marcus theaters", 
381 |     "SemSearch_ES-56": "mario bros", 
382 |     "SemSearch_ES-57": "martin luther king", 
383 |     "SemSearch_ES-58": "mason ohio", 
384 |     "SemSearch_ES-59": "mercy hospital in des moines, ia", 
385 |     "SemSearch_ES-6": "air wisconsin", 
386 |     "SemSearch_ES-60": "michael douglas", 
387 |     "SemSearch_ES-61": "mr rourke fantasy island", 
388 |     "SemSearch_ES-63": "old winchester shotguns", 
389 |     "SemSearch_ES-64": "omeara ford", 
390 |     "SemSearch_ES-65": "orlando florida", 
391 |     "SemSearch_ES-66": "overeaters anonymous", 
392 |     "SemSearch_ES-67": "ovguide movies", 
393 |     "SemSearch_ES-68": "pierce county washington", 
394 |     "SemSearch_ES-69": "piosenki mp3", 
395 |     "SemSearch_ES-7": "airsoft glock", 
396 |     "SemSearch_ES-70": "radio italia online", 
397 |     "SemSearch_ES-71": "richmond virginia", 
398 |     "SemSearch_ES-72": "rock 103 memphis", 
399 |     "SemSearch_ES-73": "rowan university", 
400 |     "SemSearch_ES-74": "sacred heart u", 
401 |     "SemSearch_ES-75": "sagemont church houston tx", 
402 |     "SemSearch_ES-76": "san antonio", 
403 |     "SemSearch_ES-77": "savannah tech", 
404 |     "SemSearch_ES-78": "sharp pc", 
405 |     "SemSearch_ES-79": "shobana masala", 
406 |     "SemSearch_ES-8": "aloha sol", 
407 |     "SemSearch_ES-80": "sonny and cher", 
408 |     "SemSearch_ES-81": "south dakota state university", 
409 |     "SemSearch_ES-82": "st lucia", 
410 |     "SemSearch_ES-83": "st paul saints", 
411 |     "SemSearch_ES-84": "the dish danielle fishel", 
412 |     "SemSearch_ES-85": "the longest yard sale", 
413 |     "SemSearch_ES-86": "the morning call lehigh valley pa", 
414 |     "SemSearch_ES-87": "the quick lift", 
415 |     "SemSearch_ES-88": "thomas jefferson", 
416 |     "SemSearch_ES-89": "university of north dakota", 
417 |     "SemSearch_ES-9": "american embassy nairobi", 
418 |     "SemSearch_ES-90": "university of phoenix", 
419 |     "SemSearch_ES-91": "westminster abbey", 
420 |     "SemSearch_ES-93": "08 toyota tundra", 
421 |     "SemSearch_ES-94": "Hugh Downs", 
422 |     "SemSearch_ES-95": "MADRID", 
423 |     "SemSearch_ES-96": "New England Coffee", 
424 |     "SemSearch_ES-97": "PINK PANTHER 2", 
425 |     "SemSearch_ES-98": "University of Texas at Austin", 
426 |     "SemSearch_ES-99": "University of York", 
427 |     "SemSearch_LS-1": "Apollo astronauts walked on the Moon", 
428 |     "SemSearch_LS-10": "did nicole kidman have any siblings", 
429 |     "SemSearch_LS-11": "dioceses of the church of ireland", 
430 |     "SemSearch_LS-12": "first targets of the atomic bomb", 
431 |     "SemSearch_LS-13": "five great epics of Tamil literature", 
432 |     "SemSearch_LS-14": "gods dwelt on Mount Olympus", 
433 |     "SemSearch_LS-16": "hijackers in the September 11 attacks", 
434 |     "SemSearch_LS-17": "houses of the Russian parliament", 
435 |     "SemSearch_LS-18": "john lennon, parents", 
436 |     "SemSearch_LS-19": "kenya's captain in cricket", 
437 |     "SemSearch_LS-2": "Arab states of the Persian Gulf", 
438 |     "SemSearch_LS-20": "kublai khan siblings", 
439 |     "SemSearch_LS-21": "lilly allen parents", 
440 |     "SemSearch_LS-22": "major leagues in the united states", 
441 |     "SemSearch_LS-24": "matt berry tv series", 
442 |     "SemSearch_LS-25": "members of u2?", 
443 |     "SemSearch_LS-26": "movies starring erykah badu", 
444 |     "SemSearch_LS-29": "nations Portuguese is an official language", 
445 |     "SemSearch_LS-3": "astronauts landed on the Moon", 
446 |     "SemSearch_LS-30": "orders (or 'choirs') of angels", 
447 |     "SemSearch_LS-31": "permanent members of the UN Security Council", 
448 |     "SemSearch_LS-32": "presidents depicted on mount rushmore died of shooting", 
449 |     "SemSearch_LS-33": "provinces and territories of Canada", 
450 |     "SemSearch_LS-34": "ratt albums", 
451 |     "SemSearch_LS-35": "republics of the former Yugoslavia", 
452 |     "SemSearch_LS-36": "revolutionaries of 1959 in Cuba", 
453 |     "SemSearch_LS-37": "standard axioms of set theory", 
454 |     "SemSearch_LS-38": "states that border oklahoma", 
455 |     "SemSearch_LS-39": "ten ancient Greek city-kingdoms of Cyprus", 
456 |     "SemSearch_LS-4": "Axis powers of World War II", 
457 |     "SemSearch_LS-40": "the first 13 american states", 
458 |     "SemSearch_LS-41": "the four of the companions of the prophet", 
459 |     "SemSearch_LS-42": "twelve tribes or sons of Israel", 
460 |     "SemSearch_LS-43": "books did paul of tarsus write?", 
461 |     "SemSearch_LS-44": "languages do they speak in afghanistan", 
462 |     "SemSearch_LS-46": "the British monarch is also head of state", 
463 |     "SemSearch_LS-49": "invented the python programming language", 
464 |     "SemSearch_LS-5": "books of the Jewish canon", 
465 |     "SemSearch_LS-50": "wonders of the ancient world", 
466 |     "SemSearch_LS-6": "boroughs of New York City", 
467 |     "SemSearch_LS-7": "Branches of the US military", 
468 |     "SemSearch_LS-8": "continents in the world", 
469 |     "SemSearch_LS-9": "degrees of Eastern Orthodox monasticism", 
470 |     "TREC_Entity-1": "Carriers that Blackberry makes phones for.", 
471 |     "TREC_Entity-10": "Campuses of Indiana University.", 
472 |     "TREC_Entity-11": "Donors to the Home Depot Foundation.", 
473 |     "TREC_Entity-12": "Airlines that Air Canada has code share flights with.", 
474 |     "TREC_Entity-14": "Authors awarded an Anthony Award at Bouchercon in 2007.", 
475 |     "TREC_Entity-15": "Universities that are members of the SEC conference for football.", 
476 |     "TREC_Entity-16": "Sponsors of the Mancuso quilt festivals.", 
477 |     "TREC_Entity-17": "Chefs with a show on the Food Network.", 
478 |     "TREC_Entity-18": "Members of the band Jefferson Airplane.", 
479 |     "TREC_Entity-19": "Companies that John Hennessey serves on the board of.", 
480 |     "TREC_Entity-2": "Winners of the ACM Athena award.", 
481 |     "TREC_Entity-20": "Scotch whisky distilleries on the island of Islay.", 
482 |     "TREC_Entity-4": "Professional sports teams in Philadelphia.", 
483 |     "TREC_Entity-5": "Products of Medimmune, Inc.", 
484 |     "TREC_Entity-6": "Organizations that award Nobel prizes.", 
485 |     "TREC_Entity-7": "Airlines that currently use Boeing 747 planes.", 
486 |     "TREC_Entity-9": "Members of The Beaux Arts Trio."
487 | }


--------------------------------------------------------------------------------
/nordlys/__init__.py:
--------------------------------------------------------------------------------
1 | from __future__ import division
2 | 


--------------------------------------------------------------------------------
/nordlys/config.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Global nordlys config.
 3 | 
 4 | @author: Faegheh Hasibi (faegheh.hasibi@idi.ntnu.no)
 5 | @author: Krisztian Balog (krisztian.balog@uis.no)
 6 | """
 7 | 
 8 | from os import path
 9 | 
10 | NORDLYS_DIR = path.dirname(path.abspath(__file__))
11 | DATA_DIR = path.dirname(path.dirname(path.abspath(__file__))) + "/data"
12 | OUTPUT_DIR = path.dirname(path.dirname(path.abspath(__file__))) + "/runs"
13 | 
14 | TERM_INDEX_DIR = "path/to/term/index"
15 | URI_INDEX_DIR = "path/to/URI/index"
16 | print "Term index:", TERM_INDEX_DIR
17 | print "URI index:", URI_INDEX_DIR
18 | 
19 | QUERIES = DATA_DIR + "/queries.json"
20 | ANNOTATIONS = DATA_DIR + "/tagme_annotations.json"
21 | 
22 | 


--------------------------------------------------------------------------------
/nordlys/elr/__init__.py:
--------------------------------------------------------------------------------
1 | __author__ = 'faeghehhasibi'
2 | 


--------------------------------------------------------------------------------
/nordlys/elr/field_mapping.py:
--------------------------------------------------------------------------------
  1 | """
  2 | Computes PRMS field mapping probabilities.
  3 | 
  4 | @author: Faegheh Hasibi (faegheh.hasibi@idi.ntnu.no)
  5 | """
  6 | 
  7 | from __future__ import division
  8 | from pprint import PrettyPrinter
  9 | 
 10 | from nordlys.retrieval.scorer import ScorerPRMS
 11 | from nordlys.elr.top_fields import TopFields
 12 | 
 13 | 
 14 | class FieldMapping(object):
 15 |     DEBUG = 0
 16 |     MAPPING_DEBUG = 0
 17 | 
 18 |     def __init__(self, lucene_term, lucene_uri, n):
 19 |         self.lucene_term = lucene_term
 20 |         self.lucene_uri = lucene_uri
 21 |         self.n = n
 22 | 
 23 |     def map(self, query_annot, slop=None):
 24 |         """
 25 |         Computes PRMS field mapping probabilities for URIs, terms, ordered, and unordered phrases.
 26 | 
 27 |         :param query_annot: nordlys.elr.QueryAnnot
 28 |         :param slop: number of terms allowed between phrase terms; if None, unordered mappings are skipped
 29 |         :return: field mappings: {'uris': {uri:{field: prob, ..}, ..}, 'terms': {..}, 'ordered': {..}, 'unordered': {..}}
 30 |         """
 31 |         T, phrases, E = set(query_annot.T), set(query_annot.get_all_phrases()), set(query_annot.E.keys())
 32 |         field_mappings = {'uris': self.get_mapping_uris(E),
 33 |                           'terms': self.get_mapping_terms(T),
 34 |                           'ordered': self.get_mapping_phrases(phrases, 0, True)}
 35 |         print "  ordered done!"
 36 |         if slop is not None:
 37 |             field_mappings['unordered'] = self.get_mapping_phrases(phrases, slop, False)
 38 |             print "  unordered done!"
 39 |         print "==="
 40 |         return field_mappings
 41 | 
 42 |     def get_mapping_uris(self, uris):
 43 |         """
 44 |         Computes field mapping probability for URIs.
 45 | 
 46 |         :param uris: list of uris
 47 |         :return: Dictionary {uri: {field: weight, ..}, ..}
 48 |         """
 49 |         field_mappings = {}
 50 |         for uri in uris:
 51 |             top_fields = TopFields(self.lucene_uri).get_top_term(uri, self.n)
 52 |             scorer_prms = ScorerPRMS(self.lucene_uri, None, {'fields': top_fields})
 53 |             field_mappings[uri] = scorer_prms.get_mapping_prob(uri)
 54 |             if self.DEBUG:
 55 |                 print uri
 56 |                 PrettyPrinter(depth=4).pprint(sorted(field_mappings[uri].items(), key=lambda f: f[1], reverse=True))
 57 |         return field_mappings
 58 | 
 59 |     def get_mapping_terms(self, terms):
 60 |         """
 61 |         Computes PRMS field mapping probability for terms.
 62 | 
 63 |         :param terms: list of terms
 64 |         :return: Dictionary {term: {field: weight, ..}, ..}
 65 |         """
 66 |         field_mappings = {}
 67 |         top_fields = TopFields(self.lucene_term).get_top_index(self.n)
 68 |         for term in terms:
 69 |             scorer_prms = ScorerPRMS(self.lucene_term, None, {'fields': top_fields})
 70 |             field_mappings[term] = scorer_prms.get_mapping_prob(term)
 71 |             if self.DEBUG:
 72 |                 print term
 73 |                 PrettyPrinter(depth=4).pprint(sorted(field_mappings[term].items(), key=lambda f: f[1], reverse=True))
 74 |         return field_mappings
 75 | 
 76 |     def get_mapping_phrases(self, phrases, slop, ordered):
 77 |         """
 78 |         Computes PRMS field mapping probability for phrases.
 79 | 
 80 |         :param phrases: list of phrases
 81 |         :param ordered: if True, performs ordered search
 82 |         :param slop: number of terms between the terms of phrase
 83 |         :return: Dictionary {phrase: {field: weight, ..}, ..}
 84 |         """
 85 |         field_mappings = {}
 86 |         top_fields = TopFields(self.lucene_term).get_top_index(self.n)
 87 |         for phrase in phrases:
 88 |             coll_freqs = self.__get_coll_freqs(phrase, top_fields, slop, ordered)
 89 |             scorer_prms = ScorerPRMS(self.lucene_term, None, {'fields': top_fields})
 90 |             field_mappings[phrase] = scorer_prms.get_mapping_prob(phrase, coll_termfreq_fields=coll_freqs)
 91 |             if self.DEBUG:
 92 |                 print phrase
 93 |                 PrettyPrinter(depth=4).pprint(sorted(field_mappings[phrase].items(), key=lambda f: f[1], reverse=True))
 94 |         return field_mappings
 95 | 
 96 |     def __get_coll_freqs(self, phrase, fields, slop, ordered):
 97 |         """Gets collection term frequency for all fields."""
 98 |         coll_freqs = {}
 99 |         for f in fields:
100 |             doc_phrase_freq = self.lucene_term.get_doc_phrase_freq(phrase, f, slop=slop, ordered=ordered)
101 |             coll_freqs[f] = sum(doc_phrase_freq.values())
102 |         return coll_freqs
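
The mapping methods above delegate the actual probability computation to ScorerPRMS.get_mapping_prob, which is not shown in this file. As a rough illustration only, the sketch below follows the usual PRMS formulation, P(f|t) = P(t|C_f) P(f) / sum_f' P(t|C_f') P(f'). The function name prms_mapping, its arguments, and the uniform field prior are assumptions made for this example and may differ from what ScorerPRMS does.

```python
def prms_mapping(coll_freqs, field_lengths, priors=None):
    """Return {field: P(f|t)} from per-field collection statistics.

    coll_freqs:    {field: collection frequency of the term or phrase}
    field_lengths: {field: total number of tokens in that field}
    priors:        optional {field: P(f)}; uniform if omitted (assumption)
    """
    fields = list(coll_freqs)
    if priors is None:
        priors = dict((f, 1.0 / len(fields)) for f in fields)

    # Unnormalized weight per field: P(t|C_f) * P(f)
    joint = {}
    for f in fields:
        length = field_lengths.get(f, 0)
        p_t_cf = coll_freqs[f] / float(length) if length > 0 else 0.0
        joint[f] = p_t_cf * priors[f]

    # Normalize over fields; return all zeros if the term occurs nowhere
    total = sum(joint.values())
    if total == 0:
        return dict((f, 0.0) for f in fields)
    return dict((f, joint[f] / total) for f in fields)


# Hypothetical statistics: the phrase is three times as frequent in "names"
mapping = prms_mapping({"names": 30, "contents": 10},
                       {"names": 1000, "contents": 1000})
print(sorted(mapping.items(), key=lambda kv: kv[1], reverse=True))
```

For phrases, the coll_freqs argument plays the role of the dictionary built by __get_coll_freqs, where each value is the sum of per-document phrase frequencies in a field.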


--------------------------------------------------------------------------------
/nordlys/elr/query_annot.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Class for query annotations in the json file.
 3 | 
 4 | @author: Faegheh Hasibi (faegheh.hasibi@idi.ntnu.no)
 5 | """
 6 | from nordlys.retrieval.lucene_tools import Lucene
 7 | 
 8 | 
 9 | class QueryAnnot(object):
10 |     def __init__(self, annotations, score_th, qid=None):
11 |         self.annotations = annotations
12 |         self.score_th = score_th
13 |         self.qid = qid
14 |         self.__E = None
15 |         self.__T = None
16 |         self.__mentions = None
17 | 
18 |     @property
19 |     def query(self):
20 |         return self.annotations.get('query', None)
21 | 
22 |     @property
23 |     def field_mappings(self):
24 |         """Returns field mappings."""
25 |         return self.annotations.get('field_mappings', {})
26 | 
27 |     @field_mappings.setter
28 |     def field_mappings(self, value):
29 |         if "field_mappings" not in self.annotations:
30 |             self.annotations['field_mappings'] = {}
31 |         self.annotations['field_mappings'].update(value)
32 | 
33 |     @property
34 |     def E(self):
35 |         """Returns annotated entities with score >= score_th as a dict {uri: score}."""
36 |         if self.__E is None:
37 |             self.__E = {}
38 |             for interpretation in self.annotations['interpretations'].values():
39 |                 for annot in interpretation['annots'].values():
40 |                     if float(annot['score']) >= self.score_th:
41 |                         self.__E[annot['uri']] = annot['score']
42 |         return self.__E
43 | 
44 |     @property
45 |     def T(self):
46 |         """Returns all query terms."""
47 |         if self.__T is None:
48 |             analyzed_query = Lucene.preprocess(self.query)
49 |             self.__T = analyzed_query.split(" ")
50 |         return self.__T
51 | 
52 |     @property
53 |     def mentions(self):
54 |         """Returns all mentions (among all annotations)."""
55 |         if self.__mentions is None:
56 |             self.__mentions = {}
57 |             for interpretation in self.annotations['interpretations'].values():
58 |                 for mention, annot in interpretation['annots'].iteritems():
59 |                     if float(annot['score']) >= self.score_th:
60 |                         analyzed_phrase = Lucene.preprocess(mention)
61 |                         if (analyzed_phrase is not None) and (analyzed_phrase.strip() != ""):
62 |                             self.__mentions[analyzed_phrase] = annot['score']
63 |         return self.__mentions
64 | 
65 |     def get_all_phrases(self):
66 |         """Returns phrases for the ordered part of the model (multi-term mentions and all query bigrams)."""
67 |         all_phrases = set()
68 |         for s_t in self.mentions:
69 |             if len(s_t.split(" ")) > 1:
70 |                 all_phrases.add(s_t)
71 |         analyzed_query = Lucene.preprocess(self.query)
72 |         query_terms = analyzed_query.split(" ")
73 |         for i in range(0, len(query_terms)-1):
74 |             bigram = " ".join([query_terms[i], query_terms[i+1]])
75 |             all_phrases.add(bigram)
76 |         return all_phrases
77 | 
78 |     def update(self, key, value):
79 |         """Updates the annotation."""
80 |         self.annotations[key] = value
81 | 
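
To make the annotation layout that QueryAnnot expects more concrete, here is a small standalone sketch of the thresholding done in E and the bigram generation done in get_all_phrases. The sample annotation, its URIs and scores are invented for illustration, and plain lowercasing with whitespace splitting stands in for Lucene.preprocess, which is an assumption.

```python
def linked_entities(annotations, score_th):
    """Collect {uri: score} for annotations whose score reaches the threshold."""
    entities = {}
    for interpretation in annotations["interpretations"].values():
        for annot in interpretation["annots"].values():
            if float(annot["score"]) >= score_th:
                entities[annot["uri"]] = annot["score"]
    return entities


def query_bigrams(query):
    """Return the set of adjacent term pairs of the query.

    Lowercasing and whitespace splitting are a stand-in for
    Lucene.preprocess (assumption for this sketch).
    """
    terms = query.lower().split()
    return set(" ".join(terms[i:i + 2]) for i in range(len(terms) - 1))


# Hypothetical annotation entry, shaped after the keys QueryAnnot reads
annotations = {
    "query": "michael douglas movies",
    "interpretations": {
        "0": {"annots": {
            "michael douglas": {"uri": "<dbpedia:Michael_Douglas>", "score": "0.6"},
            "movies": {"uri": "<dbpedia:Film>", "score": "0.05"},
        }}
    },
}

print(linked_entities(annotations, 0.1))
print(sorted(query_bigrams(annotations["query"])))
```

With a threshold of 0.1, only the first annotation survives, while both query bigrams are kept regardless of the entity linking scores.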


--------------------------------------------------------------------------------
/nordlys/elr/retrieval_elr.py:
--------------------------------------------------------------------------------
  1 | """
  2 | Class for entity retrieval
  3 | 
  4 | @author: Faegheh Hasibi (faegheh.hasibi@idi.ntnu.no)
  5 | """
  6 | 
  7 | import argparse
  8 | import json
  9 | import os
 10 | 
 11 | from nordlys.config import QUERIES, TERM_INDEX_DIR, URI_INDEX_DIR, OUTPUT_DIR, ANNOTATIONS
 12 | from nordlys.elr.query_annot import QueryAnnot
 13 | from nordlys.elr.scorer_elr import ScorerMRF
 14 | from nordlys.retrieval.lucene_tools import Lucene
 15 | from nordlys.retrieval.results import RetrievalResults
 16 | from nordlys.retrieval.retrieval import Retrieval
 17 | 
 18 | 
 19 | class RetrievalELR(Retrieval):
 20 |     def __init__(self, model, query_file, annot_file, el_th=None, lambd=None, n_fields=None):
 21 |         query_file = query_file
 22 |         config = {'model': model,
 23 |                   'index_dir': TERM_INDEX_DIR,
 24 |                   'query_file': query_file,
 25 |                   'lambda': lambd,
 26 |                   'th': el_th,
 27 |                   'n_fields': n_fields,
 28 |                   'first_pass_num_docs': 1000,
 29 |                   'num_docs': 100,
 30 |                   'fields': None}
 31 | 
 32 |         lambd_str = "_lambda" + "_".join([str(l) for l in lambd]) if lambd is not None else ""
 33 |         th_str = "_th" + str(el_th) if el_th is not None else ""
 34 |         fields_str = str(n_fields) if n_fields is not None else ""
 35 |         run_id = model + fields_str + th_str + lambd_str
 36 |         config['run_id'] = run_id
 37 |         config['output_file'] = OUTPUT_DIR + "/" + run_id + ".treceval"
 38 |         super(RetrievalELR, self).__init__(config)
 39 | 
 40 |         self.annot_file = annot_file
 41 | 
 42 |     def _load_query_annotations(self):
 43 |         """Loads field annotation file."""
 44 |         self.query_annotations = json.load(open(self.annot_file))
 45 | 
 46 |     def _open_index(self):
 47 |         self.lucene_term = Lucene(TERM_INDEX_DIR)
 48 |         self.lucene_uri = Lucene(URI_INDEX_DIR)
 49 |         self.lucene_term.open_searcher()
 50 |         self.lucene_uri.open_searcher()
 51 | 
 52 |     def _close_index(self):
 53 |         self.lucene_term.close_reader()
 54 |         self.lucene_uri.close_reader()
 55 | 
 56 |     def _second_pass_scoring(self, res1, scorer):
 57 |         """
 58 |         Returns second-pass scoring of documents.
 59 | 
 60 |         :param res1: first pass results
 61 |         :param scorer: scorer object
 62 |         :return: RetrievalResults object
 63 |         """
 64 |         print "\tSecond pass scoring... "
 65 |         results = RetrievalResults()
 66 |         for doc_id, orig_score in res1.get_scores_sorted():
 67 |             score = scorer.score_doc(doc_id)
 68 |             results.append(doc_id, score)
 69 |         print "done"
 70 |         return results
 71 | 
 72 |     def retrieve(self, store_json=True):
 73 |         """Scores queries and outputs results."""
 74 |         self._open_index()
 75 |         self._load_queries()
 76 |         self._load_query_annotations()
 77 | 
 78 |         # init output file
 79 |         if os.path.exists(self.config['output_file']):
 80 |             os.remove(self.config['output_file'])
 81 |         out = open(self.config['output_file'], "w")
 82 |         print "Number of queries:", len(self.queries)
 83 | 
 84 |         for qid in sorted(self.queries):
 85 |             query = Lucene.preprocess(self.queries[qid])
 86 |             print "scoring [" + qid + "] " + query
 87 |             query_annot = QueryAnnot(self.query_annotations[qid], self.config['th'], qid=qid)
 88 | 
 89 |             # score documents
 90 |             res1 = self._first_pass_scoring(self.lucene_term, query)
 91 |             scorer = ScorerMRF.get_scorer(self.lucene_term, self.lucene_uri, self.config, query_annot)
 92 |             results = self._second_pass_scoring(res1, scorer)
 93 | 
 94 |             # write results to output file
 95 |             results.write_trec_format(qid, self.config['run_id'], out, self.config['num_docs'])
 96 | 
 97 | 
 98 |         out.close()
 99 |         self._close_index()
100 | 
101 |         print "Output results: " + self.config['output_file']
102 | 
103 | 
104 | def arg_parser():
105 |     valid_models = ["lm", "mlm-tc", "mlm-all", "prms", "sdm", "fsdm",
106 |                     "lm_elr", "mlm-tc_elr", "mlm-all_elr", "prms_elr", "sdm_elr", "fsdm_elr"]
107 |     parser = argparse.ArgumentParser()
108 |     parser.add_argument("model", help="Model name", type=str, choices=valid_models)
109 |     parser.add_argument("-q", "--queries", help="Query file", type=str, default=QUERIES)
110 |     parser.add_argument("-a", "--annot", help="Annotation file (with field mappings)", type=str, default=ANNOTATIONS)
111 |     parser.add_argument("-t", "--threshold", help="Entity linking threshold", type=float, default=0.1)
112 |     parser.add_argument("-n", "--nfields", help="Number of fields", type=int, default=10)
113 |     parser.add_argument("-l", "--lambd", help="Lambdas, comma-separated values", type=str)
114 |     args = parser.parse_args()
115 |     return args
116 | 
117 | 
118 | def main(args):
119 |     lambda_params = None
120 |     if args.lambd is not None:
121 |         lambdas = args.lambd.split(",")
122 |         lambda_params = [float(l.strip()) for l in lambdas]
123 | 
124 |     RetrievalELR(args.model, args.queries, args.annot, el_th=args.threshold, lambd=lambda_params,
125 |                  n_fields=args.nfields).retrieve()
126 | 
127 | if __name__ == '__main__':
128 |     main(arg_parser())
129 | 
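
The run id, and with it the output file name under OUTPUT_DIR, is assembled from the command-line parameters in RetrievalELR.__init__. The sketch below repeats that string logic in isolation, together with the lambda parsing from main. The helper names build_run_id and parse_lambdas are hypothetical and exist only in this example.

```python
def parse_lambdas(text):
    """Turn a comma-separated string such as "0.8,0.05,0.05,0.1" into floats."""
    return [float(part.strip()) for part in text.split(",")]


def build_run_id(model, el_th=None, lambd=None, n_fields=None):
    """Compose the run id as model + n_fields + threshold + lambdas."""
    lambd_str = "_lambda" + "_".join(str(l) for l in lambd) if lambd is not None else ""
    th_str = "_th" + str(el_th) if el_th is not None else ""
    fields_str = str(n_fields) if n_fields is not None else ""
    return model + fields_str + th_str + lambd_str


lambdas = parse_lambdas("0.8, 0.05, 0.05, 0.1")
print(build_run_id("sdm_elr", el_th=0.1, lambd=lambdas, n_fields=10))
# prints: sdm_elr10_th0.1_lambda0.8_0.05_0.05_0.1
```

Because the parser defaults are a threshold of 0.1 and 10 fields, a run started with only a model name still gets both values in its run id, for example lm10_th0.1.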


--------------------------------------------------------------------------------
/nordlys/elr/scorer_elr.py:
--------------------------------------------------------------------------------
  1 | """
  2 | ELR extension of MRF based models: LM, MLM, PRMS, SDM, and FSDM
  3 | 
  4 | @author: Faegheh Hasibi (faegheh.hasibi@idi.ntnu.no)
  5 | """
  6 | 
  7 | from __future__ import division
  8 | 
  9 | import math
 10 | 
 11 | from nordlys.elr.field_mapping import FieldMapping
 12 | from nordlys.elr.top_fields import TopFields
 13 | from nordlys.retrieval.lucene_tools import Lucene
 14 | from nordlys.retrieval.scorer import ScorerLM
 15 | 
 16 | 
 17 | class ScorerMRF(object):
 18 |     DEBUG = 0
 19 | 
 20 |     TERM = "terms"
 21 |     ORDERED = "ordered"
 22 |     UNORDERED = "unordered"
 23 |     URI = "uris"
 24 |     SLOP = 6  # Window = 8
 25 | 
 26 |     def __init__(self, lucene_term, lucene_uri, params, query_annot):
 27 |         self.lucene_term = lucene_term
 28 |         self.lucene_uri = lucene_uri
 29 |         self.params = params
 30 |         self.query_annot = query_annot
 31 |         self.phrase_freq = {}
 32 | 
 33 |         self.scorer_lm_term = ScorerLM(self.lucene_term, None, {'smoothing_method': "dirichlet"})
 34 |         self.scorer_lm_uri = ScorerLM(self.lucene_uri, None, {})
 35 |         self.instance_list = []
 36 |         self.__n_fields = None
 37 |         self.__bigrams = None
 38 |         self.__mlm_all_mapping = None
 39 | 
 40 |     @property
 41 |     def n_fields(self):
 42 |         """Returns number of fields for fielded models."""
 43 |         if self.__n_fields is None:
 44 |             model = self.params['model']
 45 |             if ("prms" in model) or ("fsdm" in model) or ("mlm-all" in model):
 46 |                 self.__n_fields = 10 if self.params['n_fields'] is None else self.params['n_fields']
 47 |         return self.__n_fields
 48 | 
 49 |     @property
 50 |     def bigrams(self):
 51 |         """Returns all query bigrams."""
 52 |         if self.__bigrams is None:
 53 |             self.__bigrams = []
 54 |             for i in range(0, len(self.query_annot.T)-1):
 55 |                 bigram = " ".join([self.query_annot.T[i], self.query_annot.T[i+1]])
 56 |                 self.__bigrams.append(bigram)
 57 |         return self.__bigrams
 58 | 
 59 |     @property
 60 |     def mlm_all_mapping(self):
 61 |         if self.__mlm_all_mapping is None:
 62 |             self.__mlm_all_mapping = {}
 63 |             fields = TopFields(self.lucene_term).get_top_index(self.n_fields)
 64 |             weight = 1.0 / len(fields)
 65 |             for field in fields:
 66 |                 self.__mlm_all_mapping[field] = weight
 67 |         return self.__mlm_all_mapping
 68 | 
 69 |     @staticmethod
 70 |     def get_scorer(lucene_term, lucene_uri, params, query_annot):
 71 |         """
 72 |         Returns Scorer object (Scorer factory).
 73 | 
 74 |         :param lucene_term: Lucene object for terms
 75 |         :param lucene_uri: Lucene object for uris
 76 |         :param params: dict with models parameters
 77 |         :param query_annot: query annotation with the mapping probabilities
 78 |         """
 79 |         model = params['model']
 80 |         lambd = params['lambda']
 81 |         print "\t" + model + " scoring ..."
 82 |         if (model == "lm") or (model == "prms") or (model == "mlm-all") or (model == "mlm-tc"):
 83 |             params['lambda'] = [1.0, 0.0, 0.0] if lambd is None else lambd
 84 |             return ScorerFSDM(lucene_term, lucene_uri, params, query_annot)
 85 |         elif (model == "sdm") or (model == "fsdm"):
 86 |             params['lambda'] = [0.8, 0.1, 0.1] if lambd is None else lambd
 87 |             return ScorerFSDM(lucene_term, lucene_uri, params, query_annot)
 88 |         elif (model == "lm_elr") or (model == "prms_elr") or (model == "mlm-tc_elr") or (model == "mlm-all_elr"):
 89 |             params['lambda'] = [0.9, 0.0, 0.0, 0.1] if lambd is None else lambd
 90 |             return ScorerELR(lucene_term, lucene_uri, params, query_annot)
 91 |         elif (model == "sdm_elr") or (model == "fsdm_elr"):
 92 |             params['lambda'] = [0.8, 0.05, 0.05, 0.1] if lambd is None else lambd
 93 |             return ScorerELR(lucene_term, lucene_uri, params, query_annot)
 94 |         else:
 95 |             raise Exception("Unknown model '" + model + "'")
 96 | 
 97 |     def get_field_weights(self, clique_type, c):
 98 |         """
 99 |         Returns field mappings
100 | 
101 |         :param clique_type: [TERM | ORDERED | UNORDERED | URI]
102 |         :param c: str (term, phrase, or uri)
103 |         :return: {field: prob}
104 |         """
105 |         model = self.params['model']
106 |         if (model == "lm") or (model == "lm_elr") or (model == "sdm") or (model == "sdm_elr"):
107 |             return {Lucene.FIELDNAME_CONTENTS: 1}
108 |         elif (model == "prms") or (model == "prms_elr") or (model == "fsdm") or (model == "fsdm_elr"):
109 |             return self.get_prms_mapping(clique_type)[c]
110 |         elif (model == "mlm-tc") or (model == "mlm-tc_elr"):
111 |             if clique_type == self.URI:
112 |                 return self.get_prms_mapping(clique_type)[c]
113 |             else:
114 |                 return {'names': 0.2, 'contents': 0.8}
115 |         elif (model == "mlm-all") or (model == "mlm-all_elr"):
116 |             if clique_type == self.URI:
117 |                 return self.get_prms_mapping(clique_type)[c]
118 |             else:
119 |                 return self.mlm_all_mapping
120 | 
121 |     def get_prms_mapping(self, clique_type):
122 |         """
123 |         Gets PRMS mapping probability for a clique type
124 | 
125 |         :param clique_type: [TERM | ORDERED | UNORDERED | URI]
126 |         :return Dictionary {phrase: {field: weight, ..}, ..}
127 |         """
128 |         if clique_type not in self.query_annot.field_mappings:
129 |             mapper = FieldMapping(self.lucene_term, self.lucene_uri, self.n_fields)
130 |             if clique_type == self.TERM:
131 |                 self.query_annot.field_mappings = {clique_type: mapper.get_mapping_terms(set(self.query_annot.T))}
132 |             elif clique_type == self.ORDERED:
133 |                 self.query_annot.field_mappings = {clique_type: mapper.get_mapping_phrases(set(self.bigrams), 0, True)}
134 |             elif clique_type == self.UNORDERED:
135 |                 self.query_annot.field_mappings = {clique_type: mapper.get_mapping_phrases(set(self.bigrams),
136 |                                                                                            self.SLOP, False)}
137 |             elif clique_type == self.URI:
138 |                 self.query_annot.field_mappings = {clique_type: mapper.get_mapping_uris(set(self.query_annot.E))}
139 |         return self.query_annot.field_mappings[clique_type]
140 | 
141 |     def set_phrase_freq(self, clique_type, c, fields):
142 |         """Sets document and collection frequency for phrase."""
143 |         if clique_type not in self.phrase_freq:
144 |             self.phrase_freq[clique_type] = {}
145 |         if c not in self.phrase_freq.get(clique_type, {}):
146 |             self.phrase_freq[clique_type][c] = {}
147 |             for f in fields:
148 |                 if clique_type == self.ORDERED:
149 |                     doc_freq = self.lucene_term.get_doc_phrase_freq(c, f, 0, True)
150 |                 elif clique_type == self.UNORDERED:
151 |                     doc_freq = self.lucene_term.get_doc_phrase_freq(c, f, self.SLOP, False)
152 | 
153 |                 self.phrase_freq[clique_type][c][f] = doc_freq
154 |                 self.phrase_freq[clique_type][c][f]['coll_freq'] = sum(doc_freq.values())
155 | 
156 |     @staticmethod
157 |     def normalize_el_scores(scores):
158 |         """Normalizes entity linking scores so that they sum to 1."""
159 |         normalized_scores = {}
160 |         sum_score = sum(scores.values())
161 |         for item, score in scores.iteritems():
162 |             normalized_scores[item] = score / sum_score
163 |         return normalized_scores
164 | 
165 |     def get_p_t_d(self, t, field_weights, doc_id):
166 |         """
167 |         p(t|d) = sum_{f in F} p(t|d_f) p(f|t)
168 | 
169 |         :param t: term
170 |         :param field_weights: Dictionary {f: p_f_t, ...}
171 |         :param doc_id: entity id
172 |         :return  p(t|d)
173 |         """
174 |         lucene_doc_id_t = self.lucene_term.get_lucene_document_id(doc_id)
175 |         p_t_d = 0
176 |         for f, p_f_t in field_weights.iteritems():
177 |             if self.DEBUG:
178 |                 print "\tt:", t, "f:", f
179 |             p_t_d_f = self.scorer_lm_term.get_term_prob(lucene_doc_id_t, f, t)
180 |             p_t_d += p_t_d_f * p_f_t
181 |             if self.DEBUG:
182 |                 print "\t\tp(t|d_f):", p_t_d_f, "p(f|t):", p_f_t, "p(t|d_f).p(f|t):", p_t_d_f * p_f_t
183 |         if self.DEBUG:
184 |             print "\tp(t|d):", p_t_d
185 |         return p_t_d
186 | 
187 |     def get_p_o_d(self, o, field_weights, doc_id):
188 |         """
189 |         p(o|d) = sum_{f in F} p(o|d_f) p(f|o) for ordered search
190 | 
191 |         :param o: phrase (ordered search)
192 |         :param field_weights: Dictionary {f: p_f_o, ...}
193 |         :param doc_id: entity id
194 |         :return  p(o|d)
195 |         """
196 |         lucene_doc_id_t = self.lucene_term.get_lucene_document_id(doc_id)
197 |         self.set_phrase_freq(self.ORDERED, o, field_weights)
198 |         p_o_d = 0
199 |         for f, p_f_o in field_weights.iteritems():
200 |             if self.DEBUG:
201 |                 print "\to:", o, "f:", f
202 |             tf_t_d_f = self.phrase_freq[self.ORDERED][o].get(f, {}).get(doc_id, 0)
203 |             tf_t_C_f = self.phrase_freq[self.ORDERED][o].get(f, {}).get('coll_freq', 0)
204 |             p_o_d_f = self.scorer_lm_term.get_term_prob(lucene_doc_id_t, f, o, tf_t_d_f=tf_t_d_f, tf_t_C_f=tf_t_C_f)
205 |             p_o_d += p_o_d_f * p_f_o
206 |             if self.DEBUG:
207 |                 print "\t\tp(o|d_f):", p_o_d_f, "p(f|o):", p_f_o, "p(o|d_f).p(f|o):", p_o_d_f * p_f_o
208 |         if self.DEBUG:
209 |             print "\tp(o|d):", p_o_d
210 |         return p_o_d
211 | 
212 |     def get_p_u_d(self, u, field_weights, doc_id):
213 |         """
214 |         p(u|d) = sum_{f in F} p(u|d_f) p(f|u) for unordered search
215 | 
216 |         :param u: phrase (unordered search)
217 |         :param field_weights: Dictionary {f: p_f_u, ...}
218 |         :param doc_id: entity id
219 |         :return  p(u|d)
220 |         """
221 |         lucene_doc_id_t = self.lucene_term.get_lucene_document_id(doc_id)
222 |         self.set_phrase_freq(self.UNORDERED, u, field_weights)
223 |         p_u_d = 0
224 |         for f, p_f_u in field_weights.iteritems():
225 |             if self.DEBUG:
226 |                 print "\tu:", u, "f:", f
227 |             tf_t_d_f = self.phrase_freq[self.UNORDERED][u].get(f, {}).get(doc_id, 0)
228 |             tf_t_C_f = self.phrase_freq[self.UNORDERED][u].get(f, {}).get('coll_freq', 0)
229 |             p_u_d_f = self.scorer_lm_term.get_term_prob(lucene_doc_id_t, f, u, tf_t_d_f=tf_t_d_f, tf_t_C_f=tf_t_C_f)
230 |             p_u_d += p_u_d_f * p_f_u
231 |             if self.DEBUG:
232 |                 print "\t\tp(u|d_f):", p_u_d_f, "p(f|u):", p_f_u, "p(u|d_f).p(f|u):", p_u_d_f * p_f_u
233 |         if self.DEBUG:
234 |             print "\tp(u|d):", p_u_d
235 |         return p_u_d
236 | 
237 |     def get_p_e_d(self, e, field_weights, doc_id):
238 |         """
239 |         p(e|d) = sum_{f in F} p(e|d_f) p(f|e)
240 | 
241 |         :param e: entity URI
242 |         :param field_weights: Dictionary {f: p_f_e, ...}
243 |         :param doc_id: entity id
244 |         :return p(e|d)
245 |         """
246 |         if self.DEBUG:
247 |             print "\te:", e
248 |         p_e_d = 0
249 |         for f, p_f_e in field_weights.iteritems():
250 |             p_e_d_f = self.__get_uri_prob(doc_id, f, e)
251 |             p_e_d += p_e_d_f * p_f_e
252 |             if self.DEBUG:
253 |                 print "\t\tp(e|d_f):", p_e_d_f, "p(f|e):", p_f_e, "p(e|d_f).p(f|e):", p_e_d_f * p_f_e
254 |         if self.DEBUG:
255 |             print "\tp(e|d):", p_e_d
256 |         return p_e_d
257 | 
258 |     def __get_uri_prob(self, doc_id, field, e, lambd=0.1):
259 |         """
260 |         P(e|d_f) = (1 - lambda) tf(e, d_f) + lambda df(f, e) / df(f), with tf(e, d_f) in {0, 1}
261 | 
262 |         :param doc_id: document id
263 |         :param field: field name
264 |         :param e: entity uri
265 |         :param lambd: smoothing parameter
266 |         :return: P(e|d_f)
267 |         """
268 |         if self.DEBUG:
269 |             print "\t\tf:", field
270 |         lucene_doc_id_u = self.lucene_uri.get_lucene_document_id(doc_id)
271 |         tf = self.scorer_lm_uri.get_tf(lucene_doc_id_u, field)
272 |         tf_e_d_f = 1 if tf.get(e, 0) > 0 else 0
273 |         df_f_e = self.lucene_uri.get_doc_freq(e, field)
274 |         df_f = self.lucene_uri.get_doc_count(field)
275 |         p_e_d_f = ((1 - lambd) * tf_e_d_f) + (lambd * df_f_e / df_f)
276 |         if self.DEBUG:
277 |             print "\t\t\ttf(e,d_f):", tf_e_d_f, "df(f, e):", df_f_e, "df(f):", df_f, "P(e|d_f):", p_e_d_f
278 |         return p_e_d_f
279 | 
280 | 
281 | class ScorerFSDM(ScorerMRF):
282 |     DEBUG_FSDM = 0
283 | 
284 |     def __init__(self, lucene_term, lucene_uri, params, query_annot):
285 |         ScorerMRF.__init__(self, lucene_term, lucene_uri, params, query_annot)
286 |         self.lambda_T = self.params['lambda'][0]
287 |         self.lambda_O = self.params['lambda'][1]
288 |         self.lambda_U = self.params['lambda'][2]
289 |         self.T = self.query_annot.T
290 | 
291 |     def score_doc(self, doc_id):
292 |         """
293 |         score(q, d) = lambda_T sum_{t in T} log P(t|d) + lambda_O sum_{o in O} log P(o|d) + lambda_U sum_{u in U} log P(u|d)
294 |         P(t|d) = sum_{f in F} p(t|d_f) p(f|t)
295 |         P(o|d) = sum_{f in F} p(o|d_f) p(f|o)
296 |         P(u|d) = sum_{f in F} p(u|d_f) p(f|u)
297 | 
298 |         :param doc_id: document id
299 |         :return: document score in log space, or None if the document is not in the index
300 |         """
301 |         if self.DEBUG_FSDM:
302 |             print "Scoring doc ID=" + doc_id
303 | 
304 |         if self.lucene_term.get_lucene_document_id(doc_id) is None:
305 |             return None
306 | 
307 |         p_T_d = 0
308 |         if self.lambda_T != 0:
309 |             for t in self.T:
310 |                 p_t_d = self.get_p_t_d(t, self.get_field_weights(self.TERM, t), doc_id)
311 |                 if p_t_d != 0:
312 |                     p_T_d += math.log(p_t_d)
313 | 
314 |         p_O_d = 0
315 |         if self.lambda_O != 0:
316 |             for b in self.bigrams:
317 |                 p_o_d = self.get_p_o_d(b, self.get_field_weights(self.ORDERED, b), doc_id)
318 |                 if p_o_d != 0:
319 |                     p_O_d += math.log(p_o_d)
320 | 
321 |         p_U_d = 0
322 |         if self.lambda_U != 0:
323 |             for b in self.bigrams:
324 |                 p_u_d = self.get_p_u_d(b, self.get_field_weights(self.UNORDERED, b), doc_id)
325 |                 if p_u_d != 0:
326 |                     p_U_d += math.log(p_u_d)
327 | 
328 |         p_q_d = (self.lambda_T * p_T_d) + (self.lambda_O * p_O_d) + (self.lambda_U * p_U_d)
329 |         if self.DEBUG_FSDM:
330 |             print "\t\tP(T|d) = ", p_T_d, "P(O|d):", p_O_d, "p(U|d):", p_U_d,  "P(q|d):", p_q_d
331 | 
332 |         return p_q_d
333 | 
334 | 
335 | class ScorerELR(ScorerFSDM):
336 |     DEBUG_ELR = 0
337 | 
338 |     def __init__(self, lucene_term, lucene_uri, params, query_annot):
339 |         ScorerFSDM.__init__(self, lucene_term, lucene_uri, params, query_annot)
340 |         self.lambda_E = self.params['lambda'][3]
341 |         self.E = ScorerMRF.normalize_el_scores(self.query_annot.E)
342 | 
343 |     def score_doc(self, doc_id):
344 |         """
345 |         score(q, d) = lambda_T/|T| sum_{t} log P(t|d) + lambda_O/|O| sum_{o} log P(o|d) + lambda_U/|U| sum_{u} log P(u|d) + lambda_E sum_{e} s(e) log P(e|d)
346 |         P(t|d) = sum_{f in F} p(t|d_f) p(f|t)
347 |         P(o|d) = sum_{f in F} p(o|d_f) p(f|o)
348 |         P(u|d) = sum_{f in F} p(u|d_f) p(f|u)
349 |         P(e|d) = sum_{f in F} p(e|d_f) p(f|e), with s(e) the normalized entity linking score of e
350 | 
351 |         :param doc_id: document id
352 |         :return: document score in log space, or None if the document is not in the index
353 |         """
354 |         if self.DEBUG_ELR:
355 |             print "Scoring doc ID=" + doc_id
356 | 
357 |         if self.lucene_term.get_lucene_document_id(doc_id) is None:
358 |             # print doc_id,  self.lucene_term.get_lucene_document_id(doc_id)
359 |             return None
360 | 
361 |         p_T_d = 0
362 |         n_T = len(self.T)
363 |         if self.lambda_T != 0:
364 |             for t in self.T:
365 |                 p_t_d = self.get_p_t_d(t, self.get_field_weights(self.TERM, t), doc_id)
366 |                 if p_t_d != 0:
367 |                     p_T_d += math.log(p_t_d) / n_T
368 | 
369 |         p_O_d = 0
370 |         n_O = len(self.bigrams)
371 |         if self.lambda_O != 0:
372 |             for b in self.bigrams:
373 |                 p_o_d = self.get_p_o_d(b, self.get_field_weights(self.ORDERED, b), doc_id)
374 |                 if p_o_d != 0:
375 |                     p_O_d += math.log(p_o_d) / n_O
376 | 
377 |         p_U_d = 0
378 |         n_U = len(self.bigrams)
379 |         if self.lambda_U != 0:
380 |             for b in self.bigrams:
381 |                 p_u_d = self.get_p_u_d(b, self.get_field_weights(self.UNORDERED, b), doc_id)
382 |                 if p_u_d != 0:
383 |                     p_U_d += math.log(p_u_d) / n_U
384 | 
385 |         p_E_d = 0
386 |         if self.lambda_E != 0:
387 |             for e, score in self.E.iteritems():
388 |                 p_e_d = self.get_p_e_d(e, self.get_field_weights(self.URI, e), doc_id)
389 |                 if p_e_d != 0:
390 |                     p_E_d += score * math.log(p_e_d)
391 | 
392 |         p_q_d = (self.lambda_T * p_T_d) + (self.lambda_O * p_O_d) + (self.lambda_U * p_U_d) + (self.lambda_E * p_E_d)
393 |         if self.DEBUG_ELR:
394 |             print "\t\tP(T|d) = ", p_T_d, "P(O|d):", p_O_d, "p(U|d):", p_U_d, "p(E|d):", p_E_d,  "P(q|d):", p_q_d
395 | 
396 |         return p_q_d
397 | 
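
The scoring logic of `__get_uri_prob` and `ScorerELR.score_doc` can be illustrated without a Lucene index. The following is a minimal, standalone sketch, not the repository's implementation: the function names (`uri_prob`, `mixture`, `avg_log`, `elr_score`) and the plain-list inputs are hypothetical, and the zero guard on `df_f` is an addition that the original code does not have.

```python
import math


def uri_prob(tf_e_d_f, df_f_e, df_f, lambd=0.1):
    """Smoothed binary estimate of P(e|d_f), following the formula in __get_uri_prob."""
    present = 1 if tf_e_d_f > 0 else 0
    # Assumption: return 0 background mass when the field has no documents;
    # the original divides unconditionally.
    background = float(df_f_e) / df_f if df_f > 0 else 0.0
    return (1 - lambd) * present + lambd * background


def mixture(probs_by_field, field_weights):
    """p(x|d) = sum_f p(x|d_f) * p(f|x), as in get_p_o_d / get_p_u_d / get_p_e_d."""
    return sum(probs_by_field.get(f, 0.0) * w for f, w in field_weights.items())


def avg_log(probs):
    """Sum of log-probabilities divided by the number of items, skipping zeros."""
    if not probs:
        return 0.0
    return sum(math.log(p) for p in probs if p != 0) / len(probs)


def elr_score(lambdas, term_probs, ordered_probs, unordered_probs, entity_probs):
    """Combine the four components in log space.

    entity_probs is a list of (normalized annotation score, p(e|d)) pairs.
    """
    lambda_t, lambda_o, lambda_u, lambda_e = lambdas
    p_e = sum(score * math.log(p) for score, p in entity_probs if p != 0)
    return (lambda_t * avg_log(term_probs)
            + lambda_o * avg_log(ordered_probs)
            + lambda_u * avg_log(unordered_probs)
            + lambda_e * p_e)
```

Dividing each term, ordered, and unordered component by the number of items keeps the three text-based components on a comparable scale regardless of query length, while the entity component is weighted by the annotation scores instead.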


--------------------------------------------------------------------------------
/nordlys/elr/top_fields.py:
--------------------------------------------------------------------------------
 1 | """
 2 | This class returns top fields based on document frequency
 3 | 
 4 | @author: Faegheh Hasibi (faegheh.hasibi@idi.ntnu.no)
 5 | """
 6 | 
 7 | from nordlys.retrieval.lucene_tools import Lucene
 8 | 
 9 | 
10 | class TopFields(object):
11 |     DEBUG = 0
12 | 
13 |     def __init__(self, lucene):
14 |         self.lucene = lucene
15 |         self.__fields = None
16 | 
17 |     @property
18 |     def fields(self):
19 |         if self.__fields is None:
20 |             self.__fields = set(self.lucene.get_fields())
21 |         return self.__fields
22 | 
23 |     def get_top_index(self, n):
24 |         """Returns the top-n fields with the highest document frequency across the whole index."""
25 |         doc_freq_field = {}
26 |         for field in self.fields:
27 |             if field == Lucene.FIELDNAME_ID:
28 |                 continue
29 |             doc_freq_field[field] = self.lucene.get_doc_count(field)
30 |         return self.__get_top_n(doc_freq_field, n)
31 | 
32 |     def get_top_term(self, term, n):
33 |         """Returns top-n fields with highest document frequency for the given term."""
34 |         doc_freq = {}
35 |         if self.DEBUG:
36 |             print "Term:[" + term + "]"
37 |         for field in self.fields:
38 |             df = self.lucene.get_doc_freq(term, field)
39 |             if df > 0:
40 |                 doc_freq[field] = df
41 |         top_fields = self.__get_top_n(doc_freq, n)
42 |         return top_fields
43 | 
44 |     def __get_top_n(self, fields_freq, n):
45 |         """Sorts fields and returns top-n."""
46 |         sorted_fields = sorted(fields_freq.items(), key=lambda item: (item[1], item[0]), reverse=True)
47 |         top_fields = dict()
48 |         i = 0
49 |         for field, freq in sorted_fields:
50 |             if i >= n:
51 |                 break
52 |             i += 1
53 |             top_fields[field] = freq
54 |             if self.DEBUG:
55 |                 print "(" + field + ", " + str(freq) + ")",
56 |         if self.DEBUG:
57 |             print "\nNumber of fields:", len(top_fields), "\n"
58 |         return top_fields
59 | 
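
The ranking done by `__get_top_n` can be tried in isolation. This is a small sketch under the assumption that field frequencies are already available as a plain dictionary; the function name `top_n_fields` and the field names in the usage note are made up for illustration.

```python
def top_n_fields(fields_freq, n):
    """Return the n fields with the highest frequency as a {field: freq} dict.

    Ties on frequency are broken by field name in descending order, because
    the sort key is the (freq, field) pair and the sort is reversed.
    """
    ranked = sorted(fields_freq.items(), key=lambda item: (item[1], item[0]), reverse=True)
    return dict(ranked[:n])
```

For example, `top_n_fields({"names": 3, "types": 5, "contents": 3}, 2)` keeps `types` and `names`: `types` has the highest frequency, and `names` sorts after `contents` alphabetically, so it wins the tie under the reversed ordering.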


--------------------------------------------------------------------------------
/nordlys/retrieval/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/hasibi/EntityLinkingRetrieval-ELR/b53d7bce81f8050dd5b7a96a8e8b99f0ed258ba6/nordlys/retrieval/__init__.py


--------------------------------------------------------------------------------
/nordlys/retrieval/indexer.py:
--------------------------------------------------------------------------------
  1 | """
  2 | Creates a Lucene index for DBpedia from MongoDB.
  3 | 
  4 | - URI values are resolved using a simple heuristic
  5 | - fields are indexed as multi-valued
  6 | - catch-all fields are not indexed with positions, other fields are
  7 | 
  8 | --------------------------------------------------------------------------------------------------
  9 | NOTE: This code cannot be run as-is because it depends on the DBpedia Mongo collection.
 10 |       However, it is the main code used for generating the indices and can serve as a reference.
 11 |       To get the original indices, please contact the first author.
 12 | --------------------------------------------------------------------------------------------------
 13 | 
 14 | @author: Faegheh Hasibi (faegheh.hasibi@idi.ntnu.no)
 15 | @author: Krisztian Balog (krisztian.balog@uis.no)
 16 | """
 17 | 
 18 | import sys
 19 | from urllib import unquote
 20 | from pprint import pprint
 21 | 
 22 | from nordlys import config
 23 | from nordlys.entity.config import COLLECTION_DBPEDIA
 24 | from nordlys.entity.dbpedia.fields import Fields
 25 | from nordlys.storage.mongo import Mongo
 26 | from nordlys.retrieval.lucene_tools import Lucene
 27 | 
 28 | 
 29 | class MongoDBToLucene(object):
 30 |     def __init__(self, host=config.MONGO_HOST, db=config.MONGO_DB, collection=COLLECTION_DBPEDIA):
 31 |         self.mongo = Mongo(host, db, collection)
 32 |         self.contents = None
 33 | 
 34 |     def __resolve_uri(self, uri):
 35 |         """Resolves the URI using a simple heuristic."""
 36 |         uri = unquote(uri)  # decode percent encoding
 37 |         if uri.startswith("<") and uri.endswith(">"):
 38 |             # Part between last ':' and '>', and _ replaced with space.
 39 |             # Works fine for <dbpedia:XXX> and <dbpedia:Category:YYY>
 40 |             return uri[uri.rfind(":") + 1:-1].replace("_", " ")
 41 |         else:
 42 |             return uri
 43 | 
 44 |     def __is_uri(self, value):
 45 |         """Returns True if the value is a DBpedia URI."""
 46 |         if value.startswith("<dbpedia:") and value.endswith(">"):
 47 |             return True
 48 |         return False
 49 | 
 50 |     def __get_field_value(self, value, only_uris=False):
 51 |         """
 52 |         Converts mongoDB field value to indexable values by resolving URIs.
 53 |         It may be a string or a list and the return value is of the same data type.
 54 |         """
 55 |         if type(value) is list:
 56 |             nval = []  # holds resolved values
 57 |             for v in value:
 58 |                 if not only_uris:
 59 |                     nval.append(Lucene.preprocess(self.__resolve_uri(v)))
 60 |                 elif only_uris and self.__is_uri(v):
 61 |                     nval.append(v)
 62 |             return nval
 63 |         else:
 64 |             if not only_uris:
 65 |                 return Lucene.preprocess(self.__resolve_uri(value))
 66 |             elif only_uris and self.__is_uri(value):
 67 |                 return value
 68 |             # return self.__resolve_uri(value) if only_uris else value
 69 |         return None
 70 | 
 71 |     def __add_to_contents(self, field_name, field_value, field_type):
 72 |         """
 73 |         Adds field to document contents.
 74 |         Field value can be a list, where each item is added separately (i.e., the field is multi-valued).
 75 |         """
 76 |         if type(field_value) is list:
 77 |             for fv in field_value:
 78 |                 self.__add_to_contents(field_name, fv, field_type)
 79 |         else:
 80 |             if len(field_value) > 0:  # ignore empty fields
 81 |                 self.contents.append({'field_name': field_name,
 82 |                                       'field_value': field_value,
 83 |                                       'field_type': field_type})
 84 | 
 85 |     def build_index(self, index_config, only_uris=False, max_shingle_size=None):
 86 |         """Builds index.
 87 | 
 88 |         :param index_config: index configuration
 89 |         """
 90 |         lucene = Lucene(index_config['index_dir'], max_shingle_size)
 91 |         lucene.open_writer()  # uses a shingle analyzer if max_shingle_size is not None
 92 | 
 93 |         fieldtype_tv = Lucene.FIELDTYPE_ID_TV if only_uris else Lucene.FIELDTYPE_TEXT_TV
 94 |         fieldtype_tvp = Lucene.FIELDTYPE_ID_TV if only_uris else Lucene.FIELDTYPE_TEXT_TVP
 95 |         fieldtype_id = Lucene.FIELDTYPE_ID_TV if only_uris else Lucene.FIELDTYPE_ID
 96 |         fieldtype_ntv = Lucene.FIELDTYPE_ID_TV if only_uris else Lucene.FIELDTYPE_TEXT_NTV
 97 | 
 98 |         # iterate through MongoDB contents
 99 |         i = 0
100 |         for mdoc in self.mongo.find_all():
101 | 
102 |             # this is just to speed up things a bit
103 |             # we can skip the document right away if the ID does not start
104 |             # with "<dbpedia:"
105 |             if not mdoc[Mongo.ID_FIELD].startswith("<dbpedia:"):
106 |                 continue
107 | 
108 |             # get back document from mongo with keys and _id field unescaped
109 |             doc = self.mongo.get_doc(mdoc)
110 | 
111 |             # check must_have fields
112 |             skip_doc = False
113 |             for f, v in index_config['fields'].iteritems():
114 |                 if ("must_have" in v) and (v['must_have']) and (f not in doc):
115 |                     skip_doc = True
116 |                     break
117 | 
118 |             if skip_doc:
119 |                 continue
120 | 
121 |             # doc contents is represented as a list of fields
122 |             # (mind that fields are multi-valued)
123 |             self.contents = []
124 | 
125 |             # each predicate to a separate field
126 |             for f in doc:
127 |                 if f == Mongo.ID_FIELD:  # id is special
128 |                     self.__add_to_contents(Lucene.FIELDNAME_ID, doc[f], fieldtype_id)
129 |                 if f in index_config['ignore']:
130 |                     pass
131 |                 else:
132 |                     # get resolved field value(s) -- note that it might be a list
133 |                     field_value = self.__get_field_value(doc[f], only_uris)
134 |                     # ignore empty fields
135 |                     if (field_value is None) or (field_value == []):
136 |                         continue
137 | 
138 |                     to_catchall_content = True if index_config['catchall_all'] else False
139 | 
140 |                     if f in index_config['fields']:
141 |                         self.__add_to_contents(f, field_value, fieldtype_tvp)
142 | 
143 |                         # fields in index_config['fields'] are always added to catch-all content
144 |                         to_catchall_content = True
145 | 
146 |                         # copy field value to other field(s)
147 |                         # (copying is without term positions)
148 |                         if "copy_to" in index_config['fields'][f]:
149 |                             for f2 in index_config['fields'][f]['copy_to']:
150 |                                 self.__add_to_contents(f2, field_value, fieldtype_tv)
151 | 
152 |                     # copy field value to catch-all content field
153 |                     # (copying is without term positions)
154 |                     if to_catchall_content:
155 |                         self.__add_to_contents(Lucene.FIELDNAME_CONTENTS, field_value, fieldtype_tv)
156 | 
157 |             # add document to index
158 |             lucene.add_document(self.contents)
159 | 
160 |             i += 1
161 |             if i % 1000 == 0:
162 |                 print str(i / 1000) + "K documents indexed"
163 |         # close Lucene index
164 |         lucene.close_writer()
165 | 
166 |         print "Finished indexing (" + str(i) + " documents in total)"
167 | 
168 | 
169 | def main(argv):
170 |     fields = {}
171 |     top_fields = Fields().get_all()
172 |     for f in top_fields:
173 |         if f == "<rdfs:label>":
174 |             fields[f] = {'must_have': True, 'copy_to': ["names"]}
175 |         elif (f == "<foaf:name>") or (f == "!<dbo:wikiPageRedirects>"):
176 |             fields[f] = {'copy_to': ["names"]}
177 |         elif (f == "<rdf:type>") or (f == "<dcterms:subject>"):
178 |             fields[f] = {'copy_to': ["types"]}
179 |         elif f == "<rdfs:comment>":
180 |             fields[f] = {'must_have': True}
181 |         else:
182 |             fields[f] = {}
183 | 
184 | 
185 |     # Config of index7
186 |     config_index7 = {'index_dir': "path/to/index",
187 |                      'fields': fields,
188 |                      'catchall_all': True,
189 |                      'ignore': ["<owl:sameAs>"]  # except these
190 |                     }
191 | 
192 |     # Config of index7_only_uri: same as index7, but keeps only URIs
193 |     index_config7_only_uri = {'index_dir': "path/to/uri_only index",
194 |                      'fields': fields,
195 |                      'catchall_all': True,
196 |                      'ignore': ["<owl:sameAs>"]  # except these
197 |                     }
198 | 
199 |     pprint(config_index7)
200 |     m2l = MongoDBToLucene()
201 |     m2l.build_index(config_index7, only_uris=False)
202 |     m2l.build_index(index_config7_only_uri, only_uris=True)
203 |     print "index built: " + config_index7['index_dir']
204 | 
205 | if __name__ == "__main__":
206 |     main(sys.argv[1:])
207 | 
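
The URI handling in `MongoDBToLucene` does not depend on MongoDB or Lucene, so it can be checked on its own. Below is a standalone sketch of the same heuristic; the module-level function names `resolve_uri` and `is_uri` are hypothetical stand-ins for the private methods, and the import fallback is only there so the snippet also runs on Python 3.

```python
try:
    from urllib import unquote  # Python 2, as in the original module
except ImportError:
    from urllib.parse import unquote  # Python 3


def resolve_uri(uri):
    """Turn a prefixed URI such as <dbpedia:Some_Name> into readable text."""
    uri = unquote(uri)  # decode percent encoding
    if uri.startswith("<") and uri.endswith(">"):
        # Keep the part between the last ':' and the closing '>',
        # and replace underscores with spaces.
        return uri[uri.rfind(":") + 1:-1].replace("_", " ")
    return uri


def is_uri(value):
    """Return True if the value is a DBpedia URI in prefixed form."""
    return value.startswith("<dbpedia:") and value.endswith(">")
```

Because the heuristic cuts at the last colon, it suits prefixed forms like `<dbpedia:XXX>` and `<dbpedia:Category:YYY>`; a full `<http://...>` URI would be cut after the scheme's colon instead.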


--------------------------------------------------------------------------------
/nordlys/retrieval/lucene_tools.py:
--------------------------------------------------------------------------------
  1 | """
  2 | Tools for Lucene.
  3 | All Lucene features should be accessed in nordlys through this class. 
  4 | 
  5 | - Lucene class for ensuring that the same version, analyzer, etc. 
  6 |   are used across nordlys modules. Handles IndexReader, IndexWriter, etc.  
  7 | - Command line tools for checking indexed document content
  8 | 
  9 | @author: Faegheh Hasibi (faegheh.hasibi@idi.ntnu.no)
 10 | @author: Krisztian Balog (krisztian.balog@uis.no)
 11 | """
 12 | import argparse
 13 | import lucene
 14 | from nordlys.retrieval.results import RetrievalResults
 15 | from java.io import File
 16 | from java.util import HashMap, TreeSet
 17 | from java.io import StringReader
 18 | from java.lang import StringBuilder
 19 | from org.apache.lucene.analysis.tokenattributes import CharTermAttribute
 20 | from org.apache.lucene.analysis.core import StopFilter
 21 | from org.apache.lucene.analysis.core import StopAnalyzer
 22 | from org.apache.lucene.analysis.standard import StandardTokenizer
 23 | from org.apache.lucene.analysis.standard import StandardAnalyzer
 24 | from org.apache.lucene.analysis.shingle import ShingleAnalyzerWrapper
 25 | from org.apache.lucene.document import Document
 26 | from org.apache.lucene.document import Field
 27 | from org.apache.lucene.document import FieldType
 28 | from org.apache.lucene.index import MultiFields
 29 | from org.apache.lucene.index import IndexWriter
 30 | from org.apache.lucene.index import IndexWriterConfig
 31 | from org.apache.lucene.index import DirectoryReader 
 32 | from org.apache.lucene.index import Term
 33 | from org.apache.lucene.index import TermContext
 34 | from org.apache.lucene.queryparser.classic import QueryParser
 35 | from org.apache.lucene.search import IndexSearcher
 36 | from org.apache.lucene.search import BooleanClause
 37 | from org.apache.lucene.search import TermQuery
 38 | from org.apache.lucene.search import BooleanQuery
 39 | from org.apache.lucene.search import PhraseQuery
 40 | from org.apache.lucene.search.spans import SpanNearQuery
 41 | from org.apache.lucene.search.spans import SpanTermQuery
 42 | from org.apache.lucene.search import FieldValueFilter
 43 | from org.apache.lucene.search.similarities import LMJelinekMercerSimilarity
 44 | from org.apache.lucene.search.similarities import LMDirichletSimilarity
 45 | from org.apache.lucene.store import SimpleFSDirectory
 46 | from org.apache.lucene.util import BytesRefIterator
 47 | from org.apache.lucene.util import Version
 48 | from org.apache.lucene.index import SlowCompositeReaderWrapper
 49 | 
 50 | # has java VM for Lucene been initialized
 51 | lucene_vm_init = False
 52 | 
 53 | 
 54 | class Lucene(object):
 55 | 
 56 |     # default fieldnames for id and contents
 57 |     FIELDNAME_ID = "id"
 58 |     FIELDNAME_CONTENTS = "contents"
 59 | 
 60 |     # internal fieldtypes
 61 |     # used as Enum, the actual values don't matter
 62 |     FIELDTYPE_ID = "id"
 63 |     FIELDTYPE_ID_TV = "id_tv"
 64 |     FIELDTYPE_TEXT = "text"
 65 |     FIELDTYPE_TEXT_TV = "text_tv"
 66 |     FIELDTYPE_TEXT_TVP = "text_tvp"
 67 |     FIELDTYPE_TEXT_NTV = "text_ntv"
 68 |     FIELDTYPE_TEXT_NTVP = "text_ntvp"
 69 | 
 70 |     def __init__(self, index_dir, max_shingle_size=None):
 71 |         global lucene_vm_init
 72 | 
 73 |         if not lucene_vm_init:
 74 |             lucene.initVM(vmargs=['-Djava.awt.headless=true'])
 75 |             lucene_vm_init = True
 76 |         self.dir = SimpleFSDirectory(File(index_dir))
 77 |         self.max_shingle_size = max_shingle_size
 78 |         self.analyzer = None
 79 |         self.reader = None
 80 |         self.searcher = None
 81 |         self.writer = None
 82 |         self.ldf = None
 83 | 
 84 |     @staticmethod
 85 |     def get_version():
 86 |         """Get Lucene version."""
 87 |         return Version.LUCENE_48
 88 | 
 89 |     @staticmethod
 90 |     def preprocess(text):
 91 |         """Lowercases and tokenizes the input text and removes English stopwords."""
 92 |         ts = StandardTokenizer(Lucene.get_version(), StringReader(text.lower()))
 93 |         ts = StopFilter(Lucene.get_version(), ts,  StopAnalyzer.ENGLISH_STOP_WORDS_SET)
 94 |         string_builder = StringBuilder()
 95 |         ts.reset()
 96 |         char_term_attr = ts.addAttribute(CharTermAttribute.class_)
 97 |         while ts.incrementToken():
 98 |             if string_builder.length() > 0:
 99 |                 string_builder.append(" ")
100 |             string_builder.append(char_term_attr.toString())
101 |         return string_builder.toString()
102 | 
103 |     def get_analyzer(self):
104 |         """Get analyzer."""
105 |         if self.analyzer is None:
106 |             std_analyzer = StandardAnalyzer(Lucene.get_version())
107 |             if self.max_shingle_size is None:
108 |                 self.analyzer = std_analyzer
109 |             else:
110 |                 self.analyzer = ShingleAnalyzerWrapper(std_analyzer, self.max_shingle_size)
111 |         return self.analyzer
112 | 
113 |     def open_reader(self):
114 |         """Open IndexReader."""
115 |         if self.reader is None:
116 |             self.reader = DirectoryReader.open(self.dir)
117 | 
118 |     def get_reader(self):
119 |         return self.reader
120 | 
121 |     def close_reader(self):
122 |         """Close IndexReader."""
123 |         if self.reader is not None:
124 |             self.reader.close()
125 |             self.reader = None
126 |         else:
127 |             raise Exception("There is no open IndexReader to close")
128 | 
129 |     def open_searcher(self):
130 |         """
131 |         Open IndexSearcher. Automatically opens an IndexReader too,
132 |         if it is not already open. There is no close method for the
133 |         searcher.
134 |         """
135 |         if self.searcher is None:
136 |             self.open_reader()
137 |             self.searcher = IndexSearcher(self.reader)
138 | 
139 |     def get_searcher(self):
140 |         """Returns index searcher (opens it if needed)."""
141 |         self.open_searcher()
142 |         return self.searcher
143 | 
144 |     def set_lm_similarity_jm(self, method="jm", smoothing_param=0.1):
145 |         """
146 |         Set searcher to use LM similarity.
147 | 
148 |         :param method: LM similarity ("jm" or "dirichlet")
149 |         :param smoothing_param: smoothing parameter (lambda or mu)
150 |         """
151 |         if method == "jm":
152 |             similarity = LMJelinekMercerSimilarity(smoothing_param)
153 |         elif method == "dirichlet":
154 |             similarity = LMDirichletSimilarity(smoothing_param)
155 |         else:
156 |             raise Exception("Unknown method")
157 | 
158 |         if self.searcher is None:
159 |             raise Exception("Searcher has not been created")
160 |         self.searcher.setSimilarity(similarity)
161 | 
162 |     def open_writer(self):
163 |         """Open IndexWriter."""
164 |         if self.writer is None:
165 |             config = IndexWriterConfig(Lucene.get_version(), self.get_analyzer())
166 |             config.setOpenMode(IndexWriterConfig.OpenMode.CREATE)
167 |             self.writer = IndexWriter(self.dir, config)
168 |         else:
169 |             raise Exception("IndexWriter is already open")
170 | 
171 |     def close_writer(self):
172 |         """Close IndexWriter."""
173 |         if self.writer is not None:
174 |             self.writer.close()
175 |             self.writer = None
176 |         else:
177 |             raise Exception("There is no open IndexWriter to close")
178 | 
179 |     def add_document(self, contents):
180 |         """
181 |         Adds a Lucene document with the specified contents to the index.
182 |         See LuceneDocument.create_document() for the explanation of contents.
183 |         """
184 |         if self.ldf is None:  # create a single LuceneDocument object that will be reused
185 |             self.ldf = LuceneDocument()
186 |         self.writer.addDocument(self.ldf.create_document(contents))
187 | 
188 |     def get_lucene_document_id(self, doc_id):
189 |         """Returns the internal Lucene document id for the given (external) document id, or None if not found."""
190 |         self.open_searcher()
191 |         query = TermQuery(Term(self.FIELDNAME_ID, doc_id))
192 |         tophit = self.searcher.search(query, 1).scoreDocs
193 |         if len(tophit) == 1:
194 |             return tophit[0].doc
195 |         else:
196 |             return None
197 | 
198 |     def get_document_id(self, lucene_doc_id):
199 |         """Returns the (external) document id for the given internal Lucene document id."""
200 |         self.open_reader()
201 |         return self.reader.document(lucene_doc_id).get(self.FIELDNAME_ID)
202 | 
203 |     def print_document(self, lucene_doc_id, term_vect=False):
204 |         """Prints document contents."""
205 |         if lucene_doc_id is None:
206 |             print "Document is not found in the index."
207 |         else:
208 |             doc = self.reader.document(lucene_doc_id)
209 |             print "Document ID (field '" + self.FIELDNAME_ID + "'): " + doc.get(self.FIELDNAME_ID)
210 | 
211 |             # first collect (unique) field names
212 |             fields = []
213 |             for f in doc.getFields():
214 |                 if f.name() != self.FIELDNAME_ID and f.name() not in fields:
215 |                     fields.append(f.name())
216 | 
217 |             for fname in fields:
218 |                 print fname
219 |                 for fv in doc.getValues(fname):  # printing (possibly multiple) field values
220 |                     print "\t" + fv
221 |                 # term vector
222 |                 if term_vect:
223 |                     print "-----"
224 |                     termfreqs = self.get_doc_termfreqs(lucene_doc_id, fname)
225 |                     for term in termfreqs:
226 |                         print term + " : " + str(termfreqs[term])
227 |                     print "-----"
228 | 
229 |     def get_lucene_query(self, query, field=FIELDNAME_CONTENTS):
230 |         """Creates Lucene query from keyword query."""
231 |         query = query.replace("(", "").replace(")", "").replace("!", "")
232 |         return QueryParser(Lucene.get_version(), field,
233 |                            self.get_analyzer()).parse(query)
234 | 
235 |     def analyze_query(self, query, field=FIELDNAME_CONTENTS):
236 |         """
237 |         Analyses the query and returns query terms.
238 | 
239 |         :param query: query
240 |         :param field: field name
241 |         :return: list of query terms
242 |         """
243 |         qterms = []  # holds a list of analyzed query terms
244 |         ts = self.get_analyzer().tokenStream(field, query)
245 |         term = ts.addAttribute(CharTermAttribute.class_)
246 |         ts.reset()
247 |         while ts.incrementToken():
248 |             qterms.append(term.toString())
249 |         ts.end()
250 |         ts.close()
251 |         return qterms
252 | 
253 |     def get_id_lookup_query(self, id, field=None):
254 |         """Creates Lucene query for searching by (external) document id."""
255 |         if field is None:
256 |             field = self.FIELDNAME_ID
257 |         return TermQuery(Term(field, id))
258 | 
259 |     def get_and_query(self, queries):
260 |         """Creates an AND Boolean query from multiple Lucene queries."""
261 |         # empty boolean query with Similarity.coord() disabled
262 |         bq = BooleanQuery(False)
263 |         for q in queries:
264 |             bq.add(q, BooleanClause.Occur.MUST)
265 |         return bq
266 | 
267 |     def get_or_query(self, queries):
268 |         """Creates an OR Boolean query from multiple Lucene queries."""
269 |         # empty boolean query with Similarity.coord() disabled
270 |         bq = BooleanQuery(False)
271 |         for q in queries:
272 |             bq.add(q, BooleanClause.Occur.SHOULD)
273 |         return bq
274 | 
275 |     def get_phrase_query(self, query, field):
276 |         """Creates phrase query for searching exact phrase."""
277 |         phq = PhraseQuery()
278 |         for t in query.split():
279 |             phq.add(Term(field, t))
280 |         return phq
281 | 
282 |     def get_span_query(self, terms, field, slop, ordered=True):
283 |         """
284 |         Creates near span query
285 | 
286 |         :param terms: list of terms
287 |         :param field: field name
288 |         :param slop: number of terms between the query terms
289 |         :param ordered: If true, ordered search; otherwise unordered search
290 |         :return: lucene span near query
291 |         """
292 |         span_queries = []
293 |         for term in terms:
294 |             span_queries.append(SpanTermQuery(Term(field, term)))
295 |         span_near_query = SpanNearQuery(span_queries, slop, ordered)
296 |         return span_near_query
297 | 
298 |     def get_doc_phrase_freq(self, phrase, field, slop, ordered):
299 |         """
300 |         Returns per-document phrase frequencies for a given phrase and field.
301 | 
302 |         :param phrase: str
303 |         :param field: field name
304 |         :param slop: number of terms in between
305 |         :param ordered: If true, term occurrences should be ordered
306 |         :return: dictionary {doc: freq, ...}
307 |         """
308 |         # creates span near query
309 |         span_near_query = self.get_span_query(phrase.split(" "), field, slop=slop, ordered=ordered)
310 | 
311 |         # counts phrase occurrences per document
312 |         self.open_searcher()
313 |         index_reader_context = self.searcher.getTopReaderContext()
314 |         term_contexts = HashMap()
315 |         terms = TreeSet()
316 |         span_near_query.extractTerms(terms)
317 |         for term in terms:
318 |             term_contexts.put(term, TermContext.build(index_reader_context, term))
319 |         leaves = index_reader_context.leaves()
320 |         doc_phrase_freq = {}
321 |         # iterates over all atomic readers
322 |         for atomic_reader_context in leaves:
323 |             bits = atomic_reader_context.reader().getLiveDocs()
324 |             spans = span_near_query.getSpans(atomic_reader_context, bits, term_contexts)
325 |             while spans.next():
326 |                 lucene_doc_id = spans.doc()
327 |                 doc_id = atomic_reader_context.reader().document(lucene_doc_id).get(self.FIELDNAME_ID)
328 |                 if doc_id not in doc_phrase_freq:
329 |                     doc_phrase_freq[doc_id] = 1
330 |                 else:
331 |                     doc_phrase_freq[doc_id] += 1
332 |         return doc_phrase_freq
333 | 
334 |     def get_id_filter(self):
335 |         return FieldValueFilter(self.FIELDNAME_ID)
336 | 
337 |     def __to_retrieval_results(self, scoredocs, field_id=FIELDNAME_ID):
338 |         """Converts Lucene scoreDocs results to RetrievalResults format."""
339 |         rr = RetrievalResults()
340 |         if scoredocs is not None:
341 |             for i in xrange(len(scoredocs)):
342 |                 score = scoredocs[i].score
343 |                 lucene_doc_id = scoredocs[i].doc  # internal doc_id
344 |                 doc_id = self.reader.document(lucene_doc_id).get(field_id)
345 |                 rr.append(doc_id, score, lucene_doc_id)
346 |         return rr
347 | 
348 |     def score_query(self, query, field_content=FIELDNAME_CONTENTS, field_id=FIELDNAME_ID, num_docs=100):
349 |         """Scores a given query and returns results as a RetrievalResults object."""
350 |         lucene_query = self.get_lucene_query(query, field_content)
351 |         scoredocs = self.searcher.search(lucene_query, num_docs).scoreDocs
352 |         return self.__to_retrieval_results(scoredocs, field_id)
353 | 
354 |     def num_docs(self):
355 |         """Returns number of documents in the index."""
356 |         self.open_reader()
357 |         return self.reader.numDocs()
358 | 
359 |     def num_fields(self):
360 |         """Returns number of fields in the index."""
361 |         self.open_reader()
362 |         atomic_reader = SlowCompositeReaderWrapper.wrap(self.reader)
363 |         return atomic_reader.getFieldInfos().size()
364 | 
365 |     def get_fields(self):
366 |         """Returns the names of the fields in the index."""
367 |         fields = []
368 |         self.open_reader()
369 |         atomic_reader = SlowCompositeReaderWrapper.wrap(self.reader)
370 |         for fieldInfo in atomic_reader.getFieldInfos().iterator():
371 |             fields.append(fieldInfo.name)
372 |         return fields
373 | 
374 |     def get_doc_termvector(self, lucene_doc_id, field):
375 |         """Outputs the document term vector as a generator."""
376 |         terms = self.reader.getTermVector(lucene_doc_id, field)
377 |         if terms:
378 |             termenum = terms.iterator(None)
379 |             for bytesref in BytesRefIterator.cast_(termenum):
380 |                 yield bytesref.utf8ToString(), termenum
381 | 
382 |     def get_doc_termfreqs(self, lucene_doc_id, field):
383 |         """
384 |         Returns term frequencies for a given document field.
385 | 
386 |         :param lucene_doc_id: Lucene document ID
387 |         :param field: document field
388 |         :return: dictionary {term: freq, ...}
389 |         """
390 |         termfreqs = {}
391 |         for term, termenum in self.get_doc_termvector(lucene_doc_id, field):
392 |             termfreqs[term] = int(termenum.totalTermFreq())
393 |         return termfreqs
394 | 
395 |     def get_doc_termfreqs_all_fields(self, lucene_doc_id):
396 |         """
397 |         Returns term frequency for all fields in the given document.
398 | 
399 |         :param lucene_doc_id: Lucene document ID
400 |         :return: dictionary {field: {term: freq, ...}, ...}
401 |         """
402 |         doc_termfreqs = {}
403 |         vectors = self.reader.getTermVectors(lucene_doc_id)
404 |         if vectors:
405 |             for field in vectors.iterator():
406 |                 doc_termfreqs[field] = {}
407 |                 terms = vectors.terms(field)
408 |                 if terms:
409 |                     termenum = terms.iterator(None)
410 |                     for bytesref in BytesRefIterator.cast_(termenum):
411 |                         doc_termfreqs[field][bytesref.utf8ToString()] = int(termenum.totalTermFreq())
412 |                     print doc_termfreqs[field]
413 |         return doc_termfreqs
414 | 
415 |     def get_coll_termvector(self, field):
416 |         """ Returns collection term vector for the given field."""
417 |         self.open_reader()
418 |         fields = MultiFields.getFields(self.reader)
419 |         if fields is not None:
420 |             terms = fields.terms(field)
421 |             if terms:
422 |                 termenum = terms.iterator(None)
423 |                 for bytesref in BytesRefIterator.cast_(termenum):
424 |                     yield bytesref.utf8ToString(), termenum
425 | 
426 |     def get_coll_termfreq(self, term, field):
427 |         """ 
428 |         Returns collection term frequency for the given field.
429 | 
430 |         :param term: string
431 |         :param field: string, document field
432 |         :return: int
433 |         """
434 |         self.open_reader()
435 |         return self.reader.totalTermFreq(Term(field, term))
436 | 
437 |     def get_doc_freq(self, term, field):
438 |         """
439 |         Returns document frequency for the given term and field.
440 | 
441 |         :param term: string, term
442 |         :param field: string, document field
443 |         :return: int
444 |         """
445 |         self.open_reader()
446 |         return self.reader.docFreq(Term(field, term))
447 | 
448 |     def get_doc_count(self, field):
449 |         """
450 |         Returns number of documents with at least one term for the given field.
451 | 
452 |         :param field: string, field name
453 |         :return: int
454 |         """
455 |         self.open_reader()
456 |         return self.reader.getDocCount(field)
457 | 
458 |     def get_coll_length(self, field):
459 |         """ 
460 |         Returns length of field in the collection.
461 | 
462 |         :param field: string, field name
463 |         :return: int
464 |         """
465 |         self.open_reader()
466 |         return self.reader.getSumTotalTermFreq(field)
467 | 
468 |     def get_avg_len(self, field):
469 |         """ 
470 |         Returns average length of a field in the collection.
471 | 
472 |         :param field: string, field name
473 |         """
474 |         self.open_reader()
475 |         n = self.reader.getDocCount(field)  # number of documents with at least one term for this field
476 |         len_all = self.reader.getSumTotalTermFreq(field)
477 |         if n == 0:
478 |             return 0
479 |         else:
480 |             return len_all / float(n)
481 | 
482 | class LuceneDocument(object):
483 |     """Internal representation of a Lucene document."""
484 | 
485 |     def __init__(self):
486 |         self.ldf = LuceneDocumentField()
487 | 
488 |     def create_document(self, contents):
489 |         """Create a Lucene document from the specified contents.
490 |         Contents is a list of fields to be indexed, represented as a dictionary
491 |         with keys 'field_name', 'field_type', and 'field_value'."""
492 |         doc = Document()
493 |         for f in contents:
494 |             doc.add(Field(f['field_name'], f['field_value'],
495 |                           self.ldf.get_field(f['field_type'])))
496 |         return doc
497 | 
498 | 
499 | class LuceneDocumentField(object):
500 |     """Internal handler class for possible field types."""
501 | 
502 |     def __init__(self):
503 |         """Init possible field types."""
504 | 
505 |         # FIELD_ID: stored, indexed, non-tokenized
506 |         self.field_id = FieldType()
507 |         self.field_id.setIndexed(True)
508 |         self.field_id.setStored(True)
509 |         self.field_id.setTokenized(False)
510 | 
511 |         # FIELD_ID_TV: stored, indexed, not tokenized, with term vectors (without positions)
512 |         # for storing IDs with term vector info
513 |         self.field_id_tv = FieldType()
514 |         self.field_id_tv.setIndexed(True)
515 |         self.field_id_tv.setStored(True)
516 |         self.field_id_tv.setTokenized(False)
517 |         self.field_id_tv.setStoreTermVectors(True)
518 | 
519 |         # FIELD_TEXT: stored, indexed, tokenized, with positions
520 |         self.field_text = FieldType()
521 |         self.field_text.setIndexed(True)
522 |         self.field_text.setStored(True)
523 |         self.field_text.setTokenized(True)
524 | 
525 |         # FIELD_TEXT_TV: stored, indexed, tokenized, with term vectors (without positions)
526 |         self.field_text_tv = FieldType()
527 |         self.field_text_tv.setIndexed(True)
528 |         self.field_text_tv.setStored(True)
529 |         self.field_text_tv.setTokenized(True)
530 |         self.field_text_tv.setStoreTermVectors(True)
531 | 
532 |         # FIELD_TEXT_TVP: stored, indexed, tokenized, with term vectors and positions
533 |         # (but no character offsets)
534 |         self.field_text_tvp = FieldType()
535 |         self.field_text_tvp.setIndexed(True)
536 |         self.field_text_tvp.setStored(True)
537 |         self.field_text_tvp.setTokenized(True)
538 |         self.field_text_tvp.setStoreTermVectors(True)
539 |         self.field_text_tvp.setStoreTermVectorPositions(True)
540 | 
541 |         # FIELD_TEXT_NTV:  not stored, indexed, tokenized, with term vectors (without positions)
542 |         self.field_text_ntv = FieldType()
543 |         self.field_text_ntv.setIndexed(True)
544 |         self.field_text_ntv.setStored(False)
545 |         self.field_text_ntv.setTokenized(True)
546 |         self.field_text_ntv.setStoreTermVectors(True)
547 | 
548 |         # FIELD_TEXT_NTVP: not stored, indexed, tokenized, with term vectors and positions
549 |         # (but no character offsets)
550 |         self.field_text_ntvp = FieldType()
551 |         self.field_text_ntvp.setIndexed(True)
552 |         self.field_text_ntvp.setStored(False)
553 |         self.field_text_ntvp.setTokenized(True)
554 |         self.field_text_ntvp.setStoreTermVectors(True)
555 |         self.field_text_ntvp.setStoreTermVectorPositions(True)
556 | 
557 |     def get_field(self, type):
558 |         """Gets Lucene FieldType object for the corresponding internal FIELDTYPE_ value."""
559 |         if type == Lucene.FIELDTYPE_ID:
560 |             return self.field_id
561 |         elif type == Lucene.FIELDTYPE_ID_TV:
562 |             return self.field_id_tv
563 |         elif type == Lucene.FIELDTYPE_TEXT:
564 |             return self.field_text
565 |         elif type == Lucene.FIELDTYPE_TEXT_TV:
566 |             return self.field_text_tv
567 |         elif type == Lucene.FIELDTYPE_TEXT_TVP:
568 |             return self.field_text_tvp
569 |         elif type == Lucene.FIELDTYPE_TEXT_NTV:
570 |             return self.field_text_ntv
571 |         elif type == Lucene.FIELDTYPE_TEXT_NTVP:
572 |             return self.field_text_ntvp
573 |         else:
574 |             raise Exception("Unknown field type")
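
The seven field types configured above differ only in a handful of boolean flags. As a rough illustration, the same information can be written as a lookup table. This is a pure-Python sketch that does not touch PyLucene: `FieldSpec`, `FIELD_SPECS`, `get_field_spec`, and the string keys are hypothetical names chosen for this example and are not the `Lucene.FIELDTYPE_*` constants.

```python
from collections import namedtuple

# Hypothetical record of the flags set on each FieldType in LuceneDocumentField.
FieldSpec = namedtuple(
    "FieldSpec", "indexed stored tokenized term_vectors tv_positions")

# Keys are illustrative labels, not the actual Lucene.FIELDTYPE_* values.
FIELD_SPECS = {
    "id":        FieldSpec(True, True,  False, False, False),
    "id_tv":     FieldSpec(True, True,  False, True,  False),
    "text":      FieldSpec(True, True,  True,  False, False),
    "text_tv":   FieldSpec(True, True,  True,  True,  False),
    "text_tvp":  FieldSpec(True, True,  True,  True,  True),
    "text_ntv":  FieldSpec(True, False, True,  True,  False),
    "text_ntvp": FieldSpec(True, False, True,  True,  True),
}


def get_field_spec(field_type):
    """Returns the flags for a field type label, or raises on unknown labels."""
    try:
        return FIELD_SPECS[field_type]
    except KeyError:
        raise ValueError("Unknown field type: %r" % (field_type,))
```

A table like this makes it easy to check at a glance which types keep the stored value and which store term vector positions.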


--------------------------------------------------------------------------------
/nordlys/retrieval/results.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Result list representation.
 3 | 
 4 | - for each hit it holds score and both internal and external doc_ids
 5 | 
 6 | @author: Krisztian Balog (krisztian.balog@uis.no)
 7 | """
 8 | 
 9 | import operator
10 | 
11 | 
12 | class RetrievalResults(object):
13 |     """Class for storing retrieval scores for a given query."""
14 |     def __init__(self):
15 |         self.scores = {}
 16 |         # mapping from external to internal doc_ids
17 |         self.doc_ids = {}
18 | 
19 |     def append(self, doc_id, score, doc_id_int=None):
20 |         """Adds document to the result list"""
21 |         self.scores[doc_id] = score
22 |         if doc_id_int is not None:
23 |             self.doc_ids[doc_id] = doc_id_int
24 | 
25 |     def increase(self, doc_id, score):
26 |         """Increases the score of a document (adds it to the results list
27 |         if it is not already there)"""
28 |         if doc_id not in self.scores:
29 |             self.scores[doc_id] = 0
30 |         self.scores[doc_id] += score
31 | 
32 |     def num_docs(self):
33 |         """Returns the number of documents in the result list."""
34 |         return len(self.scores)
35 | 
36 |     def get_scores_sorted(self):
37 |         """Returns all results sorted by score"""
38 |         return sorted(self.scores.iteritems(), key=operator.itemgetter(1), reverse=True)
39 | 
40 |     def get_doc_id_int(self, doc_id):
41 |         """Returns internal doc_id for a given doc_id."""
42 |         if doc_id in self.doc_ids:
43 |             return self.doc_ids[doc_id]
44 |         return None
45 | 
46 |     def write_trec_format(self, query_id, run_id, out, max_rank=100):
47 |         """Outputs results in TREC format"""
48 |         rank = 1
49 |         for doc_id, score in self.get_scores_sorted():
50 |             if rank <= max_rank:
51 |                 out.write(query_id + "\tQ0\t" + doc_id + "\t" + str(rank) + "\t" + str(score) + "\t" + run_id + "\n")
52 |             rank += 1
53 | 
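
To illustrate the output of write_trec_format, the following standalone sketch ranks a score dictionary and formats each hit as a tab-separated TREC run line (query ID, the literal Q0, document ID, rank, score, run ID). The function name `format_trec_lines` and the sample scores are assumptions made for this example; unlike the method above, it slices the ranked list rather than iterating past `max_rank`, and it is written so that it also runs under Python 3.

```python
import operator


def format_trec_lines(scores, query_id, run_id, max_rank=100):
    """Returns TREC-format lines for the top max_rank documents by score."""
    # Sort (doc_id, score) pairs by score, highest first.
    ranked = sorted(scores.items(), key=operator.itemgetter(1), reverse=True)
    lines = []
    for rank, (doc_id, score) in enumerate(ranked[:max_rank], start=1):
        lines.append("\t".join(
            [query_id, "Q0", doc_id, str(rank), str(score), run_id]))
    return lines


# Made-up log-probability scores for three documents.
example = format_trec_lines(
    {"doc_a": -3.2, "doc_b": -1.5, "doc_c": -7.0}, "q1", "demo_run", max_rank=2)
for line in example:
    print(line)
```

With `max_rank=2`, only the two highest-scoring documents are emitted, ranked 1 and 2.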


--------------------------------------------------------------------------------
/nordlys/retrieval/retrieval.py:
--------------------------------------------------------------------------------
  1 | """
  2 | Console application for general-purpose retrieval.
  3 | 
  4 | first pass: get top N documents using Lucene's default retrieval method (based on the catch-all content field)
  5 | second pass: perform (expensive) scoring of the top N documents using the Scorer class
  6 | 
  7 | General config parameters:
  8 | - index_dir: index directory
  9 | - query_file: query file (JSON)
 10 | - model: accepted values: lucene, lm, mlm, prms (default: lm)
 11 | - output_file: output file name
 12 | - output_format: (default: trec) -- not used yet
 13 | - run_id: run ID (only for "trec" output format)
 14 | - num_docs: number of documents to return (default: 100)
 15 | - field_id: id field to be returned (default: Lucene.FIELDNAME_ID)
 16 | - first_pass_num_docs: number of documents in first-pass scoring (default: 10000)
 17 | - first_pass_field: field used in first pass retrieval (default: Lucene.FIELDNAME_CONTENTS)
 18 | 
 19 | Model-specific parameters:
 20 | - smoothing_method: jm or dirichlet (lm and mlm, default: jm)
 21 | - smoothing_param: value of lambda (jm) or mu (dirichlet) (jm default: 0.1, dirichlet default: average field length)
 22 | - field_weights: dict with fields and corresponding weights (only mlm)
 23 | - field: field name for LM model
 24 | - fields: fields for PRMS model
 25 | 
 26 | 
 27 | @author: Krisztian Balog (krisztian.balog@uis.no)
 28 | """
 29 | from datetime import datetime
 30 | 
 31 | import sys
 32 | import json
 33 | import os
 34 | from nordlys.retrieval.lucene_tools import Lucene
 35 | from scorer import Scorer
 36 | from results import RetrievalResults
 37 | 
 38 | 
 39 | class Retrieval(object):
 40 |     def __init__(self, config):
 41 |         """
 42 |         Loads config file, checks params, and sets default values.
 43 | 
 44 |         :param config: JSON config file or a dictionary
 45 |         """
 46 |         # set configurations
 47 |         if type(config) == dict:
 48 |             self.config = config
 49 |         else:
 50 |             try:
 51 |                 self.config = json.load(open(config))
 52 |             except Exception, e:
 53 |                 print "Error loading config file: ", e
 54 |                 sys.exit(1)
 55 | 
 56 |         # check params and set default values
 57 |         try:
 58 |             if 'index_dir' not in self.config:
 59 |                 raise Exception("index_dir is missing")
 60 |             if 'query_file' not in self.config:
 61 |                 raise Exception("query_file is missing")
 62 |             if 'output_file' not in self.config:
 63 |                 raise Exception("output_file is missing")
 64 |             if 'run_id' not in self.config:
 65 |                 raise Exception("run_id is missing")
 66 |             if 'model' not in self.config:
 67 |                 self.config['model'] = "lm"
 68 |             if 'num_docs' not in self.config:
 69 |                 self.config['num_docs'] = 100
 70 |             if 'field_id' not in self.config:
 71 |                 self.config['field_id'] = Lucene.FIELDNAME_ID
 72 |             if 'first_pass_num_docs' not in self.config:
 73 |                 self.config['first_pass_num_docs'] = 10000
 74 |             if 'first_pass_field' not in self.config:
 75 |                 self.config['first_pass_field'] = Lucene.FIELDNAME_CONTENTS
 76 | 
 77 |             # model specific params
 78 |             if self.config['model'] == "lm" or self.config['model'] == "mlm" or self.config['model'] == "prms":
 79 |                 if 'smoothing_method' not in self.config:
 80 |                     self.config['smoothing_method'] = "jm"
 81 |                 # if 'smoothing_param' not in self.config:
 82 |                 #     self.config['smoothing_param'] = 0.1
 83 | 
 84 |             if self.config['model'] == "mlm":
 85 |                 if 'field_weights' not in self.config:
 86 |                     raise Exception("field_weights is missing")
 87 | 
 88 |             if self.config['model'] == "prms":
 89 |                 if 'fields' not in self.config:
 90 |                     raise Exception("fields is missing")
 91 | 
 92 |         except Exception, e:
 93 |             print "Error in config file: ", e
 94 |             sys.exit(1)
 95 | 
 96 |     def _open_index(self):
 97 |         self.lucene = Lucene(self.config['index_dir'])
 98 | 
 99 |         self.lucene.open_searcher()
100 | 
101 |     def _close_index(self):
102 |         self.lucene.close_reader()
103 | 
104 |     def _load_queries(self):
105 |         self.queries = json.load(open(self.config['query_file']))
106 | 
107 |     def _first_pass_scoring(self, lucene, query):
108 |         """
109 |         Returns first-pass scoring of documents.
110 | 
111 |         :param query: raw query
112 |         :return: RetrievalResults object
113 |         """
114 |         print "\tFirst pass scoring... ",
115 |         results = lucene.score_query(query, field_content=self.config['first_pass_field'],
116 |                                      field_id=self.config['field_id'],
117 |                                      num_docs=self.config['first_pass_num_docs'])
118 |         print results.num_docs()
119 |         return results
120 | 
121 |     def _second_pass_scoring(self, res1, scorer):
122 |         """
123 |         Returns second-pass scoring of documents.
124 | 
125 |         :param res1: first pass results
126 |         :return: RetrievalResults object
127 |         """
128 |         print "\tSecond pass scoring... "
129 |         results = RetrievalResults()
130 |         for doc_id, orig_score in res1.get_scores_sorted():
131 |             doc_id_int = res1.get_doc_id_int(doc_id)
132 |             score = scorer.score_doc(doc_id, doc_id_int)
133 |             results.append(doc_id, score)
134 |         print "done"
135 |         return results
136 | 
137 |     def retrieve(self):
138 |         """Scores queries and outputs results."""
139 |         s_t = datetime.now()  # start time
140 |         total_time = 0.0
141 | 
142 |         self._load_queries()
143 |         self._open_index()
144 | 
145 |         # init output file
146 |         if os.path.exists(self.config['output_file']):
147 |             os.remove(self.config['output_file'])
148 |         out = open(self.config['output_file'], "w")
149 | 
150 |         for query_id in sorted(self.queries):
151 |             # query = Query.preprocess(self.queries[query_id])
152 |             query = Lucene.preprocess(self.queries[query_id])
153 |             print "scoring [" + query_id + "] " + query
154 |             # first pass scoring
155 |             res1 = self._first_pass_scoring(self.lucene, query)
156 |             # second pass scoring (if needed)
157 |             if self.config['model'] == "lucene":
158 |                 results = res1
159 |             else:
160 |                 scorer = Scorer.get_scorer(self.config['model'], self.lucene, query, self.config)
161 |                 results = self._second_pass_scoring(res1, scorer)
162 |             # write results to output file
163 |             results.write_trec_format(query_id, self.config['run_id'], out, self.config['num_docs'])
164 | 
165 |         # close output file
166 |         out.close()
167 |         # close index
168 |         self._close_index()
169 | 
170 |         e_t = datetime.now()  # end time
171 |         diff = e_t - s_t
172 |         total_time += diff.total_seconds()
173 |         time_log = "Execution time(sec):\t" + str(total_time) + "\n"
174 |         print time_log
175 | 
176 | 
177 | def print_usage():
178 |     print sys.argv[0] + " <config_file>"
179 |     sys.exit()
180 | 
181 | 
182 | def main(argv):
183 |     if len(argv) < 1:
184 |         print_usage()
185 | 
186 |     r = Retrieval(argv[0])
187 |     r.retrieve()
188 | 
189 | 
190 | if __name__ == '__main__':
191 |     main(sys.argv[1:])
192 | 
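
The parameter checks in Retrieval.__init__ can be summarised by the sketch below, which validates a config dictionary and fills in the documented defaults. It is a simplified stand-in, not the class itself: `apply_defaults` is a hypothetical helper, the file paths are placeholders, and the `field_id` and `first_pass_field` defaults are omitted because they depend on constants of the Lucene class.

```python
REQUIRED = ("index_dir", "query_file", "output_file", "run_id")
DEFAULTS = {"model": "lm", "num_docs": 100, "first_pass_num_docs": 10000}


def apply_defaults(config):
    """Returns a copy of config with defaults set; raises on missing params."""
    missing = [key for key in REQUIRED if key not in config]
    if missing:
        raise ValueError(", ".join(missing) + " is missing")
    merged = dict(DEFAULTS)
    merged.update(config)
    # Language-model based scorers default to Jelinek-Mercer smoothing.
    if merged["model"] in ("lm", "mlm", "prms"):
        merged.setdefault("smoothing_method", "jm")
    if merged["model"] == "mlm" and "field_weights" not in merged:
        raise ValueError("field_weights is missing")
    if merged["model"] == "prms" and "fields" not in merged:
        raise ValueError("fields is missing")
    return merged


# Placeholder paths; only the four required keys are given.
cfg = apply_defaults({
    "index_dir": "path/to/index",
    "query_file": "data/queries.json",
    "output_file": "output/lm.treceval",
    "run_id": "lm",
})
print(cfg["model"], cfg["smoothing_method"], cfg["num_docs"])
```

Raising an exception instead of calling sys.exit keeps the sketch usable from other code; the console application above exits on the first configuration error.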


--------------------------------------------------------------------------------
/nordlys/retrieval/scorer.py:
--------------------------------------------------------------------------------
  1 | """
  2 | Various retrieval models for scoring an individual document for a given query.
  3 | 
  4 | @author: Faegheh Hasibi (faegheh.hasibi@idi.ntnu.no)
  5 | @author: Krisztian Balog (krisztian.balog@uis.no)
  6 | """
  7 | 
  8 | from __future__ import division
  9 | import math
 10 | from lucene_tools import Lucene
 11 | 
 12 | 
 13 | class Scorer(object):
 14 |     """Base scorer class."""
 15 | 
 16 |     SCORER_DEBUG = 0
 17 | 
 18 |     def __init__(self, lucene, query, params):
 19 |         self.lucene = lucene
 20 |         self.query = query
 21 |         self.params = params
 22 |         self.lucene.open_searcher()
 23 |         """
 24 |         @todo consider the field for analysis
 25 |         """
 26 |         # NOTE: The analyser might return terms that are not in the collection.
 27 |         # These terms are filtered out later in the score_doc functions.
 28 |         self.query_terms = lucene.analyze_query(self.query) if query is not None else None
 29 | 
 30 |     @staticmethod
 31 |     def get_scorer(model, lucene, query, params):
 32 |         """
 33 |         Returns Scorer object (Scorer factory).
 34 | 
 35 |         :param model: accepted values: lm, mlm, or prms
 36 |         :param lucene: Lucene object
 37 |         :param query: raw query (to be analyzed)
 38 |         :param params: dict with models parameters
 39 |         """
 40 |         if model == "lm":
 41 |             print "\tLM scoring ... "
 42 |             return ScorerLM(lucene, query, params)
 43 |         elif model == "mlm":
 44 |             print "\tMLM scoring ..."
 45 |             return ScorerMLM(lucene, query, params)
 46 |         elif model == "prms":
 47 |             print "\tPRMS scoring ..."
 48 |             return ScorerPRMS(lucene, query, params)
 49 |         else:
 50 |             raise Exception("Unknown model '" + model + "'")
 51 | 
 52 | 
 53 | class ScorerLM(Scorer):
 54 |     def __init__(self, lucene, query, params):
 55 |         super(ScorerLM, self).__init__(lucene, query, params)
 56 |         self.smoothing_method = params.get('smoothing_method', "jm").lower()
 57 |         if (self.smoothing_method != "jm") and (self.smoothing_method != "dirichlet"):
 58 |             raise Exception(self.params['smoothing_method'] + " smoothing method is not supported!")
 59 |         self.tf = {}
 60 | 
 61 |     @staticmethod
 62 |     def get_jm_prob(tf_t_d, len_d, tf_t_C, len_C, lambd):
 63 |         """
 64 |         Computes JM-smoothed probability
 65 |         p(t|theta_d) = [(1-lambda) tf(t, d)/|d|] + [lambda tf(t, C)/|C|]
 66 | 
 67 |         :param tf_t_d: tf(t,d)
 68 |         :param len_d: |d|
 69 |         :param tf_t_C: tf(t,C)
 70 |         :param len_C: |C| = \sum_{d \in C} |d|
 71 |         :param lambd: \lambda
 72 |         :return:
 73 |         """
 74 |         p_t_d = tf_t_d / len_d if len_d > 0 else 0
 75 |         p_t_C = tf_t_C / len_C if len_C > 0 else 0
 76 |         return (1 - lambd) * p_t_d + lambd * p_t_C
 77 | 
 78 |     @staticmethod
 79 |     def get_dirichlet_prob(tf_t_d, len_d, tf_t_C, len_C, mu):
 80 |         """
 81 |         Computes Dirichlet-smoothed probability
 82 |         P(t|theta_d) = [tf(t, d) + mu P(t|C)] / [|d| + mu]
 83 | 
 84 |         :param tf_t_d: tf(t,d)
 85 |         :param len_d: |d|
 86 |         :param tf_t_C: tf(t,C)
 87 |         :param len_C: |C| = \sum_{d \in C} |d|
 88 |         :param mu: \mu
 89 |         :return:
 90 |         """
 91 |         if mu == 0:  # i.e. field does not have any content in the collection
 92 |             return 0
 93 |         else:
 94 |             p_t_C = tf_t_C / len_C if len_C > 0 else 0
 95 |             return (tf_t_d + mu * p_t_C) / (len_d + mu)
 96 | 
 97 |     def get_tf(self, lucene_doc_id, field):
 98 |         if lucene_doc_id not in self.tf:
 99 |             self.tf[lucene_doc_id] = {}
100 |         if field not in self.tf[lucene_doc_id]:
101 |             self.tf[lucene_doc_id][field] = self.lucene.get_doc_termfreqs(lucene_doc_id, field)
102 |         return self.tf[lucene_doc_id][field]
103 | 
104 |     def get_term_prob(self, lucene_doc_id, field, t, tf_t_d_f=None, tf_t_C_f=None):
105 |         """
106 |         Returns probability of a given term for the given field.
107 | 
108 |         :param lucene_doc_id: internal Lucene document ID
109 |         :param field: entity field name, e.g. <dbo:abstract>
110 |         :param t: term
111 |         :return: P(t|d_f)
112 |         """
113 |         # Gets term freqs for field of document
114 |         tf = {}
115 |         if lucene_doc_id is not None:
116 |             tf = self.get_tf(lucene_doc_id, field)
117 | 
118 |         len_d_f = sum(tf.values())
119 |         len_C_f = self.lucene.get_coll_length(field)
120 | 
121 |         tf_t_d_f = tf.get(t, 0) if tf_t_d_f is None else tf_t_d_f
122 |         tf_t_C_f = self.lucene.get_coll_termfreq(t, field) if tf_t_C_f is None else tf_t_C_f
123 |         if self.SCORER_DEBUG:
124 |             print "\t\tt=" + t + ", f=" + field
125 |             print "\t\t\tDoc:  tf(t,f)=" + str(tf_t_d_f) + "\t|f|=" + str(len_d_f)
126 |             print "\t\t\tColl: tf(t,f)=" + str(tf_t_C_f) + "\t|f|=" + str(len_C_f)
127 | 
128 |         # JM smoothing: p(t|theta_d_f) = [(1-lambda) tf(t, d_f)/|d_f|] + [lambda tf(t, C_f)/|C_f|]
129 |         if self.smoothing_method == "jm":
130 |             lambd = self.params.get('smoothing_param', 0.1)
131 |             p_t_d_f = self.get_jm_prob(tf_t_d_f, len_d_f, tf_t_C_f, len_C_f, lambd)
132 |             if self.SCORER_DEBUG:
133 |                 print "\t\t\tJM smoothing:"
134 |                 print "\t\t\tDoc:  p(t|theta_d_f)=", p_t_d_f
135 |         # Dirichlet smoothing
136 |         elif self.smoothing_method == "dirichlet":
137 |             mu = self.params.get('smoothing_param', self.lucene.get_avg_len(field))
138 |             p_t_d_f = self.get_dirichlet_prob(tf_t_d_f, len_d_f, tf_t_C_f, len_C_f, mu)
139 |             if self.SCORER_DEBUG:
140 |                 print "\t\t\tDirichlet smoothing:"
141 |                 print "\t\t\tmu:", mu
142 |                 print "\t\t\tDoc:  p(t|theta_d_f)=", p_t_d_f
143 |         return p_t_d_f
144 | 
145 |     def get_term_probs(self, lucene_doc_id, field):
146 |         """
147 |         Returns probability of all query terms for the given field.
148 | 
149 |         :param lucene_doc_id: internal Lucene document ID
150 |         :param field: entity field name, e.g. <dbo:abstract>
151 |         :return: dictionary of terms with their probabilities
152 |         """
153 |         p_t_theta_d_f = {}
154 |         for t in set(self.query_terms):
155 |             p_t_theta_d_f[t] = self.get_term_prob(lucene_doc_id, field, t)
156 |         return p_t_theta_d_f
157 | 
158 |     def score_doc(self, doc_id, lucene_doc_id=None):
159 |         """
160 |         Scores the given document using LM.
161 | 
162 |         :param doc_id: document id
163 |         :param lucene_doc_id: internal Lucene document ID
164 |         :return float, LM score of document and query
165 |         """
166 |         if self.SCORER_DEBUG:
167 |             print "Scoring doc ID=" + doc_id
168 | 
169 |         if lucene_doc_id is None:
170 |             lucene_doc_id = self.lucene.get_lucene_document_id(doc_id)
171 | 
172 |         field = self.params.get('field', Lucene.FIELDNAME_CONTENTS)
173 | 
174 |         p_t_theta_d = self.get_term_probs(lucene_doc_id, field)
175 |         if sum(p_t_theta_d.values()) == 0:  # none of query terms are in the field collection
176 |             if self.SCORER_DEBUG:
177 |                 print "\t\tP(q|" + field + ") = None"
178 |             return None
179 |         # p(q|theta_d) = prod(p(t|theta_d)) ; we return log(p(q|theta_d))
180 |         p_q_theta_d = 0
181 |         for t in self.query_terms:
182 |             # Skips the term if it is not in the field collection
183 |             if p_t_theta_d[t] == 0:
184 |                 continue
185 |             if self.SCORER_DEBUG:
186 |                 print "\t\tP(" + t + "|" + field + ") = " + str(p_t_theta_d[t])
187 |             p_q_theta_d += math.log(p_t_theta_d[t])
188 |         if self.SCORER_DEBUG:
189 |             print "\tP(d|q)=" + str(p_q_theta_d)
190 |         return p_q_theta_d
191 | 
192 | 
193 | class ScorerMLM(ScorerLM):
194 |     def __init__(self, lucene, query, params):
195 |         super(ScorerMLM, self).__init__(lucene, query, params)
196 | 
197 |     def get_mlm_term_prob(self, lucene_doc_id, weights, t):
198 |         """
199 |         Returns MLM probability for the given term and field-weights.
200 | 
201 |         :param lucene_doc_id: internal Lucene document ID
202 |         :param weights: dictionary, {field: weights, ...}
203 |         :param t: term
204 |         :return: P(t|theta_d)
205 |         """
206 |         # p(t|theta_d) = sum(mu_f * p(t|theta_d_f))
207 |         p_t_theta_d = 0
208 |         for f, mu_f in weights.iteritems():
209 |             p_t_theta_d_f = self.get_term_prob(lucene_doc_id, f, t)
210 |             p_t_theta_d += mu_f * p_t_theta_d_f
211 |         if self.SCORER_DEBUG:
212 |             print "\t\tP(t|theta_d)=" + str(p_t_theta_d)
213 |         return p_t_theta_d
214 | 
215 |     def get_mlm_term_probs(self, lucene_doc_id, weights):
216 |         """
217 |         Returns probability of all query terms for the given field weights.
218 | 
219 |         :param lucene_doc_id: internal Lucene document ID
220 |         :param weights: dictionary, {field: weight, ...}
221 |         :return: dictionary of terms with their probabilities
222 |         """
223 |         p_t_theta_d = {}
224 |         for t in set(self.query_terms):
225 |             if self.SCORER_DEBUG:
226 |                 print "\tt=" + t
227 |             p_t_theta_d[t] = self.get_mlm_term_prob(lucene_doc_id, weights, t)
228 |         return p_t_theta_d
229 | 
230 |     def score_doc(self, doc_id, lucene_doc_id=None):
231 |         """
232 |         Scores the given document using MLM model.
233 | 
234 |         :param doc_id: document id
235 |         :param lucene_doc_id: internal Lucene document ID
236 |         :return: float, log MLM score of document and query (None if no query term is in the field collection)
237 |         """
238 |         if self.SCORER_DEBUG:
239 |             print "Scoring doc ID=" + doc_id
240 | 
241 |         if lucene_doc_id is None:
242 |             lucene_doc_id = self.lucene.get_lucene_document_id(doc_id)
243 | 
244 |         weights = self.params['field_weights']
245 | 
246 |         p_t_theta_d = self.get_mlm_term_probs(lucene_doc_id, weights)
247 |         # none of query terms are in the field collection
248 |         if sum(p_t_theta_d.values()) == 0:
249 |             if self.SCORER_DEBUG:
250 |                 print "\t\tP_mlm(q|theta_d) = None"
251 |             return None
252 |         # p(q|theta_d) = prod(p(t|theta_d)) ; we return log(p(q|theta_d))
253 |         p_q_theta_d = 0
254 |         for t in self.query_terms:
255 |             if p_t_theta_d[t] == 0:
256 |                 continue
257 |             if self.SCORER_DEBUG:
258 |                 print "\t\tP_mlm(" + t + "|theta_d) = " + str(p_t_theta_d[t])
259 |             p_q_theta_d += math.log(p_t_theta_d[t])
260 | 
261 |         return p_q_theta_d
262 | 
263 | 
264 | class ScorerPRMS(ScorerLM):
265 |     def __init__(self, lucene, query, params):
266 |         super(ScorerPRMS, self).__init__(lucene, query, params)
267 |         self.fields = self.params['fields']
268 |         self.total_field_freq = None
269 |         self.mapping_probs = None
270 | 
271 |     def score_doc(self, doc_id, lucene_doc_id=None):
272 |         """
273 |         Scores the given document using PRMS model.
274 | 
275 |         :param doc_id: document id
276 |         :param lucene_doc_id: internal Lucene document ID
277 |         :return: float, log PRMS score of document and query (None if no query term is in the field collection)
278 |         """
279 |         if self.SCORER_DEBUG:
280 |             print "Scoring doc ID=" + doc_id
281 | 
282 |         if lucene_doc_id is None:
283 |             lucene_doc_id = self.lucene.get_lucene_document_id(doc_id)
284 | 
285 |         # gets mapping probs: p(f|t)
286 |         p_f_t = self.get_mapping_probs()
287 | 
288 |         # gets term probs: p(t|theta_d_f)
289 |         p_t_theta_d_f = {}
290 |         for field in self.fields:
291 |             p_t_theta_d_f[field] = self.get_term_probs(lucene_doc_id, field)
292 |         # none of query terms are in the field collection
293 |         if sum([sum(p_t_theta_d_f[field].values()) for field in p_t_theta_d_f]) == 0:
294 |             return None
295 | 
296 |         # p(q|theta_d) = prod(p(t|theta_d)) ; we return log(p(q|theta_d))
297 |         p_q_theta_d = 0
298 |         for t in self.query_terms:
299 |             if self.SCORER_DEBUG:
300 |                 print "\tt=" + t
301 |             # p(t|theta_d) = sum(p(f|t) * p(t|theta_d_f))
302 |             p_t_theta_d = 0
303 |             for f in self.fields:
304 |                 if f in p_f_t[t]:
305 |                     p_t_theta_d += p_f_t[t][f] * p_t_theta_d_f[f][t]
306 |                     if self.SCORER_DEBUG:
307 |                         print "\t\t\tf=" + f + ", p(f|t)=" + str(p_f_t[t][f]) + "  P(t|theta_d,f)=" + str(p_t_theta_d_f[f][t])
308 | 
309 |             if p_t_theta_d == 0:
310 |                 continue
311 |             p_q_theta_d += math.log(p_t_theta_d)
312 |             if self.SCORER_DEBUG:
313 |                 print "\t\tP(t|theta_d)=" + str(p_t_theta_d)
314 |         return p_q_theta_d
315 | 
316 |     def get_mapping_probs(self):
317 |         """Gets (cached) mapping probabilities for all query terms."""
318 |         if self.mapping_probs is None:
319 |             self.mapping_probs = {}
320 |             for t in set(self.query_terms):
321 |                 self.mapping_probs[t] = self.get_mapping_prob(t)
322 |         return self.mapping_probs
323 | 
324 |     def get_mapping_prob(self, t, coll_termfreq_fields=None):
325 |         """
326 |         Computes PRMS field mapping probability.
327 |             p(f|t) = P(t|f)P(f) / sum_f'(P(t|C_{f'_c})P(f'))
328 | 
329 |         :param t: str
330 |         :param coll_termfreq_fields: {field: freq, ...}
331 |         :return Dictionary {field: prms_prob, ...}
332 |         """
333 |         if coll_termfreq_fields is None:
334 |             coll_termfreq_fields = {}
335 |             for f in self.fields:
336 |                 coll_termfreq_fields[f] = self.lucene.get_coll_termfreq(t, f)
337 | 
338 |         # calculates numerators for all fields: P(t|f)P(f)
339 |         numerators = {}
340 |         for f in self.fields:
341 |             p_t_f = float(coll_termfreq_fields[f]) / self.lucene.get_coll_length(f)
342 |             p_f = float(self.lucene.get_doc_count(f)) / self.get_total_field_freq()
343 |             p_f_t = p_t_f * p_f
344 |             if p_f_t > 0:
345 |                 numerators[f] = p_f_t
346 |             if self.SCORER_DEBUG:
347 |                 print "\tf= " + f, "t= " + t + " P(t|f)=" + str(p_t_f) + " P(f)=" + str(p_f)
348 | 
349 |         # calculates denominator: sum_f'(P(t|C_{f'_c})P(f'))
350 |         denominator = sum(numerators.values())
351 | 
352 |         mapping_probs = {}
353 |         if denominator > 0:  # if the term is present in the collection
354 |             for f in numerators:
355 |                 mapping_probs[f] = numerators[f] / denominator
356 |                 if self.SCORER_DEBUG:
357 |                     print "\t\tf= " + f + " t= " + t + " p(f|t)= " + str(numerators[f]) + "/" + str(sum(numerators.values())) + \
358 |                           " = " + str(mapping_probs[f])
359 |         return mapping_probs
360 | 
361 |     def get_total_field_freq(self):
362 |         """Returns the (cached) sum of document counts over all fields."""
363 |         if self.total_field_freq is None:
364 |             total_field_freq = 0
365 |             for f in self.fields:
366 |                 total_field_freq += self.lucene.get_doc_count(f)
367 |             self.total_field_freq = total_field_freq
368 |         return self.total_field_freq
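
The PRMS mapping step in `get_mapping_prob` can be tried without a Lucene index. The block below is a minimal sketch, not part of the repository: the function `prms_mapping_probs`, the field names, and all counts are hypothetical stand-ins for the values that `self.lucene.get_coll_termfreq`, `get_coll_length`, and `get_doc_count` would supply.

```python
def prms_mapping_probs(term_freqs, coll_lengths, doc_counts):
    """Toy PRMS mapping: p(f|t) = P(t|f)P(f) / sum_f' P(t|f')P(f').

    All three arguments are plain dicts keyed by field name (hypothetical
    replacements for the Lucene statistics used by ScorerPRMS).
    """
    total_docs = float(sum(doc_counts.values()))
    numerators = {}
    for f in term_freqs:
        # float() keeps the division exact on Python 2 as well as Python 3
        p_t_f = float(term_freqs[f]) / coll_lengths[f]
        p_f = doc_counts[f] / total_docs
        if p_t_f * p_f > 0:  # fields that never contain the term are dropped
            numerators[f] = p_t_f * p_f
    denominator = sum(numerators.values())
    # An empty numerators dict (term absent everywhere) yields an empty result
    return dict((f, n / denominator) for f, n in numerators.items())


# Made-up statistics for one query term over three fields
probs = prms_mapping_probs(
    term_freqs={"names": 10, "abstract": 30, "categories": 0},
    coll_lengths={"names": 1000, "abstract": 6000, "categories": 500},
    doc_counts={"names": 100, "abstract": 100, "categories": 100},
)
# "names" gets about 2/3 and "abstract" about 1/3; "categories" is omitted
```

As in `get_mapping_prob`, a field where the term never occurs is left out of the result, which is why `score_doc` checks `if f in p_f_t[t]` before using a mapping probability.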


--------------------------------------------------------------------------------
/qrels/qrels-SemSearch_ES.txt:
--------------------------------------------------------------------------------
   1 | SemSearch_ES-1	Q0	<dbpedia:.444_Marlin>	1
   2 | SemSearch_ES-1	Q0	<dbpedia:.41_Remington_Magnum>	1
   3 | SemSearch_ES-1	Q0	<dbpedia:.500_S&W_Magnum>	1
   4 | SemSearch_ES-1	Q0	<dbpedia:Winchester_Model_1894>	1
   5 | SemSearch_ES-1	Q0	<dbpedia:Ruger_Alaskan>	1
   6 | SemSearch_ES-1	Q0	<dbpedia:Elmer_Keith>	1
   7 | SemSearch_ES-1	Q0	<dbpedia:.440_Cor-bon>	2
   8 | SemSearch_ES-1	Q0	<dbpedia:Ruger_Redhawk>	1
   9 | SemSearch_ES-1	Q0	<dbpedia:Handgun_hunting>	2
  10 | SemSearch_ES-1	Q0	<dbpedia:.44_Magnum>	2
  11 | SemSearch_ES-1	Q0	<dbpedia:.30-378_Weatherby_Magnum>	1
  12 | SemSearch_ES-1	Q0	<dbpedia:.45_Colt>	1
  13 | SemSearch_ES-1	Q0	<dbpedia:.44_Special>	2
  14 | SemSearch_ES-1	Q0	<dbpedia:5.6%C3%9750mm_Magnum>	1
  15 | SemSearch_ES-10	Q0	<dbpedia:Propheteer>	1
  16 | SemSearch_ES-10	Q0	<dbpedia:Tom_Bradley_(baseball)>	1
  17 | SemSearch_ES-10	Q0	<dbpedia:Asheville_metropolitan_area>	2
  18 | SemSearch_ES-10	Q0	<dbpedia:University_of_North_Carolina>	2
  19 | SemSearch_ES-10	Q0	<dbpedia:Ayden,_North_Carolina>	1
  20 | SemSearch_ES-10	Q0	<dbpedia:Ashville>	2
  21 | SemSearch_ES-10	Q0	<dbpedia:University_of_North_Carolina_at_Asheville>	1
  22 | SemSearch_ES-10	Q0	<dbpedia:U.S._Route_321>	1
  23 | SemSearch_ES-10	Q0	<dbpedia:Asheville,_North_Carolina>	2
  24 | SemSearch_ES-10	Q0	<dbpedia:2008_North_Carolina_Tar_Heels_football_team>	1
  25 | SemSearch_ES-10	Q0	<dbpedia:Clyde,_North_Carolina>	1
  26 | SemSearch_ES-100	Q0	<dbpedia:Tampa,_Florida>	1
  27 | SemSearch_ES-100	Q0	<dbpedia:YMCA>	1
  28 | SemSearch_ES-101	Q0	<dbpedia:Ashley_Wagner>	2
  29 | SemSearch_ES-102	Q0	<dbpedia:Camissoniopsis_cheiranthifolia>	1
  30 | SemSearch_ES-102	Q0	<dbpedia:Calystegia_soldanella>	1
  31 | SemSearch_ES-102	Q0	<dbpedia:Brandon_Flowers_(American_football)>	1
  32 | SemSearch_ES-102	Q0	<dbpedia:Dudleya_stolonifera>	1
  33 | SemSearch_ES-102	Q0	<dbpedia:Abronia_maritima>	1
  34 | SemSearch_ES-102	Q0	<dbpedia:Coreopsis_maritima>	1
  35 | SemSearch_ES-102	Q0	<dbpedia:Erysimum_menziesii>	1
  36 | SemSearch_ES-102	Q0	<dbpedia:Ipperwash_Provincial_Park>	1
  37 | SemSearch_ES-102	Q0	<dbpedia:Atriplex_leucophylla>	1
  38 | SemSearch_ES-103	Q0	<dbpedia:Humble,_Texas>	1
  39 | SemSearch_ES-104	Q0	<dbpedia:Bourbonnais,_Illinois>	2
  40 | SemSearch_ES-104	Q0	<dbpedia:Bourbonnais_Township,_Kankakee_County,_Illinois>	2
  41 | SemSearch_ES-104	Q0	<dbpedia:1999_Bourbonnais,_Illinois_train_accident>	1
  42 | SemSearch_ES-104	Q0	<dbpedia:Verneuil-en-Bourbonnais>	1
  43 | SemSearch_ES-104	Q0	<dbpedia:Bourbonnais>	1
  44 | SemSearch_ES-105	Q0	<dbpedia:Cedar_Grove,_New_Jersey>	1
  45 | SemSearch_ES-106	Q0	<dbpedia:Manticore_(film)>	1
  46 | SemSearch_ES-106	Q0	<dbpedia:Christopher_Masterson>	1
  47 | SemSearch_ES-106	Q0	<dbpedia:Chase_Masterson>	2
  48 | SemSearch_ES-107	Q0	<dbpedia:USS_Concord>	1
  49 | SemSearch_ES-107	Q0	<dbpedia:Shougang_Concord_International>	1
  50 | SemSearch_ES-108	Q0	<dbpedia:Danielia_Cotton>	2
  51 | SemSearch_ES-109	Q0	<dbpedia:David_Hewlett>	2
  52 | SemSearch_ES-109	Q0	<dbpedia:David_Woodley_Packard>	1
  53 | SemSearch_ES-109	Q0	<dbpedia:Desire_and_Hell_at_Sunset_Motel>	1
  54 | SemSearch_ES-109	Q0	<dbpedia:Hewlett_(surname)>	1
  55 | SemSearch_ES-109	Q0	<dbpedia:Rodney_McKay>	1
  56 | SemSearch_ES-109	Q0	<dbpedia:David_Packard>	1
  57 | SemSearch_ES-109	Q0	<dbpedia:Hewlett>	1
  58 | SemSearch_ES-11	Q0	<dbpedia:List_of_Austin_Powers_characters>	2
  59 | SemSearch_ES-11	Q0	<dbpedia:Austin_Powers_(film_series)>	2
  60 | SemSearch_ES-11	Q0	<dbpedia:Austin_Powers:_Welcome_to_My_Underground_Lair!>	2
  61 | SemSearch_ES-11	Q0	<dbpedia:Mike_Myers>	2
  62 | SemSearch_ES-11	Q0	<dbpedia:Felicity_Shagwell>	1
  63 | SemSearch_ES-11	Q0	<dbpedia:Austin_Powers>	2
  64 | SemSearch_ES-11	Q0	<dbpedia:Number_2_(Austin_Powers)>	1
  65 | SemSearch_ES-11	Q0	<dbpedia:Austin_Powers_Pinball>	2
  66 | SemSearch_ES-11	Q0	<dbpedia:Nigel_Powers>	2
  67 | SemSearch_ES-11	Q0	<dbpedia:Austin_Powers_in_Goldmember>	2
  68 | SemSearch_ES-11	Q0	<dbpedia:Austin_Powers_(character)>	2
  69 | SemSearch_ES-11	Q0	<dbpedia:Austin_Powers:_The_Spy_Who_Shagged_Me>	2
  70 | SemSearch_ES-11	Q0	<dbpedia:Austin_Powers:_International_Man_of_Mystery>	2
  71 | SemSearch_ES-111	Q0	<dbpedia:Eagle_Rock>	2
  72 | SemSearch_ES-111	Q0	<dbpedia:Eagle_Rock_Records>	1
  73 | SemSearch_ES-111	Q0	<dbpedia:Eagle_Rock,_Virginia>	1
  74 | SemSearch_ES-111	Q0	<dbpedia:Eagle_Rock_High_School_(Los_Angeles,_California)>	1
  75 | SemSearch_ES-111	Q0	<dbpedia:Eagle_Rock,_Los_Angeles>	2
  76 | SemSearch_ES-112	Q0	<dbpedia:Expresso_Bongo>	1
  77 | SemSearch_ES-114	Q0	<dbpedia:No_Fun_Aloud>	1
  78 | SemSearch_ES-114	Q0	<dbpedia:I_Can't_Tell_You_Why>	1
  79 | SemSearch_ES-114	Q0	<dbpedia:Glenn_Frey_Live>	1
  80 | SemSearch_ES-114	Q0	<dbpedia:You_Belong_to_the_City>	1
  81 | SemSearch_ES-114	Q0	<dbpedia:Glenn_Frey>	2
  82 | SemSearch_ES-115	Q0	<dbpedia:Goodwill_Industries>	1
  83 | SemSearch_ES-115	Q0	<dbpedia:Indianapolis_Metropolitan_High_School>	1
  84 | SemSearch_ES-118	Q0	<dbpedia:Iowa_Energy>	2
  85 | SemSearch_ES-118	Q0	<dbpedia:Wind_power_in_Iowa>	1
  86 | SemSearch_ES-118	Q0	<dbpedia:MidAmerican_Energy_Company>	1
  87 | SemSearch_ES-118	Q0	<dbpedia:Renewable_energy_in_the_United_States>	1
  88 | SemSearch_ES-118	Q0	<dbpedia:Compressed_air_energy_storage>	2
  89 | SemSearch_ES-118	Q0	<dbpedia:Duane_Arnold_Energy_Center>	1
  90 | SemSearch_ES-119	Q0	<dbpedia:John_Elliott>	2
  91 | SemSearch_ES-119	Q0	<dbpedia:John_Elliott_Smart>	1
  92 | SemSearch_ES-119	Q0	<dbpedia:John_Elliott_(boxer)>	2
  93 | SemSearch_ES-119	Q0	<dbpedia:John_S._Elliott>	2
  94 | SemSearch_ES-119	Q0	<dbpedia:John_Elliott_Cairnes>	1
  95 | SemSearch_ES-119	Q0	<dbpedia:John_Elliott_(artist)>	2
  96 | SemSearch_ES-119	Q0	<dbpedia:John_Elliott_(defensive_lineman)>	2
  97 | SemSearch_ES-119	Q0	<dbpedia:Charlotte_Elliott>	1
  98 | SemSearch_ES-119	Q0	<dbpedia:John_Campbell_Elliott>	2
  99 | SemSearch_ES-119	Q0	<dbpedia:John_Elliott_(Georgia)>	2
 100 | SemSearch_ES-119	Q0	<dbpedia:John_Elliott_(businessman)>	2
 101 | SemSearch_ES-119	Q0	<dbpedia:John_Milton_Elliott>	2
 102 | SemSearch_ES-119	Q0	<dbpedia:John_Elliot_(songwriter)>	2
 103 | SemSearch_ES-119	Q0	<dbpedia:Jumbo_Elliott_(American_football)>	2
 104 | SemSearch_ES-119	Q0	<dbpedia:John_Elliott_(historian)>	2
 105 | SemSearch_ES-119	Q0	<dbpedia:Doc_Elliott>	2
 106 | SemSearch_ES-119	Q0	<dbpedia:Jumbo_Elliott>	1
 107 | SemSearch_ES-12	Q0	<dbpedia:Republic_of_Texas>	1
 108 | SemSearch_ES-12	Q0	<dbpedia:University_of_Texas_at_Austin_College_of_Communication>	1
 109 | SemSearch_ES-12	Q0	<dbpedia:Twenty-first_Texas_Legislature>	1
 110 | SemSearch_ES-12	Q0	<dbpedia:Austin_High_School_(Austin,_Texas)>	1
 111 | SemSearch_ES-12	Q0	<dbpedia:List_of_Austin_neighborhoods>	1
 112 | SemSearch_ES-12	Q0	<dbpedia:Texas>	1
 113 | SemSearch_ES-12	Q0	<dbpedia:Texas_Ranger_Division>	1
 114 | SemSearch_ES-12	Q0	<dbpedia:List_of_counties_in_Texas>	2
 115 | SemSearch_ES-12	Q0	<dbpedia:Austin,_Texas>	1
 116 | SemSearch_ES-12	Q0	<dbpedia:Austin_County,_Texas>	2
 117 | SemSearch_ES-12	Q0	<dbpedia:University_of_Texas_at_Austin>	2
 118 | SemSearch_ES-12	Q0	<dbpedia:Stephen_F._Austin_High_School>	1
 119 | SemSearch_ES-12	Q0	<dbpedia:List_of_University_of_Texas_at_Austin_faculty>	2
 120 | SemSearch_ES-12	Q0	<dbpedia:Abner_Smith_Lipscomb>	1
 121 | SemSearch_ES-12	Q0	<dbpedia:Austin_Independent_School_District>	1
 122 | SemSearch_ES-120	Q0	<dbpedia:Allan_Lawrence>	1
 123 | SemSearch_ES-123	Q0	<dbpedia:Michael_Zimmerman_(biologist)>	2
 124 | SemSearch_ES-123	Q0	<dbpedia:Michael_Zimmerman_(jurist)>	2
 125 | SemSearch_ES-123	Q0	<dbpedia:Michael_Zimmerman>	1
 126 | SemSearch_ES-123	Q0	<dbpedia:Clergy_Letter_Project>	1
 127 | SemSearch_ES-123	Q0	<dbpedia:Michael_E._Zimmerman>	2
 128 | SemSearch_ES-123	Q0	<dbpedia:Michael_Zimmerman_(historian)>	2
 129 | SemSearch_ES-123	Q0	<dbpedia:Integral_ecology>	1
 130 | SemSearch_ES-124	Q0	<dbpedia:Motorola_Rokr>	1
 131 | SemSearch_ES-125	Q0	<dbpedia:List_of_Nokia_products>	1
 132 | SemSearch_ES-125	Q0	<dbpedia:Nokia_Eseries>	1
 133 | SemSearch_ES-127	Q0	<dbpedia:Personal_digital_assistant>	1
 134 | SemSearch_ES-127	Q0	<dbpedia:Handheld_game_console>	1
 135 | SemSearch_ES-127	Q0	<dbpedia:Palm_Tungsten>	1
 136 | SemSearch_ES-128	Q0	<dbpedia:Neufch%C3%A2tel_cheese>	1
 137 | SemSearch_ES-129	Q0	<dbpedia:Little_Caesars>	1
 138 | SemSearch_ES-13	Q0	<dbpedia:Banana_paper>	2
 139 | SemSearch_ES-13	Q0	<dbpedia:Paper_Clips_Project>	1
 140 | SemSearch_ES-130	Q0	<dbpedia:List_of_United_States_state_and_local_law_enforcement_agencies>	1
 141 | SemSearch_ES-130	Q0	<dbpedia:Devon_and_Cornwall_Police>	1
 142 | SemSearch_ES-130	Q0	<dbpedia:New_York_City_Police_Department>	1
 143 | SemSearch_ES-130	Q0	<dbpedia:List_of_law_enforcement_agencies_in_Massachusetts>	1
 144 | SemSearch_ES-131	Q0	<dbpedia:Rockstar_San_Diego>	1
 145 | SemSearch_ES-131	Q0	<dbpedia:SCPA>	1
 146 | SemSearch_ES-131	Q0	<dbpedia:San_Diego_Gulls_(1966%E2%80%931974)>	1
 147 | SemSearch_ES-131	Q0	<dbpedia:San_Diego_School_of_Creative_and_Performing_Arts>	1
 148 | SemSearch_ES-131	Q0	<dbpedia:University_of_San_Diego_(disambiguation)>	1
 149 | SemSearch_ES-132	Q0	<dbpedia:Sealy_Corporation>	2
 150 | SemSearch_ES-133	Q0	<dbpedia:Sedona,_Arizona>	1
 151 | SemSearch_ES-134	Q0	<dbpedia:Skye_Chandler>	1
 152 | SemSearch_ES-135	Q0	<dbpedia:Spring_(store)>	1
 153 | SemSearch_ES-136	Q0	<dbpedia:Ministry_of_External_Affairs_and_Defence>	1
 154 | SemSearch_ES-136	Q0	<dbpedia:National_Security_Council_of_Sri_Lanka>	1
 155 | SemSearch_ES-136	Q0	<dbpedia:Provincial_Governors_of_Sri_Lanka>	1
 156 | SemSearch_ES-136	Q0	<dbpedia:Sri_Lankan_Civil_War>	1
 157 | SemSearch_ES-136	Q0	<dbpedia:Tamil_Eelam>	1
 158 | SemSearch_ES-136	Q0	<dbpedia:Sri_Lanka_Ports_Authority>	1
 159 | SemSearch_ES-136	Q0	<dbpedia:Auditor_General_of_Sri_Lanka>	1
 160 | SemSearch_ES-136	Q0	<dbpedia:Divisional_Secretariats_of_Sri_Lanka>	1
 161 | SemSearch_ES-137	Q0	<dbpedia:Steak_frites>	1
 162 | SemSearch_ES-138	Q0	<dbpedia:Society_for_the_Prevention_of_Cruelty_to_Animals,_Monterey_County,_California>	1
 163 | SemSearch_ES-138	Q0	<dbpedia:San_Francisco_SPCA>	1
 164 | SemSearch_ES-138	Q0	<dbpedia:Society_for_the_Prevention_of_Cruelty_to_Animals_(Hong_Kong)>	1
 165 | SemSearch_ES-139	Q0	<dbpedia:Steakhouse>	1
 166 | SemSearch_ES-139	Q0	<dbpedia:The_Big_Texan_Steak_Ranch>	1
 167 | SemSearch_ES-14	Q0	<dbpedia:Ben_Franklin,_Texas>	2
 168 | SemSearch_ES-14	Q0	<dbpedia:Benjamin_Franklin>	2
 169 | SemSearch_ES-14	Q0	<dbpedia:Ben_Franklin_in_Paris>	2
 170 | SemSearch_ES-14	Q0	<dbpedia:Ben_Franklin_House>	1
 171 | SemSearch_ES-14	Q0	<dbpedia:Ben_Franklin_(The_Office)>	2
 172 | SemSearch_ES-14	Q0	<dbpedia:Ben_Franklin_Stores>	1
 173 | SemSearch_ES-14	Q0	<dbpedia:Ben_Franklin_effect>	1
 174 | SemSearch_ES-14	Q0	<dbpedia:Benjamin_Franklin_(disambiguation)>	2
 175 | SemSearch_ES-14	Q0	<dbpedia:Benjamin_Franklin_Bridge>	1
 176 | SemSearch_ES-14	Q0	<dbpedia:Ben_Franklin_(PX-15)>	2
 177 | SemSearch_ES-14	Q0	<dbpedia:Franklin_Society>	1
 178 | SemSearch_ES-14	Q0	<dbpedia:Ben_Franklin_(Canadian_politician)>	2
 179 | SemSearch_ES-14	Q0	<dbpedia:Benjamin_Butler>	2
 180 | SemSearch_ES-140	Q0	<dbpedia:Toledo_Storm>	1
 181 | SemSearch_ES-140	Q0	<dbpedia:Toledo,_Ohio>	1
 182 | SemSearch_ES-141	Q0	<dbpedia:Ventura,_California>	1
 183 | SemSearch_ES-141	Q0	<dbpedia:Ventura_County_Courthouse>	1
 184 | SemSearch_ES-141	Q0	<dbpedia:Ventura_County,_California>	1
 185 | SemSearch_ES-141	Q0	<dbpedia:Ventura_County_Sheriff's_Department>	1
 186 | SemSearch_ES-142	Q0	<dbpedia:Windsor_Hotel_(Americus,_Georgia)>	1
 187 | SemSearch_ES-142	Q0	<dbpedia:The_Windsor_Hotel_Toya_Resort_&_Spa>	1
 188 | SemSearch_ES-142	Q0	<dbpedia:Windsor_Hotel>	1
 189 | SemSearch_ES-15	Q0	<dbpedia:BMO_Harris_Bradley_Center>	2
 190 | SemSearch_ES-15	Q0	<dbpedia:Bill_Bradley_Center>	1
 191 | SemSearch_ES-15	Q0	<dbpedia:Bradley_M._Kuhn>	1
 192 | SemSearch_ES-15	Q0	<dbpedia:Shawn_Bradley>	1
 193 | SemSearch_ES-16	Q0	<dbpedia:Over_the_Brooklyn_Bridge>	1
 194 | SemSearch_ES-16	Q0	<dbpedia:Brooklyn_Bridge_(album)>	2
 195 | SemSearch_ES-16	Q0	<dbpedia:Brooklyn_Bridge_trolleys>	1
 196 | SemSearch_ES-16	Q0	<dbpedia:Brooklyn_Bridge_%E2%80%93_City_Hall_/_Chambers_Street_(New_York_City_Subway)>	2
 197 | SemSearch_ES-16	Q0	<dbpedia:Brooklyn_Bridge>	2
 198 | SemSearch_ES-16	Q0	<dbpedia:Brooklyn_Bridge_(TV_series)>	1
 199 | SemSearch_ES-16	Q0	<dbpedia:Brooklyn_Bridge_shooting>	1
 200 | SemSearch_ES-16	Q0	<dbpedia:The_Second_Brooklyn_Bridge>	1
 201 | SemSearch_ES-16	Q0	<dbpedia:Dumbo,_Brooklyn>	1
 202 | SemSearch_ES-16	Q0	<dbpedia:Brooklyn_Bridge_(disambiguation)>	1
 203 | SemSearch_ES-16	Q0	<dbpedia:Brooklyn_Bridge_Park>	1
 204 | SemSearch_ES-16	Q0	<dbpedia:Johnny_Maestro_and_The_Brooklyn_Bridge_(album)>	2
 205 | SemSearch_ES-16	Q0	<dbpedia:Live_from_Under_the_Brooklyn_Bridge>	1
 206 | SemSearch_ES-16	Q0	<dbpedia:Kosciuszko_Bridge_(New_York_City)>	1
 207 | SemSearch_ES-16	Q0	<dbpedia:Brooklyn>	1
 208 | SemSearch_ES-17	Q0	<dbpedia:Montana_Tech_of_the_University_of_Montana>	1
 209 | SemSearch_ES-17	Q0	<dbpedia:Heart_Butte,_Montana>	1
 210 | SemSearch_ES-17	Q0	<dbpedia:Interstate_15>	1
 211 | SemSearch_ES-17	Q0	<dbpedia:Butte_High_School_(Butte,_Montana)>	2
 212 | SemSearch_ES-17	Q0	<dbpedia:Butte_(disambiguation)>	1
 213 | SemSearch_ES-17	Q0	<dbpedia:Butte,_Montana>	2
 214 | SemSearch_ES-17	Q0	<dbpedia:Saddle_Butte,_Montana>	1
 215 | SemSearch_ES-17	Q0	<dbpedia:Grand_Junction_Rockies>	2
 216 | SemSearch_ES-17	Q0	<dbpedia:Butte_Central_Catholic_High_School>	2
 217 | SemSearch_ES-17	Q0	<dbpedia:George_Burton_(actor)>	1
 218 | SemSearch_ES-17	Q0	<dbpedia:Butte-Anaconda_Historic_District>	2
 219 | SemSearch_ES-17	Q0	<dbpedia:United_States_Post_Office_(Butte,_Montana)>	1
 220 | SemSearch_ES-17	Q0	<dbpedia:Butte,_Anaconda_and_Pacific_Railway>	1
 221 | SemSearch_ES-18	Q0	<dbpedia:Canasta>	2
 222 | SemSearch_ES-18	Q0	<dbpedia:Meld_(cards)>	2
 223 | SemSearch_ES-18	Q0	<dbpedia:Philip_Orbanes>	1
 224 | SemSearch_ES-19	Q0	<dbpedia:Tom_Tellez_Track_at_Carl_Lewis_International_Complex>	1
 225 | SemSearch_ES-19	Q0	<dbpedia:Archie_Hahn>	1
 226 | SemSearch_ES-19	Q0	<dbpedia:Frederick_Lewis>	2
 227 | SemSearch_ES-19	Q0	<dbpedia:Lewis_Hamilton>	1
 228 | SemSearch_ES-19	Q0	<dbpedia:Carl_Lewis>	2
 229 | SemSearch_ES-2	Q0	<dbpedia:Jimmy_Skinner>	1
 230 | SemSearch_ES-2	Q0	<dbpedia:B._F._Skinner>	2
 231 | SemSearch_ES-2	Q0	<dbpedia:Rate_of_reinforcement>	1
 232 | SemSearch_ES-2	Q0	<dbpedia:Beyond_Freedom_and_Dignity>	2
 233 | SemSearch_ES-2	Q0	<dbpedia:Tact_(psychology)>	1
 234 | SemSearch_ES-20	Q0	<dbpedia:List_of_newspapers_in_North_Carolina>	1
 235 | SemSearch_ES-20	Q0	<dbpedia:List_of_bicycle_routes_in_North_Carolina>	1
 236 | SemSearch_ES-20	Q0	<dbpedia:List_of_numbered_highways_in_South_Carolina>	1
 237 | SemSearch_ES-20	Q0	<dbpedia:List_of_airports_in_North_Carolina>	1
 238 | SemSearch_ES-20	Q0	<dbpedia:Eastern_box_turtle>	1
 239 | SemSearch_ES-20	Q0	<dbpedia:List_of_cities_and_towns_in_South_Carolina>	1
 240 | SemSearch_ES-20	Q0	<dbpedia:Western_North_Carolina>	1
 241 | SemSearch_ES-20	Q0	<dbpedia:Sullivan's_Island,_South_Carolina>	1
 242 | SemSearch_ES-20	Q0	<dbpedia:South_Carolina_Railroad>	1
 243 | SemSearch_ES-20	Q0	<dbpedia:Index_of_North_Carolina-related_articles>	1
 244 | SemSearch_ES-20	Q0	<dbpedia:List_of_hospitals_in_North_Carolina>	1
 245 | SemSearch_ES-20	Q0	<dbpedia:North_Carolina_Highway_System>	1
 246 | SemSearch_ES-20	Q0	<dbpedia:Cherryville,_North_Carolina>	1
 247 | SemSearch_ES-20	Q0	<dbpedia:Carolina,_North_Carolina>	2
 248 | SemSearch_ES-20	Q0	<dbpedia:Meggett,_South_Carolina>	1
 249 | SemSearch_ES-20	Q0	<dbpedia:South_Carolina>	1
 250 | SemSearch_ES-20	Q0	<dbpedia:South_Carolina_locations_by_per_capita_income>	1
 251 | SemSearch_ES-20	Q0	<dbpedia:Carolina,_Alamance_County,_North_Carolina>	1
 252 | SemSearch_ES-20	Q0	<dbpedia:Jericho,_North_Carolina>	1
 253 | SemSearch_ES-20	Q0	<dbpedia:United_States_presidential_election_in_South_Carolina,_2004>	1
 254 | SemSearch_ES-20	Q0	<dbpedia:2008_North_Carolina_Tar_Heels_football_team>	1
 255 | SemSearch_ES-20	Q0	<dbpedia:Grifton,_North_Carolina>	1
 256 | SemSearch_ES-21	Q0	<dbpedia:Charles_Darwin_University>	2
 257 | SemSearch_ES-21	Q0	<dbpedia:Darwin%E2%80%93Wedgwood_family>	1
 258 | SemSearch_ES-21	Q0	<dbpedia:The_Genius_of_Charles_Darwin>	1
 259 | SemSearch_ES-21	Q0	<dbpedia:The_Life_of_Erasmus_Darwin>	1
 260 | SemSearch_ES-21	Q0	<dbpedia:Charles_Waring_Darwin>	2
 261 | SemSearch_ES-21	Q0	<dbpedia:Darwin,_Northern_Territory>	2
 262 | SemSearch_ES-21	Q0	<dbpedia:The_Autobiography_of_Charles_Darwin>	2
 263 | SemSearch_ES-21	Q0	<dbpedia:Correspondence_of_Charles_Darwin>	1
 264 | SemSearch_ES-21	Q0	<dbpedia:Charles_Darwin_bibliography>	2
 265 | SemSearch_ES-21	Q0	<dbpedia:Charles_Galton_Darwin>	2
 266 | SemSearch_ES-21	Q0	<dbpedia:Charles_Darwin>	1
 267 | SemSearch_ES-21	Q0	<dbpedia:Erasmus_Darwin_(disambiguation)>	1
 268 | SemSearch_ES-21	Q0	<dbpedia:Religious_views_of_Charles_Darwin>	1
 269 | SemSearch_ES-21	Q0	<dbpedia:On_the_Origin_of_Species>	1
 270 | SemSearch_ES-21	Q0	<dbpedia:List_of_things_named_after_Charles_Darwin>	1
 271 | SemSearch_ES-21	Q0	<dbpedia:Darwin's_fox>	1
 272 | SemSearch_ES-21	Q0	<dbpedia:The_Complete_Works_of_Charles_Darwin_Online>	1
 273 | SemSearch_ES-21	Q0	<dbpedia:Charles_Darwin's_education>	2
 274 | SemSearch_ES-22	Q0	<dbpedia:Sports_in_Charlotte,_North_Carolina>	1
 275 | SemSearch_ES-22	Q0	<dbpedia:Charlotte_(disambiguation)>	1
 276 | SemSearch_ES-22	Q0	<dbpedia:Charlotte,_North_Carolina>	2
 277 | SemSearch_ES-22	Q0	<dbpedia:Queen_Charlotte_City_Water_Aerodrome>	1
 278 | SemSearch_ES-22	Q0	<dbpedia:Charlotte_metropolitan_area>	2
 279 | SemSearch_ES-22	Q0	<dbpedia:School_District_50_Haida_Gwaii>	1
 280 | SemSearch_ES-22	Q0	<dbpedia:University_of_North_Carolina_at_Charlotte>	1
 281 | SemSearch_ES-22	Q0	<dbpedia:Eastland_(Charlotte_neighborhood)>	1
 282 | SemSearch_ES-22	Q0	<dbpedia:Charlotte-Mecklenburg_Police_Department>	1
 283 | SemSearch_ES-22	Q0	<dbpedia:Center_City_Corridor_(LYNX)>	2
 284 | SemSearch_ES-22	Q0	<dbpedia:Charlotte_center_city>	1
 285 | SemSearch_ES-22	Q0	<dbpedia:List_of_city_council_members_in_Charlotte>	1
 286 | SemSearch_ES-22	Q0	<dbpedia:North_Charlotte_(Charlotte_neighborhood)>	2
 287 | SemSearch_ES-22	Q0	<dbpedia:List_of_Charlotte_neighborhoods>	2
 288 | SemSearch_ES-23	Q0	<dbpedia:Culture_in_Virginia_Beach>	2
 289 | SemSearch_ES-23	Q0	<dbpedia:List_of_cities_in_Virginia>	1
 290 | SemSearch_ES-23	Q0	<dbpedia:Meyera_E._Oberndorf>	1
 291 | SemSearch_ES-23	Q0	<dbpedia:Floyd_E._Kellam_High_School>	1
 292 | SemSearch_ES-23	Q0	<dbpedia:Pungo,_Virginia>	1
 293 | SemSearch_ES-23	Q0	<dbpedia:Princess_Anne,_Virginia>	1
 294 | SemSearch_ES-23	Q0	<dbpedia:Lynnwood,_Virginia_Beach,_Virginia>	1
 295 | SemSearch_ES-23	Q0	<dbpedia:Croatan_Beach,_Virginia>	2
 296 | SemSearch_ES-23	Q0	<dbpedia:London_Bridge,_Virginia>	1
 297 | SemSearch_ES-23	Q0	<dbpedia:Virginia_Beach_City_Public_Schools>	1
 298 | SemSearch_ES-23	Q0	<dbpedia:Salem,_Virginia_Beach,_Virginia>	2
 299 | SemSearch_ES-23	Q0	<dbpedia:Hampton_Roads>	1
 300 | SemSearch_ES-23	Q0	<dbpedia:U.S._Route_60_in_Virginia>	1
 301 | SemSearch_ES-23	Q0	<dbpedia:Virginia_Beach,_Virginia>	2
 302 | SemSearch_ES-23	Q0	<dbpedia:National_Register_of_Historic_Places_listings_in_Virginia>	1
 303 | SemSearch_ES-23	Q0	<dbpedia:Virginia_Beach_Sportsplex>	1
 304 | SemSearch_ES-23	Q0	<dbpedia:Sigma,_Virginia>	2
 305 | SemSearch_ES-23	Q0	<dbpedia:History_of_Virginia>	1
 306 | SemSearch_ES-23	Q0	<dbpedia:Virginia_City_(film)>	1
 307 | SemSearch_ES-23	Q0	<dbpedia:Bayside,_Virginia_Beach,_Virginia>	1
 308 | SemSearch_ES-23	Q0	<dbpedia:Norfolk,_Virginia>	1
 309 | SemSearch_ES-23	Q0	<dbpedia:Virginia_Beach_Boulevard>	1
 310 | SemSearch_ES-23	Q0	<dbpedia:Virginia_statistical_areas>	1
 311 | SemSearch_ES-23	Q0	<dbpedia:Interstate_264_(Virginia)>	1
 312 | SemSearch_ES-23	Q0	<dbpedia:Thalia,_Virginia>	2
 313 | SemSearch_ES-23	Q0	<dbpedia:Stephens_City,_Virginia>	1
 314 | SemSearch_ES-24	Q0	<dbpedia:2008_Penn_State_Nittany_Lions_football_team>	1
 315 | SemSearch_ES-24	Q0	<dbpedia:South_Carolina>	1
 316 | SemSearch_ES-24	Q0	<dbpedia:Conway,_South_Carolina>	1
 317 | SemSearch_ES-24	Q0	<dbpedia:North_Carolina_Community_College_System>	2
 318 | SemSearch_ES-24	Q0	<dbpedia:Coastal_Carolina_Fair>	1
 319 | SemSearch_ES-24	Q0	<dbpedia:North_Carolina_Highway_System>	1
 320 | SemSearch_ES-24	Q0	<dbpedia:Myrtle_Beach,_South_Carolina>	1
 321 | SemSearch_ES-24	Q0	<dbpedia:North_Carolina_Councils_of_Governments>	1
 322 | SemSearch_ES-24	Q0	<dbpedia:Lucas_v._South_Carolina_Coastal_Council>	1
 323 | SemSearch_ES-24	Q0	<dbpedia:Coastal_Carolina_Community_College>	1
 324 | SemSearch_ES-24	Q0	<dbpedia:Coastal_Carolina_Regional_Airport>	1
 325 | SemSearch_ES-24	Q0	<dbpedia:Coastal_Carolina_University>	2
 326 | SemSearch_ES-24	Q0	<dbpedia:North_Carolina>	1
 327 | SemSearch_ES-24	Q0	<dbpedia:David_Bennett_(American_football)>	1
 328 | SemSearch_ES-24	Q0	<dbpedia:Carolina_Coastal_Railway>	1
 329 | SemSearch_ES-25	Q0	<dbpedia:Foolproof>	1
 330 | SemSearch_ES-25	Q0	<dbpedia:Suchet>	2
 331 | SemSearch_ES-25	Q0	<dbpedia:David_Suchet>	2
 332 | SemSearch_ES-26	Q0	<dbpedia:Disney's_Animal_Kingdom>	1
 333 | SemSearch_ES-26	Q0	<dbpedia:List_of_amusement_parks_in_Greater_Orlando>	1
 334 | SemSearch_ES-26	Q0	<dbpedia:Disney's_All-Star_Sports_Resort>	2
 335 | SemSearch_ES-26	Q0	<dbpedia:Walt_Disney_World_Speedway>	1
 336 | SemSearch_ES-26	Q0	<dbpedia:Downtown_Disney_(Walt_Disney_World)>	1
 337 | SemSearch_ES-26	Q0	<dbpedia:Walt_Disney_World>	1
 338 | SemSearch_ES-26	Q0	<dbpedia:Disney_Legends>	1
 339 | SemSearch_ES-26	Q0	<dbpedia:Disney's_Hollywood_Studios>	1
 340 | SemSearch_ES-26	Q0	<dbpedia:Emily_Bavar>	1
 341 | SemSearch_ES-26	Q0	<dbpedia:Team_Disney>	1
 342 | SemSearch_ES-27	Q0	<dbpedia:Earl_of_Ross>	1
 343 | SemSearch_ES-27	Q0	<dbpedia:Earl_Alexander>	1
 344 | SemSearch_ES-27	Q0	<dbpedia:Robert_Earl>	1
 345 | SemSearch_ES-27	Q0	<dbpedia:Earl_of_Carnarvon>	1
 346 | SemSearch_ES-28	Q0	<dbpedia:El_Salvador_national_football_team>	2
 347 | SemSearch_ES-28	Q0	<dbpedia:National_Anthem_of_El_Salvador>	2
 348 | SemSearch_ES-28	Q0	<dbpedia:San_Salvador>	2
 349 | SemSearch_ES-28	Q0	<dbpedia:2001_El_Salvador_earthquakes>	1
 350 | SemSearch_ES-28	Q0	<dbpedia:Scouting_and_Guiding_in_El_Salvador>	1
 351 | SemSearch_ES-28	Q0	<dbpedia:CR_El_Salvador>	1
 352 | SemSearch_ES-28	Q0	<dbpedia:Antonio_Jos%C3%A9_Ca%C3%B1as>	2
 353 | SemSearch_ES-28	Q0	<dbpedia:Coat_of_arms_of_El_Salvador>	1
 354 | SemSearch_ES-28	Q0	<dbpedia:List_of_cities_in_El_Salvador>	1
 355 | SemSearch_ES-28	Q0	<dbpedia:El_Carmen,_El_Salvador>	1
 356 | SemSearch_ES-28	Q0	<dbpedia:History_of_El_Salvador>	2
 357 | SemSearch_ES-28	Q0	<dbpedia:Santa_Ana,_El_Salvador>	2
 358 | SemSearch_ES-28	Q0	<dbpedia:Flag_of_El_Salvador>	2
 359 | SemSearch_ES-28	Q0	<dbpedia:San_Salvador_Department>	2
 360 | SemSearch_ES-28	Q0	<dbpedia:Ayutuxtepeque>	1
 361 | SemSearch_ES-28	Q0	<dbpedia:El_Salvador_(disambiguation)>	2
 362 | SemSearch_ES-28	Q0	<dbpedia:Santa_Tecla,_El_Salvador>	1
 363 | SemSearch_ES-28	Q0	<dbpedia:List_of_diplomatic_missions_in_El_Salvador>	1
 364 | SemSearch_ES-28	Q0	<dbpedia:Politics_of_El_Salvador>	1
 365 | SemSearch_ES-28	Q0	<dbpedia:El_Paisnal>	1
 366 | SemSearch_ES-28	Q0	<dbpedia:Villa_El_Salvador>	2
 367 | SemSearch_ES-28	Q0	<dbpedia:President_of_El_Salvador>	1
 368 | SemSearch_ES-28	Q0	<dbpedia:El_Salvador>	2
 369 | SemSearch_ES-28	Q0	<dbpedia:El_Salvador_at_the_Olympics>	1
 370 | SemSearch_ES-28	Q0	<dbpedia:El_Salvador,_Cuba>	2
 371 | SemSearch_ES-28	Q0	<dbpedia:Asociaci%C3%B3n_de_Muchachas_Gu%C3%ADas_de_El_Salvador>	1
 372 | SemSearch_ES-28	Q0	<dbpedia:Outline_of_El_Salvador>	2
 373 | SemSearch_ES-28	Q0	<dbpedia:San_Salvador_(disambiguation)>	2
 374 | SemSearch_ES-28	Q0	<dbpedia:Vuelta_a_El_Salvador>	2
 375 | SemSearch_ES-28	Q0	<dbpedia:San_Salvador_El_Salvador_Temple>	1
 376 | SemSearch_ES-28	Q0	<dbpedia:Arturo_Armando_Molina>	1
 377 | SemSearch_ES-28	Q0	<dbpedia:Central_Reserve_Bank_of_El_Salvador>	2
 378 | SemSearch_ES-28	Q0	<dbpedia:University_of_El_Salvador>	1
 379 | SemSearch_ES-29	Q0	<dbpedia:Sir_Ellis_Ellis-Griffith,_1st_Baronet>	1
 380 | SemSearch_ES-29	Q0	<dbpedia:New_York_Institute_of_Technology>	1
 381 | SemSearch_ES-29	Q0	<dbpedia:Ellis_Clarke>	1
 382 | SemSearch_ES-3	Q0	<dbpedia:The_New_Jersey_Book_Arts_Symposium>	1
 383 | SemSearch_ES-3	Q0	<dbpedia:Guild_of_Bookworkers>	2
 384 | SemSearch_ES-30	Q0	<dbpedia:Credit_limit>	1
 385 | SemSearch_ES-30	Q0	<dbpedia:Line_of_credit>	1
 386 | SemSearch_ES-30	Q0	<dbpedia:Signature_line_of_credit>	2
 387 | SemSearch_ES-30	Q0	<dbpedia:Home_equity_line_of_credit>	1
 388 | SemSearch_ES-30	Q0	<dbpedia:E-Loan>	1
 389 | SemSearch_ES-30	Q0	<dbpedia:Credit_One_Bank>	1
 390 | SemSearch_ES-30	Q0	<dbpedia:Warehouse_line_of_credit>	1
 391 | SemSearch_ES-30	Q0	<dbpedia:Revolving_credit>	1
 392 | SemSearch_ES-31	Q0	<dbpedia:George_W._Emery>	1
 393 | SemSearch_ES-31	Q0	<dbpedia:Emery,_Utah>	1
 394 | SemSearch_ES-31	Q0	<dbpedia:Ray_Emery>	1
 395 | SemSearch_ES-31	Q0	<dbpedia:Matthew_Gault_Emery>	1
 396 | SemSearch_ES-31	Q0	<dbpedia:Carlo_Emery>	2
 397 | SemSearch_ES-31	Q0	<dbpedia:Cal_Emery>	1
 398 | SemSearch_ES-31	Q0	<dbpedia:John_J._Emery>	2
 399 | SemSearch_ES-31	Q0	<dbpedia:Joseph_Emery>	1
 400 | SemSearch_ES-31	Q0	<dbpedia:Emery>	1
 401 | SemSearch_ES-31	Q0	<dbpedia:Ed_Emery>	1
 402 | SemSearch_ES-31	Q0	<dbpedia:Walter_Bryan_Emery>	2
 403 | SemSearch_ES-31	Q0	<dbpedia:Dick_Emery>	2
 404 | SemSearch_ES-31	Q0	<dbpedia:Gideon_Emery>	1
 405 | SemSearch_ES-31	Q0	<dbpedia:Audrey_Emery>	2
 406 | SemSearch_ES-31	Q0	<dbpedia:Sonya_Emery>	1
 407 | SemSearch_ES-31	Q0	<dbpedia:Addison_Emery_Verrill>	1
 408 | SemSearch_ES-31	Q0	<dbpedia:Emery_Grover_Building>	1
 409 | SemSearch_ES-31	Q0	<dbpedia:Jill_Emery>	1
 410 | SemSearch_ES-31	Q0	<dbpedia:Emery_County,_Utah>	1
 411 | SemSearch_ES-31	Q0	<dbpedia:Emery,_Wisconsin>	1
 412 | SemSearch_ES-32	Q0	<dbpedia:Chambersburg,_Pennsylvania>	1
 413 | SemSearch_ES-33	Q0	<dbpedia:Magical_objects_in_Harry_Potter>	2
 414 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Deathly_Hallows>	2
 415 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter>	2
 416 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Chamber_of_Secrets_(film)>	2
 417 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_(character)>	2
 418 | SemSearch_ES-33	Q0	<dbpedia:Magical_creatures_in_Harry_Potter>	1
 419 | SemSearch_ES-33	Q0	<dbpedia:List_of_supporting_Harry_Potter_characters>	2
 420 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Half-Blood_Prince>	2
 421 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Chamber_of_Secrets_(video_game)>	2
 422 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_(disambiguation)>	2
 423 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Chamber_of_Secrets>	2
 424 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Order_of_the_Phoenix>	2
 425 | SemSearch_ES-33	Q0	<dbpedia:J._K._Rowling>	1
 426 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Prisoner_of_Azkaban>	1
 427 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Half-Blood_Prince_(video_game)>	1
 428 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter:_Quidditch_World_Cup>	1
 429 | SemSearch_ES-33	Q0	<dbpedia:Controversy_over_the_Harry_Potter_series>	2
 430 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Order_of_the_Phoenix_(film)>	2
 431 | SemSearch_ES-33	Q0	<dbpedia:List_of_Harry_Potter-related_topics>	2
 432 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Prisoner_of_Azkaban_(video_game)>	1
 433 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Half-Blood_Prince_(film)>	2
 434 | SemSearch_ES-33	Q0	<dbpedia:Albus_Dumbledore>	1
 435 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Goblet_of_Fire_(video_game)>	2
 436 | SemSearch_ES-33	Q0	<dbpedia:List_of_Harry_Potter_characters>	2
 437 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Goblet_of_Fire>	2
 438 | SemSearch_ES-33	Q0	<dbpedia:Bonnie_Wright>	1
 439 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Goblet_of_Fire_(film)>	2
 440 | SemSearch_ES-33	Q0	<dbpedia:Harry_Potter_and_the_Prisoner_of_Azkaban_(film)>	2
 441 | SemSearch_ES-33	Q0	<dbpedia:Places_in_Harry_Potter>	2
 442 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Deathly_Hallows_%E2%80%93_Part_1>	2
 443 | SemSearch_ES-34	Q0	<dbpedia:Magical_creatures_in_Harry_Potter>	2
 444 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Chamber_of_Secrets_(film)>	2
 445 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Order_of_the_Phoenix_(film)>	2
 446 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Order_of_the_Phoenix_(video_game)>	1
 447 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Deathly_Hallows>	1
 448 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_(disambiguation)>	2
 449 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Chamber_of_Secrets>	2
 450 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Half-Blood_Prince_(film)>	2
 451 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Philosopher's_Stone_(film)>	2
 452 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Chamber_of_Secrets_(video_game)>	1
 453 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Goblet_of_Fire_(film)>	2
 454 | SemSearch_ES-34	Q0	<dbpedia:Harry_Potter_and_the_Prisoner_of_Azkaban_(film)>	2
 455 | SemSearch_ES-35	Q0	<dbpedia:University_of_Cincinnati>	1
 456 | SemSearch_ES-35	Q0	<dbpedia:University_of_Cincinnati_College_of_Law>	1
 457 | SemSearch_ES-35	Q0	<dbpedia:Cincinnati_Masters>	1
 458 | SemSearch_ES-35	Q0	<dbpedia:Hyde_Park,_Cincinnati>	1
 459 | SemSearch_ES-35	Q0	<dbpedia:2007_PapaJohns.com_Bowl>	1
 460 | SemSearch_ES-35	Q0	<dbpedia:Cincinnati_Police_Department>	1
 461 | SemSearch_ES-36	Q0	<dbpedia:Bo_Welch>	1
 462 | SemSearch_ES-36	Q0	<dbpedia:Batman>	1
 463 | SemSearch_ES-36	Q0	<dbpedia:Batman_(TV_series)>	1
 464 | SemSearch_ES-36	Q0	<dbpedia:Batman_Beyond:_Return_of_the_Joker_(video_game)>	2
 465 | SemSearch_ES-36	Q0	<dbpedia:Michael_Uslan>	1
 466 | SemSearch_ES-36	Q0	<dbpedia:The_New_Batman_Adventures>	2
 467 | SemSearch_ES-36	Q0	<dbpedia:Heart_of_Ice_(Batman:_The_Animated_Series_episode)>	2
 468 | SemSearch_ES-36	Q0	<dbpedia:Batman_Returns_(soundtrack)>	1
 469 | SemSearch_ES-36	Q0	<dbpedia:Batman,_Turkey>	1
 470 | SemSearch_ES-36	Q0	<dbpedia:List_of_Batman_animated_episodes>	3
 471 | SemSearch_ES-36	Q0	<dbpedia:Perchance_to_Dream_(Batman:_The_Animated_Series)>	1
 472 | SemSearch_ES-36	Q0	<dbpedia:Batman_(score)>	1
 473 | SemSearch_ES-36	Q0	<dbpedia:Hasankeyf>	1
 474 | SemSearch_ES-36	Q0	<dbpedia:Batman_Returns>	2
 475 | SemSearch_ES-36	Q0	<dbpedia:Batman_Returns_(video_game)>	2
 476 | SemSearch_ES-36	Q0	<dbpedia:Batman_Forever>	2
 477 | SemSearch_ES-36	Q0	<dbpedia:William_A._Graham_(director)>	1
 478 | SemSearch_ES-36	Q0	<dbpedia:Batman_(1989_film)>	1
 479 | SemSearch_ES-36	Q0	<dbpedia:Batman_Beyond:_Return_of_the_Joker>	1
 480 | SemSearch_ES-36	Q0	<dbpedia:Batmobile>	1
 481 | SemSearch_ES-37	Q0	<dbpedia:Upside_Down_(Jack_Johnson_song)>	2
 482 | SemSearch_ES-37	Q0	<dbpedia:Jack_Johnson_(boxer)>	2
 483 | SemSearch_ES-37	Q0	<dbpedia:Jack_Johnson_(politician)>	1
 484 | SemSearch_ES-37	Q0	<dbpedia:Robert_Price>	2
 485 | SemSearch_ES-37	Q0	<dbpedia:Jack_B._Johnson>	2
 486 | SemSearch_ES-37	Q0	<dbpedia:Jack_Johnson_(musician)>	2
 487 | SemSearch_ES-37	Q0	<dbpedia:The_Complete_Jack_Johnson_Sessions>	1
 488 | SemSearch_ES-37	Q0	<dbpedia:Jack_Johnson_(ice_hockey)>	2
 489 | SemSearch_ES-37	Q0	<dbpedia:A_Tribute_to_Jack_Johnson>	2
 490 | SemSearch_ES-37	Q0	<dbpedia:Big_Jack_Johnson>	2
 491 | SemSearch_ES-37	Q0	<dbpedia:Jack_Johnson>	2
 492 | SemSearch_ES-37	Q0	<dbpedia:Breakdown_(Jack_Johnson_song)>	2
 493 | SemSearch_ES-37	Q0	<dbpedia:Unforgivable_Blackness:_The_Rise_and_Fall_of_Jack_Johnson>	1
 494 | SemSearch_ES-37	Q0	<dbpedia:Jack_Johnson_(actor)>	2
 495 | SemSearch_ES-38	Q0	<dbpedia:Jack_the_Ripper_(song)>	1
 496 | SemSearch_ES-38	Q0	<dbpedia:Jack_the_Ripper,_Light-Hearted_Friend>	2
 497 | SemSearch_ES-38	Q0	<dbpedia:John_Netley>	1
 498 | SemSearch_ES-38	Q0	<dbpedia:Jack_the_Ripper_(disambiguation)>	2
 499 | SemSearch_ES-38	Q0	<dbpedia:Hands_of_the_Ripper>	1
 500 | SemSearch_ES-38	Q0	<dbpedia:Jack_the_Ripper_suspects>	2
 501 | SemSearch_ES-38	Q0	<dbpedia:From_Hell_letter>	1
 502 | SemSearch_ES-38	Q0	<dbpedia:Casebook:_Jack_the_Ripper>	2
 503 | SemSearch_ES-38	Q0	<dbpedia:Jack_the_Ripper:_The_Final_Solution>	2
 504 | SemSearch_ES-38	Q0	<dbpedia:Jack_the_Ripper_in_fiction>	1
 505 | SemSearch_ES-38	Q0	<dbpedia:Jack_the_Ripper>	2
 506 | SemSearch_ES-39	Q0	<dbpedia:Idaho_High_School_Activities_Association>	1
 507 | SemSearch_ES-39	Q0	<dbpedia:Villa_Walsh_Academy>	1
 508 | SemSearch_ES-39	Q0	<dbpedia:Caldwell,_New_Jersey>	1
 509 | SemSearch_ES-39	Q0	<dbpedia:Caldwell_High_School>	2
 510 | SemSearch_ES-39	Q0	<dbpedia:James_Caldwell_High_School>	2
 511 | SemSearch_ES-39	Q0	<dbpedia:Caldwell_High_School_(Caldwell,_Texas)>	1
 512 | SemSearch_ES-39	Q0	<dbpedia:Caldwell_High_School_(Caldwell,_Ohio)>	2
 513 | SemSearch_ES-39	Q0	<dbpedia:Caldwell_High_School_(Caldwell,_Idaho)>	2
 514 | SemSearch_ES-4	Q0	<dbpedia:35th_NAACP_Image_Awards>	2
 515 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_for_Outstanding_Song>	2
 516 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_%E2%80%93_Chairman's_Award>	2
 517 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_%E2%80%93_President's_Award>	2
 518 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_for_Outstanding_Motion_Picture>	2
 519 | SemSearch_ES-4	Q0	<dbpedia:37th_NAACP_Image_Awards>	2
 520 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_for_Outstanding_Actress_in_a_Motion_Picture>	2
 521 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_for_Outstanding_Actor_in_a_Daytime_Drama_Series>	2
 522 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_for_Outstanding_Literary_Work,_Children's>	2
 523 | SemSearch_ES-4	Q0	<dbpedia:38th_NAACP_Image_Awards>	2
 524 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award>	2
 525 | SemSearch_ES-4	Q0	<dbpedia:36th_NAACP_Image_Awards>	2
 526 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_for_Outstanding_Male_Artist>	2
 527 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_for_Outstanding_Duo_or_Group>	2
 528 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_for_Outstanding_Actress_in_a_Daytime_Drama_Series>	1
 529 | SemSearch_ES-4	Q0	<dbpedia:NAACP_Image_Award_for_Outstanding_Literary_Work,_Nonfiction>	2
 530 | SemSearch_ES-40	Q0	<dbpedia:James_Clayton_(priest)>	2
 531 | SemSearch_ES-40	Q0	<dbpedia:James_Newton_Howard>	1
 532 | SemSearch_ES-40	Q0	<dbpedia:Thomas_Clayton>	2
 533 | SemSearch_ES-40	Q0	<dbpedia:United_States_national_amateur_boxing_middleweight_champions>	1
 534 | SemSearch_ES-41	Q0	<dbpedia:Saint_Joan_of_Arc_(Sackville-West)>	2
 535 | SemSearch_ES-41	Q0	<dbpedia:Joan_of_Arc,_Dick_Cheney,_Mark_Twain>	1
 536 | SemSearch_ES-41	Q0	<dbpedia:The_Passion_of_Joan_of_Arc>	2
 537 | SemSearch_ES-41	Q0	<dbpedia:Name_of_Joan_of_Arc>	1
 538 | SemSearch_ES-41	Q0	<dbpedia:Joan_of_Arc_(1948_film)>	2
 539 | SemSearch_ES-41	Q0	<dbpedia:The_Trial_of_Joan_of_Arc>	2
 540 | SemSearch_ES-41	Q0	<dbpedia:Alternative_historical_interpretations_of_Joan_of_Arc>	1
 541 | SemSearch_ES-41	Q0	<dbpedia:Timeline_of_Joan_of_Arc>	1
 542 | SemSearch_ES-41	Q0	<dbpedia:Joan_of_Arc_(miniseries)>	1
 543 | SemSearch_ES-41	Q0	<dbpedia:Joan_of_Arc_(band)>	2
 544 | SemSearch_ES-41	Q0	<dbpedia:Joan_of_Arc>	2
 545 | SemSearch_ES-41	Q0	<dbpedia:Clone_High>	2
 546 | SemSearch_ES-41	Q0	<dbpedia:The_Messenger:_The_Story_of_Joan_of_Arc>	2
 547 | SemSearch_ES-41	Q0	<dbpedia:Maid_of_Orleans_(The_Waltz_Joan_of_Arc)>	1
 548 | SemSearch_ES-41	Q0	<dbpedia:Cultural_depictions_of_Joan_of_Arc>	2
 549 | SemSearch_ES-41	Q0	<dbpedia:Canonization_of_Joan_of_Arc>	2
 550 | SemSearch_ES-41	Q0	<dbpedia:Personal_Recollections_of_Joan_of_Arc>	1
 551 | SemSearch_ES-41	Q0	<dbpedia:Wars_and_Warriors:_Joan_of_Arc>	2
 552 | SemSearch_ES-42	Q0	<dbpedia:John_Maxwell_(writer)>	2
 553 | SemSearch_ES-42	Q0	<dbpedia:John_C._Maxwell>	2
 554 | SemSearch_ES-42	Q0	<dbpedia:John_Alan_Maxwell>	2
 555 | SemSearch_ES-42	Q0	<dbpedia:John_Maxwell_(British_Army_officer)>	2
 556 | SemSearch_ES-42	Q0	<dbpedia:John_Maxwell,_1st_Baron_Farnham>	1
 557 | SemSearch_ES-42	Q0	<dbpedia:John_Preston_Maxwell>	2
 558 | SemSearch_ES-42	Q0	<dbpedia:John_Maxwell-Barry,_5th_Baron_Farnham>	1
 559 | SemSearch_ES-42	Q0	<dbpedia:John_Maxwell_(artist)>	1
 560 | SemSearch_ES-42	Q0	<dbpedia:Henry_Maxwell,_6th_Baron_Farnham>	2
 561 | SemSearch_ES-42	Q0	<dbpedia:John_Maxwell,_2nd_Earl_of_Farnham>	2
 562 | SemSearch_ES-42	Q0	<dbpedia:John_Maxwell>	2
 563 | SemSearch_ES-42	Q0	<dbpedia:John_Maxwell_(actor)>	2
 564 | SemSearch_ES-42	Q0	<dbpedia:John_Maxwell_(bishop)>	1
 565 | SemSearch_ES-45	Q0	<dbpedia:Once_in_a_Lifetime_(Keith_Urban_song)>	1
 566 | SemSearch_ES-45	Q0	<dbpedia:Keith_Urban>	2
 567 | SemSearch_ES-45	Q0	<dbpedia:Better_Life>	2
 568 | SemSearch_ES-45	Q0	<dbpedia:But_for_the_Grace_of_God>	2
 569 | SemSearch_ES-45	Q0	<dbpedia:Stupid_Boy>	2
 570 | SemSearch_ES-45	Q0	<dbpedia:Days_Go_By_(Keith_Urban_song)>	1
 571 | SemSearch_ES-45	Q0	<dbpedia:Everybody_(Keith_Urban_song)>	1
 572 | SemSearch_ES-45	Q0	<dbpedia:Greatest_Hits:_18_Kids>	2
 573 | SemSearch_ES-45	Q0	<dbpedia:Keith_Urban_(album)>	2
 574 | SemSearch_ES-45	Q0	<dbpedia:I_Told_You_So_(Keith_Urban_song)>	1
 575 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur_(video_game)>	1
 576 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur_(TV_series)>	1
 577 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur>	2
 578 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur's_Disasters>	2
 579 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur_Flour>	1
 580 | SemSearch_ES-47	Q0	<dbpedia:Arthur_Henry_King>	1
 581 | SemSearch_ES-47	Q0	<dbpedia:Arthur_King>	1
 582 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur_(film)>	2
 583 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur_(opera)>	2
 584 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur's_World>	1
 585 | SemSearch_ES-47	Q0	<dbpedia:List_of_works_based_on_Arthurian_legends>	2
 586 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur_&_the_Knights_of_Justice_(video_game)>	2
 587 | SemSearch_ES-47	Q0	<dbpedia:A_Kid_in_King_Arthur's_Court>	1
 588 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur_(disambiguation)>	2
 589 | SemSearch_ES-47	Q0	<dbpedia:King_Arthur_Was_a_Gentleman>	2
 590 | SemSearch_ES-47	Q0	<dbpedia:Unidentified_Flying_Oddball>	1
 591 | SemSearch_ES-48	Q0	<dbpedia:Le_Bec-Fin>	1
 592 | SemSearch_ES-48	Q0	<dbpedia:La_Mallorquina>	1
 593 | SemSearch_ES-48	Q0	<dbpedia:Declaration_of_Philadelphia>	1
 594 | SemSearch_ES-48	Q0	<dbpedia:Walnut_Hill,_Philadelphia>	1
 595 | SemSearch_ES-48	Q0	<dbpedia:Puerto_Plata_(city)>	1
 596 | SemSearch_ES-48	Q0	<dbpedia:La_Madeleine_(restaurant_chain)>	1
 597 | SemSearch_ES-49	Q0	<dbpedia:Susan_Whitson>	1
 598 | SemSearch_ES-49	Q0	<dbpedia:Laura_Bush>	2
 599 | SemSearch_ES-49	Q0	<dbpedia:Bush_family>	1
 600 | SemSearch_ES-49	Q0	<dbpedia:The_Family:_The_Real_Story_of_the_Bush_Dynasty>	1
 601 | SemSearch_ES-49	Q0	<dbpedia:Barbara_Bush>	2
 602 | SemSearch_ES-5	Q0	<dbpedia:Pleasant_Valley_Township,_Scott_County,_Iowa>	1
 603 | SemSearch_ES-5	Q0	<dbpedia:Scott_County>	2
 604 | SemSearch_ES-5	Q0	<dbpedia:Scott_Township,_Adams_County,_Ohio>	1
 605 | SemSearch_ES-5	Q0	<dbpedia:Scott_County,_Kansas>	2
 606 | SemSearch_ES-5	Q0	<dbpedia:Scott_County_Jail_Complex>	1
 607 | SemSearch_ES-5	Q0	<dbpedia:Scott_County,_Illinois>	2
 608 | SemSearch_ES-5	Q0	<dbpedia:Scott_County,_Iowa>	2
 609 | SemSearch_ES-5	Q0	<dbpedia:Scott_County,_Indiana>	2
 610 | SemSearch_ES-5	Q0	<dbpedia:Scott_County,_Kentucky>	2
 611 | SemSearch_ES-5	Q0	<dbpedia:Scott>	1
 612 | SemSearch_ES-5	Q0	<dbpedia:Scott_County,_Arkansas>	2
 613 | SemSearch_ES-5	Q0	<dbpedia:Scott_Township,_Brown_County,_Ohio>	2
 614 | SemSearch_ES-5	Q0	<dbpedia:Scott_County_Courthouse_(Scott_County,_Kentucky)>	2
 615 | SemSearch_ES-5	Q0	<dbpedia:Scott_County,_Virginia>	1
 616 | SemSearch_ES-5	Q0	<dbpedia:Scott_Township>	2
 617 | SemSearch_ES-5	Q0	<dbpedia:Scott_Township,_Pennsylvania>	1
 618 | SemSearch_ES-5	Q0	<dbpedia:Scott_County,_Minnesota>	2
 619 | SemSearch_ES-5	Q0	<dbpedia:Scott_Township,_Ohio>	2
 620 | SemSearch_ES-5	Q0	<dbpedia:Scott_County,_Tennessee>	2
 621 | SemSearch_ES-50	Q0	<dbpedia:The_Bob_&_Tom_Show>	1
 622 | SemSearch_ES-51	Q0	<dbpedia:Lexus_GX>	1
 623 | SemSearch_ES-51	Q0	<dbpedia:Maplewood,_Houston>	1
 624 | SemSearch_ES-51	Q0	<dbpedia:Maplewood,_Minnesota>	1
 625 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park,_Michigan>	2
 626 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park,_Colorado>	2
 627 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_(Dartmouth,_Massachusetts)>	2
 628 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park,_Denver>	1
 629 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_(disambiguation)>	2
 630 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_Airport>	1
 631 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_Zoo>	1
 632 | SemSearch_ES-52	Q0	<dbpedia:Lincoln>	1
 633 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park,_Texas>	2
 634 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_Performing_Arts_Charter_School>	2
 635 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_(San_Francisco)>	2
 636 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park,_Georgia>	2
 637 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park,_Rockville,_Maryland>	1
 638 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_Conservatory>	1
 639 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_Historic_District>	1
 640 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_(Washington,_D.C.)>	2
 641 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_State_Park>	1
 642 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Trail_State_Park>	2
 643 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park,_Chicago>	2
 644 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park,_Calgary>	1
 645 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_(NJT_station)>	1
 646 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park>	2
 647 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park_(Los_Angeles)>	1
 648 | SemSearch_ES-52	Q0	<dbpedia:Lincoln_Park,_New_Jersey>	2
 649 | SemSearch_ES-53	Q0	<dbpedia:Lynchburg_Expressway>	2
 650 | SemSearch_ES-53	Q0	<dbpedia:Lynchburg_metropolitan_area>	2
 651 | SemSearch_ES-53	Q0	<dbpedia:Virginia_State_Route_128>	1
 652 | SemSearch_ES-53	Q0	<dbpedia:Lynchburg,_Virginia>	2
 653 | SemSearch_ES-53	Q0	<dbpedia:Greater_Lynchburg_Transit_Company>	2
 654 | SemSearch_ES-53	Q0	<dbpedia:Lynchburg_Regional_Airport>	1
 655 | SemSearch_ES-53	Q0	<dbpedia:Brandon_Inge>	1
 656 | SemSearch_ES-53	Q0	<dbpedia:Lynchburg>	1
 657 | SemSearch_ES-53	Q0	<dbpedia:Greg_Booker>	1
 658 | SemSearch_ES-53	Q0	<dbpedia:List_of_former_state_highways_in_Virginia>	1
 659 | SemSearch_ES-53	Q0	<dbpedia:Lynchburg_%E2%80%93_Kemper_Street_Station>	1
 660 | SemSearch_ES-53	Q0	<dbpedia:Lynchburg_College>	2
 661 | SemSearch_ES-53	Q0	<dbpedia:Heritage_High_School_(Lynchburg,_Virginia)>	1
 662 | SemSearch_ES-53	Q0	<dbpedia:National_Register_of_Historic_Places_listings_in_Virginia>	1
 663 | SemSearch_ES-54	Q0	<dbpedia:Libre_(Marc_Anthony_album)>	2
 664 | SemSearch_ES-54	Q0	<dbpedia:Marc_Anthony_(album)>	1
 665 | SemSearch_ES-54	Q0	<dbpedia:Marc_Pugh>	2
 666 | SemSearch_ES-54	Q0	<dbpedia:I_Need_to_Know_(Marc_Anthony_song)>	1
 667 | SemSearch_ES-54	Q0	<dbpedia:Sigo_Siendo_Yo:_Grandes_Exitos>	1
 668 | SemSearch_ES-54	Q0	<dbpedia:Da_la_Vuelta>	2
 669 | SemSearch_ES-54	Q0	<dbpedia:Contra_la_Corriente_(Marc_Anthony_album)>	2
 670 | SemSearch_ES-54	Q0	<dbpedia:Mark_Anthony_Awere>	1
 671 | SemSearch_ES-54	Q0	<dbpedia:Marc_Anthony>	2
 672 | SemSearch_ES-54	Q0	<dbpedia:My_Baby_You>	2
 673 | SemSearch_ES-54	Q0	<dbpedia:Mark_Anthony_(pornographic_actor)>	2
 674 | SemSearch_ES-54	Q0	<dbpedia:You_Sang_to_Me>	1
 675 | SemSearch_ES-55	Q0	<dbpedia:Marcus_Center>	2
 676 | SemSearch_ES-55	Q0	<dbpedia:Muvico_Theaters>	1
 677 | SemSearch_ES-55	Q0	<dbpedia:Marcus_Burghardt>	1
 678 | SemSearch_ES-55	Q0	<dbpedia:Douglas_Theatre_Company>	2
 679 | SemSearch_ES-56	Q0	<dbpedia:Super_Mario_Bros._(film)>	2
 680 | SemSearch_ES-56	Q0	<dbpedia:Super_Mario_Advance_4:_Super_Mario_Bros._3>	2
 681 | SemSearch_ES-56	Q0	<dbpedia:The_Super_Mario_Challenge>	2
 682 | SemSearch_ES-56	Q0	<dbpedia:Super_Mario_Bros._3>	2
 683 | SemSearch_ES-56	Q0	<dbpedia:The_Super_Mario_Bros._Super_Show!>	2
 684 | SemSearch_ES-56	Q0	<dbpedia:Super_Mario_Bros.:_The_Lost_Levels>	2
 685 | SemSearch_ES-56	Q0	<dbpedia:List_of_Mario_television_series>	2
 686 | SemSearch_ES-56	Q0	<dbpedia:Super_Mario_Bros._2>	2
 687 | SemSearch_ES-56	Q0	<dbpedia:Mario_Bros.>	2
 688 | SemSearch_ES-56	Q0	<dbpedia:Super_Mario_Bros._(disambiguation)>	2
 689 | SemSearch_ES-56	Q0	<dbpedia:Super_Mario_Bros.:_Peach-Hime_Kyushutsu_Dai_Sakusen!>	2
 690 | SemSearch_ES-56	Q0	<dbpedia:Super_Mario_Galaxy>	2
 691 | SemSearch_ES-56	Q0	<dbpedia:List_of_video_games_featuring_Mario>	2
 692 | SemSearch_ES-56	Q0	<dbpedia:Nintendo_e-Reader>	1
 693 | SemSearch_ES-56	Q0	<dbpedia:The_Adventures_of_Super_Mario_Bros._3>	1
 694 | SemSearch_ES-56	Q0	<dbpedia:New_Super_Mario_Bros.>	2
 695 | SemSearch_ES-56	Q0	<dbpedia:Mario>	2
 696 | SemSearch_ES-56	Q0	<dbpedia:List_of_Mario_franchise_characters>	2
 697 | SemSearch_ES-56	Q0	<dbpedia:List_of_recurring_Mario_franchise_enemies>	2
 698 | SemSearch_ES-56	Q0	<dbpedia:Super_Mario_Bros.>	2
 699 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King_High_School_(Riverside,_California)>	1
 700 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King_Jr._Shoreline>	1
 701 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King_Bridge_(Port_Arthur,_Texas)>	1
 702 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King,_Jr._Day>	2
 703 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King,_Jr._Memorial>	1
 704 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King,_Sr.>	2
 705 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King_Jr._Memorial_Library>	2
 706 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King_Middle_School>	1
 707 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King,_Jr.>	2
 708 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King,_Jr._authorship_issues>	1
 709 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King_High_School>	2
 710 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King_(disambiguation)>	2
 711 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King_Jr._Freeway>	1
 712 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King,_Jr._High_School_(Cleveland)>	1
 713 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_(disambiguation)>	1
 714 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King,_Jr.,_National_Historic_Site>	1
 715 | SemSearch_ES-57	Q0	<dbpedia:Martin_Luther_King_III>	2
 716 | SemSearch_ES-58	Q0	<dbpedia:William_Mason_High_School_(Mason,_Ohio)>	1
 717 | SemSearch_ES-58	Q0	<dbpedia:The_Beach_at_Adventure_Landing>	2
 718 | SemSearch_ES-58	Q0	<dbpedia:Mason_Township,_Lawrence_County,_Ohio>	1
 719 | SemSearch_ES-58	Q0	<dbpedia:Bath_Township>	1
 720 | SemSearch_ES-58	Q0	<dbpedia:Mason,_Ohio>	1
 721 | SemSearch_ES-58	Q0	<dbpedia:Ohio_locations_by_per_capita_income>	1
 722 | SemSearch_ES-58	Q0	<dbpedia:Interstate_75_in_Ohio>	1
 723 | SemSearch_ES-58	Q0	<dbpedia:Ohio_River>	1
 724 | SemSearch_ES-58	Q0	<dbpedia:Ohio_Kings_Island_Open>	1
 725 | SemSearch_ES-58	Q0	<dbpedia:Indiana_and_Ohio_Railway>	1
 726 | SemSearch_ES-59	Q0	<dbpedia:Mercy_Hospital_and_Medical_Center>	1
 727 | SemSearch_ES-59	Q0	<dbpedia:Des_Moines_metropolitan_area>	1
 728 | SemSearch_ES-59	Q0	<dbpedia:List_of_hospitals_in_the_United_States>	1
 729 | SemSearch_ES-59	Q0	<dbpedia:Mercy_Medical_Center_(Idaho)>	1
 730 | SemSearch_ES-6	Q0	<dbpedia:AWI>	1
 731 | SemSearch_ES-6	Q0	<dbpedia:Air_Wisconsin>	2
 732 | SemSearch_ES-6	Q0	<dbpedia:List_of_airports_in_Wisconsin>	1
 733 | SemSearch_ES-6	Q0	<dbpedia:Volk_Field_Air_National_Guard_Base>	1
 734 | SemSearch_ES-6	Q0	<dbpedia:Air_Wisconsin_destinations>	1
 735 | SemSearch_ES-6	Q0	<dbpedia:Wisconsin_Air_National_Guard>	2
 736 | SemSearch_ES-6	Q0	<dbpedia:Wisconsin_Rapids,_Wisconsin>	1
 737 | SemSearch_ES-6	Q0	<dbpedia:Mountain_Air_Express>	2
 738 | SemSearch_ES-60	Q0	<dbpedia:Maicon_Sisenando>	2
 739 | SemSearch_ES-60	Q0	<dbpedia:Michael_Douglas_(skeleton_racer)>	1
 740 | SemSearch_ES-60	Q0	<dbpedia:Michael_Douglas_(disambiguation)>	2
 741 | SemSearch_ES-60	Q0	<dbpedia:Michael_Douglas>	2
 742 | SemSearch_ES-60	Q0	<dbpedia:Laura_Bush>	1
 743 | SemSearch_ES-61	Q0	<dbpedia:Martin's_Fantasy_Island>	1
 744 | SemSearch_ES-61	Q0	<dbpedia:Fantasy_Island>	1
 745 | SemSearch_ES-63	Q0	<dbpedia:Winchester_Model_1912>	1
 746 | SemSearch_ES-63	Q0	<dbpedia:Winchester_Model_21>	1
 747 | SemSearch_ES-63	Q0	<dbpedia:Trigger_disconnector>	1
 748 | SemSearch_ES-63	Q0	<dbpedia:Winchester_Model_1887/1901>	2
 749 | SemSearch_ES-63	Q0	<dbpedia:Winchester_Model_1200>	2
 750 | SemSearch_ES-64	Q0	<dbpedia:Mike_O'Meara>	1
 751 | SemSearch_ES-64	Q0	<dbpedia:Ford_Mustang>	1
 752 | SemSearch_ES-64	Q0	<dbpedia:Ford_Five_Hundred>	1
 753 | SemSearch_ES-65	Q0	<dbpedia:Orlando,_Florida>	1
 754 | SemSearch_ES-65	Q0	<dbpedia:University_of_Central_Florida>	1
 755 | SemSearch_ES-65	Q0	<dbpedia:Florida_locations_by_per_capita_income>	1
 756 | SemSearch_ES-65	Q0	<dbpedia:Orlando_Street_Railway>	1
 757 | SemSearch_ES-65	Q0	<dbpedia:Central_Florida>	1
 758 | SemSearch_ES-65	Q0	<dbpedia:Greater_Orlando>	1
 759 | SemSearch_ES-65	Q0	<dbpedia:Orlando_Police_Department>	1
 760 | SemSearch_ES-65	Q0	<dbpedia:Orlando_Magic>	1
 761 | SemSearch_ES-65	Q0	<dbpedia:Freedom_High_School_(Orlando,_Florida)>	1
 762 | SemSearch_ES-65	Q0	<dbpedia:Amway_Center>	2
 763 | SemSearch_ES-65	Q0	<dbpedia:Metroplan_Orlando>	1
 764 | SemSearch_ES-65	Q0	<dbpedia:Cypress_Creek_High_School_(Orlando,_Florida)>	1
 765 | SemSearch_ES-65	Q0	<dbpedia:Orlando_Health>	1
 766 | SemSearch_ES-66	Q0	<dbpedia:TOPS_Club>	1
 767 | SemSearch_ES-66	Q0	<dbpedia:Compulsive_overeating>	2
 768 | SemSearch_ES-66	Q0	<dbpedia:Overeaters_Anonymous>	2
 769 | SemSearch_ES-67	Q0	<dbpedia:STAR_Movies>	1
 770 | SemSearch_ES-67	Q0	<dbpedia:The_Movies>	1
 771 | SemSearch_ES-67	Q0	<dbpedia:Rock_Hudson's_Home_Movies>	1
 772 | SemSearch_ES-67	Q0	<dbpedia:UTV_Action>	1
 773 | SemSearch_ES-67	Q0	<dbpedia:Turner_Classic_Movies>	1
 774 | SemSearch_ES-68	Q0	<dbpedia:Pierce_County,_Wisconsin>	2
 775 | SemSearch_ES-68	Q0	<dbpedia:Washington_locations_by_per_capita_income>	1
 776 | SemSearch_ES-68	Q0	<dbpedia:Washington_statistical_areas>	1
 777 | SemSearch_ES-68	Q0	<dbpedia:Pierce_College>	1
 778 | SemSearch_ES-68	Q0	<dbpedia:DuPont,_Washington>	2
 779 | SemSearch_ES-68	Q0	<dbpedia:Pierce_Township>	2
 780 | SemSearch_ES-68	Q0	<dbpedia:Bethel_School_District_(Washington)>	1
 781 | SemSearch_ES-68	Q0	<dbpedia:List_of_counties_in_Washington>	1
 782 | SemSearch_ES-68	Q0	<dbpedia:Wauna,_Washington>	2
 783 | SemSearch_ES-68	Q0	<dbpedia:Pierce_County_Library_System>	1
 784 | SemSearch_ES-68	Q0	<dbpedia:List_of_U.S._Routes_in_Washington>	1
 785 | SemSearch_ES-68	Q0	<dbpedia:Pierce_County,_Washington>	1
 786 | SemSearch_ES-69	Q0	<dbpedia:Unsuccessful_attempts_to_participate_in_the_Eurovision_Song_Contest>	1
 787 | SemSearch_ES-69	Q0	<dbpedia:Mp3PRO>	1
 788 | SemSearch_ES-69	Q0	<dbpedia:Israel_in_the_Eurovision_Song_Contest_2007>	1
 789 | SemSearch_ES-69	Q0	<dbpedia:Windows_Media_Audio>	1
 790 | SemSearch_ES-7	Q0	<dbpedia:Glock>	1
 791 | SemSearch_ES-7	Q0	<dbpedia:Airsoft>	1
 792 | SemSearch_ES-7	Q0	<dbpedia:Airsoft_gun>	1
 793 | SemSearch_ES-70	Q0	<dbpedia:Virgin_Radio_Italia>	1
 794 | SemSearch_ES-70	Q0	<dbpedia:Radio_Active_(Sweden)>	1
 795 | SemSearch_ES-70	Q0	<dbpedia:Clive_Malcolm_Griffiths>	1
 796 | SemSearch_ES-70	Q0	<dbpedia:Radio_Maria>	1
 797 | SemSearch_ES-70	Q0	<dbpedia:Internet_radio>	1
 798 | SemSearch_ES-71	Q0	<dbpedia:History_of_Richmond,_Virginia>	2
 799 | SemSearch_ES-71	Q0	<dbpedia:Manchester,_Richmond,_Virginia>	1
 800 | SemSearch_ES-71	Q0	<dbpedia:Virginia>	1
 801 | SemSearch_ES-71	Q0	<dbpedia:Richmond_County,_Virginia>	1
 802 | SemSearch_ES-71	Q0	<dbpedia:Armstrong_High_School_(Richmond,_Virginia)>	1
 803 | SemSearch_ES-71	Q0	<dbpedia:East_End_(Richmond,_Virginia)>	2
 804 | SemSearch_ES-71	Q0	<dbpedia:City_of_Richmond_v._United_States>	1
 805 | SemSearch_ES-71	Q0	<dbpedia:Boulevard_(Richmond,_Virginia)>	1
 806 | SemSearch_ES-71	Q0	<dbpedia:Richmond_County>	2
 807 | SemSearch_ES-71	Q0	<dbpedia:Richmond,_Virginia>	2
 808 | SemSearch_ES-71	Q0	<dbpedia:West_End_(Richmond,_Virginia)>	2
 809 | SemSearch_ES-71	Q0	<dbpedia:Greater_Richmond_Region>	2
 810 | SemSearch_ES-71	Q0	<dbpedia:Southside_(Richmond,_Virginia)>	2
 811 | SemSearch_ES-71	Q0	<dbpedia:Port_of_Richmond_(Virginia)>	1
 812 | SemSearch_ES-71	Q0	<dbpedia:Sean_Marshall>	2
 813 | SemSearch_ES-71	Q0	<dbpedia:Neighborhoods_of_Richmond,_Virginia>	1
 814 | SemSearch_ES-71	Q0	<dbpedia:National_Register_of_Historic_Places_listings_in_Virginia>	1
 815 | SemSearch_ES-71	Q0	<dbpedia:Transportation_in_Richmond,_Virginia>	2
 816 | SemSearch_ES-71	Q0	<dbpedia:Downtown_Richmond,_Virginia>	2
 817 | SemSearch_ES-72	Q0	<dbpedia:KZCR>	2
 818 | SemSearch_ES-72	Q0	<dbpedia:WMFS>	1
 819 | SemSearch_ES-72	Q0	<dbpedia:Revolution_103>	2
 820 | SemSearch_ES-72	Q0	<dbpedia:WOLI-FM>	1
 821 | SemSearch_ES-72	Q0	<dbpedia:WRSR>	2
 822 | SemSearch_ES-72	Q0	<dbpedia:WFFX>	1
 823 | SemSearch_ES-72	Q0	<dbpedia:WEGR>	1
 824 | SemSearch_ES-72	Q0	<dbpedia:J._Peter_Sartain>	1
 825 | SemSearch_ES-73	Q0	<dbpedia:He-Man_and_the_Masters_of_the_Universe>	1
 826 | SemSearch_ES-73	Q0	<dbpedia:WGLS-FM>	1
 827 | SemSearch_ES-73	Q0	<dbpedia:Rowan_University>	2
 828 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_School_(Saratoga,_California)>	1
 829 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_%22The_Video%22>	1
 830 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_College,_Geelong>	1
 831 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_Medical_Center_Heliport>	1
 832 | SemSearch_ES-74	Q0	<dbpedia:University_of_the_Sacred_Heart>	1
 833 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_Academy_(Louisville)>	1
 834 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_school>	1
 835 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_(Shakespears_Sister_album)>	2
 836 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_University_Luxembourg>	2
 837 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_Academy>	2
 838 | SemSearch_ES-74	Q0	<dbpedia:Basilica_of_the_Sacred_Heart>	1
 839 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart,_Minnesota>	1
 840 | SemSearch_ES-74	Q0	<dbpedia:Carrollton_School_of_the_Sacred_Heart>	1
 841 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_(disambiguation)>	1
 842 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_Academy_(New_York)>	1
 843 | SemSearch_ES-74	Q0	<dbpedia:Academy_of_the_Sacred_Heart_(New_Orleans)>	1
 844 | SemSearch_ES-74	Q0	<dbpedia:Society_of_the_Sacred_Heart>	1
 845 | SemSearch_ES-74	Q0	<dbpedia:Cathedral_of_the_Sacred_Heart_(Pensacola,_Florida)>	1
 846 | SemSearch_ES-74	Q0	<dbpedia:Sacred_Heart_University>	2
 847 | SemSearch_ES-75	Q0	<dbpedia:Houston>	1
 848 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio_International>	1
 849 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio_de_Cort%C3%A9s>	2
 850 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio_de_Flores>	1
 851 | SemSearch_ES-76	Q0	<dbpedia:Cerro_San_Antonio>	1
 852 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio_Bay>	1
 853 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio,_San_Miguel>	1
 854 | SemSearch_ES-76	Q0	<dbpedia:Roman_Catholic_Archdiocese_of_San_Antonio>	1
 855 | SemSearch_ES-76	Q0	<dbpedia:University_of_San_Antonio>	1
 856 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio_Pajonal>	2
 857 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio,_Paraguay>	2
 858 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio_del_T%C3%A1chira>	2
 859 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio>	2
 860 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio_Missions>	1
 861 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio,_Chile>	2
 862 | SemSearch_ES-76	Q0	<dbpedia:New_San_Antonio_Rose>	1
 863 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio_(disambiguation)>	2
 864 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio,_Quezon>	1
 865 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio_Aguas_Calientes>	2
 866 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio,_Intibuc%C3%A1>	2
 867 | SemSearch_ES-76	Q0	<dbpedia:San_Antonio,_Cop%C3%A1n>	1
 868 | SemSearch_ES-77	Q0	<dbpedia:Brenda_Brathwaite>	1
 869 | SemSearch_ES-77	Q0	<dbpedia:Georgia_Tech_Savannah>	1
 870 | SemSearch_ES-77	Q0	<dbpedia:Don_Stone>	1
 871 | SemSearch_ES-77	Q0	<dbpedia:Savannah_State_University_College_of_Sciences_and_Technology>	1
 872 | SemSearch_ES-77	Q0	<dbpedia:Savannah_Technical_College>	2
 873 | SemSearch_ES-77	Q0	<dbpedia:List_of_Savannah_State_University_alumni>	1
 874 | SemSearch_ES-78	Q0	<dbpedia:Sharp_PC-1350>	1
 875 | SemSearch_ES-78	Q0	<dbpedia:Sharp_PC-E500S>	2
 876 | SemSearch_ES-78	Q0	<dbpedia:Sharp_PC-1500>	2
 877 | SemSearch_ES-78	Q0	<dbpedia:Sharp_PC-1403>	2
 878 | SemSearch_ES-78	Q0	<dbpedia:Sharp_X1>	2
 879 | SemSearch_ES-78	Q0	<dbpedia:Sharp_PC-1211>	2
 880 | SemSearch_ES-78	Q0	<dbpedia:Sharp_PC-5000>	2
 881 | SemSearch_ES-78	Q0	<dbpedia:Sharp_PC-1600>	1
 882 | SemSearch_ES-78	Q0	<dbpedia:Sharp_PC-1401>	2
 883 | SemSearch_ES-78	Q0	<dbpedia:Sharp_PC-1251>	2
 884 | SemSearch_ES-79	Q0	<dbpedia:Garam_masala>	2
 885 | SemSearch_ES-79	Q0	<dbpedia:Masala>	1
 886 | SemSearch_ES-79	Q0	<dbpedia:Shobana>	1
 887 | SemSearch_ES-8	Q0	<dbpedia:Aloha_Bowl>	1
 888 | SemSearch_ES-80	Q0	<dbpedia:Take_Me_Home_Tour>	1
 889 | SemSearch_ES-80	Q0	<dbpedia:Cher_filmography>	1
 890 | SemSearch_ES-80	Q0	<dbpedia:A_Cowboy's_Work_Is_Never_Done>	2
 891 | SemSearch_ES-80	Q0	<dbpedia:The_Sonny_Side_of_Ch%C3%A9r>	2
 892 | SemSearch_ES-80	Q0	<dbpedia:But_You're_Mine>	1
 893 | SemSearch_ES-80	Q0	<dbpedia:I_Got_You_Babe>	2
 894 | SemSearch_ES-80	Q0	<dbpedia:The_Sonny_&_Cher_Comedy_Hour>	2
 895 | SemSearch_ES-80	Q0	<dbpedia:Where_Do_You_Go_(Cher_song)>	1
 896 | SemSearch_ES-80	Q0	<dbpedia:With_Love,_Ch%C3%A9r>	1
 897 | SemSearch_ES-80	Q0	<dbpedia:Sonny>	1
 898 | SemSearch_ES-80	Q0	<dbpedia:Ch%C3%A9r_(1966_album)>	1
 899 | SemSearch_ES-80	Q0	<dbpedia:Sonny_Bono>	1
 900 | SemSearch_ES-80	Q0	<dbpedia:Cher>	1
 901 | SemSearch_ES-80	Q0	<dbpedia:Sonny_&_Cher>	2
 902 | SemSearch_ES-80	Q0	<dbpedia:Mama_Was_a_Rock_and_Roll_Singer_Papa_Used_to_Write_All_Her_Songs>	1
 903 | SemSearch_ES-81	Q0	<dbpedia:Dakota_Marker>	1
 904 | SemSearch_ES-81	Q0	<dbpedia:South_Dakota_State_Jackrabbits_men's_basketball>	1
 905 | SemSearch_ES-81	Q0	<dbpedia:University_of_South_Dakota%E2%80%93Springfield>	2
 906 | SemSearch_ES-81	Q0	<dbpedia:University_of_South_Dakota_School_of_Law>	2
 907 | SemSearch_ES-81	Q0	<dbpedia:South_Dakota_State_University>	2
 908 | SemSearch_ES-81	Q0	<dbpedia:South_Dakota_State_Jackrabbits>	2
 909 | SemSearch_ES-81	Q0	<dbpedia:Dakota_State_University>	2
 910 | SemSearch_ES-81	Q0	<dbpedia:University_of_South_Dakota>	1
 911 | SemSearch_ES-81	Q0	<dbpedia:North_Dakota_State_University>	1
 912 | SemSearch_ES-81	Q0	<dbpedia:David_L._Chicoine>	1
 913 | SemSearch_ES-81	Q0	<dbpedia:South_Dakota_Legislature>	1
 914 | SemSearch_ES-81	Q0	<dbpedia:South_Dakota_Board_of_Regents>	2
 915 | SemSearch_ES-81	Q0	<dbpedia:South_Dakota_Public_Universities_and_Research_Center>	2
 916 | SemSearch_ES-81	Q0	<dbpedia:Coughlin%E2%80%93Alumni_Stadium>	1
 917 | SemSearch_ES-81	Q0	<dbpedia:Huron_University>	1
 918 | SemSearch_ES-81	Q0	<dbpedia:List_of_colleges_and_universities_in_South_Dakota>	1
 919 | SemSearch_ES-82	Q0	<dbpedia:Elections_in_Saint_Lucia>	2
 920 | SemSearch_ES-82	Q0	<dbpedia:Saint_Lucia_racer>	2
 921 | SemSearch_ES-82	Q0	<dbpedia:Radio_St._Lucia>	1
 922 | SemSearch_ES-82	Q0	<dbpedia:St_Lucia_Whiptail>	1
 923 | SemSearch_ES-82	Q0	<dbpedia:Sport_in_Saint_Lucia>	1
 924 | SemSearch_ES-82	Q0	<dbpedia:List_of_colonial_governors_of_Saint_Lucia>	1
 925 | SemSearch_ES-82	Q0	<dbpedia:Foreign_relations_of_Saint_Lucia>	1
 926 | SemSearch_ES-82	Q0	<dbpedia:Saint_Lucia_Oriole>	1
 927 | SemSearch_ES-82	Q0	<dbpedia:Saint_Lucia_Black_Finch>	1
 928 | SemSearch_ES-82	Q0	<dbpedia:Saint_Lucia_Football_Association>	1
 929 | SemSearch_ES-82	Q0	<dbpedia:Battle_of_St._Lucia>	1
 930 | SemSearch_ES-82	Q0	<dbpedia:Saint_Lucia>	2
 931 | SemSearch_ES-82	Q0	<dbpedia:Flag_of_Saint_Lucia>	1
 932 | SemSearch_ES-82	Q0	<dbpedia:HMS_St_Lucia>	1
 933 | SemSearch_ES-82	Q0	<dbpedia:ISimangaliso_Wetland_Park>	1
 934 | SemSearch_ES-82	Q0	<dbpedia:Saint_Lucia_Warbler>	1
 935 | SemSearch_ES-82	Q0	<dbpedia:St_Lucia,_Queensland>	1
 936 | SemSearch_ES-82	Q0	<dbpedia:St._Lucia_Airways>	1
 937 | SemSearch_ES-82	Q0	<dbpedia:List_of_cities_in_Saint_Lucia>	2
 938 | SemSearch_ES-82	Q0	<dbpedia:Saint_Lucia_Amazon>	1
 939 | SemSearch_ES-83	Q0	<dbpedia:Paul_the_Apostle>	1
 940 | SemSearch_ES-83	Q0	<dbpedia:St._Paul_Saints>	2
 941 | SemSearch_ES-83	Q0	<dbpedia:St._Paul_Island_(Nova_Scotia)>	1
 942 | SemSearch_ES-83	Q0	<dbpedia:Saint_Paul's_College>	1
 943 | SemSearch_ES-83	Q0	<dbpedia:Cathedral_of_Saints_Peter_and_Paul,_Providence>	1
 944 | SemSearch_ES-83	Q0	<dbpedia:St._Paul's_Church>	1
 945 | SemSearch_ES-83	Q0	<dbpedia:St._Paul's_Cathedral_(disambiguation)>	1
 946 | SemSearch_ES-83	Q0	<dbpedia:Minnie_Mi%C3%B1oso>	1
 947 | SemSearch_ES-84	Q0	<dbpedia:The_Dish>	1
 948 | SemSearch_ES-84	Q0	<dbpedia:Danielle_Fishel>	2
 949 | SemSearch_ES-84	Q0	<dbpedia:National_Lampoon's_Dorm_Daze_2>	1
 950 | SemSearch_ES-85	Q0	<dbpedia:The_Longest_Yard>	1
 951 | SemSearch_ES-85	Q0	<dbpedia:The_Longest_Yard_(2005_film)>	1
 952 | SemSearch_ES-85	Q0	<dbpedia:The_Longest_Yard_(1974_film)>	1
 953 | SemSearch_ES-86	Q0	<dbpedia:The_Morning_Call>	2
 954 | SemSearch_ES-86	Q0	<dbpedia:Penn_State_Lehigh_Valley>	1
 955 | SemSearch_ES-86	Q0	<dbpedia:Morning_Call_(CNBC)>	2
 956 | SemSearch_ES-86	Q0	<dbpedia:Lehigh_Valley_AVA>	2
 957 | SemSearch_ES-86	Q0	<dbpedia:Lehigh_County,_Pennsylvania>	1
 958 | SemSearch_ES-86	Q0	<dbpedia:State_Route_1002_(Lehigh_County,_Pennsylvania)>	1
 959 | SemSearch_ES-86	Q0	<dbpedia:Lehigh_Valley_Conference>	1
 960 | SemSearch_ES-86	Q0	<dbpedia:Morning_Call>	1
 961 | SemSearch_ES-86	Q0	<dbpedia:Lehigh_Valley_Charter_High_School_for_the_Performing_Arts>	1
 962 | SemSearch_ES-87	Q0	<dbpedia:Lift_(Shannon_Noll_album)>	1
 963 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_Medal_in_Architecture>	1
 964 | SemSearch_ES-88	Q0	<dbpedia:USS_Thomas_Jefferson>	2
 965 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_(Caymanian_politician)>	1
 966 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_Education_Foundation>	1
 967 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson:_Author_of_America>	1
 968 | SemSearch_ES-88	Q0	<dbpedia:Jefferson_County,_Alabama>	1
 969 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_Ramsdell>	2
 970 | SemSearch_ES-88	Q0	<dbpedia:Jefferson_Thomas>	2
 971 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_University>	1
 972 | SemSearch_ES-88	Q0	<dbpedia:Jefferson_State>	1
 973 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_Library>	1
 974 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_Education>	1
 975 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_(disambiguation)>	2
 976 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_(film)>	1
 977 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson>	1
 978 | SemSearch_ES-88	Q0	<dbpedia:Jefferson_Memorial>	1
 979 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_Middle_School>	1
 980 | SemSearch_ES-88	Q0	<dbpedia:Jefferson_High_School>	1
 981 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_High_School>	1
 982 | SemSearch_ES-88	Q0	<dbpedia:Thomas_Jefferson_Byrd>	1
 983 | SemSearch_ES-89	Q0	<dbpedia:University_of_North_Dakota_men's_ice_hockey>	1
 984 | SemSearch_ES-89	Q0	<dbpedia:North_Dakota_State_University>	2
 985 | SemSearch_ES-89	Q0	<dbpedia:List_of_law_enforcement_agencies_in_North_Dakota>	1
 986 | SemSearch_ES-89	Q0	<dbpedia:List_of_University_of_North_Dakota_people>	2
 987 | SemSearch_ES-89	Q0	<dbpedia:University_of_North_Dakota_athletics>	1
 988 | SemSearch_ES-89	Q0	<dbpedia:University_of_North_Dakota_School_of_Medicine_and_Health_Sciences>	2
 989 | SemSearch_ES-89	Q0	<dbpedia:University_of_North_Dakota_School_of_Law>	1
 990 | SemSearch_ES-89	Q0	<dbpedia:North_Dakota_University_System>	2
 991 | SemSearch_ES-89	Q0	<dbpedia:North_Dakota_State_Bison_women's_soccer>	1
 992 | SemSearch_ES-89	Q0	<dbpedia:List_of_Presidents_of_North_Dakota_State_University>	1
 993 | SemSearch_ES-89	Q0	<dbpedia:University_of_North_Dakota>	2
 994 | SemSearch_ES-89	Q0	<dbpedia:Red_River_Valley_University>	1
 995 | SemSearch_ES-89	Q0	<dbpedia:Dakota_Prairie_High_School>	1
 996 | SemSearch_ES-89	Q0	<dbpedia:List_of_colleges_and_universities_in_North_Dakota>	2
 997 | SemSearch_ES-89	Q0	<dbpedia:Nickel_Trophy>	1
 998 | SemSearch_ES-9	Q0	<dbpedia:American_Embassy>	1
 999 | SemSearch_ES-9	Q0	<dbpedia:Embassy_of_the_United_States_in_Tokyo>	1
1000 | SemSearch_ES-9	Q0	<dbpedia:1998_United_States_embassy_bombings>	1
1001 | SemSearch_ES-9	Q0	<dbpedia:Bureau_of_Overseas_Buildings_Operations>	1
1002 | SemSearch_ES-90	Q0	<dbpedia:Arizona_State_University>	1
1003 | SemSearch_ES-90	Q0	<dbpedia:University_of_Phoenix_Stadium>	2
1004 | SemSearch_ES-90	Q0	<dbpedia:Phoenix,_Arizona>	1
1005 | SemSearch_ES-90	Q0	<dbpedia:Arizona_State_University_at_the_Downtown_Phoenix_campus>	1
1006 | SemSearch_ES-90	Q0	<dbpedia:University_of_Phoenix>	2
1007 | SemSearch_ES-91	Q0	<dbpedia:Sidney_Herbert,_1st_Baronet>	1
1008 | SemSearch_ES-91	Q0	<dbpedia:Westminster_Abbey_by-election,_1932>	1
1009 | SemSearch_ES-91	Q0	<dbpedia:Westminster_(UK_Parliament_constituency)>	1
1010 | SemSearch_ES-91	Q0	<dbpedia:Westminster_(disambiguation)>	1
1011 | SemSearch_ES-91	Q0	<dbpedia:Westminster>	1
1012 | SemSearch_ES-91	Q0	<dbpedia:St_Margaret's,_Westminster>	1
1013 | SemSearch_ES-91	Q0	<dbpedia:Westminster_Abbey_by-election>	2
1014 | SemSearch_ES-91	Q0	<dbpedia:Westminster_Abbey_(UK_Parliament_constituency)>	2
1015 | SemSearch_ES-91	Q0	<dbpedia:Abbot_of_Westminster>	1
1016 | SemSearch_ES-91	Q0	<dbpedia:Westminster_Abbey_Choir_School>	1
1017 | SemSearch_ES-91	Q0	<dbpedia:Westminster_Abbey_by-election,_1939>	2
1018 | SemSearch_ES-91	Q0	<dbpedia:High_Steward_of_Westminster_Abbey>	2
1019 | SemSearch_ES-91	Q0	<dbpedia:City_of_Westminster>	1
1020 | SemSearch_ES-91	Q0	<dbpedia:Westminster_Abbey>	2
1021 | SemSearch_ES-91	Q0	<dbpedia:Westminster_Abbey_by-election,_1921>	1
1022 | SemSearch_ES-91	Q0	<dbpedia:Diocese_of_Westminster>	1
1023 | SemSearch_ES-91	Q0	<dbpedia:Westminster_Abbey_(British_Columbia)>	1
1024 | SemSearch_ES-91	Q0	<dbpedia:Dean's_Yard>	1
1025 | SemSearch_ES-91	Q0	<dbpedia:Westminster_Abbey_Museum>	2
1026 | SemSearch_ES-91	Q0	<dbpedia:Westminster_Abbey_by-election,_1924>	2
1027 | SemSearch_ES-93	Q0	<dbpedia:Toyota_Tundra>	2
1028 | SemSearch_ES-93	Q0	<dbpedia:List_of_Toyota_vehicles>	1
1029 | SemSearch_ES-93	Q0	<dbpedia:Lucas_Deep_Clean_200>	1
1030 | SemSearch_ES-94	Q0	<dbpedia:WPAA>	1
1031 | SemSearch_ES-94	Q0	<dbpedia:Hugh_Downs>	2
1032 | SemSearch_ES-94	Q0	<dbpedia:Downs>	1
1033 | SemSearch_ES-94	Q0	<dbpedia:ASU_College_of_Liberal_Arts_and_Sciences>	1
1034 | SemSearch_ES-94	Q0	<dbpedia:Gene_Rayburn>	1
1035 | SemSearch_ES-95	Q0	<dbpedia:New_Madrid,_Missouri>	1
1036 | SemSearch_ES-95	Q0	<dbpedia:Technical_University_of_Madrid>	1
1037 | SemSearch_ES-95	Q0	<dbpedia:Madrid_Open_(tennis)>	1
1038 | SemSearch_ES-95	Q0	<dbpedia:Madrid_Open>	1
1039 | SemSearch_ES-95	Q0	<dbpedia:Madrid>	2
1040 | SemSearch_ES-95	Q0	<dbpedia:Autonomous_University_of_Madrid>	1
1041 | SemSearch_ES-95	Q0	<dbpedia:List_of_Madrid_metro_stations>	1
1042 | SemSearch_ES-95	Q0	<dbpedia:List_of_Real_Madrid_C.F._seasons>	1
1043 | SemSearch_ES-95	Q0	<dbpedia:Madrid%E2%80%93Barajas_Airport>	1
1044 | SemSearch_ES-95	Q0	<dbpedia:New_Madrid>	1
1045 | SemSearch_ES-95	Q0	<dbpedia:Ciudad_Real_Madrid>	1
1046 | SemSearch_ES-95	Q0	<dbpedia:Madrid%E2%80%93Seville_high-speed_rail_line>	1
1047 | SemSearch_ES-95	Q0	<dbpedia:Assembly_of_Madrid>	1
1048 | SemSearch_ES-95	Q0	<dbpedia:Madrid_Metro>	1
1049 | SemSearch_ES-95	Q0	<dbpedia:Siege_of_Madrid>	1
1050 | SemSearch_ES-95	Q0	<dbpedia:Supercomputing_and_Visualization_Center_of_Madrid>	1
1051 | SemSearch_ES-95	Q0	<dbpedia:Royal_Palace_of_Madrid>	1
1052 | SemSearch_ES-95	Q0	<dbpedia:Real_Madrid_C>	1
1053 | SemSearch_ES-95	Q0	<dbpedia:Real_Madrid_C.F.>	1
1054 | SemSearch_ES-95	Q0	<dbpedia:Madrid_(disambiguation)>	1
1055 | SemSearch_ES-95	Q0	<dbpedia:Madrid_Accords>	1
1056 | SemSearch_ES-95	Q0	<dbpedia:President_of_the_Community_of_Madrid>	1
1057 | SemSearch_ES-95	Q0	<dbpedia:Flag_of_Madrid>	1
1058 | SemSearch_ES-95	Q0	<dbpedia:Madrid_Arena>	1
1059 | SemSearch_ES-95	Q0	<dbpedia:Districts_of_Madrid>	1
1060 | SemSearch_ES-96	Q0	<dbpedia:State_House_Coffee>	1
1061 | SemSearch_ES-96	Q0	<dbpedia:Jonathan's_Coffee-House>	1
1062 | SemSearch_ES-96	Q0	<dbpedia:The_2i's_Coffee_Bar>	1
1063 | SemSearch_ES-97	Q0	<dbpedia:Curse_of_the_Pink_Panther>	2
1064 | SemSearch_ES-97	Q0	<dbpedia:Pink_Panther_(character)>	1
1065 | SemSearch_ES-97	Q0	<dbpedia:Pink_Is_a_Many_Splintered_Thing>	1
1066 | SemSearch_ES-97	Q0	<dbpedia:The_Pink_Panther_(2006_film)>	1
1067 | SemSearch_ES-97	Q0	<dbpedia:The_Pink_Panther>	1
1068 | SemSearch_ES-97	Q0	<dbpedia:Pink,_Plunk,_Plink>	1
1069 | SemSearch_ES-97	Q0	<dbpedia:The_Pink_Panther_(disambiguation)>	1
1070 | SemSearch_ES-97	Q0	<dbpedia:Son_of_the_Pink_Panther>	1
1071 | SemSearch_ES-97	Q0	<dbpedia:Revenge_of_the_Pink_Panther>	1
1072 | SemSearch_ES-97	Q0	<dbpedia:Shocking_Pink>	1
1073 | SemSearch_ES-97	Q0	<dbpedia:List_of_The_Pink_Panther_cartoons>	1
1074 | SemSearch_ES-97	Q0	<dbpedia:The_Pink_Panther_Strikes_Again>	1
1075 | SemSearch_ES-97	Q0	<dbpedia:Trail_of_the_Pink_Panther>	1
1076 | SemSearch_ES-97	Q0	<dbpedia:The_Pink_Panther_2>	2
1077 | SemSearch_ES-97	Q0	<dbpedia:Inspector_Clouseau_(film)>	1
1078 | SemSearch_ES-97	Q0	<dbpedia:The_Pink_Panther_Show>	1
1079 | SemSearch_ES-98	Q0	<dbpedia:William_C._Powers>	1
1080 | SemSearch_ES-98	Q0	<dbpedia:List_of_University_of_Texas_at_Austin_presidents>	1
1081 | SemSearch_ES-98	Q0	<dbpedia:Cockrell_School_of_Engineering>	1
1082 | SemSearch_ES-98	Q0	<dbpedia:University_of_Texas_at_Austin_College_of_Natural_Sciences>	1
1083 | SemSearch_ES-98	Q0	<dbpedia:University_of_Texas_School_of_Architecture>	1
1084 | SemSearch_ES-98	Q0	<dbpedia:Frank_Erwin_Center>	1
1085 | SemSearch_ES-98	Q0	<dbpedia:Texas_Blazers>	1
1086 | SemSearch_ES-98	Q0	<dbpedia:Sarah_and_Ernest_Butler_School_of_Music>	1
1087 | SemSearch_ES-98	Q0	<dbpedia:Ernest_Campbell_Mossner>	1
1088 | SemSearch_ES-98	Q0	<dbpedia:West_Campus,_Austin,_Texas>	1
1089 | SemSearch_ES-98	Q0	<dbpedia:Young_Conservatives_of_Texas>	1
1090 | SemSearch_ES-98	Q0	<dbpedia:List_of_University_of_Texas_at_Austin_faculty>	1
1091 | SemSearch_ES-98	Q0	<dbpedia:University_of_Texas_School_of_Law>	1
1092 | SemSearch_ES-98	Q0	<dbpedia:History_of_the_University_of_Texas_at_Austin>	2
1093 | SemSearch_ES-98	Q0	<dbpedia:University_of_Texas_at_Austin_College_of_Communication>	1
1094 | SemSearch_ES-98	Q0	<dbpedia:Texas_Longhorns>	1
1095 | SemSearch_ES-98	Q0	<dbpedia:Concordia_University_Texas>	1
1096 | SemSearch_ES-98	Q0	<dbpedia:Michener_Center_for_Writers>	1
1097 | SemSearch_ES-98	Q0	<dbpedia:List_of_University_of_Texas_at_Austin_buildings>	1
1098 | SemSearch_ES-98	Q0	<dbpedia:The_Alcalde>	1
1099 | SemSearch_ES-98	Q0	<dbpedia:Main_Building_(University_of_Texas_at_Austin)>	1
1100 | SemSearch_ES-98	Q0	<dbpedia:List_of_University_of_Texas_at_Austin_alumni>	1
1101 | SemSearch_ES-98	Q0	<dbpedia:University_of_Texas_System>	1
1102 | SemSearch_ES-98	Q0	<dbpedia:University_of_Texas_at_Austin>	2
1103 | SemSearch_ES-98	Q0	<dbpedia:University_of_Texas_at_Austin_College_of_Fine_Arts>	1
1104 | SemSearch_ES-98	Q0	<dbpedia:Texas_Exes>	1
1105 | SemSearch_ES-98	Q0	<dbpedia:Texas_Hillel>	1
1106 | SemSearch_ES-99	Q0	<dbpedia:York_University>	2
1107 | SemSearch_ES-99	Q0	<dbpedia:Queens_College,_City_University_of_New_York>	1
1108 | SemSearch_ES-99	Q0	<dbpedia:University_of_York>	2
1109 | SemSearch_ES-99	Q0	<dbpedia:State_University_of_New_York_Upstate_Medical_University>	1
1110 | SemSearch_ES-99	Q0	<dbpedia:University_Radio_York>	1
1111 | SemSearch_ES-99	Q0	<dbpedia:Pace_University>	1
1112 | SemSearch_ES-99	Q0	<dbpedia:City_University_of_New_York>	1
1113 | SemSearch_ES-99	Q0	<dbpedia:Goodricke_College>	1
1114 | SemSearch_ES-99	Q0	<dbpedia:City_College_of_New_York>	1
1115 | SemSearch_ES-99	Q0	<dbpedia:York_University_(YRT)>	2
1116 | 


--------------------------------------------------------------------------------
/qrels/queries.txt:
--------------------------------------------------------------------------------
  1 | INEX_LD-20120111	vietnam war movie
  2 | INEX_LD-20120112	vietnam war facts
  3 | INEX_LD-20120121	vietnam food recipes
  4 | INEX_LD-20120122	vietnamese food blog
  5 | INEX_LD-20120131	vietnam travel national park
  6 | INEX_LD-20120132	vietnam travel airports
  7 | INEX_LD-20120211	guitar chord tuning
  8 | INEX_LD-20120212	guitar chord minor
  9 | INEX_LD-20120221	guitar classical flamenco
 10 | INEX_LD-20120222	guitar classical bach
 11 | INEX_LD-20120231	guitar origin Russia
 12 | INEX_LD-20120232	guitar origin blues
 13 | INEX_LD-20120311	tango culture movies
 14 | INEX_LD-20120312	tango culture countries
 15 | INEX_LD-20120321	tango music  composers
 16 | INEX_LD-20120322	tango music instruments
 17 | INEX_LD-20120331	tango dance styles
 18 | INEX_LD-20120332	tango dance history
 19 | INEX_LD-20120411	bicycle sport races
 20 | INEX_LD-20120412	bicycle sport disciplines
 21 | INEX_LD-20120421	bicycle holiday towns
 22 | INEX_LD-20120422	bicycle holiday nature
 23 | INEX_LD-20120431	bicycle benefits health
 24 | INEX_LD-20120432	bicycle benefits environment
 25 | INEX_LD-20120511	female rock singers
 26 | INEX_LD-20120512	south korean girl groups
 27 | INEX_LD-20120521	electronic music genres
 28 | INEX_LD-20120522	digital music notation formats
 29 | INEX_LD-20120531	music conferences
 30 | INEX_LD-20120532	intellectual property rights lobby
 31 | INEX_LD-2009022	Szechwan dish food cuisine
 32 | INEX_LD-2009039	roman architecture
 33 | INEX_LD-2009053	finland car industry manufacturer saab sisu
 34 | INEX_LD-2009061	france second world war normandy
 35 | INEX_LD-2009062	social network group selection
 36 | INEX_LD-2009063	D-Day normandy invasion
 37 | INEX_LD-2009074	web ranking scoring algorithm
 38 | INEX_LD-2009096	Eiffel
 39 | INEX_LD-2009111	europe solar power facility
 40 | INEX_LD-2009115	virtual museums
 41 | INEX_LD-2010004	Indian food
 42 | INEX_LD-2010014	composer museum
 43 | INEX_LD-2010019	gallo roman architecture in paris
 44 | INEX_LD-2010020	electricity source in France
 45 | INEX_LD-2010037	social network API
 46 | INEX_LD-2010043	List of films from the surrealist category
 47 | INEX_LD-2010057	Einstein Relativity theory
 48 | INEX_LD-2010069	summer flowers
 49 | INEX_LD-2010100	house concrete wood
 50 | INEX_LD-2010106	organic food advantages disadvantages
 51 | INEX_LD-2012301	Niagara falls origin lake
 52 | INEX_LD-2012303	 Valley fever fungal infection San Joaquin
 53 | INEX_LD-2012305	North Dakota's lowest river of another colour
 54 | INEX_LD-2012307	 July, 1850  president died Millard Fillmore sworn following day
 55 | INEX_LD-2012309	residents small island city-state  Malay Peninsula Chinese
 56 | INEX_LD-2012311	John Lennon Yoko Ono album Starting Over
 57 | INEX_LD-2012313	John Turturro 1991 Coen Brothers film
 58 | INEX_LD-2012315	Baguio Quezon City  Manila official independence 1945
 59 | INEX_LD-2012317	daggeroso inclined to use a dagger novel Sons and Lovers
 60 | INEX_LD-2012318	Directed Bela Glen Glenda Bride Monster Plan 9 Outer Space
 61 | INEX_LD-2012319	 1994 short story collection Alice Munro is Open
 62 | INEX_LD-2012321	Asian port state-city Sir Stamford Raffles
 63 | INEX_LD-2012323	Large glaciers island nation Langjokull Hofsjokull Vatnajokull
 64 | INEX_LD-2012325	successor James G. Blaine studied law
 65 | INEX_LD-2012327	Beloved author African-American Nobel Prize Literature
 66 | INEX_LD-2012329	Sweden Iceland currency
 67 | INEX_LD-2012331	Seoul Korea river name ethnic group China
 68 | INEX_LD-2012333	Prime minister Canada nicknamed Silver-Tongued Laurier longest unbroken term
 69 | INEX_LD-2012335	U.S. president authorise nuclear weapons against Japan
 70 | INEX_LD-2012336	1906 territory Papua island Australian
 71 | INEX_LD-2012337	Texas city Baylor University tornado 1953
 72 | INEX_LD-2012339	Nelson Mandela John Dube
 73 | INEX_LD-2012341	1997  Houston airport president
 74 | INEX_LD-2012343	The Heart of a Woman poet's autobiography
 75 | INEX_LD-2012345	Kennedy assassination governor of Texas seriously injured
 76 | INEX_LD-2012347	seat Florida country Dade
 77 | INEX_LD-2012349	Alexander Nevsky Cathedral Bulgarian city liberation Turks
 78 | INEX_LD-2012351	Indian Cuisine dish rice dhal vegetables roti papad
 79 | INEX_LD-2012353	country German language
 80 | INEX_LD-2012354	greatest guitarist
 81 | INEX_LD-2012355	England football player highest paid
 82 | INEX_LD-2012357	 prima ballerina Bolshoi Theatre 1960
 83 | INEX_LD-2012359	 Bob Ricker Executive Director the latest front group for the anti-gun movement
 84 | INEX_LD-2012361	most famous award winning actor singer
 85 | INEX_LD-2012363	American twins famous  American professional tennis double players
 86 | INEX_LD-2012365	mathematician computer scientist MIT's six inaugural MacVicar Faculty Fellows
 87 | INEX_LD-2012367	invented telescope
 88 | INEX_LD-2012369	most famous civic-military airports
 89 | INEX_LD-2012371	most beautiful railway stations world cities located
 90 | INEX_LD-2012372	famous historical battlefields opponents fought
 91 | INEX_LD-2012373	birds cannot fly
 92 | INEX_LD-2012375	animals lay eggs mammals
 93 | INEX_LD-2012377	allegedly caused World War I
 94 | INEX_LD-2012379	pairs cities same language same longitude different countries
 95 | INEX_LD-2012381	movie directors directed a block buster
 96 | INEX_LD-2012383	famous computer scientists disappeared at sea
 97 | INEX_LD-2012385	famous politicians vegetarians
 98 | INEX_LD-2012387	famous river confluence dam constructed
 99 | INEX_LD-2012389	frequently visited sharks gulf Indian Ocean
100 | INEX_LD-2012390	baseball player most homeruns national league
101 | INEX_XER-60	olympic classes dinghy sailing
102 | INEX_XER-62	Neil Gaiman novels
103 | INEX_XER-63	Hugo awarded best novels
104 | INEX_XER-64	Alan Moore graphic novels adapted to film
105 | INEX_XER-65	Pacific navigators Australia explorers
106 | INEX_XER-67	Ferris and observation wheels
107 | INEX_XER-72	films shot in Venice 
108 | INEX_XER-73	magazines about indie-music 
109 | INEX_XER-74	circus mammals
110 | INEX_XER-79	Works by Charles Rennie Mackintosh
111 | INEX_XER-81	Movies about English hooligans
112 | INEX_XER-86	List of countries in World War Two
113 | INEX_XER-87	Axis powers of World War II
114 | INEX_XER-88	Nordic authors who are known for children's literature
115 | INEX_XER-91	Paul Auster novels
116 | INEX_XER-94	Hybrid cars sold in Europe
117 | INEX_XER-95	Tom Hanks movies where he plays a leading role.
118 | INEX_XER-96	Pure object-oriented programing languages
119 | INEX_XER-97	Compilers that can compile both C and C++
120 | INEX_XER-98	Makers of lawn tennis rackets
121 | INEX_XER-99	Computer systems that have a recursive acronym for the name
122 | INEX_XER-100	Operating systems to which Steve Jobs related
123 | INEX_XER-106	Noble english person from the Hundred Years' War
124 | INEX_XER-108	State capitals of the United States of America
125 | INEX_XER-109	National capitals situated on islands
126 | INEX_XER-110	Nobel Prize in Literature winners who were also poets
127 | INEX_XER-113	Formula 1 drivers that won the Monaco Grand Prix
128 | INEX_XER-114	Formula one races in Europe
129 | INEX_XER-115	Formula One World Constructors' Champions
130 | INEX_XER-116	Italian nobel prize winners
131 | INEX_XER-117	Musicians who appeared in the Blues Brothers movies
132 | INEX_XER-118	French car models in 1960's
133 | INEX_XER-119	Swiss cantons where they speak German
134 | INEX_XER-121	US presidents since 1960
135 | INEX_XER-122	Movies with eight or more Academy Awards
136 | INEX_XER-123	FIFA world cup national team winners since 1974
137 | INEX_XER-124	Novels that won the Booker Prize
138 | INEX_XER-125	countries which have won the FIFA world cup
139 | INEX_XER-126	toy train manufacturers that are still in business
140 | INEX_XER-127	german female politicians
141 | INEX_XER-128	Bond girls
142 | INEX_XER-129	Science fiction book written in the 1980
143 | INEX_XER-130	Star Trek Captains
144 | INEX_XER-132	living nordic classical composers
145 | INEX_XER-133	EU countries
146 | INEX_XER-134	record-breaking sprinters in male 100-meter sprints
147 | INEX_XER-135	professional baseball team in Japan
148 | INEX_XER-136	Japanese players in Major League Baseball
149 | INEX_XER-138	National Parks East Coast Canada US
150 | INEX_XER-139	Films directed by Akira Kurosawa
151 | INEX_XER-140	Airports in Germany
152 | INEX_XER-141	Universities in Catalunya
153 | INEX_XER-143	Hanseatic league in Germany in the Netherlands Circle
154 | INEX_XER-144	chess world champions
155 | INEX_XER-147	Chemical elements that are named after people
156 | QALD2_te-1	Which German cities have more than 250000 inhabitants?
157 | QALD2_te-2	Who was the successor of John F. Kennedy?
158 | QALD2_te-3	Who is the mayor of Berlin?
159 | QALD2_te-5	What is the second highest mountain on Earth?
160 | QALD2_te-6	Give me all professional skateboarders from Sweden.
161 | QALD2_te-8	To which countries does the Himalayan mountain system extend?
162 | QALD2_te-9	Give me a list of all trumpet players that were bandleaders.
163 | QALD2_te-11	Who is the Formula 1 race driver with the most races?
164 | QALD2_te-12	Give me all world heritage sites designated within the past five years.
165 | QALD2_te-13	Who is the youngest player in the Premier League?
166 | QALD2_te-14	Give me all members of Prodigy.
167 | QALD2_te-15	What is the longest river?
168 | QALD2_te-17	Give me all cars that are produced in Germany.
169 | QALD2_te-19	Give me all people that were born in Vienna and died in Berlin.
170 | QALD2_te-21	What is the capital of Canada?
171 | QALD2_te-22	Who is the governor of Texas?
172 | QALD2_te-24	Who was the father of Queen Elizabeth II?
173 | QALD2_te-25	Which U.S. state has been admitted latest?
174 | QALD2_te-27	Sean Parnell is the governor of which U.S. state?
175 | QALD2_te-28	Give me all movies directed by Francis Ford Coppola.
176 | QALD2_te-29	Give me all actors starring in movies directed by and starring William Shatner.
177 | QALD2_te-31	Give me all current Methodist national leaders.
178 | QALD2_te-33	Give me all Australian nonprofit organizations.
179 | QALD2_te-34	In which military conflicts did Lawrence of Arabia participate?
180 | QALD2_te-35	Who developed Skype?
181 | QALD2_te-39	Give me all companies in Munich.
182 | QALD2_te-40	List all boardgames by GMT.
183 | QALD2_te-41	Who founded Intel?
184 | QALD2_te-42	Who is the husband of Amanda Palmer?
185 | QALD2_te-43	Give me all breeds of the German Shepherd dog.
186 | QALD2_te-44	Which cities does the Weser flow through?
187 | QALD2_te-45	Which countries are connected by the Rhine?
188 | QALD2_te-46	Which professional surfers were born on the Philippines?
189 | QALD2_te-48	In which UK city are the headquarters of the MI6?
190 | QALD2_te-49	Which other weapons did the designer of the Uzi develop?
191 | QALD2_te-51	Give me all Frisian islands that belong to the Netherlands.
192 | QALD2_te-53	What is the ruling party in Lisbon?
193 | QALD2_te-55	Which Greek goddesses dwelt on Mount Olympus?
194 | QALD2_te-57	Give me the Apollo 14 astronauts.
195 | QALD2_te-58	What is the time zone of Salt Lake City?
196 | QALD2_te-59	Which U.S. states are in the same timezone as Utah?
197 | QALD2_te-60	Give me a list of all lakes in Denmark.
198 | QALD2_te-63	Give me all Argentine films.
199 | QALD2_te-64	Give me all launch pads operated by NASA.
200 | QALD2_te-65	Which instruments did John Lennon play?
201 | QALD2_te-66	Which ships were called after Benjamin Franklin?
202 | QALD2_te-67	Who are the parents of the wife of Juan Carlos I?
203 | QALD2_te-72	In which U.S. state is Area 51 located?
204 | QALD2_te-75	Which daughters of British earls died in the same place they were born in?
205 | QALD2_te-76	List the children of Margaret Thatcher.
206 | QALD2_te-77	Who was called Scarface?
207 | QALD2_te-80	Give me all books by William Goldman with more than 300 pages.
208 | QALD2_te-81	Which books by Kerouac were published by Viking Press?
209 | QALD2_te-82	Give me a list of all American inventions.
210 | QALD2_te-84	Who created the comic Captain America?
211 | QALD2_te-86	What is the largest city in Australia?
212 | QALD2_te-87	Who composed the music for Harold and Maude?
213 | QALD2_te-88	Which films starring Clint Eastwood did he direct himself?
214 | QALD2_te-89	In which city was the former Dutch queen Juliana buried?
215 | QALD2_te-90	Where is the residence of the prime minister of Spain?
216 | QALD2_te-91	Which U.S. State has the abbreviation MN?
217 | QALD2_te-92	Show me all songs from Bruce Springsteen released between 1980 and 1990.
218 | QALD2_te-93	Which movies did Sam Raimi direct after Army of Darkness?
219 | QALD2_te-95	Who wrote the lyrics for the Polish national anthem?
220 | QALD2_te-97	Who painted The Storm on the Sea of Galilee?
221 | QALD2_te-98	Which country does the creator of Miffy come from?
222 | QALD2_te-99	For which label did Elvis record his first album?
223 | QALD2_te-100	Who produces Orangina?
224 | QALD2_tr-1	Give me all female Russian astronauts.
225 | QALD2_tr-3	Who is the daughter of Bill Clinton married to?
226 | QALD2_tr-4	Which river does the Brooklyn Bridge cross?
227 | QALD2_tr-6	Where did Abraham Lincoln die?
228 | QALD2_tr-8	Which states of Germany are governed by the Social Democratic Party?
229 | QALD2_tr-9	Which U.S. states possess gold minerals?
230 | QALD2_tr-10	In which country does the Nile start?
231 | QALD2_tr-11	Which countries have places with more than two caves?
232 | QALD2_tr-13	Which classis does the Millepede belong to?
233 | QALD2_tr-15	Who created Goofy?
234 | QALD2_tr-16	Give me the capitals of all countries in Africa.
235 | QALD2_tr-17	Give me all cities in New Jersey with more than 100000 inhabitants.
236 | QALD2_tr-18	Which museum exhibits The Scream by Munch?
237 | QALD2_tr-21	Which states border Illinois?
238 | QALD2_tr-22	In which country is the Limerick Lake?
239 | QALD2_tr-23	Which television shows were created by Walt Disney?
240 | QALD2_tr-24	Which mountain is the highest after the Annapurna?
241 | QALD2_tr-25	In which films directed by Garry Marshall was Julia Roberts starring?
242 | QALD2_tr-26	Which bridges are of the same type as the Manhattan Bridge?
243 | QALD2_tr-28	Which European countries have a constitutional monarchy?
244 | QALD2_tr-29	Which awards did WikiLeaks win?
245 | QALD2_tr-30	Which state of the USA has the highest population density?
246 | QALD2_tr-31	What is the currency of the Czech Republic?
247 | QALD2_tr-32	Which countries in the European Union adopted the Euro?
248 | QALD2_tr-34	Which countries have more than two official languages?
249 | QALD2_tr-35	Who is the owner of Universal Studios?
250 | QALD2_tr-36	Through which countries does the Yenisei river flow?
251 | QALD2_tr-38	Which monarchs of the United Kingdom were married to a German?
252 | QALD2_tr-40	What is the highest mountain in Australia?
253 | QALD2_tr-41	Give me all soccer clubs in Spain.
254 | QALD2_tr-42	What are the official languages of the Philippines?
255 | QALD2_tr-43	Who is the mayor of New York City?
256 | QALD2_tr-44	Who designed the Brooklyn Bridge?
257 | QALD2_tr-45	Which telecommunications organizations are located in Belgium?
258 | QALD2_tr-47	What is the highest place of Karakoram?
259 | QALD2_tr-49	Give me all companies in the advertising industry.
260 | QALD2_tr-50	What did Bruce Carver die from?
261 | QALD2_tr-51	Give me all school types.
262 | QALD2_tr-52	Which presidents were born in 1945?
263 | QALD2_tr-53	Give me all presidents of the United States.
264 | QALD2_tr-54	Who was the wife of U.S. president Lincoln?
265 | QALD2_tr-55	Who developed the video game World of Warcraft?
266 | QALD2_tr-57	List all episodes of the first season of the HBO television series The Sopranos!
267 | QALD2_tr-58	Who produced the most films?
268 | QALD2_tr-59	Give me all people with first name Jimmy.
269 | QALD2_tr-61	Which mountains are higher than the Nanga Parbat?
270 | QALD2_tr-62	Who created Wikipedia?
271 | QALD2_tr-63	Give me all actors starring in Batman Begins.
272 | QALD2_tr-64	Which software has been developed by organizations founded in California?
273 | QALD2_tr-65	Which companies work in the aerospace industry as well as on nuclear reactor technology?
274 | QALD2_tr-68	Which actors were born in Germany?
275 | QALD2_tr-69	Which caves have more than 3 entrances?
276 | QALD2_tr-70	Give me all films produced by Hal Roach.
277 | QALD2_tr-71	Give me all video games published by Mean Hamster Software.
278 | QALD2_tr-72	Which languages are spoken in Estonia?
279 | QALD2_tr-73	Who owns Aldi?
280 | QALD2_tr-74	Which capitals in Europe were host cities of the summer olympic games?
281 | QALD2_tr-75	Who has been the 5th president of the United States of America?
282 | QALD2_tr-77	Which music albums contain the song Last Christmas?
283 | QALD2_tr-78	Give me all books written by Danielle Steel.
284 | QALD2_tr-79	Which airports are located in California, USA?
285 | QALD2_tr-80	Give me all Canadian Grunge record labels.
286 | QALD2_tr-81	Which country has the most official languages?
287 | QALD2_tr-82	In which programming language is GIMP written?
288 | QALD2_tr-83	Who produced films starring Natalie Portman?
289 | QALD2_tr-84	Give me all movies with Tom Cruise.
290 | QALD2_tr-85	In which films did Julia Roberts as well as Richard Gere play?
291 | QALD2_tr-86	Give me all female German chancellors.
292 | QALD2_tr-87	Who wrote the book The pillars of the Earth?
293 | QALD2_tr-89	Give me all soccer clubs in the Premier League.
294 | QALD2_tr-91	Which organizations were founded in 1950?
295 | QALD2_tr-92	What is the highest mountain?
296 | SemSearch_ES-1	44 magnum hunting
297 | SemSearch_ES-2	B. F. Skinner
298 | SemSearch_ES-3	Bookwork
299 | SemSearch_ES-4	NAACP Image Awards
300 | SemSearch_ES-5	Scott County
301 | SemSearch_ES-6	air wisconsin
302 | SemSearch_ES-7	airsoft glock
303 | SemSearch_ES-8	aloha sol
304 | SemSearch_ES-9	american embassy nairobi
305 | SemSearch_ES-10	asheville north carolina
306 | SemSearch_ES-11	austin powers
307 | SemSearch_ES-12	austin texas
308 | SemSearch_ES-13	banana paper making
309 | SemSearch_ES-14	ben franklin
310 | SemSearch_ES-15	bradley center
311 | SemSearch_ES-16	brooklyn bridge
312 | SemSearch_ES-17	butte montana
313 | SemSearch_ES-18	canasta cards
314 | SemSearch_ES-19	carl lewis
315 | SemSearch_ES-20	carolina
316 | SemSearch_ES-21	charles darwin
317 | SemSearch_ES-22	city of charlotte
318 | SemSearch_ES-23	city of virginia beach
319 | SemSearch_ES-24	coastal carolina
320 | SemSearch_ES-25	david suchet
321 | SemSearch_ES-26	disney orlando
322 | SemSearch_ES-27	earl may
323 | SemSearch_ES-28	el salvador
324 | SemSearch_ES-29	ellis college
325 | SemSearch_ES-30	eloan line of credit
326 | SemSearch_ES-31	emery
327 | SemSearch_ES-32	fitzgerald auto mall chambersburg pa
328 | SemSearch_ES-33	harry potter
329 | SemSearch_ES-34	harry potter movie
330 | SemSearch_ES-35	hospice of cincinnati
331 | SemSearch_ES-36	imdb batman returns
332 | SemSearch_ES-37	jack johnson
333 | SemSearch_ES-38	jack the ripper
334 | SemSearch_ES-39	james caldwell high school
335 | SemSearch_ES-40	james clayton md
336 | SemSearch_ES-41	joan of arc
337 | SemSearch_ES-42	john maxwell
338 | SemSearch_ES-45	keith urban
339 | SemSearch_ES-47	king arthur
340 | SemSearch_ES-48	la scala restaurant philadelphia
341 | SemSearch_ES-49	laura bush
342 | SemSearch_ES-50	laura steele bob and tom
343 | SemSearch_ES-51	lexus of maplewood
344 | SemSearch_ES-52	lincoln park
345 | SemSearch_ES-53	lynchburg virginia
346 | SemSearch_ES-54	marc anthony
347 | SemSearch_ES-55	marcus theaters
348 | SemSearch_ES-56	mario bros
349 | SemSearch_ES-57	martin luther king
350 | SemSearch_ES-58	mason ohio
351 | SemSearch_ES-59	mercy hospital in des moines, ia
352 | SemSearch_ES-60	michael douglas
353 | SemSearch_ES-61	mr rourke fantasy island
354 | SemSearch_ES-63	old winchester shotguns
355 | SemSearch_ES-64	omeara ford
356 | SemSearch_ES-65	orlando florida
357 | SemSearch_ES-66	overeaters anonymous
358 | SemSearch_ES-67	ovguide movies
359 | SemSearch_ES-68	pierce county washington
360 | SemSearch_ES-69	piosenki mp3
361 | SemSearch_ES-70	radio italia online
362 | SemSearch_ES-71	richmond virginia
363 | SemSearch_ES-72	rock 103 memphis
364 | SemSearch_ES-73	rowan university
365 | SemSearch_ES-74	sacred heart u
366 | SemSearch_ES-75	sagemont church houston tx
367 | SemSearch_ES-76	san antonio
368 | SemSearch_ES-77	savannah tech
369 | SemSearch_ES-78	sharp pc
370 | SemSearch_ES-79	shobana masala
371 | SemSearch_ES-80	sonny and cher
372 | SemSearch_ES-81	south dakota state university
373 | SemSearch_ES-82	st lucia
374 | SemSearch_ES-83	st paul saints
375 | SemSearch_ES-84	the dish danielle fishel
376 | SemSearch_ES-85	the longest yard sale
377 | SemSearch_ES-86	the morning call lehigh valley pa
378 | SemSearch_ES-87	the quick lift
379 | SemSearch_ES-88	thomas jefferson
380 | SemSearch_ES-89	university of north dakota
381 | SemSearch_ES-90	university of phoenix
382 | SemSearch_ES-91	westminster abbey
383 | SemSearch_ES-93	08 toyota tundra
384 | SemSearch_ES-94	Hugh Downs
385 | SemSearch_ES-95	MADRID
386 | SemSearch_ES-96	New England Coffee
387 | SemSearch_ES-97	PINK PANTHER 2
388 | SemSearch_ES-98	University of Texas at Austin
389 | SemSearch_ES-99	University of York
390 | SemSearch_ES-100	YMCA Tampa
391 | SemSearch_ES-101	ashley wagner
392 | SemSearch_ES-102	beach flowers
393 | SemSearch_ES-103	bounce city humble tx
394 | SemSearch_ES-104	bourbonnais il
395 | SemSearch_ES-105	cedar garden apartments
396 | SemSearch_ES-106	chase masterson
397 | SemSearch_ES-107	concord steel
398 | SemSearch_ES-108	danielia cotton
399 | SemSearch_ES-109	david hewlett
400 | SemSearch_ES-111	eagle rock, ca
401 | SemSearch_ES-112	espresso tv stands
402 | SemSearch_ES-114	glenn frey
403 | SemSearch_ES-115	goodwill of michigan
404 | SemSearch_ES-118	iowa energy
405 | SemSearch_ES-119	john elliott
406 | SemSearch_ES-120	lawrence general hospital
407 | SemSearch_ES-123	michael zimmerman
408 | SemSearch_ES-124	motorola bluetooth hs850
409 | SemSearch_ES-125	nokia e73
410 | SemSearch_ES-127	palm tungsten e2 handheld
411 | SemSearch_ES-128	philadelphia neufchatel cheese
412 | SemSearch_ES-129	pizza populous detroit mi
413 | SemSearch_ES-130	plymouth police department
414 | SemSearch_ES-131	scpa san diego
415 | SemSearch_ES-132	sealy mattress co
416 | SemSearch_ES-133	sedona hiking trails
417 | SemSearch_ES-134	skye woods
418 | SemSearch_ES-135	spring shoes canada
419 | SemSearch_ES-136	sri lanka government gazette
420 | SemSearch_ES-137	steak express
421 | SemSearch_ES-138	syracuse spca
422 | SemSearch_ES-139	the big texan steak house
423 | SemSearch_ES-140	toledo bend realty
424 | SemSearch_ES-141	ventura county court
425 | SemSearch_ES-142	windsor hotel philadelphia
426 | SemSearch_LS-1	Apollo astronauts who walked on the Moon
427 | SemSearch_LS-2	Arab states of the Persian Gulf
428 | SemSearch_LS-3	astronauts who landed on the Moon
429 | SemSearch_LS-4	Axis powers of World War II
430 | SemSearch_LS-5	books of the Jewish canon
431 | SemSearch_LS-6	boroughs of New York City
432 | SemSearch_LS-7	Branches of the US military
433 | SemSearch_LS-8	continents in the world
434 | SemSearch_LS-9	degrees of Eastern Orthodox monasticism
435 | SemSearch_LS-10	did nicole kidman have any siblings
436 | SemSearch_LS-11	dioceses of the church of ireland
437 | SemSearch_LS-12	first targets of the atomic bomb
438 | SemSearch_LS-13	five great epics of Tamil literature
439 | SemSearch_LS-14	gods who dwelt on Mount Olympus
440 | SemSearch_LS-16	hijackers in the September 11 attacks
441 | SemSearch_LS-17	houses of the Russian parliament
442 | SemSearch_LS-18	john lennon, parents
443 | SemSearch_LS-19	kenya's captain in cricket
444 | SemSearch_LS-20	kublai khan siblings
445 | SemSearch_LS-21	lilly allen parents
446 | SemSearch_LS-22	major leagues in the united states
447 | SemSearch_LS-24	matt berry tv series
448 | SemSearch_LS-25	members of u2?
449 | SemSearch_LS-26	movies starring erykah badu
450 | SemSearch_LS-29	nations where Portuguese is an official language
451 | SemSearch_LS-30	orders (or 'choirs') of angels
452 | SemSearch_LS-31	permanent members of the UN Security Council
453 | SemSearch_LS-32	presidents depicted on mount rushmore who died of shooting
454 | SemSearch_LS-33	provinces and territories of Canada
455 | SemSearch_LS-34	ratt albums
456 | SemSearch_LS-35	republics of the former Yugoslavia
457 | SemSearch_LS-36	revolutionaries of 1959 in Cuba
458 | SemSearch_LS-37	standard axioms of set theory
459 | SemSearch_LS-38	states that border oklahoma
460 | SemSearch_LS-39	ten ancient Greek city-kingdoms of Cyprus
461 | SemSearch_LS-40	the first 13 american states
462 | SemSearch_LS-41	the four of the companions of the prophet
463 | SemSearch_LS-42	twelve tribes or sons of Israel
464 | SemSearch_LS-43	what books did paul of tarsus write?
465 | SemSearch_LS-44	what languages do they speak in afghanistan
466 | SemSearch_LS-46	where the British monarch is also head of state
467 | SemSearch_LS-49	who invented the python programming language
468 | SemSearch_LS-50	wonders of the ancient world
469 | TREC_Entity-1	Carriers that Blackberry makes phones for.
470 | TREC_Entity-2	Winners of the ACM Athena award.
471 | TREC_Entity-4	Professional sports teams in Philadelphia.
472 | TREC_Entity-5	Products of Medimmune, Inc.
473 | TREC_Entity-6	Organizations that award Nobel prizes.
474 | TREC_Entity-7	Airlines that currently use Boeing 747 planes.
475 | TREC_Entity-9	Members of The Beaux Arts Trio.
476 | TREC_Entity-10	Campuses of Indiana University.
477 | TREC_Entity-11	Donors to the Home Depot Foundation.
478 | TREC_Entity-12	Airlines that Air Canada has code share flights with.
479 | TREC_Entity-14	Authors awarded an Anthony Award at Bouchercon in 2007.
480 | TREC_Entity-15	Universities that are members of the SEC conference for football.
481 | TREC_Entity-16	Sponsors of the Mancuso quilt festivals.
482 | TREC_Entity-17	Chefs with a show on the Food Network.
483 | TREC_Entity-18	Members of the band Jefferson Airplane.
484 | TREC_Entity-19	Companies that John Hennessey serves on the board of.
485 | TREC_Entity-20	Scotch whisky distilleries on the island of Islay.
486 | 


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | pylucene>=4.10.1


--------------------------------------------------------------------------------