├── .gitignore ├── LICENSE.txt ├── README.md ├── bioc_json.py ├── json ├── abbr.json ├── collection.json ├── everything-sentence.json └── everything.json ├── test_bioc_json.sh └── xml ├── abbr.xml ├── collection.xml ├── everything-sentence.xml └── everything.xml /.gitignore: -------------------------------------------------------------------------------- 1 | README.html 2 | -------------------------------------------------------------------------------- /LICENSE.txt: -------------------------------------------------------------------------------- 1 | ## Public Domain Notice ## 2 | 3 | This work is a "United States Government Work" under the terms of the 4 | United States Copyright Act. It was written as part of the authors' 5 | official duties as a United States Government employee and thus cannot 6 | be copyrighted within the United States. The data is freely available 7 | to the public for use. The National Library of Medicine and the U.S. 8 | Government have not placed any restriction on its use or reproduction. 9 | 10 | Although all reasonable efforts have been taken to ensure the accuracy 11 | and reliability of the data and its source code, the NLM and the 12 | U.S. Government do not and cannot warrant the performance or results 13 | that may be obtained by using it. The NLM and the U.S. Government 14 | disclaim all warranties, express or implied, including warranties of 15 | performance, merchantability or fitness for any particular purpose. 16 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # BioC-JSON # 2 | 3 | BioC-JSON is a tool that converts between BioC XML files and BioC JSON 4 | files. BioC is a simple data structure to share text data and 5 | annotations. More information is available at 6 | [http://bioc.sourceforge.net](http://bioc.sourceforge.net). 7 | 8 | When it was first implemented, XML was used for serialization. 
Now a 9 | lot of data exchange occurs in JSON. The BioC data structures have 10 | now been implemented in JSON. The JSON structure is as similar to the 11 | original data structures as is realistic. 12 | 13 | In addition to its use as a command-line tool, the classes can be used 14 | with in-memory data. 15 | 16 | ## Public Domain Notice ## 17 | 18 | This work is a "United States Government Work" under the terms of the 19 | United States Copyright Act. It was written as part of the authors' 20 | official duties as a United States Government employee and thus cannot 21 | be copyrighted within the United States. The data is freely available 22 | to the public for use. The National Library of Medicine and the U.S. 23 | Government have not placed any restriction on its use or reproduction. 24 | 25 | Although all reasonable efforts have been taken to ensure the accuracy 26 | and reliability of the data and its source code, the NLM and the 27 | U.S. Government do not and cannot warrant the performance or results 28 | that may be obtained by using it. The NLM and the U.S. Government 29 | disclaim all warranties, express or implied, including warranties of 30 | performance, merchantability or fitness for any particular purpose. 31 | 32 | ## Setup 33 | 34 | This code uses the PyBioC library. It can be found at 35 | [https://github.com/2mh/PyBioC](https://github.com/2mh/PyBioC). Python's `sys.path` needs to contain 36 | the `src` directory from where PyBioC was installed. One way to do 37 | this is to modify the `pybioc_path` line in `bioc_json.py`. Of course, 38 | there are many other ways to do this. 39 | 40 | The script `test_bioc_json.sh` includes some simple tests to make sure 41 | the code is running properly on your computer. 42 | 43 | The `json` and `xml` directories contain examples of BioC XML and BioC 44 | JSON files. 
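As a rough illustration of the JSON layout used by the example files, the sketch below walks a collection and lists every annotation with its location. It is not part of this repository: the helper name `iter_annotations` is hypothetical, and the inline sample is a hand-trimmed fragment in the same shape as `json/abbr.json`. It needs only the standard library, not PyBioC.

```python
import json

# Hand-trimmed sample in the same shape as json/abbr.json (illustrative only).
SAMPLE = """
{
  "source": "PubMed",
  "date": "20120606",
  "key": "abbreviation.key",
  "infons": {},
  "documents": [
    {
      "id": "21488974",
      "infons": {},
      "relations": [],
      "passages": [
        {
          "offset": 105,
          "text": "",
          "infons": {"type": "abstract"},
          "sentences": [],
          "relations": [],
          "annotations": [
            {
              "id": "SF0",
              "text": "HNE",
              "infons": {"type": "ABBR", "ABBR": "ShortForm"},
              "locations": [{"offset": 132, "length": 3}]
            }
          ]
        }
      ]
    }
  ]
}
"""


def iter_annotations(collection):
    """Yield (document id, annotation id, text, offset, length) tuples.

    Hypothetical helper, not part of bioc_json.py.
    """
    for doc in collection["documents"]:
        for psg in doc["passages"]:
            # Annotations may be attached to the passage or to its sentences.
            notes = list(psg["annotations"])
            for sent in psg["sentences"]:
                notes.extend(sent["annotations"])
            for note in notes:
                for loc in note["locations"]:
                    yield (doc["id"], note["id"], note["text"],
                           loc["offset"], loc["length"])


collection = json.loads(SAMPLE)
rows = list(iter_annotations(collection))
for row in rows:
    print(row)
```

To try it on a real file, replace `json.loads(SAMPLE)` with `json.load(f)` on an opened file such as `json/abbr.json`.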
45 | 46 | ## Usage 47 | 48 | ./bioc_json.py -b|-j in_file out_file 49 | 50 | The option `-b` converts to BioC XML; the option `-j` converts 51 | to BioC JSON. `in_file` is the file to read and `out_file` is the file to write. 52 | -------------------------------------------------------------------------------- /bioc_json.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | #### Write a BioC collection in JSON 4 | 5 | from __future__ import print_function 6 | 7 | import json 8 | import sys 9 | import os 10 | 11 | pybioc_path = '/home/comeau/src/PyBioC' 12 | sys.path.append(os.path.join(pybioc_path, 'src')) 13 | import bioc 14 | 15 | class BioC2JSON: 16 | def node(this, node): 17 | json_node = {'refid': node.refid, 'role': node.role} 18 | return json_node 19 | 20 | def relation(this, rel): 21 | json_rel = {} 22 | json_rel['id'] = rel.id 23 | json_rel['infons'] = rel.infons 24 | json_rel['nodes'] = [this.node(n) for n in rel.nodes] 25 | return json_rel 26 | 27 | def location(this, loc): 28 | json_loc = {'offset': int(loc.offset), 'length': int(loc.length)} 29 | return json_loc 30 | 31 | def annotation(this, note): 32 | json_note = {} 33 | json_note['id'] = note.id 34 | json_note['infons'] = note.infons 35 | json_note['text'] = note.text 36 | json_note['locations'] = [this.location(l) 37 | for l in note.locations] 38 | return json_note 39 | 40 | def sentence(this, sent): 41 | json_sent = {} 42 | json_sent['infons'] = sent.infons 43 | json_sent['offset'] = int(sent.offset) 44 | json_sent['text'] = sent.text 45 | json_sent['annotations'] = [this.annotation(a) 46 | for a in sent.annotations] 47 | json_sent['relations'] = [this.relation(r) 48 | for r in sent.relations] 49 | return json_sent 50 | 51 | def passage(this, psg): 52 | json_psg = {} 53 | json_psg['infons'] = psg.infons 54 | json_psg['offset'] = int(psg.offset) 55 | json_psg['text'] = psg.text 56 | json_psg['text'] = psg.text if psg.text else "" 57 | 
json_psg['sentences'] = [this.sentence(s) 58 | for s in psg.sentences] 59 | json_psg['annotations'] = [this.annotation(a) 60 | for a in psg.annotations] 61 | json_psg['relations'] = [this.relation(r) 62 | for r in psg.relations] 63 | return json_psg 64 | 65 | def document(this, doc): 66 | json_doc = {} 67 | json_doc['id'] = doc.id 68 | json_doc['infons'] = doc.infons 69 | json_doc['passages'] = [this.passage(p) 70 | for p in doc.passages] 71 | json_doc['relations'] = [this.relation(r) 72 | for r in doc.relations] 73 | return json_doc 74 | 75 | def collection(this, collection): 76 | json_collection = {} 77 | json_collection['source'] = collection.source 78 | json_collection['date'] = collection.date 79 | json_collection['key'] = collection.key 80 | json_collection['infons'] = collection.infons 81 | json_collection['documents'] = [this.document(d) 82 | for d in collection.documents] 83 | return json_collection 84 | 85 | class JSON2BioC: 86 | 87 | def node(this, json_node): 88 | node = bioc.BioCNode() 89 | node.refid = json_node['refid'] 90 | node.role = json_node['role'] 91 | return node 92 | 93 | def relation(this, json_rel): 94 | rel = bioc.BioCRelation() 95 | rel.id = json_rel['id'] 96 | rel.infons = json_rel['infons'] 97 | rel.nodes = [this.node(n) for n in json_rel['nodes']] 98 | return rel 99 | 100 | def location(this, json_loc): 101 | loc = bioc.BioCLocation() 102 | loc.offset = str(json_loc['offset']) 103 | loc.length = str(json_loc['length']) 104 | return loc 105 | 106 | def annotation(this, json_note): 107 | note = bioc.BioCAnnotation() 108 | note.id = json_note['id'] 109 | note.infons = json_note['infons'] 110 | note.text = json_note['text'] 111 | note.locations = [this.location(l) 112 | for l in json_note['locations']] 113 | return note 114 | 115 | def sentence(this, json_sent): 116 | sent = bioc.BioCSentence() 117 | sent.infons = json_sent['infons'] 118 | sent.offset = str(json_sent['offset']) 119 | sent.text = json_sent['text'] 120 | sent.annotations = 
[this.annotation(a) 121 | for a in json_sent['annotations']] 122 | sent.relations = [this.relation(r) 123 | for r in json_sent['relations']] 124 | return sent 125 | 126 | def passage(this, json_psg): 127 | psg = bioc.BioCPassage() 128 | psg.infons = json_psg['infons'] 129 | psg.offset = str(json_psg['offset']) 130 | psg.text = json_psg.get('text') 131 | psg.sentences = [this.sentence(s) 132 | for s in json_psg['sentences']] 133 | psg.annotations = [this.annotation(a) 134 | for a in json_psg['annotations']] 135 | psg.relations = [this.relation(r) 136 | for r in json_psg['relations']] 137 | return psg 138 | 139 | def document(this, json_doc): 140 | doc = bioc.BioCDocument() 141 | doc.id = json_doc['id'] 142 | doc.infons = json_doc['infons'] 143 | doc.passages = [this.passage(p) 144 | for p in json_doc['passages']] 145 | doc.relations = [this.relation(r) 146 | for r in json_doc['relations']] 147 | return doc 148 | 149 | def collection(this, json_collection): 150 | collection = bioc.BioCCollection() 151 | collection.source = json_collection['source'] 152 | collection.date = json_collection['date'] 153 | collection.key = json_collection['key'] 154 | collection.infons = json_collection['infons'] 155 | collection.documents = [this.document(d) 156 | for d in json_collection['documents']] 157 | return collection 158 | 159 | if __name__ == "__main__": 160 | 161 | if len(sys.argv) != 4: 162 | print('usage:', sys.argv[0], '-b|-j in_file out_file') 163 | exit(1) 164 | pgm, option, in_file, out_file = sys.argv 165 | 166 | if option == '-j': 167 | reader = bioc.BioCReader(in_file) 168 | reader.read() 169 | 170 | bioc2json = BioC2JSON() 171 | bioc_json = bioc2json.collection(reader.collection) 172 | with open(out_file, 'w') as f: 173 | json.dump(bioc_json, f, indent=2) 174 | print(file=f) 175 | 176 | elif option == '-b': 177 | bioc_json = None 178 | with open(in_file) as f: 179 | bioc_json = json.load(f) 180 | 181 | # print json.dumps(bioc_json, indent=2) 182 | 183 | json2bioc 
=JSON2BioC() 184 | bioc_collection = json2bioc.collection(bioc_json) 185 | 186 | writer = bioc.BioCWriter(out_file, bioc_collection) 187 | writer.write() 188 | -------------------------------------------------------------------------------- /json/abbr.json: -------------------------------------------------------------------------------- 1 | { 2 | "date": "20120606", 3 | "source": "PubMed", 4 | "infons": {}, 5 | "documents": [ 6 | { 7 | "passages": [ 8 | { 9 | "text": "", 10 | "offset": 0, 11 | "relations": [], 12 | "infons": { 13 | "type": "title" 14 | }, 15 | "sentences": [], 16 | "annotations": [] 17 | }, 18 | { 19 | "text": "", 20 | "offset": 114, 21 | "relations": [], 22 | "infons": { 23 | "type": "abstract" 24 | }, 25 | "sentences": [], 26 | "annotations": [] 27 | } 28 | ], 29 | "infons": {}, 30 | "id": "21951408", 31 | "relations": [] 32 | }, 33 | { 34 | "passages": [ 35 | { 36 | "text": "", 37 | "offset": 0, 38 | "relations": [], 39 | "infons": { 40 | "type": "title" 41 | }, 42 | "sentences": [], 43 | "annotations": [] 44 | }, 45 | { 46 | "text": "", 47 | "offset": 105, 48 | "relations": [ 49 | { 50 | "nodes": [ 51 | { 52 | "role": "LongForm", 53 | "refid": "LF0" 54 | }, 55 | { 56 | "role": "ShortForm", 57 | "refid": "SF0" 58 | } 59 | ], 60 | "infons": { 61 | "type": "ABBR" 62 | }, 63 | "id": "R0" 64 | }, 65 | { 66 | "nodes": [ 67 | { 68 | "role": "LongForm", 69 | "refid": "LF1" 70 | }, 71 | { 72 | "role": "ShortForm", 73 | "refid": "SF1" 74 | } 75 | ], 76 | "infons": { 77 | "type": "ABBR" 78 | }, 79 | "id": "R1" 80 | } 81 | ], 82 | "infons": { 83 | "type": "abstract" 84 | }, 85 | "sentences": [], 86 | "annotations": [ 87 | { 88 | "text": "Human neutrophil elastase", 89 | "infons": { 90 | "type": "ABBR", 91 | "ABBR": "LongForm" 92 | }, 93 | "id": "LF0", 94 | "locations": [ 95 | { 96 | "length": 25, 97 | "offset": 105 98 | } 99 | ] 100 | }, 101 | { 102 | "text": "HNE", 103 | "infons": { 104 | "type": "ABBR", 105 | "ABBR": "ShortForm" 106 | }, 107 | "id": 
"SF0", 108 | "locations": [ 109 | { 110 | "length": 3, 111 | "offset": 132 112 | } 113 | ] 114 | }, 115 | { 116 | "text": "cathepsin G", 117 | "infons": { 118 | "type": "ABBR", 119 | "ABBR": "LongForm" 120 | }, 121 | "id": "LF1", 122 | "locations": [ 123 | { 124 | "length": 11, 125 | "offset": 141 126 | } 127 | ] 128 | }, 129 | { 130 | "text": "CatG", 131 | "infons": { 132 | "type": "ABBR", 133 | "ABBR": "ShortForm" 134 | }, 135 | "id": "SF1", 136 | "locations": [ 137 | { 138 | "length": 4, 139 | "offset": 154 140 | } 141 | ] 142 | } 143 | ] 144 | } 145 | ], 146 | "infons": {}, 147 | "id": "21488974", 148 | "relations": [] 149 | }, 150 | { 151 | "passages": [ 152 | { 153 | "text": "", 154 | "offset": 0, 155 | "relations": [], 156 | "infons": { 157 | "type": "title" 158 | }, 159 | "sentences": [], 160 | "annotations": [] 161 | } 162 | ], 163 | "infons": {}, 164 | "id": "21660417", 165 | "relations": [] 166 | }, 167 | { 168 | "passages": [ 169 | { 170 | "text": "", 171 | "offset": 0, 172 | "relations": [], 173 | "infons": { 174 | "type": "title" 175 | }, 176 | "sentences": [], 177 | "annotations": [] 178 | }, 179 | { 180 | "text": "", 181 | "offset": 68, 182 | "relations": [], 183 | "infons": { 184 | "type": "abstract" 185 | }, 186 | "sentences": [], 187 | "annotations": [] 188 | } 189 | ], 190 | "infons": {}, 191 | "id": "21831032", 192 | "relations": [] 193 | }, 194 | { 195 | "passages": [ 196 | { 197 | "text": "", 198 | "offset": 0, 199 | "relations": [], 200 | "infons": { 201 | "type": "title" 202 | }, 203 | "sentences": [], 204 | "annotations": [] 205 | }, 206 | { 207 | "text": "", 208 | "offset": 160, 209 | "relations": [ 210 | { 211 | "nodes": [ 212 | { 213 | "role": "LongForm", 214 | "refid": "LF2" 215 | }, 216 | { 217 | "role": "ShortForm", 218 | "refid": "SF2" 219 | } 220 | ], 221 | "infons": { 222 | "type": "ABBR" 223 | }, 224 | "id": "R2" 225 | } 226 | ], 227 | "infons": { 228 | "type": "abstract" 229 | }, 230 | "sentences": [], 231 | "annotations": [ 232 
| { 233 | "text": "fluorescence in situ hybridization", 234 | "infons": { 235 | "type": "ABBR", 236 | "ABBR": "LongForm" 237 | }, 238 | "id": "LF2", 239 | "locations": [ 240 | { 241 | "length": 34, 242 | "offset": 246 243 | } 244 | ] 245 | }, 246 | { 247 | "text": "FISH", 248 | "infons": { 249 | "type": "ABBR", 250 | "ABBR": "ShortForm" 251 | }, 252 | "id": "SF2", 253 | "locations": [ 254 | { 255 | "length": 4, 256 | "offset": 282 257 | } 258 | ] 259 | } 260 | ] 261 | } 262 | ], 263 | "infons": {}, 264 | "id": "21571957", 265 | "relations": [] 266 | }, 267 | { 268 | "passages": [ 269 | { 270 | "text": "", 271 | "offset": 0, 272 | "relations": [], 273 | "infons": { 274 | "type": "title" 275 | }, 276 | "sentences": [], 277 | "annotations": [] 278 | }, 279 | { 280 | "text": "", 281 | "offset": 87, 282 | "relations": [ 283 | { 284 | "nodes": [ 285 | { 286 | "role": "LongForm", 287 | "refid": "LF3" 288 | }, 289 | { 290 | "role": "ShortForm", 291 | "refid": "SF3" 292 | } 293 | ], 294 | "infons": { 295 | "type": "ABBR" 296 | }, 297 | "id": "R3" 298 | } 299 | ], 300 | "infons": { 301 | "type": "abstract" 302 | }, 303 | "sentences": [], 304 | "annotations": [ 305 | { 306 | "text": "instrumental variables", 307 | "infons": { 308 | "type": "ABBR", 309 | "ABBR": "LongForm" 310 | }, 311 | "id": "LF3", 312 | "locations": [ 313 | { 314 | "length": 22, 315 | "offset": 144 316 | } 317 | ] 318 | }, 319 | { 320 | "text": "IVs", 321 | "infons": { 322 | "type": "ABBR", 323 | "ABBR": "ShortForm" 324 | }, 325 | "id": "SF3", 326 | "locations": [ 327 | { 328 | "length": 3, 329 | "offset": 168 330 | } 331 | ] 332 | } 333 | ] 334 | } 335 | ], 336 | "infons": {}, 337 | "id": "21216802", 338 | "relations": [] 339 | }, 340 | { 341 | "passages": [ 342 | { 343 | "text": "", 344 | "offset": 0, 345 | "relations": [], 346 | "infons": { 347 | "type": "title" 348 | }, 349 | "sentences": [], 350 | "annotations": [] 351 | }, 352 | { 353 | "text": "", 354 | "offset": 82, 355 | "relations": [], 356 | 
"infons": { 357 | "type": "abstract" 358 | }, 359 | "sentences": [], 360 | "annotations": [] 361 | } 362 | ], 363 | "infons": {}, 364 | "id": "21893921", 365 | "relations": [] 366 | }, 367 | { 368 | "passages": [ 369 | { 370 | "text": "", 371 | "offset": 0, 372 | "relations": [], 373 | "infons": { 374 | "type": "title" 375 | }, 376 | "sentences": [], 377 | "annotations": [] 378 | }, 379 | { 380 | "text": "", 381 | "offset": 62, 382 | "relations": [ 383 | { 384 | "nodes": [ 385 | { 386 | "role": "LongForm", 387 | "refid": "LF4" 388 | }, 389 | { 390 | "role": "ShortForm", 391 | "refid": "SF4" 392 | } 393 | ], 394 | "infons": { 395 | "type": "ABBR" 396 | }, 397 | "id": "R4" 398 | }, 399 | { 400 | "nodes": [ 401 | { 402 | "role": "LongForm", 403 | "refid": "LF5" 404 | }, 405 | { 406 | "role": "ShortForm", 407 | "refid": "SF5" 408 | } 409 | ], 410 | "infons": { 411 | "type": "ABBR" 412 | }, 413 | "id": "R5" 414 | }, 415 | { 416 | "nodes": [ 417 | { 418 | "role": "LongForm", 419 | "refid": "LF6" 420 | }, 421 | { 422 | "role": "ShortForm", 423 | "refid": "SF6" 424 | } 425 | ], 426 | "infons": { 427 | "type": "ABBR" 428 | }, 429 | "id": "R6" 430 | }, 431 | { 432 | "nodes": [ 433 | { 434 | "role": "LongForm", 435 | "refid": "LF7" 436 | }, 437 | { 438 | "role": "ShortForm", 439 | "refid": "SF7" 440 | } 441 | ], 442 | "infons": { 443 | "type": "ABBR" 444 | }, 445 | "id": "R7" 446 | }, 447 | { 448 | "nodes": [ 449 | { 450 | "role": "LongForm", 451 | "refid": "LF8" 452 | }, 453 | { 454 | "role": "ShortForm", 455 | "refid": "SF8" 456 | } 457 | ], 458 | "infons": { 459 | "type": "ABBR" 460 | }, 461 | "id": "R8" 462 | } 463 | ], 464 | "infons": { 465 | "type": "abstract" 466 | }, 467 | "sentences": [], 468 | "annotations": [ 469 | { 470 | "text": "acute leukemia", 471 | "infons": { 472 | "type": "ABBR", 473 | "ABBR": "LongForm" 474 | }, 475 | "id": "LF4", 476 | "locations": [ 477 | { 478 | "length": 14, 479 | "offset": 533 480 | } 481 | ] 482 | }, 483 | { 484 | "text": "AL", 485 | 
"infons": { 486 | "type": "ABBR", 487 | "ABBR": "ShortForm" 488 | }, 489 | "id": "SF4", 490 | "locations": [ 491 | { 492 | "length": 2, 493 | "offset": 549 494 | } 495 | ] 496 | }, 497 | { 498 | "text": "acute myeloid leukemia", 499 | "infons": { 500 | "type": "ABBR", 501 | "ABBR": "LongForm" 502 | }, 503 | "id": "LF5", 504 | "locations": [ 505 | { 506 | "length": 22, 507 | "offset": 706 508 | } 509 | ] 510 | }, 511 | { 512 | "text": "AML", 513 | "infons": { 514 | "type": "ABBR", 515 | "ABBR": "ShortForm" 516 | }, 517 | "id": "SF5", 518 | "locations": [ 519 | { 520 | "length": 3, 521 | "offset": 730 522 | } 523 | ] 524 | }, 525 | { 526 | "text": "acute promyelocytic leukemia", 527 | "infons": { 528 | "type": "ABBR", 529 | "ABBR": "LongForm" 530 | }, 531 | "id": "LF6", 532 | "locations": [ 533 | { 534 | "length": 28, 535 | "offset": 981 536 | } 537 | ] 538 | }, 539 | { 540 | "text": "APL", 541 | "infons": { 542 | "type": "ABBR", 543 | "ABBR": "ShortForm" 544 | }, 545 | "id": "SF6", 546 | "locations": [ 547 | { 548 | "length": 3, 549 | "offset": 1011 550 | } 551 | ] 552 | }, 553 | { 554 | "text": "acute lymphoid leukemia", 555 | "infons": { 556 | "type": "ABBR", 557 | "ABBR": "LongForm" 558 | }, 559 | "id": "LF7", 560 | "locations": [ 561 | { 562 | "length": 23, 563 | "offset": 1149 564 | } 565 | ] 566 | }, 567 | { 568 | "text": "ALL", 569 | "infons": { 570 | "type": "ABBR", 571 | "ABBR": "ShortForm" 572 | }, 573 | "id": "SF7", 574 | "locations": [ 575 | { 576 | "length": 3, 577 | "offset": 1174 578 | } 579 | ] 580 | }, 581 | { 582 | "text": "chronic myeloid leukemia", 583 | "infons": { 584 | "type": "ABBR", 585 | "ABBR": "LongForm" 586 | }, 587 | "id": "LF8", 588 | "locations": [ 589 | { 590 | "length": 24, 591 | "offset": 1183 592 | } 593 | ] 594 | }, 595 | { 596 | "text": "CML", 597 | "infons": { 598 | "type": "ABBR", 599 | "ABBR": "ShortForm" 600 | }, 601 | "id": "SF8", 602 | "locations": [ 603 | { 604 | "length": 3, 605 | "offset": 1209 606 | } 607 | ] 608 | } 
609 | ] 610 | } 611 | ], 612 | "infons": {}, 613 | "id": "21176341", 614 | "relations": [] 615 | }, 616 | { 617 | "passages": [ 618 | { 619 | "text": "", 620 | "offset": 0, 621 | "relations": [], 622 | "infons": { 623 | "type": "title" 624 | }, 625 | "sentences": [], 626 | "annotations": [] 627 | }, 628 | { 629 | "text": "", 630 | "offset": 143, 631 | "relations": [], 632 | "infons": { 633 | "type": "abstract" 634 | }, 635 | "sentences": [], 636 | "annotations": [] 637 | } 638 | ], 639 | "infons": {}, 640 | "id": "21978046", 641 | "relations": [] 642 | }, 643 | { 644 | "passages": [ 645 | { 646 | "text": "", 647 | "offset": 0, 648 | "relations": [], 649 | "infons": { 650 | "type": "title" 651 | }, 652 | "sentences": [], 653 | "annotations": [] 654 | }, 655 | { 656 | "text": "", 657 | "offset": 69, 658 | "relations": [], 659 | "infons": { 660 | "type": "abstract" 661 | }, 662 | "sentences": [], 663 | "annotations": [] 664 | } 665 | ], 666 | "infons": {}, 667 | "id": "21785578", 668 | "relations": [] 669 | } 670 | ], 671 | "key": "abbreviation.key" 672 | } 673 | -------------------------------------------------------------------------------- /json/collection.json: -------------------------------------------------------------------------------- 1 | { 2 | "date": "20120606", 3 | "source": "PubMed", 4 | "infons": {}, 5 | "documents": [ 6 | { 7 | "passages": [ 8 | { 9 | "text": "Constraint-induced movement therapy for the upper paretic limb in acute or sub-acute stroke: a systematic review.", 10 | "offset": 0, 11 | "relations": [], 12 | "infons": { 13 | "type": "title" 14 | }, 15 | "sentences": [], 16 | "annotations": [] 17 | }, 18 | { 19 | "text": "Constraint-induced movement therapy is a commonly used intervention to improve upper limb function after stroke. However, the effectiveness of constraint-induced movement therapy and its optimal dosage during acute or sub-acute stroke is still under debate. 
To examine the literature on the effects of constraint-induced movement therapy in acute or sub-acute stroke. A literature search was performed to identify randomized, controlled trials; studies with the same outcome measure were pooled by calculating the mean difference. Separate quantitative analyses for high-intensity and low-intensity constraint-induced movement therapy were applied when possible. Five randomized, controlled trials were included, comprising 106 participants. The meta-analysis demonstrated significant mean differences in favor of constraint-induced movement therapy for the Fugl-Meyer arm, the Action Research Arm Test, the Motor Activity Log, Quality of Movement and the Grooved Pegboard Test. Nonsignificant mean difference in favor of constraint-induced movement therapy were found for the Motor Activity Log, Amount of Use. Separate analyses for high-intensity and low-intensity constraint-induced movement therapy resulted in significant favorable mean differences for low-intensity constraint-induced movement therapy for all outcome measures, in contrast to high-intensity constraint-induced movement therapy. This meta-analysis demonstrates a trend toward positive effects of high-intensity and low-intensity constraint-induced movement therapy in acute or sub-acute stroke, but also suggests that low-intensity constraint-induced movement therapy may be more beneficial during this period than high-intensity constraint-induced movement therapy. However, these results were based on a small number of studies. 
Therefore, more trials are needed applying different doses of therapy early after stroke and a better understanding is needed about the different time windows in which underlying mechanisms of recovery operate.", 20 | "offset": 114, 21 | "relations": [], 22 | "infons": { 23 | "type": "abstract" 24 | }, 25 | "sentences": [], 26 | "annotations": [] 27 | } 28 | ], 29 | "infons": {}, 30 | "id": "21951408", 31 | "relations": [] 32 | }, 33 | { 34 | "passages": [ 35 | { 36 | "text": "Sensor materials for the detection of human neutrophil elastase and cathepsin G activity in wound fluid.", 37 | "offset": 0, 38 | "relations": [], 39 | "infons": { 40 | "type": "title" 41 | }, 42 | "sentences": [], 43 | "annotations": [] 44 | }, 45 | { 46 | "text": "Human neutrophil elastase (HNE) and cathepsin G (CatG) are involved in the pathogenesis of a number of inflammatory disorders. These serine proteinases are released by neutrophils and monocytes in case of infection. Wound infection is a severe complication regarding wound healing causing diagnostic and therapeutic problems. In this study we have shown the potential of HNE and CatG to be used as markers for early detection of infection. Significant differences in HNE and CatG levels in infected and non-infected wound fluids were observed. Peptide substrates for these two enzymes were successfully immobilised on different surfaces, including collagen, modified collagen, polyamide polyesters and silica gel. HNE and CatG activities were monitored directly in wound fluid via hydrolysis of the chromogenic substrates. Infected wound fluids led to significant higher substrate hydrolysis compared with non-infected ones. These different approaches could be used for the development of devices which are able to detect elevated enzyme activities before manifestation of infection directly on bandages. 
This would allow a timely intervention by medical doctors thus preventing severe infections.", 47 | "offset": 105, 48 | "relations": [], 49 | "infons": { 50 | "type": "abstract" 51 | }, 52 | "sentences": [], 53 | "annotations": [] 54 | } 55 | ], 56 | "infons": {}, 57 | "id": "21488974", 58 | "relations": [] 59 | }, 60 | { 61 | "passages": [ 62 | { 63 | "text": "Perineal hernia after laparoscopic abdominoperineal resection--reconstruction of the pelvic floor with a biological mesh (Permacol\u2122).", 64 | "offset": 0, 65 | "relations": [], 66 | "infons": { 67 | "type": "title" 68 | }, 69 | "sentences": [], 70 | "annotations": [] 71 | } 72 | ], 73 | "infons": {}, 74 | "id": "21660417", 75 | "relations": [] 76 | }, 77 | { 78 | "passages": [ 79 | { 80 | "text": "The fetal origins of obesity: early origins of altered food intake.", 81 | "offset": 0, 82 | "relations": [], 83 | "infons": { 84 | "type": "title" 85 | }, 86 | "sentences": [], 87 | "annotations": [] 88 | }, 89 | { 90 | "text": "There is now clear evidence from population-based and experimental animal studies that maternal obesity and maternal overnutrition, particularly excessive intake of high-fat and high-sugar diets, is associated with an increased risk of obesity and type 2 diabetes in the offspring. Whilst the physiological reasons for this association are still not fully understood, one of the key pathways appears to be the ability of exposure to an oversupply of energy, fat and sugar during critical windows of development to program an increased food intake in the offspring. This review will focus on our current understanding of the programming of food intake, with a focus on the importance of the maternal diet. 
Specifically, we will discuss how exposure to an increased energy supply before birth and in early infancy, and/or increased maternal intake of palatable foods alters the development of the systems regulating appetite and food preferences, and how these changes interact to promote excess consumption and thus predispose the offspring to weight gain and obesity.", 91 | "offset": 68, 92 | "relations": [], 93 | "infons": { 94 | "type": "abstract" 95 | }, 96 | "sentences": [], 97 | "annotations": [] 98 | } 99 | ], 100 | "infons": {}, 101 | "id": "21831032", 102 | "relations": [] 103 | }, 104 | { 105 | "passages": [ 106 | { 107 | "text": "A comparison of equivocal immunohistochemical results with anti-HER2/neu antibodies A0485 and SP3 with corresponding FISH results in routine clinical practice.", 108 | "offset": 0, 109 | "relations": [], 110 | "infons": { 111 | "type": "title" 112 | }, 113 | "sentences": [], 114 | "annotations": [] 115 | }, 116 | { 117 | "text": "HER2/neu status in breast cancer is determined by immunohistochemical analysis and/or fluorescence in situ hybridization (FISH). Previous studies have found widely varying sensitivities and specificities for anti-HER2/neu antibodies, including recently developed rabbit monoclonal antibodies. The current prospective study compared rabbit monoclonal antibody SP3 and rabbit polyclonal antibody A0485 immunostaining on routinely processed consecutive cases of breast carcinoma. Of 1,610 cases tested, 261 (16.2%) equivocal (2+) cases were evaluated by FISH. Of 253 cases equivocal with A0485 results, 125 (49.4%) were negative with SP3. In 22 (8.7%) of 253 cases equivocal with A0485, there was amplification by FISH, and 3 of these cases were SP3- (0/1+). Of the 20 (14.8%) of 135 SP3-equivocal cases amplified by FISH, 1 case was A0485-. The reported false-negative rate with A0485 is 2.8%, and the American Society of Clinical Oncology/College of American Pathologists guidelines recommend a rate of less than 5%. 
Compared with A0485, the false-negative rate with SP3 is only 0.3% (3/1,156) higher, but it shows about a 50% reduction in equivocal scores, reducing the need for reflex FISH testing.", 118 | "offset": 160, 119 | "relations": [], 120 | "infons": { 121 | "type": "abstract" 122 | }, 123 | "sentences": [], 124 | "annotations": [] 125 | } 126 | ], 127 | "infons": {}, 128 | "id": "21571957", 129 | "relations": [] 130 | }, 131 | { 132 | "passages": [ 133 | { 134 | "text": "Using multiple genetic variants as instrumental variables for modifiable risk factors.", 135 | "offset": 0, 136 | "relations": [], 137 | "infons": { 138 | "type": "title" 139 | }, 140 | "sentences": [], 141 | "annotations": [] 142 | }, 143 | { 144 | "text": "Mendelian randomisation analyses use genetic variants as instrumental variables (IVs) to estimate causal effects of modifiable risk factors on disease outcomes. Genetic variants typically explain a small proportion of the variability in risk factors; hence Mendelian randomisation analyses can require large sample sizes. However, an increasing number of genetic variants have been found to be robustly associated with disease-related outcomes in genome-wide association studies. Use of multiple instruments can improve the precision of IV estimates, and also permit examination of underlying IV assumptions. We discuss the use of multiple genetic variants in Mendelian randomisation analyses with continuous outcome variables where all relationships are assumed to be linear. We describe possible violations of IV assumptions, and how multiple instrument analyses can be used to identify them. We present an example using four adiposity-associated genetic variants as IVs for the causal effect of fat mass on bone density, using data on 5509 children enrolled in the ALSPAC birth cohort study. We also use simulation studies to examine the effect of different sets of IVs on precision and bias. 
When each instrument independently explains variability in the risk factor, use of multiple instruments increases the precision of IV estimates. However, inclusion of weak instruments could increase finite sample bias. Missing data on multiple genetic variants can diminish the available sample size, compared with single instrument analyses. In simulations with additive genotype-risk factor effects, IV estimates using a weighted allele score had similar properties to estimates using multiple instruments. Under the correct conditions, multiple instrument analyses are a promising approach for Mendelian randomisation studies. Further research is required into multiple imputation methods to address missing data issues in IV estimation.", 145 | "offset": 87, 146 | "relations": [], 147 | "infons": { 148 | "type": "abstract" 149 | }, 150 | "sentences": [], 151 | "annotations": [] 152 | } 153 | ], 154 | "infons": {}, 155 | "id": "21216802", 156 | "relations": [] 157 | }, 158 | { 159 | "passages": [ 160 | { 161 | "text": "Healthy connections: online social networks and their potential for peer support.", 162 | "offset": 0, 163 | "relations": [], 164 | "infons": { 165 | "type": "title" 166 | }, 167 | "sentences": [], 168 | "annotations": [] 169 | }, 170 | { 171 | "text": "Social and professional support for mental health is lacking in many rural areas - highlighting the need for innovative ways to improve access to services. This study explores the potential of online social networking as an avenue for peer support. Using a cross sectional survey, 74 secondary students answered questions relating to internet use, online social network use and perceptions of mental health support. Over half of the sample had experienced a need for mental health support with 53% of participants turning to the internet. Results indicate that online social networking sites were used regularly by 82% of the sample and 47% believed these sites could help with mental health problems. 
The study concluded that online social networking sites may be able to link young people together with others in similar situations. The popularity and frequency of use may allow these sites to provide information, advice and direction for those seeking help.", 172 | "offset": 82, 173 | "relations": [], 174 | "infons": { 175 | "type": "abstract" 176 | }, 177 | "sentences": [], 178 | "annotations": [] 179 | } 180 | ], 181 | "infons": {}, 182 | "id": "21893921", 183 | "relations": [] 184 | }, 185 | { 186 | "passages": [ 187 | { 188 | "text": "[april mRNA expression in newly diagnosed leukemia patients].", 189 | "offset": 0, 190 | "relations": [], 191 | "infons": { 192 | "type": "title" 193 | }, 194 | "sentences": [], 195 | "annotations": [] 196 | }, 197 | { 198 | "text": "This study was aimed to quantitatively detect the levels of april mRNA expression in leukemia patients so as to provide theoretical basis for the target therapy directing at april in leukemia. Real time fluorescent quantitative PCR was used to detect the relative expression level of april mRNA in newly diagnosed leukemia patients and to analyze the changes of its expression level in various type of leukemia. The results showed that the april mRNA expression level in acute leukemia (AL) patients was significantly higher than that in normal controls, there was statistical difference between them (p < 0.05); april mRNA expression level in acute myeloid leukemia (AML) patients was significantly higher than that in normal controls (p < 0.05) and positively correlated with white blood cell count \u2265 20.0 \u00d7 10(9)/L (p < 0.05), but not related with extramedullary infiltration and the expression of CD34. Except for acute promyelocytic leukemia (APL), april mRNA expression level was negatively correlated with sensitivity of patients to chemotherapy. 
april mRNA expression levels in acute lymphoid leukemia (ALL) and chronic myeloid leukemia (CML) patients were not higher than that in normal controls, there was no statistical difference between them (p > 0.05). It is concluded that april gene overexpression exits in AML patients. APRIL protein produced by AML cells probably plays an important role in abnormal proliferation and drug-resistance of AML cells.", 199 | "offset": 62, 200 | "relations": [], 201 | "infons": { 202 | "type": "abstract" 203 | }, 204 | "sentences": [], 205 | "annotations": [] 206 | } 207 | ], 208 | "infons": {}, 209 | "id": "21176341", 210 | "relations": [] 211 | }, 212 | { 213 | "passages": [ 214 | { 215 | "text": "Optimizing the accuracy of a helical diode array dosimeter: a comprehensive calibration methodology coupled with a novel virtual inclinometer.", 216 | "offset": 0, 217 | "relations": [], 218 | "infons": { 219 | "type": "title" 220 | }, 221 | "sentences": [], 222 | "annotations": [] 223 | }, 224 | { 225 | "text": "PURPOSE: The goal of any dosimeter is to be as accurate as possible when measuring absolute dose to compare with calculated dose. This limits the uncertainties associated with the dosimeter itself and allows the task of dose QA to focus on detecting errors in the treatment planning (TPS) and/or delivery systems. This work introduces enhancements to the measurement accuracy of a 3D dosimeter comprised of a helical plane of diodes in a volumetric phantom. METHODS: We describe the methods and derivations of new corrections that account for repetition rate dependence, intrinsic relative sensitivity per diode, field size dependence based on the dynamic field size determination, and positional correction. Required and described is an accurate \"virtual inclinometer\" algorithm. The system allows for calibrating the array directly against an ion chamber signal collected with high angular resolution. 
These enhancements are quantitatively validated using several strategies including ion chamber measurements taken using a \"blank\" plastic shell mimicking the actual phantom, and comparison to high resolution dose calculations for a variety of fields: static, simple arcs, and VMAT. A number of sophisticated treatment planning algorithms were benchmarked against ion chamber measurements for their ability to handle a large air cavity in the phantom. RESULTS: Each calibration correction is quantified and presented vs its independent variable(s). The virtual inclinometer is validated by direct comparison to the gantry angle vs time data from machine log files. The effects of the calibration are quantified and improvements are seen in the dose agreement with the ion chamber reference measurements and with the TPS calculations. These improved agreements are a result of removing prior limitations and assumptions in the calibration methodology. Average gamma analysis passing rates for VMAT plans based on the AAPM TG-119 report are 98.4 and 93.3% for the 3%/3 mm and 2%/2 mm dose-error/distance to agreement threshold criteria, respectively, with the global dose-error normalization. With the local dose-error normalization, the average passing rates are reduced to 94.6 and 85.7% for the 3%/3 mm and 2%/2 mm criteria, respectively. Some algorithms in the convolution/superposition family are not sufficiently accurate in predicting the exit dose in the presence of a 15 cm diameter air cavity. CONCLUSIONS: Introduction of the improved calibration methodology, enabled by a robust virtual inclinometer algorithm, improves the accuracy of the dosimeter's absolute dose measurements. With our treatment planning and delivery chain, gamma analysis passing rates for the VMAT plans based on the AAPM TG-119 report are expected to be above 91% and average at about 95% level for \u03b3(3%/3 mm) with the local dose-error normalization. 
This stringent comparison methodology is more indicative of the true VMAT system commissioning accuracy compared to the often quoted dose-error normalization to a single high value.", 226 | "offset": 143, 227 | "relations": [], 228 | "infons": { 229 | "type": "abstract" 230 | }, 231 | "sentences": [], 232 | "annotations": [] 233 | } 234 | ], 235 | "infons": {}, 236 | "id": "21978046", 237 | "relations": [] 238 | }, 239 | { 240 | "passages": [ 241 | { 242 | "text": "Hemangiopericytoma of maxilla in a pediatric patient: a case report.", 243 | "offset": 0, 244 | "relations": [], 245 | "infons": { 246 | "type": "title" 247 | }, 248 | "sentences": [], 249 | "annotations": [] 250 | }, 251 | { 252 | "text": "The Hemangiopericytoma is a malignant vascular tumor arising from mesenchymal cells with pericytic differentiation. Hemangiopericytoma is most commonly seen in adults, and only 5% to 10% of cases occur in children. The tumor is extremely rare in the head and neck region (16%)1. Cytogenic abnormalities have been present in some hemangiopericytoma cases. Surgical resection remains the mainstay treatment. Adjuvant chemotherapy and radiotherapy is appropriate for cases of incomplete resections and life-threatening tumors particularly in children. Late relapses may occur and require long-term follow-up. A 4-year-old child patient with hemangiopericytoma of the maxilla presented with firm, recurrent, but painless jaw mass. Radiographic investigations revealed a poorly circumscribed radiolucency. The lesion biopsy showed well-circumscribed multiple lobules of tumor mass consisting of tightly packed, spindle-shaped cells. Chemotherapy and radiotherapy of the lesion was conducted. The role of the pediatric dental team is extensive in children with hemangiopericytoma, who require a regular dental review. 
The patient's oncologist should be immediately contacted if there is any suspicion of recurrence.", 253 | "offset": 69, 254 | "relations": [], 255 | "infons": { 256 | "type": "abstract" 257 | }, 258 | "sentences": [], 259 | "annotations": [] 260 | } 261 | ], 262 | "infons": {}, 263 | "id": "21785578", 264 | "relations": [] 265 | } 266 | ], 267 | "key": "collection.key" 268 | } 269 | -------------------------------------------------------------------------------- /json/everything-sentence.json: -------------------------------------------------------------------------------- 1 | { 2 | "date": "20130426", 3 | "source": "Made up file to test that everything is allowed and processed. Has text in the passage.", 4 | "infons": { 5 | "collection-infon-key": "collection-infon-value" 6 | }, 7 | "documents": [ 8 | { 9 | "passages": [ 10 | { 11 | "text": "", 12 | "offset": 0, 13 | "relations": [ 14 | { 15 | "nodes": [ 16 | { 17 | "role": "passage-relation", 18 | "refid": "RP1" 19 | } 20 | ], 21 | "infons": { 22 | "passage-relation-infon-key": "passage-relation-infon-value" 23 | }, 24 | "id": "RP1" 25 | } 26 | ], 27 | "infons": { 28 | "passage-infon-key": "passage-infon-value" 29 | }, 30 | "sentences": [ 31 | { 32 | "text": "text of sentence", 33 | "infons": { 34 | "sentence-infon-key": "sentence-infon-value" 35 | }, 36 | "annotations": [ 37 | { 38 | "text": "annotation text", 39 | "infons": { 40 | "annotation-infon-key": "annotation-infon-value" 41 | }, 42 | "id": "S1", 43 | "locations": [ 44 | { 45 | "length": 2, 46 | "offset": 1 47 | } 48 | ] 49 | } 50 | ], 51 | "relations": [ 52 | { 53 | "nodes": [ 54 | { 55 | "role": "sentence-relation", 56 | "refid": "RS1" 57 | } 58 | ], 59 | "infons": { 60 | "setence-relation-infon-key": "sentence-relation-infon-value" 61 | }, 62 | "id": "RS1" 63 | } 64 | ], 65 | "offset": 0 66 | } 67 | ], 68 | "annotations": [] 69 | } 70 | ], 71 | "infons": { 72 | "document-infon-key": "document-infon-value" 73 | }, 74 | "id": "1", 75 | "relations": [ 76 
| { 77 | "nodes": [ 78 | { 79 | "role": "document-relation", 80 | "refid": "RD1" 81 | } 82 | ], 83 | "infons": { 84 | "document-relation-infon-key": "document-relation-infon-value" 85 | }, 86 | "id": "D1" 87 | } 88 | ] 89 | } 90 | ], 91 | "key": "everything.key" 92 | } 93 | -------------------------------------------------------------------------------- /json/everything.json: -------------------------------------------------------------------------------- 1 | { 2 | "date": "20130426", 3 | "source": "Made up file to test that everything is allowed and processed. Has text in the passage.", 4 | "infons": { 5 | "collection-infon-key": "collection-infon-value" 6 | }, 7 | "documents": [ 8 | { 9 | "passages": [ 10 | { 11 | "text": "text of passage", 12 | "offset": 0, 13 | "relations": [ 14 | { 15 | "nodes": [ 16 | { 17 | "role": "passage-relation", 18 | "refid": "RP1" 19 | } 20 | ], 21 | "infons": { 22 | "passage-relation-infon-key": "passage-relation-infon-value" 23 | }, 24 | "id": "RP1" 25 | } 26 | ], 27 | "infons": { 28 | "passage-infon-key": "passage-infon-value" 29 | }, 30 | "sentences": [], 31 | "annotations": [ 32 | { 33 | "text": "annotation text", 34 | "infons": { 35 | "annotation-infon-key": "annotation-infon-value" 36 | }, 37 | "id": "P1", 38 | "locations": [ 39 | { 40 | "length": 2, 41 | "offset": 1 42 | } 43 | ] 44 | } 45 | ] 46 | } 47 | ], 48 | "infons": { 49 | "document-infon-key": "document-infon-value" 50 | }, 51 | "id": "1", 52 | "relations": [ 53 | { 54 | "nodes": [ 55 | { 56 | "role": "document-relation", 57 | "refid": "RD1" 58 | } 59 | ], 60 | "infons": { 61 | "document-relation-infon-key": "document-relation-infon-value" 62 | }, 63 | "id": "D1" 64 | } 65 | ] 66 | } 67 | ], 68 | "key": "everything.key" 69 | } 70 | -------------------------------------------------------------------------------- /test_bioc_json.sh: -------------------------------------------------------------------------------- 1 | #!/bin/sh 2 | 3 | ### test bioc_json.py 4 | 5 | 
JSON_SRC=json 6 | XML_SRC=xml_old 7 | XML_SRC=xml 8 | XML_TARGET=xml 9 | DIFF_OPT=-q 10 | 11 | ./bioc_json.py -b $JSON_SRC/collection.json z.xml 12 | diff $DIFF_OPT xml/collection.xml z.xml 13 | 14 | ./bioc_json.py -b $JSON_SRC/abbr.json z.xml 15 | diff $DIFF_OPT xml/abbr.xml z.xml 16 | 17 | ./bioc_json.py -b $JSON_SRC/everything.json z.xml 18 | diff $DIFF_OPT xml/everything.xml z.xml 19 | 20 | ./bioc_json.py -b $JSON_SRC/everything-sentence.json z.xml 21 | diff $DIFF_OPT xml/everything-sentence.xml z.xml 22 | 23 | ./bioc_json.py -j $XML_SRC/collection.xml z.json 24 | diff $DIFF_OPT json/collection.json z.json 25 | 26 | ./bioc_json.py -j $XML_SRC/abbr.xml z.json 27 | diff $DIFF_OPT json/abbr.json z.json 28 | 29 | ./bioc_json.py -j $XML_SRC/everything.xml z.json 30 | diff $DIFF_OPT json/everything.json z.json 31 | 32 | ./bioc_json.py -j $XML_SRC/everything-sentence.xml z.json 33 | diff $DIFF_OPT json/everything-sentence.json z.json 34 | 35 | exit 36 | 37 | ## various commands for creating and working with the test set 38 | 39 | ./bioc_json.py -b json/collection.json $XML_TARGET/collection.xml 40 | ./bioc_json.py -b json/abbr.json $XML_TARGET/abbr.xml 41 | ./bioc_json.py -b json/everything.json $XML_TARGET/everything.xml 42 | ./bioc_json.py -b json/everything-sentence.json $XML_TARGET/everything-sentence.xml 43 | 44 | exit 45 | 46 | # ./bioc_json.py -j $XML_SRC/collection.xml json/collection.json 47 | ./bioc_json.py -j $XML_SRC/abbr.xml json/abbr.json 48 | # ./bioc_json.py -j $XML_SRC/everything.xml json/everything.json 49 | ./bioc_json.py -j $XML_SRC/everything-sentence.xml json/everything-sentence.json 50 | 51 | exit 52 | 53 | # ./bioc_json.py -b json/collection.json z.xml 54 | # diff xml/collection.xml z.xml 55 | 56 | -------------------------------------------------------------------------------- /xml/abbr.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | PubMed 4 | 20120606 5 | abbreviation.key 6 | 7 | 21951408 8 | 
9 | title 10 | 0 11 | 12 | 13 | 14 | abstract 15 | 114 16 | 17 | 18 | 19 | 20 | 21488974 21 | 22 | title 23 | 0 24 | 25 | 26 | 27 | abstract 28 | 105 29 | 30 | 31 | ABBR 32 | LongForm 33 | 34 | Human neutrophil elastase 35 | 36 | 37 | ABBR 38 | ShortForm 39 | 40 | HNE 41 | 42 | 43 | ABBR 44 | LongForm 45 | 46 | cathepsin G 47 | 48 | 49 | ABBR 50 | ShortForm 51 | 52 | CatG 53 | 54 | 55 | ABBR 56 | 57 | 58 | 59 | 60 | ABBR 61 | 62 | 63 | 64 | 65 | 66 | 67 | 21660417 68 | 69 | title 70 | 0 71 | 72 | 73 | 74 | 75 | 21831032 76 | 77 | title 78 | 0 79 | 80 | 81 | 82 | abstract 83 | 68 84 | 85 | 86 | 87 | 88 | 21571957 89 | 90 | title 91 | 0 92 | 93 | 94 | 95 | abstract 96 | 160 97 | 98 | 99 | ABBR 100 | LongForm 101 | 102 | fluorescence in situ hybridization 103 | 104 | 105 | ABBR 106 | ShortForm 107 | 108 | FISH 109 | 110 | 111 | ABBR 112 | 113 | 114 | 115 | 116 | 117 | 118 | 21216802 119 | 120 | title 121 | 0 122 | 123 | 124 | 125 | abstract 126 | 87 127 | 128 | 129 | ABBR 130 | LongForm 131 | 132 | instrumental variables 133 | 134 | 135 | ABBR 136 | ShortForm 137 | 138 | IVs 139 | 140 | 141 | ABBR 142 | 143 | 144 | 145 | 146 | 147 | 148 | 21893921 149 | 150 | title 151 | 0 152 | 153 | 154 | 155 | abstract 156 | 82 157 | 158 | 159 | 160 | 161 | 21176341 162 | 163 | title 164 | 0 165 | 166 | 167 | 168 | abstract 169 | 62 170 | 171 | 172 | ABBR 173 | LongForm 174 | 175 | acute leukemia 176 | 177 | 178 | ABBR 179 | ShortForm 180 | 181 | AL 182 | 183 | 184 | ABBR 185 | LongForm 186 | 187 | acute myeloid leukemia 188 | 189 | 190 | ABBR 191 | ShortForm 192 | 193 | AML 194 | 195 | 196 | ABBR 197 | LongForm 198 | 199 | acute promyelocytic leukemia 200 | 201 | 202 | ABBR 203 | ShortForm 204 | 205 | APL 206 | 207 | 208 | ABBR 209 | LongForm 210 | 211 | acute lymphoid leukemia 212 | 213 | 214 | ABBR 215 | ShortForm 216 | 217 | ALL 218 | 219 | 220 | ABBR 221 | LongForm 222 | 223 | chronic myeloid leukemia 224 | 225 | 226 | ABBR 227 | ShortForm 228 | 229 | CML 230 | 231 | 232 | 
ABBR 233 | 234 | 235 | 236 | 237 | ABBR 238 | 239 | 240 | 241 | 242 | ABBR 243 | 244 | 245 | 246 | 247 | ABBR 248 | 249 | 250 | 251 | 252 | ABBR 253 | 254 | 255 | 256 | 257 | 258 | 259 | 21978046 260 | 261 | title 262 | 0 263 | 264 | 265 | 266 | abstract 267 | 143 268 | 269 | 270 | 271 | 272 | 21785578 273 | 274 | title 275 | 0 276 | 277 | 278 | 279 | abstract 280 | 69 281 | 282 | 283 | 284 | 285 | -------------------------------------------------------------------------------- /xml/collection.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | PubMed 4 | 20120606 5 | collection.key 6 | 7 | 21951408 8 | 9 | title 10 | 0 11 | Constraint-induced movement therapy for the upper paretic limb in acute or sub-acute stroke: a systematic review. 12 | 13 | 14 | abstract 15 | 114 16 | Constraint-induced movement therapy is a commonly used intervention to improve upper limb function after stroke. However, the effectiveness of constraint-induced movement therapy and its optimal dosage during acute or sub-acute stroke is still under debate. To examine the literature on the effects of constraint-induced movement therapy in acute or sub-acute stroke. A literature search was performed to identify randomized, controlled trials; studies with the same outcome measure were pooled by calculating the mean difference. Separate quantitative analyses for high-intensity and low-intensity constraint-induced movement therapy were applied when possible. Five randomized, controlled trials were included, comprising 106 participants. The meta-analysis demonstrated significant mean differences in favor of constraint-induced movement therapy for the Fugl-Meyer arm, the Action Research Arm Test, the Motor Activity Log, Quality of Movement and the Grooved Pegboard Test. Nonsignificant mean difference in favor of constraint-induced movement therapy were found for the Motor Activity Log, Amount of Use. 
Separate analyses for high-intensity and low-intensity constraint-induced movement therapy resulted in significant favorable mean differences for low-intensity constraint-induced movement therapy for all outcome measures, in contrast to high-intensity constraint-induced movement therapy. This meta-analysis demonstrates a trend toward positive effects of high-intensity and low-intensity constraint-induced movement therapy in acute or sub-acute stroke, but also suggests that low-intensity constraint-induced movement therapy may be more beneficial during this period than high-intensity constraint-induced movement therapy. However, these results were based on a small number of studies. Therefore, more trials are needed applying different doses of therapy early after stroke and a better understanding is needed about the different time windows in which underlying mechanisms of recovery operate. 17 | 18 | 19 | 20 | 21488974 21 | 22 | title 23 | 0 24 | Sensor materials for the detection of human neutrophil elastase and cathepsin G activity in wound fluid. 25 | 26 | 27 | abstract 28 | 105 29 | Human neutrophil elastase (HNE) and cathepsin G (CatG) are involved in the pathogenesis of a number of inflammatory disorders. These serine proteinases are released by neutrophils and monocytes in case of infection. Wound infection is a severe complication regarding wound healing causing diagnostic and therapeutic problems. In this study we have shown the potential of HNE and CatG to be used as markers for early detection of infection. Significant differences in HNE and CatG levels in infected and non-infected wound fluids were observed. Peptide substrates for these two enzymes were successfully immobilised on different surfaces, including collagen, modified collagen, polyamide polyesters and silica gel. HNE and CatG activities were monitored directly in wound fluid via hydrolysis of the chromogenic substrates. 
Infected wound fluids led to significant higher substrate hydrolysis compared with non-infected ones. These different approaches could be used for the development of devices which are able to detect elevated enzyme activities before manifestation of infection directly on bandages. This would allow a timely intervention by medical doctors thus preventing severe infections. 30 | 31 | 32 | 33 | 21660417 34 | 35 | title 36 | 0 37 | Perineal hernia after laparoscopic abdominoperineal resection--reconstruction of the pelvic floor with a biological mesh (Permacol™). 38 | 39 | 40 | 41 | 21831032 42 | 43 | title 44 | 0 45 | The fetal origins of obesity: early origins of altered food intake. 46 | 47 | 48 | abstract 49 | 68 50 | There is now clear evidence from population-based and experimental animal studies that maternal obesity and maternal overnutrition, particularly excessive intake of high-fat and high-sugar diets, is associated with an increased risk of obesity and type 2 diabetes in the offspring. Whilst the physiological reasons for this association are still not fully understood, one of the key pathways appears to be the ability of exposure to an oversupply of energy, fat and sugar during critical windows of development to program an increased food intake in the offspring. This review will focus on our current understanding of the programming of food intake, with a focus on the importance of the maternal diet. Specifically, we will discuss how exposure to an increased energy supply before birth and in early infancy, and/or increased maternal intake of palatable foods alters the development of the systems regulating appetite and food preferences, and how these changes interact to promote excess consumption and thus predispose the offspring to weight gain and obesity. 
51 | 52 | 53 | 54 | 21571957 55 | 56 | title 57 | 0 58 | A comparison of equivocal immunohistochemical results with anti-HER2/neu antibodies A0485 and SP3 with corresponding FISH results in routine clinical practice. 59 | 60 | 61 | abstract 62 | 160 63 | HER2/neu status in breast cancer is determined by immunohistochemical analysis and/or fluorescence in situ hybridization (FISH). Previous studies have found widely varying sensitivities and specificities for anti-HER2/neu antibodies, including recently developed rabbit monoclonal antibodies. The current prospective study compared rabbit monoclonal antibody SP3 and rabbit polyclonal antibody A0485 immunostaining on routinely processed consecutive cases of breast carcinoma. Of 1,610 cases tested, 261 (16.2%) equivocal (2+) cases were evaluated by FISH. Of 253 cases equivocal with A0485 results, 125 (49.4%) were negative with SP3. In 22 (8.7%) of 253 cases equivocal with A0485, there was amplification by FISH, and 3 of these cases were SP3- (0/1+). Of the 20 (14.8%) of 135 SP3-equivocal cases amplified by FISH, 1 case was A0485-. The reported false-negative rate with A0485 is 2.8%, and the American Society of Clinical Oncology/College of American Pathologists guidelines recommend a rate of less than 5%. Compared with A0485, the false-negative rate with SP3 is only 0.3% (3/1,156) higher, but it shows about a 50% reduction in equivocal scores, reducing the need for reflex FISH testing. 64 | 65 | 66 | 67 | 21216802 68 | 69 | title 70 | 0 71 | Using multiple genetic variants as instrumental variables for modifiable risk factors. 72 | 73 | 74 | abstract 75 | 87 76 | Mendelian randomisation analyses use genetic variants as instrumental variables (IVs) to estimate causal effects of modifiable risk factors on disease outcomes. Genetic variants typically explain a small proportion of the variability in risk factors; hence Mendelian randomisation analyses can require large sample sizes. 
However, an increasing number of genetic variants have been found to be robustly associated with disease-related outcomes in genome-wide association studies. Use of multiple instruments can improve the precision of IV estimates, and also permit examination of underlying IV assumptions. We discuss the use of multiple genetic variants in Mendelian randomisation analyses with continuous outcome variables where all relationships are assumed to be linear. We describe possible violations of IV assumptions, and how multiple instrument analyses can be used to identify them. We present an example using four adiposity-associated genetic variants as IVs for the causal effect of fat mass on bone density, using data on 5509 children enrolled in the ALSPAC birth cohort study. We also use simulation studies to examine the effect of different sets of IVs on precision and bias. When each instrument independently explains variability in the risk factor, use of multiple instruments increases the precision of IV estimates. However, inclusion of weak instruments could increase finite sample bias. Missing data on multiple genetic variants can diminish the available sample size, compared with single instrument analyses. In simulations with additive genotype-risk factor effects, IV estimates using a weighted allele score had similar properties to estimates using multiple instruments. Under the correct conditions, multiple instrument analyses are a promising approach for Mendelian randomisation studies. Further research is required into multiple imputation methods to address missing data issues in IV estimation. 77 | 78 | 79 | 80 | 21893921 81 | 82 | title 83 | 0 84 | Healthy connections: online social networks and their potential for peer support. 85 | 86 | 87 | abstract 88 | 82 89 | Social and professional support for mental health is lacking in many rural areas - highlighting the need for innovative ways to improve access to services. 
This study explores the potential of online social networking as an avenue for peer support. Using a cross sectional survey, 74 secondary students answered questions relating to internet use, online social network use and perceptions of mental health support. Over half of the sample had experienced a need for mental health support with 53% of participants turning to the internet. Results indicate that online social networking sites were used regularly by 82% of the sample and 47% believed these sites could help with mental health problems. The study concluded that online social networking sites may be able to link young people together with others in similar situations. The popularity and frequency of use may allow these sites to provide information, advice and direction for those seeking help. 90 | 91 | 92 | 93 | 21176341 94 | 95 | title 96 | 0 97 | [april mRNA expression in newly diagnosed leukemia patients]. 98 | 99 | 100 | abstract 101 | 62 102 | This study was aimed to quantitatively detect the levels of april mRNA expression in leukemia patients so as to provide theoretical basis for the target therapy directing at april in leukemia. Real time fluorescent quantitative PCR was used to detect the relative expression level of april mRNA in newly diagnosed leukemia patients and to analyze the changes of its expression level in various type of leukemia. The results showed that the april mRNA expression level in acute leukemia (AL) patients was significantly higher than that in normal controls, there was statistical difference between them (p < 0.05); april mRNA expression level in acute myeloid leukemia (AML) patients was significantly higher than that in normal controls (p < 0.05) and positively correlated with white blood cell count ≥ 20.0 × 10(9)/L (p < 0.05), but not related with extramedullary infiltration and the expression of CD34. 
Except for acute promyelocytic leukemia (APL), april mRNA expression level was negatively correlated with sensitivity of patients to chemotherapy. april mRNA expression levels in acute lymphoid leukemia (ALL) and chronic myeloid leukemia (CML) patients were not higher than that in normal controls, there was no statistical difference between them (p > 0.05). It is concluded that april gene overexpression exits in AML patients. APRIL protein produced by AML cells probably plays an important role in abnormal proliferation and drug-resistance of AML cells. 103 | 104 | 105 | 106 | 21978046 107 | 108 | title 109 | 0 110 | Optimizing the accuracy of a helical diode array dosimeter: a comprehensive calibration methodology coupled with a novel virtual inclinometer. 111 | 112 | 113 | abstract 114 | 143 115 | PURPOSE: The goal of any dosimeter is to be as accurate as possible when measuring absolute dose to compare with calculated dose. This limits the uncertainties associated with the dosimeter itself and allows the task of dose QA to focus on detecting errors in the treatment planning (TPS) and/or delivery systems. This work introduces enhancements to the measurement accuracy of a 3D dosimeter comprised of a helical plane of diodes in a volumetric phantom. METHODS: We describe the methods and derivations of new corrections that account for repetition rate dependence, intrinsic relative sensitivity per diode, field size dependence based on the dynamic field size determination, and positional correction. Required and described is an accurate "virtual inclinometer" algorithm. The system allows for calibrating the array directly against an ion chamber signal collected with high angular resolution. These enhancements are quantitatively validated using several strategies including ion chamber measurements taken using a "blank" plastic shell mimicking the actual phantom, and comparison to high resolution dose calculations for a variety of fields: static, simple arcs, and VMAT. 
A number of sophisticated treatment planning algorithms were benchmarked against ion chamber measurements for their ability to handle a large air cavity in the phantom. RESULTS: Each calibration correction is quantified and presented vs its independent variable(s). The virtual inclinometer is validated by direct comparison to the gantry angle vs time data from machine log files. The effects of the calibration are quantified and improvements are seen in the dose agreement with the ion chamber reference measurements and with the TPS calculations. These improved agreements are a result of removing prior limitations and assumptions in the calibration methodology. Average gamma analysis passing rates for VMAT plans based on the AAPM TG-119 report are 98.4 and 93.3% for the 3%/3 mm and 2%/2 mm dose-error/distance to agreement threshold criteria, respectively, with the global dose-error normalization. With the local dose-error normalization, the average passing rates are reduced to 94.6 and 85.7% for the 3%/3 mm and 2%/2 mm criteria, respectively. Some algorithms in the convolution/superposition family are not sufficiently accurate in predicting the exit dose in the presence of a 15 cm diameter air cavity. CONCLUSIONS: Introduction of the improved calibration methodology, enabled by a robust virtual inclinometer algorithm, improves the accuracy of the dosimeter's absolute dose measurements. With our treatment planning and delivery chain, gamma analysis passing rates for the VMAT plans based on the AAPM TG-119 report are expected to be above 91% and average at about 95% level for γ(3%/3 mm) with the local dose-error normalization. This stringent comparison methodology is more indicative of the true VMAT system commissioning accuracy compared to the often quoted dose-error normalization to a single high value. 116 | 117 | 118 | 119 | 21785578 120 | 121 | title 122 | 0 123 | Hemangiopericytoma of maxilla in a pediatric patient: a case report. 
124 | 125 | 126 | abstract 127 | 69 128 | The Hemangiopericytoma is a malignant vascular tumor arising from mesenchymal cells with pericytic differentiation. Hemangiopericytoma is most commonly seen in adults, and only 5% to 10% of cases occur in children. The tumor is extremely rare in the head and neck region (16%)1. Cytogenic abnormalities have been present in some hemangiopericytoma cases. Surgical resection remains the mainstay treatment. Adjuvant chemotherapy and radiotherapy is appropriate for cases of incomplete resections and life-threatening tumors particularly in children. Late relapses may occur and require long-term follow-up. A 4-year-old child patient with hemangiopericytoma of the maxilla presented with firm, recurrent, but painless jaw mass. Radiographic investigations revealed a poorly circumscribed radiolucency. The lesion biopsy showed well-circumscribed multiple lobules of tumor mass consisting of tightly packed, spindle-shaped cells. Chemotherapy and radiotherapy of the lesion was conducted. The role of the pediatric dental team is extensive in children with hemangiopericytoma, who require a regular dental review. The patient's oncologist should be immediately contacted if there is any suspicion of recurrence. 129 | 130 | 131 | 132 | -------------------------------------------------------------------------------- /xml/everything-sentence.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | Made up file to test that everything is allowed and processed. Has text in the passage. 
4 | 20130426 5 | everything.key 6 | collection-infon-value 7 | 8 | 1 9 | document-infon-value 10 | 11 | passage-infon-value 12 | 0 13 | 14 | sentence-infon-value 15 | 0 16 | text of sentence 17 | 18 | annotation-infon-value 19 | 20 | annotation text 21 | 22 | 23 | sentence-relation-infon-value 24 | 25 | 26 | 27 | 28 | passage-relation-infon-value 29 | 30 | 31 | 32 | 33 | document-relation-infon-value 34 | 35 | 36 | 37 | 38 | -------------------------------------------------------------------------------- /xml/everything.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | Made up file to test that everything is allowed and processed. Has text in the passage. 4 | 20130426 5 | everything.key 6 | collection-infon-value 7 | 8 | 1 9 | document-infon-value 10 | 11 | passage-infon-value 12 | 0 13 | text of passage 14 | 15 | annotation-infon-value 16 | 17 | annotation text 18 | 19 | 20 | passage-relation-infon-value 21 | 22 | 23 | 24 | 25 | document-relation-infon-value 26 | 27 | 28 | 29 | 30 | --------------------------------------------------------------------------------
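A note on the fixtures above: in the `json/collection.json` documents, each abstract's `offset` appears to equal the title's offset plus the title's length plus one (for example, the 68-character title of document 21785578 is followed by an abstract at offset 69). The sketch below checks that pattern for a loaded BioC JSON collection. It is a minimal illustration, not part of `bioc_json.py`: the function name `passage_offset_mismatches` is hypothetical, the "previous offset + length + 1" rule is an assumption inferred from these fixtures rather than a documented BioC requirement, and Python's `len` counts code points, which may not match the offset unit used by other collections.

```python
def passage_offset_mismatches(collection):
    """Return (document id, passage index, stored offset, expected offset)
    for every passage whose offset differs from the assumed convention."""
    mismatches = []
    for document in collection.get("documents", []):
        expected = 0  # assumption: the first passage starts at offset 0
        for index, passage in enumerate(document.get("passages", [])):
            if passage["offset"] != expected:
                mismatches.append(
                    (document.get("id"), index, passage["offset"], expected)
                )
            # Assumed convention (inferred from the fixtures): the next passage
            # starts one character after the end of the current passage text.
            expected = passage["offset"] + len(passage.get("text", "")) + 1
    return mismatches


# Small in-memory sample using a title taken from json/collection.json;
# the abstract text is a placeholder for this sketch only.
sample = {
    "documents": [
        {
            "id": "21785578",
            "passages": [
                {
                    "offset": 0,
                    "text": "Hemangiopericytoma of maxilla in a pediatric patient: a case report.",
                },
                {"offset": 69, "text": "(abstract text elided in this sketch)"},
            ],
        }
    ]
}

print(passage_offset_mismatches(sample))
```

To run the same check on a file, load it with `json.load` and pass the resulting dictionary to the function. A non-empty result only means the file does not follow the assumed pattern; it does not by itself indicate an invalid BioC file.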