├── .gitignore ├── Data ├── README.md ├── loinc_counts.csv └── coh_top_500_lab_cats_cs.csv ├── LICENSE ├── README.md └── tsne_plot_kdd_dshealth2019.ipynb /.gitignore: -------------------------------------------------------------------------------- 1 | Prompt_and_Fine_tune_Llama_2_and_Other_LLMs.ipynb 2 | -------------------------------------------------------------------------------- /Data/README.md: -------------------------------------------------------------------------------- 1 | # Data and Models 2 | 3 | * **coh\_top\_500\_lab\_cats\_cs.csv** Aborted manual categorization of lab tests, used only to distinguish the labs in the Complete Blood Count panel. For a full lab categorization, please rely on the official LOINC.org table. 4 | * **loinc\_counts.csv** Counts, by LOINC code, of lab tests ordered for inpatients and outpatients of the City of Hope National Medical Center main campus between 2010 and 2017. Used to select the most frequent labs in the plot. 5 | * **loinc\_s200\_w5\_c5\_sg\_wv.txt** 200-dimensional Word2Vec embeddings of 1098 LOINC codes, trained from 8,280,238 lab orders for 79,081 patients at City of Hope National Medical Center. 6 | 7 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Lorenzo A. Rossi 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # LOINC Embeddings 2 | 3 | This repository provides the Python code and the Word2Vec embeddings to reproduce the scatter plots in the [KDD 2019 DSHealth Workshop](https://dshealthkdd.github.io/dshealth-2019/) paper "Evaluation of Embeddings of Laboratory Test Codes for Patients at a Cancer Center": [arxiv.org/abs/1907.09600](https://arxiv.org/abs/1907.09600). The code can be used as a starting point for further in-depth explorations of the embeddings. 4 | 5 | The embeddings were trained via Word2Vec skip-gram on EHR data from the [City of Hope National Medical Center](https://www.cityofhope.org/homepage). See the paper for details on the training. 6 | 7 | If you produce interesting visualizations of the embeddings, email me at lorenzo **\[dot\]** rossi **\[at\]** gmail.com (lrossi **\[at\]** coh.org). 8 | 9 | ## Citation 10 | If you use the material in your work, please cite our paper. 
__BibTeX entry:__ 11 | 12 | ``` 13 | @inproceedings{larossi2019evaluation, 14 | title={Evaluation of Embeddings of Laboratory Test Codes for Patients at a Cancer Center}, 15 | author={Rossi, Lorenzo A and Shawber, Chad and Munu, Janet and Zachariah, Finly}, 16 | booktitle={KDD Workshop on Applied Data Science for Healthcare (DSHealth)}, 17 | year={2019} 18 | } 19 | ``` 20 | 21 | ## Contents 22 | * **tsne\_plot\_kdd_dshealth2019.ipynb** Jupyter notebook to generate the *t*-SNE plot. 23 | * **Data/** folder containing the Word2Vec embeddings for the LOINC codes as well as other files used to produce the *t*-SNE plot 24 | 25 | **Note:** you need to download the official LOINC CSV Table from [loinc.org](https://loinc.org). The table file is necessary to provide a taxonomy of the LOINC codes and hence the classes shown in different colors in the scatter plot. 26 | 27 | ### How to download the official LOINC Table CSV file 28 | * Create an account on LOINC.org (it's free) and log in 29 | * Click on 'Downloads' (menu at the top of the page) 30 | * Click on 'LOINC Table' 31 | * Click on 'LOINC Table File (CSV)' 32 | * Review and check the Copyright and Terms of Use note 33 | * Click on 'Download': a zip archive will be downloaded to your machine 34 | * Extract Loinc.csv from the archive 35 | -------------------------------------------------------------------------------- /Data/loinc_counts.csv: -------------------------------------------------------------------------------- 1 | loinc,count 2 | 777-3,1471627 3 | 2160-0,1470882 4 | 48642-3,1433764 5 | 48643-1,1433763 6 | 2823-3,1393759 7 | 4544-3,1391115 8 | 718-7,1391048 9 | 26464-8,1381729 10 | 26453-1,1380775 11 | 30384-2,1380730 12 | 30428-7,1380727 13 | 28540-3,1380721 14 | 28539-5,1380721 15 | 2951-2,1370155 16 | 2345-7,1369115 17 | 17861-6,1366745 18 | 3094-0,1365909 19 | 2028-9,1365104 20 | 2075-0,1365095 21 | 1751-7,1061196 22 | 1742-6,1061126 23 | 1975-2,1057261 24 | 6768-6,1051994 25 | 736-9,857860 26 | 
770-8,857704 27 | 5905-5,857577 28 | 26450-7,856097 29 | 30180-4,855874 30 | 19123-9,739881 31 | 32623-1,715032 32 | 48767-8,658679 33 | 2532-0,543430 34 | 2777-1,513984 35 | 2341-6,485435 36 | 33037-3,381395 37 | 26474-7,366166 38 | 30451-9,364898 39 | 26484-6,364811 40 | 26449-9,363846 41 | 26444-0,363684 42 | 26499-4,331915 43 | 6742-1,158507 44 | 2465-3,148976 45 | 5902-2,138200 46 | 6301-6,138119 47 | 2571-8,122515 48 | 14629-0,121981 49 | 5803-2,117476 50 | 14979-9,113578 51 | 20409-9,111568 52 | 10331-7,109075 53 | 883-9,109075 54 | 53326-5,106871 55 | 25428-4,106090 56 | 5899-0,100013 57 | 57735-3,94776 58 | 11253-2,94263 59 | 30247-1,93311 60 | 2514-8,92336 61 | 5770-3,92280 62 | 20408-1,92259 63 | 5802-4,92246 64 | 5818-0,92232 65 | 20627-6,92144 66 | 5778-6,92144 67 | 5821-4,92039 68 | 25145-4,92039 69 | 40482-2,92039 70 | 13945-1,92039 71 | 46133-5,92039 72 | 9842-6,92039 73 | 26508-2,86601 74 | 50012-4,84537 75 | 2093-3,76624 76 | 28541-1,73565 77 | 2857-1,71304 78 | 1989-3,70153 79 | 29247-4,70135 80 | 31208-2,67999 81 | 2472-9,56062 82 | 2458-8,55900 83 | 5894-1,55438 84 | 9830-1,54605 85 | 3024-7,54485 86 | 58413-6,49640 87 | 2089-1,48574 88 | 26446-5,44133 89 | 2039-6,43575 90 | 26498-6,43448 91 | 8310-5,40107 92 | 2744-1,36899 93 | 1960-4,36884 94 | 2019-8,36884 95 | 2703-7,36883 96 | 2708-6,36649 97 | 26507-4,32340 98 | 50984-4,31301 99 | 4548-4,31199 100 | 39111-0,30677 101 | 45353-0,30676 102 | 66746-9,30676 103 | 38256-4,29857 104 | 2685-6,29628 105 | 12851-2,29622 106 | 13983-2,29622 107 | 13982-4,29622 108 | 13981-6,29622 109 | 35706-1,29622 110 | 33358-3,29622 111 | 29495-9,29061 112 | 47938-6,28643 113 | 40844-3,28423 114 | 33944-0,28423 115 | 36916-5,28423 116 | 14357-8,28226 117 | 1925-7,27843 118 | 27353-2,27241 119 | 11156-7,27164 120 | 53809-0,26766 121 | 4092-3,25434 122 | 19913-3,25274 123 | 30433-7,25118 124 | 42216-2,25053 125 | 2157-6,23621 126 | 19927-3,23052 127 | 3040-3,22863 128 | 1798-8,21974 129 | 10839-9,21366 130 | 
2276-4,21017 131 | 17842-6,20861 132 | 17849-1,20704 133 | 33516-6,20685 134 | 14196-0,20685 135 | 14338-8,20277 136 | 20396-8,19593 137 | 4542-7,19343 138 | 3255-7,19024 139 | 19926-5,18355 140 | 30446-9,18336 141 | 20124-4,17776 142 | 4537-7,17740 143 | 55305-7,17723 144 | 10334-1,17538 145 | 30934-4,17502 146 | 20139-2,17289 147 | 20077-4,16918 148 | 13969-1,16829 149 | 35674-1,16700 150 | 30376-8,16179 151 | 19834-1,15757 152 | 20112-9,15168 153 | 516-5,15136 154 | 11545-1,15058 155 | 890-4,14212 156 | 5196-1,13758 157 | 19877-0,13677 158 | 20157-4,13672 159 | 35383-9,13409 160 | 1952-1,13388 161 | 2888-6,12898 162 | 14959-1,12761 163 | 3051-0,12723 164 | 19911-7,12666 165 | 19916-6,12604 166 | 48159-8,12592 167 | 13955-0,12592 168 | 51953-8,12586 169 | 47982-4,12223 170 | 2731-8,12194 171 | 2132-9,11780 172 | 19867-1,11695 173 | 19843-2,11692 174 | 19862-2,11689 175 | 20146-7,11688 176 | 19924-0,11627 177 | 524-9,11466 178 | 6932-8,11428 179 | 133-9,11390 180 | 1988-5,11243 181 | 26346-7,11229 182 | 279-0,10935 183 | 8098-6,10814 184 | 508-2,10811 185 | 41925-9,10776 186 | 24108-3,10686 187 | 3053-6,10616 188 | 44099-0,10047 189 | 28-1,9975 190 | 267-5,9970 191 | 3013-0,9659 192 | 12-5,9566 193 | 6875-9,9531 194 | 412-7,9429 195 | 6652-2,9413 196 | 5124-3,9383 197 | 11031-2,9147 198 | 2284-8,9113 199 | 51730-0,9100 200 | 76-0,8927 201 | 2498-4,8785 202 | 2502-3,8777 203 | 3520-4,8075 204 | 5193-8,8071 205 | 10333-3,8048 206 | 17607-3,8048 207 | 806-0,8048 208 | 55794-2,8048 209 | 26454-9,8048 210 | 26487-9,7914 211 | 2880-3,7886 212 | 2342-4,7884 213 | 42176-8,7870 214 | 3167-4,7852 215 | 363-2,7797 216 | 5794-3,7719 217 | 53962-7,7597 218 | 185-9,7444 219 | 593-4,7349 220 | 13362-9,7338 221 | 24113-3,7277 222 | 62365-2,7096 223 | 44757-3,7088 224 | 20507-0,7053 225 | 26524-9,7018 226 | 20079-0,6962 227 | 496-0,6906 228 | 13950-1,6837 229 | 19930-7,6782 230 | 2991-8,6589 231 | 2986-8,6589 232 | 13457-7,6493 233 | 14836-1,6443 234 | 233-7,6317 235 | 2324-2,6269 
236 | 14136-6,6133 237 | 13952-7,6126 238 | 2164-2,5907 239 | 19153-6,5881 240 | 30211-7,5881 241 | 15179-5,5762 242 | 193-3,5688 243 | 29541-0,5617 244 | 24013-5,5611 245 | 2143-6,5521 246 | 15067-2,5457 247 | 3026-2,5292 248 | 38889-2,5226 249 | 67723-7,5217 250 | 383-0,5041 251 | 1839-0,5039 252 | 30453-5,5020 253 | 6644-9,4993 254 | 2955-3,4882 255 | 2118-8,4869 256 | 2161-8,4851 257 | 7018-5,4774 258 | 14564-9,4762 259 | 14563-1,4762 260 | 14565-6,4762 261 | 20447-9,4702 262 | 49301-5,4700 263 | 20153-3,4689 264 | 20152-5,4689 265 | 19875-4,4688 266 | 20150-9,4685 267 | 29991-7,4657 268 | 2162-6,4538 269 | 50398-7,4510 270 | 22341-2,4495 271 | 2243-4,4437 272 | 2518-9,4405 273 | 55463-4,4354 274 | 24467-3,4342 275 | 2112-1,4303 276 | 34581-9,4301 277 | 58443-3,4245 278 | 15069-8,4146 279 | 10704-5,4050 280 | 10501-5,4040 281 | 20578-1,3980 282 | 49341-1,3976 283 | 13440-3,3865 284 | 30427-9,3803 285 | 29950-3,3784 286 | 51913-2,3742 287 | 2692-2,3717 288 | 21198-7,3716 289 | 26347-5,3671 290 | 26348-3,3658 291 | 2746-6,3396 292 | 2705-2,3395 293 | 2021-4,3390 294 | 14627-4,3390 295 | 3016-3,3364 296 | 35671-7,3362 297 | 2711-0,3341 298 | 8251-1,3311 299 | 2695-5,3224 300 | 9811-1,3215 301 | 29254-0,3145 302 | 48345-3,3106 303 | 8099-4,3083 304 | 10535-3,2993 305 | 2106-3,2930 306 | 3084-1,2915 307 | 3034-6,2899 308 | 2141-0,2887 309 | 44805-0,2860 310 | 45285-4,2805 311 | 12190-5,2802 312 | 9335-1,2799 313 | 12254-9,2687 314 | 55793-4,2687 315 | 26455-6,2687 316 | 46337-2,2647 317 | 46336-4,2615 318 | 48398-2,2604 319 | 1649-3,2579 320 | 39528-5,2540 321 | 38917-1,2540 322 | 29909-9,2540 323 | 40982-1,2540 324 | 29908-1,2540 325 | 29910-7,2540 326 | 30075-6,2540 327 | 30076-4,2540 328 | 49524-2,2540 329 | 49521-8,2540 330 | 7993-9,2540 331 | 3968-5,2519 332 | 5221-7,2459 333 | 56888-1,2410 334 | 41394-8,2380 335 | 61099-8,2379 336 | 61100-4,2379 337 | 2881-1,2369 338 | 1971-1,2357 339 | 116-4,2356 340 | 43380-5,2274 341 | 31153-0,2274 342 | 33510-9,2274 343 | 
42595-9,2253 344 | 23641-4,2238 345 | 5096-3,2223 346 | 56850-1,2183 347 | 20607-8,2171 348 | 14135-8,2171 349 | 33903-6,2161 350 | 19072-8,2147 351 | 12180-6,2147 352 | 15205-8,2123 353 | 26523-1,2120 354 | 700-5,2088 355 | 2529-6,2056 356 | 6874-2,2053 357 | 18488-7,2053 358 | 13717-4,2053 359 | 2344-0,2035 360 | 2524-7,1961 361 | 31790-9,1923 362 | 38180-6,1919 363 | 30522-7,1893 364 | 48174-7,1873 365 | 141-2,1746 366 | 3174-0,1718 367 | 11011-4,1710 368 | 45323-3,1703 369 | 2283-0,1700 370 | 58450-8,1605 371 | 1763-2,1591 372 | 13355-3,1582 373 | 28544-5,1550 374 | 38370-3,1524 375 | 3209-4,1486 376 | 20448-7,1473 377 | 64111-8,1470 378 | 12841-3,1430 379 | 21347-0,1411 380 | 16982-1,1411 381 | 1795-4,1403 382 | 4090-7,1398 383 | 21215-9,1396 384 | 2357-2,1368 385 | 29951-1,1343 386 | 38176-4,1343 387 | 38169-9,1343 388 | 29947-9,1343 389 | 41759-2,1343 390 | 2889-4,1343 391 | 29946-1,1343 392 | 38178-0,1343 393 | 29945-3,1343 394 | 29899-2,1343 395 | 38177-2,1343 396 | 46418-0,1341 397 | 22666-2,1341 398 | 59471-3,1340 399 | 41171-0,1337 400 | 5014-6,1332 401 | 2193-1,1284 402 | 1986-9,1263 403 | 39017-9,1254 404 | 15061-5,1241 405 | 2999-1,1218 406 | 10378-8,1218 407 | 2519-7,1199 408 | 12228-3,1196 409 | 53731-6,1148 410 | 4485-9,1146 411 | 7799-0,1145 412 | 1761-6,1143 413 | 4498-2,1143 414 | 30471-7,1137 415 | 1747-5,1079 416 | 13967-5,1044 417 | 5236-5,1026 418 | 38492-5,982 419 | 3665-7,971 420 | 13047-6,953 421 | 34187-5,945 422 | 29965-1,913 423 | 26452-3,891 424 | 32516-7,871 425 | 8101-8,871 426 | 20599-7,871 427 | 15087-0,869 428 | 3274-8,866 429 | 2828-2,857 430 | 2900-9,841 431 | 4086-5,831 432 | 17838-4,809 433 | 22602-7,784 434 | 2191-5,774 435 | 33935-8,767 436 | 1644-4,764 437 | 3204-5,733 438 | 15432-8,724 439 | 2990-0,724 440 | 4621-9,722 441 | 35669-1,714 442 | 800-3,706 443 | 5880-0,696 444 | 5157-3,694 445 | 29966-9,670 446 | 31147-2,670 447 | 21518-6,670 448 | 12183-0,669 449 | 13522-8,658 450 | 161-0,658 451 | 2078-4,656 452 | 
2079-2,656 453 | 4059-2,654 454 | 46128-5,649 455 | 41488-8,646 456 | 29958-6,638 457 | 4057-6,630 458 | 20593-0,625 459 | 1848-1,625 460 | 42189-1,625 461 | 30083-0,624 462 | 22295-0,624 463 | 5159-9,624 464 | 15363-5,613 465 | 1823-4,606 466 | 11038-7,606 467 | 2460-4,603 468 | 9572-9,592 469 | 49896-4,590 470 | 18310-3,587 471 | 21026-0,587 472 | 4575-7,587 473 | 4547-6,587 474 | 4576-5,587 475 | 4625-0,587 476 | 4551-8,587 477 | 4563-3,587 478 | 25474-8,584 479 | 25489-6,584 480 | 41407-8,581 481 | 13636-6,580 482 | 26528-0,569 483 | 26530-6,566 484 | 32713-0,555 485 | 19113-0,552 486 | 31625-7,547 487 | 53019-6,547 488 | 26287-3,546 489 | 41650-3,541 490 | 11115-3,537 491 | 1992-7,537 492 | 55784-3,537 493 | 13964-2,533 494 | 29591-5,522 495 | 2742-5,520 496 | 32717-1,520 497 | 2915-7,508 498 | 56540-8,491 499 | 41276-7,486 500 | 3128-6,483 501 | 5255-5,481 502 | 59465-5,481 503 | 5256-3,481 504 | 2963-7,479 505 | 56537-4,472 506 | 11040-3,459 507 | 2147-7,459 508 | 11155-9,459 509 | 30166-3,455 510 | 56634-9,454 511 | 49256-1,448 512 | 7791-7,445 513 | 6447-7,425 514 | 13965-9,423 515 | 6864-3,421 516 | 13971-7,415 517 | 8039-0,414 518 | 12203-6,412 519 | 2333-3,412 520 | 28543-7,408 521 | 3095-7,407 522 | 11483-5,392 523 | 6892-4,391 524 | 19108-0,390 525 | 19107-2,380 526 | 29964-4,380 527 | 13954-3,378 528 | 2483-6,364 529 | 5126-8,357 530 | 5244-9,357 531 | 2338-2,355 532 | 6367-7,347 533 | 35670-9,344 534 | 30458-4,338 535 | 32147-1,333 536 | 3181-5,333 537 | 3182-3,333 538 | 2639-3,332 539 | 27819-2,331 540 | 4553-4,330 541 | 49316-3,330 542 | 2615-3,326 543 | 5285-2,320 544 | 5281-1,320 545 | 2697-1,319 546 | 6014-5,319 547 | 803-7,317 548 | 5283-7,316 549 | 27057-9,315 550 | 35278-1,314 551 | 14115-0,313 552 | 14251-3,307 553 | 35863-0,306 554 | 54187-0,306 555 | 25452-4,305 556 | 29904-0,303 557 | 24-0,301 558 | 8014-3,300 559 | 55447-7,299 560 | 27823-4,299 561 | 29497-5,292 562 | 29498-3,292 563 | 30313-1,286 564 | 33915-0,284 565 | 10366-3,284 566 
| 33917-6,284 567 | 3854-7,284 568 | 33916-8,284 569 | 21613-5,283 570 | 17314-6,274 571 | 702-1,270 572 | 588-4,270 573 | 58418-5,269 574 | 6371-9,268 575 | 54176-3,265 576 | 25418-5,264 577 | 53874-4,261 578 | 35410-0,261 579 | 30465-9,261 580 | 24111-7,260 581 | 2748-2,248 582 | 54906-3,247 583 | 54905-5,247 584 | 34701-3,247 585 | 6412-1,247 586 | 44871-2,246 587 | 6913-8,242 588 | 6920-3,242 589 | 6919-5,242 590 | 6917-9,242 591 | 6916-1,242 592 | 15212-4,242 593 | 6914-6,242 594 | 2671-6,241 595 | 47699-4,241 596 | 44342-4,241 597 | 19049-6,241 598 | 32361-8,239 599 | 5631-7,234 600 | 5274-6,233 601 | 5273-8,233 602 | 29944-6,233 603 | 19225-2,228 604 | 24027-5,228 605 | 34319-4,227 606 | 2714-4,226 607 | 2030-5,226 608 | 30571-4,221 609 | 3122-9,221 610 | 50948-9,221 611 | 189-1,220 612 | 31037-5,214 613 | 44909-0,212 614 | 31203-3,212 615 | 1695-6,212 616 | 58448-2,207 617 | 14958-3,207 618 | 13734-9,207 619 | 2232-7,207 620 | 2218-6,207 621 | 44341-6,207 622 | 2668-2,207 623 | 44316-8,207 624 | 5385-0,206 625 | 35788-9,204 626 | 2053-7,197 627 | 14956-7,194 628 | 1668-3,193 629 | 68961-2,192 630 | 5060-9,190 631 | 5109-4,187 632 | 13953-5,187 633 | 5111-0,187 634 | 5113-6,187 635 | 5103-7,187 636 | 5107-8,187 637 | 5105-2,187 638 | 2466-1,186 639 | 2468-7,186 640 | 2467-9,186 641 | 2469-5,186 642 | 42355-8,181 643 | 7792-5,179 644 | 2064-4,171 645 | 10381-2,171 646 | 32146-3,169 647 | 32286-7,168 648 | 44288-9,167 649 | 35668-3,164 650 | 20573-2,156 651 | 20574-0,156 652 | 249-3,156 653 | 5176-3,155 654 | 13760-4,155 655 | 3187-2,155 656 | 49269-4,155 657 | 12715-9,155 658 | 50595-8,152 659 | 73977-1,151 660 | 31788-3,150 661 | 2666-6,149 662 | 49257-9,149 663 | 2216-0,149 664 | 2230-1,149 665 | 2839-9,147 666 | 27948-9,146 667 | 34316-0,146 668 | 253-5,146 669 | 41399-7,146 670 | 9709-7,142 671 | 9708-9,142 672 | 26706-2,141 673 | 32853-4,141 674 | 9757-6,140 675 | 49229-8,140 676 | 31183-7,139 677 | 30153-1,139 678 | 42771-6,139 679 | 25296-5,139 680 | 
27389-6,139 681 | 27387-0,139 682 | 27395-3,139 683 | 27374-8,139 684 | 27390-4,139 685 | 27118-9,139 686 | 27113-0,139 687 | 27392-0,139 688 | 27096-7,139 689 | 27092-6,139 690 | 27094-2,139 691 | 1974-5,138 692 | 5671-3,135 693 | 21422-1,135 694 | 10451-3,135 695 | 11139-3,135 696 | 55135-8,134 697 | 17780-8,134 698 | 6954-2,134 699 | 65633-0,134 700 | 6955-9,134 701 | 11034-6,132 702 | 17783-2,132 703 | 11046-0,129 704 | 2217-8,129 705 | 2667-4,129 706 | 52756-4,128 707 | 13227-4,128 708 | 2842-3,127 709 | 13514-5,126 710 | 5763-8,125 711 | 43734-3,124 712 | 5057-5,124 713 | 13927-9,124 714 | 58466-4,123 715 | 21582-2,122 716 | 29891-9,118 717 | 9361-7,118 718 | 9360-9,118 719 | 53595-5,117 720 | 42937-3,116 721 | 5685-3,115 722 | 42330-1,114 723 | 14194-5,112 724 | 32789-0,112 725 | 10580-9,112 726 | 2752-4,112 727 | 3160-9,112 728 | 9780-8,112 729 | 8855-9,112 730 | 14170-5,112 731 | 10622-9,111 732 | 10611-2,111 733 | 15376-7,111 734 | 10620-3,110 735 | 48702-5,109 736 | 4532-8,109 737 | 11259-9,108 738 | 19053-8,108 739 | 3853-9,106 740 | 35140-3,106 741 | 30557-3,106 742 | 10365-5,106 743 | 14336-2,105 744 | 5098-9,103 745 | 34376-4,102 746 | 33959-8,101 747 | 2270-7,101 748 | 41491-2,101 749 | 41492-0,101 750 | 5146-6,101 751 | 12598-9,101 752 | 5144-1,101 753 | 41489-6,101 754 | 1825-9,100 755 | 2513-0,100 756 | 48792-6,100 757 | 48793-4,100 758 | 48794-2,100 759 | 29535-2,100 760 | 48795-9,100 761 | 1869-7,100 762 | 1835-8,100 763 | 29536-0,100 764 | 1743-4,100 765 | 11274-8,99 766 | 36-4,99 767 | 3243-3,98 768 | 3052-8,98 769 | 3125-2,96 770 | 26010-9,95 771 | 15013-6,95 772 | 3185-6,95 773 | 15015-1,95 774 | 5955-0,92 775 | 1657-6,92 776 | 205-5,91 777 | 721-1,89 778 | 2779-7,88 779 | 2778-9,88 780 | 5232-4,88 781 | 13917-0,87 782 | 44448-9,85 783 | 44449-7,85 784 | 29998-2,85 785 | 26881-3,84 786 | 47393-4,83 787 | 3432-2,83 788 | 1854-9,83 789 | 10380-4,83 790 | 5370-2,82 791 | 3218-5,80 792 | 41871-5,80 793 | 774-0,80 794 | 31024-3,80 795 | 
51952-0,80 796 | 51951-2,80 797 | 20495-8,79 798 | 33939-0,79 799 | 335-0,77 800 | 5290-2,77 801 | 31146-4,77 802 | 5583-0,76 803 | 29512-1,76 804 | 5222-5,76 805 | 3226-8,75 806 | 56965-7,75 807 | 29996-6,74 808 | 21362-9,73 809 | 5809-9,73 810 | 428-3,72 811 | 4635-9,71 812 | 15345-2,68 813 | 1-8,67 814 | 27812-7,67 815 | 3021-3,66 816 | 256-8,66 817 | 2861-3,66 818 | 15346-0,66 819 | 6012-9,64 820 | 4991-6,64 821 | 55136-6,63 822 | 55137-4,63 823 | 5325-6,62 824 | 54247-2,62 825 | 5324-9,62 826 | 2464-6,61 827 | 14116-8,61 828 | 1756-6,61 829 | 48668-8,61 830 | 2470-3,61 831 | 396-2,61 832 | 49293-4,61 833 | 1746-7,61 834 | 12782-9,61 835 | 49054-0,60 836 | 5308-2,60 837 | 5307-4,60 838 | 31140-7,60 839 | 3298-7,60 840 | 49543-2,60 841 | 42913-4,58 842 | 19124-7,58 843 | 24447-5,58 844 | 5053-4,56 845 | 24312-1,56 846 | 21027-8,56 847 | 29963-6,56 848 | 26516-5,56 849 | 49852-7,55 850 | 55180-4,55 851 | 9537-2,55 852 | 2641-9,55 853 | 680-9,55 854 | 4049-3,54 855 | 2829-0,53 856 | 10864-7,53 857 | 32-3,53 858 | 38415-6,53 859 | 2923-1,52 860 | 38496-6,52 861 | 241-0,49 862 | 3050-2,49 863 | 3198-9,49 864 | 3193-0,49 865 | 2617-9,49 866 | 3087-4,48 867 | 21587-1,48 868 | 44-8,48 869 | 5052-6,47 870 | 14334-7,46 871 | 3714-3,46 872 | 14235-6,46 873 | 7797-4,44 874 | 20420-6,44 875 | 27831-7,43 876 | 4545-0,43 877 | 32819-5,42 878 | 40768-4,41 879 | 30464-2,40 880 | 21563-2,40 881 | 6948-4,40 882 | 173-5,39 883 | 39789-3,38 884 | 327-7,38 885 | 41649-5,38 886 | 3237-5,38 887 | 39791-9,38 888 | 47977-4,37 889 | 30350-3,37 890 | 19227-8,35 891 | 51774-8,35 892 | 5908-9,34 893 | 3179-9,34 894 | 3948-7,34 895 | 1765-7,33 896 | 5213-4,33 897 | 2961-1,33 898 | 2032-1,33 899 | 3126-0,33 900 | 69367-1,32 901 | 728-6,31 902 | 666-8,31 903 | 51729-2,30 904 | 51703-7,30 905 | 9656-0,30 906 | 5609-3,30 907 | 17781-6,30 908 | 44706-0,30 909 | 3330-8,29 910 | 6774-4,29 911 | 27222-9,29 912 | 2701-1,29 913 | 17016-7,28 914 | 28005-7,28 915 | 28060-2,28 916 | 31438-5,28 917 | 
17015-9,28 918 | 31437-7,28 919 | 3232-6,28 920 | 29893-5,27 921 | 5724-0,27 922 | 42768-2,27 923 | 10835-7,27 924 | 30361-0,27 925 | 69668-2,27 926 | 12736-5,26 927 | 5645-7,26 928 | 13876-8,25 929 | 4484-2,25 930 | 32637-1,23 931 | 16100-0,23 932 | 424-2,23 933 | 802-9,23 934 | 14286-9,23 935 | 6765-2,23 936 | 9838-4,23 937 | 287-3,22 938 | 1799-6,22 939 | 779-9,21 940 | 3559-2,21 941 | 18337-6,21 942 | 10989-2,21 943 | 23826-1,21 944 | 29723-4,21 945 | 35800-2,20 946 | 13914-7,20 947 | 5042-7,20 948 | 32769-2,19 949 | 31111-8,19 950 | 25757-6,19 951 | 16875-7,19 952 | 9522-4,19 953 | 20570-8,19 954 | 1964-6,19 955 | 20-8,19 956 | 1903-4,19 957 | 4477-6,18 958 | 13590-5,18 959 | 741-9,17 960 | 2898-5,16 961 | 56031-8,16 962 | 5199-5,16 963 | 13291-0,16 964 | 7795-8,16 965 | 47828-9,15 966 | 14207-5,15 967 | 13759-6,14 968 | 6884-1,14 969 | 5211-8,14 970 | 1809-3,14 971 | 2492-7,14 972 | 2431-5,14 973 | 1805-1,14 974 | 2351-5,14 975 | 2350-7,14 976 | 55928-6,14 977 | 26842-5,14 978 | 3093-2,13 979 | 20496-6,13 980 | 1656-8,13 981 | 16142-2,13 982 | 48346-1,12 983 | 6733-0,12 984 | 18319-4,12 985 | 6821-3,12 986 | 30078-0,12 987 | 13363-7,12 988 | 6818-9,12 989 | 6810-6,12 990 | 6809-8,12 991 | 6808-0,12 992 | 2258-2,11 993 | 33009-2,11 994 | 1871-3,11 995 | 4024-6,11 996 | 4659-9,10 997 | 2693-0,10 998 | 7793-3,9 999 | 2950-4,9 1000 | 765-8,9 1001 | 738-5,9 1002 | 29994-1,9 1003 | 460-6,9 1004 | 33311-2,8 1005 | 32674-4,8 1006 | 1690-7,8 1007 | 50670-9,8 1008 | 11281-3,8 1009 | 504-1,8 1010 | 45313-4,7 1011 | 13873-5,7 1012 | 38175-6,7 1013 | 53972-6,7 1014 | 63420-4,7 1015 | 63557-3,7 1016 | 45315-9,7 1017 | 5187-0,7 1018 | 3616-0,6 1019 | 21383-5,6 1020 | 27201-3,6 1021 | 26688-2,6 1022 | 2821-7,6 1023 | 2811-8,6 1024 | 14883-3,6 1025 | 14882-5,6 1026 | 21610-1,6 1027 | 265-9,6 1028 | 5225-8,6 1029 | 6693-6,6 1030 | 14121-8,6 1031 | 6349-5,6 1032 | 5765-3,6 1033 | 12480-0,5 1034 | 2870-4,5 1035 | 2873-8,5 1036 | 34645-2,5 1037 | 7959-0,5 1038 | 2864-7,5 1039 | 
5183-9,5 1040 | 52999-0,5 1041 | 2876-1,5 1042 | 2867-0,5 1043 | 408-5,5 1044 | 15207-4,5 1045 | 15202-5,5 1046 | 15158-9,5 1047 | 7798-2,5 1048 | 49228-0,4 1049 | 2254-1,4 1050 | 21130-0,4 1051 | 2335-8,4 1052 | 4040-2,4 1053 | 14273-7,4 1054 | 13828-9,4 1055 | 3924-8,4 1056 | 5612-7,4 1057 | 17522-4,4 1058 | 3638-4,4 1059 | 703-9,3 1060 | 28660-9,3 1061 | 108-1,3 1062 | 3002-3,3 1063 | 2072-7,3 1064 | 25303-9,3 1065 | 801-1,3 1066 | 15066-4,3 1067 | 5683-8,3 1068 | 22632-4,2 1069 | 21668-9,2 1070 | 2187-3,2 1071 | 25896-2,2 1072 | 3978-4,2 1073 | 14689-4,2 1074 | 14785-0,2 1075 | 29956-0,2 1076 | 41173-6,2 1077 | 27810-1,2 1078 | 4462-8,2 1079 | 41237-9,2 1080 | 27038-9,2 1081 | 41289-0,1 1082 | 972-0,1 1083 | 10360-6,1 1084 | 21657-2,1 1085 | 18311-1,1 1086 | 550-4,1 1087 | 38534-4,1 1088 | 6901-3,1 1089 | 80-2,1 1090 | 375-6,1 1091 | 120-6,1 1092 | 3494-2,1 1093 | 291-5,1 1094 | 16623-1,1 1095 | 49231-4,1 1096 | 48726-4,1 1097 | 31036-7,1 1098 | 31138-1,1 1099 | 16099-4,1 1100 | -------------------------------------------------------------------------------- /Data/coh_top_500_lab_cats_cs.csv: -------------------------------------------------------------------------------- 1 | Loinc,Lab,Category,Uncertain 2 | 777-3,Platelets,Complete Blood Count, 3 | 2160-0,Creatinine,Renal Function, 4 | 48642-3,Glomerularfiltration rate/1.73 sq M.predicted.non black,Renal Function, 5 | 48643-1,Glomerularfiltration rate/1.73 sq M.predicted.black,Renal Function, 6 | 2823-3,Potassium,Metabolic Panel, 7 | 4544-3,Hematocrit,Complete Blood Count, 8 | 718-7,Hemoglobin,Complete Blood Count, 9 | 26464-8,Leukocytes,Complete Blood Count, 10 | 26453-1,Erythrocytes,Complete Blood Count, 11 | 30384-2,Erythrocytedistribution width,Complete Blood Count, 12 | 30428-7,Erythrocytemean corpuscular volume,Complete Blood Count, 13 | 28540-3,Erythrocytemean corpuscular hemoglobin concentration,Complete Blood Count, 14 | 28539-5,Erythrocytemean corpuscular hemoglobin,Complete Blood Count, 15 | 
2951-2,Sodium,Metabolic Panel, 16 | 2345-7,Glucose,Metabolic Panel, 17 | 17861-6,Calcium,Metabolic Panel, 18 | 3094-0,Urea nitrogen,Renal Function, 19 | 2028-9,Carbon dioxide,Arterial Blood Gas,1 20 | 2075-0,Chloride,Metabolic Panel, 21 | 1751-7,Albumin,Metabolic Panel, 22 | 1742-6,Alanine aminotransferase,Renal Function, 23 | 1975-2,Bilirubin,Liver Function/Metabolic Panel, 24 | 6768-6,Alkaline phosphatase,Liver Function/Metabolic Panel, 25 | 736-9,Lymphocytes/100 leukocytes,Complete Blood Count, 26 | 770-8,Neutrophils/100 leukocytes,Complete Blood Count, 27 | 5905-5,Monocytes/100 leukocytes,Complete Blood Count, 28 | 26450-7,Eosinophils/100leukocytes,Complete Blood Count, 29 | 30180-4,Basophils/100leukocytes,Complete Blood Count, 30 | 19123-9,Magnesium,Metabolic Panel, 31 | 32623-1,Plateletmean volume,Complete Blood Count, 32 | 48767-8,Annotationcomment,Meta, 33 | 2532-0,Lactate dehydrogenase,Liver Function/Metabolic Panel, 34 | 2777-1,Phosphate,Metabolic Panel, 35 | 2341-6,Glucose,Metabolic Panel, 36 | 33037-3,Aniongap,Serum Urine/Metabolic, 37 | 26474-7,Lymphocytes,Complete Blood Count, 38 | 30451-9,Neutrophils.segmented,Complete Blood Count, 39 | 26484-6,Monocytes,Complete Blood Count, 40 | 26449-9,Eosinophils,Complete Blood Count, 41 | 26444-0,Basophils,Complete Blood Count, 42 | 26499-4,Neutrophils,Complete Blood Count, 43 | 6742-1,Erythrocyte morphology finding,Complete Blood Count, 44 | 2465-3,IgG,Immunotherapy,1 45 | 5902-2,Coagulation tissue factor induced,Prothrombin Time, 46 | 6301-6,Coagulation tissue factor induced.INR,Prothrombin Time, 47 | 2571-8,Triglyceride,Metabolic Panel, 48 | 14629-0,Bilirubin.glucuronidated+Bilirubin.albuminbound,Liver Function/Metabolic Panel, 49 | 5803-2,pH,Metabolic Panel, 50 | 14979-9,Coagulationsurface induced,Prothrombin Time,1 51 | 20409-9,Erythrocytes,Complete Blood Count, 52 | 10331-7,Rh,Type & Screen, 53 | 883-9,ABO group,Type & Screen, 54 | 53326-5,Specificgravity,Urine,1 55 | 25428-4,Glucose,Metabolic Panel, 56 | 
5899-0,Coagulation surface induced normal/Actual,Prothrombin Time,1 57 | 57735-3,Protein,Metabolic Panel, 58 | 11253-2,Tacrolimus,Tacrolimus Serum, 59 | 30247-1,CytomegalovirusDNA,Cytology,1 60 | 2514-8,Ketones,Urine, 61 | 5770-3,Bilirubin,Liver Function/Metabolic Panel, 62 | 20408-1,Leukocytes,Urine, 63 | 5802-4,Nitrite,Urine, 64 | 5818-0,Urobilinogen,Urine, 65 | 20627-6,Turbidity,Urine, 66 | 5778-6,Color,Urine, 67 | 5821-4,Leukocytes,Urine, 68 | 25145-4,Bacteria,Urine, 69 | 40482-2,Crystals.unidentified,Urine, 70 | 13945-1,Erythrocytes,Urine, 71 | 46133-5,Epithelialcells,Urine, 72 | 9842-6,Casts,Urine, 73 | 26508-2,Neutrophils.bandform/100 leukocytes,Urine, 74 | 50012-4,Indirectantiglobulin test.poly specific reagent,Urine, 75 | 2093-3,Cholesterol,Liver Function/Metabolic Panel, 76 | 28541-1,Metamyelocytes/100leukocytes,Complete Blood Count, 77 | 2857-1,Prostate specific Ag,Metabolic Panel, 78 | 1989-3,Calcidiol,Metabolic Panel, 79 | 29247-4,Sirolimus,Sirolimus Level, 80 | 31208-2,Specimensource,Meta, 81 | 2472-9,IgM,Immunotherapy,1 82 | 2458-8,IgA,Immunotherapy,1 83 | 5894-1,Coagulation tissue factor induced actual/Normal,Prothrombin Time, 84 | 9830-1,Cholesterol.total/Cholesterol.in HDL,Liver Function, 85 | 3024-7,Thyroxine.free,Thyroid Panel, 86 | 58413-6,Erythrocytes.nucleated/100leukocytes,Complete Blood Count, 87 | 2089-1,Cholesterol.in LDL,Liver Function, 88 | 26446-5,Blasts/100leukocytes,Complete Blood Count, 89 | 2039-6,Carcinoembryonic Ag,CEA Serum, 90 | 26498-6,Myelocytes/100leukocytes,Complete Blood Count, 91 | 8310-5,Body temperature,Vital Sign, 92 | 2744-1,pH,Urinalysis/Metabolic Panel, 93 | 1960-4,Bicarbonate,Metabolic Panel, 94 | 2019-8,Carbon dioxide,Arterial Blood Gas, 95 | 2703-7,Oxygen,Arterial Blood Gas, 96 | 2708-6,Oxygen saturation,Arterial Blood Gas, 97 | 26507-4,Neutrophils.bandform,Arterial Blood Gas, 98 | 50984-4,Horowitzindex,Arterial Blood Gas, 99 | 4548-4,Hemoglobin A1c/Hemoglobin.total,HGBA1C, 100 | 39111-0,Bodysite,Location, 101 | 
45353-0,Dateof analysis,Meta, 102 | 66746-9,Specimentype,Meta, 103 | 38256-4,Cellscounted.total,Type of specimen requested or collected, 104 | 2685-6,Alpha-1-Acid glycoprotein,Renal Function, 105 | 12851-2,Proteinpattern,Renal Function, 106 | 13983-2,Gammaglobulin/Protein.total,Renal Function, 107 | 13982-4,Betaglobulin/Protein.total,Renal Function, 108 | 13981-6,Alpha2 globulin/Protein.total,Renal Function, 109 | 35706-1,Albumin/Protein.total,Renal Function, 110 | 33358-3,Protein.monoclonal,Renal Function, 111 | 29495-9,Herpesvirus 6 DNA,Herpes Viral Testing, 112 | 47938-6,Specimensource,Type of specimen requested or collected, 113 | 40844-3,Immunoglobulinlight chains.kappa.free/Immunoglobulin light chains.lambda,Complete Blood Count, 114 | 33944-0,Immunoglobulinlight chains.lambda.free,Complete Blood Count, 115 | 36916-5,Immunoglobulinlight chains.kappa.free,Complete Blood Count, 116 | 14357-8,Microscopicobservation,Complete Blood Count, 117 | 1925-7,Base excess,Arterial Blood Gas,1 118 | 27353-2,Estimatedaverage glucose,Blood glucose serum, 119 | 11156-7,Leukocytemorphology finding,Cytology , 120 | 53809-0,Oxygen.alveolar- arterial^^adjusted to patients actual temperature,Arterial Blood Gas, 121 | 4092-3,Vancomycin^trough,Serum Vanco Trough, 122 | 19913-3,Diffusioncapacity.carbon monoxide^^adjusted for hemoglobin,Arterial Blood Gas, 123 | 30433-7,Metamyelocytes,Metamyelocyte Level,1 124 | 42216-2,Referencelab name,Meta, 125 | 2157-6,Creatine kinase,Urine, 126 | 19927-3,Gasflow^at 25-75% of forced expiration,Arterial Blood Gas, 127 | 3040-3,Triacylglycerol lipase,Liver Function , 128 | 1798-8,Amylase,Metabolic Panel, 129 | 10839-9,TroponinI.cardiac,Troponin Level, 130 | 2276-4,Ferritin,Ferritin Level, 131 | 17842-6,CancerAg 27-29,Cancer Antigen blood test, 132 | 17849-1,Reticulocytes/100erythrocytes,Retic Count, 133 | 33516-6,Reticulocytes.immature/Reticulocytes.total,Retic Count, 134 | 14196-0,Reticulocytes,Retic Count, 135 | 14338-8,Prealbumin,Prealbumin Level, 
136 | 20396-8,Levofloxacin,Antibiotics, 137 | 4542-7,Haptoglobin,Haptoglobin blood test, 138 | 3255-7,Fibrinogen,Fibrinogen Level, 139 | 19926-5,Volumeat 1.0 s post forced expiration/Vital capacity.forced,Physical expiratory test for lung capacity, 140 | 30446-9,Myelocytes,Complete Blood Count, 141 | 20124-4,Ventilationmode,Arterial Blood Gas, 142 | 4537-7,Erythrocyte sedimentation rate,Sed Rate, 143 | 55305-7,Fungus,Microbilogy fungal test, 144 | 10334-1,CancerAg 125,CancerAg 125,1 145 | 30934-4,Natriureticpeptide.B,BNP peptide test., 146 | 20139-2,Volume.expired,Physical expiratory test for lung capacity, 147 | 20077-4,Pressure.positiveend expiratory setting,PULM, 148 | 13969-1,Creatinekinase.MB,CKMB, 149 | 35674-1,Creatinine,Creatinine, 150 | 30376-8,Blasts,Complete Blood Count, 151 | 19834-1,Breathssetting,Arterial Blood Gas, 152 | 20112-9,Tidalvolume setting,Ventilator Setting, 153 | 516-5,Trimethoprim+Sulfamethoxazole,Antibiotics, 154 | 11545-1,Microscopicobservation,Micro biology test for drug suseptability, 155 | 890-4,Blood group antibody screen,Cross Match, 156 | 5196-1,Hepatitis B virus surface Ag,Hep B (Viral Liver Test), 157 | 19877-0,Capacity.vital.forced^prebronchodilation,Arterial Blood Gas, 158 | 20157-4,Volumeat 1.0 s post forced expiration^ pre bronchodilation,Repiratory Expireatory test. 
, 159 | 35383-9,GalactomannanAg,Blood test that Detects Aspergillosis, 160 | 1952-1,Beta-2-Microglobulin,Blood test used to detect Protiens shed by cells, 161 | 2888-6,Protein,Protien Blood, 162 | 14959-1,Albumin/Creatinine,Kiddney functions, 163 | 3051-0,Triiodothyronine.free,Thyroid blood test, 164 | 19911-7,Diffusioncapacity.carbon monoxide,Arterial Blood Gas, 165 | 19916-6,Diffusioncapacity/Alveolar volume,Arterial Blood Gas, 166 | 48159-8,HepatitisC virus Ab Signal/Cutoff,Liver Function, 167 | 13955-0,HepatitisC virus Ab,Liver Function, 168 | 51953-8,Collectiondate,Meta, 169 | 47982-4,EpsteinBarr virus DNA,Blood test to detect Epstien Barr Virus, 170 | 2731-8,Parathyrin.intact,Blood test to check Parathyroid Hormone, 171 | 2132-9,Cobalamins,Vit B12, 172 | 19867-1,Capacity.vital,calculation of the max amount of air able to take in, 173 | 19843-2,Capacity.functionalresidual,FRC calculates air left in the lungs after exhilation, 174 | 19862-2,Capacity.total,Test to detect the amount of air in the lungs once airway closes., 175 | 20146-7,Volume.residual,Amount of air left in the lungs after maximum exhilation, 176 | 19924-0,Expiratoryreserve,Arterial Blood Gas, 177 | 524-9,Vancomycin,Arterial Blood Gas, 178 | 6932-8,Penicillin,Antibiotics, 179 | 133-9,Ceftazidime,Antibiotics, 180 | 1988-5,C reactive protein,CRP Liver Test, 181 | 26346-7,Viewsdiagnostic,, 182 | 279-0,Imipenem,Blood test to detect natural Antibiotics , 183 | 8098-6,Thyroglobulin Ab,Protien produced by Thyroid, 184 | 508-2,Tobramycin,Tobramycin level to detect Antibiotic, 185 | 41925-9,Class,, 186 | 24108-3,CancerAg 19-9,Cancer Antigen Test, 187 | 3053-6,Triiodothyronine,Thyroid Test, 188 | 44099-0,GalactomannanAg,ASPAG Aspergillus Antigen Serum, 189 | 28-1,Ampicillin,Antibiotics, 190 | 267-5,Gentamicin,Antibiotics, 191 | 5-Dec,Amikacin,Antibiotics, 192 | 3013-0,Thyroglobulin,Thyroid blood test, 193 | 6875-9,Cancer Ag 15-3,Blood test detects breast cancer, 194 | 412-7,Piperacillin+Tazobactam,Serum to 
detect Antibiotics, 195 | 6652-2,Meropenem,Injection, 196 | 5124-3,Cytomegalovirus Ab.IgG,Venereal, 197 | 11031-2,Lymphocytes/100leukocytes,Complete Blood Count, 198 | 2284-8,Folate,Folate Level for vitamine , 199 | 51730-0,Herpesvirus 6 Ab.IgG & IgM,Liver Function, 200 | 76-0,Cefazolin,Antibiotics, 201 | 2498-4,Iron,Iron Level, 202 | 2502-3,Iron saturation,Total body iron saturation, 203 | 3520-4,Cyclosporine,Oral Antibiotic , 204 | 5193-8,Hepatitis B virus surface Ab,Liver Function, 205 | 10333-3,Appearance,How something appears, 206 | 17607-3,Specimenvolume,Amount of a specimen, 207 | 806-0,Leukocytes,Complete Blood Count, 208 | 55794-2,Othercells,, 209 | 26454-9,Erythrocytes,Complete Blood Count, 210 | 26487-9,Monocytes/100leukocytes,Complete Blood Count, 211 | 2880-3,Protein,Blood Protein, 212 | 2342-4,Glucose,Blood sugar, 213 | 42176-8,"1,3beta glucan",Oral med or a blood test to detect,1 214 | 3167-4,Specimen volume,Arterial Blood Gas, 215 | 363-2,Nitrofurantoin,Antibiotics, 216 | 5794-3,Hemoglobin,Complete Blood Count, 217 | 53962-7,Alpha-1-Fetoprotein.tumormarker,Blood test to detect tumors, 218 | 185-9,Ciprofloxacin,Antibiotics, 219 | 593-4,Legionella sp identified,Lung test for Legionalla bacterium, 220 | 13362-9,Collectionduration,Meta, 221 | 24113-3,HepatitisB virus core Ab.IgM,Liver Function, 222 | 62365-2,Diagnosticimpression,Physician's impression, 223 | 44757-3,Bloodproduct units given,Blood Transfusion, 224 | 20507-0,ReaginAb,, 225 | 26524-9,Promyelocytes/100leukocytes,Complete Blood Count, 226 | 20079-0,Pressure.supportsetting,Ventilator Setting, 227 | 496-0,Tetracycline,Antibiotics, 228 | 13950-1,HepatitisA virus Ab.IgM,Liver Function, 229 | 19930-7,Gasflow,Arterial Blood Gas, 230 | 2991-8,Testosterone.free,Testosterone blood test, 231 | 2986-8,Testosterone,Testosterone blood test, 232 | 13457-7,Cholesterol.inLDL,Cholesterol blood chol, 233 | 14836-1,Methotrexate,Methotrexate blood meth, 234 | 233-7,Erythromycin,Erythromycin blood eryt, 235 | 
2324-2,Gamma glutamyl transferase,Gamma blood gamm, 236 | 14136-6,Cells.CD34,Cells blood cell, 237 | 13952-7,HepatitisB virus core Ab,Liver Function, 238 | 2164-2,Creatinine renal clearance,Creatinine blood crea, 239 | 19153-6,Specimenvolume,Specimenvolume blood spec, 240 | 30211-7,Collectionduration,Meta, 241 | 15179-5,FibrinD-dimer,FibrinD blood fibr, 242 | 193-3,Clindamycin,Clindamycin blood clin, 243 | 29541-0,HIV1 RNA,HIV testing, 244 | 24013-5,HIV1 RNA,HIV testing, 245 | 2143-6,Cortisol,Steroid, 246 | 15067-2,Follitropin,Follitropin blood foll, 247 | 3026-2,Thyroxine,Hormone, 248 | 38889-2,Personnelnotified of {event},Meta, 249 | 67723-7,Dateof health-related event,Meta, 250 | 383-0,Oxacillin,Antibiotics, 251 | 1839-0,Ammonia,Ammonia blood ammo, 252 | 30453-5,Neutrophils.segmented/100leukocytes,Neutrophils blood neut, 253 | 6644-9,Cefepime,Antibiotics, 254 | 2955-3,Sodium,Sodium blood sodi, 255 | 2118-8,Choriogonadotropin (pregnancy test),Choriogonadotropin blood chor, 256 | 2161-8,Creatinine,Creatinine blood crea, 257 | 7018-5,Gentamicin.high potency,Gentamicin blood gent, 258 | 14564-9,Hemoglobin.gastrointestinal^2ndspecimen,Hemoglobin blood hemo, 259 | 14563-1,Hemoglobin.gastrointestinal^1stspecimen,Hemoglobin blood hemo, 260 | 14565-6,Hemoglobin.gastrointestinal^3rdspecimen,Hemoglobin blood hemo, 261 | 20447-9,HIV1 RNA,HIV testing, 262 | 49301-5,Proteinpattern,Proteinpattern blood prot, 263 | 20153-3,Volumeat 1.0 s post forced expiration/Volume.forced expiration.total,Arterial Blood Gas, 264 | 20152-5,Volumeat 1.0 s post forced expiration measured/Predicted,Arterial Blood Gas, 265 | 19875-4,Capacity.vital.forced^postbronchodilation,Arterial Blood Gas, 266 | 20150-9,Volumeat 1.0 s post forced expiration,Arterial Blood Gas, 267 | 29991-7,Fibrin+Fibrinogenfragments,Fibrin blood fibr, 268 | 2162-6,Creatinine,Creatinine blood crea, 269 | 50398-7,Narrativediagnostic report,Narrativediagnostic blood narr, 270 | 22341-2,Herpessimplex virus Ab,Herpessimplex blood 
herp, 271 | 2243-4,Estradiol,Estradiol blood estr, 272 | 2518-9,Lactate,Lactate blood lact, 273 | 55463-4,Influenzavirus A swine origin RNA,Flu, 274 | 24467-3,Cells.CD3+CD4+,Cells blood cell, 275 | 2112-1,Choriogonadotropin.beta subunit (pregnancy test),Choriogonadotropin blood chor, 276 | 34581-9,Calcium.ionized,Calcium blood calc, 277 | 58443-3,Othercells,Othercells blood othe, 278 | 15069-8,Fructosamine,Fructosamine blood fruc, 279 | 10704-5,Ova& parasites identified,Micro biology stool test, 280 | 10501-5,Lutropin,Hormone, 281 | 20578-1,Vancomycin,Antibiotics, 282 | 49341-1,AdenovirusDNA,AdenovirusDNA blood aden, 283 | 13440-3,Interpretation,Interpretation blood inte, 284 | 30427-9,Macrophages/100leukocytes,Macrophages blood macr, 285 | 29950-3,NuclearAb.IgG,NuclearAb blood nucl, 286 | 51913-2,HepatitisA virus Ab.IgG+IgM,Liver Function, 287 | 2692-2,Osmolality,Osmolality blood osmo, 288 | 21198-7,Choriogonadotropin.betasubunit,Choriogonadotropin blood chor, 289 | 26347-5,Viewsdiagnostic,Meta, 290 | 26348-3,Viewsdiagnostic,Meta, 291 | 2746-6,pH,, 292 | 2705-2,Oxygen,Oxygen blood oxyg, 293 | 2021-4,Carbon dioxide,Carbon blood carb, 294 | 14627-4,Bicarbonate,Bicarbonate blood bica, 295 | 3016-3,Thyrotropin,Thyrotropin blood thyr, 296 | 35671-7,GalactomannanAg,GalactomannanAg blood gala, 297 | 2711-0,Oxygen saturation,Arterial Blood Gas, 298 | 8251-1,Service comment,Meta, 299 | 2695-5,Osmolality,Osmolality blood osmo, 300 | 9811-1,Chromogranin A,Chromogranin blood chro, 301 | 29254-0,Linezolid,Antibiotics, 302 | 48345-3,HIV1+O+2 Ab,HIV blood hiv1, 303 | 8099-4,Thyroperoxidase Ab,Thyroperoxidase blood thyr, 304 | 10535-3,Digoxin,Hearth med, 305 | 2106-3,Choriogonadotropin (pregnancy test),Choriogonadotropin blood chor, 306 | 3084-1,Urate,Urate blood urat, 307 | 3034-6,Transferrin,Transferrin blood tran, 308 | 2141-0,Corticotropin,Corticotropin blood cort, 309 | 44805-0,BKvirus DNA,BKvirus blood bkvi, 310 | 45285-4,Date& Time,Meta, 311 | 12190-5,Creatinine,Creatinine 
blood crea, 312 | 9335-1,Appearance,Appearance blood appe, 313 | 12254-9,Specimenvolume,Specimenvolume blood spec, 314 | 55793-4,Nucleatedcells,Nucleatedcells blood nucl, 315 | 26455-6,Erythrocytes,Erythrocytes blood eryt, 316 | 46337-2,View1,Meta, 317 | 46336-4,View1,Meta, 318 | 48398-2,HepatitisB virus DNA,HepatitisB blood hepa, 319 | 1649-3,Calcitriol,Calcitriol blood calc, 320 | 39528-5,AdenovirusDNA,AdenovirusDNA blood aden, 321 | 38917-1,Humanmetapneumovirus RNA,Humanmetapneumovirus blood huma, 322 | 29909-9,Parainfluenzavirus 2 RNA,Flu, 323 | 40982-1,Influenzavirus B RNA,Flu, 324 | 29908-1,Parainfluenzavirus 1 RNA,Flu, 325 | 29910-7,Parainfluenzavirus 3 RNA,Flu, 326 | 30075-6,Respiratorysyncytial virus A RNA,Flu, 327 | 30076-4,Respiratorysyncytial virus B RNA,Flu, 328 | 49524-2,Influenzavirus A H3 RNA,Flu, 329 | 49521-8,Influenzavirus A H1 RNA,Flu, 330 | 7993-9,Rhinovirus RNA,Flu, 331 | 3968-5,Phenytoin,Phenytoin blood phen, 332 | 5221-7,HIV 1 Ab,HIV Testing, 333 | 56888-1,HIV1+2 Ab+HIV1 p24 Ag,HIV Testing, 334 | 41394-8,Collectiondate,Meta, 335 | 61099-8,Collectiondate^3rd specimen,Meta, 336 | 61100-4,Collectiondate^2nd specimen,Meta, 337 | 2881-1,Protein,Protein blood prot, 338 | 1971-1,Bilirubin.non-glucuronidated,Bilirubin blood bili, 339 | 116-4,Cefoxitin,Antibiotics, 340 | 43380-5,Coccidioidesimmitis Ab.IgG,Coccidioidesimmitis Ab blood cocc, 341 | 31153-0,Coccidioidessp Ab,Coccidioidessp blood cocc, 342 | 33510-9,Coccidioidesimmitis Ab.IgM,Coccidioidesimmitis Ab blood cocc, 343 | 42595-9,HepatitisB virus DNA,Liver Function, 344 | 23641-4,Quinupristin+Dalfopristin,Quinupristin blood quin, 345 | 5096-3,Coccidioides immitis Ab,Coccidioides blood cocc, 346 | 56850-1,Interpretationand review of laboratory results,Interpretationand blood inte, 347 | 20607-8,Cells.CD3+CD4+/Cells.CD3+CD8+,Cells.CD3+CD4+/Cells blood cell, 348 | 14135-8,Cells.CD3+CD8+,Cells blood cell, 349 | 33903-6,Ketones,Ketones blood keto, 350 | 19072-8,Calcium.ionized^^adjustedto pH 
7.4,Calcium.ionized^^adjustedto pH 7 blood calc, 351 | 12180-6,Calcium.ionized,Calcium blood calc, 352 | 15205-8,Rheumatoidfactor,Rheumatoidfactor blood rheu, 353 | 26523-1,Promyelocytes,Promyelocytes blood prom, 354 | 700-5,Pneumocystis jiroveci Ag,Pneumocystis blood pneu, 355 | 2529-6,Lactate dehydrogenase,Lactate blood lact, 356 | 6874-2,Calcium,Calcium blood calc, 357 | 18488-7,Calcium,Calcium blood calc, 358 | 13717-4,Calcium/Creatinine,Calcium blood calc, 359 | 2344-0,Glucose,Glucose blood gluc, 360 | 2524-7,Lactate,Lactate blood lact, 361 | 31790-9,Cryptococcussp Ag,Cryptococcussp blood cryp, 362 | 38180-6,HepatitisC virus RNA,Liver Function, 363 | 30522-7,Creactive protein,Creactive blood crea, 364 | 48174-7,Mycobacteriumtuberculosis complex rRNA,Mycobacteriumtuberculosis blood myco, 365 | 141-2,Ceftriaxone,Antibiotics, 366 | 3174-0,Antithrombin,Antithrombin blood anti, 367 | 11011-4,HepatitisC virus RNA,Liver Function, 368 | 45323-3,Mycobacteriumtuberculosis tuberculin stimulated gamma interferon,Mycobacteriumtuberculosis blood myco, 369 | 2283-0,Folate,Folate blood fola, 370 | 58450-8,Bilirubin,Bilirubin blood bili, 371 | 1763-2,Aldosterone,Aldosterone blood aldo, 372 | 13355-3,Unspecifiedcells/100 leukocytes,Unspecifiedcells blood unsp, 373 | 28544-5,Mesothelialcells/100 leukocytes,Mesothelialcells blood meso, 374 | 38370-3,Voriconazole,Voriconazole blood vori, 375 | 3209-4,Coagulation factor VIII activity actual/Normal,Coagulation blood coag, 376 | 20448-7,Insulin,Insulin blood insu, 377 | 64111-8,Timeat blood draw,Meta, 378 | 12841-3,Prostatespecific Ag.free/Prostate specific Ag.total,Prostatespecific Ag.free/Prostate specific Ag blood pros, 379 | 21347-0,HTLV1+2 Ab band pattern,HTLV blood htlv, 380 | 16982-1,HTLV1+2 Ab,HTLV blood htlv, 381 | 1795-4,Amylase,Amylase blood amyl, 382 | 4090-7,Vancomycin^peak,Vancomycin^peak blood vanc, 383 | 21215-9,Collagencrosslinked N-telopeptide,Collagencrosslinked blood coll, 384 | 2357-2,Glucose-6-Phosphate 
dehydrogenase,Glucose blood gluc, 385 | 29951-1,Betaglobulin,Betaglobulin blood beta, 386 | 38176-4,Immunoglobulinlight chains.kappa.free,Immunoglobulinlight chains.kappa blood immu, 387 | 38169-9,Immunoglobulinlight chains.lambda.free,Immunoglobulinlight chains.lambda blood immu, 388 | 29947-9,Alpha2 globulin,Alpha blood alph, 389 | 41759-2,Immunoglobulinlight chains.kappa.free/Immunoglobulin light chains.lambda.free,Immunoglobulinlight chains.kappa.free/Immunoglobulin light chains.lambda blood immu, 390 | 2889-4,Protein,Protein blood prot, 391 | 29946-1,Albumin,Albumin blood albu, 392 | 38178-0,Immunoglobulinlight chains.lambda.free,Immunoglobulinlight chains.lambda blood immu, 393 | 29945-3,Alpha1 globulin,Alpha blood alph, 394 | 29899-2,Gammaglobulin,Gammaglobulin blood gamm, 395 | 38177-2,Immunoglobulinlight chains.kappa.free,Immunoglobulinlight chains.kappa blood immu, 396 | 46418-0,Coagulationtissue factor induced.INR,Coagulationtissue factor induced blood coag, 397 | 22666-2,Busulfan,Busulfan blood busu, 398 | 59471-3,Calcium.ionized,Calcium blood calc, 399 | 41171-0,Collagencrosslinked C-telopeptide,Collagencrosslinked blood coll, 400 | 5014-6,Herpes simplex virus DNA,Herpes virus blood test, 401 | 2193-1,Dehydroepiandrosterone,Dehydroepiandrosterone blood dehy, 402 | 1986-9,C peptide,C blood c pe, 403 | 39017-9,Mycobacteriumtuberculosis tuberculin stimulated gamma interferon/Mitogen stimulated gamma interferon,Mycobacteriumtuberculosis blood myco, 404 | 15061-5,Erythropoietin,Erythropoietin blood eryt, 405 | 2999-1,Thiamine,Thiamine blood thia, 406 | 10378-8,Polychromasia,Polychromasia blood poly, 407 | 2519-7,Lactate,Lactate blood lact, 408 | 12228-3,Triglyceride,Triglyceride blood trig, 409 | 53731-6,Posaconazole,Posaconazole blood posa, 410 | 4485-9,Complement C3,Complement blood comp, 411 | 7799-0,Fibrin D-dimer,Fibrin blood fibr, 412 | 1761-6,Aldolase,Aldolase blood aldo, 413 | 4498-2,Complement C4,Complement blood comp, 414 | 
30471-7,Levetiracetam,Levetiracetam blood leve, 415 | 1747-5,Albumin,Albumin blood albu, 416 | 13967-5,Sexhormone binding globulin,Sexhormone blood sexh, 417 | 5236-5,Legionella pneumophila Ab,Lung test for Legionalla bacterium, 418 | 38492-5,Insulin-likegrowth factor-I,Insulin blood insu, 419 | 3665-7,Gentamicin^trough,Gentamicin^trough blood gent, 420 | 13047-6,Plasmacells/100 leukocytes,Plasmacells blood plas, 421 | 34187-5,DNAdouble strand Ab,DNAdouble blood dnad, 422 | 29965-1,Sjogrenssyndrome-B extractable nuclear Ab.IgG,Sjogrenssyndrome-B extractable nuclear Ab blood sjog, 423 | 26452-3,Eosinophils/100leukocytes,Eosinophils blood eosi, 424 | 32516-7,Cells.CD3+CD4+/100cells,Cells blood cell, 425 | 8101-8,Cells.CD3+CD8+/100 cells,Cells blood cell, 426 | 20599-7,Cells.CD3/100cells,Cells blood cell, 427 | 15087-0,Parathyrinrelated protein,Parathyrinrelated blood para, 428 | 3274-8,Heparin.unfractionated,Heparin blood hepa, 429 | 2828-2,Potassium,Potassium blood pota, 430 | 2900-9,Pyridoxine,Pyridoxine blood pyri, 431 | 4086-5,Valproate,Valproate blood valp, 432 | 17838-4,Alkalinephosphatase.bone,Alkalinephosphatase blood alka, 433 | 22602-7,Varicellazoster virus Ab.IgG,Varicellazoster virus Ab blood vari, 434 | 2191-5,Dehydroepiandrosterone sulfate,Dehydroepiandrosterone blood dehy, 435 | 33935-8,Cycliccitrullinated peptide Ab.IgG,Cycliccitrullinated peptide Ab blood cycl, 436 | 1644-4,Triglyceride^post 12H CFst,Triglyceride^post blood trig, 437 | 3204-5,Coagulation factor VIII inhibitor,Coagulation blood coag, 438 | 15432-8,Testosterone.free/Testosterone.total,Testosterone.free/Testosterone blood test, 439 | 2990-0,Testosterone.bioavailable,Testosterone blood test, 440 | 4621-9,Hemoglobin S,Hemoglobin blood hemo, 441 | 35669-1,Amikacin,Amikacin blood amik, 442 | 800-3,Schistocytes,Schistocytes blood schi, 443 | 5880-0,Rotavirus Ag,Rotavirus blood rota, 444 | 5157-3,Epstein Barr virus capsid Ab.IgG,Epstein Barr virus capsid Ab blood epst, 445 | 
29966-9,CentromereAb.IgG,CentromereAb blood cent, 446 | 31147-2,ReaginAb,ReaginAb blood reag, 447 | 21518-6,SCL-70extractable nuclear Ab.IgG,SCL-70extractable nuclear Ab blood scl-, 448 | 12183-0,Cholesterol,Cholesterol blood chol, 449 | 13522-8,Blasts/100leukocytes,Blasts blood blas, 450 | 161-0,Cephalothin,Antibiotics, 451 | 2078-4,Chloride,Chloride blood chlo, 452 | 2079-2,Chloride,Chloride blood chlo, 453 | 4059-2,Tobramycin^trough,Tobramycin^trough blood tobr, 454 | 46128-5,Tissuetransglutaminase Ab.IgA,Tissuetransglutaminase Ab blood tiss, 455 | 41488-8,Cryptosporidiumsp,Cryptosporidiumsp blood cryp, 456 | 29958-6,Ribonucleoproteinextractable nuclear Ab.IgG,Ribonucleoproteinextractable nuclear Ab blood ribo, 457 | 4057-6,Tobramycin^peak,Tobramycin^peak blood tobr, 458 | 20593-0,Cells.CD19/100cells,Cells blood cell, 459 | 1848-1,Androstanolone,Androstanolone blood andr, 460 | 42189-1,Cells.CD3+CD16+CD56+/100cells,Cells blood cell, 461 | 30083-0,EpsteinBarr virus nuclear Ab.IgG,EpsteinBarr virus nuclear Ab blood epst, 462 | 22295-0,EpsteinBarr virus early Ab.IgG,EpsteinBarr virus early Ab blood epst, 463 | 5159-9,Epstein Barr virus capsid Ab.IgM,Epstein Barr virus capsid Ab blood epst, 464 | 15363-5,Busulfan,Busulfan blood busu, 465 | 1823-4,Alpha tocopherol,Alpha blood alph, 466 | 11038-7,Betagamma tocopherol,Betagamma blood beta, 467 | 2460-4,IgD,, 468 | 9572-9,Parvovirus B19 DNA,Parvovirus blood parv, 469 | 49896-4,Humanpapilloma virus 16+18+31+33+35+39+45+51+52+56+58+59+68 DNA,Venereal, 470 | 18310-3,Hemoglobin.other/Hemoglobin.total,Hemoglobin.other/Hemoglobin blood hemo, 471 | 21026-0,Pathologistinterpretation,Pathologistinterpretation blood path, 472 | 4575-7,Hemoglobin E/Hemoglobin.total,Hemoglobin E/Hemoglobin blood hemo, 473 | 4547-6,Hemoglobin A1/Hemoglobin.total,Hemoglobin A1/Hemoglobin blood hemo, 474 | 4576-5,Hemoglobin F/Hemoglobin.total,Hemoglobin F/Hemoglobin blood hemo, 475 | 4625-0,Hemoglobin S/Hemoglobin.total,Hemoglobin S/Hemoglobin blood 
hemo, 476 | 4551-8,Hemoglobin A2/Hemoglobin.total,Hemoglobin A2/Hemoglobin blood hemo, 477 | 4563-3,Hemoglobin C/Hemoglobin.total,Hemoglobin C/Hemoglobin blood hemo, 478 | 25474-8,Metanephrines,Metanephrines blood meta, 479 | 25489-6,Normetanephrine,Normetanephrine blood norm, 480 | 41407-8,Cortisol^predose corticotropin,Cortisol^predose blood cort, 481 | 13636-6,RibosomalP Ab,RibosomalP blood ribo, 482 | 26528-0,Cortisol^1Hpost dose corticotropin,Cortisol^ blood cort, 483 | 26530-6,Cortisol^30Mpost dose corticotropin,Cortisol^ blood cort, 484 | 32713-0,Potassium,Potassium blood pota, 485 | 19113-0,IgE,, 486 | 31625-7,Sjogrenssyndrome-A extractable nuclear Ab.IgG,Sjogrenssyndrome-A extractable nuclear Ab blood sjog, 487 | 53019-6,Sjogrenssyndrome-A extractable nuclear 60kD Ab,Sjogrenssyndrome blood sjog, 488 | 26287-3,Viewslimited,Viewslimited blood view, 489 | 41650-3,Chloride,Chloride blood chlo, 490 | 11115-3,Neutrophils/100cells,Neutrophils blood neut, 491 | 1992-7,Calcitonin,Calcitonin blood calc, 492 | 55784-3,Leukocytes^^correctedfor nucleated erythrocytes,Leukocytes^^correctedfor blood leuk, 493 | 13964-2,Methylmalonate,Methylmalonate blood meth, 494 | 29591-5,EnterovirusRNA,EnterovirusRNA blood ente, 495 | 2742-5,Angiotensin converting enzyme,Angiotensin blood angi, 496 | 32717-1,Sodium,Sodium blood sodi, 497 | 2915-7,Renin,Renin blood reni, 498 | 56540-8,Glutamatedecarboxylase 65 Ab,Glutamatedecarboxylase blood glut, 499 | 41276-7,Aniongap,Aniongap blood anio, 500 | 3128-6,Viscosity,Viscosity blood visc, 501 | 5255-5,Mycoplasma pneumoniae Ab.IgG,Mycoplasma pneumoniae Ab blood myco, 502 | -------------------------------------------------------------------------------- /tsne_plot_kdd_dshealth2019.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": {}, 6 | "source": [ 7 | "# *t*-SNE Plot for LOINC Embeddings\n", 8 | "**Lorenzo Rossi** (lrossi@coh.org, 
lorenzo.rossi@gmail.com) \n", 9 | "\n", 10 | "This notebook reproduces the *t*-SNE plot (Figure 3, page 4) in the [KDD 2019 DSHealth Workshop](https://dshealthkdd.github.io/dshealth-2019/) paper \"Evaluation of Embeddings of Laboratory Test Codes for Patients at a Cancer Center\": [arxiv.org/abs/1907.09600](https://arxiv.org/abs/1907.09600). The code can be used as a starting point for further in-depth exploration of the embeddings.\n", 11 | "\n", 12 | "The embeddings have been trained via Word2Vec skip-gram with EHR data from the [City of Hope National Medical Center](https://www.cityofhope.org/homepage). See paper for details on the training.\n", 13 | "\n", 14 | "## Files Needed to Run the Notebook\n", 15 | "The word2vec embeddings for the LOINCs are provided along with other files, but you will need to download the official LOINC CSV file from [loinc.org](https://loinc.org). That CSV file is necessary to provide a taxonomy of the LOINC codes and hence the classes shown in different colors in the scatter plot.\n", 16 | "\n", 17 | "### How to download the official LOINC Table CSV file\n", 18 | "* Create an account on LOINC.org (it's free) and log in\n", 19 | "* Click on 'Downloads' (menu at the top of the page)\n", 20 | "* Click on 'LOINC Table'\n", 21 | "* Click on 'LOINC Table File (CSV)'\n", 22 | "* Review and check the Copyright and Terms of Use note\n", 23 | "* Click on 'Download': a zip archive will be downloaded to your machine\n", 24 | "* Extract Loinc.csv from the zip archive\n", 25 | "\n", 26 | "### Versions\n", 27 | "```\n", 28 | "scikit-learn 0.19.2\n", 29 | "pandas 0.23.4\n", 30 | "gensim 3.7.3\n", 31 | "seaborn 0.9.0\n", 32 | "\n", 33 | "```\n", 34 | "\n", 35 | "## Citation\n", 36 | "If you use the material in your work, please cite our paper.
__BibTeX entry:__\n", 37 | "\n", 38 | "```\n", 39 | "@inproceedings{larossi2019evaluation,\n", 40 | " title={Evaluation of Embeddings of Laboratory Test Codes for Patients at a Cancer Center},\n", 41 | " author={Rossi, Lorenzo A and Shawber, Chad and Munu, Janet and Zachariah, Finly},\n", 42 | " booktitle={KDD Workshop on Applied Data Science for Healthcare (DSHealth)},\n", 43 | " year={2019}\n", 44 | "}\n", 45 | "```" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": 1, 51 | "metadata": {}, 52 | "outputs": [ 53 | { 54 | "name": "stdout", 55 | "output_type": "stream", 56 | "text": [ 57 | "Using matplotlib backend: TkAgg\n" 58 | ] 59 | } 60 | ], 61 | "source": [ 62 | "import seaborn as sns\n", 63 | "import pandas as pd\n", 64 | "pd.options.mode.chained_assignment = None\n", 65 | "from sklearn.manifold import TSNE\n", 66 | "from gensim.models import Word2Vec, KeyedVectors\n", 67 | "import warnings\n", 68 | "warnings.filterwarnings(\"ignore\", category=UserWarning)\n", 69 | "import matplotlib.pyplot as plt\n", 70 | "\n", 71 | "def loinc_class_mapping(ct, map_to_other):\n", 72 | " try:\n", 73 | " return map_to_other[ct]\n", 74 | " except (TypeError, KeyError) as e:\n", 75 | " return ct\n", 76 | "\n", 77 | "def simplify(element, select_list):\n", 78 | " if element in select_list:\n", 79 | " return element\n", 80 | " else:\n", 81 | " return 'Other'\n", 82 | "\n", 83 | "%matplotlib" 84 | ] 85 | }, 86 | { 87 | "cell_type": "code", 88 | "execution_count": 2, 89 | "metadata": {}, 90 | "outputs": [], 91 | "source": [ 92 | "# Parameter setting, file and folder names\n", 93 | "\n", 94 | "hc_trained_infolder = 'Data/'\n", 95 | "vec_folder = 'Data/'\n", 96 | "doc_folder = 'Data/'\n", 97 | "in_fold = 'Data/'\n", 98 | "\n", 99 | "embedding_fname = 'loinc_s200_w5_c5_sg_wv.txt'\n", 100 | "cnt_fname = 'loinc_counts.csv'\n", 101 | "lab_cat_fname = 'coh_top_500_lab_cats_cs.csv'\n", 102 | "loinc_official_fname = 'Loinc.csv'\n", 103 | "\n", 104 | "# parameters\n", 
105 | "rnd_seed = 201905\n", 106 | "n_max_codes = 500\n", 107 | "min_class_size = 12\n", 108 | "\n", 109 | "# Abbreviations for classes of LOINCs [loinc.org]\n", 110 | "loinc_class_dict = {'ABXBACT':'Antibiotic susceptibilities', 'PULM':'Pulmonary', 'COAG':'Coagulation study', \n", 111 | " 'DRUG/TOX':'Drug levels & Toxicology', 'HEM/BC': 'Hematology/Cell counts', 'CHEM':'Chemistry',\n", 112 | " 'UA':'Urinalysis', 'MICRO':'Microbiology', 'SERO':'Serology', 'SPEC':'Specimen characteristics',\n", 113 | " 'CELLMARK': 'Cell Markers', 'OTHER': 'Other'}" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": 3, 119 | "metadata": {}, 120 | "outputs": [ 121 | { 122 | "name": "stdout", 123 | "output_type": "stream", 124 | "text": [ 125 | "loaded word vectors from loinc_s200_w5_c5_sg_wv.txt\n" 126 | ] 127 | } 128 | ], 129 | "source": [ 130 | "# File loading\n", 131 | "\n", 132 | "# Word2Vec embeddings for LOINCs\n", 133 | "cd_embs = KeyedVectors.load_word2vec_format(vec_folder+embedding_fname, binary=False)\n", 134 | "print('loaded word vectors from', embedding_fname)\n", 135 | "\n", 136 | "# COH past LOINC frequencies\n", 137 | "cd_counts = pd.read_csv(doc_folder+cnt_fname, header=0, squeeze=True, index_col=0)\n", 138 | "\n", 139 | "# Manual categorization of lab tests (used only to distinguish Complete Blood Count labs)\n", 140 | "lab_cats = pd.read_csv(in_fold+lab_cat_fname, header=0)\n", 141 | "\n", 142 | "# Official loinc.org table\n", 143 | "loinc_offic = pd.read_csv(in_fold+loinc_official_fname, header=0, low_memory=False)" 144 | ] 145 | }, 146 | { 147 | "cell_type": "code", 148 | "execution_count": 4, 149 | "metadata": {}, 150 | "outputs": [ 151 | { 152 | "name": "stdout", 153 | "output_type": "stream", 154 | "text": [ 155 | "Restrict to 500 most frequent codes\n", 156 | "t-SNE completed\n" 157 | ] 158 | } 159 | ], 160 | "source": [ 161 | "# t-SNE computations\n", 162 | "\n", 163 | "print('Restrict to', n_max_codes, 'most frequent codes')\n", 164 | "top_n_codes = 
cd_counts[:n_max_codes].index.tolist()\n", 165 | "##top_n_code_names = [cd_dict[cd] for cd in top_n_codes]\n", 166 | "\n", 167 | "\n", 168 | "tsne = TSNE(n_components=2, perplexity=20, learning_rate=40, n_iter=5000, random_state=rnd_seed)\n", 169 | "X_tsne = tsne.fit_transform(cd_embs[top_n_codes])\n", 170 | "print('t-SNE completed')" 171 | ] 172 | }, 173 | { 174 | "cell_type": "code", 175 | "execution_count": 5, 176 | "metadata": {}, 177 | "outputs": [], 178 | "source": [ 179 | "# Merging CoH internal LOINC data with loinc.org Table\n", 180 | "lab_cats = lab_cats.merge(loinc_offic, how='left', left_on='Loinc', right_on='LOINC_NUM')# -loinc_offic.columns\n", 181 | "coh_lab_official = lab_cats[['LOINC_NUM', 'Lab', 'SHORTNAME', 'LONG_COMMON_NAME', 'COMPONENT', 'Category',\n", 182 | " 'CLASS', 'SYSTEM', 'CLASSTYPE', 'PROPERTY','EXAMPLE_UNITS', 'RELATEDNAMES2']]\n", 183 | "coh_lab_official.rename({'Lab': 'COH_NAME', 'Category': 'MANUAL_CATEGORY'}, axis='columns', inplace=True)\n", 184 | "coh_lab_official['COUNTS'] = cd_counts.values[:len(coh_lab_official)]" 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": 6, 190 | "metadata": {}, 191 | "outputs": [], 192 | "source": [ 193 | "# LOINC class simplification\n", 194 | "\n", 195 | "class_counts = coh_lab_official.groupby('CLASS').size().sort_values(ascending=False)\n", 196 | "map_to_other = pd.Series(dict([(item, 'OTHER') for item in class_counts[class_counts" 217 | ] 218 | }, 219 | "metadata": {}, 220 | "output_type": "display_data" 221 | } 222 | ], 223 | "source": [ 224 | "# t-SNE Plot\n", 225 | "\n", 226 | "palet = 'nipy_spectral'\n", 227 | "#\n", 228 | "filled_markers = ('o', 'v', '^', '<', '>', 's', 'p', '*', 'X', 'P', 'H', 'D', 'd')\n", 229 | "plt.figure()\n", 230 | "with sns.plotting_context('paper', font_scale=1.8):\n", 231 | " sns.scatterplot(X_tsne[:,0], X_tsne[:,1], hue=simplified_classes, style=panel, markers=['D', 'o'], \n", 232 | " s=48, palette=palet)\n", 233 | " 
plt.legend(bbox_to_anchor=(.9, 1), loc=2, borderaxespad=0.)" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": null, 239 | "metadata": {}, 240 | "outputs": [], 241 | "source": [] 242 | } 243 | ], 244 | "metadata": { 245 | "kernelspec": { 246 | "display_name": "Python 3", 247 | "language": "python", 248 | "name": "python3" 249 | }, 250 | "language_info": { 251 | "codemirror_mode": { 252 | "name": "ipython", 253 | "version": 3 254 | }, 255 | "file_extension": ".py", 256 | "mimetype": "text/x-python", 257 | "name": "python", 258 | "nbconvert_exporter": "python", 259 | "pygments_lexer": "ipython3", 260 | "version": "3.5.2" 261 | } 262 | }, 263 | "nbformat": 4, 264 | "nbformat_minor": 2 265 | } 266 | --------------------------------------------------------------------------------
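
The notebook's "LOINC class simplification" step relies on `min_class_size` and a `map_to_other` lookup to fold rare LOINC classes into a single 'OTHER' group before plotting. A minimal, library-free sketch of that idea follows. It assumes that classes with strictly fewer than `min_class_size` codes are the ones folded; the helper name `fold_small_classes` and the sample labels are hypothetical and not part of the notebook.

```python
from collections import Counter


def fold_small_classes(labels, min_class_size):
    """Replace labels occurring fewer than min_class_size times with 'OTHER'.

    Assumption: the threshold is strict (count < min_class_size is folded).
    """
    counts = Counter(labels)
    return [lab if counts[lab] >= min_class_size else 'OTHER' for lab in labels]


# Hypothetical sample: three CHEM codes, two HEM/BC codes, one PULM code.
labels = ['CHEM'] * 3 + ['HEM/BC'] * 2 + ['PULM']
simplified = fold_small_classes(labels, min_class_size=2)
print(Counter(simplified))
```

With a threshold of 2, only the single 'PULM' label is folded into 'OTHER'. Keeping the number of distinct classes small matters here because each class gets its own color in the scatter plot.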