├── README.md
├── biobert_ner
│   ├── README.md
│   ├── XMLtoTSV.py
│   ├── analysis-ai
│   │   ├── .gitignore
│   │   ├── README.md
│   │   ├── package.json
│   │   ├── public
│   │   │   ├── favicon.ico
│   │   │   ├── index.html
│   │   │   └── manifest.json
│   │   ├── src
│   │   │   ├── actions
│   │   │   │   ├── bcdr.js
│   │   │   │   ├── bioNlp.js
│   │   │   │   └── request-type.js
│   │   │   ├── components
│   │   │   │   ├── button
│   │   │   │   │   └── index.js
│   │   │   │   ├── fork
│   │   │   │   │   └── index.js
│   │   │   │   ├── header
│   │   │   │   │   └── index.js
│   │   │   │   ├── highlighter
│   │   │   │   │   └── index.js
│   │   │   │   ├── input-with-examples
│   │   │   │   │   ├── example-select.js
│   │   │   │   │   ├── example-text.js
│   │   │   │   │   ├── index.js
│   │   │   │   │   ├── submit.js
│   │   │   │   │   └── tooltip-composed-menu.js
│   │   │   │   ├── request-type-radio
│   │   │   │   │   └── index.js
│   │   │   │   └── response-text-area
│   │   │   │       ├── container.js
│   │   │   │       ├── index.js
│   │   │   │       └── styled-text.js
│   │   │   ├── containers
│   │   │   │   ├── input-with-examples
│   │   │   │   │   ├── bc5dr-select.js
│   │   │   │   │   ├── bc5dr-submit.js
│   │   │   │   │   ├── bc5dr-text.js
│   │   │   │   │   ├── example-select.js
│   │   │   │   │   ├── example-text.js
│   │   │   │   │   ├── index.js
│   │   │   │   │   └── submit.js
│   │   │   │   ├── request-type-radio.js
│   │   │   │   └── response-text-area
│   │   │   │       ├── bc5dr-text-area.js
│   │   │   │       ├── bioNlp-text-area.js
│   │   │   │       └── container.js
│   │   │   ├── enums
│   │   │   │   └── request-types.js
│   │   │   ├── index.css
│   │   │   ├── index.js
│   │   │   ├── logo.svg
│   │   │   ├── pages
│   │   │   │   └── home
│   │   │   │       ├── index.css
│   │   │   │       ├── index.js
│   │   │   │       └── index.test.js
│   │   │   ├── reducers
│   │   │   │   ├── bcdr.js
│   │   │   │   ├── bioNlp.js
│   │   │   │   ├── index.js
│   │   │   │   └── request-type.js
│   │   │   ├── redux-constants
│   │   │   │   └── fetch.js
│   │   │   ├── serviceWorker.js
│   │   │   └── utils
│   │   │       ├── mapCodeToColors.js
│   │   │       └── params.js
│   │   └── yarn.lock
│   ├── api.py
│   ├── convert_to_pytorch_wt.ipynb
│   ├── data_load.py
│   ├── extras
│   │   └── ezgif.com-video-to-gif.gif
│   ├── new_model.py
│   ├── new_train.py
│   ├── parameters.py
│   └── requirements.txt
└── fill_the_blanks
    ├── README.md
    ├── fill_blanks.py
    └── fillblank.gif
/README.md:
--------------------------------------------------------------------------------
1 | This repo contains my experiments with BERT pretrained weights.
2 |
--------------------------------------------------------------------------------
/biobert_ner/README.md:
--------------------------------------------------------------------------------
1 | # Solving BioNLP problems using BERT (BioBERT, PyTorch)
2 |
3 | A working demo for NER can be found [here](http://13.72.66.146:5000/)
4 |
5 | This repository contains code for fine-tuning [BioBERT](https://arxiv.org/abs/1901.08746).
6 |
7 | ## Preparation
8 | To use BioBERT, download the pretrained [weights](https://github.com/naver/biobert-pretrained/releases) and convert them to PyTorch format with the notebook [convert_to_pytorch_wt.ipynb](https://github.com/MeRajat/SolvingAlmostAnythingWithBert/blob/ner_medical/convert_to_pytorch_wt.ipynb).
9 |
10 | Place the converted weights into the `weights/` folder.
11 |
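Once converted, the weights load like any other PyTorch state dict; for reference, this is how `api.py` later builds a model for inference. The snippet below is a minimal sketch, and the file names under `weights/` are placeholders rather than files shipped with this repo.

```python
# Minimal sketch: load a fine-tuned checkpoint for CPU inference (mirrors build_model in api.py).
# The two paths below are placeholders; point them at your own converted files.
import torch
from pytorch_pretrained_bert.modeling import BertConfig

from data_load import HParams
from new_model import Net

config = BertConfig(vocab_size_or_config_json_file='weights/bert_config.json')
hp = HParams('bc5cdr')  # or 'bionlp3g'

model = Net(config, vocab_len=len(hp.VOCAB), bert_state_dict=None)
model.load_state_dict(torch.load('weights/bc5cdr_finetuned.pt', map_location='cpu'))
model.eval()
```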
12 | ## NER
13 |
14 | NER data can be downloaded from https://github.com/cambridgeltl/MTL-Bioinformatics-2016.
15 |
16 | Select the NER dataset you want to train on and move it to the data folder.
17 |
18 | ### Datasets
19 |
20 | We have used [BC5CDR](https://biocreative.bioinformatics.udel.edu/tasks/biocreative-v/track-3-cdr/) and [BioNLP13CG](http://2013.bionlp-st.org/).
21 |
22 | BC5CDR tags:
23 | ```
24 | 'B-Chemical',
25 | 'O',
26 | 'B-Disease',
27 | 'I-Disease',
28 | 'I-Chemical'
29 | ```
30 |
31 | BioNLP13CG tags:
32 | ``` 'B-Amino_acid',
33 | 'B-Anatomical_system',
34 | 'B-Cancer',
35 | 'B-Cell',
36 | 'B-Cellular_component',
37 | 'B-Developing_anatomical_structure',
38 | 'B-Gene_or_gene_product',
39 | 'B-Immaterial_anatomical_entity',
40 | 'B-Multi-tissue_structure',
41 | 'B-Organ',
42 | 'B-Organism',
43 | 'B-Organism_subdivision',
44 | 'B-Organism_substance',
45 | 'B-Pathological_formation',
46 | 'B-Simple_chemical',
47 | 'B-Tissue',
48 | 'I-Amino_acid',
49 | 'I-Anatomical_system',
50 | 'I-Cancer',
51 | 'I-Cell',
52 | 'I-Cellular_component',
53 | 'I-Developing_anatomical_structure',
54 | 'I-Gene_or_gene_product',
55 | 'I-Immaterial_anatomical_entity',
56 | 'I-Multi-tissue_structure',
57 | 'I-Organ',
58 | 'I-Organism',
59 | 'I-Organism_subdivision',
60 | 'I-Organism_substance',
61 | 'I-Pathological_formation',
62 | 'I-Simple_chemical',
63 | 'I-Tissue',
64 | 'O'
65 | ```
66 |
67 | ## Result
68 |
69 | After fine-tuning with the BioBERT weights the results were quite good: the F1-score was 95 on BC5CDR and 92 on BioNLP13CG.
70 |
71 | Examples
72 |
73 | BC5CDR:
74 |
75 | ```
76 | Sentence = The authors describe the case of a 56 - year - old woman with chronic , severe heart failure secondary to dilated cardiomyopathy and absence of significant ventricular arrhythmias who developed QT prolongation and torsade de pointes ventricular tachycardia during one cycle of intermittent low dose ( 2 . 5 mcg / kg per min ) dobutamine .
77 |
78 | Result =
79 | {"tagging":[["The","O"],["authors","O"],["describe","O"],
80 | ["the","O"],["case","O"],["of","O"],["a","O"],["56","O"],["-",
81 | "O"],["year","O"],["-","O"],["old","O"],["woman","O"],["with",
82 | "O"],["chronic","O"],[",","O"],["severe","O"],["heart",
83 | "I-Disease"],["failure","I-Disease"],["secondary","O"],["to",
84 | "O"],["dilated","B-Disease"],["cardiomyopathy","I-Disease"],
85 | ["and","O"],["absence","O"],["of","O"],["significant","O"],
86 | ["ventricular","B-Disease"],["arrhythmias","I-Disease"],
87 | ["who","O"],["developed","O"],["QT","B-Disease"],
88 | ["prolongation","I-Disease"],["and","O"],["torsade",
89 | "B-Disease"],["de","I-Disease"],["pointes","I-Disease"],
90 | ["ventricular","I-Disease"],["tachycardia","I-Disease"],
91 | ["during","O"],["one","O"],["cycle","O"],["of","O"],
92 | ["intermittent","O"],["low","O"],["dose","O"],["(","O"],["2",
93 | "O"],[".","O"],["5","O"],["mcg","O"],["/","O"],["kg","O"],
94 | ["per","O"],["min","O"],[")","O"],["dobutamine","B-Chemical"],
95 | [".","O"]]}
96 |
97 | ```
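The API returns flat per-token BIO tags, as shown above. If you need entity spans instead of token-level tags, a small decoder is enough; the helper below is an illustrative sketch and not part of this repository.

```python
# Illustrative helper (not part of this repo): merge per-token BIO tags
# from the API response into (entity_text, entity_type) spans.
def bio_to_spans(tagging):
    spans, words, label = [], [], None
    for word, tag in tagging:
        if tag == 'O':
            if words:
                spans.append((' '.join(words), label))
            words, label = [], None
        elif tag.startswith('B-') or tag[2:] != label:
            # start a new span (also tolerates a stray I- tag with no preceding B-)
            if words:
                spans.append((' '.join(words), label))
            words, label = [word], tag[2:]
        else:
            # I- tag continuing the current span
            words.append(word)
    if words:
        spans.append((' '.join(words), label))
    return spans

# bio_to_spans([["dilated", "B-Disease"], ["cardiomyopathy", "I-Disease"], [".", "O"]])
# -> [('dilated cardiomyopathy', 'Disease')]
```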
98 |
99 | BioNLP13CG:
100 |
101 | ```
102 | Sentence = Cooccurrence of reduced expression of alpha - catenin and overexpression of p53 is a predictor of lymph node metastasis in early gastric cancer .
103 |
104 |
105 | Result =
106 | {"tags":[["Cooccurrence","O"],["of","O"],["reduced","O"],
107 | ["expression","O"],["of","O"],["alpha",
108 | "B-Gene_or_gene_product"],["-","I-Gene_or_gene_product"],
109 | ["catenin","I-Gene_or_gene_product"],["and","O"],
110 | ["overexpression","O"],["of","O"],["p53",
111 | "B-Gene_or_gene_product"],["is","O"],["a","O"],["predictor",
112 | "O"],["of","O"],["lymph","B-Multi-tissue_structure"],["node",
113 | "I-Multi-tissue_structure"],["metastasis","O"],["in","O"],
114 | ["early","O"],["gastric","B-Cancer"],["cancer","I-Cancer"],
115 | [".","O"]]}
116 |
117 | ```
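Both outputs above come from the `GET /extract-ner` endpoint defined in `api.py` (the demo instance listens on port 9000). Here is a minimal client sketch; the base URL is an assumption and should point at wherever the API is actually running.

```python
# Minimal client sketch for the GET /extract-ner endpoint in api.py.
# BASE_URL is an assumption; change it to wherever the API is deployed.
import requests

BASE_URL = 'http://localhost:9000'

# BC5CDR model (the default when no model parameter is sent)
r = requests.get(f'{BASE_URL}/extract-ner',
                 params={'text': 'dobutamine caused QT prolongation .'})
print(r.json())   # {'tagging': [['dobutamine', 'B-Chemical'], ...]}

# BioNLP13CG model (selected whenever a `bionlp3g` query parameter is present)
r = requests.get(f'{BASE_URL}/extract-ner',
                 params={'text': 'overexpression of p53 in early gastric cancer .',
                         'bionlp3g': 'BIO NLP 13CG'})
print(r.json())   # {'tags': [['overexpression', 'O'], ['of', 'O'], ['p53', 'B-Gene_or_gene_product'], ...]}
```

To start the server yourself, run `python api.py serve` (see the `__main__` block at the bottom of `api.py`).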
118 |
--------------------------------------------------------------------------------
/biobert_ner/XMLtoTSV.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/MeRajat/SolvingAlmostAnythingWithBert/1bfb6d679a668179bbb783d1c0eb9f338cd0f1c5/biobert_ner/XMLtoTSV.py
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/.gitignore:
--------------------------------------------------------------------------------
1 | # See https://help.github.com/articles/ignoring-files/ for more about ignoring files.
2 |
3 | # dependencies
4 | /node_modules
5 | /.pnp
6 | .pnp.js
7 |
8 | # testing
9 | /coverage
10 |
11 | # production
12 | /build
13 |
14 | # misc
15 | .DS_Store
16 | .env.local
17 | .env.development.local
18 | .env.test.local
19 | .env.production.local
20 |
21 | npm-debug.log*
22 | yarn-debug.log*
23 | yarn-error.log*
24 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/README.md:
--------------------------------------------------------------------------------
1 | This project was bootstrapped with [Create React App](https://github.com/facebook/create-react-app).
2 |
3 | ## Available Scripts
4 |
5 | In the project directory, you can run:
6 |
7 | ### `npm start`
8 |
9 | Runs the app in the development mode.
10 | Open [http://localhost:3000](http://localhost:3000) to view it in the browser.
11 |
12 | The page will reload if you make edits.
13 | You will also see any lint errors in the console.
14 |
15 | ### `npm test`
16 |
17 | Launches the test runner in the interactive watch mode.
18 | See the section about [running tests](https://facebook.github.io/create-react-app/docs/running-tests) for more information.
19 |
20 | ### `npm run build`
21 |
22 | Builds the app for production to the `build` folder.
23 | It correctly bundles React in production mode and optimizes the build for the best performance.
24 |
25 | The build is minified and the filenames include the hashes.
26 | Your app is ready to be deployed!
27 |
28 | See the section about [deployment](https://facebook.github.io/create-react-app/docs/deployment) for more information.
29 |
30 | ### `npm run eject`
31 |
32 | **Note: this is a one-way operation. Once you `eject`, you can’t go back!**
33 |
34 | If you aren’t satisfied with the build tool and configuration choices, you can `eject` at any time. This command will remove the single build dependency from your project.
35 |
36 | Instead, it will copy all the configuration files and the transitive dependencies (Webpack, Babel, ESLint, etc) right into your project so you have full control over them. All of the commands except `eject` will still work, but they will point to the copied scripts so you can tweak them. At this point you’re on your own.
37 |
38 | You don’t have to ever use `eject`. The curated feature set is suitable for small and middle deployments, and you shouldn’t feel obligated to use this feature. However we understand that this tool wouldn’t be useful if you couldn’t customize it when you are ready for it.
39 |
40 | ## Learn More
41 |
42 | You can learn more in the [Create React App documentation](https://facebook.github.io/create-react-app/docs/getting-started).
43 |
44 | To learn React, check out the [React documentation](https://reactjs.org/).
45 |
46 | ### Code Splitting
47 |
48 | This section has moved here: https://facebook.github.io/create-react-app/docs/code-splitting
49 |
50 | ### Analyzing the Bundle Size
51 |
52 | This section has moved here: https://facebook.github.io/create-react-app/docs/analyzing-the-bundle-size
53 |
54 | ### Making a Progressive Web App
55 |
56 | This section has moved here: https://facebook.github.io/create-react-app/docs/making-a-progressive-web-app
57 |
58 | ### Advanced Configuration
59 |
60 | This section has moved here: https://facebook.github.io/create-react-app/docs/advanced-configuration
61 |
62 | ### Deployment
63 |
64 | This section has moved here: https://facebook.github.io/create-react-app/docs/deployment
65 |
66 | ### `npm run build` fails to minify
67 |
68 | This section has moved here: https://facebook.github.io/create-react-app/docs/troubleshooting#npm-run-build-fails-to-minify
69 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/package.json:
--------------------------------------------------------------------------------
1 | {
2 | "name": "analysis-ai",
3 | "version": "0.1.0",
4 | "private": true,
5 | "dependencies": {
6 | "@material-ui/core": "^3.9.2",
7 | "dotenv": "^6.2.0",
8 | "gh-pages": "^2.0.1",
9 | "prop-types": "^15.7.2",
10 | "react": "^16.8.3",
11 | "react-dom": "^16.8.3",
12 | "react-redux": "^6.0.1",
13 | "react-scripts": "2.1.5",
14 | "redux": "^4.0.1",
15 | "redux-thunk": "^2.3.0",
16 | "styled-components": "^4.1.3"
17 | },
18 | "scripts": {
19 | "start": "react-scripts start",
20 | "build": "react-scripts build",
21 | "test": "react-scripts test",
22 | "eject": "react-scripts eject"
23 | },
24 | "eslintConfig": {
25 | "extends": "react-app"
26 | },
27 | "browserslist": [
28 | ">0.2%",
29 | "not dead",
30 | "not ie <= 11",
31 | "not op_mini all"
32 | ],
33 | "devDependencies": {
34 | "serve": "^10.1.2"
35 | }
36 | }
37 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/public/favicon.ico:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/MeRajat/SolvingAlmostAnythingWithBert/1bfb6d679a668179bbb783d1c0eb9f338cd0f1c5/biobert_ner/analysis-ai/public/favicon.ico
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/public/index.html:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
10 |
11 |
15 |
16 |
17 |
18 |
27 | BioNLP
28 |
29 |
30 | You need to enable JavaScript to run this app.
31 |
32 |
42 |
43 |
44 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/public/manifest.json:
--------------------------------------------------------------------------------
1 | {
2 | "short_name": "BioNLP App",
3 | "name": "Solving BioNLP Problems",
4 | "icons": [
5 | {
6 | "src": "favicon.ico",
7 | "sizes": "32x32 16x16",
8 | "type": "image/x-icon"
9 | }
10 | ],
11 | "start_url": ".",
12 | "display": "standalone",
13 | "theme_color": "#000000",
14 | "background_color": "#ffffff"
15 | }
16 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/actions/bcdr.js:
--------------------------------------------------------------------------------
1 | import constants from '../redux-constants/fetch';
2 | import params from '../utils/params';
3 |
4 | export const fetchBc5cdr = () => async (dispatch, getState) => {
5 | dispatch({
6 | type: constants.FETCH_BC5CDR_REQUEST
7 | })
8 |
9 | const payload = params(getState().bc5cdr.request);
10 |
11 | try {
12 | const response = await fetch(`http://13.72.66.146:9000/extract-ner?${payload}`)
13 | .then(response => response.json());
14 | dispatch({
15 | type: constants.FETCH_BC5CDR_SUCCESS,
16 | response
17 | })
18 | } catch (error) {
19 | dispatch({
20 | type: constants.FETCH_BC5CDR_FAILURE,
21 | errorMessage: 'Request Failed'
22 | })
23 | }
24 | }
25 |
26 | export const updateBc5cdr = text => {
27 | return {
28 | type: constants.UPDATE_BC5CDR,
29 | content: text
30 | }
31 | }
32 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/actions/bioNlp.js:
--------------------------------------------------------------------------------
1 | import constants from '../redux-constants/fetch';
2 | import params from '../utils/params';
3 |
4 | export const fetchBioNlp = () => async (dispatch, getState) => {
5 | dispatch({
6 | type: constants.FETCH_BIO_NLP_REQUEST
7 | })
8 |
9 | const payload = params(getState().bioNlp.request);
10 |
11 | try {
12 | const response = await fetch(`http://13.72.66.146:9000/extract-ner?${payload}`)
13 | .then(response => response.json());
14 | dispatch({
15 | type: constants.FETCH_BIO_NLP_SUCCESS,
16 | response
17 | })
18 | } catch (error) {
19 | dispatch({
20 | type: constants.FETCH_BIO_NLP_FAILURE,
21 | errorMessage: 'Request Failed'
22 | })
23 | }
24 | }
25 |
26 | export const updateBioNlp = text => {
27 | return {
28 | type: constants.UPDATE_BIO_NLP,
29 | content: text
30 | }
31 | }
32 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/actions/request-type.js:
--------------------------------------------------------------------------------
1 | import constants from '../redux-constants/fetch';
2 |
3 | export const updateRequestType = (type) => {
4 | return {
5 | type: constants.REQUEST_TYPE_CHANGE,
6 | requestType: type
7 | }
8 | }
9 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/button/index.js:
--------------------------------------------------------------------------------
1 | import styled from 'styled-components';
2 | import Button from '@material-ui/core/Button';
3 |
4 | export default styled(Button)`
5 | color: #f2f2f2 !important;
6 | `;
7 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/fork/index.js:
--------------------------------------------------------------------------------
1 | import React from 'react';
2 | import PropTypes from 'prop-types';
3 | import styled from 'styled-components';
4 |
5 | const GithubForkWrapper = styled.a`
6 | position: fixed;
7 | top: 0;
8 | right: 0;
9 | z-index: 1000;
10 | margin: 10px;
11 | text-decoration: none;
12 | font-size: 30px;
13 | color: rgba(0, 0, 0, 0.5);
14 | cursor: pointer;
15 |
16 | &:focus {
17 | outline: none;
18 | }
19 |
20 | &:hover, &:focus {
21 | color: rgba(0, 0, 0, 0.8);
22 | }
23 | `;
24 |
25 | export default class GithubFork extends React.PureComponent {
26 | static propTypes = {
27 | href: PropTypes.string.isRequired
28 | };
29 |
30 | render () {
31 | return
32 |
33 |
34 | }
35 | }
36 |
37 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/header/index.js:
--------------------------------------------------------------------------------
1 | import styled from 'styled-components';
2 | import Typography from '@material-ui/core/Typography';
3 |
4 | export const Header = styled(Typography)`
5 | color: ${props => props.color ? props.color: '#4f4f4f'} !important;
6 | `;
7 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/highlighter/index.js:
--------------------------------------------------------------------------------
1 | import styled from 'styled-components';
2 |
3 | export const Highlighter = styled.span`
4 | color: ${props => props.color};
5 | border-radius: 5px;
6 | background-color: ${props => props.bgColor ? props.bgColor : 'transparent'}
7 | `;
8 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/input-with-examples/example-select.js:
--------------------------------------------------------------------------------
1 | import React from 'react';
2 | import PropTypes from 'prop-types';
3 | import FormControl from '@material-ui/core/FormControl';
4 | import MenuItem from '@material-ui/core/MenuItem';
5 | import InputLabel from '@material-ui/core/InputLabel';
6 | import Select from '@material-ui/core/Select';
7 | import OutlinedInput from '@material-ui/core/OutlinedInput';
8 | import styled from 'styled-components';
9 | import RootRef from '@material-ui/core/RootRef';
10 |
11 | import TooltipMenu from './tooltip-composed-menu';
12 |
13 | const examples = [
14 | {
15 | key: 'bnlp#1',
16 | type: 'bioNlp',
17 | text: 'Cooccurrence of reduced expression of alpha - catenin and overexpression of p53 is a predictor of lymph node metastasis in early gastric cancer.',
18 | },
19 | {
20 | key: 'bnlp#2',
21 | type: 'bioNlp',
22 | text: 'In this review , the role of TSH - R gene alterations in benign and malignant thyroid neoplasia is examined.',
23 | },
24 | {
25 | key: 'bc5cdr#1',
26 | type: 'bc5cdr',
27 | text: "The authors describe the case of a 56 - year - old woman with chronic , severe heart failure secondary to dilated cardiomyopathy and absence of significant ventricular arrhythmias who developed QT prolongation and torsade de pointes ventricular tachycardia during one cycle of intermittent low dose ( 2.5 mcg/kg per min ) dobutamine."
28 | }
29 | ];
30 |
31 | const Subheader = styled.li`
32 | font-family: "Roboto", "Helvetica", "Arial", sans-serif;
33 | line-height: 1.5em;
34 | padding: 11px 16px;
35 | color: #827717;
36 | font-weight: 700;
37 | border-bottom: 1px solid #e2e2e2;
38 | pointer-events: none;
39 | `;
40 |
41 |
42 |
43 | export default class ExampleSelect extends React.Component {
44 | static propTypes = {
45 | update: PropTypes.func.isRequired
46 | };
47 |
48 | state = {
49 | content: '',
50 | selectedExampleKey: ''
51 | }
52 |
53 | constructor(props) {
54 | super(props);
55 | this.labelRef = React.createRef();
56 | }
57 |
58 |
59 | handleChange = event => {
60 | const selectedExample = typeof event.target.value === 'string' ? {} : event.target.value;
61 | this.setState({
62 | content: selectedExample.text || '',
63 | selectedExampleKey: selectedExample.key || ''
64 | })
65 | this.props.update(selectedExample.text || '');
66 | };
67 |
68 | render () {
69 | return (
70 |
71 |
72 |
76 | Example Texts
77 |
78 |
79 | {
82 | return {this.state.content}
83 | }}
84 | onChange={this.handleChange}
85 | input={
86 |
92 | }
93 | MenuProps={{
94 | PaperProps: {
95 | style: {
96 | width: 400,
97 | },
98 | },
99 | }}
100 | >
101 | --
102 | Bio NLP
103 | {
104 | examples.slice(0, 2).map(example => {
105 | return
106 | })
107 | }
108 | BC5 CDR
109 | {
110 | examples.slice(2).map(example => {
111 | return
112 | })
113 | }
114 |
115 |
116 | );
117 | }
118 | }
119 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/input-with-examples/example-text.js:
--------------------------------------------------------------------------------
1 | import React from 'react';
2 | import PropTypes from 'prop-types';
3 | import TextField from '@material-ui/core/TextField';
4 |
5 | export default class ExampleText extends React.Component {
6 | static propTypes = {
7 | content: PropTypes.string,
8 | update: PropTypes.func.isRequired
9 | }
10 |
11 | handleChange = event => {
12 | this.props.update(event.target.value);
13 | }
14 |
15 | render () {
16 | return (
17 |
27 | )
28 | }
29 | }
30 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/input-with-examples/index.js:
--------------------------------------------------------------------------------
1 | import React from 'react';
2 | import PropTypes from 'prop-types';
3 | import ExampleSelect from '../../containers/input-with-examples/example-select';
4 | import ExampleText from '../../containers/input-with-examples/example-text';
5 | import Submit from '../../containers/input-with-examples/submit';
6 | import Bc5drSelect from '../../containers/input-with-examples/bc5dr-select';
7 | import Bc5drText from '../../containers/input-with-examples/bc5dr-text';
8 | import Bc5drSubmit from '../../containers/input-with-examples/bc5dr-submit';
9 | import Grid from '@material-ui/core/Grid';
10 |
11 | import types from '../../enums/request-types';
12 |
13 | export default class InputWithExamples extends React.PureComponent {
14 | static propTypes = {
15 | type: PropTypes.string.isRequired
16 | }
17 |
18 | render () {
19 | return (
20 |
21 |
22 | { this.props.type === types.BIO_NLP
23 | ?
24 | :
25 | }
26 |
27 |
28 | { this.props.type === types.BIO_NLP
29 | ?
30 | :
31 | }
32 |
33 |
34 | { this.props.type === types.BIO_NLP
35 | ?
36 | :
37 | }
38 |
39 |
40 | )
41 | }
42 | }
43 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/input-with-examples/submit.js:
--------------------------------------------------------------------------------
1 | import React from 'react';
2 | import PropTypes from 'prop-types';
3 | import Button from '../button';
4 |
5 | const Submit = (props) => {
6 | return (
7 |
8 | Submit
9 |
10 | )
11 | }
12 |
13 | Submit.propTypes = {
14 | fetchData: PropTypes.func.isRequired
15 | }
16 |
17 | export default Submit;
18 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/input-with-examples/tooltip-composed-menu.js:
--------------------------------------------------------------------------------
1 | import React from 'react';
2 | import Tooltip from '@material-ui/core/Tooltip';
3 | import MenuItem from '@material-ui/core/MenuItem';
4 |
5 | import styled from 'styled-components';
6 |
7 | const EllipsisText = styled.span`
8 | width: 100%;
9 | overflow: hidden;
10 | white-space: nowrap;
11 | text-overflow: ellipsis;
12 | `;
13 |
14 | export default props => {
15 | const {key, selectedKey, ...rest} = props;
16 | const value = rest['data-value'];
17 | return
18 |
19 | {value.text}
20 |
21 |
22 | // return 'Hello World'
23 | }
24 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/request-type-radio/index.js:
--------------------------------------------------------------------------------
1 | import React from 'react';
2 | import PropTypes from 'prop-types';
3 | import FormControlLabel from '@material-ui/core/FormControlLabel';
4 | import Radio from '@material-ui/core/Radio';
5 | import RadioGroup from '@material-ui/core/RadioGroup';
6 |
7 | import types from '../../enums/request-types';
8 |
9 | export default class RequestRadio extends React.PureComponent {
10 | static propTypes = {
11 | type: PropTypes.string.isRequired,
12 | updateRequestType: PropTypes.func.isRequired
13 | }
14 | handleChange = event => {
15 | this.props.updateRequestType(event.target.value)
16 | }
17 | render () {
18 | return (
19 |
26 | }
29 | label="BIO NLP 13CG"
30 | labelPlacement="end"
31 | />
32 | }
35 | label="BC5CDR"
36 | labelPlacement="end"
37 | />
38 |
39 | )
40 | }
41 | }
42 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/response-text-area/container.js:
--------------------------------------------------------------------------------
1 | import React from 'react';
2 | import BioNlpTextArea from '../../containers/response-text-area/bioNlp-text-area'
3 | import Bc5drTextArea from '../../containers/response-text-area/bc5dr-text-area'
4 |
5 | import types from '../../enums/request-types';
6 |
7 | export default props => {
8 | if (props.type === types.BIO_NLP) {
9 | return
10 | }
11 | return
12 | }
13 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/response-text-area/index.js:
--------------------------------------------------------------------------------
1 | import React, {Fragment} from 'react';
2 | import styled from 'styled-components';
3 | import Tooltip from '@material-ui/core/Tooltip';
4 | import CircularProgress from '@material-ui/core/CircularProgress';
5 | import StyledText from './styled-text';
6 |
7 | import mapCodeToColors from '../../utils/mapCodeToColors';
8 |
9 | const Wrapper = styled.div`
10 | background: #fafafa;
11 | padding: 24px;
12 | border-radius: 5px;
13 | border: 1px solid #efefef;
14 | line-height: 1.4;
15 | `;
16 |
17 | const renderWrapper = (tags) => {
18 | return
19 | {
20 | tags.map((tag, i) => {
21 | let space = ' ';
22 | if (i === 0) {
23 | space = '';
24 | }
25 | if (tag[1] === 'O') {
26 | return {space + tag[0]}
27 | }
28 | return
29 | {space}
30 |
31 |
35 |
36 | {tag[0]}
37 |
38 |
39 | {tag[1]}
40 |
41 |
42 |
43 |
44 | })
45 | }
46 |
47 | }
48 |
49 | export default props => {
50 | if (props.tags && !props.loading) {
51 | return renderWrapper(props.tags);
52 | }
53 | if (props.loading) {
54 | return
55 |
56 |
57 | }
58 | return null;
59 | }
60 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/components/response-text-area/styled-text.js:
--------------------------------------------------------------------------------
1 | import styled from 'styled-components';
2 | import { withTheme } from '@material-ui/core/styles';
3 |
4 | const StyledText = styled.span`
5 | display: inline-flex;
6 | flex-direction: column;
7 | position: relative;
8 | padding: 5px;
9 | margin-bottom: 5px;
10 | border-radius: 5px;
11 | box-sizing: border-box;
12 | background: ${props => props.bgColor ? props.bgColor : props.theme.palette.primary.main};
13 |
14 | > span.text {
15 | padding: 4px 10px;
16 | background: #fff;
17 | color: #424242;
18 | box-sizing: border-box;
19 | border-radius: 5px;
20 | line-height: 1.4;
21 | text-align: center;
22 | }
23 |
24 | > span.type {
25 | font-size: 12px;
26 | background: ${props => props.bgColor ? props.bgColor : props.theme.palette.primary.main};
27 | color: ${props => props.color ? props.color : props.theme.palette.primary.main};
28 | line-height: 1;
29 | padding-top: 5px;
30 | border-radius: 0 0 5px 5px;
31 | text-align: center;
32 | }
33 | `;
34 |
35 | export default withTheme()(StyledText)
36 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/input-with-examples/bc5dr-select.js:
--------------------------------------------------------------------------------
1 | import { bindActionCreators } from 'redux';
2 | import { connect } from 'react-redux';
3 | import { updateBc5cdr } from '../../actions/bcdr';
4 | import ExampleSelect from '../../components/input-with-examples/example-select';
5 |
6 | const mapStateToProps = state => {
7 | return {
8 | content: state.bc5cdr.request.text
9 | }
10 | }
11 |
12 | const mapDispatchToProps = dispatch => {
13 | return bindActionCreators({
14 | update: updateBc5cdr
15 | }, dispatch);
16 | }
17 |
18 | export default connect(mapStateToProps, mapDispatchToProps)(ExampleSelect);
19 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/input-with-examples/bc5dr-submit.js:
--------------------------------------------------------------------------------
1 | import { bindActionCreators } from 'redux';
2 | import { connect } from 'react-redux';
3 | import { fetchBc5cdr } from '../../actions/bcdr';
4 | import Submit from '../../components/input-with-examples/submit';
5 |
6 | const mapDispatchToProps = dispatch => {
7 | return bindActionCreators({
8 | fetchData: fetchBc5cdr
9 | }, dispatch);
10 | }
11 |
12 | export default connect(null, mapDispatchToProps)(Submit);
13 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/input-with-examples/bc5dr-text.js:
--------------------------------------------------------------------------------
1 | import { bindActionCreators } from 'redux';
2 | import { connect } from 'react-redux';
3 | import { updateBc5cdr } from '../../actions/bcdr';
4 | import ExampleText from '../../components/input-with-examples/example-text';
5 |
6 | const mapStateToProps = state => {
7 | return {
8 | content: state.bc5cdr.request.text
9 | }
10 | }
11 |
12 | const mapDispatchToProps = dispatch => {
13 | return bindActionCreators({
14 | update: updateBc5cdr
15 | }, dispatch);
16 | }
17 |
18 | export default connect(mapStateToProps, mapDispatchToProps)(ExampleText);
19 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/input-with-examples/example-select.js:
--------------------------------------------------------------------------------
1 | import { bindActionCreators } from 'redux';
2 | import { connect } from 'react-redux';
3 | import { updateBioNlp } from '../../actions/bioNlp';
4 | import ExampleSelect from '../../components/input-with-examples/example-select';
5 |
6 | const mapStateToProps = state => {
7 | return {
8 | content: state.bioNlp.request.text
9 | }
10 | }
11 |
12 | const mapDispatchToProps = dispatch => {
13 | return bindActionCreators({
14 | update: updateBioNlp
15 | }, dispatch);
16 | }
17 |
18 | export default connect(mapStateToProps, mapDispatchToProps)(ExampleSelect);
19 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/input-with-examples/example-text.js:
--------------------------------------------------------------------------------
1 | import { bindActionCreators } from 'redux';
2 | import { connect } from 'react-redux';
3 | import { updateBioNlp } from '../../actions/bioNlp';
4 | import ExampleText from '../../components/input-with-examples/example-text';
5 |
6 | const mapStateToProps = state => {
7 | return {
8 | content: state.bioNlp.request.text
9 | }
10 | }
11 |
12 | const mapDispatchToProps = dispatch => {
13 | return bindActionCreators({
14 | update: updateBioNlp
15 | }, dispatch);
16 | }
17 |
18 | export default connect(mapStateToProps, mapDispatchToProps)(ExampleText);
19 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/input-with-examples/index.js:
--------------------------------------------------------------------------------
1 | import { connect } from 'react-redux';
2 | import ExampleSelect from '../../components/input-with-examples';
3 |
4 | const mapStateToProps = state => {
5 | return {
6 | type: state.requestType.type
7 | }
8 | }
9 |
10 | export default connect(mapStateToProps, null)(ExampleSelect);
11 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/input-with-examples/submit.js:
--------------------------------------------------------------------------------
1 | import { bindActionCreators } from 'redux';
2 | import { connect } from 'react-redux';
3 | import { fetchBioNlp } from '../../actions/bioNlp';
4 | import Submit from '../../components/input-with-examples/submit';
5 |
6 | const mapDispatchToProps = dispatch => {
7 | return bindActionCreators({
8 | fetchData: fetchBioNlp
9 | }, dispatch);
10 | }
11 |
12 | export default connect(null, mapDispatchToProps)(Submit);
13 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/request-type-radio.js:
--------------------------------------------------------------------------------
1 | import { bindActionCreators } from 'redux';
2 | import { connect } from 'react-redux';
3 | import { updateRequestType } from '../actions/request-type';
4 | import RequestRadio from '../components/request-type-radio';
5 |
6 | const mapStateToProps = state => {
7 | return {
8 | type: state.requestType.type
9 | }
10 | }
11 |
12 | const mapDispatchToProps = dispatch => {
13 | return bindActionCreators({
14 | updateRequestType
15 | }, dispatch);
16 | }
17 |
18 | export default connect(mapStateToProps, mapDispatchToProps)(RequestRadio);
19 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/response-text-area/bc5dr-text-area.js:
--------------------------------------------------------------------------------
1 | import { connect } from 'react-redux';
2 | import ResponseTextArea from '../../components/response-text-area';
3 |
4 | const mapStateToProps = state => {
5 | return {
6 | tags: state.bc5cdr.response.tagging,
7 | loading: state.bc5cdr.loading
8 | }
9 | }
10 |
11 | export default connect(mapStateToProps, null)(ResponseTextArea);
12 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/response-text-area/bioNlp-text-area.js:
--------------------------------------------------------------------------------
1 | import { connect } from 'react-redux';
2 | import ResponseTextArea from '../../components/response-text-area';
3 |
4 | const mapStateToProps = state => {
5 | return {
6 | tags: state.bioNlp.response.tags,
7 | loading: state.bioNlp.loading
8 | }
9 | }
10 |
11 | export default connect(mapStateToProps, null)(ResponseTextArea);
12 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/containers/response-text-area/container.js:
--------------------------------------------------------------------------------
1 | import { connect } from 'react-redux';
2 | import ResponseTextAreaContainer from '../../components/response-text-area/container';
3 |
4 | const mapStateToProps = state => {
5 | return {
6 | type: state.requestType.type
7 | }
8 | }
9 |
10 | export default connect(mapStateToProps, null)(ResponseTextAreaContainer);
11 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/enums/request-types.js:
--------------------------------------------------------------------------------
1 | export default Object.freeze({
2 | BIO_NLP: 'bio',
3 | BC5CDR: 'bc5'
4 | })
5 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/index.css:
--------------------------------------------------------------------------------
1 | body {
2 | margin: 0;
3 | padding: 0;
4 | font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", "Roboto", "Oxygen",
5 | "Ubuntu", "Cantarell", "Fira Sans", "Droid Sans", "Helvetica Neue",
6 | sans-serif;
7 | -webkit-font-smoothing: antialiased;
8 | -moz-osx-font-smoothing: grayscale;
9 | }
10 |
11 | code {
12 | font-family: source-code-pro, Menlo, Monaco, Consolas, "Courier New",
13 | monospace;
14 | }
15 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/index.js:
--------------------------------------------------------------------------------
1 | import React from 'react';
2 | import ReactDOM from 'react-dom';
3 | import './index.css';
4 | import App from './pages/home';
5 | import { createStore, applyMiddleware, compose } from 'redux';
6 | import { Provider } from 'react-redux';
7 | import thunk from 'redux-thunk';
8 | import rootReducers from './reducers';
9 | import * as serviceWorker from './serviceWorker';
10 |
11 | require('dotenv').config();
12 | const composeEnhancers = window.__REDUX_DEVTOOLS_EXTENSION_COMPOSE__ || compose
13 | const store = createStore(rootReducers, composeEnhancers(applyMiddleware(thunk)));
14 |
15 | ReactDOM.render(
16 |
17 |
18 |
19 | , document.getElementById('root'));
20 |
21 | // If you want your app to work offline and load faster, you can change
22 | // unregister() to register() below. Note this comes with some pitfalls.
23 | // Learn more about service workers: http://bit.ly/CRA-PWA
24 | serviceWorker.unregister();
25 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/logo.svg:
--------------------------------------------------------------------------------
1 |
2 |
3 |
4 |
5 |
6 |
7 |
8 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/pages/home/index.css:
--------------------------------------------------------------------------------
1 | .main-wrapper {
2 | color: #4f4f4f;
3 | }
4 |
5 | .text-center {
6 | text-align: center;
7 | }
8 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/pages/home/index.js:
--------------------------------------------------------------------------------
1 | import React, { Component } from 'react';
2 | import GithubFork from '../../components/fork';
3 | import { Highlighter } from '../../components/highlighter';
4 | import { Header } from '../../components/header';
5 | import InputWithExamples from '../../containers/input-with-examples';
6 | import ResponseTextAreaContainer from '../../containers/response-text-area/container';
7 | import RequestRadio from '../../containers/request-type-radio'
8 | import { MuiThemeProvider, createMuiTheme } from '@material-ui/core/styles';
9 | import Grid from '@material-ui/core/Grid';
10 | import Card from '@material-ui/core/Card';
11 | import CardContent from '@material-ui/core/CardContent';
12 | import Typography from '@material-ui/core/Typography';
13 |
14 | import './index.css';
15 |
16 | const theme = createMuiTheme({
17 | palette: {
18 | primary: { main: '#827717' }, // Olive (lime 900) as the primary color.
19 | secondary: { main: '#11cb5f' }, // This is just green.A700 as hex.
20 | text: {
21 | secondary: '#424242'
22 | }
23 | },
24 | typography: {
25 | useNextVariants: true,
26 | },
27 | props: {
28 | MuiButtonBase: { // Name of the component ⚛️ / style sheet
29 | text: { // Name of the rule
30 | color: '#f2f2f2', // Some CSS
31 | },
32 | },
33 | }
34 | });
35 |
36 |
37 | class App extends Component {
38 | render() {
39 | return (
40 |
41 |
42 |
43 |
44 |
45 |
46 | Solving BioNLP problems
47 |
48 |
49 | This App solves BioNLP problems using Bert(BioBert Pytorch)
50 |
51 |
52 |
53 |
54 |
55 |
58 |
59 | This app demonstrates how BERT (BioBERT) can be fine-tuned to achieve state-of-the-art results. Here it has been trained to discover entities in medical text. On BioNLP13CG it finds entities like 'Anatomical_system', 'Cancer', 'Cell', 'Cellular_component', 'Developing_anatomical_structure', 'Gene_or_gene_product', 'Immaterial_anatomical_entity', 'Multi-tissue_structure', 'Organ', 'Organism', 'Organism_subdivision', 'Organism_substance', 'Pathological_formation', 'Simple_chemical' and 'Tissue', and on BC5CDR it finds Diseases and Chemicals.
60 |
61 |
62 |
63 |
64 |
65 |
66 |
67 |
68 |
69 |
70 |
71 |
72 |
73 |
74 |
75 |
76 |
77 | );
78 | }
79 | }
80 |
81 | export default App;
82 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/pages/home/index.test.js:
--------------------------------------------------------------------------------
1 | import React from 'react';
2 | import ReactDOM from 'react-dom';
3 | import App from '.';
4 |
5 | it('renders without crashing', () => {
6 | const div = document.createElement('div');
7 | ReactDOM.render( , div);
8 | ReactDOM.unmountComponentAtNode(div);
9 | });
10 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/reducers/bcdr.js:
--------------------------------------------------------------------------------
1 | import constants from '../redux-constants/fetch';
2 |
3 | const initialState = {
4 | response: {},
5 | request: {
6 | bc5cdr: 'BC5CDR',
7 | text: ''
8 | },
9 | loading: false,
10 | error: null
11 | }
12 |
13 | export default (state = initialState, action) => {
14 | const immutatedState = { ...state };
15 | switch (action.type) {
16 | case constants.FETCH_BC5CDR_REQUEST:
17 | immutatedState.loading = true;
18 | return immutatedState;
19 | case constants.FETCH_BC5CDR_SUCCESS:
20 | immutatedState.response = Object.assign({}, action.response);
21 | immutatedState.loading = false;
22 | immutatedState.error = null;
23 | return immutatedState;
24 | case constants.FETCH_BC5CDR_FAILURE:
25 | immutatedState.loading = false;
26 | immutatedState.error = { ...action.error };
27 | return immutatedState;
28 | case constants.UPDATE_BC5CDR:
29 | immutatedState.request = {...state.request};
30 | immutatedState.request.text = action.content;
31 | return immutatedState;
32 | default:
33 | return state;
34 | }
35 | }
36 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/reducers/bioNlp.js:
--------------------------------------------------------------------------------
1 | import constants from '../redux-constants/fetch';
2 |
3 | const initialState = {
4 | response: {},
5 | request: {
6 | bionlp3g: 'BIO NLP 13CG',
7 | text: ''
8 | },
9 | loading: false,
10 | error: null
11 | }
12 |
13 | export default (state = initialState, action) => {
14 | const immutatedState = { ...state };
15 | switch (action.type) {
16 | case constants.FETCH_BIO_NLP_REQUEST:
17 | immutatedState.loading = true;
18 | return immutatedState;
19 | case constants.FETCH_BIO_NLP_SUCCESS:
20 | immutatedState.response = Object.assign({}, action.response);
21 | immutatedState.loading = false;
22 | immutatedState.error = null;
23 | return immutatedState;
24 | case constants.FETCH_BIO_NLP_FAILURE:
25 | immutatedState.loading = false;
26 | immutatedState.error = { ...action.error };
27 | return immutatedState;
28 | case constants.UPDATE_BIO_NLP:
29 | immutatedState.request = {...state.request};
30 | immutatedState.request.text = action.content;
31 | return immutatedState;
32 | default:
33 | return state;
34 | }
35 | }
36 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/reducers/index.js:
--------------------------------------------------------------------------------
1 | import { combineReducers } from 'redux';
2 | import bioNlp from './bioNlp';
3 | import bc5cdr from './bcdr';
4 | import requestType from './request-type';
5 |
6 | export default combineReducers({
7 | bioNlp,
8 | bc5cdr,
9 | requestType
10 | });
11 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/reducers/request-type.js:
--------------------------------------------------------------------------------
1 | import constants from '../redux-constants/fetch';
2 | import types from '../enums/request-types';
3 |
4 | const initialState = {
5 | // type: types.BIO_NLP // Default State
6 | type: types.BC5CDR
7 | }
8 |
9 | export default (state = initialState, action) => {
10 | const immutatedState = { ...state };
11 | switch (action.type) {
12 | case constants.REQUEST_TYPE_CHANGE:
13 | immutatedState.type = action.requestType;
14 | return immutatedState;
15 | default:
16 | return state;
17 | }
18 | }
19 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/redux-constants/fetch.js:
--------------------------------------------------------------------------------
1 | export default {
2 | FETCH_BIO_NLP_REQUEST: 'FETCH_BIO_NLP_REQUEST',
3 | FETCH_BIO_NLP_SUCCESS: 'FETCH_BIO_NLP_SUCCESS',
4 | FETCH_BIO_NLP_FAILURE: 'FETCH_BIO_NLP_FAILURE',
5 | UPDATE_BIO_NLP: 'UPDATE_BIO_NLP',
6 |
7 | FETCH_BC5CDR_REQUEST: 'FETCH_BC5CDR_REQUEST',
8 | FETCH_BC5CDR_SUCCESS: 'FETCH_BC5CDR_SUCCESS',
9 | FETCH_BC5CDR_FAILURE: 'FETCH_BC5CDR_FAILURE',
10 | UPDATE_BC5CDR: 'UPDATE_BC5CDR',
11 |
12 | REQUEST_TYPE_CHANGE: 'REQUEST_TYPE_CHANGE',
13 | }
14 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/serviceWorker.js:
--------------------------------------------------------------------------------
1 | // This optional code is used to register a service worker.
2 | // register() is not called by default.
3 |
4 | // This lets the app load faster on subsequent visits in production, and gives
5 | // it offline capabilities. However, it also means that developers (and users)
6 | // will only see deployed updates on subsequent visits to a page, after all the
7 | // existing tabs open on the page have been closed, since previously cached
8 | // resources are updated in the background.
9 |
10 | // To learn more about the benefits of this model and instructions on how to
11 | // opt-in, read http://bit.ly/CRA-PWA
12 |
13 | const isLocalhost = Boolean(
14 | window.location.hostname === 'localhost' ||
15 | // [::1] is the IPv6 localhost address.
16 | window.location.hostname === '[::1]' ||
17 | // 127.0.0.1/8 is considered localhost for IPv4.
18 | window.location.hostname.match(
19 | /^127(?:\.(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)){3}$/
20 | )
21 | );
22 |
23 | export function register(config) {
24 | if (process.env.NODE_ENV === 'production' && 'serviceWorker' in navigator) {
25 | // The URL constructor is available in all browsers that support SW.
26 | const publicUrl = new URL(process.env.PUBLIC_URL, window.location.href);
27 | if (publicUrl.origin !== window.location.origin) {
28 | // Our service worker won't work if PUBLIC_URL is on a different origin
29 | // from what our page is served on. This might happen if a CDN is used to
30 | // serve assets; see https://github.com/facebook/create-react-app/issues/2374
31 | return;
32 | }
33 |
34 | window.addEventListener('load', () => {
35 | const swUrl = `${process.env.PUBLIC_URL}/service-worker.js`;
36 |
37 | if (isLocalhost) {
38 | // This is running on localhost. Let's check if a service worker still exists or not.
39 | checkValidServiceWorker(swUrl, config);
40 |
41 | // Add some additional logging to localhost, pointing developers to the
42 | // service worker/PWA documentation.
43 | navigator.serviceWorker.ready.then(() => {
44 | console.log(
45 | 'This web app is being served cache-first by a service ' +
46 | 'worker. To learn more, visit http://bit.ly/CRA-PWA'
47 | );
48 | });
49 | } else {
50 | // Is not localhost. Just register service worker
51 | registerValidSW(swUrl, config);
52 | }
53 | });
54 | }
55 | }
56 |
57 | function registerValidSW(swUrl, config) {
58 | navigator.serviceWorker
59 | .register(swUrl)
60 | .then(registration => {
61 | registration.onupdatefound = () => {
62 | const installingWorker = registration.installing;
63 | if (installingWorker == null) {
64 | return;
65 | }
66 | installingWorker.onstatechange = () => {
67 | if (installingWorker.state === 'installed') {
68 | if (navigator.serviceWorker.controller) {
69 | // At this point, the updated precached content has been fetched,
70 | // but the previous service worker will still serve the older
71 | // content until all client tabs are closed.
72 | console.log(
73 | 'New content is available and will be used when all ' +
74 | 'tabs for this page are closed. See http://bit.ly/CRA-PWA.'
75 | );
76 |
77 | // Execute callback
78 | if (config && config.onUpdate) {
79 | config.onUpdate(registration);
80 | }
81 | } else {
82 | // At this point, everything has been precached.
83 | // It's the perfect time to display a
84 | // "Content is cached for offline use." message.
85 | console.log('Content is cached for offline use.');
86 |
87 | // Execute callback
88 | if (config && config.onSuccess) {
89 | config.onSuccess(registration);
90 | }
91 | }
92 | }
93 | };
94 | };
95 | })
96 | .catch(error => {
97 | console.error('Error during service worker registration:', error);
98 | });
99 | }
100 |
101 | function checkValidServiceWorker(swUrl, config) {
102 | // Check if the service worker can be found. If it can't reload the page.
103 | fetch(swUrl)
104 | .then(response => {
105 | // Ensure service worker exists, and that we really are getting a JS file.
106 | const contentType = response.headers.get('content-type');
107 | if (
108 | response.status === 404 ||
109 | (contentType != null && contentType.indexOf('javascript') === -1)
110 | ) {
111 | // No service worker found. Probably a different app. Reload the page.
112 | navigator.serviceWorker.ready.then(registration => {
113 | registration.unregister().then(() => {
114 | window.location.reload();
115 | });
116 | });
117 | } else {
118 | // Service worker found. Proceed as normal.
119 | registerValidSW(swUrl, config);
120 | }
121 | })
122 | .catch(() => {
123 | console.log(
124 | 'No internet connection found. App is running in offline mode.'
125 | );
126 | });
127 | }
128 |
129 | export function unregister() {
130 | if ('serviceWorker' in navigator) {
131 | navigator.serviceWorker.ready.then(registration => {
132 | registration.unregister();
133 | });
134 | }
135 | }
136 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/utils/mapCodeToColors.js:
--------------------------------------------------------------------------------
1 | export default {
2 | 'B-Chemical': {
3 | bg: '#ab47bc',
4 | fg: '#f2f2f2',
5 | },
6 | 'I-Chemical': {
7 | bg: '#8e24aa',
8 | fg: '#f2f2f2',
9 | },
10 | 'B-Disease': {
11 | bg: '#37474f',
12 | fg: '#f2f2f2',
13 | },
14 | 'I-Disease': {
15 | bg: '#424242',
16 | fg: '#f2f2f2',
17 | },
18 | 'B-Anatomical_system': {
19 | bg: '#ffd180',
20 | fg: '#333'
21 | },
22 | 'B-Cancer': {
23 | bg: '#212121',
24 | fg: '#f2f2f2'
25 | },
26 | 'B-Cell': {
27 | bg: '#43a047',
28 | fg: '#fff'
29 | },
30 | 'B-Cellular_component': {
31 | bg: '#388e3c',
32 | fg: '#f2f2f2'
33 | },
34 | 'B-Developing_anatomical_structure': {
35 | bg: '#ffb300',
36 | fg: '#333'
37 | },
38 | 'B-Gene_or_gene_product': {
39 | bg: '#26a69a',
40 | fg: '#fff'
41 | },
42 | 'B-Immaterial_anatomical_entity': {
43 | bg: '#78909c',
44 | fg: '#fff'
45 | },
46 | 'B-Multi-tissue_structure': {
47 | bg: '#827717',
48 | fg: '#fff'
49 | },
50 | 'B-Organ': {
51 | bg: '#d32f2f',
52 | fg: '#f2f2f2'
53 | },
54 | 'B-Organism': {
55 | bg: '#689f38',
56 | fg: '#fff'
57 | },
58 | 'B-Organism_subdivision': {
59 | bg: '#33691e',
60 | fg: '#f2f2f2'
61 | },
62 | 'B-Organism_substance': {
63 | bg: '#795548',
64 | fg: '#f2f2f2'
65 | },
66 | 'B-Pathological_formation': {
67 | bg: '#0288d1',
68 | fg: '#fff'
69 | } ,
70 | 'B-Simple_chemical': {
71 | bg: '#d81b60',
72 | fg: '#f2f2f2'
73 | },
74 | 'B-Tissue': {
75 | bg: '#673ab7',
76 | fg: '#fff'
77 | },
78 | 'I-Amino_acid': {
79 | bg: '#558b2f',
80 | fg: '#fff'
81 | },
82 | 'I-Anatomical_system': {
83 | bg: '#e64a19',
84 | fg: '#fff'
85 | },
86 | 'I-Cancer': {
87 | bg: '#455a64',
88 | fg: '#f2f2f2'
89 | },
90 | 'I-Cell': {
91 | bg: '#fbc02d',
92 | fg: '#333'
93 | },
94 | 'I-Cellular_component': {
95 | bg: '#0097a7',
96 | fg: '#fff'
97 | },
98 | 'I-Developing_anatomical_structure': {
99 | bg: '#303f9f',
100 | fg: '#f2f2f2'
101 | },
102 | 'I-Gene_or_gene_product': {
103 | bg: '#512da8',
104 | fg: '#f2f2f2'
105 | },
106 | 'I-Immaterial_anatomical_entity': {
107 | bg: '#6d4c41',
108 | fg: '#f2f2f2'
109 | },
110 | 'I-Multi-tissue_structure': {
111 | bg: '#1976d2',
112 | fg: '#f2f2f2'
113 | },
114 | 'I-Organ': {
115 | bg: '#ef5350',
116 | fg: '#fff'
117 | },
118 | 'I-Organism': {
119 | bg: '#009688',
120 | fg: '#fff'
121 | },
122 | 'I-Organism_subdivision': {
123 | bg: '#1de9b6',
124 | fg: '#333'
125 | },
126 | 'I-Organism_substance': {
127 | bg: '#d84315',
128 | fg: '#fff'
129 | },
130 | 'I-Pathological_formation': {
131 | bg: '#2962ff',
132 | fg: '#fff'
133 | },
134 | 'I-Simple_chemical': {
135 | bg: '#4527a0',
136 | fg: '#f2f2f2'
137 | },
138 | 'I-Tissue': {
139 | bg: '#fdd835',
140 | fg: '#333'
141 | },
142 | }
143 |
--------------------------------------------------------------------------------
/biobert_ner/analysis-ai/src/utils/params.js:
--------------------------------------------------------------------------------
1 | export default (params = {}) => {
2 | return Object.keys(params).reduce((queryString, key, index) => {
3 | queryString += (index !== 0 ? '&' : '') + window.encodeURIComponent(key) + '=' + window.encodeURIComponent(params[key]);
4 | return queryString;
5 | }, '');
6 | }
7 |
--------------------------------------------------------------------------------
/biobert_ner/api.py:
--------------------------------------------------------------------------------
1 | from data_load import HParams
2 | from new_model import Net
3 | from pytorch_pretrained_bert.modeling import BertConfig
4 | from pytorch_pretrained_bert import BertModel
5 | import parameters
6 | import numpy as np
7 | from starlette.applications import Starlette
8 | from starlette.responses import JSONResponse, HTMLResponse, RedirectResponse
9 | import torch
10 | import sys
11 | import uvicorn
12 | import aiohttp
13 |
14 |
15 | config = BertConfig(vocab_size_or_config_json_file=parameters.BERT_CONFIG_FILE)
16 | app = Starlette()
17 |
18 |
19 | def build_model(config, state_dict, hp):
20 | model = Net(config, vocab_len = len(hp.VOCAB), bert_state_dict=None)
21 | _ = model.load_state_dict(torch.load(state_dict, map_location='cpu'))
22 | _ = model.to('cpu') # inference
23 | return model
24 |
25 |
26 | # Model loaded
27 | bc5_model = build_model(config, parameters.BC5CDR_WEIGHT, HParams('bc5cdr'))
28 | bionlp13cg_model = build_model(config, parameters.BIONLP13CG_WEIGHT, HParams('bionlp3g'))
29 |
30 |
31 | # Process Query
32 | def process_query(query, hp, model):
33 | s = query
34 | split_s = ["[CLS]"] + s.split()+["[SEP]"]
35 | x = [] # list of ids
36 | is_heads = [] # list. 1: the token is the first piece of a word
37 |
38 | for w in split_s:
39 | tokens = hp.tokenizer.tokenize(w) if w not in ("[CLS]", "[SEP]") else [w]
40 | xx = hp.tokenizer.convert_tokens_to_ids(tokens)
41 | is_head = [1] + [0]*(len(tokens) - 1)
42 | x.extend(xx)
43 | is_heads.extend(is_head)
44 |
45 | x = torch.LongTensor(x).unsqueeze(dim=0)
46 |
47 | # Process query
48 | model.eval()
49 | _, _, y_pred = model(x, torch.Tensor([1, 2, 3])) # just a dummy y value
50 | preds = y_pred[0].cpu().numpy()[np.array(is_heads) == 1] # Get prediction where head is 1
51 |
52 | # convert predicted ids to tag strings and drop the [CLS] / [SEP] token labels
53 | preds = [hp.idx2tag[i] for i in preds][1:-1]
54 | final_output = []
55 | for word, label in zip(s.split(), preds):
56 | final_output.append([word, label])
57 | return final_output
58 |
59 |
60 | def get_bc5cdr(query):
61 | hp = HParams('bc5cdr')
62 | print("bc5cdr -> ", query)
63 | out = process_query(query=query, hp=hp, model=bc5_model)
64 | return JSONResponse({'tagging': out})
65 |
66 |
67 | def get_bionlp13cg(query):
68 | hp = HParams('bionlp3g')
69 | print("bionlp3g -> ", query)
70 | out = process_query(query=query, hp=hp, model=bionlp13cg_model)
71 | return JSONResponse({'tags': out})
72 |
73 |
74 | @app.route("/extract-ner", methods=["GET"])
75 | async def extract_ner(request):
76 | text = request.query_params["text"]
77 | if "bionlp3g" in request.query_params:
78 | return get_bionlp13cg(text)
79 | else:
80 | return get_bc5cdr(text)
81 |
82 |
83 | @app.route("/")
84 | def form(_):
85 | return HTMLResponse(
86 | """
87 | This app will find the named entities (NER)!
88 |
94 | More information can be found here https://github.com/MeRajat/SolvingAlmostAnythingWithBert
95 | Examples :-
96 | ## BIONLP13CG :-
97 | 1. Cooccurrence of reduced expression of alpha - catenin and overexpression of p53 is a predictor of lymph node metastasis in early gastric cancer .
98 | 2. In this review , the role of TSH - R gene alterations in benign and malignant thyroid neoplasia is examined .
99 |
100 | ## BC5CDR :-
101 | 1. The authors describe the case of a 56 - year - old woman with chronic , severe heart failure
102 | secondary to dilated cardiomyopathy and absence of significant ventricular arrhythmias
103 | who developed QT prolongation and torsade de pointes ventricular tachycardia during one cycle
104 | of intermittent low dose ( 2 . 5 mcg / kg per min ) dobutamine .
105 |
106 | """)
107 |
108 |
109 | @app.route("/form")
110 | def redirect_to_homepage(_):
111 | return RedirectResponse("/")
112 |
113 |
114 | if __name__ == "__main__":
115 |     # To run this app, start it from the command line and pass "serve":
116 |     #     python FILENAME serve
117 |     # e.g. python api.py serve
118 | if "serve" in sys.argv:
119 | uvicorn.run(app, host="0.0.0.0", port=9000)
120 |
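Once the server is running (`python api.py serve`, which binds uvicorn to port 9000), the endpoint can be exercised with a plain GET request. Below is a minimal client sketch, assuming the `requests` package and a locally running instance; it is not part of the repository.

```
import requests

# Query the NER endpoint served by api.py (assumes localhost:9000).
# Passing a "bionlp3g" parameter selects the BioNLP13CG model and returns
# {"tags": [[word, label], ...]}; omitting it uses BC5CDR and returns
# {"tagging": [[word, label], ...]}.
params = {"text": "Aspirin induced asthma .", "bionlp3g": "true"}
resp = requests.get("http://localhost:9000/extract-ner", params=params)
print(resp.json())
```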
--------------------------------------------------------------------------------
/biobert_ner/convert_to_pytorch_wt.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": 30,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "import tensorflow as tf \n",
10 | "import re\n",
11 | "import torch\n",
12 | "import numpy as np"
13 | ]
14 | },
15 | {
16 | "cell_type": "code",
17 | "execution_count": 21,
18 | "metadata": {},
19 | "outputs": [],
20 | "source": [
21 | "tf_path = 'weights/pubmed_pmc_470k/biobert_model.ckpt'"
22 | ]
23 | },
24 | {
25 | "cell_type": "code",
26 | "execution_count": 22,
27 | "metadata": {},
28 | "outputs": [],
29 | "source": [
30 | "init_vars = tf.train.list_variables(tf_path)"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": 23,
36 | "metadata": {},
37 | "outputs": [],
38 | "source": [
39 | "excluded = ['BERTAdam','_power','global_step']\n",
40 | "init_vars = [v for v in init_vars if all(e not in v[0] for e in excluded)]"
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": 24,
46 | "metadata": {},
47 | "outputs": [
48 | {
49 | "data": {
50 | "text/plain": [
51 | "[('bert/embeddings/LayerNorm/beta', [768]),\n",
52 | " ('bert/embeddings/LayerNorm/gamma', [768]),\n",
53 | " ('bert/embeddings/position_embeddings', [512, 768]),\n",
54 | " ('bert/embeddings/token_type_embeddings', [2, 768]),\n",
55 | " ('bert/embeddings/word_embeddings', [28996, 768]),\n",
56 | " ('bert/encoder/layer_0/attention/output/LayerNorm/beta', [768]),\n",
57 | " ('bert/encoder/layer_0/attention/output/LayerNorm/gamma', [768]),\n",
58 | " ('bert/encoder/layer_0/attention/output/dense/bias', [768]),\n",
59 | " ('bert/encoder/layer_0/attention/output/dense/kernel', [768, 768]),\n",
60 | " ('bert/encoder/layer_0/attention/self/key/bias', [768]),\n",
61 | " ('bert/encoder/layer_0/attention/self/key/kernel', [768, 768]),\n",
62 | " ('bert/encoder/layer_0/attention/self/query/bias', [768]),\n",
63 | " ('bert/encoder/layer_0/attention/self/query/kernel', [768, 768]),\n",
64 | " ('bert/encoder/layer_0/attention/self/value/bias', [768]),\n",
65 | " ('bert/encoder/layer_0/attention/self/value/kernel', [768, 768]),\n",
66 | " ('bert/encoder/layer_0/intermediate/dense/bias', [3072]),\n",
67 | " ('bert/encoder/layer_0/intermediate/dense/kernel', [768, 3072]),\n",
68 | " ('bert/encoder/layer_0/output/LayerNorm/beta', [768]),\n",
69 | " ('bert/encoder/layer_0/output/LayerNorm/gamma', [768]),\n",
70 | " ('bert/encoder/layer_0/output/dense/bias', [768]),\n",
71 | " ('bert/encoder/layer_0/output/dense/kernel', [3072, 768]),\n",
72 | " ('bert/encoder/layer_1/attention/output/LayerNorm/beta', [768]),\n",
73 | " ('bert/encoder/layer_1/attention/output/LayerNorm/gamma', [768]),\n",
74 | " ('bert/encoder/layer_1/attention/output/dense/bias', [768]),\n",
75 | " ('bert/encoder/layer_1/attention/output/dense/kernel', [768, 768]),\n",
76 | " ('bert/encoder/layer_1/attention/self/key/bias', [768]),\n",
77 | " ('bert/encoder/layer_1/attention/self/key/kernel', [768, 768]),\n",
78 | " ('bert/encoder/layer_1/attention/self/query/bias', [768]),\n",
79 | " ('bert/encoder/layer_1/attention/self/query/kernel', [768, 768]),\n",
80 | " ('bert/encoder/layer_1/attention/self/value/bias', [768]),\n",
81 | " ('bert/encoder/layer_1/attention/self/value/kernel', [768, 768]),\n",
82 | " ('bert/encoder/layer_1/intermediate/dense/bias', [3072]),\n",
83 | " ('bert/encoder/layer_1/intermediate/dense/kernel', [768, 3072]),\n",
84 | " ('bert/encoder/layer_1/output/LayerNorm/beta', [768]),\n",
85 | " ('bert/encoder/layer_1/output/LayerNorm/gamma', [768]),\n",
86 | " ('bert/encoder/layer_1/output/dense/bias', [768]),\n",
87 | " ('bert/encoder/layer_1/output/dense/kernel', [3072, 768]),\n",
88 | " ('bert/encoder/layer_10/attention/output/LayerNorm/beta', [768]),\n",
89 | " ('bert/encoder/layer_10/attention/output/LayerNorm/gamma', [768]),\n",
90 | " ('bert/encoder/layer_10/attention/output/dense/bias', [768]),\n",
91 | " ('bert/encoder/layer_10/attention/output/dense/kernel', [768, 768]),\n",
92 | " ('bert/encoder/layer_10/attention/self/key/bias', [768]),\n",
93 | " ('bert/encoder/layer_10/attention/self/key/kernel', [768, 768]),\n",
94 | " ('bert/encoder/layer_10/attention/self/query/bias', [768]),\n",
95 | " ('bert/encoder/layer_10/attention/self/query/kernel', [768, 768]),\n",
96 | " ('bert/encoder/layer_10/attention/self/value/bias', [768]),\n",
97 | " ('bert/encoder/layer_10/attention/self/value/kernel', [768, 768]),\n",
98 | " ('bert/encoder/layer_10/intermediate/dense/bias', [3072]),\n",
99 | " ('bert/encoder/layer_10/intermediate/dense/kernel', [768, 3072]),\n",
100 | " ('bert/encoder/layer_10/output/LayerNorm/beta', [768]),\n",
101 | " ('bert/encoder/layer_10/output/LayerNorm/gamma', [768]),\n",
102 | " ('bert/encoder/layer_10/output/dense/bias', [768]),\n",
103 | " ('bert/encoder/layer_10/output/dense/kernel', [3072, 768]),\n",
104 | " ('bert/encoder/layer_11/attention/output/LayerNorm/beta', [768]),\n",
105 | " ('bert/encoder/layer_11/attention/output/LayerNorm/gamma', [768]),\n",
106 | " ('bert/encoder/layer_11/attention/output/dense/bias', [768]),\n",
107 | " ('bert/encoder/layer_11/attention/output/dense/kernel', [768, 768]),\n",
108 | " ('bert/encoder/layer_11/attention/self/key/bias', [768]),\n",
109 | " ('bert/encoder/layer_11/attention/self/key/kernel', [768, 768]),\n",
110 | " ('bert/encoder/layer_11/attention/self/query/bias', [768]),\n",
111 | " ('bert/encoder/layer_11/attention/self/query/kernel', [768, 768]),\n",
112 | " ('bert/encoder/layer_11/attention/self/value/bias', [768]),\n",
113 | " ('bert/encoder/layer_11/attention/self/value/kernel', [768, 768]),\n",
114 | " ('bert/encoder/layer_11/intermediate/dense/bias', [3072]),\n",
115 | " ('bert/encoder/layer_11/intermediate/dense/kernel', [768, 3072]),\n",
116 | " ('bert/encoder/layer_11/output/LayerNorm/beta', [768]),\n",
117 | " ('bert/encoder/layer_11/output/LayerNorm/gamma', [768]),\n",
118 | " ('bert/encoder/layer_11/output/dense/bias', [768]),\n",
119 | " ('bert/encoder/layer_11/output/dense/kernel', [3072, 768]),\n",
120 | " ('bert/encoder/layer_2/attention/output/LayerNorm/beta', [768]),\n",
121 | " ('bert/encoder/layer_2/attention/output/LayerNorm/gamma', [768]),\n",
122 | " ('bert/encoder/layer_2/attention/output/dense/bias', [768]),\n",
123 | " ('bert/encoder/layer_2/attention/output/dense/kernel', [768, 768]),\n",
124 | " ('bert/encoder/layer_2/attention/self/key/bias', [768]),\n",
125 | " ('bert/encoder/layer_2/attention/self/key/kernel', [768, 768]),\n",
126 | " ('bert/encoder/layer_2/attention/self/query/bias', [768]),\n",
127 | " ('bert/encoder/layer_2/attention/self/query/kernel', [768, 768]),\n",
128 | " ('bert/encoder/layer_2/attention/self/value/bias', [768]),\n",
129 | " ('bert/encoder/layer_2/attention/self/value/kernel', [768, 768]),\n",
130 | " ('bert/encoder/layer_2/intermediate/dense/bias', [3072]),\n",
131 | " ('bert/encoder/layer_2/intermediate/dense/kernel', [768, 3072]),\n",
132 | " ('bert/encoder/layer_2/output/LayerNorm/beta', [768]),\n",
133 | " ('bert/encoder/layer_2/output/LayerNorm/gamma', [768]),\n",
134 | " ('bert/encoder/layer_2/output/dense/bias', [768]),\n",
135 | " ('bert/encoder/layer_2/output/dense/kernel', [3072, 768]),\n",
136 | " ('bert/encoder/layer_3/attention/output/LayerNorm/beta', [768]),\n",
137 | " ('bert/encoder/layer_3/attention/output/LayerNorm/gamma', [768]),\n",
138 | " ('bert/encoder/layer_3/attention/output/dense/bias', [768]),\n",
139 | " ('bert/encoder/layer_3/attention/output/dense/kernel', [768, 768]),\n",
140 | " ('bert/encoder/layer_3/attention/self/key/bias', [768]),\n",
141 | " ('bert/encoder/layer_3/attention/self/key/kernel', [768, 768]),\n",
142 | " ('bert/encoder/layer_3/attention/self/query/bias', [768]),\n",
143 | " ('bert/encoder/layer_3/attention/self/query/kernel', [768, 768]),\n",
144 | " ('bert/encoder/layer_3/attention/self/value/bias', [768]),\n",
145 | " ('bert/encoder/layer_3/attention/self/value/kernel', [768, 768]),\n",
146 | " ('bert/encoder/layer_3/intermediate/dense/bias', [3072]),\n",
147 | " ('bert/encoder/layer_3/intermediate/dense/kernel', [768, 3072]),\n",
148 | " ('bert/encoder/layer_3/output/LayerNorm/beta', [768]),\n",
149 | " ('bert/encoder/layer_3/output/LayerNorm/gamma', [768]),\n",
150 | " ('bert/encoder/layer_3/output/dense/bias', [768]),\n",
151 | " ('bert/encoder/layer_3/output/dense/kernel', [3072, 768]),\n",
152 | " ('bert/encoder/layer_4/attention/output/LayerNorm/beta', [768]),\n",
153 | " ('bert/encoder/layer_4/attention/output/LayerNorm/gamma', [768]),\n",
154 | " ('bert/encoder/layer_4/attention/output/dense/bias', [768]),\n",
155 | " ('bert/encoder/layer_4/attention/output/dense/kernel', [768, 768]),\n",
156 | " ('bert/encoder/layer_4/attention/self/key/bias', [768]),\n",
157 | " ('bert/encoder/layer_4/attention/self/key/kernel', [768, 768]),\n",
158 | " ('bert/encoder/layer_4/attention/self/query/bias', [768]),\n",
159 | " ('bert/encoder/layer_4/attention/self/query/kernel', [768, 768]),\n",
160 | " ('bert/encoder/layer_4/attention/self/value/bias', [768]),\n",
161 | " ('bert/encoder/layer_4/attention/self/value/kernel', [768, 768]),\n",
162 | " ('bert/encoder/layer_4/intermediate/dense/bias', [3072]),\n",
163 | " ('bert/encoder/layer_4/intermediate/dense/kernel', [768, 3072]),\n",
164 | " ('bert/encoder/layer_4/output/LayerNorm/beta', [768]),\n",
165 | " ('bert/encoder/layer_4/output/LayerNorm/gamma', [768]),\n",
166 | " ('bert/encoder/layer_4/output/dense/bias', [768]),\n",
167 | " ('bert/encoder/layer_4/output/dense/kernel', [3072, 768]),\n",
168 | " ('bert/encoder/layer_5/attention/output/LayerNorm/beta', [768]),\n",
169 | " ('bert/encoder/layer_5/attention/output/LayerNorm/gamma', [768]),\n",
170 | " ('bert/encoder/layer_5/attention/output/dense/bias', [768]),\n",
171 | " ('bert/encoder/layer_5/attention/output/dense/kernel', [768, 768]),\n",
172 | " ('bert/encoder/layer_5/attention/self/key/bias', [768]),\n",
173 | " ('bert/encoder/layer_5/attention/self/key/kernel', [768, 768]),\n",
174 | " ('bert/encoder/layer_5/attention/self/query/bias', [768]),\n",
175 | " ('bert/encoder/layer_5/attention/self/query/kernel', [768, 768]),\n",
176 | " ('bert/encoder/layer_5/attention/self/value/bias', [768]),\n",
177 | " ('bert/encoder/layer_5/attention/self/value/kernel', [768, 768]),\n",
178 | " ('bert/encoder/layer_5/intermediate/dense/bias', [3072]),\n",
179 | " ('bert/encoder/layer_5/intermediate/dense/kernel', [768, 3072]),\n",
180 | " ('bert/encoder/layer_5/output/LayerNorm/beta', [768]),\n",
181 | " ('bert/encoder/layer_5/output/LayerNorm/gamma', [768]),\n",
182 | " ('bert/encoder/layer_5/output/dense/bias', [768]),\n",
183 | " ('bert/encoder/layer_5/output/dense/kernel', [3072, 768]),\n",
184 | " ('bert/encoder/layer_6/attention/output/LayerNorm/beta', [768]),\n",
185 | " ('bert/encoder/layer_6/attention/output/LayerNorm/gamma', [768]),\n",
186 | " ('bert/encoder/layer_6/attention/output/dense/bias', [768]),\n",
187 | " ('bert/encoder/layer_6/attention/output/dense/kernel', [768, 768]),\n",
188 | " ('bert/encoder/layer_6/attention/self/key/bias', [768]),\n",
189 | " ('bert/encoder/layer_6/attention/self/key/kernel', [768, 768]),\n",
190 | " ('bert/encoder/layer_6/attention/self/query/bias', [768]),\n",
191 | " ('bert/encoder/layer_6/attention/self/query/kernel', [768, 768]),\n",
192 | " ('bert/encoder/layer_6/attention/self/value/bias', [768]),\n",
193 | " ('bert/encoder/layer_6/attention/self/value/kernel', [768, 768]),\n",
194 | " ('bert/encoder/layer_6/intermediate/dense/bias', [3072]),\n",
195 | " ('bert/encoder/layer_6/intermediate/dense/kernel', [768, 3072]),\n",
196 | " ('bert/encoder/layer_6/output/LayerNorm/beta', [768]),\n",
197 | " ('bert/encoder/layer_6/output/LayerNorm/gamma', [768]),\n",
198 | " ('bert/encoder/layer_6/output/dense/bias', [768]),\n",
199 | " ('bert/encoder/layer_6/output/dense/kernel', [3072, 768]),\n",
200 | " ('bert/encoder/layer_7/attention/output/LayerNorm/beta', [768]),\n",
201 | " ('bert/encoder/layer_7/attention/output/LayerNorm/gamma', [768]),\n",
202 | " ('bert/encoder/layer_7/attention/output/dense/bias', [768]),\n",
203 | " ('bert/encoder/layer_7/attention/output/dense/kernel', [768, 768]),\n",
204 | " ('bert/encoder/layer_7/attention/self/key/bias', [768]),\n",
205 | " ('bert/encoder/layer_7/attention/self/key/kernel', [768, 768]),\n",
206 | " ('bert/encoder/layer_7/attention/self/query/bias', [768]),\n",
207 | " ('bert/encoder/layer_7/attention/self/query/kernel', [768, 768]),\n",
208 | " ('bert/encoder/layer_7/attention/self/value/bias', [768]),\n",
209 | " ('bert/encoder/layer_7/attention/self/value/kernel', [768, 768]),\n",
210 | " ('bert/encoder/layer_7/intermediate/dense/bias', [3072]),\n",
211 | " ('bert/encoder/layer_7/intermediate/dense/kernel', [768, 3072]),\n",
212 | " ('bert/encoder/layer_7/output/LayerNorm/beta', [768]),\n",
213 | " ('bert/encoder/layer_7/output/LayerNorm/gamma', [768]),\n",
214 | " ('bert/encoder/layer_7/output/dense/bias', [768]),\n",
215 | " ('bert/encoder/layer_7/output/dense/kernel', [3072, 768]),\n",
216 | " ('bert/encoder/layer_8/attention/output/LayerNorm/beta', [768]),\n",
217 | " ('bert/encoder/layer_8/attention/output/LayerNorm/gamma', [768]),\n",
218 | " ('bert/encoder/layer_8/attention/output/dense/bias', [768]),\n",
219 | " ('bert/encoder/layer_8/attention/output/dense/kernel', [768, 768]),\n",
220 | " ('bert/encoder/layer_8/attention/self/key/bias', [768]),\n",
221 | " ('bert/encoder/layer_8/attention/self/key/kernel', [768, 768]),\n",
222 | " ('bert/encoder/layer_8/attention/self/query/bias', [768]),\n",
223 | " ('bert/encoder/layer_8/attention/self/query/kernel', [768, 768]),\n",
224 | " ('bert/encoder/layer_8/attention/self/value/bias', [768]),\n",
225 | " ('bert/encoder/layer_8/attention/self/value/kernel', [768, 768]),\n",
226 | " ('bert/encoder/layer_8/intermediate/dense/bias', [3072]),\n",
227 | " ('bert/encoder/layer_8/intermediate/dense/kernel', [768, 3072]),\n",
228 | " ('bert/encoder/layer_8/output/LayerNorm/beta', [768]),\n",
229 | " ('bert/encoder/layer_8/output/LayerNorm/gamma', [768]),\n",
230 | " ('bert/encoder/layer_8/output/dense/bias', [768]),\n",
231 | " ('bert/encoder/layer_8/output/dense/kernel', [3072, 768]),\n",
232 | " ('bert/encoder/layer_9/attention/output/LayerNorm/beta', [768]),\n",
233 | " ('bert/encoder/layer_9/attention/output/LayerNorm/gamma', [768]),\n",
234 | " ('bert/encoder/layer_9/attention/output/dense/bias', [768]),\n",
235 | " ('bert/encoder/layer_9/attention/output/dense/kernel', [768, 768]),\n",
236 | " ('bert/encoder/layer_9/attention/self/key/bias', [768]),\n",
237 | " ('bert/encoder/layer_9/attention/self/key/kernel', [768, 768]),\n",
238 | " ('bert/encoder/layer_9/attention/self/query/bias', [768]),\n",
239 | " ('bert/encoder/layer_9/attention/self/query/kernel', [768, 768]),\n",
240 | " ('bert/encoder/layer_9/attention/self/value/bias', [768]),\n",
241 | " ('bert/encoder/layer_9/attention/self/value/kernel', [768, 768]),\n",
242 | " ('bert/encoder/layer_9/intermediate/dense/bias', [3072]),\n",
243 | " ('bert/encoder/layer_9/intermediate/dense/kernel', [768, 3072]),\n",
244 | " ('bert/encoder/layer_9/output/LayerNorm/beta', [768]),\n",
245 | " ('bert/encoder/layer_9/output/LayerNorm/gamma', [768]),\n",
246 | " ('bert/encoder/layer_9/output/dense/bias', [768]),\n",
247 | " ('bert/encoder/layer_9/output/dense/kernel', [3072, 768]),\n",
248 | " ('bert/pooler/dense/bias', [768]),\n",
249 | " ('bert/pooler/dense/kernel', [768, 768]),\n",
250 | " ('cls/predictions/output_bias', [28996]),\n",
251 | " ('cls/predictions/transform/LayerNorm/beta', [768]),\n",
252 | " ('cls/predictions/transform/LayerNorm/gamma', [768]),\n",
253 | " ('cls/predictions/transform/dense/bias', [768]),\n",
254 | " ('cls/predictions/transform/dense/kernel', [768, 768]),\n",
255 | " ('cls/seq_relationship/output_bias', [2]),\n",
256 | " ('cls/seq_relationship/output_weights', [2, 768])]"
257 | ]
258 | },
259 | "execution_count": 24,
260 | "metadata": {},
261 | "output_type": "execute_result"
262 | }
263 | ],
264 | "source": [
265 | "init_vars"
266 | ]
267 | },
268 | {
269 | "cell_type": "code",
270 | "execution_count": 25,
271 | "metadata": {},
272 | "outputs": [
273 | {
274 | "name": "stdout",
275 | "output_type": "stream",
276 | "text": [
277 | "Loading TF weight bert/embeddings/LayerNorm/beta with shape [768]\n",
278 | "Loading TF weight bert/embeddings/LayerNorm/gamma with shape [768]\n",
279 | "Loading TF weight bert/embeddings/position_embeddings with shape [512, 768]\n",
280 | "Loading TF weight bert/embeddings/token_type_embeddings with shape [2, 768]\n",
281 | "Loading TF weight bert/embeddings/word_embeddings with shape [28996, 768]\n",
282 | "Loading TF weight bert/encoder/layer_0/attention/output/LayerNorm/beta with shape [768]\n",
283 | "Loading TF weight bert/encoder/layer_0/attention/output/LayerNorm/gamma with shape [768]\n",
284 | "Loading TF weight bert/encoder/layer_0/attention/output/dense/bias with shape [768]\n",
285 | "Loading TF weight bert/encoder/layer_0/attention/output/dense/kernel with shape [768, 768]\n",
286 | "Loading TF weight bert/encoder/layer_0/attention/self/key/bias with shape [768]\n",
287 | "Loading TF weight bert/encoder/layer_0/attention/self/key/kernel with shape [768, 768]\n",
288 | "Loading TF weight bert/encoder/layer_0/attention/self/query/bias with shape [768]\n",
289 | "Loading TF weight bert/encoder/layer_0/attention/self/query/kernel with shape [768, 768]\n",
290 | "Loading TF weight bert/encoder/layer_0/attention/self/value/bias with shape [768]\n",
291 | "Loading TF weight bert/encoder/layer_0/attention/self/value/kernel with shape [768, 768]\n",
292 | "Loading TF weight bert/encoder/layer_0/intermediate/dense/bias with shape [3072]\n",
293 | "Loading TF weight bert/encoder/layer_0/intermediate/dense/kernel with shape [768, 3072]\n",
294 | "Loading TF weight bert/encoder/layer_0/output/LayerNorm/beta with shape [768]\n",
295 | "Loading TF weight bert/encoder/layer_0/output/LayerNorm/gamma with shape [768]\n",
296 | "Loading TF weight bert/encoder/layer_0/output/dense/bias with shape [768]\n",
297 | "Loading TF weight bert/encoder/layer_0/output/dense/kernel with shape [3072, 768]\n",
298 | "Loading TF weight bert/encoder/layer_1/attention/output/LayerNorm/beta with shape [768]\n",
299 | "Loading TF weight bert/encoder/layer_1/attention/output/LayerNorm/gamma with shape [768]\n",
300 | "Loading TF weight bert/encoder/layer_1/attention/output/dense/bias with shape [768]\n",
301 | "Loading TF weight bert/encoder/layer_1/attention/output/dense/kernel with shape [768, 768]\n",
302 | "Loading TF weight bert/encoder/layer_1/attention/self/key/bias with shape [768]\n",
303 | "Loading TF weight bert/encoder/layer_1/attention/self/key/kernel with shape [768, 768]\n",
304 | "Loading TF weight bert/encoder/layer_1/attention/self/query/bias with shape [768]\n",
305 | "Loading TF weight bert/encoder/layer_1/attention/self/query/kernel with shape [768, 768]\n",
306 | "Loading TF weight bert/encoder/layer_1/attention/self/value/bias with shape [768]\n",
307 | "Loading TF weight bert/encoder/layer_1/attention/self/value/kernel with shape [768, 768]\n",
308 | "Loading TF weight bert/encoder/layer_1/intermediate/dense/bias with shape [3072]\n",
309 | "Loading TF weight bert/encoder/layer_1/intermediate/dense/kernel with shape [768, 3072]\n",
310 | "Loading TF weight bert/encoder/layer_1/output/LayerNorm/beta with shape [768]\n",
311 | "Loading TF weight bert/encoder/layer_1/output/LayerNorm/gamma with shape [768]\n",
312 | "Loading TF weight bert/encoder/layer_1/output/dense/bias with shape [768]\n",
313 | "Loading TF weight bert/encoder/layer_1/output/dense/kernel with shape [3072, 768]\n",
314 | "Loading TF weight bert/encoder/layer_10/attention/output/LayerNorm/beta with shape [768]\n",
315 | "Loading TF weight bert/encoder/layer_10/attention/output/LayerNorm/gamma with shape [768]\n",
316 | "Loading TF weight bert/encoder/layer_10/attention/output/dense/bias with shape [768]\n",
317 | "Loading TF weight bert/encoder/layer_10/attention/output/dense/kernel with shape [768, 768]\n",
318 | "Loading TF weight bert/encoder/layer_10/attention/self/key/bias with shape [768]\n",
319 | "Loading TF weight bert/encoder/layer_10/attention/self/key/kernel with shape [768, 768]\n",
320 | "Loading TF weight bert/encoder/layer_10/attention/self/query/bias with shape [768]\n",
321 | "Loading TF weight bert/encoder/layer_10/attention/self/query/kernel with shape [768, 768]\n",
322 | "Loading TF weight bert/encoder/layer_10/attention/self/value/bias with shape [768]\n",
323 | "Loading TF weight bert/encoder/layer_10/attention/self/value/kernel with shape [768, 768]\n",
324 | "Loading TF weight bert/encoder/layer_10/intermediate/dense/bias with shape [3072]\n",
325 | "Loading TF weight bert/encoder/layer_10/intermediate/dense/kernel with shape [768, 3072]\n",
326 | "Loading TF weight bert/encoder/layer_10/output/LayerNorm/beta with shape [768]\n",
327 | "Loading TF weight bert/encoder/layer_10/output/LayerNorm/gamma with shape [768]\n",
328 | "Loading TF weight bert/encoder/layer_10/output/dense/bias with shape [768]\n",
329 | "Loading TF weight bert/encoder/layer_10/output/dense/kernel with shape [3072, 768]\n",
330 | "Loading TF weight bert/encoder/layer_11/attention/output/LayerNorm/beta with shape [768]\n",
331 | "Loading TF weight bert/encoder/layer_11/attention/output/LayerNorm/gamma with shape [768]\n",
332 | "Loading TF weight bert/encoder/layer_11/attention/output/dense/bias with shape [768]\n",
333 | "Loading TF weight bert/encoder/layer_11/attention/output/dense/kernel with shape [768, 768]\n",
334 | "Loading TF weight bert/encoder/layer_11/attention/self/key/bias with shape [768]\n",
335 | "Loading TF weight bert/encoder/layer_11/attention/self/key/kernel with shape [768, 768]\n",
336 | "Loading TF weight bert/encoder/layer_11/attention/self/query/bias with shape [768]\n",
337 | "Loading TF weight bert/encoder/layer_11/attention/self/query/kernel with shape [768, 768]\n",
338 | "Loading TF weight bert/encoder/layer_11/attention/self/value/bias with shape [768]\n",
339 | "Loading TF weight bert/encoder/layer_11/attention/self/value/kernel with shape [768, 768]\n",
340 | "Loading TF weight bert/encoder/layer_11/intermediate/dense/bias with shape [3072]\n",
341 | "Loading TF weight bert/encoder/layer_11/intermediate/dense/kernel with shape [768, 3072]\n",
342 | "Loading TF weight bert/encoder/layer_11/output/LayerNorm/beta with shape [768]\n",
343 | "Loading TF weight bert/encoder/layer_11/output/LayerNorm/gamma with shape [768]\n",
344 | "Loading TF weight bert/encoder/layer_11/output/dense/bias with shape [768]\n",
345 | "Loading TF weight bert/encoder/layer_11/output/dense/kernel with shape [3072, 768]\n",
346 | "Loading TF weight bert/encoder/layer_2/attention/output/LayerNorm/beta with shape [768]\n",
347 | "Loading TF weight bert/encoder/layer_2/attention/output/LayerNorm/gamma with shape [768]\n",
348 | "Loading TF weight bert/encoder/layer_2/attention/output/dense/bias with shape [768]\n",
349 | "Loading TF weight bert/encoder/layer_2/attention/output/dense/kernel with shape [768, 768]\n",
350 | "Loading TF weight bert/encoder/layer_2/attention/self/key/bias with shape [768]\n",
351 | "Loading TF weight bert/encoder/layer_2/attention/self/key/kernel with shape [768, 768]\n",
352 | "Loading TF weight bert/encoder/layer_2/attention/self/query/bias with shape [768]\n",
353 | "Loading TF weight bert/encoder/layer_2/attention/self/query/kernel with shape [768, 768]\n",
354 | "Loading TF weight bert/encoder/layer_2/attention/self/value/bias with shape [768]\n",
355 | "Loading TF weight bert/encoder/layer_2/attention/self/value/kernel with shape [768, 768]\n",
356 | "Loading TF weight bert/encoder/layer_2/intermediate/dense/bias with shape [3072]\n",
357 | "Loading TF weight bert/encoder/layer_2/intermediate/dense/kernel with shape [768, 3072]\n",
358 | "Loading TF weight bert/encoder/layer_2/output/LayerNorm/beta with shape [768]\n",
359 | "Loading TF weight bert/encoder/layer_2/output/LayerNorm/gamma with shape [768]\n",
360 | "Loading TF weight bert/encoder/layer_2/output/dense/bias with shape [768]\n",
361 | "Loading TF weight bert/encoder/layer_2/output/dense/kernel with shape [3072, 768]\n",
362 | "Loading TF weight bert/encoder/layer_3/attention/output/LayerNorm/beta with shape [768]\n",
363 | "Loading TF weight bert/encoder/layer_3/attention/output/LayerNorm/gamma with shape [768]\n",
364 | "Loading TF weight bert/encoder/layer_3/attention/output/dense/bias with shape [768]\n",
365 | "Loading TF weight bert/encoder/layer_3/attention/output/dense/kernel with shape [768, 768]\n",
366 | "Loading TF weight bert/encoder/layer_3/attention/self/key/bias with shape [768]\n",
367 | "Loading TF weight bert/encoder/layer_3/attention/self/key/kernel with shape [768, 768]\n",
368 | "Loading TF weight bert/encoder/layer_3/attention/self/query/bias with shape [768]\n",
369 | "Loading TF weight bert/encoder/layer_3/attention/self/query/kernel with shape [768, 768]\n",
370 | "Loading TF weight bert/encoder/layer_3/attention/self/value/bias with shape [768]\n",
371 | "Loading TF weight bert/encoder/layer_3/attention/self/value/kernel with shape [768, 768]\n",
372 | "Loading TF weight bert/encoder/layer_3/intermediate/dense/bias with shape [3072]\n",
373 | "Loading TF weight bert/encoder/layer_3/intermediate/dense/kernel with shape [768, 3072]\n",
374 | "Loading TF weight bert/encoder/layer_3/output/LayerNorm/beta with shape [768]\n",
375 | "Loading TF weight bert/encoder/layer_3/output/LayerNorm/gamma with shape [768]\n",
376 | "Loading TF weight bert/encoder/layer_3/output/dense/bias with shape [768]\n",
377 | "Loading TF weight bert/encoder/layer_3/output/dense/kernel with shape [3072, 768]\n",
378 | "Loading TF weight bert/encoder/layer_4/attention/output/LayerNorm/beta with shape [768]\n",
379 | "Loading TF weight bert/encoder/layer_4/attention/output/LayerNorm/gamma with shape [768]\n",
380 | "Loading TF weight bert/encoder/layer_4/attention/output/dense/bias with shape [768]\n",
381 | "Loading TF weight bert/encoder/layer_4/attention/output/dense/kernel with shape [768, 768]\n",
382 | "Loading TF weight bert/encoder/layer_4/attention/self/key/bias with shape [768]\n",
383 | "Loading TF weight bert/encoder/layer_4/attention/self/key/kernel with shape [768, 768]\n",
384 | "Loading TF weight bert/encoder/layer_4/attention/self/query/bias with shape [768]\n",
385 | "Loading TF weight bert/encoder/layer_4/attention/self/query/kernel with shape [768, 768]\n",
386 | "Loading TF weight bert/encoder/layer_4/attention/self/value/bias with shape [768]\n",
387 | "Loading TF weight bert/encoder/layer_4/attention/self/value/kernel with shape [768, 768]\n",
388 | "Loading TF weight bert/encoder/layer_4/intermediate/dense/bias with shape [3072]\n",
389 | "Loading TF weight bert/encoder/layer_4/intermediate/dense/kernel with shape [768, 3072]\n",
390 | "Loading TF weight bert/encoder/layer_4/output/LayerNorm/beta with shape [768]\n",
391 | "Loading TF weight bert/encoder/layer_4/output/LayerNorm/gamma with shape [768]\n",
392 | "Loading TF weight bert/encoder/layer_4/output/dense/bias with shape [768]\n",
393 | "Loading TF weight bert/encoder/layer_4/output/dense/kernel with shape [3072, 768]\n",
394 | "Loading TF weight bert/encoder/layer_5/attention/output/LayerNorm/beta with shape [768]\n",
395 | "Loading TF weight bert/encoder/layer_5/attention/output/LayerNorm/gamma with shape [768]\n",
396 | "Loading TF weight bert/encoder/layer_5/attention/output/dense/bias with shape [768]\n",
397 | "Loading TF weight bert/encoder/layer_5/attention/output/dense/kernel with shape [768, 768]\n",
398 | "Loading TF weight bert/encoder/layer_5/attention/self/key/bias with shape [768]\n",
399 | "Loading TF weight bert/encoder/layer_5/attention/self/key/kernel with shape [768, 768]\n",
400 | "Loading TF weight bert/encoder/layer_5/attention/self/query/bias with shape [768]\n",
401 | "Loading TF weight bert/encoder/layer_5/attention/self/query/kernel with shape [768, 768]\n",
402 | "Loading TF weight bert/encoder/layer_5/attention/self/value/bias with shape [768]\n",
403 | "Loading TF weight bert/encoder/layer_5/attention/self/value/kernel with shape [768, 768]\n",
404 | "Loading TF weight bert/encoder/layer_5/intermediate/dense/bias with shape [3072]\n",
405 | "Loading TF weight bert/encoder/layer_5/intermediate/dense/kernel with shape [768, 3072]\n",
406 | "Loading TF weight bert/encoder/layer_5/output/LayerNorm/beta with shape [768]\n",
407 | "Loading TF weight bert/encoder/layer_5/output/LayerNorm/gamma with shape [768]\n",
408 | "Loading TF weight bert/encoder/layer_5/output/dense/bias with shape [768]\n"
409 | ]
410 | },
411 | {
412 | "name": "stdout",
413 | "output_type": "stream",
414 | "text": [
415 | "Loading TF weight bert/encoder/layer_5/output/dense/kernel with shape [3072, 768]\n",
416 | "Loading TF weight bert/encoder/layer_6/attention/output/LayerNorm/beta with shape [768]\n",
417 | "Loading TF weight bert/encoder/layer_6/attention/output/LayerNorm/gamma with shape [768]\n",
418 | "Loading TF weight bert/encoder/layer_6/attention/output/dense/bias with shape [768]\n",
419 | "Loading TF weight bert/encoder/layer_6/attention/output/dense/kernel with shape [768, 768]\n",
420 | "Loading TF weight bert/encoder/layer_6/attention/self/key/bias with shape [768]\n",
421 | "Loading TF weight bert/encoder/layer_6/attention/self/key/kernel with shape [768, 768]\n",
422 | "Loading TF weight bert/encoder/layer_6/attention/self/query/bias with shape [768]\n",
423 | "Loading TF weight bert/encoder/layer_6/attention/self/query/kernel with shape [768, 768]\n",
424 | "Loading TF weight bert/encoder/layer_6/attention/self/value/bias with shape [768]\n",
425 | "Loading TF weight bert/encoder/layer_6/attention/self/value/kernel with shape [768, 768]\n",
426 | "Loading TF weight bert/encoder/layer_6/intermediate/dense/bias with shape [3072]\n",
427 | "Loading TF weight bert/encoder/layer_6/intermediate/dense/kernel with shape [768, 3072]\n",
428 | "Loading TF weight bert/encoder/layer_6/output/LayerNorm/beta with shape [768]\n",
429 | "Loading TF weight bert/encoder/layer_6/output/LayerNorm/gamma with shape [768]\n",
430 | "Loading TF weight bert/encoder/layer_6/output/dense/bias with shape [768]\n",
431 | "Loading TF weight bert/encoder/layer_6/output/dense/kernel with shape [3072, 768]\n",
432 | "Loading TF weight bert/encoder/layer_7/attention/output/LayerNorm/beta with shape [768]\n",
433 | "Loading TF weight bert/encoder/layer_7/attention/output/LayerNorm/gamma with shape [768]\n",
434 | "Loading TF weight bert/encoder/layer_7/attention/output/dense/bias with shape [768]\n",
435 | "Loading TF weight bert/encoder/layer_7/attention/output/dense/kernel with shape [768, 768]\n",
436 | "Loading TF weight bert/encoder/layer_7/attention/self/key/bias with shape [768]\n",
437 | "Loading TF weight bert/encoder/layer_7/attention/self/key/kernel with shape [768, 768]\n",
438 | "Loading TF weight bert/encoder/layer_7/attention/self/query/bias with shape [768]\n",
439 | "Loading TF weight bert/encoder/layer_7/attention/self/query/kernel with shape [768, 768]\n",
440 | "Loading TF weight bert/encoder/layer_7/attention/self/value/bias with shape [768]\n",
441 | "Loading TF weight bert/encoder/layer_7/attention/self/value/kernel with shape [768, 768]\n",
442 | "Loading TF weight bert/encoder/layer_7/intermediate/dense/bias with shape [3072]\n",
443 | "Loading TF weight bert/encoder/layer_7/intermediate/dense/kernel with shape [768, 3072]\n",
444 | "Loading TF weight bert/encoder/layer_7/output/LayerNorm/beta with shape [768]\n",
445 | "Loading TF weight bert/encoder/layer_7/output/LayerNorm/gamma with shape [768]\n",
446 | "Loading TF weight bert/encoder/layer_7/output/dense/bias with shape [768]\n",
447 | "Loading TF weight bert/encoder/layer_7/output/dense/kernel with shape [3072, 768]\n",
448 | "Loading TF weight bert/encoder/layer_8/attention/output/LayerNorm/beta with shape [768]\n",
449 | "Loading TF weight bert/encoder/layer_8/attention/output/LayerNorm/gamma with shape [768]\n",
450 | "Loading TF weight bert/encoder/layer_8/attention/output/dense/bias with shape [768]\n",
451 | "Loading TF weight bert/encoder/layer_8/attention/output/dense/kernel with shape [768, 768]\n",
452 | "Loading TF weight bert/encoder/layer_8/attention/self/key/bias with shape [768]\n",
453 | "Loading TF weight bert/encoder/layer_8/attention/self/key/kernel with shape [768, 768]\n",
454 | "Loading TF weight bert/encoder/layer_8/attention/self/query/bias with shape [768]\n",
455 | "Loading TF weight bert/encoder/layer_8/attention/self/query/kernel with shape [768, 768]\n",
456 | "Loading TF weight bert/encoder/layer_8/attention/self/value/bias with shape [768]\n",
457 | "Loading TF weight bert/encoder/layer_8/attention/self/value/kernel with shape [768, 768]\n",
458 | "Loading TF weight bert/encoder/layer_8/intermediate/dense/bias with shape [3072]\n",
459 | "Loading TF weight bert/encoder/layer_8/intermediate/dense/kernel with shape [768, 3072]\n",
460 | "Loading TF weight bert/encoder/layer_8/output/LayerNorm/beta with shape [768]\n",
461 | "Loading TF weight bert/encoder/layer_8/output/LayerNorm/gamma with shape [768]\n",
462 | "Loading TF weight bert/encoder/layer_8/output/dense/bias with shape [768]\n",
463 | "Loading TF weight bert/encoder/layer_8/output/dense/kernel with shape [3072, 768]\n",
464 | "Loading TF weight bert/encoder/layer_9/attention/output/LayerNorm/beta with shape [768]\n",
465 | "Loading TF weight bert/encoder/layer_9/attention/output/LayerNorm/gamma with shape [768]\n",
466 | "Loading TF weight bert/encoder/layer_9/attention/output/dense/bias with shape [768]\n",
467 | "Loading TF weight bert/encoder/layer_9/attention/output/dense/kernel with shape [768, 768]\n",
468 | "Loading TF weight bert/encoder/layer_9/attention/self/key/bias with shape [768]\n",
469 | "Loading TF weight bert/encoder/layer_9/attention/self/key/kernel with shape [768, 768]\n",
470 | "Loading TF weight bert/encoder/layer_9/attention/self/query/bias with shape [768]\n",
471 | "Loading TF weight bert/encoder/layer_9/attention/self/query/kernel with shape [768, 768]\n",
472 | "Loading TF weight bert/encoder/layer_9/attention/self/value/bias with shape [768]\n",
473 | "Loading TF weight bert/encoder/layer_9/attention/self/value/kernel with shape [768, 768]\n",
474 | "Loading TF weight bert/encoder/layer_9/intermediate/dense/bias with shape [3072]\n",
475 | "Loading TF weight bert/encoder/layer_9/intermediate/dense/kernel with shape [768, 3072]\n",
476 | "Loading TF weight bert/encoder/layer_9/output/LayerNorm/beta with shape [768]\n",
477 | "Loading TF weight bert/encoder/layer_9/output/LayerNorm/gamma with shape [768]\n",
478 | "Loading TF weight bert/encoder/layer_9/output/dense/bias with shape [768]\n",
479 | "Loading TF weight bert/encoder/layer_9/output/dense/kernel with shape [3072, 768]\n",
480 | "Loading TF weight bert/pooler/dense/bias with shape [768]\n",
481 | "Loading TF weight bert/pooler/dense/kernel with shape [768, 768]\n",
482 | "Loading TF weight cls/predictions/output_bias with shape [28996]\n",
483 | "Loading TF weight cls/predictions/transform/LayerNorm/beta with shape [768]\n",
484 | "Loading TF weight cls/predictions/transform/LayerNorm/gamma with shape [768]\n",
485 | "Loading TF weight cls/predictions/transform/dense/bias with shape [768]\n",
486 | "Loading TF weight cls/predictions/transform/dense/kernel with shape [768, 768]\n",
487 | "Loading TF weight cls/seq_relationship/output_bias with shape [2]\n",
488 | "Loading TF weight cls/seq_relationship/output_weights with shape [2, 768]\n"
489 | ]
490 | }
491 | ],
492 | "source": [
493 | "names = []\n",
494 | "arrays = []\n",
495 | "for name, shape in init_vars:\n",
496 | " print(\"Loading TF weight {} with shape {}\".format(name, shape))\n",
497 | " array = tf.train.load_variable(tf_path, name)\n",
498 | " names.append(name)\n",
499 | " arrays.append(array)"
500 | ]
501 | },
502 | {
503 | "cell_type": "code",
504 | "execution_count": 26,
505 | "metadata": {},
506 | "outputs": [],
507 | "source": [
508 | "from pytorch_pretrained_bert import BertConfig, BertForPreTraining"
509 | ]
510 | },
511 | {
512 | "cell_type": "code",
513 | "execution_count": 27,
514 | "metadata": {
515 | "scrolled": true
516 | },
517 | "outputs": [
518 | {
519 | "name": "stdout",
520 | "output_type": "stream",
521 | "text": [
522 | "Building PyTorch model from configuration: {\n",
523 | " \"attention_probs_dropout_prob\": 0.1,\n",
524 | " \"hidden_act\": \"gelu\",\n",
525 | " \"hidden_dropout_prob\": 0.1,\n",
526 | " \"hidden_size\": 768,\n",
527 | " \"initializer_range\": 0.02,\n",
528 | " \"intermediate_size\": 3072,\n",
529 | " \"max_position_embeddings\": 512,\n",
530 | " \"num_attention_heads\": 12,\n",
531 | " \"num_hidden_layers\": 12,\n",
532 | " \"type_vocab_size\": 2,\n",
533 | " \"vocab_size\": 28996\n",
534 | "}\n",
535 | "\n"
536 | ]
537 | }
538 | ],
539 | "source": [
540 | "# Initialise PyTorch model\n",
541 | "config = BertConfig.from_json_file('weights/pubmed_pmc_470k/bert_config.json')\n",
542 | "print(\"Building PyTorch model from configuration: {}\".format(str(config)))\n",
543 | "model = BertForPreTraining(config)\n"
544 | ]
545 | },
546 | {
547 | "cell_type": "code",
548 | "execution_count": 31,
549 | "metadata": {},
550 | "outputs": [
551 | {
552 | "name": "stdout",
553 | "output_type": "stream",
554 | "text": [
555 | "Initialize PyTorch weight ['bert', 'embeddings', 'LayerNorm', 'beta']\n",
556 | "Initialize PyTorch weight ['bert', 'embeddings', 'LayerNorm', 'gamma']\n",
557 | "Initialize PyTorch weight ['bert', 'embeddings', 'position_embeddings']\n",
558 | "Initialize PyTorch weight ['bert', 'embeddings', 'token_type_embeddings']\n",
559 | "Initialize PyTorch weight ['bert', 'embeddings', 'word_embeddings']\n",
560 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'attention', 'output', 'LayerNorm', 'beta']\n",
561 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'attention', 'output', 'LayerNorm', 'gamma']\n",
562 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'attention', 'output', 'dense', 'bias']\n",
563 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'attention', 'output', 'dense', 'kernel']\n",
564 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'attention', 'self', 'key', 'bias']\n",
565 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'attention', 'self', 'key', 'kernel']\n",
566 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'attention', 'self', 'query', 'bias']\n",
567 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'attention', 'self', 'query', 'kernel']\n",
568 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'attention', 'self', 'value', 'bias']\n",
569 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'attention', 'self', 'value', 'kernel']\n",
570 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'intermediate', 'dense', 'bias']\n",
571 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'intermediate', 'dense', 'kernel']\n",
572 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'output', 'LayerNorm', 'beta']\n",
573 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'output', 'LayerNorm', 'gamma']\n",
574 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'output', 'dense', 'bias']\n",
575 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_0', 'output', 'dense', 'kernel']\n",
576 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'attention', 'output', 'LayerNorm', 'beta']\n",
577 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'attention', 'output', 'LayerNorm', 'gamma']\n",
578 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'attention', 'output', 'dense', 'bias']\n",
579 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'attention', 'output', 'dense', 'kernel']\n",
580 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'attention', 'self', 'key', 'bias']\n",
581 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'attention', 'self', 'key', 'kernel']\n",
582 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'attention', 'self', 'query', 'bias']\n",
583 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'attention', 'self', 'query', 'kernel']\n",
584 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'attention', 'self', 'value', 'bias']\n",
585 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'attention', 'self', 'value', 'kernel']\n",
586 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'intermediate', 'dense', 'bias']\n",
587 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'intermediate', 'dense', 'kernel']\n",
588 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'output', 'LayerNorm', 'beta']\n",
589 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'output', 'LayerNorm', 'gamma']\n",
590 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'output', 'dense', 'bias']\n",
591 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_1', 'output', 'dense', 'kernel']\n",
592 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'attention', 'output', 'LayerNorm', 'beta']\n",
593 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'attention', 'output', 'LayerNorm', 'gamma']\n",
594 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'attention', 'output', 'dense', 'bias']\n",
595 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'attention', 'output', 'dense', 'kernel']\n",
596 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'attention', 'self', 'key', 'bias']\n",
597 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'attention', 'self', 'key', 'kernel']\n",
598 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'attention', 'self', 'query', 'bias']\n",
599 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'attention', 'self', 'query', 'kernel']\n",
600 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'attention', 'self', 'value', 'bias']\n",
601 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'attention', 'self', 'value', 'kernel']\n",
602 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'intermediate', 'dense', 'bias']\n",
603 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'intermediate', 'dense', 'kernel']\n",
604 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'output', 'LayerNorm', 'beta']\n",
605 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'output', 'LayerNorm', 'gamma']\n",
606 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'output', 'dense', 'bias']\n",
607 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_10', 'output', 'dense', 'kernel']\n",
608 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'attention', 'output', 'LayerNorm', 'beta']\n",
609 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'attention', 'output', 'LayerNorm', 'gamma']\n",
610 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'attention', 'output', 'dense', 'bias']\n",
611 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'attention', 'output', 'dense', 'kernel']\n",
612 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'attention', 'self', 'key', 'bias']\n",
613 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'attention', 'self', 'key', 'kernel']\n",
614 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'attention', 'self', 'query', 'bias']\n",
615 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'attention', 'self', 'query', 'kernel']\n",
616 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'attention', 'self', 'value', 'bias']\n",
617 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'attention', 'self', 'value', 'kernel']\n",
618 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'intermediate', 'dense', 'bias']\n",
619 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'intermediate', 'dense', 'kernel']\n",
620 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'output', 'LayerNorm', 'beta']\n",
621 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'output', 'LayerNorm', 'gamma']\n",
622 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'output', 'dense', 'bias']\n",
623 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_11', 'output', 'dense', 'kernel']\n",
624 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'attention', 'output', 'LayerNorm', 'beta']\n",
625 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'attention', 'output', 'LayerNorm', 'gamma']\n",
626 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'attention', 'output', 'dense', 'bias']\n",
627 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'attention', 'output', 'dense', 'kernel']\n",
628 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'attention', 'self', 'key', 'bias']\n",
629 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'attention', 'self', 'key', 'kernel']\n",
630 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'attention', 'self', 'query', 'bias']\n",
631 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'attention', 'self', 'query', 'kernel']\n",
632 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'attention', 'self', 'value', 'bias']\n",
633 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'attention', 'self', 'value', 'kernel']\n",
634 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'intermediate', 'dense', 'bias']\n",
635 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'intermediate', 'dense', 'kernel']\n",
636 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'output', 'LayerNorm', 'beta']\n",
637 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'output', 'LayerNorm', 'gamma']\n",
638 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'output', 'dense', 'bias']\n",
639 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_2', 'output', 'dense', 'kernel']\n",
640 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'attention', 'output', 'LayerNorm', 'beta']\n",
641 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'attention', 'output', 'LayerNorm', 'gamma']\n",
642 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'attention', 'output', 'dense', 'bias']\n",
643 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'attention', 'output', 'dense', 'kernel']\n",
644 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'attention', 'self', 'key', 'bias']\n",
645 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'attention', 'self', 'key', 'kernel']\n",
646 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'attention', 'self', 'query', 'bias']\n",
647 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'attention', 'self', 'query', 'kernel']\n",
648 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'attention', 'self', 'value', 'bias']\n",
649 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'attention', 'self', 'value', 'kernel']\n",
650 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'intermediate', 'dense', 'bias']\n",
651 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'intermediate', 'dense', 'kernel']\n",
652 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'output', 'LayerNorm', 'beta']\n",
653 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'output', 'LayerNorm', 'gamma']\n",
654 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'output', 'dense', 'bias']\n",
655 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_3', 'output', 'dense', 'kernel']\n",
656 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'attention', 'output', 'LayerNorm', 'beta']\n",
657 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'attention', 'output', 'LayerNorm', 'gamma']\n",
658 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'attention', 'output', 'dense', 'bias']\n",
659 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'attention', 'output', 'dense', 'kernel']\n",
660 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'attention', 'self', 'key', 'bias']\n",
661 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'attention', 'self', 'key', 'kernel']\n",
662 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'attention', 'self', 'query', 'bias']\n",
663 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'attention', 'self', 'query', 'kernel']\n",
664 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'attention', 'self', 'value', 'bias']\n",
665 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'attention', 'self', 'value', 'kernel']\n",
666 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'intermediate', 'dense', 'bias']\n",
667 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'intermediate', 'dense', 'kernel']\n",
668 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'output', 'LayerNorm', 'beta']\n",
669 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'output', 'LayerNorm', 'gamma']\n",
670 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'output', 'dense', 'bias']\n",
671 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_4', 'output', 'dense', 'kernel']\n",
672 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'attention', 'output', 'LayerNorm', 'beta']\n",
673 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'attention', 'output', 'LayerNorm', 'gamma']\n",
674 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'attention', 'output', 'dense', 'bias']\n",
675 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'attention', 'output', 'dense', 'kernel']\n",
676 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'attention', 'self', 'key', 'bias']\n",
677 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'attention', 'self', 'key', 'kernel']\n",
678 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'attention', 'self', 'query', 'bias']\n",
679 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'attention', 'self', 'query', 'kernel']\n",
680 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'attention', 'self', 'value', 'bias']\n",
681 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'attention', 'self', 'value', 'kernel']\n",
682 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'intermediate', 'dense', 'bias']\n",
683 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'intermediate', 'dense', 'kernel']\n",
684 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'output', 'LayerNorm', 'beta']\n",
685 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'output', 'LayerNorm', 'gamma']\n",
686 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'output', 'dense', 'bias']\n",
687 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_5', 'output', 'dense', 'kernel']\n",
688 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'attention', 'output', 'LayerNorm', 'beta']\n",
689 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'attention', 'output', 'LayerNorm', 'gamma']\n",
690 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'attention', 'output', 'dense', 'bias']\n",
691 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'attention', 'output', 'dense', 'kernel']\n",
692 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'attention', 'self', 'key', 'bias']\n",
693 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'attention', 'self', 'key', 'kernel']\n",
694 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'attention', 'self', 'query', 'bias']\n",
695 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'attention', 'self', 'query', 'kernel']\n",
696 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'attention', 'self', 'value', 'bias']\n",
697 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'attention', 'self', 'value', 'kernel']\n",
698 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'intermediate', 'dense', 'bias']\n",
699 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'intermediate', 'dense', 'kernel']\n",
700 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'output', 'LayerNorm', 'beta']\n",
701 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'output', 'LayerNorm', 'gamma']\n",
702 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'output', 'dense', 'bias']\n",
703 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_6', 'output', 'dense', 'kernel']\n",
704 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'attention', 'output', 'LayerNorm', 'beta']\n",
705 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'attention', 'output', 'LayerNorm', 'gamma']\n",
706 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'attention', 'output', 'dense', 'bias']\n",
707 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'attention', 'output', 'dense', 'kernel']\n",
708 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'attention', 'self', 'key', 'bias']\n",
709 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'attention', 'self', 'key', 'kernel']\n",
710 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'attention', 'self', 'query', 'bias']\n",
711 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'attention', 'self', 'query', 'kernel']\n",
712 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'attention', 'self', 'value', 'bias']\n",
713 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'attention', 'self', 'value', 'kernel']\n",
714 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'intermediate', 'dense', 'bias']\n",
715 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'intermediate', 'dense', 'kernel']\n",
716 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'output', 'LayerNorm', 'beta']\n",
717 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'output', 'LayerNorm', 'gamma']\n",
718 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'output', 'dense', 'bias']\n",
719 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_7', 'output', 'dense', 'kernel']\n",
720 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'attention', 'output', 'LayerNorm', 'beta']\n",
721 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'attention', 'output', 'LayerNorm', 'gamma']\n",
722 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'attention', 'output', 'dense', 'bias']\n",
723 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'attention', 'output', 'dense', 'kernel']\n",
724 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'attention', 'self', 'key', 'bias']\n",
725 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'attention', 'self', 'key', 'kernel']\n",
726 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'attention', 'self', 'query', 'bias']\n",
727 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'attention', 'self', 'query', 'kernel']\n",
728 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'attention', 'self', 'value', 'bias']\n",
729 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'attention', 'self', 'value', 'kernel']\n",
730 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'intermediate', 'dense', 'bias']\n",
731 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'intermediate', 'dense', 'kernel']\n",
732 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'output', 'LayerNorm', 'beta']\n",
733 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'output', 'LayerNorm', 'gamma']\n",
734 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'output', 'dense', 'bias']\n",
735 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_8', 'output', 'dense', 'kernel']\n",
736 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'output', 'LayerNorm', 'beta']\n",
737 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'output', 'LayerNorm', 'gamma']\n",
738 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'output', 'dense', 'bias']\n",
739 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'output', 'dense', 'kernel']\n",
740 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'key', 'bias']\n",
741 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'key', 'kernel']\n",
742 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'query', 'bias']\n",
743 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'query', 'kernel']\n",
744 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'value', 'bias']\n",
745 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'attention', 'self', 'value', 'kernel']\n",
746 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'intermediate', 'dense', 'bias']\n",
747 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'intermediate', 'dense', 'kernel']\n",
748 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'output', 'LayerNorm', 'beta']\n",
749 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'output', 'LayerNorm', 'gamma']\n",
750 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'output', 'dense', 'bias']\n",
751 | "Initialize PyTorch weight ['bert', 'encoder', 'layer_9', 'output', 'dense', 'kernel']\n",
752 | "Initialize PyTorch weight ['bert', 'pooler', 'dense', 'bias']\n",
753 | "Initialize PyTorch weight ['bert', 'pooler', 'dense', 'kernel']\n",
754 | "Initialize PyTorch weight ['cls', 'predictions', 'output_bias']\n",
755 | "Initialize PyTorch weight ['cls', 'predictions', 'transform', 'LayerNorm', 'beta']\n",
756 | "Initialize PyTorch weight ['cls', 'predictions', 'transform', 'LayerNorm', 'gamma']\n",
757 | "Initialize PyTorch weight ['cls', 'predictions', 'transform', 'dense', 'bias']\n",
758 | "Initialize PyTorch weight ['cls', 'predictions', 'transform', 'dense', 'kernel']\n",
759 | "Initialize PyTorch weight ['cls', 'seq_relationship', 'output_bias']\n",
760 | "Initialize PyTorch weight ['cls', 'seq_relationship', 'output_weights']\n"
761 | ]
762 | },
763 | {
764 | "ename": "NameError",
765 | "evalue": "name 'pytorch_dump_path' is not defined",
766 | "output_type": "error",
767 | "traceback": [
768 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
769 | "\u001b[0;31mNameError\u001b[0m Traceback (most recent call last)",
770 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[1;32m 37\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 38\u001b[0m \u001b[0;31m# Save pytorch-model\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 39\u001b[0;31m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Save PyTorch model to {}\"\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mpytorch_dump_path\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 40\u001b[0m \u001b[0mtorch\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0msave\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmodel\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstate_dict\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mpytorch_dump_path\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n",
771 | "\u001b[0;31mNameError\u001b[0m: name 'pytorch_dump_path' is not defined"
772 | ]
773 | }
774 | ],
775 | "source": [
776 | "\n",
777 | "for name, array in zip(names, arrays):\n",
778 | " name = name.split('/')\n",
779 |     "    # adam_v and adam_m are variables used in AdamWeightDecayOptimizer to calculate m and v,\n",
780 |     "    # which are not required when using the pretrained model\n",
781 | " if any(n in [\"adam_v\", \"adam_m\", \"global_step\"] for n in name):\n",
782 | " print(\"Skipping {}\".format(\"/\".join(name)))\n",
783 | " continue\n",
784 | " pointer = model\n",
785 | " for m_name in name:\n",
786 | " if re.fullmatch(r'[A-Za-z]+_\\d+', m_name):\n",
787 | " l = re.split(r'_(\\d+)', m_name)\n",
788 | " else:\n",
789 | " l = [m_name]\n",
790 | " if l[0] == 'kernel' or l[0] == 'gamma':\n",
791 | " pointer = getattr(pointer, 'weight')\n",
792 | " elif l[0] == 'output_bias' or l[0] == 'beta':\n",
793 | " pointer = getattr(pointer, 'bias')\n",
794 | " elif l[0] == 'output_weights':\n",
795 | " pointer = getattr(pointer, 'weight')\n",
796 | " else:\n",
797 | " pointer = getattr(pointer, l[0])\n",
798 | " if len(l) >= 2:\n",
799 | " num = int(l[1])\n",
800 | " pointer = pointer[num]\n",
801 | " if m_name[-11:] == '_embeddings':\n",
802 | " pointer = getattr(pointer, 'weight')\n",
803 | " elif m_name == 'kernel':\n",
804 | " array = np.transpose(array)\n",
805 | " try:\n",
806 | " assert pointer.shape == array.shape\n",
807 | " except AssertionError as e:\n",
808 | " e.args += (pointer.shape, array.shape)\n",
809 | " raise\n",
810 | " print(\"Initialize PyTorch weight {}\".format(name))\n",
811 | " pointer.data = torch.from_numpy(array)\n",
812 | "\n",
813 | "# Save pytorch-model\n"
814 | ]
815 | },
816 | {
817 | "cell_type": "code",
818 | "execution_count": 34,
819 | "metadata": {},
820 | "outputs": [
821 | {
822 | "name": "stdout",
823 | "output_type": "stream",
824 | "text": [
825 | "Save PyTorch model to weights/\n"
826 | ]
827 | }
828 | ],
829 | "source": [
830 | "print(\"Save PyTorch model to {}\".format('weights/'))\n",
831 | "torch.save(model.state_dict(),'weights/pytorch_weight')"
832 | ]
833 | },
834 | {
835 | "cell_type": "code",
836 | "execution_count": null,
837 | "metadata": {},
838 | "outputs": [],
839 | "source": []
840 | },
841 | {
842 | "cell_type": "code",
843 | "execution_count": null,
844 | "metadata": {},
845 | "outputs": [],
846 | "source": [
847 | "import os\n",
848 | "import re\n",
849 | "import argparse\n",
850 | "import tensorflow as tf\n",
851 | "import torch\n",
852 | "import numpy as np\n",
853 | "\n",
854 | "from pytorch_pretrained_bert import BertConfig, BertForPreTraining\n",
855 | "\n",
856 | "def convert_tf_checkpoint_to_pytorch(tf_checkpoint_path, bert_config_file, pytorch_dump_path):\n",
857 | " config_path = os.path.abspath(bert_config_file)\n",
858 | " tf_path = os.path.abspath(tf_checkpoint_path)\n",
859 | " print(\"Converting TensorFlow checkpoint from {} with config at {}\".format(tf_path, config_path))\n",
860 | " # Load weights from TF model\n",
861 | " init_vars = tf.train.list_variables(tf_path)\n",
862 | " names = []\n",
863 | " arrays = []\n",
864 | " for name, shape in init_vars:\n",
865 | " print(\"Loading TF weight {} with shape {}\".format(name, shape))\n",
866 | " array = tf.train.load_variable(tf_path, name)\n",
867 | " names.append(name)\n",
868 | " arrays.append(array)\n",
869 | "\n",
870 | " # Initialise PyTorch model\n",
871 | " config = BertConfig.from_json_file(bert_config_file)\n",
872 | " print(\"Building PyTorch model from configuration: {}\".format(str(config)))\n",
873 | " model = BertForPreTraining(config)\n",
874 | "\n",
875 | " for name, array in zip(names, arrays):\n",
876 | " name = name.split('/')\n",
877 |     "        # adam_v and adam_m are variables used in AdamWeightDecayOptimizer to calculate m and v,\n",
878 |     "        # which are not required when using the pretrained model\n",
879 | " if any(n in [\"adam_v\", \"adam_m\", \"global_step\"] for n in name):\n",
880 | " print(\"Skipping {}\".format(\"/\".join(name)))\n",
881 | " continue\n",
882 | " pointer = model\n",
883 | " for m_name in name:\n",
884 | " if re.fullmatch(r'[A-Za-z]+_\\d+', m_name):\n",
885 | " l = re.split(r'_(\\d+)', m_name)\n",
886 | " else:\n",
887 | " l = [m_name]\n",
888 | " if l[0] == 'kernel' or l[0] == 'gamma':\n",
889 | " pointer = getattr(pointer, 'weight')\n",
890 | " elif l[0] == 'output_bias' or l[0] == 'beta':\n",
891 | " pointer = getattr(pointer, 'bias')\n",
892 | " elif l[0] == 'output_weights':\n",
893 | " pointer = getattr(pointer, 'weight')\n",
894 | " else:\n",
895 | " pointer = getattr(pointer, l[0])\n",
896 | " if len(l) >= 2:\n",
897 | " num = int(l[1])\n",
898 | " pointer = pointer[num]\n",
899 | " if m_name[-11:] == '_embeddings':\n",
900 | " pointer = getattr(pointer, 'weight')\n",
901 | " elif m_name == 'kernel':\n",
902 | " array = np.transpose(array)\n",
903 | " try:\n",
904 | " assert pointer.shape == array.shape\n",
905 | " except AssertionError as e:\n",
906 | " e.args += (pointer.shape, array.shape)\n",
907 | " raise\n",
908 | " print(\"Initialize PyTorch weight {}\".format(name))\n",
909 | " pointer.data = torch.from_numpy(array)\n",
910 | "\n",
911 | " # Save pytorch-model\n",
912 | " print(\"Save PyTorch model to {}\".format(pytorch_dump_path))\n",
913 | " torch.save(model.state_dict(), pytorch_dump_path)\n"
914 | ]
915 | }
916 | ],
917 | "metadata": {
918 | "kernelspec": {
919 | "display_name": "Python 3",
920 | "language": "python",
921 | "name": "python3"
922 | },
923 | "language_info": {
924 | "codemirror_mode": {
925 | "name": "ipython",
926 | "version": 3
927 | },
928 | "file_extension": ".py",
929 | "mimetype": "text/x-python",
930 | "name": "python",
931 | "nbconvert_exporter": "python",
932 | "pygments_lexer": "ipython3",
933 | "version": "3.6.8"
934 | }
935 | },
936 | "nbformat": 4,
937 | "nbformat_minor": 2
938 | }
939 |
--------------------------------------------------------------------------------
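The last cell above bundles the whole procedure into `convert_tf_checkpoint_to_pytorch`. A minimal sketch of how it could be called from the notebook, assuming the downloaded BioBERT release was unpacked under `weights/pubmed_pmc_470k/` (the checkpoint file name below is an assumption; use whatever the release actually ships):

```python
# Hedged example: config path matches parameters.py; the checkpoint name is an
# assumption and may differ between BioBERT releases.
convert_tf_checkpoint_to_pytorch(
    tf_checkpoint_path="weights/pubmed_pmc_470k/biobert_model.ckpt",
    bert_config_file="weights/pubmed_pmc_470k/bert_config.json",
    pytorch_dump_path="weights/pytorch_weight",
)
```
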
/biobert_ner/data_load.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | from torch.utils import data
3 | import parameters
4 | import torch
5 | from pytorch_pretrained_bert import BertTokenizer
6 |
7 |
8 | class HParams:
9 | def __init__(self, vocab_type):
10 | self.VOCAB_DICT = {
11 |             'bc5cdr': ('<PAD>', 'B-Chemical', 'O', 'B-Disease', 'I-Disease', 'I-Chemical'),
12 |             'bionlp3g': ('<PAD>', 'B-Amino_acid', 'B-Anatomical_system', 'B-Cancer', 'B-Cell',
13 | 'B-Cellular_component', 'B-Developing_anatomical_structure', 'B-Gene_or_gene_product',
14 | 'B-Immaterial_anatomical_entity', 'B-Multi-tissue_structure', 'B-Organ', 'B-Organism',
15 | 'B-Organism_subdivision', 'B-Organism_substance', 'B-Pathological_formation',
16 | 'B-Simple_chemical', 'B-Tissue', 'I-Amino_acid', 'I-Anatomical_system', 'I-Cancer',
17 | 'I-Cell', 'I-Cellular_component', 'I-Developing_anatomical_structure', 'I-Gene_or_gene_product',
18 | 'I-Immaterial_anatomical_entity', 'I-Multi-tissue_structure', 'I-Organ', 'I-Organism',
19 | 'I-Organism_subdivision', 'I-Organism_substance', 'I-Pathological_formation', 'I-Simple_chemical',
20 | 'I-Tissue', 'O')
21 | }
22 | self.VOCAB = self.VOCAB_DICT[vocab_type]
23 | self.tag2idx = {v:k for k,v in enumerate(self.VOCAB)}
24 | self.idx2tag = {k:v for k,v in enumerate(self.VOCAB)}
25 |
26 | self.batch_size = 128
27 | self.lr = 0.0001
28 | self.n_epochs = 30
29 |
30 | self.tokenizer = BertTokenizer(vocab_file=parameters.VOCAB_FILE, do_lower_case=False)
31 | self.device = 'cuda' if torch.cuda.is_available() else 'cpu'
32 |
33 |
34 | class NerDataset(data.Dataset):
35 | def __init__(self, path, vocab_type):
36 | self.hp = HParams(vocab_type)
37 | instances = open(path).read().strip().split('\n\n')
38 | sents = []
39 | tags_li = []
40 | for entry in instances:
41 | words = [line.split()[0] for line in entry.splitlines()]
42 | tags = ([line.split()[-1] for line in entry.splitlines()])
43 | sents.append(["[CLS]"] + words + ["[SEP]"])
44 |             tags_li.append(["<PAD>"] + tags + ["<PAD>"])
45 | self.sents, self.tags_li = sents, tags_li
46 |
47 | def __len__(self):
48 | return len(self.sents)
49 |
50 |
51 | def __getitem__(self, idx):
52 | words, tags = self.sents[idx], self.tags_li[idx] # words, tags: string list
53 |
54 | # We give credits only to the first piece.
55 | x, y = [], [] # list of ids
56 | is_heads = [] # list. 1: the token is the first piece of a word
57 | for w, t in zip(words, tags):
58 | tokens = self.hp.tokenizer.tokenize(w) if w not in ("[CLS]", "[SEP]") else [w]
59 | xx = self.hp.tokenizer.convert_tokens_to_ids(tokens)
60 |
61 | is_head = [1] + [0]*(len(tokens) - 1)
62 |
63 |             t = [t] + ["<PAD>"] * (len(tokens) - 1)  # <PAD>: no decision
64 | yy = [self.hp.tag2idx[each] for each in t] # (T,)
65 |
66 | x.extend(xx)
67 | is_heads.extend(is_head)
68 | y.extend(yy)
69 |
70 | assert len(x)==len(y)==len(is_heads), f"len(x)={len(x)}, len(y)={len(y)}, len(is_heads)={len(is_heads)}"
71 |
72 | # seqlen
73 | seqlen = len(y)
74 |
75 | # to string
76 | words = " ".join(words)
77 | tags = " ".join(tags)
78 | return words, x, is_heads, tags, y, seqlen
79 |
80 |
81 | def pad(batch):
82 | '''Pads to the longest sample'''
83 | f = lambda x: [sample[x] for sample in batch]
84 | words = f(0)
85 | is_heads = f(2)
86 | tags = f(3)
87 | seqlens = f(-1)
88 | maxlen = np.array(seqlens).max()
89 |
90 |     f = lambda x, seqlen: [sample[x] + [0] * (seqlen - len(sample[x])) for sample in batch]  # 0: <PAD>
91 | x = f(1, maxlen)
92 | y = f(-2, maxlen)
93 |
94 |
95 | f = torch.LongTensor
96 |
97 | return words, f(x), is_heads, tags, f(y), seqlens
--------------------------------------------------------------------------------
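For orientation, a minimal sketch of how `NerDataset` and `pad` fit together (the same wiring `new_train.py` uses), assuming a CoNLL-style `data/train.tsv` and the BioBERT vocabulary referenced by `parameters.VOCAB_FILE` are in place:

```python
from torch.utils import data
from data_load import NerDataset, pad

# data/train.tsv is assumed: one "token tag" pair per line, with a blank line
# between sentences, as in the MTL-Bioinformatics-2016 datasets.
dataset = NerDataset("data/train.tsv", "bc5cdr")
loader = data.DataLoader(dataset, batch_size=8, shuffle=True, collate_fn=pad)

words, x, is_heads, tags, y, seqlens = next(iter(loader))
print(x.shape, y.shape)       # both (batch, longest sequence in the batch)
print(is_heads[0][:10])       # 1 marks the first word-piece of each word
```
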
/biobert_ner/extras/ezgif.com-video-to-gif.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/MeRajat/SolvingAlmostAnythingWithBert/1bfb6d679a668179bbb783d1c0eb9f338cd0f1c5/biobert_ner/extras/ezgif.com-video-to-gif.gif
--------------------------------------------------------------------------------
/biobert_ner/new_model.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | from pytorch_pretrained_bert import BertModel
4 |
5 | class Net(nn.Module):
6 | def __init__(self, config, bert_state_dict, vocab_len, device = 'cpu'):
7 | super().__init__()
8 | self.bert = BertModel(config)
9 | if bert_state_dict is not None:
10 | self.bert.load_state_dict(bert_state_dict)
11 | self.bert.eval()
12 | self.rnn = nn.LSTM(bidirectional=True, num_layers=2, input_size=768, hidden_size=768//2, batch_first=True)
13 | self.fc = nn.Linear(768, vocab_len)
14 | self.device = device
15 |
16 | def forward(self, x, y):
17 | '''
18 | x: (N, T). int64
19 | y: (N, T). int64
20 |
21 | Returns
22 | enc: (N, T, VOCAB)
23 | '''
24 | x = x.to(self.device)
25 | y = y.to(self.device)
26 |
27 | with torch.no_grad():
28 | encoded_layers, _ = self.bert(x)
29 | enc = encoded_layers[-1]
30 | enc, _ = self.rnn(enc)
31 | logits = self.fc(enc)
32 | y_hat = logits.argmax(-1)
33 | return logits, y, y_hat
--------------------------------------------------------------------------------
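As a quick sanity check, `Net` can be built without the pretrained state dict and run on a dummy batch; the sketch below only assumes the BioBERT `bert_config.json` sits where `parameters.BERT_CONFIG_FILE` points (shapes follow the docstring above):

```python
import torch
from pytorch_pretrained_bert.modeling import BertConfig
from new_model import Net
import parameters

# Random token ids stand in for a real batch; vocab_len=6 matches the
# bc5cdr tag set defined in data_load.HParams.
config = BertConfig(vocab_size_or_config_json_file=parameters.BERT_CONFIG_FILE)
model = Net(config=config, bert_state_dict=None, vocab_len=6, device='cpu')

x = torch.randint(0, config.vocab_size, (2, 12))   # (N, T) token ids
y = torch.zeros(2, 12, dtype=torch.long)           # (N, T) dummy labels
logits, y_out, y_hat = model(x, y)
print(logits.shape, y_hat.shape)                   # (2, 12, 6) and (2, 12)
```
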
/biobert_ner/new_train.py:
--------------------------------------------------------------------------------
1 | import torch
2 | import torch.nn as nn
3 | import torch.optim as optim
4 | from torch.utils import data
5 | from new_model import Net
6 | from data_load import NerDataset, pad, HParams
7 | import os
8 | import numpy as np
9 | from pytorch_pretrained_bert.modeling import BertConfig
10 | import parameters
11 | from collections import OrderedDict
12 |
13 |
14 | # prepare biobert state dict: keep the 'bert.*' tensors and strip the leading 'bert.' prefix so they load into BertModel
15 | tmp_d = torch.load(parameters.BERT_WEIGHTS, map_location='cpu')
16 | state_dict = OrderedDict()
17 | for i in list(tmp_d.keys())[:199]:
18 | x = i
19 | if i.find('bert') > -1:
20 | x = '.'.join(i.split('.')[1:])
21 | state_dict[x] = tmp_d[i]
22 |
23 |
24 | def train(model, iterator, optimizer, criterion):
25 | model.train()
26 | for i, batch in enumerate(iterator):
27 | words, x, is_heads, tags, y, seqlens = batch
28 | _y = y # for monitoring
29 | optimizer.zero_grad()
30 | logits, y, _ = model(x, y) # logits: (N, T, VOCAB), y: (N, T)
31 |
32 | logits = logits.view(-1, logits.shape[-1]) # (N*T, VOCAB)
33 | y = y.view(-1) # (N*T,)
34 |
35 | loss = criterion(logits, y)
36 | loss.backward()
37 |
38 | optimizer.step()
39 |
40 | if i==0:
41 | print("=====sanity check======")
42 | print("x:", x.cpu().numpy()[0])
43 | print("words:", words[0])
44 | print("tokens:", hp.tokenizer.convert_ids_to_tokens(x.cpu().numpy()[0]))
45 | print("y:", _y.cpu().numpy()[0])
46 | print("is_heads:", is_heads[0])
47 | print("tags:", tags[0])
48 | print("seqlen:", seqlens[0])
49 |
50 |
51 | if i%10==0: # monitoring
52 | print(f"step: {i}, loss: {loss.item()}")
53 |
54 | def eval(model, iterator, f):
55 | model.eval()
56 |
57 | Words, Is_heads, Tags, Y, Y_hat = [], [], [], [], []
58 | with torch.no_grad():
59 | for i, batch in enumerate(iterator):
60 | words, x, is_heads, tags, y, seqlens = batch
61 |
62 | _, _, y_hat = model(x, y) # y_hat: (N, T)
63 |
64 | Words.extend(words)
65 | Is_heads.extend(is_heads)
66 | Tags.extend(tags)
67 | Y.extend(y.numpy().tolist())
68 | Y_hat.extend(y_hat.cpu().numpy().tolist())
69 |
70 |     ## get results and save them
71 | with open(f, 'w') as fout:
72 | for words, is_heads, tags, y_hat in zip(Words, Is_heads, Tags, Y_hat):
73 | y_hat = [hat for head, hat in zip(is_heads, y_hat) if head == 1]
74 | preds = [hp.idx2tag[hat] for hat in y_hat]
75 | assert len(preds)==len(words.split())==len(tags.split())
76 | for w, t, p in zip(words.split()[1:-1], tags.split()[1:-1], preds[1:-1]):
77 | fout.write(f"{w} {t} {p}\n")
78 | fout.write("\n")
79 |
80 | ## calc metric
81 | y_true = np.array([hp.tag2idx[line.split()[1]] for line in open(f, 'r').read().splitlines() if len(line) > 0])
82 | y_pred = np.array([hp.tag2idx[line.split()[2]] for line in open(f, 'r').read().splitlines() if len(line) > 0])
83 |
84 | num_proposed = len(y_pred[y_pred>1])
85 | num_correct = (np.logical_and(y_true==y_pred, y_true>1)).astype(np.int).sum()
86 | num_gold = len(y_true[y_true>1])
87 |
88 | print(f"num_proposed:{num_proposed}")
89 | print(f"num_correct:{num_correct}")
90 | print(f"num_gold:{num_gold}")
91 | try:
92 | precision = num_correct / num_proposed
93 | except ZeroDivisionError:
94 | precision = 1.0
95 |
96 | try:
97 | recall = num_correct / num_gold
98 | except ZeroDivisionError:
99 | recall = 1.0
100 |
101 | try:
102 | f1 = 2*precision*recall / (precision + recall)
103 |     except ZeroDivisionError:
104 |         # precision + recall == 0 can only happen when nothing proposed is
105 |         # correct while gold entities exist, so F1 is 0 in that case.
106 |         f1 = 0.0
107 |
108 |
109 | final = f + ".P%.2f_R%.2f_F%.2f" %(precision, recall, f1)
110 | with open(final, 'w') as fout:
111 | result = open(f, "r").read()
112 | fout.write(f"{result}\n")
113 |
114 | fout.write(f"precision={precision}\n")
115 | fout.write(f"recall={recall}\n")
116 | fout.write(f"f1={f1}\n")
117 |
118 | os.remove(f)
119 |
120 | print("precision=%.2f"%precision)
121 | print("recall=%.2f"%recall)
122 | print("f1=%.2f"%f1)
123 | return precision, recall, f1
124 |
125 | if __name__=="__main__":
126 |
127 |
128 | train_dataset = NerDataset("data/train.tsv", 'bc5cdr') # here bc5cdr is dataset type
129 | eval_dataset = NerDataset("data/test.tsv", 'bc5cdr')
130 | hp = HParams('bc5cdr')
131 |
132 | # Define model
133 | config = BertConfig(vocab_size_or_config_json_file=parameters.BERT_CONFIG_FILE)
134 | model = Net(config = config, bert_state_dict = state_dict, vocab_len = len(hp.VOCAB), device=hp.device)
135 | if torch.cuda.is_available():
136 | model.cuda()
137 | model.train()
138 |     # pretrained BioBERT weights are already loaded inside Net via bert_state_dict
139 |
140 |
141 |
142 | train_iter = data.DataLoader(dataset=train_dataset,
143 | batch_size=hp.batch_size,
144 | shuffle=True,
145 | num_workers=4,
146 | collate_fn=pad)
147 | eval_iter = data.DataLoader(dataset=eval_dataset,
148 | batch_size=hp.batch_size,
149 | shuffle=False,
150 | num_workers=4,
151 | collate_fn=pad)
152 |
153 | optimizer = optim.Adam(model.parameters(), lr = hp.lr)
154 | # optimizer = optim.SGD(model.parameters(), lr=0.0001, momentum=0.9)
155 | criterion = nn.CrossEntropyLoss(ignore_index=0)
156 |
157 | for epoch in range(1, hp.n_epochs+1):
158 | train(model, train_iter, optimizer, criterion)
159 | print(f"=========eval at epoch={epoch}=========")
160 | if not os.path.exists('checkpoints'): os.makedirs('checkpoints')
161 | fname = os.path.join('checkpoints', str(epoch))
162 | precision, recall, f1 = eval(model, eval_iter, fname)
163 | torch.save(model.state_dict(), f"{fname}.pt")
--------------------------------------------------------------------------------
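Once training has produced a checkpoint, it can be loaded back for tagging. A minimal inference sketch, assuming the converted vocab/config from `parameters.py` are present; the path `checkpoints/30.pt` is only an example of what `new_train.py` writes:

```python
import torch
from pytorch_pretrained_bert.modeling import BertConfig
from data_load import HParams
from new_model import Net
import parameters

hp = HParams('bc5cdr')
config = BertConfig(vocab_size_or_config_json_file=parameters.BERT_CONFIG_FILE)
model = Net(config=config, bert_state_dict=None, vocab_len=len(hp.VOCAB), device=hp.device)
# checkpoints/30.pt is a hypothetical checkpoint saved by new_train.py
model.load_state_dict(torch.load("checkpoints/30.pt", map_location=hp.device))
model.to(hp.device).eval()

sentence = "Naloxone reverses the antihypertensive effect of clonidine ."
tokens = ["[CLS]"] + hp.tokenizer.tokenize(sentence) + ["[SEP]"]
ids = torch.tensor([hp.tokenizer.convert_tokens_to_ids(tokens)])
with torch.no_grad():
    _, _, y_hat = model(ids, torch.zeros_like(ids))
print(list(zip(tokens, [hp.idx2tag[i] for i in y_hat[0].tolist()])))
```
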
/biobert_ner/parameters.py:
--------------------------------------------------------------------------------
1 | VOCAB_FILE = 'weights/pubmed_pmc_470k/vocab.txt'
2 | BERT_CONFIG_FILE = 'weights/pubmed_pmc_470k/bert_config.json'
3 | BC5CDR_WEIGHT = 'weights/bc5cdr_wt.pt'
4 | BIONLP13CG_WEIGHT = 'weights/bionlp13cg_wt.pt'
5 | BERT_WEIGHTS = 'weights/pytorch_weight'
--------------------------------------------------------------------------------
/biobert_ner/requirements.txt:
--------------------------------------------------------------------------------
1 | aiohttp==3.5.4
2 | async-timeout==3.0.1
3 | attrs==18.2.0
4 | boto3==1.9.105
5 | botocore==1.12.105
6 | certifi==2018.11.29
7 | chardet==3.0.4
8 | Click==7.0
9 | docutils==0.14
10 | h11==0.8.1
11 | httptools==0.0.13
12 | idna==2.8
13 | idna-ssl==1.1.0
14 | jmespath==0.9.4
15 | multidict==4.5.2
16 | numpy==1.16.2
17 | pkg-resources==0.0.0
18 | python-dateutil==2.8.0
19 | pytorch-pretrained-bert==0.6.1
20 | regex==2019.2.21
21 | requests==2.21.0
22 | s3transfer==0.2.0
23 | six==1.12.0
24 | starlette==0.11.3
25 | torch==1.0.1.post2
26 | tqdm==4.31.1
27 | typing-extensions==3.7.2
28 | urllib3==1.24.1
29 | uvicorn==0.4.6
30 | uvloop==0.12.1
31 | websockets==7.0
32 | yarl==1.3.0
--------------------------------------------------------------------------------
/fill_the_blanks/README.md:
--------------------------------------------------------------------------------
1 | ![fill in the blanks demo](fillblank.gif)
2 |
--------------------------------------------------------------------------------
/fill_the_blanks/fill_blanks.py:
--------------------------------------------------------------------------------
1 | from pytorch_pretrained_bert import BertTokenizer, BertForMaskedLM
2 | import torch
3 | import json
4 | import re
5 | import ftfy
6 | from starlette.applications import Starlette
7 | from starlette.responses import JSONResponse, HTMLResponse, RedirectResponse
8 | import torch
9 | import uvicorn
10 | import aiohttp
11 |
12 | app = Starlette()
13 |
14 | device = 'cuda' if torch.cuda.is_available() else 'cpu'
15 |
16 | bert_model = 'bert-large-uncased'
17 | tokenizer = BertTokenizer.from_pretrained(bert_model)
18 | model = BertForMaskedLM.from_pretrained(bert_model)
19 | _ = model.eval().to(device)
20 | print("model loaded")
21 | def get_score(model, tokenizer, q_tensors, s_tensors, m_index, candidate):
22 | candidate_tokens = tokenizer.tokenize(candidate)
23 | candidate_ids = tokenizer.convert_tokens_to_ids(candidate_tokens)
24 |
25 | preds = model(q_tensors.to(device), s_tensors.to(device))
26 | predictions_candidates = preds[0, m_index, candidate_ids].mean()
27 | return predictions_candidates.item()
28 |
29 |
30 | def get_word(row):
31 |     """Replace the blank in row['question'] with [MASK], score the four
32 |     candidate options with BertForMaskedLM and return the best candidate
33 |     together with the tensor of scores."""
34 |     question = re.sub(r'_+', ' [MASK] ', ftfy.fix_encoding(row['question']))
35 | question_tokens = tokenizer.tokenize(question)
36 | masked_index = question_tokens.index('[MASK]')
37 | ## Make segments
38 | segment_ids = [0] * len(question_tokens)
39 | segment_tensors = torch.tensor([segment_ids])
40 | # Convert tokens to ids and tensors
41 | question_ids = tokenizer.convert_tokens_to_ids(question_tokens)
42 | question_tensors = torch.tensor([question_ids]).to(device)
43 |
44 | candidates = [ftfy.fix_encoding(row['1']), ftfy.fix_encoding(row['2']), ftfy.fix_encoding(row['3']), ftfy.fix_encoding(row['4'])]
45 |
46 | predict_tensor = torch.tensor([get_score(model, tokenizer, question_tensors, segment_tensors, masked_index, candidate) for candidate in candidates])
47 | predict_idx = torch.argmax(predict_tensor).item()
48 | return candidates[predict_idx], predict_tensor
49 |
50 |
51 | @app.route("/fill_blank", methods = ["GET"])
52 | async def fill_blank(request):
53 | row = {}
54 | row["question"] = request.query_params["question"]
55 | row["1"] = request.query_params["op1"]
56 | row["2"] = request.query_params["op2"]
57 | row["3"] = request.query_params["op3"]
58 | row["4"] = request.query_params["op4"]
59 | correct_word, prob_tensor = get_word(row)
60 | return JSONResponse({'word': correct_word})
61 |
62 |
63 |
64 |
65 | @app.route("/")
66 | def form(_):
67 |     return HTMLResponse("""
68 |         <h2>Try the intelligent fill in the blanks</h2>
69 |         <form action="/fill_blank" method="get">
70 |             Sentence: <input type="text" name="question"><br>
71 |
72 |             Option1: <input type="text" name="op1"><br>
73 |
74 |             Option2: <input type="text" name="op2"><br>
75 |
76 |             Option3: <input type="text" name="op3"><br>
77 |
78 |             Option4: <input type="text" name="op4"><br>
79 |
80 |             <input type="submit" value="Submit">
81 |         </form>
82 |     """)
83 |
84 | @app.route("/form")
85 | def redirect_to_homepage(_):
86 | return RedirectResponse("/")
87 |
88 |
89 |
90 | if __name__ == "__main__":
91 |     # Start the application with: python fill_blanks.py
92 |     # The server listens on 0.0.0.0:9000.
93 |     # An optional command-line check could gate this, e.g.:
94 |     # if "serve" in sys.argv:
95 | uvicorn.run(app, host="0.0.0.0", port=9000)
96 |
97 |
--------------------------------------------------------------------------------
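With the server running locally (`python fill_blanks.py`, port 9000 as above), the endpoint can be exercised with a plain GET request. A small client sketch; `requests` is assumed to be installed and the question/options are illustrative only:

```python
import requests  # assumed installed; any HTTP client works

params = {
    "question": "The patient was treated with ____ to control the seizures.",
    "op1": "diazepam",
    "op2": "bicycle",
    "op3": "sunlight",
    "op4": "keyboard",
}
resp = requests.get("http://localhost:9000/fill_blank", params=params)
print(resp.json())   # e.g. {"word": "diazepam"}
```
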
/fill_the_blanks/fillblank.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/MeRajat/SolvingAlmostAnythingWithBert/1bfb6d679a668179bbb783d1c0eb9f338cd0f1c5/fill_the_blanks/fillblank.gif
--------------------------------------------------------------------------------