├── .github └── ISSUE_TEMPLATE │ ├── bug_report.md │ └── feature_request.md ├── .gitignore ├── Add dendro-group cats.ipynb ├── CODE_OF_CONDUCT.md ├── Category_colors.ipynb ├── Filter using names.ipynb ├── Fix Enrichrgram category coloring.ipynb ├── Improved sim-mat control.ipynb ├── LICENSE ├── MANIFEST ├── Modify downsample.ipynb ├── README.md ├── RELEASE.md ├── Row filtering based on original data.ipynb ├── Test net updating.ipynb ├── Widget_View_Downsample.ipynb ├── add_cats method.ipynb ├── add_enrichr_cats.ipynb ├── clustergrammer ├── __init__.py ├── calc_clust.py ├── cat_pval.py ├── categories.py ├── data_formats.py ├── downsample_fun.py ├── enrichr_functions.py ├── export_data.py ├── iframe_web_app.py ├── initialize_net.py ├── load_data.py ├── load_vect_post.py ├── make_clust_fun.py ├── make_sim_mat.py ├── make_unique_labels.py ├── make_views.py ├── make_viz.py ├── normalize_fun.py ├── proc_df_labels.py └── run_filter.py ├── json ├── mult_view.json ├── mult_view_sim_col.json └── mult_view_sim_row.json ├── make_clustergrammer.py ├── make_stdin_stdout.py ├── python27 new import.ipynb ├── python35_new_import.ipynb ├── setup.cfg ├── setup.py └── txt ├── example_tsv.txt ├── rc_ptms.txt ├── rc_two_cats.txt └── rc_val_cats.txt /.github/ISSUE_TEMPLATE/bug_report.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Bug report 3 | about: Create a report to help us improve 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Describe the bug** 11 | A clear and concise description of what the bug is. 12 | 13 | **To Reproduce** 14 | Steps to reproduce the behavior: 15 | 1. Go to '...' 16 | 2. Click on '....' 17 | 3. Scroll down to '....' 18 | 4. See error 19 | 20 | **Expected behavior** 21 | A clear and concise description of what you expected to happen. 22 | 23 | **Screenshots** 24 | If applicable, add screenshots to help explain your problem. 25 | 26 | **Desktop (please complete the following information):** 27 | - OS: [e.g. iOS] 28 | - Browser [e.g. chrome, safari] 29 | - Version [e.g. 22] 30 | 31 | **Smartphone (please complete the following information):** 32 | - Device: [e.g. iPhone6] 33 | - OS: [e.g. iOS8.1] 34 | - Browser [e.g. stock browser, safari] 35 | - Version [e.g. 22] 36 | 37 | **Additional context** 38 | Add any other context about the problem here. 39 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.md: -------------------------------------------------------------------------------- 1 | --- 2 | name: Feature request 3 | about: Suggest an idea for this project 4 | title: '' 5 | labels: '' 6 | assignees: '' 7 | 8 | --- 9 | 10 | **Is your feature request related to a problem? Please describe.** 11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] 12 | 13 | **Describe the solution you'd like** 14 | A clear and concise description of what you want to happen. 15 | 16 | **Describe alternatives you've considered** 17 | A clear and concise description of any alternative solutions or features you've considered. 18 | 19 | **Additional context** 20 | Add any other context or screenshots about the feature request here. 
21 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Compiled source # 2 | ################### 3 | *.com 4 | *.class 5 | *.dll 6 | *.exe 7 | *.o 8 | *.so 9 | 10 | # Packages # 11 | ############ 12 | # it's better to unpack these files and commit the raw source 13 | # git has its own built in compression methods 14 | *.7z 15 | *.dmg 16 | *.gz 17 | *.iso 18 | *.jar 19 | *.rar 20 | *.tar 21 | *.zip 22 | node_modules 23 | 24 | # Logs and databases # 25 | ###################### 26 | *.log 27 | *.sql 28 | *.sqlite 29 | 30 | # OS generated files # 31 | ###################### 32 | .DS_Store 33 | .DS_Store? 34 | ._* 35 | .Spotlight-V100 36 | .Trashes 37 | ehthumbs.db 38 | Thumbs.db 39 | 40 | # cache files for sublime text 41 | *.tmlanguage.cache 42 | *.tmPreferences.cache 43 | *.stTheme.cache 44 | 45 | # workspace files are user-specific 46 | *.sublime-workspace 47 | *.sublime-project 48 | *.idea 49 | *.swo 50 | *.swp 51 | 52 | # sftp configuration file 53 | sftp-config.json 54 | 55 | #TernJS 56 | .tern-port 57 | 58 | # webpack 59 | *.js.map 60 | 61 | # python 62 | *.ipynb_checkpoints 63 | *.pyc 64 | 65 | # # eslint 66 | # .eslint* 67 | 68 | # how to retroactively use 69 | ############################ 70 | # git rm -r --cached . 71 | # git add . 72 | # git commit -m "fixing .gitignore" 73 | 74 | 75 | txt/ds_* -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | In the interest of fostering an open and welcoming environment, we as 6 | contributors and maintainers pledge to making participation in our project and 7 | our community a harassment-free experience for everyone, regardless of age, body 8 | size, disability, ethnicity, sex characteristics, gender identity and expression, 9 | level of experience, education, socio-economic status, nationality, personal 10 | appearance, race, religion, or sexual identity and orientation. 11 | 12 | ## Our Standards 13 | 14 | Examples of behavior that contributes to creating a positive environment 15 | include: 16 | 17 | * Using welcoming and inclusive language 18 | * Being respectful of differing viewpoints and experiences 19 | * Gracefully accepting constructive criticism 20 | * Focusing on what is best for the community 21 | * Showing empathy towards other community members 22 | 23 | Examples of unacceptable behavior by participants include: 24 | 25 | * The use of sexualized language or imagery and unwelcome sexual attention or 26 | advances 27 | * Trolling, insulting/derogatory comments, and personal or political attacks 28 | * Public or private harassment 29 | * Publishing others' private information, such as a physical or electronic 30 | address, without explicit permission 31 | * Other conduct which could reasonably be considered inappropriate in a 32 | professional setting 33 | 34 | ## Our Responsibilities 35 | 36 | Project maintainers are responsible for clarifying the standards of acceptable 37 | behavior and are expected to take appropriate and fair corrective action in 38 | response to any instances of unacceptable behavior. 
39 | 40 | Project maintainers have the right and responsibility to remove, edit, or 41 | reject comments, commits, code, wiki edits, issues, and other contributions 42 | that are not aligned to this Code of Conduct, or to ban temporarily or 43 | permanently any contributor for other behaviors that they deem inappropriate, 44 | threatening, offensive, or harmful. 45 | 46 | ## Scope 47 | 48 | This Code of Conduct applies both within project spaces and in public spaces 49 | when an individual is representing the project or its community. Examples of 50 | representing a project or community include using an official project e-mail 51 | address, posting via an official social media account, or acting as an appointed 52 | representative at an online or offline event. Representation of a project may be 53 | further defined and clarified by project maintainers. 54 | 55 | ## Enforcement 56 | 57 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 58 | reported by contacting the project team at nicolas.fernandez@mssm.edu. All 59 | complaints will be reviewed and investigated and will result in a response that 60 | is deemed necessary and appropriate to the circumstances. The project team is 61 | obligated to maintain confidentiality with regard to the reporter of an incident. 62 | Further details of specific enforcement policies may be posted separately. 63 | 64 | Project maintainers who do not follow or enforce the Code of Conduct in good 65 | faith may face temporary or permanent repercussions as determined by other 66 | members of the project's leadership. 67 | 68 | ## Attribution 69 | 70 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, 71 | available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html 72 | 73 | [homepage]: https://www.contributor-covenant.org 74 | 75 | For answers to common questions about this code of conduct, see 76 | https://www.contributor-covenant.org/faq 77 | -------------------------------------------------------------------------------- /Category_colors.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": { 7 | "collapsed": false, 8 | "deletable": true, 9 | "editable": true 10 | }, 11 | "outputs": [], 12 | "source": [ 13 | "import pandas as pd\n", 14 | "import numpy as np\n", 15 | "from clustergrammer import Network\n", 16 | "net = Network()" 17 | ] 18 | }, 19 | { 20 | "cell_type": "code", 21 | "execution_count": 2, 22 | "metadata": { 23 | "collapsed": false, 24 | "deletable": true, 25 | "editable": true 26 | }, 27 | "outputs": [], 28 | "source": [ 29 | "net.load_file('txt/ds_plasma.txt')" 30 | ] 31 | }, 32 | { 33 | "cell_type": "code", 34 | "execution_count": 3, 35 | "metadata": { 36 | "collapsed": false 37 | }, 38 | "outputs": [], 39 | "source": [ 40 | "net.load_file('txt/ds_plasma.txt')" 41 | ] 42 | }, 43 | { 44 | "cell_type": "code", 45 | "execution_count": 4, 46 | "metadata": { 47 | "collapsed": false, 48 | "deletable": true, 49 | "editable": true 50 | }, 51 | "outputs": [ 52 | { 53 | "data": { 54 | "text/plain": [ 55 | "{'col': {'cat-0': {'Marker-type: phospho marker': '#17becf',\n", 56 | " 'Marker-type: surface marker': '#6b6ecf'}},\n", 57 | " 'row': {'cat-0': {'Majority-Treatment: Plasma': '#dbdb8d'}, 'cat-1': {}}}" 58 | ] 59 | }, 60 | "execution_count": 4, 61 | "metadata": {}, 62 | "output_type": "execute_result" 63 | } 64 | ], 65 | "source": [ 66 | 
"net.viz['cat_colors']" 67 | ] 68 | }, 69 | { 70 | "cell_type": "code", 71 | "execution_count": null, 72 | "metadata": { 73 | "collapsed": false 74 | }, 75 | "outputs": [], 76 | "source": [] 77 | }, 78 | { 79 | "cell_type": "code", 80 | "execution_count": 5, 81 | "metadata": { 82 | "collapsed": false 83 | }, 84 | "outputs": [], 85 | "source": [ 86 | "net.load_file('txt/ds_pma.txt')" 87 | ] 88 | }, 89 | { 90 | "cell_type": "code", 91 | "execution_count": 6, 92 | "metadata": { 93 | "collapsed": false 94 | }, 95 | "outputs": [], 96 | "source": [ 97 | "df_pma = net.export_df()" 98 | ] 99 | }, 100 | { 101 | "cell_type": "code", 102 | "execution_count": 7, 103 | "metadata": { 104 | "collapsed": false 105 | }, 106 | "outputs": [ 107 | { 108 | "data": { 109 | "text/plain": [ 110 | "{'col': {'cat-0': {'Marker-type: phospho marker': '#17becf',\n", 111 | " 'Marker-type: surface marker': '#6b6ecf'}},\n", 112 | " 'row': {'cat-0': {'Majority-Treatment: PMA': '#c5b0d5',\n", 113 | " 'Majority-Treatment: Plasma': '#dbdb8d'},\n", 114 | " 'cat-1': {}}}" 115 | ] 116 | }, 117 | "execution_count": 7, 118 | "metadata": {}, 119 | "output_type": "execute_result" 120 | } 121 | ], 122 | "source": [ 123 | "net.viz['cat_colors']" 124 | ] 125 | }, 126 | { 127 | "cell_type": "code", 128 | "execution_count": 8, 129 | "metadata": { 130 | "collapsed": false 131 | }, 132 | "outputs": [], 133 | "source": [ 134 | "# generate random matrix\n", 135 | "num_rows = 500\n", 136 | "num_cols = 10\n", 137 | "np.random.seed(seed=100)\n", 138 | "mat = np.random.rand(num_rows, num_cols)\n", 139 | "\n", 140 | "# make row and col labels\n", 141 | "rows = range(num_rows)\n", 142 | "cols = range(num_cols)\n", 143 | "rows = [str(i) for i in rows]\n", 144 | "cols = [str(i) for i in cols]\n", 145 | "\n", 146 | "# make dataframe \n", 147 | "df = pd.DataFrame(data=mat, columns=cols, index=rows)" 148 | ] 149 | }, 150 | { 151 | "cell_type": "code", 152 | "execution_count": 9, 153 | "metadata": { 154 | "collapsed": false 155 | }, 156 | "outputs": [ 157 | { 158 | "data": { 159 | "text/plain": [ 160 | "{'col': {'cat-0': {'Marker-type: phospho marker': '#17becf',\n", 161 | " 'Marker-type: surface marker': '#6b6ecf'}},\n", 162 | " 'row': {'cat-0': {'Majority-Treatment: PMA': '#c5b0d5',\n", 163 | " 'Majority-Treatment: Plasma': '#dbdb8d'},\n", 164 | " 'cat-1': {}}}" 165 | ] 166 | }, 167 | "execution_count": 9, 168 | "metadata": {}, 169 | "output_type": "execute_result" 170 | } 171 | ], 172 | "source": [ 173 | "net.viz['cat_colors']" 174 | ] 175 | }, 176 | { 177 | "cell_type": "code", 178 | "execution_count": 10, 179 | "metadata": { 180 | "collapsed": true 181 | }, 182 | "outputs": [], 183 | "source": [ 184 | "net.load_df(df)" 185 | ] 186 | }, 187 | { 188 | "cell_type": "code", 189 | "execution_count": 11, 190 | "metadata": { 191 | "collapsed": false 192 | }, 193 | "outputs": [ 194 | { 195 | "data": { 196 | "text/plain": [ 197 | "{'col': {'cat-0': {'Marker-type: phospho marker': '#17becf',\n", 198 | " 'Marker-type: surface marker': '#6b6ecf'}},\n", 199 | " 'row': {'cat-0': {'Majority-Treatment: PMA': '#c5b0d5',\n", 200 | " 'Majority-Treatment: Plasma': '#dbdb8d'},\n", 201 | " 'cat-1': {}}}" 202 | ] 203 | }, 204 | "execution_count": 11, 205 | "metadata": {}, 206 | "output_type": "execute_result" 207 | } 208 | ], 209 | "source": [ 210 | "net.viz['cat_colors']" 211 | ] 212 | }, 213 | { 214 | "cell_type": "code", 215 | "execution_count": null, 216 | "metadata": { 217 | "collapsed": true 218 | }, 219 | "outputs": [], 220 | "source": [] 221 | }, 222 | { 223 | 
"cell_type": "code", 224 | "execution_count": null, 225 | "metadata": { 226 | "collapsed": true 227 | }, 228 | "outputs": [], 229 | "source": [] 230 | }, 231 | { 232 | "cell_type": "code", 233 | "execution_count": 12, 234 | "metadata": { 235 | "collapsed": false, 236 | "deletable": true, 237 | "editable": true 238 | }, 239 | "outputs": [], 240 | "source": [ 241 | "net.set_cat_color('col', 1, 'Category: one', 'blue')" 242 | ] 243 | }, 244 | { 245 | "cell_type": "code", 246 | "execution_count": 13, 247 | "metadata": { 248 | "collapsed": false, 249 | "deletable": true, 250 | "editable": true 251 | }, 252 | "outputs": [ 253 | { 254 | "data": { 255 | "text/plain": [ 256 | "{'col': {'cat-0': {'Category: one': 'blue',\n", 257 | " 'Marker-type: phospho marker': '#17becf',\n", 258 | " 'Marker-type: surface marker': '#6b6ecf'}},\n", 259 | " 'row': {'cat-0': {'Majority-Treatment: PMA': '#c5b0d5',\n", 260 | " 'Majority-Treatment: Plasma': '#dbdb8d'},\n", 261 | " 'cat-1': {}}}" 262 | ] 263 | }, 264 | "execution_count": 13, 265 | "metadata": {}, 266 | "output_type": "execute_result" 267 | } 268 | ], 269 | "source": [ 270 | "net.viz['cat_colors']" 271 | ] 272 | }, 273 | { 274 | "cell_type": "code", 275 | "execution_count": 14, 276 | "metadata": { 277 | "collapsed": true, 278 | "deletable": true, 279 | "editable": true 280 | }, 281 | "outputs": [], 282 | "source": [ 283 | "df = net.export_df()" 284 | ] 285 | }, 286 | { 287 | "cell_type": "code", 288 | "execution_count": 15, 289 | "metadata": { 290 | "collapsed": false 291 | }, 292 | "outputs": [], 293 | "source": [ 294 | "net.load_df(df)\n", 295 | "df = df.transpose()" 296 | ] 297 | }, 298 | { 299 | "cell_type": "code", 300 | "execution_count": 16, 301 | "metadata": { 302 | "collapsed": false 303 | }, 304 | "outputs": [ 305 | { 306 | "data": { 307 | "text/plain": [ 308 | "{'col': {'cat-0': {'Category: one': 'blue',\n", 309 | " 'Marker-type: phospho marker': '#17becf',\n", 310 | " 'Marker-type: surface marker': '#6b6ecf'}},\n", 311 | " 'row': {'cat-0': {'Majority-Treatment: PMA': '#c5b0d5',\n", 312 | " 'Majority-Treatment: Plasma': '#dbdb8d'},\n", 313 | " 'cat-1': {}}}" 314 | ] 315 | }, 316 | "execution_count": 16, 317 | "metadata": {}, 318 | "output_type": "execute_result" 319 | } 320 | ], 321 | "source": [ 322 | "net.viz['cat_colors']" 323 | ] 324 | }, 325 | { 326 | "cell_type": "code", 327 | "execution_count": null, 328 | "metadata": { 329 | "collapsed": true 330 | }, 331 | "outputs": [], 332 | "source": [] 333 | } 334 | ], 335 | "metadata": { 336 | "kernelspec": { 337 | "display_name": "Python 2", 338 | "language": "python", 339 | "name": "python2" 340 | }, 341 | "language_info": { 342 | "codemirror_mode": { 343 | "name": "ipython", 344 | "version": 2 345 | }, 346 | "file_extension": ".py", 347 | "mimetype": "text/x-python", 348 | "name": "python", 349 | "nbconvert_exporter": "python", 350 | "pygments_lexer": "ipython2", 351 | "version": "2.7.12" 352 | } 353 | }, 354 | "nbformat": 4, 355 | "nbformat_minor": 2 356 | } 357 | -------------------------------------------------------------------------------- /Filter using names.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": { 7 | "collapsed": false 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "from clustergrammer import Network\n", 12 | "from clustergrammer_widget import clustergrammer_widget\n", 13 | "net = Network()" 14 | ] 15 | }, 16 | { 17 | "cell_type": "code", 18 | 
"execution_count": 2, 19 | "metadata": { 20 | "collapsed": false 21 | }, 22 | "outputs": [], 23 | "source": [ 24 | "net.load_file('txt/rc_two_cats.txt')" 25 | ] 26 | }, 27 | { 28 | "cell_type": "code", 29 | "execution_count": 3, 30 | "metadata": { 31 | "collapsed": false 32 | }, 33 | "outputs": [ 34 | { 35 | "name": "stdout", 36 | "output_type": "stream", 37 | "text": [ 38 | "filter_names\n", 39 | "['ROS1', 'AAK1']\n", 40 | "[('Gene: AAK1', 'Gene Type: Not Interesting'), ('Gene: ROS1', 'Gene Type: Interesting')]\n" 41 | ] 42 | } 43 | ], 44 | "source": [ 45 | "net.filter_names('row', ['ROS1', 'AAK1'])" 46 | ] 47 | }, 48 | { 49 | "cell_type": "code", 50 | "execution_count": 4, 51 | "metadata": { 52 | "collapsed": false 53 | }, 54 | "outputs": [ 55 | { 56 | "data": { 57 | "text/plain": [ 58 | "(2, 29)" 59 | ] 60 | }, 61 | "execution_count": 4, 62 | "metadata": {}, 63 | "output_type": "execute_result" 64 | } 65 | ], 66 | "source": [ 67 | "net.dat['mat'].shape" 68 | ] 69 | }, 70 | { 71 | "cell_type": "code", 72 | "execution_count": 5, 73 | "metadata": { 74 | "collapsed": false 75 | }, 76 | "outputs": [ 77 | { 78 | "name": "stdout", 79 | "output_type": "stream", 80 | "text": [ 81 | "filter_names\n", 82 | "['H1781', 'H661']\n", 83 | "[('Cell Line: H661', 'Category: five', 'Gender: Male'), ('Cell Line: H1781', 'Category: one', 'Gender: Female')]\n" 84 | ] 85 | } 86 | ], 87 | "source": [ 88 | "net.filter_names('col', ['H1781', 'H661'])" 89 | ] 90 | }, 91 | { 92 | "cell_type": "code", 93 | "execution_count": 8, 94 | "metadata": { 95 | "collapsed": false 96 | }, 97 | "outputs": [], 98 | "source": [ 99 | "net.dat['mat'].shape\n", 100 | "net.make_clust()" 101 | ] 102 | }, 103 | { 104 | "cell_type": "code", 105 | "execution_count": 9, 106 | "metadata": { 107 | "collapsed": false 108 | }, 109 | "outputs": [ 110 | { 111 | "data": { 112 | "application/vnd.jupyter.widget-view+json": { 113 | "model_id": "79b708096ef9427baa842f5e37f6622b" 114 | } 115 | }, 116 | "metadata": {}, 117 | "output_type": "display_data" 118 | } 119 | ], 120 | "source": [ 121 | "clustergrammer_widget(network=net.widget())" 122 | ] 123 | }, 124 | { 125 | "cell_type": "code", 126 | "execution_count": null, 127 | "metadata": { 128 | "collapsed": true 129 | }, 130 | "outputs": [], 131 | "source": [] 132 | } 133 | ], 134 | "metadata": { 135 | "kernelspec": { 136 | "display_name": "Python [Root]", 137 | "language": "python", 138 | "name": "Python [Root]" 139 | }, 140 | "language_info": { 141 | "codemirror_mode": { 142 | "name": "ipython", 143 | "version": 2 144 | }, 145 | "file_extension": ".py", 146 | "mimetype": "text/x-python", 147 | "name": "python", 148 | "nbconvert_exporter": "python", 149 | "pygments_lexer": "ipython2", 150 | "version": "2.7.12" 151 | } 152 | }, 153 | "nbformat": 4, 154 | "nbformat_minor": 2 155 | } 156 | -------------------------------------------------------------------------------- /Fix Enrichrgram category coloring.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": { 7 | "collapsed": true 8 | }, 9 | "outputs": [], 10 | "source": [] 11 | } 12 | ], 13 | "metadata": { 14 | "kernelspec": { 15 | "display_name": "Python [Root]", 16 | "language": "python", 17 | "name": "Python [Root]" 18 | }, 19 | "language_info": { 20 | "codemirror_mode": { 21 | "name": "ipython", 22 | "version": 2 23 | }, 24 | "file_extension": ".py", 25 | "mimetype": "text/x-python", 26 | "name": "python", 27 | 
"nbconvert_exporter": "python", 28 | "pygments_lexer": "ipython2", 29 | "version": "2.7.12" 30 | } 31 | }, 32 | "nbformat": 4, 33 | "nbformat_minor": 2 34 | } 35 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2017 Nicolas Fernandez 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. -------------------------------------------------------------------------------- /MANIFEST: -------------------------------------------------------------------------------- 1 | # file GENERATED by distutils, do NOT edit 2 | setup.cfg 3 | setup.py 4 | clustergrammer/__init__.py 5 | clustergrammer/calc_clust.py 6 | clustergrammer/cat_pval.py 7 | clustergrammer/categories.py 8 | clustergrammer/data_formats.py 9 | clustergrammer/downsample_fun.py 10 | clustergrammer/enrichr_functions.py 11 | clustergrammer/export_data.py 12 | clustergrammer/iframe_web_app.py 13 | clustergrammer/initialize_net.py 14 | clustergrammer/load_data.py 15 | clustergrammer/load_vect_post.py 16 | clustergrammer/make_clust_fun.py 17 | clustergrammer/make_sim_mat.py 18 | clustergrammer/make_unique_labels.py 19 | clustergrammer/make_views.py 20 | clustergrammer/make_viz.py 21 | clustergrammer/normalize_fun.py 22 | clustergrammer/proc_df_labels.py 23 | clustergrammer/run_filter.py 24 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Clustergrammer Python Module 2 | The python module [clutergrammer.py](clustergrammer), takes a tab-separated matrix file as input (see format [here](#input-matrix-format)), calculates clustering, and generates the visualization json (see format [here](https://github.com/MaayanLab/clustergrammer-json)) for [clustergrammer.js](https://github.com/MaayanLab/clustergrammer). See an [example workflow](#example-workflow) below: 3 | 4 | 5 | Pleae see Clustergramer-PY's [documentation](http://clustergrammer.readthedocs.io/clustergrammer_py.html) for more information. 
6 | 7 | ## Installation 8 | The module can be used by downloading the source code here or by installing with [pip](https://pypi.python.org/pypi?:action=display&name=clustergrammer): 9 | 10 | ``` 11 | # python 2 12 | $ pip install clustergrammer 13 | 14 | # python 3 15 | $ pip3 install clustergrammer 16 | ``` 17 | 18 | ## Example Workflow 19 | ``` 20 | from clustergrammer import Network 21 | net = Network() 22 | 23 | # load matrix file 24 | net.load_file('txt/rc_two_cats.txt') 25 | 26 | # calculate clustering 27 | net.make_clust(dist_type='cos',views=['N_row_sum', 'N_row_var']) 28 | 29 | # write visualization json to file 30 | net.write_json_to_file('viz', 'json/mult_view.json') 31 | ``` 32 | The script [make_clustergrammer.py](make_clustergrammer.py) is used to generate the visualization jsons (see [json](https://github.com/MaayanLab/clustergrammer/tree/master/json) directory of the clustergrammer repo) for the examples pages on the [clustergrammer](https://github.com/MaayanLab/clustergrammer) repo. To visualize your own data modify the [make_clustergrammer.py](make_clustergrammer.py) script on the [clustergrammer](https://github.com/MaayanLab/clustergrammer) repo. 33 | 34 | ## Jupyter Notebook Examples 35 | 36 | ### Clustergrammer-Widget Example 37 | Clustergrammer can be used as a notebook extension widget. To install the widget use 38 | 39 | ``` 40 | # python 2 41 | $ pip install clustergrammer_widget 42 | 43 | # python 3 44 | $ pip3 install clustergrammer_widget 45 | ``` 46 | 47 | Within the Jupyter/IPython notebook the widget can be run using the following commands 48 | 49 | ``` 50 | # import the widget 51 | from clustergrammer_widget import * 52 | from copy import deepcopy 53 | 54 | # load data into new network instance and cluster 55 | net = deepcopy(Network()) 56 | net.load_file('rc_two_cats.txt') 57 | net.make_clust() 58 | 59 | # view the results as a widget 60 | clustergrammer_notebook(network = net.export_net_json()) 61 | ``` 62 | 63 | The [clustergrammer_widget](https://github.com/MaayanLab/clustergrammer-widget) repo contains the source code for the widget. 64 | 65 | ### IFrame Clustergrammer-web Results 66 | The python module can make an IFramed visualization in Jupyter/Ipython Python notebooks. See [Jupyter_Notebook_Example.ipynb](Jupyter_Notebook_Example.ipynb) for and example notebook or the example workflow below: 67 | 68 | ``` 69 | # upload a file to the clustergrammer web app and visualize using an Iframe 70 | from clustergrammer import Network 71 | from copy import deepcopy 72 | net = deepcopy(Network()) 73 | link = net.Iframe_web_app('txt/rc_two_cats.txt') 74 | print(link) 75 | ``` 76 | 77 | ## Clustergrammer Python Module API 78 | The python module, [clustergrammer.py](clustergrammer), allows users to upload a matrix, normalize or filter data, and make a visualization json for clustergrammer.js. 79 | 80 | The python module works in the following way. First, data is loaded into a data state (net.dat). Second, a clustered visualization json is calculated and saved in the viz state (net.viz). Third, the visualization object is exported as a json for clustergrammer.js. These three steps are shown in the [example workflow](#example-workflow) as: ```net.load_file```, ```net.make_clust```, and ```net.write_json_to_file```. 81 | 82 | The data state is similar to a Pandas Data Frame. A matrix also can be loaded directly as a [Data Frame](#df_to_dat) or [exported](#dat_to_df). 
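For example, a minimal sketch of this DataFrame round trip (the random matrix and the output path `json/df_view.json` are illustrative placeholders, not files shipped with the repo) might look like:

```
import numpy as np
import pandas as pd
from clustergrammer import Network

# stand-in numeric matrix with labeled rows and columns
df = pd.DataFrame(np.random.rand(10, 5),
                  index=['row-' + str(i) for i in range(10)],
                  columns=['col-' + str(i) for i in range(5)])

net = Network()
net.load_df(df)       # load the DataFrame into the data state (net.dat)
net.make_clust()      # calculate clustering and build the viz state (net.viz)
net.write_json_to_file('viz', 'json/df_view.json')

df_out = net.export_df()  # recover the current matrix as a Pandas DataFrame
```

Note that `load_df` re-initializes the `Network` object before loading, so any previously loaded data state is discarded.
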
83 | 84 | Below are the available functions in the ```Network``` object: 85 | 86 | ##### ```load_file(filename)``` 87 | Load a tsv file, given by filename, into the ```Network``` object (stored as ```net.dat```). 88 | 89 | ##### ```load_tsv_to_net(file_buffer)``` 90 | Load a file buffer directly into the ```Network``` object. 91 | 92 | ##### ```df_to_dat()``` 93 | This function loads a Pandas Data Frame into the ```net.dat``` state. This allows a user to directly load a Data Frame rather than have to load from a file. 94 | 95 | ##### ```swap_nan_for_zero()``` 96 | Swap all NaNs in a matrix for zeros. 97 | 98 | ##### ```filter_sum(inst_rc, threshold, take_abs=True)``` 99 | This is a filtering function that can be run before ```make_clust``` that performs a permanent filtering on rows/columns based on their sum. For instance, to filter the matrix to only include rows with a sum above a threshold, 100, do the following: ```net.filter_sum('row', threshold=100)```. Additional, filtered views can also be added using the ```views``` argument in ```make_clust```. 100 | 101 | ##### ```filter_N_top(inst_rc, N_top, rank_type='sum')``` 102 | This is a filtering function that can be run before ```make_clust``` that performs a permanent filtering on rows/columns based on their sum/variance and return the top ```N``` rows/columns with the greatest (absolute value) sum or variance. For instance, to filter a matrix with >100 rows down to the top 100 rows based on their sum do the following: ```net.filter_N_top('row', N_top=100, rank_type='sum')```. This is useful for pre-filtering very large matrices to make them easier to visualize. 103 | 104 | ##### ```filter_threshold(inst_rc, threshold, num_occur)``` 105 | This is a filtering function that can be run before ```make_clust``` that performs a permanent filterin on rows/columns based on whether ```num_occur``` of their values have an absolute value greater than ```threshold```. For instance, to filter a matrix to only include rows that have at least 3 values with an absolute value above 10 do the following: ```net.filter_threshold('row', threshold=3, num_occur=10)```. This is useful for filtering rows/columns that have the same or simlar sums and variances. 106 | 107 | ##### ```make_clust()``` 108 | Calculate clustering and produce a visualization object (stored as ```net.viz```). The optional arguments are listed below: 109 | 110 | - ```dist_type='cosine'``` The distance metric used to calculate the distance between all rows and columns (using Scipy). The defalt is cosine distance. 111 | 112 | - ```run_clustering=True``` This determines whether clustering will be calculated. The default is set to ```True```. If ```False``` is given then a visualization of the matrix in its original ordering will be returned. 113 | 114 | - ```dendro=True``` This determines whether a dendrogram will be included in the visualization. The default is True. 115 | 116 | - ```linkage_type='average'``` This determines the linkage type used by Scipy to perform hierarchical clustering. For more options (e.g. 'single', 'complete') and information see [hierarchy.linkage documentation](http://docs.scipy.org/doc/scipy-0.17.0/reference/generated/scipy.cluster.hierarchy.linkage.html). 117 | 118 | - ```views=['N_row_sum', 'N_row_var']``` This determines which row-filtered views will be calculated for the clustergram. Filters can be based on sum or variance and the cutoffs can be defined in absolute numbers (```N```) or as a percentage of the number of rows (```pct```). 
These views are available on the front-end visualization using the sliders. The defalt is ```['N_row_sum', 'N_row_var']```. The four options are: 119 | - ```N_row_sum``` This indicates that additional row-filtered views should be calculated based on the sum of the values in the rows with cutoffs defined by absolute number. For instance, additional views will be calculated showing the top 500, 250, 100, 50, 20, and 10 rows based on the absolute sum of their values. 120 | 121 | - ```pct_row_sum``` This indicates that additional row-filtered views should be calculated based on the sum of the values in the rows with cutoffs defined by the percentage of rows. For instance, additional views will be calculated showing the top 10%, 20%, 30%, ... rows based on the absolute sum of their values. 122 | 123 | - ```N_row_var``` This indicates that additional row-filtered views should be calculated based on the variance of the values in the rows with cutoffs defined by absolute number. For instance, additional views will be calculated showing the top 500, 250, 100, 50, 20, and 10 rows based on the variance of their values. 124 | 125 | - ```pct_row_sum``` This indicates that additional row-filtered views should be calculated based on the variance of the values in the rows with cutoffs defined by the percentage of rows. For instance, additional views will be calculated showing the top 10%, 20%, 30%, ... rows based on the variance of their values. 126 | 127 | - ```sim_mat=False``` This determines whether row and column similarity matrix visualizations will be calculated from your input matrix. The default is ```False```. If it is set to ```True```, then the row and column distance matrices used to calculate hierarchical clustering will be convered to similarity matrices and clustered. These visualization jsons will be stored as ```net.sim['row']``` and ```net.sim['col']```. These can be exporeted for visualization using ```net.write_json_to_file('sim_row', 'sim_row.json')``` and an example of this can be seen in [make_clustergrammer.py](make_clustergrammer.py). 128 | 129 | ##### ```write_json_to_file(net_type, filename, indent='no-indent')``` 130 | This writes a json of the network object data, either ```net.viz``` or ```net.dat```, to a file. Choose ```'viz'``` in order to write a visualization json for clustergrammer.js, e.g. ```net.write_json_to_file('viz','clustergram.json')``` 131 | 132 | ##### ```write_matrix_to_tsv(filename, df=None)``` 133 | This write the matrix, stored in the network object, to a tsv file. Optional row/column categories are saved as tuples. See [tuple_cats.txt](txt/tuple_cats.txt) or [export.txt](txt/export.txt) for examples of the exported matrix file format. 134 | 135 | ##### ```export_net_json(net_type, indent='no-indent')``` 136 | This exports a json string from either ```net.dat``` or ```net.viz```. This is useful if a user wants the json, but does not want to first write to file. 137 | 138 | ##### ```dat_to_df()``` 139 | Export a matrix that has been loaded into the ```Network``` object as a Pandas Data Frame. -------------------------------------------------------------------------------- /RELEASE.md: -------------------------------------------------------------------------------- 1 | Publication Instructions 2 | ------------------------------- 3 | http://peterdowns.com/posts/first-time-with-pypi.html 4 | 5 | Updating Instructions 6 | ---------------------------- 7 | 8 | First, release a new version and push to github repo. 
9 | 10 | Then update the setup.py file to reflect the new version. 11 | 12 | adding a tag 13 | ----------------- 14 | git tag -a 0.1 -m "Adds a tag so that we can put this on PyPI." 15 | 16 | run registering and updating 17 | 18 | How to upgrade 19 | ***************** 20 | 1) After commiting changes, make a new release tag and change the setup.py file to reflect this 21 | 22 | 2) Then push with tags 23 | git push github master --tags 24 | 25 | 3) run the test registering and uploading using 26 | python setup.py register -r pypitest 27 | python setup.py sdist upload -r pypitest 28 | 29 | python setup.py register -r pypi 30 | python setup.py sdist upload -r pypi 31 | 32 | 33 | 4) upgrade package 34 | pip install clustergrammer --upgrade 35 | pip show clustergrammer -------------------------------------------------------------------------------- /Row filtering based on original data.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": null, 6 | "metadata": { 7 | "collapsed": true 8 | }, 9 | "outputs": [], 10 | "source": [] 11 | } 12 | ], 13 | "metadata": { 14 | "kernelspec": { 15 | "display_name": "Python [Root]", 16 | "language": "python", 17 | "name": "Python [Root]" 18 | }, 19 | "language_info": { 20 | "codemirror_mode": { 21 | "name": "ipython", 22 | "version": 2 23 | }, 24 | "file_extension": ".py", 25 | "mimetype": "text/x-python", 26 | "name": "python", 27 | "nbconvert_exporter": "python", 28 | "pygments_lexer": "ipython2", 29 | "version": "2.7.12" 30 | } 31 | }, 32 | "nbformat": 4, 33 | "nbformat_minor": 2 34 | } 35 | -------------------------------------------------------------------------------- /clustergrammer/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | import numpy as np 3 | import pandas as pd 4 | from copy import deepcopy 5 | 6 | from . import initialize_net 7 | from . import load_data 8 | from . import export_data 9 | from . import load_vect_post 10 | from . import make_clust_fun 11 | from . import normalize_fun 12 | from . import data_formats 13 | from . import enrichr_functions as enr_fun 14 | from . import iframe_web_app 15 | from . import run_filter 16 | from . import downsample_fun 17 | from . import categories 18 | 19 | class Network(object): 20 | ''' 21 | version 1.13.6 22 | 23 | Clustergrammer.py takes a matrix as input (either from a file of a Pandas DataFrame), normalizes/filters, hierarchically clusters, and produces the :ref:`visualization_json` for :ref:`clustergrammer_js`. 24 | 25 | Networks have two states: 26 | 27 | 1. the data state, where they are stored as a matrix and nodes 28 | 2. the viz state where they are stored as viz.links, viz.row_nodes, and viz.col_nodes. 29 | 30 | The goal is to start in a data-state and produce a viz-state of 31 | the network that will be used as input to clustergram.js. 32 | ''' 33 | 34 | def __init__(self, widget=None): 35 | initialize_net.main(self, widget) 36 | 37 | def reset(self): 38 | ''' 39 | This re-initializes the Network object. 40 | ''' 41 | initialize_net.main(self) 42 | 43 | def load_file(self, filename): 44 | ''' 45 | Load TSV file. 46 | ''' 47 | load_data.load_file(self, filename) 48 | 49 | def load_file_as_string(self, file_string, filename=''): 50 | ''' 51 | Load file as a string. 52 | ''' 53 | load_data.load_file_as_string(self, file_string, filename=filename) 54 | 55 | 56 | def load_stdin(self): 57 | ''' 58 | Load stdin TSV-formatted string. 
59 | ''' 60 | load_data.load_stdin(self) 61 | 62 | def load_tsv_to_net(self, file_buffer, filename=None): 63 | ''' 64 | This will load a TSV matrix file buffer; this is exposed so that it will 65 | be possible to load data without having to read from a file. 66 | ''' 67 | load_data.load_tsv_to_net(self, file_buffer, filename) 68 | 69 | def load_vect_post_to_net(self, vect_post): 70 | ''' 71 | Load data in the vector format JSON. 72 | ''' 73 | load_vect_post.main(self, vect_post) 74 | 75 | def load_data_file_to_net(self, filename): 76 | ''' 77 | Load Clustergrammer's dat format (saved as JSON). 78 | ''' 79 | inst_dat = self.load_json_to_dict(filename) 80 | load_data.load_data_to_net(self, inst_dat) 81 | 82 | def cluster(self, dist_type='cosine', run_clustering=True, 83 | dendro=True, views=['N_row_sum', 'N_row_var'], 84 | linkage_type='average', sim_mat=False, filter_sim=0.1, 85 | calc_cat_pval=False, run_enrichr=None, enrichrgram=None): 86 | ''' 87 | The main function performs hierarchical clustering, optionally generates filtered views (e.g. row-filtered views), and generates the :``visualization_json``. 88 | ''' 89 | initialize_net.viz(self) 90 | 91 | make_clust_fun.make_clust(self, dist_type=dist_type, run_clustering=run_clustering, 92 | dendro=dendro, 93 | requested_views=views, 94 | linkage_type=linkage_type, 95 | sim_mat=sim_mat, 96 | filter_sim=filter_sim, 97 | calc_cat_pval=calc_cat_pval, 98 | run_enrichr=run_enrichr, 99 | enrichrgram=enrichrgram) 100 | 101 | def make_clust(self, dist_type='cosine', run_clustering=True, 102 | dendro=True, views=['N_row_sum', 'N_row_var'], 103 | linkage_type='average', sim_mat=False, filter_sim=0.1, 104 | calc_cat_pval=False, run_enrichr=None, enrichrgram=None): 105 | ''' 106 | ... Will be deprecated, renaming method cluster ... 107 | The main function performs hierarchical clustering, optionally generates filtered views (e.g. row-filtered views), and generates the :``visualization_json``. 108 | ''' 109 | print('make_clust method will be deprecated in next version, please use cluster method.') 110 | initialize_net.viz(self) 111 | 112 | make_clust_fun.make_clust(self, dist_type=dist_type, run_clustering=run_clustering, 113 | dendro=dendro, 114 | requested_views=views, 115 | linkage_type=linkage_type, 116 | sim_mat=sim_mat, 117 | filter_sim=filter_sim, 118 | calc_cat_pval=calc_cat_pval, 119 | run_enrichr=run_enrichr, 120 | enrichrgram=enrichrgram) 121 | 122 | def produce_view(self, requested_view=None): 123 | ''' 124 | This function is under development and will produce a single view on demand. 125 | ''' 126 | print('\tproduce a single view of a matrix, will be used for get requests') 127 | 128 | if requested_view != None: 129 | print('requested_view') 130 | print(requested_view) 131 | 132 | def swap_nan_for_zero(self): 133 | ''' 134 | Swaps all NaN (numpy NaN) instances for zero. 135 | ''' 136 | # # may re-instate this in some form 137 | # self.dat['mat_orig'] = deepcopy(self.dat['mat']) 138 | 139 | self.dat['mat'][np.isnan(self.dat['mat'])] = 0 140 | 141 | def load_df(self, df): 142 | ''' 143 | Load Pandas DataFrame. 
144 | ''' 145 | # self.__init__() 146 | self.reset() 147 | 148 | df_dict = {} 149 | df_dict['mat'] = deepcopy(df) 150 | # always define category colors if applicable when loading a df 151 | data_formats.df_to_dat(self, df_dict, define_cat_colors=True) 152 | 153 | def export_df(self): 154 | ''' 155 | Export Pandas DataFrame/ 156 | ''' 157 | df_dict = data_formats.dat_to_df(self) 158 | return df_dict['mat'] 159 | 160 | def df_to_dat(self, df, define_cat_colors=False): 161 | ''' 162 | Load Pandas DataFrame (will be deprecated). 163 | ''' 164 | data_formats.df_to_dat(self, df, define_cat_colors) 165 | 166 | def set_cat_color(self, axis, cat_index, cat_name, inst_color): 167 | 168 | if axis == 0: 169 | axis = 'row' 170 | if axis == 1: 171 | axis = 'col' 172 | 173 | try: 174 | # process cat_index 175 | cat_index = cat_index - 1 176 | cat_index = 'cat-' + str(cat_index) 177 | 178 | self.viz['cat_colors'][axis][cat_index][cat_name] = inst_color 179 | 180 | except: 181 | print('there was an error setting the category color') 182 | 183 | def dat_to_df(self): 184 | ''' 185 | Export Pandas DataFrams (will be deprecated). 186 | ''' 187 | return data_formats.dat_to_df(self) 188 | 189 | def export_net_json(self, net_type='viz', indent='no-indent'): 190 | ''' 191 | Export dat or viz JSON. 192 | ''' 193 | return export_data.export_net_json(self, net_type, indent) 194 | 195 | def export_viz_to_widget(self, which_viz='viz'): 196 | ''' 197 | Export viz JSON, for use with clustergrammer_widget. Formerly method was 198 | named widget. 199 | ''' 200 | 201 | return export_data.export_net_json(self, which_viz, 'no-indent') 202 | 203 | def widget(self, which_viz='viz'): 204 | ''' 205 | Generate a widget visualization using the widget. The export_viz_to_widget 206 | method passes the visualization JSON to the instantiated widget, which is 207 | returned and visualized on the front-end. 208 | ''' 209 | if hasattr(self, 'widget_class') == True: 210 | self.widget_instance = self.widget_class(network = self.export_viz_to_widget(which_viz)) 211 | 212 | return self.widget_instance 213 | else: 214 | print('Can not make widget because Network has no attribute widget_class') 215 | print('Please instantiate Network with clustergrammer_widget using: Network(clustergrammer_widget)') 216 | 217 | 218 | def widget_df(self): 219 | ''' 220 | Export a DataFrame from the front-end visualization. For instance, a user 221 | can filter to show only a single cluster using the dendrogram and then 222 | get a dataframe of this cluster using the widget_df method. 223 | ''' 224 | 225 | if hasattr(self, 'widget_instance') == True: 226 | 227 | if self.widget_instance.mat_string != '': 228 | 229 | tmp_net = deepcopy(Network()) 230 | 231 | df_string = self.widget_instance.mat_string 232 | 233 | tmp_net.load_file_as_string(df_string) 234 | 235 | df = tmp_net.export_df() 236 | 237 | return df 238 | 239 | else: 240 | return self.export_df() 241 | 242 | else: 243 | if hasattr(self, 'widget_class') == True: 244 | print('Please make the widget before exporting the widget DataFrame.') 245 | print('Do this using the widget method: net.widget()') 246 | 247 | else: 248 | print('Can not make widget because Network has no attribute widget_class') 249 | print('Please instantiate Network with clustergrammer_widget using: Network(clustergrammer_widget)') 250 | 251 | def write_json_to_file(self, net_type, filename, indent='no-indent'): 252 | ''' 253 | Save dat or viz as a JSON to file. 
254 | ''' 255 | export_data.write_json_to_file(self, net_type, filename, indent) 256 | 257 | def write_matrix_to_tsv(self, filename=None, df=None): 258 | ''' 259 | Export data-matrix to file. 260 | ''' 261 | return export_data.write_matrix_to_tsv(self, filename, df) 262 | 263 | def filter_sum(self, inst_rc, threshold, take_abs=True): 264 | ''' 265 | Filter a network's rows or columns based on the sum across rows or columns. 266 | ''' 267 | inst_df = self.dat_to_df() 268 | if inst_rc == 'row': 269 | inst_df = run_filter.df_filter_row_sum(inst_df, threshold, take_abs) 270 | elif inst_rc == 'col': 271 | inst_df = run_filter.df_filter_col_sum(inst_df, threshold, take_abs) 272 | self.df_to_dat(inst_df) 273 | 274 | def filter_N_top(self, inst_rc, N_top, rank_type='sum'): 275 | ''' 276 | Filter the matrix rows or columns based on sum/variance, and only keep the top 277 | N. 278 | ''' 279 | inst_df = self.dat_to_df() 280 | 281 | inst_df = run_filter.filter_N_top(inst_rc, inst_df, N_top, rank_type) 282 | 283 | self.df_to_dat(inst_df) 284 | 285 | def filter_threshold(self, inst_rc, threshold, num_occur=1): 286 | ''' 287 | Filter the matrix rows or columns based on num_occur values being above a 288 | threshold (in absolute value). 289 | ''' 290 | inst_df = self.dat_to_df() 291 | 292 | inst_df = run_filter.filter_threshold(inst_df, inst_rc, threshold, 293 | num_occur) 294 | 295 | self.df_to_dat(inst_df) 296 | 297 | def filter_cat(self, axis, cat_index, cat_name): 298 | ''' 299 | Filter the matrix based on their category. cat_index is the index of the category, the first category has index=1. 300 | ''' 301 | run_filter.filter_cat(self, axis, cat_index, cat_name) 302 | 303 | def filter_names(self, axis, names): 304 | ''' 305 | Filter the visualization using row/column names. The function takes, axis ('row'/'col') and names, a list of strings. 306 | ''' 307 | run_filter.filter_names(self, axis, names) 308 | 309 | def clip(self, lower=None, upper=None): 310 | ''' 311 | Trim values at input thresholds using pandas function 312 | ''' 313 | df = self.export_df() 314 | df = df.clip(lower=lower, upper=upper) 315 | self.load_df(df) 316 | 317 | def normalize(self, df=None, norm_type='zscore', axis='row', keep_orig=False): 318 | ''' 319 | Normalize the matrix rows or columns using Z-score (zscore) or Quantile Normalization (qn). Users can optionally pass in a DataFrame to be normalized (and this will be incorporated into the Network object). 320 | ''' 321 | normalize_fun.run_norm(self, df, norm_type, axis, keep_orig) 322 | 323 | def downsample(self, df=None, ds_type='kmeans', axis='row', num_samples=100, random_state=1000): 324 | ''' 325 | Downsample the matrix rows or columns (currently supporting kmeans only). Users can optionally pass in a DataFrame to be downsampled (and this will be incorporated into the network object). 326 | ''' 327 | 328 | return downsample_fun.main(self, df, ds_type, axis, num_samples, random_state) 329 | 330 | def random_sample(self, num_samples, df=None, replace=False, weights=None, random_state=100, axis='row'): 331 | ''' 332 | Return random sample of matrix. 
333 | ''' 334 | 335 | if df is None: 336 | df = self.dat_to_df() 337 | 338 | if axis == 'row': 339 | axis = 0 340 | if axis == 'col': 341 | axis = 1 342 | 343 | df = self.export_df() 344 | df = df.sample(n=num_samples, replace=replace, weights=weights, random_state=random_state, axis=axis) 345 | 346 | self.load_df(df) 347 | 348 | def add_cats(self, axis, cat_data): 349 | ''' 350 | Add categories to rows or columns using cat_data array of objects. Each object in cat_data is a dictionary with one key (category title) and value (rows/column names) that have this category. Categories will be added onto the existing categories and will be added in the order of the objects in the array. 351 | 352 | Example ``cat_data``:: 353 | 354 | 355 | [ 356 | { 357 | "title": "First Category", 358 | "cats": { 359 | "true": [ 360 | "ROS1", 361 | "AAK1" 362 | ] 363 | } 364 | }, 365 | { 366 | "title": "Second Category", 367 | "cats": { 368 | "something": [ 369 | "PDK4" 370 | ] 371 | } 372 | } 373 | ] 374 | 375 | 376 | ''' 377 | for inst_data in cat_data: 378 | categories.add_cats(self, axis, inst_data) 379 | 380 | def dendro_cats(self, axis, dendro_level): 381 | ''' 382 | Generate categories from dendrogram groups/clusters. The dendrogram has 11 383 | levels to choose from 0 -> 10. Dendro_level can be given as an integer or 384 | string. 385 | ''' 386 | categories.dendro_cats(self, axis, dendro_level) 387 | 388 | def Iframe_web_app(self, filename=None, width=1000, height=800): 389 | 390 | link = iframe_web_app.main(self, filename, width, height) 391 | 392 | return link 393 | 394 | def enrichrgram(self, lib, axis='row'): 395 | ''' 396 | Add Enrichr gene enrichment results to your visualization (where your rows 397 | are genes). Run enrichrgram before clustering to incldue enrichment results 398 | as row categories. Enrichrgram can also be run on the front-end using the 399 | Enrichr logo at the top left. 400 | 401 | Set lib to the Enrichr library that you want to use for enrichment analysis. 402 | Libraries included: 403 | 404 | * ChEA_2016 405 | * KEA_2015 406 | * ENCODE_TF_ChIP-seq_2015 407 | * ENCODE_Histone_Modifications_2015 408 | * Disease_Perturbations_from_GEO_up 409 | * Disease_Perturbations_from_GEO_down 410 | * GO_Molecular_Function_2015 411 | * GO_Biological_Process_2015 412 | * GO_Cellular_Component_2015 413 | * Reactome_2016 414 | * KEGG_2016 415 | * MGI_Mammalian_Phenotype_Level_4 416 | * LINCS_L1000_Chem_Pert_up 417 | * LINCS_L1000_Chem_Pert_down 418 | 419 | ''' 420 | 421 | df = self.export_df() 422 | df, bar_info = enr_fun.add_enrichr_cats(df, axis, lib) 423 | self.load_df(df) 424 | 425 | self.dat['enrichrgram_lib'] = lib 426 | self.dat['row_cat_bars'] = bar_info 427 | 428 | @staticmethod 429 | def load_gmt(filename): 430 | return load_data.load_gmt(filename) 431 | 432 | @staticmethod 433 | def load_json_to_dict(filename): 434 | return load_data.load_json_to_dict(filename) 435 | 436 | @staticmethod 437 | def save_dict_to_json(inst_dict, filename, indent='no-indent'): 438 | export_data.save_dict_to_json(inst_dict, filename, indent) -------------------------------------------------------------------------------- /clustergrammer/calc_clust.py: -------------------------------------------------------------------------------- 1 | def cluster_row_and_col(net, dist_type='cosine', linkage_type='average', 2 | dendro=True, run_clustering=True, run_rank=True, 3 | ignore_cat=False, calc_cat_pval=False, links=False): 4 | ''' cluster net.dat and make visualization json, net.viz. 
5 | optionally leave out dendrogram colorbar groups with dendro argument ''' 6 | 7 | import scipy 8 | from copy import deepcopy 9 | from scipy.spatial.distance import pdist 10 | from . import categories, make_viz, cat_pval 11 | 12 | dm = {} 13 | for inst_rc in ['row', 'col']: 14 | 15 | tmp_mat = deepcopy(net.dat['mat']) 16 | dm[inst_rc] = calc_distance_matrix(tmp_mat, inst_rc, dist_type) 17 | 18 | # save directly to dat structure 19 | node_info = net.dat['node_info'][inst_rc] 20 | 21 | node_info['ini'] = list(range( len(net.dat['nodes'][inst_rc]), -1, -1)) 22 | 23 | # cluster 24 | if run_clustering is True: 25 | node_info['clust'], node_info['group'] = \ 26 | clust_and_group(net, dm[inst_rc], linkage_type=linkage_type) 27 | else: 28 | dendro = False 29 | node_info['clust'] = node_info['ini'] 30 | 31 | # sorting 32 | if run_rank is True: 33 | node_info['rank'] = sort_rank_nodes(net, inst_rc, 'sum') 34 | node_info['rankvar'] = sort_rank_nodes(net, inst_rc, 'var') 35 | else: 36 | node_info['rank'] = node_info['ini'] 37 | node_info['rankvar'] = node_info['ini'] 38 | 39 | ################################## 40 | if ignore_cat is False: 41 | categories.calc_cat_clust_order(net, inst_rc) 42 | 43 | if calc_cat_pval is True: 44 | cat_pval.main(net) 45 | 46 | # make the visualization json 47 | make_viz.viz_json(net, dendro, links) 48 | 49 | return dm 50 | 51 | def calc_distance_matrix(tmp_mat, inst_rc, dist_type='cosine'): 52 | from scipy.spatial.distance import pdist 53 | import numpy as np 54 | 55 | if inst_rc == 'row': 56 | inst_dm = pdist(tmp_mat, metric=dist_type) 57 | elif inst_rc == 'col': 58 | inst_dm = pdist(tmp_mat.transpose(), metric=dist_type) 59 | 60 | inst_dm[inst_dm < 0] = float(0) 61 | 62 | return inst_dm 63 | 64 | def clust_and_group(net, inst_dm, linkage_type='average'): 65 | import scipy.cluster.hierarchy as hier 66 | Y = hier.linkage(inst_dm, method=linkage_type) 67 | Z = hier.dendrogram(Y, no_plot=True) 68 | inst_clust_order = Z['leaves'] 69 | all_dist = group_cutoffs() 70 | 71 | groups = {} 72 | for inst_dist in all_dist: 73 | inst_key = str(inst_dist).replace('.', '') 74 | groups[inst_key] = hier.fcluster(Y, inst_dist * inst_dm.max(), 'distance') 75 | groups[inst_key] = groups[inst_key].tolist() 76 | 77 | return inst_clust_order, groups 78 | 79 | def sort_rank_nodes(net, rowcol, rank_type): 80 | import numpy as np 81 | from operator import itemgetter 82 | from copy import deepcopy 83 | 84 | tmp_nodes = deepcopy(net.dat['nodes'][rowcol]) 85 | inst_mat = deepcopy(net.dat['mat']) 86 | 87 | sum_term = [] 88 | for i in range(len(tmp_nodes)): 89 | inst_dict = {} 90 | inst_dict['name'] = tmp_nodes[i] 91 | 92 | if rowcol == 'row': 93 | if rank_type == 'sum': 94 | inst_dict['rank'] = np.sum(inst_mat[i, :]) 95 | elif rank_type == 'var': 96 | inst_dict['rank'] = np.var(inst_mat[i, :]) 97 | else: 98 | if rank_type == 'sum': 99 | inst_dict['rank'] = np.sum(inst_mat[:, i]) 100 | elif rank_type == 'var': 101 | inst_dict['rank'] = np.var(inst_mat[:, i]) 102 | 103 | sum_term.append(inst_dict) 104 | 105 | sum_term = sorted(sum_term, key=itemgetter('rank'), reverse=False) 106 | 107 | tmp_sort_nodes = [] 108 | for inst_dict in sum_term: 109 | tmp_sort_nodes.append(inst_dict['name']) 110 | 111 | sort_index = [] 112 | for inst_node in tmp_nodes: 113 | sort_index.append(tmp_sort_nodes.index(inst_node)) 114 | 115 | return sort_index 116 | 117 | def group_cutoffs(): 118 | all_dist = [] 119 | for i in range(11): 120 | all_dist.append(float(i) / 10) 121 | return all_dist 122 | 
-------------------------------------------------------------------------------- /clustergrammer/cat_pval.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import pandas as pd 3 | from copy import deepcopy 4 | 5 | def main(net): 6 | ''' 7 | calculate pvalue of category closeness 8 | ''' 9 | # calculate the distance between the data points within the same category and 10 | # compare to null distribution 11 | for inst_rc in ['row', 'col']: 12 | 13 | inst_nodes = deepcopy(net.dat['nodes'][inst_rc]) 14 | 15 | inst_index = deepcopy(net.dat['node_info'][inst_rc]['clust']) 16 | 17 | # reorder based on clustered order 18 | inst_nodes = [ inst_nodes[i] for i in inst_index] 19 | 20 | # make distance matrix dataframe 21 | dm = dist_matrix_lattice(inst_nodes) 22 | 23 | node_infos = list(net.dat['node_info'][inst_rc].keys()) 24 | 25 | all_cats = [] 26 | for inst_info in node_infos: 27 | if 'dict_cat_' in inst_info: 28 | all_cats.append(inst_info) 29 | 30 | for cat_dict in all_cats: 31 | 32 | tmp_dict = net.dat['node_info'][inst_rc][cat_dict] 33 | 34 | pval_name = cat_dict.replace('dict_','pval_') 35 | net.dat['node_info'][inst_rc][pval_name] = {} 36 | 37 | for cat_name in tmp_dict: 38 | 39 | subset = tmp_dict[cat_name] 40 | 41 | inst_median = calc_median_dist_subset(dm, subset) 42 | 43 | hist = calc_hist_distances(dm, subset, inst_nodes) 44 | 45 | pval = 0 46 | 47 | for i in range(len(hist['prob'])): 48 | if i == 0: 49 | pval = hist['prob'][i] 50 | if i >= 1: 51 | if inst_median >= hist['bins'][i]: 52 | pval = pval + hist['prob'][i] 53 | 54 | net.dat['node_info'][inst_rc][pval_name][cat_name] = pval 55 | 56 | def dist_matrix_lattice(names): 57 | from scipy.spatial.distance import pdist, squareform 58 | 59 | lattice_size = len(names) 60 | mat = np.zeros([lattice_size, 1]) 61 | mat[:,0] = list(range(lattice_size)) 62 | 63 | inst_dm = pdist(mat, metric='euclidean') 64 | 65 | inst_dm[inst_dm < 0] = float(0) 66 | 67 | inst_dm = squareform(inst_dm) 68 | 69 | df = pd.DataFrame(data=inst_dm, columns=names, index=names) 70 | 71 | return df 72 | 73 | 74 | def calc_median_dist_subset(dm, subset): 75 | return np.median(dm[subset].ix[subset].values) 76 | 77 | def calc_hist_distances(dm, subset, inst_nodes): 78 | np.random.seed(100) 79 | 80 | num_null = 1000 81 | num_points = len(subset) 82 | 83 | median_dist = [] 84 | for i in range(num_null): 85 | tmp = np.random.choice(inst_nodes, num_points, replace=False) 86 | median_dist.append( np.median(dm[tmp].ix[tmp].values) ) 87 | 88 | tmp_dist = sorted(deepcopy(median_dist)) 89 | 90 | median_dist = np.asarray(median_dist) 91 | s1 = pd.Series(median_dist) 92 | hist = np.histogram(s1, bins=30) 93 | 94 | H = {} 95 | H['prob'] = hist[0]/np.float(num_null) 96 | H['bins'] = hist[1] 97 | 98 | return H -------------------------------------------------------------------------------- /clustergrammer/categories.py: -------------------------------------------------------------------------------- 1 | def check_categories(lines): 2 | ''' 3 | find out how many row and col categories are available 4 | ''' 5 | # count the number of row categories 6 | rcat_line = lines[0].split('\t') 7 | 8 | # calc the number of row names and categories 9 | num_rc = 0 10 | found_end = False 11 | 12 | # skip first tab 13 | for inst_string in rcat_line[1:]: 14 | if inst_string == '': 15 | if found_end is False: 16 | num_rc = num_rc + 1 17 | else: 18 | found_end = True 19 | 20 | max_rcat = 15 21 | if max_rcat > len(lines): 22 | max_rcat = 
len(lines) - 1 23 | 24 | num_cc = 0 25 | for i in range(max_rcat): 26 | ccat_line = lines[i + 1].split('\t') 27 | 28 | # make sure that line has length greater than one to prevent false cats from 29 | # trailing new lines at end of matrix 30 | if ccat_line[0] == '' and len(ccat_line) > 1: 31 | num_cc = num_cc + 1 32 | 33 | num_labels = {} 34 | num_labels['row'] = num_rc + 1 35 | num_labels['col'] = num_cc + 1 36 | 37 | return num_labels 38 | 39 | def dict_cat(net, define_cat_colors=False): 40 | ''' 41 | make a dictionary of node-category associations 42 | ''' 43 | 44 | # print('---------------------------------') 45 | # print('---- dict_cat: before setting cat colors') 46 | # print('---------------------------------\n') 47 | # print(define_cat_colors) 48 | # print(net.viz['cat_colors']) 49 | 50 | net.persistent_cat = True 51 | 52 | for inst_rc in ['row', 'col']: 53 | inst_keys = list(net.dat['node_info'][inst_rc].keys()) 54 | all_cats = [x for x in inst_keys if 'cat-' in x] 55 | 56 | for inst_name_cat in all_cats: 57 | 58 | dict_cat = {} 59 | tmp_cats = net.dat['node_info'][inst_rc][inst_name_cat] 60 | tmp_nodes = net.dat['nodes'][inst_rc] 61 | 62 | for i in range(len(tmp_cats)): 63 | inst_cat = tmp_cats[i] 64 | inst_node = tmp_nodes[i] 65 | 66 | if inst_cat not in dict_cat: 67 | dict_cat[inst_cat] = [] 68 | 69 | dict_cat[inst_cat].append(inst_node) 70 | 71 | tmp_name = 'dict_' + inst_name_cat.replace('-', '_') 72 | net.dat['node_info'][inst_rc][tmp_name] = dict_cat 73 | 74 | # merge with old cat_colors by default 75 | cat_colors = net.viz['cat_colors'] 76 | 77 | if define_cat_colors == True: 78 | cat_number = 0 79 | 80 | for inst_rc in ['row', 'col']: 81 | 82 | inst_keys = list(net.dat['node_info'][inst_rc].keys()) 83 | all_cats = [x for x in inst_keys if 'cat-' in x] 84 | 85 | for cat_index in all_cats: 86 | 87 | if cat_index not in cat_colors[inst_rc]: 88 | cat_colors[inst_rc][cat_index] = {} 89 | 90 | cat_names = sorted(list(set(net.dat['node_info'][inst_rc][cat_index]))) 91 | 92 | # loop through each category name and assign a color 93 | for tmp_name in cat_names: 94 | 95 | # using the same rules as the front-end to define cat_colors 96 | inst_color = get_cat_color(cat_number + cat_names.index(tmp_name)) 97 | 98 | check_name = tmp_name 99 | 100 | # check if category is string type and non-numeric 101 | try: 102 | float(check_name) 103 | is_string_cat = False 104 | except: 105 | is_string_cat = True 106 | 107 | if is_string_cat == True: 108 | # check for default non-color 109 | if ': ' in check_name: 110 | check_name = check_name.split(': ')[1] 111 | 112 | # if check_name == 'False' or check_name == 'false': 113 | if 'False' in check_name or 'false' in check_name: 114 | inst_color = '#eee' 115 | 116 | if 'Not ' in check_name: 117 | inst_color = '#eee' 118 | 119 | # print('cat_colors') 120 | # print('----------') 121 | # print(cat_colors[inst_rc][cat_index]) 122 | 123 | # do not overwrite old colors 124 | if tmp_name not in cat_colors[inst_rc][cat_index] and is_string_cat: 125 | 126 | cat_colors[inst_rc][cat_index][tmp_name] = inst_color 127 | # print('overwrite: ' + tmp_name + ' -> ' + str(inst_color)) 128 | 129 | cat_number = cat_number + 1 130 | 131 | net.viz['cat_colors'] = cat_colors 132 | 133 | # print('after setting cat_colors') 134 | # print(net.viz['cat_colors']) 135 | # print('======================================\n\n') 136 | 137 | def calc_cat_clust_order(net, inst_rc): 138 | ''' 139 | cluster category subset of data 140 | ''' 141 | from .__init__ import Network 142 | from 
copy import deepcopy 143 | from . import calc_clust, run_filter 144 | 145 | inst_keys = list(net.dat['node_info'][inst_rc].keys()) 146 | all_cats = [x for x in inst_keys if 'cat-' in x] 147 | 148 | if len(all_cats) > 0: 149 | 150 | for inst_name_cat in all_cats: 151 | 152 | tmp_name = 'dict_' + inst_name_cat.replace('-', '_') 153 | dict_cat = net.dat['node_info'][inst_rc][tmp_name] 154 | 155 | unordered_cats = dict_cat.keys() 156 | 157 | ordered_cats = order_categories(unordered_cats) 158 | 159 | # this is the ordering of the columns based on their category, not 160 | # including their clustering ordering within category 161 | all_cat_orders = [] 162 | tmp_names_list = [] 163 | for inst_cat in ordered_cats: 164 | 165 | inst_nodes = dict_cat[inst_cat] 166 | 167 | tmp_names_list.extend(inst_nodes) 168 | 169 | # cat_net = deepcopy(Network()) 170 | 171 | # cat_net.dat['mat'] = deepcopy(net.dat['mat']) 172 | # cat_net.dat['nodes'] = deepcopy(net.dat['nodes']) 173 | 174 | # cat_df = cat_net.dat_to_df() 175 | 176 | # sub_df = {} 177 | # if inst_rc == 'col': 178 | # sub_df['mat'] = cat_df['mat'][inst_nodes] 179 | # elif inst_rc == 'row': 180 | # # need to transpose df 181 | # cat_df['mat'] = cat_df['mat'].transpose() 182 | # sub_df['mat'] = cat_df['mat'][inst_nodes] 183 | # sub_df['mat'] = sub_df['mat'].transpose() 184 | 185 | # # filter matrix before clustering 186 | # ################################### 187 | # threshold = 0.0001 188 | # sub_df = run_filter.df_filter_row_sum(sub_df, threshold) 189 | # sub_df = run_filter.df_filter_col_sum(sub_df, threshold) 190 | 191 | # # load back to dat 192 | # cat_net.df_to_dat(sub_df) 193 | 194 | # cat_mat_shape = cat_net.dat['mat'].shape 195 | 196 | # print('***************') 197 | # try: 198 | # if cat_mat_shape[0]>1 and cat_mat_shape[1] > 1 and all_are_numbers == False: 199 | 200 | # calc_clust.cluster_row_and_col(cat_net, 'cos') 201 | # inst_cat_order = cat_net.dat['node_info'][inst_rc]['clust'] 202 | # else: 203 | # inst_cat_order = list(range(len(cat_net.dat['nodes'][inst_rc]))) 204 | 205 | # except: 206 | # inst_cat_order = list(range(len(cat_net.dat['nodes'][inst_rc]))) 207 | 208 | 209 | # prev_order_len = len(all_cat_orders) 210 | 211 | # # add prev order length to the current order number 212 | # inst_cat_order = [i + prev_order_len for i in inst_cat_order] 213 | # all_cat_orders.extend(inst_cat_order) 214 | 215 | # # generate ordered list of row/col names, which will be used to 216 | # # assign the order to specific nodes 217 | # names_clust_list = [x for (y, x) in sorted(zip(all_cat_orders, 218 | # tmp_names_list))] 219 | 220 | names_clust_list = tmp_names_list 221 | 222 | # calc category-cluster order 223 | final_order = [] 224 | 225 | for i in range(len(net.dat['nodes'][inst_rc])): 226 | 227 | inst_node_name = net.dat['nodes'][inst_rc][i] 228 | inst_node_num = names_clust_list.index(inst_node_name) 229 | 230 | final_order.append(inst_node_num) 231 | 232 | inst_index_cat = inst_name_cat.replace('-', '_') + '_index' 233 | 234 | net.dat['node_info'][inst_rc][inst_index_cat] = final_order 235 | 236 | 237 | def order_categories(unordered_cats): 238 | ''' 239 | If categories are strings, then simple ordering is fine. 240 | If categories are values then I'll need to order based on their values. 241 | The final ordering is given as the original categories (including titles) in a 242 | ordered list. 
243 | ''' 244 | 245 | no_titles = remove_titles(unordered_cats) 246 | 247 | all_are_numbers = check_all_numbers(no_titles) 248 | 249 | if all_are_numbers: 250 | ordered_cats = order_cats_based_on_values(unordered_cats, no_titles) 251 | else: 252 | ordered_cats = sorted(unordered_cats) 253 | 254 | return ordered_cats 255 | 256 | 257 | def order_cats_based_on_values(unordered_cats, values_list): 258 | import pandas as pd 259 | 260 | try: 261 | # convert values_list to values 262 | values_list = [float(i) for i in values_list] 263 | 264 | inst_series = pd.Series(data=values_list, index=unordered_cats) 265 | 266 | inst_series.sort_values(inplace=True) 267 | 268 | ordered_cats = inst_series.index.tolist() 269 | 270 | # ordered_cats = unordered_cats 271 | except: 272 | # keep default ordering if error occurs 273 | print('error sorting cats based on values ') 274 | ordered_cats = unordered_cats 275 | 276 | return ordered_cats 277 | 278 | def check_all_numbers(no_titles): 279 | all_numbers = True 280 | for tmp in no_titles: 281 | if is_number(tmp) == False: 282 | all_numbers = False 283 | 284 | return all_numbers 285 | 286 | def remove_titles(cats): 287 | from copy import deepcopy 288 | 289 | # check if all have titles 290 | ########################### 291 | all_have_titles = True 292 | 293 | for inst_cat in cats: 294 | if is_number(inst_cat) == False: 295 | if ': ' not in inst_cat: 296 | all_have_titles = False 297 | else: 298 | all_have_titles = False 299 | 300 | if all_have_titles: 301 | no_titles = cats 302 | no_titles = [i.split(': ')[1] for i in no_titles] 303 | 304 | else: 305 | no_titles = cats 306 | 307 | return no_titles 308 | 309 | def is_number(s): 310 | try: 311 | float(s) 312 | return True 313 | except ValueError: 314 | return False 315 | 316 | def get_cat_color(cat_num): 317 | 318 | all_colors = [ "#393b79", "#aec7e8", "#ff7f0e", "#ffbb78", "#98df8a", "#bcbd22", 319 | "#404040", "#ff9896", "#c5b0d5", "#8c564b", "#1f77b4", "#5254a3", "#FFDB58", 320 | "#c49c94", "#e377c2", "#7f7f7f", "#2ca02c", "#9467bd", "#dbdb8d", "#17becf", 321 | "#637939", "#6b6ecf", "#9c9ede", "#d62728", "#8ca252", "#8c6d31", "#bd9e39", 322 | "#e7cb94", "#843c39", "#ad494a", "#d6616b", "#7b4173", "#a55194", "#ce6dbd", 323 | "#de9ed6"]; 324 | 325 | inst_color = all_colors[cat_num % len(all_colors)] 326 | 327 | return inst_color 328 | 329 | def dendro_cats(net, axis, dendro_level): 330 | 331 | if axis == 0: 332 | axis = 'row' 333 | if axis == 1: 334 | axis = 'col' 335 | 336 | dendro_level = str(dendro_level) 337 | dendro_level_name = dendro_level 338 | if len(dendro_level) == 1: 339 | dendro_level = '0' + dendro_level 340 | 341 | df = net.export_df() 342 | 343 | if axis == 'row': 344 | old_names = df.index.tolist() 345 | elif axis == 'col': 346 | old_names = df.columns.tolist() 347 | 348 | if 'group' in net.dat['node_info'][axis]: 349 | inst_groups = net.dat['node_info'][axis]['group'][dendro_level] 350 | 351 | new_names = [] 352 | for i in range(len(old_names)): 353 | inst_name = old_names[i] 354 | group_cat = 'Group '+ str(dendro_level_name) +': cat-' + str(inst_groups[i]) 355 | inst_name = inst_name + (group_cat,) 356 | new_names.append(inst_name) 357 | 358 | if axis == 'row': 359 | df.index = new_names 360 | elif axis == 'col': 361 | df.columns = new_names 362 | 363 | net.load_df(df) 364 | 365 | else: 366 | print('please cluster, using make_clust, to define dendrogram groups before running dendro_cats') 367 | 368 | def add_cats(net, axis, cat_data): 369 | 370 | try: 371 | df = net.export_df() 372 | 373 | if 
axis == 'row': 374 | labels = df.index.tolist() 375 | elif axis == 'col': 376 | labels = df.columns.tolist() 377 | 378 | if 'title' in cat_data: 379 | inst_title = cat_data['title'] 380 | else: 381 | inst_title = 'New Category' 382 | 383 | all_cats = cat_data['cats'] 384 | 385 | # loop through all labels 386 | new_labels = [] 387 | for inst_label in labels: 388 | 389 | if type(inst_label) is tuple: 390 | check_name = inst_label[0] 391 | found_tuple = True 392 | else: 393 | check_name = inst_label 394 | found_tuple = False 395 | 396 | if ': ' in check_name: 397 | check_name = check_name.split(': ')[1] 398 | 399 | # default to False for found cat, overwrite if necessary 400 | found_cat = inst_title + ': False' 401 | 402 | # check all categories in cats 403 | for inst_cat in all_cats: 404 | 405 | inst_names = all_cats[inst_cat] 406 | 407 | if check_name in inst_names: 408 | found_cat = inst_title + ': ' + inst_cat 409 | 410 | # add category to label 411 | if found_tuple is True: 412 | new_label = inst_label + (found_cat,) 413 | else: 414 | new_label = (inst_label, found_cat) 415 | 416 | new_labels.append(new_label) 417 | 418 | 419 | # add labels back to DataFrame 420 | if axis == 'row': 421 | df.index = new_labels 422 | elif axis == 'col': 423 | df.columns = new_labels 424 | 425 | net.load_df(df) 426 | 427 | except: 428 | print('error adding new categories') 429 | 430 | 431 | 432 | 433 | -------------------------------------------------------------------------------- /clustergrammer/data_formats.py: -------------------------------------------------------------------------------- 1 | from . import make_unique_labels 2 | 3 | def df_to_dat(net, df, define_cat_colors=False): 4 | ''' 5 | This is always run when data is loaded. 6 | ''' 7 | from . import categories 8 | 9 | # check if df has unique values 10 | df['mat'] = make_unique_labels.main(net, df['mat']) 11 | 12 | net.dat['mat'] = df['mat'].values 13 | net.dat['nodes']['row'] = df['mat'].index.tolist() 14 | net.dat['nodes']['col'] = df['mat'].columns.tolist() 15 | 16 | for inst_rc in ['row', 'col']: 17 | 18 | if type(net.dat['nodes'][inst_rc][0]) is tuple: 19 | # get the number of categories from the length of the tuple 20 | # subtract 1 because the name is the first element of the tuple 21 | num_cat = len(net.dat['nodes'][inst_rc][0]) - 1 22 | 23 | net.dat['node_info'][inst_rc]['full_names'] = net.dat['nodes']\ 24 | [inst_rc] 25 | 26 | for inst_rcat in range(num_cat): 27 | net.dat['node_info'][inst_rc]['cat-' + str(inst_rcat)] = \ 28 | [i[inst_rcat + 1] for i in net.dat['nodes'][inst_rc]] 29 | 30 | net.dat['nodes'][inst_rc] = [i[0] for i in net.dat['nodes'][inst_rc]] 31 | 32 | if 'mat_up' in df: 33 | net.dat['mat_up'] = df['mat_up'].values 34 | net.dat['mat_dn'] = df['mat_dn'].values 35 | 36 | if 'mat_orig' in df: 37 | net.dat['mat_orig'] = df['mat_orig'].values 38 | 39 | categories.dict_cat(net, define_cat_colors=define_cat_colors) 40 | 41 | def dat_to_df(net): 42 | import pandas as pd 43 | 44 | df = {} 45 | nodes = {} 46 | for inst_rc in ['row', 'col']: 47 | if 'full_names' in net.dat['node_info'][inst_rc]: 48 | nodes[inst_rc] = net.dat['node_info'][inst_rc]['full_names'] 49 | else: 50 | nodes[inst_rc] = net.dat['nodes'][inst_rc] 51 | 52 | df['mat'] = pd.DataFrame(data=net.dat['mat'], columns=nodes['col'], 53 | index=nodes['row']) 54 | 55 | if 'mat_up' in net.dat: 56 | 57 | df['mat_up'] = pd.DataFrame(data=net.dat['mat_up'], 58 | columns=nodes['col'], index=nodes['row']) 59 | 60 | df['mat_dn'] = pd.DataFrame(data=net.dat['mat_dn'], 61 | 
columns=nodes['col'], index=nodes['row']) 62 | 63 | if 'mat_orig' in net.dat: 64 | df['mat_orig'] = pd.DataFrame(data=net.dat['mat_orig'], 65 | columns=nodes['col'], index=nodes['row']) 66 | 67 | return df 68 | 69 | def mat_to_numpy_arr(self): 70 | ''' convert list to numpy array - numpy arrays can not be saved as json ''' 71 | import numpy as np 72 | self.dat['mat'] = np.asarray(self.dat['mat']) -------------------------------------------------------------------------------- /clustergrammer/downsample_fun.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | import numpy as np 3 | from sklearn.cluster import MiniBatchKMeans 4 | # string used to format titles 5 | super_string = ': ' 6 | 7 | def main(net, df=None, ds_type='kmeans', axis='row', num_samples=100, random_state=1000): 8 | 9 | if df is None: 10 | df = net.export_df() 11 | 12 | # # run downsampling 13 | # random_state = 1000 14 | 15 | ds_df, ds_data = run_kmeans_mini_batch(df, num_samples, axis, random_state) 16 | 17 | net.load_df(ds_df) 18 | 19 | return ds_data 20 | 21 | def run_kmeans_mini_batch(df, num_samples=100, axis='row', random_state=1000): 22 | 23 | 24 | # gather downsampled axis information 25 | if axis == 'row': 26 | X = df 27 | orig_labels = df.index.tolist() 28 | non_ds_labels = df.columns.tolist() 29 | 30 | else: 31 | X = df.transpose() 32 | orig_labels = df.columns.tolist() 33 | non_ds_labels = df.index.tolist() 34 | 35 | cat_index = 1 36 | 37 | # run until the number of returned clusters with data-points is equal to the 38 | # number of requested clusters 39 | num_returned_clusters = 0 40 | while num_samples != num_returned_clusters: 41 | 42 | clusters, num_returned_clusters, cluster_data, cluster_pop = \ 43 | calc_mbk_clusters(X, num_samples, random_state) 44 | 45 | random_state = random_state + random_state 46 | 47 | clust_numbers = range(num_returned_clusters) 48 | clust_labels = [ 'cluster-' + str(i) for i in clust_numbers] 49 | 50 | if type(orig_labels[0]) is tuple: 51 | found_cats = True 52 | else: 53 | found_cats = False 54 | 55 | # Gather categories if necessary 56 | ######################################## 57 | # check if there are categories 58 | if found_cats: 59 | all_cats = generate_cat_data(cluster_data, orig_labels, num_samples) 60 | 61 | # genrate cluster labels, e.g. 
add number in each cluster and majority cat 62 | # if necessary 63 | cluster_labels = [] 64 | for i in range(num_returned_clusters): 65 | 66 | inst_name = 'Cluster: ' + clust_labels[i] 67 | num_in_clust_string = 'number in clust: '+ str(cluster_pop[i]) 68 | 69 | inst_tuple = (inst_name,) 70 | 71 | if found_cats: 72 | for cat_data in all_cats: 73 | cat_values = cat_data['counts'][i] 74 | max_cat_fraction = cat_values.max() 75 | max_index = np.where(cat_values == max_cat_fraction)[0][0] 76 | max_cat_name = cat_data['types'][max_index] 77 | 78 | # add category title if available 79 | cat_name_string = 'Majority-'+ cat_data['title'] +': ' + max_cat_name 80 | 81 | inst_tuple = inst_tuple + (cat_name_string,) 82 | 83 | inst_tuple = inst_tuple + (num_in_clust_string,) 84 | 85 | cluster_labels.append(inst_tuple) 86 | 87 | # ds_df is always downsampling the rows, if the user wants to downsample the 88 | # columns, the df will be switched back later 89 | ds_df = pd.DataFrame(data=clusters, index=cluster_labels, columns=non_ds_labels) 90 | 91 | # swap back for downsampled columns 92 | if axis == 'col': 93 | ds_df = ds_df.transpose() 94 | 95 | return ds_df, cluster_data 96 | 97 | def generate_cat_data(cluster_data, orig_labels, num_samples): 98 | 99 | # generate an array of orig_labels, using an array so that I can gather 100 | # label subsets using indices 101 | orig_array = np.asarray(orig_labels) 102 | 103 | example_label = orig_labels[0] 104 | 105 | # find out how many string categories are available 106 | num_cats = 0 107 | for i in range(len(example_label)): 108 | 109 | if i > 0: 110 | inst_cat = example_label[i] 111 | if super_string in inst_cat: 112 | inst_cat = inst_cat.split(super_string)[1] 113 | 114 | string_cat = True 115 | try: 116 | float(inst_cat) 117 | string_cat = False 118 | except: 119 | string_cat = True 120 | 121 | if string_cat: 122 | num_cats = num_cats + 1 123 | 124 | all_cats = [] 125 | 126 | for cat_index in range(num_cats): 127 | 128 | # index zero is for the names 129 | cat_index = cat_index + 1 130 | 131 | cat_data = {} 132 | 133 | if super_string in example_label[cat_index]: 134 | cat_data['title'] = example_label[cat_index].split(super_string)[0] 135 | else: 136 | cat_data['title'] = 'Category' 137 | 138 | # if there are string categories, then keep track of how many of each category 139 | # are found in each of the downsampled clusters. 
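  # cat_data ends up with three fields: 'title' (the category name), 'types'
  # (the sorted unique category values), and 'counts' (for each cluster, the
  # fraction of its members carrying each value), filled in below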
140 | cat_data['types'] = [] 141 | 142 | # gather possible categories 143 | for inst_label in orig_labels: 144 | 145 | inst_cat = inst_label[cat_index] 146 | 147 | if super_string in inst_cat: 148 | inst_cat = inst_cat.split(super_string)[1] 149 | 150 | # get first category 151 | cat_data['types'].append(inst_cat) 152 | 153 | cat_data['types'] = sorted(list(set(cat_data['types']))) 154 | 155 | num_cats = len(cat_data['types']) 156 | 157 | # initialize cat_data['counts'] dictionary 158 | cat_data['counts'] = {} 159 | for inst_clust in range(num_samples): 160 | cat_data['counts'][inst_clust] = np.zeros([num_cats]) 161 | 162 | # populate cat_data['counts'] 163 | for inst_clust in range(num_samples): 164 | 165 | # get the indicies of all original labels that fall in the cluster 166 | found = np.where(cluster_data == inst_clust) 167 | found_indicies = found[0] 168 | 169 | clust_names = orig_array[found_indicies] 170 | 171 | for inst_name in clust_names: 172 | 173 | # get first category name 174 | inst_name = inst_name[cat_index] 175 | 176 | if super_string in inst_name: 177 | inst_name = inst_name.split(super_string)[1] 178 | 179 | tmp_index = cat_data['types'].index(inst_name) 180 | 181 | cat_data['counts'][inst_clust][tmp_index] = cat_data['counts'][inst_clust][tmp_index] + 1 182 | 183 | # calculate fractions 184 | for inst_clust in range(num_samples): 185 | # get array 186 | counts = cat_data['counts'][inst_clust] 187 | inst_total = np.sum(counts) 188 | cat_data['counts'][inst_clust] = cat_data['counts'][inst_clust] / inst_total 189 | 190 | all_cats.append(cat_data) 191 | 192 | return all_cats 193 | 194 | def calc_mbk_clusters(X, n_clusters, random_state=1000): 195 | 196 | # kmeans is run with rows as data-points and columns as dimensions 197 | mbk = MiniBatchKMeans(init='k-means++', n_clusters=n_clusters, 198 | max_no_improvement=100, verbose=0, 199 | random_state=random_state) 200 | 201 | # need to loop through each label (each k-means cluster) and count how many 202 | # points were given this label. This will give the population size of each label 203 | mbk.fit(X) 204 | cluster_data = mbk.labels_ 205 | clusters = mbk.cluster_centers_ 206 | 207 | mbk_cluster_names, cluster_pop = np.unique(cluster_data, return_counts=True) 208 | 209 | num_returned_clusters = len(cluster_pop) 210 | 211 | return clusters, num_returned_clusters, cluster_data, cluster_pop -------------------------------------------------------------------------------- /clustergrammer/enrichr_functions.py: -------------------------------------------------------------------------------- 1 | def add_enrichr_cats(df, inst_rc, run_enrichr, num_terms=10): 2 | from copy import deepcopy 3 | 4 | tmp_gene_list = deepcopy(df.index.tolist()) 5 | 6 | gene_list = [] 7 | if type(tmp_gene_list[0]) is tuple: 8 | for inst_tuple in tmp_gene_list: 9 | gene_list.append(inst_tuple[0]) 10 | else: 11 | gene_list = tmp_gene_list 12 | 13 | orig_gene_list = deepcopy(gene_list) 14 | 15 | # set up for non-tuple case first 16 | if ': ' in gene_list[0]: 17 | # strip titles 18 | gene_list = [inst_gene.split(': ')[1] for inst_gene in gene_list] 19 | 20 | # strip extra information (e.g. 
PTMs) 21 | gene_list = [inst_gene.split('_')[0] for inst_gene in gene_list] 22 | gene_list = [inst_gene.split(' ')[0] for inst_gene in gene_list] 23 | gene_list = [inst_gene.split('-')[0] for inst_gene in gene_list] 24 | 25 | user_list_id = post_request(gene_list) 26 | 27 | enr, response_list = get_request(run_enrichr, user_list_id, max_terms=20) 28 | 29 | # p-value, adjusted pvalue, z-score, combined score, genes 30 | # 1: Term 31 | # 2: P-value 32 | # 3: Z-score 33 | # 4: Combined Score 34 | # 5: Genes 35 | # 6: pval_bh 36 | 37 | # while generating categories store as list of lists, then convert to list of 38 | # tuples 39 | 40 | bar_info = [] 41 | cat_list = [] 42 | for inst_gene in orig_gene_list: 43 | cat_list.append([inst_gene]) 44 | 45 | for inst_enr in response_list[0:num_terms]: 46 | inst_term = inst_enr[1] 47 | inst_pval = inst_enr[2] 48 | inst_cs = inst_enr[4] 49 | inst_list = inst_enr[5] 50 | 51 | pval_string = '
Pval ' + str(inst_pval) + '
' 52 | 53 | bar_info.append(inst_cs) 54 | 55 | for inst_info in cat_list: 56 | 57 | # strip titles 58 | gene_name = inst_info[0] 59 | 60 | if ': ' in gene_name: 61 | gene_name = gene_name.split(': ')[1] 62 | 63 | # strip extra information (e.g. PTMs) 64 | gene_name = gene_name.split('_')[0] 65 | gene_name = gene_name.split(' ')[0] 66 | gene_name = gene_name.split('-')[0] 67 | 68 | if gene_name in inst_list: 69 | inst_info.append(inst_term+': True'+ pval_string) 70 | else: 71 | inst_info.append(inst_term+': False'+pval_string) 72 | 73 | cat_list = [tuple(x) for x in cat_list] 74 | 75 | df.index = cat_list 76 | 77 | return df, bar_info 78 | 79 | def clust_from_response(response_list): 80 | from clustergrammer import Network 81 | import scipy 82 | import json 83 | import pandas as pd 84 | import math 85 | from copy import deepcopy 86 | 87 | # print('----------------------') 88 | # print('enrichr_clust_from_response') 89 | # print('----------------------') 90 | 91 | ini_enr = transfer_to_enr_dict( response_list ) 92 | 93 | enr = [] 94 | scores = {} 95 | score_types = ['combined_score','pval','zscore'] 96 | 97 | for score_type in score_types: 98 | scores[score_type] = pd.Series() 99 | 100 | for inst_enr in ini_enr: 101 | if inst_enr['combined_score'] > 0: 102 | 103 | # make series of enriched terms with scores 104 | for score_type in score_types: 105 | 106 | # collect the scores of the enriched terms 107 | if score_type == 'combined_score': 108 | scores[score_type][inst_enr['name']] = inst_enr[score_type] 109 | if score_type == 'pval': 110 | scores[score_type][inst_enr['name']] = -math.log(inst_enr[score_type]) 111 | if score_type == 'zscore': 112 | scores[score_type][inst_enr['name']] = -inst_enr[score_type] 113 | 114 | # keep enrichement values 115 | enr.append(inst_enr) 116 | 117 | # sort and normalize the scores 118 | for score_type in score_types: 119 | scores[score_type] = scores[score_type]/scores[score_type].max() 120 | scores[score_type].sort_values(ascending=False) 121 | 122 | number_of_enriched_terms = len(scores['combined_score']) 123 | 124 | enr_score_types = ['combined_score','pval','zscore'] 125 | 126 | if number_of_enriched_terms <10: 127 | num_dict = {'ten':10} 128 | elif number_of_enriched_terms <20: 129 | num_dict = {'ten':10, 'twenty':20} 130 | else: 131 | num_dict = {'ten':10, 'twenty':20, 'thirty':30} 132 | 133 | # gather lists of top scores 134 | top_terms = {} 135 | for enr_type in enr_score_types: 136 | top_terms[enr_type] = {} 137 | for num_terms in list(num_dict.keys()): 138 | inst_num = num_dict[num_terms] 139 | top_terms[enr_type][num_terms] = scores[enr_type].index.tolist()[: inst_num] 140 | 141 | # gather the terms that should be kept - they are at the top of the score list 142 | keep_terms = [] 143 | for inst_enr_score in top_terms: 144 | for tmp_num in list(num_dict.keys()): 145 | keep_terms.extend( top_terms[inst_enr_score][tmp_num] ) 146 | 147 | keep_terms = list(set(keep_terms)) 148 | 149 | # keep enriched terms that are at the top 10 based on at least one score 150 | keep_enr = [] 151 | for inst_enr in enr: 152 | if inst_enr['name'] in keep_terms: 153 | keep_enr.append(inst_enr) 154 | 155 | 156 | # fill in full matrix 157 | ####################### 158 | 159 | # genes 160 | row_node_names = [] 161 | # enriched terms 162 | col_node_names = [] 163 | 164 | # gather information from the list of enriched terms 165 | for inst_enr in keep_enr: 166 | col_node_names.append(inst_enr['name']) 167 | row_node_names.extend(inst_enr['int_genes']) 168 | 169 | row_node_names 
= sorted(list(set(row_node_names))) 170 | 171 | net = Network() 172 | net.dat['nodes']['row'] = row_node_names 173 | net.dat['nodes']['col'] = col_node_names 174 | net.dat['mat'] = scipy.zeros([len(row_node_names),len(col_node_names)]) 175 | 176 | for inst_enr in keep_enr: 177 | 178 | inst_term = inst_enr['name'] 179 | col_index = col_node_names.index(inst_term) 180 | 181 | # use combined score for full matrix - will not be seen in viz 182 | tmp_score = scores['combined_score'][inst_term] 183 | net.dat['node_info']['col']['value'].append(tmp_score) 184 | 185 | for inst_gene in inst_enr['int_genes']: 186 | row_index = row_node_names.index(inst_gene) 187 | 188 | # save association 189 | net.dat['mat'][row_index, col_index] = 1 190 | 191 | # cluster full matrix 192 | ############################# 193 | # do not make multiple views 194 | views = [''] 195 | 196 | if len(net.dat['nodes']['row']) > 1: 197 | net.make_clust(dist_type='jaccard', views=views, dendro=False) 198 | else: 199 | net.make_clust(dist_type='jaccard', views=views, dendro=False, run_clustering=False) 200 | 201 | # get dataframe from full matrix 202 | df = net.dat_to_df() 203 | 204 | for score_type in score_types: 205 | 206 | for num_terms in num_dict: 207 | 208 | inst_df = deepcopy(df) 209 | inst_net = deepcopy(Network()) 210 | 211 | inst_df['mat'] = inst_df['mat'][top_terms[score_type][num_terms]] 212 | 213 | # load back into net 214 | inst_net.df_to_dat(inst_df) 215 | 216 | # make views 217 | if len(net.dat['nodes']['row']) > 1: 218 | inst_net.make_clust(dist_type='jaccard', views=['N_row_sum'], dendro=False) 219 | else: 220 | inst_net.make_clust(dist_type='jaccard', views=['N_row_sum'], dendro=False, run_clustering = False) 221 | 222 | inst_views = inst_net.viz['views'] 223 | 224 | # add score_type to views 225 | for inst_view in inst_views: 226 | 227 | inst_view['N_col_sum'] = num_dict[num_terms] 228 | 229 | inst_view['enr_score_type'] = score_type 230 | 231 | # add values to col_nodes and order according to rank 232 | for inst_col in inst_view['nodes']['col_nodes']: 233 | 234 | inst_col['rank'] = len(top_terms[score_type][num_terms]) - top_terms[score_type][num_terms].index(inst_col['name']) 235 | 236 | inst_name = inst_col['name'] 237 | inst_col['value'] = scores[score_type][inst_name] 238 | 239 | # add views to main network 240 | net.viz['views'].extend(inst_views) 241 | 242 | return net 243 | 244 | # make the get request to enrichr using the requests library 245 | # this is done before making the get request with the lib name 246 | def post_request(input_genes, meta=''): 247 | # get metadata 248 | import requests 249 | import json 250 | 251 | # stringify list 252 | input_genes = '\n'.join(input_genes) 253 | 254 | # define post url 255 | post_url = 'http://amp.pharm.mssm.edu/Enrichr/addList' 256 | 257 | # define parameters 258 | params = {'list':input_genes, 'description':''} 259 | 260 | # make request: post the gene list 261 | post_response = requests.post( post_url, files=params) 262 | 263 | # load json 264 | inst_dict = json.loads( post_response.text ) 265 | userListId = str(inst_dict['userListId']) 266 | 267 | # return the userListId that is needed to reference the list later 268 | return userListId 269 | 270 | # make the get request to enrichr using the requests library 271 | # this is done after submitting post request with the input gene list 272 | def get_request(lib, userListId, max_terms=50 ): 273 | import requests 274 | import json 275 | 276 | # convert userListId to string 277 | userListId = str(userListId) 
278 | 279 | # define the get url 280 | get_url = 'http://amp.pharm.mssm.edu/Enrichr/enrich' 281 | 282 | # get parameters 283 | params = {'backgroundType':lib,'userListId':userListId} 284 | 285 | # try get request until status code is 200 286 | inst_status_code = 400 287 | 288 | # wait until okay status code is returned 289 | num_try = 0 290 | 291 | # print(('\tEnrichr enrichment get req userListId: '+str(userListId))) 292 | 293 | while inst_status_code == 400 and num_try < 100: 294 | num_try = num_try +1 295 | try: 296 | # make the get request to get the enrichr results 297 | 298 | try: 299 | get_response = requests.get( get_url, params=params ) 300 | 301 | # get status_code 302 | inst_status_code = get_response.status_code 303 | 304 | except: 305 | print('retry get request') 306 | 307 | except: 308 | print('get requests failed') 309 | 310 | # load as dictionary 311 | resp_json = json.loads( get_response.text ) 312 | 313 | # get the key 314 | only_key = list(resp_json.keys())[0] 315 | 316 | # get response_list 317 | response_list = resp_json[only_key] 318 | 319 | # transfer the response_list to the enr_dict 320 | enr = transfer_to_enr_dict( response_list, max_terms ) 321 | 322 | # return enrichment json and userListId 323 | return enr, response_list 324 | 325 | # transfer the response_list to a list of dictionaries 326 | def transfer_to_enr_dict(response_list, max_terms=50): 327 | 328 | # # reduce the number of enriched terms if necessary 329 | # if len(response_list) < num_terms: 330 | # num_terms = len(response_list) 331 | 332 | # p-value, adjusted pvalue, z-score, combined score, genes 333 | # 1: Term 334 | # 2: P-value 335 | # 3: Z-score 336 | # 4: Combined Score 337 | # 5: Genes 338 | # 6: pval_bh 339 | 340 | num_enr_term = len(response_list) 341 | if num_enr_term > max_terms: 342 | num_enr_term = max_terms 343 | 344 | # transfer response_list to enr structure 345 | # and only keep the top terms 346 | # 347 | # initialize enr 348 | enr = [] 349 | for i in range(num_enr_term): 350 | 351 | # get list element 352 | inst_enr = response_list[i] 353 | 354 | # initialize dict 355 | inst_dict = {} 356 | 357 | # transfer term 358 | inst_dict['name'] = inst_enr[1] 359 | # transfer pval 360 | inst_dict['pval'] = inst_enr[2] 361 | # transfer zscore 362 | inst_dict['zscore'] = inst_enr[3] 363 | # transfer combined_score 364 | inst_dict['combined_score'] = inst_enr[4] 365 | # transfer int_genes 366 | inst_dict['int_genes'] = inst_enr[5] 367 | # adjusted pval 368 | inst_dict['pval_bh'] = inst_enr[6] 369 | 370 | # append dict 371 | enr.append(inst_dict) 372 | 373 | return enr 374 | 375 | -------------------------------------------------------------------------------- /clustergrammer/export_data.py: -------------------------------------------------------------------------------- 1 | def export_net_json(net, net_type, indent='no-indent'): 2 | ''' export json string of dat ''' 3 | import json 4 | from copy import deepcopy 5 | 6 | if net_type == 'dat': 7 | exp_dict = deepcopy(net.dat) 8 | 9 | if type(exp_dict['mat']) is not list: 10 | exp_dict['mat'] = exp_dict['mat'].tolist() 11 | if 'mat_orig' in exp_dict: 12 | exp_dict['mat_orig'] = exp_dict['mat_orig'].tolist() 13 | 14 | elif net_type == 'viz': 15 | exp_dict = net.viz 16 | 17 | elif net_type == 'sim_row': 18 | exp_dict = net.sim['row'] 19 | 20 | elif net_type == 'sim_col': 21 | exp_dict = net.sim['col'] 22 | 23 | # make json 24 | if indent == 'indent': 25 | exp_json = json.dumps(exp_dict, indent=2) 26 | else: 27 | exp_json = json.dumps(exp_dict) 28 | 
29 | return exp_json 30 | 31 | def write_matrix_to_tsv(net, filename=None, df=None): 32 | ''' 33 | This will export the matrix in net.dat or a dataframe (optional df in 34 | arguments) as a tsv file. Row/column categories will be saved as tuples in 35 | tsv, which can be read back into the network object. 36 | ''' 37 | import pandas as pd 38 | 39 | if df is None: 40 | df = net.dat_to_df() 41 | 42 | return df['mat'].to_csv(filename, sep='\t') 43 | 44 | def write_json_to_file(net, net_type, filename, indent='no-indent'): 45 | 46 | exp_json = net.export_net_json(net_type, indent) 47 | 48 | fw = open(filename, 'w') 49 | fw.write(exp_json) 50 | fw.close() 51 | 52 | def save_dict_to_json(inst_dict, filename, indent='no-indent'): 53 | import json 54 | fw = open(filename, 'w') 55 | if indent == 'indent': 56 | fw.write(json.dumps(inst_dict, indent=2)) 57 | else: 58 | fw.write(json.dumps(inst_dict)) 59 | fw.close() -------------------------------------------------------------------------------- /clustergrammer/iframe_web_app.py: -------------------------------------------------------------------------------- 1 | def main(net, filename=None, width=1000, height=800): 2 | import requests, json 3 | # from io import StringIO 4 | from IPython.display import IFrame, display 5 | 6 | try: 7 | from StringIO import StringIO 8 | except ImportError: 9 | from io import StringIO 10 | 11 | clustergrammer_url = 'http://amp.pharm.mssm.edu/clustergrammer/matrix_upload/' 12 | 13 | if filename is None: 14 | file_string = net.write_matrix_to_tsv() 15 | file_obj = StringIO(file_string) 16 | 17 | if net.dat['filename'] is None: 18 | fake_filename = 'Network.txt' 19 | else: 20 | fake_filename = net.dat['filename'] 21 | 22 | r = requests.post(clustergrammer_url, files={'file': (fake_filename, file_obj)}) 23 | else: 24 | file_obj = open(filename, 'r') 25 | r = requests.post(clustergrammer_url, files={'file': file_obj}) 26 | 27 | 28 | link = r.text 29 | 30 | display(IFrame(link, width=width, height=height)) 31 | 32 | return link -------------------------------------------------------------------------------- /clustergrammer/initialize_net.py: -------------------------------------------------------------------------------- 1 | def main(self, widget=None): 2 | 3 | self.dat = {} 4 | self.dat['nodes'] = {} 5 | self.dat['nodes']['row'] = [] 6 | self.dat['nodes']['col'] = [] 7 | self.dat['mat'] = [] 8 | 9 | self.dat['node_info'] = {} 10 | for inst_rc in self.dat['nodes']: 11 | self.dat['node_info'][inst_rc] = {} 12 | self.dat['node_info'][inst_rc]['ini'] = [] 13 | self.dat['node_info'][inst_rc]['clust'] = [] 14 | self.dat['node_info'][inst_rc]['rank'] = [] 15 | self.dat['node_info'][inst_rc]['info'] = [] 16 | self.dat['node_info'][inst_rc]['cat'] = [] 17 | self.dat['node_info'][inst_rc]['value'] = [] 18 | 19 | # check if net has categories predefined 20 | if hasattr(self, 'persistent_cat') == False: 21 | self.persistent_cat = False 22 | found_cats = False 23 | else: 24 | found_cats = True 25 | inst_cat_colors = self.viz['cat_colors'] 26 | 27 | # add widget if necessary 28 | if widget != None: 29 | self.widget_class = widget 30 | 31 | self.viz = {} 32 | self.viz['row_nodes'] = [] 33 | self.viz['col_nodes'] = [] 34 | self.viz['links'] = [] 35 | self.viz['mat'] = [] 36 | 37 | if found_cats == False: 38 | # print('no persistent_cat') 39 | self.viz['cat_colors'] = {} 40 | self.viz['cat_colors']['row'] = {} 41 | self.viz['cat_colors']['col'] = {} 42 | else: 43 | # print('yes persistent_cat') 44 | self.viz['cat_colors'] = inst_cat_colors 
45 | 46 | self.sim = {} 47 | 48 | 49 | def viz(self, reset_cat_colors=False): 50 | 51 | # keep track of old cat_colors 52 | old_cat_colors = self.viz['cat_colors'] 53 | 54 | self.viz = {} 55 | self.viz['row_nodes'] = [] 56 | self.viz['col_nodes'] = [] 57 | self.viz['links'] = [] 58 | self.viz['mat'] = [] 59 | 60 | if reset_cat_colors == True: 61 | self.viz['cat_colors'] = {} 62 | self.viz['cat_colors']['row'] = {} 63 | self.viz['cat_colors']['col'] = {} 64 | else: 65 | self.viz['cat_colors'] = old_cat_colors 66 | -------------------------------------------------------------------------------- /clustergrammer/load_data.py: -------------------------------------------------------------------------------- 1 | import io, sys 2 | import json 3 | import pandas as pd 4 | from . import categories 5 | from . import proc_df_labels 6 | from . import data_formats 7 | from . import make_unique_labels 8 | 9 | try: 10 | from StringIO import StringIO 11 | except ImportError: 12 | from io import StringIO 13 | 14 | def load_file(net, filename): 15 | # reset network when loaing file, prevents errors when loading new file 16 | # have persistent categories 17 | 18 | # trying to improve re-initialization 19 | # net.__init__() 20 | net.reset() 21 | 22 | f = open(filename, 'r') 23 | 24 | file_string = f.read() 25 | f.close() 26 | 27 | load_file_as_string(net, file_string, filename) 28 | 29 | def load_file_as_string(net, file_string, filename=''): 30 | 31 | if (sys.version_info > (3, 0)): 32 | # python 3 33 | #################### 34 | file_string = str(file_string) 35 | else: 36 | # python 2 37 | #################### 38 | file_string = unicode(file_string) 39 | 40 | buff = io.StringIO(file_string) 41 | 42 | if '/' in filename: 43 | filename = filename.split('/')[-1] 44 | 45 | net.load_tsv_to_net(buff, filename) 46 | 47 | def load_stdin(net): 48 | data = '' 49 | 50 | for line in sys.stdin: 51 | data = data + line 52 | 53 | data = StringIO.StringIO(data) 54 | 55 | net.load_tsv_to_net(data) 56 | 57 | def load_tsv_to_net(net, file_buffer, filename=None): 58 | lines = file_buffer.getvalue().split('\n') 59 | num_labels = categories.check_categories(lines) 60 | 61 | row_arr = list(range(num_labels['row'])) 62 | col_arr = list(range(num_labels['col'])) 63 | tmp_df = {} 64 | 65 | # use header if there are col categories 66 | if len(col_arr) > 1: 67 | tmp_df['mat'] = pd.read_table(file_buffer, index_col=row_arr, 68 | header=col_arr) 69 | else: 70 | tmp_df['mat'] = pd.read_table(file_buffer, index_col=row_arr) 71 | 72 | tmp_df = proc_df_labels.main(tmp_df) 73 | 74 | net.df_to_dat(tmp_df, True) 75 | net.dat['filename'] = filename 76 | 77 | def load_json_to_dict(filename): 78 | f = open(filename, 'r') 79 | inst_dict = json.load(f) 80 | f.close() 81 | return inst_dict 82 | 83 | def load_gmt(filename): 84 | f = open(filename, 'r') 85 | lines = f.readlines() 86 | f.close() 87 | gmt = {} 88 | for i in range(len(lines)): 89 | inst_line = lines[i].rstrip() 90 | inst_term = inst_line.split('\t')[0] 91 | inst_elems = inst_line.split('\t')[2:] 92 | gmt[inst_term] = inst_elems 93 | 94 | return gmt 95 | 96 | def load_data_to_net(net, inst_net): 97 | ''' load data into nodes and mat, also convert mat to numpy array''' 98 | net.dat['nodes'] = inst_net['nodes'] 99 | net.dat['mat'] = inst_net['mat'] 100 | data_formats.mat_to_numpy_arr(net) -------------------------------------------------------------------------------- /clustergrammer/load_vect_post.py: -------------------------------------------------------------------------------- 1 | def 
main(real_net, vect_post): 2 | import numpy as np 3 | from copy import deepcopy 4 | from .__init__ import Network 5 | from . import proc_df_labels 6 | 7 | net = deepcopy(Network()) 8 | 9 | sigs = vect_post['columns'] 10 | 11 | all_rows = [] 12 | all_sigs = [] 13 | for inst_sig in sigs: 14 | all_sigs.append(inst_sig['col_name']) 15 | 16 | col_data = inst_sig['data'] 17 | 18 | for inst_row_data in col_data: 19 | all_rows.append(inst_row_data['row_name']) 20 | 21 | all_rows = sorted(list(set(all_rows))) 22 | all_sigs = sorted(list(set(all_sigs))) 23 | 24 | net.dat['nodes']['row'] = all_rows 25 | net.dat['nodes']['col'] = all_sigs 26 | 27 | net.dat['mat'] = np.empty((len(all_rows), len(all_sigs))) 28 | net.dat['mat'][:] = np.nan 29 | 30 | is_up_down = False 31 | if 'is_up_down' in vect_post: 32 | if vect_post['is_up_down'] is True: 33 | is_up_down = True 34 | 35 | if is_up_down is True: 36 | net.dat['mat_up'] = np.empty((len(all_rows), len(all_sigs))) 37 | net.dat['mat_up'][:] = np.nan 38 | 39 | net.dat['mat_dn'] = np.empty((len(all_rows), len(all_sigs))) 40 | net.dat['mat_dn'][:] = np.nan 41 | 42 | for inst_sig in sigs: 43 | inst_sig_name = inst_sig['col_name'] 44 | col_data = inst_sig['data'] 45 | 46 | for inst_row_data in col_data: 47 | inst_row = inst_row_data['row_name'] 48 | inst_value = inst_row_data['val'] 49 | 50 | row_index = all_rows.index(inst_row) 51 | col_index = all_sigs.index(inst_sig_name) 52 | 53 | net.dat['mat'][row_index, col_index] = inst_value 54 | 55 | if is_up_down is True: 56 | net.dat['mat_up'][row_index, col_index] = inst_row_data['val_up'] 57 | net.dat['mat_dn'][row_index, col_index] = inst_row_data['val_dn'] 58 | 59 | tmp_df = net.dat_to_df() 60 | tmp_df = proc_df_labels.main(tmp_df) 61 | 62 | real_net.df_to_dat(tmp_df) 63 | -------------------------------------------------------------------------------- /clustergrammer/make_clust_fun.py: -------------------------------------------------------------------------------- 1 | def make_clust(net, dist_type='cosine', run_clustering=True, dendro=True, 2 | requested_views=['pct_row_sum', 'N_row_sum'], 3 | linkage_type='average', sim_mat=False, filter_sim=0.1, 4 | calc_cat_pval=False, sim_mat_views=['N_row_sum'], 5 | run_enrichr=None, enrichrgram=None): 6 | ''' 7 | This will calculate multiple views of a clustergram by filtering the 8 | data and clustering after each filtering. This filtering will keep the top 9 | N rows based on some quantity (sum, num-non-zero, etc). 10 | ''' 11 | from copy import deepcopy 12 | import scipy 13 | from . import calc_clust, run_filter, make_views, make_sim_mat, cat_pval 14 | from . 
import enrichr_functions as enr_fun 15 | 16 | df = net.dat_to_df() 17 | 18 | threshold = 0.0001 19 | df = run_filter.df_filter_row_sum(df, threshold) 20 | df = run_filter.df_filter_col_sum(df, threshold) 21 | 22 | # default setting 23 | define_cat_colors = False 24 | 25 | if run_enrichr is not None: 26 | df = enr_fun.add_enrichr_cats(df, 'row', run_enrichr) 27 | 28 | define_cat_colors = True 29 | 30 | # calculate initial view with no row filtering 31 | net.df_to_dat(df, define_cat_colors=True) 32 | 33 | 34 | inst_dm = calc_clust.cluster_row_and_col(net, dist_type=dist_type, 35 | linkage_type=linkage_type, 36 | run_clustering=run_clustering, 37 | dendro=dendro, ignore_cat=False, 38 | calc_cat_pval=calc_cat_pval) 39 | 40 | all_views = [] 41 | send_df = deepcopy(df) 42 | 43 | if 'N_row_sum' in requested_views: 44 | all_views = make_views.N_rows(net, send_df, all_views, 45 | dist_type=dist_type, rank_type='sum') 46 | 47 | if 'N_row_var' in requested_views: 48 | all_views = make_views.N_rows(net, send_df, all_views, 49 | dist_type=dist_type, rank_type='var') 50 | 51 | if 'pct_row_sum' in requested_views: 52 | all_views = make_views.pct_rows(net, send_df, all_views, 53 | dist_type=dist_type, rank_type='sum') 54 | 55 | if 'pct_row_var' in requested_views: 56 | all_views = make_views.pct_rows(net, send_df, all_views, 57 | dist_type=dist_type, rank_type='var') 58 | 59 | which_sim = [] 60 | 61 | if sim_mat == True: 62 | which_sim = ['row', 'col'] 63 | elif sim_mat == 'row': 64 | which_sim = ['row'] 65 | elif sim_mat == 'col': 66 | which_sim = ['col'] 67 | 68 | if sim_mat is not False: 69 | sim_net = make_sim_mat.main(net, inst_dm, which_sim, filter_sim, sim_mat_views) 70 | 71 | net.sim = {} 72 | 73 | for inst_rc in which_sim: 74 | net.sim[inst_rc] = sim_net[inst_rc].viz 75 | 76 | if inst_rc == 'row': 77 | other_rc = 'col' 78 | elif inst_rc == 'col': 79 | other_rc = 'row' 80 | 81 | # keep track of cat_colors 82 | net.sim[inst_rc]['cat_colors'][inst_rc] = net.viz['cat_colors'][inst_rc] 83 | net.sim[inst_rc]['cat_colors'][other_rc] = net.viz['cat_colors'][inst_rc] 84 | 85 | else: 86 | net.sim = {} 87 | 88 | net.viz['views'] = all_views 89 | 90 | if enrichrgram != None: 91 | # toggle enrichrgram functionality from back-end 92 | net.viz['enrichrgram'] = enrichrgram 93 | 94 | if 'enrichrgram_lib' in net.dat: 95 | net.viz['enrichrgram'] = True 96 | net.viz['enrichrgram_lib'] = net.dat['enrichrgram_lib'] 97 | 98 | if 'row_cat_bars' in net.dat: 99 | net.viz['row_cat_bars'] = net.dat['row_cat_bars'] 100 | -------------------------------------------------------------------------------- /clustergrammer/make_sim_mat.py: -------------------------------------------------------------------------------- 1 | def main(net, inst_dm, which_sim, filter_sim, sim_mat_views=['N_row_sum']): 2 | from .__init__ import Network 3 | from copy import deepcopy 4 | from . 
import calc_clust, make_views 5 | 6 | print('in make_sim_mat, which_sim: ' + str(which_sim)) 7 | 8 | sim_dict = {} 9 | 10 | for inst_rc in which_sim: 11 | 12 | sim_dict[inst_rc] = dm_to_sim(inst_dm[inst_rc], make_squareform=True, 13 | filter_sim=filter_sim) 14 | 15 | sim_net = {} 16 | 17 | for inst_rc in which_sim: 18 | 19 | sim_net[inst_rc] = deepcopy(Network()) 20 | 21 | sim_net[inst_rc].dat['mat'] = sim_dict[inst_rc] 22 | 23 | sim_net[inst_rc].dat['nodes']['row'] = net.dat['nodes'][inst_rc] 24 | sim_net[inst_rc].dat['nodes']['col'] = net.dat['nodes'][inst_rc] 25 | 26 | sim_net[inst_rc].dat['node_info']['row'] = net.dat['node_info'][inst_rc] 27 | sim_net[inst_rc].dat['node_info']['col'] = net.dat['node_info'][inst_rc] 28 | 29 | calc_clust.cluster_row_and_col(sim_net[inst_rc]) 30 | 31 | all_views = [] 32 | df = sim_net[inst_rc].dat_to_df() 33 | send_df = deepcopy(df) 34 | 35 | if 'N_row_sum' in sim_mat_views: 36 | all_views = make_views.N_rows(net, send_df, all_views, 37 | dist_type='cos', rank_type='sum') 38 | 39 | sim_net[inst_rc].viz['views'] = all_views 40 | 41 | return sim_net 42 | 43 | def dm_to_sim(inst_dm, make_squareform=False, filter_sim=0): 44 | import numpy as np 45 | from scipy.spatial.distance import squareform 46 | 47 | if make_squareform is True: 48 | inst_dm = squareform(inst_dm) 49 | 50 | inst_sim_mat = 1 - inst_dm 51 | 52 | if filter_sim > 0: 53 | filter_sim = adjust_filter_sim(inst_sim_mat, filter_sim) 54 | inst_sim_mat[ np.abs(inst_sim_mat) < filter_sim] = 0 55 | 56 | return inst_sim_mat 57 | 58 | def adjust_filter_sim(inst_dm, filter_sim, keep_top=20000): 59 | import pandas as pd 60 | import numpy as np 61 | 62 | inst_df = pd.DataFrame(inst_dm) 63 | val_vect = np.abs(inst_df.values.flatten()) 64 | 65 | val_vect = val_vect[val_vect > 0.01] 66 | 67 | if len(val_vect) > keep_top: 68 | 69 | 70 | inst_series = pd.Series(val_vect) 71 | inst_series.sort_values(ascending=False) 72 | 73 | sort_values = inst_series.values 74 | 75 | filter_sim = sort_values[keep_top] 76 | 77 | return filter_sim -------------------------------------------------------------------------------- /clustergrammer/make_unique_labels.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | 3 | def main(net, df=None): 4 | ''' 5 | Run in load_data module (which runs when file is loaded or dataframe is loaded), 6 | check for duplicate row/col names, and add index to names if necesary 7 | ''' 8 | if df is None: 9 | df = net.export_df() 10 | 11 | # rows 12 | ############# 13 | rows = df.index.tolist() 14 | if type(rows[0]) is str: 15 | 16 | if len(rows) != len(list(set(rows))): 17 | new_rows = add_index_list(rows) 18 | df.index = new_rows 19 | 20 | elif type(rows[0]) is tuple: 21 | 22 | row_names = [] 23 | for inst_row in rows: 24 | row_names.append(inst_row[0]) 25 | 26 | if len(row_names) != len(list(set(row_names))): 27 | row_names = add_index_list(row_names) 28 | 29 | # add back to tuple 30 | new_rows = [] 31 | for inst_index in range(len(rows)): 32 | inst_row = rows[inst_index] 33 | new_row = list(inst_row) 34 | new_row[0] = row_names[inst_index] 35 | new_row = tuple(new_row) 36 | new_rows.append(new_row) 37 | 38 | df.index = new_rows 39 | 40 | # cols 41 | ############# 42 | cols = df.columns.tolist() 43 | if type(cols[0]) is str: 44 | 45 | # list column names 46 | if len(cols) != len(list(set(cols))): 47 | new_cols = add_index_list(cols) 48 | df.columns = new_cols 49 | 50 | elif type(cols[0]) is tuple: 51 | 52 | col_names = [] 53 | for inst_col in 
cols: 54 | col_names.append(inst_col[0]) 55 | 56 | if len(col_names) != len(list(set(col_names))): 57 | col_names = add_index_list(col_names) 58 | 59 | # add back to tuple 60 | new_cols = [] 61 | for inst_index in range(len(cols)): 62 | inst_col = cols[inst_index] 63 | new_col = list(inst_col) 64 | new_col[0] = col_names[inst_index] 65 | new_col = tuple(new_col) 66 | new_cols.append(new_col) 67 | 68 | df.columns = new_cols 69 | 70 | # return dataframe with unique names 71 | return df 72 | 73 | def add_index_list(nodes): 74 | 75 | new_nodes = [] 76 | for i in range(len(nodes)): 77 | index = i + 1 78 | inst_node = nodes[i] 79 | new_node = inst_node + '-' + str(index) 80 | new_nodes.append(new_node) 81 | 82 | return new_nodes 83 | -------------------------------------------------------------------------------- /clustergrammer/make_views.py: -------------------------------------------------------------------------------- 1 | def N_rows(net, df, all_views, dist_type='cosine', rank_type='sum'): 2 | from copy import deepcopy 3 | from .__init__ import Network 4 | from . import calc_clust, run_filter 5 | 6 | keep_top = ['all', 500, 250, 100, 50, 20, 10] 7 | 8 | rows_sorted = run_filter.get_sorted_rows(df['mat'], rank_type) 9 | 10 | for inst_keep in keep_top: 11 | 12 | tmp_df = deepcopy(df) 13 | 14 | check_keep_num = inst_keep 15 | 16 | # convert 'all' to -1 to clean up checking mechanism 17 | if check_keep_num == 'all': 18 | check_keep_num = -1 19 | 20 | if check_keep_num < len(rows_sorted): 21 | 22 | tmp_net = deepcopy(Network()) 23 | 24 | if inst_keep != 'all': 25 | 26 | keep_rows = rows_sorted[0:inst_keep] 27 | 28 | tmp_df['mat'] = tmp_df['mat'].ix[keep_rows] 29 | if 'mat_up' in tmp_df: 30 | tmp_df['mat_up'] = tmp_df['mat_up'].ix[keep_rows] 31 | tmp_df['mat_dn'] = tmp_df['mat_dn'].ix[keep_rows] 32 | if 'mat_orig' in tmp_df: 33 | tmp_df['mat_orig'] = tmp_df['mat_orig'].ix[keep_rows] 34 | 35 | tmp_df = run_filter.df_filter_col_sum(tmp_df, 0.001) 36 | tmp_net.df_to_dat(tmp_df) 37 | 38 | else: 39 | tmp_net.df_to_dat(tmp_df) 40 | 41 | try: 42 | try: 43 | calc_clust.cluster_row_and_col(tmp_net, dist_type, run_clustering=True) 44 | except: 45 | calc_clust.cluster_row_and_col(tmp_net, dist_type, run_clustering=False) 46 | 47 | # add view 48 | inst_view = {} 49 | inst_view['N_row_' + rank_type] = inst_keep 50 | inst_view['dist'] = 'cos' 51 | inst_view['nodes'] = {} 52 | inst_view['nodes']['row_nodes'] = tmp_net.viz['row_nodes'] 53 | inst_view['nodes']['col_nodes'] = tmp_net.viz['col_nodes'] 54 | all_views.append(inst_view) 55 | 56 | except: 57 | # print('\t*** did not cluster N filtered view') 58 | pass 59 | 60 | return all_views 61 | 62 | def pct_rows(net, df, all_views, dist_type, rank_type): 63 | from .__init__ import Network 64 | from copy import deepcopy 65 | import numpy as np 66 | from . 
import calc_clust, run_filter 67 | 68 | copy_net = deepcopy(net) 69 | 70 | if len(net.dat['node_info']['col']['cat']) > 0: 71 | cat_key_col = {} 72 | for i in range(len(net.dat['nodes']['col'])): 73 | cat_key_col[net.dat['nodes']['col'][i]] = \ 74 | net.dat['node_info']['col']['cat'][i] 75 | 76 | all_filt = list(range(10)) 77 | all_filt = [i / float(10) for i in all_filt] 78 | 79 | mat = deepcopy(df['mat']) 80 | sum_row = np.sum(mat, axis=1) 81 | max_sum = max(sum_row) 82 | 83 | for inst_filt in all_filt: 84 | 85 | cutoff = inst_filt * max_sum 86 | copy_net = deepcopy(net) 87 | inst_df = deepcopy(df) 88 | inst_df = run_filter.df_filter_row_sum(inst_df, cutoff, take_abs=False) 89 | 90 | tmp_net = deepcopy(Network()) 91 | tmp_net.df_to_dat(inst_df) 92 | 93 | try: 94 | try: 95 | calc_clust.cluster_row_and_col(tmp_net, dist_type=dist_type, 96 | run_clustering=True) 97 | 98 | except: 99 | calc_clust.cluster_row_and_col(tmp_net, dist_type=dist_type, 100 | run_clustering=False) 101 | 102 | inst_view = {} 103 | inst_view['pct_row_' + rank_type] = inst_filt 104 | inst_view['dist'] = 'cos' 105 | inst_view['nodes'] = {} 106 | inst_view['nodes']['row_nodes'] = tmp_net.viz['row_nodes'] 107 | inst_view['nodes']['col_nodes'] = tmp_net.viz['col_nodes'] 108 | 109 | all_views.append(inst_view) 110 | 111 | except: 112 | pass 113 | 114 | return all_views -------------------------------------------------------------------------------- /clustergrammer/make_viz.py: -------------------------------------------------------------------------------- 1 | def viz_json(net, dendro=True, links=False): 2 | ''' make the dictionary for the clustergram.js visualization ''' 3 | from . import calc_clust 4 | import numpy as np 5 | 6 | all_dist = calc_clust.group_cutoffs() 7 | 8 | for inst_rc in net.dat['nodes']: 9 | 10 | inst_keys = net.dat['node_info'][inst_rc] 11 | all_cats = [x for x in inst_keys if 'cat-' in x] 12 | 13 | for i in range(len(net.dat['nodes'][inst_rc])): 14 | inst_dict = {} 15 | inst_dict['name'] = net.dat['nodes'][inst_rc][i] 16 | inst_dict['ini'] = net.dat['node_info'][inst_rc]['ini'][i] 17 | inst_dict['clust'] = net.dat['node_info'][inst_rc]['clust'].index(i) 18 | inst_dict['rank'] = net.dat['node_info'][inst_rc]['rank'][i] 19 | 20 | if 'rankvar' in inst_keys: 21 | inst_dict['rankvar'] = net.dat['node_info'][inst_rc]['rankvar'][i] 22 | 23 | # fix for similarity matrix 24 | if len(all_cats) > 0: 25 | 26 | for inst_name_cat in all_cats: 27 | 28 | actual_cat_name = net.dat['node_info'][inst_rc][inst_name_cat][i] 29 | inst_dict[inst_name_cat] = actual_cat_name 30 | 31 | check_pval = 'pval_'+inst_name_cat.replace('-','_') 32 | 33 | if check_pval in net.dat['node_info'][inst_rc]: 34 | tmp_pval_name = inst_name_cat.replace('-','_') + '_pval' 35 | inst_dict[tmp_pval_name] = net.dat['node_info'][inst_rc][check_pval][actual_cat_name] 36 | 37 | tmp_index_name = inst_name_cat.replace('-', '_') + '_index' 38 | 39 | inst_dict[tmp_index_name] = net.dat['node_info'][inst_rc] \ 40 | [tmp_index_name][i] 41 | 42 | 43 | if len(net.dat['node_info'][inst_rc]['value']) > 0: 44 | inst_dict['value'] = net.dat['node_info'][inst_rc]['value'][i] 45 | 46 | if len(net.dat['node_info'][inst_rc]['info']) > 0: 47 | inst_dict['info'] = net.dat['node_info'][inst_rc]['info'][i] 48 | 49 | if dendro is True: 50 | inst_dict['group'] = [] 51 | for tmp_dist in all_dist: 52 | tmp_dist = str(tmp_dist).replace('.', '') 53 | tmp_append = float( 54 | net.dat['node_info'][inst_rc]['group'][tmp_dist][i]) 55 | inst_dict['group'].append(tmp_append) 56 | 57 
| net.viz[inst_rc + '_nodes'].append(inst_dict) 58 | 59 | mat_types = ['mat', 'mat_orig', 'mat_info', 'mat_hl', 'mat_up', 'mat_dn'] 60 | 61 | # save data as links or mat 62 | ########################### 63 | if links is True: 64 | for i in range(len(net.dat['nodes']['row'])): 65 | for j in range(len(net.dat['nodes']['col'])): 66 | 67 | inst_dict = {} 68 | inst_dict['source'] = i 69 | inst_dict['target'] = j 70 | inst_dict['value'] = float(net.dat['mat'][i, j]) 71 | 72 | if 'mat_up' in net.dat: 73 | inst_dict['value_up'] = net.dat['mat_up'][i, j] 74 | inst_dict['value_dn'] = net.dat['mat_dn'][i, j] 75 | 76 | if 'mat_orig' in net.dat: 77 | inst_dict['value_orig'] = net.dat['mat_orig'][i, j] 78 | 79 | if np.isnan(inst_dict['value_orig']): 80 | inst_dict['value_orig'] = 'NaN' 81 | 82 | 83 | if 'mat_info' in net.dat: 84 | inst_dict['info'] = net.dat['mat_info'][str((i, j))] 85 | 86 | if 'mat_hl' in net.dat: 87 | inst_dict['highlight'] = net.dat['mat_hl'][i, j] 88 | 89 | net.viz['links'].append(inst_dict) 90 | 91 | else: 92 | for inst_mat in mat_types: 93 | if inst_mat in net.dat: 94 | net.viz[inst_mat] = net.dat[inst_mat].tolist() 95 | 96 | 97 | -------------------------------------------------------------------------------- /clustergrammer/normalize_fun.py: -------------------------------------------------------------------------------- 1 | import pandas as pd 2 | import numpy as np 3 | from copy import deepcopy 4 | 5 | def run_norm(net, df=None, norm_type='zscore', axis='row', keep_orig=False): 6 | ''' 7 | A dataframe (more accurately a dictionary of dataframes, e.g. mat, 8 | mat_up...) can be passed to run_norm and a normalization will be run ( 9 | e.g. zscore) on either the rows or columns 10 | ''' 11 | 12 | # df here is actually a dictionary of several dataframes, 'mat', 'mat_orig', 13 | # etc 14 | if df is None: 15 | df = net.dat_to_df() 16 | 17 | if norm_type == 'zscore': 18 | df = zscore_df(df, axis, keep_orig) 19 | 20 | if norm_type == 'qn': 21 | df = qn_df(df, axis, keep_orig) 22 | 23 | net.df_to_dat(df) 24 | 25 | def qn_df(df, axis='row', keep_orig=False): 26 | ''' 27 | do quantile normalization of a dataframe dictionary, does not write to net 28 | ''' 29 | df_qn = {} 30 | 31 | for mat_type in df: 32 | inst_df = df[mat_type] 33 | 34 | # using transpose to do row qn 35 | if axis == 'row': 36 | inst_df = inst_df.transpose() 37 | 38 | missing_values = inst_df.isnull().values.any() 39 | 40 | # make mask of missing values 41 | if missing_values: 42 | 43 | # get nan mask 44 | missing_mask = pd.isnull(inst_df) 45 | 46 | # tmp fill in na with zero, will not affect qn 47 | inst_df = inst_df.fillna(value=0) 48 | 49 | # calc common distribution 50 | common_dist = calc_common_dist(inst_df) 51 | 52 | # swap in common distribution 53 | inst_df = swap_in_common_dist(inst_df, common_dist) 54 | 55 | # swap back in missing values 56 | if missing_values: 57 | inst_df = inst_df.mask(missing_mask, other=np.nan) 58 | 59 | # using transpose to do row qn 60 | if axis == 'row': 61 | inst_df = inst_df.transpose() 62 | 63 | df_qn[mat_type] = inst_df 64 | 65 | return df_qn 66 | 67 | def swap_in_common_dist(df, common_dist): 68 | 69 | col_names = df.columns.tolist() 70 | 71 | qn_arr = np.array([]) 72 | orig_rows = df.index.tolist() 73 | 74 | # loop through each column 75 | for inst_col in col_names: 76 | 77 | # get the sorted list of row names for the given column 78 | tmp_series = deepcopy(df[inst_col]) 79 | tmp_series = tmp_series.sort_values(ascending=False) 80 | sorted_names = tmp_series.index.tolist() 81 | 
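  # for each original row, find its (descending) rank within this column and
  # take the value at that rank from the common distribution; this rank-based
  # swap is the quantile-normalization step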
82 | qn_vect = np.array([]) 83 | for inst_row in orig_rows: 84 | inst_index = sorted_names.index(inst_row) 85 | inst_val = common_dist[inst_index] 86 | qn_vect = np.hstack((qn_vect, inst_val)) 87 | 88 | if qn_arr.shape[0] == 0: 89 | qn_arr = qn_vect 90 | else: 91 | qn_arr = np.vstack((qn_arr, qn_vect)) 92 | 93 | # transpose (because of vstacking) 94 | qn_arr = qn_arr.transpose() 95 | 96 | qn_df = pd.DataFrame(data=qn_arr, columns=col_names, index=orig_rows) 97 | 98 | return qn_df 99 | 100 | def calc_common_dist(df): 101 | ''' 102 | calculate a common distribution (for col qn only) that will be used to qn 103 | ''' 104 | 105 | # axis is col 106 | tmp_arr = np.array([]) 107 | 108 | col_names = df.columns.tolist() 109 | 110 | for inst_col in col_names: 111 | 112 | # sort column 113 | tmp_vect = df[inst_col].sort_values(ascending=False).values 114 | 115 | # stacking rows vertically (will transpose) 116 | if tmp_arr.shape[0] == 0: 117 | tmp_arr = tmp_vect 118 | else: 119 | tmp_arr = np.vstack((tmp_arr, tmp_vect)) 120 | 121 | tmp_arr = tmp_arr.transpose() 122 | 123 | common_dist = tmp_arr.mean(axis=1) 124 | 125 | return common_dist 126 | 127 | def zscore_df(df, axis='row', keep_orig=False): 128 | ''' 129 | take the zscore of a dataframe dictionary, does not write to net (self) 130 | ''' 131 | df_z = {} 132 | 133 | for mat_type in df: 134 | if keep_orig and mat_type == 'mat': 135 | mat_orig = deepcopy(df[mat_type]) 136 | 137 | inst_df = df[mat_type] 138 | 139 | if axis == 'row': 140 | inst_df = inst_df.transpose() 141 | 142 | df_z[mat_type] = (inst_df - inst_df.mean())/inst_df.std() 143 | 144 | if axis == 'row': 145 | df_z[mat_type] = df_z[mat_type].transpose() 146 | 147 | if keep_orig: 148 | df_z['mat_orig'] = mat_orig 149 | 150 | return df_z 151 | -------------------------------------------------------------------------------- /clustergrammer/proc_df_labels.py: -------------------------------------------------------------------------------- 1 | def main(df): 2 | ''' 3 | 1) check that rows are strings (in case of numerical names) 4 | 2) check for tuples, and in that case load tuples to categories 5 | ''' 6 | import numpy as np 7 | from ast import literal_eval as make_tuple 8 | 9 | test = {} 10 | test['row'] = df['mat'].index.tolist() 11 | test['col'] = df['mat'].columns.tolist() 12 | 13 | # if type( test_row ) is not str and type( test_row ) is not tuple: 14 | 15 | found_tuple = {} 16 | found_number = {} 17 | for inst_rc in ['row','col']: 18 | 19 | inst_name = test[inst_rc][0] 20 | 21 | found_tuple[inst_rc] = False 22 | found_number[inst_rc] = False 23 | 24 | if type(inst_name) != tuple: 25 | 26 | if type(inst_name) is int or type(inst_name) is float or type(inst_name) is np.int64: 27 | found_number[inst_rc] = True 28 | 29 | else: 30 | check_open = inst_name[0] 31 | check_comma = inst_name.find(',') 32 | check_close = inst_name[-1] 33 | 34 | if check_open == '(' and check_close == ')' and check_comma > 0 \ 35 | and check_comma < len(inst_name): 36 | found_tuple[inst_rc] = True 37 | 38 | # convert to tuple if necessary 39 | ################################################# 40 | if found_tuple['row']: 41 | row_names = df['mat'].index.tolist() 42 | row_names = [make_tuple(x) for x in row_names] 43 | df['mat'].index = row_names 44 | 45 | if found_tuple['col']: 46 | col_names = df['mat'].columns.tolist() 47 | col_names = [make_tuple(x) for x in col_names] 48 | df['mat'].columns = col_names 49 | 50 | # convert numbers to string if necessary 51 | ################################################# 52 | 
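# Purely numeric labels (int, float, np.int64) are cast to strings here;
# downstream label handling is assumed to expect string (or tuple-of-string)
# row/column names, e.g. when names are serialized for the visualization JSON.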
if found_number['row']: 53 | row_names = df['mat'].index.tolist() 54 | row_names = [str(x) for x in row_names] 55 | df['mat'].index = row_names 56 | 57 | if found_number['col']: 58 | col_names = df['mat'].columns.tolist() 59 | col_names = [str(x) for x in col_names] 60 | df['mat'].columns = col_names 61 | 62 | return df -------------------------------------------------------------------------------- /clustergrammer/run_filter.py: -------------------------------------------------------------------------------- 1 | def df_filter_row_sum(df, threshold, take_abs=True): 2 | ''' filter rows in matrix at some threshold 3 | and remove columns that have a sum below this threshold ''' 4 | 5 | from copy import deepcopy 6 | from .__init__ import Network 7 | net = Network() 8 | 9 | if take_abs is True: 10 | df_copy = deepcopy(df['mat'].abs()) 11 | else: 12 | df_copy = deepcopy(df['mat']) 13 | 14 | ini_rows = df_copy.index.values.tolist() 15 | df_copy = df_copy.transpose() 16 | tmp_sum = df_copy.sum(axis=0) 17 | tmp_sum = tmp_sum.abs() 18 | tmp_sum.sort_values(inplace=True, ascending=False) 19 | 20 | tmp_sum = tmp_sum[tmp_sum > threshold] 21 | keep_rows = sorted(tmp_sum.index.values.tolist()) 22 | 23 | if len(keep_rows) < len(ini_rows): 24 | df['mat'] = grab_df_subset(df['mat'], keep_rows=keep_rows) 25 | 26 | if 'mat_up' in df: 27 | df['mat_up'] = grab_df_subset(df['mat_up'], keep_rows=keep_rows) 28 | df['mat_dn'] = grab_df_subset(df['mat_dn'], keep_rows=keep_rows) 29 | 30 | if 'mat_orig' in df: 31 | df['mat_orig'] = grab_df_subset(df['mat_orig'], keep_rows=keep_rows) 32 | 33 | return df 34 | 35 | def df_filter_col_sum(df, threshold, take_abs=True): 36 | ''' filter columns in matrix at some threshold 37 | and remove rows that have all zero values ''' 38 | 39 | from copy import deepcopy 40 | from .__init__ import Network 41 | net = Network() 42 | 43 | if take_abs is True: 44 | df_copy = deepcopy(df['mat'].abs()) 45 | else: 46 | df_copy = deepcopy(df['mat']) 47 | 48 | df_copy = df_copy.transpose() 49 | df_copy = df_copy[df_copy.sum(axis=1) > threshold] 50 | df_copy = df_copy.transpose() 51 | df_copy = df_copy[df_copy.sum(axis=1) > 0] 52 | 53 | if take_abs is True: 54 | inst_rows = df_copy.index.tolist() 55 | inst_cols = df_copy.columns.tolist() 56 | df['mat'] = grab_df_subset(df['mat'], inst_rows, inst_cols) 57 | 58 | if 'mat_up' in df: 59 | df['mat_up'] = grab_df_subset(df['mat_up'], inst_rows, inst_cols) 60 | df['mat_dn'] = grab_df_subset(df['mat_dn'], inst_rows, inst_cols) 61 | 62 | if 'mat_orig' in df: 63 | df['mat_orig'] = grab_df_subset(df['mat_orig'], inst_rows, inst_cols) 64 | 65 | else: 66 | df['mat'] = df_copy 67 | 68 | return df 69 | 70 | def grab_df_subset(df, keep_rows='all', keep_cols='all'): 71 | if keep_cols != 'all': 72 | df = df[keep_cols] 73 | if keep_rows != 'all': 74 | df = df.ix[keep_rows] 75 | return df 76 | 77 | def get_sorted_rows(df, rank_type='sum'): 78 | from copy import deepcopy 79 | 80 | inst_df = deepcopy(df) 81 | inst_df = inst_df.transpose() 82 | 83 | if rank_type == 'sum': 84 | tmp_sum = inst_df.sum(axis=0) 85 | elif rank_type == 'var': 86 | tmp_sum = inst_df.var(axis=0) 87 | 88 | tmp_sum = tmp_sum.abs() 89 | tmp_sum.sort_values(inplace=True, ascending=False) 90 | rows_sorted = tmp_sum.index.values.tolist() 91 | 92 | return rows_sorted 93 | 94 | def filter_N_top(inst_rc, df, N_top, rank_type='sum'): 95 | 96 | if inst_rc == 'col': 97 | for inst_type in df: 98 | df[inst_type] = df[inst_type].transpose() 99 | 100 | rows_sorted = get_sorted_rows(df['mat'], rank_type) 101 | 
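# Keep only the N_top rows with the largest sum (or variance, per rank_type);
# the same row subset is applied below to any companion matrices that are
# present (mat_up, mat_dn, mat_orig).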
102 | keep_rows = rows_sorted[:N_top] 103 | 104 | df['mat'] = df['mat'].ix[keep_rows] 105 | if 'mat_up' in df: 106 | df['mat_up'] = df['mat_up'].ix[keep_rows] 107 | df['mat_dn'] = df['mat_dn'].ix[keep_rows] 108 | 109 | if 'mat_orig' in df: 110 | df['mat_orig'] = df['mat_orig'].ix[keep_rows] 111 | 112 | if inst_rc == 'col': 113 | for inst_type in df: 114 | df[inst_type] = df[inst_type].transpose() 115 | 116 | return df 117 | 118 | def filter_threshold(df, inst_rc, threshold, num_occur=1): 119 | ''' 120 | Filter a network's rows or cols based on num_occur values being above a 121 | threshold (in absolute_value) 122 | ''' 123 | from copy import deepcopy 124 | 125 | inst_df = deepcopy(df['mat']) 126 | 127 | if inst_rc == 'col': 128 | inst_df = inst_df.transpose() 129 | 130 | inst_df = inst_df.abs() 131 | 132 | ini_rows = inst_df.index.values.tolist() 133 | 134 | inst_df[inst_df < threshold] = 0 135 | inst_df[inst_df >= threshold] = 1 136 | 137 | tmp_sum = inst_df.sum(axis=1) 138 | 139 | tmp_sum = tmp_sum[tmp_sum >= num_occur] 140 | 141 | keep_names = tmp_sum.index.values.tolist() 142 | 143 | if inst_rc == 'row': 144 | if len(keep_names) < len(ini_rows): 145 | df['mat'] = grab_df_subset(df['mat'], keep_rows=keep_names) 146 | 147 | if 'mat_up' in df: 148 | df['mat_up'] = grab_df_subset(df['mat_up'], keep_rows=keep_names) 149 | df['mat_dn'] = grab_df_subset(df['mat_dn'], keep_rows=keep_names) 150 | 151 | if 'mat_orig' in df: 152 | df['mat_orig'] = grab_df_subset(df['mat_orig'], keep_rows=keep_names) 153 | 154 | elif inst_rc == 'col': 155 | inst_df = inst_df.transpose() 156 | 157 | inst_rows = inst_df.index.values.tolist() 158 | inst_cols = keep_names 159 | 160 | df['mat'] = grab_df_subset(df['mat'], inst_rows, inst_cols) 161 | 162 | if 'mat_up' in df: 163 | df['mat_up'] = grab_df_subset(df['mat_up'], inst_rows, inst_cols) 164 | df['mat_dn'] = grab_df_subset(df['mat_dn'], inst_rows, inst_cols) 165 | 166 | if 'mat_orig' in df: 167 | df['mat_orig'] = grab_df_subset(df['mat_orig'], inst_rows, inst_cols) 168 | 169 | return df 170 | 171 | def filter_cat(net, axis, cat_index, cat_name): 172 | 173 | try: 174 | df = net.export_df() 175 | 176 | # DataFrame filtering will be run always be run on columns if the user 177 | # wants to filter rows, transpose the matrix before and after 178 | if axis == 'row': 179 | df = df.transpose() 180 | 181 | all_names = df.columns.tolist() 182 | 183 | found_names = [i for i in all_names if i[cat_index] == cat_name] 184 | 185 | if len(found_names) > 0: 186 | df = df[found_names] 187 | 188 | if axis == 'row': 189 | df = df.transpose() 190 | else: 191 | print('no ' + axis + 's were found with this category and filtering was not run') 192 | 193 | net.load_df(df) 194 | 195 | except: 196 | print('category filtering did not run\n check that your category filtering is set up correctly') 197 | 198 | 199 | def filter_names(net, axis, names): 200 | 201 | print('filter_names') 202 | print(names) 203 | 204 | try: 205 | 206 | df = net.export_df() 207 | 208 | # Dataframe filtering will always be run on the columns. If the user wants to filter rows, then it will transpose back and forth. 
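# The names being matched may be plain strings or tuples (tuples appear when
# categories are attached to the labels); only the first tuple element is
# compared, and any 'Title: ' prefix is stripped first, so a row labeled
# 'Gene: EGFR' would be matched by passing 'EGFR' in the names list.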
209 | 210 | if axis == 'row': 211 | df = df.transpose() 212 | 213 | all_names = df.columns.tolist() 214 | 215 | found_names = [] 216 | for inst_name in all_names: 217 | 218 | if type(inst_name) is tuple: 219 | check_name = inst_name[0] 220 | else: 221 | check_name = inst_name 222 | 223 | if ': ' in check_name: 224 | check_name = check_name.split(': ')[1] 225 | 226 | if check_name in names: 227 | found_names.append(inst_name) 228 | 229 | if len(found_names) > 0: 230 | df = df[found_names] 231 | 232 | if axis == 'row': 233 | df = df.transpose() 234 | 235 | net.load_df(df) 236 | 237 | else: 238 | print('no ' + axis + 's were found with these names') 239 | 240 | except: 241 | print('error in filtering names') 242 | 243 | print(found_names) -------------------------------------------------------------------------------- /make_clustergrammer.py: -------------------------------------------------------------------------------- 1 | ''' 2 | The clustergrammer python module can be installed using pip: 3 | pip install clustergrammer 4 | 5 | or by getting the code from the repo: 6 | https://github.com/MaayanLab/clustergrammer-py 7 | ''' 8 | 9 | # from clustergrammer import Network 10 | from clustergrammer import Network 11 | net = Network() 12 | 13 | # load matrix tsv file 14 | net.load_file('txt/rc_two_cats.txt') 15 | # net.load_file('txt/rc_val_cats.txt') 16 | 17 | # optional filtering and normalization 18 | ########################################## 19 | # net.filter_sum('row', threshold=20) 20 | # net.normalize(axis='col', norm_type='zscore', keep_orig=True) 21 | # net.filter_N_top('row', 250, rank_type='sum') 22 | # net.filter_threshold('row', threshold=3.0, num_occur=4) 23 | # net.swap_nan_for_zero() 24 | # net.downsample(ds_type='kmeans', axis='col', num_samples=10) 25 | # net.random_sample(random_state=100, num_samples=10, axis='col') 26 | # net.clip(-6,6) 27 | # net.filter_cat('row', 1, 'Gene Type: Interesting') 28 | # net.set_cat_color('col', 1, 'Category: one', 'blue') 29 | 30 | net.cluster(dist_type='cos',views=['N_row_sum', 'N_row_var'] , dendro=True, 31 | sim_mat=True, filter_sim=0.1, calc_cat_pval=False, enrichrgram=True) 32 | 33 | # write jsons for front-end visualizations 34 | net.write_json_to_file('viz', 'json/mult_view.json', 'no-indent') 35 | net.write_json_to_file('sim_row', 'json/mult_view_sim_row.json', 'no-indent') 36 | net.write_json_to_file('sim_col', 'json/mult_view_sim_col.json', 'no-indent') 37 | -------------------------------------------------------------------------------- /make_stdin_stdout.py: -------------------------------------------------------------------------------- 1 | ''' 2 | The clustergrammer python module can be installed using pip: 3 | pip install clustergrammer 4 | 5 | or by getting the code from the repo: 6 | https://github.com/MaayanLab/clustergrammer-py 7 | ''' 8 | 9 | # from clustergrammer import Network 10 | from clustergrammer import Network 11 | net = Network() 12 | 13 | # load matrix tsv file 14 | net.load_stdin() 15 | 16 | # optional filtering and normalization 17 | ########################################## 18 | # net.filter_sum('row', threshold=20) 19 | # net.normalize(axis='col', norm_type='zscore', keep_orig=True) 20 | # net.filter_N_top('row', 250, rank_type='sum') 21 | # net.filter_threshold('row', threshold=3.0, num_occur=4) 22 | # net.swap_nan_for_zero() 23 | 24 | net.make_clust(dist_type='cos',views=['N_row_sum', 'N_row_var'] , dendro=True, 25 | sim_mat=True, filter_sim=0.1, calc_cat_pval=False) 26 | 27 | # output jsons for front-end 
visualizations 28 | print(net.export_net_json('viz', 'no-indent')) -------------------------------------------------------------------------------- /python27 new import.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": { 7 | "collapsed": true 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "import numpy as np\n", 12 | "import pandas as pd\n", 13 | "\n", 14 | "# import clustergrammer_widget\n", 15 | "from clustergrammer_widget import *\n", 16 | "\n", 17 | "# use local clustergrammer\n", 18 | "from clustergrammer import Network" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 2, 24 | "metadata": {}, 25 | "outputs": [ 26 | { 27 | "data": { 28 | "text/plain": [ 29 | "'\\n version 1.12.4\\n\\n Clustergrammer.py takes a matrix as input (either from a file of a Pandas DataFrame), normalizes/filters, hierarchically clusters, and produces the :ref:`visualization_json` for :ref:`clustergrammer_js`.\\n\\n Networks have two states:\\n\\n 1. the data state, where they are stored as a matrix and nodes\\n 2. the viz state where they are stored as viz.links, viz.row_nodes, and viz.col_nodes.\\n\\n The goal is to start in a data-state and produce a viz-state of\\n the network that will be used as input to clustergram.js.\\n '" 30 | ] 31 | }, 32 | "execution_count": 2, 33 | "metadata": {}, 34 | "output_type": "execute_result" 35 | } 36 | ], 37 | "source": [ 38 | "net = Network(clustergrammer_widget)\n", 39 | "net.__doc__" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 10, 45 | "metadata": { 46 | "collapsed": true 47 | }, 48 | "outputs": [], 49 | "source": [ 50 | "# generate random matrix\n", 51 | "num_rows = 500\n", 52 | "num_cols = 10\n", 53 | "np.random.seed(seed=100)\n", 54 | "mat = np.random.rand(num_rows, num_cols)\n", 55 | "\n", 56 | "# make row and col labels\n", 57 | "rows = range(num_rows)\n", 58 | "cols = range(num_cols)\n", 59 | "rows = [str(i) for i in rows]\n", 60 | "cols = [str(i) for i in cols]\n", 61 | "\n", 62 | "# make dataframe \n", 63 | "df = pd.DataFrame(data=mat, columns=cols, index=rows)" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": 11, 69 | "metadata": {}, 70 | "outputs": [ 71 | { 72 | "data": { 73 | "application/vnd.jupyter.widget-view+json": { 74 | "model_id": "dfd54aa4aff347d2be76a753c0fdee93" 75 | } 76 | }, 77 | "metadata": {}, 78 | "output_type": "display_data" 79 | } 80 | ], 81 | "source": [ 82 | "net.load_df(df)\n", 83 | "net.cluster()\n", 84 | "net.widget()" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": 12, 90 | "metadata": {}, 91 | "outputs": [ 92 | { 93 | "data": { 94 | "application/vnd.jupyter.widget-view+json": { 95 | "model_id": "7b8846afd6cd4aa49556935cc23350e9" 96 | } 97 | }, 98 | "metadata": {}, 99 | "output_type": "display_data" 100 | } 101 | ], 102 | "source": [ 103 | "net.load_file('txt/rc_two_cats.txt')\n", 104 | "net.cluster()\n", 105 | "net.widget()" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": null, 111 | "metadata": { 112 | "collapsed": true 113 | }, 114 | "outputs": [], 115 | "source": [] 116 | } 117 | ], 118 | "metadata": { 119 | "anaconda-cloud": {}, 120 | "kernelspec": { 121 | "display_name": "Python [Root]", 122 | "language": "python", 123 | "name": "Python [Root]" 124 | }, 125 | "language_info": { 126 | "codemirror_mode": { 127 | "name": "ipython", 128 | "version": 2 129 | }, 130 | 
"file_extension": ".py", 131 | "mimetype": "text/x-python", 132 | "name": "python", 133 | "nbconvert_exporter": "python", 134 | "pygments_lexer": "ipython2", 135 | "version": "2.7.12" 136 | } 137 | }, 138 | "nbformat": 4, 139 | "nbformat_minor": 2 140 | } 141 | -------------------------------------------------------------------------------- /python35_new_import.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "code", 5 | "execution_count": 1, 6 | "metadata": { 7 | "collapsed": true 8 | }, 9 | "outputs": [], 10 | "source": [ 11 | "import numpy as np\n", 12 | "import pandas as pd\n", 13 | "\n", 14 | "# import clustergrammer_widget\n", 15 | "from clustergrammer_widget import *\n", 16 | "\n", 17 | "# use local clustergrammer\n", 18 | "from clustergrammer import Network" 19 | ] 20 | }, 21 | { 22 | "cell_type": "code", 23 | "execution_count": 2, 24 | "metadata": {}, 25 | "outputs": [ 26 | { 27 | "data": { 28 | "text/plain": [ 29 | "'\\n version 1.12.4\\n\\n Clustergrammer.py takes a matrix as input (either from a file of a Pandas DataFrame), normalizes/filters, hierarchically clusters, and produces the :ref:`visualization_json` for :ref:`clustergrammer_js`.\\n\\n Networks have two states:\\n\\n 1. the data state, where they are stored as a matrix and nodes\\n 2. the viz state where they are stored as viz.links, viz.row_nodes, and viz.col_nodes.\\n\\n The goal is to start in a data-state and produce a viz-state of\\n the network that will be used as input to clustergram.js.\\n '" 30 | ] 31 | }, 32 | "execution_count": 2, 33 | "metadata": {}, 34 | "output_type": "execute_result" 35 | } 36 | ], 37 | "source": [ 38 | "net = Network(clustergrammer_widget)\n", 39 | "net.__doc__" 40 | ] 41 | }, 42 | { 43 | "cell_type": "code", 44 | "execution_count": 6, 45 | "metadata": { 46 | "collapsed": true 47 | }, 48 | "outputs": [], 49 | "source": [ 50 | "# generate random matrix\n", 51 | "num_rows = 500\n", 52 | "num_cols = 10\n", 53 | "np.random.seed(seed=100)\n", 54 | "mat = np.random.rand(num_rows, num_cols)\n", 55 | "\n", 56 | "# make row and col labels\n", 57 | "rows = range(num_rows)\n", 58 | "cols = range(num_cols)\n", 59 | "rows = [str(i) for i in rows]\n", 60 | "cols = [str(i) for i in cols]\n", 61 | "\n", 62 | "# make dataframe \n", 63 | "df = pd.DataFrame(data=mat, columns=cols, index=rows)" 64 | ] 65 | }, 66 | { 67 | "cell_type": "code", 68 | "execution_count": 7, 69 | "metadata": {}, 70 | "outputs": [ 71 | { 72 | "data": { 73 | "application/vnd.jupyter.widget-view+json": { 74 | "model_id": "e7f1dc60de214594b83af2b5c77284c3" 75 | } 76 | }, 77 | "metadata": {}, 78 | "output_type": "display_data" 79 | } 80 | ], 81 | "source": [ 82 | "net.load_df(df)\n", 83 | "net.cluster()\n", 84 | "net.widget()" 85 | ] 86 | }, 87 | { 88 | "cell_type": "code", 89 | "execution_count": 9, 90 | "metadata": {}, 91 | "outputs": [ 92 | { 93 | "data": { 94 | "application/vnd.jupyter.widget-view+json": { 95 | "model_id": "0f403b0b1b604879bcd3154dd8020c1b" 96 | } 97 | }, 98 | "metadata": {}, 99 | "output_type": "display_data" 100 | } 101 | ], 102 | "source": [ 103 | "net.load_file('txt/rc_two_cats.txt')\n", 104 | "net.cluster()\n", 105 | "net.widget()" 106 | ] 107 | }, 108 | { 109 | "cell_type": "code", 110 | "execution_count": null, 111 | "metadata": { 112 | "collapsed": true 113 | }, 114 | "outputs": [], 115 | "source": [] 116 | } 117 | ], 118 | "metadata": { 119 | "anaconda-cloud": {}, 120 | "kernelspec": { 121 | "display_name": "Python 
[py35]", 122 | "language": "python", 123 | "name": "Python [py35]" 124 | }, 125 | "language_info": { 126 | "codemirror_mode": { 127 | "name": "ipython", 128 | "version": 3 129 | }, 130 | "file_extension": ".py", 131 | "mimetype": "text/x-python", 132 | "name": "python", 133 | "nbconvert_exporter": "python", 134 | "pygments_lexer": "ipython3", 135 | "version": "3.5.2" 136 | } 137 | }, 138 | "nbformat": 4, 139 | "nbformat_minor": 2 140 | } 141 | -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [metadata] 2 | description-file = README.md -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from distutils.core import setup 2 | setup( 3 | name = 'clustergrammer', 4 | packages = ['clustergrammer'], # this must be the same as the name above 5 | version = '1.13.6', 6 | description = 'A python module for the Clustergrammer visualization project', 7 | author = 'Nicolas Fernandez', 8 | author_email = 'nickfloresfernandez@gmail.com', 9 | url = 'https://github.com/MaayanLab/clustergrammer-py', 10 | download_url = 'https://github.com/MaayanLab/clustergrammer-py/tarball/1.1.2', 11 | keywords = ['testing'], 12 | classifiers = [], 13 | ) -------------------------------------------------------------------------------- /txt/example_tsv.txt: -------------------------------------------------------------------------------- 1 | Col-1 Col-2 Col-3 Col-4 Col-5 Col-6 Col-7 Col-8 Col-9 Col-10 Col-11 Col-12 Col-13 Col-14 Col-15 Col-16 Col-17 Col-18 Col-19 Col-20 Col-21 Col-22 Col-23 Col-24 Col-25 Col-26 Col-27 Col-28 Col-29 2 | CDK4 -0.792803571 0.527687127 0.000622536 0.356722594 0.933286088 -0.131728538 0.808451944 4.240884801 -0.540231391 -0.981456952 -0.84689892 -0.252795921 0.114189581 -0.06649884 0.149218809 1.351263924 0.645867212 0.60561098 3.232454573 0.342634572 -0.430912324 -0.40590567 0.199563989 -1.122536294 2.210334571 0.405126315 -0.089763159 0.405126315 0.340012773 3 | LMTK3 0.17762054 -0.016061489 5.422113833 1.307039675 0.355814985 0.276904994 0.483153915 -0.240495821 1.336445996 1.149618502 0.361412978 -0.380518938 -0.213541004 -0.471938639 -0.620858723 -0.163637058 -0.487256142 -0.029569688 -0.232057778 -0.669036939 -0.449241698 1.158930406 0.511962022 2.370834155 0.262893885 -0.513128895 -0.501210068 0.439277561 -0.342460508 4 | LRRK2 -0.697876151 -0.555610265 -0.360497559 -0.460236731 -0.680760697 -0.169463518 1.715708875 -0.517104823 0.184987709 0.8106597 -0.440334448 -0.621052026 -0.086803358 -0.753966225 -0.401972037 -0.562086752 -0.560644597 0.542301381 -0.382639145 -0.377853523 -0.713472923 -0.377609368 4.308904581 -0.638131949 -0.556114063 -0.318145763 -0.489582714 1.677376527 -0.682790464 5 | UHMK1 0.850546518 -0.263279907 0.179253031 0.398646721 1.537663802 0.505291411 0.902366491 -0.16628803 0.630730564 0.399448283 0.847171367 -0.442268094 0.44368676 1.552969029 1.110283483 -0.326698072 -0.405267374 0.663747183 0.424470033 0.283221899 -4.243973921 0.718315578 1.747343933 -1.020927175 0.305028514 1.47174613 0.048902278 -0.255283556 0.548224573 6 | EGFR 1.412416216 0.018987506 0.902251622 -0.17813747 0.781819022 0.211815895 -0.023427175 3.557295952 1.173783556 -0.012362164 0.769782484 -0.681031743 -1.047375389 0.652065499 0.172316691 2.072433469 1.135709377 -0.169977181 0.881067136 -0.486159025 -1.451838026 0.371237737 -0.581665325 
-0.126356157 0.241004724 1.06526919 0.974531796 0.668645091 0.05696489 7 | STK32A -0.388039665 -0.592626614 -0.24413651 0.740364734 3.023348415 -0.433985412 -0.630124457 1.156531983 0.433696213 3.84950782 -0.225425742 -0.656106808 -0.311953357 -0.397450226 1.044025538 -0.247816912 3.640524345 -0.59251039 0.514666245 -0.45396994 1.649737631 3.366020313 -0.430502237 -0.295312303 2.824551497 -0.014275115 -0.410477794 -0.229717784 3.709828616 8 | NRK 1.408537135 -0.017369325 -0.367127962 0.313253548 -0.16288686 0.027411933 -0.281351556 5.813846489 -0.161706584 0.472386752 -0.33979584 0.669956625 -0.2596391 -0.386601295 -0.293654593 4.390499721 -0.420942214 -0.402955154 -0.346809494 -0.222725132 0.36849943 1.49303248 -0.34174718 -0.343420451 5.284808607 -0.358156896 -0.222931558 -0.401391167 -0.412478715 9 | ERBB2 0.906642406 -0.684771423 0.015261254 0.16056792 0.365002113 -0.564392699 0.169072827 -0.035192496 -0.031210405 0.447742443 0.544075103 0.280008477 -0.066278222 -0.225814318 4.103496507 1.219691566 -0.245022001 -0.681552658 -0.304817333 -0.511212295 -1.100056017 1.335983295 -0.500561544 0.721259819 0.284747072 0.232812724 -0.796930101 -0.156381455 1.503853721 10 | ERBB4 -0.452907052 -0.392790536 -0.374173515 -0.527418493 -0.320103334 -0.560657219 -0.312847509 -0.463903623 -0.304652329 -0.30897114 -0.331935876 4.098930821 -0.413942149 -0.501418917 1.256164876 -0.12356019 -0.425577927 -0.36998588 -0.054684881 -0.484730631 -0.419739472 -0.432412432 0.143245619 -0.266932489 -0.340860307 -0.231847291 -0.448292539 -0.42868169 -0.615936889 11 | AAK1 3.579051735 0.92330807 -0.651094367 0.952743833 -0.212733397 0.006074527 -0.121038246 0.083769063 -0.722678214 1.669410989 -0.247600883 -0.284623649 -0.687716717 -0.320883885 -0.93370415 -0.309230053 0.544870152 0.824029397 -0.087291924 -0.973905867 -0.308282983 0.822145704 -0.72904308 -0.088865731 -0.29848499 -0.451367112 -1.134040733 0.379230443 1.491612577 12 | SRPK3 -0.582761335 -0.706379425 0.364313301 -0.483011414 -0.71307719 -0.048548064 -0.527944549 0.337501769 -0.656781635 -0.323318624 -0.432950623 -0.414799098 -1.02570516 -0.861415433 0.113447131 -0.110117735 -0.493510825 0.148841502 -0.341096914 -0.373760525 5.16013802 -0.204906932 -0.465574863 -0.170405491 0.046286803 -0.100887639 0.936150906 -0.15980844 -0.846857677 13 | STK39 -0.58688791 -0.186685902 -3.51852921 0.250628834 -0.477773537 -0.62381107 -0.92202388 -0.55383453 0.018847867 1.267644557 0.243732055 -0.233528273 0.070726356 -0.256360198 1.741001607 0.168247379 -0.245974299 0.014972759 -0.537623787 0.259364957 1.303190492 1.043208024 -1.021094946 -0.097444212 -0.679290593 0.132592576 -0.440607517 -0.21005684 0.651311995 14 | GRK4 -0.693639785 -0.357559299 -0.903861262 -0.810450279 0.293775461 1.012469252 -0.1044623 -0.573757161 -0.629467998 -1.131138938 -0.401984075 -0.672554564 -1.182791974 -1.00138041 -1.020216093 -0.437799213 -0.103783602 -0.387565928 0.386471772 -0.524742175 3.627070248 -0.846550806 -0.473736661 0.443388919 0.766135257 0.193683015 -0.614757268 -0.382906171 -0.864410411 15 | TBK1 0.327203594 0.857319301 -1.397356596 -0.226683585 -0.986051455 0.438343505 0.095527129 5.598772308 0.535025797 -0.057479225 -0.089086932 -0.473291011 2.070252907 0.201409888 1.134728133 0.095734104 0.22916067 0.566649702 1.011889663 0.902342556 0.735510304 -0.493538517 3.176918602 0.490045737 -0.693495689 -0.183704878 -0.182356844 0.744728769 -0.311064392 16 | INSRR 0.331108191 -0.467978397 0.681112329 -1.195914121 -0.538461957 3.616542204 0.094919881 0.527357553 -0.160478312 
-0.940444939 -1.025689676 0.053722044 0.081275611 -0.616337936 -0.48042057 0.620903382 -0.723793064 -0.759642991 -0.744900744 -0.43363819 -0.804929849 0.429022269 0.765012989 0.400356525 0.207741849 1.474373008 0.888734282 0.387715581 -0.623678298 17 | IRAK1 0.141183837 0.788608352 0.51421388 0.528255597 0.906234597 0.050158065 -0.843341382 1.44384296 -0.253699343 0.562537497 -0.505923659 -6.621630156 -1.253907087 0.947883556 0.394657662 0.613122698 -0.223205762 0.905373736 0.679301554 0.859789222 -0.022354553 1.046726052 0.194471 0.359912834 -0.806348234 1.586619876 0.311985824 0.544470027 1.479598537 18 | KDR -0.524309949 -0.285994749 -0.484871681 0.176655189 -0.139711627 -0.352978397 -0.192529854 -0.601257973 -0.427323427 0.459403791 -0.383001 -0.571316753 -0.387982572 -0.353945118 -0.24031472 -0.305685522 0.456864915 -0.134886653 2.136583182 -0.463127859 -0.600408393 4.430955918 -0.374772774 -0.206853106 1.756104521 0.832887289 -0.314392176 -0.31373742 0.493420283 19 | NPR1 0.509592174 0.464774315 0.275495704 -0.01882253 -0.005537792 -0.457197866 3.408253083 -0.430407643 -0.754186082 -0.836530855 -0.277053957 -0.54257843 0.850439577 -0.298981449 -0.169394809 0.254783369 -0.427821755 -0.1389465 0.618069135 1.926349897 -0.305399918 -0.535939215 -0.078661666 6.267854465 -0.57293844 -0.302168609 0.72307512 0.611863003 -0.2145995 20 | PAK3 -0.554447111 -0.145753485 0.019807701 -0.634915727 -0.493766887 -0.587644968 -0.223714996 1.385049476 -0.346100755 0.254536609 0.057887318 -0.593081366 -0.47065185 -0.753944095 -0.044570503 -0.334597636 0.339587788 -0.135201173 -0.111896921 4.913848539 -0.542750046 -0.27783299 -0.689097582 -0.216688006 1.127699857 0.03588747 -0.070416156 -0.553268062 -0.977427689 21 | PDGFRA -0.530831743 -0.260873607 -0.461134617 -0.389056188 -0.512555424 -0.513202268 -0.337550397 -0.449953768 -0.184845421 -0.565880657 -0.376356202 -0.285266104 -0.570505879 3.598950655 -0.477052971 -0.364449691 -0.648917127 -0.390363086 1.117877995 -0.464086586 -0.456431016 -0.571010056 -0.456668794 -0.58230495 -0.426192704 -0.411161613 -0.455618608 -0.297498019 -0.47141323 22 | PDK4 -0.643246331 0.052021433 -0.735006626 0.041068843 -0.062094125 5.477714716 1.256686967 -0.136401851 0.577266871 -0.60002565 0.087671916 -0.560959779 -0.56490393 -0.629261602 -0.214226487 0.09929963 -0.095715004 -0.632856345 1.09320354 0.386976419 -0.374720076 -0.564957229 1.680895489 0.508498891 0.916604247 -0.607497709 0.58989289 -0.122764837 -0.533651064 23 | ULK4 -0.693868027 0.57619653 -0.488541037 -0.60094858 1.139598497 -0.024286993 -0.288194559 -0.499250839 0.103697413 -1.044716157 -1.02427507 3.023003444 3.859028873 0.742978442 -0.352722819 0.069205383 -1.117741919 0.651553716 -0.364768511 -0.5472691 -0.735728719 -0.623797671 -0.105370633 -0.139431549 -0.055372648 0.774449831 0.538987341 0.35514519 -1.45386407 24 | PRKCE 0.006531886 0.564826732 3.695318524 0.316255033 -0.268737774 0.936461505 0.2291517 0.649579007 -0.330103595 -0.504534505 0.264729237 -0.977228043 0.493632169 -0.401821398 -0.286231779 0.143371278 -0.360231532 0.340762717 0.633162512 -0.710530502 -1.334690191 0.158108045 -0.347820435 -0.074497061 -0.970507716 -0.26479443 -0.298648517 -0.10090872 -0.11742112 25 | PRKG2 -0.185695405 -0.173758799 0.084357105 1.826502656 0.00816719 -1.102148634 0.299002536 0.458848186 0.292508806 0.110508201 0.083592283 -0.494333063 -0.117947546 -0.539712481 -0.106334279 -0.403083002 -0.789473381 1.041787363 1.70041072 -0.293951867 4.839524758 1.015480815 0.841188534 -0.620389764 -0.565583764 -0.262366184 
0.226425315 -0.048000565 1.126249373 26 | MAPK4 0.184462349 -0.526037871 0.432087272 -0.882311913 0.246356093 0.858754521 0.052858019 -1.118340603 -0.846948816 -0.778824075 3.525192777 -1.872745007 -0.779756435 -1.039639399 -0.59333431 0.402156007 -1.387426464 -0.145435051 -0.46497243 -0.221064461 -0.861483648 0.125415634 -0.191849116 2.374460297 -0.74142144 0.7654394 1.029796862 0.03307866 0.44066582 27 | MAPK11 1.760301448 -0.912259652 -1.163345889 -0.965891664 -0.795153414 -0.616300339 -1.360743997 -1.448291877 -0.024088935 -1.188868793 -0.229906845 2.181489143 -1.154435684 6.28292787 -0.303782002 -0.165568925 -1.126153349 1.678721355 -1.683560793 -0.864063548 -0.025445472 1.890946219 0.667805988 -0.625764381 -1.063340313 3.222816803 -0.001359619 -0.203661756 0.187669924 28 | STK31 -0.07364355 -0.103789279 -0.171304836 0.351910065 0.63677969 -0.136732984 0.356830815 3.889115824 0.645442526 1.366358918 0.995319244 5.608685402 1.101919141 -0.554900568 0.087820649 0.061305127 1.931275557 -0.692417574 -0.481807702 -0.16288735 -0.538298189 2.440412245 0.804274605 -0.605526195 1.788457016 -0.376520922 0.35819202 0.164487781 3.719306763 29 | GRK1 -0.751526741 0.49762292 -0.142534658 -0.882124083 -1.151282849 2.307907188 -0.12032085 -0.351269532 -1.526178564 -0.753268428 3.600861739 -1.223995853 -0.607229424 -0.027417898 0.190161632 0.610550408 0.149796331 -0.122879865 0.247865963 -0.404833708 0.736929754 -0.944275068 -0.078919294 0.661648005 -0.244948779 3.051534602 -0.107365228 0.367536408 -1.517985824 30 | ROS1 -0.31236414 0.701257089 0.47520812 -0.585297054 -0.122694283 -0.866875137 0.367939523 -0.481103706 2.072237711 10.29186436 1.298805701 -0.628175917 -0.173084375 -0.02710755 0.355169073 0.470456905 0.121400231 0.374924602 -0.278307341 -0.553746266 -0.935156558 -0.042420296 -0.479479902 -0.332400886 -0.710017011 1.873931755 0.204554429 -0.32315246 0.187572521 31 | MAP2K4 0.11931136 0.593670684 0.489152771 0.841683345 1.064673748 0.095113499 1.050152022 1.891488427 -5.5283552 0.64306832 -1.100026181 0.765710935 1.165406655 0.30638633 -1.365894262 0.635492291 -0.377798616 0.521665309 -0.608497433 0.398484128 -0.988354968 1.36349214 1.36269783 -0.112291585 -0.262719995 0.503524059 0.498006014 1.525942005 0.339189212 32 | SRC -0.294263824 -0.618071649 -0.252534114 -0.78660676 -0.228026664 0.977860794 -1.200449832 -0.22037931 -0.240489906 -0.201675468 1.47598938 -0.557000568 -0.502553204 -0.437501309 0.966927023 0.379670097 0.048795579 0.250622869 2.961024714 2.299033235 -1.210659274 0.418655141 1.161954005 -0.15700654 -1.254142937 -0.574558055 -0.662438275 3.702617515 -0.35302723 33 | TGFBR1 -0.000863802 0.735638383 -0.680289747 0.040925843 0.359330228 -1.587400295 -1.041686081 0.071551408 -0.168322665 -1.377303308 3.604539089 -0.004601068 1.527568732 -0.300154707 -0.786135509 -0.138050924 -0.366480418 -0.796970206 -0.030155544 0.803100056 0.683145561 -0.900708154 0.15251077 0.140092011 0.376815421 -1.214319621 1.326197465 1.523070279 -1.312001824 34 | CAMK2B -0.276736819 -0.426080887 -0.160160461 -0.890032771 -0.437405434 0.143897214 -0.573425958 -0.486419381 -0.536963482 -0.657041002 -0.473345418 -0.237475279 -0.669396538 -0.559435302 0.038953301 0.033709721 -0.343587801 -0.513218087 -0.592303313 -0.431221835 5.339202897 -0.493778587 -0.645000361 -0.477984867 -0.401579746 -0.621782124 -0.249394627 -0.303365249 -0.922343302 35 | STK24 -0.31807579 -0.814110809 0.646545188 0.26837169 -9.425120961 -1.073853473 -2.049589626 -0.346921024 0.997283181 0.300619253 -0.543103864 -1.150792172 
-2.283061167 -0.162802216 -1.053859713 -1.377541743 -0.288349474 -0.922266884 -1.123953091 -0.762953893 -0.687357148 0.28991073 0.317576672 -0.345565515 0.541683 0.009754099 0.73792006 -0.624271752 0.100532547 36 | DCLK3 -0.670177714 3.224533501 0.145509552 0.107432319 -1.120492739 0.288890539 1.549545918 -0.342665051 -0.017402855 -0.420002244 -0.361387453 -1.264272075 -0.794507765 -0.619944678 -0.338767802 -0.148529478 -1.078879645 0.130939014 -1.307815313 -1.818798474 3.683694337 0.920647357 -0.847056974 -0.343798498 -1.21552566 -0.853845334 -0.357215055 -0.043911541 -0.955847309 37 | LATS1 -0.695252888 4.299877134 -0.175587126 -0.061022137 -0.391646018 3.385451038 0.345114288 -0.505734993 -0.482953864 -0.081815586 -0.928486879 0.976209137 0.099021487 2.494690556 -1.088742779 0.437174751 -0.507169467 2.028724319 -0.507954247 0.143506281 -1.19702953 0.610379518 0.095879151 -0.663118727 0.50821984 -0.741815419 2.38531026 0.354750355 0.658437634 38 | NEK9 -0.337849025 -0.535265918 0.803160459 0.275911465 0.981343049 -0.748451144 -0.092431408 -0.326477104 -0.381243917 -0.575343824 -0.63351617 -0.380961411 -1.720616197 -0.85605361 -0.580950374 0.373293116 0.905490886 0.135705555 1.107780656 -0.545183144 0.475561701 0.016687596 -0.172178219 0.585186686 -0.40480014 -3.997318149 0.711029765 -0.470884061 0.354386296 39 | MYLK3 -0.368173217 0.209192446 0.266317555 -0.100656799 -0.336791718 -0.060827204 -0.199021599 -0.765882671 -0.071476548 -0.4402703 -0.3548684 3.468121376 5.853726714 -0.465135408 0.074434692 7.085199705 -0.399050575 -0.334999773 -0.623071147 -0.406230833 0.939058116 -0.269533885 0.117950503 0.1975473 -0.365407931 -0.056856473 0.001983212 0.081609959 -0.603299855 -------------------------------------------------------------------------------- /txt/rc_ptms.txt: -------------------------------------------------------------------------------- 1 | Cell Line: H1650 Cell Line: H23 Cell Line: CAL-12T Cell Line: H358 Cell Line: H1975 Cell Line: HCC15 Cell Line: H1355 Cell Line: HCC827 Cell Line: H2405 Cell Line: HCC78 Cell Line: H1666 Cell Line: H661 Cell Line: H838 Cell Line: H1703 Cell Line: CALU-3 Cell Line: H2342 Cell Line: H2228 Cell Line: H1299 Cell Line: H1792 Cell Line: H460 Cell Line: H2106 Cell Line: H441 Cell Line: H1944 Cell Line: H1437 Cell Line: H1734 Cell Line: LOU-NH91 Cell Line: HCC44 Cell Line: A549 Cell Line: H1781 2 | Category: two Category: two Category: two Category: one Category: two Category: two Category: three Category: one Category: five Category: five Category: four Category: five Category: five Category: five Category: four Category: four Category: one Category: three Category: three Category: three Category: four Category: one Category: three Category: four Category: one Category: five Category: four Category: four Category: one 3 | Gender: Male Gender: Male Gender: Male Gender: Male Gender: Female Gender: Male Gender: Male Gender: Female Gender: Male Gender: Male Gender: Female Gender: Male Gender: Male Gender: Male Gender: Male Gender: Female Gender: Female Gender: Male Gender: Male Gender: Male Gender: Male Gender: Male Gender: Female Gender: Male Gender: Female Gender: Female Gender: Female Gender: Male Gender: Female 4 | Gene: CDK4_ptm-info Gene Type: Interesting -0.792803571 0.527687127 0.000622536 0.356722594 0.933286088 -0.131728538 0.808451944 4.240884801 -0.540231391 -0.981456952 -0.84689892 -0.252795921 0.114189581 -0.06649884 0.149218809 1.351263924 0.645867212 0.60561098 3.232454573 0.342634572 -0.430912324 -0.40590567 0.199563989 -1.122536294 
2.210334571 0.405126315 -0.089763159 0.405126315 0.340012773 5 | Gene: LMTK3_ptm-info Gene Type: Not Interesting 0.17762054 -0.016061489 5.422113833 1.307039675 0.355814985 0.276904994 0.483153915 -0.240495821 1.336445996 1.149618502 0.361412978 -0.380518938 -0.213541004 -0.471938639 -0.620858723 -0.163637058 -0.487256142 -0.029569688 -0.232057778 -0.669036939 -0.449241698 1.158930406 0.511962022 2.370834155 0.262893885 -0.513128895 -0.501210068 0.439277561 -0.342460508 6 | Gene: LRRK2_ptm-info Gene Type: Not Interesting -0.697876151 -0.555610265 -0.360497559 -0.460236731 -0.680760697 -0.169463518 1.715708875 -0.517104823 0.184987709 0.8106597 -0.440334448 -0.621052026 -0.086803358 -0.753966225 -0.401972037 -0.562086752 -0.560644597 0.542301381 -0.382639145 -0.377853523 -0.713472923 -0.377609368 4.308904581 -0.638131949 -0.556114063 -0.318145763 -0.489582714 1.677376527 -0.682790464 7 | Gene: UHMK1_ptm-info Gene Type: Not Interesting 0.850546518 -0.263279907 0.179253031 0.398646721 1.537663802 0.505291411 0.902366491 -0.16628803 0.630730564 0.399448283 0.847171367 -0.442268094 0.44368676 1.552969029 1.110283483 -0.326698072 -0.405267374 0.663747183 0.424470033 0.283221899 -4.243973921 0.718315578 1.747343933 -1.020927175 0.305028514 1.47174613 0.048902278 -0.255283556 0.548224573 8 | Gene: EGFR_ptm-info Gene Type: Interesting 1.412416216 0.018987506 0.902251622 -0.17813747 0.781819022 0.211815895 -0.023427175 3.557295952 1.173783556 -0.012362164 0.769782484 -0.681031743 -1.047375389 0.652065499 0.172316691 2.072433469 1.135709377 -0.169977181 0.881067136 -0.486159025 -1.451838026 0.371237737 -0.581665325 -0.126356157 0.241004724 1.06526919 0.974531796 0.668645091 0.05696489 9 | Gene: STK32A_ptm-info Gene Type: Interesting -0.388039665 -0.592626614 -0.24413651 0.740364734 3.023348415 -0.433985412 -0.630124457 1.156531983 0.433696213 3.84950782 -0.225425742 -0.656106808 -0.311953357 -0.397450226 1.044025538 -0.247816912 3.640524345 -0.59251039 0.514666245 -0.45396994 1.649737631 3.366020313 -0.430502237 -0.295312303 2.824551497 -0.014275115 -0.410477794 -0.229717784 3.709828616 10 | Gene: NRK_ptm-info Gene Type: Interesting 1.408537135 -0.017369325 -0.367127962 0.313253548 -0.16288686 0.027411933 -0.281351556 5.813846489 -0.161706584 0.472386752 -0.33979584 0.669956625 -0.2596391 -0.386601295 -0.293654593 4.390499721 -0.420942214 -0.402955154 -0.346809494 -0.222725132 0.36849943 1.49303248 -0.34174718 -0.343420451 5.284808607 -0.358156896 -0.222931558 -0.401391167 -0.412478715 11 | Gene: ERBB2_ptm-info Gene Type: Not Interesting 0.906642406 -0.684771423 0.015261254 0.16056792 0.365002113 -0.564392699 0.169072827 -0.035192496 -0.031210405 0.447742443 0.544075103 0.280008477 -0.066278222 -0.225814318 4.103496507 1.219691566 -0.245022001 -0.681552658 -0.304817333 -0.511212295 -1.100056017 1.335983295 -0.500561544 0.721259819 0.284747072 0.232812724 -0.796930101 -0.156381455 1.503853721 12 | Gene: ERBB4_ptm-info Gene Type: Not Interesting -0.452907052 -0.392790536 -0.374173515 -0.527418493 -0.320103334 -0.560657219 -0.312847509 -0.463903623 -0.304652329 -0.30897114 -0.331935876 4.098930821 -0.413942149 -0.501418917 1.256164876 -0.12356019 -0.425577927 -0.36998588 -0.054684881 -0.484730631 -0.419739472 -0.432412432 0.143245619 -0.266932489 -0.340860307 -0.231847291 -0.448292539 -0.42868169 -0.615936889 13 | Gene: AAK1_ptm-info Gene Type: Not Interesting 3.579051735 0.92330807 -0.651094367 0.952743833 -0.212733397 0.006074527 -0.121038246 0.083769063 -0.722678214 1.669410989 -0.247600883 
-0.284623649 -0.687716717 -0.320883885 -0.93370415 -0.309230053 0.544870152 0.824029397 -0.087291924 -0.973905867 -0.308282983 0.822145704 -0.72904308 -0.088865731 -0.29848499 -0.451367112 -1.134040733 0.379230443 1.491612577 14 | Gene: SRPK3_ptm-info Gene Type: Not Interesting -0.582761335 -0.706379425 0.364313301 -0.483011414 -0.71307719 -0.048548064 -0.527944549 0.337501769 -0.656781635 -0.323318624 -0.432950623 -0.414799098 -1.02570516 -0.861415433 0.113447131 -0.110117735 -0.493510825 0.148841502 -0.341096914 -0.373760525 5.16013802 -0.204906932 -0.465574863 -0.170405491 0.046286803 -0.100887639 0.936150906 -0.15980844 -0.846857677 15 | Gene: STK39_ptm-info Gene Type: Interesting -0.58688791 -0.186685902 -3.51852921 0.250628834 -0.477773537 -0.62381107 -0.92202388 -0.55383453 0.018847867 1.267644557 0.243732055 -0.233528273 0.070726356 -0.256360198 1.741001607 0.168247379 -0.245974299 0.014972759 -0.537623787 0.259364957 1.303190492 1.043208024 -1.021094946 -0.097444212 -0.679290593 0.132592576 -0.440607517 -0.21005684 0.651311995 16 | Gene: GRK4_ptm-info Gene Type: Not Interesting -0.693639785 -0.357559299 -0.903861262 -0.810450279 0.293775461 1.012469252 -0.1044623 -0.573757161 -0.629467998 -1.131138938 -0.401984075 -0.672554564 -1.182791974 -1.00138041 -1.020216093 -0.437799213 -0.103783602 -0.387565928 0.386471772 -0.524742175 3.627070248 -0.846550806 -0.473736661 0.443388919 0.766135257 0.193683015 -0.614757268 -0.382906171 -0.864410411 17 | Gene: TBK1_ptm-info Gene Type: Not Interesting 0.327203594 0.857319301 -1.397356596 -0.226683585 -0.986051455 0.438343505 0.095527129 5.598772308 0.535025797 -0.057479225 -0.089086932 -0.473291011 2.070252907 0.201409888 1.134728133 0.095734104 0.22916067 0.566649702 1.011889663 0.902342556 0.735510304 -0.493538517 3.176918602 0.490045737 -0.693495689 -0.183704878 -0.182356844 0.744728769 -0.311064392 18 | Gene: INSRR_ptm-info Gene Type: Not Interesting 0.331108191 -0.467978397 0.681112329 -1.195914121 -0.538461957 3.616542204 0.094919881 0.527357553 -0.160478312 -0.940444939 -1.025689676 0.053722044 0.081275611 -0.616337936 -0.48042057 0.620903382 -0.723793064 -0.759642991 -0.744900744 -0.43363819 -0.804929849 0.429022269 0.765012989 0.400356525 0.207741849 1.474373008 0.888734282 0.387715581 -0.623678298 19 | Gene: IRAK1_ptm-info Gene Type: Interesting 0.141183837 0.788608352 0.51421388 0.528255597 0.906234597 0.050158065 -0.843341382 1.44384296 -0.253699343 0.562537497 -0.505923659 -6.621630156 -1.253907087 0.947883556 0.394657662 0.613122698 -0.223205762 0.905373736 0.679301554 0.859789222 -0.022354553 1.046726052 0.194471 0.359912834 -0.806348234 1.586619876 0.311985824 0.544470027 1.479598537 20 | Gene: KDR_ptm-info Gene Type: Not Interesting -0.524309949 -0.285994749 -0.484871681 0.176655189 -0.139711627 -0.352978397 -0.192529854 -0.601257973 -0.427323427 0.459403791 -0.383001 -0.571316753 -0.387982572 -0.353945118 -0.24031472 -0.305685522 0.456864915 -0.134886653 2.136583182 -0.463127859 -0.600408393 4.430955918 -0.374772774 -0.206853106 1.756104521 0.832887289 -0.314392176 -0.31373742 0.493420283 21 | Gene: NPR1_ptm-info Gene Type: Interesting 0.509592174 0.464774315 0.275495704 -0.01882253 -0.005537792 -0.457197866 3.408253083 -0.430407643 -0.754186082 -0.836530855 -0.277053957 -0.54257843 0.850439577 -0.298981449 -0.169394809 0.254783369 -0.427821755 -0.1389465 0.618069135 1.926349897 -0.305399918 -0.535939215 -0.078661666 6.267854465 -0.57293844 -0.302168609 0.72307512 0.611863003 -0.2145995 22 | Gene: PAK3_ptm-info Gene Type: Not 
Interesting -0.554447111 -0.145753485 0.019807701 -0.634915727 -0.493766887 -0.587644968 -0.223714996 1.385049476 -0.346100755 0.254536609 0.057887318 -0.593081366 -0.47065185 -0.753944095 -0.044570503 -0.334597636 0.339587788 -0.135201173 -0.111896921 4.913848539 -0.542750046 -0.27783299 -0.689097582 -0.216688006 1.127699857 0.03588747 -0.070416156 -0.553268062 -0.977427689 23 | Gene: PDGFRA_ptm-info Gene Type: Interesting -0.530831743 -0.260873607 -0.461134617 -0.389056188 -0.512555424 -0.513202268 -0.337550397 -0.449953768 -0.184845421 -0.565880657 -0.376356202 -0.285266104 -0.570505879 3.598950655 -0.477052971 -0.364449691 -0.648917127 -0.390363086 1.117877995 -0.464086586 -0.456431016 -0.571010056 -0.456668794 -0.58230495 -0.426192704 -0.411161613 -0.455618608 -0.297498019 -0.47141323 24 | Gene: PDK4_ptm-info Gene Type: Not Interesting -0.643246331 0.052021433 -0.735006626 0.041068843 -0.062094125 5.477714716 1.256686967 -0.136401851 0.577266871 -0.60002565 0.087671916 -0.560959779 -0.56490393 -0.629261602 -0.214226487 0.09929963 -0.095715004 -0.632856345 1.09320354 0.386976419 -0.374720076 -0.564957229 1.680895489 0.508498891 0.916604247 -0.607497709 0.58989289 -0.122764837 -0.533651064 25 | Gene: ULK4_ptm-info Gene Type: Interesting -0.693868027 0.57619653 -0.488541037 -0.60094858 1.139598497 -0.024286993 -0.288194559 -0.499250839 0.103697413 -1.044716157 -1.02427507 3.023003444 3.859028873 0.742978442 -0.352722819 0.069205383 -1.117741919 0.651553716 -0.364768511 -0.5472691 -0.735728719 -0.623797671 -0.105370633 -0.139431549 -0.055372648 0.774449831 0.538987341 0.35514519 -1.45386407 26 | Gene: PRKCE_ptm-info Gene Type: Not Interesting 0.006531886 0.564826732 3.695318524 0.316255033 -0.268737774 0.936461505 0.2291517 0.649579007 -0.330103595 -0.504534505 0.264729237 -0.977228043 0.493632169 -0.401821398 -0.286231779 0.143371278 -0.360231532 0.340762717 0.633162512 -0.710530502 -1.334690191 0.158108045 -0.347820435 -0.074497061 -0.970507716 -0.26479443 -0.298648517 -0.10090872 -0.11742112 27 | Gene: PRKG2_ptm-info Gene Type: Not Interesting -0.185695405 -0.173758799 0.084357105 1.826502656 0.00816719 -1.102148634 0.299002536 0.458848186 0.292508806 0.110508201 0.083592283 -0.494333063 -0.117947546 -0.539712481 -0.106334279 -0.403083002 -0.789473381 1.041787363 1.70041072 -0.293951867 4.839524758 1.015480815 0.841188534 -0.620389764 -0.565583764 -0.262366184 0.226425315 -0.048000565 1.126249373 28 | Gene: MAPK4_ptm-info Gene Type: Interesting 0.184462349 -0.526037871 0.432087272 -0.882311913 0.246356093 0.858754521 0.052858019 -1.118340603 -0.846948816 -0.778824075 3.525192777 -1.872745007 -0.779756435 -1.039639399 -0.59333431 0.402156007 -1.387426464 -0.145435051 -0.46497243 -0.221064461 -0.861483648 0.125415634 -0.191849116 2.374460297 -0.74142144 0.7654394 1.029796862 0.03307866 0.44066582 29 | Gene: MAPK11_ptm-info Gene Type: Interesting 1.760301448 -0.912259652 -1.163345889 -0.965891664 -0.795153414 -0.616300339 -1.360743997 -1.448291877 -0.024088935 -1.188868793 -0.229906845 2.181489143 -1.154435684 6.28292787 -0.303782002 -0.165568925 -1.126153349 1.678721355 -1.683560793 -0.864063548 -0.025445472 1.890946219 0.667805988 -0.625764381 -1.063340313 3.222816803 -0.001359619 -0.203661756 0.187669924 30 | Gene: STK31_ptm-info Gene Type: Interesting -0.07364355 -0.103789279 -0.171304836 0.351910065 0.63677969 -0.136732984 0.356830815 3.889115824 0.645442526 1.366358918 0.995319244 5.608685402 1.101919141 -0.554900568 0.087820649 0.061305127 1.931275557 -0.692417574 -0.481807702 
-0.16288735 -0.538298189 2.440412245 0.804274605 -0.605526195 1.788457016 -0.376520922 0.35819202 0.164487781 3.719306763 31 | Gene: GRK1_ptm-info Gene Type: Not Interesting -0.751526741 0.49762292 -0.142534658 -0.882124083 -1.151282849 2.307907188 -0.12032085 -0.351269532 -1.526178564 -0.753268428 3.600861739 -1.223995853 -0.607229424 -0.027417898 0.190161632 0.610550408 0.149796331 -0.122879865 0.247865963 -0.404833708 0.736929754 -0.944275068 -0.078919294 0.661648005 -0.244948779 3.051534602 -0.107365228 0.367536408 -1.517985824 32 | Gene: ROS1_ptm-info Gene Type: Interesting -0.31236414 0.701257089 0.47520812 -0.585297054 -0.122694283 -0.866875137 0.367939523 -0.481103706 2.072237711 10.29186436 1.298805701 -0.628175917 -0.173084375 -0.02710755 0.355169073 0.470456905 0.121400231 0.374924602 -0.278307341 -0.553746266 -0.935156558 -0.042420296 -0.479479902 -0.332400886 -0.710017011 1.873931755 0.204554429 -0.32315246 0.187572521 33 | Gene: MAP2K4_ptm-info Gene Type: Interesting 0.11931136 0.593670684 0.489152771 0.841683345 1.064673748 0.095113499 1.050152022 1.891488427 -5.5283552 0.64306832 -1.100026181 0.765710935 1.165406655 0.30638633 -1.365894262 0.635492291 -0.377798616 0.521665309 -0.608497433 0.398484128 -0.988354968 1.36349214 1.36269783 -0.112291585 -0.262719995 0.503524059 0.498006014 1.525942005 0.339189212 34 | Gene: SRC_ptm-info Gene Type: Interesting -0.294263824 -0.618071649 -0.252534114 -0.78660676 -0.228026664 0.977860794 -1.200449832 -0.22037931 -0.240489906 -0.201675468 1.47598938 -0.557000568 -0.502553204 -0.437501309 0.966927023 0.379670097 0.048795579 0.250622869 2.961024714 2.299033235 -1.210659274 0.418655141 1.161954005 -0.15700654 -1.254142937 -0.574558055 -0.662438275 3.702617515 -0.35302723 35 | Gene: TGFBR1_ptm-info Gene Type: Interesting -0.000863802 0.735638383 -0.680289747 0.040925843 0.359330228 -1.587400295 -1.041686081 0.071551408 -0.168322665 -1.377303308 3.604539089 -0.004601068 1.527568732 -0.300154707 -0.786135509 -0.138050924 -0.366480418 -0.796970206 -0.030155544 0.803100056 0.683145561 -0.900708154 0.15251077 0.140092011 0.376815421 -1.214319621 1.326197465 1.523070279 -1.312001824 36 | Gene: CAMK2B_ptm-info Gene Type: Not Interesting -0.276736819 -0.426080887 -0.160160461 -0.890032771 -0.437405434 0.143897214 -0.573425958 -0.486419381 -0.536963482 -0.657041002 -0.473345418 -0.237475279 -0.669396538 -0.559435302 0.038953301 0.033709721 -0.343587801 -0.513218087 -0.592303313 -0.431221835 5.339202897 -0.493778587 -0.645000361 -0.477984867 -0.401579746 -0.621782124 -0.249394627 -0.303365249 -0.922343302 37 | Gene: STK24_ptm-info Gene Type: Interesting -0.31807579 -0.814110809 0.646545188 0.26837169 -9.425120961 -1.073853473 -2.049589626 -0.346921024 0.997283181 0.300619253 -0.543103864 -1.150792172 -2.283061167 -0.162802216 -1.053859713 -1.377541743 -0.288349474 -0.922266884 -1.123953091 -0.762953893 -0.687357148 0.28991073 0.317576672 -0.345565515 0.541683 0.009754099 0.73792006 -0.624271752 0.100532547 38 | Gene: DCLK3_ptm-info Gene Type: Not Interesting -0.670177714 3.224533501 0.145509552 0.107432319 -1.120492739 0.288890539 1.549545918 -0.342665051 -0.017402855 -0.420002244 -0.361387453 -1.264272075 -0.794507765 -0.619944678 -0.338767802 -0.148529478 -1.078879645 0.130939014 -1.307815313 -1.818798474 3.683694337 0.920647357 -0.847056974 -0.343798498 -1.21552566 -0.853845334 -0.357215055 -0.043911541 -0.955847309 39 | Gene: LATS1_ptm-info Gene Type: Not Interesting -0.695252888 4.299877134 -0.175587126 -0.061022137 -0.391646018 3.385451038 
0.345114288 -0.505734993 -0.482953864 -0.081815586 -0.928486879 0.976209137 0.099021487 2.494690556 -1.088742779 0.437174751 -0.507169467 2.028724319 -0.507954247 0.143506281 -1.19702953 0.610379518 0.095879151 -0.663118727 0.50821984 -0.741815419 2.38531026 0.354750355 0.658437634 40 | Gene: NEK9_ptm-info Gene Type: Not Interesting -0.337849025 -0.535265918 0.803160459 0.275911465 0.981343049 -0.748451144 -0.092431408 -0.326477104 -0.381243917 -0.575343824 -0.63351617 -0.380961411 -1.720616197 -0.85605361 -0.580950374 0.373293116 0.905490886 0.135705555 1.107780656 -0.545183144 0.475561701 0.016687596 -0.172178219 0.585186686 -0.40480014 -3.997318149 0.711029765 -0.470884061 0.354386296 41 | Gene: MYLK3_ptm-info Gene Type: Not Interesting -0.368173217 0.209192446 0.266317555 -0.100656799 -0.336791718 -0.060827204 -0.199021599 -0.765882671 -0.071476548 -0.4402703 -0.3548684 3.468121376 5.853726714 -0.465135408 0.074434692 7.085199705 -0.399050575 -0.334999773 -0.623071147 -0.406230833 0.939058116 -0.269533885 0.117950503 0.1975473 -0.365407931 -0.056856473 0.001983212 0.081609959 -0.603299855 -------------------------------------------------------------------------------- /txt/rc_two_cats.txt: -------------------------------------------------------------------------------- 1 | Cell Line: H1650 Cell Line: H23 Cell Line: CAL-12T Cell Line: H358 Cell Line: H1975 Cell Line: HCC15 Cell Line: H1355 Cell Line: HCC827 Cell Line: H2405 Cell Line: HCC78 Cell Line: H1666 Cell Line: H661 Cell Line: H838 Cell Line: H1703 Cell Line: CALU-3 Cell Line: H2342 Cell Line: H2228 Cell Line: H1299 Cell Line: H1792 Cell Line: H460 Cell Line: H2106 Cell Line: H441 Cell Line: H1944 Cell Line: H1437 Cell Line: H1734 Cell Line: LOU-NH91 Cell Line: HCC44 Cell Line: A549 Cell Line: H1781 2 | Category: two Category: two Category: two Category: one Category: two Category: two Category: three Category: one Category: five Category: five Category: four Category: five Category: five Category: five Category: four Category: four Category: one Category: three Category: three Category: three Category: four Category: one Category: three Category: four Category: one Category: five Category: four Category: four Category: one 3 | Gender: Male Gender: Male Gender: Male Gender: Male Gender: Female Gender: Male Gender: Male Gender: Female Gender: Male Gender: Male Gender: Female Gender: Male Gender: Male Gender: Male Gender: Male Gender: Female Gender: Female Gender: Male Gender: Male Gender: Male Gender: Male Gender: Male Gender: Female Gender: Male Gender: Female Gender: Female Gender: Female Gender: Male Gender: Female 4 | Gene: CDK4 Gene Type: Interesting -0.792803571 0.527687127 0.000622536 0.356722594 0.933286088 -0.131728538 0.808451944 4.240884801 -0.540231391 -0.981456952 -0.84689892 -0.252795921 0.114189581 -0.06649884 0.149218809 1.351263924 0.645867212 0.60561098 3.232454573 0.342634572 -0.430912324 -0.40590567 0.199563989 -1.122536294 2.210334571 0.405126315 -0.089763159 0.405126315 0.340012773 5 | Gene: LMTK3 Gene Type: Not Interesting 0.17762054 -0.016061489 5.422113833 1.307039675 0.355814985 0.276904994 0.483153915 -0.240495821 1.336445996 1.149618502 0.361412978 -0.380518938 -0.213541004 -0.471938639 -0.620858723 -0.163637058 -0.487256142 -0.029569688 -0.232057778 -0.669036939 -0.449241698 1.158930406 0.511962022 2.370834155 0.262893885 -0.513128895 -0.501210068 0.439277561 -0.342460508 6 | Gene: LRRK2 Gene Type: Not Interesting -0.697876151 -0.555610265 -0.360497559 -0.460236731 -0.680760697 -0.169463518 1.715708875 
-0.517104823 0.184987709 0.8106597 -0.440334448 -0.621052026 -0.086803358 -0.753966225 -0.401972037 -0.562086752 -0.560644597 0.542301381 -0.382639145 -0.377853523 -0.713472923 -0.377609368 4.308904581 -0.638131949 -0.556114063 -0.318145763 -0.489582714 1.677376527 -0.682790464 7 | Gene: UHMK1 Gene Type: Not Interesting 0.850546518 -0.263279907 0.179253031 0.398646721 1.537663802 0.505291411 0.902366491 -0.16628803 0.630730564 0.399448283 0.847171367 -0.442268094 0.44368676 1.552969029 1.110283483 -0.326698072 -0.405267374 0.663747183 0.424470033 0.283221899 -4.243973921 0.718315578 1.747343933 -1.020927175 0.305028514 1.47174613 0.048902278 -0.255283556 0.548224573 8 | Gene: EGFR Gene Type: Interesting 1.412416216 0.018987506 0.902251622 -0.17813747 0.781819022 0.211815895 -0.023427175 3.557295952 1.173783556 -0.012362164 0.769782484 -0.681031743 -1.047375389 0.652065499 0.172316691 2.072433469 1.135709377 -0.169977181 0.881067136 -0.486159025 -1.451838026 0.371237737 -0.581665325 -0.126356157 0.241004724 1.06526919 0.974531796 0.668645091 0.05696489 9 | Gene: STK32A Gene Type: Interesting -0.388039665 -0.592626614 -0.24413651 0.740364734 3.023348415 -0.433985412 -0.630124457 1.156531983 0.433696213 3.84950782 -0.225425742 -0.656106808 -0.311953357 -0.397450226 1.044025538 -0.247816912 3.640524345 -0.59251039 0.514666245 -0.45396994 1.649737631 3.366020313 -0.430502237 -0.295312303 2.824551497 -0.014275115 -0.410477794 -0.229717784 3.709828616 10 | Gene: NRK Gene Type: Interesting 1.408537135 -0.017369325 -0.367127962 0.313253548 -0.16288686 0.027411933 -0.281351556 5.813846489 -0.161706584 0.472386752 -0.33979584 0.669956625 -0.2596391 -0.386601295 -0.293654593 4.390499721 -0.420942214 -0.402955154 -0.346809494 -0.222725132 0.36849943 1.49303248 -0.34174718 -0.343420451 5.284808607 -0.358156896 -0.222931558 -0.401391167 -0.412478715 11 | Gene: ERBB2 Gene Type: Not Interesting 0.906642406 -0.684771423 0.015261254 0.16056792 0.365002113 -0.564392699 0.169072827 -0.035192496 -0.031210405 0.447742443 0.544075103 0.280008477 -0.066278222 -0.225814318 4.103496507 1.219691566 -0.245022001 -0.681552658 -0.304817333 -0.511212295 -1.100056017 1.335983295 -0.500561544 0.721259819 0.284747072 0.232812724 -0.796930101 -0.156381455 1.503853721 12 | Gene: ERBB4 Gene Type: Not Interesting -0.452907052 -0.392790536 -0.374173515 -0.527418493 -0.320103334 -0.560657219 -0.312847509 -0.463903623 -0.304652329 -0.30897114 -0.331935876 4.098930821 -0.413942149 -0.501418917 1.256164876 -0.12356019 -0.425577927 -0.36998588 -0.054684881 -0.484730631 -0.419739472 -0.432412432 0.143245619 -0.266932489 -0.340860307 -0.231847291 -0.448292539 -0.42868169 -0.615936889 13 | Gene: AAK1 Gene Type: Not Interesting 3.579051735 0.92330807 -0.651094367 0.952743833 -0.212733397 0.006074527 -0.121038246 0.083769063 -0.722678214 1.669410989 -0.247600883 -0.284623649 -0.687716717 -0.320883885 -0.93370415 -0.309230053 0.544870152 0.824029397 -0.087291924 -0.973905867 -0.308282983 0.822145704 -0.72904308 -0.088865731 -0.29848499 -0.451367112 -1.134040733 0.379230443 1.491612577 14 | Gene: SRPK3 Gene Type: Not Interesting -0.582761335 -0.706379425 0.364313301 -0.483011414 -0.71307719 -0.048548064 -0.527944549 0.337501769 -0.656781635 -0.323318624 -0.432950623 -0.414799098 -1.02570516 -0.861415433 0.113447131 -0.110117735 -0.493510825 0.148841502 -0.341096914 -0.373760525 5.16013802 -0.204906932 -0.465574863 -0.170405491 0.046286803 -0.100887639 0.936150906 -0.15980844 -0.846857677 15 | Gene: STK39 Gene Type: Interesting -0.58688791 
-0.186685902 -3.51852921 0.250628834 -0.477773537 -0.62381107 -0.92202388 -0.55383453 0.018847867 1.267644557 0.243732055 -0.233528273 0.070726356 -0.256360198 1.741001607 0.168247379 -0.245974299 0.014972759 -0.537623787 0.259364957 1.303190492 1.043208024 -1.021094946 -0.097444212 -0.679290593 0.132592576 -0.440607517 -0.21005684 0.651311995 16 | Gene: GRK4 Gene Type: Not Interesting -0.693639785 -0.357559299 -0.903861262 -0.810450279 0.293775461 1.012469252 -0.1044623 -0.573757161 -0.629467998 -1.131138938 -0.401984075 -0.672554564 -1.182791974 -1.00138041 -1.020216093 -0.437799213 -0.103783602 -0.387565928 0.386471772 -0.524742175 3.627070248 -0.846550806 -0.473736661 0.443388919 0.766135257 0.193683015 -0.614757268 -0.382906171 -0.864410411 17 | Gene: TBK1 Gene Type: Not Interesting 0.327203594 0.857319301 -1.397356596 -0.226683585 -0.986051455 0.438343505 0.095527129 5.598772308 0.535025797 -0.057479225 -0.089086932 -0.473291011 2.070252907 0.201409888 1.134728133 0.095734104 0.22916067 0.566649702 1.011889663 0.902342556 0.735510304 -0.493538517 3.176918602 0.490045737 -0.693495689 -0.183704878 -0.182356844 0.744728769 -0.311064392 18 | Gene: INSRR Gene Type: Not Interesting 0.331108191 -0.467978397 0.681112329 -1.195914121 -0.538461957 3.616542204 0.094919881 0.527357553 -0.160478312 -0.940444939 -1.025689676 0.053722044 0.081275611 -0.616337936 -0.48042057 0.620903382 -0.723793064 -0.759642991 -0.744900744 -0.43363819 -0.804929849 0.429022269 0.765012989 0.400356525 0.207741849 1.474373008 0.888734282 0.387715581 -0.623678298 19 | Gene: IRAK1 Gene Type: Interesting 0.141183837 0.788608352 0.51421388 0.528255597 0.906234597 0.050158065 -0.843341382 1.44384296 -0.253699343 0.562537497 -0.505923659 -6.621630156 -1.253907087 0.947883556 0.394657662 0.613122698 -0.223205762 0.905373736 0.679301554 0.859789222 -0.022354553 1.046726052 0.194471 0.359912834 -0.806348234 1.586619876 0.311985824 0.544470027 1.479598537 20 | Gene: KDR Gene Type: Not Interesting -0.524309949 -0.285994749 -0.484871681 0.176655189 -0.139711627 -0.352978397 -0.192529854 -0.601257973 -0.427323427 0.459403791 -0.383001 -0.571316753 -0.387982572 -0.353945118 -0.24031472 -0.305685522 0.456864915 -0.134886653 2.136583182 -0.463127859 -0.600408393 4.430955918 -0.374772774 -0.206853106 1.756104521 0.832887289 -0.314392176 -0.31373742 0.493420283 21 | Gene: NPR1 Gene Type: Interesting 0.509592174 0.464774315 0.275495704 -0.01882253 -0.005537792 -0.457197866 3.408253083 -0.430407643 -0.754186082 -0.836530855 -0.277053957 -0.54257843 0.850439577 -0.298981449 -0.169394809 0.254783369 -0.427821755 -0.1389465 0.618069135 1.926349897 -0.305399918 -0.535939215 -0.078661666 6.267854465 -0.57293844 -0.302168609 0.72307512 0.611863003 -0.2145995 22 | Gene: PAK3 Gene Type: Not Interesting -0.554447111 -0.145753485 0.019807701 -0.634915727 -0.493766887 -0.587644968 -0.223714996 1.385049476 -0.346100755 0.254536609 0.057887318 -0.593081366 -0.47065185 -0.753944095 -0.044570503 -0.334597636 0.339587788 -0.135201173 -0.111896921 4.913848539 -0.542750046 -0.27783299 -0.689097582 -0.216688006 1.127699857 0.03588747 -0.070416156 -0.553268062 -0.977427689 23 | Gene: PDGFRA Gene Type: Interesting -0.530831743 -0.260873607 -0.461134617 -0.389056188 -0.512555424 -0.513202268 -0.337550397 -0.449953768 -0.184845421 -0.565880657 -0.376356202 -0.285266104 -0.570505879 3.598950655 -0.477052971 -0.364449691 -0.648917127 -0.390363086 1.117877995 -0.464086586 -0.456431016 -0.571010056 -0.456668794 -0.58230495 -0.426192704 -0.411161613 -0.455618608 
-0.297498019 -0.47141323 24 | Gene: PDK4 Gene Type: Not Interesting -0.643246331 0.052021433 -0.735006626 0.041068843 -0.062094125 5.477714716 1.256686967 -0.136401851 0.577266871 -0.60002565 0.087671916 -0.560959779 -0.56490393 -0.629261602 -0.214226487 0.09929963 -0.095715004 -0.632856345 1.09320354 0.386976419 -0.374720076 -0.564957229 1.680895489 0.508498891 0.916604247 -0.607497709 0.58989289 -0.122764837 -0.533651064 25 | Gene: ULK4 Gene Type: Interesting -0.693868027 0.57619653 -0.488541037 -0.60094858 1.139598497 -0.024286993 -0.288194559 -0.499250839 0.103697413 -1.044716157 -1.02427507 3.023003444 3.859028873 0.742978442 -0.352722819 0.069205383 -1.117741919 0.651553716 -0.364768511 -0.5472691 -0.735728719 -0.623797671 -0.105370633 -0.139431549 -0.055372648 0.774449831 0.538987341 0.35514519 -1.45386407 26 | Gene: PRKCE Gene Type: Not Interesting 0.006531886 0.564826732 3.695318524 0.316255033 -0.268737774 0.936461505 0.2291517 0.649579007 -0.330103595 -0.504534505 0.264729237 -0.977228043 0.493632169 -0.401821398 -0.286231779 0.143371278 -0.360231532 0.340762717 0.633162512 -0.710530502 -1.334690191 0.158108045 -0.347820435 -0.074497061 -0.970507716 -0.26479443 -0.298648517 -0.10090872 -0.11742112 27 | Gene: PRKG2 Gene Type: Not Interesting -0.185695405 -0.173758799 0.084357105 1.826502656 0.00816719 -1.102148634 0.299002536 0.458848186 0.292508806 0.110508201 0.083592283 -0.494333063 -0.117947546 -0.539712481 -0.106334279 -0.403083002 -0.789473381 1.041787363 1.70041072 -0.293951867 4.839524758 1.015480815 0.841188534 -0.620389764 -0.565583764 -0.262366184 0.226425315 -0.048000565 1.126249373 28 | Gene: MAPK4 Gene Type: Interesting 0.184462349 -0.526037871 0.432087272 -0.882311913 0.246356093 0.858754521 0.052858019 -1.118340603 -0.846948816 -0.778824075 3.525192777 -1.872745007 -0.779756435 -1.039639399 -0.59333431 0.402156007 -1.387426464 -0.145435051 -0.46497243 -0.221064461 -0.861483648 0.125415634 -0.191849116 2.374460297 -0.74142144 0.7654394 1.029796862 0.03307866 0.44066582 29 | Gene: MAPK11 Gene Type: Interesting 1.760301448 -0.912259652 -1.163345889 -0.965891664 -0.795153414 -0.616300339 -1.360743997 -1.448291877 -0.024088935 -1.188868793 -0.229906845 2.181489143 -1.154435684 6.28292787 -0.303782002 -0.165568925 -1.126153349 1.678721355 -1.683560793 -0.864063548 -0.025445472 1.890946219 0.667805988 -0.625764381 -1.063340313 3.222816803 -0.001359619 -0.203661756 0.187669924 30 | Gene: STK31 Gene Type: Interesting -0.07364355 -0.103789279 -0.171304836 0.351910065 0.63677969 -0.136732984 0.356830815 3.889115824 0.645442526 1.366358918 0.995319244 5.608685402 1.101919141 -0.554900568 0.087820649 0.061305127 1.931275557 -0.692417574 -0.481807702 -0.16288735 -0.538298189 2.440412245 0.804274605 -0.605526195 1.788457016 -0.376520922 0.35819202 0.164487781 3.719306763 31 | Gene: GRK1 Gene Type: Not Interesting -0.751526741 0.49762292 -0.142534658 -0.882124083 -1.151282849 2.307907188 -0.12032085 -0.351269532 -1.526178564 -0.753268428 3.600861739 -1.223995853 -0.607229424 -0.027417898 0.190161632 0.610550408 0.149796331 -0.122879865 0.247865963 -0.404833708 0.736929754 -0.944275068 -0.078919294 0.661648005 -0.244948779 3.051534602 -0.107365228 0.367536408 -1.517985824 32 | Gene: ROS1 Gene Type: Interesting -0.31236414 0.701257089 0.47520812 -0.585297054 -0.122694283 -0.866875137 0.367939523 -0.481103706 2.072237711 10.29186436 1.298805701 -0.628175917 -0.173084375 -0.02710755 0.355169073 0.470456905 0.121400231 0.374924602 -0.278307341 -0.553746266 -0.935156558 -0.042420296 
-0.479479902 -0.332400886 -0.710017011 1.873931755 0.204554429 -0.32315246 0.187572521 33 | Gene: MAP2K4 Gene Type: Interesting 0.11931136 0.593670684 0.489152771 0.841683345 1.064673748 0.095113499 1.050152022 1.891488427 -5.5283552 0.64306832 -1.100026181 0.765710935 1.165406655 0.30638633 -1.365894262 0.635492291 -0.377798616 0.521665309 -0.608497433 0.398484128 -0.988354968 1.36349214 1.36269783 -0.112291585 -0.262719995 0.503524059 0.498006014 1.525942005 0.339189212 34 | Gene: SRC Gene Type: Interesting -0.294263824 -0.618071649 -0.252534114 -0.78660676 -0.228026664 0.977860794 -1.200449832 -0.22037931 -0.240489906 -0.201675468 1.47598938 -0.557000568 -0.502553204 -0.437501309 0.966927023 0.379670097 0.048795579 0.250622869 2.961024714 2.299033235 -1.210659274 0.418655141 1.161954005 -0.15700654 -1.254142937 -0.574558055 -0.662438275 3.702617515 -0.35302723 35 | Gene: TGFBR1 Gene Type: Interesting -0.000863802 0.735638383 -0.680289747 0.040925843 0.359330228 -1.587400295 -1.041686081 0.071551408 -0.168322665 -1.377303308 3.604539089 -0.004601068 1.527568732 -0.300154707 -0.786135509 -0.138050924 -0.366480418 -0.796970206 -0.030155544 0.803100056 0.683145561 -0.900708154 0.15251077 0.140092011 0.376815421 -1.214319621 1.326197465 1.523070279 -1.312001824 36 | Gene: CAMK2B Gene Type: Not Interesting -0.276736819 -0.426080887 -0.160160461 -0.890032771 -0.437405434 0.143897214 -0.573425958 -0.486419381 -0.536963482 -0.657041002 -0.473345418 -0.237475279 -0.669396538 -0.559435302 0.038953301 0.033709721 -0.343587801 -0.513218087 -0.592303313 -0.431221835 5.339202897 -0.493778587 -0.645000361 -0.477984867 -0.401579746 -0.621782124 -0.249394627 -0.303365249 -0.922343302 37 | Gene: STK24 Gene Type: Interesting -0.31807579 -0.814110809 0.646545188 0.26837169 -9.425120961 -1.073853473 -2.049589626 -0.346921024 0.997283181 0.300619253 -0.543103864 -1.150792172 -2.283061167 -0.162802216 -1.053859713 -1.377541743 -0.288349474 -0.922266884 -1.123953091 -0.762953893 -0.687357148 0.28991073 0.317576672 -0.345565515 0.541683 0.009754099 0.73792006 -0.624271752 0.100532547 38 | Gene: DCLK3 Gene Type: Not Interesting -0.670177714 3.224533501 0.145509552 0.107432319 -1.120492739 0.288890539 1.549545918 -0.342665051 -0.017402855 -0.420002244 -0.361387453 -1.264272075 -0.794507765 -0.619944678 -0.338767802 -0.148529478 -1.078879645 0.130939014 -1.307815313 -1.818798474 3.683694337 0.920647357 -0.847056974 -0.343798498 -1.21552566 -0.853845334 -0.357215055 -0.043911541 -0.955847309 39 | Gene: LATS1 Gene Type: Not Interesting -0.695252888 4.299877134 -0.175587126 -0.061022137 -0.391646018 3.385451038 0.345114288 -0.505734993 -0.482953864 -0.081815586 -0.928486879 0.976209137 0.099021487 2.494690556 -1.088742779 0.437174751 -0.507169467 2.028724319 -0.507954247 0.143506281 -1.19702953 0.610379518 0.095879151 -0.663118727 0.50821984 -0.741815419 2.38531026 0.354750355 0.658437634 40 | Gene: NEK9 Gene Type: Not Interesting -0.337849025 -0.535265918 0.803160459 0.275911465 0.981343049 -0.748451144 -0.092431408 -0.326477104 -0.381243917 -0.575343824 -0.63351617 -0.380961411 -1.720616197 -0.85605361 -0.580950374 0.373293116 0.905490886 0.135705555 1.107780656 -0.545183144 0.475561701 0.016687596 -0.172178219 0.585186686 -0.40480014 -3.997318149 0.711029765 -0.470884061 0.354386296 41 | Gene: MYLK3 Gene Type: Not Interesting -0.368173217 0.209192446 0.266317555 -0.100656799 -0.336791718 -0.060827204 -0.199021599 -0.765882671 -0.071476548 -0.4402703 -0.3548684 3.468121376 5.853726714 -0.465135408 0.074434692 
7.085199705 -0.399050575 -0.334999773 -0.623071147 -0.406230833 0.939058116 -0.269533885 0.117950503 0.1975473 -0.365407931 -0.056856473 0.001983212 0.081609959 -0.603299855 -------------------------------------------------------------------------------- /txt/rc_val_cats.txt: -------------------------------------------------------------------------------- 1 | col-A col-B col-C 2 | 1 3 2 3 | row-A 1 1 2 3 4 | row-B 2 10 11 12 5 | row-C 3 7 8 9 6 | row-D 7 4 5 6 --------------------------------------------------------------------------------
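The matrices above are the tab-separated example inputs shipped with the repo: rc_two_cats.txt pairs each column with Cell Line, Category, and Gender labels and each row with Gene and Gene Type labels, while rc_val_cats.txt attaches a numeric value category to every row and column. Below is a minimal sketch of how such a file is typically turned into a Clustergrammer visualization JSON, assuming the clustergrammer package from this repo is installed and the working directory is the repo root; method names follow the package's documented usage and may vary slightly between releases.

```python
# Minimal sketch, not part of the repo dump above.
from clustergrammer import Network

net = Network()

# rc_two_cats.txt: tab-separated matrix whose first rows carry column
# categories (Cell Line, Category, Gender) and whose first columns carry
# row categories (Gene, Gene Type); all remaining cells are numeric.
net.load_file('txt/rc_two_cats.txt')

# Hierarchically cluster rows and columns and build the view data
# (older releases expose the same step as net.make_clust()).
net.cluster()

# Write the JSON consumed by the Clustergrammer front end;
# the output path here is illustrative.
net.write_json_to_file('viz', 'json/mult_view.json')
```

Judging from the file tree, make_clustergrammer.py at the repo root performs essentially this conversion to produce the example views under json/ (e.g. mult_view.json), with the multi-level row and column labels surfacing as the category bars in the rendered heatmap.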