├── .gitignore ├── LICENSE ├── README.md ├── comut ├── __init__.py ├── comut.py └── fileparsers.py ├── examples ├── documentation.ipynb ├── images │ ├── melanoma_comut.png │ ├── melanoma_comut.svg │ └── tutorial_comut.svg ├── quickstart.ipynb └── tutorial_data │ ├── data_types.png │ ├── melanoma_example │ ├── best_response.tsv │ ├── copy_number_alterations.tsv │ ├── melanoma.maf │ ├── melanoma_comut.png │ ├── merged_clinical_data.tsv │ ├── mutation_frequency.tsv │ ├── mutation_signatures.tsv │ ├── mutational_burden.tsv │ ├── mutational_signatures.tsv │ ├── primary_type.tsv │ ├── purity.tsv │ └── whole_genome_doubling.tsv │ ├── tutorial_biopsy_site.tsv │ ├── tutorial_comut.png │ ├── tutorial_indicator.tsv │ ├── tutorial_mutation_burden.tsv │ ├── tutorial_mutation_data.tsv │ ├── tutorial_mutsig_qvals.tsv │ └── tutorial_purity.tsv ├── reqs └── base-requirements.txt └── setup.py /.gitignore: -------------------------------------------------------------------------------- 1 | __pycache__/ 2 | *.pyc 3 | build 4 | dist 5 | .ipynb_checkpoints 6 | *.egg-info 7 | **/.DS_Store 8 | .improvements -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Jett P. Crowdis 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # CoMut 2 | CoMut is a Python library for creating comutation plots to visualize genomic and phenotypic information. 3 | 4 | ![melanoma_comut](https://raw.githubusercontent.com/vanallenlab/comut/master/examples/images/melanoma_comut.png) 5 | 6 | 7 | ## Installation 8 | 9 | CoMut is available on pypi [here](https://pypi.org/project/comut/) and can be installed via `pip` 10 | 11 | `pip install comut` 12 | 13 | ## Colab Quickstart 14 | 15 | For those who do not want to install Python or other packages, there is a [Google Colab notebook](https://colab.research.google.com/github/vanallenlab/comut/blob/master/examples/quickstart.ipynb) where you can simply upload a [MAF](https://software.broadinstitute.org/software/igv/MutationAnnotationFormat) file and run the notebook to make a basic comutation plot. This file is also available as a [jupyter notebook](https://github.com/vanallenlab/comut/blob/master/examples/quickstart.ipynb) for local use. 16 | 17 | ## Citation 18 | 19 | CoMut is now published here - https://academic.oup.com/bioinformatics/article/doi/10.1093/bioinformatics/btaa554/5851837. If you use CoMut in a paper, please cite: 20 | > Crowdis, J., He, M.X., Reardon, B. & Van Allen, E. M. CoMut: Visualizing integrated molecular information with comutation plots. Bioinformatics (2020). 
doi:10.1093/bioinformatics/btaa554 21 | 22 | ## Documentation 23 | 24 | There is also a [Documentation notebook](https://github.com/vanallenlab/comut/blob/master/examples/documentation.ipynb) that provides documentation for CoMut. It describes the fundamentals of creating comutation plots and provides the code used to generate the comut above. 25 | 26 | ## Development 27 | 28 | If you would like to report a bug or request a feature, please do so using the [issues page](https://github.com/vanallenlab/comut/issues) 29 | 30 | ## Dependencies 31 | 32 | CoMut runs on Python 3.6 or later. CoMut requires the following packages as dependencies (they will be installed along with CoMut if using `pip`) 33 | 34 | ``` 35 | numpy>=1.18.1 36 | pandas>=0.25.3 37 | palettable>=3.3.0 38 | matplotlib>=3.3.1 39 | ``` 40 | 41 | ## Versions 42 | 43 | 0.0.3 - No code is changed, description updated for public release 44 | 0.0.2 - Introduce compatibility for Python 3.6 45 | 0.0.1 - Initial release -------------------------------------------------------------------------------- /comut/__init__.py: -------------------------------------------------------------------------------- 1 | 2 | -------------------------------------------------------------------------------- /comut/comut.py: -------------------------------------------------------------------------------- 1 | from matplotlib import colors 2 | import matplotlib.pyplot as plt 3 | import matplotlib.patches as patches 4 | import pandas as pd 5 | import numpy as np 6 | import matplotlib.gridspec as gridspec 7 | import matplotlib.offsetbox as offsetbox 8 | import palettable 9 | from collections import defaultdict 10 | 11 | 12 | class CoMut: 13 | 14 | '''A user-created :class: `CoMut` object. 15 | 16 | Params: 17 | ----------- 18 | 19 | Attributes: 20 | ----------- 21 | samples: list 22 | List of samples that defines the sample order. It is set 23 | by the first data set added.
Samples from later data sets 24 | are checked and reordered against this attribute. 25 | 26 | axes: dict 27 | Container containing plotted axes objects after plot_comut() 28 | is called. Axes objects can be accessed and changed to change 29 | the CoMut. 30 | 31 | figure: matplotlib figure object 32 | Figure that the CoMut is plotted on 33 | 34 | _plots: dict 35 | Container for plot information, including data, visual 36 | params (ie color maps), plot type, and plot name. 37 | 38 | _side_plots: dict of dicts 39 | Container for side plot information. Values are side plot 40 | data, keys are the name of the central CoMut plot the side 41 | plot is paired with.''' 42 | 43 | def __init__(self): 44 | 45 | # user accessible attributes 46 | self.samples = None 47 | self.axes = {} 48 | self.figure = None 49 | 50 | # attributes for manipulation and storage 51 | self._plots = {} 52 | self._side_plots = defaultdict(dict) 53 | 54 | @classmethod 55 | def _get_default_categorical_cmap(cls, n_cats): 56 | '''Returns the default color map for n categories. 57 | If 10 or fewer, uses vivid_10 from palettable. If more than 10, 58 | uses a segmented rainbow colormap. 59 | 60 | Params: 61 | ------- 62 | n_cats: int 63 | The number of categories in the data. 64 | 65 | Returns: 66 | -------- 67 | cmap: list of colors''' 68 | 69 | if n_cats <= 10: 70 | cmap = palettable.cartocolors.qualitative.Vivid_10.mpl_colors 71 | else: 72 | hsv_cmap = plt.get_cmap('hsv') 73 | cmap = [hsv_cmap(i/n_cats) for i in range(n_cats)] 74 | 75 | return cmap 76 | 77 | @classmethod 78 | def _get_triangles(cls, x_base, y_base, tri_padding, height, width): 79 | '''Returns np arrays of triangle coordinates 80 | 81 | Params: 82 | ------- 83 | x_base, y_base: floats 84 | The x and y coordinates of the base of the triangle 85 | 86 | tri_padding: float 87 | The space between triangles 88 | 89 | height, width: float 90 | Height and width of the box enclosing the triangles. 
91 | 92 | Returns: 93 | -------- 94 | (tri_1_coords, tri_2_coords): tuple of np arrays 95 | Tuple of triangle coordinates as np arrays.''' 96 | 97 | tri_1_coords = [[x_base, y_base + tri_padding], 98 | [x_base, y_base + height], 99 | [x_base + width - tri_padding, y_base + height]] 100 | 101 | tri_2_coords = [[x_base + tri_padding, y_base], 102 | [x_base + width, y_base], 103 | [x_base + width, y_base + height - tri_padding]] 104 | 105 | return (np.array(tri_1_coords), np.array(tri_2_coords)) 106 | 107 | @classmethod 108 | def _sort_list_by_list(cls, value_list, value_order): 109 | '''Sort a value list by a specified value order. Values 110 | not in the order are sorted alphabetically and placed at the end. 111 | 112 | Params: 113 | ------- 114 | value_list: list-like 115 | values to sort, eg ['nonsense', 'amp'] 116 | 117 | value_order: list-like 118 | List of values that specify sort order. 119 | Values not in this list will be sorted alphabetically 120 | and placed at the end of the list. 121 | 122 | Returns: 123 | -------- 124 | sorted_values: list 125 | Values sorted by the value order specified and 126 | alphabetically otherwise.''' 127 | 128 | # extract subset of alts that are specified in value_order 129 | subset = [value for value in value_list if value in value_order] 130 | other = [value for value in value_list if value not in value_order] 131 | 132 | # sort subset according to value order list, otherwise alphabetical 133 | sorted_subset = sorted(subset, key=lambda x: value_order.index(x)) 134 | sorted_other = sorted(other) 135 | 136 | # join the two subsets 137 | sorted_values = sorted_subset + sorted_other 138 | return sorted_values 139 | 140 | @classmethod 141 | def _parse_categorical_data(cls, data, category_order, sample_order, 142 | value_order, priority): 143 | '''Parses tidy dataframe into a gene x sample dataframe 144 | of tuples for plotting 145 | 146 | Params: 147 | ------- 148 | data: pandas dataframe 149 | Dataframe from add_categorical_data or add_continuous_data 150 | 151 |
category_order: list-like 152 | category_order from add_categorical_data 153 | 154 | sample_order: list-like 155 | Order of samples, from left to right. 156 | 157 | value_order: list-like: 158 | value_order from add_categorical_data 159 | 160 | priority: list-like 161 | priority from add_categorical_data 162 | 163 | Returns: 164 | -------- 165 | parsed_data: pandas dataframe, shape (categories, samples) 166 | Dataframe of tuples depicting values for each sample in 167 | each category.''' 168 | 169 | # create parsed data storage 170 | parsed_data = pd.DataFrame(index=category_order, columns=sample_order) 171 | 172 | # subset data to categories and samples to avoid handling large dataframes 173 | data = data[(data['category'].isin(category_order)) & 174 | (data['sample'].isin(sample_order))] 175 | 176 | # fill in parsed dataframe 177 | for category in category_order: 178 | for sample in sample_order: 179 | sample_category_data = data[(data['category'] == category) & 180 | (data['sample'] == sample)] 181 | 182 | # if data is empty, the sample does not have a value in category 183 | if len(sample_category_data) == 0: 184 | parsed_data.loc[category, sample] = () 185 | 186 | # if length 1 just put the value 187 | elif len(sample_category_data) == 1: 188 | value = sample_category_data['value'].values[0] 189 | parsed_data.loc[category, sample] = (value,) 190 | 191 | # if length 2, sort by value order then convert to tuple 192 | elif len(sample_category_data) == 2: 193 | values = sample_category_data['value'].values 194 | sorted_values = cls._sort_list_by_list(values, value_order) 195 | parsed_data.loc[category, sample] = tuple(sorted_values) 196 | 197 | # if more than two, apply priority, sort, then convert to tuple. 
198 | else: 199 | values = sample_category_data['value'].values 200 | present_priorities = [v for v in values if v in priority] 201 | 202 | # just put 'Multiple' if no priorities or more than two 203 | if len(present_priorities) == 0 or len(present_priorities) > 2: 204 | parsed_data.loc[category, sample] = ('Multiple',) 205 | 206 | # always plot a priority if present 207 | elif len(present_priorities) == 1: 208 | df_entry = present_priorities + ['Multiple'] 209 | sorted_df_entry = cls._sort_list_by_list(df_entry, value_order) 210 | parsed_data.loc[category, sample] = tuple(sorted_df_entry) 211 | 212 | # plot two priorities if present, ignoring others 213 | elif len(present_priorities) == 2: 214 | df_entry = cls._sort_list_by_list(present_priorities, value_order) 215 | parsed_data.loc[category, sample] = tuple(df_entry) 216 | 217 | return parsed_data 218 | 219 | def _check_samples(self, samples): 220 | '''Checks that samples are a subset of samples 221 | currently associated with the CoMut object. 222 | 223 | Params: 224 | ------- 225 | samples: list-like 226 | A list of sample names.''' 227 | 228 | if not set(samples).issubset(set(self.samples)): 229 | extra = set(samples) - set(self.samples) 230 | raise ValueError('Unknown samples {} given. All added samples' 231 | ' must be a subset of either first samples' 232 | ' added or samples specified with' 233 | ' comut.samples'.format(extra)) 234 | 235 | def add_categorical_data(self, data, name=None, category_order=None, 236 | value_order=None, mapping=None, borders=None, 237 | priority=None, tick_style='normal'): 238 | '''Add categorical data to the CoMut object. 239 | 240 | Params: 241 | ------- 242 | data: pandas dataframe 243 | A tidy dataframe containing data. Required columns are 244 | sample, category, and value. Other columns are ignored. 
245 | 246 | Example: 247 | ------- 248 | sample | category | value 249 | ---------------------------- 250 | Sample_1 | TP53 | Missense 251 | Sample_1 | Gender | Male 252 | 253 | name: str 254 | The name of the dataset being added. Used to reference axes. Defaults to the integer index of the plot being added. 255 | 256 | Example: 257 | -------- 258 | example_comut = comut.CoMut() 259 | example_comut.add_categorical_data(data, name = 'Mutation type') 260 | 261 | category_order: list-like 262 | Order of categories to plot, from top to bottom. Only these 263 | categories are plotted. 264 | 265 | Example: 266 | -------- 267 | example_comut = comut.CoMut() 268 | example_comut.add_categorical_data(data, category_order = ['TP53', 'BRAF']) 269 | 270 | value_order: list-like 271 | Order of plotting of values in a single patch, from left 272 | triangle to right triangle. 273 | 274 | Example: 275 | -------- 276 | value_order = ['Amp', 'Missense'] 277 | 278 | If Amp and Missense exist in the same category and sample, Amp 279 | will be drawn as left triangle, Missense as right. 280 | 281 | mapping: dict 282 | Mapping of values to patch properties. The dict can either specify 283 | only the facecolor or other patch properties. 284 | 285 | Note: 286 | ----- 287 | Three additional values are required to fully specify mapping: 288 | 289 | 'Absent', which determines the color for samples without a value 290 | in that category (default white). 291 | 292 | 'Multiple', which determines the color for samples with more than 293 | two values in that category (default brown). 294 | 295 | 'Not Available', which determines the patch properties when a sample's 296 | value is 'Not Available'. 297 | 298 | borders: list-like 299 | List of values that should be plotted as borders, not patches. 300 | 301 | Example: 302 | -------- 303 | example_comut = comut.CoMut() 304 | example_comut.add_categorical_data(data, borders = ['LOH']) 305 | 306 | priority: list-like 307 | Ordered list of priorities for values.
The function will attempt 308 | to preserve values in this list, subverting the "Multiple" 309 | assignment. 310 | 311 | Example: 312 | -------- 313 | example_comut.add_categorical_data(data, priority = ['Amp']) 314 | 315 | If Amp exists alongside two other values, it will be drawn as 316 | Amp + Multiple (two triangles), instead of Multiple. 317 | 318 | tick_style: str, default='normal', 'italic', 'oblique' 319 | Tick style to be used for the y axis ticks (category names). 320 | 321 | Returns: 322 | -------- 323 | None''' 324 | 325 | # check that required columns exist 326 | req_cols = {'sample', 'category', 'value'} 327 | if not req_cols.issubset(data.columns): 328 | missing_cols = req_cols - set(data.columns) 329 | msg = ', '.join(list(missing_cols)) 330 | raise ValueError('Data missing required columns: {}'.format(msg)) 331 | 332 | # check that samples are a subset of current samples. 333 | samples = list(data['sample'].drop_duplicates()) 334 | if self.samples is None: 335 | self.samples = samples 336 | else: 337 | self._check_samples(samples) 338 | 339 | # set defaults 340 | if name is None: 341 | name = len(self._plots) 342 | 343 | if borders is None: 344 | borders = [] 345 | 346 | if priority is None: 347 | priority = [] 348 | 349 | if value_order is None: 350 | value_order = [] 351 | 352 | # default category order to all categories uniquely present in data 353 | # in the order they appear 354 | if category_order is None: 355 | category_order = list(data['category'].drop_duplicates()) 356 | 357 | # build default color map, uses vivid 358 | unique_values = set(data['value']) 359 | if mapping is None: 360 | mapping = {} 361 | 362 | # define default borders 363 | for value in borders: 364 | mapping[value] = {'facecolor': 'none', 'edgecolor': 'black', 'linewidth': 1} 365 | 366 | # assign colors to other unique values 367 | non_border = [val for val in unique_values if val not in borders] 368 | default_cmap = self._get_default_categorical_cmap(len(non_border)) 
369 | for i, value in enumerate(non_border):  # skip borders so their default border styling is kept 370 | mapping[value] = {'facecolor': default_cmap[i]} 371 | 372 | mapping['Absent'] = {'facecolor': 'white'} 373 | mapping['Multiple'] = {'facecolor': palettable.colorbrewer.qualitative.Set1_7.mpl_colors[6]} 374 | mapping['Not Available'] = {'facecolor': 'none', 'edgecolor': 'black', 'linewidth': 1} 375 | 376 | elif isinstance(mapping, dict): 377 | 378 | # copy the user mapping to avoid overwriting their mapping variable 379 | mapping = mapping.copy() 380 | 381 | # update user color map with reserved values if not present 382 | if 'Not Available' not in mapping: 383 | mapping['Not Available'] = {'facecolor': 'none', 'edgecolor': 'black', 'linewidth': 1} 384 | if 'Absent' not in mapping: 385 | mapping['Absent'] = {'facecolor': 'white'} 386 | if 'Multiple' not in mapping: 387 | mapping['Multiple'] = {'facecolor': palettable.colorbrewer.qualitative.Set1_7.mpl_colors[6]} 388 | 389 | # check that all alt types present in data are in mapping 390 | if not unique_values.issubset(mapping.keys()): 391 | missing_cats = unique_values - set(mapping.keys()) 392 | raise ValueError('Categories present in dataframe {}' 393 | ' are missing from mapping'.format(missing_cats)) 394 | 395 | # if passed values aren't kwargs, convert to patches kwargs 396 | for key, value in mapping.items(): 397 | if not isinstance(value, dict): 398 | if key in borders: 399 | mapping[key] = {'facecolor': 'none', 'edgecolor': value} 400 | else: 401 | mapping[key] = {'facecolor': value} 402 | 403 | # check that borders have facecolor 'none' 404 | for border in borders: 405 | if mapping[border]['facecolor'] != 'none': 406 | raise ValueError('Border category {} must have facecolor' 407 | ' = \'none\''.format(border)) 408 | 409 | else: 410 | raise ValueError('Invalid mapping.
Mapping must be a dict.') 411 | 412 | # parse data into dataframe of tuples as required for plotting 413 | parsed_data = self._parse_categorical_data(data, category_order, self.samples, 414 | value_order, priority) 415 | 416 | # store plot data 417 | plot_data = {'data': parsed_data, 'patches_options': mapping, 418 | 'tick_style': tick_style, 'borders': borders, 'type': 'categorical'} 419 | 420 | self._plots[name] = plot_data 421 | return None 422 | 423 | def add_continuous_data(self, data, mapping='binary', tick_style='normal', 424 | value_range=None, cat_mapping=None, name=None): 425 | '''Add sample-level continuous data to the CoMut object 426 | 427 | Params: 428 | ----------- 429 | data: pandas dataframe 430 | A tidy dataframe containing data. Required columns are 431 | sample, category, and value. Other columns are ignored. 432 | Currently, only one category is allowed. 433 | 434 | mapping: str, colors.LinearSegmentedColormap, default 'binary' 435 | A mapping of continuous value to color. Can be defined as a 436 | matplotlib colormap name (str) or a custom LinearSegmentedColormap. 437 | Samples with missing information are colored according to 'Absent'. 438 | 439 | value_range: tuple or list 440 | min and max value of the data. Data will be normalized using 441 | this range to fit (0, 1). Defaults to the range of the data. 442 | 443 | cat_mapping: dict 444 | Mapping from a discrete category to patch color. Primarily used 445 | to override defaults for 'Absent' and 'Not Available' but can 446 | be used to mix categorical and continuous values in the same data. 447 | 448 | name: str 449 | The name of the dataset being added. Used to reference axes. 450 | Defaults to the integer index of the plot being added. 451 | 452 | tick_style: str, default='normal', 'italic', 'oblique' 453 | Tick style to be used for the y axis ticks (category names).
454 | 455 | Returns: 456 | -------- 457 | None''' 458 | 459 | # check that required columns exist 460 | req_cols = {'sample', 'category', 'value'} 461 | if not req_cols.issubset(data.columns): 462 | missing_cols = req_cols - set(data.columns) 463 | msg = ', '.join(list(missing_cols)) 464 | raise ValueError('Data missing required columns: {}'.format(msg)) 465 | 466 | # check that samples are a subset of object samples. 467 | samples = list(data['sample'].drop_duplicates()) 468 | if self.samples is None: 469 | self.samples = samples 470 | else: 471 | self._check_samples(samples) 472 | 473 | # check that only one category is in the dataframe 474 | if len(set(data['category'])) > 1: 475 | raise ValueError('Only one category is allowed for continuous data') 476 | 477 | # make default name 478 | if name is None: 479 | name = len(self._plots) 480 | 481 | if value_range is None: 482 | data_max = pd.to_numeric(data['value'], errors='coerce').max() 483 | data_min = pd.to_numeric(data['value'], errors='coerce').min() 484 | else: 485 | data_min, data_max = value_range 486 | 487 | # make default categorical mapping 488 | if cat_mapping is None: 489 | cat_mapping = {'Absent': {'facecolor': 'white'}, 490 | 'Not Available': {'facecolor': 'none', 'edgecolor': 'black', 'linewidth': 1}} 491 | else: 492 | # update absent and not available 493 | cat_mapping = cat_mapping.copy() 494 | if 'Absent' not in cat_mapping: 495 | cat_mapping['Absent'] = {'facecolor': 'white'} 496 | if 'Not Available' not in cat_mapping: 497 | cat_mapping['Not Available'] = {'facecolor': 'none', 'edgecolor': 'black', 'linewidth': 1} 498 | 499 | # if values in cat_mapping aren't kwargs, convert to patches kwargs 500 | for key, value in cat_mapping.items(): 501 | if not isinstance(value, dict): 502 | cat_mapping[key] = {'facecolor': value} 503 | 504 | def normalize(x): 505 | if isinstance(x, (int, float)): 506 | return (x - data_min)/(data_max - data_min)  # scale by the range width so values fit (0, 1) 507 | else: 508 | return x 509 | 510 | # normalize data to range 511 | norm_data =
data.copy() 512 | norm_data.loc[:, 'value'] = data.loc[:, 'value'].apply(normalize) 513 | if isinstance(mapping, str): 514 | mapping = plt.get_cmap(mapping) 515 | 516 | elif not isinstance(mapping, colors.LinearSegmentedColormap): 517 | raise ValueError('Invalid color map for continuous data. Valid' 518 | ' types are colormap str or LinearSegmentedColormap') 519 | 520 | # build color map 521 | dict_mapping = {} 522 | for value in norm_data.loc[:, 'value']: 523 | if isinstance(value, (int, float)): 524 | dict_mapping[value] = {'facecolor': mapping(value)} 525 | 526 | # update continuous mapping with categorical mapping 527 | dict_mapping.update(cat_mapping) 528 | 529 | # data is now essentially categorical, so use that to parse data 530 | category_order = list(norm_data['category'].drop_duplicates()) 531 | parsed_data = self._parse_categorical_data(data=norm_data, category_order=category_order, 532 | sample_order=self.samples, value_order=[], priority=[]) 533 | 534 | # store plot data 535 | plot_data = {'data': parsed_data, 'patches_options': dict_mapping, 'tick_style': tick_style, 536 | 'type': 'continuous', 'range': value_range, 'colorbar': mapping} 537 | 538 | self._plots[name] = plot_data 539 | return None 540 | 541 | def add_bar_data(self, data, name=None, stacked=False, mapping=None, 542 | ylabel='', bar_kwargs=None): 543 | '''Add a bar plot to the CoMut object 544 | 545 | Params: 546 | ----------- 547 | data: pandas dataframe 548 | Dataframe containing data for samples. The first column must be 549 | sample, and other columns should be values for the bar plot. 550 | 551 | name: str 552 | The name of the dataset being added. Used to reference axes. 553 | Defaults to the integer index of the plot being added. 554 | 555 | stacked: bool, default=False 556 | Whether the bar graph should be stacked. 557 | 558 | mapping: dict 559 | A mapping of column to color. Dictionary should map column name 560 | to color (str) or to plot kwargs.
561 | 562 | ylabel: str, default '' 563 | The label for the y axis. 564 | 565 | bar_kwargs: dict 566 | dict of kwargs to be passed to plt.bar 567 | 568 | Returns: 569 | -------- 570 | None''' 571 | 572 | # check formatting 573 | if data.columns[0] != 'sample': 574 | raise ValueError('First column in dataframe must be sample') 575 | 576 | # make defaults 577 | if name is None: 578 | name = len(self._plots) 579 | 580 | if bar_kwargs is None: 581 | bar_kwargs = {} 582 | 583 | # convert sample to an index 584 | bar_df_indexed = data.set_index('sample', drop=True) 585 | 586 | # check that samples are a subset of object samples. 587 | samples = list(bar_df_indexed.index) 588 | if self.samples is None: 589 | self.samples = samples 590 | else: 591 | self._check_samples(samples) 592 | 593 | # add missing samples and assign 0 value for all columns 594 | missing_samples = list(set(self.samples) - set(samples)) 595 | bar_df_indexed = bar_df_indexed.reindex(self.samples) 596 | bar_df_indexed.loc[missing_samples, :] = 0 597 | 598 | # make default mapping 599 | if mapping is None: 600 | num_cats = len(bar_df_indexed.columns) 601 | default_cmap = self._get_default_categorical_cmap(num_cats) 602 | mapping = {column: default_cmap[i] 603 | for i, column in enumerate(bar_df_indexed.columns)} 604 | 605 | # store plot data 606 | plot_data = {'data': bar_df_indexed, 'bar_options': mapping, 'type': 'bar', 607 | 'stacked': stacked, 'ylabel': ylabel, 'bar_kwargs': bar_kwargs} 608 | 609 | self._plots[name] = plot_data 610 | return None 611 | 612 | def add_sample_indicators(self, data, name=None, 613 | plot_kwargs=None): 614 | '''Add a line plot that indicates samples that share a characteristic 615 | 616 | Params: 617 | ----------- 618 | data: pandas dataframe 619 | A tidy dataframe that assigns individual samples to groups. 620 | Required columns are 'sample' and 'group'. Other columns are 621 | ignored. 622 | 623 | name: str 624 | The name of the dataset being added. 
Used to reference axes. 625 | Defaults to the integer index of the plot being added. 626 | 627 | plot_kwargs: dict 628 | dict of kwargs to be passed to plt.plot during plotting. Defaults 629 | to {'color': 'black', 'marker': 'o', 'markersize': 3} 630 | 631 | Returns: 632 | -------- 633 | None''' 634 | 635 | # check for required columns 636 | req_cols = {'sample', 'group'} 637 | if not req_cols.issubset(data.columns): 638 | missing_cols = req_cols - set(data.columns) 639 | msg = ', '.join(list(missing_cols)) 640 | raise ValueError('Data missing required columns: {}'.format(msg)) 641 | 642 | # make defaults 643 | if name is None: 644 | name = len(self._plots) 645 | 646 | if plot_kwargs is None: 647 | plot_kwargs = {'color': 'black', 'marker': 'o', 'markersize': 3} 648 | 649 | # convert sample to an index 650 | data_indexed = data.set_index('sample', drop=True) 651 | 652 | # check that samples are a subset of current samples 653 | samples = list(data_indexed.index) 654 | if self.samples is None: 655 | self.samples = samples 656 | else: 657 | self._check_samples(samples) 658 | 659 | # add missing samples and assign them NaN. They will be skipped. 660 | missing_samples = list(set(self.samples) - set(samples)) 661 | 662 | # Reorders - by default uses new samples = NaN 663 | data_indexed = data_indexed.reindex(self.samples) 664 | 665 | # connected samples must be adjacent. Throw an error otherwise.
666 | seen_groups = set() 667 | prev_group = None 668 | for assignment in data_indexed['group']: 669 | if assignment in seen_groups and not pd.isnull(assignment):  # pd.isnull also accepts non-numeric group labels 670 | raise ValueError('Samples that share a group must be adjacent' 671 | ' in CoMut sample ordering.') 672 | elif assignment != prev_group: 673 | seen_groups.add(prev_group) 674 | prev_group = assignment 675 | 676 | plot_data = {'data': data_indexed, 'plot_options': plot_kwargs, 677 | 'type': 'indicator'} 678 | 679 | self._plots[name] = plot_data 680 | return None 681 | 682 | def _plot_patch_data(self, ax, data, name, mapping, borders, tick_style, 683 | x_padding=0, y_padding=0, tri_padding=0): 684 | '''Plot data represented as patches on CoMut plot 685 | 686 | Params: 687 | ----------- 688 | ax: axis object 689 | Axis object on which to draw the graph. 690 | 691 | data: pandas dataframe 692 | Parsed dataframe from _parse_categorical_data 693 | 694 | name: str 695 | Name of the plot to store in axes dictionary. 696 | 697 | mapping: dict 698 | mapping from add_categorical_data 699 | 700 | borders: list-like 701 | borders from add_categorical_data 702 | 703 | x_padding: float, default=0 704 | x_padding from plot_comut 705 | 706 | y_padding: float, default=0 707 | y_padding from plot_comut 708 | 709 | tri_padding: float, default=0 710 | tri_padding from plot_comut 711 | 712 | tick_style: str, default='normal', 'italic', 'oblique' 713 | Tick style to be used for the y axis ticks (category names).
714 | 715 | Returns: 716 | ------- 717 | ax: axis object 718 | Axis object on which the plot is drawn.''' 719 | 720 | # precalculate height and width of patches 721 | height, width = 1 - 2*y_padding, 1 - 2*x_padding 722 | 723 | # store unique labels 724 | unique_labels = set() 725 | 726 | # plot data from bottom to top, left to right of CoMut 727 | for i in range(len(data.index)): 728 | for j in range(len(data.columns)): 729 | 730 | # calculate loc of lower left corner of patch 731 | x_base, y_base = j + x_padding, i + y_padding 732 | 733 | cell_tuple = tuple(data.iloc[i, j]) 734 | 735 | # remove box borders from cell tuple 736 | box_borders = [value for value in cell_tuple 737 | if value in borders] 738 | 739 | # determine the values that are not borders 740 | cell_tuple = [value for value in cell_tuple 741 | if value not in borders] 742 | 743 | # determine number of values 744 | num_values = len(cell_tuple) 745 | 746 | # plot Not Available if present 747 | if 'Not Available' in cell_tuple: 748 | if len(cell_tuple) > 1: 749 | raise ValueError('Not Available must be a value by itself') 750 | 751 | # otherwise plot the Not Available patch. label = '' subverts legend 752 | patch_options = mapping['Not Available'] 753 | rect = patches.Rectangle((x_base, y_base), width, height, **patch_options, 754 | label='') 755 | ax.add_patch(rect) 756 | 757 | # prevent the border from exceeding the bounds of the patch 758 | rect.set_clip_path(rect) 759 | 760 | # plot the slashed line. This code is heuristic and does 761 | # not currently scale well. 
762 | ax.plot([x_base + x_padding/2, x_base + width - x_padding/2], [y_base + y_padding/2, y_base + height - y_padding/2], 763 | color=patch_options['edgecolor'], linewidth=0.5, 764 | solid_capstyle='round') 765 | 766 | # go to next patch 767 | continue 768 | 769 | # use rectangles to draw single boxes 770 | if num_values != 2: 771 | 772 | # handle Absent and Multiple 773 | if num_values == 0: 774 | label = 'Absent' 775 | patch_options = mapping['Absent'] 776 | elif num_values > 2: 777 | label = 'Multiple' 778 | patch_options = mapping['Multiple'] 779 | 780 | # extract color for single patch based on value 781 | elif num_values == 1: 782 | value = cell_tuple[0] 783 | label = value 784 | patch_options = mapping[value] 785 | 786 | # create rectangle and add to plot. Add label if it 787 | # doesn't already exist in legend 788 | plot_label = label if label not in unique_labels else None 789 | unique_labels.add(label) 790 | rect = patches.Rectangle((x_base, y_base), 791 | width, height, 792 | **patch_options, label=plot_label) 793 | ax.add_patch(rect) 794 | 795 | # if two alterations, build using two triangles 796 | else: 797 | alt_1, alt_2 = cell_tuple 798 | 799 | # determine if labels are unique and add if so 800 | alt_1_label = alt_1 if alt_1 not in unique_labels else None 801 | unique_labels.add(alt_1) 802 | alt_2_label = alt_2 if alt_2 not in unique_labels else None 803 | unique_labels.add(alt_2) 804 | 805 | # extract color options for triangles 806 | patch_options_1 = mapping[alt_1] 807 | patch_options_2 = mapping[alt_2] 808 | 809 | # build triangles with triangle padding 810 | tri_1, tri_2 = self._get_triangles(x_base, y_base, tri_padding, 811 | height, width) 812 | 813 | tri_1_patch = patches.Polygon(tri_1, label=alt_1_label, **patch_options_1) 814 | tri_2_patch = patches.Polygon(tri_2, label=alt_2_label, **patch_options_2) 815 | 816 | ax.add_patch(tri_1_patch) 817 | ax.add_patch(tri_2_patch) 818 | 819 | # Once boxes have been plotted, plot border 820 | for 
value in box_borders: 821 | border_options = mapping[value] 822 | rect = patches.Rectangle((x_base, y_base), 823 | width, height, 824 | **border_options, label=value) 825 | ax.add_patch(rect) 826 | rect.set_clip_path(rect) 827 | 828 | # x and y limits 829 | ax.set_ylim([0, len(data.index) + y_padding]) 830 | ax.set_xlim([0, len(data.columns) + x_padding]) 831 | 832 | # add ytick labels 833 | ax.set_yticks(np.arange(0.5, len(data.index) + 0.5)) 834 | ax.set_yticklabels(data.index, style=tick_style) 835 | 836 | # delete tick marks and make x axis invisible 837 | ax.get_xaxis().set_visible(False) 838 | ax.tick_params( 839 | axis='both', # changes apply to both axes 840 | which='both', # both major and minor ticks are affected 841 | bottom=False, # ticks along the bottom edge are off 842 | top=False, # ticks along the top edge are off 843 | length=0) # remove ticks 844 | 845 | # remove spines 846 | for loc in ['top', 'right', 'bottom', 'left']: 847 | ax.spines[loc].set_visible(False) 848 | 849 | self.axes[name] = ax 850 | return ax 851 | 852 | def _plot_bar_data(self, ax, data, name, mapping, stacked, ylabel, bar_kwargs): 853 | '''Plot bar plot on CoMut plot 854 | 855 | Params: 856 | ----------- 857 | ax: axis object 858 | axis object on which to draw the graph. 
859 | 860 | data: pandas Dataframe 861 | Dataframe from add_bar_data 862 | 863 | name: str 864 | name from add_bar_data 865 | 866 | mapping: dict 867 | mapping from add_bar_data 868 | 869 | stacked: bool 870 | stacked from add_bar_data 871 | 872 | ylabel: str 873 | ylabel from add_bar_data 874 | 875 | bar_kwargs: dict 876 | bar_kwargs from add_bar_data 877 | 878 | Returns: 879 | ------- 880 | ax: axis object 881 | Axis object on which the plot is drawn''' 882 | 883 | # define x range 884 | x_range = np.arange(0.5, len(data.index)) 885 | 886 | # if stacked, calculate cumulative height of bars 887 | if stacked: 888 | cum_bar_df = np.cumsum(data, axis=1) 889 | 890 | # for each bar, calculate bottom and top of bar and plot it 891 | for i in range(len(cum_bar_df.columns)): 892 | column = cum_bar_df.columns[i] 893 | color = mapping[column] 894 | if i == 0: 895 | bottom = None 896 | bar_data = cum_bar_df.loc[:, column] 897 | else: 898 | # calculate distance between previous and current column 899 | prev_column = cum_bar_df.columns[i-1] 900 | bar_data = cum_bar_df.loc[:, column] - cum_bar_df.loc[:, prev_column] 901 | 902 | # the previous column defines the bottom of the bars 903 | bottom = cum_bar_df.loc[:, prev_column] 904 | 905 | # plot bar data 906 | ax.bar(x_range, bar_data, align='center', color=color, 907 | bottom=bottom, label=column, **bar_kwargs) 908 | 909 | # plot unstacked bar. Label is '' to subvert legend. 
910 | else: 911 | color = mapping[data.columns[0]] 912 | ax.bar(x_range, data.iloc[:, 0], 913 | align='center', color=color, label='', **bar_kwargs) 914 | 915 | # make x axis invisible and despine all axes 916 | ax.get_xaxis().set_visible(False) 917 | for loc in ['top', 'right', 'bottom', 'left']: 918 | ax.spines[loc].set_visible(False) 919 | 920 | # set the ylabel 921 | ax.set_ylabel(ylabel) 922 | self.axes[name] = ax 923 | return ax 924 | 925 | def add_side_bar_data(self, data, paired_name, name=None, position='right', 926 | mapping=None, stacked=False, xlabel='', bar_kwargs=None): 927 | '''Add a side bar plot to the CoMut object 928 | 929 | Params: 930 | ----------- 931 | data: pandas dataframe 932 | Dataframe containing data for categories in paired plot. The first 933 | column must be category, and other columns should be values for the 934 | bar plot. 935 | 936 | paired_name: str or int 937 | Name of plot on which the bar plot will be placed. Must reference 938 | a dataset already added to the CoMut object. 939 | 940 | name: str 941 | The name of the dataset being added. Used to reference axes. 942 | Defaults to the integer index of the plot being added. 943 | 944 | position: str, 'left' or 'right', default 'right' 945 | Where the bar plot should be graphed (left or right of paired name 946 | plot). 947 | 948 | stacked: bool, default=False 949 | Whether the bar graph should be stacked. 950 | 951 | mapping: dict 952 | A mapping of column to color. Dictionary should map column name 953 | to color (str) or to plot kwargs. 954 | 955 | xlabel: str, default '' 956 | The label for the x axis 957 | 958 | bar_kwargs: dict 959 | kwargs to be passed to plt.barh during the process of plotting. 
960 | 961 | Returns: 962 | -------- 963 | None''' 964 | 965 | # check formatting 966 | if data.columns[0] != 'category': 967 | raise ValueError('First column in dataframe must be category') 968 | 969 | # make defaults 970 | if name is None: 971 | name = len(self._plots) 972 | 973 | if position not in ['left', 'right']: 974 | raise ValueError('Position must be left or right') 975 | 976 | if bar_kwargs is None: 977 | bar_kwargs = {} 978 | 979 | # side plots must be paired with a plot that exists 980 | if paired_name not in self._plots: 981 | raise KeyError('Plot {} does not exist. Side plots must be added ' 982 | 'to an already existing plot'.format(paired_name)) 983 | 984 | # currently, side plot must be paired with a categorical dataset 985 | paired_plot = self._plots[paired_name] 986 | if paired_plot['type'] != 'categorical': 987 | raise ValueError('Side plots can only be added to categorical data') 988 | 989 | # set index to categories 990 | data_indexed = data.set_index('category') 991 | 992 | # check that the categories match paired plot's categories 993 | side_cats = set(data_indexed.index) 994 | paired_cats = set(paired_plot['data'].index) 995 | if not side_cats.issubset(paired_cats): 996 | new_cats = side_cats - paired_cats 997 | raise ValueError('Categories {} do not exist in paired plot {}. 
' 998 | 'Categories in side bar plot must be a subset of' 999 | ' those in paired plot.'.format(new_cats, paired_name)) 1000 | 1001 | # add missing categories and assign them a value of 0 for all columns 1002 | missing_categories = paired_cats - side_cats 1003 | data_indexed = data_indexed.reindex(list(paired_plot['data'].index)) 1004 | data_indexed.loc[list(missing_categories), :] = 0 1005 | 1006 | # make default mapping 1007 | if mapping is None: 1008 | mapping = {column: palettable.cartocolors.qualitative.Vivid_10.mpl_colors[i] 1009 | for i, column in enumerate(data_indexed.columns)} 1010 | 1011 | # store the data 1012 | plot_data = {'data': data_indexed, 'mapping': mapping, 1013 | 'stacked': stacked, 'position': position, 'xlabel': xlabel, 1014 | 'bar_kwargs': bar_kwargs} 1015 | 1016 | self._side_plots[paired_name][name] = plot_data 1017 | return None 1018 | 1019 | def _plot_indicator_data(self, ax, data, name, plot_kwargs): 1020 | '''Plot data that connects samples with similar characteristics. 1021 | 1022 | Params: 1023 | ----------- 1024 | ax: axis object 1025 | axis object on which to draw the graph. 1026 | 1027 | data: pandas dataframe 1028 | data from add_sample_indicators 1029 | 1030 | name: str 1031 | name from add_sample_indicators 1032 | 1033 | plot_kwargs: dict 1034 | plot_kwargs from add_sample_indicators 1035 | 1036 | Returns: 1037 | -------- 1038 | ax: axis object 1039 | Axis object on which the plot is drawn.''' 1040 | 1041 | # loop through group assignments 1042 | for i, group in enumerate(set(data['group'])): 1043 | 1044 | # ignore missing samples 1045 | if np.isnan(group): 1046 | continue 1047 | 1048 | # plot the first with a label so legend can extract it later 1049 | label = name if i == 0 else None 1050 | 1051 | # extract x coordinates of group members. 
1052 | x_vals = np.where(data['group'] == group)[0] 1053 | 1054 | # plot line plot connecting samples in group 1055 | ax.plot(x_vals + 0.5, [0.5]*len(x_vals), label=label, **plot_kwargs) 1056 | 1057 | # make axes invisible and despine 1058 | for loc in ['top', 'right', 'bottom', 'left']: 1059 | ax.spines[loc].set_visible(False) 1060 | 1061 | # remove axes 1062 | ax.get_xaxis().set_visible(False) 1063 | ax.get_yaxis().set_visible(False) 1064 | 1065 | self.axes[name] = ax 1066 | return ax 1067 | 1068 | def _plot_side_bar_data(self, ax, name, data, mapping, position, stacked, 1069 | xlabel, y_padding, bar_kwargs): 1070 | '''Plot side bar plot on CoMut plot 1071 | 1072 | Params: 1073 | ----------- 1074 | ax: axis object 1075 | axis object on which to draw the graph. 1076 | 1077 | data: pandas Dataframe 1078 | data from add_side_bar_data 1079 | 1080 | name: str 1081 | name from add_side_bar_data 1082 | 1083 | mapping: dict 1084 | mapping from add_side_bar_data 1085 | 1086 | position: str, left or right 1087 | position from add_side_bar_data 1088 | 1089 | stacked: bool 1090 | stacked from add_side_bar_data 1091 | 1092 | xlabel: str 1093 | xlabel from add_side_bar_data 1094 | 1095 | y_padding: float 1096 | y_padding from plot_comut 1097 | 1098 | bar_kwargs: dict 1099 | bar_kwargs from add_side_bar_data 1100 | 1101 | Returns: 1102 | ------- 1103 | ax: axis object 1104 | The axis object on which the plot is drawn.''' 1105 | 1106 | # define y range, since the plot is rotated 1107 | y_range = np.arange(0.5, len(data.index)) 1108 | 1109 | # if height not in bar_kwargs, set it 1110 | if 'height' not in bar_kwargs: 1111 | bar_kwargs['height'] = 1 - 2*y_padding 1112 | 1113 | # if stacked, calculate cumulative height of bars 1114 | if stacked: 1115 | cum_bar_df = np.cumsum(data, axis=1) 1116 | 1117 | # for each bar, calculate bottom and top of bar and plot 1118 | for i in range(len(cum_bar_df.columns)): 1119 | column = cum_bar_df.columns[i] 1120 | color = mapping[column] 1121 | 
if i == 0: 1122 | left = None 1123 | bar_data = cum_bar_df.loc[:, column] 1124 | else: 1125 | # calculate distance between previous and current column 1126 | prev_column = cum_bar_df.columns[i-1] 1127 | bar_data = cum_bar_df.loc[:, column] - cum_bar_df.loc[:, prev_column] 1128 | 1129 | # previous column defines the "bottom" of the bars 1130 | left = cum_bar_df.loc[:, prev_column] 1131 | 1132 | ax.barh(y_range, bar_data, align='center', color=color, 1133 | left=left, label=column, **bar_kwargs) 1134 | 1135 | # plot unstacked bar 1136 | else: 1137 | color = mapping[data.columns[0]] 1138 | ax.barh(y_range, data.iloc[:, 0], 1139 | align='center', color=color, **bar_kwargs) 1140 | 1141 | # reverse x axis if position is to the left 1142 | if position == 'left': 1143 | xlim = ax.get_xlim() 1144 | ax.set_xlim(xlim[::-1]) 1145 | 1146 | # turn off the y axis by default 1147 | ax.set_yticklabels([]) 1148 | ax.tick_params(axis='y', which='both', length=0) 1149 | 1150 | for loc in ['top', 'right', 'left']: 1151 | ax.spines[loc].set_visible(False) 1152 | 1153 | # set xlabel 1154 | ax.set_xlabel(xlabel) 1155 | 1156 | self.axes[name] = ax 1157 | return ax 1158 | 1159 | def _plot_data_on_axis(self, ax, plot_name, x_padding, y_padding, tri_padding): 1160 | '''Wrapper function for plotting data on an axis 1161 | 1162 | Params: 1163 | ------- 1164 | ax: axis object 1165 | Axis object on which to plot 1166 | 1167 | plot_name: 1168 | Name of plot, used to index into plot dictionary associated 1169 | with CoMut object. 
1170 | 1171 | x_padding, y_padding: float 1172 | Padding within patches for categorical data 1173 | 1174 | tri_padding: float 1175 | Padding between triangles for categorical data 1176 | 1177 | Returns: 1178 | -------- 1179 | ax: The axis object with the data now plotted''' 1180 | 1181 | # extract the plot type and data 1182 | plot_type = self._plots[plot_name]['type'] 1183 | data = self._plots[plot_name]['data'] 1184 | 1185 | # extract relevant plotting params depending on plot type, then plot 1186 | if plot_type == 'categorical' or plot_type == 'continuous': 1187 | mapping = self._plots[plot_name]['patches_options'] 1188 | borders = self._plots[plot_name]['borders'] if plot_type == 'categorical' else [] 1189 | tick_style = self._plots[plot_name]['tick_style'] 1190 | ax = self._plot_patch_data(ax=ax, data=data, name=plot_name, mapping=mapping, borders=borders, 1191 | x_padding=x_padding, y_padding=y_padding, tri_padding=tri_padding, 1192 | tick_style=tick_style) 1193 | 1194 | elif plot_type == 'bar': 1195 | mapping = self._plots[plot_name]['bar_options'] 1196 | stacked = self._plots[plot_name]['stacked'] 1197 | ylabel = self._plots[plot_name]['ylabel'] 1198 | bar_kwargs = self._plots[plot_name]['bar_kwargs'] 1199 | 1200 | # set the default width based on padding if not specified for bars 1201 | if 'width' not in bar_kwargs: 1202 | bar_kwargs['width'] = 1 - 2*x_padding 1203 | ax = self._plot_bar_data(ax=ax, data=data, name=plot_name, mapping=mapping, 1204 | stacked=stacked, ylabel=ylabel, bar_kwargs=bar_kwargs) 1205 | 1206 | elif plot_type == 'indicator': 1207 | plot_kwargs = self._plots[plot_name]['plot_options'] 1208 | ax = self._plot_indicator_data(ax=ax, data=data, name=plot_name, plot_kwargs=plot_kwargs) 1209 | 1210 | return ax 1211 | 1212 | def _get_default_widths_and_comut_loc(self): 1213 | '''Gets default widths from plots present in side_plots, 1214 | as well as the index location of the CoMut in widths. 
1215 | 1216 | Returns: 1217 | -------- 1218 | default_widths: list 1219 | Default widths (5 to central CoMut, 1 to side plots). 1220 | 1221 | comut_idx: int 1222 | Integer index of the CoMut plot in the width list 1223 | ''' 1224 | 1225 | # determine the maximum number of right and left plots 1226 | max_left, max_right = 0, 0 1227 | for side_plots in self._side_plots.values(): 1228 | positions = [side_plot['position'] for side_plot in side_plots.values()] 1229 | 1230 | if positions.count('left') > max_left: 1231 | max_left = positions.count('left') 1232 | if positions.count('right') > max_right: 1233 | max_right = positions.count('right') 1234 | 1235 | # CoMut gets rel width of 5, other plots get 1 1236 | default_widths = [1]*max_left + [5] + [1]*max_right 1237 | 1238 | # CoMut is located in between left and right plots 1239 | comut_idx = max_left 1240 | 1241 | return default_widths, comut_idx 1242 | 1243 | def _get_default_height(self, name, plot_type): 1244 | '''Returns default height for a plot 1245 | 1246 | Params: 1247 | ------- 1248 | name: str or int 1249 | Name of the plot. 1250 | 1251 | plot_type: str 1252 | Type of plot, used to set default height. 1253 | 1254 | Returns: 1255 | -------- 1256 | height: float 1257 | Default height for plot type''' 1258 | 1259 | if plot_type == 'categorical': 1260 | data = self._plots[name]['data'] 1261 | height = len(data) 1262 | 1263 | elif plot_type == 'continuous': 1264 | height = 1 1265 | 1266 | elif plot_type == 'bar': 1267 | height = 3 1268 | 1269 | elif plot_type == 'indicator': 1270 | height = 1 1271 | 1272 | else: 1273 | raise ValueError('Invalid plot type {}'.format(plot_type)) 1274 | 1275 | return height 1276 | 1277 | def _get_height_spec(self, structure, heights): 1278 | '''Gets the default heights for plots in the CoMut. 
1279 | 1280 | Height values for each plot type default to the following: 1281 | Number of categories for categorical data 1282 | 1 for continuous data 1283 | 3 for bar plots 1284 | 1 for sample indicator 1285 | 1286 | Params: 1287 | ------- 1288 | structure: list of lists 1289 | The structure of the CoMut plot, given as a list of lists 1290 | containing plot names. 1291 | 1292 | heights: dict 1293 | Dictionary specifying the relative height of certain plots. 1294 | Keys must be plot names, and values must be relative heights. 1295 | 1296 | Returns: 1297 | -------- 1298 | structure_heights: list of lists 1299 | Relative heights in the same form as structure 1300 | 1301 | Example: 1302 | -------- 1303 | heights = {'plot1': 3, 'plot2': 5, 'plot3': 7} 1304 | 1305 | # plot4 is a bar plot 1306 | structure = [['plot1', 'plot2'], ['plot3'], ['plot4']] 1307 | 1308 | # returns [[3, 5], [7], [3]] 1309 | self._get_height_spec(structure, heights)''' 1310 | 1311 | structure_heights = [] 1312 | for plot in structure: 1313 | # if one plot in subplot, make it appropriate size 1314 | if len(plot) == 1: 1315 | name = plot[0] 1316 | 1317 | # get default height if no user height specified 1318 | plot_type = self._plots[name]['type'] 1319 | if name in heights: 1320 | plot_height = heights[name] 1321 | else: 1322 | plot_height = self._get_default_height(name, plot_type) 1323 | 1324 | structure_heights.append([plot_height]) 1325 | 1326 | # if more than one, calculate height of all plots in subplot 1327 | else: 1328 | subplot_heights = [] 1329 | for name in plot: 1330 | plot_type = self._plots[name]['type'] 1331 | 1332 | # get default height if no user height specified 1333 | if name in heights: 1334 | plot_height = heights[name] 1335 | else: 1336 | plot_height = self._get_default_height(name, plot_type) 1337 | 1338 | subplot_heights.append(plot_height) 1339 | structure_heights.append(subplot_heights) 1340 | 1341 | # structure heights should match the shape of structure 1342 | return 
structure_heights 1343 | 1344 | def plot_comut(self, fig=None, spec=None, x_padding=0, y_padding=0, 1345 | tri_padding=0, heights=None, hspace=0.2, subplot_hspace=None, 1346 | widths=None, wspace=0.2, structure=None, figsize=(10,6)): 1347 | '''Plot the CoMut object 1348 | 1349 | Params: 1350 | ----------- 1351 | fig: `~.figure.Figure` 1352 | The figure on which to create the CoMut plot. If no fig 1353 | is passed, it will be created. 1354 | 1355 | spec: gridspec 1356 | The gridspec on which to create the CoMut plot. If no spec 1357 | is passed, one will be created. 1358 | 1359 | x_padding, y_padding: float, optional (default 0) 1360 | The space between patches in the CoMut plot in the x and y 1361 | direction. 1362 | 1363 | tri_padding: float 1364 | If there are two values for a sample in a category, the spacing 1365 | between the triangles that represent each value. 1366 | 1367 | heights: dict 1368 | The relative heights of all the plots. Dict should have keys as 1369 | plot names and values as relative height. 1370 | 1371 | Height values for each plot type default to the following: 1372 | Number of categories for categorical data 1373 | 1 for continuous data 1374 | 3 for bar plots 1375 | 1 for sample indicator 1376 | 1377 | Example: 1378 | -------- 1379 | heights = {'plot1': 3, 'plot2': 5, 'plot3': 7} 1380 | CoMut.plot_comut(heights=heights) 1381 | 1382 | hspace: float, default 0.2 1383 | The distance between different plots in the CoMut plot. 1384 | 1385 | widths: list-like 1386 | The relative widths of plots in the x direction. Valid only 1387 | if side bar plots are added. Defaults to 5 for the central CoMut 1388 | and 1 for each side plot. 1389 | 1390 | Example: 1391 | -------- 1392 | widths = [0.5, 5] 1393 | CoMut.plot_comut(widths=widths) 1394 | 1395 | wspace: float, default 0.2 1396 | The distance between different plots in the x-direction 1397 | (i.e. side bar plots) 1398 | 1399 | structure: list-like 1400 | List containing desired CoMut structure. 
Must be provided 1401 | as list of lists (see example). Default structure is to place 1402 | each plot in its own list. 1403 | 1404 | Example: 1405 | -------- 1406 | # plot plot1 and plot2 in a separate subplot from plot4, don't plot 1407 | # plot3. 1408 | structure = [['plot1', 'plot2'], ['plot4']] 1409 | CoMut.plot_comut(structure=structure) 1410 | 1411 | subplot_hspace: float 1412 | The distance between plots in a subplot. The scales for 1413 | subplot_hspace and hspace are not the same. 1414 | 1415 | figsize: (float, float), optional, default: (10,6) 1416 | width, height of CoMut figure in inches. Only valid if fig argument 1417 | is None. 1418 | 1419 | Returns: 1420 | ------- 1421 | self: CoMut object 1422 | CoMut object with updated axes and figure attributes. 1423 | 1424 | Example: 1425 | -------- 1426 | # create CoMut object 1427 | ex_comut = comut.CoMut() 1428 | 1429 | # add mutation data 1430 | ex_comut.add_categorical_data(mutation_data, name='mutation') 1431 | 1432 | # add clinical data 1433 | ex_comut.add_categorical_data(tumor_stage, name='tumor_stage') 1434 | ex_comut.add_continuous_data(purity_data, name='purity') 1435 | 1436 | # plot CoMut data 1437 | ex_comut.plot_comut() 1438 | 1439 | # ex_comut.axes will be a dictionary with keys 'mutation', 'tumor_stage', 1440 | # and 'purity', with values equal to the plotted axes.''' 1441 | 1442 | # default structure is each plot to its own subplot 1443 | if structure is None: 1444 | structure = [[plot] for plot in self._plots] 1445 | 1446 | if heights is None: 1447 | heights = {} 1448 | 1449 | # define number of subplots 1450 | num_subplots = len(structure) 1451 | 1452 | # get height structure based on input heights 1453 | heights = self._get_height_spec(structure, heights) 1454 | 1455 | # calculate height of plots for gridspeccing. 
Heights are 1456 | # reversed to match CoMut plotting (bottom to top) 1457 | plot_heights = [sum(height) for height in heights][::-1] 1458 | 1459 | # create figure if none given 1460 | if fig is None: 1461 | fig = plt.figure(figsize=figsize) 1462 | 1463 | # make default widths and determine location of CoMut. If widths are given, 1464 | # just calculate location of CoMut 1465 | if widths is None: 1466 | widths, comut_idx = self._get_default_widths_and_comut_loc() 1467 | else: 1468 | _, comut_idx = self._get_default_widths_and_comut_loc() 1469 | 1470 | # number of cols is equal to size of widths 1471 | num_cols = len(widths) 1472 | 1473 | # create gridspec if none given 1474 | if spec is None: 1475 | spec = gridspec.GridSpec(ncols=num_cols, nrows=num_subplots, figure=fig, 1476 | height_ratios=plot_heights, width_ratios=widths, 1477 | hspace=hspace, wspace=wspace) 1478 | 1479 | # otherwise, create gridspec in spec 1480 | else: 1481 | spec = gridspec.GridSpecFromSubplotSpec(ncols=num_cols, nrows=num_subplots, 1482 | height_ratios=plot_heights, width_ratios=widths, 1483 | hspace=hspace, wspace=wspace, subplot_spec=spec) 1484 | 1485 | # plot each plot in subplots 1486 | for i, (plot, height) in enumerate(zip(structure, heights)): 1487 | # subplots share an x axis with first plot 1488 | if i == 0: 1489 | sharex = None 1490 | first_plot = plot[0] 1491 | else: 1492 | sharex = self.axes[first_plot] 1493 | 1494 | # if only one plot in subplot, just add subplot and plot 1495 | if len(plot) == 1: 1496 | plot_name = plot[0] 1497 | ax = fig.add_subplot(spec[num_subplots - i - 1, comut_idx], sharex=sharex) 1498 | ax = self._plot_data_on_axis(ax=ax, plot_name=plot_name, x_padding=x_padding, y_padding=y_padding, tri_padding=tri_padding) 1499 | 1500 | # extract all sideplots on this axis 1501 | side_plots = self._side_plots[plot_name] 1502 | 1503 | # identify the locations of each sideplot and plot from inward -> outward 1504 | left_idx, right_idx = 1, 1 1505 | for side_name, side_plot 
in side_plots.items(): 1506 | position = side_plot['position'] 1507 | if position == 'left': 1508 | sideplot_idx = comut_idx - left_idx 1509 | left_idx += 1 1510 | elif position == 'right': 1511 | sideplot_idx = comut_idx + right_idx 1512 | right_idx += 1 1513 | 1514 | # sideplots are paired with central CoMut plot 1515 | side_ax = fig.add_subplot(spec[num_subplots - i - 1, sideplot_idx]) 1516 | side_ax = self._plot_side_bar_data(side_ax, side_name, y_padding=y_padding, 1517 | **side_plot) 1518 | 1519 | # force side axis to match paired axis. Avoiding sharey in case 1520 | # side bar needs different ticklabels 1521 | side_ax.set_ylim(ax.get_ylim()) 1522 | 1523 | # if multiple plots in subplot, subplot gridspec required 1524 | else: 1525 | num_plots = len(plot) 1526 | 1527 | # reverse the heights to be bottom up 1528 | height = height[::-1] 1529 | subplot_spec = gridspec.GridSpecFromSubplotSpec(ncols=1, nrows=num_plots, 1530 | hspace=subplot_hspace, subplot_spec=spec[num_subplots - i - 1, comut_idx], 1531 | height_ratios=height) 1532 | # plot all data in subplots 1533 | for j, plot_name in enumerate(plot): 1534 | 1535 | # plot the data on a subplot within that subgridspec 1536 | ax = fig.add_subplot(subplot_spec[num_plots - j - 1, 0], sharex=sharex) 1537 | ax = self._plot_data_on_axis(ax=ax, plot_name=plot_name, x_padding=x_padding, y_padding=y_padding, tri_padding=tri_padding) 1538 | 1539 | # side bar plots are not allowed for plots within a subplot 1540 | if self._side_plots[plot_name]: 1541 | raise ValueError('Side bar plot for {} cannot be created. 
' 1542 | 'Plots within a subplot cannot have a side plot.'.format(plot_name)) 1543 | 1544 | # add x axis labels to the bottom-most axis, make it visible 1545 | self.axes[first_plot].set_xticks(np.arange(0.5, len(self.samples) + 0.5)) 1546 | self.axes[first_plot].set_xticklabels(self.samples, rotation=90) 1547 | self.axes[first_plot].get_xaxis().set_visible(True) 1548 | self.axes[first_plot].tick_params(axis='x', which='both', bottom=False, length=0) 1549 | 1550 | self.figure = fig 1551 | return self 1552 | 1553 | def add_axis_legend(self, name, border_white=None, rename=None, order=None, 1554 | ignored_values=None, title_align='left', bbox_to_anchor=(1, 1), 1555 | frameon=False, **legend_kwargs): 1556 | '''Add a legend to a named axis of the CoMut plot 1557 | 1558 | Params: 1559 | ------- 1560 | name: str or int 1561 | Name of axis on which to create the legend. Names are created 1562 | when data is added. 1563 | 1564 | border_white: list-like 1565 | List of categories to replace with a white box bordered in black. 1566 | 1567 | rename: dict 1568 | A dictionary for renaming categories. The key should be the 1569 | original category name, with value as the new category name. 1570 | 1571 | order: list-like 1572 | Order of values in the legend. By default, values are 1573 | sorted alphabetically. 1574 | 1575 | ignored_values: list-like 1576 | List of values ignored by the legend. Defaults to 1577 | ['Not Available', 'Absent']. 1578 | 1579 | title_align: str, one of 'left', 'center', or 'right', default 'left' 1580 | The alignment of the legend title in the legend. If no title is 1581 | specified, nothing happens. 1582 | 1583 | bbox_to_anchor: BboxBase, 2-tuple, or 4-tuple of floats, default (1, 1) 1584 | The location of the legend relative to the axis. 
1585 | 1586 | frameon: bool, default False 1587 | Whether a frame should be drawn around the legend 1588 | 1589 | legend_kwargs: kwargs 1590 | kwargs to pass to plt.legend() 1591 | 1592 | Returns: 1593 | -------- 1594 | legend: matplotlib legend object 1595 | Legend object created for the input named axis. Can be altered with 1596 | other matplotlib functions that work on legends.''' 1597 | 1598 | # define defaults 1599 | if border_white is None: 1600 | border_white = [] 1601 | 1602 | if rename is None: 1603 | rename = {} 1604 | 1605 | if order is None: 1606 | order = [] 1607 | 1608 | if ignored_values is None: 1609 | ignored_values = ['Not Available', 'Absent'] 1610 | 1611 | # define the axis 1612 | axis = self.axes[name] 1613 | 1614 | plot_type = self._plots[name]['type'] 1615 | if plot_type == 'continuous': 1616 | raise ValueError('add_axis_legend is not valid for continuous data.') 1617 | 1618 | # extract current handles and labels on axis 1619 | handles, labels = self.axes[name].get_legend_handles_labels() 1620 | handle_lookup = dict(zip(labels, handles)) 1621 | 1622 | # replace border_white values with a white patch and black border 1623 | for patch_name in border_white: 1624 | handle_lookup[patch_name] = patches.Patch(facecolor='white', edgecolor='black', 1625 | label=patch_name) 1626 | 1627 | # rename categories and delete old ones 1628 | for old_name, new_name in rename.items(): 1629 | handle_lookup[new_name] = handle_lookup[old_name] 1630 | del handle_lookup[old_name] 1631 | 1632 | # ignore certain values 1633 | for value in ignored_values: 1634 | if value in handle_lookup: 1635 | del handle_lookup[value] 1636 | 1637 | # sort labels by order, reorder handles to match 1638 | sorted_labels = self._sort_list_by_list(handle_lookup.keys(), order) 1639 | sorted_handles = [handle_lookup[label] for label in sorted_labels] 1640 | 1641 | # create legend 1642 | legend = axis.legend(sorted_handles, sorted_labels, bbox_to_anchor=bbox_to_anchor, 1643 | 
frameon=frameon, **legend_kwargs) 1644 | 1645 | # align legend title 1646 | legend._legend_box.align = title_align 1647 | return legend 1648 | 1649 | def add_unified_legend(self, axis_name=None, border_white=None, headers=True, 1650 | rename=None, bbox_to_anchor=(1, 1), ignored_values=None, 1651 | frameon=False, **legend_kwargs): 1652 | '''Add a unified legend to the CoMut plot 1653 | 1654 | This combines all the various legends into a one column master legend. 1655 | By default, the legend is placed at the top axis. 1656 | 1657 | Params: 1658 | ------- 1659 | axis_name: str or int 1660 | Name of axis on which to create the legend. Names are created 1661 | when data is added. 1662 | 1663 | border_white: list-like 1664 | List of categories to border for legend entry. This will replace whatever 1665 | patch currently exists with a white box bordered in black. 1666 | 1667 | headers: bool, default True 1668 | Whether the legend should include subtitles. Subtitles are left 1669 | aligned. 1670 | 1671 | rename: dict 1672 | A dictionary for renaming categories. The key should be the 1673 | original category name, with value as the new category name. 1674 | Renaming occurs after adding border white patches. 1675 | 1676 | bbox_to_anchor: BboxBase, 2-tuple, or 4-tuple of floats, default (1,1) 1677 | The location of the legend relative to the axis. 1678 | 1679 | ignored_values: list-like 1680 | List of ignored values. These categories will be ignored by 1681 | the legend. Defaults to ['Absent', 'Not Available']. 1682 | 1683 | frameon: bool, default False 1684 | Whether a frame should be drawn around the legend 1685 | 1686 | legend_kwargs: kwargs 1687 | Other kwargs to pass to ax.legend(). 
1688 | 1689 | Returns: 1690 | -------- 1691 | legend: matplotlib legend object 1692 | Legend object created for the input named axis.''' 1693 | 1694 | if border_white is None: 1695 | border_white = [] 1696 | 1697 | if rename is None: 1698 | rename = {} 1699 | 1700 | if ignored_values is None: 1701 | ignored_values = ['Absent', 'Not Available'] 1702 | 1703 | # store labels and patches 1704 | legend_labels = [] 1705 | legend_patches = [] 1706 | 1707 | # loop through plots in reverse order (since plots are bottom up) 1708 | plot_names = list(self._plots.keys())[::-1] 1709 | plot_data_list = list(self._plots.values())[::-1] 1710 | 1711 | # extract the legend information for each plot and add to storage 1712 | for name, plot_data in zip(plot_names, plot_data_list): 1713 | axis = self.axes[name] 1714 | plot_type = plot_data['type'] 1715 | 1716 | if plot_type in ['categorical', 'bar', 'indicator']: 1717 | # nonstacked bar charts don't need legend labels 1718 | if plot_type == 'bar' and not plot_data['stacked']: 1719 | continue 1720 | 1721 | handles, labels = axis.get_legend_handles_labels() 1722 | 1723 | # create label-patch dict 1724 | handle_lookup = dict(zip(labels, handles)) 1725 | 1726 | # delete ignored categories 1727 | for value in ignored_values: 1728 | if value in handle_lookup: 1729 | del handle_lookup[value] 1730 | 1731 | # border the white patches 1732 | for patch_name in border_white: 1733 | if patch_name in handle_lookup: 1734 | handle_lookup[patch_name] = patches.Patch(facecolor='white', edgecolor='black', 1735 | label=patch_name) 1736 | 1737 | # add legend subheader for nonindicator data 1738 | if plot_type != 'indicator' and headers: 1739 | legend_labels.append(name) 1740 | legend_patches.append(patches.Patch(color='none', alpha=0)) 1741 | 1742 | # add plot labels and legends 1743 | legend_labels += list(handle_lookup.keys()) 1744 | legend_patches += list(handle_lookup.values()) 1745 | 1746 | # add a spacer patch 1747 | legend_labels.append('') 1748 | 
legend_patches.append(patches.Patch(color='white', alpha=0)) 1749 | 1750 | # rename labels 1751 | legend_labels = [rename.get(label, label) for label in legend_labels] 1752 | 1753 | # add to the top axis if no axis is given 1754 | if axis_name is None: 1755 | axis = self.axes[list(self.axes.keys())[-1]] 1756 | else: 1757 | axis = self.axes[axis_name] 1758 | 1759 | # add legend to axis 1760 | leg = axis.legend(labels=legend_labels, handles=legend_patches, 1761 | bbox_to_anchor=bbox_to_anchor, frameon=frameon, **legend_kwargs) 1762 | 1763 | # more involved code to align the headers 1764 | if headers: 1765 | vpackers = leg.findobj(offsetbox.VPacker) 1766 | for vpack in vpackers[:-1]: # Last vpack will be the title box 1767 | vpack.align = 'left' 1768 | for hpack in vpack.get_children(): 1769 | draw_area, text_area = hpack.get_children() 1770 | for collection in draw_area.get_children(): 1771 | alpha = collection.get_alpha() 1772 | 1773 | # if the patch has alpha = 0, set it to invisible, 1774 | # which will shift over its label to align. 1775 | if alpha == 0: 1776 | draw_area.set_visible(False) 1777 | 1778 | return leg 1779 | -------------------------------------------------------------------------------- /comut/fileparsers.py: -------------------------------------------------------------------------------- 1 | def parse_maf(maf_df, variants=None, rename=None): 2 | '''Subsets a MAF to nonsilent mutations, then subsets to 3 | required columns and renames. Returns a parsed dataframe. 4 | 5 | Params: 6 | ------- 7 | maf_df: pandas dataframe 8 | The dataframe of a MAF after pd.read_csv() 9 | 10 | variants: list-like 11 | A list of variants to subset the MAF to. Defaults to 12 | nonsilent variants. 13 | 14 | rename: dict 15 | A dictionary to rename values in the MAF. Defaults to 16 | collapsing insertions and deletions and renaming 17 | nonsilent mutations. 18 | 19 | Returns: 20 | -------- 21 | parsed_maf: pandas dataframe 22 | A parsed maf for CoMut. 
It subsets to sample name, gene 23 | name, and variant classification and renames the columns. 24 | ''' 25 | 26 | # default to nonsilent variants 27 | if variants is None: 28 | variants = ['Nonsense_Mutation', 'In_Frame_Del', 'Frame_Shift_Ins', 'Splice_Site', 'In_Frame_Ins', 'Frame_Shift_Del', 'Missense_Mutation'] 29 | 30 | # default rename is to collapse ins and del to indel 31 | if rename is None: 32 | rename = {'Nonsense_Mutation': 'Nonsense', 'In_Frame_Del': 'In frame indel', 'In_Frame_Ins': 'In frame indel', 33 | 'Frame_Shift_Del': 'Frameshift indel', 'Missense_Mutation': 'Missense', 'Splice_Site': 'Splice site', 'Frame_Shift_Ins': 'Frameshift indel'} 34 | 35 | # subset to required columns 36 | subset_maf_df = maf_df[['Tumor_Sample_Barcode', 'Hugo_Symbol', 'Variant_Classification']] 37 | 38 | # rename columns 39 | subset_maf_df.columns = ['sample', 'category', 'value'] 40 | 41 | # subset maf to relevant mutation types 42 | parsed_maf = subset_maf_df[subset_maf_df['value'].isin(variants)].copy() 43 | 44 | # rename variants 45 | parsed_maf.loc[:, 'value'] = parsed_maf.loc[:, 'value'].replace(rename).copy() 46 | 47 | return parsed_maf 48 | -------------------------------------------------------------------------------- /examples/images/melanoma_comut.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vanallenlab/comut/0ee140805cc7c2c8a9bd3fbb45e58e78b75ea17b/examples/images/melanoma_comut.png -------------------------------------------------------------------------------- /examples/tutorial_data/data_types.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vanallenlab/comut/0ee140805cc7c2c8a9bd3fbb45e58e78b75ea17b/examples/tutorial_data/data_types.png -------------------------------------------------------------------------------- /examples/tutorial_data/melanoma_example/best_response.tsv: 
-------------------------------------------------------------------------------- 1 | sample category value 2 | Patient41 Best response PR 3 | Patient62 Best response PR 4 | Patient49 Best response PR 5 | Patient51 Best response PR 6 | Patient99 Best response PR 7 | Patient75 Best response PR 8 | Patient96 Best response CR 9 | Patient42 Best response CR 10 | Patient55 Best response PR 11 | Patient4 Best response CR 12 | Patient25 Best response PR 13 | Patient45 Best response PR 14 | Patient15 Best response PR 15 | Patient21 Best response PR 16 | Patient87 Best response PR 17 | Patient43 Best response PR 18 | Patient10 Best response PR 19 | Patient40 Best response PD 20 | Patient82 Best response PD 21 | Patient47 Best response PD 22 | Patient9 Best response PD 23 | Patient13 Best response PD 24 | Patient20 Best response PD 25 | Patient18 Best response PD 26 | Patient59 Best response PD 27 | Patient98 Best response PD 28 | Patient32 Best response PD 29 | Patient94 Best response PD 30 | Patient27 Best response PD 31 | Patient79 Best response PD 32 | Patient83 Best response PD 33 | Patient36 Best response PD 34 | Patient11 Best response PD 35 | Patient31 Best response PD 36 | Patient23 Best response PD 37 | Patient72 Best response PD 38 | Patient73 Best response PD 39 | Patient77 Best response PD 40 | Patient63 Best response PD 41 | Patient78 Best response PD 42 | Patient37 Best response PD 43 | Patient38 Best response PD 44 | Patient14 Best response PD 45 | Patient61 Best response SD 46 | Patient60 Best response SD 47 | Patient17 Best response SD 48 | Patient54 Best response MR 49 | Patient86 Best response SD 50 | Patient8 Best response SD 51 | Patient50 Best response MR 52 | Patient30 Best response SD 53 | Patient7 Best response SD 54 | -------------------------------------------------------------------------------- /examples/tutorial_data/melanoma_example/copy_number_alterations.tsv: -------------------------------------------------------------------------------- 1 | 
sample category value 2 | Patient18 MYC Baseline 3 | Patient18 MYC Baseline 4 | Patient49 MYC Allelic amplification 5 | Patient49 MYC Allelic amplification 6 | Patient21 MYC Baseline 7 | Patient21 MYC Baseline 8 | Patient32 MYC Allelic amplification 9 | Patient32 MYC Allelic deletion 10 | Patient14 MYC Baseline 11 | Patient14 MYC Baseline 12 | Patient43 MYC Allelic amplification 13 | Patient43 MYC aCN = 0 14 | Patient55 MYC Baseline 15 | Patient55 MYC Baseline 16 | Patient27 MYC Baseline 17 | Patient27 MYC Baseline 18 | Patient61 MYC Baseline 19 | Patient61 MYC Baseline 20 | Patient62 MYC Baseline 21 | Patient62 MYC Baseline 22 | Patient51 MYC Allelic amplification 23 | Patient51 MYC Baseline 24 | Patient10 MYC Allelic amplification 25 | Patient10 MYC Baseline 26 | Patient96 MYC Allelic amplification 27 | Patient96 MYC Baseline 28 | Patient7 MYC Allelic amplification 29 | Patient7 MYC Allelic amplification 30 | Patient8 MYC Baseline 31 | Patient8 MYC Baseline 32 | Patient17 MYC Baseline 33 | Patient17 MYC Baseline 34 | Patient36 MYC Allelic amplification 35 | Patient36 MYC Baseline 36 | Patient9 MYC Baseline 37 | Patient9 MYC Baseline 38 | Patient83 MYC Baseline 39 | Patient83 MYC Baseline 40 | Patient37 MYC Baseline 41 | Patient37 MYC Baseline 42 | Patient11 MYC Baseline 43 | Patient11 MYC Baseline 44 | Patient45 MYC Baseline 45 | Patient45 MYC Baseline 46 | Patient79 MYC Baseline 47 | Patient79 MYC Baseline 48 | Patient75 MYC Baseline 49 | Patient75 MYC Allelic deletion 50 | Patient15 MYC Baseline 51 | Patient15 MYC Baseline 52 | Patient13 MYC Allelic amplification 53 | Patient13 MYC Baseline 54 | Patient25 MYC Baseline 55 | Patient25 MYC Baseline 56 | Patient31 MYC Baseline 57 | Patient31 MYC Baseline 58 | Patient86 MYC Baseline 59 | Patient86 MYC Baseline 60 | Patient41 MYC Baseline 61 | Patient41 MYC Allelic deletion 62 | Patient98 MYC Allelic amplification 63 | Patient98 MYC Baseline 64 | Patient72 MYC Allelic amplification 65 | Patient72 MYC Baseline 66 | 
Patient99 MYC Baseline 67 | Patient99 MYC aCN = 0 68 | Patient4 MYC Baseline 69 | Patient4 MYC Baseline 70 | Patient40 MYC Baseline 71 | Patient40 MYC Baseline 72 | Patient30 MYC Allelic amplification 73 | Patient30 MYC Baseline 74 | Patient47 MYC Baseline 75 | Patient47 MYC Allelic deletion 76 | Patient60 MYC Baseline 77 | Patient60 MYC aCN = 0 78 | Patient59 MYC Baseline 79 | Patient59 MYC Baseline 80 | Patient87 MYC Baseline 81 | Patient87 MYC Baseline 82 | Patient23 MYC Baseline 83 | Patient23 MYC Baseline 84 | Patient94 MYC Baseline 85 | Patient94 MYC Baseline 86 | Patient20 MYC Baseline 87 | Patient20 MYC Baseline 88 | Patient73 MYC Allelic amplification 89 | Patient73 MYC Baseline 90 | Patient77 MYC Allelic amplification 91 | Patient77 MYC Baseline 92 | Patient50 MYC Allelic amplification 93 | Patient50 MYC Baseline 94 | Patient82 MYC Allelic amplification 95 | Patient82 MYC aCN = 0 96 | Patient38 MYC Allelic amplification 97 | Patient38 MYC Baseline 98 | Patient78 MYC Allelic amplification 99 | Patient78 MYC Allelic deletion 100 | Patient63 MYC Baseline 101 | Patient63 MYC aCN = 0 102 | Patient54 MYC Baseline 103 | Patient54 MYC Baseline 104 | Patient42 MYC Baseline 105 | Patient42 MYC Allelic deletion 106 | Patient18 CDKN2A Complex 107 | Patient49 CDKN2A aCN = 0 108 | Patient49 CDKN2A aCN = 0 109 | Patient21 CDKN2A Baseline 110 | Patient21 CDKN2A aCN = 0 111 | Patient32 CDKN2A Allelic amplification 112 | Patient32 CDKN2A aCN = 0 113 | Patient14 CDKN2A Baseline 114 | Patient14 CDKN2A aCN = 0 115 | Patient43 CDKN2A Allelic deletion 116 | Patient43 CDKN2A aCN = 0 117 | Patient55 CDKN2A Baseline 118 | Patient55 CDKN2A Baseline 119 | Patient27 CDKN2A Baseline 120 | Patient27 CDKN2A aCN = 0 121 | Patient61 CDKN2A Baseline 122 | Patient61 CDKN2A aCN = 0 123 | Patient62 CDKN2A Baseline 124 | Patient62 CDKN2A aCN = 0 125 | Patient51 CDKN2A Baseline 126 | Patient51 CDKN2A Allelic deletion 127 | Patient10 CDKN2A Baseline 128 | Patient10 CDKN2A aCN = 0 129 | Patient96 
CDKN2A Baseline 130 | Patient96 CDKN2A Baseline 131 | Patient7 CDKN2A CN-LOH 132 | Patient8 CDKN2A Baseline 133 | Patient8 CDKN2A aCN = 0 134 | Patient17 CDKN2A Allelic deletion 135 | Patient17 CDKN2A aCN = 0 136 | Patient36 CDKN2A Baseline 137 | Patient36 CDKN2A aCN = 0 138 | Patient9 CDKN2A Baseline 139 | Patient9 CDKN2A aCN = 0 140 | Patient83 CDKN2A Not Available 141 | Patient37 CDKN2A Baseline 142 | Patient37 CDKN2A Baseline 143 | Patient11 CDKN2A Baseline 144 | Patient11 CDKN2A Allelic deletion 145 | Patient45 CDKN2A Baseline 146 | Patient45 CDKN2A Baseline 147 | Patient79 CDKN2A Baseline 148 | Patient79 CDKN2A aCN = 0 149 | Patient75 CDKN2A Baseline 150 | Patient75 CDKN2A Allelic deletion 151 | Patient15 CDKN2A CN-LOH 152 | Patient13 CDKN2A Not Available 153 | Patient25 CDKN2A Baseline 154 | Patient25 CDKN2A Baseline 155 | Patient31 CDKN2A Allelic deletion 156 | Patient31 CDKN2A Allelic deletion 157 | Patient86 CDKN2A Baseline 158 | Patient86 CDKN2A Baseline 159 | Patient41 CDKN2A Baseline 160 | Patient41 CDKN2A Baseline 161 | Patient98 CDKN2A Baseline 162 | Patient98 CDKN2A aCN = 0 163 | Patient72 CDKN2A aCN = 0 164 | Patient72 CDKN2A aCN = 0 165 | Patient99 CDKN2A Allelic deletion 166 | Patient99 CDKN2A Allelic deletion 167 | Patient4 CDKN2A aCN = 0 168 | Patient4 CDKN2A aCN = 0 169 | Patient40 CDKN2A Complex 170 | Patient30 CDKN2A aCN = 0 171 | Patient30 CDKN2A aCN = 0 172 | Patient47 CDKN2A aCN = 0 173 | Patient47 CDKN2A aCN = 0 174 | Patient60 CDKN2A Baseline 175 | Patient60 CDKN2A aCN = 0 176 | Patient59 CDKN2A Baseline 177 | Patient59 CDKN2A aCN = 0 178 | Patient87 CDKN2A Baseline 179 | Patient87 CDKN2A aCN = 0 180 | Patient23 CDKN2A Baseline 181 | Patient23 CDKN2A aCN = 0 182 | Patient94 CDKN2A Not Available 183 | Patient20 CDKN2A Not Available 184 | Patient73 CDKN2A Baseline 185 | Patient73 CDKN2A aCN = 0 186 | Patient77 CDKN2A Baseline 187 | Patient77 CDKN2A aCN = 0 188 | Patient50 CDKN2A Baseline 189 | Patient50 CDKN2A aCN = 0 190 | Patient82 
CDKN2A aCN = 0 191 | Patient82 CDKN2A aCN = 0 192 | Patient38 CDKN2A Baseline 193 | Patient38 CDKN2A Baseline 194 | Patient78 CDKN2A Baseline 195 | Patient78 CDKN2A aCN = 0 196 | Patient63 CDKN2A aCN = 0 197 | Patient63 CDKN2A aCN = 0 198 | Patient54 CDKN2A Baseline 199 | Patient54 CDKN2A Baseline 200 | Patient42 CDKN2A Allelic deletion 201 | Patient42 CDKN2A Allelic deletion 202 | Patient18 TP53 Baseline 203 | Patient18 TP53 Baseline 204 | Patient49 TP53 Baseline 205 | Patient49 TP53 Baseline 206 | Patient21 TP53 Baseline 207 | Patient21 TP53 Baseline 208 | Patient32 TP53 Baseline 209 | Patient32 TP53 aCN = 0 210 | Patient14 TP53 Baseline 211 | Patient14 TP53 Baseline 212 | Patient43 TP53 Baseline 213 | Patient43 TP53 Baseline 214 | Patient55 TP53 Baseline 215 | Patient55 TP53 Baseline 216 | Patient27 TP53 Baseline 217 | Patient27 TP53 aCN = 0 218 | Patient61 TP53 Baseline 219 | Patient61 TP53 aCN = 0 220 | Patient62 TP53 Baseline 221 | Patient62 TP53 Baseline 222 | Patient51 TP53 Baseline 223 | Patient51 TP53 Baseline 224 | Patient10 TP53 Baseline 225 | Patient10 TP53 aCN = 0 226 | Patient96 TP53 Baseline 227 | Patient96 TP53 aCN = 0 228 | Patient7 TP53 Baseline 229 | Patient7 TP53 Baseline 230 | Patient8 TP53 Baseline 231 | Patient8 TP53 Baseline 232 | Patient17 TP53 Baseline 233 | Patient17 TP53 Allelic deletion 234 | Patient36 TP53 Baseline 235 | Patient36 TP53 Baseline 236 | Patient9 TP53 Baseline 237 | Patient9 TP53 Baseline 238 | Patient83 TP53 Baseline 239 | Patient83 TP53 Baseline 240 | Patient37 TP53 Baseline 241 | Patient37 TP53 Baseline 242 | Patient11 TP53 Baseline 243 | Patient11 TP53 Allelic deletion 244 | Patient45 TP53 Baseline 245 | Patient45 TP53 aCN = 0 246 | Patient79 TP53 CN-LOH 247 | Patient75 TP53 Baseline 248 | Patient75 TP53 Baseline 249 | Patient15 TP53 Baseline 250 | Patient15 TP53 Baseline 251 | Patient13 TP53 Baseline 252 | Patient13 TP53 Baseline 253 | Patient25 TP53 Baseline 254 | Patient25 TP53 Baseline 255 | Patient31 TP53 
Baseline 256 | Patient31 TP53 Allelic deletion 257 | Patient86 TP53 Baseline 258 | Patient86 TP53 Baseline 259 | Patient41 TP53 CN-LOH 260 | Patient98 TP53 Baseline 261 | Patient98 TP53 Baseline 262 | Patient72 TP53 Baseline 263 | Patient72 TP53 Allelic deletion 264 | Patient99 TP53 Baseline 265 | Patient99 TP53 Allelic deletion 266 | Patient4 TP53 Baseline 267 | Patient4 TP53 Baseline 268 | Patient40 TP53 Baseline 269 | Patient40 TP53 Baseline 270 | Patient30 TP53 Baseline 271 | Patient30 TP53 Baseline 272 | Patient47 TP53 Allelic deletion 273 | Patient47 TP53 Allelic deletion 274 | Patient60 TP53 Baseline 275 | Patient60 TP53 aCN = 0 276 | Patient59 TP53 Baseline 277 | Patient59 TP53 Baseline 278 | Patient87 TP53 Baseline 279 | Patient87 TP53 aCN = 0 280 | Patient23 TP53 Baseline 281 | Patient23 TP53 aCN = 0 282 | Patient94 TP53 Baseline 283 | Patient94 TP53 aCN = 0 284 | Patient20 TP53 Baseline 285 | Patient20 TP53 Baseline 286 | Patient73 TP53 Baseline 287 | Patient73 TP53 Allelic deletion 288 | Patient77 TP53 Baseline 289 | Patient77 TP53 Baseline 290 | Patient50 TP53 Baseline 291 | Patient50 TP53 Allelic deletion 292 | Patient82 TP53 Baseline 293 | Patient82 TP53 aCN = 0 294 | Patient38 TP53 Baseline 295 | Patient38 TP53 Baseline 296 | Patient78 TP53 Baseline 297 | Patient78 TP53 aCN = 0 298 | Patient63 TP53 Baseline 299 | Patient63 TP53 aCN = 0 300 | Patient54 TP53 Baseline 301 | Patient54 TP53 Baseline 302 | Patient42 TP53 Baseline 303 | Patient42 TP53 aCN = 0 304 | -------------------------------------------------------------------------------- /examples/tutorial_data/melanoma_example/melanoma_comut.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vanallenlab/comut/0ee140805cc7c2c8a9bd3fbb45e58e78b75ea17b/examples/tutorial_data/melanoma_example/melanoma_comut.png -------------------------------------------------------------------------------- 
/examples/tutorial_data/melanoma_example/merged_clinical_data.tsv: -------------------------------------------------------------------------------- 1 | sample total_muts nonsyn_muts clonal_muts subclonal_muts heterogeneity total_neoantigens CNA_prop gender (Male=1, Female=0) biopsy site monthsBiopsyPreTx BR PFS OS TimeToBR cyclesOnTherapy txOngoing Tx Mstage (IIIC=0, M1a=1, M1b=2, M1c=3) Tx_Start_ECOG Tx_Start_LDH LDH_Elevated Brain_Met Cut_SubQ_Met LN_Met Lung_Met Liver_Visc_Met Bone_Met progressed dead Primary_Type Histology IOTherapy steroidsGT10mgDaily priorMAPKTx priorCTLA4 postCTLA4 postMAPKTx postCombinedCTLA_PD1 numPriorTherapies biopsy site_categ biopsyContext (1=Pre-Ipi; 2=On-Ipi; 3=Pre-PD1; 4=On-PD1) daysBiopsyToPD1 daysBiopsyAfterIpiStart purity ploidy WGD wgd total_burden resp_idx 2 | Patient41 15258 9835 8924 662 0.069059044 24622 0.079081162 1 lung 1.4 PR 106 1685 63 7 0 Nivolumab 2 0 172 0 0 0 0 1 0 0 1 0 skin unknown Nivolumab 0 0 1 0.0 0.0 0 2 lung 3 -41 214 0.7 3.97 Yes Yes 9586 2 3 | Patient62 9594 6249 5401 203 0.036224126 18457 0.148674622 0 skin 2.2 PR 1001 1001 99 62 1 Nivolumab 3 0 191 0 0 1 1 1 1 0 0 0 skin LMM Nivolumab 0 0 1 0.0 0.0 0 1 skin 3 -65 336 0.5 3.7 Yes Yes 5604 2 4 | Patient49 2423 1624 1442 155 0.09705698199999999 4836 0.21576030399999999 0 skin 1.0 PR 280 1635 95 18 0 Nivolumab 3 0 237 0 0 1 1 1 1 0 1 0 skin ALM Nivolumab 0 1 1 0.0 0.0 0 4 skin 3 -30 380 0.69 2.13 No No 1597 2 5 | Patient51 1723 1134 989 112 0.101725704 3069 0.15900598 1 skin 1.3 PR 471 1547 56 34 0 Nivolumab 2 0 185 0 0 0 0 1 0 0 1 0 skin SSM Nivo 0 0 0 0.0 0.0 0 0 skin 3 -38 na 0.3 3.82 Yes Yes 1101 2 6 | Patient99 1392 909 690 190 0.21590909100000003 2796 0.187194201 1 soft tissue 5.2 PR 319 921 158 12 0 Nivolumab 3 0 355 1 0 0 1 1 1 0 1 0 skin unknown Nivolumab 1 0 0 0.0 0.0 1 0 soft tissue 3 -155 na 0.34 2.83 Yes Yes 880 2 7 | Patient75 1112 725 645 62 0.087694484 2304 0.201142567 1 skin 0.2 PR 891 891 78 15 0 Nivolumab 3 3 254 1 0 1 0 1 0 0 1 0 occult 
na nivolumab 0 0 0 0.0 0.0 0 0 skin 3 -6 na 0.55 3.76 Yes Yes 707 2 8 | Patient96 959 649 588 50 0.078369906 2136 0.19649924600000002 1 pancreas 2.7 CR 913 913 273 10 0 Nivolumab 3 0 174 0 0 0 0 1 1 0 0 0 skin unknown Nivolumab 0 0 0 0.0 0.0 0 0 liver/visceral 3 -81 na 0.82 2.22 No No 638 2 9 | Patient42 767 507 450 48 0.09638554199999999 1610 0.123826962 1 skin 5.666666667 CR 1027 1027 83 52 1 Nivolumab 3 0 296 1 0 1 0 0 0 0 0 0 skin nodular Nivolumab 0 0 1 0.0 0.0 0 1 skin 1 -170 -61 0.81 2.69 Yes Yes 498 2 10 | Patient55 701 478 371 93 0.20043103399999998 1421 0.092364302 1 skin 4.5 PR 1503 1503 83 approximatly 96 1 Nivolumab 3 0 182 0 0 0 1 1 1 0 0 0 skin na Nivo 0 0 0 0.0 0.0 0 0 skin 3 -134 na 0.57 1.94 No No 464 2 11 | Patient4 603 384 314 53 0.144414169 1321 0.121665383 1 skin 5.1 CR 574 600 151 26 0 Pembrolizumab 3 0 149 0 0 1 1 1 1 0 1 1 skin not classified MK3475 no 0 1 0.0 0.0 0 2 skin 1 -154 -106 0.88 2.03 No No 367 2 12 | Patient25 491 328 273 46 0.144200627 1088 0.010357406999999999 1 skin 1.8 PR 752 1096 84 47 1 Pembrolizumab 3 0 956 1 0 1 1 1 1 1 1 0 skin SSM MK3475 0 0 1 0.0 0.0 0 1 skin 2 -54 42 0.85 2.07 No No 319 2 13 | Patient45 423 293 219 71 0.244827586 793 0.216010791 1 skin 0.8 PR 1666 1666 185 approx. 
120 1 Nivolumab 3 0 278 1 0 0 1 1 1 0 0 0 skin unk Nivolumab 0 0 1 0.0 0.0 0 3 skin 3 -23 310 0.61 2.08 No No 290 2 14 | Patient15 263 182 134 41 0.23428571399999998 530 0.18790907899999998 0 skin 5.0 PR 105 149 84 6 0 Pembrolizumab 3 0 225 0 0 1 1 1 0 1 1 1 skin LMM MK 3475 1 1 1 0.0 0.0 0 3 skin 3 -150 108 0.61 2.07 No No 175 2 15 | Patient21 148 105 52 46 0.469387755 184 0.236006699 0 skin 1.7 PR 906 906 90 15 0 Pembrolizumab 3 0 179 0 0 0 1 1 1 0 0 0 acral ALM MK3475 1 0 1 0.0 0.0 0 1 skin 3 -50 217 0.93 2.03 No No 98 2 16 | Patient87 121 87 72 14 0.162790698 222 0.251171185 0 skin 0.46666666700000003 PR 1142 1263 85 22 0 Pembrolizumab 3 343 1 0 0 1 0 0 1 1 0 occult unknown MK3475 0 0 1 0.0 0.0 0 1 skin 3 -14 335 0.36 3.48 Yes Yes 86 2 17 | Patient43 121 84 61 21 0.25609756100000003 291 0.43116056700000005 1 lymph node 0.7 PR 750 925 64 46 1 Nivolumab 3 0 280 1 0 0 1 1 1 0 1 0 skin ALM Nivolumab 0 0 0 0.0 0.0 0 0 lymph node 3 -22 na 0.28 2.67 Yes Yes 82 2 18 | Patient10 96 71 48 22 0.314285714 230 0.391383951 0 skin 0.4 PR 169 1139 81 1 0 Pembrolizumab 3 1 2186 1 0 1 1 1 1 1 1 1 skin x MK3475 1 1 1 1.0 1.0 0 2 skin 3 -12 107 0.83 1.84 No No 70 2 19 | Patient40 9222 6103 691 5228 0.883257307 13972 0.35340236700000005 0 liver 2.4 PD 63 468 63 15 0 Nivolumab 3 0 378 1 0 0 1 0 1 0 1 1 skin unknown Nivolumab 0 0 0 0.0 0.0 0 1 liver/visceral 3 -73 na 0.91 1.73 No No 5919 1 20 | Patient82 2253 1477 1390 48 0.033379694 4650 0.20502101 1 brain 42.0 PD 89 131 89 4 0 Pembrolizumab 3 1 1195 1 1 0 0 1 1 0 1 1 occult na Pembro 0 1 1 1.0 1.0 0 3 brain 1 -1260 -620 0.84 4.08 Yes Yes 1438 1 21 | Patient47 1087 756 641 66 0.09335219199999999 2511 0.19079258 0 skin 2.5 PD 74 937 74 6 0 Nivolumab 3 0 340 1 0 1 1 1 0 1 1 0 skin nodular Nivolumab 0 0 0 1.0 0.0 0 0 skin 3 -74 na 0.91 3.19 Yes Yes 707 1 22 | Patient9 911 652 562 80 0.12461059199999999 1996 0.111034165 0 skin 1.3 PD 21 963 21 2 0 Nivolumab 0 0 192 0 0 1 1 0 0 0 1 0 skin NMM Nivolumab no 0 0 0.0 0.0 0 0 skin 3 -39 na 
0.88 2.02 No No 642 1 23 | Patient13 913 593 521 53 0.092334495 1733 0.13403875 1 skin 0.9 PD 44 58 44 2 0 Pembrolizumab 3 1 970 1 0 1 0 0 1 0 1 1 skin x MK3475 0 0 1 0.0 0.0 0 2 skin 3 -27 203 0.89 2.24 No No 574 1 24 | Patient20 569 394 315 65 0.17105263199999998 1216 0.13463337 0 skin 0.0 PD 84 242 84 15 0 Nivolumab 3 0 214 0 1 1 1 1 1 1 1 1 skin NM Nivolumab 1 0 1 0.0 0.0 0 1 skin 3 0 209 0.45 2.06 No No 380 1 25 | Patient18 526 342 293 42 0.125373134 983 0.147651333 1 skin 16.3 PD 86 130 86 8 0 Nivolumab 0 0 217 0 0 1 0 0 0 0 1 1 skin SSM Nivolumab 0 0 0 0.0 0.0 0 0 skin 3 -490 na 0.92 1.9 No No 335 1 26 | Patient59 509 315 243 68 0.21864951800000001 1025 0.064360264 1 skin 3.1 PD 82 191 82 7 0 Nivolumab 3 1 260 1 0 0 1 1 1 0 1 1 skin NM Nivo 0 0 1 0.0 1.0 0 1 skin 3 -94 140 0.37 2.1 No No 311 1 27 | Patient98 472 302 219 74 0.252559727 790 0.223074401 0 lung 0.8 PD 73 683 73 28 0 Nivolumab 2 0 192 0 0 0 1 1 0 0 1 1 occult Nivolumab 1 0 0 0.0 0.0 1 0 lung 3 -24 na 0.6 1.99 No No 293 1 28 | Patient32 396 270 217 38 0.149019608 769 0.376051775 0 skin 0.0 PD 49 53 49 3 0 Pembrolizumab 3 0 2419 1 0 1 0 1 1 0 1 1 skin unknown MK3475 0 1 1 0.0 0.0 0 2 skin 3 0 59 0.84 2.72 Yes Yes 255 1 29 | Patient94 414 268 227 21 0.084677419 929 0.902095292 0 skin 1.0 PD 40 40 40 1 0 Pembrolizumab 3 2 817 1 1 1 1 1 1 1 1 1 skin nodular MK3475 1 1 1 0.0 0.0 0 2 skin 3 -31 140 0.85 1.13 No No 248 1 30 | Patient27 300 217 156 54 0.257142857 717 0.194713842 1 lymph node 1.1 PD 94 431 94 10 0 Pembrolizumab 1 0 276 0 0 1 1 0 0 0 1 1 skin NM MK3475 0 0 1 0.0 0.0 0 1 lymph node 3 -32 434 0.75 1.91 No No 210 1 31 | Patient79 320 205 149 48 0.243654822 591 0.21409609699999999 0 brain 4.5 PD 84 849 84 35 1 Pembrolizumab 3 0 171 0 0 1 1 1 0 0 1 0 skin ssm pembrolizumab 1 1 0 0.0 0.0 0 1 brain 3 -136 na 0.69 1.91 No No 197 1 32 | Patient83 309 194 149 39 0.20744680899999998 539 0.13518770800000002 1 skin 8.1 PD 71 71 71 3 0 Nivolumab 3 1 496 1 0 0 1 1 0 0 1 1 skin SSM Nivolumab 0 0 1 0.0 0.0 
0 1 skin 1 -243 -147 0.94 2.01 No No 188 1 33 | Patient36 277 180 157 20 0.11299435 543 0.18582013 1 lung 2.7 PD 63 451 63 5 0 Nivolumab 3 1 282 1 0 0 0 1 0 0 1 1 skin nodular Nivolumab 0 0 0 1.0 0.0 0 0 lung 3 -81 na 0.84 2.0 No No 177 1 34 | Patient11 246 172 131 35 0.210843373 560 0.23896445600000002 0 skin 0.3 PD 85 160 85 7 0 Pembrolizumab 3 1 336 1 0 1 1 0 1 0 1 1 skin SSM MK 3475 0 1 1 0.0 0.0 0 7 skin 3 -8 516 0.77 3.43 Yes Yes 166 1 35 | Patient31 235 167 145 21 0.126506024 372 0.233946178 1 skin 0.2 PD 85 426 85 7 0 Pembrolizumab 3 0 unknown unknown 0 1 1 1 0 1 1 1 skin unknown MK3475 0 0 1 0.0 0.0 0 1 skin 3 -6 214 0.88 2.97 Yes Yes 166 1 36 | Patient23 203 132 111 18 0.139534884 450 0.184649305 1 skin 2.1 PD 77 975 77 6 0 Nivolumab 0 0 unknown unknown 0 1 1 0 0 0 1 0 skin nmm nivolumab 0 0 0 0.0 0.0 0 0 skin 3 -63 na 0.94 1.83 No No 129 1 37 | Patient72 184 121 103 15 0.127118644 371 0.17113706899999998 1 skin 4.0 PD 67 395 67 x 0 Nivolumab 1 0 216 0 0 1 0 0 0 0 1 0 skin SSM nivolumab 0 0 1 0.0 1.0 0 2 skin 3 -119 240 0.78 3.89 Yes Yes 118 1 38 | Patient73 127 90 69 17 0.197674419 262 0.232352565 1 skin 1.9 PD 85 846 85 22 0 Nivolumab 3 1 279 1 0 1 1 0 0 0 1 1 skin x nivolumab 0 0 1 0.0 0.0 1 2 skin 3 -56 139 0.77 3.33 Yes Yes 86 1 39 | Patient77 151 92 67 19 0.220930233 201 0.180587685 0 lung 15.3 PD 95 444 95 13 0 Pembrolizumab 3 0 228 0 1 0 0 1 0 0 1 1 skin ssm pembrolizumab 1 0 0 0.0 1.0 0 0 lung 3 -459 na 0.74 1.98 No No 86 1 40 | Patient63 142 84 46 20 0.303030303 188 0.47209914299999994 0 skin 2.5 PD 89 178 89 12 0 Nivolumab 3 1 259 1 0 1 1 1 1 1 1 1 mucosal mucosal Nivolumab 0 0 1 0.0 0.0 0 1 skin 3 -74 425 0.93 2.62 Yes Yes 66 1 41 | Patient78 64 43 33 10 0.23255814 123 0.390218557 0 skin 0.7 PD 86 356 86 11 0 Pembrolizumab 2 0 178 0 0 1 0 1 0 0 1 1 skin nmm pembrolizumab 0 1 0 0.0 1.0 0 1 skin 3 -21 na 0.27 2.58 Yes Yes 43 1 42 | Patient37 48 40 27 12 0.307692308 60 0.07499339599999999 1 lymph node 1.2 PD 77 275 77 7 0 Nivolumab 3 0 335 1 0 1 
1 1 1 1 1 1 mucosal unknown Nivolumab 0 0 0 0.0 0.0 0 0 lymph node 3 -35 na 0.3 2.09 No No 39 1 43 | Patient38 60 39 23 16 0.41025641 85 0.190553631 1 Colon 8.2 PD 98 343 98 6 0 Nivolumab 3 0 376 1 0 1 1 1 0 1 1 1 skin SSM Nivolumab 0 1 1 0.0 1.0 0 2 liver/visceral 1 -246 -124 0.78 2.0 No No 39 1 44 | Patient14 44 30 21 9 0.3 52 0.244658956 0 skin 2.2 PD 76 186 76 4 0 Pembrolizumab 2 0 191 0 0 1 0 1 0 0 1 1 acral ALM MK 3475 1 0 0 1.0 0.0 0 0 skin 3 -67 na 0.51 1.81 No No 30 1 45 | Patient61 3254 2117 1555 496 0.241833252 5854 0.23101254899999998 0 skin 2.6 SD 987 987 154 65 1 Nivolumab 3 0 204 0 0 1 1 0 0 1 0 0 skin SSM Nivolumab 0 0 1 0.0 0.0 0 1 skin 2 -77 30 0.42 1.83 No No 2051 0 46 | Patient60 574 392 280 94 0.25133689800000003 1106 0.48415202799999996 1 skin 0.6 SD 175 662 84 13 0 Nivolumab 2 0 245 0 0 1 1 0 0 0 1 1 skin na Nivo 0 0 1 0.0 0.0 1 1 skin 3 -17 117 0.36 1.54 No No 374 0 47 | Patient17 550 365 305 52 0.145658263 821 0.059114148 1 lymph node 5.2 SD 202 686 110 21 0 Nivolumab 3 0 192 0 0 0 1 1 1 1 1 1 skin NM Nivolumab 0 0 1 0.0 1.0 Reexposure PD-1 1 lymph node 3 -156 280 0.38 2.91 Yes Yes 357 0 48 | Patient54 465 288 245 39 0.137323944 536 0.106449389 1 lymph node 2.5 MR 429 1017 105 21 0 Pembrolizumab 3 0 255 1 0 1 1 1 0 1 1 1 skin SSM MK3475 0 1 1 0.0 1.0 1 3 lymph node 3 -76 310 0.19 2.19 No No 284 0 49 | Patient86 435 279 196 73 0.271375465 576 0.034541373 1 lymph node 1.6 SD 1327 1327 79 48 1 Pembrolizumab 3 144 0 0 1 1 0 1 0 1 0 skin NM MK3475 0 0 1 0.0 0.0 0 2 lymph node 3 -48 372 0.14 2.49 No No 269 0 50 | Patient8 311 194 152 40 0.208333333 638 0.273852935 1 skin 3.9 SD 955 1005 72 4 0 Pembrolizumab 2 0 245 0 0 1 0 1 0 0 1 1 skin NMM MK3475 yes, 140 mg Decortin H (Colitis) 0 0 0.0 0.0 0 0 skin 3 -116 na 0.72 2.01 No No 192 0 51 | Patient50 305 195 168 15 0.081967213 599 0.22303688300000002 0 skin 8.0 MR 105 175 66 8 0 Pembrolizumab 3 0 746 1 0 1 1 1 1 1 1 1 mucosal unknown MK3475 1 0 1 0.0 0.0 0 2 skin 1 -240 -20 0.61 3.59 Yes Yes 183 0 
52 | Patient30 242 158 130 17 0.115646259 481 0.303313324 0 skin 0.7 SD 133 213 93 5 0 Pembrolizumab 3 0 609 1 0 0 1 1 1 0 1 1 skin unknown MK3475 0 1 1 0.0 0.0 0 7 skin 3 -21 904 0.46 4.73 Yes Yes 147 0 53 | Patient7 121 79 47 28 0.373333333 138 0.168904284 1 skin 2.3 SD 217 269 77 7 0 Pembrolizumab 3 0 263 1 0 1 1 1 1 0 1 1 occult occult melanoma MK3475 no 0 1 0.0 0.0 0 2 skin 2 -69 54 0.88 2.35 No No 75 0 54 | -------------------------------------------------------------------------------- /examples/tutorial_data/melanoma_example/mutation_frequency.tsv: -------------------------------------------------------------------------------- 1 | category Mutated samples 2 | BRAF 22 3 | NRAS 21 4 | NF1 11 5 | TP53 10 6 | CDKN2A 6 -------------------------------------------------------------------------------- /examples/tutorial_data/melanoma_example/mutation_signatures.tsv: -------------------------------------------------------------------------------- 1 | sample UV Alkylating Aging+ 2 | Patient10 0.4221710132360747 0.08012894638143825 0.4977000403824871 3 | Patient11 0.5165319808077606 0.07453374567940704 0.40893427351283235 4 | Patient13 0.8640849506465558 0.026139096799423292 0.10977595255402085 5 | Patient14 0.26962390187040547 0.21409081825736825 0.5162852798722262 6 | Patient15 0.6116945434917384 0.08474616653595819 0.3035592899723035 7 | Patient17 0.7579615241878873 0.042067395891367806 0.19997107992074487 8 | Patient18 0.8145267642449733 0.049613928529795424 0.13585930722523126 9 | Patient20 0.7800119661993588 0.12218440679106483 0.09780362700957652 10 | Patient21 4.3336301902581097e-13 0.07658054274720945 0.9234194572523573 11 | Patient23 0.6150606152493701 0.10747762281432761 0.2774617619363023 12 | Patient25 0.8380570928201992 0.09393526722467611 0.06800763995512474 13 | Patient27 0.7459537204211937 0.10512298220833918 0.14892329737046717 14 | Patient30 0.7461541666442959 0.010277690980992491 0.2435681423747116 15 | Patient31 0.33271795036169316 
0.13349790930023137 0.5337841403380755 16 | Patient32 0.8785411344286833 0.039512438794847844 0.0819464267764688 17 | Patient36 0.7996804977573589 0.046110439078955026 0.15420906316368604 18 | Patient37 0.07600364278639857 4.893941923935072e-13 0.923996357213112 19 | Patient38 0.4340956941327824 0.0028132723692476124 0.56309103349797 20 | Patient4 0.9380261215089528 4.121841582778458e-14 0.06197387849100604 21 | Patient40 0.11162061591031651 0.888379384089681 2.57199749464766e-15 22 | Patient41 0.9847042294686196 1.3778777113665292e-08 0.015295756752603324 23 | Patient42 0.8253667728978246 0.029424005499699375 0.14520922160247615 24 | Patient43 0.17321658132822196 0.11222932122730231 0.7145540974444757 25 | Patient45 0.8361800821925418 0.014961390454906173 0.14885852735255198 26 | Patient47 0.9130294415706174 0.012195842757717443 0.07477471567166517 27 | Patient49 0.049968832969848026 0.9500311670301421 9.795907454595347e-15 28 | Patient50 0.8211231999511411 0.014981700396698218 0.16389509965216065 29 | Patient51 0.9028226796220555 0.026863925794599973 0.07031339458334453 30 | Patient54 0.4813676085091835 0.1668600568426622 0.35177233464815433 31 | Patient55 0.8758097319410939 0.012551258243903204 0.11163900981500298 32 | Patient59 0.8306223937980814 0.05312217421386612 0.11625543198805252 33 | Patient60 0.8850167227002458 8.513266030047008e-14 0.11498327729966909 34 | Patient61 0.9463001594563619 0.011861607601914397 0.041838232941723646 35 | Patient62 0.9619116897541894 1.4199550988171372e-13 0.03808831024566855 36 | Patient63 2.168366623719934e-09 0.12861047223453034 0.871389525597103 37 | Patient7 0.13601052011330397 0.08242268743192498 0.7815667924547711 38 | Patient72 0.5989742748964005 0.15806361536006497 0.24296210974353452 39 | Patient73 0.38982588089044834 0.05430791644924005 0.5558662026603115 40 | Patient75 0.8893243489700109 0.02378785356536675 0.08688779746462245 41 | Patient77 0.8094727478867936 0.07112113151691503 0.11940612059629152 42 | Patient78 
0.15402011110165098 0.09225318306569721 0.7537267058326518 43 | Patient79 0.7160052752429312 0.12141422715214614 0.16258049760492263 44 | Patient8 0.8118266968327376 0.020749758049656376 0.16742354511760602 45 | Patient82 0.9063449914202228 0.0346619725765866 0.058993036003190516 46 | Patient83 0.7883094562038863 0.06410669570687709 0.14758384808923658 47 | Patient86 0.8123753422358813 0.06065518121954062 0.12696947654457819 48 | Patient87 0.2818401692067342 0.08862744781783968 0.629532382975426 49 | Patient9 0.7773534380456008 0.044354316249314345 0.17829224570508478 50 | Patient94 0.871345119113117 5.846154798435499e-11 0.1286548808284214 51 | Patient96 0.9305344747959984 0.002593973016378169 0.06687155218762346 52 | Patient98 0.8780937212002277 0.01460333851686167 0.10730294028291072 53 | Patient99 0.8564289387947661 0.037525135806481674 0.10604592539875213 54 | -------------------------------------------------------------------------------- /examples/tutorial_data/melanoma_example/mutational_burden.tsv: -------------------------------------------------------------------------------- 1 | sample Clonal Subclonal 2 | Patient41 8924 662 3 | Patient62 5401 203 4 | Patient49 1442 155 5 | Patient51 989 112 6 | Patient99 690 190 7 | Patient75 645 62 8 | Patient96 588 50 9 | Patient42 450 48 10 | Patient55 371 93 11 | Patient4 314 53 12 | Patient25 273 46 13 | Patient45 219 71 14 | Patient15 134 41 15 | Patient21 52 46 16 | Patient87 72 14 17 | Patient43 61 21 18 | Patient10 48 22 19 | Patient40 691 5228 20 | Patient82 1390 48 21 | Patient47 641 66 22 | Patient9 562 80 23 | Patient13 521 53 24 | Patient20 315 65 25 | Patient18 293 42 26 | Patient59 243 68 27 | Patient98 219 74 28 | Patient32 217 38 29 | Patient94 227 21 30 | Patient27 156 54 31 | Patient79 149 48 32 | Patient83 149 39 33 | Patient36 157 20 34 | Patient11 131 35 35 | Patient31 145 21 36 | Patient23 111 18 37 | Patient72 103 15 38 | Patient73 69 17 39 | Patient77 67 19 40 | Patient63 46 20 41 | Patient78 
33 10 42 | Patient37 27 12 43 | Patient38 23 16 44 | Patient14 21 9 45 | Patient61 1555 496 46 | Patient60 280 94 47 | Patient17 305 52 48 | Patient54 245 39 49 | Patient86 196 73 50 | Patient8 152 40 51 | Patient50 168 15 52 | Patient30 130 17 53 | Patient7 47 28 54 | -------------------------------------------------------------------------------- /examples/tutorial_data/melanoma_example/mutational_signatures.tsv: -------------------------------------------------------------------------------- 1 | sample UV Alkylating Aging+ 2 | Patient10 0.4221710132360747 0.08012894638143825 0.4977000403824871 3 | Patient11 0.5165319808077606 0.07453374567940704 0.4089342735128324 4 | Patient13 0.8640849506465558 0.02613909679942329 0.10977595255402084 5 | Patient14 0.2696239018704055 0.21409081825736825 0.5162852798722262 6 | Patient15 0.6116945434917384 0.08474616653595819 0.3035592899723035 7 | Patient17 0.7579615241878873 0.042067395891367806 0.19997107992074487 8 | Patient18 0.8145267642449733 0.049613928529795424 0.13585930722523126 9 | Patient20 0.7800119661993588 0.12218440679106485 0.09780362700957652 10 | Patient21 4.3336301902581097e-13 0.07658054274720945 0.9234194572523572 11 | Patient23 0.6150606152493701 0.1074776228143276 0.2774617619363023 12 | Patient25 0.8380570928201992 0.09393526722467613 0.06800763995512474 13 | Patient27 0.7459537204211937 0.10512298220833917 0.14892329737046714 14 | Patient30 0.7461541666442959 0.010277690980992493 0.2435681423747116 15 | Patient31 0.3327179503616932 0.13349790930023134 0.5337841403380755 16 | Patient32 0.8785411344286833 0.03951243879484784 0.0819464267764688 17 | Patient36 0.7996804977573589 0.04611043907895503 0.15420906316368604 18 | Patient37 0.07600364278639858 4.893941923935072e-13 0.9239963572131121 19 | Patient38 0.4340956941327824 0.0028132723692476124 0.56309103349797 20 | Patient4 0.9380261215089528 4.1218415827784585e-14 0.06197387849100604 21 | Patient40 0.11162061591031652 0.8883793840896809 
2.5719974946476604e-15 22 | Patient41 0.9847042294686196 1.3778777113665292e-08 0.015295756752603326 23 | Patient42 0.8253667728978246 0.029424005499699375 0.14520922160247615 24 | Patient43 0.17321658132822196 0.11222932122730232 0.7145540974444757 25 | Patient45 0.8361800821925418 0.014961390454906172 0.14885852735255198 26 | Patient47 0.9130294415706174 0.012195842757717444 0.07477471567166517 27 | Patient49 0.049968832969848026 0.950031167030142 9.795907454595348e-15 28 | Patient50 0.8211231999511411 0.014981700396698218 0.16389509965216065 29 | Patient51 0.9028226796220556 0.026863925794599973 0.07031339458334453 30 | Patient54 0.4813676085091835 0.1668600568426622 0.35177233464815433 31 | Patient55 0.8758097319410939 0.012551258243903204 0.11163900981500298 32 | Patient59 0.8306223937980814 0.05312217421386612 0.11625543198805252 33 | Patient60 0.8850167227002458 8.513266030047008e-14 0.11498327729966908 34 | Patient61 0.946300159456362 0.011861607601914395 0.04183823294172365 35 | Patient62 0.9619116897541894 1.4199550988171372e-13 0.03808831024566855 36 | Patient63 2.168366623719934e-09 0.12861047223453034 0.8713895255971031 37 | Patient7 0.13601052011330395 0.08242268743192498 0.7815667924547711 38 | Patient72 0.5989742748964005 0.15806361536006494 0.2429621097435345 39 | Patient73 0.3898258808904483 0.05430791644924005 0.5558662026603115 40 | Patient75 0.8893243489700109 0.02378785356536675 0.08688779746462245 41 | Patient77 0.8094727478867936 0.07112113151691503 0.11940612059629152 42 | Patient78 0.15402011110165098 0.0922531830656972 0.7537267058326518 43 | Patient79 0.7160052752429312 0.12141422715214616 0.16258049760492266 44 | Patient8 0.8118266968327376 0.020749758049656376 0.16742354511760602 45 | Patient82 0.9063449914202228 0.0346619725765866 0.05899303600319052 46 | Patient83 0.7883094562038863 0.06410669570687709 0.14758384808923658 47 | Patient86 0.8123753422358813 0.060655181219540624 0.1269694765445782 48 | Patient87 0.2818401692067342 
0.08862744781783967 0.629532382975426 49 | Patient9 0.7773534380456008 0.044354316249314345 0.17829224570508478 50 | Patient94 0.871345119113117 5.846154798435499e-11 0.1286548808284214 51 | Patient96 0.9305344747959984 0.002593973016378169 0.06687155218762346 52 | Patient98 0.8780937212002277 0.014603338516861668 0.10730294028291072 53 | Patient99 0.8564289387947661 0.03752513580648168 0.10604592539875213 54 | -------------------------------------------------------------------------------- /examples/tutorial_data/melanoma_example/primary_type.tsv: -------------------------------------------------------------------------------- 1 | sample category value 2 | Patient41 Primary type Skin 3 | Patient62 Primary type Skin 4 | Patient49 Primary type Skin 5 | Patient51 Primary type Skin 6 | Patient99 Primary type Skin 7 | Patient75 Primary type Occult 8 | Patient96 Primary type Skin 9 | Patient42 Primary type Skin 10 | Patient55 Primary type Skin 11 | Patient4 Primary type Skin 12 | Patient25 Primary type Skin 13 | Patient45 Primary type Skin 14 | Patient15 Primary type Skin 15 | Patient21 Primary type Acral 16 | Patient87 Primary type Occult 17 | Patient43 Primary type Skin 18 | Patient10 Primary type Skin 19 | Patient40 Primary type Skin 20 | Patient82 Primary type Occult 21 | Patient47 Primary type Skin 22 | Patient9 Primary type Skin 23 | Patient13 Primary type Skin 24 | Patient20 Primary type Skin 25 | Patient18 Primary type Skin 26 | Patient59 Primary type Skin 27 | Patient98 Primary type Occult 28 | Patient32 Primary type Skin 29 | Patient94 Primary type Skin 30 | Patient27 Primary type Skin 31 | Patient79 Primary type Skin 32 | Patient83 Primary type Skin 33 | Patient36 Primary type Skin 34 | Patient11 Primary type Skin 35 | Patient31 Primary type Skin 36 | Patient23 Primary type Skin 37 | Patient72 Primary type Skin 38 | Patient73 Primary type Skin 39 | Patient77 Primary type Skin 40 | Patient63 Primary type Mucosal 41 | Patient78 Primary type Skin 42 | Patient37 
Primary type Mucosal 43 | Patient38 Primary type Skin 44 | Patient14 Primary type Acral 45 | Patient61 Primary type Skin 46 | Patient60 Primary type Skin 47 | Patient17 Primary type Skin 48 | Patient54 Primary type Skin 49 | Patient86 Primary type Skin 50 | Patient8 Primary type Skin 51 | Patient50 Primary type Mucosal 52 | Patient30 Primary type Skin 53 | Patient7 Primary type Occult 54 | -------------------------------------------------------------------------------- /examples/tutorial_data/melanoma_example/purity.tsv: -------------------------------------------------------------------------------- 1 | sample category value 2 | Patient41 Purity 0.7 3 | Patient62 Purity 0.5 4 | Patient49 Purity 0.69 5 | Patient51 Purity 0.3 6 | Patient99 Purity 0.34 7 | Patient75 Purity 0.55 8 | Patient96 Purity 0.82 9 | Patient42 Purity 0.81 10 | Patient55 Purity 0.57 11 | Patient4 Purity 0.88 12 | Patient25 Purity 0.85 13 | Patient45 Purity 0.61 14 | Patient15 Purity 0.61 15 | Patient21 Purity 0.93 16 | Patient87 Purity 0.36 17 | Patient43 Purity 0.28 18 | Patient10 Purity 0.83 19 | Patient40 Purity 0.91 20 | Patient82 Purity 0.84 21 | Patient47 Purity 0.91 22 | Patient9 Purity 0.88 23 | Patient13 Purity 0.89 24 | Patient20 Purity 0.45 25 | Patient18 Purity 0.92 26 | Patient59 Purity 0.37 27 | Patient98 Purity 0.6 28 | Patient32 Purity 0.84 29 | Patient94 Purity 0.85 30 | Patient27 Purity 0.75 31 | Patient79 Purity 0.69 32 | Patient83 Purity 0.94 33 | Patient36 Purity 0.84 34 | Patient11 Purity 0.77 35 | Patient31 Purity 0.88 36 | Patient23 Purity 0.94 37 | Patient72 Purity 0.78 38 | Patient73 Purity 0.77 39 | Patient77 Purity 0.74 40 | Patient63 Purity 0.93 41 | Patient78 Purity 0.27 42 | Patient37 Purity 0.3 43 | Patient38 Purity 0.78 44 | Patient14 Purity 0.51 45 | Patient61 Purity 0.42 46 | Patient60 Purity 0.36 47 | Patient17 Purity 0.38 48 | Patient54 Purity 0.19 49 | Patient86 Purity 0.14 50 | Patient8 Purity 0.72 51 | Patient50 Purity 0.61 52 | Patient30 Purity 0.46 53 | 
Patient7 Purity 0.88 54 | -------------------------------------------------------------------------------- /examples/tutorial_data/melanoma_example/whole_genome_doubling.tsv: -------------------------------------------------------------------------------- 1 | sample category value 2 | Patient10 WGD No 3 | Patient11 WGD Yes 4 | Patient13 WGD No 5 | Patient14 WGD No 6 | Patient15 WGD No 7 | Patient17 WGD Yes 8 | Patient18 WGD No 9 | Patient20 WGD No 10 | Patient21 WGD No 11 | Patient23 WGD No 12 | Patient25 WGD No 13 | Patient27 WGD No 14 | Patient30 WGD Yes 15 | Patient31 WGD Yes 16 | Patient32 WGD Yes 17 | Patient36 WGD No 18 | Patient37 WGD No 19 | Patient38 WGD No 20 | Patient4 WGD No 21 | Patient40 WGD No 22 | Patient41 WGD Yes 23 | Patient42 WGD Yes 24 | Patient43 WGD Yes 25 | Patient45 WGD No 26 | Patient47 WGD Yes 27 | Patient49 WGD No 28 | Patient50 WGD Yes 29 | Patient51 WGD Yes 30 | Patient54 WGD No 31 | Patient55 WGD No 32 | Patient59 WGD No 33 | Patient60 WGD No 34 | Patient61 WGD No 35 | Patient62 WGD Yes 36 | Patient63 WGD Yes 37 | Patient7 WGD No 38 | Patient72 WGD Yes 39 | Patient73 WGD Yes 40 | Patient75 WGD Yes 41 | Patient77 WGD No 42 | Patient78 WGD Yes 43 | Patient79 WGD No 44 | Patient8 WGD No 45 | Patient82 WGD Yes 46 | Patient83 WGD No 47 | Patient86 WGD No 48 | Patient87 WGD Yes 49 | Patient9 WGD No 50 | Patient94 WGD No 51 | Patient96 WGD No 52 | Patient98 WGD No 53 | Patient99 WGD Yes -------------------------------------------------------------------------------- /examples/tutorial_data/tutorial_biopsy_site.tsv: -------------------------------------------------------------------------------- 1 | sample category value 2 | Sample1 Biopsy site Lung 3 | Sample2 Biopsy site Lung 4 | Sample3 Biopsy site Lung 5 | Sample4 Biopsy site Liver 6 | Sample5 Biopsy site Lung 7 | Sample6 Biopsy site Kidney 8 | Sample7 Biopsy site Kidney 9 | Sample8 Biopsy site Lung 10 | Sample9 Biopsy site Kidney 11 | Sample10 Biopsy site Lung 12 | Sample11 Biopsy site 
Liver 13 | Sample12 Biopsy site Lung 14 | Sample13 Biopsy site Lung 15 | Sample14 Biopsy site Lymph Node 16 | Sample15 Biopsy site Liver 17 | Sample16 Biopsy site Liver 18 | Sample17 Biopsy site Liver 19 | Sample18 Biopsy site Lung 20 | Sample19 Biopsy site Kidney 21 | Sample20 Biopsy site Lung 22 | Sample21 Biopsy site Kidney 23 | Sample22 Biopsy site Liver 24 | Sample23 Biopsy site Kidney 25 | Sample24 Biopsy site Kidney 26 | Sample25 Biopsy site Kidney 27 | Sample26 Biopsy site Lymph Node 28 | Sample27 Biopsy site Lung 29 | Sample28 Biopsy site Liver 30 | Sample29 Biopsy site Kidney 31 | Sample30 Biopsy site Kidney 32 | Sample31 Biopsy site Kidney 33 | Sample32 Biopsy site Kidney 34 | Sample33 Biopsy site Liver 35 | Sample34 Biopsy site Liver 36 | Sample35 Biopsy site Kidney 37 | Sample36 Biopsy site Lymph Node 38 | Sample37 Biopsy site Kidney 39 | Sample38 Biopsy site Kidney 40 | Sample39 Biopsy site Kidney 41 | Sample40 Biopsy site Liver 42 | Sample41 Biopsy site Liver 43 | Sample42 Biopsy site Lung 44 | Sample43 Biopsy site Liver 45 | Sample44 Biopsy site Kidney 46 | Sample45 Biopsy site Lung 47 | Sample46 Biopsy site Liver 48 | Sample47 Biopsy site Liver 49 | Sample48 Biopsy site Lung 50 | Sample49 Biopsy site Kidney 51 | Sample50 Biopsy site Liver 52 | -------------------------------------------------------------------------------- /examples/tutorial_data/tutorial_comut.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/vanallenlab/comut/0ee140805cc7c2c8a9bd3fbb45e58e78b75ea17b/examples/tutorial_data/tutorial_comut.png -------------------------------------------------------------------------------- /examples/tutorial_data/tutorial_indicator.tsv: -------------------------------------------------------------------------------- 1 | sample group 2 | Sample1 0 3 | Sample2 0 4 | Sample3 1 5 | Sample4 1 6 | Sample5 1 7 | Sample6 1 8 | Sample7 2 9 | Sample8 2 10 | Sample9 3 11 | Sample10 3 12 | 
Sample11 3 13 | Sample12 3 14 | Sample13 3 15 | Sample14 3 16 | Sample15 4 17 | Sample16 6 18 | Sample17 6 19 | Sample18 6 20 | Sample19 6 21 | Sample20 6 22 | Sample21 7 23 | Sample22 7 24 | Sample23 8 25 | Sample24 9 26 | Sample25 9 27 | Sample26 10 28 | Sample27 10 29 | Sample28 10 30 | Sample29 11 31 | Sample30 11 32 | Sample31 12 33 | Sample32 12 34 | Sample33 12 35 | Sample34 12 36 | Sample35 12 37 | Sample36 13 38 | Sample37 13 39 | Sample38 14 40 | Sample39 14 41 | Sample40 14 42 | Sample41 14 43 | Sample42 15 44 | Sample43 16 45 | Sample44 16 46 | Sample45 17 47 | Sample46 17 48 | Sample47 17 49 | Sample48 18 50 | Sample49 18 51 | Sample50 19 52 | -------------------------------------------------------------------------------- /examples/tutorial_data/tutorial_mutation_burden.tsv: -------------------------------------------------------------------------------- 1 | sample Nonsynonymous Synonymous 2 | Sample1 62 1 3 | Sample2 67 26 4 | Sample3 15 5 5 | Sample4 17 6 6 | Sample5 33 8 7 | Sample6 36 5 8 | Sample7 84 21 9 | Sample8 41 11 10 | Sample9 64 14 11 | Sample10 13 4 12 | Sample11 55 4 13 | Sample12 8 4 14 | Sample13 79 39 15 | Sample14 36 18 16 | Sample15 50 9 17 | Sample16 36 13 18 | Sample17 12 3 19 | Sample18 0 0 20 | Sample19 20 5 21 | Sample20 87 25 22 | Sample21 31 11 23 | Sample22 97 6 24 | Sample23 29 6 25 | Sample24 19 7 26 | Sample25 2 1 27 | Sample26 98 0 28 | Sample27 12 1 29 | Sample28 29 6 30 | Sample29 52 17 31 | Sample30 98 7 32 | Sample31 94 44 33 | Sample32 39 3 34 | Sample33 14 1 35 | Sample34 79 24 36 | Sample35 37 13 37 | Sample36 77 5 38 | Sample37 33 15 39 | Sample38 3 1 40 | Sample39 50 15 41 | Sample40 95 39 42 | Sample41 28 2 43 | Sample42 52 14 44 | Sample43 13 4 45 | Sample44 58 29 46 | Sample45 71 11 47 | Sample46 21 8 48 | Sample47 51 8 49 | Sample48 40 7 50 | Sample49 28 1 51 | Sample50 91 23 52 | -------------------------------------------------------------------------------- 
/examples/tutorial_data/tutorial_mutation_data.tsv: -------------------------------------------------------------------------------- 1 | sample category value 2 | Sample2 BRCA2 Splice site 3 | Sample2 NRAS Frameshift indel 4 | Sample2 NRAS Nonsense 5 | Sample4 RB1 Frameshift indel 6 | Sample4 RB1 Missense 7 | Sample4 BRCA2 Splice site 8 | Sample4 NRAS Splice site 9 | Sample5 TP53 Frameshift indel 10 | Sample5 TP53 Missense 11 | Sample5 BRAF In frame indel 12 | Sample5 MYC Nonsense 13 | Sample5 BRCA2 Frameshift indel 14 | Sample5 HER2 Splice site 15 | Sample5 PTEN Frameshift indel 16 | Sample5 PTEN Frameshift indel 17 | Sample6 TP53 Splice site 18 | Sample6 TP53 Frameshift indel 19 | Sample6 BRAF Splice site 20 | Sample6 KRAS In frame indel 21 | Sample6 PTEN In frame indel 22 | Sample7 BRAF Nonsense 23 | Sample7 HER2 In frame indel 24 | Sample7 HER2 Frameshift indel 25 | Sample8 MYC Splice site 26 | Sample8 YAP Frameshift indel 27 | Sample9 MYC Splice site 28 | Sample9 KRAS Nonsense 29 | Sample9 KRAS Splice site 30 | Sample9 RB1 Missense 31 | Sample9 BRCA2 Frameshift indel 32 | Sample11 MYC In frame indel 33 | Sample11 NRAS Nonsense 34 | Sample12 BRAF Missense 35 | Sample13 MYC Nonsense 36 | Sample13 HER2 Splice site 37 | Sample13 HER2 Splice site 38 | Sample14 NRAS Nonsense 39 | Sample14 NRAS Splice site 40 | Sample14 YAP In frame indel 41 | Sample15 TP53 Splice site 42 | Sample15 RB1 Frameshift indel 43 | Sample15 HER2 In frame indel 44 | Sample15 PTEN Frameshift indel 45 | Sample16 BRAF Missense 46 | Sample16 ATM Nonsense 47 | Sample16 NRAS Nonsense 48 | Sample17 BRCA2 Splice site 49 | Sample17 ATM Frameshift indel 50 | Sample18 MYC Missense 51 | Sample18 RB1 Splice site 52 | Sample19 TP53 Frameshift indel 53 | Sample19 MYC Splice site 54 | Sample19 RB1 Frameshift indel 55 | Sample19 RB1 In frame indel 56 | Sample20 ATM Splice site 57 | Sample20 ATM Frameshift indel 58 | Sample21 RB1 Splice site 59 | Sample22 BRAF In frame indel 60 | Sample22 BRCA2 Missense 61 | 
Sample22 BRCA2 Frameshift indel 62 | Sample22 HER2 In frame indel 63 | Sample22 HER2 Nonsense 64 | Sample23 BRAF Splice site 65 | Sample23 PTEN In frame indel 66 | Sample24 BRAF Missense 67 | Sample24 BRCA2 Splice site 68 | Sample24 HER2 Nonsense 69 | Sample24 NRAS Frameshift indel 70 | Sample25 TP53 Missense 71 | Sample25 TP53 Frameshift indel 72 | Sample25 RB1 Splice site 73 | Sample25 HER2 Frameshift indel 74 | Sample26 BRCA2 Missense 75 | Sample26 ATM Frameshift indel 76 | Sample27 BRCA2 Splice site 77 | Sample27 NRAS Splice site 78 | Sample28 KRAS Splice site 79 | Sample28 YAP In frame indel 80 | Sample29 KRAS Nonsense 81 | Sample29 NRAS In frame indel 82 | Sample29 PTEN Missense 83 | Sample30 BRAF Frameshift indel 84 | Sample30 BRAF Splice site 85 | Sample30 BRCA2 Missense 86 | Sample31 TP53 Nonsense 87 | Sample31 BRAF In frame indel 88 | Sample31 YAP Splice site 89 | Sample32 TP53 Splice site 90 | Sample32 BRAF In frame indel 91 | Sample32 BRAF Frameshift indel 92 | Sample32 BRAF Missense 93 | Sample32 HER2 Splice site 94 | Sample33 TP53 Nonsense 95 | Sample33 TP53 Missense 96 | Sample33 RB1 Frameshift indel 97 | Sample33 BRCA2 Nonsense 98 | Sample33 BRCA2 Frameshift indel 99 | Sample33 BRCA2 Frameshift indel 100 | Sample33 HER2 In frame indel 101 | Sample33 PTEN In frame indel 102 | Sample33 PTEN Nonsense 103 | Sample35 MYC Frameshift indel 104 | Sample35 MYC Frameshift indel 105 | Sample35 HER2 Frameshift indel 106 | Sample36 TP53 In frame indel 107 | Sample36 BRAF Splice site 108 | Sample36 BRAF Nonsense 109 | Sample36 HER2 Splice site 110 | Sample36 ATM Splice site 111 | Sample36 PTEN Splice site 112 | Sample36 YAP Missense 113 | Sample36 YAP Splice site 114 | Sample36 YAP Frameshift indel 115 | Sample37 ATM Missense 116 | Sample37 PTEN Frameshift indel 117 | Sample38 KRAS Splice site 118 | Sample38 ATM Nonsense 119 | Sample38 NRAS Frameshift indel 120 | Sample39 YAP In frame indel 121 | Sample39 YAP Splice site 122 | Sample40 YAP Missense 123 | Sample41 
TP53 Frameshift indel 124 | Sample41 NRAS In frame indel 125 | Sample42 BRAF Splice site 126 | Sample42 NRAS Nonsense 127 | Sample42 YAP Splice site 128 | Sample43 BRAF Frameshift indel 129 | Sample43 PTEN Missense 130 | Sample44 HER2 Nonsense 131 | Sample44 HER2 In frame indel 132 | Sample44 HER2 Frameshift indel 133 | Sample45 KRAS In frame indel 134 | Sample45 RB1 Splice site 135 | Sample45 RB1 Splice site 136 | Sample45 PTEN Splice site 137 | Sample45 PTEN Frameshift indel 138 | Sample45 YAP Missense 139 | Sample45 YAP Frameshift indel 140 | Sample45 YAP Frameshift indel 141 | Sample46 TP53 Splice site 142 | Sample47 TP53 Nonsense 143 | Sample47 BRAF Nonsense 144 | Sample47 MYC In frame indel 145 | Sample47 KRAS Nonsense 146 | Sample48 TP53 Nonsense 147 | Sample48 BRAF Missense 148 | Sample49 HER2 Nonsense 149 | Sample49 NRAS Splice site 150 | Sample50 MYC Nonsense 151 | -------------------------------------------------------------------------------- /examples/tutorial_data/tutorial_mutsig_qvals.tsv: -------------------------------------------------------------------------------- 1 | category -log(Q) 2 | BRCA2 0.1 3 | NRAS 0.3 4 | RB1 0.5 5 | TP53 0.8 6 | BRAF 1.2 7 | MYC 1.5 8 | HER2 2.0 9 | PTEN 3.0 10 | KRAS 4.0 11 | YAP 5.0 12 | ATM 6.0 13 | -------------------------------------------------------------------------------- /examples/tutorial_data/tutorial_purity.tsv: -------------------------------------------------------------------------------- 1 | sample category value 2 | Sample1 Purity 0.49 3 | Sample2 Purity 0.39 4 | Sample3 Purity 0.17 5 | Sample4 Purity 0.97 6 | Sample5 Purity 0.47000000000000003 7 | Sample6 Purity 0.51 8 | Sample7 Purity 0.09 9 | Sample8 Purity 0.54 10 | Sample9 Purity 0.18 11 | Sample10 Purity 0.74 12 | Sample11 Purity 0.7000000000000001 13 | Sample12 Purity 0.23 14 | Sample13 Purity 0.9 15 | Sample14 Purity 0.9400000000000001 16 | Sample15 Purity 0.7000000000000001 17 | Sample16 Purity 0.5700000000000001 18 | Sample17 Purity 
0.6900000000000001
19 | Sample18 Purity 0.13
20 | Sample19 Purity 0.05
21 | Sample20 Purity 0.89
22 | Sample21 Purity 0.51
23 | Sample22 Purity 0.92
24 | Sample23 Purity 0.74
25 | Sample24 Purity 0.29
26 | Sample25 Purity 0.49
27 | Sample26 Purity 0.07
28 | Sample27 Purity 0.6900000000000001
29 | Sample28 Purity 0.9
30 | Sample29 Purity 0.93
31 | Sample30 Purity 0.5700000000000001
32 | Sample31 Purity 0.39
33 | Sample32 Purity 0.93
34 | Sample33 Purity 0.8
35 | Sample34 Purity 0.72
36 | Sample35 Purity 0.92
37 | Sample36 Purity 0.29
38 | Sample37 Purity 0.7000000000000001
39 | Sample38 Purity 0.19
40 | Sample39 Purity 0.0
41 | Sample40 Purity 0.97
42 | Sample41 Purity 0.85
43 | Sample42 Purity 0.47000000000000003
44 | Sample43 Purity 0.05
45 | Sample44 Purity 0.92
46 | Sample45 Purity 0.74
47 | Sample46 Purity 0.27
48 | Sample47 Purity 0.39
49 | Sample48 Purity 0.79
50 | Sample49 Purity 0.35000000000000003
51 | Sample50 Purity 0.19
52 |

--------------------------------------------------------------------------------
/reqs/base-requirements.txt:
--------------------------------------------------------------------------------
numpy>=1.18.1
pandas>=0.25.3
palettable>=3.3.0
matplotlib>=3.1.1

--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
import setuptools

with open("README.md") as f:
    LONG_DESCRIPTION, LONG_DESC_TYPE = f.read(), "text/markdown"

with open("reqs/base-requirements.txt") as f:
    REQUIREMENTS = f.read().splitlines()

NAME = "comut"
AUTHOR_NAME, AUTHOR_EMAIL = "Jett Crowdis", "jcrowdis@broadinstitute.org"
CLASSIFIERS = [
    "Development Status :: 4 - Beta",
    "License :: OSI Approved :: MIT License",
    "Programming Language :: Python",
    "Programming Language :: Python :: 3.6",
    "Programming Language :: Python :: 3.7",
    "Programming Language :: Python :: 3.8",
    "Programming Language :: Python :: 3.9",
    "Topic :: Scientific/Engineering :: Bio-Informatics",
]
LICENSE = "MIT"
DESCRIPTION = "A Python library for creating comutation plots to visualize genomic and phenotypic information"
URL = "https://github.com/vanallenlab/comut"
PYTHON_REQ = ">=3.6"

setuptools.setup(
    name=NAME,
    version="0.0.3",
    author=AUTHOR_NAME,
    author_email=AUTHOR_EMAIL,
    description=DESCRIPTION,
    license=LICENSE,
    long_description=LONG_DESCRIPTION,
    long_description_content_type=LONG_DESC_TYPE,
    url=URL,
    packages=setuptools.find_packages(),
    classifiers=CLASSIFIERS,
    python_requires=PYTHON_REQ,
    install_requires=REQUIREMENTS,
)

--------------------------------------------------------------------------------