├── .github
│   ├── ISSUE_TEMPLATE
│   │   ├── bug_report.md
│   │   ├── feature_request.md
│   │   └── question.md
│   └── workflows
│       ├── deploy.yml
│       ├── pre-commit.yml
│       └── run_tests.yml
├── .gitignore
├── .pre-commit-config.yaml
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── Rmagic
│   ├── .Rbuildignore
│   ├── .pre-commit-config.yaml
│   ├── .pre-commit.r_requirements.txt
│   ├── DESCRIPTION
│   ├── LICENSE
│   ├── NAMESPACE
│   ├── R
│   │   ├── magic.R
│   │   ├── magic_testdata.R
│   │   ├── preprocessing.R
│   │   └── utils.R
│   ├── README.Rmd
│   ├── README.md
│   ├── data-raw
│   │   └── generate_test_data.R
│   ├── data
│   │   └── magic_testdata.rda
│   ├── inst
│   │   ├── CITATION
│   │   └── examples
│   │       ├── BMMSC_data_R_after_magic.png
│   │       ├── BMMSC_data_R_before_magic.png
│   │       ├── BMMSC_data_R_pca_colored_by_magic.png
│   │       ├── BMMSC_data_R_phate_colored_by_magic.png
│   │       ├── EMT_data_R_after_magic.png
│   │       ├── EMT_data_R_before_magic.png
│   │       ├── EMT_data_R_pca_colored_by_magic.png
│   │       ├── EMT_data_R_phate_colored_by_magic.png
│   │       ├── bonemarrow_tutorial.Rmd
│   │       ├── bonemarrow_tutorial.html
│   │       ├── emt_tutorial.Rmd
│   │       └── emt_tutorial.html
│   ├── man
│   │   ├── as.data.frame.Rd
│   │   ├── as.matrix.Rd
│   │   ├── check_pymagic_version.Rd
│   │   ├── figures
│   │   │   ├── README-plot_magic-1.png
│   │   │   ├── README-plot_raw-1.png
│   │   │   ├── README-plot_reduced_t-1.png
│   │   │   ├── README-run_pca-1.png
│   │   │   ├── README-run_phate-1.png
│   │   │   ├── README-unnamed-chunk-1-1.png
│   │   │   └── README-unnamed-chunk-3-1.png
│   │   ├── ggplot.Rd
│   │   ├── install.magic.Rd
│   │   ├── library.size.normalize.Rd
│   │   ├── magic.Rd
│   │   ├── magic_testdata.Rd
│   │   ├── print.Rd
│   │   ├── pymagic_is_available.Rd
│   │   └── summary.Rd
│   └── tests
│       └── test_magic.R
├── data
│   ├── HMLE_TGFb_day_8_10.csv.gz
│   └── test_data.csv
├── magic.gif
├── matlab
│   ├── .DS_Store
│   ├── MAGIC Tutorial MATLAB-EMT.pptx
│   ├── compute_kernel.m
│   ├── compute_operator.m
│   ├── compute_optimal_t.m
│   ├── load_10x.m
│   ├── mmread.m
│   ├── project_genes.m
│   ├── randPCA.m
│   ├── run_magic.m
│   ├── svdpca.m
│   ├── svdpca_sparse.m
│   └── test_magic.m
├── python
│   ├── README.rst
│   ├── doc
│   │   ├── Makefile
│   │   └── source
│   │       ├── api.rst
│   │       ├── conf.py
│   │       ├── index.rst
│   │       ├── installation.rst
│   │       ├── requirements.txt
│   │       └── tutorial.rst
│   ├── magic
│   │   ├── __init__.py
│   │   ├── after_magic_example.png
│   │   ├── before_magic_example.png
│   │   ├── magic.py
│   │   ├── plot.py
│   │   ├── utils.py
│   │   └── version.py
│   ├── requirements.txt
│   ├── setup.py
│   ├── test
│   │   └── test.py
│   └── tutorial_notebooks
│       ├── bonemarrow_tutorial.ipynb
│       └── emt_tutorial.ipynb
└── setup.cfg

/.github/ISSUE_TEMPLATE/bug_report.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Bug report
3 | about: Create a report to help us improve
4 | title: ''
5 | labels: bug
6 | assignees: ''
7 |
8 | ---
9 |
10 | **Describe the bug**
11 | A clear and concise description of what the bug is.
12 |
13 | **To Reproduce**
14 | Standalone code to reproduce the error
15 |
16 | **Expected behavior**
17 | A clear and concise description of what you expected to happen.
18 |
19 | **Actual behavior**
20 | Please include the full traceback of any errors
21 |
22 | **System information:**
23 |
24 | Output of `magic.__version__`:
25 |
26 | ```
27 | If you are running MAGIC in R or Python, please run magic.__version__ and paste the results here.
28 |
29 | You can do this with `python -c 'import magic; print(magic.__version__)'`
30 | ```
31 |
32 | Output of `pd.show_versions()`:
33 |
34 |
35 |
36 | ```
37 | If you are running MAGIC in R or Python, please run pd.show_versions() and paste the results here.
38 |
39 | You can do this with `python -c 'import pandas as pd; pd.show_versions()'`
40 | ```
41 |
42 |
43 |
44 | Output of `sessionInfo()`:
45 |
46 |
47 |
48 | ```
49 | If you are running MAGIC in R, please run sessionInfo() and paste the results here.
50 |
51 | You can do this with `R -e 'library(Rmagic); sessionInfo()'`
52 | ```
53 |
54 |
55 |
56 | Output of `reticulate::py_discover_config(required_module = "magic")`:
57 |
58 |
59 |
60 | ```
61 | If you are running MAGIC in R, please run `reticulate::py_discover_config(required_module = "magic")` and paste the results here.
62 |
63 | You can do this with `R -e 'reticulate::py_discover_config(required_module = "magic")'`
64 | ```
65 |
66 |
67 |
68 | Output of `Rmagic::check_pymagic_version()`:
69 |
70 |
71 |
72 | ```
73 | If you are running MAGIC in R, please run `Rmagic::check_pymagic_version()` and paste the results here.
74 |
75 | You can do this with `R -e 'Rmagic::check_pymagic_version()'`
76 | ```
77 |
78 |
79 |
80 | **Additional context**
81 | Add any other context about the problem here.
82 |
--------------------------------------------------------------------------------
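The bug report template above asks for several version strings. As a rough sketch of how that information could be gathered in one step on the Python side, the helper below tries to import each module and records its `__version__`. The function name `collect_versions` and the default module list are assumptions made for illustration; they are not part of MAGIC or of the template.

```python
import importlib
import platform


def collect_versions(modules=("magic", "pandas", "numpy", "scipy")):
    """Map each module name to its version string for pasting into a bug report.

    Hypothetical helper: the default module list is an assumption, not
    something the issue template prescribes.
    """
    report = {"python": platform.python_version()}
    for name in modules:
        try:
            mod = importlib.import_module(name)
        except ImportError:
            # The module is missing from this environment.
            report[name] = "not installed"
        else:
            # Some modules do not define __version__ at all.
            report[name] = getattr(mod, "__version__", "unknown")
    return report
```

Printing the items of `collect_versions()` gives one `name: version` pair per line. It does not replace `pd.show_versions()` or the R-side commands, which report considerably more detail.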
/.github/ISSUE_TEMPLATE/feature_request.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Feature request
3 | about: Suggest an idea for this project
4 | title: ''
5 | labels: enhancement
6 | assignees: ''
7 |
8 | ---
9 |
10 | **Is your feature request related to a problem? Please describe.**
11 | A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
12 |
13 | **Describe the solution you'd like**
14 | A clear and concise description of what you want to happen.
15 |
16 | **Describe alternatives you've considered**
17 | A clear and concise description of any alternative solutions or features you've considered.
18 |
19 | **Additional context**
20 | Add any other context or screenshots about the feature request here.
21 |
--------------------------------------------------------------------------------
/.github/ISSUE_TEMPLATE/question.md:
--------------------------------------------------------------------------------
1 | ---
2 | name: Question
3 | about: Ask questions about MAGIC
4 | title: ''
5 | labels: question
6 | assignees: ''
7 |
8 | ---
9 |
--------------------------------------------------------------------------------
/.github/workflows/deploy.yml:
--------------------------------------------------------------------------------
1 | name: Publish Python distributions to PyPI
2 |
3 | on:
4 |   push:
5 |     branches:
6 |       - 'master'
7 |       - 'test_deploy'
8 |     tags:
9 |       - '*'
10 |
11 | jobs:
12 |   build-n-publish:
13 |     name: Build and publish Python distributions to PyPI
14 |     runs-on: ubuntu-latest
15 |
16 |     steps:
17 |       - uses: actions/checkout@master
18 |
19 |       - name: Set up Python 3.7
20 |         uses: actions/setup-python@v2
21 |         with:
22 |           python-version: 3.7
23 |
24 |       - name: Install pypa/build
25 |         run: >-
26 |           cd python &&
27 |           python -m
28 |           pip install
29 |           build
30 |           --user &&
31 |           cd ..
32 |
33 |       - name: Build a binary wheel and a source tarball
34 |         run: >-
35 |           cd python &&
36 |           python -m
37 |           build
38 |           --sdist
39 |           --wheel
40 |           --outdir dist/
41 |           . &&
42 |           cd ..
43 |
44 |       - name: Publish distribution to Test PyPI
45 |         uses: pypa/gh-action-pypi-publish@master
46 |         with:
47 |           packages_dir: python/dist
48 |           skip_existing: true
49 |           password: ${{ secrets.test_pypi_password }}
50 |           repository_url: https://test.pypi.org/legacy/
51 |
52 |       - name: Publish distribution to PyPI
53 |         if: startsWith(github.ref, 'refs/tags')
54 |         uses: pypa/gh-action-pypi-publish@master
55 |         with:
56 |           packages_dir: python/dist
57 |           password: ${{ secrets.pypi_password }}
58 |
--------------------------------------------------------------------------------
/.github/workflows/pre-commit.yml:
--------------------------------------------------------------------------------
1 | name: pre-commit
2 | on:
3 |   push:
4 |     branches-ignore:
5 |       - 'master'
6 |
7 | jobs:
8 |   pre-commit:
9 |     runs-on: ubuntu-latest
10 |
11 |     steps:
12 |       - name: Cancel Previous Runs
13 |         uses: styfle/cancel-workflow-action@0.6.0
14 |         with:
15 |           access_token: ${{ github.token }}
16 |       - uses: actions/checkout@v2
17 |         with:
18 |           fetch-depth: 0
19 |
20 |       - name: Set up environment
21 |         run: |
22 |           echo "UBUNTU_VERSION=`grep DISTRIB_RELEASE /etc/lsb-release | sed 's/.*=//g'`" >> $GITHUB_ENV
23 |           mkdir -p .local/R/site-packages
24 |           echo "R_LIBS_USER=`pwd`/.local/R/site-packages" >> $GITHUB_ENV
25 |
26 |       - name: Install system dependencies
27 |         if: runner.os == 'Linux'
28 |         run: |
29 |           sudo apt-get update -qq
30 |           sudo apt-get install -y libcurl4-openssl-dev
31 |
32 |       - name: Set up Python
33 |         uses: actions/setup-python@v2
34 |         with:
35 |           python-version: "3.8"
36 |           architecture: "x64"
37 |
38 |       - name: Cache pre-commit
39 |         uses: actions/cache@v2
40 |         with:
41 |           path: ~/.cache/pre-commit
42 |           key: pre-commit-${{ hashFiles('.pre-commit-config.yaml') }}-
43 |
44 |       - name: Run pre-commit
45 |         uses: pre-commit/action@v2.0.0
46 |
47 |       - name: Cache R packages
48 |         uses: actions/cache@v2
49 |         if: startsWith(runner.os, 'Linux')
50 |         with:
51 |           path: ${{env.R_LIBS_USER}}
52 |           key: precommit-${{env.UBUNTU_VERSION}}-renv-${{ hashFiles('Rmagic/.pre-commit.r_requirements.txt') }}-${{ hashFiles('Rmagic/DESCRIPTION') }}-
53 |           restore-keys: |
54 |             precommit-${{env.UBUNTU_VERSION}}-renv-${{ hashFiles('Rmagic/.pre-commit.r_requirements.txt') }}-
55 |             precommit-${{env.UBUNTU_VERSION}}-renv-
56 |
57 |       - name: Install R packages
58 |         run: |
59 |           if (!requireNamespace("renv", quietly = TRUE)) install.packages("renv")
60 |           con = file("Rmagic/.pre-commit.r_requirements.txt", "r")
61 |           while ( length(pkg <- readLines(con, n = 1)) > 0 ) {
62 |             renv::install(pkg)
63 |           }
64 |           close(con)
65 |           if (!require("devtools")) install.packages("devtools", repos="http://cloud.r-project.org")
66 |           devtools::install_dev_deps("./Rmagic", upgrade=TRUE)
67 |           devtools::install("./Rmagic")
68 |         shell: Rscript {0}
69 |
70 |       - name: Run pre-commit for R
71 |         run: |
72 |           cd Rmagic
73 |           git init
74 |           git add *
75 |           pre-commit run --all-files
76 |           rm -rf .git
77 |           cd ..
78 |
79 |       - name: Commit files
80 |         if: failure()
81 |         run: |
82 |           git checkout -- .github/workflows
83 |           if [[ `git status --porcelain --untracked-files=no` ]]; then
84 |             git config --local user.email "41898282+github-actions[bot]@users.noreply.github.com"
85 |             git config --local user.name "github-actions[bot]"
86 |             git commit -m "pre-commit" -a
87 |           fi
88 |
89 |       - name: Push changes
90 |         if: failure()
91 |         uses: ad-m/github-push-action@master
92 |         with:
93 |           github_token: ${{ secrets.GITHUB_TOKEN }}
94 |           branch: ${{ github.ref }}
95 |
--------------------------------------------------------------------------------
/.github/workflows/run_tests.yml:
--------------------------------------------------------------------------------
1 | name: Unit Tests
2 |
3 | on:
4 |   push:
5 |     branches-ignore:
6 |       - 'test_deploy'
7 |   pull_request:
8 |     branches:
9 |       - '*'
10 |
11 | jobs:
12 |
13 |   test_python:
14 |     runs-on: ${{ matrix.config.os }}
15 |     if: "!contains(github.event.head_commit.message, 'ci skip')"
16 |
17 |     strategy:
18 |       fail-fast: false
19 |       matrix:
20 |         config:
21 |           - {name: '3.9', os: ubuntu-latest, python: '3.9' }
22 |           - {name: '3.8', os: ubuntu-latest, python: '3.8' }
23 |           - {name: '3.7', os: ubuntu-latest, python: '3.7' }
24 |           - {name: '3.6', os: ubuntu-latest, python: '3.6' }
25 |
26 |     steps:
27 |       - name: Cancel Previous Runs
28 |         uses: styfle/cancel-workflow-action@0.6.0
29 |         with:
30 |           access_token: ${{ github.token }}
31 |
32 |       - name: Check Ubuntu version
33 |         run: |
34 |           echo "UBUNTU_VERSION=`grep DISTRIB_RELEASE /etc/lsb-release | sed 's/.*=//g'`" >> $GITHUB_ENV
35 |
36 |       - uses: actions/checkout@v2
37 |
38 |       - name: Set up Python
39 |         uses: actions/setup-python@v2
40 |         with:
41 |           python-version: ${{ matrix.config.python }}
42 |
43 |       - name: Cache Python packages
44 |         uses: actions/cache@v2
45 |         with:
46 |           path: ${{ env.pythonLocation }}
47 |           key: ${{runner.os}}-pip-${{ env.pythonLocation }}-${{ hashFiles('python/setup.py') }}
48 |           restore-keys: ${{runner.os}}-pip-${{ env.pythonLocation }}-
49 |
50 |       - name: Install package & dependencies
51 |         run: |
52 |           python -m pip install --upgrade pip
53 |           pip install -U wheel setuptools
54 |           pip install -U ./python[test]
55 |           python -c "import magic"
56 |
57 |       - name: Run Python tests
58 |         run: |
59 |           cd python
60 |           nose2 -vvv
61 |           cd ..
62 |
63 |       - name: Build docs
64 |         run: |
65 |           cd python
66 |           pip install .[doc]
67 |           cd doc
68 |           make html
69 |           cd ../..
70 |
71 |       - name: Coveralls
72 |         env:
73 |           GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
74 |           COVERALLS_SERVICE_NAME: github
75 |         run: |
76 |           coveralls
77 |
78 |       - name: Upload check results on fail
79 |         if: failure()
80 |         uses: actions/upload-artifact@master
81 |         with:
82 |           name: ${{ matrix.config.name }}_results
83 |           path: check
84 |
85 |   test_r:
86 |     runs-on: ${{ matrix.config.os }}
87 |     if: "!contains(github.event.head_commit.message, 'ci skip')"
88 |
89 |     strategy:
90 |       fail-fast: false
91 |       matrix:
92 |         config:
93 |           - {name: 'devel', os: ubuntu-latest, r: 'devel' }
94 |           - {name: 'release', os: ubuntu-latest, r: 'release' }
95 |
96 |     steps:
97 |       - name: Cancel Previous Runs
98 |         uses: styfle/cancel-workflow-action@0.6.0
99 |         with:
100 |           access_token: ${{ github.token }}
101 |
102 |       - name: Set up environment
103 |         run: |
104 |           echo "UBUNTU_VERSION=`grep DISTRIB_RELEASE /etc/lsb-release | sed 's/.*=//g'`" >> $GITHUB_ENV
105 |           mkdir -p .local/R/site-packages
106 |           echo "R_LIBS_USER=`pwd`/.local/R/site-packages" >> $GITHUB_ENV
107 |
108 |       - uses: actions/checkout@v2
109 |
110 |       - name: Set up Python
111 |         uses: actions/setup-python@v2
112 |         with:
113 |           python-version: "3.8"
114 |
115 |       - name: Install system dependencies
116 |         if: runner.os == 'Linux'
117 |         run: |
118 |           sudo apt-get update -qq
119 |           sudo apt-get install -y libcurl4-openssl-dev pandoc
120 |
121 |       - name: Cache Python packages
122 |         uses: actions/cache@v2
123 |         with:
124 |           path: ${{ env.pythonLocation }}
125 |           key: ${{runner.os}}-pip-${{ env.pythonLocation }}-${{ hashFiles('python/setup.py') }}
126 |           restore-keys: ${{runner.os}}-pip-${{ env.pythonLocation }}-
127 |
128 |       - name: Install package & dependencies
129 |         run: |
130 |           python -m pip install --upgrade pip
131 |           pip install -U wheel setuptools
132 |           pip install -U ./python
133 |           python -c "import magic"
134 |
135 |       - name: Set up R
136 |         id: setup-r
137 |         uses: r-lib/actions/setup-r@v1
138 |         with:
139 |           r-version: ${{ matrix.config.r }}
140 |
141 |       - name: Cache R packages
142 |         uses: actions/cache@v2
143 |         if: startsWith(runner.os, 'Linux')
144 |         with:
145 |           path: ${{env.R_LIBS_USER}}
146 |           key: test-${{env.UBUNTU_VERSION}}-renv-${{ steps.setup-r.outputs.installed-r-version }}-${{ hashFiles('Rmagic/DESCRIPTION') }}-
147 |           restore-keys: |
148 |             test-${{env.UBUNTU_VERSION}}-renv-${{ steps.setup-r.outputs.installed-r-version }}-
149 |
150 |       - name: Install R packages
151 |         run: |
152 |           if (!require("devtools")) install.packages("devtools", repos="http://cloud.r-project.org")
153 |           devtools::install_dev_deps("./Rmagic", upgrade=TRUE)
154 |           devtools::install("./Rmagic")
155 |         shell: Rscript {0}
156 |
157 |       - name: Install tinytex
158 |         uses: r-lib/actions/setup-tinytex@v1
159 |
160 |       - name: Run R tests
161 |         run: |
162 |           cd Rmagic
163 |           R CMD build .
164 |           R CMD check --as-cran *.tar.gz
165 |           cd ..
166 |
167 |       - name: Upload check results on fail
168 |         if: failure()
169 |         uses: actions/upload-artifact@master
170 |         with:
171 |           name: ${{ matrix.config.name }}_results
172 |           path: check
173 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | # R project files
2 |
3 | .Rproj.user
4 | *.Rproj
5 | .Rhistory
6 | .RData
7 | .Ruserdata
8 |
9 | # R installation files
10 |
11 | build
12 | dist
13 |
14 | # Python installation files
15 |
16 | python/*.o
17 | python/*.so
18 | python/*.dll
19 | python/*.egg-info
20 | python/magic/__pycache__
21 | python/magic/*.pyc
22 | python/tutorial_notebooks/.ipynb_checkpoints
23 | __pycache__
24 | .eggs
25 |
26 |
27 | matlab/EMT.csv
28 |
29 | # Mac detritus
30 |
31 | *~
32 | ~$*
33 | .DS_Store
34 |
--------------------------------------------------------------------------------
/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
1 | repos:
2 |   - repo: https://github.com/pre-commit/pre-commit-hooks
3 |     rev: v3.3.0
4 |     hooks:
5 |       - id: check-yaml
6 |       - id: end-of-file-fixer
7 |       - id: trailing-whitespace
8 |         exclude: \.(ai|gz)$
9 |   - repo: https://github.com/timothycrosley/isort
10 |     rev: 5.6.4
11 |     hooks:
12 |       - id: isort
13 |   - repo: https://github.com/psf/black
14 |     rev: 20.8b1
15 |     hooks:
16 |       - id: black
17 |         args: ['--target-version', 'py36']
18 |   - repo: https://github.com/pre-commit/mirrors-autopep8
19 |     rev: v1.5.4
20 |     hooks:
21 |       - id: autopep8
22 |   - repo: https://gitlab.com/pycqa/flake8
23 |     rev: 3.8.4
24 |     hooks:
25 |       - id: flake8
26 |         additional_dependencies: ['hacking']
27 |
--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
1 |
2 | Contributing to MAGIC
3 | ============================
4 |
5 | There are many ways to contribute to `MAGIC`, with the most common ones
6 | being contribution of code or documentation to the project. Improving the
7 | documentation is no less important than improving the library itself. If you
8 | find a typo in the documentation, or have made improvements, do not hesitate to
9 | submit a GitHub pull request.
10 |
11 | But there are many other ways to help. In particular, answering queries on the
12 | [issue tracker](https://github.com/KrishnaswamyLab/MAGIC/issues),
13 | investigating bugs, and [reviewing other developers' pull
14 | requests](https://github.com/KrishnaswamyLab/MAGIC/pulls)
15 | are very valuable contributions that decrease the burden on the project
16 | maintainers.
17 |
18 | Another way to contribute is to report issues you're facing, and give a "thumbs
19 | up" on issues that others reported and that are relevant to you. It also helps
20 | us if you spread the word: reference the project from your blog and articles,
21 | link to it from your website, or simply star it on GitHub to say "I use it".
22 |
23 | Code Style and Testing
24 | ----------------------
25 |
26 | Contributors are encouraged to write tests for their code, but if you do not know how to do so, please do not feel discouraged from contributing code! Others can always help you test your contribution.
27 |
28 | Code style is dictated by [`black`](https://pypi.org/project/black/#installation-and-usage) and the [OpenStack style guidelines](https://docs.openstack.org/hacking/latest/user/hacking.html#styleguide). Styling is automatically applied by [`pre-commit`](https://github.com/pre-commit/pre-commit).
29 |
30 | Code of Conduct
31 | ---------------
32 |
33 | We abide by the principles of openness, respect, and consideration of others
34 | of the Python Software Foundation: https://www.python.org/psf/codeofconduct/.
35 |
36 | Attribution
37 | ---------------
38 |
39 | This `CONTRIBUTING.md` was adapted from [scikit-learn](https://github.com/scikit-learn/scikit-learn/blob/master/CONTRIBUTING.md).
40 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | GNU GENERAL PUBLIC LICENSE
2 | Version 2, June 1991
3 |
4 | Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
5 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
6 | Everyone is permitted to copy and distribute verbatim copies
7 | of this license document, but changing it is not allowed.
8 |
9 | Preamble
10 |
11 | The licenses for most software are designed to take away your
12 | freedom to share and change it. By contrast, the GNU General Public
13 | License is intended to guarantee your freedom to share and change free
14 | software--to make sure the software is free for all its users. This
15 | General Public License applies to most of the Free Software
16 | Foundation's software and to any other program whose authors commit to
17 | using it. (Some other Free Software Foundation software is covered by
18 | the GNU Lesser General Public License instead.) You can apply it to
19 | your programs, too.
20 |
21 | When we speak of free software, we are referring to freedom, not
22 | price. Our General Public Licenses are designed to make sure that you
23 | have the freedom to distribute copies of free software (and charge for
24 | this service if you wish), that you receive source code or can get it
25 | if you want it, that you can change the software or use pieces of it
26 | in new free programs; and that you know you can do these things.
27 |
28 | To protect your rights, we need to make restrictions that forbid
29 | anyone to deny you these rights or to ask you to surrender the rights.
30 | These restrictions translate to certain responsibilities for you if you
31 | distribute copies of the software, or if you modify it.
32 |
33 | For example, if you distribute copies of such a program, whether
34 | gratis or for a fee, you must give the recipients all the rights that
35 | you have. You must make sure that they, too, receive or can get the
36 | source code. And you must show them these terms so they know their
37 | rights.
38 |
39 | We protect your rights with two steps: (1) copyright the software, and
40 | (2) offer you this license which gives you legal permission to copy,
41 | distribute and/or modify the software.
42 |
43 | Also, for each author's protection and ours, we want to make certain
44 | that everyone understands that there is no warranty for this free
45 | software. If the software is modified by someone else and passed on, we
46 | want its recipients to know that what they have is not the original, so
47 | that any problems introduced by others will not reflect on the original
48 | authors' reputations.
49 |
50 | Finally, any free program is threatened constantly by software
51 | patents. We wish to avoid the danger that redistributors of a free
52 | program will individually obtain patent licenses, in effect making the
53 | program proprietary. To prevent this, we have made it clear that any
54 | patent must be licensed for everyone's free use or not licensed at all.
55 |
56 | The precise terms and conditions for copying, distribution and
57 | modification follow.
58 |
59 | GNU GENERAL PUBLIC LICENSE
60 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
61 |
62 | 0. This License applies to any program or other work which contains
63 | a notice placed by the copyright holder saying it may be distributed
64 | under the terms of this General Public License. The "Program", below,
65 | refers to any such program or work, and a "work based on the Program"
66 | means either the Program or any derivative work under copyright law:
67 | that is to say, a work containing the Program or a portion of it,
68 | either verbatim or with modifications and/or translated into another
69 | language. (Hereinafter, translation is included without limitation in
70 | the term "modification".) Each licensee is addressed as "you".
71 |
72 | Activities other than copying, distribution and modification are not
73 | covered by this License; they are outside its scope. The act of
74 | running the Program is not restricted, and the output from the Program
75 | is covered only if its contents constitute a work based on the
76 | Program (independent of having been made by running the Program).
77 | Whether that is true depends on what the Program does.
78 |
79 | 1. You may copy and distribute verbatim copies of the Program's
80 | source code as you receive it, in any medium, provided that you
81 | conspicuously and appropriately publish on each copy an appropriate
82 | copyright notice and disclaimer of warranty; keep intact all the
83 | notices that refer to this License and to the absence of any warranty;
84 | and give any other recipients of the Program a copy of this License
85 | along with the Program.
86 |
87 | You may charge a fee for the physical act of transferring a copy, and
88 | you may at your option offer warranty protection in exchange for a fee.
89 |
90 | 2. You may modify your copy or copies of the Program or any portion
91 | of it, thus forming a work based on the Program, and copy and
92 | distribute such modifications or work under the terms of Section 1
93 | above, provided that you also meet all of these conditions:
94 |
95 | a) You must cause the modified files to carry prominent notices
96 | stating that you changed the files and the date of any change.
97 |
98 | b) You must cause any work that you distribute or publish, that in
99 | whole or in part contains or is derived from the Program or any
100 | part thereof, to be licensed as a whole at no charge to all third
101 | parties under the terms of this License.
102 |
103 | c) If the modified program normally reads commands interactively
104 | when run, you must cause it, when started running for such
105 | interactive use in the most ordinary way, to print or display an
106 | announcement including an appropriate copyright notice and a
107 | notice that there is no warranty (or else, saying that you provide
108 | a warranty) and that users may redistribute the program under
109 | these conditions, and telling the user how to view a copy of this
110 | License. (Exception: if the Program itself is interactive but
111 | does not normally print such an announcement, your work based on
112 | the Program is not required to print an announcement.)
113 |
114 | These requirements apply to the modified work as a whole. If
115 | identifiable sections of that work are not derived from the Program,
116 | and can be reasonably considered independent and separate works in
117 | themselves, then this License, and its terms, do not apply to those
118 | sections when you distribute them as separate works. But when you
119 | distribute the same sections as part of a whole which is a work based
120 | on the Program, the distribution of the whole must be on the terms of
121 | this License, whose permissions for other licensees extend to the
122 | entire whole, and thus to each and every part regardless of who wrote it.
123 |
124 | Thus, it is not the intent of this section to claim rights or contest
125 | your rights to work written entirely by you; rather, the intent is to
126 | exercise the right to control the distribution of derivative or
127 | collective works based on the Program.
128 |
129 | In addition, mere aggregation of another work not based on the Program
130 | with the Program (or with a work based on the Program) on a volume of
131 | a storage or distribution medium does not bring the other work under
132 | the scope of this License.
133 |
134 | 3. You may copy and distribute the Program (or a work based on it,
135 | under Section 2) in object code or executable form under the terms of
136 | Sections 1 and 2 above provided that you also do one of the following:
137 |
138 | a) Accompany it with the complete corresponding machine-readable
139 | source code, which must be distributed under the terms of Sections
140 | 1 and 2 above on a medium customarily used for software interchange; or,
141 |
142 | b) Accompany it with a written offer, valid for at least three
143 | years, to give any third party, for a charge no more than your
144 | cost of physically performing source distribution, a complete
145 | machine-readable copy of the corresponding source code, to be
146 | distributed under the terms of Sections 1 and 2 above on a medium
147 | customarily used for software interchange; or,
148 |
149 | c) Accompany it with the information you received as to the offer
150 | to distribute corresponding source code. (This alternative is
151 | allowed only for noncommercial distribution and only if you
152 | received the program in object code or executable form with such
153 | an offer, in accord with Subsection b above.)
154 |
155 | The source code for a work means the preferred form of the work for
156 | making modifications to it. For an executable work, complete source
157 | code means all the source code for all modules it contains, plus any
158 | associated interface definition files, plus the scripts used to
159 | control compilation and installation of the executable. However, as a
160 | special exception, the source code distributed need not include
161 | anything that is normally distributed (in either source or binary
162 | form) with the major components (compiler, kernel, and so on) of the
163 | operating system on which the executable runs, unless that component
164 | itself accompanies the executable.
165 |
166 | If distribution of executable or object code is made by offering
167 | access to copy from a designated place, then offering equivalent
168 | access to copy the source code from the same place counts as
169 | distribution of the source code, even though third parties are not
170 | compelled to copy the source along with the object code.
171 |
172 | 4. You may not copy, modify, sublicense, or distribute the Program
173 | except as expressly provided under this License. Any attempt
174 | otherwise to copy, modify, sublicense or distribute the Program is
175 | void, and will automatically terminate your rights under this License.
176 | However, parties who have received copies, or rights, from you under
177 | this License will not have their licenses terminated so long as such
178 | parties remain in full compliance.
179 |
180 | 5. You are not required to accept this License, since you have not
181 | signed it. However, nothing else grants you permission to modify or
182 | distribute the Program or its derivative works. These actions are
183 | prohibited by law if you do not accept this License. Therefore, by
184 | modifying or distributing the Program (or any work based on the
185 | Program), you indicate your acceptance of this License to do so, and
186 | all its terms and conditions for copying, distributing or modifying
187 | the Program or works based on it.
188 |
189 | 6. Each time you redistribute the Program (or any work based on the
190 | Program), the recipient automatically receives a license from the
191 | original licensor to copy, distribute or modify the Program subject to
192 | these terms and conditions. You may not impose any further
193 | restrictions on the recipients' exercise of the rights granted herein.
194 | You are not responsible for enforcing compliance by third parties to
195 | this License.
196 |
197 | 7. If, as a consequence of a court judgment or allegation of patent
198 | infringement or for any other reason (not limited to patent issues),
199 | conditions are imposed on you (whether by court order, agreement or
200 | otherwise) that contradict the conditions of this License, they do not
201 | excuse you from the conditions of this License. If you cannot
202 | distribute so as to satisfy simultaneously your obligations under this
203 | License and any other pertinent obligations, then as a consequence you
204 | may not distribute the Program at all. For example, if a patent
205 | license would not permit royalty-free redistribution of the Program by
206 | all those who receive copies directly or indirectly through you, then
207 | the only way you could satisfy both it and this License would be to
208 | refrain entirely from distribution of the Program.
209 |
210 | If any portion of this section is held invalid or unenforceable under
211 | any particular circumstance, the balance of the section is intended to
212 | apply and the section as a whole is intended to apply in other
213 | circumstances.
214 |
215 | It is not the purpose of this section to induce you to infringe any
216 | patents or other property right claims or to contest validity of any
217 | such claims; this section has the sole purpose of protecting the
218 | integrity of the free software distribution system, which is
219 | implemented by public license practices. Many people have made
220 | generous contributions to the wide range of software distributed
221 | through that system in reliance on consistent application of that
222 | system; it is up to the author/donor to decide if he or she is willing
223 | to distribute software through any other system and a licensee cannot
224 | impose that choice.
225 |
226 | This section is intended to make thoroughly clear what is believed to
227 | be a consequence of the rest of this License.
228 |
229 | 8. If the distribution and/or use of the Program is restricted in
230 | certain countries either by patents or by copyrighted interfaces, the
231 | original copyright holder who places the Program under this License
232 | may add an explicit geographical distribution limitation excluding
233 | those countries, so that distribution is permitted only in or among
234 | countries not thus excluded. In such case, this License incorporates
235 | the limitation as if written in the body of this License.
236 |
237 | 9. The Free Software Foundation may publish revised and/or new versions
238 | of the General Public License from time to time. Such new versions will
239 | be similar in spirit to the present version, but may differ in detail to
240 | address new problems or concerns.
241 |
242 | Each version is given a distinguishing version number. If the Program
243 | specifies a version number of this License which applies to it and "any
244 | later version", you have the option of following the terms and conditions
245 | either of that version or of any later version published by the Free
246 | Software Foundation. If the Program does not specify a version number of
247 | this License, you may choose any version ever published by the Free Software
248 | Foundation.
249 |
250 | 10. If you wish to incorporate parts of the Program into other free
251 | programs whose distribution conditions are different, write to the author
252 | to ask for permission. For software which is copyrighted by the Free
253 | Software Foundation, write to the Free Software Foundation; we sometimes
254 | make exceptions for this. Our decision will be guided by the two goals
255 | of preserving the free status of all derivatives of our free software and
256 | of promoting the sharing and reuse of software generally.
257 |
258 | NO WARRANTY
259 |
260 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
261 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
262 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
263 | PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
264 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
265 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
266 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
267 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
268 | REPAIR OR CORRECTION.
269 |
270 | 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
271 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
272 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
273 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
274 | OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
275 | TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
276 | YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
277 | PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
278 | POSSIBILITY OF SUCH DAMAGES.
279 |
280 | END OF TERMS AND CONDITIONS
281 |
282 | How to Apply These Terms to Your New Programs
283 |
284 | If you develop a new program, and you want it to be of the greatest
285 | possible use to the public, the best way to achieve this is to make it
286 | free software which everyone can redistribute and change under these terms.
287 |
288 | To do so, attach the following notices to the program. It is safest
289 | to attach them to the start of each source file to most effectively
290 | convey the exclusion of warranty; and each file should have at least
291 | the "copyright" line and a pointer to where the full notice is found.
292 |
293 | {description}
294 | Copyright (C) {year} {fullname}
295 |
296 | This program is free software; you can redistribute it and/or modify
297 | it under the terms of the GNU General Public License as published by
298 | the Free Software Foundation; either version 2 of the License, or
299 | (at your option) any later version.
300 |
301 | This program is distributed in the hope that it will be useful,
302 | but WITHOUT ANY WARRANTY; without even the implied warranty of
303 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
304 | GNU General Public License for more details.
305 |
306 | You should have received a copy of the GNU General Public License along
307 | with this program; if not, write to the Free Software Foundation, Inc.,
308 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
309 |
310 | Also add information on how to contact you by electronic and paper mail.
311 |
312 | If the program is interactive, make it output a short notice like this
313 | when it starts in an interactive mode:
314 |
315 | Gnomovision version 69, Copyright (C) year name of author
316 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
317 | This is free software, and you are welcome to redistribute it
318 | under certain conditions; type `show c' for details.
319 |
320 | The hypothetical commands `show w' and `show c' should show the appropriate
321 | parts of the General Public License. Of course, the commands you use may
322 | be called something other than `show w' and `show c'; they could even be
323 | mouse-clicks or menu items--whatever suits your program.
324 |
325 | You should also get your employer (if you work as a programmer) or your
326 | school, if any, to sign a "copyright disclaimer" for the program, if
327 | necessary. Here is a sample; alter the names:
328 |
329 | Yoyodyne, Inc., hereby disclaims all copyright interest in the program
330 | `Gnomovision' (which makes passes at compilers) written by James Hacker.
331 |
332 | {signature of Ty Coon}, 1 April 1989
333 | Ty Coon, President of Vice
334 |
335 | This General Public License does not permit incorporating your program into
336 | proprietary programs. If your program is a subroutine library, you may
337 | consider it more useful to permit linking proprietary applications with the
338 | library. If this is what you want to do, use the GNU Lesser General
339 | Public License instead of this License.
340 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | Markov Affinity-based Graph Imputation of Cells (MAGIC)
2 | -------------------------------------------------------
3 |
4 | [PyPI package](https://pypi.org/project/magic-impute/)
5 | [CRAN package](https://cran.r-project.org/package=Rmagic)
6 | [GitHub Actions builds](https://github.com/KrishnaswamyLab/MAGIC/actions)
7 | [Documentation](https://magic.readthedocs.io/)
8 | [Cell publication](https://www.cell.com/cell/abstract/S0092-8674(18)30724-4)
9 | [Twitter](https://twitter.com/KrishnaswamyLab)
10 | [GitHub repository](https://github.com/KrishnaswamyLab/MAGIC/)
11 |
12 | Markov Affinity-based Graph Imputation of Cells (MAGIC) is an algorithm for denoising high-dimensional data, most commonly applied to single-cell RNA sequencing data. MAGIC learns the manifold underlying the data and uses the resultant graph to smooth the features and restore the structure of the data.
13 |
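The graph-smoothing idea described above can be illustrated with a toy sketch. This is not the MAGIC implementation: it assumes a plain Gaussian affinity between cells and a fixed number of diffusion steps, and the names `diffuse`, `sigma`, and `t` are hypothetical choices for this example only.

```python
import math


def diffuse(cells, sigma=1.0, t=3):
    """Smooth a cells-by-features table with t steps of a Markov random walk.

    Illustrative only: `diffuse`, `sigma`, and `t` are hypothetical names for
    this sketch, not the MAGIC API.
    """
    n_cells = len(cells)
    n_features = len(cells[0])

    # Gaussian affinity between every pair of cells (closer cells -> larger weight).
    affinity = []
    for a in cells:
        row = []
        for b in cells:
            sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
            row.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
        affinity.append(row)

    # Row-normalize so each row sums to 1, giving a Markov transition matrix.
    markov = [[w / sum(row) for w in row] for row in affinity]

    # Each diffusion step replaces every cell with a weighted average of all cells.
    smoothed = [list(row) for row in cells]
    for _ in range(t):
        smoothed = [
            [
                sum(markov[i][k] * smoothed[k][f] for k in range(n_cells))
                for f in range(n_features)
            ]
            for i in range(n_cells)
        ]
    return smoothed


# Toy data: three similar cells, one of which has a zero ("dropout") in the
# second feature, plus two cells from a distant group.
toy = [[1.0, 1.0], [1.1, 0.0], [0.9, 1.0], [5.0, 5.0], [5.1, 5.0]]
smoothed_toy = diffuse(toy)
```

In this toy setting the zero entry is pulled toward the values of its near neighbours, while the distant group contributes almost nothing because its affinity is tiny. The kernel and the choice of diffusion time in the real packages may differ from this sketch.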
14 | To see how MAGIC can be applied to single-cell RNA-seq, elucidating the epithelial-to-mesenchymal transition, read our [publication in Cell](https://www.cell.com/cell/abstract/S0092-8674(18)30724-4).
15 |
16 | [David van Dijk, et al. **Recovering Gene Interactions from Single-Cell Data Using Data Diffusion**. 2018. *Cell*.](https://www.cell.com/cell/abstract/S0092-8674(18)30724-4)
17 |
18 | MAGIC has been implemented in Python, Matlab, and R.
19 |
20 | #### To get started immediately, check out our tutorials:
21 | ##### Python
22 | * [Epithelial-to-Mesenchymal Transition Tutorial](http://nbviewer.jupyter.org/github/KrishnaswamyLab/MAGIC/blob/master/python/tutorial_notebooks/emt_tutorial.ipynb)
23 | * [Bone Marrow Tutorial](http://nbviewer.jupyter.org/github/KrishnaswamyLab/MAGIC/blob/master/python/tutorial_notebooks/bonemarrow_tutorial.ipynb)
24 | ##### R
25 | * [Epithelial-to-Mesenchymal Transition Tutorial](http://htmlpreview.github.io/?https://github.com/KrishnaswamyLab/MAGIC/blob/master/Rmagic/inst/examples/emt_tutorial.html)
26 | * [Bone Marrow Tutorial](http://htmlpreview.github.io/?https://github.com/KrishnaswamyLab/MAGIC/blob/master/Rmagic/inst/examples/bonemarrow_tutorial.html)
27 |
28 |
29 |
30 |
31 |
32 | MAGIC reveals the interaction between Vimentin (VIM), Cadherin-1 (CDH1), and Zinc finger E-box-binding homeobox 1 (ZEB1, encoded by colors).
33 |
34 |
35 |
36 | ### Table of Contents
37 |
38 | * [Python](#python)
39 | * [Installation](#installation)
40 | * [Installation with pip](#installation-with-pip)
41 | * [Installation from GitHub](#installation-from-github)
42 | * [Usage](#usage)
43 | * [Quick Start](#quick-start)
44 | * [Tutorials](#tutorials)
45 | * [Matlab](#matlab)
46 | * [Instructions for the Matlab version](#instructions-for-the-matlab-version)
47 | * [R](#r)
48 | * [Installation](#installation-1)
49 | * [Installation from CRAN](#installation-from-cran)
50 | * [Installation from GitHub](#installation-from-github-1)
51 | * [Usage](#usage-1)
52 | * [Quick Start](#quick-start-1)
53 | * [Tutorials](#tutorials-1)
54 | * [Help](#help)
55 |
56 | ## Python
57 |
58 | ### Installation
59 |
60 | #### Installation with pip
61 |
62 | To install with `pip`, run the following from a terminal:
63 |
64 | pip install --user magic-impute
65 |
66 | #### Installation from GitHub
67 |
68 | To clone the repository and install manually, run the following from a terminal:
69 |
70 | git clone https://github.com/KrishnaswamyLab/MAGIC.git
71 | cd MAGIC/python
72 | python setup.py install --user
73 |
74 | ### Usage
75 |
76 | #### Quick Start
77 |
78 | The following code runs MAGIC on test data located in the MAGIC repository.
79 |
80 | import magic
81 | import pandas as pd
82 | import matplotlib.pyplot as plt
83 | X = pd.read_csv("MAGIC/data/test_data.csv")
84 | magic_operator = magic.MAGIC()
85 | X_magic = magic_operator.fit_transform(X, genes=['VIM', 'CDH1', 'ZEB1'])
86 | plt.scatter(X_magic['VIM'], X_magic['CDH1'], c=X_magic['ZEB1'], s=1, cmap='inferno')
87 | plt.show()
88 | magic.plot.animate_magic(X, gene_x='VIM', gene_y='CDH1', gene_color='ZEB1', operator=magic_operator)
89 |
90 | #### Tutorials
91 |
92 | You can read the MAGIC documentation at https://magic.readthedocs.io/. We have included two tutorial notebooks on MAGIC usage and results visualization for single-cell RNA-seq data.
93 |
94 | EMT data notebook: http://nbviewer.jupyter.org/github/KrishnaswamyLab/MAGIC/blob/master/python/tutorial_notebooks/emt_tutorial.ipynb
95 |
96 | Bone Marrow data notebook: http://nbviewer.jupyter.org/github/KrishnaswamyLab/MAGIC/blob/master/python/tutorial_notebooks/bonemarrow_tutorial.ipynb
97 |
98 | ## Matlab
99 |
100 | ### Instructions for the Matlab version
101 | 1. run_magic.m -- MAGIC imputation function
102 | 2. test_magic.m -- Shows how to run MAGIC. Also included is a function for loading 10x format data (load_10x.m)
103 |
104 | ## R
105 |
106 | ### Installation
107 |
108 | To use MAGIC, you will need to install both the R and Python packages.
109 |
110 | If `python` or `pip` are not installed, you will need to install them. We recommend
111 | [Miniconda3](https://conda.io/miniconda.html) to install Python and `pip` together,
112 | or otherwise you can install `pip` from https://pip.pypa.io/en/stable/installing/.
113 |
114 | #### Installation from CRAN
115 |
116 | In R, run this command to install Rmagic and its R dependencies:
117 |
118 | install.packages("Rmagic")
119 |
120 | In a terminal, run the following command to install the Python
121 | package.
122 |
123 | pip install --user magic-impute
124 |
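Because Rmagic depends on the Python package, it can help to confirm that `magic-impute` is visible to Python before loading the R library. The following is a minimal sketch using only the standard library (Python 3.8 or later); the helper name `installed_version` is hypothetical, and the check only applies to the interpreter that runs it, which may not be the one R ends up using.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


# "magic-impute" is the pip distribution name used in the install command above.
version = installed_version("magic-impute")
if version is None:
    print("magic-impute not found; try: pip install --user magic-impute")
else:
    print(f"magic-impute {version} is installed")
```

If the package is reported as missing even after installation, the likely cause is that `pip` installed it for a different Python interpreter.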
125 | #### Installation from GitHub
126 |
127 | To clone the repository and install manually, run the following from a terminal:
128 |
129 | git clone https://github.com/KrishnaswamyLab/MAGIC.git
130 | cd MAGIC/python
131 | python setup.py install --user
132 | cd ../Rmagic
133 | R CMD INSTALL .
134 |
135 | ### Usage
136 |
137 | #### Quick Start
138 |
139 | After installing the package, MAGIC can be run by loading the library and calling `magic()`:
140 |
141 | library(Rmagic)
142 | library(ggplot2)
143 | data(magic_testdata)
144 | MAGIC_data <- magic(magic_testdata, genes=c("VIM", "CDH1", "ZEB1"))
145 | ggplot(MAGIC_data) +
146 | geom_point(aes(x=VIM, y=CDH1, color=ZEB1))
147 |
148 | #### Tutorials
149 |
150 | You can read the MAGIC documentation by running `help(Rmagic::magic)`. For a working example, see the R tutorials linked at the top of this README, or the Rmarkdown sources in `Rmagic/inst/examples`.
151 |
152 | ## Help
153 |
154 | If you have any questions or require assistance using MAGIC, please contact us at .
155 |
--------------------------------------------------------------------------------
/Rmagic/.Rbuildignore:
--------------------------------------------------------------------------------
1 | ^data-raw$
2 | ^tests$
3 | ^README\.Rmd$
4 | ^\.pre\-commit.*$
5 |
--------------------------------------------------------------------------------
/Rmagic/.pre-commit-config.yaml:
--------------------------------------------------------------------------------
1 | repos:
2 | - repo: https://github.com/pre-commit/pre-commit-hooks
3 | rev: v3.3.0
4 | hooks:
5 | - id: check-yaml
6 | - id: end-of-file-fixer
7 | - id: trailing-whitespace
8 | exclude: \.(ai|gz)$
9 | - repo: https://github.com/lorenzwalthert/precommit
10 | rev: v0.1.3
11 | hooks:
12 | - id: parsable-R
13 | - id: no-browser-statement
14 | - id: readme-rmd-rendered
15 | - id: deps-in-desc
16 | exclude: data\-raw
17 | - id: use-tidy-description
18 | - id: style-files
19 | - id: lintr
20 | args: [--warn_only]
21 | verbose: true
22 | - id: roxygenize
23 |
--------------------------------------------------------------------------------
/Rmagic/.pre-commit.r_requirements.txt:
--------------------------------------------------------------------------------
1 | docopt
2 | styler
3 | git2r
4 | lintr
5 | roxygen2
6 | precommit
7 |
--------------------------------------------------------------------------------
/Rmagic/DESCRIPTION:
--------------------------------------------------------------------------------
1 | Type: Package
2 | Package: Rmagic
3 | Title: MAGIC - Markov Affinity-Based Graph Imputation of Cells
4 | Version: 2.0.3.999
5 | Authors@R:
6 | c(person(given = "David",
7 | family = "van Dijk",
8 | role = "aut",
9 | email = "davidvandijk@gmail.com"),
10 | person(given = "Scott",
11 | family = "Gigante",
12 | role = "cre",
13 | email = "scott.gigante@yale.edu",
14 | comment = c(ORCID = "0000-0002-4544-2764")))
15 | Maintainer: Scott Gigante <scott.gigante@yale.edu>
16 | Description: MAGIC (Markov affinity-based graph imputation of cells) is a
17 | method for addressing technical noise in single-cell data, including
18 | under-sampling of mRNA molecules, often termed "dropout", which can
19 | severely obscure important gene-gene relationships. MAGIC shares
20 | information across similar cells, via data diffusion, to denoise the
21 | cell count matrix and fill in missing transcripts. Read more: van Dijk
22 | et al. (2018) .
23 | License: GPL-2 | file LICENSE
24 | Depends:
25 | Matrix (>= 1.2-0),
26 | R (>= 3.3)
27 | Imports:
28 | ggplot2,
29 | methods,
30 | reticulate (>= 1.4),
31 | stats
32 | Suggests:
33 | phateR,
34 | readr,
35 | Seurat (>= 3.0.0),
36 | viridis
37 | Encoding: UTF-8
38 | LazyData: true
39 | RoxygenNote: 7.1.1
40 |
--------------------------------------------------------------------------------
/Rmagic/LICENSE:
--------------------------------------------------------------------------------
1 | GNU GENERAL PUBLIC LICENSE
2 | Version 2, June 1991
3 |
4 | Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
5 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
6 | Everyone is permitted to copy and distribute verbatim copies
7 | of this license document, but changing it is not allowed.
8 |
9 | Preamble
10 |
11 | The licenses for most software are designed to take away your
12 | freedom to share and change it. By contrast, the GNU General Public
13 | License is intended to guarantee your freedom to share and change free
14 | software--to make sure the software is free for all its users. This
15 | General Public License applies to most of the Free Software
16 | Foundation's software and to any other program whose authors commit to
17 | using it. (Some other Free Software Foundation software is covered by
18 | the GNU Lesser General Public License instead.) You can apply it to
19 | your programs, too.
20 |
21 | When we speak of free software, we are referring to freedom, not
22 | price. Our General Public Licenses are designed to make sure that you
23 | have the freedom to distribute copies of free software (and charge for
24 | this service if you wish), that you receive source code or can get it
25 | if you want it, that you can change the software or use pieces of it
26 | in new free programs; and that you know you can do these things.
27 |
28 | To protect your rights, we need to make restrictions that forbid
29 | anyone to deny you these rights or to ask you to surrender the rights.
30 | These restrictions translate to certain responsibilities for you if you
31 | distribute copies of the software, or if you modify it.
32 |
33 | For example, if you distribute copies of such a program, whether
34 | gratis or for a fee, you must give the recipients all the rights that
35 | you have. You must make sure that they, too, receive or can get the
36 | source code. And you must show them these terms so they know their
37 | rights.
38 |
39 | We protect your rights with two steps: (1) copyright the software, and
40 | (2) offer you this license which gives you legal permission to copy,
41 | distribute and/or modify the software.
42 |
43 | Also, for each author's protection and ours, we want to make certain
44 | that everyone understands that there is no warranty for this free
45 | software. If the software is modified by someone else and passed on, we
46 | want its recipients to know that what they have is not the original, so
47 | that any problems introduced by others will not reflect on the original
48 | authors' reputations.
49 |
50 | Finally, any free program is threatened constantly by software
51 | patents. We wish to avoid the danger that redistributors of a free
52 | program will individually obtain patent licenses, in effect making the
53 | program proprietary. To prevent this, we have made it clear that any
54 | patent must be licensed for everyone's free use or not licensed at all.
55 |
56 | The precise terms and conditions for copying, distribution and
57 | modification follow.
58 |
59 | GNU GENERAL PUBLIC LICENSE
60 | TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
61 |
62 | 0. This License applies to any program or other work which contains
63 | a notice placed by the copyright holder saying it may be distributed
64 | under the terms of this General Public License. The "Program", below,
65 | refers to any such program or work, and a "work based on the Program"
66 | means either the Program or any derivative work under copyright law:
67 | that is to say, a work containing the Program or a portion of it,
68 | either verbatim or with modifications and/or translated into another
69 | language. (Hereinafter, translation is included without limitation in
70 | the term "modification".) Each licensee is addressed as "you".
71 |
72 | Activities other than copying, distribution and modification are not
73 | covered by this License; they are outside its scope. The act of
74 | running the Program is not restricted, and the output from the Program
75 | is covered only if its contents constitute a work based on the
76 | Program (independent of having been made by running the Program).
77 | Whether that is true depends on what the Program does.
78 |
79 | 1. You may copy and distribute verbatim copies of the Program's
80 | source code as you receive it, in any medium, provided that you
81 | conspicuously and appropriately publish on each copy an appropriate
82 | copyright notice and disclaimer of warranty; keep intact all the
83 | notices that refer to this License and to the absence of any warranty;
84 | and give any other recipients of the Program a copy of this License
85 | along with the Program.
86 |
87 | You may charge a fee for the physical act of transferring a copy, and
88 | you may at your option offer warranty protection in exchange for a fee.
89 |
90 | 2. You may modify your copy or copies of the Program or any portion
91 | of it, thus forming a work based on the Program, and copy and
92 | distribute such modifications or work under the terms of Section 1
93 | above, provided that you also meet all of these conditions:
94 |
95 | a) You must cause the modified files to carry prominent notices
96 | stating that you changed the files and the date of any change.
97 |
98 | b) You must cause any work that you distribute or publish, that in
99 | whole or in part contains or is derived from the Program or any
100 | part thereof, to be licensed as a whole at no charge to all third
101 | parties under the terms of this License.
102 |
103 | c) If the modified program normally reads commands interactively
104 | when run, you must cause it, when started running for such
105 | interactive use in the most ordinary way, to print or display an
106 | announcement including an appropriate copyright notice and a
107 | notice that there is no warranty (or else, saying that you provide
108 | a warranty) and that users may redistribute the program under
109 | these conditions, and telling the user how to view a copy of this
110 | License. (Exception: if the Program itself is interactive but
111 | does not normally print such an announcement, your work based on
112 | the Program is not required to print an announcement.)
113 |
114 | These requirements apply to the modified work as a whole. If
115 | identifiable sections of that work are not derived from the Program,
116 | and can be reasonably considered independent and separate works in
117 | themselves, then this License, and its terms, do not apply to those
118 | sections when you distribute them as separate works. But when you
119 | distribute the same sections as part of a whole which is a work based
120 | on the Program, the distribution of the whole must be on the terms of
121 | this License, whose permissions for other licensees extend to the
122 | entire whole, and thus to each and every part regardless of who wrote it.
123 |
124 | Thus, it is not the intent of this section to claim rights or contest
125 | your rights to work written entirely by you; rather, the intent is to
126 | exercise the right to control the distribution of derivative or
127 | collective works based on the Program.
128 |
129 | In addition, mere aggregation of another work not based on the Program
130 | with the Program (or with a work based on the Program) on a volume of
131 | a storage or distribution medium does not bring the other work under
132 | the scope of this License.
133 |
134 | 3. You may copy and distribute the Program (or a work based on it,
135 | under Section 2) in object code or executable form under the terms of
136 | Sections 1 and 2 above provided that you also do one of the following:
137 |
138 | a) Accompany it with the complete corresponding machine-readable
139 | source code, which must be distributed under the terms of Sections
140 | 1 and 2 above on a medium customarily used for software interchange; or,
141 |
142 | b) Accompany it with a written offer, valid for at least three
143 | years, to give any third party, for a charge no more than your
144 | cost of physically performing source distribution, a complete
145 | machine-readable copy of the corresponding source code, to be
146 | distributed under the terms of Sections 1 and 2 above on a medium
147 | customarily used for software interchange; or,
148 |
149 | c) Accompany it with the information you received as to the offer
150 | to distribute corresponding source code. (This alternative is
151 | allowed only for noncommercial distribution and only if you
152 | received the program in object code or executable form with such
153 | an offer, in accord with Subsection b above.)
154 |
155 | The source code for a work means the preferred form of the work for
156 | making modifications to it. For an executable work, complete source
157 | code means all the source code for all modules it contains, plus any
158 | associated interface definition files, plus the scripts used to
159 | control compilation and installation of the executable. However, as a
160 | special exception, the source code distributed need not include
161 | anything that is normally distributed (in either source or binary
162 | form) with the major components (compiler, kernel, and so on) of the
163 | operating system on which the executable runs, unless that component
164 | itself accompanies the executable.
165 |
166 | If distribution of executable or object code is made by offering
167 | access to copy from a designated place, then offering equivalent
168 | access to copy the source code from the same place counts as
169 | distribution of the source code, even though third parties are not
170 | compelled to copy the source along with the object code.
171 |
172 | 4. You may not copy, modify, sublicense, or distribute the Program
173 | except as expressly provided under this License. Any attempt
174 | otherwise to copy, modify, sublicense or distribute the Program is
175 | void, and will automatically terminate your rights under this License.
176 | However, parties who have received copies, or rights, from you under
177 | this License will not have their licenses terminated so long as such
178 | parties remain in full compliance.
179 |
180 | 5. You are not required to accept this License, since you have not
181 | signed it. However, nothing else grants you permission to modify or
182 | distribute the Program or its derivative works. These actions are
183 | prohibited by law if you do not accept this License. Therefore, by
184 | modifying or distributing the Program (or any work based on the
185 | Program), you indicate your acceptance of this License to do so, and
186 | all its terms and conditions for copying, distributing or modifying
187 | the Program or works based on it.
188 |
189 | 6. Each time you redistribute the Program (or any work based on the
190 | Program), the recipient automatically receives a license from the
191 | original licensor to copy, distribute or modify the Program subject to
192 | these terms and conditions. You may not impose any further
193 | restrictions on the recipients' exercise of the rights granted herein.
194 | You are not responsible for enforcing compliance by third parties to
195 | this License.
196 |
197 | 7. If, as a consequence of a court judgment or allegation of patent
198 | infringement or for any other reason (not limited to patent issues),
199 | conditions are imposed on you (whether by court order, agreement or
200 | otherwise) that contradict the conditions of this License, they do not
201 | excuse you from the conditions of this License. If you cannot
202 | distribute so as to satisfy simultaneously your obligations under this
203 | License and any other pertinent obligations, then as a consequence you
204 | may not distribute the Program at all. For example, if a patent
205 | license would not permit royalty-free redistribution of the Program by
206 | all those who receive copies directly or indirectly through you, then
207 | the only way you could satisfy both it and this License would be to
208 | refrain entirely from distribution of the Program.
209 |
210 | If any portion of this section is held invalid or unenforceable under
211 | any particular circumstance, the balance of the section is intended to
212 | apply and the section as a whole is intended to apply in other
213 | circumstances.
214 |
215 | It is not the purpose of this section to induce you to infringe any
216 | patents or other property right claims or to contest validity of any
217 | such claims; this section has the sole purpose of protecting the
218 | integrity of the free software distribution system, which is
219 | implemented by public license practices. Many people have made
220 | generous contributions to the wide range of software distributed
221 | through that system in reliance on consistent application of that
222 | system; it is up to the author/donor to decide if he or she is willing
223 | to distribute software through any other system and a licensee cannot
224 | impose that choice.
225 |
226 | This section is intended to make thoroughly clear what is believed to
227 | be a consequence of the rest of this License.
228 |
229 | 8. If the distribution and/or use of the Program is restricted in
230 | certain countries either by patents or by copyrighted interfaces, the
231 | original copyright holder who places the Program under this License
232 | may add an explicit geographical distribution limitation excluding
233 | those countries, so that distribution is permitted only in or among
234 | countries not thus excluded. In such case, this License incorporates
235 | the limitation as if written in the body of this License.
236 |
237 | 9. The Free Software Foundation may publish revised and/or new versions
238 | of the General Public License from time to time. Such new versions will
239 | be similar in spirit to the present version, but may differ in detail to
240 | address new problems or concerns.
241 |
242 | Each version is given a distinguishing version number. If the Program
243 | specifies a version number of this License which applies to it and "any
244 | later version", you have the option of following the terms and conditions
245 | either of that version or of any later version published by the Free
246 | Software Foundation. If the Program does not specify a version number of
247 | this License, you may choose any version ever published by the Free Software
248 | Foundation.
249 |
250 | 10. If you wish to incorporate parts of the Program into other free
251 | programs whose distribution conditions are different, write to the author
252 | to ask for permission. For software which is copyrighted by the Free
253 | Software Foundation, write to the Free Software Foundation; we sometimes
254 | make exceptions for this. Our decision will be guided by the two goals
255 | of preserving the free status of all derivatives of our free software and
256 | of promoting the sharing and reuse of software generally.
257 |
258 | NO WARRANTY
259 |
260 | 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
261 | FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
262 | OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
263 | PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
264 | OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
265 | MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
266 | TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
267 | PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
268 | REPAIR OR CORRECTION.
269 |
270 | 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
271 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
272 | REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
273 | INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
274 | OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
275 | TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
276 | YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
277 | PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
278 | POSSIBILITY OF SUCH DAMAGES.
279 |
280 | END OF TERMS AND CONDITIONS
281 |
282 | How to Apply These Terms to Your New Programs
283 |
284 | If you develop a new program, and you want it to be of the greatest
285 | possible use to the public, the best way to achieve this is to make it
286 | free software which everyone can redistribute and change under these terms.
287 |
288 | To do so, attach the following notices to the program. It is safest
289 | to attach them to the start of each source file to most effectively
290 | convey the exclusion of warranty; and each file should have at least
291 | the "copyright" line and a pointer to where the full notice is found.
292 |
293 | {description}
294 | Copyright (C) {year} {fullname}
295 |
296 | This program is free software; you can redistribute it and/or modify
297 | it under the terms of the GNU General Public License as published by
298 | the Free Software Foundation; either version 2 of the License, or
299 | (at your option) any later version.
300 |
301 | This program is distributed in the hope that it will be useful,
302 | but WITHOUT ANY WARRANTY; without even the implied warranty of
303 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
304 | GNU General Public License for more details.
305 |
306 | You should have received a copy of the GNU General Public License along
307 | with this program; if not, write to the Free Software Foundation, Inc.,
308 | 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
309 |
310 | Also add information on how to contact you by electronic and paper mail.
311 |
312 | If the program is interactive, make it output a short notice like this
313 | when it starts in an interactive mode:
314 |
315 | Gnomovision version 69, Copyright (C) year name of author
316 | Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
317 | This is free software, and you are welcome to redistribute it
318 | under certain conditions; type `show c' for details.
319 |
320 | The hypothetical commands `show w' and `show c' should show the appropriate
321 | parts of the General Public License. Of course, the commands you use may
322 | be called something other than `show w' and `show c'; they could even be
323 | mouse-clicks or menu items--whatever suits your program.
324 |
325 | You should also get your employer (if you work as a programmer) or your
326 | school, if any, to sign a "copyright disclaimer" for the program, if
327 | necessary. Here is a sample; alter the names:
328 |
329 | Yoyodyne, Inc., hereby disclaims all copyright interest in the program
330 | `Gnomovision' (which makes passes at compilers) written by James Hacker.
331 |
332 | {signature of Ty Coon}, 1 April 1989
333 | Ty Coon, President of Vice
334 |
335 | This General Public License does not permit incorporating your program into
336 | proprietary programs. If your program is a subroutine library, you may
337 | consider it more useful to permit linking proprietary applications with the
338 | library. If this is what you want to do, use the GNU Lesser General
339 | Public License instead of this License.
340 |
--------------------------------------------------------------------------------
/Rmagic/NAMESPACE:
--------------------------------------------------------------------------------
1 | # Generated by roxygen2: do not edit by hand
2 |
3 | S3method(as.data.frame,magic)
4 | S3method(as.matrix,magic)
5 | S3method(ggplot,magic)
6 | S3method(magic,Seurat)
7 | S3method(magic,default)
8 | S3method(magic,seurat)
9 | S3method(print,magic)
10 | S3method(summary,magic)
11 | export(check_pymagic_version)
12 | export(install.magic)
13 | export(library.size.normalize)
14 | export(magic)
15 | export(pymagic_is_available)
16 | import(Matrix)
17 | importFrom(ggplot2,ggplot)
18 | importFrom(utils,packageVersion)
19 |
--------------------------------------------------------------------------------
/Rmagic/R/magic.R:
--------------------------------------------------------------------------------
1 | #' Perform MAGIC on a data matrix
2 | #'
3 | #' Markov Affinity-based Graph Imputation of Cells (MAGIC) is an
4 | #' algorithm for denoising and transcript recovery of single cells
5 | #' applied to single-cell RNA sequencing data, as described in
6 | #' van Dijk et al, 2018.
7 | #'
8 | #' @param data input data matrix or Seurat object
9 | #' @param genes character or integer vector, default: NULL
10 | #' vector of column names or column indices for which to return smoothed data
11 | #' If 'all_genes' or NULL, the entire smoothed matrix is returned
12 | #' @param knn int, optional, default: 5
13 | #' number of nearest neighbors on which to compute bandwidth
14 | #' @param knn.max int, optional, default: NULL
15 | #' maximum number of neighbors for each point. If NULL, defaults to 3*knn
16 | #' @param decay int, optional, default: 1
17 | #' sets decay rate of kernel tails.
18 | #' If NULL, alpha decaying kernel is not used
19 | #' @param t int, optional, default: 3
20 | #' power to which the diffusion operator is powered
21 | #' sets the level of diffusion. If 'auto', t is selected according to the
22 | #' Procrustes disparity of the diffused data.
23 | #' @param npca number of PCA components that should be used; default: 100.
24 | #' @param solver str, optional, default: 'exact'
25 | #' Which solver to use. "exact" uses the implementation described
26 | #' in van Dijk et al. (2018). "approximate" uses a faster implementation
27 | #' that performs imputation in the PCA space and then projects back to the
28 | #' gene space. Note, the "approximate" solver may return negative values.
29 | #' @param init magic object, optional
30 | #' object to use for initialization. Avoids recomputing
31 | #' intermediate steps if parameters are the same.
32 | #' @param t.max int, optional, default: 20
33 | #' Maximum value of t to test for automatic t selection.
34 | #' @param knn.dist.method string, optional, default: 'euclidean'.
35 | #' recommended values: 'euclidean', 'cosine'
36 | #' Any metric from `scipy.spatial.distance` can be used
37 | #' distance metric for building kNN graph.
38 | #' @param verbose `int` or `boolean`, optional (default : 1)
39 | #' If `TRUE` or `> 0`, print verbose updates.
40 | #' @param n.jobs `int`, optional (default: 1)
41 | #' The number of jobs to use for the computation.
42 | #' If -1 all CPUs are used. If 1 is given, no parallel computing code is
43 | #' used at all, which is useful for debugging.
44 | #' For n.jobs below -1, (n.cpus + 1 + n.jobs) are used. Thus for
45 | #' n.jobs = -2, all CPUs but one are used
46 | #' @param seed int or `NULL`, random state (default: `NULL`)
47 | #' @param ... Arguments passed to and from other methods
48 | #' @param k Deprecated. Use `knn`.
49 | #' @param alpha Deprecated. Use `decay`.
50 | #'
51 | #' @return If a Seurat object is passed, a Seurat object is returned. Otherwise, a "magic" object containing:
52 | #' * **result**: matrix containing smoothed expression values
53 | #' * **operator**: The MAGIC operator (python magic.MAGIC object)
54 | #' * **params**: Parameters passed to magic
55 | #'
56 | #' @examples
57 | #' if (pymagic_is_available()) {
58 | #' data(magic_testdata)
59 | #'
60 | #' # Run MAGIC
61 | #' data_magic <- magic(magic_testdata, genes = c("VIM", "CDH1", "ZEB1"))
62 | #' summary(data_magic)
63 | #' ## CDH1 VIM ZEB1
64 | #' ## Min. :0.4303 Min. :3.854 Min. :0.01111
65 | #' ## 1st Qu.:0.4444 1st Qu.:3.947 1st Qu.:0.01145
66 | #' ## Median :0.4462 Median :3.964 Median :0.01153
67 | #' ## Mean :0.4461 Mean :3.965 Mean :0.01152
68 | #' ## 3rd Qu.:0.4478 3rd Qu.:3.982 3rd Qu.:0.01160
69 | #' ## Max. :0.4585 Max. :4.127 Max. :0.01201
70 | #'
71 | #' # Plot the result with ggplot2
72 | #' if (require(ggplot2)) {
73 | #' ggplot(data_magic) +
74 | #' geom_point(aes(x = VIM, y = CDH1, color = ZEB1))
75 | #' }
76 | #'
77 | #' # Run MAGIC again returning all genes
78 | #' # We use the last run as initialization
79 | #' data_magic <- magic(magic_testdata, genes = "all_genes", init = data_magic)
80 | #' # Extract the smoothed data matrix to use in downstream analysis
81 | #' data_smooth <- as.matrix(data_magic)
82 | #' }
83 | #'
84 | #' if (pymagic_is_available() && require(Seurat)) {
85 | #' data(magic_testdata)
86 | #'
87 | #' # Create a Seurat object
88 | #' seurat_object <- CreateSeuratObject(counts = t(magic_testdata), assay = "RNA")
89 | #' seurat_object <- NormalizeData(object = seurat_object)
90 | #' seurat_object <- ScaleData(object = seurat_object)
91 | #'
92 | #' # Run MAGIC and reset the active assay
93 | #' seurat_object <- magic(seurat_object)
94 | #' seurat_object@active.assay <- "MAGIC_RNA"
95 | #'
96 | #' # Analyze with Seurat
97 | #' VlnPlot(seurat_object, features = c("VIM", "ZEB1", "CDH1"))
98 | #' }
99 | #' @export
100 | #'
101 | magic <- function(data, ...) {
102 | UseMethod(generic = "magic", object = data)
103 | }
104 |
105 | #' @rdname magic
106 | #' @export
107 | #'
108 | magic.default <- function(
109 | data,
110 | genes = NULL,
111 | knn = 5,
112 | knn.max = NULL,
113 | decay = 1,
114 | t = 3,
115 | npca = 100,
116 | solver = "exact",
117 | init = NULL,
118 | t.max = 20,
119 | knn.dist.method = "euclidean",
120 | verbose = 1,
121 | n.jobs = 1,
122 | seed = NULL,
123 | # deprecated args
124 | k = NULL, alpha = NULL,
125 | ...) {
126 | # check installation
127 | if (!reticulate::py_module_available(module = "magic") ||
128 | (is.null(pymagic))) {
129 | load_pymagic()
130 | }
131 | # check for deprecated arguments
132 | if (!is.null(k)) {
133 | message("Argument k is deprecated. Using knn instead.")
134 | knn <- k
135 | }
136 | if (!is.null(alpha)) {
137 | message("Argument alpha is deprecated. Using decay instead.")
138 | decay <- alpha
139 | }
140 | # validate parameters
141 | knn <- check.int(x = knn)
142 | t.max <- check.int(x = t.max)
143 | n.jobs <- check.int(x = n.jobs)
144 | npca <- check.int.or.null(npca)
145 | knn.max <- check.int.or.null(knn.max)
146 | seed <- check.int.or.null(seed)
147 | verbose <- check.int.or.null(verbose)
148 | decay <- check.double.or.null(decay)
149 | t <- check.int.or.string(t, "auto")
150 | if (!methods::is(object = data, "Matrix")) {
151 | data <- as.matrix(x = data)
152 | }
153 | if (length(genes) <= 1 && (is.null(x = genes) || is.na(x = genes))) {
154 | genes <- NULL
155 | gene_names <- colnames(x = data)
156 | } else if (is.numeric(x = genes)) {
157 | gene_names <- colnames(x = data)[genes]
158 | genes <- as.integer(x = genes - 1)
159 | } else if (length(x = genes) == 1 && genes == "all_genes") {
160 | gene_names <- colnames(x = data)
161 | } else if (length(x = genes) == 1 && genes == "pca_only") {
162 | gene_names <- paste0("PC", 1:npca)
163 | } else {
164 | # character vector
165 | if (!all(genes %in% colnames(x = data))) {
166 | warning(paste0(
167 | "Genes not found: ",
168 | paste(genes[!(genes %in% colnames(data))],
169 | collapse = ", "),
170 | "."
171 | ))
172 | }
173 | genes <- which(x = colnames(x = data) %in% genes)
174 | gene_names <- colnames(x = data)[genes]
175 | genes <- as.integer(x = genes - 1)
176 | }
177 | # store parameters
178 | params <- list(
179 | "data" = data,
180 | "knn" = knn,
181 | "knn.max" = knn.max,
182 | "decay" = decay,
183 | "t" = t,
184 | "npca" = npca,
185 | "solver" = solver,
186 | "knn.dist.method" = knn.dist.method
187 | )
188 | # use pre-initialized values if given
189 | operator <- NULL
190 | if (!is.null(x = init)) {
191 | if (!methods::is(init, "magic")) {
192 | warning("object passed to init is not a magic object")
193 | } else {
194 | operator <- init$operator
195 | operator$set_params(
196 | knn = knn,
197 | knn_max = knn.max,
198 | decay = decay,
199 | t = t,
200 | n_pca = npca,
201 | solver = solver,
202 | knn_dist = knn.dist.method,
203 | n_jobs = n.jobs,
204 | random_state = seed,
205 | verbose = verbose,
206 | ...
207 | )
208 | }
209 | }
210 | if (is.null(x = operator)) {
211 | operator <- pymagic$MAGIC(
212 | knn = knn,
213 | knn_max = knn.max,
214 | decay = decay,
215 | t = t,
216 | n_pca = npca,
217 | solver = solver,
218 | knn_dist = knn.dist.method,
219 | n_jobs = n.jobs,
220 | random_state = seed,
221 | verbose = verbose,
222 | ...
223 | )
224 | }
225 | result <- operator$fit_transform(
226 | data,
227 | genes = genes,
228 | t_max = t.max
229 | )
230 | colnames(x = result) <- gene_names
231 | rownames(x = result) <- rownames(data)
232 | result <- as.data.frame(x = result)
233 | result <- list(
234 | "result" = result,
235 | "operator" = operator,
236 | "params" = params
237 | )
238 | class(x = result) <- c("magic", "list")
239 | return(result)
240 | }
241 |
242 | #' @rdname magic
243 | #' @export
244 | #' @method magic seurat
245 | #'
246 | magic.seurat <- function(
247 | data,
248 | genes = NULL,
249 | knn = 5,
250 | knn.max = NULL,
251 | decay = 1,
252 | t = 3,
253 | npca = 100,
254 | solver = "exact",
255 | init = NULL,
256 | t.max = 20,
257 | knn.dist.method = "euclidean",
258 | verbose = 1,
259 | n.jobs = 1,
260 | seed = NULL,
261 | ...) {
262 | if (requireNamespace("Seurat", quietly = TRUE)) {
263 | results <- magic(
264 | data = as.matrix(x = t(x = data@data)),
265 | genes = genes,
266 | knn = knn,
267 | knn.max = knn.max,
268 | decay = decay,
269 | t = t,
270 | npca = npca,
271 | solver = solver,
272 | init = init,
273 | t.max = t.max,
274 | knn.dist.method = knn.dist.method,
275 | verbose = verbose,
276 | n.jobs = n.jobs,
277 | seed = seed,
278 | ...
279 | )
280 | data@data <- t(x = as.matrix(x = results$result))
281 | return(data)
282 | } else {
283 | message("Seurat package not available. Running default MAGIC implementation.")
284 | return(magic(
285 | data,
286 | genes = genes,
287 | knn = knn,
288 | knn.max = knn.max,
289 | decay = decay,
290 | t = t,
291 | npca = npca,
292 | solver = solver,
293 | init = init,
294 | t.max = t.max,
295 | knn.dist.method = knn.dist.method,
296 | verbose = verbose,
297 | n.jobs = n.jobs,
298 | seed = seed,
299 | ...
300 | ))
301 | }
302 | }
303 |
304 | #' @param assay Assay to use for imputation, defaults to the default assay
305 | #'
306 | #' @rdname magic
307 | #' @export
308 | #' @method magic Seurat
309 | #'
310 | magic.Seurat <- function(
311 | data,
312 | assay = NULL,
313 | genes = NULL,
314 | knn = 5,
315 | knn.max = NULL,
316 | decay = 1,
317 | t = 3,
318 | npca = 100,
319 | solver = "exact",
320 | init = NULL,
321 | t.max = 20,
322 | knn.dist.method = "euclidean",
323 | verbose = 1,
324 | n.jobs = 1,
325 | seed = NULL,
326 | ...) {
327 | if (requireNamespace("Seurat", quietly = TRUE)) {
328 | if (is.null(x = assay)) {
329 | assay <- Seurat::DefaultAssay(object = data)
330 | }
331 | results <- magic(
332 | data = t(x = Seurat::GetAssayData(object = data, slot = "data", assay = assay)),
333 | genes = genes,
334 | knn = knn,
335 | knn.max = knn.max,
336 | decay = decay,
337 | t = t,
338 | npca = npca,
339 | solver = solver,
340 | init = init,
341 | t.max = t.max,
342 | knn.dist.method = knn.dist.method,
343 | verbose = verbose,
344 | n.jobs = n.jobs,
345 | seed = seed,
346 | ...
347 | )
348 | assay_name <- paste0("MAGIC_", assay)
349 | data[[assay_name]] <- Seurat::CreateAssayObject(
350 | data = t(x = as.matrix(x = results$result))
351 | )
352 | print(paste0(
353 | "Added MAGIC output to ",
354 | assay_name,
355 | ". To use it, pass assay='",
356 | assay_name,
357 | "' to downstream methods or set DefaultAssay(seurat_object) <- '",
358 | assay_name,
359 | "'."
360 | ))
361 | Seurat::Tool(object = data) <- results[c("operator", "params")]
362 | return(data)
363 | } else {
364 | message("Seurat package not available. Running default MAGIC implementation.")
365 | return(magic(
366 | data,
367 | genes = genes,
368 | knn = knn,
369 | knn.max = knn.max,
370 | decay = decay,
371 | t = t,
372 | npca = npca, solver = solver,
373 | init = init,
374 | t.max = t.max,
375 | knn.dist.method = knn.dist.method,
376 | verbose = verbose,
377 | n.jobs = n.jobs,
378 | seed = seed,
379 | ...
380 | ))
381 | }
382 | }
383 |
384 | #' Print a MAGIC object
385 | #'
386 | #' This avoids spamming the user's console with a list of many large matrices
387 | #'
388 | #' @param x A fitted MAGIC object
389 | #' @param ... Arguments for print()
390 | #' @examples
391 | #' if (pymagic_is_available()) {
392 | #' data(magic_testdata)
393 | #' data_magic <- magic(magic_testdata)
394 | #' print(data_magic)
395 | #' ## MAGIC with elements
396 | #' ## $result : (500, 197)
397 | #' ## $operator : Python MAGIC operator
398 | #' ## $params : list with elements (data, knn, knn.max, decay, t, npca, solver, knn.dist.method)
399 | #' }
400 | #' @rdname print
401 | #' @method print magic
402 | #' @export
403 | print.magic <- function(x, ...) {
404 | result <- paste0(
405 | "MAGIC with elements\n",
406 | " $result : (", nrow(x$result), ", ",
407 | ncol(x$result), ")\n",
408 | " $operator : Python MAGIC operator\n",
409 | " $params : list with elements (",
410 | paste(names(x$params), collapse = ", "), ")"
411 | )
412 | cat(result)
413 | }
414 |
415 | #' Summarize a MAGIC object
416 | #'
417 | #' @param object A fitted MAGIC object
418 | #' @param ... Arguments for summary()
419 | #' @examples
420 | #' if (pymagic_is_available()) {
421 | #' data(magic_testdata)
422 | #' data_magic <- magic(magic_testdata)
423 | #' summary(data_magic)
424 | #' ## ZEB1
425 | #' ## Min. :0.01071
426 | #' ## 1st Qu.:0.01119
427 | #' ## Median :0.01130
428 | #' ## Mean :0.01129
429 | #' ## 3rd Qu.:0.01140
430 | #' ## Max. :0.01201
431 | #' }
432 | #' @rdname summary
433 | #' @method summary magic
434 | #' @export
435 | summary.magic <- function(object, ...) {
436 | summary(object$result)
437 | }
438 |
439 | #' Convert a MAGIC object to a matrix
440 | #'
441 | #' Returns the smoothed data matrix
442 | #'
443 | #' @param x A fitted MAGIC object
444 | #' @param ... Arguments for as.matrix()
445 | #' @rdname as.matrix
446 | #' @method as.matrix magic
447 | #' @export
448 | as.matrix.magic <- function(x, ...) {
449 | as.matrix(as.data.frame(x))
450 | }
451 | #' Convert a MAGIC object to a data.frame
452 | #'
453 | #' Returns the smoothed data matrix
454 | #'
455 | #' @param x A fitted MAGIC object
456 | #' @param ... Arguments for as.data.frame()
457 | #' @rdname as.data.frame
458 | #' @method as.data.frame magic
459 | #' @export
460 | as.data.frame.magic <- function(x, ...) {
461 | x$result
462 | }
463 |
464 |
465 | #' Convert a MAGIC object to a data.frame for ggplot
466 | #'
467 | #' Passes the smoothed data matrix to ggplot
468 | #' @importFrom ggplot2 ggplot
469 | #' @param data A fitted MAGIC object
470 | #' @param ... Arguments for ggplot()
471 | #' @examples
472 | #' if (pymagic_is_available() && require(ggplot2)) {
473 | #' data(magic_testdata)
474 | #' data_magic <- magic(magic_testdata, genes = c("VIM", "CDH1", "ZEB1"))
475 | #' ggplot(data_magic, aes(VIM, CDH1, colour = ZEB1)) +
476 | #' geom_point()
477 | #' }
478 | #' @rdname ggplot
479 | #' @method ggplot magic
480 | #' @export
481 | ggplot.magic <- function(data, ...) {
482 | ggplot2::ggplot(as.data.frame(data), ...)
483 | }
484 |
--------------------------------------------------------------------------------
/Rmagic/R/magic_testdata.R:
--------------------------------------------------------------------------------
1 | #' Fake scRNAseq data for examples
2 | #'
3 | #' A subsampled dataset of epithelial to mesenchymal transition
4 | #'
5 | #' @format A matrix with 500 rows and 197 variables
6 | #'
7 | #' @source The authors
8 | "magic_testdata"
9 |
--------------------------------------------------------------------------------
/Rmagic/R/preprocessing.R:
--------------------------------------------------------------------------------
1 | #' Performs L1 normalization on input data such that the sum of expression
2 | #' values for each cell sums to 1, then returns normalized matrix to the metric
3 | #' space using median UMI count per cell effectively scaling all cells as if
4 | #' they were sampled evenly.
5 | #'
6 | #' @param data matrix (n_samples, n_dimensions)
7 | #' 2 dimensional input data array with n cells and p dimensions
8 | #' @param verbose boolean, default=FALSE. If true, print verbose output
9 | #'
10 | #' @return data_norm matrix (n_samples, n_dimensions)
11 | #' 2 dimensional array with normalized gene expression values
12 | #' @import Matrix
13 | #'
14 | #' @export
15 | library.size.normalize <- function(data, verbose = FALSE) {
16 | if (verbose) {
17 | message(paste0(
18 | "Normalizing library sizes for ",
19 | nrow(data), " cells"
20 | ))
21 | }
22 | library_size <- Matrix::rowSums(data)
23 | median_transcript_count <- stats::median(library_size)
24 | data_norm <- median_transcript_count * data / library_size
25 | data_norm
26 | }
27 |
--------------------------------------------------------------------------------
/Rmagic/R/utils.R:
--------------------------------------------------------------------------------
1 | # Return TRUE if x and y are equal or both NULL
2 | null_equal <- function(x, y) {
3 | if (is.null(x) && is.null(y)) {
4 | return(TRUE)
5 | } else if (is.null(x) || is.null(y)) {
6 | return(FALSE)
7 | } else {
8 | return(x == y)
9 | }
10 | }
11 |
12 | #' Check that the current MAGIC version in Python is up to date.
13 | #'
14 | #' @importFrom utils packageVersion
15 | #' @export
16 | check_pymagic_version <- function() {
17 | pyversion <- strsplit(pymagic$`__version__`, "\\.")[[1]]
18 | rversion <- strsplit(as.character(packageVersion("Rmagic")), "\\.")[[1]]
19 | major_version <- as.integer(rversion[1])
20 | minor_version <- as.integer(rversion[2])
21 | if (as.integer(pyversion[1]) < major_version) {
22 | warning(paste0(
23 | "Python MAGIC version ",
24 | pymagic$`__version__`,
25 | " is out of date (recommended: ",
26 | major_version,
27 | ".",
28 | minor_version,
29 | "). Please update with pip ",
30 | "(e.g. ",
31 | reticulate::py_config()$python,
32 | " -m pip install --upgrade magic-impute) or Rmagic::install.magic()."
33 | ))
34 | return(FALSE)
35 | } else if (as.integer(pyversion[2]) < minor_version) {
36 | warning(paste0(
37 | "Python MAGIC version ",
38 | pymagic$`__version__`,
39 | " is out of date (recommended: ",
40 | major_version,
41 | ".",
42 | minor_version,
43 | "). Consider updating with pip ",
44 | "(e.g. ",
45 | reticulate::py_config()$python,
46 | " -m pip install --upgrade magic-impute) or Rmagic::install.magic()."
47 | ))
48 | return(FALSE)
49 | }
50 | return(TRUE)
51 | }
52 |
53 | failed_pymagic_import <- function(e) {
54 | message("Error loading Python module magic")
55 | message(e)
56 | result <- as.character(e)
57 | if (length(grep("ModuleNotFoundError: No module named 'magic'", result)) > 0 ||
58 | length(grep("ImportError: No module named magic", result)) > 0) {
59 | # not installed
60 | if (utils::menu(c("Yes", "No"), title = "Install MAGIC Python package with reticulate?") == 1) {
61 | install.magic()
62 | }
63 | } else if (length(grep("r\\-reticulate", reticulate::py_config()$python)) > 0) {
64 | # installed, but envs sometimes give weird results
65 | message("Consider removing the 'r-reticulate' environment by running:")
66 | if (length(grep("virtualenvs", reticulate::py_config()$python)) > 0) {
67 | message("reticulate::virtualenv_remove('r-reticulate')")
68 | } else {
69 | message("reticulate::conda_remove('r-reticulate')")
70 | }
71 | }
72 | }
73 |
74 | load_pymagic <- function() {
75 | delay_load <- list(on_load = check_pymagic_version, on_error = failed_pymagic_import)
76 | # load
77 | if (is.null(pymagic)) {
78 | # first time load
79 | result <- try(pymagic <<- reticulate::import("magic", delay_load = delay_load))
80 | } else {
81 | # already loaded
82 | result <- try(reticulate::import("magic", delay_load = delay_load))
83 | }
84 | }
85 |
86 | #' Check whether MAGIC Python package is available and can be loaded
87 | #'
88 | #' This is used primarily to avoid running tests on CRAN
89 | #' and elsewhere where the Python package should not be
90 | #' installed.
91 | #'
92 | #' @export
93 | pymagic_is_available <- function() {
94 | tryCatch(
95 | {
96 | reticulate::import("magic")$MAGIC
97 | check_pymagic_version()
98 | },
99 | error = function(e) {
100 | FALSE
101 | }
102 | )
103 | }
104 |
105 | #' Install MAGIC Python Package
106 | #'
107 | #' Install MAGIC Python package into a virtualenv or conda env.
108 | #'
109 | #' On Linux and OS X the "virtualenv" method will be used by default
110 | #' ("conda" will be used if virtualenv isn't available). On Windows,
111 | #' the "conda" method is always used.
112 | #'
113 | #' @param envname Name of environment to install packages into
114 | #' @param method Installation method. By default, "auto" automatically finds
115 | #' a method that will work in the local environment. Change the default to
116 | #' force a specific installation method. Note that the "virtualenv" method
117 | #' is not available on Windows.
118 | #' @param conda Path to conda executable (or "auto" to find conda using the PATH
119 | #' and other conventional install locations).
120 | #' @param pip Install from pip, if possible.
121 | #' @param ... Additional arguments passed to conda_install() or
122 | #' virtualenv_install().
123 | #'
124 | #' @export
125 | install.magic <- function(envname = "r-reticulate", method = "auto",
126 | conda = "auto", pip = TRUE, ...) {
127 | message("Attempting to install MAGIC python package with reticulate")
128 | tryCatch(
129 | {
130 | reticulate::py_install("magic-impute",
131 | envname = envname, method = method,
132 | conda = conda, pip = pip, ...
133 | )
134 | message("Install complete. Please restart R and try again.")
135 | },
136 | error = function(e) {
137 | stop(paste0(
138 | "Cannot locate MAGIC Python package, please install through pip ",
139 | "(e.g. ", reticulate::py_config()$python, " -m pip install magic-impute) and then restart R."
140 | ))
141 | }
142 | )
143 | }
144 |
145 | pymagic <- NULL
146 |
147 | .onLoad <- function(libname, pkgname) {
148 | py_config <- reticulate::py_discover_config(required_module = "magic")
149 | load_pymagic()
150 | }
151 |
152 | ######
153 | # Parameter validation
154 | ######
155 |
156 | check.int <- function(x) {
157 | as.integer(x)
158 | }
159 |
160 | check.int.or.null <- function(x) {
161 | if (is.numeric(x = x)) {
162 | x <- as.integer(x = x)
163 | } else if (!is.null(x = x) && is.na(x = x)) {
164 | x <- NULL
165 | }
166 | x
167 | }
168 |
169 | check.double.or.null <- function(x) {
170 | if (is.numeric(x = x)) {
171 | x <- as.double(x = x)
172 | } else if (!is.null(x = x) && is.na(x = x)) {
173 | x <- NULL
174 | }
175 | x
176 | }
177 |
178 | check.int.or.string <- function(x, str) {
179 | if (is.numeric(x = x)) {
180 | x <- as.integer(x = x)
181 | } else if (is.null(x = x) || is.na(x = x)) {
182 | x <- str
183 | }
184 | x
185 | }
186 |
--------------------------------------------------------------------------------
/Rmagic/README.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title : Rmagic
3 | output: github_document
4 | toc: true
5 | ---
6 |
7 |
8 |
9 | ```{r setup, include = FALSE}
10 | knitr::opts_chunk$set(
11 | collapse = TRUE,
12 | comment = "#>",
13 | fig.path = "man/figures/README-",
14 | out.width = "100%"
15 | )
16 | ```
17 |
18 | [PyPI](https://pypi.org/project/magic-impute/)
19 | [CRAN](https://cran.r-project.org/package=Rmagic)
20 | [GitHub Actions](https://github.com/KrishnaswamyLab/MAGIC/actions)
21 | [Read the Docs](https://magic.readthedocs.io/)
22 | [Cell publication](https://www.cell.com/cell/abstract/S0092-8674(18)30724-4)
23 | [Twitter](https://twitter.com/KrishnaswamyLab)
24 | [GitHub](https://github.com/KrishnaswamyLab/MAGIC/)
25 |
26 |
27 | Markov Affinity-based Graph Imputation of Cells (MAGIC) is an algorithm for denoising and imputing single-cell RNA sequencing data, as described in Van Dijk D *et al.* (2018), *Recovering Gene Interactions from Single-Cell Data Using Data Diffusion*, Cell.
28 |
29 |
30 |
31 |
32 | MAGIC reveals the interaction between Vimentin (VIM), Cadherin-1 (CDH1), and Zinc finger E-box-binding homeobox 1 (ZEB1, encoded by color).
33 |
34 |
35 |
36 | * MAGIC imputes missing data values on sparse data sets, restoring the structure of the data
37 | * It also provides dimensionality reduction and gene expression visualizations
38 | * MAGIC can be performed on a variety of datasets
39 | * Here, we show the usage of MAGIC on a toy dataset
40 | * You can view further examples of MAGIC on real data in our notebooks under `inst/examples`:
41 | * http://htmlpreview.github.io/?https://github.com/KrishnaswamyLab/MAGIC/blob/master/Rmagic/inst/examples/EMT_tutorial.html
42 | * http://htmlpreview.github.io/?https://github.com/KrishnaswamyLab/MAGIC/blob/master/Rmagic/inst/examples/bonemarrow_tutorial.html
43 |
44 | ## Table of Contents
45 |
46 | * [Installation](#installation)
47 | * [Installation from CRAN and PyPi](#installation-from-cran-and-pypi)
48 | * [Installation with devtools and reticulate](#installation-with-devtools-and-reticulate)
49 | * [Installation from source](#installation-from-source)
50 | * [Quick Start](#quick-start)
51 | * [Tutorial](#tutorial)
52 | * [Issues](#issues)
53 | * [FAQ](#faq)
54 | * [Help](#help)
55 |
56 | ## Installation
57 |
58 | To use MAGIC, you will need to install both the R and Python packages.
59 |
60 | If `python` or `pip` are not installed, you will need to install them. We recommend [Miniconda3](https://conda.io/miniconda.html) to install Python and `pip` together, or otherwise you can install `pip` from https://pip.pypa.io/en/stable/installing/.
61 |
62 | #### Installation from CRAN and PyPi
63 |
64 | In R, run this command to install MAGIC and all dependencies:
65 |
66 | ```{r install_Rmagic, eval=FALSE}
67 | install.packages("Rmagic")
68 | ```
69 |
70 | In a terminal, run the following command to install the Python package:
71 |
72 | ```{bash install_python_magic, eval=FALSE}
73 | pip install --user magic-impute
74 | ```
75 |
76 | #### Installation from source
77 |
78 | To install the very latest version of MAGIC, you can install from GitHub with the following commands run in a terminal.
79 |
80 | ```{bash install_magic_source, eval=FALSE}
81 | git clone https://github.com/KrishnaswamyLab/MAGIC
82 | cd MAGIC/python
83 | python setup.py install --user
84 | cd ../Rmagic
85 | R CMD INSTALL .
86 | ```
87 |
88 | ## Quick Start
89 |
90 | If you have loaded a data matrix `data` in R (cells on rows, genes on columns), you can run MAGIC as follows:
91 |
92 | ```{r quick start, eval=FALSE}
93 | library(Rmagic)
94 | data_MAGIC <- magic(data)
95 | ```
96 |
97 | ## Tutorial
98 |
99 | ### Extra packages for the tutorial
100 |
101 | We'll install a couple more tools for this tutorial.
102 |
103 | ```{r install_extras, eval=FALSE}
104 | if (!require(viridis)) install.packages("viridis")
105 | if (!require(ggplot2)) install.packages("ggplot2")
106 | if (!require(phateR)) install.packages("phateR")
107 | ```
108 |
109 | If you have never used PHATE before, you will also need to install its Python package from the command line as follows:
110 |
111 | ```{bash install_python_phate, eval=FALSE}
112 | pip install --user phate
113 | ```
114 |
115 | ### Loading packages
116 |
117 | We load the Rmagic package and a few others for convenience functions.
118 |
119 | ```{r load_packages}
120 | library(Rmagic)
121 | library(ggplot2)
122 | library(viridis)
123 | library(phateR)
124 | ```
125 |
126 | ### Loading data
127 |
128 | The example data is located in the MAGIC R package.
129 |
130 | ```{r load_data}
131 | # load data
132 | data(magic_testdata)
133 | magic_testdata[1:5, 1:10]
134 | ```
135 |
136 | ### Running MAGIC
137 |
138 | Running MAGIC is as simple as running the `magic` function.
139 |
140 | ```{r run_magic}
141 | # run MAGIC
142 | data_MAGIC <- magic(magic_testdata, genes = c("VIM", "CDH1", "ZEB1"))
143 | ```
144 |
145 | We can plot the data before and after MAGIC to visualize the results.
146 |
147 | ```{r plot_raw}
148 | ggplot(magic_testdata) +
149 | geom_point(aes(VIM, CDH1, colour = ZEB1)) +
150 | scale_colour_viridis(option = "B")
151 | ```
152 |
153 | The data suffers from dropout to the point that we cannot infer anything about the gene-gene relationships.
154 |
155 | ```{r plot_magic}
156 | ggplot(data_MAGIC) +
157 | geom_point(aes(VIM, CDH1, colour = ZEB1)) +
158 | scale_colour_viridis(option = "B")
159 | ```
160 |
161 | As you can see, the gene-gene relationships are much clearer after MAGIC.
162 |
163 | The data is sometimes a little too smooth - we can decrease `t` from the automatic value to reduce the amount of diffusion. We pass the original result to the argument `init` to avoid recomputing intermediate steps.
164 |
165 | ```{r plot_reduced_t}
166 | data_MAGIC <- magic(magic_testdata, genes = c("VIM", "CDH1", "ZEB1"), t = 6, init = data_MAGIC)
167 | ggplot(data_MAGIC) +
168 | geom_point(aes(VIM, CDH1, colour = ZEB1)) +
169 | scale_colour_viridis(option = "B")
170 | ```
171 |
172 |
173 | We can look at the entire smoothed matrix with `genes='all_genes'`, passing the original result to the argument `init` to avoid recomputing intermediate steps. Note that this matrix may be large and could take up a lot of memory.
174 |
175 | ```{r run_magic_full_matrix}
176 | data_MAGIC <- magic(magic_testdata, genes = "all_genes", t = 6, init = data_MAGIC)
177 | as.data.frame(data_MAGIC)[1:5, 1:10]
178 | ```
179 |
180 | ### Visualizing MAGIC values on PCA
181 |
182 | We can visualize the results of MAGIC on PCA as follows.
183 |
184 | ```{r run_pca}
185 | data_MAGIC_PCA <- as.data.frame(prcomp(data_MAGIC)$x)
186 | ggplot(data_MAGIC_PCA) +
187 | geom_point(aes(x = PC1, y = PC2, color = data_MAGIC$result$VIM)) +
188 | scale_color_viridis(option = "B") +
189 | labs(color = "VIM")
190 | ```
191 |
192 |
193 | ### Visualizing MAGIC values on PHATE
194 |
195 | We can visualize the results of MAGIC on PHATE as follows. We set `t` and `knn` manually, because this toy dataset is too small for PHATE's defaults to be meaningful; however, the default values work well for single-cell genomic data.
196 |
197 | ```{r run_phate}
198 | data_PHATE <- phate(magic_testdata, knn = 3, t = 15)
199 | ggplot(data_PHATE) +
200 | geom_point(aes(x = PHATE1, y = PHATE2, color = data_MAGIC$result$VIM)) +
201 | scale_color_viridis(option = "B") +
202 | labs(color = "VIM")
203 | ```
204 |
205 | ## Issues
206 |
207 | ### FAQ
208 |
209 | - **Should genes (features) be rows or columns?**
210 |
211 | To be consistent with common functions such as PCA
212 | (`stats::prcomp`) and t-SNE (`Rtsne::Rtsne`), we require that cells
213 | (observations) be rows and genes (features) be columns of your input
214 | data.
215 |
216 | - **I have installed MAGIC in Python, but Rmagic says it is not
217 | installed!**
218 |
219 | Check your `reticulate::py_discover_config("magic")` and compare it to
220 | the version of Python in which you installed MAGIC (run `which python`
221 | and `which pip` in a terminal). Chances are `reticulate` can’t find the
222 | right version of Python; you can fix this by adding the following line
223 | to your `~/.Renviron`:
224 |
225 | `PATH=/path/to/my/python`
226 |
227 | You can read more about `Renviron` at
228 | .
229 |
230 | ### Help
231 |
232 | Please let us know of any issues at the [GitHub repository](https://github.com/KrishnaswamyLab/MAGIC/issues). If you
233 | have any questions or require assistance using MAGIC, please read the
234 | documentation by running `help(Rmagic::magic)` or contact us at
235 | .
236 |
--------------------------------------------------------------------------------
/Rmagic/README.md:
--------------------------------------------------------------------------------
1 | ---
2 | title : Rmagic
3 | output: github_document
4 | toc: true
5 | ---
6 |
7 |
8 |
9 |
10 |
11 | [](https://pypi.org/project/magic-impute/)
12 | [](https://cran.r-project.org/package=Rmagic)
13 | [](https://github.com/KrishnaswamyLab/MAGIC/actions)
14 | [](https://magic.readthedocs.io/)
15 | [](https://www.cell.com/cell/abstract/S0092-8674(18)30724-4)
16 | [](https://twitter.com/KrishnaswamyLab)
17 | [](https://github.com/KrishnaswamyLab/MAGIC/)
18 |
19 |
20 | Markov Affinity-based Graph Imputation of Cells (MAGIC) is an algorithm for denoising and imputation of single cells applied to single-cell RNA sequencing data, as described in Van Dijk D *et al.* (2018), *Recovering Gene Interactions from Single-Cell Data Using Data Diffusion*, Cell .
21 |
22 |
23 |
24 |
25 | MAGIC reveals the interaction between Vimentin (VIM), Cadherin-1 (CDH1), and Zinc finger E-box-binding homeobox 1 (ZEB1, encoded by colors).
26 |
27 |
28 |
29 | * MAGIC imputes missing data values on sparse data sets, restoring the structure of the data
30 | * It also provides dimensionality reduction and gene expression visualizations
31 | * MAGIC can be performed on a variety of datasets
32 | * Here, we show the usage of MAGIC on a toy dataset
33 | * You can view further examples of MAGIC on real data in our notebooks under `inst/examples`:
34 | * http://htmlpreview.github.io/?https://github.com/KrishnaswamyLab/MAGIC/blob/master/Rmagic/inst/examples/EMT_tutorial.html
35 | * http://htmlpreview.github.io/?https://github.com/KrishnaswamyLab/MAGIC/blob/master/Rmagic/inst/examples/bonemarrow_tutorial.html
36 |
37 | ## Table of Contents
38 |
39 | * [Installation](#installation)
40 | * [Installation from CRAN](#installation-from-cran)
42 | * [Installation from source](#installation-from-source)
43 | * [Quick Start](#quick-start)
44 | * [Tutorial](#tutorial)
45 | * [Issues](#issues)
46 | * [FAQ](#faq)
47 | * [Help](#help)
48 |
49 | ## Installation
50 |
51 | To use MAGIC, you will need to install both the R and Python packages.
52 |
53 | If `python` or `pip` are not installed, you will need to install them. We recommend [Miniconda3](https://conda.io/miniconda.html) to install Python and `pip` together, or otherwise you can install `pip` from https://pip.pypa.io/en/stable/installing/.
54 |
55 | #### Installation from CRAN
56 |
57 | In R, run this command to install MAGIC and all dependencies:
58 |
59 |
60 | ```r
61 | install.packages("Rmagic")
62 | ```
63 |
64 | In a terminal, run the following command to install the Python repository.
65 |
66 |
67 | ```bash
68 | pip install --user magic-impute
69 | ```
70 |
71 | #### Installation from source
72 |
73 | To install the very latest version of MAGIC, you can install from GitHub with the following commands run in a terminal.
74 |
75 |
76 | ```bash
77 | git clone https://github.com/KrishnaswamyLab/MAGIC
78 | cd MAGIC/python
79 | python setup.py install --user
80 | cd ../Rmagic
81 | R CMD INSTALL .
82 | ```
83 |
84 | ## Quick Start
85 |
86 | If you have loaded a data matrix `data` in R (cells on rows, genes on columns) you can run MAGIC as follows:
87 |
88 |
89 | ```r
90 | library(Rmagic)
91 | data_MAGIC <- magic(data)
92 | ```
93 |
94 | ## Tutorial
95 |
96 | #### Extra packages for the tutorial
97 |
98 | We'll install a couple more tools for this tutorial.
99 |
100 |
101 | ```r
102 | if (!require(viridis)) install.packages("viridis")
103 | if (!require(ggplot2)) install.packages("ggplot2")
104 | if (!require(phateR)) install.packages("phateR")
105 | ```
106 |
107 | If you have never used PHATE, you should also install PHATE from the command line as follows:
108 |
109 |
110 | ```bash
111 | pip install --user phate
112 | ```
113 |
114 | ### Loading packages
115 |
116 | We load the Rmagic package and a few others for convenience functions.
117 |
118 |
119 | ```r
120 | library(Rmagic)
121 | #> Loading required package: Matrix
122 | library(ggplot2)
123 | library(viridis)
124 | #> Loading required package: viridisLite
125 | library(phateR)
126 | #>
127 | #> Attaching package: 'phateR'
128 | #> The following object is masked from 'package:Rmagic':
129 | #>
130 | #> library.size.normalize
131 | ```
132 |
133 | ### Loading data
134 |
135 | The example data is located in the MAGIC R package.
136 |
137 |
138 | ```r
139 | # load data
140 | data(magic_testdata)
141 | magic_testdata[1:5, 1:10]
142 | #> A1BG-AS1 AAMDC AAMP AARSD1 ABCA12 ABCG2 ABHD13 AC007773.2
143 | #> 6564 0.0000000 0.0000000 0.0000000 0 0 0 0.0000000 0
144 | #> 3835 0.0000000 0.8714711 0.0000000 0 0 0 0.8714711 0
145 | #> 6318 0.7739207 0.0000000 0.7739207 0 0 0 0.0000000 0
146 | #> 3284 0.0000000 0.0000000 0.0000000 0 0 0 0.0000000 0
147 | #> 1171 0.0000000 0.0000000 0.0000000 0 0 0 0.0000000 0
148 | #> AC011998.4 AC013470.6
149 | #> 6564 0 0
150 | #> 3835 0 0
151 | #> 6318 0 0
152 | #> 3284 0 0
153 | #> 1171 0 0
154 | ```
155 |
156 | ### Running MAGIC
157 |
158 | Running MAGIC is as simple as running the `magic` function.
159 |
160 |
161 | ```r
162 | # run MAGIC
163 | data_MAGIC <- magic(magic_testdata, genes = c("VIM", "CDH1", "ZEB1"))
164 | ```
165 |
166 | We can plot the data before and after MAGIC to visualize the results.
167 |
168 |
169 | ```r
170 | ggplot(magic_testdata) +
171 | geom_point(aes(VIM, CDH1, colour = ZEB1)) +
172 | scale_colour_viridis(option = "B")
173 | ```
174 |
175 |
176 |
177 | The data suffers from dropout to the point that we cannot infer anything about the gene-gene relationships.
178 |
179 |
180 | ```r
181 | ggplot(data_MAGIC) +
182 | geom_point(aes(VIM, CDH1, colour = ZEB1)) +
183 | scale_colour_viridis(option = "B")
184 | ```
185 |
186 |
187 |
188 | As you can see, the gene-gene relationships are much clearer after MAGIC.
189 |
190 | The data is sometimes a little too smooth - we can decrease `t` from the automatic value to reduce the amount of diffusion. We pass the original result to the argument `init` to avoid recomputing intermediate steps.
191 |
192 |
193 | ```r
194 | data_MAGIC <- magic(magic_testdata, genes = c("VIM", "CDH1", "ZEB1"), t = 6, init = data_MAGIC)
195 | ggplot(data_MAGIC) +
196 | geom_point(aes(VIM, CDH1, colour = ZEB1)) +
197 | scale_colour_viridis(option = "B")
198 | ```
199 |
200 |
201 |
202 |
203 | We can look at the entire smoothed matrix with `genes='all_genes'`, passing the original result to the argument `init` to avoid recomputing intermediate steps. Note that this matrix may be large and could take up a lot of memory.
204 |
205 |
206 | ```r
207 | data_MAGIC <- magic(magic_testdata, genes = "all_genes", t = 6, init = data_MAGIC)
208 | as.data.frame(data_MAGIC)[1:5, 1:10]
209 | #> A1BG-AS1 AAMDC AAMP AARSD1 ABCA12 ABCG2
210 | #> 6564 0.03332336 0.06672377 0.1718769 0.01765440 0.03641116 0.01703004
211 | #> 3835 0.03142519 0.06720022 0.1568662 0.01619578 0.03338187 0.01729001
212 | #> 6318 0.03519781 0.06551774 0.1811869 0.01462556 0.03595934 0.02094741
213 | #> 3284 0.03130388 0.06374405 0.1621586 0.01686944 0.03288072 0.01786413
214 | #> 1171 0.03515109 0.06447265 0.1735847 0.01444976 0.03791399 0.01995593
215 | #> ABHD13 AC007773.2 AC011998.4 AC013470.6
216 | #> 6564 0.07692547 0.0007960324 0.001382103 0.002978190
217 | #> 3835 0.07578407 0.0007146892 0.001206586 0.002613474
218 | #> 6318 0.08120989 0.0011273292 0.001594218 0.005743911
219 | #> 3284 0.07568180 0.0007009115 0.001017284 0.002982551
220 | #> 1171 0.07975672 0.0010427596 0.001982926 0.005315534
221 | ```
222 |
223 | ### Visualizing MAGIC values on PCA
224 |
225 | We can visualize the results of MAGIC on PCA as follows.
226 |
227 |
228 | ```r
229 | data_MAGIC_PCA <- as.data.frame(prcomp(data_MAGIC)$x)
230 | ggplot(data_MAGIC_PCA) +
231 | geom_point(aes(x = PC1, y = PC2, color = data_MAGIC$result$VIM)) +
232 | scale_color_viridis(option = "B") +
233 | labs(color = "VIM")
234 | ```
235 |
236 |
237 |
238 |
239 | ### Visualizing MAGIC values on PHATE
240 |
241 | We can visualize the results of MAGIC on PHATE as follows. We set `t` and `knn` manually, because this toy dataset is too small for PHATE's defaults to be meaningful; however, the default values work well for single-cell genomic data.
242 |
243 |
244 | ```r
245 | data_PHATE <- phate(magic_testdata, knn = 3, t = 15)
247 | ggplot(data_PHATE) +
248 | geom_point(aes(x = PHATE1, y = PHATE2, color = data_MAGIC$result$VIM)) +
249 | scale_color_viridis(option = "B") +
250 | labs(color = "VIM")
251 | ```
252 |
253 |
254 |
255 | ## Issues
256 |
257 | ### FAQ
258 |
259 | - **Should genes (features) be rows or columns?**
260 |
261 | To be consistent with common functions such as PCA
262 | (`stats::prcomp`) and t-SNE (`Rtsne::Rtsne`), we require that cells
263 | (observations) be rows and genes (features) be columns of your input
264 | data.
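
If your matrix is stored the other way round (genes on rows, cells on columns), transposing it gives the expected layout. The snippet below is a minimal sketch using only base R; the matrix, its values, and the names `genes_by_cells` and `cells_by_genes` are made up for illustration.

```r
# Toy matrix stored genes-by-cells (3 genes x 4 cells); values are invented.
genes_by_cells <- matrix(
  c(0, 2, 1, 0,
    5, 0, 0, 3,
    1, 1, 0, 0),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("VIM", "CDH1", "ZEB1"), paste0("cell", 1:4))
)

# Transpose so that cells (observations) are rows and genes (features)
# are columns, which is the layout described above.
cells_by_genes <- t(genes_by_cells)

dim(cells_by_genes)       # rows = cells, columns = genes
colnames(cells_by_genes)  # gene names are now the column names
```

The transposed object is what you would pass on as the input data.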
265 |
266 | - **I have installed MAGIC in Python, but Rmagic says it is not
267 | installed!**
268 |
269 | Check your `reticulate::py_discover_config("magic")` and compare it to
270 | the version of Python in which you installed MAGIC (run `which python`
271 | and `which pip` in a terminal). Chances are `reticulate` can’t find the
272 | right version of Python; you can fix this by adding the following line
273 | to your `~/.Renviron`:
274 |
275 | `PATH=/path/to/my/python`
276 |
277 | You can read more about `Renviron` at
278 | .
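
The check described in the installation FAQ above can be sketched from within R roughly as follows. This is an illustrative snippet, not part of Rmagic: the variable `magic_found` is a made-up name, and the `requireNamespace()` and `tryCatch()` guards are only there so the code also runs on machines without `reticulate` or without a usable Python.

```r
# Guarded so the snippet runs even where reticulate is not installed.
if (requireNamespace("reticulate", quietly = TRUE)) {
  magic_found <- tryCatch({
    # Show which Python reticulate would pick for the "magic" module.
    print(reticulate::py_discover_config("magic"))
    # TRUE if the "magic" module can be imported from that Python.
    reticulate::py_module_available("magic")
  }, error = function(e) NA)
} else {
  magic_found <- NA
}
magic_found
```

If the Python path printed here differs from the one reported by `which python` in your terminal, that mismatch is the likely cause of the error.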
279 |
280 | ### Help
281 |
282 | Please let us know of any issues at the [GitHub repository](https://github.com/KrishnaswamyLab/MAGIC/issues). If you
283 | have any questions or require assistance using MAGIC, please read the
284 | documentation by running `help(Rmagic::magic)` or contact us at
285 | .
286 |
--------------------------------------------------------------------------------
/Rmagic/data-raw/generate_test_data.R:
--------------------------------------------------------------------------------
1 | library(readr)
2 | magic_testdata <- read_csv("../../data/HMLE_TGFb_day_8_10.csv.gz")
3 | set.seed(42)
4 | keep_cols <- colSums(magic_testdata > 0) > 10
5 | keep_rows <- rowSums(magic_testdata) > 2000
6 | magic_testdata <- magic_testdata[keep_rows, keep_cols]
7 | magic_testdata <- Rmagic::library.size.normalize(magic_testdata)
8 | magic_testdata <- sqrt(magic_testdata)
9 | select_cols <- c(
10 | colnames(magic_testdata)[ceiling(runif(200) * ncol(magic_testdata))],
11 | c("VIM", "CDH1", "ZEB1")
12 | )
13 | magic_testdata <- magic_testdata[, colnames(magic_testdata) %in% select_cols]
14 | select_rows <- ceiling(runif(500) * nrow(magic_testdata))
15 | magic_testdata <- magic_testdata[select_rows, ]
16 | write_csv(magic_testdata, "../../data/test_data.csv")
17 | usethis::use_data(magic_testdata)
18 |
--------------------------------------------------------------------------------
/Rmagic/data/magic_testdata.rda:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/data/magic_testdata.rda
--------------------------------------------------------------------------------
/Rmagic/inst/CITATION:
--------------------------------------------------------------------------------
1 | bibentry(
2 | bibtype="Article",
3 | title="Recovering Gene Interactions from Single-Cell Data Using Data Diffusion",
4 | author = c(
5 | person("David", "van Dijk"),
6 | person("Roshan", "Sharma"),
7 | person("Juozas", "Nainys"),
8 | person("Kristina", "Yim"),
9 | person("Pooja", "Kathail"),
10 | person("Ambrose J.", "Carr"),
11 | person("Cassandra", "Burdziak"),
12 | person("Kevin R.", "Moon"),
13 | person("Christine L.", "Chaffer"),
14 | person("Diwakar", "Pattabiraman"),
15 | person("Brian", "Bierie"),
16 | person("Linas", "Mazutis"),
17 | person("Guy", "Wolf"),
18 | person("Smita", "Krishnaswamy"),
19 | person("Dana", "Pe'er")),
20 | year=2018,
21 | url="https://www.cell.com/cell/abstract/S0092-8674(18)30724-4",
22 | doi="10.1016/j.cell.2018.05.061",
23 | journal="Cell",
24 | publisher="Cell Press"
25 | )
26 |
--------------------------------------------------------------------------------
/Rmagic/inst/examples/BMMSC_data_R_after_magic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/inst/examples/BMMSC_data_R_after_magic.png
--------------------------------------------------------------------------------
/Rmagic/inst/examples/BMMSC_data_R_before_magic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/inst/examples/BMMSC_data_R_before_magic.png
--------------------------------------------------------------------------------
/Rmagic/inst/examples/BMMSC_data_R_pca_colored_by_magic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/inst/examples/BMMSC_data_R_pca_colored_by_magic.png
--------------------------------------------------------------------------------
/Rmagic/inst/examples/BMMSC_data_R_phate_colored_by_magic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/inst/examples/BMMSC_data_R_phate_colored_by_magic.png
--------------------------------------------------------------------------------
/Rmagic/inst/examples/EMT_data_R_after_magic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/inst/examples/EMT_data_R_after_magic.png
--------------------------------------------------------------------------------
/Rmagic/inst/examples/EMT_data_R_before_magic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/inst/examples/EMT_data_R_before_magic.png
--------------------------------------------------------------------------------
/Rmagic/inst/examples/EMT_data_R_pca_colored_by_magic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/inst/examples/EMT_data_R_pca_colored_by_magic.png
--------------------------------------------------------------------------------
/Rmagic/inst/examples/EMT_data_R_phate_colored_by_magic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/inst/examples/EMT_data_R_phate_colored_by_magic.png
--------------------------------------------------------------------------------
/Rmagic/inst/examples/bonemarrow_tutorial.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Rmagic Bone Marrow Tutorial"
3 | output:
4 | html_document:
5 | df_print: paged
6 | toc: yes
7 | toc_depth: '3'
8 | ---
9 |
10 |
11 |
12 | ```{r setup, include=FALSE}
13 | knitr::opts_chunk$set(echo = TRUE)
14 | ```
15 |
16 | ## MAGIC (Markov Affinity-Based Graph Imputation of Cells)
17 |
18 | * MAGIC imputes missing data values on sparse data sets, restoring the structure of the data
19 | * It also provides dimensionality reduction and gene expression visualizations
20 | * MAGIC can be performed on a variety of datasets
21 | * Here, we show the effectiveness of MAGIC on erythroid and myeloid cells developing in mouse bone marrow.
22 |
23 | Markov Affinity-based Graph Imputation of Cells (MAGIC) is an algorithm for denoising and transcript recovery in single-cell RNA sequencing data, as described in Van Dijk D *et al.* (2018), *Recovering Gene Interactions from Single-Cell Data Using Data Diffusion*, Cell .
24 |
25 | ### Installation
26 |
27 | If you haven't yet installed MAGIC, you can find installation instructions in our [GitHub README](https://github.com/KrishnaswamyLab/MAGIC/tree/master/Rmagic).
28 |
29 | We'll install a couple more tools for this tutorial.
30 |
31 | ```{r install_extras, eval=FALSE}
32 | if (!require(viridis)) install.packages("viridis")
33 | if (!require(ggplot2)) install.packages("ggplot2")
34 | if (!require(readr)) install.packages("readr")
35 | if (!require(phateR)) install.packages("phateR")
36 | ```
37 |
38 | If you have never used PHATE, you should also install PHATE from the command line as follows:
39 |
40 | ```{bash install_python_phate, eval=FALSE}
41 | pip install --user phate
42 | ```
43 |
44 | ### Loading packages
45 |
46 | We load the Rmagic package and a few others for convenience functions.
47 |
48 | ```{r load_packages}
49 | library(Rmagic)
50 | library(ggplot2)
51 | library(readr)
52 | library(viridis)
53 | library(phateR)
54 | ```
55 |
56 | ### Loading data
57 |
58 | In this tutorial, we will analyse myeloid and erythroid cells in mouse bone marrow, as described in Paul et al., 2015. The example data is located in the PHATE GitHub repository and we can load it directly from the web. You can run this tutorial with your own data by downloading and opening it in RStudio.
59 |
60 | ```{r load_data}
61 | # load data
62 | bmmsc <- read_csv("https://github.com/KrishnaswamyLab/PHATE/raw/master/data/BMMC_myeloid.csv.gz")
63 | bmmsc <- bmmsc[, 2:ncol(bmmsc)]
64 | bmmsc[1:5, 1:10]
65 | ```
66 |
67 | ### Filtering data
68 |
69 | First, we need to remove lowly expressed genes and cells with small library size.
70 |
71 | ```{r}
72 | # keep genes expressed in more than 10 cells
73 | keep_cols <- colSums(bmmsc > 0) > 10
74 | bmmsc <- bmmsc[, keep_cols]
75 | # look at the distribution of library sizes
76 | ggplot() +
77 | geom_histogram(aes(x = rowSums(bmmsc)), bins = 50) +
78 | geom_vline(xintercept = 1000, color = "red")
79 | ```
80 |
81 | ```{r}
82 | # keep cells with more than 1000 UMIs
83 | keep_rows <- rowSums(bmmsc) > 1000
84 | bmmsc <- bmmsc[keep_rows, ]
85 | ```
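
The two filtering steps above can be tried on a tiny matrix to see exactly what is kept. This is a minimal sketch in base R: the counts and the thresholds `min_cells` and `min_umis` are invented so that the toy example is readable, and are much smaller than the values of 10 and 1000 used in this tutorial.

```r
# Toy count matrix: 5 cells (rows) x 5 genes (columns); values are invented.
counts <- matrix(
  c(0, 3, 0, 1, 0,
    2, 4, 0, 0, 1,
    0, 5, 0, 2, 0,
    1, 0, 0, 0, 0,
    0, 6, 1, 3, 2),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("cell", 1:5), paste0("gene", 1:5))
)
min_cells <- 2  # hypothetical threshold (the tutorial uses 10)
min_umis <- 3   # hypothetical threshold (the tutorial uses 1000)

# Genes: count the cells in which each gene is detected (count > 0)
# and keep genes detected in more than min_cells cells.
keep_cols <- colSums(counts > 0) > min_cells
filtered <- counts[, keep_cols, drop = FALSE]

# Cells: keep cells whose library size (row sum) exceeds min_umis.
keep_rows <- rowSums(filtered) > min_umis
filtered <- filtered[keep_rows, , drop = FALSE]

dim(filtered)
```

Note that both comparisons are strict (`>`), so a gene detected in exactly `min_cells` cells is dropped.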
86 |
87 | ### Normalizing data
88 |
89 | We should library size normalize and transform the data prior to MAGIC. Many people use a log transform, which requires adding a "pseudocount" to avoid log(0). We square root instead, which has a similar form but doesn't suffer from instabilities at zero.
90 |
91 | ```{r normalize}
92 | bmmsc <- library.size.normalize(bmmsc)
93 | bmmsc <- sqrt(bmmsc)
94 | ```
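
To make the normalization step more concrete, here is a rough base R sketch. It assumes that library size normalization rescales every cell to a common total, taken here to be the median library size; this is an assumption for illustration and not necessarily what `library.size.normalize` does internally. The matrix and all variable names are made up.

```r
# Toy count matrix: 3 cells (rows) x 3 genes (columns); values are invented.
counts <- matrix(
  c(10, 0, 30,
     1, 1,  2,
     0, 5, 15),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("cell", 1:3), paste0("gene", 1:3))
)

# Library size = total counts per cell.
library_size <- rowSums(counts)
# Assumed common target: the median library size across cells.
target <- median(library_size)

# Dividing a matrix by a vector of length nrow recycles down the columns,
# so each row (cell) is divided by its own library size.
normalized <- counts / library_size * target
rowSums(normalized)  # every cell now has the same total

# Square root is defined at zero, so no pseudocount is needed.
sqrt_data <- sqrt(normalized)
# A log transform needs a pseudocount, because log(0) is -Inf.
log_data <- log(normalized + 1)
```

Both transforms compress large values; the square root simply avoids having to choose a pseudocount.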
95 |
96 | ### Running MAGIC
97 |
98 | Running MAGIC is as simple as running the `magic` function.
99 |
100 | ```{r run_magic}
101 | # run MAGIC
102 | bmmsc_MAGIC <- magic(bmmsc, genes = c("Mpo", "Klf1", "Ifitm1"))
103 | ```
104 |
105 | We can plot the data before and after MAGIC to visualize the results.
106 |
107 | ```{r plot_raw}
108 | ggplot(bmmsc) +
109 | geom_point(aes(Mpo, Klf1, color = Ifitm1)) +
110 | scale_color_viridis(option = "B")
111 | ggsave("BMMSC_data_R_before_magic.png", width = 5, height = 5)
112 | ```
113 |
114 | The data suffers from dropout to the point that we cannot infer anything about the gene-gene relationships.
115 |
116 | ```{r plot_magic}
117 | ggplot(bmmsc_MAGIC) +
118 | geom_point(aes(Mpo, Klf1, color = Ifitm1)) +
119 | scale_color_viridis(option = "B")
120 | ```
121 |
122 | As you can see, the gene-gene relationships are much clearer after MAGIC. These relationships also match the biological progression we expect to see - Ifitm1 is a stem cell marker, Klf1 is an erythroid marker, and Mpo is a myeloid marker.
123 |
124 | ### Rerunning MAGIC with new parameters
125 |
126 | If the data is not smooth enough, we can increase `t` from the default value of 3 to increase the amount of diffusion. We pass the original result to the argument `init` to avoid recomputing intermediate steps.
127 |
128 | ```{r decrease_t}
129 | bmmsc_MAGIC <- magic(bmmsc,
130 | genes = c("Mpo", "Klf1", "Ifitm1"),
131 | t = 4, init = bmmsc_MAGIC
132 | )
133 | ggplot(bmmsc_MAGIC) +
134 | geom_point(aes(Mpo, Klf1, color = Ifitm1)) +
135 | scale_color_viridis(option = "B")
136 | ggsave("BMMSC_data_R_after_magic.png", width = 5, height = 5)
137 | ```
138 |
139 | ### Visualizing MAGIC values on PCA
140 |
141 | We can visualize the results of MAGIC on PCA with `genes="pca_only"`.
142 |
143 | ```{r run_pca}
144 | bmmsc_MAGIC_PCA <- magic(bmmsc,
145 | genes = "pca_only",
146 | t = 4, init = bmmsc_MAGIC
147 | )
148 | ggplot(bmmsc_MAGIC_PCA) +
149 | geom_point(aes(x = PC1, y = PC2, color = bmmsc_MAGIC$result$Klf1)) +
150 | scale_color_viridis(option = "B") +
151 | labs(color = "Klf1")
152 | ggsave("BMMSC_data_R_pca_colored_by_magic.png", width = 5, height = 5)
153 | ```
154 |
155 |
156 | ### Visualizing MAGIC values on PHATE
157 |
158 | We can visualize the results of MAGIC on PHATE as follows.
159 |
160 | ```{r run_phate}
161 | bmmsc_PHATE <- phate(bmmsc)
162 | ggplot(bmmsc_PHATE) +
163 | geom_point(aes(x = PHATE1, y = PHATE2, color = bmmsc_MAGIC$result$Klf1)) +
164 | scale_color_viridis(option = "B") +
165 | labs(color = "Klf1")
166 | ggsave("BMMSC_data_R_phate_colored_by_magic.png", width = 5, height = 5)
167 | ```
168 |
169 | ### Using MAGIC for downstream analysis
170 |
171 | We can look at the entire smoothed matrix with `genes='all_genes'`, passing the original result to the argument `init` to avoid recomputing intermediate steps. Note that this matrix may be large and could take up a lot of memory.
172 |
173 | ```{r run_magic_full_matrix}
174 | bmmsc_MAGIC <- magic(bmmsc,
175 | genes = "all_genes",
176 | t = 4, init = bmmsc_MAGIC
177 | )
178 | as.data.frame(bmmsc_MAGIC)[1:5, 1:10]
179 | ```
180 |
181 | ## Help
182 |
183 | If you have any questions or require assistance using MAGIC, please contact us at .
184 |
--------------------------------------------------------------------------------
/Rmagic/inst/examples/emt_tutorial.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Rmagic EMT Tutorial"
3 | output:
4 | html_document:
5 | df_print: paged
6 | toc: yes
7 | toc_depth: '3'
8 | ---
9 |
10 |
11 |
12 | ```{r setup, include=FALSE}
13 | knitr::opts_chunk$set(echo = TRUE)
14 | ```
15 |
16 | ## MAGIC (Markov Affinity-Based Graph Imputation of Cells)
17 |
18 | * MAGIC imputes missing data values on sparse data sets, restoring the structure of the data
19 | * It also provides dimensionality reduction and gene expression visualizations
20 | * MAGIC can be performed on a variety of datasets
21 | * Here, we show the effectiveness of MAGIC on epithelial-to-mesenchymal transition (EMT) data
22 |
23 | Markov Affinity-based Graph Imputation of Cells (MAGIC) is an algorithm for denoising and transcript recovery in single-cell RNA sequencing data, as described in Van Dijk D *et al.* (2018), *Recovering Gene Interactions from Single-Cell Data Using Data Diffusion*, Cell .
24 |
25 | ### Installation
26 |
27 | If you haven't yet installed MAGIC, you can find installation instructions in our [GitHub README](https://github.com/KrishnaswamyLab/MAGIC/tree/master/Rmagic).
28 |
29 | We'll install a couple more tools for this tutorial.
30 |
31 | ```{r install_extras, eval=FALSE}
32 | if (!require(viridis)) install.packages("viridis")
33 | if (!require(ggplot2)) install.packages("ggplot2")
34 | if (!require(readr)) install.packages("readr")
35 | if (!require(phateR)) install.packages("phateR")
36 | ```
37 |
38 | If you have never used PHATE, you should also install PHATE from the command line as follows:
39 |
40 | ```{bash install_python_phate, eval=FALSE}
41 | pip install --user phate
42 | ```
43 |
44 | ### Loading packages
45 |
46 | We load the Rmagic package and a few others for convenience functions.
47 |
48 | ```{r load_packages}
49 | library(Rmagic)
50 | library(readr)
51 | library(ggplot2)
52 | library(viridis)
53 | library(phateR)
54 | ```
55 |
56 | ### Loading data
57 |
58 | In this tutorial, we will analyze single-cell RNA sequencing of the epithelial-to-mesenchymal transition. The example data is located in the MAGIC GitHub repository. You can run this tutorial with your own data by downloading and opening it in RStudio.
59 |
60 | ```{r load_data}
61 | # load data
62 | data <- read_csv("../../../data/HMLE_TGFb_day_8_10.csv.gz")
63 | data[1:5, 1:10]
64 | ```
65 |
66 | ### Filtering data
67 |
68 | First, we need to remove lowly expressed genes.
69 |
70 | ```{r remove_rare_genes}
71 | # keep genes expressed in more than 10 cells
72 | keep_cols <- colSums(data > 0) > 10
73 | data <- data[, keep_cols]
74 | ```
75 |
76 | Ordinarily, we would remove cells with small library sizes. In this dataset, it has already been done; however, if you wanted to do that, you could do it with the code below.
77 |
78 | ```{r libsize_histogram}
79 | # look at the distribution of library sizes
80 | ggplot() +
81 | geom_histogram(aes(x = rowSums(data)), bins = 50) +
82 | geom_vline(xintercept = 1000, color = "red")
83 | ```
84 |
85 | ```{r filter_libsize}
86 | if (FALSE) {
87 | # keep cells with more than 1000 and fewer than 15000 UMIs
88 | keep_rows <- rowSums(data) > 1000 & rowSums(data) < 15000
89 | data <- data[keep_rows, ]
90 | }
91 | ```
92 |
93 | ### Normalizing data
94 |
95 | We should library size normalize the data prior to MAGIC. Often we also transform the data with either log or square root. The log transform is commonly used, which requires adding a "pseudocount" to avoid log(0). We normally square root instead, which has a similar form but doesn't suffer from instabilities at zero. For this dataset, though, it is not necessary as the distribution of gene expression is not too extreme.
96 |
97 | ```{r normalize}
98 | data <- library.size.normalize(data)
99 | if (FALSE) {
100 | data <- sqrt(data)
101 | }
102 | ```
103 |
104 | ### Running MAGIC
105 |
106 | Running MAGIC is as simple as running the `magic` function. Because this dataset is rather small, we can decrease `knn` from the default of 5 down to 3.
107 |
108 | ```{r run_magic}
109 | # run MAGIC
110 | data_MAGIC <- magic(data, knn = 3, genes = c("VIM", "CDH1", "ZEB1"))
111 | ```
112 |
113 | We can plot the data before and after MAGIC to visualize the results.
114 |
115 | ```{r plot_raw}
116 | ggplot(data) +
117 | geom_point(aes(VIM, CDH1, color = ZEB1)) +
118 | scale_color_viridis(option = "B")
119 | ggsave("EMT_data_R_before_magic.png", width = 5, height = 5)
120 | ```
121 |
122 | ```{r plot_magic}
123 | ggplot(data_MAGIC) +
124 | geom_point(aes(VIM, CDH1, color = ZEB1)) +
125 | scale_color_viridis(option = "B")
126 | ggsave("EMT_data_R_after_magic.png", width = 5, height = 5)
127 | ```
128 |
129 | As you can see, the gene-gene relationships are much clearer after MAGIC.
130 |
131 | ### Visualizing MAGIC values on PCA
132 |
133 | We can visualize the results of MAGIC on PCA with `genes="pca_only"`.
134 |
135 | ```{r run_pca}
136 | data_MAGIC_PCA <- magic(data,
137 | genes = "pca_only",
138 | knn = 15, init = data_MAGIC
139 | )
140 | ggplot(data_MAGIC_PCA) +
141 | geom_point(aes(x = PC1, y = PC2, color = data_MAGIC$result$VIM)) +
142 | scale_color_viridis(option = "B") +
143 | labs(color = "VIM")
144 | ggsave("EMT_data_R_pca_colored_by_magic.png", width = 5, height = 5)
145 | ```
146 |
147 | ### Using MAGIC for downstream analysis
148 |
149 | We can look at the entire smoothed matrix with `genes='all_genes'`, passing the original result to the argument `init` to avoid recomputing intermediate steps. Note that this matrix may be large and could take up a lot of memory.
150 |
151 | ```{r run_magic_full_matrix}
152 | data_MAGIC <- magic(data,
153 | genes = "all_genes",
154 | knn = 15, init = data_MAGIC
155 | )
156 | as.data.frame(data_MAGIC)[1:5, 1:10]
157 | ```
158 |
159 | ### Help
160 |
161 | If you have any questions or require assistance using MAGIC, please contact us at .
162 |
--------------------------------------------------------------------------------
/Rmagic/man/as.data.frame.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/magic.R
3 | \name{as.data.frame.magic}
4 | \alias{as.data.frame.magic}
5 | \title{Convert a MAGIC object to a data.frame}
6 | \usage{
7 | \method{as.data.frame}{magic}(x, ...)
8 | }
9 | \arguments{
10 | \item{x}{A fitted MAGIC object}
11 |
12 | \item{...}{Arguments for as.data.frame()}
13 | }
14 | \description{
15 | Returns the smoothed data matrix
16 | }
17 |
--------------------------------------------------------------------------------
/Rmagic/man/as.matrix.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/magic.R
3 | \name{as.matrix.magic}
4 | \alias{as.matrix.magic}
5 | \title{Convert a MAGIC object to a matrix}
6 | \usage{
7 | \method{as.matrix}{magic}(x, ...)
8 | }
9 | \arguments{
10 | \item{x}{A fitted MAGIC object}
11 |
12 | \item{...}{Arguments for as.matrix()}
13 | }
14 | \description{
15 | Returns the smoothed data matrix
16 | }
17 |
--------------------------------------------------------------------------------
/Rmagic/man/check_pymagic_version.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/utils.R
3 | \name{check_pymagic_version}
4 | \alias{check_pymagic_version}
5 | \title{Check that the current MAGIC version in Python is up to date.}
6 | \usage{
7 | check_pymagic_version()
8 | }
9 | \description{
10 | Check that the current MAGIC version in Python is up to date.
11 | }
12 |
--------------------------------------------------------------------------------
/Rmagic/man/figures/README-plot_magic-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/man/figures/README-plot_magic-1.png
--------------------------------------------------------------------------------
/Rmagic/man/figures/README-plot_raw-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/man/figures/README-plot_raw-1.png
--------------------------------------------------------------------------------
/Rmagic/man/figures/README-plot_reduced_t-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/man/figures/README-plot_reduced_t-1.png
--------------------------------------------------------------------------------
/Rmagic/man/figures/README-run_pca-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/man/figures/README-run_pca-1.png
--------------------------------------------------------------------------------
/Rmagic/man/figures/README-run_phate-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/man/figures/README-run_phate-1.png
--------------------------------------------------------------------------------
/Rmagic/man/figures/README-unnamed-chunk-1-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/man/figures/README-unnamed-chunk-1-1.png
--------------------------------------------------------------------------------
/Rmagic/man/figures/README-unnamed-chunk-3-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/Rmagic/man/figures/README-unnamed-chunk-3-1.png
--------------------------------------------------------------------------------
/Rmagic/man/ggplot.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/magic.R
3 | \name{ggplot.magic}
4 | \alias{ggplot.magic}
5 | \title{Convert a MAGIC object to a data.frame for ggplot}
6 | \usage{
7 | \method{ggplot}{magic}(data, ...)
8 | }
9 | \arguments{
10 | \item{data}{A fitted MAGIC object}
11 |
12 | \item{...}{Arguments for ggplot()}
13 | }
14 | \description{
15 | Passes the smoothed data matrix to ggplot
16 | }
17 | \examples{
18 | if (pymagic_is_available() && require(ggplot2)) {
19 | data(magic_testdata)
20 | data_magic <- magic(magic_testdata, genes = c("VIM", "CDH1", "ZEB1"))
21 | ggplot(data_magic, aes(VIM, CDH1, colour = ZEB1)) +
22 | geom_point()
23 | }
24 | }
25 |
--------------------------------------------------------------------------------
/Rmagic/man/install.magic.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/utils.R
3 | \name{install.magic}
4 | \alias{install.magic}
5 | \title{Install MAGIC Python Package}
6 | \usage{
7 | install.magic(
8 | envname = "r-reticulate",
9 | method = "auto",
10 | conda = "auto",
11 | pip = TRUE,
12 | ...
13 | )
14 | }
15 | \arguments{
16 | \item{envname}{Name of environment to install packages into}
17 |
18 | \item{method}{Installation method. By default, "auto" automatically finds
19 | a method that will work in the local environment. Change the default to
20 | force a specific installation method. Note that the "virtualenv" method
21 | is not available on Windows.}
22 |
23 | \item{conda}{Path to conda executable (or "auto" to find conda using the PATH
24 | and other conventional install locations).}
25 |
26 | \item{pip}{Install from pip, if possible.}
27 |
28 | \item{...}{Additional arguments passed to conda_install() or
29 | virtualenv_install().}
30 | }
31 | \description{
32 | Install MAGIC Python package into a virtualenv or conda env.
33 | }
34 | \details{
35 | On Linux and OS X the "virtualenv" method will be used by default
36 | ("conda" will be used if virtualenv isn't available). On Windows,
37 | the "conda" method is always used.
38 | }
39 |
--------------------------------------------------------------------------------
/Rmagic/man/library.size.normalize.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/preprocessing.R
3 | \name{library.size.normalize}
4 | \alias{library.size.normalize}
5 | \title{Performs L1 normalization on input data such that the sum of expression
6 | values for each cell sums to 1, then returns normalized matrix to the metric
7 | space using median UMI count per cell effectively scaling all cells as if
8 | they were sampled evenly.}
9 | \usage{
10 | library.size.normalize(data, verbose = FALSE)
11 | }
12 | \arguments{
13 | \item{data}{matrix (n_samples, n_dimensions)
14 | 2 dimensional input data array with n cells and p dimensions}
15 |
16 | \item{verbose}{boolean, default=FALSE. If true, print verbose output}
17 | }
18 | \value{
19 | data_norm matrix (n_samples, n_dimensions)
20 | 2 dimensional array with normalized gene expression values
21 | }
22 | \description{
23 | Performs L1 normalization on input data such that the sum of expression
24 | values for each cell sums to 1, then returns normalized matrix to the metric
25 | space using median UMI count per cell effectively scaling all cells as if
26 | they were sampled evenly.
27 | }
28 |
--------------------------------------------------------------------------------
/Rmagic/man/magic.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/magic.R
3 | \name{magic}
4 | \alias{magic}
5 | \alias{magic.default}
6 | \alias{magic.seurat}
7 | \alias{magic.Seurat}
8 | \title{Perform MAGIC on a data matrix}
9 | \usage{
10 | magic(data, ...)
11 |
12 | \method{magic}{default}(
13 | data,
14 | genes = NULL,
15 | knn = 5,
16 | knn.max = NULL,
17 | decay = 1,
18 | t = 3,
19 | npca = 100,
20 | solver = "exact",
21 | init = NULL,
22 | t.max = 20,
23 | knn.dist.method = "euclidean",
24 | verbose = 1,
25 | n.jobs = 1,
26 | seed = NULL,
27 | k = NULL,
28 | alpha = NULL,
29 | ...
30 | )
31 |
32 | \method{magic}{seurat}(
33 | data,
34 | genes = NULL,
35 | knn = 5,
36 | knn.max = NULL,
37 | decay = 1,
38 | t = 3,
39 | npca = 100,
40 | solver = "exact",
41 | init = NULL,
42 | t.max = 20,
43 | knn.dist.method = "euclidean",
44 | verbose = 1,
45 | n.jobs = 1,
46 | seed = NULL,
47 | ...
48 | )
49 |
50 | \method{magic}{Seurat}(
51 | data,
52 | assay = NULL,
53 | genes = NULL,
54 | knn = 5,
55 | knn.max = NULL,
56 | decay = 1,
57 | t = 3,
58 | npca = 100,
59 | solver = "exact",
60 | init = NULL,
61 | t.max = 20,
62 | knn.dist.method = "euclidean",
63 | verbose = 1,
64 | n.jobs = 1,
65 | seed = NULL,
66 | ...
67 | )
68 | }
69 | \arguments{
70 | \item{data}{input data matrix or Seurat object}
71 |
72 | \item{...}{Arguments passed to and from other methods}
73 |
74 | \item{genes}{character or integer vector, default: NULL
75 | vector of column names or column indices for which to return smoothed data
76 | If 'all_genes' or NULL, the entire smoothed matrix is returned}
77 |
78 | \item{knn}{int, optional, default: 5
79 | number of nearest neighbors on which to compute bandwidth}
80 |
81 | \item{knn.max}{int, optional, default: NULL
82 | maximum number of neighbors for each point. If NULL, defaults to 3*knn}
83 |
84 | \item{decay}{int, optional, default: 1
85 | sets decay rate of kernel tails.
86 | If NULL, alpha decaying kernel is not used}
87 |
88 | \item{t}{int, optional, default: 3
89 | power to which the diffusion operator is raised;
90 | sets the level of diffusion. If 'auto', t is selected according to the
91 | Procrustes disparity of the diffused data.}
92 |
93 | \item{npca}{number of PCA components that should be used; default: 100.}
94 |
95 | \item{solver}{str, optional, default: 'exact'
96 | Which solver to use. "exact" uses the implementation described
97 | in van Dijk et al. (2018). "approximate" uses a faster implementation
98 | that performs imputation in the PCA space and then projects back to the
99 | gene space. Note, the "approximate" solver may return negative values.}
100 |
101 | \item{init}{magic object, optional
102 | object to use for initialization. Avoids recomputing
103 | intermediate steps if parameters are the same.}
104 |
105 | \item{t.max}{int, optional, default: 20
106 | Maximum value of t to test for automatic t selection.}
107 |
108 | \item{knn.dist.method}{string, optional, default: 'euclidean'.
109 | Distance metric for building the kNN graph.
110 | Recommended values: 'euclidean', 'cosine'.
111 | Any metric from `scipy.spatial.distance` can be used.}
112 |
113 | \item{verbose}{`int` or `boolean`, optional (default : 1)
114 | If `TRUE` or `> 0`, print verbose updates.}
115 |
116 | \item{n.jobs}{`int`, optional (default: 1)
117 | The number of jobs to use for the computation.
118 | If -1 all CPUs are used. If 1 is given, no parallel computing code is
119 | used at all, which is useful for debugging.
120 | For n.jobs below -1, (n.cpus + 1 + n.jobs) are used. Thus for
121 | n.jobs = -2, all CPUs but one are used}
122 |
123 | \item{seed}{int or `NULL`, random state (default: `NULL`)}
124 |
125 | \item{k}{Deprecated. Use `knn`.}
126 |
127 | \item{alpha}{Deprecated. Use `decay`.}
128 |
129 | \item{assay}{Assay to use for imputation, defaults to the default assay}
130 | }
131 | \value{
132 | If a Seurat object is passed, a Seurat object is returned. Otherwise, a "magic" object containing:
133 | * **result**: matrix containing smoothed expression values
134 | * **operator**: The MAGIC operator (python magic.MAGIC object)
135 | * **params**: Parameters passed to magic
136 | }
137 | \description{
138 | Markov Affinity-based Graph Imputation of Cells (MAGIC) is an
139 | algorithm for denoising and transcript recovery of single cells
140 | applied to single-cell RNA sequencing data, as described in
141 | van Dijk et al, 2018.
142 | }
143 | \examples{
144 | if (pymagic_is_available()) {
145 | data(magic_testdata)
146 |
147 | # Run MAGIC
148 | data_magic <- magic(magic_testdata, genes = c("VIM", "CDH1", "ZEB1"))
149 | summary(data_magic)
150 | ## CDH1 VIM ZEB1
151 | ## Min. :0.4303 Min. :3.854 Min. :0.01111
152 | ## 1st Qu.:0.4444 1st Qu.:3.947 1st Qu.:0.01145
153 | ## Median :0.4462 Median :3.964 Median :0.01153
154 | ## Mean :0.4461 Mean :3.965 Mean :0.01152
155 | ## 3rd Qu.:0.4478 3rd Qu.:3.982 3rd Qu.:0.01160
156 | ## Max. :0.4585 Max. :4.127 Max. :0.01201
157 |
158 | # Plot the result with ggplot2
159 | if (require(ggplot2)) {
160 | ggplot(data_magic) +
161 | geom_point(aes(x = VIM, y = CDH1, color = ZEB1))
162 | }
163 |
164 | # Run MAGIC again returning all genes
165 | # We use the last run as initialization
166 | data_magic <- magic(magic_testdata, genes = "all_genes", init = data_magic)
167 | # Extract the smoothed data matrix to use in downstream analysis
168 | data_smooth <- as.matrix(data_magic)
169 | }
170 |
171 | if (pymagic_is_available() && require(Seurat)) {
172 | data(magic_testdata)
173 |
174 | # Create a Seurat object
175 | seurat_object <- CreateSeuratObject(counts = t(magic_testdata), assay = "RNA")
176 | seurat_object <- NormalizeData(object = seurat_object)
177 | seurat_object <- ScaleData(object = seurat_object)
178 |
179 | # Run MAGIC and reset the active assay
180 | seurat_object <- magic(seurat_object)
181 | seurat_object@active.assay <- "MAGIC_RNA"
182 |
183 | # Analyze with Seurat
184 | VlnPlot(seurat_object, features = c("VIM", "ZEB1", "CDH1"))
185 | }
186 | }
187 |
--------------------------------------------------------------------------------
/Rmagic/man/magic_testdata.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/magic_testdata.R
3 | \docType{data}
4 | \name{magic_testdata}
5 | \alias{magic_testdata}
6 | \title{Fake scRNAseq data for examples}
7 | \format{
8 | A matrix with 500 rows and 197 variables
9 | }
10 | \source{
11 | The authors
12 | }
13 | \usage{
14 | magic_testdata
15 | }
16 | \description{
17 | A subsampled dataset of epithelial to mesenchymal transition
18 | }
19 | \keyword{datasets}
20 |
--------------------------------------------------------------------------------
/Rmagic/man/print.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/magic.R
3 | \name{print.magic}
4 | \alias{print.magic}
5 | \title{Print a MAGIC object}
6 | \usage{
7 | \method{print}{magic}(x, ...)
8 | }
9 | \arguments{
10 | \item{x}{A fitted MAGIC object}
11 |
12 | \item{...}{Arguments for print()}
13 | }
14 | \description{
15 | This avoids spamming the user's console with a list of many large matrices
16 | }
17 | \examples{
18 | if (pymagic_is_available()) {
19 | data(magic_testdata)
20 | data_magic <- magic(magic_testdata)
21 | print(data_magic)
22 | ## MAGIC with elements
23 | ## $result : (500, 197)
24 | ## $operator : Python MAGIC operator
25 | ## $params : list with elements (data, knn, decay, t, npca, knn.dist.method)
26 | }
27 | }
28 |
--------------------------------------------------------------------------------
/Rmagic/man/pymagic_is_available.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/utils.R
3 | \name{pymagic_is_available}
4 | \alias{pymagic_is_available}
5 | \title{Check whether MAGIC Python package is available and can be loaded}
6 | \usage{
7 | pymagic_is_available()
8 | }
9 | \description{
10 | This is used primarily to avoid running tests on CRAN
11 | and elsewhere where the Python package should not be
12 | installed.
13 | }
14 |
--------------------------------------------------------------------------------
/Rmagic/man/summary.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/magic.R
3 | \name{summary.magic}
4 | \alias{summary.magic}
5 | \title{Summarize a MAGIC object}
6 | \usage{
7 | \method{summary}{magic}(object, ...)
8 | }
9 | \arguments{
10 | \item{object}{A fitted MAGIC object}
11 |
12 | \item{...}{Arguments for summary()}
13 | }
14 | \description{
15 | Summarize a MAGIC object
16 | }
17 | \examples{
18 | if (pymagic_is_available()) {
19 | data(magic_testdata)
20 | data_magic <- magic(magic_testdata)
21 | summary(data_magic)
22 | ## ZEB1
23 | ## Min. :0.01071
24 | ## 1st Qu.:0.01119
25 | ## Median :0.01130
26 | ## Mean :0.01129
27 | ## 3rd Qu.:0.01140
28 | ## Max. :0.01201
29 | }
30 | }
31 |
--------------------------------------------------------------------------------
/Rmagic/tests/test_magic.R:
--------------------------------------------------------------------------------
1 | # To run this file:
2 | # - Set the working directory to 'Rmagic/tests'.
3 |
4 | library(Rmagic)
5 | library(ggplot2)
6 | library(readr)
7 | library(viridis)
8 |
9 | seurat_obj <- function() {
10 | # load data
11 | data <- read.csv("../../data/HMLE_TGFb_day_8_10.csv.gz")
12 |
13 | seurat_raw_data <- t(data)
14 | rownames(seurat_raw_data) <- colnames(data)
15 | colnames(seurat_raw_data) <- rownames(data)
16 | seurat_obj <- Seurat::CreateSeuratObject(raw.data = seurat_raw_data)
17 |
18 | # run MAGIC
19 | data_MAGIC <- magic(data)
20 | seurat_MAGIC <- magic(seurat_obj)
21 | stopifnot(all(data_MAGIC$result == t(seurat_MAGIC@data)))
22 |
23 | # plot
24 | p <- ggplot(data) +
25 | geom_point(aes(VIM, CDH1, colour = ZEB1)) +
26 | scale_colour_viridis(option = "B")
27 | ggsave("EMT_data_R_before_magic.png", plot = p, width = 5, height = 5)
28 |
29 | p_m <- ggplot(data_MAGIC) +
30 | geom_point(aes(VIM, CDH1, colour = ZEB1)) +
31 | scale_colour_viridis(option = "B")
32 | ggsave("EMT_data_R_after_magic.png", plot = p_m, width = 5, height = 5)
33 | }
34 |
--------------------------------------------------------------------------------
/data/HMLE_TGFb_day_8_10.csv.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/data/HMLE_TGFb_day_8_10.csv.gz
--------------------------------------------------------------------------------
/magic.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/magic.gif
--------------------------------------------------------------------------------
/matlab/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/matlab/.DS_Store
--------------------------------------------------------------------------------
/matlab/MAGIC Tutorial MATLAB-EMT.pptx:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/matlab/MAGIC Tutorial MATLAB-EMT.pptx
--------------------------------------------------------------------------------
/matlab/compute_kernel.m:
--------------------------------------------------------------------------------
1 | function K = compute_alpha_kernel_sparse(data, varargin)
2 | % K = compute_alpha_kernel_sparse(data, varargin)
3 | % Computes sparse alpha-decay kernel
4 | % varargin:
5 | % 'npca' (default = [], no PCA)
6 | % Perform fast random PCA before computing distances
7 | % 'k' (default = 5)
8 | % k for the knn distances for the locally adaptive bandwidth
9 | % 'a' (default = 10)
10 | % The alpha exponent in the alpha-decaying kernel
11 | % 'distfun' (default = 'euclidean')
12 | % Input distance function
13 | k = 5;
14 | a = 10;
15 | npca = [];
16 | distfun = 'euclidean';
17 | % get the input parameters
18 | if ~isempty(varargin)
19 | for j = 1:length(varargin)
20 | % k nearest neighbors
21 | if strcmp(varargin{j}, 'k')
22 | k = varargin{j+1};
23 | end
24 | % alpha
25 | if strcmp(varargin{j}, 'a')
26 | a = varargin{j+1};
27 | end
28 | % npca to project data
29 | if strcmp(varargin{j}, 'npca')
30 | npca = varargin{j+1};
31 | end
32 | % distfun
33 | if strcmp(varargin{j}, 'distfun')
34 | distfun = varargin{j+1};
35 | end
36 | end
37 | end
38 |
39 | th = 1e-4;
40 |
41 | k_knn = k * 20;
42 |
43 | bth=(-log(th))^(1/a);
44 |
45 | disp 'Computing alpha decay kernel:'
46 |
47 | N = size(data, 1); % number of cells
48 |
49 | if ~isempty(npca)
50 | disp ' PCA'
51 | data_centr = bsxfun(@minus, data, mean(data,1));
52 | [U,~,~] = randPCA(data_centr', npca); % fast random svd
53 | %[U,~,~] = svds(data', npca);
54 | data_pc = data_centr * U; % PCA project
55 | else
56 | data_pc = data;
57 | end
58 |
59 | disp(['Number of samples = ' num2str(N)])
60 |
61 | % Initial knn search and set the radius
62 | disp(['First iteration: k = ' num2str(k_knn)])
63 | [idx, kdist]=knnsearch(data_pc,data_pc,'k',k_knn,'Distance',distfun);
64 | epsilon=kdist(:,k+1);
65 |
66 | % Find the points that have large enough distance to be below the kernel
67 | % threshold
68 | below_thresh=kdist(:,end)>=bth*epsilon;
69 |
70 | idx_thresh=find(below_thresh);
71 |
72 | if ~isempty(idx_thresh)
73 | K=exp(-(kdist(idx_thresh,:)./epsilon(idx_thresh)).^a);
74 | K(K<=th)=0;
75 | K=K(:);
76 | i = repmat(idx_thresh',1,size(idx,2));
77 | i = i(:);
78 | idx_temp=idx(idx_thresh,:);
79 | j = idx_temp(:);
80 | end
81 |
82 | disp(['Number of samples below the threshold from 1st iter: ' num2str(length(idx_thresh))])
83 |
84 | % Loop increasing k by factor of 20 until we cover 90% of the data
85 | while length(idx_thresh)<.9*N
86 | k_knn=min(20*k_knn,N);
87 | data_pc2=data_pc(~below_thresh,:);
88 | epsilon2=epsilon(~below_thresh);
89 | disp(['Next iteration: k= ' num2str(k_knn)])
90 | [idx2, kdist2]=knnsearch(data_pc,data_pc2,'k',k_knn,'Distance',distfun);
91 |
92 | % Find the points that have large enough distance
93 | below_thresh2=kdist2(:,end)>=bth*epsilon2;
94 | idx_thresh2=find(below_thresh2);
95 |
96 | if ~isempty(idx_thresh2)
97 | K2=exp(-(kdist2(idx_thresh2,:)./epsilon2(idx_thresh2)).^a);
98 | K2(K2<=th)=0;
99 | idx_notthresh=find(~below_thresh);
100 | i2=repmat(idx_notthresh(idx_thresh2)',1,size(idx2,2));
101 | i2=i2(:);
102 | idx_temp=idx2(idx_thresh2,:);
103 | j2=idx_temp(:);
104 |
105 | i=[i; i2];
106 | j=[j; j2];
107 | K=[K; K2(:)];
108 | % Add the newly thresholded points to the old ones
109 | below_thresh(idx_notthresh(idx_thresh2))=1;
110 | idx_thresh=find(below_thresh);
111 | end
112 | disp(['Number of samples below the threshold from the next iter: ' num2str(length(idx_thresh))])
113 | end
114 |
115 | % Radius search for the rest
116 | if length(idx_thresh) 0
91 | W = sparse(i, j, s);
92 | else
93 | W = sparse(i, j, ones(size(s))); % unweighted kNN graph
94 | end
95 |
96 | disp 'Symmetrize distances'
97 | W = W + W';
98 |
99 | if epsilon > 0
100 | disp 'Computing kernel'
101 | [i,j,s] = find(W);
102 | i = [i; (1:N)'];
103 | j = [j; (1:N)'];
104 | s = [s./(epsilon^2); zeros(N,1)];
105 | s = exp(-s);
106 | W = sparse(i,j,s);
107 | end
108 |
109 | disp 'Markov normalization'
110 | W = bsxfun(@rdivide, W, sum(W,2)); % Markov normalization
111 |
112 | disp 'done'
113 |
--------------------------------------------------------------------------------
/matlab/compute_optimal_t.m:
--------------------------------------------------------------------------------
1 | function [data_opt_t, t_opt] = compute_optimal_t(data, DiffOp, varargin)
2 |
3 | t_max = 32;
4 | make_plot = true;
5 | th = 1e-3;
6 | data_opt_t = []; t_opt = []; % both stay empty if the threshold is never reached
7 |
8 | if ~isempty(varargin)
9 | for j = 1:length(varargin)
10 | if strcmp(varargin{j}, 't_max')
11 | t_max = varargin{j+1};
12 | end
13 | if strcmp(varargin{j}, 'make_plot')
14 | make_plot = varargin{j+1};
15 | end
16 | if strcmp(varargin{j}, 'th')
17 | th = varargin{j+1};
18 | end
19 | end
20 | end
21 |
22 | data_prev = data;
23 | if make_plot
24 | error_vec = nan(t_max,1);
25 | for I=1:t_max
26 | disp(['t = ' num2str(I)]);
27 | data_curr = DiffOp * data_prev;
28 | error_vec(I) = procrustes(data_prev, data_curr);
29 | if error_vec(I) < th && isempty(data_opt_t)
30 | data_opt_t = data_curr;
31 | end
32 | data_prev = data_curr;
33 | end
34 | t_opt = find(error_vec < th, 1, 'first');
35 |
36 | figure;
37 | hold all;
38 | plot(1:t_max, error_vec, '*-');
39 | plot(t_opt, error_vec(t_opt), 'or', 'markersize', 10);
40 | xlabel 't'
41 | ylabel 'error'
42 | axis tight
43 | ylim([0 ceil(max(error_vec)*10)/10]);
44 | plot(xlim, [th th], '--k');
45 | legend({'y' 'optimal t' ['y=' num2str(th)]});
46 | set(gca,'xtick',1:t_max);
47 | set(gca,'ytick',0:0.1:1);
48 | else
49 | for I=1:t_max
50 | disp(['t = ' num2str(I)]);
51 | data_curr = DiffOp * data_prev;
52 | error = procrustes(data_prev, data_curr);
53 | if error < th
54 | t_opt = I;
55 | data_opt_t = data_curr;
56 | break
57 | end
58 | data_prev = data_curr;
59 | end
60 | end
61 |
62 | disp(['optimal t = ' num2str(t_opt)]);
63 |
--------------------------------------------------------------------------------
/matlab/load_10x.m:
--------------------------------------------------------------------------------
1 | function [data, gene_names, gene_ids, cells] = load_10x(data_dir, varargin)
2 | % [data, gene_names, gene_ids, cells] = load_10x(data_dir, varargin)
3 | % loads 10x sparse format data
4 | % data_dir is dir that contains matrix.mtx, genes.tsv and barcodes.tsv
5 | % varargin
6 | % 'sparse', true -- returns data matrix in sparse format (default 'false')
7 |
8 | return_sparse = false;
9 |
10 | if isempty(data_dir)
11 | data_dir = './';
12 | elseif data_dir(end) ~= '/'
13 | data_dir = [data_dir '/'];
14 | end
15 |
16 | for i=1:length(varargin)-1
17 | if (strcmp(varargin{i}, 'sparse'))
18 | return_sparse = varargin{i+1};
19 | end
20 | end
21 |
22 | filename_dataMatrix = [data_dir 'matrix.mtx'];
23 | filename_genes = [data_dir 'genes.tsv'];
24 | filename_cells = [data_dir 'barcodes.tsv'];
25 |
26 |
27 | % Read in gene expression matrix (sparse matrix)
28 | % Rows = genes, columns = cells
29 | fprintf('LOADING\n')
30 | dataMatrix = mmread(filename_dataMatrix);
31 | fprintf(' Data matrix (%i cells x %i genes): %s\n', ...
32 | size(dataMatrix'), ['''' filename_dataMatrix '''' ])
33 |
34 | % Read in row names (gene names / IDs)
35 | dataMatrix_genes = table2cell( ...
36 | readtable(filename_genes, ...
37 | 'FileType','text','ReadVariableNames',0));
38 | dataMatrix_cells = table2cell( ...
39 | readtable(filename_cells, ...
40 | 'FileType','text','ReadVariableNames',0));
41 |
42 | % Remove empty cells
43 | col_keep = any(dataMatrix,1);
44 | dataMatrix = dataMatrix(:,col_keep);
45 | dataMatrix_cells = dataMatrix_cells(col_keep,:);
46 | fprintf(' Removed %i empty cells\n', full(sum(~col_keep)))
47 |
48 | % Remove empty genes
49 | genes_keep = any(dataMatrix,2);
50 | dataMatrix = dataMatrix(genes_keep,:);
51 | dataMatrix_genes = dataMatrix_genes(genes_keep,:);
52 | fprintf(' Removed %i empty genes\n', full(sum(~genes_keep)))
53 |
54 | data = dataMatrix';
55 | if ~return_sparse
56 | data = full(data);
57 | end
58 | gene_names = dataMatrix_genes(:,2);
59 | gene_ids = dataMatrix_genes(:,1);
60 | cells = dataMatrix_cells;
61 |
--------------------------------------------------------------------------------
/matlab/mmread.m:
--------------------------------------------------------------------------------
1 | function [A,rows,cols,entries,rep,field,symm] = mmread(filename)
2 | %
3 | % function [A] = mmread(filename)
4 | %
5 | % function [A,rows,cols,entries,rep,field,symm] = mmread(filename)
6 | %
7 | % Reads the contents of the Matrix Market file 'filename'
8 | % into the matrix 'A'. 'A' will be either sparse or full,
9 | % depending on the Matrix Market format indicated by
10 | % 'coordinate' (coordinate sparse storage), or
11 | % 'array' (dense array storage). The data will be duplicated
12 | % as appropriate if symmetry is indicated in the header.
13 | %
14 | % Optionally, size information about the matrix can be
15 | % obtained by using the return values rows, cols, and
16 | % entries, where entries is the number of nonzero entries
17 | % in the final matrix. Type information can also be retrieved
18 | % using the optional return values rep (representation), field,
19 | % and symm (symmetry).
20 | %
21 |
22 | mmfile = fopen(filename,'r');
23 | if ( mmfile == -1 )
24 | disp(filename);
25 | error('File not found');
26 | end;
27 |
28 | header = fgets(mmfile);
29 | if (header == -1 )
30 | error('Empty file.')
31 | end
32 |
33 | % NOTE: If using a version of Matlab for which strtok is not
34 | % defined, substitute 'gettok' for 'strtok' in the
35 | % following lines, and download gettok.m from the
36 | % Matrix Market site.
37 | [head0,header] = strtok(header); % see note above
38 | [head1,header] = strtok(header);
39 | [rep,header] = strtok(header);
40 | [field,header] = strtok(header);
41 | [symm,header] = strtok(header);
42 | head1 = lower(head1);
43 | rep = lower(rep);
44 | field = lower(field);
45 | symm = lower(symm);
46 | if ( length(symm) == 0 )
47 | disp(['Not enough words in header line of file ',filename])
48 | disp('Recognized format: ')
49 | disp('%%MatrixMarket matrix representation field symmetry')
50 | error('Check header line.')
51 | end
52 | if ( ~ strcmp(head0,'%%MatrixMarket') )
53 | error('Not a valid MatrixMarket header.')
54 | end
55 | if ( ~ strcmp(head1,'matrix') )
56 | disp(['This seems to be a MatrixMarket ',head1,' file.']);
57 | disp('This function only knows how to read MatrixMarket matrix files.');
58 | disp(' ');
59 | error(' ');
60 | end
61 |
62 | % Read through comments, ignoring them
63 |
64 | commentline = fgets(mmfile);
65 | while length(commentline) > 0 & commentline(1) == '%',
66 | commentline = fgets(mmfile);
67 | end
68 |
69 | % Read size information, then branch according to
70 | % sparse or dense format
71 |
72 | if ( strcmp(rep,'coordinate')) % read matrix given in sparse
73 | % coordinate matrix format
74 |
75 | [sizeinfo,count] = sscanf(commentline,'%d%d%d');
76 | while ( count == 0 )
77 | commentline = fgets(mmfile);
78 | if (commentline == -1 )
79 | error('End-of-file reached before size information was found.')
80 | end
81 | [sizeinfo,count] = sscanf(commentline,'%d%d%d');
82 | if ( count > 0 & count ~= 3 )
83 | error('Invalid size specification line.')
84 | end
85 | end
86 | rows = sizeinfo(1);
87 | cols = sizeinfo(2);
88 | entries = sizeinfo(3);
89 |
90 | if ( strcmp(field,'real') || strcmp(field,'integer') ) % real valued entries:
91 |
92 | [T,count] = fscanf(mmfile,'%f',3);
93 | T = [T; fscanf(mmfile,'%f')];
94 | if ( size(T) ~= 3*entries )
95 | message = ...
96 | str2mat('Data file does not contain expected amount of data.',...
97 | 'Check that number of data lines matches nonzero count.');
98 | disp(message);
99 | error('Invalid data.');
100 | end
101 | T = reshape(T,3,entries)';
102 | A = sparse(T(:,1), T(:,2), T(:,3), rows , cols);
103 |
104 | elseif ( strcmp(field,'complex')) % complex valued entries:
105 |
106 | T = fscanf(mmfile,'%f',4);
107 | T = [T; fscanf(mmfile,'%f')];
108 | if ( size(T) ~= 4*entries )
109 | message = ...
110 | str2mat('Data file does not contain expected amount of data.',...
111 | 'Check that number of data lines matches nonzero count.');
112 | disp(message);
113 | error('Invalid data.');
114 | end
115 | T = reshape(T,4,entries)';
116 | A = sparse(T(:,1), T(:,2), T(:,3) + T(:,4)*sqrt(-1), rows , cols);
117 |
118 | elseif ( strcmp(field,'pattern')) % pattern matrix (no values given):
119 |
120 | T = fscanf(mmfile,'%f',2);
121 | T = [T; fscanf(mmfile,'%f')];
122 | if ( size(T) ~= 2*entries )
123 | message = ...
124 | str2mat('Data file does not contain expected amount of data.',...
125 | 'Check that number of data lines matches nonzero count.');
126 | disp(message);
127 | error('Invalid data.');
128 | end
129 | T = reshape(T,2,entries)';
130 | A = sparse(T(:,1), T(:,2), ones(entries,1) , rows , cols);
131 |
132 | end
133 |
134 | elseif ( strcmp(rep,'array') ) % read matrix given in dense
135 | % array (column major) format
136 |
137 | [sizeinfo,count] = sscanf(commentline,'%d%d');
138 | while ( count == 0 )
139 | commentline = fgets(mmfile);
140 | if (commentline == -1 )
141 | error('End-of-file reached before size information was found.')
142 | end
143 | [sizeinfo,count] = sscanf(commentline,'%d%d');
144 | if ( count > 0 & count ~= 2 )
145 | error('Invalid size specification line.')
146 | end
147 | end
148 | rows = sizeinfo(1);
149 | cols = sizeinfo(2);
150 | entries = rows*cols;
151 | if ( strcmp(field,'real') || strcmp(field,'integer') ) % real valued entries:
152 | A = fscanf(mmfile,'%f',1);
153 | A = [A; fscanf(mmfile,'%f')];
154 | if ( strcmp(symm,'symmetric') | strcmp(symm,'hermitian') | strcmp(symm,'skew-symmetric') )
155 | for j=1:cols-1,
156 | currenti = j*rows;
157 | A = [A(1:currenti); zeros(j,1);A(currenti+1:length(A))];
158 | end
159 | elseif ( ~ strcmp(symm,'general') )
160 | disp('Unrecognized symmetry')
161 | disp(symm)
162 | disp('Recognized choices:')
163 | disp(' symmetric')
164 | disp(' hermitian')
165 | disp(' skew-symmetric')
166 | disp(' general')
167 | error('Check symmetry specification in header.');
168 | end
169 | A = reshape(A,rows,cols);
170 | elseif ( strcmp(field,'complex')) % complex valued entries:
171 | tmpr = fscanf(mmfile,'%f',1);
172 | tmpi = fscanf(mmfile,'%f',1);
173 | A = tmpr+tmpi*i;
174 | for j=1:entries-1
175 | tmpr = fscanf(mmfile,'%f',1);
176 | tmpi = fscanf(mmfile,'%f',1);
177 | A = [A; tmpr + tmpi*i];
178 | end
179 | if ( strcmp(symm,'symmetric') | strcmp(symm,'hermitian') | strcmp(symm,'skew-symmetric') )
180 | for j=1:cols-1,
181 | currenti = j*rows;
182 | A = [A(1:currenti); zeros(j,1);A(currenti+1:length(A))];
183 | end
184 | elseif ( ~ strcmp(symm,'general') )
185 | disp('Unrecognized symmetry')
186 | disp(symm)
187 | disp('Recognized choices:')
188 | disp(' symmetric')
189 | disp(' hermitian')
190 | disp(' skew-symmetric')
191 | disp(' general')
192 | error('Check symmetry specification in header.');
193 | end
194 | A = reshape(A,rows,cols);
195 | elseif ( strcmp(field,'pattern')) % pattern (makes no sense for dense)
196 | disp(['Matrix type: ',field])
197 | error('Pattern matrix type invalid for array storage format.');
198 | else % Unknown matrix type
199 | disp(['Matrix type: ',field])
200 | error('Invalid matrix type specification. Check header against MM documentation.');
201 | end
202 | end
203 |
204 | %
205 | % If symmetric, skew-symmetric or Hermitian, duplicate lower
206 | % triangular part and modify entries as appropriate:
207 | %
208 |
209 | if ( strcmp(symm,'symmetric') )
210 | A = A + A.' - diag(diag(A));
211 | entries = nnz(A);
212 | elseif ( strcmp(symm,'hermitian') )
213 | A = A + A' - diag(diag(A));
214 | entries = nnz(A);
215 | elseif ( strcmp(symm,'skew-symmetric') )
216 | A = A - A';
217 | entries = nnz(A);
218 | end
219 |
220 | fclose(mmfile);
221 | % Done.
222 |
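For readers who want to see the coordinate branch of the reader above in isolation, here is a minimal Python sketch of the same steps: check the header, skip comment lines, read the size line, then read one entry per line. It is an illustration under stated assumptions, not a port of mmread.m: the function name `read_matrix_market` is hypothetical, only the `real`, `integer` and `pattern` fields with `general` or `symmetric` symmetry are handled, and the result is a plain dictionary rather than a sparse matrix.

```python
def read_matrix_market(text):
    """Parse MatrixMarket 'coordinate' text into (rows, cols, entries).

    entries maps 0-based (row, col) tuples to floats. Hypothetical helper:
    complex, hermitian, skew-symmetric and dense 'array' files are rejected.
    """
    lines = text.splitlines()
    header = lines[0].split()
    if len(header) != 5 or header[0] != "%%MatrixMarket" or header[1].lower() != "matrix":
        raise ValueError("Not a valid MatrixMarket matrix header.")
    rep, field, symm = (part.lower() for part in header[2:5])
    if rep != "coordinate":
        raise ValueError("This sketch only reads the coordinate format.")
    if field not in ("real", "integer", "pattern"):
        raise ValueError("Unsupported field: " + field)
    if symm not in ("general", "symmetric"):
        raise ValueError("Unsupported symmetry: " + symm)

    # Skip comment and blank lines; the first remaining line is the size line.
    body = [ln for ln in lines[1:] if ln.strip() and not ln.startswith("%")]
    rows, cols, nnz = (int(tok) for tok in body[0].split())
    if len(body) - 1 != nnz:
        raise ValueError("Number of data lines does not match nonzero count.")

    entries = {}
    for ln in body[1:]:
        tok = ln.split()
        i, j = int(tok[0]) - 1, int(tok[1]) - 1  # file indices are 1-based
        value = 1.0 if field == "pattern" else float(tok[2])
        entries[(i, j)] = value
        if symm == "symmetric" and i != j:
            entries[(j, i)] = value  # mirror the stored triangle
    return rows, cols, entries
```

Keeping the entries in a dictionary makes the symmetric mirroring step easy to inspect; a real reader would normally hand the triples to a sparse matrix constructor instead.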
--------------------------------------------------------------------------------
/matlab/project_genes.m:
--------------------------------------------------------------------------------
1 | function [M, genes_found, gene_idx] = project_genes(genes, genes_all, pc_imputed, U)
2 | % project_genes -- obtain gene values from compressed imputed data
3 | % [M, genes_found, gene_idx] = project_genes(genes, genes_all, pc_imputed, U) computes
4 | % gene values (M) for given gene names (genes) given all gene names (genes_all), loadings
5 | % (U), and imputed principal components (pc_imputed).
6 | %
7 | % Since pc_imputed and U are both narrow matrices the imputed data can be
8 | % stored in a memory efficient way, without having to store the dense
9 | % matrix.
10 |
11 | [gene_idx,locb] = ismember(lower(genes_all), lower(genes));
12 | [~,sidx] = sort(locb(gene_idx));
13 | idx = find(gene_idx);
14 | idx = idx(sidx);
15 | M = pc_imputed * U(idx,:)'; % project
16 | genes_found = genes_all(idx);
17 | gene_idx = find(gene_idx);
18 |
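The projection performed by project_genes.m can be sketched in plain Python as follows. This is an illustrative sketch, not part of the package: it uses lists of rows instead of matrices so that it runs without dependencies, it assumes gene names are unique up to case, and the parameter name `loadings` stands in for `U`. As in the MATLAB code, matching is case-insensitive and the output columns follow the order of the requested genes.

```python
def project_genes(genes, genes_all, pc_imputed, loadings):
    """Return (values, genes_found) for the requested genes.

    pc_imputed: list of rows, one per cell, each of length npca
    loadings:   list of rows, one per gene in genes_all, each of length npca
    """
    # Case-insensitive lookup from gene name to its row in the loadings.
    position = {name.lower(): j for j, name in enumerate(genes_all)}
    # Keep only the requested genes that exist, in the order requested.
    idx = [position[g.lower()] for g in genes if g.lower() in position]
    # values[c][g] is the dot product of cell c's imputed PCs with gene g's loadings.
    values = [
        [sum(p * u for p, u in zip(cell, loadings[j])) for j in idx]
        for cell in pc_imputed
    ]
    return values, [genes_all[j] for j in idx]
```

Only the loadings of the requested genes are touched, which is why the full dense cells-by-genes matrix never has to be formed.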
--------------------------------------------------------------------------------
/matlab/randPCA.m:
--------------------------------------------------------------------------------
1 | function [U,S,V] = randPCA(A,k,its,l)
2 |
3 | %PCA Low-rank approximation in SVD form.
4 | %
5 | %
6 | % [U,S,V] = PCA(A) constructs a nearly optimal rank-6 approximation
7 | % USV' to A, using 2 full iterations of a block Lanczos method
8 | % of block size 6+2=8, started with an n x 8 random matrix,
9 | % when A is m x n; the ref. below explains "nearly optimal."
10 | % The smallest dimension of A must be >= 6 when A is
11 | % the only input to PCA.
12 | %
13 | % [U,S,V] = PCA(A,k) constructs a nearly optimal rank-k approximation
14 | % USV' to A, using 2 full iterations of a block Lanczos method
15 | % of block size k+2, started with an n x (k+2) random matrix,
16 | % when A is m x n; the ref. below explains "nearly optimal."
17 | % k must be a positive integer <= the smallest dimension of A.
18 | %
19 | % [U,S,V] = PCA(A,k,its) constructs a nearly optimal rank-k approx. USV'
20 | % to A, using its full iterations of a block Lanczos method
21 | % of block size k+2, started with an n x (k+2) random matrix,
22 | % when A is m x n; the ref. below explains "nearly optimal."
23 | % k must be a positive integer <= the smallest dimension of A,
24 | % and its must be a nonnegative integer.
25 | %
26 | % [U,S,V] = PCA(A,k,its,l) constructs a nearly optimal rank-k approx.
27 | % USV' to A, using its full iterates of a block Lanczos method
28 | % of block size l, started with an n x l random matrix,
29 | % when A is m x n; the ref. below explains "nearly optimal."
30 | % k must be a positive integer <= the smallest dimension of A,
31 | % its must be a nonnegative integer,
32 | % and l must be a positive integer >= k.
33 | %
34 | %
35 | % The low-rank approximation USV' is in the form of an SVD in the sense
36 | % that the columns of U are orthonormal, as are the columns of V,
37 | % the entries of S are all nonnegative, and the only nonzero entries
38 | % of S appear in non-increasing order on its diagonal.
39 | % U is m x k, V is n x k, and S is k x k, when A is m x n.
40 | %
41 | % Increasing its or l improves the accuracy of the approximation USV'
42 | % to A; the ref. below describes how the accuracy depends on its and l.
43 | %
44 | %
45 | % Note: PCA invokes RAND after calling RNG('default'), so results are
46 | % repeatable; seeding RAND before invoking PCA has no effect.
47 | %
48 | % Note: PCA currently requires the user to center and normalize the rows
49 | % or columns of the input matrix A before invoking PCA (if such
50 | % is desired).
51 | %
52 | % Note: The user may ascertain the accuracy of the approximation USV'
53 | % to A by invoking DIFFSNORM(A,U,S,V).
54 | %
55 | %
56 | % inputs (the first is required):
57 | % A -- matrix being approximated
58 | % k -- rank of the approximation being constructed;
59 | % k must be a positive integer <= the smallest dimension of A,
60 | % and defaults to 6
61 | % its -- number of full iterations of a block Lanczos method to conduct;
62 | % its must be a nonnegative integer, and defaults to 2
63 | % l -- block size of the block Lanczos iterations;
64 | % l must be a positive integer >= k, and defaults to k+2
65 | %
66 | % outputs (all three are required):
67 | % U -- m x k matrix in the rank-k approximation USV' to A,
68 | % where A is m x n; the columns of U are orthonormal
69 | % S -- k x k matrix in the rank-k approximation USV' to A,
70 | % where A is m x n; the entries of S are all nonnegative,
71 | % and its only nonzero entries appear in nonincreasing order
72 | % on the diagonal
73 | % V -- n x k matrix in the rank-k approximation USV' to A,
74 | % where A is m x n; the columns of V are orthonormal
75 | %
76 | %
77 | % Example:
78 | % A = rand(1000,2)*rand(2,1000);
79 | % A = A/normest(A);
80 | % [U,S,V] = randPCA(A,2,0);
81 | % diffsnorm(A,U,S,V)
82 | %
83 | % This code snippet produces a rank-2 approximation USV' to A such that
84 | % the columns of U are orthonormal, as are the columns of V, and
85 | % the entries of S are all nonnegative and are zero off the diagonal.
86 | % diffsnorm(A,U,S,V) outputs an estimate of the spectral norm
87 | % of A-USV', which should be close to the machine precision.
88 | %
89 | %
90 | % Reference:
91 | % Nathan Halko, Per-Gunnar Martinsson, and Joel Tropp,
92 | % Finding structure with randomness: Stochastic algorithms
93 | % for constructing approximate matrix decompositions,
94 | % arXiv:0909.4061 [math.NA; math.PR], 2009
95 | % (available at http://arxiv.org).
96 | %
97 | %
98 | % See also PCACOV, PRINCOMP, SVDS.
99 | %
100 |
101 | % Copyright 2009 Mark Tygert.
102 |
103 | warning off;
104 |
105 | %
106 | % Check the number of inputs.
107 | %
108 | if(nargin < 1)
109 | error('MATLAB:pca:TooFewIn',...
110 | 'There must be at least 1 input.')
111 | end
112 |
113 | if(nargin > 4)
114 | error('MATLAB:pca:TooManyIn',...
115 | 'There must be at most 4 inputs.')
116 | end
117 |
118 | %
119 | % Check the number of outputs.
120 | %
121 | if(nargout ~= 3)
122 | error('MATLAB:pca:WrongNumOut',...
123 | 'There must be exactly 3 outputs.')
124 | end
125 |
126 | %
127 | % Set the inputs k, its, and l to default values, if necessary.
128 | %
129 | if(nargin == 1)
130 | k = 6;
131 | its = 2;
132 | l = k+2;
133 | end
134 |
135 | if(nargin == 2)
136 | its = 2;
137 | l = k+2;
138 | end
139 |
140 | if(nargin == 3)
141 | l = k+2;
142 | end
143 |
144 | %
145 | % Check the first input argument.
146 | %
147 | if(~isfloat(A))
148 | error('MATLAB:pca:In1NotFloat',...
149 | 'Input 1 must be a floating-point matrix.')
150 | end
151 |
152 | if(isempty(A))
153 | error('MATLAB:pca:In1Empty',...
154 | 'Input 1 must not be empty.')
155 | end
156 |
157 | %
158 | % Retrieve the dimensions of A.
159 | %
160 | [m n] = size(A);
161 |
162 | %
163 | % Check the remaining input arguments.
164 | %
165 | if(size(k,1) ~= 1 || size(k,2) ~= 1)
166 | error('MATLAB:pca:In2Not1x1',...
167 | 'Input 2 must be a scalar.')
168 | end
169 |
170 | if(size(its,1) ~= 1 || size(its,2) ~= 1)
171 | error('MATLAB:pca:In3Not1x1',...
172 | 'Input 3 must be a scalar.')
173 | end
174 |
175 | if(size(l,1) ~= 1 || size(l,2) ~= 1)
176 | error('MATLAB:pca:In4Not1x1',...
177 | 'Input 4 must be a scalar.')
178 | end
179 |
180 | if(k <= 0)
181 | error('MATLAB:pca:In2NonPos',...
182 | 'Input 2 must be > 0.')
183 | end
184 |
185 | if((k > m) || (k > n))
186 | error('MATLAB:pca:In2TooBig',...
187 | 'Input 2 must be <= the smallest dimension of Input 1.')
188 | end
189 |
190 | if(its < 0)
191 | error('MATLAB:pca:In3Neg',...
192 | 'Input 3 must be >= 0.')
193 | end
194 |
195 | if(l < k)
196 | error('MATLAB:pca:In4ltIn2',...
197 | 'Input 4 must be >= Input 2.')
198 | end
199 |
200 | %
201 | % SVD A directly if (its+1)*l >= m/1.25 or (its+1)*l >= n/1.25.
202 | %
203 | if(((its+1)*l >= m/1.25) || ((its+1)*l >= n/1.25))
204 |
205 | if(~issparse(A))
206 | [U,S,V] = svd(A,'econ');
207 | end
208 |
209 | if(issparse(A))
210 | [U,S,V] = svd(full(A),'econ');
211 | end
212 | %
213 | % Retain only the leftmost k columns of U, the leftmost k columns of V,
214 | % and the uppermost leftmost k x k block of S.
215 | %
216 | U = U(:,1:k);
217 | V = V(:,1:k);
218 | S = S(1:k,1:k);
219 |
220 | return
221 |
222 | end
223 |
224 |
225 | if(m >= n)
226 |
227 | %
228 | % Apply A to a random matrix, obtaining H.
229 | %
230 | %rand('seed',rand('seed'));
231 | rng('default');
232 |
233 | if(isreal(A))
234 | H = A*(2*rand(n,l)-ones(n,l));
235 | end
236 |
237 | if(~isreal(A))
238 | H = A*( (2*rand(n,l)-ones(n,l)) + i*(2*rand(n,l)-ones(n,l)) );
239 | end
240 |
241 | %rand('twister',rand('twister'));
242 | rng('default');
243 |
244 | %
245 | % Initialize F to its final size and fill its leftmost block with H.
246 | %
247 | F = zeros(m,(its+1)*l);
248 | F(1:m, 1:l) = H;
249 |
250 | %
251 | % Apply A*A' to H a total of its times,
252 | % augmenting F with the new H each time.
253 | %
254 | for it = 1:its
255 | H = (H'*A)';
256 | H = A*H;
257 | F(1:m, (1+it*l):((it+1)*l)) = H;
258 | end
259 |
260 | clear H;
261 |
262 | %
263 | % Form a matrix Q whose columns constitute an orthonormal basis
264 | % for the columns of F.
265 | %
266 | [Q,R,E] = qr(F,0);
267 |
268 | clear F R E;
269 |
270 | %
271 | % SVD Q'*A to obtain approximations to the singular values
272 | % and right singular vectors of A; adjust the left singular vectors
273 | % of Q'*A to approximate the left singular vectors of A.
274 | %
275 | [U2,S,V] = svd(Q'*A,'econ');
276 | U = Q*U2;
277 |
278 | clear Q U2;
279 |
280 | %
281 | % Retain only the leftmost k columns of U, the leftmost k columns of V,
282 | % and the uppermost leftmost k x k block of S.
283 | %
284 | U = U(:,1:k);
285 | V = V(:,1:k);
286 | S = S(1:k,1:k);
287 |
288 | end
289 |
290 |
291 | if(m < n)
292 |
293 | %
294 | % Apply A' to a random matrix, obtaining H.
295 | %
296 | %rand('seed',rand('seed'));
297 | rng('default');
298 |
299 | if(isreal(A))
300 | H = ((2*rand(l,m)-ones(l,m))*A)';
301 | end
302 |
303 | if(~isreal(A))
304 | H = (( (2*rand(l,m)-ones(l,m)) + i*(2*rand(l,m)-ones(l,m)) )*A)';
305 | end
306 |
307 | %rand('twister',rand('twister'));
308 | rng('default');
309 |
310 | %
311 | % Initialize F to its final size and fill its leftmost block with H.
312 | %
313 | F = zeros(n,(its+1)*l);
314 | F(1:n, 1:l) = H;
315 |
316 | %
317 | % Apply A'*A to H a total of its times,
318 | % augmenting F with the new H each time.
319 | %
320 | for it = 1:its
321 | H = A*H;
322 | H = (H'*A)';
323 | F(1:n, (1+it*l):((it+1)*l)) = H;
324 | end
325 |
326 | clear H;
327 |
328 | %
329 | % Form a matrix Q whose columns constitute an orthonormal basis
330 | % for the columns of F.
331 | %
332 | [Q,R,E] = qr(F,0);
333 |
334 | clear F R E;
335 |
336 | %
337 | % SVD A*Q to obtain approximations to the singular values
338 | % and left singular vectors of A; adjust the right singular vectors
339 | % of A*Q to approximate the right singular vectors of A.
340 | %
341 | [U,S,V2] = svd(A*Q,'econ');
342 | V = Q*V2;
343 |
344 | clear Q V2;
345 |
346 | %
347 | % Retain only the leftmost k columns of U, the leftmost k columns of V,
348 | % and the uppermost leftmost k x k block of S.
349 | %
350 | U = U(:,1:k);
351 | V = V(:,1:k);
352 | S = S(1:k,1:k);
353 |
354 | end
355 |
--------------------------------------------------------------------------------
/matlab/run_magic.m:
--------------------------------------------------------------------------------
1 | function [pc_imputed, U, pc] = run_magic(data, varargin)
2 | % run_magic Run MAGIC for imputing and denoising of single-cell data
3 | % [pc_imputed, U, pc] = run_magic(data, varargin) runs MAGIC on data (rows:
4 | % cells, columns: genes) with default parameter settings and returns the
5 | % imputed data in a compressed format.
6 | %
7 | % The compressed format consists of loadings (U) and imputed principal
8 | % components (pc_imputed). To obtain gene values from this compressed format
9 | % either run project_genes.m or manually project (pc_imputed * U') either all
10 | % genes or a subset (pc_imputed * U(idx,:)'). Also returned are the original
11 | % principal components (pc).
12 | %
13 | % Since pc_imputed and U are both narrow matrices the imputed data can be
14 | % stored in a memory efficient way, without having to store the dense
15 | % matrix.
16 | %
17 | % Supplied data can be a sparse matrix, in which case MAGIC will be more
18 | % memory efficient.
19 | %
20 | % [...] = run_magic(data, 'PARAM1',val1, 'PARAM2',val2, ...) allows you to
21 | % specify optional parameter name/value pairs that control further details
22 | % of MAGIC. Parameters are:
23 | %
24 | % 'npca' - number of PCA components to do MAGIC on. Defaults to 100.
25 | %
26 | % 'k' - number of nearest neighbors for bandwidth of adaptive alpha
27 | % decaying kernel or, when a=[], number of nearest neighbors of the knn
28 | % graph. For the unweighted kernel we recommend k to be a bit larger,
29 | % e.g. 10 or 15. Defaults to 10.
30 | %
31 | % 'a' - alpha of alpha decaying kernel. When a=[], the knn (unweighted)
32 | % kernel is used. Defaults to 15.
33 | %
34 | % 't' - number of diffusion steps. Defaults to [], which automatically picks
35 | % the optimal t.
36 | %
37 | % 'distfun' - Distance function used to compute kernel. Defaults to
38 | % 'euclidean'.
39 | %
40 | % 'make_plot_opt_t' - Boolean flag for plotting the optimal t analysis.
41 | % Defaults to true.
42 |
43 | npca = 100;
44 | k = 10;
45 | a = 15;
46 | t = [];
47 | distfun = 'euclidean';
48 | make_plot_opt_t = true;
49 |
50 | % get input parameters
51 | for i=1:length(varargin)
52 | % k for knn adaptive sigma
53 | if(strcmp(varargin{i},'k'))
54 | k = lower(varargin{i+1});
55 | end
56 | % a (alpha) for alpha decaying kernel
57 | if(strcmp(varargin{i},'a'))
58 | a = lower(varargin{i+1});
59 | end
60 | % diffusion time
61 | if(strcmp(varargin{i},'t'))
62 | t = lower(varargin{i+1});
63 | end
64 | % npca
65 | if(strcmp(varargin{i},'npca'))
66 | npca = lower(varargin{i+1});
67 | end
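68 | % make plot optimal t
69 | if(strcmp(varargin{i},'make_plot_opt_t')), make_plot_opt_t = lower(varargin{i+1}); end
70 | % distance function used to compute the kernel
71 | if(strcmp(varargin{i},'distfun')), distfun = lower(varargin{i+1}); end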
72 | end
73 |
74 | % PCA
75 | disp 'doing PCA'
76 | [U,~,~] = randPCA(data', npca); % this is svd
77 | pc = data * U; % this is PCA without mean centering to be able to handle sparse data
78 |
79 | % compute kernel
80 | disp 'computing kernel'
81 | K = compute_kernel(pc, 'k', k, 'a', a, 'distfun', distfun);
82 |
83 | % row stochastic
84 | P = bsxfun(@rdivide, K, sum(K,2));
85 |
86 | % optimal t
87 | if isempty(t)
88 | disp 'imputing using optimal t'
89 | pc_imputed = compute_optimal_t(pc, P, 'make_plot', make_plot_opt_t);
90 | else
91 | disp 'imputing using provided t'
92 | pc_imputed = pc;
93 | for I=1:t
94 | disp(['t = ' num2str(I)]);
95 | pc_imputed = P * pc_imputed;
96 | end
97 | end
98 |
99 | disp 'done.'
100 |
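The fixed-t branch of run_magic.m (row-normalize the kernel, then apply the resulting operator t times to the principal components) can be sketched in plain Python. This is a minimal illustration, not the package's implementation: the helper names `row_normalize`, `matmul` and `diffuse` are hypothetical, the kernel is passed in as a precomputed dense list of rows, and the optimal-t search done by compute_optimal_t is not covered.

```python
def row_normalize(kernel):
    """Divide each row by its sum so the rows form a Markov transition matrix."""
    normalized = []
    for row in kernel:
        total = sum(row)
        normalized.append([v / total for v in row])
    return normalized


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    columns = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in columns] for row in a]


def diffuse(kernel, pc, t):
    """Apply the row-stochastic operator built from kernel to pc, t times."""
    P = row_normalize(kernel)
    pc_imputed = [list(row) for row in pc]  # copy so the input is left untouched
    for _ in range(t):
        pc_imputed = matmul(P, pc_imputed)
    return pc_imputed
```

Each step replaces a cell's coordinates with a weighted average over its neighbours, so larger t smooths more strongly; with t = 0 the input is returned unchanged.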
--------------------------------------------------------------------------------
/matlab/svdpca.m:
--------------------------------------------------------------------------------
1 | function Y = svdpca(X, k, method)
2 |
3 | if ~exist('method','var')
4 | method = 'svd';
5 | end
6 |
7 | X = bsxfun(@minus, X, mean(X));
8 |
9 | switch lower(method)
10 | case 'svd'
11 | disp 'PCA using SVD'
12 | [U,~,~] = svds(X', k);
13 | Y = X * U;
14 | case 'random'
15 | disp 'PCA using random SVD'
16 | [U,~,~] = randPCA(X', k);
17 | Y = X * U;
18 | case 'lra'
19 | disp 'LRA using SVD'
20 | [U,S,V] = svds(X, k);
21 | Y = U*S*V';
22 | case 'lra_random'
23 | disp 'LRA using random SVD'
24 | [U,S,V] = randPCA(X, k);
25 | Y = U*S*V';
26 | end
27 |
--------------------------------------------------------------------------------
/matlab/svdpca_sparse.m:
--------------------------------------------------------------------------------
1 | function [pc,U,S] = svdpca_sparse(X, k, method)
2 |
3 | if ~exist('method','var')
4 | method = 'svd';
5 | end
6 |
7 | switch method
8 | case 'svd'
9 | disp 'PCA using SVD'
10 | [U,S,~] = svds(X', k);
11 | pc = X * U;
12 | case 'random'
13 | disp 'PCA using random SVD'
14 | [U,S,~] = randPCA(X', k);
15 | pc = X * U;
16 | S = diag(S);
17 | case 'none'
18 | disp 'No PCA performed'
19 | pc = X;
20 | end
21 |
--------------------------------------------------------------------------------
/matlab/test_magic.m:
--------------------------------------------------------------------------------
1 | %% load data (e.g. 10x data)
2 | % data should be cells as rows and genes as columns
3 | %sample_dir = 'path_to_data/';
4 | %[data, gene_names, gene_ids, cells] = load_10x(sample_dir);
5 |
6 | %% load EMT data
7 | file = '../data/HMLE_TGFb_day_8_10.csv'; %% gunzip ../data/HMLE_TGFb_day_8_10.csv.gz
8 | data = importdata(file);
9 | gene_names = data.colheaders;
10 | data = data.data;
11 |
12 | %% library size normalization
13 | libsize = sum(data,2);
14 | data = bsxfun(@rdivide, data, libsize) * median(libsize);
15 |
16 | %% log transform -- usually one would log transform the data. Here we don't do it.
17 | %data = log(data + 0.1);
18 |
19 | %% MAGIC
20 | [pc_imputed, U, pc] = run_magic(data, 'npca', 100, 'k', 15, 'a', 15, 'make_plot_opt_t', true);
21 |
22 | %% project genes
23 | plot_genes = {'Cdh1', 'Vim', 'Fn1', 'Zeb1'};
24 | [M_imputed, genes_found] = project_genes(plot_genes, gene_names, pc_imputed, U);
25 |
26 | %% plot
27 | ms = 20;
28 | v = [-45 20];
29 | % before MAGIC
30 | x = data(:, ismember(lower(gene_names), lower(plot_genes{1})));
31 | y = data(:, ismember(lower(gene_names), lower(plot_genes{2})));
32 | z = data(:, ismember(lower(gene_names), lower(plot_genes{3})));
33 | c = data(:, ismember(lower(gene_names), lower(plot_genes{4})));
34 | figure;
35 | subplot(2,2,1);
36 | scatter(y, x, ms, c, 'filled');
37 | colormap(parula);
38 | axis tight
39 | xlabel(plot_genes{2});
40 | ylabel(plot_genes{1});
41 | h = colorbar;
42 | %ylabel(h,plot_genes{4});
43 | title 'Before MAGIC'
44 |
45 | subplot(2,2,2);
46 | scatter3(x, y, z, ms, c, 'filled');
47 | colormap(parula);
48 | axis tight
49 | xlabel(plot_genes{1});
50 | ylabel(plot_genes{2});
51 | zlabel(plot_genes{3});
52 | %h = colorbar;
53 | ylabel(h,plot_genes{4});
54 | view(v);
55 | title 'Before MAGIC'
56 |
57 | % plot after MAGIC
58 | x = M_imputed(:,1);
59 | y = M_imputed(:,2);
60 | z = M_imputed(:,3);
61 | c = M_imputed(:,4);
62 | subplot(2,2,3);
63 | scatter(y, x, ms, c, 'filled');
64 | colormap(parula);
65 | axis tight
66 | xlabel(plot_genes{2});
67 | ylabel(plot_genes{1});
68 | h = colorbar;
69 | %ylabel(h,plot_genes{4});
70 | title 'After MAGIC'
71 |
72 | subplot(2,2,4);
73 | scatter3(x, y, z, ms, c, 'filled');
74 | colormap(parula);
75 | axis tight
76 | xlabel(plot_genes{1});
77 | ylabel(plot_genes{2});
78 | zlabel(plot_genes{3});
79 | %h = colorbar;
80 | ylabel(h,plot_genes{4});
81 | view(v);
82 | title 'After MAGIC'
83 |
84 | %% plot PCA before MAGIC
85 | figure;
86 | c = data(:, ismember(lower(gene_names), lower(plot_genes{4})));
87 | Y = svdpca(pc, 3, 'random'); % original PCs are not mean centered so doing proper PCA here
88 | % alternative is to do proper PCA on data:
89 | %Y = svdpca(data, 3, 'random');
90 | scatter3(Y(:,1), Y(:,2), Y(:,3), ms, c, 'filled');
91 | colormap(parula);
92 | axis tight
93 | xlabel 'PC1'
94 | ylabel 'PC2'
95 | zlabel 'PC3'
96 | h = colorbar;
97 | ylabel(h,plot_genes{4});
98 | view([-50 22]);
99 | title 'Before MAGIC'
100 |
101 | %% plot PCA after MAGIC
102 | figure;
103 | c = M_imputed(:,4);
104 | Y = svdpca(pc_imputed, 3, 'random'); % original PCs are not mean centered so doing proper PCA here
105 | % alternative is to go to full imputed data and then do proper PCA:
106 | %data_imputed = pc_imputed * U'; % project full data
107 | %Y = svdpca(data_imputed, 3, 'random');
108 | scatter3(Y(:,1), Y(:,2), Y(:,3), ms, c, 'filled');
109 | colormap(parula);
110 | axis tight
111 | xlabel 'PC1'
112 | ylabel 'PC2'
113 | zlabel 'PC3'
114 | h = colorbar;
115 | ylabel(h,plot_genes{4});
116 | view([-50 22]);
117 | title 'After MAGIC'
118 |
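The library size normalization step near the top of this script (divide each cell by its total count, then rescale by the median total) can be sketched in plain Python. This is an illustrative sketch only: the function name `library_size_normalize` is hypothetical, the data are a list of rows (cells) rather than a matrix, and every cell is assumed to have a nonzero total.

```python
from statistics import median


def library_size_normalize(data):
    """Scale each row (cell) so that it sums to the median library size."""
    libsize = [sum(row) for row in data]  # total counts per cell
    target = median(libsize)              # common library size after scaling
    return [
        [value / size * target for value in row]
        for row, size in zip(data, libsize)
    ]
```

After this step every cell has the same total, so differences between cells reflect relative expression rather than sequencing depth.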
--------------------------------------------------------------------------------
/python/README.rst:
--------------------------------------------------------------------------------
1 | =======================================================
2 | Markov Affinity-based Graph Imputation of Cells (MAGIC)
3 | =======================================================
4 |
5 | .. image:: https://img.shields.io/pypi/v/magic-impute.svg
6 | :target: https://pypi.org/project/magic-impute/
7 | :alt: Latest PyPI version
8 | .. image:: https://img.shields.io/cran/v/Rmagic.svg
9 | :target: https://cran.r-project.org/package=Rmagic
10 | :alt: Latest CRAN version
11 | .. image:: https://img.shields.io/github/workflow/status/KrishnaswamyLab/MAGIC/Unit%20Tests/master?label=Github%20Actions
12 | :target: https://github.com/KrishnaswamyLab/MAGIC/actions
13 | :alt: GitHub Actions Build
14 | .. image:: https://img.shields.io/readthedocs/magic.svg
15 | :target: https://magic.readthedocs.io/
16 | :alt: Read the Docs
17 | .. image:: https://zenodo.org/badge/DOI/10.1016/j.cell.2018.05.061.svg
18 | :target: https://www.cell.com/cell/abstract/S0092-8674(18)30724-4
19 | :alt: Cell Publication DOI
20 | .. image:: https://img.shields.io/twitter/follow/KrishnaswamyLab.svg?style=social&label=Follow
21 | :target: https://twitter.com/KrishnaswamyLab
22 | :alt: Twitter
23 | .. image:: https://img.shields.io/github/stars/KrishnaswamyLab/MAGIC.svg?style=social&label=Stars
24 | :target: https://github.com/KrishnaswamyLab/MAGIC/
25 | :alt: GitHub stars
26 |
27 | Markov Affinity-based Graph Imputation of Cells (MAGIC) is an algorithm for denoising high-dimensional data, most commonly applied to single-cell RNA sequencing data. MAGIC learns the manifold underlying the data, then uses the resulting graph to smooth the features and restore the structure of the data.
28 |
29 | To see how MAGIC can be applied to single-cell RNA-seq, elucidating the epithelial-to-mesenchymal transition, read our `publication in Cell`_.
30 |
31 | `David van Dijk, et al. Recovering Gene Interactions from Single-Cell Data Using Data Diffusion. 2018. Cell.`__
32 |
33 | .. _`publication in Cell`: https://www.cell.com/cell/abstract/S0092-8674(18)30724-4
34 |
35 | __ `publication in Cell`_
36 |
37 | For R and MATLAB implementations of MAGIC, see
38 | https://github.com/KrishnaswamyLab/MAGIC.
39 |
40 | .. image:: https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/master/magic.gif
41 | :align: center
42 | :alt: MAGIC reveals the interaction between Vimentin (VIM), Cadherin-1 (CDH1), and Zinc finger E-box-binding homeobox 1 (ZEB1, encoded by colors).
43 |
44 | *MAGIC reveals the interaction between Vimentin (VIM), Cadherin-1
45 | (CDH1), and Zinc finger E-box-binding homeobox 1 (ZEB1, encoded by
46 | colors).*
47 |
48 | Installation
49 | ~~~~~~~~~~~~
50 |
51 | Installation with pip
52 | ---------------------
53 |
54 | To install with ``pip``, run the following from a terminal::
55 |
56 | pip install --user magic-impute
57 |
58 | Installation from GitHub
59 | ------------------------
60 |
61 | To clone the repository and install manually, run the following from a
62 | terminal::
63 |
64 | git clone https://github.com/KrishnaswamyLab/MAGIC.git
65 | cd MAGIC/python
66 | python setup.py install --user
67 |
68 | Usage
69 | ~~~~~
70 |
71 | Example data
72 | ------------
73 |
74 | The following code runs MAGIC on test data located in the MAGIC
75 | repository::
76 |
77 | import magic
78 | import pandas as pd
79 | import matplotlib.pyplot as plt
80 | X = pd.read_csv("MAGIC/data/test_data.csv")
81 | magic_operator = magic.MAGIC()
82 | X_magic = magic_operator.fit_transform(X, genes=['VIM', 'CDH1', 'ZEB1'])
83 | plt.scatter(X_magic['VIM'], X_magic['CDH1'], c=X_magic['ZEB1'], s=1, cmap='inferno')
84 | plt.show()
85 | magic.plot.animate_magic(X, gene_x='VIM', gene_y='CDH1', gene_color='ZEB1', operator=magic_operator)
86 |
87 | Tutorials
88 | ---------
89 |
90 | We have included two tutorial notebooks on MAGIC usage and results
91 | visualization for single cell RNA-seq data.
92 |
93 | EMT data notebook:
94 | http://nbviewer.jupyter.org/github/KrishnaswamyLab/magic/blob/master/python/tutorial_notebooks/emt_tutorial.ipynb
95 |
96 | Bone Marrow data notebook:
97 | http://nbviewer.jupyter.org/github/KrishnaswamyLab/magic/blob/master/python/tutorial_notebooks/bonemarrow_tutorial.ipynb
98 |
99 | Help
100 | ~~~~
101 |
102 | If you have any questions or require assistance using MAGIC, please
103 | contact us at https://krishnaswamylab.org/get-help.
104 |
--------------------------------------------------------------------------------
/python/doc/Makefile:
--------------------------------------------------------------------------------
1 | # Minimal makefile for Sphinx documentation
2 | #
3 |
4 | # You can set these variables from the command line.
5 | SPHINXOPTS =
6 | SPHINXBUILD = sphinx-build
7 | SPHINXPROJ = MAGIC
8 | SOURCEDIR = source
9 | BUILDDIR = build
10 |
11 | # Put it first so that "make" without argument is like "make help".
12 | help:
13 | @$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
14 |
15 | .PHONY: help Makefile
16 |
17 | # Catch-all target: route all unknown targets to Sphinx using the new
18 | # "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
19 | %: Makefile
20 | @$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
21 |
--------------------------------------------------------------------------------
/python/doc/source/api.rst:
--------------------------------------------------------------------------------
1 | API
2 | ===
3 |
4 | MAGIC
5 | -----
6 |
7 | .. automodule:: magic.magic
8 | :members:
9 | :inherited-members:
10 | :show-inheritance:
11 |
12 | Plotting
13 | --------
14 |
15 | .. automodule:: magic.plot
16 | :members:
17 | :inherited-members:
18 | :show-inheritance:
19 |
--------------------------------------------------------------------------------
/python/doc/source/conf.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python3
2 | # -*- coding: utf-8 -*-
3 | #
4 | # MAGIC documentation build configuration file, created by
5 | # sphinx-quickstart on Thu Mar 30 19:50:14 2017.
6 | #
7 | # This file is execfile()d with the current directory set to its
8 | # containing dir.
9 | #
10 | # Note that not all possible configuration values are present in this
11 | # autogenerated file.
12 | #
13 | # All configuration values have a default; values that are commented out
14 | # serve to show the default.
15 |
16 | # If extensions (or modules to document with autodoc) are in another directory,
17 | # add these directories to sys.path here. If the directory is relative to the
18 | # documentation root, use os.path.abspath to make it absolute, like shown here.
19 | #
20 | import os
21 | import sys
22 |
23 | root_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), "..", ".."))
24 | sys.path.insert(0, root_dir)
25 | # print(sys.path)
26 |
27 | # -- General configuration ------------------------------------------------
28 |
29 | # If your documentation needs a minimal Sphinx version, state it here.
30 | #
31 | # needs_sphinx = '1.0'
32 |
33 | # Add any Sphinx extension module names here, as strings. They can be
34 | # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
35 | # ones.
36 | extensions = [
37 | "sphinx.ext.autodoc",
38 | "sphinx.ext.autosummary",
39 | "sphinx.ext.napoleon",
40 | "sphinx.ext.doctest",
41 | "sphinx.ext.coverage",
42 | "sphinx.ext.viewcode",
43 | ]
44 |
45 | # Add any paths that contain templates here, relative to this directory.
46 | templates_path = ["ytemplates"]
47 |
48 | # The suffix(es) of source filenames.
49 | # You can specify multiple suffixes as a list of strings:
50 | #
51 | # source_suffix = ['.rst', '.md']
52 | source_suffix = ".rst"
53 |
54 | # The master toctree document.
55 | master_doc = "index"
56 |
57 | # General information about the project.
58 | project = "MAGIC"
59 | copyright = "2017 Krishnaswamy Lab, Yale University"
60 | author = "Scott Gigante and Daniel Dager, Krishnaswamy Lab, Yale University"
61 |
62 | # The version info for the project you're documenting, acts as replacement for
63 | # |version| and |release|, also used in various other places throughout the
64 | # built documents.
65 | version_py = os.path.join(root_dir, "magic", "version.py")
66 | # The full version, including alpha/beta/rc tags.
67 | release = open(version_py).read().strip().split("=")[-1].replace('"', "").strip()
68 | # The short X.Y version.
69 | version = release.split("-")[0]
70 |
71 | # The language for content autogenerated by Sphinx. Refer to documentation
72 | # for a list of supported languages.
73 | #
74 | # This is also used if you do content translation via gettext catalogs.
75 | # Usually you set "language" from the command line for these cases.
76 | language = None
77 |
78 | # List of patterns, relative to source directory, that match files and
79 | # directories to ignore when looking for source files.
80 | # These patterns also affect html_static_path and html_extra_path
81 | exclude_patterns = []
82 |
83 | # The name of the Pygments (syntax highlighting) style to use.
84 | pygments_style = "sphinx"
85 |
86 | # If true, `todo` and `todoList` produce output, else they produce nothing.
87 | todo_include_todos = False
88 |
89 |
90 | # -- Options for HTML output ----------------------------------------------
91 |
92 | # The theme to use for HTML and HTML Help pages. See the documentation for
93 | # a list of builtin themes.
94 | #
95 | html_theme = "default"
96 |
97 | # Theme options are theme-specific and customize the look and feel of a theme
98 | # further. For a list of options available for each theme, see the
99 | # documentation.
100 | #
101 | # html_theme_options = {}
102 |
103 | # Add any paths that contain custom static files (such as style sheets) here,
104 | # relative to this directory. They are copied after the builtin static files,
105 | # so a file named "default.css" will overwrite the builtin "default.css".
106 | html_static_path = ["ystatic"]
107 |
108 |
109 | # -- Options for HTMLHelp output ------------------------------------------
110 |
111 | # Output file base name for HTML help builder.
112 | htmlhelp_basename = "MAGICdoc"
113 |
114 |
115 | # -- Options for LaTeX output ---------------------------------------------
116 |
117 | latex_elements = {
118 | # The paper size ('letterpaper' or 'a4paper').
119 | #
120 | # 'papersize': 'letterpaper',
121 | # The font size ('10pt', '11pt' or '12pt').
122 | #
123 | # 'pointsize': '10pt',
124 | # Additional stuff for the LaTeX preamble.
125 | #
126 | # 'preamble': '',
127 | # Latex figure (float) alignment
128 | #
129 | # 'figure_align': 'htbp',
130 | }
131 |
132 | # Grouping the document tree into LaTeX files. List of tuples
133 | # (source start file, target name, title,
134 | # author, documentclass [howto, manual, or own class]).
135 | latex_documents = [
136 | (
137 | master_doc,
138 | "MAGIC.tex",
139 | "MAGIC Documentation",
140 | "Scott Gigante and Daniel Dager, Krishnaswamy Lab, Yale University",
141 | "manual",
142 | ),
143 | ]
144 |
145 |
146 | # -- Options for manual page output ---------------------------------------
147 |
148 | # One entry per manual page. List of tuples
149 | # (source start file, name, description, authors, manual section).
150 | man_pages = [(master_doc, "magic", "MAGIC Documentation", [author], 1)]
151 |
152 |
153 | # -- Options for Texinfo output -------------------------------------------
154 |
155 | # Grouping the document tree into Texinfo files. List of tuples
156 | # (source start file, target name, title, author,
157 | # dir menu entry, description, category)
158 | texinfo_documents = [
159 | (
160 | master_doc,
161 | "MAGIC",
162 | "MAGIC Documentation",
163 | author,
164 | "MAGIC",
165 | "One line description of project.",
166 | "Miscellaneous",
167 | ),
168 | ]
169 |
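The `release` and `version` values in this configuration are obtained by reading `magic/version.py` as plain text rather than importing the package. The sketch below reproduces that parsing on literal strings so the behaviour is easy to inspect. `parse_release` is a hypothetical helper name introduced only for illustration, and the `3.0.0-rc1` string is an assumed example, not a version taken from the repository.

```python
def parse_release(version_file_text):
    """Mirror the expression used in conf.py: take the text after the last '='
    and drop double quotes and surrounding whitespace."""
    return version_file_text.strip().split("=")[-1].replace('"', "").strip()


# Content of magic/version.py as shown in this repository.
release = parse_release('__version__ = "3.0.0"\n')
version = release.split("-")[0]

# Assumed pre-release string, to show what the split on "-" does.
rc_release = parse_release('__version__ = "3.0.0-rc1"\n')
rc_version = rc_release.split("-")[0]

print(release, version, rc_release, rc_version)
```

Only double quotes are removed by this expression, so a `version.py` written with single quotes would leave the quote characters in the result.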
--------------------------------------------------------------------------------
/python/doc/source/index.rst:
--------------------------------------------------------------------------------
1 | =======================================================
2 | MAGIC - Markov Affinity-based Graph Imputation of Cells
3 | =======================================================
4 |
33 | Markov Affinity-based Graph Imputation of Cells (MAGIC) is an algorithm for denoising high-dimensional data, most commonly applied to single-cell RNA sequencing data. MAGIC learns the manifold underlying the data and uses the resulting graph to smooth the features and restore the structure of the data.
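
As a rough intuition for graph-based smoothing, the toy sketch below row-normalizes a small hand-written affinity matrix into a Markov matrix and applies it `t` times to a single feature. This is an illustration under simplifying assumptions, not the `magic-impute` implementation: the function names `row_normalize`, `matmul` and `diffuse`, the three-cell affinity matrix and the data values are all made up for this example, and the real package depends on libraries such as `graphtools` (listed in its requirements) to build the graph.

```python
def row_normalize(affinity):
    """Turn a non-negative affinity matrix into a row-stochastic Markov matrix."""
    return [[a / sum(row) for a in row] for row in affinity]


def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def diffuse(markov, data, t):
    """Apply the Markov matrix t times to the data (rows are cells)."""
    for _ in range(t):
        data = matmul(markov, data)
    return data


# Three cells on a chain: each cell is connected to itself and its neighbours.
affinity = [[1.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 1.0, 1.0]]
# One feature, measured only in the middle cell.
data = [[0.0], [3.0], [0.0]]

markov = row_normalize(affinity)
smoothed = diffuse(markov, data, t=1)
print(smoothed)
```

After one step the outer cells borrow signal from the middle cell; larger values of `t` spread values further along the graph.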
34 |
35 | To see how MAGIC can be applied to single-cell RNA-seq, elucidating the epithelial-to-mesenchymal transition, read our `publication in Cell`_.
36 |
37 | .. raw:: html
38 |
39 |
40 |
41 |
42 | MAGIC reveals the interaction between Vimentin (VIM), Cadherin-1 (CDH1), and Zinc finger E-box-binding homeobox 1 (ZEB1, encoded by colors).
43 |
44 |
45 |
46 | `David van Dijk, et al. Recovering Gene Interactions from Single-Cell Data Using Data Diffusion. 2018. Cell.`__
47 |
48 | .. _`publication in Cell`: https://www.cell.com/cell/abstract/S0092-8674(18)30724-4
49 |
50 | __ `publication in Cell`_
51 |
52 | .. toctree::
53 | :maxdepth: 2
54 |
55 | installation
56 | tutorial
57 | api
58 |
59 | Quick Start
60 | ===========
61 |
62 | To run MAGIC on your dataset, create a MAGIC operator and run `fit_transform`. Here we show an example with a small, artificial dataset located in the MAGIC repository::
63 |
64 | import magic
65 | import pandas as pd
66 | import matplotlib.pyplot as plt
67 | X = pd.read_csv("MAGIC/data/test_data.csv")
68 | magic_operator = magic.MAGIC()
69 | X_magic = magic_operator.fit_transform(X, genes=['VIM', 'CDH1', 'ZEB1'])
70 | plt.scatter(X_magic['VIM'], X_magic['CDH1'], c=X_magic['ZEB1'], s=1, cmap='inferno')
71 | plt.show()
72 | magic.plot.animate_magic(X, gene_x='VIM', gene_y='CDH1', gene_color='ZEB1', operator=magic_operator)
73 |
74 | Help
75 | ====
76 |
77 | If you have any questions or require assistance using MAGIC, please contact us at https://krishnaswamylab.org/get-help
78 |
79 | .. autoclass:: magic.MAGIC
80 | :members:
81 | :noindex:
82 |
--------------------------------------------------------------------------------
/python/doc/source/installation.rst:
--------------------------------------------------------------------------------
1 | Installation
2 | ============
3 |
4 | Python installation
5 | -------------------
6 |
7 | Installation with `pip`
8 | ~~~~~~~~~~~~~~~~~~~~~~~
9 |
10 | The Python version of MAGIC can be installed using::
11 |
12 | pip install --user magic-impute
13 |
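One way to confirm that the `magic-impute` distribution is visible to the current interpreter is sketched below. It uses the standard-library `importlib.metadata` module, which assumes Python 3.8 or newer; `installed_version` is a hypothetical helper name introduced here. Querying the distribution name avoids confusion with other packages that may also provide a module called `magic`.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


found = installed_version("magic-impute")
if found is None:
    print("magic-impute is not installed for this interpreter")
else:
    print("magic-impute version:", found)
```
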
14 | Installation from source
15 | ~~~~~~~~~~~~~~~~~~~~~~~~
16 |
17 | The Python version of MAGIC can be installed from GitHub by running the following from a terminal::
18 |
19 | git clone --recursive https://github.com/KrishnaswamyLab/MAGIC.git
20 | cd MAGIC/python
21 | python setup.py install --user
22 |
23 | MATLAB installation
24 | -------------------
25 |
26 | 1. The MATLAB version of MAGIC can be accessed using::
27 |
28 | git clone https://github.com/KrishnaswamyLab/MAGIC.git
29 | cd MAGIC/matlab
30 |
31 | 2. Add the MAGIC/matlab directory to your MATLAB path and run any of our `run` or `test` scripts to get a feel for MAGIC.
32 |
33 | R installation
34 | --------------
35 |
36 | In order to use MAGIC in R, you must also install the Python package.
37 |
38 | If `python` or `pip` are not installed, you will need to install them. We recommend Miniconda3_ to install Python and `pip` together, or otherwise you can install `pip` from https://pip.pypa.io/en/stable/installing/.
39 |
40 | Installation from CRAN
41 | ~~~~~~~~~~~~~~~~~~~~~~
42 |
43 | In R, run this command to install MAGIC and all dependencies::
44 |
45 | install.packages("Rmagic")
46 |
47 | In a terminal, run the following command to install the Python
48 | package::
49 |
50 | pip install --user magic-impute
51 |
52 | .. _Miniconda3: https://conda.io/miniconda.html
53 |
54 | Installation from source
55 | ~~~~~~~~~~~~~~~~~~~~~~~~
56 |
57 | The latest source version of MAGIC can be accessed by running the following in a terminal::
58 |
59 | git clone https://github.com/KrishnaswamyLab/MAGIC.git
60 | cd MAGIC/Rmagic
61 | R CMD INSTALL .
62 | cd ../python
63 | python setup.py install --user
64 |
65 | If the `Rmagic` folder is empty, you may have forgotten to use the `--recursive` option for `git clone`. You can rectify this by running the following in a terminal::
66 |
67 | cd MAGIC
68 | git submodule init
69 | git submodule update
70 | cd Rmagic
71 | R CMD INSTALL .
72 | cd ../python
73 | python setup.py install --user
74 |
--------------------------------------------------------------------------------
/python/doc/source/requirements.txt:
--------------------------------------------------------------------------------
1 | scikit-learn>=0.19.1
2 | numpy>=1.14.0
3 | pandas>=0.25
4 | scprep>=1.0
5 | scipy>=1.1.0
6 | matplotlib>=2.0.1
7 | future
8 | graphtools>=1.0.0
9 | sphinx
10 | sphinxcontrib-napoleon
11 |
--------------------------------------------------------------------------------
/python/doc/source/tutorial.rst:
--------------------------------------------------------------------------------
1 | Tutorial
2 | --------
3 |
4 | To run MAGIC on your dataset, create a MAGIC operator and run `fit_transform`. Here we show an example with an artificial test dataset located in the MAGIC repository::
5 |
6 | import magic
7 | import matplotlib.pyplot as plt
8 | import pandas as pd
9 | X = pd.read_csv("MAGIC/data/test_data.csv")
10 | magic_operator = magic.MAGIC()
11 | X_magic = magic_operator.fit_transform(X, genes=['VIM', 'CDH1', 'ZEB1'])
12 | plt.scatter(X_magic['VIM'], X_magic['CDH1'], c=X_magic['ZEB1'], s=1, cmap='inferno')
13 | plt.show()
14 | magic.plot.animate_magic(X, gene_x='VIM', gene_y='CDH1', gene_color='ZEB1', operator=magic_operator)
15 |
16 | A demo on MAGIC usage for single cell RNA-seq data can be found in this notebook_: `http://nbviewer.jupyter.org/github/KrishnaswamyLab/magic/blob/master/python/tutorial_notebooks/emt_tutorial.ipynb`__
17 |
18 | .. _notebook: http://nbviewer.jupyter.org/github/KrishnaswamyLab/magic/blob/master/python/tutorial_notebooks/emt_tutorial.ipynb
19 |
20 | __ notebook_
21 |
22 | A second tutorial analyzing myeloid and erythroid cells in mouse bone marrow is available here_: `http://nbviewer.jupyter.org/github/KrishnaswamyLab/magic/blob/master/python/tutorial_notebooks/bonemarrow_tutorial.ipynb`__
23 |
24 | .. _here: http://nbviewer.jupyter.org/github/KrishnaswamyLab/magic/blob/master/python/tutorial_notebooks/bonemarrow_tutorial.ipynb
25 |
26 | __ here_
27 |
--------------------------------------------------------------------------------
/python/magic/__init__.py:
--------------------------------------------------------------------------------
1 | from .magic import MAGIC
2 | from .version import __version__
3 |
4 | import magic.plot
5 |
--------------------------------------------------------------------------------
/python/magic/after_magic_example.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/python/magic/after_magic_example.png
--------------------------------------------------------------------------------
/python/magic/before_magic_example.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/KrishnaswamyLab/MAGIC/3a4ffdbe435716bb3b2fbe78f434c6cdc8dd8d78/python/magic/before_magic_example.png
--------------------------------------------------------------------------------
/python/magic/plot.py:
--------------------------------------------------------------------------------
1 | # (C) 2017 Krishnaswamy Lab GPLv2
2 |
3 | from .magic import MAGIC
4 | from .utils import in_ipynb
5 | from matplotlib import animation
6 | from matplotlib import rc
7 |
8 | import matplotlib.pyplot as plt
9 | import numbers
10 | import numpy as np
11 | import pandas as pd
12 | import scprep
13 |
14 |
15 | def _validate_gene(gene, data):
16 | if isinstance(gene, str):
17 | if not isinstance(data, pd.DataFrame):
18 | raise ValueError(
19 | "Non-integer gene names only valid with pd.DataFrame "
20 | "input. X is a {}, gene = {}".format(type(data).__name__, gene)
21 | )
22 | if gene not in data.columns:
23 | raise ValueError("gene {} not found".format(gene))
24 | elif gene is not None and not isinstance(gene, numbers.Integral):
25 | raise TypeError("Expected int or str. Got {}".format(type(gene).__name__))
26 | return gene
27 |
28 |
29 | def animate_magic(
30 | data,
31 | gene_x,
32 | gene_y,
33 | gene_color=None,
34 | t_max=20,
35 | delay=2,
36 | operator=None,
37 | filename=None,
38 | ax=None,
39 | figsize=None,
40 | s=1,
41 | cmap="inferno",
42 | interval=200,
43 | dpi=100,
44 | ipython_html="jshtml",
45 | verbose=False,
46 | **kwargs,
47 | ):
48 | """Animate a gene-gene relationship with increased diffusion
49 |
50 | Parameters
51 | ----------
52 | data : array-like
53 | Input data matrix
54 | gene_x : int or str
55 | Gene to put on the x axis
56 | gene_y : int or str
57 | Gene to put on the y axis
58 | gene_color : int or str, optional (default: None)
59 | Gene to color by. If None, no color vector is used
60 | t_max : int, optional (default: 20)
61 | maximum value of t to include in the animation
62 | delay : int, optional (default: 2)
63 | number of frames to dwell on the first frame before applying MAGIC
64 | operator : magic.MAGIC, optional (default: None)
65 | precomputed MAGIC operator. If None, one is created.
66 | filename : str, optional (default: None)
67 | If not None, saves a .gif or .mp4 with the output
68 | ax : `matplotlib.Axes` or None, optional (default: None)
69 | axis on which to plot. If None, an axis is created
70 | figsize : tuple, optional (default: None)
71 | Tuple of floats for creation of new `matplotlib` figure. Only used if
72 | `ax` is None.
73 | s : int, optional (default: 1)
74 | Point size
75 | cmap : str or callable, optional (default: 'inferno')
76 | Matplotlib colormap
77 | interval : float, optional (default: 200)
78 | Time in milliseconds between frames
79 | dpi : int, optional (default: 100)
80 | Dots per inch (image quality) in saved animation
81 | ipython_html : {'html5', 'jshtml'}
82 | which html writer to use if using a Jupyter Notebook
83 | verbose : bool, optional (default: False)
84 | MAGIC operator verbosity
85 | **kwargs : arguments for MAGIC
86 |
87 | Returns
88 | -------
89 | A Matplotlib animation showing diffusion of an edge with increased t
90 | """
91 | if in_ipynb():
92 | # credit to
93 | # http://tiao.io/posts/notebooks/save-matplotlib-animations-as-gifs/
94 | rc("animation", html=ipython_html)
95 |
96 | if filename is not None:
97 | if filename.endswith(".gif"):
98 | writer = "imagemagick"
99 | elif filename.endswith(".mp4"):
100 | writer = "ffmpeg"
101 | else:
102 | raise ValueError(
103 | "filename must end in .gif or .mp4. Got {}".format(filename)
104 | )
105 |
106 | if operator is None:
107 | operator = MAGIC(verbose=verbose, **kwargs).fit(data)
108 | else:
109 | operator.set_params(verbose=verbose, **kwargs)
110 | gene_x = _validate_gene(gene_x, data)
111 | gene_y = _validate_gene(gene_y, data)
112 | gene_color = _validate_gene(gene_color, data)
113 | if gene_color is not None:
114 | genes = np.array([gene_x, gene_y, gene_color])
115 | else:
116 | genes = np.array([gene_x, gene_y])
117 |
118 | if isinstance(cmap, str):
119 | cmap = plt.get_cmap(cmap)
120 |
121 | if ax is None:
122 | fig, ax = plt.subplots(figsize=figsize)
123 | show = True
124 | else:
125 | fig = ax.get_figure()
126 | show = False
127 |
128 | data_magic = scprep.select.select_cols(data, idx=genes)
129 | data_magic = scprep.utils.toarray(data_magic)
130 | c = data_magic[gene_color] if gene_color is not None else None
131 | sc = ax.scatter(data_magic[gene_x], data_magic[gene_y], c=c, cmap=cmap)
132 | ax.set_title("t = 0")
133 | ax.set_xlabel(gene_x)
134 | ax.set_ylabel(gene_y)
135 | ax.set_xticks([])
136 | ax.set_yticks([])
137 | ax.set_xticklabels([])
138 | ax.set_yticklabels([])
139 | if gene_color is not None:
140 | plt.colorbar(sc, label=gene_color, ticks=[])
141 |
142 | data_magic = [data]
143 | for t in range(t_max):
144 | operator.set_params(t=t + 1)
145 | data_magic.append(operator.transform(genes=genes))
146 |
147 | def init():
148 | return ax
149 |
150 | def animate(i):
151 | i = max(i - delay, 0)
152 | data_t = data_magic[i]
153 | data_t = data_t if isinstance(data, pd.DataFrame) else data_t.T
154 | sc.set_offsets(np.array([data_t[gene_x], data_t[gene_y]]).T)
155 | xlim = np.min(data_t[gene_x]), np.max(data_t[gene_x])
156 | xrange = xlim[1] - xlim[0]
157 | ax.set_xlim(xlim[0] - xrange / 10, xlim[1] + xrange / 10)
158 | ylim = np.min(data_t[gene_y]), np.max(data_t[gene_y])
159 | yrange = ylim[1] - ylim[0]
160 | ax.set_ylim(ylim[0] - yrange / 10, ylim[1] + yrange / 10)
161 | ax.set_title("t = {}".format(i))
162 | if gene_color is not None:
163 | color_t = data_t[gene_color]
164 | color_t -= np.min(color_t)
165 | color_t /= np.max(color_t)
166 | sc.set_facecolor(cmap(color_t))
167 | return ax
168 |
169 | ani = animation.FuncAnimation(
170 | fig,
171 | animate,
172 | init_func=init,
173 | frames=range(t_max + delay + 1),
174 | interval=interval,
175 | blit=False,
176 | )
177 |
178 | if filename is not None:
179 | ani.save(filename, writer=writer, dpi=dpi)
180 |
181 | if in_ipynb():
182 | # credit to https://stackoverflow.com/a/45573903/3996580
183 | plt.close()
184 | elif show:
185 | plt.tight_layout()
186 | fig.show()
187 |
188 | return ani
189 |
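The frame handling in `animate_magic` can be read off two lines: the animation runs over `range(t_max + delay + 1)` frames, and each frame index is mapped to a diffusion time with `max(i - delay, 0)`. The sketch below isolates that mapping without any plotting. `frame_to_t` is a hypothetical helper name, and the small `t_max` and `delay` values are chosen only to keep the output short.

```python
def frame_to_t(frame, delay):
    """Map an animation frame index to the diffusion time t it displays."""
    return max(frame - delay, 0)


t_max, delay = 3, 2
schedule = [frame_to_t(frame, delay) for frame in range(t_max + delay + 1)]
print(schedule)
```

With these values the unsmoothed data (`t = 0`) is shown for `delay + 1` consecutive frames before `t` starts to increase by one per frame up to `t_max`.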
--------------------------------------------------------------------------------
/python/magic/utils.py:
--------------------------------------------------------------------------------
1 | from scipy import sparse
2 |
3 | import numbers
4 | import numpy as np
5 | import pandas as pd
6 | import scprep
7 |
8 | try:
9 | import anndata
10 | except (ImportError, SyntaxError):
11 | # anndata not installed
12 | pass
13 |
14 |
15 | def check_positive(**params):
16 | """Check that parameters are positive as expected.
17 |
18 | Raises
19 | ------
20 | ValueError : unacceptable choice of parameters
21 | """
22 | for p in params:
23 | if params[p] <= 0:
24 | raise ValueError("Expected {} > 0, got {}".format(p, params[p]))
25 |
26 |
27 | def check_int(**params):
28 | """Check that parameters are integers as expected.
29 |
30 | Raises
31 | ------
32 | ValueError : unacceptable choice of parameters
33 | """
34 | for p in params:
35 | if not isinstance(params[p], numbers.Integral):
36 | raise ValueError("Expected {} integer, got {}".format(p, params[p]))
37 |
38 |
39 | def check_if_not(x, *checks, **params):
40 | """Run checks only if parameters are not equal to a specified value.
41 |
42 | Parameters
43 | ----------
44 |
45 | x : excepted value
46 | Checks not run if parameters equal x
47 |
48 | checks : function
49 | Unnamed arguments, check functions to be run
50 |
51 | params : object
52 | Named arguments, parameters to be checked
53 |
54 | Raises
55 | ------
56 | ValueError : unacceptable choice of parameters
57 | """
58 | for p in params:
59 | if params[p] is not x and params[p] != x:
60 | [check(p=params[p]) for check in checks]
61 |
62 |
63 | def check_in(choices, **params):
64 | """Checks parameters are in a list of allowed parameters.
65 |
66 | Parameters
67 | ----------
68 |
69 | choices : array-like, accepted values
70 |
71 | params : object
72 | Named arguments, parameters to be checked
73 |
74 | Raises
75 | ------
76 | ValueError : unacceptable choice of parameters
77 | """
78 | for p in params:
79 | if params[p] not in choices:
80 | raise ValueError(
81 | "{} value {} not recognized. Choose from {}".format(
82 | p, params[p], choices
83 | )
84 | )
85 |
86 |
87 | def check_between(v_min, v_max, **params):
88 | """Checks parameters are in a specified range.
89 |
90 | Parameters
91 | ----------
92 |
93 | v_min : float, minimum allowed value (inclusive)
94 |
95 | v_max : float, maximum allowed value (inclusive)
96 |
97 | params : object
98 | Named arguments, parameters to be checked
99 |
100 | Raises
101 | ------
102 | ValueError : unacceptable choice of parameters
103 | """
104 | for p in params:
105 | if params[p] < v_min or params[p] > v_max:
106 | raise ValueError(
107 | "Expected {} between {} and {}, "
108 | "got {}".format(p, v_min, v_max, params[p])
109 | )
110 |
111 |
112 | def matrix_is_equivalent(X, Y):
113 | """Check matrix equivalence with numpy, scipy and pandas."""
114 | if X is Y:
115 | return True
116 | elif X.shape == Y.shape:
117 | if sparse.issparse(X) or sparse.issparse(Y):
118 | X = scprep.utils.to_array_or_spmatrix(X)
119 | Y = scprep.utils.to_array_or_spmatrix(Y)
120 | elif isinstance(X, pd.DataFrame) and isinstance(Y, pd.DataFrame):
121 | return np.all(X == Y)
122 | elif not (sparse.issparse(X) and sparse.issparse(Y)):
123 | X = scprep.utils.toarray(X)
124 | Y = scprep.utils.toarray(Y)
125 | return np.allclose(X, Y)
126 | else:
127 | return np.allclose((X - Y).data, 0)
128 | else:
129 | return False
130 |
131 |
132 | def convert_to_same_format(data, target_data, columns=None, prevent_sparse=False):
133 | """Convert data to same format as target data."""
134 | # create new data object
135 | if scprep.utils.is_sparse_dataframe(target_data):
136 | if prevent_sparse:
137 | data = pd.DataFrame(data)
138 | else:
139 | data = scprep.utils.SparseDataFrame(data)
140 | pandas = True
141 | elif isinstance(target_data, pd.DataFrame):
142 | data = pd.DataFrame(data)
143 | pandas = True
144 | elif is_anndata(target_data):
145 | data = anndata.AnnData(data)
146 | pandas = False
147 | else:
148 | # nothing to do
149 | return data
150 | # retrieve column names
151 | target_columns = target_data.columns if pandas else target_data.var
152 | # subset column names
153 | try:
154 | if columns is not None:
155 | if pandas:
156 | target_columns = target_columns[columns]
157 | else:
158 | target_columns = target_columns.iloc[columns]
159 | except (KeyError, IndexError, ValueError):
160 | # keep the original column names
161 | if pandas:
162 | target_columns = columns
163 | else:
164 | target_columns = pd.DataFrame(index=columns)
165 | # set column names on new data object
166 | if pandas:
167 | data.columns = target_columns
168 | data.index = target_data.index
169 | else:
170 | data.var = target_columns
171 | data.obs = target_data.obs
172 | return data
173 |
174 |
175 | def in_ipynb():
176 | """Check if we are running in a Jupyter Notebook.
177 |
178 | Credit to https://stackoverflow.com/a/24937408/3996580
179 | """
180 | __VALID_NOTEBOOKS = [
181 | "<class 'google.colab._shell.Shell'>",
182 | "<class 'ipykernel.zmqshell.ZMQInteractiveShell'>",
183 | ]
184 | try:
185 | return str(type(get_ipython())) in __VALID_NOTEBOOKS
186 | except NameError:
187 | return False
188 |
189 |
190 | def is_anndata(data):
191 | """Check if an object is an AnnData object."""
192 | try:
193 | return isinstance(data, anndata.AnnData)
194 | except NameError:
195 | # anndata not installed
196 | return False
197 |
198 |
199 | def has_empty_columns(data):
200 | """Check if an object has empty columns."""
201 | try:
202 | return np.any(np.array(data.sum(0)) == 0)
203 | except AttributeError:
204 | if is_anndata(data):
205 | return np.any(np.array(data.X.sum(0)) == 0)
206 | else:
207 | raise
208 |
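The validation helpers in `utils.py` are designed to be combined: `check_if_not` skips every check for a parameter that equals a sentinel value and otherwise forwards the value to each check function. The usage sketch below copies three helper bodies from the file above so that it runs on its own; the parameter names `knn` and `decay` and their values are used purely as examples.

```python
import numbers


def check_positive(**params):
    for p in params:
        if params[p] <= 0:
            raise ValueError("Expected {} > 0, got {}".format(p, params[p]))


def check_int(**params):
    for p in params:
        if not isinstance(params[p], numbers.Integral):
            raise ValueError("Expected {} integer, got {}".format(p, params[p]))


def check_if_not(x, *checks, **params):
    for p in params:
        if params[p] is not x and params[p] != x:
            [check(p=params[p]) for check in checks]


# decay=None matches the sentinel, so no check runs for it; knn=5 passes both.
check_if_not(None, check_positive, check_int, knn=5, decay=None)

# A failing value raises ValueError from the first check that rejects it.
message = None
try:
    check_if_not(None, check_positive, knn=-1)
except ValueError as err:
    message = str(err)
print(message)
```

Because `check_if_not` forwards each value under the fixed keyword `p`, the resulting error message names the parameter `p` rather than `knn`.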
--------------------------------------------------------------------------------
/python/magic/version.py:
--------------------------------------------------------------------------------
1 | __version__ = "3.0.0"
2 |
--------------------------------------------------------------------------------
/python/requirements.txt:
--------------------------------------------------------------------------------
1 | numpy>=1.14.0
2 | scipy>=1.1.0
3 | pandas>=0.25
4 | scprep>=1.0
5 | matplotlib
6 | scikit-learn>=0.19.1
7 | future
8 | tasklogger>=1.0.0
9 | graphtools>=1.4.0
10 |
--------------------------------------------------------------------------------
/python/setup.py:
--------------------------------------------------------------------------------
1 | from setuptools import find_packages
2 | from setuptools import setup
3 |
4 | import os
5 |
6 | install_requires = [
7 | "numpy>=1.14.0",
8 | "scipy>=1.1.0",
9 | "matplotlib",
10 | "scikit-learn>=0.19.1",
11 | "future",
12 | "tasklogger>=1.0.0",
13 | "graphtools>=1.4.0",
14 | "pandas>=0.25",
15 | "scprep>=1.0",
16 | ]
17 |
18 | test_requires = ["nose2", "anndata", "coverage", "coveralls"]
19 |
20 | doc_requires = [
21 | "sphinx",
22 | "sphinxcontrib-napoleon",
23 | ]
24 |
25 | version_py = os.path.join(os.path.dirname(__file__), "magic", "version.py")
26 | version = open(version_py).read().strip().split("=")[-1].replace('"', "").strip()
27 |
28 | readme = open("README.rst").read()
29 |
30 | setup(
31 | name="magic-impute",
32 | version=version,
33 | description="MAGIC",
34 | author="",
35 | author_email="",
36 | packages=find_packages(),
37 | license="GNU General Public License Version 2",
38 | python_requires=">=3.6",
39 | install_requires=install_requires,
40 | extras_require={"test": test_requires, "doc": doc_requires},
41 | test_suite="nose2.collector.collector",
42 | long_description=readme,
43 | url="https://github.com/KrishnaswamyLab/MAGIC",
44 | download_url="https://github.com/KrishnaswamyLab/MAGIC/archive/v{}.tar.gz".format(
45 | version
46 | ),
47 | keywords=[
48 | "visualization",
49 | "big-data",
50 | "dimensionality-reduction",
51 | "embedding",
52 | "manifold-learning",
53 | "computational-biology",
54 | ],
55 | classifiers=[
56 | "Development Status :: 5 - Production/Stable",
57 | "Environment :: Console",
58 | "Framework :: Jupyter",
59 | "Intended Audience :: Developers",
60 | "Intended Audience :: Science/Research",
61 | "Natural Language :: English",
62 | "Operating System :: MacOS :: MacOS X",
63 | "Operating System :: Microsoft :: Windows",
64 | "Operating System :: POSIX :: Linux",
67 | "Programming Language :: Python :: 3",
69 | "Programming Language :: Python :: 3.6",
70 | "Topic :: Scientific/Engineering :: Bio-Informatics",
71 | ],
72 | )
73 |
--------------------------------------------------------------------------------
/python/test/test.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 |
4 | import magic
5 | import matplotlib as mpl
6 | import numpy as np
7 | import os
8 | import scprep
9 |
10 | mpl.use("agg")
11 |
12 | try:
13 | import anndata
14 | except (ImportError, SyntaxError):
15 | # anndata not installed
16 | pass
17 |
18 |
19 | data_path = os.path.join("..", "data", "test_data.csv")
20 | if not os.path.isfile(data_path):
21 | data_path = os.path.join("..", data_path)
22 | scdata = scprep.io.load_csv(data_path, cell_names=False)
23 | scdata = scprep.filter.filter_empty_cells(scdata)
24 | scdata = scprep.filter.filter_empty_genes(scdata)
25 | scdata = scprep.filter.filter_duplicates(scdata)
26 | scdata_norm = scprep.normalize.library_size_normalize(scdata)
27 | scdata_norm = scprep.transform.sqrt(scdata_norm)
28 |
29 |
30 | def test_genes_str_int():
31 | magic_op = magic.MAGIC(t="auto", decay=20, knn=10, verbose=False)
32 | str_gene_magic = magic_op.fit_transform(scdata_norm, genes=["VIM", "ZEB1"])
33 | int_gene_magic = magic_op.fit_transform(
34 | scdata_norm, graph=magic_op.graph, genes=[-2, -1]
35 | )
36 | assert str_gene_magic.shape[0] == scdata_norm.shape[0]
37 | np.testing.assert_array_equal(str_gene_magic, int_gene_magic)
38 |
39 |
40 | def test_pca_only():
41 | magic_op = magic.MAGIC(t="auto", decay=20, knn=10, verbose=False)
42 | pca_magic = magic_op.fit_transform(scdata_norm, genes="pca_only")
43 | assert pca_magic.shape[0] == scdata_norm.shape[0]
44 | assert pca_magic.shape[1] == magic_op.n_pca
45 |
46 |
47 | def test_all_genes():
48 | magic_op = magic.MAGIC(t="auto", decay=20, knn=10, verbose=False, random_state=42)
49 | int_gene_magic = magic_op.fit_transform(scdata_norm, genes=[-2, -1])
50 | magic_all_genes = magic_op.fit_transform(scdata_norm, genes="all_genes")
51 | assert scdata_norm.shape == magic_all_genes.shape
52 | int_gene_magic2 = magic_op.transform(scdata_norm, genes=[-2, -1])
53 | np.testing.assert_allclose(int_gene_magic, int_gene_magic2, rtol=0.015)
54 |
55 |
56 | def test_all_genes_approx():
57 | magic_op = magic.MAGIC(
58 | t="auto", decay=20, knn=10, verbose=False, solver="approximate", random_state=42
59 | )
60 | int_gene_magic = magic_op.fit_transform(scdata_norm, genes=[-2, -1])
61 | magic_all_genes = magic_op.fit_transform(scdata_norm, genes="all_genes")
62 | assert scdata_norm.shape == magic_all_genes.shape
63 | int_gene_magic2 = magic_op.transform(scdata_norm, genes=[-2, -1])
64 | np.testing.assert_allclose(int_gene_magic, int_gene_magic2, atol=0.003, rtol=0.008)
65 |
66 |
67 | def test_dremi():
68 | magic_op = magic.MAGIC(t="auto", decay=20, knn=10, verbose=False)
69 | # test DREMI: need numerical precision here
70 | magic_op.set_params(random_state=42)
71 | magic_op.fit(scdata_norm)
72 | dremi = magic_op.knnDREMI("VIM", "ZEB1", plot=True)
73 | np.testing.assert_allclose(dremi, 1.466004, atol=0.0000005)
74 |
75 |
76 | def test_solver():
77 | # Testing exact vs approximate solver
78 | magic_op = magic.MAGIC(
79 | t="auto", decay=20, knn=10, solver="exact", verbose=False, random_state=42
80 | )
81 | data_imputed_exact = magic_op.fit_transform(scdata_norm)
82 | # should have exactly as many genes stored
83 | assert magic_op.X_magic.shape[1] == scdata_norm.shape[1]
84 | # should be non-negative
85 | assert np.all(data_imputed_exact >= 0)
86 |
87 | magic_op = magic.MAGIC(
88 | t="auto",
89 | decay=20,
90 | knn=10,
91 | n_pca=150,
92 | solver="approximate",
93 | verbose=False,
94 | random_state=42,
95 | )
96 | # magic_op.set_params(solver='approximate')
97 | data_imputed_apprx = magic_op.fit_transform(scdata_norm)
98 | # should have n_pca genes stored
99 | assert magic_op.X_magic.shape[1] == 150
100 | # make sure they're close-ish
101 | np.testing.assert_allclose(data_imputed_apprx, data_imputed_exact, atol=0.15)
102 | # make sure they're not identical
103 | assert np.any(data_imputed_apprx != data_imputed_exact)
104 |
105 |
106 | def test_anndata():
107 | try:
108 | anndata
109 | except NameError:
110 | # anndata not installed
111 | return
112 | scdata = anndata.read_csv(data_path)
113 | fast_magic_operator = magic.MAGIC(
114 | t="auto", solver="approximate", decay=None, knn=10, verbose=False
115 | )
116 | sc_magic = fast_magic_operator.fit_transform(scdata, genes="all_genes")
117 | assert np.all(sc_magic.var_names == scdata.var_names)
118 | assert np.all(sc_magic.obs_names == scdata.obs_names)
119 | sc_magic = fast_magic_operator.fit_transform(scdata, genes=["VIM", "ZEB1"])
120 | assert np.all(sc_magic.var_names.values == np.array(["VIM", "ZEB1"]))
121 | assert np.all(sc_magic.obs_names == scdata.obs_names)
122 |
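Several tests above compare results with `np.testing.assert_allclose` using different `rtol` and `atol` values. NumPy documents the criterion as `|actual - desired| <= atol + rtol * |desired|`, applied elementwise. The sketch below restates that inequality for scalars in plain Python so the tolerances can be reasoned about. `within_tolerance` is a hypothetical helper name, not NumPy's implementation, and the perturbed value `1.4660044` is an assumed example.

```python
def within_tolerance(actual, desired, rtol=1e-7, atol=0.0):
    """Scalar form of the assert_allclose criterion."""
    return abs(actual - desired) <= atol + rtol * abs(desired)


# Tolerance used in test_dremi: the default rtol plus atol=0.0000005.
dremi_ok = within_tolerance(1.4660044, 1.466004, atol=0.0000005)

# Tolerance used in test_all_genes: rtol=0.015 with no absolute slack.
genes_ok = within_tolerance(1.0, 1.02, rtol=0.015)

print(dremi_ok, genes_ok)
```

The second comparison fails because a difference of 0.02 exceeds 1.5% of the desired value 1.02.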
--------------------------------------------------------------------------------
/setup.cfg:
--------------------------------------------------------------------------------
1 | [flake8]
2 | ignore =
3 | # top-level module docstring
4 | D100, D104,
5 | # space before : conflicts with black
6 | E203
7 | per-file-ignores =
8 | # imported but unused
9 | __init__.py: F401
10 | # missing docstring in public function for methods, metrics, datasets
11 | openproblems/tasks/*/*/*.py: D103, E203
12 | openproblems/tasks/*/*/__init__.py: F401, D103
13 | max-line-length = 88
14 | exclude =
15 | .git,
16 | __pycache__,
17 | build,
18 | dist,
19 | Snakefile
20 |
21 | [isort]
22 | profile = black
23 | force_single_line = true
24 | force_alphabetical_sort = true
25 |
--------------------------------------------------------------------------------