├── bash_examples
│   └── 001_ms4_bash_example
│       ├── .gitignore
│       ├── ms4_sort_bash.sh
│       ├── synthesize_dataset.sh
│       └── readme.md
├── .gitattributes
├── spikeforest
│   ├── readme.md
│   ├── test_view_neto.m
│   ├── ironclust.py
│   ├── spikeforest_datasets.py
│   ├── prepare_neto.m
│   ├── view_datasets.ipynb
│   ├── single_sort.ipynb
│   ├── spikeforest.py
│   ├── create_synth_datasets.ipynb
│   ├── prepare_neto.ipynb
│   ├── spikeforest_view_results.ipynb
│   └── spikeforest.ipynb
├── .gitignore
├── Dockerfile
├── postBuild
├── python
│   ├── default_lari_servers.py
│   ├── summarize_sorting_results.py
│   ├── synthesize_dataset.py
│   ├── validate_sorting_results.py
│   └── mountainsort4_1_0.py
├── docs
│   ├── misc.md
│   ├── preparing_datasets.md
│   ├── mda_format.md
│   └── sharing_datasets.md
├── jupyter_examples
│   ├── view_datasets
│   │   ├── load_standard_datasets.py
│   │   └── view_datasets.ipynb
│   ├── processor_tests
│   │   ├── view_timeseries.py
│   │   └── processor_tests.ipynb
│   ├── 001_ms4_jupyter_example
│   │   └── ms4_jupyter_example.ipynb
│   └── example1
│       └── example1.ipynb
├── sandbox
│   ├── cs_dataset_009
│   │   └── Untitled.ipynb
│   └── wrap_spyking_circus
│       └── wrap_spyking_circus.ipynb
└── README.md
/bash_examples/001_ms4_bash_example/.gitignore:
--------------------------------------------------------------------------------
1 | dataset
2 | output
3 |
--------------------------------------------------------------------------------
/.gitattributes:
--------------------------------------------------------------------------------
1 | *.ipynb filter=nbstripout
2 |
3 | *.ipynb diff=ipynb
4 |
--------------------------------------------------------------------------------
/spikeforest/readme.md:
--------------------------------------------------------------------------------
1 | This directory is a work in progress. Here we are developing the code for the web-based spike sorting comparison framework.
2 |
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | data
2 | build
3 | output
4 | .ipynb_checkpoints
5 | package-lock.json
6 | node_modules
7 | __pycache__
8 | dist
9 | *.sublime-project
10 | *.sublime-workspace
11 | dataset
12 | *output/
13 |
--------------------------------------------------------------------------------
/Dockerfile:
--------------------------------------------------------------------------------
1 | FROM magland/jp_proxy_widget:20180831
2 |
3 | ### Install conda packages
4 | RUN conda config --add channels flatiron
5 | RUN conda config --add channels conda-forge
6 | RUN conda install -y "mountainlab>=0.15"
7 | RUN conda install -y mountainlab_pytools
8 | RUN conda install -y ml_ephys ml_ms4alg ml_ms3 ml_pyms
9 | RUN conda install -y ml_spikeforest
10 |
11 | ### Add this repo
12 | ADD . /working
13 | WORKDIR /working
14 |
--------------------------------------------------------------------------------
/spikeforest/test_view_neto.m:
--------------------------------------------------------------------------------
1 | fname='download/NETO/2014_11_25_Pair_3_0/amplifier2014-11-25T23_00_08.bin';
2 | F=fopen(fname);
3 | X=fread(F,[32,inf],'uint16');
4 | fclose(F);
5 |
6 | figure;
7 | for j=1:32
8 | plot(X(j,1:300));
9 | hold on;
10 | end
11 |
12 | fname='datasets/neto_32ch_1/raw.mda';
13 |
14 | Y=readmda(fname);
15 |
16 | figure;
17 | for j=1:32
18 | plot(Y(j,1:300));
19 | hold on;
20 | end
21 |
22 |
23 |
--------------------------------------------------------------------------------
/postBuild:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | # This script installs the required jupyterlab extensions for running the examples
4 | # jp_proxy_widget enables embedding of javascript widgets in jupyterlab notebooks
5 |
6 | set -e
7 |
8 | jupyter nbextension enable --py --sys-prefix jp_proxy_widget
9 | jupyter labextension install @jupyter-widgets/jupyterlab-manager --no-build
10 | jupyter labextension install jp_proxy_widget --no-build
11 | jupyter lab build
12 |
--------------------------------------------------------------------------------
/python/default_lari_servers.py:
--------------------------------------------------------------------------------
1 | def default_lari_servers():
2 | ret=[]
3 | ret.append(dict(
4 | label='Public 1 (passcode=public)',
5 | LARI_ID='ece375048d28'
6 | ))
7 | ret.append(dict(
8 | label='Public 2 (passcode=public)',
9 | LARI_ID='dd5921ab5fc1'
10 | ))
11 | ret.append(dict(
12 | label='Flatiron cluster',
13 | LARI_ID='fdb573a66f50'
14 | ))
15 | ret.append(dict(
16 | label='Jeremy\'s laptop',
17 | LARI_ID='cb48a51bf9e5'
18 | ))
19 | ret.append(dict(
20 | label='Local computer',
21 | LARI_ID=''
22 | ))
23 | return ret
--------------------------------------------------------------------------------
/docs/misc.md:
--------------------------------------------------------------------------------
1 | ## Running these examples via epoxy
2 |
3 | http://epoxyhub.org/?source=https://github.com/flatironinstitute/mountainsort_examples
4 |
5 | Navigate to jupyter_examples/example1 and open example1.ipynb.
6 |
7 | Singularity: each MountainLab processing library can have a Singularity file. For example:
8 | https://github.com/magland/ml_ephys
9 |
10 | Singularity-hub (https://www.singularity-hub.org/) can be configured to automatically build new Singularity images (aka containers) encapsulating the processors.
11 |
12 | The following code tells MountainLab to use a particular version of the Singularity image for ml_ephys for ALL processors whose names begin with the string 'ephys.':
13 |
14 | ```
15 | mlp.addContainerRule(pattern='ephys.*',container='shub://magland/ml_ephys:v0.2.5')
16 | ```
17 |
--------------------------------------------------------------------------------
/python/summarize_sorting_results.py:
--------------------------------------------------------------------------------
1 | import os
2 | from mountainlab_pytools import mdaio
3 | from mountainlab_pytools import mlproc as mlp
4 |
5 | def summarize_sorting_results(*,dataset_dir,sorting_output_dir,output_dir,opts):
6 | if not os.path.exists(output_dir):
7 | os.mkdir(output_dir)
8 | compute_templates(timeseries=dataset_dir+'/raw.mda',firings=sorting_output_dir+'/firings.mda',templates_out=output_dir+'/templates.mda')
9 |
10 | def compute_templates(*,timeseries,firings,templates_out,opts={}):
11 | return mlp.addProcess(
12 | 'ephys.compute_templates',
13 | {
14 | 'timeseries':timeseries,
15 | 'firings':firings
16 | },
17 | {
18 | 'templates_out':templates_out
19 | },
20 | {},
21 | opts
22 | )['outputs']['templates_out']
23 |
--------------------------------------------------------------------------------
/bash_examples/001_ms4_bash_example/ms4_sort_bash.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -e
4 |
5 | mkdir -p output
6 |
7 | # Preprocess
8 | ml-run-process ephys.bandpass_filter \
9 | --inputs timeseries:dataset/raw.mda.prv \
10 | --outputs timeseries_out:output/filt.mda.prv \
11 | --parameters samplerate:30000 freq_min:300 freq_max:6000
12 | ml-run-process ephys.whiten \
13 | --inputs timeseries:output/filt.mda.prv \
14 | --outputs timeseries_out:output/pre.mda.prv
15 |
16 | # Spike sorting
17 | ml-run-process ms4alg.sort \
18 | --inputs \
19 | timeseries:output/pre.mda.prv geom:dataset/geom.csv \
20 | --outputs \
21 | firings_out:output/firings.mda \
22 | --parameters \
23 | detect_sign:1 \
24 | adjacency_radius:-1 \
25 | detect_threshold:3
26 |
27 | # Compute templates
28 | ml-run-process ephys.compute_templates \
29 | --inputs timeseries:dataset/raw.mda.prv firings:output/firings.mda \
30 | --outputs templates_out:output/templates.mda.prv \
31 | --parameters \
32 | clip_size:150
--------------------------------------------------------------------------------
/bash_examples/001_ms4_bash_example/synthesize_dataset.sh:
--------------------------------------------------------------------------------
1 | #!/bin/bash
2 |
3 | set -e
4 |
5 | mkdir -p dataset
6 |
7 | # Create some random spike waveforms
8 | ml-run-process ephys.synthesize_random_waveforms \
9 | --outputs \
10 | waveforms_out:dataset/waveforms_true.mda.prv \
11 | geometry_out:dataset/geom.csv \
12 | --parameters \
13 | upsamplefac:13 \
14 | M:4 \
15 |   average_peak_amplitude:100
16 |
17 | # Create random firing event timings
18 | ml-run-process ephys.synthesize_random_firings \
19 | --outputs \
20 | firings_out:dataset/firings_true.mda.prv \
21 | --parameters \
22 | duration:600
23 |
24 | # Make a synthetic ephys dataset
25 | ml-run-process ephys.synthesize_timeseries \
26 | --inputs \
27 | firings:dataset/firings_true.mda.prv \
28 | waveforms:dataset/waveforms_true.mda.prv \
29 | --outputs \
30 | timeseries_out:dataset/raw.mda.prv \
31 | --parameters \
32 | duration:600 \
33 | waveform_upsamplefac:13 \
34 | noise_level:10
35 |
36 | # Create the params.json file
37 | printf "{\n \"samplerate\":30000,\n \"spike_sign\":1\n}" > dataset/params.json
38 |
39 |
--------------------------------------------------------------------------------
/docs/preparing_datasets.md:
--------------------------------------------------------------------------------
1 | ## Preparing datasets
2 |
3 | To use MountainSort with your own data, you should prepare datasets in the following directory structure:
4 |
5 | ```
6 | study_directory/
7 |     dataset1/
8 |         raw.mda
9 |         geom.csv
10 |         params.json
11 |     dataset2/
12 |         ...
14 |
15 | where `study_directory`, `dataset1`, `dataset2`, ... can be replaced by names of your choosing (don't use spaces in the file names).
16 |
17 | `raw.mda` is the `M x N` timeseries array in [mda format](mda_format.md), where `M` is the number of channels and `N` is the number of timepoints.
18 |
19 | `geom.csv` represents the probe geometry and is a comma-separated text file containing 2-d or 3-d coordinates of the electrodes. The number of lines (or rows) in `geom.csv` should equal `M`, the number of channels. For example:
20 |
21 | ```
22 | 0,0
23 | 20,0
24 | 0,20
25 | 20,20
26 | ```
27 |
28 | `params.json` is a JSON format file containing dataset-specific parameters, including `samplerate` (in Hz) and `spike_sign` (`-1`, `0`, or `1`). For example:
29 |
30 | ```
31 | {
32 | "samplerate":30000,
33 | "spike_sign":-1
34 | }
35 | ```
36 |
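37 | For example, a dataset in this layout can be written with a short python script (a minimal sketch; `X` stands in for your own `M x N` numpy array, and the geometry and parameter values are placeholders):
38 | 
39 | ```
40 | import os, json
41 | import numpy as np
42 | from mountainlab_pytools import mdaio
43 | 
44 | dsdir='study_directory/dataset1'
45 | os.makedirs(dsdir,exist_ok=True)
46 | 
47 | # M x N timeseries (M channels, N timepoints)
48 | X=np.zeros((4,30000),dtype=np.float32)
49 | mdaio.writemda32(X,dsdir+'/raw.mda')
50 | 
51 | # One row of 2-d coordinates per channel
52 | geom=np.array([[0,0],[20,0],[0,20],[20,20]])
53 | np.savetxt(dsdir+'/geom.csv',geom,delimiter=',',fmt='%g')
54 | 
55 | with open(dsdir+'/params.json','w') as f:
56 |     json.dump({"samplerate":30000,"spike_sign":-1},f)
57 | ```
58 | 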
--------------------------------------------------------------------------------
/spikeforest/ironclust.py:
--------------------------------------------------------------------------------
1 | from mountainlab_pytools import mlproc as mlp
2 | import os
3 | import json
4 |
5 | def read_dataset_params(dsdir):
6 | params_fname=mlp.realizeFile(dsdir+'/params.json')
7 | if not os.path.exists(params_fname):
8 | raise Exception('Dataset parameter file does not exist: '+params_fname)
9 | with open(params_fname) as f:
10 | return json.load(f)
11 |
12 | def sort_dataset(dataset_dir,output_dir):
13 | if not os.path.exists(output_dir):
14 | os.mkdir(output_dir)
15 |
16 | # Dataset parameters
17 | ds_params=read_dataset_params(dataset_dir)
18 |
19 | detect_sign=1
20 | if 'spike_sign' in ds_params:
21 | detect_sign=ds_params['spike_sign']
22 | if 'detect_sign' in ds_params:
23 | detect_sign=ds_params['detect_sign']
24 |
25 | mlp.addProcess(
26 | 'ironclust.sort',
27 | {
28 | 'timeseries':dataset_dir+'/raw.mda',
29 | 'geom':dataset_dir+'/geom.csv',
30 | },{
31 | 'firings_out':output_dir+'/firings.mda'
32 | },
33 | {
34 | 'samplerate':ds_params['samplerate'],
35 | 'detect_sign':detect_sign,
36 | 'prm_template_name':'tetrode_template.prm'
37 | #'should_cause_error':123
38 | },
39 | {
40 | }
41 | )
--------------------------------------------------------------------------------
/jupyter_examples/view_datasets/load_standard_datasets.py:
--------------------------------------------------------------------------------
1 | from mountainlab_pytools import mlproc as mlp
2 |
3 | def load_standard_datasets(verbose=True):
4 | datasets=[]
5 |
6 | jjun_dir='kbucket://22f3e72dd783/groundtruth'
7 |
8 | groups=[
9 | dict(
10 | group_dir='kbucket://b5ecdf1474c5/datasets/synth_datasets/datasets',
11 | group_name='synth'
12 | ),
13 | dict(
14 | group_dir=jjun_dir+'/kampff',
15 | group_name='kampff'
16 | ),
17 | dict(
18 | group_dir=jjun_dir+'/boyden',
19 | group_name='boyden'
20 | ),
21 | dict(
22 | group_dir=jjun_dir+'/bionet_8x',
23 | group_name='bionet_8x'
24 | ),
25 | dict(
26 | group_dir=jjun_dir+'/mea256yger',
27 | group_name='mea256yger'
28 | )
29 | ]
30 | #groups=[groups[0]]
31 |
32 | for group in groups:
33 | group_dir=group['group_dir']
34 | group_name=group['group_name']
35 | if verbose:
36 |             print('Reading '+group_dir)
37 | D=mlp.readDir(group_dir)
38 | for name in D['dirs']:
39 | datasets.append({
40 | "id":group_name+"--"+name,
41 | "raw_path":group_dir+'/'+name
42 | })
43 |
44 | return datasets
45 |
46 |
--------------------------------------------------------------------------------
/spikeforest/spikeforest_datasets.py:
--------------------------------------------------------------------------------
1 | from mountainlab_pytools import mlproc as mlp
2 | import ipywidgets as widgets
3 |
4 | def load_standard_datasets(verbose=True):
5 | datasets=[]
6 |
7 | groups=[
8 | dict(
9 | group_dir='kbucket://b5ecdf1474c5/datasets/synth_datasets/datasets',
10 | group_name='synth'
11 | ),
12 | dict(
13 | group_dir='kbucket://697623ec3681/kampff',
14 | group_name='kampff'
15 | ),
16 | dict(
17 | group_dir='kbucket://697623ec3681/boyden',
18 | group_name='boyden'
19 | ),
20 | dict(
21 | group_dir='kbucket://697623ec3681/bionet_8x',
22 | group_name='bionet_8x'
23 | ),
24 | dict(
25 | group_dir='kbucket://697623ec3681/mea256yger',
26 | group_name='mea256yger'
27 | )
28 | ]
29 | #groups=[groups[0]]
30 |
31 | for group in groups:
32 | group_dir=group['group_dir']
33 | group_name=group['group_name']
34 | if verbose:
35 |             print('Reading '+group_dir)
36 | D=mlp.readDir(group_dir)
37 | for name in D['dirs']:
38 | datasets.append({
39 | "id":group_name+"--"+name,
40 | "raw_path":group_dir+'/'+name
41 | })
42 |
43 | return datasets
44 |
45 |
--------------------------------------------------------------------------------
/spikeforest/prepare_neto.m:
--------------------------------------------------------------------------------
1 | function prepare_neto
2 |
3 | dir_in='download/NETO/2014_11_25_Pair_3_0';
4 | dir_out='datasets/neto_32ch_1';
5 | if ~exist(dir_out)
6 | mkdir(dir_out)
7 | end;
8 |
9 | fid=fopen([dir_in,'/adc2014-11-25T23_00_08.bin']); % ADC = juxta
10 | J = fread(fid,'uint16');
11 | fclose(fid);
12 | MJ = 8; % # adc ch
13 | N = numel(J)/MJ
14 | J = reshape(J,[MJ N]);
15 | used_channel = 0; J = J(used_channel+1,:); % to 1-indexed channel #
16 | J = J * (10/65536/100) * 1e6; % uV
17 | writemda32(J,[dir_out,'/juxta.mda']);
18 | mJ = mean(J);
19 | times = find(diff(J>mJ+(max(J)-mJ)/2)==1); % trigger on half-way-up-going
20 | labels = 1+0*times;
21 | writemda64([0*times;times;labels],[dir_out,'/firings_true.mda']);
22 |
23 | % elec coords (x,y)
24 | M=32
25 | ord = [31 24 7 1 21 10 30 25 6 15 20 11 16 26 5 14 19 12 17 27 4 8 18 13 23 28 3 9 29 2 22]; % ordering across, pasted from PDF file Map_32electrodes.pdf, apart from the top 0.
26 | ord = ord+1; % 1-indexed
27 | x=nan(32,1); y=x;
28 | x(1) = 0; x(ord(1:3:end))=0;
29 | x(ord(2:3:end))=-sqrt(3)/2; x(ord(3:3:end))=+sqrt(3)/2;
30 | y(1) = 0; y(ord(1:3:end))=-1:-1:-11;
31 | y(ord(2:3:end))=-1.5:-1:-10.5; y(ord(3:3:end))=-1.5:-1:-10.5;
32 | %figure; plot(x,y,'k.'); hold on; title('1-indexed electrode locations');
33 | %for m=1:M, text(x(m),y(m),sprintf('%d',m)); end, axis equal
34 |
35 | geom=zeros(2,M);
36 | geom(1,:)=x;
37 | geom(2,:)=y;
38 | csvwrite([dir_out,'/geom.csv'],geom');
--------------------------------------------------------------------------------
/bash_examples/001_ms4_bash_example/readme.md:
--------------------------------------------------------------------------------
1 | This example shows how to run the MountainSort v4 spike sorting algorithm using a bash script. This does not include the automated curation and is just intended to illustrate MountainLab usage from the command-line and using simple bash scripts. The recommended way to run spike sorting is by using python scripts and/or JupyterLab. See the documentation for more details.
2 |
3 | First you must install the latest version of MountainLab and at least the following MountainLab packages (see docs for installation instructions):
4 | * ml_ephys
5 | * ml_ms4alg
6 |
7 | To view the results, you can also install ephys-viz and, optionally, qt-mountainview (see docs for installation instructions).
8 |
9 | Create a synthetic dataset by running:
10 |
11 | ```
12 | ./synthesize_dataset.sh
13 | ```
14 |
15 | This will create some files in the dataset/ directory. To view the dataset (using ephys-viz):
16 |
17 | ```
18 | ev-dataset dataset
19 | ```
20 |
21 | Next, run the spike sorting:
22 |
23 | ```
24 | ./ms4_sort_bash.sh
25 | ```
26 |
27 | This should create an output directory with some files, including a `firings.mda` file.
28 |
29 | Now, view the results using any of the following:
30 |
31 | ```
32 | ev-templates output/templates.mda.prv
33 | ev-timeseries dataset/raw.mda.prv --firings output/firings.mda --samplerate=30000
34 | qt-mountainview --raw dataset/raw.mda.prv --filt output/filt.mda.prv --pre output/pre.mda.prv --samplerate 30000 --firings output/firings.mda
35 | ```
36 |
--------------------------------------------------------------------------------
/python/synthesize_dataset.py:
--------------------------------------------------------------------------------
1 | import os
2 | from mountainlab_pytools import mlproc as mlp
3 | import json
4 |
5 | def synthesize_dataset(dsdir,*,M,duration,average_snr,K=20):
6 | if not os.path.exists(dsdir):
7 | os.mkdir(dsdir)
8 | noise_level=10
9 | average_peak_amplitude=10*average_snr
10 | upsamplefac=13
11 | samplerate=30000
12 | mlp.addProcess(
13 | 'ephys.synthesize_random_waveforms',
14 | dict(
15 | ),
16 | dict(
17 | waveforms_out=dsdir+'/waveforms_true.mda.prv',
18 | geometry_out=dsdir+'/geom.csv'
19 | ),
20 | dict(
21 | upsamplefac=upsamplefac,
22 | M=M,
23 | K=K,
24 | average_peak_amplitude=average_peak_amplitude
25 | )
26 | )
27 | mlp.addProcess(
28 | 'ephys.synthesize_random_firings',
29 | dict(
30 | ),
31 | dict(
32 | firings_out=dsdir+'/firings_true.mda.prv'
33 | ),
34 | dict(
35 | duration=duration,
36 | samplerate=samplerate,
37 | K=K
38 | )
39 | )
40 | mlp.addProcess(
41 | 'ephys.synthesize_timeseries',
42 | dict(
43 | firings=dsdir+'/firings_true.mda',
44 | waveforms=dsdir+'/waveforms_true.mda'
45 | ),
46 | dict(
47 | timeseries_out=dsdir+'/raw.mda.prv'
48 | ),
49 | dict(
50 | duration=duration,
51 | waveform_upsamplefac=upsamplefac,
52 | noise_level=noise_level,
53 | samplerate=samplerate
54 | )
55 | )
56 | params=dict(
57 | samplerate=samplerate,
58 | spike_sign=1
59 | )
60 | with open(dsdir+'/params.json','w') as f:
61 | json.dump(params,f)
--------------------------------------------------------------------------------
/jupyter_examples/processor_tests/view_timeseries.py:
--------------------------------------------------------------------------------
1 | import matplotlib.pyplot as plt
2 | import numpy as np
3 | from mountainlab_pytools import mdaio
4 |
5 | def view_timeseries(timeseries,trange=None,channels=None,samplerate=30000,title='',fig_size=[18,6]):
6 | #timeseries=mls.loadMdaFile(timeseries)
7 | if type(timeseries)==str:
8 | X=mdaio.DiskReadMda(timeseries)
9 | M=X.N1()
10 | N=X.N2()
11 | if not trange:
12 | trange=[0,np.minimum(1000,N)]
13 | X=X.readChunk(i1=0,N1=X.N1(),i2=int(trange[0]),N2=int(trange[1]-trange[0]))
14 | else:
15 | M=timeseries.shape[0]
16 | N=timeseries.shape[1]
17 | if not channels:
18 | channels=range(M)
19 | if not trange:
20 | trange=[0,N]
21 | X=timeseries[channels][:,int(trange[0]):int(trange[1])]
22 |
23 | set_fig_size(fig_size[0],fig_size[1])
24 |
25 | channel_colors=_get_channel_colors(M)
26 | if not channels:
27 | channels=np.arange(M).tolist()
28 |
29 | spacing_between_channels=np.max(np.abs(X.ravel()))
30 |
31 | y_offset=0
32 | for m in range(len(channels)):
33 | A=X[m,:]
34 | plt.plot(np.arange(trange[0],trange[1]),A+y_offset,color=channel_colors[channels[m]])
35 | y_offset-=spacing_between_channels
36 |
37 | ax=plt.gca()
38 | ax.axes.get_xaxis().set_visible(False)
39 | ax.axes.get_yaxis().set_visible(False)
40 |
41 | if title:
42 |         plt.title(title)
43 |
44 | plt.show()
45 |     return ax
46 |
47 | def set_fig_size(W,H):
48 | fig_size = plt.rcParams["figure.figsize"]
49 | fig_size[0] = W
50 | fig_size[1] = H
51 | plt.rcParams["figure.figsize"] = fig_size
52 |
53 | def _get_channel_colors(M):
54 | cm = plt.get_cmap('gist_ncar')
55 | channel_colors=[]
56 | for m in range(M):
57 | channel_colors.append(cm(1.0*(m+0.5)/M))
58 | return channel_colors
59 |
60 | def view_templates(X):
61 | Y=X.transpose((0,2,1)).reshape((X.shape[0],X.shape[1]*X.shape[2]))
62 | view_timeseries(Y,trange=[0,Y.shape[1]])
--------------------------------------------------------------------------------
/sandbox/cs_dataset_009/Untitled.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "# For development purposes, reload imported modules when source changes\n",
10 | "%load_ext autoreload\n",
11 | "%autoreload 2\n",
12 | "\n",
13 | "def append_to_path(dir0): # A convenience function\n",
14 | " if dir0 not in sys.path:\n",
15 | " sys.path.append(dir0)\n",
16 | "\n",
17 | "import spikeinterface as si\n",
18 | "import os, sys\n",
19 | "import numpy as np\n",
20 | "\n",
21 | "append_to_path(os.getcwd()+'/../../spike-collab')\n",
22 | "from widgets.timeserieswidget import TimeseriesWidget\n",
23 | "\n",
24 | "from mountainlab_pytools import mlproc as mlp\n",
25 | "from mountainlab_pytools import mdaio"
26 | ]
27 | },
28 | {
29 | "cell_type": "code",
30 | "execution_count": null,
31 | "metadata": {},
32 | "outputs": [],
33 | "source": [
34 | "dspath='kbucket://b5ecdf1474c5/misc/cs_tetrode_009'"
35 | ]
36 | },
37 | {
38 | "cell_type": "code",
39 | "execution_count": null,
40 | "metadata": {},
41 | "outputs": [],
42 | "source": [
43 | "IX=si.MdaInputExtractor(dataset_directory=dspath,download=False)"
44 | ]
45 | },
46 | {
47 | "cell_type": "code",
48 | "execution_count": null,
49 | "metadata": {},
50 | "outputs": [],
51 | "source": [
52 | "W=TimeseriesWidget(input_extractor=IX,trange=[25000,30000])\n",
53 | "W.display()"
54 | ]
55 | },
56 | {
57 | "cell_type": "code",
58 | "execution_count": null,
59 | "metadata": {},
60 | "outputs": [],
61 | "source": []
62 | }
63 | ],
64 | "metadata": {
65 | "kernelspec": {
66 | "display_name": "Python 3",
67 | "language": "python",
68 | "name": "python3"
69 | },
70 | "language_info": {
71 | "codemirror_mode": {
72 | "name": "ipython",
73 | "version": 3
74 | },
75 | "file_extension": ".py",
76 | "mimetype": "text/x-python",
77 | "name": "python",
78 | "nbconvert_exporter": "python",
79 | "pygments_lexer": "ipython3",
80 | "version": "3.6.2"
81 | }
82 | },
83 | "nbformat": 4,
84 | "nbformat_minor": 2
85 | }
86 |
--------------------------------------------------------------------------------
/spikeforest/view_datasets.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "#######################################\n",
10 | "#imports\n",
11 | "#######################################\n",
12 | "\n",
13 | "%load_ext autoreload\n",
14 | "%autoreload 2\n",
15 | "\n",
16 | "import os, sys\n",
17 | "dir0 = os.path.split(os.getcwd())[0]\n",
18 | "if dir0 not in sys.path:\n",
19 | " sys.path.append(dir0)\n",
20 | "\n",
21 | "import spikeforest_datasets as sd\n",
22 | "import spikeforestwidgets as SFW"
23 | ]
24 | },
25 | {
26 | "cell_type": "code",
27 | "execution_count": null,
28 | "metadata": {},
29 | "outputs": [],
30 | "source": [
31 | "ds"
32 | ]
33 | },
34 | {
35 | "cell_type": "code",
36 | "execution_count": null,
37 | "metadata": {},
38 | "outputs": [],
39 | "source": [
40 | "# Load datasets\n",
41 | "datasets=sd.load_standard_datasets()"
42 | ]
43 | },
44 | {
45 | "cell_type": "code",
46 | "execution_count": null,
47 | "metadata": {},
48 | "outputs": [],
49 | "source": [
50 | "# Select a dataset to view\n",
51 | "SS=SFW.DatasetSelectWidget(datasets)\n",
52 | "SS.display()"
53 | ]
54 | },
55 | {
56 | "cell_type": "code",
57 | "execution_count": null,
58 | "metadata": {},
59 | "outputs": [],
60 | "source": [
61 | "ds=SS.selectedDataset()\n",
62 | "W=SFW.DatasetWidget(ds,'1,2,3,4,12')\n",
63 | "W.display()"
64 | ]
65 | },
66 | {
67 | "cell_type": "code",
68 | "execution_count": null,
69 | "metadata": {},
70 | "outputs": [],
71 | "source": []
72 | }
73 | ],
74 | "metadata": {
75 | "kernelspec": {
76 | "display_name": "Python 3",
77 | "language": "python",
78 | "name": "python3"
79 | },
80 | "language_info": {
81 | "codemirror_mode": {
82 | "name": "ipython",
83 | "version": 3
84 | },
85 | "file_extension": ".py",
86 | "mimetype": "text/x-python",
87 | "name": "python",
88 | "nbconvert_exporter": "python",
89 | "pygments_lexer": "ipython3",
90 | "version": "3.6.2"
91 | }
92 | },
93 | "nbformat": 4,
94 | "nbformat_minor": 2
95 | }
96 |
--------------------------------------------------------------------------------
/jupyter_examples/view_datasets/view_datasets.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "Assuming that everything is running properly, [a live version of this notebook is available on epoxyhub](http://epoxyhub.org/?source=https://github.com/flatironinstitute/mountainsort_examples&path=jupyter_examples/view_datasets/view_datasets.ipynb)."
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "## Use this notebook to browse and view a collection of standard datasets hosted on kbucket"
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": null,
20 | "metadata": {},
21 | "outputs": [],
22 | "source": [
23 | "#######################################\n",
24 | "#imports\n",
25 | "#######################################\n",
26 | "\n",
27 | "%load_ext autoreload\n",
28 | "%autoreload 2\n",
29 | "\n",
30 | "import os, sys\n",
31 | "dir0 = os.path.split(os.getcwd())[0]\n",
32 | "if dir0 not in sys.path:\n",
33 | " sys.path.append(dir0)\n",
34 | "\n",
35 | "from load_standard_datasets import load_standard_datasets\n",
36 | "import spikeforestwidgets as SFW"
37 | ]
38 | },
39 | {
40 | "cell_type": "code",
41 | "execution_count": null,
42 | "metadata": {},
43 | "outputs": [],
44 | "source": [
45 | "datasets=load_standard_datasets()\n",
46 | "datasets=datasets[1:]"
47 | ]
48 | },
49 | {
50 | "cell_type": "code",
51 | "execution_count": null,
52 | "metadata": {},
53 | "outputs": [],
54 | "source": [
55 | "# Select a dataset to view\n",
56 | "SS=SFW.DatasetSelectWidget(datasets)\n",
57 | "SS.display()"
58 | ]
59 | },
60 | {
61 | "cell_type": "code",
62 | "execution_count": null,
63 | "metadata": {},
64 | "outputs": [],
65 | "source": [
66 | "ds=SS.selectedDataset()\n",
67 | "W=SFW.DatasetWidget(ds,'')\n",
68 | "W.display()"
69 | ]
70 | },
71 | {
72 | "cell_type": "code",
73 | "execution_count": null,
74 | "metadata": {},
75 | "outputs": [],
76 | "source": []
77 | }
78 | ],
79 | "metadata": {
80 | "kernelspec": {
81 | "display_name": "Python 3",
82 | "language": "python",
83 | "name": "python3"
84 | },
85 | "language_info": {
86 | "codemirror_mode": {
87 | "name": "ipython",
88 | "version": 3
89 | },
90 | "file_extension": ".py",
91 | "mimetype": "text/x-python",
92 | "name": "python",
93 | "nbconvert_exporter": "python",
94 | "pygments_lexer": "ipython3",
95 | "version": "3.6.2"
96 | }
97 | },
98 | "nbformat": 4,
99 | "nbformat_minor": 2
100 | }
101 |
--------------------------------------------------------------------------------
/docs/mda_format.md:
--------------------------------------------------------------------------------
1 | # MDA file format
2 |
3 | ## Principles of the .mda format
4 |
5 | The .mda file format was created as a simple method for storing multi-dimensional arrays of numbers. Of course the simplest way would be to store the array as a raw binary file, but the problem with this is that fundamental information required to read the data is missing – specifically,
6 |
7 | * the data type (e.g., float32, int16, byte, complex float, etc).
8 | * the number of dimensions
9 | * the size of the dimensions (e.g., number of rows and columns in a matrix)
10 |
11 | How should this information be included? There are many strategies, but we choose to include these in a minimal binary header.
12 |
13 | In contrast to file formats that can hold multiple data entities, each .mda file is guaranteed to contain one and only one multi-dimensional array of byte, integer, or floating point numbers. The .mda file contains a small well-defined header containing only the minimal information required to read the array, namely the number and size of the dimensions as well as the data format of the entries. Immediately following the header, the data of the multi-dimensional array is stored in raw binary format.
14 |
15 | File format description
16 | -----------------------
17 |
18 | The .mda file format has evolved slightly over time (for example the first version only supported complex numbers), so please forgive the few arbitrary choices.
19 |
20 | The first four bytes contain a 32-bit signed integer, a negative number representing the data format:
21 |
22 | ```
23 | -1 is complex float32 (not supported in all i/o libraries)
24 | -2 is byte
25 | -3 is float32
26 | -4 is int16
27 | -5 is int32
28 | -6 is uint16
29 | -7 is double
30 | -8 is uint32
31 | ```
32 |
33 | The next four bytes contain a 32-bit signed integer representing the number of bytes in each entry (okay a bit redundant, I know).
34 |
35 | The next four bytes contain `num_dims`, a 32-bit signed integer representing the number of dimensions (`num_dims` should be between 1 and 50).
36 |
37 | Note: If `num_dims` is negative, it signifies that the size of the dimensions are stored as 64-bit integers, and the actual number of dimensions is `|num_dims|`.
38 |
39 | The next `4*|num_dims|` bytes (or `8*|num_dims|` bytes if `num_dims<0`) contain a list of signed 32-bit (or 64-bit) integers representing the size of each dimension.
40 |
41 | That's it! Next comes the raw data.
42 |
43 | Reading and writing .mda files
44 | ------------------------------
45 |
46 | The easiest way to read and write .mda files is by using the readmda and writemda* functions available in matlab or python, or by using the C++ classes for mda i/o.
47 |
48 | For example, in matlab you can do the following after setting up the appropriate paths:
49 |
50 | ```
51 | X=readmda('myfile.mda');
52 | writemda32(X,'newfile.mda');
53 | writemda16i(X,'newfile_16bit_integer.mda');
54 | ```
55 |
56 | The matlab i/o functions are [here](https://github.com/flatironinstitute/mountainlab-js/tree/master/utilities/matlab/mdaio).
57 |
58 | The python functions are available by importing mdaio from the mountainlab_pytools library (see the [MountainLab docs](https://github.com/flatironinstitute/mountainlab-js/tree/master/README.md)).
59 |
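60 | For example, in python (a brief sketch):
61 | 
62 | ```
63 | import numpy as np
64 | from mountainlab_pytools import mdaio
65 | 
66 | X=np.random.randn(4,1000).astype(np.float32)
67 | mdaio.writemda32(X,'myfile.mda')
68 | Y=mdaio.readmda('myfile.mda')
69 | ```
70 | 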
71 | Reading the .mda file header from the command-line
72 | --------------------------------------------------
73 | 
74 | You can get information about the datatype and dimensions of a .mda file using the `mda-info` commandline utility as follows:
75 | 
76 | ```
77 | mda-info myfile.mda
78 | ```
79 | 
80 | 
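81 | Putting the header description above together, a minimal pure-python reader might look like the following (a sketch for illustration, not the official library; it assumes little-endian byte order and first-dimension-fastest, i.e. column-major, element ordering):
82 | 
83 | ```
84 | import struct
85 | import numpy as np
86 | 
87 | # data format codes from the table above (complex float32, code -1, omitted)
88 | _DTYPES={-2:np.uint8,-3:np.float32,-4:np.int16,
89 |          -5:np.int32,-6:np.uint16,-7:np.float64,-8:np.uint32}
90 | 
91 | def read_mda(path):
92 |     with open(path,'rb') as f:
93 |         dtype_code,bytes_per_entry,num_dims=struct.unpack('<iii',f.read(12))
94 |         nd=abs(num_dims)
95 |         # negative num_dims signals 64-bit dimension sizes
96 |         if num_dims<0:
97 |             dims=struct.unpack('<%dq'%nd,f.read(8*nd))
98 |         else:
99 |             dims=struct.unpack('<%di'%nd,f.read(4*nd))
100 |         # the raw data immediately follows the header
101 |         data=np.fromfile(f,dtype=_DTYPES[dtype_code])
102 |     return data.reshape(dims,order='F')
103 | ```
104 | 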
--------------------------------------------------------------------------------
/docs/sharing_datasets.md:
--------------------------------------------------------------------------------
1 | ## Sharing datasets
2 |
3 | MountainLab makes it possible to share electrophysiology datasets by hosting them from your own computer, enabling you to take advantage of the web-based sorting capabilities. This is helpful for troubleshooting spike sorting issues, comparing methods, and collaborating on algorithm development.
4 |
5 | ### Step 1: prepare your datasets
6 |
7 | First you will need to organize your data into MountainSort-compatible datasets. Your directory structure should be as follows:
8 |
9 | ```
10 | study_directory/
11 |     dataset1/
12 |         raw.mda
13 |         geom.csv
14 |         params.json
15 |     dataset2/
16 |         ...
17 | ```
18 |
19 | Details on the contents of these files can be found [here](preparing_datasets.md).
20 |
21 | ### Step 2: Install KBucket
22 |
23 | [KBucket](https://github.com/flatironinstitute/kbucket/blob/master/README.md) is a system for sharing data to the internet from your computer, even when you are behind a firewall. Rather than uploading your data to a server, you host the data on your own machine. The easiest way to install kbucket is using conda, but [alternate installation methods](https://github.com/flatironinstitute/kbucket/blob/master/README.md) are also available. After entering a new conda environment, run the following:
24 |
25 | ```
26 | conda install -c flatiron -c conda-forge kbucket
27 | ```
28 |
29 | ### Step 3 (recommended): Start a tmux session
30 |
31 | Since you will be hosting (rather than uploading) your data, the kbucket process needs to remain running -- if you close the terminal the data will no longer be on the network. It is recommended that you install tmux and start a new session:
32 |
33 | ```
34 | tmux new -s kbucket1
35 | ```
36 |
37 | If you close the terminal or detach from this session, it keeps running in the background, and you can re-attach later using
38 |
39 | ```
40 | tmux a -t kbucket1
41 | ```
42 |
43 | For more information about tmux, do a google search.
44 |
45 |
46 | ### Step 4: Host the data on the KBucket network
47 |
48 | Finally, make your data available to the kbucket network by running the following within your tmux session:
49 |
50 | ```
51 | cd study_directory
52 | kbucket-host .
53 | ```
54 |
55 | The program will ask you several questions. You can accept the defaults, except:
56 |
57 | * You must type "yes" to agree to share the resources, and confirm that you are sharing them for scientific research purposes (e.g., no sharing music or videos that don't relate to science experiments!)
58 | * You should enter a description, your name, and your email
59 | * For the parent hub passcode, you will need to ask Jeremy to give you that information.
60 |
61 | If all goes well, hosting will succeed, indexing will begin, and your data will be available on the kbucket network!
62 |
63 | Make note of the node id reported by the program (also found in the `.kbucket/kbnode.json` file within the shared directory). This is a crucial 12-character ID that uniquely identifies your kbucket share. If you stop the hosting and restart, your 12-character ID will stay the same. Let's say your ID was `aaabbbcccddd`. Then you should be able to browse your data at [https://kbucketgui.herokuapp.com/?node_id=aaabbbcccddd](https://kbucketgui.herokuapp.com/?node_id=aaabbbcccddd).
64 |
65 | To stop hosting, just kill the process via `ctrl+c` within the terminal. If you are in a tmux session, then you can safely close the terminal and the hosting will continue, unless of course you turn off your computer!
66 |
67 | Now, you can refer to your dataset (for example `dataset1`) in processing pipelines via the kbucket url:
68 |
69 | ```
70 | kbucket://[your-id]/dataset1
71 | ```
72 |
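73 | For example, a processing step can read directly from the shared location (a brief sketch using the `mlp` pipeline helper from mountainlab_pytools; the node ID, output path, and parameter values are placeholders):
74 | 
75 | ```
76 | from mountainlab_pytools import mlproc as mlp
77 | 
78 | mlp.addProcess(
79 |     'ephys.bandpass_filter',
80 |     dict(timeseries='kbucket://aaabbbcccddd/dataset1/raw.mda'),
81 |     dict(timeseries_out='output/filt.mda.prv'),
82 |     dict(samplerate=30000,freq_min=300,freq_max=6000),
83 |     dict()
84 | )
85 | ```
86 | 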
87 | ### More information
88 | 
89 | For more information, visit the [kbucket page](https://github.com/flatironinstitute/kbucket).
--------------------------------------------------------------------------------
/spikeforest/single_sort.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "#######################################\n",
10 | "#imports\n",
11 | "#######################################\n",
12 | "\n",
13 | "%load_ext autoreload\n",
14 | "%autoreload 2\n",
15 | "def append_to_path(dir0):\n",
16 | " if dir0 not in sys.path:\n",
17 | " sys.path.append(dir0)\n",
18 | "\n",
19 | "import os, sys, json\n",
20 | "import numpy as np\n",
21 | "from matplotlib import pyplot as plt\n",
22 | "\n",
23 | "from mountainlab_pytools import mlproc as mlp\n",
24 | "from mountainlab_pytools import mdaio\n",
25 | "import spikeforestwidgets as SFW\n",
26 | "\n",
27 | "append_to_path(os.getcwd()+'/../python')\n",
28 | "from mountainsort4_1_0 import sort_dataset as ms4_sort_dataset\n",
29 | "from validate_sorting_results import validate_sorting_results\n",
30 | "\n",
31 | "import spikeforest_datasets as sd"
32 | ]
33 | },
34 | {
35 | "cell_type": "code",
36 | "execution_count": null,
37 | "metadata": {},
38 | "outputs": [],
39 | "source": [
40 | "datasets=sd.load_standard_datasets()"
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": null,
46 | "metadata": {},
47 | "outputs": [],
48 | "source": [
49 | "# Select a dataset\n",
50 | "SS=SFW.DatasetSelectWidget(datasets)\n",
51 | "SS.display()"
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": null,
57 | "metadata": {},
58 | "outputs": [],
59 | "source": [
60 | "# View the dataset\n",
61 | "ds=SS.selectedDataset()\n",
62 | "W=SFW.DatasetWidget(ds,visible_channels='')\n",
63 | "W.display()"
64 | ]
65 | },
66 | {
67 | "cell_type": "code",
68 | "execution_count": null,
69 | "metadata": {},
70 | "outputs": [],
71 | "source": [
72 | "# Select a processing resource\n",
73 | "SFW.LariLoginWidget().display()"
74 | ]
75 | },
76 | {
77 | "cell_type": "code",
78 | "execution_count": null,
79 | "metadata": {},
80 | "outputs": [],
81 | "source": [
82 | "# Create output directory\n",
83 | "\n",
84 | "output_dir=os.getcwd()+'/single_sort_output'\n",
85 | "if not os.path.exists(output_dir):\n",
86 | " os.mkdir(output_dir)"
87 | ]
88 | },
89 | {
90 | "cell_type": "code",
91 | "execution_count": null,
92 | "metadata": {},
93 | "outputs": [],
94 | "source": [
95 | "# Show the job status widget\n",
96 | "\n",
97 | "Pipeline=mlp.initPipeline()"
98 | ]
99 | },
100 | {
101 | "cell_type": "code",
102 | "execution_count": null,
103 | "metadata": {},
104 | "outputs": [],
105 | "source": [
106 | "# Run spike sorting and comparison with ground truth\n",
107 | "\n",
108 | "ds=SS.selectedDataset()\n",
109 | "dsdir=ds['raw_path']\n",
110 | "dsid=ds['id']\n",
111 | "with Pipeline:\n",
112 | " ms4_sort_dataset(dataset_dir=dsdir,output_dir=output_dir,adjacency_radius=50,detect_threshold=3)\n",
113 | " A=validate_sorting_results(dataset_dir=dsdir,sorting_output_dir=output_dir,output_dir=output_dir)\n",
114 | " amplitudes_true=A['amplitudes_true']\n",
115 | " accuracies=A['accuracies']"
116 | ]
117 | },
118 | {
119 | "cell_type": "code",
120 | "execution_count": null,
121 | "metadata": {},
122 | "outputs": [],
123 | "source": [
124 | "plt.plot(amplitudes_true,accuracies,'.')\n",
125 | "plt.xlabel('Amplitude')\n",
126 | "plt.ylabel('Accuracy');\n",
127 | "plt.title('Accuracy vs. amplitude for {}'.format(dsid))"
128 | ]
129 | },
130 | {
131 | "cell_type": "code",
132 | "execution_count": null,
133 | "metadata": {},
134 | "outputs": [],
135 | "source": []
136 | }
137 | ],
138 | "metadata": {
139 | "kernelspec": {
140 | "display_name": "Python 3",
141 | "language": "python",
142 | "name": "python3"
143 | },
144 | "language_info": {
145 | "codemirror_mode": {
146 | "name": "ipython",
147 | "version": 3
148 | },
149 | "file_extension": ".py",
150 | "mimetype": "text/x-python",
151 | "name": "python",
152 | "nbconvert_exporter": "python",
153 | "pygments_lexer": "ipython3",
154 | "version": "3.6.6"
155 | }
156 | },
157 | "nbformat": 4,
158 | "nbformat_minor": 2
159 | }
160 |
--------------------------------------------------------------------------------
/spikeforest/spikeforest.py:
--------------------------------------------------------------------------------
1 | import os
2 | from mountainlab_pytools import mdaio
3 | from mountainlab_pytools import mlproc as mlp
4 | import vdom
5 | import jp_proxy_widget
6 | import ipywidgets as widgets
7 | from jp_ephys_viz import ephys_viz_v1
8 |
9 | def add_run_to_pipeline(run, output_base_dir, verbose='minimal'):
10 | DS=run['dataset']
11 | ALG=run['alg']
12 | print(':::: Applying '+ALG['name']+' to '+DS['id'])
13 | dsdir=DS['raw_path']
14 | dsid=DS['id']
15 | algname=ALG['name']
16 | output_dir=output_base_dir+'/'+dsid+'--'+algname
17 | run['output_dir']=output_dir
18 | ALG['run'](
19 | dataset_dir=dsdir,
20 | output_dir=output_dir
21 | )
22 | summarize_sorting_results(
23 | dataset_dir=dsdir,
24 | sorting_output_dir=output_dir,
25 | output_dir=output_dir+'/summary',
26 | opts={'verbose':verbose}
27 | )
28 | ## TODO: Think of better term
29 | validate_sorting_results(
30 | dataset_dir=dsdir,
31 | sorting_output_dir=output_dir,
32 | output_dir=output_dir+'/validation',
33 | opts={'verbose':verbose}
34 | )
35 |
36 | def summarize_sorting_results(*,dataset_dir,sorting_output_dir,output_dir,opts):
37 | if not os.path.exists(output_dir):
38 | os.mkdir(output_dir)
39 | compute_templates(timeseries=dataset_dir+'/raw.mda',firings=sorting_output_dir+'/firings.mda',templates_out=output_dir+'/templates.mda')
40 |
41 | def compute_templates(*,timeseries,firings,templates_out,opts={}):
42 | return mlp.addProcess(
43 | 'ephys.compute_templates',
44 | {
45 | 'timeseries':timeseries,
46 | 'firings':firings
47 | },
48 | {
49 | 'templates_out':templates_out
50 | },
51 | {},
52 | opts
53 | )['outputs']['templates_out']
54 |
55 | def validate_sorting_results(*,dataset_dir,sorting_output_dir,output_dir,opts):
56 | if not os.path.exists(output_dir):
57 | os.mkdir(output_dir)
58 |
59 | compare_ground_truth(
60 | firings=sorting_output_dir+'/firings.mda',
61 | firings_true=dataset_dir+'/firings_true.mda',
62 | json_out=output_dir+'/compare_ground_truth.json'
63 | )
64 |
65 | def compare_ground_truth(*,firings,firings_true,json_out,opts={}):
66 | return mlp.addProcess(
67 | 'ephys.compare_ground_truth',
68 | dict(
69 | firings=firings,
70 | firings_true=firings_true
71 | ),
72 | dict(
73 | json_out=json_out
74 | ),
75 | dict(),
76 | opts
77 | )['outputs']['json_out']
78 |
79 | def get_run_output(run):
80 | out={}
81 | ds=run['dataset']
82 | alg=run['alg']
83 | out['dataset']=ds
84 | out['alg']={'name':alg['name']}
85 | out['output_dir']=run['output_dir']
86 | return out
87 |
88 | def ephys_viz_disable(*,params,title='View',external_link=False,height=450):
89 | if external_link:
90 | query=''
91 | for key in params:
92 | query=query+'{}={}&'.format(key,params[key])
93 | href='https://ephys-viz.herokuapp.com/?{}'.format(query)
94 | display(vdom.a(title,href=href,target='_blank'))
95 | else:
96 | if title:
97 | display(vdom.h3(title))
98 | W=jp_proxy_widget.JSProxyWidget()
99 | W.load_js_files(['ephys-viz/web/bundle.js'])
100 | W.load_js_files(['ephys-viz/node_modules/d3/dist/d3.min.js'])
101 | W.load_css('ephys-viz/node_modules/bootstrap/dist/css/bootstrap.min.css')
102 | W.load_css('ephys-viz/web/ml-layout.css')
103 | display(W)
104 | W.js_init('''
105 | element.empty()
106 | window.init_ephys_viz(params,element);
107 | element.css({height:height,overflow:'auto'})
108 | ''',params=params,height=height)
109 |
110 | class RunSelector:
111 | def __init__(self,sf_output):
112 | self._sf_output=sf_output
113 | self._W=None
114 | pass
115 | def display(self):
116 | options=[]
117 | output=self._sf_output
118 | for i in range(len(output['runs'])):
119 | run=output['runs'][i]
120 | options.append(run['dataset']['id']+' ::: '+run['alg']['name'])
121 | self._W=widgets.Select(options=options)
122 | display(self._W)
123 | def selectedRun(self):
124 | return self._sf_output['runs'][self._W.index]
125 |
126 | def view_dataset(dataset,external_link=False,height=450):
127 | dataset_id=dataset['id']
128 | raw_path=dataset['raw_path']
129 | ephys_viz_v1(params={'view':'dataset','dataset':raw_path},title='Dataset: {}'.format(dataset_id),external_link=external_link,height=height)
--------------------------------------------------------------------------------
/sandbox/wrap_spyking_circus/wrap_spyking_circus.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "# For development purposes, reload imported modules when source changes\n",
10 | "%load_ext autoreload\n",
11 | "%autoreload 2\n",
12 | "\n",
13 | "def append_to_path(dir0): # A convenience function\n",
14 | " if dir0 not in sys.path:\n",
15 | " sys.path.append(dir0)\n",
16 | "\n",
17 | "# standard imports\n",
18 | "import os, sys, json\n",
19 | "import numpy as np\n",
20 | "from matplotlib import pyplot as plt\n",
21 | "\n",
22 | "# mountainlab imports\n",
23 | "from mountainlab_pytools import mlproc as mlp\n",
24 | "from mountainlab_pytools import mdaio\n",
25 | "import spikeforestwidgets as SFW\n",
26 | "\n",
27 | "# imports from this repo\n",
28 | "append_to_path(os.getcwd()+'/../../python')\n",
29 | "from synthesize_dataset import synthesize_dataset # Synthesize a test dataset\n",
30 | "from validate_sorting_results import validate_sorting_results # Validation processors"
31 | ]
32 | },
33 | {
34 | "cell_type": "code",
35 | "execution_count": null,
36 | "metadata": {},
37 | "outputs": [],
38 | "source": [
39 | "#######################################\n",
40 | "# Initialize the pipeline object\n",
41 | "#######################################\n",
42 | "\n",
43 | "Pipeline=mlp.initPipeline()"
44 | ]
45 | },
46 | {
47 | "cell_type": "code",
48 | "execution_count": null,
49 | "metadata": {},
50 | "outputs": [],
51 | "source": [
52 | "# Make synthetic ephys data and create output directory\n",
53 | "dsdir=os.getcwd()+'/dataset'\n",
54 | "with Pipeline:\n",
55 | " synthesize_dataset(dsdir,M=4,duration=60,average_snr=8,K=5)\n",
56 | " \n",
57 | "output_base_dir=os.getcwd()+'/output'\n",
58 | "if not os.path.exists(output_base_dir):\n",
59 | " os.mkdir(output_base_dir)"
60 | ]
61 | },
62 | {
63 | "cell_type": "code",
64 | "execution_count": null,
65 | "metadata": {},
66 | "outputs": [],
67 | "source": [
68 | "Pipeline=mlp.initPipeline()"
69 | ]
70 | },
71 | {
72 | "cell_type": "code",
73 | "execution_count": null,
74 | "metadata": {},
75 | "outputs": [],
76 | "source": [
77 | "#dsdir='dataset'\n",
78 | "dsdir='kbucket://b5ecdf1474c5/datasets/neuron_paper/synth_K30'\n",
79 | "\n",
80 | "with Pipeline:\n",
81 | " mlp.addProcess(\n",
82 | " 'spyking_circus.sort',\n",
83 | " dict(\n",
84 | " timeseries=dsdir+'/raw.mda',\n",
85 | " geom=dsdir+'/geom.csv'\n",
86 | " ),\n",
87 | " dict(\n",
88 | " firings_out='output/firings.mda'\n",
89 | " ),\n",
90 | " dict(\n",
91 | " samplerate=30000,\n",
92 | " spike_thresh=4,\n",
93 | " detect_sign=1,\n",
94 | " adjacency_radius=30\n",
95 | " )\n",
96 | " )"
97 | ]
98 | },
99 | {
100 | "cell_type": "code",
101 | "execution_count": null,
102 | "metadata": {},
103 | "outputs": [],
104 | "source": [
105 | "Pipeline=mlp.initPipeline()"
106 | ]
107 | },
108 | {
109 | "cell_type": "code",
110 | "execution_count": null,
111 | "metadata": {},
112 | "outputs": [],
113 | "source": [
114 | "with Pipeline:\n",
115 | " A=validate_sorting_results(dataset_dir=dsdir,sorting_output_dir='output',output_dir='output')\n",
116 | " amplitudes_true=A['amplitudes_true']\n",
117 | " accuracies=A['accuracies']"
118 | ]
119 | },
120 | {
121 | "cell_type": "code",
122 | "execution_count": null,
123 | "metadata": {},
124 | "outputs": [],
125 | "source": [
126 | "# Plot the comparison with ground truth\n",
127 | "plt.plot(amplitudes_true,accuracies,'.')\n",
128 | "plt.xlabel('Amplitude')\n",
129 | "plt.ylabel('Accuracy');\n",
130 | "plt.title('Accuracy vs. amplitude for {}'.format('simulated data'))"
131 | ]
132 | },
133 | {
134 | "cell_type": "code",
135 | "execution_count": null,
136 | "metadata": {},
137 | "outputs": [],
138 | "source": []
139 | }
140 | ],
141 | "metadata": {
142 | "kernelspec": {
143 | "display_name": "Python 3",
144 | "language": "python",
145 | "name": "python3"
146 | },
147 | "language_info": {
148 | "codemirror_mode": {
149 | "name": "ipython",
150 | "version": 3
151 | },
152 | "file_extension": ".py",
153 | "mimetype": "text/x-python",
154 | "name": "python",
155 | "nbconvert_exporter": "python",
156 | "pygments_lexer": "ipython3",
157 | "version": "3.6.2"
158 | }
159 | },
160 | "nbformat": 4,
161 | "nbformat_minor": 2
162 | }
163 |
--------------------------------------------------------------------------------
/python/validate_sorting_results.py:
--------------------------------------------------------------------------------
1 | import os, json
2 | from mountainlab_pytools import mdaio
3 | from mountainlab_pytools import mlproc as mlp
4 | import numpy as np
5 |
6 | def validate_sorting_results(*,dataset_dir,sorting_output_dir,output_dir):
7 | if not os.path.exists(output_dir):
8 | os.mkdir(output_dir)
9 |
10 | compare_ground_truth(
11 | firings=sorting_output_dir+'/firings.mda',
12 | firings_true=dataset_dir+'/firings_true.mda',
13 | json_out=output_dir+'/compare_ground_truth.json',
14 | )
15 |
16 | compute_templates(
17 | timeseries=dataset_dir+'/raw.mda',
18 | firings=dataset_dir+'/firings_true.mda',
19 | templates_out=output_dir+'/templates_true.mda.prv'
20 | )
21 |
22 | mlp.runPipeline()
23 |
24 | templates_true=mdaio.readmda(mlp.realizeFile(output_dir+'/templates_true.mda'))
25 | amplitudes_true=np.max(np.max(np.abs(templates_true),axis=1),axis=0)
26 | accuracies=get_accuracies(output_dir+'/compare_ground_truth.json')
27 | return dict(
28 | accuracies=accuracies,
29 | amplitudes_true=amplitudes_true
30 | )
31 |
32 | def get_accuracies(fname):
33 | with open(fname,'r') as f:
34 | obj=json.load(f)
35 | true_units=obj['true_units']
36 | K=np.max([int(k) for k in true_units])
37 | units=[]
38 | for k in range(1,K+1):
39 | units.append(true_units[str(k)])
40 | accuracies=[unit['accuracy'] for unit in units]
41 | return accuracies
42 |
43 | def count_matching_events(times1,times2,delta=20):
44 | times_concat=np.concatenate((times1,times2))
45 | membership=np.concatenate((np.ones(times1.shape)*1,np.ones(times2.shape)*2))
46 | indices=times_concat.argsort()
47 | times_concat_sorted=times_concat[indices]
48 | membership_sorted=membership[indices]
49 | diffs=times_concat_sorted[1:]-times_concat_sorted[:-1]
50 | inds=np.where((diffs<=delta)&(membership_sorted[0:-1]!=membership_sorted[1:]))[0]
51 | if (len(inds)==0):
52 | return 0
53 | inds2=np.where(inds[:-1]+1!=inds[1:])[0]
54 | return len(inds2)+1
55 |
56 | def compare_ground_truth_helper(times1,labels1,times2,labels2):
57 | K1=int(np.max(labels1))
58 | K2=int(np.max(labels2))
59 | matching_event_counts=np.zeros((K1,K2))
60 | counts1=np.zeros(K1)
61 | for k1 in range(1,K1+1):
62 | times_k1=times1[np.where(labels1==k1)[0]]
63 | counts1[k1-1]=len(times_k1)
64 | counts2=np.zeros(K2)
65 | for k2 in range(1,K2+1):
66 | times_k2=times2[np.where(labels2==k2)[0]]
67 | counts2[k2-1]=len(times_k2)
68 | for k1 in range(1,K1+1):
69 | times_k1=times1[np.where(labels1==k1)[0]]
70 | for k2 in range(1,K2+1):
71 | times_k2=times2[np.where(labels2==k2)[0]]
72 | num_matching_events=count_matching_events(times_k1,times_k2)
73 | matching_event_counts[k1-1,k2-1]=num_matching_events
74 | pairwise_accuracies=np.zeros((K1,K2))
75 | for k1 in range(1,K1+1):
76 | for k2 in range(1,K2+1):
77 | if (counts1[k1-1]>0) or (counts2[k2-1]>0):
78 | matching_count=matching_event_counts[k1-1,k2-1]
79 | pairwise_accuracies[k1-1,k2-1]=matching_count/(counts1[k1-1]+counts2[k2-1]-matching_count)
80 | ret={
81 | "true_units":{}
82 | }
83 | for k1 in range(1,K1+1):
84 | k2_match=int(1+np.argmax(pairwise_accuracies[k1-1,:].ravel()))
85 | num_matches=matching_event_counts[k1-1,k2_match-1]
86 | num_false_positives=int(counts2[k2_match-1]-num_matches)
87 | num_false_negatives=int(counts1[k1-1]-num_matches)
88 | unit={
89 | "best_match":k2_match,
90 | "accuracy":pairwise_accuracies[k1-1,k2_match-1],
91 | "num_matches":num_matches,
92 | "num_false_positives":num_false_positives,
93 | "num_false_negatives":num_false_negatives
94 | }
95 | ret['true_units'][k1]=unit
96 | return ret
97 |
98 | def compare_ground_truth(*,firings,firings_true,json_out,opts={}):
99 | Ft=mdaio.readmda(mlp.realizeFile(firings_true))
100 | F=mdaio.readmda(mlp.realizeFile(firings))
101 | times1=Ft[1,:]
102 | labels1=Ft[2,:]
103 | times2=F[1,:]
104 | labels2=F[2,:]
105 | out=compare_ground_truth_helper(times1,labels1,times2,labels2)
106 | with open(json_out, 'w') as outfile:
107 | json.dump(out, outfile, indent=4)
108 |
109 |
110 | #def compare_ground_truth(*,firings,firings_true,json_out,opts={}):
111 | # return mlp.addProcess(
112 | # 'ephys.compare_ground_truth',
113 | # dict(
114 | # firings=firings,
115 | # firings_true=firings_true
116 | # ),
117 | # dict(
118 | # json_out=json_out
119 | # ),
120 | # dict(),
121 | # opts
122 | # )['outputs']['json_out']
123 |
124 | def compute_templates(*,timeseries,firings,templates_out=True,opts={}):
125 | return mlp.addProcess(
126 | 'ephys.compute_templates',
127 | dict(
128 | firings=firings,
129 | timeseries=timeseries
130 | ),
131 | dict(
132 | templates_out=templates_out
133 | ),
134 | dict(),
135 | opts
136 | )['outputs']['templates_out']
137 |
--------------------------------------------------------------------------------
/spikeforest/create_synth_datasets.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "#######################################\n",
10 | "# Auto-reload development imports\n",
11 | "#######################################\n",
12 | "\n",
13 | "%load_ext autoreload\n",
14 | "%autoreload 2"
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": null,
20 | "metadata": {},
21 | "outputs": [],
22 | "source": [
23 | "#######################################\n",
24 | "#imports\n",
25 | "#######################################\n",
26 | "\n",
27 | "import os, sys\n",
28 | "from mountainlab_pytools import mlproc as mlp\n",
29 | "from jp_ephys_viz import ephys_viz_v1"
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": null,
35 | "metadata": {},
36 | "outputs": [],
37 | "source": [
38 | "dsopts=[]\n",
39 | "dsopts.append(dict(\n",
40 | " id='synth_tetrode_30min',\n",
41 | " num_channels=4,\n",
42 | " duration=60*30,\n",
43 | " average_snr=10\n",
44 | "))\n",
45 | "dsopts.append(dict(\n",
46 | " id='synth_tetrode_120min',\n",
47 | " num_channels=4,\n",
48 | " duration=60*120,\n",
49 | " average_snr=10\n",
50 | "))\n",
51 | "dsopts.append(dict(\n",
52 | " id='synth_16ch_30min',\n",
53 | " num_channels=16,\n",
54 | " duration=60*30,\n",
55 | " average_snr=10\n",
56 | "))"
57 | ]
58 | },
59 | {
60 | "cell_type": "code",
61 | "execution_count": null,
62 | "metadata": {},
63 | "outputs": [],
64 | "source": [
65 | "#######################################\n",
66 | "# Create output directory\n",
67 | "#######################################\n",
68 | "\n",
69 | "datasets_dir=os.getcwd()+'/datasets'\n",
70 | "if not os.path.exists(datasets_dir):\n",
71 | " print('Creating directory: {}'.format(datasets_dir))\n",
72 | " os.mkdir(datasets_dir)\n",
73 | "else:\n",
74 | " print('Directory already exists: {}'.format(datasets_dir))"
75 | ]
76 | },
77 | {
78 | "cell_type": "code",
79 | "execution_count": null,
80 | "metadata": {},
81 | "outputs": [],
82 | "source": [
83 | "import json\n",
84 | "def synthesize_dataset(opts,output_directory):\n",
85 | " if not os.path.exists(output_directory):\n",
86 | " print('Creating directory: {}'.format(output_directory))\n",
87 | " os.mkdir(output_directory)\n",
88 | " else:\n",
89 | " print('Directory already exists: {}'.format(output_directory))\n",
90 | " mlp.addProcess(\n",
91 | " 'ephys.synthesize_random_firings',\n",
92 | " dict(),\n",
93 | " dict(\n",
94 | " firings_out=output_directory+'/firings_true.mda'\n",
95 | " ), \n",
96 | " dict(\n",
97 | " duration=opts['duration']\n",
98 | " ),\n",
99 | " dict()\n",
100 | " )\n",
101 | " mlp.addProcess(\n",
102 | " 'ephys.synthesize_random_waveforms',\n",
103 | " dict(),\n",
104 | " dict(\n",
105 | " waveforms_out=output_directory+'/waveforms_true.mda',\n",
106 | " geometry_out=output_directory+'/geom.csv'\n",
107 | " ), \n",
108 | " dict(\n",
109 | " M=opts['num_channels'],\n",
110 | " average_peak_amplitude=10*opts['average_snr']\n",
111 | " ),\n",
112 | " dict()\n",
113 | " )\n",
114 | " mlp.addProcess(\n",
115 | " 'ephys.synthesize_timeseries',\n",
116 | " dict(\n",
117 | " firings=output_directory+'/firings_true.mda',\n",
118 | " waveforms=output_directory+'/waveforms_true.mda'\n",
119 | " ),\n",
120 | " dict(\n",
121 | " timeseries_out=output_directory+'/raw.mda'\n",
122 | " ), \n",
123 | " dict(\n",
124 | " duration=opts['duration'],\n",
125 | " waveform_upsamplefac=13,\n",
126 | " noise_level=10\n",
127 | " ),\n",
128 | " dict()\n",
129 | " )\n",
130 | " params=dict(\n",
131 | " samplerate=30000,\n",
132 | " spike_sign=1\n",
133 | " )\n",
134 | " with open(output_directory+'/params.json','w') as f:\n",
135 | " json.dump(params,f)"
136 | ]
137 | },
138 | {
139 | "cell_type": "code",
140 | "execution_count": null,
141 | "metadata": {},
142 | "outputs": [],
143 | "source": [
144 | "P=mlp.initPipeline()"
145 | ]
146 | },
147 | {
148 | "cell_type": "code",
149 | "execution_count": null,
150 | "metadata": {},
151 | "outputs": [],
152 | "source": [
153 | "for dso in dsopts:\n",
154 | " with P:\n",
155 | " synthesize_dataset(dso,datasets_dir+'/'+dso['id'])"
156 | ]
157 | }
158 | ],
159 | "metadata": {
160 | "kernelspec": {
161 | "display_name": "Python 3",
162 | "language": "python",
163 | "name": "python3"
164 | },
165 | "language_info": {
166 | "codemirror_mode": {
167 | "name": "ipython",
168 | "version": 3
169 | },
170 | "file_extension": ".py",
171 | "mimetype": "text/x-python",
172 | "name": "python",
173 | "nbconvert_exporter": "python",
174 | "pygments_lexer": "ipython3",
175 | "version": "3.6.6"
176 | }
177 | },
178 | "nbformat": 4,
179 | "nbformat_minor": 2
180 | }
181 |
--------------------------------------------------------------------------------
/jupyter_examples/processor_tests/processor_tests.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "## Processor tests\n",
8 | "\n",
9 | "Here we run some basic tests. Make sure you have MountainLab installed on the computer running this jupyter lab."
10 | ]
11 | },
12 | {
13 | "cell_type": "code",
14 | "execution_count": null,
15 | "metadata": {},
16 | "outputs": [],
17 | "source": [
18 | "#######################################\n",
19 | "# imports and initialization\n",
20 | "#######################################\n",
21 | "\n",
22 | "# For development purposes, reload imported modules when source changes\n",
23 | "%load_ext autoreload\n",
24 | "%autoreload 2\n",
25 | "\n",
26 | "# standard imports\n",
27 | "import os, sys, json\n",
28 | "import numpy as np\n",
29 | "from matplotlib import pyplot as plt\n",
30 | "\n",
31 | "# mountainlab imports\n",
32 | "from mountainlab_pytools import mlproc as mlp\n",
33 | "from mountainlab_pytools import mdaio\n",
34 | "\n",
35 | "# Imports from this directory\n",
36 | "dir0 = os.getcwd()\n",
37 | "if dir0 not in sys.path:\n",
38 | " sys.path.append(dir0)\n",
39 | "from view_timeseries import view_timeseries"
40 | ]
41 | },
42 | {
43 | "cell_type": "code",
44 | "execution_count": null,
45 | "metadata": {},
46 | "outputs": [],
47 | "source": [
48 | "#######################################\n",
49 | "# Initialize the pipeline object\n",
50 | "#######################################\n",
51 | "Pipeline=mlp.initPipeline()"
52 | ]
53 | },
54 | {
55 | "cell_type": "code",
56 | "execution_count": null,
57 | "metadata": {},
58 | "outputs": [],
59 | "source": [
60 | "def test_mask_out_artifacts():\n",
61 | " \n",
62 | " # Create noisy array\n",
63 | " samplerate = int(48e3)\n",
64 | " duration = 30 # seconds\n",
65 | " n_samples = samplerate*duration\n",
66 | " noise_amplitude = 5\n",
67 | " noise = noise_amplitude*np.random.normal(0,1,n_samples)\n",
68 | " standard_dev = np.std(noise)\n",
69 | " \n",
70 | " # add three artefacts\n",
71 | " n_artifacts = 3\n",
72 | " artifacts = np.zeros_like(noise)\n",
73 | " artifact_duration = int(0.2*samplerate) # samples\n",
74 | " artifact_signal = np.zeros((n_artifacts, artifact_duration))\n",
75 | "\n",
76 | " for i in np.arange(n_artifacts): \n",
77 | " artifact_signal[i, :] = noise_amplitude*np.random.normal(0,6,artifact_duration)\n",
78 | "\n",
79 | " artifact_indices = np.tile(np.arange(artifact_duration), (3,1))\n",
80 | "\n",
81 | " artifact_shift = np.array([int(n_samples*0.10), int(n_samples*0.20), int(n_samples*0.70)])\n",
82 | "\n",
83 | " artifact_indices += artifact_shift.reshape((-1,1))\n",
84 | "\n",
85 | " for i, indices in enumerate(artifact_indices):\n",
86 | " artifacts[indices] = artifact_signal[i,:]\n",
87 | "\n",
88 | " signal = noise + artifacts\n",
89 | "\n",
90 | " timeseries = 'test_mask.mda'\n",
91 | " timeseries_out = 'masked.mda' \n",
92 | " \n",
93 | " # write as mda\n",
94 | " mdaio.writemda32(signal.reshape((1,-1)), timeseries)\n",
95 | " \n",
96 | " # run the mask artefacts\n",
97 | " mlp.addProcess(\n",
98 | " 'ephys.mask_out_artifacts',\n",
99 | " dict(\n",
100 | " timeseries=timeseries\n",
101 | " ),\n",
102 | " dict(\n",
103 | " timeseries_out=timeseries_out\n",
104 | " ),\n",
105 | " dict(\n",
106 | " chunk_size=2000,\n",
107 | " threshold=6,\n",
108 | " num_write_chunks=150,\n",
109 | " ),\n",
110 | " {}\n",
111 | " )\n",
112 | " mlp.runPipeline()\n",
113 | "\n",
114 | " \n",
115 | " # check that they are gone \n",
116 | " read_data = mdaio.readmda(timeseries).reshape((-1,1))\n",
117 | " masked_data = mdaio.readmda(timeseries_out).reshape((-1,1))\n",
118 | "\n",
119 | " indices_masked = sum(masked_data[artifact_indices,0].flatten() == 0)\n",
120 | " total_indices_to_mask = len(artifact_indices.flatten())\n",
121 | " masked = indices_masked == total_indices_to_mask\n",
122 | " \n",
123 | " os.remove(timeseries)\n",
124 | " os.remove(timeseries_out)\n",
125 | " \n",
126 | " view_timeseries(read_data.T, trange=[0,read_data.shape[0]])\n",
127 | " view_timeseries(masked_data.T, trange=[0,masked_data.shape[0]])\n",
128 | " \n",
129 | " if masked:\n",
130 | " print('Artifacts 100% masked')\n",
131 | " return True\n",
132 | " else:\n",
133 | " print('Artifacts %.2f%% masked' % (100*(indices_masked/total_indices_to_mask)))\n",
134 | " return False"
135 | ]
136 | },
137 | {
138 | "cell_type": "code",
139 | "execution_count": null,
140 | "metadata": {},
141 | "outputs": [],
142 | "source": [
143 | "with Pipeline:\n",
144 | " test_mask_out_artifacts()"
145 | ]
146 | },
147 | {
148 | "cell_type": "code",
149 | "execution_count": null,
150 | "metadata": {},
151 | "outputs": [],
152 | "source": []
153 | }
154 | ],
155 | "metadata": {
156 | "kernelspec": {
157 | "display_name": "Python 3",
158 | "language": "python",
159 | "name": "python3"
160 | },
161 | "language_info": {
162 | "codemirror_mode": {
163 | "name": "ipython",
164 | "version": 3
165 | },
166 | "file_extension": ".py",
167 | "mimetype": "text/x-python",
168 | "name": "python",
169 | "nbconvert_exporter": "python",
170 | "pygments_lexer": "ipython3",
171 | "version": "3.6.2"
172 | }
173 | },
174 | "nbformat": 4,
175 | "nbformat_minor": 2
176 | }
177 |
--------------------------------------------------------------------------------
/jupyter_examples/001_ms4_jupyter_example/ms4_jupyter_example.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# Spike sorting using MountainSort\n",
8 | "\n",
9 | "First you must install MountainLab and MountainSort according to the installation instructions in this repository. Be sure to also install the required JupyterLab extensions and spikeforestwidgets as described in those docs.\n",
10 | "\n",
11 | "This notebook will run processing on the local machine (or the machine running jupyterlab)."
12 | ]
13 | },
14 | {
15 | "cell_type": "markdown",
16 | "metadata": {},
17 | "source": [
18 | "## First import some python modules"
19 | ]
20 | },
21 | {
22 | "cell_type": "code",
23 | "execution_count": null,
24 | "metadata": {},
25 | "outputs": [],
26 | "source": [
27 | "#######################################\n",
28 | "# imports and initialization\n",
29 | "#######################################\n",
30 | "\n",
31 | "# For development purposes, reload imported modules when source changes\n",
32 | "%load_ext autoreload\n",
33 | "%autoreload 2\n",
34 | "\n",
35 | "def append_to_path(dir0): # A convenience function\n",
36 | " if dir0 not in sys.path:\n",
37 | " sys.path.append(dir0)\n",
38 | "\n",
39 | "# standard imports\n",
40 | "import os, sys, json\n",
41 | "import numpy as np\n",
42 | "from matplotlib import pyplot as plt\n",
43 | "\n",
44 | "# mountainlab imports\n",
45 | "from mountainlab_pytools import mlproc as mlp\n",
46 | "from mountainlab_pytools import mdaio\n",
47 | "import spikeforestwidgets as SFW\n",
48 | "\n",
49 | "# imports from this repo\n",
50 | "append_to_path(os.getcwd()+'/../../python')\n",
51 | "from mountainsort4_1_0 import sort_dataset as ms4_sort_dataset # MountainSort spike sorting\n",
52 | "from validate_sorting_results import validate_sorting_results # Validation processors\n",
53 | "from synthesize_dataset import synthesize_dataset # Synthesize a test dataset"
54 | ]
55 | },
56 | {
57 | "cell_type": "markdown",
58 | "metadata": {},
59 | "source": [
60 | "## Initialize the pipeline object and job monitor widget"
61 | ]
62 | },
63 | {
64 | "cell_type": "code",
65 | "execution_count": null,
66 | "metadata": {},
67 | "outputs": [],
68 | "source": [
69 | "#######################################\n",
70 | "# Initialize the pipeline object\n",
71 | "#######################################\n",
72 | "\n",
73 | "Pipeline=mlp.initPipeline()"
74 | ]
75 | },
76 | {
77 | "cell_type": "markdown",
78 | "metadata": {},
79 | "source": [
80 | "## Create the synthetic dataset\n",
81 | "\n",
82 | "This will go into a new directory called `dataset/`"
83 | ]
84 | },
85 | {
86 | "cell_type": "code",
87 | "execution_count": null,
88 | "metadata": {},
89 | "outputs": [],
90 | "source": [
91 | "# Make synthetic ephys data and create output directory\n",
92 | "dsdir=os.getcwd()+'/dataset'\n",
93 | "with Pipeline:\n",
94 | " synthesize_dataset(dsdir,M=4,duration=600,average_snr=8)"
95 | ]
96 | },
97 | {
98 | "cell_type": "code",
99 | "execution_count": null,
100 | "metadata": {},
101 | "outputs": [],
102 | "source": [
103 | "dsdir=os.getcwd()+'/dataset'\n",
104 | "output_base_dir=os.getcwd()+'/output2'\n",
105 | "if not os.path.exists(output_base_dir):\n",
106 | " os.mkdir(output_base_dir)"
107 | ]
108 | },
109 | {
110 | "cell_type": "code",
111 | "execution_count": null,
112 | "metadata": {},
113 | "outputs": [],
114 | "source": [
115 | "## Note that the following does not work yet when using the local computer for computation\n",
116 | "## because I have not yet exposed the file system to the javascript widget\n",
117 | "\n",
118 | "#SFW.viewDataset(directory=dsdir)"
119 | ]
120 | },
121 | {
122 | "cell_type": "markdown",
123 | "metadata": {},
124 | "source": [
125 | "## Run the spike sorting and comparison with ground truth\n",
126 | "\n",
127 | "The output will go into a new directory called `output/`"
128 | ]
129 | },
130 | {
131 | "cell_type": "code",
132 | "execution_count": null,
133 | "metadata": {},
134 | "outputs": [],
135 | "source": [
136 | "#######################################\n",
137 | "# RUN THE PIPELINE\n",
138 | "#######################################\n",
139 | "#from ironclust_sort import sort_dataset as ironclust_sort_dataset\n",
140 | "\n",
141 | "output_dir=output_base_dir+'/ms4'\n",
142 | "with Pipeline:\n",
143 | " #ironclust_sort_dataset(dataset_dir=dsdir,output_dir=output_dir,adjacency_radius=-1,detect_threshold=3)\n",
144 | " ms4_sort_dataset(dataset_dir=dsdir,output_dir=output_dir,adjacency_radius=-1,detect_threshold=3)\n",
145 | " A=validate_sorting_results(dataset_dir=dsdir,sorting_output_dir=output_dir,output_dir=output_dir)\n",
146 | " amplitudes_true=A['amplitudes_true']\n",
147 | " accuracies=A['accuracies']"
148 | ]
149 | },
150 | {
151 | "cell_type": "markdown",
152 | "metadata": {},
153 | "source": [
154 | "## Plot the comparison with ground truth"
155 | ]
156 | },
157 | {
158 | "cell_type": "code",
159 | "execution_count": null,
160 | "metadata": {},
161 | "outputs": [],
162 | "source": [
163 | "# Plot the comparison with ground truth\n",
164 | "plt.plot(amplitudes_true,accuracies,'.')\n",
165 | "plt.xlabel('Amplitude')\n",
166 | "plt.ylabel('Accuracy');\n",
167 | "plt.title('Accuracy vs. amplitude for {}'.format('simulated data'))"
168 | ]
169 | }
170 | ],
171 | "metadata": {
172 | "kernelspec": {
173 | "display_name": "Python 3",
174 | "language": "python",
175 | "name": "python3"
176 | },
177 | "language_info": {
178 | "codemirror_mode": {
179 | "name": "ipython",
180 | "version": 3
181 | },
182 | "file_extension": ".py",
183 | "mimetype": "text/x-python",
184 | "name": "python",
185 | "nbconvert_exporter": "python",
186 | "pygments_lexer": "ipython3",
187 | "version": "3.6.2"
188 | }
189 | },
190 | "nbformat": 4,
191 | "nbformat_minor": 2
192 | }
193 |
--------------------------------------------------------------------------------
/python/mountainsort4_1_0.py:
--------------------------------------------------------------------------------
1 | from mountainlab_pytools import mdaio
2 | from mountainlab_pytools import mlproc as mlp
3 | import os
4 | import json
5 |
6 | def sort_dataset(*,dataset_dir,output_dir,freq_min=300,freq_max=6000,adjacency_radius,detect_threshold,opts={}):
7 | if not os.path.exists(output_dir):
8 | os.mkdir(output_dir)
9 |
10 | # Dataset parameters
11 | ds_params=read_dataset_params(dataset_dir)
12 |
13 | # Bandpass filter
14 | bandpass_filter(
15 | timeseries=dataset_dir+'/raw.mda',
16 | timeseries_out=output_dir+'/filt.mda.prv',
17 | samplerate=ds_params['samplerate'],
18 | freq_min=freq_min,
19 | freq_max=freq_max,
20 | opts=opts
21 | )
22 |
23 | # Whiten
24 | whiten(
25 | timeseries=output_dir+'/filt.mda.prv',
26 | timeseries_out=output_dir+'/pre.mda.prv',
27 | opts=opts
28 | )
29 |
30 | # Sort
31 | detect_sign=1
32 | if 'spike_sign' in ds_params:
33 | detect_sign=ds_params['spike_sign']
34 | if 'detect_sign' in ds_params:
35 | detect_sign=ds_params['detect_sign']
36 | ms4alg_sort(
37 | timeseries=output_dir+'/pre.mda.prv',
38 | geom=dataset_dir+'/geom.csv',
39 | firings_out=output_dir+'/firings_uncurated.mda',
40 | adjacency_radius=adjacency_radius,
41 | detect_sign=detect_sign,
42 | detect_threshold=detect_threshold,
43 | opts=opts
44 | )
45 |
46 | # Compute cluster metrics
47 | compute_cluster_metrics(
48 | timeseries=output_dir+'/pre.mda.prv',
49 | firings=output_dir+'/firings_uncurated.mda',
50 | metrics_out=output_dir+'/cluster_metrics.json',
51 | samplerate=ds_params['samplerate'],
52 | opts=opts
53 | )
54 |
55 | # Automated curation
56 | automated_curation(
57 | firings=output_dir+'/firings_uncurated.mda',
58 | cluster_metrics=output_dir+'/cluster_metrics.json',
59 | firings_out=output_dir+'/firings.mda',
60 | opts=opts
61 | )
62 |
63 | def read_dataset_params(dsdir):
64 | params_fname=mlp.realizeFile(dsdir+'/params.json')
65 | if not os.path.exists(params_fname):
66 | raise Exception('Dataset parameter file does not exist: '+params_fname)
67 | with open(params_fname) as f:
68 | return json.load(f)
69 |
70 | def bandpass_filter(*,timeseries,timeseries_out,samplerate,freq_min,freq_max,opts={}):
71 | return mlp.addProcess(
72 | 'ephys.bandpass_filter',
73 | {
74 | 'timeseries':timeseries
75 | },{
76 | 'timeseries_out':timeseries_out
77 | },
78 | {
79 | 'samplerate':samplerate,
80 | 'freq_min':freq_min,
81 | 'freq_max':freq_max
82 | },
83 | opts
84 | )
85 |
86 | def whiten(*,timeseries,timeseries_out,opts={}):
87 | return mlp.addProcess(
88 | 'ephys.whiten',
89 | {
90 | 'timeseries':timeseries
91 | },
92 | {
93 | 'timeseries_out':timeseries_out
94 | },
95 | {},
96 | opts
97 | )
98 |
99 | def ms4alg_sort(*,timeseries,geom,firings_out,detect_sign,adjacency_radius,detect_threshold=3,opts={}):
100 | pp={}
101 | pp['detect_sign']=detect_sign
102 | pp['adjacency_radius']=adjacency_radius
103 | pp['detect_threshold']=detect_threshold
104 | mlp.addProcess(
105 | 'ms4alg.sort',
106 | {
107 | 'timeseries':timeseries,
108 | 'geom':geom
109 | },
110 | {
111 | 'firings_out':firings_out
112 | },
113 | pp,
114 | opts
115 | )
116 |
117 | def compute_cluster_metrics(*,timeseries,firings,metrics_out,samplerate,opts={}):
118 | metrics1=mlp.addProcess(
119 | 'ms3.cluster_metrics',
120 | {
121 | 'timeseries':timeseries,
122 | 'firings':firings
123 | },
124 | {
125 | 'cluster_metrics_out':True
126 | },
127 | {
128 | 'samplerate':samplerate
129 | },
130 | opts
131 | )['outputs']['cluster_metrics_out']
132 | metrics2=mlp.addProcess(
133 | 'ms3.isolation_metrics',
134 | {
135 | 'timeseries':timeseries,
136 | 'firings':firings
137 | },
138 | {
139 | 'metrics_out':True
140 | },
141 | {
142 | 'compute_bursting_parents':'true'
143 | },
144 | opts
145 | )['outputs']['metrics_out']
146 | return mlp.addProcess(
147 | 'ms3.combine_cluster_metrics',
148 | {
149 | 'metrics_list':[metrics1,metrics2]
150 | },
151 | {
152 | 'metrics_out':metrics_out
153 | },
154 | {},
155 | opts
156 | )
157 |
158 | def automated_curation(*,firings,cluster_metrics,firings_out,opts={}):
159 | # Automated curation
160 | label_map=mlp.addProcess(
161 | 'ms4alg.create_label_map',
162 | {
163 | 'metrics':cluster_metrics
164 | },
165 | {
166 | 'label_map_out':True
167 | },
168 | {},
169 | opts
170 | )['outputs']['label_map_out']
171 | return mlp.addProcess(
172 | 'ms4alg.apply_label_map',
173 | {
174 | 'label_map':label_map,
175 | 'firings':firings
176 | },
177 | {
178 | 'firings_out':firings_out
179 | },
180 | {},
181 | opts
182 | )
183 |
184 | def synthesize_sample_dataset(*,dataset_dir,samplerate=30000,duration=600,num_channels=4,opts={}):
185 | if not os.path.exists(dataset_dir):
186 | os.mkdir(dataset_dir)
187 | M=num_channels
188 | mlp.addProcess(
189 | 'ephys.synthesize_random_waveforms',
190 | {},
191 | {
192 | 'geometry_out':dataset_dir+'/geom.csv',
193 | 'waveforms_out':dataset_dir+'/waveforms_true.mda'
194 | },
195 | {
196 | 'upsamplefac':13,
197 | 'M':M,
198 | 'average_peak_amplitude':100
199 | },
200 | opts
201 | )
202 | mlp.addProcess(
203 | 'ephys.synthesize_random_firings',
204 | {},
205 | {
206 | 'firings_out':dataset_dir+'/firings_true.mda'
207 | },
208 | {
209 | 'duration':duration
210 | },
211 | opts
212 | )
213 | mlp.addProcess(
214 | 'ephys.synthesize_timeseries',
215 | {
216 | 'firings':dataset_dir+'/firings_true.mda',
217 | 'waveforms':dataset_dir+'/waveforms_true.mda'
218 | },
219 | {
220 | 'timeseries_out':dataset_dir+'/raw.mda.prv'
221 | },{
222 | 'duration':duration,
223 | 'waveform_upsamplefac':13,
224 | 'noise_level':10
225 | },
226 | opts
227 | )
228 | params={
229 | 'samplerate':samplerate,
230 | 'spike_sign':1
231 | }
232 | with open(dataset_dir+'/params.json', 'w') as outfile:
233 | json.dump(params, outfile, indent=4)
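234 |
235 | # Example usage (a minimal sketch; assumes this module is importable and that
236 | # dataset_dir contains raw.mda, geom.csv, and params.json, e.g. as produced by
237 | # synthesize_sample_dataset above):
238 | #
239 | # from mountainlab_pytools import mlproc as mlp
240 | # from mountainsort4_1_0 import sort_dataset
241 | #
242 | # Pipeline = mlp.initPipeline()
243 | # with Pipeline:
244 | #     sort_dataset(dataset_dir='dataset', output_dir='output',
245 | #                  adjacency_radius=-1, detect_threshold=3)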
--------------------------------------------------------------------------------
/spikeforest/prepare_neto.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "code",
5 | "execution_count": null,
6 | "metadata": {},
7 | "outputs": [],
8 | "source": [
9 | "#######################################\n",
10 | "# Auto-reload development imports\n",
11 | "#######################################\n",
12 | "\n",
13 | "%load_ext autoreload\n",
14 | "%autoreload 2"
15 | ]
16 | },
17 | {
18 | "cell_type": "code",
19 | "execution_count": null,
20 | "metadata": {},
21 | "outputs": [],
22 | "source": [
23 | "#######################################\n",
24 | "#imports\n",
25 | "#######################################\n",
26 | "\n",
27 | "import os, sys\n",
28 | "from mountainlab_pytools import mlproc as mlp\n",
29 | "from jp_ephys_viz import ephys_viz_v1"
30 | ]
31 | },
32 | {
33 | "cell_type": "code",
34 | "execution_count": null,
35 | "metadata": {},
36 | "outputs": [],
37 | "source": [
38 | "dirname=os.getcwd()+'/download/NETO/2014_11_25_Pair_3_0'\n",
39 | "bin_fname='amplifier2014-11-25T23_00_08.bin'\n",
40 | "output_dir=os.getcwd()+'/datasets/neto_32ch_1'"
41 | ]
42 | },
43 | {
44 | "cell_type": "code",
45 | "execution_count": null,
46 | "metadata": {},
47 | "outputs": [],
48 | "source": [
49 | "#######################################\n",
50 | "# Create output directory\n",
51 | "#######################################\n",
52 | "\n",
53 | "datasets_dir=os.getcwd()+'/datasets'\n",
54 | "if not os.path.exists(output_dir):\n",
55 | " print('Creating directory: {}'.format(output_dir))\n",
56 | " os.mkdir(output_dir)\n",
57 | "else:\n",
58 | " print('Directory already exists: {}'.format(output_dir))"
59 | ]
60 | },
61 | {
62 | "cell_type": "code",
63 | "execution_count": null,
64 | "metadata": {},
65 | "outputs": [],
66 | "source": [
67 | "# ephys.convert_array\n",
68 | "# Convert a multi-dimensional array between various formats ('.mda', '.npy', '.dat') based on the file extensions of the input/output files\n",
69 | "# \n",
70 | "# INPUTS\n",
71 | "# input -- Path of input array file (can be repeated for concatenation).\n",
72 | "# \n",
73 | "# OUTPUTS\n",
74 | "# output -- Path of the output array file.\n",
75 | "# \n",
76 | "# PARAMETERS\n",
77 | "# format -- (optional) The format for the input array (mda, npy, dat), or determined from the file extension if empty\n",
78 | "# format_out -- (optional) The format for the output input array (mda, npy, dat), or determined from the file extension if empty\n",
79 | "# dimensions -- (optional) Comma-separated list of dimensions (shape). If empty, it is auto-determined, if possible, by the input array. If second dim is -1 then it will be extrapolated from file size / first dim.\n",
80 | "# dtype -- (optional) The data format for the input array. Choices: int8, int16, int32, uint16, uint32, float32, float64 (possibly float16 in the future).\n",
81 | "# dtype_out -- (optional) The data format for the output array. If empty, the dtype for the input array is used.\n",
82 | "# channels -- (optional) Comma-seperated list of channels to keep in output. Zero-based indexing. Only works for .dat to .mda conversions.\n",
83 | "\n",
84 | "def convert_array(input_fname,output_fname,*,num_channels,dtype):\n",
85 | " mlp.addProcess(\n",
86 | " 'ephys.convert_array',\n",
87 | " dict(\n",
88 | " input=input_fname\n",
89 | " ),\n",
90 | " dict(\n",
91 | " output=output_fname\n",
92 | " ),\n",
93 | " dict(\n",
94 | " dimensions='{},-1'.format(num_channels),\n",
95 | " dtype=dtype\n",
96 | " ),\n",
97 | " dict(\n",
98 | " )\n",
99 | " )"
100 | ]
101 | },
102 | {
103 | "cell_type": "code",
104 | "execution_count": null,
105 | "metadata": {},
106 | "outputs": [],
107 | "source": [
108 | "def bandpass_filter(input_fname,output_fname,*,freq_min,freq_max,samplerate):\n",
109 | " mlp.addProcess(\n",
110 | " 'ephys.bandpass_filter',\n",
111 | " dict(\n",
112 | " timeseries=input_fname\n",
113 | " ),\n",
114 | " dict(\n",
115 | " timeseries_out=output_fname\n",
116 | " ),\n",
117 | " dict(\n",
118 | " freq_min=freq_min,\n",
119 | " freq_max=freq_max,\n",
120 | " samplerate=samplerate\n",
121 | " ),\n",
122 | " dict(\n",
123 | " )\n",
124 | " )"
125 | ]
126 | },
127 | {
128 | "cell_type": "code",
129 | "execution_count": null,
130 | "metadata": {},
131 | "outputs": [],
132 | "source": [
133 | "P=mlp.initPipeline()"
134 | ]
135 | },
136 | {
137 | "cell_type": "code",
138 | "execution_count": null,
139 | "metadata": {},
140 | "outputs": [],
141 | "source": [
142 | "with P:\n",
143 | " raw_fname=output_dir+'/raw0.mda'\n",
144 | " convert_array(dirname+'/'+bin_fname,raw_fname,num_channels=32,dtype='uint16')\n",
145 | " bandpass_filter(raw_fname,output_dir+'/raw.mda',freq_min=300,freq_max=6000,samplerate=30000)\n",
146 | " import json\n",
147 | " params=dict(\n",
148 | " samplerate=30000,\n",
149 | " spike_sign=-1\n",
150 | " )\n",
151 | " with open(output_dir+'/params.json','w') as f:\n",
152 | " json.dump(params,f)"
153 | ]
154 | },
155 | {
156 | "cell_type": "code",
157 | "execution_count": null,
158 | "metadata": {},
159 | "outputs": [],
160 | "source": [
161 | "def view_dataset(dsdir,external_link=False,height=450,dataset_id='',firings=''):\n",
162 | " params={'view':'dataset','dataset':dsdir}\n",
163 | " if firings:\n",
164 | " params['firings']=mlp.kbucketPath(firings)\n",
165 | " ephys_viz_v1(params=params,title='Dataset: {}'.format(dataset_id),external_link=external_link,height=height)"
166 | ]
167 | },
168 | {
169 | "cell_type": "code",
170 | "execution_count": null,
171 | "metadata": {},
172 | "outputs": [],
173 | "source": [
174 | "dsdir='kbucket://d97debc4bea2/spikeforest/datasets/neto_32ch_1'\n",
175 | "view_dataset(dsdir,firings=dsdir+'/firings_true.mda',external_link=False)"
176 | ]
177 | },
178 | {
179 | "cell_type": "code",
180 | "execution_count": null,
181 | "metadata": {},
182 | "outputs": [],
183 | "source": []
184 | }
185 | ],
186 | "metadata": {
187 | "kernelspec": {
188 | "display_name": "Python 3",
189 | "language": "python",
190 | "name": "python3"
191 | },
192 | "language_info": {
193 | "codemirror_mode": {
194 | "name": "ipython",
195 | "version": 3
196 | },
197 | "file_extension": ".py",
198 | "mimetype": "text/x-python",
199 | "name": "python",
200 | "nbconvert_exporter": "python",
201 | "pygments_lexer": "ipython3",
202 | "version": "3.6.6"
203 | }
204 | },
205 | "nbformat": 4,
206 | "nbformat_minor": 2
207 | }
208 |
--------------------------------------------------------------------------------
/jupyter_examples/example1/example1.ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "Assuming that everything is running properly, [a live version of this notebook is available on epoxyhub](http://epoxyhub.org/?source=https://github.com/flatironinstitute/mountainsort_examples&path=jupyter_examples/example1/example1.ipynb)."
8 | ]
9 | },
10 | {
11 | "cell_type": "markdown",
12 | "metadata": {},
13 | "source": [
14 | "## Basic spike sorting with MountainSort and Singularity\n",
15 | "\n",
16 | "This example shows how to use MountainSort within JupyterLab using code stored in Singularity containers\n",
17 | "\n",
18 | "This notebook accomplishes the following\n",
19 | "\n",
20 | "1. Specify which version of the processing to run by pointing to Singularity containers stored on Singularity Hub (Singularity container registry)\n",
21 | "1. User selects a (remote or local) processing resource (if local, Singularity must be installed on the computer running jupyterlab)\n",
22 | "1. Create a synthetic dataset\n",
23 | "1. Run the spike sorting\n",
24 | "1. Compare with ground truth\n",
25 | "1. Provide a simple output showing the results of the ground-truth comparison\n",
26 | "\n",
27 | "## Prerequisites\n",
28 | "\n",
29 | "1. Conda packages (see environment.yml) including\n",
30 | " - mountainlab\n",
31 | " - mountainlab_pytools\n",
32 | " - spikeforestwidgets\n",
33 | " - matplotlib\n",
34 | "1. *Singularity -- only needed if running on the local machine, i.e., the maching running jupyterlab*\n",
35 | " - note that Singularity should not be installed using conda. Instead, install it on your system using admin priviliges."
36 | ]
37 | },
38 | {
39 | "cell_type": "markdown",
40 | "metadata": {},
41 | "source": [
42 | "## First import the required python modules"
43 | ]
44 | },
45 | {
46 | "cell_type": "code",
47 | "execution_count": null,
48 | "metadata": {},
49 | "outputs": [],
50 | "source": [
51 | "#######################################\n",
52 | "# imports and initialization\n",
53 | "#######################################\n",
54 | "\n",
55 | "# For development purposes, reload imported modules when source changes\n",
56 | "%load_ext autoreload\n",
57 | "%autoreload 2\n",
58 | "\n",
59 | "def append_to_path(dir0): # A convenience function\n",
60 | " if dir0 not in sys.path:\n",
61 | " sys.path.append(dir0)\n",
62 | "\n",
63 | "# standard imports\n",
64 | "import os, sys, json\n",
65 | "import numpy as np\n",
66 | "from matplotlib import pyplot as plt\n",
67 | "\n",
68 | "# mountainlab imports\n",
69 | "from mountainlab_pytools import mlproc as mlp\n",
70 | "from mountainlab_pytools import mdaio\n",
71 | "import spikeforestwidgets as SFW\n",
72 | "\n",
73 | "# imports from this repo\n",
74 | "append_to_path(os.getcwd()+'/../../python')\n",
75 | "from mountainsort4_1_0 import sort_dataset as ms4_sort_dataset # MountainSort spike sorting\n",
76 | "from validate_sorting_results import validate_sorting_results # Validation processors\n",
77 | "from default_lari_servers import default_lari_servers # Choices for processing servers\n",
78 | "from synthesize_dataset import synthesize_dataset # Synthesize a test dataset"
79 | ]
80 | },
81 | {
82 | "cell_type": "markdown",
83 | "metadata": {},
84 | "source": [
85 | "## Specify the Singularity containers (on Singularity Hub) containing the required MountainLab processors"
86 | ]
87 | },
88 | {
89 | "cell_type": "code",
90 | "execution_count": null,
91 | "metadata": {},
92 | "outputs": [],
93 | "source": [
94 | "# Define which Singularity containers we will use for the processing\n",
95 | "# The name of the processor determines which container is used\n",
96 | "mlp.setContainerRules([])\n",
97 | "mlp.addContainerRule(pattern='ephys.*',container='shub://magland/ml_ephys:v0.2.5')\n",
98 | "mlp.addContainerRule(pattern='ms4alg.*',container='shub://magland/ml_ms4alg:v0.1.4')\n",
99 | "mlp.addContainerRule(pattern='pyms.*',container='shub://magland/ml_pyms:v0.0.1')\n",
100 | "mlp.addContainerRule(pattern='ms3.*',container='shub://magland/ml_ms3:v0.0.2')"
101 | ]
102 | },
103 | {
104 | "cell_type": "markdown",
105 | "metadata": {},
106 | "source": [
107 | "## The user selects the processing resource\n",
108 | "\n",
109 | "If this is going to the be local computer (i.e., the computer running JupyterLab), then you must have singularity installed"
110 | ]
111 | },
112 | {
113 | "cell_type": "code",
114 | "execution_count": null,
115 | "metadata": {},
116 | "outputs": [],
117 | "source": [
118 | "#######################################\n",
119 | "# LARI login and initialize the pipeline object\n",
120 | "#######################################\n",
121 | "\n",
122 | "SFW.LariLoginWidget(default_lari_servers()).display()\n",
123 | "Pipeline=mlp.initPipeline()"
124 | ]
125 | },
126 | {
127 | "cell_type": "markdown",
128 | "metadata": {},
129 | "source": [
130 | "## Create the synthetic dataset\n",
131 | "\n",
132 | "This will go into a new directory called `dataset/`"
133 | ]
134 | },
135 | {
136 | "cell_type": "code",
137 | "execution_count": null,
138 | "metadata": {},
139 | "outputs": [],
140 | "source": [
141 | "# Make synthetic ephys data and create output directory\n",
142 | "dsdir=os.getcwd()+'/dataset'\n",
143 | "with Pipeline:\n",
144 | " synthesize_dataset(dsdir,M=4,duration=600,average_snr=8)\n",
145 | " \n",
146 | "output_base_dir=os.getcwd()+'/output'\n",
147 | "if not os.path.exists(output_base_dir):\n",
148 | " os.mkdir(output_base_dir)"
149 | ]
150 | },
151 | {
152 | "cell_type": "code",
153 | "execution_count": null,
154 | "metadata": {},
155 | "outputs": [],
156 | "source": [
157 | "## Note that the following does not work yet when using the local computer for computation\n",
158 | "## because I have not yet exposed the file system to the javascript widget\n",
159 | "\n",
160 | "#SFW.viewDataset(directory=dsdir)"
161 | ]
162 | },
163 | {
164 | "cell_type": "markdown",
165 | "metadata": {},
166 | "source": [
167 | "## Run the spike sorting and comparison with ground truth\n",
168 | "\n",
169 | "The output will go into a new directory called `output/`"
170 | ]
171 | },
172 | {
173 | "cell_type": "code",
174 | "execution_count": null,
175 | "metadata": {},
176 | "outputs": [],
177 | "source": [
178 | "#######################################\n",
179 | "# RUN THE PIPELINE\n",
180 | "#######################################\n",
181 | "output_dir=output_base_dir+'/ms4'\n",
182 | "with Pipeline:\n",
183 | " ms4_sort_dataset(dataset_dir=dsdir,output_dir=output_dir,adjacency_radius=-1,detect_threshold=3)\n",
184 | " A=validate_sorting_results(dataset_dir=dsdir,sorting_output_dir=output_dir,output_dir=output_dir)\n",
185 | " amplitudes_true=A['amplitudes_true']\n",
186 | " accuracies=A['accuracies']"
187 | ]
188 | },
189 | {
190 | "cell_type": "markdown",
191 | "metadata": {},
192 | "source": [
193 | "## Plot the comparison with ground truth"
194 | ]
195 | },
196 | {
197 | "cell_type": "code",
198 | "execution_count": null,
199 | "metadata": {},
200 | "outputs": [],
201 | "source": [
202 | "# Plot the comparison with ground truth\n",
203 | "plt.plot(amplitudes_true,accuracies,'.')\n",
204 | "plt.xlabel('Amplitude')\n",
205 | "plt.ylabel('Accuracy');\n",
206 | "plt.title('Accuracy vs. amplitude for {}'.format('simulated data'))"
207 | ]
208 | },
209 | {
210 | "cell_type": "code",
211 | "execution_count": null,
212 | "metadata": {},
213 | "outputs": [],
214 | "source": []
215 | }
216 | ],
217 | "metadata": {
218 | "kernelspec": {
219 | "display_name": "Python 3",
220 | "language": "python",
221 | "name": "python3"
222 | },
223 | "language_info": {
224 | "codemirror_mode": {
225 | "name": "ipython",
226 | "version": 3
227 | },
228 | "file_extension": ".py",
229 | "mimetype": "text/x-python",
230 | "name": "python",
231 | "nbconvert_exporter": "python",
232 | "pygments_lexer": "ipython3",
233 | "version": "3.6.2"
234 | }
235 | },
236 | "nbformat": 4,
237 | "nbformat_minor": 2
238 | }
239 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # MountainSort
2 |
3 | MountainSort is spike sorting software. It is part of MountainLab, a larger framework for conducting reproducible and shareable data analysis.
4 |
5 | ## Installation and basic usage
6 |
7 | ### Overview
8 |
9 | There are many ways to use MountainSort and MountainLab, and no single set of installation instructions fits all use cases. We welcome contributions to the documentation and software.
10 |
11 | The core MountainSort algorithm is implemented in a python package called [ml_ms4alg](https://github.com/magland/ml_ms4alg) that is available via github, pypi, and conda. The pre- and post-processing methods as well as other, more general utilities for working with electrophysiology datasets are found in a second python package called [ml_ephys](https://github.com/magland/ml_ephys), also available via github, pypi, and conda. Some legacy packages (ml_ms3 and ml_pyms) are also available on github and as conda packages.
12 |
13 | The recommended way to use these packages is through [MountainLab](https://github.com/flatironinstitute/mountainlab-js). This allows all the processing routines to be called from a single common interface, either from command line, bash scripts, python scripts, jupyter notebooks, or other high level languages. The framework also enables operating on remote data using remote processing resources and supports encapsulating processors in [Singularity](https://www.singularity-hub.org/) containers. It also facilitates sharing of data, processing pipelines, and spike sorting results.
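   |
   | For example, here is a minimal sketch of queueing a single processor from Python via the `mlproc` module (the processor name and parameters mirror the `ephys.bandpass_filter` wrappers used elsewhere in this repository; the file names are placeholders):
   |
   | ```
   | from mountainlab_pytools import mlproc as mlp
   |
   | Pipeline = mlp.initPipeline()
   | with Pipeline:
   |     # addProcess takes: processor name, inputs, outputs, parameters, opts
   |     mlp.addProcess(
   |         'ephys.bandpass_filter',
   |         {'timeseries': 'raw.mda'},
   |         {'timeseries_out': 'filt.mda'},
   |         {'samplerate': 30000, 'freq_min': 300, 'freq_max': 6000},
   |         {}
   |     )
   | ```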
14 |
15 | ### Installation options
16 |
17 |
19 | #### Installation with conda (recommended)
20 |
22 | To install using conda, first [install miniconda (or anaconda)](https://github.com/flatironinstitute/mountainlab-js/blob/master/docs/conda.md). If you are not a conda user you may be wary of doing this since, by default, conda injects itself into your system path and can cause conflicts with other installed software. However, there are relatively simple remedies for this issue, and the conda developers are working to improve the default behavior. Some details are [here](https://github.com/flatironinstitute/mountainlab-js/blob/master/docs/conda.md).
23 |
24 | After you have installed Miniconda and have created and activated a new conda environment, you can install the required MountainLab and MountainSort packages via:
25 |
26 | ```
27 | conda install -c flatiron -c conda-forge \
28 | mountainlab \
29 | mountainlab_pytools \
30 | ml_ephys \
31 | ml_ms3 \
32 | ml_ms4alg \
33 | ml_pyms
34 | ```
35 |
36 | At a later time you can update these packages via:
37 |
38 | ```
39 | conda update -c flatiron -c conda-forge \
40 | mountainlab \
41 | etc...
42 | ```
43 |
44 | You can test the installation by running
45 |
46 | ```
47 | ml-list-processors
48 | ```
49 |
50 | You should see a list of a few dozen processors. These are individual processing steps that can be pieced together to form a processing pipeline. You can get information on any particular processor via
51 |
52 | ```
53 | ml-spec [processor_name] -p
54 | ```
55 |
56 | More information about MountainLab and creating custom processors can be found in the [MountainLab documentation](https://github.com/flatironinstitute/mountainlab-js/blob/master/README.md). You may want to inspect the MountainLab configuration and adjust its settings, such as where temporary data files are stored, by running
57 |
58 | ```
59 | ml-config
60 | ```
61 |
62 | You should also install the ephys-viz package, which provides basic visualization of ephys datasets and spike sorting results:
63 |
64 | ```
65 | conda install -c flatiron -c conda-forge ephys-viz
66 | ```
67 |
68 | MountainView is an older (but more functional) GUI that can be installed via
69 |
70 | ```
71 | conda install -c flatiron -c conda-forge qt-mountainview
72 | ```
73 |
74 | Remember to periodically update these packages using the `conda update` command as shown above.
75 |
81 | #### Installation without conda
82 |
84 | If you choose not to (or cannot) use conda, you can alternatively install the software from source or by using the pip and npm package managers. Note that the ml_ms3 and qt-mountainview packages cannot be installed via a non-conda package manager since they require Qt5/C++ compilation.
85 |
86 | Instructions on installing MountainLab and mountainlab_pytools can be found in the [MountainLab documentation](https://github.com/flatironinstitute/mountainlab-js/blob/master/README.md).
87 |
88 | To install the ml_ms4alg, ml_ephys, and ml_pyms packages without conda, the first step is to use pip (with Python 3.6 or later):
89 |
90 | ```
91 | pip install ml_ms4alg
92 | pip install ml_ephys
93 | pip install ml_pyms
94 | ```
95 |
96 | Then you must link those packages into the directory where MountainLab can find them. There is a convenience command for this distributed with MountainLab, as described in [the docs](https://github.com/flatironinstitute/mountainlab-js/blob/master/README.md):
97 |
98 | ```
99 | ml-link-python-module ml_ms4alg `ml-config package_directory`/ml_ms4alg
100 | ml-link-python-module ml_ephys `ml-config package_directory`/ml_ephys
101 | ml-link-python-module ml_pyms `ml-config package_directory`/ml_pyms
102 | ```
103 |
104 | This creates a symbolic link to the installed python module directory from within the MountainLab package directory. If you are not in a conda environment, this location is by default `~/.mountainlab/packages`.
105 |
106 | To confirm that these processing packages have been installed properly, try the `ml-list-processors`, `ml-spec`, and `ml-config` commands as above.
107 |
108 | You can also install ephys-viz using npm:
109 |
110 | ```
111 | npm install -g ephys-viz
112 | ```
113 |
114 | It is possible to install ml_ms3 and qt-mountainview from source, but we are gradually moving away from these packages, so if you need them, I recommend following the conda instructions above.
115 |
120 | #### Developer installation
121 |
123 | If you want to help develop the framework, or if you for some reason want to avoid using the above package managers, you can install everything from source. Developer installation instructions for MountainLab can be found in [the docs](https://github.com/flatironinstitute/mountainlab-js/blob/master/README.md).
124 |
125 | As for the processor packages, use the following to determine where MountainLab expects packages to be:
126 |
127 | ```
128 | ml-config package_directory
129 | ```
130 |
131 | If you are not in a conda environment, this should default to `~/.mountainlab/packages`. This is where you should put the processing packages. For convenience it is easiest to develop them elsewhere and create symbolic links.
132 |
133 | How you should install the processing packages depends on whether you simply want to use them or want to modify/develop them. In the former case, just clone the repositories and then use `pip` and `ml-link-python-module` as follows:
134 |
135 |
136 | ```
137 | git clone https://github.com/magland/ml_ms4alg
138 | pip install ml_ms4alg
139 | ml-link-python-module ml_ms4alg `ml-config package_directory`/ml_ms4alg
140 | ```
141 |
142 | On the other hand, if you plan to modify or develop the code, then you should instead do the following:
143 |
144 | ```
145 | git clone https://github.com/magland/ml_ms4alg
146 |
147 | # PYTHONPATH affects where Python searches for modules
148 | export PYTHONPATH=[fill-in-path]/ml_ms4alg:$PYTHONPATH
149 |
150 | ml-link-python-module ml_ms4alg `ml-config package_directory`/ml_ms4alg
151 | ```
152 |
153 | In this case it is important that you also install all of the dependencies listed in `setup.py` using pip3. The `export` command should also be appended to your `~/.bashrc` file so that it persists across shell sessions.
154 |
155 | A similar procedure applies to the `ml_ephys` package, and something similar can be done for `ml_pyms`. The `ml_ms3` package involves Qt5/C++ and is more complicated to compile.
156 |
157 | Installation of `ephys-viz` is similar to that of `mountainlab-js`: follow the same instructions, substituting `ephys-viz` for `mountainlab-js`.
158 |
159 |