├── docs
│   ├── api.md
│   ├── examples.md
│   ├── how-to-guides.md
│   ├── completed-tasks.md
│   ├── logo.png
│   ├── assets
│   │   └── images
│   │       ├── logo.png
│   │       ├── open-book-icon.png
│   │       ├── check-list-icon.png
│   │       ├── conference-room-icon.png
│   │       ├── group-discussion-icon.png
│   │       ├── repair-fix-repairing-icon.png
│   │       └── install-software-download-icon.png
│   ├── css
│   │   └── mkdocstrings.css
│   ├── js
│   │   └── katex.js
│   ├── install.md
│   ├── examples
│   │   ├── scipy_example.py
│   │   ├── dask_example.py
│   │   ├── formats_example.py
│   │   └── formats_example_finch.py
│   ├── index.md
│   ├── quickstart.md
│   ├── migration-jl.md
│   ├── gen_logo.py
│   ├── introduction.md
│   ├── roadmap.md
│   ├── conduct.md
│   ├── contributing.md
│   ├── construct.md
│   └── operations.md
├── benchmarks
│   ├── __init__.py
│   ├── utils.py
│   ├── conftest.py
│   ├── test_elemwise.py
│   ├── test_tensordot.py
│   └── test_benchmark_coo.py
├── examples
│   ├── __init__.py
│   ├── utils.py
│   ├── triangles_example.py
│   ├── mttkrp_example.py
│   ├── matmul_example.py
│   ├── spmv_add_example.py
│   ├── sddmm_example.py
│   ├── hits_example.py
│   └── elemwise_example.py
├── ci
│   ├── Numba-array-api-skips.txt
│   ├── array-api-tests-rev.txt
│   ├── test_notebooks.sh
│   ├── test_backends.sh
│   ├── test_examples.sh
│   ├── test_MLIR.sh
│   ├── test_Finch.sh
│   ├── setup_env.sh
│   ├── environment.yml
│   ├── test_Numba.sh
│   ├── test_all.sh
│   ├── Finch-array-api-skips.txt
│   ├── test_array_api.sh
│   ├── clone_array_api_tests.sh
│   └── Numba-array-api-xfails.txt
├── sparse
│   ├── tests
│   │   ├── __init__.py
│   │   ├── conftest.py
│   │   └── test_backends.py
│   ├── mlir_backend
│   │   ├── tests
│   │   │   ├── __init__.py
│   │   │   └── conftest.py
│   │   ├── __init__.py
│   │   ├── _core.py
│   │   ├── _array.py
│   │   ├── _common.py
│   │   ├── _dtypes.py
│   │   └── _conversions.py
│   ├── numba_backend
│   │   ├── tests
│   │   │   ├── __init__.py
│   │   │   ├── conftest.py
│   │   │   ├── test_dask_interop.py
│   │   │   ├── test_io.py
│   │   │   ├── test_conversion.py
│   │   │   ├── test_coo_numba.py
│   │   │   ├── test_compressed_convert.py
│   │   │   ├── test_array_function.py
│   │   │   ├── test_compressed_2d.py
│   │   │   ├── test_namespace.py
│   │   │   └── test_einsum.py
│   │   ├── _compressed
│   │   │   ├── __init__.py
│   │   │   └── common.py
│   │   ├── _numba_extension.py
│   │   ├── _coo
│   │   │   └── __init__.py
│   │   ├── _settings.py
│   │   ├── _io.py
│   │   └── __init__.py
│   ├── finch_backend
│   │   └── __init__.py
│   └── __init__.py
├── benchmarks_original
│   ├── __init__.py
│   ├── utils.py
│   ├── mttkrp_example.py
│   ├── matmul_example.py
│   ├── spmv_add_example.py
│   ├── sddmm_example.py
│   └── elemwise_example.py
├── .github
│   ├── CODE_OF_CONDUCT.md
│   ├── ISSUE_TEMPLATE
│   │   ├── config.yml
│   │   ├── question-support.yml
│   │   ├── doc_issue.yml
│   │   ├── feature_request.yml
│   │   └── bug_report.yml
│   ├── dependabot.yml
│   ├── FUNDING.yml
│   ├── workflows
│   │   ├── codspeed.yml
│   │   ├── release-drafter.yml
│   │   └── ci.yml
│   ├── pull_request_template.md
│   └── release-drafter.yml
├── tox.ini
├── .gitattributes
├── .coveragerc
├── pytest.ini
├── setup.cfg
├── .codecov.yml
├── .readthedocs.yml
├── scripts
│   └── gen_ref_pages.py
├── conftest.py
├── .pre-commit-config.yaml
├── README.md
├── release-procedure.md
├── .gitignore
├── LICENSE
├── pixi.toml
├── mkdocs.yml
└── pyproject.toml
/docs/api.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /docs/examples.md: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /benchmarks/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /examples/__init__.py: 
-------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /ci/Numba-array-api-skips.txt: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /sparse/tests/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /benchmarks_original/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /sparse/mlir_backend/tests/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /sparse/numba_backend/tests/__init__.py: -------------------------------------------------------------------------------- 1 | -------------------------------------------------------------------------------- /docs/how-to-guides.md: -------------------------------------------------------------------------------- 1 | # How-to guides 2 | -------------------------------------------------------------------------------- /ci/array-api-tests-rev.txt: -------------------------------------------------------------------------------- 1 | 2db6c7b807a609a1539e312e01af093a45d34764 2 | -------------------------------------------------------------------------------- /docs/completed-tasks.md: -------------------------------------------------------------------------------- 1 | [Completed tasks](roadmap.md#completed-tasks) 2 | -------------------------------------------------------------------------------- /docs/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pydata/sparse/HEAD/docs/logo.png -------------------------------------------------------------------------------- /benchmarks/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | CI_MODE = bool(int(os.getenv("CI_MODE", default="0"))) 4 | -------------------------------------------------------------------------------- /docs/assets/images/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pydata/sparse/HEAD/docs/assets/images/logo.png -------------------------------------------------------------------------------- /.github/CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Code of Conduct 2 | 3 | Please see [`docs/conduct.md`](../docs/conduct.md). 4 | -------------------------------------------------------------------------------- /docs/assets/images/open-book-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pydata/sparse/HEAD/docs/assets/images/open-book-icon.png -------------------------------------------------------------------------------- /docs/assets/images/check-list-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pydata/sparse/HEAD/docs/assets/images/check-list-icon.png -------------------------------------------------------------------------------- /tox.ini: 
-------------------------------------------------------------------------------- 1 | [tox] 2 | envlist = py36, py37 3 | [testenv] 4 | commands= 5 | pytest {posargs} 6 | extras= 7 | tests 8 | tox 9 | -------------------------------------------------------------------------------- /docs/assets/images/conference-room-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pydata/sparse/HEAD/docs/assets/images/conference-room-icon.png -------------------------------------------------------------------------------- /docs/assets/images/group-discussion-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pydata/sparse/HEAD/docs/assets/images/group-discussion-icon.png -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/config.yml: -------------------------------------------------------------------------------- 1 | blank_issues_enabled: true 2 | contact_links: 3 | - name: Blank issue 4 | url: https://github.com/pydata/sparse/issues/new 5 | about: Open a blank issue. 6 | -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | sparse/_version.py export-subst 2 | # GitHub syntax highlighting 3 | pixi.lock linguist-language=YAML linguist-generated=true 4 | -------------------------------------------------------------------------------- /ci/test_notebooks.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | set -euxo pipefail 3 | 4 | CI_MODE=1 pytest -n 4 --nbmake --nbmake-timeout=600 ./examples/*.ipynb 5 | -------------------------------------------------------------------------------- /docs/assets/images/repair-fix-repairing-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pydata/sparse/HEAD/docs/assets/images/repair-fix-repairing-icon.png -------------------------------------------------------------------------------- /ci/test_backends.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | set -euxo pipefail 3 | 4 | source ci/test_Numba.sh 5 | source ci/test_Finch.sh 6 | source ci/test_MLIR.sh 7 | -------------------------------------------------------------------------------- /docs/assets/images/install-software-download-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pydata/sparse/HEAD/docs/assets/images/install-software-download-icon.png -------------------------------------------------------------------------------- /ci/test_examples.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | set -euxo pipefail 3 | 4 | for example in $(find ./examples/ -iname '*.py'); do 5 | CI_MODE=1 python $example 6 | done 7 | -------------------------------------------------------------------------------- /ci/test_MLIR.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | set -euxo pipefail 3 | 4 | SPARSE_BACKEND=MLIR pytest --pyargs sparse/mlir_backend --cov-report=xml:coverage_MLIR.xml -n auto -vvv 5 | -------------------------------------------------------------------------------- /sparse/numba_backend/_compressed/__init__.py: 
-------------------------------------------------------------------------------- 1 | from .common import concatenate, stack 2 | from .compressed import CSC, CSR, GCXS 3 | 4 | __all__ = ["GCXS", "CSR", "CSC", "concatenate", "stack"] 5 | -------------------------------------------------------------------------------- /sparse/mlir_backend/tests/conftest.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | import numpy as np 4 | 5 | 6 | @pytest.fixture(scope="module") 7 | def rng() -> np.random.Generator: 8 | return np.random.default_rng(42) 9 | -------------------------------------------------------------------------------- /sparse/numba_backend/tests/conftest.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | 4 | @pytest.fixture(scope="session") 5 | def rng(): 6 | from sparse.numba_backend._utils import default_rng 7 | 8 | return default_rng 9 | -------------------------------------------------------------------------------- /ci/test_Finch.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | set -euxo pipefail 3 | 4 | python -c 'import finch' 5 | PYTHONFAULTHANDLER="${HOME}/faulthandler.log" SPARSE_BACKEND=Finch pytest --pyargs sparse/tests --cov-report=xml:coverage_Finch.xml -n auto -vvv 6 | -------------------------------------------------------------------------------- /sparse/numba_backend/_numba_extension.py: -------------------------------------------------------------------------------- 1 | def _init_extension(): 2 | """ 3 | Load extensions when numba is loaded. 4 | This name must match the one in pyproject.toml 5 | """ 6 | from ._coo import numba_extension # noqa: F401 7 | -------------------------------------------------------------------------------- /.coveragerc: -------------------------------------------------------------------------------- 1 | [run] 2 | source= 3 | sparse/ 4 | 5 | omit= 6 | sparse/_version.py 7 | **/tests/* 8 | 9 | [report] 10 | exclude_lines = 11 | pragma: no cover 12 | return NotImplemented 13 | raise NotImplementedError 14 | -------------------------------------------------------------------------------- /pytest.ini: -------------------------------------------------------------------------------- 1 | [pytest] 2 | addopts = --cov-report term-missing --cov-report html --cov-report=term:skip-covered --cov sparse --cov-config .coveragerc 3 | filterwarnings = 4 | ignore::PendingDeprecationWarning 5 | testpaths = 6 | sparse 7 | junit_family=xunit2 8 | -------------------------------------------------------------------------------- /benchmarks/conftest.py: -------------------------------------------------------------------------------- 1 | import pytest 2 | 3 | import numpy as np 4 | 5 | 6 | @pytest.fixture(scope="function") 7 | def rng(): 8 | return np.random.default_rng(seed=42) 9 | 10 | 11 | @pytest.fixture(scope="session") 12 | def max_size(): 13 | return 2**26 14 | -------------------------------------------------------------------------------- /sparse/finch_backend/__init__.py: -------------------------------------------------------------------------------- 1 | try: 2 | import finch # noqa: F401 3 | except ModuleNotFoundError as e: 4 | raise ImportError("Finch not installed. 
Run `pip install sparse[finch]` to enable the Finch backend.") from e 5 | 6 | from finch import * # noqa: F403 7 | from finch import __all__ as __all__ 8 | -------------------------------------------------------------------------------- /ci/setup_env.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | set -euxo pipefail 3 | 4 | if [ ! -d ".venv" ]; then 5 | python -m venv .venv 6 | source .venv/bin/activate 7 | pip install -e .[all] 8 | source ci/clone_array_api_tests.sh 9 | pip install -r ../array-api-tests/requirements.txt 10 | pip uninstall -y matrepr 11 | fi 12 | -------------------------------------------------------------------------------- /docs/css/mkdocstrings.css: -------------------------------------------------------------------------------- 1 | :root { 2 | --md-primary-fg-color: #c96c08; 3 | --md-primary-fg-color--light: #94f2f7; 4 | --md-primary-fg-color--dark: #335365; 5 | } 6 | 7 | .md-tabs__item { 8 | font-weight: bolder; 9 | } 10 | 11 | .grid { 12 | font-weight: bolder; 13 | font-size: 160%; 14 | font-family: Georgia, serif; 15 | } 16 | -------------------------------------------------------------------------------- /docs/js/katex.js: -------------------------------------------------------------------------------- 1 | document$.subscribe(({ body }) => { 2 | renderMathInElement(body, { 3 | delimiters: [ 4 | { left: "$$", right: "$$", display: true }, 5 | { left: "$", right: "$", display: false }, 6 | { left: "\\(", right: "\\)", display: false }, 7 | { left: "\\[", right: "\\]", display: true }, 8 | ], 9 | }); 10 | }); 11 | -------------------------------------------------------------------------------- /ci/environment.yml: -------------------------------------------------------------------------------- 1 | name: sparse-dev 2 | channels: 3 | - conda-forge 4 | - nodefaults 5 | dependencies: 6 | - python 7 | - pip 8 | - pip: 9 | - finch-tensor>=0.2.13 10 | - finch-mlir>=0.0.2 11 | - pytest-codspeed 12 | - numpy 13 | - numba 14 | - scipy 15 | - dask 16 | - pytest 17 | - pytest-cov 18 | - pytest-xdist 19 | -------------------------------------------------------------------------------- /setup.cfg: -------------------------------------------------------------------------------- 1 | [flake8] 2 | # References: 3 | # https://flake8.readthedocs.io/en/latest/user/configuration.html 4 | # https://flake8.readthedocs.io/en/latest/user/error-codes.html 5 | 6 | # Note: there cannot be spaces after commas here 7 | exclude = 8 | __init__.py 9 | .tox/ 10 | 11 | 12 | max-line-length = 120 13 | 14 | [bdist_wheel] 15 | universal=1 16 | -------------------------------------------------------------------------------- /ci/test_Numba.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | set -euxo pipefail 3 | 4 | if [ $(python -c 'import numpy as np; print(np.lib.NumpyVersion(np.__version__) >= "2.0.0a1")') = 'True' ]; then 5 | pytest --pyargs sparse --doctest-modules --cov-report=xml:coverage_Numba.xml -n auto -vvv 6 | else 7 | pytest --pyargs sparse --cov-report=xml:coverage_Numba.xml -n auto -vvv 8 | fi 9 | -------------------------------------------------------------------------------- /.codecov.yml: -------------------------------------------------------------------------------- 1 | comment: false 2 | coverage: 3 | status: 4 | project: 5 | default: 6 | # Total project must be 95% 7 | target: '100%' 8 | threshold: '5%' 9 | 10 | patch: 11 | default: 12 | # Patch coverage must be 
92% 13 | target: '100%' 14 | threshold: '8%' 15 | 16 | precision: 2 17 | round: down 18 | range: 80...98 19 | -------------------------------------------------------------------------------- /ci/test_all.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | set -euxo pipefail 3 | 4 | ACTIVATE_VENV="${ACTIVATE_VENV:-0}" 5 | 6 | if [ $ACTIVATE_VENV = "1" ]; then 7 | source .venv/bin/activate 8 | fi 9 | 10 | source ci/test_backends.sh 11 | source ci/test_examples.sh 12 | source ci/test_notebooks.sh 13 | SPARSE_BACKEND="Numba" source ci/test_array_api.sh 14 | SPARSE_BACKEND="Finch" PYTHONFAULTHANDLER="${HOME}/faulthandler.log" source ci/test_array_api.sh 15 | -------------------------------------------------------------------------------- /ci/Finch-array-api-skips.txt: -------------------------------------------------------------------------------- 1 | # `test_nonzero` name conflict 2 | array_api_tests/test_searching_functions.py::test_nonzero_zerodim_error 3 | # flaky test 4 | array_api_tests/test_special_cases.py::test_unary[sign((x_i is -0 or x_i == +0)) -> 0] 5 | # `broadcast_to` is not defined in Finch, hangs as xfail 6 | array_api_tests/test_searching_functions.py::test_where 7 | # `test_solve` is not defined in Finch, hangs as xfail 8 | array_api_tests/test_linalg.py::test_solve 9 | -------------------------------------------------------------------------------- /sparse/tests/conftest.py: -------------------------------------------------------------------------------- 1 | import sparse 2 | 3 | import pytest 4 | 5 | import numpy as np 6 | 7 | 8 | @pytest.fixture(scope="session") 9 | def backend(): 10 | yield sparse._BACKEND 11 | 12 | 13 | @pytest.fixture(scope="module") 14 | def graph(): 15 | return np.array( 16 | [ 17 | [0, 1, 1, 0, 0], 18 | [0, 0, 1, 0, 1], 19 | [0, 0, 0, 0, 0], 20 | [0, 0, 0, 0, 1], 21 | [0, 1, 0, 1, 0], 22 | ] 23 | ) 24 | -------------------------------------------------------------------------------- /sparse/numba_backend/tests/test_dask_interop.py: -------------------------------------------------------------------------------- 1 | import sparse 2 | 3 | from dask.base import tokenize 4 | 5 | 6 | def test_deterministic_token(): 7 | a = sparse.COO(data=[1, 2, 3], coords=[10, 20, 30], shape=(40,)) 8 | b = sparse.COO(data=[1, 2, 3], coords=[10, 20, 30], shape=(40,)) 9 | assert tokenize(a) == tokenize(b) 10 | # One of these things is not like the other.... 
11 | c = sparse.COO(data=[1, 2, 4], coords=[10, 20, 30], shape=(40,)) 12 | assert tokenize(a) != tokenize(c) 13 | -------------------------------------------------------------------------------- /.github/dependabot.yml: -------------------------------------------------------------------------------- 1 | # Set update schedule for GitHub Actions 2 | # This opens a PR when actions in workflows need an update 3 | 4 | version: 2 5 | updates: 6 | - package-ecosystem: "github-actions" 7 | directory: "/" 8 | schedule: 9 | # Check for updates to GitHub Actions every week 10 | interval: "weekly" 11 | commit-message: 12 | prefix: "skip changelog" # So this PR will not be added to release-drafter 13 | include: "scope" # List of the updated dependencies in the commit will be added 14 | -------------------------------------------------------------------------------- /ci/test_array_api.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | set -euxo pipefail 3 | 4 | source ci/clone_array_api_tests.sh 5 | 6 | if [ "${SPARSE_BACKEND}" = "Finch" ]; then 7 | python -c 'import finch' 8 | fi 9 | ARRAY_API_TESTS_MODULE="sparse" pytest "$ARRAY_API_TESTS_DIR/array_api_tests/" -v -c "$ARRAY_API_TESTS_DIR/pytest.ini" --ci --max-examples=2 --derandomize --disable-deadline --disable-warnings -o xfail_strict=True -n auto --xfails-file ../sparse/ci/${SPARSE_BACKEND}-array-api-xfails.txt --skips-file ../sparse/ci/${SPARSE_BACKEND}-array-api-skips.txt 10 | -------------------------------------------------------------------------------- /docs/install.md: -------------------------------------------------------------------------------- 1 | # Install 2 | 3 | You can install this library with `pip`: 4 | 5 | ```bash 6 | pip install sparse 7 | ``` 8 | 9 | You can also install from source on GitHub, either by pip installing 10 | directly: 11 | ```bash 12 | pip install git+https://github.com/pydata/sparse 13 | ``` 14 | Or by cloning the repository and installing locally: 15 | ```bash 16 | git clone https://github.com/pydata/sparse.git 17 | cd sparse/ 18 | pip install . 19 | ``` 20 | Note that this library is under active development and so some API churn should 21 | be expected. 22 | -------------------------------------------------------------------------------- /.readthedocs.yml: -------------------------------------------------------------------------------- 1 | # Read the Docs configuration file for MkDocs projects 2 | # See https://docs.readthedocs.io/en/stable/config-file/v2.html for details 3 | 4 | # Required 5 | version: 2 6 | 7 | # Set the version of Python and other tools you might need 8 | build: 9 | os: ubuntu-22.04 10 | tools: 11 | python: "3.12" 12 | 13 | mkdocs: 14 | configuration: mkdocs.yml 15 | fail_on_warning: false 16 | 17 | # Optionally declare the Python requirements required to build your docs 18 | python: 19 | install: 20 | - method: pip 21 | path: . 22 | extra_requirements: 23 | - docs 24 | -------------------------------------------------------------------------------- /ci/clone_array_api_tests.sh: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env bash 2 | set -euxo pipefail 3 | 4 | ARRAY_API_TESTS_DIR="${ARRAY_API_TESTS_DIR:-"../array-api-tests"}" 5 | if [ ! 
-d "$ARRAY_API_TESTS_DIR" ]; then 6 | git clone --recursive https://github.com/data-apis/array-api-tests.git "$ARRAY_API_TESTS_DIR" 7 | fi 8 | 9 | git --git-dir="$ARRAY_API_TESTS_DIR/.git" --work-tree "$ARRAY_API_TESTS_DIR" clean -xddf 10 | git --git-dir="$ARRAY_API_TESTS_DIR/.git" --work-tree "$ARRAY_API_TESTS_DIR" fetch 11 | git --git-dir="$ARRAY_API_TESTS_DIR/.git" --work-tree "$ARRAY_API_TESTS_DIR" reset --hard $(cat "ci/array-api-tests-rev.txt") 12 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/question-support.yml: -------------------------------------------------------------------------------- 1 | name: Question/Support 2 | description: A question about how to use this library. 3 | title: "Usage: " 4 | labels: "usage" 5 | 6 | body: 7 | - type: markdown 8 | attributes: 9 | value: > 10 | ## Thank you for your interest in sparse 11 | - type: textarea 12 | attributes: 13 | label: Please provide a description of what you'd like to do. 14 | validations: 15 | required: true 16 | - type: textarea 17 | attributes: 18 | label: Example Code 19 | description: > 20 | Syntactically valid Python code that shows what you want to do, 21 | possibly with placeholder functions or methods. 22 | -------------------------------------------------------------------------------- /scripts/gen_ref_pages.py: -------------------------------------------------------------------------------- 1 | """Generate the code reference pages.""" 2 | 3 | from pathlib import Path 4 | 5 | import sparse 6 | 7 | import mkdocs_gen_files 8 | 9 | nav = mkdocs_gen_files.Nav() 10 | 11 | root = Path(__file__).parent.parent 12 | 13 | for item in dir(sparse): 14 | if item.startswith("_") or not getattr(getattr(sparse, item), "__module__", "").startswith("sparse"): 15 | continue 16 | full_doc_path = Path("api/" + item + ".md") 17 | with mkdocs_gen_files.open(Path("api", f"{item}.md"), "w") as fd: 18 | print(f"# {item}", file=fd) 19 | print("::: " + f"sparse.{item}", file=fd) 20 | mkdocs_gen_files.set_edit_path(full_doc_path, root) 21 | -------------------------------------------------------------------------------- /.github/FUNDING.yml: -------------------------------------------------------------------------------- 1 | # These are supported funding model platforms 2 | 3 | github: [Quansight, Quansight-Labs] 4 | patreon: # Replace with a single Patreon username 5 | open_collective: # Replace with a single Open Collective username 6 | ko_fi: # Replace with a single Ko-fi username 7 | tidelift: # Replace with a single Tidelift platform-name/package-name e.g., npm/babel 8 | community_bridge: # Replace with a single Community Bridge project-name e.g., cloud-foundry 9 | liberapay: # Replace with a single Liberapay username 10 | issuehunt: # Replace with a single IssueHunt username 11 | otechie: # Replace with a single Otechie username 12 | custom: # Replace with up to 4 custom sponsorship URLs e.g., ['link1', 'link2'] 13 | -------------------------------------------------------------------------------- /examples/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import time 3 | from collections.abc import Callable, Iterable 4 | from typing import Any 5 | 6 | CI_MODE = bool(int(os.getenv("CI_MODE", default="0"))) 7 | 8 | 9 | def benchmark( 10 | func: Callable, 11 | args: Iterable[Any], 12 | info: str, 13 | iters: int, 14 | ) -> object: 15 | # Compile 16 | result = func(*args) 17 | 18 | if CI_MODE: 19 | print("CI mode - skipping 
benchmark") 20 | return result 21 | 22 | # Benchmark 23 | print(info) 24 | start = time.time() 25 | for _ in range(iters): 26 | func(*args) 27 | elapsed = time.time() - start 28 | print(f"Took {elapsed / iters} s.\n") 29 | 30 | return result 31 | -------------------------------------------------------------------------------- /benchmarks_original/utils.py: -------------------------------------------------------------------------------- 1 | import os 2 | import time 3 | from collections.abc import Callable, Iterable 4 | from typing import Any 5 | 6 | CI_MODE = bool(int(os.getenv("CI_MODE", default="0"))) 7 | 8 | 9 | def benchmark( 10 | func: Callable, 11 | args: Iterable[Any], 12 | info: str, 13 | iters: int, 14 | ) -> object: 15 | # Compile 16 | result = func(*args) 17 | 18 | if CI_MODE: 19 | print("CI mode - skipping benchmark") 20 | return result 21 | 22 | # Benchmark 23 | print(info) 24 | start = time.time() 25 | for _ in range(iters): 26 | func(*args) 27 | elapsed = time.time() - start 28 | print(f"Took {elapsed / iters} s.\n") 29 | 30 | return result 31 | -------------------------------------------------------------------------------- /.github/workflows/codspeed.yml: -------------------------------------------------------------------------------- 1 | name: codspeed-benchmarks 2 | 3 | on: 4 | push: 5 | branches: 6 | - "main" # or "master" 7 | 8 | pull_request: 9 | # `workflow_dispatch` allows CodSpeed to trigger backtest 10 | # performance analysis in order to generate initial data. 11 | workflow_dispatch: 12 | 13 | jobs: 14 | benchmarks: 15 | runs-on: ubuntu-latest 16 | steps: 17 | - uses: actions/checkout@v4 18 | - uses: actions/setup-python@v5 19 | with: 20 | python-version: "3.12" 21 | 22 | - name: Install dependencies 23 | run: pip install ".[all]" 24 | 25 | - name: Run benchmarks 26 | uses: CodSpeedHQ/action@v3 27 | with: 28 | run: pytest benchmarks/ --codspeed 29 | -------------------------------------------------------------------------------- /conftest.py: -------------------------------------------------------------------------------- 1 | import pathlib 2 | 3 | import sparse 4 | 5 | import pytest 6 | 7 | 8 | @pytest.fixture(scope="session", autouse=True) 9 | def add_doctest_modules(doctest_namespace): 10 | import sparse 11 | 12 | import numpy as np 13 | 14 | doctest_namespace["np"] = np 15 | doctest_namespace["sparse"] = sparse 16 | 17 | 18 | def pytest_ignore_collect(collection_path: pathlib.Path, config: pytest.Config) -> bool | None: 19 | if "numba_backend" in collection_path.parts and sparse._BackendType.Numba != sparse._BACKEND: 20 | return True 21 | 22 | if "mlir_backend" in collection_path.parts and sparse._BackendType.MLIR != sparse._BACKEND: 23 | return True 24 | 25 | if "finch_backend" in collection_path.parts and sparse._BackendType.Finch != sparse._BACKEND: 26 | return True 27 | 28 | return None 29 | -------------------------------------------------------------------------------- /.github/pull_request_template.md: -------------------------------------------------------------------------------- 1 | 7 | 8 | ## What type of PR is this? (check all applicable) 9 | 10 | - [ ] 💾 Refactor 11 | - [ ] 🪄 Feature 12 | - [ ] 🐞 Bug Fix 13 | - [ ] 🔧 Optimization 14 | - [ ] 📚 Documentation 15 | - [ ] 🧪 Test 16 | - [ ] 🛠️ Other 17 | 18 | ## Related issues 19 | 20 | - Related issue # 21 | - Closes # 22 | 23 | ## Checklist 24 | 25 | - [ ] Code follows style guide 26 | - [ ] Tests added 27 | - [ ] Documented the changes 28 | 29 | *** 30 | 31 | ## Please explain your changes below. 
32 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | repos: 2 | - repo: https://github.com/pre-commit/pre-commit-hooks 3 | rev: v6.0.0 4 | hooks: 5 | - id: check-yaml 6 | - id: end-of-file-fixer 7 | - id: trailing-whitespace 8 | - id: fix-byte-order-marker 9 | - id: destroyed-symlinks 10 | - id: mixed-line-ending 11 | - id: name-tests-test 12 | args: ["--pytest-test-first"] 13 | - id: no-commit-to-branch 14 | - id: pretty-format-json 15 | args: ["--autofix", "--no-ensure-ascii"] 16 | exclude: ".ipynb" 17 | 18 | - repo: https://github.com/astral-sh/ruff-pre-commit 19 | rev: v0.14.10 20 | hooks: 21 | - id: ruff-check 22 | args: ["--fix"] 23 | types_or: [ python, pyi, jupyter ] 24 | - id: ruff-format 25 | types_or: [ python, pyi, jupyter ] 26 | 27 | - repo: https://github.com/kynan/nbstripout 28 | rev: 0.8.2 29 | hooks: 30 | - id: nbstripout 31 | -------------------------------------------------------------------------------- /sparse/numba_backend/tests/test_io.py: -------------------------------------------------------------------------------- 1 | import sparse 2 | from sparse import load_npz, save_npz 3 | from sparse.numba_backend._utils import assert_eq 4 | 5 | import pytest 6 | 7 | import numpy as np 8 | 9 | 10 | @pytest.mark.parametrize("compression", [True, False]) 11 | @pytest.mark.parametrize("format", ["coo", "gcxs"]) 12 | def test_save_load_npz_file(tmp_path, compression, format): 13 | x = sparse.random((2, 3, 4, 5), density=0.25, format=format) 14 | y = x.todense() 15 | 16 | filename = tmp_path / "mat.npz" 17 | save_npz(filename, x, compressed=compression) 18 | z = load_npz(filename) 19 | assert_eq(x, z) 20 | assert_eq(y, z.todense()) 21 | 22 | 23 | def test_load_wrong_format_exception(tmp_path): 24 | x = np.array([1, 2, 3]) 25 | 26 | filename = tmp_path / "mat.npz" 27 | 28 | np.savez(filename, x) 29 | with pytest.raises(RuntimeError): 30 | load_npz(filename) 31 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | ![Sparse](docs/assets/images/logo_with_text.svg) 2 | 3 | # Sparse Multidimensional Arrays 4 | 5 | [![Build Status](https://github.com/pydata/sparse/actions/workflows/ci.yml/badge.svg)]( 6 | https://github.com/pydata/sparse/actions/workflows/ci.yml) 7 | [![Docs Status](https://readthedocs.org/projects/sparse-nd/badge/?version=latest)]( 8 | http://sparse.pydata.org/en/latest/?badge=latest) 9 | [![Coverage](https://codecov.io/gh/pydata/sparse/branch/main/graph/badge.svg)]( 10 | https://codecov.io/gh/pydata/sparse) 11 | 12 | ## This library provides multi-dimensional sparse arrays. 13 | 14 | - 📚 [Documentation](http://sparse.pydata.org) 15 | 16 | - 🙌 [Contributing](https://github.com/pydata/sparse/blob/main/docs/contributing.md) 17 | 18 | - 🪲 [Bug Reports/Feature Requests](https://github.com/pydata/sparse/issues) 19 | 20 | - 💬 [Discord Server](https://discord.gg/vur45CbwMz) [Channel](https://discord.com/channels/786703927705862175/1301155724646289420) 21 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/doc_issue.yml: -------------------------------------------------------------------------------- 1 | name: Documentation improvement 2 | description: Report to improve the docs. You could also directly open a PR with your suggestions. 
3 | title: "Doc: " 4 | labels: ["docs", "needs triage"] 5 | 6 | body: 7 | - type: markdown 8 | attributes: 9 | value: > 10 | ## Thanks for taking the time to fill out this form 11 | - type: dropdown 12 | id: TYPE 13 | attributes: 14 | label: What type of report is this? 15 | options: 16 | - 'Correction' 17 | - 'Improvement' 18 | validations: 19 | required: true 20 | - type: textarea 21 | attributes: 22 | label: Please describe the issue. 23 | description: > 24 | Tell us if something is unclear or incorrect, and where. 25 | validations: 26 | required: true 27 | - type: textarea 28 | attributes: 29 | label: If you have a suggestion on how it should be, add it below. 30 | description: > 31 | How can we improve it 32 | - type: markdown 33 | attributes: 34 | value: > 35 | ### If you are interested in opening a pull request to fix this, please let us know. 36 | -------------------------------------------------------------------------------- /docs/examples/scipy_example.py: -------------------------------------------------------------------------------- 1 | # --- 2 | # jupyter: 3 | # jupytext: 4 | # text_representation: 5 | # extension: .py 6 | # format_name: light 7 | # format_version: '1.5' 8 | # jupytext_version: 1.16.4 9 | # kernelspec: 10 | # display_name: sparse 11 | # language: python 12 | # name: python3 13 | # --- 14 | 15 | # # Using with SciPy 16 | # ## Import 17 | 18 | # + 19 | import sparse 20 | 21 | import numpy as np 22 | import scipy.sparse as sps 23 | 24 | # - 25 | 26 | # ## Create Arrays 27 | 28 | rng = np.random.default_rng(42) 29 | M = 1_000 30 | DENSITY = 0.01 31 | a = sparse.random((M, M), density=DENSITY, format="csc") 32 | identity = sparse.eye(M, format="csc") 33 | 34 | # ## Invert and verify matrix 35 | # This showcases the `scipy.sparse.linalg` integration. 36 | 37 | a_inv = sps.linalg.spsolve(a, identity) 38 | np.testing.assert_array_almost_equal((a_inv @ a).todense(), identity.todense()) 39 | 40 | # ## Calculate the graph distances 41 | # This showcases the `scipy.sparse.csgraph` integration. 
42 | 43 | sps.csgraph.bellman_ford(sparse.eye(5, k=1) + sparse.eye(5, k=-1), return_predecessors=False) 44 | -------------------------------------------------------------------------------- /sparse/numba_backend/_coo/__init__.py: -------------------------------------------------------------------------------- 1 | from .common import ( 2 | argmax, 3 | argmin, 4 | argwhere, 5 | clip, 6 | concatenate, 7 | diagonal, 8 | diagonalize, 9 | expand_dims, 10 | flip, 11 | isneginf, 12 | isposinf, 13 | kron, 14 | nanmax, 15 | nanmean, 16 | nanmin, 17 | nanprod, 18 | nanreduce, 19 | nansum, 20 | result_type, 21 | roll, 22 | sort, 23 | stack, 24 | take, 25 | tril, 26 | triu, 27 | unique_counts, 28 | unique_values, 29 | where, 30 | ) 31 | from .core import COO, as_coo 32 | 33 | __all__ = [ 34 | "COO", 35 | "as_coo", 36 | "argmax", 37 | "argmin", 38 | "argwhere", 39 | "clip", 40 | "concatenate", 41 | "diagonal", 42 | "diagonalize", 43 | "expand_dims", 44 | "flip", 45 | "isneginf", 46 | "isposinf", 47 | "kron", 48 | "nanmax", 49 | "nanmean", 50 | "nanmin", 51 | "nanprod", 52 | "nanreduce", 53 | "nansum", 54 | "result_type", 55 | "roll", 56 | "sort", 57 | "stack", 58 | "take", 59 | "tril", 60 | "triu", 61 | "unique_counts", 62 | "unique_values", 63 | "where", 64 | ] 65 | -------------------------------------------------------------------------------- /sparse/mlir_backend/__init__.py: -------------------------------------------------------------------------------- 1 | try: 2 | import mlir_finch # noqa: F401 3 | 4 | del mlir_finch 5 | except ModuleNotFoundError as e: 6 | raise ImportError( 7 | "MLIR Python bindings not installed. Run `pip install finch-mlir` to enable the MLIR backend." 8 | ) from e 9 | 10 | from . import formats 11 | from ._array import Array 12 | from ._conversions import asarray, from_constituent_arrays, to_numpy, to_scipy 13 | from ._dtypes import ( 14 | asdtype, 15 | complex64, 16 | complex128, 17 | float16, 18 | float32, 19 | float64, 20 | int8, 21 | int16, 22 | int32, 23 | int64, 24 | uint8, 25 | uint16, 26 | uint32, 27 | uint64, 28 | ) 29 | from ._ops import add, reshape 30 | 31 | __all__ = [ 32 | "Array", 33 | "add", 34 | "asarray", 35 | "asdtype", 36 | "to_numpy", 37 | "to_scipy", 38 | "formats", 39 | "reshape", 40 | "from_constituent_arrays", 41 | "int8", 42 | "int16", 43 | "int32", 44 | "int64", 45 | "uint8", 46 | "uint16", 47 | "uint32", 48 | "uint64", 49 | "float16", 50 | "float32", 51 | "float64", 52 | "complex64", 53 | "complex128", 54 | ] 55 | -------------------------------------------------------------------------------- /docs/examples/dask_example.py: -------------------------------------------------------------------------------- 1 | # --- 2 | # jupyter: 3 | # jupytext: 4 | # text_representation: 5 | # extension: .py 6 | # format_name: light 7 | # format_version: '1.5' 8 | # jupytext_version: 1.16.4 9 | # kernelspec: 10 | # display_name: sparse 11 | # language: python 12 | # name: python3 13 | # --- 14 | 15 | # # Using with Dask 16 | # ## Import 17 | 18 | # + 19 | import sparse 20 | 21 | import dask.array as da 22 | 23 | import numpy as np 24 | 25 | # - 26 | 27 | # ## Create Arrays 28 | # 29 | # Here, we create two random sparse arrays and move them to Dask. 
30 | 31 | # + 32 | rng = np.random.default_rng(42) 33 | M, N = 10_000, 10_000 34 | DENSITY = 0.0001 35 | a = sparse.random((M, N), density=DENSITY, random_state=rng) 36 | b = sparse.random((M, N), density=DENSITY, random_state=rng) 37 | 38 | a_dask = da.from_array(a, chunks=1000) 39 | b_dask = da.from_array(b, chunks=1000) 40 | # - 41 | 42 | # As we can see in the "data type" section, each chunk of the Dask array is still sparse. 43 | 44 | a_dask # noqa: B018 45 | 46 | # ## Compute and check results 47 | # As we can see, what we get out of Dask matches what we get out of `sparse`. 48 | 49 | assert sparse.all(a + b == (a_dask + b_dask).compute()) 50 | -------------------------------------------------------------------------------- /docs/examples/formats_example.py: -------------------------------------------------------------------------------- 1 | # --- 2 | # jupyter: 3 | # jupytext: 4 | # text_representation: 5 | # extension: .py 6 | # format_name: light 7 | # format_version: '1.5' 8 | # jupytext_version: 1.16.4 9 | # kernelspec: 10 | # display_name: sparse 11 | # language: python 12 | # name: python3 13 | # --- 14 | 15 | # # Multiple Formats 16 | # ## Import 17 | # Let's import `sparse`. 18 | 19 | # + 20 | import sparse 21 | 22 | import numpy as np 23 | 24 | # - 25 | 26 | 27 | # ## Perform Operations 28 | # Let's create two arrays. 29 | 30 | rng = np.random.default_rng(42) # Seed for reproducibility 31 | a = sparse.random((3, 3), density=1 / 6, random_state=rng) 32 | b = sparse.random((3, 3), density=1 / 6, random_state=rng) 33 | 34 | # Now let's matrix multiply them. 35 | 36 | c = a @ b 37 | 38 | # And view the result as a (dense) NumPy array. 39 | 40 | c_dense = c.todense() 41 | 42 | # Now let's do the same for other formats, and compare the results. 43 | 44 | for format in ["coo", "csr", "csc"]: 45 | af = sparse.asarray(a, format=format) 46 | bf = sparse.asarray(b, format=format) 47 | cf = af @ bf 48 | np.testing.assert_array_equal(c_dense, cf.todense()) 49 | -------------------------------------------------------------------------------- /release-procedure.md: -------------------------------------------------------------------------------- 1 | * Tag commit 2 | ```bash 3 | git tag -a x.x.x -m 'Version x.x.x' 4 | ``` 5 | 6 | * Push to GitHub 7 | ```bash 8 | git push pydata main --tags 9 | ``` 10 | When you open the PR on GitHub, make sure the title of the PR starts with "release". 11 | 12 | * Upload to PyPI 13 | ```bash 14 | git clean -xfd # remove all files in directory not in repository 15 | python -m build --wheel --sdist # make packages 16 | twine upload dist/* # upload packages 17 | ``` 18 | 19 | * Update the release drafter: 20 | Go to https://github.com/pydata/sparse 21 | Under the "Release" section there are two links: one is the latest release (it has a tag). 22 | The second one is `+`. Click on the second one to see the release drafter. 23 | Edit the draft by clicking the "pencil" icon. 24 | Make sure the tags are correct. If they are not, you can create one. 25 | If the markdown page looks correct, click on "Publish release". 26 | 
27 | * Enable the newly-pushed tag for documentation: https://readthedocs.org/projects/sparse-nd/versions/ 28 | * Wait for the conda-forge bot to notice that the build is out of date and open an update PR. 29 | * Edit and merge that PR. 30 | * Announce the release on: 31 | * numpy-discussion@python.org 32 | * python-announce-list@python.org 33 | -------------------------------------------------------------------------------- /docs/examples/formats_example_finch.py: -------------------------------------------------------------------------------- 1 | # --- 2 | # jupyter: 3 | # jupytext: 4 | # text_representation: 5 | # extension: .py 6 | # format_name: light 7 | # format_version: '1.5' 8 | # jupytext_version: 1.16.4 9 | # kernelspec: 10 | # display_name: sparse 11 | # language: python 12 | # name: python3 13 | # --- 14 | 15 | # # Multiple Formats with Finch 16 | # ## Import 17 | # Let's set the backend and import `sparse`. 18 | 19 | # + 20 | import os 21 | 22 | os.environ["SPARSE_BACKEND"] = "Finch" 23 | 24 | import sparse 25 | 26 | import numpy as np 27 | 28 | # - 29 | 30 | 31 | # ## Perform Operations 32 | # Let's create two arrays. 33 | 34 | rng = np.random.default_rng(42) # Seed for reproducibility 35 | a = sparse.random((3, 3), density=1 / 6, random_state=rng) 36 | b = sparse.random((3, 3), density=1 / 6, random_state=rng) 37 | 38 | # Now let's matrix multiply them. 39 | 40 | c = a @ b 41 | 42 | # And view the result as a (dense) NumPy array. 43 | 44 | c_dense = c.todense() 45 | 46 | # Now let's do the same for other formats, and compare the results. 47 | 48 | for format in ["coo", "csr", "csc", "dense"]: 49 | af = sparse.asarray(a, format=format) 50 | bf = sparse.asarray(b, format=format) 51 | cf = af @ bf 52 | np.testing.assert_array_equal(c_dense, cf.todense()) 53 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | #####=== Python ===##### 2 | 3 | # Byte-compiled / optimized / DLL files 4 | __pycache__/ 5 | *.py[cod] 6 | *$py.class 7 | 8 | # C extensions 9 | *.so 10 | 11 | # Distribution / packaging 12 | .Python 13 | env/ 14 | build/ 15 | develop-eggs/ 16 | dist/ 17 | downloads/ 18 | eggs/ 19 | .eggs/ 20 | lib/ 21 | lib64/ 22 | parts/ 23 | sdist/ 24 | var/ 25 | *.egg-info/ 26 | .installed.cfg 27 | *.egg 28 | 29 | # PyInstaller 30 | # Usually these files are written by a python script from a template 31 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 32 | *.manifest 33 | *.spec 34 | 35 | # Installer logs 36 | pip-log.txt 37 | pip-delete-this-directory.txt 38 | 39 | # Unit test / coverage reports 40 | htmlcov/ 41 | .tox/ 42 | .coverage 43 | .coverage.* 44 | .cache 45 | nosetests.xml 46 | coverage.xml 47 | *,cover 48 | .pytest_cache/ 49 | test_results/ 50 | junit/ 51 | .hypothesis/ 52 | coverage_*.xml 53 | 54 | # Translations 55 | *.mo 56 | *.pot 57 | 58 | # Django stuff: 59 | *.log 60 | 61 | # mkdocs documentation 62 | site/ 63 | 64 | # PyBuilder 65 | target/ 66 | 67 | # IDE 68 | .idea/ 69 | .vscode/ 70 | default.profraw 71 | 72 | # Sandbox 73 | sandbox.py 74 | 75 | # macOS 76 | **/.DS_Store 77 | 78 | # Version file 79 | sparse/_version.py 80 | 81 | # Benchmark Results 82 | results/ 83 | 84 | # Notebooks converted to scripts. 
85 | docs/examples_ipynb/ 86 | 87 | # Envs 88 | .pixi/ 89 | pixi.lock 90 | .venv/ 91 | -------------------------------------------------------------------------------- /docs/index.md: -------------------------------------------------------------------------------- 1 | --- 2 | hide: 3 | - navigation 4 | - toc 5 | --- 6 | 7 | # Sparse 8 | This project implements sparse arrays of arbitrary dimension on top of 9 | [`numpy`][] and 10 | [`scipy.sparse`][]. It generalizes the 11 | [`scipy.sparse.coo_matrix`][] and 12 | [`scipy.sparse.dok_matrix`][] layouts, but 13 | extends beyond just rows and columns to an arbitrary number of 14 | dimensions. 15 |
16 | <div class="grid cards" markdown>
17 | ![Sparse](./assets/images/logo.png){width=20%, align=left} 18 |
18 | 19 | 20 | ![Sparse](./assets/images/conference-room-icon.png){width=10%, align=left} 21 | Introduction 22 | { .card } 23 | 24 | ![Sparse](./assets/images/install-software-download-icon.png){width=10%, align=left} 25 | Install 26 | { .card } 27 | 28 | ![Sparse](./assets/images/open-book-icon.png){width=10%, align=left} 29 | Tutorials 30 | { .card } 31 | 32 | ![Sparse](./assets/images/check-list-icon.png){width=10%, align=left} 33 | How-to guides 34 | { .card } 35 | 36 | ![Sparse](./assets/images/repair-fix-repairing-icon.png){width=10%, align=left} 37 | API 38 | { .card } 39 | 40 | ![Sparse](./assets/images/group-discussion-icon.png){width=10%, align=left} 41 | Contributing 42 | { .card } 43 | 44 | </div>
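45 | 46 | For a quick first taste, here is a minimal sketch using only the core `COO` interface: 47 | 48 | ```python 49 | import numpy as np 50 | import sparse 51 | 52 | x = np.zeros((3, 3, 3)) 53 | x[0, 0, 0] = 1 54 | x[1, 2, 2] = 3 55 | s = sparse.COO.from_numpy(x)  # a three-dimensional sparse array with two stored values 56 | assert s.nnz == 2 57 | ```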
58 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/feature_request.yml: -------------------------------------------------------------------------------- 1 | name: Feature request 2 | description: Form to request a new feature 3 | title: "Enh: " 4 | labels: ["enhancement", "needs triage"] 5 | 6 | body: 7 | - type: markdown 8 | attributes: 9 | value: > 10 | ## Thanks for helping us improve sparse! 11 | - type: markdown 12 | attributes: 13 | value: > 14 | ### Before submitting a request, please check if it has already been discussed in the 15 | [list of issues](https://github.com/pydata/sparse/issues). 16 | - type: textarea 17 | attributes: 18 | label: Please describe the purpose of the new feature or describe the problem to solve. 19 | description: > 20 | A clear description of the objective. 21 | validations: 22 | required: true 23 | - type: textarea 24 | attributes: 25 | label: Suggest a solution if possible. 26 | description: > 27 | Please suggest a solution if you can. 28 | validations: 29 | required: false 30 | - type: textarea 31 | attributes: 32 | label: If you have tried alternatives, please describe them below. 33 | description: > 34 | What you have tried if applicable. 35 | - type: textarea 36 | attributes: 37 | label: Additional information that may help us understand your needs. 38 | description: > 39 | Context, screenshots, or any useful information. 40 | - type: markdown 41 | attributes: 42 | value: > 43 | ### If you are interested in opening a pull request to fix this, please let us know. 44 | -------------------------------------------------------------------------------- /sparse/numba_backend/_settings.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | import numpy as np 4 | 5 | AUTO_DENSIFY = bool(int(os.environ.get("SPARSE_AUTO_DENSIFY", "0"))) 6 | WARN_ON_TOO_DENSE = bool(int(os.environ.get("SPARSE_WARN_ON_TOO_DENSE", "0"))) 7 | IS_NUMPY2 = np.lib.NumpyVersion(np.__version__) >= "2.0.0a1" 8 | 9 | 10 | def _is_nep18_enabled(): 11 | class A: 12 | def __array_function__(self, *args, **kwargs): 13 | return True 14 | 15 | try: 16 | return np.concatenate([A()]) 17 | except ValueError: 18 | return False 19 | 20 | 21 | NEP18_ENABLED = _is_nep18_enabled() 22 | 23 | 24 | class ArrayNamespaceInfo: 25 | def __init__(self): 26 | self.np_info = np.__array_namespace_info__() 27 | 28 | def capabilities(self): 29 | np_capabilities = self.np_info.capabilities() 30 | return { 31 | "boolean indexing": False, 32 | "data-dependent shapes": True, 33 | "max dimensions": np_capabilities.get("max dimensions", 64) - 1, 34 | } 35 | 36 | def default_device(self): 37 | return self.np_info.default_device() 38 | 39 | def default_dtypes(self, *, device=None): 40 | return self.np_info.default_dtypes(device=device) 41 | 42 | def devices(self): 43 | return self.np_info.devices() 44 | 45 | def dtypes(self, *, device=None, kind=None): 46 | return self.np_info.dtypes(device=device, kind=kind) 47 | 48 | 49 | def __array_namespace_info__() -> ArrayNamespaceInfo: 50 | return ArrayNamespaceInfo() 51 | -------------------------------------------------------------------------------- /sparse/mlir_backend/_core.py: -------------------------------------------------------------------------------- 1 | import ctypes 2 | import ctypes.util 3 | import os 4 | import pathlib 5 | import sys 6 | 7 | from mlir_finch.ir import Context 8 | from mlir_finch.passmanager import PassManager 9 | 10 | DEBUG = 
bool(int(os.environ.get("DEBUG", "0"))) 11 | CWD = pathlib.Path(".") 12 | 13 | finch_lib_path = f"{sys.prefix}/lib/python3.{sys.version_info.minor}/site-packages/lib" 14 | 15 | ld_library_path = os.environ.get("LD_LIBRARY_PATH") 16 | ld_library_path = f"{finch_lib_path}:{ld_library_path}" if ld_library_path is not None else finch_lib_path 17 | os.environ["LD_LIBRARY_PATH"] = ld_library_path 18 | 19 | MLIR_C_RUNNER_UTILS = ctypes.util.find_library("mlir_c_runner_utils") 20 | if os.name == "posix" and MLIR_C_RUNNER_UTILS is not None: 21 | MLIR_C_RUNNER_UTILS = f"{finch_lib_path}/{MLIR_C_RUNNER_UTILS}" 22 | 23 | SHARED_LIBS = [] 24 | if MLIR_C_RUNNER_UTILS is not None: 25 | SHARED_LIBS.append(MLIR_C_RUNNER_UTILS) 26 | 27 | libc = ctypes.CDLL(ctypes.util.find_library("c")) if os.name != "nt" else ctypes.cdll.msvcrt 28 | libc.free.argtypes = [ctypes.c_void_p] 29 | libc.free.restype = None 30 | 31 | OPT_LEVEL = 0 if DEBUG else 2 32 | 33 | # TODO: remove global state 34 | ctx = Context() 35 | 36 | pm = PassManager.parse( 37 | """ 38 | builtin.module( 39 | sparse-assembler{direct-out=true}, 40 | sparsifier{create-sparse-deallocs=1 enable-runtime-library=false} 41 | ) 42 | """, 43 | context=ctx, 44 | ) 45 | -------------------------------------------------------------------------------- /.github/workflows/release-drafter.yml: -------------------------------------------------------------------------------- 1 | name: Release Drafter 2 | 3 | on: 4 | push: 5 | # branches to consider in the event; optional, defaults to all 6 | branches: 7 | - main 8 | # pull_request event is required only for autolabeler 9 | pull_request: 10 | # Only following types are handled by the action, but one can default to all as well 11 | types: [opened, reopened, synchronize, edited] 12 | # pull_request_target event is required for autolabeler to support PRs from forks 13 | pull_request_target: 14 | types: [opened, reopened, synchronize, edited] 15 | 16 | permissions: 17 | contents: read 18 | 19 | jobs: 20 | update_release_draft: 21 | permissions: 22 | # write permission is required to create a github release 23 | contents: write 24 | # write permission is required for autolabeler 25 | # otherwise, read permission is required at least 26 | pull-requests: write 27 | runs-on: ubuntu-latest 28 | steps: 29 | # (Optional) GitHub Enterprise requires GHE_HOST variable set 30 | #- name: Set GHE_HOST 31 | # run: | 32 | # echo "GHE_HOST=${GITHUB_SERVER_URL##https:\/\/}" >> $GITHUB_ENV 33 | 34 | # Drafts your next Release notes as Pull Requests are merged into "main" 35 | - uses: release-drafter/release-drafter@v6 36 | # (Optional) specify config name to use, relative to .github/. 
Default: release-drafter.yml 37 | # with: 38 | # config-name: my-config.yml 39 | env: 40 | GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} 41 | -------------------------------------------------------------------------------- /examples/triangles_example.py: -------------------------------------------------------------------------------- 1 | import importlib 2 | import os 3 | 4 | import sparse 5 | 6 | import networkx as nx 7 | from utils import benchmark 8 | 9 | import numpy as np 10 | 11 | ITERS = 3 12 | 13 | 14 | if __name__ == "__main__": 15 | print("Counting Triangles Example:\n") 16 | 17 | G = nx.gnp_random_graph(n=200, p=0.2) 18 | 19 | # ======= Finch ======= 20 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 21 | importlib.reload(sparse) 22 | 23 | a_sps = nx.to_scipy_sparse_array(G) 24 | a = sparse.asarray(a_sps) 25 | 26 | @sparse.compiled() 27 | def count_triangles_finch(a): 28 | return sparse.sum(a @ a * a) / sparse.asarray(6) 29 | 30 | # Compile & Benchmark 31 | result_finch = benchmark(count_triangles_finch, args=[a], info="Finch", iters=ITERS) 32 | 33 | # ======= SciPy ======= 34 | def count_triangles_scipy(a): 35 | return (a @ a * a).sum() / 6 36 | 37 | a = nx.to_scipy_sparse_array(G) 38 | 39 | # Compile & Benchmark 40 | result_scipy = benchmark(count_triangles_scipy, args=[a], info="SciPy", iters=ITERS) 41 | 42 | # ======= NetworkX ======= 43 | def count_triangles_networkx(a): 44 | return sum(nx.triangles(a).values()) / 3 45 | 46 | a = G 47 | 48 | # Compile & Benchmark 49 | result_networkx = benchmark(count_triangles_networkx, args=[a], info="NetworkX", iters=ITERS) 50 | 51 | np.testing.assert_equal(result_finch.todense(), result_scipy) 52 | np.testing.assert_equal(result_finch.todense(), result_networkx) 53 | assert result_networkx == result_scipy 54 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | BSD 3-Clause License 2 | 3 | Copyright (c) 2018, Sparse developers 4 | All rights reserved. 5 | 6 | Redistribution and use in source and binary forms, with or without 7 | modification, are permitted provided that the following conditions are met: 8 | 9 | * Redistributions of source code must retain the above copyright notice, this 10 | list of conditions and the following disclaimer. 11 | 12 | * Redistributions in binary form must reproduce the above copyright notice, 13 | this list of conditions and the following disclaimer in the documentation 14 | and/or other materials provided with the distribution. 15 | 16 | * Neither the name of the copyright holder nor the names of its 17 | contributors may be used to endorse or promote products derived from 18 | this software without specific prior written permission. 19 | 20 | THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" 21 | AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE 22 | IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 23 | DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE 24 | FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 25 | DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR 26 | SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER 27 | CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, 28 | OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE 29 | OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 30 | -------------------------------------------------------------------------------- /sparse/mlir_backend/_array.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | from ._dtypes import DType 4 | from .formats import ConcreteFormat 5 | 6 | 7 | class Array: 8 | def __init__(self, *, storage, shape: tuple[int, ...]) -> None: 9 | storage_rank = storage.get_storage_format().rank 10 | if len(shape) != storage_rank: 11 | raise ValueError(f"Mismatched rank, `{storage_rank=}`, `{shape=}`") 12 | 13 | self._storage = storage 14 | self._shape = shape 15 | 16 | @property 17 | def shape(self) -> tuple[int, ...]: 18 | return self._shape 19 | 20 | @property 21 | def ndim(self) -> int: 22 | return len(self.shape) 23 | 24 | @property 25 | def dtype(self) -> DType: 26 | return self._storage.get_storage_format().dtype 27 | 28 | @property 29 | def format(self) -> ConcreteFormat: 30 | return self._storage.get_storage_format() 31 | 32 | def _get_mlir_type(self): 33 | return self.format._get_mlir_type(shape=self.shape) 34 | 35 | def _to_module_arg(self): 36 | return self._storage.to_module_arg() 37 | 38 | def copy(self) -> "Array": 39 | from ._conversions import from_constituent_arrays 40 | 41 | arrs = tuple(arr.copy() for arr in self.get_constituent_arrays()) 42 | return from_constituent_arrays(format=self.format, arrays=arrs, shape=self.shape) 43 | 44 | def asformat(self, format: ConcreteFormat) -> "Array": 45 | from ._ops import asformat 46 | 47 | return asformat(self, format=format) 48 | 49 | def get_constituent_arrays(self) -> tuple[np.ndarray, ...]: 50 | return self._storage.get_constituent_arrays() 51 | -------------------------------------------------------------------------------- /examples/mttkrp_example.py: -------------------------------------------------------------------------------- 1 | import importlib 2 | import os 3 | 4 | import sparse 5 | 6 | from utils import benchmark 7 | 8 | import numpy as np 9 | 10 | I_ = 1000 11 | J_ = 25 12 | K_ = 1000 13 | L_ = 100 14 | DENSITY = 0.0001 15 | ITERS = 3 16 | rng = np.random.default_rng(0) 17 | 18 | 19 | if __name__ == "__main__": 20 | print("MTTKRP Example:\n") 21 | 22 | B_sps = sparse.random((I_, K_, L_), density=DENSITY, random_state=rng) * 10 23 | D_sps = rng.random((L_, J_)) * 10 24 | C_sps = rng.random((K_, J_)) * 10 25 | 26 | # ======= Finch ======= 27 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 28 | importlib.reload(sparse) 29 | 30 | B = sparse.asarray(B_sps.todense(), format="csf") 31 | D = sparse.asarray(np.array(D_sps, order="F")) 32 | C = sparse.asarray(np.array(C_sps, order="F")) 33 | 34 | @sparse.compiled() 35 | def mttkrp_finch(B, D, C): 36 | return sparse.sum(B[:, :, :, None] * D[None, None, :, :] * C[None, :, None, :], axis=(1, 2)) 37 | 38 | # Compile & Benchmark 39 | result_finch = benchmark(mttkrp_finch, args=[B, D, C], info="Finch", iters=ITERS) 40 | 41 | # ======= Numba ======= 42 | os.environ[sparse._ENV_VAR_NAME] = "Numba" 43 | 
importlib.reload(sparse) 44 | 45 | B = sparse.asarray(B_sps, format="gcxs") 46 | D = D_sps 47 | C = C_sps 48 | 49 | def mttkrp_numba(B, D, C): 50 | return sparse.sum(B[:, :, :, None] * D[None, None, :, :] * C[None, :, None, :], axis=(1, 2)) 51 | 52 | # Compile & Benchmark 53 | result_numba = benchmark(mttkrp_numba, args=[B, D, C], info="Numba", iters=ITERS) 54 | 55 | np.testing.assert_allclose(result_finch.todense(), result_numba.todense()) 56 |
-------------------------------------------------------------------------------- /benchmarks_original/mttkrp_example.py: --------------------------------------------------------------------------------
1 | import importlib 2 | import os 3 | 4 | import sparse 5 | 6 | from utils import benchmark 7 | 8 | import numpy as np 9 | 10 | I_ = 1000 11 | J_ = 25 12 | K_ = 1000 13 | L_ = 100 14 | DENSITY = 0.0001 15 | ITERS = 3 16 | rng = np.random.default_rng(0) 17 | 18 | 19 | if __name__ == "__main__": 20 | print("MTTKRP Example:\n") 21 | 22 | B_sps = sparse.random((I_, K_, L_), density=DENSITY, random_state=rng) * 10 23 | D_sps = rng.random((L_, J_)) * 10 24 | C_sps = rng.random((K_, J_)) * 10 25 | 26 | # ======= Finch ======= 27 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 28 | importlib.reload(sparse) 29 | 30 | B = sparse.asarray(B_sps.todense(), format="csf") 31 | D = sparse.asarray(np.array(D_sps, order="F")) 32 | C = sparse.asarray(np.array(C_sps, order="F")) 33 | 34 | @sparse.compiled() 35 | def mttkrp_finch(B, D, C): 36 | return sparse.sum(B[:, :, :, None] * D[None, None, :, :] * C[None, :, None, :], axis=(1, 2)) 37 | 38 | # Compile & Benchmark 39 | result_finch = benchmark(mttkrp_finch, args=[B, D, C], info="Finch", iters=ITERS) 40 | 41 | # ======= Numba ======= 42 | os.environ[sparse._ENV_VAR_NAME] = "Numba" 43 | importlib.reload(sparse) 44 | 45 | B = sparse.asarray(B_sps, format="gcxs") 46 | D = D_sps 47 | C = C_sps 48 | 49 | def mttkrp_numba(B, D, C): 50 | return sparse.sum(B[:, :, :, None] * D[None, None, :, :] * C[None, :, None, :], axis=(1, 2)) 51 | 52 | # Compile & Benchmark 53 | result_numba = benchmark(mttkrp_numba, args=[B, D, C], info="Numba", iters=ITERS) 54 | 55 | np.testing.assert_allclose(result_finch.todense(), result_numba.todense()) 56 |
-------------------------------------------------------------------------------- /.github/release-drafter.yml: --------------------------------------------------------------------------------
1 | exclude-labels: 2 | # A PR will not be classified if it has these labels 3 | - skip changelog 4 | - release 5 | name-template: 'Sparse v$RESOLVED_VERSION' 6 | 7 | change-template: '- $TITLE (#$NUMBER)' 8 | 9 | autolabeler: 10 | - label: breaking 11 | title: 12 | # Example: feat!: ...
13 | - '/^(build|chore|ci|depr|docs|feat|fix|perf|refactor|release|test)(\(.*\))?\!\: /' 14 | - label: build 15 | title: 16 | - '/^(build)/' 17 | - label: internal 18 | title: 19 | - '/^(chore|ci|refactor|test)/' 20 | - label: deprecation 21 | title: 22 | - '/^depr/' 23 | - label: documentation 24 | title: 25 | - '/^(docs|docstring)/' 26 | - label: enhancement 27 | title: 28 | - '/^feat/' 29 | - label: fix 30 | title: 31 | - '/^fix/' 32 | - label: performance 33 | title: 34 | - '/^perf/' 35 | - label: release 36 | title: 37 | - '/^release/' 38 | - label: 'skip changelog' 39 | title: 40 | - '/^\[pre-commit.ci\]/' 41 | categories: 42 | - title: 📣 Highlights 43 | labels: highlight 44 | - title: 🧨 Breaking changes 45 | labels: 46 | - breaking 47 | - breaking python 48 | - title: 🚧 Deprecations 49 | labels: deprecation 50 | - title: 🪄 Performance improvements 51 | labels: performance 52 | - title: 🎊 Enhancements 53 | labels: enhancement 54 | - title: 🐞 Bug fixes 55 | labels: fix 56 | - title: 📚 Documentation 57 | labels: documentation 58 | - title: 🧰 Build system 59 | labels: build 60 | - title: 🔧 Other improvements 61 | labels: internal 62 | 63 | template: | 64 | ## Changes 65 | 66 | $CHANGES 67 | 68 | Thank you to all our contributors for making this release possible! 69 | $CONTRIBUTORS 70 | -------------------------------------------------------------------------------- /sparse/__init__.py: -------------------------------------------------------------------------------- 1 | import os 2 | import warnings 3 | from enum import Enum 4 | 5 | from ._version import __version__, __version_tuple__ # noqa: F401 6 | 7 | __array_api_version__ = "2024.12" 8 | 9 | 10 | class _BackendType(Enum): 11 | Numba = "Numba" 12 | Finch = "Finch" 13 | MLIR = "MLIR" 14 | 15 | 16 | _ENV_VAR_NAME = "SPARSE_BACKEND" 17 | 18 | 19 | class SparseFutureWarning(FutureWarning): 20 | pass 21 | 22 | 23 | if os.environ.get(_ENV_VAR_NAME, "") != "": 24 | warnings.warn( 25 | "Changing back-ends is a development feature, please do not rely on it in production.", 26 | SparseFutureWarning, 27 | stacklevel=1, 28 | ) 29 | _backend_name = os.environ[_ENV_VAR_NAME] 30 | else: 31 | _backend_name = _BackendType.Numba.value 32 | 33 | if _backend_name not in {v.value for v in _BackendType}: 34 | warnings.warn(f"Invalid backend identifier: {_backend_name}. 
Selecting Numba backend.", UserWarning, stacklevel=1) 35 | _BACKEND = _BackendType.Numba 36 | else: 37 | _BACKEND = _BackendType[_backend_name] 38 | 39 | del _backend_name 40 | 41 | if _BackendType.Finch == _BACKEND: 42 | from sparse.finch_backend import * # noqa: F403 43 | from sparse.finch_backend import __all__ 44 | elif _BackendType.MLIR == _BACKEND: 45 | from sparse.mlir_backend import * # noqa: F403 46 | from sparse.mlir_backend import __all__ 47 | else: 48 | from sparse.numba_backend import * # noqa: F403 49 | from sparse.numba_backend import ( # noqa: F401 50 | __all__, 51 | __array_namespace_info__, 52 | _common, 53 | _compressed, 54 | _coo, 55 | _dok, 56 | _io, 57 | _numba_extension, 58 | _settings, 59 | _slicing, 60 | _sparse_array, 61 | _umath, 62 | _utils, 63 | ) 64 | -------------------------------------------------------------------------------- /sparse/numba_backend/tests/test_conversion.py: -------------------------------------------------------------------------------- 1 | import sparse 2 | from sparse.numba_backend._utils import assert_eq 3 | 4 | import pytest 5 | 6 | import numpy as np 7 | import scipy.sparse as sps 8 | 9 | FORMATS_ND = [ 10 | sparse.COO, 11 | sparse.DOK, 12 | sparse.GCXS, 13 | ] 14 | 15 | FORMATS_2D = [ 16 | sparse.numba_backend._compressed.CSC, 17 | sparse.numba_backend._compressed.CSR, 18 | ] 19 | 20 | FORMATS = FORMATS_2D + FORMATS_ND 21 | 22 | 23 | @pytest.mark.parametrize("format1", FORMATS) 24 | @pytest.mark.parametrize("format2", FORMATS) 25 | def test_conversion(format1, format2): 26 | x = sparse.random((10, 10), density=0.5, format=format1, fill_value=0.5) 27 | y = x.asformat(format2) 28 | assert_eq(x, y) 29 | 30 | 31 | def test_extra_kwargs(): 32 | x = sparse.full((2, 2), 1, format="gcxs", compressed_axes=[1]) 33 | y = sparse.full_like(x, 1) 34 | 35 | assert_eq(x, y) 36 | 37 | 38 | @pytest.mark.parametrize("format1", FORMATS_ND) 39 | @pytest.mark.parametrize("format2", FORMATS_ND) 40 | def test_conversion_scalar(format1, format2): 41 | x = sparse.random((), format=format1, fill_value=0.5) 42 | y = x.asformat(format2) 43 | assert_eq(x, y) 44 | 45 | 46 | def test_non_canonical_conversion(): 47 | """ 48 | Regression test for gh-602. 
49 | 50 | Adapted from https://github.com/LiberTEM/sparseconverter/blob/4cfc0ee2ad4c37b07742db8f3643bcbd858a4e85/src/sparseconverter/__init__.py#L154-L183 51 | """ 52 | data = np.array((2.0, 1.0, 3.0, 3.0, 1.0)) 53 | indices = np.array((1, 0, 0, 1, 1), dtype=int) 54 | indptr = np.array((0, 2, 5), dtype=int) 55 | 56 | x = sps.csr_matrix((data, indices, indptr), shape=(2, 2)) 57 | ref = np.array(((1.0, 2.0), (3.0, 4.0))) 58 | 59 | gcxs_check = sparse.GCXS(x) 60 | assert np.all(gcxs_check[:1].todense() == ref[:1]) and np.all(gcxs_check[1:].todense() == ref[1:]) 61 |
-------------------------------------------------------------------------------- /examples/matmul_example.py: --------------------------------------------------------------------------------
1 | import importlib 2 | import os 3 | 4 | import sparse 5 | 6 | from utils import benchmark 7 | 8 | import numpy as np 9 | import scipy.sparse as sps 10 | 11 | LEN = 100000 12 | DENSITY = 0.00001 13 | ITERS = 3 14 | rng = np.random.default_rng(0) 15 | 16 | 17 | if __name__ == "__main__": 18 | print("Matmul Example:\n") 19 | 20 | a_sps = sps.random(LEN, LEN - 10, format="csr", density=DENSITY, random_state=rng) * 10 21 | a_sps.sum_duplicates() 22 | b_sps = sps.random(LEN - 10, LEN, format="csr", density=DENSITY, random_state=rng) * 10 23 | b_sps.sum_duplicates() 24 | 25 | # ======= Finch ======= 26 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 27 | importlib.reload(sparse) 28 | 29 | a = sparse.asarray(a_sps) 30 | b = sparse.asarray(b_sps) 31 | 32 | @sparse.compiled() 33 | def matmul_finch(a, b): 34 | return a @ b 35 | 36 | # Compile & Benchmark 37 | result_finch = benchmark(matmul_finch, args=[a, b], info="Finch", iters=ITERS) 38 | 39 | # ======= Numba ======= 40 | os.environ[sparse._ENV_VAR_NAME] = "Numba" 41 | importlib.reload(sparse) 42 | 43 | a = sparse.asarray(a_sps) 44 | b = sparse.asarray(b_sps) 45 | 46 | def matmul_numba(a, b): 47 | return a @ b 48 | 49 | # Compile & Benchmark 50 | result_numba = benchmark(matmul_numba, args=[a, b], info="Numba", iters=ITERS) 51 | 52 | # ======= SciPy ======= 53 | def matmul_scipy(a, b): 54 | return a @ b 55 | 56 | a = a_sps 57 | b = b_sps 58 | 59 | # Compile & Benchmark 60 | result_scipy = benchmark(matmul_scipy, args=[a, b], info="SciPy", iters=ITERS) 61 | 62 | # np.testing.assert_allclose(result_numba.todense(), result_scipy.toarray()) 63 | # np.testing.assert_allclose(result_finch.todense(), result_numba.todense()) 64 | # np.testing.assert_allclose(result_finch.todense(), result_scipy.toarray()) 65 |
-------------------------------------------------------------------------------- /benchmarks_original/matmul_example.py: --------------------------------------------------------------------------------
1 | import importlib 2 | import os 3 | 4 | import sparse 5 | 6 | from utils import benchmark 7 | 8 | import numpy as np 9 | import scipy.sparse as sps 10 | 11 | LEN = 100000 12 | DENSITY = 0.00001 13 | ITERS = 3 14 | rng = np.random.default_rng(0) 15 | 16 | 17 | if __name__ == "__main__": 18 | print("Matmul Example:\n") 19 | 20 | a_sps = sps.random(LEN, LEN - 10, format="csr", density=DENSITY, random_state=rng) * 10 21 | a_sps.sum_duplicates() 22 | b_sps = sps.random(LEN - 10, LEN, format="csr", density=DENSITY, random_state=rng) * 10 23 | b_sps.sum_duplicates() 24 | 25 | # ======= Finch ======= 26 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 27 | importlib.reload(sparse) 28 | 29 | a = sparse.asarray(a_sps) 30 | b = sparse.asarray(b_sps) 31 | 32 | @sparse.compiled() 33 | def matmul_finch(a, b): 34 | return a @ b 35 | 36 | # Compile & Benchmark 37 | result_finch = benchmark(matmul_finch, args=[a, b], info="Finch", iters=ITERS) 38 | 39 | # ======= Numba ======= 40 | os.environ[sparse._ENV_VAR_NAME] = "Numba" 41 | importlib.reload(sparse) 42 | 43 | a = sparse.asarray(a_sps) 44 | b = sparse.asarray(b_sps) 45 | 46 | def matmul_numba(a, b): 47 | return a @ b 48 | 49 | # Compile & Benchmark 50 | result_numba = benchmark(matmul_numba, args=[a, b], info="Numba", iters=ITERS) 51 | 52 | # ======= SciPy ======= 53 | def matmul_scipy(a, b): 54 | return a @ b 55 | 56 | a = a_sps 57 | b = b_sps 58 | 59 | # Compile & Benchmark 60 | result_scipy = benchmark(matmul_scipy, args=[a, b], info="SciPy", iters=ITERS) 61 | 62 | # np.testing.assert_allclose(result_numba.todense(), result_scipy.toarray()) 63 | # np.testing.assert_allclose(result_finch.todense(), result_numba.todense()) 64 | # np.testing.assert_allclose(result_finch.todense(), result_scipy.toarray()) 65 |
-------------------------------------------------------------------------------- /docs/quickstart.md: --------------------------------------------------------------------------------
1 | # Getting Started 2 | 3 | ## Install 4 | 5 | If you haven't already, install the `sparse` library 6 | 7 | ```bash 8 | pip install sparse 9 | ``` 10 | 11 | ## Create 12 | 13 | To start, let's construct a sparse [`sparse.COO`][] array from a [`numpy.ndarray`][]: 14 | 15 | ```python 16 | 17 | import numpy as np 18 | import sparse 19 | 20 | x = np.random.random((100, 100, 100)) 21 | x[x < 0.9] = 0  # fill most of the array with zeros 22 | 23 | s = sparse.COO(x)  # convert to sparse array 24 | ``` 25 | 26 | These store the same information and support many of the same operations, 27 | but the sparse version takes up less space in memory 28 | 29 | ```python 30 | >>> x.nbytes 31 | 8000000 32 | >>> s.nbytes 33 | 1102706 34 | >>> s 35 | 36 | ``` 37 | 38 | For more efficient ways to construct sparse arrays, 39 | see documentation on [Construct sparse arrays][construct-sparse-arrays]. 40 | 41 | ## Compute 42 | 43 | Many of the normal NumPy operations work on [`sparse.COO`][] objects just like on [`numpy.ndarray`][] objects. 44 | This includes arithmetic, [`numpy.ufunc`][] operations, and functions like `tensordot` and `transpose`. 45 | 46 | ```python 47 | >>> np.sin(s) + s.T * 1 48 | 49 | ``` 50 | 51 | However, operations which map zero elements to nonzero will usually change the fill-value 52 | instead of raising an error. A short example of this behavior appears at the end of this page. 53 | 54 | ```python 55 | >>> y = s + 5 56 | 57 | ``` 58 | 59 | If you're sure you want to convert a sparse array to a dense one, 60 | you can use the ``todense`` method (which will result in a [`numpy.ndarray`][]): 61 | 62 | ```python 63 | y = s.todense() + 5 64 | ``` 65 | 66 | For more operations see the [operations][operators] 67 | or the [API reference page](../../api/).
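As promised above, here is a minimal sketch of the fill-value behavior. Everything below uses the
public API shown on this page; since the array is generated at random, the exact stored values will
differ from run to run:

```python
import numpy as np
import sparse

s = sparse.random((2, 3, 4), density=0.25)  # COO array whose fill value is 0.0

y = s + 5  # maps the zero elements to 5, so the fill value changes instead of erroring
assert y.fill_value == 5
assert np.array_equal(y.todense(), s.todense() + 5)
```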
68 | -------------------------------------------------------------------------------- /sparse/mlir_backend/_common.py: -------------------------------------------------------------------------------- 1 | import ctypes 2 | import functools 3 | import weakref 4 | from collections.abc import Iterable 5 | 6 | import mlir_finch.runtime as rt 7 | 8 | import numpy as np 9 | 10 | from ._core import libc 11 | from ._dtypes import DType, asdtype 12 | 13 | 14 | def fn_cache(f, maxsize: int | None = None): 15 | return functools.wraps(f)(functools.lru_cache(maxsize=maxsize)(f)) 16 | 17 | 18 | def get_nd_memref_descr(rank: int, dtype: DType) -> ctypes.Structure: 19 | return _get_nd_memref_descr(int(rank), asdtype(dtype)) 20 | 21 | 22 | @fn_cache 23 | def _get_nd_memref_descr(rank: int, dtype: DType) -> ctypes.Structure: 24 | return rt.make_nd_memref_descriptor(rank, dtype.to_ctype()) 25 | 26 | 27 | def numpy_to_ranked_memref(arr: np.ndarray) -> ctypes.Structure: 28 | memref = rt.get_ranked_memref_descriptor(arr) 29 | memref_descr = get_nd_memref_descr(arr.ndim, asdtype(arr.dtype)) 30 | # Required due to ctypes type checks 31 | return memref_descr( 32 | allocated=memref.allocated, 33 | aligned=memref.aligned, 34 | offset=memref.offset, 35 | shape=memref.shape, 36 | strides=memref.strides, 37 | ) 38 | 39 | 40 | def ranked_memref_to_numpy(ref: ctypes.Structure) -> np.ndarray: 41 | return rt.ranked_memref_to_numpy([ref]) 42 | 43 | 44 | def free_memref(obj: ctypes.Structure) -> None: 45 | libc.free(ctypes.cast(obj.allocated, ctypes.c_void_p)) 46 | 47 | 48 | def _hold_ref(owner, obj): 49 | ptr = ctypes.py_object(obj) 50 | ctypes.pythonapi.Py_IncRef(ptr) 51 | 52 | def finalizer(ptr): 53 | ctypes.pythonapi.Py_DecRef(ptr) 54 | 55 | weakref.finalize(owner, finalizer, ptr) 56 | 57 | 58 | def as_shape(x) -> tuple[int]: 59 | if not isinstance(x, Iterable): 60 | x = (x,) 61 | 62 | if not all(isinstance(xi, int) for xi in x): 63 | raise TypeError("Shape must be an `int` or tuple of `int`s.") 64 | 65 | return tuple(int(xi) for xi in x) 66 | -------------------------------------------------------------------------------- /examples/spmv_add_example.py: -------------------------------------------------------------------------------- 1 | import importlib 2 | import os 3 | 4 | import sparse 5 | 6 | from utils import benchmark 7 | 8 | import numpy as np 9 | import scipy.sparse as sps 10 | 11 | LEN = 100000 12 | DENSITY = 0.000001 13 | ITERS = 3 14 | rng = np.random.default_rng(0) 15 | 16 | 17 | if __name__ == "__main__": 18 | print("SpMv_add Example:\n") 19 | 20 | A_sps = sps.random(LEN - 10, LEN, format="csc", density=DENSITY, random_state=rng) * 10 21 | x_sps = rng.random((LEN, 1)) * 10 22 | y_sps = rng.random((LEN - 10, 1)) * 10 23 | 24 | # ======= Finch ======= 25 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 26 | importlib.reload(sparse) 27 | 28 | A = sparse.asarray(A_sps) 29 | x = sparse.asarray(np.array(x_sps, order="C")) 30 | y = sparse.asarray(np.array(y_sps, order="C")) 31 | 32 | @sparse.compiled() 33 | def spmv_finch(A, x, y): 34 | return sparse.sum(A[:, None, :] * sparse.permute_dims(x, (1, 0))[None, :, :], axis=-1) + y 35 | 36 | # Compile & Benchmark 37 | result_finch = benchmark(spmv_finch, args=[A, x, y], info="Finch", iters=ITERS) 38 | 39 | # ======= Numba ======= 40 | os.environ[sparse._ENV_VAR_NAME] = "Numba" 41 | importlib.reload(sparse) 42 | 43 | A = sparse.asarray(A_sps, format="csc") 44 | x = x_sps 45 | y = y_sps 46 | 47 | def spmv_numba(A, x, y): 48 | return A @ x + y 49 | 50 | # Compile & Benchmark 51 
| result_numba = benchmark(spmv_numba, args=[A, x, y], info="Numba", iters=ITERS) 52 | 53 | # ======= SciPy ======= 54 | def spmv_scipy(A, x, y): 55 | return A @ x + y 56 | 57 | A = A_sps 58 | x = x_sps 59 | y = y_sps 60 | 61 | # Compile & Benchmark 62 | result_scipy = benchmark(spmv_scipy, args=[A, x, y], info="SciPy", iters=ITERS) 63 | 64 | np.testing.assert_allclose(result_numba, result_scipy) 65 | np.testing.assert_allclose(result_finch.todense(), result_numba) 66 | np.testing.assert_allclose(result_finch.todense(), result_scipy) 67 |
-------------------------------------------------------------------------------- /benchmarks_original/spmv_add_example.py: --------------------------------------------------------------------------------
1 | import importlib 2 | import os 3 | 4 | import sparse 5 | 6 | from utils import benchmark 7 | 8 | import numpy as np 9 | import scipy.sparse as sps 10 | 11 | LEN = 100000 12 | DENSITY = 0.000001 13 | ITERS = 3 14 | rng = np.random.default_rng(0) 15 | 16 | 17 | if __name__ == "__main__": 18 | print("SpMv_add Example:\n") 19 | 20 | A_sps = sps.random(LEN - 10, LEN, format="csc", density=DENSITY, random_state=rng) * 10 21 | x_sps = rng.random((LEN, 1)) * 10 22 | y_sps = rng.random((LEN - 10, 1)) * 10 23 | 24 | # ======= Finch ======= 25 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 26 | importlib.reload(sparse) 27 | 28 | A = sparse.asarray(A_sps) 29 | x = sparse.asarray(np.array(x_sps, order="C")) 30 | y = sparse.asarray(np.array(y_sps, order="C")) 31 | 32 | @sparse.compiled() 33 | def spmv_finch(A, x, y): 34 | return sparse.sum(A[:, None, :] * sparse.permute_dims(x, (1, 0))[None, :, :], axis=-1) + y 35 | 36 | # Compile & Benchmark 37 | result_finch = benchmark(spmv_finch, args=[A, x, y], info="Finch", iters=ITERS) 38 | 39 | # ======= Numba ======= 40 | os.environ[sparse._ENV_VAR_NAME] = "Numba" 41 | importlib.reload(sparse) 42 | 43 | A = sparse.asarray(A_sps, format="csc") 44 | x = x_sps 45 | y = y_sps 46 | 47 | def spmv_numba(A, x, y): 48 | return A @ x + y 49 | 50 | # Compile & Benchmark 51 | result_numba = benchmark(spmv_numba, args=[A, x, y], info="Numba", iters=ITERS) 52 | 53 | # ======= SciPy ======= 54 | def spmv_scipy(A, x, y): 55 | return A @ x + y 56 | 57 | A = A_sps 58 | x = x_sps 59 | y = y_sps 60 | 61 | # Compile & Benchmark 62 | result_scipy = benchmark(spmv_scipy, args=[A, x, y], info="SciPy", iters=ITERS) 63 | 64 | np.testing.assert_allclose(result_numba, result_scipy) 65 | np.testing.assert_allclose(result_finch.todense(), result_numba) 66 | np.testing.assert_allclose(result_finch.todense(), result_scipy) 67 |
-------------------------------------------------------------------------------- /docs/migration-jl.md: --------------------------------------------------------------------------------
1 | # Migration to the Finch Julia backend 2 | To switch to the Finch Julia backend, set the environment variable `SPARSE_BACKEND="Finch"`, then continue using `sparse` as usual. 3 | 4 | While this is largely compatible with the Array API, support for some functions may not be present, and API compatibility isn't strictly preserved with the default (Numba) backend. 5 | 6 | However, the new backend has a large performance benefit over the default backend. Below, you will find a table of common invocations, with their equivalents in the Finch Julia backend. The most common change is a standard API for construction of arrays. 7 | 8 | | Numba Backend<br>(`SPARSE_BACKEND="Numba"`) | Finch Julia Backend<br>(`SPARSE_BACKEND="Finch"`) | Notes | 9 | |---------------------------------------------|----------------------------------------------------|-------| 10 | | `sparse.COO.from_numpy(arr, fill_value=fv)`<br>`sparse.COO.from_scipy(arr)`<br>`sparse.COO(x)` | `sparse.asarray(x, format="coo", [fill_value=fv])` | Doesn't support pulling out individual arrays | 11 | | `sparse.GCXS.from_numpy(arr, fill_value=fv)`<br>`sparse.GCXS.from_scipy(arr)`<br>`sparse.GCXS(x)` | `sparse.asarray(x, format="csf", [fill_value=fv])` | Format might not be a 1:1 match | 12 | | `sparse.DOK.from_numpy(arr, fill_value=fv)`<br>`sparse.DOK.from_scipy(arr)`<br>`sparse.DOK(x)` | `sparse.asarray(x, format="dok", [fill_value=fv])` | Format might not be a 1:1 match | 13 | 14 | Most things work as expected, with the following exceptions, which aren't defined yet for Finch: 15 | 16 | * `sparse.broadcast_to` 17 | * `sparse.solve` 18 | * Statistical functions: `mean`, `std`, `var` 19 | * `sparse.isdtype` 20 | * `sparse.reshape` 21 | * Some elementwise functions 22 | * Manipulation functions: `concat`, `expand_dims`, `squeeze`, `flip`, `roll`, `stack` 23 | * `arg*` functions: `argmin`, `argmax` 24 | * Sorting functions: `sort`, `argsort` 25 | 26 | IEEE-754 compliance is hard to maintain with sparse arrays in general. This is even more true of the Julia backend, which trades off IEEE-754 compliance for performance.
-------------------------------------------------------------------------------- /sparse/numba_backend/tests/test_coo_numba.py: --------------------------------------------------------------------------------
1 | import sparse 2 | 3 | import numba 4 | 5 | import numpy as np 6 | 7 | 8 | @numba.njit 9 | def identity(x): 10 | """Pass an object through numba and back""" 11 | return x 12 | 13 | 14 | def identity_constant(x): 15 | @numba.njit 16 | def get_it(): 17 | """Pass an object through numba and back as a constant""" 18 | return x 19 | 20 | return get_it() 21 | 22 | 23 | def assert_coo_equal(c1, c2): 24 | assert c1.shape == c2.shape 25 | assert sparse.all(c1 == c2) 26 | assert c1.data.dtype == c2.data.dtype 27 | assert c1.fill_value == c2.fill_value 28 | 29 | 30 | def assert_coo_same_memory(c1, c2): 31 | assert_coo_equal(c1, c2) 32 | assert c1.coords.data == c2.coords.data 33 | assert c1.data.data == c2.data.data 34 | 35 | 36 | class TestBasic: 37 | """Test very simple construction and field access""" 38 | 39 | def test_roundtrip(self): 40 | c1 = sparse.COO(np.eye(3), fill_value=1) 41 | c2 = identity(c1) 42 | assert type(c1) is type(c2) 43 | assert_coo_same_memory(c1, c2) 44 | 45 | def test_roundtrip_constant(self): 46 | c1 = sparse.COO(np.eye(3), fill_value=1) 47 | c2 = identity_constant(c1) 48 | # constants are always copies 49 | assert_coo_equal(c1, c2) 50 | 51 | def test_unpack_attrs(self): 52 | @numba.njit 53 | def unpack(c): 54 | return c.coords, c.data, c.shape, c.fill_value 55 | 56 | c1 = sparse.COO(np.eye(3), fill_value=1) 57 | coords, data, shape, fill_value = unpack(c1) 58 | c2 = sparse.COO(coords, data, shape, fill_value=fill_value) 59 | assert_coo_same_memory(c1, c2) 60 | 61 | def test_repack_attrs(self): 62 | @numba.njit 63 | def pack(coords, data, shape): 64 | return sparse.COO(coords, data, shape) 65 | 66 | # repacking fill_value isn't possible yet 67 | c1 = sparse.COO(np.eye(3)) 68 | c2 = pack(c1.coords, c1.data, c1.shape) 69 | assert_coo_same_memory(c1, c2) 70 |
-------------------------------------------------------------------------------- /benchmarks/test_elemwise.py: --------------------------------------------------------------------------------
1 | import importlib 2 | import operator 3 | import os 4 | 5 | import sparse 6 | 7 | import pytest 8 | 9 | import scipy.sparse as sps 10 | 11 | DENSITY = 0.001 12 | 13 | 14 | def get_test_id(side): 15 | return f"{side=}" 16 | 17 | 18 | @pytest.fixture(params=[100, 500, 1000], ids=get_test_id) 19 | def elemwise_args(request, rng, max_size): 20 | side = request.param 21 | if side**2 >= max_size: 22 | pytest.skip() 23 | s1_sps = sps.random(side, side, format="csr", density=DENSITY, random_state=rng) * 10 24 | s1_sps.sum_duplicates() 25 | s2_sps = sps.random(side, side,
format="csr", density=DENSITY, random_state=rng) * 10 26 | s2_sps.sum_duplicates() 27 | return s1_sps, s2_sps 28 | 29 | 30 | @pytest.fixture(params=[operator.add, operator.mul, operator.gt]) 31 | def elemwise_function(request): 32 | return request.param 33 | 34 | 35 | @pytest.fixture(params=["SciPy", "Numba", "Finch"]) 36 | def backend_name(request): 37 | return request.param 38 | 39 | 40 | @pytest.fixture 41 | def backend_setup(backend_name): 42 | os.environ[sparse._ENV_VAR_NAME] = backend_name 43 | importlib.reload(sparse) 44 | yield sparse, backend_name 45 | del os.environ[sparse._ENV_VAR_NAME] 46 | importlib.reload(sparse) 47 | 48 | 49 | @pytest.fixture 50 | def sparse_arrays(elemwise_args, backend_setup): 51 | s1_sps, s2_sps = elemwise_args 52 | sparse, backend_name = backend_setup 53 | 54 | if backend_name == "SciPy": 55 | s1 = s1_sps 56 | s2 = s2_sps 57 | elif backend_name == "Numba": 58 | s1 = sparse.asarray(s1_sps) 59 | s2 = sparse.asarray(s2_sps) 60 | elif backend_name == "Finch": 61 | s1 = sparse.asarray(s1_sps.asformat("csc"), format="csc") 62 | s2 = sparse.asarray(s2_sps.asformat("csc"), format="csc") 63 | 64 | return s1, s2 65 | 66 | 67 | def test_elemwise(benchmark, elemwise_function, sparse_arrays): 68 | s1, s2 = sparse_arrays 69 | 70 | elemwise_function(s1, s2) 71 | 72 | @benchmark 73 | def bench(): 74 | elemwise_function(s1, s2) 75 | -------------------------------------------------------------------------------- /examples/sddmm_example.py: -------------------------------------------------------------------------------- 1 | import importlib 2 | import os 3 | 4 | import sparse 5 | 6 | from utils import benchmark 7 | 8 | import numpy as np 9 | import scipy.sparse as sps 10 | 11 | LEN = 10000 12 | DENSITY = 0.00001 13 | ITERS = 3 14 | rng = np.random.default_rng(0) 15 | 16 | 17 | if __name__ == "__main__": 18 | print("SDDMM Example:\n") 19 | 20 | a_sps = rng.random((LEN, LEN - 10)) * 10 21 | b_sps = rng.random((LEN - 10, LEN)) * 10 22 | s_sps = sps.random(LEN, LEN, format="coo", density=DENSITY, random_state=rng) * 10 23 | s_sps.sum_duplicates() 24 | 25 | # ======= Finch ======= 26 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 27 | importlib.reload(sparse) 28 | 29 | s = sparse.asarray(s_sps) 30 | a = sparse.asarray(np.array(a_sps, order="F")) 31 | b = sparse.asarray(np.array(b_sps, order="C")) 32 | 33 | @sparse.compiled() 34 | def sddmm_finch(s, a, b): 35 | return sparse.sum( 36 | s[:, :, None] * (a[:, None, :] * sparse.permute_dims(b, (1, 0))[None, :, :]), 37 | axis=-1, 38 | ) 39 | 40 | # Compile & Benchmark 41 | result_finch = benchmark(sddmm_finch, args=[s, a, b], info="Finch", iters=ITERS) 42 | 43 | # ======= Numba ======= 44 | os.environ[sparse._ENV_VAR_NAME] = "Numba" 45 | importlib.reload(sparse) 46 | 47 | s = sparse.asarray(s_sps) 48 | a = a_sps 49 | b = b_sps 50 | 51 | def sddmm_numba(s, a, b): 52 | return s * (a @ b) 53 | 54 | # Compile & Benchmark 55 | result_numba = benchmark(sddmm_numba, args=[s, a, b], info="Numba", iters=ITERS) 56 | 57 | # ======= SciPy ======= 58 | def sddmm_scipy(s, a, b): 59 | return s.multiply(a @ b) 60 | 61 | s = s_sps.asformat("csr") 62 | a = a_sps 63 | b = b_sps 64 | 65 | # Compile & Benchmark 66 | result_scipy = benchmark(sddmm_scipy, args=[s, a, b], info="SciPy", iters=ITERS) 67 | 68 | np.testing.assert_allclose(result_numba.todense(), result_scipy.toarray()) 69 | np.testing.assert_allclose(result_finch.todense(), result_numba.todense()) 70 | np.testing.assert_allclose(result_finch.todense(), result_scipy.toarray()) 71 | 
-------------------------------------------------------------------------------- /benchmarks_original/sddmm_example.py: -------------------------------------------------------------------------------- 1 | import importlib 2 | import os 3 | 4 | import sparse 5 | 6 | from utils import benchmark 7 | 8 | import numpy as np 9 | import scipy.sparse as sps 10 | 11 | LEN = 10000 12 | DENSITY = 0.00001 13 | ITERS = 3 14 | rng = np.random.default_rng(0) 15 | 16 | 17 | if __name__ == "__main__": 18 | print("SDDMM Example:\n") 19 | 20 | a_sps = rng.random((LEN, LEN - 10)) * 10 21 | b_sps = rng.random((LEN - 10, LEN)) * 10 22 | s_sps = sps.random(LEN, LEN, format="coo", density=DENSITY, random_state=rng) * 10 23 | s_sps.sum_duplicates() 24 | 25 | # ======= Finch ======= 26 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 27 | importlib.reload(sparse) 28 | 29 | s = sparse.asarray(s_sps) 30 | a = sparse.asarray(np.array(a_sps, order="F")) 31 | b = sparse.asarray(np.array(b_sps, order="C")) 32 | 33 | @sparse.compiled() 34 | def sddmm_finch(s, a, b): 35 | return sparse.sum( 36 | s[:, :, None] * (a[:, None, :] * sparse.permute_dims(b, (1, 0))[None, :, :]), 37 | axis=-1, 38 | ) 39 | 40 | # Compile & Benchmark 41 | result_finch = benchmark(sddmm_finch, args=[s, a, b], info="Finch", iters=ITERS) 42 | 43 | # ======= Numba ======= 44 | os.environ[sparse._ENV_VAR_NAME] = "Numba" 45 | importlib.reload(sparse) 46 | 47 | s = sparse.asarray(s_sps) 48 | a = a_sps 49 | b = b_sps 50 | 51 | def sddmm_numba(s, a, b): 52 | return s * (a @ b) 53 | 54 | # Compile & Benchmark 55 | result_numba = benchmark(sddmm_numba, args=[s, a, b], info="Numba", iters=ITERS) 56 | 57 | # ======= SciPy ======= 58 | def sddmm_scipy(s, a, b): 59 | return s.multiply(a @ b) 60 | 61 | s = s_sps.asformat("csr") 62 | a = a_sps 63 | b = b_sps 64 | 65 | # Compile & Benchmark 66 | result_scipy = benchmark(sddmm_scipy, args=[s, a, b], info="SciPy", iters=ITERS) 67 | 68 | np.testing.assert_allclose(result_numba.todense(), result_scipy.toarray()) 69 | np.testing.assert_allclose(result_finch.todense(), result_numba.todense()) 70 | np.testing.assert_allclose(result_finch.todense(), result_scipy.toarray()) 71 | -------------------------------------------------------------------------------- /.github/ISSUE_TEMPLATE/bug_report.yml: -------------------------------------------------------------------------------- 1 | name: Bug report 2 | description: Report to help us reproduce the bug 3 | title: "Bug: " 4 | labels: ["bug", "needs triage"] 5 | 6 | body: 7 | - type: markdown 8 | attributes: 9 | value: > 10 | ## Thanks for taking the time to fill out this report 11 | - type: checkboxes 12 | id: checks 13 | attributes: 14 | label: sparse version checks 15 | options: 16 | - label: > 17 | I checked that this issue has not been reported before 18 | [list of issues](https://github.com/pydata/sparse/issues). 19 | required: true 20 | - label: > 21 | I have confirmed this bug exists on the latest version of sparse. 22 | required: true 23 | - label: > 24 | I have confirmed this bug exists on the main branch of sparse. 25 | - type: textarea 26 | attributes: 27 | label: Describe the bug 28 | description: > 29 | A clear and concise description of what the bug is. 30 | validations: 31 | required: true 32 | - type: textarea 33 | attributes: 34 | label: Steps or code to reproduce the bug 35 | description: | 36 | Please add a minimal code example to reproduce the bug. 
37 | validations: 38 | required: true 39 | - type: textarea 40 | attributes: 41 | label: Expected results 42 | description: > 43 | Please paste or describe the expected results. 44 | placeholder: > 45 | Example: No error is thrown. 46 | validations: 47 | required: true 48 | - type: textarea 49 | attributes: 50 | label: Actual results 51 | description: | 52 | Please paste or describe the results you observe instead of the expected results. 53 | validations: 54 | required: true 55 | - type: textarea 56 | attributes: 57 | label: Please describe your system. 58 | value: | 59 | 1. OS and version: [e.g. Windows 10] 60 | 2. sparse version (sparse.__version__) 61 | 3. NumPy version (np.__version__) 62 | 4. Numba version (numba.__version__) 63 | validations: 64 | required: true 65 | - type: textarea 66 | attributes: 67 | label: Relevant log output 68 | description: Please copy and paste any relevant log output. This will be automatically formatted into code, so no need for backticks. 69 | render: shell 70 | -------------------------------------------------------------------------------- /examples/hits_example.py: -------------------------------------------------------------------------------- 1 | import os 2 | from typing import Any 3 | 4 | import graphblas as gb 5 | import graphblas_algorithms as ga 6 | 7 | import numpy as np 8 | import scipy.sparse as sps 9 | from numpy.testing import assert_allclose 10 | 11 | os.environ["SPARSE_BACKEND"] = "Finch" 12 | import sparse 13 | 14 | # select namespace 15 | xp = sparse # np jnp 16 | Array = Any 17 | 18 | 19 | def converged(xprev: Array, x: Array, N: int, tol: float) -> bool: 20 | err = xp.sum(xp.abs(x - xprev)) 21 | return err < xp.asarray(N * tol) 22 | 23 | 24 | class Graph: 25 | def __init__(self, A: Array): 26 | assert A.ndim == 2 and A.shape[0] == A.shape[1] 27 | self.N = A.shape[0] 28 | self.A = A 29 | 30 | 31 | @sparse.compiled() 32 | def kernel(hprev: Array, A: Array, N: int, tol: float) -> tuple[Array, Array, Array]: 33 | a = hprev.mT @ A 34 | h = A @ a.mT 35 | h = h / xp.max(h) 36 | conv = converged(hprev, h, N, tol) 37 | return h, a, conv 38 | 39 | 40 | def hits_finch(G: Graph, max_iter: int = 100, tol: float = 1e-8, normalized: bool = True) -> tuple[Array, Array]: 41 | N = G.N 42 | if N == 0: 43 | return xp.asarray([]), xp.asarray([]) 44 | 45 | h = xp.full((N, 1), 1.0 / N) 46 | A = xp.asarray(G.A) 47 | 48 | for _ in range(max_iter): 49 | hprev = h 50 | a = hprev.mT @ A 51 | h = A @ a.mT 52 | h = h / xp.max(h) 53 | if converged(hprev, h, N, tol): 54 | break 55 | # alternatively these lines can be compiled 56 | # h, a, conv = kernel(h, A, N, tol) 57 | else: 58 | raise Exception("Didn't converge") 59 | 60 | if normalized: 61 | h = h / xp.sum(xp.abs(h)) 62 | a = a / xp.sum(xp.abs(a)) 63 | return h, a 64 | 65 | 66 | if __name__ == "__main__": 67 | coords = (np.array([0, 0, 1, 2, 2, 3]), np.array([1, 3, 0, 0, 1, 2])) 68 | data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]) 69 | A = sps.coo_array((data, coords)) 70 | G = Graph(A) 71 | 72 | h_finch, a_finch = hits_finch(G) 73 | 74 | print(h_finch, a_finch) 75 | 76 | M = gb.io.from_scipy_sparse(A) 77 | G = ga.Graph(M) 78 | h_gb, a_gb = ga.hits(G) 79 | 80 | assert_allclose(h_finch.todense().ravel(), h_gb.to_dense()) 81 | assert_allclose(a_finch.todense().ravel(), a_gb.to_dense()) 82 | -------------------------------------------------------------------------------- /examples/elemwise_example.py: -------------------------------------------------------------------------------- 1 | import importlib 2 | import 
operator 3 | import os 4 | 5 | import sparse 6 | 7 | from utils import benchmark 8 | 9 | import numpy as np 10 | import scipy.sparse as sps 11 | 12 | LEN = 10000 13 | DENSITY = 0.001 14 | ITERS = 3 15 | rng = np.random.default_rng(0) 16 | 17 | 18 | if __name__ == "__main__": 19 | print("Elementwise Example:\n") 20 | 21 | for func_name in ["multiply", "add", "greater_equal"]: 22 | print(f"{func_name} benchmark:\n") 23 | 24 | s1_sps = sps.random(LEN, LEN, format="csr", density=DENSITY, random_state=rng) * 10 25 | s1_sps.sum_duplicates() 26 | s2_sps = sps.random(LEN, LEN, format="csr", density=DENSITY, random_state=rng) * 10 27 | s2_sps.sum_duplicates() 28 | 29 | # ======= Finch ======= 30 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 31 | importlib.reload(sparse) 32 | 33 | s1 = sparse.asarray(s1_sps.asformat("csc"), format="csc") 34 | s2 = sparse.asarray(s2_sps.asformat("csc"), format="csc") 35 | 36 | func = getattr(sparse, func_name) 37 | 38 | # Compile & Benchmark 39 | result_finch = benchmark(func, args=[s1, s2], info="Finch", iters=ITERS) 40 | 41 | # ======= Numba ======= 42 | os.environ[sparse._ENV_VAR_NAME] = "Numba" 43 | importlib.reload(sparse) 44 | 45 | s1 = sparse.asarray(s1_sps) 46 | s2 = sparse.asarray(s2_sps) 47 | 48 | func = getattr(sparse, func_name) 49 | 50 | # Compile & Benchmark 51 | result_numba = benchmark(func, args=[s1, s2], info="Numba", iters=ITERS) 52 | 53 | # ======= SciPy ======= 54 | s1 = s1_sps 55 | s2 = s2_sps 56 | 57 | if func_name == "multiply": 58 | func, args = s1.multiply, [s2] 59 | elif func_name == "add": 60 | func, args = operator.add, [s1, s2] 61 | elif func_name == "greater_equal": 62 | func, args = operator.ge, [s1, s2] 63 | 64 | # Compile & Benchmark 65 | result_scipy = benchmark(func, args=args, info="SciPy", iters=ITERS) 66 | 67 | np.testing.assert_allclose(result_numba.todense(), result_scipy.toarray()) 68 | np.testing.assert_allclose(result_finch.todense(), result_numba.todense()) 69 | np.testing.assert_allclose(result_finch.todense(), result_scipy.toarray()) 70 | -------------------------------------------------------------------------------- /benchmarks_original/elemwise_example.py: -------------------------------------------------------------------------------- 1 | import importlib 2 | import operator 3 | import os 4 | 5 | import sparse 6 | 7 | from utils import benchmark 8 | 9 | import numpy as np 10 | import scipy.sparse as sps 11 | 12 | LEN = 10000 13 | DENSITY = 0.001 14 | ITERS = 3 15 | rng = np.random.default_rng(0) 16 | 17 | 18 | if __name__ == "__main__": 19 | print("Elementwise Example:\n") 20 | 21 | for func_name in ["multiply", "add", "greater_equal"]: 22 | print(f"{func_name} benchmark:\n") 23 | 24 | s1_sps = sps.random(LEN, LEN, format="csr", density=DENSITY, random_state=rng) * 10 25 | s1_sps.sum_duplicates() 26 | s2_sps = sps.random(LEN, LEN, format="csr", density=DENSITY, random_state=rng) * 10 27 | s2_sps.sum_duplicates() 28 | 29 | # ======= Finch ======= 30 | os.environ[sparse._ENV_VAR_NAME] = "Finch" 31 | importlib.reload(sparse) 32 | 33 | s1 = sparse.asarray(s1_sps.asformat("csc"), format="csc") 34 | s2 = sparse.asarray(s2_sps.asformat("csc"), format="csc") 35 | 36 | func = getattr(sparse, func_name) 37 | 38 | # Compile & Benchmark 39 | result_finch = benchmark(func, args=[s1, s2], info="Finch", iters=ITERS) 40 | 41 | # ======= Numba ======= 42 | os.environ[sparse._ENV_VAR_NAME] = "Numba" 43 | importlib.reload(sparse) 44 | 45 | s1 = sparse.asarray(s1_sps) 46 | s2 = sparse.asarray(s2_sps) 47 | 48 | func = getattr(sparse, 
func_name) 49 | 50 | # Compile & Benchmark 51 | result_numba = benchmark(func, args=[s1, s2], info="Numba", iters=ITERS) 52 | 53 | # ======= SciPy ======= 54 | s1 = s1_sps 55 | s2 = s2_sps 56 | 57 | if func_name == "multiply": 58 | func, args = s1.multiply, [s2] 59 | elif func_name == "add": 60 | func, args = operator.add, [s1, s2] 61 | elif func_name == "greater_equal": 62 | func, args = operator.ge, [s1, s2] 63 | 64 | # Compile & Benchmark 65 | result_scipy = benchmark(func, args=args, info="SciPy", iters=ITERS) 66 | 67 | np.testing.assert_allclose(result_numba.todense(), result_scipy.toarray()) 68 | np.testing.assert_allclose(result_finch.todense(), result_numba.todense()) 69 | np.testing.assert_allclose(result_finch.todense(), result_scipy.toarray()) 70 | -------------------------------------------------------------------------------- /benchmarks/test_tensordot.py: -------------------------------------------------------------------------------- 1 | import itertools 2 | 3 | import sparse 4 | 5 | import pytest 6 | 7 | import numpy as np 8 | 9 | DENSITY = 0.01 10 | 11 | 12 | def get_sides_ids(param): 13 | m, n, p, q = param 14 | return f"{m=}-{n=}-{p=}-{q=}" 15 | 16 | 17 | @pytest.fixture( 18 | params=itertools.product([10, 50], [10, 20], [20, 50], [10, 50]), 19 | ids=get_sides_ids, 20 | scope="function", 21 | ) 22 | def sides(request): 23 | m, n, p, q = request.param 24 | return m, n, p, q 25 | 26 | 27 | def get_tensor_ids(param): 28 | left_index, right_index, left_format, right_format = param 29 | return f"{left_index=}-{right_index=}-{left_format=}-{right_format=}" 30 | 31 | 32 | @pytest.fixture( 33 | params=([(1, 2, "dense", "coo"), (1, 2, "coo", "coo"), (1, 1, "coo", "dense")]), 34 | ids=get_tensor_ids, 35 | scope="function", 36 | ) 37 | def tensordot_args(request, sides, rng, max_size): 38 | m, n, p, q = sides 39 | if m * n * p * q >= max_size: 40 | pytest.skip() 41 | left_index, right_index, left_format, right_format = request.param 42 | 43 | t = rng.random((m, n)) 44 | 45 | if left_format == "dense" and right_format == "coo": 46 | left_tensor = t 47 | right_tensor = sparse.random((m, p, n, q), density=DENSITY, format=right_format, random_state=rng) 48 | 49 | if left_format == "coo" and right_format == "coo": 50 | left_tensor = sparse.random((m, p), density=DENSITY, format=left_format, random_state=rng) 51 | right_tensor = sparse.random((m, n, p, q), density=DENSITY, format=right_format, random_state=rng) 52 | 53 | if left_format == "coo" and right_format == "dense": 54 | left_tensor = sparse.random((m, n, p, q), density=DENSITY, format=left_format, random_state=rng) 55 | right_tensor = t 56 | 57 | return left_index, right_index, left_tensor, right_tensor 58 | 59 | 60 | @pytest.mark.parametrize("return_type", [np.ndarray, sparse.COO]) 61 | def test_tensordot(benchmark, return_type, tensordot_args): 62 | left_index, right_index, left_tensor, right_tensor = tensordot_args 63 | 64 | sparse.tensordot(left_tensor, right_tensor, axes=([0, left_index], [0, right_index]), return_type=return_type) 65 | 66 | @benchmark 67 | def bench(): 68 | sparse.tensordot(left_tensor, right_tensor, axes=([0, left_index], [0, right_index]), return_type=return_type) 69 | -------------------------------------------------------------------------------- /pixi.toml: -------------------------------------------------------------------------------- 1 | [project] 2 | authors = ["Hameer Abbasi <2190658+hameerabbasi@users.noreply.github.com>"] 3 | channels = ["conda-forge"] 4 | name = "sparse" 5 | platforms = 
["osx-arm64", "osx-64", "linux-64", "win-64"] 6 | 7 | [pypi-dependencies] 8 | sparse = { path = ".", editable = true } 9 | numba = ">=0.49" 10 | numpy = ">=1.17" 11 | 12 | [dependencies] 13 | python = ">=3.11,<3.14" 14 | 15 | [feature.extra.pypi-dependencies] 16 | dask = { version = ">=2024", extras = ["array"] } 17 | scipy = ">=0.19" 18 | scikit-learn = "*" 19 | 20 | [feature.doc.pypi-dependencies] 21 | mkdocs-material = "*" 22 | mkdocstrings = { version = "*", extras = ["python"] } 23 | mkdocs-gen-files = "*" 24 | mkdocs-literate-nav = "*" 25 | mkdocs-section-index = "*" 26 | mkdocs-jupyter = "*" 27 | 28 | [feature.test.tasks] 29 | test = "ci/test_Numba.sh" 30 | test-mlir = "ci/test_MLIR.sh" 31 | test-finch = "ci/test_Finch.sh" 32 | 33 | [feature.test.pypi-dependencies] 34 | pytest = ">=3.5" 35 | pytest-cov = "*" 36 | pytest-xdist = "*" 37 | pytest-codspeed = "*" 38 | 39 | [feature.notebooks.pypi-dependencies] 40 | ipykernel = "*" 41 | nbmake = "*" 42 | matplotlib = "*" 43 | networkx = "*" 44 | jupyterlab = "*" 45 | 46 | [feature.matrepr.pypi-dependencies] 47 | matrepr = "*" 48 | 49 | [feature.finch.tasks] 50 | precompile = "python -c 'import finch'" 51 | 52 | [feature.finch.dependencies] 53 | python = ">=3.10" 54 | juliaup = ">=1.17.10" 55 | 56 | [feature.finch.pypi-dependencies] 57 | scipy = ">=1.13" 58 | finch-tensor = ">=0.2.13" 59 | 60 | [feature.finch.activation.env] 61 | SPARSE_BACKEND = "Finch" 62 | 63 | [feature.finch.target.osx-arm64.activation.env] 64 | PYTHONFAULTHANDLER = "${HOME}/faulthandler.log" 65 | 66 | [feature.mlir.dependencies] 67 | python = ">=3.10,<3.14" 68 | 69 | [feature.mlir.pypi-dependencies] 70 | scipy = ">=0.19" 71 | finch-mlir = ">=0.0.2" 72 | "PyYAML" = "*" 73 | 74 | [feature.barebones.dependencies] 75 | python = ">=3.10,<3.13" 76 | pip = ">=24" 77 | 78 | [feature.barebones.tasks] 79 | setup-env = {cmd = "ci/setup_env.sh" } 80 | test-all = { cmd = "ci/test_all.sh", env = { ACTIVATE_VENV = "1" }, depends-on = ["setup-env"] } 81 | test-finch = "ci/test_Finch.sh" 82 | 83 | [feature.mlir.activation.env] 84 | SPARSE_BACKEND = "MLIR" 85 | 86 | [environments] 87 | test = ["test", "extra"] 88 | doc = ["doc", "extra"] 89 | mlir-dev = {features = ["test", "mlir"], no-default-feature = true} 90 | finch-dev = {features = ["test", "finch"], no-default-feature = true} 91 | notebooks = ["extra", "mlir", "finch", "notebooks"] 92 | barebones = {features = ["barebones"], no-default-feature = true} 93 | -------------------------------------------------------------------------------- /sparse/numba_backend/tests/test_compressed_convert.py: -------------------------------------------------------------------------------- 1 | from sparse.numba_backend._compressed import convert 2 | from sparse.numba_backend._utils import assert_eq 3 | 4 | import pytest 5 | from numba.typed import List 6 | 7 | import numpy as np 8 | 9 | 10 | def make_inds(shape): 11 | return [np.arange(1, a - 1) for a in shape] 12 | 13 | 14 | def make_increments(shape): 15 | inds = make_inds(shape) 16 | shape_bins = convert.transform_shape(np.asarray(shape)) 17 | return List([inds[i] * shape_bins[i] for i in range(len(shape))]) 18 | 19 | 20 | @pytest.mark.parametrize( 21 | "shape, expected_subsample, subsample", 22 | [ 23 | [(5, 6, 7, 8, 9), np.array([3610, 6892, 10338]), 1000], 24 | [(13, 12, 12, 9, 7), np.array([9899, 34441, 60635, 86703]), 10000], 25 | [ 26 | (12, 15, 7, 14, 9), 27 | np.array([14248, 36806, 61382, 85956, 110532, 135106]), 28 | 10000, 29 | ], 30 | [(9, 9, 12, 7, 12), np.array([10177, 34369, 
60577]), 10000], 31 | ], 32 | ) 33 | def test_convert_to_flat(shape, expected_subsample, subsample): 34 | inds = make_inds(shape) 35 | dtype = inds[0].dtype 36 | 37 | assert_eq( 38 | convert.convert_to_flat(inds, shape, dtype)[::subsample], 39 | expected_subsample.astype(dtype), 40 | ) 41 | 42 | 43 | @pytest.mark.parametrize( 44 | "shape, expected_subsample, subsample", 45 | [ 46 | [(5, 6, 7, 8, 9), np.array([3610, 6892, 10338]), 1000], 47 | [(13, 12, 12, 9, 7), np.array([9899, 34441, 60635, 86703]), 10000], 48 | [ 49 | (12, 15, 7, 14, 9), 50 | np.array([14248, 36806, 61382, 85956, 110532, 135106]), 51 | 10000, 52 | ], 53 | [(9, 9, 12, 7, 12), np.array([10177, 34369, 60577]), 10000], 54 | ], 55 | ) 56 | def test_compute_flat(shape, expected_subsample, subsample): 57 | increments = make_increments(shape) 58 | dtype = increments[0].dtype 59 | operations = np.prod([inc.shape[0] for inc in increments[:-1]], dtype=dtype) 60 | cols = np.tile(increments[-1], operations) 61 | 62 | assert_eq( 63 | convert.compute_flat(increments, cols, operations)[::subsample], 64 | expected_subsample.astype(dtype), 65 | ) 66 | 67 | 68 | @pytest.mark.parametrize( 69 | "shape, expected_shape", 70 | [ 71 | [(5, 6, 7, 8, 9), np.array([3024, 504, 72, 9, 1])], 72 | [(13, 12, 12, 9, 7), np.array([9072, 756, 63, 7, 1])], 73 | [(12, 15, 7, 14, 9), np.array([13230, 882, 126, 9, 1])], 74 | [ 75 | (18, 5, 12, 14, 9, 11, 8, 14), 76 | np.array([9313920, 1862784, 155232, 11088, 1232, 112, 14, 1]), 77 | ], 78 | [ 79 | (11, 6, 13, 11, 17, 7, 15), 80 | np.array([1531530, 255255, 19635, 1785, 105, 15, 1]), 81 | ], 82 | [(9, 9, 12, 7, 12), np.array([9072, 1008, 84, 12, 1])], 83 | ], 84 | ) 85 | def test_transform_shape(shape, expected_shape): 86 | assert_eq(convert.transform_shape(np.asarray(shape)), expected_shape, compare_dtype=False) 87 | -------------------------------------------------------------------------------- /mkdocs.yml: -------------------------------------------------------------------------------- 1 | site_name: sparse 2 | repo_url: https://github.com/pydata/sparse.git 3 | edit_uri: edit/main/docs/ 4 | #use_directory_urls: false 5 | theme: 6 | name: material 7 | palette: 8 | primary: custom 9 | accent: cyan 10 | font: false #avoid Google Fonts to adhere to data privacy regulations 11 | logo: assets/images/logo.png 12 | favicon: assets/images/logo.svg 13 | features: 14 | - navigation.tabs 15 | - navigation.tabs.sticky 16 | - navigation.tracking 17 | - navigation.instant 18 | - navigation.instant.progress 19 | - navigation.prune 20 | - navigation.footer 21 | - navigation.indexes 22 | - navigation.expand 23 | - navigation.top # adds a back-to-top button when user scrolls up 24 | - content.code.copy 25 | 26 | markdown_extensions: 27 | - tables 28 | - admonition # This line, pymdownx.details and pymdownx.superfences are used by warings 29 | - pymdownx.details 30 | - pymdownx.superfences 31 | - codehilite 32 | - toc: 33 | toc_depth: 3 34 | - pymdownx.arithmatex: # To display math content with KaTex 35 | generic: true 36 | - attr_list # To be able to link to a header on another page, use grids 37 | - md_in_html # Used for grids 38 | 39 | extra_javascript: 40 | - js/katex.js 41 | - https://unpkg.com/katex@0/dist/katex.min.js 42 | - https://unpkg.com/katex@0/dist/contrib/auto-render.min.js 43 | 44 | extra_css: 45 | - https://unpkg.com/katex@0/dist/katex.min.css 46 | - css/mkdocstrings.css 47 | 48 | plugins: 49 | - search 50 | - section-index 51 | - autorefs 52 | - gen-files: 53 | scripts: 54 | - scripts/gen_ref_pages.py 55 | - 
literate-nav 56 | - mkdocstrings: 57 | handlers: 58 | python: 59 | inventories: 60 | - https://numpy.org/doc/stable/objects.inv 61 | - https://docs.python.org/3/objects.inv 62 | - https://docs.scipy.org/doc/scipy/objects.inv 63 | options: 64 | inherited_members: yes 65 | show_root_members_full_path: false 66 | show_if_no_docstring: true 67 | members_order: source 68 | docstring_style: numpy 69 | show_source: true 70 | filters: ["!^_"] 71 | group_by_category: true 72 | show_category_heading: true 73 | 74 | - mkdocs-jupyter: 75 | include_source: true 76 | execute: true 77 | ignore: ["__init__.py", "utils.py", "gen_logo.py"] 78 | 79 | nav: 80 | - Home: 81 | - index.md 82 | - Introduction: 83 | - introduction.md 84 | - Install: 85 | - install.md 86 | - Tutorials: 87 | - examples.md 88 | - examples/* 89 | - How to guides: 90 | - how-to-guides.md 91 | - quickstart.md 92 | - construct.md 93 | - operations.md 94 | - migration-jl.md 95 | - API: 96 | - api.md 97 | - api/* 98 | - Contributing: 99 | - contributing.md 100 | - roadmap.md 101 | - completed-tasks.md 102 | - changelog.md 103 | - conduct.md 104 | -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["setuptools>=64", "setuptools_scm>=8"] 3 | build-backend = "setuptools.build_meta" 4 | 5 | [project] 6 | name = "sparse" 7 | dynamic = ["version"] 8 | description = "Sparse n-dimensional arrays for the PyData ecosystem" 9 | readme = "README.md" 10 | dependencies = ["numpy>=1.17", "numba>=0.49"] 11 | maintainers = [{ name = "Hameer Abbasi", email = "hameerabbasi@yahoo.com" }] 12 | requires-python = ">=3.11" 13 | license = { file = "LICENSE" } 14 | keywords = ["sparse", "numpy", "scipy", "dask"] 15 | classifiers = [ 16 | "Development Status :: 2 - Pre-Alpha", 17 | "Operating System :: OS Independent", 18 | "License :: OSI Approved :: BSD License", 19 | "Programming Language :: Python", 20 | "Programming Language :: Python :: 3", 21 | "Programming Language :: Python :: 3.10", 22 | "Programming Language :: Python :: 3.11", 23 | "Programming Language :: Python :: 3.12", 24 | "Programming Language :: Python :: 3 :: Only", 25 | "Intended Audience :: Developers", 26 | "Intended Audience :: Science/Research", 27 | ] 28 | 29 | [project.optional-dependencies] 30 | docs = [ 31 | "mkdocs-material", 32 | "mkdocstrings[python]", 33 | "mkdocs-gen-files", 34 | "mkdocs-literate-nav", 35 | "mkdocs-section-index", 36 | "mkdocs-jupyter", 37 | "sparse[extras]", 38 | ] 39 | extras = [ 40 | "dask[array]", 41 | "sparse[finch]", 42 | "scipy", 43 | "scikit-learn", 44 | "networkx", 45 | ] 46 | tests = [ 47 | "sparse[extras]", 48 | "pytest>=3.5", 49 | "pytest-cov", 50 | "pytest-xdist", 51 | "pre-commit", 52 | "pytest-codspeed", 53 | ] 54 | tox = ["sparse[tests]", "tox"] 55 | notebooks = ["sparse[tests]", "nbmake", "matplotlib"] 56 | all = ["sparse[docs,tox,notebooks,mlir]", "matrepr"] 57 | finch = ["finch-tensor>=0.2.13"] 58 | mlir = ["finch-mlir>=0.0.2"] 59 | 60 | [project.urls] 61 | Documentation = "https://sparse.pydata.org/" 62 | Source = "https://github.com/pydata/sparse/" 63 | Repository = "https://github.com/pydata/sparse.git" 64 | "Issue Tracker" = "https://github.com/pydata/sparse/issues" 65 | Discussions = "https://github.com/pydata/sparse/discussions" 66 | 67 | [project.entry-points.numba_extensions] 68 | init = "sparse.numba_backend._numba_extension:_init_extension" 69 | 70 | 
[tool.setuptools.packages.find] 71 | where = ["."] 72 | include = ["sparse", "sparse.*"] 73 | 74 | [tool.setuptools_scm] 75 | version_file = "sparse/_version.py" 76 | 77 | [tool.ruff] 78 | exclude = ["sparse/_version.py"] 79 | line-length = 120 80 | 81 | [tool.ruff.lint] 82 | select = ["F", "E", "W", "I", "B", "UP", "YTT", "BLE", "C4", "T10", "ISC", "ICN", "PIE", "PYI", "RSE", "RET", "SIM", "PGH", "FLY", "NPY", "PERF"] 83 | 84 | [tool.ruff.lint.isort.sections] 85 | numpy = ["numpy", "numpy.*", "scipy", "scipy.*"] 86 | 87 | [tool.ruff.format] 88 | quote-style = "double" 89 | docstring-code-format = true 90 | 91 | [tool.ruff.lint.isort] 92 | section-order = [ 93 | "future", 94 | "standard-library", 95 | "first-party", 96 | "third-party", 97 | "numpy", 98 | "local-folder", 99 | ] 100 | 101 | [tool.jupytext.formats] 102 | "docs/examples_ipynb/" = "ipynb" 103 | "docs/examples/" = "py:light" 104 | -------------------------------------------------------------------------------- /sparse/mlir_backend/_dtypes.py: -------------------------------------------------------------------------------- 1 | import abc 2 | import dataclasses 3 | import math 4 | import sys 5 | 6 | import mlir_finch.runtime as rt 7 | from mlir_finch import ir 8 | 9 | import numpy as np 10 | 11 | 12 | class MlirType(abc.ABC): 13 | @abc.abstractmethod 14 | def _get_mlir_type(self) -> ir.Type: ... 15 | 16 | 17 | def _get_pointer_width() -> int: 18 | return round(math.log2(sys.maxsize + 1.0)) + 1 19 | 20 | 21 | _PTR_WIDTH = _get_pointer_width() 22 | 23 | 24 | @dataclasses.dataclass(eq=True, frozen=True, kw_only=True) 25 | class DType(MlirType): 26 | bit_width: int 27 | 28 | @property 29 | @abc.abstractmethod 30 | def np_dtype(self) -> np.dtype: 31 | raise NotImplementedError 32 | 33 | def to_ctype(self): 34 | return rt.as_ctype(self.np_dtype) 35 | 36 | def __eq__(self, value): 37 | if np.isdtype(value) or isinstance(value, str): 38 | value = asdtype(value) 39 | return super().__eq__(value) 40 | 41 | 42 | @dataclasses.dataclass(eq=True, frozen=True, kw_only=True) 43 | class IeeeRealFloatingDType(DType): 44 | @property 45 | def np_dtype(self) -> np.dtype: 46 | return np.dtype(getattr(np, f"float{self.bit_width}")) 47 | 48 | def _get_mlir_type(self) -> ir.Type: 49 | return getattr(ir, f"F{self.bit_width}Type").get() 50 | 51 | 52 | float64 = IeeeRealFloatingDType(bit_width=64) 53 | float32 = IeeeRealFloatingDType(bit_width=32) 54 | float16 = IeeeRealFloatingDType(bit_width=16) 55 | 56 | 57 | @dataclasses.dataclass(eq=True, frozen=True, kw_only=True) 58 | class IeeeComplexFloatingDType(DType): 59 | @property 60 | def np_dtype(self) -> np.dtype: 61 | return np.dtype(getattr(np, f"complex{self.bit_width}")) 62 | 63 | def _get_mlir_type(self) -> ir.Type: 64 | return ir.ComplexType.get(getattr(ir, f"F{self.bit_width // 2}Type").get()) 65 | 66 | 67 | complex64 = IeeeComplexFloatingDType(bit_width=64) 68 | complex128 = IeeeComplexFloatingDType(bit_width=128) 69 | 70 | 71 | @dataclasses.dataclass(eq=True, frozen=True, kw_only=True) 72 | class IntegerDType(DType): 73 | def _get_mlir_type(self) -> ir.Type: 74 | return ir.IntegerType.get_signless(self.bit_width) 75 | 76 | 77 | @dataclasses.dataclass(eq=True, frozen=True, kw_only=True) 78 | class UnsignedIntegerDType(IntegerDType): 79 | @property 80 | def np_dtype(self) -> np.dtype: 81 | return np.dtype(getattr(np, f"uint{self.bit_width}")) 82 | 83 | 84 | uint8 = UnsignedIntegerDType(bit_width=8) 85 | uint16 = UnsignedIntegerDType(bit_width=16) 86 | uint32 = UnsignedIntegerDType(bit_width=32) 87 | 
uint64 = UnsignedIntegerDType(bit_width=64) 88 | 89 | 90 | @dataclasses.dataclass(eq=True, frozen=True, kw_only=True) 91 | class SignedIntegerDType(IntegerDType): 92 | @property 93 | def np_dtype(self) -> np.dtype: 94 | return np.dtype(getattr(np, f"int{self.bit_width}")) 95 | 96 | 97 | int8 = SignedIntegerDType(bit_width=8) 98 | int16 = SignedIntegerDType(bit_width=16) 99 | int32 = SignedIntegerDType(bit_width=32) 100 | int64 = SignedIntegerDType(bit_width=64) 101 | 102 | 103 | intp: SignedIntegerDType = locals()[f"int{_PTR_WIDTH}"] 104 | uintp: UnsignedIntegerDType = locals()[f"uint{_PTR_WIDTH}"] 105 | 106 | 107 | def isdtype(dt, /) -> bool: 108 | return isinstance(dt, DType) 109 | 110 | 111 | NUMPY_DTYPE_MAP = {np.dtype(dt.np_dtype): dt for dt in locals().values() if isdtype(dt)} 112 | 113 | 114 | def asdtype(dt, /) -> DType: 115 | if isdtype(dt): 116 | return dt 117 | 118 | return NUMPY_DTYPE_MAP[np.dtype(dt)] 119 | -------------------------------------------------------------------------------- /docs/gen_logo.py: -------------------------------------------------------------------------------- 1 | import xml.etree.ElementTree as ET 2 | 3 | import numpy as np 4 | 5 | 6 | def transform(a, b, c, d, e, f): 7 | return f"matrix({a},{b},{c},{d},{e},{f})" 8 | 9 | 10 | def fill(rs): 11 | """Generates opacity at random, weighted a bit toward 0 and 1""" 12 | x = rs.choice(np.arange(5), p=[0.3, 0.2, 0.0, 0.2, 0.3]) / 4 13 | return f"fill-opacity:{x:.1f}" 14 | 15 | 16 | rs = np.random.RandomState(1) 17 | 18 | colors = { 19 | "orange": "fill:rgb(241,141,59)", 20 | "blue": "fill:rgb(69,155,181)", 21 | "grey": "fill:rgb(103,124,131)", 22 | } 23 | 24 | s = 10 # face size 25 | offset_x = 10 # x margin 26 | offset_y = 10 # y margin 27 | b = np.tan(np.deg2rad(30)) # constant for transformations 28 | 29 | 30 | # reused attributes for small squares 31 | kwargs = {"x": "0", "y": "0", "width": f"{s}", "height": f"{s}", "stroke": "white"} 32 | 33 | # large white squares for background 34 | bg_kwargs = {**kwargs, "width": f"{5 * s}", "height": f"{5 * s}", "style": "fill:white;"} 35 | 36 | 37 | root = ET.Element( 38 | "svg", 39 | **{ 40 | "width": f"{s * 10 + 2 * offset_x}", 41 | "height": f"{s * 20 + 2 * offset_y}", 42 | "viewbox": f"0 0 {s * 10 + 2 * offset_x} {s * 20 + 2 * offset_y}", 43 | "version": "1.1", 44 | "style": "fill-rule:evenodd;clip-rule:evenodd;stroke-linejoin:round;stroke-miterlimit:2;", 45 | "xmlns": "http://www.w3.org/2000/svg", 46 | "xmlns:xlink": "http://www.w3.org/1999/xlink", 47 | "xml:space": "preserve", 48 | "xmlns:serif": "http://www.serif.com/", 49 | "class": "align-center", 50 | }, 51 | ) 52 | 53 | 54 | # face 1 (left, orange) 55 | ET.SubElement( 56 | root, 57 | "rect", 58 | transform=transform(1, b, 0, 1, 5 * s + offset_x, offset_y), 59 | **bg_kwargs, 60 | ) 61 | for i, j in np.ndindex(5, 5): 62 | ET.SubElement( 63 | root, 64 | "rect", 65 | style=f"{colors['orange']};{fill(rs)};", 66 | transform=transform(1, b, 0, 1, (i + 5) * s + offset_x, (i * b + j) * s + offset_y), 67 | **kwargs, 68 | ) 69 | 70 | # face 2 (top, orange) 71 | ET.SubElement( 72 | root, 73 | "rect", 74 | transform=transform(1, b, -1, b, 5 * s + offset_x, 5 * s + offset_y), 75 | **bg_kwargs, 76 | ) 77 | for i, j in np.ndindex(5, 5): 78 | ET.SubElement( 79 | root, 80 | "rect", 81 | style=f"{colors['orange']};{fill(rs)};", 82 | transform=transform( 83 | 1, 84 | b, 85 | -1, 86 | b, 87 | (i - j + 5) * s + offset_x, 88 | (i * b + j * b + 5) * s + offset_y, 89 | ), 90 | **kwargs, 91 | ) 92 | 93 | # face 3 (left, blue) 94 | 
for y2 in (5 + b * 5, 10 + b * 5): 95 | ET.SubElement( 96 | root, 97 | "rect", 98 | transform=transform(1, b, 0, 1, offset_x, y2 * s + offset_y), 99 | **bg_kwargs, 100 | ) 101 | for i, j in np.ndindex(5, 5): 102 | ET.SubElement( 103 | root, 104 | "rect", 105 | style=f"{colors['blue']};{fill(rs)};", 106 | transform=transform(1, b, 0, 1, i * s + offset_x, (i * b + j + y2) * s + offset_y), 107 | **kwargs, 108 | ) 109 | 110 | # face 4 (right, grey) 111 | ET.SubElement( 112 | root, 113 | "rect", 114 | transform=transform(1, -b, 0, 1, 5 * s + offset_x, (10 * b + 5) * s + offset_y), 115 | **bg_kwargs, 116 | ) 117 | for i, j in np.ndindex(5, 5): 118 | ET.SubElement( 119 | root, 120 | "rect", 121 | style=f"{colors['grey']};{fill(rs)};", 122 | transform=transform(1, -b, 0, 1, (i + 5) * s + offset_x, ((10 - i) * b + j + 5) * s + offset_y), 123 | **kwargs, 124 | ) 125 | 126 | ET.ElementTree(root).write("logo.svg", encoding="UTF-8") 127 | -------------------------------------------------------------------------------- /sparse/numba_backend/_compressed/common.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | from .._utils import can_store, check_consistent_fill_value, normalize_axis 4 | 5 | 6 | def concatenate(arrays, axis=0, compressed_axes=None): 7 | from .compressed import GCXS 8 | 9 | check_consistent_fill_value(arrays) 10 | arrays = [arr if isinstance(arr, GCXS) else GCXS(arr, compressed_axes=(axis,)) for arr in arrays] 11 | axis = normalize_axis(axis, arrays[0].ndim) 12 | dim = sum(x.shape[axis] for x in arrays) 13 | shape = list(arrays[0].shape) 14 | shape[axis] = dim 15 | assert all(x.shape[ax] == arrays[0].shape[ax] for x in arrays for ax in set(range(arrays[0].ndim)) - {axis}) 16 | if compressed_axes is None: 17 | compressed_axes = (axis,) 18 | if arrays[0].ndim == 1: 19 | from .._coo.common import concatenate as coo_concat 20 | 21 | arrays = [arr.tocoo() for arr in arrays] 22 | return coo_concat(arrays, axis=axis) 23 | # arrays may have different compressed_axes 24 | # concatenating becomes easy when compressed_axes are the same 25 | arrays = [arr.change_compressed_axes((axis,)) for arr in arrays] 26 | ptr_list = [] 27 | for i, arr in enumerate(arrays): 28 | if i == 0: 29 | ptr_list.append(arr.indptr) 30 | continue 31 | ptr_list.append(arr.indptr[1:]) 32 | indptr = np.concatenate(ptr_list) 33 | indices = np.concatenate([arr.indices for arr in arrays]) 34 | data = np.concatenate([arr.data for arr in arrays]) 35 | ptr_len = arrays[0].indptr.shape[0] 36 | nnz = arrays[0].nnz 37 | total_nnz = sum(int(arr.nnz) for arr in arrays) 38 | if not can_store(indptr.dtype, total_nnz): 39 | indptr = indptr.astype(np.min_scalar_type(total_nnz)) 40 | for i in range(1, len(arrays)): 41 | indptr[ptr_len:] += nnz 42 | nnz = arrays[i].nnz 43 | ptr_len += arrays[i].indptr.shape[0] - 1 44 | return GCXS( 45 | (data, indices, indptr), 46 | shape=tuple(shape), 47 | compressed_axes=arrays[0].compressed_axes, 48 | fill_value=arrays[0].fill_value, 49 | ).change_compressed_axes(compressed_axes) 50 | 51 | 52 | def stack(arrays, axis=0, compressed_axes=None): 53 | from .compressed import GCXS 54 | 55 | check_consistent_fill_value(arrays) 56 | arrays = [arr if isinstance(arr, GCXS) else GCXS(arr, compressed_axes=(axis,)) for arr in arrays] 57 | axis = normalize_axis(axis, arrays[0].ndim + 1) 58 | assert all(x.shape[ax] == arrays[0].shape[ax] for x in arrays for ax in set(range(arrays[0].ndim)) - {axis}) 59 | if compressed_axes is None: 60 | compressed_axes = 
(axis,) 61 | if arrays[0].ndim == 1: 62 | from .._coo.common import stack as coo_stack 63 | 64 | arrays = [arr.tocoo() for arr in arrays] 65 | return coo_stack(arrays, axis=axis) 66 | # arrays may have different compressed_axes 67 | # stacking becomes easy when compressed_axes are the same 68 | ptr_list = [] 69 | for i in range(len(arrays)): 70 | shape = list(arrays[i].shape) 71 | shape.insert(axis, 1) 72 | arrays[i] = arrays[i].reshape(shape).change_compressed_axes((axis,)) 73 | if i == 0: 74 | ptr_list.append(arrays[i].indptr) 75 | continue 76 | ptr_list.append(arrays[i].indptr[1:]) 77 | 78 | shape[axis] = len(arrays) 79 | indptr = np.concatenate(ptr_list) 80 | indices = np.concatenate([arr.indices for arr in arrays]) 81 | data = np.concatenate([arr.data for arr in arrays]) 82 | ptr_len = arrays[0].indptr.shape[0] 83 | nnz = arrays[0].nnz 84 | total_nnz = sum(int(arr.nnz) for arr in arrays) 85 | if not can_store(indptr.dtype, total_nnz): 86 | indptr = indptr.astype(np.min_scalar_type(total_nnz)) 87 | for i in range(1, len(arrays)): 88 | indptr[ptr_len:] += nnz 89 | nnz = arrays[i].nnz 90 | ptr_len += arrays[i].indptr.shape[0] - 1 91 | return GCXS( 92 | (data, indices, indptr), 93 | shape=tuple(shape), 94 | compressed_axes=arrays[0].compressed_axes, 95 | fill_value=arrays[0].fill_value, 96 | ).change_compressed_axes(compressed_axes) 97 | -------------------------------------------------------------------------------- /docs/introduction.md: -------------------------------------------------------------------------------- 1 | # Sparse 2 | 3 | This implements sparse arrays of arbitrary dimension on top of 4 | [numpy][] and 5 | [`scipy.sparse`][]. It generalizes the 6 | [`scipy.sparse.coo_matrix`][] and 7 | [`scipy.sparse.dok_matrix`][] layouts, but 8 | extends beyond just rows and columns to an arbitrary number of 9 | dimensions. 10 | 11 | Additionally, this project maintains compatibility with the 12 | [`numpy.ndarray`][] interface rather than the 13 | [`numpy.matrix`][] interface used in 14 | [`scipy.sparse`][]. 15 | 16 | These differences make this project useful in certain situations where 17 | scipy.sparse matrices are not well suited, but it should not be 18 | considered a full replacement. The data structures in pydata/sparse 19 | complement and can be used in conjunction with the fast linear algebra 20 | routines inside [`scipy.sparse`][]. A format conversion or copy may be 21 | required. 22 | 23 | ## Motivation 24 | 25 | Sparse arrays, or arrays that are mostly empty or filled with zeros, are 26 | common in many scientific applications. To save space we often avoid 27 | storing these arrays in traditional dense formats, and instead choose 28 | different data structures. Our choice of data structure can 29 | significantly affect our storage and computational costs when working 30 | with these arrays. 31 | 32 | ## Design 33 | 34 | The main data structure in this library follows the [Coordinate List 35 | (COO)](https://en.wikipedia.org/wiki/Sparse_matrix#Coordinate_list_(COO)) 36 | layout for sparse matrices, but extends it to multiple dimensions. 37 | 38 | The COO layout stores the row index, column index, and value of 39 | every element: 40 | 41 | 42 | | row | col | data | 43 | |-----|-----|------| 44 | | 0 | 0 | 10 | 45 | | 0 | 2 | 13 | 46 | | 1 | 3 | 9 | 47 | | 3 | 8 | 21 | 48 | 49 | It is straightforward to extend the COO layout to an arbitrary number of 50 | dimensions: 51 | 52 | 53 | | dim1 | dim2 | dim3 | \...
| data | 54 | |------|------|------|------|------| 55 | | 0 | 0 | 0 | . | 10 | 56 | | 0 | 0 | 3 | . | 13 | 57 | | 0 | 2 | 2 | . | 9 | 58 | | 3 | 1 | 4 | . | 21 | 59 | 60 | This makes it easy to *store* a multidimensional sparse array, but we 61 | still need to reimplement all of the array operations like transpose, 62 | reshape, slicing, tensordot, reductions, etc., which can be challenging 63 | in general. 64 | 65 | This library also includes several other data structures. Similar to 66 | COO, the [Dictionary of Keys 67 | (DOK)](https://en.wikipedia.org/wiki/Sparse_matrix#Dictionary_of_keys_(DOK)) 68 | format for sparse matrices generalizes well to an arbitrary number of 69 | dimensions. DOK is well-suited for writing and mutating. Most other 70 | operations are not supported for DOK. A common workflow may involve 71 | writing an array with DOK and then converting to another format for 72 | other operations. 73 | 74 | The [Compressed Sparse Row/Column 75 | (CSR/CSC)](https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_sparse_column_(CSC_or_CCS)) 76 | formats, widely used in scientific computing, are now supported by 77 | pydata/sparse. The CSR/CSC formats excel at compression and mathematical 78 | operations. While these formats are restricted to two dimensions, 79 | pydata/sparse supports the GCXS sparse array format, based on 80 | [GCRS/GCCS](https://ieeexplore.ieee.org/abstract/document/7237032/similar#similar), 81 | which generalizes CSR/CSC to n-dimensional arrays. Like their 82 | two-dimensional CSR/CSC counterparts, GCXS arrays compress well. Whereas 83 | the storage cost of COO depends heavily on the number of dimensions of 84 | the array, the number of dimensions only minimally affects the storage 85 | cost of GCXS arrays, which results in favorable compression ratios 86 | across many use cases. 87 | 88 | Together these formats cover a wide range of applications of sparsity. 89 | Additionally, with each format complying with the 90 | [`numpy.ndarray`][] interface and following 91 | the appropriate dispatching protocols, pydata/sparse arrays can interact 92 | with other array libraries and seamlessly take part in 93 | pydata-ecosystem-based workflows. 94 | 95 | ## LICENSE 96 | 97 | This library is licensed under BSD-3.
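## Example

To make the layouts described above concrete, here is a minimal sketch that builds the three-dimensional COO array from the second table using the public `sparse` API; the shape and values are illustrative only.

```python
import numpy as np
import sparse

# One row of coordinates per dimension, one column per stored element,
# matching the (dim1, dim2, dim3, data) rows of the table above.
coords = np.array(
    [
        [0, 0, 0, 3],  # dim1
        [0, 0, 2, 1],  # dim2
        [0, 3, 2, 4],  # dim3
    ]
)
data = np.array([10, 13, 9, 21])

x = sparse.COO(coords, data, shape=(4, 3, 5))

# Array operations work much like they do on numpy.ndarray.
y = (2 * x).sum(axis=0)

# Convert to GCXS for compressed storage, or densify for inspection.
g = x.asformat("gcxs")
assert np.array_equal(2 * x.todense().sum(axis=0), y.todense())
```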
98 | -------------------------------------------------------------------------------- /.github/workflows/ci.yml: -------------------------------------------------------------------------------- 1 | defaults: 2 | run: 3 | shell: bash -leo pipefail {0} 4 | 5 | concurrency: 6 | group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }} 7 | cancel-in-progress: true 8 | 9 | jobs: 10 | test: 11 | strategy: 12 | matrix: 13 | os: ['ubuntu-latest'] 14 | python: ['3.11', '3.12', '3.13'] 15 | pip_opts: [''] 16 | numba_boundscheck: [0] 17 | include: 18 | - os: macos-latest 19 | python: '3.11' 20 | - os: windows-latest 21 | python: '3.11' 22 | - os: ubuntu-latest 23 | python: '3.11' 24 | numba_boundscheck: 1 25 | - os: ubuntu-latest 26 | python: '3.11' 27 | pip_opts: 'numpy<2' 28 | fail-fast: false 29 | runs-on: ${{ matrix.os }} 30 | env: 31 | PYTHON_VERSION: ${{ matrix.python }} 32 | NUMBA_BOUNDSCHECK: ${{ matrix.numba_boundscheck }} 33 | steps: 34 | - name: Checkout Repo 35 | uses: actions/checkout@v4 36 | - uses: mamba-org/setup-micromamba@v2 37 | with: 38 | environment-file: ci/environment.yml 39 | init-shell: >- 40 | bash 41 | cache-environment: true 42 | cache-downloads: true 43 | post-cleanup: 'all' 44 | create-args: >- 45 | python=${{ matrix.python }} 46 | ${{ matrix.pip_opts }} 47 | - name: Install package 48 | run: | 49 | pip install -e '.[tests]' 50 | - name: Run tests 51 | run: ci/test_backends.sh 52 | - uses: codecov/codecov-action@v5 53 | with: 54 | token: ${{ secrets.CODECOV_TOKEN }} 55 | files: ./**/coverage*.xml 56 | 57 | examples: 58 | runs-on: ubuntu-latest 59 | steps: 60 | - name: Checkout Repo 61 | uses: actions/checkout@v4 62 | - name: Set up Python 63 | uses: actions/setup-python@v5 64 | with: 65 | python-version: '3.11' 66 | cache: 'pip' 67 | - name: Build and install Sparse 68 | run: | 69 | pip install -U setuptools wheel 70 | pip install '.[finch]' scipy dask networkx graphblas-algorithms 71 | - name: Run examples 72 | run: ci/test_examples.sh 73 | 74 | notebooks: 75 | runs-on: ubuntu-latest 76 | steps: 77 | - name: Checkout Repo 78 | uses: actions/checkout@v4 79 | - name: Set up Python 80 | uses: actions/setup-python@v5 81 | with: 82 | python-version: '3.11' 83 | cache: 'pip' 84 | - name: Build and install Sparse 85 | run: | 86 | pip install -U setuptools wheel 87 | pip install '.[notebooks]' 88 | - name: Run notebooks 89 | run: ci/test_notebooks.sh 90 | 91 | array_api_tests: 92 | strategy: 93 | matrix: 94 | backend: ['Numba', 'Finch'] 95 | fail-fast: false 96 | env: 97 | ARRAY_API_TESTS_DIR: ${{ github.workspace }}/array-api-tests 98 | runs-on: ubuntu-latest 99 | steps: 100 | - name: Checkout Repo 101 | uses: actions/checkout@v4 102 | - name: Checkout array-api-tests 103 | run: ci/clone_array_api_tests.sh 104 | - name: Set up Python 105 | uses: actions/setup-python@v5 106 | with: 107 | python-version: '3.11' 108 | cache: 'pip' 109 | - name: Install build and test dependencies from PyPI 110 | run: | 111 | pip install pytest-xdist -r "$ARRAY_API_TESTS_DIR/requirements.txt" 112 | - name: Build and install Sparse 113 | run: | 114 | pip install -U setuptools wheel 115 | pip install '.[finch]' 116 | - name: Run the test suite 117 | env: 118 | SPARSE_BACKEND: ${{ matrix.backend }} 119 | run: ci/test_array_api.sh 120 | 121 | on: 122 | # Trigger the workflow on push or pull request, 123 | # but only for the main branch 124 | push: 125 | branches: 126 | - main 127 | - vnext 128 | pull_request: 129 | branches: 130 | - main 131 | - vnext 132 | # Also trigger on 
page_build, as well as release created events 133 | page_build: 134 | release: 135 | types: # This configuration does not affect the page_build event above 136 | - created 137 | -------------------------------------------------------------------------------- /sparse/numba_backend/tests/test_array_function.py: -------------------------------------------------------------------------------- 1 | import sparse 2 | from sparse.numba_backend._settings import NEP18_ENABLED 3 | from sparse.numba_backend._utils import assert_eq 4 | 5 | import pytest 6 | 7 | import numpy as np 8 | import scipy 9 | 10 | if not NEP18_ENABLED: 11 | pytest.skip("NEP18 is not enabled", allow_module_level=True) 12 | 13 | 14 | @pytest.mark.parametrize( 15 | "func", 16 | [ 17 | np.mean, 18 | np.std, 19 | np.var, 20 | np.sum, 21 | lambda x: np.sum(x, axis=0), 22 | lambda x: np.transpose(x), 23 | ], 24 | ) 25 | def test_unary(func): 26 | y = sparse.random((50, 50), density=0.25) 27 | x = y.todense() 28 | xx = func(x) 29 | yy = func(y) 30 | assert_eq(xx, yy) 31 | 32 | 33 | @pytest.mark.parametrize("arg_order", [(0, 1), (1, 0), (1, 1)]) 34 | @pytest.mark.parametrize("func", [np.dot, np.result_type, np.tensordot, np.matmul]) 35 | def test_binary(func, arg_order): 36 | y = sparse.random((50, 50), density=0.25) 37 | x = y.todense() 38 | xx = func(x, x) 39 | args = [(x, y)[i] for i in arg_order] 40 | yy = func(*args) 41 | 42 | if isinstance(xx, np.ndarray): 43 | assert_eq(xx, yy) 44 | else: 45 | # result_type returns a dtype 46 | assert xx == yy 47 | 48 | 49 | def test_stack(): 50 | """stack(), by design, does not allow for mixed type inputs""" 51 | y = sparse.random((50, 50), density=0.25) 52 | x = y.todense() 53 | xx = np.stack([x, x]) 54 | yy = np.stack([y, y]) 55 | assert_eq(xx, yy) 56 | 57 | 58 | @pytest.mark.parametrize( 59 | "arg_order", 60 | [(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1), (1, 1, 0), (1, 1, 1)], 61 | ) 62 | @pytest.mark.parametrize("func", [lambda a, b, c: np.where(a.astype(bool), b, c)]) 63 | def test_ternary(func, arg_order): 64 | y = sparse.random((50, 50), density=0.25) 65 | x = y.todense() 66 | xx = func(x, x, x) 67 | args = [(x, y)[i] for i in arg_order] 68 | yy = func(*args) 69 | assert_eq(xx, yy) 70 | 71 | 72 | @pytest.mark.parametrize("func", [np.shape, np.size, np.ndim]) 73 | def test_property(func): 74 | y = sparse.random((50, 50), density=0.25) 75 | x = y.todense() 76 | xx = func(x) 77 | yy = func(y) 78 | assert xx == yy 79 | 80 | 81 | def test_broadcast_to_scalar(): 82 | s = sparse.COO.from_numpy([0, 0, 1, 2]) 83 | actual = np.broadcast_to(np.zeros_like(s, shape=()), (3,)) 84 | expected = np.broadcast_to(np.zeros_like(s.todense(), shape=()), (3,)) 85 | 86 | assert isinstance(actual, sparse.COO) 87 | assert_eq(actual, expected) 88 | 89 | 90 | def test_zeros_like_order(): 91 | s = sparse.COO.from_numpy([0, 0, 1, 2]) 92 | actual = np.zeros_like(s, order="C") 93 | expected = np.zeros_like(s.todense(), order="C") 94 | 95 | assert isinstance(actual, sparse.COO) 96 | assert_eq(actual, expected) 97 | 98 | 99 | @pytest.mark.parametrize("format", ["dok", "gcxs", "coo"]) 100 | def test_format(format): 101 | s = sparse.random((5, 5), density=0.2, format=format) 102 | assert s.format == format 103 | 104 | 105 | class TestAsarray: 106 | np_eye = np.eye(5) 107 | 108 | @pytest.mark.parametrize( 109 | "input", 110 | [ 111 | np_eye, 112 | scipy.sparse.csr_matrix(np_eye), 113 | scipy.sparse.csc_matrix(np_eye), 114 | 4, 115 | np.array(5), 116 | np.arange(12).reshape((2, 3, 2)), 117 | 
sparse.COO.from_numpy(np_eye), 118 | sparse.GCXS.from_numpy(np_eye), 119 | sparse.DOK.from_numpy(np_eye), 120 | ], 121 | ) 122 | @pytest.mark.parametrize("dtype", [np.int64, np.float64, np.complex128]) 123 | @pytest.mark.parametrize("format", ["dok", "gcxs", "coo"]) 124 | def test_asarray(self, input, dtype, format): 125 | if format == "dok" and (np.isscalar(input) or input.ndim == 0): 126 | # scalars and 0-D arrays aren't supported in DOK format 127 | return 128 | 129 | s = sparse.asarray(input, dtype=dtype, format=format) 130 | 131 | actual = s.todense() if hasattr(s, "todense") else s 132 | expected = input.todense() if hasattr(input, "todense") else np.asarray(input) 133 | 134 | np.testing.assert_equal(actual, expected) 135 | -------------------------------------------------------------------------------- /docs/roadmap.md: -------------------------------------------------------------------------------- 1 | # Roadmap 2 | 3 | For a brochure version of this roadmap, see 4 | [this link](https://docs.wixstatic.com/ugd/095d2c_ac81d19db47047c79a55da7a6c31cf66.pdf). 5 | 6 | 7 | ## Background 8 | 9 | The aim of PyData/Sparse is to create sparse containers that implement the ndarray 10 | interface. Traditionally in the PyData ecosystem, sparse arrays have been provided 11 | by the `scipy.sparse` submodule. All containers there depend on and emulate the 12 | `numpy.matrix` interface. This means that they are limited to two dimensions and also 13 | don’t work well in places where `numpy.ndarray` would work. 14 | 15 | PyData/Sparse is well on its way to replacing `scipy.sparse` as the de-facto sparse array 16 | implementation in the PyData ecosystem. 17 | 18 | ## Topics 19 | 20 | * More storage formats 21 | * Better performance/algorithms 22 | * Covering more of the NumPy API 23 | * SciPy Integration 24 | * Dask integration for high scalability 25 | * CuPy integration for GPU-acceleration 26 | * Maintenance and General Improvements 27 | 28 | ## More Storage Formats 29 | 30 | In the sparse domain, you have to make a choice of format when representing your array in 31 | memory, and different formats have different trade-offs. For example: 32 | 33 | * CSR/CSC are usually expected by external libraries, and have good space characteristics 34 | for most arrays 35 | * DOK allows in-place modification and writes 36 | * LIL has faster writes if written to in order 37 | * BSR allows block-writes and reads 38 | 39 | The most important formats are, of course, CSR and CSC, because they allow zero-copy interaction 40 | with a number of libraries including MKL, LAPACK and others. This will allow PyData/Sparse to 41 | quickly reach the functionality of `scipy.sparse`, accelerating the path to its replacement. 42 | 43 | ## Better Performance/Algorithms 44 | 45 | There are a few places in scipy.sparse where algorithms are sub-optimal, sometimes due to reliance 46 | on NumPy which doesn’t have these algorithms. We intend to improve these algorithms both in NumPy, 47 | giving the broader community a chance to use them, and in PyData/Sparse, to reach optimal 48 | efficiency in the broadest use-cases. 49 | 50 | ## Covering More of the NumPy API 51 | 52 | Our eventual aim is to cover all areas of NumPy where algorithms exist that give sparse arrays an edge 53 | over dense arrays. Currently, PyData/Sparse supports reductions, element-wise functions and other common 54 | functions such as stacking, concatenating and tensor products.
Common uses of sparse arrays include 55 | linear algebra and graph theoretic subroutines, so we plan on covering those first. 56 | 57 | ## SciPy Integration 58 | 59 | PyData/Sparse aims to build containers and elementary operations on them, such as element-wise operations, 60 | reductions and so on. We plan on modifying the current graph theoretic subroutines in `scipy.sparse.csgraph` 61 | to support PyData/Sparse arrays. The same applies for linear algebra and `scipy.sparse.linalg`. 62 | 63 | ## CuPy integration for GPU-acceleration 64 | 65 | CuPy is a project that implements a large portion of NumPy’s ndarray interface on GPUs. We plan to integrate 66 | with CuPy so that it’s possible to accelerate sparse arrays on GPUs. 67 | 68 | [](){#completed} 69 | # Completed Tasks 70 | 71 | ## Dask Integration for High Scalability 72 | 73 | Dask is a project that takes ndarray-style containers and then allows them to scale across multiple cores or 74 | clusters. We plan on tighter integration and cooperation with the Dask team to ensure the highest amount of 75 | Dask functionality works with sparse arrays. 76 | 77 | Currently, integration with Dask is supported via array protocols. When more of the NumPy API (e.g. array 78 | creation functions) becomes available through array protocols, it will automatically be supported by Dask. 79 | 80 | ## (Partial) SciPy Integration 81 | 82 | Support for `scipy.sparse.linalg` has been completed. We hope to add support for `scipy.sparse.csgraph` 83 | in the future. 84 | 85 | ## More Storage Formats 86 | 87 | GCXS, a compressed n-dimensional array format based on the GCRS/GCCS formats of 88 | [Shaikh and Hasan 2015](https://ieeexplore.ieee.org/document/7237032), has been added. 89 | In conjunction with this work, the CSR/CSC matrix formats are now a part of pydata/sparse. 90 | We plan to add better-performing algorithms for many of the operations currently supported.
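To illustrate the storage formats this roadmap discusses, here is a minimal sketch using the current public API; the shape and density are illustrative only.

```python
import numpy as np
import sparse

# A random 2-D sparse array in the default COO format.
x = sparse.random((1000, 1000), density=0.01, format="coo")

# GCXS: the n-dimensional generalization of CSR/CSC described above.
g = x.asformat("gcxs")

# For 2-D arrays, the CSR specialization is also available.
c = x.asformat("csr")

# All formats store the same values behind an ndarray-style interface.
assert x.nnz == g.nnz == c.nnz
assert np.allclose(x.todense(), g.todense())
```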
91 | -------------------------------------------------------------------------------- /sparse/numba_backend/tests/test_compressed_2d.py: -------------------------------------------------------------------------------- 1 | import sparse 2 | from sparse import COO 3 | from sparse.numba_backend._compressed.compressed import CSC, CSR, GCXS 4 | from sparse.numba_backend._utils import assert_eq 5 | 6 | import pytest 7 | 8 | import numpy as np 9 | import scipy.sparse 10 | import scipy.stats 11 | 12 | 13 | @pytest.fixture(scope="module", params=[CSR, CSC]) 14 | def cls(request): 15 | return request.param 16 | 17 | 18 | @pytest.fixture(scope="module", params=["f8", "f4", "i8", "i4"]) 19 | def dtype(request): 20 | return request.param 21 | 22 | 23 | @pytest.fixture(scope="module") 24 | def random_sparse(cls, dtype, rng): 25 | if np.issubdtype(dtype, np.integer): 26 | 27 | def data_rvs(n): 28 | return rng.integers(-1000, 1000, n) 29 | 30 | else: 31 | data_rvs = None 32 | return cls(sparse.random((20, 30), density=0.25, data_rvs=data_rvs).astype(dtype)) 33 | 34 | 35 | @pytest.fixture(scope="module") 36 | def random_sparse_small(cls, dtype, rng): 37 | if np.issubdtype(dtype, np.integer): 38 | 39 | def data_rvs(n): 40 | return rng.integers(-10, 10, n) 41 | 42 | else: 43 | data_rvs = None 44 | return cls(sparse.random((20, 20), density=0.25, data_rvs=data_rvs).astype(dtype)) 45 | 46 | 47 | def test_repr(random_sparse): 48 | cls = type(random_sparse).__name__ 49 | 50 | str_repr = repr(random_sparse) 51 | assert cls in str_repr 52 | 53 | 54 | def test_bad_constructor_input(cls): 55 | with pytest.raises(ValueError, match=r".*shape.*"): 56 | cls(arg="hello world") 57 | 58 | 59 | @pytest.mark.parametrize("n", [0, 1, 3]) 60 | def test_bad_nd_input(cls, n): 61 | a = np.ones(shape=tuple(5 for _ in range(n))) 62 | with pytest.raises(ValueError, match=f"{n}-d"): 63 | cls(a) 64 | 65 | 66 | @pytest.mark.parametrize("source_type", ["gcxs", "coo"]) 67 | def test_from_sparse(cls, source_type): 68 | gcxs = sparse.random((20, 30), density=0.25, format=source_type) 69 | result = cls(gcxs) 70 | 71 | assert_eq(result, gcxs) 72 | 73 | 74 | @pytest.mark.parametrize("scipy_type", ["coo", "csr", "csc", "lil"]) 75 | @pytest.mark.parametrize("CLS", [CSR, CSC, GCXS]) 76 | def test_from_scipy_sparse(scipy_type, CLS, dtype): 77 | orig = scipy.sparse.random(20, 30, density=0.2, format=scipy_type, dtype=dtype) 78 | ref = COO.from_scipy_sparse(orig) 79 | result = CLS.from_scipy_sparse(orig) 80 | 81 | assert_eq(ref, result) 82 | 83 | result_via_init = CLS(orig) 84 | 85 | assert_eq(ref, result_via_init) 86 | 87 | 88 | @pytest.mark.parametrize("cls_str", ["coo", "dok", "csr", "csc", "gcxs"]) 89 | def test_to_sparse(cls_str, random_sparse): 90 | result = random_sparse.asformat(cls_str) 91 | 92 | assert_eq(random_sparse, result) 93 | 94 | 95 | @pytest.mark.parametrize("copy", [True, False]) 96 | def test_transpose(random_sparse, copy): 97 | from operator import is_, is_not 98 | 99 | t = random_sparse.transpose(copy=copy) 100 | tt = t.transpose(copy=copy) 101 | 102 | # Check if a copy was made 103 | check = is_not if copy else is_ 104 | 105 | assert check(random_sparse.data, t.data) 106 | assert check(random_sparse.indices, t.indices) 107 | assert check(random_sparse.indptr, t.indptr) 108 | 109 | assert random_sparse.shape == t.shape[::-1] 110 | 111 | assert_eq(random_sparse, tt) 112 | assert type(random_sparse) is type(tt) 113 | 114 | assert_eq(random_sparse.transpose(axes=(0, 1)), random_sparse) 115 | 
assert_eq(random_sparse.transpose(axes=(1, 0)), t) 116 | with pytest.raises(ValueError, match="Invalid transpose axes"): 117 | random_sparse.transpose(axes=0) 118 | 119 | 120 | @pytest.mark.parametrize("format", ["csr", "csc"]) 121 | def test_mT_fill_value(format): 122 | fv = 1.0 123 | arr = sparse.full((10, 20), fill_value=fv, format=format) 124 | assert_eq(arr.mT, sparse.full((20, 10), fill_value=fv)) 125 | 126 | 127 | def test_transpose_error(random_sparse): 128 | with pytest.raises(ValueError): 129 | random_sparse.transpose(axes=1) 130 | 131 | 132 | def test_matmul(random_sparse_small): 133 | arr = random_sparse_small.todense() 134 | 135 | actual = random_sparse_small @ random_sparse_small 136 | expected = arr @ arr 137 | 138 | assert_eq(actual, expected) 139 | -------------------------------------------------------------------------------- /sparse/numba_backend/tests/test_namespace.py: -------------------------------------------------------------------------------- 1 | import sparse 2 | 3 | 4 | def test_namespace(): 5 | from sparse.numba_backend._settings import IS_NUMPY2 6 | 7 | all_set = { 8 | "COO", 9 | "DOK", 10 | "GCXS", 11 | "SparseArray", 12 | "abs", 13 | "acos", 14 | "acosh", 15 | "add", 16 | "all", 17 | "any", 18 | "argmax", 19 | "argmin", 20 | "argwhere", 21 | "asCOO", 22 | "as_coo", 23 | "asarray", 24 | "asin", 25 | "asinh", 26 | "asnumpy", 27 | "astype", 28 | "atan", 29 | "atan2", 30 | "atanh", 31 | "bitwise_and", 32 | "bitwise_invert", 33 | "bitwise_left_shift", 34 | "bitwise_not", 35 | "bitwise_or", 36 | "bitwise_right_shift", 37 | "bitwise_xor", 38 | "bool", 39 | "broadcast_arrays", 40 | "broadcast_to", 41 | "can_cast", 42 | "ceil", 43 | "clip", 44 | "complex128", 45 | "complex64", 46 | "concat", 47 | "concatenate", 48 | "conj", 49 | "copysign", 50 | "cos", 51 | "cosh", 52 | "diagonal", 53 | "diagonalize", 54 | "diff", 55 | "divide", 56 | "dot", 57 | "e", 58 | "einsum", 59 | "elemwise", 60 | "empty", 61 | "empty_like", 62 | "equal", 63 | "exp", 64 | "expand_dims", 65 | "expm1", 66 | "eye", 67 | "finfo", 68 | "flip", 69 | "float16", 70 | "float32", 71 | "float64", 72 | "floor", 73 | "floor_divide", 74 | "full", 75 | "full_like", 76 | "greater", 77 | "greater_equal", 78 | "hypot", 79 | "iinfo", 80 | "imag", 81 | "inf", 82 | "int16", 83 | "int32", 84 | "int64", 85 | "int8", 86 | "interp", 87 | "isfinite", 88 | "isinf", 89 | "isnan", 90 | "isneginf", 91 | "isposinf", 92 | "kron", 93 | "less", 94 | "less_equal", 95 | "load_npz", 96 | "log", 97 | "log10", 98 | "log1p", 99 | "log2", 100 | "logaddexp", 101 | "logical_and", 102 | "logical_not", 103 | "logical_or", 104 | "logical_xor", 105 | "matrix_transpose", 106 | "matmul", 107 | "max", 108 | "maximum", 109 | "mean", 110 | "min", 111 | "minimum", 112 | "moveaxis", 113 | "multiply", 114 | "nan", 115 | "nanmax", 116 | "nanmean", 117 | "nanmin", 118 | "nanprod", 119 | "nanreduce", 120 | "nansum", 121 | "negative", 122 | "newaxis", 123 | "nextafter", 124 | "nonzero", 125 | "not_equal", 126 | "ones", 127 | "ones_like", 128 | "outer", 129 | "pad", 130 | "permute_dims", 131 | "pi", 132 | "positive", 133 | "pow", 134 | "prod", 135 | "random", 136 | "real", 137 | "reciprocal", 138 | "remainder", 139 | "repeat", 140 | "reshape", 141 | "result_type", 142 | "roll", 143 | "round", 144 | "save_npz", 145 | "sign", 146 | "signbit", 147 | "sin", 148 | "sinh", 149 | "sort", 150 | "sqrt", 151 | "square", 152 | "squeeze", 153 | "stack", 154 | "std", 155 | "subtract", 156 | "sum", 157 | "take", 158 | "tan", 159 | "tanh", 160 | "tensordot", 161 | 
"tile", 162 | "tril", 163 | "triu", 164 | "trunc", 165 | "uint16", 166 | "uint32", 167 | "uint64", 168 | "uint8", 169 | "unique_counts", 170 | "unique_values", 171 | "unstack", 172 | "var", 173 | "vecdot", 174 | "where", 175 | "zeros", 176 | "zeros_like", 177 | } 178 | 179 | if IS_NUMPY2: 180 | all_set.update({"isdtype"}) 181 | 182 | assert set(sparse.__all__) == all_set 183 | 184 | for attr in sparse.__all__: 185 | assert hasattr(sparse, attr) 186 | 187 | assert sorted(sparse.__all__) == sparse.__all__ 188 | -------------------------------------------------------------------------------- /sparse/numba_backend/_io.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | from ._compressed import GCXS 4 | from ._coo.core import COO 5 | 6 | 7 | def save_npz(filename, matrix, compressed=True): 8 | """Save a sparse matrix to disk in numpy's `.npz` format. 9 | Note: This is not binary compatible with scipy's `save_npz()`. 10 | This binary format is not currently stable. Will save a file 11 | that can only be opend with this package's `load_npz()`. 12 | 13 | Parameters 14 | ---------- 15 | filename : string or file 16 | Either the file name (string) or an open file (file-like object) 17 | where the data will be saved. If file is a string or a Path, the 18 | `.npz` extension will be appended to the file name if it is not 19 | already there 20 | matrix : SparseArray 21 | The matrix to save to disk 22 | compressed : bool 23 | Whether to save in compressed or uncompressed mode 24 | 25 | Examples 26 | -------- 27 | Store sparse matrix to disk, and load it again: 28 | 29 | >>> import os 30 | >>> import sparse 31 | >>> import numpy as np 32 | >>> dense_mat = np.array([[[0.0, 0.0], [0.0, 0.70677779]], [[0.0, 0.0], [0.0, 0.86522495]]]) 33 | >>> mat = sparse.COO(dense_mat) 34 | >>> mat 35 | 36 | >>> sparse.save_npz("mat.npz", mat) 37 | >>> loaded_mat = sparse.load_npz("mat.npz") 38 | >>> loaded_mat 39 | 40 | >>> os.remove("mat.npz") 41 | 42 | See Also 43 | -------- 44 | - [`sparse.load_npz`][] 45 | - [`scipy.sparse.save_npz`][] 46 | - [`scipy.sparse.load_npz`][] 47 | - [`numpy.savez`][] 48 | - [`numpy.load`][] 49 | 50 | """ 51 | 52 | nodes = { 53 | "data": matrix.data, 54 | "shape": matrix.shape, 55 | "fill_value": matrix.fill_value, 56 | } 57 | 58 | if type(matrix) is COO: 59 | nodes["coords"] = matrix.coords 60 | elif type(matrix) is GCXS: 61 | nodes["indices"] = matrix.indices 62 | nodes["indptr"] = matrix.indptr 63 | nodes["compressed_axes"] = matrix.compressed_axes 64 | 65 | if compressed: 66 | np.savez_compressed(filename, **nodes) 67 | else: 68 | np.savez(filename, **nodes) 69 | 70 | 71 | def load_npz(filename): 72 | """Load a sparse matrix in numpy's `.npz` format from disk. 73 | Note: This is not binary compatible with scipy's `save_npz()` 74 | output. This binary format is not currently stable. 75 | Will only load files saved by this package. 76 | 77 | Parameters 78 | ---------- 79 | filename : file-like object, string, or pathlib.Path 80 | The file to read. File-like objects must support the 81 | `seek()` and `read()` methods. 82 | 83 | Returns 84 | ------- 85 | SparseArray 86 | The sparse matrix at path `filename`. 87 | 88 | Examples 89 | -------- 90 | See [`sparse.save_npz`][] for usage examples. 
91 | 92 | See Also 93 | -------- 94 | - [`sparse.save_npz`][] 95 | - [`scipy.sparse.save_npz`][] 96 | - [`scipy.sparse.load_npz`][] 97 | - [`numpy.savez`][] 98 | - [`numpy.load`][] 99 | 100 | """ 101 | 102 | with np.load(filename) as fp: 103 | try: 104 | coords = fp["coords"] 105 | data = fp["data"] 106 | shape = tuple(fp["shape"]) 107 | fill_value = fp["fill_value"][()] 108 | return COO( 109 | coords=coords, 110 | data=data, 111 | shape=shape, 112 | sorted=True, 113 | has_duplicates=False, 114 | fill_value=fill_value, 115 | ) 116 | except KeyError: 117 | pass 118 | try: 119 | data = fp["data"] 120 | indices = fp["indices"] 121 | indptr = fp["indptr"] 122 | comp_axes = fp["compressed_axes"] 123 | shape = tuple(fp["shape"]) 124 | fill_value = fp["fill_value"][()] 125 | return GCXS( 126 | (data, indices, indptr), 127 | shape=shape, 128 | fill_value=fill_value, 129 | compressed_axes=comp_axes, 130 | ) 131 | except KeyError as e: 132 | raise RuntimeError(f"The file {filename!s} does not contain a valid sparse matrix") from e 133 | -------------------------------------------------------------------------------- /benchmarks/test_benchmark_coo.py: -------------------------------------------------------------------------------- 1 | import itertools 2 | import operator 3 | 4 | import sparse 5 | 6 | import pytest 7 | 8 | DENSITY = 0.01 9 | 10 | 11 | def format_id(format): 12 | return f"{format=}" 13 | 14 | 15 | @pytest.fixture(params=["coo", "gcxs"], ids=format_id) 16 | def format_param(request): 17 | return request.param 18 | 19 | 20 | @pytest.fixture 21 | def matmul_args(sides, format_param, rng, max_size): 22 | m, n, p = sides 23 | 24 | if m * n >= max_size or n * p >= max_size: 25 | pytest.skip() 26 | 27 | x = sparse.random((m, n), density=DENSITY, format=format_param, random_state=rng) 28 | y = sparse.random((n, p), density=DENSITY, format=format_param, random_state=rng) 29 | 30 | return x, y 31 | 32 | 33 | def test_matmul(benchmark, matmul_args): 34 | x, y = matmul_args 35 | 36 | x @ y # Numba compilation 37 | 38 | @benchmark 39 | def bench(): 40 | x @ y 41 | 42 | 43 | def get_test_id(params): 44 | side, rank, format = params 45 | return f"{side=}-{rank=}-{format=}" 46 | 47 | 48 | @pytest.fixture(params=itertools.product([100, 500, 1000], [1, 2, 3, 4], ["coo", "gcxs"]), ids=get_test_id) 49 | def elemwise_args(request, rng, max_size): 50 | side, rank, format = request.param 51 | if side**rank >= max_size: 52 | pytest.skip() 53 | shape = (side,) * rank 54 | x = sparse.random(shape, density=DENSITY, format=format, random_state=rng) 55 | y = sparse.random(shape, density=DENSITY, format=format, random_state=rng) 56 | return x, y 57 | 58 | 59 | @pytest.mark.parametrize("f", [operator.add, operator.mul]) 60 | def test_elemwise(benchmark, f, elemwise_args): 61 | x, y = elemwise_args 62 | f(x, y) 63 | 64 | @benchmark 65 | def bench(): 66 | f(x, y) 67 | 68 | 69 | def get_elemwise_ids(params): 70 | side, format = params 71 | return f"{side=}-{format=}" 72 | 73 | 74 | @pytest.fixture(params=itertools.product([100, 500, 1000], ["coo", "gcxs"]), ids=get_elemwise_ids) 75 | def elemwise_broadcast_args(request, rng, max_size): 76 | side, format = request.param 77 | if side**2 >= max_size: 78 | pytest.skip() 79 | x = sparse.random((side, 1, side), density=DENSITY, format=format, random_state=rng) 80 | y = sparse.random((side, side), density=DENSITY, format=format, random_state=rng) 81 | return x, y 82 | 83 | 84 | @pytest.mark.parametrize("f", [operator.add, operator.mul]) 85 | def 
test_elemwise_broadcast(benchmark, f, elemwise_broadcast_args): 86 | x, y = elemwise_broadcast_args 87 | f(x, y) 88 | 89 | @benchmark 90 | def bench(): 91 | f(x, y) 92 | 93 | 94 | @pytest.fixture(params=itertools.product([100, 500, 1000], [1, 2, 3], ["coo", "gcxs"]), ids=get_test_id) 95 | def indexing_args(request, rng, max_size): 96 | side, rank, format = request.param 97 | if side**rank >= max_size: 98 | pytest.skip() 99 | shape = (side,) * rank 100 | 101 | return sparse.random(shape, density=DENSITY, format=format, random_state=rng) 102 | 103 | 104 | def test_index_scalar(benchmark, indexing_args): 105 | x = indexing_args 106 | side = x.shape[0] 107 | rank = x.ndim 108 | 109 | x[(side // 2,) * rank] # Numba compilation 110 | 111 | @benchmark 112 | def bench(): 113 | x[(side // 2,) * rank] 114 | 115 | 116 | def test_index_slice(benchmark, indexing_args): 117 | x = indexing_args 118 | side = x.shape[0] 119 | rank = x.ndim 120 | 121 | x[(slice(side // 2),) * rank] # Numba compilation 122 | 123 | @benchmark 124 | def bench(): 125 | x[(slice(side // 2),) * rank] 126 | 127 | 128 | def test_index_fancy(benchmark, indexing_args, rng): 129 | x = indexing_args 130 | side = x.shape[0] 131 | index = rng.integers(0, side, size=(side // 2,)) 132 | 133 | x[index] # Numba compilation 134 | 135 | @benchmark 136 | def bench(): 137 | x[index] 138 | 139 | 140 | def get_sides_ids(param): 141 | m, n, p = param 142 | return f"{m=}-{n=}-{p=}" 143 | 144 | 145 | @pytest.fixture(params=itertools.product([200, 500, 1000], [200, 500, 1000], [200, 500, 1000]), ids=get_sides_ids) 146 | def sides(request): 147 | m, n, p = request.param 148 | return m, n, p 149 | 150 | 151 | @pytest.fixture(params=([(0, "coo"), (0, "gcxs"), (1, "gcxs")]), ids=["coo", "gcxs-0-axis", "gcxs-1-axis"]) 152 | def densemul_args(request, sides, rng, max_size): 153 | compressed_axis, format = request.param 154 | m, n, p = sides 155 | if m * n >= max_size or n * p >= max_size: 156 | pytest.skip() 157 | if format == "coo": 158 | x = sparse.random((m, n), density=DENSITY / 10, format=format, random_state=rng) 159 | else: 160 | x = sparse.random((m, n), density=DENSITY / 10, format=format, random_state=rng).change_compressed_axes( 161 | (compressed_axis,) 162 | ) 163 | t = rng.random((n, p)) 164 | 165 | return x, t 166 | 167 | 168 | def test_gcxs_dot_ndarray(benchmark, densemul_args): 169 | x, t = densemul_args 170 | 171 | # Numba compilation 172 | x @ t 173 | 174 | @benchmark 175 | def bench(): 176 | x @ t 177 | -------------------------------------------------------------------------------- /sparse/mlir_backend/_conversions.py: -------------------------------------------------------------------------------- 1 | import functools 2 | 3 | import numpy as np 4 | 5 | from ._array import Array 6 | from .formats import ConcreteFormat, Coo, Csf, Dense, Level, LevelFormat 7 | 8 | try: 9 | import scipy.sparse as sps 10 | 11 | ScipySparseArray = sps.sparray | sps.spmatrix 12 | except ImportError: 13 | sps = None 14 | ScipySparseArray = None 15 | 16 | 17 | def _guard_scipy(f): 18 | @functools.wraps(f) 19 | def wrapped(*args, **kwargs): 20 | if sps is None: 21 | raise RuntimeError("Could not import `scipy.sparse`. 
Please install `scipy`.") 22 | 23 | return f(*args, **kwargs) 24 | 25 | return wrapped 26 | 27 | 28 | def _from_numpy(arr: np.ndarray, copy: bool | None = None) -> Array: 29 | if copy is not None and not copy and not arr.flags["C_CONTIGUOUS"]: 30 | raise NotImplementedError("Can only convert C-contiguous arrays at the moment.") 31 | if copy: 32 | arr = arr.copy(order="C") 33 | arr_flat = np.ascontiguousarray(arr).reshape(-1) 34 | dense_format = Dense().with_ndim(arr.ndim).with_dtype(arr.dtype).build() 35 | return from_constituent_arrays(format=dense_format, arrays=(arr_flat,), shape=arr.shape) 36 | 37 | 38 | def to_numpy(arr: Array) -> np.ndarray: 39 | if not Dense.is_this_format(arr.format): 40 | raise TypeError(f"Cannot convert a non-dense array to NumPy. `{arr.format=}`") 41 | 42 | (data,) = arr.get_constituent_arrays() 43 | arg_order = [0] * arr.format.storage_rank 44 | for i, o in enumerate(arr.format.order): 45 | arg_order[o] = i 46 | arg_order = tuple(arg_order) 47 | storage_shape = tuple(int(arr.shape[o]) for o in arg_order) 48 | return data.reshape(storage_shape).transpose(arg_order) 49 | 50 | 51 | @_guard_scipy 52 | def _from_scipy(arr: ScipySparseArray, copy: bool | None = None) -> Array: 53 | if not isinstance(arr, ScipySparseArray): 54 | raise TypeError(f"`arr` is not a `scipy.sparse` array, `{type(arr)=}`.") 55 | match arr.format: 56 | case "csr" | "csc": 57 | order = (0, 1) if arr.format == "csr" else (1, 0) 58 | pos_width = arr.indptr.dtype.itemsize * 8 59 | crd_width = arr.indices.dtype.itemsize * 8 60 | csx_format = ( 61 | Csf() 62 | .with_ndim(2, canonical=arr.has_canonical_format) 63 | .with_dtype(arr.dtype) 64 | .with_crd_width(crd_width) 65 | .with_pos_width(pos_width) 66 | .with_order(order) 67 | .build() 68 | ) 69 | 70 | indptr = arr.indptr 71 | indices = arr.indices 72 | data = arr.data 73 | 74 | if copy: 75 | indptr = indptr.copy() 76 | indices = indices.copy() 77 | data = data.copy() 78 | 79 | return from_constituent_arrays(format=csx_format, arrays=(indptr, indices, data), shape=arr.shape) 80 | case "coo": 81 | row, col = arr.row, arr.col 82 | if row.dtype != col.dtype: 83 | raise RuntimeError(f"`row` and `col` dtypes must be the same: {row.dtype} != {col.dtype}.") 84 | pos = np.array([0, arr.nnz], dtype=np.int64) 85 | pos_width = pos.dtype.itemsize * 8 86 | crd_width = row.dtype.itemsize * 8 87 | data = arr.data 88 | if copy: 89 | data = data.copy() 90 | row = row.copy() 91 | col = col.copy() 92 | 93 | coo_format = ( 94 | Coo() 95 | .with_ndim(2, canonical=arr.has_canonical_format) 96 | .with_dtype(arr.dtype) 97 | .with_pos_width(pos_width) 98 | .with_crd_width(crd_width) 99 | .build() 100 | ) 101 | 102 | return from_constituent_arrays(format=coo_format, arrays=(pos, row, col, data), shape=arr.shape) 103 | case _: 104 | raise NotImplementedError(f"No conversion implemented for `scipy.sparse.{type(arr).__name__}`.") 105 | 106 | 107 | @_guard_scipy 108 | def to_scipy(arr: Array) -> ScipySparseArray: 109 | storage_format = arr.format 110 | 111 | match storage_format.levels: 112 | case (Level(LevelFormat.Dense, _), Level(LevelFormat.Compressed, _)): 113 | indptr, indices, data = arr.get_constituent_arrays() 114 | if storage_format.order == (0, 1): 115 | return sps.csr_array((data, indices, indptr), shape=arr.shape) 116 | return sps.csc_array((data, indices, indptr), shape=arr.shape) 117 | case (Level(LevelFormat.Compressed, _), Level(LevelFormat.Singleton, _)): 118 | _, row, col, data = arr.get_constituent_arrays() 119 | return sps.coo_array((data, (row, col)),
shape=arr.shape) 120 | case _: 121 | raise RuntimeError(f"No conversion implemented for `{storage_format=}`.") 122 | 123 | 124 | def asarray(arr, copy: bool | None = None) -> Array: 125 | if sps is not None and isinstance(arr, ScipySparseArray): 126 | return _from_scipy(arr, copy=copy) 127 | if isinstance(arr, np.ndarray): 128 | return _from_numpy(arr, copy=copy) 129 | 130 | if isinstance(arr, Array): 131 | if copy: 132 | arr = arr.copy() 133 | return arr 134 | 135 | if copy is not None and not copy and not isinstance(arr, np.ndarray): 136 | raise ValueError("Cannot non-copy convert this object.") 137 | 138 | return _from_numpy(np.asarray(arr), copy=copy) 139 | 140 | 141 | def from_constituent_arrays(*, format: ConcreteFormat, arrays: tuple[np.ndarray, ...], shape: tuple[int, ...]) -> Array: 142 | storage = format._get_ctypes_type().from_constituent_arrays(arrays) 143 | return Array(storage=storage, shape=shape) 144 | -------------------------------------------------------------------------------- /docs/conduct.md: -------------------------------------------------------------------------------- 1 | # Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | We as members, contributors, and leaders pledge to make participation in 6 | our community a harassment-free experience for everyone, regardless of 7 | age, body size, visible or invisible disability, ethnicity, sex 8 | characteristics, gender identity and expression, level of experience, 9 | education, socio-economic status, nationality, personal appearance, 10 | race, religion, or sexual identity and orientation. 11 | 12 | We pledge to act and interact in ways that contribute to an open, 13 | welcoming, diverse, inclusive, and healthy community. 14 | 15 | ## Our Standards 16 | 17 | Examples of behavior that contributes to a positive environment for our 18 | community include: 19 | 20 | - Demonstrating empathy and kindness toward other people 21 | - Being respectful of differing opinions, viewpoints, and experiences 22 | - Giving and gracefully accepting constructive feedback 23 | - Accepting responsibility and apologizing to those affected by our 24 | mistakes, and learning from the experience 25 | - Focusing on what is best not just for us as individuals, but for the 26 | overall community 27 | 28 | Examples of unacceptable behavior include: 29 | 30 | - The use of sexualized language or imagery, and sexual attention or 31 | advances of any kind 32 | - Trolling, insulting or derogatory comments, and personal or political 33 | attacks 34 | - Public or private harassment 35 | - Publishing others' private information, such as a physical or email 36 | address, without their explicit permission 37 | - Other conduct which could reasonably be considered inappropriate in a 38 | professional setting 39 | 40 | ## Enforcement Responsibilities 41 | 42 | Community leaders are responsible for clarifying and enforcing our 43 | standards of acceptable behavior and will take appropriate and fair 44 | corrective action in response to any behavior that they deem 45 | inappropriate, threatening, offensive, or harmful. 46 | 47 | Community leaders have the right and responsibility to remove, edit, or 48 | reject comments, commits, code, wiki edits, issues, and other 49 | contributions that are not aligned to this Code of Conduct, and will 50 | communicate reasons for moderation decisions when appropriate. 
51 | 52 | ## Scope 53 | 54 | This Code of Conduct applies within all community spaces, and also 55 | applies when an individual is officially representing the community in 56 | public spaces. Examples of representing our community include using an 57 | official e-mail address, posting via an official social media account, 58 | or acting as an appointed representative at an online or offline event. 59 | 60 | ## Enforcement 61 | 62 | 63 | Instances of abusive, harassing, or otherwise unacceptable behavior may 64 | be reported to the community leaders responsible for enforcement at 65 | [hameerabbasi@yahoo.com](mailto:hameerabbasi@yahoo.com). All complaints will be reviewed and 66 | investigated promptly and fairly. 67 | 68 | All community leaders are obligated to respect the privacy and security 69 | of the reporter of any incident. 70 | 71 | ## Enforcement Guidelines 72 | 73 | Community leaders will follow these Community Impact Guidelines in 74 | determining the consequences for any action they deem in violation of 75 | this Code of Conduct: 76 | 77 | ### 1. Correction 78 | 79 | **Community Impact**: Use of inappropriate language or other behavior 80 | deemed unprofessional or unwelcome in the community. 81 | 82 | **Consequence**: A private, written warning from community leaders, 83 | providing clarity around the nature of the violation and an explanation 84 | of why the behavior was inappropriate. A public apology may be 85 | requested. 86 | 87 | ### 2. Warning 88 | 89 | **Community Impact**: A violation through a single incident or series of 90 | actions. 91 | 92 | **Consequence**: A warning with consequences for continued behavior. No 93 | interaction with the people involved, including unsolicited interaction 94 | with those enforcing the Code of Conduct, for a specified period of 95 | time. This includes avoiding interactions in community spaces as well as 96 | external channels like social media. Violating these terms may lead to a 97 | temporary or permanent ban. 98 | 99 | ### 3. Temporary Ban 100 | 101 | **Community Impact**: A serious violation of community standards, 102 | including sustained inappropriate behavior. 103 | 104 | **Consequence**: A temporary ban from any sort of interaction or public 105 | communication with the community for a specified period of time. No 106 | public or private interaction with the people involved, including 107 | unsolicited interaction with those enforcing the Code of Conduct, is 108 | allowed during this period. Violating these terms may lead to a 109 | permanent ban. 110 | 111 | ### 4. Permanent Ban 112 | 113 | **Community Impact**: Demonstrating a pattern of violation of community 114 | standards, including sustained inappropriate behavior, harassment of an 115 | individual, or aggression toward or disparagement of classes of 116 | individuals. 117 | 118 | **Consequence**: A permanent ban from any sort of public interaction 119 | within the community. 120 | 121 | ## Attribution 122 | 123 | 124 | This Code of Conduct is adapted from the [Contributor 125 | Covenant](https://www.contributor-covenant.org), version 2.0, 126 | available at 127 | [https://www.contributor-covenant.org/version/2/0/code\_of\_conduct.html](https://www.contributor-covenant.org/version/2/0/code\_of\_conduct.html). 128 | 129 | Community Impact Guidelines were inspired by [Mozilla's code of conduct 130 | enforcement ladder](https://github.com/mozilla/inclusion). 
131 | 132 | For answers to common questions about this code of conduct, see the FAQ 133 | at [https://www.contributor-covenant.org/faq](https://www.contributor-covenant.org/faq). Translations are available 134 | at [https://www.contributor-covenant.org/translations](https://www.contributor-covenant.org/translations). 135 | -------------------------------------------------------------------------------- /docs/contributing.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | ## General Guidelines 4 | 5 | sparse is a community-driven project on GitHub. You can find our 6 | [repository on GitHub](https://github.com/pydata/sparse). Feel 7 | free to open issues for new features or bugs, or open a pull request 8 | to fix a bug or add a new feature. 9 | 10 | If you haven't contributed to open-source before, we recommend you read 11 | [this excellent guide by GitHub on how to contribute to open source](https://opensource.guide/how-to-contribute). The guide is long, 12 | so you can gloss over things you're familiar with. 13 | 14 | If you're not already familiar with it, we follow the [fork and pull model](https://help.github.com/articles/about-collaborative-development-models) 15 | on GitHub. 16 | 17 | ## Filing Issues 18 | 19 | If you find a bug or would like a new feature, you might want to *consider 20 | filing a new issue* on [GitHub](https://github.com/pydata/sparse/issues). Before 21 | you open a new issue, please make sure of the following: 22 | 23 | * This should go without saying, but make sure what you are requesting is within 24 | the scope of this project. 25 | * The bug/feature is still present/missing on the `main` branch on GitHub. 26 | * A similar issue or pull request isn't already open. If one already is, it's better 27 | to contribute to the discussion there. 28 | 29 | ## Contributing Code 30 | 31 | This project has a number of requirements for all code contributed. 32 | 33 | * We use `pre-commit` to automatically lint the code and maintain code style. 34 | * We use Numpy-style docstrings. 35 | * It's ideal if user-facing API changes or new features have documentation added. 36 | * 100% code coverage is recommended for all new code in any submitted PR. Doctests 37 | count toward coverage. 38 | * Performance optimizations should have benchmarks added in `benchmarks`. 39 | 40 | ## Setting up Your Development Environment 41 | 42 | The following bash script is all you need to set up your development environment, 43 | after forking and cloning the repository: 44 | 45 | ```bash 46 | 47 | pip install -e .[all] 48 | ``` 49 | 50 | ## Pull requests 51 | 52 | Please adhere to the following guidelines: 53 | 54 | 1. Start your pull request title with a [conventional commit](https://www.conventionalcommits.org/) tag. This helps us add your contribution to the right section of the changelog. We use "Type" from the [Angular convention](https://github.com/angular/angular/blob/22b96b9/CONTRIBUTING.md#type).
55 | TLDR:
56 | The PR title should start with any of these abbreviations: `build`, `chore`, `ci`, `depr`, 57 | `docs`, `feat`, `fix`, `perf`, `refactor`, `release`, `test`. Add a `!` at the end if it is a breaking change. For example `refactor!`. 58 | 2. This text will end up in the changelog. 59 | 3. Please follow the instructions in the pull request form and submit. 60 | 61 | ## Running/Adding Unit Tests 62 | 63 | It is best if all new functionality and/or bug fixes have unit tests added 64 | with each use-case. 65 | 66 | We use [pytest](https://docs.pytest.org/en/latest) as our unit testing framework, 67 | with the `pytest-cov` extension to check code coverage and `pytest-flake8` to 68 | check code style. You don't need to configure these extensions yourself. Once you've 69 | configured your environment, you can just `cd` to the root of your repository and run 70 | 71 | ```bash 72 | pytest --pyargs sparse 73 | ``` 74 | 75 | This automatically checks code style and functionality, and prints code coverage, 76 | even though it doesn't fail on low coverage. 77 | 78 | Unit tests are automatically run on GitHub Actions for pull requests. 79 | 80 | ### Advanced 81 | 82 | To run the complete set of unit tests run in CI for your platform, run the following 83 | in the repository root: 84 | 85 | ```bash 86 | ci/setup_env.sh 87 | ACTIVATE_VENV=1 ci/test_all.sh 88 | ``` 89 | 90 | ## Coverage 91 | 92 | The `pytest` script automatically reports coverage, both on the terminal for 93 | missing line numbers, and in annotated HTML form in `htmlcov/index.html`. 94 | 95 | Coverage is automatically checked on CodeCov for pull requests. 96 | 97 | ## Adding/Building the Documentation 98 | 99 | If a feature is stable and relatively finalized, it is time to add it to the 100 | documentation. If you are adding any private/public functions, it is best to 101 | add docstrings, to aid in reviewing code and also for the API reference. 102 | 103 | We use [Numpy style docstrings](https://numpydoc.readthedocs.io/en/latest/format.html) 104 | and [Material for MkDocs](https://squidfunk.github.io/mkdocs-material) to document this library. 105 | MkDocs, in turn, uses [Markdown](https://www.markdownguide.org) 106 | as its markup language. 107 | 108 | We use [mkdocstrings](https://mkdocstrings.github.io/recipes) with the 109 | [mkdocs-gen-files plugin](https://oprypin.github.io/mkdocs-gen-files) 110 | to generate API references. 111 | 112 | To build the documentation, you can run 113 | 114 | ```bash 115 | 116 | mkdocs build 117 | mkdocs serve 118 | ``` 119 | 120 | After this, you can see a version of the documentation on your local server. 121 | 122 | Documentation for each pull request is automatically built on `Read the Docs`. 123 | It is rebuilt with every new commit to your PR. There will be a link to preview it 124 | from your PR checks area on `GitHub` when ready. 125 | 126 | 127 | ## Adding and Running Benchmarks 128 | 129 | We use [`CodSpeed`](https://docs.codspeed.io/) to run benchmarks. They are run in the CI environment 130 | when a pull request is opened. Then the results of the run are sent to `CodSpeed` servers to be analyzed. 131 | When the analysis is done, a report is generated and posted automatically as a comment to the PR. 132 | The report includes a link to the `CodSpeed` cloud where you can see all the results. 133 | 134 | If you add benchmarks, they should be written as regular tests to be used with pytest, and use the fixture `benchmark`. Please see the `CodSpeed` documentation for more details.
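For instance, a new benchmark could look like the following sketch, which mirrors the warm-up-then-measure pattern used by the existing tests in `benchmarks/` (the shapes and density here are illustrative):

```python
import sparse


def test_tensordot(benchmark):
    x = sparse.random((100, 100), density=0.01)
    y = sparse.random((100, 100), density=0.01)

    sparse.tensordot(x, y, axes=1)  # Warm-up call so Numba compilation is not measured

    @benchmark
    def bench():
        sparse.tensordot(x, y, axes=1)
```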
135 | -------------------------------------------------------------------------------- /sparse/numba_backend/tests/test_einsum.py: -------------------------------------------------------------------------------- 1 | import sparse 2 | 3 | import pytest 4 | 5 | import numpy as np 6 | 7 | einsum_cases = [ 8 | "a,->a", 9 | "ab,->ab", 10 | ",ab,->ab", 11 | ",,->", 12 | "a,ab,abc->abc", 13 | "a,b,ab->ab", 14 | "ea,fb,gc,hd,abcd->efgh", 15 | "ea,fb,abcd,gc,hd->efgh", 16 | "abcd,ea,fb,gc,hd->efgh", 17 | "acdf,jbje,gihb,hfac,gfac,gifabc,hfac", 18 | "cd,bdhe,aidb,hgca,gc,hgibcd,hgac", 19 | "abhe,hidj,jgba,hiab,gab", 20 | "bde,cdh,agdb,hica,ibd,hgicd,hiac", 21 | "chd,bde,agbc,hiad,hgc,hgi,hiad", 22 | "chd,bde,agbc,hiad,bdi,cgh,agdb", 23 | "bdhe,acad,hiab,agac,hibd", 24 | "ab,ab,c->", 25 | "ab,ab,c->c", 26 | "ab,ab,cd,cd->", 27 | "ab,ab,cd,cd->ac", 28 | "ab,ab,cd,cd->cd", 29 | "ab,ab,cd,cd,ef,ef->", 30 | "ab,cd,ef->abcdef", 31 | "ab,cd,ef->acdf", 32 | "ab,cd,de->abcde", 33 | "ab,cd,de->be", 34 | "ab,bcd,cd->abcd", 35 | "ab,bcd,cd->abd", 36 | "eb,cb,fb->cef", 37 | "dd,fb,be,cdb->cef", 38 | "bca,cdb,dbf,afc->", 39 | "dcc,fce,ea,dbf->ab", 40 | "fdf,cdd,ccd,afe->ae", 41 | "abcd,ad", 42 | "ed,fcd,ff,bcf->be", 43 | "baa,dcf,af,cde->be", 44 | "bd,db,eac->ace", 45 | "fff,fae,bef,def->abd", 46 | "efc,dbc,acf,fd->abe", 47 | "ab,ab", 48 | "ab,ba", 49 | "abc,abc", 50 | "abc,bac", 51 | "abc,cba", 52 | "ab,bc", 53 | "ab,cb", 54 | "ba,bc", 55 | "ba,cb", 56 | "abcd,cd", 57 | "abcd,ab", 58 | "abcd,cdef", 59 | "abcd,cdef->feba", 60 | "abcd,efdc", 61 | "aab,bc->ac", 62 | "ab,bcc->ac", 63 | "aab,bcc->ac", 64 | "baa,bcc->ac", 65 | "aab,ccb->ac", 66 | "aab,fa,df,ecc->bde", 67 | "ecb,fef,bad,ed->ac", 68 | "bcf,bbb,fbf,fc->", 69 | "bb,ff,be->e", 70 | "bcb,bb,fc,fff->", 71 | "fbb,dfd,fc,fc->", 72 | "afd,ba,cc,dc->bf", 73 | "adb,bc,fa,cfc->d", 74 | "bbd,bda,fc,db->acf", 75 | "dba,ead,cad->bce", 76 | "aef,fbc,dca->bde", 77 | "abab->ba", 78 | "...ab,...ab", 79 | "...ab,...b->...a", 80 | "a...,a...", 81 | "a...,a...", 82 | ] 83 | 84 | 85 | @pytest.mark.parametrize("subscripts", einsum_cases) 86 | @pytest.mark.parametrize("density", [0.1, 1.0]) 87 | def test_einsum(subscripts, density): 88 | d = 4 89 | terms = subscripts.split("->")[0].split(",") 90 | arrays = [sparse.random((d,) * len(term), density=density) for term in terms] 91 | sparse_out = sparse.einsum(subscripts, *arrays) 92 | numpy_out = np.einsum(subscripts, *(s.todense() for s in arrays)) 93 | 94 | if not numpy_out.shape: 95 | # scalar output 96 | assert np.allclose(numpy_out, sparse_out) 97 | else: 98 | # array output 99 | assert np.allclose(numpy_out, sparse_out.todense()) 100 | 101 | 102 | @pytest.mark.parametrize("input", [[[0, 0]], [[0, Ellipsis]], [[Ellipsis, 1], [Ellipsis]], [[0, 1], [0]]]) 103 | @pytest.mark.parametrize("density", [0.1, 1.0]) 104 | def test_einsum_nosubscript(input, density): 105 | d = 4 106 | arrays = [sparse.random((d, d), density=density)] 107 | sparse_out = sparse.einsum(*arrays, *input) 108 | numpy_out = np.einsum(*(s.todense() for s in arrays), *input) 109 | 110 | if not numpy_out.shape: 111 | # scalar output 112 | assert np.allclose(numpy_out, sparse_out) 113 | else: 114 | # array output 115 | assert np.allclose(numpy_out, sparse_out.todense()) 116 | 117 | 118 | def test_einsum_input_fill_value(): 119 | x = sparse.random(shape=(2,), density=0.5, format="coo", fill_value=2) 120 | with pytest.raises(ValueError): 121 | sparse.einsum("cba", x) 122 | 123 | 124 | def test_einsum_no_input(): 125 | with pytest.raises(ValueError): 126 | 
sparse.einsum() 127 | 128 | 129 | @pytest.mark.parametrize("subscript", ["a+b->c", "i->&", "i->ij", "ij->jij", "a..,a...", ".i...", "a,a->->"]) 130 | def test_einsum_invalid_input(subscript): 131 | x = sparse.random(shape=(2,), density=0.5, format="coo") 132 | y = sparse.random(shape=(2,), density=0.5, format="coo") 133 | with pytest.raises(ValueError): 134 | sparse.einsum(subscript, x, y) 135 | 136 | 137 | @pytest.mark.parametrize("subscript", [0, [0, 0]]) 138 | def test_einsum_type_error(subscript): 139 | x = sparse.random(shape=(2,), density=0.5, format="coo") 140 | y = sparse.random(shape=(2,), density=0.5, format="coo") 141 | with pytest.raises(TypeError): 142 | sparse.einsum(subscript, x, y) 143 | 144 | 145 | format_test_cases = [ 146 | (("coo",), "coo"), 147 | (("dok",), "dok"), 148 | (("gcxs",), "gcxs"), 149 | (("dense",), "dense"), 150 | (("coo", "coo"), "coo"), 151 | (("dok", "coo"), "coo"), 152 | (("coo", "dok"), "coo"), 153 | (("coo", "dense"), "coo"), 154 | (("dense", "coo"), "coo"), 155 | (("dok", "dense"), "dok"), 156 | (("dense", "dok"), "dok"), 157 | (("gcxs", "dense"), "gcxs"), 158 | (("dense", "gcxs"), "gcxs"), 159 | (("dense", "dense"), "dense"), 160 | (("dense", "dok", "gcxs"), "coo"), 161 | ] 162 | 163 | 164 | @pytest.mark.parametrize("formats,expected", format_test_cases) 165 | def test_einsum_format(formats, expected, rng): 166 | inputs = [ 167 | rng.standard_normal((2, 2, 2)) if format == "dense" else sparse.random((2, 2, 2), density=0.5, format=format) 168 | for format in formats 169 | ] 170 | if len(inputs) == 1: 171 | eq = "abc->bc" 172 | elif len(inputs) == 2: 173 | eq = "abc,cda->abd" 174 | elif len(inputs) == 3: 175 | eq = "abc,cad,dea->abe" 176 | 177 | out = sparse.einsum(eq, *inputs) 178 | assert { 179 | sparse.COO: "coo", 180 | sparse.DOK: "dok", 181 | sparse.GCXS: "gcxs", 182 | np.ndarray: "dense", 183 | }[out.__class__] == expected 184 | 185 | 186 | def test_einsum_shape_check(): 187 | x = sparse.random((2, 3, 4), density=0.5) 188 | with pytest.raises(ValueError): 189 | sparse.einsum("aab", x) 190 | y = sparse.random((2, 3, 4), density=0.5) 191 | with pytest.raises(ValueError): 192 | sparse.einsum("abc,acb", x, y) 193 | 194 | 195 | @pytest.mark.parametrize("dtype", [np.int64, np.complex128]) 196 | def test_einsum_dtype(dtype): 197 | x = sparse.random((3, 3), density=0.5) * 10.0 198 | x = x.astype(np.float64) 199 | 200 | y = sparse.COO.from_numpy(np.ones((3, 1), dtype=np.float64)) 201 | 202 | result = sparse.einsum("ij,i->j", x, y, dtype=dtype) 203 | 204 | assert result.dtype == dtype 205 | -------------------------------------------------------------------------------- /sparse/numba_backend/__init__.py: -------------------------------------------------------------------------------- 1 | from numpy import ( 2 | add, 3 | bitwise_and, 4 | bitwise_not, 5 | bitwise_or, 6 | bitwise_xor, 7 | ceil, 8 | complex64, 9 | complex128, 10 | conj, 11 | copysign, 12 | cos, 13 | cosh, 14 | divide, 15 | e, 16 | exp, 17 | expm1, 18 | finfo, 19 | float16, 20 | float32, 21 | float64, 22 | floor, 23 | floor_divide, 24 | greater, 25 | greater_equal, 26 | hypot, 27 | iinfo, 28 | inf, 29 | int8, 30 | int16, 31 | int32, 32 | int64, 33 | isfinite, 34 | less, 35 | less_equal, 36 | log, 37 | log1p, 38 | log2, 39 | log10, 40 | logaddexp, 41 | logical_and, 42 | logical_not, 43 | logical_or, 44 | logical_xor, 45 | maximum, 46 | minimum, 47 | multiply, 48 | nan, 49 | negative, 50 | newaxis, 51 | nextafter, 52 | not_equal, 53 | pi, 54 | positive, 55 | reciprocal, 56 | remainder, 57 | 
sign, 58 | signbit, 59 | sin, 60 | sinh, 61 | sqrt, 62 | square, 63 | subtract, 64 | tan, 65 | tanh, 66 | trunc, 67 | uint8, 68 | uint16, 69 | uint32, 70 | uint64, 71 | ) 72 | from numpy import arccos as acos 73 | from numpy import arccosh as acosh 74 | from numpy import arcsin as asin 75 | from numpy import arcsinh as asinh 76 | from numpy import arctan as atan 77 | from numpy import arctan2 as atan2 78 | from numpy import arctanh as atanh 79 | from numpy import bool_ as bool 80 | from numpy import invert as bitwise_invert 81 | from numpy import left_shift as bitwise_left_shift 82 | from numpy import power as pow 83 | from numpy import right_shift as bitwise_right_shift 84 | 85 | from ._common import ( 86 | SparseArray, 87 | abs, 88 | all, 89 | any, 90 | asarray, 91 | asnumpy, 92 | astype, 93 | broadcast_arrays, 94 | broadcast_to, 95 | can_cast, 96 | concat, 97 | concatenate, 98 | diff, 99 | dot, 100 | einsum, 101 | empty, 102 | empty_like, 103 | equal, 104 | eye, 105 | full, 106 | full_like, 107 | imag, 108 | interp, 109 | isinf, 110 | isnan, 111 | matmul, 112 | max, 113 | mean, 114 | min, 115 | moveaxis, 116 | nonzero, 117 | ones, 118 | ones_like, 119 | outer, 120 | pad, 121 | permute_dims, 122 | prod, 123 | real, 124 | repeat, 125 | reshape, 126 | round, 127 | squeeze, 128 | stack, 129 | std, 130 | sum, 131 | tensordot, 132 | tile, 133 | unstack, 134 | var, 135 | vecdot, 136 | zeros, 137 | zeros_like, 138 | ) 139 | from ._compressed import GCXS 140 | from ._coo import COO, as_coo 141 | from ._coo.common import ( 142 | argmax, 143 | argmin, 144 | argwhere, 145 | asCOO, 146 | clip, 147 | diagonal, 148 | diagonalize, 149 | expand_dims, 150 | flip, 151 | isneginf, 152 | isposinf, 153 | kron, 154 | matrix_transpose, 155 | nanmax, 156 | nanmean, 157 | nanmin, 158 | nanprod, 159 | nanreduce, 160 | nansum, 161 | result_type, 162 | roll, 163 | sort, 164 | take, 165 | tril, 166 | triu, 167 | unique_counts, 168 | unique_values, 169 | where, 170 | ) 171 | from ._dok import DOK 172 | from ._io import load_npz, save_npz 173 | from ._settings import IS_NUMPY2 as _IS_NUMPY2 174 | from ._settings import __array_namespace_info__ # noqa: F401 175 | from ._umath import elemwise 176 | from ._utils import random 177 | 178 | __all__ = [ 179 | "COO", 180 | "DOK", 181 | "GCXS", 182 | "SparseArray", 183 | "abs", 184 | "acos", 185 | "acosh", 186 | "add", 187 | "all", 188 | "any", 189 | "argmax", 190 | "argmin", 191 | "argwhere", 192 | "asCOO", 193 | "as_coo", 194 | "asarray", 195 | "asin", 196 | "asinh", 197 | "asnumpy", 198 | "astype", 199 | "atan", 200 | "atan2", 201 | "atanh", 202 | "bitwise_and", 203 | "bitwise_invert", 204 | "bitwise_left_shift", 205 | "bitwise_not", 206 | "bitwise_or", 207 | "bitwise_right_shift", 208 | "bitwise_xor", 209 | "bool", 210 | "broadcast_arrays", 211 | "broadcast_to", 212 | "can_cast", 213 | "ceil", 214 | "clip", 215 | "complex128", 216 | "complex64", 217 | "concat", 218 | "concatenate", 219 | "conj", 220 | "copysign", 221 | "cos", 222 | "cosh", 223 | "diagonal", 224 | "diagonalize", 225 | "divide", 226 | "dot", 227 | "e", 228 | "einsum", 229 | "elemwise", 230 | "empty", 231 | "empty_like", 232 | "equal", 233 | "exp", 234 | "expand_dims", 235 | "expm1", 236 | "eye", 237 | "finfo", 238 | "flip", 239 | "float16", 240 | "float32", 241 | "float64", 242 | "floor", 243 | "floor_divide", 244 | "full", 245 | "full_like", 246 | "greater", 247 | "greater_equal", 248 | "hypot", 249 | "iinfo", 250 | "imag", 251 | "inf", 252 | "int16", 253 | "int32", 254 | "int64", 255 | "int8", 256 | 
"interp", 257 | "isfinite", 258 | "isinf", 259 | "isnan", 260 | "isneginf", 261 | "isposinf", 262 | "kron", 263 | "less", 264 | "less_equal", 265 | "load_npz", 266 | "log", 267 | "log10", 268 | "log1p", 269 | "log2", 270 | "logaddexp", 271 | "logical_and", 272 | "logical_not", 273 | "logical_or", 274 | "logical_xor", 275 | "matmul", 276 | "matrix_transpose", 277 | "max", 278 | "maximum", 279 | "mean", 280 | "min", 281 | "minimum", 282 | "moveaxis", 283 | "multiply", 284 | "nan", 285 | "nanmax", 286 | "nanmean", 287 | "nanmin", 288 | "nanprod", 289 | "nanreduce", 290 | "nansum", 291 | "negative", 292 | "newaxis", 293 | "nextafter", 294 | "nonzero", 295 | "not_equal", 296 | "ones", 297 | "ones_like", 298 | "outer", 299 | "pad", 300 | "permute_dims", 301 | "pi", 302 | "positive", 303 | "pow", 304 | "prod", 305 | "random", 306 | "real", 307 | "reciprocal", 308 | "remainder", 309 | "reshape", 310 | "result_type", 311 | "roll", 312 | "round", 313 | "save_npz", 314 | "sign", 315 | "signbit", 316 | "sin", 317 | "sinh", 318 | "sort", 319 | "sqrt", 320 | "square", 321 | "squeeze", 322 | "stack", 323 | "std", 324 | "subtract", 325 | "sum", 326 | "take", 327 | "tan", 328 | "tanh", 329 | "tensordot", 330 | "tril", 331 | "triu", 332 | "trunc", 333 | "uint16", 334 | "uint32", 335 | "uint64", 336 | "uint8", 337 | "unique_counts", 338 | "unique_values", 339 | "var", 340 | "vecdot", 341 | "where", 342 | "zeros", 343 | "zeros_like", 344 | "repeat", 345 | "tile", 346 | "unstack", 347 | "diff", 348 | ] 349 | 350 | 351 | if _IS_NUMPY2: 352 | from numpy import isdtype 353 | 354 | __all__ += [ 355 | "isdtype", 356 | ] 357 | 358 | __all__.sort() 359 | -------------------------------------------------------------------------------- /ci/Numba-array-api-xfails.txt: -------------------------------------------------------------------------------- 1 | array_api_tests/test_array_object.py::test_setitem 2 | array_api_tests/test_array_object.py::test_getitem_masking 3 | array_api_tests/test_array_object.py::test_setitem_masking 4 | array_api_tests/test_creation_functions.py::test_arange 5 | array_api_tests/test_creation_functions.py::test_linspace 6 | array_api_tests/test_creation_functions.py::test_meshgrid 7 | array_api_tests/test_data_type_functions.py::test_finfo[float32] 8 | array_api_tests/test_has_names.py::test_has_names[linalg-cholesky] 9 | array_api_tests/test_has_names.py::test_has_names[linalg-cross] 10 | array_api_tests/test_has_names.py::test_has_names[linalg-det] 11 | array_api_tests/test_has_names.py::test_has_names[linalg-diagonal] 12 | array_api_tests/test_has_names.py::test_has_names[linalg-eigh] 13 | array_api_tests/test_has_names.py::test_has_names[linalg-eigvalsh] 14 | array_api_tests/test_has_names.py::test_has_names[linalg-inv] 15 | array_api_tests/test_has_names.py::test_has_names[linalg-matmul] 16 | array_api_tests/test_has_names.py::test_has_names[linalg-matrix_norm] 17 | array_api_tests/test_has_names.py::test_has_names[linalg-matrix_power] 18 | array_api_tests/test_has_names.py::test_has_names[linalg-matrix_rank] 19 | array_api_tests/test_has_names.py::test_has_names[linalg-matrix_transpose] 20 | array_api_tests/test_has_names.py::test_has_names[linalg-outer] 21 | array_api_tests/test_has_names.py::test_has_names[linalg-pinv] 22 | array_api_tests/test_has_names.py::test_has_names[linalg-qr] 23 | array_api_tests/test_has_names.py::test_has_names[linalg-slogdet] 24 | array_api_tests/test_has_names.py::test_has_names[linalg-solve] 25 | array_api_tests/test_has_names.py::test_has_names[linalg-svd] 
26 | array_api_tests/test_has_names.py::test_has_names[linalg-svdvals] 27 | array_api_tests/test_has_names.py::test_has_names[linalg-tensordot] 28 | array_api_tests/test_has_names.py::test_has_names[linalg-trace] 29 | array_api_tests/test_has_names.py::test_has_names[linalg-vecdot] 30 | array_api_tests/test_has_names.py::test_has_names[linalg-vector_norm] 31 | array_api_tests/test_has_names.py::test_has_names[set-unique_all] 32 | array_api_tests/test_has_names.py::test_has_names[set-unique_inverse] 33 | array_api_tests/test_has_names.py::test_has_names[creation-arange] 34 | array_api_tests/test_has_names.py::test_has_names[creation-from_dlpack] 35 | array_api_tests/test_has_names.py::test_has_names[creation-linspace] 36 | array_api_tests/test_has_names.py::test_has_names[creation-meshgrid] 37 | array_api_tests/test_has_names.py::test_has_names[sorting-argsort] 38 | array_api_tests/test_has_names.py::test_has_names[array_method-__dlpack__] 39 | array_api_tests/test_has_names.py::test_has_names[array_method-__dlpack_device__] 40 | array_api_tests/test_has_names.py::test_has_names[array_method-__setitem__] 41 | array_api_tests/test_indexing_functions.py::test_take 42 | array_api_tests/test_set_functions.py::test_unique_all 43 | array_api_tests/test_set_functions.py::test_unique_inverse 44 | array_api_tests/test_signatures.py::test_func_signature[unique_all] 45 | array_api_tests/test_signatures.py::test_func_signature[unique_inverse] 46 | array_api_tests/test_signatures.py::test_func_signature[arange] 47 | array_api_tests/test_signatures.py::test_func_signature[from_dlpack] 48 | array_api_tests/test_signatures.py::test_func_signature[linspace] 49 | array_api_tests/test_signatures.py::test_func_signature[meshgrid] 50 | array_api_tests/test_signatures.py::test_func_signature[argsort] 51 | array_api_tests/test_signatures.py::test_array_method_signature[__dlpack__] 52 | array_api_tests/test_signatures.py::test_array_method_signature[__dlpack_device__] 53 | array_api_tests/test_signatures.py::test_array_method_signature[__setitem__] 54 | array_api_tests/test_sorting_functions.py::test_argsort 55 | array_api_tests/test_has_names.py::test_has_names[fft-hfft] 56 | array_api_tests/test_has_names.py::test_has_names[fft-ihfft] 57 | array_api_tests/test_has_names.py::test_has_names[fft-fftfreq] 58 | array_api_tests/test_has_names.py::test_has_names[fft-rfftfreq] 59 | array_api_tests/test_has_names.py::test_has_names[fft-fftshift] 60 | array_api_tests/test_has_names.py::test_has_names[fft-ifftshift] 61 | array_api_tests/test_has_names.py::test_has_names[fft-fft] 62 | array_api_tests/test_has_names.py::test_has_names[fft-ifft] 63 | array_api_tests/test_has_names.py::test_has_names[fft-fftn] 64 | array_api_tests/test_has_names.py::test_has_names[fft-ifftn] 65 | array_api_tests/test_has_names.py::test_has_names[fft-rfft] 66 | array_api_tests/test_has_names.py::test_has_names[fft-irfft] 67 | array_api_tests/test_has_names.py::test_has_names[fft-rfftn] 68 | array_api_tests/test_has_names.py::test_has_names[fft-irfftn] 69 | array_api_tests/test_creation_functions.py::test_empty_like 70 | array_api_tests/test_data_type_functions.py::test_finfo[complex64] 71 | array_api_tests/test_manipulation_functions.py::test_squeeze 72 | array_api_tests/test_has_names.py::test_has_names[statistical-cumulative_sum] 73 | array_api_tests/test_has_names.py::test_has_names[statistical-cumulative_prod] 74 | array_api_tests/test_has_names.py::test_has_names[indexing-take_along_axis] 75 | 
array_api_tests/test_has_names.py::test_has_names[searching-count_nonzero] 76 | array_api_tests/test_has_names.py::test_has_names[searching-searchsorted] 77 | array_api_tests/test_signatures.py::test_func_signature[take_along_axis] 78 | array_api_tests/test_special_cases.py::test_binary[floor_divide(x1_i is +infinity and isfinite(x2_i) and x2_i > 0) -> +infinity] 79 | array_api_tests/test_special_cases.py::test_binary[floor_divide(x1_i is +infinity and isfinite(x2_i) and x2_i < 0) -> -infinity] 80 | array_api_tests/test_special_cases.py::test_binary[floor_divide(x1_i is -infinity and isfinite(x2_i) and x2_i > 0) -> -infinity] 81 | array_api_tests/test_special_cases.py::test_binary[floor_divide(x1_i is -infinity and isfinite(x2_i) and x2_i < 0) -> +infinity] 82 | array_api_tests/test_special_cases.py::test_binary[floor_divide(isfinite(x1_i) and x1_i > 0 and x2_i is -infinity) -> -0] 83 | array_api_tests/test_special_cases.py::test_binary[__floordiv__(x1_i is +infinity and isfinite(x2_i) and x2_i > 0) -> +infinity] 84 | array_api_tests/test_special_cases.py::test_binary[floor_divide(isfinite(x1_i) and x1_i < 0 and x2_i is +infinity) -> -0] 85 | array_api_tests/test_special_cases.py::test_binary[__floordiv__(x1_i is +infinity and isfinite(x2_i) and x2_i < 0) -> -infinity] 86 | array_api_tests/test_special_cases.py::test_binary[__floordiv__(x1_i is -infinity and isfinite(x2_i) and x2_i > 0) -> -infinity] 87 | array_api_tests/test_special_cases.py::test_binary[__floordiv__(x1_i is -infinity and isfinite(x2_i) and x2_i < 0) -> +infinity] 88 | array_api_tests/test_special_cases.py::test_binary[__floordiv__(isfinite(x1_i) and x1_i > 0 and x2_i is -infinity) -> -0] 89 | array_api_tests/test_special_cases.py::test_binary[__floordiv__(isfinite(x1_i) and x1_i < 0 and x2_i is +infinity) -> -0] 90 | array_api_tests/test_special_cases.py::test_iop[__ifloordiv__(x1_i is +infinity and isfinite(x2_i) and x2_i > 0) -> +infinity] 91 | array_api_tests/test_special_cases.py::test_iop[__ifloordiv__(x1_i is -infinity and isfinite(x2_i) and x2_i < 0) -> +infinity] 92 | array_api_tests/test_special_cases.py::test_iop[__ifloordiv__(x1_i is +infinity and isfinite(x2_i) and x2_i < 0) -> -infinity] 93 | array_api_tests/test_special_cases.py::test_iop[__ifloordiv__(isfinite(x1_i) and x1_i > 0 and x2_i is -infinity) -> -0] 94 | array_api_tests/test_special_cases.py::test_iop[__ifloordiv__(x1_i is -infinity and isfinite(x2_i) and x2_i > 0) -> -infinity] 95 | array_api_tests/test_special_cases.py::test_iop[__ifloordiv__(isfinite(x1_i) and x1_i < 0 and x2_i is +infinity) -> -0] 96 | array_api_tests/test_array_object.py::test_getitem_arrays_and_ints_1[1] 97 | array_api_tests/test_statistical_functions.py::test_cumulative_prod 98 | array_api_tests/test_statistical_functions.py::test_cumulative_sum 99 | array_api_tests/test_array_object.py::test_getitem_arrays_and_ints_1[None] 100 | array_api_tests/test_array_object.py::test_getitem_arrays_and_ints_2[1] 101 | array_api_tests/test_array_object.py::test_getitem_arrays_and_ints_2[None] 102 | array_api_tests/test_searching_functions.py::test_count_nonzero 103 | array_api_tests/test_searching_functions.py::test_searchsorted 104 | array_api_tests/test_signatures.py::test_func_signature[cumulative_sum] 105 | array_api_tests/test_signatures.py::test_func_signature[cumulative_prod] 106 | array_api_tests/test_signatures.py::test_func_signature[count_nonzero] 107 | array_api_tests/test_signatures.py::test_func_signature[searchsorted] 108 | 
-------------------------------------------------------------------------------- /docs/construct.md: -------------------------------------------------------------------------------- 1 | # Construct Sparse Arrays 2 | 3 | ## From coordinates and data 4 | 5 | You can construct [`sparse.COO`][] arrays from coordinates and value data. 6 | 7 | The `coords` parameter contains the indices where the data is nonzero, 8 | and the `data` parameter contains the data corresponding to those indices. 9 | For example, the following code will generate a $5 \times 5$ diagonal 10 | matrix: 11 | 12 | ```python 13 | 14 | >>> import sparse 15 | 16 | >>> coords = [[0, 1, 2, 3, 4], 17 | ... [0, 1, 2, 3, 4]] 18 | >>> data = [10, 20, 30, 40, 50] 19 | >>> s = sparse.COO(coords, data, shape=(5, 5)) 20 | >>> s 21 | 22 | 0 1 2 3 4 23 | ┌ ┐ 24 | 0 │ 10 │ 25 | 1 │ 20 │ 26 | 2 │ 30 │ 27 | 3 │ 40 │ 28 | 4 │ 50 │ 29 | └ ┘ 30 | ``` 31 | 32 | In general `coords` should be a `(ndim, nnz)` shaped 33 | array. Each row of `coords` contains one dimension of the 34 | desired sparse array, and each column contains the index 35 | corresponding to that nonzero element. `data` contains 36 | the nonzero elements of the array corresponding to the indices 37 | in `coords`. Its shape should be `(nnz,)`. 38 | 39 | If `data` is the same across all the coordinates, it can be passed 40 | in as a scalar. For example, the following produces the $4 \times 4$ 41 | identity matrix: 42 | 43 | ```python 44 | 45 | >>> import sparse 46 | 47 | >>> coords = [[0, 1, 2, 3], 48 | ... [0, 1, 2, 3]] 49 | >>> data = 1 50 | >>> s = sparse.COO(coords, data, shape=(4, 4)) 51 | >>> s 52 | 53 | 0 1 2 3 54 | ┌ ┐ 55 | 0 │ 1 │ 56 | 1 │ 1 │ 57 | 2 │ 1 │ 58 | 3 │ 1 │ 59 | └ ┘ 60 | ``` 61 | 62 | You can, and should, pass in [`numpy.ndarray`][] objects for 63 | `coords` and `data`. 64 | 65 | In the examples above, the shape of the resulting array was determined from 66 | the maximum index in each dimension. If the array extends beyond 67 | the maximum index in `coords`, you should supply a shape 68 | explicitly. For example, if we did the following without the 69 | `shape` keyword argument, it would result in a 70 | $4 \times 5$ matrix, but maybe we wanted one that was actually 71 | $5 \times 5$. 72 | 73 | ```python 74 | 75 | >>> coords = [[0, 3, 2, 1], [4, 1, 2, 0]] 76 | >>> data = [1, 4, 2, 1] 77 | >>> s = COO(coords, data, shape=(5, 5)) 78 | >>> s 79 | 80 | 0 1 2 3 4 81 | ┌ ┐ 82 | 0 │ 1 │ 83 | 1 │ 1 │ 84 | 2 │ 2 │ 85 | 3 │ 4 │ 86 | 4 │ │ 87 | └ ┘ 88 | ``` 89 | 90 | [`sparse.COO`][] arrays support arbitrary fill values. Fill values are the "default" 91 | value, or the value not to store. This can be given a value other than zero. For 92 | example, the following builds a (bad) representation of a $2 \times 2$ 93 | identity matrix. Note that not all operations are supported for arrays 94 | with nonzero fill values. 95 | 96 | ```python 97 | 98 | >>> coords = [[0, 1], [1, 0]] 99 | >>> data = [0, 0] 100 | >>> s = COO(coords, data, fill_value=1) 101 | >>> s 102 | 103 | 0 1 104 | ┌ ┐ 105 | 0 │ 0 │ 106 | 1 │ 0 │ 107 | └ ┘ 108 | ``` 109 | 110 | ## From [`scipy.sparse.spmatrix`][] 111 | 112 | To construct [`sparse.COO`][] arrays from [spmatrix][scipy.sparse.spmatrix] 113 | objects, you can use the [`sparse.COO.from_scipy_sparse`][] method.
As an 114 | example, if `x` is a [scipy.sparse.spmatrix][], you can 115 | do the following to get an equivalent [`sparse.COO`][] array: 116 | 117 | ```python 118 | 119 | s = COO.from_scipy_sparse(x) 120 | ``` 121 | 122 | ## From [Numpy arrays][numpy.ndarray] 123 | 124 | To construct [`sparse.COO`][] arrays from [`numpy.ndarray`][] 125 | objects, you can use the [`sparse.COO.from_numpy`][] method. As an 126 | example, if `x` is a [`numpy.ndarray`][], you can 127 | do the following to get an equivalent [`sparse.COO`][] array: 128 | 129 | ```python 130 | 131 | s = COO.from_numpy(x) 132 | ``` 133 | 134 | ## Generating random [`sparse.COO`][] objects 135 | 136 | The [`sparse.random`][] method can be used to create random 137 | [`sparse.COO`][] arrays. For example, the following will generate 138 | a $10 \times 10$ matrix with $10$ nonzero entries, 139 | each in the interval $[0, 1)$. 140 | 141 | ```python 142 | 143 | s = sparse.random((10, 10), density=0.1) 144 | ``` 145 | 146 | ## Building [`sparse.COO`][] Arrays from [`sparse.DOK`][] Arrays 147 | 148 | It's possible to build [`sparse.COO`][] arrays from [`sparse.DOK`][] arrays, if it is not 149 | easy to construct `coords` and `data` directly. [`sparse.DOK`][] 150 | arrays provide a simple builder interface for constructing [`sparse.COO`][] arrays, but at 151 | this time, they can do little else. 152 | 153 | You can get started by defining the shape (and optionally, the datatype) of the 154 | [`sparse.DOK`][] array. If you do not specify a dtype, it is inferred from the value 155 | dictionary or is set to `dtype('float64')` if that is not present. 156 | 157 | ```python 158 | 159 | s = DOK((6, 5, 2)) 160 | s2 = DOK((2, 3, 4), dtype=np.uint8) 161 | ``` 162 | 163 | After this, you can build the array by assigning arrays or scalars to elements 164 | or slices of the original array. Broadcasting rules are followed. 165 | 166 | ```python 167 | 168 | s[1:3, 3:1:-1] = [[6, 5]] 169 | ``` 170 | 171 | DOK arrays also support fancy indexing assignment if and only if all dimensions are indexed. 172 | 173 | ```python 174 | 175 | s[[0, 2], [2, 1], [0, 1]] = 5 176 | s[[0, 3], [0, 4], [0, 1]] = [1, 5] 177 | ``` 178 | 179 | Alongside indexing assignment and retrieval, [`sparse.DOK`][] arrays support applying any arbitrary broadcasting function 180 | to any number of arguments, where the arguments can be [`sparse.SparseArray`][] objects, [`scipy.sparse.spmatrix`][] 181 | objects, or [`numpy.ndarray`][] objects. 182 | 183 | ```python 184 | 185 | x = sparse.random((10, 10), 0.5, format="dok") 186 | y = sparse.random((10, 10), 0.5, format="dok") 187 | sparse.elemwise(np.add, x, y) 188 | ``` 189 | 190 | [`sparse.DOK`][] arrays also support standard ufuncs and operators, including comparison operators, 191 | in combination with other objects implementing the NumPy `ndarray.__array_ufunc__` method. For example, 192 | the following code will perform elementwise equality comparison on the two arrays 193 | and return a new boolean [`sparse.DOK`][] array. 194 | 195 | ```python 196 | 197 | x = sparse.random((10, 10), 0.5, format="dok") 198 | y = np.random.random((10, 10)) 199 | x == y 200 | ``` 201 | 202 | [`sparse.DOK`][] arrays are returned from elemwise functions and standard ufuncs if and only if all 203 | [`sparse.SparseArray`][] objects are [`sparse.DOK`][] arrays. Otherwise, a [`sparse.COO`][] array or a dense array is returned. 204 | 205 | At the end, you can convert the [`sparse.DOK`][] array to a [`sparse.COO`][] array.
206 | 207 | ```python 208 | 209 | s3 = COO(s) 210 | ``` 211 | 212 | In addition, it is possible to access single elements and slices of the [`sparse.DOK`][] array 213 | using normal Numpy indexing, as well as fancy indexing if and only if all dimensions are indexed. 214 | Slicing and fancy indexing will always return a new DOK array. 215 | 216 | ```python 217 | 218 | s[1, 2, 1] # 5 219 | s[5, 1, 1] # 0 220 | s[[0, 3], [0, 4], [0, 1]] # 221 | ``` 222 | 223 | ## Converting [`sparse.COO`][] objects to other Formats 224 | 225 | [`sparse.COO`][] arrays can be converted to [Numpy arrays][numpy.ndarray], 226 | or to some [spmatrix][scipy.sparse.spmatrix] subclasses via the following 227 | methods: 228 | 229 | * [`sparse.COO.todense`][]: Converts to a [`numpy.ndarray`][] unconditionally. 230 | * [`sparse.COO.maybe_densify`][]: Converts to a [`numpy.ndarray`][] based on 231 | certain constraints. 232 | * [`sparse.COO.to_scipy_sparse`][]: Converts to a [`scipy.sparse.coo_matrix`][] if 233 | the array is two dimensional. 234 | * [`sparse.COO.tocsr`][]: Converts to a [`scipy.sparse.csr_matrix`][] if 235 | the array is two dimensional. 236 | * [`sparse.COO.tocsc`][]: Converts to a [`scipy.sparse.csc_matrix`][] if 237 | the array is two dimensional. 238 | -------------------------------------------------------------------------------- /sparse/tests/test_backends.py: -------------------------------------------------------------------------------- 1 | import warnings 2 | 3 | import sparse 4 | 5 | import pytest 6 | 7 | import numpy as np 8 | import scipy as sp 9 | import scipy.sparse as sps 10 | import scipy.sparse.csgraph as spgraph 11 | import scipy.sparse.linalg as splin 12 | from numpy.testing import assert_almost_equal, assert_equal 13 | 14 | 15 | def test_backends(backend): 16 | rng = np.random.default_rng(0) 17 | x = sparse.random((100, 10, 100), density=0.01, random_state=rng) 18 | y = sparse.random((100, 10, 100), density=0.01, random_state=rng) 19 | 20 | if backend == sparse._BackendType.Finch: 21 | import finch 22 | 23 | def storage(): 24 | return finch.Storage(finch.Dense(finch.SparseList(finch.SparseList(finch.Element(0.0)))), order="C") 25 | 26 | x = x.to_storage(storage()) 27 | y = y.to_storage(storage()) 28 | else: 29 | x.asformat("gcxs") 30 | y.asformat("gcxs") 31 | 32 | z = x + y 33 | result = sparse.sum(z) 34 | assert result.shape == () 35 | 36 | 37 | def test_finch_lazy_backend(backend): 38 | if backend != sparse._BackendType.Finch: 39 | pytest.skip("Tested only for Finch backend") 40 | 41 | import finch 42 | 43 | np_eye = np.eye(5) 44 | sp_arr = sps.csr_matrix(np_eye) 45 | finch_dense = finch.Tensor(np_eye) 46 | 47 | assert np.shares_memory(finch_dense.todense(), np_eye) 48 | 49 | finch_arr = finch.Tensor(sp_arr) 50 | 51 | assert_equal(finch_arr.todense(), np_eye) 52 | 53 | transposed = sparse.permute_dims(finch_arr, (1, 0)) 54 | 55 | assert_equal(transposed.todense(), np_eye.T) 56 | 57 | @sparse.compiled() 58 | def my_fun(tns1, tns2): 59 | tmp = sparse.add(tns1, tns2) 60 | return sparse.sum(tmp, axis=0) 61 | 62 | result = my_fun(finch_dense, finch_arr) 63 | 64 | assert_equal(result.todense(), np.sum(2 * np_eye, axis=0)) 65 | 66 | 67 | @pytest.mark.parametrize("format, order", [("csc", "F"), ("csr", "C"), ("coo", "F"), ("coo", "C")]) 68 | def test_asarray(backend, format, order): 69 | arr = np.eye(5, order=order) 70 | 71 | result = sparse.asarray(arr, format=format) 72 | 73 | assert_equal(result.todense(), arr) 74 | 75 | 76 | @pytest.mark.parametrize("format, order", [("csc", "F"), 
("csr", "C"), ("coo", "F"), ("coo", "C")]) 77 | def test_scipy_spsolve(backend, format, order): 78 | x = np.eye(10, order=order) * 2 79 | y = np.ones((10, 1), order=order) 80 | x_pydata = sparse.asarray(x, format=format) 81 | y_pydata = sparse.asarray(y, format="coo") 82 | 83 | actual = splin.spsolve(x_pydata, y_pydata) 84 | expected = np.linalg.solve(x, y.ravel()) 85 | assert_almost_equal(actual, expected) 86 | 87 | 88 | @pytest.mark.parametrize("format, order", [("csc", "F"), ("csr", "C"), ("coo", "F"), ("coo", "C")]) 89 | def test_scipy_inv(backend, format, order): 90 | x = np.eye(10, order=order) * 2 91 | x_pydata = sparse.asarray(x, format=format) 92 | 93 | with warnings.catch_warnings(): 94 | warnings.simplefilter("ignore", category=sps.SparseEfficiencyWarning) 95 | actual = splin.inv(x_pydata) 96 | expected = np.linalg.inv(x) 97 | assert_almost_equal(actual.todense(), expected) 98 | 99 | 100 | @pytest.mark.skip(reason="https://github.com/scipy/scipy/pull/20759") 101 | @pytest.mark.parametrize("format, order", [("csc", "F"), ("csr", "C"), ("coo", "F"), ("coo", "C")]) 102 | def test_scipy_norm(backend, format, order): 103 | x = np.eye(10, order=order) * 2 104 | x_pydata = sparse.asarray(x, format=format) 105 | 106 | actual = splin.norm(x_pydata) 107 | expected = sp.linalg.norm(x) 108 | assert_almost_equal(actual, expected) 109 | 110 | 111 | @pytest.mark.skip(reason="https://github.com/scipy/scipy/pull/20759") 112 | @pytest.mark.parametrize("format, order", [("csc", "F"), ("csr", "C"), ("coo", "F"), ("coo", "C")]) 113 | def test_scipy_lsqr(backend, format, order): 114 | x = np.eye(10, order=order) * 2 115 | y = np.ones((10, 1), order=order) 116 | x_pydata = sparse.asarray(x, format=format) 117 | 118 | actual_x, _ = splin.lsqr(x_pydata, y)[:2] 119 | expected_x, _ = sp.linalg.lstsq(x, y)[:2] 120 | assert_almost_equal(actual_x, expected_x.ravel()) 121 | 122 | 123 | @pytest.mark.skip(reason="https://github.com/scipy/scipy/pull/20759") 124 | @pytest.mark.parametrize("format, order", [("csc", "F"), ("csr", "C"), ("coo", "F"), ("coo", "C")]) 125 | def test_scipy_eigs(backend, format, order): 126 | x = np.eye(10, order=order) * 2 127 | x_pydata = sparse.asarray(x, format=format) 128 | x_sp = sps.coo_matrix(x) 129 | 130 | actual_vals, _ = splin.eigs(x_pydata, k=3) 131 | expected_vals, _ = splin.eigs(x_sp, k=3) 132 | assert_almost_equal(actual_vals, expected_vals) 133 | 134 | 135 | @pytest.mark.parametrize( 136 | "matrix_fn, format, order", 137 | [(sps.csc_matrix, "csc", "F"), (sps.csr_matrix, "csr", "C"), (sps.coo_matrix, "coo", "F")], 138 | ) 139 | def test_scipy_connected_components(backend, graph, matrix_fn, format, order): 140 | graph = matrix_fn(np.array(graph, order=order)) 141 | sp_graph = sparse.asarray(graph, format=format) 142 | 143 | actual_n_components, actual_labels = spgraph.connected_components(sp_graph) 144 | expected_n_components, expected_labels = spgraph.connected_components(graph) 145 | assert actual_n_components == expected_n_components 146 | assert_equal(actual_labels, expected_labels) 147 | 148 | 149 | @pytest.mark.parametrize( 150 | "matrix_fn, format, order", 151 | [(sps.csc_matrix, "csc", "F"), (sps.csr_matrix, "csr", "C"), (sps.coo_matrix, "coo", "F")], 152 | ) 153 | def test_scipy_laplacian(backend, graph, matrix_fn, format, order): 154 | graph = matrix_fn(np.array(graph, order=order)) 155 | sp_graph = sparse.asarray(graph, format=format) 156 | 157 | actual_lap = spgraph.laplacian(sp_graph) 158 | expected_lap = spgraph.laplacian(graph) 159 | 
assert_equal(actual_lap.todense(), expected_lap.toarray()) 160 | 161 | 162 | @pytest.mark.parametrize("matrix_fn, format, order", [(sps.csc_matrix, "csc", "F"), (sps.csr_matrix, "csr", "C")]) 163 | def test_scipy_shortest_path(backend, graph, matrix_fn, format, order): 164 | graph = matrix_fn(np.array(graph, order=order)) 165 | sp_graph = sparse.asarray(graph, format=format) 166 | 167 | actual_dist_matrix, actual_predecessors = spgraph.shortest_path(sp_graph, return_predecessors=True) 168 | expected_dist_matrix, expected_predecessors = spgraph.shortest_path(graph, return_predecessors=True) 169 | assert_equal(actual_dist_matrix, expected_dist_matrix) 170 | assert_equal(actual_predecessors, expected_predecessors) 171 | 172 | 173 | @pytest.mark.parametrize( 174 | "matrix_fn, format, order", 175 | [(sps.csc_matrix, "csc", "F"), (sps.csr_matrix, "csr", "C"), (sps.coo_matrix, "coo", "F")], 176 | ) 177 | def test_scipy_breadth_first_tree(backend, graph, matrix_fn, format, order): 178 | graph = matrix_fn(np.array(graph, order=order)) 179 | sp_graph = sparse.asarray(graph, format=format) 180 | 181 | actual_bft = spgraph.breadth_first_tree(sp_graph, 0, directed=False) 182 | expected_bft = spgraph.breadth_first_tree(graph, 0, directed=False) 183 | assert_equal(actual_bft.todense(), expected_bft.toarray()) 184 | 185 | 186 | @pytest.mark.parametrize( 187 | "matrix_fn, format, order", 188 | [(sps.csc_matrix, "csc", "F"), (sps.csr_matrix, "csr", "C"), (sps.coo_matrix, "coo", "F")], 189 | ) 190 | def test_scipy_dijkstra(backend, graph, matrix_fn, format, order): 191 | graph = matrix_fn(np.array(graph, order=order)) 192 | sp_graph = sparse.asarray(graph, format=format) 193 | 194 | actual_dist_matrix = spgraph.dijkstra(sp_graph, directed=False) 195 | expected_dist_matrix = spgraph.dijkstra(graph, directed=False) 196 | assert_equal(actual_dist_matrix, expected_dist_matrix) 197 | 198 | 199 | @pytest.mark.parametrize( 200 | "matrix_fn, format, order", 201 | [(sps.csc_matrix, "csc", "F"), (sps.csr_matrix, "csr", "C"), (sps.coo_matrix, "coo", "F")], 202 | ) 203 | def test_scipy_minimum_spanning_tree(backend, graph, matrix_fn, format, order): 204 | graph = matrix_fn(np.array(graph, order=order)) 205 | sp_graph = sparse.asarray(graph, format=format) 206 | 207 | actual_span_tree = spgraph.minimum_spanning_tree(sp_graph) 208 | expected_span_tree = spgraph.minimum_spanning_tree(graph) 209 | assert_equal(actual_span_tree.todense(), expected_span_tree.toarray()) 210 | 211 | 212 | @pytest.mark.skip(reason="https://github.com/scikit-learn/scikit-learn/pull/29031") 213 | @pytest.mark.parametrize("matrix_fn, format, order", [(sps.csc_matrix, "csc", "F")]) 214 | def test_scikit_learn_dispatch(backend, graph, matrix_fn, format, order): 215 | from sklearn.cluster import KMeans 216 | 217 | graph = matrix_fn(np.array(graph, order=order)) 218 | 219 | sp_graph = sparse.asarray(graph, format=format) 220 | 221 | neigh = KMeans(n_clusters=2) 222 | actual_labels = neigh.fit_predict(sp_graph) 223 | 224 | neigh = KMeans(n_clusters=2) 225 | expected_labels = neigh.fit_predict(graph) 226 | 227 | assert_equal(actual_labels, expected_labels) 228 | -------------------------------------------------------------------------------- /docs/operations.md: -------------------------------------------------------------------------------- 1 | # Operations on [`sparse.COO`][] and [`sparse.GCXS`][] arrays 2 | 3 | ## Operators 4 | 5 | [`sparse.COO`][] and [`sparse.GCXS`][] objects support a number of operations. 
They interact with scalars, 6 | [`sparse.COO`][] and [`sparse.GCXS`][] objects, 7 | and [scipy.sparse.spmatrix][] objects, all following standard Python and Numpy 8 | conventions. 9 | 10 | For example, the following Numpy expression produces equivalent 11 | results for Numpy arrays, COO arrays, or a mix of the two: 12 | 13 | ```python 14 | 15 | np.log(X.dot(beta.T) + 1) 16 | ``` 17 | 18 | However, some operations are not supported, like operations that 19 | implicitly cause dense structures, or numpy functions that are not 20 | yet implemented for sparse arrays. 21 | 22 | ```python 23 | 24 | np.linalg.cholesky(x) # sparse cholesky not implemented 25 | ``` 26 | 27 | This page describes the valid operations, and their limitations. 28 | 29 | **[`sparse.elemwise`][]** 30 | 31 | This function allows you to apply any arbitrary broadcasting function to any number of arguments 32 | where the arguments can be [`sparse.SparseArray`][] objects or [`scipy.sparse.spmatrix`][] objects. 33 | For example, the following will add two arrays: 34 | 35 | ```python 36 | 37 | sparse.elemwise(np.add, x, y) 38 | ``` 39 | 40 | !!! warning 41 | 42 | Previously, [`sparse.elemwise`][] was a method of the [`sparse.COO`][] class. Now, 43 | it has been moved to the [sparse][] module. 44 | 45 | 46 | **Auto-Densification** 47 | 48 | Operations that would result in dense matrices, such as 49 | operations with [Numpy arrays][numpy.ndarray], 50 | raise a [`ValueError`][]. For example, the following will raise a 51 | [`ValueError`][] if `x` is a [`numpy.ndarray`][]: 52 | 53 | ```python 54 | 55 | x + y 56 | ``` 57 | 58 | However, all of the following are valid operations when `x` and `y` are both sparse. 59 | 60 | ```python 61 | 62 | x + 0 63 | x != y 64 | x + y 65 | x == 5 66 | 5 * x 67 | x / 7.3 68 | x != 0 69 | x == 0 70 | ~x 71 | x + 5 72 | ``` 73 | 74 | We also support operations with a nonzero fill value. These are operations 75 | that map zero values to nonzero values, such as `x + 1` or `~x`. 76 | In these cases, they will produce an output with a fill value of `1` or `True`, 77 | assuming the original array has a fill value of `0` or `False` respectively. 78 | 79 | If densification is needed, it must be explicit. In other words, you must call 80 | [`sparse.SparseArray.todense`][] on the [`sparse.SparseArray`][] object. If both operands are [`sparse.SparseArray`][], 81 | both must be densified. 82 | 83 | **Operations with NumPy arrays** 84 | 85 | In certain situations, operations with NumPy arrays are also supported. For example, 86 | the following will work if `x` is [`sparse.COO`][] and `y` is a NumPy array: 87 | 88 | ```python 89 | 90 | x * y 91 | ``` 92 | 93 | The following conditions must be met when performing element-wise operations with 94 | NumPy arrays: 95 | 96 | * The operation must produce a consistent fill value. In other words, the resulting 97 | array must also be sparse. 98 | * The operation must not increase the size of the result when broadcasting the arrays. 99 | 100 | ## Operations with [`scipy.sparse.spmatrix`][] 101 | 102 | Certain operations with [`scipy.sparse.spmatrix`][] are also supported. 103 | For example, the following are all allowed if `y` is a [`scipy.sparse.spmatrix`][]: 104 | 105 | ```python 106 | 107 | x + y 108 | x - y 109 | x * y 110 | x > y 111 | x < y 112 | ``` 113 | 114 | In general, operating on a [`scipy.sparse.spmatrix`][] is the same as operating 115 | on [`sparse.COO`][] or [`sparse.GCXS`][], as long as it is to the right of the operator. 116 | 117 | !!! 
note 118 | 119 | Results are not guaranteed if `x` is a [scipy.sparse.spmatrix][]. 120 | For this reason, we recommend explicitly converting all SciPy sparse matrices 121 | to [`sparse.COO`][] or [`sparse.GCXS`][] before any operations. 122 | 123 | 124 | ## Broadcasting 125 | 126 | All binary operators support [broadcasting](https://numpy.org/doc/stable/user/basics.broadcasting.html). 127 | This means that (under certain conditions) you can perform binary operations 128 | on arrays with unequal shape. Namely, when the shape is missing a dimension, 129 | or when a dimension is `1`. For example, performing a binary operation 130 | on two `COO` arrays with shapes `(4,)` and `(5, 1)` yields 131 | an object of shape `(5, 4)`. The same happens with arrays of shape 132 | `(1, 4)` and `(5, 1)`. However, `(4, 1)` and `(5, 1)` 133 | will raise a [`ValueError`][]. 134 | 135 | 136 | ## Element-wise Operations 137 | 138 | [`sparse.COO`][] and [`sparse.GCXS`][] arrays support a variety of element-wise operations. However, as 139 | with operators, operations that map zero to a nonzero value produce an output with a nonzero fill value. 140 | 141 | To illustrate, the following are all possible, and will produce another 142 | [`sparse.SparseArray`][]: 143 | 144 | ```python 145 | 146 | np.abs(x) 147 | np.sin(x) 148 | np.sqrt(x) 149 | np.conj(x) 150 | np.expm1(x) 151 | np.log1p(x) 152 | np.exp(x) 153 | np.cos(x) 154 | np.log(x) 155 | ``` 156 | 157 | As above, in the last three cases, an array with a nonzero fill value will be produced. 158 | 159 | Notice that you can apply any unary or binary function to [`sparse.COO`][] 160 | arrays, [`numpy.ndarray`][] objects, and scalars, and it will work so 161 | long as the result is not dense. When applying to [`numpy.ndarray`][] objects, 162 | we check that operating on the array with zero would always produce a zero. 163 | 164 | 165 | ## Reductions 166 | 167 | [`sparse.COO`][] and [`sparse.GCXS`][] objects support a number of reductions. However, not all important 168 | reductions are currently implemented (help welcome!). All of the following 169 | currently work: 170 | 171 | ```python 172 | 173 | x.sum(axis=1) 174 | np.max(x) 175 | np.min(x, axis=(0, 2)) 176 | x.prod() 177 | ``` 178 | 179 | **[`sparse.SparseArray.reduce`][]** 180 | 181 | This method can take an arbitrary [`numpy.ufunc`][] and perform a 182 | reduction using it. For example, the following will perform 183 | a sum: 184 | 185 | ```python 186 | 187 | x.reduce(np.add, axis=1) 188 | ``` 189 | 190 | !!! note 191 | 192 | This library currently performs reductions by grouping together all 193 | coordinates along the supplied axes and reducing those. Then, if the 194 | number of elements in a group is deficient, it reduces an extra time with zero. 195 | As a result, if a reduction's result can change when extra zeros are added 196 | to it, this method won't be accurate. However, it works in most cases. 197 | 198 | **Partial List of Supported Reductions** 199 | 200 | Although any binary [`numpy.ufunc`][] should work for reductions, when calling 201 | in the form `x.reduction()`, the following reductions are supported: 202 | 203 | * [`sparse.COO.sum`][] 204 | * [`sparse.COO.max`][] 205 | * [`sparse.COO.min`][] 206 | * [`sparse.COO.prod`][] 207 | 208 | 209 | ## Indexing 210 | 211 | [`sparse.COO`][] and [`sparse.GCXS`][] arrays can be [indexed](https://numpy.org/doc/stable/user/basics.indexing.html) 212 | just like regular [`numpy.ndarray`][] objects.
They support integer and slice indexing, as well as boolean and array indexing (boolean and array indexing are supported only with dense NumPy arrays as the index). 213 | However, NumPy advanced indexing is not yet fully supported. This 214 | means that all of the following work like in Numpy, except that they will produce 215 | [`sparse.SparseArray`][] arrays rather than [`numpy.ndarray`][] objects, and will produce 216 | scalars where expected. Assume that `z.shape` is `(5, 6, 7)`: 217 | 218 | ```python 219 | 220 | z[0] 221 | z[1, 3] 222 | z[1, 4, 3] 223 | z[:3, :2, 3] 224 | z[::-1, 1, 3] 225 | z[-1] 226 | ``` 227 | 228 | All of the following will raise an `IndexError`, like in Numpy 1.13 and later. 229 | 230 | ```python 231 | 232 | z[6] 233 | z[3, 6] 234 | z[1, 4, 8] 235 | z[-6] 236 | ``` 237 | 238 | **Advanced Indexing** 239 | 240 | Advanced indexing (indexing arrays with other arrays) is supported, but only for indexing 241 | with a *single array*. Indexing a single array with multiple arrays is not supported at 242 | this time. As above, if `z.shape` is `(5, 6, 7)`, all of the following will 243 | work like NumPy: 244 | 245 | ```python 246 | 247 | z[[0, 1, 2]] 248 | z[1, [3]] 249 | z[1, 4, [3, 6]] 250 | z[:3, :2, [1, 5]] 251 | ``` 252 | 253 | 254 | **Package Configuration** 255 | 256 | By default, when performing something like `np.array(COO)`, we do not allow the 257 | array to be converted into a dense one; instead, a [`RuntimeError`][] is raised. 258 | To prevent this, set the environment variable `SPARSE_AUTO_DENSIFY` to `1`. 259 | 260 | To raise a warning when creating a sparse array that takes no less 261 | memory than an equivalent dense array, set the environment variable 262 | `SPARSE_WARN_ON_TOO_DENSE` to `1`. 263 | 264 | 265 | ## Other Operations 266 | 267 | [`sparse.COO`][] and [`sparse.GCXS`][] arrays support a number of other common operations. Among them are 268 | [`sparse.dot`][], [`sparse.tensordot`][], [`sparse.einsum`][], [`sparse.concatenate`][], 269 | [`sparse.stack`][], [`sparse.COO.transpose`][] and [`sparse.COO.reshape`][]. 270 | You can view the full list on the [API reference page](../../api/). 271 | 272 | !!! note 273 | 274 | Some operations require zero fill values (such as [`sparse.COO.nonzero`][]) 275 | and others (such as [`sparse.concatenate`][]) require that all inputs have consistent fill values. 276 | For details, check the API reference. 277 | --------------------------------------------------------------------------------