├── .gitignore ├── CODE_OF_CONDUCT.md ├── LICENSE ├── README.md ├── application └── banchmark_qsvm_tnsm-mpi_app.py ├── benchmark ├── README.md ├── banchmark_qsvm_svsm_cusvaer.py ├── banchmark_qsvm_tnsm-mpi_mgpu.py ├── banchmark_qsvm_tnsm-mpi_sgpu.py ├── banchmark_qsvm_tnsm-opt_einsum.py ├── banchmark_qsvm_tnsm.py ├── figure │ ├── figure1_sgpu.png │ ├── figure2_mgpu_v100.png │ ├── figure3_mgpu_h100.png │ └── figure_sgpu.png └── mpi_demo.sh ├── cutn-qsvm.ipynb ├── env_check.py ├── environment.yml ├── figures ├── cutensornet_module.png ├── multi_GPU_linearity.png ├── multi_gpu_resource.png ├── process_flow_comparison.png └── speedup_cutensornet.png ├── requirements.txt └── requirements_benchmark.txt /.gitignore: -------------------------------------------------------------------------------- 1 | env 2 | tutorial -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | We as members, contributors, and leaders pledge to make participation in our 6 | community a harassment-free experience for everyone, regardless of age, body 7 | size, visible or invisible disability, ethnicity, sex characteristics, gender 8 | identity and expression, level of experience, education, socio-economic status, 9 | nationality, personal appearance, race, religion, or sexual identity 10 | and orientation. 11 | 12 | We pledge to act and interact in ways that contribute to an open, welcoming, 13 | diverse, inclusive, and healthy community. 14 | 15 | ## Our Standards 16 | 17 | Examples of behavior that contributes to a positive environment for our 18 | community include: 19 | 20 | * Demonstrating empathy and kindness toward other people 21 | * Being respectful of differing opinions, viewpoints, and experiences 22 | * Giving and gracefully accepting constructive feedback 23 | * Accepting responsibility and apologizing to those affected by our mistakes, 24 | and learning from the experience 25 | * Focusing on what is best not just for us as individuals, but for the 26 | overall community 27 | 28 | Examples of unacceptable behavior include: 29 | 30 | * The use of sexualized language or imagery, and sexual attention or 31 | advances of any kind 32 | * Trolling, insulting or derogatory comments, and personal or political attacks 33 | * Public or private harassment 34 | * Publishing others' private information, such as a physical or email 35 | address, without their explicit permission 36 | * Other conduct which could reasonably be considered inappropriate in a 37 | professional setting 38 | 39 | ## Enforcement Responsibilities 40 | 41 | Community leaders are responsible for clarifying and enforcing our standards of 42 | acceptable behavior and will take appropriate and fair corrective action in 43 | response to any behavior that they deem inappropriate, threatening, offensive, 44 | or harmful. 45 | 46 | Community leaders have the right and responsibility to remove, edit, or reject 47 | comments, commits, code, wiki edits, issues, and other contributions that are 48 | not aligned to this Code of Conduct, and will communicate reasons for moderation 49 | decisions when appropriate. 50 | 51 | ## Scope 52 | 53 | This Code of Conduct applies within all community spaces, and also applies when 54 | an individual is officially representing the community in public spaces. 
55 | Examples of representing our community include using an official e-mail address, 56 | posting via an official social media account, or acting as an appointed 57 | representative at an online or offline event. 58 | 59 | ## Enforcement 60 | 61 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 62 | reported to the community leaders responsible for enforcement at 63 | cudaq@nvidia.com. 64 | All complaints will be reviewed and investigated promptly and fairly. 65 | 66 | All community leaders are obligated to respect the privacy and security of the 67 | reporter of any incident. 68 | 69 | ## Enforcement Guidelines 70 | 71 | Community leaders will follow these Community Impact Guidelines in determining 72 | the consequences for any action they deem in violation of this Code of Conduct: 73 | 74 | ### 1. Correction 75 | 76 | **Community Impact**: Use of inappropriate language or other behavior deemed 77 | unprofessional or unwelcome in the community. 78 | 79 | **Consequence**: A private, written warning from community leaders, providing 80 | clarity around the nature of the violation and an explanation of why the 81 | behavior was inappropriate. A public apology may be requested. 82 | 83 | ### 2. Warning 84 | 85 | **Community Impact**: A violation through a single incident or series 86 | of actions. 87 | 88 | **Consequence**: A warning with consequences for continued behavior. No 89 | interaction with the people involved, including unsolicited interaction with 90 | those enforcing the Code of Conduct, for a specified period of time. This 91 | includes avoiding interactions in community spaces as well as external channels 92 | like social media. Violating these terms may lead to a temporary or 93 | permanent ban. 94 | 95 | ### 3. Temporary Ban 96 | 97 | **Community Impact**: A serious violation of community standards, including 98 | sustained inappropriate behavior. 99 | 100 | **Consequence**: A temporary ban from any sort of interaction or public 101 | communication with the community for a specified period of time. No public or 102 | private interaction with the people involved, including unsolicited interaction 103 | with those enforcing the Code of Conduct, is allowed during this period. 104 | Violating these terms may lead to a permanent ban. 105 | 106 | ### 4. Permanent Ban 107 | 108 | **Community Impact**: Demonstrating a pattern of violation of community 109 | standards, including sustained inappropriate behavior, harassment of an 110 | individual, or aggression toward or disparagement of classes of individuals. 111 | 112 | **Consequence**: A permanent ban from any sort of public interaction within 113 | the community. 114 | 115 | ## Attribution 116 | 117 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], 118 | version 2.0, available at 119 | https://www.contributor-covenant.org/version/2/0/code_of_conduct.html. 120 | 121 | Community Impact Guidelines were inspired by [Mozilla's code of conduct 122 | enforcement ladder](https://github.com/mozilla/diversity). 123 | 124 | [homepage]: https://www.contributor-covenant.org 125 | 126 | For answers to common questions about this code of conduct, see the FAQ at 127 | https://www.contributor-covenant.org/faq. Translations are available at 128 | https://www.contributor-covenant.org/translations. 
129 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 
61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 
179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "[]" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright 2024 Kuan-Cheng Chen, Tai-Yue Lee, Yun-Yuan Wang 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 202 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | [![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg?style=flat-square)](https://opensource.org/licenses/Apache-2.0) 2 | [![arXiv](https://img.shields.io/badge/arXiv-2405.02630-b31b1b.svg?style=flat-square)](https://arxiv.org/abs/2405.02630) 3 | 4 | ## cuTN-QSVM: cuTensorNet-accelerated Quantum Support Vector Machine with cuQuantum SDK 5 | 6 | 9 | 10 | Welcome to the official repository of cuTN-QSVM, featuring fast GPU simulators for benchmarking Quantum Support Vector Machines (QSVMs) and scripts for generating compatible quantum circuits for hardware execution. Facilitated by NVIDIA's [cuQuantum SDK](https://github.com/NVIDIA/cuda-quantum/tree/main) and the [cuTensorNet](https://docs.nvidia.com/cuda/cuquantum/latest/cutensornet/overview.html) library, this project integrates cutting-edge quantum computing technologies with high-performance computing systems, enhancing quantum machine learning's efficiency and scalability to new heights. 11 | 12 | ## Project Overview 13 | Quantum Support Vector Machines (QSVMs) utilize a quantum-enhanced approach to tackle complex, multidimensional classification problems, surpassing the capabilities of classical SVMs under certain conditions. However, prior to the advent of large-scale quantum systems, the scalability of simulating QSVMs on CPUs was traditionally limited by the exponential growth in computational demands as qubit counts increased. By employing NVIDIA's cuQuantum SDK and the cuTensorNet library, cuTN-QSVM effectively reduces this computational complexity from exponential to quadratic. This enables the simulation of large quantum systems of up to 784 qubits on the NVIDIA A100 GPU within seconds. 14 | 15 | Technical Highlights: 16 | 17 | - Efficient Quantum Simulations: The cuTensorNet library significantly lowers the computational overhead for QSVMs, facilitating rapid and efficient quantum simulations that can handle extensive qubit counts. 18 | - Multi-GPU Processing: Supported by the Message Passing Interface (MPI), our implementation allows significant reductions in computation times and scalable performance improvements across varying data sizes. 
19 | - Empirical Validation: Through rigorous testing, cuTN-QSVM achieves high classification accuracy, with results reaching up to 95% on the MNIST dataset for training sets larger than 100 instances, markedly outperforming traditional SVMs. 20 | 21 |
22 | Speedup-Result 23 | Multi-GPU-Resource 24 |
25 | 26 | 27 | 28 | 29 | ## Update 30 | - **2025.01.22** Multi-GPU and Multi-node benchmark [\[here\]](benchmark/README.md) 31 | - **2025.01** Large-scale QC 32 | - **2025.01** Add colab demo [[here](https://colab.research.google.com/drive/1ksUC3nX8d1I4DE1EqihgAmKurPEbYIhW)] 33 | 34 | 35 | ## Quick Start 36 | ### Installation 37 | ``` 38 | conda create -n cutn-qsvm python=3.10 -y 39 | conda activate cutn-qsvm 40 | ``` 41 | ``` 42 | git clone https://github.com/Tim-Li/cuTN-QSVM.git 43 | cd cuTN-QSVM 44 | pip install -r requirements.txt 45 | ``` 46 | You can also use the [NVIDIA cuQuantum Appliance >= 23.10](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuquantum-appliance) container: 47 | 48 | ``` 49 | # pull the image 50 | docker pull nvcr.io/nvidia/cuquantum-appliance:23.10 51 | 52 | # launch the container interactively 53 | docker run --gpus all -it --rm nvcr.io/nvidia/cuquantum-appliance:23.10 54 | ``` 55 | 56 | ### Quick Environment Check 57 | The env_check.py script quickly verifies that your environment is correctly configured to run cuTN-QSVM simulations with cuQuantum and Qiskit. It generates a random quantum circuit with Qiskit and converts it to an Einstein-summation expression using cuQuantum's CircuitToEinsum with the CuPy backend, letting you confirm that these libraries are installed and working together before attempting larger simulations. To run the check, execute the following command in your terminal: 58 | 59 | ``` 60 | python env_check.py 61 | ``` 62 | ### cuTN-QSVM demo code 63 | You can work through the [cutn-qsvm demo notebook](cutn-qsvm.ipynb) to see the details of QSVM simulation with tensor networks. 64 | 65 | 66 | ### cuTN-QSVM with a single GPU 67 | ``` 68 | python benchmark/banchmark_qsvm_tnsm.py 69 | ``` 70 | 71 | ### cuTN-QSVM with multiple GPUs 72 | ``` 73 | mpirun -np 8 python benchmark/banchmark_qsvm_tnsm-mpi_mgpu.py 74 | mpirun -np 4 python benchmark/banchmark_qsvm_tnsm-mpi_mgpu.py 75 | mpirun -np 2 python benchmark/banchmark_qsvm_tnsm-mpi_mgpu.py 76 | mpirun -np 1 python benchmark/banchmark_qsvm_tnsm-mpi_mgpu.py 77 | ``` 78 | 79 | ## Methodology 80 | ### [cuTensorNet](https://docs.nvidia.com/cuda/cuquantum/latest/cutensornet/overview.html) 81 | 82 | NVIDIA's [cuQuantum SDK](https://github.com/NVIDIA/cuda-quantum/tree/main) includes cuTensorNet, a key component designed to optimize quantum circuit simulations on NVIDIA GPUs. It reduces computational costs and memory usage by streamlining tensor contractions and simplifying network complexities through its modular APIs. This enhancement enables efficient, large-scale simulations across multi-GPU and multi-node environments, advancing research in quantum physics, chemistry, and machine learning. 83 | 84 |
85 | cuTensorNet Module 86 |
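As a concrete illustration of this conversion step, the minimal sketch below mirrors the `CircuitToEinsum` usage found in the benchmark scripts of this repository; the two-qubit circuit and the `00` bitstring are illustrative placeholders only, not the QSVM kernel circuits used in the paper.

```python
from qiskit import QuantumCircuit
from cuquantum import CircuitToEinsum, contract

# A small placeholder circuit; the benchmarks build QSVM kernel circuits instead.
qc = QuantumCircuit(2)
qc.h(0)
qc.cx(0, 1)

# Convert the circuit into an einsum expression plus CuPy tensor operands.
converter = CircuitToEinsum(qc, dtype='complex128', backend='cupy')
expression, operands = converter.amplitude('00')  # amplitude of the |00> state

# Contract the tensor network on the GPU; |amplitude|^2 is what becomes a kernel
# entry when the circuit is a QSVM fidelity (kernel) circuit.
amplitude = contract(expression, *operands)
print(abs(amplitude) ** 2)
```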
87 | 88 | ### Simulation Workflow 89 | In our enhanced QSVM simulation workflow using NVIDIA's cuQuantum SDK, the cuTensorNet module plays a pivotal role. This integration allows for the efficient transformation of quantum circuits into tensor networks, significantly reducing computational complexity from exponential to quadratic with respect to the number of qubits. By leveraging cuTensorNet’s advanced strategies like path reuse and non-blocking multi-GPU operations, we achieve substantial improvements in simulation speed and efficiency, enabling practical, large-scale quantum simulations up to 784 qubits. 90 | 91 |
92 | Process Flow 93 |
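The path-reuse strategy mentioned above can be sketched as follows, using the same `Network`, `contract_path`, and `reset_operands` calls as the benchmark scripts; here `expression` and `operands` are assumed to come from a `CircuitToEinsum` conversion of a kernel circuit, and `operand_sets` stands in for the per-data-pair operand lists built during kernel construction.

```python
from cuquantum import Network, NetworkOptions

# Stand-in for the per-pair operand lists; in the real workflow each entry holds
# the gate tensors of one (x_i, x_j) kernel circuit.
operand_sets = [operands, operands]

options = NetworkOptions(blocking="auto", device_id=0)
with Network(expression, *operands, options=options) as tn:
    tn.contract_path()           # find the contraction path once
    tn.autotune(iterations=20)   # optional autotuning, as in the benchmark scripts
    kernel_entries = []
    for ops in operand_sets:
        tn.reset_operands(*ops)  # swap in new gate tensors, keep the same path
        kernel_entries.append(abs(tn.contract()) ** 2)
```

Because every kernel entry shares the same circuit structure, the contraction path only has to be found once; only the parameterized gate tensors change between data pairs.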
94 | 95 | ### Multi-GPU Enable 96 | In our study on distributed simulation within high-performance computing, we expanded QSVM model simulations using a multi-GPU setup to handle a dataset of over 1,000 MNIST images (28x28 pixels, 784 features). Leveraging NVIDIA's cuStateVector with high-speed [NVLink](https://www.nvidia.com/en-gb/design-visualization/nvlink-bridges/) and [MPI](https://developer.nvidia.com/mpi-solutions-gpus) communication, we achieved significant computational efficiencies and demonstrated a linear speedup in quantum circuit simulations across multiple GPUs on the [NVIDIA DGX Platform](https://www.nvidia.com/en-gb/data-center/dgx-platform/). 97 | 98 |
99 | Multi-GPU-Result 100 |
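The sketch below shows the rank-to-GPU binding and the static partitioning of kernel index pairs used by the `*-mpi_*` benchmark scripts; the toy pair list and the placeholder results are for illustration only.

```python
import cupy as cp
from mpi4py import MPI
from cupy.cuda.runtime import getDeviceCount

# Bind each MPI rank to one local GPU.
comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
cp.cuda.Device(rank % getDeviceCount()).use()

# Split the list of kernel-matrix index pairs (i, j) evenly across ranks,
# using the same chunk/extra arithmetic as the benchmark scripts.
pairs = [(i, j) for i in range(1, 6) for j in range(i + 1, 6)]  # toy stand-in
chunk, extra = divmod(len(pairs), size)
begin = rank * chunk + min(rank, extra)
end = len(pairs) if rank == size - 1 else (rank + 1) * chunk + min(rank + 1, extra)
local_pairs = pairs[begin:end]

# Each rank contracts only its own pairs; the root gathers the partial results
# and assembles the full kernel matrix.
local_results = [float(i + j) for i, j in local_pairs]  # placeholder for |amplitude|^2
gathered = comm.gather(local_results, root=0)
if rank == 0:
    print(sum(gathered, []))
```

Launched with `mpirun -np <num_GPUs>`, this static partitioning keeps all GPUs busy with roughly equal numbers of contractions, since every pair reuses the same network and contraction path.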
101 | 102 | ## How to cite 103 | 104 | If you used this package or framework for your research, please cite: 105 | 106 | ```text 107 | @article{10.1088/2632-2153/adb4ba, 108 | author={Chen, Kuan-Cheng and Li, Tai-Yue and Wang, Yun-Yuan and See, Simon and Wang, Chun-Chieh and Wille, Robert and Chen, Nan-Yow and Yang, An-Cheng and Lin, Chun-Yu}, 109 | title={Validating Large-Scale Quantum Machine Learning: Efficient Simulation of Quantum Support Vector Machines Using Tensor Networks}, 110 | journal={Machine Learning: Science and Technology}, 111 | url={http://iopscience.iop.org/article/10.1088/2632-2153/adb4ba}, 112 | year={2025} 113 | } 114 | ``` 115 | -------------------------------------------------------------------------------- /application/banchmark_qsvm_tnsm-mpi_app.py: -------------------------------------------------------------------------------- 1 | import time 2 | import numpy as np 3 | import pandas as pd 4 | from itertools import combinations, chain, product 5 | from sklearn.svm import SVC 6 | from sklearn.model_selection import train_test_split 7 | from sklearn.decomposition import PCA 8 | from sklearn.preprocessing import StandardScaler, MinMaxScaler 9 | from sklearn.datasets import load_digits, fetch_openml 10 | from qiskit.circuit.library import PauliFeatureMap, ZFeatureMap, ZZFeatureMap 11 | from qiskit_machine_learning.kernels import QuantumKernel 12 | from qiskit import QuantumCircuit 13 | from qiskit.circuit import ParameterVector 14 | from cuquantum import * 15 | import cupy as cp 16 | from cupy.cuda import nccl 17 | from cupy.cuda.runtime import getDeviceCount 18 | from mpi4py import MPI 19 | 20 | # mpi setup 21 | root = 0 22 | comm_mpi = MPI.COMM_WORLD 23 | rank, size = comm_mpi.Get_rank(), comm_mpi.Get_size() 24 | device_id = rank % getDeviceCount() 25 | cp.cuda.Device(device_id).use() 26 | name = MPI.Get_processor_name() 27 | print("MPI rank %d / %d on %s." 
% (rank, size, name)) 28 | 29 | # input data 30 | mnist = fetch_openml('mnist_784') 31 | X = mnist.data.to_numpy() 32 | Y = mnist.target.to_numpy().astype(int) 33 | class_list = [7,9] 34 | c01 = np.where((Y == class_list[0])|(Y == class_list[1])) 35 | X,Y = X[c01],Y[c01] 36 | MAX=1000 37 | data_train, label_train = X[:MAX],Y[:MAX] 38 | X_train, X_val, Y_train, Y_val = train_test_split(data_train, label_train, test_size = 0.2, random_state=255) 39 | 40 | if rank == root: 41 | print('qubits, acc_train, acc_valid, data, exp_t, operand_t, path_t, contact_t') 42 | 43 | def data_prepare(n_dim, sample_train, sample_test, nb1, nb2): 44 | std_scale = StandardScaler().fit(sample_train) 45 | data = std_scale.transform(sample_train) 46 | sample_train = std_scale.transform(sample_train) 47 | sample_test = std_scale.transform(sample_test) 48 | pca = PCA(n_components=n_dim, svd_solver="auto").fit(data) 49 | sample_train = pca.transform(sample_train) 50 | sample_test = pca.transform(sample_test) 51 | samples = np.append(sample_train, sample_test, axis=0) 52 | minmax_scale = MinMaxScaler((-1, 1)).fit(samples) 53 | sample_train = minmax_scale.transform(sample_train)[:nb1] 54 | sample_test = minmax_scale.transform(sample_test)[:nb2] 55 | return sample_train, sample_test 56 | def make_bsp(n_dim): 57 | param = ParameterVector("p",n_dim) 58 | bsp_qc = QuantumCircuit(n_dim) 59 | bsp_qc.h(list(range(n_dim))) 60 | i = 0 61 | for q in range(n_dim): 62 | bsp_qc.rz(param.params[q],[q]) 63 | bsp_qc.ry(param.params[q],[q]) 64 | for q in range(n_dim-1): 65 | bsp_qc.cx(0+i, 1+i) 66 | i+=1 67 | for q in range(n_dim): 68 | bsp_qc.rz(param.params[q],[q]) 69 | return bsp_qc 70 | def build_qsvm_qc(bsp_qc,n_dim,y_t,x_t): 71 | qc_1 = bsp_qc.assign_parameters(y_t).to_gate() 72 | qc_2 = bsp_qc.assign_parameters(x_t).inverse().to_gate() 73 | kernel_qc = QuantumCircuit(n_dim) 74 | kernel_qc.append(qc_1,list(range(n_dim))) 75 | kernel_qc.append(qc_2,list(range(n_dim))) 76 | return kernel_qc 77 | def renew_operand(n_dim,oper_tmp,y_t,x_t): 78 | oper = oper_tmp.copy() 79 | n_zg, n_zy_g = [], [] 80 | for d1 in y_t: 81 | z_g = np.array([[np.exp(-1j*0.5*d1),0],[0,np.exp(1j*0.5*d1)]]) 82 | n_zg.append(z_g) 83 | y_g = np.array([[np.cos(d1/2),-np.sin(d1/2)],[np.sin(d1/2),np.cos(d1/2)]]) 84 | n_zy_g.append(z_g) 85 | n_zy_g.append(y_g) 86 | oper[n_dim*2:n_dim*4] = cp.array(n_zy_g) 87 | oper[n_dim*5-1:n_dim*6-1] = cp.array(n_zg) 88 | n_zgd, n_zy_gd = [], [] 89 | for d2 in x_t[::-1]: 90 | z_gd = np.array([[np.exp(1j*0.5*d2),0],[0,np.exp(-1j*0.5*d2)]]) 91 | n_zgd.append(z_gd) 92 | y_gd = np.array([[np.cos(d2/2),np.sin(d2/2)],[-np.sin(d2/2),np.cos(d2/2)]]) 93 | n_zy_gd.append(y_gd) 94 | n_zy_gd.append(z_gd) 95 | oper[n_dim*6-1:n_dim*7-1] = cp.array(n_zgd) 96 | oper[n_dim*8-2:n_dim*10-2] = cp.array(n_zy_gd) 97 | return oper 98 | def data_partition(indices_list,size,rank): 99 | num_data = len(indices_list) 100 | chunk, extra = num_data // size, num_data % size 101 | data_begin = rank * chunk + min(rank, extra) 102 | data_end = num_data if rank == size - 1 else (rank + 1) * chunk + min(rank + 1, extra) 103 | data_index = range(data_begin,data_end) 104 | indices_list_rank = indices_list[data_begin:data_end] 105 | return indices_list_rank 106 | def data_to_operand(n_dim,operand_tmp,data1,data2,indices_list): 107 | operand_list = [] 108 | for i1, i2 in indices_list: 109 | n_op = renew_operand(n_dim,operand_tmp,data1[i1-1],data2[i2-1]) 110 | operand_list.append(n_op) 111 | return operand_list 112 | def operand_to_amp(opers, network): 113 | amp_tmp = [] 
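    # `network` is built once in run_tnsm() from the einsum expression of a template
    # kernel circuit; reset_operands() below swaps in the gate tensors of each data
    # pair so the contraction path found by network.contract_path() is reused for
    # every kernel entry instead of being recomputed.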
114 | with network as tn: 115 | for i in range(len(opers)): 116 | tn.reset_operands(*opers[i]) 117 | amp_tn = abs(tn.contract()) ** 2 118 | amp_tmp.append(amp_tn) 119 | return amp_tmp 120 | def get_kernel_matrix(data1, data2, amp_data, indices_list, mode=None): 121 | amp_m = list(chain.from_iterable(amp_data)) 122 | # print(len(amp),len(indices_list)) 123 | kernel_matrix = np.zeros((len(data1),len(data2))) 124 | i = -1 125 | for i1, i2 in indices_list: 126 | i += 1 127 | kernel_matrix[i1-1][i2-1] = np.round(amp_m[i],8) 128 | if mode == 'train': 129 | kernel_matrix = kernel_matrix + kernel_matrix.T+np.diag(np.ones((len(data2)))) 130 | return kernel_matrix 131 | 132 | def run_tnsm(data_train, data_val, n_dim): 133 | #1. data partition 134 | list_train = list(combinations(range(1, len(data_train) + 1), 2)) 135 | list_val = list(product(range(1, len(data_val) + 1),range(1, len(data_train) + 1))) 136 | list_train_partition = data_partition(list_train,size,rank) 137 | list_val_partition = data_partition(list_val,size,rank) 138 | 139 | #2. data to operand 140 | #2-1. quantum circuit setup and get exp 141 | t0 = time.time() 142 | bsp_qc = make_bsp(n_dim) 143 | circuit = build_qsvm_qc(bsp_qc,n_dim, data_train[0], data_train[0]) 144 | converter = CircuitToEinsum(circuit, dtype='complex128', backend='cupy') 145 | a = str(0).zfill(n_dim) 146 | exp, oper = converter.amplitude(a) 147 | exp_t = round((time.time()-t0),3) 148 | 149 | #2-2. all data to operand 150 | t0 = time.time() 151 | oper_train = data_to_operand(n_dim,oper,data_train,data_train,list_train_partition) 152 | oper_val = data_to_operand(n_dim,oper,data_val,data_train,list_val_partition) 153 | oper_t = round((time.time()-t0),3) 154 | 155 | #3. operand to amplitude 156 | #3-1. tensor network setup 157 | t0 = time.time() 158 | options = NetworkOptions(blocking="auto",device_id=device_id) 159 | network = Network(exp, *oper,options=options) 160 | path, info = network.contract_path() 161 | network.autotune(iterations=20) 162 | path_t = round((time.time()-t0),3) 163 | 164 | #3-2. 
all operand to amplitude 165 | t0 = time.time() 166 | oper_data = oper_train+oper_val 167 | amp_list = operand_to_amp(oper_data, network) 168 | amp_train = cp.array(amp_list[:len(oper_train)]) 169 | amp_valid = cp.array(amp_list[len(oper_train):len(oper_train)+len(oper_val)]) 170 | amp_data_train = comm_mpi.gather(amp_train, root=0) 171 | amp_data_valid = comm_mpi.gather(amp_valid, root=0) 172 | tnsm_kernel_t = round((time.time()-t0),3) 173 | 174 | if rank == root: 175 | kernel_train = get_kernel_matrix(data_train, data_train, amp_data_train, list_train, mode='train') 176 | kernel_valid = get_kernel_matrix(data_val, data_train, amp_data_valid, list_val, mode=None) 177 | svc = SVC(kernel="precomputed") 178 | svc.fit(kernel_train ,Y_train) 179 | acc_train = svc.score(kernel_train,Y_train) 180 | acc_test = svc.score(kernel_valid,Y_val) 181 | print(n_dim, round(acc_train, 5), round(acc_test, 5), len(data_train), exp_t, oper_t, path_t, tnsm_kernel_t, len(list_train_partition)/len(list_train), len(amp_data_train), len(amp_data_valid)) 182 | 183 | dd = np.zeros((10,2)) 184 | run_tnsm(dd, dd, 2) 185 | for ndim in [2,4,8,16,32,64,128]: 186 | for d in [20,40,60,80,100,200,400,600,800,1000]: 187 | dtrain, dval = data_prepare(ndim, X_train, X_val, d, 5) 188 | # dtrain = np.zeros((d,ndim)) 189 | run_tnsm(dtrain, dval, ndim) -------------------------------------------------------------------------------- /benchmark/README.md: -------------------------------------------------------------------------------- 1 | ## cutn-qsvm benchmark 2 | ### 1. Single Data Pair with Single GPU 3 | ``` 4 | mpirun -np 1 python banchmark_qsvm_tnsm-mpi_sgpu.py 5 | ``` 6 | #### - Runtime compare with v100, a100 and h100 GPU 7 | ![alt text](figure/figure_sgpu.png) 8 | 9 | #### - Runtime detail with h100 GPU 10 | ![alt text](figure/figure1_sgpu.png) 11 | 12 | ### 2. 
Multiple Data Pairs with Multiple GPUs 13 | ``` 14 | #!/bin/bash 15 | #SBATCH -J mgpu -p nchc 16 | #SBATCH --nodes=1 --ntasks-per-node=8 --cpus-per-task=10 17 | #SBATCH --gres=gpu:8 18 | #SBATCH --mem-bind=no 19 | 20 | ml purge 21 | ml cuq/12 22 | source /beegfs/_venv/cuq24cu12/bin/activate 23 | mpirun python banchmark_qsvm_tnsm-mpi_mgpu.py 24 | ``` 25 | #### - mgpu with v100 26 | ![alt text](figure/figure2_mgpu_v100.png) 27 | 28 | #### - mgpu with h100 29 | ![alt text](figure/figure3_mgpu_h100.png) -------------------------------------------------------------------------------- /benchmark/banchmark_qsvm_svsm_cusvaer.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cupy as cp 3 | import pandas as pd 4 | import time 5 | import matplotlib.pyplot as plt 6 | from itertools import combinations,product 7 | from multiprocessing import Pool 8 | from sklearn.svm import SVC 9 | from sklearn.model_selection import train_test_split 10 | from sklearn.decomposition import PCA 11 | from sklearn.preprocessing import StandardScaler, MinMaxScaler 12 | from sklearn.datasets import load_digits, fetch_openml 13 | from sklearn.model_selection import GridSearchCV 14 | from qiskit.circuit.library import PauliFeatureMap, ZFeatureMap, ZZFeatureMap 15 | from qiskit_machine_learning.kernels import QuantumKernel 16 | from qiskit import QuantumCircuit, transpile, Aer 17 | from qiskit.circuit import ParameterVector 18 | from cuquantum import * 19 | import time 20 | from mpi4py import MPI 21 | 22 | mnist = fetch_openml('mnist_784') 23 | X = mnist.data.to_numpy() 24 | Y = mnist.target.to_numpy().astype(int) 25 | class_list = [7,9] 26 | c01 = np.where((Y == class_list[0])|(Y == class_list[1])) 27 | X,Y = X[c01],Y[c01] 28 | data_train, label_train = X[:1000],Y[:1000] 29 | X_train, X_val, Y_train, Y_val = train_test_split(data_train, label_train, test_size = 0.2, random_state=255) 30 | 31 | def data_prepare(n_dim, sample_train, sample_test, nb1, nb2): 32 | std_scale = StandardScaler().fit(sample_train) 33 | data = std_scale.transform(sample_train) 34 | sample_train = std_scale.transform(sample_train) 35 | sample_test = std_scale.transform(sample_test) 36 | pca = PCA(n_components=n_dim, svd_solver="full").fit(data) 37 | sample_train = pca.transform(sample_train) 38 | sample_test = pca.transform(sample_test) 39 | samples = np.append(sample_train, sample_test, axis=0) 40 | minmax_scale = MinMaxScaler((-1, 1)).fit(samples) 41 | sample_train = minmax_scale.transform(sample_train)[:nb1] 42 | sample_test = minmax_scale.transform(sample_test)[:nb2] 43 | return sample_train, sample_test 44 | def make_bsp(n_dim): 45 | param = ParameterVector("p",n_dim) 46 | bsp_qc = QuantumCircuit(n_dim) 47 | bsp_qc.h(list(range(n_dim))) 48 | i = 0 49 | for q in range(n_dim): 50 | bsp_qc.rz(param.params[q],[q]) 51 | bsp_qc.ry(param.params[q],[q]) 52 | for q in range(n_dim-1): 53 | bsp_qc.cx(0+i, 1+i) 54 | i+=1 55 | for q in range(n_dim): 56 | bsp_qc.rz(param.params[q],[q]) 57 | return bsp_qc 58 | def all_circuits_parallel(y_t, x_t, indices_list, n_dim, kernel, num_cpu): 59 | with Pool(processes=num_cpu, maxtasksperchild=100) as pool: 60 | circuits = pool.starmap(kernel.construct_circuit, [(y_t[i1-1], x_t[i2-1],False) for i1, i2 in indices_list]) 61 | return circuits 62 | def kernel_matrix_svsm(y_t, x_t, circuit,indices_list, simulator, mode=None): 63 | kernel_matrix = np.zeros((len(y_t),len(x_t))) 64 | i = -1 65 | for i1, i2 in indices_list: 66 | i += 1 67 | qc = circuit[i] 68 | 
qc.save_statevector() 69 | circ = transpile(qc, simulator) 70 | result = simulator.run(circ).result() 71 | amp = abs(result.get_statevector()[0]) ** 2 72 | kernel_matrix[i1-1][i2-1] = np.round(amp,8) 73 | if mode == 'train': 74 | kernel_matrix = kernel_matrix + kernel_matrix.T+np.diag(np.ones((len(x_t)))) 75 | return kernel_matrix 76 | 77 | simulator = Aer.get_backend('aer_simulator_statevector') 78 | simulator.set_option('cusvaer_enable', False) 79 | simulator.set_option('precision', 'double') 80 | 81 | def run_svsm(n_dim,simulator, nb1, nb2): 82 | data_train, data_val = data_prepare(n_dim, X_train, X_val, nb1, nb2) 83 | bsp_qc = make_bsp(n_dim) 84 | bsp_kernel_svsm = QuantumKernel(feature_map=bsp_qc, quantum_instance=simulator) 85 | indices_list_t = list(combinations(range(1, len(data_train) + 1), 2)) 86 | t0 = time.time() 87 | circuit_train = all_circuits_parallel(data_train, data_train, indices_list_t, n_dim, bsp_kernel_svsm, 10) 88 | circuit_t = round((time.time()-t0),3) 89 | t0 = time.time() 90 | svsm_kernel_matrix_train = kernel_matrix_svsm(data_train, data_train, circuit_train, indices_list_t, simulator, mode="train") 91 | svsm_kernel_t = round((time.time()-t0),3) 92 | if MPI.COMM_WORLD.Get_rank() == 0: 93 | print(n_dim,circuit_t,svsm_kernel_t,len(circuit_train)) 94 | 95 | run_svsm(2,simulator,2,1) 96 | for q in range(2,37): 97 | run_svsm(q,simulator,2,1) 98 | 99 | -------------------------------------------------------------------------------- /benchmark/banchmark_qsvm_tnsm-mpi_mgpu.py: -------------------------------------------------------------------------------- 1 | import time 2 | import numpy as np 3 | import pandas as pd 4 | from itertools import combinations, chain, product 5 | from sklearn.model_selection import train_test_split 6 | from sklearn.decomposition import PCA 7 | from sklearn.preprocessing import StandardScaler, MinMaxScaler 8 | from sklearn.datasets import load_digits, fetch_openml 9 | from qiskit.circuit.library import PauliFeatureMap, ZFeatureMap, ZZFeatureMap 10 | from qiskit import QuantumCircuit 11 | from qiskit.circuit import ParameterVector 12 | from cuquantum import * 13 | import cupy as cp 14 | from cupy.cuda import nccl 15 | from cupy.cuda.runtime import getDeviceCount 16 | from mpi4py import MPI 17 | 18 | # mpi setup 19 | root = 0 20 | comm_mpi = MPI.COMM_WORLD 21 | rank, size = comm_mpi.Get_rank(), comm_mpi.Get_size() 22 | device_id = rank % getDeviceCount() 23 | cp.cuda.Device(device_id).use() 24 | name = MPI.Get_processor_name() 25 | print("MPI rank %d / %d on %s." 
% (rank, size, name)) 26 | 27 | # input data 28 | mnist = fetch_openml('mnist_784') 29 | X = mnist.data.to_numpy() 30 | Y = mnist.target.to_numpy().astype(int) 31 | class_list = [7,9] 32 | c01 = np.where((Y == class_list[0])|(Y == class_list[1])) 33 | X,Y = X[c01],Y[c01] 34 | MAX=1600 35 | data_train, label_train = X[:MAX],Y[:MAX] 36 | X_train, X_val, Y_train, Y_val = train_test_split(data_train, label_train, test_size = 0.2, random_state=255) 37 | 38 | if rank == root: 39 | print(f' qubits, [num train data, num list, num parti-list, num gpu], [exp_t, operand_t, path_t, contact_t, total_t]') 40 | 41 | def data_prepare(n_dim, sample_train, sample_test, nb1, nb2): 42 | std_scale = StandardScaler().fit(sample_train) 43 | data = std_scale.transform(sample_train) 44 | sample_train = std_scale.transform(sample_train) 45 | sample_test = std_scale.transform(sample_test) 46 | pca = PCA(n_components=n_dim, svd_solver="auto").fit(data) 47 | sample_train = pca.transform(sample_train) 48 | sample_test = pca.transform(sample_test) 49 | samples = np.append(sample_train, sample_test, axis=0) 50 | minmax_scale = MinMaxScaler((-1, 1)).fit(samples) 51 | sample_train = minmax_scale.transform(sample_train)[:nb1] 52 | sample_test = minmax_scale.transform(sample_test)[:nb2] 53 | return sample_train, sample_test 54 | def make_bsp(n_dim): 55 | param = ParameterVector("p",n_dim) 56 | bsp_qc = QuantumCircuit(n_dim) 57 | bsp_qc.h(list(range(n_dim))) 58 | i = 0 59 | for q in range(n_dim): 60 | bsp_qc.rz(param.params[q],[q]) 61 | bsp_qc.ry(param.params[q],[q]) 62 | for q in range(n_dim-1): 63 | bsp_qc.cx(0+i, 1+i) 64 | i+=1 65 | for q in range(n_dim): 66 | bsp_qc.rz(param.params[q],[q]) 67 | return bsp_qc 68 | def build_qsvm_qc(bsp_qc,n_dim,y_t,x_t): 69 | qc_1 = bsp_qc.assign_parameters(y_t).to_gate() 70 | qc_2 = bsp_qc.assign_parameters(x_t).inverse().to_gate() 71 | kernel_qc = QuantumCircuit(n_dim) 72 | kernel_qc.append(qc_1,list(range(n_dim))) 73 | kernel_qc.append(qc_2,list(range(n_dim))) 74 | return kernel_qc 75 | def renew_operand(n_dim,oper_tmp,y_t,x_t): 76 | oper = oper_tmp.copy() 77 | n_zg, n_zy_g = [], [] 78 | for d1 in y_t: 79 | z_g = np.array([[np.exp(-1j*0.5*d1),0],[0,np.exp(1j*0.5*d1)]]) 80 | n_zg.append(z_g) 81 | y_g = np.array([[np.cos(d1/2),-np.sin(d1/2)],[np.sin(d1/2),np.cos(d1/2)]]) 82 | n_zy_g.append(z_g) 83 | n_zy_g.append(y_g) 84 | oper[n_dim*2:n_dim*4] = cp.array(n_zy_g) 85 | oper[n_dim*5-1:n_dim*6-1] = cp.array(n_zg) 86 | n_zgd, n_zy_gd = [], [] 87 | for d2 in x_t[::-1]: 88 | z_gd = np.array([[np.exp(1j*0.5*d2),0],[0,np.exp(-1j*0.5*d2)]]) 89 | n_zgd.append(z_gd) 90 | y_gd = np.array([[np.cos(d2/2),np.sin(d2/2)],[-np.sin(d2/2),np.cos(d2/2)]]) 91 | n_zy_gd.append(y_gd) 92 | n_zy_gd.append(z_gd) 93 | oper[n_dim*6-1:n_dim*7-1] = cp.array(n_zgd) 94 | oper[n_dim*8-2:n_dim*10-2] = cp.array(n_zy_gd) 95 | return oper 96 | def data_partition(indices_list,size,rank): 97 | num_data = len(indices_list) 98 | chunk, extra = num_data // size, num_data % size 99 | data_begin = rank * chunk + min(rank, extra) 100 | data_end = num_data if rank == size - 1 else (rank + 1) * chunk + min(rank + 1, extra) 101 | data_index = range(data_begin,data_end) 102 | indices_list_rank = indices_list[data_begin:data_end] 103 | return indices_list_rank 104 | def data_to_operand(n_dim,operand_tmp,data1,data2,indices_list): 105 | operand_list = [] 106 | for i1, i2 in indices_list: 107 | n_op = renew_operand(n_dim,operand_tmp,data1[i1-1],data2[i2-1]) 108 | operand_list.append(n_op) 109 | return operand_list 110 | def 
operand_to_amp(opers, network): 111 | amp_tmp = [] 112 | with network as tn: 113 | for i in range(len(opers)): 114 | tn.reset_operands(*opers[i]) 115 | amp_tn = abs(tn.contract()) ** 2 116 | amp_tmp.append(amp_tn) 117 | return amp_tmp 118 | def get_kernel_matrix(data1, data2, amp_data, indices_list, mode=None): 119 | amp_m = list(chain.from_iterable(amp_data)) 120 | # print(len(amp),len(indices_list)) 121 | kernel_matrix = np.zeros((len(data1),len(data2))) 122 | i = -1 123 | for i1, i2 in indices_list: 124 | i += 1 125 | kernel_matrix[i1-1][i2-1] = np.round(amp_m[i],8) 126 | if mode == 'train': 127 | kernel_matrix = kernel_matrix + kernel_matrix.T+np.diag(np.ones((len(data2)))) 128 | return kernel_matrix 129 | 130 | def run_tnsm(data_train, n_dim): 131 | #1. data partition 132 | list_train = list(combinations(range(1, len(data_train) + 1), 2)) 133 | list_train_partition = data_partition(list_train,size,rank) 134 | 135 | #2. data to operand 136 | #2-1. quantum circuit setup and get exp 137 | t0 = time.time() 138 | bsp_qc = make_bsp(n_dim) 139 | circuit = build_qsvm_qc(bsp_qc,n_dim, data_train[0], data_train[0]) 140 | converter = CircuitToEinsum(circuit, dtype='complex128', backend='cupy') 141 | a = str(0).zfill(n_dim) 142 | exp, oper = converter.amplitude(a) 143 | exp_t = round((time.time()-t0),3) 144 | 145 | #2-2. all data to operand 146 | t0 = time.time() 147 | oper_train = data_to_operand(n_dim,oper,data_train,data_train,list_train_partition) 148 | oper_t = round((time.time()-t0),3) 149 | 150 | #3. operand to amplitude 151 | #3-1. tensor network setup 152 | t0 = time.time() 153 | options = NetworkOptions(blocking="auto",device_id=device_id) 154 | network = Network(exp, *oper,options=options) 155 | path, info = network.contract_path() 156 | network.autotune(iterations=20) 157 | path_t = round((time.time()-t0),3) 158 | 159 | #3-2. 
all operand to amplitude 160 | t0 = time.time() 161 | oper_data = oper_train 162 | amp_list = operand_to_amp(oper_data, network) 163 | amp_train = cp.array(amp_list[:len(oper_train)]) 164 | amp_data_train = comm_mpi.gather(amp_train, root=0) 165 | tnsm_kernel_t = round((time.time()-t0),3) 166 | 167 | if rank == root: 168 | print(f' {n_dim}, {len(data_train)}, {len(list_train)}, {len(list_train_partition)}, {len(amp_data_train)}, {exp_t}, {oper_t}, {path_t}, {tnsm_kernel_t}, {round((exp_t+oper_t+path_t+tnsm_kernel_t),3)}') 169 | 170 | dd = np.zeros((20,2)) 171 | run_tnsm(dd, 2) 172 | ## for 1 node 8 gpus 173 | for ndim in [2,4,8,16,32,64,128,256,512,784]: 174 | for d in [20,40,50,60,80,100,200,400,500,600,800,1000]: 175 | dd = np.zeros((d,ndim)) 176 | run_tnsm(dd, ndim) 177 | ## for 4 node 8 gpus 178 | # for ndim in [1024,2048,2352,3072,4096]: 179 | # for d in [20,40,50,60,80,100,200,400,500,600,800,1000]: 180 | # dd = np.zeros((d,ndim)) 181 | # run_tnsm(dd, ndim) -------------------------------------------------------------------------------- /benchmark/banchmark_qsvm_tnsm-mpi_sgpu.py: -------------------------------------------------------------------------------- 1 | import time 2 | import numpy as np 3 | import pandas as pd 4 | from itertools import combinations, chain, product 5 | from sklearn.model_selection import train_test_split 6 | from sklearn.decomposition import PCA 7 | from sklearn.preprocessing import StandardScaler, MinMaxScaler 8 | from sklearn.datasets import load_digits, fetch_openml 9 | from qiskit.circuit.library import PauliFeatureMap, ZFeatureMap, ZZFeatureMap 10 | from qiskit import QuantumCircuit 11 | from qiskit.circuit import ParameterVector 12 | from cuquantum import * 13 | import cupy as cp 14 | from cupy.cuda import nccl 15 | from cupy.cuda.runtime import getDeviceCount 16 | from mpi4py import MPI 17 | 18 | # mpi setup 19 | root = 0 20 | comm_mpi = MPI.COMM_WORLD 21 | rank, size = comm_mpi.Get_rank(), comm_mpi.Get_size() 22 | device_id = rank % getDeviceCount() 23 | cp.cuda.Device(device_id).use() 24 | name = MPI.Get_processor_name() 25 | print("MPI rank %d / %d on %s." 
% (rank, size, name)) 26 | 27 | # input data 28 | mnist = fetch_openml('mnist_784') 29 | X = mnist.data.to_numpy() 30 | Y = mnist.target.to_numpy().astype(int) 31 | class_list = [7,9] 32 | c01 = np.where((Y == class_list[0])|(Y == class_list[1])) 33 | X,Y = X[c01],Y[c01] 34 | MAX=1600 35 | data_train, label_train = X[:MAX],Y[:MAX] 36 | X_train, X_val, Y_train, Y_val = train_test_split(data_train, label_train, test_size = 0.2, random_state=255) 37 | 38 | if rank == root: 39 | print(f' qubits, [num train data, num list, num parti-list, num gpu], [exp_t, operand_t, path_t, contact_t, total_t]') 40 | 41 | def data_prepare(n_dim, sample_train, sample_test, nb1, nb2): 42 | std_scale = StandardScaler().fit(sample_train) 43 | data = std_scale.transform(sample_train) 44 | sample_train = std_scale.transform(sample_train) 45 | sample_test = std_scale.transform(sample_test) 46 | pca = PCA(n_components=n_dim, svd_solver="auto").fit(data) 47 | sample_train = pca.transform(sample_train) 48 | sample_test = pca.transform(sample_test) 49 | samples = np.append(sample_train, sample_test, axis=0) 50 | minmax_scale = MinMaxScaler((-1, 1)).fit(samples) 51 | sample_train = minmax_scale.transform(sample_train)[:nb1] 52 | sample_test = minmax_scale.transform(sample_test)[:nb2] 53 | return sample_train, sample_test 54 | def make_bsp(n_dim): 55 | param = ParameterVector("p",n_dim) 56 | bsp_qc = QuantumCircuit(n_dim) 57 | bsp_qc.h(list(range(n_dim))) 58 | i = 0 59 | for q in range(n_dim): 60 | bsp_qc.rz(param.params[q],[q]) 61 | bsp_qc.ry(param.params[q],[q]) 62 | for q in range(n_dim-1): 63 | bsp_qc.cx(0+i, 1+i) 64 | i+=1 65 | for q in range(n_dim): 66 | bsp_qc.rz(param.params[q],[q]) 67 | return bsp_qc 68 | def build_qsvm_qc(bsp_qc,n_dim,y_t,x_t): 69 | qc_1 = bsp_qc.assign_parameters(y_t).to_gate() 70 | qc_2 = bsp_qc.assign_parameters(x_t).inverse().to_gate() 71 | kernel_qc = QuantumCircuit(n_dim) 72 | kernel_qc.append(qc_1,list(range(n_dim))) 73 | kernel_qc.append(qc_2,list(range(n_dim))) 74 | return kernel_qc 75 | def renew_operand(n_dim,oper_tmp,y_t,x_t): 76 | oper = oper_tmp.copy() 77 | n_zg, n_zy_g = [], [] 78 | for d1 in y_t: 79 | z_g = np.array([[np.exp(-1j*0.5*d1),0],[0,np.exp(1j*0.5*d1)]]) 80 | n_zg.append(z_g) 81 | y_g = np.array([[np.cos(d1/2),-np.sin(d1/2)],[np.sin(d1/2),np.cos(d1/2)]]) 82 | n_zy_g.append(z_g) 83 | n_zy_g.append(y_g) 84 | oper[n_dim*2:n_dim*4] = cp.array(n_zy_g) 85 | oper[n_dim*5-1:n_dim*6-1] = cp.array(n_zg) 86 | n_zgd, n_zy_gd = [], [] 87 | for d2 in x_t[::-1]: 88 | z_gd = np.array([[np.exp(1j*0.5*d2),0],[0,np.exp(-1j*0.5*d2)]]) 89 | n_zgd.append(z_gd) 90 | y_gd = np.array([[np.cos(d2/2),np.sin(d2/2)],[-np.sin(d2/2),np.cos(d2/2)]]) 91 | n_zy_gd.append(y_gd) 92 | n_zy_gd.append(z_gd) 93 | oper[n_dim*6-1:n_dim*7-1] = cp.array(n_zgd) 94 | oper[n_dim*8-2:n_dim*10-2] = cp.array(n_zy_gd) 95 | return oper 96 | def data_partition(indices_list,size,rank): 97 | num_data = len(indices_list) 98 | chunk, extra = num_data // size, num_data % size 99 | data_begin = rank * chunk + min(rank, extra) 100 | data_end = num_data if rank == size - 1 else (rank + 1) * chunk + min(rank + 1, extra) 101 | data_index = range(data_begin,data_end) 102 | indices_list_rank = indices_list[data_begin:data_end] 103 | return indices_list_rank 104 | def data_to_operand(n_dim,operand_tmp,data1,data2,indices_list): 105 | operand_list = [] 106 | for i1, i2 in indices_list: 107 | n_op = renew_operand(n_dim,operand_tmp,data1[i1-1],data2[i2-1]) 108 | operand_list.append(n_op) 109 | return operand_list 110 | def 
operand_to_amp(opers, network): 111 | amp_tmp = [] 112 | with network as tn: 113 | for i in range(len(opers)): 114 | tn.reset_operands(*opers[i]) 115 | amp_tn = abs(tn.contract()) ** 2 116 | amp_tmp.append(amp_tn) 117 | return amp_tmp 118 | def get_kernel_matrix(data1, data2, amp_data, indices_list, mode=None): 119 | amp_m = list(chain.from_iterable(amp_data)) 120 | # print(len(amp),len(indices_list)) 121 | kernel_matrix = np.zeros((len(data1),len(data2))) 122 | i = -1 123 | for i1, i2 in indices_list: 124 | i += 1 125 | kernel_matrix[i1-1][i2-1] = np.round(amp_m[i],8) 126 | if mode == 'train': 127 | kernel_matrix = kernel_matrix + kernel_matrix.T+np.diag(np.ones((len(data2)))) 128 | return kernel_matrix 129 | 130 | def run_tnsm(data_train, n_dim): 131 | #1. data partition 132 | list_train = list(combinations(range(1, len(data_train) + 1), 2)) 133 | list_train_partition = data_partition(list_train,size,rank) 134 | 135 | #2. data to operand 136 | #2-1. quantum circuit setup and get exp 137 | t0 = time.time() 138 | bsp_qc = make_bsp(n_dim) 139 | circuit = build_qsvm_qc(bsp_qc,n_dim, data_train[0], data_train[0]) 140 | converter = CircuitToEinsum(circuit, dtype='complex128', backend='cupy') 141 | a = str(0).zfill(n_dim) 142 | exp, oper = converter.amplitude(a) 143 | exp_t = round((time.time()-t0),3) 144 | 145 | #2-2. all data to operand 146 | t0 = time.time() 147 | oper_train = data_to_operand(n_dim,oper,data_train,data_train,list_train_partition) 148 | oper_t = round((time.time()-t0),3) 149 | 150 | #3. operand to amplitude 151 | #3-1. tensor network setup 152 | t0 = time.time() 153 | options = NetworkOptions(blocking="auto",device_id=device_id) 154 | network = Network(exp, *oper,options=options) 155 | path, info = network.contract_path() 156 | network.autotune(iterations=20) 157 | path_t = round((time.time()-t0),3) 158 | 159 | #3-2. all operand to amplitude 160 | t0 = time.time() 161 | oper_data = oper_train 162 | amp_list = operand_to_amp(oper_data, network) 163 | amp_train = cp.array(amp_list[:len(oper_train)]) 164 | amp_data_train = comm_mpi.gather(amp_train, root=0) 165 | tnsm_kernel_t = round((time.time()-t0),3) 166 | 167 | if rank == root: 168 | print(f' {n_dim}, {len(data_train)}, {len(list_train)}, {len(list_train_partition)}, {len(amp_data_train)}, {exp_t}, {oper_t}, {path_t}, {tnsm_kernel_t}, {round((exp_t+oper_t+path_t+tnsm_kernel_t),3)}') 169 | 170 | dd = np.zeros((2,2)) 171 | run_tnsm(dd, 2) 172 | # A. just follow the previous benchmark data 173 | for mq in range(2,34): 174 | dd = np.zeros((2,mq)) 175 | run_tnsm(dd,mq) 176 | for mq in range(42,200,10): 177 | dd = np.zeros((2,mq)) 178 | run_tnsm(dd,mq) 179 | for mq in [200,300,400,500,600,784]: 180 | dd = np.zeros((2,mq)) 181 | run_tnsm(dd,mq) 182 | 183 | # B. 
single gpu with single data pair for m qubits to test the limitation of qubits 184 | for mq in [2, 4, 8, 16, 32, 64, 128, 256, 512, 784, 1024, 2048, 2352, 3072, 4096]: 185 | dd = np.zeros((2,mq)) 186 | run_tnsm(dd, mq) -------------------------------------------------------------------------------- /benchmark/banchmark_qsvm_tnsm-opt_einsum.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cupy as cp 3 | import pandas as pd 4 | import time 5 | import matplotlib.pyplot as plt 6 | from itertools import combinations,product 7 | from multiprocessing import Pool 8 | from sklearn.svm import SVC 9 | from sklearn.model_selection import train_test_split 10 | from sklearn.decomposition import PCA 11 | from sklearn.preprocessing import StandardScaler, MinMaxScaler 12 | from sklearn.datasets import load_digits, fetch_openml 13 | from sklearn.model_selection import GridSearchCV 14 | from qiskit.circuit.library import PauliFeatureMap, ZFeatureMap, ZZFeatureMap 15 | from qiskit_machine_learning.kernels import QuantumKernel 16 | from qiskit import QuantumCircuit, transpile, Aer 17 | from qiskit.circuit import ParameterVector 18 | from cuquantum import * 19 | import time 20 | import cupy as cp 21 | from cupy.cuda.runtime import getDeviceCount 22 | from mpi4py import MPI 23 | import opt_einsum as oe 24 | 25 | root = 0 26 | comm = MPI.COMM_WORLD 27 | rank, size = comm.Get_rank(), comm.Get_size() 28 | device_id = 6 29 | cp.cuda.Device(device_id).use() 30 | print(device_id) 31 | 32 | mnist = fetch_openml('mnist_784') 33 | X = mnist.data.to_numpy() 34 | Y = mnist.target.to_numpy().astype(int) 35 | class_list = [7,9] 36 | c01 = np.where((Y == class_list[0])|(Y == class_list[1])) 37 | X,Y = X[c01],Y[c01] 38 | data_train, label_train = X[:1000],Y[:1000] 39 | X_train, X_val, Y_train, Y_val = train_test_split(data_train, label_train, test_size = 0.2, random_state=255) 40 | 41 | def data_prepare(n_dim, sample_train, sample_test, nb1, nb2): 42 | std_scale = StandardScaler().fit(sample_train) 43 | data = std_scale.transform(sample_train) 44 | sample_train = std_scale.transform(sample_train) 45 | sample_test = std_scale.transform(sample_test) 46 | pca = PCA(n_components=n_dim, svd_solver="full").fit(data) 47 | sample_train = pca.transform(sample_train) 48 | sample_test = pca.transform(sample_test) 49 | samples = np.append(sample_train, sample_test, axis=0) 50 | minmax_scale = MinMaxScaler((-1, 1)).fit(samples) 51 | sample_train = minmax_scale.transform(sample_train)[:nb1] 52 | sample_test = minmax_scale.transform(sample_test)[:nb2] 53 | return sample_train, sample_test 54 | def make_bsp(n_dim): 55 | param = ParameterVector("p",n_dim) 56 | bsp_qc = QuantumCircuit(n_dim) 57 | bsp_qc.h(list(range(n_dim))) 58 | i = 0 59 | for q in range(n_dim): 60 | bsp_qc.rz(param.params[q],[q]) 61 | bsp_qc.ry(param.params[q],[q]) 62 | for q in range(n_dim-1): 63 | bsp_qc.cx(0+i, 1+i) 64 | i+=1 65 | for q in range(n_dim): 66 | bsp_qc.rz(param.params[q],[q]) 67 | return bsp_qc 68 | def new_op(n_dim,oper,y_t,x_t): 69 | n_zg, n_zy_g = [], [] 70 | for d1 in y_t: 71 | z_g = np.array([[np.exp(-1j*0.5*d1),0],[0,np.exp(1j*0.5*d1)]]) 72 | n_zg.append(z_g) 73 | y_g = np.array([[np.cos(d1/2),-np.sin(d1/2)],[np.sin(d1/2),np.cos(d1/2)]]) 74 | n_zy_g.append(z_g) 75 | n_zy_g.append(y_g) 76 | oper[n_dim*2:n_dim*4] = cp.array(n_zy_g) 77 | oper[n_dim*5-1:n_dim*6-1] = cp.array(n_zg) 78 | n_zgd, n_zy_gd = [], [] 79 | for d2 in x_t[::-1]: 80 | z_gd = 
np.array([[np.exp(1j*0.5*d2),0],[0,np.exp(-1j*0.5*d2)]]) 81 | n_zgd.append(z_gd) 82 | y_gd = np.array([[np.cos(d2/2),np.sin(d2/2)],[-np.sin(d2/2),np.cos(d2/2)]]) 83 | n_zy_gd.append(y_gd) 84 | n_zy_gd.append(z_gd) 85 | oper[n_dim*6-1:n_dim*7-1] = cp.array(n_zgd) 86 | oper[n_dim*8-2:n_dim*10-2] = cp.array(n_zy_gd) 87 | return oper 88 | 89 | def kernel_matrix_tnsm(y_t, x_t, opers, indices_list, exp, opt_path, mode=None): 90 | kernel_matrix = np.zeros((len(y_t),len(x_t))) 91 | i = -1 92 | for i1, i2 in indices_list: 93 | i += 1 94 | result = oe.contract(exp, *opers[i], optimize=opt_path) 95 | amp_tn = abs(result) ** 2 96 | kernel_matrix[i1-1][i2-1] = np.round(amp_tn,8) 97 | if mode == 'train': 98 | kernel_matrix = kernel_matrix + kernel_matrix.T+np.diag(np.ones((len(x_t)))) 99 | return kernel_matrix 100 | 101 | def run_tnsm(n_dim, nb1, nb2): 102 | data_train, data_val = data_prepare(n_dim, X_train, X_val, nb1, nb2) 103 | bsp_qc = make_bsp(n_dim) 104 | bsp_kernel_tnsm = QuantumKernel(feature_map=bsp_qc) 105 | indices_list_t = list(combinations(range(1, len(data_train) + 1), 2)) 106 | 107 | t0 = time.time() 108 | circuit = bsp_kernel_tnsm.construct_circuit(data_train[0], data_train[0],False) 109 | converter = CircuitToEinsum(circuit, dtype='complex128', backend='cupy') 110 | a = str(0).zfill(n_dim) 111 | exp, oper = converter.amplitude(a) 112 | exp_t = round((time.time()-t0),3) 113 | 114 | t0 = time.time() 115 | oper_train = [] 116 | for i1, i2 in indices_list_t: 117 | n_op = new_op(n_dim,oper,data_train[i1-1],data_train[i2-1]) 118 | oper_train.append(n_op) 119 | oper_t = round((time.time()-t0),3) 120 | 121 | t0 = time.time() 122 | oper = oper_train[0] 123 | path, path_info = oe.contract_path(exp, *oper) 124 | path_t = round((time.time()-t0),3) 125 | t0 = time.time() 126 | tnsm_kernel_matrix_train = kernel_matrix_tnsm(data_train, data_train, oper_train, indices_list_t, exp, path, mode='train') 127 | tnsm_kernel_t = round((time.time()-t0),3) 128 | print(n_dim,exp_t,oper_t,path_t,tnsm_kernel_t,len(oper_train)) 129 | 130 | run_tnsm(2,2,1) 131 | # for d in [2,5,10,50,100,500,1000]: 132 | # run_tnsm(128,d,1) 133 | # run_tnsm(300,2,1) 134 | for q in range(2,34): 135 | run_tnsm(q,2,1) 136 | for q in range(42,200,10): 137 | run_tnsm(q,2,1) 138 | for q in [200,300,400,500,600,784]: 139 | run_tnsm(q,2,1) -------------------------------------------------------------------------------- /benchmark/banchmark_qsvm_tnsm.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import cupy as cp 3 | import pandas as pd 4 | import time 5 | import matplotlib.pyplot as plt 6 | from itertools import combinations,product 7 | from multiprocessing import Pool 8 | from sklearn.svm import SVC 9 | from sklearn.model_selection import train_test_split 10 | from sklearn.decomposition import PCA 11 | from sklearn.preprocessing import StandardScaler, MinMaxScaler 12 | from sklearn.datasets import load_digits, fetch_openml 13 | from sklearn.model_selection import GridSearchCV 14 | from qiskit.circuit.library import PauliFeatureMap, ZFeatureMap, ZZFeatureMap 15 | from qiskit import QuantumCircuit, transpile, Aer 16 | from qiskit.circuit import ParameterVector 17 | from cuquantum import * 18 | import time 19 | import cupy as cp 20 | from cupy.cuda.runtime import getDeviceCount 21 | from mpi4py import MPI 22 | 23 | root = 0 24 | comm = MPI.COMM_WORLD 25 | rank, size = comm.Get_rank(), comm.Get_size() 26 | device_id = 6 27 | cp.cuda.Device(device_id).use() 28 | print(device_id) 
29 | 30 | mnist = fetch_openml('mnist_784') 31 | X = mnist.data.to_numpy() 32 | Y = mnist.target.to_numpy().astype(int) 33 | class_list = [7,9] 34 | c01 = np.where((Y == class_list[0])|(Y == class_list[1])) 35 | X,Y = X[c01],Y[c01] 36 | data_train, label_train = X[:1000],Y[:1000] 37 | X_train, X_val, Y_train, Y_val = train_test_split(data_train, label_train, test_size = 0.2, random_state=255) 38 | 39 | def data_prepare(n_dim, sample_train, sample_test, nb1, nb2): 40 | std_scale = StandardScaler().fit(sample_train) 41 | data = std_scale.transform(sample_train) 42 | sample_train = std_scale.transform(sample_train) 43 | sample_test = std_scale.transform(sample_test) 44 | pca = PCA(n_components=n_dim, svd_solver="full").fit(data) 45 | sample_train = pca.transform(sample_train) 46 | sample_test = pca.transform(sample_test) 47 | samples = np.append(sample_train, sample_test, axis=0) 48 | minmax_scale = MinMaxScaler((-1, 1)).fit(samples) 49 | sample_train = minmax_scale.transform(sample_train)[:nb1] 50 | sample_test = minmax_scale.transform(sample_test)[:nb2] 51 | return sample_train, sample_test 52 | def make_bsp(n_dim): 53 | param = ParameterVector("p",n_dim) 54 | bsp_qc = QuantumCircuit(n_dim) 55 | bsp_qc.h(list(range(n_dim))) 56 | i = 0 57 | for q in range(n_dim): 58 | bsp_qc.rz(param.params[q],[q]) 59 | bsp_qc.ry(param.params[q],[q]) 60 | for q in range(n_dim-1): 61 | bsp_qc.cx(0+i, 1+i) 62 | i+=1 63 | for q in range(n_dim): 64 | bsp_qc.rz(param.params[q],[q]) 65 | return bsp_qc 66 | def build_qsvm_qc(bsp_qc,n_dim,y_t,x_t): 67 | qc_1 = bsp_qc.assign_parameters(y_t).to_gate() 68 | qc_2 = bsp_qc.assign_parameters(x_t).inverse().to_gate() 69 | kernel_qc = QuantumCircuit(n_dim) 70 | kernel_qc.append(qc_1,list(range(n_dim))) 71 | kernel_qc.append(qc_2,list(range(n_dim))) 72 | return kernel_qc 73 | def renew_operand(n_dim,oper_tmp,y_t,x_t): 74 | oper = oper_tmp.copy() 75 | n_zg, n_zy_g = [], [] 76 | for d1 in y_t: 77 | z_g = np.array([[np.exp(-1j*0.5*d1),0],[0,np.exp(1j*0.5*d1)]]) 78 | n_zg.append(z_g) 79 | y_g = np.array([[np.cos(d1/2),-np.sin(d1/2)],[np.sin(d1/2),np.cos(d1/2)]]) 80 | n_zy_g.append(z_g) 81 | n_zy_g.append(y_g) 82 | oper[n_dim*2:n_dim*4] = cp.array(n_zy_g) 83 | oper[n_dim*5-1:n_dim*6-1] = cp.array(n_zg) 84 | n_zgd, n_zy_gd = [], [] 85 | for d2 in x_t[::-1]: 86 | z_gd = np.array([[np.exp(1j*0.5*d2),0],[0,np.exp(-1j*0.5*d2)]]) 87 | n_zgd.append(z_gd) 88 | y_gd = np.array([[np.cos(d2/2),np.sin(d2/2)],[-np.sin(d2/2),np.cos(d2/2)]]) 89 | n_zy_gd.append(y_gd) 90 | n_zy_gd.append(z_gd) 91 | oper[n_dim*6-1:n_dim*7-1] = cp.array(n_zgd) 92 | oper[n_dim*8-2:n_dim*10-2] = cp.array(n_zy_gd) 93 | return oper 94 | def data_to_operand(n_dim,operand_tmp,data1,data2,indices_list): 95 | operand_list = [] 96 | for i1, i2 in indices_list: 97 | n_op = renew_operand(n_dim,operand_tmp,data1[i1-1],data2[i2-1]) 98 | operand_list.append(n_op) 99 | return operand_list 100 | 101 | def kernel_matrix_tnsm(y_t, x_t, opers, indices_list, network, mode=None): 102 | kernel_matrix = np.zeros((len(y_t),len(x_t))) 103 | i = -1 104 | with network as tn: 105 | for i1, i2 in indices_list: 106 | i += 1 107 | tn.reset_operands(*opers[i]) 108 | amp_tn = abs(tn.contract()) ** 2 109 | kernel_matrix[i1-1][i2-1] = np.round(amp_tn,8) 110 | if mode == 'train': 111 | kernel_matrix = kernel_matrix + kernel_matrix.T+np.diag(np.ones((len(x_t)))) 112 | return kernel_matrix 113 | 114 | def run_tnsm(n_dim, nb1, nb2): 115 | data_train, data_val = data_prepare(n_dim, X_train, X_val, nb1, nb2) 116 | bsp_qc = make_bsp(n_dim) 117 | 
indices_list_t = list(combinations(range(1, len(data_train) + 1), 2)) 118 | 119 | t0 = time.time() 120 | circuit = build_qsvm_qc(bsp_qc,n_dim, data_train[0], data_train[0]) 121 | converter = CircuitToEinsum(circuit, dtype='complex128', backend='cupy') 122 | a = str(0).zfill(n_dim) 123 | exp, oper = converter.amplitude(a) 124 | exp_t = round((time.time()-t0),3) 125 | 126 | t0 = time.time() 127 | oper_train = data_to_operand(n_dim,oper,data_train,data_train,indices_list_t) 128 | oper_t = round((time.time()-t0),3) 129 | 130 | t0 = time.time() 131 | oper = oper_train[0] 132 | options = NetworkOptions(blocking="auto",device_id=device_id) 133 | network = Network(exp, *oper,options=options) 134 | path, info = network.contract_path() 135 | network.autotune(iterations=20) 136 | path_t = round((time.time()-t0),3) 137 | 138 | t0 = time.time() 139 | tnsm_kernel_matrix_train = kernel_matrix_tnsm(data_train, data_train, oper_train, indices_list_t, network, mode='train') 140 | tnsm_kernel_t = round((time.time()-t0),3) 141 | print(n_dim,exp_t,oper_t,path_t,tnsm_kernel_t,len(oper_train)) 142 | 143 | run_tnsm(2,2,1) 144 | for d in [2,5,10,50,100,500,1000]: 145 | run_tnsm(128,d,1) 146 | # run_tnsm(300,2,1) 147 | # for q in range(2,34): 148 | # run_tnsm(q,2,1) 149 | # for q in range(42,200,10): 150 | # run_tnsm(q,2,1) 151 | # for q in [200,300,400,500,600,784]: 152 | # run_tnsm(q,2,1) -------------------------------------------------------------------------------- /benchmark/figure/figure1_sgpu.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim-Li/cuTN-QSVM/260ab2e959dc8cb3c64bb2fd4599f2cc7b68d32c/benchmark/figure/figure1_sgpu.png -------------------------------------------------------------------------------- /benchmark/figure/figure2_mgpu_v100.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim-Li/cuTN-QSVM/260ab2e959dc8cb3c64bb2fd4599f2cc7b68d32c/benchmark/figure/figure2_mgpu_v100.png -------------------------------------------------------------------------------- /benchmark/figure/figure3_mgpu_h100.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim-Li/cuTN-QSVM/260ab2e959dc8cb3c64bb2fd4599f2cc7b68d32c/benchmark/figure/figure3_mgpu_h100.png -------------------------------------------------------------------------------- /benchmark/figure/figure_sgpu.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim-Li/cuTN-QSVM/260ab2e959dc8cb3c64bb2fd4599f2cc7b68d32c/benchmark/figure/figure_sgpu.png -------------------------------------------------------------------------------- /benchmark/mpi_demo.sh: -------------------------------------------------------------------------------- 1 | mpirun -np 8 python banchmark_qsvm_tnsm-mpi.py 2 | mpirun -np 4 python banchmark_qsvm_tnsm-mpi.py 3 | mpirun -np 2 python banchmark_qsvm_tnsm-mpi.py 4 | mpirun -np 1 python banchmark_qsvm_tnsm-mpi.py -------------------------------------------------------------------------------- /env_check.py: -------------------------------------------------------------------------------- 1 | import qiskit.circuit.random 2 | from cuquantum import contract, CircuitToEinsum 3 | 4 | qc = qiskit.circuit.random.random_circuit(num_qubits=8, depth=7) 5 | converter = CircuitToEinsum(qc, backend='cupy') 6 | print(converter) 
-------------------------------------------------------------------------------- /environment.yml: -------------------------------------------------------------------------------- 1 | name: cutn-qsvm 2 | channels: 3 | - defaults 4 | dependencies: 5 | - _libgcc_mutex=0.1=main 6 | - _openmp_mutex=5.1=1_gnu 7 | - bzip2=1.0.8=h5eee18b_6 8 | - ca-certificates=2024.9.24=h06a4308_0 9 | - ld_impl_linux-64=2.40=h12ee557_0 10 | - libffi=3.4.4=h6a678d5_1 11 | - libgcc-ng=11.2.0=h1234567_1 12 | - libgomp=11.2.0=h1234567_1 13 | - libstdcxx-ng=11.2.0=h1234567_1 14 | - libuuid=1.41.5=h5eee18b_0 15 | - ncurses=6.4=h6a678d5_0 16 | - openssl=3.0.15=h5eee18b_0 17 | - pip=24.2=py310h06a4308_0 18 | - python=3.10.15=he870216_1 19 | - readline=8.2=h5eee18b_0 20 | - setuptools=75.1.0=py310h06a4308_0 21 | - sqlite=3.45.3=h5eee18b_0 22 | - tk=8.6.14=h39e8969_0 23 | - wheel=0.44.0=py310h06a4308_0 24 | - xz=5.4.6=h5eee18b_1 25 | - zlib=1.2.13=h5eee18b_1 26 | - pip: 27 | - contourpy==1.3.0 28 | - cupy-cuda12x==13.3.0 29 | - cuquantum==24.8.0.2 30 | - cuquantum-cu12==24.8.0 31 | - cuquantum-python==24.8.0.2 32 | - cuquantum-python-cu12==24.8.0 33 | - custatevec-cu12==1.6.0.post1 34 | - cutensor-cu12==2.0.2 35 | - cutensornet-cu12==2.5.0 36 | - cycler==0.12.1 37 | - dill==0.3.9 38 | - fastrlock==0.8.2 39 | - fonttools==4.54.1 40 | - joblib==1.4.2 41 | - kiwisolver==1.4.7 42 | - matplotlib==3.9.2 43 | - mpmath==1.3.0 44 | - numpy==2.1.3 45 | - nvidia-cublas-cu12==12.6.3.3 46 | - nvidia-cuda-runtime-cu12==12.6.77 47 | - nvidia-cusolver-cu12==11.7.1.2 48 | - nvidia-cusparse-cu12==12.5.4.2 49 | - nvidia-nvjitlink-cu12==12.6.77 50 | - opt-einsum==3.4.0 51 | - packaging==24.2 52 | - pandas==2.2.3 53 | - pbr==6.1.0 54 | - pillow==11.0.0 55 | - psutil==6.1.0 56 | - pydot==3.0.2 57 | - pylatexenc==2.10 58 | - pyparsing==3.2.0 59 | - python-dateutil==2.9.0.post0 60 | - pytz==2024.2 61 | - qiskit==1.2.4 62 | - qiskit-aer-gpu==0.15.1 63 | - rustworkx==0.15.1 64 | - scikit-learn==1.5.2 65 | - scipy==1.14.1 66 | - seaborn==0.13.2 67 | - six==1.16.0 68 | - stevedore==5.3.0 69 | - symengine==0.13.0 70 | - sympy==1.13.3 71 | - threadpoolctl==3.5.0 72 | - typing-extensions==4.12.2 73 | - tzdata==2024.2 74 | prefix: /home/txmai/anaconda3/envs/cutn-qsvm 75 | -------------------------------------------------------------------------------- /figures/cutensornet_module.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim-Li/cuTN-QSVM/260ab2e959dc8cb3c64bb2fd4599f2cc7b68d32c/figures/cutensornet_module.png -------------------------------------------------------------------------------- /figures/multi_GPU_linearity.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim-Li/cuTN-QSVM/260ab2e959dc8cb3c64bb2fd4599f2cc7b68d32c/figures/multi_GPU_linearity.png -------------------------------------------------------------------------------- /figures/multi_gpu_resource.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim-Li/cuTN-QSVM/260ab2e959dc8cb3c64bb2fd4599f2cc7b68d32c/figures/multi_gpu_resource.png -------------------------------------------------------------------------------- /figures/process_flow_comparison.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim-Li/cuTN-QSVM/260ab2e959dc8cb3c64bb2fd4599f2cc7b68d32c/figures/process_flow_comparison.png 
-------------------------------------------------------------------------------- /figures/speedup_cutensornet.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Tim-Li/cuTN-QSVM/260ab2e959dc8cb3c64bb2fd4599f2cc7b68d32c/figures/speedup_cutensornet.png -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | scikit-learn 2 | qiskit[visualization]==1.2.4 3 | cuquantum==24.08.0.2 4 | cuquantum-python==24.08.0.2 -------------------------------------------------------------------------------- /requirements_benchmark.txt: -------------------------------------------------------------------------------- 1 | scikit-learn 2 | opt-einsum 3 | qiskit[visualization]==1.2.4 4 | qiskit-aer-gpu==0.15.1 5 | cuquantum==24.08.0.2 6 | cuquantum-python==24.08.0.2 --------------------------------------------------------------------------------
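
The benchmark scripts above time the construction of the QSVM kernel matrices but do not show the final classification step. As a minimal sketch (not a file from this repository), the snippet below illustrates how a precomputed kernel of the kind returned by kernel_matrix_tnsm or get_kernel_matrix is consumed by scikit-learn's SVC; the random features and the names Xtr, Xva, K_train, K_val are hypothetical stand-ins for the PCA-reduced MNIST data and for the fidelity kernels computed on the GPU.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
Xtr = rng.random((8, 3))         # stand-in for PCA-reduced training features
Xva = rng.random((4, 3))         # stand-in for PCA-reduced validation features
K_train = Xtr @ Xtr.T            # placeholder for the (n_train, n_train) train kernel
K_val = Xva @ Xtr.T              # placeholder for the (n_val, n_train) validation kernel
y_train = np.array([0, 1, 0, 1, 0, 1, 0, 1])
y_val = np.array([0, 1, 0, 1])

svc = SVC(kernel="precomputed")  # the kernel matrix is supplied directly; no feature map is evaluated here
svc.fit(K_train, y_train)
print(svc.score(K_val, y_val))   # rows of K_val are validation samples, columns are training samples

The same pattern applies to the quantum kernels: fit on the symmetric train-vs-train matrix (unit diagonal, as produced in mode='train') and score with the validation-vs-train matrix.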