├── LICENSE ├── README.md ├── requirements.txt ├── train.py └── tvmc ├── hamiltonians ├── hamiltonian.py └── rydberg.py ├── models ├── BaseModel.py ├── LPTF.py ├── ModelBuilder.py ├── PTF.py ├── RNN.py └── training.py └── util.py /LICENSE: -------------------------------------------------------------------------------- 1 | Apache License 2 | Version 2.0, January 2004 3 | http://www.apache.org/licenses/ 4 | 5 | TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 6 | 7 | 1. Definitions. 8 | 9 | "License" shall mean the terms and conditions for use, reproduction, 10 | and distribution as defined by Sections 1 through 9 of this document. 11 | 12 | "Licensor" shall mean the copyright owner or entity authorized by 13 | the copyright owner that is granting the License. 14 | 15 | "Legal Entity" shall mean the union of the acting entity and all 16 | other entities that control, are controlled by, or are under common 17 | control with that entity. For the purposes of this definition, 18 | "control" means (i) the power, direct or indirect, to cause the 19 | direction or management of such entity, whether by contract or 20 | otherwise, or (ii) ownership of fifty percent (50%) or more of the 21 | outstanding shares, or (iii) beneficial ownership of such entity. 22 | 23 | "You" (or "Your") shall mean an individual or Legal Entity 24 | exercising permissions granted by this License. 25 | 26 | "Source" form shall mean the preferred form for making modifications, 27 | including but not limited to software source code, documentation 28 | source, and configuration files. 29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 
61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 
122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 
179 | 
180 |    To apply the Apache License to your work, attach the following
181 |    boilerplate notice, with the fields enclosed by brackets "[]"
182 |    replaced with your own identifying information. (Don't include
183 |    the brackets!)  The text should be enclosed in the appropriate
184 |    comment syntax for the file format. We also recommend that a
185 |    file or class name and description of purpose be included on the
186 |    same "printed page" as the copyright notice for easier
187 |    identification within third-party archives.
188 | 
189 |    Copyright [yyyy] [name of copyright owner]
190 | 
191 |    Licensed under the Apache License, Version 2.0 (the "License");
192 |    you may not use this file except in compliance with the License.
193 |    You may obtain a copy of the License at
194 | 
195 |        http://www.apache.org/licenses/LICENSE-2.0
196 | 
197 |    Unless required by applicable law or agreed to in writing, software
198 |    distributed under the License is distributed on an "AS IS" BASIS,
199 |    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
200 |    See the License for the specific language governing permissions and
201 |    limitations under the License.
202 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Code accompanying the paper "Variational Monte Carlo with Large Patched Transformers"
2 | 
3 | ## Requirements
4 | A suitable [conda](https://conda.io/) environment named `qsr` can be created
5 | and activated with:
6 | 
7 | ```
8 | conda create --name qsr
9 | conda install -n qsr pip
10 | conda activate qsr
11 | pip install -r requirements.txt
12 | ```
13 | 
14 | ## Model builder
15 | 
16 | ### TRAINING
17 | 
18 | The train.py script is used to train new models from scratch. For example, the following command trains
19 | a model for an $8\times 8$ Rydberg lattice with $V=7$ and $\delta=\Omega=1$, using a $2\times 2$ patched transformer:
20 | ```
21 | python train.py --train L=64 NLOOPS=16 K=1024 sub_directory=2x2 --ptf patch=2x2 --rydberg V=7 delta=1 Omega=1
22 | ```
23 | Training parameters are shown when running:
24 | 
25 | ```
26 | python train.py --help --train
27 | ```
28 | 
29 | These are all possible training arguments:
30 | ```
31 | 
32 | Training Arguments:
33 | 
34 |     L             (int)   -- Total lattice size (an 8x8 lattice is L=64).
35 | 
36 |     Q             (int)   -- Number of minibatches per batch.
37 | 
38 |     K             (int)   -- Size of each minibatch.
39 | 
40 |     B             (int)   -- Total batch size (should equal Q*K).
41 | 
42 |     NLOOPS        (int)   -- Number of loops within the off_diag_labels function. Higher values save RAM and
43 |                              generally make the code run faster (up to 2x). Note that this can only be set
44 |                              as high as the effective sequence length (L divided by the patch size).
45 | 
46 |     steps         (int)   -- Number of training steps.
47 | 
48 |     dir           (str)   -- Output directory; set to <NONE> for no output.
49 | 
50 |     lr            (float) -- Learning rate.
51 | 
52 |     seed          (int)   -- Random seed for the run.
53 | 
54 |     sgrad         (bool)  -- Whether or not to sample with gradients; otherwise gradients are created in an
55 |                              extra network run (uses less RAM but is slightly slower).
56 | 
57 |     true_grad     (bool)  -- Set to False to approximate the gradients (more efficient but less exact).
58 | 
59 |     sub_directory (str)   -- String to add to the end of the output directory (inside a subfolder).
60 | ```
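The batch and loop settings above are coupled: B should equal Q*K, and NLOOPS is bounded by the effective sequence length. A minimal sanity-check sketch in Python for the $8\times 8$ example command above; the variable names here are illustrative only and are not part of train.py:

```
L = 64                                # 8x8 lattice -> L=64
patch = (2, 2)                        # --ptf patch=2x2
Q, K = 1, 1024                        # Q is not set in the example command; assume 1 for illustration
B = Q * K                             # total batch size should equal Q*K
seq_len = L // (patch[0] * patch[1])  # effective sequence length: 64 // 4 = 16
NLOOPS = 16                           # can be set at most as high as seq_len
assert B == Q * K and 1 <= NLOOPS <= seq_len
```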
61 | 
62 | ### RNN
63 | 
64 | All optional RNN parameters can be viewed by running
65 | 
66 | ```
67 | python train.py --help --rnn
68 | ```
69 | 
70 | These are the RNN parameters:
71 | 
72 | ```
73 | 
74 | RNN Optional arguments:
75 | 
76 |     L       (int) -- The total number of atoms in your lattice.
77 | 
78 |     Nh      (int) -- RNN hidden size.
79 | 
80 |     patch   (str) -- Number of atoms input/predicted at once (patch size).
81 |                      The input sequence will have an effective length of L/prod(patch).
82 |                      Example values: 2x2, 2x3, 2, 4
83 | 
84 |     rnntype (str) -- Which type of RNN cell to use. Only ELMAN and GRU are valid options at the moment.
85 | 
86 | ```
87 | 
88 | ### Patched Transformer (PTF)
89 | 
90 | All optional PTF parameters can be viewed by running
91 | 
92 | ```
93 | python train.py --help --ptf
94 | ```
95 | 
96 | These are the PTF parameters:
97 | ```
98 | 
99 | PTF Optional arguments:
100 | 
101 |     L          (int)   -- The total number of atoms in your lattice.
102 | 
103 |     Nh         (int)   -- Transformer token size. Input patches are projected to match the token size.
104 | 
105 |     patch      (str)   -- Number of atoms input/predicted at once (patch size).
106 |                           The input sequence will have an effective length of L/prod(patch).
107 |                           Example values: 2x2, 2x3, 2, 4
108 | 
109 |     dropout    (float) -- The amount of dropout to use in the transformer layers.
110 | 
111 |     num_layers (int)   -- The number of transformer layers to use.
112 | 
113 |     nhead      (int)   -- The number of heads to use in multi-headed self-attention. This should divide Nh.
114 | 
115 |     repeat_pre (bool)  -- Repeat the precondition (input) instead of projecting it to match the token size.
116 | 
117 | ```
118 | 
119 | ### Large-Patched Transformer (LPTF)
120 | 
121 | All optional LPTF parameters can be viewed by running
122 | 
123 | ```
124 | python train.py --help --lptf
125 | ```
126 | LPTF parameters must be followed by the sub-model flag (e.g. --rnn) and its parameters, where the sub-model's
127 | L parameter needs to match the LPTF patch size (e.g. --lptf patch=2x3 --rnn L=6).
128 | 
129 | These are the LPTF parameters:
130 | ```
131 | 
132 | LPTF Optional arguments:
133 | 
134 |     L          (int)     -- The total number of atoms in your lattice.
135 | 
136 |     Nh         (int)     -- Transformer token size. Input patches are projected to match the token size.
137 |                             Note: when using an RNN subsampler, this Nh MUST match the RNN's Nh.
138 | 
139 |     patch      (str)     -- Number of atoms input/predicted at once (patch size).
140 |                             The input sequence will have an effective length of L divided by the patch size.
141 | 
142 |     dropout    (float)   -- The amount of dropout to use in the transformer layers.
143 | 
144 |     num_layers (int)     -- The number of transformer layers to use.
145 | 
146 |     nhead      (int)     -- The number of heads to use in multi-headed self-attention. This should divide Nh.
147 | 
148 |     subsampler (Sampler) -- The inner model to use for probability factorization. This is set implicitly
149 |                             by including --rnn or --ptf arguments.
150 | 
151 | ```
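For reference, this is the LPTF runtime example printed by `python train.py --help --lptf`: a $12\times 12$ lattice (L=144) split into $3\times 3$ patches by the transformer, with an RNN subsampler handling the 9 atoms inside each patch:

```
python train.py --rydberg --train L=144 --lptf patch=3x3 --rnn L=9 patch=3 Nh=128
```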
152 | 
153 | ## Rydberg Hamiltonian
154 | 
155 | The following parameters can be chosen for the Rydberg Hamiltonian (default values shown):
156 | 
157 | ```
158 | Lx      4
159 | Ly      4
160 | V       7.0
161 | Omega   1.0
162 | delta   1.0
163 | ```
164 | 
--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | filelock==3.12.0
2 | Jinja2==3.1.2
3 | MarkupSafe==2.1.2
4 | mpmath==1.3.0
5 | networkx==3.1
6 | numpy==1.24.3
7 | sympy==1.12
8 | torch==2.0.1
9 | typing_extensions==4.6.0
--------------------------------------------------------------------------------
/train.py:
--------------------------------------------------------------------------------
1 | from tvmc.models.ModelBuilder import *
2 | 
3 | import os
4 | import sys
5 | 
6 | def helper(args):
7 |     """Print help text for the model/options requested on the command line."""
8 |     help(build_model)
9 | 
10 |     example = "Runtime Example:\n>>>python train.py --rydberg --train L=144"
11 |     while True:
12 |         if "--lptf" in args:
13 |             print(LPTF.INFO)
14 |             print(example+" --lptf patch=3x3 --rnn L=9 patch=3 Nh=128")
15 |             break
16 |         if "--rnn" in args:
17 |             print(PRNN.INFO)
18 |             print(example+" NLOOPS=36 --rnn patch=4")
19 |             break
20 |         if "--ptf" in args:
21 |             print(PTF.INFO)
22 |             print(example+" NLOOPS=24 --ptf patch=2x3")
23 |             break
24 |         if "--train" in args:
25 |             print(TrainOpt.__doc__)
26 |             print(example+" NLOOPS=36 sgrad=False steps=4000 --ptf patch=2x2")
27 |             break
28 |         # No recognised flag yet: ask the user (lower-case the answer, not the prompt)
29 |         args = ["--"+input("What model do you need help with?\nOptions are rnn, lptf, ptf, and train:\n").lower()]
30 | 
31 | 
32 | if "--help" in sys.argv:
33 |     print()
34 |     helper(sys.argv)
35 | else:
36 |     print(sys.argv[1:])
37 | 
38 |     model,full_opt,opt_dict = build_model(sys.argv[1:])
39 |     train_opt = opt_dict["TRAIN"]
40 | 
41 |     # Initialize the Adam optimizer
42 |     beta1 = 0.9; beta2 = 0.999
43 |     optimizer = torch.optim.Adam(
44 |         model.parameters(),
45 |         lr=train_opt.lr,
46 |         betas=(beta1,beta2)
47 |     )
48 | 
49 |     print(full_opt)
50 |     mydir = setup_dir(opt_dict)
51 |     orig_stdout = sys.stdout
52 | 
53 |     full_opt.save(os.path.join(mydir,"settings.json"))
54 | 
55 |     # Redirect stdout to an output file for the duration of training
56 |     f = open(os.path.join(mydir,"output.txt"), 'w')
57 |     sys.stdout = f
58 |     try:
59 |         reg_train(opt_dict,(model,optimizer),printf=True,mydir=mydir)
60 |     except Exception as e:
61 |         print(e)
62 |         sys.stdout = orig_stdout
63 |         f.close()
64 |         raise   # re-raise so the failure is visible after stdout is restored
65 |     sys.stdout = orig_stdout
66 |     f.close()
--------------------------------------------------------------------------------
/tvmc/hamiltonians/hamiltonian.py:
--------------------------------------------------------------------------------
1 | import numpy as np
2 | import torch
3 | from torch import nn
4 | ngpu=1
5 | device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu")
6 | 
7 | class Hamiltonian():
8 |     def __init__(self,L,offDiag,device=device):
9 |         self.offDiag = offDiag      # Off-diagonal interaction
10 |         self.L = L                  # Number of spins
11 |         self.device = device
12 |         self.Vij = nn.Linear(self.L,self.L).to(device)   # diagonal interaction matrix (filled by buildlattice)
13 |         self.buildlattice()
14 | 
15 |     def buildlattice(self):
16 |         """Creates the matrix representation of the on-diagonal part of the hamiltonian
17 |         - This should fill Vij with values"""
18 |         raise NotImplementedError
19 | 
20 |     # def localenergy(self,samples,logp,logppj):
21 |     #     """
22 |     #     Takes in s, ln[p(s)] and ln[p(s')] (for all s'), then computes Hloc(s) for N samples s.
23 | # 24 | # Inputs: 25 | # samples - [B,L,1] matrix of zeros and ones for ground/excited states 26 | # logp - size B vector of logscale probabilities ln[p(s)] 27 | # logppj - [B,L] matrix of logscale probabilities ln[p(s')] where s'[i][j] had one state flipped at position j 28 | # relative to s[i] 29 | # Returns: 30 | # size B vector of energies Hloc(s) 31 | # 32 | # """ 33 | # # Going to calculate Eloc for each sample in a separate spot 34 | # # so eloc will have shape [B] 35 | # # recall samples has shape [B,L,1] 36 | # B=samples.shape[0] 37 | # eloc = torch.zeros(B,device=self.device) 38 | # # Chemical potential 39 | # with torch.no_grad(): 40 | # tmp=self.Vij(samples.squeeze(2)) 41 | # eloc += torch.sum(tmp*samples.squeeze(2),axis=1) 42 | # # Off-diagonal part 43 | # #logppj is shape [B,L] 44 | # #logppj[:,j] has one state flipped at position j 45 | # for j in range(self.L): 46 | # #make sure torch.exp is a thing 47 | # eloc += self.offDiag * torch.exp((logppj[:,j]-logp)/2) 48 | # 49 | # return eloc 50 | 51 | def localenergyALT(self,samples,logp,sumsqrtp,logsqrtp): 52 | """ 53 | Takes in s, ln[p(s)] and exp(-logsqrtp)*sum(sqrt[p(s')]), then computes Hloc(s) for N samples s. 54 | 55 | Inputs: 56 | samples - [B,L,1] matrix of zeros and ones for ground/excited states 57 | logp - size B vector of logscale probabilities ln[p(s)] 58 | logsqrtp - size B vector of average (log p)/2 values used for numerical stability 59 | when calculating sum_s'(sqrt[p(s')/p(s)]) 60 | sumsqrtp - size B vector of exp(-logsqrtp)*sum(sqrt[p(s')]). 61 | Returns: 62 | size B vector of energies Hloc(s) 63 | 64 | """ 65 | # Going to calculate Eloc for each sample in a separate spot 66 | # so eloc will have shape [B] 67 | # recall samples has shape [B,L,1] 68 | B=samples.shape[0] 69 | eloc = torch.zeros(B,device=self.device) 70 | # Chemical potential 71 | with torch.no_grad(): 72 | tmp=self.Vij(samples.squeeze(2)) 73 | eloc += torch.sum(tmp*samples.squeeze(2),axis=1) 74 | # Off-diagonal part 75 | 76 | #in this function the entire sum is precomputed and it was premultiplied by exp(-logsqrtp) for stability 77 | eloc += self.offDiag *sumsqrtp* torch.exp(logsqrtp-logp/2) 78 | 79 | return eloc 80 | 81 | def magnetizations(self, samples): 82 | B = samples.shape[0] 83 | L = samples.shape[1] 84 | mag = torch.zeros(B, device=self.device) 85 | abs_mag = torch.zeros(B, device=self.device) 86 | sq_mag = torch.zeros(B, device=self.device) 87 | stag_mag = torch.zeros(B, device=self.device) 88 | 89 | with torch.no_grad(): 90 | samples_pm = 2 * samples - 1 91 | mag += torch.sum(samples_pm.squeeze(2), axis=1) 92 | abs_mag += torch.abs(torch.sum(samples_pm.squeeze(2), axis=1)) 93 | sq_mag += torch.abs(torch.sum(samples_pm.squeeze(2), axis=1))**2 94 | 95 | samples_reshape = torch.reshape(samples.squeeze(2), (B, int(np.sqrt(L)), int(np.sqrt(L)))) 96 | for i in range(int(np.sqrt(L))): 97 | for j in range(int(np.sqrt(L))): 98 | stag_mag += (-1)**(i+j) * (samples_reshape[:,i,j] - 0.5) 99 | 100 | return mag, abs_mag, sq_mag, stag_mag / L 101 | 102 | def ground(self): 103 | """Returns the ground state energy E/L""" 104 | raise NotImplementedError 105 | -------------------------------------------------------------------------------- /tvmc/hamiltonians/rydberg.py: -------------------------------------------------------------------------------- 1 | from tvmc.util import Options,OptionManager 2 | from tvmc.hamiltonians.hamiltonian import * 3 | 4 | class Rydberg(Hamiltonian): 5 | 6 | DEFAULTS = Options(Lx=4,Ly=4,V=7.0,Omega=1.0,delta=1.0) 7 | 
def __init__(self,Lx,Ly,V,Omega,delta,device=device,**kwargs): 8 | self.Lx = Lx # Size along x 9 | self.Ly = Ly # Size along y 10 | self.V = V # Van der Waals potential 11 | self.delta = delta # Detuning 12 | # off diagonal part is -0.5*Omega 13 | super(Rydberg,self).__init__(Lx*Ly,-0.5*Omega,device) 14 | 15 | @staticmethod 16 | def Vij(Ly,Lx,V,matrix): 17 | #matrix will be size [Lx*Ly,Lx*Ly] 18 | for i in range(Ly): 19 | for j in range(Lx): 20 | #flatten two indices into one 21 | idx = Ly*j+i 22 | # only fill in the upper diagonal 23 | for k in range(idx+1,Lx*Ly): 24 | #expand one index into two 25 | i2 = k%Ly 26 | j2=k//Ly 27 | div = ((i2-i)**2+(j2-j)**2)**3 28 | #if div<=R: 29 | matrix[idx][k]=V/div 30 | 31 | def buildlattice(self): 32 | Lx,Ly=self.Lx,self.Ly 33 | 34 | #diagonal hamiltonian portion can be written as a matrix multiplication then a dot product 35 | mat=np.zeros([self.L,self.L]) 36 | Rydberg.Vij(Lx,Ly,self.V,mat) 37 | 38 | with torch.no_grad(): 39 | self.Vij.weight[:,:]=torch.Tensor(mat) 40 | self.Vij.bias.fill_(-self.delta) 41 | 42 | def ground(self): 43 | return Rydberg.E[self.Lx*self.Ly] 44 | 45 | OptionManager.register("rydberg",Rydberg.DEFAULTS) 46 | -------------------------------------------------------------------------------- /tvmc/models/BaseModel.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | import math,time,json 3 | import torch 4 | from torch import nn 5 | ngpu=1 6 | device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu") 7 | 8 | 9 | class Sampler(nn.Module): 10 | 11 | def __init__(self,device=device): 12 | self.device=device 13 | super(Sampler, self).__init__() 14 | 15 | def save(self,fn): 16 | torch.save(self,fn) 17 | 18 | def logprobability(self,input): 19 | # type: (Tensor) -> Tensor 20 | """Compute the logscale probability of a given state 21 | Inputs: 22 | input - [B,L,1] matrix of zeros and ones for ground/excited states 23 | Returns: 24 | logp - [B] size vector of logscale probability labels 25 | """ 26 | raise NotImplementedError 27 | 28 | @torch.jit.export 29 | def sample(self,B,L): 30 | # type: (int,int) -> Tensor 31 | """ Generates a set states 32 | Inputs: 33 | B (int) - The number of states to generate in parallel 34 | L (int) - The length of generated vectors 35 | Returns: 36 | samples - [B,L,1] matrix of zeros and ones for ground/excited states 37 | logprobs - [B] matrix of logscale probabilities (float Tensor) 38 | """ 39 | raise NotImplementedError 40 | 41 | @torch.jit.export 42 | def off_diag_labels(self,sample,nloops=1): 43 | # type: (Tensor,int) -> Tensor 44 | """ 45 | Inputs: 46 | samples - [B,L,1] matrix of zeros and ones for ground/excited states 47 | 48 | Returns: 49 | probs - size [B,L] tensor of probabilities of the excitation-flipped states 50 | """ 51 | D=nloops 52 | B,L,_=sample.shape 53 | sflip = torch.zeros([B,L,L,1],device=self.device) 54 | #collect all of the flipped states into one array 55 | for j in range(L): 56 | #get all of the states with one spin flipped 57 | sflip[:,j] = sample*1.0 58 | sflip[:,j,j] = 1-sflip[:,j,j] 59 | #compute all of their logscale probabilities 60 | with torch.no_grad(): 61 | probs=torch.zeros([B*L],device=self.device) 62 | tmp=sflip.view([B*L,L,1]) 63 | for k in range(D): 64 | probs[k*B*L//D:(k+1)*B*L//D] = self.logprobability(tmp[k*B*L//D:(k+1)*B*L//D]) 65 | 66 | return probs.reshape([B,L]) 67 | 68 | @torch.jit.export 69 | def off_diag_labels_summed(self,sample,nloops=1): 70 | # type: (Tensor,int) -> 
Tuple[Tensor,Tensor] 71 | """ 72 | Inputs: 73 | samples - [B,L,1] matrix of zeros and ones for ground/excited states 74 | 75 | Returns: 76 | logsqrtp - size B vector of average (log p)/2 values used for numerical stability 77 | when calculating sum_s'(sqrt[p(s')/p(s)]) 78 | sumsqrtp - size B vector of exp(-logsqrtp)*sum(sqrt[p(s')]). 79 | """ 80 | probs = self.off_diag_labels(sample,nloops) 81 | #get the average of our logprobabilities and divide by 2 82 | logsqrtp=probs.mean(dim=1)/2 83 | #compute the sum with a constant multiplied to keep the sum close to 1 84 | sumsqrtp = torch.exp(probs/2-logsqrtp.unsqueeze(1)).sum(dim=1) 85 | return sumsqrtp,logsqrtp 86 | 87 | # Functions for making Patches & doing probability traces 88 | class Patch2D(nn.Module): 89 | def __init__(self,nx,ny,Lx,Ly,device=device): 90 | super().__init__() 91 | self.nx=nx 92 | self.ny=ny 93 | self.Ly=Ly 94 | self.Lx=Lx 95 | 96 | #construct an index tensor for the reverse operation 97 | indices = torch.arange(Lx*Ly,device=device).unsqueeze(0) 98 | self.mixed = self.forward(indices).reshape([Lx*Ly]) 99 | #inverse 100 | self.mixed=torch.argsort(self.mixed) 101 | 102 | def forward(self,x): 103 | # type: (Tensor) -> Tensor 104 | nx,ny,Lx,Ly=self.nx,self.ny,self.Lx,self.Ly 105 | """Unflatten a tensor back to 2D, break it into nxn chunks, then flatten the sequence and the chunks 106 | Input: 107 | Tensor of shape [B,L] 108 | Output: 109 | Tensor of shape [B,L//n^2,n^2] 110 | """ 111 | #make the input 2D then break it into 2x2 chunks 112 | #afterwards reshape the 2x2 chunks to vectors of size 4 and flatten the 2d bit 113 | return x.view([x.shape[0],Lx,Ly]).unfold(-2,nx,nx).unfold(-2,ny,ny).reshape([x.shape[0],int(Lx*Ly//(nx*ny)),nx*ny]) 114 | 115 | def reverse(self,x): 116 | # type: (Tensor) -> Tensor 117 | """Inverse function of forward 118 | Input: 119 | Tensor of shape [B,L//n^2,n^2] 120 | Output: 121 | Tensor of shape [B,L] 122 | """ 123 | Ly,Lx=self.Ly,self.Lx 124 | # Reversing is done with an index tensor because torch doesn't have an inverse method for unfold 125 | return x.reshape([x.shape[0],Ly*Lx])[:,self.mixed] 126 | 127 | class Patch1D(nn.Module): 128 | def __init__(self,n,L): 129 | super().__init__() 130 | self.n=n 131 | self.L = L 132 | 133 | def forward(self,x): 134 | # type: (Tensor) -> Tensor 135 | """Break a tensor into chunks, essentially a wrapper of reshape 136 | Input: 137 | Tensor of shape [B,L] 138 | Output: 139 | Tensor of shape [B,L/n,n] 140 | """ 141 | #make the input 2D then break it into 2x2 chunks 142 | #afterwards reshape the 2x2 chunks to vectors of size 4 and flatten the 2d bit 143 | return x.reshape([x.shape[0],self.L//self.n,self.n]) 144 | 145 | def reverse(self,x): 146 | # type: (Tensor) -> Tensor 147 | """Inverse function of forward 148 | Input: 149 | Tensor of shape [B,L/n,n] 150 | Output: 151 | Tensor of shape [B,L] 152 | """ 153 | # original sequence order can be retrieved by chunking twice more 154 | #in the x-direction you should have chunks of size 2, but in y it should 155 | #be chunks of size Ly//2 156 | return x.reshape([x.shape[0],self.L]) 157 | 158 | @torch.jit.script 159 | def genpatch2onehot(patch,p): 160 | # type: (Tensor,int) -> Tensor 161 | """ Turn a sequence of size p patches into a onehot vector 162 | Inputs: 163 | patch - Tensor of shape [?,p] 164 | p (int) - the patch size 165 | 166 | """ 167 | #moving the last dimension to the front 168 | patch=patch.unsqueeze(0).transpose(-1,0).squeeze(-1).to(torch.int64) 169 | out=torch.zeros(patch.shape[1:],device=patch.device) 170 | 
for i in range(p): 171 | out+=patch[i]<
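The dump is truncated here, inside genpatch2onehot. As an added usage illustration (not part of BaseModel.py), the sketch below round-trips a batch of spin configurations through the Patch2D class shown in full above; it assumes only that class and standard PyTorch:

```
# Usage sketch: Patch2D round-trip on a 4x4 lattice split into 2x2 patches.
import torch

cpu = torch.device("cpu")
patcher = Patch2D(nx=2, ny=2, Lx=4, Ly=4, device=cpu)

x = torch.randint(0, 2, (8, 16), device=cpu).float()  # batch of 8 flattened spin configurations
chunks = patcher.forward(x)                            # -> [8, 4, 4]: four 2x2 patches per sample
restored = patcher.reverse(chunks)                     # undoes forward() via the precomputed index
assert torch.equal(restored, x)
```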