├── FPGA-synthesis.pdf
├── Introduction.pdf
├── List-of-models.pdf
├── Model-compression.pdf
├── fpga4hep_sdaccel.pdf
├── part1_hls4ml_intro.md
├── part2_aws_sdaccel.md
└── part3_model_compression.md

/FPGA-synthesis.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FPGA4HEP/course_material/69f92b36c67334d6d21c2f6be9b7a2feb5c8861a/FPGA-synthesis.pdf
--------------------------------------------------------------------------------
/Introduction.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FPGA4HEP/course_material/69f92b36c67334d6d21c2f6be9b7a2feb5c8861a/Introduction.pdf
--------------------------------------------------------------------------------
/List-of-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FPGA4HEP/course_material/69f92b36c67334d6d21c2f6be9b7a2feb5c8861a/List-of-models.pdf
--------------------------------------------------------------------------------
/Model-compression.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FPGA4HEP/course_material/69f92b36c67334d6d21c2f6be9b7a2feb5c8861a/Model-compression.pdf
--------------------------------------------------------------------------------
/fpga4hep_sdaccel.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FPGA4HEP/course_material/69f92b36c67334d6d21c2f6be9b7a2feb5c8861a/fpga4hep_sdaccel.pdf
--------------------------------------------------------------------------------
/part1_hls4ml_intro.md:
--------------------------------------------------------------------------------
### Download and install the package

```
git clone https://github.com/FPGA4HEP/hls4ml.git
cd hls4ml
source install_miniconda3.sh
source ~/.bashrc
source install.sh
```

Every time you log in do:

```
source setup_hls4ml.sh
conda activate hls4ml-env
```

Run `git pull` as well to fetch the latest changes.

### Run the tool (with your favourite model, e.g. 1-layer)

```
cd keras-to-hls
export FAVOURITEMODEL=1layer
python keras-to-hls.py -c keras-config-${FAVOURITEMODEL}.yml
```

This will create a folder called `my-hls-test-${FAVOURITEMODEL}`. If you want to change the project directory name, edit the yml configuration file.

### Run project design synthesis with Vivado HLS

```
cd my-hls-test-${FAVOURITEMODEL}
vivado_hls -f build_prj.tcl
```

If you get a runtime error from Vivado, log out and prepend `LC_ALL=C` to your ssh command, e.g.

```
LC_ALL=C ssh -i FPGA4HEP.pem centos@your-ip
```

### Read out resource usage and latency from the synthesis report

```
cd ..
./print-reports.sh my-hls-test-${FAVOURITEMODEL}
```

### Extract and compare the area under the ROC curve from Keras (floating-point calculations) and HLS (fixed-point calculations)

```
python extract_roc.py -c keras-config-${FAVOURITEMODEL}.yml
```

### EXERCISE:

Change the precision of the calculations and the reuse factor in the Keras configuration file, and check their effect on NN performance (AUC) and FPGA resource usage using the scripts above.

```
ReuseFactor: N                  # N = number of times a multiplier is reused to do a computation
DefaultPrecision: ap_fixed<X,Y> # X = total number of bits, Y = number of integer bits, X-Y = number of fractional bits
```

NB: we suggest changing the project output directory in the Keras configuration for each test, to avoid overwriting previous projects.
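To build some intuition for the `ap_fixed` precision setting before running a full synthesis, here is a small Python sketch (our own illustration, not part of hls4ml) that emulates rounding a value onto a signed fixed-point grid with X total bits and Y integer bits:

```python
# Illustrative only: emulate ap_fixed<X,Y>-style rounding in Python.
# X = total bits, Y = integer bits (including sign), so there are
# X-Y fractional bits and the quantization step is 2**-(X-Y).

def quantize(value, total_bits, int_bits):
    """Round `value` to the nearest representable fixed-point number,
    saturating at the representable signed range."""
    frac_bits = total_bits - int_bits
    step = 2.0 ** -frac_bits
    lo = -(2.0 ** (int_bits - 1))
    hi = 2.0 ** (int_bits - 1) - step
    q = round(value / step) * step
    return max(lo, min(hi, q))

w = 0.2371
print(quantize(w, 16, 6))  # 10 fractional bits: stays close to the original
print(quantize(w, 8, 6))   # only 2 fractional bits: coarse rounding to 0.25
```

Shrinking X-Y (fewer fractional bits) makes every weight and activation coarser, which is why the AUC degrades below some precision; the exercise above lets you find that knee for your model.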
--------------------------------------------------------------------------------
/part2_aws_sdaccel.md:
--------------------------------------------------------------------------------
### Clone the hls4ml wrapper for SDAccel

```
git clone https://github.com/FPGA4HEP/hls4ml_c.git
cd hls4ml_c
git pull # to fetch the latest changes
```

Edit the Makefile in the hls4ml_c directory to change the default input directory name:

```
HLS4ML_PROJECT := my-hls-test-FAVOURITE-MODEL
```

### Check out SDAccel and set up the environment

```
git clone https://github.com/aws/aws-fpga.git $AWS_FPGA_REPO_DIR
```

Every time you log in do:

```
cd $AWS_FPGA_REPO_DIR
source sdaccel_setup.sh
```

or today you can also just use this shortcut we have set up for you:

```
cd ~/
source setup_sdaccel.sh
```

NB: if you had hls4ml activated, log out and back in first.

More detailed information [here](https://github.com/aws/aws-fpga/tree/master/SDAccel)

### Run software simulation, hardware emulation and build the FPGA binary

```
make clean
make check TARGETS=sw_emu DEVICES=$AWS_PLATFORM all # software emulation
make check TARGETS=hw_emu DEVICES=$AWS_PLATFORM all # hardware emulation
make TARGETS=hw DEVICES=$AWS_PLATFORM all && ./create.sh # firmware building
```

### Run on a real FPGA

Launch an F1 instance and copy the host executable and the FPGA binary file (.awsxclbin) from the T2 instance (NB: first copy them to your laptop).
Set up the SDAccel environment on the F1 as well:

```
git clone https://github.com/aws/aws-fpga.git $AWS_FPGA_REPO_DIR
cd $AWS_FPGA_REPO_DIR
source sdaccel_setup.sh
sudo sh
export VIVADO_TOOL_VERSION=2018.2
curl -s https://s3.amazonaws.com/aws-fpga-developer-ami/1.5.0/Patches/xrt_201802.2.1.0_7.5.1804-xrt.rpm -o xrt_201802.2.1.0_7.5.1804-xrt.rpm
curl -s https://s3.amazonaws.com/aws-fpga-developer-ami/1.5.0/Patches/xrt_201802.2.1.0_7.5.1804-aws.rpm -o xrt_201802.2.1.0_7.5.1804-aws.rpm
sudo yum reinstall -y xrt_*-xrt.rpm
sudo yum install -y xrt_*-aws.rpm
source /home/centos/src/project_data/aws-fpga/sdaccel_runtime_setup.sh
```

But today you can also just run these two scripts:

```
source setup_sdaccel_fpga_base.sh
source setup_sdaccel_fpga.sh
```

Now copy the input features and Keras prediction files from your hls4ml project directory on the T2 (my-hls-test-FAVOURITE-MODEL/tb_data/) to the F1, to pass them to the FPGA.

Finally, you can accelerate your NN inference on the FPGA by running on the input features:

```
./host N data_dir
```

where N is the number of batches of 32 events (we suggest N=6168 if you use the provided input features list), and data_dir is the directory with the input features and Keras prediction files.

The application will produce a file with the predictions from the FPGA run.
Compare it with the HLS and Keras calculations using the extract_roc.py script in the hls4ml directory on the T2 instance (NB: copy tb_output_data.dat from the F1 to the hls4ml/keras-to-hls directory on the T2):

```
python extract_roc.py -c keras-config-FAVOURITE-MODEL.yml -f tb_output_data.dat
```
--------------------------------------------------------------------------------
/part3_model_compression.md:
--------------------------------------------------------------------------------
### Check out the Keras training package and set up the environment (assumes a local installation of the Python data-analysis packages: keras, tensorflow, ...)

```
git clone https://github.com/FPGA4HEP/keras-training.git
cp keras-training/install_miniconda.sh ~/
cd ~
source install_miniconda.sh
source ~/.bashrc
cd keras-training
source install.sh
source setup.sh # every time you log in
```

### Training and evaluation of the [3-layer Dense NN](https://github.com/FPGA4HEP/keras-training/blob/master/models/models.py#L63-L76):

```
cd ~/keras-training/train
python train.py -t t_allpar_new -i ../data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_truth.z -c train_config_threelayer.yml -o train_3layer/
```

The NN inputs/outputs and training configuration are specified in the config file train_config_threelayer.yml. Note from the config file that we are training the model with L1 regularization = 0.0001.

After the training, find the final weights in the train_3layer output folder.
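Why train with L1 regularization? The penalty lambda * sum(|w|) adds a constant pull of size lambda toward zero on every weight, so weights that barely help the loss are driven to (near) zero and become prunable in the next section. A toy Python sketch of this effect (our own illustration, not the keras-training code):

```python
# Illustrative only: the L1 penalty alone, applied via subgradient steps,
# shrinks a weight's magnitude by lr * lam per step until it hits zero.

lam = 0.0001  # same L1 strength as in train_config_threelayer.yml
lr = 0.1      # hypothetical learning rate, for illustration

def l1_penalty(weights):
    """The regularization term added to the loss: lam * sum(|w|)."""
    return lam * sum(abs(w) for w in weights)

def l1_step(w):
    """One subgradient step on the penalty alone, clamped so it
    cannot overshoot past zero."""
    if w > 0:
        return max(0.0, w - lr * lam)
    if w < 0:
        return min(0.0, w + lr * lam)
    return 0.0

w = 0.0005  # a weight the data barely constrains
for _ in range(100):
    w = l1_step(w)
print(w)  # driven all the way to zero
```

In the real training the data gradient competes with this pull, so only unimportant weights end up at zero, which is exactly what makes the model compressible.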
You can now evaluate the performance of the NN:

```
python eval.py -t t_allpar_new -i ../data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_truth.z -m train_3layer/KERAS_check_best_model.h5 -c train_config_threelayer.yml -o eval_3layer/
```

Find the training history (loss and accuracy), ROC curve and confusion matrix in the output folder eval_3layer.

### Pruning and retraining

To prune the trained model by removing weights below a certain threshold (relative weight < 0.004):

```
mkdir prune_3layer_relwmax4e-3
python prune.py -m train_3layer/KERAS_check_best_model.h5 --relative-weight-max 4e-3 -o prune_3layer_relwmax4e-3/pruned_model.h5
```

Check the output folder prune_3layer_relwmax4e-3 for plots of the weights and quantiles.

Now evaluate the pruned model:

```
python eval.py -t t_allpar_new -i ../data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_truth.z -m prune_3layer_relwmax4e-3/pruned_model.h5 -c train_config_threelayer.yml -o eval_3layer_relwmax4e-3/
```

Check the performance of the pruned model in the eval_3layer_relwmax4e-3 output folder and compare it with the previous performance of the non-pruned model.
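The relative-weight pruning criterion described above can be sketched in a few lines of Python (a hypothetical helper for illustration, not the actual prune.py implementation):

```python
# Illustrative sketch of relative-weight pruning: a weight is zeroed
# when its magnitude, relative to the largest magnitude in the same
# collection of weights, falls below the threshold (here 4e-3).

def prune_relative(weights, rel_max=4e-3):
    """Return a copy of `weights` with small-relative-magnitude
    entries set to zero."""
    biggest = max(abs(w) for w in weights)
    return [0.0 if abs(w) / biggest < rel_max else w for w in weights]

layer = [0.8, -0.002, 0.31, 0.0001, -0.45]
print(prune_relative(layer))  # [0.8, 0.0, 0.31, 0.0, -0.45]
```

Zeroed weights mean multiplications that can be dropped entirely in the HLS firmware, which is where the resource savings in the synthesis step come from.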
Now retrain the model (keeping the pruned weights fixed at 0) and evaluate the performance:

```
python retrain.py -t t_allpar_new -i ../data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_truth.z -o retrain_3layer_relwmax4e-3 -m prune_3layer_relwmax4e-3/pruned_model.h5 -c train_config_threelayer.yml -d prune_3layer_relwmax4e-3/pruned_model_drop_weights.h5
python eval.py -t t_allpar_new -i ../data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_truth.z -m retrain_3layer_relwmax4e-3/KERAS_check_best_model.h5 -c train_config_threelayer.yml -o eval_retrain_3layer_relwmax4e-3/
```

Repeat the full procedure a few times and copy the final compressed model (the pruned-weights .h5 file) to the hls4ml directory. Synthesize the FPGA project and compare resources and latency with those obtained with the full 3-layer model.

NB: when repeating the prune+evaluate+retrain procedure, remember to change the names of the output directory and the input model.
--------------------------------------------------------------------------------