├── FPGA-synthesis.pdf
├── Introduction.pdf
├── List-of-models.pdf
├── Model-compression.pdf
├── fpga4hep_sdaccel.pdf
├── part1_hls4ml_intro.md
├── part2_aws_sdaccel.md
└── part3_model_compression.md

/FPGA-synthesis.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FPGA4HEP/course_material/69f92b36c67334d6d21c2f6be9b7a2feb5c8861a/FPGA-synthesis.pdf
--------------------------------------------------------------------------------
/Introduction.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FPGA4HEP/course_material/69f92b36c67334d6d21c2f6be9b7a2feb5c8861a/Introduction.pdf
--------------------------------------------------------------------------------
/List-of-models.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FPGA4HEP/course_material/69f92b36c67334d6d21c2f6be9b7a2feb5c8861a/List-of-models.pdf
--------------------------------------------------------------------------------
/Model-compression.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FPGA4HEP/course_material/69f92b36c67334d6d21c2f6be9b7a2feb5c8861a/Model-compression.pdf
--------------------------------------------------------------------------------
/fpga4hep_sdaccel.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/FPGA4HEP/course_material/69f92b36c67334d6d21c2f6be9b7a2feb5c8861a/fpga4hep_sdaccel.pdf
--------------------------------------------------------------------------------
/part1_hls4ml_intro.md:
--------------------------------------------------------------------------------
### Download and install the package

```
git clone https://github.com/FPGA4HEP/hls4ml.git
cd hls4ml
source install_miniconda3.sh
source ~/.bashrc
source install.sh
```

Every time you log in do:

```
source setup_hls4ml.sh
conda activate hls4ml-env
```

Run `git pull` as well to fetch the latest changes.

### Run the tool (with your favourite model, e.g. 1-layer)

```
cd keras-to-hls
export FAVOURITEMODEL=1layer
python keras-to-hls.py -c keras-config-${FAVOURITEMODEL}.yml
```

This will create a folder called `my-hls-test-${FAVOURITEMODEL}`. If you want to change the project directory name, edit the yml configuration file.

### Run project design synthesis with Vivado HLS

```
cd my-hls-test-${FAVOURITEMODEL}
vivado_hls -f build_prj.tcl
```

If you get a runtime error from Vivado, log out and prepend `LC_ALL=C` to your ssh command, e.g.

```
LC_ALL=C ssh -i FPGA4HEP.pem centos@your-ip
```

### Read out resource usage and latency from the synthesis report

```
cd ..
./print-reports.sh my-hls-test-${FAVOURITEMODEL}
```

### Extract and compare the area under the ROC curve from Keras (floating-point calculations) and HLS (fixed-point calculations)

```
python extract_roc.py -c keras-config-${FAVOURITEMODEL}.yml
```

### EXERCISE:

Change the precision of the calculations and the reuse factor in the Keras configuration file, and check their effect on NN performance (AUC) and FPGA resource usage using the scripts above.

```
ReuseFactor: N                  # N = number of times a multiplier is reused to do a computation
DefaultPrecision: ap_fixed<X,Y> # X = total number of bits, Y = number of integer bits, X-Y = number of fractional bits
```

NB: we suggest changing the project output directory in the Keras configuration for each test, to avoid overwriting previous projects.
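To build some intuition for the `ap_fixed` precision setting before running a full synthesis, here is a small Python sketch (our own illustration, not part of hls4ml) that emulates rounding a value onto a signed fixed-point grid with X total bits and Y integer bits:

```python
# Illustrative only: emulate ap_fixed<X,Y>-style rounding in Python.
# X = total bits, Y = integer bits (including sign), so there are
# X-Y fractional bits and the quantization step is 2**-(X-Y).

def quantize(value, total_bits, int_bits):
    """Round `value` to the nearest representable fixed-point number,
    saturating at the representable signed range."""
    frac_bits = total_bits - int_bits
    step = 2.0 ** -frac_bits
    lo = -(2.0 ** (int_bits - 1))
    hi = 2.0 ** (int_bits - 1) - step
    q = round(value / step) * step
    return max(lo, min(hi, q))

w = 0.2371
print(quantize(w, 16, 6))  # 10 fractional bits: stays close to the original
print(quantize(w, 8, 6))   # only 2 fractional bits: coarse rounding to 0.25
```

Shrinking X-Y (fewer fractional bits) makes every weight and activation coarser, which is why the AUC degrades below some precision; the exercise above lets you find that knee for your model.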
--------------------------------------------------------------------------------
/part2_aws_sdaccel.md:
--------------------------------------------------------------------------------
### Clone the hls4ml wrapper for SDAccel

```
git clone https://github.com/FPGA4HEP/hls4ml_c.git
cd hls4ml_c
git pull # to fetch the latest changes
```

Edit the Makefile in the hls4ml_c directory to change the default input directory name:

```
HLS4ML_PROJECT := my-hls-test-FAVOURITE-MODEL
```

### Check out SDAccel and set up the environment

```
git clone https://github.com/aws/aws-fpga.git $AWS_FPGA_REPO_DIR
```

Every time you log in do:

```
cd $AWS_FPGA_REPO_DIR
source sdaccel_setup.sh
```

or today you can also just use this shortcut we have set up for you:

```
cd ~/
source setup_sdaccel.sh
```

NB: if you had hls4ml activated, log out and back in first.

More detailed information [here](https://github.com/aws/aws-fpga/tree/master/SDAccel)

### Run software simulation, hardware emulation and build the FPGA binary

```
make clean
make check TARGETS=sw_emu DEVICES=$AWS_PLATFORM all # software emulation
make check TARGETS=hw_emu DEVICES=$AWS_PLATFORM all # hardware emulation
make TARGETS=hw DEVICES=$AWS_PLATFORM all && ./create.sh # firmware building
```

### Run on a real FPGA

Launch an F1 instance and copy the host executable and the FPGA binary file (.awsxclbin) from the T2 instance (NB: first copy them to your laptop).
Set up the SDAccel environment on the F1 as well:

```
git clone https://github.com/aws/aws-fpga.git $AWS_FPGA_REPO_DIR
cd $AWS_FPGA_REPO_DIR
source sdaccel_setup.sh
sudo sh
export VIVADO_TOOL_VERSION=2018.2
curl -s https://s3.amazonaws.com/aws-fpga-developer-ami/1.5.0/Patches/xrt_201802.2.1.0_7.5.1804-xrt.rpm -o xrt_201802.2.1.0_7.5.1804-xrt.rpm
curl -s https://s3.amazonaws.com/aws-fpga-developer-ami/1.5.0/Patches/xrt_201802.2.1.0_7.5.1804-aws.rpm -o xrt_201802.2.1.0_7.5.1804-aws.rpm
sudo yum reinstall -y xrt_*-xrt.rpm
sudo yum install -y xrt_*-aws.rpm
source /home/centos/src/project_data/aws-fpga/sdaccel_runtime_setup.sh
```

But today you can also just run these two scripts:

```
source setup_sdaccel_fpga_base.sh
source setup_sdaccel_fpga.sh
```

Now copy the input features and Keras prediction files from your hls4ml project directory on the T2 (my-hls-test-FAVOURITE-MODEL/tb_data/) to the F1, to pass them to the FPGA.

Finally, you can accelerate your NN inference on the FPGA by running on the input features:

```
./host N data_dir
```

where N is the number of batches of 32 events (we suggest N=6168 if you use the provided input features list), and data_dir is the directory with the input features and Keras prediction files.

The application will produce a file with the predictions from the FPGA run.
Compare it with the HLS and Keras calculations using the extract_roc.py script in the hls4ml directory on the T2 instance (NB: copy tb_output_data.dat from the F1 to the hls4ml/keras-to-hls directory on the T2):

```
python extract_roc.py -c keras-config-FAVOURITE-MODEL.yml -f tb_output_data.dat
```
--------------------------------------------------------------------------------
/part3_model_compression.md:
--------------------------------------------------------------------------------
### Check out the Keras training package and set up the environment (assumes a local installation of the Python data-analysis packages: keras, tensorflow, ...)

```
git clone https://github.com/FPGA4HEP/keras-training.git
cp keras-training/install_miniconda.sh ~/
cd ~
source install_miniconda.sh
source ~/.bashrc
cd keras-training
source install.sh
source setup.sh # every time you log in
```

### Training and evaluation of the [3-layer Dense NN](https://github.com/FPGA4HEP/keras-training/blob/master/models/models.py#L63-L76):

```
cd ~/keras-training/train
python train.py -t t_allpar_new -i ../data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_truth.z -c train_config_threelayer.yml -o train_3layer/
```

The NN inputs/outputs and training configuration are specified in the config file train_config_threelayer.yml. Note from the config file that we are training the model with L1 regularization = 0.0001.

After the training, find the final weights in the train_3layer output folder.
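Why train with L1 regularization? The penalty lambda * sum(|w|) adds a constant pull of size lambda toward zero on every weight, so weights that barely help the loss are driven to (near) zero and become prunable in the next section. A toy Python sketch of this effect (our own illustration, not the keras-training code):

```python
# Illustrative only: the L1 penalty alone, applied via subgradient steps,
# shrinks a weight's magnitude by lr * lam per step until it hits zero.

lam = 0.0001  # same L1 strength as in train_config_threelayer.yml
lr = 0.1      # hypothetical learning rate, for illustration

def l1_penalty(weights):
    """The regularization term added to the loss: lam * sum(|w|)."""
    return lam * sum(abs(w) for w in weights)

def l1_step(w):
    """One subgradient step on the penalty alone, clamped so it
    cannot overshoot past zero."""
    if w > 0:
        return max(0.0, w - lr * lam)
    if w < 0:
        return min(0.0, w + lr * lam)
    return 0.0

w = 0.0005  # a weight the data barely constrains
for _ in range(100):
    w = l1_step(w)
print(w)  # driven all the way to zero
```

In the real training the data gradient competes with this pull, so only unimportant weights end up at zero, which is exactly what makes the model compressible.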
You can now evaluate the performance of the NN:

```
python eval.py -t t_allpar_new -i ../data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_truth.z -m train_3layer/KERAS_check_best_model.h5 -c train_config_threelayer.yml -o eval_3layer/
```

Find the training history (loss and accuracy), ROC curve and confusion matrix in the output folder eval_3layer.

### Pruning and retraining

To prune the trained model by removing weights below a certain threshold (relative weight < 0.004):

```
mkdir prune_3layer_relwmax4e-3
python prune.py -m train_3layer/KERAS_check_best_model.h5 --relative-weight-max 4e-3 -o prune_3layer_relwmax4e-3/pruned_model.h5
```

Check the output folder prune_3layer_relwmax4e-3 for plots of the weights and quantiles.

Now evaluate the pruned model:

```
python eval.py -t t_allpar_new -i ../data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_truth.z -m prune_3layer_relwmax4e-3/pruned_model.h5 -c train_config_threelayer.yml -o eval_3layer_relwmax4e-3/
```

Check the performance of the pruned model in the eval_3layer_relwmax4e-3 output folder and compare it with the previous performance of the non-pruned model.
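The relative-weight pruning criterion described above can be sketched in a few lines of Python (a hypothetical helper for illustration, not the actual prune.py implementation):

```python
# Illustrative sketch of relative-weight pruning: a weight is zeroed
# when its magnitude, relative to the largest magnitude in the same
# collection of weights, falls below the threshold (here 4e-3).

def prune_relative(weights, rel_max=4e-3):
    """Return a copy of `weights` with small-relative-magnitude
    entries set to zero."""
    biggest = max(abs(w) for w in weights)
    return [0.0 if abs(w) / biggest < rel_max else w for w in weights]

layer = [0.8, -0.002, 0.31, 0.0001, -0.45]
print(prune_relative(layer))  # [0.8, 0.0, 0.31, 0.0, -0.45]
```

Zeroed weights mean multiplications that can be dropped entirely in the HLS firmware, which is where the resource savings in the synthesis step come from.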
Now retrain the model (keeping the pruned weights fixed at 0) and evaluate the performance:

```
python retrain.py -t t_allpar_new -i ../data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_truth.z -o retrain_3layer_relwmax4e-3 -m prune_3layer_relwmax4e-3/pruned_model.h5 -c train_config_threelayer.yml -d prune_3layer_relwmax4e-3/pruned_model_drop_weights.h5
python eval.py -t t_allpar_new -i ../data/processed-pythia82-lhc13-all-pt1-50k-r1_h022_e0175_t220_nonu_truth.z -m retrain_3layer_relwmax4e-3/KERAS_check_best_model.h5 -c train_config_threelayer.yml -o eval_retrain_3layer_relwmax4e-3/
```

Repeat the full procedure a few times and copy the final compressed model (the pruned-weights .h5 file) to the hls4ml directory. Synthesize the FPGA project and compare resources and latency with those obtained with the full 3-layer model.

NB: when repeating the prune+evaluate+retrain procedure, remember to change the names of the output directory and the input model.
--------------------------------------------------------------------------------