├── .clang-format ├── .envrc ├── .flake8 ├── .github ├── dependabot.yml └── workflows │ ├── mega-linter.yml │ └── python-package.yml ├── .gitignore ├── .mega-linter.yml ├── .pre-commit-config.yaml ├── .pylintrc ├── .vscode └── settings.json ├── .yamlfmt.yml ├── .yamllint.yml ├── CPPLINT.cfg ├── LICENSE ├── MANIFEST.in ├── README.md ├── ci ├── copyright-ignore ├── flake8-ignore ├── pylint-ignore ├── pytest │ ├── test_general.py │ └── yaml.yaml └── run-tests.sh ├── cplusutilities ├── Download-generic.sh ├── Download.sh ├── README.md ├── clean_HF_TreeCreator_env.sh ├── download_from_grid.sh ├── downloader-generic.sh ├── downloader.sh ├── mass_fitter.C ├── merge_and_fit_invmasshisto.C ├── post_download.sh ├── post_download_all.sh ├── run_downloader └── run_mass_fitter.sh ├── figures ├── ALICE_all.png ├── LHCparticle.jpg ├── Lambda_peak.png ├── SelectionVar.png └── SideBands.png ├── machine_learning_hep ├── README.md ├── __init__.py ├── __main__.py ├── analysis │ ├── README.md │ ├── __init__.py │ ├── analyzer.py │ ├── analyzer_jets.py │ ├── analyzer_manager.py │ ├── analyzerdhadrons.py │ ├── analyzerdhadrons_mult.py │ ├── do_systematics.py │ ├── systematics.py │ └── utils.py ├── bitwise.py ├── clean.sh ├── clean_analysis.sh ├── clean_results.sh ├── config.py ├── correlations.py ├── data │ ├── __init__.py │ ├── config_model_parameters.yml │ ├── config_run_parameters.yml │ ├── data_run3 │ │ ├── database_ml_parameters_D0Jet_pp.yml │ │ ├── database_ml_parameters_D0pp_jet_run2cmp.yml │ │ ├── database_ml_parameters_Dp.yml │ │ ├── database_ml_parameters_DpJet_pp.yml │ │ ├── database_ml_parameters_JPsiJet_pp.yml │ │ ├── database_ml_parameters_Jet_pp.yml │ │ ├── database_ml_parameters_LcJet_pp.yml │ │ ├── database_ml_parameters_LcJet_pp_hp24.yml │ │ ├── database_ml_parameters_LcToPKPi_multiclass.yml │ │ ├── database_ml_parameters_LcToPKPi_newformat.yml │ │ ├── database_ml_parameters_LcToPKPi_newformat_mult_ana.yml │ │ ├── database_variations_D0Jet_pp_jet_obs.yml │ │ └── 
database_variations_LcJet_pp_jet_obs.yml │ ├── database_run_list.yml │ └── fonll │ │ ├── DmesonLcPredictions_13TeV_y05_FFee_BRpythia8.root │ │ ├── DmesonLcPredictions_13TeV_y05_FFee_BRpythia8_SepContr_PDG2020.root │ │ ├── DmesonLcPredictions_13TeV_y05_FFptDepLHCb_BRpythia8.root │ │ ├── DmesonLcPredictions_13TeV_y05_FFptDepLHCb_BRpythia8_PDG2020.root │ │ ├── DmesonLcPredictions_502TeV_y05_FFee_BRpythia8.root │ │ └── DmesonLcPredictions_502TeV_y05_FFptDepLHCb_BRpythia8.root ├── do_variations.py ├── fitting │ ├── README.md │ ├── __init__.py │ ├── fitters.py │ ├── helpers.py │ ├── roofitter.py │ ├── simple_fit.py │ └── utils.py ├── globalfitter.py ├── hf_analysis_utils.py ├── hf_pt_spectrum.py ├── io.py ├── logger.py ├── mlperformance.py ├── models.py ├── multiprocesser.py ├── optimisation │ ├── README.md │ ├── bayesian_opt.py │ ├── grid_search.py │ └── metrics.py ├── optimiser.py ├── optimization.py ├── plotting │ ├── __init__.py │ ├── compare_results.py │ └── plot_jetsubstructure_run3.py ├── processer.py ├── processer_jet.py ├── processerdhadrons.py ├── processerdhadrons_mult.py ├── root.py ├── selectionutils.py ├── steer_analysis.py ├── submission │ ├── __init__.py │ ├── all_off.yml │ ├── analysis.yml │ ├── analyzer.yml │ ├── data.yml │ ├── full_analysis.yml │ ├── mc.yml │ ├── mlapp.yml │ ├── mltrain.yml │ ├── preprocess.yml │ └── processor.yml ├── submit.sh ├── submit_variations.sh ├── templates_keras.py ├── templates_scikit.py ├── templates_xgboost.py ├── utilities.py ├── utilities_files.py ├── utilities_plot.py ├── utils │ ├── __init__.py │ ├── compare_directories.sh │ ├── compare_root_files.py │ ├── dl_train.py │ └── hist.py ├── vary_bdt.py └── workflow │ └── workflow_base.py ├── pyproject.toml ├── requirements.txt └── run_hfjets.py /.clang-format: -------------------------------------------------------------------------------- 1 | BasedOnStyle: Google 2 | AccessModifierOffset: -1 3 | AlignEscapedNewlinesLeft: true 4 | AlignTrailingComments: true 5 | 
AllowAllParametersOfDeclarationOnNextLine: false 6 | AllowShortFunctionsOnASingleLine: true 7 | AllowShortIfStatementsOnASingleLine: false 8 | AllowShortLoopsOnASingleLine: false 9 | #AlwaysBreakBeforeMultilineStrings: true 10 | AlwaysBreakTemplateDeclarations: true 11 | BinPackParameters: true 12 | BreakBeforeBinaryOperators: false 13 | BreakBeforeBraces: Linux 14 | BreakBeforeTernaryOperators: true 15 | BreakConstructorInitializersBeforeComma: false 16 | ColumnLimit: 0 17 | CommentPragmas: '^ IWYU pragma:' 18 | ConstructorInitializerAllOnOneLineOrOnePerLine: true 19 | ConstructorInitializerIndentWidth: 2 20 | ContinuationIndentWidth: 2 21 | Cpp11BracedListStyle: true 22 | DerivePointerBinding: false 23 | ExperimentalAutoDetectBinPacking: false 24 | IndentCaseLabels: true 25 | IndentFunctionDeclarationAfterType: true 26 | IndentWidth: 2 27 | # It is broken on windows. Breaks all #include "header.h" 28 | --- 29 | Language: Cpp 30 | MaxEmptyLinesToKeep: 1 31 | KeepEmptyLinesAtTheStartOfBlocks: true 32 | NamespaceIndentation: None 33 | ObjCSpaceAfterProperty: false 34 | ObjCSpaceBeforeProtocolList: false 35 | PenaltyBreakBeforeFirstCallParameter: 1 36 | PenaltyBreakComment: 300 37 | PenaltyBreakFirstLessLess: 120 38 | PenaltyBreakString: 1000 39 | PenaltyExcessCharacter: 1000000 40 | PenaltyReturnTypeOnItsOwnLine: 200 41 | SortIncludes: false 42 | SpaceBeforeAssignmentOperators: true 43 | SpaceBeforeParens: ControlStatements 44 | SpaceInEmptyParentheses: false 45 | SpacesBeforeTrailingComments: 1 46 | SpacesInAngles: false 47 | SpacesInContainerLiterals: true 48 | SpacesInCStyleCastParentheses: false 49 | SpacesInParentheses: false 50 | Standard: Cpp11 51 | TabWidth: 2 52 | UseTab: Never 53 | --- 54 | # Do not format protobuf files 55 | Language: Proto 56 | DisableFormat: true 57 | # --- 58 | # # Since clang-format 13.0.0 59 | # Language: Json 60 | # # O2 dumps JSON files with 4-space indents. 
61 | # IndentWidth: 4 62 | -------------------------------------------------------------------------------- /.envrc: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | if [[ $(hostname) == "alicecerno2" || $(hostname) == "alipap1" ]] && [[ -z ${ROOTSYS} ]]; then 4 | 5 | PYTHON_VERSION=3.10.14 6 | ROOT_VERSION=v6-32-04 7 | ROOUNFOLD_VERSION=2.0.1 8 | 9 | PREFIX=/home/pyadmin/software_mlhep 10 | layout python ${PREFIX}/install/pyenv/versions/${PYTHON_VERSION}/bin/python3 11 | path_add PYTHONPATH ${PREFIX}/install/root-${ROOT_VERSION}_py-${PYTHON_VERSION}/lib 12 | PATH_add ${PREFIX}/install/root-${ROOT_VERSION}_py-${PYTHON_VERSION}/bin 13 | path_add LD_LIBRARY_PATH ${PREFIX}/install/root-${ROOT_VERSION}_py-${PYTHON_VERSION}/lib 14 | # path_add LD_LIBRARY_PATH ${PREFIX}/install/RooUnfold-${ROOUNFOLD_VERSION}_root-${ROOT_VERSION}_py-${PYTHON_VERSION}/lib 15 | path_add LD_LIBRARY_PATH ${PREFIX}/build/RooUnfold-${ROOUNFOLD_VERSION}_root-${ROOT_VERSION}_py-${PYTHON_VERSION} 16 | 17 | fi 18 | -------------------------------------------------------------------------------- /.flake8: -------------------------------------------------------------------------------- 1 | [flake8] 2 | max-line-length = 120 3 | extend-ignore = E203 4 | -------------------------------------------------------------------------------- /.github/dependabot.yml: -------------------------------------------------------------------------------- 1 | --- 2 | # Dependabot configuration 3 | # Reference: https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file 4 | 5 | version: 2 6 | updates: 7 | - package-ecosystem: "github-actions" # See documentation for possible values 8 | directory: "/" # Location of package manifests 9 | schedule: 10 | interval: "weekly" 11 | -------------------------------------------------------------------------------- /.github/workflows/mega-linter.yml: 
-------------------------------------------------------------------------------- 1 | --- 2 | # MegaLinter GitHub Action configuration file 3 | # More info at https://megalinter.io 4 | name: MegaLinter 5 | 6 | 'on': 7 | # Trigger mega-linter at every push. Action will also be visible from Pull Requests to master 8 | push: # Comment this line to trigger action only on pull-requests (not recommended if you don't pay for GH Actions) 9 | pull_request: 10 | branches: [master, run3] 11 | 12 | permissions: 13 | # Give the default GITHUB_TOKEN write permission to commit and push, comment issues & post new PR 14 | # Remove the ones you do not need 15 | contents: write 16 | issues: write 17 | pull-requests: write 18 | 19 | env: # Comment env block if you don't want to apply fixes 20 | # Apply linter fixes configuration 21 | APPLY_FIXES: all # When active, APPLY_FIXES must also be defined as environment variable (in github/workflows/mega-linter.yml or other CI tool) 22 | APPLY_FIXES_EVENT: push # Decide which event triggers application of fixes in a commit or a PR (pull_request, push, all) 23 | APPLY_FIXES_MODE: pull_request # If APPLY_FIXES is used, defines if the fixes are directly committed (commit) or posted in a PR (pull_request) 24 | 25 | concurrency: 26 | group: ${{ github.ref }}-${{ github.workflow }} 27 | cancel-in-progress: true 28 | 29 | jobs: 30 | megalinter: 31 | name: MegaLinter 32 | runs-on: ubuntu-latest 33 | steps: 34 | # Git Checkout 35 | - name: Checkout Code 36 | uses: actions/checkout@v4 37 | with: 38 | token: ${{ secrets.PAT || secrets.GITHUB_TOKEN }} 39 | fetch-depth: 0 # If you use VALIDATE_ALL_CODEBASE = true, you can remove this line to improve performances 40 | 41 | # MegaLinter 42 | - name: MegaLinter 43 | id: ml 44 | # You can override MegaLinter flavor used to have faster performances 45 | # More info at https://megalinter.io/flavors/ 46 | uses: oxsecurity/megalinter@v8.5.0 47 | env: 48 | # All available variables are described in documentation 49 
| # https://megalinter.io/configuration/ 50 | VALIDATE_ALL_CODEBASE: false # ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }} # Validates all source when push on master, else just the git diff with master. Override with true if you always want to lint all sources 51 | GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} 52 | # ADD YOUR CUSTOM ENV VARIABLES HERE OR DEFINE THEM IN A FILE .mega-linter.yml AT THE ROOT OF YOUR REPOSITORY 53 | # DISABLE: COPYPASTE,SPELL # Uncomment to disable copy-paste and spell checks 54 | 55 | # Upload MegaLinter artifacts 56 | - name: Archive production artifacts 57 | if: success() || failure() 58 | uses: actions/upload-artifact@v4 59 | with: 60 | name: MegaLinter reports 61 | path: | 62 | megalinter-reports 63 | mega-linter.log 64 | 65 | # Create pull request if applicable (for now works only on PR from same repository, not from forks) 66 | - name: Print PR condition 67 | run: | 68 | # Print the condition 69 | echo "(${{ env.APPLY_FIXES_EVENT }} == 'all' || ${{ env.APPLY_FIXES_EVENT }} == ${{ github.event_name }}) && ${{ env.APPLY_FIXES_MODE }} == 'pull_request' && (${{ github.event_name }} == 'push' || ${{ github.event.pull_request.head.repo.full_name }} == ${{ github.repository }})" 70 | - name: Create Pull Request with applied fixes 71 | id: cpr 72 | if: (env.APPLY_FIXES_EVENT == 'all' || env.APPLY_FIXES_EVENT == github.event_name) && env.APPLY_FIXES_MODE == 'pull_request' && (github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository) 73 | uses: peter-evans/create-pull-request@v7 74 | with: 75 | token: ${{ secrets.PAT || secrets.GITHUB_TOKEN }} 76 | commit-message: "[MegaLinter] Apply linters automatic fixes" 77 | title: "[MegaLinter] Apply linters automatic fixes" 78 | body: "Please merge this pull request to apply automatic fixes by MegaLinter." 
79 | labels: bot 80 | branch: patch-${{ github.workflow }}-${{ github.ref_name }} 81 | delete-branch: true 82 | - name: Create PR output 83 | if: steps.ml.outputs.has_updated_sources == 1 84 | run: | 85 | echo "::error::MegaLinter has fixed some files." 86 | if [ ${{ github.event_name }} == 'push' ]; then 87 | echo "::error::Merge pull request ${{ steps.cpr.outputs.pull-request-url }} to apply automatic fixes." 88 | elif [ ${{ github.event_name }} == 'pull_request' ]; then 89 | echo "::error::Check ${{ github.event.pull_request.head.repo.html_url }}/pulls to apply automatic fixes." 90 | echo "::notice::Actions must be allowed in your repository. See ${{ github.event.pull_request.head.repo.html_url }}/settings/actions" 91 | fi 92 | exit 1 93 | 94 | # Push new commit if applicable (for now works only on PR from same repository, not from forks) 95 | - name: Print commit condition 96 | run: | 97 | # Print the condition 98 | echo "${{ steps.ml.outputs.has_updated_sources }} == 1 && (${{ env.APPLY_FIXES_EVENT }} == 'all' || ${{ env.APPLY_FIXES_EVENT }} == ${{ github.event_name }}) && ${{ env.APPLY_FIXES_MODE }} == 'commit' && ${{ github.ref }} != 'refs/heads/master' && (${{ github.event_name }} == 'push' || ${{ github.event.pull_request.head.repo.full_name }} == ${{ github.repository }})" 99 | - name: Prepare commit 100 | if: steps.ml.outputs.has_updated_sources == 1 && (env.APPLY_FIXES_EVENT == 'all' || env.APPLY_FIXES_EVENT == github.event_name) && env.APPLY_FIXES_MODE == 'commit' && github.ref != 'refs/heads/master' && (github.event_name == 'push' || github.event.pull_request.head.repo.full_name == github.repository) 101 | run: sudo chown -Rc $UID .git/ 102 | - name: Commit and push applied linter fixes 103 | if: steps.ml.outputs.has_updated_sources == 1 && (env.APPLY_FIXES_EVENT == 'all' || env.APPLY_FIXES_EVENT == github.event_name) && env.APPLY_FIXES_MODE == 'commit' && github.ref != 'refs/heads/master' && (github.event_name == 'push' || 
github.event.pull_request.head.repo.full_name == github.repository) 104 | uses: stefanzweifel/git-auto-commit-action@v5 105 | with: 106 | branch: ${{ github.event.pull_request.head.ref || github.head_ref || github.ref }} 107 | commit_message: "[MegaLinter] Apply linters fixes" 108 | commit_user_name: megalinter-bot 109 | commit_user_email: nicolas.vuillamy@ox.security 110 | -------------------------------------------------------------------------------- /.github/workflows/python-package.yml: -------------------------------------------------------------------------------- 1 | --- 2 | name: Test package 3 | 4 | 'on': 5 | pull_request: 6 | branches: 7 | - master 8 | - run3 9 | paths: 10 | - "**.py" 11 | push: 12 | branches: 13 | - master 14 | - run3 15 | paths: 16 | - "**.py" 17 | 18 | permissions: 19 | contents: read 20 | pull-requests: read 21 | 22 | concurrency: 23 | group: ${{ github.ref }}-${{ github.workflow }} 24 | cancel-in-progress: true 25 | 26 | jobs: 27 | build-os-latest: 28 | runs-on: ${{ matrix.os }} 29 | strategy: 30 | max-parallel: 6 31 | matrix: 32 | os: [ubuntu-latest, macos-latest] 33 | python-version: ['3.10', '3.11', '3.12', '3.13'] 34 | test-tool: [pytest] 35 | 36 | steps: 37 | - uses: actions/checkout@v4 38 | - name: Set up Python ${{ matrix.python-version }} 39 | uses: actions/setup-python@v5 40 | with: 41 | python-version: ${{ matrix.python-version }} 42 | - name: Install dependencies 43 | run: | 44 | python -m pip install --upgrade pip 45 | python -m pip install --upgrade setuptools 46 | pip install -r requirements.txt 47 | - name: Install test tool ${{ matrix.test-tool }} 48 | run: | 49 | pip install ${{ matrix.test-tool }} 50 | - name: Run on pull_request 51 | if: github.event_name == 'pull_request' 52 | run: | 53 | git fetch --no-tags --prune --depth=1 origin +refs/heads/*:refs/remotes/origin/* 54 | changed_files="$(git diff --name-only origin/${{ github.base_ref }})" 55 | # shellcheck disable=SC2086 # Ignore unquoted options. 
56 | ci/run-tests.sh --tests ${{ matrix.test-tool }} --files $changed_files 57 | - name: Run on push 58 | if: github.event_name == 'push' 59 | run: |- 60 | ci/run-tests.sh --tests ${{ matrix.test-tool }} 61 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # If python build in source tree 2 | build/* 3 | dist/* 4 | *.egg-info 5 | **/__pycache__ 6 | .direnv 7 | 8 | # Python compiled 9 | *.pyc 10 | *.pyd 11 | *.pyo 12 | 13 | # object files 14 | *.slo 15 | *.lo 16 | *.o 17 | *.obj 18 | 19 | # dynamic libraries 20 | *.so 21 | *.dylib 22 | *.dll 23 | 24 | # static libraries 25 | *.lai 26 | *.la 27 | *.a 28 | *.lib 29 | 30 | # executables 31 | *.exe 32 | *.out 33 | *.app 34 | 35 | #input and output data 36 | machine_learning_hep/data/**/*.root 37 | machine_learning_hep/D0kINT7HighMultwithJets 38 | machine_learning_hep/LckINT7HighMultwithJets 39 | 40 | dataframes_* 41 | plots_* 42 | output_* 43 | *.json 44 | *.h5 45 | *.png 46 | *.log 47 | 48 | # editors 49 | *.swp 50 | *~ 51 | *.vscode 52 | **/.mypy_cache 53 | setup.cfg 54 | *.code-workspace 55 | 56 | # macOS 57 | .DS_Store 58 | 59 | # linters 60 | megalinter-reports 61 | -------------------------------------------------------------------------------- /.mega-linter.yml: -------------------------------------------------------------------------------- 1 | --- 2 | # Configuration file for Mega-Linter 3 | # See all available variables at https://oxsecurity.github.io/megalinter/configuration/ and in linters documentation 4 | 5 | APPLY_FIXES: all # all, none, or list of linter keys 6 | DEFAULT_BRANCH: run3 # Usually master or main 7 | # ENABLE: # If you use ENABLE variable, all other languages/formats/tooling-formats will be disabled by default 8 | # ENABLE_LINTERS: # If you use ENABLE_LINTERS variable, all other linters will be disabled by default 9 | DISABLE: 10 | - C 11 | - COPYPASTE # abusive 
copy-pastes 12 | - SPELL # spelling mistakes 13 | DISABLE_LINTERS: 14 | - BASH_EXEC 15 | - BASH_SHFMT 16 | - JSON_PRETTIER 17 | - PYTHON_BLACK 18 | - PYTHON_FLAKE8 19 | - PYTHON_ISORT 20 | - REPOSITORY_DEVSKIM 21 | - REPOSITORY_GRYPE 22 | - REPOSITORY_KICS 23 | - REPOSITORY_SECRETLINT 24 | - REPOSITORY_TRIVY 25 | - YAML_PRETTIER 26 | - YAML_V8R 27 | DISABLE_ERRORS_LINTERS: # If errors are found by these linters, they will be considered as non blocking. 28 | - PYTHON_BANDIT # The bandit check is overly broad and complains about subprocess usage. 29 | SHOW_ELAPSED_TIME: true 30 | FILEIO_REPORTER: false 31 | GITHUB_COMMENT_REPORTER: false 32 | UPDATED_SOURCES_REPORTER: true 33 | PRINT_ALPACA: false # Don't print ASCII alpaca in the log 34 | PRINT_ALL_FILES: true # Print all processed files 35 | FLAVOR_SUGGESTIONS: false # Don't show suggestions about different MegaLinter flavors 36 | PYTHON_ISORT_CONFIG_FILE: pyproject.toml 37 | PYTHON_PYRIGHT_CONFIG_FILE: pyproject.toml 38 | PYTHON_RUFF_CONFIG_FILE: pyproject.toml 39 | CPP_CPPLINT_FILE_EXTENSIONS: [".C", ".c", ".c++", ".cc", ".cl", ".cpp", ".cu", ".cuh", ".cxx", ".cxx.in", ".h", ".h++", ".hh", ".h.in", ".hpp", ".hxx", ".inc", ".inl", ".macro"] 40 | CPP_CLANG_FORMAT_FILE_EXTENSIONS: [".C", ".c", ".c++", ".cc", ".cl", ".cpp", ".cu", ".cuh", ".cxx", ".cxx.in", ".h", ".h++", ".hh", ".h.in", ".hpp", ".hxx", ".inc", ".inl", ".macro"] 41 | -------------------------------------------------------------------------------- /.pre-commit-config.yaml: -------------------------------------------------------------------------------- 1 | --- 2 | # See https://pre-commit.com for more information 3 | # See https://pre-commit.com/hooks.html for more hooks 4 | 5 | repos: 6 | - repo: https://github.com/pre-commit/pre-commit-hooks 7 | rev: v5.0.0 8 | hooks: 9 | - id: check-added-large-files 10 | - id: check-ast 11 | - id: check-builtin-literals 12 | - id: check-docstring-first 13 | - id: check-executables-have-shebangs 14 | - id: 
check-merge-conflict 15 | - id: check-symlinks 16 | - id: check-toml 17 | - id: check-yaml 18 | - id: debug-statements 19 | - id: end-of-file-fixer 20 | - id: mixed-line-ending 21 | - id: name-tests-test 22 | - id: requirements-txt-fixer 23 | - id: trailing-whitespace 24 | - repo: https://github.com/astral-sh/ruff-pre-commit 25 | rev: v0.11.2 # ruff version 26 | hooks: 27 | - id: ruff # linter 28 | args: ["--fix"] 29 | - id: ruff-format # formatter 30 | - repo: https://github.com/asottile/pyupgrade 31 | rev: v3.19.1 32 | hooks: 33 | - id: pyupgrade 34 | args: ["--py310-plus"] 35 | - repo: https://github.com/shellcheck-py/shellcheck-py 36 | rev: v0.10.0.1 37 | hooks: 38 | - id: shellcheck 39 | - repo: https://github.com/google/yamlfmt 40 | rev: v0.16.0 41 | hooks: 42 | - id: yamlfmt 43 | - repo: https://github.com/adrienverge/yamllint 44 | rev: v1.36.2 45 | hooks: 46 | - id: yamllint 47 | -------------------------------------------------------------------------------- /.pylintrc: -------------------------------------------------------------------------------- 1 | [FORMAT] 2 | indent-string=' ' 3 | max-line-length=120 4 | 5 | [BASIC] 6 | variable-rgx=(?:(?P[a-z_]+)) 7 | 8 | [TYPECHECK] 9 | generated-members=RdBu 10 | 11 | [DESIGN] 12 | max-args=10 13 | max-locals=40 14 | 15 | [MESSAGES CONTROL] 16 | disable= 17 | useless-suppression, 18 | too-few-public-methods, 19 | too-many-arguments, 20 | too-many-branches, 21 | too-many-instance-attributes, 22 | too-many-lines, 23 | too-many-locals, 24 | too-many-nested-blocks, 25 | too-many-positional-arguments, 26 | too-many-public-methods, 27 | too-many-return-statements, 28 | too-many-statements 29 | 30 | [MISCELLANEOUS] 31 | notes=FIXME,XXX 32 | 33 | [IMPORTS] 34 | ignored-modules=ROOT,yaml,pandas,numpy,shap,uproot 35 | -------------------------------------------------------------------------------- /.vscode/settings.json: -------------------------------------------------------------------------------- 1 | { 2 | 
"editor.rulers": [120,], 3 | "files.trimTrailingWhitespace": true, 4 | "cmake.configureOnOpen": false, 5 | } 6 | -------------------------------------------------------------------------------- /.yamlfmt.yml: -------------------------------------------------------------------------------- 1 | --- 2 | # yamlfmt configuration 3 | # Reference: https://github.com/google/yamlfmt/blob/main/docs/config-file.md#configuration-1 4 | 5 | formatter: 6 | type: basic 7 | indent: 2 8 | include_document_start: true 9 | line_ending: lf 10 | retain_line_breaks_single: true 11 | max_line_length: -1 12 | drop_merge_tag: true 13 | pad_line_comments: 1 14 | trim_trailing_whitespace: true 15 | eof_newline: true 16 | -------------------------------------------------------------------------------- /.yamllint.yml: -------------------------------------------------------------------------------- 1 | --- 2 | # yamllint configuration 3 | # Reference: https://yamllint.readthedocs.io/en/stable/rules.html 4 | 5 | extends: default 6 | rules: 7 | line-length: 8 | max: 120 9 | level: warning 10 | indentation: 11 | spaces: 2 12 | level: warning 13 | comments: 14 | require-starting-space: true 15 | min-spaces-from-content: 1 16 | -------------------------------------------------------------------------------- /CPPLINT.cfg: -------------------------------------------------------------------------------- 1 | filter=-build/c++11,-build/namespaces,-readability/fn_size,-readability/todo,-runtime/references,-whitespace/blank_line,-whitespace/braces,-whitespace/comments,-whitespace/indent_namespace,-whitespace/line_length,-whitespace/semicolon,-whitespace/todo 2 | -------------------------------------------------------------------------------- /MANIFEST.in: -------------------------------------------------------------------------------- 1 | include machine_learning_hep/submission/default_complete.yml 2 | include machine_learning_hep/submission/default_pre.yml 3 | include 
machine_learning_hep/submission/default_train.yml 4 | include machine_learning_hep/submission/default_apply.yml 5 | include machine_learning_hep/submission/default_ana.yml 6 | include machine_learning_hep/data/config_model_parameters.yml 7 | include machine_learning_hep/data/database_run_list.yml 8 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Machine learning package for high-energy physics 2 | 3 | ![LHC Particle](figures/LHCparticle.jpg) 4 | 5 | ## Overview of the package: 6 | This software provides a flexible, modular and easy-to-use package to perform classification using Scikit, XGBoost and Keras algorithms. The first purpose of the package is to provide tools for high-energy physicists to perform optimisation of rare signals produced in ultra-relativistic proton-proton and heavy-ion collisions. 7 | 8 | ## The package (v0) provides tools to: 9 | - convert ROOT datasets into Pandas DataFrames 10 | - create training and testing datasets starting from samples of data and Monte Carlo simulations 11 | - perform principal component analysis 12 | - train and test using Scikit, XGBoost and Keras algorithms 13 | - validate results with a large set of tools behind a user-friendly interface 14 | - convert Pandas DataFrames to ROOT objects, including algorithm decisions and probabilities 15 | 16 | ## Instructions and tutorials 17 | Instructions for installing and running the package are provided in the Wiki section of this repository [wiki](https://github.com/ginnocen/MachineLearningHEP/wiki). 
18 | 19 | ## The ALICE Collaboration at CERN 20 | Visit the collaboration website for more information about studies of hot nuclear matter at the Large Hadron Collider at CERN 21 | http://alice-collaboration.web.cern.ch 22 | 23 | ## Contacts 24 | For any questions please contact 25 | 26 | # Installation 27 | 28 | ## Usage with aliBuild software stack 29 | 30 | This package depends on functionality offered by external packages, e.g. RooUnfold and O2Physics. 31 | In order to use these packages from the aliBuild software stack, you should first install the aliBuild packages, and set up mlhep within the aliBuild environment. 32 | To install the Python dependencies, run the following (from within the aliBuild environment) in the root directory of mlhep: 33 | ``` 34 | python3 -m pip install -r requirements.txt 35 | ``` 36 | -------------------------------------------------------------------------------- /ci/copyright-ignore: -------------------------------------------------------------------------------- 1 | # Applies anyway only to .py files so no need to put yaml and others 2 | 3 | # That's just pytest stuff 4 | ci/pytest 5 | 6 | # And the setup 7 | setup.py 8 | -------------------------------------------------------------------------------- /ci/flake8-ignore: -------------------------------------------------------------------------------- 1 | # Legacy code 2 | machine_learning_hep/analysis/analyzer_jet_legacy.py 3 | machine_learning_hep/analysis/analyzer_Dhadrons.py 4 | 5 | # Don't run flake8 on setup 6 | setup.py 7 | -------------------------------------------------------------------------------- /ci/pylint-ignore: -------------------------------------------------------------------------------- 1 | # Legacy code 2 | machine_learning_hep/analysis/analyzer_jet_legacy.py 3 | machine_learning_hep/analysis/analyzer_Dhadrons.py 4 | machine_learning_hep/analysis/analyzer_back.py 5 | 6 | # Don't pylint setup 7 | setup.py 8 | 
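The three `ci/*-ignore` files above share a simple layout: `#` comment lines, blank lines, and one path per remaining line. How `ci/run-tests.sh` actually consumes them is not shown in this listing, so the following is only a minimal sketch of one plausible reading. The helper names `read_ignore_entries` and `is_ignored` are hypothetical, and the assumption that a directory entry such as `ci/pytest` covers every file below it is exactly that, an assumption.

```python
from pathlib import PurePosixPath


def read_ignore_entries(text: str) -> list[str]:
    """Return the path entries of an ignore file, skipping blank lines and '#' comments."""
    entries = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        entries.append(line)
    return entries


def is_ignored(path: str, entries: list[str]) -> bool:
    """Assumed semantics: an entry matches the path itself or any file below it."""
    candidate = PurePosixPath(path)
    for entry in entries:
        entry_path = PurePosixPath(entry)
        if candidate == entry_path or entry_path in candidate.parents:
            return True
    return False


# Content of ci/copyright-ignore as listed above.
COPYRIGHT_IGNORE = """\
# Applies anyway only to .py files so no need to put yaml and others

# That's just pytest stuff
ci/pytest

# And the setup
setup.py
"""

entries = read_ignore_entries(COPYRIGHT_IGNORE)
print(entries)
print(is_ignored("ci/pytest/test_general.py", entries))
print(is_ignored("machine_learning_hep/io.py", entries))
```

Under these assumptions, `ci/pytest/test_general.py` would be skipped by the copyright check while `machine_learning_hep/io.py` would not; the real script may match entries differently (for example by prefix or by glob).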
-------------------------------------------------------------------------------- /ci/pytest/test_general.py: -------------------------------------------------------------------------------- 1 | from machine_learning_hep.io import parse_yaml 2 | 3 | YAML_PATH = "ci/pytest/yaml.yaml" 4 | 5 | def test_yaml(): 6 | assert isinstance(parse_yaml(YAML_PATH), dict) 7 | -------------------------------------------------------------------------------- /ci/pytest/yaml.yaml: -------------------------------------------------------------------------------- 1 | --- 2 | test: 3 | test1: [42, "test", Null] 4 | test2: 5 | tea: False 6 | coffee: True 7 | list_of_lists: 8 | - [1, 2, 3, 4, 5] 9 | - ["ABC", "DEF", "GHI"] 10 | - [Null, False, True, true, false, null] 11 | test3: 42 12 | -------------------------------------------------------------------------------- /cplusutilities/README.md: -------------------------------------------------------------------------------- 1 | # Getting and processing TTreeCreator output 2 | 3 | Instructions to download the output from the LEGO train (can be run as part of the package or stand-alone), and merge the files (only stand-alone). The instructions assume you are a user of the `aliceml` machine. With some small changes, they are also valid for other systems. 4 | 5 | Completing steps 1) - 4) will make the data ready for being used in the `MLHEP` Python analysis package. 6 | 7 | ## 1) Set up your environment 8 | 9 | Start by logging in 10 | ``` 11 | ssh -X username@lxplus.cern.ch #only when needed 12 | ssh -X username@aliceml 13 | ``` 14 | > Please have a look at section 4) if you want to use these scripts on a local system or a different server, as some packages that are pre-installed on aliceml may need to be installed by yourself first. 
15 | 16 | ### a) Building and loading the virtual environment 17 | 18 | On `aliceml`, create (once) and load your personal virtual environment: 19 | ``` 20 | ml-create-virtualenv #only once to create the environment 21 | ml-activate-virtualenv #start (and enable python) in virtual environment 22 | ml-activate-root #Enable system-wide ROOT installation 23 | ``` 24 | and clone+install this git repository (see https://github.com/ginnocen/MachineLearningHEP/wiki) 25 | 26 | ## 2) Obtain a certificate and `JAliEn` 27 | 28 | Before downloading, one has to enter the JAliEn environment manually. Please make sure your GRID certificates are copied to the server. If you haven't done that, [this tutorial](https://alice-doc.github.io/alice-analysis-tutorial/start/cert.html#convert-your-certificate-for-using-the-grid-tools) explains how to do it. It also lists the steps to obtain a certificate if you don't have one. 29 | 30 | Having the certificate, load the `JAliEn` environment once and exit afterwards. 31 | ``` 32 | jalien 33 | #Enter Grid Certificate password 34 | exit 35 | ``` 36 | > **NB:** If you get the error: "**JBox isn't running, so we won't start JSh.**", your grid certificates probably don't have the right permissions. Correct them in `~/.globus/` using: 37 | 38 | ```bash 39 | chmod 0440 usercert.pem 40 | chmod 0400 userkey.pem 41 | ``` 42 | 43 | ## 3) Downloading productions from the GRID 44 | 45 | In the following, `$MLHEP` is assumed to point to the top directory of the **MLHEP** package. 46 | 47 | Downloading is done with `$MLHEP/cplusutilities/Download-generic.sh`. **Note** that you have to be in that directory to run it, as it uses helper scripts and finds them relative to its own directory (to be updated): 48 | 49 | ```bash 50 | ./Download-generic.sh [] 51 | ``` 52 | 53 | If you run the script without arguments, you will be prompted for them. `` has the format: `_-