Amitabh's Weblog

Machine Learning in High Energy Physics

Event size, data volume, and complexity of recorded data will present new challenges, both qualitative and quantitative, in the post-Higgs-boson era of particle physics. Computational resource utilization and algorithm efficiency are likely to be the bottlenecks that limit the reach of the physics. Machine Learning is currently the candidate being explored to address both issues.

In this article, we discuss Machine Learning (ML) applications in High Energy Physics (HEP), HEP-ML software, hardware constraints for training and inference, and the HEP-ML roadmap.


The targeted areas where ML can find applications in HEP, specifically in Large Hadron Collider (LHC) research, include:

  1. Performance gains for Track Reconstruction and Analysis.
  2. Reduced execution time for computationally expensive subroutines in event simulation, pattern recognition and calibration.
  3. Real-time algorithms, such as the Trigger (L1 and HLT).
  4. Reduced data footprint through data compression, placement and access.

What does the LHC do exactly?

The challenge is to find rare events amid the extremely high ‘pile-up’ expected from the LHC: probing the Standard Model, performing fundamental tests, and searching for new physics by hunting for rare events against a background of extremely complex traces left behind by proton bunch collisions.

How was ML used prior at the LHC?

ML methods are designed to work on large data-sets to reduce the complexity of the data and to find rare features/events. They have provided state-of-the-art implementations for event and particle identification and energy estimation.

The main ML algorithms currently used in particle physics are:

Classification: Supervised Learning that predicts a discrete-valued output. For example, classifying events and particles in data.
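As a minimal sketch of the classification use case: boosted decision trees have long been a workhorse classifier in HEP, so the toy example below trains one on synthetic data. The features here are illustrative stand-ins for real reconstructed quantities, not actual detector variables.

```python
# Toy event classification with a boosted decision tree (BDT).
# The synthetic features stand in for reconstructed event quantities
# (e.g. transverse momenta, invariant masses, isolation variables).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Each row is one "event", each column a reconstructed quantity.
X, y = make_classification(n_samples=2000, n_features=10,
                           n_informative=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

clf = GradientBoostingClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(f"signal/background accuracy: {clf.score(X_test, y_test):.2f}")
```

The discrete output here is the signal/background label; in practice the classifier's continuous score is often cut on instead.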

Regression: Supervised Learning that predicts a continuous-valued output. For example, estimating the energy of a particle from multiple measurements across multiple detectors.
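The energy-estimation example can be sketched with synthetic data: three hypothetical detector layers each see a noisy fraction of a particle's true energy, and a linear regressor learns to combine them. The scale factors and noise levels below are invented for illustration.

```python
# Toy energy regression: combine noisy measurements from several
# (hypothetical) detector layers into a single energy estimate.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
true_energy = rng.uniform(10.0, 100.0, size=1000)  # synthetic, in GeV

# Each "detector" records a scaled, noisy fraction of the energy.
measurements = np.stack([
    0.6 * true_energy + rng.normal(0, 2.0, 1000),
    0.3 * true_energy + rng.normal(0, 1.5, 1000),
    0.1 * true_energy + rng.normal(0, 1.0, 1000),
], axis=1)

reg = Ridge().fit(measurements, true_energy)
pred = reg.predict(measurements)
rms = np.sqrt(np.mean((pred - true_energy) ** 2))
print(f"energy resolution (RMS): {rms:.2f} GeV")
```

Real calorimeter calibration is non-linear and position-dependent, which is where tree ensembles or neural networks replace the linear model used here.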

Training the model is the most expensive step, considering both the time to develop the model and the time to train it. Inference is usually inexpensive.

What is the role for Deep Learning at LHC?

Deep Learning (DL) is useful when we have large data-sets with a large number of features, symmetries, and complex non-linear input-output dependencies. Several DL architectures are in use in particle physics.
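To illustrate the non-linear input-output dependencies DL can capture, the sketch below fits a small fully connected network to a toy non-linearly separable dataset. It stands in for the deeper architectures used in practice (e.g. convolutional networks on calorimeter images); both the dataset and the layer sizes are illustrative choices.

```python
# Toy fully connected neural network on a non-linearly separable
# dataset, standing in for deeper HEP architectures.
from sklearn.datasets import make_moons
from sklearn.neural_network import MLPClassifier

# Two interleaved half-moons: no linear boundary separates them.
X, y = make_moons(n_samples=1000, noise=0.2, random_state=0)

net = MLPClassifier(hidden_layer_sizes=(32, 32), max_iter=2000,
                    random_state=0)
net.fit(X, y)
print(f"training accuracy: {net.score(X, y):.2f}")
```

A linear classifier would plateau well below this accuracy on the same data, which is the point: the hidden layers learn the non-linear boundary.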

Targeted areas for Machine Learning Applications and Research for HEP

  1. Detector Simulation: New particles are discovered by comparing recorded collision data with predictions from Standard Model and beyond-Standard-Model physics. Detector simulators, such as GEANT, help simulate particle trajectories for comparison with the recorded data. Combining the simulated detector response with known particle-matter interaction results, one can proceed to discover new particles. The HL-LHC would require simulating up to trillions of events to test such hypotheses. Simulating one proton-proton collision for the LHC takes several minutes, and this cost, in addition to the higher computational resource requirements, would scale many-fold for HL-LHC simulations.
  2. Real-time Analysis and Triggering:
  3. Object Reconstruction, Identification and Calibration:
  4. End-to-end Deep Learning:
  5. Sustainable Matrix Element Method:
  6. Matrix Element Machine Learning Method:
  7. Learning the Standard Model:
  8. Theory Applications:
  9. Uncertainty Assignment:
  10. Monitoring the Detectors, Hardware Anomaly and Preemptive Maintenance:
  11. Computing Resource Optimization and Control of Networks and Productive Workflows:
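The fast-simulation idea from item 1 can be sketched as a surrogate model: sample an expensive simulator offline, fit a cheap regressor to its outputs, then use the regressor for bulk generation. The `slow_sim` function below is a toy stand-in for a full detector simulation such as GEANT, not a real physics model.

```python
# Hedged sketch of ML-based fast simulation: fit a cheap surrogate
# to the outputs of an expensive simulator, then query the surrogate
# instead. slow_sim is a toy stand-in for a full detector simulation.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def slow_sim(params):
    # Toy "detector response": smooth function of two input parameters,
    # assumed expensive to evaluate in the real setting.
    return np.sin(params[:, 0]) * np.exp(-params[:, 1] ** 2)

rng = np.random.default_rng(0)
train_params = rng.uniform(-2, 2, size=(5000, 2))
train_response = slow_sim(train_params)

surrogate = RandomForestRegressor(n_estimators=50, random_state=0)
surrogate.fit(train_params, train_response)

# Validate the surrogate against the exact simulator on held-out points.
test_params = rng.uniform(-2, 2, size=(1000, 2))
approx = surrogate.predict(test_params)
exact = slow_sim(test_params)
print(f"surrogate RMS error: {np.sqrt(np.mean((approx - exact) ** 2)):.3f}")
```

The payoff in the real setting is that one surrogate evaluation costs microseconds where the full simulation costs minutes, at the price of an approximation error that must be validated as above.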

HEP Machine Learning Software

Hardware Resources and Computing Constraints

HEP ML Roadmap (2017-2022)

Reference: Albertsson, Kim, et al. “Machine learning in high energy physics community white paper.” Journal of Physics: Conference Series. Vol. 1085. No. 2. IOP Publishing, 2018.