
Projects

My research spans nuclear and particle physics, machine learning applications, and the intersection of physics with artificial intelligence. This work demonstrates how fundamental physics insights can drive innovations in AI, while advanced computational methods open new frontiers in experimental physics.


Generative AI: Research and Applied Projects

The Physics of Transformers

2024 - Present
Independent Research

This project treats transformer weights as a dynamical system and applies methods from statistical physics to provide insight into their behavior. The work develops a statistical analysis pipeline that ingests trained models and extracts a set of interpretable metrics to characterize their behavior and track its evolution during training. The framework supports both an empirical characterization of the weights and a physics-motivated interpretation grounded in the correspondence between transformer self-attention and the statistical mechanics of spin systems. The analysis spans multiple architectures (e.g. GPT-2, LLaMA, Mistral) at scales from 70M to 12B parameters, with a temporal study across training checkpoints, and produces open-source code, HuggingFace datasets, and interactive visualization dashboards.
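
One common way to phrase the attention/spin-system correspondence is to read each softmax attention row as a Boltzmann distribution, with energies given by negative query-key dot products and an inverse temperature of 1/sqrt(d). The sketch below illustrates that reading only; the function names and the choice of entropy as a metric are assumptions made for illustration and are not taken from the project's pipeline.

```python
import math

def boltzmann_attention(query, keys, beta=None):
    """Softmax attention weights written as Boltzmann weights exp(-beta * E) / Z.

    Assumption for illustration: E_j = -(query . key_j), and beta defaults to
    the usual 1/sqrt(d) attention scaling.
    """
    d = len(query)
    if beta is None:
        beta = 1.0 / math.sqrt(d)
    energies = [-sum(q * k for q, k in zip(query, key)) for key in keys]
    e_min = min(energies)  # shift energies for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in energies]
    z = sum(weights)  # partition function (up to the constant shift)
    return [w / z for w in weights]

def entropy(probs):
    """Shannon entropy of one attention row, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical toy inputs: one query attending over three keys.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
warm = boltzmann_attention(query, keys, beta=1.0)
cold = boltzmann_attention(query, keys, beta=10.0)
```

In this picture, raising beta (lowering the temperature) concentrates the weights on the lowest-energy key, so the row entropy falls.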

Resources:

📝 Read the series · Code · Dashboard · Data

Spectral Structure in Neural Network Solutions of the KdV Equation

2026 - Present
Independent Research

This project trains physics-informed neural networks to solve the Korteweg–de Vries equation, a prototypical integrable nonlinear wave equation whose soliton solutions arise from a precise balance between nonlinear steepening and dispersion. Rather than imposing conventional boundary conditions on the field, the PINN is driven by scattering data from the inverse scattering transform — eigenvalues and norming constants of an associated Schrödinger operator — making the boundary value problem an inverse spectral problem. The KdV equation’s integrability provides a powerful validation framework: an infinite hierarchy of conservation laws, each computable via autograd, serves as unsupervised diagnostics that are independent of the training loss. The learned solutions preserve these conservation laws locally throughout the domain and, more strikingly, retain the full spectral structure of the Lax pair — isospectrality and eigenfunction dynamics — without any explicit spectral inductive bias in the architecture or loss function. The approach scales to multi-soliton configurations, with an interactive explorer for visualizing soliton interactions, eigenvalue recovery, and eigenfunction evolution in real time.
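
As a small illustration of why conservation laws make useful diagnostics, the sketch below evaluates the first two conserved integrals on the exact one-soliton solution at two different times. It assumes the convention u_t + 6 u u_x + u_xxx = 0 and uses plain trapezoid quadrature on the analytic solution rather than autograd on a trained network; the function names and grid parameters are illustrative choices, not the project's code.

```python
import math

def soliton(x, t, kappa, x0=0.0):
    """Exact one-soliton solution of u_t + 6 u u_x + u_xxx = 0.

    kappa sets both the amplitude (2 kappa^2) and the speed (4 kappa^2).
    """
    return 2.0 * kappa**2 / math.cosh(kappa * (x - 4.0 * kappa**2 * t - x0)) ** 2

def integrate(f, a, b, n=4000):
    """Composite trapezoid rule on [a, b] with n sub-intervals."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h

def conserved_quantities(t, kappa, half_width=40.0):
    """Return (mass, momentum) = (integral of u, integral of u^2) at time t."""
    mass = integrate(lambda x: soliton(x, t, kappa), -half_width, half_width)
    momentum = integrate(lambda x: soliton(x, t, kappa) ** 2, -half_width, half_width)
    return mass, momentum

# For a single soliton the analytic values are 4*kappa and 16*kappa**3/3.
m0, p0 = conserved_quantities(0.0, 1.0)
m1, p1 = conserved_quantities(2.0, 1.0)
```

The same integrals applied to a network's output in place of `soliton` would give a check that does not depend on the training loss, provided the integration window contains the wave.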

Resources:

📝 Read the post · Code · Dashboard

Generative Modeling of Discrete Sequences

2025 - Present
Independent Research

The two projects below treat domain-specific event streams as discrete token sequences: in one, each token is a pitch outcome or game state and each sequence a baseball game; in the other, each token is a song and each sequence a concert setlist. The parallel framing makes them complementary testbeds for questions in mechanistic interpretability: specifically, how structured statistics and domain-specific rule-following emerge from next-token training on sequences with well-defined grammars distinct from natural language.

Baseball Game States

Transformer language models for sequential game state prediction, trained on 3.3M pitch sequences from MLB’s Statcast (2015–present) and Retrosheet’s historical archives (1871–present). State representations range from a 24-state outs/baserunners encoding to approximately 57,000-state encodings that incorporate detailed game context. The evaluation framework assesses not only predictive accuracy but also rule adherence: illegal-transition probes test whether models internalize actual game constraints, and capacity-reduction studies trace the point at which rule-following degrades.
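
The 24-state encoding follows from 3 possible out counts times 8 base-occupancy patterns. The sketch below shows one possible indexing together with a deliberately simplified legality check of the kind an illegal-transition probe might rely on. The function names, bit layout, and the specific rules are assumptions for illustration; the project's actual probes may use different constraints.

```python
def encode_state(outs, on_first, on_second, on_third):
    """Map (outs, base occupancy) to an index in 0..23 (3 outs x 8 base patterns)."""
    if outs not in (0, 1, 2):
        raise ValueError("outs must be 0, 1, or 2 within a half-inning")
    bases = int(on_first) | (int(on_second) << 1) | (int(on_third) << 2)
    return outs * 8 + bases

def decode_state(index):
    """Inverse of encode_state."""
    outs, bases = divmod(index, 8)
    return outs, bool(bases & 1), bool(bases & 2), bool(bases & 4)

def is_plausible_transition(before, after):
    """Necessary (not sufficient) conditions for a same-half-inning transition.

    Simplified rules assumed here:
      1. outs never decrease;
      2. at most one new runner (the batter) can appear, so
         runners_after + outs_made <= runners_before + 1.
    The third out ends the half-inning and is not representable in 24 states.
    """
    outs_b, *bases_b = decode_state(before)
    outs_a, *bases_a = decode_state(after)
    outs_made = outs_a - outs_b
    if outs_made < 0:
        return False
    return sum(bases_a) + outs_made <= sum(bases_b) + 1

empty_0 = encode_state(0, False, False, False)
single = encode_state(0, True, False, False)
loaded_0 = encode_state(0, True, True, True)
```

A probe built on this idea would sum the probability a model assigns to next states for which `is_plausible_transition` returns False.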

Resources:

Code

Grateful Dead Setlists

Supervised fine-tuning of GPT-2 on the Grateful Dead’s 30-year performance history, treated as a corpus of approximately 417 unique song tokens. The data pipeline initially processed Archive.org’s 17,000+ concert recordings, developing fuzzy matching and vocabulary canonicalization techniques for messy real-world data; production training uses cleaned setlist.fm data. Setlists exhibit opener conventions, set-closing sequences, and thematic pairings that experienced listeners recognize but that emerge here statistically from training. The availability of original recordings on Archive.org provides a path to extend this work into the audio modality.
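
Vocabulary canonicalization of the kind described above can be sketched with the standard library's difflib. The canonical list, the normalization rules, and the 0.8 cutoff below are hypothetical choices made for illustration, not the pipeline's actual settings.

```python
import difflib
import re

# Hypothetical three-song vocabulary; the real one has roughly 417 entries.
CANONICAL = ["Dark Star", "Playing in the Band", "Sugar Magnolia"]

def normalize(title):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    title = re.sub(r"[^a-z0-9 ]", "", title.lower())
    return re.sub(r"\s+", " ", title).strip()

# Map each normalized form back to its canonical spelling.
_LOOKUP = {normalize(t): t for t in CANONICAL}

def canonicalize(raw, cutoff=0.8):
    """Return the canonical title for a messy one, or None if nothing is close."""
    key = normalize(raw)
    if key in _LOOKUP:  # exact match after normalization
        return _LOOKUP[key]
    matches = difflib.get_close_matches(key, list(_LOOKUP), n=1, cutoff=cutoff)
    return _LOOKUP[matches[0]] if matches else None

examples = ["Playin' In The Band", "  DARK  STAR ", "Darkstar", "Drums"]
resolved = [canonicalize(e) for e in examples]
```

Titles that return None are better routed to manual review than forced into the vocabulary, since a cutoff that is too low silently merges distinct songs.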

Resources:

Code

Skills: Pretraining, SFT, NLP, Gen AI, LLM


Machine Learning & AI Applications in Physics

AI-Informed Detector Design

Jan 2021 - Jan 2024
Lawrence Livermore National Laboratory

This work uses deep learning as a new tool to guide the design of detectors for collider physics experiments, moving from traditional engineering approaches toward ML-optimized detector configurations.

Key Achievements:

Skills: Machine Learning, Deep Learning, Generative AI Tools, Experimental Physics, Data Analysis

Publications:

Improving Particle Reconstruction with Deep Learning

Jan 2019 - Jan 2022
Lawrence Livermore National Laboratory · ATLAS Experiment at CERN

We used deep learning to improve particle identification and reconstruction in the ATLAS calorimeter. We compared convolutional, graph, and transformer architectures and achieved major improvements in energy calibration by using the full spatial information from electromagnetic and hadronic showers.

Key Achievements:

Skills: Experimental Physics, Data Analysis, Machine Learning

Code Repositories:

Publications:


Heavy-Ion Physics & QCD

Jet Energy Loss and Substructure

Jan 2019 - Jan 2022
Lawrence Livermore National Laboratory · ATLAS Experiment at CERN

Can we experimentally observe whether wide jets lose more energy than narrow ones?

This project explored fundamental questions about how jets interact with the quark-gluon plasma, providing new insights into the substructure dependence of energy loss mechanisms.

Skills: Experimental Physics, Data Analysis, Research Projects, Analytical Skills, Uncertainty Quantification

Publications:

Recognition:

New Approaches to Ultra-Peripheral Collisions

Lawrence Livermore National Laboratory · ATLAS Experiment at CERN

Advanced studies of ultra-peripheral heavy-ion collisions, exploring photonuclear processes and novel QCD phenomena in extreme electromagnetic field environments.

Skills: Experimental Physics, Data Analysis, Analytical Skills, Uncertainty Quantification, Monte Carlo Simulation, High Performance Computing

Publications:

Observation of Jet Quenching at the LHC

Columbia University · ATLAS Experiment at CERN

Landmark discovery that ushered in the LHC era of heavy-ion physics

In the first Pb+Pb collisions at the LHC, the ATLAS experiment observed highly imbalanced dijet pairs, providing the first direct evidence of jet quenching in the quark-gluon plasma at unprecedented energies.
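
Dijet imbalance is conventionally quantified by the asymmetry A_J = (E_T1 - E_T2) / (E_T1 + E_T2), where E_T1 and E_T2 are the transverse energies of the leading and subleading jets. The sketch below computes it; the function name and the example energies are hypothetical and are not measured values.

```python
def dijet_asymmetry(et_leading, et_subleading):
    """A_J = (E_T1 - E_T2) / (E_T1 + E_T2), with E_T1 >= E_T2 > 0.

    Returns 0 for a perfectly balanced pair and approaches 1 as the
    subleading jet loses most of its energy relative to the leading jet.
    """
    if et_leading <= 0 or et_subleading <= 0:
        raise ValueError("transverse energies must be positive")
    if et_subleading > et_leading:  # enforce the leading/subleading ordering
        et_leading, et_subleading = et_subleading, et_leading
    return (et_leading - et_subleading) / (et_leading + et_subleading)

# Illustrative energies in GeV (made-up numbers).
balanced = dijet_asymmetry(100.0, 100.0)
quenched = dijet_asymmetry(100.0, 50.0)
```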

Skills: Independent Research, Data Analysis

Publications:

Precision Measurements of Jet Quenching

Columbia University · ATLAS Experiment at CERN

Systematic studies of jet suppression phenomena, establishing quantitative frameworks for understanding energy loss in the quark-gluon plasma.

Skills: Experimental Physics, Data Analysis, Uncertainty Quantification, Monte Carlo Simulation, Analytical Skills

Publications: