Machine Learning + Space

Production ML Systems · Data Pipelines · Cloud Infrastructure

End-to-end ML systems: data, training, deployment, maintenance

I design and deploy production ML systems with experience in physics-informed modeling, large-scale data integration, and cloud infrastructure. Background in astrophysics and space-adjacent engineering.


Focus Areas

  • Production ML systems & automation pipelines
  • Scientific ML with simulation-based training
  • Large-scale data integration & statistical analysis
  • Infrastructure systems & institutional research

Core ML Stack

Python · PyTorch · NumPy · SciPy
AWS (EC2, S3, Batch) · Docker · Linux · HPC
Data Pipelines · Model Evaluation · Automation


Experience

UCSD Astrophysics · ML for gravitational-wave (GW) analysis
Simulation-based training and uncertainty modeling

General Atomics · AI infrastructure
Hybrid on-prem + cloud ML systems

SPACES UCSD · Data-driven ops at scale
Analytics and automation for an 890+ student program


Featured Projects

Astronomy Club Newsletter Automation

Production ML pipeline | Weekly automation | Python, APIs, LLMs | Zero-touch ops

Designed a zero-touch weekly production pipeline using Python, external APIs, and LLMs that runs unattended with monitoring and failure recovery. Reduced manual operations time to zero while maintaining editorial oversight through automated draft generation. Orchestrates 5+ external services (including Calendar, Gmail, RSS, and weather APIs) with graceful degradation.
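
As a minimal sketch of the graceful-degradation idea (not the production code), each external service can be wrapped so that one failure leaves a gap in the draft instead of aborting the weekly run. The function names `collect_sections` and `build_draft`, the section names, and the placeholder wording are all hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple


def collect_sections(
    sources: Dict[str, Callable[[], str]]
) -> Tuple[Dict[str, str], List[str]]:
    """Call every source; keep what succeeds and record what failed."""
    sections: Dict[str, str] = {}
    failed: List[str] = []
    for name, fetch in sources.items():
        try:
            sections[name] = fetch()
        except Exception as exc:  # one failing service must not abort the run
            failed.append(f"{name}: {exc}")
    return sections, failed


def build_draft(
    sections: Dict[str, str], failed: List[str], order: List[str]
) -> str:
    """Assemble the draft in a fixed order, with placeholders for gaps."""
    blocks: List[str] = []
    for name in order:
        body = sections.get(name, "(section unavailable this week)")
        blocks.append(f"== {name} ==\n{body}")
    if failed:
        # Surface failures to the human editor instead of hiding them.
        blocks.append("Editor note: " + "; ".join(failed))
    return "\n\n".join(blocks)
```

In this sketch the failure list travels with the draft, so whoever reviews it can see which sections degraded and why.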


JWST Gas Morphology Classifier

Physics-informed CNN | Simulated JWST data | PyTorch | 92% accuracy

Built an end-to-end PyTorch pipeline for classifying 5 distinct gas morphologies in simulated JWST 4-channel NIRCam data, achieving 92% accuracy (macro-F1 = 0.91). Physics-informed augmentation (PSF convolution, viewing angles, detector noise) improved performance by 6.7%. Designed for transfer learning to real JWST observations, with interpretability tools for failure analysis.
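
To make the augmentation idea concrete, here is a rough NumPy sketch under loud assumptions: a Gaussian kernel stands in for the PSF (real NIRCam PSFs are not Gaussian), a random 90-degree rotation stands in for viewing-angle variation, and additive Gaussian noise stands in for detector noise. The function names, FWHM range, and noise level are hypothetical, and the project's actual augmentation may differ.

```python
import numpy as np


def gaussian_psf(size: int = 9, fwhm: float = 2.5) -> np.ndarray:
    """Normalized 2-D Gaussian kernel (a stand-in for a real PSF model)."""
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    ax = np.arange(size) - (size - 1) / 2.0
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()


def convolve_fft(image: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Circular convolution of one 2-D channel with a small kernel via FFT."""
    padded = np.zeros(image.shape, dtype=float)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Move the kernel center to pixel (0, 0) so the output is not shifted.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))


def augment(
    cube: np.ndarray,
    rng: np.random.Generator,
    fwhm_range: tuple = (1.5, 3.5),
    read_noise: float = 0.01,
) -> np.ndarray:
    """Augment one (channels, H, W) sample: rotate, blur, add noise."""
    k = int(rng.integers(0, 4))  # random multiple of 90 degrees
    out = np.rot90(cube, k=k, axes=(1, 2)).astype(float)
    psf = gaussian_psf(fwhm=rng.uniform(*fwhm_range))
    out = np.stack([convolve_fft(channel, psf) for channel in out])
    # Additive Gaussian noise as a crude detector-noise proxy.
    return out + rng.normal(0.0, read_noise, size=out.shape)
```

Because the kernel is normalized, the blur step conserves total flux; only the noise term changes it. Applied to square 4-channel cubes, the output keeps the input shape and can feed a PyTorch `Dataset` after conversion to a tensor.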


Renewable Energy ROI Analysis

Multi-source data pipeline | Federal datasets | Python, statistical modeling | 50-state analysis

Integrated 50-state economic data from multiple federal sources (EIA, Census, NREL) into a unified analysis pipeline. Built custom feasibility scoring models combining resource potential, infrastructure costs, and economic factors. Delivered actionable state-level rankings with reproducible methodology and statistical validation.
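
One simple way to combine such factors, shown only as an illustrative sketch, is a weighted sum of min-max normalized indicators with cost inverted so that cheaper scores higher. The weights, function names, and this particular scoring form are assumptions, not the project's actual model.

```python
from typing import Dict, List, Tuple


def min_max(values: Dict[str, float]) -> Dict[str, float]:
    """Rescale an indicator to [0, 1] across states."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:  # no spread: treat every state as neutral
        return {state: 0.5 for state in values}
    return {state: (v - lo) / (hi - lo) for state, v in values.items()}


def feasibility_scores(
    resource: Dict[str, float],
    cost: Dict[str, float],
    economic: Dict[str, float],
    weights: Tuple[float, float, float] = (0.4, 0.3, 0.3),  # hypothetical
) -> Dict[str, float]:
    """Weighted sum of normalized indicators; higher cost lowers the score."""
    r, c, e = min_max(resource), min_max(cost), min_max(economic)
    w_r, w_c, w_e = weights
    return {
        state: w_r * r[state] + w_c * (1.0 - c[state]) + w_e * e[state]
        for state in resource
    }


def rank(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort states from most to least feasible."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Each input maps a state identifier to one indicator value. Min-max scaling keeps indicators with different units comparable, at the cost of sensitivity to outliers.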


OER Integration at UC San Diego

Systems research | Infrastructure design | Cross-functional coordination | Enterprise scale

Led institutional research translating equity goals into technical infrastructure requirements for campus-wide OER adoption. Coordinated across 4 departments (IT, Library, Academic Affairs, Student Services) to design scalable integration with existing LMS and procurement systems. Delivered an actionable roadmap with implementation timeline and resource requirements.


Recent Posts

OER Integration at UC San Diego

4 minute read

Institutional research and infrastructure planning for Open Educational Resource adoption, translating equity goals into actionable systems recommendations.

Astronomy Club Newsletter Automation

2 minute read

Fully automated communications pipeline with LLM-based news curation, weather forecasting, and event aggregation, designed for production reliability and ope...