Production Python for ML systems. C++ and Rust for performance-critical paths. Go for infrastructure tooling.
Building intelligent systems that ship to production.
End-to-end ML pipelines from training to deployment. Deep experience with LLM orchestration, retrieval-augmented generation, and multi-agent systems.
Embeddings, anomaly detection, and graph-based methods. From exploratory analysis to production feature engineering.
Azure-native deployments with full observability. Container orchestration, CI/CD pipelines, and real-time monitoring.
Projects
mcp-cpp
A C++ SDK for the Model Context Protocol. JSON-RPC-based messaging over STDIO, HTTP, and SSE transports, with a memory-safe design that follows C++17 best practices.
ft-diloco
Fault-tolerant DiLoCo: training a language model across cheap, unreliable machines that sync only rarely, via Meta's torchft. Kill, freeze, or disconnect machines mid-run and convergence survives — 32 replicas weathered 125 faults/hr at 97.7% of fault-free throughput.
CHOP
A mixed-integer linear programming (MILP) solver powered by graph neural networks and reinforcement learning. A deep RL framework that uses GNNs to learn branch-and-bound heuristics for NP-hard problems.
ternfpga
A multiply-free ternary LLM inference engine on a $130 FPGA, benchmarked head-to-head against an RTX 3060 in the same machine: the same decode at an estimated 2.3× less energy per token, by refusing to multiply.
KSU AI Club — Founder & President
Scaled student AI community from 0 to 400+ members through technical workshops and industry partnerships.
Writing
Training a Language Model on Machines That Keep Dying
Multiply-Free: A Ternary LLM Engine on a 130-Dollar FPGA
Convolutions of Grandeur
Branch-and-Bound from First Principles
Take Heed of the Hydra
Autodifferentiation
I build AI systems that survive contact with reality. The kind where you can't just restart the server when your agent loops infinitely, where a hallucinated API call costs real money, and where the gap between a paper's benchmark and a production SLA is measured in months of engineering.
Right now I'm at WM-Synergy, architecting RAG pipelines and observability infrastructure for enterprise AI agents. Before that, I was the founding ML engineer at Neumann Labs, where I designed multi-agent orchestration with the kind of reliability guarantees you'd expect from distributed systems, not chatbots. My background is in mathematics—discrete math and operations research at KSU, now pursuing an M.S. in Computer Science at Georgia Tech.
I write about the things I find interesting: how transformers actually work at the matrix level, why compilers are underrated for understanding computation, and what it takes to make ML systems reliable enough to trust with real decisions. I founded the KSU AI Club and scaled it to 400+ members because I think the best way to learn is to build things alongside other people who care.
Interested in building something ambitious together?
I'm always open to conversations about ML systems engineering, open-source collaboration, and hard technical problems worth solving.
"The best way to predict the future is to implement it."
— David Heinemeier Hansson