Alan Casallas

MIT-trained computer scientist with expertise in large-scale systems and machine learning, including hands-on work with multimodal and language models.


Master’s in Computer Science from MIT, with 5 years deploying large-scale systems at Oracle.


ABOUT ME

Hi! I’m Alan Casallas, a software engineer.

I earned my Master’s from MIT in 2019 and spent five years at Oracle building large-scale systems. More recently, I’ve been working on large language models (LLMs) and multimodal systems.

Machine Learning Engineer (Master’s in Computer Science, MIT) with 6 years of experience designing ML systems, predictive models, and large-scale data pipelines (Kafka/Airflow/Spark).

Proven track record in architecting high-QPS, low-latency systems on AWS and Kubernetes. Deep expertise in transformers, LLM fine-tuning, and multimodal architectures.


MY PROJECTS

Building from scratch, exploring new frontiers.

A selection of technical projects spanning machine learning, robotics, and hardware design. These projects reflect my focus on building systems from the ground up, whether designing large language models, creating multimodal architectures, or engineering robots and sensors. Each one highlights both the technical depth and the hands-on problem-solving behind it.

An LLM built from scratch.

I built CasaLLM, a 350M-parameter LLM assembled from basic PyTorch building blocks.

It underwent pretraining, fine-tuning, and RLHF over the course of 70 hours, and includes implementations of RoPE embeddings and KV caching.
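For readers unfamiliar with RoPE, here is a minimal sketch of the idea in plain Python. It is not CasaLLM's code: the function names, the interleaved pairing of dimensions, and the base of 10000 are assumptions chosen for illustration. Each pair of dimensions is rotated by an angle proportional to the token position, so the dot product between a rotated query and a rotated key depends only on how far apart the two positions are.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate each consecutive pair (vec[i], vec[i+1]) by position * base**(-i/d)."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even embedding dimension")
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)  # lower dimensions rotate faster
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Hypothetical query and key vectors.
q = [0.5, -1.0, 2.0, 0.25]
k = [1.5, 0.3, -0.7, 1.0]

# Same offset (4) at very different absolute positions.
score_near = dot(rope(q, 7), rope(k, 3))
score_far = dot(rope(q, 104), rope(k, 100))
print(abs(score_near - score_far) < 1e-9)
```

Because the attention score only sees the relative offset, this scheme pairs naturally with KV caching: a key rotated once at its own position can be stored and reused for every later query.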

See the full blogpost or try the live demo.

Custom CLIP

I designed and trained a custom implementation of CLIP, the contrastive image-text model that serves as the foundation for many of today’s multimodal systems.

This was a 34-million-parameter model coded from scratch using PyTorch and trained on 3 million image-caption pairs.

I used an RNN as the text encoder to explore how a recurrent model performs in the CLIP architecture.
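The training objective behind CLIP-style models can be sketched in a few lines of plain Python. This is an illustration rather than the project's PyTorch code: the function names are hypothetical, and the temperature is fixed at 0.07 here for simplicity, whereas CLIP as published learns it. Matching image-caption pairs sit on the diagonal of a similarity matrix, and the loss is the average of a cross-entropy over rows (image to text) and over columns (text to image).

```python
import math

def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy(logits, target):
    """Negative log-softmax of the target entry, computed stably."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def clip_loss(image_embs, text_embs, temperature=0.07):
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[r][c] = cosine similarity of image r and caption c, scaled by temperature
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    image_to_text = sum(cross_entropy(logits[r], r) for r in range(n)) / n
    text_to_image = sum(
        cross_entropy([logits[r][c] for r in range(n)], c) for c in range(n)
    ) / n
    return (image_to_text + text_to_image) / 2

# Toy embeddings: captions that line up with their images, then swapped.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = clip_loss(images, [[2.0, 0.0], [0.0, 3.0]])
swapped_loss = clip_loss(images, [[0.0, 3.0], [2.0, 0.0]])
print(aligned_loss < swapped_loss)
```

The loss only sees the final embeddings, which is why the text encoder can be swapped for an RNN without changing the objective.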

See the full blogpost.

Contactless Current and Voltage Detector

For my Master’s thesis at MIT, I developed a contactless detector that used signal processing and machine learning to infer current and voltage levels in a set of cables.

The goal was to replace expensive Hall effect sensors with inexpensive magnetic field point measurements, using algorithmic methods to remove interference from external magnetic field sources.
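To give a feel for the problem, here is a toy sketch in plain Python. It is not the method from the thesis, which is not described on this page. It assumes idealized long straight wires, sensors placed on the same line as the wires, a single uniform external field, and noise-free readings; all positions, currents, and names are hypothetical. Under those assumptions each reading is a linear combination of the wire currents plus a constant offset, so a least-squares fit recovers the currents and the interference together.

```python
K_UT_M_PER_A = 0.2  # mu0 / (2*pi), in microtesla-metres per ampere

def model_row(sensor_x, wire_x):
    """Field per ampere from each wire at one sensor, plus 1 for the uniform offset."""
    return [K_UT_M_PER_A / (sensor_x - wx) for wx in wire_x] + [1.0]

def solve_linear(a, b):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def estimate(sensor_xs, wire_x, readings_ut):
    """Least-squares fit via the normal equations: [currents..., external field]."""
    rows = [model_row(sx, wire_x) for sx in sensor_xs]
    n = len(rows[0])
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * y for r, y in zip(rows, readings_ut)) for i in range(n)]
    return solve_linear(ata, atb)

wire_x = [-0.01, 0.01]                # wire positions in metres (hypothetical)
sensor_xs = [-0.03, 0.0, 0.03, 0.05]  # sensor positions in metres (hypothetical)
truth = [3.0, -1.5, 40.0]             # amperes, amperes, microtesla of interference

# Simulate readings from the same forward model, then invert them.
readings = [sum(c * v for c, v in zip(model_row(sx, wire_x), truth)) for sx in sensor_xs]
estimated = estimate(sensor_xs, wire_x, readings)
print([round(v, 6) for v in estimated])
```

Using more sensors than unknowns is what lets the fit separate the wires' contributions from the external field. Real measurements would add noise, geometry errors, and non-uniform interference that this sketch ignores.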

US Patent No. 12085591

See the full thesis.


BEYOND THE CODE

A glimpse into the places, moments, and experiences that shape who I am outside of engineering.

Let’s Connect