Theodore Dyer
I build data pipelines and ETL systems for healthcare and AI/ML. Currently shipping infrastructure for PHI datasets; previously at Apple working on data for Siri and ML teams.
About me
I'm a software / data engineer who builds the pipelines and infrastructure that make ML and analytics possible. My career has spanned healthcare data, big-tech AI infrastructure, and medical device data systems.
These days I'm at IntusCare, building ETL for complex PHI datasets with dbt, Airflow, and Airbyte. Before that I spent a year and a half at Apple on AI/ML data ops — leading a multi-format booking-data pipeline and designing automated PII detection across 12+ locales.
Off the clock I'm usually surfing, rock climbing, or hanging out with my dog Ghost. Lately I've been teaching myself game development in Godot.
Where I've worked
Software Engineer II · Data @ IntusCare · Providence, RI · Remote
- Build and maintain ETL pipelines for complex PHI datasets using dbt, Airflow, and Airbyte.
- Analyze and streamline existing early-stage startup processes to reduce tech debt and minimize bugs.
- Collaborate closely with product and web teams to define pipeline requirements and customer data needs.
- Mentor junior engineers; overhauled the technical onboarding process.
stack: Python · SQL · dbt · Airflow · Airbyte · Terraform · Grafana
AI/ML Data Engineer @ Apple · San Diego, CA · Contract via Inspyr Solutions
- Led architecture and development of a complex ETL project for multi-format booking data supporting ML teams.
- Designed automated global PII detection using the Gemini API and custom prompts to reliably flag sensitive information across 12+ locales, decreasing manual PII flagging by graders by approximately 95%.
- Documented full project architecture and feature roadmap; used it to onboard and lead two engineers for continued development.
- Overhauled database architecture, redesigned schemas, and handled data migration for Siri benchmarking labs, automating transcription pipelines and reducing manual testing overhead across worldwide lab sites.
stack: Python · SQL · LLMs · Docker · Airflow · AWS (S3, Glue) · Git
Data Engineer @ Quidelortho · San Diego, CA
- Single-handedly designed, developed, and deployed a complex medical data system: an automated ETL pipeline, data modeling, and a secure Flask web app for data interaction.
- Managed full project lifecycle: tech stack selection, system architecture, version control, unit testing, Azure DevOps setup, and Azure cloud cost optimization.
stack: Azure (Logic Apps, Cosmos DB, Functions) · Python · Flask · Docker · Git
Software Engineer Intern @ Humanyze · Boston, MA
- Reengineered the existing company data pipeline using PySpark and S3 to address PostgreSQL scaling issues for larger clients.
stack: PySpark · Amazon S3 · EMR · PostgreSQL · Git
Data Science Intern (Lead) @ OpenPath · Irvine, CA
- Led standup meetings and contributed to e-commerce data standardization and fraud detection projects.
stack: SQL Server · Python · Pandas · scikit-learn · Tableau
Education
Tools & tech
Selected work
rheal
Roguelike dungeon crawler built in Godot. A personal sandbox for procedural generation, combat systems, and game dev fundamentals.
Quantum ML
Graduate research paper exploring the intersection of quantum computing and machine learning. Submitted as part of my Johns Hopkins MS coursework.
Mask Detector
A browser-based CNN that detects whether you're wearing a face mask, built with ml5.js. Trained on a small custom dataset and deployed live.
Everything else
More projects, experiments, and learning repos live on my GitHub. Pull requests, stars, and curiosity welcome.