Experiment Tracking with MLflow and Langfuse
Set up experiment tracking for ML models with MLflow and LLM observability with Langfuse. Includes hyperparameter sweeps, model registry, and cost tracking.
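One piece of the cost-tracking side can be sketched in plain Python: compute a per-call dollar cost from token counts, the kind of number you would attach to a Langfuse trace as metadata. The model name and prices below are illustrative placeholders, not real rates.

```python
# Hypothetical USD rates per 1,000 tokens -- placeholders, not real pricing.
PRICE_PER_1K = {
    "example-model": {"input": 0.003, "output": 0.015},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Compute the cost of one LLM call from its token counts."""
    rates = PRICE_PER_1K[model]
    return (input_tokens / 1000) * rates["input"] + \
           (output_tokens / 1000) * rates["output"]

# Accumulate across a session, as an observability layer would.
calls = [(1200, 300), (800, 150)]
total = sum(call_cost("example-model", i, o) for i, o in calls)
print(f"session cost: ${total:.4f}")
```

In practice you would log this number alongside each trace rather than print it, so cost per user, per feature, or per prompt version can be aggregated later.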
Tested, explained, with code that runs
When new models drop or interesting papers come out, I spin up the GPUs, implement the ideas, and report back what actually works. These are practical guides with runnable code, written from the Coast of Somewhere Beautiful. I learn by building, and I'm here to help you do the same.
NEW Article: What It Takes to Be a Senior Machine Learning Engineer
[NEW] Build a complete ML pipeline with GitHub Actions: data validation, model training, automated testing, and staged deployment to production.
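The data-validation gate in such a pipeline can be as simple as a script CI runs before training, failing the job when rows violate the schema. The column names and rules below are made-up examples, not from the article.

```python
def validate_rows(rows, required=("feature_a", "feature_b", "label")):
    """Return a list of human-readable problems; an empty list means pass."""
    problems = []
    for i, row in enumerate(rows):
        # Required columns must be present and non-null.
        missing = [c for c in required if row.get(c) is None]
        if missing:
            problems.append(f"row {i}: missing {missing}")
        # Labels must come from the expected set.
        label = row.get("label")
        if label is not None and label not in (0, 1):
            problems.append(f"row {i}: label {label!r} not in {{0, 1}}")
    return problems

good = [{"feature_a": 1.0, "feature_b": 2.0, "label": 1}]
bad = [{"feature_a": 1.0, "feature_b": None, "label": 3}]
assert validate_rows(good) == []
assert len(validate_rows(bad)) == 2
```

In a workflow, the CI step would call this and exit nonzero when the problem list is non-empty, blocking training on bad data.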
[NEW] Deploy ML models to production with optimized inference: torch.compile vs ONNX benchmarks, FastAPI serving patterns, and AWS deployment options.
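Benchmarks like torch.compile vs ONNX come down to a careful latency harness. A minimal, dependency-free sketch of that harness, with stand-in workloads instead of real models, looks like this:

```python
import time
import statistics

def bench(fn, warmup=3, iters=20):
    """Return median latency in milliseconds after warmup runs."""
    for _ in range(warmup):
        fn()  # warm caches / JIT before measuring
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)

# Stand-ins for two inference backends (e.g. eager vs compiled).
fast = lambda: sum(range(1_000))
slow = lambda: sum(range(100_000))
assert bench(fast) < bench(slow)
```

Median (not mean) latency is the usual choice here because a single GC pause or scheduler hiccup can badly skew an average over few iterations.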
[NEW] Monitor production ML models with data drift detection, performance tracking, and automated alerting. Includes working Python implementations.
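The core of a drift check is comparing the production feature distribution against the training one. A minimal, dependency-free version of the two-sample Kolmogorov–Smirnov statistic (the maximum gap between empirical CDFs) is a sketch of that idea; a production system would also compute a p-value and wire the result to alerting.

```python
import bisect

def ks_statistic(a, b):
    """Two-sample KS statistic: max gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    xs = sorted(set(a) | set(b))
    def ecdf(sample, x):
        # Fraction of the sample <= x.
        return bisect.bisect_right(sample, x) / len(sample)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in xs)

reference = [i / 100 for i in range(100)]        # training distribution
drifted = [0.5 + i / 100 for i in range(100)]    # shifted in production
assert ks_statistic(reference, reference) == 0.0
assert ks_statistic(reference, drifted) > 0.4
```

A monitoring job would run this per feature on a schedule and alert when the statistic crosses a threshold tuned on historical data.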
[NEW] A roadmap to the skills, knowledge, and practices that separate senior MLEs from the rest, with links to hands-on tutorials for each area.
[FEATURED] Technical comparison of DeepSeek V3.2, Llama 4, Gemini 3, and Qwen3 architectures—plus DeepSeek's mHC innovation expected in V4.
[FEATURED] From DeepSeek's January bombshell to vibe coding going mainstream, here's what actually changed for AI practitioners in 2025.
Nvidia's Llama Nemotron RAG models are purpose-built for multimodal search and visual document retrieval tasks, combining vision and language capabilities for improved accuracy. This release offers practical value for practitioners implementing production RAG systems, particularly those handling mixed-media documents. The article likely covers model architecture, performance benchmarks, and implementation guidance relevant to building retrieval systems at scale.
InfiAgent addresses a critical production challenge for LLM agents: managing unbounded context growth and error accumulation during long-horizon tasks. The framework externalizes persistent state into a file-centric abstraction, offering practical solutions for deploying agents at scale without sacrificing reasoning stability—directly applicable to building robust agentic systems in production environments.
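The file-centric idea can be sketched in a few lines: instead of keeping a long-horizon agent's working memory in the prompt, persist it to disk and reload only what the current step needs. This is an illustration of the concept, not InfiAgent's actual abstraction; the class and schema below are invented.

```python
import json
import os
import tempfile

class FileState:
    """Toy file-backed state store standing in for prompt-resident memory."""
    def __init__(self, path):
        self.path = path

    def write(self, key, value):
        state = self.read_all()
        state[key] = value
        with open(self.path, "w") as f:
            json.dump(state, f)

    def read_all(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

path = os.path.join(tempfile.mkdtemp(), "agent_state.json")
state = FileState(path)
state.write("plan", ["fetch data", "summarize"])
state.write("step", 1)
# A fresh object (think: a new context window) sees the persisted state.
assert FileState(path).read_all()["step"] == 1
```

Because state lives outside the context window, the prompt stays bounded no matter how many steps the task takes, which is the stability property the paper targets.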
This systematic study evaluates embedding similarity metrics for predicting cross-lingual transfer success across African languages, providing practical guidance for selecting source languages in low-resource NLP scenarios. The findings on cosine gap and retrieval-based metrics (P@1, CSLS) offer actionable insights for practitioners building multilingual systems and optimizing transfer learning strategies. Relevant for those working with embeddings and retrieval systems in production ML contexts.
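P@1 over aligned embedding pairs, one of the retrieval metrics the study evaluates, is easy to illustrate: for each target-language embedding, check whether its nearest source-language neighbour is the correct pair. The 3-d vectors below are toy examples, not real multilingual embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def precision_at_1(source, target):
    """source[i] and target[i] are assumed to be aligned pairs."""
    hits = 0
    for i, t in enumerate(target):
        # Nearest source embedding by cosine similarity.
        best = max(range(len(source)), key=lambda j: cosine(t, source[j]))
        hits += best == i
    return hits / len(target)

source = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
target = [(0.9, 0.1, 0.0), (0.1, 0.9, 0.0), (0.0, 0.2, 0.8)]
assert precision_at_1(source, target) == 1.0
```

A high P@1 between two languages' embedding spaces is the kind of signal the paper uses to predict that transfer from one to the other will work well.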
This paper addresses practical enterprise search challenges by demonstrating how to fine-tune small language models for relevance labeling at scale, achieving quality comparable to LLMs with better efficiency. Directly applicable to production ML systems requiring domain-specific relevance ranking without the cost of large model inference. Combines fine-tuning techniques with retrieval system optimization, making it valuable for practitioners building scalable search and RAG pipelines.
MemRL presents a method for LLMs to self-improve through episodic memory and reinforcement learning, addressing limitations of fine-tuning and passive retrieval. The approach combines memory-based retrieval with active learning signals, relevant for building adaptive AI agents and RAG systems that evolve without catastrophic forgetting. Practical value for practitioners implementing production agents that need continuous improvement without expensive retraining.
This paper addresses practical table question answering using smaller, open-weight LLMs that can run locally, eliminating costly API dependencies. Directly relevant for practitioners deploying LLM-based systems in production environments with resource constraints, demonstrating how to achieve competitive performance with accessible models rather than proprietary large-scale alternatives.
Build a semantic search system to find historically similar College Football Playoff games using Amazon S3 Vectors and Bedrock embeddings.
Create a shared Christmas tree where visitors add AI-generated ornaments using Amazon Nova Canvas, with defense-in-depth content moderation using Bedrock Guardrails and Claude.
Create an AI bartender that suggests cocktails based on weather, searches by ingredient, and generates party menus with shopping lists.
Create an AI agent that combines tide, weather, and marine data to generate fishing reports. Learn tool-calling patterns with the Strands SDK, NOAA APIs, and Claude on AWS Bedrock.
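The tool-calling pattern underneath an agent like this is framework-agnostic: the model emits a tool name plus arguments, and a runtime dispatches to a registered function. The sketch below mocks the model output and the NOAA call; the tool name, station ID, and payload are invented, and this is not the Strands SDK's API.

```python
TOOLS = {}

def tool(fn):
    """Register a function so the agent runtime can call it by name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_tide(station: str) -> dict:
    # Stand-in for a real NOAA API call.
    return {"station": station, "high_tide": "06:12"}

def dispatch(call: dict):
    """Route a structured tool call to the registered implementation."""
    return TOOLS[call["name"]](**call["arguments"])

# What an LLM's structured tool-call output might look like:
model_output = {"name": "get_tide", "arguments": {"station": "8454000"}}
result = dispatch(model_output)
assert result["high_tide"] == "06:12"
```

An SDK like Strands adds schema generation, retries, and the model loop around this core, but the register-then-dispatch shape is the same.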