The Hurdles of AI Engineering: 5 Mistakes to Avoid When Building with Foundation Models
Have you ever been blown away by a generative AI demo—like a chatbot that writes poetry or a tool that churns out marketing copy—only to realize that turning it into something reliable is a different challenge? If so, you're not alone. Many teams jump into building AI applications with foundation models (think GPT or Llama) with big dreams, only to trip over the same hurdles. Drawing from my experience and insights from Chip Huyen's book AI Engineering: Building AI Applications with Foundation Models, here are five common mistakes to dodge—and how to fix them.
1. Underestimating the Difficulty of Evaluation
Evaluating AI isn't like flipping a switch to see if it works. With generative models, failures can be subtle, like a slightly off-tone email or a summary that misses the point. These "silent failures" sneak by unnoticed at first, but they can chip away at user trust over time.
Fix it: Put evaluation front and center from the start. Think of it like taste-testing a recipe as you cook, not just at the end. Focus on metrics that tie to your goals—like whether your AI cuts customer support calls by 20%—rather than getting lost in geeky stats like "perplexity."
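One way to make that concrete is a tiny evaluation harness that checks outputs against a rubric tied to your goal. This is a minimal sketch: `generate_reply` is a placeholder for your actual model call, and the rubric here is a deliberately simple "must mention these terms" check.

```python
def generate_reply(ticket: str) -> str:
    # Stand-in for a foundation-model call.
    return "You can reset your password from the account settings page."

def passes_rubric(reply: str, must_mention: list[str]) -> bool:
    """Catch 'silent failures': the reply must cover every required point."""
    return all(term.lower() in reply.lower() for term in must_mention)

# Each case ties evaluation to the business goal (deflecting support
# tickets), not to abstract metrics like perplexity.
cases = [
    {"ticket": "How do I reset my password?",
     "must_mention": ["reset", "password"]},
    {"ticket": "Where do I change my email?",
     "must_mention": ["settings", "email"]},
]

results = [passes_rubric(generate_reply(c["ticket"]), c["must_mention"])
           for c in cases]
pass_rate = sum(results) / len(results)
print(f"pass rate: {pass_rate:.0%}")  # prints "pass rate: 50%"
```

Run this on every prompt or model change, like taste-testing as you cook: the number you track is a task pass rate, not a modeling statistic.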
2. Not Taking Prompt Engineering Seriously
Prompts are the magic words that steer your AI. A small tweak—like changing "explain this" to "break this down simply"—can mean the difference between a spot-on answer and a wild tangent.
Fix it: Treat prompts like a key ingredient in your product, not a throwaway line. Test them, tweak them, and keep a record of what works. Imagine prompts as the steering wheel of your AI—grip it with intention.
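"Keep a record of what works" can be as simple as storing prompts as versioned templates instead of inline strings. A hypothetical sketch (the registry and task names are illustrative, not a real library):

```python
# Prompts as versioned product assets: every variant is recorded,
# so you can test v1 against v2 and roll back if a change regresses.
PROMPTS = {
    ("summarize", "v1"): "Explain this: {text}",
    ("summarize", "v2"): "Break this down simply, in two sentences: {text}",
}

def render(task: str, version: str, **kwargs) -> str:
    """Fetch a recorded prompt and fill in its variables."""
    return PROMPTS[(task, version)].format(**kwargs)

prompt = render("summarize", "v2", text="Transformers use attention.")
print(prompt)
# prints "Break this down simply, in two sentences: Transformers use attention."
```

Because every version lives in one place, you can run your evaluation suite over `v1` and `v2` and keep whichever scores better.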
3. Clinging to the Old Data-First Workflow
In the old days of machine learning, you'd hoard data like a squirrel before winter, then build your model. Foundation models flip that. They're pre-trained and ready to go, so you can start with a rough prototype and gather data later.
Fix it: Start with the user in mind. Sketch out what your AI should do—like drafting emails or answering FAQs—before obsessing over datasets. It's like building a house: get the frame up before worrying about the wallpaper.
4. Overestimating Model Context Windows
You might think a model with a huge "memory" (say, 100,000 tokens) can juggle everything you throw at it. But pile on too much, and it's like asking a chef to cook with a cluttered counter—things get messy, and mistakes creep in.
Fix it: Use Retrieval-Augmented Generation (RAG) to keep things tidy. It's like handing your AI a cheat sheet of just the info it needs, instead of a whole textbook. Less clutter, better results.
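Here is a toy sketch of the RAG idea: score document chunks against the question, keep only the top few, and build a compact prompt. A real system would use embeddings and a vector store; this keyword-overlap version just shows the shape of the "cheat sheet" approach.

```python
def score(chunk: str, question: str) -> int:
    """Crude relevance: count shared words between chunk and question."""
    q_words = set(question.lower().split())
    return len(q_words & set(chunk.lower().split()))

def build_prompt(question: str, chunks: list[str], k: int = 2) -> str:
    """Keep only the k most relevant chunks, then ask the question."""
    top = sorted(chunks, key=lambda c: score(c, question), reverse=True)[:k]
    context = "\n".join(f"- {c}" for c in top)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Refunds are processed within 5 business days.",
    "Our office is closed on public holidays.",
    "Refunds require the original receipt.",
]
print(build_prompt("How long do refunds take to process?", docs))
```

The irrelevant chunk about office holidays never reaches the model, so the context stays small and on-topic no matter how large the document collection grows.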
5. Ignoring the Complexity of AI Agents
Dreaming of an AI that books flights or troubleshoots tech issues step-by-step? It's exciting, but the reality is tricky. In multi-step tasks, errors compound: a small mistake at step two can derail everything after it, like a GPS that takes you in circles.
Fix it: Start small and scale up. Test simple tasks first, like sending a reminder, before tackling a full workflow. Watch each step like a hawk and be ready to troubleshoot. Planning isn't an afterthought—it's the backbone.
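The "watch each step" advice can be sketched as an agent loop that logs every action and enforces a step budget so it can't wander in circles. The plan and the `send_reminder` tool below are stand-ins, not a real agent framework:

```python
def send_reminder(task: str) -> str:
    # Stand-in for a real tool call (email, calendar API, etc.).
    return f"reminder sent: {task}"

def run_agent(plan: list[str], max_steps: int = 5) -> list[str]:
    """Execute a plan one small step at a time, logging each result."""
    log = []
    for i, step in enumerate(plan):
        if i >= max_steps:
            # Budget guard: stop instead of looping forever.
            log.append("aborted: step budget exceeded")
            break
        result = send_reminder(step)
        log.append(f"step {i}: {result}")  # every step is observable
    return log

for line in run_agent(["pay invoice", "book meeting room"]):
    print(line)
```

Starting with a single well-logged tool like this makes it obvious where a longer workflow breaks, which is exactly the visibility you lose when you jump straight to a ten-step agent.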
Building AI with foundation models is full of promise, but it's also a minefield of pitfalls. Sidestep these five mistakes, and you'll be on firmer ground. Keep your focus on the user, stay disciplined in your process, and don't be afraid to experiment.