Your First AI Project Shouldn't Be an AI Project
Every tutorial, bootcamp, and "getting started with AI" guide points you in the same direction: build a chatbot. Build a RAG pipeline. Build an agent. Start from scratch, make AI the entire point, and learn by building something that's fundamentally an AI product.
This is the wrong on-ramp. It teaches you application-layer skills and skips the foundational ones that actually make you productive, whether you're building agents, chatbots, or anything else.
The Chatbot Tutorial Trap
Think about how we teach every other kind of development. You don't learn web development by building a browser. You don't learn databases by writing a query engine. You build a todo list, a weather app, a basic CRUD server. Small, familiar projects where the fundamentals are the point and the product is almost irrelevant.
AI development skipped that step entirely. The community went straight to "build an autonomous agent" as the starter project. And the worst part is that it's easy. Agent frameworks abstract everything away: you can have a working demo in an afternoon. It feels like you've learned AI development. You haven't. You've skipped several layers of fundamentals and landed on an abstraction that hides all of them.
A chatbot teaches you conversation management, prompt templates, maybe some retrieval. An agent teaches you tool calling and execution loops. These aren't useless skills, but they're application-layer skills. They teach you how to build one specific kind of thing. The fundamental skill underneath all of it, treating LLM calls as typed functions, gets skipped entirely. And that skill transfers everywhere.
The Weather App Approach
Say you're learning a new language or framework. You build a weather app. Standard starter project. It looks something like this:
function main(city: string): WeatherResult {
  const latlong = getLatLong(city)
  const weather = getWeather(latlong)
  return weather
}
Three function calls. Typed input, typed output. Each one does one job. The data flows through a pipeline you control. You already understand this shape. It's just code.
Now add AI to it. Your first instinct is to throw a prompt at it:
const suggestion = await llm(
  `The weather in ${city} is ${weather.conditions}, ${weather.temp}°F.
  Suggest an outdoor activity.`
)
It works. You get back a paragraph about how it's a lovely day for a hike. And this is where most tutorials stop. You've "added AI." But you haven't built anything. You've got a string in and a string out. You can't parse it, validate it, or pipe it into anything downstream. It's a party trick, not a feature.
Type Your LLM Output
The first real step is making the LLM call look like every other function in your pipeline.
function main(city: string): WeatherWithActivity {
  const latlong = getLatLong(city)
  const weather = getWeather(latlong)
  const activity = suggestActivity(weather)
  return { weather, activity }
}
One new line. suggestActivity takes weather data in and returns a suggestion out. The fact that it calls an LLM instead of a REST API is an implementation detail. It has the exact same shape as every other function in the pipeline: typed input, typed output, one job.
That's the shift. You stop thinking about prompts and start thinking about function signatures. What does this function accept? What does it return? What's the contract?
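One way to see that the contract matters more than the implementation is to write the signature first and satisfy it with a throwaway rule-based stub. The sketch below is illustrative only: the `WeatherData` fields, the `SuggestActivity` alias, and `formatSuggestion` are assumptions, and it is kept synchronous for brevity, whereas an LLM-backed version would normally return a Promise.

```typescript
// Hypothetical shapes for illustration; field names and units are assumptions.
interface WeatherData {
  temp: number        // degrees Fahrenheit
  conditions: string  // e.g. "clear", "rain"
  wind: number        // mph
}

interface ActivitySuggestion {
  activity: string
  reason: string
  indoor: boolean
}

// The contract: callers depend only on this signature.
type SuggestActivity = (weather: WeatherData) => ActivitySuggestion

// A rule-based stand-in that satisfies the contract without any LLM.
const suggestActivityStub: SuggestActivity = (weather) => {
  const indoor = weather.conditions !== "clear"
  return {
    activity: indoor ? "visit a museum" : "go for a walk",
    reason: `Conditions are ${weather.conditions} at ${weather.temp}°F.`,
    indoor,
  }
}

// Written against the type, so swapping in an LLM-backed implementation
// later does not change this function.
function formatSuggestion(suggest: SuggestActivity, weather: WeatherData): string {
  const s = suggest(weather)
  return `${s.activity} (${s.indoor ? "indoor" : "outdoor"})`
}
```

Anything that consumes `SuggestActivity` can be built and tested before a single prompt is written.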
Harden Your LLM Function
suggestActivity can't just be a raw LLM call in production. The moment you type the output, you discover that the LLM doesn't always respect the contract. That's the lesson, and you can only learn it by building it.
What if the model returns a paragraph instead of structured data? You need output validation. What if it returns valid JSON but with wrong field names? You need schema enforcement. What if it fails entirely? You need retry logic. What if you're calling it a thousand times a day? You need to think about cost per call.
interface ActivitySuggestion {
  activity: string
  reason: string
  indoor: boolean
}
async function suggestActivity(weather: WeatherData): Promise<ActivitySuggestion> {
  const response = await llm({
    input: {
      temperature: weather.temp,
      conditions: weather.conditions,
      wind: weather.wind
    },
    schema: ActivitySuggestionSchema,
    retries: 2
  })
  return response
}
None of this is unique to AI. It's the same discipline you'd apply to any unreliable external service: validate the response, handle failures, enforce a contract. The LLM is an API that sometimes lies about its response format. You already know how to deal with that.
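What the `schema` and `retries` options might do under the hood can be sketched by hand. This is a minimal illustration, not the API of any particular SDK: `RawLlm`, `parseActivitySuggestion`, and `callWithRetries` are hypothetical names, and the model is assumed to return a JSON string.

```typescript
interface ActivitySuggestion {
  activity: string
  reason: string
  indoor: boolean
}

// Assumption: the model client is any async function from prompt to raw text.
type RawLlm = (prompt: string) => Promise<string>

// Returns a validated suggestion, or null if the text breaks the contract.
function parseActivitySuggestion(raw: string): ActivitySuggestion | null {
  let data: unknown
  try {
    data = JSON.parse(raw)
  } catch {
    return null // prose instead of JSON
  }
  if (typeof data !== "object" || data === null) return null
  const { activity, reason, indoor } = data as Record<string, unknown>
  // Wrong or missing field names fail these checks.
  if (typeof activity !== "string") return null
  if (typeof reason !== "string") return null
  if (typeof indoor !== "boolean") return null
  return { activity, reason, indoor }
}

// Calls the model up to `retries + 1` times until the output validates.
async function callWithRetries(
  llm: RawLlm,
  prompt: string,
  retries: number
): Promise<ActivitySuggestion> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    const parsed = parseActivitySuggestion(await llm(prompt))
    if (parsed !== null) return parsed
  }
  throw new Error(`No valid suggestion after ${retries + 1} attempts`)
}
```

Each retry is another paid call, so the retry budget is also a cost decision.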
When Not to Use an LLM
You run suggestActivity a few times and notice it's recommending picnics in thunderstorms. You need indoor/outdoor classification. Your first instinct is another LLM call: classifyEnvironment(weather). Stop.
function isOutdoorViable(weather: WeatherData): boolean {
  return weather.temp > 35
    && weather.temp < 95
    && weather.wind < 30
    && !['thunderstorm', 'blizzard', 'tornado'].includes(weather.conditions)
}
That's it. Deterministic, free, instant, and correct every time. No LLM needed. The weather data is already structured. You don't need a language model to interpret numbers. This is maybe the most important lesson the weather app teaches: the best LLM call is often the one you don't make. If the logic can be expressed as a conditional, express it as a conditional.
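Because the rule is plain code, its edge cases can be checked directly. The sketch below repeats the function with an assumed `WeatherData` shape (Fahrenheit, mph, lowercase condition strings) and a few sample inputs chosen purely for illustration.

```typescript
// Assumed shape; units and condition strings are illustrative.
interface WeatherData {
  temp: number
  conditions: string
  wind: number
}

const SEVERE_CONDITIONS = ["thunderstorm", "blizzard", "tornado"]

function isOutdoorViable(weather: WeatherData): boolean {
  return weather.temp > 35
    && weather.temp < 95
    && weather.wind < 30
    && !SEVERE_CONDITIONS.includes(weather.conditions)
}

const mild: WeatherData = { temp: 72, conditions: "clear", wind: 8 }
const stormy: WeatherData = { temp: 72, conditions: "thunderstorm", wind: 8 }
const freezing: WeatherData = { temp: 35, conditions: "clear", wind: 8 }

console.log(isOutdoorViable(mild))     // true
console.log(isOutdoorViable(stormy))   // false: severe conditions
console.log(isOutdoorViable(freezing)) // false: 35 is not above the 35°F floor
```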
Knowing when not to use the LLM is a skill the chatbot tutorial will never teach you, because in a chatbot, every response is an LLM call by definition.
Ground Your LLM in Real Data
Now that you're filtering by indoor/outdoor, the suggestions at least make sense for the conditions. But they're generic. The model suggests surfing in Nebraska. Rock climbing in downtown Miami. It's pulling from its training data, not from reality.
You already have the latlong from line one. It's just sitting there in the pipeline.
function main(city: string): WeatherReport {
  const latlong = getLatLong(city)
  const weather = getWeather(latlong)
  const outdoor = isOutdoorViable(weather)
  const attractions = getLocalAttractions(latlong, outdoor)
  const activity = suggestActivity(weather, attractions)
  return { weather, activity }
}
getLocalAttractions is a regular API call (Google Places, Yelp, whatever). It returns real venues near the user. Now suggestActivity isn't generating ideas from nothing. It's selecting and reasoning over real options. Instead of "go surfing" in a landlocked city, you get a specific park, a specific museum, a specific trail, with a reason tied to the actual weather.
This is the pattern that makes LLMs useful in production: don't ask them to generate facts, give them facts and ask them to reason. The weather app teaches you this because the data is right there and the failure is obvious. In a chatbot, you'd solve this with RAG and embeddings and a vector database. Here, it's one API call and a function parameter.
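Grounding can also be enforced after the call: if the model is asked to choose from a list, the choice can be checked against that list before it is returned. A minimal sketch, assuming a hypothetical `Attraction` shape and a model that returns the name of its pick along with a reason:

```typescript
// Hypothetical shape for whatever the places API returns.
interface Attraction {
  name: string
  indoor: boolean
}

interface GroundedPick {
  attraction: Attraction
  reason: string
}

// Accepts the model's pick only if it names a venue that was actually supplied.
function groundPick(
  pickedName: string,
  reason: string,
  attractions: Attraction[]
): GroundedPick | null {
  const wanted = pickedName.trim().toLowerCase()
  const match = attractions.find((a) => a.name.toLowerCase() === wanted)
  return match ? { attraction: match, reason } : null
}
```

A `null` result means the model named something outside the supplied list, which can be handled like any other contract violation: retry or fall back.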
The Full LLM Pipeline
Look at what you've built:
function main(city: string): WeatherReport {
  const latlong = getLatLong(city)                           // API call
  const weather = getWeather(latlong)                        // API call
  const outdoor = isOutdoorViable(weather)                   // regular code
  const attractions = getLocalAttractions(latlong, outdoor)  // API call
  const activity = suggestActivity(weather, attractions)     // LLM call
  return { weather, activity }
}
Five steps. One LLM call. The rest is regular code and API calls doing what they've always done: fetching data, applying logic, passing results forward. The LLM does the one thing it's actually good at: taking structured context and producing a natural-language recommendation grounded in real data.
This is how every LLM-powered tool I've built actually works. A landing page generator that chains extract, classify, and expand calls with data lookups in between. An article builder that runs structured content through a sequence of transformations, grounded in real program data. The pattern is always the same: regular code handles the logic and data, LLMs handle the language.
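To make the composition concrete, here is one way the weather pipeline could be wired so that it runs end to end with stubbed data sources. Everything about the dependencies is an assumption for illustration: the `Deps` interface, the field names, and the async signatures (network and model calls would normally return Promises, which the simplified listings above leave out).

```typescript
interface LatLong { lat: number; lon: number }
interface WeatherData { temp: number; conditions: string; wind: number }
interface Attraction { name: string; indoor: boolean }
interface ActivitySuggestion { activity: string; reason: string; indoor: boolean }
interface WeatherReport { weather: WeatherData; activity: ActivitySuggestion }

// Hypothetical dependency bundle: real implementations would hit APIs.
interface Deps {
  getLatLong(city: string): Promise<LatLong>
  getWeather(at: LatLong): Promise<WeatherData>
  getLocalAttractions(at: LatLong, outdoor: boolean): Promise<Attraction[]>
  suggestActivity(w: WeatherData, options: Attraction[]): Promise<ActivitySuggestion>
}

function isOutdoorViable(w: WeatherData): boolean {
  return w.temp > 35 && w.temp < 95 && w.wind < 30
    && !["thunderstorm", "blizzard", "tornado"].includes(w.conditions)
}

async function weatherReport(city: string, deps: Deps): Promise<WeatherReport> {
  const latlong = await deps.getLatLong(city)                           // API call
  const weather = await deps.getWeather(latlong)                        // API call
  const outdoor = isOutdoorViable(weather)                              // regular code
  const attractions = await deps.getLocalAttractions(latlong, outdoor)  // API call
  const activity = await deps.suggestActivity(weather, attractions)     // LLM call
  return { weather, activity }
}
```

Passing the dependencies in makes every step replaceable with a stub in tests, including the LLM call.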
Agent vs Workflow vs Single Prompt
Now look at how the AI-first approaches would solve the same problem. The first instinct is to let the LLM handle everything:
const result = await llm(
  `For the city of ${city}, determine the current weather conditions,
  find local attractions and activities, and recommend something to do
  today. Consider whether indoor or outdoor activities are appropriate.
  Return a JSON object with weather, activity, and reason.`
)
The model can't check the weather. It can't query local attractions. It hallucinates both. You figure this out in about thirty seconds and move on. So you fetch the data yourself and stuff it all into one prompt:
const latlong = getLatLong(city)
const weather = getWeather(latlong)
const attractions = getLocalAttractions(latlong)
const result = await llm(
  `Weather: ${JSON.stringify(weather)}
  Local attractions: ${JSON.stringify(attractions)}
  Based on the weather, determine if outdoor activities are viable.
  Then suggest an activity from the attractions list with a reason
  tied to the current conditions.
  Return a JSON object with activity, reason, and indoor.`
)
This "works." The data is real. The output is usually reasonable. And this is where most people stop. They've built the thing, it produces output, ship it.
But look at what you're asking the LLM to do in a single call: interpret weather data, decide indoor vs outdoor, filter attractions by viability, select one, and generate a reason. Five decisions in one prompt. Indoor/outdoor classification is an if statement. You're burning tokens on if (temp > 95). Attraction filtering is a database query. You're asking a language model to do what .filter() does. The LLM is doing jobs that should be code, and you have no visibility into which step went wrong when the output is bad.
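Moving those jobs back into code might look like the sketch below: the attraction filter runs first, so the prompt only carries options that are already valid and the model is left with selection and explanation. The `Attraction` shape, `viableAttractions`, and `selectionPrompt` are assumptions for illustration.

```typescript
// Hypothetical shape for a venue returned by the places lookup.
interface Attraction {
  name: string
  indoor: boolean
}

// Plain .filter(): outdoor venues are dropped when the weather rules them out.
function viableAttractions(all: Attraction[], outdoorOk: boolean): Attraction[] {
  return all.filter((a) => a.indoor || outdoorOk)
}

// What remains for the model: pick one pre-filtered option and explain why.
function selectionPrompt(conditions: string, options: Attraction[]): string {
  const names = options.map((a) => `- ${a.name}`).join("\n")
  return `Conditions: ${conditions}\nPick one option and give a reason:\n${names}`
}
```

If the output is bad now, the filter can be unit-tested on its own, which narrows the search to the one remaining LLM step.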
The agentic version gives the model tools and lets it decide:
const agent = new Agent({
  tools: [getLatLong, getWeather, getLocalAttractions],
  prompt: `Given a city, suggest an activity based on the current
  weather and local attractions.`
})
const result = await agent.run(city)
This actually works. The agent calls getLatLong, then getWeather, then getLocalAttractions, then reasons about the results and produces a recommendation. Correct output. Real data. The problem is what it costs to get there, and what happens when it gets harder.
This is a straightforward dependency chain. You need the lat/long to get the weather. You need the weather to filter attractions. There's one sensible order of operations, and the agent "discovers" it every time by spending reasoning tokens to arrive at what you already know. The autonomy adds nothing here except overhead.
There's a subtler problem too. In the pipeline, the output of getWeather is passed directly to the next function. The data is the data. In the agent loop, every tool output gets interpreted by the LLM before it becomes the input to the next tool call. The data gets filtered through the model's interpretation at every handoff.
The pipeline does the same thing, with the same data, producing the same quality output. The indoor/outdoor decision is an if statement that runs in nanoseconds. The one LLM call does the one thing an LLM is needed for: turning structured context into a natural-language recommendation.
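The overhead is easy to see by counting model turns. The sketch below simulates a generic tool-calling loop with a scripted stand-in for the model. It does not reflect any specific agent framework, it is synchronous for brevity, and `ModelStep`, `Model`, `runAgentLoop`, and `scripted` are hypothetical names.

```typescript
// One model turn either requests a tool or gives the final answer.
type ModelStep =
  | { kind: "tool"; name: string }
  | { kind: "final"; answer: string }

// Assumption: the model sees the transcript so far and decides the next step.
type Model = (transcript: string[]) => ModelStep

function runAgentLoop(
  model: Model,
  tools: Record<string, () => string>,
  maxTurns = 10
) {
  const transcript: string[] = []
  let modelTurns = 0
  while (modelTurns < maxTurns) {
    modelTurns++
    const step = model(transcript) // every turn is a model call
    if (step.kind === "final") return { answer: step.answer, modelTurns }
    const tool = tools[step.name]
    if (!tool) throw new Error(`Unknown tool: ${step.name}`)
    // The tool output goes back through the model before the next call.
    transcript.push(`${step.name} -> ${tool()}`)
  }
  throw new Error("Agent did not finish")
}

// Scripted model that always walks the one sensible order.
const scripted: Model = (transcript) => {
  const order = ["getLatLong", "getWeather", "getLocalAttractions"]
  return transcript.length < order.length
    ? { kind: "tool", name: order[transcript.length] }
    : { kind: "final", answer: "Visit the museum" }
}
```

With three tools this loop needs four model turns (three tool decisions plus the final answer), where the fixed pipeline needs one.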
And if you actually need an agent for a broader task, the pipeline is still the right answer. You just make it the tool:
const agent = new Agent({
  tools: [getWeatherActivity], // the whole pipeline, one typed function
  prompt: `Help the user plan their day.`
})
Instead of giving the agent five tools and hoping it figures out the dependency chain, you give it one tool that returns a clean, validated result. Each layer does what it's good at. Agents make sense when the execution path genuinely varies based on the input. Pipelines make sense when the steps are known. Most real systems need both, and the pipeline is always the primitive.
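Exposing the pipeline as a single tool could look like the sketch below. The `Tool` interface is a hypothetical shape, not the API of a particular agent framework, and `getWeatherActivity` is stubbed here so the example stays self-contained.

```typescript
interface ActivityResult {
  activity: string
  reason: string
  indoor: boolean
}

// Hypothetical tool contract: a name, a description for the model, a typed run().
interface Tool<I, O> {
  name: string
  description: string
  run(input: I): Promise<O>
}

// Stand-in for the full pipeline (geocode, weather, filter, attractions, LLM).
async function getWeatherActivity(city: string): Promise<ActivityResult> {
  return { activity: `Museum visit in ${city}`, reason: "stubbed result", indoor: true }
}

const weatherActivityTool: Tool<{ city: string }, ActivityResult> = {
  name: "getWeatherActivity",
  description: "Suggests one activity for a city from current weather and nearby venues.",
  run: ({ city }) => getWeatherActivity(city),
}
```

The agent only sees one input (`city`) and one validated output, so the dependency chain inside the pipeline never becomes something it has to rediscover.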
The Foundational Skill
That progression (prompt it, type it, harden it, know when not to use it, ground it) teaches you a set of skills that transfers to everything you'll build with LLMs.
Building a chatbot teaches you conversation state, memory management, prompt templates, and retrieval pipelines. Real skills, but specific to conversational AI. They don't transfer cleanly to a content generation pipeline, a classification system, or a data extraction tool.
The weather app teaches you typed I/O, validation, composition, when to use code instead of an LLM, and how to ground model output in real data. These are the fundamentals. An agent is just a loop that calls typed functions based on conditions. A chatbot is a conversational interface on top of typed function calls. A pipeline is typed functions composed in sequence. The function is the primitive.
What the chatbot teaches:
- conversation state management
- prompt templates
- memory & retrieval (RAG)
- streaming responses
- chat UI patterns
What the weather app teaches:
- typed LLM I/O
- output validation & schema enforcement
- when to use code vs. LLM
- grounding output in real data
- retry logic & cost awareness
The chatbot tutorial never teaches you to think this way because in a chatbot, every response is an LLM call and every output is text. There's no moment where you realize code would be better. There's no data pipeline to ground the output in. The architecture doesn't demand the discipline. The weather app does.
The Bigger Point
The AI community has an on-ramp problem. The path from "I want to build with AI" to "I can ship AI features" runs through a swamp of chatbot tutorials, agent frameworks, and RAG pipelines that teach application-layer skills before the fundamentals are in place. It's like teaching someone React before they understand functions.
The right mental model isn't "AI is a new kind of application." It's "an LLM call is a function with a weird runtime." It takes typed input. It returns typed output. It composes with other functions. It needs validation because it's less reliable than most APIs. Once you see it that way, the mystique evaporates.
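That framing can be written down as a type. In this sketch, `RawLlm`, `LlmFunction`, and `defineLlmFunction` are hypothetical names: a typed LLM function is assembled from a prompt renderer, a raw model call, and a validator, and the result composes like any other async function.

```typescript
// Assumption: the raw model is any async function from prompt to text.
type RawLlm = (prompt: string) => Promise<string>

// The "function with a weird runtime": typed input in, typed output out.
type LlmFunction<I, O> = (input: I) => Promise<O>

function defineLlmFunction<I, O>(
  llm: RawLlm,
  render: (input: I) => string,
  parse: (raw: string) => O | null
): LlmFunction<I, O> {
  return async (input) => {
    const parsed = parse(await llm(render(input)))
    if (parsed === null) throw new Error("Model output violated the contract")
    return parsed
  }
}

// Usage with a fake model, so the sketch runs without network access.
const fakeLlm: RawLlm = async () => "42"
const toNumber = defineLlmFunction<string, number>(
  fakeLlm,
  (question) => `Answer with a number only: ${question}`,
  (raw) => (Number.isFinite(Number(raw)) ? Number(raw) : null)
)
```

Callers of `toNumber` see an ordinary `(input: string) => Promise<number>`; the prompt and the validation stay behind the signature.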
The right starter project isn't an AI project at all. It's a regular project with one smart function. Everything else builds from there.