Now that you understand the fundamental parts of an LLM agent, it's time to choose the "brain" for your first creation: the Large Language Model (LLM) itself. As we discussed in "The Building Blocks of an LLM Agent," the LLM is the core reasoning engine. For your initial agent, the selection process should prioritize ease of use and learning, rather than aiming for the most powerful or complex model available.
What to Look For in Your First LLM
When building your very first agent, some factors are more important than others. You want to get up and running quickly to see your agent in action. Here’s what to consider:
- Ease of Access via APIs: Most LLMs are offered by companies through Application Programming Interfaces (APIs). An API is essentially a way for your program to send requests to the LLM service and receive responses over the internet. This is generally the simplest way to start because you don't need to download or manage large model files yourself. You'll typically sign up with a provider, get an "API key" (a unique secret code that identifies your program), and then you can start making calls.
- Cost-Effectiveness: Many LLM providers offer free tiers or credits for new users, which are perfect for your first experiments. Even their paid tiers often operate on a pay-as-you-go basis, where you're charged a very small amount for the amount of text you process. For a simple first agent, these costs will likely be minimal, allowing you to learn without a significant financial commitment.
- Sufficient Capability for Simple Tasks: Your first agent will likely handle a straightforward task, like managing a to-do list, as we'll build later. You don’t need the absolute largest or most advanced model for this. Most widely available LLMs are more than capable of understanding simple instructions and generating appropriate text or decisions for basic agentic behavior.
- Good Documentation and Community Support: When you're starting, clear instructions and examples are very helpful. Providers who offer good documentation, tutorials, and have active user communities can make your learning process smoother if you encounter questions.
Popular LLM Choices via APIs
Several companies provide access to capable LLMs through APIs, making them excellent candidates for your first agent. Here are a few common options:
- OpenAI: Known for models like GPT-3.5-Turbo and GPT-4. OpenAI provides well-documented APIs and is a popular choice for many developers. GPT-3.5-Turbo often offers a good balance of capability and cost for initial projects.
- Anthropic: Offers models like Claude 3 Haiku, Claude 3 Sonnet, and Claude 3 Opus. Haiku is designed to be fast and cost-effective, making it a great starting point. Sonnet provides a balance of intelligence and speed. Their models are recognized for strong reasoning and helpfulness.
- Google: Provides access to its Gemini family of models, such as Gemini Pro, through the Google AI Studio or Vertex AI platform. These are powerful and versatile models suitable for a wide range of tasks.
To use any of these, you'll generally need to:
- Visit the provider's website (e.g., OpenAI, Anthropic, Google AI).
- Sign up for an account.
- Navigate to their developer section or API section.
- Generate an API key. Keep this key secure, as it's your credential for accessing the service.
For this course, we'll primarily focus on using these models via their APIs, as it’s the most straightforward path for building your first agent.
The API Route: Why It's Good for Starters
Using an LLM through an API means your agent code will make a network request to the LLM provider. Your code sends the prompt (your instructions and context), and the provider's servers, running the LLM, process it and send back the response.
Interaction flow when using an API-based LLM. Your agent code communicates with the LLM over the internet.
While it's also possible to run some LLMs locally on your own computer (especially smaller, open-source models using tools like Ollama or Hugging Face Transformers), this typically involves more setup, potentially requiring specific hardware and more technical steps. For your first agent, the API approach significantly lowers the barrier to entry. We recommend starting with an API to focus on the agent's logic first.
Making Your Pick for the First Agent
So, which one should you choose? For your very first agent, the specific LLM choice is less critical than understanding the process of integrating it.
- If you want a widely used option with extensive tutorials: OpenAI's GPT-3.5-Turbo is a solid start.
- If you're interested in models known for safety and detailed reasoning: Anthropic's Claude 3 Haiku is an excellent and cost-effective choice.
- If you're already in the Google ecosystem or want to try their latest offerings: Google's Gemini Pro is a strong contender.
Most modern LLMs available through these APIs are quite capable. Pick one that seems accessible to you, sign up for an API key, and you'll be ready for the next steps. The principles you learn by integrating one API-based LLM will be largely transferable if you decide to try a different one later. The most important thing is to get started and see how your agent begins to think and act based on the LLM you provide.