While guiding an agent's reasoning with techniques like Chain-of-Thought helps in structuring its internal thought process, many tasks require an agent to do more than just think through a problem in one go. To effectively interact or use tools to find information, an agent often needs to act, see what happens, and then decide what to do next. This iterative process of thinking and acting is central to a popular and effective framework known as ReAct.
ReAct stands for "Reason and Act." It's an approach that enables LLM agents to combine reasoning with action-taking in a structured way. Instead of trying to formulate a complete plan from start to finish and then executing it blindly, a ReAct agent breaks down a task into a sequence of smaller steps. At each step, it goes through a cycle of reasoning about the current situation, deciding on an action, taking that action, and then observing the outcome.
The core of the ReAct framework is an iterative loop involving three main components:
1. Thought: The LLM reasons about the current state of the task, assessing what it knows so far, what information is still missing, and what it should do next.
2. Action: Based on that thought, the LLM selects an action to take, typically a call to an external tool with a specific input, such as search("current temperature in Berlin") or calculator("125 * 4.5").
3. Observation: The result of the action (a search result, a computed value, an error message) is captured and returned to the LLM.

This "Thought-Action-Observation" cycle repeats. The observation from the previous step informs the LLM's next thought, allowing it to assess progress, adjust its plan if necessary, and decide on the subsequent action. The loop continues until the agent determines that the overall goal has been achieved, at which point it might take a final action to provide the answer or complete the task.
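The cycle above can be sketched as a small control loop. This is a minimal illustration, not a production implementation: `call_llm` and `run_tool` are scripted stand-ins for a real LLM API and tool executor, and the `finish(...)` convention for ending the loop is an assumption borrowed from the example later in this section.

```python
# Minimal sketch of the Thought-Action-Observation loop.
# call_llm and run_tool are scripted stand-ins: a real system would send
# the prompt to an LLM and actually execute the chosen tool.

SCRIPTED_STEPS = iter([
    'Thought: I need to find the winner first.\n'
    'Action: search("winner of 2023 FIFA Women\'s World Cup")',
    'Thought: Spain won; I can answer now.\nAction: finish("Spain")',
])

def call_llm(prompt: str) -> str:
    # Stand-in: a real implementation would call an LLM with `prompt`.
    return next(SCRIPTED_STEPS)

def run_tool(action: str) -> str:
    # Stand-in: a real implementation would parse and execute the action.
    return "Spain won the 2023 FIFA Women's World Cup."

def react_loop(task: str, max_steps: int = 10) -> str:
    history = f"Task: {task}\n"
    for _ in range(max_steps):
        # Thought + Action: the LLM reasons, then names its next action.
        step = call_llm(history)
        history += step + "\n"
        action = step.split("Action:")[-1].strip()
        if action.startswith("finish("):
            # The agent decides the goal is met and returns its answer.
            return action[len("finish("):-1].strip('"')
        # Observation: execute the tool and feed the result back into
        # the history, which becomes part of the next prompt.
        observation = run_tool(action)
        history += f"Observation: {observation}\n"
    return "Stopped: step limit reached without a finish action."

answer = react_loop("Who won the 2023 FIFA Women's World Cup?")
```

Note that the loop has a step limit: without one, an agent that never emits a finishing action would cycle indefinitely.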
Let's visualize this flow:
The ReAct framework operates as a loop where the agent thinks, acts, observes the outcome, and then thinks again based on the new information.
Imagine you ask an agent: "What was the score of the last game played by the winning team of the 2023 FIFA Women's World Cup?"
A ReAct agent might proceed as follows (simplified):
1. Thought: "The question asks for two things: first, who won the 2023 FIFA Women's World Cup, and second, the score of their last game. I need to find the winner first."
Action: search("winner of 2023 FIFA Women's World Cup")
2. Observation: "Spain won the 2023 FIFA Women's World Cup."
3. Thought: "Okay, Spain was the winner. Now I need to find the score of Spain's last game. The question implies the last game they played, which might not be the World Cup final if they have played matches since then. I should search for 'Spain women's national football team last game score'."
Action: search("Spain women's national football team last game score")
4. Observation: (Assuming a result is found, e.g.) "Spain 2 - 0 Sweden, 2024-02-28."
5. Thought: "I have found the winner (Spain) and the score of their last game (2-0 against Sweden on Feb 28, 2024). I have all the information needed to answer the question."
Action: finish("The winning team of the 2023 FIFA Women's World Cup was Spain. The score of their last game was Spain 2 - 0 Sweden.")
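The trace above is ultimately just text that the agent system assembles into the next prompt. The sketch below shows one way that serialization might look; the exact formatting is an assumption, and real agent frameworks vary in how they lay out the history.

```python
# Illustrative only: serializing a partial ReAct trace into the prompt
# for the next LLM call. The label/line format is an assumption.

steps = [
    ("Thought", "I need to find the winner of the 2023 FIFA Women's "
                "World Cup first."),
    ("Action", 'search("winner of 2023 FIFA Women\'s World Cup")'),
    ("Observation", "Spain won the 2023 FIFA Women's World Cup."),
]

def build_prompt(question: str, steps) -> str:
    lines = [f"Question: {question}"]
    for label, text in steps:
        lines.append(f"{label}: {text}")
    # Ending with a bare "Thought:" cues the model to continue the
    # trace by producing its next reasoning step.
    lines.append("Thought:")
    return "\n".join(lines)

prompt = build_prompt(
    "What was the score of the last game played by the winning team "
    "of the 2023 FIFA Women's World Cup?",
    steps,
)
```

Because the observation is embedded in the prompt, the model's next thought is grounded in what the tool actually returned rather than in what the model assumed it would return.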
In this example, the LLM isn't just generating the final answer in one go. It's explicitly verbalizing its plan (the "Thought" part), choosing tools (the "Action" part, here search and finish), and then incorporating new information (the "Observation" part) to guide its next steps. The agent system running the LLM is responsible for parsing the Action string, calling the appropriate tool, and then feeding the Observation back to the LLM by including it in the next prompt.
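The parsing-and-dispatch responsibility described above can be sketched with a simple tool registry. The registry, the regular expression, and the `search` stub here are illustrative assumptions, not a standard API.

```python
import re

# Hedged sketch: parse an Action string like search("...") and dispatch
# it to a registered tool. Tool names and the action format are
# illustrative conventions, not a fixed standard.

def search(query: str) -> str:
    # Stand-in for a real search tool (e.g. a web search API call).
    return f"(results for: {query})"

TOOLS = {"search": search}

# Matches a tool name followed by a single double-quoted argument.
ACTION_RE = re.compile(r'^(\w+)\("(.*)"\)$')

def dispatch(action: str) -> str:
    """Parse the Action string and call the matching registered tool."""
    match = ACTION_RE.match(action.strip())
    if not match:
        return f"Error: could not parse action: {action}"
    name, argument = match.groups()
    tool = TOOLS.get(name)
    if tool is None:
        return f"Error: unknown tool: {name}"
    # The returned string becomes the Observation in the next prompt.
    return tool(argument)

obs = dispatch('search("winner of 2023 FIFA Women\'s World Cup")')
```

Returning error strings (rather than raising exceptions) is a deliberate choice: a malformed action becomes an Observation the LLM can read, letting it correct itself on the next step.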
The ReAct approach offers several advantages for building more capable agents: reasoning is grounded in real tool results rather than the model's assumptions, the plan can adapt as new information arrives, and the verbalized thoughts make the agent's behavior easier to inspect and debug.
Compared to a technique like Chain-of-Thought (CoT) prompting, which primarily focuses on generating a coherent line of reasoning before producing a final output, ReAct integrates reasoning with action-taking and environmental feedback in a continuous loop. While CoT helps the LLM "think things through," ReAct helps it "think, do, and learn" iteratively.
By structuring an agent's operation around this cycle of reasoning, acting, and observing, the ReAct framework allows us to build agents that are more interactive, adaptive, and capable of handling multi-step tasks that require external information or actions. As we move forward, you'll see how this pattern is a fundamental building block for many types of LLM agents.