Making an LLM chat client agentic
2025-12-21
in lagan
Say you have a minimal LLM chat client that does the following:
- Wait for the user to input a prompt
- Append a message containing the user input to chat history
- Send any system prompt (or none) + full chat history to the model (via API)
- Add the completion from the model to chat history
- Repeat
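Concretely, that loop might look something like the following (a minimal sketch, not this post's actual code), written in Python against Ollama's `/api/chat` endpoint on the default local port. The model name is an assumption; substitute whatever you've pulled.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint
MODEL = "llama3.2"  # assumption: any chat model you've pulled works here

messages = []  # full chat history, optionally seeded with a system prompt

while True:
    # Wait for the user to input a prompt, and append it to the history
    messages.append({"role": "user", "content": input("> ")})

    # Send system prompt (if any) + full chat history to the model via the API
    resp = requests.post(
        OLLAMA_URL,
        json={"model": MODEL, "messages": messages, "stream": False},
    ).json()

    # Append the completion from the model to chat history, then repeat
    messages.append(resp["message"])
    print(resp["message"]["content"])
```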
To transform that into an agent:
- Write a schema for each tool, to tell the model what it’s good for and which arguments it takes (see the sketch after this list)
- Write the function that runs when the model asks for a given tool
- Send all the tool schemas in the API request payload along with the message history
- Add an “agent” loop that checks for a list of tool calls in the API response, and
  - runs each tool call’s function, tacking a message containing each function’s output onto the chat history
  - makes the next API call containing the snowballing history
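For the first two steps, a schema and function for a simple ping tool might look like this. The schema shape follows the OpenAI-style `function` format that Ollama accepts; the names here are illustrative, and the `-c` flag is the Unix one (Windows uses `-n`).

```python
import subprocess

# Schema: tells the model what the tool is good for and which arguments it takes
PING_TOOL = {
    "type": "function",
    "function": {
        "name": "ping",
        "description": "Ping a host once and report the result.",
        "parameters": {
            "type": "object",
            "properties": {
                "host": {
                    "type": "string",
                    "description": "Hostname or IP address to ping",
                },
            },
            "required": ["host"],
        },
    },
}

# Function: runs when the model asks for the "ping" tool
def ping(host: str) -> str:
    result = subprocess.run(
        ["ping", "-c", "1", host], capture_output=True, text=True
    )
    return result.stdout or result.stderr
```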
When the model produces a tool call, it’s the client that turns the crank on the next inference run. When there’s no tool call, we pop out of the loop, display the output, and wait for the user to prompt again.
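A sketch of that agent loop, reusing `requests`, `OLLAMA_URL`, `MODEL`, `PING_TOOL`, and `ping` from the snippets above (the `tool_calls` and `role: "tool"` field names follow Ollama's `/api/chat` format):

```python
def run_agent_turn(messages: list) -> str:
    """Drive the model until it answers without asking for a tool."""
    while True:
        # Send all the tool schemas along with the (snowballing) history
        resp = requests.post(
            OLLAMA_URL,
            json={
                "model": MODEL,
                "messages": messages,
                "tools": [PING_TOOL],
                "stream": False,
            },
        ).json()
        message = resp["message"]
        messages.append(message)

        tool_calls = message.get("tool_calls")
        if not tool_calls:
            # No tool call: pop out of the loop and return the output
            return message["content"]

        # The client turns the crank: run each tool call's function and
        # tack its output onto the history before the next API call
        for call in tool_calls:
            fn = call["function"]
            if fn["name"] == "ping":
                output = ping(**fn["arguments"])
                messages.append({"role": "tool", "content": output})
```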
Here’s my ping-enabled agent for a local Ollama-hosted model.