Blog posts

Local LLM models: Part 4 - a simple web UI

In this post we will take the command-line chat and tool-calling app described in part 3 of this series, which interacts with a local gpt-oss model, and add a web browser user interface.

This will use HTML + CSS to render the frontend and JavaScript + WebSockets to interact with a local HTTP server built with the Go standard library net/http package and the Gorilla WebSocket package.
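
At its core the server is just a static file handler plus a WebSocket upgrade endpoint. Below is a minimal sketch using Gorilla WebSocket; the /ws route, port 8000 and static directory are illustrative assumptions, not necessarily what the post uses:

    package main

    import (
        "log"
        "net/http"

        "github.com/gorilla/websocket"
    )

    var upgrader = websocket.Upgrader{} // default buffer sizes

    func wsHandler(w http.ResponseWriter, r *http.Request) {
        conn, err := upgrader.Upgrade(w, r, nil)
        if err != nil {
            log.Println("upgrade:", err)
            return
        }
        defer conn.Close()
        for {
            // Read a chat message from the browser; in the real app this is
            // where the prompt would be forwarded to the model.
            _, msg, err := conn.ReadMessage()
            if err != nil {
                return
            }
            // Echo a reply back over the same connection.
            if err := conn.WriteMessage(websocket.TextMessage, msg); err != nil {
                return
            }
        }
    }

    func main() {
        http.Handle("/", http.FileServer(http.Dir("static"))) // HTML, CSS and JS assets
        http.HandleFunc("/ws", wsHandler)
        log.Fatal(http.ListenAndServe(":8000", nil))
    }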

For styling lists and buttons we'll use Pure CSS. To render the Markdown content generated by the model we'll use goldmark, along with chroma for code syntax highlighting and KaTeX to format math equations.
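
As a rough sketch of the Markdown-to-HTML step, goldmark's default converter is enough on its own; the chroma highlighting and KaTeX wiring mentioned above are omitted here:

    package main

    import (
        "bytes"
        "fmt"
        "log"

        "github.com/yuin/goldmark"
    )

    func main() {
        src := []byte("Model output with *emphasis* and `code`.")
        var buf bytes.Buffer
        // Convert Markdown to an HTML fragment using goldmark's defaults.
        if err := goldmark.Convert(src, &buf); err != nil {
            log.Fatal(err)
        }
        fmt.Println(buf.String()) // HTML fragment to send to the browser
    }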

Local LLM models: Part 3 - calling tool functions

For the previous post in this series, which is an intro to the completions API in Go, see part 2. In this post we will extend the command-line chat app to add simple tool calling using the OpenWeather API.

We will need to add the function schema definitions to the request we send to the model so that it knows what functions it can call. Then, if the returned response ends with a tool call instead of a final answer, we extract the parameters, call the function, and add both the call and its result to the message list before resending the request.
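
For reference, a function schema in the OpenAI chat-completions "tools" format looks roughly like the sketch below; the get_weather name and its parameters are illustrative placeholders, not the exact schema used for the OpenWeather call:

    package main

    import (
        "encoding/json"
        "fmt"
    )

    func main() {
        // One entry of the "tools" array sent along with the chat request.
        tools := []map[string]any{{
            "type": "function",
            "function": map[string]any{
                "name":        "get_weather",
                "description": "Get the current weather for a city",
                "parameters": map[string]any{
                    "type": "object",
                    "properties": map[string]any{
                        "city": map[string]any{"type": "string", "description": "City name"},
                    },
                    "required": []string{"city"},
                },
            },
        }}
        b, _ := json.MarshalIndent(tools, "", "  ")
        fmt.Println(string(b)) // goes into the "tools" field of the request body
    }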

Local LLM models: Part 2 - a basic command line app in Go

For the first post in this series, which covers setting up the environment, see part 1.

The llama-server binary provides an OpenAI-compatible Completions API as a REST + JSON interface at the /v1/completions URL. In this post we test this out by writing some simple command line programs in Go.
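
A minimal request looks something like the sketch below, assuming llama-server is listening on its default port 8080; the field names follow the OpenAI completions format:

    package main

    import (
        "bytes"
        "encoding/json"
        "fmt"
        "io"
        "log"
        "net/http"
    )

    func main() {
        // Build the JSON request body with a prompt and a token limit.
        reqBody, _ := json.Marshal(map[string]any{
            "prompt":     "Write a haiku about Go.",
            "max_tokens": 64,
        })
        resp, err := http.Post("http://localhost:8080/v1/completions",
            "application/json", bytes.NewReader(reqBody))
        if err != nil {
            log.Fatal(err)
        }
        defer resp.Body.Close()
        body, _ := io.ReadAll(resp.Body)
        fmt.Println(string(body)) // JSON response containing the generated text
    }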

Local LLM models: Part 1 - getting started

This will be a series of posts about running LLMs locally. By the end we should have our own web-based chat interface which can call tools we have defined, e.g. to use a web search API, retrieve pages and feed them back into the model.

This post covers the first steps:

  • Setting up an inference server with llama.cpp (a quick check that it is responding is sketched after this list),
  • Downloading a model - we'll use gpt-oss from OpenAI.
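
Once the server is up, a quick sanity check is to hit its /health endpoint; this sketch assumes llama-server's default port 8080:

    package main

    import (
        "fmt"
        "log"
        "net/http"
    )

    func main() {
        // GET /health should return 200 once the model has finished loading.
        resp, err := http.Get("http://localhost:8080/health")
        if err != nil {
            log.Fatal("server not reachable: ", err)
        }
        defer resp.Body.Close()
        fmt.Println("llama-server status:", resp.Status)
    }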