Day 1: Counting pennies

In the new age of LLM-based software development, working as a solo developer is no longer cheap. The expectation of developing mature products at a fast pace increases every day. As models get better every week, the cost of using them increases significantly. Budgeting for these tools should be a focus for every developer until costs come down.

As I start my journey of building or upgrading a software tool daily, it would be prudent to keep track of what these LLM tools cost me. It is only logical to start by building something to track their usage.

Yet another Python script

The task of developing a new tool quickly falls on the shoulders of Python once again.

For now, my LLM usage is restricted to chats in OpenWebUI, with OpenRouter as the model provider. Every response message in a chat has an information section with details such as input token count, completion token count, cost, and so on. However, OpenWebUI offers no way to aggregate these details across messages. Enter the tool I have in mind!

To create a tool, the most important thing is to define its scope. Ideally, creating a minimum viable product (MVP) and iterating on it is the smart strategy. This helps us understand how the tool should evolve. If you don’t believe me, look at nature. Humans didn’t evolve from a blueprint. We started as single-celled organisms and evolved over billions of years.

The Purge

The tool needs to be as simple as possible. This means anything that can be removed from scope will be removed.

The tool could talk to the OpenWebUI server directly, but that means handling credentials as well as managing communication with a web server. Instead, let’s export the chat from OpenWebUI to a file; file processing is simpler. When choosing a file format, it would be foolish to use anything other than JSON. It’s a popular standard with well-supported packages, which helps speed up development. Reducing file size is not a goal at the moment.
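Loading the export then comes down to a few lines of Python. A minimal sketch of that step (the function name and error messages here are mine, not from the actual script):

```python
import json
import sys


def load_chat(path):
    """Parse an exported OpenWebUI chat file and return the decoded JSON."""
    try:
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        sys.exit(f"error: no such file: {path}")
    except json.JSONDecodeError as exc:
        sys.exit(f"error: {path} is not valid JSON: {exc}")
```

Exiting with a short message instead of a traceback keeps the tool friendly when you point it at the wrong file.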

I usually use the same model for an entire conversation. The tool could very well support conversations involving multiple models, since that information is present in the messages. However, it is prudent to assume single-model usage and skip multi-model metrics for now.

OpenRouter credits are counted in USD. The inference cost incurred for every API call to the model is also reported in the same currency. Using the same logic as before, we want to report cost only in USD.

An important thing to consider is that messages from the model are marked with the role assistant in the chat. This means cost is incurred only for those messages, so the tool should filter out all other messages when calculating cost.
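A minimal sketch of that filter, assuming the export keeps its messages in a top-level `messages` list and each assistant message carries its cost under a `usage` dict (both names are guesses at the schema, not documented fields):

```python
def total_cost(chat):
    """Sum inference cost in USD over assistant messages only."""
    total = 0.0
    for msg in chat.get("messages", []):
        if msg.get("role") != "assistant":
            continue  # user and system messages incur no inference cost
        usage = msg.get("usage", {})
        total += float(usage.get("cost", 0.0))
    return total
```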

Finally, simple sanity checks should be performed. Any errors with tool usage should provide messages to aid debugging.
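As a sketch, those sanity checks can be a few structural assertions with readable messages, again assuming the export is a JSON object holding a `messages` list (my guess at the layout, not a documented schema):

```python
def validate_chat(chat):
    """Fail fast with a message that aids debugging, not a bare traceback."""
    if not isinstance(chat, dict):
        raise ValueError("expected a JSON object at the top level of the export")
    messages = chat.get("messages")
    if not isinstance(messages, list):
        raise ValueError(
            "export has no 'messages' list; did you export the whole chat as JSON?"
        )
    for i, msg in enumerate(messages):
        if "role" not in msg:
            raise ValueError(f"message {i} is missing its 'role' field")
```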

Boromir returns

As you know from the last post, Boromir is a sage when it comes to software development. The above requirements look simple enough for an LLM to write the code in one shot. Right?

If only!

As you can guess, the above discussion of scope happened in a chat. A sample chat was exported as JSON and provided as context for writing the tool. I tried to save some tokens here, which backfired, as expected. Instead of providing the entire file as context, I provided only a single message. The model correctly identified how to parse the message, but had no knowledge of where the messages were embedded in the JSON.

how not to pass context

Learning: Better to spend more tokens and time upfront than try to fix it later.

Spare me the details

It took me a while to fix the mistake, but I finally got the script working. I tweaked the code by hand, since the LLM output had some issues. I’m using DeepSeek v3.2 for the entire activity. It’s cheap, but not great at coding. You can start to see my penny-pinching behavior here.

The tool was built in under an hour. People may say it should not have taken more than 5 minutes. I concur, but I’m in no hurry. Crawl before you walk; that’s the state for me right now. I used the tool to compute the total cost of LLM usage for this activity.

total cost for project

To generate metadata fields and perform sanity check on this post, it cost me $0.105 using gpt-5.3-chat. Damn!

Fin

I’m happy to report the tool can be found on GitHub.