Tim Frenzel

// Insight

smolagents: when the agent's action is code

agents, tooling, code-execution

Most agent frameworks have the model emit a JSON tool call. smolagents, a deliberately minimal library from Hugging Face, takes a different line: its agents write their actions as executable Python code. For quant work the fit is natural, because the action often is code: pull the data, run the regression, score the backtest.

The argument for code actions is expressiveness. A JSON tool call invokes one function with some arguments. Code can do that, then loop over the result, filter it, feed it to the next call, and branch on what it finds, all in a single step. Programming languages were built to describe computer behavior. A model that already writes good code gets that composability for almost nothing. For multi-step work, one code action can replace a long chain of separate tool calls, with less round-tripping and less glue.
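A minimal sketch of that contrast, in plain Python rather than smolagents itself. The tools `get_prices` and `total_return`, the tickers, and the price series are all hypothetical, invented only for illustration. The bare `exec` call stands in for a real executor and is not a sandbox.

```python
import json

# Hypothetical tools with made-up data, for illustration only.
def get_prices(ticker):
    data = {"AAA": [100.0, 102.0, 101.0, 105.0], "BBB": [50.0, 49.0, 48.0, 47.5]}
    return data[ticker]

def total_return(prices):
    return prices[-1] / prices[0] - 1.0

TOOLS = {"get_prices": get_prices, "total_return": total_return}

def run_json_call(raw):
    """JSON style: one call invokes exactly one function with some arguments."""
    call = json.loads(raw)
    return TOOLS[call["name"]](**call["arguments"])

# Two round trips for a single ticker; the orchestrator has to carry
# the intermediate result from the first call into the second.
prices = run_json_call('{"name": "get_prices", "arguments": {"ticker": "AAA"}}')
ret = run_json_call(json.dumps({"name": "total_return", "arguments": {"prices": prices}}))

# Code style: one action loops, composes the tools, filters, and branches.
CODE_ACTION = """
winners = []
for ticker in ["AAA", "BBB"]:
    r = total_return(get_prices(ticker))
    if r > 0:
        winners.append((ticker, round(r, 4)))
result = winners
"""

def run_code_action(source):
    namespace = dict(TOOLS)   # tools are exposed as ordinary functions
    exec(source, namespace)   # illustration only: plain exec is NOT a sandbox
    return namespace["result"]

code_result = run_code_action(CODE_ACTION)
```

The JSON path needs one model turn per function call, while the code path covers both tickers and the filter in a single action.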

A code-writing agent loop
Task → Model writes the next action as Python code → Execute in a sandbox → Observe the result → Done, or write the next step
Instead of emitting a JSON tool call, the agent writes executable code and runs it. For quant work the action often is code already: query the data, compute the metric, score the backtest.
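The loop can be sketched in a few lines. This is a toy under stated assumptions, not smolagents' implementation: `ScriptedModel` is a hypothetical stand-in that replays canned actions instead of calling a real model, `final_answer` is this sketch's own way of signalling completion, and the plain `exec` marks the spot where a sandboxed executor would go.

```python
import contextlib
import io

class ScriptedModel:
    """Hypothetical stand-in for an LLM: replays canned code actions in order."""
    def __init__(self, actions):
        self._actions = list(actions)

    def next_action(self, task, observations):
        return self._actions.pop(0)

class Done(Exception):
    """Raised by final_answer to end the loop with a result."""
    def __init__(self, answer):
        self.answer = answer

def final_answer(value):
    raise Done(value)

def run_agent(task, model, max_steps=5):
    state = {"final_answer": final_answer}  # variables persist across steps
    observations = []
    for _ in range(max_steps):
        code = model.next_action(task, observations)   # model writes the action
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, state)  # placeholder: a real setup runs this in a sandbox
        except Done as done:
            return done.answer, observations            # done
        except Exception as exc:
            observations.append(f"error: {exc!r}")     # errors are observations too
            continue
        observations.append(buffer.getvalue().strip())  # observe the result
    raise RuntimeError("step budget exhausted")

model = ScriptedModel([
    "returns = [0.01, -0.02, 0.03]\nprint(len(returns))",
    "final_answer(sum(returns) / len(returns))",
])
answer, observations = run_agent("mean return", model)
```

The second action reuses `returns` from the first, which is the point of keeping one namespace alive across steps.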

Why it fits a quant desk

The match is close because so much analyst work already is code. The actions a research assistant needs to take (querying a database, computing a metric, running a backtest, fitting a model) are things you express in Python anyway. An agent that writes code is not translating its intent into a constrained tool schema. It is working in the native language of the task, which means fewer awkward wrappers and more of the model’s actual capability reaching the problem. smolagents keeps the scaffold minimal and model-agnostic: you can point it at any model and wrap your own functions as tools, which makes it a light way to prototype analyst automation without committing to a heavy framework.

A concrete case makes it tangible. Ask the agent to check whether a factor still works. It can write the code to pull the returns, compute the factor, run the regression, and report the t-stat in one action, branching on what it finds rather than waiting for a human to chain the tools. The natural unit of analyst work is a small script, and a small script is exactly what a code agent produces. The same task expressed as a sequence of JSON tool calls is clumsier and more brittle, because the orchestration logic gets pushed into the prompt.
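For a sense of scale, the script such an action amounts to might look roughly like this. Everything specific here is an assumption of the sketch: the two series are synthetic toy numbers rather than market data, the regression is a single-factor OLS written out by hand so the example has no dependencies, and the |t| > 2 cutoff is a rough rule of thumb, not a recommendation.

```python
import math

def slope_t_stat(x, y):
    """OLS of y on x with an intercept; returns (slope, t-stat of the slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    # Residual sum of squares, then the standard error of the slope.
    ssr = sum((yi - alpha - beta * xi) ** 2 for xi, yi in zip(x, y))
    se_beta = math.sqrt(ssr / (n - 2) / sxx)
    return beta, beta / se_beta

# Synthetic toy inputs standing in for "pull the returns, compute the factor".
factor_exposure = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
forward_returns = [-0.02, -0.01, 0.0, 0.012, 0.018, 0.031]

beta, t_stat = slope_t_stat(factor_exposure, forward_returns)

# Branch on the finding inside the same action.
if abs(t_stat) > 2.0:   # rough rule of thumb, an assumption of this sketch
    verdict = "factor still loads significantly"
else:
    verdict = "no reliable relationship in this sample"

report = {"beta": beta, "t_stat": t_stat, "verdict": verdict}
```

In a real run the two lists would come from the agent's data tools, and the branch could trigger a follow-up check instead of just setting a label.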

The catch

Executing model-written code is the obvious risk. It is a real one. A model that writes its own actions can write a harmful one, by mistake or by prompt injection, so sandboxed execution is not optional. smolagents supports it. You have to use it. The second caution is the one the library’s own authors make: agents are overkill for deterministic work. If a fixed workflow handles your queries, code that workflow directly and get a reliable system with no model in the loop. Reach for an agent when the task genuinely needs the model to decide its own steps, not as a default. Used where it fits, a code-writing agent is a sharp tool for the messy, exploratory corner of analyst work, the part where you cannot specify the steps in advance.
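One cheap layer you can put in front of any executor is a static check on the generated code before it runs. The sketch below is my own illustration, not smolagents' mechanism, and the allowlist and blocklist contents are hypothetical. It is easy to evade, so treat it as a complement to sandboxed execution, never a replacement.

```python
import ast

ALLOWED_IMPORTS = {"math", "statistics"}                          # hypothetical allowlist
BLOCKED_CALLS = {"eval", "exec", "open", "compile", "__import__"}  # hypothetical blocklist

def violations(source):
    """Return a list of reasons to refuse running this code; empty means none found."""
    problems = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    problems.append(f"import of {alias.name!r}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] not in ALLOWED_IMPORTS:
                problems.append(f"import from {node.module!r}")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                problems.append(f"call to {node.func.id}()")
    return problems

safe = violations("import math\nprint(math.sqrt(4))")
risky = violations("import os\nos.remove('positions.csv')")
sneaky = violations("open('secrets.txt')")
```

A check like this catches the careless mistake, not the determined attack, which is why the isolation boundary still has to come from the executor.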

There is a maturity caution as well. smolagents is deliberately small, which is its appeal and its limit. It gives you the core loop and stays out of the way. It does not give you the orchestration, observability, and guardrails a production agent platform provides. For prototyping an analyst tool, that minimalism is exactly right. For running one against real money or real client data, you build the controls around it yourself, or you reach for something heavier.

smolagents lets an agent write its actions as code instead of JSON, which fits quant work where the action is code anyway. Sandbox the execution and reserve it for genuinely open-ended tasks; used that way it is a light scaffold for analyst automation.
