Chat, agents, and tools
Chat vs agent
- Chat: one model call per prompt / execute / stream / stream_*. Tool callbacks are not run automatically; you can inspect tool_calls and drive follow-up turns yourself.
- Agent
  - Unary prompt / execute / prompt_parts / prompt_content: exactly one model call; registered tools are not run by the runtime.
  - Streaming stream / stream_prompt / …: runs the automatic tool loop until the model returns without tool calls or with_max_tool_round_trips (default 8) is exceeded.
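The streaming tool loop on Agent can be pictured with the following minimal, self-contained sketch. It is not the library's implementation: ToolCall, ModelReply, LoopOutcome, and run_tool_loop are hypothetical names, the model and tool executor are plain closures, and the exact meaning of "exceeded" (here, the model still asks for tools after max_round_trips tool rounds have already run) is an assumption.

```rust
// Hypothetical types for illustration only; not the library's real API.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub name: String,
    pub arguments: String,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ModelReply {
    pub text: String,
    pub tool_calls: Vec<ToolCall>,
}

#[derive(Debug, PartialEq)]
pub enum LoopOutcome {
    /// The model answered without requesting any tools.
    Finished { text: String, round_trips: usize },
    /// The model still requested tools after the budget was used up.
    MaxRoundTripsExceeded { round_trips: usize },
}

/// Calls the model repeatedly. Each reply that contains tool calls costs one
/// round trip: the tools run and their results are appended to the transcript
/// that the next model call sees.
pub fn run_tool_loop<M, T>(
    mut call_model: M,
    mut run_tool: T,
    max_round_trips: usize,
) -> LoopOutcome
where
    M: FnMut(&[String]) -> ModelReply,
    T: FnMut(&ToolCall) -> String,
{
    let mut transcript: Vec<String> = Vec::new();
    let mut round_trips = 0;
    loop {
        let reply = call_model(&transcript);
        if reply.tool_calls.is_empty() {
            return LoopOutcome::Finished { text: reply.text, round_trips };
        }
        // Assumption: the budget counts completed tool rounds.
        if round_trips == max_round_trips {
            return LoopOutcome::MaxRoundTripsExceeded { round_trips };
        }
        for call in &reply.tool_calls {
            transcript.push(run_tool(call));
        }
        round_trips += 1;
    }
}
```

In this model, the unary methods correspond to a single call_model invocation with no loop around it, which is also what every Chat method does; the caller inspects tool_calls and decides what to do next.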
Tool registration
- Tools are registered per chat or agent session.
- Tool calls are passed through the runtime request to the model.
- When tools run on Agent, local session tools are preferred before a runtime-level tool executor.
- Each tool’s ToolDefinition::execution_mode controls concurrency with other tools in the same model turn. If any invoked tool in a turn is Sequential, all tool calls in that turn run sequentially in model order.
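The two rules above (local tools first, and one Sequential tool forcing the whole turn to run in order) can be sketched as follows. This models the documented behavior only and assumes nothing about the real types: ExecutionMode::Parallel, TurnSchedule, ToolSource, schedule_turn, and resolve_tool are hypothetical names, and only the Sequential variant is named in the text.

```rust
use std::collections::HashSet;

// Hypothetical mirror of a tool's execution mode. Only `Sequential` is named
// in the documentation; `Parallel` is an assumed name for the other case.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutionMode {
    Parallel,
    Sequential,
}

#[derive(Debug, PartialEq, Eq)]
pub enum TurnSchedule {
    /// No invoked tool is Sequential, so calls may overlap.
    Concurrent,
    /// At least one invoked tool is Sequential: run every call in the order
    /// the model emitted them.
    SequentialInModelOrder,
}

/// Decides how the tool calls of a single model turn are scheduled, given
/// the execution mode of each invoked tool.
pub fn schedule_turn(invoked: &[ExecutionMode]) -> TurnSchedule {
    if invoked.iter().any(|mode| *mode == ExecutionMode::Sequential) {
        TurnSchedule::SequentialInModelOrder
    } else {
        TurnSchedule::Concurrent
    }
}

#[derive(Debug, PartialEq, Eq)]
pub enum ToolSource {
    LocalSession,
    RuntimeExecutor,
}

/// Resolves a tool name, preferring tools registered on the session over a
/// runtime-level executor. Returns `None` when neither knows the tool.
pub fn resolve_tool(
    name: &str,
    local_tools: &HashSet<String>,
    runtime_tools: &HashSet<String>,
) -> Option<ToolSource> {
    if local_tools.contains(name) {
        Some(ToolSource::LocalSession)
    } else if runtime_tools.contains(name) {
        Some(ToolSource::RuntimeExecutor)
    } else {
        None
    }
}
```

Under this model, a tool with side effects that must not interleave would be marked Sequential, at the cost of serializing every other tool call in the same turn.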
Context compression (agent streaming)
On Agent, with_context_compressor (Rust) registers an async closure that runs once per streaming tool-loop round, immediately before each model call. It receives the accumulated ChatMessage list and an optional best-effort input-token estimate.
The hook is not used for unary prompt / execute.
In JavaScript, use AgentSession.registerContextCompressor before streamPrompt / streamPromptWithContent.
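As an illustration of what a compressor body might do, here is a minimal synchronous sketch of the decision logic. The real hook is an async closure whose exact signature is not shown here, so everything below is an assumption: the ChatMessage fields, the compress_context function, and the token_budget and keep_recent parameters are hypothetical. The sketch leaves the history untouched when no estimate is available or the estimate is within budget; otherwise it keeps a leading system message and the most recent messages, and replaces the rest with a short marker.

```rust
// Hypothetical message shape; the library's ChatMessage may differ.
#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: String,
    pub content: String,
}

/// Trims the accumulated history when the best-effort token estimate exceeds
/// `token_budget`, keeping the last `keep_recent` messages verbatim.
pub fn compress_context(
    messages: Vec<ChatMessage>,
    estimated_input_tokens: Option<usize>,
    token_budget: usize,
    keep_recent: usize,
) -> Vec<ChatMessage> {
    // The estimate is optional: without one, or under budget, do nothing.
    match estimated_input_tokens {
        Some(tokens) if tokens > token_budget => {}
        _ => return messages,
    }
    // Too short to gain anything from compression.
    if messages.len() <= keep_recent + 1 {
        return messages;
    }

    let split = messages.len() - keep_recent;
    let mut compressed = Vec::with_capacity(keep_recent + 2);

    // Preserve a leading system message, if there is one.
    let start = if messages[0].role == "system" {
        compressed.push(messages[0].clone());
        1
    } else {
        0
    };

    // Replace the dropped middle section with a single marker message.
    let dropped = split - start;
    compressed.push(ChatMessage {
        role: "system".to_string(),
        content: format!("[{} earlier messages omitted]", dropped),
    });
    compressed.extend(messages[split..].iter().cloned());
    compressed
}
```

A production compressor would likely summarize rather than drop, and would take care not to separate a tool call from its result; this sketch ignores both concerns for brevity.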
For full tables and API notes, see the README “Tool Calling” section.