
Commit 69e90bc

tonyxwz and DouweM authored
Add an example of using RunContext to pass data among tools (#2316)
Co-authored-by: Douwe Maan <[email protected]>
1 parent 80a7284 commit 69e90bc

File tree

6 files changed

+504
-6
lines changed


docs/examples/data-analyst.md

Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
# Data Analyst

Sometimes in an agent workflow, the agent does not need to know the exact tool
output, but still needs to process the tool output in some way. This is
especially common in data analytics: the agent needs to know that the result of a
query tool is a `DataFrame` with certain named columns, but not
necessarily the content of every single row.

With Pydantic AI, you can use a [dependencies object](../dependencies.md) to
store the result from one tool and use it in another tool.
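The core of this pattern — a deps object that stores large values and hands the model only short references — can be sketched without any model or agent plumbing. This is a minimal illustration, not part of the Pydantic AI API; all names below are hypothetical:

```python
from dataclasses import dataclass, field


@dataclass
class RefStore:
    """Hypothetical stand-in for an agent deps object: tools put large
    values here and only a short reference travels through the model."""

    output: dict[str, object] = field(default_factory=dict)

    def store(self, value: object) -> str:
        # hand out IPython-style references: Out[1], Out[2], ...
        ref = f'Out[{len(self.output) + 1}]'
        self.output[ref] = value
        return ref

    def get(self, ref: str) -> object:
        if ref not in self.output:
            # the real example raises ModelRetry here so the LLM can correct itself
            raise KeyError(f'{ref} is not a valid reference')
        return self.output[ref]


store = RefStore()
ref = store.store([1, 2, 3])  # a "tool" stores its big output
print(ref)  # → Out[1]
print(store.get(ref))  # another "tool" retrieves it by reference
```

Each tool that produces a big value returns only the reference string to the model; any tool that needs the value looks it up through the shared deps object.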
11+
12+
In this example, we'll build an agent that analyzes the [Rotten Tomatoes movie review dataset from Cornell](https://huggingface.co/datasets/cornell-movie-review-data/rotten_tomatoes).
13+
14+
15+
Demonstrates:
16+
17+
- [agent dependencies](../dependencies.md)
18+
19+
20+
## Running the Example
21+
22+
With [dependencies installed and environment variables set](./index.md#usage), run:
23+
24+
```bash
25+
python/uv-run -m pydantic_ai_examples.data_analyst
26+
```
27+
28+
29+
Output (debug):
30+
31+
32+
> Based on my analysis of the Cornell Movie Review dataset (rotten_tomatoes), there are **4,265 negative comments** in the training split. These are the reviews labeled as 'neg' (represented by 0 in the dataset).
33+
34+
35+
36+
## Example Code
37+
38+
```snippet {path="/examples/pydantic_ai_examples/data_analyst.py"}```
39+
40+
41+
## Appendix
42+
43+
### Choosing a Model
44+
45+
This example requires using a model that understands DuckDB SQL. You can check with `clai`:
46+
47+
```sh
48+
> clai -m bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0
49+
clai - Pydantic AI CLI v0.0.1.dev920+41dd069 with bedrock:us.anthropic.claude-3-7-sonnet-20250219-v1:0
50+
clai ➤ do you understand duckdb sql?
51+
# DuckDB SQL
52+
53+
Yes, I understand DuckDB SQL. DuckDB is an in-process analytical SQL database
54+
that uses syntax similar to PostgreSQL. It specializes in analytical queries
55+
and is designed for high-performance analysis of structured data.
56+
57+
Some key features of DuckDB SQL include:
58+
59+
• OLAP (Online Analytical Processing) optimized
60+
• Columnar-vectorized query execution
61+
• Standard SQL support with PostgreSQL compatibility
62+
• Support for complex analytical queries
63+
• Efficient handling of CSV/Parquet/JSON files
64+
65+
I can help you with DuckDB SQL queries, schema design, optimization, or other
66+
DuckDB-related questions.
67+
```

docs/tools.md

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ There are a number of ways to register tools with an agent:
  - via the [`@agent.tool_plain`][pydantic_ai.Agent.tool_plain] decorator — for tools that do not need access to the agent [context][pydantic_ai.tools.RunContext]
  - via the [`tools`][pydantic_ai.Agent.__init__] keyword argument to `Agent` which can take either plain functions, or instances of [`Tool`][pydantic_ai.tools.Tool]

- For more advanced use cases, the [toolsets](toolsets.md) feature lets you manage collections of tools (built by you or providd by an [MCP server](mcp/client.md) or other [third party](#third-party-tools)) and register them with an agent in one go via the [`toolsets`][pydantic_ai.Agent.__init__] keyword argument to `Agent`.
+ For more advanced use cases, the [toolsets](toolsets.md) feature lets you manage collections of tools (built by you or provided by an [MCP server](mcp/client.md) or other [third party](#third-party-tools)) and register them with an agent in one go via the [`toolsets`][pydantic_ai.Agent.__init__] keyword argument to `Agent`.

  !!! info "Function tools vs. RAG"
      Function tools are basically the "R" of RAG (Retrieval-Augmented Generation) — they augment what the model can do by letting it request extra information.
examples/pydantic_ai_examples/data_analyst.py

Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
```python
from dataclasses import dataclass, field

import datasets
import duckdb
import pandas as pd

from pydantic_ai import Agent, ModelRetry, RunContext


@dataclass
class AnalystAgentDeps:
    output: dict[str, pd.DataFrame] = field(default_factory=dict)

    def store(self, value: pd.DataFrame) -> str:
        """Store the output in deps and return a reference such as Out[1] to be used by the LLM."""
        ref = f'Out[{len(self.output) + 1}]'
        self.output[ref] = value
        return ref

    def get(self, ref: str) -> pd.DataFrame:
        if ref not in self.output:
            raise ModelRetry(
                f'Error: {ref} is not a valid variable reference. Check the previous messages and try again.'
            )
        return self.output[ref]


analyst_agent = Agent(
    'openai:gpt-4o',
    deps_type=AnalystAgentDeps,
    instructions='You are a data analyst and your job is to analyze the data according to the user request.',
)


@analyst_agent.tool
def load_dataset(
    ctx: RunContext[AnalystAgentDeps],
    path: str,
    split: str = 'train',
) -> str:
    """Load the `split` of dataset `path` from Hugging Face.

    Args:
        ctx: Pydantic AI agent RunContext
        path: name of the dataset in the form of `<user_name>/<dataset_name>`
        split: load the split of the dataset (default: "train")
    """
    # begin load data from hf
    builder = datasets.load_dataset_builder(path)  # pyright: ignore[reportUnknownMemberType]
    splits: dict[str, datasets.SplitInfo] = builder.info.splits or {}  # pyright: ignore[reportUnknownMemberType]
    if split not in splits:
        raise ModelRetry(
            f'{split} is not valid for dataset {path}. Valid splits are {",".join(splits.keys())}'
        )

    builder.download_and_prepare()  # pyright: ignore[reportUnknownMemberType]
    dataset = builder.as_dataset(split=split)
    assert isinstance(dataset, datasets.Dataset)
    dataframe = dataset.to_pandas()
    assert isinstance(dataframe, pd.DataFrame)
    # end load data from hf

    # store the dataframe in the deps and get a ref like "Out[1]"
    ref = ctx.deps.store(dataframe)
    # construct a summary of the loaded dataset
    output = [
        f'Loaded the dataset as `{ref}`.',
        f'Description: {dataset.info.description}'
        if dataset.info.description
        else None,
        f'Features: {dataset.info.features!r}' if dataset.info.features else None,
    ]
    return '\n'.join(filter(None, output))


@analyst_agent.tool
def run_duckdb(ctx: RunContext[AnalystAgentDeps], dataset: str, sql: str) -> str:
    """Run a DuckDB SQL query on the DataFrame.

    Note that the virtual table name used in DuckDB SQL must be `dataset`.

    Args:
        ctx: Pydantic AI agent RunContext
        dataset: reference string to the DataFrame
        sql: the query to be executed using DuckDB
    """
    data = ctx.deps.get(dataset)
    result = duckdb.query_df(df=data, virtual_table_name='dataset', sql_query=sql)
    # pass the result back as a ref (DuckDB SQL can select many rows, creating another huge dataframe)
    ref = ctx.deps.store(result.df())  # pyright: ignore[reportUnknownMemberType]
    return f'Executed SQL, result is `{ref}`'


@analyst_agent.tool
def display(ctx: RunContext[AnalystAgentDeps], name: str) -> str:
    """Display at most 5 rows of the dataframe."""
    dataset = ctx.deps.get(name)
    return dataset.head().to_string()  # pyright: ignore[reportUnknownMemberType]


if __name__ == '__main__':
    deps = AnalystAgentDeps()
    result = analyst_agent.run_sync(
        user_prompt='Count how many negative comments are there in the dataset `cornell-movie-review-data/rotten_tomatoes`',
        deps=deps,
    )
    print(result.output)
```
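The load → query → display chain above depends only on reference strings flowing between tools; the large DataFrames never enter the model's context. That flow can be simulated with the standard library alone — the following is a hypothetical sketch (a list of dicts stands in for the DataFrame, a plain Python filter stands in for DuckDB):

```python
from dataclasses import dataclass, field


@dataclass
class Deps:
    output: dict[str, list[dict]] = field(default_factory=dict)

    def store(self, value: list[dict]) -> str:
        ref = f'Out[{len(self.output) + 1}]'
        self.output[ref] = value
        return ref

    def get(self, ref: str) -> list[dict]:
        return self.output[ref]


def load_dataset(deps: Deps) -> str:
    # stand-in for the Hugging Face download: label 0 marks a negative review
    rows = [{'text': 'great film', 'label': 1}, {'text': 'awful', 'label': 0}]
    return deps.store(rows)


def run_query(deps: Deps, dataset: str) -> str:
    # stand-in for DuckDB: count rows with label == 0
    data = deps.get(dataset)
    count = sum(1 for row in data if row['label'] == 0)
    return deps.store([{'negative_count': count}])


deps = Deps()
ref1 = load_dataset(deps)     # "Out[1]" — the full dataset stays in deps
ref2 = run_query(deps, ref1)  # "Out[2]" — only references are passed along
print(deps.get(ref2))         # → [{'negative_count': 1}]
```

In the real example the model, not Python code, decides which reference to feed into which tool, but the data never leaves the deps object either way.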

examples/pyproject.toml

Lines changed: 3 additions & 0 deletions
@@ -60,6 +60,9 @@ dependencies = [
      "gradio>=5.9.0; python_version>'3.9'",
      "mcp[cli]>=1.4.1; python_version >= '3.10'",
      "modal>=1.0.4",
+     "duckdb>=1.3.2",
+     "datasets>=4.0.0",
+     "pandas>=2.2.3",
  ]

  [tool.hatch.build.targets.wheel]

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -66,6 +66,7 @@ nav:
      - examples/chat-app.md
      - examples/question-graph.md
      - examples/slack-lead-qualifier.md
+     - examples/data-analyst.md

  - API Reference:
      - api/ag_ui.md
