docs: add tool guardrails documentation #2218
Open
TasmeerJamali wants to merge 1 commit into openai:main from TasmeerJamali:docs-add-tool-guardrails-documentation
+139
−1
@@ -161,4 +161,142 @@ async def main():
1. This is the actual agent's output type.
2. This is the guardrail's output type.
3. This is the guardrail function that receives the agent's output, and returns the result.
4. This is the actual agent that defines the workflow.

## Tool guardrails

In addition to agent-level input and output guardrails, you can apply guardrails directly to individual tools. **Tool guardrails** validate the arguments passed to a tool or the results it returns, rather than the overall agent input/output.

There are two types of tool guardrails:

1. **Tool input guardrails** run *before* the tool executes, validating its arguments
2. **Tool output guardrails** run *after* the tool executes, validating its return value

!!! Note

    Tool guardrails are different from agent-level guardrails:

    - **Agent guardrails** run on user input (first agent) or final output (last agent)
    - **Tool guardrails** run on every invocation of a specific tool, regardless of which agent calls it

### Tool input guardrails

Tool input guardrails run before the tool function executes. They receive a [`ToolInputGuardrailData`][agents.tool_guardrails.ToolInputGuardrailData] object containing:

- The [`ToolContext`][agents.tool_context.ToolContext] with tool name, arguments, and call ID
- The [`Agent`][agents.agent.Agent] that is executing the tool

```python
import json

from agents import (
    ToolGuardrailFunctionOutput,
    ToolInputGuardrailData,
    function_tool,
    tool_input_guardrail,
)


@tool_input_guardrail
def validate_email_args(data: ToolInputGuardrailData) -> ToolGuardrailFunctionOutput:
    """Block emails containing suspicious keywords."""
    args = json.loads(data.context.tool_arguments) if data.context.tool_arguments else {}

    blocked_words = ["password", "hack", "exploit"]
    for key, value in args.items():
        value_str = str(value).lower()
        for word in blocked_words:
            if word in value_str:
                return ToolGuardrailFunctionOutput.reject_content(
                    message=f"Email blocked: contains '{word}'",
                    output_info={"blocked_word": word},
                )

    return ToolGuardrailFunctionOutput(output_info="Validated")


@function_tool
def send_email(to: str, subject: str, body: str) -> str:
    """Send an email."""
    return f"Email sent to {to}"


# Attach the guardrail to the tool
send_email.tool_input_guardrails = [validate_email_args]
```

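The keyword scan inside `validate_email_args` can be exercised without running an agent. The following is a minimal sketch, assuming the tool arguments arrive as a JSON object string (as the guardrail above treats them); `find_blocked_word` and `BLOCKED_WORDS` are hypothetical names introduced here for illustration and are not part of the SDK.

```python
import json
from typing import Optional

# Same keywords as the guardrail above.
BLOCKED_WORDS = ["password", "hack", "exploit"]


def find_blocked_word(tool_arguments: Optional[str]) -> Optional[str]:
    """Return the first blocked word found in any argument value, else None.

    Hypothetical helper: mirrors the scan in validate_email_args, but takes
    the raw JSON argument string directly instead of a guardrail data object.
    """
    args = json.loads(tool_arguments) if tool_arguments else {}
    for value in args.values():
        value_str = str(value).lower()
        for word in BLOCKED_WORDS:
            if word in value_str:
                return word
    return None


print(find_blocked_word('{"to": "a@example.com", "subject": "Hi", "body": "Lunch?"}'))
print(find_blocked_word('{"to": "a@example.com", "subject": "Your PASSWORD", "body": ""}'))
print(find_blocked_word('{"body": "See you at the hackathon"}'))
```

Because the check is a case-insensitive substring match, the third call reports `hack` for the harmless word "hackathon". If that kind of false positive matters, a word-boundary match may be a better fit than plain substring search.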
### Tool output guardrails

Tool output guardrails run after the tool function executes. They receive a [`ToolOutputGuardrailData`][agents.tool_guardrails.ToolOutputGuardrailData] object which extends the input data with:

- The `output` produced by the tool function

```python
from agents import (
    ToolGuardrailFunctionOutput,
    ToolOutputGuardrailData,
    ToolOutputGuardrailTripwireTriggered,
    function_tool,
    tool_output_guardrail,
)


@tool_output_guardrail
def block_sensitive_data(data: ToolOutputGuardrailData) -> ToolGuardrailFunctionOutput:
    """Block outputs containing sensitive information like SSNs."""
    output_str = str(data.output).lower()

    if "ssn" in output_str or "123-45-6789" in output_str:
        # Halt execution completely for sensitive data.
        return ToolGuardrailFunctionOutput.raise_exception(
            output_info={"blocked_pattern": "SSN", "tool": data.context.tool_name},
        )

    return ToolGuardrailFunctionOutput(output_info="Output validated")


@function_tool
def get_user_data(user_id: str) -> dict:
    """Get user data by ID."""
    return {"user_id": user_id, "name": "John", "ssn": "123-45-6789"}


# Attach the guardrail to the tool
get_user_data.tool_output_guardrails = [block_sensitive_data]
```

Review comment (Member) on `get_user_data.tool_output_guardrails = [block_sensitive_data]`: same as above

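The literal check in `block_sensitive_data` only catches the key name `ssn` or the one hard-coded number. As a possible alternative, the sketch below matches anything shaped like a US SSN with a regular expression. `SSN_PATTERN` and `contains_ssn` are hypothetical names for illustration, and the pattern checks the format only; it does not validate that a number is a real SSN.

```python
import re

# Assumed SSN shape: three digits, two digits, four digits, hyphen-separated.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def contains_ssn(output: object) -> bool:
    """Return True if the stringified tool output contains an SSN-shaped value."""
    return SSN_PATTERN.search(str(output)) is not None


print(contains_ssn({"user_id": "user123", "name": "John", "ssn": "123-45-6789"}))
print(contains_ssn({"user_id": "user123", "name": "John"}))
print(contains_ssn("call 555-867-5309"))
```

The first call prints `True`; the other two print `False`, since a phone number in 3-3-4 grouping does not fit the 3-2-4 shape. Inside an output guardrail, a helper like this could replace the `in output_str` condition.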
### Guardrail behavior types

Tool guardrails return a [`ToolGuardrailFunctionOutput`][agents.tool_guardrails.ToolGuardrailFunctionOutput] that specifies how the system should respond:

| Behavior | Method | Effect |
|----------|--------|--------|
| **Allow** | `ToolGuardrailFunctionOutput(output_info=...)` | Continue normal execution (default) |
| **Reject content** | `.reject_content(message, output_info)` | Block the tool call but continue agent execution with a message |
| **Raise exception** | `.raise_exception(output_info)` | Halt execution by raising `ToolInputGuardrailTripwireTriggered` or `ToolOutputGuardrailTripwireTriggered` |

Use `reject_content` when you want to gracefully handle a violation and let the agent continue. Use `raise_exception` for critical violations that must stop all execution immediately.

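To make the three behaviors in the table concrete, here is a toy model of how a tool call might be resolved under each one for an input guardrail. This is only an illustration under stated assumptions, not the SDK's implementation: `Decision`, `GuardrailTripwire`, and `call_tool_with_input_guardrail` are hypothetical stand-ins, and the assumption that the rejection message takes the place of the tool result is an interpretation of "continue agent execution with a message".

```python
from dataclasses import dataclass
from typing import Callable, Optional


class GuardrailTripwire(Exception):
    """Hypothetical stand-in for the SDK's tripwire exceptions."""


@dataclass
class Decision:
    """Hypothetical stand-in for ToolGuardrailFunctionOutput."""

    behavior: str  # "allow", "reject_content", or "raise_exception"
    message: Optional[str] = None


def call_tool_with_input_guardrail(decision: Decision, tool: Callable[[], str]) -> str:
    """Return what the agent would receive as the tool result."""
    if decision.behavior == "raise_exception":
        # Critical violation: stop everything; the caller must catch this.
        raise GuardrailTripwire("tool input guardrail tripped")
    if decision.behavior == "reject_content":
        # The tool is skipped; the message stands in for its result (assumption).
        return decision.message or ""
    # "allow": run the tool as normal.
    return tool()


print(call_tool_with_input_guardrail(Decision("allow"), lambda: "Email sent"))
print(call_tool_with_input_guardrail(Decision("reject_content", "Email blocked"), lambda: "Email sent"))
```

The point of the model is the control flow: only `raise_exception` leaves the run loop, while `reject_content` keeps the agent running with a substitute result.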
### Handling tool guardrail exceptions

When a guardrail uses `raise_exception()`, you can catch it to handle the violation:

```python
from agents import Agent, Runner, ToolOutputGuardrailTripwireTriggered


agent = Agent(
    name="Assistant",
    instructions="You help users retrieve data.",
    tools=[get_user_data],  # Tool with output guardrail attached
)


async def main():
    try:
        result = await Runner.run(agent, "Get data for user123")
        print(result.final_output)
    except ToolOutputGuardrailTripwireTriggered as e:
        print(f"Blocked: {e.output.output_info}")
        # Handle the violation appropriately
```
Indeed, this is the only way to pass tool guardrails, but for a better developer experience it would be nicer to pass them to the decorator. I will open a pull request to add this.