Agent Structure Reference Documentation¶
graph TD
A[Task Initiation] -->|Receives Task| B[Initial LLM Processing]
B -->|Interprets Task| C[Tool Usage]
C -->|Calls Tools| D[Function 1]
C -->|Calls Tools| E[Function 2]
D -->|Returns Data| C
E -->|Returns Data| C
C -->|Provides Data| F[Memory Interaction]
F -->|Stores and Retrieves Data| G[RAG System]
G -->|ChromaDB/Pinecone| H[Enhanced Data]
F -->|Provides Enhanced Data| I[Final LLM Processing]
I -->|Generates Final Response| J[Output]
C -->|No Tools Available| K[Skip Tool Usage]
K -->|Proceeds to Memory Interaction| F
F -->|No Memory Available| L[Skip Memory Interaction]
L -->|Proceeds to Final LLM Processing| I
The Agent class is the core component of the Swarm Agent framework. It serves as an autonomous agent that bridges Language Models (LLMs) with external tools and long-term memory systems. The class is designed to handle a variety of document types—including PDFs, text files, Markdown, and JSON—enabling robust document ingestion and processing. By integrating these capabilities, the Agent class empowers LLMs to perform complex tasks, utilize external resources, and manage information efficiently, making it a versatile solution for advanced autonomous workflows.
Features¶
The Agent class establishes a conversational loop with a language model, allowing for interactive task execution, feedback collection, and dynamic response generation. It includes features such as:
| Feature | Description |
|---|---|
| Conversational Loop | Enables back-and-forth interaction with the model. |
| Feedback Collection | Allows users to provide feedback on generated responses. |
| Stoppable Conversation | Supports custom stopping conditions for the conversation. |
| Retry Mechanism | Implements a retry system for handling issues in response generation. |
| Tool Integration | Supports the integration of various tools for enhanced capabilities. |
| Long-term Memory Management | Incorporates vector databases for efficient information retrieval. |
| Document Ingestion | Processes various document types for information extraction. |
| Interactive Mode | Allows real-time communication with the agent. |
| Sentiment Analysis | Evaluates the sentiment of generated responses. |
| Output Filtering and Cleaning | Ensures generated responses meet specific criteria. |
| Asynchronous and Concurrent Execution | Supports efficient parallelization of tasks. |
| Planning and Reasoning | Implements planning functionality for enhanced decision-making. |
| Autonomous Planning and Execution | When max_loops="auto", automatically creates plans, executes subtasks, and generates summaries. Includes built-in tools for file operations, user communication, and workspace management. |
| Agent Handoffs and Task Delegation | Intelligently routes tasks to specialized agents based on capabilities and task requirements. |
Agent Attributes¶
| Attribute | Type | Description |
|---|---|---|
id |
Optional[str] |
Unique identifier for the agent instance. |
llm |
Optional[Any] |
Language model instance used by the agent. |
max_loops |
Optional[Union[int, str]] |
Maximum number of loops the agent can run. |
stopping_condition |
Optional[Callable[[str], bool]] |
Callable function determining when to stop looping. |
loop_interval |
Optional[int] |
Interval (in seconds) between loops. |
retry_attempts |
Optional[int] |
Number of retry attempts for failed LLM calls. |
retry_interval |
Optional[int] |
Interval (in seconds) between retry attempts. |
return_history |
Optional[bool] |
Boolean indicating whether to return conversation history. |
stopping_token |
Optional[str] |
Token that stops the agent from looping when present in the response. |
dynamic_loops |
Optional[bool] |
Boolean indicating whether to dynamically determine the number of loops. |
interactive |
Optional[bool] |
Boolean indicating whether to run in interactive mode. |
dashboard |
Optional[bool] |
Boolean indicating whether to display a dashboard. |
agent_name |
Optional[str] |
Name of the agent instance. |
agent_description |
Optional[str] |
Description of the agent instance. |
system_prompt |
Optional[str] |
System prompt used to initialize the conversation. |
tools |
List[Callable] |
List of callable functions representing tools the agent can use. |
dynamic_temperature_enabled |
Optional[bool] |
Boolean indicating whether to dynamically adjust the LLM's temperature. |
sop |
Optional[str] |
Standard operating procedure for the agent. |
sop_list |
Optional[List[str]] |
List of strings representing the standard operating procedure. |
saved_state_path |
Optional[str] |
File path for saving and loading the agent's state. |
autosave |
Optional[bool] |
Boolean indicating whether to automatically save the agent's state. |
context_length |
Optional[int] |
Maximum length of the context window (in tokens) for the LLM. |
transforms |
Optional[Union[TransformConfig, dict]] |
Message transformation configuration for handling context limits. |
user_name |
Optional[str] |
Name used to represent the user in the conversation. |
multi_modal |
Optional[bool] |
Boolean indicating whether to support multimodal inputs. |
tokenizer |
Optional[Any] |
Instance of a tokenizer used for token counting and management. |
long_term_memory |
Optional[Union[Callable, Any]] |
Instance of a BaseVectorDatabase implementation for long-term memory management. |
fallback_model_name |
Optional[str] |
The fallback model name to use if primary model fails. |
fallback_models |
Optional[List[str]] |
List of model names to try in order. First model is primary, rest are fallbacks. |
preset_stopping_token |
Optional[bool] |
Boolean indicating whether to use a preset stopping token. |
streaming_on |
Optional[bool] |
Boolean indicating whether to stream responses. |
stream |
Optional[bool] |
Boolean indicating whether to enable detailed token-by-token streaming with metadata. |
streaming_callback |
Optional[Callable[[str], None]] |
Callback function to receive streaming tokens in real-time. |
verbose |
Optional[bool] |
Boolean indicating whether to print verbose output. |
stopping_func |
Optional[Callable] |
Callable function used as a stopping condition. |
custom_exit_command |
Optional[str] |
String representing a custom command for exiting the agent's loop. |
custom_tools_prompt |
Optional[Callable] |
Callable function for generating a custom prompt for tool usage. |
tool_schema |
ToolUsageType |
Data structure representing the schema for the agent's tools. |
output_type |
OutputType |
Type representing the expected output type of responses. |
function_calling_type |
str |
String representing the type of function calling. |
output_cleaner |
Optional[Callable] |
Callable function for cleaning the agent's output. |
function_calling_format_type |
Optional[str] |
String representing the format type for function calling. |
list_base_models |
Optional[List[BaseModel]] |
List of base models used for generating tool schemas. |
metadata_output_type |
str |
String representing the output type for metadata. |
state_save_file_type |
str |
String representing the file type for saving the agent's state. |
tool_choice |
str |
String representing the method for tool selection. |
rules |
str |
String representing the rules for the agent's behavior. |
planning_prompt |
Optional[str] |
String representing the prompt for planning. |
custom_planning_prompt |
str |
String representing a custom prompt for planning. |
memory_chunk_size |
int |
Integer representing the maximum size of memory chunks for long-term memory retrieval. |
tool_system_prompt |
str |
String representing the system prompt for tools |
max_tokens |
int |
Integer representing the maximum number of tokens |
temperature |
float |
Float representing the temperature for the LLM |
timeout |
Optional[int] |
Integer representing the timeout for operations in seconds |
tags |
Optional[List[str]] |
Optional list of strings for tagging the agent. |
use_cases |
Optional[List[Dict[str, str]]] |
Optional list of dictionaries describing use cases for the agent. |
auto_generate_prompt |
bool |
Boolean indicating whether to automatically generate prompts. |
rag_every_loop |
bool |
Boolean indicating whether to query RAG database for context on every loop |
plan_enabled |
bool |
Boolean indicating whether planning functionality is enabled |
artifacts_on |
bool |
Boolean indicating whether to save artifacts from agent execution |
artifacts_output_path |
str |
File path where artifacts should be saved |
artifacts_file_extension |
str |
File extension to use for saved artifacts |
model_name |
str |
String representing the name of the model to use |
llm_args |
dict |
Dictionary containing additional arguments for the LLM |
load_state_path |
str |
String representing the path to load state from |
role |
agent_roles |
String representing the role of the agent (e.g., "worker") |
print_on |
bool |
Boolean indicating whether to print output |
tools_list_dictionary |
Optional[List[Dict[str, Any]]] |
List of dictionaries representing tool schemas |
mcp_url |
Optional[Union[str, MCPConnection]] |
String or MCPConnection representing the MCP server URL |
mcp_urls |
List[str] |
List of strings representing multiple MCP server URLs |
react_on |
bool |
Boolean indicating whether to enable ReAct reasoning |
safety_prompt_on |
bool |
Boolean indicating whether to enable safety prompts |
random_models_on |
bool |
Boolean indicating whether to randomly select models |
mcp_config |
Optional[MCPConnection] |
MCPConnection object containing MCP configuration |
mcp_configs |
Optional[MultipleMCPConnections] |
MultipleMCPConnections object for managing multiple MCP server connections |
top_p |
Optional[float] |
Float representing the top-p sampling parameter |
llm_base_url |
Optional[str] |
String representing the base URL for the LLM API |
llm_api_key |
Optional[str] |
String representing the API key for the LLM |
tool_call_summary |
bool |
Boolean indicating whether to summarize tool calls |
summarize_multiple_images |
bool |
Boolean indicating whether to summarize multiple image outputs |
tool_retry_attempts |
int |
Integer representing the number of retry attempts for tool execution |
reasoning_prompt_on |
bool |
Boolean indicating whether to enable reasoning prompts |
reasoning_effort |
Optional[str] |
Reasoning effort level for reasoning-enabled models (e.g., "low", "medium", "high") |
reasoning_enabled |
bool |
Boolean indicating whether to enable reasoning capabilities |
thinking_tokens |
Optional[int] |
Maximum number of thinking tokens for reasoning models |
dynamic_context_window |
bool |
Boolean indicating whether to dynamically adjust context window |
show_tool_execution_output |
bool |
Boolean indicating whether to show tool execution output |
workspace_dir |
str |
String representing the workspace directory for the agent |
handoffs |
Optional[Union[Sequence[Callable], Any]] |
List of Agent instances that can be delegated tasks to. When provided, the agent will use a MultiAgentRouter to intelligently route tasks to the most appropriate specialized agent. |
capabilities |
Optional[List[str]] |
List of strings describing the agent's capabilities. |
mode |
Literal["interactive", "fast", "standard"] |
Execution mode: "interactive" for real-time interaction, "fast" for optimized performance, "standard" for default behavior. |
publish_to_marketplace |
bool |
Boolean indicating whether to publish the agent's prompt to the Swarms marketplace. |
marketplace_prompt_id |
Optional[str] |
Unique UUID identifier of a prompt from the Swarms marketplace. When provided, the agent will automatically fetch and load the prompt as the system prompt. |
Agent Methods¶
| Method | Description | Inputs | Usage Example |
|---|---|---|---|
run(task, img=None, imgs=None, correct_answer=None, streaming_callback=None, *args, **kwargs) |
Runs the autonomous agent loop to complete the given task with enhanced parameters. | task (str): The task to be performed.img (str, optional): Path to a single image file.imgs (List[str], optional): List of image paths for batch processing.correct_answer (str, optional): Expected correct answer for validation with automatic retries.streaming_callback (Callable, optional): Callback function for real-time token streaming.*args, **kwargs: Additional arguments. |
response = agent.run("Generate a report on financial performance.") |
run_batched(tasks, imgs=None, *args, **kwargs) |
Runs multiple tasks concurrently in batch mode. | tasks (List[str]): List of tasks to run.imgs (List[str], optional): List of images to process.*args, **kwargs: Additional arguments. |
responses = agent.run_batched(["Task 1", "Task 2"]) |
run_multiple_images(task, imgs, *args, **kwargs) |
Runs the agent with multiple images using concurrent processing. | task (str): The task to perform on each image.imgs (List[str]): List of image paths or URLs.*args, **kwargs: Additional arguments. |
outputs = agent.run_multiple_images("Describe image", ["img1.jpg", "img2.png"]) |
continuous_run_with_answer(task, img=None, correct_answer=None, max_attempts=10) |
Runs the agent until the correct answer is provided. | task (str): The task to perform.img (str, optional): Image to process.correct_answer (str): Expected answer.max_attempts (int): Maximum attempts. |
response = agent.continuous_run_with_answer("Math problem", correct_answer="42") |
tool_execution_retry(response, loop_count) |
Executes tools with retry logic for handling failures. | response (any): Response containing tool calls.loop_count (int): Current loop number. |
agent.tool_execution_retry(response, 1) |
__call__(task, img=None, *args, **kwargs) |
Alternative way to call the run method. |
Same as run. |
response = agent("Generate a report on financial performance.") |
parse_and_execute_tools(response, *args, **kwargs) |
Parses the agent's response and executes any tools mentioned in it. | response (str): The agent's response to be parsed.*args, **kwargs: Additional arguments. |
agent.parse_and_execute_tools(response) |
add_memory(message) |
Adds a message to the agent's memory. | message (str): The message to add. |
agent.add_memory("Important information") |
plan(task, *args, **kwargs) |
Plans the execution of a task. | task (str): The task to plan.*args, **kwargs: Additional arguments. |
agent.plan("Analyze market trends") |
run_concurrent(task, *args, **kwargs) |
Runs a task concurrently. | task (str): The task to run.*args, **kwargs: Additional arguments. |
response = await agent.run_concurrent("Concurrent task") |
run_concurrent_tasks(tasks, *args, **kwargs) |
Runs multiple tasks concurrently. | tasks (List[str]): List of tasks to run.*args, **kwargs: Additional arguments. |
responses = agent.run_concurrent_tasks(["Task 1", "Task 2"]) |
bulk_run(inputs) |
Generates responses for multiple input sets. | inputs (List[Dict[str, Any]]): List of input dictionaries. |
responses = agent.bulk_run([{"task": "Task 1"}, {"task": "Task 2"}]) |
run_multiple_images(task, imgs, *args, **kwargs) |
Runs the agent with multiple images using concurrent processing. | task (str): The task to perform on each image.imgs (List[str]): List of image paths or URLs.*args, **kwargs: Additional arguments. |
outputs = agent.run_multiple_images("Describe image", ["img1.jpg", "img2.png"]) |
continuous_run_with_answer(task, img=None, correct_answer=None, max_attempts=10) |
Runs the agent until the correct answer is provided. | task (str): The task to perform.img (str, optional): Image to process.correct_answer (str): Expected answer.max_attempts (int): Maximum attempts. |
response = agent.continuous_run_with_answer("Math problem", correct_answer="42") |
save() |
Saves the agent's history to a file. | None | agent.save() |
load(file_path) |
Loads the agent's history from a file. | file_path (str): Path to the file. |
agent.load("agent_history.json") |
graceful_shutdown() |
Gracefully shuts down the system, saving the state. | None | agent.graceful_shutdown() |
analyze_feedback() |
Analyzes the feedback for issues. | None | agent.analyze_feedback() |
undo_last() |
Undoes the last response and returns the previous state. | None | previous_state, message = agent.undo_last() |
add_response_filter(filter_word) |
Adds a response filter to filter out certain words. | filter_word (str): Word to filter. |
agent.add_response_filter("sensitive") |
apply_response_filters(response) |
Applies response filters to the given response. | response (str): Response to filter. |
filtered_response = agent.apply_response_filters(response) |
filtered_run(task) |
Runs a task with response filtering applied. | task (str): Task to run. |
response = agent.filtered_run("Generate a report") |
save_to_yaml(file_path) |
Saves the agent to a YAML file. | file_path (str): Path to save the YAML file. |
agent.save_to_yaml("agent_config.yaml") |
get_llm_parameters() |
Returns the parameters of the language model. | None | llm_params = agent.get_llm_parameters() |
save_state(file_path, *args, **kwargs) |
Saves the current state of the agent to a JSON file. | file_path (str): Path to save the JSON file.*args, **kwargs: Additional arguments. |
agent.save_state("agent_state.json") |
update_system_prompt(system_prompt) |
Updates the system prompt. | system_prompt (str): New system prompt. |
agent.update_system_prompt("New system instructions") |
update_max_loops(max_loops) |
Updates the maximum number of loops. | max_loops (int): New maximum number of loops. |
agent.update_max_loops(5) |
update_loop_interval(loop_interval) |
Updates the loop interval. | loop_interval (int): New loop interval. |
agent.update_loop_interval(2) |
update_retry_attempts(retry_attempts) |
Updates the number of retry attempts. | retry_attempts (int): New number of retry attempts. |
agent.update_retry_attempts(3) |
update_retry_interval(retry_interval) |
Updates the retry interval. | retry_interval (int): New retry interval. |
agent.update_retry_interval(5) |
reset() |
Resets the agent's memory. | None | agent.reset() |
ingest_docs(docs, *args, **kwargs) |
Ingests documents into the agent's memory. | docs (List[str]): List of document paths.*args, **kwargs: Additional arguments. |
agent.ingest_docs(["doc1.pdf", "doc2.txt"]) |
ingest_pdf(pdf) |
Ingests a PDF document into the agent's memory. | pdf (str): Path to the PDF file. |
agent.ingest_pdf("document.pdf") |
receive_message(name, message) |
Receives a message and adds it to the agent's memory. | name (str): Name of the sender.message (str): Content of the message. |
agent.receive_message("User", "Hello, agent!") |
send_agent_message(agent_name, message, *args, **kwargs) |
Sends a message from the agent to a user. | agent_name (str): Name of the agent.message (str): Message to send.*args, **kwargs: Additional arguments. |
response = agent.send_agent_message("AgentX", "Task completed") |
add_tool(tool) |
Adds a tool to the agent's toolset. | tool (Callable): Tool to add. |
agent.add_tool(my_custom_tool) |
add_tools(tools) |
Adds multiple tools to the agent's toolset. | tools (List[Callable]): List of tools to add. |
agent.add_tools([tool1, tool2]) |
remove_tool(tool) |
Removes a tool from the agent's toolset. | tool (Callable): Tool to remove. |
agent.remove_tool(my_custom_tool) |
remove_tools(tools) |
Removes multiple tools from the agent's toolset. | tools (List[Callable]): List of tools to remove. |
agent.remove_tools([tool1, tool2]) |
get_docs_from_doc_folders() |
Retrieves and processes documents from the specified folder. | None | agent.get_docs_from_doc_folders() |
memory_query(task, *args, **kwargs) |
Queries the long-term memory for relevant information. | task (str): The task or query.*args, **kwargs: Additional arguments. |
result = agent.memory_query("Find information about X") |
sentiment_analysis_handler(response) |
Performs sentiment analysis on the given response. | response (str): The response to analyze. |
agent.sentiment_analysis_handler("Great job!") |
count_and_shorten_context_window(history, *args, **kwargs) |
Counts tokens and shortens the context window if necessary. | history (str): The conversation history.*args, **kwargs: Additional arguments. |
shortened_history = agent.count_and_shorten_context_window(history) |
output_cleaner_and_output_type(response, *args, **kwargs) |
Cleans and formats the output based on specified type. | response (str): The response to clean and format.*args, **kwargs: Additional arguments. |
cleaned_response = agent.output_cleaner_and_output_type(response) |
stream_response(response, delay=0.001) |
Streams the response token by token. | response (str): The response to stream.delay (float): Delay between tokens. |
agent.stream_response("This is a streamed response") |
dynamic_context_window() |
Dynamically adjusts the context window. | None | agent.dynamic_context_window() |
check_available_tokens() |
Checks and returns the number of available tokens. | None | available_tokens = agent.check_available_tokens() |
tokens_checks() |
Performs token checks and returns available tokens. | None | token_info = agent.tokens_checks() |
truncate_string_by_tokens(input_string, limit) |
Truncates a string to fit within a token limit. | input_string (str): String to truncate.limit (int): Token limit. |
truncated_string = agent.truncate_string_by_tokens("Long string", 100) |
tokens_operations(input_string) |
Performs various token-related operations on the input string. | input_string (str): String to process. |
processed_string = agent.tokens_operations("Input string") |
parse_function_call_and_execute(response) |
Parses a function call from the response and executes it. | response (str): Response containing the function call. |
result = agent.parse_function_call_and_execute(response) |
llm_output_parser(response) |
Parses the output from the language model. | response (Any): Response from the LLM. |
parsed_response = agent.llm_output_parser(llm_output) |
log_step_metadata(loop, task, response) |
Logs metadata for each step of the agent's execution. | loop (int): Current loop number.task (str): Current task.response (str): Agent's response. |
agent.log_step_metadata(1, "Analyze data", "Analysis complete") |
to_dict() |
Converts the agent's attributes to a dictionary. | None | agent_dict = agent.to_dict() |
to_json(indent=4, *args, **kwargs) |
Converts the agent's attributes to a JSON string. | indent (int): Indentation for JSON.*args, **kwargs: Additional arguments. |
agent_json = agent.to_json() |
to_yaml(indent=4, *args, **kwargs) |
Converts the agent's attributes to a YAML string. | indent (int): Indentation for YAML.*args, **kwargs: Additional arguments. |
agent_yaml = agent.to_yaml() |
to_toml(*args, **kwargs) |
Converts the agent's attributes to a TOML string. | *args, **kwargs: Additional arguments. |
agent_toml = agent.to_toml() |
model_dump_json() |
Saves the agent model to a JSON file in the workspace directory. | None | agent.model_dump_json() |
model_dump_yaml() |
Saves the agent model to a YAML file in the workspace directory. | None | agent.model_dump_yaml() |
log_agent_data() |
Logs the agent's data to an external API. | None | agent.log_agent_data() |
handle_tool_schema_ops() |
Handles operations related to tool schemas. | None | agent.handle_tool_schema_ops() |
handle_handoffs(task) |
Handles task delegation to specialized agents when handoffs are configured. | task (str): Task to be delegated to appropriate specialized agent. |
response = agent.handle_handoffs("Analyze market data") |
call_llm(task, *args, **kwargs) |
Calls the appropriate method on the language model. | task (str): Task for the LLM.*args, **kwargs: Additional arguments. |
response = agent.call_llm("Generate text") |
handle_sop_ops() |
Handles operations related to standard operating procedures. | None | agent.handle_sop_ops() |
agent_output_type(responses) |
Processes and returns the agent's output based on the specified output type. | responses (list): List of responses. |
formatted_output = agent.agent_output_type(responses) |
check_if_no_prompt_then_autogenerate(task) |
Checks if a system prompt is not set and auto-generates one if needed. | task (str): The task to use for generating a prompt. |
agent.check_if_no_prompt_then_autogenerate("Analyze data") |
handle_artifacts(response, output_path, extension) |
Handles saving artifacts from agent execution | response (str): Agent responseoutput_path (str): Output pathextension (str): File extension |
agent.handle_artifacts(response, "outputs/", ".txt") |
showcase_config() |
Displays the agent's configuration in a formatted table. | None | agent.showcase_config() |
talk_to(agent, task, img=None, *args, **kwargs) |
Initiates a conversation with another agent. | agent (Any): Target agent.task (str): Task to discuss.img (str, optional): Image to share.*args, **kwargs: Additional arguments. |
response = agent.talk_to(other_agent, "Let's collaborate") |
talk_to_multiple_agents(agents, task, *args, **kwargs) |
Talks to multiple agents concurrently. | agents (List[Any]): List of target agents.task (str): Task to discuss.*args, **kwargs: Additional arguments. |
responses = agent.talk_to_multiple_agents([agent1, agent2], "Group discussion") |
get_agent_role() |
Returns the role of the agent. | None | role = agent.get_agent_role() |
pretty_print(response, loop_count) |
Prints the response in a formatted panel. | response (str): Response to print.loop_count (int): Current loop number. |
agent.pretty_print("Analysis complete", 1) |
parse_llm_output(response) |
Parses and standardizes the output from the LLM. | response (Any): Response from the LLM. |
parsed_response = agent.parse_llm_output(llm_output) |
sentiment_and_evaluator(response) |
Performs sentiment analysis and evaluation on the response. | response (str): Response to analyze. |
agent.sentiment_and_evaluator("Great response!") |
output_cleaner_op(response) |
Applies output cleaning operations to the response. | response (str): Response to clean. |
cleaned_response = agent.output_cleaner_op(response) |
mcp_tool_handling(response, current_loop) |
Handles MCP tool execution and responses. | response (Any): Response containing tool calls.current_loop (int): Current loop number. |
agent.mcp_tool_handling(response, 1) |
temp_llm_instance_for_tool_summary() |
Creates a temporary LLM instance for tool summaries. | None | temp_llm = agent.temp_llm_instance_for_tool_summary() |
execute_tools(response, loop_count) |
Executes tools based on the LLM response. | response (Any): Response containing tool calls.loop_count (int): Current loop number. |
agent.execute_tools(response, 1) |
list_output_types() |
Returns available output types. | None | types = agent.list_output_types() |
tool_execution_retry(response, loop_count) |
Executes tools with retry logic for handling failures. | response (Any): Response containing tool calls.loop_count (int): Current loop number. |
agent.tool_execution_retry(response, 1) |
Agent.run(*args, **kwargs)¶
The run method has been significantly enhanced with new parameters for advanced functionality:
Method Signature¶
def run(
self,
task: Optional[Union[str, Any]] = None,
img: Optional[str] = None,
imgs: Optional[List[str]] = None,
correct_answer: Optional[str] = None,
streaming_callback: Optional[Callable[[str], None]] = None,
*args,
**kwargs,
) -> Any:
Parameters¶
| Parameter | Type | Description | Default |
|---|---|---|---|
task |
Optional[Union[str, Any]] |
The task to be executed | None |
img |
Optional[str] |
Path to a single image file | None |
imgs |
Optional[List[str]] |
List of image paths for batch processing | None |
correct_answer |
Optional[str] |
Expected correct answer for validation with automatic retries | None |
streaming_callback |
Optional[Callable[[str], None]] |
Callback function to receive streaming tokens in real-time | None |
*args |
Any |
Additional positional arguments | - |
**kwargs |
Any |
Additional keyword arguments | - |
Examples¶
# --- Enhanced Run Method Examples ---
# Basic Usage
# Simple task execution
response = agent.run("Generate a report on financial performance.")
# Single Image Processing
# Process a single image
response = agent.run(
task="Analyze this image and describe what you see",
img="path/to/image.jpg"
)
# Multiple Image Processing
# Process multiple images concurrently
response = agent.run(
task="Analyze these images and identify common patterns",
imgs=["image1.jpg", "image2.png", "image3.jpeg"]
)
# Answer Validation with Retries
# Run until correct answer is found
response = agent.run(
task="What is the capital of France?",
correct_answer="Paris"
)
# Real-time Streaming
def streaming_callback(token: str):
print(token, end="", flush=True)
response = agent.run(
task="Tell me a long story about space exploration",
streaming_callback=streaming_callback
)
# Combined Parameters
# Complex task with multiple features
response = agent.run(
task="Analyze these financial charts and provide insights",
imgs=["chart1.png", "chart2.png", "chart3.png"],
correct_answer="market volatility",
streaming_callback=my_callback
)
Return Types¶
The run method returns different types based on the input parameters:
| Scenario | Return Type | Description |
|---|---|---|
| Single task | str |
Returns the agent's response |
| Multiple images | List[Any] |
Returns a list of results, one for each image |
| Answer validation | str |
Returns the correct answer as a string |
| Streaming | str |
Returns the complete response after streaming completes |
Advanced Capabilities¶
Tool Integration¶
The Agent class allows seamless integration of external tools by accepting a list of Python functions via the tools parameter during initialization. Each tool function must include type annotations and a docstring. The Agent class automatically converts these functions into an OpenAI-compatible function calling schema, making them accessible for use during task execution.
Learn more about tools here
Requirements for a tool¶
| Requirement | Description |
|---|---|
| Function | The tool must be a Python function. |
| With types | The function must have type annotations for its parameters. |
| With doc strings | The function must include a docstring describing its behavior. |
| Must return a string | The function must return a string value. |
from swarms import Agent
import subprocess
def terminal(code: str):
"""
Run code in the terminal.
Args:
code (str): The code to run in the terminal.
Returns:
str: The output of the code.
"""
out = subprocess.run(code, shell=True, capture_output=True, text=True).stdout
return str(out)
# Initialize the agent with a tool
agent = Agent(
agent_name="Terminal-Agent",
model_name="claude-sonnet-4-20250514",
tools=[terminal],
system_prompt="You are an agent that can execute terminal commands. Use the tools provided to assist the user.",
)
# Run the agent
response = agent.run("List the contents of the current directory")
print(response)
Long-term Memory Management¶
The Swarm Agent supports integration with vector databases for long-term memory management. Here's an example using ChromaDB:
from swarms import Agent
from swarms_memory import ChromaDB
# Initialize ChromaDB
chromadb = ChromaDB(
metric="cosine",
output_dir="finance_agent_rag",
)
# Initialize the agent with long-term memory
agent = Agent(
agent_name="Financial-Analysis-Agent",
model_name="claude-sonnet-4-20250514",
long_term_memory=chromadb,
system_prompt="You are a financial analysis agent with access to long-term memory.",
)
# Run the agent
response = agent.run("What are the components of a startup's stock incentive equity plan?")
print(response)
Agent Handoffs and Task Delegation¶
The Agent class supports intelligent task delegation through the handoffs parameter. When provided with a list of specialized agents, the main agent acts as a router that analyzes incoming tasks and delegates them to the most appropriate specialized agent based on their capabilities and descriptions.
How Handoffs Work¶
- Task Analysis: When a task is received, the main agent uses a built-in "boss agent" to analyze the task requirements
- Agent Selection: The boss agent evaluates all available specialized agents and selects the most suitable one(s) based on their descriptions and capabilities
- Task Delegation: The selected agent(s) receive the task (potentially modified for better execution) and process it
- Response Aggregation: Results from specialized agents are collected and returned
Key Features¶
| Feature | Description |
|---|---|
| Intelligent Routing | Uses AI to determine the best agent for each task |
| Multiple Agent Support | Can delegate to multiple agents for complex tasks requiring different expertise |
| Task Modification | Can modify tasks to better suit the selected agent's capabilities |
| Transparent Reasoning | Provides clear explanations for agent selection decisions |
| Seamless Integration | Works transparently with the existing run() method |
Basic Handoff Example¶
from swarms.structs.agent import Agent
# Create specialized agents
research_agent = Agent(
agent_name="ResearchAgent",
agent_description="Specializes in researching topics and providing detailed, factual information",
model_name="gpt-4o-mini",
max_loops=1,
system_prompt="You are a research specialist. Provide detailed, well-researched information about any topic, citing sources when possible.",
)
code_agent = Agent(
agent_name="CodeExpertAgent",
agent_description="Expert in writing, reviewing, and explaining code across multiple programming languages",
model_name="gpt-4o-mini",
max_loops=1,
system_prompt="You are a coding expert. Write, review, and explain code with a focus on best practices and clean code principles.",
)
writing_agent = Agent(
agent_name="WritingAgent",
agent_description="Skilled in creative and technical writing, content creation, and editing",
model_name="gpt-4o-mini",
max_loops=1,
system_prompt="You are a writing specialist. Create, edit, and improve written content while maintaining appropriate tone and style.",
)
# Create a coordinator agent with handoffs enabled
coordinator = Agent(
agent_name="CoordinatorAgent",
agent_description="Coordinates tasks and delegates to specialized agents",
model_name="gpt-4o-mini",
max_loops=1,
handoffs=[research_agent, code_agent, writing_agent],
system_prompt="You are a coordinator agent. Analyze tasks and delegate them to the most appropriate specialized agent using the handoff_task tool. You can delegate to multiple agents if needed.",
output_type="all",
)
# Run task - will be automatically delegated to appropriate agent(s)
task = "Call all the agents available to you and ask them how they are doing"
result = coordinator.run(task=task)
print(result)
Use Cases¶
- Financial Analysis: Route different types of financial analysis to specialized agents (risk, valuation, market analysis)
- Software Development: Delegate coding, testing, documentation, and code review to different agents
- Research Projects: Route research tasks to domain-specific agents
- Customer Support: Delegate different types of inquiries to specialized support agents
- Content Creation: Route writing, editing, and fact-checking to different content specialists
Interactive Mode¶
To enable interactive mode, set the interactive parameter to True when initializing the Agent. See the Examples section for a complete code example.
Batch Processing with run_batched¶
The new run_batched method allows you to process multiple tasks efficiently:
Method Signature¶
Parameters¶
| Parameter | Type | Description | Default |
|---|---|---|---|
tasks |
List[str] |
List of tasks to run concurrently | Required |
imgs |
List[str] |
List of images to process with each task | None |
*args |
Any |
Additional positional arguments | - |
**kwargs |
Any |
Additional keyword arguments | - |
Usage Examples¶
# Process multiple tasks in batch
tasks = [
"Analyze the financial data for Q1",
"Generate a summary report for stakeholders",
"Create recommendations for Q2 planning"
]
# Run all tasks concurrently
batch_results = agent.run_batched(tasks)
# Process results
for i, (task, result) in enumerate(zip(tasks, batch_results)):
print(f"Task {i+1}: {task}")
print(f"Result: {result}\n")
Batch Processing with Images¶
# Process multiple tasks with multiple images
tasks = [
"Analyze this chart for trends",
"Identify patterns in this data visualization",
"Summarize the key insights from this graph"
]
images = ["chart1.png", "chart2.png", "chart3.png"]
# Each task will process all images
batch_results = agent.run_batched(tasks, imgs=images)
Return Type¶
- Returns:
List[Any]- List of results from each task execution - Order: Results are returned in the same order as the input tasks
Various other settings¶
# # Convert the agent object to a dictionary
print(agent.to_dict())
print(agent.to_toml())
print(agent.model_dump_json())
print(agent.model_dump_yaml())
# Ingest documents into the agent's knowledge base
agent.ingest_docs("your_pdf_path.pdf")
# Receive a message from a user and process it
agent.receive_message(name="agent_name", message="message")
# Send a message from the agent to a user
agent.send_agent_message(agent_name="agent_name", message="message")
# Ingest multiple documents into the agent's knowledge base
agent.ingest_docs("your_pdf_path.pdf", "your_csv_path.csv")
# Run the agent with a filtered system prompt
agent.filtered_run(
"How can I establish a ROTH IRA to buy stocks and get a tax break? What are the criteria?"
)
# Run the agent with multiple system prompts
agent.bulk_run(
[
"How can I establish a ROTH IRA to buy stocks and get a tax break? What are the criteria?",
"Another system prompt",
]
)
# Add a memory to the agent
agent.add_memory("Add a memory to the agent")
# Check the number of available tokens for the agent
agent.check_available_tokens()
# Perform token checks for the agent
agent.tokens_checks()
# Print the dashboard of the agent
agent.print_dashboard()
# Fetch all the documents from the doc folders
agent.get_docs_from_doc_folders()
# Dump the model to a JSON file
agent.model_dump_json()
print(agent.to_toml())
Examples¶
Tool Integration Examples¶
Basic Tool Example¶
Create a custom tool function and integrate it with an agent. The agent can then use the tool to execute terminal commands, extending its capabilities beyond text generation.
from swarms import Agent
import subprocess
def terminal(code: str):
"""
Run code in the terminal.
Args:
code (str): The code to run in the terminal.
Returns:
str: The output of the code.
"""
out = subprocess.run(code, shell=True, capture_output=True, text=True).stdout
return str(out)
# Initialize the agent with a tool
agent = Agent(
agent_name="Terminal-Agent",
model_name="claude-sonnet-4-20250514",
tools=[terminal],
system_prompt="You are an agent that can execute terminal commands. Use the tools provided to assist the user.",
)
# Run the agent
response = agent.run("List the contents of the current directory")
print(response)
Agent Structured Outputs with Tools¶
Use structured tool schemas (OpenAI function calling format) with an agent. The agent receives tool definitions as dictionaries and can call them to retrieve structured data, such as stock prices. The output can be converted from string to dictionary format for easier processing.
from dotenv import load_dotenv
from swarms import Agent
from swarms.prompts.finance_agent_sys_prompt import (
FINANCIAL_AGENT_SYS_PROMPT,
)
from swarms.utils.str_to_dict import str_to_dict
load_dotenv()
tools = [
{
"type": "function",
"function": {
"name": "get_stock_price",
"description": "Retrieve the current stock price and related information for a specified company.",
"parameters": {
"type": "object",
"properties": {
"ticker": {
"type": "string",
"description": "The stock ticker symbol of the company, e.g. AAPL for Apple Inc.",
},
"include_history": {
"type": "boolean",
"description": "Indicates whether to include historical price data along with the current price.",
},
"time": {
"type": "string",
"format": "date-time",
"description": "Optional parameter to specify the time for which the stock data is requested, in ISO 8601 format.",
},
},
"required": [
"ticker",
"include_history",
"time",
],
},
},
}
]
# Initialize the agent
agent = Agent(
agent_name="Financial-Analysis-Agent",
agent_description="Personal finance advisor agent",
system_prompt=FINANCIAL_AGENT_SYS_PROMPT,
max_loops=1,
tools_list_dictionary=tools,
)
out = agent.run(
"What is the current stock price for Apple Inc. (AAPL)? Include historical price data.",
)
print(out)
print(type(out))
print(str_to_dict(out))
print(type(str_to_dict(out)))
Long-term Memory with Tools¶
Integrate a vector database (ChromaDB) with an agent for long-term memory management. The agent can store and retrieve information from the vector database, enabling it to access previously learned knowledge and provide more contextually relevant responses.
from swarms import Agent
from swarms_memory import ChromaDB
# Initialize ChromaDB
chromadb = ChromaDB(
metric="cosine",
output_dir="finance_agent_rag",
)
# Initialize the agent with long-term memory
agent = Agent(
agent_name="Financial-Analysis-Agent",
model_name="claude-sonnet-4-20250514",
long_term_memory=chromadb,
system_prompt="You are a financial analysis agent with access to long-term memory.",
)
# Run the agent
response = agent.run("What are the components of a startup's stock incentive equity plan?")
print(response)
Handoffs Examples¶
Basic Handoff Example¶
Set up a multi-agent system with task delegation. Three specialized agents (ResearchAgent, CodeExpertAgent, and WritingAgent) are created, and a coordinator agent intelligently routes tasks to the most appropriate specialized agent(s) based on the task requirements. The coordinator can delegate to multiple agents if needed.
from swarms.structs.agent import Agent
# Create specialized agents
research_agent = Agent(
agent_name="ResearchAgent",
agent_description="Specializes in researching topics and providing detailed, factual information",
model_name="gpt-4o-mini",
max_loops=1,
system_prompt="You are a research specialist. Provide detailed, well-researched information about any topic, citing sources when possible.",
)
code_agent = Agent(
agent_name="CodeExpertAgent",
agent_description="Expert in writing, reviewing, and explaining code across multiple programming languages",
model_name="gpt-4o-mini",
max_loops=1,
system_prompt="You are a coding expert. Write, review, and explain code with a focus on best practices and clean code principles.",
)
writing_agent = Agent(
agent_name="WritingAgent",
agent_description="Skilled in creative and technical writing, content creation, and editing",
model_name="gpt-4o-mini",
max_loops=1,
system_prompt="You are a writing specialist. Create, edit, and improve written content while maintaining appropriate tone and style.",
)
# Create a coordinator agent with handoffs enabled
coordinator = Agent(
agent_name="CoordinatorAgent",
agent_description="Coordinates tasks and delegates to specialized agents",
model_name="gpt-4o-mini",
max_loops=1,
handoffs=[research_agent, code_agent, writing_agent],
system_prompt="You are a coordinator agent. Analyze tasks and delegate them to the most appropriate specialized agent using the handoff_task tool. You can delegate to multiple agents if needed.",
output_type="all",
)
# Run task - will be automatically delegated to appropriate agent(s)
task = "Call all the agents available to you and ask them how they are doing"
result = coordinator.run(task=task)
print(result)
Autonomous Agent Examples¶
Autonomous Agent with Automatic Planning¶
The autonomous agent mode uses max_loops="auto" to enable automatic planning and execution. The agent creates a structured plan with subtasks, executes them sequentially with dependency management, and generates a comprehensive summary. Ideal for complex tasks that require multi-step reasoning and planning, such as generating comprehensive financial reports.
The Agent supports autonomous operation with automatic planning when max_loops="auto" is set. This enables the agent to create a plan, execute subtasks, and generate a comprehensive summary automatically.
from swarms.structs.agent import Agent
# Initialize the agent with autonomous mode
agent = Agent(
agent_name="Quantitative-Trading-Agent",
agent_description="Advanced quantitative trading and algorithmic analysis agent",
model_name="gpt-4.1",
dynamic_temperature_enabled=True,
max_loops="auto", # Enable autonomous planning and execution
dynamic_context_window=True,
top_p=None,
output_type="all",
)
# Define a complex task that requires planning
quant_report_prompt = (
"You are an expert in quantitative trading and financial analysis. "
"Please generate a comprehensive, data-driven report on the top 5 publicly traded energy stocks as of today. "
"For each stock, include the following: \n"
"- Company name and ticker\n"
"- Brief business overview\n"
"- Key financial metrics (such as market cap, P/E ratio, recent performance)\n"
"- Recent news or notable events influencing the stock\n"
"- A concise analysis of why it is currently considered a top energy stock\n"
"Present your findings in a clear, organized format suitable for professional review."
"Only create 3 subtasks in your plan, make it very simple"
)
# Run the agent - it will automatically:
# 1. Create a plan with subtasks
# 2. Execute each subtask
# 3. Generate a comprehensive summary
out = agent.run(quant_report_prompt)
print(out)
Key Features of Autonomous Mode: - Automatic Planning: The agent creates a structured plan with subtasks - Subtask Execution: Each subtask is executed sequentially with dependency management - Comprehensive Summary: Final summary includes all subtask results and insights - Error Handling: Built-in retry logic and error recovery for robust execution - Built-in Tools: Access to file operations, user communication, and workspace management tools
Available Tools in Autonomous Mode¶
When max_loops="auto" and interactive=False, the agent has access to specialized tools for task execution:
| Tool | Description | Parameters |
|---|---|---|
respond_to_user |
Send messages to the user with types (info, question, warning, error, success) | message (str), message_type (str, optional) |
create_file |
Create a new file with specified content | file_path (str), content (str) |
update_file |
Update an existing file with new content | file_path (str), content (str), mode (str: "replace" or "append") |
read_file |
Read the contents of a file | file_path (str) |
list_directory |
List files and directories in a specified path | directory_path (str, optional) |
delete_file |
Delete a file (with safety checks) | file_path (str) |
File Operations and Workspace Directory:
All file operations use the agent's workspace directory structure:
- Workspace Location: Set via WORKSPACE_DIR environment variable (defaults to agent_workspace if not set)
- Agent-Specific Directory: Each agent gets its own workspace at workspace_dir/agents/{agent-name}-{uuid}/
- Relative Paths: File paths provided to tools are relative to the agent's workspace directory
- Absolute Paths: You can also use absolute paths for files outside the workspace
Example: Using Autonomous Tools
from swarms.structs.agent import Agent
# Initialize autonomous agent
agent = Agent(
agent_name="File-Management-Agent",
model_name="gpt-4.1",
max_loops="auto",
interactive=False,
)
# The agent can now use built-in tools during execution
task = """
Create a comprehensive report on renewable energy trends.
1. Research current trends
2. Create a markdown file with your findings
3. Update the file with additional insights
4. Read the file to verify content
5. Send a message to the user when complete
"""
response = agent.run(task)
# The agent will automatically:
# - Create files in workspace_dir/agents/file-management-agent-{uuid}/
# - Use create_file, update_file, read_file tools as needed
# - Communicate with respond_to_user tool
Loop Examples¶
Multiple Loops Example¶
Configure an agent with multiple reasoning loops. By setting max_loops=3 and enabling reasoning_prompt_on, the agent performs iterative reasoning, allowing it to refine its thinking over multiple iterations. Useful for complex problems that require step-by-step analysis.
from swarms import Agent
# Agent with multiple loops for iterative reasoning
agent = Agent(
agent_name="Iterative-Reasoning-Agent",
model_name="gpt-4.1",
max_loops=3, # Run 3 reasoning loops
reasoning_prompt_on=True,
system_prompt="You are an agent that reasons through problems step by step.",
)
response = agent.run("Solve this complex problem step by step: [problem description]")
print(response)
Dynamic Loops Example¶
Dynamic loop configuration allows the agent to automatically determine the optimal number of reasoning loops based on task complexity. By setting dynamic_loops=True, the agent adapts its reasoning depth, using more loops for complex tasks and fewer for simple ones.
from swarms import Agent
# Agent with dynamic loops that adjust based on task complexity
agent = Agent(
agent_name="Dynamic-Agent",
model_name="gpt-4.1",
dynamic_loops=True, # Automatically determines number of loops
system_prompt="You are an adaptive agent that adjusts your reasoning depth based on task complexity.",
)
response = agent.run("Analyze this complex scenario and provide insights")
print(response)
Simple Examples¶
Basic Agent Usage¶
The simplest way to use an agent with minimal configuration. Only requires a model name and max_loops parameter. Perfect for getting started quickly with basic text generation tasks.
from swarms import Agent
# Simple agent with minimal configuration
agent = Agent(
model_name="gpt-4o-mini",
max_loops=1,
)
response = agent.run("What is the capital of France?")
print(response)
Interactive Mode¶
Enable interactive mode for real-time conversation with the agent. When interactive=True, the agent prompts for user input after each response, creating a conversational loop. Useful for interactive applications, chatbots, or when you need to guide the agent through a multi-turn conversation.
from swarms import Agent
# Agent with interactive mode enabled
agent = Agent(
agent_name="Interactive-Agent",
model_name="claude-sonnet-4-20250514",
interactive=True,
system_prompt="You are an interactive agent. Engage in a conversation with the user.",
)
# Run the agent in interactive mode
agent.run("Let's start a conversation")
Auto Generate Prompt Example¶
Automatic prompt generation creates optimized system prompts without manual engineering. When auto_generate_prompt=True and no system prompt is provided, the agent automatically generates a contextually appropriate prompt based on the agent name, description, and task. This feature uses AI to create prompts, reducing the need for manual prompt engineering.
import os
from swarms import Agent
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
# Initialize the agent with automated prompt engineering enabled
agent = Agent(
agent_name="Financial-Analysis-Agent",
system_prompt=None, # System prompt is dynamically generated
model_name="gpt-4.1",
agent_description=None,
max_loops=1,
autosave=True,
dashboard=False,
verbose=False,
dynamic_temperature_enabled=True,
saved_state_path="finance_agent.json",
user_name="Human:",
output_type="string",
streaming_on=False,
auto_generate_prompt=True, # Enable automated prompt engineering
)
# Run the agent with a task description
agent.run(
"How can I establish a ROTH IRA to buy stocks and get a tax break? What are the criteria",
)
# Print the dynamically generated system prompt
print(agent.system_prompt)
Token-by-Token Streaming¶
Enable detailed token-by-token streaming with metadata. When stream=True, the agent streams each token as it's generated, providing real-time feedback and detailed metadata including token count, model information, citations, and usage statistics. Useful for building interactive UIs or monitoring agent behavior in real-time.
from swarms import Agent
# Initialize agent with detailed streaming
agent = Agent(
model_name="gpt-4.1",
max_loops=1,
stream=True, # Enable detailed token-by-token streaming
)
# Run with detailed streaming - each token shows metadata
agent.run("Tell me a short story about a robot learning to paint.")
Undo Functionality¶
Revert the agent's last response and restore the previous conversation state. Useful for correcting mistakes, exploring alternative responses, or implementing undo/redo functionality in applications.
# Undo functionality
response = agent.run("Another task")
print(f"Response: {response}")
previous_state, message = agent.undo_last()
print(message)
Response Filtering¶
Filter specific words or phrases from agent responses. By adding response filters, you can automatically replace or remove sensitive content, profanity, or unwanted terms from the agent's output. Useful for content moderation, compliance, or customizing output formatting.
# Response filtering
agent.add_response_filter("report")
response = agent.filtered_run("Generate a report on finance")
print(response)
Saving and Loading State¶
Save and restore agent state to persist conversations and configurations across sessions. The agent can save its current state to a JSON file and load it later to continue from where it left off. Essential for long-running tasks, debugging, or maintaining conversation continuity.
# Save the agent state
agent.save_state('saved_flow.json')
# Load the agent state
agent = Agent(model_name="gpt-4o-mini", max_loops=5)
agent.load('saved_flow.json')
agent.run("Continue with the task")
Autosave Functionality¶
The agent supports automatic saving of configuration and state when autosave=True. This ensures your work is preserved even if the agent encounters errors or is interrupted.
How Autosave Works:
- Configuration Saving: At each loop step, the agent saves its configuration to
workspace_dir/agents/{agent-name}-{uuid}/config.json - State Saving: Full agent state is saved on errors, interruptions, or when explicitly called
- Workspace Structure: Each agent gets its own isolated workspace directory
- Atomic Writes: Files are written atomically (via temporary files) to prevent corruption
- Metadata Tracking: Each save includes metadata (timestamp, loop count, agent ID)
Autosave Configuration:
from swarms import Agent
# Enable autosave
agent = Agent(
model_name="gpt-4o-mini",
agent_name="autosave-demo",
max_loops=5,
autosave=True, # Enable automatic saving
verbose=True,
)
# Run task - configuration is saved at each step
response = agent.run("Complete a complex multi-step task")
# Access the agent's workspace directory
workspace = agent._get_agent_workspace_dir()
print(f"Files saved to: {workspace}")
# Files created:
# - config.json: Agent configuration at each step
# - {agent_name}_state.json: Full agent state
Autosave File Structure:
workspace_dir/
└── agents/
└── {agent-name}-{uuid}/
├── config.json # Configuration saved at each step
├── {agent_name}_state.json # Full state on save()
└── [other files created by agent tools]
Workspace Directory Configuration:
The workspace directory is controlled by the WORKSPACE_DIR environment variable:
import os
# Set workspace directory via environment variable
os.environ["WORKSPACE_DIR"] = "/path/to/my/workspace"
# Or it defaults to 'agent_workspace' in current directory
agent = Agent(
model_name="gpt-4o-mini",
autosave=True,
)
# Each agent gets its own subdirectory
# workspace_dir/agents/{agent-name}-{uuid}/
Note: The workspace_dir parameter in Agent initialization is ignored. The workspace is always read from the WORKSPACE_DIR environment variable, ensuring consistent file organization across all agents.
Async and Concurrent Execution¶
Run tasks asynchronously or in parallel for improved performance. The agent supports concurrent execution of multiple tasks, batch processing, and async operations. Ideal for processing large datasets, handling multiple requests simultaneously, or optimizing throughput in production environments.
# Run a task concurrently
response = await agent.run_concurrent("Concurrent task")
print(response)
# Run multiple tasks concurrently
tasks = [
{"task": "Task 1"},
{"task": "Task 2", "img": "path/to/image.jpg"},
{"task": "Task 3", "custom_param": 42}
]
responses = agent.bulk_run(tasks)
print(responses)
# Run multiple tasks in batch mode
task_list = ["Analyze data", "Generate report", "Create summary"]
batch_responses = agent.run_batched(task_list)
print(f"Completed {len(batch_responses)} tasks in batch mode")
Comprehensive Agent Configuration Examples¶
Advanced Agent with All New Features¶
A comprehensive example showcasing an agent configured with multiple advanced features including memory management, reasoning, MCP integration, artifacts, and batch processing. This demonstrates how to combine various capabilities for production-ready applications.
from swarms import Agent
from swarms_memory import ChromaDB
# Initialize advanced agent with comprehensive configuration
agent = Agent(
# Basic Configuration
agent_name="Advanced-Analysis-Agent",
agent_description="Multi-modal analysis agent with advanced capabilities",
system_prompt="You are an advanced analysis agent capable of processing multiple data types.",
# Enhanced Run Parameters
max_loops=3,
dynamic_loops=True,
interactive=False,
dashboard=True,
# Memory and Context Management
context_length=100000,
memory_chunk_size=3000,
dynamic_context_window=True,
rag_every_loop=True,
# Advanced Features
auto_generate_prompt=True,
plan_enabled=True,
react_on=True,
safety_prompt_on=True,
reasoning_prompt_on=True,
# Tool Management
tool_retry_attempts=5,
tool_call_summary=True,
show_tool_execution_output=True,
function_calling_format_type="OpenAI",
# Artifacts and Output
artifacts_on=True,
artifacts_output_path="./outputs",
artifacts_file_extension=".md",
output_type="json",
# LLM Configuration
model_name="gpt-4.1",
temperature=0.3,
max_tokens=8000,
top_p=0.95,
# MCP Integration
mcp_url="http://localhost:8000",
mcp_config=None,
# Performance Settings
timeout=300,
retry_attempts=3,
retry_interval=2,
# Metadata and Organization
tags=["analysis", "multi-modal", "advanced"],
use_cases=[{"name": "Data Analysis", "description": "Process and analyze complex datasets"}],
# Verbosity and Logging
verbose=True,
print_on=True
)
# Run with multiple images and streaming
def streaming_callback(token: str):
print(token, end="", flush=True)
response = agent.run(
task="Analyze these financial charts and provide comprehensive insights",
imgs=["chart1.png", "chart2.png", "chart3.png"],
streaming_callback=streaming_callback
)
# Run batch processing
tasks = [
"Analyze Q1 financial performance",
"Generate Q2 projections",
"Create executive summary"
]
batch_results = agent.run_batched(tasks)
# Run with answer validation
validated_response = agent.run(
task="What is the current market trend?",
correct_answer="bullish",
max_attempts=5
)
MCP-Enabled Agent Example¶
Connect an agent to Model Context Protocol (MCP) servers to access external tools and resources. The agent can connect to single or multiple MCP servers, enabling integration with external APIs, databases, and services through a standardized protocol.
from swarms import Agent
from swarms.schemas.mcp_schemas import MCPConnection
# Configure MCP connection
mcp_config = MCPConnection(
server_path="http://localhost:8000",
server_name="my_mcp_server",
capabilities=["tools", "filesystem"]
)
# Initialize agent with MCP integration
mcp_agent = Agent(
agent_name="MCP-Enabled-Agent",
system_prompt="You are an agent with access to external tools via MCP.",
mcp_config=mcp_config,
mcp_urls=["http://localhost:8000", "http://localhost:8001"],
tool_call_summary=True
)
# Run with MCP tools
response = mcp_agent.run("Use the available tools to analyze the current system status")
Multi-Image Processing Agent¶
Process multiple images concurrently and automatically summarize the results. The agent can analyze multiple images in parallel and generate a comprehensive summary of findings across all images, making it ideal for batch image analysis tasks.
# Initialize agent optimized for image processing
image_agent = Agent(
agent_name="Image-Analysis-Agent",
system_prompt="You are an expert at analyzing images and extracting insights.",
multi_modal=True,
summarize_multiple_images=True,
artifacts_on=True,
artifacts_output_path="./image_analysis",
artifacts_file_extension=".txt"
)
# Process multiple images with summarization
images = ["product1.jpg", "product2.jpg", "product3.jpg"]
analysis = image_agent.run(
task="Analyze these product images and identify design patterns",
imgs=images
)
# The agent will automatically summarize results if summarize_multiple_images=True
print(f"Analysis complete: {len(analysis)} images processed")
Simple Examples for New Features¶
Fallback Models¶
Configure multiple models as fallbacks for improved reliability. If the primary model fails, the agent automatically switches to the next model in the fallback list. Ensures task completion even when individual models encounter errors or rate limits.
from swarms import Agent
# Agent with fallback models - automatically switches if primary fails
agent = Agent(
model_name="gpt-4o",
fallback_models=["gpt-4o-mini", "gpt-3.5-turbo"],
max_loops=1
)
# Will try gpt-4o first, then fallback to gpt-4o-mini if it fails
response = agent.run("Analyze this data")
Marketplace Prompt Loading¶
Load pre-built prompts directly from the Swarms marketplace using a prompt ID. The agent automatically fetches and applies the prompt as its system prompt, enabling one-line prompt loading without manual configuration. Requires the SWARMS_API_KEY environment variable.
from swarms import Agent
# Load a prompt from the Swarms marketplace
agent = Agent(
model_name="gpt-4o-mini",
marketplace_prompt_id="550e8400-e29b-41d4-a716-446655440000",
max_loops=1
)
# Agent automatically loads the system prompt from marketplace
response = agent.run("Execute the marketplace prompt task")
Reasoning-Enabled Models¶
Configure agents to use reasoning-enabled models like o1-preview. These models perform internal reasoning before generating responses, making them ideal for complex mathematical problems, logical puzzles, and tasks requiring deep analytical thinking. Control reasoning effort and thinking token limits.
from swarms import Agent
# Agent with reasoning capabilities
agent = Agent(
model_name="o1-preview",
reasoning_enabled=True,
reasoning_effort="high",
thinking_tokens=10000,
max_loops=1
)
response = agent.run("Solve this complex mathematical problem step by step")
Execution Modes¶
Choose from three execution modes to optimize agent behavior: "fast" for performance (reduces verbosity), "interactive" for real-time conversations, and "standard" for default balanced behavior. Each mode automatically configures print and verbose settings appropriately.
from swarms import Agent
# Fast mode - optimized for performance (reduces verbosity)
fast_agent = Agent(
model_name="gpt-4o-mini",
mode="fast", # Disables print_on and verbose automatically
max_loops=1
)
# Interactive mode - for real-time conversations
interactive_agent = Agent(
model_name="gpt-4o-mini",
mode="interactive",
max_loops=5
)
# Standard mode - default behavior
standard_agent = Agent(
model_name="gpt-4o-mini",
mode="standard",
max_loops=1
)
Streaming Callback¶
Provide a custom callback function to receive streaming tokens in real-time. The callback is invoked for each token as it's generated, enabling custom UI updates, progress tracking, or integration with streaming interfaces. Useful for building responsive applications that display results as they're generated.
from swarms import Agent
# Define a custom streaming callback
def my_streaming_callback(token: str):
print(token, end="", flush=True)
# Agent with streaming callback
agent = Agent(
model_name="gpt-4o-mini",
streaming_callback=my_streaming_callback,
max_loops=1
)
# Tokens will be streamed to the callback in real-time
response = agent.run("Tell me a story")
Multiple MCP Connections¶
Connect to multiple MCP servers simultaneously to access tools and resources from different sources. The agent can use tools from all connected servers, enabling integration with diverse external services and APIs through a unified interface.
from swarms import Agent
from swarms.schemas.mcp_schemas import MultipleMCPConnections
# Configure multiple MCP servers
mcp_configs = MultipleMCPConnections(
connections=[
{"server_path": "http://localhost:8000", "server_name": "server1"},
{"server_path": "http://localhost:8001", "server_name": "server2"}
]
)
agent = Agent(
model_name="gpt-4o-mini",
mcp_configs=mcp_configs,
max_loops=1
)
response = agent.run("Use tools from both MCP servers")
Message Transforms for Context Management¶
Automatically manage long conversation histories by transforming messages when context limits are approached. Configure strategies like truncating oldest messages, summarizing, or chunking to maintain conversation quality while staying within token limits. Essential for long-running conversations or document processing.
from swarms import Agent
from swarms.structs.transforms import TransformConfig
# Configure message transforms to handle long contexts
transforms = TransformConfig(
max_tokens=8000,
strategy="truncate_oldest"
)
agent = Agent(
model_name="gpt-4o-mini",
transforms=transforms,
context_length=100000,
max_loops=1
)
# Agent will automatically manage context length
response = agent.run("Process this very long conversation history")
Agent with Capabilities¶
Define agent capabilities as metadata for better task routing and documentation. Capabilities help other agents or routing systems understand what an agent can do, enabling intelligent task delegation and agent discovery. Useful for multi-agent systems and marketplace listings.
from swarms import Agent
# Agent with defined capabilities for better routing
agent = Agent(
model_name="gpt-4o-mini",
agent_name="Data-Analysis-Agent",
capabilities=["data_analysis", "statistics", "visualization"],
max_loops=1
)
response = agent.run("Analyze this dataset")
Publishing to Marketplace¶
Publish your agent's prompt to the Swarms marketplace for sharing and reuse. When publish_to_marketplace=True, the agent automatically publishes its system prompt along with metadata (name, description, tags, capabilities, use cases) on initialization. Requires use_cases to be provided and SWARMS_API_KEY environment variable.
from swarms import Agent
# Agent configured to publish prompt to marketplace
agent = Agent(
model_name="gpt-4o-mini",
agent_name="Financial-Advisor",
agent_description="Expert financial advisor agent",
system_prompt="You are an expert financial advisor...",
tags=["finance", "advisor"],
capabilities=["financial_planning", "investment_advice"],
use_cases=[
{"title": "Retirement Planning", "description": "Help users plan for retirement"},
{"title": "Investment Analysis", "description": "Analyze investment opportunities"}
],
publish_to_marketplace=True,
max_loops=1
)
# Prompt will be published to marketplace on initialization
New Features and Parameters¶
Enhanced Run Method Parameters¶
The run method now supports several new parameters for advanced functionality:
imgs: Process multiple images simultaneously instead of just onecorrect_answer: Validate responses against expected answers with automatic retriesstreaming_callback: Real-time token streaming for interactive applications
MCP (Model Context Protocol) Integration¶
| Parameter | Description |
|---|---|
mcp_url |
Connect to a single MCP server |
mcp_urls |
Connect to multiple MCP servers |
mcp_config |
Advanced MCP configuration options for a single server |
mcp_configs |
MultipleMCPConnections object for managing multiple MCP server connections |
Advanced Reasoning and Safety¶
| Parameter | Description |
|---|---|
react_on |
Enable ReAct reasoning for complex problem-solving |
safety_prompt_on |
Add safety constraints to agent responses |
reasoning_prompt_on |
Enable multi-loop reasoning for complex tasks |
reasoning_enabled |
Enable reasoning capabilities for supported models (e.g., o1) |
reasoning_effort |
Set reasoning effort level: "low", "medium", or "high" |
thinking_tokens |
Maximum number of thinking tokens for reasoning models |
Performance and Resource Management¶
| Parameter | Description |
|---|---|
dynamic_context_window |
Automatically adjust context window based on available tokens |
tool_retry_attempts |
Configure retry behavior for tool execution |
summarize_multiple_images |
Automatically summarize results from multiple image processing |
Advanced Memory and Context¶
| Parameter | Description |
|---|---|
rag_every_loop |
Query RAG database on every loop iteration |
memory_chunk_size |
Control memory chunk size for long-term memory |
auto_generate_prompt |
Automatically generate system prompts based on tasks |
plan_enabled |
Enable planning functionality for complex tasks |
Artifacts and Output Management¶
| Parameter | Description |
|---|---|
artifacts_on |
Enable saving artifacts from agent execution |
artifacts_output_path |
Specify where to save artifacts |
artifacts_file_extension |
Control artifact file format |
Enhanced Tool Management¶
| Parameter | Description |
|---|---|
tools_list_dictionary |
Provide tool schemas in dictionary format |
tool_call_summary |
Enable automatic summarization of tool calls |
show_tool_execution_output |
Control visibility of tool execution details |
function_calling_format_type |
Specify function calling format (OpenAI, etc.) |
Advanced LLM Configuration¶
| Parameter | Description |
|---|---|
llm_args |
Pass additional arguments to the LLM |
llm_base_url |
Specify custom LLM API endpoint |
llm_api_key |
Provide LLM API key directly |
top_p |
Control top-p sampling parameter |
Reasoning and Advanced Capabilities¶
| Parameter | Description |
|---|---|
reasoning_enabled |
Enable reasoning capabilities for supported models |
reasoning_effort |
Set reasoning effort level ("low", "medium", "high") |
thinking_tokens |
Maximum number of thinking tokens for reasoning models |
Execution Modes and Marketplace¶
| Parameter | Description |
|---|---|
mode |
Execution mode: "interactive", "fast", or "standard" |
capabilities |
List of agent capabilities for documentation and routing |
publish_to_marketplace |
Publish agent prompt to Swarms marketplace |
marketplace_prompt_id |
Load prompt from Swarms marketplace by UUID |
Message Transforms and Context Management¶
| Parameter | Description |
|---|---|
transforms |
TransformConfig for handling context limits and message transformations |
Best Practices¶
| Best Practice / Feature | Description |
|---|---|
system_prompt |
Always provide a clear and concise system prompt to guide the agent's behavior. |
tools |
Use tools to extend the agent's capabilities for specific tasks. |
retry_attempts & error handling |
Implement error handling and utilize the retry_attempts feature for robust execution. |
long_term_memory |
Leverage long_term_memory for tasks that require persistent information. |
interactive & dashboard |
Use interactive mode for real-time conversations and dashboard for monitoring. |
sentiment_analysis |
Implement sentiment_analysis for applications requiring tone management. |
autosave, save/load |
Utilize autosave and save/load methods for continuity across sessions. Autosave saves configuration at each step to workspace_dir/agents/{agent-name}-{uuid}/config.json. Files created by agent tools are saved in the agent's workspace directory. |
dynamic_context_window & tokens_checks |
Optimize token usage with dynamic_context_window and tokens_checks methods. |
concurrent & async methods |
Use concurrent and async methods for performance-critical applications. |
analyze_feedback |
Regularly review and analyze feedback using the analyze_feedback method. |
artifacts_on |
Use artifacts_on to save important outputs from agent execution. |
rag_every_loop |
Enable rag_every_loop when continuous context from long-term memory is needed. |
run_batched |
Leverage run_batched for efficient processing of multiple related tasks. |
mcp_url or mcp_urls |
Use mcp_url or mcp_urls to extend agent capabilities with external tools. |
react_on |
Enable react_on for complex reasoning tasks requiring step-by-step analysis. |
tool_retry_attempts |
Configure tool_retry_attempts for robust tool execution in production environments. |
handoffs |
Use handoffs to create specialized agent teams that can intelligently route tasks based on complexity and expertise requirements. |
By following these guidelines and leveraging the Swarm Agent's extensive features, you can create powerful, flexible, and efficient autonomous agents for a wide range of applications.