Multi-Agent Speech Debate
This tutorial covers a more advanced use case: simulating a turn-based debate between two agents, each of which speaks its responses aloud through Text-to-Speech (TTS). Optionally, Speech-to-Text (STT) can supply the initial debate topic.
Prerequisites
- Python 3.10+
- OpenAI API key
- swarms library
- voice-agents library
- A working microphone (if using STT)
Tutorial Steps
- Install Dependencies: Install the swarms and voice-agents libraries listed in the prerequisites.
- Define Agent Personalities: Create distinct system prompts for your agents to ensure a dynamic debate. In this example, we use Socrates and Simone de Beauvoir.
- Initialize Agents: Set up two agents with streaming_on=True.
- Create a Debate Loop: Implement a function that alternates turns between agents, plays each response through that agent's TTS voice, and passes the response of one agent as the input to the next.
- Integrate STT (Optional): Use record_audio and speech_to_text to capture your own voice as the starting prompt for the debate.
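The turn-taking described in the debate-loop step can be tried out without API keys or audio hardware. The following is a minimal sketch, not the library's implementation: `EchoAgent` and `alternate_turns` are hypothetical names introduced only for illustration, and the stub's `run` method merely tags the incoming message instead of calling a model.

```python
from dataclasses import dataclass


@dataclass
class EchoAgent:
    """Hypothetical stand-in for a real agent; it only tags the incoming message."""

    agent_name: str

    def run(self, task: str) -> str:
        return f"{self.agent_name} replies to: {task}"


def alternate_turns(agents, task, max_loops=2):
    """Alternate between two agents, feeding each reply to the other as its prompt."""
    speaker, other = agents
    message = task
    transcript = []
    for _ in range(max_loops):
        response = speaker.run(task=message)
        transcript.append((speaker.agent_name, response))
        # The reply becomes the next prompt, and the roles swap.
        message = response
        speaker, other = other, speaker
    return transcript


transcript = alternate_turns(
    [EchoAgent("A"), EchoAgent("B")],
    "Is freedom an illusion?",
)
for name, text in transcript:
    print(f"{name}: {text}")
```

Because each reply is passed on as the next prompt, the second entry in the transcript embeds the first agent's reply.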
Code Example
from swarms import Agent
from swarms.structs.conversation import Conversation
from voice_agents.main import speech_to_text, record_audio, StreamingTTSCallback


def debate_with_speech(
    agents: list,
    max_loops: int = 1,
    task: str | None = None,
    use_stt_for_input: bool = False,
) -> str:
    """
    Simulate a turn-based debate between two agents with speech capabilities.

    Args:
        agents (list): A list containing exactly two Agent instances who will debate.
        max_loops (int): The total number of turns; each turn is one agent's response.
        task (str | None): The initial prompt or question to start the debate.
        use_stt_for_input (bool): If True, use speech-to-text for the initial task input.

    Returns:
        str: The formatted conversation history.
    """
    conversation = Conversation()

    # Create TTS callbacks with different voices to differentiate speakers
    tts_callback1 = StreamingTTSCallback(voice="onyx", model="tts-1")  # Deeper voice
    tts_callback2 = StreamingTTSCallback(voice="nova", model="tts-1")  # Softer voice

    # Get initial task from STT or provided string
    if use_stt_for_input:
        print("Please speak your question or topic for the debate...")
        audio = record_audio(duration=5.0)
        task = speech_to_text(audio_data=audio, sample_rate=16000)
        print(f"Transcribed: {task}\n")

    message = task
    speaker = agents[0]
    other = agents[1]
    current_callback = tts_callback1
    other_callback = tts_callback2

    for i in range(max_loops):
        print(f"--- Turn {i+1}: {speaker.agent_name} speaking ---")

        # Agent generates response and speaks in real-time
        response = speaker.run(
            task=message,
            streaming_callback=current_callback,
        )
        # Flush the TTS callback before handing over to the other speaker
        current_callback.flush()
        conversation.add(speaker.agent_name, response)

        # Swap roles for the next turn
        message = response
        speaker, other = other, speaker
        current_callback, other_callback = other_callback, current_callback

    return conversation.return_history_as_string()


# Define System Prompts
socratic_prompt = "You are Socrates. Challenge every assumption with logic."
beauvoir_prompt = "You are Simone de Beauvoir. Focus on freedom and existence."

# Instantiate Agents
agent1 = Agent(
    agent_name="Socrates",
    system_prompt=socratic_prompt,
    model_name="gpt-4o",
    streaming_on=True,
)
agent2 = Agent(
    agent_name="Simone de Beauvoir",
    system_prompt=beauvoir_prompt,
    model_name="gpt-4o",
    streaming_on=True,
)

# Run the debate
history = debate_with_speech(
    agents=[agent1, agent2],
    max_loops=3,
    task="Is freedom an illusion?",
)
print(history)
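Because max_loops counts individual turns rather than full rounds, an odd value gives the first agent one more turn than the second. The sketch below illustrates this with a hypothetical helper, `speaker_schedule`, which is not part of swarms or voice-agents; it assumes two speakers who strictly alternate, starting with the first.

```python
from collections import Counter


def speaker_schedule(names: list[str], max_loops: int) -> list[str]:
    """Return who speaks on each turn when two speakers strictly alternate."""
    return [names[i % 2] for i in range(max_loops)]


schedule = speaker_schedule(["Socrates", "Simone de Beauvoir"], max_loops=3)
turn_counts = Counter(schedule)

for turn, name in enumerate(schedule, start=1):
    print(f"Turn {turn}: {name}")
print(dict(turn_counts))
```

With max_loops=3, Socrates speaks twice and Simone de Beauvoir once; an even value gives each speaker the same number of turns.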
Key Components
- Differentiated Voices: Using "onyx" and "nova" helps the listener distinguish which agent is currently speaking.
- Turn-based Logic: The output of the first agent becomes the input for the second, creating a continuous dialogue.
- STT Integration: speech_to_text allows for hands-free interaction with the swarm.
- Conversation Tracking: The Conversation struct helps maintain a record of the entire exchange.
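When the topic comes from speech, the transcription may come back empty, and a debate started without any topic gives the first agent nothing to respond to. One way to guard against this, sketched under the assumption that the transcriber can be passed in as a plain callable, is a small helper that resolves the topic before the first turn. `resolve_topic` and its `transcribe` parameter are hypothetical and not part of voice-agents; in this tutorial the callable would wrap record_audio and speech_to_text.

```python
from typing import Callable, Optional


def resolve_topic(
    task: Optional[str],
    use_stt: bool,
    transcribe: Callable[[], str],
) -> str:
    """Return the debate topic, using the transcriber when speech input is requested."""
    if use_stt:
        task = transcribe()
    if not task or not task.strip():
        raise ValueError("No debate topic: pass a task or enable speech input.")
    return task.strip()


# A fake transcriber stands in for the microphone and STT call here.
topic = resolve_topic(
    None,
    use_stt=True,
    transcribe=lambda: "  Is freedom an illusion? ",
)
print(topic)
```

Injecting the transcriber keeps the check testable without a microphone, and the same helper covers the case where neither a task string nor speech input is provided.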