StructuredBot API Reference
StructuredBot is designed for getting structured, validated outputs from LLMs. It enforces Pydantic schema validation and provides automatic retry logic when the LLM doesn't produce valid output.
Class Definition
class StructuredBot(SimpleBot):
    """StructuredBot is given a Pydantic Model and expects the LLM to return
    a JSON structure that conforms to the model schema.
    It will validate the returned json against the pydantic model,
    prompting the LLM to fix any of the validation errors if it does not validate,
    and then explicitly return an instance of that model.
    """
Constructor
def __init__(
    self,
    system_prompt: Union[str, SystemMessage],
    pydantic_model: BaseModel,
    model_name: str = default_language_model(),
    stream_target: str = "stdout",
    allow_failed_validation: bool = False,
    **completion_kwargs,
)
Constructor Parameters
- system_prompt (Union[str, SystemMessage]): The system prompt to use for the bot. Should instruct the LLM on how to extract or generate structured data.
- pydantic_model (BaseModel): The Pydantic model that defines the expected output schema. The LLM must return JSON that validates against this model.
- model_name (str, default: default_language_model()): The name of the model to use. Must support structured outputs (e.g., gpt-4o, anthropic/claude-3-5-sonnet, gemini/gemini-1.5-pro-latest). See Model Support below for details.
- stream_target (str, default: "stdout"): The target to stream the response to. StructuredBot streams only to stdout; other modes may not work correctly.
- allow_failed_validation (bool, default: False): Whether to allow returning invalid data if validation fails after retries. If False, raises ValidationError on failure.
- completion_kwargs: Additional keyword arguments to pass to the completion function.
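The pydantic_model parameter defines the contract the LLM output has to satisfy. As a rough illustration, the JSON Schema that Pydantic (v2) derives from a model can be inspected with model_json_schema(). The Person model here is hypothetical, and whether StructuredBot sends exactly this schema to the provider is an assumption, not something this reference states.

```python
from typing import Optional

from pydantic import BaseModel


class Person(BaseModel):
    """Hypothetical schema, used only for illustration."""

    name: str
    age: int
    email: Optional[str] = None


# Pydantic v2: derive the JSON Schema for the model.
schema = Person.model_json_schema()

# Fields without a default are listed as required.
print(schema["required"])  # ['name', 'age']

# Each field appears under "properties" with its JSON type.
print(schema["properties"]["age"]["type"])  # integer
```

Inspecting the schema before wiring it to a bot is a cheap way to confirm which fields the LLM will be forced to fill in.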
Model Support
StructuredBot requires models that support both the response_format and response_schema parameters. Supported models include:
- gpt-4o, gpt-4-turbo, gpt-4
- anthropic/claude-3-5-sonnet, anthropic/claude-3-opus
- gemini/gemini-1.5-pro-latest
- ollama_chat/* (with structured output support)
If a model doesn't support structured outputs, StructuredBot will raise a ValueError at initialization.
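Because an unsupported model surfaces as a ValueError at construction time, one option is to try several candidate models in order and keep the first that constructs cleanly. The sketch below is only an illustration: first_supported and fake_factory are hypothetical names, and fake_factory stands in for a call such as lmb.StructuredBot(...) so the example runs without network access.

```python
from typing import Callable, Iterable, Tuple, TypeVar

T = TypeVar("T")


def first_supported(
    make_bot: Callable[[str], T], model_names: Iterable[str]
) -> Tuple[str, T]:
    """Return (model_name, bot) for the first model that constructs cleanly.

    make_bot is a hypothetical factory, for example:
    lambda name: lmb.StructuredBot(system_prompt, Person, model_name=name)
    """
    errors = []
    for name in model_names:
        try:
            return name, make_bot(name)
        except ValueError as exc:
            # Raised at initialization when structured outputs are unsupported.
            errors.append(f"{name}: {exc}")
    raise ValueError("No candidate model could be used: " + "; ".join(errors))


def fake_factory(name: str) -> dict:
    """Stand-in factory so the sketch runs without API keys."""
    if name == "no-structured-output-model":
        raise ValueError("model does not support structured outputs")
    return {"model_name": name}


chosen, bot = first_supported(
    fake_factory, ["no-structured-output-model", "gpt-4o"]
)
print(chosen)  # gpt-4o
```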
Methods
__call__
def __call__(
    self,
    *messages: Union[str, BaseMessage],
) -> BaseModel
Process messages and return a validated Pydantic model instance.
Parameters
- messages: One or more messages to process. Can be strings or BaseMessage objects.
Returns
- BaseModel: An instance of the provided pydantic_model with validated data.
Raises
- ValidationError: If validation fails after retries and allow_failed_validation=False.
Example
import llamabot as lmb
from pydantic import BaseModel


class Person(BaseModel):
    name: str
    age: int
    email: str


bot = lmb.StructuredBot(
    system_prompt="Extract person information from text.",
    pydantic_model=Person,
    model_name="gpt-4o"
)

person = bot("John is 25 years old. Email: john@example.com")
print(person.name)  # "John"
print(person.age)  # 25
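Since __call__ raises ValidationError when retries are exhausted, callers that prefer a None result over an exception can wrap the call. This is a minimal sketch: extract_or_none is a hypothetical helper, and fake_bot stands in for a StructuredBot instance so the example runs offline.

```python
from typing import Callable, Optional

from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    age: int


def extract_or_none(
    bot: Callable[[str], BaseModel], text: str
) -> Optional[BaseModel]:
    """Call the bot and return None instead of propagating ValidationError."""
    try:
        return bot(text)
    except ValidationError as exc:
        # exc.errors() lists each failing field together with a message.
        for err in exc.errors():
            print(f"{err['loc']}: {err['msg']}")
        return None


def fake_bot(text: str) -> Person:
    """Stand-in for a bot whose final attempt still fails validation."""
    return Person.model_validate({"name": text, "age": "not a number"})


result = extract_or_none(fake_bot, "John")
print(result)  # None
```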
Retry Logic
StructuredBot automatically retries when validation fails:
- LLM generates JSON response
- JSON is validated against Pydantic model
- If validation fails, error details are sent back to LLM
- LLM attempts to fix the JSON
- Process repeats up to a maximum number of retries
- If still invalid and allow_failed_validation=False, raises ValidationError
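The steps above can be sketched as a small loop. This is not StructuredBot's actual implementation: validate_with_retries, its max_attempts parameter, and the fake_llm stub are hypothetical and only illustrate the validate, report errors, retry cycle using Pydantic v2.

```python
from typing import Callable, List, Type, TypeVar

from pydantic import BaseModel, ValidationError

M = TypeVar("M", bound=BaseModel)


def validate_with_retries(
    ask_llm: Callable[[List[str]], str],
    model: Type[M],
    prompt: str,
    max_attempts: int = 3,  # hypothetical limit, chosen for the sketch
) -> M:
    """Ask for JSON, validate it, and feed errors back until it validates."""
    messages = [prompt]
    for attempt in range(max_attempts):
        raw = ask_llm(messages)
        try:
            # Parses the JSON string and validates it against the model.
            return model.model_validate_json(raw)
        except ValidationError as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Send the previous output and the error details back.
            messages.append(raw)
            messages.append("Fix these validation errors:\n" + str(exc))
    raise ValueError("max_attempts must be at least 1")


class Person(BaseModel):
    name: str
    age: int


# Stub LLM: the first reply is invalid, the second one is corrected.
replies = iter(
    ['{"name": "John", "age": "twenty-five"}', '{"name": "John", "age": 25}']
)


def fake_llm(messages: List[str]) -> str:
    return next(replies)


person = validate_with_retries(fake_llm, Person, "Extract: John is 25.")
print(person.age)  # 25
```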
Attributes
- pydantic_model (BaseModel): The Pydantic model for validation
- allow_failed_validation (bool): Whether to allow failed validation
Usage Examples
Basic Data Extraction
import llamabot as lmb
from pydantic import BaseModel
from typing import List


class Person(BaseModel):
    name: str
    age: int
    hobbies: List[str]


bot = lmb.StructuredBot(
    system_prompt="Extract person information from text.",
    pydantic_model=Person,
    model_name="gpt-4o"
)

person = bot("John is 25 years old and enjoys hiking and photography.")
print(person.name)  # "John"
print(person.age)  # 25
print(person.hobbies)  # ["hiking", "photography"]
Complex Nested Models
import llamabot as lmb
from pydantic import BaseModel
from typing import List, Optional
from datetime import datetime


class Address(BaseModel):
    street: str
    city: str
    zip_code: str


class Person(BaseModel):
    name: str
    age: int
    address: Address
    # Optional because the example input below contains no timestamp
    created_at: Optional[datetime] = None


bot = lmb.StructuredBot(
    system_prompt="Extract person information with address.",
    pydantic_model=Person,
    model_name="gpt-4o"
)

person = bot("John, 25, lives at 123 Main St, New York, 10001")
print(person.address.city)  # "New York"
With Optional Fields
import llamabot as lmb
from pydantic import BaseModel
from typing import Optional


class Person(BaseModel):
    name: str
    age: int
    email: Optional[str] = None


bot = lmb.StructuredBot(
    system_prompt="Extract person information.",
    pydantic_model=Person,
    model_name="gpt-4o"
)

# Works even if email is missing
person = bot("John is 25 years old")
print(person.email)  # None
Allowing Failed Validation
import llamabot as lmb
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    age: int


bot = lmb.StructuredBot(
    system_prompt="Extract person information.",
    pydantic_model=Person,
    model_name="gpt-4o",
    allow_failed_validation=True
)

# With allow_failed_validation=True, data that still fails validation after
# the retries is returned instead of raising ValidationError.
# The try/except is kept only as a defensive guard.
try:
    person = bot("Invalid text")
except ValidationError:
    # Handle validation error
    pass
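When allow_failed_validation=True, the returned object may not satisfy the schema, so it can be worth re-checking it before use. The sketch below assumes the unvalidated result resembles an instance built with Pydantic's model_construct(), which skips validation; that resemblance is an assumption, and passes_validation is a hypothetical helper.

```python
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    age: int


def passes_validation(instance: BaseModel) -> bool:
    """Re-run validation on an instance's current field values."""
    try:
        type(instance).model_validate(dict(instance))
        return True
    except ValidationError:
        return False


# model_construct() builds an instance WITHOUT validating (Pydantic v2).
# Assumption: data returned after a failed validation may look like this.
unvalidated = Person.model_construct(name="John", age="twenty-five")
validated = Person(name="John", age=25)

print(passes_validation(unvalidated))  # False
print(passes_validation(validated))  # True
```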
Differences from SimpleBot JSON Mode
- StructuredBot: Guarantees schema validation, returns Pydantic objects, automatic retries
- SimpleBot JSON Mode: Ensures valid JSON only, no schema validation, returns strings
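The gap between "valid JSON" and "schema-valid" can be shown without calling any bot. In this sketch the payload and the Person model are hypothetical; it only contrasts json.loads with Pydantic v2 validation.

```python
import json

from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    name: str
    age: int


raw = '{"name": "John", "years": 25}'  # well-formed JSON, wrong field name

# JSON-level check: parsing succeeds, so "valid JSON" is satisfied.
data = json.loads(raw)
print(type(data).__name__)  # dict

# Schema-level check: the required "age" field is missing, so this fails.
try:
    Person.model_validate(data)
    schema_ok = True
except ValidationError as exc:
    schema_ok = False
    print(exc.error_count())  # 1
```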
Best Practices
- Use descriptive system prompts: Clearly explain what data to extract
- Define clear schemas: Use Pydantic's validation features (validators, constraints)
- Handle optional fields: Use Optional for fields that may be missing
- Test with edge cases: Ensure your schema handles various input formats
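As one way to apply the schema and edge-case advice above, constraints and validators can be declared on the model and exercised directly, without calling an LLM. The field limits and the strip_name validator are illustrative choices, not requirements of StructuredBot; the calls used are Pydantic v2 APIs.

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class Person(BaseModel):
    name: str = Field(min_length=1, description="Full name as written")
    age: int = Field(ge=0, le=150, description="Age in years")

    @field_validator("name")
    @classmethod
    def strip_name(cls, value: str) -> str:
        # Normalize surrounding whitespace in extracted names.
        return value.strip()


# Edge case 1: padded name and a numeric string are coerced and cleaned.
good = Person.model_validate({"name": "  John ", "age": "25"})
print(good.name, good.age)  # John 25

# Edge case 2: a negative age violates the ge=0 constraint.
try:
    Person.model_validate({"name": "John", "age": -3})
    rejected = False
except ValidationError:
    rejected = True
print(rejected)  # True
```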
Related Classes
- SimpleBot: Base class that StructuredBot extends
- Pydantic BaseModel: Schema definition class