Operator Module¶
The operator.py
module provides tools for managing and operating on datasets using an agent-based approach. It is designed to facilitate the execution of operations on datasets through a structured and validated process.
Classes¶
AgentSpecification¶
Defines the specification for an LLM/agent:
name
: Name of the LLM/agentversion
: Version of the LLM/agentdescription
: Description of the LLM/agentcapabilities
: List of capabilitiesconfiguration
: Configuration settings
OperatorToolBox¶
Main class for dataset operations:
__init__(spec: AgentSpecification, datasets: list[dict[str, Any]])
: Initialize with agent spec and datasets. This sets up the toolbox with the necessary specifications and datasets for operation.get_spec()
: Get the agent specification. Returns theAgentSpecification
object associated with the toolbox.get_datasets()
: Get the datasets. Returns a list of datasets that the toolbox operates on.validate()
: Validate the toolbox. Checks if the toolbox is correctly set up with valid specifications and datasets.stop()
: Stop the toolbox. Halts any ongoing operations within the toolbox.run()
: Run the toolbox. Initiates the execution of operations as defined in the toolbox.get_results()
: Get operation results. Retrieves the results of operations performed by the toolbox.get_failures()
: Get failures. Provides a list of any failures encountered during operations.run_operation(operation: str)
: Run a specific operation. Executes a given operation on the datasets, returning the result or failure message.
Agent Tools¶
The dataset_manager_agent
provides these tools:
validate_toolbox¶
Validates the OperatorToolBox:
execute_operation¶
Executes an operation on a dataset:
@dataset_manager_agent.tool
async def execute_operation(ctx: RunContext[OperatorToolBox], operation: str) -> str
retrieve_results¶
Retrieves operation results:
retrieve_failures¶
Retrieves failures:
Usage Examples¶
Initializing the OperatorToolBox¶
To initialize the OperatorToolBox
, you need to provide an AgentSpecification
and a list of datasets:
spec = AgentSpecification(
name="GPT-4",
version="4.0",
description="A powerful language model",
capabilities=["text-generation", "question-answering"],
configuration={"max_tokens": 100},
)
datasets = [{"name": "dataset1"}, {"name": "dataset2"}]
toolbox = OperatorToolBox(spec=spec, datasets=datasets)
Synchronous Usage¶
def run_dataset_manager_agent_sync():
prompts = [
"Validate the toolbox.",
"Execute operation on 'dataset2'.",
"Retrieve the results.",
"Retrieve any failures."
]
for prompt in prompts:
result = dataset_manager_agent.run_sync(prompt, deps=toolbox)
print(f"Response: {result.data}")
Asynchronous Usage¶
async def run_dataset_manager_agent_async():
prompts = [
"Validate the toolbox.",
"Execute operation on 'dataset2'.",
"Retrieve the results.",
"Retrieve any failures."
]
for prompt in prompts:
result = await dataset_manager_agent.run(prompt, deps=toolbox)
print(f"Response: {result.data}")
These updates provide a more detailed and comprehensive understanding of the operator.py
module, its classes, and its usage.