Skip to content

Operator Module

The operator.py module provides tools for managing and operating on datasets using an agent-based approach. It is designed to facilitate the execution of operations on datasets through a structured and validated process.

Classes

AgentSpecification

Defines the specification for an LLM/agent:

  • name: Name of the LLM/agent
  • version: Version of the LLM/agent
  • description: Description of the LLM/agent
  • capabilities: List of capabilities
  • configuration: Configuration settings

OperatorToolBox

Main class for dataset operations:

  • __init__(spec: AgentSpecification, datasets: list[dict[str, Any]]): Initialize with agent spec and datasets. This sets up the toolbox with the necessary specifications and datasets for operation.
  • get_spec(): Get the agent specification. Returns the AgentSpecification object associated with the toolbox.
  • get_datasets(): Get the datasets. Returns a list of datasets that the toolbox operates on.
  • validate(): Validate the toolbox. Checks if the toolbox is correctly set up with valid specifications and datasets.
  • stop(): Stop the toolbox. Halts any ongoing operations within the toolbox.
  • run(): Run the toolbox. Initiates the execution of operations as defined in the toolbox.
  • get_results(): Get operation results. Retrieves the results of operations performed by the toolbox.
  • get_failures(): Get failures. Provides a list of any failures encountered during operations.
  • run_operation(operation: str): Run a specific operation. Executes a given operation on the datasets, returning the result or failure message.

Agent Tools

The dataset_manager_agent provides these tools:

validate_toolbox

Validates the OperatorToolBox:

@dataset_manager_agent.tool
async def validate_toolbox(ctx: RunContext[OperatorToolBox]) -> str

execute_operation

Executes an operation on a dataset:

@dataset_manager_agent.tool
async def execute_operation(ctx: RunContext[OperatorToolBox], operation: str) -> str

retrieve_results

Retrieves operation results:

@dataset_manager_agent.tool
async def retrieve_results(ctx: RunContext[OperatorToolBox]) -> str

retrieve_failures

Retrieves failures:

@dataset_manager_agent.tool
async def retrieve_failures(ctx: RunContext[OperatorToolBox]) -> str

Usage Examples

Initializing the OperatorToolBox

To initialize the OperatorToolBox, you need to provide an AgentSpecification and a list of datasets:

spec = AgentSpecification(
    name="GPT-4",
    version="4.0",
    description="A powerful language model",
    capabilities=["text-generation", "question-answering"],
    configuration={"max_tokens": 100},
)

datasets = [{"name": "dataset1"}, {"name": "dataset2"}]

toolbox = OperatorToolBox(spec=spec, datasets=datasets)

Synchronous Usage

def run_dataset_manager_agent_sync():
    prompts = [
        "Validate the toolbox.",
        "Execute operation on 'dataset2'.",
        "Retrieve the results.",
        "Retrieve any failures."
    ]

    for prompt in prompts:
        result = dataset_manager_agent.run_sync(prompt, deps=toolbox)
        print(f"Response: {result.data}")

Asynchronous Usage

async def run_dataset_manager_agent_async():
    prompts = [
        "Validate the toolbox.",
        "Execute operation on 'dataset2'.",
        "Retrieve the results.",
        "Retrieve any failures."
    ]

    for prompt in prompts:
        result = await dataset_manager_agent.run(prompt, deps=toolbox)
        print(f"Response: {result.data}")

These updates provide a more detailed and comprehensive understanding of the operator.py module, its classes, and its usage.