Image Generation System

The image generation system creates visual probes for security testing by converting text prompts into images. This document explains its architecture and implementation.

Overview

The system:

  1. Converts text datasets into image datasets
  2. Generates images using matplotlib
  3. Encodes images for transmission
  4. Integrates with the LLM probing system

Core Components

Image Generation

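import io

import matplotlib.pyplot as plt

# cache_to_disk is the project's disk-caching decorator (its import is not shown here)


@cache_to_disk()
def generate_image(prompt: str) -> bytes:
    """
    Generates a JPEG image containing the provided text prompt
    """
    # Create figure with light blue background
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.set_facecolor("lightblue")

    # Add centered text
    ax.text(
        0.5, 0.5,
        prompt,
        fontsize=16,
        ha="center",
        va="center",
        wrap=True,
        color="darkblue"
    )

    # Save to an in-memory buffer, then close the figure so that repeated
    # calls do not accumulate open figures.
    # Note: per the matplotlib documentation, wrap=True does not take effect
    # together with bbox_inches="tight", so long prompts may not wrap.
    buffer = io.BytesIO()
    fig.savefig(buffer, format="jpeg", bbox_inches="tight")
    plt.close(fig)
    return buffer.getvalue()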
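
A quick sanity check on the returned bytes can catch a misconfigured backend early. The sketch below assumes only that the output is a standard JPEG stream, which begins with the SOI marker (FF D8) and ends with the EOI marker (FF D9). looks_like_jpeg is a hypothetical helper, not part of the system.

```python
def looks_like_jpeg(data: bytes) -> bool:
    """Return True if data starts with the JPEG SOI marker and ends with EOI."""
    # Hypothetical helper: checks framing markers only, not image validity.
    return (
        len(data) >= 4
        and data[:2] == b"\xff\xd8"   # SOI: start of image
        and data[-2:] == b"\xff\xd9"  # EOI: end of image
    )
```

In a test, this could be applied as looks_like_jpeg(generate_image("hello")).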

Dataset Conversion

from tqdm import tqdm


def generate_image_dataset(text_dataset: list[ProbeDataset]) -> list[ImageProbeDataset]:
    """
    Converts text datasets into image datasets
    """
    image_datasets = []

    for dataset in text_dataset:
        image_prompts = [
            generate_image(prompt)
            for prompt in tqdm(dataset.prompts)
        ]

        image_datasets.append(ImageProbeDataset(
            test_dataset=dataset,
            image_prompts=image_prompts
        ))

    return image_datasets
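
The conversion can be illustrated without matplotlib by passing the renderer in as a parameter. This is a minimal sketch: the field layout of ProbeDataset and ImageProbeDataset is assumed (only prompts, test_dataset, and image_prompts appear in the code above), and convert and fake_render are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ProbeDataset:
    # Assumed minimal shape; the real class may carry more fields.
    name: str
    prompts: list[str]


@dataclass
class ImageProbeDataset:
    # Assumed minimal shape: the source dataset plus one image per prompt.
    test_dataset: ProbeDataset
    image_prompts: list[bytes] = field(default_factory=list)


def convert(datasets: list[ProbeDataset],
            render: Callable[[str], bytes]) -> list[ImageProbeDataset]:
    """Build one ImageProbeDataset per input dataset, preserving prompt order."""
    return [
        ImageProbeDataset(
            test_dataset=ds,
            image_prompts=[render(p) for p in ds.prompts],
        )
        for ds in datasets
    ]


def fake_render(prompt: str) -> bytes:
    # Placeholder renderer standing in for generate_image.
    return prompt.encode("utf-8")


result = convert([ProbeDataset("demo", ["a", "b"])], fake_render)
```

Each output dataset keeps a reference to its source dataset, so image prompts can be matched back to the original text prompts by index.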

Image Encoding

import base64


def encode(image: bytes) -> str:
    """
    Encodes image bytes into base64 data URL
    """
    encoded = base64.b64encode(image).decode("utf-8")
    return "data:image/jpeg;base64," + encoded
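
When debugging, it can help to reverse the encoding and recover the raw bytes. The sketch below repeats encode so that it runs on its own and adds a decode counterpart, which is a hypothetical helper not present in the system.

```python
import base64

PREFIX = "data:image/jpeg;base64,"


def encode(image: bytes) -> str:
    """Encode image bytes into a base64 data URL (same logic as above)."""
    return PREFIX + base64.b64encode(image).decode("utf-8")


def decode(data_url: str) -> bytes:
    """Hypothetical inverse of encode: strip the prefix and decode base64."""
    if not data_url.startswith(PREFIX):
        raise ValueError("not a JPEG base64 data URL")
    return base64.b64decode(data_url[len(PREFIX):])


sample = b"\xff\xd8\xff\xd9"
data_url = encode(sample)      # "data:image/jpeg;base64,/9j/2Q=="
round_trip = decode(data_url)  # equal to sample
```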

Integration

RequestAdapter

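The RequestAdapter class wraps an LLM specification. On each probe it renders the prompt as an image, encodes it as a data URL, and forwards both the prompt and the image to the underlying specification: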

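class RequestAdapter:
    def __init__(self, llm_spec):
        # Reject specifications that cannot accept image input
        if not llm_spec.has_image:
            raise ValueError("LLMSpec must have an image")
        self.llm_spec = llm_spec

    async def probe(self, prompt: str, encoded_image: str = "",
                    encoded_audio: str = "", files=None) -> httpx.Response:
        # Any encoded_image passed by the caller is ignored: the image is
        # always rendered from the prompt.
        image_bytes = generate_image(prompt)
        encoded_image = encode(image_bytes)
        # files defaults to None rather than {} so that one mutable dict is
        # not shared across calls
        return await self.llm_spec.probe(prompt, encoded_image, encoded_audio,
                                         files if files is not None else {})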
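
The adapter's flow can be exercised without a real model or matplotlib. The following is a sketch under stated assumptions: StubSpec is a hypothetical stand-in for an LLM specification, and SketchAdapter mirrors RequestAdapter but takes the renderer as a parameter so that a placeholder can be used instead of generate_image.

```python
import asyncio
import base64


class StubSpec:
    """Hypothetical LLM spec that records the arguments it receives."""

    def __init__(self, has_image: bool):
        self.has_image = has_image
        self.calls = []

    async def probe(self, prompt, encoded_image, encoded_audio, files):
        self.calls.append((prompt, encoded_image))
        return "ok"


class SketchAdapter:
    """Simplified adapter with an injectable renderer (an assumption)."""

    def __init__(self, llm_spec, render):
        if not llm_spec.has_image:
            raise ValueError("LLMSpec must have an image")
        self.llm_spec = llm_spec
        self.render = render

    async def probe(self, prompt, encoded_audio="", files=None):
        image_bytes = self.render(prompt)
        data_url = ("data:image/jpeg;base64,"
                    + base64.b64encode(image_bytes).decode("utf-8"))
        return await self.llm_spec.probe(prompt, data_url, encoded_audio,
                                         files if files is not None else {})


spec = StubSpec(has_image=True)
adapter = SketchAdapter(spec, render=lambda p: p.encode("utf-8"))
result = asyncio.run(adapter.probe("hello"))
```

After the call, spec.calls holds the prompt together with the data URL that would have been sent to the model.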

Key Features

  • Caching: Generated images are cached to disk using @cache_to_disk
  • Progress Tracking: tqdm progress bars for dataset conversion
  • Validation: RequestAdapter raises ValueError at construction if the LLM specification does not support images
  • Standard Formats: Uses JPEG format with base64 encoding
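
To make the caching behaviour concrete, here is a minimal sketch of a disk cache keyed by a hash of the call arguments. It is not the implementation of @cache_to_disk, whose internals are not shown in this document; disk_cache and render are hypothetical names, and pickle is an assumed storage format.

```python
import hashlib
import os
import pickle
import tempfile
from functools import wraps


def disk_cache(cache_dir: str):
    """Hypothetical decorator: store results on disk, keyed by the arguments."""
    os.makedirs(cache_dir, exist_ok=True)

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # Derive a stable file name from the function name and arguments
            key_material = pickle.dumps(
                (func.__name__, args, sorted(kwargs.items()))
            )
            key = hashlib.sha256(key_material).hexdigest()
            path = os.path.join(cache_dir, key + ".pkl")
            if os.path.exists(path):
                with open(path, "rb") as fh:
                    return pickle.load(fh)
            result = func(*args, **kwargs)
            with open(path, "wb") as fh:
                pickle.dump(result, fh)
            return result
        return wrapper
    return decorator


calls = []


@disk_cache(tempfile.mkdtemp())
def render(prompt: str) -> bytes:
    calls.append(prompt)  # track how often the body actually runs
    return prompt.encode("utf-8")


first = render("hello")
second = render("hello")  # read back from disk; the body does not run again
```

Because the key depends only on the arguments, any change to the rendering style would require clearing the cache for new images to be produced.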

Configuration

The rendering parameters are currently hard-coded in generate_image:

  1. Figure size (6x4 inches)
  2. Background color (light blue)
  3. Text styling (16pt dark blue centered text)
  4. Image format (JPEG)
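
If these values ever need to vary, one option is to group them in a single object. The sketch below is illustrative only: ImageStyle and text_kwargs are hypothetical and do not exist in the system, and the defaults simply restate the values listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageStyle:
    """Hypothetical container for the rendering parameters."""
    figsize: tuple = (6, 4)          # inches
    background: str = "lightblue"
    fontsize: int = 16               # points
    text_color: str = "darkblue"
    image_format: str = "jpeg"

    def text_kwargs(self) -> dict:
        # Keyword arguments in the form accepted by matplotlib's ax.text
        return {
            "fontsize": self.fontsize,
            "ha": "center",
            "va": "center",
            "wrap": True,
            "color": self.text_color,
        }


default_style = ImageStyle()
large_style = ImageStyle(fontsize=24)
```

A renderer could then accept a style argument and call ax.text(0.5, 0.5, prompt, **style.text_kwargs()).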

Limitations

  • Only renders text prompts as images; no other kind of image content is generated
  • Fixed visual style and formatting
  • Requires matplotlib and associated dependencies
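
A start-up check can report missing dependencies before any rendering is attempted. This is a sketch, assuming that matplotlib and Pillow (which matplotlib uses to write JPEG files) are the packages to look for; missing_dependencies is a hypothetical helper.

```python
import importlib.util


def missing_dependencies(names=("matplotlib", "PIL")) -> list:
    """Return the subset of top-level module names that cannot be imported."""
    # find_spec returns None when a top-level module is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]
```

On machines without a display, selecting the non-interactive backend with matplotlib.use("Agg") before importing pyplot avoids the need for a GUI toolkit.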