Probe Data Module Documentation¶
The probe_data module is a core component of the Agentic Security project, responsible for handling datasets, generating audio and image data, and applying various transformations. This documentation provides an overview of the module's structure and functionality.
Files and Key Components¶
audio_generator.py¶
- Functions:
  - encode(content: bytes) -> str: Encodes audio content to a string format.
  - generate_audio_mac_wav(prompt: str) -> bytes: Generates audio in WAV format for macOS.
  - generate_audioform(prompt: str) -> bytes: Generates audio from a given prompt.
- Classes:
  - RequestAdapter: Handles requests for audio generation.
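As a rough illustration of what encoding audio bytes to a string can look like, here is a minimal sketch. It assumes Base64 as the string format, which this documentation does not confirm; `encode_audio_sketch` and `decode_audio_sketch` are hypothetical names, not part of the module.

```python
import base64


def encode_audio_sketch(content: bytes) -> str:
    """Hypothetical stand-in for encode(): bytes in, ASCII-safe string out."""
    # Base64 is an assumption; the documentation only says "a string format".
    return base64.b64encode(content).decode("ascii")


def decode_audio_sketch(encoded: str) -> bytes:
    """Inverse of the sketch above, handy for round-trip checks."""
    return base64.b64decode(encoded)


# Placeholder bytes for illustration only; this is not a valid WAV file.
sample_bytes = b"RIFF\x00\x00\x00\x00WAVE"
encoded = encode_audio_sketch(sample_bytes)
```

A string form like this is convenient when audio has to travel inside a JSON request body, since raw bytes cannot be embedded there directly.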
data.py¶
- Functions:
  - load_dataset_general(...): Loads datasets with general specifications.
  - count_words_in_list(str_list): Counts words in a list of strings.
  - prepare_prompts(...): Prepares prompts for dataset processing.
- Classes:
  - Stenography: Applies transformations to prompt groups.
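The word-counting helper listed above can be pictured with a short sketch. It assumes that words are separated by whitespace, which the documentation does not specify; `count_words_sketch` is a hypothetical name used only for illustration.

```python
def count_words_sketch(str_list: list[str]) -> int:
    """Hypothetical stand-in for count_words_in_list()."""
    # Assumption: a "word" is any whitespace-delimited token.
    return sum(len(text.split()) for text in str_list)


prompts = ["What is AI?", "Explain machine learning"]
total_words = count_words_sketch(prompts)  # 3 words + 3 words
```

A count like this gives a quick size estimate for a prompt set before it is sent for processing.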
image_generator.py¶
- Functions:
  - generate_image_dataset(...): Generates a dataset of images.
  - generate_image(prompt: str) -> bytes: Generates an image from a prompt.
- Classes:
  - RequestAdapter: Handles requests for image generation.
models.py¶
- Classes:
  - ProbeDataset: Represents a dataset for probing.
  - ImageProbeDataset: Extends ProbeDataset for image data.
msj_data.py¶
- Functions:
  - load_dataset_generic(...): Loads a generic dataset.
- Classes:
  - ProbeDataset: Represents a dataset for probing.
stenography_fn.py¶
- Functions:
  - rot13(input_text): Applies ROT13 transformation.
  - base64_encode(data): Encodes data in base64 format.
  - mirror_words(text): Mirrors words in the text.
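The three transformations listed above can be sketched with the Python standard library. These are illustrative stand-ins with hypothetical names, not the module's own code. In particular, the sketch assumes that "mirroring" means reversing the characters of each word while keeping word order, and that `base64_encode` takes and returns text.

```python
import base64
import codecs


def rot13_sketch(input_text: str) -> str:
    """Rotate each ASCII letter by 13 places; other characters are unchanged."""
    return codecs.encode(input_text, "rot_13")


def base64_encode_sketch(data: str) -> str:
    """Base64-encode a text string (assumes UTF-8 input and text output)."""
    return base64.b64encode(data.encode("utf-8")).decode("ascii")


def mirror_words_sketch(text: str) -> str:
    """Reverse the characters of each whitespace-separated word (assumed meaning)."""
    return " ".join(word[::-1] for word in text.split())


rot = rot13_sketch("Hello")                    # "Uryyb"
b64 = base64_encode_sketch("hi")               # "aGk="
mirrored = mirror_words_sketch("hello world")  # "olleh dlrow"
```

ROT13 is its own inverse, so applying `rot13_sketch` twice returns the original text.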
rl_model.py¶
- Classes:
  - PromptSelectionInterface: Abstract base class for prompt selection strategies.
    - Methods:
      - select_next_prompt(current_prompt: str, passed_guard: bool) -> str: Selects next prompt.
      - select_next_prompts(current_prompt: str, passed_guard: bool) -> list[str]: Selects multiple prompts.
      - update_rewards(previous_prompt: str, current_prompt: str, reward: float, passed_guard: bool) -> None: Updates rewards.
  - RandomPromptSelector: Basic random selection with history tracking.
    - Parameters:
      - prompts: list[str]: List of available prompts.
      - history_size: int = 3: Size of history to prevent cycles.
  - CloudRLPromptSelector: Cloud-based RL implementation with fallback.
    - Parameters:
      - prompts: list[str]: List of available prompts.
      - api_url: str: URL of RL service.
      - auth_token: str = AUTH_TOKEN: Authentication token.
      - history_size: int = 300: Size of history.
      - timeout: int = 5: Request timeout.
      - run_id: str = "": Unique run identifier.
  - QLearningPromptSelector: Local Q-learning implementation.
    - Parameters:
      - prompts: list[str]: List of available prompts.
      - learning_rate: float = 0.1: Learning rate.
      - discount_factor: float = 0.9: Discount factor.
      - initial_exploration: float = 1.0: Initial exploration rate.
      - exploration_decay: float = 0.995: Exploration decay rate.
      - min_exploration: float = 0.01: Minimum exploration rate.
      - history_size: int = 300: Size of history.
  - Module: Main class that uses CloudRLPromptSelector.
    - Parameters:
      - prompt_groups: list[str]: Groups of prompts.
      - tools_inbox: asyncio.Queue: Queue for tool communication.
      - opts: dict = {}: Configuration options.
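To make the relationship between the abstract interface and a history-tracking random selector more concrete, here is a minimal sketch. The class names `PromptSelectorSketch` and `RandomSelectorSketch` are hypothetical, and the cycle-avoidance rule (never return a prompt still held in the recent history) is an assumption about what "history to prevent cycles" means, not the module's confirmed behavior.

```python
import random
from abc import ABC, abstractmethod
from collections import deque


class PromptSelectorSketch(ABC):
    """Hypothetical interface mirroring two of the documented methods."""

    @abstractmethod
    def select_next_prompt(self, current_prompt: str, passed_guard: bool) -> str:
        ...

    @abstractmethod
    def update_rewards(self, previous_prompt: str, current_prompt: str,
                       reward: float, passed_guard: bool) -> None:
        ...


class RandomSelectorSketch(PromptSelectorSketch):
    """Random choice that skips prompts seen in the last `history_size` steps."""

    def __init__(self, prompts: list[str], history_size: int = 3):
        if not prompts:
            raise ValueError("prompts must not be empty")
        self.prompts = prompts
        # A bounded deque drops the oldest entry automatically.
        self.history = deque(maxlen=history_size)

    def select_next_prompt(self, current_prompt: str, passed_guard: bool) -> str:
        self.history.append(current_prompt)
        candidates = [p for p in self.prompts if p not in self.history]
        # Fall back to the full list if the history covers every prompt.
        return random.choice(candidates or self.prompts)

    def update_rewards(self, previous_prompt: str, current_prompt: str,
                       reward: float, passed_guard: bool) -> None:
        # Purely random selection has nothing to learn from rewards.
        pass


selector = RandomSelectorSketch(["a", "b", "c", "d", "e"], history_size=3)
next_prompt = selector.select_next_prompt("a", passed_guard=True)
```

With more prompts than history slots, as in this example, the selector never returns the prompt it was just given.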
Usage Examples¶
Generating Audio¶
from agentic_security.probe_data.audio_generator import generate_audioform
audio_bytes = generate_audioform("Hello, world!")
Loading a Dataset¶
from agentic_security.probe_data.data import load_dataset_general
dataset = load_dataset_general("example_dataset")
Using RL Model¶
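from agentic_security.probe_data.modules.rl_model import QLearningPromptSelector
# Candidate prompts the selector can choose from
prompts = ["What is AI?", "Explain machine learning"]
selector = QLearningPromptSelector(prompts)
# Pick the next prompt based on the current one and the passed_guard flag
current_prompt = "What is AI?"
next_prompt = selector.select_next_prompt(current_prompt, passed_guard=True)
# Report the reward observed for the transition current_prompt -> next_prompt
selector.update_rewards(current_prompt, next_prompt, reward=1.0, passed_guard=True)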
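For readers unfamiliar with Q-learning, the sketch below shows what calls like `select_next_prompt` and `update_rewards` typically do in a tabular, epsilon-greedy setup. It is a generic textbook formulation under stated assumptions, not the code of QLearningPromptSelector: the class name `QLearningSketch` is hypothetical, the state is assumed to be the current prompt and the action the next prompt, and `passed_guard` is accepted but unused. The parameter names and defaults are taken from the list in this documentation.

```python
import random
from collections import defaultdict


class QLearningSketch:
    """Hypothetical tabular Q-learning over prompt-to-prompt transitions."""

    def __init__(self, prompts, learning_rate=0.1, discount_factor=0.9,
                 initial_exploration=1.0, exploration_decay=0.995,
                 min_exploration=0.01):
        self.prompts = prompts
        self.learning_rate = learning_rate
        self.discount_factor = discount_factor
        self.exploration = initial_exploration
        self.exploration_decay = exploration_decay
        self.min_exploration = min_exploration
        # q[state][action]: state is the current prompt, action is the next one.
        self.q = defaultdict(lambda: defaultdict(float))

    def select_next_prompt(self, current_prompt, passed_guard):
        # Epsilon-greedy: explore with probability `exploration`, else exploit.
        if random.random() < self.exploration:
            return random.choice(self.prompts)
        row = self.q[current_prompt]
        return max(self.prompts, key=lambda p: row[p])

    def update_rewards(self, previous_prompt, current_prompt, reward, passed_guard):
        # Standard update: Q(s,a) += lr * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(self.q[current_prompt][p] for p in self.prompts)
        old_value = self.q[previous_prompt][current_prompt]
        target = reward + self.discount_factor * best_next
        self.q[previous_prompt][current_prompt] = (
            old_value + self.learning_rate * (target - old_value)
        )
        # Decay exploration, but never below the configured floor.
        self.exploration = max(self.min_exploration,
                               self.exploration * self.exploration_decay)


sketch = QLearningSketch(["What is AI?", "Explain machine learning"])
sketch.update_rewards("What is AI?", "Explain machine learning",
                      reward=1.0, passed_guard=True)
```

Starting from an empty table, a single reward of 1.0 moves the corresponding Q-value to 0.1 (the learning rate times the reward), and the exploration rate decays from 1.0 to 0.995.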
Conclusion¶
The probe_data module provides essential functionality for handling and transforming datasets within the Agentic Security project. This documentation serves as a guide to understanding and utilizing the module's capabilities.