Open
Labels
enhancement (New feature or request)
Description
Problem
When iterating on agent prompts interactively (e.g. in VS Code Copilot, Claude Code, or any chat-based agent), useful test scenarios emerge organically from real conversations. Currently there's no way to capture these conversations and convert them into AgentV eval cases without manually writing YAML.
Proposal
Create an agent skill (not a CLI subcommand) that:
- Accepts a chat conversation transcript (e.g. markdown, JSON, or pasted text)
- Extracts test-worthy exchanges — identifying the user input, expected outcome, and optionally expected tool calls
- Generates AgentV eval YAML cases from them
- Optionally appends to an existing eval file or creates a new one
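The first two steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the proposed skill: the helper names `load_messages` and `candidate_cases` are hypothetical, the markdown layout (turns starting with `User:` or `Assistant:`, optionally bold) is an assumed convention, and the assistant reply is used as a stand-in for `expected_outcome`, which the real skill would derive with an LLM.

```python
import json
import re

# Assumed markdown chat layout: each turn starts with "User:" or "Assistant:",
# optionally wrapped in bold markers. Real transcripts may differ.
_TURN = re.compile(r"^\**(user|assistant)\**\s*:\**\s*(.*)$", re.IGNORECASE)


def load_messages(transcript: str) -> list[dict]:
    """Normalize a JSON messages array or a markdown chat into role/content dicts."""
    text = transcript.strip()
    if text.startswith("["):
        # JSON messages array: keep only the role and content keys.
        return [{"role": m["role"], "content": m["content"]} for m in json.loads(text)]
    messages: list[dict] = []
    for raw in text.splitlines():
        line = raw.strip()
        match = _TURN.match(line)
        if match:
            messages.append({"role": match.group(1).lower(), "content": match.group(2).strip()})
        elif messages and line:
            # Continuation line: append it to the current turn.
            messages[-1]["content"] = (messages[-1]["content"] + "\n" + line).strip()
    return messages


def candidate_cases(messages: list[dict]) -> list[dict]:
    """Pair each user turn with the assistant reply that follows it."""
    cases = []
    for current, following in zip(messages, messages[1:]):
        if current["role"] == "user" and following["role"] == "assistant":
            cases.append({
                "input": current["content"],
                # Placeholder only: the skill would summarize the desired outcome.
                "expected_outcome": following["content"],
            })
    return cases
```

Normalizing both transcript formats into one message shape keeps the extraction step independent of how the conversation was captured. Deciding which exchanges are actually test-worthy is left to the LLM-driven part of the skill.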
Why a skill, not a subcommand
- Keeps AgentV core minimal
- The conversion is inherently LLM-powered (extracting intent and expected outcomes from freeform chat), which makes it a natural fit for an agent skill
- Users can customise the skill's prompt to match their eval style
- Works with any agent that supports skills (Copilot CLI, Claude Code, etc.)
Acceptance Criteria
- Skill accepts a conversation transcript and produces valid AgentV eval YAML
- Generated cases include `input`, `expected_outcome`, and optionally `evaluators` config
- Works with common transcript formats (markdown chat, JSON messages array)
- Documentation and example usage
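The field requirements in the criteria above could be checked with a small helper like the one below. It is a sketch only: `validate_case` is a hypothetical name, it checks just the two fields this issue names as required, and it makes no claim about the rest of the AgentV eval schema or the shape of `evaluators`.

```python
# Field names taken from the acceptance criteria; anything else is out of scope here.
REQUIRED_FIELDS = ("input", "expected_outcome")


def validate_case(case: dict) -> list[str]:
    """Return a list of problems; an empty list means the required fields are present."""
    problems = []
    for field in REQUIRED_FIELDS:
        # Treat a missing key, None, or an empty value as a failure.
        if not case.get(field):
            problems.append(f"missing or empty required field: {field}")
    return problems
```

For example, `validate_case({"input": "hi"})` reports that `expected_outcome` is missing, while a case carrying both fields returns an empty list.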