feat: agent skill to convert chat conversations into eval cases

## Problem

When iterating on agent prompts interactively (e.g. in VS Code Copilot, Claude Code, or any chat-based agent), useful test scenarios emerge organically from real conversations. Currently there's no way to capture these conversations and convert them into AgentV eval cases without manually writing YAML.

## Proposal

Create an agent skill (not a CLI subcommand) that:

1. Accepts a chat conversation transcript (e.g. markdown, JSON, or pasted text)
2. Extracts test-worthy exchanges — identifying the user input, expected outcome, and optionally expected tool calls
3. Generates AgentV eval YAML cases from them
4. Optionally appends to an existing eval file or creates a new one

## Why a skill, not a subcommand

- Keeps AgentV core minimal
- The conversion is inherently LLM-powered (extracting intent, expected outcomes from freeform chat) — perfect for an agent skill
- Users can customise the skill's prompt to match their eval style
- Works with any agent that supports skills (Copilot CLI, Claude Code, etc.)

## Acceptance Criteria

- Skill accepts a conversation transcript and produces valid AgentV eval YAML
- Generated cases include `input`, `expected_outcome`, and optionally `evaluators` config
- Works with common transcript formats (markdown chat, JSON messages array)
- Documentation and example usage

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: agent skill to convert chat conversations into eval cases #182

Problem

Proposal

Why a skill, not a subcommand

Acceptance Criteria

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

feat: agent skill to convert chat conversations into eval cases #182

Description

Problem

Proposal

Why a skill, not a subcommand

Acceptance Criteria

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions