Agent Mode: Ensure consistent behavior across models #114

@CarGDev

Problem

Different LLMs behaved inconsistently during agent execution:

  • Some showed plans, others didn't
  • Approval flow varied by model
  • Output formatting differed

Observed Inconsistencies

Model              Plan Shown        Waited for Approval  Output Format
GPT-5-mini         Yes (multiple)    No                   Plain text
Claude Sonnet 4.5  Yes (structured)  No                   Markdown
Gemini             Yes               No                   Plain text

Expected Behavior

Regardless of which LLM provider is used:

  1. Same plan format displayed
  2. Same approval flow enforced
  3. Same tool-calling interface
  4. Consistent output formatting
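
One way to get the same plan format from every provider is to parse each model's raw plan text into a shared shape and render it with a single function. The sketch below is illustrative only: `PlanStep`, `Plan`, `PlanParser`, `parsePlainTextPlan`, and `renderPlan` are hypothetical names, not part of the existing codebase, and it assumes the model emits one step per line.

```typescript
// Hypothetical shared shapes; none of these names come from the project.
interface PlanStep {
  description: string;
}

interface Plan {
  steps: PlanStep[];
}

// Each provider adapter turns raw model text into the shared Plan shape.
type PlanParser = (raw: string) => Plan;

// Assumption: the model emits a list with one step per line, optionally
// prefixed by a bullet ("-", "*", "•") or a number ("1." or "1)").
const parsePlainTextPlan: PlanParser = (raw) => ({
  steps: raw
    .split("\n")
    .map((line) => line.replace(/^\s*(?:[-*•]|\d+[.)])\s*/, "").trim())
    .filter((line) => line.length > 0)
    .map((description) => ({ description })),
});

// A single renderer, so every provider's plan is displayed the same way.
const renderPlan = (plan: Plan): string =>
  plan.steps.map((step, i) => `${i + 1}. ${step.description}`).join("\n");
```

A provider that returns structured or Markdown plans would need its own `PlanParser`, but the renderer and everything downstream would stay unchanged.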

Implementation Suggestions

The agent layer should normalize behavior:

  • Wrap model responses in consistent format
  • Enforce approval gates at application level (not model level)
  • Standardize output through formatters

// Application-level enforcement, not model-dependent
const executeWithApproval = async (plan: Plan): Promise<void> => {
  // Block until the user explicitly approves or rejects the plan
  const approved = await showPlanAndWaitForApproval(plan);
  // Rejected plans never reach execution, whatever the model did
  if (!approved) return;
  // Execute...
};
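
A slightly fuller sketch of the same idea, with the approval prompt and the step runner injected so the gate can be tested without a real UI or model. All names here (`ApprovalPrompt`, `StepRunner`, `formatOutput`, `runAgentTurn`) are hypothetical, and the formatter only normalizes line endings and surrounding whitespace as a stand-in for whatever formatting rules the project settles on.

```typescript
// Hypothetical types for illustration; not taken from the project.
interface PlanStep {
  description: string;
}

interface Plan {
  steps: PlanStep[];
}

type ApprovalPrompt = (plan: Plan) => Promise<boolean>;
type StepRunner = (step: PlanStep) => Promise<string>;

interface TurnResult {
  approved: boolean;
  outputs: string[];
}

// Stand-in formatter: normalize line endings and trim whitespace so that
// output looks the same regardless of which model produced it.
const formatOutput = (raw: string): string =>
  raw.replace(/\r\n/g, "\n").trim();

// The gate lives in application code: no step runs unless the user approved,
// whether or not the model itself "waited" for approval.
const runAgentTurn = async (
  plan: Plan,
  askApproval: ApprovalPrompt,
  runStep: StepRunner,
): Promise<TurnResult> => {
  const approved = await askApproval(plan);
  if (!approved) {
    return { approved: false, outputs: [] };
  }

  const outputs: string[] = [];
  for (const step of plan.steps) {
    // Steps run sequentially, and every output passes through the formatter.
    outputs.push(formatOutput(await runStep(step)));
  }
  return { approved: true, outputs };
};
```

Injecting `askApproval` keeps the gate independent of any particular UI, and passing every result through `formatOutput` gives a single place to enforce output conventions.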

Priority

🟡 Medium - Affects user experience and predictability


Generated from model evaluation test
