Parameterize LLM returning reasoning #64

steven10a · 2025-12-10T20:31:42Z

Allow users to toggle reasoning on and off for the LLM-based guardrails via the config file.

- include_reasoning (optional): Whether to include reasoning/explanation fields in the guardrail output (default: false)
  - When false: The LLM only generates the essential fields (flagged and confidence), reducing token generation costs
  - When true: Additionally, returns detailed reasoning for its decisions
  - Use Case: Keep disabled for production to minimize costs; enable for development and debugging
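As a rough illustration of the toggle described above, the sketch below reads `include_reasoning` from a guardrail config entry and falls back to `false` when the key is absent. Only the `include_reasoning` key and its default come from this PR; the surrounding schema (`name`, `config`, `model`, `confidence_threshold`), the model name, and the helper `reasoning_enabled` are assumptions made for the example.

```python
import json

# Hypothetical config entry. Only `include_reasoning` is taken from this PR;
# the other keys and values are placeholders for illustration.
DEV_CONFIG_JSON = """
{
  "name": "Jailbreak",
  "config": {
    "model": "gpt-4.1-mini",
    "confidence_threshold": 0.7,
    "include_reasoning": true
  }
}
"""


def reasoning_enabled(guardrail: dict) -> bool:
    """Return the include_reasoning flag, defaulting to False when omitted."""
    return bool(guardrail.get("config", {}).get("include_reasoning", False))


# Development: reasoning explicitly enabled.
dev_guardrail = json.loads(DEV_CONFIG_JSON)

# Production: key omitted, so the default (False) applies.
prod_guardrail = {
    "name": "Jailbreak",
    "config": {"model": "gpt-4.1-mini", "confidence_threshold": 0.7},
}
```

Leaving the key out in production keeps the default behavior, so only development configs need to mention it.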

Updated docs and tests

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you:

- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review"

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

src/guardrails/checks/text/llm_base.py

Copilot

Pull request overview

This PR adds the ability to parameterize whether LLM-based guardrails return reasoning/explanation fields, allowing users to reduce token generation costs in production while enabling detailed output for development and debugging.

Key Changes

- Added `include_reasoning` field to `LLMConfig` (default: false) to control whether reasoning fields are included
- Created `LLMReasoningOutput` class extending `LLMOutput` with a `reason` field
- Updated guardrail implementations to conditionally use extended or base output models based on configuration
- Updated tests to validate behavior with reasoning enabled/disabled
- Updated documentation to reflect the new optional parameter
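The model-selection change summarized above could look roughly like the following. This is a minimal sketch, not the code from the PR: the class names `LLMConfig`, `LLMOutput`, and `LLMReasoningOutput` and the `reason` field come from the summary, but they are modeled here as plain dataclasses, and the helper `select_output_model` plus the `model` and `confidence_threshold` fields are assumptions.

```python
from dataclasses import dataclass


@dataclass
class LLMConfig:
    # `model` and `confidence_threshold` are placeholder fields (assumed).
    model: str
    confidence_threshold: float = 0.7
    # From the PR summary: defaults to False to save output tokens.
    include_reasoning: bool = False


@dataclass
class LLMOutput:
    """Base output: only the essential fields."""
    flagged: bool
    confidence: float


@dataclass
class LLMReasoningOutput(LLMOutput):
    """Extended output: adds the explanation field."""
    reason: str


def select_output_model(config: LLMConfig) -> type[LLMOutput]:
    """Hypothetical helper: pick the output schema from the config flag."""
    return LLMReasoningOutput if config.include_reasoning else LLMOutput


# Usage: the default config yields the lean schema.
default_model = select_output_model(LLMConfig(model="example-model"))
debug_model = select_output_model(
    LLMConfig(model="example-model", include_reasoning=True)
)
```

Because the reasoning model subclasses the base model, callers that only read `flagged` and `confidence` should work with either schema.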

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.


| File | Description |
| --- | --- |
| `src/guardrails/checks/text/llm_base.py` | Added `include_reasoning` config field, `LLMReasoningOutput` class, and logic to select output model |
| `src/guardrails/checks/text/jailbreak.py` | Removed custom `JailbreakLLMOutput` class, now uses base/reasoning models conditionally |
| `src/guardrails/checks/text/hallucination_detection.py` | Updated to conditionally use `HallucinationDetectionOutput` based on reasoning config |
| `src/guardrails/checks/text/prompt_injection_detection.py` | Updated to conditionally use `PromptInjectionDetectionOutput` based on reasoning config |
| `src/guardrails/checks/text/nsfw.py` | Removed explicit `output_model` parameter, now uses default reasoning support |
| `src/guardrails/checks/text/off_topic_prompts.py` | Removed explicit `output_model` parameter, now uses default reasoning support |
| `src/guardrails/checks/text/user_defined_llm.py` | Removed explicit `output_model` parameter, now uses default reasoning support |
| `src/guardrails/evals/core/benchmark_reporter.py` | Fixed file path sanitization for model names containing "/" |
| `tests/unit/checks/test_llm_base.py` | Added tests for reasoning toggle behavior |
| `tests/unit/checks/test_jailbreak.py` | Updated tests to not expect reason field by default, added reasoning toggle tests |
| `docs/ref/checks/*.md` | Updated documentation to describe the `include_reasoning` parameter |


docs/ref/checks/off_topic_prompts.md

tests/unit/checks/test_llm_base.py

src/guardrails/checks/text/llm_base.py

steven10a · 2025-12-10T20:53:38Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.


src/guardrails/checks/text/hallucination_detection.py

Copilot

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.


tests/unit/checks/test_llm_base.py

steven10a · 2025-12-10T21:16:01Z

@codex review

Copilot

Pull request overview

Copilot reviewed 17 out of 17 changed files in this pull request and generated 6 comments.


src/guardrails/checks/text/llm_base.py

src/guardrails/evals/core/benchmark_reporter.py

src/guardrails/checks/text/hallucination_detection.py

src/guardrails/checks/text/prompt_injection_detection.py

tests/unit/checks/test_llm_base.py

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.


src/guardrails/checks/text/hallucination_detection.py

steven10a · 2025-12-10T21:40:03Z

@codex review

chatgpt-codex-connector

💡 Codex Review

openai-guardrails-python/src/guardrails/checks/text/llm_base.py

Lines 389 to 393 in 2d42617

    
    return (
        output_model(
            flagged=False,
            confidence=0.0,
        ),

Handle empty responses with reasoning output

When include_reasoning=True, guardrails using create_llm_check_fn pass LLMReasoningOutput (which requires a reason) into run_llm. In the empty-response fallback here, the code instantiates output_model(flagged=False, confidence=0.0) without supplying the required reasoning field, so a blank/filtered completion will raise a validation error and fall into the generic error path instead of returning a benign unflagged result (the previous behaviour). Provide a default reason or skip validation in this branch so reasoning-enabled guardrails gracefully handle empty outputs.
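One way to address this, sketched under assumptions: build the fallback arguments first and add a default `reason` only when the selected model declares that field. The classes below are dataclass stand-ins for the real output models, and `empty_response_result` and the default message are hypothetical names chosen for this example, not the fix that was merged.

```python
from dataclasses import dataclass, fields


@dataclass
class LLMOutput:
    """Stand-in for the base output model."""
    flagged: bool
    confidence: float


@dataclass
class LLMReasoningOutput(LLMOutput):
    """Stand-in for the reasoning output model; `reason` is required."""
    reason: str


# Hypothetical default text for the fallback path.
EMPTY_RESPONSE_REASON = "Empty response from model; treated as not flagged."


def empty_response_result(output_model: type[LLMOutput]) -> LLMOutput:
    """Return a benign, unflagged result for a blank or filtered completion."""
    kwargs = {"flagged": False, "confidence": 0.0}
    # Supply the required field only for models that actually define it,
    # so the base model is constructed exactly as before.
    if "reason" in {f.name for f in fields(output_model)}:
        kwargs["reason"] = EMPTY_RESPONSE_REASON
    return output_model(**kwargs)


# Usage: both schemas produce an unflagged result instead of raising.
base_result = empty_response_result(LLMOutput)
reasoning_result = empty_response_result(LLMReasoningOutput)
```

Checking the declared fields rather than the `include_reasoning` flag means custom output models that add their own `reason` field are covered by the same branch.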


steven10a · 2025-12-10T21:59:36Z

@codex review

Copilot

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

chatgpt-codex-connector · 2025-12-10T22:05:16Z

Codex Review: Didn't find any major issues. Hooray!


docs/ref/checks/custom_prompt_check.md

gabor-openai

LGTM thank you!

Parameterize LLM returning reasoning

5b2f338

Copilot AI review requested due to automatic review settings December 10, 2025 20:31

Copilot started reviewing on behalf of steven10a December 10, 2025 20:32

chatgpt-codex-connector bot reviewed Dec 10, 2025

src/guardrails/checks/text/llm_base.py (outdated, resolved)

Copilot AI reviewed Dec 10, 2025

docs/ref/checks/off_topic_prompts.md (outdated, resolved)

tests/unit/checks/test_llm_base.py (outdated, resolved)

src/guardrails/checks/text/llm_base.py (outdated, resolved)

Respect custom output models

95654f7

steven10a requested a review from Copilot December 10, 2025 20:53

Copilot started reviewing on behalf of steven10a December 10, 2025 20:53

chatgpt-codex-connector bot reviewed Dec 10, 2025

src/guardrails/checks/text/hallucination_detection.py (outdated, resolved)

Copilot AI reviewed Dec 10, 2025

tests/unit/checks/test_llm_base.py (outdated, resolved)

Make sys prompts dynamic to respect reasoning flag

a1a0c99

steven10a requested a review from Copilot December 10, 2025 21:15

Copilot started reviewing on behalf of steven10a December 10, 2025 21:16

Copilot AI reviewed Dec 10, 2025

chatgpt-codex-connector bot reviewed Dec 10, 2025

src/guardrails/checks/text/hallucination_detection.py (resolved)

Add tests

2d42617

steven10a requested a review from Copilot December 10, 2025 21:39

Copilot started reviewing on behalf of steven10a December 10, 2025 21:40

chatgpt-codex-connector bot reviewed Dec 10, 2025

Gracefully handle empty outputs

75435ec

Copilot AI reviewed Dec 10, 2025


steven10a requested a review from gabor-openai December 10, 2025 23:28

gabor-openai reviewed Dec 12, 2025

docs/ref/checks/custom_prompt_check.md (resolved)

add note on performance and latency

fe3ee1a

steven10a requested a review from gabor-openai December 12, 2025 22:45

gabor-openai approved these changes Dec 12, 2025


gabor-openai merged commit 92246d9 into main Dec 12, 2025
3 checks passed

gabor-openai deleted the dev/steven/reasoning_output branch December 12, 2025 23:06
