Adding multi-turn support to all LLM based guardrails #65
Conversation
Pull request overview
This PR adds multi-turn conversation support to all LLM-based guardrails by extending the llm_base.py infrastructure. Previously, only the Jailbreak guardrail supported conversation history analysis; now all LLM-based guardrails can leverage conversation context for more robust detection across multiple turns.
Key changes:
- Extended `LLMConfig` with a `max_turns` parameter (default: 10) to control conversation history length (see the configuration sketch after this list)
- Modified `run_llm()` to accept conversation history and intelligently switch between single-turn and multi-turn formats
- Refactored the Jailbreak guardrail to use the common `create_llm_check_fn` factory instead of a custom implementation
- Updated Prompt Injection Detection to respect the `max_turns` configuration
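A minimal sketch of what opting into a shorter history window could look like, assuming `LLMConfig` is constructed directly and keeps its existing `model` and `confidence_threshold` fields (both assumptions here; only `max_turns` is taken from this PR):

```python
# Minimal sketch; field names other than max_turns are assumptions.
from guardrails.checks.text.llm_base import LLMConfig

config = LLMConfig(
    model="gpt-4.1-mini",      # guardrail LLM that runs the check (assumed field)
    confidence_threshold=0.7,  # assumed existing field
    max_turns=5,               # new in this PR: cap history at the 5 most recent turns
)
```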
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/guardrails/checks/text/llm_base.py | Added multi-turn support infrastructure: max_turns field in LLMConfig, _build_analysis_payload() helper, and conversation history handling in run_llm() and create_llm_check_fn() |
| src/guardrails/checks/text/jailbreak.py | Refactored to use create_llm_check_fn() factory, removing custom payload building and execution logic (~80 lines of code removed) |
| src/guardrails/checks/text/prompt_injection_detection.py | Updated _extract_user_intent_from_messages() and _slice_conversation_since_latest_user() to accept and respect max_turns parameter |
| tests/unit/checks/test_llm_base.py | Added comprehensive tests for multi-turn functionality, conversation history extraction, and payload building |
| tests/unit/checks/test_jailbreak.py | Updated tests to work with refactored implementation using the common factory pattern |
| tests/unit/checks/test_prompt_injection_detection.py | Added tests verifying max_turns configuration is properly applied |
| docs/ref/checks/llm_base.md | Updated documentation to describe multi-turn support, max_turns parameter, and usage patterns |
| docs/ref/checks/jailbreak.md | Updated to reflect new multi-turn capabilities and simplified configuration |
| docs/ref/checks/nsfw.md | Added max_turns parameter documentation and token usage example |
| docs/ref/checks/off_topic_prompts.md | Added max_turns parameter documentation and token usage example |
| docs/ref/checks/custom_prompt_check.md | Added max_turns parameter documentation and token usage example |
| docs/ref/checks/prompt_injection_detection.md | Added max_turns parameter documentation |
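To make the single-turn/multi-turn switch concrete, here is an illustrative sketch of what the `_build_analysis_payload()` helper could do. This is not the PR's actual code: the signature and the JSON payload shape are assumptions; only the helper's name and the single-turn fallback behavior come from the summary above.

```python
# Illustrative sketch -- the real _build_analysis_payload() in llm_base.py
# may differ; the signature and JSON payload shape are assumptions.
import json
from typing import Any


def _build_analysis_payload(
    latest_text: str,
    conversation_history: list[dict[str, Any]] | None,
    max_turns: int,
) -> str:
    """Build the text the guardrail LLM analyzes."""
    if not conversation_history:
        # Single-turn: behave exactly like the pre-PR guardrails.
        return latest_text
    # Multi-turn: keep only the most recent max_turns turns to bound token cost.
    recent_turns = conversation_history[-max_turns:]
    return json.dumps({"conversation": recent_turns, "latest_input": latest_text})
```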
@codex review
Codex Review: Didn't find any major issues. 🎉
Extending `llm_base.py` to always use the `conversation_history` from `ctx` to provide the conversation history to all LLM-based guardrails. Previously only the Jailbreak guardrail had custom multi-turn support. Adds `max_turns` to the config to control how much of the conversation is passed to the guardrail, balancing token cost with context.
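For example, `max_turns` could be tuned per guardrail to balance token cost against context. A hedged sketch, assuming a dict-style pipeline config; the guardrail names and surrounding keys are assumptions, and only `max_turns` inside `config` comes from this PR:

```python
# Hypothetical pipeline config; keys and guardrail names are assumptions.
pipeline_config = {
    "version": 1,
    "input": {
        "version": 1,
        "guardrails": [
            # Cheap content check: a short window keeps token usage low.
            {"name": "NSFW Text", "config": {"model": "gpt-4.1-mini", "max_turns": 3}},
            # Jailbreaks often build up across turns, so allow more history
            # (10 is the new default).
            {
                "name": "Jailbreak",
                "config": {
                    "model": "gpt-4.1-mini",
                    "confidence_threshold": 0.7,
                    "max_turns": 10,
                },
            },
        ],
    },
}
```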