Conversation

@Soulter Soulter (Member) commented Jan 12, 2026

Modifications

  • This is NOT a breaking change.

Screenshots or Test Results


Checklist

  • 😊 If there are new features added in the PR, I have discussed them with the authors through issues/emails, etc.
  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
  • 🤓 I have ensured that no new dependencies are introduced, OR, if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
  • 😮 My changes do not introduce malicious code.

Summary by Sourcery

Implement an optional LLM safety mode that prepends a safety-focused system prompt to provider requests and expose its configuration in defaults and UI metadata.

New Features:

  • Add configurable LLM safety mode and strategy options to the default settings and provider configuration metadata.
  • Introduce a safety mode system prompt that guides models toward safe, healthy responses while refusing harmful or sensitive requests.

Enhancements:

  • Integrate LLM safety mode into the internal agent processing pipeline so that enabled providers automatically receive the safety system prompt before handling requests.
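The pipeline integration described above can be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code: the names `ProviderRequest`, `SAFETY_SYSTEM_PROMPT`, and `_apply_llm_safety_mode` are assumptions based on the summary and review comments.

```python
# Hypothetical sketch of the safety-mode hook; all names here are
# assumptions, not the merged implementation.
from dataclasses import dataclass


# Assumed wording; the PR ships its own safety-focused system prompt.
SAFETY_SYSTEM_PROMPT = (
    "You are a helpful assistant. Respond in a safe and healthy manner, "
    "and refuse harmful or sensitive requests."
)


@dataclass
class ProviderRequest:
    prompt: str
    system_prompt: str = ""


def _apply_llm_safety_mode(req: ProviderRequest, enabled: bool) -> ProviderRequest:
    """Prepend the safety system prompt when safety mode is enabled."""
    if enabled and SAFETY_SYSTEM_PROMPT not in req.system_prompt:
        # Prepending gives the safety prompt top priority; see the review
        # comments below for the trade-off versus appending.
        req.system_prompt = f"{SAFETY_SYSTEM_PROMPT}\n{req.system_prompt}".strip()
    return req


req = _apply_llm_safety_mode(
    ProviderRequest(prompt="hi", system_prompt="Be brief."), enabled=True
)
print(req.system_prompt.startswith(SAFETY_SYSTEM_PROMPT))  # True
```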

@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Jan 12, 2026
@sourcery-ai sourcery-ai bot (Contributor) left a comment
Hey - I've left some high level feedback:

  • In _apply_llm_safety_mode, consider whether the safety prompt should be appended after any existing req.system_prompt (or clearly documented as taking top priority) to avoid unexpectedly changing the relative weight/priority of existing system instructions.
  • When safety_mode_strategy is not recognized, you currently only log a warning and effectively disable safety mode for that request; consider falling back to the default system_prompt strategy or failing fast so misconfiguration does not silently bypass the safety behavior.
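The second suggestion (fall back to the default strategy rather than silently disabling safety) could look roughly like this. The strategy names, the fallback behaviour, and the `apply_safety` helper are assumptions for illustration, not the code the PR merged.

```python
# Illustrative sketch of the reviewer's fail-safe fallback suggestion;
# names and behaviour are assumptions, not the merged implementation.
import logging

logger = logging.getLogger(__name__)

SAFETY_PROMPT = "Respond safely; refuse harmful or sensitive requests."
KNOWN_STRATEGIES = {"system_prompt"}


def apply_safety(system_prompt: str, strategy: str) -> str:
    """Apply the safety prompt, falling back to the default strategy
    instead of silently disabling safety on misconfiguration."""
    if strategy not in KNOWN_STRATEGIES:
        logger.warning(
            "Unknown safety_mode_strategy %r; falling back to 'system_prompt'",
            strategy,
        )
        strategy = "system_prompt"  # fail safe, not open
    if strategy == "system_prompt":
        # Append after existing instructions so their relative priority
        # is preserved, per the first review comment.
        return f"{system_prompt}\n{SAFETY_PROMPT}".strip()
    return system_prompt


print(apply_safety("Be brief.", "bogus"))
```

With this shape, a typo in `safety_mode_strategy` still produces a safety-guarded request instead of quietly bypassing the feature.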
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `_apply_llm_safety_mode`, consider whether the safety prompt should be appended after any existing `req.system_prompt` (or clearly documented as taking top priority) to avoid unexpectedly changing the relative weight/priority of existing system instructions.
- When `safety_mode_strategy` is not recognized, you currently only log a warning and effectively disable safety mode for that request; consider falling back to the default `system_prompt` strategy or failing fast so misconfiguration does not silently bypass the safety behavior.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

@dosubot dosubot bot added the area:core The bug / feature is about astrbot's core, backend label Jan 12, 2026
@Soulter Soulter changed the title feat(safety): implement LLM safety mode feat(safety): LLM healthy mode Jan 12, 2026
@Soulter Soulter merged commit 52bba90 into master Jan 12, 2026
6 checks passed
@Soulter Soulter deleted the feat/llm-safety-mode branch January 12, 2026 10:33