# user-proxy-evaluation

Here is 1 public repository matching this topic...

SAP / mirrorbench

An automatic, extensible Framework to Evaluate User-Proxy Agents for Human-Likeness. 🌟 Star if you like it!

calibration evaluation-metrics dialogue-systems evaluation-framework conversational-ai user-simulation user-proxy llm-evaluation llm-as-a-judge chatbot-arena user-proxy-evaluation clariq qulac oasst1

Updated Jan 16, 2026
Jupyter Notebook
