fix(analytics): Capture token usage and model name for Langfuse, LangSmith, and other providers (fixes #5763) #5764
Conversation
…, and other providers

What changed
------------
- handler.ts: Extended onLLMEnd() to accept string | structured output. When structured output is passed, we now extract content, usageMetadata (input/output/total tokens), and responseMetadata (model name) and forward them to all analytics providers. Added usage/model to Langfuse generation.end(), LangSmith llm_output, and token attributes for Lunary, LangWatch, Arize, Phoenix, and Opik. Call langfuse.flushAsync() after generation.end() so updates are sent before the request completes.
- LLM.ts: Pass the full output object from prepareOutputObject() to onLLMEnd instead of the finalResponse string, so usage and model are available.
- Agent.ts: Same as LLM.ts; pass the output object to onLLMEnd.
- ConditionAgent.ts: Build analyticsOutput with content, usageMetadata, and responseMetadata from the LLM response and pass it to onLLMEnd.
- handler.test.ts: Added unit tests for the extraction logic (string vs object, token field normalization, model name sources, missing fields).
- OpenAIAssistant.ts call sites unchanged (Assistants API; no usage data).

Why
---
Fixes FlowiseAI#5763. Analytics (Langfuse, LangSmith, etc.) were only receiving plain text from onLLMEnd; usage_metadata and response_metadata from AIMessage were dropped, so token counts and model names were missing in dashboards and cost tracking.

Testing
-------
- pnpm build succeeds with no TypeScript errors.
- Manual: Flowise started, Agentflow with ChatOpenAI run; LangSmith and Langfuse both show token usage and model on the LLM generation.
- Backward compatible: call sites that pass a string (e.g. OpenAIAssistant) still work; onLLMEnd treats a string as content-only.

Co-authored-by: Cursor <cursoragent@cursor.com>
Summary of Changes

Hello @TravisP-Greener, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request resolves a critical issue where analytics integrations were failing to capture essential token usage and model name information for LLM generations. By upgrading the analytics handler to accept and intelligently process structured output, and subsequently updating all relevant call sites and provider integrations, the system now ensures that platforms like Langfuse and LangSmith receive complete metadata. This enhancement significantly improves the accuracy of cost tracking and enables more detailed performance analysis for LLM operations.
Hi @HenryHengZJ - Please see this PR to address the missing token usage.
Code Review
This pull request is a great improvement for analytics integrations. It correctly modifies onLLMEnd to accept structured data, allowing token usage and model names to be passed to providers like Langfuse and LangSmith. The implementation is solid, with robust data extraction logic and comprehensive new unit tests. I have two suggestions to enhance the code's maintainability and robustness, detailed in the comments.
- LangSmith: Only include token_usage properties that have defined values to avoid passing undefined to the API
- Extract common OpenTelemetry span logic into an _endOtelSpan helper method used by the arize, phoenix, and opik providers

Co-authored-by: Cursor <cursoragent@cursor.com>
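Illustratively, the defined-value filtering amounts to something like the following minimal sketch; buildTokenUsage is a hypothetical helper name, not necessarily what handler.ts uses:

```typescript
// Build a token_usage object containing only the counts that are actually defined,
// so undefined values never reach the LangSmith API.
const buildTokenUsage = (promptTokens?: number, completionTokens?: number, totalTokens?: number) => {
    const tokenUsage: Record<string, number> = {}
    if (promptTokens !== undefined) tokenUsage.prompt_tokens = promptTokens
    if (completionTokens !== undefined) tokenUsage.completion_tokens = completionTokens
    if (totalTokens !== undefined) tokenUsage.total_tokens = totalTokens
    return tokenUsage
}
```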
- LangSmith: set usage_metadata and ls_model_name/ls_provider on run extra.metadata so LangSmith can compute costs from token counts (compatible with langsmith 0.1.6 which has no end(metadata) param). Infer ls_provider from model name. - buildAgentflow: use chatflow.name as analytics trace/run name instead of hardcoded 'Agentflow' so LangSmith and Langfuse show the Flowise flow name. Co-authored-by: Cursor <cursoragent@cursor.com>
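A rough sketch of that idea, assuming the run is a langsmith RunTree whose extra field is populated before end() is called; the helper names and provider-inference rules below are illustrative, not the exact code:

```typescript
import { RunTree } from 'langsmith'

// Hypothetical: guess the provider from the model name so LangSmith can price the tokens.
const inferProvider = (model?: string): string | undefined => {
    if (!model) return undefined
    if (model.startsWith('gpt-')) return 'openai'
    if (model.startsWith('claude')) return 'anthropic'
    if (model.startsWith('gemini')) return 'google'
    return undefined
}

// Hypothetical: attach usage and model metadata to the run before llmRun.end(...).
const attachLangSmithMetadata = (llmRun: RunTree, usageMetadata: Record<string, number>, model?: string) => {
    llmRun.extra = {
        ...llmRun.extra,
        metadata: {
            ...(llmRun.extra?.metadata ?? {}),
            usage_metadata: usageMetadata,
            ls_model_name: model,
            ls_provider: inferProvider(model)
        }
    }
}
```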


Summary
Fixes #5763. Analytics integrations (Langfuse, LangSmith, Lunary, LangWatch, Arize, Phoenix, Opik) were not receiving token usage or model name for LLM generations. Only the raw text output was sent, so dashboards showed no token counts and cost tracking could not work.
This PR updates the custom analytics handler so that when an LLM run finishes, we pass structured output (including usage and model) to all providers and ensure Langfuse receives and flushes the update before the request completes.
Problem
The AnalyticHandler.onLLMEnd() method had the signature onLLMEnd(returnIds, output: string). All call sites passed only the final text (e.g. finalResponse), so the rich metadata on the LLM response (usage_metadata, response_metadata) was never sent to analytics.

Root cause
The LLM response objects (AIMessage / AIMessageChunk) expose:

- usage_metadata with input_tokens, output_tokens, total_tokens (or prompt_tokens, completion_tokens, total_tokens).
- response_metadata with model, model_name, or modelId.

Only the plain text reached analyticHandlers.onLLMEnd(...), so providers only ever received a string (see the sketch below).
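For reference, a minimal sketch of the fields involved on the response object; the values are made up for illustration:

```typescript
import { AIMessage } from '@langchain/core/messages'

// Illustrative chat-model response; in Flowise this comes back from the model call.
const response = new AIMessage({
    content: 'Paris is the capital of France.',
    usage_metadata: { input_tokens: 12, output_tokens: 9, total_tokens: 21 },
    response_metadata: { model_name: 'gpt-4o-mini' }
})

// Before this PR, only the string content reached analytics; the two metadata
// objects above were dropped on the way to onLLMEnd.
console.log(response.content, response.usage_metadata, response.response_metadata)
```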
Solution

1. Handler: accept structured output and forward usage/model
File: packages/components/src/handler.ts

- Changed the signature to onLLMEnd(returnIds, output: string | ICommonObject) so we accept either a string (backward compatible) or an object with:
  - content (or we treat the whole value as content when it is a string).
  - usageMetadata / usage_metadata (token counts).
  - responseMetadata / response_metadata (model name).
- Normalized token counts (both input_tokens / output_tokens and prompt_tokens / completion_tokens) and the model name (from model, model_name, or modelId); a sketch of this extraction follows this list.
- LangSmith: llmRun.end({ outputs: { generations, llm_output: { token_usage, model_name } } }).
- Langfuse: generation.end({ output, model, usage: { promptTokens, completionTokens, totalTokens } }), then await langfuse.flushAsync() so the update is sent before the request ends.
- Lunary: trackEvent('llm', 'end', { output, tokensUsage, model }).
- LangWatch: span.end({ output, metrics, model }).
- Arize / Phoenix / Opik: span attributes llm.token_count.prompt, llm.token_count.completion, llm.token_count.total, and llm.model_name.
- When output is a string, we use it as content only and do not add usage/model (e.g. OpenAIAssistant and any other callers that only pass text keep working).
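A minimal sketch of what that normalization can look like; normalizeLLMOutput and the local CommonObject alias are illustrative names, not the actual identifiers in handler.ts:

```typescript
// Stand-in for the handler's generic object type (ICommonObject in the real code).
type CommonObject = Record<string, any>

// Accept either a plain string (backward compatible) or a structured output object,
// and normalize token counts and model name across LangChain/OpenAI field names.
const normalizeLLMOutput = (output: string | CommonObject) => {
    if (typeof output === 'string') {
        return { content: output, usage: undefined, model: undefined }
    }
    const usageMeta = output.usageMetadata ?? output.usage_metadata ?? {}
    const respMeta = output.responseMetadata ?? output.response_metadata ?? {}
    return {
        content: typeof output.content === 'string' ? output.content : JSON.stringify(output.content ?? ''),
        usage: {
            promptTokens: usageMeta.input_tokens ?? usageMeta.prompt_tokens,
            completionTokens: usageMeta.output_tokens ?? usageMeta.completion_tokens,
            totalTokens: usageMeta.total_tokens
        },
        model: respMeta.model ?? respMeta.model_name ?? respMeta.modelId
    }
}
```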
2. Call sites: pass structured output where metadata exists

- packages/components/nodes/agentflow/LLM/LLM.ts: Pass the full output object from prepareOutputObject() (which already sets content, usageMetadata, responseMetadata) to onLLMEnd instead of finalResponse.
- packages/components/nodes/agentflow/Agent/Agent.ts: Same: pass the full output from prepareOutputObject() to onLLMEnd.
- packages/components/nodes/agentflow/ConditionAgent/ConditionAgent.ts: Build an analyticsOutput with content, usageMetadata (from response.usage_metadata), and responseMetadata (from response.response_metadata) and pass it to onLLMEnd (see the sketch after this list).
- packages/components/nodes/agents/OpenAIAssistant/OpenAIAssistant.ts: No change. The Assistants API does not expose token usage in the same way; these call sites continue to pass a string, which remains valid with the new signature.
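As an illustration of the ConditionAgent-style call site (a hedged sketch; analyticHandlers and llmIds stand for the values already produced by onLLMStart in the real node):

```typescript
import { AIMessage } from '@langchain/core/messages'

// Shape the structured analytics payload from the model response, mirroring the change described above.
const buildAnalyticsOutput = (response: AIMessage) => ({
    content: typeof response.content === 'string' ? response.content : JSON.stringify(response.content),
    usageMetadata: response.usage_metadata,
    responseMetadata: response.response_metadata
})

// Then, instead of passing only the final text:
// await analyticHandlers.onLLMEnd(llmIds, buildAnalyticsOutput(response))
```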
3. Unit tests
File: packages/components/src/handler.test.ts

Added unit tests for the extraction logic in onLLMEnd: string vs object input, token field normalization (LangChain vs OpenAI naming), model name from response_metadata.model / model_name / modelId, and missing/partial fields.
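The flavour of those cases, sketched against the hypothetical normalizeLLMOutput helper from the handler sketch above; the real handler.test.ts may exercise the logic through AnalyticHandler instead:

```typescript
// Declaration of the hypothetical helper defined in the earlier sketch, so this compiles standalone.
declare function normalizeLLMOutput(output: string | Record<string, any>): {
    content: string
    usage?: { promptTokens?: number; completionTokens?: number; totalTokens?: number }
    model?: string
}

describe('onLLMEnd output normalization', () => {
    it('treats a plain string as content-only', () => {
        const result = normalizeLLMOutput('final answer')
        expect(result.content).toBe('final answer')
        expect(result.usage).toBeUndefined()
    })

    it('maps OpenAI-style token names and reads the model from response_metadata', () => {
        const result = normalizeLLMOutput({
            content: 'hi',
            usage_metadata: { prompt_tokens: 5, completion_tokens: 3, total_tokens: 8 },
            response_metadata: { model_name: 'gpt-4o-mini' }
        })
        expect(result.usage?.promptTokens).toBe(5)
        expect(result.model).toBe('gpt-4o-mini')
    })
})
```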
Why Langfuse needed flushAsync()

The Langfuse JS SDK queues events and flushes in the background. Without an explicit flush after generation.end(), the update (output, model, usage) could still be in the queue when the HTTP response finished, so Langfuse sometimes received only the generation start, not the end. Calling await langfuse.flushAsync() after generation.end() ensures the generation update is sent before the request completes, so traces in Langfuse show token usage and model.
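A minimal sketch of that ordering, using the Langfuse JS SDK calls named above; the keys, prompt, and token counts are illustrative:

```typescript
import { Langfuse } from 'langfuse'

const langfuse = new Langfuse({ publicKey: 'pk-...', secretKey: 'sk-...' }) // credentials illustrative
const trace = langfuse.trace({ name: 'my-chatflow' })
const generation = trace.generation({ name: 'llm', input: 'What is the capital of France?' })

// ... the model call happens here ...

generation.end({
    output: 'Paris is the capital of France.',
    model: 'gpt-4o-mini',
    usage: { promptTokens: 12, completionTokens: 9, totalTokens: 21 }
})

// Without this flush, the queued update can still be in the SDK's buffer when the HTTP response returns.
await langfuse.flushAsync()
```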
Testing

- pnpm build completes with no TypeScript errors.

Checklist

- onLLMEnd accepts both string and object
- pnpm build passes