diff --git a/AGENTS.md b/AGENTS.md index b7b2950d..77b69ef4 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -55,6 +55,7 @@ Pre-commit runs: ruff, mypy, uv-lock, codespell | `ComponentParseError` | Add examples to docstring | | `uv.lock` out of sync | Run `uv sync` | | Ollama refused | Run `ollama serve` | +| Telemetry import errors | Run `uv sync` to install OpenTelemetry deps | ## 8. Self-Review (before notifying user) 1. `uv run pytest -m "not qualitative"` passes? diff --git a/docs/dev/telemetry.md b/docs/dev/telemetry.md new file mode 100644 index 00000000..2bfdf057 --- /dev/null +++ b/docs/dev/telemetry.md @@ -0,0 +1,201 @@ +## OpenTelemetry Instrumentation in Mellea + +Mellea provides built-in OpenTelemetry instrumentation with two independent trace scopes that can be enabled separately: + +1. **Application Trace** (`mellea.application`) - Tracks user-facing operations +2. **Backend Trace** (`mellea.backend`) - Tracks LLM backend interactions + +### Configuration + +Telemetry is configured via environment variables: + +| Variable | Description | Default | +|----------|-------------|---------| +| `MELLEA_TRACE_APPLICATION` | Enable application-level tracing | `false` | +| `MELLEA_TRACE_BACKEND` | Enable backend-level tracing | `false` | +| `OTEL_EXPORTER_OTLP_ENDPOINT` | OTLP endpoint for trace export | None | +| `OTEL_SERVICE_NAME` | Service name for traces | `mellea` | +| `MELLEA_TRACE_CONSOLE` | Print traces to console (debugging) | `false` | + +### Application Trace Scope + +The application tracer (`mellea.application`) instruments: + +- **Session lifecycle**: `start_session()`, session context manager entry/exit +- **@generative functions**: Execution of functions decorated with `@generative` +- **mfuncs.aact()**: Action execution with requirements and sampling strategies +- **Sampling strategies**: Rejection sampling, budget forcing, etc. +- **Requirement validation**: Validation of requirements and constraints + +**Span attributes include:** +- `backend`: Backend class name +- `model_id`: Model identifier +- `context_type`: Context class name +- `action_type`: Component type being executed +- `has_requirements`: Whether requirements are specified +- `has_strategy`: Whether a sampling strategy is used +- `strategy_type`: Sampling strategy class name +- `num_generate_logs`: Number of generation attempts +- `sampling_success`: Whether sampling succeeded +- `response`: Model response (truncated to 500 chars) +- `response_length`: Full length of model response + +### Backend Trace Scope + +The backend tracer (`mellea.backend`) instruments: + +- **Backend.generate_from_context()**: Context-based generation +- **Backend.generate_from_raw()**: Raw generation without context +- **Backend-specific implementations**: Ollama, OpenAI, HuggingFace, Watsonx, LiteLLM + +**Span attributes include:** +- `backend`: Backend class name (e.g., `OllamaModelBackend`) +- `model_id`: Model identifier string +- `action_type`: Component type +- `context_size`: Number of items in context +- `has_format`: Whether structured output format is specified +- `format_type`: Response format class name +- `tool_calls`: Whether tool calling is enabled +- `num_actions`: Number of actions in batch (for `generate_from_raw`) + +### Usage Examples + +#### Enable Application Tracing Only + +```bash +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_BACKEND=false +python docs/examples/instruct_validate_repair/101_email.py +``` + +This traces user-facing operations like `@generative` function calls, session lifecycle, and sampling strategies, but not the underlying LLM API calls. + +#### Enable Backend Tracing Only + +```bash +export MELLEA_TRACE_APPLICATION=false +export MELLEA_TRACE_BACKEND=true +python docs/examples/instruct_validate_repair/101_email.py +``` + +This traces only the LLM backend interactions, showing model calls, token usage, and API latency. + +#### Enable Both Traces + +```bash +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_BACKEND=true +python docs/examples/instruct_validate_repair/101_email.py +``` + +This provides complete observability across both application logic and backend interactions. + +#### Export to Jaeger + +```bash +# Start Jaeger (example using Docker) +docker run -d --name jaeger \ + -p 4317:4317 \ + -p 16686:16686 \ + jaegertracing/all-in-one:latest + +# Configure Mellea to export traces +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_BACKEND=true +export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 +export OTEL_SERVICE_NAME=my-mellea-app + +python docs/examples/instruct_validate_repair/101_email.py + +# View traces at http://localhost:16686 +``` + +#### Console Output for Debugging + +```bash +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_CONSOLE=true +python docs/examples/instruct_validate_repair/101_email.py +``` + +This prints trace spans to the console, useful for local debugging without setting up a trace backend. + +### Programmatic Access + +You can check if tracing is enabled in your code: + +```python +from mellea.telemetry import ( + is_application_tracing_enabled, + is_backend_tracing_enabled, +) + +if is_application_tracing_enabled(): + print("Application tracing is enabled") + +if is_backend_tracing_enabled(): + print("Backend tracing is enabled") +``` + +### Performance Considerations + +- **Zero overhead when disabled**: When tracing is disabled (default), there is minimal performance impact +- **Async-friendly**: Tracing works seamlessly with async operations +- **Batched export**: Traces are exported in batches to minimize network overhead +- **Separate scopes**: Enable only the tracing you need to reduce overhead + +### Integration with Observability Tools + +Mellea's OpenTelemetry instrumentation works with any OTLP-compatible backend: + +- **Jaeger**: Distributed tracing +- **Zipkin**: Distributed tracing +- **Grafana Tempo**: Distributed tracing +- **Honeycomb**: Observability platform +- **Datadog**: APM and observability +- **New Relic**: APM and observability +- **AWS X-Ray**: Distributed tracing (via OTLP) +- **Google Cloud Trace**: Distributed tracing (via OTLP) + +### Example Trace Hierarchy + +When both traces are enabled, you'll see a hierarchy like: + +``` +session_context (application) +├── aact (application) +│ ├── generate_from_context (backend) +│ │ └── ollama.chat (backend) +│ └── requirement_validation (application) +├── aact (application) +│ └── generate_from_context (backend) +│ └── ollama.chat (backend) +``` + +### Troubleshooting + +**Traces not appearing:** +1. Verify environment variables are set correctly +2. Check that OTLP endpoint is reachable +3. Enable console output to verify traces are being created +4. Check firewall/network settings + +**High overhead:** +1. Disable application tracing if you only need backend metrics +2. Reduce sampling rate (future feature) +3. Use a local OTLP collector to batch exports + +**Missing spans:** +1. Ensure you're using `with start_session()` context manager +2. Check that async operations are properly awaited +3. Verify backend implementation has instrumentation + +### Future Enhancements + +Planned improvements to telemetry: + +- Sampling rate configuration +- Custom span attributes via decorators +- Metrics export (token counts, latency percentiles) +- Trace context propagation for distributed systems +- Integration with LangSmith and other LLM observability tools \ No newline at end of file diff --git a/docs/examples/telemetry/README.md b/docs/examples/telemetry/README.md new file mode 100644 index 00000000..14d7aae2 --- /dev/null +++ b/docs/examples/telemetry/README.md @@ -0,0 +1,147 @@ +# OpenTelemetry Tracing Examples + +This directory contains examples demonstrating OpenTelemetry instrumentation in Mellea. + +## Quick Start + +### 1. Install Dependencies + +```bash +uv sync --all-extras +``` + +### 2. Start Ollama (Required) + +```bash +ollama serve +``` + +### 3. Run Examples + +#### Basic Example (No Tracing) + +```bash +python telemetry_example.py +``` + +#### With Application Tracing Only + +```bash +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_BACKEND=false +python telemetry_example.py +``` + +#### With Backend Tracing Only + +```bash +export MELLEA_TRACE_APPLICATION=false +export MELLEA_TRACE_BACKEND=true +python telemetry_example.py +``` + +#### With Both Traces + +```bash +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_BACKEND=true +python telemetry_example.py +``` + +#### With Console Output (Debugging) + +```bash +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_BACKEND=true +export MELLEA_TRACE_CONSOLE=true +python telemetry_example.py +``` + +## Using Jaeger for Visualization + +### 1. Start Jaeger + +```bash +docker run -d --name jaeger \ + -p 4317:4317 \ + -p 16686:16686 \ + jaegertracing/all-in-one:latest +``` + +### 2. Configure Mellea + +```bash +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_BACKEND=true +export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 +export OTEL_SERVICE_NAME=mellea-example +``` + +### 3. Run Example + +```bash +python telemetry_example.py +``` + +### 4. View Traces + +Open http://localhost:16686 in your browser and select "mellea-example" service. + +## What Gets Traced + +### Application Trace (`mellea.application`) + +- Session lifecycle (start, enter, exit) +- @generative function calls +- Action execution (aact) +- Sampling strategies +- Requirement validation + +### Backend Trace (`mellea.backend`) + +- Model generation calls +- Context-based generation +- Raw generation +- Backend-specific operations (Ollama, OpenAI, etc.) + +## Trace Attributes + +Each span includes rich metadata: + +- **model_id**: Model identifier +- **backend**: Backend class name +- **action_type**: Component type +- **context_size**: Number of context items +- **has_requirements**: Whether requirements are specified +- **strategy_type**: Sampling strategy used +- **tool_calls**: Whether tool calling is enabled +- **format_type**: Response format class + +## Performance Impact + +- **Disabled (default)**: Near-zero overhead +- **Application only**: Minimal overhead (~1-2%) +- **Backend only**: Minimal overhead (~1-2%) +- **Both enabled**: Low overhead (~2-5%) + +## Troubleshooting + +**Traces not appearing in Jaeger:** +1. Check Jaeger is running: `docker ps | grep jaeger` +2. Verify endpoint: `curl http://localhost:4317` +3. Check environment variables are set +4. Enable console output to verify traces are created + +**Import errors:** +```bash +uv sync # Reinstall dependencies +``` + +**Ollama connection errors:** +```bash +ollama serve # Start Ollama server +``` + +## Learn More + +See [docs/dev/telemetry.md](../../dev/telemetry.md) for complete documentation. \ No newline at end of file diff --git a/docs/examples/telemetry/telemetry_example.py b/docs/examples/telemetry/telemetry_example.py new file mode 100644 index 00000000..3778da22 --- /dev/null +++ b/docs/examples/telemetry/telemetry_example.py @@ -0,0 +1,108 @@ +"""Example demonstrating OpenTelemetry tracing in Mellea. + +This example shows how to use the two independent trace scopes: +1. Application trace - tracks user-facing operations +2. Backend trace - tracks LLM backend interactions + +Run with different configurations: + +# Enable only application tracing +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_BACKEND=false +python telemetry_example.py + +# Enable only backend tracing +export MELLEA_TRACE_APPLICATION=false +export MELLEA_TRACE_BACKEND=true +python telemetry_example.py + +# Enable both traces +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_BACKEND=true +python telemetry_example.py + +# Export to OTLP endpoint (e.g., Jaeger) +export MELLEA_TRACE_APPLICATION=true +export MELLEA_TRACE_BACKEND=true +export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 +python telemetry_example.py + +# Enable console output for debugging +export MELLEA_TRACE_CONSOLE=true +python telemetry_example.py +""" + +from mellea import generative, start_session +from mellea.stdlib.requirements import req + + +@generative +def classify_sentiment(text: str) -> str: + """Classify the sentiment of the given text as positive, negative, or neutral.""" + ... + + +@generative +def extract_entities(text: str) -> list[str]: + """Extract named entities from the text.""" + ... + + +def main(): + """Run example with telemetry instrumentation.""" + print("=" * 60) + print("Mellea OpenTelemetry Example") + print("=" * 60) + + # Check which traces are enabled + from mellea.telemetry import ( + is_application_tracing_enabled, + is_backend_tracing_enabled, + ) + + print(f"Application tracing: {is_application_tracing_enabled()}") + print(f"Backend tracing: {is_backend_tracing_enabled()}") + print("=" * 60) + + # Start a session - this will be traced if application tracing is enabled + with start_session() as m: + # Example 1: Simple instruction with requirements + print("\n1. Simple instruction with requirements...") + email = m.instruct( + "Write a professional email to {{name}} about {{topic}}", + requirements=[req("Must be formal"), req("Must be under 100 words")], + user_variables={"name": "Alice", "topic": "project update"}, + ) + print(f"Generated email: {str(email)[:100]}...") + + # Example 2: Using @generative function + print("\n2. Using @generative function...") + sentiment = classify_sentiment( + m, text="I absolutely love this product! It's amazing!" + ) + print(f"Sentiment: {sentiment}") + + # Example 3: Multiple operations + print("\n3. Multiple operations...") + text = "Apple Inc. announced new products in Cupertino, California." + entities = extract_entities(m, text=text) + print(f"Entities: {entities}") + + # Example 4: Chat interaction + print("\n4. Chat interaction...") + response1 = m.chat("What is 2+2?") + print(f"Response 1: {str(response1)}") + + response2 = m.chat("Multiply that by 3") + print(f"Response 2: {str(response2)}") + + print("\n" + "=" * 60) + print("Example complete!") + print("=" * 60) + print("\nTrace data has been exported based on your configuration.") + print("If OTEL_EXPORTER_OTLP_ENDPOINT is set, check your trace backend.") + print("If MELLEA_TRACE_CONSOLE=true, traces are printed above.") + + +if __name__ == "__main__": + main() diff --git a/mellea/backends/huggingface.py b/mellea/backends/huggingface.py index 48e74543..3a4cb1ae 100644 --- a/mellea/backends/huggingface.py +++ b/mellea/backends/huggingface.py @@ -44,6 +44,10 @@ from ..helpers import message_to_openai_message, messages_to_docs, send_to_queue from ..stdlib.components import Intrinsic, Message from ..stdlib.requirements import ALoraRequirement, LLMaJRequirement +from ..telemetry.backend_instrumentation import ( + instrument_generate_from_context, + instrument_generate_from_raw, +) from .adapters import ( AdapterMixin, AdapterType, @@ -202,61 +206,68 @@ async def generate_from_context( tool_calls: bool = False, ) -> tuple[ModelOutputThunk[C], Context]: """Generate using the huggingface model.""" - await self.do_generate_walk(action) + with instrument_generate_from_context( + backend=self, action=action, ctx=ctx, format=format, tool_calls=tool_calls + ): + await self.do_generate_walk(action) - # Upsert model options. - model_opts = self._simplify_and_merge(model_options) + # Upsert model options. + model_opts = self._simplify_and_merge(model_options) - # Requirements can be automatically rerouted to a requirement adapter. - if isinstance(action, Requirement): - # See docs/dev/requirement_aLoRA_rerouting.md - reroute_to_alora = self.default_to_constraint_checking_alora - adapter_name = "requirement_check" + # Requirements can be automatically rerouted to a requirement adapter. + if isinstance(action, Requirement): + # See docs/dev/requirement_aLoRA_rerouting.md + reroute_to_alora = self.default_to_constraint_checking_alora + adapter_name = "requirement_check" - if isinstance(action, ALoraRequirement): - reroute_to_alora = True - adapter_name = action.intrinsic_name - alora_action = action - else: - assert action.description is not None, ( - "must have a description when generating from a requirement" + if isinstance(action, ALoraRequirement): + reroute_to_alora = True + adapter_name = action.intrinsic_name + alora_action = action + else: + assert action.description is not None, ( + "must have a description when generating from a requirement" + ) + alora_action = ALoraRequirement(action.description, adapter_name) + + # Check if a requirement_check (or AloraRequirement specified) adapter + # exists. + alora_req_adapter = get_adapter_for_intrinsic( + adapter_name, [AdapterType.ALORA], self._added_adapters ) - alora_action = ALoraRequirement(action.description, adapter_name) + if alora_req_adapter is None: + # Log a warning if using an AloraRequirement but no adapter fit. + if reroute_to_alora and isinstance(action, ALoraRequirement): + FancyLogger.get_logger().warning( + f"attempted to use an AloraRequirement but backend {self} doesn't have the specified adapter added {adapter_name}; defaulting to regular generation" + ) + reroute_to_alora = False - # Check if a requirement_check (or AloraRequirement specified) adapter - # exists. - alora_req_adapter = get_adapter_for_intrinsic( - adapter_name, [AdapterType.ALORA], self._added_adapters - ) - if alora_req_adapter is None: - # Log a warning if using an AloraRequirement but no adapter fit. - if reroute_to_alora and isinstance(action, ALoraRequirement): - FancyLogger.get_logger().warning( - f"attempted to use an AloraRequirement but backend {self} doesn't have the specified adapter added {adapter_name}; defaulting to regular generation" - ) - reroute_to_alora = False + if issubclass(type(action), LLMaJRequirement): + reroute_to_alora = False - if issubclass(type(action), LLMaJRequirement): - reroute_to_alora = False + if reroute_to_alora: + # Keep the alora requirement handling separate for now. + mot = await self._generate_from_intrinsic( + alora_action, ctx, model_options=model_opts + ) + return mot, ctx.add(alora_action).add(mot) - if reroute_to_alora: - # Keep the alora requirement handling separate for now. + elif isinstance(action, Intrinsic): mot = await self._generate_from_intrinsic( - alora_action, ctx, model_options=model_opts + action, ctx, model_options=model_opts ) - return mot, ctx.add(alora_action).add(mot) + return mot, ctx.add(action).add(mot) - elif isinstance(action, Intrinsic): - mot = await self._generate_from_intrinsic( - action, ctx, model_options=model_opts + mot = await self._generate_from_context_standard( + action, + ctx, + _format=format, + model_options=model_opts, + tool_calls=tool_calls, ) return mot, ctx.add(action).add(mot) - mot = await self._generate_from_context_standard( - action, ctx, _format=format, model_options=model_opts, tool_calls=tool_calls - ) - return mot, ctx.add(action).add(mot) - def _generate_with_adapter_lock( self, adapter_name: str, generate_func: Callable, *args, **kwargs ): @@ -972,67 +983,70 @@ async def generate_from_raw( tool_calls: bool = False, ) -> list[ModelOutputThunk]: """Generate using the completions api. Gives the input provided to the model without templating.""" - await self.do_generate_walks(list(actions)) - - if tool_calls: - FancyLogger.get_logger().warning( - "The raw endpoint does not support tool calling at the moment." - ) + with instrument_generate_from_raw( + backend=self, num_actions=len(actions), format=format, tool_calls=tool_calls + ): + await self.do_generate_walks(list(actions)) - if self._model.device.type == "mps": - # TODO: Remove this when we are able to update the torch package. - # Test this by ensuring all outputs from this call are populated when running on mps. - # https://github.com/pytorch/pytorch/pull/157727 - FancyLogger.get_logger().warning( - "utilizing device mps with a `generate_from_raw` request; you may see issues when submitting batches of prompts to a huggingface backend; ensure all ModelOutputThunks have non-empty values." - ) + if tool_calls: + FancyLogger.get_logger().warning( + "The raw endpoint does not support tool calling at the moment." + ) - model_opts = self._simplify_and_merge(model_options) - seed = model_opts.get(ModelOption.SEED, None) - if seed is not None: - set_seed(seed) + if self._model.device.type == "mps": + # TODO: Remove this when we are able to update the torch package. + # Test this by ensuring all outputs from this call are populated when running on mps. + # https://github.com/pytorch/pytorch/pull/157727 + FancyLogger.get_logger().warning( + "utilizing device mps with a `generate_from_raw` request; you may see issues when submitting batches of prompts to a huggingface backend; ensure all ModelOutputThunks have non-empty values." + ) - prompts = [self.formatter.print(action) for action in actions] + model_opts = self._simplify_and_merge(model_options) + seed = model_opts.get(ModelOption.SEED, None) + if seed is not None: + set_seed(seed) - # batch-encoding call is deprecated in favor of this - inputs = self._tokenizer(prompts, return_tensors="pt", padding=True).to( - self._device - ) + prompts = [self.formatter.print(action) for action in actions] - format_kwargs = {} - if format: - # outlines.generate.json always parses the resulting json into a python dict. - # We however want to keep it as a json string for later storing it in ModelOutputThunk - schema: dict[str, Any] = format.model_json_schema() # type: ignore - schema_json: str = json.dumps(schema) - regex_str: str = outlines_core.fsm.json_schema.build_regex_from_schema( # type: ignore - schema_json + # batch-encoding call is deprecated in favor of this + inputs = self._tokenizer(prompts, return_tensors="pt", padding=True).to( + self._device ) - from outlines.models.transformers import TransformerTokenizer - from outlines.processors.structured import RegexLogitsProcessor - from transformers import LogitsProcessorList # type: ignore + format_kwargs = {} + if format: + # outlines.generate.json always parses the resulting json into a python dict. + # We however want to keep it as a json string for later storing it in ModelOutputThunk + schema: dict[str, Any] = format.model_json_schema() # type: ignore + schema_json: str = json.dumps(schema) + regex_str: str = outlines_core.fsm.json_schema.build_regex_from_schema( # type: ignore + schema_json + ) - format_kwargs["logits_processor"] = LogitsProcessorList( - [ - RegexLogitsProcessor( - regex_str, tokenizer=TransformerTokenizer(self._tokenizer) - ) - ] - ) + from outlines.models.transformers import TransformerTokenizer + from outlines.processors.structured import RegexLogitsProcessor + from transformers import LogitsProcessorList # type: ignore - outputs = await asyncio.to_thread( - self._generate_with_adapter_lock, - "", # Empty for no adapter. - self._model.generate, # type: ignore - # Passed as args/kwargs to generate. - input_ids=inputs["input_ids"], - attention_mask=inputs["attention_mask"], - return_dict_in_generate=True, - output_scores=True, - **self._make_backend_specific_and_remove(model_opts), - **format_kwargs, - ) + format_kwargs["logits_processor"] = LogitsProcessorList( + [ + RegexLogitsProcessor( + regex_str, tokenizer=TransformerTokenizer(self._tokenizer) + ) + ] + ) + + outputs = await asyncio.to_thread( + self._generate_with_adapter_lock, + "", # Empty for no adapter. + self._model.generate, # type: ignore + # Passed as args/kwargs to generate. + input_ids=inputs["input_ids"], + attention_mask=inputs["attention_mask"], + return_dict_in_generate=True, + output_scores=True, + **self._make_backend_specific_and_remove(model_opts), + **format_kwargs, + ) sequences_to_decode = [ sequence[inputs["input_ids"][i].size(0) :] # type: ignore diff --git a/mellea/backends/litellm.py b/mellea/backends/litellm.py index 4dc82da8..d43c4545 100644 --- a/mellea/backends/litellm.py +++ b/mellea/backends/litellm.py @@ -34,6 +34,10 @@ ) from ..stdlib.components import Message from ..stdlib.requirements import ALoraRequirement +from ..telemetry.backend_instrumentation import ( + instrument_generate_from_context, + instrument_generate_from_raw, +) from .backend import FormatterBackend from .model_options import ModelOption from .tools import ( @@ -125,14 +129,17 @@ async def generate_from_context( assert ctx.is_chat_context, NotImplementedError( "The Openai backend only supports chat-like contexts." ) - mot = await self._generate_from_chat_context_standard( - action, - ctx, - _format=format, - model_options=model_options, - tool_calls=tool_calls, - ) - return mot, ctx.add(action).add(mot) + with instrument_generate_from_context( + backend=self, action=action, ctx=ctx, format=format, tool_calls=tool_calls + ): + mot = await self._generate_from_chat_context_standard( + action, + ctx, + _format=format, + model_options=model_options, + tool_calls=tool_calls, + ) + return mot, ctx.add(action).add(mot) def _simplify_and_merge( self, model_options: dict[str, Any] | None @@ -506,35 +513,38 @@ async def generate_from_raw( tool_calls: bool = False, ) -> list[ModelOutputThunk]: """Generate using the completions api. Gives the input provided to the model without templating.""" - await self.do_generate_walks(list(actions)) - extra_body = {} - if format is not None: - FancyLogger.get_logger().warning( - "The official OpenAI completion api does not accept response format / structured decoding; " - "it will be passed as an extra arg." - ) + with instrument_generate_from_raw( + backend=self, num_actions=len(actions), format=format, tool_calls=tool_calls + ): + await self.do_generate_walks(list(actions)) + extra_body = {} + if format is not None: + FancyLogger.get_logger().warning( + "The official OpenAI completion api does not accept response format / structured decoding; " + "it will be passed as an extra arg." + ) - # Some versions (like vllm's version) of the OpenAI API support structured decoding for completions requests. - extra_body["guided_json"] = format.model_json_schema() # type: ignore - if tool_calls: - FancyLogger.get_logger().warning( - "The completion endpoint does not support tool calling." - ) + # Some versions (like vllm's version) of the OpenAI API support structured decoding for completions requests. + extra_body["guided_json"] = format.model_json_schema() # type: ignore + if tool_calls: + FancyLogger.get_logger().warning( + "The completion endpoint does not support tool calling." + ) - # We don't do anything fancy for model_opts with generate from raw; litellm has too many potential options depending on provider. - model_opts = self._simplify_and_merge(model_options) - model_specific_options = self._make_backend_specific_and_remove(model_opts) + # We don't do anything fancy for model_opts with generate from raw; litellm has too many potential options depending on provider. + model_opts = self._simplify_and_merge(model_options) + model_specific_options = self._make_backend_specific_and_remove(model_opts) - if self._has_potential_event_loop_errors(): - FancyLogger().get_logger().warning( - "There is a known bug with litellm. This generation call may fail. If it does, you should ensure that you are either running only synchronous Mellea functions or running async Mellea functions from one asyncio.run() call." - ) + if self._has_potential_event_loop_errors(): + FancyLogger().get_logger().warning( + "There is a known bug with litellm. This generation call may fail. If it does, you should ensure that you are either running only synchronous Mellea functions or running async Mellea functions from one asyncio.run() call." + ) - prompts = [self.formatter.print(action) for action in actions] + prompts = [self.formatter.print(action) for action in actions] - completion_response = await litellm.atext_completion( - model=self._model_id, prompt=prompts, **model_specific_options - ) + completion_response = await litellm.atext_completion( + model=self._model_id, prompt=prompts, **model_specific_options + ) # Necessary for type checker. assert isinstance(completion_response, litellm.TextCompletionResponse) # type: ignore diff --git a/mellea/backends/ollama.py b/mellea/backends/ollama.py index 1c0c9a20..0b957d88 100644 --- a/mellea/backends/ollama.py +++ b/mellea/backends/ollama.py @@ -26,6 +26,10 @@ from ..helpers import ClientCache, get_current_event_loop, send_to_queue from ..stdlib.components import Message from ..stdlib.requirements import ALoraRequirement +from ..telemetry.backend_instrumentation import ( + instrument_generate_from_context, + instrument_generate_from_raw, +) from .backend import FormatterBackend from .model_options import ModelOption from .tools import add_tools_from_context_actions, add_tools_from_model_options @@ -255,18 +259,19 @@ async def generate_from_context( tool_calls: bool = False, ) -> tuple[ModelOutputThunk[C], Context]: """See `generate_from_chat_context`.""" - assert ctx.is_chat_context, ( - "The ollama backend only supports chat-like contexts." - ) - mot = await self.generate_from_chat_context( - action, - ctx, - _format=format, - model_options=model_options, - tool_calls=tool_calls, - ) + with instrument_generate_from_context(self, action, ctx, format, tool_calls): + assert ctx.is_chat_context, ( + "The ollama backend only supports chat-like contexts." + ) + mot = await self.generate_from_chat_context( + action, + ctx, + _format=format, + model_options=model_options, + tool_calls=tool_calls, + ) - return mot, ctx.add(action).add(mot) + return mot, ctx.add(action).add(mot) async def generate_from_chat_context( self, diff --git a/mellea/backends/openai.py b/mellea/backends/openai.py index d7eb284d..0715bb6b 100644 --- a/mellea/backends/openai.py +++ b/mellea/backends/openai.py @@ -42,6 +42,10 @@ ) from ..stdlib.components import Intrinsic, Message from ..stdlib.requirements import ALoraRequirement, LLMaJRequirement +from ..telemetry.backend_instrumentation import ( + instrument_generate_from_context, + instrument_generate_from_raw, +) from .adapters import ( AdapterMixin, AdapterType, @@ -307,13 +311,16 @@ async def generate_from_context( assert ctx.is_chat_context, NotImplementedError( "The Openai backend only supports chat-like contexts." ) - return await self.generate_from_chat_context( - action, - ctx, - _format=format, - model_options=model_options, - tool_calls=tool_calls, - ) + with instrument_generate_from_context( + backend=self, action=action, ctx=ctx, format=format, tool_calls=tool_calls + ): + return await self.generate_from_chat_context( + action, + ctx, + _format=format, + model_options=model_options, + tool_calls=tool_calls, + ) async def generate_from_chat_context( self, @@ -815,24 +822,27 @@ async def generate_from_raw( prompts = [self.formatter.print(action) for action in actions] - try: - completion_response: Completion = ( - await self._async_client.completions.create( - model=self._model_id, - prompt=prompts, - extra_body=extra_body, - **self._make_backend_specific_and_remove( - model_opts, is_chat_context=False - ), - ) - ) # type: ignore - except openai.BadRequestError as e: - if openai_ollama_batching_error in e.message: - FancyLogger.get_logger().error( - "If you are trying to call `OpenAIBackend._generate_from_raw while targeting an ollama server, " - "your requests will fail since ollama doesn't support batching requests." - ) - raise e + with instrument_generate_from_raw( + backend=self, num_actions=len(actions), format=format, tool_calls=tool_calls + ): + try: + completion_response: Completion = ( + await self._async_client.completions.create( + model=self._model_id, + prompt=prompts, + extra_body=extra_body, + **self._make_backend_specific_and_remove( + model_opts, is_chat_context=False + ), + ) + ) # type: ignore + except openai.BadRequestError as e: + if openai_ollama_batching_error in e.message: + FancyLogger.get_logger().error( + "If you are trying to call `OpenAIBackend._generate_from_raw while targeting an ollama server, " + "your requests will fail since ollama doesn't support batching requests." + ) + raise e # Necessary for type checker. assert isinstance(completion_response, Completion) diff --git a/mellea/backends/watsonx.py b/mellea/backends/watsonx.py index 58004c4c..fae43408 100644 --- a/mellea/backends/watsonx.py +++ b/mellea/backends/watsonx.py @@ -37,6 +37,10 @@ ) from ..stdlib.components import Message from ..stdlib.requirements import ALoraRequirement +from ..telemetry.backend_instrumentation import ( + instrument_generate_from_context, + instrument_generate_from_raw, +) from .backend import FormatterBackend from .model_options import ModelOption from .tools import ( @@ -249,14 +253,17 @@ async def generate_from_context( assert ctx.is_chat_context, NotImplementedError( "The watsonx.ai backend only supports chat-like contexts." ) - mot = await self.generate_from_chat_context( - action, - ctx, - _format=format, - model_options=model_options, - tool_calls=tool_calls, - ) - return mot, ctx.add(action).add(mot) + with instrument_generate_from_context( + backend=self, action=action, ctx=ctx, format=format, tool_calls=tool_calls + ): + mot = await self.generate_from_chat_context( + action, + ctx, + _format=format, + model_options=model_options, + tool_calls=tool_calls, + ) + return mot, ctx.add(action).add(mot) async def generate_from_chat_context( self, @@ -512,24 +519,27 @@ async def generate_from_raw( tool_calls: bool = False, ) -> list[ModelOutputThunk]: """Generates a completion text. Gives the input provided to the model without templating.""" - await self.do_generate_walks(list(actions)) + with instrument_generate_from_raw( + backend=self, num_actions=len(actions), format=format, tool_calls=tool_calls + ): + await self.do_generate_walks(list(actions)) - if format is not None: - FancyLogger.get_logger().warning( - "WatsonxAI completion api does not accept response format, ignoring it for this request." - ) + if format is not None: + FancyLogger.get_logger().warning( + "WatsonxAI completion api does not accept response format, ignoring it for this request." + ) - model_opts = self._simplify_and_merge(model_options, is_chat_context=False) + model_opts = self._simplify_and_merge(model_options, is_chat_context=False) - prompts = [self.formatter.print(action) for action in actions] + prompts = [self.formatter.print(action) for action in actions] - responses = await asyncio.to_thread( - self._model.generate, - prompt=prompts, - params=self._make_backend_specific_and_remove( - model_opts, is_chat_context=False - ), - ) + responses = await asyncio.to_thread( + self._model.generate, + prompt=prompts, + params=self._make_backend_specific_and_remove( + model_opts, is_chat_context=False + ), + ) results = [] date = datetime.datetime.now() diff --git a/mellea/stdlib/functional.py b/mellea/stdlib/functional.py index 41263f42..d87f1da4 100644 --- a/mellea/stdlib/functional.py +++ b/mellea/stdlib/functional.py @@ -26,6 +26,7 @@ ValidationResult, ) from ..helpers import _run_async_in_thread +from ..telemetry import set_span_attribute, trace_application from .components import Instruction, Message, MObjectProtocol, ToolMessage, mify from .context import SimpleContext from .sampling import RejectionSamplingStrategy @@ -475,75 +476,111 @@ async def aact( Returns: A (ModelOutputThunk, Context) if `return_sampling_results` is `False`, else returns a `SamplingResult`. """ - if not silence_context_type_warning and not isinstance(context, SimpleContext): - FancyLogger().get_logger().warning( - "Not using a SimpleContext with asynchronous requests could cause unexpected results due to stale contexts. Ensure you await between requests." - "\nSee the async section of the tutorial: https://github.com/generative-computing/mellea/blob/main/docs/tutorial.md#chapter-12-asynchronicity" - ) - - sampling_result: SamplingResult | None = None - generate_logs: list[GenerateLog] = [] + with trace_application( + "aact", + action_type=action.__class__.__name__, + has_requirements=requirements is not None and len(requirements) > 0, + has_strategy=strategy is not None, + strategy_type=strategy.__class__.__name__ if strategy else None, + has_format=format is not None, + tool_calls=tool_calls, + ) as span: + if not silence_context_type_warning and not isinstance(context, SimpleContext): + FancyLogger().get_logger().warning( + "Not using a SimpleContext with asynchronous requests could cause unexpected results due to stale contexts. Ensure you await between requests." + "\nSee the async section of the tutorial: https://github.com/generative-computing/mellea/blob/main/docs/tutorial.md#chapter-12-asynchronicity" + ) - if return_sampling_results: - assert strategy is not None, ( - "Must provide a SamplingStrategy when return_sampling_results==True" - ) + sampling_result: SamplingResult | None = None + generate_logs: list[GenerateLog] = [] - if strategy is None: - # Only use the strategy if one is provided. Add a warning if requirements were passed in though. - if requirements is not None and len(requirements) > 0: - FancyLogger.get_logger().warning( - "Calling the function with NO strategy BUT requirements. No requirement is being checked!" + if return_sampling_results: + assert strategy is not None, ( + "Must provide a SamplingStrategy when return_sampling_results==True" ) - result, new_ctx = await backend.generate_from_context( - action, - ctx=context, - format=format, - model_options=model_options, - tool_calls=tool_calls, - ) - await result.avalue() + if strategy is None: + # Only use the strategy if one is provided. Add a warning if requirements were passed in though. + if requirements is not None and len(requirements) > 0: + FancyLogger.get_logger().warning( + "Calling the function with NO strategy BUT requirements. No requirement is being checked!" + ) - # ._generate_log should never be None after generation. - assert result._generate_log is not None - result._generate_log.is_final_result = True - generate_logs.append(result._generate_log) + result, new_ctx = await backend.generate_from_context( + action, + ctx=context, + format=format, + model_options=model_options, + tool_calls=tool_calls, + ) + await result.avalue() - else: - # Always sample if a strategy is provided, even if no requirements were provided. - # Some sampling strategies don't use requirements or set them when instantiated. + # ._generate_log should never be None after generation. + assert result._generate_log is not None + result._generate_log.is_final_result = True + generate_logs.append(result._generate_log) - sampling_result = await strategy.sample( - action, - context=context, - backend=backend, - requirements=requirements, - validation_ctx=None, - format=format, - model_options=model_options, - tool_calls=tool_calls, - ) + else: + # Always sample if a strategy is provided, even if no requirements were provided. + # Some sampling strategies don't use requirements or set them when instantiated. + + sampling_result = await strategy.sample( + action, + context=context, + backend=backend, + requirements=requirements, + validation_ctx=None, + format=format, + model_options=model_options, + tool_calls=tool_calls, + ) - assert sampling_result.sample_generations is not None - for result in sampling_result.sample_generations: - assert result._generate_log is not None # Cannot be None after generation. - generate_logs.append(result._generate_log) + assert sampling_result.sample_generations is not None + for result in sampling_result.sample_generations: + assert ( + result._generate_log is not None + ) # Cannot be None after generation. + generate_logs.append(result._generate_log) - new_ctx = sampling_result.result_ctx - result = sampling_result.result - assert sampling_result.result._generate_log is not None - assert sampling_result.result._generate_log.is_final_result, ( - "generate logs from the final result returned by the sampling strategy must be marked as final" - ) + new_ctx = sampling_result.result_ctx + result = sampling_result.result + assert sampling_result.result._generate_log is not None + assert sampling_result.result._generate_log.is_final_result, ( + "generate logs from the final result returned by the sampling strategy must be marked as final" + ) - if return_sampling_results: - assert ( - sampling_result is not None - ) # Needed for the type checker but should never happen. - return sampling_result - else: - return result, new_ctx + # Add span attributes for the result + set_span_attribute(span, "num_generate_logs", len(generate_logs)) + if sampling_result: + set_span_attribute(span, "sampling_success", bool(sampling_result.result)) + + # Log the model response (truncated for large responses) + try: + response_value = ( + str(result.value) + if hasattr(result, "value") and result.value + else str(result) + ) + # Truncate to 500 chars to avoid overwhelming trace storage + if len(response_value) > 500: + response_value = response_value[:500] + "..." + set_span_attribute(span, "response", response_value) + set_span_attribute( + span, + "response_length", + len(str(result.value) if hasattr(result, "value") else str(result)), + ) + except Exception: + # If we can't get the response, don't fail the trace + pass + + if return_sampling_results: + assert ( + sampling_result is not None + ) # Needed for the type checker but should never happen. + return sampling_result + else: + return result, new_ctx @overload diff --git a/mellea/stdlib/session.py b/mellea/stdlib/session.py index 0c1df16f..5fd70590 100644 --- a/mellea/stdlib/session.py +++ b/mellea/stdlib/session.py @@ -27,6 +27,7 @@ ValidationResult, ) from ..stdlib import functional as mfuncs +from ..telemetry import set_span_attribute, trace_application from .components import Message from .context import SimpleContext from .sampling import RejectionSamplingStrategy @@ -136,20 +137,8 @@ def start_session( """ logger = FancyLogger.get_logger() - backend_class = backend_name_to_class(backend_name) - if backend_class is None: - raise Exception( - f"Backend name {backend_name} unknown. Please see the docstring for `mellea.stdlib.session.start_session` for a list of options." - ) - assert backend_class is not None - backend = backend_class(model_id, model_options=model_options, **backend_kwargs) - - if ctx is None: - ctx = SimpleContext() - - # Log session configuration + # Get model_id string for logging and tracing if isinstance(model_id, ModelIdentifier): - # Get the backend-specific model name backend_to_attr = { "ollama": "ollama_name", "hf": "hf_model_name", @@ -163,14 +152,32 @@ def start_session( getattr(model_id, attr, None) or model_id.hf_model_name or str(model_id) ) else: - model_id_str = model_id - logger.info( - f"Starting Mellea session: backend={backend_name}, model={model_id_str}, " - f"context={ctx.__class__.__name__}" - + (f", model_options={model_options}" if model_options else "") - ) + model_id_str = str(model_id) + + with trace_application( + "start_session", + backend=backend_name, + model_id=model_id_str, + context_type=ctx.__class__.__name__ if ctx else "SimpleContext", + ): + backend_class = backend_name_to_class(backend_name) + if backend_class is None: + raise Exception( + f"Backend name {backend_name} unknown. Please see the docstring for `mellea.stdlib.session.start_session` for a list of options." + ) + assert backend_class is not None + backend = backend_class(model_id, model_options=model_options, **backend_kwargs) + + if ctx is None: + ctx = SimpleContext() + + logger.info( + f"Starting Mellea session: backend={backend_name}, model={model_id_str}, " + f"context={ctx.__class__.__name__}" + + (f", model_options={model_options}" if model_options else "") + ) - return MelleaSession(backend, ctx) + return MelleaSession(backend, ctx) class MelleaSession: @@ -200,9 +207,16 @@ def __init__(self, backend: Backend, ctx: Context | None = None): self.ctx: Context = ctx if ctx is not None else SimpleContext() self._session_logger = FancyLogger.get_logger() self._context_token = None + self._session_span = None def __enter__(self): """Enter context manager and set this session as the current global session.""" + # Start a session span that will last for the entire context manager lifetime + self._session_span = trace_application( + "session_context", + backend=self.backend.__class__.__name__, + context_type=self.ctx.__class__.__name__, + ).__enter__() self._context_token = _context_session.set(self) return self @@ -212,6 +226,9 @@ def __exit__(self, exc_type, exc_val, exc_tb): if self._context_token is not None: _context_session.reset(self._context_token) self._context_token = None + if self._session_span is not None: + self._session_span.__exit__(exc_type, exc_val, exc_tb) + self._session_span = None def __copy__(self): """Use self.clone. Copies the current session but keeps references to the backend and context.""" diff --git a/mellea/telemetry/__init__.py b/mellea/telemetry/__init__.py new file mode 100644 index 00000000..89728bf7 --- /dev/null +++ b/mellea/telemetry/__init__.py @@ -0,0 +1,161 @@ +"""OpenTelemetry instrumentation for Mellea. + +This module provides two independent trace scopes: +1. Application Trace (mellea.application) - User-facing operations +2. Backend Trace (mellea.backend) - LLM backend interactions + +Configuration via environment variables: +- MELLEA_TRACE_APPLICATION: Enable/disable application tracing (default: false) +- MELLEA_TRACE_BACKEND: Enable/disable backend tracing (default: false) +- OTEL_EXPORTER_OTLP_ENDPOINT: OTLP endpoint for trace export +- OTEL_SERVICE_NAME: Service name for traces (default: mellea) +""" + +import os +from contextlib import contextmanager +from typing import Any + +from opentelemetry import trace +from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter +from opentelemetry.sdk.resources import Resource +from opentelemetry.sdk.trace import TracerProvider +from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter + +# Configuration from environment variables +_TRACE_APPLICATION_ENABLED = os.getenv("MELLEA_TRACE_APPLICATION", "false").lower() in ( + "true", + "1", + "yes", +) +_TRACE_BACKEND_ENABLED = os.getenv("MELLEA_TRACE_BACKEND", "false").lower() in ( + "true", + "1", + "yes", +) +_OTLP_ENDPOINT = os.getenv("OTEL_EXPORTER_OTLP_ENDPOINT") +_SERVICE_NAME = os.getenv("OTEL_SERVICE_NAME", "mellea") +_CONSOLE_EXPORT = os.getenv("MELLEA_TRACE_CONSOLE", "false").lower() in ( + "true", + "1", + "yes", +) + + +def _setup_tracer_provider() -> TracerProvider: + """Set up the global tracer provider with OTLP exporter if configured.""" + resource = Resource.create({"service.name": _SERVICE_NAME}) + provider = TracerProvider(resource=resource) + + # Add OTLP exporter if endpoint is configured + if _OTLP_ENDPOINT: + otlp_exporter = OTLPSpanExporter(endpoint=_OTLP_ENDPOINT) + provider.add_span_processor(BatchSpanProcessor(otlp_exporter)) + + # Add console exporter for debugging if enabled + # Note: Console exporter may cause harmless errors during test cleanup + if _CONSOLE_EXPORT: + try: + console_exporter = ConsoleSpanExporter() + provider.add_span_processor(BatchSpanProcessor(console_exporter)) + except Exception: + # Silently ignore console exporter setup failures + pass + + trace.set_tracer_provider(provider) + return provider + + +# Initialize tracer provider if any tracing is enabled +_tracer_provider = None +if _TRACE_APPLICATION_ENABLED or _TRACE_BACKEND_ENABLED: + _tracer_provider = _setup_tracer_provider() + +# Create separate tracers for application and backend +_application_tracer = trace.get_tracer("mellea.application", "0.3.0") +_backend_tracer = trace.get_tracer("mellea.backend", "0.3.0") + + +def is_application_tracing_enabled() -> bool: + """Check if application tracing is enabled.""" + return _TRACE_APPLICATION_ENABLED + + +def is_backend_tracing_enabled() -> bool: + """Check if backend tracing is enabled.""" + return _TRACE_BACKEND_ENABLED + + +@contextmanager +def trace_application(name: str, **attributes: Any): + """Create an application trace span if application tracing is enabled. + + Args: + name: Name of the span + **attributes: Additional attributes to add to the span + + Yields: + The span object if tracing is enabled, otherwise a no-op context manager + """ + if _TRACE_APPLICATION_ENABLED: + with _application_tracer.start_as_current_span(name) as span: + for key, value in attributes.items(): + if value is not None: + span.set_attribute(key, str(value)) + yield span + else: + yield None + + +@contextmanager +def trace_backend(name: str, **attributes: Any): + """Create a backend trace span if backend tracing is enabled. + + Args: + name: Name of the span + **attributes: Additional attributes to add to the span + + Yields: + The span object if tracing is enabled, otherwise a no-op context manager + """ + if _TRACE_BACKEND_ENABLED: + with _backend_tracer.start_as_current_span(name) as span: + for key, value in attributes.items(): + if value is not None: + span.set_attribute(key, str(value)) + yield span + else: + yield None + + +def set_span_attribute(span: Any, key: str, value: Any) -> None: + """Set an attribute on a span if the span is not None. + + Args: + span: The span object (may be None if tracing is disabled) + key: Attribute key + value: Attribute value + """ + if span is not None and value is not None: + span.set_attribute(key, str(value)) + + +def set_span_error(span: Any, exception: Exception) -> None: + """Record an exception on a span if the span is not None. + + Args: + span: The span object (may be None if tracing is disabled) + exception: The exception to record + """ + if span is not None: + span.record_exception(exception) + span.set_status(trace.Status(trace.StatusCode.ERROR, str(exception))) + + +__all__ = [ + "is_application_tracing_enabled", + "is_backend_tracing_enabled", + "set_span_attribute", + "set_span_error", + "trace_application", + "trace_backend", +] diff --git a/mellea/telemetry/backend_instrumentation.py b/mellea/telemetry/backend_instrumentation.py new file mode 100644 index 00000000..94af07c1 --- /dev/null +++ b/mellea/telemetry/backend_instrumentation.py @@ -0,0 +1,101 @@ +"""Backend instrumentation helpers for OpenTelemetry tracing.""" + +from typing import Any + +from ..telemetry import set_span_attribute, trace_backend + + +def get_model_id_str(backend: Any) -> str: + """Extract model_id string from a backend instance. + + Args: + backend: Backend instance + + Returns: + String representation of the model_id + """ + if hasattr(backend, "model_id"): + model_id = backend.model_id + if hasattr(model_id, "hf_model_name"): + return str(model_id.hf_model_name) + return str(model_id) + return backend.__class__.__name__ + + +def get_context_size(ctx: Any) -> int: + """Get the size of a context. + + Args: + ctx: Context object + + Returns: + Number of items in context, or 0 if cannot be determined + """ + try: + if hasattr(ctx, "__len__"): + return len(ctx) + if hasattr(ctx, "turns") and hasattr(ctx.turns, "__len__"): + return len(ctx.turns) + except Exception: + pass + return 0 + + +def instrument_generate_from_context( + backend: Any, action: Any, ctx: Any, format: Any = None, tool_calls: bool = False +): + """Create a backend trace span for generate_from_context. + + Args: + backend: Backend instance + action: Action component + ctx: Context + format: Response format (BaseModel subclass or None) + tool_calls: Whether tool calling is enabled + + Returns: + Context manager for the trace span + """ + return trace_backend( + "generate_from_context", + backend=backend.__class__.__name__, + model_id=get_model_id_str(backend), + action_type=action.__class__.__name__, + context_size=get_context_size(ctx), + has_format=format is not None, + format_type=format.__name__ if format else None, + tool_calls=tool_calls, + ) + + +def instrument_generate_from_raw( + backend: Any, num_actions: int, format: Any = None, tool_calls: bool = False +): + """Create a backend trace span for generate_from_raw. + + Args: + backend: Backend instance + num_actions: Number of actions in the batch + format: Response format (BaseModel subclass or None) + tool_calls: Whether tool calling is enabled + + Returns: + Context manager for the trace span + """ + return trace_backend( + "generate_from_raw", + backend=backend.__class__.__name__, + model_id=get_model_id_str(backend), + num_actions=num_actions, + has_format=format is not None, + format_type=format.__name__ if format else None, + tool_calls=tool_calls, + ) + + +__all__ = [ + "get_context_size", + "get_model_id_str", + "instrument_generate_from_context", + "instrument_generate_from_raw", +] diff --git a/pyproject.toml b/pyproject.toml index 3e71a837..bfc7536e 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -46,6 +46,10 @@ dependencies = [ "math_verify", # Needed for Majority Voting Sampling Strategies. "rouge_score", # Needed for Majority Voting Sampling Strategies. "llm-sandbox[docker]>=0.3.23", + "opentelemetry-api>=1.20.0", + "opentelemetry-sdk>=1.20.0", + "opentelemetry-exporter-otlp>=1.20.0", + "opentelemetry-distro>=0.59b0", ] [project.scripts] diff --git a/test/test_telemetry.py b/test/test_telemetry.py new file mode 100644 index 00000000..8d85820b --- /dev/null +++ b/test/test_telemetry.py @@ -0,0 +1,219 @@ +"""Tests for OpenTelemetry instrumentation.""" + +import os + +import pytest + + +@pytest.fixture +def enable_app_tracing(monkeypatch): + """Enable application tracing for tests.""" + monkeypatch.setenv("MELLEA_TRACE_APPLICATION", "true") + monkeypatch.setenv("MELLEA_TRACE_BACKEND", "false") + # Force reload of telemetry module to pick up env vars + import importlib + + import mellea.telemetry + + importlib.reload(mellea.telemetry) + yield + # Reset after test + monkeypatch.setenv("MELLEA_TRACE_APPLICATION", "false") + importlib.reload(mellea.telemetry) + + +@pytest.fixture +def enable_backend_tracing(monkeypatch): + """Enable backend tracing for tests.""" + monkeypatch.setenv("MELLEA_TRACE_APPLICATION", "false") + monkeypatch.setenv("MELLEA_TRACE_BACKEND", "true") + # Force reload of telemetry module to pick up env vars + import importlib + + import mellea.telemetry + + importlib.reload(mellea.telemetry) + yield + # Reset after test + monkeypatch.setenv("MELLEA_TRACE_BACKEND", "false") + importlib.reload(mellea.telemetry) + + +def test_telemetry_disabled_by_default(): + """Test that telemetry is disabled by default.""" + from mellea.telemetry import ( + is_application_tracing_enabled, + is_backend_tracing_enabled, + ) + + assert not is_application_tracing_enabled() + assert not is_backend_tracing_enabled() + + +def test_application_tracing_enabled(enable_app_tracing): + """Test that application tracing can be enabled.""" + from mellea.telemetry import ( + is_application_tracing_enabled, + is_backend_tracing_enabled, + ) + + assert is_application_tracing_enabled() + assert not is_backend_tracing_enabled() + + +def test_backend_tracing_enabled(enable_backend_tracing): + """Test that backend tracing can be enabled.""" + from mellea.telemetry import ( + is_application_tracing_enabled, + is_backend_tracing_enabled, + ) + + assert not is_application_tracing_enabled() + assert is_backend_tracing_enabled() + + +def test_trace_application_context_manager(): + """Test that trace_application works as a context manager.""" + from mellea.telemetry import trace_application + + # Should not raise even when tracing is disabled + with trace_application("test_span", test_attr="value") as span: + # Span will be None when tracing is disabled + assert span is None or hasattr(span, "set_attribute") + + +def test_trace_backend_context_manager(): + """Test that trace_backend works as a context manager.""" + from mellea.telemetry import trace_backend + + # Should not raise even when tracing is disabled + with trace_backend("test_span", test_attr="value") as span: + # Span will be None when tracing is disabled + assert span is None or hasattr(span, "set_attribute") + + +def test_set_span_attribute_with_none_span(): + """Test that set_span_attribute handles None span gracefully.""" + from mellea.telemetry import set_span_attribute + + # Should not raise when span is None + set_span_attribute(None, "key", "value") + + +def test_set_span_error_with_none_span(): + """Test that set_span_error handles None span gracefully.""" + from mellea.telemetry import set_span_error + + # Should not raise when span is None + exception = ValueError("test error") + set_span_error(None, exception) + + +@pytest.mark.ollama +def test_session_with_tracing_disabled(): + """Test that session works normally when tracing is disabled.""" + from mellea import start_session + + with start_session() as m: + result = m.instruct("Say hello") + assert result is not None + + +@pytest.mark.ollama +def test_session_with_application_tracing(enable_app_tracing): + """Test that session works with application tracing enabled.""" + from mellea import start_session + + # This should create application trace spans + with start_session() as m: + result = m.instruct("Say hello") + assert result is not None + + +@pytest.mark.ollama +def test_session_with_backend_tracing(enable_backend_tracing): + """Test that session works with backend tracing enabled.""" + from mellea import start_session + + # This should create backend trace spans + with start_session() as m: + result = m.instruct("Say hello") + assert result is not None + + +@pytest.mark.ollama +def test_generative_function_with_tracing(enable_app_tracing): + """Test that @generative functions work with tracing enabled.""" + from mellea import generative, start_session + + @generative + def classify(text: str) -> str: + """Classify the text.""" + ... + + with start_session() as m: + result = classify(m, text="test") + assert result is not None + + +def test_backend_instrumentation_helpers(): + """Test backend instrumentation helper functions.""" + from mellea.telemetry.backend_instrumentation import ( + get_context_size, + get_model_id_str, + ) + + # Test with mock objects + class MockBackend: + def __init__(self): + self.model_id = "test-model" + + class MockContext: + def __init__(self): + self.turns = [1, 2, 3] + + backend = MockBackend() + ctx = MockContext() + + assert get_model_id_str(backend) == "test-model" + assert get_context_size(ctx) == 3 + + +def test_instrument_generate_from_context(): + """Test instrument_generate_from_context helper.""" + from mellea.telemetry.backend_instrumentation import ( + instrument_generate_from_context, + ) + + class MockBackend: + model_id = "test-model" + + class MockAction: + pass + + class MockContext: + turns = [] + + backend = MockBackend() + action = MockAction() + ctx = MockContext() + + # Should return a context manager + with instrument_generate_from_context(backend, action, ctx) as span: + # Span will be None when tracing is disabled + assert span is None or hasattr(span, "set_attribute") + + +def test_instrument_generate_from_raw(): + """Test instrument_generate_from_raw helper.""" + from mellea.telemetry.backend_instrumentation import instrument_generate_from_raw + + class MockBackend: + model_id = "test-model" + + backend = MockBackend() + + # Should return a context manager + with instrument_generate_from_raw(backend, num_actions=5) as span: + # Span will be None when tracing is disabled + assert span is None or hasattr(span, "set_attribute") diff --git a/uv.lock b/uv.lock index 6b0f63e3..da59d733 100644 --- a/uv.lock +++ b/uv.lock @@ -3338,6 +3338,10 @@ dependencies = [ { name = "mistletoe" }, { name = "ollama" }, { name = "openai" }, + { name = "opentelemetry-api" }, + { name = "opentelemetry-distro" }, + { name = "opentelemetry-exporter-otlp" }, + { name = "opentelemetry-sdk" }, { name = "pillow" }, { name = "pydantic" }, { name = "requests" }, @@ -3445,6 +3449,10 @@ requires-dist = [ { name = "numpy", marker = "extra == 'vllm'", specifier = "<2.0.0" }, { name = "ollama", specifier = ">=0.5.1" }, { name = "openai" }, + { name = "opentelemetry-api", specifier = ">=1.20.0" }, + { name = "opentelemetry-distro", specifier = ">=0.59b0" }, + { name = "opentelemetry-exporter-otlp", specifier = ">=1.20.0" }, + { name = "opentelemetry-sdk", specifier = ">=1.20.0" }, { name = "outlines", marker = "extra == 'hf'" }, { name = "outlines-core", marker = "extra == 'hf'", specifier = "==0.1.26" }, { name = "outlines-core", marker = "extra == 'vllm'", specifier = "==0.1.26" }, @@ -4483,6 +4491,20 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/ae/a2/d86e01c28300bd41bab8f18afd613676e2bd63515417b77636fc1add426f/opentelemetry_api-1.38.0-py3-none-any.whl", hash = "sha256:2891b0197f47124454ab9f0cf58f3be33faca394457ac3e09daba13ff50aa582", size = 65947, upload-time = "2025-10-16T08:35:30.23Z" }, ] +[[package]] +name = "opentelemetry-distro" +version = "0.59b0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "opentelemetry-api" }, + { name = "opentelemetry-instrumentation" }, + { name = "opentelemetry-sdk" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/ef/73/909d18e3d609c9f72fdfc441dbf2f33d26d29126088de5b3df30f4867f8a/opentelemetry_distro-0.59b0.tar.gz", hash = "sha256:a72703a514e1773d35d1ec01489a5fd1f1e7ce92e93cf459ba60f85b880d0099", size = 2583, upload-time = "2025-10-16T08:39:28.111Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/e3/a5/71d78732d30616b0b57cce416fa49e7f25ce57492eaf66d0b6864c1df35f/opentelemetry_distro-0.59b0-py3-none-any.whl", hash = "sha256:bbe568d84d801d7e1ead320c4521fc37a4c24b3b2cd49a64f6d8a3c10676cea4", size = 3346, upload-time = "2025-10-16T08:38:27.63Z" }, +] + [[package]] name = "opentelemetry-exporter-otlp" version = "1.38.0" @@ -4544,6 +4566,21 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/e5/77/154004c99fb9f291f74aa0822a2f5bbf565a72d8126b3a1b63ed8e5f83c7/opentelemetry_exporter_otlp_proto_http-1.38.0-py3-none-any.whl", hash = "sha256:84b937305edfc563f08ec69b9cb2298be8188371217e867c1854d77198d0825b", size = 19579, upload-time = "2025-10-16T08:35:36.269Z" }, ] +[[package]] +name = "opentelemetry-instrumentation" +version = "0.59b0" +source = { registry = "https://pypi.org/simple" } +dependencies = [ + { name = "opentelemetry-api" }, + { name = "opentelemetry-semantic-conventions" }, + { name = "packaging" }, + { name = "wrapt" }, +] +sdist = { url = "https://files.pythonhosted.org/packages/04/ed/9c65cd209407fd807fa05be03ee30f159bdac8d59e7ea16a8fe5a1601222/opentelemetry_instrumentation-0.59b0.tar.gz", hash = "sha256:6010f0faaacdaf7c4dff8aac84e226d23437b331dcda7e70367f6d73a7db1adc", size = 31544, upload-time = "2025-10-16T08:39:31.959Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/10/f5/7a40ff3f62bfe715dad2f633d7f1174ba1a7dd74254c15b2558b3401262a/opentelemetry_instrumentation-0.59b0-py3-none-any.whl", hash = "sha256:44082cc8fe56b0186e87ee8f7c17c327c4c2ce93bdbe86496e600985d74368ee", size = 33020, upload-time = "2025-10-16T08:38:31.463Z" }, +] + [[package]] name = "opentelemetry-proto" version = "1.38.0" @@ -8298,6 +8335,75 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/3f/0e/fa3b193432cfc60c93b42f3be03365f5f909d2b3ea410295cf36df739e31/widgetsnbextension-4.0.15-py3-none-any.whl", hash = "sha256:8156704e4346a571d9ce73b84bee86a29906c9abfd7223b7228a28899ccf3366", size = 2196503, upload-time = "2025-11-01T21:15:53.565Z" }, ] +[[package]] +name = "wrapt" +version = "1.17.3" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/95/8f/aeb76c5b46e273670962298c23e7ddde79916cb74db802131d49a85e4b7d/wrapt-1.17.3.tar.gz", hash = "sha256:f66eb08feaa410fe4eebd17f2a2c8e2e46d3476e9f8c783daa8e09e0faa666d0", size = 55547, upload-time = "2025-08-12T05:53:21.714Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/3f/23/bb82321b86411eb51e5a5db3fb8f8032fd30bd7c2d74bfe936136b2fa1d6/wrapt-1.17.3-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:88bbae4d40d5a46142e70d58bf664a89b6b4befaea7b2ecc14e03cedb8e06c04", size = 53482, upload-time = "2025-08-12T05:51:44.467Z" }, + { url = "https://files.pythonhosted.org/packages/45/69/f3c47642b79485a30a59c63f6d739ed779fb4cc8323205d047d741d55220/wrapt-1.17.3-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:e6b13af258d6a9ad602d57d889f83b9d5543acd471eee12eb51f5b01f8eb1bc2", size = 38676, upload-time = "2025-08-12T05:51:32.636Z" }, + { url = "https://files.pythonhosted.org/packages/d1/71/e7e7f5670c1eafd9e990438e69d8fb46fa91a50785332e06b560c869454f/wrapt-1.17.3-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:fd341868a4b6714a5962c1af0bd44f7c404ef78720c7de4892901e540417111c", size = 38957, upload-time = "2025-08-12T05:51:54.655Z" }, + { url = "https://files.pythonhosted.org/packages/de/17/9f8f86755c191d6779d7ddead1a53c7a8aa18bccb7cea8e7e72dfa6a8a09/wrapt-1.17.3-cp310-cp310-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:f9b2601381be482f70e5d1051a5965c25fb3625455a2bf520b5a077b22afb775", size = 81975, upload-time = "2025-08-12T05:52:30.109Z" }, + { url = "https://files.pythonhosted.org/packages/f2/15/dd576273491f9f43dd09fce517f6c2ce6eb4fe21681726068db0d0467096/wrapt-1.17.3-cp310-cp310-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:343e44b2a8e60e06a7e0d29c1671a0d9951f59174f3709962b5143f60a2a98bd", size = 83149, upload-time = "2025-08-12T05:52:09.316Z" }, + { url = "https://files.pythonhosted.org/packages/0c/c4/5eb4ce0d4814521fee7aa806264bf7a114e748ad05110441cd5b8a5c744b/wrapt-1.17.3-cp310-cp310-musllinux_1_2_aarch64.whl", hash = "sha256:33486899acd2d7d3066156b03465b949da3fd41a5da6e394ec49d271baefcf05", size = 82209, upload-time = "2025-08-12T05:52:10.331Z" }, + { url = "https://files.pythonhosted.org/packages/31/4b/819e9e0eb5c8dc86f60dfc42aa4e2c0d6c3db8732bce93cc752e604bb5f5/wrapt-1.17.3-cp310-cp310-musllinux_1_2_x86_64.whl", hash = "sha256:e6f40a8aa5a92f150bdb3e1c44b7e98fb7113955b2e5394122fa5532fec4b418", size = 81551, upload-time = "2025-08-12T05:52:31.137Z" }, + { url = "https://files.pythonhosted.org/packages/f8/83/ed6baf89ba3a56694700139698cf703aac9f0f9eb03dab92f57551bd5385/wrapt-1.17.3-cp310-cp310-win32.whl", hash = "sha256:a36692b8491d30a8c75f1dfee65bef119d6f39ea84ee04d9f9311f83c5ad9390", size = 36464, upload-time = "2025-08-12T05:53:01.204Z" }, + { url = "https://files.pythonhosted.org/packages/2f/90/ee61d36862340ad7e9d15a02529df6b948676b9a5829fd5e16640156627d/wrapt-1.17.3-cp310-cp310-win_amd64.whl", hash = "sha256:afd964fd43b10c12213574db492cb8f73b2f0826c8df07a68288f8f19af2ebe6", size = 38748, upload-time = "2025-08-12T05:53:00.209Z" }, + { url = "https://files.pythonhosted.org/packages/bd/c3/cefe0bd330d389c9983ced15d326f45373f4073c9f4a8c2f99b50bfea329/wrapt-1.17.3-cp310-cp310-win_arm64.whl", hash = "sha256:af338aa93554be859173c39c85243970dc6a289fa907402289eeae7543e1ae18", size = 36810, upload-time = "2025-08-12T05:52:51.906Z" }, + { url = "https://files.pythonhosted.org/packages/52/db/00e2a219213856074a213503fdac0511203dceefff26e1daa15250cc01a0/wrapt-1.17.3-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:273a736c4645e63ac582c60a56b0acb529ef07f78e08dc6bfadf6a46b19c0da7", size = 53482, upload-time = "2025-08-12T05:51:45.79Z" }, + { url = "https://files.pythonhosted.org/packages/5e/30/ca3c4a5eba478408572096fe9ce36e6e915994dd26a4e9e98b4f729c06d9/wrapt-1.17.3-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:5531d911795e3f935a9c23eb1c8c03c211661a5060aab167065896bbf62a5f85", size = 38674, upload-time = "2025-08-12T05:51:34.629Z" }, + { url = "https://files.pythonhosted.org/packages/31/25/3e8cc2c46b5329c5957cec959cb76a10718e1a513309c31399a4dad07eb3/wrapt-1.17.3-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:0610b46293c59a3adbae3dee552b648b984176f8562ee0dba099a56cfbe4df1f", size = 38959, upload-time = "2025-08-12T05:51:56.074Z" }, + { url = "https://files.pythonhosted.org/packages/5d/8f/a32a99fc03e4b37e31b57cb9cefc65050ea08147a8ce12f288616b05ef54/wrapt-1.17.3-cp311-cp311-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:b32888aad8b6e68f83a8fdccbf3165f5469702a7544472bdf41f582970ed3311", size = 82376, upload-time = "2025-08-12T05:52:32.134Z" }, + { url = "https://files.pythonhosted.org/packages/31/57/4930cb8d9d70d59c27ee1332a318c20291749b4fba31f113c2f8ac49a72e/wrapt-1.17.3-cp311-cp311-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:8cccf4f81371f257440c88faed6b74f1053eef90807b77e31ca057b2db74edb1", size = 83604, upload-time = "2025-08-12T05:52:11.663Z" }, + { url = "https://files.pythonhosted.org/packages/a8/f3/1afd48de81d63dd66e01b263a6fbb86e1b5053b419b9b33d13e1f6d0f7d0/wrapt-1.17.3-cp311-cp311-musllinux_1_2_aarch64.whl", hash = "sha256:d8a210b158a34164de8bb68b0e7780041a903d7b00c87e906fb69928bf7890d5", size = 82782, upload-time = "2025-08-12T05:52:12.626Z" }, + { url = "https://files.pythonhosted.org/packages/1e/d7/4ad5327612173b144998232f98a85bb24b60c352afb73bc48e3e0d2bdc4e/wrapt-1.17.3-cp311-cp311-musllinux_1_2_x86_64.whl", hash = "sha256:79573c24a46ce11aab457b472efd8d125e5a51da2d1d24387666cd85f54c05b2", size = 82076, upload-time = "2025-08-12T05:52:33.168Z" }, + { url = "https://files.pythonhosted.org/packages/bb/59/e0adfc831674a65694f18ea6dc821f9fcb9ec82c2ce7e3d73a88ba2e8718/wrapt-1.17.3-cp311-cp311-win32.whl", hash = "sha256:c31eebe420a9a5d2887b13000b043ff6ca27c452a9a22fa71f35f118e8d4bf89", size = 36457, upload-time = "2025-08-12T05:53:03.936Z" }, + { url = "https://files.pythonhosted.org/packages/83/88/16b7231ba49861b6f75fc309b11012ede4d6b0a9c90969d9e0db8d991aeb/wrapt-1.17.3-cp311-cp311-win_amd64.whl", hash = "sha256:0b1831115c97f0663cb77aa27d381237e73ad4f721391a9bfb2fe8bc25fa6e77", size = 38745, upload-time = "2025-08-12T05:53:02.885Z" }, + { url = "https://files.pythonhosted.org/packages/9a/1e/c4d4f3398ec073012c51d1c8d87f715f56765444e1a4b11e5180577b7e6e/wrapt-1.17.3-cp311-cp311-win_arm64.whl", hash = "sha256:5a7b3c1ee8265eb4c8f1b7d29943f195c00673f5ab60c192eba2d4a7eae5f46a", size = 36806, upload-time = "2025-08-12T05:52:53.368Z" }, + { url = "https://files.pythonhosted.org/packages/9f/41/cad1aba93e752f1f9268c77270da3c469883d56e2798e7df6240dcb2287b/wrapt-1.17.3-cp312-cp312-macosx_10_13_universal2.whl", hash = "sha256:ab232e7fdb44cdfbf55fc3afa31bcdb0d8980b9b95c38b6405df2acb672af0e0", size = 53998, upload-time = "2025-08-12T05:51:47.138Z" }, + { url = "https://files.pythonhosted.org/packages/60/f8/096a7cc13097a1869fe44efe68dace40d2a16ecb853141394047f0780b96/wrapt-1.17.3-cp312-cp312-macosx_10_13_x86_64.whl", hash = "sha256:9baa544e6acc91130e926e8c802a17f3b16fbea0fd441b5a60f5cf2cc5c3deba", size = 39020, upload-time = "2025-08-12T05:51:35.906Z" }, + { url = "https://files.pythonhosted.org/packages/33/df/bdf864b8997aab4febb96a9ae5c124f700a5abd9b5e13d2a3214ec4be705/wrapt-1.17.3-cp312-cp312-macosx_11_0_arm64.whl", hash = "sha256:6b538e31eca1a7ea4605e44f81a48aa24c4632a277431a6ed3f328835901f4fd", size = 39098, upload-time = "2025-08-12T05:51:57.474Z" }, + { url = "https://files.pythonhosted.org/packages/9f/81/5d931d78d0eb732b95dc3ddaeeb71c8bb572fb01356e9133916cd729ecdd/wrapt-1.17.3-cp312-cp312-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:042ec3bb8f319c147b1301f2393bc19dba6e176b7da446853406d041c36c7828", size = 88036, upload-time = "2025-08-12T05:52:34.784Z" }, + { url = "https://files.pythonhosted.org/packages/ca/38/2e1785df03b3d72d34fc6252d91d9d12dc27a5c89caef3335a1bbb8908ca/wrapt-1.17.3-cp312-cp312-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:3af60380ba0b7b5aeb329bc4e402acd25bd877e98b3727b0135cb5c2efdaefe9", size = 88156, upload-time = "2025-08-12T05:52:13.599Z" }, + { url = "https://files.pythonhosted.org/packages/b3/8b/48cdb60fe0603e34e05cffda0b2a4adab81fd43718e11111a4b0100fd7c1/wrapt-1.17.3-cp312-cp312-musllinux_1_2_aarch64.whl", hash = "sha256:0b02e424deef65c9f7326d8c19220a2c9040c51dc165cddb732f16198c168396", size = 87102, upload-time = "2025-08-12T05:52:14.56Z" }, + { url = "https://files.pythonhosted.org/packages/3c/51/d81abca783b58f40a154f1b2c56db1d2d9e0d04fa2d4224e357529f57a57/wrapt-1.17.3-cp312-cp312-musllinux_1_2_x86_64.whl", hash = "sha256:74afa28374a3c3a11b3b5e5fca0ae03bef8450d6aa3ab3a1e2c30e3a75d023dc", size = 87732, upload-time = "2025-08-12T05:52:36.165Z" }, + { url = "https://files.pythonhosted.org/packages/9e/b1/43b286ca1392a006d5336412d41663eeef1ad57485f3e52c767376ba7e5a/wrapt-1.17.3-cp312-cp312-win32.whl", hash = "sha256:4da9f45279fff3543c371d5ababc57a0384f70be244de7759c85a7f989cb4ebe", size = 36705, upload-time = "2025-08-12T05:53:07.123Z" }, + { url = "https://files.pythonhosted.org/packages/28/de/49493f962bd3c586ab4b88066e967aa2e0703d6ef2c43aa28cb83bf7b507/wrapt-1.17.3-cp312-cp312-win_amd64.whl", hash = "sha256:e71d5c6ebac14875668a1e90baf2ea0ef5b7ac7918355850c0908ae82bcb297c", size = 38877, upload-time = "2025-08-12T05:53:05.436Z" }, + { url = "https://files.pythonhosted.org/packages/f1/48/0f7102fe9cb1e8a5a77f80d4f0956d62d97034bbe88d33e94699f99d181d/wrapt-1.17.3-cp312-cp312-win_arm64.whl", hash = "sha256:604d076c55e2fdd4c1c03d06dc1a31b95130010517b5019db15365ec4a405fc6", size = 36885, upload-time = "2025-08-12T05:52:54.367Z" }, + { url = "https://files.pythonhosted.org/packages/fc/f6/759ece88472157acb55fc195e5b116e06730f1b651b5b314c66291729193/wrapt-1.17.3-cp313-cp313-macosx_10_13_universal2.whl", hash = "sha256:a47681378a0439215912ef542c45a783484d4dd82bac412b71e59cf9c0e1cea0", size = 54003, upload-time = "2025-08-12T05:51:48.627Z" }, + { url = "https://files.pythonhosted.org/packages/4f/a9/49940b9dc6d47027dc850c116d79b4155f15c08547d04db0f07121499347/wrapt-1.17.3-cp313-cp313-macosx_10_13_x86_64.whl", hash = "sha256:54a30837587c6ee3cd1a4d1c2ec5d24e77984d44e2f34547e2323ddb4e22eb77", size = 39025, upload-time = "2025-08-12T05:51:37.156Z" }, + { url = "https://files.pythonhosted.org/packages/45/35/6a08de0f2c96dcdd7fe464d7420ddb9a7655a6561150e5fc4da9356aeaab/wrapt-1.17.3-cp313-cp313-macosx_11_0_arm64.whl", hash = "sha256:16ecf15d6af39246fe33e507105d67e4b81d8f8d2c6598ff7e3ca1b8a37213f7", size = 39108, upload-time = "2025-08-12T05:51:58.425Z" }, + { url = "https://files.pythonhosted.org/packages/0c/37/6faf15cfa41bf1f3dba80cd3f5ccc6622dfccb660ab26ed79f0178c7497f/wrapt-1.17.3-cp313-cp313-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:6fd1ad24dc235e4ab88cda009e19bf347aabb975e44fd5c2fb22a3f6e4141277", size = 88072, upload-time = "2025-08-12T05:52:37.53Z" }, + { url = "https://files.pythonhosted.org/packages/78/f2/efe19ada4a38e4e15b6dff39c3e3f3f73f5decf901f66e6f72fe79623a06/wrapt-1.17.3-cp313-cp313-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:0ed61b7c2d49cee3c027372df5809a59d60cf1b6c2f81ee980a091f3afed6a2d", size = 88214, upload-time = "2025-08-12T05:52:15.886Z" }, + { url = "https://files.pythonhosted.org/packages/40/90/ca86701e9de1622b16e09689fc24b76f69b06bb0150990f6f4e8b0eeb576/wrapt-1.17.3-cp313-cp313-musllinux_1_2_aarch64.whl", hash = "sha256:423ed5420ad5f5529db9ce89eac09c8a2f97da18eb1c870237e84c5a5c2d60aa", size = 87105, upload-time = "2025-08-12T05:52:17.914Z" }, + { url = "https://files.pythonhosted.org/packages/fd/e0/d10bd257c9a3e15cbf5523025252cc14d77468e8ed644aafb2d6f54cb95d/wrapt-1.17.3-cp313-cp313-musllinux_1_2_x86_64.whl", hash = "sha256:e01375f275f010fcbf7f643b4279896d04e571889b8a5b3f848423d91bf07050", size = 87766, upload-time = "2025-08-12T05:52:39.243Z" }, + { url = "https://files.pythonhosted.org/packages/e8/cf/7d848740203c7b4b27eb55dbfede11aca974a51c3d894f6cc4b865f42f58/wrapt-1.17.3-cp313-cp313-win32.whl", hash = "sha256:53e5e39ff71b3fc484df8a522c933ea2b7cdd0d5d15ae82e5b23fde87d44cbd8", size = 36711, upload-time = "2025-08-12T05:53:10.074Z" }, + { url = "https://files.pythonhosted.org/packages/57/54/35a84d0a4d23ea675994104e667ceff49227ce473ba6a59ba2c84f250b74/wrapt-1.17.3-cp313-cp313-win_amd64.whl", hash = "sha256:1f0b2f40cf341ee8cc1a97d51ff50dddb9fcc73241b9143ec74b30fc4f44f6cb", size = 38885, upload-time = "2025-08-12T05:53:08.695Z" }, + { url = "https://files.pythonhosted.org/packages/01/77/66e54407c59d7b02a3c4e0af3783168fff8e5d61def52cda8728439d86bc/wrapt-1.17.3-cp313-cp313-win_arm64.whl", hash = "sha256:7425ac3c54430f5fc5e7b6f41d41e704db073309acfc09305816bc6a0b26bb16", size = 36896, upload-time = "2025-08-12T05:52:55.34Z" }, + { url = "https://files.pythonhosted.org/packages/02/a2/cd864b2a14f20d14f4c496fab97802001560f9f41554eef6df201cd7f76c/wrapt-1.17.3-cp314-cp314-macosx_10_13_universal2.whl", hash = "sha256:cf30f6e3c077c8e6a9a7809c94551203c8843e74ba0c960f4a98cd80d4665d39", size = 54132, upload-time = "2025-08-12T05:51:49.864Z" }, + { url = "https://files.pythonhosted.org/packages/d5/46/d011725b0c89e853dc44cceb738a307cde5d240d023d6d40a82d1b4e1182/wrapt-1.17.3-cp314-cp314-macosx_10_13_x86_64.whl", hash = "sha256:e228514a06843cae89621384cfe3a80418f3c04aadf8a3b14e46a7be704e4235", size = 39091, upload-time = "2025-08-12T05:51:38.935Z" }, + { url = "https://files.pythonhosted.org/packages/2e/9e/3ad852d77c35aae7ddebdbc3b6d35ec8013af7d7dddad0ad911f3d891dae/wrapt-1.17.3-cp314-cp314-macosx_11_0_arm64.whl", hash = "sha256:5ea5eb3c0c071862997d6f3e02af1d055f381b1d25b286b9d6644b79db77657c", size = 39172, upload-time = "2025-08-12T05:51:59.365Z" }, + { url = "https://files.pythonhosted.org/packages/c3/f7/c983d2762bcce2326c317c26a6a1e7016f7eb039c27cdf5c4e30f4160f31/wrapt-1.17.3-cp314-cp314-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:281262213373b6d5e4bb4353bc36d1ba4084e6d6b5d242863721ef2bf2c2930b", size = 87163, upload-time = "2025-08-12T05:52:40.965Z" }, + { url = "https://files.pythonhosted.org/packages/e4/0f/f673f75d489c7f22d17fe0193e84b41540d962f75fce579cf6873167c29b/wrapt-1.17.3-cp314-cp314-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:dc4a8d2b25efb6681ecacad42fca8859f88092d8732b170de6a5dddd80a1c8fa", size = 87963, upload-time = "2025-08-12T05:52:20.326Z" }, + { url = "https://files.pythonhosted.org/packages/df/61/515ad6caca68995da2fac7a6af97faab8f78ebe3bf4f761e1b77efbc47b5/wrapt-1.17.3-cp314-cp314-musllinux_1_2_aarch64.whl", hash = "sha256:373342dd05b1d07d752cecbec0c41817231f29f3a89aa8b8843f7b95992ed0c7", size = 86945, upload-time = "2025-08-12T05:52:21.581Z" }, + { url = "https://files.pythonhosted.org/packages/d3/bd/4e70162ce398462a467bc09e768bee112f1412e563620adc353de9055d33/wrapt-1.17.3-cp314-cp314-musllinux_1_2_x86_64.whl", hash = "sha256:d40770d7c0fd5cbed9d84b2c3f2e156431a12c9a37dc6284060fb4bec0b7ffd4", size = 86857, upload-time = "2025-08-12T05:52:43.043Z" }, + { url = "https://files.pythonhosted.org/packages/2b/b8/da8560695e9284810b8d3df8a19396a6e40e7518059584a1a394a2b35e0a/wrapt-1.17.3-cp314-cp314-win32.whl", hash = "sha256:fbd3c8319de8e1dc79d346929cd71d523622da527cca14e0c1d257e31c2b8b10", size = 37178, upload-time = "2025-08-12T05:53:12.605Z" }, + { url = "https://files.pythonhosted.org/packages/db/c8/b71eeb192c440d67a5a0449aaee2310a1a1e8eca41676046f99ed2487e9f/wrapt-1.17.3-cp314-cp314-win_amd64.whl", hash = "sha256:e1a4120ae5705f673727d3253de3ed0e016f7cd78dc463db1b31e2463e1f3cf6", size = 39310, upload-time = "2025-08-12T05:53:11.106Z" }, + { url = "https://files.pythonhosted.org/packages/45/20/2cda20fd4865fa40f86f6c46ed37a2a8356a7a2fde0773269311f2af56c7/wrapt-1.17.3-cp314-cp314-win_arm64.whl", hash = "sha256:507553480670cab08a800b9463bdb881b2edeed77dc677b0a5915e6106e91a58", size = 37266, upload-time = "2025-08-12T05:52:56.531Z" }, + { url = "https://files.pythonhosted.org/packages/77/ed/dd5cf21aec36c80443c6f900449260b80e2a65cf963668eaef3b9accce36/wrapt-1.17.3-cp314-cp314t-macosx_10_13_universal2.whl", hash = "sha256:ed7c635ae45cfbc1a7371f708727bf74690daedc49b4dba310590ca0bd28aa8a", size = 56544, upload-time = "2025-08-12T05:51:51.109Z" }, + { url = "https://files.pythonhosted.org/packages/8d/96/450c651cc753877ad100c7949ab4d2e2ecc4d97157e00fa8f45df682456a/wrapt-1.17.3-cp314-cp314t-macosx_10_13_x86_64.whl", hash = "sha256:249f88ed15503f6492a71f01442abddd73856a0032ae860de6d75ca62eed8067", size = 40283, upload-time = "2025-08-12T05:51:39.912Z" }, + { url = "https://files.pythonhosted.org/packages/d1/86/2fcad95994d9b572db57632acb6f900695a648c3e063f2cd344b3f5c5a37/wrapt-1.17.3-cp314-cp314t-macosx_11_0_arm64.whl", hash = "sha256:5a03a38adec8066d5a37bea22f2ba6bbf39fcdefbe2d91419ab864c3fb515454", size = 40366, upload-time = "2025-08-12T05:52:00.693Z" }, + { url = "https://files.pythonhosted.org/packages/64/0e/f4472f2fdde2d4617975144311f8800ef73677a159be7fe61fa50997d6c0/wrapt-1.17.3-cp314-cp314t-manylinux1_x86_64.manylinux_2_28_x86_64.manylinux_2_5_x86_64.whl", hash = "sha256:5d4478d72eb61c36e5b446e375bbc49ed002430d17cdec3cecb36993398e1a9e", size = 108571, upload-time = "2025-08-12T05:52:44.521Z" }, + { url = "https://files.pythonhosted.org/packages/cc/01/9b85a99996b0a97c8a17484684f206cbb6ba73c1ce6890ac668bcf3838fb/wrapt-1.17.3-cp314-cp314t-manylinux2014_aarch64.manylinux_2_17_aarch64.manylinux_2_28_aarch64.whl", hash = "sha256:223db574bb38637e8230eb14b185565023ab624474df94d2af18f1cdb625216f", size = 113094, upload-time = "2025-08-12T05:52:22.618Z" }, + { url = "https://files.pythonhosted.org/packages/25/02/78926c1efddcc7b3aa0bc3d6b33a822f7d898059f7cd9ace8c8318e559ef/wrapt-1.17.3-cp314-cp314t-musllinux_1_2_aarch64.whl", hash = "sha256:e405adefb53a435f01efa7ccdec012c016b5a1d3f35459990afc39b6be4d5056", size = 110659, upload-time = "2025-08-12T05:52:24.057Z" }, + { url = "https://files.pythonhosted.org/packages/dc/ee/c414501ad518ac3e6fe184753632fe5e5ecacdcf0effc23f31c1e4f7bfcf/wrapt-1.17.3-cp314-cp314t-musllinux_1_2_x86_64.whl", hash = "sha256:88547535b787a6c9ce4086917b6e1d291aa8ed914fdd3a838b3539dc95c12804", size = 106946, upload-time = "2025-08-12T05:52:45.976Z" }, + { url = "https://files.pythonhosted.org/packages/be/44/a1bd64b723d13bb151d6cc91b986146a1952385e0392a78567e12149c7b4/wrapt-1.17.3-cp314-cp314t-win32.whl", hash = "sha256:41b1d2bc74c2cac6f9074df52b2efbef2b30bdfe5f40cb78f8ca22963bc62977", size = 38717, upload-time = "2025-08-12T05:53:15.214Z" }, + { url = "https://files.pythonhosted.org/packages/79/d9/7cfd5a312760ac4dd8bf0184a6ee9e43c33e47f3dadc303032ce012b8fa3/wrapt-1.17.3-cp314-cp314t-win_amd64.whl", hash = "sha256:73d496de46cd2cdbdbcce4ae4bcdb4afb6a11234a1df9c085249d55166b95116", size = 41334, upload-time = "2025-08-12T05:53:14.178Z" }, + { url = "https://files.pythonhosted.org/packages/46/78/10ad9781128ed2f99dbc474f43283b13fea8ba58723e98844367531c18e9/wrapt-1.17.3-cp314-cp314t-win_arm64.whl", hash = "sha256:f38e60678850c42461d4202739f9bf1e3a737c7ad283638251e79cc49effb6b6", size = 38471, upload-time = "2025-08-12T05:52:57.784Z" }, + { url = "https://files.pythonhosted.org/packages/1f/f6/a933bd70f98e9cf3e08167fc5cd7aaaca49147e48411c0bd5ae701bb2194/wrapt-1.17.3-py3-none-any.whl", hash = "sha256:7171ae35d2c33d326ac19dd8facb1e82e5fd04ef8c6c0e394d7af55a55051c22", size = 23591, upload-time = "2025-08-12T05:53:20.674Z" }, +] + [[package]] name = "xformers" version = "0.0.30"