refactor: added settings-driven DI and CSRF middleware with SSE subscribed event #73
Conversation
📝 Walkthrough

Centralized Settings injection across services and tests, DI-propagation of SecurityService, added CSRFMiddleware and per-request CSRF validation, metrics/tracing and Kafka/SSE components parameterized by Settings, DLQ batch-retry added to manager, and extensive test refactor to typed fixtures and CSRF-capable authenticated clients.

Changes

* multiple related modules under same area.

Sequence Diagram(s)

sequenceDiagram
autonumber
participant TR as Test Runner
participant TS as test_settings (Settings)
participant App as FastAPI App
participant DI as DI Container / Providers
participant Sec as SecurityService
participant Client as Test AsyncClient
participant Prod as UnifiedProducer
participant Kafka as Kafka
TR->>TS: instantiate Settings (.env.test)
TR->>App: create_app(settings=TS)
App->>DI: build container/providers with TS
DI->>Sec: construct SecurityService(settings=TS)
App->>App: register CSRFMiddleware(container)
Client->>App: POST /api/v1/auth/register
App->>Sec: get_password_hash / create user
App-->>Client: set cookies + CSRF token
Client->>App: POST protected endpoint with X-CSRF-Token
App->>Sec: validate_csrf_from_request(request)
App->>Prod: produce event (topic from TS)
Prod->>Kafka: send message
Kafka-->>Prod: ack
App-->>Client: respond
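The CSRF leg of the flow above (set cookie + CSRF token, then validate `X-CSRF-Token` on protected POSTs) can be sketched as a minimal double-submit check. The header name `X-CSRF-Token` comes from the diagram; the cookie name and function shape are illustrative assumptions, not the project's actual middleware:

```python
import hmac


def validate_csrf(headers: dict[str, str], cookies: dict[str, str]) -> bool:
    """Double-submit CSRF check: header token must match the cookie token."""
    header_token = headers.get("X-CSRF-Token", "")
    cookie_token = cookies.get("csrf_token", "")  # cookie name is an assumption
    # Constant-time comparison to avoid timing side channels
    return bool(header_token) and hmac.compare_digest(header_token, cookie_token)


print(validate_csrf({"X-CSRF-Token": "abc"}, {"csrf_token": "abc"}))  # True
print(validate_csrf({}, {"csrf_token": "abc"}))                       # False
```

In a FastAPI middleware this check would run per request before the handler, rejecting mutating requests whose tokens do not match.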
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
No issues found across 1 file
Codecov Report

❌ Patch coverage is
Flags with carried forward coverage won't be shown. Click here to find out more.
... and 9 files with indirect coverage changes
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @backend/tests/conftest.py:
- Around line 73-78: The app fixture currently instantiates TestSettings()
directly instead of using the test_settings fixture that generates the unique
Kafka topic prefix; update the app fixture to accept the test_settings fixture
(i.e., add test_settings as a parameter to the app fixture) and use that
TestSettings instance when constructing the FastAPI application so the app (and
derived fixtures like client, scope, db) inherit the unique KAFKA_TOPIC_PREFIX
rather than a fresh TestSettings().
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/tests/conftest.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Build Frontend
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
3 issues found across 11 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/services/sse/kafka_redis_bridge.py">
<violation number="1" location="backend/app/services/sse/kafka_redis_bridge.py:65">
P1: Missing separator between base group_id and suffix. The original code used `.` separator for group_id and `-` for client_id. This is inconsistent with all other usages in the codebase (e.g., `coordinator.py`, `saga_orchestrator.py`, `notification_service.py`) which all use `f"{base}.{suffix}"` pattern.</violation>
</file>
<file name="backend/tests/conftest.py">
<violation number="1" location="backend/tests/conftest.py:25">
P2: Session ID doesn't check `PYTEST_SESSION_ID` environment variable. Consider using `os.environ.get("PYTEST_SESSION_ID") or uuid.uuid4().hex[:8]` to match the PR description and preserve the ability for CI to inject a consistent session ID.</violation>
</file>
<file name="backend/app/core/middlewares/metrics.py">
<violation number="1" location="backend/app/core/middlewares/metrics.py:125">
P2: Removing `OTEL_SDK_DISABLED` check breaks compatibility with the standard OpenTelemetry configuration. This environment variable is an official OTel SDK option to disable all signals. Users who have set `OTEL_SDK_DISABLED=true` (common in local dev or specific deployments) will unexpectedly have metrics enabled after this change, potentially causing connection timeouts or overhead. Consider keeping the standard env var check alongside `settings.TESTING`.</violation>
</file>
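The `OTEL_SDK_DISABLED` concern above can be sketched as a combined guard that honors both the app's `TESTING` flag and the standard OTel kill switch. The helper name is illustrative; it is not the actual `metrics.py` code:

```python
import os


def metrics_disabled(testing: bool) -> bool:
    """Return True when metrics should stay off.

    Combines the app-level TESTING flag with the standard OpenTelemetry
    OTEL_SDK_DISABLED environment variable, as the review suggests.
    """
    otel_disabled = os.environ.get("OTEL_SDK_DISABLED", "").strip().lower() == "true"
    return testing or otel_disabled


print(metrics_disabled(True))   # True: tests always opt out
os.environ["OTEL_SDK_DISABLED"] = "true"
print(metrics_disabled(False))  # True: the standard kill switch is respected
```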
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/core/middlewares/metrics.py (1)
123-136: Fix misleading comment + unreachable `service.environment="test"` branch

Because you return on `settings.TESTING` (lines 125-127), `"service.environment": "test" if settings.TESTING else "production"` (line 134) will always resolve to `"production"`. Also, the comment "or when explicitly disabled" is no longer true.

Proposed fix:
```diff
 def setup_metrics(app: FastAPI, logger: logging.Logger) -> None:
     """Set up OpenTelemetry metrics with OTLP exporter."""
     settings = get_settings()
-    # Fast opt-out for tests or when explicitly disabled
+    # Fast opt-out for tests
     if settings.TESTING:
         logger.info("OpenTelemetry metrics disabled (TESTING)")
         return
     # Configure OpenTelemetry resource
+    # NOTE: If you want "test" here, you can't return early above.
     resource = Resource.create(
         {
             SERVICE_NAME: settings.PROJECT_NAME,
             SERVICE_VERSION: "1.0.0",
-            "service.environment": "test" if settings.TESTING else "production",
+            "service.environment": "production",
         }
     )
```
🤖 Fix all issues with AI agents
In @backend/app/core/middlewares/metrics.py:
- Around line 138-142: The fallback endpoint for OTLP metrics uses a bare
"localhost:4317"; change the default to include the HTTP scheme by setting
endpoint = settings.OTEL_EXPORTER_OTLP_ENDPOINT or "http://localhost:4317" so
OTLPMetricExporter(endpoint=endpoint, insecure=True) uses the explicit
"http://..." format (refer to the endpoint variable and OTLPMetricExporter in
metrics.py and settings.OTEL_EXPORTER_OTLP_ENDPOINT).
In @backend/app/services/sse/kafka_redis_bridge.py:
- Around line 64-66: Update the group naming in kafka_redis_bridge.py to include
the dot separator before the KAFKA_GROUP_SUFFIX: change how group_id and
client_id are constructed so they append a "." before
self.settings.KAFKA_GROUP_SUFFIX (i.e., adjust the expressions that set group_id
and client_id to use a dot separator), ensuring they match the convention used
elsewhere (see group_id and client_id variables in this file and the
KAFKA_GROUP_SUFFIX setting).
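The dot-separator convention described above can be illustrated as follows. The base name `"sse-bridge"` and the settings shape are assumptions for the sketch, not the actual bridge code:

```python
class SettingsSketch:
    # Mirrors the KAFKA_GROUP_SUFFIX setting referenced in the review
    KAFKA_GROUP_SUFFIX = "abc123.gw0"


settings = SettingsSketch()
base = "sse-bridge"  # hypothetical base consumer-group name

# Dot separator, matching the f"{base}.{suffix}" pattern the review says is
# used in coordinator.py, saga_orchestrator.py, and notification_service.py
group_id = f"{base}.{settings.KAFKA_GROUP_SUFFIX}"
client_id = f"{base}.{settings.KAFKA_GROUP_SUFFIX}"

print(group_id)  # sse-bridge.abc123.gw0
```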
In @backend/scripts/seed_users.py:
- Around line 28-35: The file docstring's Environment Variables section is
missing DATABASE_NAME; update the top-level docstring to document DATABASE_NAME
(matching SeedSettings.database_name) as an available env var, include a short
description like "database name for the application" and its default value
("integr8scode_db"), and ensure the name matches the code (DATABASE_NAME) so
environment users can find and override SeedSettings.database_name.
🧹 Nitpick comments (2)
backend/app/core/dishka_lifespan.py (1)
54-54: Minor: `enable_console_exporter` parameter is always False in this context.

Since line 48 already ensures `not settings.TESTING`, the `enable_console_exporter=settings.TESTING` parameter on line 54 will always evaluate to `False`. This parameter could be simplified to `enable_console_exporter=False` for clarity.

♻️ Simplify the parameter

```diff
- enable_console_exporter=settings.TESTING,
+ enable_console_exporter=False,
```

backend/tests/conftest.py (1)

22-37: Strong test isolation implementation.

The test_settings fixture correctly implements per-session and per-worker isolation by:
- Loading base configuration from .env.test
- Generating unique identifiers (session_id, worker-specific values)
- Creating isolated resources (DATABASE_NAME, REDIS_DB, Kafka topics/groups)
- Using Pydantic's model_copy for clean Settings customization
The worker_num calculation at line 27 (`sum(_WORKER_ID.encode()) % 16`) is a simple deterministic hash to distribute workers across Redis databases 0-15. While unconventional, it's functional and consistent.

Optional: Add clarifying comment for worker_num calculation

```diff
+ # Deterministic mapping of worker ID to Redis DB (0-15)
  worker_num = sum(_WORKER_ID.encode()) % 16
```
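The per-session/per-worker isolation described above can be sketched end to end. The `PYTEST_SESSION_ID` fallback follows the earlier review suggestion; exact fixture names are assumptions about conftest.py:

```python
import os
import uuid

# pytest-xdist sets PYTEST_XDIST_WORKER to "gw0", "gw1", ...;
# default to "gw0" for single-process runs
_WORKER_ID = os.environ.get("PYTEST_XDIST_WORKER", "gw0")

# CI can inject a stable session ID; otherwise generate a fresh 8-char hex one
session_id = os.environ.get("PYTEST_SESSION_ID") or uuid.uuid4().hex[:8]

# Deterministic mapping of the worker ID onto Redis DBs 0-15:
# sum of the UTF-8 byte values, reduced mod 16
worker_num = sum(_WORKER_ID.encode()) % 16

print(f"session={session_id} worker={_WORKER_ID} redis_db={worker_num}")
```

Because the mapping depends only on the worker ID string, each xdist worker lands on the same Redis DB across runs, while `session_id` keeps topic/database names unique per run.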
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
backend/.env.test
backend/app/core/dishka_lifespan.py
backend/app/core/middlewares/metrics.py
backend/app/events/schema/schema_registry.py
backend/app/services/sse/kafka_redis_bridge.py
backend/app/settings.py
backend/pyproject.toml
backend/scripts/seed_users.py
backend/tests/conftest.py
backend/tests/integration/events/test_admin_utils.py
backend/tests/load/config.py
💤 Files with no reviewable changes (1)
- backend/.env.test
🧰 Additional context used
🧬 Code graph analysis (4)
backend/tests/integration/events/test_admin_utils.py (2)
backend/tests/conftest.py (2)
app (42-53)
test_settings (22-37)
backend/app/settings.py (1)
Settings (11-161)
backend/app/core/dishka_lifespan.py (2)
backend/app/core/tracing/config.py (1)
init_tracing (177-197)
backend/app/core/tracing/models.py (2)
has_failures (61-63)
get_summary (57-59)
backend/tests/load/config.py (2)
frontend/src/lib/api/core/params.gen.ts (1)
Field (5-36)
backend/app/schemas_pydantic/replay.py (1)
duration_seconds (57-60)
backend/tests/conftest.py (2)
backend/app/settings.py (1)
Settings (11-161)
backend/app/main.py (1)
create_app (44-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Scan Backend
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
🔇 Additional comments (12)
backend/pyproject.toml (1)
213-213: Change looks good: OpenTelemetry is properly disabled during tests.

The `TESTING` flag is correctly read from the environment variable set in pytest configuration and explicitly disables OpenTelemetry metrics in backend/app/core/middlewares/metrics.py (line 125) and prevents tracing initialization in backend/app/core/dishka_lifespan.py (line 48). This centralized approach is cleaner than the previous `OTEL_SDK_DISABLED` flag and properly prevents telemetry overhead during test runs.

backend/scripts/seed_users.py (1)
81-83: LGTM! Clean migration to Settings pattern.

The refactor from `os.getenv()` to Pydantic `BaseSettings` is well-executed. The settings instance is created once per invocation, and all configuration values (URLs, database name, passwords) are now sourced consistently from the `SeedSettings` object.

Also applies to: 95-95, 105-105
backend/app/core/dishka_lifespan.py (1)
47-72: LGTM! Tracing isolation for tests looks correct.

The conditional initialization of tracing based on `ENABLE_TRACING` and `not TESTING` appropriately prevents exporter retries during tests, which aligns with the PR's goal of fixing flaky tests. The failure checking and separate logging paths are well-structured.

backend/tests/load/config.py (2)
12-42: LGTM! Clean migration to Pydantic BaseSettings.

The conversion from dataclass to Pydantic `BaseSettings` is well-executed:

- Environment variable loading with the `LOAD_` prefix is correctly configured
- `validation_alias` is appropriately used for backward compatibility with existing env var names
- All fields have explicit types and sensible defaults
- The `api()` method logic remains unchanged

This refactoring improves maintainability and aligns with the broader PR pattern of centralizing configuration through Settings objects.
5-6: pydantic-settings is already in project dependencies.
`pydantic-settings==2.5.2` is already declared in backend/pyproject.toml, so no further action is needed.

Likely an incorrect or invalid review comment.
backend/app/events/schema/schema_registry.py (1)
62-62: LGTM! Settings-based configuration for schema subject prefix.

The change from reading `SCHEMA_SUBJECT_PREFIX` via environment variable to using `settings.SCHEMA_SUBJECT_PREFIX` correctly centralizes configuration and enables the per-session/worker test isolation described in the PR objectives.

backend/app/settings.py (1)
76-76: LGTM! New setting for schema subject prefix isolation.

The addition of `SCHEMA_SUBJECT_PREFIX` with an empty string default is appropriate. This field enables per-session/worker isolation for Kafka schema subjects during tests while keeping production behavior unchanged (no prefix).

backend/tests/integration/events/test_admin_utils.py (2)
4-4: LGTM: Import addition supports test isolation.

The Settings import is required for the updated test signature and enables proper type hinting for the test_settings fixture.
11-12: Excellent: Settings-based test isolation implemented.

The switch from environment variables to the test_settings fixture enables proper per-session and per-worker topic isolation, directly addressing the flaky test issues mentioned in the PR objectives. The topic construction correctly combines the unique PREFIX and GROUP_SUFFIX to ensure test runs don't interfere with each other.
backend/tests/conftest.py (3)
17-17: LGTM: Standard pytest-xdist worker identification.

The worker ID extraction with a sensible default ("gw0") ensures the fixture works correctly both in parallel (pytest-xdist) and single-worker test runs.
42-48: Clean integration with test_settings fixture.

The app fixture now correctly receives and uses the test_settings instance, ensuring the FastAPI application is configured with the same isolated settings used throughout the test session. This maintains consistency across all test fixtures and components.
66-71: LGTM: Improved readability.

The multi-line formatting of the AsyncClient initialization enhances readability without any functional changes.
1 issue found across 2 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/core/metrics/base.py">
<violation number="1" location="backend/app/core/metrics/base.py:54">
P2: Removing the `config.otlp_endpoint` check means metrics export will be attempted even when no endpoint is configured. This could cause connection errors to the default `localhost:4317` endpoint and spawn unnecessary background threads. Consider preserving the endpoint check:
```python
if settings.TESTING or not settings.ENABLE_TRACING or not config.otlp_endpoint:
```</violation>
</file>
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/app/core/metrics/base.py (2)
42-79: Resource leak: MeterProvider and background threads are not cleaned up.

The `_create_meter` method creates a `PeriodicExportingMetricReader` (lines 61-64) that starts a background thread for periodic metric export, and an `SdkMeterProvider` (line 67) that manages these resources. However:

- The `MeterProvider` is not stored as an instance variable
- The `close()` method (lines 76-79) does nothing to shut down the provider or stop background threads

This causes resource leaks when `BaseMetrics` instances are destroyed, as the export threads continue running.

♻️ Proposed fix to store and clean up MeterProvider
```diff
 def _create_meter(self, config: MetricsConfig, meter_name: str) -> Meter:
     """Create a new meter instance for this collector.

     Args:
         config: Metrics configuration
         meter_name: Name for this meter

     Returns:
         A new meter instance
     """
     # If tracing/metrics disabled or no OTLP endpoint configured, use NoOp meter to avoid threads/network
     settings = get_settings()
     if settings.TESTING or not settings.ENABLE_TRACING:
+        self._meter_provider = None
         return NoOpMeterProvider().get_meter(meter_name)
     resource = Resource.create(
         {"service.name": config.service_name, "service.version": config.service_version, "meter.name": meter_name}
     )
     reader = PeriodicExportingMetricReader(
         exporter=OTLPMetricExporter(endpoint=config.otlp_endpoint),
         export_interval_millis=config.export_interval_millis,
     )
     # Each collector gets its own MeterProvider
     meter_provider = SdkMeterProvider(resource=resource, metric_readers=[reader])
+    self._meter_provider = meter_provider
     # Return a meter from this provider
     return meter_provider.get_meter(meter_name)

 def _create_instruments(self) -> None:
     """Create metric instruments. Override in subclasses."""
     pass

 def close(self) -> None:
     """Close the metrics collector and clean up resources."""
-    # Subclasses can override if they need cleanup
-    pass
+    if self._meter_provider is not None:
+        self._meter_provider.shutdown()
```
54-70: Add validation for missing OTLP endpoint before creating exporter.

The condition at line 54 returns a no-op meter when testing or tracing is disabled, but it does not validate that `config.otlp_endpoint` is configured. According to the documented behavior (metrics-reference.md), a no-op meter should be used when "no OTLP endpoint is configured," but the code only checks the `ENABLE_TRACING` flag. If `ENABLE_TRACING=true`, `TESTING=false`, and `OTEL_EXPORTER_OTLP_ENDPOINT` is not set (defaults to `None`), line 62 will instantiate `OTLPMetricExporter(endpoint=None)`, causing a runtime error.

Add an endpoint check to the conditional:

```diff
-if settings.TESTING or not settings.ENABLE_TRACING:
+if settings.TESTING or not settings.ENABLE_TRACING or not config.otlp_endpoint:
     return NoOpMeterProvider().get_meter(meter_name)
```

This aligns the implementation with the documented behavior and matches the pattern used elsewhere in the codebase (e.g., middlewares/metrics.py line 140).
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
backend/.env.test
backend/app/core/metrics/base.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/.env.test
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: cubic · AI code reviewer
- GitHub Check: E2E Tests
- GitHub Check: Integration Tests
1 issue found across 6 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/services/sse/kafka_redis_bridge.py">
<violation number="1" location="backend/app/services/sse/kafka_redis_bridge.py:65">
P2: Inconsistent separator used before `KAFKA_GROUP_SUFFIX`. The rest of the codebase uses `.` (dot) as the separator, but this change uses `-` (hyphen). Consider using `.{suffix}` for consistency with other Kafka consumer group naming patterns.</violation>
</file>
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @backend/app/core/middlewares/metrics.py:
- Line 140: The code calls OTLPMetricExporter with
settings.OTEL_EXPORTER_OTLP_ENDPOINT but Settings defaults that value to None;
add fallback logic so OTLPMetricExporter receives a concrete endpoint (e.g., use
settings.OTEL_EXPORTER_OTLP_ENDPOINT or "localhost:4317") or, if you prefer not
to add a fallback, update the comment above line 140 to state that the endpoint
must be explicitly configured and no default is applied; reference the
OTLPMetricExporter instantiation and settings.OTEL_EXPORTER_OTLP_ENDPOINT when
making the change.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
backend/app/core/middlewares/metrics.py
backend/app/core/providers.py
backend/app/events/core/producer.py
backend/app/services/sse/kafka_redis_bridge.py
backend/scripts/seed_users.py
backend/tests/integration/events/test_producer_roundtrip.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/scripts/seed_users.py
🧰 Additional context used
🧬 Code graph analysis (3)
backend/tests/integration/events/test_producer_roundtrip.py (3)
backend/tests/conftest.py (2)
scope (82-84)
test_settings (22-37)
backend/app/events/core/producer.py (1)
UnifiedProducer (30-290)
backend/app/events/core/types.py (1)
ProducerConfig (29-60)
backend/app/core/providers.py (1)
backend/app/events/core/producer.py (2)
UnifiedProducer (30-290)
producer (65-66)
backend/app/events/core/producer.py (1)
backend/app/settings.py (2)
Settings (11-161)
get_settings (165-166)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
🔇 Additional comments (4)
backend/app/services/sse/kafka_redis_bridge.py (1)
64-66: No changes needed. The KAFKA_GROUP_SUFFIX is properly defined with defaults.

The settings-based approach correctly supports test isolation. `KAFKA_GROUP_SUFFIX` is defined in backend/app/settings.py as a required `str` field with default value `"suff"`, and tests override it with a session-and-worker-based suffix (`f"{session_id}.{_WORKER_ID}"`). The field is type-hinted as `str` (not `str | None`), so it cannot be `None` or empty. The code at lines 64-66 is correct and will never produce malformed Kafka consumer identifiers.

backend/app/core/providers.py (1)
158-158: LGTM! Settings injection enables test isolation.

The addition of `settings=settings` to the `UnifiedProducer` initialization correctly propagates the injected Settings instance, enabling per-session/per-worker topic prefix isolation as described in the PR objectives.

backend/tests/integration/events/test_producer_roundtrip.py (1)
17-26: LGTM! Test isolation properly implemented.

The test signature and producer initialization correctly use `test_settings` to derive dynamic bootstrap servers and topic prefixes, enabling the per-session/per-worker isolation described in the PR objectives.

backend/app/events/core/producer.py (1)
37-50: LGTM! Clean settings injection with backward compatibility.

The optional `settings` parameter with fallback to `get_settings()` maintains backward compatibility while enabling test-specific configuration injection. The topic prefix resolution correctly uses the provided settings when available.
1 issue found across 5 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/events/core/producer.py">
<violation number="1" location="backend/app/events/core/producer.py:37">
P1: Using `get_settings()` as a default argument evaluates it at module import time, not at call time. This means all `UnifiedProducer` instances will share the same Settings captured when the module was loaded, defeating the test isolation goals of this PR. Use `None` as default and call `get_settings()` at runtime.</violation>
</file>
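The pitfall flagged above is worth seeing concretely: a Python default argument is evaluated once, when the function is defined (at import time), not on each call. The names below are illustrative stand-ins for `Settings`/`get_settings`, not the project's actual code:

```python
current_prefix = "prod."


def get_settings():
    # Stand-in for the real get_settings(): reads mutable global state
    return {"KAFKA_TOPIC_PREFIX": current_prefix}


def make_producer(settings=get_settings()):  # BAD: captured at definition time
    return settings["KAFKA_TOPIC_PREFIX"]


def make_producer_fixed(settings=None):  # GOOD: resolved at call time
    settings = settings if settings is not None else get_settings()
    return settings["KAFKA_TOPIC_PREFIX"]


current_prefix = "test123."  # e.g. a test overrides settings later

print(make_producer())        # prod.   (stale snapshot from import time)
print(make_producer_fixed())  # test123.
```

This is why the fix is `settings: Settings | None = None` with `get_settings()` called inside the body: each producer then sees the settings in effect when it is constructed.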
Actionable comments posted: 1
🧹 Nitpick comments (2)
backend/app/dlq/manager.py (2)
264-267: Retry policy lookup: normalization + default fallback looks correct; consider a tiny cleanup.

Behavior-wise this is solid (lookup by normalized topic, then fallback). If you want to keep it compact/readable:

Optional simplification

```diff
- retry_policy = self._retry_policies.get(self._normalize_topic(message.original_topic))
- retry_policy = retry_policy or self.default_retry_policy
+ retry_policy = (
+     self._retry_policies.get(self._normalize_topic(message.original_topic))
+     or self.default_retry_policy
+ )
```
438-440: Store key normalization is good, but keep `RetryPolicy.topic` consistent with the key (debuggability).

Since you're normalizing the dict key, consider also aligning the `RetryPolicy.topic` value with that normalized key (or at least asserting they match). Otherwise, inspecting/logging policies can be confusing (policy says `"test123.orders"` but it's stored under `"orders"`, or vice versa).

One way to enforce consistency (shape depends on RetryPolicy mutability):

```diff
 def set_retry_policy(self, topic: str, policy: RetryPolicy) -> None:
-    self._retry_policies[self._normalize_topic(topic)] = policy
+    key = self._normalize_topic(topic)
+    # Option A: assert to keep invariant strict
+    # assert policy.topic == key, f"RetryPolicy.topic={policy.topic} must match normalized key={key}"
+    #
+    # Option B: normalize the policy's topic too (if mutable / supported by your model)
+    # policy.topic = key
+    self._retry_policies[key] = policy
```
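The normalize-then-fallback lookup discussed in these comments can be sketched as follows. The prefix handling and the string "policies" are simplified assumptions; the real manager uses `RetryPolicy` objects:

```python
class DLQManagerSketch:
    def __init__(self, topic_prefix: str, default_retry_policy: str) -> None:
        self.topic_prefix = topic_prefix
        self.default_retry_policy = default_retry_policy
        self._retry_policies: dict[str, str] = {}

    def _normalize_topic(self, topic: str) -> str:
        # Strip the per-test/per-env prefix so policies key on logical names
        return topic.removeprefix(self.topic_prefix)

    def set_retry_policy(self, topic: str, policy: str) -> None:
        self._retry_policies[self._normalize_topic(topic)] = policy

    def get_retry_policy(self, topic: str) -> str:
        # Lookup by normalized topic, then fall back to the default
        return (
            self._retry_policies.get(self._normalize_topic(topic))
            or self.default_retry_policy
        )


mgr = DLQManagerSketch("test123.", "default-policy")
mgr.set_retry_policy("test123.orders", "orders-policy")
print(mgr.get_retry_policy("test123.orders"))   # orders-policy
print(mgr.get_retry_policy("test123.unknown"))  # default-policy
```

Normalizing on both write and read is what keeps a policy registered under a prefixed test topic reachable by its logical name, and vice versa.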
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
backend/app/core/container.py
backend/app/core/providers.py
backend/app/dlq/manager.py
backend/app/events/core/producer.py
backend/app/services/sse/kafka_redis_bridge.py
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/app/services/sse/kafka_redis_bridge.py
- backend/app/events/core/producer.py
🧰 Additional context used
🧬 Code graph analysis (2)
backend/app/core/container.py (1)
backend/app/core/providers.py (1)
SagaOrchestratorProvider(690-695)
backend/app/core/providers.py (1)
backend/app/events/core/producer.py (2)
UnifiedProducer (30-289)
producer (64-65)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: cubic · AI code reviewer
- GitHub Check: E2E Tests
- GitHub Check: Integration Tests
🔇 Additional comments (2)
backend/app/core/container.py (1)
38-57: Verify `SagaOrchestratorProvider()` scope/side-effects in the main app container (shared singleton across requests).

Adding `SagaOrchestratorProvider()` here means the web app container can now resolve (and potentially initialize) the saga orchestrator at `Scope.APP`, which may be shared across concurrent requests. Please confirm that's intended (thread-safety / no request-specific mutable state) and that initializing it at web-app startup won't introduce unwanted startup cost or background tasks.

backend/app/core/providers.py (1)
153-160: Good: pass `settings` into `UnifiedProducer` for deterministic topic-prefixing; verify constructor/API compatibility.

This makes the producer's topic prefix come from injected `Settings` (vs ambient/global settings), which should help test isolation. Please double-check `UnifiedProducer`'s constructor signature and that overriding settings here can't accidentally diverge from other producer configuration (e.g., any settings-derived behavior beyond `ProducerConfig`).
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/tests/integration/test_dlq_routes.py (1)
131-143: Use test isolation prefix for topic filtering test.

The test uses hardcoded topic `"execution-events"` without the test isolation prefix. Since other DLQ tests (test_dlq_retry_immediate, test_dlq_discard_policy, test_dlq_manager) all build topic names using `test_settings.KAFKA_TOPIC_PREFIX`, messages in the DLQ database are stored with prefixed topic names. The filtering does exact matching on `original_topic`, so filtering by the unprefixed name returns no results. The assertion on line 143 never executes.

Add a `test_settings` parameter and use `f"{test_settings.KAFKA_TOPIC_PREFIX}execution-events"` instead of the hardcoded topic name.
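The suggested fix amounts to building the topic from the fixture's prefix. A minimal sketch, with a stand-in class for the real test_settings fixture:

```python
class TestSettingsSketch:
    # Stand-in for the fixture's per-session prefix, e.g. "abc123."
    KAFKA_TOPIC_PREFIX = "abc123."


test_settings = TestSettingsSketch()

# Prefixed topic name, matching how the other DLQ tests store messages
topic = f"{test_settings.KAFKA_TOPIC_PREFIX}execution-events"
print(topic)  # abc123.execution-events
```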
🤖 Fix all issues with AI agents
In @backend/app/events/core/producer.py:
- Around line 9-19: The imports in producer.py are unsorted and include an
unused get_settings import causing ruff errors I001 and F401; reorder imports
into stdlib, third‑party, first‑party, then local groups (e.g., keep
confluent_kafka imports together with Producer and KafkaError in the third‑party
group), remove the unused get_settings symbol, and ensure first‑party imports
like app.core.lifecycle.LifecycleEnabled,
app.core.metrics.context.get_event_metrics,
app.dlq.models.DLQMessage/DLQMessageStatus, app.domain.enums.kafka.KafkaTopic,
app.events.schema.schema_registry.SchemaRegistryManager,
app.infrastructure.kafka.events.BaseEvent, app.settings.Settings, and local
.types imports (ProducerConfig, ProducerMetrics, ProducerState) are in the
correct group and sorted to satisfy ruff.
🧹 Nitpick comments (3)
backend/app/events/core/producer.py (2)
29-49: Settings injection looks right; consider letting the formatter handle indentation.

Requiring `settings: Settings` and deriving `_topic_prefix` from it aligns with the test-isolation goal. The parameter indentation in `__init__` looks off relative to typical black/ruff-format output; worth running the formatter to avoid churn.
174-176: Signature indentation drift in `produce`/`send_to_dlq`; prefer black-style wrapping for stability.

This is cosmetic, but matching formatter output reduces future diffs.

Example formatting:
```diff
-    async def produce(
-        self, event_to_produce: BaseEvent, key: str | None = None, headers: dict[str, str] | None = None
-    ) -> None:
+    async def produce(
+        self,
+        event_to_produce: BaseEvent,
+        key: str | None = None,
+        headers: dict[str, str] | None = None,
+    ) -> None:
@@
-    async def send_to_dlq(
-        self, original_event: BaseEvent, original_topic: str, error: Exception, retry_count: int = 0
-    ) -> None:
+    async def send_to_dlq(
+        self,
+        original_event: BaseEvent,
+        original_topic: str,
+        error: Exception,
+        retry_count: int = 0,
+    ) -> None:
```

Also applies to: 207-209
backend/tests/integration/test_dlq_routes.py (1)
35-35: Consider clarifying the purpose of the `test_user` parameter.
The `test_user` parameter appears in all test method signatures but is never explicitly used in the test bodies. If this fixture provides authentication via side effects (e.g., setting up auth tokens in the client), consider adding a brief comment to clarify this pattern. If it's genuinely unused, it could be removed.

Also applies to: 80-80, 116-116, 131-131, 146-146, 186-186, 198-198, 222-222, 259-259, 290-290, 328-328, 358-358
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- `backend/app/events/core/producer.py`
- `backend/tests/integration/test_dlq_routes.py`
🧰 Additional context used
🧬 Code graph analysis (2)
backend/tests/integration/test_dlq_routes.py (1)
backend/tests/conftest.py (1)
- `test_settings` (22-37)
backend/app/events/core/producer.py (4)
backend/app/settings.py (2)
- `Settings` (11-161)
- `get_settings` (165-166)

`backend/app/events/core/types.py` (1)
- `ProducerConfig` (29-60)

`backend/app/events/schema/schema_registry.py` (1)
- `SchemaRegistryManager` (53-229)

`backend/app/infrastructure/kafka/events/base.py` (1)
- `BaseEvent` (13-37)
🪛 GitHub Actions: Ruff Linting
backend/app/events/core/producer.py
[error] 1-19: I001 Import block is un-sorted or un-formatted. Organize imports (ruff) for the step: uv run ruff check . --config pyproject.toml
[error] 15-15: F401 'app.settings.get_settings' imported but unused. Remove unused import.
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Scan Backend
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
🔇 Additional comments (2)
backend/tests/integration/test_dlq_routes.py (2)
35-35: LGTM: Type annotations modernized to Python 3.9+ style.
The update from `Dict[str, str]` to `dict[str, str]` uses built-in generic types, which is the idiomatic approach for Python 3.9+.

Also applies to: 80-80, 116-116, 131-131, 146-146, 186-186, 198-198, 222-222, 259-259, 290-290, 328-328, 358-358
198-210: LGTM: Test isolation implemented correctly.
The addition of the `test_settings` parameter and the prefixed topic construction at line 204 properly implement per-session and per-worker test isolation, which aligns with the PR's goal to fix flaky tests.
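As an illustration of the prefixed-topic construction described above, a minimal sketch follows. The helper name and the exact dot-joining rules are assumptions for illustration, not the project's actual code (the tests inline this logic):

```python
def prefixed_topic(prefix: str, topic: str) -> str:
    """Join a per-session/worker prefix and a logical topic name with a single dot.

    Hypothetical helper: normalizes trailing/leading dots so callers can pass
    prefixes like "test.abc123.gw0." without producing double dots.
    """
    return f"{prefix.rstrip('.')}.{topic.lstrip('.')}"
```

With a session/worker prefix such as `test.abc123.gw0.`, each xdist worker ends up publishing to its own topic namespace, which is what gives the per-worker isolation.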
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 issue found across 3 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/app/dlq/manager.py">
<violation number="1">
P2: Removing topic normalization may cause retry policy lookup mismatches when `original_topic` in messages contains the Kafka prefix but policies are registered without it (or vice versa). The previous `_normalize_topic` method ensured both sides used unprefixed topic names for consistent matching. Consider ensuring that either: (1) all `original_topic` values are stored without prefix, and policies are set without prefix, or (2) restore normalization to handle inconsistencies gracefully.</violation>
</file>
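One way to restore the consistent matching the violation describes is a small normalization helper. This is a sketch under the assumption that retry policies are keyed by unprefixed topic names; the removed `_normalize_topic`'s original behavior may have differed:

```python
def normalize_topic(topic: str, prefix: str) -> str:
    """Strip the environment/topic prefix so retry-policy lookups use one canonical form.

    Sketch only: assumes policies are registered under unprefixed topic names,
    so both prefixed and already-unprefixed inputs normalize to the same key.
    """
    if prefix and topic.startswith(prefix):
        return topic[len(prefix):]
    return topic
```

Applying this on both the `original_topic` stored in DLQ messages and the policy registration key makes the lookup insensitive to whether the prefix was included.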
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
backend/tests/integration/test_events_routes.py (1)
310-325: Fix CSRF header in fixture-auth tests: use `test_user["headers"]` on mutating requests.
The test calls `client.post("/api/v1/events/publish", json=publish_request)` without passing `test_user["headers"]`. Since the backend enforces CSRF on POST/PUT/DELETE for authenticated sessions, this request fails CSRF validation before reaching the authorization check. Add `headers=test_user["headers"]` to all mutating requests in fixture-auth tests.

backend/tests/integration/test_user_settings_routes.py (1)
The test callsclient.post("/api/v1/events/publish", json=publish_request)without passingtest_user["headers"]. Since the backend enforces CSRF on POST/PUT/DELETE for authenticated sessions, this request fails CSRF validation before reaching the authorization check. Addheaders=test_user["headers"]to all mutating requests in fixture-auth tests.backend/tests/integration/test_user_settings_routes.py (1)
343-382: Assert login/logout success and replace the fragile default-based comparison with a `user_id` check.
The test relies on fixture execution order without verifying login success, and the final assertion uses OR logic (`!= theme OR != timezone`) that only needs one condition to pass; this can flake if defaults coincidentally match the chosen values. Additionally, the comment acknowledges the fixture order dependency but doesn't eliminate it robustly.

Asserting login/logout status codes and comparing `user_id` fields makes the test deterministic and independent of default settings.

Proposed patch:

```diff
 async def test_settings_isolation_between_users(self, client: AsyncClient, test_user: Dict[str, str], another_user: Dict[str, str]) -> None:
     """Test that settings are isolated between users."""
-    # Login as first user (fixture logs in another_user last, so re-login as test_user)
+    # Login as first user (do not rely on fixture execution order)
     login_data = {
         "username": test_user["username"],
         "password": test_user["password"]
     }
-    await client.post("/api/v1/auth/login", data=login_data)
+    r_login_1 = await client.post("/api/v1/auth/login", data=login_data)
+    assert r_login_1.status_code == 200

     # Update first user's settings
     user1_update = {
         "theme": "dark",
         "timezone": "America/New_York"
     }
     response = await client.put("/api/v1/user/settings/", json=user1_update)
     assert response.status_code == 200
+    user1_settings = UserSettings(**response.json())

     # Log out
-    await client.post("/api/v1/auth/logout")
+    r_logout = await client.post("/api/v1/auth/logout")
+    assert r_logout.status_code in (200, 204)

     # Login as second user
     login_data = {
         "username": another_user["username"],
         "password": another_user["password"]
     }
-    await client.post("/api/v1/auth/login", data=login_data)
+    r_login_2 = await client.post("/api/v1/auth/login", data=login_data)
+    assert r_login_2.status_code == 200

     # Get second user's settings
     response = await client.get("/api/v1/user/settings/")
     assert response.status_code == 200
-    user2_settings = response.json()
+    user2_settings = UserSettings(**response.json())

-    # Verify second user's settings are not affected by first user's changes
-    # Second user should have default settings, not the first user's custom settings
-    assert user2_settings["theme"] != user1_update["theme"] or user2_settings["timezone"] != user1_update[
-        "timezone"]
+    # Verify isolation via identity (robust across varying defaults)
+    assert user2_settings.user_id != user1_settings.user_id
```

backend/tests/integration/test_admin_routes.py (1)
30-60: Avoid asserting hard-coded default values; prefer validating type, constraints, and valid ranges instead.
The test asserts exact values for monitoring settings (e.g., `enable_tracing is True`, `sampling_rate == 0.1`), but these are hard-coded Pydantic field defaults that may change or be configurable. While `.env.test` configuration doesn't affect these specific admin settings defaults, the test is still brittle by coupling to exact values rather than validating that returned values conform to their schema constraints.

Example (less brittle) assertions:

```diff
-    assert settings.monitoring_settings.enable_tracing is True
-    assert settings.monitoring_settings.sampling_rate == 0.1
+    assert isinstance(settings.monitoring_settings.enable_tracing, bool)
+    assert isinstance(settings.monitoring_settings.sampling_rate, (int, float))
+    assert 0.0 <= settings.monitoring_settings.sampling_rate <= 1.0
```

backend/tests/conftest.py (1)
99-139: Auth fixtures: don't silently accept 400 on register, and set the CSRF header on the authenticated client.
Accepting `400` on register can hide real regressions (e.g., if validation tightens unexpectedly). More critically, the CSRF token returned from login must be set on `client.headers["X-CSRF-Token"]` because the API enforces CSRF validation on POST/PUT/PATCH/DELETE requests, and the current fixtures leave it unset; this will cause authenticated requests to fail CSRF checks.

Proposed patch:

```diff
 async def _http_login(client: httpx.AsyncClient, username: str, password: str) -> str:
     data = {"username": username, "password": password}
     resp = await client.post("/api/v1/auth/login", data=data)
     resp.raise_for_status()
-    return resp.json().get("csrf_token", "")
+    token = resp.json().get("csrf_token", "")
+    if not token:
+        raise RuntimeError("Login succeeded but no csrf_token returned")
+    return token


 @pytest_asyncio.fixture
 async def test_user(client: httpx.AsyncClient):
     """Function-scoped authenticated user."""
     uid = uuid.uuid4().hex[:8]
     creds = {
         "username": f"test_user_{uid}",
         "email": f"test_user_{uid}@example.com",
         "password": "TestPass123!",
         "role": "user",
     }
     r = await client.post("/api/v1/auth/register", json=creds)
-    if r.status_code not in (200, 201, 400):
+    if r.status_code not in (200, 201):
         pytest.fail(f"Cannot create test user (status {r.status_code}): {r.text}")
     csrf = await _http_login(client, creds["username"], creds["password"])
+    client.headers.update({"X-CSRF-Token": csrf})
     return {**creds, "csrf_token": csrf, "headers": {"X-CSRF-Token": csrf}}


 @pytest_asyncio.fixture
 async def test_admin(client: httpx.AsyncClient):
     """Function-scoped authenticated admin."""
     uid = uuid.uuid4().hex[:8]
     creds = {
         "username": f"admin_user_{uid}",
         "email": f"admin_user_{uid}@example.com",
         "password": "AdminPass123!",
         "role": "admin",
     }
     r = await client.post("/api/v1/auth/register", json=creds)
-    if r.status_code not in (200, 201, 400):
+    if r.status_code not in (200, 201):
         pytest.fail(f"Cannot create test admin (status {r.status_code}): {r.text}")
     csrf = await _http_login(client, creds["username"], creds["password"])
+    client.headers.update({"X-CSRF-Token": csrf})
     return {**creds, "csrf_token": csrf, "headers": {"X-CSRF-Token": csrf}}
```
🧹 Nitpick comments (1)
backend/tests/conftest.py (1)
63-73: HTTP client: consider whether `follow_redirects=True` is desirable for integration assertions.
If any endpoint accidentally redirects (302/307), this setting can mask issues by auto-following. If you want strictness, disable it; otherwise OK.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
- `backend/app/events/core/producer.py`
- `backend/tests/conftest.py`
- `backend/tests/e2e/test_execution_routes.py`
- `backend/tests/integration/test_admin_routes.py`
- `backend/tests/integration/test_events_routes.py`
- `backend/tests/integration/test_health_routes.py`
- `backend/tests/integration/test_notifications_routes.py`
- `backend/tests/integration/test_replay_routes.py`
- `backend/tests/integration/test_saga_routes.py`
- `backend/tests/integration/test_saved_scripts_routes.py`
- `backend/tests/integration/test_user_settings_routes.py`
💤 Files with no reviewable changes (6)
- backend/tests/integration/test_saved_scripts_routes.py
- backend/tests/e2e/test_execution_routes.py
- backend/tests/integration/test_replay_routes.py
- backend/tests/integration/test_health_routes.py
- backend/tests/integration/test_saga_routes.py
- backend/tests/integration/test_notifications_routes.py
🧰 Additional context used
🧬 Code graph analysis (3)
backend/app/events/core/producer.py (1)
backend/app/events/core/types.py (1)
- `ProducerConfig` (29-60)
backend/tests/integration/test_user_settings_routes.py (1)
backend/tests/conftest.py (1)
- `another_user` (142-162)
backend/tests/conftest.py (2)
backend/app/settings.py (1)
- `Settings` (11-161)

`backend/app/main.py` (1)
- `create_app` (44-127)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
🔇 Additional comments (6)
backend/app/events/core/producer.py (2)
49-49: Good architectural improvement for test isolation.
Using the injected settings instance for topic prefix configuration supports per-session/worker isolation and improves testability by eliminating runtime dependencies on global settings.

32-37: All instantiation sites correctly pass the settings parameter. The breaking change has been properly implemented across the codebase.

backend/tests/conftest.py (2)
41-54: App lifecycle: good move passing `test_settings` into `create_app`, and closing the Dishka container.
This should reduce cross-worker config bleed and resource leakage.
21-37: Remove the redundant `% 16` modulo on the `REDIS_DB` assignment.
The code applies modulo 16 twice: once in `worker_num = sum(_WORKER_ID.encode()) % 16` and again in `"REDIS_DB": worker_num % 16`. The second modulo is redundant and should be removed. Additionally, note that with pytest-xdist runs exceeding 16 workers, Redis database collisions are unavoidable given Redis's default 16-database limit (0-15).

Suggested fix:

```diff
 @pytest.fixture(scope="session")
 def test_settings() -> Settings:
     """Provide test settings with a unique Kafka topic prefix for isolation."""
     base = Settings(_env_file=".env.test", _env_file_encoding="utf-8")
     session_id = uuid.uuid4().hex[:8]
     base_prefix = f"{base.KAFKA_TOPIC_PREFIX.rstrip('.')}."
     worker_num = sum(_WORKER_ID.encode()) % 16
     unique_prefix = f"{base_prefix}{session_id}.{_WORKER_ID}."
     return base.model_copy(
         update={
             "DATABASE_NAME": f"integr8scode_test_{session_id}_{_WORKER_ID}",
-            "REDIS_DB": worker_num % 16,
+            "REDIS_DB": worker_num,
             "KAFKA_GROUP_SUFFIX": f"{session_id}.{_WORKER_ID}",
             "SCHEMA_SUBJECT_PREFIX": f"test.{session_id}.{_WORKER_ID}.",
             "KAFKA_TOPIC_PREFIX": unique_prefix,
         }
     )
```

backend/tests/integration/test_admin_routes.py (1)
111-121: No issues found; test is correct.
The `test_user` fixture properly establishes both authentication and CSRF state: it logs in via `_http_login()` (which sets session cookies and retrieves the CSRF token), then returns the CSRF token in the dict. However, CSRF validation is skipped for GET requests per the security implementation (see `validate_csrf_token()` in `backend/app/core/security.py`, which exempts GET/HEAD/OPTIONS methods). The expected 403 status is correct and comes from the `admin_user` dependency enforcing the admin role requirement, not from missing CSRF setup.

Likely an incorrect or invalid review comment.

backend/tests/integration/test_events_routes.py (1)
backend/tests/integration/test_events_routes.py (1)
383-399: CSRF validation is not currently enforced on these endpoints.
The DELETE and POST routes (`delete_event` at line 277 and `replay_aggregate_events` at line 308 of `backend/app/api/routes/events.py`) only depend on `admin_user` for authorization. The `validate_csrf_token` dependency defined in `backend/app/core/security.py` is never applied to these endpoints, so CSRF token validation does not occur. The tests correctly return 403 for non-admin users due to role-based access control, not CSRF failures. The fixture infrastructure is sound (`httpx.AsyncClient` persists auth cookies automatically), but CSRF enforcement is not active on these routes.

Likely an incorrect or invalid review comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
backend/tests/integration/idempotency/test_consumer_idempotent.py (1)
62-63: Consider polling for consumer readiness instead of a fixed sleep.
While the 2-second sleep addresses the flaky test, a fixed delay may still fail on slower CI systems or waste time on faster ones. Consider polling for partition assignments instead:

♻️ Suggested more robust approach

```diff
 await wrapper.start([KafkaTopic.EXECUTION_EVENTS])
-# Allow time for consumer to join group and get partition assignments
-await asyncio.sleep(2)
+
+# Wait for consumer to join group and get partition assignments
+async def _consumer_ready():
+    assignments = wrapper.consumer.assignment()
+    assert assignments, "Consumer should have partition assignments"
+
+await eventually(_consumer_ready, timeout=5.0, interval=0.1)
```

This approach verifies the consumer is actually ready rather than assuming a fixed duration is sufficient.
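For reference, a polling helper of this shape can be written in a few lines. This is a sketch; the project's actual `eventually` in `tests/helpers/eventually.py` may differ in signature and behavior:

```python
import asyncio
import time


async def eventually(check, timeout: float = 5.0, interval: float = 0.1) -> None:
    """Re-run `check` until it stops raising AssertionError or the timeout expires.

    `check` may be a plain callable or return a coroutine; the last
    AssertionError is re-raised if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = check()
            if asyncio.iscoroutine(result):
                await result
            return
        except AssertionError:
            if time.monotonic() >= deadline:
                raise
            await asyncio.sleep(interval)
```

The key property is that the assertion inside `check` doubles as both the readiness condition and the failure message when the timeout is hit.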
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
backend/tests/integration/idempotency/test_consumer_idempotent.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/tests/integration/idempotency/test_consumer_idempotent.py (1)
backend/tests/helpers/events.py (1)
- `make_execution_requested_event` (8-50)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: E2E Tests
- GitHub Check: Integration Tests
🔇 Additional comments (1)
backend/tests/integration/idempotency/test_consumer_idempotent.py (1)
14-15: LGTM! The import reorganization is clean and maintains the same functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In @backend/tests/integration/events/test_consume_roundtrip.py:
- Around line 16-18: The xdist_group("kafka_consumers") (and similar
kafka_consumers, dlq, user_settings) markers are ineffective because
pyproject.toml uses addopts = "-n 4 --dist loadfile"; change the pytest config
to use --dist=loadgroup (update pyproject.toml addopts to "-n 4 --dist
loadgroup") so the xdist_group markers are honored, or alternatively remove the
xdist_group(...) markers from the affected test files if you prefer to keep
--dist=loadfile; update whichever of these two places (the xdist_group markers
in the test files or the addopts line in pyproject.toml) to make the behavior
consistent.
In @backend/tests/integration/events/test_consumer_lifecycle.py:
- Around line 10-13: The xdist_group marker in the test prevents librdkafka
races only when pytest runs in parallel; update the GitHub Actions job that runs
the integration tests (the workflow step invoking "pytest tests/integration") to
add parallelization grouping by appending "-n auto --dist=loadgroup" to the
pytest command so the xdist_group("kafka_consumers") marker takes effect under
CI when tests run with xdist.
In @backend/tests/integration/events/test_event_store_consumer.py:
- Around line 16-23: Add the pytest marker registration for the xdist_group
marker in the pytest config so pytest no longer warns; specifically, update the
[tool.pytest.ini_options] markers list to include the entry "xdist_group: marks
tests for pytest-xdist grouping" so the marker used by pytestmark (xdist_group)
in test_event_store_consumer.py is registered.
In @backend/tests/integration/idempotency/test_consumer_idempotent.py:
- Around line 69-70: Replace the fixed await asyncio.sleep(2) with a readiness
wait that polls the consumer assignment; call the wrapper.assignment() via the
UnifiedConsumer wrapper inside eventually() (e.g. eventually(lambda:
bool(wrapper.assignment()), timeout=10.0, interval=0.1)) so the test waits for
actual partition assignments and avoids flakiness or extra delay; alternatively
you may remove the sleep entirely if you rely on the subsequent eventually(_one,
...) call to synchronize readiness.
🧹 Nitpick comments (1)
backend/tests/integration/idempotency/test_consumer_idempotent.py (1)
14-16: Avoid relying on `tests.helpers` package side effects for a single factory import.
If this import was moved to dodge circular-import/ordering issues, consider importing the concrete module instead (less likely to reintroduce cycles via `tests/helpers/__init__.py`), e.g. `from tests.helpers.events import make_execution_requested_event`.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
- `backend/tests/integration/events/test_consume_roundtrip.py`
- `backend/tests/integration/events/test_consumer_lifecycle.py`
- `backend/tests/integration/events/test_event_dispatcher.py`
- `backend/tests/integration/events/test_event_store_consumer.py`
- `backend/tests/integration/idempotency/test_consumer_idempotent.py`
- `backend/tests/integration/result_processor/test_result_processor.py`
🧰 Additional context used
🧬 Code graph analysis (4)
backend/tests/integration/events/test_consumer_lifecycle.py (1)
backend/app/events/core/types.py (1)
- `ConsumerConfig` (64-102)
backend/tests/integration/result_processor/test_result_processor.py (6)
backend/tests/conftest.py (2)
- `app` (42-53)
- `db` (88-90)

`backend/tests/unit/conftest.py` (2)
- `app` (25-26)
- `db` (10-11)

`backend/app/db/repositories/execution_repository.py` (1)
- `ExecutionRepository` (17-91)

`backend/app/domain/enums/kafka.py` (1)
- `KafkaTopic` (7-53)

`backend/app/domain/execution/models.py` (1)
- `DomainExecutionCreate` (73-80)

`backend/tests/helpers/eventually.py` (1)
- `eventually` (7-32)
backend/tests/integration/events/test_event_store_consumer.py (1)
backend/app/domain/enums/auth.py (1)
- `LoginMethod` (4-10)
backend/tests/integration/idempotency/test_consumer_idempotent.py (2)
backend/tests/helpers/events.py (1)
- `make_execution_requested_event` (8-50)

`backend/tests/helpers/eventually.py` (1)
- `eventually` (7-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Scan Backend
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
🔇 Additional comments (6)
backend/tests/integration/events/test_event_dispatcher.py (1)
16-18: Ensure `xdist_group` is actually enforced (CI flags + marker registration).
`pytest.mark.xdist_group("kafka_consumers")` typically only changes scheduling if xdist is run with the grouping dist mode (e.g. `--dist=loadgroup`). Also, if your config enables strict markers, you'll need to register `xdist_group` in `pytest.ini`/`pyproject.toml` to avoid "unknown marker" failures.

backend/tests/integration/events/test_consumer_lifecycle.py (1)
21-24: `ConsumerConfig` reformat is fine. No behavioral changes.

backend/tests/integration/events/test_event_store_consumer.py (1)

6-7: The `LoginMethod` import looks correct and is used.

backend/tests/integration/result_processor/test_result_processor.py (2)
25-32: The `xdist_group("kafka_consumers")` addition is consistent with the rest of the suite.
As above: make sure loadgroup is enabled; otherwise this won’t serialize anything.
5-5: No action required; `DomainExecutionCreate` is properly re-exported from `app.domain.execution`.
The `__init__.py` file in `app/domain/execution` correctly imports and exports `DomainExecutionCreate`, so the import path used here and elsewhere is valid.

backend/tests/integration/idempotency/test_consumer_idempotent.py (1)
18-25: The `xdist_group` containment strategy is properly supported and will work as intended.
pytest-xdist 3.6.1 is already in the dev dependencies (`backend/pyproject.toml`), and the `xdist_group` marker is used consistently across multiple Kafka-related test files. The strategy correctly addresses the documented librdkafka race condition in multi-worker environments.

Minor improvement: register `xdist_group` in the `[tool.pytest.ini_options]` markers list for consistency:

```toml
markers = [
    ...
    "xdist_group: marks tests to run serially in pytest-xdist",
]
```
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
.github/workflows/backend-ci.yml (1)
50-50: Consider making test duration reporting configurable.
The `--durations=0` flag reports execution time for all tests, which significantly increases CI log verbosity. While useful for profiling slow tests, this overhead may not be necessary for every CI run.

Consider making this configurable:
- Use a workflow input or environment variable to enable/disable duration reporting
- Document the rationale if this is intended to be permanent
- Alternatively, use `--durations=10` to report only the slowest 10 tests

♻️ Example: Make duration reporting configurable

Add a workflow input at the top of the file:

```yaml
on:
  workflow_dispatch:
    inputs:
      durations:
        description: 'Number of slowest tests to report (0 for all, empty to disable)'
        required: false
        default: ''
```

Then use it conditionally in test commands:

```diff
       run: |
         cd backend
         uv run pytest tests/unit -v -rs \
-          --durations=0 \
+          ${{ github.event.inputs.durations && format('--durations={0}', github.event.inputs.durations) || '' }} \
           --cov=app \
           --cov-report=xml --cov-report=term
```

This allows enabling duration reporting on-demand via workflow_dispatch while keeping regular CI runs clean.
Also applies to: 120-120, 215-215
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
.github/workflows/backend-ci.yml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/tests/integration/test_notifications_routes.py (1)
125-138: Skipping on HTTP 500 may mask backend bugs.
Skipping the test when the backend returns 500 for a non-existent resource could hide a real bug. A missing notification should return 404, and if it returns 500, that's an unhandled exception in the backend that warrants investigation rather than being skipped.

Consider removing the skip or adding a `pytest.xfail` to track this as a known issue that needs fixing.
pytest.xfailto track this as a known issue that needs fixing.Alternative: use xfail instead of skip
```diff
 fake_notification_id = "00000000-0000-0000-0000-000000000000"
 response = await test_user.put(f"/api/v1/notifications/{fake_notification_id}/read")
-# Prefer 404; if backend returns 500, treat as unavailable feature
-if response.status_code == 500:
-    pytest.skip("Backend returns 500 for unknown notification IDs")
+# Backend should return 404 for non-existent resources, not 500
+if response.status_code == 500:
+    pytest.xfail("Backend returns 500 instead of 404 for unknown notification IDs - needs fix")
 assert response.status_code == 404
```

backend/Dockerfile (1)
22-43: Wait for both the TLS key and cert file before starting Gunicorn.
The wait loop only checks for `/app/certs/server.key`, but Gunicorn requires both the key and cert to start; it will fail if `server.crt` is missing. Update the condition to verify both files exist.

Proposed patch:

```diff
-while [ ! -f /app/certs/server.key ]; do
+while [ ! -f /app/certs/server.key ] || [ ! -f /app/certs/server.crt ]; do
   echo "Waiting for TLS certs..."
   sleep 2
 done
```
🤖 Fix all issues with AI agents
In @backend/app/core/middlewares/csrf.py:
- Around line 40-51: The middleware currently returns the exception message to
clients when catching CSRFValidationError; replace the client-facing content in
the JSONResponse with a constant generic message like "CSRF validation failed"
while keeping the specific error for server logs (e.g., log e with your logger)
and still returning status_code=403; update the except block that catches
CSRFValidationError raised by security_service.validate_csrf_from_request and
construct the JSONResponse with the generic string instead of str(e) while
ensuring the original exception is logged server-side.
- Around line 31-37: The code creates a per-request scope to resolve an
APP-scoped SecurityService which is unnecessary and risky; replace the async
with container() context and request_scope usage by resolving the service
directly from the container: get the container from
scope["app"].state.dishka_container and call await
container.get(SecurityService) to obtain the SecurityService instance (mirror
the pattern used in dishka_lifespan.py), then use that instance for the
middleware without entering a request scope.
🧹 Nitpick comments (3)
backend/tests/integration/test_sse_routes.py (2)
36-42: Unused `test_user` parameter.
The `test_user: AsyncClient` parameter is injected but never used in this test. If it's only needed to ensure authentication state exists (fixture side-effect), consider adding a comment to clarify this intent.
136-141: Unused `test_user` parameter and misleading test name.

- The `test_user` parameter is unused in this test.
- The test name suggests it checks headers, but the implementation only validates that the health endpoint returns 200 and a dict. The docstring acknowledges streaming tests are done via service, but the test name could be more accurate.

Consider renaming for clarity:

```diff
-    async def test_sse_endpoint_returns_correct_headers(self, client: AsyncClient, test_user: AsyncClient) -> None:
-        """Test SSE health endpoint works (streaming tests done via service)."""
+    async def test_sse_health_endpoint_returns_json(self, client: AsyncClient, test_user: AsyncClient) -> None:
+        """Test SSE health endpoint returns valid JSON (test_user ensures auth context)."""
```

backend/tests/integration/test_notifications_routes.py (1)
52-60: Use the `NotificationSeverity` enum for consistency.
Line 55 uses hardcoded strings for severity validation, while lines 54 and 57 correctly use the imported enums. For consistency and maintainability, use the `NotificationSeverity` enum.

Suggested fix:

```diff
 for n in notifications_response.notifications:
     assert n.notification_id
     assert n.channel in [c.value for c in NotificationChannel]
-    assert n.severity in ["low","medium","high","urgent"]
+    assert n.severity in [s.value for s in NotificationSeverity]
     assert isinstance(n.tags, list)
     assert n.status in [s.value for s in NotificationStatus]
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- `backend/Dockerfile`
- `backend/app/core/middlewares/csrf.py`
- `backend/app/domain/enums/__init__.py`
- `backend/app/domain/enums/sse.py`
- `backend/app/schemas_pydantic/sse.py`
- `backend/app/services/sse/sse_service.py`
- `backend/tests/helpers/sse.py`
- `backend/tests/integration/test_notifications_routes.py`
- `backend/tests/integration/test_sse_routes.py`
- `backend/tests/unit/services/sse/test_kafka_redis_bridge.py`
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/tests/unit/services/sse/test_kafka_redis_bridge.py
🧰 Additional context used
🧬 Code graph analysis (8)
backend/app/domain/enums/__init__.py (1)
backend/app/domain/enums/sse.py (2)
- `SSEControlEvent` (4-12)
- `SSENotificationEvent` (15-21)
backend/app/domain/enums/sse.py (1)
backend/app/core/utils.py (1)
- `StringEnum` (6-31)
backend/app/core/middlewares/csrf.py (1)
backend/app/core/security.py (2)
- `SecurityService` (23-122)
- `validate_csrf_from_request` (84-122)
backend/tests/integration/test_notifications_routes.py (5)
backend/tests/conftest.py (4)
- `app` (46-57)
- `test_user` (142-153)
- `scope` (86-88)
- `test_admin` (157-168)

`backend/app/services/notification_service.py` (1)
- `NotificationService` (115-951)

`backend/tests/unit/services/sse/test_sse_service.py` (1)
- `get` (32-41)

`backend/app/services/coordinator/queue_manager.py` (1)
- `user_id` (34-35)

`frontend/src/stores/notificationStore.ts` (1)
- `delete` (77-87)
backend/tests/helpers/sse.py (2)
backend/tests/conftest.py (1)
- `client` (68-76)

`backend/tests/unit/services/sse/test_sse_service.py` (1)
- `get` (32-41)
backend/app/services/sse/sse_service.py (2)
backend/app/domain/enums/sse.py (2)
- `SSEControlEvent` (4-12)
- `SSENotificationEvent` (15-21)

`backend/app/schemas_pydantic/sse.py` (2)
- `SSEExecutionEventData` (16-55)
- `SSENotificationEventData` (66-88)
backend/app/schemas_pydantic/sse.py (1)
backend/app/domain/enums/sse.py (2)
- `SSEControlEvent` (4-12)
- `SSENotificationEvent` (15-21)
backend/tests/integration/test_sse_routes.py (4)
backend/app/infrastructure/kafka/events/metadata.py (1)
- `AvroEventMetadata` (9-31)

`backend/app/schemas_pydantic/sse.py` (2)
- `RedisNotificationMessage` (91-101)
- `SSEHealthResponse` (115-125)

`backend/app/services/sse/redis_bus.py` (2)
- `SSERedisBus` (44-87)
- `publish_notification` (79-81)

`backend/app/services/sse/sse_service.py` (3)
- `SSEService` (25-282)
- `create_notification_stream` (197-261)
- `create_execution_stream` (52-119)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: Scan Frontend
- GitHub Check: E2E Tests
- GitHub Check: cubic · AI code reviewer
- GitHub Check: E2E Tests
- GitHub Check: Integration Tests
🔇 Additional comments (29)
backend/tests/integration/test_sse_routes.py (3)
44-84: Well-structured streaming test with proper synchronization. The test correctly waits for the `subscribed` event before publishing, ensuring the Redis subscription is established. The use of `aclosing` for proper async generator cleanup and `asyncio.timeout` for bounded execution is good practice.

One minor observation: the `TimeoutError` is silently caught (lines 81-82), which means that if the notification is never received within 3 seconds, the test will fail at the assertion on line 84 with a descriptive message. This is acceptable behavior.
86-124: Execution stream test follows the same solid pattern. The test properly synchronizes on the `subscribed` event before publishing the `PodCreatedEvent`, and the assertion provides helpful debug output on failure. The pattern is consistent with `test_notification_stream_service`.

150-158: Early break may not verify the "connected" event correctly. The `break` on line 157 exits after the first event regardless of whether a "connected" event was found. If the first event has `"data"` but a different `event_type`, the function returns `False` correctly, but if the first event lacks `"data"`, it still breaks and returns `False`. This seems intentional but could miss the "connected" event if it is not first.

Given the SSE service emits "connected" as the first event (per `create_notification_stream` in `sse_service.py`), this should work correctly. The logic is acceptable.

backend/app/domain/enums/__init__.py (1)

10-10: Clean export additions following existing conventions. The new SSE enum exports are properly added with a section comment, consistent with the existing organization pattern in `__all__`.

Also applies to: 30-32
backend/app/domain/enums/sse.py (1)
1-21: Well-designed SSE event type enums. The separation into `SSEControlEvent` (for execution streams) and `SSENotificationEvent` (for notification streams) provides clear type safety. Using `StringEnum` ensures these serialize as plain strings in JSON output, which is important for SSE payload compatibility.

The shared values (`CONNECTED`, `SUBSCRIBED`, `HEARTBEAT`) between both enums are appropriate since both stream types need these control signals.

backend/app/services/sse/sse_service.py (4)
10-10: Good import addition for typed SSE events. The enum import centralizes event type definitions, improving type safety and maintainability.

85-93: Important synchronization point for clients. The new `SUBSCRIBED` event provides a reliable signal that the Redis subscription is established and it is safe for clients/tests to publish events. This eliminates race conditions where events could be published before the subscription is ready.

This aligns well with the test changes in `test_sse_routes.py` that wait for this event before publishing.

215-223: Consistent SUBSCRIBED event for notification streams. Same synchronization pattern applied to notification streams, ensuring parity with execution streams.

56-65: Consistent enum usage across all event emissions. All event type string literals have been replaced with their corresponding enum members (`SSEControlEvent.*` and `SSENotificationEvent.*`). This provides compile-time type checking and eliminates the risk of typos in event type strings.

Also applies to: 71-78, 97-105, 131-139, 144-151, 203-210, 230-237, 243-255
backend/app/schemas_pydantic/sse.py (3)
9-9: Type-safe event_type with union of EventType and SSEControlEvent. The union type `EventType | SSEControlEvent` correctly models that execution stream events can be either business events from Kafka (`EventType`) or control signals (`SSEControlEvent`). Pydantic will validate incoming data against both enum types.

Also applies to: 25-27

73-73: Clean enum typing for notification events. Using `SSENotificationEvent` directly (without a union) is appropriate since notification stream events are all control/notification events, not Kafka business events.

2-2: Verify `Dict` and `Any` imports are used. Line 2 imports `Dict` and `Any` from typing. Both `Dict` and `Any` are used on line 63, and `TypeVar` is used on line 13. The imports appear valid.

backend/tests/helpers/sse.py (3)
8-8: LGTM: Type annotation improvement. The return type annotation has been tightened from `AsyncIterator[dict]` to `AsyncIterator[dict[str, object]]`, providing better type safety and clarity.

44-51: LGTM: Type annotation improvement. The return type annotation has been tightened to `dict[str, object]`, consistent with the other helper functions.

54-61: LGTM: Type annotation improvement. The return type annotation has been tightened to `dict[str, object]`, maintaining consistency across all helper functions.

backend/tests/integration/test_notifications_routes.py (13)
1-16: LGTM! The imports are well-organized and align with the refactoring to use fixtures and DI-based service injection.

23-33: LGTM! Correctly uses the unauthenticated `client` fixture to verify that authentication is required.

62-80: LGTM! Good use of the `NotificationStatus` enum for filtering. The assertions correctly validate that returned notifications match the requested status.

82-97: LGTM! Clean test with appropriate assertions for the unread count response structure.

98-123: LGTM! The test correctly handles the conditional case where no delivered notifications exist. The verification loop properly checks that the status changed to "read".

140-176: LGTM! Excellent test design - creating a notification via `NotificationService` ensures the test has data to work with, making it deterministic. The assertion message on lines 174-176 is helpful for debugging failures.

178-209: LGTM! Thorough validation of subscription response structure including optional webhook URL format checks.

211-297: LGTM! The subscription update tests are well-structured. The Slack test appropriately handles the 422 case when Slack integration is disabled by configuration.

299-326: LGTM! Good verification pattern - checking that the deleted notification ID is no longer present in the list response.

328-338: LGTM! Note: This test correctly expects 404 without the skip-on-500 workaround present in `test_mark_nonexistent_notification_as_read`. If the backend's error handling is consistent, the skip in the other test may be unnecessary.

340-366: LGTM! Good pagination test that properly handles the case where there aren't enough notifications to test the second page.

368-388: LGTM! Important test for verifying user isolation. Good use of both the `test_user` and `test_admin` fixtures to ensure notifications don't leak between users.

390-400: LGTM! Good negative test case. Accepting multiple error codes (400, 404, 422) is reasonable since path validation vs. schema validation may return different codes.
backend/app/core/middlewares/csrf.py (1)
26-30: Non-HTTP passthrough looks correct.
Skipping non-HTTP scopes avoids breaking websockets/SSE internals.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In @backend/app/core/middlewares/csrf.py:
- Line 44: The CSRF validation coroutine is being invoked but not awaited in the
middleware; change the call in the middleware to await
security_service.validate_csrf_from_request(request) so the async validation
actually runs (also ensure the enclosing middleware function is async and
properly awaits/handles exceptions from validate_csrf_from_request to enforce
rejection on failure).
In @backend/tests/conftest.py:
- Around line 109-140: In _create_authenticated_client, explicitly verify that
login_resp.json() contains a non-empty "csrf_token" after successful login
instead of defaulting to ""; if missing, close the client (c.aclose()) and fail
fast with a clear error (e.g., pytest.fail or raise) mentioning username/role
and login_resp.text; set c.headers["X-CSRF-Token"] only after that check so
downstream CSRF errors are not cryptic.
In @backend/tests/helpers/cleanup.py:
- Around line 8-16: Update the docstring in cleanup.py to remove the stale
“gw0→db0, gw1→db1” claim and instead state that each pytest-xdist worker is
assigned a dedicated Redis DB derived from the worker id; reference the REDIS_DB
calculation in backend/tests/conftest.py (sum(_WORKER_ID.encode()) % 16) or
phrasing like “derived from the worker id” so the comment matches the actual
REDIS_DB logic used by the test suite.
- Around line 17-22: The cleanup loop can call delete_many on views which raises
CommandNotSupportedOnView; change the collection enumeration to only actual
collections (e.g., call db.list_collection_names(filter={"type": "collection"})
or use db.list_collections() and filter items where info["type"] ==
"collection") before iterating the existing filter that skips "system." names,
then run await db[name].delete_many({}) as before and keep the existing await
redis_client.flushdb() call.
🧹 Nitpick comments (2)
backend/app/main.py (1)
105-105: Consider CSRFMiddleware placement in the middleware stack. The middleware execution order for incoming requests is:
- CORS
- CacheControl
- RequestSizeLimit
- Correlation
- CSRF (current position)
- RateLimit (if enabled)
- Metrics
This means CSRF validation occurs after correlation IDs are added (good for logging) and after request size limits (good for rejecting oversized requests early). However, it occurs before rate limiting, which means that invalid CSRF requests still consume rate limit quota.
Consider moving CSRFMiddleware before RateLimitMiddleware to prevent CSRF-invalid requests from affecting rate limits.
Suggested reordering
```diff
 setup_metrics(app, settings, logger)
 app.add_middleware(MetricsMiddleware)
+app.add_middleware(CSRFMiddleware, container=container)
 if settings.RATE_LIMIT_ENABLED:
     app.add_middleware(RateLimitMiddleware)
-app.add_middleware(CSRFMiddleware, container=container)
 app.add_middleware(CorrelationMiddleware)
```

backend/tests/conftest.py (1)
18-41: Harden `KAFKA_TOPIC_PREFIX` building and make the Redis DB selection clearer. Two small edge cases:

- If `KAFKA_TOPIC_PREFIX` is empty, `base_prefix` becomes `"."`.
- The Redis DB selection is a hash/mod (not gwN→dbN), and will collide when xdist `-n` > 16.

Suggested tweak (safer prefix + cleaner Redis DB calc)
```diff
 _WORKER_ID = os.environ.get("PYTEST_XDIST_WORKER", "gw0")
@@
 def test_settings() -> Settings:
@@
     base = Settings(_env_file=".env.test", _env_file_encoding="utf-8")
     session_id = uuid.uuid4().hex[:8]
-    base_prefix = f"{base.KAFKA_TOPIC_PREFIX.rstrip('.')}."
-    worker_num = sum(_WORKER_ID.encode()) % 16
-    unique_prefix = f"{base_prefix}{session_id}.{_WORKER_ID}."
+    raw_prefix = base.KAFKA_TOPIC_PREFIX.rstrip(".")
+    base_prefix = f"{raw_prefix}." if raw_prefix else ""
+    worker_num = sum(_WORKER_ID.encode())
+    unique_prefix = f"{base_prefix}{session_id}.{_WORKER_ID}."
     return base.model_copy(
         update={
@@
-            "REDIS_DB": worker_num % 16,
+            "REDIS_DB": worker_num % 16,  # stable per-worker; collisions if >16 workers
             "KAFKA_GROUP_SUFFIX": f"{session_id}.{_WORKER_ID}",
             "SCHEMA_SUBJECT_PREFIX": f"test.{session_id}.{_WORKER_ID}.",
             "KAFKA_TOPIC_PREFIX": unique_prefix,
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- backend/app/core/middlewares/csrf.py
- backend/app/main.py
- backend/tests/conftest.py
- backend/tests/helpers/cleanup.py
🧰 Additional context used
🧬 Code graph analysis (3)
backend/app/core/middlewares/csrf.py (1)
- backend/app/core/security.py (1): `validate_csrf_from_request` (84-122)

backend/tests/conftest.py (2)
- backend/app/settings.py (1): `Settings` (11-165)
- backend/app/main.py (1): `create_app` (60-162)

backend/tests/helpers/cleanup.py (2)
- backend/tests/conftest.py (1): `db` (93-95)
- backend/tests/unit/conftest.py (1): `db` (53-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: E2E Tests
- GitHub Check: E2E Tests
- GitHub Check: Integration Tests
🔇 Additional comments (5)
backend/app/core/middlewares/csrf.py (1)
37-37: Verify container resource management. The `SecurityService` is retrieved from the container on each request, but there's no explicit cleanup or context manager usage. Verify that dishka's `AsyncContainer.get()` properly manages resource lifecycles and doesn't leak resources on each request. Based on learnings about dishka, if the service has request scope or needs cleanup, consider using an async context manager pattern:

```python
async with self.container() as request_container:
    security_service = await request_container.get(SecurityService)
    # ... rest of validation
```

backend/app/main.py (2)
180-181: LGTM: Factory pattern enables proper per-worker initialization. Using `factory=True` with the string path ensures each uvicorn worker calls `create_app()` independently, providing proper isolation and initialization per worker process. This is the recommended approach for multi-worker deployments.

72-86: The metrics initialization is safe and does not have timing issues. `MetricsContext.initialize_all` only stores pre-constructed metric instances in context variables. Each metric class requires only `Settings` to initialize—it configures OpenTelemetry with the OTLP endpoint and service name, and creates instruments. No app-level state, FastAPI app instance, or DI container dependency exists. Metrics can be safely constructed before app creation (line 89) and before the DI container is set up (line 97).

If any metric constructor fails during initialization (lines 74-85), the `create_app()` function fails entirely, which is appropriate for critical infrastructure. The `initialize_all` method itself is a simple context-setter with informational logging.

backend/tests/conftest.py (2)
46-103: Session-scoped app + lifespan-driven init looks solid; verify Dishka scope typing. The session-scoped `app` fixture with `lifespan_context` is a good move for stability and performance. The only thing to double-check: `_container_scope` is annotated as yielding `AsyncContainer`, but `async with container() as scope:` may yield a narrower "scope" type in Dishka—worth aligning the annotation with whatever `container()` actually returns.

142-184: Auth client fixtures look correct (unique users + explicit `aclose()`).
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/tests/unit/services/saga/test_execution_saga_steps.py (1)
187-187: Assertion always passes due to `or True` fallback. The assertion `isinstance(...) or True` will always evaluate to `True` regardless of the isinstance check result, making it ineffective. This either indicates a test bug or a coverage hack that should be cleaned up.

Consider either:

- Removing the `or True` if the type check should be validated
- Replacing with a simpler statement if this is just for coverage
- Adding a comment explaining why the assertion is designed to always pass

🧪 Suggested fix

```diff
-    assert isinstance(QueueExecutionStep().get_compensation(), type(DeletePodCompensation(None)).__bases__[0]) or True
+    # Exercise get_compensation for coverage
+    _ = QueueExecutionStep().get_compensation()
```

Or if the type check is meaningful:

```diff
-    assert isinstance(...) or True
+    comp = QueueExecutionStep().get_compensation()
+    assert comp is not None  # or whatever the expected behavior is
```
🤖 Fix all issues with AI agents
In @backend/tests/conftest.py:
- Around line 21-41: The Redis DB selection in test_settings uses worker_num =
sum(_WORKER_ID.encode()) % 16 which causes collisions when xdist workers exceed
16; update conftest.py to either document the 16-db limit and rationale next to
test_settings and worker_num, or add a runtime guard in test_settings that
parses _WORKER_ID (PYTEST_XDIST_WORKER), computes the numeric worker index or
total workers, and raises a clear error if the indicated worker count > 16 (or
if modulo mapping would collide), so tests fail fast instead of silently sharing
Redis DBs; reference test_settings, worker_num and _WORKER_ID when making the
change.
In @backend/tests/helpers/cleanup.py:
- Around line 10-22: The cleanup currently calls
db.list_collection_names(filter={"type": "collection"}) which can skip
non-"collection" namespaces (e.g., time-series) and leave test data behind;
update the cleanup to iterate all collections (use db.list_collection_names()
without the filter or db.list_collections() and check each collection's "type"
!= "view") and for each name that does not start with "system." call
db[name].delete_many({}) so all collection-like namespaces (including
timeseries) are cleared while still preserving system namespaces.
In @backend/tests/integration/app/test_main_app.py:
- Around line 20-36: The test currently always asserts that
"RateLimitMiddleware" is present even though create_app() only registers it when
Settings.RATE_LIMIT_ENABLED is true; update test_create_app_real_instance to
first read Settings.RATE_LIMIT_ENABLED (import Settings or access the same
settings instance used by app) and only assert "RateLimitMiddleware" in
middleware_class_names when that flag is True, otherwise skip that assertion.
🧹 Nitpick comments (3)
backend/tests/unit/services/saga/test_saga_step_and_base.py (1)
84-94: LGTM! Excellent typing improvements and MagicMock usage. The explicit generic type `SagaStep[BaseEvent]` and typed method signatures enhance type safety. Using `MagicMock(spec=BaseEvent)` on line 94 is good practice as it ensures the mock conforms to the `BaseEvent` interface.

💡 Optional: Consider marking the test as async

Line 94 uses `asyncio.run()` within a synchronous test. While this works correctly, a more idiomatic approach would be:

```diff
-def test_saga_step_str_and_can_execute() -> None:
+@pytest.mark.asyncio
+async def test_saga_step_str_and_can_execute() -> None:
     class S(SagaStep[BaseEvent]):
         async def execute(self, context: SagaContext, event: BaseEvent) -> bool:
             return True
         def get_compensation(self) -> CompensationStep | None:
             return None
     s = S("nm")
     assert str(s) == "SagaStep(nm)"
     # can_execute default True
-    assert asyncio.run(s.can_execute(SagaContext("s", "e"), MagicMock(spec=BaseEvent))) is True
+    assert await s.can_execute(SagaContext("s", "e"), MagicMock(spec=BaseEvent)) is True
```

This is purely stylistic; the current implementation is functionally correct.
backend/tests/unit/services/pod_monitor/test_monitor.py (2)
77-89: Consider parameterizing Settings for test isolation. The `Settings()` instantiation reads from environment variables or defaults, which could lead to non-deterministic test behavior if the environment changes between test runs. Consider accepting a `Settings` parameter or using a fixture to provide consistent test settings.

♻️ Proposed refactor for test isolation

```diff
-def create_test_kafka_event_service() -> tuple[KafkaEventService, FakeUnifiedProducer]:
+def create_test_kafka_event_service(settings: Settings | None = None) -> tuple[KafkaEventService, FakeUnifiedProducer]:
     """Create real KafkaEventService with fake dependencies for testing."""
     fake_producer = FakeUnifiedProducer()
     fake_repo = FakeEventRepository()
-    settings = Settings()  # Uses defaults/env vars
+    test_settings = settings or Settings()
     service = KafkaEventService(
         event_repository=fake_repo,
         kafka_producer=fake_producer,
-        settings=settings,
+        settings=test_settings,
         logger=_test_logger,
     )
     return service, fake_producer
```
162-166: Inconsistent enum comparison pattern. Lines 162 and 165 use different comparison approaches:

- Line 162: `pm.state == MonitorState.RUNNING` (direct)
- Line 165: `pm.state.value == MonitorState.STOPPED.value` (via `.value`)

This pattern is inconsistent throughout the file (also lines 457, 562, 590). While both work for Python enums, prefer direct enum comparison for consistency and readability.

♻️ Standardize to direct enum comparison

```diff
 await pm.aclose()
-assert pm.state.value == MonitorState.STOPPED.value
+assert pm.state == MonitorState.STOPPED
 assert spy.cleared is True
```

Apply the same pattern throughout the file (lines 562, 590, etc.).
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (10)
- backend/tests/conftest.py
- backend/tests/helpers/cleanup.py
- backend/tests/integration/app/test_main_app.py
- backend/tests/integration/core/test_dishka_lifespan.py
- backend/tests/integration/events/test_event_store_consumer.py
- backend/tests/unit/services/pod_monitor/test_monitor.py
- backend/tests/unit/services/saga/test_execution_saga_steps.py
- backend/tests/unit/services/saga/test_saga_step_and_base.py
- backend/tests/unit/services/sse/test_sse_service.py
- backend/tests/unit/services/sse/test_sse_shutdown_manager.py
🚧 Files skipped from review as they are similar to previous changes (2)
- backend/tests/integration/core/test_dishka_lifespan.py
- backend/tests/unit/services/sse/test_sse_shutdown_manager.py
🧰 Additional context used
🧬 Code graph analysis (6)
backend/tests/helpers/cleanup.py (2)
- backend/tests/conftest.py (1): `db` (93-95)
- backend/tests/unit/conftest.py (1): `db` (53-54)

backend/tests/unit/services/saga/test_saga_step_and_base.py (4)
- backend/app/infrastructure/kafka/events/base.py (1): `BaseEvent` (13-37)
- backend/app/infrastructure/kafka/events/metadata.py (1): `AvroEventMetadata` (9-31)
- backend/app/services/saga/base_saga.py (1): `BaseSaga` (8-52)
- backend/app/services/saga/saga_step.py (2): `CompensationStep` (103-121), `SagaContext` (14-70)

backend/tests/unit/services/sse/test_sse_service.py (8)
- backend/app/db/repositories/sse_repository.py (3): `SSERepository` (8-23), `get_execution_status` (9-17), `get_execution` (19-23)
- backend/app/domain/execution/models.py (2): `DomainExecution` (23-36), `ResourceUsageDomain` (15-19)
- backend/app/domain/sse/models.py (3): `ShutdownStatus` (11-19), `SSEExecutionStatusDomain` (35-38), `SSEHealthDomain` (23-31)
- backend/app/services/sse/kafka_redis_bridge.py (1): `SSEKafkaRedisBridge` (16-149)
- backend/app/services/sse/redis_bus.py (4): `SSERedisSubscription` (15-41), `get` (23-35), `close` (37-41), `open_subscription` (73-77)
- backend/app/services/sse/sse_service.py (2): `SSEService` (25-282), `create_execution_stream` (52-119)
- backend/app/settings.py (1): `Settings` (11-165)
- backend/app/domain/events/event_models.py (1): `Event` (44-61)

backend/tests/conftest.py (2)
- backend/app/settings.py (1): `Settings` (11-165)
- backend/app/main.py (1): `create_app` (60-162)

backend/tests/integration/events/test_event_store_consumer.py (5)
- backend/tests/conftest.py (2): `app` (46-58), `scope` (87-89)
- backend/app/domain/enums/auth.py (1): `LoginMethod` (4-10)
- backend/app/domain/enums/kafka.py (1): `KafkaTopic` (7-53)
- backend/app/events/schema/schema_registry.py (2): `SchemaRegistryManager` (53-229), `initialize_event_schemas` (238-239)
- backend/tests/helpers/eventually.py (1): `eventually` (8-36)

backend/tests/integration/app/test_main_app.py (3)
- backend/tests/conftest.py (2): `app` (46-58), `test_settings` (23-41)
- backend/app/settings.py (1): `Settings` (11-165)
- backend/app/main.py (1): `create_app` (60-162)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: E2E Tests
- GitHub Check: Build Frontend
- GitHub Check: E2E Tests
- GitHub Check: Integration Tests
🔇 Additional comments (17)
backend/tests/unit/services/saga/test_execution_saga_steps.py (6)
3-7: LGTM! Imports updated for typed test fakes. The new imports support properly typed fake implementations that extend the real repository and producer interfaces, improving type safety in tests. The reordering of saga imports is cosmetic and has no functional impact.

Also applies to: 11-16

24-24: LGTM! Return type annotation added.

46-71: LGTM! Fake repository properly implements the interface. The `_FakeAllocRepo` now correctly extends `ResourceAllocationRepository` with properly typed methods. The implementation provides appropriate test behavior while maintaining type safety.

104-107: LGTM! Better naming and type annotations. Renaming to `_BadCtx` makes the error-inducing purpose clearer, and the added type annotations improve code quality.

113-121: LGTM! Fake producer properly implements the interface. The `_FakeProducer` correctly extends `UnifiedProducer` with properly typed methods. The implementation captures produced events for test verification while maintaining type safety.

194-200: LGTM! Dummy classes appropriately skip initialization for tests. The explicit empty `__init__` methods with clear comments make it obvious that parent initialization is intentionally bypassed. This is a reasonable test pattern when the classes are only used for `isinstance` checks as shown in the test.

backend/tests/unit/services/saga/test_saga_step_and_base.py (2)

1-9: LGTM! Imports support explicit typing improvements. The added imports (`asyncio`, `MagicMock`, `EventType`, `BaseEvent`, `AvroEventMetadata`) are all necessary for the enhanced type hints and test implementations throughout the file.

68-80: LGTM! Explicit return types align with the BaseSaga interface. The added type hints (`-> str`, `-> list[EventType]`, `-> list[SagaStep[BaseEvent]]`) correctly match the abstract methods defined in `BaseSaga` and improve type safety for the test implementations.
124-141: LGTM! Well-structured DI test factory. The `make_pod_monitor` factory provides sensible defaults while allowing selective dependency injection. This makes tests flexible and readable.

389-408: LGTM! Comprehensive event publishing test. The test properly validates the full flow from event creation through Kafka producer capture, including proper key extraction from pod labels.

411-443: LGTM! Thorough exception handling coverage. The test validates graceful error handling with a failing producer and edge cases (None metadata). The pattern of creating test doubles that fail in specific ways is excellent for verifying resilience.

530-563: LGTM! Thorough factory testing with proper mocking. The test validates the `create_pod_monitor` context manager with monkeypatching to avoid real K8s connections. Patching both import locations (lines 550-551) ensures the mock is used consistently.
706-784: LGTM! Comprehensive coverage of edge cases and error scenarios.The remaining tests thoroughly cover:
- Field selector configuration
- Reconciliation loop error handling
- API exception handling (status 410, 500, generic errors)
- Task lifecycle and cleanup
The use of tracking classes and custom fakes provides excellent observability for test verification.
backend/tests/integration/events/test_event_store_consumer.py (1)
18-31: Kafka consumer xdist grouping + DI typing look good for stability. Serializing consumer construction via `pytest.mark.xdist_group("kafka_consumers")` should reduce librdkafka flakiness; the `scope: AsyncContainer` annotation and `eventually()` usage read cleanly.

backend/tests/unit/services/sse/test_sse_service.py (2)

120-124: The review comment is incorrect—the codebase does not use `asyncio.sleep(interval)` and does not create a busy loop with `interval=0`. The heartbeat logic uses `if (now - last_heartbeat).total_seconds() >= self.heartbeat_interval:` to determine when to emit a heartbeat, followed by `await subscription.get(...)`, which is an async blocking operation. With `interval=0`, the heartbeat is simply emitted as soon as it becomes eligible, then the code waits on the async subscription call. This is the intended behavior and is correctly tested by the existing test that uses `0` with the comment "emit immediately". No changes are needed to the test code.

Likely an incorrect or invalid review comment.

26-42: PEP 695 syntax is appropriate for this codebase; no changes needed. The repository explicitly requires Python >=3.12 (per `pyproject.toml`), where the `async def get[T: BaseModel]` syntax is standard and fully supported. The concern about Python 3.11 compatibility does not apply here.

Likely an incorrect or invalid review comment.
backend/tests/conftest.py (1)
109-149: The authenticated client works correctly with cookie-based authentication; no Authorization header is needed.The auth endpoint uses httpOnly cookies for authentication—the login response does not return a bearer token in the body. httpx.AsyncClient automatically persists cookies from Set-Cookie headers on subsequent requests, so the client is authenticated without setting an Authorization header. The CSRF token is correctly extracted and set as a custom header for CSRF protection.
The register endpoint returns specific status codes: 400 only for username conflicts, 409 for email conflicts, and 500 for other errors. Accepting 400 is intentional since tests may run multiple times with the same username. If stricter validation is needed, verify that the 400 response contains the expected "already registered" message.
Likely an incorrect or invalid review comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
backend/tests/integration/app/test_main_app.py (1)
28-37: Middleware verification approach is acceptable but potentially fragile. Using `__name__` for middleware class name extraction works but relies on the attribute being consistently available. The conditional check for rate limiting based on `test_settings.RATE_LIMIT_ENABLED` is correct.

💡 Optional: Consider a more robust middleware verification approach

If middleware verification becomes unreliable, consider checking `app.user_middleware` entries directly by inspecting their `cls` attribute or using a set of expected middleware classes:

```diff
-middleware_class_names = {getattr(m.cls, "__name__", str(m.cls)) for m in app.user_middleware}
+# More explicit approach
+middleware_classes = {m.cls for m in app.user_middleware}
+expected_middlewares = {CORSMiddleware, CorrelationMiddleware, ...}
+assert expected_middlewares.issubset(middleware_classes)
```

However, the current string-based approach is simpler and works well for this use case.
backend/tests/integration/test_sse_routes.py (1)
144-168: Consider adding a timeout to the concurrent stream test. The `create_and_verify_stream` function doesn't use `asyncio.timeout`, which could cause the test to hang if the "connected" event is never received.

⏱️ Add timeout protection

```diff
 async def create_and_verify_stream(user_id: str) -> bool:
     async with aclosing(sse_service.create_notification_stream(user_id)) as stream:
-        async for event in stream:
-            if "data" in event:
-                data = json.loads(event["data"])
-                if data.get("event_type") == "connected":
-                    return True
-            break
+        try:
+            async with asyncio.timeout(3.0):
+                async for event in stream:
+                    if "data" in event:
+                        data = json.loads(event["data"])
+                        if data.get("event_type") == "connected":
+                            return True
+                    break
+        except TimeoutError:
+            pass
     return False
```

backend/tests/integration/test_user_settings_routes.py (1)
368-389: Test isolation verification could be stronger. The assertion at lines 388-389 checks that users have different settings using an `or` condition, which could pass even if only one field differs. This might not catch isolation bugs where most fields are incorrectly shared.

♻️ Strengthen isolation assertion

Consider verifying that each user maintains independent settings:

```diff
 # Verify second user's settings are not affected by first user's changes
-# Second user should have default settings, not the first user's custom settings
-assert user2_settings["theme"] != user1_update["theme"] or user2_settings["timezone"] != user1_update[
-    "timezone"]
+# Both fields should be different (user2 has defaults, user1 has custom)
+assert user2_settings["theme"] != user1_update["theme"], \
+    f"User2 theme {user2_settings['theme']} should differ from user1's {user1_update['theme']}"
+assert user2_settings["timezone"] != user1_update["timezone"], \
+    f"User2 timezone {user2_settings['timezone']} should differ from user1's {user1_update['timezone']}"
```

Alternatively, if the issue is that one user might have the same defaults coincidentally, document this:

```python
# At least one setting should differ, showing isolation
# (both users could coincidentally have the same defaults)
assert (user2_settings["theme"] != user1_update["theme"]
        or user2_settings["timezone"] != user1_update["timezone"])
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
- backend/app/db/repositories/notification_repository.py
- backend/app/services/notification_service.py
- backend/tests/conftest.py
- backend/tests/integration/app/test_main_app.py
- backend/tests/integration/test_health_routes.py
- backend/tests/integration/test_sse_routes.py
- backend/tests/integration/test_user_settings_routes.py
🧰 Additional context used
🧬 Code graph analysis (5)
backend/tests/integration/app/test_main_app.py (3)
backend/tests/conftest.py (2)
app (53-65)
test_settings (26-48)
backend/app/settings.py (1)
Settings (11-165)
backend/app/main.py (1)
create_app(60-162)
backend/tests/integration/test_sse_routes.py (5)
backend/app/infrastructure/kafka/events/metadata.py (1)
AvroEventMetadata (9-31)
backend/app/schemas_pydantic/sse.py (2)
RedisNotificationMessage (91-101)
SSEHealthResponse (115-125)
backend/app/services/sse/redis_bus.py (2)
SSERedisBus (44-87)
publish_notification (79-81)
backend/app/services/sse/sse_service.py (2)
create_notification_stream (197-261)
create_execution_stream (52-119)
backend/app/services/kafka_event_service.py (1)
publish_event(39-150)
backend/tests/conftest.py (2)
backend/app/settings.py (1)
Settings (11-165)
backend/app/main.py (1)
create_app(60-162)
backend/tests/integration/test_user_settings_routes.py (3)
backend/tests/conftest.py (3)
app (53-65)
test_user (159-170)
another_user (189-200)
backend/tests/helpers/eventually.py (1)
eventually (8-36)
backend/tests/unit/services/sse/test_sse_service.py (1)
get(32-41)
backend/tests/integration/test_health_routes.py (1)
backend/tests/conftest.py (2)
client (76-84)
test_user (159-170)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: E2E Tests
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
🔇 Additional comments (13)
backend/tests/integration/test_health_routes.py (1)
50-76: LGTM! Good separation of concerns.

The test now properly isolates load generation traffic (via `test_user`) from health check requests (via `client`), which better simulates real-world conditions where health probes and user traffic use different connections.

backend/tests/integration/app/test_main_app.py (1)
24-24: Good type safety improvement.

Filtering routes by `isinstance(r, Route)` is more precise than the previous approach and prevents potential issues with other route types.

backend/tests/conftest.py (4)
18-21: Good safety check for parallel test execution.

The assertion prevents exceeding Redis's 16-database limit when running tests with `pytest-xdist`, ensuring tests fail early with a clear message rather than experiencing mysterious connection errors.
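The guard described above boils down to a small pure function. A runnable sketch (the `PYTEST_XDIST_WORKER` variable and `gw<N>` worker-id format are how pytest-xdist labels workers; the cap of 16 mirrors Redis's default `databases` setting):

```python
def worker_redis_db(env: dict[str, str]) -> int:
    """Map an xdist worker id ("gw0", "gw1", ...) to a dedicated Redis DB index.

    Fails fast when the worker number would exceed Redis's default 16 DBs.
    """
    worker_id = env.get("PYTEST_XDIST_WORKER", "gw0")  # master process has no worker id
    worker_num = int(worker_id.removeprefix("gw") or "0")
    assert worker_num < 16, f"xdist worker {worker_num} >= 16 exceeds Redis DB limit; use -n 16 or fewer"
    return worker_num


db_for_gw5 = worker_redis_db({"PYTEST_XDIST_WORKER": "gw5"})
db_for_master = worker_redis_db({})  # falls back to gw0
```

Because the mapping is a pure function of the environment, a mis-sized `-n` value fails at import time with an explicit message instead of surfacing as sporadic Redis errors mid-run.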
25-48: Well-structured test settings with proper worker isolation.

The fixture correctly:
- Loads base settings from `.env.test`
- Applies worker-specific isolation via `KAFKA_GROUP_SUFFIX` (consumer groups) rather than topic prefixes
- Disables OTEL and tracing in tests
- Uses session UUIDs to prevent cross-worker conflicts
158-200: LGTM! Authenticated client fixtures provide good test isolation.

The fixtures properly:
- Create unique users per test run to avoid conflicts
- Register and login via API (not database manipulation)
- Extract and set CSRF tokens automatically
- Clean up resources with `await c.aclose()`

This approach ensures tests use the same authentication flow as real clients.
116-155: The CSRF token presence check is unnecessary; the backend guarantees it.

The login endpoint at line 92 always generates a CSRF token, and the `LoginResponse` schema requires it as a non-optional field. FastAPI's response validation enforces this contract: if the token were missing, FastAPI would raise a validation error before sending the response. The defensive check in the test (lines 145-150) will never trigger under normal circumstances.

Likely an incorrect or invalid review comment.
backend/tests/integration/test_sse_routes.py (2)
45-84: Excellent deterministic streaming test pattern.

The test correctly:
- Waits for the "subscribed" event to ensure the subscription is ready
- Only then publishes the notification
- Uses `asyncio.timeout` to prevent hanging
- Breaks out of the loop once the expected event is received
This is much more reliable than polling-based tests.
87-124: LGTM! Well-structured execution stream test.

The test follows the same deterministic pattern as the notification test, properly synchronizing on the "subscribed" event before publishing test data.
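The subscribe-then-publish ordering both tests rely on can be modeled with plain asyncio primitives; in this sketch an `asyncio.Event` plays the role of the "subscribed" SSE event:

```python
import asyncio


async def demo() -> str:
    subscribed = asyncio.Event()
    inbox: asyncio.Queue[str] = asyncio.Queue()

    async def consumer() -> str:
        subscribed.set()  # analogous to the stream emitting its "subscribed" event
        return await inbox.get()

    task = asyncio.create_task(consumer())
    await subscribed.wait()          # publisher blocks until subscription is ready
    await inbox.put("notification")  # only now is it safe to publish
    return await task


received = asyncio.run(demo())
```

Publishing only after the readiness signal removes the race that polling-based tests paper over with sleeps.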
backend/tests/integration/test_user_settings_routes.py (3)
10-34: Good use of TypedDict for type-safe test payloads.

Defining typed structures for `_NotificationSettings`, `_EditorSettings`, and `_UpdateSettingsData` provides:
- Clear documentation of expected payload shapes
- IDE autocomplete support
- Early detection of typos in test data
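A minimal sketch of the pattern (the field names here are illustrative, not the exact ones from the test file):

```python
from typing import TypedDict


class _NotificationSettings(TypedDict):
    email_enabled: bool
    slack_enabled: bool


class _UpdateSettingsData(TypedDict):
    theme: str
    timezone: str
    notifications: _NotificationSettings


# A type checker flags typos like "them" instead of "theme" at analysis time,
# while at runtime the payload is still a plain dict suitable for client.put(json=...).
payload: _UpdateSettingsData = {
    "theme": "dark",
    "timezone": "UTC",
    "notifications": {"email_enabled": True, "slack_enabled": False},
}
```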
37-38: Appropriate use of xdist_group for sequential execution.

The `xdist_group` marker ensures these settings tests run sequentially, preventing race conditions when tests modify user settings state.
392-424: LGTM! Comprehensive persistence test.

The test properly verifies that settings persist by:
- Updating multiple setting categories
- Re-fetching settings via a new GET request
- Validating all updated fields match
backend/app/services/notification_service.py (1)
807-829: Confirm product alignment for opt-out notification model.

The behavior change is intentional and explicitly documented: `get_subscription` and `get_all_subscriptions` both default missing subscriptions to `enabled=True`, implementing an opt-out model where new users receive notifications by default. The repository docstring confirms this is "consistent with get_all_subscriptions," and the refactored signature (removing Optional) correctly reflects this contract.

However, verify that this opt-out default matches product requirements for new user notification delivery.
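The opt-out contract reduces to "missing row means enabled"; a hypothetical in-memory version of the repository lookup (the store and dataclass stand in for the real MongoDB collection and model):

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    channel: str
    enabled: bool


# Hypothetical store standing in for the database collection.
_store: dict[tuple[str, str], Subscription] = {}


def get_subscription(user_id: str, channel: str) -> Subscription:
    # Missing subscriptions default to enabled=True (opt-out model),
    # so the return type never needs Optional and callers skip None-handling.
    return _store.get((user_id, channel), Subscription(channel, enabled=True))


new_user_sub = get_subscription("new-user", "email")
_store[("opted-out-user", "email")] = Subscription("email", enabled=False)
opted_out_sub = get_subscription("opted-out-user", "email")
```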
backend/app/db/repositories/notification_repository.py (1)
155-163: Opt-out default is intentional and well-supported.

The behavior change is correct and consistent:
- Aligns with `get_all_subscriptions` (line 191): both return `enabled=True` for new users
- Simplifies caller code: eliminates None-handling in notification_service.py since the subscription is never null
- Supports user control: the preference center at the `/subscriptions` endpoint and the `update_subscription` method provide one-click opt-out on a per-channel basis
- Documented clearly: the docstring explains the default and rationale

The type signature improvement (no Optional) and behavior are sound. Best practices support opt-out defaults when paired with prominent preference controls, which your implementation provides.
1 issue found across 7 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/tests/conftest.py">
<violation number="1" location="backend/tests/conftest.py:43">
P2: When `SCHEMA_SUBJECT_PREFIX` is empty (the default), this creates a prefix with a leading dot (e.g., `.abc12345.gw0.`). Consider adding a fallback or stripping the leading dot:
```python
base_prefix = base.SCHEMA_SUBJECT_PREFIX.rstrip('.') or 'test'
```</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
1 issue found across 18 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/pyproject.toml">
<violation number="1" location="backend/pyproject.toml:147">
P2: Inconsistent version pinning: all other dev dependencies use exact versions (`==`) for reproducibility, but `pytest-dotenv` uses a minimum version constraint (`>=`). Consider pinning to an exact version for consistency.</violation>
</file>
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/scripts/create_topics.py (1)
1-1: Add `backend/scripts/__init__.py` to resolve the MyPy duplicate module name error.

MyPy detects this file under two module names: "create_topics" and "scripts.create_topics". This occurs because the file is imported as a module in backend/tests/conftest.py (`from scripts.create_topics import create_topics`) but the scripts directory lacks an `__init__.py` file, preventing it from being recognized as a proper Python package.

Create an empty `backend/scripts/__init__.py` to make scripts a proper package and resolve the ambiguity.
🤖 Fix all issues with AI agents
In @backend/pyproject.toml:
- Line 147: The dependency change replacing pytest-env with the unmaintained
"pytest-dotenv>=0.5.2" is risky; revert this dependency change or replace
"pytest-dotenv>=0.5.2" with a maintained alternative (e.g., re-add "pytest-env"
or choose a maintained plugin like "pytest-dotenv-maintained" or use
environment-variable handling in tests directly). Update the pyproject.toml
entry to restore the prior package name/version or a vetted maintained
alternative and run tests to confirm no breakages.
- Line 217: The pytest addopts value was simplified to "--tb=short" which
removed parallelization and quieting flags; check and decide whether to restore
parallel test execution and reduced verbosity by setting addopts back to include
the xdist and quiet options (e.g., include "-n 4 --dist=loadfile" for 4 workers
and "-q --no-header" to suppress extra output) in the addopts entry in
pyproject.toml so tests run in parallel in CI when intended; update addopts
accordingly or document why sequential execution is desired.
🧹 Nitpick comments (3)
backend/app/main.py (1)
100-108: Review middleware ordering for security.

CSRFMiddleware is added after MetricsMiddleware and RateLimitMiddleware. For security, CSRF validation should typically run earlier in the middleware stack to reject invalid requests before they consume resources for metrics collection or rate limiting.
Consider moving CSRFMiddleware before MetricsMiddleware and RateLimitMiddleware:
♻️ Suggested middleware reordering
```diff
 setup_metrics(app, settings, logger)
+app.add_middleware(CSRFMiddleware, container=container)
 app.add_middleware(MetricsMiddleware)
 if settings.RATE_LIMIT_ENABLED:
     app.add_middleware(RateLimitMiddleware)
-app.add_middleware(CSRFMiddleware, container=container)
 app.add_middleware(CorrelationMiddleware)
```

This ensures CSRF validation happens before resource-intensive operations like metrics collection.
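One caveat worth checking before applying this: in Starlette, each `add_middleware` call wraps the existing stack, so the middleware added last sits outermost and runs first on the request path. The effect can be modeled with plain wrappers (no Starlette dependency; names are illustrative):

```python
call_order: list[str] = []


def wrap(name: str, inner):
    """Mimic add_middleware: the new layer wraps (and runs before) `inner`."""
    def handler(request: str) -> str:
        call_order.append(name)
        return inner(request)
    return handler


def app(request: str) -> str:
    return "response"


# add_middleware(MetricsMiddleware) then add_middleware(CSRFMiddleware):
handler = wrap("metrics", app)
handler = wrap("csrf", handler)  # added last -> outermost -> runs first

result = handler("GET /")
```

Under that semantics, adding CSRFMiddleware after MetricsMiddleware already makes CSRF checks run first, so verify the actual execution order before reordering the calls.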
backend/tests/conftest.py (2)
20-23: Verify xdist worker limit enforcement.

The assertion enforces a hard limit of 16 workers to match Redis DB availability. This is good, but the assertion will fail at test collection time (not configuration time), which might be confusing.

Consider moving this check to `pytest_configure` for earlier failure:

♻️ Suggested improvement

```diff
-# Redis has 16 DBs (0-15); each xdist worker gets one, limiting parallel workers to 16.
-_WORKER_ID = os.environ.get("PYTEST_XDIST_WORKER", "gw0")
-_WORKER_NUM = int(_WORKER_ID.removeprefix("gw") or "0")
-assert _WORKER_NUM < 16, f"xdist worker {_WORKER_NUM} >= 16 exceeds Redis DB limit; use -n 16 or fewer"
 # ===== Pytest hooks =====
 def pytest_configure(config: pytest.Config) -> None:
     """Create Kafka topics once before any tests run.

     Runs in master process before xdist workers spawn.
     pytest-dotenv has already loaded .env.test into os.environ,
     so Settings() picks up test configuration automatically.
     Silently skips if Kafka is unavailable (e.g., unit tests).
     """
+    # Validate worker count doesn't exceed Redis DB limit
+    worker_id = os.environ.get("PYTEST_XDIST_WORKER", "gw0")
+    worker_num = int(worker_id.removeprefix("gw") or "0")
+    if worker_num >= 16:
+        pytest.exit(
+            f"xdist worker {worker_num} >= 16 exceeds Redis DB limit; use -n 16 or fewer",
+            returncode=1,
+        )
+
     # Only run in master process (not in xdist workers)
     if not hasattr(config, "workerinput"):
```

Then define `_WORKER_ID` and `_WORKER_NUM` in the test_settings fixture where they're actually used.
143-183: Verify CSRF token is always present in login response.

The helper properly handles CSRF token extraction and fails fast if missing (lines 171-176). However, the error handling could be more defensive.

Consider adding validation earlier in the flow:

♻️ Optional improvement

```diff
 login_resp = await c.post("/api/v1/auth/login", data={
     "username": username,
     "password": password,
 })
 login_resp.raise_for_status()
 login_data = login_resp.json()
 csrf = login_data.get("csrf_token")
 if not csrf:
+    # Log the full response for debugging
+    print(f"Login response: {login_resp.json()}")
     await c.aclose()
     pytest.fail(
         f"Login succeeded but csrf_token missing or empty for {role} '{username}'. "
         f"Response: {login_resp.text}"
     )
```

This helps debug cases where the login endpoint might change its response format.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
- .github/workflows/backend-ci.yml
- backend/.env.test
- backend/app/main.py
- backend/app/schemas_pydantic/execution.py
- backend/app/settings.py
- backend/pyproject.toml
- backend/scripts/create_topics.py
- backend/scripts/seed_users.py
- backend/tests/conftest.py
- backend/tests/integration/test_notifications_routes.py
- backend/workers/dlq_processor.py
- backend/workers/run_coordinator.py
- backend/workers/run_event_replay.py
- backend/workers/run_k8s_worker.py
- backend/workers/run_pod_monitor.py
- backend/workers/run_result_processor.py
- backend/workers/run_saga_orchestrator.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/workers/run_event_replay.py
🧰 Additional context used
🧬 Code graph analysis (8)
backend/workers/run_result_processor.py (1)
backend/app/settings.py (1)
Settings(9-163)
backend/scripts/create_topics.py (1)
backend/app/settings.py (1)
Settings(9-163)
backend/tests/conftest.py (3)
backend/scripts/create_topics.py (1)
create_topics (20-101)
backend/app/settings.py (1)
Settings (9-163)
backend/app/main.py (1)
create_app(60-162)
backend/workers/run_coordinator.py (1)
backend/app/settings.py (1)
Settings(9-163)
backend/app/main.py (6)
backend/app/core/metrics/events.py (1)
EventMetrics(4-209)backend/app/core/metrics/health.py (1)
HealthMetrics(4-109)backend/app/core/metrics/context.py (2)
MetricsContext(113-233)initialize_all(123-163)backend/app/core/middlewares/cache.py (1)
CacheControlMiddleware(7-62)backend/app/core/middlewares/csrf.py (1)
CSRFMiddleware(14-56)backend/app/core/middlewares/metrics.py (1)
MetricsMiddleware(19-118)
backend/workers/dlq_processor.py (2)
backend/app/settings.py (1)
Settings(9-163)backend/workers/run_pod_monitor.py (1)
main(55-75)
backend/workers/run_saga_orchestrator.py (1)
backend/app/settings.py (1)
Settings(9-163)
backend/tests/integration/test_notifications_routes.py (3)
backend/tests/conftest.py (4)
app (80-92)
test_user (186-197)
scope (121-123)
test_admin (201-212)
backend/app/services/notification_service.py (1)
NotificationService (115-948)
backend/tests/helpers/eventually.py (1)
eventually(8-36)
🪛 dotenv-linter (4.0.0)
backend/.env.test
[warning] 26-26: [UnorderedKey] The SCHEMA_REGISTRY_URL key should go before the SCHEMA_SUBJECT_PREFIX key
(UnorderedKey)
[warning] 30-30: [UnorderedKey] The BCRYPT_ROUNDS key should go before the SECURE_COOKIES key
(UnorderedKey)
[warning] 34-34: [UnorderedKey] The ENABLE_TRACING key should go before the RATE_LIMIT_ENABLED key
(UnorderedKey)
[warning] 42-42: [UnorderedKey] The ENVIRONMENT key should go before the LOG_LEVEL key
(UnorderedKey)
🪛 GitHub Actions: MyPy Type Checking
backend/scripts/create_topics.py
[error] 1-1: Source file found twice under different module names: "create_topics" and "scripts.create_topics". Found 1 error in 1 file (errors prevented further checking).
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Scan Backend
- GitHub Check: E2E Tests
- GitHub Check: Build Frontend
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
🔇 Additional comments (36)
backend/pyproject.toml (2)
185-190: LGTM! Enhanced type checking for Pydantic models.

The addition of the Pydantic mypy plugin with strict configuration (`init_forbid_extra`, `init_typed`, `warn_required_dynamic_aliases`) will improve type safety for the Settings model and other Pydantic models throughout the codebase.
194-194: LGTM! Centralized test environment configuration.

The `env_files` setting works with `pytest-dotenv` to automatically load test environment variables from `.env.test` before test execution. This aligns with the PR's goal of centralizing configuration and removes the need for environment variable setup in CI workflows.

.github/workflows/backend-ci.yml (2)
50-50: LGTM! Test duration reporting added.

Adding `--durations=0` to all test runs provides visibility into test performance, helping identify slow tests and monitor regressions over time.

Also applies to: 99-99, 176-176
94-102: Kafka topic creation is correctly handled in test setup.

Topics are created in the `pytest_configure` hook in backend/tests/conftest.py before any tests run. The implementation calls `create_topics(Settings())` in the master process, gracefully skipping if Kafka is unavailable. Configuration from `.env.test` is automatically loaded via pytest-dotenv and passed to `Settings()`.

backend/.env.test (2)
25-25: LGTM! Schema Registry test isolation.

Adding `SCHEMA_SUBJECT_PREFIX=test.` provides proper test isolation for schema registry subjects, matching the pattern used for Kafka topics and preventing conflicts with other environments.
30-30: LGTM! Optimized bcrypt rounds for test performance.

Setting `BCRYPT_ROUNDS=4` reduces password hashing time during tests while maintaining security testing validity. This is a common practice to improve test suite performance without compromising test coverage.

backend/workers/run_k8s_worker.py (2)
13-13: LGTM! Explicit Settings dependency injection.

Removing the optional `Settings` parameter and the `get_settings()` fallback enforces explicit dependency injection. This makes the code more predictable and testable, aligning with the PR's goal of centralizing Settings-based configuration.

Also applies to: 17-17
55-55: LGTM! Consistent Settings initialization pattern.

Direct `Settings()` instantiation and explicit passing to `init_tracing()` maintains consistency with the DI pattern applied across all worker entry points in this PR.

Also applies to: 65-65
backend/workers/run_saga_orchestrator.py (1)
13-13: LGTM! Consistent DI pattern across worker entry points.

The changes mirror those in run_k8s_worker.py, ensuring consistency across all worker entry points. The explicit `Settings` requirement and direct instantiation eliminate optional parameters and fallback logic, making the code more predictable and maintainable.

Also applies to: 17-17, 55-55, 65-65
backend/workers/dlq_processor.py (2)
135-135: LGTM!

Explicitly passing `Settings()` to `main()` aligns with the DI pattern and improves clarity.
103-103: Signature change properly enforced across all callers.

The function signature change from `async def main(settings: Settings | None = None)` to `async def main(settings: Settings)` is correctly implemented. The entry point at the bottom of the file properly passes `Settings()` explicitly, and all external invocations (Docker Compose, Kubernetes via Helm) execute the script directly, which triggers the updated entry point. No other code in the repository imports or calls `main()` directly, confirming the breaking change is safe.

backend/workers/run_pod_monitor.py (2)
19-19: LGTM!

Removing the optional Settings parameter enforces explicit configuration injection, improving testability and clarity.
57-57: LGTM!

Settings instantiation and propagation to `init_tracing` follows the consistent DI pattern established across the codebase.

Also applies to: 67-67
backend/workers/run_result_processor.py (2)
21-21: LGTM!

The signature change enforces explicit Settings dependency injection, consistent with the broader DI refactor.
71-71: LGTM!

Settings instantiation and tracing configuration follow the established DI pattern.
Also applies to: 81-81
backend/app/settings.py (3)
45-46: LGTM!

The `BCRYPT_ROUNDS` default of 12 aligns with OWASP recommendations for production environments. The inline comment clearly explains the security/performance trade-off.
77-77: LGTM!

Empty string default for `SCHEMA_SUBJECT_PREFIX` provides a safe fallback for schema registry integration.
131-131: LGTM!

Defaulting `ENVIRONMENT` to "production" follows the secure-by-default principle, ensuring production-appropriate configurations unless explicitly overridden.

backend/workers/run_coordinator.py (2)
17-17: LGTM!

The signature change completes the consistent Settings DI pattern across all worker entry points.
55-55: LGTM!

Settings instantiation and tracing initialization follow the established pattern, completing the DI refactor.
Also applies to: 65-65
backend/scripts/seed_users.py (1)
71-112: LGTM! Clean refactor to Settings-based configuration.

The changes properly inject Settings for MongoDB configuration and use environment variables for password handling, which aligns with the broader DI refactor across the codebase.
backend/scripts/create_topics.py (1)
20-21: Clean Settings injection.

The refactor to accept Settings as a parameter is consistent with the broader DI pattern.
backend/app/main.py (3)
60-68: LGTM! Factory pattern enables better testing.

The factory function properly accepts an optional Settings parameter, enabling test configurations to be injected. The fallback to `Settings()` for production use is correct.
180-181: LGTM! Factory pattern correctly integrated with uvicorn.

The `factory=True` parameter tells uvicorn to call `create_app()` as a factory function, which aligns with the refactor.
71-86: The metrics initialization already handles missing OTEL endpoint gracefully via NoOp meters.

The `BaseMetrics._create_meter()` method checks whether `OTEL_EXPORTER_OTLP_ENDPOINT` is configured and returns a `NoOpMeterProvider` if it's missing, making all metric operations no-ops. However, metrics objects are still instantiated unconditionally in main.py, which contradicts the PR summary stating metrics are "skipped if no OTLP."

This is a valid graceful degradation pattern with no functional issue, but if the intent is to truly skip metrics initialization when OTLP is unconfigured (as the PR summary suggests), consider adding a guard condition:

```python
if settings.OTEL_EXPORTER_OTLP_ENDPOINT:
    MetricsContext.initialize_all(...)
```

Otherwise, clarify the PR summary to reflect that metrics degrade to no-ops rather than being skipped entirely.
backend/tests/conftest.py (3)
27-42: LGTM! Well-designed topic creation hook.

The `pytest_configure` hook properly:
- Runs only in the master process before workers spawn
- Uses Settings() which picks up test configuration from pytest-dotenv
- Silently skips if Kafka is unavailable (appropriate for unit tests)
47-75: Excellent documentation and per-worker isolation strategy.

The fixture clearly documents:
- What gets isolated per worker (DATABASE_NAME, REDIS_DB, KAFKA_GROUP_SUFFIX)
- What's shared (KAFKA_TOPIC_PREFIX, SCHEMA_SUBJECT_PREFIX)
- Why telemetry is disabled in tests
The isolation strategy prevents test interference while sharing expensive resources like Kafka topics.
185-227: LGTM! Clean authenticated client fixtures.

The fixtures properly:
- Create unique users per test with UUID-based usernames
- Handle CSRF token automatically
- Clean up clients with `await c.aclose()`
- Provide separate fixtures for different roles (user, admin, another_user)
This eliminates duplicate login code across tests.
backend/app/schemas_pydantic/execution.py (1)
77-78: SUPPORTED_RUNTIMES is properly initialized at module import time.

The refactor correctly replaces dynamic `get_settings().SUPPORTED_RUNTIMES` with a direct import from `app.runtime_registry`. `SUPPORTED_RUNTIMES` is defined as a module-level constant constructed statically from `LANGUAGE_SPECS`, ensuring it's available immediately upon import with no lazy initialization or settings dependencies.
1-18: LGTM: Clean import additions.

All new imports are utilized appropriately in the refactored tests. The additions support DI-based test setup and enum-based validations.
65-83: LGTM: Proper enum usage.

The refactoring correctly uses `NotificationStatus` enum values and the authenticated `test_user` fixture.
143-187: LGTM: Robust test with proper async handling.

The expanded test correctly:
- Creates test data via DI-injected `NotificationService`
- Uses the `eventually` helper to wait for async delivery before validation
- Validates strict decrease in unread count with a clear assertion message
The test properly handles the async nature of notification delivery and provides a reliable validation of the mark-all-read functionality.
189-220: LGTM: Thorough schema validation.

The test comprehensively validates the subscription response structure, including optional fields and webhook URL formats.
283-308: LGTM: Appropriate handling of environment-dependent configuration.

The test correctly handles the case where Slack subscriptions may be disabled via configuration, allowing the test to pass in different environments.
379-399: LGTM: Critical isolation test.

This test validates the important security invariant that notifications are properly isolated between users, preventing cross-user data leakage.
1-410: Excellent refactoring: cleaner, more maintainable tests.

This comprehensive refactoring successfully:
- Eliminates duplicate authentication boilerplate across all 15 test methods
- Leverages DI for proper service access and test data setup
- Uses domain enums for type-safe assertions and filtering
- Implements robust async handling with the `eventually` helper
- Maintains thorough validation of response schemas and business logic
- Preserves critical security tests (authentication requirements, user isolation)
The refactored tests are cleaner, more maintainable, and better aligned with the broader DI and Settings-driven architecture introduced in this PR.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @backend/tests/conftest.py:
- Around line 142-181: Adjust the status-code check in
_create_authenticated_client to reflect actual register endpoint responses:
accept 200 (success), 400 (validation error), and 409 (duplicate/email conflict)
instead of (200, 201, 400); update the conditional that tests r.status_code to
include 409 so duplicate-registration cases from /api/v1/auth/register are
treated as expected rather than failing the test.
🧹 Nitpick comments (1)
backend/tests/conftest.py (1)
27-42: Consider logging exceptions or catching more specific types.

The broad `except Exception` silently suppresses all errors, which could hide legitimate issues (e.g., misconfiguration, network problems). While this is intentional for unit tests without Kafka, consider:
- Logging suppressed exceptions at debug level for troubleshooting
- Catching more specific exceptions (e.g., `KafkaException`, `ConnectionError`)
- Or at minimum, adding a comment about what exceptions are expected
📝 Proposed enhancement with logging
```diff
 def pytest_configure(config: pytest.Config) -> None:
     """Create Kafka topics once before any tests run.

     Uses trylast=True to ensure pytest-env has set DOTENV_PATH first.
     Runs in master process before xdist workers spawn.
     Silently skips if Kafka is unavailable (e.g., unit tests).
     """
     # Only run in master process (not in xdist workers)
     if not hasattr(config, "workerinput"):
         try:
             asyncio.run(create_topics(Settings()))
-        except Exception:
+        except Exception as e:
             # Kafka not available (unit tests) - silently skip
-            pass
+            import logging
+            logging.getLogger(__name__).debug(f"Skipping topic creation: {e}")
```
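A runnable sketch of the suggested narrowing, with a hypothetical `create` callable standing in for `create_topics`:

```python
import logging

logger = logging.getLogger("conftest")


def try_create_topics(create, settings) -> bool:
    """Attempt topic creation; log and skip on expected connectivity failures.

    Unrelated errors (misconfiguration, bugs) still propagate instead of
    being swallowed by a bare `except Exception: pass`.
    """
    try:
        create(settings)
        return True
    except (ConnectionError, OSError) as exc:
        logger.debug("Skipping topic creation: %s", exc)
        return False


def failing_create(settings) -> None:
    raise ConnectionError("kafka unavailable")


skipped = try_create_topics(failing_create, None)
ok = try_create_topics(lambda settings: None, None)
```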
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
backend/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (4)
- backend/app/settings.py
- backend/pyproject.toml
- backend/scripts/__init__.py
- backend/tests/conftest.py
🚧 Files skipped from review as they are similar to previous changes (1)
- backend/app/settings.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/tests/conftest.py (3)
backend/scripts/create_topics.py (1)
create_topics (20-101)
backend/app/settings.py (1)
Settings (11-165)
backend/app/main.py (1)
create_app(60-162)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: Scan Backend
- GitHub Check: E2E Tests
- GitHub Check: Build Frontend
- GitHub Check: cubic · AI code reviewer
- GitHub Check: E2E Tests
- GitHub Check: Integration Tests
🔇 Additional comments (10)
backend/pyproject.toml (4)
185-190: LGTM! Stricter type checking with pydantic.mypy enhances type safety.

The addition of the pydantic.mypy plugin with strict configuration options (`init_forbid_extra`, `init_typed`, `warn_required_dynamic_aliases`) is excellent for catching type errors early, especially given the extensive Settings-driven dependency injection introduced in this PR.
216-216: Improved test parallelization with auto-detection.

The change from `-n 4` to `-n auto` allows pytest-xdist to automatically determine the optimal number of workers based on available CPU cores, which improves resource utilization across different environments. The removal of duplicate `-q` flags and `--no-header` also makes test output more informative for debugging.
218-220: Excellent test environment isolation with DOTENV_PATH.

The `[tool.pytest_env]` section with `DOTENV_PATH = ".env.test"` ensures test runs load test-specific configuration, which prevents tests from accidentally using development or production settings. This aligns with the Settings-driven dependency injection architecture introduced throughout this PR.
144-144: Dependency versions verified.

Both pytest 8.4.2 and pytest-env 1.2.0 exist on PyPI and are valid releases. The version updates and exact pinning are appropriate.
backend/tests/conftest.py (6)
1-17: LGTM! Imports are appropriate.

The new imports for `FastAPI` and `create_topics` support the refactored test infrastructure with typed fixtures and Kafka topic bootstrapping.
19-23: LGTM! Worker isolation is well-designed.

The per-worker isolation using Redis DB indices is a clean solution. The assertion prevents accidental over-allocation and provides a clear error message if violated.
47-74: LGTM! Excellent test settings isolation strategy.

The per-worker isolation is well-designed with clear documentation of what's isolated vs. shared. Using `model_copy` with `update` is the correct Pydantic v2 approach.
78-92: LGTM! Proper lifespan handling.

Using `lifespan_context` to trigger startup/shutdown events is the correct approach and integrates cleanly with the DI container initialization flow.
94-136: LGTM! Clean fixture design with proper typing.

The explicit type hints and generator-based returns improve clarity. The note about Dishka handling Redis cleanup is helpful.
184-226: LGTM! Well-designed authenticated client fixtures. The fixtures provide clean, isolated authenticated clients for different test scenarios. The code duplication across the three fixtures is acceptable here, as it keeps each fixture clear and self-contained.
The unique ID generation prevents collisions in parallel tests, and proper cleanup ensures no resource leaks.
1 issue found across 4 files (changes from recent commits).
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them.
<file name="backend/tests/integration/dlq/test_dlq_discard.py">
<violation number="1" location="backend/tests/integration/dlq/test_dlq_discard.py:160">
P2: Weak assertion: `>= 1` could pass even if this test's discard operation failed, due to leftover data from other tests. Consider capturing the count before the discard and asserting it increased by exactly 1.</violation>
</file>
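The suggested before/after delta pattern can be sketched against a toy stats object (names here are illustrative, not the repository's real API):

```python
class FakeDLQStats:
    """Toy stand-in for the repository's get_dlq_stats() aggregation."""

    def __init__(self) -> None:
        # Pretend two discarded messages are left over from earlier tests.
        self.by_status: dict[str, int] = {"discarded": 2}

    def discard_one(self) -> None:
        self.by_status["discarded"] = self.by_status.get("discarded", 0) + 1

stats = FakeDLQStats()
before = stats.by_status.get("discarded", 0)
stats.discard_one()  # the operation under test
after = stats.by_status.get("discarded", 0)

# A bare `after >= 1` would pass on leftovers alone; the delta cannot.
assert after == before + 1
```

The delta assertion fails if and only if this test's own discard did not land, regardless of what other tests left behind.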
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @backend/tests/integration/dlq/test_dlq_discard.py:
- Around line 144-160: The test test_dlq_stats_reflect_discarded_messages is
looking up DLQ stats with an enum key but get_dlq_stats() populates by_status
with string keys; update the assertion to use the string key form (e.g.,
stats.by_status.get(DLQMessageStatus.DISCARDED.value, 0) or
stats.by_status.get(DLQMessageStatus.DISCARDED.name.lower(), 0)) so the lookup
matches how DLQRepository.get_dlq_stats() aggregates keys; modify the
stats.by_status.get(...) call accordingly in the test.
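The key mismatch is easy to reproduce in isolation: when the aggregation uses string keys, an enum-keyed lookup silently returns the default. The names below are illustrative stand-ins mirroring the `by_status` shape the review describes:

```python
from enum import Enum

class DLQMessageStatus(Enum):  # illustrative stand-in for app.dlq.models
    DISCARDED = "discarded"

# get_dlq_stats() aggregates counts under string keys (e.g. Mongo group output)
by_status: dict[str, int] = {"discarded": 3}

assert by_status.get(DLQMessageStatus.DISCARDED, 0) == 0        # enum key: silent miss
assert by_status.get(DLQMessageStatus.DISCARDED.value, 0) == 3  # string key: hit
```

Because `dict.get` never raises on a missing key, the enum-keyed lookup degrades to the default `0` rather than failing loudly, which is why the test assertion passed for the wrong reason.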
🧹 Nitpick comments (4)
backend/tests/integration/dlq/test_dlq_discard.py (1)
20-45: Use `.value` instead of `str(...)` when persisting enum fields to Mongo.
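A small sketch of why `.value` is the safer choice — `StringEnum` here is an illustrative base whose `__str__` currently mirrors the value, as in many projects, not the app's actual class:

```python
from enum import Enum

class StringEnum(str, Enum):
    """Illustrative base: __str__ currently mirrors .value."""

    def __str__(self) -> str:
        return self.value

class KafkaTopic(StringEnum):
    EXECUTION_EVENTS = "execution-events"

# Both work today, but only .value is immune to a future __str__ change:
assert str(KafkaTopic.EXECUTION_EVENTS) == "execution-events"
assert KafkaTopic.EXECUTION_EVENTS.value == "execution-events"
```

If `__str__` were ever reverted to `Enum`'s default `"Class.MEMBER"` form, only the `str(...)` call sites would silently start persisting the wrong string; `.value` call sites would be unaffected.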
If `KafkaTopic`/`StringEnum` ever changes `__str__`, this will silently store the wrong topic string.

Proposed patch
```diff
 doc = DLQMessageDocument(
-    original_topic=str(KafkaTopic.EXECUTION_EVENTS),
+    original_topic=KafkaTopic.EXECUTION_EVENTS.value,
     error="Test error",
     retry_count=0,
     failed_at=now,
     status=status,
     producer_id="test-producer",
     created_at=now,
 )
```

backend/tests/integration/events/test_event_store.py (3)
1-17: LGTM: Imports and setup are appropriate. The imports and pytest markers are well-structured for MongoDB integration tests. The logger is defined but not used in any test functions; consider removing it or adding debug logging to improve test observability.
91-109: Add assertion for `failed` count to match batch test pattern. The test validates `total`, `stored`, and `duplicates` but omits the `failed` count. For consistency with `test_event_store_stores_batch` (line 61) and completeness, add:

✨ Suggested addition

```diff
 assert results["total"] == 2
 assert results["stored"] == 1  # Only the new one
 assert results["duplicates"] == 1  # The duplicate
+assert results["failed"] == 0
```
129-153: Consider strengthening the assertion to verify specific events. The test validates that at least 3 events are returned, but doesn't confirm that the specific events stored in lines 136-140 are included in the results. This could pass with stale data from previous test runs.
♻️ Optional improvement
```diff
 # Query by type
 retrieved = await store.get_events_by_type(
     EventType.EXECUTION_REQUESTED,
     limit=100,
 )
-# Should find at least our 3 events
-assert len(retrieved) >= 3
+# Should find at least our 3 events
+stored_ids = {e.event_id for e in events}
+retrieved_ids = {e.event_id for e in retrieved}
+assert stored_ids.issubset(retrieved_ids), "All stored events should be retrieved"
 # All should be EXECUTION_REQUESTED
 for ev in retrieved:
     assert ev.event_type == EventType.EXECUTION_REQUESTED
```

This verifies that the specific events stored in this test are actually retrieved, not just any 3 events.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- backend/tests/integration/dlq/test_dlq_discard.py
- backend/tests/integration/dlq/test_dlq_discard_policy.py
- backend/tests/integration/events/test_event_store.py
- backend/tests/integration/events/test_event_store_consumer.py
💤 Files with no reviewable changes (2)
- backend/tests/integration/events/test_event_store_consumer.py
- backend/tests/integration/dlq/test_dlq_discard_policy.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/tests/integration/dlq/test_dlq_discard.py (3)
backend/app/db/repositories/dlq_repository.py (3)
- `DLQRepository` (22-185)
- `mark_message_discarded` (175-185)
- `get_dlq_stats` (33-96)

backend/app/dlq/models.py (1)
- `DLQMessageStatus` (10-16)

backend/app/domain/enums/kafka.py (1)
- `KafkaTopic` (7-53)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: E2E Tests
- GitHub Check: Build Frontend
- GitHub Check: cubic · AI code reviewer
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
🔇 Additional comments (4)
backend/tests/integration/events/test_event_store.py (4)
19-42: Test logic is correct with adequate buffer for TTL validation. The single event storage test is well-structured and validates all key fields. The TTL check at line 41 uses a 1-day buffer (89 vs 90 days), which provides sufficient tolerance for test execution time.

44-68: LGTM: Batch storage test is thorough and correct. The test properly validates both aggregate results and individual event persistence in MongoDB.

70-89: LGTM: Duplicate handling test validates idempotent behavior. The test correctly verifies that duplicate event IDs don't create multiple documents while still returning success status.

111-127: LGTM: Retrieval by ID test validates core functionality. The test correctly verifies event retrieval by ID and checks the essential fields.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @backend/tests/integration/dlq/test_dlq_discard.py:
- Around line 145-161: Add a serialization/xdist marker to
test_dlq_stats_reflect_discarded_messages so it cannot run in parallel with
other DLQ tests; specifically, annotate the test function
test_dlq_stats_reflect_discarded_messages with the same xdist/serial marker used
elsewhere (e.g., @pytest.mark.xdist_group("dlq-stats") or @pytest.mark.serial)
so the before/after delta assertion is not affected by concurrent tests.
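A hypothetical version of the marker usage, assuming the suite runs pytest-xdist with `--dist loadgroup` (group name is illustrative):

```python
import pytest

@pytest.mark.xdist_group("dlq-stats")
def test_dlq_stats_reflect_discarded_messages() -> None:
    """With pytest-xdist's --dist loadgroup mode, every test carrying the
    same group name is scheduled onto one worker, so before/after stat
    deltas cannot be skewed by concurrent DLQ tests."""
    ...
```

Any other test that mutates DLQ stats would need the same `"dlq-stats"` group name for the serialization guarantee to hold.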
🧹 Nitpick comments (2)
backend/tests/conftest.py (1)
38-42: Consider logging unexpected failures for debuggability. The broad `except Exception` is intentional for unit tests without Kafka, but it could mask unexpected errors (e.g., misconfigured Settings). Consider logging at debug level to aid troubleshooting:

📝 Optional: Add debug logging

```diff
+import logging
+
+_logger = logging.getLogger(__name__)
+
 def pytest_configure(config: pytest.Config) -> None:
     ...
     if not hasattr(config, "workerinput"):
         try:
             asyncio.run(create_topics(Settings()))
-        except Exception:
-            # Kafka not available (unit tests) - silently skip
-            pass
+        except Exception as exc:
+            # Kafka not available (unit tests) - skip with debug log
+            _logger.debug("Skipping topic creation: %s", exc)
```

backend/tests/integration/dlq/test_dlq_discard.py (1)
17-17: Remove unused logger. The `_test_logger` is defined but never used in any of the test functions.

♻️ Proposed fix

```diff
-_test_logger = logging.getLogger("test.dlq.discard")
-
-
```
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- backend/tests/conftest.py
- backend/tests/integration/dlq/test_dlq_discard.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/tests/integration/dlq/test_dlq_discard.py (1)
backend/app/db/repositories/dlq_repository.py (2)
- `mark_message_discarded` (175-185)
- `get_dlq_stats` (33-96)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: E2E Tests
- GitHub Check: Build Frontend
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
🔇 Additional comments (8)
backend/tests/conftest.py (6)
1-17: LGTM! Imports are well-organized with clear separation between standard library, external packages, and internal modules.
19-23: LGTM! Per-worker isolation using Redis DB index is a clean approach. The assertion prevents silent failures when exceeding the 16-DB limit.
45-74: LGTM! Excellent documentation explaining what is isolated per-worker vs shared. The `model_copy` pattern is correct for Pydantic v2, and disabling telemetry in tests prevents noise.
77-135: LGTM! Well-structured fixtures with appropriate scopes:
- Session-scoped app with proper lifespan handling avoids schema validator memory issues.
- The `lifespan_context` pattern is correct for triggering FastAPI startup/shutdown in tests.
- DI scope management is clean with explicit comments about cleanup ownership.
142-182: LGTM! The helper function handles resource cleanup correctly:
- Closes client on any exception before re-raising (lines 180-182).
- Provides clear error messages when CSRF token is missing.
- Accepts idempotent registration status codes (existing user from previous run).
The pattern of returning an unclosed client for the caller to manage is appropriate here since the fixture teardown handles final cleanup.
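The CSRF wiring inside such a helper reduces to a small pure function. Cookie and header names below are hypothetical, mirroring the double-submit pattern the review describes rather than the app's exact names:

```python
def csrf_headers(cookies: dict[str, str], cookie_name: str = "csrf_token") -> dict[str, str]:
    """Build the X-CSRF-Token header from a login response's cookie jar.

    Failing fast here gives tests a clear message instead of opaque 403s later.
    """
    token = cookies.get(cookie_name)
    if not token:
        raise RuntimeError(f"CSRF cookie {cookie_name!r} missing - did login succeed?")
    return {"X-CSRF-Token": token}

# After register/login, every mutating request attaches the header:
headers = csrf_headers({"csrf_token": "abc123", "session": "s1"})
```

Keeping this as a pure function makes the missing-token error path trivially testable without spinning up the app.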
185-227: LGTM! Clean, consistent pattern across all authenticated client fixtures:
- Unique usernames per invocation prevent cross-test interference.
- Proper cleanup via `await c.aclose()` after yield.
- Clear docstrings explain the purpose and CSRF handling.
backend/tests/integration/dlq/test_dlq_discard.py (2)
20-44: LGTM! The helper function properly creates test DLQ documents with appropriate defaults and timezone-aware timestamps. Good use of DI-compatible patterns.

47-141: LGTM! These tests properly verify DLQ discard behavior with good isolation. Each test creates a unique document and verifies only that document's state, preventing cross-test interference. Proper use of AsyncContainer DI pattern.
Actionable comments posted: 0
🧹 Nitpick comments (2)
backend/tests/integration/dlq/test_dlq_retry.py (2)
120-137: LGTM! Stats validation with delta comparison. The before/after approach effectively validates that the retry operation increments the RETRIED count. The test is well-isolated with unique event IDs.
Note: If tests run concurrently, other tests creating RETRIED messages between the before/after snapshots could theoretically affect this assertion. However, given the PR objectives mention test stability improvements, this isolation is likely already handled via test grouping or serial execution.
1-137: Consider expanding test coverage for completeness. The current tests provide solid coverage of DLQ retry functionality. For more comprehensive testing, consider adding:
- Tests for `mark_message_discarded` (mentioned in the repository snippets)
- Error condition tests (database failures, network issues)
- Invalid status transition tests (e.g., attempting to retry an already RETRIED or DISCARDED message)
These additions would further strengthen the test suite but are not blocking for the current functionality.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- backend/tests/helpers/events.py
- backend/tests/integration/dlq/test_dlq_retry.py
- backend/tests/integration/dlq/test_dlq_retry_immediate.py
💤 Files with no reviewable changes (1)
- backend/tests/integration/dlq/test_dlq_retry_immediate.py
🧰 Additional context used
🧬 Code graph analysis (2)
backend/tests/helpers/events.py (1)
backend/app/services/coordinator/queue_manager.py (1)
- `execution_id` (30-31)
backend/tests/integration/dlq/test_dlq_retry.py (4)
backend/app/db/repositories/dlq_repository.py (2)
- `mark_message_retried` (164-173)
- `get_dlq_stats` (33-96)

backend/app/dlq/models.py (1)
- `DLQMessageStatus` (10-16)

backend/app/domain/enums/kafka.py (1)
- `KafkaTopic` (7-53)

backend/tests/helpers/events.py (1)
- `make_execution_requested_event` (8-51)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: E2E Tests
- GitHub Check: Build Frontend
- GitHub Check: Integration Tests
- GitHub Check: E2E Tests
🔇 Additional comments (6)
backend/tests/helpers/events.py (1)
35-50: LGTM! Test factory correctly aligned with production schema. The addition of `aggregate_id=execution_id` properly reflects the updated ExecutionRequestedEvent schema. The inline comment clearly documents the production semantics.

backend/tests/integration/dlq/test_dlq_retry.py (5)
20-44: LGTM! Clean helper function for DLQ test documents.The helper appropriately separates concerns between
event_id(the DLQ message identifier) and the event payload'sexecution_id. Good use of sensible defaults and themake_execution_requested_eventfactory.
47-65: LGTM! Comprehensive test of retry status transition. The test correctly verifies both the status change to RETRIED and the presence of the `retried_at` timestamp.
68-76: LGTM! Good edge case coverage. Properly validates that `mark_message_retried` returns `False` for nonexistent messages.
79-97: LGTM! Solid timestamp validation. The time-window assertion effectively guards against timezone issues and ensures the timestamp is set correctly during the retry operation.
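The time-window assertion pattern in isolation, timezone-aware throughout (`mark_retried` is a toy stand-in for the repository method):

```python
from datetime import datetime, timezone

def mark_retried() -> datetime:
    """Stand-in for mark_message_retried(): stamps the retry time in UTC."""
    return datetime.now(timezone.utc)

before = datetime.now(timezone.utc)
retried_at = mark_retried()  # the operation under test
after = datetime.now(timezone.utc)

# A naive datetime on either side would raise TypeError on comparison,
# so this window check also guards against timezone regressions.
assert before <= retried_at <= after
```

Bracketing the operation with two clock reads keeps the assertion robust regardless of how long the call itself takes.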
100-117: LGTM! Good coverage of status transition from PENDING. This test complements the SCHEDULED→RETRIED test by validating that PENDING messages can also transition to RETRIED status.
Summary by cubic
Added CSRF protection middleware and expanded Settings-driven dependency injection across backend, events, and metrics to improve consistency and test stability. SSE streams now emit a “subscribed” control event when the subscription is ready.
Bug Fixes
Refactors
Written for commit 2f0ee35. Summary will update on new commits.
Summary by CodeRabbit
New Features
Refactor
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.