2 changes: 1 addition & 1 deletion docs/architecture/authentication.md
@@ -60,7 +60,7 @@ JWTs are signed with HS256 using a secret key from settings:
```

The token payload contains the username in the `sub` claim and an expiration time. Token lifetime is configured via
`ACCESS_TOKEN_EXPIRE_MINUTES` (default: 30 minutes).
`ACCESS_TOKEN_EXPIRE_MINUTES` (default: 24 hours / 1440 minutes).
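
As a hedged illustration of what issuing such a token might look like with PyJWT (the setting names and values here are assumptions, not the app's actual code):

```python
# Hypothetical sketch, not the project's implementation: build an HS256
# JWT with the username in `sub` and an expiry derived from the setting.
from datetime import datetime, timedelta, timezone

import jwt  # PyJWT

SECRET_KEY = "change-me"            # loaded from settings in the real app
ACCESS_TOKEN_EXPIRE_MINUTES = 1440  # 24 hours, per the default above

def create_access_token(username: str) -> str:
    expires = datetime.now(timezone.utc) + timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
    return jwt.encode({"sub": username, "exp": expires}, SECRET_KEY, algorithm="HS256")
```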

### CSRF Validation

12 changes: 10 additions & 2 deletions docs/architecture/overview.md
@@ -1,8 +1,16 @@
# Architecture overview

In this file, you can find broad description of main components of project architecture.
Preciser info about peculiarities of separate components (SSE, Kafka topics, DLQ, ..) are in
the [Components](../components/dead-letter-queue.md) section.
For details on specific components, see:

- [SSE Architecture](../components/sse/sse-architecture.md)
- [Dead Letter Queue](../components/dead-letter-queue.md)
- [Kafka Topics](kafka-topic-architecture.md)
- [Workers](../components/workers/pod_monitor.md)

!!! note "Event Streaming"
Kafka event streaming is disabled by default (`ENABLE_EVENT_STREAMING=false`). Set this to `true` in your
environment to enable the full event-driven architecture.

## System overview

16 changes: 12 additions & 4 deletions docs/architecture/rate-limiting.md
@@ -66,6 +66,10 @@ The platform ships with default rate limits organized by endpoint group. Higher
Execution endpoints have the strictest limits since they spawn Kubernetes pods. The catch-all API rule (priority 1)
applies to any endpoint not matching a more specific pattern.

!!! note "WebSocket rule"
The `/api/v1/ws` pattern is reserved for future WebSocket support. The platform currently uses Server-Sent Events
(SSE) for real-time updates via `/api/v1/events/*`.

## Middleware Integration

The `RateLimitMiddleware` intercepts all HTTP requests, extracts the user identifier, and checks against the configured
@@ -135,10 +139,14 @@ Configuration is cached in Redis for 5 minutes to reduce database load while all

Rate limiting is controlled by environment variables:

| Variable | Default | Description |
|---------------------------|--------------|---------------------------------------|
| `RATE_LIMIT_ENABLED` | `true` | Enable/disable rate limiting globally |
| `RATE_LIMIT_REDIS_PREFIX` | `ratelimit:` | Redis key prefix for isolation |
| Variable | Default | Description |
|---------------------------|------------------|------------------------------------------------------|
| `RATE_LIMIT_ENABLED` | `true` | Enable/disable rate limiting globally |
| `RATE_LIMIT_REDIS_PREFIX` | `rate_limit:` | Redis key prefix for isolation |
| `RATE_LIMIT_ALGORITHM` | `sliding_window` | Algorithm to use (`sliding_window` or `token_bucket`)|
| `RATE_LIMIT_DEFAULT_REQUESTS` | `100` | Default request limit |
| `RATE_LIMIT_DEFAULT_WINDOW` | `60` | Default window in seconds |
| `RATE_LIMIT_BURST_MULTIPLIER` | `1.5` | Burst multiplier for token bucket |

The system gracefully degrades when Redis is unavailable—requests are allowed through rather than failing closed.
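
To make the fail-open idea concrete, here is a hedged sketch using a simplified fixed-window counter (the platform defaults to `sliding_window`; function and key names are illustrative):

```python
# Illustrative only: count requests per key in Redis and allow traffic
# through whenever Redis errors out, mirroring the fail-open behavior.
import redis.asyncio as aioredis

async def is_allowed(r: aioredis.Redis, key: str, limit: int, window_s: int) -> bool:
    try:
        count = await r.incr(f"rate_limit:{key}")
        if count == 1:
            await r.expire(f"rate_limit:{key}", window_s)  # start the window
        return count <= limit
    except Exception:
        return True  # fail open: never reject because Redis is unavailable
```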

49 changes: 38 additions & 11 deletions docs/architecture/runtime-registry.md
@@ -77,23 +77,50 @@ The example scripts intentionally use features that may not work on older versio
compatibility. For instance, Python's match statement (3.10+), Node's `Promise.withResolvers()` (22+), and Go's
`clear()` function (1.21+).
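
As a quick illustration of one such feature, Python's structural pattern matching only parses on 3.10+, so running it on an older runtime fails immediately (this snippet is illustrative, not one of the project's example scripts):

```python
# Requires Python 3.10+: `match` is a syntax error on older versions.
def describe(value: object) -> str:
    match value:
        case []:
            return "empty list"
        case [x]:
            return f"one item: {x}"
        case {"name": name}:
            return f"mapping named {name}"
        case _:
            return "something else"

print(describe([42]))  # one item: 42
```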

## API Endpoint
## API Endpoints

The `/api/v1/languages` endpoint returns the available runtimes:
The runtime information is available via two endpoints:

### GET /api/v1/k8s-limits

Returns resource limits and supported runtimes:

```json
{
  "cpu_limit": "1000m",
  "memory_limit": "128Mi",
  "cpu_request": "1000m",
  "memory_request": "128Mi",
  "execution_timeout": 300,
  "supported_runtimes": {
    "python": {"versions": ["3.7", "3.8", "3.9", "3.10", "3.11", "3.12"], "file_ext": "py"},
    "node": {"versions": ["18", "20", "22"], "file_ext": "js"},
    "ruby": {"versions": ["3.1", "3.2", "3.3"], "file_ext": "rb"},
    "go": {"versions": ["1.20", "1.21", "1.22"], "file_ext": "go"},
    "bash": {"versions": ["5.1", "5.2", "5.3"], "file_ext": "sh"}
  }
}
```

### GET /api/v1/example-scripts

Returns example scripts for each language:

```json
{
  "python": {"versions": ["3.7", "3.8", "3.9", "3.10", "3.11", "3.12"], "file_ext": "py"},
  "node": {"versions": ["18", "20", "22"], "file_ext": "js"},
  "ruby": {"versions": ["3.1", "3.2", "3.3"], "file_ext": "rb"},
  "go": {"versions": ["1.20", "1.21", "1.22"], "file_ext": "go"},
  "bash": {"versions": ["5.1", "5.2", "5.3"], "file_ext": "sh"}
  "scripts": {
    "python": "# Python example script...",
    "node": "// Node.js example script...",
    "ruby": "# Ruby example script...",
    "go": "package main...",
    "bash": "#!/bin/bash..."
  }
}
```
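
A minimal client sketch for both endpoints (httpx, the base URL, and TLS handling are assumptions; only the paths and response shapes come from this page):

```python
# Hypothetical usage sketch, not project code.
import httpx

BASE = "https://localhost"  # assumed deployment URL

limits = httpx.get(f"{BASE}/api/v1/k8s-limits", verify=False).json()
print(limits["supported_runtimes"]["python"]["versions"])

examples = httpx.get(f"{BASE}/api/v1/example-scripts", verify=False).json()
print(examples["scripts"]["python"])
```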

## Key Files

| File | Purpose |
|----------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------|
| [`runtime_registry.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/runtime_registry.py) | Language specifications and runtime config generation |
| [`api/routes/languages.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/api/routes/languages.py) | API endpoint for available languages |
| File | Purpose |
|----------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------|
| [`runtime_registry.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/runtime_registry.py) | Language specifications and runtime config generation |
| [`api/routes/execution.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/api/routes/execution.py) | API endpoints including k8s-limits and example-scripts |
8 changes: 8 additions & 0 deletions docs/components/dead-letter-queue.md
@@ -88,3 +88,11 @@ If sending to DLQ fails (extremely rare - would mean Kafka is down), the produce

The system is designed to be resilient but not perfect. In catastrophic scenarios, you still have Kafka's built-in durability and the ability to replay topics from the beginning if needed.

## Key files

| File | Purpose |
|--------------------------------------------------------------------------------------------------------------------------|------------------------|
| [`dlq_processor.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/workers/dlq_processor.py) | DLQ processor worker |
| [`manager.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/dlq/manager.py) | DLQ management logic |
| [`unified_producer.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/events/unified_producer.py) | `send_to_dlq()` method |
| [`dlq.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/api/routes/dlq.py) | Admin API routes |
7 changes: 7 additions & 0 deletions docs/components/saga/resource-allocation.md
@@ -104,3 +104,10 @@ if active_count >= 100: # <- adjust this value
```

Future improvements could make this configurable per-language or dynamically adjustable based on cluster capacity.

## Key files

| File | Purpose |
|-----------------------------------------------------------------------------------------------------------------------------------------------|--------------------------|
| [`execution_saga.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/services/saga/execution_saga.py) | Saga with allocation step |
| [`resource_allocation_repository.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/db/repositories/resource_allocation_repository.py) | MongoDB operations |
8 changes: 8 additions & 0 deletions docs/components/schema-manager.md
@@ -33,3 +33,11 @@ During API startup, the `lifespan` function in `dishka_lifespan.py` gets the dat
To force a specific MongoDB migration to run again, delete its document from `schema_versions`. To start fresh, point the app at a new database. Migrations are designed to be additive; the system doesn't support automatic rollbacks. If you need to undo a migration in production, you'll have to drop indexes or modify validators manually.
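
As a hedged sketch of that first operation with PyMongo (the database name and the document's key field are assumptions; check the actual document shape before deleting anything):

```python
# Illustrative only: remove one migration record so it re-runs on startup.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["integr8scode"]
db["schema_versions"].delete_one({"version": "003"})  # hypothetical key/value
```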

For Kafka schemas, the registry keeps all versions. If you break compatibility and need to start over, delete the subject from the registry (either via REST API or the registry's UI if available) and let the app re-register on next startup.

## Key files

| File | Purpose |
|--------------------------------------------------------------------------------------------------------------------------------|----------------------------|
| [`schema_manager.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/db/schema/schema_manager.py) | MongoDB migrations |
| [`schema_registry.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/events/schema/schema_registry.py) | Kafka Avro serialization |
| [`dishka_lifespan.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/dishka_lifespan.py) | Startup initialization |
7 changes: 7 additions & 0 deletions docs/components/sse/execution-sse-flow.md
@@ -9,3 +9,10 @@ The SSE router maintains a small pool of Kafka consumers and routes only the eve
Using `result_stored` as the terminal signal removes artificial waiting. Earlier iterations ended the SSE stream on `execution_completed`/`failed`/`timeout` and slept on the server to "give Mongo time" to commit. That pause is unnecessary once the stream ends only after the result processor confirms persistence.

This approach preserves clean attribution and ordering. The coordinator enriches pod creation commands with user information so pods are labeled correctly. The pod monitor converts Kubernetes phases into domain events. Timeout classification is deterministic: any pod finishing with `reason=DeadlineExceeded` results in an `execution_timeout` event. The result processor is the single writer of terminal state, so the UI never races the database — when the browser sees `result_stored`, the result is already present.
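
A hedged client-side sketch of this contract, treating `result_stored` as the terminal event (the exact stream path and payload shape are assumptions; the event name and `/api/v1/events/*` prefix come from these docs):

```python
# Illustrative SSE reader: stream events and return once persistence is
# confirmed, i.e. on `result_stored`.
import json

import httpx

def wait_for_result(execution_id: str) -> dict:
    url = f"https://localhost/api/v1/events/executions/{execution_id}"  # assumed path
    with httpx.stream("GET", url, timeout=None) as resp:
        event = None
        for line in resp.iter_lines():
            if line.startswith("event:"):
                event = line.split(":", 1)[1].strip()
            elif line.startswith("data:") and event == "result_stored":
                return json.loads(line.split(":", 1)[1])
    raise RuntimeError("stream ended before result_stored")
```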

## Related docs

- [SSE Architecture](sse-architecture.md) — overall SSE design and components
- [SSE Partitioned Router](sse-partitioned-architecture.md) — consumer pool and scaling
- [Result Processor](../workers/result_processor.md) — terminal event handling
- [Pod Monitor](../workers/pod_monitor.md) — Kubernetes event translation
9 changes: 9 additions & 0 deletions docs/components/sse/sse-architecture.md
@@ -82,3 +82,12 @@ Key metrics: `sse.connections.active`, `sse.messages.sent.total`, `sse.connectio
## Why not WebSockets?

WebSockets were initially implemented but removed: SSE is sufficient for server-to-client communication, offers simpler connection management, has better proxy compatibility (many corporate proxies block WebSockets), enjoys excellent browser support with automatic reconnection, and works well with HTTP/2 multiplexing.

## Key files

| File | Purpose |
|-----------------------------------------------------------------------------------------------------------------------------------|------------------------|
| [`sse_service.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/services/sse/sse_service.py) | Client connections |
| [`kafka_redis_bridge.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/services/sse/kafka_redis_bridge.py) | Kafka-to-Redis routing |
| [`redis_bus.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/services/sse/redis_bus.py) | Redis pub/sub wrapper |
| [`sse_shutdown_manager.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/services/sse/sse_shutdown_manager.py) | Graceful shutdown |
7 changes: 7 additions & 0 deletions docs/components/sse/sse-partitioned-architecture.md
@@ -33,3 +33,10 @@ Memory management uses configurable buffer limits: max size, TTL for expiration,
The `SSEShutdownManager` complements the router. The router handles the data plane (Kafka consumers, event routing to Redis). The shutdown manager handles the control plane (tracking connections, coordinating graceful shutdown, notifying clients).

When the server shuts down, SSE clients receive a shutdown event so they can display messages and attempt reconnection. The shutdown manager implements phased shutdown: notify clients, wait for graceful disconnection, force-close remaining connections. SSE connections register with the shutdown manager and monitor a shutdown event while streaming from Redis. When shutdown triggers, connections send shutdown messages and close gracefully.
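
A minimal sketch of the phased sequence described above (class and method names are illustrative, not the actual `SSEShutdownManager` API):

```python
# Illustrative three-phase shutdown: notify, wait for graceful close,
# then force-close the remainder.
import asyncio
from dataclasses import dataclass, field

@dataclass
class Conn:
    closed: asyncio.Event = field(default_factory=asyncio.Event)

    async def notify_shutdown(self) -> None:
        ...  # send an SSE "shutdown" event to this client

async def phased_shutdown(conns: list[Conn], grace_s: float = 5.0) -> None:
    for c in conns:                     # phase 1: notify clients
        await c.notify_shutdown()
    try:                                # phase 2: wait for graceful close
        await asyncio.wait_for(
            asyncio.gather(*(c.closed.wait() for c in conns)), grace_s
        )
    except asyncio.TimeoutError:        # phase 3: force-close stragglers
        for c in conns:
            c.closed.set()
```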

## Key files

| File | Purpose |
|-----------------------------------------------------------------------------------------------------------------------------------|----------------------|
| [`kafka_redis_bridge.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/services/sse/kafka_redis_bridge.py) | Consumer pool router |
| [`sse_shutdown_manager.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/services/sse/sse_shutdown_manager.py) | Connection tracking |
74 changes: 74 additions & 0 deletions docs/components/workers/coordinator.md
@@ -0,0 +1,74 @@
# Coordinator

The coordinator owns admission and queuing policy for executions. It decides which executions can proceed based on
available resources and enforces per-user limits to prevent any single user from monopolizing the system.

```mermaid
graph LR
Kafka[(Kafka)] --> Coord[Coordinator]
Coord --> Queue[Priority Queue]
Coord --> Resources[Resource Pool]
Coord --> Accepted[Accepted Events]
Accepted --> Kafka
```

## How it works

When an `ExecutionRequested` event arrives, the coordinator checks:

1. Is the queue full? (max 10,000 pending)
2. Has this user exceeded their limit? (max 100 concurrent)
3. Are there enough CPU and memory resources?

If all checks pass, the coordinator allocates resources and publishes `ExecutionAccepted`. Otherwise, the request
is either queued for later or rejected.
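
A condensed sketch of these checks (field names, and which failures queue versus reject, are assumptions beyond the three rules above; defaults mirror the resource table below):

```python
# Illustrative admission decision, not the actual coordinator code.
from dataclasses import dataclass, field

@dataclass
class AdmissionState:
    queue_len: int = 0
    user_active: dict[str, int] = field(default_factory=dict)
    free_cpu: float = 32 * 1.2          # pool with 20% overcommit
    free_mem_mb: float = 65_536 * 1.2

    def admit(self, user: str, cpu: float, mem_mb: float) -> str:
        if self.queue_len >= 10_000:
            return "rejected"           # queue full
        if self.user_active.get(user, 0) >= 100:
            return "queued"             # per-user limit hit
        if cpu > self.free_cpu or mem_mb > self.free_mem_mb:
            return "queued"             # wait for resources to free up
        self.free_cpu -= cpu
        self.free_mem_mb -= mem_mb
        return "accepted"               # publish ExecutionAccepted
```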

The coordinator runs a background scheduling loop that continuously pulls from the priority queue and attempts to
schedule pending executions as resources become available.

## Priority queue

Executions are processed in priority order. Lower numeric values are processed first:

```python
--8<-- "backend/app/services/coordinator/queue_manager.py:14:19"
```

When resources are unavailable, executions are requeued with a reduced priority value; since lower values are processed first, repeatedly deferred executions move forward rather than starving.
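
A minimal model of this ordering with `heapq` (names are illustrative, not the project's queue manager):

```python
# Min-heap: lower priority values dequeue first; a counter keeps FIFO
# order within the same priority level.
import heapq
import itertools

_counter = itertools.count()
_heap: list[tuple[int, int, str]] = []

def enqueue(priority: int, execution_id: str) -> None:
    heapq.heappush(_heap, (priority, next(_counter), execution_id))

def requeue(priority: int, execution_id: str) -> None:
    enqueue(max(priority - 1, 0), execution_id)  # aging: sorts earlier next time

def dequeue() -> str:
    return heapq.heappop(_heap)[2]

enqueue(5, "exec-a")
enqueue(1, "exec-b")
assert dequeue() == "exec-b"  # lower value first
```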

## Resource management

The coordinator tracks a pool of CPU and memory resources:

| Parameter | Default | Description |
|---------------------------|---------|----------------------------|
| `total_cpu_cores` | 32 | Total CPU pool |
| `total_memory_mb` | 65,536 | Total memory pool (64GB) |
| `overcommit_factor` | 1.2 | Allow 20% overcommit |
| `max_queue_size` | 10,000 | Maximum pending executions |
| `max_executions_per_user` | 100 | Per-user limit |
| `stale_timeout_seconds` | 3,600 | Stale execution timeout |

## Topics

- **Consumes**: `execution_events` (requested, completed, failed, cancelled)
- **Produces**: `execution_events` (accepted)

## Key files

| File | Purpose |
|--------------------------------------------------------------------------------------------------------------------------------|-------------------------------|
| [`run_coordinator.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/workers/run_coordinator.py) | Entry point |
| [`coordinator.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/services/coordinator/coordinator.py) | Main coordinator service |
| [`queue_manager.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/services/coordinator/queue_manager.py) | Priority queue implementation |
| [`resource_manager.py`](https://github.com/HardMax71/Integr8sCode/blob/main/backend/app/services/coordinator/resource_manager.py) | Resource pool and allocation |

## Deployment

```yaml
coordinator:
build:
dockerfile: workers/Dockerfile.coordinator
```

Usually runs as a single replica. Leader election via Redis is available if scaling is needed.
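
A hedged sketch of what Redis-based leader election can look like (the key name, lease length, and loop cadence are all assumptions):

```python
# Illustrative lease-based election: one instance holds a short-lived
# Redis key and renews it; others idle until the lease lapses.
import asyncio

import redis.asyncio as aioredis

async def coordinator_loop(instance_id: str) -> None:
    r = aioredis.Redis()
    while True:
        if await r.set("coordinator:leader", instance_id, nx=True, ex=15):
            pass                                      # acquired the lease
        elif await r.get("coordinator:leader") == instance_id.encode():
            await r.expire("coordinator:leader", 15)  # renew our lease
        else:
            await asyncio.sleep(5)                    # someone else leads
            continue
        # ... perform one scheduling pass as leader ...
        await asyncio.sleep(5)
```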