feat(supervisor): add node affinity rules for large machine worker pool scheduling #2869
Conversation
**Background**

Runs with `large-1x` or `large-2x` machine presets are disproportionately affected by scheduling delays during peak times. This is partly because the worker pool is shared by all runs, so large runs compete with smaller runs for available capacity. Because large runs require significantly more CPU and memory, they are harder for the scheduler to bin-pack onto existing nodes, often requiring a node with a large amount of free resources or waiting for a new node to spin up entirely. This effect is amplified during peak times, when nodes are already densely packed with smaller workloads, leaving insufficient contiguous resources for large runs.

**Changes**

This PR adds Kubernetes node affinity settings to separate large and standard machine workloads across node pools.

- Controlled via the `KUBERNETES_LARGE_MACHINE_POOL_LABEL` env var (disabled when not set)
- Large machine presets (`large-*`) get a soft preference to schedule on the large pool, with fallback to standard nodes
- Non-large machines are excluded from the large pool via required anti-affinity
- This ensures the large machine pool is reserved for large workloads while allowing large workloads to spill over to standard nodes if needed (a sketch of the resulting affinity follows below)
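For illustration only, here is a rough sketch of the two affinity shapes described above, built with the `@kubernetes/client-node` types the supervisor already uses. The `machinepool` label key and the weight of 100 are taken from the review comments further down; the pool label value is whatever `KUBERNETES_LARGE_MACHINE_POOL_LABEL` is set to. This is not the exact code from the PR.

```ts
import * as k8s from "@kubernetes/client-node";

// Assumption: the large node pool carries the label machinepool=<env var value>.
const largePoolValue = process.env.KUBERNETES_LARGE_MACHINE_POOL_LABEL ?? "large";

// Large presets: soft preference for the large pool, with fallback to standard nodes.
const largeMachineAffinity: k8s.V1Affinity = {
  nodeAffinity: {
    preferredDuringSchedulingIgnoredDuringExecution: [
      {
        weight: 100,
        preference: {
          matchExpressions: [{ key: "machinepool", operator: "In", values: [largePoolValue] }],
        },
      },
    ],
  },
};

// Non-large presets: hard exclusion from the large pool via a required NotIn term.
const standardMachineAffinity: k8s.V1Affinity = {
  nodeAffinity: {
    requiredDuringSchedulingIgnoredDuringExecution: {
      nodeSelectorTerms: [
        {
          matchExpressions: [{ key: "machinepool", operator: "NotIn", values: [largePoolValue] }],
        },
      ],
    },
  },
};
```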
PR Review: Node Affinity Rules for Large Machine Worker Pool Scheduling

Summary

This PR adds Kubernetes node affinity configuration to separate large machine workloads (`large-1x`, `large-2x`) from standard workloads.

Code Quality & Design

Strengths:
Observations:
Potential Issues
Performance Considerations

The implementation looks efficient: no additional API calls or heavy computation. The affinity configuration is built once per pod creation.

Security

No security concerns. The feature is configuration-driven and affects pod scheduling only.

Minor Suggestions
Verdict

Approve with minor suggestions. The implementation is clean, well-scoped, and solves a real scheduling problem with a sensible asymmetric approach. The main suggestion is to add unit tests for the affinity logic to prevent regressions.
Actionable comments posted: 0
🧹 Nitpick comments (1)
apps/supervisor/src/workloadManager/kubernetes.ts (1)
365-410: Well-designed affinity logic with appropriate scheduling semantics.

The implementation correctly applies:
- Soft preference (weight 100) for large machines → allows fallback to standard nodes
- Hard exclusion for non-large machines → reserves the large pool
The `NotIn` operator correctly allows scheduling on nodes that don't have the `machinepool` label at all.

Consider adding debug logging when affinity rules are applied to aid troubleshooting scheduling issues:
♻️ Optional enhancement for observability
```diff
 #getNodeAffinity(preset: MachinePreset): k8s.V1Affinity | undefined {
   if (!env.KUBERNETES_LARGE_MACHINE_POOL_LABEL) {
     return undefined;
   }

   if (this.#isLargeMachine(preset)) {
+    this.logger.debug("[KubernetesWorkloadManager] Applying soft affinity for large machine pool", {
+      preset: preset.name,
+      poolLabel: env.KUBERNETES_LARGE_MACHINE_POOL_LABEL,
+    });
     // soft preference for the large-machine pool, falls back to standard if unavailable
     return {
       // ... existing code
     };
   }

+  this.logger.debug("[KubernetesWorkloadManager] Applying anti-affinity for large machine pool", {
+    preset: preset.name,
+    poolLabel: env.KUBERNETES_LARGE_MACHINE_POOL_LABEL,
+  });
   // not schedulable in the large-machine pool
   return {
     // ... existing code
   };
 }
```
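To make the `NotIn` point concrete, here is a tiny hypothetical helper (not part of the PR or of the Kubernetes scheduler) that mimics how a `NotIn` requirement evaluates node labels: a node with no `machinepool` label at all still matches, which is why standard runs keep scheduling on unlabeled nodes. The `"large"` value is a placeholder for whatever the env var is set to.

```ts
// Hypothetical illustration of NotIn matching semantics; not actual scheduler code.
function matchesNotIn(
  nodeLabels: Record<string, string>,
  key: string,
  excludedValues: string[]
): boolean {
  const value = nodeLabels[key];
  // A missing label satisfies NotIn, so unlabeled (standard) nodes remain schedulable.
  return value === undefined || !excludedValues.includes(value);
}

console.log(matchesNotIn({}, "machinepool", ["large"])); // true: unlabeled standard node
console.log(matchesNotIn({ machinepool: "large" }, "machinepool", ["large"])); // false: reserved large node
```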
📜 Review details
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
apps/supervisor/src/env.ts
apps/supervisor/src/workloadManager/kubernetes.ts
🧰 Additional context used
📓 Path-based instructions (5)
**/*.{ts,tsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
**/*.{ts,tsx}: Use types over interfaces for TypeScript
Avoid using enums; prefer string unions or const objects instead
Every Trigger.dev task must be exported; use task() function with unique id and run async function
Files:
apps/supervisor/src/env.ts
apps/supervisor/src/workloadManager/kubernetes.ts
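As a quick illustration of the enum guidance above (hypothetical names, not code from this PR):

```ts
// Preferred: a string union or const object instead of a TypeScript enum.
type MachinePoolKind = "standard" | "large";

const MACHINE_POOL_KINDS = {
  standard: "standard",
  large: "large",
} as const;

type MachinePoolKindFromConst = (typeof MACHINE_POOL_KINDS)[keyof typeof MACHINE_POOL_KINDS];
```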
**/*.{ts,tsx,js,jsx}
📄 CodeRabbit inference engine (.github/copilot-instructions.md)
Use function declarations instead of default exports
Files:
apps/supervisor/src/env.ts
apps/supervisor/src/workloadManager/kubernetes.ts
**/*.{js,ts,jsx,tsx,json,md,css,scss}
📄 CodeRabbit inference engine (AGENTS.md)
Format code using Prettier
Files:
apps/supervisor/src/env.ts
apps/supervisor/src/workloadManager/kubernetes.ts
**/*.ts
📄 CodeRabbit inference engine (.cursor/rules/otel-metrics.mdc)
**/*.ts: When creating or editing OTEL metrics (counters, histograms, gauges), ensure metric attributes have low cardinality by using only enums, booleans, bounded error codes, or bounded shard IDs
Do not use high-cardinality attributes in OTEL metrics such as UUIDs/IDs (envId, userId, runId, projectId, organizationId), unbounded integers (itemCount, batchSize, retryCount), timestamps (createdAt, startTime), or free-form strings (errorMessage, taskName, queueName)
When exporting OTEL metrics via OTLP to Prometheus, be aware that the exporter automatically adds unit suffixes to metric names (e.g., 'my_duration_ms' becomes 'my_duration_ms_milliseconds', 'my_counter' becomes 'my_counter_total'). Account for these transformations when writing Grafana dashboards or Prometheus queries
Files:
apps/supervisor/src/env.ts
apps/supervisor/src/workloadManager/kubernetes.ts
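For illustration, a minimal sketch that follows the metric guidance above, using the OpenTelemetry JS API. The metric and attribute names here are hypothetical, not taken from the codebase.

```ts
import { metrics } from "@opentelemetry/api";

const meter = metrics.getMeter("supervisor");

// Low-cardinality attributes only: a bounded preset kind and a boolean.
const scheduledRuns = meter.createCounter("scheduled_runs", {
  description: "Number of runs scheduled by the supervisor",
});

scheduledRuns.add(1, {
  machine_kind: "large", // bounded, enum-like value; never a runId or envId
  large_pool_enabled: true,
});

// Note: when exported via OTLP to Prometheus, "scheduled_runs" typically becomes
// "scheduled_runs_total" because the exporter appends unit/type suffixes.
```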
**/*.{ts,tsx,js}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx,js}: Import from @trigger.dev/core using subpaths only; never import from root of @trigger.dev/core
Always import task definitions from @trigger.dev/sdk, never from @trigger.dev/sdk/v3 or deprecated client.defineJob pattern
Files:
apps/supervisor/src/env.ts
apps/supervisor/src/workloadManager/kubernetes.ts
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Specify machine resources using the `machine` property with preset options like 'large-1x'
📚 Learning: 2025-11-27T16:27:35.304Z
Learnt from: CR
Repo: triggerdotdev/trigger.dev PR: 0
File: .cursor/rules/writing-tasks.mdc:0-0
Timestamp: 2025-11-27T16:27:35.304Z
Learning: Applies to **/trigger/**/*.{ts,tsx,js,jsx} : Specify machine resources using the `machine` property with preset options like 'large-1x'
Applied to files:
apps/supervisor/src/env.ts
apps/supervisor/src/workloadManager/kubernetes.ts
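For context, the learning above refers to task definitions along the lines of the following hypothetical sketch; the exact `machine` option shape may vary by SDK version.

```ts
import { task } from "@trigger.dev/sdk";

// Hypothetical task; the machine preset is what the supervisor maps to pod resources
// and, with this PR, to node affinity for the large machine pool.
export const processLargeDataset = task({
  id: "process-large-dataset",
  machine: { preset: "large-1x" },
  run: async (payload: { datasetUrl: string }) => {
    // ... heavy CPU/memory work here
    return { ok: true };
  },
});
```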
📚 Learning: 2025-06-04T16:02:22.957Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 2150
File: apps/supervisor/src/workloadManager/docker.ts:115-115
Timestamp: 2025-06-04T16:02:22.957Z
Learning: In the Trigger.dev codebase, the supervisor component uses DOCKER_ENFORCE_MACHINE_PRESETS while the docker provider component uses ENFORCE_MACHINE_PRESETS. These are separate components with separate environment variable configurations for the same logical concept of enforcing machine presets.
Applied to files:
apps/supervisor/src/env.ts
📚 Learning: 2025-01-13T18:31:48.160Z
Learnt from: nicktrn
Repo: triggerdotdev/trigger.dev PR: 1608
File: apps/webapp/app/v3/services/triggerTask.server.ts:418-418
Timestamp: 2025-01-13T18:31:48.160Z
Learning: The `MachinePresetName` schema is used to validate machine preset values in the trigger.dev codebase, ensuring type safety and validation of machine preset options.
Applied to files:
apps/supervisor/src/workloadManager/kubernetes.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (23)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (8, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (8, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (3, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (6, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (7, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (4, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (4, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (5, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (3, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (2, 8)
- GitHub Check: units / internal / 🧪 Unit Tests: Internal (1, 8)
- GitHub Check: units / webapp / 🧪 Unit Tests: Webapp (1, 8)
- GitHub Check: units / packages / 🧪 Unit Tests: Packages (1, 1)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - npm)
- GitHub Check: e2e / 🧪 CLI v3 tests (ubuntu-latest - pnpm)
- GitHub Check: e2e / 🧪 CLI v3 tests (windows-latest - pnpm)
- GitHub Check: typecheck / typecheck
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (3)
apps/supervisor/src/env.ts (1)
94-94: LGTM!

The optional environment variable follows the existing patterns in this file. The inline comment clearly documents that the feature is disabled when unset and explains the label semantics (`machinepool=<value>`).

apps/supervisor/src/workloadManager/kubernetes.ts (2)
98-98: LGTM!

Clean integration with the pod spec. Returning `undefined` when the feature is disabled correctly omits the `affinity` field entirely.
361-363: LGTM!

Simple and effective prefix check. All large machine presets (`large-1x`, `large-2x`) follow the `large-` prefix convention, and this approach is flexible enough to accommodate any future `large-*` presets without code changes.
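For reference, a prefix check like the one described above might look something like this sketch; the function and parameter names are assumed, not necessarily the exact code in the PR.

```ts
// Assumed sketch of a large-preset check based on the naming convention
// (large-1x, large-2x, and any future large-* presets).
function isLargeMachine(preset: { name: string }): boolean {
  return preset.name.startsWith("large-");
}

isLargeMachine({ name: "large-2x" }); // true
isLargeMachine({ name: "small-1x" }); // false
```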
PR Review: Node Affinity Rules for Large Machine Worker Pool Scheduling

Summary

This PR adds Kubernetes node affinity rules to separate large machine workloads (`large-1x`, `large-2x`) from standard workloads, addressing scheduling delays during peak times. The implementation is clean and follows the existing patterns in the codebase.

Code Quality

Strengths:
Minor Suggestions:
Potential Issues
Performance Considerations
Security Considerations
Test Coverage
Overall Assessment

This is a well-designed, practical solution to improve scheduling performance for large machine workloads. The implementation is clean and follows existing patterns. The main suggestion is to add unit tests for the new logic to ensure correctness and prevent regressions.

Recommendation: Approve with minor suggestions (consider tests and the optional improvements noted above).
Background
Runs with `large-1x` or `large-2x` machine presets are disproportionately affected by scheduling delays during peak times. This is partly because the worker pool is shared by all runs, so large runs compete with smaller runs for available capacity. Because large runs require significantly more CPU and memory, they are harder for the scheduler to bin-pack onto existing nodes, often requiring a node with a large amount of free resources or waiting for a new node to spin up entirely. This effect is amplified during peak times, when nodes are already densely packed with smaller workloads, leaving insufficient contiguous resources for large runs. Also, large runs make up a small percentage of the total runs.

Changes
This PR adds Kubernetes node affinity settings to separate large and standard machine workloads across node pools.
- Controlled via the `KUBERNETES_LARGE_MACHINE_POOL_LABEL` env var (disabled when not set; see the sketch below)
- Large machine presets (`large-*`) get a soft preference to schedule on the large pool, with fallback to standard nodes
- Non-large machines are excluded from the large pool via required anti-affinity
- This ensures the large machine pool is reserved for large workloads while allowing large workloads to spill over to standard nodes if needed
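As a rough sketch of the configuration side (assuming the supervisor's env file uses a zod schema like the rest of the codebase, and the `machinepool` label key mentioned in the review comments): the large node pool's nodes carry the label, and the supervisor gets the matching env var. The schema shape below is a guess for illustration, not the exact code.

```ts
import { z } from "zod";

// Assumed sketch of the optional env var; the feature is disabled when unset.
// Nodes in the large pool would carry the label machinepool=<this value>.
const EnvSchema = z.object({
  KUBERNETES_LARGE_MACHINE_POOL_LABEL: z.string().optional(),
});

const env = EnvSchema.parse(process.env);

if (env.KUBERNETES_LARGE_MACHINE_POOL_LABEL) {
  console.log(
    `Large machine pool scheduling enabled for nodes labeled machinepool=${env.KUBERNETES_LARGE_MACHINE_POOL_LABEL}`
  );
}
```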