Extend server priorities to realtime indexing tasks for query isolation #19040

abhishekrb19 wants to merge 15 commits into apache:master
Conversation
Dismissed review thread: .../main/java/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisor.java
```java
EasyMock.reset(spec);
EasyMock.expect(spec.getId()).andReturn(SUPERVISOR_ID).anyTimes();
EasyMock.expect(spec.getSupervisorStateManagerConfig()).andReturn(supervisorConfig).anyTimes();
EasyMock.expect(spec.getDataSchema()).andReturn(getDataSchema()).anyTimes();
```

Code scanning (CodeQL) notice: Deprecated method or constructor invocation (test)
```java
@Nullable
@JsonProperty
public Map<Integer, Integer> getserverPriorityToReplicas()
```

Suggested change:

```java
public Map<Integer, Integer> getServerPriorityToReplicas()
```
```java
if (this.serverPriorityToReplicas != null) {
  final int replicaCount = this.serverPriorityToReplicas.values().stream().mapToInt(Integer::intValue).sum();
  if (replicas != null && replicas != replicaCount) {
    throw InvalidInput.exception(
```

Reviewer: What do you think about dropping this requirement? It seems like additional configuration overhead, since we set it explicitly below anyway.

Author: This would be triggered only if someone unintentionally sets conflicting values that cause ambiguity. I wanted to keep the behavior straightforward for end users rather than document any precedence rules in such cases. I've updated the error message accordingly; please let me know if that makes sense.
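For context, here is a standalone sketch of the consistency check being discussed (hypothetical class and method names, not the PR's actual code): when both `replicas` and `serverPriorityToReplicas` are set, the per-priority counts must sum to `replicas`.

```java
import java.util.Map;

public class ReplicaConfigCheck
{
  // Hypothetical sketch of the validation under discussion: if both `replicas`
  // and `serverPriorityToReplicas` are configured, the per-priority replica
  // counts must sum to `replicas`, otherwise the configuration is ambiguous.
  public static int validatedReplicaCount(Integer replicas, Map<Integer, Integer> serverPriorityToReplicas)
  {
    if (serverPriorityToReplicas == null) {
      // Fall back to the plain `replicas` setting (a default of 1 is assumed here).
      return replicas == null ? 1 : replicas;
    }
    final int replicaCount = serverPriorityToReplicas.values().stream().mapToInt(Integer::intValue).sum();
    if (replicas != null && replicas != replicaCount) {
      throw new IllegalArgumentException(
          "Conflicting replica configuration: replicas=" + replicas + " but priority counts sum to " + replicaCount
      );
    }
    return replicaCount;
  }
}
```

Either value alone is unambiguous; the exception fires only when both are present and disagree, which matches the author's rationale above.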
jtuglu1 left a comment: Did a first pass – general approach LGTM.
Resolved review threads:
- ...xing-service/src/main/java/org/apache/druid/indexing/rabbitstream/RabbitStreamIndexTask.java
- ...va/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisorIOConfig.java (outdated)
- ...va/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisorIOConfig.java (outdated)
- ...va/org/apache/druid/indexing/seekablestream/supervisor/SeekableStreamSupervisorIOConfig.java
|Property|Type|Description|Required|Default|
|---|---|---|---|---|
|`lateMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps earlier than this period before the task was created. For example, if this property is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps earlier than `2016-01-01T11:00Z`. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments, such as a streaming and a nightly batch ingestion pipeline. You can specify only one of the late message rejection properties.|No||
|`earlyMessageRejectionPeriod`|ISO 8601 period|Configures tasks to reject messages with timestamps later than this period after the task reached its task duration. For example, if this property is set to `PT1H`, the task duration is set to `PT1H` and the supervisor creates a task at `2016-01-01T12:00Z`, Druid drops messages with timestamps later than `2016-01-01T14:00Z`. Tasks sometimes run past their task duration, such as in cases of supervisor failover.|No||
|`stopTaskCount`|Integer|Limits the number of ingestion tasks Druid can cycle at any given time. If not set, Druid can cycle all tasks at the same time. If set to a value less than `taskCount`, your cluster needs fewer available slots to run the supervisor. You can save costs by scaling down your ingestion tier, but this can lead to slower cycle times and lag. See [`stopTaskCount`](#stoptaskcount) for more information.|No|`taskCount` value|
|`serverPriorityToReplicas`|Object (`Map<Integer, Integer>`)|Map of server priorities to the number of replicas per priority. When set, each task replica is assigned a server priority that corresponds to `druid.server.priority` on the Peon process to enable query isolation for mixed workloads using [query routing strategies](../configuration/index.md#query-routing). If not configured, the `replicas` setting applies and all task replicas are assigned a default priority of 0.<br/><br/>For example, setting `serverPriorityToReplicas` to `{"1": 2, "0": 1}` creates 2 task replicas with `druid.server.priority=1` and 1 task replica with `druid.server.priority=0` per task group. This configuration scales proportionally with `taskCount`. For example, if `taskCount` is set to 5, this results in 15 total task replicas - 10 replicas with priority 1 and 5 replicas with priority 0.|No|null|

Suggested change: in the final example of the `serverPriorityToReplicas` row, say "results in 15 total tasks - 10 tasks with priority 1 and 5 tasks with priority 0" instead of "15 total task replicas - 10 replicas with priority 1 and 5 replicas with priority 0".

I thought using the word "replicas" here was a bit confusing, since we're referring to task counts here.
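For illustration, a minimal supervisor spec fragment using the configuration from the example above might look like the following (the topic name is a placeholder, and most other `ioConfig` fields are omitted):

```json
{
  "type": "kafka",
  "spec": {
    "ioConfig": {
      "topic": "example-topic",
      "taskCount": 5,
      "serverPriorityToReplicas": {
        "1": 2,
        "0": 1
      }
    }
  }
}
```

Per the docs above, each of the 5 task groups would get 2 replicas at `druid.server.priority=1` and 1 replica at priority 0, for 15 tasks in total.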
```java
this.idleConfig = idleConfig;
this.serverPriorityToReplicas = serverPriorityToReplicas;
if (this.serverPriorityToReplicas != null) {
  final int serverPriorityReplicas = this.serverPriorityToReplicas.values().stream().mapToInt(Integer::intValue).sum();
```

Reviewer: Something like this would also pass validation:

```java
Map.of(
    1, 2,
    0, -1
)
```

Maybe make this more strict, such that any value has to be > 0.
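A sketch of the stricter validation suggested above (hypothetical class and method names, not the PR's actual code): every per-priority replica count must be strictly positive, so negative entries cannot cancel out in a sum-based check.

```java
import java.util.Map;

public class PriorityValueCheck
{
  // Hypothetical sketch of the stricter check suggested above: every replica
  // count in the map must be strictly positive, so a map like {1: 2, 0: -1}
  // is rejected instead of slipping past a sum-only validation.
  public static void validatePositiveReplicas(Map<Integer, Integer> serverPriorityToReplicas)
  {
    for (Map.Entry<Integer, Integer> entry : serverPriorityToReplicas.entrySet()) {
      if (entry.getValue() == null || entry.getValue() <= 0) {
        throw new IllegalArgumentException(
            "Replica count for priority [" + entry.getKey() + "] must be > 0, but was [" + entry.getValue() + "]"
        );
      }
    }
  }
}
```

The null check covers maps built by JSON deserialization, where explicit nulls are possible even though `Map.of` itself rejects them.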
Fixes #19018

Currently, all task replicas by default answer all realtime queries, regardless of the priority or routing strategy applied on the Broker. For mixed query workloads, some queries may have higher priority than others, so isolation on the replicas is needed. We've seen bad queries take down some task replicas, causing noisy neighbor problems.

This PR extends Druid's query prioritization and routing strategies to Peon servers by letting operators configure how many replicas to run per server priority, similar to how it works for Historicals and Brokers. This is done by exposing `serverPriorityToReplicas` in the supervisor's `ioConfig`, which operators can optionally configure with the number of replicas allocated per server priority for realtime indexing tasks. For example, some replicas can be configured to handle queries of all priorities, while others respond only to specific priority ranges. This isolates certain Peon replicas for certain priorities and reserves others for more exploratory / dashboarding use cases.

Approach:

To support this, a new `serverPriorityToReplicas` property is added to the supervisor's `ioConfig`. The `SeekableStreamSupervisor` assigns priorities to `SeekableStreamIndexTask`s as they are created for a group; likewise, they are removed from internal bookkeeping when the tasks terminate. The `ForkingTaskRunner` then passes the appropriate server priority when initializing the Peon server. In the absence of this configuration, Peons continue to run with the default priority of 0. `serverPriorityToReplicas` is optional and, if specified, is compatible with the existing `replicas` property.

Release note

Added a `serverPriorityToReplicas` parameter to the streaming supervisor specs (kafka/kinesis/rabbit). This allows operators to distribute task replicas across different server priorities for realtime indexing tasks. Similar to Historical tiering, this enables query isolation for mixed workload scenarios on the Peons, allowing some task replicas to handle queries of specific priorities.

This PR has:
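The replica-to-priority bookkeeping described in the approach section can be sketched as follows (a hypothetical helper, not the PR's actual implementation): expand the `serverPriorityToReplicas` map into one priority value per replica slot of a task group.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PriorityAssignmentSketch
{
  // Hypothetical illustration of the supervisor-side bookkeeping: expand
  // serverPriorityToReplicas into one priority value per replica slot of a
  // task group, higher priorities first. The real supervisor additionally
  // tracks task IDs and frees slots when tasks terminate.
  public static List<Integer> prioritiesForTaskGroup(Map<Integer, Integer> serverPriorityToReplicas)
  {
    final TreeMap<Integer, Integer> byPriorityDesc = new TreeMap<>(Comparator.reverseOrder());
    byPriorityDesc.putAll(serverPriorityToReplicas);

    final List<Integer> priorities = new ArrayList<>();
    for (Map.Entry<Integer, Integer> entry : byPriorityDesc.entrySet()) {
      for (int i = 0; i < entry.getValue(); i++) {
        priorities.add(entry.getKey());
      }
    }
    return priorities;
  }
}
```

With `{"1": 2, "0": 1}`, each task group's replicas get priorities [1, 1, 0]; with `taskCount` 5 that yields 15 tasks overall, matching the documentation example.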