@gaobinlong
Contributor

Similar to TopScoreDocCollector, when the primary sort in FirstPassGroupingCollector is _score, we can propagate the minimum competitive score to the scorer once orderedGroups is full. This improves grouping performance by roughly 20% (service time drops by about 17% at the 50th percentile and about 20% at the 99.9th percentile). The benchmark results are below:

Query DSL in OpenSearch:

GET big5/_search
{
  "query": {
    "match": {
      "message": "50-136-239-27"
    }
  },
  "collapse": {
    "field": "host.name"
  }
}

Test results:

Before:
| Metric | Task | Value | Unit |
|---|---|---|---|
| 50th percentile service time | collapsing | 426.714 | ms |
| 90th percentile service time | collapsing | 438.719 | ms |
| 99th percentile service time | collapsing | 465.926 | ms |
| 99.9th percentile service time | collapsing | 496.134 | ms |
| 100th percentile service time | collapsing | 515.556 | ms |

After:
| Metric | Task | Value | Unit |
|---|---|---|---|
| 50th percentile service time | collapsing | 353.501 | ms |
| 90th percentile service time | collapsing | 358.511 | ms |
| 99th percentile service time | collapsing | 379.314 | ms |
| 99.9th percentile service time | collapsing | 398.28 | ms |
| 100th percentile service time | collapsing | 476.743 | ms |

Related to #15178.
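
For readers unfamiliar with the idea described above, here is a minimal, self-contained sketch of it. It is not the code in this pull request and does not use Lucene classes: `MinCompetitiveScoreSketch`, `ScoreSink`, and `TopGroups` are hypothetical names, groups are plain strings, and the bottom group is found with a linear scan for brevity. It assumes that, once the top-N groups are full, a document can only change the result if it scores strictly higher than the weakest group, so the threshold handed to the scorer is the next float above that group's score.

```java
import java.util.HashMap;
import java.util.Map;

final class MinCompetitiveScoreSketch {

    /** Hypothetical stand-in for the scorer callback; not Lucene's actual type. */
    interface ScoreSink {
        void setMinCompetitiveScore(float minScore);
    }

    /** Tracks the best score of each of the top-N groups (simplified, in memory). */
    static final class TopGroups {
        private final int topN;
        private final ScoreSink sink;
        private final Map<String, Float> bestScoreByGroup = new HashMap<>();

        TopGroups(int topN, ScoreSink sink) {
            if (topN < 1) {
                throw new IllegalArgumentException("topN must be >= 1");
            }
            this.topN = topN;
            this.sink = sink;
        }

        void collect(String group, float score) {
            Float current = bestScoreByGroup.get(group);
            if (current != null) {
                // Known group: only a strictly better score changes anything.
                if (score > current) {
                    bestScoreByGroup.put(group, score);
                    publishMinCompetitiveScore();
                }
                return;
            }
            if (bestScoreByGroup.size() < topN) {
                // Still filling up: every new group is accepted.
                bestScoreByGroup.put(group, score);
                publishMinCompetitiveScore();
                return;
            }
            // Full: a new group must strictly beat the weakest group (ties lose).
            String bottom = bottomGroup();
            if (score > bestScoreByGroup.get(bottom)) {
                bestScoreByGroup.remove(bottom);
                bestScoreByGroup.put(group, score);
                publishMinCompetitiveScore();
            }
        }

        /** Returns a copy of the current group-to-best-score mapping. */
        Map<String, Float> bestScores() {
            return new HashMap<>(bestScoreByGroup);
        }

        /** Linear scan for the group with the lowest best score. */
        private String bottomGroup() {
            String bottom = null;
            for (Map.Entry<String, Float> e : bestScoreByGroup.entrySet()) {
                if (bottom == null || e.getValue() < bestScoreByGroup.get(bottom)) {
                    bottom = e.getKey();
                }
            }
            return bottom;
        }

        /** Once full, tell the scorer that anything at or below the bottom score can be skipped. */
        private void publishMinCompetitiveScore() {
            if (bestScoreByGroup.size() < topN) {
                return;
            }
            float bottomScore = bestScoreByGroup.get(bottomGroup());
            sink.setMinCompetitiveScore(Math.nextUp(bottomScore));
        }
    }

    public static void main(String[] args) {
        TopGroups groups = new TopGroups(2, min -> System.out.println("min competitive score: " + min));
        groups.collect("host-a", 1.0f);
        groups.collect("host-b", 3.0f);
        groups.collect("host-c", 2.0f);
        System.out.println(groups.bestScores());
    }
}
```

In this sketch the callback is only invoked after the group set is full, which mirrors the condition stated in the description; a scorer that honours the threshold can then skip documents or blocks whose maximum possible score falls below it.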

Signed-off-by: Binlong Gao <gbinlong@amazon.com>
@github-actions github-actions bot added this to the 10.4.0 milestone Dec 10, 2025