
[feat](eager-agg) Eager agg 0214 #60757

Draft
englefly wants to merge 10 commits into apache:master from englefly:eager-agg-0214

Conversation

@englefly
Contributor

What problem does this PR solve?

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

englefly and others added 10 commits February 14, 2026 10:40
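ds14: agg push added; execution time 4.7 -> 4.8
h13: agg push added; this should bring p6 back to the p4 result, a small improvement

When the NDV of any group key is close to the row count (0.9x), do not push down the agg

DORIS-24367: case-when cannot be pushed down to the null-padded side of a join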

pick unnest-subquery-cte

ut-tmp

adjust rt

update-shape

fix eager_agg.groovy, runtime_filter_mode=OFF;

fmt

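14/67: plan shape changed because repeat splitting was added after the rebase

Support min(if) and max(if); add a context.isValid check to avoid invalid pushdown

doris-24240: skip eagerAgg if the nullable check in rewriteRoot fails

column pruning must not produce an invalid setOp

24207-2: orExpansion union fields were not aligned

DORIS-24239: context.groupKeys must not be empty

DORIS-24206: fix EliminateGroupByKeyByUniform bug: the alias's exprId was not replaced

DORIS-24205
1. The children of a union must not be partially rewritten
2. If the agg input fields intersect the group keys, do not push down

LogicalProject: no unbound expressions are allowed when building projectMap

DORIS-23842: when there is no aggFunc, push down to the branch that contains all group keys rather than the big branch. ds37/38/82/87 are affected. `select distinct A from T1 join T2 on ... group by A`

aliasMap uses HashMap, not IdentityMap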

DORIS-24149

DORIS-24151

doris-24150 rt case

1. equality check on exprId, 2. update rt.

DORIS-24150

update shape

remove unused code

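1. sum-if does not consider passing through bigJoin, 2. support union

q5: two sum(0) expressions were wrongly deduplicated

basic sum-if (union not supported yet); 43 improves

simple sum-if no union

Check that the context's fields are outputs of the project. Reject pushing sum(A) down through proj(x, x+y as A) when x is not a group key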

derive deep false

throw exception for eager agg when FeDebug

1. remove finalGroupKeys, 2. rewrite projects after pushing down through a project

push agg on join

group key only slotreference

do not support avg/count

When mode=1, force the pushdown even if no big join is passed through

shape with/without pkfk based on tpc_preview
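
The commit notes above mention pushing an aggregation below a join, the sum-if case, and an NDV guard. The sketch below illustrates those ideas on a toy SQLite database; it is not the Doris implementation. The table names (`fact`, `dim`), the helper `should_push_agg`, and the exact comparison used for "close to the row count" are assumptions made for illustration only.

```python
import sqlite3


def should_push_agg(row_count, group_key_ndvs, ratio=0.9):
    """Hypothetical guard: skip the pushdown when any group key's NDV
    reaches ratio * row_count, since pre-aggregation would barely
    reduce the number of rows."""
    if row_count <= 0:
        return False
    return all(ndv < ratio * row_count for ndv in group_key_ndvs)


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact(dim_id INTEGER, amount INTEGER);
CREATE TABLE dim(id INTEGER PRIMARY KEY, region TEXT, day TEXT);
INSERT INTO fact VALUES (1,10),(1,20),(2,5),(2,7),(3,1),(3,2),(3,3);
INSERT INTO dim VALUES (1,'east','Mon'),(2,'east','Tue'),(3,'west','Mon');
""")

# Original shape: aggregate after the join.
plain = conn.execute("""
SELECT d.region,
       SUM(f.amount),
       SUM(CASE WHEN d.day = 'Mon' THEN f.amount ELSE 0 END)
FROM fact f JOIN dim d ON f.dim_id = d.id
GROUP BY d.region ORDER BY d.region
""").fetchall()

# Eager shape: pre-aggregate the fact side by the join key, then join and
# aggregate the partial sums. The CASE condition only reads dim columns,
# so it can be applied to the partial sum of each join-key group.
eager = conn.execute("""
SELECT d.region,
       SUM(p.partial_sum),
       SUM(CASE WHEN d.day = 'Mon' THEN p.partial_sum ELSE 0 END)
FROM (SELECT dim_id, SUM(amount) AS partial_sum
      FROM fact GROUP BY dim_id) p
JOIN dim d ON p.dim_id = d.id
GROUP BY d.region ORDER BY d.region
""").fetchall()

print(plain == eager)
```

In this toy case both query shapes return the same rows because `dim.id` is unique, so each pre-aggregated group joins to exactly one dimension row. The guard expresses the 0.9 ratio from the commit note as a strict less-than test, which is one possible reading of "close to".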
@Thearas
Contributor

Thearas commented Feb 14, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@englefly
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 27608 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit f90def55f7e6340e69de14ccf648f940ba0378c3, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17676	4511	4329	4329
q2	
q3	10643	785	510	510
q4	4682	359	249	249
q5	7557	1242	1014	1014
q6	177	175	146	146
q7	780	840	669	669
q8	9752	1500	1332	1332
q9	5280	4783	4753	4753
q10	6324	1922	1668	1668
q11	471	268	232	232
q12	815	576	468	468
q13	18066	2941	2172	2172
q14	235	230	217	217
q15	954	802	795	795
q16	791	727	674	674
q17	711	861	413	413
q18	6009	5490	5268	5268
q19	1508	998	637	637
q20	491	485	401	401
q21	4475	1837	1419	1419
q22	337	296	242	242
Total cold run time: 97734 ms
Total hot run time: 27608 ms
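
The log does not label its columns. A minimal sketch for cross-checking the two totals, assuming each row is `query, cold run, second run, third run, best hot run` (a layout inferred from the totals, not stated by the bot); the function name `summarize` is hypothetical:

```python
# Round 1 rows copied from the log above; q2 reports no timings and is omitted.
ROUND1 = """
q1 17676 4511 4329 4329
q3 10643 785 510 510
q4 4682 359 249 249
q5 7557 1242 1014 1014
q6 177 175 146 146
q7 780 840 669 669
q8 9752 1500 1332 1332
q9 5280 4783 4753 4753
q10 6324 1922 1668 1668
q11 471 268 232 232
q12 815 576 468 468
q13 18066 2941 2172 2172
q14 235 230 217 217
q15 954 802 795 795
q16 791 727 674 674
q17 711 861 413 413
q18 6009 5490 5268 5268
q19 1508 998 637 637
q20 491 485 401 401
q21 4475 1837 1419 1419
q22 337 296 242 242
"""


def summarize(rows_text):
    """Return (total_cold_ms, total_hot_ms) under the assumed column layout."""
    cold_total = 0
    hot_total = 0
    for line in rows_text.split("\n"):
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank lines and rows without four timings
        cold, _run2, _run3, best = (int(v) for v in fields[1:])
        cold_total += cold
        hot_total += best
    return cold_total, hot_total


cold_ms, hot_ms = summarize(ROUND1)
print(cold_ms, hot_ms)
```

Summing the first and last timing columns this way reproduces the two totals reported for Round 1, which supports the assumed layout.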

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4558	4341	4551	4341
q2	
q3	3866	4268	3753	3753
q4	850	1152	768	768
q5	4008	4331	4335	4331
q6	182	173	147	147
q7	1721	1619	1491	1491
q8	2440	2655	2528	2528
q9	7988	7518	7376	7376
q10	3764	3989	3571	3571
q11	519	442	410	410
q12	478	612	491	491
q13	2740	3334	2324	2324
q14	289	333	290	290
q15	906	803	793	793
q16	742	791	703	703
q17	1169	1438	1330	1330
q18	7195	6885	6866	6866
q19	886	918	945	918
q20	2057	2155	2131	2131
q21	3967	3516	3302	3302
q22	462	459	416	416
Total cold run time: 50787 ms
Total hot run time: 48280 ms

@doris-robot

TPC-DS: Total hot run time: 152972 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit f90def55f7e6340e69de14ccf648f940ba0378c3, data reload: false

query5	4329	637	531	531
query6	317	225	203	203
query7	4227	478	270	270
query8	343	241	221	221
query9	8748	2771	2768	2768
query10	533	378	377	377
query11	7388	5866	5543	5543
query12	183	127	126	126
query13	1294	461	354	354
query14	5720	3818	3527	3527
query14_1	2773	2771	2820	2771
query15	206	197	179	179
query16	1025	481	471	471
query17	1110	729	624	624
query18	2443	442	358	358
query19	238	211	183	183
query20	136	134	131	131
query21	228	151	126	126
query22	4743	5049	4848	4848
query23	16875	15925	15889	15889
query23_1	15876	15848	15839	15839
query24	7710	1689	1329	1329
query24_1	1359	1302	1260	1260
query25	585	526	468	468
query26	1018	269	165	165
query27	2721	466	288	288
query28	4499	1890	1862	1862
query29	864	554	465	465
query30	321	247	215	215
query31	1360	1300	1208	1208
query32	82	74	69	69
query33	528	340	275	275
query34	924	897	576	576
query35	648	682	593	593
query36	1067	1146	917	917
query37	136	90	79	79
query38	2965	2871	2868	2868
query39	893	863	852	852
query39_1	829	828	836	828
query40	228	148	133	133
query41	63	59	58	58
query42	304	301	290	290
query43	240	246	220	220
query44	
query45	195	194	177	177
query46	873	969	609	609
query47	2112	2166	2077	2077
query48	330	312	232	232
query49	632	461	378	378
query50	675	272	220	220
query51	4106	4078	4062	4062
query52	286	302	282	282
query53	290	334	294	294
query54	310	272	255	255
query55	136	90	82	82
query56	310	310	302	302
query57	1375	1345	1284	1284
query58	295	294	270	270
query59	1378	1436	1255	1255
query60	344	341	336	336
query61	148	145	146	145
query62	642	582	534	534
query63	302	277	280	277
query64	4742	1276	1015	1015
query65	
query66	1410	456	358	358
query67	16535	16302	16314	16302
query68	
query69	382	323	296	296
query70	939	980	987	980
query71	346	311	289	289
query72	2742	2649	2332	2332
query73	542	539	318	318
query74	10079	9967	9769	9769
query75	2844	2753	2433	2433
query76	2301	1042	667	667
query77	357	366	294	294
query78	11301	11406	10646	10646
query79	2570	813	600	600
query80	1726	601	534	534
query81	571	284	244	244
query82	1009	150	118	118
query83	330	262	253	253
query84	250	117	102	102
query85	917	471	425	425
query86	412	297	295	295
query87	3135	3093	2963	2963
query88	3539	2685	2670	2670
query89	417	362	337	337
query90	2023	182	165	165
query91	166	155	132	132
query92	78	74	74	74
query93	1132	813	516	516
query94	633	334	287	287
query95	599	391	303	303
query96	641	511	228	228
query97	2449	2476	2435	2435
query98	238	214	218	214
query99	996	986	926	926
Total cold run time: 235608 ms
Total hot run time: 152972 ms

@hello-stephen
Contributor

FE UT Coverage Report

Increment line coverage 63.48% (412/649) 🎉
Increment coverage report
Complete coverage report

@hello-stephen
Contributor

FE Regression Coverage Report

Increment line coverage 74.58% (484/649) 🎉
Increment coverage report
Complete coverage report


4 participants