LLVM and SPIRV-LLVM-Translator pulldown (WW05 2026) by iclsrc · Pull Request #21176 · intel/llvm

iclsrc · 2026-01-29T22:00:33Z

LLVM: llvm/llvm-project@e9f758a
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@fdbf92793d90efa

The stub function is generated for the R_MIPS_26 relocation, which can be used for local jumps inside a function and does not expect any temporary register to be clobbered. Use AT instead of T9 for the stub function; otherwise, functions that use T9 will break. Signed-off-by: Icenowy Zheng <uwu@icenowy.me>

When trying to `perf inject` a JIT dump generated through the perf plugin, perf fails with the following error:
```
jitdump file contains invalid or unsupported flags 0xf5880666c26c
0x2b750 [0xa8]: failed to process type: 10 [Operation not permitted]
```
It turns out that the Header's Flags field was never initialized, so its value could be random. This patch fixes the issue by initialising all of the Header's fields. Co-authored-by: Lang Hames <lhames@gmail.com>
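A minimal sketch of the kind of fix described, assuming a header laid out like the perf jitdump file header. The struct and the `makeHeader` helper are hypothetical names for illustration, not the plugin's actual types; the point is only that default member initializers give every field, including `Flags`, a defined value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical header modelled on the perf jitdump file header layout.
// Field names are illustrative and are not the ORC plugin's actual type.
struct JitDumpHeader {
  uint32_t Magic = 0x4A695444; // "JiTD"
  uint32_t Version = 1;
  uint32_t TotalSize = 0;
  uint32_t ElfMach = 0;
  uint32_t Pad1 = 0;
  uint32_t Pid = 0;
  uint64_t Timestamp = 0;
  uint64_t Flags = 0; // the field the commit message says was never initialized
};

// Builds a header in which every field has a defined value.
inline JitDumpHeader makeHeader(uint32_t Pid, uint32_t ElfMach,
                                uint64_t Timestamp) {
  JitDumpHeader H; // default member initializers zero Flags and Pad1
  H.TotalSize = sizeof(JitDumpHeader);
  H.ElfMach = ElfMach;
  H.Pid = Pid;
  H.Timestamp = Timestamp;
  assert(H.Flags == 0 && "no flag bits are set unless requested explicitly");
  return H;
}
```

With this shape, a reader that rejects unknown flag bits sees zero rather than whatever happened to be in memory.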

This can reduce some vsetvli toggles. This can be done in pre-ra scheduling as we have moved insertion of vsetvli after the first RA. Currently, we override `tryCandidate` and add a new heuristic based on comparison of `vtype`/`vl`. Reviewers: asb, preames, topperc, lukel97, mshockwave, BeMg Reviewed By: mshockwave, lukel97 Pull Request: llvm/llvm-project#95924

This patch handles special cases involving zero vectors. When comparing vector operands for inequality, current code generation emits `vcmpequh` (vector compare equal unsigned halfword) followed by a negation with `xxlnor` (VSX Vector Logical NOR XX3-form). When one operand is the zero vector, we can instead use `vcmpgtuh` (vector compare greater than unsigned halfword) directly. The negation is avoided because, for an unsigned halfword, the only value for which the greater-than comparison is false is 0. --------- Co-authored-by: himadhith <himadhith.v@ibm.com>
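The equivalence behind this can be checked on a scalar model of a single halfword lane. The sketch below is not PowerPC code; the function names are hypothetical, and it only illustrates that "not equal to zero" and "unsigned greater than zero" agree for every 16-bit value.

```cpp
#include <cstdint>

// Scalar model of one lane: compare-equal against zero, then negate.
constexpr bool neViaEqThenNot(uint16_t X) { return !(X == 0); }

// Scalar model of the replacement: one unsigned greater-than compare.
constexpr bool neViaUnsignedGt(uint16_t X) { return X > 0; }

// Exhaustively compares the two forms over all 65536 halfword values.
inline bool lanesAgree() {
  for (uint32_t V = 0; V <= 0xFFFF; ++V) {
    const uint16_t X = static_cast<uint16_t>(V);
    if (neViaEqThenNot(X) != neViaUnsignedGt(X))
      return false;
  }
  return true;
}
```

The same argument would not hold for a signed compare, where negative lanes are nonzero but not greater than zero.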

…rse N2 (#174740) According to the [N2 SWOG](https://developer.arm.com/documentation/109914/latest/), flag-setting instructions for arithmetic/logical instructions should have a throughput of 3. [Similar to the V2 model](llvm/llvm-project#113542).

Add support for `__builtin_stack_address` builtin. The semantics match those of GCC's builtin with the same name. `__builtin_stack_address` returns the starting address of the stack region that may be used by called functions. It may or may not include the space used for on-stack arguments passed to a callee (See [GCC Bug/121013](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121013)). Fixes #82632.

When the SPIR-V backend handles a push constant, a new global with a target-specific is generated. This global should have the same visibility as the source, but it turns out this was not the case: linkage was still external, but visibility went from hidden to visible. This causes later passes to generate a Linkage decoration, which adds the Linkage capability, and that capability is not compatible with Vulkan. This patch fixes the visibility.

We've already optimised these, so update the cost model to reflect it. And skip the isBeforeLegalize check when lowering i8 muls, because it then misses the cases where, say v32i8, has been type legalised into 2x v16i8. Also explicitly disable memory interleaving for any factor other than two or four.

The original Interpreter implementation had a hard dependency on ORC and grew organically with the addition of out-of-process JIT support. This tightly coupled the Interpreter to a specific execution engine and leaked ORC-specific assumptions (runtime layout, symbol lookup, exception model) into higher layers. The WebAssembly integration demonstrated that incremental execution can be implemented without ORC, exposing the need for a cleaner abstraction boundary. This change introduces an IncrementalExecutor interface and moves ORC-based execution behind a concrete implementation. The Interpreter now depends only on the abstract executor, improving layering and encapsulation. In addition, the Interpreter can be configured with user-provided incremental executor implementations, enabling ORC-independent execution, easier testing, and future extensions without modifying the core Interpreter.
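The layering described above can be sketched as an abstract executor that the interpreter owns through a pointer. Everything below is hypothetical: the class and method names are chosen for illustration and are not taken from Clang's actual `IncrementalExecutor` declaration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical abstract executor; the method set is an assumption.
class IncrementalExecutorSketch {
public:
  virtual ~IncrementalExecutorSketch() = default;
  virtual bool addModule(const std::string &Code) = 0;
  virtual bool execute() = 0;
};

// A user-provided backend that needs no JIT: it only records its input,
// which is the kind of stand-in that makes testing easier.
class RecordingExecutor : public IncrementalExecutorSketch {
public:
  std::vector<std::string> Seen;
  bool addModule(const std::string &Code) override {
    Seen.push_back(Code);
    return true;
  }
  bool execute() override { return !Seen.empty(); }
};

// The interpreter depends only on the abstract interface.
class InterpreterSketch {
  std::unique_ptr<IncrementalExecutorSketch> Exec;

public:
  explicit InterpreterSketch(std::unique_ptr<IncrementalExecutorSketch> E)
      : Exec(std::move(E)) {
    assert(Exec && "an executor implementation must be supplied");
  }
  bool parseAndExecute(const std::string &Code) {
    return Exec->addModule(Code) && Exec->execute();
  }
};
```

An ORC-backed implementation would be one more subclass of the same interface, so swapping backends would not touch the interpreter class.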

…ash (#175484) The patch disables strict node mutation for LoongArch by setting IsStrictFPEnabled to true. This change fixes the current strict FP lowering crash only. ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS can be further improved. Fixes #174606

This patch marks TestDetachResumes.py skipped on Windows/AArch64. It has been failing intermittently on Windows AArch64 buildbot: https://lab.llvm.org/buildbot/#/builders/141/ This extends the prior change that disabled the same test on Windows x86_64 (commit 6d8d4cf by Dmitry Vasilyev, 2025-06-23). See #144891 for background and original discussion.

When using `std::variant` with non-trivial types, we need to go through multiple bases to find the `_Which` member. The MSVC STL implements this in `xsmf_control.h` which conditionally adds/deletes copy/move constructors/operators. We now go to `_Variant_base` (the holder of `_Which`). This inherits from `_Variant_storage`, which is our entry point to finding the n-th storage (going through `_Tail`).

wenju-he

LGTM

jinge90 · 2026-02-03T06:03:19Z

Hi, @jsji
This PR removes libclc utils prepare-builtins which is used by libdevice cuda and amd backend: https://github.com/intel/llvm/blob/sycl/libdevice/cmake/modules/SYCLLibdevice.cmake#L806
I created #21200 to cut such dependency and also include 588723b into it.
Thanks very much.

jsji · 2026-02-03T13:47:35Z

Thank you @jinge90 for the follow up!

jsji · 2026-02-03T13:48:27Z

@intel/llvm-gatekeepers This is ready for merge. Can someone help to issue a "/merge"? Thanks!

KornevNikita · 2026-02-03T13:51:03Z

/merge

KornevNikita · 2026-02-03T14:04:39Z

Does it accept comments only from certain persons like @sarnex?

jsji · 2026-02-03T14:10:54Z

No, it should accept all gatekeepers... but the automation sometimes fails ... :(

jsji · 2026-02-03T14:16:48Z

Yes, the automation is failing with "java.lang.ClassFormatError: Incompatible magic value 0 in class file java/lang/invoke/WrongMethodTypeException..."...

@KornevNikita Can you try issuing "/merge" again? Hopefully it can be dispatched to another jenkins server without such exception, otherwise, we may need to do manual merge again ...

KornevNikita · 2026-02-03T14:18:16Z

/merge

jsji · 2026-02-03T14:24:47Z

/merge

:(, automation still failing with " Caused: java.io.IOException: Remote call on scvelsch14_GPU_uplift failed"

@hanzhan1 Can you please check and fix. Thanks!

sarnex · 2026-02-03T15:12:33Z

should i do it manually again?

sarnex · 2026-02-03T15:12:43Z

/merge

sarnex · 2026-02-03T15:13:50Z

rip hope it would work for me :)

Currently, the libdevice CUDA backend needs to invoke prepare-builtins from libclc, but this tool is going to be removed in #21176. This PR removes that dependency from libdevice. --------- Signed-off-by: jinge90 <ge.jin@intel.com>

ElvisWang123 and others added 30 commits January 12, 2026 10:03

[LV] Simplify extract-lane with scalar operand to the scalar value itself. (#174534)

cd2caf6

This patch simplifies extract-lane(%lane_num, %X) to %X when %X is a scalar value. Extracting from a scalar is redundant since there is only one value to extract.

[libc][math] Refactor ilogbf16 implementation to header-only in src/__support/math folder. (#175450)

79be97d

Closes [#175346](llvm/llvm-project#175346), Part of #175344

Reapply "[llvm-jitlink] Replace IR backtrace symbolication test..." (…

187ca86

…… (#175476) …#175242) This reapplies 451ca45, which was reverted in 25976e8 due to bot failures. The REQUIRES line has been further constrained to try to address the failures.

[PseudoProbe] Add switch to control illegal guid warnings (#174927)

bdc6a67

Do not verify GUID existence in the pseudo probe desc by default, since doing so generates false-positive warnings with ThinLTO. Users can pass -pseudo-probe-verify-guid-existence-in-desc to enable the verification explicitly.

[Clang][X86] Remove useless extractvalue on aesencwide/aesdecwide builtin CodeGen (#175113)

e0cf581

This is a pre-commit for the CIR codegen of the `aesencwide/aesdecwide` builtins; it removes the useless `extractvalue` from Clang's CodeGen for these builtins.

[libclc][NFC] Remove unused builtins_opt_lib_tgt (#175479)

a3ca7ca

It was left behind after f07988f.

[RISCV] Add a custom pre-ra scheduler

564f2be

Currently we do nothing RISC-V specific in this scheduler. This is a part of vtype-based scheduling. Reviewers: BeMg, mshockwave, lukel97, preames, topperc Pull Request: llvm/llvm-project#172613

[RISCV][NFC] Add RISCVVSETVLIInfoAnalysis

67601a4

This can be reused by #95924. Reviewers: BeMg, topperc, lukel97, preames, mshockwave Reviewed By: mshockwave, topperc Pull Request: llvm/llvm-project#172615

[ORC] Simplify zero initializer. NFCI. (#175482)

f114d95

Based on suggestion from @macdice on llvm/llvm-project#175204. Thanks @macdice!

[JITLink] Set correct triple instead of hard-code the value to linux (#175404)

ba6a59c

[llvm-jitlink] Remove redundant ExecutorAddr constructor calls. NFCI. (#175488)

f827c20

These ExecutorAddr calls were legacy from pre-ExecutorSymbolDef code. The getAddress method already returns an ExecutorAddr, so there's no need for them anymore.

[RISCV][llvm] Support logical comparison codegen for P extension (#174626)

145e28d

The result type of the P extension's comparison instructions is the same as the operand type, and the result bits are all 1s or all 0s, so we need to set ZeroOrNegativeOneBooleanContent so that sext(setcc) is combined automatically.
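A scalar sketch of why the 0 or -1 convention makes the sign extension free, assuming a 16-bit lane. The helper names are hypothetical and this is not backend code; it only shows that sign-extending an all-ones or all-zeros mask yields the same mask at the wider width.

```cpp
#include <cstdint>

// Scalar model of one 16-bit lane of a compare that produces all ones for
// true and all zeros for false.
constexpr int16_t cmpEqLane(int16_t A, int16_t B) {
  return A == B ? int16_t(-1) : int16_t(0);
}

// Sign-extending the mask keeps it all ones or all zeros at the wider type.
constexpr int32_t sextMask(int16_t Mask) { return static_cast<int32_t>(Mask); }

static_assert(sextMask(cmpEqLane(7, 7)) == -1, "true widens to all ones");
static_assert(sextMask(cmpEqLane(7, 8)) == 0, "false widens to zero");
```

If the compare instead produced 0 or 1, the sign extension to -1 would need extra work after the compare.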

[clang] fix warning (#174587)

3575501

```shell
input_line_0:10:30: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
```

Fix gcc name shadow warning. (#175490)

1f1dee3

Addresses comment in #175322

[Aarch64] Add support for Ampere1C core (#175442)

43138d6

This patch adds initial support for the ARMv9.2+ Ampere1C core.

[SDAG] Combine select into ABD?, for const (#173581)

e51f25a

The pattern is `(select (setcc ...) (sub a, b) (sub b, a))`. When b is a constant, `sub a, b` becomes `add a, -b`, which this patch handles with the m_SpecificNeg() matcher.
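The two shapes of the pattern can be compared on plain unsigned integers. This is a scalar sketch with hypothetical function names, not SelectionDAG code; it assumes an unsigned greater-than as the `setcc` condition purely for illustration.

```cpp
#include <cstdint>

// Reference form: select between the two subtraction orders.
constexpr uint32_t abdSelect(uint32_t A, uint32_t B) {
  return A > B ? A - B : B - A;
}

// Form seen when B is a constant: `A - B` has been rewritten as `A + (-B)`.
constexpr uint32_t abdSelectConst(uint32_t A, uint32_t B) {
  const uint32_t NegB = 0u - B; // wraps modulo 2^32, so A + NegB == A - B
  return A > B ? A + NegB : B - A;
}

static_assert(abdSelect(3, 10) == 7, "absolute difference");
static_assert(abdSelectConst(3, 10) == abdSelect(3, 10), "forms agree");
static_assert(abdSelectConst(10, 3) == abdSelect(10, 3), "forms agree");
```

Both functions compute the absolute difference, which is why a matcher has to recognise the negated-constant addition as the same operation.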

[AArch64][llvm] Add extra dependencies for recently added features (#175215)

b646209

[libc++][scoped_allocator] Applied [[nodiscard]] (#175291)

91268a5

`[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue.
- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/allocator.adaptor

Towards #172124
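As an illustration of the guideline, here is a minimal sketch using a hypothetical allocation helper (not a libc++ function). Discarding its result would leak the allocation, which is the kind of correctness issue the attribute is meant to flag.

```cpp
#include <cstddef>

// Hypothetical helper: returns ownership of N zero-initialised ints.
// Ignoring the result leaks memory, so the function is marked [[nodiscard]].
[[nodiscard]] inline int *allocateInts(std::size_t N) { return new int[N](); }

// Releases memory obtained from allocateInts.
inline void deallocateInts(int *P) { delete[] P; }
```

A call written as a bare statement, such as `allocateInts(4);`, would then be expected to produce a compiler warning, while assigning the result to a variable would not.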

jsji had a problem deploying to WindowsCILock February 2, 2026 20:17 — with GitHub Actions Failure

jsji had a problem deploying to WindowsCILock February 2, 2026 20:37 — with GitHub Actions Error

jsji had a problem deploying to WindowsCILock February 2, 2026 20:37 — with GitHub Actions Failure

jsji temporarily deployed to WindowsCILock February 2, 2026 20:47 — with GitHub Actions Inactive

jsji temporarily deployed to WindowsCILock February 2, 2026 21:17 — with GitHub Actions Inactive

jsji temporarily deployed to WindowsCILock February 2, 2026 21:40 — with GitHub Actions Inactive

jsji had a problem deploying to WindowsCILock February 2, 2026 21:40 — with GitHub Actions Failure

jsji temporarily deployed to WindowsCILock February 2, 2026 21:44 — with GitHub Actions Inactive

wenju-he mentioned this pull request Feb 2, 2026

[libclc] prepare-builtins: remove redundant metadata/linkage edits #21186

Closed

wenju-he approved these changes Feb 3, 2026

jsji had a problem deploying to WindowsCILock February 3, 2026 01:29 — with GitHub Actions Error

jinge90 mentioned this pull request Feb 3, 2026

[SYCL] Remove libclc prepare-builtins dependency for libdevice #21200

Merged

sarnex merged commit 85b461e into sycl Feb 3, 2026
81 of 105 checks passed

jsji deleted the llvmspirv_pulldown branch February 3, 2026 15:32

20 participants