LLVM and SPIRV-LLVM-Translator pulldown (WW05 2026)#21176

Merged
sarnex merged 1059 commits into sycl from llvmspirv_pulldown on Feb 3, 2026

iclsrc (Collaborator) commented Jan 29, 2026

ElvisWang123 and others added 30 commits January 12, 2026 10:03
…self. (#174534)

This patch simplifies extract-lane(%lane_num, %X) to %X when %X is a
scalar value. Extracting from a scalar is redundant since there is only
one value to extract.

The stub function is generated for the R_MIPS_26 relocation, which can be
used for local jumps inside a function, where no temporary register is
expected to be clobbered.

Use AT instead of T9 for the stub function; otherwise functions that use
T9 will be broken.

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>

…_support/math folder. (#175450)

Closes [#175346](llvm/llvm-project#175346),
Part of #175344

When running `perf inject` on a JIT dump generated through the perf plugin,
perf fails with the following error:
```
jitdump file contains invalid or unsupported flags 0xf5880666c26c
0x2b750 [0xa8]: failed to process type: 10 [Operation not permitted]
```
It turns out that the Header's Flags field was never initialized, so its
value could be random.
This patch fixes the issue by initialising all of the Header's fields.

Co-authored-by: Lang Hames <lhames@gmail.com>
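
The failure mode described above can be illustrated with a small validity check. This is only a sketch in Python, not perf's actual code: it assumes that bit 0 is the only defined bit of the 64-bit jitdump header flags field, and the names `KNOWN_FLAGS` and `has_unsupported_flags` are hypothetical.

```python
# Assumption: only bit 0 of the 64-bit flags field is defined;
# every other bit is treated as reserved.
KNOWN_FLAGS = 1 << 0
MASK64 = (1 << 64) - 1


def has_unsupported_flags(flags: int) -> bool:
    """Return True if any reserved bit is set in a 64-bit flags value."""
    return (flags & ~KNOWN_FLAGS & MASK64) != 0


# The value from the error message above, i.e. what an uninitialized
# field happened to contain.
garbage_flags = 0xF5880666C26C

print(has_unsupported_flags(garbage_flags))  # reserved bits are set
print(has_unsupported_flags(0))              # an explicitly zeroed field passes
```

A reader that rejects reserved bits will fail on whatever an uninitialized field happens to contain, which is why initialising every header field makes the output deterministic.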

…… (#175476)

…#175242)

This reapplies 451ca45, which was
reverted in 25976e8 due to bot
failures.

The REQUIRES line has been further constrained to try to address the
failures.

Do not verify GUID existence in pseudo probe desc by default, since it
generates false-positive warnings with ThinLTO.
Users can pass -pseudo-probe-verify-guid-existence-in-desc to verify it
explicitly.

…uiltin CodeGen (#175113)

This is a pre-commit for the CIR codegen of the `aesencwide`/`aesdecwide`
builtins. It removes a useless `extractvalue` from the Clang CodeGen for
these builtins.

Currently we do nothing RISC-V specific in this scheduler.

This is a part of vtype-based scheduling.

Reviewers: BeMg, mshockwave, lukel97, preames, topperc

Pull Request: llvm/llvm-project#172613

This can be reused by #95924.

Reviewers: BeMg, topperc, lukel97, preames, mshockwave

Reviewed By: mshockwave, topperc

Pull Request: llvm/llvm-project#172615

This can reduce some vsetvli toggles.

This can be done in pre-ra scheduling as we have moved insertion of
vsetvli after the first RA.

Currently, we override `tryCandidate` and add a new heuristic based
on comparison of `vtype`/`vl`.

Reviewers: asb, preames, topperc, lukel97, mshockwave, BeMg

Reviewed By: mshockwave, lukel97

Pull Request: llvm/llvm-project#95924

… (#175488)

These ExecutorAddr calls were legacy from pre-ExecutorSymbolDef code.
The getAddress method already returns an ExecutorAddr, so there's no
need for them anymore.

…4626)

The result type of the P extension's comparison instructions is the same
as the operand type, and the result bits are either all 1s or all 0s, so we
need to set ZeroOrNegativeOneBooleanContent so that sext(setcc) is combined
automatically.
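
The connection between an all-ones compare result and `sext(setcc)` can be checked with a few lines of Python. This is a sketch under stated assumptions: 16-bit lanes are chosen only for illustration, and the helper names `sext` and `lane_compare_eq` are hypothetical, not LLVM APIs.

```python
LANE_BITS = 16  # assumption: 16-bit lanes, picked only for illustration
LANE_MASK = (1 << LANE_BITS) - 1


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low from_bits of value; return a to_bits bit pattern."""
    value &= (1 << from_bits) - 1
    sign = 1 << (from_bits - 1)
    extended = (value ^ sign) - sign      # negative if the sign bit was set
    return extended & ((1 << to_bits) - 1)


def lane_compare_eq(a: int, b: int) -> int:
    """Model a compare whose result lane is all 1s (true) or all 0s (false)."""
    return LANE_MASK if a == b else 0


# A 1-bit "true" sign-extends to all ones, a 1-bit "false" to zero, so the
# lane result is exactly sext(setcc) and no separate extension is needed.
for a, b in [(3, 3), (3, 4), (0, 0), (0xFFFF, 0)]:
    setcc = int(a == b)                   # 1-bit boolean result
    assert lane_compare_eq(a, b) == sext(setcc, 1, LANE_BITS)
```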

This patch handles special cases involving zero vectors. When comparing
vector operands, the current code generation uses `vcmpequh` (vector
compare equal unsigned halfword) followed by a negation with `xxlnor`
(VSX Vector Logical NOR XX3-form).

For this special case, instead of using `vcmpequh` and then negating the
result, we can use `vcmpgtuh` (vector compare greater than unsigned
halfword) directly.

The negation is avoided because, for an `unsigned halfword`, a
greater-than comparison against zero is false only when the value is 0.

---------

Co-authored-by: himadhith <himadhith.v@ibm.com>
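
The equivalence this patch relies on can be checked exhaustively over every unsigned 16-bit value. The sketch below models the two instruction sequences lane by lane in Python; the function names are hypothetical and this is not the backend's code, only a model of the arithmetic.

```python
LANES = 8            # a 128-bit vector holds eight 16-bit halfwords
ALL_ONES = 0xFFFF    # a "true" lane


def ne_zero_via_eq_then_nor(vec):
    """Compare-equal against zero, then negate each lane (vcmpequh + xxlnor shape)."""
    eq = [ALL_ONES if x == 0 else 0 for x in vec]
    return [(~lane) & ALL_ONES for lane in eq]


def ne_zero_via_gt(vec):
    """A single unsigned greater-than against zero (vcmpgtuh shape)."""
    return [ALL_ONES if x > 0 else 0 for x in vec]


def sequences_agree() -> bool:
    """Check both sequences on every unsigned halfword value, 8 lanes at a time."""
    for start in range(0, ALL_ONES + 1, LANES):
        vec = list(range(start, start + LANES))
        if ne_zero_via_eq_then_nor(vec) != ne_zero_via_gt(vec):
            return False
    return True


print(sequences_agree())
```

The check only holds because the lanes are unsigned: for signed lanes, negative values would compare less than zero and the two sequences would differ.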

```shell
input_line_0:10:30: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
```
Addresses comment in #175322

…rse N2 (#174740)

According to the [N2
SWOG](https://developer.arm.com/documentation/109914/latest/),
flag-setting instructions for arithmetic/logical instructions should
have a throughput of 3.

[Similar to the V2
model](llvm/llvm-project#113542).

This patch adds initial support for the ARMv9.2+ Ampere1C core.

Add support for `__builtin_stack_address` builtin. The semantics match
those of GCC's builtin with the same name.

`__builtin_stack_address` returns the starting address of the stack
region that may be used by called functions. It may or may not include
the space used for on-stack arguments passed to a callee (See [GCC
Bug/121013](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121013)).

Fixes #82632.

When the SPIR-V backend handles a push constant, a new global with a
target-specific is generated. This global should have the same
visibility as the source, but it turns out this was not the case: linkage was
still external, but visibility went from hidden to visible.

This causes the later passes to generate a Linkage decoration, adding
the Linkage capability, which is not compatible with Vulkan. Fixing
this.

We've already optimised these, so update the cost model to reflect it.
And skip the isBeforeLegalize check when lowering i8 muls, because it
then misses the cases where, say v32i8, has been type legalised into 2x
v16i8.

Also explicitly disable memory interleaving for any factor other than
two or four.

The original Interpreter implementation had a hard dependency on ORC and
grew organically with the addition of out-of-process JIT support. This
tightly coupled the Interpreter to a specific execution engine and
leaked ORC-specific assumptions (runtime layout, symbol lookup,
exception model) into higher layers.

The WebAssembly integration demonstrated that incremental execution can
be implemented without ORC, exposing the need for a cleaner abstraction
boundary.

This change introduces an IncrementalExecutor interface and moves
ORC-based execution behind a concrete implementation. The Interpreter
now depends only on the abstract executor, improving layering and
encapsulation.

In addition, the Interpreter can be configured with user-provided
incremental executor implementations, enabling ORC-independent
execution, easier testing, and future extensions without modifying the
core Interpreter.
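
The layering described above can be sketched as follows. The real change is in C++; this Python sketch only illustrates the shape of the abstraction, and every method and class name in it (`add_module`, `lookup`, `InMemoryExecutor`) is hypothetical rather than taken from the actual interface.

```python
from abc import ABC, abstractmethod


class IncrementalExecutor(ABC):
    """Abstract execution backend; the interpreter depends only on this."""

    @abstractmethod
    def add_module(self, name: str, symbols: dict) -> None:
        """Make the symbols of a newly produced module available."""

    @abstractmethod
    def lookup(self, symbol: str):
        """Return something callable for the given symbol name."""


class InMemoryExecutor(IncrementalExecutor):
    """Hypothetical user-provided executor with no JIT behind it."""

    def __init__(self) -> None:
        self._symbols = {}

    def add_module(self, name: str, symbols: dict) -> None:
        self._symbols.update(symbols)

    def lookup(self, symbol: str):
        return self._symbols[symbol]


class Interpreter:
    """Knows nothing about the concrete execution engine it is given."""

    def __init__(self, executor: IncrementalExecutor) -> None:
        self._executor = executor

    def define(self, module_name: str, symbols: dict) -> None:
        self._executor.add_module(module_name, symbols)

    def call(self, symbol: str, *args):
        return self._executor.lookup(symbol)(*args)


interp = Interpreter(InMemoryExecutor())
interp.define("m0", {"add": lambda a, b: a + b})
print(interp.call("add", 2, 3))
```

Because the interpreter receives its executor through the constructor, a test can substitute a trivial executor like the one above without touching the interpreter itself.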

…ash (#175484)

The patch disables strict node mutation for LoongArch by setting
IsStrictFPEnabled to true.

This change fixes the current strict FP lowering crash only.
ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS can be further improved.

Fixes #174606

(select (setcc ...) (sub a, b) (sub b, a))

When b is a constant, `sub a, b` becomes `add a, -b`, which this patch handles with the m_SpecificNeg() matcher.
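
The identity behind the constant case can be checked with wraparound arithmetic. This is a sketch assuming 32-bit integers; the helper names are hypothetical and it models only the arithmetic, not the pattern-matching code.

```python
BITS = 32                      # assumption: 32-bit integers for illustration
MASK = (1 << BITS) - 1


def sub(a: int, b: int) -> int:
    return (a - b) & MASK


def add(a: int, b: int) -> int:
    return (a + b) & MASK


def neg(a: int) -> int:
    return (-a) & MASK


def arms_match(a: int, c: int) -> bool:
    """Check that `add a, -c` equals `sub a, c`, and that the two select
    arms `sub a, c` and `sub c, a` are negations of each other."""
    return sub(a, c) == add(a, neg(c)) and sub(c, a) == neg(sub(a, c))


for a in (0, 1, 5, 0x7FFFFFFF, 0x80000000, MASK):
    for c in (0, 1, 7, 0x80000000, MASK):
        assert arms_match(a, c)
```

Once the first arm has been rewritten to an add, the constant in it is the negation of the constant in the other arm, which is the relationship a negated-constant matcher has to recognise.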

This patch marks TestDetachResumes.py skipped on Windows/AArch64.
It has been failing intermittently on Windows AArch64 buildbot:
https://lab.llvm.org/buildbot/#/builders/141/

This extends the prior change that disabled the same test on Windows
x86_64 (commit 6d8d4cf by Dmitry
Vasilyev, 2025-06-23). See #144891 for background and original
discussion.

When using `std::variant` with non-trivial types, we need to go through
multiple bases to find the `_Which` member. The MSVC STL implements this
in `xsmf_control.h` which conditionally adds/deletes copy/move
constructors/operators.

We now go to `_Variant_base` (the holder of `_Which`). This inherits
from `_Variant_storage`, which is our entry point to finding the n-th
storage (going through `_Tail`).

`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/allocator.adaptor

Towards #172124

wenju-he (Contributor) left a comment

LGTM

jinge90 (Contributor) commented Feb 3, 2026

Hi, @jsji
This PR removes libclc utils prepare-builtins which is used by libdevice cuda and amd backend: https://github.com/intel/llvm/blob/sycl/libdevice/cmake/modules/SYCLLibdevice.cmake#L806
I created #21200 to cut such dependency and also include 588723b into it.
Thanks very much.

jsji (Contributor) commented Feb 3, 2026

> Hi, @jsji This PR removes libclc utils prepare-builtins which is used by libdevice cuda and amd backend: https://github.com/intel/llvm/blob/sycl/libdevice/cmake/modules/SYCLLibdevice.cmake#L806 I created #21200 to cut such dependency and also include 588723b into it. Thanks very much.

Thank you @jinge90 for the follow up!

jsji (Contributor) commented Feb 3, 2026

@intel/llvm-gatekeepers This is ready for merge. Can someone help to issue a "/merge"? Thanks!

KornevNikita (Contributor) commented

/merge

KornevNikita (Contributor) commented

> @intel/llvm-gatekeepers This is ready for merge. Can someone help to issue a "/merge"? Thanks!

Does it accept comments only from certain persons like @sarnex?

jsji (Contributor) commented Feb 3, 2026

> > @intel/llvm-gatekeepers This is ready for merge. Can someone help to issue a "/merge"? Thanks!
>
> Does it accept comments only from certain persons like @sarnex?

No, it should accept all gatekeepers... but the automation sometimes fails ... :(

jsji (Contributor) commented Feb 3, 2026

> > > @intel/llvm-gatekeepers This is ready for merge. Can someone help to issue a "/merge"? Thanks!
> >
> > Does it accept comments only from certain persons like @sarnex?
>
> No, it should accept all gatekeepers... but the automation sometimes fails ... :(

Yes, the automation is failing with "java.lang.ClassFormatError: Incompatible magic value 0 in class file java/lang/invoke/WrongMethodTypeException..."...

@KornevNikita Can you try issuing "/merge" again? Hopefully it can be dispatched to another jenkins server without such exception, otherwise, we may need to do manual merge again ...

KornevNikita (Contributor) commented

/merge

jsji (Contributor) commented Feb 3, 2026

/merge

:(, automation still failing with " Caused: java.io.IOException: Remote call on scvelsch14_GPU_uplift failed"

@hanzhan1 Can you please check and fix. Thanks!

sarnex (Contributor) commented Feb 3, 2026

should i do it manually again?

sarnex (Contributor) commented Feb 3, 2026

/merge

sarnex (Contributor) commented Feb 3, 2026

rip hope it would work for me :)

@sarnex sarnex merged commit 85b461e into sycl Feb 3, 2026
81 of 105 checks passed
@jsji jsji deleted the llvmspirv_pulldown branch February 3, 2026 15:32
KornevNikita pushed a commit that referenced this pull request Feb 6, 2026

Currently, the libdevice CUDA backend needs to invoke prepare-builtins from
libclc, but this tool is going to be removed in
#21176.
This PR cuts that dependency in libdevice.

---------

Signed-off-by: jinge90 <ge.jin@intel.com>

Labels

disable-lint Skip linter check step and proceed with build jobs