LLVM and SPIRV-LLVM-Translator pulldown (WW05 2026) #21176
…self. (#174534) This patch simplifies extract-lane(%lane_num, %X) to %X when %X is a scalar value. Extracting from a scalar is redundant since there is only one value to extract.
The stub function is generated for the R_MIPS_26 relocation, which can be used for local jumps inside a function and does not expect any temporary register to be clobbered. Use AT instead of T9 in the stub function; otherwise functions that use T9 will be broken. Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
…_support/math folder. (#175450) Closes [#175346](llvm/llvm-project#175346), Part of #175344
When trying to perf inject a JIT dump generated through the perf plugin, perf fails with the following error:
```
jitdump file contains invalid or unsupported flags 0xf5880666c26c
0x2b750 [0xa8]: failed to process type: 10 [Operation not permitted]
```
It turns out that the Header's Flags field was never initialized, so its value could be random. This patch fixes the issue by initialising all of the Header's fields.
Co-authored-by: Lang Hames <lhames@gmail.com>
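A minimal sketch of the kind of fix described above, assuming a hypothetical `DumpHeader` struct (the field names and the version number are illustrative, not the actual LLVM definition). Value-initializing the struct zeroes every field, so `Flags` can no longer carry stale bytes:

```cpp
#include <cstdint>

// Hypothetical header layout, loosely modelled on a jitdump file header.
// The real field set in LLVM's perf plugin may differ.
struct DumpHeader {
  uint32_t Magic;
  uint32_t Version;
  uint32_t TotalSize;
  uint32_t Pid;
  uint64_t Timestamp;
  uint64_t Flags;
};

// Builds a header with every field in a defined state.
DumpHeader makeHeader(uint32_t Pid, uint64_t Timestamp) {
  // `DumpHeader H{}` value-initializes the aggregate: all fields become zero,
  // including any field that is never assigned explicitly below (Flags).
  DumpHeader H{};
  H.Magic = 0x4A695444; // "JiTD"; illustrative value for this sketch
  H.Version = 1;        // assumed version number
  H.TotalSize = sizeof(DumpHeader);
  H.Pid = Pid;
  H.Timestamp = Timestamp;
  return H;
}
```

With `DumpHeader H;` instead of `DumpHeader H{};`, the unassigned `Flags` field would be indeterminate, which matches the symptom in the error message.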
Do not verify GUID existence in the pseudo probe desc by default, since it generates false-positive warnings with ThinLTO. Users can pass -pseudo-probe-verify-guid-existence-in-desc to verify it explicitly.
…uiltin CodeGen (#175113) This is a pre-commit for the CIR codegen of the `aesencwide/aesdecwide` builtin; it removes the useless `extractvalue` from clang CodeGen for this builtin.
It was left behind after f07988f.
Currently we do nothing RISC-V specific in this scheduler. This is a part of vtype-based scheduling. Reviewers: BeMg, mshockwave, lukel97, preames, topperc Pull Request: llvm/llvm-project#172613
This can be reused by #95924. Reviewers: BeMg, topperc, lukel97, preames, mshockwave Reviewed By: mshockwave, topperc Pull Request: llvm/llvm-project#172615
This can reduce some vsetvli toggles. This can be done in pre-ra scheduling as we have moved insertion of vsetvli after the first RA. Currently, we override `tryCandidate` and add a new heuristic based on comparison of `vtype`/`vl`. Reviewers: asb, preames, topperc, lukel97, mshockwave, BeMg Reviewed By: mshockwave, lukel97 Pull Request: llvm/llvm-project#95924
Based on suggestion from @macdice on llvm/llvm-project#175204. Thanks @macdice!
… (#175488) These ExecutorAddr calls were legacy from pre-ExecutorSymbolDef code. The getAddress method already returns an ExecutorAddr, so there's no need for them anymore.
…4626) The result type of the P extension's comparison instructions is the same as the operand type, and the result bits are all 1s or all 0s, so we need to set ZeroOrNegativeOneBooleanContent to let sext(setcc) be combined automatically.
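To illustrate why an all-1s/all-0s comparison result makes `sext(setcc)` trivial, here is a scalar C++ model (not SelectionDAG code; the function names are hypothetical). Sign-extending a value that is already 0 or -1 yields the same mask at the wider width, so no extra select or shift is needed:

```cpp
#include <cstdint>

// Models a comparison whose "true" result is all 1s (-1) and whose
// "false" result is all 0s, as described for the comparison instructions.
int16_t cmpEqMask(int16_t A, int16_t B) {
  return A == B ? int16_t(-1) : int16_t(0);
}

// Models sext(setcc): widening the 16-bit mask to 32 bits.
// Because the input is 0 or -1, the output is also 0 or -1.
int32_t sextOfCmp(int16_t A, int16_t B) {
  return static_cast<int32_t>(cmpEqMask(A, B));
}
```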
This patch is for special cases involving 0 vectors. During the comparison of vector operands, current code generation checks with `vcmpequh (vector compare equal unsigned halfword)` followed by a negation `xxlnor (VSX Vector Logical NOR XX3-form)`. This means that for the special case, instead of using `vcmpequh` and then negating the result, we can directly use `vcmpgtuh (vector compare greater than unsigned halfword)`. As a result the negation is avoided since the only condition where this will be false is for 0 as it is an `unsigned halfword`. --------- Co-authored-by: himadhith <himadhith.v@ibm.com>
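The equivalence the patch relies on can be checked with a per-lane scalar model in C++ (a sketch only; the helper names are hypothetical and no Power intrinsics are used). For unsigned 16-bit lanes, "not equal to zero" and "greater than zero" produce the same mask:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec8 = std::array<uint16_t, 8>;

// Old sequence, modelled per lane: compare-equal against zero, then negate
// the mask with a NOR of the value with itself.
Vec8 neZeroViaEqThenNor(const Vec8 &V) {
  Vec8 R{};
  for (std::size_t I = 0; I < V.size(); ++I) {
    uint16_t Eq = V[I] == 0 ? 0xFFFF : 0x0000;
    R[I] = static_cast<uint16_t>(~(Eq | Eq));
  }
  return R;
}

// New sequence, modelled per lane: a single unsigned greater-than against zero.
Vec8 neZeroViaGt(const Vec8 &V) {
  Vec8 R{};
  for (std::size_t I = 0; I < V.size(); ++I)
    R[I] = V[I] > 0 ? 0xFFFF : 0x0000;
  return R;
}
```

Both functions return the same mask for every input because an unsigned halfword is greater than zero exactly when it is nonzero.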
```shell
input_line_0:10:30: warning: comparison of integers of different signs: 'int' and 'unsigned long' [-Wsign-compare]
```
Addresses comment in #175322
…rse N2 (#174740) According to the [N2 SWOG](https://developer.arm.com/documentation/109914/latest/), flag-setting instructions for arithmetic/logical instructions should have a throughput of 3. [Similar to the V2 model](llvm/llvm-project#113542).
This patch adds initial support for the ARMv9.2+ Ampere1C core.
Add support for `__builtin_stack_address` builtin. The semantics match those of GCC's builtin with the same name. `__builtin_stack_address` returns the starting address of the stack region that may be used by called functions. It may or may not include the space used for on-stack arguments passed to a callee (See [GCC Bug/121013](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121013)). Fixes #82632.
When the SPIR-V backend handles a push constant, a new global with a target-specific is generated. This global should have the same visibility as the source, but it turns out this was not the case: linkage was still external, but visibility went from hidden to visible. This causes later passes to generate a Linkage decoration, which adds the Linkage capability, and that capability is not compatible with Vulkan. This patch fixes the visibility.
We've already optimised these, so update the cost model to reflect it. And skip the isBeforeLegalize check when lowering i8 muls, because it then misses the cases where, say v32i8, has been type legalised into 2x v16i8. Also explicitly disable memory interleaving for any factor other than two or four.
The original Interpreter implementation had a hard dependency on ORC and grew organically with the addition of out-of-process JIT support. This tightly coupled the Interpreter to a specific execution engine and leaked ORC-specific assumptions (runtime layout, symbol lookup, exception model) into higher layers. The WebAssembly integration demonstrated that incremental execution can be implemented without ORC, exposing the need for a cleaner abstraction boundary. This change introduces an IncrementalExecutor interface and moves ORC-based execution behind a concrete implementation. The Interpreter now depends only on the abstract executor, improving layering and encapsulation. In addition, the Interpreter can be configured with user-provided incremental executor implementations, enabling ORC-independent execution, easier testing, and future extensions without modifying the core Interpreter.
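The layering described above can be sketched as follows. This is an illustration under assumptions, not the clang API: `ToyExecutor`, `RecordingExecutor`, and `ToyInterpreter` are hypothetical names, and the real `IncrementalExecutor` interface has its own methods. The point is that the interpreter only sees the abstract base, so a test double can replace the ORC-based implementation:

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical abstract executor: the only thing the interpreter depends on.
class ToyExecutor {
public:
  virtual ~ToyExecutor() = default;
  // Accepts one incremental unit of code; returns true on success.
  virtual bool addModule(const std::string &Code) = 0;
};

// A user-provided implementation that records input instead of JIT-compiling.
// A JIT-backed executor would be another subclass of ToyExecutor.
class RecordingExecutor : public ToyExecutor {
public:
  explicit RecordingExecutor(std::vector<std::string> &Log) : Log(Log) {}
  bool addModule(const std::string &Code) override {
    Log.push_back(Code);
    return true;
  }

private:
  std::vector<std::string> &Log;
};

// Hypothetical interpreter: owns an executor through the abstract interface.
class ToyInterpreter {
public:
  explicit ToyInterpreter(std::unique_ptr<ToyExecutor> E)
      : Exec(std::move(E)) {}
  bool parseAndExecute(const std::string &Code) {
    return Exec->addModule(Code);
  }

private:
  std::unique_ptr<ToyExecutor> Exec;
};
```

Injecting the executor through the constructor is one possible design; it keeps the interpreter free of any execution-engine headers.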
…ash (#175484) The patch disables strict node mutation for LoongArch by setting IsStrictFPEnabled to true. This change fixes the current strict FP lowering crash only. ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS can be further improved. Fixes #174606
(select (setcc ...) (sub a, b) (sub b, a)) When b is const, the `sub a, b` becomes `add a, -b` which we take care of in this patch with the m_SpecificNeg() matcher.
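A scalar C++ model of the pattern, assuming an unsigned compare and an illustrative constant `K = 7` (both are assumptions for this sketch, not taken from the patch). When `b` is a constant, `a - b` is canonicalized to `a + (-b)`, so a matcher has to recognize the negated constant on the add side:

```cpp
#include <cstdint>

// General form: select (a > b) (sub a, b) (sub b, a).
uint32_t absDiffSub(uint32_t A, uint32_t B) {
  return A > B ? A - B : B - A;
}

// Constant form with b = K: `sub a, K` shows up as `add a, -K`.
// Unsigned arithmetic wraps, so `0u - K` is the two's-complement negation.
constexpr uint32_t K = 7;
uint32_t absDiffConstAdd(uint32_t A) {
  return A > K ? A + (0u - K) : K - A;
}
```

Both functions compute the same value for `b == K`, which is why the two spellings should be treated as the same pattern.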
This patch marks TestDetachResumes.py skipped on Windows/AArch64. It has been failing intermittently on Windows AArch64 buildbot: https://lab.llvm.org/buildbot/#/builders/141/ This extends the prior change that disabled the same test on Windows x86_64 (commit 6d8d4cf by Dmitry Vasilyev, 2025-06-23). See #144891 for background and original discussion.
When using `std::variant` with non-trivial types, we need to go through multiple bases to find the `_Which` member. The MSVC STL implements this in `xsmf_control.h` which conditionally adds/deletes copy/move constructors/operators. We now go to `_Variant_base` (the holder of `_Which`). This inherits from `_Variant_storage`, which is our entry point to finding the n-th storage (going through `_Tail`).
`[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue.
- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/allocator.adaptor

Towards #172124
Hi, @jsji
Thank you @jinge90 for the follow up!

@intel/llvm-gatekeepers This is ready for merge. Can someone help to issue a "/merge"? Thanks!

/merge
Does it accept comments only from certain persons like @sarnex?
No, it should accept all gatekeepers... but the automation sometimes fails ... :(
Yes, the automation is failing with "java.lang.ClassFormatError: Incompatible magic value 0 in class file java/lang/invoke/WrongMethodTypeException..."... @KornevNikita Can you try issuing "/merge" again? Hopefully it can be dispatched to another jenkins server without such exception, otherwise, we may need to do manual merge again ...

/merge
:(, automation still failing with " Caused: java.io.IOException: Remote call on scvelsch14_GPU_uplift failed" @hanzhan1 Can you please check and fix. Thanks!

should i do it manually again?

/merge
rip hope it would work for me :) |
Currently, the libdevice CUDA backend needs to invoke prepare-builtins from libclc, but this tool is going to be removed in #21176. This PR removes that dependency from libdevice.
---------
Signed-off-by: jinge90 <ge.jin@intel.com>
LLVM: llvm/llvm-project@e9f758a
SPIRV-LLVM-Translator: KhronosGroup/SPIRV-LLVM-Translator@fdbf92793d90efa