fix : vx_spawn #321

Open
talubik wants to merge 2 commits into vortexgpgpu:master from talubik:vxspawn_fix
talubik commented Feb 17, 2026

Hello. I have encountered a bug in vx_spawn. When the number of groups (or tasks) does not divide evenly across the active
cores, the old formula produces wrong offsets for cores with
0 < core_id < remaining (core 0 always gets offset 0), so some work items are skipped and others
are processed twice.

Cause of bug

Before computing the offset, total_groups_per_core is incremented for
cores that receive an extra group:

if (core_id < remaining_groups_per_core)
    ++total_groups_per_core;

The old formula then used this already-mutated variable:

group_offset = core_id * total_groups_per_core + MIN(core_id, remaining);

This double-counts the remainder for affected cores. The same bug was
present in the task distribution path (all_tasks_offset).
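
The double-counting can be written out as a short derivation. The symbols here are shorthand introduced for illustration and do not appear in the code: $b$ is the base number of groups per core, $r$ is the remainder (`remaining_groups_per_core`), and $c$ is `core_id`. Assuming the intended offset is $c \cdot b + \min(c, r)$, which matches the "correct offset" column in the example below:

$$
\begin{aligned}
c < r:\quad & \text{old}(c) = c\,(b + 1) + \min(c, r) = c\,b + 2c, \qquad \text{correct}(c) = c\,b + c \\
c \ge r:\quad & \text{old}(c) = c\,b + r = \text{correct}(c)
\end{aligned}
$$

So the old offset is too large by exactly $c$ for every core with $c < r$, and is correct for the remaining cores.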

Example

10 groups, 4 cores → base = 2, remaining = 2

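| Core | Old offset | Correct offset | Groups assigned (old) | Notes |
|------|------------|----------------|-----------------------|-------|
| 0 | 0 | 0 | [0, 1, 2] | |
| 1 | 4 | 3 | [4, 5, 6] | group 3 skipped |
| 2 | 6 | 6 | [6, 7] | group 6 duplicated |
| 3 | 8 | 8 | [8, 9] | |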
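
The table above can be reproduced with a small host-side sketch. This is not the vx_spawn source: the function names and the `num_groups` / `num_cores` parameters are hypothetical, and it assumes the base share is `num_groups / num_cores` with the remainder handed out one group at a time to the lowest core ids. The reference offset is simply the sum of the group counts of all lower-numbered cores.

```c
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

// Number of groups assigned to a core: the base share, plus one extra
// group for the first `remaining` cores. (Hypothetical helper.)
uint32_t groups_for_core(uint32_t core_id, uint32_t num_groups, uint32_t num_cores) {
    uint32_t base = num_groups / num_cores;
    uint32_t remaining = num_groups % num_cores;
    return base + (core_id < remaining ? 1u : 0u);
}

// Old formula as described in this PR: it multiplies by the per-core
// count that has already been incremented for cores below `remaining`.
uint32_t old_group_offset(uint32_t core_id, uint32_t num_groups, uint32_t num_cores) {
    uint32_t remaining = num_groups % num_cores;
    uint32_t total = groups_for_core(core_id, num_groups, num_cores);
    return core_id * total + MIN(core_id, remaining);
}

// Reference offset: the sum of the group counts of all lower core ids.
uint32_t reference_group_offset(uint32_t core_id, uint32_t num_groups, uint32_t num_cores) {
    uint32_t offset = 0;
    for (uint32_t c = 0; c < core_id; ++c)
        offset += groups_for_core(c, num_groups, num_cores);
    return offset;
}

// How many cores would process `group` under the old offsets:
// 0 means the group is skipped, 2 or more means it is duplicated.
uint32_t old_coverage(uint32_t group, uint32_t num_groups, uint32_t num_cores) {
    uint32_t hits = 0;
    for (uint32_t c = 0; c < num_cores; ++c) {
        uint32_t start = old_group_offset(c, num_groups, num_cores);
        uint32_t count = groups_for_core(c, num_groups, num_cores);
        if (group >= start && group < start + count)
            ++hits;
    }
    return hits;
}
```

With 10 groups and 4 cores, `old_coverage` reports that group 3 is covered by no core and group 6 by two cores, matching the notes in the table.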

This pull request fixes this bug.

@davymillion

Hi, I encountered the same issue, so I can confirm the bug.
The logic in your else branch is hard to follow, and I think it can be simplified by using the already-defined variable total_groups_per_core, which equals base_groups + 1 when core_id < remaining_groups_per_core and base_groups otherwise. Here is the patch I applied in my code (the same change should be applied to all_tasks_offset):

// calculate offsets for group distribution
uint32_t remaining_offset = 0;
if (core_id >= remaining_groups_per_core)
  // The remaining groups per core have been distributed (1 by 1) on the
  // previous cores (whose core_id are < remaining_groups_per_core), so we
  // now need to offset by the total remaining groups
  remaining_offset = remaining_groups_per_core;
uint32_t group_offset = core_id * total_groups_per_core + remaining_offset;

If I'm not mistaken, this is equivalent to your code. In the if case this is easy to show: remaining_offset stays at 0 and total_groups_per_core = base_groups + 1. In the else case, total_groups_per_core = base_groups, so your group_offset computation expands and simplifies to: group_offset = remaining_groups_per_core * (total_groups_per_core + 1) + (core_id - remaining_groups_per_core) * (total_groups_per_core) = remaining_groups_per_core + total_groups_per_core * core_id, which is exactly what the patch above computes.
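
The equivalence argument can also be checked by brute force. The sketch below wraps the simplified formula from this comment in a hypothetical helper and compares it against a prefix-sum reference (the sum of the group counts of all lower core ids). The function names and the `num_groups` / `num_cores` parameters are assumptions made for illustration, as is the split of `num_groups / num_cores` groups per core with the remainder given to the lowest core ids.

```c
#include <stdint.h>

// Simplified offset from the comment above, wrapped in a hypothetical helper.
uint32_t patched_group_offset(uint32_t core_id, uint32_t num_groups, uint32_t num_cores) {
    uint32_t total_groups_per_core = num_groups / num_cores;
    uint32_t remaining_groups_per_core = num_groups % num_cores;
    // Cores below the remainder take one extra group.
    if (core_id < remaining_groups_per_core)
        ++total_groups_per_core;

    // Cores at or past the remainder skip over all the extra groups
    // that were handed to the lower core ids.
    uint32_t remaining_offset = 0;
    if (core_id >= remaining_groups_per_core)
        remaining_offset = remaining_groups_per_core;
    return core_id * total_groups_per_core + remaining_offset;
}

// Reference: add up the group counts of every lower-numbered core.
uint32_t prefix_sum_offset(uint32_t core_id, uint32_t num_groups, uint32_t num_cores) {
    uint32_t base = num_groups / num_cores;
    uint32_t remaining = num_groups % num_cores;
    uint32_t offset = 0;
    for (uint32_t c = 0; c < core_id; ++c)
        offset += base + (c < remaining ? 1u : 0u);
    return offset;
}

// Returns 1 if both computations agree for every core id, for all core
// counts in [1, max_cores] and group counts in [0, max_groups]; else 0.
int patch_matches_reference(uint32_t max_groups, uint32_t max_cores) {
    for (uint32_t cores = 1; cores <= max_cores; ++cores)
        for (uint32_t groups = 0; groups <= max_groups; ++groups)
            for (uint32_t core = 0; core < cores; ++core)
                if (patched_group_offset(core, groups, cores) !=
                    prefix_sum_offset(core, groups, cores))
                    return 0;
    return 1;
}
```

A check such as `patch_matches_reference(64, 16)` exercises both the evenly divisible and the uneven cases, including configurations with fewer groups than cores.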
