Run FJ heuristics before and during presolve #899

aliceb-nv wants to merge 21 commits into NVIDIA:main from
Conversation
/ok to test e019653
Note: Reviews paused
It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
Use the following commands to manage reviews, or the checkboxes below for quick actions.
📝 Walkthrough

Adds an early feasibility-jump heuristic framework (CPU & GPU) that runs during presolve and centralizes B&B cutoff logic.

Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks: ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu (1)
1405-1425: ⚠️ Potential issue | 🟠 Major
Guard against overflow when in_time_limit is infinite.
Casting inf * 1000 to int is undefined; even large finite limits can overflow. Only construct milliseconds for finite limits (or use milliseconds::max()).
Suggested fix
- auto time_limit = std::chrono::milliseconds((int)(in_time_limit * 1000));
+ auto time_limit =
+   (in_time_limit >= std::numeric_limits<f_t>::infinity())
+     ? std::chrono::milliseconds::max()
+     : std::chrono::milliseconds(static_cast<long long>(in_time_limit * 1000.0));
@@
- if (in_time_limit < std::numeric_limits<f_t>::infinity() &&
-     now - loop_time_start > time_limit) {
+ if (time_limit != std::chrono::milliseconds::max() &&
+     now - loop_time_start > time_limit) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 1405 - 1425, The current cpufj_solve_loop casts (int)(in_time_limit * 1000) into std::chrono::milliseconds which is unsafe for infinite or very large in_time_limit; change the logic so you only construct time_limit when in_time_limit is finite (e.g., check std::isfinite(in_time_limit)) and otherwise set time_limit = std::chrono::milliseconds::max(); update the places that use time_limit (the comparison with now - loop_time_start) accordingly so infinite limits won’t trigger overflow or undefined behavior.
🧹 Nitpick comments (1)
cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu (1)
47-51: Nit: unnamed double parameter in the lambda.
The third double parameter on line 48 is silently discarded. If this is the elapsed-time argument from fj_improvement_callback_t, consider giving it a name (even if unused) or adding a brief comment for readability. Not blocking.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu` around lines 47 - 51, The lambda assigned to cpu_fj_thread_->fj_cpu->improvement_callback currently has an unnamed third parameter (double) which is silently discarded; give that parameter a descriptive name (e.g., "elapsed" or "time_sec") or add an inline comment (e.g., /* elapsed time */) in the lambda signature so callers reading the fj_improvement_callback_t usage understand the parameter purpose while keeping behavior unchanged in try_update_best and get_user_obj_from_solver_obj.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/branch_and_bound/branch_and_bound.hpp`:
- Around line 111-117: The field initial_cutoff_ is accessed concurrently by
set_initial_cutoff() and get_cutoff(), causing a data race; make initial_cutoff_
thread-safe by changing its type to std::atomic<f_t> (or protect reads/writes
with mutex_upper_) and update set_initial_cutoff to use .store(...) and
get_cutoff to use .load(...) when computing std::min(upper_bound_.load(),
initial_cutoff_.load()); apply the same change to any other places that
read/write initial_cutoff_ (including the other setter/getter variants
referenced around the same variable).
In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu`:
- Around line 936-949: Reset last_reported_objective_ at the start of each run
based on the objective sense (set to +inf for minimization, -inf for
maximization), then when invoking improvement_callback use a sense-aware
comparison (use < for minimization and > for maximization) against
solution.get_user_objective(); perform the device→host copy of
solution.assignment with raft::copy into h_assignment on climber_stream,
synchronize the stream, and immediately check for CUDA/raft errors (use the
project's CUDA_CHECK or equivalent) before calling
improvement_callback(user_obj, h_assignment) to ensure the copy succeeded.
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 790-794: The CPU path is passing solver-space objective
(fj_cpu.h_incumbent_objective) to fj_cpu.improvement_callback but should pass
the user-space objective like the GPU path does; before calling
improvement_callback compute the user-facing objective by converting the solver
objective using the same conversion used in GPU FJ (e.g., call the
solution.get_user_objective() equivalent or the shared conversion utility on the
current incumbent represented by fj_cpu.h_assignment), then pass that user-space
objective along with fj_cpu.h_assignment and current_work_units to
fj_cpu.improvement_callback.
In `@cpp/src/mip_heuristics/solve.cu`:
- Around line 307-323: The early-FJ incumbent tracking currently assumes
minimization (early_best_objective initialized to +inf and comparison objective
>= early_best_objective), which breaks maximization; change early_best_objective
initialization and the comparison in early_fj_callback based on the problem
sense (use -infinity for maximize and +infinity for minimize), and flip the
guard to accept a better incumbent when (minimize: objective < best) or
(maximize: objective > best); ensure the cutoff value passed into run_mip is set
from early_best_objective using the same sense-aware logic so the correct cutoff
is enforced for both maximize and minimize cases (update references to
early_best_objective, run_early_fj, early_fj_callback, and the point where
cutoff is derived for run_mip).
- Around line 56-72: invoke_solution_callbacks always constructs bound_vec with
+infinity which is incorrect for maximization cases; update the helper to accept
an objective sense (or a bound value) and set bound_vec to +infinity for
minimization and -infinity for maximization before calling get_solution on
internals::get_solution_callback_t; adjust the invoke_signature (and callers) to
pass the sense or concrete bound, and ensure the check using
internals::base_solution_callback_type::GET_SOLUTION and the call to
get_solution(assignment.data(), obj_vec.data(), bound_vec.data(), ...) remain
unchanged.
In `@cpp/src/mip_heuristics/solver.cu`:
- Around line 118-125: The code uses std::min when updating
context.initial_cutoff from context.early_cpufj_ptr->get_best_objective(), which
only works for minimization; change the update to be sense-aware: when the
problem is minimization set initial_cutoff = std::min(initial_cutoff, obj), and
when maximization set initial_cutoff = std::max(initial_cutoff, obj). Update the
logic around context.early_cpufj_ptr handling (references:
context.early_cpufj_ptr, get_best_objective(), context.initial_cutoff) and
ensure context.initial_cutoff is initialized/seeding to +inf for minimization
and -inf for maximization so the comparison behaves correctly.
---
Outside diff comments:
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu`:
- Around line 1405-1425: The current cpufj_solve_loop casts (int)(in_time_limit
* 1000) into std::chrono::milliseconds which is unsafe for infinite or very
large in_time_limit; change the logic so you only construct time_limit when
in_time_limit is finite (e.g., check std::isfinite(in_time_limit)) and otherwise
set time_limit = std::chrono::milliseconds::max(); update the places that use
time_limit (the comparison with now - loop_time_start) accordingly so infinite
limits won’t trigger overflow or undefined behavior.
---
Nitpick comments:
In `@cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu`:
- Around line 47-51: The lambda assigned to
cpu_fj_thread_->fj_cpu->improvement_callback currently has an unnamed third
parameter (double) which is silently discarded; give that parameter a
descriptive name (e.g., "elapsed" or "time_sec") or add an inline comment (e.g.,
/* elapsed time */) in the lambda signature so callers reading the
fj_improvement_callback_t usage understand the parameter purpose while keeping
behavior unchanged in try_update_best and get_user_obj_from_solver_obj.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
cpp/src/branch_and_bound/branch_and_bound.cpp
cpp/src/branch_and_bound/branch_and_bound.hpp
cpp/src/mip_heuristics/CMakeLists.txt
cpp/src/mip_heuristics/diversity/population.cu
cpp/src/mip_heuristics/early_heuristic.cuh
cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu
cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cuh
cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu
cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cuh
cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu
cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cuh
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu
cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cuh
cpp/src/mip_heuristics/solve.cu
cpp/src/mip_heuristics/solver.cu
cpp/src/mip_heuristics/solver_context.cuh
  // Invoke improvement callback if we have a better feasible solution
  if (is_feasible && improvement_callback) {
    f_t user_obj = solution.get_user_objective();
    if (user_obj < last_reported_objective_) {
      last_reported_objective_ = user_obj;
      // Copy assignment to host for callback
      std::vector<f_t> h_assignment(solution.assignment.size());
      raft::copy(h_assignment.data(),
                 solution.assignment.data(),
                 solution.assignment.size(),
                 climber_stream);
      climber_stream.synchronize();
      improvement_callback(user_obj, h_assignment);
    }
Handle objective sense + reset for improvement callback.
The < comparison assumes minimization; for maximization it suppresses real improvements and can misreport cutoffs. Also last_reported_objective_ is never reset between runs, so later solves may never emit callbacks. Initialize it per run using the objective sense and compare accordingly. Also add a CUDA error check after the new device→host copy.
Suggested fix (sense-aware + reset + error check)
@@
- f_t obj = -std::numeric_limits<f_t>::infinity();
+ f_t obj = -std::numeric_limits<f_t>::infinity();
data.incumbent_quality.set_value_async(obj, handle_ptr->get_stream());
+ last_reported_objective_ =
+ pb_ptr->maximize ? -std::numeric_limits<f_t>::infinity()
+ : std::numeric_limits<f_t>::infinity();
@@
- if (is_feasible && improvement_callback) {
- f_t user_obj = solution.get_user_objective();
- if (user_obj < last_reported_objective_) {
+ if (is_feasible && improvement_callback) {
+ f_t user_obj = solution.get_user_objective();
+ const bool improved =
+ pb_ptr->maximize ? (user_obj > last_reported_objective_)
+ : (user_obj < last_reported_objective_);
+ if (improved) {
last_reported_objective_ = user_obj;
std::vector<f_t> h_assignment(solution.assignment.size());
raft::copy(h_assignment.data(),
solution.assignment.data(),
solution.assignment.size(),
climber_stream);
climber_stream.synchronize();
+ RAFT_CHECK_CUDA(climber_stream);
improvement_callback(user_obj, h_assignment);
}
  }
Based on learnings: Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results.
As per coding guidelines: Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu` around lines 936
- 949, Reset last_reported_objective_ at the start of each run based on the
objective sense (set to +inf for minimization, -inf for maximization), then when
invoking improvement_callback use a sense-aware comparison (use < for
minimization and > for maximization) against solution.get_user_objective();
perform the device→host copy of solution.assignment with raft::copy into
h_assignment on climber_stream, synchronize the stream, and immediately check
for CUDA/raft errors (use the project's CUDA_CHECK or equivalent) before calling
improvement_callback(user_obj, h_assignment) to ensure the copy succeeded.
Was updated to compare the solver-space objectives, which are always minimization.
@aliceb-nv, thank you for the clarification! Using solver-space objectives (which are always minimization) is a clean solution that makes the < comparison correct without needing conditional logic based on objective sense.
Quick follow-up: did the update also address the other two points from my review?
- Resetting last_reported_objective_ at the start of each run (currently it persists across multiple solve() calls, which could prevent callbacks in subsequent runs)
- CUDA error checking after the raft::copy + synchronize to catch any copy failures before invoking the callback
If those are intentionally deferred or not applicable in this context, no worries—just wanted to confirm. 👍
🧠 Learnings used
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*.{cu,cuh,cpp,hpp,h} : Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover)
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Add tests for algorithm phase transitions: verify correct initialization of bounds and state when transitioning from presolve to simplex to diving to crossover
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Applies to **/*test*.{cpp,cu,py} : Write tests validating numerical correctness of optimization results (not just 'runs without error'); test degenerate cases (infeasible, unbounded, empty, singleton problems)
Learnt from: chris-maes
Repo: NVIDIA/cuopt PR: 746
File: cpp/src/dual_simplex/barrier.cu:549-552
Timestamp: 2026-01-14T00:38:38.038Z
Learning: In cpp/src/dual_simplex/barrier.cu's form_adat() method within the barrier solver, the raft::copy from d_original_A_values to device_AD.x is necessary and must not be removed. This copy restores the original matrix A values before they are scaled by the diagonal matrix D (via d_inv_diag_prime). Since D changes between iterations and the scaling mutates device_AD.x, copying the original values each iteration is required for correctness.
Learnt from: aliceb-nv
Repo: NVIDIA/cuopt PR: 721
File: cpp/src/linear_programming/optimization_problem.cu:106-205
Timestamp: 2026-01-29T11:25:28.213Z
Learning: The function `csr_matrices_equivalent_with_permutation` in `cpp/src/linear_programming/optimization_problem.cu` is only used in CI testing with known inputs, so strict problem size validation is not required.
Learnt from: CR
Repo: NVIDIA/cuopt PR: 0
File: .github/.coderabbit_review_guide.md:0-0
Timestamp: 2025-11-25T10:20:49.822Z
Learning: Reduce tight coupling between solver components (presolve, simplex, basis, barrier); increase modularity and reusability of optimization algorithms
  if (fj_cpu.improvement_callback) {
    double current_work_units = fj_cpu.work_units_elapsed.load(std::memory_order_acquire);
    fj_cpu.improvement_callback(
-     fj_cpu.h_best_objective, fj_cpu.h_assignment, current_work_units);
+     fj_cpu.h_incumbent_objective, fj_cpu.h_assignment, current_work_units);
  }
CPUFJ improvement callback should emit user-space objective.
GPU FJ uses solution.get_user_objective(), but CPUFJ forwards h_incumbent_objective (solver space). This can flip sign or ignore offsets for maximize/scaling, corrupting early cutoffs and user callbacks. Convert before invoking the callback.
Suggested fix
- fj_cpu.improvement_callback(
- fj_cpu.h_incumbent_objective, fj_cpu.h_assignment, current_work_units);
+ auto user_obj =
+ fj_cpu.pb_ptr->get_user_obj_from_solver_obj(fj_cpu.h_incumbent_objective);
+ fj_cpu.improvement_callback(user_obj, fj_cpu.h_assignment, current_work_units);
Based on learnings: Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/feasibility_jump/fj_cpu.cu` around lines 790 - 794,
The CPU path is passing solver-space objective (fj_cpu.h_incumbent_objective) to
fj_cpu.improvement_callback but should pass the user-space objective like the
GPU path does; before calling improvement_callback compute the user-facing
objective by converting the solver objective using the same conversion used in
GPU FJ (e.g., call the solution.get_user_objective() equivalent or the shared
conversion utility on the current incumbent represented by fj_cpu.h_assignment),
then pass that user-space objective along with fj_cpu.h_assignment and
current_work_units to fj_cpu.improvement_callback.
  // Stop early CPUFJ after cuopt presolve (probing cache) but before main solve
  if (context.early_cpufj_ptr) {
    context.early_cpufj_ptr->stop();
    if (context.early_cpufj_ptr->solution_found()) {
      f_t obj = context.early_cpufj_ptr->get_best_objective();
      context.initial_cutoff = std::min(context.initial_cutoff, obj);
      CUOPT_LOG_INFO("Early CPUFJ found incumbent with objective %g during presolve", obj);
    }
Cutoff update must respect objective sense.
std::min assumes minimization; for maximization this never updates initial_cutoff. Use a sense-aware comparison (and ensure initial_cutoff is seeded to ±inf accordingly).
Suggested fix
- context.initial_cutoff = std::min(context.initial_cutoff, obj);
+ const bool improved = context.problem_ptr->maximize
+ ? (obj > context.initial_cutoff)
+ : (obj < context.initial_cutoff);
+ if (improved) { context.initial_cutoff = obj; }
Based on learnings: Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
  // Stop early CPUFJ after cuopt presolve (probing cache) but before main solve
  if (context.early_cpufj_ptr) {
    context.early_cpufj_ptr->stop();
    if (context.early_cpufj_ptr->solution_found()) {
      f_t obj = context.early_cpufj_ptr->get_best_objective();
      const bool improved = context.problem_ptr->maximize
                              ? (obj > context.initial_cutoff)
                              : (obj < context.initial_cutoff);
      if (improved) { context.initial_cutoff = obj; }
      CUOPT_LOG_INFO("Early CPUFJ found incumbent with objective %g during presolve", obj);
    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/solver.cu` around lines 118 - 125, The code uses
std::min when updating context.initial_cutoff from
context.early_cpufj_ptr->get_best_objective(), which only works for
minimization; change the update to be sense-aware: when the problem is
minimization set initial_cutoff = std::min(initial_cutoff, obj), and when
maximization set initial_cutoff = std::max(initial_cutoff, obj). Update the
logic around context.early_cpufj_ptr handling (references:
context.early_cpufj_ptr, get_best_objective(), context.initial_cutoff) and
ensure context.initial_cutoff is initialized/seeding to +inf for minimization
and -inf for maximization so the comparison behaves correctly.
namespace cuopt::linear_programming::detail {

template <typename i_t, typename f_t>
early_cpufj_t<i_t, f_t>::early_cpufj_t(
why do you need a separate class for this one? Can't you repurpose the existing one?
This is just a wrapper around said existing CPUFJ class for convenience in that specific scenario. This is helpful because we run cpufj twice in the early-heuristics: before papilo presolve (on the original problem), and before cuopt presolve (on the papilo-presolved problem)
…v/cuopt-fork into pre-presolve-heuristics
/ok to test d5f7dde

/ok to test 62a53ef
Actionable comments posted: 2
♻️ Duplicate comments (1)
cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu (1)
936-949: ⚠️ Potential issue | 🟠 Major
Duplicate: improvement callback still assumes minimization and lacks CUDA error check.
Same as the prior review: sense-aware comparison/reset is still missing, and the device→host copy should be CUDA-checked before invoking the callback.
Based on learnings: Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results.
As per coding guidelines: Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification.
Also applies to: 1078-1080
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu` around lines 936 - 949, The improvement callback logic assumes minimization and lacks CUDA error checks — modify the comparison of solution.h_obj against last_reported_objective_ to be sense-aware (use the problem/objective_sense or similar flag to choose < for minimization and > for maximization and ensure last_reported_objective_ is initialized/reset accordingly), and after calling raft::copy and climber_stream.synchronize() check for CUDA errors (use CUDA_CHECK or your project's GPU error macro) before invoking improvement_callback(user_obj, h_assignment); reference functions/variables: improvement_callback, solution.get_user_objective(), solution.h_obj, last_reported_objective_, raft::copy, climber_stream.synchronize(), and add CUDA error verification to the device→host copy/sync path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/mip_heuristics/early_heuristic.cuh`:
- Around line 41-49: The device writes initiated by thrust::fill and any
device-side operations inside problem_ptr_->preprocess_problem() and
solution_ptr_->clamp_within_bounds() lack CUDA error checks; after the
thrust::fill call and immediately after any device memory/copy or kernel
launches performed within preprocess_problem() and clamp_within_bounds(), wrap
or follow those operations with the project's CUDA error verifier (e.g.,
CUDA_CHECK or equivalent) to check cudaGetLastError()/cudaDeviceSynchronize()
and report failures, referencing the symbols problem_ptr_, solution_ptr_,
preprocess_problem(), clamp_within_bounds(), thrust::fill,
handle_ptr->get_thrust_policy(), and solution_ptr_->assignment to locate where
to insert the checks.
In `@cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu`:
- Around line 39-61: start() doesn't clear the preemption flag so a restarted
instance immediately exits if stop() previously set preempt_heuristic_solver_;
before launching the worker reset preempt_heuristic_solver_ = false (or call a
reset method) in early_gpufj_t::start() so run_worker can proceed, ensuring this
happens before creating worker_thread_ (refer to start(), stop(),
preempt_heuristic_solver_, worker_thread_, and run_worker).
---
Duplicate comments:
In `@cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu`:
- Around line 936-949: The improvement callback logic assumes minimization and
lacks CUDA error checks — modify the comparison of solution.h_obj against
last_reported_objective_ to be sense-aware (use the problem/objective_sense or
similar flag to choose < for minimization and > for maximization and ensure
last_reported_objective_ is initialized/reset accordingly), and after calling
raft::copy and climber_stream.synchronize() check for CUDA errors (use
CUDA_CHECK or your project's GPU error macro) before invoking
improvement_callback(user_obj, h_assignment); reference functions/variables:
improvement_callback, solution.get_user_objective(), solution.h_obj,
last_reported_objective_, raft::copy, climber_stream.synchronize(), and add CUDA
error verification to the device→host copy/sync path.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
cpp/src/mip_heuristics/early_heuristic.cuh
cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu
cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu
cpp/src/mip_heuristics/feasibility_jump/feasibility_jump.cu
cpp/src/mip_heuristics/solve.cu
cpp/src/mip_heuristics/solver.cu
  problem_ptr_ = std::make_unique<problem_t<i_t, f_t>>(op_problem, tolerances, false);
  problem_ptr_->preprocess_problem();

  solution_ptr_ = std::make_unique<solution_t<i_t, f_t>>(*problem_ptr_);
  thrust::fill(problem_ptr_->handle_ptr->get_thrust_policy(),
               solution_ptr_->assignment.begin(),
               solution_ptr_->assignment.end(),
               f_t{0});
  solution_ptr_->clamp_within_bounds();
Add CUDA error checks after device operations.
Device writes/copies here are unchecked; per guidelines, add CUDA error verification after these operations.
🛠️ Proposed fix
thrust::fill(problem_ptr_->handle_ptr->get_thrust_policy(),
solution_ptr_->assignment.begin(),
solution_ptr_->assignment.end(),
f_t{0});
solution_ptr_->clamp_within_bounds();
+ RAFT_CHECK_CUDA(problem_ptr_->handle_ptr->get_stream());
@@
rmm::device_uvector<f_t> d_assignment(assignment.size(), handle_ptr->get_stream());
raft::copy(d_assignment.data(), assignment.data(), assignment.size(), handle_ptr->get_stream());
problem_ptr_->post_process_assignment(d_assignment);
auto user_assignment = cuopt::host_copy(d_assignment, handle_ptr->get_stream());
+ RAFT_CHECK_CUDA(handle_ptr->get_stream());
As per coding guidelines: Every CUDA kernel launch and memory operation must have error checking with CUDA_CHECK or equivalent verification.
Also applies to: 61-72
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/early_heuristic.cuh` around lines 41 - 49, The device
writes initiated by thrust::fill and any device-side operations inside
problem_ptr_->preprocess_problem() and solution_ptr_->clamp_within_bounds() lack
CUDA error checks; after the thrust::fill call and immediately after any device
memory/copy or kernel launches performed within preprocess_problem() and
clamp_within_bounds(), wrap or follow those operations with the project's CUDA
error verifier (e.g., CUDA_CHECK or equivalent) to check
cudaGetLastError()/cudaDeviceSynchronize() and report failures, referencing the
symbols problem_ptr_, solution_ptr_, preprocess_problem(),
clamp_within_bounds(), thrust::fill, handle_ptr->get_thrust_policy(), and
solution_ptr_->assignment to locate where to insert the checks.
void early_gpufj_t<i_t, f_t>::start()
{
  if (worker_thread_) { return; }

  this->start_time_ = std::chrono::steady_clock::now();

  fj_settings_t fj_settings;
  fj_settings.mode                   = fj_mode_t::EXIT_NON_IMPROVING;
  fj_settings.n_of_minimums_for_exit = std::numeric_limits<int>::max();
  fj_settings.time_limit             = std::numeric_limits<f_t>::infinity();
  fj_settings.iteration_limit        = std::numeric_limits<int>::max();
  fj_settings.update_weights         = true;
  fj_settings.feasibility_run        = false;

  fj_ptr_ = std::make_unique<fj_t<i_t, f_t>>(*context_ptr_, fj_settings);

  fj_ptr_->improvement_callback = [this](f_t user_obj, const std::vector<f_t>& h_assignment) {
    f_t solver_obj = this->problem_ptr_->get_solver_obj_from_user_obj(user_obj);
    this->try_update_best(solver_obj, h_assignment);
  };

  worker_thread_ = std::make_unique<std::thread>(&early_gpufj_t::run_worker, this);
}
Reset preemption flag before launching the worker.
stop() sets preempt_heuristic_solver_ to true; without clearing it in start(), a restarted instance will exit immediately.
🛠️ Proposed fix
void early_gpufj_t<i_t, f_t>::start()
{
if (worker_thread_) { return; }
+ context_ptr_->preempt_heuristic_solver_.store(false);
  this->start_time_ = std::chrono::steady_clock::now();
As per coding guidelines: Validate correct initialization of variable bounds, constraint coefficients, and algorithm state before solving; ensure reset when transitioning between algorithm phases (presolve, simplex, diving, crossover).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/feasibility_jump/early_gpufj.cu` around lines 39 - 61,
start() doesn't clear the preemption flag so a restarted instance immediately
exits if stop() previously set preempt_heuristic_solver_; before launching the
worker reset preempt_heuristic_solver_ = false (or call a reset method) in
early_gpufj_t::start() so run_worker can proceed, ensuring this happens before
creating worker_thread_ (refer to start(), stop(), preempt_heuristic_solver_,
worker_thread_, and run_worker).
/ok to test ca34f77
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp (1)
35-51: ⚠️ Potential issue | 🟠 Major
Guard against nullptr PyObject* before calling Py_DECREF and passing to PyObject_CallMethod.
Lines 36-38, 75-77: get_numpy_array() returns the result of PyObject_CallMethod, which may be NULL on Python API errors. Without null checks, passing these null pointers to PyObject_CallMethod(..., "OOOO", ...) will fail (the "O" format requires non-NULL objects), and lines 47-49 and 86-88 call Py_DECREF() on potentially null pointers, which is undefined behavior that crashes. Add null checks immediately after the three get_numpy_array() calls, release the GIL on early exit, and use Py_XDECREF() for safe cleanup of any potentially null pointer.
🔧 Suggested fix (apply to both get_solution and set_solution)
 PyGILState_STATE gstate = PyGILState_Ensure();
 PyObject* numpy_matrix = get_numpy_array(data, n_variables);
 PyObject* numpy_array  = get_numpy_array(objective_value, 1);
 PyObject* numpy_bound  = get_numpy_array(solution_bound, 1);
+if (!numpy_matrix || !numpy_array || !numpy_bound) {
+  Py_XDECREF(numpy_matrix);
+  Py_XDECREF(numpy_array);
+  Py_XDECREF(numpy_bound);
+  PyGILState_Release(gstate);
+  return;
+}
 PyObject* py_user_data = user_data == nullptr ? Py_None : static_cast<PyObject*>(user_data);
 PyObject* res = PyObject_CallMethod(this->pyCallbackClass,
                                     "get_solution",
                                     "(OOOO)",
                                     numpy_matrix,
                                     numpy_array,
                                     numpy_bound,
                                     py_user_data);
 Py_DECREF(numpy_matrix);
 Py_DECREF(numpy_array);
 Py_DECREF(numpy_bound);
-if (res != nullptr) { Py_DECREF(res); }
+Py_XDECREF(res);
 PyGILState_Release(gstate);
Also applies to: lines 74-90 (set_solution callback).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/include/cuopt/linear_programming/utilities/callbacks_implems.hpp` around lines 35 - 51, The code calls get_numpy_array() for matrix/array/bound and then passes them to PyObject_CallMethod and Py_DECREF without checking for NULL; change get_solution (and set_solution) to check each PyObject* result immediately after get_numpy_array and if any is NULL, release the GIL and return early (do not call PyObject_CallMethod), and replace Py_DECREF calls for these potentially-NULL objects with Py_XDECREF to safely handle NULL; also guard the PyObject_CallMethod invocation so it only runs when all three numpy_* pointers are non-NULL and ensure py_user_data is set to Py_None as already done.
/ok to test f349b5b

/ok to test 506d046
Actionable comments posted: 2
♻️ Duplicate comments (1)
cpp/src/mip_heuristics/solve.cu (1)
306-326: Min-only incumbent comparison still looks present.

This block still initializes `early_best_objective` to +inf and rejects incumbents with `solver_obj >= best`, which only works for minimization unless the objective is always normalized. Please confirm the normalization or reintroduce sense-aware comparisons as previously discussed.

Based on learnings: Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results.
Based on learnings: Validate algorithm correctness in optimization logic: simplex pivots, branch-and-bound decisions, routing heuristics, and constraint/objective handling must produce correct results.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@cpp/src/mip_heuristics/solve.cu` around lines 306 - 326, The early incumbent logic in early_fj_callback currently assumes minimization by initializing early_best_objective to +inf and using solver_obj >= early_best_objective to reject candidates; change this to be sense-aware: use the objective sense (e.g. problem.presolve_data.objective_scaling_factor or another canonical indicator) to initialize early_best_objective to +inf for minimization and -inf for maximization, and switch the comparison to reject candidates using (solver_obj >= best) for minimization but (solver_obj <= best) for maximization; update the early_fj_callback comparison and the early_best_objective.store call to follow the same sense, and ensure no_bound (already computed from objective_scaling_factor) remains consistent with this logic (use the same sign test where referenced).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@cpp/src/mip_heuristics/solve.cu`:
- Around line 169-195: The no_bound value is chosen using
presolve_data.objective_scaling_factor sign which can be unrelated to the user's
objective sense; change the logic to query the original problem's objective
sense (via problem.original_problem_ptr, e.g., a method like
get_objective_sense() / is_minimization()) and set no_bound to +infinity for a
minimization objective and -infinity for a maximization objective, then use that
no_bound in the incumbent_callback that calls invoke_solution_callbacks; update
the block that computes no_bound (and any comment) and keep the rest of the
early_cpufj/incumbent_callback code the same.
```diff
 }

-auto sol = run_mip(problem, settings, timer);
+auto sol = run_mip(problem, settings, timer, early_best_objective.load());
```
🧩 Analysis chain

```shell
# Locate solve.cu and inspect the context around line 403
find . -name "solve.cu" -type f 2>/dev/null | head -20
wc -l cpp/src/mip_heuristics/solve.cu
sed -n '380,430p' cpp/src/mip_heuristics/solve.cu

# Find where early_best_objective is declared/initialized
rg "early_best_objective" -B 5 -A 2 cpp/src/mip_heuristics/solve.cu

# Search for mip_scaling usage to understand when scaling is applied
rg "mip_scaling|scaling" -B 3 -A 3 cpp/src/mip_heuristics/solve.cu | head -100

# Look at the run_mip function signature and call sites
rg "run_mip" -B 2 -A 10 cpp/src/mip_heuristics/solve.cu | head -150

# Find the early FJ implementations and the objective space they report in
find cpp/src -name "early_cpufj*" -type f
find cpp/src -name "early_gpufj*" -type f
head -100 cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cuh
grep -A 20 "solver_obj\|callback" cpp/src/mip_heuristics/feasibility_jump/early_cpufj.cu | head -80
rg "early_fj_callback|early_gpufj|early_cpufj" -B 3 -A 8 cpp/src/mip_heuristics/solve.cu

# Check how initial_cutoff is used in the solver and branch-and-bound
rg "initial_cutoff" -B 2 -A 5 cpp/src/
rg "initial_cutoff" -B 5 -A 5 cpp/src/branch_and_bound/
rg "scale_problem|initial_cutoff" cpp/src/mip_heuristics/solve.cu | head -30

# Look for any conversion between scaled/unscaled objectives or cutoffs
rg "objective_scaling_factor" -B 2 -A 2 cpp/src/mip_heuristics/solve.cu | head -150
rg "objective_scaling_factor|scale.*cutoff|cutoff.*scale" -B 2 -A 2 cpp/src/mip_heuristics/
rg "get_solver_obj|get_user_obj|scale.*cutoff|cutoff.*scale" cpp/src/mip_heuristics/

# Inspect the relevant sections of solve.cu directly
sed -n '200,350p' cpp/src/mip_heuristics/solve.cu
sed -n '50,150p' cpp/src/mip_heuristics/solve.cu
```
Verify cutoff scaling before passing to run_mip.
early_best_objective is captured from the unscaled problem (early FJ runs on op_problem before scaling), but the cutoff is passed to run_mip without conversion to the scaled objective space. When mip_scaling is enabled, the unscaled cutoff used for branch-and-bound pruning may be in the wrong scale. Convert using get_solver_obj_from_user_obj if needed, or document that the scaling factor does not require conversion for cutoff comparisons.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@cpp/src/mip_heuristics/solve.cu` at line 403, early_best_objective is in the
unscaled objective space but is passed directly into run_mip, so when
mip_scaling is enabled the cutoff will be wrong for B&B pruning; before calling
run_mip, convert the loaded cutoff via get_solver_obj_from_user_obj (using the
solver scaling parameters available on the Problem or Settings objects) and pass
that converted value into run_mip (or, alternatively, explicitly document why
conversion is not required if that is intended). Ensure you reference
early_best_objective.load(), mip_scaling, get_solver_obj_from_user_obj, and
run_mip when making the change.
check if mip_scaling actually affects the objective function coefficients
/ok to test 33230b6

/ok to test 71e8c9a

/ok to test 9d7055e

/ok to test 7b9451b8898f53af58e399314faec3c9733a63de

@aliceb-nv, there was an error processing your request: See the following link for more information: https://docs.gha-runners.nvidia.com/cpr/e/2/

/ok to test 7b9451b

/ok to test c5d0bd5

/ok to test dee5d4d
This PR lets FJ (both CPU and GPU) run before presolving is completed, in order to provide incumbent callbacks as early as possible.
While we cannot yet crush these incumbents into the fully-presolved problem representation, their primal values can be used to prune the B&B tree and provide stronger reduced-cost fixings.
With this PR, we now find 191 feasible solutions in the first 15s (145 before), and the primal integral at 10min is now 18.7% (was 19.8% before).
Summary by CodeRabbit
New Features
Bug Fixes
Chores