OpenROAD surrogate autotuner #9236

PrecisEDAnon · 2026-01-12T02:28:40Z

This introduces a surrogate autotuner to OpenROAD (note: an ORFS PR will follow). The surrogate autotuner uses a simple model to estimate how parameters such as core utilization affect PPA results, and then optimizes them. It runs quickly (at most about 10 minutes of total optimization time) compared with the ORFS autotuner (24 hours or more). AI-generated content follows.

OpenROAD PR: Optional surrogate-based optimizer (`surrogate_optimize`) for fast autotuning

This PR adds an optional, gated surrogate model + optimizer to OpenROAD to enable fast design-space exploration (autotuning) without running full place/CTS/route for every sample.

The integration is intended to be used either:

directly from OpenROAD Tcl (power users / experiments), or
via a companion ORFS (OpenROAD-flow-scripts) PR that provides Makefile targets, design-specific knob spaces, and end-to-end “tune + validate” plumbing (companion ORFS PR: TBD).

Summary of user-visible changes

When enabled, OpenROAD registers three Tcl commands:

surrogate_supported_features — returns a JSON array of supported objective/output names
surrogate_eval — evaluates the surrogate model for a single parameter point (JSON params)
surrogate_optimize — searches a JSON-defined knob space and writes a JSON summary of top candidates

These commands are not registered unless explicitly enabled (see “Gating”).

Implementation overview (for reviewers)

Main touched/added areas:

src/Surrogate.cc: surrogate model + JSON space parsing + optimizer + Tcl commands
src/OpenRoad.cc: compile/runtime gating + command registration
include/ord/Surrogate.hh: minimal public init entry point
CMakeLists.txt + src/CMakeLists.txt: ENABLE_SURROGATE build option (default OFF)

Motivation / why this belongs in OpenROAD

Autotuning is often limited by the cost of running full flows. The goal here is to make it practical to:

score many candidates quickly using a built-in analytic surrogate, then
validate only a small portfolio of promising/diverse candidates with full runs (in ORFS or another driver).

This PR provides the OpenROAD half: a fast, built-in evaluator + optimizer that can be driven from Tcl, with stable JSON I/O suitable for orchestration.

Gating (no impact unless explicitly opted in)

This PR is designed to be non-invasive and safe for upstream:

Compile-time gate (CMake): ENABLE_SURROGATE (default OFF)

cmake -S . -B build -D CMAKE_BUILD_TYPE=Release -D ENABLE_SURROGATE=ON
cmake --build build -j

Runtime gate (env var): OPENROAD_ENABLE_SURROGATE=1

OPENROAD_ENABLE_SURROGATE=1 ./build/bin/openroad ...

If either gate is off, default OpenROAD behavior and Tcl command set are unchanged.
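The two-gate rule above can be sketched as a small decision function. This is an illustration only, not the code in src/OpenRoad.cc: the helper names `gateOpen` and `surrogateCommandsEnabled` are hypothetical, and it assumes that the CMake option surfaces as a preprocessor macro named ENABLE_SURROGATE and that only the exact value "1" opens the runtime gate.

```cpp
#include <cassert>
#include <cstdlib>
#include <string_view>

// Hypothetical helper: both gates must be open for the commands to register.
// Assumption: only the exact string "1" counts as "enabled" at runtime.
bool gateOpen(bool compiled_in, const char* env_value)
{
  return compiled_in && env_value != nullptr
         && std::string_view(env_value) == "1";
}

// Hypothetical wrapper that reads the real environment.
// Assumption: the build option is visible as the macro ENABLE_SURROGATE.
bool surrogateCommandsEnabled()
{
#ifdef ENABLE_SURROGATE
  constexpr bool kCompiledIn = true;
#else
  constexpr bool kCompiledIn = false;
#endif
  return gateOpen(kCompiledIn, std::getenv("OPENROAD_ENABLE_SURROGATE"));
}
```

Keeping the decision in a pure function makes the "either gate off means no change" property easy to unit test without touching the process environment.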

`surrogate_optimize` (main entry point)

What it does

surrogate_optimize:

parses a knob “space” (JSON) with per-knob {type, minmax, step},
samples candidate knob settings (multi-threaded),
evaluates each candidate using the built-in surrogate,
returns and writes a JSON summary including the best point and a top list.

The surrogate produces (at least) predictions for:

effective_clock_period
routed_wirelength
area
instance_area
power

It also produces internal diagnostic features (e.g. surrogate_fail_risk) that can be included in output for debugging.

Tcl usage (example)

# After loading a design (dbBlock must exist):
#
#   read_lef ...
#   read_def ...
#   read_sdc ...
#
# Provide a knob space JSON file (see schema below):

set res [surrogate_optimize \
  -space_file /path/to/surrogate_space.json \
  -objective effective_clock_period \
  -minimize \
  -samples 500000 \
  -top_n 50 \
  -threads 16 \
  -time_budget_s 600 \
  -output /tmp/surrogate_optimize.json \
  -include_features]
puts $res

Important options (CLI contract)

-space_file <path> or -space <json>
-objective <name> (see surrogate_supported_features)
-minimize / -maximize (default: minimize)
-samples <N> (required; N>0)
-top_n <K> (keeps the best K candidates; K>=1)
-threads <N> (default: hardware concurrency; clamped to 1..256)
-time_budget_s <seconds> (optional wall-clock cap; acts as an early stop)
-base_params_file <path> (optional: baseline/starting knob values)
-freeze <csv> (optional: do not vary these knobs; e.g. clock_period)
-calibrate_ws_file <path> and/or -calibrate_wl_file <path> (optional; see below)
-multi_fidelity + -shrink <0..1> (optional refinement strategy)
-portfolio + -portfolio_shrink <0..1> (optional “multi-island” sampling strategy)
-format simple|json (simple prints best_objective=..., json prints full JSON)
-output <path> (required)
-include_features (include diagnostic feature values for the best/top entries)

Unknown args are ignored with a warning (to keep wrappers forwards-compatible).

Space file schema (JSON)

The knob space is a JSON object mapping knob name → spec:

{
  "core_utilization":  { "type": "int",   "minmax": [20, 99],   "step": 1 },
  "core_aspect_ratio": { "type": "float", "minmax": [0.9, 1.1], "step": 0 },
  "enable_dpo":        { "type": "binary","minmax": [0, 1],     "step": 1 }
}

Rules:

type is one of: float, int, binary
minmax: [min, max] is required
step is optional:
- if step > 0, values are sampled on min + k*step
- if step == 0 or omitted, values are sampled uniformly over [min, max]
binary samples {0, 1} (the minmax/step fields are required but effectively ignored)
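As a minimal sketch of the sampling rules above, the following function draws one value for a knob. The names `KnobType`, `KnobSpec`, and `sampleKnob` are hypothetical and not taken from src/Surrogate.cc. Treating an int knob without a step as step = 1 is an assumption of this sketch; the rules above do not say how that case is handled.

```cpp
#include <cassert>
#include <cmath>
#include <random>

enum class KnobType { kFloat, kInt, kBinary };

// Hypothetical in-memory form of one entry of the space file.
struct KnobSpec
{
  KnobType type;
  double min;
  double max;
  double step;  // 0 (or omitted in the JSON) means "no grid"
};

// Draws one value for a knob according to the rules listed above.
double sampleKnob(const KnobSpec& spec, std::mt19937_64& rng)
{
  assert(spec.min <= spec.max);
  if (spec.type == KnobType::kBinary) {
    // minmax and step are ignored for binary knobs.
    std::bernoulli_distribution coin(0.5);
    return coin(rng) ? 1.0 : 0.0;
  }
  double step = spec.step;
  if (spec.type == KnobType::kInt && step <= 0.0) {
    step = 1.0;  // assumption of this sketch, see the lead-in
  }
  if (step > 0.0) {
    // Grid sampling: min + k*step for k = 0..last, never exceeding max.
    const auto last = static_cast<long long>(
        std::floor((spec.max - spec.min) / step + 1e-9));
    std::uniform_int_distribution<long long> pick(0, last);
    return spec.min + static_cast<double>(pick(rng)) * step;
  }
  // Continuous sampling over the range.
  std::uniform_real_distribution<double> pick(spec.min, spec.max);
  return pick(rng);
}
```

With the example space above, `core_utilization` would land on integers in 20..99, `core_aspect_ratio` anywhere in the 0.9 to 1.1 range, and `enable_dpo` on 0 or 1.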

Supported knob names (current list):

clock_period
core_utilization, core_aspect_ratio, tns_end_percent
global_padding, detail_padding, place_density, enable_dpo
pin_layer_adjust, above_layer_adjust, density_margin_addon
cts_cluster_size, cts_cluster_diameter

Unknown knob names are ignored (no hard error).

Calibration inputs (optional but recommended)

To make surrogate predictions more comparable to a real baseline point, surrogate_optimize can ingest baseline metrics and calibrate built-in scaling factors via:

-calibrate_ws_file <6_report.json> (ORFS finish metrics; uses timing + power + area fields)
-calibrate_wl_file <5_2_route.json> (ORFS route metrics; uses detailedroute__route__wirelength)

This calibrates internal length/timing scales with a small log-space search and also provides baseline anchors for certain derived quantities (e.g. the power model).
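One plausible reading of "a small log-space search" is a bisection over the logarithm of a scale factor until the surrogate's prediction matches the baseline metric. The sketch below illustrates that idea only; it is not the PR's calibration code. `calibrateScale` and `predict` are hypothetical names, and the sketch assumes the prediction increases monotonically with the scale.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <functional>

// Finds a scale in [min_scale, max_scale] such that predict(scale) is close
// to target, bisecting on log(scale).
// Assumption: predict is monotonically increasing in the scale.
double calibrateScale(const std::function<double(double)>& predict,
                      double target,
                      double min_scale,
                      double max_scale,
                      int iters = 40)
{
  assert(min_scale > 0.0 && min_scale <= max_scale);
  double a = std::log(min_scale);
  double b = std::log(max_scale);
  for (int i = 0; i < iters; ++i) {
    const double mid = 0.5 * (a + b);
    if (predict(std::exp(mid)) < target) {
      a = mid;  // prediction too small: move the lower bound up
    } else {
      b = mid;  // prediction large enough: move the upper bound down
    }
  }
  // The geometric mean of the final bracket, kept inside the allowed range.
  return std::clamp(std::exp(0.5 * (a + b)), min_scale, max_scale);
}
```

Searching in log space treats a 2x overestimate and a 2x underestimate symmetrically, which suits multiplicative scale factors. If the baseline is unreachable within the bounds, the result saturates at the nearest bound.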

Calibration overrides can also be provided via environment variables:

SURROGATE_BUILTIN_LENGTH_SCALE
SURROGATE_BUILTIN_TIMING_SCALE
SURROGATE_BUILTIN_REF_CLOCK_USER

Output JSON (stable artifact)

surrogate_optimize writes a JSON object like:

objective metadata (objective name, sense, samples, threads, etc.)
best_objective
best_params
best_outputs
optional best_features (with -include_features)
top: array of {objective, params, outputs, features?}

This JSON is designed to be consumed by external orchestration (e.g. an ORFS “tune + validate” wrapper).

`surrogate_eval` (single-point evaluation)

For debugging or analysis, surrogate_eval evaluates one parameter set:

set res [surrogate_eval -params_file /path/to/params.json -include_features]
puts $res

Empirical results (from a prototype ORFS driver)

In a prototype ORFS integration (branch orfs-surrogate-rebased) using this OpenROAD feature, on-disk runs with 600s surrogate search and validating K=14 candidates across {asap7,nangate45,sky130hd} × {aes,ibex,jpeg} observed:

routed_wirelength: median gain 3.35% (p25 0.96%, p75 7.37%), best 15.78%
effective_clock_period: median gain 2.99% (p25 1.61%, p75 4.71%), best 11.58%

These numbers are primarily to demonstrate usefulness; end-to-end gains depend on the driver, validation budget, and knob space.

Testing

Build coverage: compiles cleanly with -D ENABLE_SURROGATE=ON and keeps default builds unchanged with the option off.
Runtime sanity: when enabled, surrogate_supported_features returns the expected list; surrogate_eval and surrogate_optimize run on real designs via the ORFS prototype driver.

Follow-ups / companion work

ORFS companion PR (TBD): adds Makefile targets and an end-to-end “tune + validate” workflow and documentation; this OpenROAD PR is self-contained without it.

Minimal enablement of the surrogate autotuner (Surrogate module + build integration).

gemini-code-assist

Code Review

This pull request introduces a surrogate autotuner to OpenROAD, which is a significant new feature for fast design-space exploration. The implementation is extensive, primarily in the new src/Surrogate.cc file. The code is well-structured, using modern C++ features and good practices for multithreading and numerical computation.

My review focuses on improving code quality, readability, and maintainability of the new surrogate model implementation. I've identified a few areas for improvement:

Replacing a custom clamp function with the standard std::clamp.
Simplifying a redundant conditional block.
Refactoring a very long function into smaller, more manageable pieces.
Using named constants instead of magic numbers to improve clarity.
Removing a redundant check in the result sorting logic.

Overall, this is a high-quality contribution. The suggested changes are aimed at making this complex new feature even more robust and easier to maintain in the future.

gemini-code-assist · 2026-01-12T02:31:59Z

src/Surrogate.cc

+T clamp(const T& v, const T& lo, const T& hi)
+{
+  return std::min(hi, std::max(lo, v));
+}


This clamp function can be replaced with std::clamp from the <algorithm> header, which is standard since C++17. Using the standard library function is preferred for conciseness and clarity. You could remove this helper and use std::clamp at call sites directly.

maliberty · 2026-01-13

You should probably prompt it to use C++20 style.
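A sketch of the swap discussed above, with one caveat: the two functions agree whenever lo <= hi, but std::clamp has undefined behavior when hi < lo, whereas the min/max helper returns hi in that case. The names `legacyClamp` and `checkedClamp` are hypothetical, and the template header on `legacyClamp` is assumed because the diff shows only the function signature. NaN inputs fall outside what std::clamp guarantees, so call sites that can see NaN would need a separate check.

```cpp
#include <algorithm>
#include <cassert>

// The helper as it appears in the diff (template header assumed).
template <typename T>
T legacyClamp(const T& v, const T& lo, const T& hi)
{
  return std::min(hi, std::max(lo, v));
}

// Hypothetical replacement: std::clamp with its precondition made explicit.
template <typename T>
T checkedClamp(const T& v, const T& lo, const T& hi)
{
  assert(!(hi < lo));  // std::clamp is undefined if hi < lo
  return std::clamp(v, lo, hi);
}
```

If any call site can pass inverted bounds, for example a min_scale and max_scale read from user input, a direct replacement would change behavior there, so those bounds are worth validating once at the point where they are read.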

gemini-code-assist · 2026-01-12T02:31:59Z

src/Surrogate.cc

+    } else if (master->isCore()) {
+      s.total_core_area_um2 += area_um2;
+      s.num_core_insts++;
+    } else {
+      s.total_core_area_um2 += area_um2;
+      s.num_core_insts++;
+    }


The else if (master->isCore()) block and the final else block have identical code. This is redundant and can be simplified by merging them.

} else {
  s.total_core_area_um2 += area_um2;
  s.num_core_insts++;
}

gemini-code-assist · 2026-01-12T02:31:59Z

src/Surrogate.cc

+  return clamp(std::exp(0.5 * (a + b)), min_scale, max_scale);
+}
+
+SimOut simulateOnce(const ModelContext& ctx, const Knobs& k, const int fidelity)


The function simulateOnce is very long (over 400 lines), which makes it difficult to read, understand, and maintain. Consider refactoring it by extracting logical sections into smaller, well-named helper functions. For example, you could create separate functions for:

Placement and routing proxy estimation (lines 1072-1170)

Timing proxy estimation (lines 1171-1322)

Final PPA calculations including power (lines 1324-1429)
This will improve modularity and readability.

gemini-code-assist · 2026-01-12T02:31:59Z

src/Surrogate.cc

+  double k_place = 0.35;
+  k_place *= 1.0 + 0.25 * std::max(0.0, util_target - 0.60);
+  k_place *= 1.0 + 0.04 * pad_sum;


The simulateOnce function contains many "magic numbers" (unnamed numerical constants). While this is common in empirical models, giving them meaningful names via constexpr can greatly improve readability and maintainability. For example, here and in subsequent lines, constants like 0.35, 0.25, 0.60, 0.04 could be defined with names that explain their purpose.
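As an illustration of this suggestion, the three lines from the diff could be written with named constants as below. The constant names and the `placementCoeff` wrapper are hypothetical; the diff does not say what each number means, so the names are guesses at intent and would need to be confirmed by the author.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

namespace placement_proxy {
// Names are guesses; the values are the ones in the diff.
constexpr double kBaseCoeff = 0.35;     // starting value of k_place
constexpr double kUtilSlope = 0.25;     // growth per unit of utilization above the knee
constexpr double kUtilKnee = 0.60;      // utilization above which the penalty starts
constexpr double kPaddingSlope = 0.04;  // growth per unit of total padding
}  // namespace placement_proxy

// Same arithmetic as the three diff lines, with the constants named.
double placementCoeff(double util_target, double pad_sum)
{
  using namespace placement_proxy;
  double k_place = kBaseCoeff;
  k_place *= 1.0 + kUtilSlope * std::max(0.0, util_target - kUtilKnee);
  k_place *= 1.0 + kPaddingSlope * pad_sum;
  return k_place;
}
```

Grouping the constants in one namespace also gives a single place to document where each value came from, which matters for an empirically fitted model.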

gemini-code-assist · 2026-01-12T02:31:59Z

src/Surrogate.cc

+        }
+        return;
+      }
+      if (!vec.empty() && better(s.obj, vec.back().obj)) {


The check !vec.empty() is redundant here. The preceding if block (lines 1957-1965) handles cases where vec.size() < args.top_n. Since args.top_n is guaranteed to be at least 1, if execution reaches this line, vec will be full and therefore not empty. You can safely remove this check.

if (better(s.obj, vec.back().obj)) {
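The invariant behind this comment can be seen in a minimal bounded top-N sketch. It is not the code in src/Surrogate.cc: `Scored` and `offerCandidate` are hypothetical names, and the sketch assumes a minimizing objective with the vector kept sorted best-first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical candidate record: objective value plus an identifier.
struct Scored
{
  double obj;
  int id;
};

// Keeps `top` sorted best-first and never longer than top_n (minimizing).
void offerCandidate(std::vector<Scored>& top, const Scored& s, std::size_t top_n)
{
  assert(top_n >= 1);
  const auto better
      = [](const Scored& a, const Scored& b) { return a.obj < b.obj; };
  if (top.size() >= top_n) {
    // The list is full, so it is non-empty because top_n >= 1.
    if (!better(s, top.back())) {
      return;  // not better than the current worst entry
    }
    top.pop_back();  // evict the worst entry to make room
  }
  // Insert at the position that keeps the list sorted.
  top.insert(std::upper_bound(top.begin(), top.end(), s, better), s);
}
```

Because the "full" branch is only reached when the size is at least top_n and top_n >= 1, `top.back()` is always valid there, which is why an extra emptiness check adds nothing.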

oharboe · 2026-01-12T06:13:42Z

Information in pull requests is "lost" to the "community memory" as soon as the PR is merged.

Request: create a README.md documentation explaining how to use this for permanent "community memory".

I'm skeptical of monolithic DSE (design space exploration) in OpenROAD. I'd rather see OpenROAD enable the user's choice of DSE setup than have OpenROAD be responsible for running the DSE (which is a recipe for framework inversion problems).

But perhaps this PR enables something that I think would be useful and a nice separation of concerns: a fast scan to find ranges of values that are worth exploring in DSE?

I thought ORFS/OpenROAD (and EDA tools in general) had a long tail, meaning there's a substantial amount of quality of results to be found after extensive searches, and that there's no way around the "24 hour exploration times" for finding the best parameters for a design as early flow choices can have big impacts on the final result. For instance, an increase in placement density could cause macro placement to flip from one configuration to another, yielding very different results.

I think it is worth reading about the nature of the variables that the user has to set as it is explained in ORFS documentation: https://github.com/The-OpenROAD-Project/OpenROAD-flow-scripts/blob/master/docs/user/FlowVariables.md#types-of-variables

Also there was some discussion along these lines in The-OpenROAD-Project/OpenROAD-flow-scripts#3738

MrAMS · 2026-01-12T07:16:59Z

Interesting work 👍

maliberty · 2026-01-13T04:51:00Z

Fwiw I think the Gemini comments are useful here.

github-actions

clang-tidy made some suggestions

github-actions · 2026-01-13T04:56:05Z

src/Surrogate.cc

+#include <stdexcept>
+#include <string>
+#include <system_error>
+#include <thread>


warning: included header system_error is not used directly [misc-include-cleaner]

Suggested change

#include <thread>

#include <thread>

github-actions · 2026-01-13T04:56:05Z

src/Surrogate.cc

+#include <system_error>
+#include <thread>
+#include <unordered_set>
+#include <utility>


warning: included header unordered_set is not used directly [misc-include-cleaner]

Suggested change

#include <utility>

#include <utility>

github-actions · 2026-01-13T04:56:05Z

src/Surrogate.cc

+#include <unordered_set>
+#include <utility>
+#include <vector>
+


warning: included header vector is not used directly [misc-include-cleaner]

Suggested change

github-actions · 2026-01-13T04:56:05Z

src/Surrogate.cc

+#include <utility>
+#include <vector>
+
+#include "db_sta/dbSta.hh"


warning: 'db_sta/dbSta.hh' file not found [clang-diagnostic-error]

#include "db_sta/dbSta.hh"
         ^

github-actions · 2026-01-13T04:56:05Z

src/Surrogate.cc

+#include "db_sta/dbSta.hh"
+#include "odb/db.h"
+#include "odb/dbTypes.h"
+#include "ord/OpenRoad.hh"


warning: included header dbTypes.h is not used directly [misc-include-cleaner]

Suggested change

#include "ord/OpenRoad.hh"

#include "ord/OpenRoad.hh"

maliberty · 2026-01-13T04:58:03Z

@oharboe I think the idea is to give a good candidate, not an optimal one, very quickly. It could be a seed for AT/sweep or just a quick and easy gain. Is that useful to you assuming it works well at that goal?

This is a more interesting idea than the other PRs, but its actual value in practice is unclear. The UI is a bit ugly and it does need permanent documentation.

MrAMS · 2026-01-13T05:38:47Z

@maliberty I have developed a parallelized DSE framework based on ORFS that explores various clock frequencies and Chisel parameters (source: MrAMS/bazel-chisel-verilator-openroad-demo/tree/dse-parallel-trials/eda/dse). By leveraging Bazel for parallel execution, I've already achieved an order-of-magnitude speedup.

Regarding early pruning: I previously discussed this via email with @oharboe and ran some experiments. However, we found that, mathematically, "early pruning" is inherently difficult to apply to multi-objective optimization problems. Most existing optimization frameworks do not support it directly, as they typically handle multi-objective problems by first scalarizing them into single-objective ones.

I would love to discuss how we might adapt candidate screening algorithms to effectively handle multi-objective optimization in this context.
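To make "scalarizing" concrete, here is a minimal sketch contrasting a weighted-sum score with a Pareto dominance check. It is a generic illustration, not code from this PR or from any particular framework; the function names are hypothetical and all objectives are assumed to be minimized.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Weighted-sum scalarization: collapses several objectives into one score.
double weightedSum(const std::vector<double>& objs,
                   const std::vector<double>& weights)
{
  assert(objs.size() == weights.size());
  double score = 0.0;
  for (std::size_t i = 0; i < objs.size(); ++i) {
    score += weights[i] * objs[i];
  }
  return score;
}

// Pareto dominance: a dominates b if it is no worse in every objective
// and strictly better in at least one.
bool dominates(const std::vector<double>& a, const std::vector<double>& b)
{
  assert(a.size() == b.size());
  bool strictly_better = false;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i] > b[i]) {
      return false;
    }
    if (a[i] < b[i]) {
      strictly_better = true;
    }
  }
  return strictly_better;
}
```

Two candidates such as {1, 4} and {2, 3} receive the same score under equal weights even though neither dominates the other. A pruning rule based on the scalar score therefore cannot tell which trade-off to keep.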

maliberty · 2026-01-13T05:47:52Z

@luarss FYI

The OR autotuner is built on RayTune which has a variety of search algorithms with different qualities. Usually you need a single score to optimize though you could report multiple metrics. What did you have in mind for "multi-objective optimization"?

oharboe · 2026-01-13T06:04:49Z

There are many different mathematical models, only some of which are available in Ray and Optuna. I do believe that there is some opportunity for a more specialized flow for scoping the DSE parameters and providing an initial estimate of the landscape to a downstream full-flow search. The landscape has discontinuities that are not going to be fully mapped out by any approximation, and the discontinuities become more significant the higher the utilization is, so ultimately a full flow with final variables will have to be run. In our case, this includes architectural parameters (pre-synthesis).

This is a hilly landscape, with cliffs... Tricky.

PrecisEDAnon added 2 commits January 9, 2026 05:24

surrogate: add autotune support

a04f00a

Minimal enablement of the surrogate autotuner (Surrogate module + build integration).

surrogate: gate and lint-clean

f5de6e7

surrogate: improve tight-clock wall for ECP

315737c
