Conversation

@tkocmathla

Fixes #4210.

@sjarus requested a review from zjgarvey, December 9, 2025 17:47
Collaborator

@zjgarvey left a comment

Nice! Thanks for finding this. It does seem strange that we would be checking for an explicit args.temp_dir for the external data.

I have a few comments. Most importantly, I think we should find a way to test without needing to generate a 2GB random tensor.

mlir_file = run_path / f"{model_name}-explicit_temp_implicit_data.torch.mlir"
onnx.save(onnx_model, model_file)
temp_dir = run_path / "temp"
temp_dir.mkdir(exist_ok=True)
Collaborator

It may be good to use tempfile.TemporaryDirectory here, in case an exception is thrown before the temp files are cleaned up by the importer.
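For example, a minimal sketch of how the test setup might use it (run_path and the import_onnx_model call are placeholder names standing in for the test's own setup and the importer entry point, not code from this PR):

import tempfile
from pathlib import Path

# run_path, model_file, and import_onnx_model are illustrative names only.
with tempfile.TemporaryDirectory(dir=run_path) as tmp:
    temp_dir = Path(tmp)
    import_onnx_model(model_file, temp_dir=temp_dir)
    # The directory and its contents are removed when the block exits,
    # even if the importer raises before its own cleanup runs.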

Comment on lines +94 to +97
byte_size = numpy.dtype(dtype).itemsize
tensor_size = onnx.checker.MAXIMUM_PROTOBUF // byte_size + 1
large_tensor = numpy.random.rand(tensor_size).astype(dtype)
assert large_tensor.nbytes > onnx.checker.MAXIMUM_PROTOBUF
Collaborator

It does seem rather unfortunate that we need to make the CI create a 2GB tensor just to test this.

How long does this test take to run? If it takes more than a minute, I think we should refactor the load_onnx_model code so we can test the file-based shape inference directly.

Collaborator

Maybe pulling out the file-based shape inference into a utility is just a good idea in general. https://github.com/llvm/torch-mlir/pull/4375/files#r2608194510

Collaborator

Alternatively, we could try to mirror what is currently being done in IREE:

https://github.com/iree-org/iree/blob/d0dd8893758dcf40558a57916b869ba0babc0a95/compiler/bindings/python/iree/compiler/tools/import_onnx/__main__.py#L86

Ignore the params (this is primarily the reason the importer code has diverged in IREE, but there are some improvements which haven't found their way back to torch-mlir yet).

  # Load the temp file and the external data.
  inferred_model = onnx.load(temp_inferred_file, load_external_data=False)
- data_dir = Path(input_dir if args.temp_dir is None else args.data_dir)
+ data_dir = Path(input_dir if args.data_dir is None else args.data_dir)
Collaborator

I think this needs to be moved up above the condition on line 127 (if raw_model_modified), since the external data may end up being organized differently and saved to the temp_dir.

E.g.,

data_dir = Path(args.data_dir or input_dir)
if raw_model_modified:
    ...
    data_dir = Path(temp_dir)
...

Comment on lines 130 to 156
Collaborator

Let's hoist this into a util function so we can test in isolation without needing to generate a 2GB tensor.

E.g.,

def _file_based_shape_infer(
    model_path : Path,
    data_dir : Path,
    temp_dir : Path,
) -> onnx.ModelProto:
    """docstring"""
    ...
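For illustration only, a minimal sketch of one shape the utility could take, assuming onnx.shape_inference.infer_shapes_path and onnx.external_data_helper cover the file-based path; this is not the importer's actual code:

import onnx
from onnx import shape_inference
from onnx.external_data_helper import load_external_data_for_model
from pathlib import Path

def _file_based_shape_infer(
    model_path: Path,
    data_dir: Path,
    temp_dir: Path,
) -> onnx.ModelProto:
    """Run shape inference through files so the model never has to be
    serialized into a single >2GB in-memory protobuf."""
    inferred_path = temp_dir / "inferred.onnx"
    # Infer shapes directly from disk to disk.
    shape_inference.infer_shapes_path(str(model_path), str(inferred_path))
    # Load only the graph structure; tensor payloads stay in external files.
    model = onnx.load(str(inferred_path), load_external_data=False)
    load_external_data_for_model(model, str(data_dir))
    return model

A unit test could then exercise this on a tiny model whose initializers are stored as external data, without ever materializing a 2GB tensor.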



Successfully merging this pull request may close these issues.

ONNX importer fails when --temp-dir is set and --data-dir is not
