[ingress][torch-mlir][RFC] Initial version of fx-importer script using torch-mlir #4

dchigarev · 2025-09-17T14:09:59Z

The PR adds two utility functions as part of lighthouse's python package to convert torch models to a mlir module using torch_mlir.

A user can import one of the functions (import_from_model or import_from_file) and get a mlir.ir.Module that they can use to run passes on or simply write its content into a file.

some use cases

1. Import from an instance of a model:

from lighthouse.ingress.torch import import_from_model
from mlir import ir

ctx = ir.Context()

module : ir.Module = import_from_model(torch_model_instance, sample_args=(torch.rand(1, 10),), ir_context=ctx)

# can now run some passes on the module

2. Import from a file where a torch model is defined:

Imagine we want to import a model from KernelBench. They ship models as python files where models and their arguments are uniformly defined.

from lighthouse.ingress.torch import import_from_file
from mlir import ir

ctx = ir.Context()
kernel_bench_root = Path(...)

module : ir.Module = import_from_model(
    filepath=kernel_bench_root / "level1" / "10_3D_tensor_matrix_multiplication.py",
    ir_context=ctx
)

# can now run some passes on the module

The utility functions use torch_mlir and mlir installed in the python env following #6 (and not from lighthouses bindings as suggested in #3).

ingress/Torch-MLIR/generate-mlir.sh

rolfmorel

Thanks @dchigarev for making progress on PyTorch ingress!

In general it all seems to make sense to me! My comments are on relatively small matters.

Having said that, I would say the majority of the PR is on enabling the cmdline interface, which I expect to also be the most contentious. Personally, I am not a fan of such interfaces and prefer the scripting approach. If other people are in favour though, I am not opposed for the code to be included.

Do you happen to have examples of similar cmdline interfaces being used for enabling PyTorch lowerings in other projects?

ingress/Torch-MLIR/utils.py

ingress/Torch-MLIR/generate-mlir.sh

ingress/Torch-MLIR/generate-mlir.py

ingress/Torch-MLIR/generate-mlir.sh

dchigarev · 2025-09-25T10:12:33Z

Do you happen to have examples of similar cmdline interfaces being used for enabling PyTorch lowerings in other projects?

@rolfmorel thanks for your time and feedback!

No, I haven't seen such cmdline approach anywhere (I wasn't looking to deep though). On the surface of IREE's and Blade's documentation I could only found the user-script approach. So even if they have a cmdline option, they don't seem to promote it very well.

banach-space · 2025-09-26T12:26:49Z

This is great, thank you so much for working on this 🙏🏻

I have a few high-level suggestions.

Keep this PR simple and restrict to the required minimum.

The cmdline interface looks complex and is merely a "wrapper" for the script logic. We can't avoid having a script, but we can avoid the cmdline interface. And, with a complex cmdline interface like this, I would wrap it into yet another script. My suggestion - drop the interface for now. This will allow us to focus on the core logic instead.

Consistent filenames and hyphenation.

generate-mlir.py vs py_src vs dummy_mlp_factory.py vs export_bash.py. LLVM seems to prefer - over _. Whichever one we choose, lets use it consistently.

Use doctoring consistently.

Lets use (function + module) docstrings consistently (instead of mixing docstring and plain Python comments starting with #).

Do we need all the Bash scripts?

There's seems to be a fair bit of duplication, e.g. export_bash.sh vs export_py.sh vs export.py. It's not clear to me what all the scripts do and whether we need them. My suggestion - less is more.

Naming.

This PR modifies the torch-mlir/generate-mlir.py -> torch-mlir/py_src/main.py

IIUC, generate-mlir.py was misleading - no MLIR is generated. Instead, the script "exports" MLIR, right? To me, a generator would be something like https://github.com/libxsmm/tpp-mlir/blob/main/tools/mlir-gen/mlir-gen.cpp.

While main.py is an improvement (i.e. not misleading), it's a bit too enigmatic - why not export.py? Or export-mlir-from-pytorch.mlir? Basically, something descriptive. That said, naming is hard 🤷🏻

Final thoughts.

Really fantastic to see this, just a bit concerned that this PR is trying to achieve too many things in one go. I recommend trimming it - I'd much rather focus on the core part and also make sure that we establish a consistent way of naming, structuring and implementing things.

I've some other, more specific comments inline.

Thanks again for working on this! 🙏🏻

banach-space · 2025-09-26T11:37:04Z

ingress/Torch-MLIR/py_src/export_lib/export.py

+
+
+def generate_mlir(model, sample_args, sample_kwargs=None, dialect="linalg"):
+    # Convert the Torch model to MLIR


Could we use docstrings consistently throughout this project?

ingress/Torch-MLIR/scripts/generate-mlir.sh

banach-space · 2025-09-26T11:38:17Z

ingress/Torch-MLIR/scripts/generate-mlir.sh

@@ -0,0 +1,16 @@
+#!/usr/bin/env bash
+


DOCUMENTME - what is the purpose of this script and how do I use it?

ingress/Torch-MLIR/py_src/main.py

ingress/Torch-MLIR/README.md

ingress/Torch-MLIR/scripts/generate-mlir.sh

Groverkss · 2025-09-28T13:01:06Z

Building wrapper scripts around torch-mlir is not scalable at all. torch-mlir is not a library to build things with, not a tool to build scripts around. The proper way of doing this is shipping fx_importer as part of bindings: #3 (ready for review) and then building export over it and ship it as part of the python package. I'm going to send a pr on building an aot export for torch and onnx around that today to give an idea of how it should be done.

rolfmorel · 2025-10-08T14:57:45Z

Given the distinct though all ingress-related PRs currently up, I thought that delineating their separate purposes might be helpful:

Add python bindings for lighthouse #3
- -> Set up a build system / initialization approach for making both mlir and torch-mlir available to be used from Python, so as to enable ingress and lowering
- IMO, this would initially just enable ingress and lowering as a bunch of Python programs, i.e. all we need is that the Python packages are available when the ingress and compiler scripts get called.
[ingress][torch-mlir][RFC] Initial version of fx-importer script using torch-mlir #4
- -> Develop "the one" conversion script for going from PyTorch to mlir.
- IMO, as the cmdline interface is more controversial, this is probably best factored out to a separate PR so that the Python script-based approach can be merged.
[ingress][pytorch] Basic KernelBench to MLIR conversion #5
- -> One source of PyTorch kernels that we want to use to demonstrate working pipelines on.
- Currently invokes the couple of lines of Python necessary to get torch-mlir to do the conversion, though should use the general importer/conversion mechanism that #4 will provide and rely on #3 to make sure the dependencies are taken care off (excepting KernelBench, which is a distinct kind of dependency in need of a general mechanism for how deal with them).

Regarding the interaction between the importer script and the scripts that deal with input sources: My feeling is that providing a little python package that can be used as a utility by the separate input-processing scripts might be most helpful.

For example, from lighthouse.ingress import pytorch_module_converter could be used in, e.g. #5, to replace the conversion code in that script, e.g. if its signature was pytorch_module_converter(python_module: str | module) -> mlir.ir.Module (where the str type is for paths to scripts with a certain interface and the (Python) module for already imported modules with that same interface). The "interface" of the python_module could be like that of KernelBench, i.e. the module should have attributes Model: torch.nn.Module and gen_inputs: () -> List[torch.Tensor] and gen_init_inputs: () -> List[Any] (i.e. the "standard" arguments for Model.__init__).

As the PyTorch and torch-mlir libs need to live in the same process anyway, I do not see much benefit coming from trying to separate out the importer/converter code to a script that actually runs in a separate process from the code that deals with the input sources.

…rch-mlir Signed-off-by: dchigarev <[email protected]>

Signed-off-by: dchigarev <[email protected]>

dchigarev · 2025-10-20T10:34:24Z

@rolfmorel

I've updated the PR following your suggestions. Lighthouse now has a python/lighthouse/* folder that is supposed to be an installable python package (will be installable after #7) that provides utility ingress functions (e.g. import_from_file/import_from_model).

p.s.
The conversion still uses torch_mlir from the user env (as inspired by #6).

Signed-off-by: dchigarev <[email protected]>

rolfmorel

Many thanks for the revision (and the commitment to the PR)! It is looking good!

Have left a number of minor comments. I will soon try to rebase #5 on this branch and confirm that that works as expected. Happy to approve once both are sorted 👍

python/examples/ingress/torch/MLPModel/model.py

python/examples/ingress/torch/01-dummy-mlp-from-file.py

python/lighthouse/ingress/torch/torch_import.py

python/examples/ingress/torch/02-dummy-mlp-from-model.py

rolfmorel · 2025-10-20T16:19:36Z

python/lighthouse/ingress/torch/torch_import.py

+    ir_context : ir.Context, optional
+        An optional MLIR context to use for parsing the module.
+        If not provided, the module is returned as a string.


Just to leave a note: this is somewhat surprising to me, though as I don't currently have a better suggestion to expose the same functionality (i.e. returning before conversion to environment's mlir) I am okay with it.

python/lighthouse/ingress/torch/torch_import.py

Signed-off-by: dchigarev <[email protected]>

The PR modifies `pyproject.toml` to make the content of `python/lighthouse` to be installable via `uv pip install .` For now the package is empty, but after #4 is merged users would be able to access ingress helper functions as part of the package: ```python from lighthouse.ingress.torch import import_from_file ... ``` * install lighthouse on 'uv sync' Signed-off-by: dchigarev <[email protected]> * use dynamic version in pyproject.toml Signed-off-by: dchigarev <[email protected]> --------- Signed-off-by: dchigarev <[email protected]>

Signed-off-by: dchigarev <[email protected]>

rolfmorel · 2025-10-21T16:42:48Z

python/lighthouse/ingress/torch/importer.py

-    inputs_args_fn = getattr(module, inputs_args_fn_name, None)
-    if inputs_args_fn is None:
-        raise ValueError(f"Inputs args function '{inputs_args_fn_name}' not found in {filepath}")
+    model_init_args = maybe_load_and_run_callable(


I believe the following works and would mean a bit less abstraction:

try: model_init_args = init_args_fn_name and getattr(module, init_args_fn_name)() or tuple() except AttributeError: raise ValueError(f"Init args function '{init_args_fn_name}' not found in {filepath}")

I know not everyone will find using the boolean operators in this way intuitive though (and technically it's not completely right, in the sense that returned falsity values will get replaced by tuple()). Nonetheless thought to suggest it.

rolfmorel

Thanks @dchigarev ! That's it for my comments. IMO this is looking great.

Am approving now, though with the caveat that I will still rebase #5 on this branch before we merge this PR (to check that the API works there as expected). Will report back on that by EOD tomorrow at the latest.

@banach-space, would you like to give this a final pass before it goes in?

python/lighthouse/ingress/torch/torch_import.py

Signed-off-by: dchigarev <[email protected]>

rolfmorel · 2025-10-21T21:41:05Z

Can confirm that this is working as expected for #5: https://github.com/llvm/lighthouse/pull/5/files/0912de15b458a78a88f111045fe0e54618ae83a1..63b82406443cdd449bbb1100a1639853b7417160

Barring one or two outstanding comments, this is go to IMO 👍

banach-space

Thanks for the updates and for pushing on this 🙏🏻

Approving as is - looks very clean and clear. I have some suggestions for minor improvements, but this is already great, so feel free to ignore.

If folks agree with my suggestions but have no bandwidth for PRs, I can upload something myself. Thanks!

python/lighthouse/ingress/torch/importer.py

banach-space · 2025-10-22T07:34:24Z

python/examples/ingress/torch/02-dummy-mlp-from-model.py

+# Step 4: Apply some MLIR passes using a PassManager
+pm = passmanager.PassManager(context=ir_context)
+pm.add("linalg-specialize-generic-ops")
+pm.add("one-shot-bufferize")


Why do we bufferize? Bufferization is quite an involved transformation and IMHO, we should only do the bare minimum here. Specifically, these are two orthogonal things to me:

importing a PyTorch model into MLIR,

running transformations on MLIR.

WDYT? Thinking in terms of "separation of concerns".

+1 to removing bufferization specifically. It can have many flavors and generally we want to stay at tensor longer.

OTOH, it's just an arbitrary example so, it's fine. Alternatively, an extra comment spelling out the message or motivation here could help to clarify intent.

Right, I also think that applying bufferization in an ingress-example could be too much :)

But do you think we should remove the PassManager case from the ingress-examples completely? I also believe that ingress and running a pipeline are two separate things, we could leave a hint to the users though in form of an in-code comment on what to do next with an imported mlir module, e.g.

.... # Step 4: output the imported MLIR module print("\n\nModule dump after running the pipeline:") mlir_module_ir.dump() # You can alternatively write the MLIR module to a file: # with open("output.mlir", "w") as f: # f.write(str(mlir_module_ir)) # # Or apply some MLIR passes using a PassManager: # pm = passmanager.PassManager(context=ir_context) # pm.add("linalg-specialize-generic-ops") # pm.add(...) # pm.run(mlir_module_ir.operation)

It'd make sense to completely skip these parts and focus only on ingress.
We can make other examples that focus on lowering later.

I agree, let's remove it completely

python/examples/ingress/torch/MLPModel/model.py

python/examples/ingress/torch/mlp_from_file.py

banach-space · 2025-10-22T07:39:59Z

python/examples/ingress/torch/02-dummy-mlp-from-model.py

+Example demonstrating how to load an already instantiated PyTorch model
+to MLIR using Lighthouse.


The difference between this file and 01-dummy-mlir-from-model.py is "instantiation" (here the model is "already instantiated"). But what does "model instantiation" mean? Genuine question - I think that it would be good to capture this somewhere :)

by "instantiation" I meant an instantiation (creation) of the model's class :)

changed to "model initialization", means the same but sound simplier

I think that "model" is just a bit too overloaded term and that's a potential source of confusion (I remember getting quite confused first time I played with PyTorch).

[nit] Could you specify that you mean the PyTorch "model" (class)?

Tried to make the docstring clearer on what pytorch model means

adam-smnk

Great work 😎

adam-smnk · 2025-10-22T08:26:02Z

python/examples/ingress/torch/01-dummy-mlp-from-file.py

+print(f"entry-point name: {func_op.name}")
+print(f"entry-point type: {func_op.type}")
+
+# Step 4: Apply some MLIR passes using a PassManager


Going to category could be a more useful or more broadly applicable default but TBD

adam-smnk · 2025-10-22T08:28:48Z

python/examples/ingress/torch/02-dummy-mlp-from-model.py

+# Step 4: Apply some MLIR passes using a PassManager
+pm = passmanager.PassManager(context=ir_context)
+pm.add("linalg-specialize-generic-ops")
+pm.add("one-shot-bufferize")


+1 to removing bufferization specifically. It can have many flavors and generally we want to stay at tensor longer.

OTOH, it's just an arbitrary example so, it's fine. Alternatively, an extra comment spelling out the message or motivation here could help to clarify intent.

Signed-off-by: dchigarev <[email protected]>

python/examples/ingress/torch/MLPModel/model.py

Signed-off-by: dchigarev <[email protected]>

dchigarev commented Sep 17, 2025

View reviewed changes

ingress/Torch-MLIR/generate-mlir.sh Outdated Show resolved Hide resolved

adam-smnk requested review from adam-smnk, rengolin and rolfmorel September 17, 2025 14:55

rengolin requested review from Groverkss, banach-space and nicolasvasilache September 17, 2025 18:29

rolfmorel reviewed Sep 18, 2025

View reviewed changes

banach-space reviewed Sep 26, 2025

View reviewed changes

dchigarev added 4 commits October 20, 2025 09:11

[ingress][torch-mlir] Add utility functions to import models using to…

1a4b755

…rch-mlir Signed-off-by: dchigarev <[email protected]>

add .gitignore for python files

1c6df47

Signed-off-by: dchigarev <[email protected]>

delete old torch ingress

370e3c0

Signed-off-by: dchigarev <[email protected]>

add readme

b984314

Signed-off-by: dchigarev <[email protected]>

dchigarev force-pushed the dchigarev/fx_importer branch from a08bcc5 to b984314 Compare October 20, 2025 10:05

dchigarev marked this pull request as ready for review October 20, 2025 10:34

dchigarev requested review from banach-space and rolfmorel October 20, 2025 10:35

dchigarev mentioned this pull request Oct 20, 2025

Make lighthouse an installable python package #7

Merged

dchigarev added 2 commits October 20, 2025 10:53

fix 02-* tutorial

2897e0c

Signed-off-by: dchigarev <[email protected]>

fix grammar & typos

fa7d1de

Signed-off-by: dchigarev <[email protected]>

rolfmorel reviewed Oct 20, 2025

View reviewed changes

dchigarev added 4 commits October 21, 2025 12:35

change docstrings from numpy to google format

eaf8c9e

Signed-off-by: dchigarev <[email protected]>

add kwarg functions to 'import_from_file' fn

889314d

Signed-off-by: dchigarev <[email protected]>

fix incode-comments and doc-strings wording

0a188e5

Signed-off-by: dchigarev <[email protected]>

Merge remote-tracking branch 'origin/main' into dchigarev/fx_importer

11240bd

dchigarev added 3 commits October 21, 2025 14:20

fix tutorials

1068bf9

Signed-off-by: dchigarev <[email protected]>

Merge remote-tracking branch 'origin/main' into dchigarev/fx_importer

aafffef

fix whitespaces for type annotations according to pep

0912de1

Signed-off-by: dchigarev <[email protected]>

rolfmorel reviewed Oct 21, 2025

View reviewed changes

rolfmorel approved these changes Oct 21, 2025

View reviewed changes

rolfmorel reviewed Oct 21, 2025

View reviewed changes

python/lighthouse/ingress/torch/torch_import.py Outdated Show resolved Hide resolved

fix ImportError message

d5c710c

Signed-off-by: dchigarev <[email protected]>

banach-space approved these changes Oct 22, 2025

View reviewed changes

adam-smnk reviewed Oct 22, 2025

View reviewed changes

dchigarev added 2 commits October 22, 2025 08:58

apply review suggestions

029b8ad

Signed-off-by: dchigarev <[email protected]>

remove passManager from ingress examples

2215980

Signed-off-by: dchigarev <[email protected]>

dchigarev requested a review from adam-smnk October 22, 2025 09:05

adam-smnk approved these changes Oct 22, 2025

View reviewed changes

python/examples/ingress/torch/MLPModel/model.py Outdated Show resolved Hide resolved

dchigarev added 2 commits October 22, 2025 09:54

do not create .pth for model-mlp

8bec558

Signed-off-by: dchigarev <[email protected]>

make example's docstring clearer on what 'pytorch mode' means

a1a1ddd

Signed-off-by: dchigarev <[email protected]>



		def generate_mlir(model, sample_args, sample_kwargs=None, dialect="linalg"):
		# Convert the Torch model to MLIR

		Example demonstrating how to load an already instantiated PyTorch model
		to MLIR using Lighthouse.

[ingress][torch-mlir][RFC] Initial version of fx-importer script using torch-mlir #4

Are you sure you want to change the base?

[ingress][torch-mlir][RFC] Initial version of fx-importer script using torch-mlir #4

Conversation

dchigarev commented Sep 17, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

rolfmorel left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dchigarev commented Sep 25, 2025

Uh oh!

banach-space commented Sep 26, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Groverkss commented Sep 28, 2025

Uh oh!

rolfmorel commented Oct 8, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dchigarev commented Oct 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

rolfmorel left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

rolfmorel Oct 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

rolfmorel left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

rolfmorel commented Oct 21, 2025

Uh oh!

banach-space left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

adam-smnk Oct 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dchigarev Oct 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dchigarev commented Sep 17, 2025 •

edited

Loading

rolfmorel commented Oct 8, 2025 •

edited

Loading

dchigarev commented Oct 20, 2025 •

edited

Loading

rolfmorel Oct 20, 2025 •

edited

Loading

rolfmorel left a comment •

edited

Loading

adam-smnk Oct 22, 2025 •

edited

Loading

dchigarev Oct 22, 2025 •

edited

Loading

dchigarev Oct 22, 2025 •

edited

Loading

adam-smnk Oct 22, 2025 •

edited

Loading