MLIR (software)

{{Short description|C++ framework for compiler development}}

{{Infobox software

| name = MLIR

| logo = MLIR Logo.svg

| logo caption = The MLIR logo, used by the LLVM project

| author = Chris Lattner, Mehdi Amini, Uday Bondhugula, and others

| developer = LLVM Developer Group

| released = 2019

| programming language = C++

| operating system = Cross-platform

| genre = Compiler

| license = Apache License 2.0 with LLVM Exception

| website = {{URL|https://mlir.llvm.org}}

}}

MLIR (Multi-Level Intermediate Representation) is an open-source compiler infrastructure project developed as a sub-project of the LLVM project. It provides a modular and extensible intermediate representation (IR) framework intended to facilitate the construction of domain-specific compilers and improve compilation for heterogeneous computing platforms. MLIR supports multiple abstraction levels in a single IR and introduces dialects, a mechanism for defining custom operations, types, and attributes tailored to specific domains.{{cite web |title=Multi-Level Intermediate Representation Overview |url=https://mlir.llvm.org/docs/ |website=mlir.llvm.org |access-date=2025-06-05}} The name "Multi-Level Intermediate Representation" reflects the system’s ability to model computations at various abstraction levels and progressively lower them toward machine code.

MLIR was originally developed in 2018 by Chris Lattner at Google, and publicly released as part of LLVM in 2019.{{cite conference |last1=Lattner |first1=Chris |last2=Amini |first2=Mehdi |last3=Bondhugula |first3=Uday |last4=Cohen |first4=Albert |last5=Davis |first5=Andy |last6=Pienaar |first6=Jacques |last7=Riddle |first7=River |last8=Shpeisman |first8=Tatiana |last9=Vasilache |first9=Nicolas |last10=Zinenko |first10=Oleksandr |title=MLIR: Scaling Compiler Infrastructure for Domain Specific Computation |conference=2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO) |pages=2–14 |year=2021 |doi=10.1109/CGO51591.2021.9370308}} It was designed to address challenges in building compilers for modern workloads such as machine learning, hardware acceleration, and high-level synthesis by providing reusable components and standardizing the representation of intermediate computations across different programming languages and hardware targets.{{cite web |title=Why Mojo |url=https://docs.modular.com/mojo/why-mojo.html#mlir |website=docs.modular.com |access-date=2025-06-05}}

MLIR is used in a range of systems including TensorFlow, Mojo, TPU-MLIR, and others.{{cite web |title=Users of MLIR |url=https://mlir.llvm.org/users/ |website=mlir.llvm.org |access-date=2025-06-05}} It is released under the Apache License 2.0 with LLVM exceptions and is maintained as part of the LLVM project.

History

Work on MLIR began in 2018, led by Chris Lattner at Google in collaboration with Mehdi Amini, River Riddle, and others, as a response to the growing complexity of modern compiler toolchains. The project aimed to improve the modularity, composability, and maintainability of compiler infrastructures, particularly in domains such as machine learning, high-level synthesis, and hardware acceleration. It was formally introduced at the 2019 LLVM Developer Meeting and was open-sourced later that year as part of the LLVM monorepository.{{cite web |title=LLVM Developer Meetings |url=https://llvm.org/devmtg/ |website=llvm.org |access-date=2025-06-05}}{{cite web |title=llvm-project |url=https://github.com/llvm/llvm-project |website=GitHub |publisher=LLVM Project |access-date=2025-06-16}}

MLIR’s architecture was shaped by prior experience building compilers such as XLA and LLVM, where limitations in existing intermediate representations hindered optimization and reuse across abstraction levels. To address this, MLIR introduced the concept of multiple IR abstraction levels that can coexist in the same system and be gradually lowered through well-defined transformations. A foundational design feature was the use of dialects, allowing different domains and hardware targets to define custom operations and type systems while maintaining interoperability.

Since its release, MLIR has been adopted by multiple compiler ecosystems and research efforts. In TensorFlow, MLIR serves as the foundation for rewriting and lowering transformations in components such as XLA and TensorFlow Runtime. The language Mojo, developed by Modular Inc., relies on MLIR to achieve ahead-of-time compilation for artificial intelligence workloads. Additional projects that have built on MLIR include TPU-MLIR for compiling models to Tensor Processing Unit hardware,{{cite web |title=TPU-MLIR Developer Manual |url=https://doc.sophgo.com/sdk-docs/v23.05.01/docs_latest_release/docs/tpu-mlir/developer_manual_en/html/01_introduction.html |website=doc.sophgo.com |publisher=Sophgo |access-date=2025-06-16}} ONNX-MLIR for interoperable machine learning models,{{cite web |title=ONNX-MLIR |url=https://github.com/onnx/onnx-mlir |website=GitHub |publisher=ONNX Project |access-date=2025-06-16}} MLIR-AIE for targeting Xilinx AI Engines,{{cite web |title=MLIR-AIE |url=https://github.com/Xilinx/mlir-aie |website=GitHub |publisher=Xilinx |access-date=2025-06-16}} IREE for compiling and executing machine learning models across CPUs, GPUs, and accelerators,{{cite web |title=IREE |url=https://iree.dev |website=iree.dev |access-date=2025-06-16}} DSP-MLIR, a compiler infrastructure tailored for digital signal processing (DSP) applications,{{cite conference |last1=Kumar |first1=Abhinav |last2=Khedkar |first2=Atharva |last3=So |first3=Hwisoo |last4=Kuo |first4=Megan |last5=Gurjar |first5=Ameya |last6=Biswas |first6=Partha |last7=Shrivastava |first7=Aviral |title=DSP-MLIR: A Domain-Specific Language and MLIR Dialect for Digital Signal Processing |book-title=Proceedings of the 26th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES '25) |publisher=Association for Computing Machinery |location=New York, NY, USA |year=2025 |pages=146–157 |doi=10.1145/3735452.3735527 |url=https://doi.org/10.1145/3735452.3735527|arxiv=2408.11205 }} and torch-mlir, which brings MLIR-based compilation capabilities to the PyTorch ecosystem.{{cite web |title=torch-mlir |url=https://github.com/llvm/torch-mlir |website=GitHub |publisher=LLVM Project |access-date=2025-06-16}}{{cite web |title=Users of MLIR |url=https://mlir.llvm.org/users/ |website=mlir.llvm.org |access-date=2025-06-05}}

MLIR continues to evolve as part of the LLVM Project and follows the project's release schedule and development policies. It is developed collaboratively by contributors from industry, academia, and the broader open-source community.

Dialects

In MLIR, a dialect defines a self-contained namespace of operations, types, attributes, and other constructs. Dialects are the primary mechanism for extensibility, allowing developers to introduce domain-specific abstractions while maintaining compatibility within the broader MLIR framework. Each operation within a dialect is identified by a unique name and may include optional operands, results, attributes, and regions. Operands and results follow the static single-assignment form (SSA), and each result is associated with a type. Attributes represent compile-time metadata, such as constant values. Regions consist of ordered blocks, each of which may take input arguments and contain a sequence of nested operations.{{Cite web |title=MLIR Language Reference - MLIR |url=https://mlir.llvm.org/docs/LangRef/ |access-date=2023-07-05 |website=mlir.llvm.org}} While MLIR is designed around SSA, it avoids traditional PHI nodes by using block arguments in conjunction with the operands of control-flow operations to model value merging.{{Cite web |title=MLIR Rationale - MLIR |url=https://mlir.llvm.org/docs/Rationale/Rationale/#block-arguments-vs-phi-nodes |access-date=2023-07-05 |website=mlir.llvm.org}}

The general syntax for an operation is the following:

%res:2 = "mydialect.morph"(%input#3) ({
  ^bb0(%arg0: !mydialect<"custom_type"> loc("mysource.cc":10:8)):
    // nested operations
}) { some.attribute = true, other_attribute = 1.5 }
  : (!mydialect<"custom_type">) -> (!mydialect<"other_type">, !mydialect<"other_type">)
  loc(callsite("foo" at "mysource.cc":10:8))

This operation, named morph, belongs to the mydialect dialect. It takes one input operand (%input#3) of type custom_type and produces two output values of type other_type. The operation carries two attributes, some.attribute and other_attribute, and contains a region with a single block (^bb0) that accepts one argument. The loc keyword specifies source-level location information, which can be used for debugging or diagnostic reporting.{{Cite web |last1=Amini |first1=Mehdi |last2=Riddle |first2=River |title=MLIR Tutorial |url=https://llvm.org/devmtg/2020-09/slides/MLIR_Tutorial.pdf |access-date=2025-06-05}}

The syntax of operations, types, and attributes can also be customized by implementing appropriate parsing and printing functions within the operation definition.{{Cite book |last=Stroustrup |first=Bjarne |title=The C++ programming language: C++ 11 |date=2015 |publisher=Addison-Wesley |isbn=978-0-321-56384-2 |edition=4. ed., 4. print |location=Upper Saddle River, NJ}}
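Such hooks are written in C++. The following is a minimal sketch, assuming a hypothetical operation class MyOp whose custom syntax is a single operand followed by a colon and its type; OpAsmParser, OpAsmPrinter, and the related helpers are part of the MLIR C++ API, while MyOp itself is invented for the example.

#include "mlir/IR/OpImplementation.h"

using namespace mlir;

// Custom parser for the hypothetical operation "mydialect.my_op".
// Expected textual form: %operand : type
ParseResult MyOp::parse(OpAsmParser &parser, OperationState &result) {
  OpAsmParser::UnresolvedOperand operand;
  Type type;
  if (parser.parseOperand(operand) || parser.parseColonType(type) ||
      parser.resolveOperand(operand, type, result.operands))
    return failure();
  result.addTypes(type); // the result shares the operand type
  return success();
}

// Custom printer emitting the same textual form.
void MyOp::print(OpAsmPrinter &printer) {
  Value operand = getOperation()->getOperand(0);
  printer << " " << operand << " : " << operand.getType();
}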

= Core dialects =

The MLIR dialects ecosystem is open and extensible, allowing end-users to define new dialects that capture the semantics of specific computational domains. At the same time, the MLIR codebase provides a variety of built-in dialects that address common patterns found in intermediate representations. These core dialects are designed to be self-contained and interoperable, making them suitable for reuse across different compiler stacks.{{Cite web |title=Dialects - MLIR |url=https://mlir.llvm.org/docs/Dialects/ |access-date=2023-07-07 |website=mlir.llvm.org}}

For example, the arith dialect includes basic mathematical operations over integers and floating-point types, while the memref dialect provides operations for memory allocation and access. Control-flow abstractions are handled by dialects such as affine, which supports affine loop nests suitable for polyhedral optimization, and scf, which provides structured control flow using constructs like for, if, and while. The func dialect supports function definitions and calls, while the gpu dialect introduces primitives for GPU programming models. Additionally, the tosa dialect defines a portable and quantization-friendly operator set for machine learning inference. Finally, the llvm dialect provides a one-to-one mapping to LLVM IR, enabling seamless lowering to LLVM’s backend and reuse of its optimization and code generation infrastructure.{{cite web |title=Users of MLIR |url=https://mlir.llvm.org/users/ |website=mlir.llvm.org |access-date=2025-06-05}}

The following code defines a function that takes two floating-point matrices and computes their element-wise sum:

func.func @matrix_add(%arg0: memref<10x20xf32>, %arg1: memref<10x20xf32>) -> memref<10x20xf32> {
  %result = memref.alloc() : memref<10x20xf32>
  affine.for %i = 0 to 10 {
    affine.for %j = 0 to 20 {
      %lhs = memref.load %arg0[%i, %j] : memref<10x20xf32>
      %rhs = memref.load %arg1[%i, %j] : memref<10x20xf32>
      %sum = arith.addf %lhs, %rhs : f32
      memref.store %sum, %result[%i, %j] : memref<10x20xf32>
    }
  }
  func.return %result : memref<10x20xf32>
}

Although different dialects may be used to express similar computations, the level of abstraction and the intended compilation flow may vary. In the example above, the affine dialect enables polyhedral analysis and optimizations, while the memref and arith dialects express memory and arithmetic operations, respectively.

Operation definition specification

The operations of a dialect can be defined using C++, but also, in a more convenient and robust way, by using the Operation Definition Specification (ODS).{{Cite web |title=Operation Definition Specification (ODS) - MLIR |url=https://mlir.llvm.org/docs/DefiningDialects/Operations/ |access-date=2023-07-05 |website=mlir.llvm.org}} By using TableGen, the C++ code for declarations and definitions can then be generated automatically.{{Cite web |title=TableGen Overview - LLVM 17.0.0git documentation |url=https://llvm.org/docs/TableGen/ |access-date=2023-07-05 |website=llvm.org}}

The autogenerated code can include parsing and printing methods, which are based on a simple declarative string describing the structure of the desired textual representation, together with the boilerplate code for accessing fields and performing common actions such as verifying the semantics of each operation, canonicalization, or folding.{{Cite web |title=Defining Dialects - MLIR |url=https://mlir.llvm.org/docs/DefiningDialects/ |access-date=2023-07-07 |website=mlir.llvm.org}}

The same declaration mechanism can also be used for types and attributes, which are the other two categories of elements constituting a dialect.

The following example illustrates how to specify the assembly format of an operation that expects a variadic number of operands and produces zero results. The textual representation consists of the attribute dictionary, optionally followed by the list of operands, a colon, and the operand types.

let assemblyFormat = "attr-dict ($operands^ `:` type($operands))?";

Transformations

Transformations can always be performed directly on the IR, without having to rely on built-in coordination mechanisms. However, to ease both implementation and maintenance, MLIR provides an infrastructure for IR rewriting composed of different rewrite drivers. Each driver receives a set of objects called patterns, each of which has its own internal logic to match operations with certain properties. When an operation is matched, the rewrite process is performed and the IR is modified according to the logic within the pattern.{{Cite web |title=Pattern Rewriting : Generic DAG-to-DAG Rewriting - MLIR |url=https://mlir.llvm.org/docs/PatternRewriter/ |access-date=2023-07-06 |website=mlir.llvm.org}}
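Patterns are typically written in C++ by subclassing OpRewritePattern. The following is a minimal sketch: OpRewritePattern, PatternRewriter, and the match-and-rewrite protocol are part of the MLIR API, whereas MorphOp and SimplerOp are hypothetical operation classes used only for illustration.

#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// Sketch of a pattern that matches a hypothetical MorphOp and replaces it
// with an equally hypothetical SimplerOp.
struct MorphSimplification : public OpRewritePattern<MorphOp> {
  using OpRewritePattern<MorphOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(MorphOp op,
                                PatternRewriter &rewriter) const override {
    // Match step: bail out unless the operation has the desired property.
    if (!op->hasAttr("some.attribute"))
      return failure();
    // Rewrite step: build the replacement and erase the matched operation.
    rewriter.replaceOpWithNewOp<SimplerOp>(op, op->getResultTypes(),
                                           op->getOperands());
    return success();
  }
};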

= Dialect conversion driver =

This driver operates according to the legality of operations: it receives a set of rules determining which operations are to be considered illegal and expects the provided patterns to match and convert them into legal ones. The logic behind those rules can be arbitrarily complex: it may be based solely on the dialect to which an operation belongs, but it can also inspect more specific properties such as attributes or nested operations.

As the name suggests, this driver is typically used for converting the operations of one dialect into operations belonging to a different one. In this scenario, the whole source dialect is marked as illegal, the destination one as legal, and patterns for the source dialect operations are provided. The dialect conversion framework also provides support for type conversion, which has to be performed on operands and results to map them to the type system of the destination dialect.{{Cite web |title=Dialect Conversion - MLIR |url=https://mlir.llvm.org/docs/DialectConversion/ |access-date=2023-07-06 |website=mlir.llvm.org}}
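In C++, such a conversion is typically set up as sketched below. ConversionTarget, RewritePatternSet, and applyFullConversion are part of the MLIR API and LLVM::LLVMDialect is the built-in llvm dialect, whereas mydialect::MyDialect and populateMyDialectToLLVMPatterns are hypothetical names standing in for a user-defined dialect and its lowering patterns.

#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/Transforms/DialectConversion.h"

using namespace mlir;

// Sketch: lower a hypothetical "mydialect" to the llvm dialect.
LogicalResult lowerToLLVM(ModuleOp module) {
  MLIRContext *ctx = module.getContext();

  // Declare which operations are legal after the conversion.
  ConversionTarget target(*ctx);
  target.addLegalDialect<LLVM::LLVMDialect>();
  target.addIllegalDialect<mydialect::MyDialect>();

  // Collect the patterns rewriting illegal operations into legal ones
  // (the population helper is invented for the example).
  RewritePatternSet patterns(ctx);
  populateMyDialectToLLVMPatterns(patterns);

  // Run the driver: fail if any illegal operation remains.
  return applyFullConversion(module, target, std::move(patterns));
}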

MLIR allows multiple conversion paths to be taken. Considering the matrix addition example above, a possible lowering strategy is to generate for-loops belonging to the scf dialect, obtaining code to be executed on CPUs:

#map = affine_map<(d0, d1) -> (d0, d1)>
module {
  func.func @matrix_add(%arg0: memref<10x20xf32>, %arg1: memref<10x20xf32>) -> memref<10x20xf32> {
    %alloc = memref.alloc() : memref<10x20xf32>
    %c0 = arith.constant 0 : index
    %c10 = arith.constant 10 : index
    %c1 = arith.constant 1 : index
    scf.for %arg2 = %c0 to %c10 step %c1 {
      %c0_0 = arith.constant 0 : index
      %c20 = arith.constant 20 : index
      %c1_1 = arith.constant 1 : index
      scf.for %arg3 = %c0_0 to %c20 step %c1_1 {
        %0 = memref.load %arg0[%arg2, %arg3] : memref<10x20xf32>
        %1 = memref.load %arg1[%arg2, %arg3] : memref<10x20xf32>
        %2 = arith.addf %0, %1 : f32
        memref.store %2, %alloc[%arg2, %arg3] : memref<10x20xf32>
      }
    }
    return %alloc : memref<10x20xf32>
  }
}

Another possible strategy is to use the gpu dialect to generate code for GPUs:

#map = affine_map<(d0, d1) -> (d0, d1)>
module {
  func.func @matrix_add(%arg0: memref<10x20xf32>, %arg1: memref<10x20xf32>) -> memref<10x20xf32> {
    %alloc = memref.alloc() : memref<10x20xf32>
    %c0 = arith.constant 0 : index
    %c10 = arith.constant 10 : index
    %0 = arith.subi %c10, %c0 : index
    %c1 = arith.constant 1 : index
    %c0_0 = arith.constant 0 : index
    %c20 = arith.constant 20 : index
    %1 = arith.subi %c20, %c0_0 : index
    %c1_1 = arith.constant 1 : index
    %c1_2 = arith.constant 1 : index
    gpu.launch blocks(%arg2, %arg3, %arg4) in (%arg8 = %0, %arg9 = %c1_2, %arg10 = %c1_2) threads(%arg5, %arg6, %arg7) in (%arg11 = %1, %arg12 = %c1_2, %arg13 = %c1_2) {
      %2 = arith.addi %c0, %arg2 : index
      %3 = arith.addi %c0_0, %arg5 : index
      %4 = memref.load %arg0[%2, %3] : memref<10x20xf32>
      %5 = memref.load %arg1[%2, %3] : memref<10x20xf32>
      %6 = arith.addf %4, %5 : f32
      memref.store %6, %alloc[%2, %3] : memref<10x20xf32>
      gpu.terminator
    }
    return %alloc : memref<10x20xf32>
  }
}

= Greedy pattern rewrite driver =

This driver greedily applies the provided patterns according to their benefit until either a fixed point is reached or the maximum number of iterations is exceeded. The benefit of a pattern is self-attributed; in case of ties, the relative order of the patterns within the list is used.
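A minimal sketch of invoking this driver in C++ is shown below. RewritePatternSet and applyPatternsAndFoldGreedily belong to the MLIR API (the entry point has been renamed in some recent MLIR versions), while MorphSimplification refers to the hypothetical pattern sketched earlier and the benefit value is arbitrary.

#include "mlir/Transforms/GreedyPatternRewriteDriver.h"

using namespace mlir;

// Sketch: run a set of patterns to a fixed point with the greedy driver.
LogicalResult simplify(ModuleOp module) {
  RewritePatternSet patterns(module.getContext());
  // The second argument is the self-attributed benefit of the pattern.
  patterns.add<MorphSimplification>(module.getContext(), /*benefit=*/2);
  return applyPatternsAndFoldGreedily(module, std::move(patterns));
}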

Traits and interfaces

MLIR allows existing optimizations (e.g., common subexpression elimination, loop-invariant code motion) to be applied to custom dialects by means of traits and interfaces. These two mechanisms enable transformation passes to operate on operations without knowing their actual implementation, relying only on the properties that traits or interfaces expose.

Traits are meant to be attached to operations without requiring any additional implementation. Their purpose is to indicate that the operation satisfies certain properties (e.g., having exactly two operands).{{Cite web |title=Traits - MLIR |url=https://mlir.llvm.org/docs/Traits/ |access-date=2023-07-05 |website=mlir.llvm.org}} Interfaces, instead, are a more powerful tool through which an operation can be queried about specific aspects whose value may change between instances of the same kind of operation. An example of an interface is the representation of memory effects: each operation that operates on memory may have such an interface attached, but the actual effects may depend on the actual operands (e.g., a function call whose arguments may be constants or references to memory).{{Cite web |title=Interfaces - MLIR |url=https://mlir.llvm.org/docs/Interfaces/ |access-date=2023-07-05 |website=mlir.llvm.org}}
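The C++ sketch below shows how a generic pass might query an arbitrary operation through a trait and an interface. OpTrait::IsCommutative and MemoryEffectOpInterface are existing MLIR constructs, while the surrounding helper function is invented for illustration.

#include "mlir/IR/Operation.h"
#include "mlir/Interfaces/SideEffectInterfaces.h"
#include "llvm/Support/raw_ostream.h"

using namespace mlir;

// Sketch: query generic properties of an arbitrary operation.
void inspect(Operation *op) {
  // Trait query: does the operation declare itself commutative?
  if (op->hasTrait<OpTrait::IsCommutative>())
    llvm::outs() << op->getName() << " is commutative\n";

  // Interface query: which memory effects does this particular instance have?
  if (auto memInterface = dyn_cast<MemoryEffectOpInterface>(op)) {
    SmallVector<MemoryEffects::EffectInstance> effects;
    memInterface.getEffects(effects);
    if (effects.empty())
      llvm::outs() << op->getName() << " has no memory effects\n";
  }
}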

Applications

The freedom in modeling intermediate representations enables MLIR to be used in a wide range of scenarios. This includes traditional programming languages,{{cite conference |url=https://ieeexplore.ieee.org/document/9563011 |title=Polygeist: Raising C to Polyhedral MLIR |first1=William S. |last1=Moses |last2=Chelini |first2=Lorenzo |last3=Zhao |first3=Ruizhe |last4=Zinenko |first4=Oleksandr |year=2021 |conference=30th International Conference on Parallel Architectures and Compilation Techniques (PACT) |pages=45–59 |isbn=978-1-6654-4278-7 |doi=10.1109/PACT52795.2021.00011|url-access=subscription }} but also high-level synthesis,{{cite conference |last1=Agostini |first1=Nicolas Bohm |last2=Curzel |first2=Serena |last3=Amatya |first3=Vinay |last4=Tan |first4=Cheng |last5=Minutoli |first5=Marco |last6=Castellana |first6=Vito Giovanni |last7=Manzano |first7=Joseph |last8=Kaeli |first8=David |last9=Tumeo |first9=Antonino |date=2022-10-30 |title=An MLIR-based Compiler Flow for System-Level Design and Hardware Acceleration |url=https://dl.acm.org/doi/10.1145/3508352.3549424 |publisher=Association for Computing Machinery |archive-date= |location= |format= |id= |isbn=978-1-4503-9217-4 |doi=10.1145/3508352.3549424 |language=en |pages=1–9 |book-title=Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design|hdl=11311/1229389 |hdl-access=free }}{{cite arXiv |last1=Ruizhe |first1= Zhao |last2= Jianyi |first2= Cheng |title=Phism: Polyhedral High-Level Synthesis in MLIR |date=2021 |eprint=2103.15103 |class=cs.PL}} quantum computing{{Cite book |last1=McCaskey |first1=Alexander |last2=Nguyen |first2=Thien |title=2021 IEEE International Conference on Quantum Computing and Engineering (QCE) |chapter=A MLIR Dialect for Quantum Assembly Languages |date=October 2021 |url=https://ieeexplore.ieee.org/document/9605269 |publisher=IEEE |pages=255–264 |doi=10.1109/QCE52317.2021.00043 |arxiv=2101.11365 |osti=1862113 |isbn=978-1-6654-1691-7|s2cid=231718965}} and homomorphic encryption.{{Cite journal |last1=Park |first1=Sunjae |last2=Song |first2=Woosung |last3=Nam |first3=Seunghyeon |last4=Kim |first4=Hyeongyu |last5=Shin |first5=Junbum |last6=Lee |first6=Juneyoung |date=2023-06-06 |title=HEaaN.MLIR: An Optimizing Compiler for Fast Ring-Based Homomorphic Encryption |journal=Proceedings of the ACM on Programming Languages |language=en |volume=7 |issue=PLDI |pages=196–220 |doi=10.1145/3591228 |issn=2475-1421|doi-access=free}}{{Cite web |last1=Govindarajan |first1=Sanath |last2=Moses |first2=William S. |title=SyFER-MLIR: Integrating Fully Homomorphic Encryption Into the MLIR Compiler Framework |url=https://math.mit.edu/research/highschool/primes/materials/2020/Govindarajan-Moses.pdf}}{{Cite web |title=HEIR: Homomorphic Encryption Intermediate Representation |website=GitHub |url=https://github.com/google/heir |access-date=2023-09-05}} Machine learning applications also take advantage of built-in polyhedral compilation techniques, together with dialects targeting accelerators and other heterogeneous systems.{{Cite arXiv |last1=Jin |first1=Tian |last2=Bercea |first2=Gheorghe-Teodor |last3=Le |first3=Tung D. |last4=Chen |first4=Tong |last5=Su |first5=Gong |last6=Imai |first6=Haruki |last7=Negishi |first7=Yasushi |last8=Leu |first8=Anh |last9=O'Brien |first9=Kevin |last10=Kawachiya |first10=Kiyokuni |last11=Eichenberger |first11=Alexandre E. 
|date=2020 |title=Compiling ONNX Neural Network Models Using MLIR |eprint=2008.08272 |class=cs.PL}}{{Citation |last=Pienaar |first=Jacques |title=MLIR in TensorFlow Ecosystem |date=2020 |url=https://research.google/pubs/pub48996/ |access-date=2023-07-06}}{{cite arXiv |last1=Hu |first1=Pengchao |last2=Lu |first2=Man |last3=Wang |first3=Lei |last4=Jiang |first4=Guoyue |date=2022 |title=TPU-MLIR: A Compiler For TPU Using MLIR |eprint=2210.15016 |class=cs.PL}}{{Cite book |last1=Katel |first1=Navdeep |last2=Khandelwal |first2=Vivek |last3=Bondhugula |first3=Uday |title=Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction |chapter=MLIR-based code generation for GPU tensor cores |date=2022-03-19 |url=https://dl.acm.org/doi/10.1145/3497776.3517770 |language=en |publisher=ACM |pages=117–128 |doi=10.1145/3497776.3517770 |isbn=978-1-4503-9183-2|s2cid=247522110}}{{Cite journal |last1=Bik |first1=Aart |last2=Koanantakool |first2=Penporn |last3=Shpeisman |first3=Tatiana |last4=Vasilache |first4=Nicolas |last5=Zheng |first5=Bixia |last6=Kjolstad |first6=Fredrik |date=2022-12-31 |title=Compiler Support for Sparse Tensor Computations in MLIR |url=https://dl.acm.org/doi/10.1145/3544559 |journal=ACM Transactions on Architecture and Code Optimization |language=en |volume=19 |issue=4 |pages=1–25 |doi=10.1145/3544559 |s2cid=246680261 |issn=1544-3566 |arxiv=2202.04305}}

For specific compiler projects and toolchains built using MLIR, see the Ecosystem section below.

Ecosystem

MLIR has fostered a growing ecosystem of open-source projects, production compilers, and experimental toolchains across multiple domains. These projects demonstrate MLIR’s flexibility in modeling, optimizing, and lowering computations for a diverse set of hardware targets.

TensorFlow/XLA integrates MLIR as a foundational component of its modern compiler infrastructure. MLIR is used to represent TensorFlow computation graphs in an extensible intermediate form, facilitating transformations such as fusion, quantization, and backend-specific lowering. Both the TensorFlow Runtime (TFRT) and the Accelerated Linear Algebra (XLA) compiler rely on MLIR to improve portability and performance across hardware platforms.{{cite web |title=MLIR: A New Intermediate Representation and Compiler Framework |url=https://blog.tensorflow.org/2019/04/mlir-new-intermediate-representation.html |website=TensorFlow Blog |date=2019-04-18 |access-date=2025-06-16}}{{cite web |title=TFRT: A New TensorFlow Runtime |url=https://blog.tensorflow.org/2020/04/tfrt-new-tensorflow-runtime.html |website=TensorFlow Blog |date=2020-04-27 |access-date=2025-06-16}}{{cite web |title=MLIR for Graph Algorithms |url=https://mlir.llvm.org/docs/Rationale/MLIRForGraphAlgorithms/ |website=mlir.llvm.org |access-date=2025-06-16}}{{cite web |title=XLA Overview |url=https://openxla.org/xla |website=OpenXLA |access-date=2025-06-16}}

IREE (Intermediate Representation Execution Environment) is an end-to-end compiler and runtime system built entirely on MLIR. It compiles high-level machine learning models, such as those from TensorFlow and TensorFlow Lite, into optimized, portable executables that can target a variety of hardware backends, including CPUs, GPUs, and dedicated accelerators. IREE supports both ahead-of-time (AOT) and just-in-time (JIT) compilation workflows, and serves as a demonstration of how MLIR can function as the intermediate representation for a complete compiler stack, encompassing frontend lowering, optimization, backend code generation, and runtime execution.{{cite web |title=IREE |url=https://iree.dev |website=iree.dev |access-date=2025-06-16}}{{cite web |title=Announcing IREE: A New Initiative for Machine Learning Deployment |url=https://lfaidata.foundation/blog/2024/05/23/announcing-iree-a-new-initiative-for-machine-learning-deployment/ |website=LF AI & Data Foundation |date=2024-05-23 |access-date=2025-06-16}}{{cite web |title=IREE Targeting Vulkan |url=https://www.khronos.org/assets/uploads/developers/presentations/IREE_targeting_Vulkan_Zhang_May22.pdf |website=Khronos Group |publisher=Khronos |access-date=2025-06-16}}{{cite journal |last1=Liu |first1=Hsin-I Cindy |last2=Brehler |first2=Marius |last3=Ravishankar |first3=Mahesh |last4=Vasilache |first4=Nicolas |last5=Vanik |first5=Ben |last6=Laurenzo |first6=Stella |title=TinyIREE: An ML Execution Environment for Embedded Systems From Compilation to Deployment |journal=IEEE Micro |volume=42 |issue=5 |pages=9–16 |year=2022 |doi=10.1109/MM.2022.3178068 |url=https://doi.org/10.1109/MM.2022.3178068 |publisher=IEEE|doi-access=free }}

torch-mlir is a compiler project that integrates MLIR-based infrastructure into the PyTorch ecosystem. It introduces the Torch and TorchConversion dialects, which model PyTorch-level abstractions such as TorchScript and eager-mode semantics, and provides transformation passes to progressively lower these representations toward hardware-optimized targets. torch-mlir is designed as a modular backend framework, enabling high-performance execution across diverse platforms, including CPUs, GPUs, and specialized accelerators.{{cite web |title=The torch-mlir Project (LLVM OpenMeetings 2021) |url=https://mlir.llvm.org/OpenMeetings/2021-10-07-The-Torch-MLIR-project.pdf |website=mlir.llvm.org |date=2021-10-07 |access-date=2025-06-16}}{{cite web |title=torch-mlir: Bridging PyTorch and MLIR Ecosystems |url=https://discuss.pytorch.org/t/torch-mlir-bridging-pytorch-and-mlir-ecosystems/133151 |website=PyTorch Forums |date=30 September 2021 |access-date=2025-06-16}}{{cite web |title=An Introduction to torch-mlir (FOSDEM 2025) |url=https://fosdem.org/2025/events/attachments/fosdem-2025-6643-an-introduction-to-torch-mlir/slides/237934/An_Introd_nnOMKYo.pdf |website=fosdem.org |access-date=2025-06-16}}

ONNX-MLIR is a compiler framework built on MLIR that targets the ONNX ecosystem. It provides a conversion and optimization pipeline for ONNX models by translating them into MLIR using a series of dedicated dialects that represent ONNX operations and intermediate forms. ONNX-MLIR enables execution across a wide range of hardware platforms by leveraging MLIR’s extensible lowering infrastructure and backend integration. The project supports model import, shape inference, and code generation for multiple targets, and serves as a reference implementation of ONNX-to-MLIR compilation.{{cite web |title=ONNX-MLIR |url=https://onnx.ai/onnx-mlir/ |website=onnx.ai |access-date=2025-06-16}}{{cite arXiv |eprint=2008.08272 |class=cs.PL |title=Compiling ONNX Neural Network Models Using MLIR |last1=Jin |first1=Tian |last2=Bercea |first2=Gheorghe-Teodor |last3=Le |first3=Tung D |last4=Chen |first4=Tong |last5=Su |first5=Gong |last6=Imai |first6=Haruki |last7=Negishi |first7=Yasushi |last8=Leu |first8=Anh |last9=O'Brien |first9=Kevin |last10=Kawachiya |first10=Kiyokuni |year=2020}}

MLIR-AIE is a compiler framework developed by Xilinx for programming AI Engine (AIE) arrays found on Versal ACAP platforms. It extends MLIR with custom dialects and transformation passes tailored to the dataflow architecture and compilation constraints of AIE hardware. MLIR-AIE enables software developers to write high-level programs and compile them into optimized instruction sets suitable for deeply embedded, parallel, and statically scheduled workloads. The framework supports hardware-specific pipelines such as IRON and AIR for targeting AMD’s Ryzen AI and Versal-AIE platforms.{{cite web |title=MLIR-AIE Documentation |url=https://xilinx.github.io/mlir-aie/index.html |website=Xilinx GitHub Pages |publisher=Xilinx |access-date=2025-06-16}}{{cite web |title=IRON for Ryzen AI: A Tutorial on MLIR-AIE Compilation |url=https://www.amd.com/content/dam/amd/en/documents/products/processors/ryzen/ai/iron-for-ryzen-ai-tutorial-micro-2024.pdf |website=amd.com |publisher=AMD |access-date=2025-06-16}}{{cite web |title=AIR for Ryzen AI: A Tutorial on Versal-AIE Compilation |url=https://www.amd.com/content/dam/amd/en/documents/products/processors/ryzen/ai/air-for-ryzen-ai-tutorial-asplos-2024.pdf |website=amd.com |publisher=AMD |access-date=2025-06-16}}

Triton-MLIR is a compiler infrastructure that brings MLIR-based tooling to the Triton programming model, which is used to write highly efficient custom GPU kernels. It introduces MLIR dialects that represent Triton's core abstractions, including blocks, warps, and memory spaces, and integrates them with existing MLIR transformation pipelines. Triton-MLIR enables new paths for optimization, interoperability, and backend extensibility within the Triton ecosystem. It also forms part of Microsoft’s broader effort to unify Triton’s kernel representation under the MLIR compiler architecture.{{cite web |title=Triton MLIR Dialects |url=https://triton-lang.org/main/dialects/dialects.html |website=triton-lang.org |access-date=2025-06-16}}{{cite web |title=microsoft/triton-shared |url=https://github.com/microsoft/triton-shared |website=GitHub |publisher=Microsoft |access-date=2025-06-16}}{{cite web |title=Understanding the Stages of Triton Kernel Compilation |url=https://pytorch.org/blog/triton-kernel-compilation-stages/ |website=PyTorch Blog |date=2023-09-26 |access-date=2025-06-16}}{{cite web |title=The Proton Dialect (LLVM Dev Meeting 2025) |url=https://llvm.org/devmtg/2025-03/slides/the_proton_dialect.pdf |website=llvm.org |access-date=2025-06-16}}

Mojo is a systems programming language developed by Modular Inc. that integrates Python syntax with low-level performance characteristics. Mojo is built on MLIR and uses it as its core intermediate representation framework. The language defines custom dialects to support advanced compilation features such as static typing, memory layout control, metaprogramming, and hardware specialization. MLIR enables Mojo to seamlessly interoperate with other MLIR-based systems and to generate highly optimized code for a wide range of accelerators and heterogeneous platforms.{{cite web |title=Mojo Keynote – LLVM Dev Meeting 2023 |url=https://llvm.org/devmtg/2023-10/slides/keynote/Mojo.pdf |website=llvm.org |access-date=2025-06-16}}{{cite web |title=How MLIR Powers Mojo |url=https://www.educative.io/answers/mlir-in-mojo |website=Educative |access-date=2025-06-16}}{{cite web |title=Mojo and MLIR Interoperability |url=https://ruhati.net/mojo/_mlir_interoperability.html |website=ruhati.net |access-date=2025-06-16}}

See also

References

{{Reflist}}