
Conversation

@Mr-Anyone
Member

Motivation

Added elementwise operators that have a one-to-one migraphx-to-linalg mapping. This allows us to test more GEMMs.

Technical Details

Added a TrivialElementwiseConverter class that performs one-to-one migraphx-to-linalg conversions.

Example:

Starting from:

%arg2 = migraphx.sub %arg0, %arg1 : <1x1xf32, 1x1>, <1x1xf32, 1x1> -> <1x1xf32, 1x1>

becomes:

%1 = linalg.sub ins(%expanded_0, %expanded : tensor<1x1xf32>, tensor<1x1xf32>) outs(%0 : tensor<1x1xf32>) -> tensor<1x1xf32>

Notice that there is a one-to-one mapping between migraphx.sub and linalg.sub.
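For illustration, a rough sketch of what such a converter pattern looks like (the adaptor handling, type plumbing, and helper names other than TrivialElementwiseConverter are assumptions and may differ from the actual patch):

// Inside the migraphx-to-linalg conversion, using the standard MLIR
// dialect-conversion APIs.
template <class MIGraphXOp, class LinalgOp>
struct TrivialElementwiseConverter final
    : public OpConversionPattern<MIGraphXOp> {
  using OpConversionPattern<MIGraphXOp>::OpConversionPattern;

  LogicalResult
  matchAndRewrite(MIGraphXOp op, typename MIGraphXOp::Adaptor adaptor,
                  ConversionPatternRewriter &rewriter) const override {
    Location loc = op.getLoc();
    // Use the (already converted) first input type to build the init tensor
    // for the destination-passing-style linalg op.
    auto aType = cast<RankedTensorType>(adaptor.getOperands()[0].getType());
    Value init = tensor::EmptyOp::create(rewriter, loc, aType.getShape(),
                                         aType.getElementType());
    // One-to-one replacement, e.g. migraphx.sub -> linalg.sub.
    rewriter.replaceOpWithNewOp<LinalgOp>(op, adaptor.getOperands(),
                                          ValueRange{init});
    return success();
  }
};

The pattern is then instantiated once per op pair, e.g. TrivialElementwiseConverter<migraphx::SubOp, linalg::SubOp> (exact op class names depend on the dialect definitions).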

Test Plan

Added a lit test for this initial conversion.

Test Result

Passed the lit test.

Submission Checklist

@Mr-Anyone Mr-Anyone requested a review from causten as a code owner February 3, 2026 19:49
@Mr-Anyone Mr-Anyone marked this pull request as draft February 3, 2026 19:49
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-3 branch 2 times, most recently from 681fe17 to 322f71f on February 3, 2026 20:23
@Mr-Anyone Mr-Anyone marked this pull request as ready for review February 3, 2026 21:12
@@ -0,0 +1,92 @@
// RUN: rocmlir-opt -split-input-file --migraphx-to-linalg %s -verify-diagnostics -o | FileCheck %s
Contributor

Add another test file for negative checks, i.e., migraphx elementwise ops that do not have a linalg conversion (yet)

Member Author

See linalg-to-rock-not-implemented.mlir.

// RUN: rocmlir-opt -split-input-file --migraphx-to-linalg %s -verify-diagnostics -o | FileCheck %s

// CHECK-LABEL: func_sub
// CHECK: linalg.sub
Contributor

Please also check that the operands are set in the right place (i.e., %arg0 then %arg1, not the other way around). Don't hardcode %arg0 and %arg1 in the check; instead, use FileCheck capture variables. Look at other lit tests for examples of how to do this.
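For example, something along these lines (a sketch only; the exact expand_shape ordering and types have to be taken from the actual pass output):

// CHECK-LABEL: @func_sub
// CHECK-SAME: (%[[ARG0:[a-z0-9]+]]: {{.*}}, %[[ARG1:[a-z0-9]+]]: {{.*}})
// CHECK-DAG: %[[LHS:.+]] = tensor.expand_shape %[[ARG0]]
// CHECK-DAG: %[[RHS:.+]] = tensor.expand_shape %[[ARG1]]
// CHECK: linalg.sub ins(%[[LHS]], %[[RHS]]

This pins the first ins operand of linalg.sub to the value derived from %arg0 and the second to the one derived from %arg1, without hardcoding the SSA names.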

ConversionTarget &target) {
target.addLegalDialect<linalg::LinalgDialect, arith::ArithDialect,
tensor::TensorDialect>();
tensor::TensorDialect, math::MathDialect>();
Contributor

Why MathDialect? We are not using it anywhere, right?

Member Author

Take linalg.exp as an example: the pass fails if I don't add this line. This is what the dialect-conversion log shows:

[dialect-conversion:1]     //===-------------------------------------------===//
[dialect-conversion:1]     Legalizing operation : 'math.exp' (0x27df5ee0) {
[dialect-conversion:1]       %4 = "math.exp"(%arg2) <{fastmath = #arith.fastmath<none>}> : (f32) -> f32
[dialect-conversion:1]     } -> SUCCESS : operation marked legal by the target
[dialect-conversion:1]     //===-------------------------------------------===//

It seems that some math operations are emitted as intermediate steps.

Member

That is interesting. Can you find out why and where it is using math.exp?

Member Author

I think that is because linalg.exp has a region that contains math.exp. This isn't shown in the pretty-printed version of the MLIR assembly:

"builtin.module"() ({
  "func.func"() <{function_type = (tensor<16xf32>, tensor<16xf32>) -> tensor<16xf32>, sym_name = "func_exp"}> ({
  ^bb0(%arg0: tensor<16xf32>, %arg1: tensor<16xf32>):
    %0 = "tensor.expand_shape"(%arg0) <{reassociation = [[0]], static_output_shape = array<i64: 16>}> : (tensor<16xf32>) -> tensor<16xf32>
    %1 = "tensor.empty"() : () -> tensor<16xf32>
    %2 = "linalg.exp"(%0, %1) <{operandSegmentSizes = array<i32: 1, 1>}> ({
    ^bb0(%arg2: f32, %arg3: f32):
      %4 = "math.exp"(%arg2) <{fastmath = #arith.fastmath<none>}> : (f32) -> f32
      "linalg.yield"(%4) : (f32) -> ()
    }) : (tensor<16xf32>, tensor<16xf32>) -> tensor<16xf32>
    %3 = "tensor.collapse_shape"(%2) <{reassociation = [[0]]}> : (tensor<16xf32>) -> tensor<16xf32>
    "func.return"(%3) : (tensor<16xf32>) -> ()
  }) : () -> ()
}) : () -> ()

Here we can see that linalg.exp has a region that contains math.exp.

//===----------------------------------------------------------------------===//
namespace {
template <class MIGraphXOp, class LinalgOp>
struct TrivialElementwiseConverter final
Contributor

Rename this to ElementwiseConverter

ConversionTarget finalConversionTarget(ctx);
RewritePatternSet genericPatternSet(&ctx);
populateLinalgGenericDialectConversion(finalConversionTarget);
linalg::populateLinalgNamedOpsGeneralizationPatterns(genericPatternSet);
Contributor

Why do we need to convert named ops to generic ops?

Member Author

I am trying to match the output of the tosa pipeline. In theory, I could move this into mlir/lib/Dialect/Rock/Pipelines/Pipelines.cpp instead:

root@14298acabab2:~/rocMLIR/build-debug# cat main.mlir
func.func @func_exp(%arg0 : !migraphx.shaped<1x3x2xf32, 6x2x1>) -> !migraphx.shaped<1x3x2xf32, 6x2x1> attributes {kernel, arch="gfx950"}{
  %1 = migraphx.exp %arg0 : <1x3x2xf32, 6x2x1> -> <1x3x2xf32, 6x2x1>
  func.return %1 : !migraphx.shaped<1x3x2xf32, 6x2x1>
}
root@14298acabab2:~/rocMLIR/build-debug# ./bin/rocmlir-driver --kernel-pipeline=migraphx,highlevel main.mlir
#map = affine_map<(d0, d1, d2) -> (d1 * 2 + d2)>
#map1 = affine_map<(d0, d1) -> (0, d0, d1)>
#map2 = affine_map<(d0, d1) -> (d0, d1)>
#map3 = affine_map<(d0, d1, d2) -> (d1, d2)>
#map4 = affine_map<(d0) -> (0, d0 floordiv 2, d0 mod 2)>
#transform_map = #rock.transform_map<#map by [<Unmerge{3, 2} ["exp1", "exp2"] at [1, 2] -> ["dim0"] at [0]>, <AddDim{1} ["unit0"] at [0] -> [] at []>] bounds = [1, 3, 2] -> [6]>
#transform_map1 = #rock.transform_map<#map1 by [<Merge{1, 3} ["dim0"] at [0] -> ["col0", "col1"] at [0, 1]>, <PassThrough ["dim1"] at [1] -> ["dim1"] at [2]>] bounds = [3, 2] -> [1, 3, 2]>
#transform_map2 = #rock.transform_map<#map3 by [<Unmerge{3} ["exp1"] at [1] -> ["dim0"] at [0]>, <PassThrough ["dim1"] at [2] -> ["dim1"] at [1]>, <AddDim{1} ["unit0"] at [0] -> [] at []>] bounds = [1, 3, 2] -> [3, 2]>
#transform_map3 = #rock.transform_map<#map4 by [<Merge{1, 3, 2} ["dim0"] at [0] -> ["col0", "col1", "col2"] at [0, 1, 2]>] bounds = [6] -> [1, 3, 2]>
module {
  func.func @func_exp(%arg0: memref<6xf32>, %arg1: memref<6xf32>) attributes {arch = "gfx950", kernel} {
    %0 = rock.transform %arg0 by #transform_map : memref<6xf32> to memref<1x3x2xf32>
    %1 = rock.transform %0 by #transform_map1 : memref<1x3x2xf32> to memref<3x2xf32>
    %alloc = memref.alloc() {alignment = 64 : i64} : memref<3x2xf32>
    linalg.generic {indexing_maps = [#map2, #map2], iterator_types = ["parallel", "parallel"]} ins(%1 : memref<3x2xf32>) outs(%alloc : memref<3x2xf32>) {
    ^bb0(%in: f32, %out: f32):
      %4 = math.exp %in : f32
      linalg.yield %4 : f32
    }
    %2 = rock.transform %alloc by #transform_map2 : memref<3x2xf32> to memref<1x3x2xf32>
    %3 = rock.transform %2 by #transform_map3 : memref<1x3x2xf32> to memref<6xf32>
    memref.copy %3, %arg1 : memref<6xf32> to memref<6xf32>
    return
  }
}
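If we went that route, the change would roughly be an extra pipeline step rather than extra patterns in the conversion (sketch only; the pass placement and createMIGraphXToLinalgPass are assumptions, not the actual Pipelines.cpp contents):

// Hypothetical pipeline snippet: generalize the named linalg ops
// (linalg.sub, linalg.exp, ...) produced by migraphx-to-linalg into
// linalg.generic, matching what the tosa path emits.
pm.addNestedPass<func::FuncOp>(createMIGraphXToLinalgPass());                // assumed pass name
pm.addNestedPass<func::FuncOp>(mlir::createLinalgGeneralizeNamedOpsPass());  // upstream -linalg-generalize-named-ops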

// Here we are doing two passes because applyPartialConversion doesn't
// guarantee the order in which the patterns are applied. We want to
// first try to convert all the named linalg operations, and then
// transform the remaining linalg operations into linalg.generic operations.
Contributor

Which linalg operations are being turned into linalg.generic here? This is in LinalgToRock, so shouldn't we only be converting to rock ops in this pass?

Member Author

There isn't a particularly strong reason; I am just trying to match the output of the tosa lowering path. It seems that the migraphx->tosa and tosa->rock lowerings turn a lot of elementwise operations into linalg.generic.

@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-3 branch from 322f71f to 6d7eb69 on February 4, 2026 21:29
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-3 branch 2 times, most recently from 69d9574 to a793a6a on February 6, 2026 18:40
Comment on lines 213 to 292
Value init = tensor::EmptyOp::create(rewriter, loc, aType.getShape(),
aType.getElementType());
Member

Better to check that the output type is the same as the input type before taking aType as the output type for the linalg op.
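Something like the following inside the pattern's matchAndRewrite (illustrative only; how the output type is obtained, e.g. via a TypeConverter, is an assumption about the patch):

// Guard against mismatched shapes/element types before reusing the input
// type as the linalg output type.
auto aType = cast<RankedTensorType>(adaptor.getOperands()[0].getType());
auto outType = dyn_cast_or_null<RankedTensorType>(
    getTypeConverter()->convertType(op->getResult(0).getType()));
if (!outType || outType != aType)
  return rewriter.notifyMatchFailure(
      op, "expected elementwise output type to match the input type");
Value init = tensor::EmptyOp::create(rewriter, loc, outType.getShape(),
                                     outType.getElementType());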

@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-2 branch 2 times, most recently from 952693d to b501718 on February 10, 2026 20:02
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-3 branch from a793a6a to 436b328 on February 10, 2026 20:55
@Mr-Anyone Mr-Anyone requested a review from umangyadav February 10, 2026 21:14
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-2 branch from 80495d1 to 75bf455 on February 11, 2026 14:38
@Mr-Anyone Mr-Anyone force-pushed the pr-template-migraphx-to-linalg-3 branch from 727a8fa to 590daf6 on February 11, 2026 14:39