[transform_ext] Transform op to convert const dense_resource ops to function args #170

Open
tkarna wants to merge 1 commit into llvm:main from tkarna:conv-const-resources

Contributor

tkarna commented Jun 1, 2026

Adds a convert_const_resources_to_args transform op that finds all arith.constant ops holding dense_resource values and replaces them with function arguments.

For example, the KernelBench level2-9 case has the matmul weights and bias encoded as dense_resource ops:

module {
  func.func @main(%arg0: tensor<1024x8192xf16>) -> tensor<1024x8192xf16> {
    %cst = arith.constant 0.000000e+00 : f32
    %cst_0 = arith.constant 0.000000e+00 : f16
    %cst_1 = arith.constant dense_resource<torch_tensor_8192_8192_torch.float16> : tensor<8192x8192xf16>
    %cst_2 = arith.constant dense_resource<torch_tensor_8192_torch.float16> : tensor<8192xf16>
    %0 = tensor.empty() : tensor<8192x8192xf16>
    %transposed = linalg.transpose ins(%cst_1 : tensor<8192x8192xf16>) outs(%0 : tensor<8192x8192xf16>)
        permutation = [1, 0] 
    ...

After the transform, the weight and bias tensors are appended to the function arguments and the constants are gone:

module {
  func.func @main(%arg0: tensor<1024x8192xf16>, %arg1: tensor<8192x8192xf16>, %arg2: tensor<8192xf16>)
        -> tensor<1024x8192xf16> attributes {llvm.emit_c_interface} {
    %cst = arith.constant 0.000000e+00 : f32
    %cst_0 = arith.constant 0.000000e+00 : f16
    %0 = tensor.empty() : tensor<8192x8192xf16>
    %transposed = linalg.transpose ins(%arg1 : tensor<8192x8192xf16>) outs(%0 : tensor<8192x8192xf16>)
        permutation = [1, 0] 
    ...

In general, the payload function can contain arbitrarily many dense_resource ops in any order. Each one has to be identified so that the caller can pass in the right buffer. For now, the arguments are ordered by matmul op and by where they appear in that matmul's producer/consumer chain: matmul_0_A, matmul_0_B, matmul_0_epilogue, matmul_1_A, etc. If a dense_resource is not associated with any matmul op, an error is raised.
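
To make the ordering rule concrete, here is a minimal Python sketch. It is not the implementation in this PR: the Matmul record and the order_resource_args helper are hypothetical names, and the sketch assumes the analysis has already worked out which resource feeds which operand of each matmul.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Matmul:
    """Hypothetical record of the dense_resource names tied to one matmul."""
    a: Optional[str] = None         # resource reaching the A operand, if any
    b: Optional[str] = None         # resource reaching the B operand, if any
    epilogue: Optional[str] = None  # resource consumed after the matmul, if any


def order_resource_args(matmuls, all_resources):
    """Return (role, resource) pairs in the order matmul_i_A, _B, _epilogue."""
    ordered = []
    for i, mm in enumerate(matmuls):
        for role, res in (("A", mm.a), ("B", mm.b), ("epilogue", mm.epilogue)):
            if res is not None:
                ordered.append((f"matmul_{i}_{role}", res))
    # Any resource that no matmul claimed cannot be given a role: report it.
    claimed = {res for _, res in ordered}
    orphans = [res for res in all_resources if res not in claimed]
    if orphans:
        raise ValueError(f"dense_resource not associated with a matmul: {orphans}")
    return ordered


# Assumed roles for the example above: weight -> B operand, bias -> epilogue.
matmuls = [Matmul(b="torch_tensor_8192_8192_torch.float16",
                  epilogue="torch_tensor_8192_torch.float16")]
resources = ["torch_tensor_8192_torch.float16",
             "torch_tensor_8192_8192_torch.float16"]
for role, res in order_resource_args(matmuls, resources):
    print(role, res)
```

Under that assumption the weight comes out first and the bias second regardless of the order in which the constants appear in the IR, which matches the %arg1/%arg2 order shown above.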

tkarna requested review from adam-smnk and rengolin June 1, 2026 15:39
Member

adam-smnk commented Jun 1, 2026

High-level question, what happens to the original data from the dense resource?

Contributor Author

tkarna commented Jun 1, 2026

High-level question, what happens to the original data from the dense resource?

The const dense_resource op is removed, so the data is no longer present in the IR. We can still access the weight data from the torch model and pass that buffer to the MLIR kernel.

Contributor Author

tkarna commented Jun 2, 2026

To motivate this transformation: the const dense_resource ops are problematic for at least two reasons:

  1. The IR gets very large (>500 MB in text format in some cases), which makes lowering slow and increases the memory footprint. It also complicates debugging.
  2. It's not clear how to handle dense_resource ops on GPUs. Do they need to be explicitly moved to the device?
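
A rough back-of-the-envelope estimate of the text size, sketched in Python. It assumes the resource blob is printed as a hex string with two characters per byte and ignores any header or alignment prefix; the helper name hex_text_bytes is made up for illustration.

```python
from math import prod

# Storage size per element for the dtypes used in the example.
BYTES_PER_ELEMENT = {"f16": 2, "f32": 4}


def hex_text_bytes(shape, dtype):
    """Approximate number of characters needed to print a tensor blob as hex."""
    raw_bytes = prod(shape) * BYTES_PER_ELEMENT[dtype]
    return 2 * raw_bytes  # two hex digits per byte (assumption, see above)


weight = hex_text_bytes((8192, 8192), "f16")  # matmul weight from the example
bias = hex_text_bytes((8192,), "f16")         # bias from the example
print(f"{(weight + bias) / 2**20:.2f} MiB of text")
```

For the two tensors in the level2-9 example this comes to roughly 256 MiB of text for a single layer, so under these assumptions two such layers would already pass the 500 MB mark.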

The present PR is just a workaround, though. The payload analysis and arg ordering use linalg.matmul ops as "anchor ops" from which the roles of the const weight arrays are inferred. This approach works for GEMMs and MLPs but does not generalize well: for example, convolution ops can be linalg.generic ops, in which case we'd need to analyze whether the op is a convolution or, say, an elementwise post-op. Such analysis can get arbitrarily complex.

In the long term, we should probably handle the const model data at a higher level. At the torch model level we can identify the model layers and their weights, so their roles are easy to determine. We could, for example, modify torch-mlir to directly generate the weights as func args with proper annotation/ordering, or use torch.compile instead (?).
