
Legalize quantized stablehlo operation using uniform_quantize/uniform_dequantize #2394

Merged (10 commits) Jun 18, 2024

Conversation

sdasgup3 (Member) commented Jun 14, 2024

This PR provides a pass to decompose StableHLO quantized programs using uniform_quantize/uniform_dequantize
operations. For example, the following program:

```mlir
func.func @add(%arg0: tensor<!quant.uniform<i8:f32,1.0:0>>, %arg1: tensor<!quant.uniform<i8:f32,2.0:1>>) ->  tensor<!quant.uniform<i8:f32,3.0:2>> {
  %0 = "stablehlo.add"(%arg0, %arg1) : (tensor<!quant.uniform<i8:f32,1.0:0>>, tensor<!quant.uniform<i8:f32,2.0:1>>) -> tensor<!quant.uniform<i8:f32,3.0:2>>
  func.return %0 : tensor<!quant.uniform<i8:f32,3.0:2>>
}
```

will become:

```mlir
func.func @add(%arg0: tensor<!quant.uniform<i8:f32, 1.000000e+00>>, %arg1: tensor<!quant.uniform<i8:f32, 2.000000e+00:1>>) -> tensor<!quant.uniform<i8:f32, 3.000000e+00:2>> {
  %0 = stablehlo.uniform_dequantize %arg0 : (tensor<!quant.uniform<i8:f32, 1.000000e+00>>) -> tensor<f32>
  %1 = stablehlo.uniform_dequantize %arg1 : (tensor<!quant.uniform<i8:f32, 2.000000e+00:1>>) -> tensor<f32>
  %2 = stablehlo.add %0, %1 : tensor<f32>
  %3 = stablehlo.uniform_quantize %2 : (tensor<f32>) -> tensor<!quant.uniform<i8:f32, 3.000000e+00:2>>
  return %3 : tensor<!quant.uniform<i8:f32, 3.000000e+00:2>>
}
```

Per docs/spec.md, the following is the exhaustive list of ops that can be interpreted using the dq-op-q strategy. The current PR handles all of these ops except DotGeneralOp, ConvolutionOp, DynamicConvOp, and AddOp, which are already lowered to integer math using the --stablehlo-legalize-quant-to-int pass.

  1. AbsOp
  2. AddOp
  3. Atan2Op
  4. BatchNormGradOp
  5. BatchNormInferenceOp
  6. BatchNormTrainingOp
  7. CbrtOp
  8. CeilOp
  9. CholeskyOp
  10. ClampOp
  11. CompareOp
  12. ConvolutionOp
  13. CosineOp
  14. DivOp
  15. DotGeneralOp
  16. DynamicConvOp
  17. Expm1Op
  18. ExpOp
  19. FloorOp
  20. Log1pOp
  21. LogisticOp
  22. LogOp
  23. MaxOp
  24. MinOp
  25. MulOp
  26. NegOp
  27. PowOp
  28. ReducePrecisionOp
  29. RemOp
  30. RoundOp
  31. RoundNearestEvenOp
  32. RsqrtOp
  33. SelectOp
  34. SignOp
  35. SineOp
  36. SqrtOp
  37. SubtractOp
  38. TanhOp
  39. TriangularSolveOp

@sdasgup3 sdasgup3 requested a review from GleasonK June 14, 2024 20:53
@sdasgup3 sdasgup3 force-pushed the quant-to-int-unary-ops branch 2 times, most recently from bafe8e3 to 7afb5a5 Compare June 14, 2024 22:00
@sdasgup3 sdasgup3 marked this pull request as draft June 14, 2024 22:16
@sdasgup3 sdasgup3 marked this pull request as ready for review June 14, 2024 22:58
@sdasgup3 sdasgup3 force-pushed the quant-to-int-unary-ops branch 4 times, most recently from 1035a48 to 76aaa36 Compare June 18, 2024 01:31
@sdasgup3 sdasgup3 requested a review from GleasonK June 18, 2024 01:35
@sdasgup3 sdasgup3 merged commit 3295a5e into openxla:main Jun 18, 2024
10 checks passed
3 participants