
Legalize quantized stablehlo operation using uniform_quantize/uniform_dequantize #2394

Merged (10 commits) Jun 18, 2024

Conversation

sdasgup3 (Member) commented Jun 14, 2024

This PR provides a pass to decompose StableHLO quantized programs using uniform_quantize/uniform_dequantize
operations. For example, the following program:

```mlir
func.func @add(%arg0: tensor<!quant.uniform<i8:f32,1.0:0>>, %arg1: tensor<!quant.uniform<i8:f32,2.0:1>>) ->  tensor<!quant.uniform<i8:f32,3.0:2>> {
  %0 = "stablehlo.add"(%arg0, %arg1) : (tensor<!quant.uniform<i8:f32,1.0:0>>, tensor<!quant.uniform<i8:f32,2.0:1>>) -> tensor<!quant.uniform<i8:f32,3.0:2>>
  func.return %0 : tensor<!quant.uniform<i8:f32,3.0:2>>
}
```

will become:

```mlir
func.func @add(%arg0: tensor<!quant.uniform<i8:f32, 1.000000e+00>>, %arg1: tensor<!quant.uniform<i8:f32, 2.000000e+00:1>>) -> tensor<!quant.uniform<i8:f32, 3.000000e+00:2>> {
  %0 = stablehlo.uniform_dequantize %arg0 : (tensor<!quant.uniform<i8:f32, 1.000000e+00>>) -> tensor<f32>
  %1 = stablehlo.uniform_dequantize %arg1 : (tensor<!quant.uniform<i8:f32, 2.000000e+00:1>>) -> tensor<f32>
  %2 = stablehlo.add %0, %1 : tensor<f32>
  %3 = stablehlo.uniform_quantize %2 : (tensor<f32>) -> tensor<!quant.uniform<i8:f32, 3.000000e+00:2>>
  return %3 : tensor<!quant.uniform<i8:f32, 3.000000e+00:2>>
}
```

Per docs/spec.md, the following is the exhaustive list of ops that can be interpreted using the dq-op-q strategy. The current PR handles all of these ops except DotGeneralOp, ConvolutionOp, DynamicConvOp, and AddOp, which are already lowered to integer math using the --stablehlo-legalize-quant-to-int pass.

  1. AbsOp
  2. AddOp
  3. Atan2Op
  4. BatchNormGradOp
  5. BatchNormInferenceOp
  6. BatchNormTrainingOp
  7. CbrtOp
  8. CeilOp
  9. CholeskyOp
  10. ClampOp
  11. CompareOp
  12. ConvolutionOp
  13. CosineOp
  14. DivOp
  15. DotGeneralOp
  16. DynamicConvOp
  17. Expm1Op
  18. ExpOp
  19. FloorOp
  20. Log1pOp
  21. LogisticOp
  22. LogOp
  23. MaxOp
  24. MinOp
  25. MulOp
  26. NegOp
  27. PowOp
  28. ReducePrecisionOp
  29. RemOp
  30. RoundOp
  31. RoundNearestEvenOp
  32. RsqrtOp
  33. SelectOp
  34. SignOp
  35. SineOp
  36. SqrtOp
  37. SubtractOp
  38. TanhOp
  39. TriangularSolveOp

@sdasgup3 sdasgup3 requested a review from GleasonK June 14, 2024 20:53
@sdasgup3 sdasgup3 force-pushed the quant-to-int-unary-ops branch 2 times, most recently from bafe8e3 to 7afb5a5 Compare June 14, 2024 22:00
@sdasgup3 sdasgup3 marked this pull request as draft June 14, 2024 22:16
@sdasgup3 sdasgup3 marked this pull request as ready for review June 14, 2024 22:58
@sdasgup3 sdasgup3 force-pushed the quant-to-int-unary-ops branch 4 times, most recently from 1035a48 to 76aaa36 Compare June 18, 2024 01:31
@sdasgup3 sdasgup3 requested a review from GleasonK June 18, 2024 01:35
@sdasgup3 sdasgup3 merged commit 3295a5e into openxla:main Jun 18, 2024
10 checks passed
3 participants