Add support for TFLite INT8 detection export #1968

Open · wants to merge 2 commits into main
Conversation

@adamp87 commented Jan 15, 2024

This PR adds support for exporting a fixed-shape model to ONNX and converting it with onnx2tf to a quantized TFLite model. It differs from previous solutions in that it does not reimplement any of the nn blocks in TF; instead, it exports the PyTorch model to ONNX.
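
For reference, a minimal sketch of the intended pipeline, assuming the checkpoint stores the network under the "model" key and a static 640×640 input; file names are placeholders, and the onnx2tf flags are copied from the conversion command quoted later in this thread:

```python
# Sketch of the export pipeline: PyTorch -> ONNX -> quantized TFLite.
# File names, the "model" checkpoint key and the input shape are assumptions.
import subprocess
import torch

ckpt = torch.load("yolov7-tiny.pt", map_location="cpu")
model = ckpt["model"].float().eval()
dummy = torch.zeros(1, 3, 640, 640)  # fixed shape: quantization needs a static model

torch.onnx.export(model, dummy, "model.onnx", opset_version=12,
                  input_names=["images"])

# Convert to (INT8-)quantized TFLite; flags copied verbatim from the
# onnx2tf command used later in this thread.
subprocess.run(["onnx2tf", "-i", "model.onnx", "-o", "model.tf",
                "--verbosity", "info", "-nuo", "-oiqt", "-qt", "per-tensor"],
               check=True)
```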

Quantization issues:

  • As seen in previous branches, the TF model used to normalize the pixel coordinates. This is essential, since the score and pixel values must lie in the same range [0, 1] for proper quantization.

    yolov7/models/tf.py

    Lines 364 to 365 in f2439f8

    xy /= tf.constant([[self.imgsz[1], self.imgsz[0]]], dtype=tf.float32)
    wh /= tf.constant([[self.imgsz[1], self.imgsz[0]]], dtype=tf.float32)
  • Furthermore, the above-mentioned approach can still introduce numerical instability, since during inference a large value in [0, img_size] is divided by img_size. This has been improved by using normalization factors, which can be precomputed for static models (see the sketch below).
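
A minimal sketch of the precomputed-factor idea, assuming a static input size (variable names are illustrative, not the PR's actual code):

```python
# Sketch: multiply by precomputed reciprocal factors instead of dividing
# by the image size at inference time (illustrative, not the PR's exact code).
import torch

img_h, img_w = 640, 640  # static input size, known at export time

# Precomputed once at export time and baked into the graph as a constant:
norm = torch.tensor([[1.0 / img_w, 1.0 / img_h]], dtype=torch.float32)

# Dummy decoded box centers and sizes in pixel units, shape (N, 2):
xy = torch.tensor([[320.0, 240.0]])
wh = torch.tensor([[64.0, 48.0]])

xy_n = xy * norm  # stays in [0, 1], same range as the scores
wh_n = wh * norm
```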

Converted model: [screenshot omitted]

Backward compatibility:

  • This PR introduces a new argument for exporting ONNX models, "normalize". If the argument is not given, the exporter produces the same models as before. If it is given, the pixel coordinates of the detection block are normalized (a quick sanity-check sketch follows).
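
As a quick sanity check, a sketch assuming the exporter was run with the new argument, that the model input is named "images", and that there is a single (1, N, 85) output:

```python
# Sketch: verify that the "normalize" export produces boxes in [0, 1].
# The file name, input name and output layout are assumptions.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model_normalized.onnx")
x = np.zeros((1, 3, 640, 640), dtype=np.float32)
(pred,) = sess.run(None, {"images": x})

xywh = pred[..., :4]
print(xywh.min(), xywh.max())  # expected to stay within [0, 1]
```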

Demo:

  • This PR adds a demo tool that explains the conversion step by step, then performs inference with TFLite and applies NMS using OpenCV. Finally, it gives a visual comparison between the PyTorch and the TFLite results.
  • NMS is not exported as part of the model; it is applied with OpenCV (a condensed sketch follows this list). This was chosen because the exported PyTorch NMS executes in F32 and outputs int64, which is not supported by many edge devices. The tool onnx2tf has a mode to fix the int64 output, for more details see: https://github.com/PINTO0309/onnx2tf#10-fixing-the-output-of-nonmaxsuppression-nms
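
A condensed sketch of that flow, assuming a full-integer model, a 640×640 input and the normalized (N, 85) output layout; the model path and thresholds are illustrative:

```python
# Sketch: INT8 TFLite inference followed by OpenCV NMS.
# Paths, shapes and thresholds are illustrative.
import cv2
import numpy as np
import tensorflow as tf

interp = tf.lite.Interpreter(model_path="model_full_integer_quant.tflite")
interp.allocate_tensors()
inp, out = interp.get_input_details()[0], interp.get_output_details()[0]

img = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)  # stand-in image
x = img.astype(np.float32)[None] / 255.0                        # NHWC in [0, 1]

# Quantize the input with the model's own scale/zero-point:
s, zp = inp["quantization"]
interp.set_tensor(inp["index"], (x / s + zp).astype(inp["dtype"]))
interp.invoke()

# Dequantize the output back to float:
s, zp = out["quantization"]
pred = (interp.get_tensor(out["index"]).astype(np.float32) - zp) * s
pred = pred[0]  # (N, 85): normalized xywh, objectness, class scores

scores = pred[:, 4] * pred[:, 5:].max(axis=1)
boxes = pred[:, :4] * 640.0          # back to pixel units
boxes[:, :2] -= boxes[:, 2:] / 2.0   # center xywh -> top-left xywh for OpenCV
keep = cv2.dnn.NMSBoxes(boxes.tolist(), scores.tolist(), 0.25, 0.45)
```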

Conclusion:

  • The provided demo shows a functional way to export TFLite INT8 models. A retrained tiny model with ReLU6 activation has been tested on a RPi4 + Coral Edge TPU, where inference + NMS ran in approx. 60 ms. No quantitative tests have been done, as no common evaluation framework exists for YOLOv7 + TFLite.

If you like this PR, please give it a thumbs up, and let's hope @AlexeyAB and @WongKinYiu will merge it.

@dsbyprateekg commented Jun 6, 2024

@adamp87 I tried your PR with my custom yolov7-tiny model and found a difference in bbox position. With the original model the bbox was tight (close around the object), but with the full_integer_quant.tflite model the bbox was not so tight.
Maybe this is due to the full-integer quantization.
Can you please share the changes I need to make in the inference notebook for the other models, like:
dynamic_range_quant.tflite
float16.tflite
float32.tflite

For example, with the _float16.tflite model I am facing the following error: [error screenshot]

@adamp87 (Author) commented Jun 11, 2024

You don't need to do the integer scaling and shifting if you use models like float16. I recommend reading the TensorFlow documentation on quantizing models: https://www.tensorflow.org/lite/performance/post_training_quantization
Changes in bbox size and position are to be expected; mAP does drop for the INT8-quantized model, less at mAP@0.5 and more at mAP@0.5:0.95. A minimal sketch of handling both quantized and float models follows.
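
For reference, a sketch of how one inference notebook could serve all exported variants: TFLite reports a (scale, zero_point) pair per tensor, and a scale of 0.0 means the tensor is float, so the scaling/shifting can be applied conditionally (model path and helper names are illustrative):

```python
# Sketch: apply (de)quantization only when the tensor is actually quantized.
# A quantization scale of 0.0 means the tensor is float (e.g. float16 models).
import numpy as np
import tensorflow as tf

interp = tf.lite.Interpreter(model_path="model_float16.tflite")  # any variant
interp.allocate_tensors()
inp, out = interp.get_input_details()[0], interp.get_output_details()[0]

def quantize_input(x):
    s, zp = inp["quantization"]
    return (x / s + zp).astype(inp["dtype"]) if s else x.astype(inp["dtype"])

def dequantize_output(y):
    s, zp = out["quantization"]
    return (y.astype(np.float32) - zp) * s if s else y.astype(np.float32)
```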

@dsbyprateekg commented Jun 12, 2024

@adamp87 The onnx2tf script is failing for a yolov7-w6-based model with an image size of 1280; it does not show any error in the console.
!onnx2tf -i model_opt.onnx -o model_opt.tf --verbosity info -nuo -oiqt -qt per-tensor

[console output screenshot]

Let me know if any change is required in the code.
