TensorFlow Model Optimization with TFLite

eetocs · September 23, 2022


Let's verify, hands-on, that models pruned with unstructured pruning in TFMOT (TensorFlow Model Optimization) are actually accelerated by TFLite + XNNPACK.
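As a reminder of what "unstructured pruning" means, here is a minimal pure-Python sketch of magnitude pruning: the smallest-magnitude weights are set to zero regardless of where they sit in the tensor. This is only an illustration of the idea, not TFMOT's implementation; the function names `magnitude_prune` and `sparsity_of` and the toy weights are made up for this example.

```python
def magnitude_prune(weights, sparsity):
    """Zero the smallest-magnitude entries of a rectangular 2-D weight list.

    `sparsity` is the fraction of entries to zero (0.0 to 1.0).
    Hypothetical helper for illustration only.
    """
    flat = [abs(w) for row in weights for w in row]
    n_prune = int(len(flat) * sparsity)
    # Rank flat positions by magnitude and drop exactly n_prune of them.
    order = sorted(range(len(flat)), key=lambda i: flat[i])
    drop = set(order[:n_prune])
    n_cols = len(weights[0])
    return [
        [0.0 if (r * n_cols + c) in drop else w for c, w in enumerate(row)]
        for r, row in enumerate(weights)
    ]


def sparsity_of(weights):
    """Fraction of entries that are exactly zero."""
    flat = [w for row in weights for w in row]
    return sum(1 for w in flat if w == 0.0) / len(flat)


# Toy 2x4 weight matrix (made-up values).
weights = [
    [0.9, -0.05, 0.3, 0.01],
    [-0.02, 0.7, -0.4, 0.08],
]
pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)               # the four smallest-magnitude weights become 0.0
print(sparsity_of(pruned))  # 0.5
```

The zeros end up scattered irregularly across the matrix, which is why a runtime needs dedicated sparse kernels to turn them into an actual speedup.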

Support for the Fast Sparse ConvNets technique started landing in TFMOT and TFLite (21.05.08)
- Nightly version
Installation
- A plain pip install does not work; it is supported in TFLite that uses Ruy as its backend

To check that TFMOT (TF Model Optimization) actually works, I experimented with the pre-trained model provided by Google (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/xnnpack/README.md#sparse-inference)
A record of my trial and error (https://github.com/tensorflow/model-optimization/issues/708)
H/W Info
- Odroid C4 : Cortex-A55, 64bit
- RPI3 : Cortex-A53, 32bit
- RPI4 : Cortex-A72, 64bit

Model : MobileNetV2(alpha=2.0)

| Model | Odroid C4 | RPI3 | RPI4 | Top-1 Acc | Model Size |
|---|---|---|---|---|---|
| Unpruned | 335ms | 586ms | 310ms | 75% | 42.7MB |
| Pruned | 235ms | 428ms | 164ms | 74.5% | 17.6MB |
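To make the MobileNetV2 numbers easier to compare, the short sketch below turns the unpruned and pruned rows into speedup factors and a size reduction. The values are copied from the table; the variable and function names are just for this example.

```python
# Latency (ms) per board, copied from the MobileNetV2 (alpha=2.0) table.
unpruned_ms = {"Odroid C4": 335, "RPI3": 586, "RPI4": 310}
pruned_ms = {"Odroid C4": 235, "RPI3": 428, "RPI4": 164}


def speedup(before_ms, after_ms):
    """How many times faster the pruned model runs."""
    return before_ms / after_ms


for board in unpruned_ms:
    factor = speedup(unpruned_ms[board], pruned_ms[board])
    print(f"{board}: {factor:.2f}x")

# Model size in MB, also from the table.
size_reduction = 1 - 17.6 / 42.7
print(f"Model size reduction: {size_reduction:.1%}")
```

This works out to roughly 1.43x on the Odroid C4, 1.37x on the RPI3, and 1.89x on the RPI4, with the model file about 58.8% smaller, at a cost of 0.5 points of top-1 accuracy.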

Model : Custom Model

| Model | RPI4 | Acc (CIFAR-10) |
|---|---|---|
| Unpruned | 4.100ms | 71.2% |
| Pruned | 2.845ms | 70.7% |

TFLite models: https://drive.google.com/file/d/1bgNVOsOtE6oj1hftPw2jgIIRjKNS_aTW/view?usp=sharing

The speedup is very good, but the restrictions on model architecture are a pity...
Model structure required for sparse inference to apply
- Sparse subgraph must start with a 3x3 stride-2 CONV_2D operator with padding 1 on each side, no dilation, and 3 input channels.
- Sparse subgraph must end with either a MEAN operator with reduction across spatial axes, or a DEPTH_TO_SPACE operator.
- Supported DEPTHWISE_CONV_2D configurations
  - 3x3 kernel, stride 1, no dilation, and padding 1
  - 3x3 kernel, stride 2, no dilation, and padding 1
  - 5x5 kernel, stride 1, no dilation, and padding 2
  - 5x5 kernel, stride 2, no dilation, and padding 2
- Reference: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/lite/delegates/xnnpack/README.md#sparse-inference
- Model families with this structure (MobileNet, EfficientNet) are covered
  - In our experience, larger models based on MobileNetV3 or EfficientNet-lite show similar performance improvements. The speed-up varies based on the relative contribution of 1x1 convolutions to the overall model.
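The structural rules above can be written down as a small checker, which is handy when sketching a custom architecture. This is a minimal sketch: the `Op` dataclass and `check_sparse_subgraph` are hypothetical names, not part of TFLite or XNNPACK, and the check covers only the rules quoted in this post (the README should be consulted for the full list of allowed operators).

```python
from dataclasses import dataclass


@dataclass
class Op:
    """Hypothetical, simplified description of one operator in a subgraph."""
    kind: str
    kernel: int = 0
    stride: int = 1
    padding: int = 0
    dilation: int = 1      # 1 means "no dilation"
    in_channels: int = 0


# (kernel, stride, padding) combinations listed above for DEPTHWISE_CONV_2D.
ALLOWED_DEPTHWISE = {(3, 1, 1), (3, 2, 1), (5, 1, 2), (5, 2, 2)}


def check_sparse_subgraph(ops):
    """Return a list of violated rules (empty if none of the quoted rules is broken)."""
    if not ops:
        return ["subgraph is empty"]
    problems = []
    first = ops[0]
    if not (first.kind == "CONV_2D" and first.kernel == 3 and first.stride == 2
            and first.padding == 1 and first.dilation == 1
            and first.in_channels == 3):
        problems.append("must start with 3x3 stride-2 CONV_2D, padding 1, 3 input channels")
    # Simplification: the reduction axes of MEAN are not modelled here.
    if ops[-1].kind not in ("MEAN", "DEPTH_TO_SPACE"):
        problems.append("must end with MEAN or DEPTH_TO_SPACE")
    for i, op in enumerate(ops):
        if op.kind != "DEPTHWISE_CONV_2D":
            continue
        config = (op.kernel, op.stride, op.padding)
        if op.dilation != 1 or config not in ALLOWED_DEPTHWISE:
            problems.append(f"op {i}: unsupported DEPTHWISE_CONV_2D configuration")
    return problems


# A MobileNet-like stem, one depthwise + 1x1 block, and a global average pool.
good = [
    Op("CONV_2D", kernel=3, stride=2, padding=1, in_channels=3),
    Op("DEPTHWISE_CONV_2D", kernel=3, stride=1, padding=1),
    Op("CONV_2D", kernel=1),
    Op("MEAN"),
]
# Stride-1 stem, 7x7 depthwise conv, and a SOFTMAX at the end.
bad = [
    Op("CONV_2D", kernel=3, stride=1, padding=1, in_channels=3),
    Op("DEPTHWISE_CONV_2D", kernel=7, stride=1, padding=3),
    Op("SOFTMAX"),
]
print(check_sparse_subgraph(good))
print(check_sparse_subgraph(bad))
```

The first list passes all three checks, while the second trips each of them once, which mirrors why architectures that stray from the MobileNet/EfficientNet pattern lose the sparse speedup.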


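9 comments

November 15, 2022

Hello. I ended up here while looking for material on pruning. Could you tell me where you found the information that "Fast Sparse ConvNets support started landing in TFMOT and TFLite (21.05.08)"? I searched for it but could not find anything!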


July 18, 2024

Hello, could you tell me which tool you used to run the tflite models?
Did you use the benchmark model tool that ships as an example inside tflite?
