[MLOps] Converting a PyTorch Model to TensorRT

Ellie · February 15, 2023



PyTorch Official Docs

Torch-TensorRT converts a PyTorch model into a TensorRT model.
TensorRT is an inference engine from NVIDIA that applies optimizations such as kernel fusion, graph optimization, and low-precision execution.

In this tutorial, converting a model from PyTorch to TensorRT™ involves the following general steps:

  1. Build a PyTorch model using either of the following options:

    Train a model in PyTorch
    Get a pre-trained model from the PyTorch ModelZoo, other model repository, or directly from Deci’s SuperGradients, an open-source PyTorch-based deep learning training library.

  2. Convert the PyTorch model to ONNX.

  3. Convert from ONNX to TensorRT™.

How to convert a Transformers model to TensorRT
The traditional approach is to export the trained PyTorch model to ONNX first, then convert it to TensorRT (two conversion steps):

Train a model using PyTorch
Convert the model to ONNX format
Use NVIDIA TensorRT for inference
However, Torch-TensorRT (https://github.com/pytorch/TensorRT) can compile a PyTorch model to TensorRT directly, without the intermediate ONNX step.

There is another library, torch2trt (https://github.com/NVIDIA-AI-IOT/torch2trt), but it appears to be abandoned: no updates since November 2022.

First, install the Python packages below:

pip3 install nvidia-pyindex
pip3 install nvidia-tensorrt
pip3 install torch-tensorrt==<VERSION> -f 


If you are using the DeepSpeed Docker image, these packages are already installed in the container.

Official repo example:

Method 1. TensorRT CLI (trtexec)

Official Docs : https://docs.nvidia.com/deeplearning/tensorrt/quick-start-guide/index.html#save-model
Method 2. Python TensorRT
2-1. tensorrt


python3 -m pip install --upgrade tensorrt

Version check

import tensorrt
assert tensorrt.Builder(tensorrt.Logger())

Convert the model

trtexec --onnx=resnet50/model.onnx --saveEngine=resnet_engine.trt

The --onnx flag tells trtexec where to find the ONNX model, and --saveEngine tells it where to save the optimized TensorRT engine. The .trt and .plan extensions are both commonly used for serialized engines, so the following is equivalent:

trtexec --onnx=resnet50/model.onnx --saveEngine=resnet_engine.plan
2-2. Torch-TensorRT library
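With Torch-TensorRT, the ONNX step is skipped entirely and the PyTorch model is compiled to TensorRT directly. A minimal sketch, assuming a CUDA-capable GPU; the torchvision ResNet-50, the input shape, and the `enabled_precisions` value are example choices, not requirements:

```python
import torch
import torch_tensorrt
import torchvision.models as models

# Example model; substitute your own trained network
model = models.resnet50(pretrained=True).eval().cuda()

# Compile the PyTorch model straight to a TensorRT-backed module
trt_model = torch_tensorrt.compile(
    model,
    inputs=[torch_tensorrt.Input((1, 3, 224, 224))],
    enabled_precisions={torch.float16},  # example: allow FP16 kernels
)

x = torch.randn(1, 3, 224, 224).cuda()
with torch.no_grad():
    out = trt_model(x)
```

When the TorchScript frontend is used, the compiled module can be persisted with `torch.jit.save` and reloaded later without recompiling.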


A bit nerdy
