Environment
- Libtorch 2.5.0.dev (latest nightly) (built with CUDA 12.4)
- CUDA 12.4
- TensorRT 10.1.0.27
- PyTorch 2.4.0+cu124
- Torch-TensorRT 2.4.0
- Python 3.12.8
- Windows 10
Torch-TensorRT was compiled with CMake to generate the .lib and .dll files
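One thing that may be worth ruling out is a version mismatch between the export side and the deployment side. The sketch below compares the major.minor versions from the environment list. It assumes the components are expected to agree on major.minor, which is a heuristic and not something the report confirms as the cause. The helper names `major_minor` and `find_mismatches` are hypothetical.

```python
import re


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from strings such as '2.5.0.dev' or '2.4.0+cu124'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def find_mismatches(versions: dict[str, str]) -> list[str]:
    """Return component names whose major.minor differs from the first entry."""
    names = list(versions)
    reference = major_minor(versions[names[0]])
    return [name for name in names[1:] if major_minor(versions[name]) != reference]


# Versions copied from the environment list; PyTorch is used as the reference.
env = {
    "PyTorch": "2.4.0+cu124",
    "Torch-TensorRT": "2.4.0",
    "Libtorch": "2.5.0.dev",
}

if __name__ == "__main__":
    for name in find_mismatches(env):
        print(f"{name} {env[name]} differs from PyTorch {env['PyTorch']}")
```

With the versions listed here, only the Libtorch entry is flagged, since it is a 2.5 nightly while the module was exported with PyTorch 2.4.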
Option : Export
If you want to optimize your model ahead-of-time and/or deploy in a C++ environment, Torch-TensorRT provides an export-style workflow that serializes an optimized module. This module can be deployed in PyTorch or with libtorch (i.e. without a Python dependency).
Step 1: Optimize + serialize
import torch
import torch_tensorrt
model = MyModel().eval().cuda() # define your model here
inputs = [torch.randn((1, 3, 224, 224)).cuda()] # define a list of representative inputs here
trt_gm = torch_tensorrt.compile(model, ir="dynamo", inputs=inputs)
torch_tensorrt.save(trt_gm, "trt.ep", inputs=inputs) # PyTorch only supports Python runtime for an ExportedProgram. For C++ deployment, use a TorchScript file
torch_tensorrt.save(trt_gm, "trt.ts", output_format="torchscript", inputs=inputs)
Step 2: Deploy
Deployment in C++:
#include "torch/script.h"
#include "torch_tensorrt/torch_tensorrt.h"
auto trt_mod = torch::jit::load("trt.ts");
auto input_tensor = [...]; // fill this with your inputs
auto results = trt_mod.forward({input_tensor});
ERROR
auto trt_mod = torch::jit::load("trt.ts");
Unknown type name '__torch__.torch.classes.tensorrt.Engine':
File "code/__torch__/torch_tensorrt/dynamo/runtime/_TorchTensorRTModule.py", line 6
training : bool
_is_full_backward_hook : Optional[bool]
engine : __torch__.torch.classes.tensorrt.Engine
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~`your text`~~~~~~~~ <--- HERE
def forward(self: __torch__.torch_tensorrt.dynamo.runtime._TorchTensorRTModule.TorchTensorRTModule,
input: Tensor) -> Tensor:
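The message says that the custom class `tensorrt.Engine` was not known to the process when `torch::jit::load` parsed the archive. A plausible cause, which is an assumption and not confirmed here, is that the Torch-TensorRT runtime library was never loaded by the C++ executable, so the class was never registered. As a small diagnostic, the sketch below lists the custom classes that a TorchScript file refers to. It relies on the archive being a zip file with generated source under a `code/` directory, as the path in the trace suggests. The function name `custom_classes_in_archive` is hypothetical.

```python
import re
import sys
import zipfile

# Matches references such as __torch__.torch.classes.tensorrt.Engine
# in the serialized module source.
CUSTOM_CLASS = re.compile(r"__torch__\.torch\.classes\.(\w+)\.(\w+)")


def custom_classes_in_archive(path: str) -> set[str]:
    """Return 'namespace.ClassName' for each custom class named in the archive's code."""
    found: set[str] = set()
    with zipfile.ZipFile(path) as archive:
        for name in archive.namelist():
            # Generated source lives under code/, possibly below a top-level folder.
            if "/code/" in f"/{name}" and name.endswith(".py"):
                text = archive.read(name).decode("utf-8", errors="replace")
                found.update(f"{ns}.{cls}" for ns, cls in CUSTOM_CLASS.findall(text))
    return found


if __name__ == "__main__":
    # Usage: python inspect_ts.py trt.ts
    if len(sys.argv) > 1:
        print(sorted(custom_classes_in_archive(sys.argv[1])))
```

If `tensorrt.Engine` shows up in the result, the file depends on a class that only exists once the Torch-TensorRT runtime is present in the loading process, so it is worth checking that the built DLL is actually loaded by the executable before `torch::jit::load` is called.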