Paddle Model Inference on Jetson

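Check the current cmake version first. It should be newer than 3.13.0; otherwise the download step fails OpenSSL verification, because the libcurl built into older cmake versions was compiled without HTTPS support.

The error looks like this: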

log: Protocol "https" not supported or disabled in libcurl

cmake --version
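
The version check can also be scripted. The following is a minimal sketch, assuming the usual `cmake version X.Y.Z` first line printed by `cmake --version`; the function names and the `MIN_CMAKE` constant are illustrative, and the threshold is the 3.13.0 mentioned above.

```python
import re

MIN_CMAKE = (3, 13, 0)  # threshold taken from the text above


def parse_cmake_version(output):
    """Extract (major, minor, patch) from the text printed by `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)(?:\.(\d+))?", output)
    if match is None:
        raise ValueError("no cmake version found in: %r" % output)
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def cmake_is_new_enough(output, minimum=MIN_CMAKE):
    """Return True when the reported version is strictly newer than `minimum`."""
    return parse_cmake_version(output) > minimum
```

In practice the `output` argument would be the stdout of `cmake --version`, for example captured with `subprocess.run(["cmake", "--version"], capture_output=True, text=True).stdout`.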

Update cmake

version=3.25
build=1
mkdir -p ~/temp
cd ~/temp
wget https://cmake.org/files/v$version/cmake-$version.$build.tar.gz
tar -xzvf cmake-$version.$build.tar.gz
cd cmake-$version.$build/
./bootstrap  # this step takes a long time; ./configure can be run instead
make -j4     # run nproc to see how many cores are available; 4 are used here to speed up the build
sudo make install
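
For reference, this is how the `version` and `build` variables combine into the download URL and how the `make -j` value can follow the core count. It is only an illustrative sketch; the helper names are hypothetical.

```python
import os


def cmake_source_url(version, build):
    """Return the tarball URL used by the wget step, e.g. version="3.25", build=1."""
    name = "cmake-{}.{}".format(version, build)
    return "https://cmake.org/files/v{}/{}.tar.gz".format(version, name)


def make_command(jobs=None):
    """Build the `make -jN` command; N defaults to the number of CPU cores."""
    if jobs is None:
        jobs = os.cpu_count() or 1
    return ["make", "-j{}".format(jobs)]
```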

If the download of paddle2onnx-linux-aarch64-1.0.8rc0 fails, manually change PADDLE2ONNX_VERSION in cmake/paddle2onnx.cmake to 1.0.4rc0.
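
The manual edit can be scripted as well. This sketch assumes the file defines the version on a line of the form `set(PADDLE2ONNX_VERSION "...")`; check the actual file before relying on it. The function name is hypothetical.

```python
import re


def pin_paddle2onnx_version(cmake_text, new_version):
    """Replace the quoted value of PADDLE2ONNX_VERSION in the given cmake source text."""
    # Assumed line shape: set(PADDLE2ONNX_VERSION "1.0.8rc0")
    pattern = r'(set\(\s*PADDLE2ONNX_VERSION\s+")[^"]*("\s*\))'
    new_text, count = re.subn(
        pattern, lambda m: m.group(1) + new_version + m.group(2), cmake_text
    )
    if count == 0:
        raise ValueError("PADDLE2ONNX_VERSION definition not found")
    return new_text
```

To apply it, read cmake/paddle2onnx.cmake, pass its text through `pin_paddle2onnx_version(text, "1.0.4rc0")`, and write the result back.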

Inference with FastDeploy

https://www.paddlepaddle.org.cn/inference/v2.4/guides/install/download_lib.html#c
https://github.com/PaddlePaddle/FastDeploy/blob/release/1.0.2/docs/cn/build_and_install/jetson.md
[Paddle inference support list](https://ai.baidu.com/easyedge/app/adapt)

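Building the Jetson deployment library

On Jetson, FastDeploy currently supports only three inference backends: ONNX Runtime (CPU), TensorRT (GPU), and Paddle Inference.

Building and installing the C++ SDK

Build requirements:

gcc/g++ >= 5.4 (8.2 recommended)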
cmake >= 3.10.0
jetpack >= 4.6.1

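To integrate the Paddle Inference backend, go to the Paddle Inference prebuilt library page, download the Jetpack C++ package that matches your development environment, and extract it.

When FastDeploy is built with the BUILD_ON_JETSON switch turned on, ENABLE_ORT_BACKEND and ENABLE_TRT_BACKEND are enabled by default. In other words, only two backends are available by default: ONNX Runtime for CPU inference and TensorRT for GPU inference. As a result, selecting GPU inference without TensorRT does not take effect here and is automatically switched to CPU inference. To deploy HTTP serving, also pass -DENABLE_HTTP=ON.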

git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy
mkdir build && cd build
# ENABLE_PADDLE_BACKEND and PADDLEINFERENCE_DIRECTORY are optional; drop both if the Paddle Inference backend is not needed
cmake .. -DBUILD_ON_JETSON=ON \
         -DENABLE_VISION=ON \
         -DENABLE_TRT_BACKEND=ON \
         -DWITH_GPU=ON \
         -DENABLE_PADDLE_BACKEND=ON \
         -DPADDLEINFERENCE_DIRECTORY=/home/nvidia/paddle_inference_install_dir \
         -DCMAKE_INSTALL_PREFIX=${PWD}/installed_fastdeploy \
         -DOPENCV_DIRECTORY=/usr/lib/aarch64-linux-gnu/cmake/opencv4 \
         -DENABLE_HTTP=ON
# libevent must be built separately
cmake libevent
# Without the Paddle Inference backend, use this command instead
cmake .. -DBUILD_ON_JETSON=ON -DENABLE_VISION=ON -DCMAKE_INSTALL_PREFIX=${PWD}/installed_fastdeploy -DWITH_GPU=ON -DENABLE_TRT_BACKEND=ON -DOPENCV_DIRECTORY=/usr/lib/aarch64-linux-gnu/cmake/opencv4 -DTRT_DIRECTORY=/usr/src/tensorrt
make -j8
make install
source /home/nvidia/FastDeploy/build/installed_fastdeploy/fastdeploy_init.sh

Once the build finishes, the C++ inference library is generated in the directory given by CMAKE_INSTALL_PREFIX.
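
Long `-D` flag lists are easy to break with a stray comment or a missing backslash. One possible way to keep them readable is to hold the options in a dictionary and generate the command line. This is a sketch only; `cmake_command` and `JETSON_OPTIONS` are hypothetical names, and the option values are the ones used in this post.

```python
def cmake_command(options, source_dir=".."):
    """Turn an {option: value} mapping into a cmake argument list."""
    args = ["cmake", source_dir]
    for name, value in options.items():
        if isinstance(value, bool):
            value = "ON" if value else "OFF"  # cmake spells booleans ON/OFF
        args.append("-D{}={}".format(name, value))
    return args


# Options for the build without the Paddle Inference backend, as used above.
JETSON_OPTIONS = {
    "BUILD_ON_JETSON": True,
    "ENABLE_VISION": True,
    "WITH_GPU": True,
    "ENABLE_TRT_BACKEND": True,
    "OPENCV_DIRECTORY": "/usr/lib/aarch64-linux-gnu/cmake/opencv4",
    "TRT_DIRECTORY": "/usr/src/tensorrt",
}
```

The resulting list can be handed to `subprocess.run(cmake_command(JETSON_OPTIONS), check=True)` from inside the build directory.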

C++ inference

The following uses YOLOX as an example of inference deployment:
cd FastDeploy/examples/vision/detection/paddledetection/cpp/
mkdir build && cd build
cmake .. -DFASTDEPLOY_INSTALL_DIR=/home/nvidia/FastDeploy/build/installed_fastdeploy
make -j6
# CPU inference (run option 0)
./infer_yolox_demo ./yolox_convnext_s_36e_coco/ wsz138.jpg 0
# TensorRT inference on GPU (run option 2)
./infer_yolox_demo ./yolox_convnext_s_36e_coco/ wsz138.jpg 2

FastDeploy uses TRT-FP32 inference by default. To use TRT-FP16 instead, add one line to the code: option.EnableTrtFP16().

[Important] When switching between TRT-FP32 and TRT-FP16, remember to delete the saved TRT cache file first.

#include <iostream>
#include <string>

#include "fastdeploy/vision.h"

// Path separator, defined the same way as in the FastDeploy example sources.
#ifdef WIN32
const char sep = '\\';
#else
const char sep = '/';
#endif

void TrtInfer(const std::string& model_dir, const std::string& image_file) {
  auto model_file = model_dir + sep + "model.pdmodel";
  auto params_file = model_dir + sep + "model.pdiparams";
  auto config_file = model_dir + sep + "infer_cfg.yml";

  auto option = fastdeploy::RuntimeOption();
  option.UseGpu();
  option.UseTrtBackend();
  // Cache the built engine; delete this file when switching FP32 <-> FP16.
  option.SetTrtCacheFile("./tensorrt_cache/model.trt");
  option.EnableTrtFP16();  // omit this line to keep the default FP32
  auto model = fastdeploy::vision::detection::PaddleYOLOX(
      model_file, params_file, config_file, option);
  if (!model.Initialized()) {
    std::cerr << "Failed to initialize." << std::endl;
    return;
  }

  auto im = cv::imread(image_file);

  fastdeploy::vision::DetectionResult res;
  if (!model.Predict(im, &res)) {
    std::cerr << "Failed to predict." << std::endl;
    return;
  }
  std::cout << res.Str() << std::endl;
}
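
The reminder about deleting the TRT cache when changing precision is easy to forget. One way to automate it, sketched here under the assumption that a small sidecar file next to the cache may record the last precision used (the `.precision` suffix and the function name are made up for this example, not part of FastDeploy):

```python
import tempfile
from pathlib import Path


def reset_trt_cache_if_precision_changed(cache_file, precision):
    """Delete cache_file unless the sidecar file says it was built with `precision`."""
    cache = Path(cache_file)
    marker = Path(str(cache) + ".precision")  # hypothetical sidecar file
    previous = marker.read_text().strip() if marker.exists() else None
    removed = False
    if cache.exists() and previous != precision:
        cache.unlink()  # stale engine: unknown or different precision
        removed = True
    marker.parent.mkdir(parents=True, exist_ok=True)
    marker.write_text(precision)
    return removed


# Demonstration in a throwaway directory: fp32, fp32 again, then fp16.
with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp) / "tensorrt_cache" / "model.trt"
    cache.parent.mkdir()
    demo_results = []
    for precision in ("fp32", "fp32", "fp16"):
        cache.write_bytes(b"engine")  # stand-in for the engine FastDeploy would write
        demo_results.append(reset_trt_cache_if_precision_changed(cache, precision))
```

Calling the helper before the runtime is created means a cache of unknown or different precision is never reused.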

Building and installing the Python package

The build has the same requirements:

gcc/g++ >= 5.4 (8.2 recommended)
cmake >= 3.10.0
jetpack >= 4.6.1
python >= 3.6

sudo apt-get update && sudo apt-get install python-pip python3-pip
pip3 install cython

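Packaging the Python library depends on wheel, so run pip install wheel before building.

To integrate the Paddle Inference backend, go to the Paddle Inference prebuilt library page, download the Jetpack C++ package that matches your development environment, and extract it.

All build options are passed in through environment variables.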

wget https://paddle-inference-lib.bj.bcebos.com/2.4.1/cxx_c/Jetson/jetpack4.6_gcc7.5/xavier/paddle_inference_install_dir.tgz
sudo tar zxvf paddle_inference_install_dir.tgz -C /usr/local/src/

git clone https://github.com/PaddlePaddle/FastDeploy.git
cd FastDeploy/python
export BUILD_ON_JETSON=ON
export ENABLE_VISION=ON

# ENABLE_PADDLE_BACKEND and PADDLEINFERENCE_DIRECTORY are optional
export ENABLE_PADDLE_BACKEND=ON
export PADDLEINFERENCE_DIRECTORY=/usr/local/src/paddle_inference_install_dir/

export WITH_GPU=ON

# Enable the TensorRT backend
export ENABLE_TRT_BACKEND=ON
export TRT_DIRECTORY=/usr/src/tensorrt

python3 setup.py build
python3 setup.py bdist_wheel

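When the build finishes, the wheel package is generated in the FastDeploy/python/dist directory and can be installed directly with pip install.

If you change build options between builds, delete the build and .setuptools-cmake-build subdirectories under FastDeploy/python before rebuilding, so that cached results do not interfere.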
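
The cleanup step can be wrapped in a small helper. This is a minimal sketch: the two directory names come from the text above, while `clean_build_dirs` itself is a hypothetical name.

```python
import shutil
import tempfile
from pathlib import Path

STALE_DIRS = ("build", ".setuptools-cmake-build")


def clean_build_dirs(python_dir):
    """Remove cached build directories under python_dir; return the names removed."""
    removed = []
    for name in STALE_DIRS:
        target = Path(python_dir) / name
        if target.is_dir():
            shutil.rmtree(target)
            removed.append(name)
    return removed


# Demonstration on a throwaway directory standing in for FastDeploy/python.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("build", ".setuptools-cmake-build", "dist"):
        (Path(tmp) / name).mkdir()
    demo_removed = clean_build_dirs(tmp)
    demo_left = sorted(p.name for p in Path(tmp).iterdir())
```

Only the two cache directories are touched, so an existing dist directory with earlier wheels is left alone.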

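Python inference

Before deployment, the PaddleDetection model must be exported as a deployment model; for the export steps, see the Export Model document.

Notes

Do not remove the NMS operation when exporting the model; export it normally.
If the model will run on the native TensorRT backend (not the Paddle Inference backend), do not add the --trt argument.
Do not add the fuse_normalize=True argument when exporting the model.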

cd FastDeploy/examples/vision/detection/paddledetection/python
# Download the model files and a test image; a model exported from PaddleDetection can be used
# TensorRT inference on GPU
python3 infer_yolox.py --model_dir yolox_convnext_s_36e_coco --image wsz138.jpg --device gpu --use_trt True

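Parameters

model_file(str): path to the model file
params_file(str): path to the parameters file
config_file(str): path to the inference configuration YAML file
runtime_option(RuntimeOption): backend inference configuration; defaults to None, which uses the default configuration
model_format(ModelFormat): model format; defaults to Paddle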
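
To show how these parameters fit together, here is a rough Python counterpart of the C++ TrtInfer function. It is a sketch rather than the official example: the file names inside the model directory are taken from the C++ code in this post, `model_paths` and `trt_infer` are illustrative names, and it assumes the FastDeploy wheel built above and OpenCV are installed.

```python
import os


def model_paths(model_dir):
    """Map the constructor's file parameters to the files of an exported model."""
    return {
        "model_file": os.path.join(model_dir, "model.pdmodel"),
        "params_file": os.path.join(model_dir, "model.pdiparams"),
        "config_file": os.path.join(model_dir, "infer_cfg.yml"),
    }


def trt_infer(model_dir, image_file, use_fp16=False):
    """Run PaddleYOLOX on one image with the TensorRT backend on GPU."""
    # Imported lazily so model_paths() works without FastDeploy installed.
    import cv2
    import fastdeploy as fd

    option = fd.RuntimeOption()
    option.use_gpu()
    option.use_trt_backend()
    if use_fp16:
        option.enable_trt_fp16()  # default is FP32
    model = fd.vision.detection.PaddleYOLOX(
        runtime_option=option, **model_paths(model_dir)
    )
    return model.predict(cv2.imread(image_file))
```

For example, `trt_infer("yolox_convnext_s_36e_coco", "wsz138.jpg")` would correspond to the command line shown above.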

Kevin

https://knowledge-things.github.io/2023/01/04/paddle-xi-lie-jetson-tui-li/