ch_ppocr_mobile_v2_0_rec_v2_0模型在arm7hf平台上的输入tensor的shape不符合预期 #10658

lqian · 2025-05-05T08:43:51Z

为使您的问题得到快速解决，在建立 Issue 前，请您先通过如下方式搜索是否有相似问题: 历史 issue, FAQ 文档, 官方文档

建立 issue 时，为快速解决问题，请您根据使用情况给出如下信息：

标题：简洁、精准描述您的问题，例如“ssd 模型转换报错”
版本、环境信息：
1）Paddle Lite 版本：v2.10
2）Host 环境：Ubuntu 18.04
模型信息
1）模型名称 ch_ppocr_mobile_v2_0_rec_v2_0
2）模型链接 https://paddlelite-demo.bj.bcebos.com/NNAdapter/models/PaddleOCR/ch_ppocr_mobile_v2.0_rec_infer.tgz
复现信息：

Paddle-Lite/build.opt/lite/api/opt --model_file model.pdmodel --param_file model.pdiparams --optimize_out_type naive_buffer --optimize_out ./ch_ppocr_mobile_v2_0_rec_v2_0 --valid_targets arm 
Loading topology data from model.pdmodel
Loading params data from model.pdiparams
1. Model is successfully loaded!
2. Model is optimized and saved into ./ch_ppocr_mobile_v2_0_rec_v2_0.nb successfully

./build/PADDLE-LITE-TEST ./ch_ppocr_mobile_v2_0_rec_v2_0.nb  ./303.jpg ./ppocr_keys_v1.txt 
load 6623 dictionary tokens
[I  1/ 1  1:57:22.214 ...ace/Paddle-Lite/lite/core/device_info.cc:238 get_cpu_arch] Unknow cpu arch: 3079
[I  1/ 1  1:57:22.216 ...ace/Paddle-Lite/lite/core/device_info.cc:1118 Setup] ARM multiprocessors name: MODEL NAME      : ARMV7 PROCESSOR REV 5 (V7L)
HARDWARE        : LOMBOTECH-N7 (FLATTENED DEVICE TREE)

[I  1/ 1  1:57:22.218 ...ace/Paddle-Lite/lite/core/device_info.cc:1119 Setup] ARM multiprocessors number: 1
[I  1/ 1  1:57:22.218 ...ace/Paddle-Lite/lite/core/device_info.cc:1121 Setup] ARM multiprocessors ID: 0, max freq: 0, min freq: 0, cluster ID: 0, CPU ARCH: A-1
[I  1/ 1  1:57:22.218 ...ace/Paddle-Lite/lite/core/device_info.cc:1127 Setup] L1 DataCache size is: 
[I  1/ 1  1:57:22.218 ...ace/Paddle-Lite/lite/core/device_info.cc:1129 Setup] 32 KB
[I  1/ 1  1:57:22.219 ...ace/Paddle-Lite/lite/core/device_info.cc:1131 Setup] L2 Cache size is: 
[I  1/ 1  1:57:22.219 ...ace/Paddle-Lite/lite/core/device_info.cc:1133 Setup] 512 KB
[I  1/ 1  1:57:22.220 ...ace/Paddle-Lite/lite/core/device_info.cc:1135 Setup] L3 Cache size is: 
[I  1/ 1  1:57:22.220 ...ace/Paddle-Lite/lite/core/device_info.cc:1137 Setup] 0 KB
[I  1/ 1  1:57:22.220 ...ace/Paddle-Lite/lite/core/device_info.cc:1139 Setup] Total memory: 59556KB
create predictor :4198012 
input name: x
--------- showed input names ---------
--------- GetInputByName("x")---------
get input tensor shape dimension: 4 
input tensor shape dimension: 0 1 0 3
predictor->Run() before
predictor->Run() done
output predict shape 0 0 0
output name: save_infer_model/scale_0.tmp_1
--------- showed output names ---------
inference cost: 239.409#

问题描述：下载的paddlepaddle文件显示ch_ppocr_mobile_v2_0_rec_v2_0输入tensor shape应该是[x 3 32 100], 在测试代码中input_tensor0->Resize({1, 3, 32, 100});实际输出的是[0 1 0 3]，可能是什么原因导致的

The text was updated successfully, but these errors were encountered:

lqian · 2025-05-05T08:44:19Z

完整的测试代码

/*
 * paddle-lite-test.cpp
 *
 *  Created on: May 5, 2025
 */

#include <chrono>
#include <iostream>
#include <fstream>
#include <vector>
#include "opencv2/core.hpp"
#include "opencv2/imgcodecs.hpp"
#include "opencv2/imgproc.hpp"
#include "paddle_api.h"  // NOLINT

using namespace std;
using namespace paddle::lite_api;  // NOLINT
using namespace cv;


const std::vector<int> rec_image_shape{3, 32, 128};

cv::Mat CrnnResizeImg(cv::Mat img, float wh_ratio, int rec_image_height) {
  int imgC, imgH, imgW;
  imgC = rec_image_shape[0];
  imgH = rec_image_height;
  imgW = rec_image_shape[2];

  imgW = int(imgH * wh_ratio);

  float ratio = float(img.cols) / float(img.rows);
  int resize_w, resize_h;

  if (ceilf(imgH * ratio) > imgW)
    resize_w = imgW;
  else
    resize_w = int(ceilf(imgH * ratio));
  cv::Mat resize_img;
  cv::resize(img, resize_img, cv::Size(resize_w, imgH), 0.f, 0.f,
             cv::INTER_LINEAR);
  cv::copyMakeBorder(resize_img, resize_img, 0, 0, 0,
                     int(imgW - resize_img.cols), cv::BORDER_CONSTANT,
                     {127, 127, 127});
  return resize_img;
}

// fill tensor with mean and scale and trans layout: nhwc -> nchw, neon speed up
void neon_mean_scale(const float* din,
                     float* dout,
                     int size,
                     const std::vector<float> mean,
                     const std::vector<float> scale) {
  if (mean.size() != 3 || scale.size() != 3) {
    std::cerr << "[ERROR] mean or scale size must equal to 3\n";
    exit(1);
  }
  float32x4_t vmean0 = vdupq_n_f32(mean[0]);
  float32x4_t vmean1 = vdupq_n_f32(mean[1]);
  float32x4_t vmean2 = vdupq_n_f32(mean[2]);
  float32x4_t vscale0 = vdupq_n_f32(scale[0]);
  float32x4_t vscale1 = vdupq_n_f32(scale[1]);
  float32x4_t vscale2 = vdupq_n_f32(scale[2]);

  float* dout_c0 = dout;
  float* dout_c1 = dout + size;
  float* dout_c2 = dout + size * 2;

  int i = 0;
  for (; i < size - 3; i += 4) {
    float32x4x3_t vin3 = vld3q_f32(din);
    float32x4_t vsub0 = vsubq_f32(vin3.val[0], vmean0);
    float32x4_t vsub1 = vsubq_f32(vin3.val[1], vmean1);
    float32x4_t vsub2 = vsubq_f32(vin3.val[2], vmean2);
    float32x4_t vs0 = vmulq_f32(vsub0, vscale0);
    float32x4_t vs1 = vmulq_f32(vsub1, vscale1);
    float32x4_t vs2 = vmulq_f32(vsub2, vscale2);
    vst1q_f32(dout_c0, vs0);
    vst1q_f32(dout_c1, vs1);
    vst1q_f32(dout_c2, vs2);

    din += 12;
    dout_c0 += 4;
    dout_c1 += 4;
    dout_c2 += 4;
  }
  for (; i < size; i++) {
    *(dout_c0++) = (*(din++) - mean[0]) * scale[0];
    *(dout_c1++) = (*(din++) - mean[1]) * scale[1];
    *(dout_c2++) = (*(din++) - mean[2]) * scale[2];
  }
}


void pre_process(const cv::Mat& img,
                 int width,
                 int height,
                 const std::vector<float>& mean,
                 const std::vector<float>& scale,
                 float* data,
                 bool is_scale = false) {
  cv::Mat resized_img;
  if (img.cols != width || img.rows != height) {
    cv::resize(
        img, resized_img, cv::Size(width, height), 0.f, 0.f, cv::INTER_CUBIC);
  } else {
    resized_img = img;
  }
  cv::Mat imgf;
  float scale_factor = is_scale ? 1.f / 256 : 1.f;
  resized_img.convertTo(imgf, CV_32FC3, scale_factor);
  const float* dimg = reinterpret_cast<const float*>(imgf.data);
  neon_mean_scale(dimg, data, width * height, mean, scale);
}


int rec_image_height = 32;


vector<string> dict;

void print_shape(const char * prefix, const shape_t & shape)
{
	printf("%s shape", prefix);
	for (auto s: shape)
	{
		printf(" %ld", s);
	}
	printf("\n");
}

int main(int argc, char ** argv)
{
	if (argc < 4)
	{
		printf("usage: paddle-lite-test [/path/to/modelfile] [/path/to/image]  [/path/to/dict]\n");
		exit(1);
	}

	cv::Mat img = cv::imread(argv[2]);
	if (img.empty())
	{
		printf("error: empty image \n");
		exit(1);
	}


	ifstream in;
	in.open(argv[3]);
	if (in.is_open())
	{
		string token;
		while (in >> token)
		{
			dict.push_back(token);
		}
	}

	if (dict.size() == 0)
	{
		printf("expect an valid dictionary text file \n");
		exit(1);
	}
	printf("load %d dictionary tokens\n", dict.size());


	string model_file = argv[1];
	// 1. Set MobileConfig
	MobileConfig config;
	config.set_model_from_file(model_file);

	// 2. Create PaddlePredictor by MobileConfig
	std::shared_ptr<PaddlePredictor> predictor = CreatePaddlePredictor<MobileConfig>(config);
	printf("create predictor :%d \n", predictor.get());

	vector<string> inputNames = predictor->GetInputNames();
	for (const string & inputName : inputNames)
	{
		printf("input name: %s\n", inputName.c_str());
	}
	printf("--------- showed input names ---------\n");
	// 3. Prepare input data from image
	// only has one input
	std::unique_ptr<Tensor> input_tensor0(std::move(predictor->GetInputByName("x")));
	printf("--------- GetInputByName(\"x\")---------\n");  //
	input_tensor0->Resize({1, 3, 32, 100});
	shape_t shape = input_tensor0->shape(); //
	printf("get input tensor shape dimension: %d \n", shape.size());
	printf("input tensor shape dimension: %ld %ld %ld %ld\n", shape[0], shape[1], shape[2], shape[3]);

	auto* data = input_tensor0->mutable_data<float>();
	std::vector<float> mean = {127.5, 127.5, 127.5};
	std::vector<float> scale = {0.007843, 0.007843, 0.007843};  // 1/127.5
	pre_process(img, 100, 32, mean, scale, data, false);

//	NeonMeanScale(dimg, data0, resize_img.rows * resize_img.cols, mean, scale);
	auto inference_start = std::chrono::steady_clock::now();
	printf("predictor->Run() before\n");
	predictor->Run();
	printf("predictor->Run() done\n");
    // Get output and run postprocess
    std::unique_ptr<const Tensor> output_tensor0 = predictor->GetOutput(0);

    auto predict_shape = output_tensor0->shape();
    auto inference_end = std::chrono::steady_clock::now();
    print_shape("output predict", predict_shape);

    vector<string> outputNames = predictor->GetOutputNames();
    for (const string & name : outputNames)
    {
    	printf("output name: %s\n", name.c_str());
    }
    printf("--------- showed output names ---------\n");


    auto duration = std::chrono::duration_cast<std::chrono::microseconds>(inference_end - inference_start);
    printf("inference cost: %.3f", duration * 0.001);
    // ctc decode
    auto postprocess_start = std::chrono::steady_clock::now();

    auto *predict_batch = output_tensor0->data<float>();
}

ddchenhao66 · 2025-05-07T11:34:08Z

tensor的resize是很基础的操作，不太可能有bug，倾向于怀疑导出的nb模型有问题。建议检查下模型是否正常，多加些打印看看。

ddchenhao66 · 2025-05-08T02:45:32Z

另外可以看下头文件和对应的库是否对应上，以及拿input_tensor0可以用predictor->GetInput(0)的接口试试，看着像是拿到了一个非法tensor，内存读错了

paddle-bot bot assigned ddchenhao66 May 5, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ch_ppocr_mobile_v2_0_rec_v2_0模型在arm7hf平台上的输入tensor的shape不符合预期 #10658

ch_ppocr_mobile_v2_0_rec_v2_0模型在arm7hf平台上的输入tensor的shape不符合预期 #10658

lqian commented May 5, 2025

lqian commented May 5, 2025

Uh oh!

ddchenhao66 commented May 7, 2025

Uh oh!

ddchenhao66 commented May 8, 2025

Uh oh!

ch_ppocr_mobile_v2_0_rec_v2_0模型在arm7hf平台上的输入tensor的shape不符合预期 #10658

ch_ppocr_mobile_v2_0_rec_v2_0模型在arm7hf平台上的输入tensor的shape不符合预期 #10658

Comments

lqian commented May 5, 2025

lqian commented May 5, 2025

Uh oh!

ddchenhao66 commented May 7, 2025

Uh oh!

ddchenhao66 commented May 8, 2025

Uh oh!