C++ inference is several times slower than Python; how can this be fixed? #13900
Comments
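I reported this as well, but there was no follow-up: #10880
Run it twice in a row and see; the second run should not be slow.
It is not a model initialization problem. The slow part is exactly the line where the inference engine runs inference.
@dyning could you please take a look?
I suspect a cold start, which is why I suggest running inference twice in a row first.
I am not sure whether it is a cold-start problem, but I tested with a set of images, not a single image.
So is every image slow? Keep 2 identical images and see what the result is.
I no longer have the Mac M2 environment. As I recall, I tested many times and the elapsed time was about the same on every run; I did not compare individual images.
Yes, one group is enough: include 2 identical images, run once, and look at the time taken for each of the two images. This can also be run in a CPU environment.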
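The check suggested above (time each image separately, with the same image appearing twice) can be sketched as follows. This is only a minimal illustration: `run_inference` is a hypothetical callable standing in for whatever wraps the real predictor call, and the `ratio` threshold of 2.0 is an arbitrary assumption, not something taken from PaddleOCR.

```python
import time
from typing import Callable, List, Sequence


def time_each_call(run_inference: Callable[[object], object],
                   inputs: Sequence[object]) -> List[float]:
    """Return the wall-clock seconds spent in run_inference for each input."""
    durations = []
    for item in inputs:
        start = time.perf_counter()
        run_inference(item)  # hypothetical wrapper around the real predictor call
        durations.append(time.perf_counter() - start)
    return durations


def looks_like_cold_start(durations: Sequence[float], ratio: float = 2.0) -> bool:
    """True if the first call took at least `ratio` times the slowest later call."""
    if len(durations) < 2:
        raise ValueError("need at least two timed calls")
    return durations[0] >= ratio * max(durations[1:])
```

If the first of two identical images is much slower than the second, a cold start is a plausible explanation; if both take about the same time, the slowdown is more likely in the steady-state run itself.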
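Switch to a different Paddle Inference build, or run with ONNX, OpenVINO, or similar; you could also try PaddleX.
I looked at PaddleX; it seems to be all Python, including the inference part. Could it be packaged directly with Nuitka?
@RemyHaijie For PaddleX questions, please ask in the PaddleX repository; see the PaddleX deployment-related pipeline documentation.
If I switch Paddle Inference, which build should I switch to? I want to run on CPU; the C++ side currently uses the MKL build.
@XiaoDongGuoGuo I could not find Paddle Inference 2.7.0. Could you provide a link? Also, the main question here is why the Python version is faster than the C++ version; the bottleneck is not the hardware.
Sorry, that was a typo. For Paddle Inference, use v2.6: https://www.paddlepaddle.org.cn/inference/v2.6/guides/install/download_lib.html#windows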
🔎 Search before asking
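🐛 Bug (Problem description)
With the same image, the same model, and the same configuration, Python takes only 1 second while C++ takes more than 5 seconds. The slowdown is reproducible with other images as well.
🏃♂️ Environment (Runtime environment)
The C++ side uses the latest release of the Paddle engine, 2.8.1, built with VS2022.
The Python side is also 2.8.1.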
Name: paddleocr
Version: 2.8.1
Summary: Awesome OCR toolkits based on PaddlePaddle(8.6M ultra-lightweight pre-trained model, support training and deployment among server, mobile, embedded and IoT devices)
Home-page: https://github.com/PaddlePaddle/PaddleOCR
Author:
Author-email: PaddlePaddle [email protected]
License: Apache License 2.0
Location: C:\Users\Remy\AppData\Local\Programs\Python\Python312\Lib\site-packages
Requires: beautifulsoup4, cython, fire, fonttools, imgaug, lmdb, numpy, opencv-contrib-python, opencv-python, Pillow, pyclipper, python-docx, pyyaml, rapidfuzz, requests, scikit-image, shapely, tqdm
Parameters for the recognition model:
C++ parameters:
this 0x000001ecde9646c0 {predictor_=empty use_gpu_=false gpu_id_=0 ...} PaddleOCR::CRNNRecognizer *
predictor_ empty std::shared_ptr<paddle_infer::Predictor>
use_gpu_ false bool
gpu_id_ 0 int
gpu_mem_ 4000 int
cpu_math_library_num_threads_ 10 int
use_mkldnn_ false bool
label_list_ { size=6625 } std::vector<std::string,std::allocator<std::string>>
mean_ { size=3 } std::vector<float,std::allocator<float>>
scale_ { size=3 } std::vector<float,std::allocator<float>>
is_scale_ true bool
use_tensorrt_ false bool
precision_ "fp16" std::string
rec_batch_num_ 6 int
rec_img_h_ 48 int
rec_img_w_ 320 int
rec_image_shape_ { size=3 } std::vector<int,std::allocator<int>>
resize_op_ {...} PaddleOCR::CrnnResizeImg
normalize_op_ {...} PaddleOCR::Normalize
permute_op_ {...} PaddleOCR::PermuteBatch
cpu_math_library_num_threads 10 const int &
gpu_id 0 const int &
gpu_mem 4000 const int &
label_path "D:/Code/gitClone/PaddleOCR-2.8.1/deploy/cpp_infer/build/Release/ppocr_keys_v1.txt" const std::string &
model_dir "D:\softwarePake\ch_PP-OCRv3_rec_infer" const std::string &
precision "fp16" const std::string &
rec_batch_num 6 const int &
rec_image_shape { size=3 } std::vector<int,std::allocator<int>>
rec_img_h 48 const int &
rec_img_w 320 const int &
use_gpu false const bool &
use_mkldnn false const bool &
use_tensorrt false const bool &
Python parameters:
Namespace(help='==SUPPRESS==', use_gpu=False, use_xpu=False, use_npu=False, use_mlu=False, ir_optim=True, use_tensorrt=False, min_subgraph_size=15, precision='fp32', gpu_mem=500, gpu_id=0, image_dir=None, page_num=0, det_algorithm='DB', det_model_dir='C:\Users\44684/.paddleocr/whl\det\ch\ch_PP-OCRv4_det_infer', det_limit_side_len=960, det_limit_type='max', det_box_type='quad', det_db_thresh=0.3, det_db_box_thresh=0.6, det_db_unclip_ratio=1.5, max_batch_size=10, use_dilation=False, det_db_score_mode='fast', det_east_score_thresh=0.8, det_east_cover_thresh=0.1, det_east_nms_thresh=0.2, det_sast_score_thresh=0.5, det_sast_nms_thresh=0.2, det_pse_thresh=0, det_pse_box_thresh=0.85, det_pse_min_area=16, det_pse_scale=1, scales=[8, 16, 32], alpha=1.0, beta=1.0, fourier_degree=5, rec_algorithm='SVTR_LCNet', rec_model_dir='C:\Users\44684/.paddleocr/whl\rec\ch\ch_PP-OCRv4_rec_infer', rec_image_inverse=True, rec_image_shape='3, 48, 320', rec_batch_num=6, max_text_length=25, rec_char_dict_path='c:\Users\44684\AppData\Local\Programs\Python\Python312\Lib\site-packages\paddleocr\ppocr\utils\ppocr_keys_v1.txt', use_space_char=True, vis_font_path='./doc/fonts/simfang.ttf', drop_score=0.5, e2e_algorithm='PGNet', e2e_model_dir=None, e2e_limit_side_len=768, e2e_limit_type='max', e2e_pgnet_score_thresh=0.5, e2e_char_dict_path='./ppocr/utils/ic15_dict.txt', e2e_pgnet_valid_set='totaltext', e2e_pgnet_mode='fast', use_angle_cls=True, cls_model_dir='C:\Users\44684/.paddleocr/whl\cls\ch_ppocr_mobile_v2.0_cls_infer', cls_image_shape='3, 48, 192', label_list=['0', '180'], cls_batch_num=6, cls_thresh=0.9, enable_mkldnn=False, cpu_threads=10, use_pdserving=False, warmup=False, sr_model_dir=None, sr_image_shape='3, 32, 128', sr_batch_num=1, draw_img_save_dir='./inference_results', save_crop_res=False, crop_res_save_dir='./output', use_mp=False, total_process_num=1, process_id=0, benchmark=False, save_log_path='./log_output/', show_log=True, use_onnx=False, return_word_box=False, 
output='./output', table_max_len=488, table_algorithm='TableAttn', table_model_dir=None, merge_no_span_structure=True, table_char_dict_path=None, layout_model_dir=None, layout_dict_path=None, layout_score_threshold=0.5, layout_nms_threshold=0.5, kie_algorithm='LayoutXLM', ser_model_dir=None, re_model_dir=None, use_visual_backbone=True, ser_dict_path='../train_data/XFUND/class_list_xfun.txt', ocr_order_method=None, mode='structure', image_orientation=False, layout=True, table=True, ocr=True, recovery=False, use_pdf2docx_api=False, invert=False, binarize=False, alphacolor=(255, 255, 255), lang='ch', det=True, rec=True, type='ocr', savefile=False, ocr_version='PP-OCRv4', structure_version='PP-StructureV2')
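To compare the two dumps above field by field, a small diff helper can be useful. The sketch below copies the recognition-related values from the C++ debugger view and the Python `Namespace`. The shared key names (`cpu_threads`, `use_mkldnn`, `rec_model`) and the helper `diff_configs` are assumptions introduced here for illustration; they are not part of PaddleOCR.

```python
from typing import Dict, Tuple

# Values copied from the C++ debugger dump; key names normalized by assumption.
cpp_rec = {
    "use_gpu": False,
    "cpu_threads": 10,       # cpu_math_library_num_threads_
    "use_mkldnn": False,
    "use_tensorrt": False,
    "precision": "fp16",
    "gpu_mem": 4000,
    "rec_batch_num": 6,
    "rec_img_h": 48,
    "rec_img_w": 320,
    "rec_model": "ch_PP-OCRv3_rec_infer",  # last component of model_dir
}

# Values copied from the Python Namespace dump; key names normalized by assumption.
python_rec = {
    "use_gpu": False,
    "cpu_threads": 10,
    "use_mkldnn": False,     # enable_mkldnn
    "use_tensorrt": False,
    "precision": "fp32",
    "gpu_mem": 500,
    "rec_batch_num": 6,
    "rec_img_h": 48,         # from rec_image_shape='3, 48, 320'
    "rec_img_w": 320,
    "rec_model": "ch_PP-OCRv4_rec_infer",  # last component of rec_model_dir
}


def diff_configs(a: Dict[str, object], b: Dict[str, object]) -> Dict[str, Tuple[object, object]]:
    """Return {key: (value_in_a, value_in_b)} for every key whose values differ."""
    keys = sorted(set(a) | set(b))
    return {k: (a.get(k), b.get(k)) for k in keys if a.get(k) != b.get(k)}


differences = diff_configs(cpp_rec, python_rec)
for key, (cpp_value, py_value) in differences.items():
    print(f"{key}: C++={cpp_value!r}  Python={py_value!r}")
```

As pasted, the two dumps differ in `precision`, `gpu_mem`, and the recognition model directory (PP-OCRv3 on the C++ side, PP-OCRv4 on the Python side). Whether any of these accounts for the slowdown is not established in this thread; the diff only shows that the two runs were not configured identically.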
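🌰 Minimal Reproducible Example
Debugging eventually showed that the rec model is the slowest part.
I went down to the source level: the Python and C++ code do not differ in any meaningful way, except that the step that runs inference shows a clear speed difference. In C++, this is the following line in ocr_rec.cpp:
this->predictor_->Run()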