[Fix] Add aarch64 (ARM64) support: auto-enable ONNX Runtime to fix SIGSEGV #17824
geoHeil wants to merge 1 commit into PaddlePaddle:main
Conversation
Thanks for your contribution!

Please fix the CodeStyle.
Force-pushed from b2cab23 to 78f6eaa
…GSEGV

Pre-built PaddlePaddle aarch64 wheels crash with SIGSEGV during both model loading (PIR executor) and inference (native kernels). This change:

1. Detects Linux aarch64 at runtime
2. Sets FLAGS_enable_pir_in_executor=0 (fixes the model loading crash)
3. Auto-enables HPI with ONNX Runtime when ultra-infer is installed (bypasses broken native inference kernels)
4. Disables MKL-DNN on aarch64 (x86-only)
5. Adds Docker infrastructure to build ultra-infer from source for aarch64 and run end-to-end tests

Requires a companion PR in PaddleX to add aarch64 to the HPI backend selection function (suggest_inference_backend_and_config).

Fixes PaddlePaddle#17590
Related: PaddlePaddle#16685

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from 78f6eaa to 325f4f2
Bobholamovic left a comment:
Thanks for the detailed and professional PR! The overall approach is clear and well thought out.
From what I can see, this is a practical short-term fix. However, from a longer-term evolution and maintenance perspective, I have a few concerns that I'd like to discuss with you:
First, regarding the PaddlePaddle ABI issue on aarch64: if the root cause lies within PaddlePaddle itself, would it make more sense to address it at the framework level in the long run, rather than working around it in downstream libraries (such as PaddleOCR) by switching inference engines? For PaddleOCR, PaddlePaddle-based inference is still a core capability, and many parts of the existing documentation and codebase assume its presence. While a workaround is understandable in the short term, relying on this approach long-term could introduce inconsistencies and additional maintenance overhead.
Second, about introducing ultra-infer: from a user perspective, building ultra-infer from source can be quite challenging, especially without comprehensive documentation. While providing prebuilt binaries can lower the barrier to entry, it also introduces additional maintenance overhead for the PaddleOCR dev team, particularly when multiple build configurations (e.g., different Dockerfiles) need to be maintained across different repositories (PaddleOCR and PaddleX). As additional context, we are planning to support inference directly via ONNX Runtime Python bindings in future PaddleOCR releases, to further simplify installation and usage. Meanwhile, due to some historical reasons, ultra-infer may gradually move toward a less actively maintained or even deprecated state. From this perspective, the long-term sustainability of this approach may need further consideration.
Finally, from a design standpoint, the current handling of aarch64 (e.g., auto-enabling HPI) feels somewhat ad hoc and case-specific, which makes the overall design less clean and may negatively impact long-term maintainability. In addition, ultra-infer is primarily maintained in PaddleX, while its aarch64 adaptations and build processes (including Dockerfiles) are maintained within PaddleOCR. This split in responsibility feels somewhat fragmented and could further increase maintenance costs over time.
Overall, this PR is valuable as a short-term workaround, but I do have some concerns about its role in the long-term direction. I’d really appreciate hearing your thoughts, especially if there are constraints or context that I might be missing.
Thanks for your response! I am not a core Paddle developer, so some of the architectural topics were unclear to me. What you write makes sense. Do you have a timeline for when you intend to support aarch64? Or would it possibly make sense to get there sooner?
For "enabling inference on the aarch64 architecture in PaddleOCR via ONNX Runtime", I expect it will likely be around May or June. |
|
If that holds true, that would be a viable timeline on my end. Should we close the PR then?
Works for me; feel free to close it.
Summary
Motivation
PaddlePaddle 3.x pre-built aarch64 wheels crash with SIGSEGV at two sites:

1. Model loading (PIR executor) — crashes on std::filesystem::path objects (fixable with FLAGS_enable_pir_in_executor=0)
2. Inference — the native kernels crash

This affects all users on Linux ARM64: Docker on Apple Silicon, Raspberry Pi, AWS Graviton, etc. Issues #17590 and #16685 have been open since October 2025 with no upstream PaddlePaddle fix.
The solution routes inference through ONNX Runtime via PaddleX's HPI (High-Performance Inference), which completely bypasses the broken native kernels.
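As a rough sketch, the runtime detection and flag workaround described here could look like the following. The helper names are illustrative (the actual change lives in paddleocr/_common_args.py), and the MKL-DNN flag name is an assumption; only FLAGS_enable_pir_in_executor and FLAGS_enable_pir_api are named in the PR.

```python
import os
import platform


def apply_aarch64_workarounds(env=None, system=None, machine=None):
    """Sketch: set Paddle flags for Linux aarch64 before paddle is imported.

    Pass os.environ as `env` in real use; `system`/`machine` parameters exist
    only so the logic can be exercised off-target.
    """
    env = os.environ if env is None else env
    system = system or platform.system()
    machine = machine or platform.machine()
    if system == "Linux" and machine == "aarch64":
        # Crash site 1: PIR executor SIGSEGV during model loading
        env.setdefault("FLAGS_enable_pir_in_executor", "0")
        env.setdefault("FLAGS_enable_pir_api", "0")
        # MKL-DNN is x86-only; this exact flag name is an assumption
        env.setdefault("FLAGS_use_mkldnn", "0")
    return env
```

Paddle typically reads FLAGS_* environment variables when it is first imported, which is presumably why the PR hooks this into paddleocr/_common_args.py rather than leaving it to user code.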
Changes
paddleocr/_common_args.py:
- Sets FLAGS_enable_pir_in_executor=0 and FLAGS_enable_pir_api=0 (fixes crash site 1)
- Auto-enables HPI when ultra-infer + onnxruntime + paddle2onnx are installed (fixes crash site 2)

deploy/docker/aarch64/Dockerfile: builds ultra-infer from source for aarch64 with the ONNX Runtime backend, then installs PaddleOCR

patch_paddlex_hpi.py: patches PaddleX's backend selection to support aarch64 (until the companion PR is merged)

test_aarch64.py: exercises both crash sites and verifies that text detection works

docker-compose.yml:
- aarch64-test service for automated testing
- aarch64 service for interactive debugging

Dependencies
Companion PR: PaddlePaddle/PaddleX#5048 — adds aarch64 to suggest_inference_backend_and_config() so HPI selects ONNX Runtime on ARM64. Once merged, the patch_paddlex_hpi.py workaround in the Dockerfile can be removed.

Test plan
- PaddleOCR(device="cpu") loads all 5 models without SIGSEGV
- End-to-end inference works ('HelloOCR' detected)
- docker compose up --build aarch64-test passes all tests

Fixes #17590
Related: #16685
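The patch_paddlex_hpi.py shim mentioned above presumably follows the standard monkeypatch pattern: wrap the stock selection function and force ONNX Runtime on aarch64. Below is a self-contained sketch against a stand-in module; the real PaddleX module path, the (backend, config) return shape, and the empty config are all assumptions, not the actual PaddleX API.

```python
import platform
import types

# Stand-in for the PaddleX module that hosts the selection function; a real
# shim would import it from paddlex instead (exact module path is assumed).
hpi = types.SimpleNamespace()


def _stock_suggest(model_info):
    # Pretend the stock implementation knows nothing about aarch64.
    raise RuntimeError("unsupported architecture")


hpi.suggest_inference_backend_and_config = _stock_suggest


def patch_backend_selection(module, machine=None):
    """Wrap suggest_inference_backend_and_config to fall back to ORT on aarch64."""
    machine = machine or platform.machine()
    original = module.suggest_inference_backend_and_config

    def patched(model_info):
        if machine == "aarch64":
            # Assumed return shape: (backend_name, backend_config)
            return "onnxruntime", {}
        return original(model_info)

    module.suggest_inference_backend_and_config = patched


patch_backend_selection(hpi, machine="aarch64")
backend, config = hpi.suggest_inference_backend_and_config({})
print(backend)  # onnxruntime
```

Keeping a reference to the original function means non-aarch64 callers see unchanged behavior, which matches the PR's intent of a removable workaround.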
🤖 Generated with Claude Code