Graceful shut down #3785
Merged
Jiang-Jia-Jun
merged 13 commits into
PaddlePaddle:develop
from
xiaolei373:graceful_shut_down
Sep 4, 2025
+151
−0
Commits, all authored by xiaolei373:
- f288d98 feat(log):add_request_and_response_log
- f083a85 Merge branch 'PaddlePaddle:develop' into develop
- 830785b Merge branch 'PaddlePaddle:develop' into develop
- 3e8d4ab Merge branch 'PaddlePaddle:develop' into develop
- 26611bf Merge branch 'PaddlePaddle:develop' into develop
- 775060f Merge branch 'PaddlePaddle:develop' into develop
- 545d8f4 Merge branch 'PaddlePaddle:develop' into develop
- 7a6de99 Merge branch 'PaddlePaddle:develop' into develop
- de928b1 Merge branch 'PaddlePaddle:develop' into develop
- 361022a Merge branch 'PaddlePaddle:develop' into develop
- 287aa28 Merge branch 'PaddlePaddle:develop' into develop
- ddeff2f Merge branch 'PaddlePaddle:develop' into develop
- 7ac0d02 Graceful shutdown: add a shutdown-timeout parameter to the interface
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
| # Graceful Service Node Shutdown Solution | ||
|
|
||
| ## 1. Core Objective | ||
| Achieve graceful shutdown of service nodes, ensuring no in-flight user requests are lost during service termination while maintaining overall cluster availability. | ||
|
|
||
| ## 2. Solution Overview | ||
| This solution combines **Nginx reverse proxy**, **Gunicorn server**, **Uvicorn server**, and **FastAPI** working in collaboration to achieve the objective. | ||
|
|
||
|  | ||
|
|
||
| ## 3. Component Introduction | ||
|
|
||
| ### 1. Nginx: Traffic Entry Point and Load Balancer | ||
| - **Functions**: | ||
| - Acts as a reverse proxy, receiving all external client requests and distributing them to upstream Gunicorn worker nodes according to load balancing policies. | ||
| - Actively monitors backend node health status through health check mechanisms. | ||
| - Enables instantaneous removal of problematic nodes from the service pool through configuration management, achieving traffic switching. | ||
|
|
||
| ### 2. Gunicorn: WSGI HTTP Server (Process Manager) | ||
| - **Functions**: | ||
| - Serves as the master process, managing multiple Uvicorn worker child processes. | ||
| - Receives external signals (e.g., `SIGTERM`) and coordinates the graceful shutdown process for all child processes. | ||
| - Supervises worker processes and automatically restarts them upon abnormal termination, ensuring service robustness. | ||
|
|
||
| ### 3. Uvicorn: ASGI Server (Worker Process) | ||
| - **Functions**: | ||
| - Functions as a Gunicorn-managed worker, actually handling HTTP requests. | ||
| - Runs the FastAPI application instance, processing specific business logic. | ||
| - Implements the ASGI protocol, supporting asynchronous request processing for high performance. | ||
|
|
||
| --- | ||
|
|
||
| ## Advantages | ||
|
|
||
| 1. **Nginx**: | ||
| - Can quickly isolate faulty nodes, ensuring overall service availability. | ||
| - Allows configuration updates without downtime using `nginx -s reload`, making it transparent to users. | ||
|
|
||
| 2. **Gunicorn** (Compared to Uvicorn's native multi-worker mode): | ||
| - **Mature Process Management**: Built-in comprehensive process spawning, recycling, and management logic, eliminating the need for custom implementation. | ||
| - **Process Supervision**: The Gunicorn master automatically forks a new worker when one crashes, whereas in Uvicorn's `--workers` mode (at least in older releases) a crashed worker is not restarted, so an external supervisor is needed. | ||
| - **Rich Configuration**: Offers numerous parameters for adjusting timeouts, number of workers, restart policies, etc. | ||
|
|
||
| 3. **Uvicorn**: | ||
| - Extremely fast, built on uvloop and httptools. | ||
| - Natively supports graceful shutdown: upon receiving a shutdown signal, it stops accepting new connections and waits for existing requests to complete before exiting. | ||
|
|
||
| --- | ||
|
|
||
| ## Graceful Shutdown Procedure | ||
|
|
||
| When a specific node needs to be taken offline, the steps are as follows: | ||
|
|
||
| 1. **Nginx Monitors Node Health Status**: | ||
| - Monitors the node's health status by periodically sending health check requests to it. | ||
|
|
||
| 2. **Removal from Load Balancing**: | ||
| - Modify the Nginx configuration to mark the target node as `down` and reload the Nginx configuration. | ||
| - Subsequently, all new requests will no longer be sent to the target node. | ||
|
|
||
| 3. **Gunicorn Server**: | ||
| - Monitors for stop signals. Upon receiving a stop signal (e.g., `SIGTERM`), it relays this signal to all Uvicorn child processes. | ||
|
|
||
| 4. **Sending the Stop Signal**: | ||
| - Send a `SIGTERM` signal to the Uvicorn process on the target node, triggering Uvicorn's graceful shutdown process. | ||
|
|
||
| 5. **Waiting for Request Processing**: | ||
| - Wait for a period slightly longer than `timeout_graceful_shutdown` before forcefully terminating the service, allowing the node sufficient time to complete processing all received requests. | ||
|
|
||
| 6. **Shutdown Completion**: | ||
| - The node has now processed all remaining requests and exited safely. |
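Steps 4 to 6 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tooling used in this PR: the helper name `stop_gracefully` and the `margin` parameter are hypothetical, and the node's server is assumed to be a child process held in a `subprocess.Popen` handle.

```python
import signal
import subprocess


def stop_gracefully(
    proc: subprocess.Popen,
    timeout_graceful_shutdown: float,
    margin: float = 5.0,  # hypothetical extra slack on top of the server's own timeout
) -> bool:
    """Send SIGTERM, wait slightly longer than the graceful timeout, then force-kill.

    Returns True if the process exited on its own within the waiting period,
    False if it had to be killed.
    """
    # Step 4: ask the server to begin its graceful shutdown.
    proc.send_signal(signal.SIGTERM)
    try:
        # Step 5: give in-flight requests time to finish.
        proc.wait(timeout=timeout_graceful_shutdown + margin)
        return True
    except subprocess.TimeoutExpired:
        # The waiting period elapsed: terminate forcefully and reap the process.
        proc.kill()
        proc.wait()
        return False
```

The waiting period is deliberately longer than `timeout_graceful_shutdown`, so the server's own timeout fires first and the forced kill is only a last resort.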
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,71 @@ | ||
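| # Graceful Service Node Shutdown Solution | ||
|
|
||
| ## 1. Core Objective | ||
| Achieve graceful shutdown of service nodes, ensuring that no in-flight user requests are lost when a service stops and that the availability of the overall cluster is unaffected. | ||
|
|
||
| ## 2. Solution Overview | ||
| This solution combines an **Nginx reverse proxy**, the **Gunicorn server**, the **Uvicorn server**, and **FastAPI** to achieve the objective. | ||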
|
|
||
|  | ||
|
|
||
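| ## 3. Component Introduction | ||
|
|
||
| ### 1. Nginx: Traffic Entry Point and Load Balancer | ||
| - **Functions**: | ||
| - Acts as a reverse proxy, receiving all external client requests and distributing them to the upstream Gunicorn worker nodes according to the load-balancing policy. | ||
| - Actively monitors the health of backend nodes through a health-check mechanism. | ||
| - Through configuration management, can instantly remove a problematic node from the service pool and switch traffic away from it. | ||
|
|
||
| ### 2. Gunicorn: WSGI HTTP Server (Process Manager) | ||
| - **Functions**: | ||
| - Serves as the master process, managing multiple Uvicorn worker child processes. | ||
| - Receives external signals (such as `SIGTERM`) and coordinates the graceful shutdown of all child processes. | ||
| - Supervises worker processes and automatically restarts them when they exit abnormally, keeping the service robust. | ||
|
|
||
| ### 3. Uvicorn: ASGI Server (Worker Process) | ||
| - **Functions**: | ||
| - Runs as a Gunicorn-managed worker and does the actual handling of HTTP requests. | ||
| - Runs the FastAPI application instance and processes the business logic. | ||
| - Implements the ASGI protocol, supporting asynchronous request processing for high performance. | ||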
|
|
||
| --- | ||
|
|
||
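| ## Advantages | ||
|
|
||
| 1. **Nginx**: | ||
| - Can quickly isolate faulty nodes, preserving overall service availability. | ||
| - Configuration can be updated without downtime using `nginx -s reload`, transparently to users. | ||
|
|
||
| 2. **Gunicorn** (compared with Uvicorn's native multi-worker mode): | ||
| - **Mature Process Management**: Built-in logic for spawning, recycling, and managing processes, so nothing has to be implemented by hand. | ||
| - **Process Supervision**: The Gunicorn master automatically forks a new worker when one crashes, whereas in Uvicorn's `--workers` mode (at least in older releases) a crashed worker is not restarted, so an external supervisor is needed. | ||
| - **Rich Configuration**: Offers many parameters for tuning timeouts, the number of workers, restart policies, and more. | ||
|
|
||
| 3. **Uvicorn**: | ||
| - Very fast, built on uvloop and httptools. | ||
| - Natively supports graceful shutdown: upon receiving a shutdown signal, it stops accepting new connections and waits for existing requests to complete before exiting. | ||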
|
|
||
| --- | ||
|
|
||
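| ## Graceful Shutdown Procedure | ||
|
|
||
| When a specific node needs to be taken offline, the steps are as follows: | ||
|
|
||
| 1. **Nginx Monitors Node Health Status**: | ||
| - Monitors the node's health by periodically sending health-check requests to it. | ||
|
|
||
| 2. **Removal from Load Balancing**: | ||
| - Modify the Nginx configuration to mark the node as `down`, then reload the Nginx configuration. | ||
| - From then on, no new requests are sent to the target node. | ||
|
|
||
| 3. **Gunicorn Server**: | ||
| - Listens for stop signals. Upon receiving one (such as `SIGTERM`), it sends the signal on to all Uvicorn child processes. | ||
|
|
||
| 4. **Sending the Stop Signal**: | ||
| - Send a `SIGTERM` signal to the Uvicorn process on the target node, triggering Uvicorn's graceful shutdown process. | ||
|
|
||
| 5. **Waiting for Request Processing**: | ||
| - Wait for a period slightly longer than `timeout_graceful_shutdown` before forcefully terminating the service, giving the node enough time to finish processing all requests it has already received. | ||
|
|
||
| 6. **Shutdown Completion**: | ||
| - At this point, the node has processed all remaining requests and exited safely. |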
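Step 2 of the procedure (marking the node `down` before reloading Nginx) could be scripted roughly as follows. This is a sketch under assumptions: the helper name `mark_server_down` is hypothetical, the upstream block is assumed to be available as plain text with one `server` directive per line, and the reload itself (`nginx -s reload`) is left to the operator.

```python
import re


def mark_server_down(upstream_conf: str, server_addr: str) -> str:
    """Add the `down` parameter to the `server` directive for `server_addr`.

    Existing parameters (e.g. `weight=2`) are preserved, and a directive that is
    already marked `down` is left unchanged.
    """
    pattern = re.compile(
        r"^(?P<indent>[ \t]*)server\s+"
        + re.escape(server_addr)
        + r"(?P<params>(?:\s+[^;]*)?);",
        re.MULTILINE,
    )

    def _replace(match: "re.Match[str]") -> str:
        params = match.group("params").split()
        if "down" not in params:
            params.append("down")
        return f"{match.group('indent')}server {server_addr} {' '.join(params)};"

    return pattern.sub(_replace, upstream_conf)
```

For example, given a line `server 10.0.0.2:8000 weight=2;` (an illustrative address), the function rewrites it to `server 10.0.0.2:8000 weight=2 down;` and leaves the other servers in the block untouched.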
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -77,9 +77,17 @@ | |
| help="max waiting time for connection, if set value -1 means no waiting time limit", | ||
| ) | ||
| parser.add_argument("--max-concurrency", default=512, type=int, help="max concurrency") | ||
|
|
||
| parser.add_argument( | ||
| "--enable-mm-output", action="store_true", help="Enable 'multimodal_content' field in response output. " | ||
| ) | ||
| parser.add_argument( | ||
| "--timeout-graceful-shutdown", | ||
| default=0, | ||
| type=int, | ||
| help="timeout for graceful shutdown in seconds (used by uvicorn)", | ||
| ) | ||
|
Comment on lines
+84
to
+89
|
||
|
|
||
| parser = EngineArgs.add_cli_args(parser) | ||
| args = parser.parse_args() | ||
|
|
||
|
|
@@ -431,6 +439,7 @@ def launch_api_server() -> None: | |
| workers=args.workers, | ||
| log_config=UVICORN_CONFIG, | ||
| log_level="info", | ||
| timeout_graceful_shutdown=args.timeout_graceful_shutdown, | ||
| ) # set log level to error to avoid log | ||
| except Exception as e: | ||
| api_server_logger.error(f"launch sync http server error, {e}, {str(traceback.format_exc())}") | ||
|
|
||
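The two hunks above wire a new CLI flag through to the server launch. A minimal, standalone sketch of that flow is shown below; it does not import uvicorn, the helper names `build_parser` and `uvicorn_kwargs` are hypothetical, and the `--workers` default of 1 is an assumption made only so the example is self-contained.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Assumed default; the real default of --workers is not shown in this diff.
    parser.add_argument("--workers", default=1, type=int, help="number of workers")
    # Same flag as added by this PR.
    parser.add_argument(
        "--timeout-graceful-shutdown",
        default=0,
        type=int,
        help="timeout for graceful shutdown in seconds (used by uvicorn)",
    )
    return parser


def uvicorn_kwargs(args: argparse.Namespace) -> dict:
    """Collect the keyword arguments that would be forwarded to uvicorn.run()."""
    return {
        "workers": args.workers,
        "log_level": "info",
        # argparse converts dashes to underscores in the attribute name.
        "timeout_graceful_shutdown": args.timeout_graceful_shutdown,
    }
```

Starting the server with `--timeout-graceful-shutdown 30` would then pass `timeout_graceful_shutdown=30` through to `uvicorn.run`, while omitting the flag keeps the PR's default of 0.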
[nitpick] The default value of 0 seconds effectively disables graceful shutdown. Consider using a more reasonable default like 30 seconds to provide better out-of-the-box behavior for production deployments.
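To illustrate the reviewer's point, here is a simplified model of a shutdown that waits for one in-flight request with a bounded timeout. It is a sketch of the general `asyncio.wait_for` pattern, not Uvicorn's actual shutdown code, and the function name `drain` is hypothetical.

```python
import asyncio
from typing import Optional


async def drain(request_seconds: float, timeout: Optional[float]) -> str:
    """Model one in-flight request and a shutdown that waits up to `timeout` seconds.

    Returns "completed" if the request finished, "cancelled" if the timeout won.
    """
    # The in-flight request, modelled as a task that takes `request_seconds`.
    in_flight = asyncio.create_task(asyncio.sleep(request_seconds, result="completed"))
    try:
        # timeout=None waits indefinitely; timeout=0 gives the request no time at all.
        return await asyncio.wait_for(in_flight, timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the task at this point.
        return "cancelled"
```

In this model a timeout of 0 cancels a request that is still running, while a positive timeout longer than the request lets it complete, which is why a non-zero default may be the safer choice.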