Commit 1b1276a

Translate some documents (#393)
2 parents 226f326 + d67c6f0 commit 1b1276a

6 files changed: +145 -7 lines changed

README.md

Lines changed: 3 additions & 0 deletions
@@ -185,6 +185,9 @@ fn main() {
 
 ## ⚓ Learn More
 
+- [Project Overview](core/docs/en/overview.md)
+- [Background](docs/en/background.md)
+- [Why Rust](docs/en/why-rust.md)
 - [Coroutine Overview](core/docs/en/coroutine.md)
 - [Scalable Stack Overview](core/docs/en/scalable-stack.md)
 - [Monitor Overview](core/docs/en/monitor.md)

core/docs/en/overview.md

Lines changed: 2 additions & 4 deletions
@@ -17,10 +17,8 @@ author: loongs-zhang
 The `open-coroutine` is a simple, efficient and generic stackful-coroutine library; you can use it as a performance
 replacement for IO thread pools, see [why better](../en/why-better.md).
 
-[//]: # (todo: add English versions of the documents)
-
-- [Background](../../../docs/cn/background.md)
-- [Why Rust](../../../docs/cn/why-rust.md)
+- [Background](../../../docs/en/background.md)
+- [Why Rust](../../../docs/en/why-rust.md)
 - [Why Better](../en/why-better.md)
 - [Quick Start](../../../README.md)
 - [Coroutine Overview](../en/coroutine.md)

docs/cn/background.md

Lines changed: 3 additions & 1 deletion
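@@ -6,6 +6,8 @@ author: loongs-zhang
 
 # Reason for Birth
 
+[English](../en/background.md) | 中文
+
 ## The thread pool needs to be optimized
 
 In the early days, to support concurrent access to service applications by multiple users, programmers often used a multi-process approach, creating one service process for each TCP connection. Around 2000, writing web services with CGI was quite popular, and the web server most people used at the time was the multi-process-based Apache
@@ -54,7 +56,7 @@ PS: assuming a single thread, a CPU time slice of 1s, and 100 tasks, fair scheduling means each
 
 Which coroutine technology is the strongest? Among programming languages, look to golang. However, as I studied further, I found several shortcomings of `goroutine`:
 
-1. `Not strictly thread-per-core`. The goroutine runtime is also backed by a thread pool, and the maximum number of threads in this pool is 256, a number far larger than the thread count under thread-per-core;
+1. `Not thread-per-core`. The goroutine runtime is also backed by a thread pool, and the maximum number of threads in this pool is 256, a number generally far larger than the thread count under thread-per-core, and the scheduling threads are not bound to CPUs
 2. `Preemptive scheduling interrupts running system calls`. If the system call takes a long time to complete, it will obviously be interrupted many times, lowering overall performance;
 3. `goroutine falls clearly short of peak performance`. Compared with the C/C++ coroutine libraries next door, their performance can even reach 1.5 times that of goroutine;
 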
6062

docs/cn/why-rust.md

Lines changed: 4 additions & 2 deletions
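@@ -6,6 +6,8 @@ author: loongs-zhang
 
 # Language Selection
 
+[English](../en/why-rust.md) | 中文
+
 Which language should be used to develop open-coroutine? This is a very important question; after all, different languages have different features, and the choice of language has a great impact on the final result.
 
 When I was researching C coroutine libraries earlier, I saw that experts had already tried writing a dynamic link library in C and calling it from Java through JNI, and ultimately failed. The specific reason can only be learned by digging into the JVM source code, which is too deep for me, so farewell; JVM bytecode languages such as java/kotlin are therefore excluded.
@@ -16,8 +18,8 @@ author: loongs-zhang
 
 Judging from the several coroutine libraries written in C that I have studied, C is somewhat lacking in expressiveness and requires writing a huge amount of code. By comparison, C++ is much more expressive, but its development efficiency is still rather low, mainly in the following respects:
 
-1. `Constantly having to write cmake` to tell the system how to compile it, which is a bit troublesome, and this is really a part that shouldn't need much attention
-2. `Troublesome dependency management`. To use a library written by someone else, you pull down the code, put it into your own project, and then need to spend a lot of time getting it to compile. If the library the other party depends on has no other dependencies, fine; once it does, the dependencies of that dependency must also be handled by the steps just described, which is very troublesome;
+1. `Having to write cmake`. It exists purely to tell the system how to compile, which is a bit troublesome, and this is really a part that shouldn't need any attention
+2. `Troublesome dependency management`. To use a library written by someone else, you need to pull down the code, put it into your own project, and then have to spend a lot of time getting it to compile. If the other party's library has no other dependencies, fine; once it does, the dependencies of that dependency must also be handled by the steps just described, which is very troublesome;
 3. `Memory unsafety`. It is hard to write C++ code without memory leaks or crashes.
 
 <div style="text-align: center;">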

docs/en/background.md

Lines changed: 95 additions & 0 deletions
@@ -0,0 +1,95 @@
+---
+title: Reason for Birth
+date: 2025-02-24 17:08:33
+author: loongs-zhang
+---
+
+# Reason for Birth
+
+English | [中文](../cn/background.md)
+
+## The thread pool needs to be optimized
+
+In the early days, developers often adopted multiprocessing to support concurrent access to service applications by
+multiple users, creating a service process for each TCP connection. Around 2000, it was quite popular to use CGI to
+write web services, and the most commonly used web server at that time was the Apache 1.3.x series, which was developed
+based on the multiprocessing model. Because processes occupy more system resources than threads,
+people started using multithreading (usually with thread pools) to develop web service applications, which
+increased the user concurrency supported by a single server, but resources are still wasted.
+
+In 2020, I joined company V. Because the thread pool in an internal system was occasionally exhausted,
+and because my team lead had
+read [《Java线程池实现原理及其在美团业务中的实践》](https://tech.meituan.com/2020/04/02/java-pooling-pratice-in-meituan.html),
+we decided to build our own dynamic thread pool. Judging from the process, the results were good:
+
+<div style="text-align: center;">
+<img src="/docs/img/begin.jpg" width="50%">
+</div>
+
+But this doesn't fundamentally solve the problem. As is well known, thread context switching has a cost, and the
+more threads there are, the greater that cost becomes. For CPU-intensive tasks, simply ensuring that
+the number of threads equals the number of CPU cores and binding each thread to a specific CPU core (hereinafter
+referred to as `thread-per-core`) yields optimal performance. For IO-intensive tasks, since the tasks almost
+always block threads, the cost of thread context switching is generally less than the blocking cost. However, when the
+number of threads is too large, the cost of thread context switching exceeds the blocking cost.
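
One way to picture the `thread-per-core` sizing rule described above is the following Rust sketch. It assumes only the standard library; `worker_count` and `run_one_worker_per_core` are hypothetical names, not open-coroutine APIs. Binding each thread to a CPU core is left as a comment because the standard library has no affinity API (on Linux it would typically go through `sched_setaffinity`).

```rust
use std::thread;

/// Number of workers for a thread-per-core layout: one per available core.
/// Falls back to 1 if the platform cannot report its parallelism.
fn worker_count() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}

/// Spawns exactly one worker per core and returns the core index each worker
/// was assigned, in spawn order.
fn run_one_worker_per_core() -> Vec<usize> {
    let n = worker_count();
    thread::scope(|s| {
        let handles: Vec<_> = (0..n)
            .map(|core| {
                s.spawn(move || {
                    // Assumption: a real runtime would pin this thread to `core`
                    // here; pinning is omitted to stay within the standard library.
                    core
                })
            })
            .collect();
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Calling `run_one_worker_per_core()` from `main` on an 8-core machine would spawn 8 workers; the point of the sketch is only that the thread count follows the core count rather than the task count.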
+
+The essence of a dynamic thread pool is to adjust the number of threads so that the cost of thread context switching
+stays below the cost of blocking as far as possible. Since the adjustment is manual, this cannot be guaranteed.
+
+<div style="text-align: center;">
+<img src="/docs/img/run.jpg" width="50%">
+</div>
+
+## The pain of using NIO
+
+Is there a technology that can perform IO-intensive tasks with performance comparable to multithreading while ensuring
+thread-per-core? The answer is `NIO`, but there are still some limitations or unfriendly aspects:
+
+1. The NIO API is more complex to use than the BIO API;
+2. System calls such as sleep still block threads. Achieving optimal performance is equivalent to forbidding all
+blocking calls, which is unfriendly to developers;
+3. In thread pool mode, a single thread can only execute the next task after the current task has
+completed, so fair scheduling between tasks cannot be achieved;
+
+Note: Assuming a single thread with a CPU time slice of 1 second and 100 tasks, fair scheduling means that each task
+gets an equal 10ms share of that time slice.
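
The arithmetic in the note can be checked with a small sketch. `fair_share` and `round_robin_finish_order` are hypothetical helpers, not part of open-coroutine. The second one simulates a single thread that gives every task at most one quantum per turn, which is the kind of fair scheduling that a run-to-completion thread pool cannot provide.

```rust
use std::collections::VecDeque;
use std::time::Duration;

/// Equal share of one CPU time slice among `tasks` tasks (`tasks` must be non-zero).
fn fair_share(slice: Duration, tasks: u32) -> Duration {
    slice / tasks
}

/// Simulates round-robin scheduling on a single thread: each turn a task runs for
/// at most `quantum`, then goes to the back of the queue if it still has work left.
/// Returns task ids in the order they finish.
fn round_robin_finish_order(work: &[Duration], quantum: Duration) -> Vec<usize> {
    assert!(!quantum.is_zero(), "quantum must be non-zero");
    let mut queue: VecDeque<(usize, Duration)> = work.iter().copied().enumerate().collect();
    let mut finished = Vec::new();
    while let Some((id, left)) = queue.pop_front() {
        if left <= quantum {
            finished.push(id); // task completes within this turn
        } else {
            queue.push_back((id, left - quantum)); // preempted, requeued
        }
    }
    finished
}
```

With three tasks needing 30ms, 10ms and 20ms and a 10ms quantum, the short task finishes first instead of waiting behind the 30ms one.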
+
+The first point can still be overcome, while the second and third points are real weaknesses. In fact, if the third point
+could be solved, RPC frameworks would not need many threads, just thread-per-core.
+
+How can something be easy for developers to use while ensuring that the performance of IO-intensive tasks is not inferior to
+multithreading, and still be thread-per-core? `Coroutine` technology slowly entered my field of vision.
+
+## Goroutine still has shortcomings
+
+When I first started playing with coroutines, with the learning cost in mind, I chose `kotlin`. However, when I
+realized that kotlin's coroutines require changing APIs (such as replacing Thread.sleep with kotlinx.coroutines.delay)
+to avoid blocking threads, I decisively switched to `golang`. About 2 weeks later:
+
+<div style="text-align: center;">
+<img src="/docs/img/good.jpeg" width="50%">
+</div>
+
+Which coroutine technology is the strongest? Among programming languages, look to Golang. However, as I delved deeper,
+I discovered several shortcomings of goroutines:
+
+1. `Not thread-per-core`. The goroutine runtime is also backed by a thread pool, and the maximum number of threads in
+this thread pool is 256, which is generally much larger than the number of threads under thread-per-core, and the
+scheduling threads are not bound to CPUs;
+2. `Preemptive scheduling interrupts running system calls`. If a system call takes a long time to complete, it
+will obviously be interrupted multiple times, lowering overall performance;
+3. `Goroutines fall significantly short of peak performance`. Some C/C++ coroutine
+libraries even reach 1.5 times the performance of goroutines;
+
+With regret, I went on to study the C/C++ coroutine libraries and found that they either only implemented `hook` (to
+explain hook technology in simple terms: it proxies system calls such as sleep. Without the hook, the operating
+system's sleep function is called; with the hook, the call is redirected to our own code. For detailed
+steps, please refer to Chapters 41 and 42 of The Linux Programming Interface), or only implemented `work-stealing`.
+Some libraries only provided the most basic `coroutine abstraction`, and the most disappointing thing is that none of
+them implemented `preemptive scheduling`.
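
As a rough analogy for the hook idea in the parenthetical above, the sketch below swaps a function pointer so that callers of `sleep` reach our own code instead of the blocking implementation. This is only an illustration under stated assumptions: `Syscalls`, `os_sleep`, and `hooked_sleep` are hypothetical names, and a real hook typically replaces the libc symbol at link or load time rather than a field in a struct.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical table of call entry points, standing in for the symbols
/// that a real hook would interpose on.
struct Syscalls {
    sleep: fn(Duration) -> &'static str,
}

/// Unhooked path: blocks the calling OS thread for the whole duration.
fn os_sleep(d: Duration) -> &'static str {
    thread::sleep(d);
    "blocked the thread"
}

/// Hooked path: a coroutine runtime would register a timer for `_d` and switch
/// to another coroutine here; this stub only reports that it did not block.
fn hooked_sleep(_d: Duration) -> &'static str {
    "yielded to the scheduler"
}

impl Syscalls {
    /// Starts with the operating system's behaviour.
    fn unhooked() -> Self {
        Syscalls { sleep: os_sleep }
    }

    /// After this call, every `sleep` through the table reaches our own code.
    fn install_hook(&mut self) {
        self.sleep = hooked_sleep;
    }
}
```

The caller's code, `(calls.sleep)(duration)`, is identical before and after `install_hook`, which is the property that lets a hook keep blocking-style APIs while avoiding blocked threads.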
+
+There's no other way; it seems we can only do it ourselves.
+
+<div style="text-align: center;">
+<img src="/docs/img/just_do_it.jpg" width="100%">
+</div>

docs/en/why-rust.md

Lines changed: 38 additions & 0 deletions
@@ -0,0 +1,38 @@
+---
+title: Language Selection
+date: 2025-02-24 17:37:10
+author: loongs-zhang
+---
+
+# Language Selection
+
+English | [中文](../cn/why-rust.md)
+
+Which language should be used to develop open-coroutine? This is a very important question, as different languages have
+different features, and the choice of language can have a significant impact on the final outcome.
+
+When researching C coroutine libraries earlier, I saw that some experts had already tried writing dynamic link
+libraries in C and calling them from Java through JNI, but ultimately failed. The specific reason can only be found by
+digging into the JVM source code, which is beyond me, so goodbye. JVM bytecode languages such as Java/Kotlin are therefore excluded.
+
+Obviously, using Golang to implement goroutines is no less complex than delving into the JVM source code, and even if it
+were actually finished, no one would be willing to use it in a production environment, so Golang is excluded.
+
+Now three contenders remain: C/C++/Rust.
+
+Judging from the several coroutine libraries written in C that I studied, C is somewhat lacking in expressiveness
+and requires writing a huge amount of code. In comparison, C++ is much more expressive, but its
+development efficiency is still low, mainly in the following respects:
+
+1. `Having to write cmake`. It exists purely to tell the system how to compile, which is a bit troublesome, and this is
+really a part that developers shouldn't have to worry about;
+2. `Difficult dependency management`. If you want to use a library written by someone else, you need to pull down
+the code and put it into your own project, and then spend a lot of time getting it to compile. If the library has
+no other dependencies, this is manageable. Once it has dependencies of its own, those transitive dependencies
+must also be handled by the same steps, which becomes very troublesome;
+3. `Memory unsafety`. It's difficult to write C++ code without memory leaks or crashes.
+
+<div style="text-align: center;">
+<img src="/docs/img/what_else_can_I_say.jpg" width="50%">
+<img src="/docs/img/rust.jpeg" width="100%">
+</div>
