
Source: https://zhuanlan.zhihu.com/p/498589747

The title refers to a Go programming proverb, "Don't communicate by sharing memory; share memory by communicating," and it can be understood narrowly or broadly.

In the narrow sense, Go recommends using channels to share information instead of using shared memory, which is an elegant way to avoid the cumbersome and inefficient machinery of data synchronization.
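As a minimal sketch of that narrow sense (the names and numbers here are illustrative, not from the article): instead of guarding mutable state with a mutex, exclusive ownership of the state can be passed through a channel, and whoever currently holds the value is the only one allowed to touch it.

```go
package main

import "fmt"

// ledger is mutable state; instead of a mutex, exclusive ownership of it
// is passed through a channel, so only the current holder may mutate it.
type ledger struct {
	balance int
}

// deposit claims the ledger, updates it, and hands ownership back.
func deposit(ch chan *ledger, amount int, done chan struct{}) {
	l := <-ch // take exclusive ownership
	l.balance += amount
	ch <- l // hand ownership back
	done <- struct{}{}
}

func run() int {
	ch := make(chan *ledger, 1)
	done := make(chan struct{})
	ch <- &ledger{} // the single-slot buffer parks the idle state

	for i := 0; i < 3; i++ {
		go deposit(ch, 10, done)
	}
	for i := 0; i < 3; i++ {
		<-done
	}
	l := <-ch
	return l.balance
}

func main() {
	fmt.Println(run()) // 30
}
```

The single-slot buffered channel acts as an ownership token: a send parks the state, a receive claims it, and the channel itself serializes access with no lock in sight.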

In the broad sense, it means letting the resource schedule the requests, rather than letting the requests contend for the resource.

Sometimes a shift in thinking, a new perspective on a problem, can bring unexpected gains.

Resources are limited. Having all requests use them in an orderly way is communication; conversely, presenting each request with the illusion that it exclusively owns the resource is sharing. The essential difference between these two very different approaches lies in the arbitration cost, which determines how much concurrency each can carry.

Let's go through some examples, pair by pair.

Circuit switching tries to occupy an entire circuit (in practice, the last mile); if that fails, it must keep retrying until it succeeds.

Packet switching splits long messages into small packets, and the packets statistically multiplex the links.

In a batch system, once a user's job starts it monopolizes the system until the job completes, while the other users wait.

A time-sharing system slices time into shares, and multiple users' jobs are scheduled onto the time slices.

A CSMA/CD host tries to seize the bus exclusively to send its frame; if that fails, it backs off until it succeeds.

In switched Ethernet, frames queue up in order at the switch and multiplex its buffers.

Apache spawns a task per request; once one task gets the CPU, the other tasks wait.

Nginx uses an asynchronous model: all requests time-share the CPU across a fixed number of worker tasks.

Shared memory makes writes mutually exclusive, allowing only one operation at a time; the others must wait and retry.

Erlang processes and Go channels break content into transactional messages, relying on ordered message delivery to share information.
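To make the last pair concrete, here is a sketch (the type and function names are mine, not the article's) of a Go counter whose state is owned by a single goroutine; updates arrive as transactional messages and are applied in channel arrival order, with no lock anywhere.

```go
package main

import "fmt"

// op is a transactional message: a delta to apply plus a channel for the reply.
type op struct {
	delta int
	reply chan int
}

// counter is the sole owner of n; requests arrive as messages and are
// applied strictly in arrival order, so no mutex is needed.
func counter(ops chan op) {
	n := 0
	for o := range ops {
		n += o.delta
		o.reply <- n
	}
}

// add sends one transactional update and waits for the new total.
func add(ops chan op, d int) int {
	r := make(chan int)
	ops <- op{delta: d, reply: r}
	return <-r
}

func main() {
	ops := make(chan op)
	go counter(ops)
	add(ops, 1)
	add(ops, 2)
	fmt.Println(add(ops, 3)) // 6
}
```

The channel plays the role of the switch's queue: requests line up in order, and the single owner drains them, so there is nothing to arbitrate.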

Now let's look at what these pairs have in common.

All of the pairs above can be abstracted into a contention mode and an ordered mode:

In contention mode, conflicts fundamentally require arbitration.

In ordered mode, concurrency fundamentally requires scheduling.

Arbitrating a conflict means deciding what to do after the conflict occurs. Whether you back off and retry or simply wait, nothing useful gets done in the meantime, and the arbitration itself is expensive.

Scheduled concurrency is much better: order means no conflicts, no conflicts means no arbitration cost, and with no arbitration there is no need to retry or wait; you can go do something else, and processing becomes fully asynchronous.

Let's revisit the pairs above and compare their strengths and weaknesses:

With circuit switching, once the circuit is busy you must keep retrying.

With packet switching, you just send packets; the switching nodes schedule them automatically, and they are reassembled at the destination.

With a batch system, once it is occupied you must queue up or come back later.

With a time-sharing system, you just submit your job, and the scheduler lets all users' jobs multiplex the time slices.

A CSMA/CD NIC must constantly listen for collisions and retry.

A switched-Ethernet NIC just sends frames; the switch queues any frames it cannot forward immediately.

If an Apache thread/process is not scheduled onto the CPU, it must wait until the scheduler switches it in.

Nginx just gets event notifications, and a worker process polls across all requests.

Shared-memory access requires taking a lock; if that fails, you either wait and retry or come back later.

An Erlang/Go channel communicates with messages; once a message is sent, the sender doesn't care when it arrives (unless it wants feedback); everything is completely asynchronous.
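A rough illustration of that fire-and-forget style (the function and buffer size are assumptions for the example): a buffered channel lets the sender queue its messages and move on immediately, without waiting for the receiver to handle them.

```go
package main

import "fmt"

// fireAndForget queues n messages on a buffered channel without waiting
// for a consumer, then drains them in order and reports how many were
// handled. The sends return as soon as the message is queued.
func fireAndForget(n int) int {
	msgs := make(chan int, n) // buffer absorbs the burst
	for i := 0; i < n; i++ {
		msgs <- i // non-blocking while the buffer has room
	}
	close(msgs) // the sender is done; it never waits on the receiver

	handled := 0
	for range msgs {
		handled++
	}
	return handled
}

func main() {
	fmt.Println(fireAndForget(3)) // 3
}
```

If the sender does want feedback, it can embed a reply channel in the message, as in the counter sketch earlier; otherwise send and receive are fully decoupled in time.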

As you can see, these are all different paths to the same destination. More like-minded pairs:

PCI vs PCIe: from bus to switched fabric.

Monolithic kernel vs microkernel: from shared data structures to message passing.

Spin/RW lock vs RCU: from contending for locks to atomically publishing updated copies.

(Figure: the RCU principle)

Why can't contention mode, with its conflict arbitration, carry high concurrency? Because under overload the cost of conflict arbitration overwhelms the resource itself; to carry high concurrency you must switch to scheduling. Understanding this point requires a change of perspective.

Look at whether what is being operated on is the information itself, or a copy of the information.

Returning to the title of this article: "share memory by communicating" operates on a copy of the information, while "communicate by sharing memory" manipulates the information itself.

Operating on a copy guarantees that one and only one entity operates on that copy; if two entities need to operate, make another copy. That guarantees freedom from conflict, and the business flow stays controllable and non-blocking.

RCU achieves non-blocking concurrency, which neither spinlock nor rwlock can do. A spinlock/rwlock locks the critical section and thereby serializes it; RCU has no critical section: it runs the would-be critical-section logic on a copy and atomically publishes the update at the right moment, achieving non-blocking concurrency.
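The RCU idea can be sketched in Go with sync/atomic (this is an illustrative analogy, not Linux's RCU; Go's garbage collector stands in for grace-period reclamation): readers load a snapshot pointer without blocking, and a writer copies the record, modifies the private copy, and atomically publishes it.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type config struct {
	limit int
}

// current holds a pointer to the latest immutable snapshot.
var current atomic.Value

// read is the RCU read side: a lock-free snapshot load; readers never block.
func read() int {
	return current.Load().(*config).limit
}

// update is the RCU write side: copy, modify the private copy, then
// atomically publish. Readers see either the old version or the new one,
// never a torn mixture. (A single writer is assumed here; concurrent
// writers would need their own serialization, as in real RCU.)
func update(limit int) {
	fresh := *current.Load().(*config) // copy outside any critical section
	fresh.limit = limit                // modify the copy
	current.Store(&fresh)              // atomic publish; GC reclaims the old copy
}

func main() {
	current.Store(&config{limit: 10})
	fmt.Println(read()) // 10
	update(20)
	fmt.Println(read()) // 20
}
```

There is no critical section anywhere: the "critical" logic runs on a private copy, and the only shared operation is one atomic pointer swap.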

If concurrency is scalability in time, then carrying information to distant places is scalability in space, and it is the network (today, the TCP/IP network) that accomplishes this. TCP/IP networks take the "share memory by communicating" approach, which is undoubtedly correct.

I don't know Erlang, but I roughly understand its point: Erlang has no mutable variables and operates only on copies; it is a mapping of the communication network into a programming language. The same presumably holds for Go: with channels you can process information the way a network of senders and receivers does.
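One concrete point supporting this view (the example names are mine): Go copies a value when it is sent on a channel, so the receiver's mutations never leak back into the sender's variable, just as a received packet is a copy of the one that was sent.

```go
package main

import "fmt"

type msg struct {
	payload [4]byte
}

// sendCopy shows that a value sent on a channel is copied: the receiver
// mutates its own copy, and the sender's original is untouched. This is
// message passing, not shared memory.
func sendCopy() (sender, receiver byte) {
	ch := make(chan msg, 1)
	m := msg{payload: [4]byte{1, 2, 3, 4}}
	ch <- m // the value is copied into the channel

	got := <-ch
	got.payload[0] = 99 // mutate the receiver's copy only

	return m.payload[0], got.payload[0]
}

func main() {
	s, r := sendCopy()
	fmt.Println(s, r) // 1 99
}
```

Note this holds for values; sending a pointer instead shares the pointee, which is why the ownership-transfer convention shown earlier matters when pointers are used.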

Looking at the socket interface, it is in fact an ancient form of sharing memory by communicating.

The socket interface began as an inter-process communication mechanism; with it, the peer process can be local, remote, or anywhere in the world. Sharing memory by communicating is the most primitive programming mode, and it still holds today.

Shared memory is a local optimization: it has value for programming convenience but offers no scalability, neither scalability in time (non-blocking concurrency) nor scalability in space (carrying information to distant places).

Shared memory optimizes instruction-level operation latency: rather than wrapping information into a message and passing it, manipulating the information directly is simpler to program, takes fewer instructions, and has lower execution latency. But what high concurrency competes on is the number of effective instructions executing at the same time, and spinning and context switching are not effective instructions, so shared memory is not naturally paired with high concurrency.

Moreover, and this is the same point as before: in network programming, single-flow communication latency is generally on the order of milliseconds, so shaving a microsecond or even a nanosecond of operation latency by preferring shared memory over message passing means little. Blame the speed of light.

Now let's start from cloud native.

Cloud native is a microservices-oriented architecture, and messaging is the medium of microservice interaction; every practitioner has encountered the concept of a message queue, and it is messages that underpin cloud-native microservices.

Messages do not encapsulate state: a message itself is stateless, and state emerges through the interaction between messages. Message interactions can be freely composed, which is the source of distribution, and cloud native is itself a distribution-oriented design.

An IDC machine room that deploys cloud-native applications is a scaled-down version of the global TCP/IP Internet: stateless messages pass between distributed microservices, and state is defined and maintained only by the interaction of those microservices.

Even the boards inside a single physical host have become a miniature version of the global TCP/IP Internet: stateless messages pass between distributed modules, and state is defined and maintained only by the interaction between those modules.

Shared memory resembles a bus: everyone has equal access, but write access is exclusive. So we can view shared memory through the relationship between the bus and message switching.

Host motherboards used to carry many buses, and modules had to win bus arbitration before communicating with the CPU or with each other. Later, PCIe replaced the bus with a switched, point-to-point fabric, replacing bus arbitration with message exchange.

Ethernet had gone down the same path even earlier.

The microkernel idea that practitioners have advocated in recent years is roughly the same move: replacing operations on shared data structures with message passing.

Why do all of these analogies point to the global TCP/IP Internet? Because the foundation of TCP/IP is an asynchronous, stateless, distributed, message-passing packet-switched network.

A bus is simple, but as a system scales up, the time lost to bus contention grows exponentially; once people found that the bus could neither support high concurrency nor expand physically, message passing replaced it. In large systems, the extra latency of handling messages is negligible.

Internal boards, LANs, PCI, and operating systems all started out local, so it is no surprise they started with a bus. The Internet, by contrast, was connected across a distributed wide area from the beginning and was never suited to a bus structure, which in turn shows that a bus does not fit distributed scenarios.

I have always praised the TCP/IP end-to-end principle. It is the stateless IP thin waist that lets the Internet scale arbitrarily without introducing extra overhead, and the thin waist is also the core of stateless message exchange: state is defined and maintained only between sender and receiver, rather than all ends jointly maintaining the state of a shared bus or shared memory.

Messaging thus follows the end-to-end principle and can scale freely, while the bus and shared memory do the opposite.

From this local, outward expansion, we see the trend of messaging replacing the bus.

Conversely, shrink inward from the wide area: as the scale gets smaller, transmission delay gets shorter and the system less distributed, the scalability advantage of stateless messaging becomes less and less useful, while the extra latency of its wrapping comes to dominate end-to-end latency.

Strip away the extra message wrapping and transmission, let all the small entities directly manipulate the memory where the information lives, and minimizing end-to-end latency becomes a considerable benefit; that is why the bus and shared memory are the ultimate choice in miniature systems.


This makes sense in both directions: going from small scale to large, the bus and shared memory are replaced by messaging; going from large scale to small, the bus and shared memory are an optimization of messaging.

Just as with general relativity, Newtonian mechanics, and quantum mechanics, different scales call for different philosophies.
