System design: key features of distributed systems

2022-06-24 10:52:00 【Xiaochengxin post station】

Key characteristics of distributed systems include scalability, reliability, availability, efficiency, and manageability. Let's briefly review each of them.

Scalability (including elasticity)

Scalability is the ability of a system, process, or network to grow and manage increased demand. Any distributed system that can evolve to support a growing workload is considered scalable.

A system may have to scale for many reasons, such as a growing data volume or a growing workload (for example, the number of transactions). A scalable system aims to achieve this growth without losing performance. In general, the performance of a system, even one designed (or claimed) to be scalable, declines as the system grows because of management or environment costs. For example, the network may become slower because machines tend to be far apart from one another.

More generally, some tasks cannot be distributed, either because they are inherently atomic or because of a flaw in the system design. At some point, such tasks limit the speed-up that distribution can deliver. A scalable architecture avoids this situation and tries to balance the load evenly across all participating nodes.
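The limit that non-distributable tasks place on speed-up is commonly modeled with Amdahl's law. The article does not name this law, so treat the following as an illustrative sketch; the function name and the 95% figure in the note below are assumptions chosen for the example.

```python
def amdahl_speedup(parallel_fraction: float, nodes: int) -> float:
    """Upper bound on speed-up when only part of the work can be distributed.

    parallel_fraction: share of the work that can be spread over nodes (0..1).
    nodes: number of machines sharing the distributable part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    # The non-distributable share runs at the same speed however many nodes exist.
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / nodes)
```

If 95% of the work can be distributed, the speed-up stays below 20 however many nodes are added, because the remaining 5% always runs on one machine.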

Horizontal and vertical scaling:

Horizontal scaling means scaling by adding more servers to the resource pool (what is often called "piling on machines"). Vertical scaling means scaling by adding more power (CPU, RAM, storage, etc.) to an existing server.

With horizontal scaling, it is often easier to scale dynamically by adding more machines to the existing pool. Vertical scaling is usually limited to the capacity of a single server; scaling beyond that capacity often involves downtime, and there is an upper limit.
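The difference between the two approaches can be sketched as a small capacity calculation. This is a simplified model, not a sizing tool: the names `ScalingPlan`, `plan_vertical`, and `plan_horizontal` are hypothetical, and capacities are abstract units such as requests per second.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScalingPlan:
    strategy: str
    servers: int
    capacity_per_server: int


def plan_vertical(demand: int, max_server_capacity: int) -> Optional[ScalingPlan]:
    """One bigger server; returns None once demand exceeds the hardware ceiling."""
    if demand > max_server_capacity:
        return None
    return ScalingPlan("vertical", 1, demand)


def plan_horizontal(demand: int, capacity_per_server: int) -> ScalingPlan:
    """More servers of the same size; there is no ceiling in this simple model."""
    if capacity_per_server <= 0:
        raise ValueError("capacity_per_server must be positive")
    # Integer ceiling division: enough servers to cover the whole demand.
    servers = max(1, (demand + capacity_per_server - 1) // capacity_per_server)
    return ScalingPlan("horizontal", servers, capacity_per_server)
```

In this model, a demand of 5000 units cannot be met by a server that tops out at 4000 units, while five servers of 1000 units each cover it.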

Good examples of horizontal scaling are Cassandra and MongoDB, because both provide a simple way to scale horizontally by adding more machines to meet growing demand. Similarly, a good example of vertical scaling is MySQL, because it allows easy vertical scaling by switching from a smaller machine to a larger one. However, this process often involves downtime.

[Figure: vertical scaling vs. horizontal scaling]

Reliability (high availability, stability)

By definition, reliability is the probability that a system will operate without failure over a given period of time. In practice, stability targets are often expressed as a number of nines, which caps how long the system may be unavailable over a year. In short, a distributed system is considered reliable if it keeps delivering its services even when one or more of its software or hardware components fail. Reliability is one of the main characteristics of any distributed system, because in such a system any failing machine can be replaced by another healthy one, ensuring that the requested task is completed.
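A "number of nines" target can be converted into a yearly downtime budget. The sketch below assumes a 365-day year and reads N nines as an allowed unavailable fraction of 10^-N; the function name is hypothetical.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def allowed_downtime_hours(nines: int) -> float:
    """Yearly downtime budget for a target of `nines` nines (e.g. 3 -> 99.9%)."""
    if nines < 1:
        raise ValueError("nines must be at least 1")
    unavailable_fraction = 10.0 ** (-nines)
    return HOURS_PER_YEAR * unavailable_fraction
```

Under these assumptions, three nines (99.9%) allows about 8.76 hours of downtime per year, and four nines (99.99%) allows about 53 minutes.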

Take a large e-commerce store (such as Amazon) as an example. One of its primary requirements is that no user transaction should ever be canceled because the machine running it failed. For example, if a user has added an item to their shopping cart, the system is expected not to lose it. A reliable distributed system achieves this through redundancy of both software components and data. If the server hosting the user's shopping cart fails, another server holding an exact replica of the cart should take its place.
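The shopping cart scenario can be sketched as a store that writes every change to all live replicas and reads from the first live one. This is a deliberately simplified illustration: `CartReplica` and `ReplicatedCartStore` are hypothetical names, replicas are in-memory objects, and a replica that recovers after a failure is not resynchronized.

```python
class CartReplica:
    """One server holding a copy of every user's cart (in memory for the sketch)."""

    def __init__(self, name: str):
        self.name = name
        self.alive = True
        self.carts = {}  # user_id -> list of items


class ReplicatedCartStore:
    def __init__(self, replicas):
        if not replicas:
            raise ValueError("at least one replica is required")
        self.replicas = list(replicas)

    def add_item(self, user_id: str, item: str) -> None:
        """Write to every live replica so each holds an identical copy."""
        live = [r for r in self.replicas if r.alive]
        if not live:
            raise RuntimeError("no live replica available")
        for replica in live:
            replica.carts.setdefault(user_id, []).append(item)

    def get_cart(self, user_id: str) -> list:
        """Read from the first live replica; later ones take over on failure."""
        for replica in self.replicas:
            if replica.alive:
                return list(replica.carts.get(user_id, []))
        raise RuntimeError("no live replica available")
```

If the first replica is marked as failed after an item has been added, `get_cart` still returns the item from the second replica. The extra replica is the redundancy cost the article refers to.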

Obviously, redundancy comes at a cost, and a reliable system has to pay it to achieve resilience by eliminating every single point of failure.

Availability

Availability is also the "A" in the CAP theorem and in BASE (Basically Available): the system must be available. By definition, availability is the time a system remains operational to perform its required function within a specific period. It is a simple measure of the percentage of time that a system, service, or machine remains operational under normal conditions. An aircraft that can be flown for many hours a month without much downtime can be said to have high availability. Availability takes into account maintainability, repair time, spare parts availability, and other logistics factors. If the aircraft is grounded for maintenance, it is considered unavailable during that time.

Reliability is availability over time, considering the full range of conditions that can occur in the real world. An aircraft that can fly safely through any possible weather is more reliable than one that is vulnerable to certain conditions.

Reliability vs. availability

If a system is reliable, it is available. However, if it is available, it is not necessarily reliable. In other words, high reliability contributes to high availability, but it is also possible to achieve high availability with an unreliable product by minimizing repair time and ensuring that spare parts are always available when they are needed.

Take an online retail store as an example. It had 99.99% availability for the first two years after its launch. However, the system was launched without any information security testing. Customers were happy with the system, but they did not realize that it was not very reliable, because it was vulnerable to likely risks. In the third year, the system experienced a series of information security incidents that suddenly resulted in extremely low availability for extended periods. This caused reputational and financial damage to its customers.

Efficiency

To understand how to measure the efficiency of a distributed system, assume we have an operation that runs in a distributed manner and delivers a set of items as its result. The two standard measures of its efficiency are the response time (or latency), which denotes the delay to obtain the first item, and the throughput (or bandwidth), which denotes the number of items delivered in a given time unit (for example, a second). These two measures correspond to the following unit costs:

• The number of messages sent globally by the nodes of the system, regardless of message size.

• The size of the messages, representing the volume of data exchanged.

The complexity of the operations supported by a distributed data structure (for example, searching a distributed index for a specific key) can be described as a function of one of these cost units. Generally speaking, analyzing a distributed structure only in terms of the number of messages is oversimplistic. It ignores the impact of many aspects, including the network topology, the network load and its variation, and the possible heterogeneity of the software and hardware components involved.

However, it is difficult to build a precise cost model that accurately accounts for all of these performance factors. We therefore have to live with rough but robust estimates of system behavior.
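The two unit costs (message count and message size) can be tallied with a small meter. The sketch below is a rough estimate in the spirit described above, not a cost model: `CostMeter` and `broadcast_lookup` are hypothetical names, the lookup simply asks every node for the key, and every reply is assumed to be a fixed 4-byte placeholder.

```python
from dataclasses import dataclass


@dataclass
class CostMeter:
    messages: int = 0    # unit cost 1: number of messages, regardless of size
    bytes_sent: int = 0  # unit cost 2: total volume of data exchanged

    def record(self, payload: bytes) -> None:
        self.messages += 1
        self.bytes_sent += len(payload)


def broadcast_lookup(key: str, node_count: int, meter: CostMeter) -> None:
    """Ask every node for `key`: one request and one reply per node."""
    request = key.encode("utf-8")
    reply = b"miss"  # assumption: fixed-size placeholder reply
    for _ in range(node_count):
        meter.record(request)
        meter.record(reply)
```

For a 7-byte key and 8 nodes, this counts 16 messages and 88 bytes. As the article notes, such counts ignore topology, load, and heterogeneity.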

Response time (RT) and throughput are generally used as the benchmark metrics for measuring the efficiency of a system.
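Both metrics can be measured by timing an operation that delivers items one by one. In this minimal sketch the helper name `measure` and the injectable `clock` parameter are assumptions; the default clock is the standard library's `time.perf_counter`.

```python
import time
from typing import Callable, Iterable, Tuple


def measure(
    items: Iterable,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float, int]:
    """Consume `items`; return (response_time_s, throughput_items_per_s, count)."""
    start = clock()
    first_item_at = None
    count = 0
    for _ in items:
        if first_item_at is None:
            first_item_at = clock()  # delay until the first item arrives
        count += 1
    end = clock()
    if first_item_at is None:
        raise ValueError("the operation delivered no items")
    elapsed = end - start
    throughput = count / elapsed if elapsed > 0 else float("inf")
    return first_item_at - start, throughput, count
```

Passing a fake clock makes the measurement deterministic in tests. In production use, the default clock measures real elapsed time.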

Maintainability or manageability

When designing a distributed system, another important consideration is how easy it is to operate and maintain. Serviceability or manageability is the simplicity and speed with which a system can be repaired or maintained; if the time to repair a failed system increases, availability decreases. Manageability considerations include how easy it is to diagnose and understand problems when they occur, how easy it is to make updates or modifications, and how simple the system is to operate (that is, does it routinely run without failures or exceptions?).
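The link between repair time and availability is often expressed with the steady-state formula availability = MTBF / (MTBF + MTTR). The terms MTBF (mean time between failures) and MTTR (mean time to repair) are not used in the article, so the following is a sketch under that assumption.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean repair time."""
    if mtbf_hours <= 0 or mttr_hours < 0:
        raise ValueError("mtbf_hours must be positive and mttr_hours non-negative")
    return mtbf_hours / (mtbf_hours + mttr_hours)
```

With the failure rate held constant, a longer repair time lowers availability: an MTBF of 999 hours gives 99.9% with a 1-hour repair, but about 99.0% with a 10-hour repair.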

Early detection of faults can reduce or avoid system downtime. For example, some enterprise systems can automatically call a service center (without human intervention) when the system experiences a fault.
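One common way to detect faults early is a heartbeat timeout: each node reports in periodically, and a node that stays silent for too long is flagged. The article does not prescribe this mechanism; `HeartbeatMonitor` and its methods are hypothetical names, and timestamps are passed in explicitly to keep the sketch deterministic.

```python
class HeartbeatMonitor:
    """Flags nodes whose last heartbeat is older than `timeout_s` seconds."""

    def __init__(self, timeout_s: float):
        if timeout_s <= 0:
            raise ValueError("timeout_s must be positive")
        self.timeout_s = timeout_s
        self.last_seen = {}  # node name -> timestamp of the last heartbeat

    def heartbeat(self, node: str, now: float) -> None:
        self.last_seen[node] = now

    def suspected_failures(self, now: float) -> list:
        """Nodes silent for longer than the timeout, sorted by name."""
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )
```

The list returned by `suspected_failures` could then feed an automated alert, such as the service-center call mentioned above.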

The characteristics described above are the most basic ones a good distributed system should have. Ask yourself: is your system a good distributed system? Does it meet the core characteristics described above?

Reference:

grok_system_design_interview.pdf

Copyright notice
This article was written by [Xiaochengxin post station]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2021/06/20210617143955317f.html