System design: index
2022-06-24 05:03:00 【Xiaochengxin post station】
If someone mentions indexing, databases are probably the first thing that comes to mind. So what problem does indexing solve? Take a slow SQL query: when one shows up, one of the first things to check is whether the slow query is actually using a database index.
The purpose of creating an index on a particular table in a database is to make it faster to search the table and find the required rows. An index can be created on one or more columns of a table, supporting both fast random lookups and efficient access to ordered records.
Example: library catalog
A library catalog is a register that lists the books held in a library. The catalog is organized like a database table, typically with four columns: title, author, subject, and publication date. There are usually two such catalogs: one sorted by title and one sorted by author name. That way, you can either think of an author you want to read and browse their books, or look up a specific title you want when you don't know the author's name. These catalogs act like indexes on a database of books: they provide a sorted list of data that can be searched easily for the relevant information.
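As a minimal sketch of the check described above, the following uses Python's built-in sqlite3 module to create a single-column and a multi-column index and to compare the query plan before and after. The `books` table, the index names, the sample rows, and the `plan_details` helper are assumptions made for illustration; other database engines expose query plans through different commands.

```python
import sqlite3


def plan_details(conn, sql, params=()):
    """Return the human-readable detail text from SQLite's EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE books ("
    "id INTEGER PRIMARY KEY, title TEXT, author TEXT, subject TEXT, published TEXT)"
)
# Sample rows for illustration only.
conn.executemany(
    "INSERT INTO books (title, author, subject, published) VALUES (?, ?, ?, ?)",
    [
        ("Dune", "Frank Herbert", "Science fiction", "1965"),
        ("Emma", "Jane Austen", "Fiction", "1815"),
        ("Ulysses", "James Joyce", "Fiction", "1922"),
    ],
)

query = "SELECT * FROM books WHERE title = ?"

# Without an index on title, SQLite has to scan the whole table.
plan_before = plan_details(conn, query, ("Dune",))

# An index on one column, and an index on two columns.
conn.execute("CREATE INDEX idx_books_title ON books (title)")
conn.execute("CREATE INDEX idx_books_author_published ON books (author, published)")

# With the index in place, the plan should mention idx_books_title.
plan_after = plan_details(conn, query, ("Dune",))

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, so the useful signal is simply whether the index name appears in the plan.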
In short, an index is a data structure that can be thought of as a table of contents pointing us to where the actual data lives. So when we create an index on a column of a table, the index stores that column's value together with a pointer to the full row. Assume a table containing a list of books; the image below shows what an index on the “Title” column looks like:
Just as with traditional relational data stores, we can apply this concept to larger data sets. The trick with indexing is that we must carefully consider how users will access the data. For a data set that is many terabytes in size but has very small payloads (such as 1 KB), indexing is a necessity for optimizing data access. Finding a small payload in such a large data set can be a real challenge, because we cannot iterate over that much data in any reasonable time. Moreover, a data set this large is likely to be spread across multiple physical devices, which means we need some way to find the correct physical location of the data we want. Indexing is the best way to do this.
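The idea of "column value plus pointer to the row" can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the table is an in-memory list, the "pointer" is a list position, and the index is a sorted list searched with binary search. Real databases typically use B-tree or hash structures on disk. The names `build_index` and `lookup` and the sample rows are hypothetical.

```python
import bisect

# A tiny "table": each row is (title, author, subject, published). Sample data only.
books = [
    ("Ulysses", "James Joyce", "Fiction", "1922"),
    ("Dune", "Frank Herbert", "Science fiction", "1965"),
    ("Emma", "Jane Austen", "Fiction", "1815"),
]


def build_index(rows, column):
    """Build a sorted index of (column value, row position) pairs."""
    entries = sorted((row[column], pos) for pos, row in enumerate(rows))
    keys = [key for key, _ in entries]
    positions = [pos for _, pos in entries]
    return keys, positions


def lookup(rows, index, value):
    """Binary-search the index, then follow the stored positions to the rows."""
    keys, positions = index
    i = bisect.bisect_left(keys, value)
    matches = []
    while i < len(keys) and keys[i] == value:
        matches.append(rows[positions[i]])
        i += 1
    return matches


title_index = build_index(books, column=0)
print(lookup(books, title_index, "Emma"))
```

The table itself stays in insertion order; only the index is sorted, which is what lets a lookup skip the full scan.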
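One way to picture "finding the correct physical location" is a small top-level index that maps key ranges to devices. The sketch below assumes, purely for illustration, that each device holds a contiguous, sorted range of keys; the device names, range boundaries, and the `locate` function are hypothetical, and real systems may partition by hash or by other schemes.

```python
import bisect

# Hypothetical layout: device i stores all keys from partition_starts[i]
# up to (but not including) partition_starts[i + 1].
partition_starts = ["a", "h", "p"]
partition_devices = ["disk-0", "disk-1", "disk-2"]


def locate(key):
    """Return the device whose key range contains the given key."""
    i = bisect.bisect_right(partition_starts, key) - 1
    if i < 0:
        raise KeyError(f"{key!r} sorts before the first partition")
    return partition_devices[i]


print(locate("hello"))
print(locate("zebra"))
```

Only this small range index has to be consulted to pick a device; a per-device index can then find the row itself.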
Why do indexes degrade write performance?
Indexing can greatly speed up data retrieval, but because of the extra keys it stores, the index itself can be very large, which slows down data insertion and updates.
When adding rows to a table with an active index, or updating existing rows, we not only have to write the data but also update the index. This reduces write performance, and the degradation applies to all insert, update, and delete operations on the table. For this reason, avoid adding unnecessary indexes to tables and remove indexes that are no longer used. To reiterate, indexes are added to improve the performance of search queries. If the goal of the database is to provide a data store that is written to often but rarely read, then slowing down the more common operation (writing) is probably not worth the read performance gained. For more, see the Wikipedia article on database indexes: https://en.wikipedia.org/wiki/Database_index
Off topic: author's supplement
The Google system design guide describes what we usually call the advantages and disadvantages of indexing. Thinking about it more deeply, indexing is a solution to the read problem and data storage is a solution to the write problem. When designing systems and middleware, you will find that many designs separate reads from writes; for example, disk writes are typically sequential while disk reads are random. So using an index is a trade-off: does the index actually help us? If a table holds only one record, it needs no index. If the data set is very large, building many redundant indexes will undoubtedly make writes more expensive.
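The extra work per write can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not how any particular database is implemented: rows are dictionaries in a list, each index is a dictionary from column value to row positions, and `index_writes` simply counts how many index entries had to be touched. The class name `IndexedTable` is hypothetical.

```python
from collections import defaultdict


class IndexedTable:
    """Toy table that keeps one dictionary-based index per indexed column."""

    def __init__(self, indexed_columns):
        self.rows = []
        self.indexes = {col: defaultdict(list) for col in indexed_columns}
        self.index_writes = 0  # number of index entries written so far

    def insert(self, row):
        """Store the row, then update every index that covers it."""
        pos = len(self.rows)
        self.rows.append(row)
        for col, index in self.indexes.items():
            index[row[col]].append(pos)
            self.index_writes += 1
        return pos


sample = [
    {"title": "Dune", "author": "Frank Herbert", "subject": "Science fiction"},
    {"title": "Emma", "author": "Jane Austen", "subject": "Fiction"},
    {"title": "Ulysses", "author": "James Joyce", "subject": "Fiction"},
]

plain = IndexedTable(indexed_columns=[])
heavy = IndexedTable(indexed_columns=["title", "author", "subject"])
for row in sample:
    plain.insert(row)
    heavy.insert(row)

print(plain.index_writes, heavy.index_writes)
```

In this model every insert costs one row write plus one index write per index, which is why unused indexes are worth dropping on write-heavy tables.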
Reference material
grok_system_design_interview.pdf