当前位置:网站首页>Take my brother to make a real-time Leaderboard

Take my brother to make a real-time Leaderboard

2022-06-24 07:42:00 Programmer fish skin

Design and implementation of real-time leaderboard system .

Hello everyone , I'm fish skin , Summer vacation is coming , My brother, little ABA, has heard that there are a lot of healthy people in my family , Come and play with me .

As a result, I put out a few small systems that I had developed before , I'm going to do some works with little abado during this period , Learn how to design a programming project . So when he starts school , You can do the project with the teacher more easily .

today , Let's take him to do a very common little function first : Real time ranking of users .

Real time leaderboard

demand

First, describe the requirements , In my programming navigation project (https://www.code-nav.cn) in , In order to encourage everyone to maintain the website together , Users can recommend resources 、 Positive comments 、 Get points by reporting illegal resources .

To further motivate you , The website needs to provide a user ranking , It is divided into Real time total scoreboard Zhou Bang and Monthly list , all Take only before 10 name . All users can view the current leaderboard , And check your own real time Total points ranking , Subsequent administrators can award prizes to users on the list .

The effect is as follows :

Click on My ranking Button , You can see your real-time ranking :

Space is limited , Let's just talk about Real time total scoreboard Design and implementation of .

After listening to the demand , Little Abba has a big smile : It's hard ? Let me design a wave , I'll tell you more .

Design implementation

Let's look at the structure of the database first , All in all 2 Tables : User table and User score table .

The user table stores user information , And the total points of users ( Real time updates ), In other words, the data needed by the total scoreboard can be directly obtained from here , There's no need to calculate .

User table content :

user id

user name

integral (score)

1

Little Abba

10

2

Plum skin

1000

3

petty thief

100

......

100

Li laore

66

If you want to take the front 10 name , Just take out all the users' information first , Just put it in another order , Write SQL Statement query is :

select * from `user` order by score;

Then if you want to get your own total ranking , Just traverse the ordered data found , Just find the subscript where you are , The pseudocode is as follows :

//  Query the list of all users from the database 
list = getAllDataList()
for(i = 0; i < total; i++) {
  //  Find your place 
  if(list[i].id == ' my id') {
    return i + 1;
  }
}

Little Abba is proud of : That's how we're going to get to the top of the table ? Your needs are too simple , Click on the tongue .

I laugh so much : Not bad , The idea of the championship is right , At least know to sort all the data . But if there are too many users ? Like hundreds of thousands of them , You just need to check your overall ranking , Do you still need to sort all the data ?

Little Abba was lost in thought , I thought about it for a long time. , I didn't think of it .

So I suggested : If you want to know your ranking in an exam , Do you just need to know how many people have higher scores than yourself , You don't have to worry about who's in the line, right ?

Little Abba patted his head : Yeah , I just need to find out my score first , Then count the number of users whose scores are greater than mine , I don't know my ranking ?

First use SQL Statement to find out the user's score :

/*  Take only the columns you need  */
select score as myScore
from `user`
where id = " user  id";

And then use SQL The number of statements whose statistics score is greater than that of the user :

select count(*) from `user`
where score > myScore;

Finally, we just need to add 1, It's my ranking ~

Little ABA sighed : It turns out that a little bit of thinking , It saves the performance overhead of extra sorting , take off ~

Think more

Fish skin : Don't take off yet , In fact, for a system with a large number of users , The above plan is enough . Now let's make it more difficult , If the number of users is a little more , For example, 100 million , How to real-time access before 10 What about the name ?

Little Abba : That's true “ Billion points ”, As far as your programming navigation is concerned, there are still 100 million users ?

Fish skin : cut the crap , Dreams are necessary , What if there are 100 million users ? Think about the system !

Little Abba : Not to mention how slow it is to sort 100 million data , It's a question whether we can save it or not ... ah , wait , That's what interviews are all about Top N problem !

Fish skin : Pretty good , I was asked several times during the interview Top N problem , How to find out the front N How about the number ?

Little Abba : I don't understand that at all , Algorithm does not , It's killing .

Fish skin : Actually Top N The core of the problem is to ensure the complexity of space and time , The first thing to consider is that data can be stored in memory , How to calculate faster .

Usually Top N There are several solutions to the problem .

Top N Solution

Sort all

Sort all data directly ( Quick row, etc ), The disadvantage is that you need to load the data into memory at one time .

Partial elimination

In memory to maintain a size of N The container of , Let the rest of the numbers go into the container one by one , And eliminate the minimum value in the container . Finally, the number left in the container is the first N name . The advantage is that it saves memory , The disadvantage is that it's too slow .

Divide and conquer

Divide the data into groups , In the group, select the first one respectively N A group leader , Finally, let these team leaders compete together , Choose the final front N name .

Hash preprocessing

If the data is highly repetitive , Can pass hash The way , Remove a lot of duplicate data . such as 1 In 100 million data , Half of it is 0, Half of it is 1, So take the front 10 Name time , You can eliminate the other half as 0 The data of .

But preprocessing itself also requires time and space , This requires us to have a clear judgment on the repeatability of the data , Or you'll be smart 、 Run counter to one's desire .

Heap

High frequency test points in interview algorithm —— Heap sort , You can take the front N The number makes up the pile of small roots , The top of the pile is always the minimum . Then go through the following numbers , If it is larger than the top, replace the top and adjust the minimum structure . The time complexity and space complexity of the algorithm ( by N, constant ) Is pretty good , So we have to master .

Heap

But which one to choose ? We still need to combine our actual projects and business scenarios to analyze .

Practical solution

Because our database records points , So when the user scale is large , First of all Sub database and sub table , It's usually a horizontal scale , According to certain rules ( such as id) Storing user data rows in multiple tables in batches .

table

And then with big data Map / Reduce The processing mechanism is the same , May adopt Divide and conquer The way Parallel computing The top of each table 10 name (map), After all the calculations , Put it all together to calculate the final front 10 name (reduce).

A big data parallel processing process

In this way , Don't say 1 The hundred million ,2 Billion 、3 The computing model of billion is the same , Just add the horizontal expansion of the machine ~

So I met Top N When it comes to problems , You can answer the above several schemes first , Combined with the specific scene analysis , Divide and conquer and minimum pile are relative to each other The core The point of .

Redis

Last , For the design of real-time leaderboard , I'm sure many friends who have recited the eight part essay will think of using it at the first time Redis Ordered set of zset, It's really a solution , But we should also analyze the pros and cons in combination with the scene , Don't answer in seconds .

Using memory based Redis zset It's faster , And it naturally supports sorting 、 Easy to use . But when the amount of data is large, it also faces data update 、 maintain 、 Sync 、 Persistent storage and so on , And for our low real-time demand , Some of them are overqualified. Ha ha .

zset data structure

I'm fish skin , It's not easy to write , give the thumbs-up It's still a request , I wish you all the best 、 Make a fortune 、 Universiade .

Finally, I'll send you some more Help me get to the big factory offer Learning resources , Video tutorial + exercises + answer + Source code 、 Programming books 、 Big factory surface 、 Practical projects, etc .

The way : ran , leave 6T Resources for !

How I started from scratch through self-study , Get Tencent 、 Byte and other big factories offer Of , You can read this article , No more confusion !

The way : I studied computer for four years , Mutual encouragement !

原网站

版权声明
本文为[Programmer fish skin]所创,转载请带上原文链接,感谢
https://yzsam.com/2021/06/20210628225726778Z.html