
Nanny-Level Teaching! Mastering Time Complexity and Space Complexity!

2022-06-23 17:47:00 Programming Wen Qing, Li goudan

Hello everyone, I'm Egg.

Data structures and algorithms, the reigning king of "programming: from getting started to giving up," is the sacred village belle that many beginners try to court. You turn into a devoted simp, work yourself to the bone, and after all that devotion you end up with nothing.

In college I had zero programming ability and thought computer science was a major for salted fish who played games all day. Then I went off to ACM competitions and, by coasting and clinging to my teammates' thighs, became the freeloading player holding a handful of gold, silver, bronze, and scrap-metal medals. Thinking back on the days I spent desperately kneeling before data structures and algorithms, I can still cry you a bucket of bitter tears.

So that you little stinkers don't have to suffer the way I did, I've decided to study data structures and algorithms together with you. I'll use the simplest possible approach: from shallow to deep, from concept to practice, step by step. It may be a long journey, but I hope that along the way you come to like them, and discover the interesting souls beneath their cold exteriors.


For the first lesson of data structures and algorithms, I will always choose complexity analysis. In my opinion, it is the most important topic in the whole subject, and I will not accept any rebuttal.

Complexity analysis mainly means time complexity and space complexity, and this article focuses on exactly those two. Enough nonsense; melon seeds ready in the front row, Egg's class is officially open.

Complexity analysis

As I just said, in my humble opinion complexity analysis is the most important topic in data structures and algorithms. It's no exaggeration to say it is the very core of the subject. Learn it and you're through the door; fail to learn it and you'll never even find the door.

Why is complexity analysis so important?

That story starts with Pangu... er, starts with data structures and algorithms themselves.

I daydream a lot, always fantasizing about coasting along as a salted fish while making big money, ideally lying flat while the money flows straight into my account. Data structures and algorithms don't dream as big as yours truly, but they came into being for a similar wish: to solve problems with less time and less storage.

And how do we judge "less time and less storage"? That is exactly what complexity analysis was born for.

Egg: So why is complexity analysis so important? You know, right?
Stinky: Nope, no idea.
Egg: What? You still don't know???
Stinky: Yeah, yeah.
Egg: ...

Since you sincerely don't know, I'll mercifully offer a not-quite-appropriate analogy.

If data structures and algorithms are martial arts moves, then complexity analysis is the inner technique that powers them.

If you only learn how to use data structures and algorithms but skip complexity analysis, it's like digging up the village's most precious King's Fist manual from under the second roof tile on the right side of Old Wang's house next door, only to find the manual contains the moves but not the inner technique: your King's Fist ends up no better than random turtle-fist flailing. Only after you master the inner technique can you shake your tiger's body, radiate kingly dominance, and scare off the pug that Bachelor Li keeps at the village entrance.

Stinky: Wow, awesome, absolutely awesome. Fine, you're right. But there's still no need to learn it.
Egg: ???
Stinky: There are plenty of tools and packages now. Just run the code and you can easily see how much time it takes and how much memory it uses. Isn't that an easy way to compare algorithm efficiency?
Egg: ...

Bah bah bah, too simple! You're only seeing the surface of things!

The approach you're describing is called post-hoc statistics.

Simply put: you write the algorithm code and prepare test data in advance, run it on a computer, and judge the algorithm's efficiency by the final running time, where running time means ordinary wall-clock time.

I won't even use the argument "what if the algorithm code you went to all that trouble to write is itself a terrible solution?" to refute you. The post-hoc statistics method has plenty of flaws of its own, and it is not a useful metric for us:

First, post-hoc statistics depends too heavily on the computer's hardware and software. The same code runs faster on a Core i7 than on a Core i5, never mind differences in operating system, programming language, and other software factors. Even on the very same computer with everything else identical, memory occupancy or CPU utilization will still make the running time vary.

Second, post-hoc statistics depends too heavily on the size of the test data set. Take the same sorting task: with 5 or 10 numbers, even the most garbage sorting algorithm looks as fast as a Ferrari. With 100,000 or 1,000,000 numbers, the gap between algorithms becomes obvious. And even with the same 100,000 or 1,000,000 numbers, sorted versus shuffled input gives different times yet again.

So here comes the question: how large should the test data set be? What ordering of the data is appropriate?

Can't answer, can you?

Clearly, we need a way to estimate an algorithm's efficiency without interference from external factors like machine performance and data scale: a metric for judging algorithms on their own merits. Complexity analysis was born for exactly this. That is why we should, indeed must, learn complexity analysis!

Time complexity

Time complexity describes the running time of an algorithm.

Among different algorithms that solve the same problem, the shorter the running time, the more efficient the algorithm; conversely, the longer the running time, the less efficient it is.

So how do we estimate time complexity?

After pulling out the last hairs on their heads, the big shots discovered that when you use running time to describe how fast an algorithm is, what really matters is the total number of steps the algorithm executes.

Since this is only an estimate, we can assume every line of code takes the same time to run, one Btime. Then the algorithm's total running time = (total number of lines executed) × Btime.

Let's look at this simple piece of code.

def dogeggs_sum(n):
    sum = 0

    for dogegg in range(n):
        sum += dogegg
    return sum

Under the assumption above, what is the total running time of this summation code?

Line 2 runs once, needing 1 Btime. Lines 4 and 5 each run n times, so each needs n × Btime. The total running time is therefore (1 + 2n) × Btime.

We generally use the function T to denote a piece of code's total running time, so the total above can be written T(n) = (1 + 2n) × Btime. In plain words: "for a data set of size n, the total number of steps is (1 + 2n), and the algorithm's execution time is T(n)."

From this formula we can see that T(n) is proportional to the total number of steps. Spotting this law really matters, because it reveals a trend: a relationship between data set size and running time.
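
Before Big O shows up, here is a quick way to convince yourself of that trend. A minimal sketch (my own illustration, not part of the original lesson): instrument the summation function with a step counter and check that the count always comes out to exactly 1 + 2n.

# Minimal sketch (illustration only): count executed lines to verify
# that the total step count of the summation code is exactly 1 + 2n.
def dogeggs_sum_counted(n):
    steps = 0
    sum = 0                    # runs once              -> 1 step
    steps += 1
    for dogegg in range(n):    # loop line runs n times -> n steps
        steps += 1
        sum += dogegg          # body runs n times      -> n steps
        steps += 1
    return sum, steps

for n in (10, 100, 1000):
    _, steps = dogeggs_sum_counted(n)
    print(n, steps, 1 + 2 * n)   # the last two columns always match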

Everybody get ready: here comes our old friend, Big O!

Big O notation

Big O notation expresses how fast an algorithm is.

But "how fast" here is not about the code's real running time. It represents a trend: how the code's running time grows as the data set gets larger. It is usually written O(f(n)).

With that, the earlier formula becomes T(n) = O(f(n)), where f(n) is the total number of steps the code executes, also called the operation count.

With n as the data set size, you can take n = 1, 10, 100, 1000 or larger. And here you'll notice something very interesting: as the data set grows bigger and bigger, some parts of the formula become "irrelevant."

Take the earlier code as an example. When n = 1000, 1 + 2n = 2001; when n = 10000, 1 + 2n = 20001. As n keeps growing, the constant 1 and the coefficient 2 barely register in the final result; that is, they have little effect on the trend.

So for our code sample, the final answer is T(n) = O(n).
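
If you want to watch the constant and the coefficient lose their "sense of existence" numerically, here's a tiny check (again my own illustration): as n grows, f(n) = 1 + 2n and plain n grow at the same linear pace, and their ratio settles toward the coefficient 2, which Big O throws away anyway.

# Tiny check (illustration only): the constant 1 and coefficient 2
# stop mattering to the growth trend of f(n) = 1 + 2n.
for n in (10, 1000, 100000, 10000000):
    f = 1 + 2 * n
    print(n, f, f / n)   # f / n creeps toward 2: the growth is linear in n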

Next, let's work through two pieces of code to practice time complexity analysis.

Time complexity analysis

Code 1

def dogeggs_sum(n):
    sum = 0

    for dogegg in range(n):
        for i in range(n):
            sum += dogegg * i
    return sum

Let me walk you through this code bit by bit.

Line 2 runs once. Line 4 runs n times. Lines 5 and 6 each run n² times. So the total number of steps is f(n) = 1 + n + 2n².

When n is 5, f(n) = 1 + 5 + 2 × 25 = 56. When n is 10000, f(n) = 1 + 10000 + 2 × 100000000. And when n gets even bigger?

By then it's obvious that n² plays the decisive role, while the constant 1, the linear term n, and the coefficient 2 barely affect the final result (that is, the trend). So we can simply ignore them and keep only the "dominant" part of the step count: f(n) = n².

So the code's running time is T(n) = O(f(n)) = O(n²).
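
The same step-counting trick from before confirms this empirically. A small sketch of my own (dogeggs_sum_counted2 is a made-up name for this check):

# Quick check (illustration only): the step count of Code 1 is 1 + n + 2n².
def dogeggs_sum_counted2(n):
    steps = 1                  # line 2 (sum = 0) runs once
    for dogegg in range(n):
        steps += 1             # line 4 (outer loop), n times
        for i in range(n):
            steps += 2         # lines 5 and 6 (inner loop + body), n² times each
    return steps

for n in (5, 10, 100):
    print(n, dogeggs_sum_counted2(n), 1 + n + 2 * n * n)   # columns match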

Code 2

def dogeggs_sum (n):
    sum1 = 0
    for dogegg1 in range(n):
        sum1 += dogegg1
​
    sum2 = 0
    for dogegg2 in range(n):
        for i in range(n):
            sum2 += dogegg2 * i
​
    sum3 = 0
    for dogegg3 in range(n):
        for i in range(n):
            for j in range(n):
                sum3 += dogegg3 * i * j
​
    return sum1 + sum2 + sum3

The code above computes a sum in three parts. After what we've learned, it should be easy to see that the first part has time complexity O(n), the second O(n²), and the third O(n³).

Strictly speaking, T(n) = O(n) + O(n²) + O(n³). Keeping only the "dominant" part, the two little brothers in front get passed over directly, and the final answer is T(n) = O(n³).

From these examples, clever readers will surely have noticed: for time complexity analysis, just find the "dominant" part of the code. The dominant part is the one with the highest complexity, that is, the order of n of the most frequently executed piece.

The rest is practice: consciously think more, analyze more, and you'll become as steady (and as handsome) as yours truly.

Common time complexity

In the course of learning algorithms you'll encounter all sorts of time complexities, but even if your code pulls off the Monkey King's seventy-two transformations, it can hardly escape the common ones below.

Time complexity                | Order               | f(n) example
-------------------------------|---------------------|---------------
Constant complexity            | O(1)                | 1
Logarithmic complexity         | O(logn)             | logn + 1
Linear complexity              | O(n)                | n + 1
Linear-logarithmic complexity  | O(nlogn)            | nlogn + 1
k-th power complexity          | O(n²), O(n³), ...   | n² + n + 1
Exponential complexity         | O(2ⁿ)               | 2ⁿ + 1
Factorial complexity           | O(n!)               | n! + 1

The time complexities in the table above grow from top to bottom. O(n) and O(n²) were covered earlier. O(2ⁿ) and O(n!) are so inefficient that, regrettably, I'll only mention them in passing later and won't dwell on them here.

Let's focus on the remaining, most common time complexities.

O(1)

O(1) is constant time complexity.

There's a concept to grasp here: it's not only code that executes exactly 1 step that counts as O(1). As long as the step count is a constant, complexities like O(2), O(3), ..., O(100000) are all written as O(1).

n = 10000
res = n / 2 
print(res)

Take the code above: it runs 3 lines in total, but its time complexity is written O(1), not O(3).

Because no matter how big n is, each line of code executes exactly once; the execution time does not change as n changes.

O(logn)

O(logn) is logarithmic time complexity. First, let's look at a piece of code:

dogegg = 1

while dogegg < n:
    dogegg = dogegg * 2

As before, let's find the "dominant" part. Here it's line 4; once we know its time complexity, we know the total time complexity.

It's very simple. The code above translates to: starting from dogegg = 1, how many times do we multiply by 2 before dogegg ≥ n?

Suppose we need x multiplications. Then the question is equivalent to solving:

2ˣ = n

that is:

x = log₂n

Therefore, the time complexity of the code above is O(log₂n).

However, for logarithmic complexity it doesn't matter whether the base is 2, 3, or 10: we simply write O(logn). Why is that?

This comes from the change-of-base formula for logarithms.

By the change-of-base formula, log₂n can be rewritten as:

log₂n = log₂10 × log₁₀n

For time complexity, that means:

O(log₂n) = O(log₂10 × log₁₀n) = O(log₁₀n)

because log₂10 is just a constant coefficient, and we already know constant coefficients don't affect the trend. So for logarithmic time complexity we ignore the base and simply write O(logn).
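
If you'd like to verify that the doubling loop really runs about log₂n times, here's a small check of my own (the helper name doubling_steps is made up for this example):

import math

# Sketch (illustration only): count the doublings the while-loop performs
# and compare with ceil(log2(n)); they agree for every n > 1.
def doubling_steps(n):
    dogegg = 1
    steps = 0
    while dogegg < n:
        dogegg = dogegg * 2
        steps += 1
    return steps

for n in (2, 10, 1000, 10**6):
    print(n, doubling_steps(n), math.ceil(math.log2(n)))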

O(nlogn)

O(nlogn) is a genuinely common time complexity. For instance, the quick sort and merge sort you'll learn later both run in O(nlogn).

Put simply, it's just a logn wrapped in an extra for loop, like the following code:

def stupid_cnt(n):
    cnt = 0
    for i in range(n):
        dogegg = 1

        while dogegg < n:
            dogegg = dogegg * 2
            cnt += 1
    return cnt

Of course, most people wouldn't write code as stupid as stupid_cnt above; it's just an example. I'm named Dog Egg, but don't take me for a stupid dog.
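
Still, stupid_cnt is handy for checking the nlogn claim empirically. A usage sketch of my own, assuming the stupid_cnt defined above:

import math

# cnt should equal n * ceil(log2(n)): the outer for contributes the n,
# the inner doubling loop contributes the log.
for n in (10, 100, 1000):
    print(n, stupid_cnt(n), n * math.ceil(math.log2(n)))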

Best-case, worst-case, and average-case time complexity

Besides the size of the data set, the "specific shape of the data" also affects an algorithm's running time.

Consider this piece of code:

def find_word(lst, word):

    flag = -1
    for i in range(len(lst)):
        if lst[i] == word:
            flag = i
            break

    return flag

This simple code finds where the variable word appears in the list lst. I'll use it to explain what "the specific shape of the data" means.

The variable word may appear at any position in the list lst. Suppose lst = ['a', 'b', 'c', 'd']:

  • When word = 'a', it is exactly the 1st element of lst, and nothing after it needs to be traversed. The time complexity in this case is O(1).
  • When word = 'd' or word = 'e', the whole list gets traversed either way. The time complexity in these cases is O(n).

So with different specifics of the data, the code's time complexity differs.

To cover the different situations, we have 3 concepts: best-case time complexity, worst-case time complexity, and average-case time complexity.

Best case time complexity

Best case means the code's time complexity under the most favorable input. In our example, that's when word happens to be the 1st element of lst; the O(1) there is this code's best-case time complexity.

Worst case time complexity

Worst case means the code's time complexity under the least favorable input. In our example, that's when word is the last element of lst, or doesn't appear in lst at all; the O(n) there is this code's worst-case time complexity.

Average time complexity

Most of the time, execution falls somewhere between the best and the worst case, hence the average-case time complexity.

How do we compute the average-case time complexity? It's slightly more involved and needs a little probability theory.

I'll stick with the example above. Since this is just a quick introduction, I'll make some rather casual assumptions to keep the math easy:

  • Broadly, there are two cases for the variable word relative to the list lst: it's in the list, or it isn't. Assume each of these two cases has probability 1/2.
  • If word is in lst, it can sit at any of the n positions, 0 through n-1. Assume every position is equally likely, each with probability 1/n.

Now that we have the probability of each case (that's the weight), the average number of steps works out to:

1 × 1/(2n) + 2 × 1/(2n) + ... + n × 1/(2n) + n × 1/2 = (3n + 1)/4

(finding word at the i-th position costs i steps and happens with probability 1/(2n); not finding it costs n steps and happens with probability 1/2).

Look familiar? This is the weighted average we all learned in school.

What, never learned it? No big deal. A weighted average multiplies each value by its corresponding weight, sums the results, and divides by the total weight.

So, dropping the coefficients and constants as usual, the final average-case time complexity is:

T(n) = O((3n + 1)/4) = O(n)
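
If probability theory isn't your thing, you can simply simulate it. The sketch below is my own illustration under the same casual assumptions (present with probability 1/2, position uniform when present); the measured average number of comparisons should hover around (3n + 1)/4.

import random

# Simulation sketch (illustration only) of the average-case analysis above.
def average_comparisons(n, trials=100_000):
    total = 0
    for _ in range(trials):
        if random.random() < 0.5:          # word is in the list ...
            total += random.randint(1, n)  # ... at a uniform position i, costing i comparisons
        else:                              # word is absent ...
            total += n                     # ... costing a full traversal of n comparisons
    return total / trials

n = 100
print(average_comparisons(n), (3 * n + 1) / 4)   # both come out around 75.25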

Stinky: Best case, worst case, now average case... with so many, which one do we actually use?
Egg: You even have to ask? The worst-case time complexity, of course!

The best case reflects an ideal situation. Pies don't fall from the sky every day, so it has no reference value.

The average case reflects the algorithm's overall behavior, but "overall" is so general it's almost like saying nothing, so it has little reference value either.

The worst case plans for the worst in everything. The worst-case result provides a guarantee, a floor: the code's running time can be this bad, but never worse.

So there's no need to agonize over all the different situations. When we compute time complexity, we mean the worst-case time complexity.

Space complexity

Compared with time complexity, space complexity is relatively simple, and there isn't much to master.

Space complexity is just like time complexity in that it also reflects a trend, but this trend is in the memory occupied by temporary variables while the code runs.

Stinky: Why "temporary"?
Egg: That starts with how code actually runs on a computer.

When code runs on a computer, the storage it occupies mainly divides into 3 parts:

  • space occupied by the code itself
  • space occupied by the input data
  • space occupied by temporary variables

The first two parts are unrelated to the code's own performance, so when measuring a piece of code's space complexity, we care only about the memory temporarily occupied during its run.

Space complexity is written S(n), and the notation works just like time complexity: S(n) = O(f(n)).

As before, n is the data set size, and f(n) here is a function describing the storage occupied at scale n.

Space complexity analysis

Let's use one simple piece of code to learn how to analyze space complexity.

def dogeggs_sum(lst):
    sum = 0
    for i in range(len(lst)):
        sum += lst[i]

    return sum

The code above sums all the elements of the list lst. Following the rule above, we count only the memory occupied by temporary variables.

The formal parameter lst doesn't count (it's input data), which leaves only the temporary variables sum and i. sum stores the running total, a constant amount of space; i stores an element position, also a constant amount. The space they need has nothing to do with the scale n, so the code's space complexity is S(n) = O(1).

Common space complexity

The common space complexities are O(1), O(n), and O(n²). O(1) was covered in the last section. Others like O(logn) exist too, but you'll rarely encounter them, so I won't be wordy here.

O(n)

def create_lst(n):
    lst = []
    for i in range(n):
        lst.append(i)

    return lst

The silly code above has two temporary variables, lst and i.

lst starts as an empty list whose memory grows with each pass of the for loop, up to a maximum of n elements, so lst's space complexity is O(n). i stores an element position, a constant amount unrelated to the scale n. So the overall space complexity of this code is O(n).
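
You can roughly watch both behaviors with tracemalloc from the standard library. The sketch below is my own illustration (the helpers peak_kib and sum_to_n are made up for it, and the real numbers are noisy because of interpreter overhead): the O(1)-space summation stays flat as n grows, while create_lst grows with n.

import tracemalloc

# Rough measurement sketch (illustration only): peak temporary memory in KiB.
def peak_kib(fn, n):
    tracemalloc.start()
    fn(n)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak / 1024

def sum_to_n(n):        # O(1) extra space: just a couple of scalars
    total = 0
    for i in range(n):
        total += i
    return total

for n in (1000, 10000, 100000):
    print(n, peak_kib(sum_to_n, n), peak_kib(create_lst, n))   # flat vs. growing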

O(n²)

def create_lst(n):
    lst1 = []
    for i in range(n):
        lst2 = []
        for j in range(n):
            lst2.append(j)
        lst1.append(lst2)
    return lst1

By the same analysis, the silly quadratic code above builds a two-dimensional list: the one-dimensional lst1 holds n entries, and each entry is a list lst2 of n elements, n² elements in total. So the final space complexity of this code is O(n²).
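
A quick sanity check of my own that the structure really holds n² elements:

# lst1 holds n rows; each row holds n elements; n * n in total.
lst = create_lst(100)
print(len(lst), len(lst[0]), sum(len(row) for row in lst))   # 100 100 10000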


And that wraps up complexity analysis. As long as you little stinkers read this article carefully, you'll come away with a solid basic grasp of it. Complexity analysis itself isn't hard; just practice, consciously estimate the complexity of the code you write, and the feel for it will come.

This is the first article on this official account, and finishing it really wasn't easy. I hope it's a good start. Writing is hard work; little stinkers, please show lots of support.

I'm Egg. See you next time ~

Copyright notice: this article was created by [Programming Wen Qing, Li goudan]. Please include the original link when reposting: https://yzsam.com/2022/01/202201051847395361.html