[C language] An in-depth analysis of how data is stored in memory

2022-06-26 05:39:00 【Superman can't fly Ke】

Table of contents

Preface
1️⃣ Data type introduction
2️⃣ The storage of data in memory
Summary

Preface

Hello, this is Superman can't fly Ke. On a day this hot, the best place to study is an air-conditioned room, and I have been hooked on studying lately! So today I want to sum up what I have learned over the past few days and share some of my own experience.

This article focuses on how data is stored in memory. It uses C to analyze in depth how data is laid out in memory, with examples to help you understand and master the topic. Now let me lead you through the computer's memory gate to cultivate our "internal skills"!

1️⃣ Data type introduction

In C, many types are defined to represent the different things we meet in real life. Besides the familiar built-in types, there are also constructed types that we define ourselves. Different types are stored differently in memory: mainly they occupy different amounts of space, and the same bytes may also be interpreted differently depending on the type.
In summary, a type in C means two things:
It determines how much memory is allocated when the type is used (the amount of space taken up)
It determines how that memory is interpreted

Let's sort out the C types!

1. The integer family

char (character type)
signed char
unsigned char
Size: 1 byte
Why are character types included in the integer family?
Because a character is essentially its ASCII code value, and that value is an integer, so char belongs to the integer family
short (short integer)
signed short [int]
unsigned short [int]
Size: 2 bytes on common platforms (the standard only guarantees at least 16 bits)
int
signed int
unsigned int
Size: 4 bytes on common platforms (the standard only guarantees at least 16 bits)
long (long integer)
signed long [int]
unsigned long [int]
Size: 4 or 8 bytes depending on the platform (at least 32 bits)
The C99 standard also added the long long integer type, which is at least 64 bits (8 bytes on common platforms)

Two points to note here:
unsigned marks an unsigned type (C provides unsigned types for quantities that are never negative in real life, such as height and weight);
When we create a variable of type short, int, or long, it is signed by default (for example, int means signed int);
When we create a variable of type char, however, it is not necessarily a signed char. Whether plain char is signed is implementation-defined, so different compilers have different defaults.
When creating a variable, use an unsigned type if the value can never be negative; otherwise use a signed type.

2. Floating point family

float (lower precision)
double (higher precision)
Floating-point numbers are generally used to represent numbers with a fractional part

3. Constructed types

Array types
Structure types: struct
Enumeration types: enum
Union types: union

4. Pointer types

int* pi
char* pc
float* pf
double* pd
void* pv
…

2️⃣ The storage of data in memory

Now that we know the data types, let's explore how data is stored in memory!

We know that creating a variable allocates space in memory, and that the size of the space is determined by the variable's type. So in what form is the data stored in that space? How do the storage methods differ between types? Next we discuss the storage of integer data and of floating-point data in turn.

1. The storage of integers in memory

Sign-magnitude, ones' complement, and two's complement (original code, inverse code, complement code)

How are integers stored in memory?
To find out, we first need to understand three concepts:
sign-magnitude, ones' complement, and two's complement.
A computer has three binary representations for signed integers: sign-magnitude, ones' complement, and two's complement. Each consists of a sign bit and value bits. The sign bit is the highest bit of the binary sequence, with 0 meaning positive and 1 meaning negative. For a positive integer, all three representations are identical. For a negative integer they differ, as follows:
Sign-magnitude: write the value directly in binary, with the sign bit set according to the sign;
Ones' complement: keep the sign bit of the sign-magnitude form and invert every other bit;
Two's complement: add 1 to the ones' complement.
For example, for the int value -6 the three forms are:
sign-magnitude: 10000000000000000000000000000110
ones' complement: 11111111111111111111111111111001
two's complement: 11111111111111111111111111111010

In fact, integer data is stored in memory in two's complement form. Take -6 above: if we create an int variable, int a = -6, the program first allocates space in memory according to the variable's type (here the type is int, so four bytes are allocated), and then stores the two's complement of the initial value in that space. The same applies when assigning to a variable, except that the space already exists.

Memory is addressed in units of one byte, and space is allocated according to the type so that none is wasted. One byte is 8 bits, so each of the four bytes holds 8 bits of the value, 32 bits in total. Note that local variables are created in the stack area of memory, while global and static variables are created in the static area.

To make the later analysis easier, we convert the binary sequence to hexadecimal (one hexadecimal digit corresponds to four binary bits): the two's complement of -6 is 0xFFFFFFFA.

So here comes the question: why is integer data stored in memory as two's complement?

In a computer system, integer values are represented and stored in two's complement. The reasons are:
With two's complement, the sign bit and the value bits can be handled uniformly
Addition and subtraction can also be handled uniformly (the CPU only has an adder)
In addition, converting between two's complement and sign-magnitude uses the same procedure in both directions, so no extra hardware circuit is needed.
Explanation
The sign bit, which marks a value as positive or negative, takes part in arithmetic just like the value bits, so the two can be processed together;
The CPU only has an adder, so how is subtraction done? To compute 1-1, the CPU converts it to 1+(-1). If we add the sign-magnitude forms directly:
1+(-1)=00000000000000000000000000000001+10000000000000000000000000000001=10000000000000000000000000000010
The result is -2 in sign-magnitude, not 0, which is not what we want. If we use two's complement instead:
1+(-1)=00000000000000000000000000000001+11111111111111111111111111111111=100000000000000000000000000000000
The carry makes the result 33 bits long, which overflows the 32 bits of an int. Discarding the highest bit leaves 0, which is the result we want. This shows the advantage of storing data in two's complement.
Another neat property of two's complement is that converting to it and back uses the same procedure: keep the sign bit, invert the other bits, and add 1.
Take -1 as an example:
sign-magnitude 10000000000000000000000000000001 -> invert the value bits: 11111111111111111111111111111110 -> add 1: 11111111111111111111111111111111 (two's complement)
two's complement 11111111111111111111111111111111 -> invert the value bits: 10000000000000000000000000000000 -> add 1: 10000000000000000000000000000001 (sign-magnitude)

This verifies that the conversion between two's complement and sign-magnitude is the same procedure in both directions.

Introduction to big-endian and little-endian

Now let's look at another question.
In the diagrams above, we habitually drew the bytes of the value into memory in order. But in what order is data actually stored in memory? Is it stored in the order we drew?
To answer this we need one concept: big-endian and little-endian byte order. Once you understand it, the mystery is solved.

What are big-endian and little-endian?

Big-endian and little-endian are two modes of storing multi-byte data in memory:
Big-endian (storage) mode: the low-order bytes of the data are stored at high memory addresses, and the high-order bytes at low memory addresses;
Little-endian (storage) mode: the low-order bytes of the data are stored at low memory addresses, and the high-order bytes at high memory addresses.
The unit here is the byte. The order in which the low and high bytes are arranged is called big-endian or little-endian byte order.

Why are there big-endian and little-endian?

C has many types, such as short, int, long..., and they come in different sizes; for example, short is typically 2 bytes and int 4 bytes. Storing a multi-byte value in memory inevitably raises the question of byte order. Many orderings are possible, but two are used in practice: big-endian and little-endian. Which one a given computer uses depends on the hardware.

Take the hexadecimal value 0x11223344 as an example. Its bytes are laid out in memory as follows (addresses increase from left to right):
Big-endian: 11 22 33 44
Little-endian: 44 33 22 11

As you can see, the two storage modes are exact opposites. Big-endian storage matches the order in which people read numbers, while little-endian storage better suits the way the computer operates.

To deepen our understanding of byte order, let's look at an example:

Design a small program to determine the byte order of the current machine

Straight to the code

#include <stdio.h>
int check_key(void)
{
	int a = 1; // create variable a
	char* p = (char*)&a;
	// Take the address of a (the address of an int variable is the address of the lowest-addressed byte of its four bytes),
	// cast it to char*, and store it in the pointer variable p
	return *p; // return the byte p points to: 1 means little-endian, 0 means big-endian
}
int main(void)
{
	int ret = check_key(); // judge the byte order from the function's return value
	if (ret == 1)
	{
		printf("Little-endian\n");
	}
	else
	{
		printf("Big-endian\n");
	}
	return 0;
}

Observing memory in the debugger (the bytes are shown in hexadecimal), we can see that a is indeed stored in little-endian mode: its bytes are 01 00 00 00. So when we convert the pointer to char* and read the first byte, we get 01, the lowest byte of 1. With big-endian storage we would get 00.

Integer promotion, arithmetic conversion, and examples

Having mastered how data is stored in memory, we know how data is "put" into memory. Where there is "put", there is also "take": what happens when we read integer data back from memory and use it? To understand this, we need two concepts: integer promotion and arithmetic conversion.

1. Integer promotion

What is integer promotion?

Integer arithmetic in C is always performed with at least the precision of the default integer type. To achieve this precision, character and short integer operands in an expression are converted to ordinary integers before use. This conversion is called integer promotion.

The significance of integer promotion

Integer operations in an expression are carried out in the corresponding arithmetic unit of the CPU. The operand width of the CPU's integer arithmetic unit (ALU) is generally the byte length of int, which is also the length of the CPU's general-purpose registers. Therefore, even the addition of two char values is converted to the standard operand length of the CPU when it is executed. It is difficult for a general-purpose CPU to add two 8-bit bytes directly (although the machine instruction set may include such byte addition instructions). So every integer value in an expression whose length may be less than that of int must first be converted to int or unsigned int before it is sent to the CPU for the operation.

When does integer promotion occur?

Operands of type char or short undergo integer promotion before taking part in an expression
When char or short data is passed to printf and printed with %d or %u, integer promotion occurs first, because the variadic arguments of printf are promoted to int (with %u the promoted value is then interpreted as unsigned int)

How is integer promotion carried out?

* Promotion of a negative number: char c1 = -1; the binary (two's complement) form of c1 has only 8 bits: 11111111. Because char is a signed char here, the high bits are filled with the sign bit during promotion, that is 1, so the result of the promotion is:
11111111111111111111111111111111
* Promotion of a positive number: char c2 = 1; the binary (two's complement) form of c2 has only 8 bits: 00000001. Because char is a signed char here, the high bits are filled with the sign bit during promotion, that is 0, so the result of the promotion is:
00000000000000000000000000000001
* Promotion of an unsigned value: the high bits are filled with 0

Two examples

Example 1:

#include <stdio.h>
int main()
{
	char a = 1;
	char b = -1;
	char c = a + b;
	//00000001 -> 00000000000000000000000000000001 a
	//11111111 -> 11111111111111111111111111111111 b
	//a+b == 00000000000000000000000000000000
	//00000000 -> c
	return 0;
}

The values of a and b are promoted to ordinary integers and then added. The result is also an ordinary integer. To store it in c it must first be truncated, and the value stored in c is 0.
Truncation: when a value of a larger type is assigned to a smaller type, the smaller type does not have room for all of it, so truncation occurs. The rule is: take the low-order bytes of the larger type and store them in the smaller type. (Here an int is assigned to a char, so the low 8 bits of the int are given to the char.)

Example 2 ：

#include <stdio.h>
int main()
{
	char a = 0xb6;
	short b = 0xb600; 
	int c = 0xb6000000; 
	//
	if (a == 0xb6)
		printf("a"); 
	if (b == 0xb600)
		printf("b"); 
	if (c == 0xb6000000)
		printf("c"); 
	return 0;
}

When a variable is an operand of a relational or logical operator, that is also an expression, so integer promotion can occur there too.
Here a and b undergo integer promotion, while c does not need it. After promotion a and b become negative numbers, so the expressions a==0xb6 and b==0xb600 are false and evaluate to 0. c is not promoted, so the expression c==0xb6000000 is true.
The output of the program is: c

2. Arithmetic conversion

If the operands of an operator have different types, the operation cannot be carried out until one operand is converted to the type of the other. The following hierarchy is called the usual arithmetic conversions (listed from highest to lowest):

long double
double
float
unsigned long int
long int
unsigned int
int

If an operand's type ranks lower in the list above, it is first converted to the type of the other operand, and then the operation is performed.

For example

#include <stdio.h>
int main()
{
	int a = -4;
	unsigned int b = 8;
	printf("%d", a + b);
	return 0;
}

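Here a is converted to unsigned int before the addition. The bit pattern of -4 is 11111111111111111111111111111100, and adding 8 wraps around to 4, so the program prints 4.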

Note that arithmetic conversions must be used sensibly, otherwise precision may be lost:

float f = 3.14;
int num = f;// implicit conversion, precision is lost (num becomes 3)

3. Examples

Having mastered the "internal skills" of integer promotion and arithmetic conversion, let's practice a few problems to consolidate them

Problem 1

// What is the output?
#include <stdio.h>
int main()
{
	char a = -1;
	signed char b = -1;
	unsigned char c = -1;
	printf("a=%d,b=%d,c=%d", a, b, c);
	return 0;
}

Running result (with a signed plain char): a=-1,b=-1,c=255

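Problem 2

// What is the output?
#include <stdio.h>
int main()
{
	char a = -128;
	printf("%u\n", a);
	return 0;
}

Running result (with a signed 8-bit char and a 32-bit int): 4294967168
The 8 bits 10000000 are promoted with the sign bit to 11111111111111111111111110000000, and %u prints that pattern as an unsigned number.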

Problem 3

// What is the output?
#include <stdio.h>
int main()
{
	char a = 128;
	printf("%u\n", a);
	return 0;
}

Running result (with a signed 8-bit char and a 32-bit int): 4294967168

The result is the same as in the second problem: 128 does not fit in a signed char, so a holds the bit pattern 10000000, which is -128.

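Problem 4

// What is the output?
#include <stdio.h>
int main()
{
	int i = -20;
	unsigned int j = 10;
	printf("%d\n", i + j);
	return 0;
}

Running result: -10
i is converted to unsigned int before the addition. The sum has the bit pattern 11111111111111111111111111110110, which %d prints as -10.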

Problem 5

#include <stdio.h>
// What is the result?
int main()
{
	unsigned int i;
	for (i = 9; i >= 0; i--)
	{
		printf("%u\n", i);
	}
}

Because i is of type unsigned int, it can never be less than 0, so the condition i >= 0 is always true. After printing 0, i-- wraps around to the largest unsigned int value (4294967295 for a 32-bit unsigned int), and the program loops forever.

Problem 6

// What is the output?
#include <stdio.h>
#include <string.h>
int main()
{
	char a[1000];
	int i;
	for (i = 0; i < 1000; i++)
	{
		a[i] = -1 - i;
	}
	printf("%zu", strlen(a)); // strlen returns size_t, which is printed with %zu
	return 0;
}

To understand this problem, we first need to know the range of values a char can hold.

A char occupies 1 byte, that is 8 bits, and 8 bits can represent 2^8 = 256 different values. Picture these values arranged on a circle: starting from 00000000 (0) and adding 1 each step, we go up through 01111111 (127), then 10000000 (-128), up through 11111111 (-1), and adding 1 once more wraps back around to 0. There is one special case: converting the two's complement pattern 10000000 to sign-magnitude would give 100000000, which is one bit too many for a char. Therefore the pattern 10000000 is defined to represent -128. In summary, the range of a signed char is -128~127.

With this in mind, let's look at the problem again: a[i] takes the values -1, -2, ..., -128, then wraps around to 127, 126, ..., 1, 0. The first 0 is a[255], and strlen counts the characters before the first '\0', so the output is 255.

To find the ranges of the other integer types, consult the limits.h header file. It is also worth remembering char's sibling unsigned char, whose range is 0~255.

Here is the code from the header file (this copy comes from the Microsoft Visual C++ runtime)

#pragma once
#define _INC_LIMITS

#include <vcruntime.h>

#pragma warning(push)
#pragma warning(disable: _VCRUNTIME_DISABLED_WARNINGS)

_CRT_BEGIN_C_HEADER

#define CHAR_BIT 8
#define SCHAR_MIN (-128)
#define SCHAR_MAX 127
#define UCHAR_MAX 0xff

#ifndef _CHAR_UNSIGNED
    #define CHAR_MIN SCHAR_MIN
    #define CHAR_MAX SCHAR_MAX
#else
    #define CHAR_MIN 0
    #define CHAR_MAX UCHAR_MAX
#endif

#define MB_LEN_MAX 5
#define SHRT_MIN (-32768)
#define SHRT_MAX 32767
#define USHRT_MAX 0xffff
#define INT_MIN (-2147483647 - 1)
#define INT_MAX 2147483647
#define UINT_MAX 0xffffffff
#define LONG_MIN (-2147483647L - 1)
#define LONG_MAX 2147483647L
#define ULONG_MAX 0xffffffffUL
#define LLONG_MAX 9223372036854775807i64
#define LLONG_MIN (-9223372036854775807i64 - 1)
#define ULLONG_MAX 0xffffffffffffffffui64

#define _I8_MIN (-127i8 - 1)
#define _I8_MAX 127i8
#define _UI8_MAX 0xffui8

#define _I16_MIN (-32767i16 - 1)
#define _I16_MAX 32767i16
#define _UI16_MAX 0xffffui16

#define _I32_MIN (-2147483647i32 - 1)
#define _I32_MAX 2147483647i32
#define _UI32_MAX 0xffffffffui32

#define _I64_MIN (-9223372036854775807i64 - 1)
#define _I64_MAX 9223372036854775807i64
#define _UI64_MAX 0xffffffffffffffffui64

#ifndef SIZE_MAX
    // SIZE_MAX definition must match exactly with stdint.h for modules support.
    #ifdef _WIN64
        #define SIZE_MAX 0xffffffffffffffffui64
    #else
        #define SIZE_MAX 0xffffffffui32
    #endif
#endif

#if __STDC_WANT_SECURE_LIB__
    #ifndef RSIZE_MAX
        #define RSIZE_MAX (SIZE_MAX >> 1)
    #endif
#endif

_CRT_END_C_HEADER

#pragma warning(pop) // _VCRUNTIME_DISABLED_WARNINGS

2. Floating point storage in memory

We have mastered how integer data is stored in memory. C has another family of types, the floating-point family. Are they stored in the same way as integers? If not, how are they stored? Let's discuss these questions.

Floating-point numbers are used to represent the numbers with a fractional part that we meet in real life. Common examples are 3.14159 and 1E10. The floating-point family includes: float, double, long double.

Let's use an example to introduce how floating-point numbers are stored in memory:

#include <stdio.h>
int main()
{
	int n = 9;
	float* pFloat = (float*)&n;
	printf("The value of n is: %d\n", n);
	printf("The value of *pFloat is: %f\n", *pFloat);
	//
	*pFloat = 9.0;
	printf("The value of n is: %d\n", n);
	printf("The value of *pFloat is: %f\n", *pFloat);
	return 0;
}

Output:
The value of n is: 9
The value of *pFloat is: 0.000000
The value of n is: 1091567616
The value of *pFloat is: 9.000000
From this example we can conclude that integer data and floating-point data are stored in and read from memory differently. So what exactly is the difference? To find out, we must understand the rules for storing floating-point numbers in memory:

The representation of floating-point numbers inside a computer
In detail:
According to the international standard IEEE (Institute of Electrical and Electronics Engineers) 754, any binary floating-point number V can be expressed in the following form: (-1)^S * M * 2^E
(-1)^S is the sign: when S=0, V is positive; when S=1, V is negative.
M is the significand (1≤M<2)
2^E is the exponent part.
For example, decimal 5.0 is 101.0 in binary, which equals 1.01*2^2
So in the form above, S=0, M=1.01, E=2
Decimal -5.0 is -101.0 in binary, which equals (-1)^1*1.01*2^2
So in the form above, S=1, M=1.01, E=2

IEEE 754 uses the three numbers S, E, and M to represent any binary floating-point number, and uses them to specify how floating-point numbers are stored in memory:

For a 32-bit floating-point number, the highest bit is the sign bit S, the next 8 bits are the exponent E, and the remaining 23 bits are the significand M
For a 64-bit floating-point number, the highest bit is the sign bit S, the next 11 bits are the exponent E, and the remaining 52 bits are the significand M

IEEE 754 has some special rules for the significand M and the exponent E
For M
As mentioned above, 1≤M<2, which means M can be written as 1.xxxxxx, where xxxxxx is the fractional part. IEEE 754 specifies that when M is stored, the leading digit is always 1 by default, so it can be dropped and only the xxxxxx part is stored. For example, when storing 1.01, only 01 is stored, and the leading 1 is added back when the number is read. The purpose is to gain one significant bit. Taking a 32-bit floating-point number as an example, only 23 bits are left for M, but with the leading 1 dropped, 24 significant bits can effectively be stored.
(Note: when a floating-point number is stored, if the xxxxxx part that remains after dropping the leading 1 is shorter than 23 bits (or 52 bits), it is padded with trailing 0s until it is long enough)
For the exponent E
First, E is stored as an unsigned integer. If E has 8 bits, its range is 0~255; if E has 11 bits, its range is 0~2047. However, the exponent in scientific notation can be negative, so IEEE 754 specifies that the value of E stored in memory is the true value plus a bias: for an 8-bit E the bias is 127, and for an 11-bit E it is 1023. For example, the E of 2^10 is 10, so when it is stored as a 32-bit floating-point number it must be stored as 10+127=137, that is 10001001.

When a floating-point number is read back from memory, the exponent E falls into one of three cases.

1. E is neither all 0s nor all 1s
In this case the floating-point number is decoded by the normal rules: subtract 127 (or 1023) from the stored exponent to get the true value of E, then put the leading 1 back in front of the significand M to get the real M.
For example:
the binary sequence 0 01111110 00000000000000000000000
This sequence has 32 bits, so it is a single-precision floating-point number.
Step 1: read the first bit, S = 0, so the number is positive.
Step 2: read the next eight bits and subtract 127 from the number they represent to get the true value of E: E = 01111110 - 01111111 = -1
Step 3: read the remaining 23 bits and put the leading 1 in front of them to get M: M = 1.00000000000000000000000
So the number equals (-1)^0 * 1.00000000000000000000000 * 2^(-1) = 0.1 in binary, which is 0.5 in decimal

2. E is all 0s
In this case the exponent of the floating-point number is 1-127 (or 1-1023), and the significand M no longer gets a leading 1 but is read as the fraction 0.xxxxxx. This is done to represent ±0 and very small numbers close to 0.

3. E is all 1s
In this case, if the significand M is all 0s, the value is ±infinity (the sign depends on the sign bit S); if M is not all 0s, the value is not a number (NaN).

OK! Now that we understand the rules for storing floating-point numbers in memory, let's analyze the example from the beginning. Why did it produce such unexpected results? Let's analyze it step by step (in two parts):

// Opening example
#include <stdio.h>
int main()
{
	// The first half
	int n = 9;
	float* pFloat = (float*)&n;
	printf("The value of n is: %d\n", n);
	printf("The value of *pFloat is: %f\n", *pFloat);
	// The second half
	*pFloat = 9.0;
	printf("The value of n is: %d\n", n);
	printf("The value of *pFloat is: %f\n", *pFloat);
	return 0;
}

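The first half:
The integer 9 is stored as 0 00000000 00000000000000000001001. Read through pFloat as a float, S = 0 and E is all 0s, so this is the second case above: the value is 0.00000000000000000001001 * 2^(-126), a positive number so close to 0 that %f prints 0.000000.

The second half:
9.0 is 1001.0 in binary, that is 1.001 * 2^3, so S = 0, E = 3 + 127 = 130 (10000010), and M stores 001 followed by twenty 0s. The bits written to memory are 0 10000010 00100000000000000000000. Read back as an int, this sequence is 1091567616, and read as a float it is 9.000000.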

So we can conclude that the printed values look wrong because the data was stored one way and read back another way, which gives results we did not expect. We should be careful when writing code and stay on good terms with memory to reduce bugs ~

Summary

That's the end of today's sharing! If there are any mistakes, corrections from the experts are welcome ~
Here I want to share a sentence I saw today: Do what you love, love what you do.
Doing what you like is the meaning of life! Come on, xdm!
If you have read this far, you might as well give it a like, a favorite, and a follow ~

Original site

Copyright notice
This article was written by [Superman can't fly Ke]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/177/202206260533517337.html