Want to know how a C program is compiled? A walkthrough of program compilation

2022-07-24 04:44:00 Ordinary youth


Preface

Key points of this chapter:

  • The translation environment of a program
  • The execution environment of a program
  • In detail: compiling and linking a C program
  • Introduction to predefined symbols
  • The preprocessing directive #define
  • Macros compared with functions
  • Introduction to the preprocessing operators # and ##
  • Command-line definitions
  • The preprocessing directive #include
  • The preprocessing directive #undef
  • Conditional compilation

1. Program translation environment and execution environment

In any implementation of ANSI C, there are two distinct environments.

The first is the translation environment, in which source code is converted into executable machine instructions.
The second is the execution environment, which is used to actually execute the code.

2. Compilation and linking in detail

2.1 Translation environment

When a project contains multiple source files, the compiler turns each source file into its own object file.
For example:
add.c

int Add(int x, int y)
{
	return x + y;
}

test.c

#include<stdio.h>
extern int Add(int x, int y);
int main()
{
	int a=10;
	int b=20;
	int sum=Add(a,b);
	printf("%d\n", sum);
	return 0;
}

After building the code above, we find two .obj files in the project directory.
These are the generated object files.

Each source file that makes up a program is converted into object code by the compilation process.
The object files are bound together by the linker into a single, complete executable program.
The linker also brings in any functions from the standard C library that the program uses, and it can search the programmer's personal libraries and link in the functions it needs.

2.2 Compilation itself is divided into three stages

The three stages are preprocessing, compilation, and assembly. Consider the following code:

#include<stdio.h>
#define MAX 100
// This is a comment 
int main()
{
	int a=MAX;
	printf("%d\n", a);
	return 0;
}

We use the gcc compiler on Linux to observe the result of each stage of compilation.

  1. Preprocessing. Option: gcc -E test.c -o test.i
    Stops after preprocessing is complete; the preprocessed output is written to test.i.
    Opening test.i, we find that the header file has been replaced by a large amount of code, the #define line has disappeared and MAX has been replaced by 100, and the comment is gone as well.
    These are some of the operations performed in the preprocessing stage:

    • Expansion of header files
    • Replacement of #define definitions
    • Removal of comments
  2. Compilation. Option: gcc -S test.c
    Stops after compilation is complete; the result is stored in test.s.
    Opening test.s, we see assembly language, so the end result of compilation is the translation of the C code into assembly. Other operations take place along the way,
    for example:

    • Lexical analysis
    • Syntax analysis
    • Semantic analysis
    • Symbol summary
  3. Assembly. Option: gcc -c test.c
    Stops after assembly is complete; the result is stored in test.o.
    test.o is a binary file. After assembly, the code has been translated into binary machine code, an object file has been generated, and a symbol table has been formed. The symbol table records the names of the functions implemented in the file, and the linker uses it to find functions.

2.3 Execution environment

The process of program execution:

  1. The program must be loaded into memory. In an environment with an operating system, this is usually done by the operating system. In a freestanding environment, loading must be arranged manually, or it may be done by placing the executable code in read-only memory.
  2. Execution of the program begins, and the main function is called.
  3. The program code starts executing. At this point the program uses a runtime stack to store each function's local variables and return address. The program can also use static memory; variables stored in static memory keep their values throughout the execution of the program.
  4. The program terminates. It may terminate normally by returning from main, or it may terminate unexpectedly.

Note: readers interested in the details of the compilation and linking process can consult the book 《Self-Cultivation of Programmers》.

3. Preprocessing in detail

3.1 Predefined symbols

__FILE__ // The source file being compiled
__LINE__ // The current line number in the file
__DATE__ // The date the file was compiled
__TIME__ // The time the file was compiled
__STDC__ // 1 if the compiler conforms to ANSI C; otherwise undefined

These predefined symbols are built into the language.
For example:

#include<stdio.h>
int main()
{
	printf("FILE:%s LINE=%d DATE:%s TIME:%s\n", __FILE__,__LINE__, __DATE__,__TIME__);
	return 0;
}

3.2 #define

3.2.1 Defining identifiers with #define

Syntax:

#define name stuff

Examples:

#define MAX 1000
#define reg register // Create a short name for the keyword register
#define do_forever for(;;) // Replace an implementation with a more descriptive symbol
#define CASE break;case // Automatically writes the break when writing a case

// If stuff is too long, it can be split across several lines; every line except the last ends with a backslash (the line-continuation character).
#define DEBUG_PRINT printf("file:%s\tline:%d\t" \
                           "date:%s\ttime:%s\n", \
                           __FILE__, __LINE__, \
                           __DATE__, __TIME__)

Note: when defining an identifier with #define, do not add a trailing semicolon; it easily leads to syntax errors.

3.2.2 Defining macros with #define

The #define mechanism includes a provision that allows parameters to be substituted into the text. This kind of definition is usually called a macro, or a defined macro.

Here is how a macro is declared:

#define name(parameter-list) stuff
Here parameter-list is a comma-separated list of symbols, which may appear in stuff.

Note:
The opening parenthesis of the parameter list must be immediately adjacent to name.
If there is any whitespace between the two, the parameter list is interpreted as part of stuff.

Example:

#define SQUARE(x) x*x
// We define a macro that computes a square: x is the macro's parameter and x*x is the macro body
#include<stdio.h>
int main()
{
	// Use the macro
	int n = SQUARE(5);// Compute the square of 5
	printf("%d\n", n); // Print n
	return 0;
}

So how do macros work?
In fact, a macro simply performs replacement.
We can look at the file generated by preprocessing to see how the macro works.
We see that SQUARE(5) was replaced by 5*5. That is how macros work: by replacement.

With that in mind, let's look at the following code:

#define SQUARE(x) x*x
// We define a macro that computes a square: x is the macro's parameter and x*x is the macro body
#include<stdio.h>
int main()
{
	// Use the macro
	int n = SQUARE(5+1);
	printf("%d\n", n); // Print n
	return 0;
}

What does this code print? Many people expect 36, but it actually prints 11.
Why? Because macros work by text replacement.

> When the text is replaced, the parameter x is replaced by 5 + 1, so the statement actually becomes:
> int n = 5+1*5+1; 

Now it is clear: the expression produced by the substitution is not evaluated in the expected order, so it prints 11.
The fix is simple. Add two pairs of parentheses to the macro definition and the problem goes away:

#define SQUARE(x) (x)*(x)
After preprocessing, the statement becomes:
int n = (5+1)*(5+1);

Look at another macro definition:

#define DOUBLE(x) (x) + (x)
// Use the macro
int a = 5;
printf("%d\n" ,10 * DOUBLE(a));

Although the definition uses parentheses, there can still be a problem. Many people think this code prints 100, but it actually prints 55.
The reason is that after replacement the code looks like this:

printf ("%d\n",10 * (a) + (a));
With a equal to 5, multiplication binds more tightly than addition, so the result is 50 + 5 = 55.

Therefore, we usually write macro definitions like this:

#define DOUBLE(x) ( ( x ) + ( x ) )

Tip:

All macro definitions that evaluate numeric expressions should be parenthesized in this way, to avoid unexpected interactions between the operators in the arguments, or adjacent operators, when the macro is used.

3.2.3 #define replacement rules

Expanding #define symbols and macros in a program involves several steps.

  1. When a macro is invoked, its arguments are examined first to see whether they contain any symbols defined by #define. If they do, those symbols are replaced first.
  2. The replacement text is then inserted into the program in place of the original text. For macros, the parameter names are replaced by their values.
  3. Finally, the resulting text is scanned again to see whether it contains any symbols defined by #define. If it does, the process is repeated.

For example:

#define M 100
#define DOUBLE(x) ((x)+(x))
int main()
{
	int n=DOUBLE(M+2);
	return 0;
}
In this code, M in the argument is replaced first, because it is a symbol defined by #define:
int n=DOUBLE(100+2);
Then the macro itself is replaced:
int n=((100+2)+(100+2));

3.2.4 # and ##

Here's a question: how can a macro parameter be inserted into a string?
First, let's look at this code:

#include<stdio.h>
int main()
{
	printf("Hello ""World\n");
	return 0;
}

Will this print Hello World?
Yes, it will: adjacent string literals are concatenated into one.

Knowing this, let's try to implement such a macro.
First we need to know what # does:
# turns a macro argument into the corresponding string literal.

#include<stdio.h>
int main()
{
	int a=10;
	printf("The value of a is %d\n", a);
	int b=20;
	printf("The value of b is %d\n", b);
	return 0;
}
The two printf calls above are very similar, so we wonder whether they could be wrapped in a function or a macro.
A function cannot do this, because a function receives only the value of its argument, not the variable's name. A macro can.

#include<stdio.h>
#define PRINT(n) printf("The value of " #n " is %d\n", n)
int main()
{
	int a=10;
	PRINT(a);
	int b=20;
	PRINT(b);
	return 0;
}

The role of ##

## combines the symbols on either side of it into a single symbol.
It allows a macro definition to create identifiers from separate pieces of text.

Example:

#include<stdio.h>
#define CLASS_NUM(Class,Num) Class##Num
int main()
{
	int class106=10;
	printf("%d\n", CLASS_NUM(class,106));// CLASS_NUM(class,106) becomes class106, so this prints 10
	return 0;
}

Note:
Such a concatenation must produce a valid identifier; otherwise the result is undefined.

3.2.5 Macro parameters with side effects

When a macro parameter appears more than once in the macro definition and the argument has side effects, using the macro can be dangerous and lead to unpredictable results. A side effect is a lasting effect produced by evaluating an expression.
For example:

x+1;// No side effects
x++;// Has a side effect

Write a MAX macro:

#include<stdio.h>
#define MAX(a, b) ( (a) > (b) ? (a) : (b) )
int main()
{
	int x = 5;
	int y = 8;
	int z = MAX(x++, y++);
	printf("x=%d y=%d z=%d\n", x, y, z);// What is the result of the output ?
	return 0;
}

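To answer this we need the result of preprocessing:

z = ( (x++) > (y++) ? (x++) : (y++));

The comparison evaluates 5 > 8, which is false, and leaves x at 6 and y at 9. The false branch (y++) is then evaluated, so z receives 9 and y becomes 10. The program prints x=6 y=10 z=9.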

3.2.6 Macros compared with functions

Macros are usually used for simple operations,
such as finding the larger of two numbers.

#define MAX(a, b) ((a)>(b)?(a):(b))

Why not use a function for this task?
There are two reasons:

  1. The code needed to call and return from a function may take more time than the small computation itself.
    So a macro beats a function in terms of program size and speed.
  2. More importantly, a function's parameters must be declared with specific types,
    so a function can only be used with expressions of the right type. This macro, by contrast, works for integers, long integers, floating-point values, and any other type that can be compared with >.
    Macros are type independent.

Of course, compared with functions, macros also have disadvantages:

  1. Every use of a macro inserts a copy of the macro's code into the program. Unless the macro is short, this can greatly increase the length of the program.
  2. Macros cannot be debugged.
  3. Because macros are type independent, they are not rigorous enough.
  4. Macros can cause operator precedence problems, which makes programs error-prone.

Macros can sometimes do things that functions cannot. For example, a macro parameter can be a type, which a function cannot accept.

#define MALLOC(num, type) (type *)malloc(num * sizeof(type))
...
// Usage
MALLOC(10, int);// A type as an argument

// After preprocessor replacement:
(int *)malloc(10 * sizeof(int));

A comparison between macros and functions

Attribute | #define macro | Function
Code length | The macro's code is inserted into the program at every use. Except for very small macros, the program grows significantly | The function's code appears in only one place; every use calls that same code
Execution speed | Faster | Has the extra overhead of the function call and return, so relatively slower
Operator precedence | Macro arguments are evaluated in the context of the surrounding expression; unless parentheses are added, the precedence of adjacent operators can have unpredictable consequences, so macros should be written with plenty of parentheses | Function arguments are evaluated only once, when the function is called, and the resulting values are passed to the function. The evaluation of the expression is easier to predict
Parameters with side effects | A parameter may be substituted at several places in the macro body, so evaluating an argument with side effects can produce unpredictable results | Function arguments are evaluated only once, when they are passed, so the result is easier to control
Parameter type | Macro parameters are type independent; as long as the operations on the parameters are legal, the macro can be used with any type | Function parameters are tied to their types; if the types differ, different functions are needed, even if they perform the same task
Debugging | Macros are inconvenient to debug | Functions can be debugged statement by statement
Recursion | Macros cannot be recursive | Functions can be recursive

3.2.7 Naming conventions

Generally speaking, the syntax for using a function and a macro is very similar, so the language itself cannot help us tell the two apart.
A common convention is:

Write macro names in all capitals
Do not write function names in all capitals

3.3 #undef

This directive removes a macro definition.

#undef NAME
// If an existing name needs to be redefined, its old definition must first be removed

#include<stdio.h>
#define MAX 100
int main()
{
	printf("%d\n", MAX);// Prints 100
#undef MAX
	printf("%d\n", MAX);// Compile error: MAX is no longer defined
	return 0;
}

3.4 Command-line definitions

Many C compilers allow symbols to be defined on the command line, as part of starting the compilation.
For example, this feature is useful when we build different versions of a program from the same source file. (Suppose a program declares an array of a certain length: on a machine with limited memory we need a small array, while on a machine with more memory we can use a larger one.)

#include <stdio.h>
int main()
{
	int array [SZ];
	int i = 0;
	for(i = 0; i< SZ; i ++)
	{
		array[i] = i;
	}
	for(i = 0; i< SZ; i ++)
	{
		printf("%d " ,array[i]);
	}
	printf("\n" );
	return 0;
}

We compile the code above with gcc on Linux.
Compiling it the usual way fails, because SZ is not defined anywhere in the source. Now let's compile it with SZ defined on the command line (with gcc this is done with the -D option, for example gcc -D SZ=10 test.c).
The program now compiles and runs normally. This is a command-line definition.

3.5 Conditional compilation

Conditional compilation directives make it easy to choose, at compile time, whether a statement (or a group of statements) is compiled or discarded.
For instance:

Debugging code is a pity to delete but gets in the way if it is kept, so we can compile it selectively.

#include <stdio.h>
#define __DEBUG__ // Define __DEBUG__ with #define
int main()
{
	int i = 0;
	int arr[10] = {0};
	for(i=0; i<10; i++)
	{
		arr[i] = i;
	#ifdef __DEBUG__ // If __DEBUG__ is defined, the following code is compiled; otherwise it is discarded
		printf("%d\n", arr[i]);// Check whether the array assignment succeeded
	#endif //__DEBUG__
	}
	return 0;
}
#define name // name is the symbol being defined
#ifdef name // If name is defined, the following code is compiled
...
#endif
There is also:
#ifndef name // If name is not defined, the following code is compiled
...
#endif

Common conditional compilation directives:
PS: they are analogous to if/else statements.
Important: #endif must not be omitted.

1.
#if constant-expression
	//...
#endif
// The constant expression is evaluated by the preprocessor.
For example:
#if x>y // x and y must be macros that expand to integer constants; if the condition holds, the following code is compiled
	//...
#endif

#define __DEBUG__ 1
#if __DEBUG__
	//..
#endif

2. Multi-branch conditional compilation
#if constant-expression
	//...
#elif constant-expression
	//...
#else
	//...
#endif

3. Testing whether a symbol is defined
#if defined(symbol)
#endif
is equivalent to
#ifdef symbol
#endif

#if !defined(symbol)
#endif
is equivalent to
#ifndef symbol
#endif


4. Nested directives
#if defined(OS_UNIX)
	#ifdef OPTION1
		unix_version_option1();
	#endif
	#ifdef OPTION2
		unix_version_option2();
	#endif
#elif defined(OS_MSDOS)
	#ifdef OPTION2
		msdos_version_option2();
	#endif
#endif

3.6 File inclusion

As we already know, the #include directive causes another file to be compiled, just as if its contents actually appeared at the position of the #include directive.
The mechanism is simple replacement:
the preprocessor removes the directive and replaces it with the contents of the included file.
If such a file is included 10 times, it is actually compiled 10 times.

3.6.1 How header files are included:

  • Including local files
#include "filename"

Search strategy: the compiler first looks in the directory containing the source file. If the header is not found there, it looks in the standard locations, just as it does for library headers.
If the header still cannot be found, a compilation error is reported.

The path of the standard headers on Linux:

/usr/include

The path of the standard headers in a VS environment:

C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include

Adjust this to your own installation path.

  • Including library files
#include <filename.h>

The header is looked up directly in the standard path; if it is not found, a compilation error is reported.

Does this mean that library files can also be included using the "" form?
The answer is yes, they can.

But the lookup is less efficient that way, and it also becomes harder to tell library files from local files.

3.6.2 Nested file inclusion

comm.h and comm.c form a common module.
test1.h and test1.c use the common module.
test2.h and test2.c use the common module.
test.h and test.c use the test1 module and the test2 module.
As a result, the final program contains two copies of the contents of comm.h, so the file's contents are duplicated.

How do we solve this problem?
Answer: conditional compilation.
Write the following at the beginning of each header file:

#ifndef __TEST_H__
#define __TEST_H__
// Contents of the header file
#endif //__TEST_H__

Or:

#pragma once 

Note:
The exam questions in the appendix of 《High-quality C/C++ Programming Guide》 are recommended (very important).


Summary

That covers program compilation. In reality the compilation of a program is very complicated, and this is only a brief introduction. I hope it helps you understand how a program is compiled. Thank you!

Original site

Copyright notice
This article was written by [Ordinary youth]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/205/202207240440501560.html