Deep understanding of ELF files
2022-06-25 17:48:00 【Akur studio】
Understanding executable programs
Turning a source file into an executable program involves the following main steps.
The compiler processes the source file and generates a relocatable object file, the familiar .o file. The linker then combines one or more .o files into an executable file.
Relocatable object files
A .o file is called a relocatable object file. It contains binary code and data in a form that can be combined with other object files to create an executable object file.
Because a .o file is also a kind of ELF file, we can use readelf -h to view its ELF header:
$ readelf -h main.o
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 960 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 64 (bytes)
Number of section headers: 14
Section header string table index: 13
Each line of this output corresponds to a field of the ELF file header structure:
typedef struct
{
unsigned char e_ident[EI_NIDENT]; /* Magic number and other info */
Elf64_Half e_type; /* Object file type */
Elf64_Half e_machine; /* Architecture */
Elf64_Word e_version; /* Object file version */
Elf64_Addr e_entry; /* Entry point virtual address */
Elf64_Off e_phoff; /* Program header table file offset */
Elf64_Off e_shoff; /* Section header table file offset */
Elf64_Word e_flags; /* Processor-specific flags */
Elf64_Half e_ehsize; /* ELF header size in bytes */
Elf64_Half e_phentsize; /* Program header table entry size */
Elf64_Half e_phnum; /* Program header table entry count */
Elf64_Half e_shentsize; /* Section header table entry size */
Elf64_Half e_shnum; /* Section header table entry count */
Elf64_Half e_shstrndx; /* Section header string table index */
} Elf64_Ehdr;
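The first thing we see is the Magic number. Its length is fixed by the macro #define EI_NIDENT (16): e_ident occupies the first 16 bytes of the ELF file, and the bytes have the following meanings:
- Bytes 0-3 (7f 45 4c 46): the ELF magic, 0x7f followed by the ASCII characters "ELF"
- Byte 4 (02): file class, here 64-bit (ELF64)
- Byte 5 (01): data encoding, here two's complement, little endian
- Byte 6 (01): ELF version, here 1 (current)
- Byte 7 (00): OS/ABI, here UNIX - System V
- Byte 8 (00): ABI version
- Bytes 9-15: padding, currently unused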
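The byte meanings can also be decoded in code. The following is a minimal sketch, assuming a Linux system where <elf.h> is available; describe_ident is a hypothetical helper name and is not part of the author's elf_parser.

```cpp
#include <elf.h>      // EI_* indices, ELFMAG, ELFCLASS*, ELFDATA*
#include <cstring>    // std::memcmp
#include <string>

// Hypothetical helper: turns the 16 e_ident bytes into a one-line description.
std::string describe_ident(const unsigned char (&ident)[EI_NIDENT])
{
    // Bytes 0-3 must be the magic 0x7f 'E' 'L' 'F'.
    if (std::memcmp(ident, ELFMAG, SELFMAG) != 0)
        return "not an ELF file";

    std::string out;
    // Byte 4: file class (32-bit or 64-bit).
    switch (ident[EI_CLASS]) {
    case ELFCLASS32: out = "ELF32"; break;
    case ELFCLASS64: out = "ELF64"; break;
    default:         out = "unknown class"; break;
    }
    // Byte 5: data encoding (byte order).
    switch (ident[EI_DATA]) {
    case ELFDATA2LSB: out += ", little endian"; break;
    case ELFDATA2MSB: out += ", big endian"; break;
    default:          out += ", unknown encoding"; break;
    }
    // Bytes 6, 7 and 8: ELF version, OS/ABI and ABI version.
    out += ", version " + std::to_string(static_cast<int>(ident[EI_VERSION]));
    out += ", OS/ABI " + std::to_string(static_cast<int>(ident[EI_OSABI]));
    out += ", ABI version " + std::to_string(static_cast<int>(ident[EI_ABIVERSION]));
    return out;
}
```

Passing the e_ident member of an Elf64_Ehdr to this helper gives a compact summary of the same information that readelf -h prints in its first few lines.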


Using readelf -S, we can sketch the layout of the relocatable file by file offset roughly as follows:

Executable file
We compile the same source code into an executable program and then use readelf -h to view the executable's header:
$ readelf -h a.out
ELF Header:
Magic: 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
Class: ELF64
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: DYN (Shared object file)
Machine: Advanced Micro Devices X86-64
Version: 0x1
Entry point address: 0x1060
Start of program headers: 64 (bytes into file)
Start of section headers: 14744 (bytes into file)
Flags: 0x0
Size of this header: 64 (bytes)
Size of program headers: 56 (bytes)
Number of program headers: 13
Size of section headers: 64 (bytes)
Number of section headers: 31
Section header string table index: 30
Using readelf -S, we can see that the section layout of the executable file is roughly as follows:

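Comparing the headers of the relocatable file and the executable file, we can see the following differences:
- Type: REL (relocatable file) versus DYN (shared object file)
- Entry point address: 0x0 versus 0x1060
- Program headers: none in the relocatable file; 13 entries of 56 bytes each, starting at offset 64, in the executable
- Section headers: 14 entries starting at offset 960 versus 31 entries starting at offset 14744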

[IMPORTANT] Everything above was observed through the official tools. Could the tools be misleading us? Is a real executable actually laid out the way readelf reports?
We can imitate readelf and parse an ELF file ourselves, using a real ELF file to understand how it is put together.
Source: https://github.com/zzu-andrew/note_book/src/elf_parser/elf_parser.h
// Variables used below
const Elf64_Ehdr *file_header;
const Elf64_Shdr *section_table;
const char *section_string_table;
size_t section_string_table_index;
Elf64_Xword section_number;
// 1. Map the executable file into memory (read-only, private mapping)
mmap_res = ::mmap(nullptr, program_length_, PROT_READ, MAP_PRIVATE, fd_, 0);
if (mmap_res == MAP_FAILED)
{
    ERROR_EXIT("mmap");
}
mmap_program_ = static_cast<std::uint8_t *>(mmap_res);
// 2. The ELF file header sits at offset 0 of the mapping
file_header = reinterpret_cast<const Elf64_Ehdr *>(mmap_program_);
// 3. Locate the section header table and the section name string table
section_table = reinterpret_cast<const Elf64_Shdr *>(mmap_program_ + file_header->e_shoff);
// If e_shstrndx is SHN_XINDEX, the real index is stored in sh_link of section header 0
// (in the file examined here, e_shstrndx = 35)
section_string_table_index = file_header->e_shstrndx == SHN_XINDEX ?
    section_table[0].sh_link :
    file_header->e_shstrndx;
section_string_table = reinterpret_cast<const char *>(&mmap_program_[section_table[section_string_table_index].sh_offset]);
// If e_shnum is 0, the real section count is stored in sh_size of section header 0
section_number = file_header->e_shnum == 0 ?
    section_table[0].sh_size :
    file_header->e_shnum;
After these steps, printing the file header in the debugger gives the output below. This is a different build from the a.out shown earlier, so the offsets and counts differ:
$3 = {e_ident = "\177ELF\002\001\001\000\000\000\000\000\000\000\000", e_type = 3, e_machine = 62, e_version = 1, e_entry = 4512, e_phoff = 64, e_shoff = 36184, e_flags = 0, e_ehsize = 64, e_phentsize = 56,
e_phnum = 13, e_shentsize = 64, e_shnum = 36, e_shstrndx = 35}
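For readers who want to reproduce such a header dump without the full parser, here is a minimal sketch. It swaps the author's mmap approach for a plain std::ifstream read, assumes a Linux system with <elf.h>, and uses the hypothetical helper name read_elf64_header.

```cpp
#include <elf.h>
#include <cstring>   // std::memcmp
#include <fstream>

// Hypothetical helper: reads the first sizeof(Elf64_Ehdr) bytes of `path`
// into `out`. Returns false if the file cannot be read, is shorter than
// an ELF header, or is not a 64-bit ELF file.
bool read_elf64_header(const char *path, Elf64_Ehdr &out)
{
    std::ifstream file(path, std::ios::binary);
    if (!file.read(reinterpret_cast<char *>(&out), sizeof out))
        return false;                                  // open or read failed
    if (std::memcmp(out.e_ident, ELFMAG, SELFMAG) != 0)
        return false;                                  // wrong magic bytes
    return out.e_ident[EI_CLASS] == ELFCLASS64;        // only ELF64 is handled
}
```

After a successful call, fields such as e_shoff, e_shentsize and e_shnum can be printed and compared with the readelf -h output for the same file.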
Reading the size of the opened file gives the size of the entire executable: fileSize = 38488. From the file header we know:
- The ELF header size is e_ehsize = 64
- The program header table (segment headers) starts at offset e_phoff = 64; each entry is e_phentsize = 56 bytes and there are e_phnum = 13 entries
- The section header table starts at offset e_shoff = 36184; each entry is e_shentsize = 64 bytes and there are e_shnum = 36 entries
- Section address offset: 36184
The ELF header comes first. Casting the pointer to the start of the file directly to Elf64_Ehdr * yields data that matches the file exactly, so the header at the start of the ELF file is indeed what readelf reports.
Going by the offsets, the program header table should follow immediately after the ELF header, so it should start 64 bytes after the head pointer. Checking e_phoff confirms that its value is indeed 64.
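This check is easy to express in code. The sketch below assumes a Linux <elf.h>; both function names are hypothetical and chosen only for illustration.

```cpp
#include <elf.h>
#include <cstdint>

// True when the file has program headers and the table starts exactly
// where the ELF header ends.
bool phdrs_follow_ehdr(const Elf64_Ehdr &h)
{
    return h.e_phnum != 0 && h.e_phoff == h.e_ehsize;
}

// File offset one past the last program header entry.
std::uint64_t phdr_table_end(const Elf64_Ehdr &h)
{
    return h.e_phoff + static_cast<std::uint64_t>(h.e_phentsize) * h.e_phnum;
}
```

With the values from the header above (e_phoff = 64, e_phentsize = 56, e_phnum = 13), the program header table ends at offset 64 + 56 * 13 = 792.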
Next, let us verify that the section header table is stored at the tail of the file. From the ELF header we know that each section header is 64 bytes, the table starts at offset 36184, and there are 36 section headers. If the section header table is the last part of the file, as in the layout above, then fileSize - e_shoff = e_shentsize * e_shnum must hold. Otherwise the section header table does not extend to the end of the ELF executable.
38488 - 36184 = 2304 = 36 * 64 (total length of the section header table)
The calculation confirms that the end of the ELF file is indeed occupied by the section header table. The space occupied by the other sections can be verified in the same way by extending the program, so we will not go through them one by one here.
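The same equation can be checked programmatically. This is a minimal sketch, assuming a Linux <elf.h>; section_table_ends_file is a hypothetical name, and the sketch ignores the extended numbering case where e_shnum is 0.

```cpp
#include <elf.h>
#include <cstdint>

// True when the section header table runs exactly to the end of the file,
// i.e. fileSize - e_shoff == e_shentsize * e_shnum.
// Assumption: e_shnum holds the real count (no extended section numbering).
bool section_table_ends_file(const Elf64_Ehdr &h, std::uint64_t file_size)
{
    const std::uint64_t table_bytes =
        static_cast<std::uint64_t>(h.e_shentsize) * h.e_shnum;
    return h.e_shoff <= file_size && file_size - h.e_shoff == table_bytes;
}
```

For the file examined here, the call with e_shoff = 36184, e_shentsize = 64, e_shnum = 36 and file_size = 38488 returns true, matching the hand calculation.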
-pie && -no-pie
Careful readers may have noticed that readelf -h reports the executable's type as a shared object. This is because my system is Ubuntu: on many current Ubuntu systems the default compiler passes -pie by default, so the generated executable is marked as a shared object (DYN). Linking with -no-pie instead produces an executable of type EXEC.
PIE (Position-Independent Executable) produces something between a shared library and an ordinary executable: a program whose addresses can be relocated at load time, like a shared library's.
PIE was first implemented by people at Red Hat, who added the -pie option to the linker so that objects compiled with -fPIE can be linked into a position-independent executable.
A standard executable requires a fixed address and runs correctly only when it is loaded at that address. PIE lets the program be loaded anywhere in main memory, like a shared library. This requires compiling the program as position-independent code and linking it as an ELF shared object.
PIE was introduced so that programs can be loaded at random addresses. Normally the kernel runs at a fixed address; if position-independent code is used instead, it becomes difficult for an attacker to make use of executable code already present in the system. Attacks such as buffer overflow exploits become much harder to carry out, and the cost of this security improvement is small.
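The type that readelf reports comes straight from the e_type field. The sketch below maps the common values to labels, assuming a Linux <elf.h>; describe_elf_type and its label strings are hypothetical and are not readelf's exact wording.

```cpp
#include <elf.h>
#include <string>

// Hypothetical helper: maps e_type to a short label.
// Note that e_type alone cannot distinguish a PIE from a shared library;
// both are stored as ET_DYN.
std::string describe_elf_type(Elf64_Half e_type)
{
    switch (e_type) {
    case ET_REL:  return "REL (relocatable file)";
    case ET_EXEC: return "EXEC (fixed-address executable)";
    case ET_DYN:  return "DYN (shared object or PIE)";
    case ET_CORE: return "CORE (core dump)";
    default:      return "other";
    }
}
```

The header dump earlier shows e_type = 3, which is ET_DYN, consistent with an executable linked with -pie.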
The complete Linux binary analysis code is available at https://github.com/zzu-andrew/note_book/src/elf_parser
For the full document, see: https://github.com/zzu-andrew/note_book/Linux/Linux Binary analysis .adoc