Memory alignment in golang
2022-06-24 17:11:00 【Johns】
1. What is memory alignment, and why do we need it?
Before explaining what memory alignment is, we need to understand how the CPU exchanges data with memory. The CPU and memory communicate over buses: the address bus carries the address of the data the CPU wants, and the data bus carries the data itself, either from memory to the CPU or from the CPU back to memory.
First, we need to know the following concepts:
(1) Machine word length
In computing, a word is the natural unit of data for a particular computer design: a fixed-size group of bits that the hardware handles in a single operation. The number of bits in a word is the word length.
(2) Address bus
The address bus is dedicated to transmitting addresses. Because addresses only travel from the CPU to external memory or I/O ports, the address bus is always unidirectional. Its width determines how much memory the CPU can address directly. For example, an 8-bit microcomputer with a 16-bit address bus can address at most 2^16 bytes = 64 KB, and a 16-bit microcomputer with a 20-bit address bus can address 2^20 bytes = 1 MB.
(3) Data bus
The data bus is the channel through which the CPU exchanges data with memory or other devices. Each line carries one binary bit at a time, and the amount of data the bus can move in one transfer is the width of the data bus, which this article treats as the machine word length. It determines how fast the CPU can exchange data with the outside world. Machines today are mostly 32-bit (4 bytes per transfer) or 64-bit (8 bytes per transfer).
Because data travels over the bus, CPU access and bus transfers would become extremely complicated if data were not aligned according to fixed rules. The compiler therefore aligns each type of data according to those rules at compile time, inserting padding bytes into the data layout where necessary. This is byte alignment.
For example, suppose we want to store two variables, A (int32) and B (int64). Without any byte alignment, they are packed back to back: A occupies bytes 0-3 and B occupies bytes 4-11.
After byte alignment, A still occupies bytes 0-3, bytes 4-7 are padding, and B occupies bytes 8-15.
At first glance, alignment wastes memory. But consider a 64-bit machine, which can read 8 bytes atomically in one access. A needs only one atomic read whether or not the data is aligned. B, however, can be read in a single access only when it is aligned. Unaligned, B straddles two words, so reading it takes 2 accesses plus extra work to keep those two reads atomic, which costs performance. In essence, padding trades space for time: extra memory is spent to make memory access more efficient.
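The read counts described above can be reproduced with a small sketch. It assumes an idealized bus that moves exactly 8 bytes per transaction, always starting at a multiple of 8; `wordSize` and `busReads` are illustrative names, not part of any library.

```go
package main

import "fmt"

// wordSize is the assumed number of bytes moved per bus transaction
// on a 64-bit machine (an assumption of this sketch).
const wordSize = 8

// busReads returns how many word-sized transactions are needed to fetch
// size bytes starting at byte offset addr. size must be greater than zero.
func busReads(addr, size uintptr) uintptr {
	first := addr / wordSize
	last := (addr + size - 1) / wordSize
	return last - first + 1
}

func main() {
	// A is an int32 at offset 0 in both layouts.
	fmt.Println("A:", busReads(0, 4))
	// Unaligned layout: B (int64) starts right after A, at offset 4.
	fmt.Println("B unaligned:", busReads(4, 8))
	// Aligned layout: 4 padding bytes push B to offset 8.
	fmt.Println("B aligned:", busReads(8, 8))
}
```

Under these assumptions the unaligned B needs two transactions while every other case needs one, which is the cost that the padding removes.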
In general, memory alignment mainly solves the following two problems:
【1】 Cross-platform issues: if data is not aligned, data laid out by a machine with a 64-bit word length may not be readable correctly on a machine with a 32-bit word length.
【2】 Performance issues: without alignment, the number of bus transfers needed for each piece of data is unpredictable, and handling those cases on every access would significantly hurt read/write performance. Some CPUs do support access to arbitrary addresses, but only because the processor does a lot of extra work behind the scenes.
(4) Further reading
【1】 How modern processors implement atomic operations:
1. The processor automatically guarantees the atomicity of basic memory operations. Reading or writing a single byte of system memory is atomic: while one processor is accessing a byte, other processors cannot access that byte's address. Pentium 6 and later processors also guarantee that 16/32/64-bit accesses by a single processor within the same cache line are atomic.
2. The processor cannot automatically guarantee the atomicity of complex memory operations, such as accesses that span the bus width, multiple cache lines, or page tables. It does, however, provide bus locking and cache locking mechanisms to make such operations atomic.
3. In practice, most atomic operation guarantees are provided as instructions at the hardware level. Programming languages (C, C++, Go, Java, etc.) merely add a layer that ensures the right instructions are issued on each kind of processor.
【2】 Must the address bus of an x64 system be 64 bits wide, and that of a 32-bit system 32 bits wide?
In reality, most x64 systems use only 48 address lines (some use 50), and x86 systems come with either 32 or 36 lines; this is a hardware limit. Because the address bus width determines the addressing capability, 48 address bits can address 2^48 bytes = 256 TB, far more memory than machines normally have today. The processor hardware therefore requires bits 48 through 63 of any address passed to it to be identical. In other words, the first 48 address lines will cover memory addressing needs for a long time to come. The same reasoning applies to 32-bit systems: 2^32 bytes = 4 GB, which is why a 32-bit system can use at most 4 GB of memory, since anything beyond that cannot be addressed. The Pentium Pro, Pentium II, and later Pentium series have 36 address lines and support a maximum address range of 64 GB.
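The addressing figures quoted in this article all follow from the same power-of-two relationship. The sketch below simply evaluates 2^n for the bus widths mentioned; `addressableBytes` is an illustrative helper name.

```go
package main

import "fmt"

// addressableBytes returns 2^bits, the number of bytes reachable with an
// address bus of the given width. It is only valid for bits < 64.
func addressableBytes(bits uint) uint64 {
	return uint64(1) << bits
}

func main() {
	// Bus widths taken from the text: 16, 20, 32, 36, and 48 lines.
	for _, bits := range []uint{16, 20, 32, 36, 48} {
		fmt.Printf("%2d address lines -> %d bytes\n", bits, addressableBytes(bits))
	}
}
```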
2. What are the rules for memory alignment?
Memory alignment exists mainly to guarantee that data can be read atomically, so the largest alignment boundary is the word length of the current machine. Aligning every type to that largest boundary would waste memory, though. In practice it is enough to ensure that a single piece of data is never split across multiple bus transactions.
Go describes the details of memory alignment in the "Size and alignment guarantees" section of its official language specification.
Go also provides unsafe.Alignof(x), which returns the alignment of the type of x, and makes the following guarantees:
- For a variable x of any type, unsafe.Alignof(x) is at least 1.
- For a variable x of struct type, unsafe.Alignof(x) is the largest of the values unsafe.Alignof(x.f) over all fields f of x.
- For a variable x of array type, unsafe.Alignof(x) is the same as the alignment of the array's element type.
- A struct{} with no fields and an array with no elements occupy 0 bytes of memory. Distinct zero-size variables may share the same address.
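These guarantees can be checked directly. The following is a minimal sketch; the `pair` type and the helper function names are made up for illustration and do not come from the Go specification.

```go
package main

import (
	"fmt"
	"unsafe"
)

// pair is a hypothetical struct used only for this illustration.
type pair struct {
	flag  bool
	count int64
}

// structAlignIsMaxField reports whether pair's alignment equals the largest
// alignment among its fields.
func structAlignIsMaxField() bool {
	var p pair
	largest := unsafe.Alignof(p.flag)
	if a := unsafe.Alignof(p.count); a > largest {
		largest = a
	}
	return unsafe.Alignof(p) == largest
}

// arrayAlignIsElemAlign reports whether an array is aligned like its element type.
func arrayAlignIsElemAlign() bool {
	var arr [3]int16
	return unsafe.Alignof(arr) == unsafe.Alignof(arr[0])
}

// zeroSizes returns the sizes of an empty struct and an empty array.
func zeroSizes() (uintptr, uintptr) {
	var empty struct{}
	var none [0]int64
	return unsafe.Sizeof(empty), unsafe.Sizeof(none)
}

func main() {
	fmt.Println("struct align == max field align:", structAlignIsMaxField())
	fmt.Println("array align == element align:", arrayAlignIsElemAlign())
	s, a := zeroSizes()
	fmt.Println("sizes of struct{} and [0]int64:", s, a)
}
```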
In summary, alignment falls into two categories: basic type alignment and struct type alignment.
(1) Basic type alignment
The alignment of Go's basic types is the smaller of the type's size and the machine word length.
| Data type | Size (32-bit / 64-bit) | Max alignment (32-bit) | Max alignment (64-bit) |
|---|---|---|---|
| int8/uint8/byte | 1 byte | 1 | 1 |
| int16/uint16 | 2 bytes | 2 | 2 |
| int32/uint32/rune/float32 | 4 bytes | 4 | 4 |
| complex64 | 8 bytes | 4 | 4 |
| int64/uint64/float64 | 8 bytes | 4 | 8 |
| complex128 | 16 bytes | 4 | 8 |
| string | 8 bytes / 16 bytes | 4 | 8 |
| slice | 12 bytes / 24 bytes | 4 | 8 |

complex64 holds two float32 values and complex128 holds two float64 values, so each is aligned like its component type.
We can verify this with a test on our own machine (mine is a 64-bit Mac OS X machine):
package service

import (
	"testing"
	"unsafe"
)

// TestAlign prints the alignment of Go's basic types, plus the size of
// a slice and of an empty struct.
func TestAlign(t *testing.T) {
	var byteTest byte = 'a'
	var int8Test int8 = 0
	var int16Test int16 = 0
	var int32Test int32 = 0
	var int64Test int64 = 0
	var uint8Test uint8 = 0
	var uint16Test uint16 = 0
	var uint32Test uint32 = 0
	var uint64Test uint64 = 0
	var float32Test float32 = 0.0
	var float64Test float64 = 0.0

	println("byte max align size =>", unsafe.Alignof(byteTest))
	println("int8/uint8 max align size =>", unsafe.Alignof(int8Test), "/", unsafe.Alignof(uint8Test))
	println("int16/uint16 max align size =>", unsafe.Alignof(int16Test), "/", unsafe.Alignof(uint16Test))
	println("int32/uint32/float32 max align size =>", unsafe.Alignof(int32Test), "/", unsafe.Alignof(uint32Test), "/", unsafe.Alignof(float32Test))
	println("int64/uint64/float64 max align size =>", unsafe.Alignof(int64Test), "/", unsafe.Alignof(uint64Test), "/", unsafe.Alignof(float64Test))

	var s string = "343240000000000"
	println("string max align size =>", unsafe.Alignof(s))

	// For the last two, the alignment is printed first and the size second.
	var sliceTest []string
	println("slice's max align size / size =>", unsafe.Alignof(sliceTest), "/", unsafe.Sizeof(sliceTest))
	var structTest struct{}
	println("struct{}'s max align size / size =>", unsafe.Alignof(structTest), "/", unsafe.Sizeof(structTest))
}
Running results:
byte max align size => 1
int8/uint8 max align size => 1 / 1
int16/uint16 max align size => 2 / 2
int32/uint32/float32 max align size => 4 / 4 / 4
int64/uint64/float64 max align size => 8 / 8 / 8
string max align size => 8
slice's max align size / size => 8 / 24
struct{}'s max align size / size => 1 / 0
(2) Structure type alignment
In Go, a struct is laid out by first aligning each field to its own boundary, and then rounding the overall size up to a multiple of the largest alignment boundary. There is one special case: if a struct ends with an empty struct field, extra padding is added after it. Without that padding, a pointer to the field would point just past the end of the struct, into whatever memory comes next. As long as that pointer stayed alive, the memory it points into could not be freed, which would cause a memory leak.
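The two rules and the special case can be sketched as a small size calculator. This is a simplified model written for this article, not the compiler's actual implementation; `field`, `alignUp`, and `structSize` are illustrative names, and the sizes and alignments passed in are the 64-bit values.

```go
package main

import "fmt"

// field describes one struct field by its size and alignment in bytes.
type field struct {
	size, align uintptr
}

// alignUp rounds n up to the next multiple of a (a must be a power of two).
func alignUp(n, a uintptr) uintptr {
	return (n + a - 1) &^ (a - 1)
}

// structSize models the rules described above: place each field at an offset
// aligned to its own boundary, add one byte if a non-empty struct ends with a
// zero-size field, then round the total up to the largest alignment.
func structSize(fields []field) uintptr {
	var offset, maxAlign uintptr = 0, 1
	endsInZero := false
	for _, f := range fields {
		offset = alignUp(offset, f.align) + f.size
		if f.align > maxAlign {
			maxAlign = f.align
		}
		endsInZero = f.size == 0
	}
	if endsInZero && offset > 0 {
		offset++ // padding so a pointer to the last field stays inside the struct
	}
	return alignUp(offset, maxAlign)
}

func main() {
	// int8, int32, []string
	fmt.Println(structSize([]field{{1, 1}, {4, 4}, {24, 8}}))
	// int8, int64, struct{} (zero-size field last)
	fmt.Println(structSize([]field{{1, 1}, {8, 8}, {0, 1}}))
}
```

Feeding in the field sizes of a real struct and comparing the result with unsafe.Sizeof is a quick way to check one's understanding of the rules.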
The following examples illustrate the struct alignment rules. Note that a field's position within a struct is expressed as an offset from the struct's starting address, which for every field is the address of offset 0 within the struct.
Case one
type TestStruct1 struct {
	a int8     // 1 byte, max align 1 byte
	b int32    // 4 bytes, max align 4 bytes
	c []string // 24 bytes, max align 8 bytes
}
TestStruct1 is byte-aligned at compile time. Taking a 64-bit environment as an example, the resulting layout is: a at offset 0, then 3 bytes of padding, b at offset 4, and c at offset 8.
TestStruct1 memory usage analysis: the maximum alignment boundary is 8 bytes. Total bytes = 1 (a) + 3 (padding) + 4 (b) + 24 (c) = 32. Because 32 is already a multiple of 8, no padding is needed at the end, so the size of this struct is 32 bytes.
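The field offsets behind this analysis can be inspected with unsafe.Offsetof. The sketch below redeclares TestStruct1 so that it is self-contained; `layout` is an illustrative helper name.

```go
package main

import (
	"fmt"
	"unsafe"
)

// TestStruct1 is redeclared here so the example compiles on its own.
type TestStruct1 struct {
	a int8
	b int32
	c []string
}

// layout returns the offset of each field and the total size of TestStruct1.
func layout() (offA, offB, offC, size uintptr) {
	var s TestStruct1
	return unsafe.Offsetof(s.a), unsafe.Offsetof(s.b), unsafe.Offsetof(s.c), unsafe.Sizeof(s)
}

func main() {
	offA, offB, offC, size := layout()
	fmt.Println("offset of a:", offA)
	fmt.Println("offset of b:", offB) // the gap between a and b is padding
	fmt.Println("offset of c:", offC)
	fmt.Println("total size:", size)
}
```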
Case two
type TestStruct2 struct {
	a []string // 24 bytes, max align 8 bytes
	b int64    // 8 bytes, max align 8 bytes
	c int32    // 4 bytes, max align 4 bytes
}
TestStruct2 memory usage analysis: the maximum alignment boundary is 8 bytes. Total bytes = 24 (a) + 8 (b) + 4 (c) + 4 (padding) = 40. The 4 bytes of padding after c bring the total to 40, which is a multiple of 8, so no further padding is needed.
Case three
type TestStruct3 struct {
	a int8
	b int64
	c struct{}
}
TestStruct3 memory usage analysis: the maximum alignment boundary is 8 bytes. Total bytes = 1 (a) + 7 (padding) + 8 (b) + 8 (padding for c) = 24. An empty struct in theory occupies no bytes, but when it is the last field of another struct, extra alignment padding is required.
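To see that the extra bytes come from the position of the empty struct rather than from the field itself, the same three fields can be compared in two orders. In this sketch `trailingEmpty` mirrors TestStruct3, while `leadingEmpty` is a hypothetical reordering introduced only for comparison.

```go
package main

import (
	"fmt"
	"unsafe"
)

// trailingEmpty mirrors TestStruct3: the zero-size field comes last.
type trailingEmpty struct {
	a int8
	b int64
	c struct{}
}

// leadingEmpty is a hypothetical reordering with the zero-size field first.
type leadingEmpty struct {
	c struct{}
	a int8
	b int64
}

// sizes returns the sizes of the two layouts.
func sizes() (trailing, leading uintptr) {
	return unsafe.Sizeof(trailingEmpty{}), unsafe.Sizeof(leadingEmpty{})
}

func main() {
	trailing, leading := sizes()
	fmt.Println("empty struct last:", trailing)
	fmt.Println("empty struct first:", leading)
}
```

With the standard Go toolchain on a 64-bit platform, the first layout should report 24 bytes and the second 16, so placing a zero-size field anywhere but last avoids the extra padding.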
Case four
type TestStruct4 struct {
	a struct{}
	b int8
	c int32
}
TestStruct4 memory usage analysis: the maximum alignment boundary is 4 bytes. Total bytes = 0 (a) + 1 (b) + 3 (padding) + 4 (c) = 8.
(3) Test verification
Running the following code (in an environment with a 64-bit machine word length) confirms the results of the case analysis above.
// TestAlignStruct belongs in the same test file as TestAlign, which already
// imports testing and unsafe.
func TestAlignStruct(t *testing.T) {
	var testStruct1 TestStruct1
	println("size of testStruct1:", unsafe.Sizeof(testStruct1))
	var testStruct2 TestStruct2
	println("size of testStruct2:", unsafe.Sizeof(testStruct2))
	// The empty struct is the last field (c) of TestStruct3.
	var testStruct3 TestStruct3
	println("size of testStruct3 / testStruct3's c size:", unsafe.Sizeof(testStruct3), "/", unsafe.Sizeof(testStruct3.c))
	// The empty struct is the first field (a) of TestStruct4.
	var testStruct4 TestStruct4
	println("size of testStruct4 / testStruct4's a size:", unsafe.Sizeof(testStruct4), "/", unsafe.Sizeof(testStruct4.a))
}
Output:
=== RUN TestAlignStruct
size of testStruct1: 32
size of testStruct2: 40
size of testStruct3 / testStruct3's c size: 24 / 0
size of testStruct4 / testStruct4's a size: 8 / 0
--- PASS: TestAlignStruct (0.00s)
PASS
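As a complement to unsafe.Sizeof, the reflect package exposes per-field offsets, sizes, and alignments, which makes it possible to see where the padding goes. This is a sketch of one way to do that; `padding` is a hypothetical helper, and TestStruct2 is redeclared so the example is self-contained.

```go
package main

import (
	"fmt"
	"reflect"
)

// TestStruct2 is redeclared here so the example compiles on its own.
type TestStruct2 struct {
	a []string
	b int64
	c int32
}

// padding returns the number of padding bytes in the struct value v:
// the struct's size minus the sum of its field sizes.
// It panics if v is not a struct.
func padding(v interface{}) uintptr {
	t := reflect.TypeOf(v)
	total := t.Size()
	for i := 0; i < t.NumField(); i++ {
		total -= t.Field(i).Type.Size()
	}
	return total
}

func main() {
	t := reflect.TypeOf(TestStruct2{})
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		fmt.Printf("%s: offset=%d size=%d align=%d\n",
			f.Name, f.Offset, f.Type.Size(), f.Type.FieldAlign())
	}
	fmt.Println("total size:", t.Size(), "padding:", padding(TestStruct2{}))
}
```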
That's all about memory alignment in golang. If you found it useful, please remember to give it a like!