Performance: the file system
2022-06-25 10:27:00 【Ash technology】
This is the last article in the performance series. It is a set of study notes, and the examples are excerpted from other articles; the goal is mainly to explain how to use the tools and how to read the corresponding metrics. This article covers the file system, and more specifically the disk.
As before, the material is organized as: basic knowledge -> common commands and tools -> troubleshooting ideas.
I. Basic knowledge
1. File system: built on top of the disk, it provides a tree structure for managing files; it is the mechanism for organizing and managing the files on a storage device.
To make management easier, the Linux file system assigns two data structures to each file: the index node (inode) and the directory entry (dentry). They record a file's metadata and the directory structure respectively: the inode is the unique identifier of a file, while directory entries maintain the tree structure of the file system. The relationship between directory entries and inodes is many-to-one; put simply, a file can have multiple aliases.
Index node: inode for short, it records the file's metadata, such as the inode number, file size, access permissions, modification time, and the location of its data. Inodes correspond one-to-one with files and, like the file contents, are persisted to disk, so remember that inodes also occupy disk space.
Directory entry: dentry for short, it records the file name, a pointer to its inode, and its relationships with other directory entries. Associated directory entries together form the directory structure of the file system. Unlike inodes, however, directory entries are an in-memory data structure maintained by the kernel, so they are also called the directory entry cache.
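As a quick illustration (the path below is only an example), a file's inode number and the metadata stored in it can be inspected from the shell:
# Show the inode number assigned to the file
$ ls -i /etc/hostname
# Show the metadata recorded in the inode: size, permissions, timestamps and the inode number itself
$ stat /etc/hostname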
2. Virtual File System, VFS (Virtual File System): to support many different file systems, the Linux kernel introduces another layer of abstraction between user processes and the concrete file systems. VFS defines a set of data structures and standard interfaces that all file systems support; user processes and the other kernel subsystems only need to interact with the unified interface provided by VFS, without caring about the implementation details of the underlying file systems.
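A minimal way to observe this abstraction from the shell (assuming strace is installed; on some systems the call appears as open rather than openat):
# File systems currently registered with the kernel's VFS layer
$ cat /proc/filesystems
# Regardless of which of them backs the path, the application goes through the same
# openat/read/write/close system calls
$ strace -e trace=openat,read,write,close cat /etc/hostname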
3. File system I/O: VFS provides a set of standard file access interfaces, exposed to applications as system calls such as open(), read() and write(). File I/O can be classified along the following four dimensions.
First, depending on whether the standard library cache is used, file I/O can be divided into buffered I/O and unbuffered I/O.
1. Buffered I/O uses the standard library cache to accelerate file access; the standard library in turn accesses the file through system calls.
2. Unbuffered I/O accesses the file directly through system calls, without going through the standard library cache.
Second, depending on whether the operating system's page cache is used, file I/O can be divided into direct I/O and indirect I/O (a small dd sketch after this list illustrates the difference).
1. Direct I/O skips the operating system's page cache and interacts with the file system directly to access the file, usually by specifying the O_DIRECT flag in the system call.
2. Indirect I/O is the opposite: reads and writes go through the page cache first, and the kernel (or an additional system call) later writes the data to disk for real.
Third, depending on whether the application blocks itself, file I/O can be divided into blocking I/O and non-blocking I/O.
1. Blocking I/O means that after the application issues an I/O operation, it blocks the current thread until a response arrives, so it naturally cannot perform other tasks in the meantime.
2. Non-blocking I/O means that after the application issues an I/O operation, the current thread is not blocked and can continue with other tasks, obtaining the result later by polling or event notification.
Fourth, depending on whether the application waits for the result, file I/O can be divided into synchronous and asynchronous I/O.
1. Synchronous I/O means that after the application issues an I/O operation, it waits until the whole I/O completes before it gets the response.
2. Asynchronous I/O means that after the application issues an I/O operation, it continues running without waiting for completion; once the I/O completes, the result is delivered to the application as an event notification.
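To make the direct vs. indirect (page cache) distinction concrete, here is a small sketch using dd; the temporary file path is only an example, and the reported rates depend entirely on your hardware:
# Indirect (buffered) write: data goes through the page cache; conv=fdatasync makes dd
# flush the data to disk before reporting its rate
$ dd if=/dev/zero of=/tmp/ddtest bs=1M count=128 conv=fdatasync
# Direct write: oflag=direct opens the file with O_DIRECT and bypasses the page cache
$ dd if=/dev/zero of=/tmp/ddtest bs=1M count=128 oflag=direct
$ rm /tmp/ddtest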
4. Disk performance metrics:
1. Utilization: the percentage of time the disk spends processing I/O. Excessive utilization (for example, above 80%) usually means the disk I/O has a performance bottleneck.
2. Saturation: how busy the disk is with I/O. Excessive saturation means the disk has a serious performance bottleneck; at 100% saturation the disk can no longer accept new I/O requests.
3. IOPS (Input/Output Per Second): the number of I/O requests per second.
4. Throughput: the amount of I/O data transferred per second.
5. Response time: the interval between an I/O request being issued and its response being received.
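These metrics are related: throughput ≈ IOPS × average request size. As a rough worked example (the numbers are chosen purely for illustration), 4,000 requests per second at 4 KB each give about 4000 × 4 KB ≈ 16,000 KB/s ≈ 15.6 MiB/s. This is why small random I/O tends to show high IOPS but low throughput, while large sequential I/O shows the opposite.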
II. Common commands and tools
1. df // view disk space
# -h makes the output human-readable; check the disk space used on /dev/sda1
$ df -h /dev/sda1
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 29G 3.1G 26G 11% /
# -i shows inode usage instead of block usage
$ df -i /dev/sda1
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda1 3870720 157460 3713260 5% /
2. Viewing the dentry and inode cache sizes
The kernel uses the slab mechanism to manage the caches of directory entries and inodes. /proc/meminfo only gives the overall slab size; to see each individual slab cache, look at /proc/slabinfo.
# View the overall slab size
$ cat /proc/meminfo | grep -E "SReclaimable|Cached"
Cached: 748316 kB
SwapCached: 0 kB
SReclaimable: 179508 kB
# View the size of each slab cache
$ cat /proc/slabinfo | grep -E '^#|dentry|inode'
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
xfs_inode 0 0 960 17 4 : tunables 0 0 0 : slabdata 0 0 0
...
ext4_inode_cache 32104 34590 1088 15 4 : tunables 0 0 0 : slabdata 2306 2306 0
hugetlbfs_inode_cache 13 13 624 13 2 : tunables 0 0 0 : slabdata 1 1 0
sock_inode_cache 1190 1242 704 23 4 : tunables 0 0 0 : slabdata 54 54 0
shmem_inode_cache 1622 2139 712 23 4 : tunables 0 0 0 : slabdata 93 93 0
proc_inode_cache 3560 4080 680 12 2 : tunables 0 0 0 : slabdata 340 340 0
inode_cache 25172 25818 608 13 2 : tunables 0 0 0 : slabdata 1986 1986 0
dentry 76050 121296 192 21 1 : tunables 0 0 0 : slabdata 5776 5776 0
# View cache usage interactively with slabtop
# Press c to sort by cache size, press a to sort by the number of active objects
$ slabtop
Active / Total Objects (% used) : 277970 / 358914 (77.4%)
Active / Total Slabs (% used) : 12414 / 12414 (100.0%)
Active / Total Caches (% used) : 83 / 135 (61.5%)
Active / Total Size (% used) : 57816.88K / 73307.70K (78.9%)
Minimum / Average / Maximum Object : 0.01K / 0.20K / 22.88K
OBJS ACTIVE USE OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
69804 23094 0% 0.19K 3324 21 13296K dentry
16380 15854 0% 0.59K 1260 13 10080K inode_cache
58260 55397 0% 0.13K 1942 30 7768K kernfs_node_cache
485 413 0% 5.69K 97 5 3104K task_struct
1472 1397 0% 2.00K 92 16 2944K kmalloc-2048
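To watch these caches change while a workload runs, one simple approach (assuming watch is available; the find command is only there to generate metadata traffic that populates the dentry and inode caches):
# In one terminal, highlight changes in the dentry and inode cache rows every 2 seconds
$ watch -d 'cat /proc/slabinfo | grep -E "dentry|inode_cache"'
# In another terminal, walk the file system to populate those caches
$ find / -name test 2>/dev/null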
3. iostat // view disk I/O metrics
# r/s: Number of read requests sent to disk per second
# w/s: Number of write requests sent to disk per second
# rkB/s: The amount of data read from disk per second
# wkB/s: The amount of data written to disk per second
# rrqm/s : Number of read requests merged per second
# wrqm/s: Number of write requests merged per second
# r_await: average time for read requests to be completed, including queue wait time plus device service time, in milliseconds
# w_await: average time for write requests to be completed, including queue wait time plus device service time, in milliseconds
# aqu-sz: average request queue length
# rareq-sz: average read request size, in KB
# wareq-sz: average write request size, in KB
# svctm: average time to service an I/O request, excluding wait time, in milliseconds
# %util: percentage of time the disk spends processing I/O
# -d -x shows the I/O metrics of all disks
# note: -d displays device I/O statistics;
# -x displays extended statistics (i.e., all of the I/O metrics above)
$ iostat -d -x 1
Device r/s w/s rkB/s wkB/s rrqm/s wrqm/s %rrqm %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
loop1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdb 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
Mapping these fields back to the disk metrics above: %util is the disk I/O utilization; r/s + w/s gives the IOPS; rkB/s + wkB/s gives the throughput; r_await and w_await are the response times.
4. iotop, pidstat // sort processes by I/O usage
$ iotop
Total DISK READ : 0.00 B/s | Total DISK WRITE : 7.85 K/s
Actual DISK READ: 0.00 B/s | Actual DISK WRITE: 0.00 B/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
15055 be/3 root 0.00 B/s 7.85 K/s 0.00 % 0.00 % systemd-journald
# -d shows each process's disk I/O activity
$ pidstat -d 1
15:08:35 UID PID kB_rd/s kB_wr/s kB_ccwr/s iodelay Command
15:08:36 0 18940 0.00 45816.00 0.00 96 python
15:08:36 UID PID kB_rd/s kB_wr/s kB_ccwr/s iodelay Command
15:08:37 0 354 0.00 0.00 0.00 350 jbd2/sda1-8
15:08:37 0 18940 0.00 46000.00 0.00 96 python
15:08:37 0 20065 0.00 0.00 0.00 1503 kworker/u4:2
5. strace // observe system calls
# 18940 is the PID of the process found above
$ strace -p 18940
strace: Process 18940 attached
...
mmap(NULL, 314576896, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0f7aee9000
mmap(NULL, 314576896, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f0f682e8000
# Here you can see a single write of about 300 MB of data
write(3, "2018-12-05 15:23:01,709 - __main"..., 314572844
) = 314572844
munmap(0x7f0f682e8000, 314576896) = 0
write(3, "\n", 1) = 1
munmap(0x7f0f7aee9000, 314576896) = 0
close(3) = 0
# Here you can see the process querying the status of /tmp/logtest.txt.1
stat("/tmp/logtest.txt.1", {st_mode=S_IFREG|0644, st_size=943718535, ...}) = 0
6. lsof // view the files opened by a process
# FD: the file descriptor number
# TYPE: the file type
# NAME: the file path
$ lsof -p 18940
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python 18940 root cwd DIR 0,50 4096 1549389 /
python 18940 root rtd DIR 0,50 4096 1549389 /
…
python 18940 root 2u CHR 136,0 0t0 3 /dev/pts/0
python 18940 root 3w REG 8,1 117944320 303 /tmp/logtest.txt
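lsof also works in the other direction: given a file, it lists the processes that currently have it open (the path here is the example log file above):
# Which processes currently hold /tmp/logtest.txt open
$ lsof /tmp/logtest.txt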
7. fio // file system and disk I/O benchmarking
# direct: whether to skip the system cache; 1 means skip it
# iodepth: when using asynchronous I/O (AIO), the maximum number of in-flight I/O requests
# rw: the I/O pattern; read/write mean sequential read/write, randread/randwrite mean random read/write
# ioengine: the I/O engine; it supports sync, libaio (asynchronous), mmap (memory mapping), net (network) and other engines; libaio means asynchronous I/O
# bs: the I/O block size, 4K here (which is also the default)
# filename: the target path; it can be a disk device (to test disk performance) or a regular file (to test file system performance). Note that write tests against a disk device will destroy the file system on that disk, so back up your data before running them.
# Random read
fio -name=randread -direct=1 -iodepth=64 -rw=randread -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=/dev/sdb
# Random write
fio -name=randwrite -direct=1 -iodepth=64 -rw=randwrite -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=/dev/sdb
# Sequential reading
fio -name=read -direct=1 -iodepth=64 -rw=read -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=/dev/sdb
# Sequential writing
fio -name=write -direct=1 -iodepth=64 -rw=write -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=/dev/sdb
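Since -filename can also point to a regular file (see the parameter notes above), here is a hedged sketch for benchmarking the file system instead of the raw device; the mount point and file name are only examples:
# Random read against a file on a mounted file system rather than the block device
fio -name=fsrandread -direct=1 -iodepth=64 -rw=randread -ioengine=libaio -bs=4k -size=1G -numjobs=1 -runtime=1000 -group_reporting -filename=/mnt/test/fio.data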
Sample report output (from the sequential read test above):
read: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=64
fio-3.1
Starting 1 process
Jobs: 1 (f=1): [R(1)][100.0%][r=16.7MiB/s,w=0KiB/s][r=4280,w=0 IOPS][eta 00m:00s]
read: (groupid=0, jobs=1): err= 0: pid=17966: Sun Dec 30 08:31:48 2018
read: IOPS=4257, BW=16.6MiB/s (17.4MB/s)(1024MiB/61568msec)
# slat: the time from I/O submission until it is actually issued (submission latency)
# clat: the time from I/O submission until the I/O completes (completion latency)
# lat: the total time from fio creating the I/O until it completes
slat (usec): min=2, max=2566, avg= 4.29, stdev=21.76
clat (usec): min=228, max=407360, avg=15024.30, stdev=20524.39
lat (usec): min=243, max=407363, avg=15029.12, stdev=20524.26
clat percentiles (usec):
| 1.00th=[ 498], 5.00th=[ 1020], 10.00th=[ 1319], 20.00th=[ 1713],
| 30.00th=[ 1991], 40.00th=[ 2212], 50.00th=[ 2540], 60.00th=[ 2933],
| 70.00th=[ 5407], 80.00th=[ 44303], 90.00th=[ 45351], 95.00th=[ 45876],
| 99.00th=[ 46924], 99.50th=[ 46924], 99.90th=[ 48497], 99.95th=[ 49021],
| 99.99th=[404751]
# bw: the throughput
bw ( KiB/s): min= 8208, max=18832, per=99.85%, avg=17005.35, stdev=998.94, samples=123
# iops: the number of I/O operations per second
iops : min= 2052, max= 4708, avg=4251.30, stdev=249.74, samples=123
lat (usec) : 250=0.01%, 500=1.03%, 750=1.69%, 1000=2.07%
lat (msec) : 2=25.64%, 4=37.58%, 10=2.08%, 20=0.02%, 50=29.86%
lat (msec) : 100=0.01%, 500=0.02%
cpu : usr=1.02%, sys=2.97%, ctx=33312, majf=0, minf=75
IO depths : 1=0.1%, 2=0.1%, 4=0.1%, 8=0.1%, 16=0.1%, 32=0.1%, >=64=100.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.1%, >=64=0.0%
issued rwt: total=262144,0,0, short=0,0,0, dropped=0,0,0
latency : target=0, window=0, percentile=100.00%, depth=64
Run status group 0 (all jobs):
READ: bw=16.6MiB/s (17.4MB/s), 16.6MiB/s-16.6MiB/s (17.4MB/s-17.4MB/s), io=1024MiB (1074MB), run=61568-61568msec
Disk stats (read/write):
sdb: ios=261897/0, merge=0/0, ticks=3912108/0, in_queue=3474336, util=90.09%
Note: fio (Flexible I/O Tester) is the most commonly used file system and disk I/O benchmarking tool: https://github.com/axboe/fio
III. I/O performance troubleshooting ideas
Troubleshooting and locating I/O problems can follow the steps below (a combined command sketch follows the list):
1. Use iostat to find the disk I/O performance bottleneck;
2. Use pidstat to locate the process causing the bottleneck;
3. Analyze that process's I/O behavior (for example with strace and lsof);
4. Combine this with knowledge of how the application works to analyze where the I/O comes from.
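A minimal sketch tying these steps to the tools shown above; the <pid> placeholder should be replaced with the PID that pidstat reports:
# 1. Confirm the disk bottleneck: watch %util, IOPS (r/s + w/s) and throughput (rkB/s + wkB/s)
$ iostat -d -x 1
# 2. Find the process responsible for the I/O
$ pidstat -d 1
# 3. Inspect that process's system calls and open files
$ strace -p <pid>
$ lsof -p <pid>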
References:
https://github.com/axboe/fio
https://www.thomas-krenn.com/en/wiki/Linux_Storage_Stack_Diagram
Linux Performance optimization practice