0%

首先介绍struct disk_stats的字段,接着介绍如何基于这些字段生成/proc/diskstats,然后介绍如何基于/proc/diskstats生成iostat的输出。本文基于linux kernel 3.19.8。

阅读全文 »

安装最新的fio (1)

1
2
3
4
5
6
7
# yum install -y libaio libaio-devel

# git clone http://git.kernel.dk/fio.git
# git checkout tags/fio-3.14
# ./configure --prefix=/usr/local/fio-3.14
# make
# make install

job definition和job instance (2)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
# cat jobfile1
[global]
ioengine=libaio
buffered=0
direct=1
filename=/dev/sdd

[randwrite_job]
rw=randwrite
bs=1m
size=2048G
runtime=30s
numjobs=3
[seqread_job]
rw=read
bs=4k
size=2048G
runtime=30s
numjobs=2
  • job definition: jobfile1中的randwrite_job和seqread_job;它们是静态的,描述I/O的特征(顺序写还是随机写,ioengine,大小等等),可以产生多个job instance;
  • job instance: 运行时产生的job实例(fork)。运行上面的jobfile1会产生5个job instance,其中randwrite_job产生3个(numjobs=3),seqread_job产生2个(numjobs=2);同一个job definition产生的job实例有着同样的I/O特征;
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
终端A:
# ./fio jobfile1
randwrite_job: (g=0): rw=randwrite, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=libaio, iodepth=1
...
seqread_job: (g=0): rw=read, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
...
fio-3.14
Starting 5 processes
Jobs: 5 (f=5): [w(3),R(2)][100.0%][r=56KiB/s,w=82.0MiB/s][r=14,w=82 IOPS][eta 00m:00s]
randwrite_job: (groupid=0, jobs=1): err= 0: pid=30602: Thu Jun 20 18:47:26 2019
......
randwrite_job: (groupid=0, jobs=1): err= 0: pid=30603: Thu Jun 20 18:47:26 2019
......
randwrite_job: (groupid=0, jobs=1): err= 0: pid=30604: Thu Jun 20 18:47:26 2019
......
seqread_job: (groupid=0, jobs=1): err= 0: pid=30605: Thu Jun 20 18:47:26 2019
......
seqread_job: (groupid=0, jobs=1): err= 0: pid=30606: Thu Jun 20 18:47:26 2019
......

Run status group 0 (all jobs):
READ: bw=12.9MiB/s (13.5MB/s), 6597KiB/s-6598KiB/s (6756kB/s-6756kB/s), io=387MiB (406MB), run=30070-30070msec
WRITE: bw=60.5MiB/s (63.5MB/s), 20.1MiB/s-20.2MiB/s (21.1MB/s-21.2MB/s), io=1817MiB (1905MB), run=30003-30023msec

Disk stats (read/write):
sdd: ios=99304/3610, merge=0/0, ticks=58342/176819, in_queue=235709, util=99.77%

终端B:
# ps -ef|grep fio
root 30566 141809 19 18:46 pts/1 00:00:00 ./fio jobfile1
root 30602 30566 0 18:46 ? 00:00:00 ./fio jobfile1
root 30603 30566 0 18:46 ? 00:00:00 ./fio jobfile1
root 30604 30566 0 18:46 ? 00:00:00 ./fio jobfile1
root 30605 30566 4 18:46 ? 00:00:00 ./fio jobfile1
root 30606 30566 4 18:46 ? 00:00:00 ./fio jobfile1

值得一提的是,上面一共有6个fio进程;因为其中一个是主控进程(30566),其他5个才是job实例进程;

常用的fio选项 (3)

size选项(3.1)

它的值有两种形式:

  • 绝对大小,例如10M,20G;
  • 百分比,例如20%;需要文件事先存在;
    无论哪种形式都是指定一个job实例读写的空间(多个job实例的情况下每个job实例都读写这么大的空间);fio运行时间取决于–runtime指定的时间和读写这么多空间所需时间二者中的最小者。

bs 选项(3.2)

I/O单元的大小。

rw选项(3.3)

  • write: 顺序写;
  • randwrite: 随机写;
  • read: 顺序读;
  • randread: 随机读;
  • rw: 顺序混合读写;
  • randrw: 随机混合读写;

ioengine选项 (3.4)

libaio (3.4.1)

即linux native asynchronous I/O,也就是使用io_submit提交I/O请求,然后再异步地使用io_getevents获取结果。可以通过strace来看实际的系统调用。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
终端A:
# ./fio --name=seqwrite \
--rw=write \
--bs=4k \
--size=2048G \
--runtime=30 \
--ioengine=libaio \
--direct=1 \
--iodepth=1 \
--numjobs=1 \
--filename=/dev/sdd

终端B:
# ps -ef|grep fio
root 31136 141809 26 20:00 pts/1 00:00:00 ./fio --name=seqwrite --rw=write --bs=4k --size=2048G --runtime=30 --ioengine=libaio --direct=1 --iodepth=1 --numjobs=1 --filename=/dev/sdd
root 31170 31136 19 20:00 ? 00:00:00 ./fio --name=seqwrite --rw=write --bs=4k --size=2048G --runtime=30 --ioengine=libaio --direct=1 --iodepth=1 --numjobs=1 --filename=/dev/sdd

# strace -p 31170
Process 31170 attached
......
io_submit(140009640251392, 1, {{pwrite, filedes:3, str:"\0\200|:\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., nbytes:4096, offset:981250048}}) = 1
io_getevents(140009640251392, 1, 1, {{(nil), 0x1cb57e8, 4096, 0}}, NULL) = 1
io_submit(140009640251392, 1, {{pwrite, filedes:3, str:"\0\200|:\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., nbytes:4096, offset:981254144}}) = 1
io_getevents(140009640251392, 1, 1, {{(nil), 0x1cb57e8, 4096, 0}}, NULL) = 1
......

sync (3.4.2)

这个名字有点误导,它虽然叫做sync,但并不意味着文件以O_SYNC的方式打开,也不意味着每个write之后会调用fsync/fdatasync操作。实际上,这个sync和libaio是相对的概念,不是先提交I/O请求(io_submit)再异步获取结果(io_getevents),而是使用read/write这样的系统调用来完成I/O。到底write之后会不会调用fsync,要看–fsync或者–fdatasync的设置;

psync (3.4.3)

和sync类似,不同之处在于使用pread和pwrite来进行I/O。

iodepth选项 (3.5)

简单来说,就是一个job实例在一个文件上的inflight的I/O的数。考虑–ioengine=libaio的情景:把I/O请求通过io_submit发出去然后通过io_getevents获取结果,这样一个job实例就可以保持有多个inflight I/O。但是对于–ioengine=sync或者psync,一个job实例只能顺序地调用read/write(pread/pwrite),也就是只能保持一个I/O inflight,所以对于–ioengine=sync或者–ioengine=psync设置iodepth为大于1的值不起作用。对比一下:

  • libaio
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
终端A:
# ./fio --name=seqwrite \
--rw=write \
--bs=4k \
--size=2048G \
--runtime=30 \
--ioengine=libaio \
--direct=1 \
--iodepth=2 \
--numjobs=3 \
--filename=/dev/sdd \
--group_reporting

seqwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=2
...
fio-3.14
Starting 3 processes
Jobs: 3 (f=3): [W(3)][100.0%][w=26.5MiB/s][w=6784 IOPS][eta 00m:00s]
seqwrite: (groupid=0, jobs=3): err= 0: pid=31333: Thu Jun 20 20:10:28 2019
write: IOPS=7158, BW=27.0MiB/s (29.3MB/s)(839MiB/30001msec)
slat (nsec): min=2934, max=37646, avg=6739.80, stdev=2421.02
clat (usec): min=125, max=148817, avg=830.45, stdev=1904.14
lat (usec): min=130, max=148823, avg=837.30, stdev=1904.16
clat percentiles (usec):
| 1.00th=[ 141], 5.00th=[ 347], 10.00th=[ 627], 20.00th=[ 635],
| 30.00th=[ 635], 40.00th=[ 644], 50.00th=[ 660], 60.00th=[ 791],
| 70.00th=[ 799], 80.00th=[ 865], 90.00th=[ 996], 95.00th=[ 1369],
| 99.00th=[ 1516], 99.50th=[ 2573], 99.90th=[12256], 99.95th=[34341],
| 99.99th=[96994]
bw ( KiB/s): min=14528, max=49712, per=100.00%, avg=28649.71, stdev=2786.85, samples=179
iops : min= 3632, max=12428, avg=7162.42, stdev=696.71, samples=179
lat (usec) : 250=4.55%, 500=0.51%, 750=51.07%, 1000=34.20%
lat (msec) : 2=8.99%, 4=0.21%, 10=0.05%, 20=0.34%, 50=0.03%
lat (msec) : 100=0.03%, 250=0.01%
cpu : usr=0.70%, sys=2.53%, ctx=185415, majf=0, minf=98
IO depths : 1=0.1%, 2=100.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,214751,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
WRITE: bw=27.0MiB/s (29.3MB/s), 27.0MiB/s-27.0MiB/s (29.3MB/s-29.3MB/s), io=839MiB (880MB), run=30001-30001msec

Disk stats (read/write):
sdd: ios=122/213903, merge=0/0, ticks=24/176947, in_queue=176925, util=99.77%


终端B:
# iostat -mxd 1 /dev/sdd
avg-cpu: %user %nice %system %iowait %steal %idle
0.09 0.00 0.22 0.00 0.00 99.69

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 8540.00 0.00 33.36 8.00 5.91 0.69 0.00 0.69 0.12 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.06 0.00 0.22 0.03 0.00 99.69

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 8434.00 0.00 32.95 8.00 5.90 0.70 0.00 0.70 0.12 100.00

IO depths: 1=0.1%, 2=100.0%表明iodepth为2。iostat显示avgqu-sz接近6。我们有3个job实例,故每个job实例的iodepth接近2;

  • sync
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
终端A:
# ./fio --name=seqwrite \
--rw=write \
--bs=4k \
--size=2048G \
--runtime=30 \
--ioengine=sync \
--direct=1 \
--iodepth=2 \
--numjobs=3 \
--filename=/dev/sdd \
--group_reporting

seqwrite: (g=0): rw=write, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=sync, iodepth=2
...
fio-3.14
Starting 3 processes
Jobs: 3 (f=3): [W(3)][100.0%][w=33.0MiB/s][w=8457 IOPS][eta 00m:00s]
seqwrite: (groupid=0, jobs=3): err= 0: pid=31445: Thu Jun 20 20:14:31 2019
write: IOPS=5405, BW=21.1MiB/s (22.1MB/s)(634MiB/30001msec)
clat (usec): min=107, max=97490, avg=553.96, stdev=1255.04
lat (usec): min=107, max=97490, avg=554.08, stdev=1255.04
clat percentiles (usec):
| 1.00th=[ 269], 5.00th=[ 310], 10.00th=[ 314], 20.00th=[ 314],
| 30.00th=[ 318], 40.00th=[ 469], 50.00th=[ 490], 60.00th=[ 578],
| 70.00th=[ 668], 80.00th=[ 693], 90.00th=[ 734], 95.00th=[ 750],
| 99.00th=[ 766], 99.50th=[ 783], 99.90th=[14615], 99.95th=[26870],
| 99.99th=[62653]
bw ( KiB/s): min=12400, max=34776, per=99.06%, avg=21419.20, stdev=2159.32, samples=177
iops : min= 3100, max= 8694, avg=5354.80, stdev=539.83, samples=177
lat (usec) : 250=0.64%, 500=53.22%, 750=41.44%, 1000=4.47%
lat (msec) : 2=0.02%, 4=0.01%, 10=0.02%, 20=0.12%, 50=0.04%
lat (msec) : 100=0.02%
cpu : usr=0.42%, sys=1.85%, ctx=162178, majf=0, minf=102
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,162177,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=2

Run status group 0 (all jobs):
WRITE: bw=21.1MiB/s (22.1MB/s), 21.1MiB/s-21.1MiB/s (22.1MB/s-22.1MB/s), io=634MiB (664MB), run=30001-30001msec

Disk stats (read/write):
sdd: ios=120/161214, merge=0/0, ticks=28/87940, in_queue=87943, util=99.78%


终端B:
# iostat -mxd 1 /dev/sdd
avg-cpu: %user %nice %system %iowait %steal %idle
0.03 0.00 0.16 8.20 0.00 91.61

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 4209.00 0.00 16.44 8.00 2.94 0.70 0.00 0.70 0.24 100.00

avg-cpu: %user %nice %system %iowait %steal %idle
0.03 0.00 0.09 9.08 0.00 90.80

Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 4155.00 0.00 16.23 8.00 2.96 0.71 0.00 0.71 0.24 100.10

IO depths: 1=100.0%, 2=0.0%表明iodepth为1;iostat显示avgqu-sz接近3。我们有3个job实例,故每个job实例的iodepth接近1;

direct选项 (3.6)

这个选项决定:打开文件时带不带O_DIRECT标记;

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
# strace -f ./fio --name=seqwrite     \
--rw=write \
--bs=4k \
--size=2048G \
--runtime=30 \
--ioengine=libaio \
--direct=1 \
--iodepth=1 \
--numjobs=1 \
--filename=/dev/sdd \
--group_reporting > strace.log 2>&1

# cat strace.log
...
[pid 31496] open("/dev/sdd", O_RDWR|O_DIRECT|O_NOATIME) = 3
[pid 31496] ioctl(3, BLKFLSBUF, 0x750080) = 0
[pid 31496] fadvise64(3, 0, 2199023255552, POSIX_FADV_SEQUENTIAL) = 0
[pid 31496] io_submit(140395703390208, 1, {{pwrite, filedes:3, str:"5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., nbytes:4096, offset:0}}) = 1
[pid 31496] io_getevents(140395703390208, 1, 1, {{(nil), 0x7d37e8, 4096, 0}}, NULL) = 1
[pid 31496] io_submit(140395703390208, 1, {{pwrite, filedes:3, str:"\0\20\0\0\0\0\0\0\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., nbytes:4096, offset:4096}}) = 1
[pid 31496] io_getevents(140395703390208, 1, 1, {{(nil), 0x7d37e8, 4096, 0}}, NULL) = 1

值得注意的是,ioengine=libaio时,一定要direct=1。因为目前(2018年),libaio只支持direct I/O:libaio只在O_DIRECT的情况下是异步的,在没有O_DIRECT的情况下可能会blocking;

sync选项 (3.7)

和direct选项类似,它决定的是:打开文件时带不带O_SYNC标记;

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
# strace -f ./fio --name=seqwrite      \
--rw=write \
--bs=4k \
--size=2048G \
--runtime=30 \
--ioengine=sync \
--sync=1 \
--iodepth=1 \
--numjobs=1 \
--filename=/dev/sdd \
--group_reporting > strace.log 2>&1

# cat strace.log
...
[pid 31618] open("/dev/sdd", O_RDWR|O_SYNC|O_NOATIME) = 3
[pid 31618] ioctl(3, BLKFLSBUF, 0x747020) = 0
[pid 31618] fadvise64(3, 0, 2199023255552, POSIX_FADV_SEQUENTIAL) = 0
[pid 31618] write(3, "5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., 4096 <unfinished ...>
[pid 31582] <... nanosleep resumed> NULL) = 0
[pid 31582] wait4(31618, 0x7ffcf5649400, WNOHANG, NULL) = 0
[pid 31582] stat("/tmp/fio-dump-status", 0x7ffcf56483c0) = -1 ENOENT (No such file or directory)
[pid 31582] nanosleep({0, 10000000}, <unfinished ...>
[pid 31618] <... write resumed> ) = 4096
[pid 31618] write(3, "5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., 4096) = 4096
[pid 31618] write(3, "5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., 4096) = 4096
[pid 31618] write(3, "5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., 4096) = 4096

fsync和fdatasync选项 (3.8)

如第3.4.2节所说,ioengine=sync或psync,有点误导,因为它不会调用fsync或fdatasync操作。要想在write后调用fsync或者fdatasync,需要加水–fsync=N或者–fdatasync=N。这里的N表示每N个write之后调用fsync或者fdatasync一次。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
# strace -f ./fio --name=seqwrite     \
--rw=write \
--bs=4k \
--size=2048G \
--runtime=30 \
--ioengine=psync \
--fdatasync=3 \
--iodepth=1 \
--numjobs=1 \
--filename=/dev/sdd \
--group_reporting > strace.log 2>&1

#cat strace.log
...
[pid 31670] open("/dev/sdd", O_RDWR|O_NOATIME) = 3
[pid 31670] ioctl(3, BLKFLSBUF, 0x746f40) = 0
[pid 31670] fadvise64(3, 0, 2199023255552, POSIX_FADV_SEQUENTIAL) = 0
[pid 31670] pwrite(3, "5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., 4096, 0) = 4096
[pid 31670] pwrite(3, "5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., 4096, 4096) = 4096
[pid 31670] pwrite(3, "5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., 4096, 8192) = 4096
[pid 31670] fdatasync(3 <unfinished ...>
[pid 31636] <... nanosleep resumed> NULL) = 0
[pid 31636] wait4(31670, 0x7ffe86a9c280, WNOHANG, NULL) = 0
[pid 31636] stat("/tmp/fio-dump-status", 0x7ffe86a9b240) = -1 ENOENT (No such file or directory)
[pid 31636] nanosleep({0, 10000000}, <unfinished ...>
[pid 31670] <... fdatasync resumed> ) = 0
[pid 31670] pwrite(3, "5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., 4096, 12288) = 4096
[pid 31670] pwrite(3, "5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., 4096, 16384) = 4096
[pid 31670] pwrite(3, "5\340(\3148\240\231\26\6\234j\251\362\315\351\n\200S*\7\t\345\r\25pJ%\367\v9\235\30"..., 4096, 20480) = 4096
[pid 31670] fdatasync(3 <unfinished ...>
[pid 31636] <... nanosleep resumed> NULL) = 0
[pid 31636] wait4(31670, 0x7ffe86a9c280, WNOHANG, NULL) = 0
[pid 31636] stat("/tmp/fio-dump-status", 0x7ffe86a9b240) = -1 ENOENT (No such file or directory)
[pid 31636] nanosleep({0, 10000000}, <unfinished ...>
[pid 31670] <... fdatasync resumed> ) = 0

关于fsync/fdatasync需要注意:fsync/fdatasync的耗时不包含在write latency之内。新版本的fio(例如fio-3.13和fio-3.14)单独显示fsync/fdatasync的耗时,而老版本不显示fsync/fdatasync的耗时(这有点使人迷惑:write latency很小——因为不包含fsync/fdatasync耗时——但IOPS又不高,你会疑惑时间花在哪里了)。

有关sync的选项 (3.9)

从前几节可以看到,和sync相关的选项有:

  • –direct,
  • –sync,
  • –fsync(fdatasync);
    而常用的ioengine有两种:
  • libaio
  • sync(psync)。
    如何组合使用呢?
  • ioengine=libaio: –direct=1是必须的(见3.6节),所以,–fsync(fdatasync)就不需要了;而–sync可以和–direct组合使用,但一般测试裸盘性能直接用–direct就可以了。
  • ioengine=sync(psync): 可以选择–direct或者–fsync(fdatasync),选择–direct时可以和–sync组合使用。

所以共有下面几种组合:

  • –ioengine=libaio –direct=1
  • –ioengine=sync(psync) –direct=1
  • –ioengine=sync(psync) –direct=1 –sync=1
  • –ioengine=sync(psync) –fsync=1

解析fio的输出 (4)

运行时输出 (4.1)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
# ./fio --name=randwrite     \
--rw=randwrite \
--bs=4k \
--size=2048G \
--runtime=30 \
--ioengine=libaio \
--direct=1 \
--iodepth=1 \
--numjobs=2 \
--filename=/dev/sdd \
--group_reporting

randwrite: (g=0): rw=randwrite, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=1
...
fio-3.14
Starting 2 processes
Jobs: 2 (f=2): [w(2)][30.0%][w=1473KiB/s][w=368 IOPS][eta 00m:21s]


# ./fio --name=randrw \
--rw=randrw \
--bs=4k \
--size=2048G \
--runtime=30 \
--ioengine=libaio \
--direct=1 \
--iodepth=2 \
--numjobs=3 \
--filename=/dev/sdd \
--group_reporting

randrw: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=2
...
fio-3.14
Starting 3 processes
Jobs: 3 (f=3): [m(3)][43.3%][r=356KiB/s,w=288KiB/s][r=89,w=72 IOPS][eta 00m:17s]

第一个方括号:

  • w: running, doing random writes
  • W: running, doing sequential writes
  • r: running, doing random reads
  • R: running, doing sequential reads
  • m: running, doing mixed random reads/writes
  • M: running, doing mixed sequential reads/writes
  • C: thread created
  • f: thread finishing

其他字段比较直观;

运行后输出 (4.2)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
终端A:
# cat jobfile2
[global]
filename=/dev/sdd
runtime=30

[randrw_job]
rw=randrw
bs=4k
size=2048G
ioengine=libaio
direct=1
iodepth=3
numjobs=2
new_group=1
[seqwrite_job]
rw=write
bs=1m
size=2048G
ioengine=psync
fdatasync=1
iodepth=1
numjobs=3
new_group=1

# ./fio jobfile2
randrw_job: (g=0): rw=randrw, bs=(R) 4096B-4096B, (W) 4096B-4096B, (T) 4096B-4096B, ioengine=libaio, iodepth=3
...
seqwrite_job: (g=1): rw=write, bs=(R) 1024KiB-1024KiB, (W) 1024KiB-1024KiB, (T) 1024KiB-1024KiB, ioengine=psync, iodepth=1
...
fio-3.14
Starting 5 processes
Jobs: 5 (f=5): [m(2),W(3)][100.0%][r=184KiB/s,w=18.2MiB/s][r=46,w=62 IOPS][eta 00m:00s]
randrw_job: (groupid=0, jobs=1): err= 0: pid=32197: Thu Jun 20 21:00:52 2019
read: IOPS=25, BW=102KiB/s (105kB/s)(3072KiB/30069msec)
slat (nsec): min=4625, max=47380, avg=8140.65, stdev=3172.10
clat (msec): min=6, max=702, avg=70.09, stdev=50.63
lat (msec): min=6, max=702, avg=70.10, stdev=50.63
clat percentiles (msec):
| 1.00th=[ 11], 5.00th=[ 16], 10.00th=[ 21], 20.00th=[ 32],
| 30.00th=[ 42], 40.00th=[ 53], 50.00th=[ 62], 60.00th=[ 74],
| 70.00th=[ 88], 80.00th=[ 109], 90.00th=[ 129], 95.00th=[ 140],
| 99.00th=[ 165], 99.50th=[ 180], 99.90th=[ 701], 99.95th=[ 701],
| 99.99th=[ 701]
bw ( KiB/s): min= 48, max= 208, per=50.59%, avg=102.20, stdev=30.54, samples=60
iops : min= 12, max= 52, avg=25.50, stdev= 7.66, samples=60
write: IOPS=26, BW=105KiB/s (108kB/s)(3168KiB/30069msec)
slat (nsec): min=4809, max=48279, avg=8970.20, stdev=3268.66
clat (usec): min=240, max=184217, avg=45829.65, stdev=38448.28
lat (usec): min=247, max=184225, avg=45838.75, stdev=38448.28
clat percentiles (usec):
| 1.00th=[ 249], 5.00th=[ 963], 10.00th=[ 3326], 20.00th=[ 10945],
| 30.00th=[ 18220], 40.00th=[ 27132], 50.00th=[ 34866], 60.00th=[ 44303],
| 70.00th=[ 60556], 80.00th=[ 86508], 90.00th=[108528], 95.00th=[114820],
| 99.00th=[135267], 99.50th=[147850], 99.90th=[183501], 99.95th=[183501],
| 99.99th=[183501]
bw ( KiB/s): min= 55, max= 256, per=49.42%, avg=105.27, stdev=38.93, samples=60
iops : min= 13, max= 64, avg=26.27, stdev= 9.75, samples=60
lat (usec) : 250=0.64%, 500=1.41%, 750=0.32%, 1000=0.19%
lat (msec) : 2=1.28%, 4=2.05%, 10=3.91%, 20=11.15%, 50=30.32%
lat (msec) : 100=30.00%, 250=18.59%, 750=0.13%
cpu : usr=0.07%, sys=0.06%, ctx=1293, majf=0, minf=36
IO depths : 1=0.1%, 2=99.9%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=768,792,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=3
randrw_job: (groupid=0, jobs=1): err= 0: pid=32198: Thu Jun 20 21:00:52 2019
read: IOPS=25, BW=101KiB/s (103kB/s)(3028KiB/30080msec)
slat (nsec): min=5002, max=73687, avg=8442.84, stdev=4368.25
clat (msec): min=3, max=761, avg=72.31, stdev=50.54
lat (msec): min=3, max=761, avg=72.32, stdev=50.54
clat percentiles (msec):
| 1.00th=[ 11], 5.00th=[ 17], 10.00th=[ 21], 20.00th=[ 34],
| 30.00th=[ 45], 40.00th=[ 56], 50.00th=[ 67], 60.00th=[ 80],
| 70.00th=[ 93], 80.00th=[ 109], 90.00th=[ 124], 95.00th=[ 133],
| 99.00th=[ 165], 99.50th=[ 174], 99.90th=[ 760], 99.95th=[ 760],
| 99.99th=[ 760]
bw ( KiB/s): min= 48, max= 208, per=49.83%, avg=100.67, stdev=25.65, samples=60
iops : min= 12, max= 52, avg=25.17, stdev= 6.41, samples=60
write: IOPS=26, BW=108KiB/s (110kB/s)(3240KiB/30080msec)
slat (nsec): min=4540, max=51576, avg=9104.22, stdev=2615.64
clat (usec): min=238, max=164380, avg=43712.83, stdev=37644.08
lat (usec): min=245, max=164387, avg=43722.05, stdev=37644.38
clat percentiles (usec):
| 1.00th=[ 243], 5.00th=[ 627], 10.00th=[ 2573], 20.00th=[ 10159],
| 30.00th=[ 19006], 40.00th=[ 26084], 50.00th=[ 33162], 60.00th=[ 40633],
| 70.00th=[ 52167], 80.00th=[ 81265], 90.00th=[107480], 95.00th=[116917],
| 99.00th=[135267], 99.50th=[147850], 99.90th=[164627], 99.95th=[164627],
| 99.99th=[164627]
bw ( KiB/s): min= 40, max= 224, per=50.64%, avg=107.87, stdev=38.31, samples=60
iops : min= 10, max= 56, avg=26.97, stdev= 9.58, samples=60
lat (usec) : 250=1.28%, 500=1.02%, 750=0.51%, 1000=0.13%
lat (msec) : 2=1.34%, 4=1.98%, 10=4.34%, 20=10.34%, 50=30.82%
lat (msec) : 100=28.27%, 250=19.85%, 750=0.06%, 1000=0.06%
cpu : usr=0.06%, sys=0.07%, ctx=1298, majf=0, minf=37
IO depths : 1=0.1%, 2=99.9%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=757,810,0,0 short=0,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=3
seqwrite_job: (groupid=1, jobs=1): err= 0: pid=32199: Thu Jun 20 21:00:52 2019
write: IOPS=6, BW=6500KiB/s (6656kB/s)(191MiB/30091msec)
clat (usec): min=433, max=1473, avg=869.79, stdev=93.12
lat (usec): min=472, max=1567, avg=907.29, stdev=98.27
clat percentiles (usec):
| 1.00th=[ 449], 5.00th=[ 717], 10.00th=[ 750], 20.00th=[ 848],
| 30.00th=[ 865], 40.00th=[ 873], 50.00th=[ 881], 60.00th=[ 889],
| 70.00th=[ 898], 80.00th=[ 906], 90.00th=[ 938], 95.00th=[ 971],
| 99.00th=[ 1074], 99.50th=[ 1467], 99.90th=[ 1467], 99.95th=[ 1467],
| 99.99th=[ 1467]
bw ( KiB/s): min= 4096, max=12263, per=34.20%, avg=6629.54, stdev=1334.07, samples=59
iops : min= 4, max= 11, avg= 6.46, stdev= 1.24, samples=59
lat (usec) : 500=1.05%, 750=8.90%, 1000=85.86%
lat (msec) : 2=4.19%
fsync/fdatasync/sync_file_range:
sync (msec): min=27, max=918, avg=156.63, stdev=62.89
sync percentiles (msec):
| 1.00th=[ 63], 5.00th=[ 103], 10.00th=[ 116], 20.00th=[ 134],
| 30.00th=[ 144], 40.00th=[ 150], 50.00th=[ 157], 60.00th=[ 161],
| 70.00th=[ 165], 80.00th=[ 174], 90.00th=[ 184], 95.00th=[ 194],
| 99.00th=[ 275], 99.50th=[ 919], 99.90th=[ 919], 99.95th=[ 919],
| 99.99th=[ 919]
cpu : usr=0.03%, sys=0.91%, ctx=1723, majf=0, minf=38
IO depths : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,191,0,0 short=191,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
seqwrite_job: (groupid=1, jobs=1): err= 0: pid=32200: Thu Jun 20 21:00:52 2019
write: IOPS=6, BW=6462KiB/s (6617kB/s)(190MiB/30110msec)
clat (usec): min=394, max=1473, avg=570.53, stdev=144.18
lat (usec): min=433, max=1571, avg=620.61, stdev=155.65
clat percentiles (usec):
| 1.00th=[ 424], 5.00th=[ 453], 10.00th=[ 465], 20.00th=[ 486],
| 30.00th=[ 494], 40.00th=[ 506], 50.00th=[ 529], 60.00th=[ 537],
| 70.00th=[ 553], 80.00th=[ 586], 90.00th=[ 840], 95.00th=[ 865],
| 99.00th=[ 1074], 99.50th=[ 1467], 99.90th=[ 1467], 99.95th=[ 1467],
| 99.99th=[ 1467]
bw ( KiB/s): min= 4096, max=12263, per=34.02%, avg=6594.83, stdev=1319.27, samples=59
iops : min= 4, max= 11, avg= 6.42, stdev= 1.22, samples=59
lat (usec) : 500=33.68%, 750=53.68%, 1000=11.58%
lat (msec) : 2=1.05%
fsync/fdatasync/sync_file_range:
sync (msec): min=26, max=876, avg=157.85, stdev=61.68
sync percentiles (msec):
| 1.00th=[ 29], 5.00th=[ 103], 10.00th=[ 117], 20.00th=[ 136],
| 30.00th=[ 144], 40.00th=[ 150], 50.00th=[ 157], 60.00th=[ 161],
| 70.00th=[ 167], 80.00th=[ 174], 90.00th=[ 186], 95.00th=[ 197],
| 99.00th=[ 279], 99.50th=[ 877], 99.90th=[ 877], 99.95th=[ 877],
| 99.99th=[ 877]
cpu : usr=0.04%, sys=1.12%, ctx=17379, majf=0, minf=38
IO depths : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,190,0,0 short=190,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1
seqwrite_job: (groupid=1, jobs=1): err= 0: pid=32201: Thu Jun 20 21:00:52 2019
write: IOPS=6, BW=6428KiB/s (6582kB/s)(189MiB/30110msec)
clat (usec): min=346, max=1072, avg=491.86, stdev=103.58
lat (usec): min=362, max=1108, avg=540.57, stdev=113.92
clat percentiles (usec):
| 1.00th=[ 371], 5.00th=[ 400], 10.00th=[ 416], 20.00th=[ 445],
| 30.00th=[ 453], 40.00th=[ 461], 50.00th=[ 469], 60.00th=[ 478],
| 70.00th=[ 486], 80.00th=[ 502], 90.00th=[ 578], 95.00th=[ 758],
| 99.00th=[ 963], 99.50th=[ 1074], 99.90th=[ 1074], 99.95th=[ 1074],
| 99.99th=[ 1074]
bw ( KiB/s): min= 4096, max=10240, per=33.84%, avg=6560.19, stdev=1247.56, samples=59
iops : min= 4, max= 10, avg= 6.39, stdev= 1.17, samples=59
lat (usec) : 500=79.37%, 750=15.34%, 1000=4.76%
lat (msec) : 2=0.53%
fsync/fdatasync/sync_file_range:
sync (msec): min=28, max=854, avg=158.76, stdev=59.57
sync percentiles (msec):
| 1.00th=[ 63], 5.00th=[ 104], 10.00th=[ 118], 20.00th=[ 136],
| 30.00th=[ 144], 40.00th=[ 150], 50.00th=[ 157], 60.00th=[ 163],
| 70.00th=[ 167], 80.00th=[ 176], 90.00th=[ 188], 95.00th=[ 197],
| 99.00th=[ 279], 99.50th=[ 852], 99.90th=[ 852], 99.95th=[ 852],
| 99.99th=[ 852]
cpu : usr=0.02%, sys=1.13%, ctx=17185, majf=0, minf=37
IO depths : 1=200.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued rwts: total=0,189,0,0 short=189,0,0,0 dropped=0,0,0,0
latency : target=0, window=0, percentile=100.00%, depth=1

Run status group 0 (all jobs):
READ: bw=203KiB/s (208kB/s), 101KiB/s-102KiB/s (103kB/s-105kB/s), io=6100KiB (6246kB), run=30069-30080msec
WRITE: bw=213KiB/s (218kB/s), 105KiB/s-108KiB/s (108kB/s-110kB/s), io=6408KiB (6562kB), run=30069-30080msec

Run status group 1 (all jobs):
WRITE: bw=18.9MiB/s (19.8MB/s), 6428KiB/s-6500KiB/s (6582kB/s-6656kB/s), io=570MiB (598MB), run=30091-30110msec

Disk stats (read/write):
sdd: ios=1637/4947, merge=0/140809, ticks=107529/282374, in_queue=391910, util=99.77%

终端B:
# ps -ef|grep fio
root 32162 141809 19 21:00 pts/1 00:00:00 ./fio jobfile2
root 32197 32162 1 21:00 ? 00:00:00 ./fio jobfile2
root 32198 32162 1 21:00 ? 00:00:00 ./fio jobfile2
root 32199 32162 1 21:00 ? 00:00:00 ./fio jobfile2
root 32200 32162 1 21:00 ? 00:00:00 ./fio jobfile2
root 32201 32162 1 21:00 ? 00:00:00 ./fio jobfile2

如jobfile2所示,我们有两个job definition:randrw_job和seqwrite_job:

randrw_job有2个job实例:

  • groupid=0 pid=32197: 记作g0-j0
  • groupid=0 pid=32198: 记作g0-j1

seqwrite_job有3个job实例;

  • groupid=1 pid=32199: 记作g1-j0
  • groupid=1 pid=32200: 记作g1-j1
  • groupid=1 pid=32201: 记作g1-j2

一个job实例的输出 (4.2.1)

我们以g0-j0为例,看输出结果:

  • job头
1
randrw_job: (groupid=0, jobs=1): err= 0: pid=32197: Thu Jun 20 21:00:52 2019

显示了所属group,pid,运行时间等;

  • job read统计信息
1
read: IOPS=25, BW=102KiB/s (105kB/s)(3072KiB/30069msec)

BW表示平均带宽,KiB/s是按1K=1024计算的,kB/s是按1K=1000计算的;别的无须多说;

1
slat (nsec): min=4625, max=47380, avg=8140.65, stdev=3172.10

提交延迟,s代表submission;min, max, avg分别代表最小、最大和平均值。stdev表示标准差(standard deviation),越大代表波动越大。注意slat只在–ioengine=libaio的时候才会出现,因为对于–ioengine=sync/psync没有所谓的提交延迟。

1
clat (msec): min=6, max=702, avg=70.09, stdev=50.63

完成延迟,c代表completion;和slat一样,min, max, avg分别代表最小,最大和平均值,stdev表示标准差(standard deviation)。对于–ioengine=libaio,clat表示从提交到完成的延迟。对于–ioengine=sync/psync,fio文档中说clat等于或非常接近于0(因为提交就相当于完成)。但从实验上看,不是这样的:对于–ioengine=sync/psync,不显示slat,clat接近总延迟。

1
lat (msec): min=6, max=702, avg=70.10, stdev=50.63

总延迟,即从I/O被创建到完成的延迟。大致是这样(对吗?):

  • 对于–ioengine=libaio: lat = latency(创建到提交) + slat + clat
  • 对于–ioengine=sync/psync: lat = latency(创建到开始) + clat
1
2
3
4
5
6
clat percentiles (msec):
| 1.00th=[ 11], 5.00th=[ 16], 10.00th=[ 21], 20.00th=[ 32],
| 30.00th=[ 42], 40.00th=[ 53], 50.00th=[ 62], 60.00th=[ 74],
| 70.00th=[ 88], 80.00th=[ 109], 90.00th=[ 129], 95.00th=[ 140],
| 99.00th=[ 165], 99.50th=[ 180], 99.90th=[ 701], 99.95th=[ 701],
| 99.99th=[ 701]

clat的百分位数。1%的clat在11ms以内;5%在16ms以内;10%的在21ms以内;99.9%在701ms以内;99.99%在701ms以内;

1
bw (  KiB/s): min=   48, max=  208, per=50.59%, avg=102.20, stdev=30.54, samples=60

基于采样得到的带宽统计(可以看见,与前面的BW=102KiB/s接近)。min, max, avg分别表示最小、最大和平均值;stdev表示标准差。samples表示采样数。per表示当前job实例(g0-j0)的带宽在组里所占的百分比,即g0-j0在groupid=0的组里(g0-j0和g0-j1), 读带宽占总读带宽的50.59%(g0-j1占49.83%)

1
iops        : min=   12, max=   52, avg=25.50, stdev= 7.66, samples=60

基于采样得到的iops统计(可以看见,与前面的IOPS=25接近)。和bw一样。

  • job write统计信息

和read一样,不赘述;

  • job 总体统计信息
1
2
3
lat (usec)   : 250=0.64%, 500=1.41%, 750=0.32%, 1000=0.19%
lat (msec) : 2=1.28%, 4=2.05%, 10=3.91%, 20=11.15%, 50=30.32%
lat (msec) : 100=30.00%, 250=18.59%, 750=0.13%

所有I/O的总体延迟分布,包含read和write。

  • lat在[0us,250us)区间内的I/O占0.64%;
  • lat在[250us,500us)区间内的I/O占1.41%;
  • lat在[500us,750us)区间内的I/O占0.32%;
  • lat在[750us,1ms)区间的I/O占0.19%;
  • lat在[1ms, 2ms)区间的I/O占1.28%;
  • 以此类推;
    注意和百分位数的区别。
1
cpu          : usr=0.07%, sys=0.06%, ctx=1293, majf=0, minf=36

CPU使用情况。usr表示用户态占比;sys表示内核态占比;ctx表示这个job实例/进程(g0-j0)经历的context switch数;majf和minf分别表示major and minor page faults。

1
IO depths    : 1=0.1%, 2=99.9%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%

IO depth的分布。

  • 1=表示1-2的占比;
  • 2=表示2-4的占比;
  • 4=表示4-8的占比;
  • 以此类推;
1
submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%

在一个submit调用里,提交了多少I/O。4=表示0-4区间内的占比;8=表示4-8区间内的占比;以此类推;不清楚。

1
complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%

和submit类似;不清楚。

1
issued rwts: total=768,792,0,0 short=0,0,0,0 dropped=0,0,0,0

总共发出了多少read/write/trim/x请求,有多少short(read, write, send, recv等返回大小小于请求大小),有多少被dropped。这一行需要和后面的Disk stats (read/write)里的sdd: ios=1637/4947对照来看:在本例中,g0-j0的读是768;g0-j1的读是757;g1-j0,g1-j1,g1-j2没有读操作,所以加在一起是1525,接近sdd: ios=1637/4947里面的1637。
注意:1. 写操作加起来(g0-j0,g0-j1,g1-j0,g1-j1,g1-j2)并不相等,因为g1写的是1m大小,可能分成多个I/O进行的;2. 另外,在文件系统中测试时,这个关系也不一样,可能有元数据操作;

1
latency   : target=0, window=0, percentile=100.00%, depth=3

和latency_target相关,暂忽略。

group统计 (4.2.2)

1
2
3
4
5
6
Run status group 0 (all jobs):
READ: bw=203KiB/s (208kB/s), 101KiB/s-102KiB/s (103kB/s-105kB/s), io=6100KiB (6246kB), run=30069-30080msec
WRITE: bw=213KiB/s (218kB/s), 105KiB/s-108KiB/s (108kB/s-110kB/s), io=6408KiB (6562kB), run=30069-30080msec

Run status group 1 (all jobs):
WRITE: bw=18.9MiB/s (19.8MB/s), 6428KiB/s-6500KiB/s (6582kB/s-6656kB/s), io=570MiB (598MB), run=30091-30110msec

group 0 READ:

  • bw=203KiB/s (208kB/s):所有job实例(g0-j0,g0-j1)的聚合带宽(分别以1K=1024和1K=1000计算);
  • 101KiB/s-102KiB/s (103kB/s-105kB/s):所有job实例中的最小带宽和最大带宽(分别以1K=1024和1K=1000计算);从前面输出可以看出,g0-j0的带宽是102KiB/s;g0-j1的带宽是101KiB/s;
  • io=6100KiB (6246kB):所有job实例的聚合传输量(分别以1K=1024和1K=1000计算);
  • run=30069-30080msec:所有job实例中最短和最长的运行时间;从前面全部输出中可以看出,g0-j0的运行时间是30069msec,g0-j1的运行时间是30080msec;

group 1 WRITE:

  • bw=18.9MiB/s (19.8MB/s):所有job实例(g1-j0,g1-j1,g1-j2)的聚合带宽(分别以1M=1024*1024和1M=1000*1000计算);
  • 6428KiB/s-6500KiB/s (6582kB/s-6656kB/s):所有job实例中的最小带宽和最大带宽(分别以1K=1024和1K=1000计算);从前面输出可以看出,g1-j2的带宽是6428KiB/s,最小;g1-j0的带宽是6500KiB/s,最大;
  • io=570MiB (598MB):所有job实例的聚合传输量(分别以1M=1024*1024和1M=1000*1000计算);
  • run=30091-30110msec:所有job实例中最短和最长的运行时间;从前面全部输出中可以看出,g1-j0的运行时间是30091msec,最小;g1-j1和g1-j2都是30110msec,最大;

磁盘统计 (4.2.3)

1
2
Disk stats (read/write):
sdd: ios=1637/4947, merge=0/140809, ticks=107529/282374, in_queue=391910, util=99.77%
  • ios=1637/4947:iostat的r/sw/s在运行时间上的累积;
  • merge=0/140809:iostat的rrqm/swrqm/s在运行时间上的累积;
  • ticks=107529/282374:disk忙着的ticks数;
  • in_queue=391910:所有I/O在disk queue中花费的ticks总数;
  • util:iostat的util;

说明:上例中,为了看清各个job实例的统计信息,没有加group_reporting;实际使用中,同一个job definition产生的job实例特征相同,所以,把它们的统计聚合起来更容易看。new_group=1为每个job definition创建一个分组(它的job实例都属于次分组);group_reporting=1按组聚合统计;

小结 (5)

记录了fio常用参数以及输出解读;

最近看了下namazu,试图使用它来模拟filesystem以及ethernet的抖动以及错误,发现ethernet inspector使用的是netfilter queue;这里对它做一个简单的调研。

阅读全文 »