From: Zheng Liu <gnehzuil.liu@gmail.com>
To: linux-ext4@vger.kernel.org, xfs@oss.sgi.com,
linux-fsdevel@vger.kernel.org, tytso@mit.edu,
david@fromorbit.com, hch@infradead.org, darrick.wong@oracle.com
Subject: A huge latency in ext4 and xfs because of stable page write
Date: Tue, 11 Dec 2012 16:45:21 +0800 [thread overview]
Message-ID: <20121211084520.GA13277@gmail.com> (raw)
Hi all,
At Tao Bao we meet a problem in our product system which causes a huge latency
because of stable page write. This problem is easy to reproduce in a testing
environment, and I can reproduce it in my desktop with a SATA disk. Here is the
fio config file that is used to reproduce this problem.
config file
-----------
[global]
iodepth=1
directory=/mnt/sda3
direct=0
group_reporting
thread
fallocate=0
runtime=120
[log-append]
ioengine=sync
rw=write
bs=1k
size=10g
filesize=10g
rate=5m
numjobs=1
I run this test case in ext4 and xfs, and both of them are affected by stable
page write. The result shows as below. Please notice the result of latency.
ext4 w/ stable page write
-------------------------
log-append: (groupid=0, jobs=1): err= 0: pid=3455: Tue Dec 11 14:41:27 2012
write: io=614401KB, bw=5119.1KB/s, iops=5119 , runt=120001msec
clat (usec): min=3 , max=225702 , avg= 7.32, stdev=411.95
lat (usec): min=3 , max=225702 , avg= 7.50, stdev=411.95
clat percentiles (usec):
| 1.00th=[ 3], 5.00th=[ 3], 10.00th=[ 3], 20.00th=[ 4],
| 30.00th=[ 4], 40.00th=[ 4], 50.00th=[ 4], 60.00th=[ 5],
| 70.00th=[ 6], 80.00th=[ 12], 90.00th=[ 13], 95.00th=[ 14],
| 99.00th=[ 15], 99.50th=[ 16], 99.90th=[ 18], 99.95th=[ 19],
| 99.99th=[ 25]
bw (KB/s) : min= 5108, max= 5134, per=100.00%, avg=5120.92, stdev= 1.58
lat (usec) : 4=15.93%, 10=59.05%, 20=24.98%, 50=0.03%, 100=0.01%
lat (usec) : 250=0.01%
lat (msec) : 20=0.01%, 250=0.01%
cpu : usr=0.18%, sys=4.31%, ctx=118850, majf=18446744073709551605,
minf=18446744073709538970
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=614401/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=614401KB, aggrb=5119KB/s, minb=5119KB/s, maxb=5119KB/s,
mint=120001msec, maxt=120001msec
Disk stats (read/write):
sda: ios=79/1287, merge=5/151907, ticks=702/147202, in_queue=147861, util=6.25%
ext4 w/o stable page write
--------------------------
log-append: (groupid=0, jobs=1): err= 0: pid=3062: Tue Dec 11 14:52:32 2012
write: io=614400KB, bw=5119.1KB/s, iops=5119 , runt=120001msec
clat (usec): min=3 , max=14565 , avg= 6.15, stdev=25.35
lat (usec): min=3 , max=14566 , avg= 6.29, stdev=25.35
clat percentiles (usec):
| 1.00th=[ 3], 5.00th=[ 3], 10.00th=[ 3], 20.00th=[ 4],
| 30.00th=[ 4], 40.00th=[ 4], 50.00th=[ 4], 60.00th=[ 4],
| 70.00th=[ 5], 80.00th=[ 12], 90.00th=[ 12], 95.00th=[ 13],
| 99.00th=[ 15], 99.50th=[ 15], 99.90th=[ 17], 99.95th=[ 18],
| 99.99th=[ 64]
bw (KB/s) : min= 5108, max= 5140, per=100.00%, avg=5122.26, stdev= 4.54
lat (usec) : 4=17.33%, 10=57.65%, 20=24.98%, 50=0.02%, 100=0.01%
lat (usec) : 250=0.01%
lat (msec) : 2=0.01%, 20=0.01%
cpu : usr=4.12%, sys=0.00%, ctx=58904, majf=18446744073709551605,
minf=18446744073709538969
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=614400/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=614400KB, aggrb=5119KB/s, minb=5119KB/s, maxb=5119KB/s,
mint=120001msec, maxt=120001msec
Disk stats (read/write):
sda: ios=34/1254, merge=4/152032, ticks=157/144066, in_queue=147725, util=5.82%
xfs w/ stable page write
------------------------
log-append: (groupid=0, jobs=1): err= 0: pid=2848: Tue Dec 11 15:39:25 2012
write: io=614401KB, bw=5119.1KB/s, iops=5119 , runt=120002msec
clat (usec): min=1 , max=635643 , avg= 5.02, stdev=1141.37
lat (usec): min=1 , max=635643 , avg= 5.17, stdev=1141.37
clat percentiles (usec):
| 1.00th=[ 1], 5.00th=[ 1], 10.00th=[ 1], 20.00th=[ 1],
| 30.00th=[ 2], 40.00th=[ 2], 50.00th=[ 2], 60.00th=[ 2],
| 70.00th=[ 4], 80.00th=[ 5], 90.00th=[ 6], 95.00th=[ 6],
| 99.00th=[ 9], 99.50th=[ 10], 99.90th=[ 11], 99.95th=[ 12],
| 99.99th=[ 13]
bw (KB/s) : min= 1388, max=11632, per=100.00%, avg=5147.03, stdev=675.82
lat (usec) : 2=21.76%, 4=46.03%, 10=31.65%, 20=0.56%, 50=0.01%
lat (msec) : 20=0.01%, 750=0.01%
cpu : usr=1.09%, sys=1.18%, ctx=58250, majf=18446744073709551605,
minf=18446744073709538970
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=614401/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=614401KB, aggrb=5119KB/s, minb=5119KB/s, maxb=5119KB/s,
mint=120002msec, maxt=120002msec
Disk stats (read/write):
sda: ios=15/1101, merge=0/28, ticks=150/554585, in_queue=554731, util=4.18%
xfs w/o stable page write
-------------------------
log-append: (groupid=0, jobs=1): err= 0: pid=3678: Tue Dec 11 15:01:10 2012
write: io=614401KB, bw=5119.1KB/s, iops=5119 , runt=120001msec
clat (usec): min=1 , max=20866 , avg= 3.50, stdev=26.72
lat (usec): min=1 , max=20866 , avg= 3.60, stdev=26.72
clat percentiles (usec):
| 1.00th=[ 1], 5.00th=[ 1], 10.00th=[ 1], 20.00th=[ 2],
| 30.00th=[ 2], 40.00th=[ 2], 50.00th=[ 2], 60.00th=[ 2],
| 70.00th=[ 5], 80.00th=[ 6], 90.00th=[ 6], 95.00th=[ 8],
| 99.00th=[ 10], 99.50th=[ 10], 99.90th=[ 12], 99.95th=[ 12],
| 99.99th=[ 14]
bw (KB/s) : min= 5110, max= 5132, per=100.00%, avg=5120.84, stdev= 2.25
lat (usec) : 2=15.33%, 4=44.93%, 10=38.33%, 20=1.41%, 50=0.01%
lat (usec) : 500=0.01%
lat (msec) : 50=0.01%
cpu : usr=2.83%, sys=0.00%, ctx=119423, majf=18446744073709551605,
minf=18446744073709538968
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=614401/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=614401KB, aggrb=5119KB/s, minb=5119KB/s, maxb=5119KB/s,
mint=120001msec, maxt=120001msec
Disk stats (read/write):
sda: ios=106/1055, merge=3/35, ticks=1042/522739, in_queue=523780, util=4.56%
We can see that the max of latency is very high with stable page write in ext4
and xfs, and it reduces to a lower value after stable page write is reverted. I
summarize as below.
ext4 xfs (usec, lower is better)
w/ 225702 635643
w/o 14565 20866
Hence, I wonder whether or not we could revert stable page write temporarily.
After it is improved, we could add it back again.
Thanks,
- Zheng
WARNING: multiple messages have this Message-ID (diff)
From: Zheng Liu <gnehzuil.liu@gmail.com>
To: linux-ext4@vger.kernel.org, xfs@oss.sgi.com,
linux-fsdevel@vger.kernel.org, tytso@mit.edu,
david@fromorbit.com, hch@infradead.org, darrick.wong@oracle.com
Subject: A huge latency in ext4 and xfs because of stable page write
Date: Tue, 11 Dec 2012 16:45:21 +0800 [thread overview]
Message-ID: <20121211084520.GA13277@gmail.com> (raw)
Hi all,
At Tao Bao we meet a problem in our product system which causes a huge latency
because of stable page write. This problem is easy to reproduce in a testing
environment, and I can reproduce it in my desktop with a SATA disk. Here is the
fio config file that is used to reproduce this problem.
config file
-----------
[global]
iodepth=1
directory=/mnt/sda3
direct=0
group_reporting
thread
fallocate=0
runtime=120
[log-append]
ioengine=sync
rw=write
bs=1k
size=10g
filesize=10g
rate=5m
numjobs=1
I run this test case in ext4 and xfs, and both of them are affected by stable
page write. The result shows as below. Please notice the result of latency.
ext4 w/ stable page write
-------------------------
log-append: (groupid=0, jobs=1): err= 0: pid=3455: Tue Dec 11 14:41:27 2012
write: io=614401KB, bw=5119.1KB/s, iops=5119 , runt=120001msec
clat (usec): min=3 , max=225702 , avg= 7.32, stdev=411.95
lat (usec): min=3 , max=225702 , avg= 7.50, stdev=411.95
clat percentiles (usec):
| 1.00th=[ 3], 5.00th=[ 3], 10.00th=[ 3], 20.00th=[ 4],
| 30.00th=[ 4], 40.00th=[ 4], 50.00th=[ 4], 60.00th=[ 5],
| 70.00th=[ 6], 80.00th=[ 12], 90.00th=[ 13], 95.00th=[ 14],
| 99.00th=[ 15], 99.50th=[ 16], 99.90th=[ 18], 99.95th=[ 19],
| 99.99th=[ 25]
bw (KB/s) : min= 5108, max= 5134, per=100.00%, avg=5120.92, stdev= 1.58
lat (usec) : 4=15.93%, 10=59.05%, 20=24.98%, 50=0.03%, 100=0.01%
lat (usec) : 250=0.01%
lat (msec) : 20=0.01%, 250=0.01%
cpu : usr=0.18%, sys=4.31%, ctx=118850, majf=18446744073709551605,
minf=18446744073709538970
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=614401/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=614401KB, aggrb=5119KB/s, minb=5119KB/s, maxb=5119KB/s,
mint=120001msec, maxt=120001msec
Disk stats (read/write):
sda: ios=79/1287, merge=5/151907, ticks=702/147202, in_queue=147861, util=6.25%
ext4 w/o stable page write
--------------------------
log-append: (groupid=0, jobs=1): err= 0: pid=3062: Tue Dec 11 14:52:32 2012
write: io=614400KB, bw=5119.1KB/s, iops=5119 , runt=120001msec
clat (usec): min=3 , max=14565 , avg= 6.15, stdev=25.35
lat (usec): min=3 , max=14566 , avg= 6.29, stdev=25.35
clat percentiles (usec):
| 1.00th=[ 3], 5.00th=[ 3], 10.00th=[ 3], 20.00th=[ 4],
| 30.00th=[ 4], 40.00th=[ 4], 50.00th=[ 4], 60.00th=[ 4],
| 70.00th=[ 5], 80.00th=[ 12], 90.00th=[ 12], 95.00th=[ 13],
| 99.00th=[ 15], 99.50th=[ 15], 99.90th=[ 17], 99.95th=[ 18],
| 99.99th=[ 64]
bw (KB/s) : min= 5108, max= 5140, per=100.00%, avg=5122.26, stdev= 4.54
lat (usec) : 4=17.33%, 10=57.65%, 20=24.98%, 50=0.02%, 100=0.01%
lat (usec) : 250=0.01%
lat (msec) : 2=0.01%, 20=0.01%
cpu : usr=4.12%, sys=0.00%, ctx=58904, majf=18446744073709551605,
minf=18446744073709538969
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=614400/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=614400KB, aggrb=5119KB/s, minb=5119KB/s, maxb=5119KB/s,
mint=120001msec, maxt=120001msec
Disk stats (read/write):
sda: ios=34/1254, merge=4/152032, ticks=157/144066, in_queue=147725, util=5.82%
xfs w/ stable page write
------------------------
log-append: (groupid=0, jobs=1): err= 0: pid=2848: Tue Dec 11 15:39:25 2012
write: io=614401KB, bw=5119.1KB/s, iops=5119 , runt=120002msec
clat (usec): min=1 , max=635643 , avg= 5.02, stdev=1141.37
lat (usec): min=1 , max=635643 , avg= 5.17, stdev=1141.37
clat percentiles (usec):
| 1.00th=[ 1], 5.00th=[ 1], 10.00th=[ 1], 20.00th=[ 1],
| 30.00th=[ 2], 40.00th=[ 2], 50.00th=[ 2], 60.00th=[ 2],
| 70.00th=[ 4], 80.00th=[ 5], 90.00th=[ 6], 95.00th=[ 6],
| 99.00th=[ 9], 99.50th=[ 10], 99.90th=[ 11], 99.95th=[ 12],
| 99.99th=[ 13]
bw (KB/s) : min= 1388, max=11632, per=100.00%, avg=5147.03, stdev=675.82
lat (usec) : 2=21.76%, 4=46.03%, 10=31.65%, 20=0.56%, 50=0.01%
lat (msec) : 20=0.01%, 750=0.01%
cpu : usr=1.09%, sys=1.18%, ctx=58250, majf=18446744073709551605,
minf=18446744073709538970
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=614401/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=614401KB, aggrb=5119KB/s, minb=5119KB/s, maxb=5119KB/s,
mint=120002msec, maxt=120002msec
Disk stats (read/write):
sda: ios=15/1101, merge=0/28, ticks=150/554585, in_queue=554731, util=4.18%
xfs w/o stable page write
-------------------------
log-append: (groupid=0, jobs=1): err= 0: pid=3678: Tue Dec 11 15:01:10 2012
write: io=614401KB, bw=5119.1KB/s, iops=5119 , runt=120001msec
clat (usec): min=1 , max=20866 , avg= 3.50, stdev=26.72
lat (usec): min=1 , max=20866 , avg= 3.60, stdev=26.72
clat percentiles (usec):
| 1.00th=[ 1], 5.00th=[ 1], 10.00th=[ 1], 20.00th=[ 2],
| 30.00th=[ 2], 40.00th=[ 2], 50.00th=[ 2], 60.00th=[ 2],
| 70.00th=[ 5], 80.00th=[ 6], 90.00th=[ 6], 95.00th=[ 8],
| 99.00th=[ 10], 99.50th=[ 10], 99.90th=[ 12], 99.95th=[ 12],
| 99.99th=[ 14]
bw (KB/s) : min= 5110, max= 5132, per=100.00%, avg=5120.84, stdev= 2.25
lat (usec) : 2=15.33%, 4=44.93%, 10=38.33%, 20=1.41%, 50=0.01%
lat (usec) : 500=0.01%
lat (msec) : 50=0.01%
cpu : usr=2.83%, sys=0.00%, ctx=119423, majf=18446744073709551605,
minf=18446744073709538968
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=614401/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=614401KB, aggrb=5119KB/s, minb=5119KB/s, maxb=5119KB/s,
mint=120001msec, maxt=120001msec
Disk stats (read/write):
sda: ios=106/1055, merge=3/35, ticks=1042/522739, in_queue=523780, util=4.56%
We can see that the max of latency is very high with stable page write in ext4
and xfs, and it reduces to a lower value after stable page write is reverted. I
summarize as below.
ext4 xfs (usec, lower is better)
w/ 225702 635643
w/o 14565 20866
Hence, I wonder whether or not we could revert stable page write temporarily.
After it is improved, we could add it back again.
Thanks,
- Zheng
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next reply other threads:[~2012-12-11 8:45 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-12-11 8:45 Zheng Liu [this message]
2012-12-11 8:45 ` A huge latency in ext4 and xfs because of stable page write Zheng Liu
2012-12-11 11:17 ` Dave Chinner
2012-12-12 4:18 ` Zheng Liu
2012-12-12 4:49 ` Dave Chinner
2012-12-12 5:13 ` Zheng Liu
2012-12-12 7:23 ` Stefan Ring
2012-12-12 8:17 ` Zheng Liu
2012-12-12 23:07 ` Dave Chinner
2012-12-12 19:47 ` Darrick J. Wong
2012-12-13 2:30 ` Zheng Liu
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20121211084520.GA13277@gmail.com \
--to=gnehzuil.liu@gmail.com \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=hch@infradead.org \
--cc=linux-ext4@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=tytso@mit.edu \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.