* XFS buffered sequential read performance low after kernel upgrade
@ 2010-03-12 11:48 Stan Hoeppner
2010-03-12 12:46 ` Stan Hoeppner
2010-03-13 0:16 ` Dave Chinner
0 siblings, 2 replies; 6+ messages in thread
From: Stan Hoeppner @ 2010-03-12 11:48 UTC (permalink / raw)
To: xfs
Hello,
I'm uncertain whether this is the best place to bring this up. I've been
lurking a short while and it seems almost all posts here deal with dev
issues. On the off chance this is an appropriate forum, here goes.
I believe I recently ran into my first "issue" with XFS. Up to now I've
been pleased as punch with XFS' performance and features. I rolled a new
kernel the other day going from vanilla kernel.org 2.6.31.1 to 2.6.32.9.
This is an i386 binary small mem kernel running on a dual Intel P6 class
system. I tried 2.6.33 but apparently my version of gcc in Debian Lenny is
too old. Anyway, I'm noticing what I believe to be a fairly substantial
decrease in sequential read performance after upgrading my kernel.
The SUT (system under test) has one single platter WD 500GB 7.2K rpm SATA
disk, Sil 3512 chip, sata_sil driver, NCQ disabled, elevator=deadline, disk
carved as follows and with the following mount options and xfs_info:
Filesystem Type Size Used Avail Use% Mounted on
/dev/sda1 ext2 92M 5.9M 81M 7% /boot
/dev/sda2 ext2 33G 6.9G 25G 22% /
/dev/sda6 xfs 94G 2.0G 92G 3% /home
/dev/sda7 xfs 94G 20G 74G 21% /samba
/dev/sda1 /boot ext2 defaults 0 1
/dev/sda2 / ext2 errors=remount-ro 0 2
/dev/sda5 none swap sw 0 0
/dev/sda6 /home xfs defaults 0 0
/dev/sda7 /samba xfs defaults 0 0
~$ xfs_info /home
meta-data=/dev/sda6 isize=256 agcount=4, agsize=6103694 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=24414775, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096
log =internal bsize=4096 blocks=11921, version=2
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=4096 blocks=0, rtextents=0
~$ xfs_info /samba
meta-data=/dev/sda7 isize=256 agcount=4, agsize=6103694 blks
= sectsz=512 attr=2
data = bsize=4096 blocks=24414775, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096
log =internal bsize=4096 blocks=11921, version=2
= sectsz=512 sunit=0 blks, lazy-count=0
realtime =none extsz=4096 blocks=0, rtextents=0
hdparm test results for the filesystems:
/dev/sda2:
Timing O_DIRECT disk reads: 236 MB in 3.01 seconds = 78.48 MB/sec
/dev/sda2:
Timing buffered disk reads: 172 MB in 3.03 seconds = 56.68 MB/sec
/dev/sda6:
Timing O_DIRECT disk reads: 238 MB in 3.00 seconds = 79.21 MB/sec
/dev/sda6:
Timing buffered disk reads: 116 MB in 3.03 seconds = 38.27 MB/sec
/dev/sda7:
Timing O_DIRECT disk reads: 238 MB in 3.01 seconds = 79.10 MB/sec
/dev/sda7:
Timing buffered disk reads: 114 MB in 3.00 seconds = 37.99 MB/sec
Note that XFS is giving up almost 20MB/s to EXT2 in the hdparm read tests
through the Linux buffer cache. IIRC hdparm is supposed to ignore the
filesystem, according to the man page, but I don't think this is true given
that the O_DIRECT read performance for the EXT2 partition and both XFS
partitions is identical. Going through the buffer cache cuts XFS read
performance in half compared to O_DIRECT. EXT2 fares much better, losing
about a third of its O_DIRECT performance to the buffer cache.
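For anyone reproducing these numbers, here is a minimal sketch of the same buffered vs. O_DIRECT comparison using dd alone, dropping the page cache first so the buffered read is genuinely cache-cold (device name and size are examples from this thread; adjust for your system, run as root):

```shell
#!/bin/sh
# Example device and read size; adjust for your system.
DEV=/dev/sda6
MB=400

# Flush dirty pages, then drop the page cache so the buffered
# read below cannot be satisfied from memory.
sync
echo 3 > /proc/sys/vm/drop_caches

# Sequential read through the page cache.
dd if=$DEV of=/dev/null bs=1M count=$MB

# The same read bypassing the page cache with O_DIRECT.
dd if=$DEV of=/dev/null bs=1M count=$MB iflag=direct
```

If the two numbers track each other here but diverge under hdparm, the difference lies in how the tools issue I/O rather than in the kernel.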
Some dd read tests:
~$ dd if=/dev/sda2 of=/dev/null bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 5.69972 s, 71.9 MB/s
~$ dd if=/dev/sda6 of=/dev/null bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 7.53601 s, 54.4 MB/s
~$ dd if=/dev/sda7 of=/dev/null bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 8.25571 s, 49.6 MB/s
Same thing with dd. The XFS partitions lag behind EXT2 by about 20MB/s,
although the overall numbers are better for dd than for hdparm, which matches
my general experience using the two utils.
Some small dd write tests:
EXT2
~$ dd if=/dev/zero of=/test.xfs bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 5.72976 s, 71.5 MB/s
XFS
~$ dd if=/dev/zero of=/home/stan/test.xfs bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 5.97482 s, 69.6 MB/s
XFS
~$ dd if=/dev/zero of=/samba/test.xfs bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 5.91914 s, 68.8 MB/s
XFS seems to keep right up with EXT2 in the write tests. The 35GB EXT2
partition is on the outer edge of the platter, followed inward by the 100GB
XFS /home partition and then the 100GB /samba partition. I think the slight
performance difference in the dd write tests is mostly due to partition
placement on the platter.
I don't recall doing any testing as formal as this with the old kernel.
However, I don't recall sub 40MB/s XFS read numbers from hdparm. That
really surprised me when I went kicking the tires on the new kernel. IIRC,
on the previous kernel, buffered sequential read performance was pretty much
the same for EXT2 and XFS, with XFS showing a small but significant lead in
write performance.
So, finally, my question: Is there a known issue with XFS performance in
kernel 2.6.32.x, or is there something I need to tweak manually in the mount
options or elsewhere in 2.6.32.x that was automatic in the previous kernel? I
created the filesystems on 2.6.21.1 if that has any bearing.
Thanks in advance for any answers or advice you may be able to provide.
--
Stan
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
^ permalink raw reply [flat|nested] 6+ messages in thread

* Re: XFS buffered sequential read performance low after kernel upgrade
2010-03-12 11:48 XFS buffered sequential read performance low after kernel upgrade Stan Hoeppner
@ 2010-03-12 12:46 ` Stan Hoeppner
2010-03-13 0:16 ` Dave Chinner
1 sibling, 0 replies; 6+ messages in thread
From: Stan Hoeppner @ 2010-03-12 12:46 UTC (permalink / raw)
To: xfs
Stan Hoeppner put forth on 3/12/2010 5:48 AM:
> I created the filesystems on 2.6.21.1 if that has any bearing.
That's a typo. It should read 2.6.31.1.
--
Stan
* Re: XFS buffered sequential read performance low after kernel upgrade
2010-03-12 11:48 XFS buffered sequential read performance low after kernel upgrade Stan Hoeppner
2010-03-12 12:46 ` Stan Hoeppner
@ 2010-03-13 0:16 ` Dave Chinner
2010-03-13 10:03 ` Stan Hoeppner
2010-03-13 15:24 ` Robert Brockway
1 sibling, 2 replies; 6+ messages in thread
From: Dave Chinner @ 2010-03-13 0:16 UTC (permalink / raw)
To: Stan Hoeppner; +Cc: xfs
On Fri, Mar 12, 2010 at 05:48:16AM -0600, Stan Hoeppner wrote:
> Hello,
>
> I'm uncertain whether this is the best place to bring this up. I've been
> lurking a short while and it seems almost all posts here deal with dev
> issues. On the off chance this is an appropriate forum, here goes.
....
> hdparm test results for the filesystems:
>
> /dev/sda2:
> Timing O_DIRECT disk reads: 236 MB in 3.01 seconds = 78.48 MB/sec
> /dev/sda2:
> Timing buffered disk reads: 172 MB in 3.03 seconds = 56.68 MB/sec
>
> /dev/sda6:
> Timing O_DIRECT disk reads: 238 MB in 3.00 seconds = 79.21 MB/sec
> /dev/sda6:
> Timing buffered disk reads: 116 MB in 3.03 seconds = 38.27 MB/sec
>
> /dev/sda7:
> Timing O_DIRECT disk reads: 238 MB in 3.01 seconds = 79.10 MB/sec
> /dev/sda7:
> Timing buffered disk reads: 114 MB in 3.00 seconds = 37.99 MB/sec
Those tests don't go through the filesystem - they go directly to the
block device. Hence if these are different to your old kernel, the
problem is outside XFS. I'd suggest the most likely cause is the
elevator - if you are using CFQ try turning off the new low-latency
mode, or changing to deadline or noop and see if the problem goes
away.
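For reference, the active elevator can be inspected and switched at runtime through sysfs; the paths below are the standard ones, and the low_latency knob only exists while CFQ is the active scheduler:

```shell
# Show available schedulers; the active one is in [brackets].
cat /sys/block/sda/queue/scheduler

# Print only the active scheduler name.
sed -n 's/.*\[\(.*\)\].*/\1/p' /sys/block/sda/queue/scheduler

# Switch to deadline at runtime, no reboot needed (as root).
echo deadline > /sys/block/sda/queue/scheduler

# With CFQ active, its low-latency mode can be toggled off:
echo 0 > /sys/block/sda/queue/iosched/low_latency
```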
Also, the XFS partitions are on the inner edge of the disk, so they
are always going to be slower (can be as low as half the speed) than
partitions on the outer edge of the disk for sequential access...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS buffered sequential read performance low after kernel upgrade
2010-03-13 0:16 ` Dave Chinner
@ 2010-03-13 10:03 ` Stan Hoeppner
2010-03-13 10:48 ` Dave Chinner
2010-03-13 15:24 ` Robert Brockway
1 sibling, 1 reply; 6+ messages in thread
From: Stan Hoeppner @ 2010-03-13 10:03 UTC (permalink / raw)
To: xfs
Dave Chinner put forth on 3/12/2010 6:16 PM:
Thanks for the tips Dave.
> Those tests don't go through the filesystem - they go directly to the
> block device. Hence if these are different to your old kernel, the
> problem is outside XFS. I'd suggest the most likely cause is the
> elevator - if you are using CFQ try turning off the new low-latency
> mode, or changing to deadline or noop and see if the problem goes
> away.
I'm using the deadline elevator as the kernel default on both the old and
new kernels. I have been for quite some time as it seems to yield over
double the random I/O seek performance of the cfq elevator for threaded or
multi-tasking workloads. My SATA controller doesn't support NCQ. In my
testing the deadline elevator seems to "restore" the lost performance that
NCQ would have provided, at the cost of a little more kernel/CPU overhead.
> Also, the XFS partitions are on the inner edge of the disk, so they
> are always going to be slower (can be as low as half the speed) than
> partitions on the outer edge of the disk for sequential access...
It's a 500GB single platter drive. I've partitioned a little less than
250GB of it, starting at the outer edge (at least that's what I instructed
cfdisk to do). Relatively speaking, the two 100GB XFS partitions should be
closer to the outer than inner tracks. The EXT2 / partition is obviously
right on the outer edge. The start of the first XFS partition is at the
~35GB mark out of 500GB, so it's right there too, and should have very
similar direct block read performance. Yes?
What boggles me is why my dd write speeds are 15-20MB/s faster than the read
speeds on the XFS partitions, considering, as you state, that the reads are
direct block I/O reads, bypassing the FS. Since I'm writing to a file in
the write tests, doesn't the writing have to go through XFS? If so, why is
it so much faster than the reads? Linux kernel and XFS write buffers? The
16MB cache chip on the drive?
~$ dd if=/dev/sda6 of=/dev/null bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 9.07389 s, 45.1 MB/s
~$ dd if=/dev/zero of=/home/stan/test.xfs bs=4096 count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 6.16098 s, 66.5 MB/s
I'm by no means an expert on this stuff. You guys are. That's why I'm
asking here. ;) I'm just trying to learn how to best optimize my kernel
and filesystem performance. There may be nothing "wrong" here, but it seems
that way at first glance. If so, I'd like to fix/optimize if possible.
Thanks for sharing your expertise, and your time.
--
Stan
* Re: XFS buffered sequential read performance low after kernel upgrade
2010-03-13 10:03 ` Stan Hoeppner
@ 2010-03-13 10:48 ` Dave Chinner
0 siblings, 0 replies; 6+ messages in thread
From: Dave Chinner @ 2010-03-13 10:48 UTC (permalink / raw)
To: Stan Hoeppner; +Cc: xfs
On Sat, Mar 13, 2010 at 04:03:10AM -0600, Stan Hoeppner wrote:
> Dave Chinner put forth on 3/12/2010 6:16 PM:
>
> Thanks for the tips Dave.
>
> > Those tests don't go through the filesystem - they go directly to the
> > block device. Hence if these are different to your old kernel, the
> > problem is outside XFS. I'd suggest the most likely cause is the
> > elevator - if you are using CFQ try turning off the new low-latency
> > mode, or changing to deadline or noop and see if the problem goes
> > away.
>
> I'm using the deadline elevator as the kernel default on both the old and
> new kernels. I have been for quite some time as it seems to yield over
> double the random I/O seek performance of the cfq elevator for threaded or
> multi-tasking workloads. My SATA controller doesn't support NCQ. In my
> testing the deadline elevator seems to "restore" the lost performance that
> NCQ would have provided, at the cost of a little more kernel/CPU overhead.
>
> > Also, the XFS partitions are on the inner edge of the disk, so they
> > are always going to be slower (can be as low as half the speed) than
> > partitions on the outer edge of the disk for sequential access...
>
> It's a 500GB single platter drive. I've partitioned a little less than
> 250GB of it, starting at the outer edge (at least that's what I instructed
> cfdisk to do). Relatively speaking, the two 100GB XFS partitions should be
> closer to the outer than inner tracks. The EXT2 / partition is obviously
> right on the outer edge. The start of the first XFS partition is at the
> ~35GB mark out of 500GB, so it's right there too, and should have very
> similar direct block read performance. Yes?
Yeah, they should be roughly the same given that layout. I'd
check/increase the readahead of the block device to see if that makes
any difference (/sys/block/sd?/bdi/read_ahead_kb).
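A sketch of checking and raising that value; note blockdev reports the window in 512-byte sectors while the sysfs file is in KB (device name is an example, writes need root):

```shell
# Current readahead window, two equivalent views.
cat /sys/block/sda/bdi/read_ahead_kb   # in KB
RA=$(blockdev --getra /dev/sda)        # in 512-byte sectors
echo "readahead: $((RA * 512 / 1024)) KB"

# Quadruple a typical 128 KB default to 512 KB.
echo 512 > /sys/block/sda/bdi/read_ahead_kb
```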
> What boggles me is why my dd write speeds are 15-20MB/s faster than the read
> speeds on the XFS partitions, considering, as you state, that the reads are
> direct block I/O reads, bypassing the FS. Since I'm writing to a file in
> the write tests, doesn't the writing have to go through XFS? If so, why is
> it so much faster than the reads? Linux kernel and XFS write buffers? The
> 16MB cache chip on the drive?
>
> ~$ dd if=/dev/sda6 of=/dev/null bs=4096 count=100000
> 100000+0 records in
> 100000+0 records out
> 409600000 bytes (410 MB) copied, 9.07389 s, 45.1 MB/s
>
> ~$ dd if=/dev/zero of=/home/stan/test.xfs bs=4096 count=100000
> 100000+0 records in
> 100000+0 records out
> 409600000 bytes (410 MB) copied, 6.16098 s, 66.5 MB/s
That doesn't time how long it takes for all the data to get to the
disk, just for it all to get into the cache. Run it like this:
$ time (dd if=/dev/zero of=/home/stan/test.xfs bs=4096 count=100000; sync)
And work out the speed from the time reported, not what dd tells
you.
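A sketch of that measurement with the arithmetic made explicit; the file path and size are the ones used earlier in the thread:

```shell
#!/bin/sh
# Example path and size from the thread; adjust as needed.
FILE=/home/stan/test.xfs
MB=400

t0=$(date +%s.%N)
dd if=/dev/zero of=$FILE bs=1M count=$MB 2>/dev/null
sync                       # wait until the data actually hits disk
t1=$(date +%s.%N)

# Throughput from wall-clock time, not dd's own report.
awk -v mb=$MB -v t0=$t0 -v t1=$t1 \
    'BEGIN { printf "%.1f MB/s\n", mb / (t1 - t0) }'
```

Since sync flushes all dirty data system-wide, run this on an otherwise quiet box. Newer versions of GNU dd also accept conv=fdatasync, which folds the flush into dd's own timing.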
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: XFS buffered sequential read performance low after kernel upgrade
2010-03-13 0:16 ` Dave Chinner
2010-03-13 10:03 ` Stan Hoeppner
@ 2010-03-13 15:24 ` Robert Brockway
1 sibling, 0 replies; 6+ messages in thread
From: Robert Brockway @ 2010-03-13 15:24 UTC (permalink / raw)
To: xfs
On Sat, 13 Mar 2010, Dave Chinner wrote:
> Also, the XFS partitions are on the inner edge of the disk, so they
> are always going to be slower (can be as low as half the speed) than
> partitions on the outer edge of the disk for sequential access...
Hi Dave. Hasn't it been the case for a long time that the sectors on a
HDD are no longer necessarily numbered from the outside in? I understand
some HDDs are numbered in the opposite direction, some are striped, and
HDD manufacturers tend not to release this info. Short of timing tests,
I believe there is no easy way to figure out the sequence anymore.
This has tended to nullify the old wisdom of putting the root filesystem
at the beginning of the disk[1].
[1] Which was done in the past because it was faster.
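Such a timing test is easy to sketch with dd: sample sequential read speed at a few offsets across the raw device and watch how throughput varies by zone (device name is an example; the reads are non-destructive but need root):

```shell
#!/bin/sh
DEV=/dev/sda   # example device; read-only access
# With bs=1M, skip counts 1MB blocks, so gb*1024 blocks = gb GB in.
for gb in 0 100 200 300 400; do
    printf 'offset %3d GB: ' $gb
    dd if=$DEV of=/dev/null bs=1M count=256 skip=$((gb * 1024)) 2>&1 \
        | awk '/copied/ { print $(NF-1), $NF }'
done
```

On a conventionally numbered drive the speeds fall monotonically with offset; a flat or reversed profile would confirm the nonstandard layouts mentioned above.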
Cheers,
Rob
--
Email: robert@timetraveller.org
IRC: Solver
Web: http://www.practicalsysadmin.com
Open Source: The revolution that silently changed the world
Thread overview: 6+ messages
2010-03-12 11:48 XFS buffered sequential read performance low after kernel upgrade Stan Hoeppner
2010-03-12 12:46 ` Stan Hoeppner
2010-03-13 0:16 ` Dave Chinner
2010-03-13 10:03 ` Stan Hoeppner
2010-03-13 10:48 ` Dave Chinner
2010-03-13 15:24 ` Robert Brockway