* HUGE XFS regression in 2.6.32 upto 2.6.38
From: Raz @ 2011-04-12 7:58 UTC (permalink / raw)
To: xfs-oss
Hello Christoph,
I am testing 2.6.38 with the AIM benchmark.
Comparing 2.6.38 with 2.6.27, I noticed that 2.6.27 performs much better
on the sync random writes test, run against a regular file on XFS on a
native Linux partition on a common SATA disk.
I bisected the problem with git and arrived at this commit:
commit 13e6d5cdde0e785aa943810f08b801cadd0935df
Author: Christoph Hellwig <hch@lst.de>
Date: Mon Aug 31 21:00:31 2009 -0300
xfs: merge fsync and O_SYNC handling
The guarantees for O_SYNC are exactly the same as the ones we need to
make for an fsync call (and given that Linux O_SYNC is O_DSYNC the
equivalent is fdatasync, but we treat both the same in XFS), except
with a range data writeout. Jan Kara has started unifying these two
paths for filesystems using the generic helpers, and I've started to
look at XFS.
...
The two tests below show how performance differs before and after the patch:
#test 16) bisect 11
------------------------------------------------------------------------------------------------------------
Test    Test          Elapsed     Iteration  Iteration         Operation
Number  Name          Time (sec)  Count      Rate (loops/sec)  Rate (ops/sec)
------------------------------------------------------------------------------------------------------------
1       sync_disk_rw  30.71       19         0.61869           1583.85 Sync Random Disk Writes (K)/second
------------------------------------------------------------------------------------------------------------
#test 17) bisect 12
------------------------------------------------------------------------------------------------------------
1       sync_disk_rw  69.05       1          0.01448           37.07 Sync Random Disk Writes (K)/second
------------------------------------------------------------------------------------------------------------
Regards
Raz Ben-Yehuda
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
* Re: HUGE XFS regression in 2.6.32 upto 2.6.38
From: Dave Chinner @ 2011-04-12 9:52 UTC (permalink / raw)
To: Raz; +Cc: xfs-oss
On Tue, Apr 12, 2011 at 10:58:53AM +0300, Raz wrote:
> Hello Christoph,
> I am testing 2.6.38 with the AIM benchmark.
> Comparing 2.6.38 with 2.6.27, I noticed that 2.6.27 performs much better
> on the sync random writes test, run against a regular file on XFS on a
> native Linux partition on a common SATA disk.
> I bisected the problem with git and arrived at this commit:
> commit 13e6d5cdde0e785aa943810f08b801cadd0935df
> Author: Christoph Hellwig <hch@lst.de>
> Date: Mon Aug 31 21:00:31 2009 -0300
>
> xfs: merge fsync and O_SYNC handling
>
> The guarantees for O_SYNC are exactly the same as the ones we need to
> make for an fsync call (and given that Linux O_SYNC is O_DSYNC the
> equivalent is fdatasync, but we treat both the same in XFS), except
> with a range data writeout. Jan Kara has started unifying these two
> paths for filesystems using the generic helpers, and I've started to
> look at XFS.
> ...
>
>
> The two tests below show how performance differs before and after the patch:
> #test 16) bisect 11
> ------------------------------------------------------------------------------------------------------------
> Test    Test          Elapsed     Iteration  Iteration         Operation
> Number  Name          Time (sec)  Count      Rate (loops/sec)  Rate (ops/sec)
> ------------------------------------------------------------------------------------------------------------
> 1       sync_disk_rw  30.71       19         0.61869           1583.85 Sync Random Disk Writes (K)/second
> ------------------------------------------------------------------------------------------------------------
That's clearly showing that your sync writes are not hitting the
disk. IOWs, the sync writes are not synchronous at all. There is
no way a single SATA drive can do >1500 writes to stable storage
per second.
IOWs, before this fix, sync writes were broken on your hardware.
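
The argument above can be checked with a back-of-the-envelope estimate. The sketch below is illustrative only: the function name max_sync_iops, the 7200 RPM spindle speed and the 8.5 ms average seek time are assumptions, not figures from this thread. It treats each synchronous random write as costing one average seek plus half a rotation.

```python
def max_sync_iops(rpm: float, avg_seek_ms: float) -> float:
    """Rough ceiling on random synchronous writes per second for one spinning disk.

    Assumes every write pays an average seek plus, on average, half a
    rotation before the target sector passes under the head.
    """
    ms_per_rotation = 60_000.0 / rpm
    avg_rotational_latency_ms = ms_per_rotation / 2.0
    service_time_ms = avg_seek_ms + avg_rotational_latency_ms
    return 1000.0 / service_time_ms


if __name__ == "__main__":
    # Assumed drive parameters (hypothetical, typical of a desktop SATA disk).
    estimate = max_sync_iops(rpm=7200, avg_seek_ms=8.5)
    print(f"estimated ceiling: {estimate:.0f} writes/sec")
```

Under these assumed parameters the ceiling is on the order of 80 writes per second, and even with zero seek time it stays in the low hundreds. Reading the reported rate as writes per second, as the reply does, a figure above 1500 cannot come from writes that each reach the platter.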
> #test 17) bisect 12
> ------------------------------------------------------------------------------------------------------------
> 1       sync_disk_rw  69.05       1          0.01448           37.07 Sync Random Disk Writes (K)/second
> ------------------------------------------------------------------------------------------------------------
And that's pretty typical for a SATA drive where sync writes are
actually hitting the platter correctly.
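
For anyone who wants to reproduce this kind of measurement without AIM, here is a minimal sketch of a sync random write test. It is not the AIM sync_disk_rw implementation; the function name sync_random_write_rate and its default sizes are assumptions chosen for illustration. It opens a file with O_SYNC and times small writes at random block-aligned offsets.

```python
import os
import random
import tempfile
import time


def sync_random_write_rate(path: str, file_size: int = 1 << 20,
                           block_size: int = 4096, count: int = 50) -> float:
    """Return the number of O_SYNC random writes per second achieved on `path`."""
    block = b"\0" * block_size
    blocks_in_file = file_size // block_size
    # O_SYNC asks the kernel to complete each write only once it is durable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_SYNC, 0o600)
    try:
        os.ftruncate(fd, file_size)
        start = time.monotonic()
        for _ in range(count):
            offset = random.randrange(blocks_in_file) * block_size
            os.pwrite(fd, block, offset)
        elapsed = time.monotonic() - start
    finally:
        os.close(fd)
    return count / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    # A temporary directory is used here only so the sketch runs anywhere;
    # point `path` at a file on the filesystem under test instead.
    with tempfile.TemporaryDirectory() as tmp:
        rate = sync_random_write_rate(os.path.join(tmp, "sync_test.bin"))
        print(f"{rate:.1f} sync writes/sec")
```

On a single rotating SATA disk, a result far above what the drive mechanics allow suggests the writes are being acknowledged before they are on stable storage.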
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
* Re: HUGE XFS regression in 2.6.32 upto 2.6.38
From: Raz @ 2011-04-12 11:19 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs-oss
You are correct. The man page says "...until data has been physically
written to the underlying storage".
I missed that one.
Thank you, Dave.
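
As a small illustration of the semantics discussed in this thread, the sketch below shows the two ways an application can ask for the same durability: opening the file with O_SYNC, or doing a normal write followed by fsync. The helper names write_with_o_sync and write_then_fsync are hypothetical, and the code says nothing about how XFS implements either path internally.

```python
import os
import tempfile


def write_with_o_sync(path: str, data: bytes) -> int:
    """Write `data` through a descriptor opened with O_SYNC."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC, 0o600)
    try:
        # The write is meant to return only after the data is on stable storage.
        return os.write(fd, data)
    finally:
        os.close(fd)


def write_then_fsync(path: str, data: bytes) -> int:
    """Write `data` through the page cache, then force it out with fsync."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        written = os.write(fd, data)
        os.fsync(fd)  # block until the kernel reports the data as durable
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        payload = b"x" * 4096
        print(write_with_o_sync(os.path.join(tmp, "a.bin"), payload))
        print(write_then_fsync(os.path.join(tmp, "b.bin"), payload))
```

Both calls should cost roughly one trip to the disk per write when the storage stack honours them, which is why a correct implementation looks slow next to one that returns early.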