From: David Chinner <dgc@sgi.com>
To: Andrew Clayton <andrew@digital-domain.net>
Cc: linux-fsdevel@vger.kernel.org
Subject: Re: XFS regression?
Date: Thu, 11 Oct 2007 11:01:39 +1000 [thread overview]
Message-ID: <20071011010139.GT995458@sgi.com> (raw)
In-Reply-To: <20071010152742.1b2a7bce@zeus.pccl.info>
On Wed, Oct 10, 2007 at 03:27:42PM +0100, Andrew Clayton wrote:
> Hi,
>
> (Seeing as I haven't been able to subscribe or post to the XFS mailing
> list, I'll try here)
>
> I'll try not to flood with information on the first post.
>
> In trying to track down this issue here:
> http://www.spinics.net/lists/raid/msg17195.html
>
> I think I'm seeing a regression in XFS
perhaps not XFS. 2.6.23-rc4, cfq, ia64, 10k rpm scsi
disk, WCE, ctq enabled, nobarrier mount option.
lattest output at defaults:
# cat /sys/block/sdb/queue/nr_requests
128
idle state:
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000160>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000159>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000158>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000159>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000155>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000159>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000155>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000159>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000156>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000158>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000158>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000153>
run dd:
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000456>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000372>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000412>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000365>
(writeback starts)
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.401471>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.544249>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.727238>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.569971>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.715226>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.561839>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.714825>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000389>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000480>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.441398>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.488799>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.701921>
So we've got latencies of 500-700ms there while writeback is occuring. i.e. the
disk is busy.
# echo 4 > /sys/block/sdb/queue/nr_requests
# cat /sys/block/sdb/queue/nr_requests
4
idle:
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000223>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000219>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000219>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000269>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000249>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000220>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000215>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000209>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000214>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000220>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000213>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000239>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000281>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000209>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000326>
run dd:
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.001026>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.001077>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000882>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000197>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000862>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000198>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000199>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000218>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000216>
(writeback starts):
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.080791>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.077202>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.003443>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.083383>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.071555>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.068379>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.077401>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.087573>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.080739>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.069364>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.071922>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.000216>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.076608>
open("/mnt/scratch/testfile", O_WRONLY|O_CREAT|O_TRUNC|O_EXCL, 0600) = 4 <0.069775>
Latencies are an order of magnitude lower at 60-70ms because the disks
have less deep queues. This is expected - deep queues and multiple outstanding
I/Os are the enemy of single I/O latency....
If I remount with barriers enabled, the latency at nr_requests=128 goes
up to a consistent 2.2s. Not surprising, we're flushing the drive
cache very regularly now and it points to the create or truncate
transaction having to pushing log buffers to disk. The latency remains
at 70-80ms at nr_requests=4.
> It seems this problem was introduced between 2.6.18 and 2.6.19.
When the new SATA driver infrastructure was introduced. Do you have NCQ
enabled on more recent kernels and not on 2.6.18? If so, try disabling it
and see if the problem goes away....
> The other thing I've found is that if I do the dd to an ext3 fs (on
> the same disk at least) while running the test in the XFS fs then I
> also see the latencies.
So it's almost certainly pointing at an elevator or driver change, not
an XFS change.
Cheers,
dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
next prev parent reply other threads:[~2007-10-11 1:01 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-10-10 14:27 XFS regression? Andrew Clayton
2007-10-11 1:01 ` David Chinner [this message]
2007-10-11 9:05 ` Andrew Clayton
2007-10-11 14:15 ` Andrew Clayton
2007-10-11 21:53 ` David Chinner
2007-10-12 0:26 ` David Chinner
2007-10-12 11:36 ` Andrew Clayton
2007-10-12 13:28 ` Andrew Clayton
[not found] ` <cc7060690710130635u2a85bc28we36b344c0987b691@mail.gmail.com>
2007-10-14 23:09 ` David Chinner
2007-10-15 9:58 ` Bhagi rathi
2007-10-15 11:57 ` David Chinner
2007-10-14 23:19 ` David Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20071011010139.GT995458@sgi.com \
--to=dgc@sgi.com \
--cc=andrew@digital-domain.net \
--cc=linux-fsdevel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).