From: David Chinner <dgc@sgi.com>
To: David Chinner <dgc@sgi.com>
Cc: Torsten Kaiser <just.for.lkml@googlemail.com>,
	Fengguang Wu <wfg@mail.ustc.edu.cn>,
	Peter Zijlstra <peterz@infradead.org>,
	Maxim Levitsky <maximlevitsky@gmail.com>,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: writeout stalls in current -git
Date: Wed, 7 Nov 2007 13:13:24 +1100
Message-ID: <20071107021324.GD995458@sgi.com>
In-Reply-To: <20071106233114.GB995458@sgi.com>

On Wed, Nov 07, 2007 at 10:31:14AM +1100, David Chinner wrote:
> On Tue, Nov 06, 2007 at 10:53:25PM +0100, Torsten Kaiser wrote:
> > On 11/6/07, David Chinner <dgc@sgi.com> wrote:
> > > Rather than vmstat, can you use something like iostat to show how busy your
> > > disks are?  i.e. are we seeing RMW cycles in the raid5 or some such issue.
> > 
> > Both "vmstat 10" and "iostat -x 10" output from this test:
> > procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> >  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
> >  2  0      0 3700592      0  85424    0    0    31    83  108  244  2  1 95  1
> > -> emerge reads something, don't know for sure what...
> >  1  0      0 3665352      0  87940    0    0   239     2  343  585  2  1 97  0
> ....
> > 
> > The last 20% of the btrace look more or less completely like this, no
> > other programs do any IO...
> > 
> > 253,0    3   104626   526.293450729   974  C  WS 79344288 + 8 [0]
> > 253,0    3   104627   526.293455078   974  C  WS 79344296 + 8 [0]
> > 253,0    1    36469   444.513863133  1068  Q  WS 154998480 + 8 [xfssyncd]
> > 253,0    1    36470   444.513863135  1068  Q  WS 154998488 + 8 [xfssyncd]
>                                                 ^^
> Apparently we are doing synchronous writes. That would explain why
> it is slow. We shouldn't be doing synchronous writes here. I'll see if
> I can reproduce this.
> 
> <goes off and looks>
> 
> Yes, I can reproduce the sync writes coming out of xfssyncd. I'll
> look into this further and send a patch when I have something concrete.

Ok, so it's not synchronous writes that we are doing - we're just
submitting bios tagged as WRITE_SYNC to get the I/O issued quickly.
The "synchronous" nature appears to be coming from higher level
locking when reclaiming inodes (on the flush lock). It appears that
inode write clustering is failing completely, so we are writing the
same block multiple times, i.e. once for each inode in the cluster we
have to write.

This must be a side effect of some other change as we haven't
changed anything in the reclaim code recently.....

/me scurries off to run some tests 

Indeed it is. The patch below should fix the problem - the inode
clusters weren't getting set up properly when inodes were being
read in or allocated. This is a regression, introduced by this
mod:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=da353b0d64e070ae7c5342a0d56ec20ae9ef5cfb

Cheers,

Dave.
-- 
Dave Chinner
Principal Engineer
SGI Australian Software Group

---
 fs/xfs/xfs_iget.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: 2.6.x-xfs-new/fs/xfs/xfs_iget.c
===================================================================
--- 2.6.x-xfs-new.orig/fs/xfs/xfs_iget.c	2007-11-02 13:44:46.000000000 +1100
+++ 2.6.x-xfs-new/fs/xfs/xfs_iget.c	2007-11-07 13:08:42.534440675 +1100
@@ -248,7 +248,7 @@ finish_inode:
 	icl = NULL;
 	if (radix_tree_gang_lookup(&pag->pag_ici_root, (void**)&iq,
 							first_index, 1)) {
-		if ((iq->i_ino & mask) == first_index)
+		if ((XFS_INO_TO_AGINO(mp, iq->i_ino) & mask) == first_index)
 			icl = iq->i_cluster;
 	}
 
