From: Dave Chinner <david@fromorbit.com>
To: Markus Trippelsdorf <markus@trippelsdorf.de>
Cc: xfs@oss.sgi.com
Subject: Re: long hangs when deleting large directories (3.0-rc3)
Date: Wed, 29 Jun 2011 17:24:46 +1000 [thread overview]
Message-ID: <20110629072446.GR561@dastard> (raw)
In-Reply-To: <20110629061954.GA1711@x4.trippels.de>
On Wed, Jun 29, 2011 at 08:19:54AM +0200, Markus Trippelsdorf wrote:
> On 2011.06.29 at 14:31 +1000, Dave Chinner wrote:
> > On Wed, Jun 22, 2011 at 05:30:47PM +1000, Dave Chinner wrote:
> > > Jun 22 08:53:09 x4 kernel: XFS (sdb1): ail: ooo splice, tail 0x12000156e7, item 0x12000156e6
> > > Jun 22 08:53:09 x4 kernel: XFS (sdb1): ail: ooo splice, walked 15503 items
> > > .....
> > > Jun 22 08:53:12 x4 kernel: XFS (sdb1): ail: ooo splice, tail 0x12000156e7, item 0x12000156e6
> > > Jun 22 08:53:12 x4 kernel: XFS (sdb1): ail: ooo splice, walked 16945 items
> > >
> > > Interesting is the LSN of the tail - it's only one sector further on
> > > than the items being inserted. That's what I'd expect from a commit
> > > record write race between two checkpoints. I'll have a deeper look
> > > into whether this can be avoided later tonight and also whether I
> > > can easily implement a "last insert cursor" easily so subsequent
> > > inserts at the same LSN avoid the walk....
> >
> > Ok, so here's a patch that does just this. I should probably also do
> > a little bit of cleanup on the cursor code as well, but this avoids
> > the repeated walks of the AIL to find the insert position.
> >
> > Can you try it without the WQ changes you made, Marcus, and see if
> > the interactivity problems go away?
>
> Sorry to be the bringer of bad news, but this made things much worse:
>
> -------cpu0-usage--------------cpu1-usage--------------cpu2-usage--------------cpu3-usage------ --dsk/sdc-- ---system-- ---load-avg--- --dsk/sdc--
> usr sys idl wai hiq siq:usr sys idl wai hiq siq:usr sys idl wai hiq siq:usr sys idl wai hiq siq| read writ| int csw | 1m 5m 15m |reads writs
> 1 1 98 0 0 0: 0 1 99 0 0 0: 0 1 99 0 0 0: 0 1 99 0 0 0| 0 0 | 603 380 |0.66 0.55 0.28| 0 0
> 1 0 99 0 0 0: 1 0 99 0 0 0: 1 19 80 0 0 0: 0 0 100 0 0 0| 0 0 | 719 383 |0.66 0.55 0.28| 0 0
> 3 1 96 0 0 0: 3 1 96 0 0 0: 1 52 47 0 0 0: 0 0 100 0 0 0| 0 6464k|1847 919 |0.66 0.55 0.28| 0 202
> 2 13 85 0 0 0: 2 2 96 0 0 0: 1 56 43 0 0 0: 1 31 69 0 0 0|4096B 256k|1910 1280 |0.68 0.56 0.28| 1 8
> > 0 1 99 0 0 0: 0 0 100 0 0 0: 0 1 99 0 0 0: 0 100 0 0 0 0| 0 0 |1256 170 |0.68 0.56 0.28| 0 0
> > 0 1 99 0 0 0: 1 1 98 0 0 0: 1 0 99 0 0 0: 0 99 0 0 0 1| 0 0 |1395 229 |0.68 0.56 0.28| 0 0
> > 0 0 100 0 0 0: 0 0 100 0 0 0: 0 3 97 0 0 0: 0 100 0 0 0 0| 0 512B|1304 167 |0.68 0.56 0.28| 0 1
> > 1 1 98 0 0 0: 1 1 98 0 0 0: 0 0 100 0 0 0: 0 99 0 0 0 1| 0 0 |1211 146 |0.68 0.56 0.28| 0 0
> > 0 0 100 0 0 0: 0 0 100 0 0 0: 0 1 99 0 0 0: 0 97 0 0 0 3| 0 0 |1270 149 |0.87 0.60 0.30| 0 0
> 5 2 65 29 0 0: 2 3 95 0 0 0: 1 0 99 0 0 0: 2 24 72 0 0 1| 0 8866k|2654 2398 |0.87 0.60 0.30| 0 496
> 6 2 25 67 0 0: 3 1 59 37 0 0: 0 0 100 0 0 0: 4 4 92 0 0 0| 0 4554k|2224 2494 |0.87 0.60 0.30| 0 399
> 1 1 98 0 0 0: 0 0 83 17 0 0: 1 3 96 0 0 0: 0 1 99 0 0 0| 0 2270k|1079 1030 |0.87 0.60 0.30| 0 200
> 1 1 98 0 0 0: 1 1 98 0 0 0: 0 1 99 0 0 0: 1 0 99 0 0 0| 0 9216B| 713 567 |0.87 0.60 0.30| 0 2
> 0 0 100 0 0 0: 1 1 98 0 0 0: 0 0 100 0 0 0: 0 1 99 0 0 0| 0 0 | 492 386 |0.80 0.59 0.30| 0 0
>
> As you can see in the table above (resolution 1sec) the hang is now
> 5-6 seconds long, instead of the 1-3 seconds seen before.
Interesting. I checked that the ordering was correct in each case
adn that it was behaving correctly here.
Can you add the following patch and send me the dmesg output over a
hang? It will tell me where the cursor is being initialised and when
it is being dropped, so should indicate if a specific insert chain
is getting stuck or doing something stoopid.
Cheers,
Dave
--
Dave Chinner
david@fromorbit.com
---
fs/xfs/xfs_trans_ail.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/fs/xfs/xfs_trans_ail.c b/fs/xfs/xfs_trans_ail.c
index 272e7fa..a087cbb 100644
--- a/fs/xfs/xfs_trans_ail.c
+++ b/fs/xfs/xfs_trans_ail.c
@@ -234,6 +234,8 @@ xfs_trans_ail_cursor_done(
struct xfs_ail_cursor *prev = NULL;
struct xfs_ail_cursor *cur;
+ printk(KERN_WARNING "done cur %p, lip %p/0x%llx\n",
+ done, done->item, done->item ? done->item->li_lsn : 0);
done->item = NULL;
if (done == &ailp->xa_cursors)
return;
@@ -323,6 +325,8 @@ __xfs_trans_ail_cursor_last(
break;
}
out:
+ printk(KERN_WARNING "last cur %p, init %d lsn 0x%llx, lip %p/0x%llx\n",
+ cur, do_init, lsn, lip, lip ? lip->li_lsn : 0);
if (cur)
cur->item = lip;
return lip;
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
next prev parent reply other threads:[~2011-06-29 7:24 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2011-06-18 14:19 long hangs when deleting large directories (3.0-rc3) Markus Trippelsdorf
2011-06-18 14:24 ` Christoph Hellwig
2011-06-18 14:37 ` Markus Trippelsdorf
2011-06-19 8:16 ` Markus Trippelsdorf
2011-06-19 22:24 ` Dave Chinner
2011-06-20 0:54 ` Markus Trippelsdorf
2011-06-20 1:34 ` Dave Chinner
2011-06-20 2:02 ` Markus Trippelsdorf
2011-06-20 2:36 ` Dave Chinner
2011-06-20 6:03 ` Markus Trippelsdorf
2011-06-20 11:13 ` Markus Trippelsdorf
2011-06-20 11:45 ` Michael Monnerie
2011-06-20 12:31 ` Markus Trippelsdorf
2011-06-20 21:16 ` Markus Trippelsdorf
2011-06-21 4:25 ` Dave Chinner
2011-06-21 8:02 ` Markus Trippelsdorf
2011-06-21 18:24 ` Markus Trippelsdorf
2011-06-21 18:57 ` Markus Trippelsdorf
2011-06-21 21:22 ` Markus Trippelsdorf
2011-06-22 0:04 ` Dave Chinner
2011-06-22 7:06 ` Markus Trippelsdorf
2011-06-22 7:30 ` Dave Chinner
2011-06-22 7:40 ` Markus Trippelsdorf
2011-06-29 4:31 ` Dave Chinner
2011-06-29 6:19 ` Markus Trippelsdorf
2011-06-29 7:24 ` Dave Chinner [this message]
2011-06-29 7:41 ` Markus Trippelsdorf
2011-06-29 12:10 ` Dave Chinner
2011-06-29 12:48 ` Markus Trippelsdorf
2011-06-29 15:08 ` Michael Weissenbacher
2011-06-29 23:53 ` Dave Chinner
2011-06-21 10:18 ` Markus Trippelsdorf
2011-06-21 10:40 ` Markus Trippelsdorf
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20110629072446.GR561@dastard \
--to=david@fromorbit.com \
--cc=markus@trippelsdorf.de \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox