From: Dave Chinner <david@fromorbit.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: "xfs@oss.sgi.com" <xfs@oss.sgi.com>,
Stefan Priebe - Profihost AG <s.priebe@profihost.ag>
Subject: Re: [xfs-masters] xfs deadlock in stable kernel 3.0.4
Date: Thu, 22 Sep 2011 09:07:18 +1000
Message-ID: <20110921230718.GS15688@dastard>
In-Reply-To: <20110921122649.GA16602@infradead.org>

On Wed, Sep 21, 2011 at 08:26:49AM -0400, Christoph Hellwig wrote:
> On Wed, Sep 21, 2011 at 09:42:37PM +1000, Dave Chinner wrote:
> > So, the log force not triggering in the AIL code looks to be the
> > problem. That, I simply cannot explain right now - it makes no sense
> > but that is what all the stats and trace events point to. I need to
> > do more investigation.
>
> Could it be that we have a huge amount of instances of xfs_ail_worker
> running at the same time? xfs_sync_wq is marked as WQ_CPU_INTENSIVE,
> so running/runnable workers are not counted towards the concurrency
> limit. From my look at the workqueue code this means we'll spawn new
> instances fairly quickly if the others are stuck. This means more
> and more of them hammering the pinned items, and we'll rarely reach
> the limit where we'd need to do a log force.

No, that's not possible. The XFS_AIL_PUSHING_BIT ensures that there
is only one instance of AIL pushing per struct xfs_ail running at
once. It's also backed up by the fact that I couldn't find a single
worker thread blocked running AIL pushing - it ran the 100 item
scan, got stuck, requeued itself to run again 20ms later....
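
As a rough userspace illustration of that guard (a sketch only, not the
kernel code): a C11 atomic_flag stands in for XFS_AIL_PUSHING_BIT and
test_and_set_bit(), and struct ail_sketch and the ail_sketch_*() helpers
are hypothetical names made up for this example. It also assumes the
worker clears the flag once it has nothing left to push, which is not
shown in the hunks quoted below.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct xfs_ail; not the kernel structure. */
struct ail_sketch {
	atomic_flag	pushing;	/* plays the role of XFS_AIL_PUSHING_BIT */
	int		queued;		/* times the push work was queued */
};

static void ail_sketch_init(struct ail_sketch *ailp)
{
	atomic_flag_clear(&ailp->pushing);
	ailp->queued = 0;
}

/*
 * Queue push work only if no push is already in flight for this AIL.
 * Returns true if this call queued the work, false if it was skipped.
 */
static bool ail_sketch_push(struct ail_sketch *ailp)
{
	if (atomic_flag_test_and_set(&ailp->pushing))
		return false;	/* already pushing, do not queue again */
	ailp->queued++;		/* stands in for queue_delayed_work() */
	return true;
}

/* Assumption: the worker drops the flag when it has finished pushing. */
static void ail_sketch_push_done(struct ail_sketch *ailp)
{
	bool was_pushing = atomic_flag_test_and_set(&ailp->pushing);

	assert(was_pushing);	/* completing a push that never started is a bug */
	atomic_flag_clear(&ailp->pushing);
}
```

A second ail_sketch_push() on the same structure is a no-op until
ail_sketch_push_done() runs, while a push on a different structure is
queued independently, because the flag lives in each structure.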
FYI, what we want the concurrency for in the AIL wq is for multiple
filesystems to be able to run AIL pushing at the same time, which
is why it was set up this way. If one filesystem AIL push blocks,
then an unblocked one will simply run.

> What is also strange is that we allocate a xfs_ail_wq, but don't
> actually use it, although it would have the same idea. Stefan,
> can you try the following patch? This moves the ail work to its
> explicit queue, and makes sure we never have the same work item
> (= same fs to be pushed) concurrently.

Oh, that's a bug. My bad. That definitely needs fixing.

> Note that before Linux 3.1-rc you'll need to edit fs/xfs/xfs_super.c
> to be fs/xfs/linux-2.6/xfs_super.c in the patch manually.
>
>
> Index: linux-2.6/fs/xfs/xfs_super.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_super.c 2011-09-21 08:00:01.864768359 -0400
> +++ linux-2.6/fs/xfs/xfs_super.c 2011-09-21 08:04:01.335266079 -0400
> @@ -1654,7 +1654,7 @@ xfs_init_workqueues(void)
>  	if (!xfs_syncd_wq)
>  		goto out;
>
> -	xfs_ail_wq = alloc_workqueue("xfsail", WQ_CPU_INTENSIVE, 8);
> +	xfs_ail_wq = alloc_workqueue("xfsail", WQ_NON_REENTRANT, 8);
>  	if (!xfs_ail_wq)
>  		goto out_destroy_syncd;

Drop this hunk....

>
> Index: linux-2.6/fs/xfs/xfs_trans_ail.c
> ===================================================================
> --- linux-2.6.orig/fs/xfs/xfs_trans_ail.c 2011-09-21 08:02:28.172765827 -0400
> +++ linux-2.6/fs/xfs/xfs_trans_ail.c 2011-09-21 08:02:46.843266108 -0400
> @@ -538,7 +538,7 @@ out_done:
>  	}
>
>  	/* There is more to do, requeue us. */
> -	queue_delayed_work(xfs_syncd_wq, &ailp->xa_work,
> +	queue_delayed_work(xfs_ail_wq, &ailp->xa_work,
>  					msecs_to_jiffies(tout));
>  }
>
> @@ -575,7 +575,7 @@ xfs_ail_push(
>  	smp_wmb();
>  	xfs_trans_ail_copy_lsn(ailp, &ailp->xa_target, &threshold_lsn);
>  	if (!test_and_set_bit(XFS_AIL_PUSHING_BIT, &ailp->xa_flags))
> -		queue_delayed_work(xfs_syncd_wq, &ailp->xa_work, 0);
> +		queue_delayed_work(xfs_ail_wq, &ailp->xa_work, 0);
>  }

just keep these. Can you repost with a sign-off?

Cheers,
Dave
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs