From: Dave Chinner <david@fromorbit.com>
To: Mark Tinguely <tinguely@sgi.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: force log before unmount
Date: Sat, 23 Jun 2012 09:37:07 +1000 [thread overview]
Message-ID: <20120622233707.GX19223@dastard> (raw)
In-Reply-To: <4FE4D807.6030902@sgi.com>
On Fri, Jun 22, 2012 at 03:39:35PM -0500, Mark Tinguely wrote:
> On 06/21/12 18:50, Dave Chinner wrote:
> >On Thu, Jun 21, 2012 at 02:25:42PM -0500, tinguely@sgi.com wrote:
> >> xfs_ail_push_all_sync(mp->m_ail);
> >> xfs_wait_buftarg(mp->m_ddev_targp);
> >
> >So, what is different about the superblock buffer? It's an uncached
> >buffer. What does that mean? It means that when xfs_wait_buftarg()
> >walks the LRU, it never finds the superblock buffer because uncached
> >buffers are never put on the LRU.
> >
>
> Thanks Dave. Just interested; are the uncached buffers kept off the
> LRU with an extra b_hold reference?
No, they take the uncached path through xfs_buf_rele() which avoids
putting them on the LRU. i.e:
828 void
829 xfs_buf_rele(
830 xfs_buf_t *bp)
831 {
832 struct xfs_perag *pag = bp->b_pag;
833
834 trace_xfs_buf_rele(bp, _RET_IP_);
835
836 if (!pag) {
837 ASSERT(list_empty(&bp->b_lru));
838 ASSERT(RB_EMPTY_NODE(&bp->b_rbnode));
839 if (atomic_dec_and_test(&bp->b_hold))
840 xfs_buf_free(bp);
841 return;
842 }
This first !pag path....
> >Hence, when xfs_ail_push_all_sync() triggers an async write of the
> >superblock, nobody ever waits for it to complete. Hence the log and
> >AIL are torn down, and when the IO completes it goes to remove it
> >from the AIL (which succeeds) and then move the log tail because it
> >was the only item in the AIL, it goes kaboom.
> >
> >So the root cause is that we are not waiting for the superblock IO
> >to complete i.e. it's not at all related to the xfs_sync_worker(), log
> >forces, etc. What we really need to do is after the
> >xfs_log_sbcount() call is write the superblock synchronously, and we
> >can do that in xfs_log_sbcount() because all the callers of
> >xfs_log_sbcount() have this same problem of not waiting for the SB
> >to be written correctly. i.e. add the guts of xfs_sync_fsdata() to
> >xfs_log_sbcount()....
>
>
> I did not expect the sync worker was the cause. Ben's patch is not even
> in at least one of the crashes. I thought it would be good thing to have
> the sync worker idle by the time we write the unmount record in the log.
I guess it's possible it might race with writing back the
superblock, but I think that we need to solve this problem once and
for all rather than moving the sync worker shutdown every time we
find a problem. I think part of that needs to be handled by the log
itself given that really all the sync worker is doing is trying to
idle the log....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
prev parent reply other threads:[~2012-06-22 23:37 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20120621192541.630212563@sgi.com>
2012-06-21 19:25 ` [PATCH] xfs: force log before unmount tinguely
2012-06-21 23:50 ` Dave Chinner
2012-06-22 20:39 ` Mark Tinguely
2012-06-22 23:37 ` Dave Chinner [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120622233707.GX19223@dastard \
--to=david@fromorbit.com \
--cc=tinguely@sgi.com \
--cc=xfs@oss.sgi.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.