From: Brian Foster <bfoster@redhat.com>
To: Josef Bacik <jbacik@fb.com>
Cc: Dave Chinner <david@fromorbit.com>,
"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
"darrick.wong@oracle.com" <darrick.wong@oracle.com>,
Kernel Team <Kernel-team@fb.com>,
Carlos Maiolino <cmaiolino@redhat.com>
Subject: Re: [PATCH] xfs: clear XBF_ASYNC if we attach an iodone callback
Date: Tue, 7 Mar 2017 11:12:46 -0500 [thread overview]
Message-ID: <20170307161246.GB8144@bfoster.bfoster> (raw)
In-Reply-To: <1488902165.9307.26.camel@fb.com>
On Tue, Mar 07, 2017 at 10:56:05AM -0500, Josef Bacik wrote:
> On Tue, 2017-03-07 at 12:22 +1100, Dave Chinner wrote:
> > On Mon, Mar 06, 2017 at 02:00:13PM -0500, Josef Bacik wrote:
> > >
> > > so you retry once, and the second time through we will return false
> > > if
> > > we are unmounting, which means that xfs_buf_do_callbacks gets
> > > called.
> > >
> > > Now you say that xfs_buf_do_callbacks() being called when we
> > > errored
> > > out is dangerous, but I don't understand why. We have this code
> > >
> > > if (need_ail) {
> > > struct xfs_log_item *log_items[need_ail];
> > > int i = 0;
> > > spin_lock(&ailp->xa_lock);
> > > for (blip = lip; blip; blip = blip->li_bio_list) {
> > > iip = INODE_ITEM(blip);
> > > if (iip->ili_logged &&
> > > blip->li_lsn == iip->ili_flush_lsn) {
> > > log_items[i++] = blip;
> > > }
> > > ASSERT(i <= need_ail);
> > > }
> > > /* xfs_trans_ail_delete_bulk() drops the AIL lock.
> > > */
> > > xfs_trans_ail_delete_bulk(ailp, log_items, i,
> > > SHUTDOWN_CORRUPT_INCORE);
> > > }
> > [ Your cut-n-paste is horribly mangled in something non-ascii. ]
> >
> > >
> > > and need_ail is set if we didn't actually end up on disk properly.
> > > So
> > > we delete the thing and mark the fs as fucked. Am I mis-reading
> > > what
> > > is going on here?
> > Yes. That won't shut down the filesystem on an inode write error. It
> > will simply remove the inode from the AIL, the error will do dropped
> > on the floor, and the filesystem will not be shut down even though
> > it's now in a corrupt state. xfs_trans_ail_delete_bulk() only shuts
> > down the filesytem if it doesn't find the item being removed form
> > the AIL in the AIL.
> >
> > As I keep telling people, what needs to happen is this:
> >
> > 1. buffer write error occurs
> > 2. retries exhausted, mark buffer as errored, run callbacks
> > 3. xfs_iflush_done() (and other iodone callbacks) need to
> > look at the buffer error state to determine what to do.
> > 4. if the buffer is errored and recovery is not possible,
> > the callback needs to shut down the filesytem.
> > 5. the callback then needs to call flush abort function for
> > item (e.g. xfs_iflush_abort()) rather than the normal
> > IO completion processing path.
> >
> > >
> > > This is what I did to our 4.6 tree (in testing
> > > still, not actually pushed to production)
> > >
> > > if (bp->b_flags & XBF_ASYNC) {
> > > ASSERT(bp->b_iodone != NULL);
> > >
> > > trace_xfs_buf_item_iodone_async(bp, _RET_IP_);
> > >
> > > xfs_buf_ioerror(bp, 0); /* errno of 0 unsets the
> > > flag
> > > */
> > >
> > > if (!(bp->b_flags & (XBF_STALE|XBF_WRITE_FAIL))) {
> > > bp->b_flags |= XBF_WRITE | XBF_ASYNC
> > > XBF_DONE | XBF_WRITE_FAIL;
> > > xfs_buf_submit(bp);
> > > } else {
> > > + xfs_buf_do_callbacks(bp);
> > > + bp->b_fspriv = NULL;
> > > + bp->b_iodone = NULL;
> > > xfs_buf_relse(bp);
> > > }
> > >
> > > return;
> > > }
> > It's likely you'll end up with silent on disk corruption if you do
> > that because it's just tossing the error state away.
>
> I think I'm getting a handle on this, but I wonder if we don't have
> this problem currently? We get an error on a buffer, we retry it and
> it fails again. If we don't have XFS_ERR_RETRY_FOREVER set then we
> error out and do the normal callbacks as if nothing ever happened, and
> we get the silent corruption without shutting down the file system.
> How is the code as it stands without XFS_ERR_RETRY_FOREVER safe?
> Thanks,
>
We shouldn't get into a situation where we run the callbacks when the
I/O has failed and the fs isn't shut down. IOWs, we do run the callbacks
only when the I/O succeeds, or it fails and the fs is shutdown as a
result (never when the I/O fails but we haven't shut down, which is the
problematic case the patch above introduces).
That tunable error handling logic (XFS_ERR_RETRY_FOREVER) basically
determines when an I/O error is considered permanent, which means
retries are futile and we have no choice but to shutdown the fs. E.g.,
if an error is deemed permanent, we jump to permanent_error:
/*
* Permanent error - we need to trigger a shutdown if we haven't already
* to indicate that inconsistency will result from this action.
*/
permanent_error:
xfs_force_shutdown(mp, SHUTDOWN_META_IO_ERROR);
out_stale:
xfs_buf_stale(bp);
bp->b_flags |= XBF_DONE;
trace_xfs_buf_error_relse(bp, _RET_IP_);
return false;
}
... which shuts down the filesystem before we go any farther.
Brian
> Josef
next prev parent reply other threads:[~2017-03-07 16:13 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-03-03 14:46 [PATCH] xfs: clear XBF_ASYNC if we attach an iodone callback Josef Bacik
2017-03-03 20:49 ` Brian Foster
2017-03-03 21:34 ` Josef Bacik
2017-03-03 22:20 ` Brian Foster
2017-03-04 1:26 ` Josef Bacik
2017-03-04 13:58 ` Brian Foster
2017-03-06 19:00 ` Josef Bacik
2017-03-07 1:22 ` Dave Chinner
2017-03-07 2:37 ` Josef Bacik
2017-03-07 15:56 ` Josef Bacik
2017-03-07 16:12 ` Brian Foster [this message]
2017-04-03 9:19 ` Carlos Maiolino
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170307161246.GB8144@bfoster.bfoster \
--to=bfoster@redhat.com \
--cc=Kernel-team@fb.com \
--cc=cmaiolino@redhat.com \
--cc=darrick.wong@oracle.com \
--cc=david@fromorbit.com \
--cc=jbacik@fb.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).