From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 2/2] xfs: Properly retry failed inode items in case of error during buffer writeback
Date: Mon, 22 May 2017 09:19:06 +1000 [thread overview]
Message-ID: <20170521231906.GU17542@dastard> (raw)
In-Reply-To: <20170520114654.GA3888@bfoster.bfoster>
On Sat, May 20, 2017 at 07:46:56AM -0400, Brian Foster wrote:
> On Sat, May 20, 2017 at 09:39:00AM +1000, Dave Chinner wrote:
> > Adding new flags to the same field that can be asynchronously
> > updated by RMW operations outside the ailp->xa_lock will cause
> > problems in future. There *may not* be a problem right now, but it
> > is poor programming practice to have different coherency processes
> > for the same variable that is updated via RMW operations. In these
> > situations, the only safe way to update the variable is to use an
> > atomic operation.
> >
>
> So is there a reason why we couldn't acquire ->xa_lock to fail the log
> items as we would have done anyways if the metadata writeback had
> succeeded and we were removing the log items from the AIL..
Yes. The ailp->xa_lock that protects AIL state is a highly
contended lock. It should not be used for things that aren't AIL
related, because that will have performance and scalability
implications.
> (e.g., why
> introduce new serialization rules if we don't need to)?
Because we don't tie independent object operational state to a
single subsystem's internal locking requirements. If the log item
does not have its own context lock (which it doesn't), then we need
to use atomic operations to serialise flag updates across different
subsystems so they can safely maintain their own independent
internal locking schemes.
An example is the inode flags - they are serialised by the
i_flags_lock spinlock because there are multiple state changes that
need to be done atomically through different subsystems (including
RCU freeing state). We serialise these updates by an object lock
rather than a subsystem lock because it's the only way we can share
the flag field across different subsystems safely...
> > > But otherwise given the above, that this is primarily I/O error path
> > > code, and that we presumably already have serialization mechanisms in
> > > place in the form of parent object locks, why not just grab the inode
> > > lock in the appropriate places?
> >
> > Because we can't guarantee that we can lock the parent object in IO
> > completion. Deadlock is trivial: process A grabs parent object,
> > blocks on flush lock. IO completion holds flush lock, blocks trying
> > to get parent object lock.
> >
>
> Ah, right. So once we acquire the flush lock and drop the ilock, we
> can no longer block on the ilock from a context responsible for
> releasing the flush lock.
>
> That makes sense, but raises a larger question... if "process A" can
> cause this kind of locking problem in I/O completion context, what
> prevents it from interfering with (re)submission context just the same?
> For example, the AIL pushes an inode, locks, flushes and submits the
> underlying buffer. The buffer I/O fails and is ultimately released back
> to the AIL. Process A comes in, locks the inode and blocks on the flush
> lock. Subsequent AIL pushes and retry attempts fail to lock the inode,
> never even get to the point of checking for LI_FAILED and thus we're
> back to the same problem we have today.
>
> IOW, doesn't this mean we need to check and handle LI_FAILED first off
> in ->iop_push() and not just in response to flush lock failure?
It's entirely possible that we need to do that. This whole "don't
endlessly retry failed buffer writes" thing is a substantial change
in behaviour, so there's bound to be interactions that we don't get
right the first time...
Cheers,
Dave.
>
> Brian
>
> > Cheers,
> >
> > Dave.
> > --
> > Dave Chinner
> > david@fromorbit.com
>
--
Dave Chinner
david@fromorbit.com