All of lore.kernel.org
 help / color / mirror / Atom feed
From: Brian Foster <bfoster@redhat.com>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org, xfs@oss.sgi.com
Subject: Re: [PATCH 1/5] xfs: rework log recovery to submit buffers on LSN boundaries
Date: Fri, 23 Sep 2016 13:08:52 -0400	[thread overview]
Message-ID: <20160923170851.GA18135@bfoster.bfoster> (raw)
In-Reply-To: <20160920001330.GF340@dastard>

On Tue, Sep 20, 2016 at 10:13:30AM +1000, Dave Chinner wrote:
> [ sorry to take so long to get back to this, Brian, I missed your
> reply and only yesterday when I was sorting out for-next updates
> that I still had this on my "for-review" patch stack. ]
> 

No problem. I've been away anyways..

> On Mon, Aug 29, 2016 at 02:17:22PM -0400, Brian Foster wrote:
> > On Mon, Aug 29, 2016 at 11:16:31AM +1000, Dave Chinner wrote:
> > > On Thu, Aug 11, 2016 at 01:11:03PM -0400, Brian Foster wrote:
> > > i.e. We are very careful to write commit records in the correct
> > > order because that is what determines recovery order, but we don't
> > > care what order we write the actual contents of the checkpoints or
> > > whether they interleave with other checkpoints.  As such, ophdrs
> > > change transactions and LSNs without having actually completed
> > > recovery of a checkpoint. 
> > > 
> > > I think writeback should occur when all the transactions with a
> > > given lsn have been committed. I'm not sure there's a simple way to
> > > track and detect this, but using the ophdrs to detect a change of
> > > lsn to trigger buffer writeback does not look correct to me at this
> > > point in time.
> > > 
> > 
> > That is precisely the intent of this patch. What I think could be a
> > problem is something like the following, if possible:
> > 
> >                     CA         CB                  CC CD
> > +---------+--------+--+-------+--+--------+-------+--+--+
> >   trans A   trans B    trans C    trans C  trans D
> 
> Yes, that's possible.
> 

Ok.

> > Assume that trans A and trans B are within the same record and trans C
> > is in a separate record. In that case, we commit trans A which populates
> > buffer_list. We lookup trans C, note a new LSN and drain buffer_list.
> > Then we ultimately commit trans B, which has the same metadata LSN as
> > trans A and thus is a path to the original problem if trans B happened
> > to modify any of the same blocks as trans A.
> 
> Yes, that's right, we still are exposed to the same problem, and
> there's much more convoluted versions of it possible.
> 
> > Do note however that this is just an occurrence of the problem with log
> > recovery as implemented today (once we update metadata LSNs, and is
> > likely rare as I haven't been able to reproduce corruption in many
> > tries).
> 
> Yeah, it's damn hard to intentionally cause interleaving of
> checkpoint and commit records these days because of the delayed
> logging does aggregation in memory rather than in the log buffers
> themselves.
> 

Makes sense.

> > If that analysis is correct, I think a straightforward solution
> > might be to defer submission to the lookup of a transaction with a new
> > LSN that _also_ corresponds with processing of a commit record based on
> > where we are in the on-disk log. E.g.:
> > 
> > 	if (log->l_recovery_lsn != trans->r_lsn &&
> > 	    oh_flags & XLOG_COMMIT_TRANS) {
> > 		error = xfs_buf_delwri_submit(buffer_list);
> > 		...
> > 	}
> > 
> > So in the above, we'd submit buffers for A and B once we visit the
> > commit record for trans C. Thoughts?
> 
> Sounds plausible - let me just check I understood by repeating it
> back. Given the above case, we start with log->l_recovery_lsn set to
> the lsn before trans A and an empty buffer list.
> 
> 1. We now recover trans A and trans B into their respective structures,
> but we don't don't add their dirty buffers to the delwri list yet -
> they are kept internal to the trans.
> 
> 2. We then see commit A, and because the buffer list is empty we
> simply add them to the buffer list and update log->l_recovery_lsn to
> point at the transaction LSN.
> 

Right...

> 3. We now see trans C, and start recovering it into an internal buffer
> list.
> 
> 4. Then we process commit B, see that there are already queued buffers
> and so check the transaction LSN against log->l_recovery_lsn. They
> are the same, so we simply add the transactions dirty buffers to
> the buffer list.
> 

Maybe just weird wording here, but to be precise (and pedantic), the
top-level check is for the current LSN change, not necessarily whether
the buffer_list is empty or not. The behavior is the same either way.

> 5. We continue processing transaction C, and start on transaction D.
> We then see commit C. Buffer list is populated, so we check
> transaction lsn against log->l_recovery_lsn. They are different.
> At this point we know we have fully processed all the transactions
> that are associated with log->l_recovery_lsn, hence we can submit
> the buffer_list and mark it empty again.
> 
> 6. At this point we jump back to step 2, this time processing commit
> C onwards....
> 
> 7. At the end of log recovery, we commit the remaining buffer list
> from the last transaction we recovered from the log.
> 
> Did I understand it right? If so, I think this will work just fine.
> 

Yep, I think so. I'll send an updated version.

Brian

> Thanks, Brian!
> 
> -Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

  reply	other threads:[~2016-09-23 17:09 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-11 17:11 [PATCH 0/5] fix log recovery for v5 superblocks Brian Foster
2016-08-11 17:11 ` [PATCH 1/5] xfs: rework log recovery to submit buffers on LSN boundaries Brian Foster
2016-08-29  1:16   ` Dave Chinner
2016-08-29 18:17     ` Brian Foster
2016-09-20  0:13       ` Dave Chinner
2016-09-23 17:08         ` Brian Foster [this message]
2016-08-11 17:11 ` [PATCH 2/5] xfs: pass current lsn to log recovery buffer validation Brian Foster
2016-08-11 17:11 ` [PATCH 3/5] xfs: don't warn on buffers not being recovered due to LSN Brian Foster
2016-08-29  1:25   ` Dave Chinner
2016-08-29 18:17     ` Brian Foster
2016-08-11 17:11 ` [PATCH 4/5] xfs: update metadata LSN in buffers during log recovery Brian Foster
2016-08-29  1:29   ` Dave Chinner
2016-08-29 18:17     ` Brian Foster
2016-08-11 17:11 ` [PATCH 5/5] xfs: log recovery tracepoints to track current lsn and buffer submission Brian Foster

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160923170851.GA18135@bfoster.bfoster \
    --to=bfoster@redhat.com \
    --cc=david@fromorbit.com \
    --cc=linux-xfs@vger.kernel.org \
    --cc=xfs@oss.sgi.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.