From: Dave Chinner <david@fromorbit.com>
To: Brian Foster <bfoster@redhat.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: [PATCH 1/3] xfs: EFI recovery needs it's own transaction reservation
Date: Fri, 11 Sep 2020 07:29:27 +1000 [thread overview]
Message-ID: <20200910212927.GT12131@dread.disaster.area> (raw)
In-Reply-To: <20200910131810.GA1143857@bfoster>
On Thu, Sep 10, 2020 at 09:18:10AM -0400, Brian Foster wrote:
> On Thu, Sep 10, 2020 at 07:44:55AM +1000, Dave Chinner wrote:
> > On Wed, Sep 09, 2020 at 09:31:11AM -0400, Brian Foster wrote:
> > > It looks like extents are only freed when the last
> > > reference is dropped (otherwise we log a refcount intent), which makes
> > > me wonder whether we really need 7 log count units if recovery
> > > encounters an EFI.
> >
> > I don't know if the numbers are correct, and it really is out of
> > scope for this patch to audit/fix that. I really think we need to
> > map this whole thing out in a diagram at this point because I now
> > suspect that the allocfree log count calculation is not correct,
> > either...
>
> I agree up to the point where it relates to this specific EFI recovery
> issue. reflink is enabled by default, which means the default EFI
> recovery reservation is going to have 7 logcount units. Is that actually
> enough of a reduction to prevent this same recovery problem on newer
> fs'? I'm wondering if the tr_efi logcount should just be set to 1, for
> example, at least for the short term fix.
I spent yesterday mapping out the whole "free extent" chain to
understand exactly what was necessary, and in discussing this with
Darrick on #xfs I came to the conclusion that we cannot -ever- have
a logcount of more than 1 for an intent recovery of any sort.
That's because the transaction may have rolled enough times to
exhaust the initial grant of unit * logcount, so the only
reservation that the runtime kernel has when it crashs is a single
transaction unit reservation (which gets re-reserved on every roll).
Hence recovery cannot assume that intent that was being processed has more
than a single unit reservation of log space available to be used,
and hence all intent recovery reservations must start with a log
count of 1.
THere are other restrictions we need to deal with, too. multiple
intents may pin the tail of the log, so we can't just process a
single intent chain at a time as that will result in using all the
log space for a single intent chain and reservation deadlocking on
one of the intents we haven't yet started.
Hence the first thing we have to do in recovering intents is take an
active reservation for -all- intents in the log to match the
reservation state at runtime. Only then can we guarantee that
intents that pin the tail of the log will have the reservation space
needed to be able to unpin the log tail.
Further, because we now may have consumed all the available log
reservation space to guarantee forwards progress, the first commit
of the first intent may block trying to regrant space if another
intent also pins the tail of the log. This means we cannot replay
intents in a serial manner as we currently do. Each intent chain and it's
transaction context needs to run in it's own process context so it
can block waiting for other intents to make progress, just like
happens at runtime when the system crashed. IOWs, intent replay
needs to be done with a task context per intent chain (probably via
a workqueue).
AFAICT, this is the only way we can make intent recovery deadlock
free - we have to recreate the complete log reservation state in
memory before we start recovery of a single intent, we can only
assume an intent had a single unit reservation of log space assigned
to it, and intent chain execution needs to run concurrently so
commits can block waiting for log space to become available as other
intents commit and unpin the tail of the log...
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2020-09-10 21:29 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-09-09 8:19 [PATCH 0/3] [RFC] xfs: intent recovery reservation changes Dave Chinner
2020-09-09 8:19 ` [PATCH 1/3] xfs: EFI recovery needs it's own transaction reservation Dave Chinner
2020-09-09 13:31 ` Brian Foster
2020-09-09 21:44 ` Dave Chinner
2020-09-10 13:18 ` Brian Foster
2020-09-10 21:29 ` Dave Chinner [this message]
2020-09-11 12:37 ` Brian Foster
2020-09-09 8:19 ` [PATCH 2/3] xfs: add a free space extent change reservation Dave Chinner
2020-09-09 13:35 ` Brian Foster
2020-09-09 22:51 ` Dave Chinner
2020-09-10 13:19 ` Brian Foster
2020-09-09 8:19 ` [PATCH 3/3] xfs: factor free space tree transaciton reservations Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200910212927.GT12131@dread.disaster.area \
--to=david@fromorbit.com \
--cc=bfoster@redhat.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox