Re: [PATCH 3/4] xfs: create perag structures as soon as possible during log recovery

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Christoph Hellwig <hch@lst.de>
To: Dave Chinner <david@fromorbit.com>
Cc: Christoph Hellwig <hch@lst.de>,
	Chandan Babu R <chandan.babu@oracle.com>,
	"Darrick J. Wong" <djwong@kernel.org>,
	linux-xfs@vger.kernel.org
Subject: Re: [PATCH 3/4] xfs: create perag structures as soon as possible during log recovery
Date: Wed, 18 Sep 2024 08:11:05 +0200	[thread overview]
Message-ID: <20240918061105.GA31947@lst.de> (raw)
In-Reply-To: <ZueJusTG7CJ4jcp5@dread.disaster.area>

On Mon, Sep 16, 2024 at 11:28:26AM +1000, Dave Chinner wrote:
> I'm missing something - the intents aren't processed until the log
> has been recovered - queuing an intent to be processed does
> not require the per-ag to be present. We don't take per-ag
> references until we are recovering the intent. i.e. we've completed
> journal recovery and haven't found the corresponding EFD.
> 
> That leaves the EFI in the log->r_dfops, and we then run
> ->recover_work in the second phase of recovery. It is
> xfs_extent_free_recover_work() that creates the
> new transaction and runs the EFI processing that requires the
> perag references, isn't it?
> 
> IOWs, I don't see where the initial EFI/EFD recovery during the
> checkpoint processing requires the newly created perags to be
> present in memory for processing incomplete EFIs before the journal
> recovery phase has completed.

So my new test actually blows up before creating intents:

[   81.695529] XFS (nvme1n1): Mounting V5 Filesystem 07057234-4bec-4f17-97c5-420c71c83292
[   81.704541] XFS (nvme1n1): Starting recovery (logdev: internal)
[   81.707260] XFS (nvme1n1): xfs_buf_map_verify: daddr 0x40003 out of range, EOFS 0x40000
[   81.707974] ------------[ cut here ]------------
[   81.708376] WARNING: CPU: 1 PID: 5004 at fs/xfs/xfs_buf.c:553 xfs_buf_get_map+0x8b4/0xb70

Because sb_dblocks hasn't been updated yet.  I'd kinda assume we run
into the intents next, but maybe we don't.  I can try how far just
fixing the sb would get us, but that will potentially gets us into
more problems late the more we actually use the pag structure.

> If we are going to keep this logic, can you do this as a separate
> helper function? i.e.:

I actually did that earlier, and it turned out to create a bit more
boilerplate than I liked, but I can revert to it if there is a strong
preference.

> > +	xfs_sb_from_disk(&mp->m_sb, dsb);
> > +	if (mp->m_sb.sb_agcount < old_agcount) {
> > +		xfs_alert(mp, "Shrinking AG count in log recovery");
> > +		return -EFSCORRUPTED;
> > +	}
> > +	mp->m_features |= xfs_sb_version_to_features(&mp->m_sb);
> 
> I'm not sure this is safe. The item order in the checkpoint recovery
> isn't guaranteed to be exactly the same as when feature bits are
> modified at runtime. Hence there could be items in the checkpoint
> that haven't yet been recovered that are dependent on the original
> sb feature mask being present.  It may be OK to do this at the end
> of the checkpoint being recovered.
> 
> I'm also not sure why this feature update code is being changed
> because it's not mentioned at all in the commit message.

Mostly to keep the features in sync with the in-memory sb fields
updated above.  I'll switch to keep this as-is, but I fail to see how
updating features only after the entire reocvery is done will be safe
for all cases either.

Where would we depend on the old feature setting?

> 
> > +	error = xfs_initialize_perag(mp, old_agcount, mp->m_sb.sb_agcount,
> > +			mp->m_sb.sb_dblocks, &mp->m_maxagi);
> 
> Why do this if sb_agcount has not changed?  AFAICT it only iterates
> the AGs already initialised and so skips them, then recalculates
> inode32 and prealloc block parameters, which won't change. Hence
> it's a total no-op for anything other than an actual ag count change
> and should be skipped, right?

Yes, and the way how xfs_initialize_perag it is an entire no-op.
But I can add an extra explicit check to make that more clear.

next prev parent reply	other threads:[~2024-09-18  6:11 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-09-10  4:28 fix recovery of extfree items just after a growfs Christoph Hellwig
2024-09-10  4:28 ` [PATCH 1/4] xfs: pass the exact range to initialize to xfs_initialize_perag Christoph Hellwig
2024-09-17 18:50   ` Darrick J. Wong
2024-09-18  6:15     ` Christoph Hellwig
2024-09-10  4:28 ` [PATCH 2/4] xfs: merge the perag freeing helpers Christoph Hellwig
2024-09-17 18:55   ` Darrick J. Wong
2024-09-18  6:15     ` Christoph Hellwig
2024-09-10  4:28 ` [PATCH 3/4] xfs: create perag structures as soon as possible during log recovery Christoph Hellwig
2024-09-16  1:28   ` Dave Chinner
2024-09-18  6:11     ` Christoph Hellwig [this message]
2024-09-19  1:09       ` Dave Chinner
2024-09-19  7:46         ` Christoph Hellwig
2024-09-19 21:50           ` Dave Chinner
2024-09-10  4:28 ` [PATCH 4/4] xfs: don't use __GFP_RETRY_MAYFAIL Christoph Hellwig
2024-09-17 19:00   ` Darrick J. Wong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240918061105.GA31947@lst.de \
    --to=hch@lst.de \
    --cc=chandan.babu@oracle.com \
    --cc=david@fromorbit.com \
    --cc=djwong@kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.