From: Long Li <leo.lilong@huawei.com>
To: Dave Chinner <david@fromorbit.com>
Cc: <djwong@kernel.org>, <chandanbabu@kernel.org>,
<linux-xfs@vger.kernel.org>, <yi.zhang@huawei.com>,
<houtao1@huawei.com>, <yangerkun@huawei.com>
Subject: Re: [PATCH v2 2/3] xfs: don't assert perag when free perag
Date: Tue, 12 Dec 2023 21:28:54 +0800
Message-ID: <20231212132854.GA2694327@ceph-admin>
In-Reply-To: <ZXeGkisobA2nXX5D@dread.disaster.area>
On Tue, Dec 12, 2023 at 09:00:50AM +1100, Dave Chinner wrote:
> On Sat, Dec 09, 2023 at 08:21:06PM +0800, Long Li wrote:
> > When releasing the perag in xfs_free_perag(), the assertion that the
> > perag is in the radix tree holds in most cases. However, there is one
> > corner case where the assertion is not true. During log recovery, the
> > AGs become visible (that is, included in mp->m_sb.sb_agcount) first, and
> > then the perag is initialized. If the initialization of the perag fails,
> > the assertion will be triggered. Worse yet, a NULL pointer dereference
> > can occur.
>
> I'm going to assume that you are talking about xlog_do_recover()
> because the commit message doesn't actually tell us how this
> situation occurs.
>
> That code re-reads the superblock, then copies it to mp->m_sb,
> then calls xfs_initialize_perag() with the values from mp->m_sb.
>
> If log recovery replayed a growfs transaction, the mp->m_sb has a
> larger sb_agcount and so then xfs_initialize_perag() is called
> and if that fails we end up back in xfs_mountfs and the error
> stack calls xfs_free_perag().
>
> Is that correct?
Yes, you are right. When I tried to fix the perag leak issue in patch 3,
I found this problem.
>
> If so, then the fix is to change how xlog_do_recover() works. It
> needs to initialise the new perags before it updates the in-memory
> superblock. If xfs_initialize_perag() fails, it undoes all the
> changes it has made, so if we haven't updated the in-memory
> superblock when the init of the new perags fails then the error
> unwinding code works exactly as it should right now.
>
> i.e. the bug is that xlog_do_recover() is leaving the in-memory
> state inconsistent on init failure, and we need to fix that rather
> than remove the assert that is telling us that in-memory state is
> inconsistent....
>
Yes, I agree. I used to think that removing the assertion would solve
the problem, but that now looks like a lazy fix; the problem should be
solved at its source. I haven't yet figured out how to fix it
comprehensively, so I'll fix the perag leak issue first.
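
For reference, the ordering suggested above (initialise the new perags
first, publish the larger AG count only on success) can be modelled in
userspace. This is only a rough sketch to illustrate the idea, not the
kernel fix: the toy_* types and functions are hypothetical stand-ins for
struct xfs_mount, the perag radix tree, xfs_initialize_perag() and
xfs_free_perag(), and fail_at is an invented fault-injection knob.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MAX_AGS 16

/* Hypothetical stand-ins, not the kernel structures. */
struct toy_perag {
	unsigned int agno;
};

struct toy_mount {
	unsigned int agcount;			/* models mp->m_sb.sb_agcount */
	struct toy_perag *perag[TOY_MAX_AGS];	/* models the perag radix tree */
};

/*
 * Create perags for AGs [first, end). On failure, free everything this
 * call created so the caller never sees partial state. fail_at is the
 * AG number whose allocation is forced to fail, or -1 for none.
 */
static int toy_init_perags(struct toy_mount *mp, unsigned int first,
			   unsigned int end, int fail_at)
{
	unsigned int agno;

	if (end > TOY_MAX_AGS)
		return -1;

	for (agno = first; agno < end; agno++) {
		struct toy_perag *pag = NULL;

		if ((int)agno != fail_at)
			pag = malloc(sizeof(*pag));
		if (!pag)
			goto unwind;
		pag->agno = agno;
		mp->perag[agno] = pag;
	}
	return 0;

unwind:
	/* Undo only the entries added by this call. */
	while (agno-- > first) {
		free(mp->perag[agno]);
		mp->perag[agno] = NULL;
	}
	return -1;
}

/* Teardown: every AG below agcount is expected to have a perag. */
static void toy_free_perags(struct toy_mount *mp)
{
	unsigned int agno;

	for (agno = 0; agno < mp->agcount; agno++) {
		assert(mp->perag[agno] != NULL);
		free(mp->perag[agno]);
		mp->perag[agno] = NULL;
	}
	mp->agcount = 0;
}

/* Recovery found a larger AG count: build perags first, commit last. */
static int toy_recover_grow(struct toy_mount *mp, unsigned int new_agcount,
			    int fail_at)
{
	int error;

	error = toy_init_perags(mp, mp->agcount, new_agcount, fail_at);
	if (error)
		return error;	/* agcount untouched, teardown stays valid */

	mp->agcount = new_agcount;
	return 0;
}
```

With this ordering, a failed grow leaves agcount at its old value, so the
assertion in the teardown path still holds. If agcount were updated
before toy_init_perags() ran, the same failure would leave AGs without a
perag below agcount, which is the inconsistency the assert reports.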
Thanks,
Long Li