From: Dave Chinner <david@fromorbit.com>
To: Long Li <leo.lilong@huawei.com>
Cc: djwong@kernel.org, cem@kernel.org, linux-xfs@vger.kernel.org,
yi.zhang@huawei.com, houtao1@huawei.com, yangerkun@huawei.com,
lonuxli.64@gmail.com
Subject: Re: [PATCH] xfs: fix race condition in inodegc list and cpumask handling
Date: Tue, 26 Nov 2024 08:03:43 +1100 [thread overview]
Message-ID: <Z0TmLzSmLr78T8Im@dread.disaster.area> (raw)
In-Reply-To: <20241125015258.2652325-1-leo.lilong@huawei.com>
On Mon, Nov 25, 2024 at 09:52:58AM +0800, Long Li wrote:
> There is a race condition between inodegc queue and inodegc worker where
> the cpumask bit may not be set when concurrent operations occur.
What problems does this cause? i.e. how do we identify systems
hitting this issue?
>
> Current problematic sequence:
>
> CPU0 CPU1
> -------------------- ---------------------
> xfs_inodegc_queue() xfs_inodegc_worker()
> llist_del_all(&gc->list)
> llist_add(&ip->i_gclist, &gc->list)
> cpumask_test_and_set_cpu()
> cpumask_clear_cpu()
> < cpumask not set >
>
> Fix this by moving llist_del_all() after cpumask_clear_cpu() to ensure
> proper ordering. This change ensures that when the worker thread clears
> the cpumask, any concurrent queue operations will either properly set
> the cpumask bit or have already emptied the list.
>
> Also remove unnecessary smp_mb__{before/after}_atomic() barriers since
> the llist_* operations already provide required ordering semantics. it
> make the code cleaner.
IIRC, the barriers were for ordering the cpumask bitmap ops against
llist operations. There are calls elsewhere to for_each_cpu() that
then use llist_empty() checks (e.g xfs_inodegc_queue_all/wait_all),
so on relaxed architectures (like alpha) I think we have to ensure
the bitmask ops carried full ordering against the independent llist
ops themselves. i.e. llist_empty() just uses READ_ONCE, so it only
orders against other llist ops and won't guarantee any specific
ordering against against cpumask modifications.
I could be remembering incorrectly, but I think that was the
original reason for the barriers. Can you please confirm that the
cpumask iteration/llist_empty checks do not need these bitmask
barriers anymore? If that's ok, then the change looks fine.
-Dave.
--
Dave Chinner
david@fromorbit.com
next prev parent reply other threads:[~2024-11-25 21:03 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-25 1:52 [PATCH] xfs: fix race condition in inodegc list and cpumask handling Long Li
2024-11-25 21:03 ` Dave Chinner [this message]
2024-12-05 6:59 ` Long Li
2024-12-10 6:20 ` Dave Chinner
2024-12-25 12:41 ` Long Li
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Z0TmLzSmLr78T8Im@dread.disaster.area \
--to=david@fromorbit.com \
--cc=cem@kernel.org \
--cc=djwong@kernel.org \
--cc=houtao1@huawei.com \
--cc=leo.lilong@huawei.com \
--cc=linux-xfs@vger.kernel.org \
--cc=lonuxli.64@gmail.com \
--cc=yangerkun@huawei.com \
--cc=yi.zhang@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox