public inbox for linux-xfs@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] xfs: fix race condition in inodegc list and cpumask handling
@ 2024-11-25  1:52 Long Li
  2024-11-25 21:03 ` Dave Chinner
  0 siblings, 1 reply; 5+ messages in thread
From: Long Li @ 2024-11-25  1:52 UTC (permalink / raw)
  To: djwong, cem
  Cc: linux-xfs, david, yi.zhang, houtao1, leo.lilong, yangerkun,
	lonuxli.64

There is a race condition between inodegc queue and inodegc worker where
the cpumask bit may not be set when concurrent operations occur.

Current problematic sequence:

  CPU0                             CPU1
  --------------------             ---------------------
  xfs_inodegc_queue()              xfs_inodegc_worker()
                                     llist_del_all(&gc->list)
    llist_add(&ip->i_gclist, &gc->list)
    cpumask_test_and_set_cpu()
                                     cpumask_clear_cpu()
                  < cpumask not set >

Fix this by moving llist_del_all() after cpumask_clear_cpu() to ensure
proper ordering. This change ensures that when the worker thread clears
the cpumask, any concurrent queue operations will either properly set
the cpumask bit or have already emptied the list.

Also remove unnecessary smp_mb__{before/after}_atomic() barriers since
the llist_* operations already provide required ordering semantics. it
make the code cleaner.

Fixes: 62334fab4762 ("xfs: use per-mount cpumask to track nonempty percpu inodegc lists")
Reported-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Long Li <leo.lilong@huawei.com>
---
 fs/xfs/xfs_icache.c | 25 ++++++++++++-------------
 1 file changed, 12 insertions(+), 13 deletions(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 7b6c026d01a1..fba784d7a146 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -1948,19 +1948,20 @@ xfs_inodegc_worker(
 {
 	struct xfs_inodegc	*gc = container_of(to_delayed_work(work),
 						struct xfs_inodegc, work);
-	struct llist_node	*node = llist_del_all(&gc->list);
+	struct llist_node	*node;
 	struct xfs_inode	*ip, *n;
 	struct xfs_mount	*mp = gc->mp;
 	unsigned int		nofs_flag;
 
+	cpumask_clear_cpu(gc->cpu, &mp->m_inodegc_cpumask);
+
 	/*
-	 * Clear the cpu mask bit and ensure that we have seen the latest
-	 * update of the gc structure associated with this CPU. This matches
-	 * with the release semantics used when setting the cpumask bit in
-	 * xfs_inodegc_queue.
+	 * llist_del_all provides ordering semantics that ensure the CPU mask
+	 * clearing is always visible before removing all entries from the gc
+	 * list. This prevents list being added while the CPU mask bit is
+	 * unset in xfs_inodegc_queue()
 	 */
-	cpumask_clear_cpu(gc->cpu, &mp->m_inodegc_cpumask);
-	smp_mb__after_atomic();
+	node = llist_del_all(&gc->list);
 
 	WRITE_ONCE(gc->items, 0);
 
@@ -2188,13 +2189,11 @@ xfs_inodegc_queue(
 	shrinker_hits = READ_ONCE(gc->shrinker_hits);
 
 	/*
-	 * Ensure the list add is always seen by anyone who finds the cpumask
-	 * bit set. This effectively gives the cpumask bit set operation
-	 * release ordering semantics.
+	 * llist_add provides necessary ordering semantics to ensure list
+	 * additions are visible when cpumask bit is found set, so no
+	 * additional memory barrier is needed.
 	 */
-	smp_mb__before_atomic();
-	if (!cpumask_test_cpu(cpu_nr, &mp->m_inodegc_cpumask))
-		cpumask_test_and_set_cpu(cpu_nr, &mp->m_inodegc_cpumask);
+	cpumask_test_and_set_cpu(cpu_nr, &mp->m_inodegc_cpumask);
 
 	/*
 	 * We queue the work while holding the current CPU so that the work
-- 
2.39.2


^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-12-25 12:44 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-25  1:52 [PATCH] xfs: fix race condition in inodegc list and cpumask handling Long Li
2024-11-25 21:03 ` Dave Chinner
2024-12-05  6:59   ` Long Li
2024-12-10  6:20     ` Dave Chinner
2024-12-25 12:41       ` Long Li

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox