From: Waiman Long <longman@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Muchun Song <songmuchun@bytedance.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Waiman Long <longman@redhat.com>
Subject: [PATCH v2] mm/list_lru: Fix possible race in memcg_reparent_list_lru_node()
Date: Wed, 30 Mar 2022 13:26:46 -0400 [thread overview]
Message-ID: <20220330172646.2687555-1-longman@redhat.com> (raw)
Muchun Song found out there could be a race between list_lru_add()
and memcg_reparent_list_lru_node() causing the later function to miss
reparenting of a lru entry as shown below:
CPU0: CPU1:
list_lru_add()
spin_lock(&nlru->lock)
l = list_lru_from_kmem(memcg)
memcg_reparent_objcgs(memcg)
memcg_reparent_list_lrus(memcg)
memcg_reparent_list_lru()
memcg_reparent_list_lru_node()
if (!READ_ONCE(nlru->nr_items))
// Miss reparenting
return
// Assume 0->1
l->nr_items++
// Assume 0->1
nlru->nr_items++
Though it is not likely that a list_lru_node that has 0 item suddenly
has a newly added lru entry at the end of its life. The race is still
theoretically possible.
With the lock/unlock pair used within the percpu_ref_kill() which is
the last function call of memcg_reparent_objcgs(), any read issued
in memcg_reparent_list_lru_node() will not be reordered before the
reparenting of objcgs.
Adding a !spin_is_locked()/smp_rmb()/!READ_ONCE(nlru->nr_items) check
to ensure that either the reading of nr_items is valid or the racing
list_lru_add() will see the reparented objcg.
Fixes: 405cc51fc104 ("mm/list_lru: optimize memcg_reparent_list_lru_node()")
Reported-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
mm/list_lru.c | 31 +++++++++++++++++++++++++++----
1 file changed, 27 insertions(+), 4 deletions(-)
diff --git a/mm/list_lru.c b/mm/list_lru.c
index c669d87001a6..08ff54ffabd6 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -395,10 +395,33 @@ static void memcg_reparent_list_lru_node(struct list_lru *lru, int nid,
struct list_lru_one *src, *dst;
/*
- * If there is no lru entry in this nlru, we can skip it immediately.
+ * With the lock/unlock pair used within the percpu_ref_kill()
+ * which is the last function call of memcg_reparent_objcgs(), any
+ * read issued here will not be reordered before the reparenting
+ * of objcgs.
+ *
+ * Assuming a racing list_lru_add():
+ * list_lru_add()
+ * <- memcg_reparent_list_lru_node()
+ * spin_lock(&nlru->lock)
+ * l = list_lru_from_kmem(memcg)
+ * nlru->nr_items++
+ * spin_unlock(&nlru->lock)
+ * <- memcg_reparent_list_lru_node()
+ *
+ * The !spin_is_locked(&nlru->lock) check is true means it is
+ * either before the spin_lock() or after the spin_unlock(). In the
+ * former case, list_lru_add() will see the reparented objcg and so
+ * won't touch the lru to be reparented. In the later case, it will
+ * see the updated nr_items. So we can use the optimization that if
+ * there is no lru entry in this nlru, skip it immediately.
*/
- if (!READ_ONCE(nlru->nr_items))
- return;
+ if (!spin_is_locked(&nlru->lock)) {
+ /* nr_items read must be ordered after nlru->lock */
+ smp_rmb();
+ if (!READ_ONCE(nlru->nr_items))
+ return;
+ }
/*
* Since list_lru_{add,del} may be called under an IRQ-safe lock,
@@ -407,7 +430,7 @@ static void memcg_reparent_list_lru_node(struct list_lru *lru, int nid,
spin_lock_irq(&nlru->lock);
src = list_lru_from_memcg_idx(lru, nid, src_idx);
- if (!src)
+ if (!src || !src->nr_items)
goto out;
dst = list_lru_from_memcg_idx(lru, nid, dst_idx);
--
2.27.0
next reply other threads:[~2022-03-30 17:27 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-30 17:26 Waiman Long [this message]
2022-03-30 19:46 ` [PATCH v2] mm/list_lru: Fix possible race in memcg_reparent_list_lru_node() Roman Gushchin
2022-03-31 2:14 ` Andrew Morton
2022-03-31 2:48 ` Roman Gushchin
2022-03-31 6:39 ` Shakeel Butt
2022-03-31 7:46 ` Michal Hocko
2022-04-01 1:11 ` Andrew Morton
2022-04-01 4:23 ` Shakeel Butt
2022-04-01 4:30 ` Muchun Song
2022-04-01 11:28 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220330172646.2687555-1-longman@redhat.com \
--to=longman@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=roman.gushchin@linux.dev \
--cc=songmuchun@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.