From: Wei Yang <richardw.yang@linux.intel.com>
To: hannes@cmpxchg.org, mhocko@kernel.org, vdavydov.dev@gmail.com,
	akpm@linux-foundation.org, ktkhai@virtuozzo.com,
	kirill.shutemov@linux.intel.com, yang.shi@linux.alibaba.com
Cc: cgroups@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, alexander.duyck@gmail.com,
	rientjes@google.com, Wei Yang <richardw.yang@linux.intel.com>,
	stable@vger.kernel.org
Subject: [Patch v3] mm: thp: grab the lock before manipulation defer list
Date: Thu, 16 Jan 2020 09:31:00 +0800
Message-ID: <20200116013100.7679-1-richardw.yang@linux.intel.com>

As in all the other places, grab the lock before manipulating the
deferred list. The current implementation checks list_empty() before
taking the lock, so it may hit a race condition.

For example, the potential race would be:

    CPU1                      CPU2
    mem_cgroup_move_account   deferred_split_huge_page
      list_empty
                                lock
                                list_empty
                                list_add_tail
                                unlock
      lock
      # list_empty might not hold anymore
      list_add_tail
      unlock

When this sequence happens, the list_add_tail() in
mem_cgroup_move_account() corrupts the list, since the page has already
been added to some split_queue in deferred_split_huge_page().
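
For illustration only, the interleaving above can be replayed on a single
thread in userspace. This is a minimal sketch, not kernel code: struct
list_head and the list helpers are simplified stand-ins for the kernel's,
and other_queue, to_queue, list_intact() and replay() are hypothetical names
introduced here. The lock/unlock steps appear only as comments because the
ordering is fixed by hand.

```c
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head (illustrative only). */
struct list_head {
	struct list_head *next, *prev;
};

static void INIT_LIST_HEAD(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *entry, struct list_head *head)
{
	struct list_head *prev = head->prev;

	head->prev = entry;
	entry->next = head;
	entry->prev = prev;
	prev->next = entry;
}

/*
 * A ring is intact if every node's successor points back at it and a
 * bounded forward walk returns to the head.
 */
static bool list_intact(const struct list_head *head)
{
	const struct list_head *n = head;

	for (int i = 0; i < 16; i++) {
		if (n->next->prev != n)
			return false;
		n = n->next;
		if (n == head)
			return true;
	}
	return false;
}

/*
 * Replay the interleaving from the changelog. With recheck_under_lock
 * false, CPU1 acts on its stale unlocked check (old code); with true,
 * CPU1 repeats the check after taking the lock (patched code).
 * Returns true if both queues are still intact afterwards.
 */
static bool replay(bool recheck_under_lock)
{
	struct list_head other_queue, to_queue, page;
	bool was_empty;

	INIT_LIST_HEAD(&other_queue);
	INIT_LIST_HEAD(&to_queue);
	INIT_LIST_HEAD(&page);

	was_empty = list_empty(&page);		/* CPU1: unlocked check */

	list_add_tail(&page, &other_queue);	/* CPU2: lock, check, add, unlock */

	/* CPU1: lock */
	if (recheck_under_lock)
		was_empty = list_empty(&page);
	if (was_empty)
		list_add_tail(&page, &to_queue);
	/* CPU1: unlock */

	return list_intact(&other_queue) && list_intact(&to_queue);
}
```

Calling replay(false) leaves other_queue pointing at a node whose prev and
next now belong to to_queue, while replay(true) skips the second add and
keeps both rings consistent.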

Besides this, David Rientjes points out that split_queue_len would be
left in a wrong state, which would be a significant issue for shrinkers.
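
As a rough userspace sketch of the pattern the patch applies, the helpers
below make the "is it queued?" decision and the length update inside the
same critical section, so the counter always matches the number of queued
nodes. Everything here is an assumption for illustration: struct split_queue,
queue_add() and queue_del() are hypothetical, a C11 atomic_flag spinlock
stands in for the kernel spinlock, and only a single queue is modelled,
whereas the kernel keeps one deferred split queue per memcg or node.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct list_head {
	struct list_head *next, *prev;
};

/* Hypothetical model of a deferred split queue: lock, list and length. */
struct split_queue {
	atomic_flag lock;
	struct list_head list;
	unsigned long len;
};

static void node_init(struct list_head *n)
{
	n->next = n;
	n->prev = n;
}

static bool node_unqueued(const struct list_head *n)
{
	return n->next == n;
}

static void queue_init(struct split_queue *q)
{
	atomic_flag_clear(&q->lock);
	node_init(&q->list);
	q->len = 0;
}

static void queue_lock(struct split_queue *q)
{
	while (atomic_flag_test_and_set_explicit(&q->lock, memory_order_acquire))
		;	/* spin until the flag is released */
}

static void queue_unlock(struct split_queue *q)
{
	atomic_flag_clear_explicit(&q->lock, memory_order_release);
}

/* Queue the node and bump len only if it is not queued; checked under the lock. */
static bool queue_add(struct split_queue *q, struct list_head *n)
{
	bool added = false;

	queue_lock(q);
	if (node_unqueued(n)) {
		n->prev = q->list.prev;
		n->next = &q->list;
		q->list.prev->next = n;
		q->list.prev = n;
		q->len++;
		added = true;
	}
	queue_unlock(q);
	return added;
}

/* Unqueue the node and drop len only if it is queued; checked under the lock. */
static bool queue_del(struct split_queue *q, struct list_head *n)
{
	bool removed = false;

	queue_lock(q);
	if (!node_unqueued(n)) {
		n->prev->next = n->next;
		n->next->prev = n->prev;
		node_init(n);
		q->len--;
		removed = true;
	}
	queue_unlock(q);
	return removed;
}
```

Because a repeated queue_add() or queue_del() on the same node is a no-op,
len cannot be incremented or decremented twice for one page, which is the
property a shrinker relying on the count would need.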

Fixes: 87eaceb3faa5 ("mm: thp: make deferred split shrinker memcg aware")

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Cc: <stable@vger.kernel.org>    [5.4+]

---
v3:
  * remove all review/ack tags since the changelog was rewritten
  * use deferred_split_huge_page as the example of the race
  * add cc stable 5.4+ tag, as suggested by David Rientjes

v2:
  * move the check on compound outside, as suggested by Alexander
  * add an example of the race condition, as suggested by Michal
---
 mm/memcontrol.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c5b5f74cfd4d..6450bbe394e2 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5360,10 +5360,12 @@ static int mem_cgroup_move_account(struct page *page,
 	}
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (compound && !list_empty(page_deferred_list(page))) {
+	if (compound) {
 		spin_lock(&from->deferred_split_queue.split_queue_lock);
-		list_del_init(page_deferred_list(page));
-		from->deferred_split_queue.split_queue_len--;
+		if (!list_empty(page_deferred_list(page))) {
+			list_del_init(page_deferred_list(page));
+			from->deferred_split_queue.split_queue_len--;
+		}
 		spin_unlock(&from->deferred_split_queue.split_queue_lock);
 	}
 #endif
@@ -5377,11 +5379,13 @@ static int mem_cgroup_move_account(struct page *page,
 	page->mem_cgroup = to;
 
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-	if (compound && list_empty(page_deferred_list(page))) {
+	if (compound) {
 		spin_lock(&to->deferred_split_queue.split_queue_lock);
-		list_add_tail(page_deferred_list(page),
-			      &to->deferred_split_queue.split_queue);
-		to->deferred_split_queue.split_queue_len++;
+		if (list_empty(page_deferred_list(page))) {
+			list_add_tail(page_deferred_list(page),
+				      &to->deferred_split_queue.split_queue);
+			to->deferred_split_queue.split_queue_len++;
+		}
 		spin_unlock(&to->deferred_split_queue.split_queue_lock);
 	}
 #endif
-- 
2.17.1

