From: Roman Gushchin <guro@fb.com>
To: linux-mm@kvack.org
Cc: Roman Gushchin <guro@fb.com>, Michal Hocko <mhocko@kernel.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	David Rientjes <rientjes@google.com>, Tejun Heo <tj@kernel.org>,
	kernel-team@fb.com, cgroups@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [v4 1/4] mm, oom: refactor the TIF_MEMDIE usage
Date: Wed, 26 Jul 2017 14:27:15 +0100
Message-ID: <20170726132718.14806-2-guro@fb.com>
In-Reply-To: <20170726132718.14806-1-guro@fb.com>

First, separate the tsk_is_oom_victim() and TIF_MEMDIE flag checks:
let the former indicate that a task has been killed by the OOM killer,
and the latter indicate that a task has access to the memory
reserves (with a hope to eliminate it later).

Second, set TIF_MEMDIE on all threads of an OOM victim process.

Third, to limit the number of processes which have access to the memory
reserves, keep an atomic pointer to the task which grabbed them.

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: David Rientjes <rientjes@google.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: kernel-team@fb.com
Cc: cgroups@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
---
 kernel/exit.c   |  2 +-
 mm/memcontrol.c |  2 +-
 mm/oom_kill.c   | 30 +++++++++++++++++++++++++-----
 3 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 8f40bee5ba9d..d5f372a2a363 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -542,7 +542,7 @@ static void exit_mm(void)
 	task_unlock(current);
 	mm_update_next_owner(mm);
 	mmput(mm);
-	if (test_thread_flag(TIF_MEMDIE))
+	if (tsk_is_oom_victim(current))
 		exit_oom_victim();
 }
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d61133e6af99..9085e55eb69f 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1896,7 +1896,7 @@ static int try_charge(struct mem_cgroup *memcg, gfp_t gfp_mask,
 	 * bypass the last charges so that they can exit quickly and
 	 * free their memory.
 	 */
-	if (unlikely(test_thread_flag(TIF_MEMDIE) ||
+	if (unlikely(tsk_is_oom_victim(current) ||
 		     fatal_signal_pending(current) ||
 		     current->flags & PF_EXITING))
 		goto force;
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 9e8b4f030c1c..72de01be4d33 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -435,6 +435,8 @@ static DECLARE_WAIT_QUEUE_HEAD(oom_victims_wait);
 
 static bool oom_killer_disabled __read_mostly;
 
+static struct task_struct *tif_memdie_owner;
+
 #define K(x) ((x) << (PAGE_SHIFT-10))
 
 /*
@@ -656,13 +658,24 @@ static void mark_oom_victim(struct task_struct *tsk)
 	struct mm_struct *mm = tsk->mm;
 
 	WARN_ON(oom_killer_disabled);
-	/* OOM killer might race with memcg OOM */
-	if (test_and_set_tsk_thread_flag(tsk, TIF_MEMDIE))
+
+	if (!cmpxchg(&tif_memdie_owner, NULL, current)) {
+		struct task_struct *t;
+
+		rcu_read_lock();
+		for_each_thread(current, t)
+			set_tsk_thread_flag(t, TIF_MEMDIE);
+		rcu_read_unlock();
+	}
+
+	/*
+	 * OOM killer might race with memcg OOM.
+	 * oom_mm is bound to the signal struct life time.
+	 */
+	if (cmpxchg(&tsk->signal->oom_mm, NULL, mm))
 		return;
 
-	/* oom_mm is bound to the signal struct life time. */
-	if (!cmpxchg(&tsk->signal->oom_mm, NULL, mm))
-		mmgrab(tsk->signal->oom_mm);
+	mmgrab(tsk->signal->oom_mm);
 
 	/*
 	 * Make sure that the task is woken up from uninterruptible sleep
@@ -682,6 +695,13 @@ void exit_oom_victim(void)
 {
 	clear_thread_flag(TIF_MEMDIE);
 
+	/*
+	 * If the current task is the thread which initially
+	 * received TIF_MEMDIE, clear tif_memdie_owner to
+	 * give the next process a chance to capture it.
+	 */
+	cmpxchg(&tif_memdie_owner, current, NULL);
+
 	if (!atomic_dec_return(&oom_victims))
 		wake_up_all(&oom_victims_wait);
 }
-- 
2.13.3