linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@kernel.org>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: David Rientjes <rientjes@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux MM <linux-mm@kvack.org>
Subject: Re: [PATCH] mm, oom: make the calculation of oom badness more accurate
Date: Thu, 9 Jul 2020 08:41:35 +0200	[thread overview]
Message-ID: <20200709064135.GA19160@dhcp22.suse.cz> (raw)
In-Reply-To: <20200709062644.GA12704@dhcp22.suse.cz>

On Thu 09-07-20 08:26:46, Michal Hocko wrote:
> On Thu 09-07-20 10:14:14, Yafang Shao wrote:
[...]
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 774784587..0da8efa41 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -528,7 +528,7 @@ static int proc_oom_score(struct seq_file *m,
> > struct pid_namespace *ns,
> >         unsigned long totalpages = totalram_pages + total_swap_pages;
> >         unsigned long points = 0;
> > 
> > -       points = oom_badness(task, NULL, NULL, totalpages) *
> > +       points = 1000 + oom_badness(task, NULL, NULL, totalpages) *
> >                                       1000 / totalpages;
> >         seq_printf(m, "%lu\n", points);
> > 
> > And then update the documentation as "from 0 (never kill) to 3000
> > (always kill)"
> 
> This is still not quite there yet, I am afraid. OOM_SCORE_ADJ_MIN tasks have
> always reported 0 and I can imagine somebody might depend on this fact.
> So you need to special case LONG_MIN at least. It would be also better
> to stick with [0, 2000] range.

usage * 1000 / total -> [0, 1000]

adj -> [-1000, 1000]

max (0, usage * 1000 / total + adj) -> [0, 2000]

So effectively something like this.

diff --git a/fs/proc/base.c b/fs/proc/base.c
index d86c0afc8a85..fdf5a0be1cc3 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -550,9 +550,9 @@ static int proc_oom_score(struct seq_file *m, struct pid_namespace *ns,
 			  struct pid *pid, struct task_struct *task)
 {
 	unsigned long totalpages = totalram_pages() + total_swap_pages;
-	unsigned long points = 0;
+	long points = 0;
 
-	points = oom_badness(task, totalpages) * 1000 / totalpages;
+	points = max(0, oom_badness(task, totalpages));
 	seq_printf(m, "%lu\n", points);
 
 	return 0;
diff --git a/mm/gup.c b/mm/gup.c
index de9e36262ccb..75980dd5a2fc 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1794,7 +1794,6 @@ static long __gup_longterm_locked(struct task_struct *tsk,
 				     vmas_tmp, NULL, gup_flags);
 
 	if (gup_flags & FOLL_LONGTERM) {
-		memalloc_nocma_restore(flags);
 		if (rc < 0)
 			goto out;
 
@@ -1802,11 +1801,13 @@ static long __gup_longterm_locked(struct task_struct *tsk,
 			for (i = 0; i < rc; i++)
 				put_page(pages[i]);
 			rc = -EOPNOTSUPP;
+			memalloc_nocma_restore(flags);
 			goto out;
 		}
 
 		rc = check_and_migrate_cma_pages(tsk, mm, start, rc, pages,
 						 vmas_tmp, gup_flags);
+		memalloc_nocma_restore(flags);
 	}
 
 out:
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 6e94962893ee..30dbc302a677 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -196,13 +196,13 @@ static bool is_dump_unreclaim_slabs(void)
  * predictable as possible.  The goal is to return the highest value for the
  * task consuming the most memory to avoid subsequent oom failures.
  */
-unsigned long oom_badness(struct task_struct *p, unsigned long totalpages)
+long oom_badness(struct task_struct *p, unsigned long totalpages)
 {
 	long points;
 	long adj;
 
 	if (oom_unkillable_task(p))
-		return 0;
+		return -1;
 
 	p = find_lock_task_mm(p);
 	if (!p)
@@ -229,15 +229,10 @@ unsigned long oom_badness(struct task_struct *p, unsigned long totalpages)
 		mm_pgtables_bytes(p->mm) / PAGE_SIZE;
 	task_unlock(p);
 
-	/* Normalize to oom_score_adj units */
-	adj *= totalpages / 1000;
+	points *= 1000 / totalpages;
 	points += adj;
 
-	/*
-	 * Never return 0 for an eligible task regardless of the root bonus and
-	 * oom_score_adj (oom_score_adj can't be OOM_SCORE_ADJ_MIN here).
-	 */
-	return points > 0 ? points : 1;
+	return points;
 }
 
 static const char * const oom_constraint_text[] = {
@@ -341,7 +336,7 @@ static int oom_evaluate_task(struct task_struct *task, void *arg)
 	}
 
 	points = oom_badness(task, oc->totalpages);
-	if (!points || points < oc->chosen_points)
+	if (points > 0 || points < oc->chosen_points)
 		goto next;
 
 select:
-- 
Michal Hocko
SUSE Labs


  reply	other threads:[~2020-07-09  6:41 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-08 13:24 [PATCH] mm, oom: make the calculation of oom badness more accurate Yafang Shao
2020-07-08 14:28 ` Michal Hocko
2020-07-08 14:32   ` Michal Hocko
2020-07-08 17:57     ` David Rientjes
2020-07-08 19:02       ` Michal Hocko
2020-07-09  2:14         ` Yafang Shao
2020-07-09  6:26           ` Michal Hocko
2020-07-09  6:41             ` Michal Hocko [this message]
2020-07-09  7:31             ` Yafang Shao
2020-07-09  8:17               ` Michal Hocko
2020-07-09  1:57       ` Yafang Shao
2020-07-08 15:11   ` Yafang Shao
2020-07-08 16:09     ` Michal Hocko
2020-07-09  1:57       ` Yafang Shao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200709064135.GA19160@dhcp22.suse.cz \
    --to=mhocko@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=laoar.shao@gmail.com \
    --cc=linux-mm@kvack.org \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).