linux-mm.kvack.org archive mirror
From: Minchan Kim <minchan@kernel.org>
Cc: Sasha Levin <levinsasha928@gmail.com>,
	Hugh Dickins <hughd@google.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: raise MemFree by reverting percpu_pagelist_fraction to 0
Date: Fri, 11 May 2012 18:01:39 +0900
Message-ID: <4FACD573.4060103@kernel.org>
In-Reply-To: <4FACD00D.4060003@kernel.org>

On 05/11/2012 05:38 PM, Minchan Kim wrote:

> On 05/11/2012 05:30 PM, Sasha Levin wrote:
> 
>> On Fri, May 11, 2012 at 10:00 AM, Hugh Dickins <hughd@google.com> wrote:
>>> Commit 93278814d359 "mm: fix division by 0 in percpu_pagelist_fraction()"
>>> mistakenly initialized percpu_pagelist_fraction to the sysctl's minimum 8,
>>> which leaves 1/8th of memory on percpu lists (on each cpu??); but most of
>>> us expect it to be left unset at 0 (and it's not then used as a divisor).
>>
>> I'm a bit confused about this, does it mean that once you set
>> percpu_pagelist_fraction to a value above the minimum, you can no
>> longer set it back to being 0?
> 
> 
> Unfortunately, yes. :(
> It's rather awkward and needs a fix.
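
To make Hugh's point above concrete, here is a rough userspace sketch of the
per-cpu high watermark the sysctl handler computes today (the zone size is
made up, just for illustration; the kernel uses zone->present_pages of each
populated zone):

#include <stdio.h>

int main(void)
{
        /* Illustrative numbers only: a zone with 1M present 4KB pages (~4GB). */
        unsigned long zone_present_pages = 1UL << 20;
        int percpu_pagelist_fraction = 8;       /* the sysctl's current minimum */

        /*
         * This mirrors what percpu_pagelist_fraction_sysctl_handler() does
         * for every populated zone and every possible cpu: each per-cpu
         * pagelist may grow to present_pages / fraction pages.
         */
        unsigned long high = zone_present_pages / percpu_pagelist_fraction;

        printf("pcp->high = %lu pages (~%lu MB) per cpu, per zone\n",
               high, high * 4 / 1024);
        return 0;
}

So with the fraction stuck at 8, every cpu may keep up to 1/8th of each zone
on its pagelists, which is why MemFree drops so visibly.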

I didn't have much time, so here is a quick patch just to show the concept.
It is not tested and not carefully considered.
If no one objects, I will send a formal patch with cleaner code.

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index f487f25..fabc52c 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -132,7 +132,6 @@ static unsigned long dirty_bytes_min = 2 * PAGE_SIZE;
 /* this is needed for the proc_dointvec_minmax for [fs_]overflow UID and GID */
 static int maxolduid = 65535;
 static int minolduid;
-static int min_percpu_pagelist_fract = 8;
 
 static int ngroups_max = NGROUPS_MAX;
 static const int cap_last_cap = CAP_LAST_CAP;
@@ -1214,7 +1213,6 @@ static struct ctl_table vm_table[] = {
                .maxlen         = sizeof(percpu_pagelist_fraction),
                .mode           = 0644,
                .proc_handler   = percpu_pagelist_fraction_sysctl_handler,
-               .extra1         = &min_percpu_pagelist_fract,
        },
 #ifdef CONFIG_MMU
        {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a13ded1..cc2353a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -5161,12 +5161,30 @@ int percpu_pagelist_fraction_sysctl_handler(ctl_table *table, int write,
        ret = proc_dointvec_minmax(table, write, buffer, length, ppos);
        if (!write || (ret == -EINVAL))
                return ret;
-       for_each_populated_zone(zone) {
-               for_each_possible_cpu(cpu) {
-                       unsigned long  high;
-                       high = zone->present_pages / percpu_pagelist_fraction;
-                       setup_pagelist_highmark(
-                               per_cpu_ptr(zone->pageset, cpu), high);
+
+       if (percpu_pagelist_fraction < 8 && percpu_pagelist_fraction != 0)
+               return -EINVAL;
+
+       if (percpu_pagelist_fraction != 0) {
+               for_each_populated_zone(zone) {
+                       for_each_possible_cpu(cpu) {
+                               unsigned long  high;
+                               high = zone->present_pages / percpu_pagelist_fraction;
+                               setup_pagelist_highmark(
+                                       per_cpu_ptr(zone->pageset, cpu), high);
+                       }
+               }
+       }
+       else {
+               for_each_populated_zone(zone) {
+                       for_each_possible_cpu(cpu) {
+                               struct per_cpu_pageset *p = per_cpu_ptr(zone->pageset, cpu);
+                               unsigned long batch = zone_batchsize(zone);
+                               struct per_cpu_pages *pcp;
+                               pcp = &p->pcp;
+                               pcp->high = 6 * batch;
+                               pcp->batch = max(1UL, 1 * batch);
+                       }
                }
        }
        return 0;
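
For completeness, once something like the above is applied, writing 0 back
restores the boot-time defaults. A trivial (untested, illustrative only)
userspace check could look like this:

#include <stdio.h>

int main(void)
{
        /*
         * Needs root, and assumes the change above is applied so that 0 is
         * accepted again; with the current code the write fails with EINVAL
         * because of the minimum of 8, which is exactly Sasha's problem.
         */
        FILE *f = fopen("/proc/sys/vm/percpu_pagelist_fraction", "w");

        if (!f) {
                perror("fopen");
                return 1;
        }
        if (fprintf(f, "0\n") < 0 || fclose(f) != 0) {
                perror("write");
                return 1;
        }
        return 0;
}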


-- 
Kind regards,
Minchan Kim

