From: Dave Hansen <dave.hansen@intel.com>
To: Linux-MM <linux-mm@kvack.org>, LKML <linux-kernel@vger.kernel.org>
Subject: [RFC] Bogus zone->watermark[WMARK_MIN] for big systems
Date: Tue, 17 Feb 2015 12:33:32 -0800
Message-ID: <54E3A59C.7090202@intel.com>
[-- Attachment #1: Type: text/plain, Size: 1198 bytes --]
I've got a 2TB 8-node system (256GB per NUMA node) that's behaving a bit
strangely (OOMs with gigabytes of free memory).
Its watermarks look wonky, with a min watermark of 0 pages for DMA and
only 11 pages for DMA32:
> Node 0 DMA free:7428kB min:0kB low:0kB high:0kB ...
> Node 0 DMA32 free:1024084kB min:44kB low:52kB high:64kB ... present:1941936kB managed:1862456kB
> Node 0 Normal free:4808kB min:6348kB low:7932kB high:9520kB ... present:266338304kB managed:262138972kB
This looks to be caused by us distributing the min_free_kbytes value
across the zones in proportion to their size. With such a huge size
imbalance (16MB zone vs 2TB system), the zone's share is 1/131072th of
the default min_free_kbytes, which ends up <1 page and truncates to 0.
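To make the arithmetic concrete, here is a minimal userspace sketch of the
proportional split. It is not kernel code: SKETCH_PAGE_SHIFT and
proportional_min_pages() are hypothetical names, 4kB pages are assumed, and
the min_free_kbytes value is whatever the caller passes in.

```c
#include <stdint.h>

/* Assumption for illustration: 4kB pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Hypothetical helper mirroring the proportional calculation:
 * convert min_free_kbytes to pages, then scale by the zone's share
 * of lowmem. Integer division truncates, so any share smaller than
 * one page comes out as 0.
 */
static uint64_t proportional_min_pages(uint64_t min_free_kbytes,
				       uint64_t zone_managed_pages,
				       uint64_t lowmem_pages)
{
	uint64_t pages_min = min_free_kbytes >> (SKETCH_PAGE_SHIFT - 10);

	return pages_min * zone_managed_pages / lowmem_pages;
}
```

Assuming min_free_kbytes is 65536 (an illustrative value, not one taken from
this system), a 16MB zone (4096 pages) in a 2TB system (536870912 pages)
gets 16384 * 4096 / 536870912, which is 1/8 of a page and truncates to 0.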
Should we be setting up some absolute floors on the watermarks, like the
attached patch?
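As a rough sketch of what such a floor does to the numbers, the helper
below applies the same managed_pages / 256 floor that the attached patch
uses. floored_min_pages() is a hypothetical userspace function for
illustration only, and the page counts in the note assume 4kB pages.

```c
#include <stdint.h>

/*
 * Hypothetical helper: take the larger of the proportional watermark
 * and a floor of 1/256th of the zone's own managed pages.
 */
static uint64_t floored_min_pages(uint64_t proportional_min,
				  uint64_t zone_managed_pages)
{
	uint64_t absolute_min = zone_managed_pages / 256;

	return proportional_min > absolute_min ? proportional_min
					       : absolute_min;
}
```

For a 16MB zone (4096 pages) whose proportional share is 0, this gives a
min watermark of 16 pages (64kB). The same floor is applied to every zone,
so for Node 0 Normal (managed:262138972kB, or 65534743 pages) it works out
to 255995 pages, roughly 1GB, against the reported min of 6348kB.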
BTW, it seems to be this code:
> static void __setup_per_zone_wmarks(void)
> {
> 	unsigned long pages_min = min_free_kbytes >> (PAGE_SHIFT - 10);
> ...
> 	for_each_zone(zone) {
> 		u64 tmp;
>
> 		spin_lock_irqsave(&zone->lock, flags);
> 		tmp = (u64)pages_min * zone->managed_pages;
> 		do_div(tmp, lowmem_pages);
[-- Attachment #2: mm-absolute-floors-for-watermarks.patch --]
[-- Type: text/x-patch, Size: 1170 bytes --]
---
b/mm/page_alloc.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff -puN mm/page_alloc.c~mm-absolute-floors-for-watermarks mm/page_alloc.c
--- a/mm/page_alloc.c~mm-absolute-floors-for-watermarks 2015-02-17 11:19:48.470054562 -0800
+++ b/mm/page_alloc.c 2015-02-17 11:26:48.164983632 -0800
@@ -5739,6 +5739,14 @@ static void __setup_per_zone_wmarks(void
 	}

 	for_each_zone(zone) {
+		/*
+		 * For very small zones (think 16MB ZONE_DMA on a 4TB system),
+		 * proportionally distributing pages_min can lead to
+		 * watermarks of 0.  Give it an absolute floor so we always
+		 * have at least a minimal watermark based on the size of the
+		 * *zone*, not the system.
+		 */
+		unsigned long absolute_min = zone->managed_pages / 256;
 		u64 tmp;

 		spin_lock_irqsave(&zone->lock, flags);
@@ -5766,7 +5774,8 @@ static void __setup_per_zone_wmarks(void
 			 */
 			zone->watermark[WMARK_MIN] = tmp;
 		}
-
+		zone->watermark[WMARK_MIN] = max(zone->watermark[WMARK_MIN],
+						 absolute_min);
 		zone->watermark[WMARK_LOW] = min_wmark_pages(zone) + (tmp >> 2);
 		zone->watermark[WMARK_HIGH] = min_wmark_pages(zone) + (tmp >> 1);
_