From: Hideo AOKI <haoki@redhat.com>
To: akpm@osdl.org
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS
Date: Wed, 05 Apr 2006 19:47:27 -0400 [thread overview]
Message-ID: <4434570F.9030507@redhat.com> (raw)
[-- Attachment #1: Type: text/plain, Size: 562 bytes --]
Hello Andrew,
Could you apply my patches to your tree?
These patches are an enhancement of OVERCOMMIT_GUESS algorithm in
__vm_enough_memory(). The detailed description is in attached patch.
Actually, these are the revised patch which I sent to lkml in the last
year.
http://marc.theaimsgroup.com/?l=linux-kernel&m=112993489022427&w=2
I wrote a test kernel module to show the result of the patches.
For your information, I also would like to send the module in later e-mail.
Best regards,
Hideo Aoki
---
Hideo Aoki, Hitachi Computer Products (America) Inc.
[-- Attachment #2: mm-add-totalreserve_pages.patch --]
[-- Type: text/x-patch, Size: 4960 bytes --]
These patches are an enhancement of OVERCOMMIT_GUESS algorithm in
__vm_enough_memory().
- why the kernel needed patching
When the kernel can't allocate anonymous pages in practice, currnet
OVERCOMMIT_GUESS could return success. This implementation might be
the cause of oom kill in memory pressure situation.
If the Linux runs with page reservation features like
/proc/sys/vm/lowmem_reserve_ratio and without swap region, I think
the oom kill occurs easily.
- the overall design approach in the patch
When the OVERCOMMET_GUESS algorithm calculates number of free pages,
the reserved free pages are regarded as non-free pages.
This change helps to avoid the pitfall that the number of free pages
become less than the number which the kernel tries to keep free.
- testing results
I tested the patches using my test kernel module.
If the patches aren't applied to the kernel, __vm_enough_memory()
returns success in the situation but autual page allocation is
failed.
On the other hand, if the patches are applied to the kernel, memory
allocation failure is avoided since __vm_enough_memory() returns
failure in the situation.
I checked that on i386 SMP 16GB memory machine. I haven't tested on
nommu environment currently.
- changelog
v5:
- updated to 2.6.17-rc1-mm1
- did more strict tests.
- added the enhancement to mm/nommu.c too
v4:
- dealing with pages_high as reserved pages
- updated the code for 2.6.14-rc4-mm1
v3 (private):
- enhanced error handling in __vm_enough_memory
- fixed an issue related calculation of totalreserve_pages
v2 (private):
- fixed error handling bug
- updated test results
- updated the code for 2.6.14-rc2-mm2
This patch adds totalreserve_pages for __vm_enough_memory().
Calculate_totalreserve_pages() checks maximum lowmem_reserve pages and
pages_high in each zone. Finally, the function stores the sum of each
zone to totalreserve_pages.
The totalreserve_pages is calculated when the VM is initilized.
And the variable is updated when /proc/sys/vm/lowmem_reserve_raito
or /proc/sys/vm/min_free_kbytes are changed.
Signed-off-by: Hideo Aoki <haoki@redhat.com>
---
include/linux/swap.h | 1 +
mm/page_alloc.c | 39 +++++++++++++++++++++++++++++++++++++++
2 files changed, 40 insertions(+)
diff -purN linux-2.6.17-rc1-mm1/include/linux/swap.h linux-2.6.17-rc1-mm1-idea6/include/linux/swap.h
--- linux-2.6.17-rc1-mm1/include/linux/swap.h 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-idea6/include/linux/swap.h 2006-04-04 15:13:26.000000000 -0400
@@ -155,6 +155,7 @@ extern void swapin_readahead(swp_entry_t
/* linux/mm/page_alloc.c */
extern unsigned long totalram_pages;
extern unsigned long totalhigh_pages;
+extern unsigned long totalreserve_pages;
extern long nr_swap_pages;
extern unsigned int nr_free_pages(void);
extern unsigned int nr_free_pages_pgdat(pg_data_t *pgdat);
diff -purN linux-2.6.17-rc1-mm1/mm/page_alloc.c linux-2.6.17-rc1-mm1-idea6/mm/page_alloc.c
--- linux-2.6.17-rc1-mm1/mm/page_alloc.c 2006-04-04 10:43:57.000000000 -0400
+++ linux-2.6.17-rc1-mm1-idea6/mm/page_alloc.c 2006-04-04 15:13:26.000000000 -0400
@@ -51,6 +51,7 @@ nodemask_t node_possible_map __read_most
EXPORT_SYMBOL(node_possible_map);
unsigned long totalram_pages __read_mostly;
unsigned long totalhigh_pages __read_mostly;
+unsigned long totalreserve_pages __read_mostly;
long nr_swap_pages;
int percpu_pagelist_fraction;
@@ -2548,6 +2549,38 @@ void __init page_alloc_init(void)
}
/*
+ * calculate_totalreserve_pages - called when sysctl_lower_zone_reserve_ratio
+ * or min_free_kbytes changes.
+ */
+static void calculate_totalreserve_pages(void)
+{
+ struct pglist_data *pgdat;
+ unsigned long reserve_pages = 0;
+ int i, j;
+
+ for_each_online_pgdat(pgdat) {
+ for (i = 0; i < MAX_NR_ZONES; i++) {
+ struct zone *zone = pgdat->node_zones + i;
+ unsigned long max = 0;
+
+ /* Find valid and maximum lowmem_reserve in the zone */
+ for (j = i; j < MAX_NR_ZONES; j++) {
+ if (zone->lowmem_reserve[j] > max)
+ max = zone->lowmem_reserve[j];
+ }
+
+ /* we treat pages_high as reserved pages. */
+ max += zone->pages_high;
+
+ if (max > zone->present_pages)
+ max = zone->present_pages;
+ reserve_pages += max;
+ }
+ }
+ totalreserve_pages = reserve_pages;
+}
+
+/*
* setup_per_zone_lowmem_reserve - called whenever
* sysctl_lower_zone_reserve_ratio changes. Ensures that each zone
* has a correct pages reserved value, so an adequate number of
@@ -2578,6 +2611,9 @@ static void setup_per_zone_lowmem_reserv
}
}
}
+
+ /* update totalreserve_pages */
+ calculate_totalreserve_pages();
}
/*
@@ -2632,6 +2668,9 @@ void setup_per_zone_pages_min(void)
zone->pages_high = zone->pages_min + tmp / 2;
spin_unlock_irqrestore(&zone->lru_lock, flags);
}
+
+ /* update totalreserve_pages */
+ calculate_totalreserve_pages();
}
/*
next reply other threads:[~2006-04-05 23:47 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-04-05 23:47 Hideo AOKI [this message]
2006-04-05 23:51 ` A test kernel module of OVERCOMMIT_GUESS Hideo AOKI
2006-04-05 23:52 ` A patch for test_overcommit module Hideo AOKI
2006-04-06 0:45 ` [patch 1/3] mm: An enhancement of OVERCOMMIT_GUESS KAMEZAWA Hiroyuki
2006-04-06 0:45 ` KAMEZAWA Hiroyuki
2006-04-06 7:20 ` Hideo AOKI
2006-04-06 7:20 ` Hideo AOKI
2006-04-06 8:08 ` KAMEZAWA Hiroyuki
2006-04-06 8:08 ` KAMEZAWA Hiroyuki
2006-04-07 11:49 ` Hideo AOKI
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4434570F.9030507@redhat.com \
--to=haoki@redhat.com \
--cc=akpm@osdl.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.