* [RFC][patch 1/5] mm: Revert "mm: fix boundary checking in free_bootmem_core"
2008-04-16 11:36 [RFC][patch 0/5] Bootmem fixes Johannes Weiner
@ 2008-04-16 11:36 ` Johannes Weiner
2008-04-16 17:49 ` Yinghai Lu
2008-04-16 11:36 ` [RFC][patch 2/5] mm: Node-setup agnostic free_bootmem() Johannes Weiner
` (4 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Johannes Weiner @ 2008-04-16 11:36 UTC (permalink / raw)
To: LKML
Cc: Linux MM, Yinghai Lu, Andi Kleen, Yasunori Goto,
KAMEZAWA Hiroyuki, Ingo Molnar, Christoph Lameter, Andrew Morton
[-- Attachment #1: 0001-bootmem-Revert-mm-fix-boundary-checking-in-free_b.patch --]
[-- Type: text/plain, Size: 3632 bytes --]
This reverts commit 5a982cbc7b3fe6cf72266f319286f29963c71b9e.

The intention behind this patch was to make the free_bootmem()
interface more robust with regard to the specified range and to let
it operate on multi-node setups as well.

However, it made free_bootmem_core()

  1. handle bogus node/memory-range combination input by just
     returning early without informing the callsite or screaming BUG()
     as it did before

  2. round slightly out of node-range values to the node boundaries
     instead of treating them as the invalid parameters they are

This was partially done to abuse free_bootmem_core() for node
iteration in free_bootmem() (just feeding it every node on the box and
letting it figure out what it wants to do with it) instead of looking up
the proper node before the call to free_bootmem_core().

It also affects free_bootmem_node(), which relies on
free_bootmem_core() and on its sanity checks, which are now removed.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
CC: Yinghai Lu <yhlu.kernel@gmail.com>
CC: Andi Kleen <andi@firstfloor.org>
CC: Yasunori Goto <y-goto@jp.fujitsu.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Ingo Molnar <mingo@elte.hu>
CC: Christoph Lameter <clameter@sgi.com>
CC: Andrew Morton <akpm@linux-foundation.org>
---
mm/bootmem.c | 25 ++++++-------------------
1 files changed, 6 insertions(+), 19 deletions(-)
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 2ccea70..f6ff433 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -125,7 +125,6 @@ static int __init reserve_bootmem_core(bootmem_data_t *bdata,
 	BUG_ON(!size);
 	BUG_ON(PFN_DOWN(addr) >= bdata->node_low_pfn);
 	BUG_ON(PFN_UP(addr + size) > bdata->node_low_pfn);
-	BUG_ON(addr < bdata->node_boot_start);

 	sidx = PFN_DOWN(addr - bdata->node_boot_start);
 	eidx = PFN_UP(addr + size - bdata->node_boot_start);
@@ -157,31 +156,21 @@ static void __init free_bootmem_core(bootmem_data_t *bdata, unsigned long addr,
 	unsigned long sidx, eidx;
 	unsigned long i;

-	BUG_ON(!size);
-
-	/* out range */
-	if (addr + size < bdata->node_boot_start ||
-		PFN_DOWN(addr) > bdata->node_low_pfn)
-		return;
 	/*
 	 * round down end of usable mem, partially free pages are
 	 * considered reserved.
 	 */
+	BUG_ON(!size);
+	BUG_ON(PFN_DOWN(addr + size) > bdata->node_low_pfn);

-	if (addr >= bdata->node_boot_start && addr < bdata->last_success)
+	if (addr < bdata->last_success)
 		bdata->last_success = addr;

 	/*
-	 * Round up to index to the range.
+	 * Round up the beginning of the address.
 	 */
-	if (PFN_UP(addr) > PFN_DOWN(bdata->node_boot_start))
-		sidx = PFN_UP(addr) - PFN_DOWN(bdata->node_boot_start);
-	else
-		sidx = 0;
-
+	sidx = PFN_UP(addr) - PFN_DOWN(bdata->node_boot_start);
 	eidx = PFN_DOWN(addr + size - bdata->node_boot_start);
-	if (eidx > bdata->node_low_pfn - PFN_DOWN(bdata->node_boot_start))
-		eidx = bdata->node_low_pfn - PFN_DOWN(bdata->node_boot_start);

 	for (i = sidx; i < eidx; i++) {
 		if (unlikely(!test_and_clear_bit(i, bdata->node_bootmem_map)))
@@ -432,9 +421,7 @@ int __init reserve_bootmem(unsigned long addr, unsigned long size,

 void __init free_bootmem(unsigned long addr, unsigned long size)
 {
-	bootmem_data_t *bdata;
-	list_for_each_entry(bdata, &bdata_list, list)
-		free_bootmem_core(bdata, addr, size);
+	free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
 }

 unsigned long __init free_all_bootmem(void)
--
1.5.2.2
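
For readers following along, here is a minimal standalone sketch (userspace C,
not the kernel code) of the two behaviours the changelog contrasts. The names
`struct node_range`, `free_range_strict()` and `free_range_lenient()` are made
up for illustration, the page size is assumed to be 4K, and the lower-bound
check in the strict variant is an assumption of this sketch rather than
something the patch adds. The strict variant reports a range that does not fit
the node, where the kernel would hit BUG_ON(); the lenient variant mirrors the
ignore-or-clamp logic that is being reverted.

```c
#include <stdbool.h>

#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define PFN_UP(x)	(((x) + PAGE_SIZE - 1) >> PAGE_SHIFT)
#define PFN_DOWN(x)	((x) >> PAGE_SHIFT)

/* Hypothetical, cut-down stand-in for bootmem_data_t. */
struct node_range {
	unsigned long node_boot_start;	/* first byte of the node */
	unsigned long node_low_pfn;	/* first page frame past the node */
};

/*
 * Strict variant: refuse a range that does not fit the node.
 * Returns false where the kernel code would hit BUG_ON().
 */
bool free_range_strict(const struct node_range *n, unsigned long addr,
		       unsigned long size, unsigned long *sidx,
		       unsigned long *eidx)
{
	if (!size || addr < n->node_boot_start ||
	    PFN_DOWN(addr + size) > n->node_low_pfn)
		return false;

	/* Partially covered pages at either end stay reserved. */
	*sidx = PFN_UP(addr) - PFN_DOWN(n->node_boot_start);
	*eidx = PFN_DOWN(addr + size - n->node_boot_start);
	return true;
}

/*
 * Lenient variant: silently skip ranges outside the node and clamp
 * everything else to the node boundaries.
 */
bool free_range_lenient(const struct node_range *n, unsigned long addr,
			unsigned long size, unsigned long *sidx,
			unsigned long *eidx)
{
	unsigned long base = PFN_DOWN(n->node_boot_start);

	if (addr + size < n->node_boot_start ||
	    PFN_DOWN(addr) > n->node_low_pfn)
		return false;	/* the caller never learns why */

	*sidx = PFN_UP(addr) > base ? PFN_UP(addr) - base : 0;
	*eidx = PFN_DOWN(addr + size - n->node_boot_start);
	if (*eidx > n->node_low_pfn - base)
		*eidx = n->node_low_pfn - base;
	return true;
}
```

With a node covering 1M-2M, a request that runs one page past the end of the
node is rejected by the strict variant, while the lenient one quietly trims it
to the last page of the node.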
--
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: [RFC][patch 1/5] mm: Revert "mm: fix boundary checking in free_bootmem_core"
2008-04-16 11:36 ` [RFC][patch 1/5] mm: Revert "mm: fix boundary checking in free_bootmem_core" Johannes Weiner
@ 2008-04-16 17:49 ` Yinghai Lu
0 siblings, 0 replies; 16+ messages in thread
From: Yinghai Lu @ 2008-04-16 17:49 UTC (permalink / raw)
To: Johannes Weiner
Cc: LKML, Linux MM, Andi Kleen, Yasunori Goto, KAMEZAWA Hiroyuki,
Ingo Molnar, Christoph Lameter, Andrew Morton
On Wed, Apr 16, 2008 at 4:36 AM, Johannes Weiner <hannes@saeurebad.de> wrote:
> This reverts commit 5a982cbc7b3fe6cf72266f319286f29963c71b9e.
>
> The intention behind this patch was to make the free_bootmem()
> interface more robust with regards to the specified range and to let
> it operate on multiple node setups as well.
>
> However, it made free_bootmem_core()
>
> 1. handle bogus node/memory-range combination input by just
> returning early without informing the callsite or screaming BUG()
> as it did before
> 2. round slightly out of node-range values to the node boundaries
> instead of treating them as the invalid parameters they are
>
> This was partially done to abuse free_bootmem_core() for node
> iteration in free_bootmem (just feeding it every node on the box and
> let it figure out what it wants to do with it) instead of looking up
> the proper node before the call to free_bootmem_core().
>
> It also affects free_bootmem_node() which relies on
> free_bootmem_core() and on its sanity checks now removed.
>
> Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
> CC: Yinghai Lu <yhlu.kernel@gmail.com>
> CC: Andi Kleen <andi@firstfloor.org>
> CC: Yasunori Goto <y-goto@jp.fujitsu.com>
> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> CC: Ingo Molnar <mingo@elte.hu>
> CC: Christoph Lameter <clameter@sgi.com>
> CC: Andrew Morton <akpm@linux-foundation.org>
> ---
> mm/bootmem.c | 25 ++++++-------------------
> 1 files changed, 6 insertions(+), 19 deletions(-)
>
> diff --git a/mm/bootmem.c b/mm/bootmem.c
> index 2ccea70..f6ff433 100644
> --- a/mm/bootmem.c
> +++ b/mm/bootmem.c
> @@ -125,7 +125,6 @@ static int __init reserve_bootmem_core(bootmem_data_t *bdata,
> BUG_ON(!size);
> BUG_ON(PFN_DOWN(addr) >= bdata->node_low_pfn);
> BUG_ON(PFN_UP(addr + size) > bdata->node_low_pfn);
> - BUG_ON(addr < bdata->node_boot_start);
>
> sidx = PFN_DOWN(addr - bdata->node_boot_start);
> eidx = PFN_UP(addr + size - bdata->node_boot_start);
can you keep the change with reserve_bootmem_core? another patch
regarding reserve_bootmem will update it further.
YH
* [RFC][patch 2/5] mm: Node-setup agnostic free_bootmem()
2008-04-16 11:36 [RFC][patch 0/5] Bootmem fixes Johannes Weiner
2008-04-16 11:36 ` [RFC][patch 1/5] mm: Revert "mm: fix boundary checking in free_bootmem_core" Johannes Weiner
@ 2008-04-16 11:36 ` Johannes Weiner
2008-04-16 17:54 ` Yinghai Lu
2008-04-16 11:36 ` [RFC][patch 3/5] mm: Unexport __alloc_bootmem_core() Johannes Weiner
` (3 subsequent siblings)
5 siblings, 1 reply; 16+ messages in thread
From: Johannes Weiner @ 2008-04-16 11:36 UTC (permalink / raw)
To: LKML
Cc: Linux MM, Ingo Molnar, Andi Kleen, Yinghai Lu, Yasunori Goto,
KAMEZAWA Hiroyuki, Christoph Lameter, Andrew Morton
[-- Attachment #1: mm-node-setup-agnostic-free_bootmem.patch --]
[-- Type: text/plain, Size: 1961 bytes --]
Make free_bootmem() look up the node holding the specified address
range, which lets it work transparently on single-node and multi-node
configurations.

If the address range exceeds the node range, it will be marked free
across node boundaries, too.

Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
CC: Ingo Molnar <mingo@elte.hu>
CC: Andi Kleen <andi@firstfloor.org>
CC: Yinghai Lu <yhlu.kernel@gmail.com>
CC: Yasunori Goto <y-goto@jp.fujitsu.com>
CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
CC: Christoph Lameter <clameter@sgi.com>
CC: Andrew Morton <akpm@linux-foundation.org>
---
mm/bootmem.c | 10 +++++++++-
1 files changed, 9 insertions(+), 1 deletions(-)
Index: tree-linus/mm/bootmem.c
===================================================================
--- tree-linus.orig/mm/bootmem.c
+++ tree-linus/mm/bootmem.c
@@ -421,7 +421,32 @@ int __init reserve_bootmem(unsigned long

 void __init free_bootmem(unsigned long addr, unsigned long size)
 {
-	free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
+	bootmem_data_t *bdata;
+	unsigned long pos = addr;
+	unsigned long partsize = size;
+
+	list_for_each_entry(bdata, &bdata_list, list) {
+		unsigned long remainder = 0;
+
+		if (pos < bdata->node_boot_start)
+			continue;
+
+		if (PFN_DOWN(pos + partsize) > bdata->node_low_pfn) {
+			remainder = PFN_DOWN(pos + partsize) - bdata->node_low_pfn;
+			partsize -= remainder;
+		}
+
+		free_bootmem_core(bdata, pos, partsize);
+
+		if (!remainder)
+			return;
+
+		pos = PFN_PHYS(bdata->node_low_pfn + 1);
+	}
+	printk(KERN_ERR "free_bootmem: request: addr=%lx, size=%lx, "
+			"state: pos=%lx, partsize=%lx\n", addr, size,
+			pos, partsize);
+	BUG();
 }

 unsigned long __init free_all_bootmem(void)
--
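
The node walk above can be sketched outside the kernel as follows. This is
only an illustration under stated assumptions, not the patch itself:
`struct node_range`, `struct chunk` and `split_range()` are hypothetical names,
each node is assumed to cover one contiguous range with the nodes listed in
ascending address order, and pages are assumed to be 4K. The sketch keeps every
quantity in bytes, so no conversion between page frame numbers and byte sizes
is needed when trimming the range at a node boundary.

```c
#include <stddef.h>

#define PAGE_SHIFT	12
#define PFN_PHYS(pfn)	((unsigned long)(pfn) << PAGE_SHIFT)

/* Hypothetical, cut-down stand-in for bootmem_data_t. */
struct node_range {
	unsigned long node_boot_start;	/* first byte of the node */
	unsigned long node_low_pfn;	/* first page frame past the node */
};

/* One piece of the request, attributed to a single node. */
struct chunk {
	size_t node;		/* index into the node array */
	unsigned long addr;	/* start of the piece, in bytes */
	unsigned long size;	/* length of the piece, in bytes */
};

/*
 * Split [addr, addr + size) into per-node pieces.
 * Returns the number of pieces written to out[], or -1 if some part of
 * the range is not covered by any node (where the patch would BUG()).
 */
int split_range(const struct node_range *nodes, size_t nr_nodes,
		unsigned long addr, unsigned long size,
		struct chunk *out, size_t max_out)
{
	unsigned long pos = addr;
	unsigned long end = addr + size;
	size_t i, n = 0;

	for (i = 0; i < nr_nodes && pos < end; i++) {
		unsigned long node_end = PFN_PHYS(nodes[i].node_low_pfn);
		unsigned long part_end;

		/* pos does not lie inside this node, try the next one */
		if (pos < nodes[i].node_boot_start || pos >= node_end)
			continue;

		/* trim the piece at the end of this node */
		part_end = end < node_end ? end : node_end;
		if (n == max_out)
			return -1;
		out[n].node = i;
		out[n].addr = pos;
		out[n].size = part_end - pos;
		n++;

		/* continue with whatever is left, if anything */
		pos = part_end;
	}
	return pos == end ? (int)n : -1;
}
```

For two adjacent 1M nodes, a 12K request starting one page below the 1M
boundary comes back as a 4K piece on the first node and an 8K piece on the
second.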
* Re: [RFC][patch 2/5] mm: Node-setup agnostic free_bootmem()
2008-04-16 11:36 ` [RFC][patch 2/5] mm: Node-setup agnostic free_bootmem() Johannes Weiner
@ 2008-04-16 17:54 ` Yinghai Lu
2008-04-16 18:44 ` Yinghai Lu
2008-04-16 19:19 ` Johannes Weiner
0 siblings, 2 replies; 16+ messages in thread
From: Yinghai Lu @ 2008-04-16 17:54 UTC (permalink / raw)
To: Johannes Weiner
Cc: LKML, Linux MM, Ingo Molnar, Andi Kleen, Yasunori Goto,
KAMEZAWA Hiroyuki, Christoph Lameter, Andrew Morton
On Wed, Apr 16, 2008 at 4:36 AM, Johannes Weiner <hannes@saeurebad.de> wrote:
> Make free_bootmem() look up the node holding the specified address
> range which lets it work transparently on single-node and multi-node
> configurations.
>
> If the address range exceeds the node range, it will be marked free
> across node boundaries, too.
>
> Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
> CC: Ingo Molnar <mingo@elte.hu>
> CC: Andi Kleen <andi@firstfloor.org>
> CC: Yinghai Lu <yhlu.kernel@gmail.com>
> CC: Yasunori Goto <y-goto@jp.fujitsu.com>
> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> CC: Christoph Lameter <clameter@sgi.com>
> CC: Andrew Morton <akpm@linux-foundation.org>
> ---
> mm/bootmem.c | 10 +++++++++-
> 1 files changed, 9 insertions(+), 1 deletions(-)
>
> Index: tree-linus/mm/bootmem.c
> ===================================================================
> --- tree-linus.orig/mm/bootmem.c
> +++ tree-linus/mm/bootmem.c
> @@ -421,7 +421,32 @@ int __init reserve_bootmem(unsigned long
>
> void __init free_bootmem(unsigned long addr, unsigned long size)
> {
> - free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
> + bootmem_data_t *bdata;
> + unsigned long pos = addr;
> + unsigned long partsize = size;
> +
> + list_for_each_entry(bdata, &bdata_list, list) {
> + unsigned long remainder = 0;
> +
> + if (pos < bdata->node_boot_start)
> + continue;
> +
> + if (PFN_DOWN(pos + partsize) > bdata->node_low_pfn) {
> + remainder = PFN_DOWN(pos + partsize) - bdata->node_low_pfn;
> + partsize -= remainder;
> + }
> +
> + free_bootmem_core(bdata, pos, partsize);
> +
> + if (!remainder)
> + return;
> +
> + pos = PFN_PHYS(bdata->node_low_pfn + 1);
> + }
> + printk(KERN_ERR "free_bootmem: request: addr=%lx, size=%lx, "
> + "state: pos=%lx, partsize=%lx\n", addr, size,
> + pos, partsize);
> + BUG();
> }
>
> unsigned long __init free_all_bootmem(void)
>
> --
Yes, it should work well with cross nodes case.
but please add boundary check on free_bootmem_node too.
YH
* Re: [RFC][patch 2/5] mm: Node-setup agnostic free_bootmem()
2008-04-16 17:54 ` Yinghai Lu
@ 2008-04-16 18:44 ` Yinghai Lu
2008-04-16 18:48 ` Ingo Molnar
2008-04-16 19:19 ` Johannes Weiner
1 sibling, 1 reply; 16+ messages in thread
From: Yinghai Lu @ 2008-04-16 18:44 UTC (permalink / raw)
To: Johannes Weiner
Cc: LKML, Linux MM, Ingo Molnar, Andi Kleen, Yasunori Goto,
KAMEZAWA Hiroyuki, Christoph Lameter, Andrew Morton,
Siddha, Suresh B
On Wed, Apr 16, 2008 at 10:54 AM, Yinghai Lu <yhlu.kernel@gmail.com> wrote:
>
> On Wed, Apr 16, 2008 at 4:36 AM, Johannes Weiner <hannes@saeurebad.de> wrote:
> > Make free_bootmem() look up the node holding the specified address
> > range which lets it work transparently on single-node and multi-node
> > configurations.
> >
> > If the address range exceeds the node range, it will be marked free
> > across node boundaries, too.
> >
> > Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
> > CC: Ingo Molnar <mingo@elte.hu>
> > CC: Andi Kleen <andi@firstfloor.org>
> > CC: Yinghai Lu <yhlu.kernel@gmail.com>
> > CC: Yasunori Goto <y-goto@jp.fujitsu.com>
> > CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > CC: Christoph Lameter <clameter@sgi.com>
> > CC: Andrew Morton <akpm@linux-foundation.org>
> > ---
> > mm/bootmem.c | 10 +++++++++-
> > 1 files changed, 9 insertions(+), 1 deletions(-)
> >
> > Index: tree-linus/mm/bootmem.c
> > ===================================================================
> > --- tree-linus.orig/mm/bootmem.c
> > +++ tree-linus/mm/bootmem.c
> > @@ -421,7 +421,32 @@ int __init reserve_bootmem(unsigned long
> >
> > void __init free_bootmem(unsigned long addr, unsigned long size)
> > {
> > - free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
> > + bootmem_data_t *bdata;
> > + unsigned long pos = addr;
> > + unsigned long partsize = size;
> > +
> > + list_for_each_entry(bdata, &bdata_list, list) {
> > + unsigned long remainder = 0;
> > +
> > + if (pos < bdata->node_boot_start)
> > + continue;
> > +
> > + if (PFN_DOWN(pos + partsize) > bdata->node_low_pfn) {
> > + remainder = PFN_DOWN(pos + partsize) - bdata->node_low_pfn;
> > + partsize -= remainder;
> > + }
> > +
> > + free_bootmem_core(bdata, pos, partsize);
> > +
> > + if (!remainder)
> > + return;
> > +
> > + pos = PFN_PHYS(bdata->node_low_pfn + 1);
> > + }
> > + printk(KERN_ERR "free_bootmem: request: addr=%lx, size=%lx, "
> > + "state: pos=%lx, partsize=%lx\n", addr, size,
> > + pos, partsize);
> > + BUG();
> > }
> >
> > unsigned long __init free_all_bootmem(void)
> >
> > --
>
> Yes, it should work well with cross nodes case.
>
> but please add boundary check on free_bootmem_node too.
also please note:
it will have a problem on boxes where nodes span each other.
for example: node 0: 0-2g, 4-6g, node1: 2-4g, 6-8g.
and if the ramdisk sits across the 2G boundary, you will only free the range before 2g.
YH
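
To make the example layout concrete, here is a small standalone sketch that
writes it down as a table and computes how much of a given range belongs to
each node. The names `struct mem_range`, `layout` and `bytes_on_node()` are
hypothetical and only serve the illustration; nothing here is kernel code. If
each node were described by a single start/end pair instead, node 0 (0-6g)
and node 1 (2-8g) would overlap, which is presumably why a lookup by node
boundaries alone cannot attribute a range crossing 2G correctly.

```c
#include <stddef.h>

#define MB	(1ULL << 20)
#define GB	(1ULL << 30)

/* One physically contiguous range and the node that owns it. */
struct mem_range {
	unsigned long long start;	/* inclusive, in bytes */
	unsigned long long end;		/* exclusive, in bytes */
	int nid;
};

/* The layout from the example: node 0: 0-2g, 4-6g, node 1: 2-4g, 6-8g. */
const struct mem_range layout[] = {
	{ 0 * GB, 2 * GB, 0 },
	{ 2 * GB, 4 * GB, 1 },
	{ 4 * GB, 6 * GB, 0 },
	{ 6 * GB, 8 * GB, 1 },
};

/* Number of bytes of [start, end) that belong to node nid. */
unsigned long long bytes_on_node(const struct mem_range *r, size_t nr,
				 unsigned long long start,
				 unsigned long long end, int nid)
{
	unsigned long long total = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		/* intersect the request with this range */
		unsigned long long lo = start > r[i].start ? start : r[i].start;
		unsigned long long hi = end < r[i].end ? end : r[i].end;

		if (r[i].nid == nid && lo < hi)
			total += hi - lo;
	}
	return total;
}
```

A 32M ramdisk centred on the 2G boundary has 16M on node 0 and 16M on node 1
in this table, so freeing it completely means touching both nodes.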
* Re: [RFC][patch 2/5] mm: Node-setup agnostic free_bootmem()
2008-04-16 18:44 ` Yinghai Lu
@ 2008-04-16 18:48 ` Ingo Molnar
2008-04-16 19:17 ` Johannes Weiner
0 siblings, 1 reply; 16+ messages in thread
From: Ingo Molnar @ 2008-04-16 18:48 UTC (permalink / raw)
To: Yinghai Lu
Cc: Johannes Weiner, LKML, Linux MM, Andi Kleen, Yasunori Goto,
KAMEZAWA Hiroyuki, Christoph Lameter, Andrew Morton,
Siddha, Suresh B
* Yinghai Lu <yhlu.kernel@gmail.com> wrote:
> > Yes, it should work well with cross nodes case.
> >
> > but please add boundary check on free_bootmem_node too.
>
> also please note: it will have a problem on boxes where nodes span each other.
>
> for example: node 0: 0-2g, 4-6g, node1: 2-4g, 6-8g. and if the ramdisk sits
> across the 2G boundary, you will only free the range before 2g.
yes. Such systems _will_ become more common - so the "this is rare"
arguments are incorrect. bootmem has to be robust enough to deal with
it.
Ingo
* Re: [RFC][patch 2/5] mm: Node-setup agnostic free_bootmem()
2008-04-16 18:48 ` Ingo Molnar
@ 2008-04-16 19:17 ` Johannes Weiner
2008-04-18 5:06 ` Yinghai Lu
0 siblings, 1 reply; 16+ messages in thread
From: Johannes Weiner @ 2008-04-16 19:17 UTC (permalink / raw)
To: Ingo Molnar
Cc: Yinghai Lu, LKML, Linux MM, Andi Kleen, Yasunori Goto,
KAMEZAWA Hiroyuki, Christoph Lameter, Andrew Morton,
Siddha, Suresh B
Hi,
Ingo Molnar <mingo@elte.hu> writes:
> * Yinghai Lu <yhlu.kernel@gmail.com> wrote:
>
>> > Yes, it should work well with cross nodes case.
>> >
>> > but please add boundary check on free_bootmem_node too.
>>
>> also please note: it will have a problem on boxes where nodes span each other.
>>
>> for example: node 0: 0-2g, 4-6g, node1: 2-4g, 6-8g. and if the ramdisk sits
>> across the 2G boundary, you will only free the range before 2g.
>
> yes. Such systems _will_ become more common - so the "this is rare"
> arguments are incorrect. bootmem has to be robust enough to deal with
> it.
Ingo, I never doubted any of this, I was just asking more than once if
and when this might happen. And I don't want the allocator to become
fragile, just not completely ignorant about bogus input.

But the situation is still not clear to me. Ingo, how are these
node-spanning pfn ranges represented in the kernel? How many node
descriptors will you have in the case Yinghai described and what will
they look like?

Hannes
* Re: [RFC][patch 2/5] mm: Node-setup agnostic free_bootmem()
2008-04-16 19:17 ` Johannes Weiner
@ 2008-04-18 5:06 ` Yinghai Lu
0 siblings, 0 replies; 16+ messages in thread
From: Yinghai Lu @ 2008-04-18 5:06 UTC (permalink / raw)
To: Johannes Weiner
Cc: Ingo Molnar, LKML, Linux MM, Andi Kleen, Yasunori Goto,
KAMEZAWA Hiroyuki, Christoph Lameter, Andrew Morton,
Siddha, Suresh B
On Wed, Apr 16, 2008 at 12:17 PM, Johannes Weiner <hannes@saeurebad.de> wrote:
> Hi,
>
>
>
> Ingo Molnar <mingo@elte.hu> writes:
>
> > * Yinghai Lu <yhlu.kernel@gmail.com> wrote:
> >
> >> > Yes, it should work well with cross nodes case.
> >> >
> >> > but please add boundary check on free_bootmem_node too.
> >>
> >> also please note: it will have a problem on boxes where nodes span each other.
> >>
> >> for example: node 0: 0-2g, 4-6g, node1: 2-4g, 6-8g. and if the ramdisk sits
> >> across the 2G boundary, you will only free the range before 2g.
> >
> > yes. Such systems _will_ become more common - so the "this is rare"
> > arguments are incorrect. bootmem has to be robust enough to deal with
> > it.
>
> Ingo, I never doubted any of this, I was just asking more than once if
> and when this might happen. And I don't want the allocator to become
> fragile, just not completely ignorant about bogus input.
>
> But the situation is still not clear to me. Ingo, how are these
> node-spanning pfn ranges represented in the kernel? How many node
> descriptors will you have in the case Yinghai described and what will
> they look like?

According to the patch from Suresh in x86.git, one node still has only one bdata.
YH
* Re: [RFC][patch 2/5] mm: Node-setup agnostic free_bootmem()
2008-04-16 17:54 ` Yinghai Lu
2008-04-16 18:44 ` Yinghai Lu
@ 2008-04-16 19:19 ` Johannes Weiner
1 sibling, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2008-04-16 19:19 UTC (permalink / raw)
To: Yinghai Lu
Cc: LKML, Linux MM, Ingo Molnar, Andi Kleen, Yasunori Goto,
KAMEZAWA Hiroyuki, Christoph Lameter, Andrew Morton
Hi,
"Yinghai Lu" <yhlu.kernel@gmail.com> writes:
> On Wed, Apr 16, 2008 at 4:36 AM, Johannes Weiner <hannes@saeurebad.de> wrote:
>> Make free_bootmem() look up the node holding the specified address
>> range which lets it work transparently on single-node and multi-node
>> configurations.
>>
>> If the address range exceeds the node range, it will be marked free
>> across node boundaries, too.
>>
>> Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
>> CC: Ingo Molnar <mingo@elte.hu>
>> CC: Andi Kleen <andi@firstfloor.org>
>> CC: Yinghai Lu <yhlu.kernel@gmail.com>
>> CC: Yasunori Goto <y-goto@jp.fujitsu.com>
>> CC: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>> CC: Christoph Lameter <clameter@sgi.com>
>> CC: Andrew Morton <akpm@linux-foundation.org>
>> ---
>> mm/bootmem.c | 10 +++++++++-
>> 1 files changed, 9 insertions(+), 1 deletions(-)
>>
>> Index: tree-linus/mm/bootmem.c
>> ===================================================================
>> --- tree-linus.orig/mm/bootmem.c
>> +++ tree-linus/mm/bootmem.c
>> @@ -421,7 +421,32 @@ int __init reserve_bootmem(unsigned long
>>
>> void __init free_bootmem(unsigned long addr, unsigned long size)
>> {
>> - free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
>> + bootmem_data_t *bdata;
>> + unsigned long pos = addr;
>> + unsigned long partsize = size;
>> +
>> + list_for_each_entry(bdata, &bdata_list, list) {
>> + unsigned long remainder = 0;
>> +
>> + if (pos < bdata->node_boot_start)
>> + continue;
>> +
>> + if (PFN_DOWN(pos + partsize) > bdata->node_low_pfn) {
>> + remainder = PFN_DOWN(pos + partsize) - bdata->node_low_pfn;
>> + partsize -= remainder;
>> + }
>> +
>> + free_bootmem_core(bdata, pos, partsize);
>> +
>> + if (!remainder)
>> + return;
>> +
>> + pos = PFN_PHYS(bdata->node_low_pfn + 1);
>> + }
>> + printk(KERN_ERR "free_bootmem: request: addr=%lx, size=%lx, "
>> + "state: pos=%lx, partsize=%lx\n", addr, size,
>> + pos, partsize);
>> + BUG();
>> }
>>
>> unsigned long __init free_all_bootmem(void)
>>
>> --
>
> Yes, it should work well with cross nodes case.
>
> but please add boundary check on free_bootmem_node too.
Alright, I will.
Hannes
* [RFC][patch 3/5] mm: Unexport __alloc_bootmem_core()
2008-04-16 11:36 [RFC][patch 0/5] Bootmem fixes Johannes Weiner
2008-04-16 11:36 ` [RFC][patch 1/5] mm: Revert "mm: fix boundary checking in free_bootmem_core" Johannes Weiner
2008-04-16 11:36 ` [RFC][patch 2/5] mm: Node-setup agnostic free_bootmem() Johannes Weiner
@ 2008-04-16 11:36 ` Johannes Weiner
2008-04-16 11:36 ` [RFC][patch 4/5] mm: Normalize internal argument passing of bootmem data Johannes Weiner
` (2 subsequent siblings)
5 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2008-04-16 11:36 UTC (permalink / raw)
To: LKML; +Cc: Linux MM
[-- Attachment #1: 0003-bootmem-Unexport-__alloc_bootmem_core.patch --]
[-- Type: text/plain, Size: 3235 bytes --]
Function has no external callers, make it local to the allocator.
Also fix its naming inconsistency.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
---
include/linux/bootmem.h | 5 -----
mm/bootmem.c | 18 +++++++++---------
2 files changed, 9 insertions(+), 14 deletions(-)
Index: tree-linus/include/linux/bootmem.h
===================================================================
--- tree-linus.orig/include/linux/bootmem.h
+++ tree-linus/include/linux/bootmem.h
@@ -54,11 +54,6 @@ extern void *__alloc_bootmem_low_node(pg
 				      unsigned long size,
 				      unsigned long align,
 				      unsigned long goal);
-extern void *__alloc_bootmem_core(struct bootmem_data *bdata,
-				  unsigned long size,
-				  unsigned long align,
-				  unsigned long goal,
-				  unsigned long limit);

 /*
  * flags for reserve_bootmem (also if CONFIG_HAVE_ARCH_BOOTMEM_NODE,
Index: tree-linus/mm/bootmem.c
===================================================================
--- tree-linus.orig/mm/bootmem.c
+++ tree-linus/mm/bootmem.c
@@ -191,16 +191,16 @@ static void __init free_bootmem_core(boo
  *
  * NOTE: This function is _not_ reentrant.
  */
-void * __init
-__alloc_bootmem_core(struct bootmem_data *bdata, unsigned long size,
-	      unsigned long align, unsigned long goal, unsigned long limit)
+static void * __init alloc_bootmem_core(struct bootmem_data *bdata,
+				unsigned long size, unsigned long align,
+				unsigned long goal, unsigned long limit)
 {
 	unsigned long offset, remaining_size, areasize, preferred;
 	unsigned long i, start = 0, incr, eidx, end_pfn;
 	void *ret;

 	if (!size) {
-		printk("__alloc_bootmem_core(): zero-sized request\n");
+		printk(KERN_ERR "alloc_bootmem_core(): zero-sized request\n");
 		BUG();
 	}
 	BUG_ON(align & (align-1));
@@ -461,7 +461,7 @@ void * __init __alloc_bootmem_nopanic(un
 	void *ptr;

 	list_for_each_entry(bdata, &bdata_list, list) {
-		ptr = __alloc_bootmem_core(bdata, size, align, goal, 0);
+		ptr = alloc_bootmem_core(bdata, size, align, goal, 0);
 		if (ptr)
 			return ptr;
 	}
@@ -489,7 +489,7 @@ void * __init __alloc_bootmem_node(pg_da
 {
 	void *ptr;

-	ptr = __alloc_bootmem_core(pgdat->bdata, size, align, goal, 0);
+	ptr = alloc_bootmem_core(pgdat->bdata, size, align, goal, 0);
 	if (ptr)
 		return ptr;

@@ -507,8 +507,8 @@ void * __init __alloc_bootmem_low(unsign
 	void *ptr;

 	list_for_each_entry(bdata, &bdata_list, list) {
-		ptr = __alloc_bootmem_core(bdata, size, align, goal,
-						ARCH_LOW_ADDRESS_LIMIT);
+		ptr = alloc_bootmem_core(bdata, size, align, goal,
+						ARCH_LOW_ADDRESS_LIMIT);
 		if (ptr)
 			return ptr;
 	}
@@ -524,6 +524,6 @@ void * __init __alloc_bootmem_low(unsign
 void * __init __alloc_bootmem_low_node(pg_data_t *pgdat, unsigned long size,
 				       unsigned long align, unsigned long goal)
 {
-	return __alloc_bootmem_core(pgdat->bdata, size, align, goal,
+	return alloc_bootmem_core(pgdat->bdata, size, align, goal,
 				    ARCH_LOW_ADDRESS_LIMIT);
 }
--
* [RFC][patch 4/5] mm: Normalize internal argument passing of bootmem data
2008-04-16 11:36 [RFC][patch 0/5] Bootmem fixes Johannes Weiner
` (2 preceding siblings ...)
2008-04-16 11:36 ` [RFC][patch 3/5] mm: Unexport __alloc_bootmem_core() Johannes Weiner
@ 2008-04-16 11:36 ` Johannes Weiner
2008-04-16 11:36 ` [RFC][patch 5/5] mm: Move bootmem descriptors definition to a single place Johannes Weiner
2008-04-17 9:36 ` [RFC][patch 0/5] Bootmem fixes KAMEZAWA Hiroyuki
5 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2008-04-16 11:36 UTC (permalink / raw)
To: LKML; +Cc: Linux MM
[-- Attachment #1: 0004-bootmem-Normalize-internal-argument-passing-of-boot.patch --]
[-- Type: text/plain, Size: 2700 bytes --]
All _core functions only need the bootmem data, not the node
descriptor. Adjust the two functions that needlessly take a node
descriptor.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
---
mm/bootmem.c | 14 ++++++--------
1 files changed, 6 insertions(+), 8 deletions(-)
Index: tree-linus/mm/bootmem.c
===================================================================
--- tree-linus.orig/mm/bootmem.c
+++ tree-linus/mm/bootmem.c
@@ -85,10 +85,9 @@ static unsigned long __init get_mapsize(
 /*
  * Called once to set up the allocator itself.
  */
-static unsigned long __init init_bootmem_core(pg_data_t *pgdat,
+static unsigned long __init init_bootmem_core(bootmem_data_t *bdata,
 	unsigned long mapstart, unsigned long start, unsigned long end)
 {
-	bootmem_data_t *bdata = pgdat->bdata;
 	unsigned long mapsize;

 	bdata->node_bootmem_map = phys_to_virt(PFN_PHYS(mapstart));
@@ -314,11 +313,10 @@ found:
 	return ret;
 }

-static unsigned long __init free_all_bootmem_core(pg_data_t *pgdat)
+static unsigned long __init free_all_bootmem_core(bootmem_data_t *bdata)
 {
 	struct page *page;
 	unsigned long pfn;
-	bootmem_data_t *bdata = pgdat->bdata;
 	unsigned long i, count, total = 0;
 	unsigned long idx;
 	unsigned long *map;
@@ -384,7 +382,7 @@ static unsigned long __init free_all_boo
 unsigned long __init init_bootmem_node(pg_data_t *pgdat, unsigned long freepfn,
 				unsigned long startpfn, unsigned long endpfn)
 {
-	return init_bootmem_core(pgdat, freepfn, startpfn, endpfn);
+	return init_bootmem_core(pgdat->bdata, freepfn, startpfn, endpfn);
 }

 void __init reserve_bootmem_node(pg_data_t *pgdat, unsigned long physaddr,
@@ -401,14 +399,14 @@ void __init free_bootmem_node(pg_data_t

 unsigned long __init free_all_bootmem_node(pg_data_t *pgdat)
 {
-	return free_all_bootmem_core(pgdat);
+	return free_all_bootmem_core(pgdat->bdata);
 }

 unsigned long __init init_bootmem(unsigned long start, unsigned long pages)
 {
 	max_low_pfn = pages;
 	min_low_pfn = start;
-	return init_bootmem_core(NODE_DATA(0), start, 0, pages);
+	return init_bootmem_core(NODE_DATA(0)->bdata, start, 0, pages);
 }

 #ifndef CONFIG_HAVE_ARCH_BOOTMEM_NODE
@@ -451,7 +449,7 @@ void __init free_bootmem(unsigned long a

 unsigned long __init free_all_bootmem(void)
 {
-	return free_all_bootmem_core(NODE_DATA(0));
+	return free_all_bootmem_core(NODE_DATA(0)->bdata);
 }

 void * __init __alloc_bootmem_nopanic(unsigned long size, unsigned long align,
--
* [RFC][patch 5/5] mm: Move bootmem descriptors definition to a single place
2008-04-16 11:36 [RFC][patch 0/5] Bootmem fixes Johannes Weiner
` (3 preceding siblings ...)
2008-04-16 11:36 ` [RFC][patch 4/5] mm: Normalize internal argument passing of bootmem data Johannes Weiner
@ 2008-04-16 11:36 ` Johannes Weiner
2008-04-16 17:30 ` Ralf Baechle
2008-04-17 9:36 ` [RFC][patch 0/5] Bootmem fixes KAMEZAWA Hiroyuki
5 siblings, 1 reply; 16+ messages in thread
From: Johannes Weiner @ 2008-04-16 11:36 UTC (permalink / raw)
To: LKML
Cc: Linux MM, Ingo Molnar, Richard Henderson, Russell King, Tony Luck,
Hirokazu Takata, Geert Uytterhoeven, Ralf Baechle, Kyle McMartin,
Paul Mackerras, Paul Mundt
[-- Attachment #1: 0005-bootmem-Move-bootmem-descriptors-definition-to-a-si.patch --]
[-- Type: text/plain, Size: 14423 bytes --]
There are a lot of places that define either a single bootmem
descriptor or an array of them. Use only one central array with
MAX_NUMNODES items instead.
Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
CC: Ingo Molnar <mingo@elte.hu>
CC: Richard Henderson <rth@twiddle.net>
CC: Russell King <rmk@arm.linux.org.uk>
CC: Tony Luck <tony.luck@intel.com>
CC: Hirokazu Takata <takata@linux-m32r.org>
CC: Geert Uytterhoeven <geert@linux-m68k.org>
CC: Ralf Baechle <ralf@linux-mips.org>
CC: Kyle McMartin <kyle@parisc-linux.org>
CC: Paul Mackerras <paulus@samba.org>
CC: Paul Mundt <lethal@linux-sh.org>
---
arch/alpha/mm/numa.c | 8 ++++----
arch/arm/mm/discontig.c | 34 ++++++++++++++++------------------
arch/ia64/mm/discontig.c | 11 +++++------
arch/m32r/mm/discontig.c | 4 +---
arch/m68k/mm/init.c | 4 +---
arch/mips/sgi-ip27/ip27-memory.c | 3 +--
arch/parisc/mm/init.c | 3 +--
arch/powerpc/mm/numa.c | 3 +--
arch/sh/mm/numa.c | 5 ++---
arch/x86/mm/discontig_32.c | 3 +--
arch/x86/mm/numa_64.c | 4 +---
include/linux/bootmem.h | 2 ++
mm/bootmem.c | 2 ++
mm/page_alloc.c | 4 +---
14 files changed, 39 insertions(+), 51 deletions(-)
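The consolidation described in the changelog can be sketched in plain userspace C. This is only an illustration under stated assumptions: the structs are cut down to the fields needed here, MAX_NUMNODES is given an arbitrary value, and node_data and setup_node() are hypothetical names, not kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMNODES 4	/* illustrative value; the kernel derives it from config */

/* Simplified stand-ins for the kernel types; only the fields used here. */
typedef struct bootmem_data {
	unsigned long node_boot_start;
	unsigned long node_low_pfn;
} bootmem_data_t;

typedef struct pglist_data {
	bootmem_data_t *bdata;
} pg_data_t;

/* The single central array that replaces the per-arch private arrays. */
bootmem_data_t bootmem_node_data[MAX_NUMNODES];

/* Per-node descriptors, roughly as an architecture would keep them. */
static pg_data_t node_data[MAX_NUMNODES];

/* Hypothetical helper: wire a node to its slot in the central array. */
static pg_data_t *setup_node(int nid)
{
	assert(nid >= 0 && nid < MAX_NUMNODES);
	node_data[nid].bdata = &bootmem_node_data[nid];
	return &node_data[nid];
}
```

Every architecture hunk in the diff follows this shape: the private array definition goes away and the bdata pointer is set to the matching slot of bootmem_node_data.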
Index: tree-linus/arch/alpha/mm/numa.c
===================================================================
--- tree-linus.orig/arch/alpha/mm/numa.c
+++ tree-linus/arch/alpha/mm/numa.c
@@ -19,7 +19,6 @@
#include <asm/pgalloc.h>
pg_data_t node_data[MAX_NUMNODES];
-bootmem_data_t node_bdata[MAX_NUMNODES];
EXPORT_SYMBOL(node_data);
#undef DEBUG_DISCONTIG
@@ -141,7 +140,7 @@ setup_memory_node(int nid, void *kernel_
printk(" not enough mem to reserve NODE_DATA");
return;
}
- NODE_DATA(nid)->bdata = &node_bdata[nid];
+ NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
printk(" Detected node memory: start %8lu, end %8lu\n",
node_min_pfn, node_max_pfn);
@@ -304,8 +303,9 @@ void __init paging_init(void)
dma_local_pfn = virt_to_phys((char *)MAX_DMA_ADDRESS) >> PAGE_SHIFT;
for_each_online_node(nid) {
- unsigned long start_pfn = node_bdata[nid].node_boot_start >> PAGE_SHIFT;
- unsigned long end_pfn = node_bdata[nid].node_low_pfn;
+ bootmem_data_t *bdata = &bootmem_node_data[nid];
+ unsigned long start_pfn = bdata->node_boot_start >> PAGE_SHIFT;
+ unsigned long end_pfn = bdata->node_low_pfn;
if (dma_local_pfn >= end_pfn - start_pfn)
zones_size[ZONE_DMA] = end_pfn - start_pfn;
Index: tree-linus/arch/arm/mm/discontig.c
===================================================================
--- tree-linus.orig/arch/arm/mm/discontig.c
+++ tree-linus/arch/arm/mm/discontig.c
@@ -21,26 +21,24 @@
* Our node_data structure for discontiguous memory.
*/
-static bootmem_data_t node_bootmem_data[MAX_NUMNODES];
-
pg_data_t discontig_node_data[MAX_NUMNODES] = {
- { .bdata = &node_bootmem_data[0] },
- { .bdata = &node_bootmem_data[1] },
- { .bdata = &node_bootmem_data[2] },
- { .bdata = &node_bootmem_data[3] },
+ { .bdata = &bootmem_node_data[0] },
+ { .bdata = &bootmem_node_data[1] },
+ { .bdata = &bootmem_node_data[2] },
+ { .bdata = &bootmem_node_data[3] },
#if MAX_NUMNODES == 16
- { .bdata = &node_bootmem_data[4] },
- { .bdata = &node_bootmem_data[5] },
- { .bdata = &node_bootmem_data[6] },
- { .bdata = &node_bootmem_data[7] },
- { .bdata = &node_bootmem_data[8] },
- { .bdata = &node_bootmem_data[9] },
- { .bdata = &node_bootmem_data[10] },
- { .bdata = &node_bootmem_data[11] },
- { .bdata = &node_bootmem_data[12] },
- { .bdata = &node_bootmem_data[13] },
- { .bdata = &node_bootmem_data[14] },
- { .bdata = &node_bootmem_data[15] },
+ { .bdata = &bootmem_node_data[4] },
+ { .bdata = &bootmem_node_data[5] },
+ { .bdata = &bootmem_node_data[6] },
+ { .bdata = &bootmem_node_data[7] },
+ { .bdata = &bootmem_node_data[8] },
+ { .bdata = &bootmem_node_data[9] },
+ { .bdata = &bootmem_node_data[10] },
+ { .bdata = &bootmem_node_data[11] },
+ { .bdata = &bootmem_node_data[12] },
+ { .bdata = &bootmem_node_data[13] },
+ { .bdata = &bootmem_node_data[14] },
+ { .bdata = &bootmem_node_data[15] },
#endif
};
Index: tree-linus/arch/ia64/mm/discontig.c
===================================================================
--- tree-linus.orig/arch/ia64/mm/discontig.c
+++ tree-linus/arch/ia64/mm/discontig.c
@@ -36,7 +36,6 @@ struct early_node_data {
struct ia64_node_data *node_data;
unsigned long pernode_addr;
unsigned long pernode_size;
- struct bootmem_data bootmem_data;
unsigned long num_physpages;
#ifdef CONFIG_ZONE_DMA
unsigned long num_dma_physpages;
@@ -76,7 +75,7 @@ static int __init build_node_maps(unsign
int node)
{
unsigned long cstart, epfn, end = start + len;
- struct bootmem_data *bdp = &mem_data[node].bootmem_data;
+ struct bootmem_data *bdp = &bootmem_node_data[node];
epfn = GRANULEROUNDUP(end) >> PAGE_SHIFT;
cstart = GRANULEROUNDDOWN(start);
@@ -166,7 +165,7 @@ static void __init fill_pernode(int node
{
void *cpu_data;
int cpus = early_nr_cpus_node(node);
- struct bootmem_data *bdp = &mem_data[node].bootmem_data;
+ struct bootmem_data *bdp = &bootmem_node_data[node];
mem_data[node].pernode_addr = pernode;
mem_data[node].pernode_size = pernodesize;
@@ -223,7 +222,7 @@ static int __init find_pernode_space(uns
{
unsigned long epfn;
unsigned long pernodesize = 0, pernode, pages, mapsize;
- struct bootmem_data *bdp = &mem_data[node].bootmem_data;
+ struct bootmem_data *bdp = &bootmem_node_data[node];
epfn = (start + len) >> PAGE_SHIFT;
@@ -439,7 +438,7 @@ void __init find_memory(void)
efi_memmap_walk(find_max_min_low_pfn, NULL);
for_each_online_node(node)
- if (mem_data[node].bootmem_data.node_low_pfn) {
+ if (bootmem_node_data[node].node_low_pfn) {
node_clear(node, memory_less_mask);
mem_data[node].min_pfn = ~0UL;
}
@@ -459,7 +458,7 @@ void __init find_memory(void)
else if (node_isset(node, memory_less_mask))
continue;
- bdp = &mem_data[node].bootmem_data;
+ bdp = &bootmem_node_data[node];
pernode = mem_data[node].pernode_addr;
pernodesize = mem_data[node].pernode_size;
map = pernode + pernodesize;
Index: tree-linus/arch/m32r/mm/discontig.c
===================================================================
--- tree-linus.orig/arch/m32r/mm/discontig.c
+++ tree-linus/arch/m32r/mm/discontig.c
@@ -20,7 +20,6 @@ extern char _end[];
struct pglist_data *node_data[MAX_NUMNODES];
EXPORT_SYMBOL(node_data);
-static bootmem_data_t node_bdata[MAX_NUMNODES] __initdata;
pg_data_t m32r_node_data[MAX_NUMNODES];
@@ -81,7 +80,7 @@ unsigned long __init setup_memory(void)
for_each_online_node(nid) {
mp = &mem_prof[nid];
NODE_DATA(nid)=(pg_data_t *)&m32r_node_data[nid];
- NODE_DATA(nid)->bdata = &node_bdata[nid];
+ NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
min_pfn = mp->start_pfn;
max_pfn = mp->start_pfn + mp->pages;
bootmap_size = init_bootmem_node(NODE_DATA(nid), mp->free_pfn,
@@ -163,4 +162,3 @@ unsigned long __init zone_sizes_init(voi
return holes;
}
-
Index: tree-linus/arch/m68k/mm/init.c
===================================================================
--- tree-linus.orig/arch/m68k/mm/init.c
+++ tree-linus/arch/m68k/mm/init.c
@@ -32,8 +32,6 @@
DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
-static bootmem_data_t __initdata bootmem_data[MAX_NUMNODES];
-
pg_data_t pg_data_map[MAX_NUMNODES];
EXPORT_SYMBOL(pg_data_map);
@@ -58,7 +56,7 @@ void __init m68k_setup_node(int node)
pg_data_table[i] = pg_data_map + node;
}
#endif
- pg_data_map[node].bdata = bootmem_data + node;
+ pg_data_map[node].bdata = bootmem_node_data + node;
node_set_online(node);
}
Index: tree-linus/arch/mips/sgi-ip27/ip27-memory.c
===================================================================
--- tree-linus.orig/arch/mips/sgi-ip27/ip27-memory.c
+++ tree-linus/arch/mips/sgi-ip27/ip27-memory.c
@@ -37,7 +37,6 @@
static short __initdata slot_lastfilled_cache[MAX_COMPACT_NODES];
static unsigned short __initdata slot_psize_cache[MAX_COMPACT_NODES][MAX_MEM_SLOTS];
-static struct bootmem_data __initdata plat_node_bdata[MAX_COMPACT_NODES];
struct node_data *__node_data[MAX_COMPACT_NODES];
@@ -453,7 +452,7 @@ static void __init node_mem_init(cnodeid
__node_data[node] = __va(slot_freepfn << PAGE_SHIFT);
pd = NODE_DATA(node);
- pd->bdata = &plat_node_bdata[node];
+ pd->bdata = &bootmem_node_data[node];
cpus_clear(hub_data(node)->h_cpus);
Index: tree-linus/arch/parisc/mm/init.c
===================================================================
--- tree-linus.orig/arch/parisc/mm/init.c
+++ tree-linus/arch/parisc/mm/init.c
@@ -36,7 +36,6 @@ extern int data_start;
#ifdef CONFIG_DISCONTIGMEM
struct node_map_data node_data[MAX_NUMNODES] __read_mostly;
-bootmem_data_t bmem_data[MAX_NUMNODES] __read_mostly;
unsigned char pfnnid_map[PFNNID_MAP_MAX] __read_mostly;
#endif
@@ -262,7 +261,7 @@ static void __init setup_bootmem(void)
#ifdef CONFIG_DISCONTIGMEM
for (i = 0; i < MAX_PHYSMEM_RANGES; i++) {
memset(NODE_DATA(i), 0, sizeof(pg_data_t));
- NODE_DATA(i)->bdata = &bmem_data[i];
+ NODE_DATA(i)->bdata = &bootmem_node_data[i];
}
memset(pfnnid_map, 0xff, sizeof(pfnnid_map));
Index: tree-linus/arch/powerpc/mm/numa.c
===================================================================
--- tree-linus.orig/arch/powerpc/mm/numa.c
+++ tree-linus/arch/powerpc/mm/numa.c
@@ -37,7 +37,6 @@ EXPORT_SYMBOL(numa_cpu_lookup_table);
EXPORT_SYMBOL(numa_cpumask_lookup_table);
EXPORT_SYMBOL(node_data);
-static bootmem_data_t __initdata plat_node_bdata[MAX_NUMNODES];
static int min_common_depth;
static int n_mem_addr_cells, n_mem_size_cells;
@@ -683,7 +682,7 @@ void __init do_init_bootmem(void)
dbg("node %d\n", nid);
dbg("NODE_DATA() = %p\n", NODE_DATA(nid));
- NODE_DATA(nid)->bdata = &plat_node_bdata[nid];
+ NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
NODE_DATA(nid)->node_start_pfn = start_pfn;
NODE_DATA(nid)->node_spanned_pages = end_pfn - start_pfn;
Index: tree-linus/arch/sh/mm/numa.c
===================================================================
--- tree-linus.orig/arch/sh/mm/numa.c
+++ tree-linus/arch/sh/mm/numa.c
@@ -14,7 +14,6 @@
#include <linux/pfn.h>
#include <asm/sections.h>
-static bootmem_data_t plat_node_bdata[MAX_NUMNODES];
struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
EXPORT_SYMBOL_GPL(node_data);
@@ -35,7 +34,7 @@ void __init setup_memory(void)
NODE_DATA(0) = pfn_to_kaddr(free_pfn);
memset(NODE_DATA(0), 0, sizeof(struct pglist_data));
free_pfn += PFN_UP(sizeof(struct pglist_data));
- NODE_DATA(0)->bdata = &plat_node_bdata[0];
+ NODE_DATA(0)->bdata = &bootmem_node_data[0];
/* Set up node 0 */
setup_bootmem_allocator(free_pfn);
@@ -66,7 +65,7 @@ void __init setup_bootmem_node(int nid,
free_pfn += PFN_UP(sizeof(struct pglist_data));
memset(NODE_DATA(nid), 0, sizeof(struct pglist_data));
- NODE_DATA(nid)->bdata = &plat_node_bdata[nid];
+ NODE_DATA(nid)->bdata = &bootmem_node_data[nid];
NODE_DATA(nid)->node_start_pfn = start_pfn;
NODE_DATA(nid)->node_spanned_pages = end_pfn - start_pfn;
Index: tree-linus/arch/x86/mm/discontig_32.c
===================================================================
--- tree-linus.orig/arch/x86/mm/discontig_32.c
+++ tree-linus/arch/x86/mm/discontig_32.c
@@ -41,7 +41,6 @@
struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
EXPORT_SYMBOL(node_data);
-static bootmem_data_t node0_bdata;
/*
* numa interface - we expect the numa architecture specific code to have
@@ -382,7 +381,7 @@ unsigned long __init setup_memory(void)
find_max_pfn_node(nid);
memset(NODE_DATA(0), 0, sizeof(struct pglist_data));
- NODE_DATA(0)->bdata = &node0_bdata;
+ NODE_DATA(0)->bdata = &bootmem_node_data[0];
setup_bootmem_allocator();
return max_low_pfn;
}
Index: tree-linus/arch/x86/mm/numa_64.c
===================================================================
--- tree-linus.orig/arch/x86/mm/numa_64.c
+++ tree-linus/arch/x86/mm/numa_64.c
@@ -27,8 +27,6 @@
struct pglist_data *node_data[MAX_NUMNODES] __read_mostly;
EXPORT_SYMBOL(node_data);
-bootmem_data_t plat_node_bdata[MAX_NUMNODES];
-
struct memnode memnode;
int x86_cpu_to_node_map_init[NR_CPUS] = {
@@ -206,7 +204,7 @@ void __init setup_node_bootmem(int nodei
nodedata_phys + pgdat_size - 1);
memset(NODE_DATA(nodeid), 0, sizeof(pg_data_t));
- NODE_DATA(nodeid)->bdata = &plat_node_bdata[nodeid];
+ NODE_DATA(nodeid)->bdata = &bootmem_node_data[nodeid];
NODE_DATA(nodeid)->node_start_pfn = start_pfn;
NODE_DATA(nodeid)->node_spanned_pages = end_pfn - start_pfn;
Index: tree-linus/include/linux/bootmem.h
===================================================================
--- tree-linus.orig/include/linux/bootmem.h
+++ tree-linus/include/linux/bootmem.h
@@ -38,6 +38,8 @@ typedef struct bootmem_data {
struct list_head list;
} bootmem_data_t;
+extern bootmem_data_t bootmem_node_data[];
+
extern unsigned long bootmem_bootmap_pages(unsigned long);
extern unsigned long init_bootmem(unsigned long addr, unsigned long memend);
extern void free_bootmem(unsigned long addr, unsigned long size);
Index: tree-linus/mm/bootmem.c
===================================================================
--- tree-linus.orig/mm/bootmem.c
+++ tree-linus/mm/bootmem.c
@@ -19,6 +19,8 @@
#include "internal.h"
+bootmem_data_t bootmem_node_data[MAX_NUMNODES] __initdata;
+
/*
* Access to this subsystem has to be serialized externally. (this is
* true for the boot process anyway)
Index: tree-linus/mm/page_alloc.c
===================================================================
--- tree-linus.orig/mm/page_alloc.c
+++ tree-linus/mm/page_alloc.c
@@ -3960,9 +3960,7 @@ void __init set_dma_reserve(unsigned lon
}
#ifndef CONFIG_NEED_MULTIPLE_NODES
-static bootmem_data_t contig_bootmem_data;
-struct pglist_data contig_page_data = { .bdata = &contig_bootmem_data };
-
+struct pglist_data contig_page_data = { .bdata = &bootmem_node_data[0] };
EXPORT_SYMBOL(contig_page_data);
#endif
--
* Re: [RFC][patch 5/5] mm: Move bootmem descriptors definition to a single place
2008-04-16 11:36 ` [RFC][patch 5/5] mm: Move bootmem descriptors definition to a single place Johannes Weiner
@ 2008-04-16 17:30 ` Ralf Baechle
0 siblings, 0 replies; 16+ messages in thread
From: Ralf Baechle @ 2008-04-16 17:30 UTC (permalink / raw)
To: Johannes Weiner
Cc: LKML, Linux MM, Ingo Molnar, Richard Henderson, Russell King,
Tony Luck, Hirokazu Takata, Geert Uytterhoeven, Kyle McMartin,
Paul Mackerras, Paul Mundt
On Wed, Apr 16, 2008 at 01:36:34PM +0200, Johannes Weiner wrote:
> There are a lot of places that define either a single bootmem
> descriptor or an array of them. Use only one central array with
> MAX_NUMNODES items instead.
>
> Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
Acked-by: Ralf Baechle <ralf@linux-mips.org>
Ralf
* Re: [RFC][patch 0/5] Bootmem fixes
2008-04-16 11:36 [RFC][patch 0/5] Bootmem fixes Johannes Weiner
` (4 preceding siblings ...)
2008-04-16 11:36 ` [RFC][patch 5/5] mm: Move bootmem descriptors definition to a single place Johannes Weiner
@ 2008-04-17 9:36 ` KAMEZAWA Hiroyuki
2008-04-17 10:49 ` Johannes Weiner
5 siblings, 1 reply; 16+ messages in thread
From: KAMEZAWA Hiroyuki @ 2008-04-17 9:36 UTC (permalink / raw)
To: Johannes Weiner; +Cc: LKML, Linux MM
On Wed, 16 Apr 2008 13:36:29 +0200
Johannes Weiner <hannes@saeurebad.de> wrote:
> Hi,
>
> here are a bunch of fixes for the bootmem allocator. These are tested
> on boring x86_32 UMA hardware, but 3 patches only show their effects
> on multi-node systems, so please review and test.
>
> Only the first two patches are real code changes, the others are
> cleanups.
>
> `Node-setup agnostic free_bootmem()' assumes that all bootmem
> descriptors describe contiguous regions and bdata_list is in ascending
> order. Yinghai was unsure about this fact, Ingo could you ACK/NAK
> this?
>
Tested on an ia64/NUMA box with 2.6.25. Seems to be no problem.
Thanks,
-Kame
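The assumption quoted above (each descriptor covers one contiguous region, and bdata_list is in ascending order) is what would let a free routine find the owning node with a single forward walk. A minimal userspace sketch follows; struct bdata_sketch and find_owner() are hypothetical names, an array stands in for the list, and this is not the code from patch 2/5.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified descriptor: covers page frames [start_pfn, end_pfn). */
struct bdata_sketch {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/*
 * Hypothetical lookup: return the index of the descriptor that fully
 * contains [start, end), or -1 if none does. Relies on the array being
 * sorted by start_pfn so the walk can stop early.
 */
static int find_owner(const struct bdata_sketch *list, size_t n,
		      unsigned long start, unsigned long end)
{
	size_t i;

	assert(start < end);
	for (i = 0; i < n; i++) {
		if (start < list[i].start_pfn)
			break;	/* sorted: no later descriptor can contain start */
		if (end <= list[i].end_pfn)
			return (int)i;
	}
	return -1;
}
```

A range that straddles two descriptors, or falls into a hole between them, yields -1 here, so the caller can treat it as invalid input instead of silently clamping it.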
* Re: [RFC][patch 0/5] Bootmem fixes
2008-04-17 9:36 ` [RFC][patch 0/5] Bootmem fixes KAMEZAWA Hiroyuki
@ 2008-04-17 10:49 ` Johannes Weiner
0 siblings, 0 replies; 16+ messages in thread
From: Johannes Weiner @ 2008-04-17 10:49 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: LKML, Linux MM
Hi,
KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com> writes:
> On Wed, 16 Apr 2008 13:36:29 +0200
> Johannes Weiner <hannes@saeurebad.de> wrote:
>
>> Hi,
>>
>> here are a bunch of fixes for the bootmem allocator. These are tested
>> on boring x86_32 UMA hardware, but 3 patches only show their effects
>> on multi-node systems, so please review and test.
>>
>> Only the first two patches are real code changes, the others are
>> cleanups.
>>
>> `Node-setup agnostic free_bootmem()' assumes that all bootmem
>> descriptors describe contiguous regions and bdata_list is in ascending
>> order. Yinghai was unsure about this fact, Ingo could you ACK/NAK
>> this?
>>
> Tested on an ia64/NUMA box with 2.6.25. Seems to be no problem.
Cool, thanks a lot!
Hannes