* [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
@ 2008-04-02 22:25 Yinghai Lu
2008-04-02 22:52 ` Andrew Morton
2008-04-02 23:51 ` Badari Pulavarty
0 siblings, 2 replies; 11+ messages in thread
From: Yinghai Lu @ 2008-04-02 22:25 UTC (permalink / raw)
To: Andrew Morton, Ingo Molnar
Cc: kernel list, Kamalesh Babulal, linuxppc-dev, Badari Pulavarty,
Balbir Singh
[PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
On powerpc, 2.6.25-rc8-mm1 panics at boot. Quoting the report and Badari's analysis:
On Wed, Apr 2, 2008 at 12:22 PM, Badari Pulavarty <pbadari@us.ibm.com> wrote:
>
> On Wed, 2008-04-02 at 18:17 +1100, Michael Ellerman wrote:
> > On Wed, 2008-04-02 at 12:38 +0530, Kamalesh Babulal wrote:
> > > Andrew Morton wrote:
> > > > On Wed, 02 Apr 2008 11:55:36 +0530 Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> > > >
> > > >> Hi Andrew,
> > > >>
> > > >> The 2.6.25-rc8-mm1 kernel panic's while bootup on the power machine(s).
> > > >>
> > > >> [ 0.000000] ------------[ cut here ]------------
> > > >> [ 0.000000] kernel BUG at arch/powerpc/mm/init_64.c:240!
> > > >> [ 0.000000] Oops: Exception in kernel mode, sig: 5 [#1]
> > > >> [ 0.000000] SMP NR_CPUS=32 NUMA PowerMac
> > > >> [ 0.000000] Modules linked in:
> > > >> [ 0.000000] NIP: c0000000003d1dcc LR: c0000000003d1dc4 CTR: c00000000002b6ac
> > > >> [ 0.000000] REGS: c00000000049b960 TRAP: 0700 Not tainted (2.6.25-rc8-mm1-autokern1)
> > > >> [ 0.000000] MSR: 9000000000021032 <ME,IR,DR> CR: 44000088 XER: 20000000
> > > >> [ 0.000000] TASK = c0000000003f9c90[0] 'swapper' THREAD: c000000000498000 CPU: 0
> > > >> [ 0.000000] GPR00: c0000000003d1dc4 c00000000049bbe0 c0000000004989d0 0000000000000001
> > > >> [ 0.000000] GPR04: d59aca40f0000000 000000000b000000 0000000000000010 0000000000000000
> > > >> [ 0.000000] GPR08: 0000000000000004 0000000000000001 c00000027e520800 c0000000004bf0f0
> > > >> [ 0.000000] GPR12: c0000000004bf020 c0000000003fa900 0000000000000000 0000000000000000
> > > >> [ 0.000000] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > > >> [ 0.000000] GPR20: 0000000000000000 0000000000000000 0000000000000000 4000000001400000
> > > >> [ 0.000000] GPR24: 00000000017d64b0 c0000000003d6250 0000000000000000 c000000000504000
> > > >> [ 0.000000] GPR28: 0000000000000000 cf000000001f8000 0000000001000000 cf00000000000000
> > > >> [ 0.000000] NIP [c0000000003d1dcc] .vmemmap_populate+0xb8/0xf4
> > > >> [ 0.000000] LR [c0000000003d1dc4] .vmemmap_populate+0xb0/0xf4
> > > >> [ 0.000000] Call Trace:
> > > >> [ 0.000000] [c00000000049bbe0] [c0000000003d1dc4] .vmemmap_populate+0xb0/0xf4 (unreliable)
> > > >> [ 0.000000] [c00000000049bc70] [c0000000003d2ee8] .sparse_mem_map_populate+0x38/0x60
> > > >> [ 0.000000] [c00000000049bd00] [c0000000003c242c] .sparse_early_mem_map_alloc+0x54/0x94
> > > >> [ 0.000000] [c00000000049bd90] [c0000000003c250c] .sparse_init+0xa0/0x20c
> > > >> [ 0.000000] [c00000000049be50] [c0000000003ab7d0] .setup_arch+0x1ac/0x218
> > > >> [ 0.000000] [c00000000049bee0] [c0000000003a36ac] .start_kernel+0xe0/0x3fc
> > > >> [ 0.000000] [c00000000049bf90] [c000000000008594] .start_here_common+0x54/0xc0
> > > >> [ 0.000000] Instruction dump:
> > > >> [ 0.000000] 7fe3fb78 7ca02a14 4082000c 3860fff4 4800003c e92289c8 e96289c0 e9090002
> > > >> [ 0.000000] e8eb0002 4bc575cd 60000000 78630fe0 <0b030000> 7ffff214 7fbfe840 7fe3fb78
> > > >> [ 0.000000] ---[ end trace 31fd0ba7d8756001 ]---
> > > >> [ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
>
> mm-make-mem_map-allocation-continuous.patch
> and its friends in -mm.
>
> You have to call sparse_init_one_section() on each pmap and usemap
> as we allocate - since valid_section() depends on it (which is needed
> by vmemmap_populate() to check if the section is populated or not).
> On ppc, we need to call htab_bolted_mapping() on each section and
> we need to skip existing sections.
>
> These patches tried to group all allocations together and then later
> calls sparse_init_one_section() - which is not good :(
So allocate all the usemaps first, in one pass. Each mem_map can then be
allocated and passed to sparse_init_one_section() right away.
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
diff --git a/mm/sparse.c b/mm/sparse.c
index d3cb085..782ebe5 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -294,7 +294,7 @@ void __init sparse_init(void)
unsigned long pnum;
struct page *map;
unsigned long *usemap;
- struct page **section_map;
+ unsigned long **usemap_map;
int size;
int node;
@@ -305,27 +305,31 @@ void __init sparse_init(void)
* make next 2M slip to one more 2M later.
* then in big system, the memmory will have a lot hole...
* here try to allocate 2M pages continously.
+ *
+ * powerpc hope to sparse_init_one_section right after each
+ * sparse_early_mem_map_alloc, so allocate usemap_map
+ * at first.
*/
- size = sizeof(struct page *) * NR_MEM_SECTIONS;
- section_map = alloc_bootmem(size);
- if (!section_map)
- panic("can not allocate section_map\n");
+ size = sizeof(unsigned long *) * NR_MEM_SECTIONS;
+ usemap_map = alloc_bootmem(size);
+ if (!usemap_map)
+ panic("can not allocate usemap_map\n");
for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
if (!present_section_nr(pnum))
continue;
- section_map[pnum] = sparse_early_mem_map_alloc(pnum);
+ usemap_map[pnum] = sparse_early_usemap_alloc(pnum);
}
for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
if (!present_section_nr(pnum))
continue;
- map = section_map[pnum];
+ map = sparse_early_mem_map_alloc(pnum);
if (!map)
continue;
- usemap = sparse_early_usemap_alloc(pnum);
+ usemap = usemap_map[pnum];
if (!usemap)
continue;
@@ -333,7 +337,7 @@ void __init sparse_init(void)
usemap);
}
- free_bootmem(__pa(section_map), size);
+ free_bootmem(__pa(usemap_map), size);
}
#ifdef CONFIG_MEMORY_HOTPLUG
* Re: [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
2008-04-02 22:25 [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init Yinghai Lu
@ 2008-04-02 22:52 ` Andrew Morton
2008-04-03 0:44 ` Yinghai Lu
2008-04-02 23:51 ` Badari Pulavarty
1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2008-04-02 22:52 UTC (permalink / raw)
To: yhlu.kernel
Cc: Balbir Singh, Kamalesh Babulal, kernel list, Yinghai Lu, linuxppc-dev,
Badari Pulavarty, Ingo Molnar
On Wed, 2 Apr 2008 15:25:48 -0700 Yinghai Lu <yhlu.kernel.send@gmail.com> wrote:
> [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
>
> on powerpc,
>
> On Wed, Apr 2, 2008 at 12:22 PM, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> >
> > mm-make-mem_map-allocation-continuous.patch
> > and its friends in -mm.
> >
> > You have to call sparse_init_one_section() on each pmap and usemap
> > as we allocate - since valid_section() depends on it (which is needed
> > by vmemmap_populate() to check if the section is populated or not).
> > On ppc, we need to call htab_bolted_mapping() on each section and
> > we need to skip existing sections.
> >
> > These patches tried to group all allocations together and then later
> > calls sparse_init_one_section() - which is not good :(
>
> so try to allocate usemap at first altogether.
I have to turn all the above crud into a proper changelog. I'd prefer that
you do it.
Unless this patch should be folded into another one, in which case it
doesn't matter.
> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
>
> diff --git a/mm/sparse.c b/mm/sparse.c
> index d3cb085..782ebe5 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
We shouldn't merge this patch on its own because then that will leave a
non-bisectable region in the powerpc history.
So which patch is this patch fixing? Lexically it applies to
mm-allocate-section_map-for-sparse_init.patch (and its updates). But is
that where it logically lies?
* Re: [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
2008-04-02 22:25 [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init Yinghai Lu
2008-04-02 22:52 ` Andrew Morton
@ 2008-04-02 23:51 ` Badari Pulavarty
2008-04-03 0:47 ` Yinghai Lu
1 sibling, 1 reply; 11+ messages in thread
From: Badari Pulavarty @ 2008-04-02 23:51 UTC (permalink / raw)
To: yhlu.kernel
Cc: kernel list, Kamalesh Babulal, linuxppc-dev, Andrew Morton,
Ingo Molnar, Balbir Singh
On Wed, 2008-04-02 at 15:25 -0700, Yinghai Lu wrote:
> [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
> so try to allocate usemap at first altogether.
>
> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
>
> diff --git a/mm/sparse.c b/mm/sparse.c
> index d3cb085..782ebe5 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -294,7 +294,7 @@ void __init sparse_init(void)
> unsigned long pnum;
> struct page *map;
> unsigned long *usemap;
> - struct page **section_map;
> + unsigned long **usemap_map;
> int size;
> int node;
>
> @@ -305,27 +305,31 @@ void __init sparse_init(void)
> * make next 2M slip to one more 2M later.
> * then in big system, the memmory will have a lot hole...
> * here try to allocate 2M pages continously.
Comments are x86-64 specific. On ppc it's 16MB chunks :(
> + *
> + * powerpc hope to sparse_init_one_section right after each
> + * sparse_early_mem_map_alloc, so allocate usemap_map
> + * at first.
> */
> - size = sizeof(struct page *) * NR_MEM_SECTIONS;
> - section_map = alloc_bootmem(size);
> - if (!section_map)
> - panic("can not allocate section_map\n");
> + size = sizeof(unsigned long *) * NR_MEM_SECTIONS;
> + usemap_map = alloc_bootmem(size);
> + if (!usemap_map)
> + panic("can not allocate usemap_map\n");
>
> for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
> if (!present_section_nr(pnum))
> continue;
> - section_map[pnum] = sparse_early_mem_map_alloc(pnum);
> + usemap_map[pnum] = sparse_early_usemap_alloc(pnum);
> }
>
> for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
> if (!present_section_nr(pnum))
> continue;
>
> - map = section_map[pnum];
> + map = sparse_early_mem_map_alloc(pnum);
> if (!map)
> continue;
>
> - usemap = sparse_early_usemap_alloc(pnum);
> + usemap = usemap_map[pnum];
> if (!usemap)
> continue;
You may want to move this check before calling
sparse_early_mem_map_alloc(). We are also not handling errors properly
(freeing the unused map or usemap) if we "continue". I know the original
code is this way, but you touched it last :)
>
> @@ -333,7 +337,7 @@ void __init sparse_init(void)
> usemap);
> }
>
> - free_bootmem(__pa(section_map), size);
> + free_bootmem(__pa(usemap_map), size);
> }
>
> #ifdef CONFIG_MEMORY_HOTPLUG
Tested and boots my machine fine.
Acked-by: Badari Pulavarty <pbadari@us.ibm.com>
Thanks,
Badari
* Re: [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
2008-04-02 22:52 ` Andrew Morton
@ 2008-04-03 0:44 ` Yinghai Lu
2008-04-03 1:30 ` [PATCH] mm: make mem_map allocation continuous v2 Yinghai Lu
2008-04-03 1:43 ` [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init Yinghai Lu
0 siblings, 2 replies; 11+ messages in thread
From: Yinghai Lu @ 2008-04-03 0:44 UTC (permalink / raw)
To: Andrew Morton
Cc: kernel list, Kamalesh Babulal, Yinghai Lu, linuxppc-dev,
Badari Pulavarty, Ingo Molnar, Balbir Singh
On Wed, Apr 2, 2008 at 3:52 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Wed, 2 Apr 2008 15:25:48 -0700 Yinghai Lu <yhlu.kernel.send@gmail.com> wrote:
>
> > [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
> >
> > on powerpc,
> >
> > On Wed, Apr 2, 2008 at 12:22 PM, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > >
> > > mm-make-mem_map-allocation-continuous.patch
> > > and its friends in -mm.
> > >
> > > You have to call sparse_init_one_section() on each pmap and usemap
> > > as we allocate - since valid_section() depends on it (which is needed
> > > by vmemmap_populate() to check if the section is populated or not).
> > > On ppc, we need to call htab_bolted_mapping() on each section and
> > > we need to skip existing sections.
> > >
> > > These patches tried to group all allocations together and then later
> > > calls sparse_init_one_section() - which is not good :(
> >
> > so try to allocate usemap at first altogether.
>
> I have to turn all the above crud into a proper changelog. I'd prefer that
> you do it.
>
> Unless this patch should be folded into another one, in which case it
> doesn't matter.
>
>
> > Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
> >
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index d3cb085..782ebe5 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
>
> We shouldn't merge this patch on its own because then that will leave a
> non-bisectable region in the powerpc history.
>
> So which patch is this patch fixing? Lexically it applies to
> mm-allocate-section_map-for-sparse_init.patch (and its updates). But is
> that where it logically lies?
Yes. We should fold
mm-make-mem_map-allocation-continuous.patch
mm-allocate-section_map-for-sparse_init.patch
and this one
into one big patch (not really that big).
YH
* Re: [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
2008-04-02 23:51 ` Badari Pulavarty
@ 2008-04-03 0:47 ` Yinghai Lu
0 siblings, 0 replies; 11+ messages in thread
From: Yinghai Lu @ 2008-04-03 0:47 UTC (permalink / raw)
To: Badari Pulavarty
Cc: kernel list, Kamalesh Babulal, linuxppc-dev, Andrew Morton,
Ingo Molnar, Balbir Singh
On Wed, Apr 2, 2008 at 4:51 PM, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> On Wed, 2008-04-02 at 15:25 -0700, Yinghai Lu wrote:
> > [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
>
> > so try to allocate usemap at first altogether.
> >
> > Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
> >
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index d3cb085..782ebe5 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
> > @@ -294,7 +294,7 @@ void __init sparse_init(void)
> > unsigned long pnum;
> > struct page *map;
> > unsigned long *usemap;
> > - struct page **section_map;
> > + unsigned long **usemap_map;
> > int size;
> > int node;
> >
> > @@ -305,27 +305,31 @@ void __init sparse_init(void)
> > * make next 2M slip to one more 2M later.
> > * then in big system, the memmory will have a lot hole...
> > * here try to allocate 2M pages continously.
>
> Comments are x86-64 specific. On ppc its 16MB chunks :(
>
>
>
> > + *
> > + * powerpc hope to sparse_init_one_section right after each
> > + * sparse_early_mem_map_alloc, so allocate usemap_map
> > + * at first.
> > */
> > - size = sizeof(struct page *) * NR_MEM_SECTIONS;
> > - section_map = alloc_bootmem(size);
> > - if (!section_map)
> > - panic("can not allocate section_map\n");
> > + size = sizeof(unsigned long *) * NR_MEM_SECTIONS;
> > + usemap_map = alloc_bootmem(size);
> > + if (!usemap_map)
> > + panic("can not allocate usemap_map\n");
> >
> > for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
> > if (!present_section_nr(pnum))
> > continue;
> > - section_map[pnum] = sparse_early_mem_map_alloc(pnum);
> > + usemap_map[pnum] = sparse_early_usemap_alloc(pnum);
> > }
> >
> > for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
> > if (!present_section_nr(pnum))
> > continue;
> >
> > - map = section_map[pnum];
> > + map = sparse_early_mem_map_alloc(pnum);
> > if (!map)
> > continue;
> >
> > - usemap = sparse_early_usemap_alloc(pnum);
> > + usemap = usemap_map[pnum];
> > if (!usemap)
> > continue;
>
> You may want to move this check before doing sparse_early_mem_map_alloc
> (). We are also not handling errors properly (freeing up the unused
> map or usemap) if we "continue". I know the original code is this way,
> but you touched it last :)
Yes, that could avoid some leaks.
YH
* [PATCH] mm: make mem_map allocation continuous v2.
2008-04-03 0:44 ` Yinghai Lu
@ 2008-04-03 1:30 ` Yinghai Lu
2008-04-03 2:22 ` Andrew Morton
2008-04-03 3:22 ` Yasunori Goto
2008-04-03 1:43 ` [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init Yinghai Lu
1 sibling, 2 replies; 11+ messages in thread
From: Yinghai Lu @ 2008-04-03 1:30 UTC (permalink / raw)
To: Andrew Morton, Ingo Molnar
Cc: kernel list, Kamalesh Babulal, linuxppc-dev, Badari Pulavarty,
Balbir Singh
Currently the vmemmap allocation gives:
[ffffe20000000000-ffffe200001fffff] PMD ->ffff810001400000 on node 0
[ffffe20000200000-ffffe200003fffff] PMD ->ffff810001800000 on node 0
[ffffe20000400000-ffffe200005fffff] PMD ->ffff810001c00000 on node 0
[ffffe20000600000-ffffe200007fffff] PMD ->ffff810002000000 on node 0
[ffffe20000800000-ffffe200009fffff] PMD ->ffff810002400000 on node 0
...
There is a 2M hole between consecutive mappings.
The root cause is that a usemap (24 bytes) is allocated after every 2M
mem_map, which pushes the next vmemmap block (2M) to the next 2M boundary.
Solution:
allocate the mem_maps contiguously.
After the patch we get:
[ffffe20000000000-ffffe200001fffff] PMD ->ffff810001400000 on node 0
[ffffe20000200000-ffffe200003fffff] PMD ->ffff810001600000 on node 0
[ffffe20000400000-ffffe200005fffff] PMD ->ffff810001800000 on node 0
[ffffe20000600000-ffffe200007fffff] PMD ->ffff810001a00000 on node 0
[ffffe20000800000-ffffe200009fffff] PMD ->ffff810001c00000 on node 0
...
The usemaps now share pages as well, because they too are allocated contiguously:
sparse_early_usemap_alloc: usemap = ffff810024e00000 size = 24
sparse_early_usemap_alloc: usemap = ffff810024e00080 size = 24
sparse_early_usemap_alloc: usemap = ffff810024e00100 size = 24
sparse_early_usemap_alloc: usemap = ffff810024e00180 size = 24
...
So the bootmem allocations become more compact and the usemaps use less memory.
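To illustrate the arithmetic behind the two address listings above, here is a
minimal user-space sketch. It is not the bootmem allocator: it assumes a plain
bump allocator, and the names bump_alloc(), alloc_interleaved() and
alloc_grouped() are made up for this illustration. The 128-byte usemap
alignment is also an assumption, inferred from the 0x80 stride in the usemap
addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_SIZE     (2UL << 20)  /* one 2M mem_map block, as in the log */
#define MAP_ALIGN    (2UL << 20)  /* 2M alignment for each mem_map block */
#define USEMAP_SIZE  24UL         /* usemap size reported in the log */
#define USEMAP_ALIGN 128UL        /* assumed from the 0x80 stride in the log */

/* Toy bump allocator: returns the next aligned address past the cursor. */
struct bump {
    uint64_t cursor;
};

static uint64_t bump_alloc(struct bump *b, uint64_t size, uint64_t align)
{
    uint64_t addr;

    /* The rounding below only works for power-of-two alignments. */
    assert(align != 0 && (align & (align - 1)) == 0);
    addr = (b->cursor + align - 1) & ~(align - 1);
    b->cursor = addr + size;
    return addr;
}

/* Old order: mem_map, usemap, mem_map, usemap, ... */
static void alloc_interleaved(struct bump *b, uint64_t *maps, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        maps[i] = bump_alloc(b, MAP_SIZE, MAP_ALIGN);
        /* The 24-byte usemap lands right after the 2M block ... */
        (void)bump_alloc(b, USEMAP_SIZE, USEMAP_ALIGN);
        /* ... so the next 2M block must skip to the next 2M boundary. */
    }
}

/* New order: all usemaps first, then all mem_maps back to back. */
static void alloc_grouped(struct bump *b, uint64_t *maps, size_t n)
{
    for (size_t i = 0; i < n; i++)
        (void)bump_alloc(b, USEMAP_SIZE, USEMAP_ALIGN);
    for (size_t i = 0; i < n; i++)
        maps[i] = bump_alloc(b, MAP_SIZE, MAP_ALIGN);
}
```

Starting with the cursor at 0x1400000 and n = 3, alloc_interleaved() places
the mem_maps 4M apart (0x1400000, 0x1800000, 0x1c00000), while
alloc_grouped() places them 2M apart.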
For powerpc:
Badari Pulavarty <pbadari@us.ibm.com> wrote:
> You have to call sparse_init_one_section() on each pmap and usemap
> as we allocate - since valid_section() depends on it (which is needed
> by vmemmap_populate() to check if the section is populated or not).
> On ppc, we need to call htab_bolted_mapping() on each section and
> we need to skip existing sections.
So allocate all the usemaps first, then allocate each mem_map and
initialise its section right away.
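The ordering constraint Badari describes can be sketched with a toy model.
Everything here is hypothetical (the toy_* names, the section counts, the
bookkeeping); it is not the powerpc code. It only assumes that several
sections share one vmemmap chunk, that a chunk is skipped when one of its
sections is already initialised, and that mapping a chunk twice is the
failure case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_SECTIONS        8
#define SECTIONS_PER_CHUNK 4  /* assumed: sections sharing one vmemmap chunk */
#define NR_CHUNKS          (NR_SECTIONS / SECTIONS_PER_CHUNK)

struct toy_state {
    bool section_valid[NR_SECTIONS]; /* set by toy_init_one_section() */
    bool chunk_mapped[NR_CHUNKS];    /* chunk already has a mapping */
    int  double_maps;                /* chunk mapped again: the BUG case */
};

/* The "already populated?" check can only see initialised sections. */
static bool toy_chunk_populated(const struct toy_state *s, size_t chunk)
{
    for (size_t i = 0; i < SECTIONS_PER_CHUNK; i++)
        if (s->section_valid[chunk * SECTIONS_PER_CHUNK + i])
            return true;
    return false;
}

/* Stand-in for allocating and mapping the mem_map of one section. */
static void toy_mem_map_alloc(struct toy_state *s, size_t pnum)
{
    size_t chunk;

    assert(pnum < NR_SECTIONS);
    chunk = pnum / SECTIONS_PER_CHUNK;
    if (toy_chunk_populated(s, chunk))
        return;                      /* skip the existing mapping */
    if (s->chunk_mapped[chunk])
        s->double_maps++;            /* mapped twice: would hit the BUG */
    s->chunk_mapped[chunk] = true;
}

/* Stand-in for marking a section as initialised (valid). */
static void toy_init_one_section(struct toy_state *s, size_t pnum)
{
    s->section_valid[pnum] = true;
}

/* Deferred order: allocate every mem_map, initialise sections afterwards. */
static int toy_run_deferred(void)
{
    struct toy_state s = {0};

    for (size_t p = 0; p < NR_SECTIONS; p++)
        toy_mem_map_alloc(&s, p);
    for (size_t p = 0; p < NR_SECTIONS; p++)
        toy_init_one_section(&s, p);
    return s.double_maps;
}

/* Immediate order: initialise each section right after its mem_map. */
static int toy_run_immediate(void)
{
    struct toy_state s = {0};

    for (size_t p = 0; p < NR_SECTIONS; p++) {
        toy_mem_map_alloc(&s, p);
        toy_init_one_section(&s, p);
    }
    return s.double_maps;
}
```

In this model toy_run_deferred() counts repeated mappings because no section
is valid yet when the later ones are populated, while toy_run_immediate()
counts none.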
v2 replaces:
[PATCH] mm: make mem_map allocation continuous.
[PATCH] mm: allocate section_map for sparse_init
[PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
diff --git a/mm/sparse.c b/mm/sparse.c
index f6a43c0..2881222 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -294,22 +294,48 @@ void __init sparse_init(void)
unsigned long pnum;
struct page *map;
unsigned long *usemap;
+ unsigned long **usemap_map;
+ int size;
+
+ /*
+ * The mem_map uses big pages (2M on 64-bit x86), while a usemap
+ * is much smaller than one page (24 bytes in that configuration).
+ * Allocating 2M (with 2M alignment) and 24 bytes in turn makes
+ * each following 2M block slip to the next 2M boundary,
+ * so on a big system memory ends up with a lot of holes.
+ * Here we try to allocate the 2M blocks contiguously.
+ *
+ * powerpc needs to call sparse_init_one_section() right after each
+ * sparse_early_mem_map_alloc(), so allocate the usemaps first.
+ */
+ size = sizeof(unsigned long *) * NR_MEM_SECTIONS;
+ usemap_map = alloc_bootmem(size);
+ if (!usemap_map)
+ panic("can not allocate usemap_map\n");
for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
if (!present_section_nr(pnum))
continue;
+ usemap_map[pnum] = sparse_early_usemap_alloc(pnum);
+ }
- map = sparse_early_mem_map_alloc(pnum);
- if (!map)
+ for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
+ if (!present_section_nr(pnum))
continue;
- usemap = sparse_early_usemap_alloc(pnum);
+ usemap = usemap_map[pnum];
if (!usemap)
continue;
+ map = sparse_early_mem_map_alloc(pnum);
+ if (!map)
+ continue;
+
sparse_init_one_section(__nr_to_section(pnum), pnum, map,
usemap);
}
+
+ free_bootmem(__pa(usemap_map), size);
}
#ifdef CONFIG_MEMORY_HOTPLUG
* Re: [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
2008-04-03 0:44 ` Yinghai Lu
2008-04-03 1:30 ` [PATCH] mm: make mem_map allocation continuous v2 Yinghai Lu
@ 2008-04-03 1:43 ` Yinghai Lu
1 sibling, 0 replies; 11+ messages in thread
From: Yinghai Lu @ 2008-04-03 1:43 UTC (permalink / raw)
To: Andrew Morton
Cc: kernel list, Kamalesh Babulal, Yinghai Lu, linuxppc-dev,
Badari Pulavarty, Ingo Molnar, Balbir Singh
On Wed, Apr 2, 2008 at 5:44 PM, Yinghai Lu <yhlu.kernel@gmail.com> wrote:
>
> On Wed, Apr 2, 2008 at 3:52 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> >
> > On Wed, 2 Apr 2008 15:25:48 -0700 Yinghai Lu <yhlu.kernel.send@gmail.com> wrote:
> >
> > > [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
> > >
> > > on powerpc,
> > >
> > > On Wed, Apr 2, 2008 at 12:22 PM, Badari Pulavarty <pbadari@us.ibm.com> wrote:
> > > >
> > > > On Wed, 2008-04-02 at 18:17 +1100, Michael Ellerman wrote:
> > > > > On Wed, 2008-04-02 at 12:38 +0530, Kamalesh Babulal wrote:
> > > > > > Andrew Morton wrote:
> > > > > > > On Wed, 02 Apr 2008 11:55:36 +0530 Kamalesh Babulal <kamalesh@linux.vnet.ibm.com> wrote:
> > > so try to allocate usemap at first altogether.
> >
> > I have to turn all the above crud into a proper changelog. I'd prefer that
> > you do it.
> >
> > Unless this patch should be folded into another one, in which case it
> > doesn't matter.
> >
> >
> > > Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
> > >
> > > diff --git a/mm/sparse.c b/mm/sparse.c
> > > index d3cb085..782ebe5 100644
> > > --- a/mm/sparse.c
> > > +++ b/mm/sparse.c
> >
> > We shouldn't merge this patch on its own because then that will leave a
> > non-bisectable region in the powerpc history.
> >
> > So which patch is this patch fixing? Lexically it applies to
> > mm-allocate-section_map-for-sparse_init.patch (and its updates). But is
> > that where it logically lies?
>
> yes. we should fold
>
>
> mm-make-mem_map-allocation-continuous.patch
>
> mm-allocate-section_map-for-sparse_init.patch
> and this one
>
Please check the combined patch:
http://lkml.org/lkml/2008/4/2/650
YH
* Re: [PATCH] mm: make mem_map allocation continuous v2.
2008-04-03 1:30 ` [PATCH] mm: make mem_map allocation continuous v2 Yinghai Lu
@ 2008-04-03 2:22 ` Andrew Morton
2008-04-03 4:16 ` Yinghai Lu
2008-04-03 3:22 ` Yasunori Goto
1 sibling, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2008-04-03 2:22 UTC (permalink / raw)
To: yhlu.kernel
Cc: Balbir Singh, Kamalesh Babulal, kernel list, Yinghai Lu, linuxppc-dev,
Badari Pulavarty, Ingo Molnar
On Wed, 2 Apr 2008 18:30:24 -0700 Yinghai Lu <yhlu.kernel.send@gmail.com> wrote:
> v2 replace:
> [PATCH] mm: make mem_map allocation continuous.
> [PATCH] mm: allocate section_map for sparse_init
> [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
>
err, no.
>
> diff --git a/mm/sparse.c b/mm/sparse.c
> index f6a43c0..2881222 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
Sorry, but I'd rather not do it this way. We presently have this:
mm-make-mem_map-allocation-continuous.patch
mm-make-mem_map-allocation-continuous-checkpatch-fixes.patch
mm-fix-alloc_bootmem_core-to-use-fast-searching-for-all-nodes.patch
mm-allocate-section_map-for-sparse_init.patch
mm-allocate-section_map-for-sparse_init-update.patch
mm-allocate-section_map-for-sparse_init-update-fix.patch
mm-allocate-section_map-for-sparse_init-powerpc-fix.patch
mm-offset-align-in-alloc_bootmem.patch
mm-make-reserve_bootmem-can-crossed-the-nodes.patch
mm-make-reserve_bootmem-can-crossed-the-nodes-checkpatch-fixes.patch
and you purport to throw some of them away and combine them into a single
patch? We assume that the later patches will still apply and work on top
of this newer patch? It is up to me to check that the replacement patch
incorporates the third-party changes to the original patches?
Too hard, too risky. Can't we just do a fix against 2.6.25-rc8-mm1?
* Re: [PATCH] mm: make mem_map allocation continuous v2.
2008-04-03 1:30 ` [PATCH] mm: make mem_map allocation continuous v2 Yinghai Lu
2008-04-03 2:22 ` Andrew Morton
@ 2008-04-03 3:22 ` Yasunori Goto
1 sibling, 0 replies; 11+ messages in thread
From: Yasunori Goto @ 2008-04-03 3:22 UTC (permalink / raw)
To: yhlu.kernel
Cc: kernel list, Kamalesh Babulal, linuxppc-dev, Badari Pulavarty,
Andrew Morton, Ingo Molnar, Balbir Singh
Looks good to me. And ia64 boots up with this patch too.
Thanks.
Acked-by: Yasunori Goto <y-goto@jp.fujitsu.com>
>
> vmemmap allocation current got
> [ffffe20000000000-ffffe200001fffff] PMD ->ffff810001400000 on node 0
> [ffffe20000200000-ffffe200003fffff] PMD ->ffff810001800000 on node 0
> [ffffe20000400000-ffffe200005fffff] PMD ->ffff810001c00000 on node 0
> [ffffe20000600000-ffffe200007fffff] PMD ->ffff810002000000 on node 0
> [ffffe20000800000-ffffe200009fffff] PMD ->ffff810002400000 on node 0
> ...
>
> there is 2M hole between them.
>
> the rootcause is that usemap (24 bytes) will be allocated after every 2M
> mem_map. and it will push next vmemmap (2M) to next align (2M).
>
> solution:
> try to allocate mem_map continously.
>
> after patch, will get
> [ffffe20000000000-ffffe200001fffff] PMD ->ffff810001400000 on node 0
> [ffffe20000200000-ffffe200003fffff] PMD ->ffff810001600000 on node 0
> [ffffe20000400000-ffffe200005fffff] PMD ->ffff810001800000 on node 0
> [ffffe20000600000-ffffe200007fffff] PMD ->ffff810001a00000 on node 0
> [ffffe20000800000-ffffe200009fffff] PMD ->ffff810001c00000 on node 0
> ...
> and usemap will share in page because of they are allocated continuously too.
> sparse_early_usemap_alloc: usemap = ffff810024e00000 size = 24
> sparse_early_usemap_alloc: usemap = ffff810024e00080 size = 24
> sparse_early_usemap_alloc: usemap = ffff810024e00100 size = 24
> sparse_early_usemap_alloc: usemap = ffff810024e00180 size = 24
> ...
>
> so we make the bootmem allocation more compact and use less memory for usemap.
>
> for power pc
> Badari Pulavarty <pbadari@us.ibm.com> wrote:
>
> > You have to call sparse_init_one_section() on each pmap and usemap
> > as we allocate - since valid_section() depends on it (which is needed
> > by vmemmap_populate() to check if the section is populated or not).
> > On ppc, we need to call htab_bolted_mapping() on each section and
> > we need to skip existing sections.
>
> so allocate all the usemaps first, before any mem_map.
>
> v2 replace:
> [PATCH] mm: make mem_map allocation continuous.
> [PATCH] mm: allocate section_map for sparse_init
> [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
>
> Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
>
> diff --git a/mm/sparse.c b/mm/sparse.c
> index f6a43c0..2881222 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -294,22 +294,48 @@ void __init sparse_init(void)
> unsigned long pnum;
> struct page *map;
> unsigned long *usemap;
> + unsigned long **usemap_map;
> + int size;
> +
> + /*
> + * map is using big page (aka 2M in x86 64 bit)
> + * usemap is less one page (aka 24 bytes)
> + * so alloc 2M (with 2M align) and 24 bytes in turn will
> + * make next 2M slip to one more 2M later.
> + * then in big system, the memory will have a lot of holes...
> + * here try to allocate 2M pages continously.
> + *
> + * powerpc need to call sparse_init_one_section right after each
> + * sparse_early_mem_map_alloc, so allocate usemap_map at first.
> + */
> + size = sizeof(unsigned long *) * NR_MEM_SECTIONS;
> + usemap_map = alloc_bootmem(size);
> + if (!usemap_map)
> + panic("can not allocate usemap_map\n");
>
> for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
> if (!present_section_nr(pnum))
> continue;
> + usemap_map[pnum] = sparse_early_usemap_alloc(pnum);
> + }
>
> - map = sparse_early_mem_map_alloc(pnum);
> - if (!map)
> + for (pnum = 0; pnum < NR_MEM_SECTIONS; pnum++) {
> + if (!present_section_nr(pnum))
> continue;
>
> - usemap = sparse_early_usemap_alloc(pnum);
> + usemap = usemap_map[pnum];
> if (!usemap)
> continue;
>
> + map = sparse_early_mem_map_alloc(pnum);
> + if (!map)
> + continue;
> +
> sparse_init_one_section(__nr_to_section(pnum), pnum, map,
> usemap);
> }
> +
> + free_bootmem(__pa(usemap_map), size);
> }
>
> #ifdef CONFIG_MEMORY_HOTPLUG
--
Yasunori Goto
^ permalink raw reply [flat|nested] 11+ messages in thread
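The hole described in the quoted changelog can be reproduced with a toy bump allocator. What follows is a minimal sketch, not kernel code: `bump_alloc`, `layout_interleaved` and `layout_grouped` are hypothetical names, and the allocator is assumed to do nothing more than round the current offset up to the requested alignment. The 128-byte usemap alignment is also an assumption, taken from the 0x80 stride visible in the `sparse_early_usemap_alloc` log lines.

```c
#include <assert.h>
#include <stdint.h>

#define MAP_SIZE     ((uint64_t)2 << 20) /* one 2M mem_map chunk, 2M aligned */
#define USEMAP_SIZE  ((uint64_t)24)      /* usemap size from the changelog */
#define USEMAP_ALIGN ((uint64_t)128)     /* assumed from the 0x80 stride in the log */

/* Toy bump allocator: round *cursor up to align, then advance it by size. */
uint64_t bump_alloc(uint64_t *cursor, uint64_t size, uint64_t align)
{
    assert(align && (align & (align - 1)) == 0); /* power of two only */
    uint64_t start = (*cursor + align - 1) & ~(align - 1);
    *cursor = start + size;
    return start;
}

/*
 * Old order: mem_map, then usemap, for each section in turn.
 * Stores each mem_map offset in map_addr[] and returns the end offset.
 */
uint64_t layout_interleaved(int nsections, uint64_t *map_addr)
{
    uint64_t cursor = 0;
    for (int i = 0; i < nsections; i++) {
        map_addr[i] = bump_alloc(&cursor, MAP_SIZE, MAP_SIZE);
        bump_alloc(&cursor, USEMAP_SIZE, USEMAP_ALIGN);
    }
    return cursor;
}

/*
 * New order: all usemaps first, then all mem_maps.
 * Stores each mem_map offset in map_addr[] and returns the end offset.
 */
uint64_t layout_grouped(int nsections, uint64_t *map_addr)
{
    uint64_t cursor = 0;
    for (int i = 0; i < nsections; i++)
        bump_alloc(&cursor, USEMAP_SIZE, USEMAP_ALIGN);
    for (int i = 0; i < nsections; i++)
        map_addr[i] = bump_alloc(&cursor, MAP_SIZE, MAP_SIZE);
    return cursor;
}
```

In this model, the interleaved order places consecutive mem_map chunks 4M apart, because each 24-byte usemap pushes the next 2M-aligned allocation a full 2M further on. The grouped order places them 2M apart, which matches the before and after PMD listings in the changelog.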
* Re: [PATCH] mm: make mem_map allocation continuous v2.
2008-04-03 2:22 ` Andrew Morton
@ 2008-04-03 4:16 ` Yinghai Lu
2008-04-03 10:49 ` Kamalesh Babulal
0 siblings, 1 reply; 11+ messages in thread
From: Yinghai Lu @ 2008-04-03 4:16 UTC (permalink / raw)
To: Andrew Morton
Cc: kernel list, Kamalesh Babulal, Yinghai Lu, linuxppc-dev,
Badari Pulavarty, Ingo Molnar, Balbir Singh
On Wed, Apr 2, 2008 at 7:22 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
> On Wed, 2 Apr 2008 18:30:24 -0700 Yinghai Lu <yhlu.kernel.send@gmail.com> wrote:
>
> > v2 replace:
> > [PATCH] mm: make mem_map allocation continuous.
> > [PATCH] mm: allocate section_map for sparse_init
> > [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
> >
>
> err, no.
>
>
> >
> > diff --git a/mm/sparse.c b/mm/sparse.c
> > index f6a43c0..2881222 100644
> > --- a/mm/sparse.c
> > +++ b/mm/sparse.c
>
> Sorry, but I'd rather not do it this way. We presently have this:
>
It replaces:
> mm-make-mem_map-allocation-continuous.patch
> mm-make-mem_map-allocation-continuous-checkpatch-fixes.patch
> mm-allocate-section_map-for-sparse_init.patch
> mm-allocate-section_map-for-sparse_init-update.patch
> mm-allocate-section_map-for-sparse_init-update-fix.patch
> mm-allocate-section_map-for-sparse_init-powerpc-fix.patch
The others are still needed,
so that mm-make-mem_map-allocation-continuous.patch will not break powerpc and ia64.
YH
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH] mm: make mem_map allocation continuous v2.
2008-04-03 4:16 ` Yinghai Lu
@ 2008-04-03 10:49 ` Kamalesh Babulal
0 siblings, 0 replies; 11+ messages in thread
From: Kamalesh Babulal @ 2008-04-03 10:49 UTC (permalink / raw)
To: Yinghai Lu
Cc: kernel list, Yinghai Lu, linuxppc-dev, Badari Pulavarty,
Andrew Morton, Ingo Molnar, Balbir Singh
Yinghai Lu wrote:
> On Wed, Apr 2, 2008 at 7:22 PM, Andrew Morton <akpm@linux-foundation.org> wrote:
>> On Wed, 2 Apr 2008 18:30:24 -0700 Yinghai Lu <yhlu.kernel.send@gmail.com> wrote:
>>
>> > v2 replace:
>> > [PATCH] mm: make mem_map allocation continuous.
>> > [PATCH] mm: allocate section_map for sparse_init
>> > [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init
>> >
>>
>> err, no.
>>
>>
>> >
>> > diff --git a/mm/sparse.c b/mm/sparse.c
>> > index f6a43c0..2881222 100644
>> > --- a/mm/sparse.c
>> > +++ b/mm/sparse.c
>>
>> Sorry, but I'd rather not do it this way. We presently have this:
>>
>
> it replaces
>
>> mm-make-mem_map-allocation-continuous.patch
>> mm-make-mem_map-allocation-continuous-checkpatch-fixes.patch
>> mm-allocate-section_map-for-sparse_init.patch
>> mm-allocate-section_map-for-sparse_init-update.patch
>> mm-allocate-section_map-for-sparse_init-update-fix.patch
>> mm-allocate-section_map-for-sparse_init-powerpc-fix.patch
>
> others still needed
>
> so mm-make-mem-map-allocation-continuous.patch will not break powerpc and ia64
>
> YH
Hi,
Thanks, the patch fixes the issue. I am able to boot up without the kernel panic.
Tested-by: Kamalesh Babulal <kamalesh@linux.vnet.ibm.com>
--
Thanks & Regards,
Kamalesh Babulal,
Linux Technology Center,
IBM, ISTL.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2008-04-03 10:49 UTC | newest]
Thread overview: 11+ messages
2008-04-02 22:25 [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init Yinghai Lu
2008-04-02 22:52 ` Andrew Morton
2008-04-03 0:44 ` Yinghai Lu
2008-04-03 1:30 ` [PATCH] mm: make mem_map allocation continuous v2 Yinghai Lu
2008-04-03 2:22 ` Andrew Morton
2008-04-03 4:16 ` Yinghai Lu
2008-04-03 10:49 ` Kamalesh Babulal
2008-04-03 3:22 ` Yasunori Goto
2008-04-03 1:43 ` [PATCH] mm: allocate usemap at first instead of mem_map in sparse_init Yinghai Lu
2008-04-02 23:51 ` Badari Pulavarty
2008-04-03 0:47 ` Yinghai Lu