* Config NO_BOOTMEM breaks my amd64 box
@ 2010-03-31 4:49 James Morris
2010-03-31 6:26 ` H. Peter Anvin
2010-03-31 10:51 ` Config NO_BOOTMEM breaks my amd64 box Stefan Richter
0 siblings, 2 replies; 46+ messages in thread
From: James Morris @ 2010-03-31 4:49 UTC (permalink / raw)
To: Ingo Molnar; +Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied
Please make NO_BOOTMEM default to n, at least for amd64, where I've found
that it leads to all kinds of strange, undebuggable boot hangs and errors
(with relatively current Fedora development userland).
Also, the help text for the item makes little sense to a non-expert in
this area:
" ---help---
Use early_res directly instead of bootmem before slab is ready.
- allocator (buddy) [generic]
- early allocator (bootmem) [generic]
- very early allocator (reserve_early*()) [x86]
- very very early allocator (early brk model) [x86]
So reduce one layer between early allocator to final allocator."
I had no idea what all this meant, so trusted the default=y and then spent
several hours wondering why everything was breaking, and would likely not
have figured it out in linear time without a suggestion from Dave Airlie.
- James
--
James Morris
<jmorris@namei.org>
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 4:49 Config NO_BOOTMEM breaks my amd64 box James Morris
@ 2010-03-31 6:26 ` H. Peter Anvin
2010-03-31 6:47 ` James Morris
2010-03-31 10:51 ` Config NO_BOOTMEM breaks my amd64 box Stefan Richter
1 sibling, 1 reply; 46+ messages in thread
From: H. Peter Anvin @ 2010-03-31 6:26 UTC (permalink / raw)
To: James Morris; +Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied
On 03/30/2010 09:49 PM, James Morris wrote:
> Please make NO_BOOTMEM default to n, at least for amd64, where I've found
> that it leads to all kinds of strange, undebuggable boot hangs and errors
> (with relatively current Fedora development userland).
Have you tested it with the latest fixes that are now in Linus' tree (-rc3)?
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 6:26 ` H. Peter Anvin
@ 2010-03-31 6:47 ` James Morris
2010-03-31 16:25 ` Yinghai Lu
` (2 more replies)
0 siblings, 3 replies; 46+ messages in thread
From: James Morris @ 2010-03-31 6:47 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied
On Tue, 30 Mar 2010, H. Peter Anvin wrote:
> On 03/30/2010 09:49 PM, James Morris wrote:
> > Please make NO_BOOTMEM default to n, at least for amd64, where I've found
> > that it leads to all kinds of strange, undebuggable boot hangs and errors
> > (with relatively current Fedora development userland).
>
> Have you tested it with the latest fixes that are now in Linus' tree (-rc3)?
Yes, it was happening with -rc3.
--
James Morris
<jmorris@namei.org>
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 4:49 Config NO_BOOTMEM breaks my amd64 box James Morris
2010-03-31 6:26 ` H. Peter Anvin
@ 2010-03-31 10:51 ` Stefan Richter
1 sibling, 0 replies; 46+ messages in thread
From: Stefan Richter @ 2010-03-31 10:51 UTC (permalink / raw)
To: James Morris
Cc: Ingo Molnar, H. Peter Anvin, Yinghai Lu, linux-kernel, airlied
James Morris wrote:
> Also, the help text for the item makes little sense to a non-expert in
> this area:
I too noticed this absolutely catastrophic "help" text but forgot to
send a bug report.
Either this option can be explained and the text fixed, or it cannot be
explained and shouldn't be an option in the first place.
--
Stefan Richter
-=====-==-=- --== =====
http://arcgraph.de/sr/
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 6:47 ` James Morris
@ 2010-03-31 16:25 ` Yinghai Lu
2010-03-31 18:59 ` Ingo Molnar
2010-03-31 22:05 ` Yinghai Lu
2 siblings, 0 replies; 46+ messages in thread
From: Yinghai Lu @ 2010-03-31 16:25 UTC (permalink / raw)
To: James Morris; +Cc: H. Peter Anvin, Ingo Molnar, linux-kernel, airlied
On 03/30/2010 11:47 PM, James Morris wrote:
> On Tue, 30 Mar 2010, H. Peter Anvin wrote:
>
>> On 03/30/2010 09:49 PM, James Morris wrote:
>>> Please make NO_BOOTMEM default to n, at least for amd64, where I've found
>>> that it leads to all kinds of strange, undebuggable boot hangs and errors
>>> (with relatively current Fedora development userland).
>>
>> Have you tested it with the latest fixes that are now in Linus' tree (-rc3)?
>
> Yes, it was happening with -rc3.
Please send a bootlog if possible.
BTW please try
git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-2.6-yinghai.git
Thanks
Yinghai
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 6:47 ` James Morris
2010-03-31 16:25 ` Yinghai Lu
@ 2010-03-31 18:59 ` Ingo Molnar
2010-03-31 20:57 ` Dave Airlie
` (2 more replies)
2010-03-31 22:05 ` Yinghai Lu
2 siblings, 3 replies; 46+ messages in thread
From: Ingo Molnar @ 2010-03-31 18:59 UTC (permalink / raw)
To: James Morris
Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
* James Morris <jmorris@namei.org> wrote:
> On Tue, 30 Mar 2010, H. Peter Anvin wrote:
>
> > On 03/30/2010 09:49 PM, James Morris wrote:
> > >
> > > Please make NO_BOOTMEM default to n, at least for amd64, where I've found
> > > that it leads to all kinds of strange, undebuggable boot hangs and errors
> > > (with relatively current Fedora development userland).
> >
> > Have you tested it with the latest fixes that are now in Linus' tree (-rc3)?
>
> Yes, it was happening with -rc3.
Could you please send the bootlog that Yinghai asked for, plus also one that
you get with NO_BOOTMEM turned off (for comparison)?
Also, when did you first hit this bug? This code has been upstream for almost
a month, and it was in linux-next before that - so you should have hit this
much sooner. A rough timeframe would suffice. I suppose you were booting
upstream kernels during the merge window as well?
We can flip the default around if there's no fix available based on the
bootlogs. (Plus the help text should definitely be improved.)
Thanks,
Ingo
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 18:59 ` Ingo Molnar
@ 2010-03-31 20:57 ` Dave Airlie
2010-03-31 21:02 ` Linus Torvalds
2010-03-31 21:47 ` Ingo Molnar
2010-03-31 21:14 ` Dave Airlie
2010-03-31 22:58 ` James Morris
2 siblings, 2 replies; 46+ messages in thread
From: Dave Airlie @ 2010-03-31 20:57 UTC (permalink / raw)
To: Ingo Molnar
Cc: James Morris, H. Peter Anvin, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
On Thu, Apr 1, 2010 at 4:59 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * James Morris <jmorris@namei.org> wrote:
>
>> On Tue, 30 Mar 2010, H. Peter Anvin wrote:
>>
>> > On 03/30/2010 09:49 PM, James Morris wrote:
>> > >
>> > > Please make NO_BOOTMEM default to n, at least for amd64, where I've found
>> > > that it leads to all kinds of strange, undebuggable boot hangs and errors
>> > > (with relatively current Fedora development userland).
>> >
>> > Have you tested it with the latest fixes that are now in Linus' tree (-rc3)?
>>
>> Yes, it was happening with -rc3.
>
> Could you please send the bootlog that Yinghai asked for, plus also one that
> you get with NO_BOOTMEM turned off (for comparison)?
>
> Also, when did you first hit this bug? This code has been upstream for almost
> a month, and it was in linux-next before that - so you should have hit this
> much sooner. A rough timeframe would suffice. I suppose you were booting
> upstream kernels during the merge window as well?
A default y config option causing regressions still at rc3? and you guys
keep going? This is the sort of shit Linus would flame me for a day or two for,
Can we get some f'ing consistency here?
Dave.
>
> We can flip the default around if there's no fix available based on the
> bootlogs. (Plus the help text should definitely be improved.)
>
> Thanks,
>
> Ingo
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 20:57 ` Dave Airlie
@ 2010-03-31 21:02 ` Linus Torvalds
2010-03-31 21:40 ` Ingo Molnar
2010-03-31 21:47 ` Ingo Molnar
1 sibling, 1 reply; 46+ messages in thread
From: Linus Torvalds @ 2010-03-31 21:02 UTC (permalink / raw)
To: Dave Airlie
Cc: Ingo Molnar, James Morris, H. Peter Anvin, Yinghai Lu,
linux-kernel, airlied, Thomas Gleixner, Pekka Enberg
On Thu, 1 Apr 2010, Dave Airlie wrote:
>
> A default y config option causing regressions still at rc3? and you guys
> keep going? This is the sort of shit Linus would flame me for a day or two for,
>
> Can we get some f'ing consistency here?
Yeah. I think we need to remove the crap.
I thought the problems were known, and fixed in -rc3. Clearly they
weren't. And by now it's not about changing the default any more - by now
it's about removing the known-crap code.
Linus
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 18:59 ` Ingo Molnar
2010-03-31 20:57 ` Dave Airlie
@ 2010-03-31 21:14 ` Dave Airlie
2010-03-31 22:02 ` Yinghai Lu
2010-03-31 22:28 ` H. Peter Anvin
2010-03-31 22:58 ` James Morris
2 siblings, 2 replies; 46+ messages in thread
From: Dave Airlie @ 2010-03-31 21:14 UTC (permalink / raw)
To: Ingo Molnar
Cc: James Morris, H. Peter Anvin, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
On Thu, Apr 1, 2010 at 4:59 AM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * James Morris <jmorris@namei.org> wrote:
>
>> On Tue, 30 Mar 2010, H. Peter Anvin wrote:
>>
>> > On 03/30/2010 09:49 PM, James Morris wrote:
>> > >
>> > > Please make NO_BOOTMEM default to n, at least for amd64, where I've found
>> > > that it leads to all kinds of strange, undebuggable boot hangs and errors
>> > > (with relatively current Fedora development userland).
>> >
>> > Have you tested it with the latest fixes that are now in Linus' tree (-rc3)?
>>
>> Yes, it was happening with -rc3.
>
> Could you please send the bootlog that Yinghai asked for, plus also one that
> you get with NO_BOOTMEM turned off (for comparison)?
>
> Also, when did you first hit this bug? This code has been upstream for almost
> a month, and it was in linux-next before that - so you should have hit this
> much sooner. A rough timeframe would suffice. I suppose you were booting
> upstream kernels during the merge window as well?
>
> We can flip the default around if there's no fix available based on the
> bootlogs. (Plus the help text should definitely be improved.)
>
Are you testing this btw with initramfs/initrds? I suspect lots of testing
is being done by people on monolithic kernels, this is just a misc guess,
considering I couldn't boot from when this landed until rc3 with this option
on a basic 32-bit install on a dual-core 64-bit CPU, it suggested a hole of
some sort in the test coverage.
Dave
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 21:02 ` Linus Torvalds
@ 2010-03-31 21:40 ` Ingo Molnar
0 siblings, 0 replies; 46+ messages in thread
From: Ingo Molnar @ 2010-03-31 21:40 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dave Airlie, James Morris, H. Peter Anvin, Yinghai Lu,
linux-kernel, airlied, Thomas Gleixner, Pekka Enberg
* Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Thu, 1 Apr 2010, Dave Airlie wrote:
> >
> > A default y config option causing regressions still at rc3? [...]
> >
> > [...] and you guys keep going? This is the sort of shit Linus would flame
> > me for a day or two for,
> >
> > Can we get some f'ing consistency here?
>
> Yeah. I think we need to remove the crap.
>
> I thought the problems were known, and fixed in -rc3. Clearly they weren't.
Yeah.
It would still be nice to get the before/after bootlogs, because we'd like to
map out any remaining bugs.
> And by now it's not about changing the default any more - by now it's about
> removing the known-crap code.
Ok, we can certainly do that too.
Should we scrap the whole x86 bootmem conversion to begin with? I'm not sure
there's any fundamentally less risky way to do it, so if we try this again in .35
we might run into similar regressions and i'd like to avoid that. I wouldnt
mind not having to do that at all, it's been a lot of pain to pull it off and
the lmb conversion looks even more intrusive.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 20:57 ` Dave Airlie
2010-03-31 21:02 ` Linus Torvalds
@ 2010-03-31 21:47 ` Ingo Molnar
1 sibling, 0 replies; 46+ messages in thread
From: Ingo Molnar @ 2010-03-31 21:47 UTC (permalink / raw)
To: Dave Airlie
Cc: James Morris, H. Peter Anvin, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
* Dave Airlie <airlied@gmail.com> wrote:
> On Thu, Apr 1, 2010 at 4:59 AM, Ingo Molnar <mingo@elte.hu> wrote:
> >
> > * James Morris <jmorris@namei.org> wrote:
> >
> >> On Tue, 30 Mar 2010, H. Peter Anvin wrote:
> >>
> >> > On 03/30/2010 09:49 PM, James Morris wrote:
> >> > >
> >> > > Please make NO_BOOTMEM default to n, at least for amd64, where I've found
> >> > > that it leads to all kinds of strange, undebuggable boot hangs and errors
> >> > > (with relatively current Fedora development userland).
> >> >
> >> > Have you tested it with the latest fixes that are now in Linus' tree (-rc3)?
> >>
> >> Yes, it was happening with -rc3.
> >
> > Could you please send the bootlog that Yinghai asked for, plus also one that
> > you get with NO_BOOTMEM turned off (for comparison)?
> >
> > Also, when did you first hit this bug? This code has been upstream for almost
> > a month, and it was in linux-next before that - so you should have hit this
> > much sooner. A rough timeframe would suffice. I suppose you were booting
> > upstream kernels during the merge window as well?
>
> A default y config option causing regressions still at rc3? and you guys
> keep going? This is the sort of shit Linus would flame me for a day or two
> for,
>
> Can we get some f'ing consistency here?
Note, without trying to defend the bootmem conversion itself, which didnt work
out well, this is not some optional new driver feature that was default-y
randomly but it was an infrastructure change that was to be made unconditional
in .35.
The flag was basically a testing/debug flag to allow the old code to be used
too, in case the new code was buggy. This is what helped James to report this
today, instead of forcing James through a very difficult ~14-reboot bisection.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 21:14 ` Dave Airlie
@ 2010-03-31 22:02 ` Yinghai Lu
2010-03-31 22:28 ` H. Peter Anvin
1 sibling, 0 replies; 46+ messages in thread
From: Yinghai Lu @ 2010-03-31 22:02 UTC (permalink / raw)
To: Dave Airlie
Cc: Ingo Molnar, James Morris, H. Peter Anvin, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
On 03/31/2010 02:14 PM, Dave Airlie wrote:
> On Thu, Apr 1, 2010 at 4:59 AM, Ingo Molnar <mingo@elte.hu> wrote:
>>
>> * James Morris <jmorris@namei.org> wrote:
>>
>>> On Tue, 30 Mar 2010, H. Peter Anvin wrote:
>>>
>>>> On 03/30/2010 09:49 PM, James Morris wrote:
>>>>>
>>>>> Please make NO_BOOTMEM default to n, at least for amd64, where I've found
>>>>> that it leads to all kinds of strange, undebuggable boot hangs and errors
>>>>> (with relatively current Fedora development userland).
>>>>
>>>> Have you tested it with the latest fixes that are now in Linus' tree (-rc3)?
>>>
>>> Yes, it was happening with -rc3.
>>
>> Could you please send the bootlog that Yinghai asked for, plus also one that
>> you get with NO_BOOTMEM turned off (for comparison)?
>>
>> Also, when did you first hit this bug? This code has been upstream for almost
>> a month, and it was in linux-next before that - so you should have hit this
>> much sooner. A rough timeframe would suffice. I suppose you were booting
>> upstream kernels during the merge window as well?
>>
>> We can flip the default around if there's no fix available based on the
>> bootlogs. (Plus the help text should definitely be improved.)
>>
>
> Are you testing this btw with initramfs/initrds? I suspect lots of testing
> is being done by people on monolithic kernels, this is just a misc guess,
> considering I couldn't boot from when this landed until rc3 with this option
> on a basic 32-bit install on a dual-core 64-bit CPU, it suggested a hole of
> some sort in the test coverage.
So is -rc3 working on your setup?
Yinghai
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 6:47 ` James Morris
2010-03-31 16:25 ` Yinghai Lu
2010-03-31 18:59 ` Ingo Molnar
@ 2010-03-31 22:05 ` Yinghai Lu
2010-03-31 22:13 ` Ingo Molnar
2 siblings, 1 reply; 46+ messages in thread
From: Yinghai Lu @ 2010-03-31 22:05 UTC (permalink / raw)
To: James Morris; +Cc: H. Peter Anvin, Ingo Molnar, linux-kernel, airlied
On 03/30/2010 11:47 PM, James Morris wrote:
> On Tue, 30 Mar 2010, H. Peter Anvin wrote:
>
>> On 03/30/2010 09:49 PM, James Morris wrote:
>>> Please make NO_BOOTMEM default to n, at least for amd64, where I've found
>>> that it leads to all kinds of strange, undebuggable boot hangs and errors
>>> (with relatively current Fedora development userland).
>>
>> Have you tested it with the latest fixes that are now in Linus' tree (-rc3)?
>
> Yes, it was happening with -rc3.
In case you have a 32-bit system without RAM installed on node 0, please check
the patch below.
Thanks
Yinghai
Subject: [PATCH] x86: Fix 32bit system without RAM on Node0
When 32-bit NUMA is used, free_all_bootmem() still only goes over node id 0.
If node 0 doesn't have RAM installed, we need to go on to node 1, because
early_node_map still uses 1 for all ranges, and RAM from node 1 becomes low
RAM.
Try to use MAX_NUMNODES like 64-bit NUMA does.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
arch/x86/mm/init_32.c | 5 +++++
1 file changed, 5 insertions(+)
Index: linux-2.6/arch/x86/mm/init_32.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init_32.c
+++ linux-2.6/arch/x86/mm/init_32.c
@@ -875,7 +875,12 @@ void __init mem_init(void)
BUG_ON(!mem_map);
#endif
/* this will put all low memory onto the freelists */
+#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES)
+ /* In case some 32bit systems don't have RAM installed on node0 */
+ totalram_pages += free_all_memory_core_early(MAX_NUMNODES);
+#else
totalram_pages += free_all_bootmem();
+#endif
reservedpages = 0;
for (tmp = 0; tmp < max_low_pfn; tmp++)
^ permalink raw reply [flat|nested] 46+ messages in thread
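The node-0 problem described in the patch above can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: the `demo_` names, the four-node layout, the page counts, and the convention that passing the maximum node count means "all nodes" are hypothetical stand-ins, not the kernel's implementation.

```c
#include <assert.h>

/* Hypothetical model, not kernel code: four NUMA nodes, node 0 has no RAM. */
#define DEMO_MAX_NUMNODES 4

static const unsigned long demo_node_pages[DEMO_MAX_NUMNODES] = {
	0,    /* node 0: no RAM installed */
	4096, /* node 1: its RAM ends up as low memory */
	0,
	0,
};

/*
 * Stand-in for a "free early memory" call. nid selects a single node;
 * passing DEMO_MAX_NUMNODES is assumed to mean "walk every node".
 */
unsigned long demo_free_early(int nid)
{
	unsigned long freed = 0;

	assert(nid >= 0 && nid <= DEMO_MAX_NUMNODES);

	for (int i = 0; i < DEMO_MAX_NUMNODES; i++) {
		if (nid == DEMO_MAX_NUMNODES || nid == i)
			freed += demo_node_pages[i];
	}
	return freed;
}

/* Stand-in for the old path, which only ever looks at node 0. */
unsigned long demo_free_node0_only(void)
{
	return demo_free_early(0);
}
```

In this model the node-0-only path hands back no pages at all, while the all-nodes call picks up the RAM that sits on node 1, which is the situation the patch description appears to be aiming at.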
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 22:05 ` Yinghai Lu
@ 2010-03-31 22:13 ` Ingo Molnar
2010-03-31 22:16 ` Yinghai Lu
0 siblings, 1 reply; 46+ messages in thread
From: Ingo Molnar @ 2010-03-31 22:13 UTC (permalink / raw)
To: Yinghai Lu; +Cc: James Morris, H. Peter Anvin, linux-kernel, airlied
* Yinghai Lu <yinghai@kernel.org> wrote:
> --- linux-2.6.orig/arch/x86/mm/init_32.c
> +++ linux-2.6/arch/x86/mm/init_32.c
> @@ -875,7 +875,12 @@ void __init mem_init(void)
> BUG_ON(!mem_map);
> #endif
> /* this will put all low memory onto the freelists */
> +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES)
> + /* In case some 32bit systems don't have RAM installed on node0 */
> + totalram_pages += free_all_memory_core_early(MAX_NUMNODES);
(Note: tab whitespace damage)
> +#else
> totalram_pages += free_all_bootmem();
So we get into this branch if CONFIG_NO_BOOTMEM is enabled but MAX_NUMNODES is
not defined? Doesnt look right.
> +#endif
Btw., and i said this before, i absolutely hate the CONFIG_NO_BOOTMEM naming
as well (a negative in the option), but it was what expresses the 'this is
where we want to go' state better and thus CONFIG_NO_BOOTMEM removal will be a
straight removal instead of a removal of the inverse.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 22:13 ` Ingo Molnar
@ 2010-03-31 22:16 ` Yinghai Lu
2010-03-31 22:41 ` Ingo Molnar
0 siblings, 1 reply; 46+ messages in thread
From: Yinghai Lu @ 2010-03-31 22:16 UTC (permalink / raw)
To: Ingo Molnar; +Cc: James Morris, H. Peter Anvin, linux-kernel, airlied
On 03/31/2010 03:13 PM, Ingo Molnar wrote:
>
> * Yinghai Lu <yinghai@kernel.org> wrote:
>
>> --- linux-2.6.orig/arch/x86/mm/init_32.c
>> +++ linux-2.6/arch/x86/mm/init_32.c
>> @@ -875,7 +875,12 @@ void __init mem_init(void)
>> BUG_ON(!mem_map);
>> #endif
>> /* this will put all low memory onto the freelists */
>> +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES)
>> + /* In case some 32bit systems don't have RAM installed on node0 */
>> + totalram_pages += free_all_memory_core_early(MAX_NUMNODES);
>
> (Note: tab whitespace damage)
>
>> +#else
>> totalram_pages += free_all_bootmem();
>
> So we get into this branch if CONFIG_NO_BOOTMEM is enabled but MAX_NUMNODES is
> not defined? Doesnt look right.
yes.
free_all_bootmem() will call
free_all_memory_core_early(NODE_DATA(0)->node_id);
Thanks
Yinghai Lu
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 21:14 ` Dave Airlie
2010-03-31 22:02 ` Yinghai Lu
@ 2010-03-31 22:28 ` H. Peter Anvin
1 sibling, 0 replies; 46+ messages in thread
From: H. Peter Anvin @ 2010-03-31 22:28 UTC (permalink / raw)
To: Dave Airlie
Cc: Ingo Molnar, James Morris, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
On 03/31/2010 02:14 PM, Dave Airlie wrote:
> Are you testing this btw with initramfs/initrds? I suspect lots of testing
> is being done by people on monolithic kernels, this is just a misc guess,
> considering I couldn't boot from when this landed until rc3 with this option
> on a basic 32-bit install on a dual-core 64-bit CPU, it suggested a hole of
> some sort in the test coverage.
Hi Dave,
The only bug report I remember getting from you had no details and was
in reply to another bug report which was, indeed, addressed, so we had
every reason to believe it was being dealt with with the patchset which
did indeed go into -rc3 (and does address a problem with initramfs in
particular cases.)
Clearly James Morris' problem is something unrelated, and regardless of
course of action we need to track it down.
If you also are having problems with -rc3 we would really appreciate as
much detail as possible -- boot logs at the very minimum -- so we have
a chance to at all track down the problems that do exist.
-hpa
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 22:16 ` Yinghai Lu
@ 2010-03-31 22:41 ` Ingo Molnar
2010-03-31 22:47 ` Yinghai Lu
0 siblings, 1 reply; 46+ messages in thread
From: Ingo Molnar @ 2010-03-31 22:41 UTC (permalink / raw)
To: Yinghai Lu; +Cc: James Morris, H. Peter Anvin, linux-kernel, airlied
* Yinghai Lu <yinghai@kernel.org> wrote:
> On 03/31/2010 03:13 PM, Ingo Molnar wrote:
> >
> > * Yinghai Lu <yinghai@kernel.org> wrote:
> >
> >> --- linux-2.6.orig/arch/x86/mm/init_32.c
> >> +++ linux-2.6/arch/x86/mm/init_32.c
> >> @@ -875,7 +875,12 @@ void __init mem_init(void)
> >> BUG_ON(!mem_map);
> >> #endif
> >> /* this will put all low memory onto the freelists */
> >> +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES)
> >> + /* In case some 32bit systems don't have RAM installed on node0 */
> >> + totalram_pages += free_all_memory_core_early(MAX_NUMNODES);
> >
> > (Note: tab whitespace damage)
> >
> >> +#else
> >> totalram_pages += free_all_bootmem();
> >
> > So we get into this branch if CONFIG_NO_BOOTMEM is enabled but MAX_NUMNODES is
> > not defined? Doesnt look right.
>
> yes.
>
> free_all_bootmem() will call
> free_all_memory_core_early(NODE_DATA(0)->node_id);
>
> Thanks
Well and that whole #ifdeffery is disgusting as well - even if the goal was to
remove CONFIG_NO_BOOTMEM ASAP.
Please learn to use proper intermediate helper functions and at minimum put
the conversion ugliness somewhere that doesnt intrude our daily flow in .c
files. The best rule is to _never ever_ put an #ifdef construct into a .c
file. It doesnt matter what the goal of the #ifdef is - such ugliness in code
is never justified.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 46+ messages in thread
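The "intermediate helper function" style requested in the message above might look roughly like the following. It is a minimal userspace sketch, assuming hypothetical names throughout (`DEMO_NO_BOOTMEM`, `demo_free_low_memory()` and the two stub functions with made-up return values); it is not the patch that was eventually written.

```c
#include <assert.h>

/* Hypothetical config switch for this sketch; remove it to select the other path. */
#define DEMO_NO_BOOTMEM 1

/* Hypothetical stubs with distinct return values so the two paths can be told apart. */
unsigned long demo_free_all_bootmem(void) { return 100; }
unsigned long demo_free_all_early(void)   { return 200; }

/* ---- would live in a header: the only place the conditional appears ---- */
#ifdef DEMO_NO_BOOTMEM
static inline unsigned long demo_free_low_memory(void)
{
	return demo_free_all_early();
}
#else
static inline unsigned long demo_free_low_memory(void)
{
	return demo_free_all_bootmem();
}
#endif

/* ---- would live in the .c file: no preprocessor conditionals at all ---- */
unsigned long demo_mem_init(void)
{
	unsigned long totalram_pages = 0;

	/* this will put all low memory onto the freelists */
	totalram_pages += demo_free_low_memory();
	return totalram_pages;
}
```

The caller reads the same whichever allocator is configured, and dropping the config option later means deleting one branch of the header rather than touching the call site.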
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 22:41 ` Ingo Molnar
@ 2010-03-31 22:47 ` Yinghai Lu
2010-03-31 22:56 ` Ingo Molnar
2010-03-31 23:34 ` H. Peter Anvin
0 siblings, 2 replies; 46+ messages in thread
From: Yinghai Lu @ 2010-03-31 22:47 UTC (permalink / raw)
To: Ingo Molnar; +Cc: James Morris, H. Peter Anvin, linux-kernel, airlied
On 03/31/2010 03:41 PM, Ingo Molnar wrote:
>
> * Yinghai Lu <yinghai@kernel.org> wrote:
>
>> On 03/31/2010 03:13 PM, Ingo Molnar wrote:
>>>
>>> * Yinghai Lu <yinghai@kernel.org> wrote:
>>>
>>>> --- linux-2.6.orig/arch/x86/mm/init_32.c
>>>> +++ linux-2.6/arch/x86/mm/init_32.c
>>>> @@ -875,7 +875,12 @@ void __init mem_init(void)
>>>> BUG_ON(!mem_map);
>>>> #endif
>>>> /* this will put all low memory onto the freelists */
>>>> +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES)
>>>> + /* In case some 32bit systems don't have RAM installed on node0 */
>>>> + totalram_pages += free_all_memory_core_early(MAX_NUMNODES);
>>>
>>> (Note: tab whitespace damage)
>>>
>>>> +#else
>>>> totalram_pages += free_all_bootmem();
>>>
>>> So we get into this branch if CONFIG_NO_BOOTMEM is enabled but MAX_NUMNODES is
>>> not defined? Doesnt look right.
>>
>> yes.
>>
>> free_all_bootmem() will call
>> free_all_memory_core_early(NODE_DATA(0)->node_id);
>>
>> Thanks
>
> Well and that whole #ifdeffery is disgusting as well - even if the goal was to
> remove CONFIG_NO_BOOTMEM ASAP.
>
> Please learn to use proper intermediate helper functions and at minimum put
> the conversion ugliness somewhere that doesnt intrude our daily flow in .c
> files. The best rule is to _never ever_ put an #ifdef construct into a .c
> file. It doesnt matter what the goal of the #ifdef is - such ugliness in code
> is never justified.
>
if you agree that i can have one nobootmem.c in mm/
Thanks
Yinghai
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 22:47 ` Yinghai Lu
@ 2010-03-31 22:56 ` Ingo Molnar
2010-04-01 0:01 ` Johannes Weiner
2010-03-31 23:34 ` H. Peter Anvin
1 sibling, 1 reply; 46+ messages in thread
From: Ingo Molnar @ 2010-03-31 22:56 UTC (permalink / raw)
To: Yinghai Lu
Cc: James Morris, H. Peter Anvin, linux-kernel, airlied,
Linus Torvalds, Pekka Enberg
* Yinghai Lu <yinghai@kernel.org> wrote:
> On 03/31/2010 03:41 PM, Ingo Molnar wrote:
> >
> > * Yinghai Lu <yinghai@kernel.org> wrote:
> >
> >> On 03/31/2010 03:13 PM, Ingo Molnar wrote:
> >>>
> >>> * Yinghai Lu <yinghai@kernel.org> wrote:
> >>>
> >>>> --- linux-2.6.orig/arch/x86/mm/init_32.c
> >>>> +++ linux-2.6/arch/x86/mm/init_32.c
> >>>> @@ -875,7 +875,12 @@ void __init mem_init(void)
> >>>> BUG_ON(!mem_map);
> >>>> #endif
> >>>> /* this will put all low memory onto the freelists */
> >>>> +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES)
> >>>> + /* In case some 32bit systems don't have RAM installed on node0 */
> >>>> + totalram_pages += free_all_memory_core_early(MAX_NUMNODES);
> >>>
> >>> (Note: tab whitespace damage)
> >>>
> >>>> +#else
> >>>> totalram_pages += free_all_bootmem();
> >>>
> >>> So we get into this branch if CONFIG_NO_BOOTMEM is enabled but MAX_NUMNODES is
> >>> not defined? Doesnt look right.
> >>
> >> yes.
> >>
> >> free_all_bootmem() will call
> >> free_all_memory_core_early(NODE_DATA(0)->node_id);
> >>
> >> Thanks
> >
> > Well and that whole #ifdeffery is disgusting as well - even if the goal was to
> > remove CONFIG_NO_BOOTMEM ASAP.
> >
> > Please learn to use proper intermediate helper functions and at minimum put
> > the conversion ugliness somewhere that doesnt intrude our daily flow in .c
> > files. The best rule is to _never ever_ put an #ifdef construct into a .c
> > file. It doesnt matter what the goal of the #ifdef is - such ugliness in code
> > is never justified.
> >
>
> if you agree that i can have one nobootmem.c in mm/
I think what we want is your lmb series, with CONFIG_NO_BOOTMEM eliminated
altogether and x86 converted to pure (extended) lmb facilities, and without
any traces of bootmem left in x86.
I.e. a really clean series with no CONFIG_NO_BOOTMEM kind of #ifdef crap left
around. This means 'nobootmem.c' (albeit saner than an #ifdef jungle) would be
moot as well.
We tried the dual model as it seemed prudent from a testing/conversion POV
(and it certainly allowed people to turn the new code off), but it's rather
ugly and we still have bugs left.
This means that if Linus likes that approach the conversion will be very
binary and very painful. The other option would be to go back to bootmem and
forget about the whole nobootmem and lmb thing.
Ingo
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 18:59 ` Ingo Molnar
2010-03-31 20:57 ` Dave Airlie
2010-03-31 21:14 ` Dave Airlie
@ 2010-03-31 22:58 ` James Morris
2010-03-31 23:02 ` Ingo Molnar
2010-03-31 23:35 ` H. Peter Anvin
2 siblings, 2 replies; 46+ messages in thread
From: James Morris @ 2010-03-31 22:58 UTC (permalink / raw)
To: Ingo Molnar
Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
On Wed, 31 Mar 2010, Ingo Molnar wrote:
> >
> > Yes, it was happening with -rc3.
>
> Could you please send the bootlog that Yinghai asked for, plus also one that
> you get with NO_BOOTMEM turned off (for comparison)?
I don't have the old boot logs, and have since upgraded the system
further.
IIRC, the boot was failing after not being able to find the root fs
(ext3/lvm/raid0). I thought it was a dracut issue, but it seemed to be
fixed by enabling bootmem.
> Also, when did you first hit this bug? This code has been upstream for almost
> a month, and it was in linux-next before that - so you should have hit this
> much sooner. A rough timeframe would suffice. I suppose you were booting
> upstream kernels during the merge window as well?
In this case, in the last few days (also when I first saw or noticed the
bootmem option). I was booting relatively recent linus kernels during the
merge window, although my main work was being done on an older upstream
kernel.
--
James Morris
<jmorris@namei.org>
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 22:58 ` James Morris
@ 2010-03-31 23:02 ` Ingo Molnar
2010-03-31 23:35 ` H. Peter Anvin
1 sibling, 0 replies; 46+ messages in thread
From: Ingo Molnar @ 2010-03-31 23:02 UTC (permalink / raw)
To: James Morris
Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
* James Morris <jmorris@namei.org> wrote:
> On Wed, 31 Mar 2010, Ingo Molnar wrote:
>
> > >
> > > Yes, it was happening with -rc3.
> >
> > Could you please send the bootlog that Yinghai asked for, plus also one that
> > you get with NO_BOOTMEM turned off (for comparison)?
>
> I don't have the old boot logs, and have since upgraded the system
> further.
Please, could you send any bootlog then that we could work from? That way we
could check the memory layout and guess the rough shape of the early
allocations, etc.
> IIRC, the boot was failing after not being able to find the root fs
> (ext3/lvm/raid0). I thought it was a dracut issue, but it seemed to be
> fixed by enabling bootmem.
Ok - initrd unpack failing or initial mount failing is consistent with the
initrd getting corrupted by overlapping early reservations due to an allocator
bug.
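A minimal sketch of what "overlapping early reservations" means in practice, in plain user-space C. The names `struct early_range`, `ranges_overlap()` and `find_overlap()` are hypothetical and only illustrate the check; this is not the kernel's early_res code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open physical range [start, end). */
struct early_range {
	uint64_t start;
	uint64_t end;
};

/* Two half-open ranges overlap iff each one starts before the other ends. */
int ranges_overlap(const struct early_range *a, const struct early_range *b)
{
	assert(a->start <= a->end && b->start <= b->end);
	return a->start < b->end && b->start < a->end;
}

/*
 * Return the index of the first reservation that overlaps 'probe'
 * (for example the initrd image), or -1 if none does.
 */
int find_overlap(const struct early_range *res, size_t n,
		 const struct early_range *probe)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (ranges_overlap(&res[i], probe))
			return (int)i;
	return -1;
}
```

If a check like this reports a hit for the initrd range, a later reservation may have been handed memory that still holds the initrd, which would match the symptom of a failed unpack or a root fs that cannot be found.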
> > Also, when did you first hit this bug? This code has been upstream for
> > almost a month, and it was in linux-next before that - so you should have
> > hit this much sooner. A rough timeframe would suffice. I suppose you were
> > booting upstream kernels during the merge window as well?
>
> In this case, in the last few days (also when I first saw or noticed the
> bootmem option). I was booting relatively recent linus kernels during the
> merge window, although my main work was being done on an older upstream
> kernel.
Ok, so it's not an old regression but possibly a bug in one of the fixes. Not
good.
Ingo
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 22:47 ` Yinghai Lu
2010-03-31 22:56 ` Ingo Molnar
@ 2010-03-31 23:34 ` H. Peter Anvin
2010-03-31 23:54 ` Yinghai Lu
1 sibling, 1 reply; 46+ messages in thread
From: H. Peter Anvin @ 2010-03-31 23:34 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Ingo Molnar, James Morris, linux-kernel, airlied
On 03/31/2010 03:47 PM, Yinghai Lu wrote:
>>
>> Well and that whole #ifdeffery is disgusting as well - even if the goal was to
>> remove CONFIG_NO_BOOTMEM ASAP.
>>
>> Please learn to use proper intermediate helper functions and at minimum put
>> the conversion ugliness somewhere that doesn't intrude on our daily flow in .c
>> files. The best rule is to _never ever_ put an #ifdef construct into a .c
>> file. It doesn't matter what the goal of the #ifdef is - such ugliness in code
>> is never justified.
>
> if you agree that i can have one nobootmem.c in mm/
>
That would be better, or more commonly, use inlines.
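A minimal sketch of the inline-helper pattern, assuming a hypothetical header that hides the config switch. `free_boot_pages()`, `CONFIG_EXAMPLE_NO_BOOTMEM`, `mem_init_example()` and the two stub backends are illustrative names only, not the kernel's actual interfaces.

```c
/* --- would live in a header, e.g. a hypothetical example_boot.h --- */

/* Stub backends standing in for the two real implementations. */
unsigned long release_via_early_res(void) { return 100; }
unsigned long release_via_bootmem(void)   { return 200; }

#ifdef CONFIG_EXAMPLE_NO_BOOTMEM
static inline unsigned long free_boot_pages(void)
{
	return release_via_early_res();
}
#else
static inline unsigned long free_boot_pages(void)
{
	return release_via_bootmem();
}
#endif

/* --- the .c file then contains no #ifdef at all --- */
unsigned long totalram_pages;

void mem_init_example(void)
{
	totalram_pages += free_boot_pages();
}
```

The conditional is evaluated once, in the header, and every caller sees a single function name regardless of the configuration.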
I'm still totally puzzled about this patch as well as the comment:
+#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES)
+ /* In case some 32bit systems don't have RAM installed on node0 */
+ totalram_pages += free_all_memory_core_early(MAX_NUMNODES);
+#else
totalram_pages += free_all_bootmem();
+#endif
Why is that "32 bits" specific? Second, MAX_NUMNODES is defined
whenever <linux/numa.h> is included, so what on Earth is this supposed
to signify? Are you trying to say MAX_NUMNODES > 1? Or are you trying
to say CONFIG_NUMA?
Furthermore, I really don't see the connection between this and James
Morris' reported problem, which he reports as "amd64", which presumably
is an x86-64 kernel and not 32 bits... James, is that correct? Any
more details you can give about the system? I *really* don't want to go
into cargo cult programming mode, that would suck eggs no matter what.
-hpa
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 22:58 ` James Morris
2010-03-31 23:02 ` Ingo Molnar
@ 2010-03-31 23:35 ` H. Peter Anvin
2010-03-31 23:43 ` James Morris
1 sibling, 1 reply; 46+ messages in thread
From: H. Peter Anvin @ 2010-03-31 23:35 UTC (permalink / raw)
To: James Morris
Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner,
Linus Torvalds, Pekka Enberg
On 03/31/2010 03:58 PM, James Morris wrote:
> On Wed, 31 Mar 2010, Ingo Molnar wrote:
>
>>>
>>> Yes, it was happening with -rc3.
>>
>> Could you please send the bootlog that Yinghai asked for, plus also one that
>> you get with NO_BOOTMEM turned off (for comparison)?
>
> I don't have the old boot logs, and have since upgraded the system
> further.
>
Upgraded how? The problem no longer happens?
> IIRC, the boot was failing after not being able to find the root fs
> (ext3/lvm/raid0). I thought it was a dracut issue, but it seemed to be
> fixed by enabling bootmem.
This would rather match the problem that was addressed by the patch in
-rc3. Any help in reproducing it would be great.
-hpa
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 23:35 ` H. Peter Anvin
@ 2010-03-31 23:43 ` James Morris
2010-03-31 23:48 ` H. Peter Anvin
0 siblings, 1 reply; 46+ messages in thread
From: James Morris @ 2010-03-31 23:43 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner,
Linus Torvalds, Pekka Enberg
On Wed, 31 Mar 2010, H. Peter Anvin wrote:
> On 03/31/2010 03:58 PM, James Morris wrote:
> > On Wed, 31 Mar 2010, Ingo Molnar wrote:
> >
> >>>
> >>> Yes, it was happening with -rc3.
> >>
> >> Could you please send the bootlog that Yinghai asked for, plus also one that
> >> you get with NO_BOOTMEM turned off (for comparison)?
> >
> > I don't have the old boot logs, and have since upgraded the system
> > further.
> >
>
> Upgraded how? The problem no longer happens?
Upgraded to the latest rawhide userland -- I have not since tested with
bootmem off. I'll try and do so again when I get a chance.
>
> > IIRC, the boot was failing after not being able to find the root fs
> > (ext3/lvm/raid0). I thought it was a dracut issue, but it seemed to be
> > fixed by enabling bootmem.
>
> This would rather match the problem that was addressed by the patch in
> -rc3. Any help in reproducing it would be great.
>
> -hpa
>
--
James Morris
<jmorris@namei.org>
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 23:43 ` James Morris
@ 2010-03-31 23:48 ` H. Peter Anvin
2010-04-01 1:00 ` James Morris
0 siblings, 1 reply; 46+ messages in thread
From: H. Peter Anvin @ 2010-03-31 23:48 UTC (permalink / raw)
To: James Morris
Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner,
Linus Torvalds, Pekka Enberg
On 03/31/2010 04:43 PM, James Morris wrote:
>>
>> Upgraded how? The problem no longer happens?
>
> Upgraded to the latest rawhide userland -- I have not since tested with
> bootmem off. I'll try and do so again when I get a chance.
>
That would be great. The sooner the better, obviously.
-hpa
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 23:34 ` H. Peter Anvin
@ 2010-03-31 23:54 ` Yinghai Lu
2010-04-01 0:35 ` H. Peter Anvin
0 siblings, 1 reply; 46+ messages in thread
From: Yinghai Lu @ 2010-03-31 23:54 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, James Morris, linux-kernel, airlied
On 03/31/2010 04:34 PM, H. Peter Anvin wrote:
> On 03/31/2010 03:47 PM, Yinghai Lu wrote:
>>>
>>> Well and that whole #ifdeffery is disgusting as well - even if the goal was to
>>> remove CONFIG_NO_BOOTMEM ASAP.
>>>
>>> Please learn to use proper intermediate helper functions and at minimum put
>>> the conversion ugliness somewhere that doesn't intrude on our daily flow in .c
>>> files. The best rule is to _never ever_ put an #ifdef construct into a .c
>>> file. It doesn't matter what the goal of the #ifdef is - such ugliness in code
>>> is never justified.
>>
>> if you agree that i can have one nobootmem.c in mm/
>>
>
> That would be better, or more commonly, use inlines.
>
> I'm still totally puzzled about this patch as well as the comment:
>
> +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES)
> + /* In case some 32bit systems don't have RAM installed on node0 */
> + totalram_pages += free_all_memory_core_early(MAX_NUMNODES);
> +#else
> totalram_pages += free_all_bootmem();
> +#endif
>
>
> Why is that "32 bits" specific? Second, MAX_NUMNODES is defined
> whenever <linux/numa.h> is included, so what on Earth is this supposed
> to signify? Are you trying to say MAX_NUMNODES > 1? Or are you trying
> to say CONFIG_NUMA?
You are right, this one should be clearer.
Subject: [PATCH -v2] nobootmem, x86: Fix 32bit system without RAM on Node0
When 32-bit NUMA is used, free_all_bootmem() still only walks node id 0.
If node 0 has no RAM installed, we need to walk node 1 instead, because
early_node_map[] then uses node id 1 for all ranges and the RAM from
node 1 becomes low RAM.
Use MAX_NUMNODES, as 64-bit NUMA does.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
mm/bootmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-2.6/mm/bootmem.c
===================================================================
--- linux-2.6.orig/mm/bootmem.c
+++ linux-2.6/mm/bootmem.c
@@ -303,7 +303,7 @@ unsigned long __init free_all_bootmem_no
unsigned long __init free_all_bootmem(void)
{
#ifdef CONFIG_NO_BOOTMEM
- return free_all_memory_core_early(NODE_DATA(0)->node_id);
+ return free_all_memory_core_early(MAX_NUMNODES);
#else
return free_all_bootmem_core(NODE_DATA(0)->bdata);
#endif
>
> Furthermore, I really don't see the connection between this and James
> Morris' reported problem, which he reports as "amd64", which presumably
> is an x86-64 kernel and not 32 bits... James, is that correct? Any
> more details you can give about the system? I *really* don't want to go
> into cargo cult programming mode, that would suck eggs no matter what.
It happened on one of my test setups: the node0 RAM disappeared somehow,
and I found that 32-bit NUMA doesn't work in that case.
Thanks
Yinghai
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 22:56 ` Ingo Molnar
@ 2010-04-01 0:01 ` Johannes Weiner
0 siblings, 0 replies; 46+ messages in thread
From: Johannes Weiner @ 2010-04-01 0:01 UTC (permalink / raw)
To: Ingo Molnar
Cc: Yinghai Lu, James Morris, H. Peter Anvin, linux-kernel, airlied,
Linus Torvalds, Pekka Enberg
On Thu, Apr 01, 2010 at 12:56:58AM +0200, Ingo Molnar wrote:
> I think what we want is your lmb series, with CONFIG_NO_BOOTMEM eliminated
> altogether and x86 converted to pure (extended) lmb facilities, and without
> any traces of bootmem left in x86.
That does not make much sense as bootmem is not only used on the architecture
side but also in generic code. So you either have to emulate the API on x86
or get lmb in a state to replace bootmem on _all_ architectures.
> I.e. a really clean series with no CONFIG_NO_BOOTMEM kind of #ifdef crap left
> around. This means 'nobootmem.c' (albeit saner than an #ifdef jungle) would be
> moot as well.
>
> We tried the dual model as it seemed prudent from a testing/conversion POV
> (and it certainly allowed people to turn the new code off), but it's rather
> ugly and we still have bugs left.
I think this was an implementation thing rather than a problem with the model
per se.
As written above, you can hardly get away without emulating the bootmem API
during transition.
> This means that if Linus likes that approach the conversion will be very
> binary and very painful. The other option would be to go back to bootmem and
> forget about the whole nobootmem and lmb thing.
I suppose it would be safest to replace early_res with lmb first to get
in sync with the other archs using it.
Step two would be to extend LMB and implement a bootmem emulation API on
top of it so that architectures can switch over to non-bootmem mode one
by one. Then you can drop the real bootmem code and switch generic code
to use LMB natively, also site by site. And finally, drop the emulation API.
If other architectures object to removing bootmem, there really is no point
for x86 to even try it.
For step one to work out, it's probably easiest to fully revert to the
.33 state than having to replace early_res while in its current state?
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 23:54 ` Yinghai Lu
@ 2010-04-01 0:35 ` H. Peter Anvin
2010-04-01 1:07 ` Yinghai Lu
2010-04-01 2:02 ` [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0 Yinghai Lu
0 siblings, 2 replies; 46+ messages in thread
From: H. Peter Anvin @ 2010-04-01 0:35 UTC (permalink / raw)
To: Yinghai Lu; +Cc: Ingo Molnar, James Morris, linux-kernel, airlied
On 03/31/2010 04:54 PM, Yinghai Lu wrote:
>
> ---
> mm/bootmem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-2.6/mm/bootmem.c
> ===================================================================
> --- linux-2.6.orig/mm/bootmem.c
> +++ linux-2.6/mm/bootmem.c
> @@ -303,7 +303,7 @@ unsigned long __init free_all_bootmem_no
> unsigned long __init free_all_bootmem(void)
> {
> #ifdef CONFIG_NO_BOOTMEM
> - return free_all_memory_core_early(NODE_DATA(0)->node_id);
> + return free_all_memory_core_early(MAX_NUMNODES);
> #else
> return free_all_bootmem_core(NODE_DATA(0)->bdata);
> #endif
>
>>
>> Furthermore, I really don't see the connection between this and James
>> Morris' reported problem, which he reports as "amd64", which presumably
>> is an x86-64 kernel and not 32 bits... James, is that correct? Any
>> more details you can give about the system? I *really* don't want to go
>> into cargo cult programming mode, that would suck eggs no matter what.
>
> It happened on one of my test setups: the node0 RAM disappeared somehow,
> and I found that 32-bit NUMA doesn't work in that case.
>
... which is useful and valid, but I still think this isn't related to
James' problem, if James' problem wasn't actually fixed in -rc3. That's
the part that I'm afraid I have to be confused about... all the known
problems except the above are fixed in -rc3, and I'd at least like to
have a validated bug report of any sort before saying it should all be
tossed.
This patch looks a lot better. The whole use of MAX_NUMNODES as a
sentinel (which appears inherited from mm/page_alloc.c, and as such is a
pre-existing convention which is also invoked here) really could use a
comment, though.
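A sketch of the sentinel convention in question, assuming a simplified table shaped like early_node_map[]. `EXAMPLE_MAX_NUMNODES`, `struct node_range` and `count_pages_for_nid()` are hypothetical stand-ins, and the kernel's real iteration differs in detail. Passing the sentinel means "every node", while a real node id selects only that node's ranges.

```c
#include <stddef.h>

#define EXAMPLE_MAX_NUMNODES 8	/* hypothetical; doubles as "all nodes" */

/* Simplified stand-in for one early_node_map[] entry. */
struct node_range {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
};

/*
 * Count the pages in all ranges that belong to 'nid'.
 * nid == EXAMPLE_MAX_NUMNODES is the sentinel and matches every range.
 */
unsigned long count_pages_for_nid(const struct node_range *map, size_t n,
				  int nid)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (nid != EXAMPLE_MAX_NUMNODES && map[i].nid != nid)
			continue;
		pages += map[i].end_pfn - map[i].start_pfn;
	}
	return pages;
}
```

When every range is tagged with node 1, as on a box whose node 0 has no RAM, asking for node 0 yields zero pages while the sentinel covers all of them.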
-hpa
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-03-31 23:48 ` H. Peter Anvin
@ 2010-04-01 1:00 ` James Morris
2010-04-01 12:52 ` Ingo Molnar
0 siblings, 1 reply; 46+ messages in thread
From: James Morris @ 2010-04-01 1:00 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner,
Linus Torvalds, Pekka Enberg
On Wed, 31 Mar 2010, H. Peter Anvin wrote:
> On 03/31/2010 04:43 PM, James Morris wrote:
> >>
> >> Upgraded how? The problem no longer happens?
> >
> > Upgraded to the latest rawhide userland -- I have not since tested with
> > bootmem off. I'll try and do so again when I get a chance.
> >
>
> That would be great. The sooner the better, obviously.
I'm not seeing any problems now, with current Linus and rawhide. I'll
leave bootmem off and see if anything comes up again.
--
James Morris
<jmorris@namei.org>
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-04-01 0:35 ` H. Peter Anvin
@ 2010-04-01 1:07 ` Yinghai Lu
2010-04-01 2:02 ` [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0 Yinghai Lu
1 sibling, 0 replies; 46+ messages in thread
From: Yinghai Lu @ 2010-04-01 1:07 UTC (permalink / raw)
To: H. Peter Anvin; +Cc: Ingo Molnar, James Morris, linux-kernel, airlied
On 03/31/2010 05:35 PM, H. Peter Anvin wrote:
> On 03/31/2010 04:54 PM, Yinghai Lu wrote:
>
> This patch looks a lot better. The whole use of MAX_NUMNODES as a
> sentinel (which appears inherited from mm/page_alloc.c, and as such is a
> pre-existing convention which is also invoked here) really could use a
> comment, though.
Sure, I will send an updated one with comments there.
Thanks
Yinghai
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0
2010-04-01 0:35 ` H. Peter Anvin
2010-04-01 1:07 ` Yinghai Lu
@ 2010-04-01 2:02 ` Yinghai Lu
2010-04-01 3:18 ` H. Peter Anvin
1 sibling, 1 reply; 46+ messages in thread
From: Yinghai Lu @ 2010-04-01 2:02 UTC (permalink / raw)
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner
Cc: linux-kernel, Johannes Weiner, Andrew Morton
On one system without RAM on node0, I got the following dump with a 32-bit NUMA kernel:
early_node_map[4] active PFN ranges
1: 0x00000010 -> 0x00000099
1: 0x00000100 -> 0x0007da00
1: 0x0007e800 -> 0x0007ffa0
1: 0x0007ffae -> 0x0007ffb0
Subtract (29 early reservations)
#000 [0000001000 - 0000002000]
#001 [0000089000 - 000008f000]
#002 [0000091000 - 0000093500]
#003 [0000094000 - 0000099000]
#004 [0000099400 - 0000100000]
#005 [0000200000 - 0000eb7644]
#006 [0000eb8000 - 0000ec327c]
#007 [007c400000 - 007c40e000]
#008 [007c440000 - 007c44e000]
#009 [007c480000 - 007c48e000]
#010 [007c4c0000 - 007c4ce000]
#011 [007c500000 - 007c50e000]
#012 [007c540000 - 007c54e000]
#013 [007c580000 - 007c58e000]
#014 [007c5c0000 - 007c5ce000]
#015 [007c674000 - 007cbfe000]
#016 [007cbfe500 - 007cbfe530]
#017 [007cbfe540 - 007cbfe5d0]
#018 [007cbfe600 - 007cbfe620]
#019 [007cbfe640 - 007cbfe660]
#020 [007cbfe680 - 007cbfe684]
#021 [007cbfe6c0 - 007cbfe6c4]
#022 [007cbfe700 - 007cbfe77e]
#023 [007cbfe780 - 007cbfe7fe]
#024 [007cbfe800 - 007cbfec54]
#025 [007cbfec80 - 007cbfeede]
#026 [007cbfef00 - 007cbfef2d]
#027 [007cbfef40 - 007e800000]
#028 [007e9ca000 - 007ff95000]
(0 free memory ranges)
Initializing HighMem for node 0 (00000000:00000000)
Initializing HighMem for node 1 (00000000:00000000)
Memory: 0k/2096832k available (6662k kernel code, 2096300k reserved, 4829k data, 484k init, 0k highmem)
virtual kernel memory layout:
fixmap : 0xff637000 - 0xfffff000 (10016 kB)
pkmap : 0xff200000 - 0xff400000 (2048 kB)
vmalloc : 0xc07b0000 - 0xff1fe000 (1002 MB)
lowmem : 0x40000000 - 0xbffb0000 (2047 MB)
.init : 0x40d39000 - 0x40db2000 ( 484 kB)
.data : 0x40881924 - 0x40d38e1c (4829 kB)
.text : 0x40200000 - 0x40881924 (6662 kB)
Checking if this processor honours the WP bit even in supervisor mode...Ok.
swapper: page allocation failure. order:0, mode:0x0
Pid: 0, comm: swapper Not tainted 2.6.34-rc3-tip-03818-g4b1ea6c-dirty #35
Call Trace:
[<4087a5dc>] ? printk+0xf/0x11
[<40286728>] __alloc_pages_nodemask+0x417/0x487
[<402a9ce1>] new_slab+0xe2/0x1fe
[<402aa5b2>] kmem_cache_open+0x185/0x358
[<402abbc0>] T.954+0x1c/0x60
[<40d52a29>] kmem_cache_init+0x24/0x113
[<40d39738>] start_kernel+0x166/0x2e4
[<40d3940e>] ? unknown_bootoption+0x0/0x18e
[<40d390ce>] i386_start_kernel+0xce/0xd5
Mem-Info:
Node 1 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
Node 1 Normal per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
free:0 slab_reclaimable:0 slab_unreclaimable:0
mapped:0 shmem:0 pagetables:0 bounce:0
When 32-bit NUMA is used, free_all_bootmem() still only walks node id 0.
If node 0 has no RAM installed, we need to walk node 1 instead, because
early_node_map[] then uses node id 1 for all ranges and the RAM from
node 1 becomes low RAM.
Use MAX_NUMNODES, as 64-bit NUMA does.
Also fix the BOOTMEM path by looping over bdata_list.
Note: this bug existed before we had NO_BOOTMEM support.
-v3: add more comments, and fix the bootmem path too.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
mm/bootmem.c | 17 +++++++++++++++--
1 file changed, 15 insertions(+), 2 deletions(-)
Index: linux-2.6/mm/bootmem.c
===================================================================
--- linux-2.6.orig/mm/bootmem.c
+++ linux-2.6/mm/bootmem.c
@@ -303,9 +303,22 @@ unsigned long __init free_all_bootmem_no
unsigned long __init free_all_bootmem(void)
{
#ifdef CONFIG_NO_BOOTMEM
- return free_all_memory_core_early(NODE_DATA(0)->node_id);
+ /*
+ * We need to use MAX_NUMNODES instead of NODE_DATA(0)->node_id
+ * because in some cases, such as Node0 having no RAM installed,
+ * low RAM will be on Node1.
+ * Using MAX_NUMNODES makes sure all ranges in early_node_map[]
+ * are used, instead of only those related to Node0.
+ */
+ return free_all_memory_core_early(MAX_NUMNODES);
#else
- return free_all_bootmem_core(NODE_DATA(0)->bdata);
+ unsigned long total_pages = 0;
+ bootmem_data_t *bdata;
+
+ list_for_each_entry(bdata, &bdata_list, list)
+ total_pages += free_all_bootmem_core(bdata);
+
+ return total_pages;
#endif
}
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
@ 2010-04-01 3:16 H. Peter Anvin
2010-04-01 3:35 ` Yinghai Lu
0 siblings, 1 reply; 46+ messages in thread
From: H. Peter Anvin @ 2010-04-01 3:16 UTC (permalink / raw)
To: James Morris
Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner,
Linus Torvalds, Pekka Enberg
OK... I think I'm going to write this up as unconfirmed... which means the only known problem that was not addressed in rc3 is the 32-bit NUMA issue, which we have a bug for.
Linus: does this address your concerns for now, or do you still want us to revert?
"James Morris" <jmorris@namei.org> wrote:
>On Wed, 31 Mar 2010, H. Peter Anvin wrote:
>
>> On 03/31/2010 04:43 PM, James Morris wrote:
>> >>
>> >> Upgraded how? The problem no longer happens?
>> >
>> > Upgraded to the latest rawhide userland -- I have not since tested with
>> > bootmem off. I'll try and do so again when I get a chance.
>> >
>>
>> That would be great. The sooner the better, obviously.
>
>I'm not seeing any problems now, with current Linus and rawhide. I'll
>leave bootmem off and see if anything comes up again.
>
>
>--
>James Morris
><jmorris@namei.org>
--
Sent from my mobile phone, pardon any lack of formatting.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0
2010-04-01 2:02 ` [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0 Yinghai Lu
@ 2010-04-01 3:18 ` H. Peter Anvin
2010-04-01 3:30 ` Yinghai Lu
2010-04-01 3:44 ` [PATCH -v4 1/2] nobootmem, " Yinghai Lu
0 siblings, 2 replies; 46+ messages in thread
From: H. Peter Anvin @ 2010-04-01 3:18 UTC (permalink / raw)
To: Yinghai Lu, Ingo Molnar, Thomas Gleixner
Cc: linux-kernel, Johannes Weiner, Andrew Morton
Please address the separate bug fix in a separate patch.
"Yinghai Lu" <yinghai@kernel.org> wrote:
>
>On one system without RAM on node0, I got the following dump with a 32-bit NUMA kernel:
>
>early_node_map[4] active PFN ranges
> 1: 0x00000010 -> 0x00000099
> 1: 0x00000100 -> 0x0007da00
> 1: 0x0007e800 -> 0x0007ffa0
> 1: 0x0007ffae -> 0x0007ffb0
>
>Subtract (29 early reservations)
> #000 [0000001000 - 0000002000]
> #001 [0000089000 - 000008f000]
> #002 [0000091000 - 0000093500]
> #003 [0000094000 - 0000099000]
> #004 [0000099400 - 0000100000]
> #005 [0000200000 - 0000eb7644]
> #006 [0000eb8000 - 0000ec327c]
> #007 [007c400000 - 007c40e000]
> #008 [007c440000 - 007c44e000]
> #009 [007c480000 - 007c48e000]
> #010 [007c4c0000 - 007c4ce000]
> #011 [007c500000 - 007c50e000]
> #012 [007c540000 - 007c54e000]
> #013 [007c580000 - 007c58e000]
> #014 [007c5c0000 - 007c5ce000]
> #015 [007c674000 - 007cbfe000]
> #016 [007cbfe500 - 007cbfe530]
> #017 [007cbfe540 - 007cbfe5d0]
> #018 [007cbfe600 - 007cbfe620]
> #019 [007cbfe640 - 007cbfe660]
> #020 [007cbfe680 - 007cbfe684]
> #021 [007cbfe6c0 - 007cbfe6c4]
> #022 [007cbfe700 - 007cbfe77e]
> #023 [007cbfe780 - 007cbfe7fe]
> #024 [007cbfe800 - 007cbfec54]
> #025 [007cbfec80 - 007cbfeede]
> #026 [007cbfef00 - 007cbfef2d]
> #027 [007cbfef40 - 007e800000]
> #028 [007e9ca000 - 007ff95000]
>(0 free memory ranges)
>Initializing HighMem for node 0 (00000000:00000000)
>Initializing HighMem for node 1 (00000000:00000000)
>Memory: 0k/2096832k available (6662k kernel code, 2096300k reserved, 4829k data, 484k init, 0k highmem)
>virtual kernel memory layout:
> fixmap : 0xff637000 - 0xfffff000 (10016 kB)
> pkmap : 0xff200000 - 0xff400000 (2048 kB)
> vmalloc : 0xc07b0000 - 0xff1fe000 (1002 MB)
> lowmem : 0x40000000 - 0xbffb0000 (2047 MB)
> .init : 0x40d39000 - 0x40db2000 ( 484 kB)
> .data : 0x40881924 - 0x40d38e1c (4829 kB)
> .text : 0x40200000 - 0x40881924 (6662 kB)
>Checking if this processor honours the WP bit even in supervisor mode...Ok.
>swapper: page allocation failure. order:0, mode:0x0
>Pid: 0, comm: swapper Not tainted 2.6.34-rc3-tip-03818-g4b1ea6c-dirty #35
>Call Trace:
> [<4087a5dc>] ? printk+0xf/0x11
> [<40286728>] __alloc_pages_nodemask+0x417/0x487
> [<402a9ce1>] new_slab+0xe2/0x1fe
> [<402aa5b2>] kmem_cache_open+0x185/0x358
> [<402abbc0>] T.954+0x1c/0x60
> [<40d52a29>] kmem_cache_init+0x24/0x113
> [<40d39738>] start_kernel+0x166/0x2e4
> [<40d3940e>] ? unknown_bootoption+0x0/0x18e
> [<40d390ce>] i386_start_kernel+0xce/0xd5
>Mem-Info:
>Node 1 DMA per-cpu:
>CPU 0: hi: 0, btch: 1 usd: 0
>Node 1 Normal per-cpu:
>CPU 0: hi: 0, btch: 1 usd: 0
>active_anon:0 inactive_anon:0 isolated_anon:0
> active_file:0 inactive_file:0 isolated_file:0
> unevictable:0 dirty:0 writeback:0 unstable:0
> free:0 slab_reclaimable:0 slab_unreclaimable:0
> mapped:0 shmem:0 pagetables:0 bounce:0
>
>When 32-bit NUMA is used, free_all_bootmem() still only walks node id 0.
>
>If node 0 has no RAM installed, we need to walk node 1 instead, because
>early_node_map[] then uses node id 1 for all ranges and the RAM from
>node 1 becomes low RAM.
>
>Use MAX_NUMNODES, as 64-bit NUMA does.
>
>Also fix the BOOTMEM path by looping over bdata_list.
>Note: this bug existed before we had NO_BOOTMEM support.
>
>-v3: add more comments, and fix the bootmem path too.
>
>Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>
>---
> mm/bootmem.c | 17 +++++++++++++++--
> 1 file changed, 15 insertions(+), 2 deletions(-)
>
>Index: linux-2.6/mm/bootmem.c
>===================================================================
>--- linux-2.6.orig/mm/bootmem.c
>+++ linux-2.6/mm/bootmem.c
>@@ -303,9 +303,22 @@ unsigned long __init free_all_bootmem_no
> unsigned long __init free_all_bootmem(void)
> {
> #ifdef CONFIG_NO_BOOTMEM
>- return free_all_memory_core_early(NODE_DATA(0)->node_id);
>+ /*
>+ * We need to use MAX_NUMNODES instead of NODE_DATA(0)->node_id
>+ * because in some cases, such as Node0 having no RAM installed,
>+ * low RAM will be on Node1.
>+ * Using MAX_NUMNODES makes sure all ranges in early_node_map[]
>+ * are used, instead of only those related to Node0.
>+ */
>+ return free_all_memory_core_early(MAX_NUMNODES);
> #else
>- return free_all_bootmem_core(NODE_DATA(0)->bdata);
>+ unsigned long total_pages = 0;
>+ bootmem_data_t *bdata;
>+
>+ list_for_each_entry(bdata, &bdata_list, list)
>+ total_pages += free_all_bootmem_core(bdata);
>+
>+ return total_pages;
> #endif
> }
>
--
Sent from my mobile phone, pardon any lack of formatting.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0
2010-04-01 3:18 ` H. Peter Anvin
@ 2010-04-01 3:30 ` Yinghai Lu
2010-04-01 3:44 ` [PATCH -v4 1/2] nobootmem, " Yinghai Lu
1 sibling, 0 replies; 46+ messages in thread
From: Yinghai Lu @ 2010-04-01 3:30 UTC (permalink / raw)
To: H. Peter Anvin
Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, Johannes Weiner,
Andrew Morton
On 03/31/2010 08:18 PM, H. Peter Anvin wrote:
> Please address the separate bug fix in a separate patch.
ok.
^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-04-01 3:16 H. Peter Anvin
@ 2010-04-01 3:35 ` Yinghai Lu
0 siblings, 0 replies; 46+ messages in thread
From: Yinghai Lu @ 2010-04-01 3:35 UTC (permalink / raw)
To: H. Peter Anvin
Cc: James Morris, Ingo Molnar, linux-kernel, airlied, Thomas Gleixner,
Linus Torvalds, Pekka Enberg
On 03/31/2010 08:16 PM, H. Peter Anvin wrote:
> OK... I think I'm going to write this up as unconfirmed... which means the only known problem that was not addressed in rc3 is the 32-bit NUMA issue, which we have a bug for.
>
The 32-bit NUMA problem is an edge case (node0 has no memory installed),
and it had been there for a while before we introduced NO_BOOTMEM support.
Thanks
Yinghai
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH -v4 1/2] nobootmem, x86: Fix 32bit numa system without RAM on Node0
2010-04-01 3:18 ` H. Peter Anvin
2010-04-01 3:30 ` Yinghai Lu
@ 2010-04-01 3:44 ` Yinghai Lu
2010-04-01 3:45 ` [PATCH -v4 2/2] bootmem, " Yinghai Lu
2010-04-01 22:57 ` [tip:x86/urgent] nobootmem, " tip-bot for Yinghai Lu
1 sibling, 2 replies; 46+ messages in thread
From: Yinghai Lu @ 2010-04-01 3:44 UTC (permalink / raw)
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner
Cc: linux-kernel, Johannes Weiner, Andrew Morton
On one system without RAM on node0, I got the following dump with a 32-bit NUMA kernel:
early_node_map[4] active PFN ranges
1: 0x00000010 -> 0x00000099
1: 0x00000100 -> 0x0007da00
1: 0x0007e800 -> 0x0007ffa0
1: 0x0007ffae -> 0x0007ffb0
...
Subtract (29 early reservations)
#000 [0000001000 - 0000002000]
#001 [0000089000 - 000008f000]
#002 [0000091000 - 0000093500]
...
#027 [007cbfef40 - 007e800000]
#028 [007e9ca000 - 007ff95000]
(0 free memory ranges)
Initializing HighMem for node 0 (00000000:00000000)
Initializing HighMem for node 1 (00000000:00000000)
Memory: 0k/2096832k available (6662k kernel code, 2096300k reserved, 4829k data, 484k init, 0k highmem)
...
Checking if this processor honours the WP bit even in supervisor mode...Ok.
swapper: page allocation failure. order:0, mode:0x0
Pid: 0, comm: swapper Not tainted 2.6.34-rc3-tip-03818-g4b1ea6c-dirty #35
Call Trace:
[<4087a5dc>] ? printk+0xf/0x11
[<40286728>] __alloc_pages_nodemask+0x417/0x487
[<402a9ce1>] new_slab+0xe2/0x1fe
[<402aa5b2>] kmem_cache_open+0x185/0x358
[<402abbc0>] T.954+0x1c/0x60
[<40d52a29>] kmem_cache_init+0x24/0x113
[<40d39738>] start_kernel+0x166/0x2e4
[<40d3940e>] ? unknown_bootoption+0x0/0x18e
[<40d390ce>] i386_start_kernel+0xce/0xd5
Mem-Info:
Node 1 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
Node 1 Normal per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
free:0 slab_reclaimable:0 slab_unreclaimable:0
mapped:0 shmem:0 pagetables:0 bounce:0
When 32-bit NUMA is used, free_all_bootmem() still only walks node id 0.
If node 0 has no RAM installed, we need to walk node 1 instead, because
early_node_map[] then uses node id 1 for all ranges and the RAM from
node 1 becomes low RAM.
Use MAX_NUMNODES, as 64-bit NUMA does.
Note: the BOOTMEM path has the same problem.
This bug existed before we had NO_BOOTMEM support.
-v3: add more comments, and fix the bootmem path too.
-v4: separate out the bootmem path fix
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
mm/bootmem.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)
Index: linux-2.6/mm/bootmem.c
===================================================================
--- linux-2.6.orig/mm/bootmem.c
+++ linux-2.6/mm/bootmem.c
@@ -303,7 +303,14 @@ unsigned long __init free_all_bootmem_no
unsigned long __init free_all_bootmem(void)
{
#ifdef CONFIG_NO_BOOTMEM
- return free_all_memory_core_early(NODE_DATA(0)->node_id);
+ /*
+ * We need to use MAX_NUMNODES instead of NODE_DATA(0)->node_id
+ * because in some cases, such as Node0 having no RAM installed,
+ * low RAM will be on Node1.
+ * Using MAX_NUMNODES makes sure all ranges in early_node_map[]
+ * are used, instead of only those related to Node0.
+ */
+ return free_all_memory_core_early(MAX_NUMNODES);
#else
return free_all_bootmem_core(NODE_DATA(0)->bdata);
#endif
^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH -v4 2/2] bootmem, x86: Fix 32bit numa system without RAM on Node0
2010-04-01 3:44 ` [PATCH -v4 1/2] nobootmem, " Yinghai Lu
@ 2010-04-01 3:45 ` Yinghai Lu
2010-04-01 22:57 ` [tip:x86/urgent] bootmem, x86: Fix 32bit numa system without RAM on node 0 tip-bot for Yinghai Lu
2010-04-01 22:57 ` [tip:x86/urgent] nobootmem, " tip-bot for Yinghai Lu
1 sibling, 1 reply; 46+ messages in thread
From: Yinghai Lu @ 2010-04-01 3:45 UTC (permalink / raw)
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner
Cc: linux-kernel, Johannes Weiner, Andrew Morton
When 32-bit NUMA is used, free_all_bootmem() still only walks node id 0.
If node 0 has no RAM installed, we need to walk node 1 instead, because
early_node_map[] then uses node id 1 for all ranges and the RAM from
node 1 becomes low RAM.
This one fixes the BOOTMEM path by looping over bdata_list.
-v3: add more comments, and fix the bootmem path too.
-v4: separated from one big patch
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
mm/bootmem.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
Index: linux-2.6/mm/bootmem.c
===================================================================
--- linux-2.6.orig/mm/bootmem.c
+++ linux-2.6/mm/bootmem.c
@@ -312,7 +312,13 @@ unsigned long __init free_all_bootmem(vo
 	 */
 	return free_all_memory_core_early(MAX_NUMNODES);
 #else
-	return free_all_bootmem_core(NODE_DATA(0)->bdata);
+	unsigned long total_pages = 0;
+	bootmem_data_t *bdata;
+
+	list_for_each_entry(bdata, &bdata_list, list)
+		total_pages += free_all_bootmem_core(bdata);
+
+	return total_pages;
 #endif
 }
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-04-01 1:00 ` James Morris
@ 2010-04-01 12:52 ` Ingo Molnar
2010-04-08 6:32 ` Ingo Molnar
0 siblings, 1 reply; 46+ messages in thread
From: Ingo Molnar @ 2010-04-01 12:52 UTC (permalink / raw)
To: James Morris
Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
* James Morris <jmorris@namei.org> wrote:
> On Wed, 31 Mar 2010, H. Peter Anvin wrote:
>
> > On 03/31/2010 04:43 PM, James Morris wrote:
> > >>
> > >> Upgraded how? The problem no longer happens?
> > >
> > > Upgraded to the latest rawhide userland -- I have not since tested with
> > > bootmem off. I'll try and do so again when I get a chance.
> > >
> >
> > That would be great. The sooner the better, obviously.
>
> I'm not seeing any problems now, with current Linus and rawhide. I'll leave
> bootmem off and see if anything comes up again.
(a current bootlog would still be nice)
Dave, can you reproduce any of these problems with Linus's latest?
Ingo
* [tip:x86/urgent] nobootmem, x86: Fix 32bit numa system without RAM on node 0
2010-04-01 3:44 ` [PATCH -v4 1/2] nobootmem, " Yinghai Lu
2010-04-01 3:45 ` [PATCH -v4 2/2] bootmem, " Yinghai Lu
@ 2010-04-01 22:57 ` tip-bot for Yinghai Lu
1 sibling, 0 replies; 46+ messages in thread
From: tip-bot for Yinghai Lu @ 2010-04-01 22:57 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, yinghai, tglx
Commit-ID: 337998587f802535896e9ed16d19f97915ccd368
Gitweb: http://git.kernel.org/tip/337998587f802535896e9ed16d19f97915ccd368
Author: Yinghai Lu <yinghai@kernel.org>
AuthorDate: Wed, 31 Mar 2010 20:44:09 -0700
Committer: H. Peter Anvin <hpa@zytor.com>
CommitDate: Thu, 1 Apr 2010 14:39:29 -0700
nobootmem, x86: Fix 32bit numa system without RAM on node 0
On one system without RAM on node 0, the following boot dump was seen
with a 32-bit NUMA kernel:
early_node_map[4] active PFN ranges
1: 0x00000010 -> 0x00000099
1: 0x00000100 -> 0x0007da00
1: 0x0007e800 -> 0x0007ffa0
1: 0x0007ffae -> 0x0007ffb0
...
Subtract (29 early reservations)
#000 [0000001000 - 0000002000]
#001 [0000089000 - 000008f000]
#002 [0000091000 - 0000093500]
...
#027 [007cbfef40 - 007e800000]
#028 [007e9ca000 - 007ff95000]
(0 free memory ranges)
Initializing HighMem for node 0 (00000000:00000000)
Initializing HighMem for node 1 (00000000:00000000)
Memory: 0k/2096832k available (6662k kernel code, 2096300k reserved, 4829k data, 484k init, 0k highmem)
...
Checking if this processor honours the WP bit even in supervisor mode...Ok.
swapper: page allocation failure. order:0, mode:0x0
Pid: 0, comm: swapper Not tainted 2.6.34-rc3-tip-03818-g4b1ea6c-dirty #35
Call Trace:
[<4087a5dc>] ? printk+0xf/0x11
[<40286728>] __alloc_pages_nodemask+0x417/0x487
[<402a9ce1>] new_slab+0xe2/0x1fe
[<402aa5b2>] kmem_cache_open+0x185/0x358
[<402abbc0>] T.954+0x1c/0x60
[<40d52a29>] kmem_cache_init+0x24/0x113
[<40d39738>] start_kernel+0x166/0x2e4
[<40d3940e>] ? unknown_bootoption+0x0/0x18e
[<40d390ce>] i386_start_kernel+0xce/0xd5
Mem-Info:
Node 1 DMA per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
Node 1 Normal per-cpu:
CPU 0: hi: 0, btch: 1 usd: 0
active_anon:0 inactive_anon:0 isolated_anon:0
active_file:0 inactive_file:0 isolated_file:0
unevictable:0 dirty:0 writeback:0 unstable:0
free:0 slab_reclaimable:0 slab_unreclaimable:0
mapped:0 shmem:0 pagetables:0 bounce:0
When 32-bit NUMA is used, free_all_bootmem() still only goes over
node id 0.
If node 0 doesn't have RAM installed, we need to go with node 1,
because early_node_map still uses nid 1 for all ranges, and the RAM from
node 1 becomes low RAM.
Use MAX_NUMNODES like 64-bit NUMA does.
Note: the BOOTMEM path has the same problem.
This bug existed before we had NO_BOOTMEM support.
-v3: add more comments, and fix the bootmem path too.
-v4: separate out the bootmem path fix
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4BB41689.9090502@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
mm/bootmem.c | 9 ++++++++-
1 files changed, 8 insertions(+), 1 deletions(-)
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 9b13446..2058cb7 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -303,7 +303,14 @@ unsigned long __init free_all_bootmem_node(pg_data_t *pgdat)
 unsigned long __init free_all_bootmem(void)
 {
 #ifdef CONFIG_NO_BOOTMEM
-	return free_all_memory_core_early(NODE_DATA(0)->node_id);
+	/*
+	 * Use MAX_NUMNODES instead of NODE_DATA(0)->node_id: in some
+	 * cases, e.g. when Node0 has no RAM installed, low RAM will be
+	 * on Node1.
+	 * Passing MAX_NUMNODES makes sure all ranges in early_node_map[]
+	 * are used, not only those related to Node0.
+	 */
+	return free_all_memory_core_early(MAX_NUMNODES);
 #else
 	return free_all_bootmem_core(NODE_DATA(0)->bdata);
 #endif
* [tip:x86/urgent] bootmem, x86: Fix 32bit numa system without RAM on node 0
2010-04-01 3:45 ` [PATCH -v4 2/2] bootmem, " Yinghai Lu
@ 2010-04-01 22:57 ` tip-bot for Yinghai Lu
0 siblings, 0 replies; 46+ messages in thread
From: tip-bot for Yinghai Lu @ 2010-04-01 22:57 UTC (permalink / raw)
To: linux-tip-commits; +Cc: linux-kernel, hpa, mingo, yinghai, tglx
Commit-ID: aa235fc712f379d4194cff9217f07026c452c141
Gitweb: http://git.kernel.org/tip/aa235fc712f379d4194cff9217f07026c452c141
Author: Yinghai Lu <yinghai@kernel.org>
AuthorDate: Wed, 31 Mar 2010 20:45:27 -0700
Committer: H. Peter Anvin <hpa@zytor.com>
CommitDate: Thu, 1 Apr 2010 14:41:19 -0700
bootmem, x86: Fix 32bit numa system without RAM on node 0
When 32-bit NUMA is used, free_all_bootmem() still only goes over
node id 0.
If node 0 doesn't have RAM installed, the lowest populated node
becomes low RAM.
This one fixes the BOOTMEM path by iterating over bdata_list.
-v3: add more comments, and fix the bootmem path too.
-v4: separate from one big patch
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4BB416D7.6090203@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
mm/bootmem.c | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)
diff --git a/mm/bootmem.c b/mm/bootmem.c
index 2058cb7..ba37d62 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -312,7 +312,13 @@ unsigned long __init free_all_bootmem(void)
 	 */
 	return free_all_memory_core_early(MAX_NUMNODES);
 #else
-	return free_all_bootmem_core(NODE_DATA(0)->bdata);
+	unsigned long total_pages = 0;
+	bootmem_data_t *bdata;
+
+	list_for_each_entry(bdata, &bdata_list, list)
+		total_pages += free_all_bootmem_core(bdata);
+
+	return total_pages;
 #endif
 }
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-04-01 12:52 ` Ingo Molnar
@ 2010-04-08 6:32 ` Ingo Molnar
2010-04-08 7:00 ` Yinghai
2010-04-08 8:05 ` James Morris
0 siblings, 2 replies; 46+ messages in thread
From: Ingo Molnar @ 2010-04-08 6:32 UTC (permalink / raw)
To: James Morris
Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
* Ingo Molnar <mingo@elte.hu> wrote:
> * James Morris <jmorris@namei.org> wrote:
>
> > On Wed, 31 Mar 2010, H. Peter Anvin wrote:
> >
> > > On 03/31/2010 04:43 PM, James Morris wrote:
> > > >>
> > > >> Upgraded how? The problem no longer happens?
> > > >
> > > > Upgraded to the latest rawhide userland -- I have not since tested with
> > > > bootmem off. I'll try and do so again when I get a chance.
> > > >
> > >
> > > That would be great. The sooner the better, obviously.
> >
> > I'm not seeing any problems now, with current Linus and rawhide. I'll leave
> > bootmem off and see if anything comes up again.
>
> (a current bootlog would still be nice)
>
> Dave, can you reproduce any of these problems with Linus's latest?
ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not
then it probably means that the bug you triggered was already fixed at the
time you reported it, as hpa suspected.)
Thanks,
Ingo
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-04-08 6:32 ` Ingo Molnar
@ 2010-04-08 7:00 ` Yinghai
2010-04-08 7:27 ` Ingo Molnar
2010-04-08 8:05 ` James Morris
1 sibling, 1 reply; 46+ messages in thread
From: Yinghai @ 2010-04-08 7:00 UTC (permalink / raw)
To: Ingo Molnar
Cc: James Morris, H. Peter Anvin, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
On 04/07/2010 11:32 PM, Ingo Molnar wrote:
>
> * Ingo Molnar <mingo@elte.hu> wrote:
>
>> * James Morris <jmorris@namei.org> wrote:
>>
>>> On Wed, 31 Mar 2010, H. Peter Anvin wrote:
>>>
>>>> On 03/31/2010 04:43 PM, James Morris wrote:
>>>>>>
>>>>>> Upgraded how? The problem no longer happens?
>>>>>
>>>>> Upgraded to the latest rawhide userland -- I have not since tested with
>>>>> bootmem off. I'll try and do so again when I get a chance.
>>>>>
>>>>
>>>> That would be great. The sooner the better, obviously.
>>>
>>> I'm not seeing any problems now, with current Linus and rawhide. I'll leave
>>> bootmem off and see if anything comes up again.
>>
>> (a current bootlog would still be nice)
>>
>> Dave, can you reproduce any of these problems with Linus's latest?
>
> ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not
> then it probably means that the bug you triggered was already fixed at the
> time you reported it, as hpa suspected.)
James already reported that -rc3 fixed the problem for him.
Dave implied that -rc3 fixed the problem for him.
Thanks
Yinghai
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-04-08 7:00 ` Yinghai
@ 2010-04-08 7:27 ` Ingo Molnar
2010-04-09 2:43 ` Dave Airlie
0 siblings, 1 reply; 46+ messages in thread
From: Ingo Molnar @ 2010-04-08 7:27 UTC (permalink / raw)
To: Yinghai
Cc: James Morris, H. Peter Anvin, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
* Yinghai <yinghai.lu@oracle.com> wrote:
> On 04/07/2010 11:32 PM, Ingo Molnar wrote:
> >
> > * Ingo Molnar <mingo@elte.hu> wrote:
> >
> >> * James Morris <jmorris@namei.org> wrote:
> >>
> >>> On Wed, 31 Mar 2010, H. Peter Anvin wrote:
> >>>
> >>>> On 03/31/2010 04:43 PM, James Morris wrote:
> >>>>>>
> >>>>>> Upgraded how? The problem no longer happens?
> >>>>>
> >>>>> Upgraded to the latest rawhide userland -- I have not since tested with
> >>>>> bootmem off. I'll try and do so again when I get a chance.
> >>>>>
> >>>>
> >>>> That would be great. The sooner the better, obviously.
> >>>
> >>> I'm not seeing any problems now, with current Linus and rawhide. I'll leave
> >>> bootmem off and see if anything comes up again.
> >>
> >> (a current bootlog would still be nice)
> >>
> >> Dave, can you reproduce any of these problems with Linus's latest?
> >
> > ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not
> > then it probably means that the bug you triggered was already fixed at the
> > time you reported it, as hpa suspected.)
>
> James already reported that -rc3 fixed the problem for him.
>
> Dave implied that -rc3 fixed the problem for him.
Hm, i'm confused, does this mean that it was all fixed upstream already when
Dave and James sent their complaints?
Would be nice to have a confirmation from Dave for that (beyond 'implying'
it), to not keep this thread open-ended.
Thanks,
Ingo
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-04-08 6:32 ` Ingo Molnar
2010-04-08 7:00 ` Yinghai
@ 2010-04-08 8:05 ` James Morris
2010-04-08 8:22 ` Ingo Molnar
1 sibling, 1 reply; 46+ messages in thread
From: James Morris @ 2010-04-08 8:05 UTC (permalink / raw)
To: Ingo Molnar
Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
On Thu, 8 Apr 2010, Ingo Molnar wrote:
> ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not
> then it probably means that the bug you triggered was already fixed at the
> time you reported it, as hpa suspected.)
I haven't seen it since.
- James
--
James Morris
<jmorris@namei.org>
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-04-08 8:05 ` James Morris
@ 2010-04-08 8:22 ` Ingo Molnar
0 siblings, 0 replies; 46+ messages in thread
From: Ingo Molnar @ 2010-04-08 8:22 UTC (permalink / raw)
To: James Morris
Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
* James Morris <jmorris@namei.org> wrote:
> On Thu, 8 Apr 2010, Ingo Molnar wrote:
>
> > ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not
> > then it probably means that the bug you triggered was already fixed at the
> > time you reported it, as hpa suspected.)
>
> I haven't seen it since.
Great, thanks!
Ingo
* Re: Config NO_BOOTMEM breaks my amd64 box
2010-04-08 7:27 ` Ingo Molnar
@ 2010-04-09 2:43 ` Dave Airlie
0 siblings, 0 replies; 46+ messages in thread
From: Dave Airlie @ 2010-04-09 2:43 UTC (permalink / raw)
To: Ingo Molnar
Cc: Yinghai, James Morris, H. Peter Anvin, linux-kernel, airlied,
Thomas Gleixner, Linus Torvalds, Pekka Enberg
On Thu, Apr 8, 2010 at 5:27 PM, Ingo Molnar <mingo@elte.hu> wrote:
>
> * Yinghai <yinghai.lu@oracle.com> wrote:
>
>> On 04/07/2010 11:32 PM, Ingo Molnar wrote:
>> >
>> > * Ingo Molnar <mingo@elte.hu> wrote:
>> >
>> >> * James Morris <jmorris@namei.org> wrote:
>> >>
>> >>> On Wed, 31 Mar 2010, H. Peter Anvin wrote:
>> >>>
>> >>>> On 03/31/2010 04:43 PM, James Morris wrote:
>> >>>>>>
>> >>>>>> Upgraded how? The problem no longer happens?
>> >>>>>
>> >>>>> Upgraded to the latest rawhide userland -- I have not since tested with
>> >>>>> bootmem off. I'll try and do so again when I get a chance.
>> >>>>>
>> >>>>
>> >>>> That would be great. The sooner the better, obviously.
>> >>>
>> >>> I'm not seeing any problems now, with current Linus and rawhide. I'll leave
>> >>> bootmem off and see if anything comes up again.
>> >>
>> >> (a current bootlog would still be nice)
>> >>
>> >> Dave, can you reproduce any of these problems with Linus's latest?
>> >
>> > ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not
>> > then it probably means that the bug you triggered was already fixed at the
>> > time you reported it, as hpa suspected.)
>>
>> James already reported that -rc3 fixed the problem for him.
>>
>> Dave implied that -rc3 fixed the problem for him.
>
> Hm, i'm confused, does this mean that it was all fixed upstream already when
> Dave and James sent their complaints?
When I reported it, it was only at rc2 stage so not fixed upstream at all.
>
> Would be nice to have a confirmation from Dave for that (beyond 'implying'
> it), to not keep this thread open-ended.
Okay, I built Linus' head with CONFIG_NO_BOOTMEM=y and it booted on the
previously broken machine.
Dave.