* Config NO_BOOTMEM breaks my amd64 box @ 2010-03-31 4:49 James Morris 2010-03-31 6:26 ` H. Peter Anvin 2010-03-31 10:51 ` Config NO_BOOTMEM breaks my amd64 box Stefan Richter 0 siblings, 2 replies; 44+ messages in thread From: James Morris @ 2010-03-31 4:49 UTC (permalink / raw) To: Ingo Molnar; +Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied Please make NO_BOOTMEM default to n, at least for amd64, where I've found that it leads to all kinds of strange, undebuggable boot hangs and errors (with relatively current Fedora development userland). Also, the help text for the item makes little sense to a non-expert in this area: " ---help--- Use early_res directly instead of bootmem before slab is ready. - allocator (buddy) [generic] - early allocator (bootmem) [generic] - very early allocator (reserve_early*()) [x86] - very very early allocator (early brk model) [x86] So reduce one layer between early allocator to final allocator." I had no idea what all this meant, so trusted the default=y and then spent several hours wondering why everything was breaking, and would likely not have figured it out in linear time without a suggestion from Dave Airlie. - James -- James Morris <jmorris@namei.org> ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 4:49 Config NO_BOOTMEM breaks my amd64 box James Morris @ 2010-03-31 6:26 ` H. Peter Anvin 2010-03-31 6:47 ` James Morris 2010-03-31 10:51 ` Config NO_BOOTMEM breaks my amd64 box Stefan Richter 1 sibling, 1 reply; 44+ messages in thread From: H. Peter Anvin @ 2010-03-31 6:26 UTC (permalink / raw) To: James Morris; +Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied On 03/30/2010 09:49 PM, James Morris wrote: > Please make NO_BOOTMEM default to n, at least for amd64, where I've found > that it leads to all kinds of strange, undebuggable boot hangs and errors > (with relatively current Fedora development userland). Have you tested it with the latest fixes that are now in Linus' tree (-rc3)? -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 6:26 ` H. Peter Anvin @ 2010-03-31 6:47 ` James Morris 2010-03-31 16:25 ` Yinghai Lu ` (2 more replies) 0 siblings, 3 replies; 44+ messages in thread From: James Morris @ 2010-03-31 6:47 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied On Tue, 30 Mar 2010, H. Peter Anvin wrote: > On 03/30/2010 09:49 PM, James Morris wrote: > > Please make NO_BOOTMEM default to n, at least for amd64, where I've found > > that it leads to all kinds of strange, undebuggable boot hangs and errors > > (with relatively current Fedora development userland). > > Have you tested it with the latest fixes that are now in Linus' tree (-rc3)? Yes, it was happening with -rc3. -- James Morris <jmorris@namei.org> ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 6:47 ` James Morris @ 2010-03-31 16:25 ` Yinghai Lu 2010-03-31 18:59 ` Ingo Molnar 2010-03-31 22:05 ` Yinghai Lu 2 siblings, 0 replies; 44+ messages in thread From: Yinghai Lu @ 2010-03-31 16:25 UTC (permalink / raw) To: James Morris; +Cc: H. Peter Anvin, Ingo Molnar, linux-kernel, airlied On 03/30/2010 11:47 PM, James Morris wrote: > On Tue, 30 Mar 2010, H. Peter Anvin wrote: > >> On 03/30/2010 09:49 PM, James Morris wrote: >>> Please make NO_BOOTMEM default to n, at least for amd64, where I've found >>> that it leads to all kinds of strange, undebuggable boot hangs and errors >>> (with relatively current Fedora development userland). >> >> Have you tested it with the latest fixes that are now in Linus' tree (-rc3)? > > Yes, it was happening with -rc3. please send out bootlog if possible. BTW please try git://git.kernel.org/pub/scm/linux/kernel/git/yinghai/linux-2.6-yinghai.git Thanks Yinghai ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 6:47 ` James Morris 2010-03-31 16:25 ` Yinghai Lu @ 2010-03-31 18:59 ` Ingo Molnar 2010-03-31 20:57 ` Dave Airlie ` (2 more replies) 2010-03-31 22:05 ` Yinghai Lu 2 siblings, 3 replies; 44+ messages in thread From: Ingo Molnar @ 2010-03-31 18:59 UTC (permalink / raw) To: James Morris Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg * James Morris <jmorris@namei.org> wrote: > On Tue, 30 Mar 2010, H. Peter Anvin wrote: > > > On 03/30/2010 09:49 PM, James Morris wrote: > > > > > > Please make NO_BOOTMEM default to n, at least for amd64, where I've found > > > that it leads to all kinds of strange, undebuggable boot hangs and errors > > > (with relatively current Fedora development userland). > > > > Have you tested it with the latest fixes that are now in Linus' tree (-rc3)? > > Yes, it was happening with -rc3. Could you please send the bootlog that Yinghai asked for, plus also one that you get with NO_BOOTMEM turned off (for comparison)? Also, when did you first hit this bug? This code has been upstream for almost a month, and it was in linux-next before that - so you should have hit this much sooner. A rough timeframe would suffice. I suppose you were booting upstream kernels during the merge window as well? We can flip the default around if there's no fix available based on the bootlogs. (Plus the help text should definitely be improved.) Thanks, Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 18:59 ` Ingo Molnar @ 2010-03-31 20:57 ` Dave Airlie 2010-03-31 21:02 ` Linus Torvalds 2010-03-31 21:47 ` Ingo Molnar 2010-03-31 21:14 ` Dave Airlie 2010-03-31 22:58 ` James Morris 2 siblings, 2 replies; 44+ messages in thread From: Dave Airlie @ 2010-03-31 20:57 UTC (permalink / raw) To: Ingo Molnar Cc: James Morris, H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On Thu, Apr 1, 2010 at 4:59 AM, Ingo Molnar <mingo@elte.hu> wrote: > > * James Morris <jmorris@namei.org> wrote: > >> On Tue, 30 Mar 2010, H. Peter Anvin wrote: >> >> > On 03/30/2010 09:49 PM, James Morris wrote: >> > > >> > > Please make NO_BOOTMEM default to n, at least for amd64, where I've found >> > > that it leads to all kinds of strange, undebuggable boot hangs and errors >> > > (with relatively current Fedora development userland). >> > >> > Have you tested it with the latest fixes that are now in Linus' tree (-rc3)? >> >> Yes, it was happening with -rc3. > > Could you please send the bootlog that Yinghai asked for, plus also one that > you get with NO_BOOTMEM turned off (for comparison)? > > Also, when did you first hit this bug? This code has been upstream for almost > a month, and it was in linux-next before that - so you should have hit this > much sooner. A rough timeframe would suffice. I suppose you were booting > upstream kernels during the merge window as well? A default y config option causing regressions still at rc3? and you guys keep going? This is the sort of shit Linus would flame me for a day or two for, Can we get some f'ing consistency here? Dave. > > We can flip the default around if there's no fix available based on the > bootlogs. (Plus the help text should definitely be improved.) 
> > Thanks, > > Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 20:57 ` Dave Airlie @ 2010-03-31 21:02 ` Linus Torvalds 2010-03-31 21:40 ` Ingo Molnar 2010-03-31 21:47 ` Ingo Molnar 1 sibling, 1 reply; 44+ messages in thread From: Linus Torvalds @ 2010-03-31 21:02 UTC (permalink / raw) To: Dave Airlie Cc: Ingo Molnar, James Morris, H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Pekka Enberg On Thu, 1 Apr 2010, Dave Airlie wrote: > > A default y config option causing regressions still at rc3? and you guys > keep going? This is the sort of shit Linus would flame me for a day or two for, > > Can we get some f'ing consistency here? Yeah. I think we need to remove the crap. I thought the problems were known, and fixed in -rc3. Clearly they weren't. And by now it's not about changing the default any more - by now it's about removing the known-crap code. Linus ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 21:02 ` Linus Torvalds @ 2010-03-31 21:40 ` Ingo Molnar 0 siblings, 0 replies; 44+ messages in thread From: Ingo Molnar @ 2010-03-31 21:40 UTC (permalink / raw) To: Linus Torvalds Cc: Dave Airlie, James Morris, H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Pekka Enberg * Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Thu, 1 Apr 2010, Dave Airlie wrote: > > > > A default y config option causing regressions still at rc3? [...] > > > > [...] and you guys keep going? This is the sort of shit Linus would flame > > me for a day or two for, > > > > Can we get some f'ing consistency here? > > Yeah. I think we need to remove the crap. > > I thought the problems were known, and fixed in -rc3. Clearly they weren't. Yeah. It would still be nice to get the before/after bootlogs, because we'd like to map out any remaining bugs. > And by now it's not about changing the default any more - by now it's about > removing the known-crap code. Ok, we can certainly do that too. Should we scrap the whole x86 bootmem conversion to begin with? I'm not sure there's any fundamentally less risky way to do it, so if we try this again in .35 we might run into similar regressions and i'd like to avoid that. I wouldn't mind not having to do that at all, it's been a lot of pain to pull it off and the lmb conversion looks even more intrusive. Thanks, Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 20:57 ` Dave Airlie 2010-03-31 21:02 ` Linus Torvalds @ 2010-03-31 21:47 ` Ingo Molnar 1 sibling, 0 replies; 44+ messages in thread From: Ingo Molnar @ 2010-03-31 21:47 UTC (permalink / raw) To: Dave Airlie Cc: James Morris, H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg * Dave Airlie <airlied@gmail.com> wrote: > On Thu, Apr 1, 2010 at 4:59 AM, Ingo Molnar <mingo@elte.hu> wrote: > > > > * James Morris <jmorris@namei.org> wrote: > > > >> On Tue, 30 Mar 2010, H. Peter Anvin wrote: > >> > >> > On 03/30/2010 09:49 PM, James Morris wrote: > >> > > > >> > > Please make NO_BOOTMEM default to n, at least for amd64, where I've found > >> > > that it leads to all kinds of strange, undebuggable boot hangs and errors > >> > > (with relatively current Fedora development userland). > >> > > >> > Have you tested it with the latest fixes that are now in Linus' tree (-rc3)? > >> > >> Yes, it was happening with -rc3. > > > > Could you please send the bootlog that Yinghai asked for, plus also one that > > you get with NO_BOOTMEM turned off (for comparison)? > > > > Also, when did you first hit this bug? This code has been upstream for almost > > a month, and it was in linux-next before that - so you should have hit this > > much sooner. A rough timeframe would suffice. I suppose you were booting > > upstream kernels during the merge window as well? > > A default y config option causing regressions still at rc3? and you guys > keep going? This is the sort of shit Linus would flame me for a day or two > for, > > Can we get some f'ing consistency here? Note, without trying to defend the bootmem conversion itself, which didnt work out well, this is not some optional new driver feature that was default-y randomly but it was an infrastructure change that was to be made unconditional in .35. 
The flag was basically a testing/debug flag to allow the old code to be used too, in case the new code was buggy. This is what helped James to report this today, instead of forcing James through a very difficult ~14-reboot bisection. Thanks, Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
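As a rough illustration of the "~14-reboot bisection" figure above: a bisection halves the suspect range on every test boot, so the worst case is about ceil(log2(N)) boots for N candidate commits. The sketch below is only that arithmetic in C. The function name `bisect_steps` and the 10,000-commit figure are assumptions made for illustration; neither comes from the thread.

```c
#include <assert.h>

/*
 * Worst-case number of test boots needed to bisect `commits` candidate
 * commits, assuming every boot halves the remaining range:
 * ceil(log2(commits)).
 *
 * `bisect_steps` is a hypothetical helper name, not a git or kernel API.
 */
unsigned int bisect_steps(unsigned long commits)
{
	unsigned int steps = 0;

	while (commits > 1) {
		/* ceil(commits / 2), written so it cannot overflow */
		commits -= commits / 2;
		steps++;
	}
	return steps;
}

/* Example with an assumed (not measured) range of 10,000 commits. */
void bisect_steps_example(void)
{
	assert(bisect_steps(1) == 0);      /* nothing to bisect */
	assert(bisect_steps(10000) == 14); /* 2^13 < 10000 <= 2^14 */
}
```

With a debug switch such as NO_BOOTMEM, the same question ("is the new code at fault?") is answered by one rebuild and one reboot instead.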
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 18:59 ` Ingo Molnar 2010-03-31 20:57 ` Dave Airlie @ 2010-03-31 21:14 ` Dave Airlie 2010-03-31 22:02 ` Yinghai Lu 2010-03-31 22:28 ` H. Peter Anvin 2010-03-31 22:58 ` James Morris 2 siblings, 2 replies; 44+ messages in thread From: Dave Airlie @ 2010-03-31 21:14 UTC (permalink / raw) To: Ingo Molnar Cc: James Morris, H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On Thu, Apr 1, 2010 at 4:59 AM, Ingo Molnar <mingo@elte.hu> wrote: > > * James Morris <jmorris@namei.org> wrote: > >> On Tue, 30 Mar 2010, H. Peter Anvin wrote: >> >> > On 03/30/2010 09:49 PM, James Morris wrote: >> > > >> > > Please make NO_BOOTMEM default to n, at least for amd64, where I've found >> > > that it leads to all kinds of strange, undebuggable boot hangs and errors >> > > (with relatively current Fedora development userland). >> > >> > Have you tested it with the latest fixes that are now in Linus' tree (-rc3)? >> >> Yes, it was happening with -rc3. > > Could you please send the bootlog that Yinghai asked for, plus also one that > you get with NO_BOOTMEM turned off (for comparison)? > > Also, when did you first hit this bug? This code has been upstream for almost > a month, and it was in linux-next before that - so you should have hit this > much sooner. A rough timeframe would suffice. I suppose you were booting > upstream kernels during the merge window as well? > > We can flip the default around if there's no fix available based on the > bootlogs. (Plus the help text should definitely be improved.) > Are you testing this btw with initramfs/initrds? I suspect lots of testing is being done by people on monolithic kernels, this is just a misc guess, considering I couldn't boot from when this landed until rc3 with this option on a basic 32-bit install on a dual-core 64-bit CPU, it suggested a hole of some sort in the test coverage. 
Dave ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 21:14 ` Dave Airlie @ 2010-03-31 22:02 ` Yinghai Lu 2010-03-31 22:28 ` H. Peter Anvin 1 sibling, 0 replies; 44+ messages in thread From: Yinghai Lu @ 2010-03-31 22:02 UTC (permalink / raw) To: Dave Airlie Cc: Ingo Molnar, James Morris, H. Peter Anvin, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On 03/31/2010 02:14 PM, Dave Airlie wrote: > On Thu, Apr 1, 2010 at 4:59 AM, Ingo Molnar <mingo@elte.hu> wrote: >> >> * James Morris <jmorris@namei.org> wrote: >> >>> On Tue, 30 Mar 2010, H. Peter Anvin wrote: >>> >>>> On 03/30/2010 09:49 PM, James Morris wrote: >>>>> >>>>> Please make NO_BOOTMEM default to n, at least for amd64, where I've found >>>>> that it leads to all kinds of strange, undebuggable boot hangs and errors >>>>> (with relatively current Fedora development userland). >>>> >>>> Have you tested it with the latest fixes that are now in Linus' tree (-rc3)? >>> >>> Yes, it was happening with -rc3. >> >> Could you please send the bootlog that Yinghai asked for, plus also one that >> you get with NO_BOOTMEM turned off (for comparison)? >> >> Also, when did you first hit this bug? This code has been upstream for almost >> a month, and it was in linux-next before that - so you should have hit this >> much sooner. A rough timeframe would suffice. I suppose you were booting >> upstream kernels during the merge window as well? >> >> We can flip the default around if there's no fix available based on the >> bootlogs. (Plus the help text should definitely be improved.) >> > > Are you testing this btw with initramfs/initrds? I suspect lots of testing > is being done by people on monolithic kernels, this is just a misc guess, > considering I couldn't boot from when this landed until rc3 with this option > on a basic 32-bit install on a dual-core 64-bit CPU, it suggested a > hole of some sort > in the test coverage. So -rc3 is working on your setup? 
Yinghai ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 21:14 ` Dave Airlie 2010-03-31 22:02 ` Yinghai Lu @ 2010-03-31 22:28 ` H. Peter Anvin 1 sibling, 0 replies; 44+ messages in thread From: H. Peter Anvin @ 2010-03-31 22:28 UTC (permalink / raw) To: Dave Airlie Cc: Ingo Molnar, James Morris, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On 03/31/2010 02:14 PM, Dave Airlie wrote: > Are you testing this btw with initramfs/initrds? I suspect lots of testing > is being done by people on monolithic kernels, this is just a misc guess, > considering I couldn't boot from when this landed until rc3 with this option > on a basic 32-bit install on a dual-core 64-bit CPU, it suggested a > hole of some sort > in the test coverage. Hi Dave, The only bug report I remember getting from you had no details and was in reply to another bug report which was, indeed, addressed, so we had every reason to believe it was being dealt with with the patchset which did indeed go into -rc3 (and does address a problem with initramfs in particular cases.) Clearly James Morris' problem is something unrelated, and regardless of course of action we need to track it down. If you also are having problems with -rc3 we would really appreciate as much detail as possible -- boot logs at the very minimum -- so we have a chance to at all track down the problems that do exist. -hpa ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 18:59 ` Ingo Molnar 2010-03-31 20:57 ` Dave Airlie 2010-03-31 21:14 ` Dave Airlie @ 2010-03-31 22:58 ` James Morris 2010-03-31 23:02 ` Ingo Molnar 2010-03-31 23:35 ` H. Peter Anvin 2 siblings, 2 replies; 44+ messages in thread From: James Morris @ 2010-03-31 22:58 UTC (permalink / raw) To: Ingo Molnar Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On Wed, 31 Mar 2010, Ingo Molnar wrote: > > > > Yes, it was happening with -rc3. > > Could you please send the bootlog that Yinghai asked for, plus also one that > you get with NO_BOOTMEM turned off (for comparison)? I don't have the old boot logs, and have since upgraded the system further. IIRC, the boot was failing after not being able to find the root fs (ext3/lvm/raid0). I thought it was a dracut issue, but it seemed to be fixed by enabling bootmem. > Also, when did you first hit this bug? This code has been upstream for almost > a month, and it was in linux-next before that - so you should have hit this > much sooner. A rough timeframe would suffice. I suppose you were booting > upstream kernels during the merge window as well? In this case, in the last few days (also when I first saw or noticed the bootmem option). I was booting relatively recent linus kernels during the merge window, although my main work was being done on an older upstream kernel. -- James Morris <jmorris@namei.org> ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 22:58 ` James Morris @ 2010-03-31 23:02 ` Ingo Molnar 2010-03-31 23:35 ` H. Peter Anvin 1 sibling, 0 replies; 44+ messages in thread From: Ingo Molnar @ 2010-03-31 23:02 UTC (permalink / raw) To: James Morris Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg * James Morris <jmorris@namei.org> wrote: > On Wed, 31 Mar 2010, Ingo Molnar wrote: > > > > > > > Yes, it was happening with -rc3. > > > > Could you please send the bootlog that Yinghai asked for, plus also one that > > you get with NO_BOOTMEM turned off (for comparison)? > > I don't have the old boot logs, and have since upgraded the system > further. Please, could you send any bootlog then that we could work from? That way we could check the memory layout and guess the rough shape of the early allocations, etc. > IIRC, the boot was failing after not being able to find the root fs > (ext3/lvm/raid0). I thought it was a dracut issue, but it seemed to be > fixed by enabling bootmem. Ok - initrd unpack failing or initial mount failing is consistent with the initrd getting corrupted by overlapping early reservations due to allocator bug. > > Also, when did you first hit this bug? This code has been upstream for > > almost a month, and it was in linux-next before that - so you should have > > hit this much sooner. A rough timeframe would suffice. I suppose you were > > booting upstream kernels during the merge window as well? > > In this case, in the last few days (also when I first saw or noticed the > bootmem option). I was booting relatively recent linus kernels during the > merge window, although my main work was being done on an older upstream > kernel. Ok, so it's not an old regression but possibly a bug in one of the fixes. Not good. Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
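The failure mode Ingo describes above (an initrd corrupted because an early reservation overlaps it) comes down to a range-overlap check. The following is a minimal user-space sketch of such a check, not the kernel's early_res code: `struct phys_range`, `ranges_overlap()` and `find_initrd_overlap()` are hypothetical names introduced here, and ranges are assumed to be half-open `[start, end)` physical address intervals.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open physical address range [start, end). Hypothetical type. */
struct phys_range {
	uint64_t start;
	uint64_t end;
	const char *name;
};

/* True if the two ranges share at least one byte. */
bool ranges_overlap(const struct phys_range *a, const struct phys_range *b)
{
	assert(a->start <= a->end && b->start <= b->end);

	/* An empty range cannot overlap anything. */
	if (a->start == a->end || b->start == b->end)
		return false;

	return a->start < b->end && b->start < a->end;
}

/*
 * Return the index of the first reservation that overlaps the initrd
 * range, or -1 if none does.
 */
int find_initrd_overlap(const struct phys_range *initrd,
			const struct phys_range *res, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (ranges_overlap(initrd, &res[i]))
			return (int)i;
	return -1;
}
```

A boot log showing the memory map and the early reservation table is what makes this kind of comparison possible by hand, which is presumably why the maintainers keep asking for one.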
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 22:58 ` James Morris 2010-03-31 23:02 ` Ingo Molnar @ 2010-03-31 23:35 ` H. Peter Anvin 2010-03-31 23:43 ` James Morris 1 sibling, 1 reply; 44+ messages in thread From: H. Peter Anvin @ 2010-03-31 23:35 UTC (permalink / raw) To: James Morris Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On 03/31/2010 03:58 PM, James Morris wrote: > On Wed, 31 Mar 2010, Ingo Molnar wrote: > >>> >>> Yes, it was happening with -rc3. >> >> Could you please send the bootlog that Yinghai asked for, plus also one that >> you get with NO_BOOTMEM turned off (for comparison)? > > I don't have the old boot logs, and have since upgraded the system > further. > Upgraded how? The problem no longer happens? > IIRC, the boot was failing after not being able to find the root fs > (ext3/lvm/raid0). I thought it was a dracut issue, but it seemed to be > fixed by enabling bootmem. This would rather match the problem that was addressed by the patch in -rc3. Any help in reproducing it would be great. -hpa ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 23:35 ` H. Peter Anvin @ 2010-03-31 23:43 ` James Morris 2010-03-31 23:48 ` H. Peter Anvin 0 siblings, 1 reply; 44+ messages in thread From: James Morris @ 2010-03-31 23:43 UTC (permalink / raw) To: H. Peter Anvin Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On Wed, 31 Mar 2010, H. Peter Anvin wrote: > On 03/31/2010 03:58 PM, James Morris wrote: > > On Wed, 31 Mar 2010, Ingo Molnar wrote: > > > >>> > >>> Yes, it was happening with -rc3. > >> > >> Could you please send the bootlog that Yinghai asked for, plus also one that > >> you get with NO_BOOTMEM turned off (for comparison)? > > > > I don't have the old boot logs, and have since upgraded the system > > further. > > > > Upgraded how? The problem no longer happens? Upgraded to the latest rawhide userland -- I have not since tested with bootmem off. I'll try and do so again when I get a chance. > > > IIRC, the boot was failing after not being able to find the root fs > > (ext3/lvm/raid0). I thought it was a dracut issue, but it seemed to be > > fixed by enabling bootmem. > > This would rather match the problem that was addressed by the patch in > -rc3. Any help in reproducing it would be great. > > -hpa > -- James Morris <jmorris@namei.org> ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 23:43 ` James Morris @ 2010-03-31 23:48 ` H. Peter Anvin 2010-04-01 1:00 ` James Morris 0 siblings, 1 reply; 44+ messages in thread From: H. Peter Anvin @ 2010-03-31 23:48 UTC (permalink / raw) To: James Morris Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On 03/31/2010 04:43 PM, James Morris wrote: >> >> Upgraded how? The problem no longer happens? > > Upgraded to the latest rawhide userland -- I have not since tested with > bootmem off. I'll try and do so again when I get a chance. > That would be great. The sooner the better, obviously. -hpa ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 23:48 ` H. Peter Anvin @ 2010-04-01 1:00 ` James Morris 2010-04-01 12:52 ` Ingo Molnar 0 siblings, 1 reply; 44+ messages in thread From: James Morris @ 2010-04-01 1:00 UTC (permalink / raw) To: H. Peter Anvin Cc: Ingo Molnar, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On Wed, 31 Mar 2010, H. Peter Anvin wrote: > On 03/31/2010 04:43 PM, James Morris wrote: > >> > >> Upgraded how? The problem no longer happens? > > > > Upgraded to the latest rawhide userland -- I have not since tested with > > bootmem off. I'll try and do so again when I get a chance. > > > > That would be great. The sooner the better, obviously. I'm not seeing any problems now, with current Linus and rawhide. I'll leave bootmem off and see if anything comes up again. -- James Morris <jmorris@namei.org> ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-04-01 1:00 ` James Morris @ 2010-04-01 12:52 ` Ingo Molnar 2010-04-08 6:32 ` Ingo Molnar 0 siblings, 1 reply; 44+ messages in thread From: Ingo Molnar @ 2010-04-01 12:52 UTC (permalink / raw) To: James Morris Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg * James Morris <jmorris@namei.org> wrote: > On Wed, 31 Mar 2010, H. Peter Anvin wrote: > > > On 03/31/2010 04:43 PM, James Morris wrote: > > >> > > >> Upgraded how? The problem no longer happens? > > > > > > Upgraded to the latest rawhide userland -- I have not since tested with > > > bootmem off. I'll try and do so again when I get a chance. > > > > > > > That would be great. The sooner the better, obviously. > > I'm not seeing any problems now, with current Linus and rawhide. I'll leave > bootmem off and see if anything comes up again. (a current bootlog would still be nice) Dave, can you reproduce any of these problems with Linus's latest? Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-04-01 12:52 ` Ingo Molnar @ 2010-04-08 6:32 ` Ingo Molnar 2010-04-08 7:00 ` Yinghai 2010-04-08 8:05 ` James Morris 0 siblings, 2 replies; 44+ messages in thread From: Ingo Molnar @ 2010-04-08 6:32 UTC (permalink / raw) To: James Morris Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg * Ingo Molnar <mingo@elte.hu> wrote: > * James Morris <jmorris@namei.org> wrote: > > > On Wed, 31 Mar 2010, H. Peter Anvin wrote: > > > > > On 03/31/2010 04:43 PM, James Morris wrote: > > > >> > > > >> Upgraded how? The problem no longer happens? > > > > > > > > Upgraded to the latest rawhide userland -- I have not since tested with > > > > bootmem off. I'll try and do so again when I get a chance. > > > > > > > > > > That would be great. The sooner the better, obviously. > > > > I'm not seeing any problems now, with current Linus and rawhide. I'll leave > > bootmem off and see if anything comes up again. > > (a current bootlog would still be nice) > > Dave, can you reproduce any of these problems with Linus's latest? ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not then it probably means that the bug you triggered was already fixed at the time you reported it, as hpa suspected.) Thanks, Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-04-08 6:32 ` Ingo Molnar @ 2010-04-08 7:00 ` Yinghai 2010-04-08 7:27 ` Ingo Molnar 2010-04-08 8:05 ` James Morris 1 sibling, 1 reply; 44+ messages in thread From: Yinghai @ 2010-04-08 7:00 UTC (permalink / raw) To: Ingo Molnar Cc: James Morris, H. Peter Anvin, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On 04/07/2010 11:32 PM, Ingo Molnar wrote: > > * Ingo Molnar <mingo@elte.hu> wrote: > >> * James Morris <jmorris@namei.org> wrote: >> >>> On Wed, 31 Mar 2010, H. Peter Anvin wrote: >>> >>>> On 03/31/2010 04:43 PM, James Morris wrote: >>>>>> >>>>>> Upgraded how? The problem no longer happens? >>>>> >>>>> Upgraded to the latest rawhide userland -- I have not since tested with >>>>> bootmem off. I'll try and do so again when I get a chance. >>>>> >>>> >>>> That would be great. The sooner the better, obviously. >>> >>> I'm not seeing any problems now, with current Linus and rawhide. I'll leave >>> bootmem off and see if anything comes up again. >> >> (a current bootlog would still be nice) >> >> Dave, can you reproduce any of these problems with Linus's latest? > > ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not > then it probably means that the bug you triggered was already fixed at the > time you reported it, as hpa suspected.) James already reported -rc3 fix the problem for him. Dave implied -rc3 fixed problem for him Thanks Yinghai ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-04-08 7:00 ` Yinghai @ 2010-04-08 7:27 ` Ingo Molnar 2010-04-09 2:43 ` Dave Airlie 0 siblings, 1 reply; 44+ messages in thread From: Ingo Molnar @ 2010-04-08 7:27 UTC (permalink / raw) To: Yinghai Cc: James Morris, H. Peter Anvin, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg * Yinghai <yinghai.lu@oracle.com> wrote: > On 04/07/2010 11:32 PM, Ingo Molnar wrote: > > > > * Ingo Molnar <mingo@elte.hu> wrote: > > > >> * James Morris <jmorris@namei.org> wrote: > >> > >>> On Wed, 31 Mar 2010, H. Peter Anvin wrote: > >>> > >>>> On 03/31/2010 04:43 PM, James Morris wrote: > >>>>>> > >>>>>> Upgraded how? The problem no longer happens? > >>>>> > >>>>> Upgraded to the latest rawhide userland -- I have not since tested with > >>>>> bootmem off. I'll try and do so again when I get a chance. > >>>>> > >>>> > >>>> That would be great. The sooner the better, obviously. > >>> > >>> I'm not seeing any problems now, with current Linus and rawhide. I'll leave > >>> bootmem off and see if anything comes up again. > >> > >> (a current bootlog would still be nice) > >> > >> Dave, can you reproduce any of these problems with Linus's latest? > > > > ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not > > then it probably means that the bug you triggered was already fixed at the > > time you reported it, as hpa suspected.) > > James already reported -rc3 fix the problem for him. > > Dave implied -rc3 fixed problem for him Hm, i'm confused, does this mean that it was all fixed upstream already when Dave and James sent their complaints? Would be nice to have a confirmation from Dave for that (beyond 'implying' it), to not keep this thread open-ended. Thanks, Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-04-08 7:27 ` Ingo Molnar @ 2010-04-09 2:43 ` Dave Airlie 0 siblings, 0 replies; 44+ messages in thread From: Dave Airlie @ 2010-04-09 2:43 UTC (permalink / raw) To: Ingo Molnar Cc: Yinghai, James Morris, H. Peter Anvin, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On Thu, Apr 8, 2010 at 5:27 PM, Ingo Molnar <mingo@elte.hu> wrote: > > * Yinghai <yinghai.lu@oracle.com> wrote: > >> On 04/07/2010 11:32 PM, Ingo Molnar wrote: >> > >> > * Ingo Molnar <mingo@elte.hu> wrote: >> > >> >> * James Morris <jmorris@namei.org> wrote: >> >> >> >>> On Wed, 31 Mar 2010, H. Peter Anvin wrote: >> >>> >> >>>> On 03/31/2010 04:43 PM, James Morris wrote: >> >>>>>> >> >>>>>> Upgraded how? The problem no longer happens? >> >>>>> >> >>>>> Upgraded to the latest rawhide userland -- I have not since tested with >> >>>>> bootmem off. I'll try and do so again when I get a chance. >> >>>>> >> >>>> >> >>>> That would be great. The sooner the better, obviously. >> >>> >> >>> I'm not seeing any problems now, with current Linus and rawhide. I'll leave >> >>> bootmem off and see if anything comes up again. >> >> >> >> (a current bootlog would still be nice) >> >> >> >> Dave, can you reproduce any of these problems with Linus's latest? >> > >> > ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not >> > then it probably means that the bug you triggered was already fixed at the >> > time you reported it, as hpa suspected.) >> >> James already reported -rc3 fix the problem for him. >> >> Dave implied -rc3 fixed problem for him > > Hm, i'm confused, does this mean that it was all fixed upstream already when > Dave and James sent their complaints? When I reported it, it was only at rc2 stage so not fixed upstream at all. > > Would be nice to have a confirmation from Dave for that (beyond 'implying' > it), to not keep this thread open-ended. 
Okay, I built a Linus head and it booted on the previously broken machine with CONFIG_NO_BOOTMEM=y. Dave. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-04-08 6:32 ` Ingo Molnar 2010-04-08 7:00 ` Yinghai @ 2010-04-08 8:05 ` James Morris 2010-04-08 8:22 ` Ingo Molnar 1 sibling, 1 reply; 44+ messages in thread From: James Morris @ 2010-04-08 8:05 UTC (permalink / raw) To: Ingo Molnar Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg On Thu, 8 Apr 2010, Ingo Molnar wrote: > ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not > then it probably means that the bug you triggered was already fixed at the > time you reported it, as hpa suspected.) I haven't seen it since. - James -- James Morris <jmorris@namei.org> ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-04-08 8:05 ` James Morris @ 2010-04-08 8:22 ` Ingo Molnar 0 siblings, 0 replies; 44+ messages in thread From: Ingo Molnar @ 2010-04-08 8:22 UTC (permalink / raw) To: James Morris Cc: H. Peter Anvin, Yinghai Lu, linux-kernel, airlied, Thomas Gleixner, Linus Torvalds, Pekka Enberg * James Morris <jmorris@namei.org> wrote: > On Thu, 8 Apr 2010, Ingo Molnar wrote: > > > ping? Can you or Dave reproduce the bug with -rc3 or later kernels? (If not > > then it probably means that the bug you triggered was already fixed at the > > time you reported it, as hpa suspected.) > > I haven't seen it since. Great, thanks! Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 6:47 ` James Morris 2010-03-31 16:25 ` Yinghai Lu 2010-03-31 18:59 ` Ingo Molnar @ 2010-03-31 22:05 ` Yinghai Lu 2010-03-31 22:13 ` Ingo Molnar 2 siblings, 1 reply; 44+ messages in thread From: Yinghai Lu @ 2010-03-31 22:05 UTC (permalink / raw) To: James Morris; +Cc: H. Peter Anvin, Ingo Molnar, linux-kernel, airlied On 03/30/2010 11:47 PM, James Morris wrote: > On Tue, 30 Mar 2010, H. Peter Anvin wrote: > >> On 03/30/2010 09:49 PM, James Morris wrote: >>> Please make NO_BOOTMEM default to n, at least for amd64, where I've found >>> that it leads to all kinds of strange, undebuggable boot hangs and errors >>> (with relatively current Fedora development userland). >> >> Have you tested it with the latest fixes that are now in Linus' tree (-rc3)? > > Yes, it was happening with -rc3. In case you have a 32bit system without RAM installed on node0, please check this patch. Thanks Yinghai Subject: [PATCH] x86: Fix 32bit system without RAM on Node0 When 32bit NUMA is used, free_all_bootmem() still only walks node id 0. If node 0 doesn't have RAM installed, we need to go with node1, because early_node_map still uses node id 1 for all ranges and the RAM from node1 becomes low RAM. Use MAX_NUMNODES like 64bit NUMA does.
Signed-off-by: Yinghai Lu <yinghai@kernel.org> --- arch/x86/mm/init_32.c | 5 +++++ 1 file changed, 5 insertions(+) Index: linux-2.6/arch/x86/mm/init_32.c =================================================================== --- linux-2.6.orig/arch/x86/mm/init_32.c +++ linux-2.6/arch/x86/mm/init_32.c @@ -875,7 +875,12 @@ void __init mem_init(void) BUG_ON(!mem_map); #endif /* this will put all low memory onto the freelists */ +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES) + /* In case some 32bit systems don't have RAM installed on node0 */ + totalram_pages += free_all_memory_core_early(MAX_NUMNODES); +#else totalram_pages += free_all_bootmem(); +#endif reservedpages = 0; for (tmp = 0; tmp < max_low_pfn; tmp++) ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 22:05 ` Yinghai Lu @ 2010-03-31 22:13 ` Ingo Molnar 2010-03-31 22:16 ` Yinghai Lu 0 siblings, 1 reply; 44+ messages in thread From: Ingo Molnar @ 2010-03-31 22:13 UTC (permalink / raw) To: Yinghai Lu; +Cc: James Morris, H. Peter Anvin, linux-kernel, airlied * Yinghai Lu <yinghai@kernel.org> wrote: > --- linux-2.6.orig/arch/x86/mm/init_32.c > +++ linux-2.6/arch/x86/mm/init_32.c > @@ -875,7 +875,12 @@ void __init mem_init(void) > BUG_ON(!mem_map); > #endif > /* this will put all low memory onto the freelists */ > +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES) > + /* In case some 32bit systems don't have RAM installed on node0 */ > + totalram_pages += free_all_memory_core_early(MAX_NUMNODES); (Note: tab whitespace damage) > +#else > totalram_pages += free_all_bootmem(); So we get into this branch if CONFIG_NO_BOOTMEM is enabled but MAX_NUMNODES is not defined? Doesnt look right. > +#endif Btw., and i said this before, i absolutely hate the CONFIG_NO_BOOTMEM naming as well (a negative in the option), but it was what expresses the 'this is where we want to go' state better and thus CONFIG_NO_BOOTMEM removal will be a straight removal instead of a removal of the inverse. Thanks, Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 22:13 ` Ingo Molnar @ 2010-03-31 22:16 ` Yinghai Lu 2010-03-31 22:41 ` Ingo Molnar 0 siblings, 1 reply; 44+ messages in thread From: Yinghai Lu @ 2010-03-31 22:16 UTC (permalink / raw) To: Ingo Molnar; +Cc: James Morris, H. Peter Anvin, linux-kernel, airlied On 03/31/2010 03:13 PM, Ingo Molnar wrote: > > * Yinghai Lu <yinghai@kernel.org> wrote: > >> --- linux-2.6.orig/arch/x86/mm/init_32.c >> +++ linux-2.6/arch/x86/mm/init_32.c >> @@ -875,7 +875,12 @@ void __init mem_init(void) >> BUG_ON(!mem_map); >> #endif >> /* this will put all low memory onto the freelists */ >> +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES) >> + /* In case some 32bit systems don't have RAM installed on node0 */ >> + totalram_pages += free_all_memory_core_early(MAX_NUMNODES); > > (Note: tab whitespace damage) > >> +#else >> totalram_pages += free_all_bootmem(); > > So we get into this branch if CONFIG_NO_BOOTMEM is enabled but MAX_NUMNODES is > not defined? Doesnt look right. yes. free_all_bootmem() will call free_all_memory_core_early(NODE_DATA(0)->node_id); Thanks Yinghai Lu ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 22:16 ` Yinghai Lu @ 2010-03-31 22:41 ` Ingo Molnar 2010-03-31 22:47 ` Yinghai Lu 0 siblings, 1 reply; 44+ messages in thread From: Ingo Molnar @ 2010-03-31 22:41 UTC (permalink / raw) To: Yinghai Lu; +Cc: James Morris, H. Peter Anvin, linux-kernel, airlied * Yinghai Lu <yinghai@kernel.org> wrote: > On 03/31/2010 03:13 PM, Ingo Molnar wrote: > > > > * Yinghai Lu <yinghai@kernel.org> wrote: > > > >> --- linux-2.6.orig/arch/x86/mm/init_32.c > >> +++ linux-2.6/arch/x86/mm/init_32.c > >> @@ -875,7 +875,12 @@ void __init mem_init(void) > >> BUG_ON(!mem_map); > >> #endif > >> /* this will put all low memory onto the freelists */ > >> +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES) > >> + /* In case some 32bit systems don't have RAM installed on node0 */ > >> + totalram_pages += free_all_memory_core_early(MAX_NUMNODES); > > > > (Note: tab whitespace damage) > > > >> +#else > >> totalram_pages += free_all_bootmem(); > > > > So we get into this branch if CONFIG_NO_BOOTMEM is enabled but MAX_NUMNODES is > > not defined? Doesnt look right. > > yes. > > free_all_bootmem() will call > free_all_memory_core_early(NODE_DATA(0)->node_id); > > Thanks Well and that whole #ifdeffery is disgusting as well - even if the goal was to remove CONFIG_NO_BOOTMEM ASAP. Please learn to use proper intermediate helper functions and at minimum put the conversion ugliness somewhere that doesnt intrude our daily flow in .c files. The best rule is to _never ever_ put an #ifdef construct into a .c file. It doesnt matter what the goal if the #ifdef is - such ugliness in code is never justified. Thanks, Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
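A minimal sketch of the helper-function approach Ingo asks for above, written as ordinary userspace C so it can be compiled on its own. The helper name free_low_memory(), the caller mem_init_sketch(), the stub bodies, and the MAX_NUMNODES value are all assumptions made for illustration; only free_all_memory_core_early(), free_all_bootmem(), CONFIG_NO_BOOTMEM and MAX_NUMNODES are names taken from the thread. The point is only that the config choice is confined to one small helper (which would normally sit in a header), so the caller's body carries no #ifdef.

```c
#include <assert.h>

#define MAX_NUMNODES 64	/* assumed value, for illustration only */

/* Stub standing in for the kernel function of the same name. */
unsigned long free_all_memory_core_early(int nid)
{
	/* pretend 100 pages are freed when all nodes are requested */
	return nid == MAX_NUMNODES ? 100UL : 0UL;
}

/* Stub standing in for the kernel function of the same name. */
unsigned long free_all_bootmem(void)
{
	return 40UL;
}

/*
 * Hypothetical helper: the only place where the config option is
 * visible. In a real tree this would live in a header file.
 */
#ifdef CONFIG_NO_BOOTMEM
static inline unsigned long free_low_memory(void)
{
	return free_all_memory_core_early(MAX_NUMNODES);
}
#else
static inline unsigned long free_low_memory(void)
{
	return free_all_bootmem();
}
#endif

/* Caller, loosely analogous to mem_init(): no #ifdef in the body. */
static unsigned long mem_init_sketch(unsigned long totalram_pages)
{
	totalram_pages += free_low_memory();
	return totalram_pages;
}
```

Building with or without -DCONFIG_NO_BOOTMEM switches which stub the helper calls, while mem_init_sketch() stays unchanged.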
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 22:41 ` Ingo Molnar @ 2010-03-31 22:47 ` Yinghai Lu 2010-03-31 22:56 ` Ingo Molnar 2010-03-31 23:34 ` H. Peter Anvin 0 siblings, 2 replies; 44+ messages in thread From: Yinghai Lu @ 2010-03-31 22:47 UTC (permalink / raw) To: Ingo Molnar; +Cc: James Morris, H. Peter Anvin, linux-kernel, airlied On 03/31/2010 03:41 PM, Ingo Molnar wrote: > > * Yinghai Lu <yinghai@kernel.org> wrote: > >> On 03/31/2010 03:13 PM, Ingo Molnar wrote: >>> >>> * Yinghai Lu <yinghai@kernel.org> wrote: >>> >>>> --- linux-2.6.orig/arch/x86/mm/init_32.c >>>> +++ linux-2.6/arch/x86/mm/init_32.c >>>> @@ -875,7 +875,12 @@ void __init mem_init(void) >>>> BUG_ON(!mem_map); >>>> #endif >>>> /* this will put all low memory onto the freelists */ >>>> +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES) >>>> + /* In case some 32bit systems don't have RAM installed on node0 */ >>>> + totalram_pages += free_all_memory_core_early(MAX_NUMNODES); >>> >>> (Note: tab whitespace damage) >>> >>>> +#else >>>> totalram_pages += free_all_bootmem(); >>> >>> So we get into this branch if CONFIG_NO_BOOTMEM is enabled but MAX_NUMNODES is >>> not defined? Doesnt look right. >> >> yes. >> >> free_all_bootmem() will call >> free_all_memory_core_early(NODE_DATA(0)->node_id); >> >> Thanks > > Well and that whole #ifdeffery is disgusting as well - even if the goal was to > remove CONFIG_NO_BOOTMEM ASAP. > > Please learn to use proper intermediate helper functions and at minimum put > the conversion ugliness somewhere that doesnt intrude our daily flow in .c > files. The best rule is to _never ever_ put an #ifdef construct into a .c > file. It doesnt matter what the goal if the #ifdef is - such ugliness in code > is never justified. > if you agree that i can have one nobootmem.c in mm/ Thanks Yinghai ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 22:47 ` Yinghai Lu @ 2010-03-31 22:56 ` Ingo Molnar 2010-04-01 0:01 ` Johannes Weiner 2010-03-31 23:34 ` H. Peter Anvin 1 sibling, 1 reply; 44+ messages in thread From: Ingo Molnar @ 2010-03-31 22:56 UTC (permalink / raw) To: Yinghai Lu Cc: James Morris, H. Peter Anvin, linux-kernel, airlied, Linus Torvalds, Pekka Enberg * Yinghai Lu <yinghai@kernel.org> wrote: > On 03/31/2010 03:41 PM, Ingo Molnar wrote: > > > > * Yinghai Lu <yinghai@kernel.org> wrote: > > > >> On 03/31/2010 03:13 PM, Ingo Molnar wrote: > >>> > >>> * Yinghai Lu <yinghai@kernel.org> wrote: > >>> > >>>> --- linux-2.6.orig/arch/x86/mm/init_32.c > >>>> +++ linux-2.6/arch/x86/mm/init_32.c > >>>> @@ -875,7 +875,12 @@ void __init mem_init(void) > >>>> BUG_ON(!mem_map); > >>>> #endif > >>>> /* this will put all low memory onto the freelists */ > >>>> +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES) > >>>> + /* In case some 32bit systems don't have RAM installed on node0 */ > >>>> + totalram_pages += free_all_memory_core_early(MAX_NUMNODES); > >>> > >>> (Note: tab whitespace damage) > >>> > >>>> +#else > >>>> totalram_pages += free_all_bootmem(); > >>> > >>> So we get into this branch if CONFIG_NO_BOOTMEM is enabled but MAX_NUMNODES is > >>> not defined? Doesnt look right. > >> > >> yes. > >> > >> free_all_bootmem() will call > >> free_all_memory_core_early(NODE_DATA(0)->node_id); > >> > >> Thanks > > > > Well and that whole #ifdeffery is disgusting as well - even if the goal was to > > remove CONFIG_NO_BOOTMEM ASAP. > > > > Please learn to use proper intermediate helper functions and at minimum put > > the conversion ugliness somewhere that doesnt intrude our daily flow in .c > > files. The best rule is to _never ever_ put an #ifdef construct into a .c > > file. It doesnt matter what the goal if the #ifdef is - such ugliness in code > > is never justified. 
> > > > if you agree that i can have one nobootmem.c in mm/ I think what we want is your lmb series, with CONFIG_NO_BOOTMEM eliminated altogether and x86 converted to pure (extended) lmb facilities, and without any traces of bootmem left in x86. I.e. a really clean series with no CONFIG_NO_BOOTMEM kind of #ifdef crap left around. This means 'nobootmem.c' (albeit saner than an #ifdef jungle) would be moot as well. We tried the dual model as it seemed prudent from a testing/conversion POV (and it certainly allowed people to turn the new code off), but it's rather ugly and we still have bugs left. This means that if Linus likes that approach the conversion will be very binary and very painful. The other option would be to go back to bootmem and forget about the whole nobootmem and lmb thing. Ingo ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 22:56 ` Ingo Molnar @ 2010-04-01 0:01 ` Johannes Weiner 0 siblings, 0 replies; 44+ messages in thread From: Johannes Weiner @ 2010-04-01 0:01 UTC (permalink / raw) To: Ingo Molnar Cc: Yinghai Lu, James Morris, H. Peter Anvin, linux-kernel, airlied, Linus Torvalds, Pekka Enberg On Thu, Apr 01, 2010 at 12:56:58AM +0200, Ingo Molnar wrote: > I think what we want is your lmb series, with CONFIG_NO_BOOTMEM eliminated > altogether and x86 converted to pure (extended) lmb facilities, and without > any traces of bootmem left in x86. That does not make much sense as bootmem is not only used on the architecture side but also in generic code. So you either have to emulate the API on x86 or get lmb in a state to replace bootmem on _all_ architectures. > I.e. a really clean series with no CONFIG_NO_BOOTMEM kind of #ifdef crap left > around. This means 'nobootmem.c' (albeit saner than an #ifdef jungle) would be > moot as well. > > We tried the dual model as it seemed prudent from a testing/conversion POV > (and it certainly allowed people to turn the new code off), but it's rather > ugly and we still have bugs left. I think this was an implementation thing rather than a problem with the model per se. As written above, you can hardly get away without emulating the bootmem API during transition. > This means that if Linus likes that approach the conversion will be very > binary and very painful. The other option would be to go back to bootmem and > forget about the whole nobootmem and lmb thing. I suppose it would be safest to replace early_res with lmb first to get in sync with the other archs using it. Step two would be to extend LMB and implement a bootmem emulation API on top of it so that architectures can switch over to non-bootmem mode one by one. Then you can drop the real bootmem code and switch generic code to use LMB natively, also site by site. And finally, drop the emulation API. 
If other architectures object to removing bootmem, there really is no point for x86 to even try it. For step one to work out, it's probably easiest to fully revert to the .33 state than having to replace early_res while in its current state? ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 22:47 ` Yinghai Lu 2010-03-31 22:56 ` Ingo Molnar @ 2010-03-31 23:34 ` H. Peter Anvin 2010-03-31 23:54 ` Yinghai Lu 1 sibling, 1 reply; 44+ messages in thread From: H. Peter Anvin @ 2010-03-31 23:34 UTC (permalink / raw) To: Yinghai Lu; +Cc: Ingo Molnar, James Morris, linux-kernel, airlied On 03/31/2010 03:47 PM, Yinghai Lu wrote: >> >> Well and that whole #ifdeffery is disgusting as well - even if the goal was to >> remove CONFIG_NO_BOOTMEM ASAP. >> >> Please learn to use proper intermediate helper functions and at minimum put >> the conversion ugliness somewhere that doesnt intrude our daily flow in .c >> files. The best rule is to _never ever_ put an #ifdef construct into a .c >> file. It doesnt matter what the goal if the #ifdef is - such ugliness in code >> is never justified. > > if you agree that i can have one nobootmem.c in mm/ > That would be better, or more commonly, use inlines. I'm still totally puzzled about this patch as well as the comment: +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES) + /* In case some 32bit systems don't have RAM installed on node0 */ + totalram_pages += free_all_memory_core_early(MAX_NUMNODES); +#else totalram_pages += free_all_bootmem(); +#endif Why is that "32 bits" specific? Second, MAX_NUMNODES is defined whenever <linux/numa.h> is included, so what on Earth is this supposed to signify? Are you trying to say MAX_NUMNODES > 1? Or are you trying to say CONFIG_NUMA? Furthermore, I really don't see the connection between this and James Morris' reported problem, which he reports as "amd64", which presumably is an x86-64 kernel and not 32 bits... James, is that correct? Any more details you can give about the system? I *really* don't want to go into cargo cult programming mode, that would suck eggs no matter what. -hpa ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 23:34 ` H. Peter Anvin @ 2010-03-31 23:54 ` Yinghai Lu 2010-04-01 0:35 ` H. Peter Anvin 0 siblings, 1 reply; 44+ messages in thread From: Yinghai Lu @ 2010-03-31 23:54 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Ingo Molnar, James Morris, linux-kernel, airlied On 03/31/2010 04:34 PM, H. Peter Anvin wrote: > On 03/31/2010 03:47 PM, Yinghai Lu wrote: >>> >>> Well and that whole #ifdeffery is disgusting as well - even if the goal was to >>> remove CONFIG_NO_BOOTMEM ASAP. >>> >>> Please learn to use proper intermediate helper functions and at minimum put >>> the conversion ugliness somewhere that doesnt intrude our daily flow in .c >>> files. The best rule is to _never ever_ put an #ifdef construct into a .c >>> file. It doesnt matter what the goal if the #ifdef is - such ugliness in code >>> is never justified. >> >> if you agree that i can have one nobootmem.c in mm/ >> > > That would be better, or more commonly, use inlines. > > I'm still totally puzzled about this patch as well as the comment: > > +#if defined(CONFIG_NO_BOOTMEM) && defined(MAX_NUMNODES) > + /* In case some 32bit systems don't have RAM installed on node0 */ > + totalram_pages += free_all_memory_core_early(MAX_NUMNODES); > +#else > totalram_pages += free_all_bootmem(); > +#endif > > > Why is that "32 bits" specific? Second, MAX_NUMNODES is defined > whenever <linux/numa.h> is included, so what on Earth is this supposed > to signify? Are you trying to say MAX_NUMNODES > 1? Or are you trying > to say CONFIG_NUMA? You are right, this one should be clearer. Subject: [PATCH -v2] nobootmem, x86: Fix 32bit system without RAM on Node0 When 32bit NUMA is used, free_all_bootmem() still only walks node id 0. If node 0 doesn't have RAM installed, we need to go with node1, because early_node_map still uses node id 1 for all ranges and the RAM from node1 becomes low RAM. Use MAX_NUMNODES like 64bit NUMA does.
Signed-off-by: Yinghai Lu <yinghai@kernel.org> --- mm/bootmem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) Index: linux-2.6/mm/bootmem.c =================================================================== --- linux-2.6.orig/mm/bootmem.c +++ linux-2.6/mm/bootmem.c @@ -303,7 +303,7 @@ unsigned long __init free_all_bootmem_no unsigned long __init free_all_bootmem(void) { #ifdef CONFIG_NO_BOOTMEM - return free_all_memory_core_early(NODE_DATA(0)->node_id); + return free_all_memory_core_early(MAX_NUMNODES); #else return free_all_bootmem_core(NODE_DATA(0)->bdata); #endif > > Furthermore, I really don't see the connection between this and James > Morris' reported problem, which he reports as "amd64", which presumably > is an x86-64 kernel and not 32 bits... James, is that correct? Any > more details you can give about the system? I *really* don't want to go > into cargo cult programming mode, that would suck eggs no matter what. it happened one of my test setup, node0 ram disappear somehow. and i found the 32bit numa doesn't work on that. Thanks Yinghai ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-03-31 23:54 ` Yinghai Lu @ 2010-04-01 0:35 ` H. Peter Anvin 2010-04-01 1:07 ` Yinghai Lu 2010-04-01 2:02 ` [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0 Yinghai Lu 0 siblings, 2 replies; 44+ messages in thread From: H. Peter Anvin @ 2010-04-01 0:35 UTC (permalink / raw) To: Yinghai Lu; +Cc: Ingo Molnar, James Morris, linux-kernel, airlied On 03/31/2010 04:54 PM, Yinghai Lu wrote: > > --- > mm/bootmem.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > Index: linux-2.6/mm/bootmem.c > =================================================================== > --- linux-2.6.orig/mm/bootmem.c > +++ linux-2.6/mm/bootmem.c > @@ -303,7 +303,7 @@ unsigned long __init free_all_bootmem_no > unsigned long __init free_all_bootmem(void) > { > #ifdef CONFIG_NO_BOOTMEM > - return free_all_memory_core_early(NODE_DATA(0)->node_id); > + return free_all_memory_core_early(MAX_NUMNODES); > #else > return free_all_bootmem_core(NODE_DATA(0)->bdata); > #endif > >> >> Furthermore, I really don't see the connection between this and James >> Morris' reported problem, which he reports as "amd64", which presumably >> is an x86-64 kernel and not 32 bits... James, is that correct? Any >> more details you can give about the system? I *really* don't want to go >> into cargo cult programming mode, that would suck eggs no matter what. > > it happened one of my test setup, node0 ram disappear somehow. > and i found the 32bit numa doesn't work on that. > ... which is useful and valid, but I still think this isn't related to James' problem, if James' problem wasn't actually fixed in -rc3. That's the part that I'm afraid I have to be confused about... all the known problems except the above are fixed in -rc3, and I'd at least like to have a validated bug report of any sort before saying it should all be tossed. This patch looks a lot better. 
The whole use of MAX_NUMNODES as a sentinel (which appears inherited from mm/page_alloc.c, and as such is a pre-existing convention which is also invoked here) really could use a comment, though. -hpa ^ permalink raw reply [flat|nested] 44+ messages in thread
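The sentinel convention hpa mentions can be sketched as follows. This is a simplified userspace illustration, not the kernel's code: struct node_range, pages_for_node(), the example values and the small MAX_NUMNODES are hypothetical. The assumed rule is that MAX_NUMNODES is never a valid node id, so passing it selects every range, while any real node id selects only that node's ranges.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMNODES 4	/* assumed small value for the sketch */

/* Hypothetical, simplified stand-in for one early_node_map[] entry. */
struct node_range {
	int nid;
	unsigned long start_pfn;
	unsigned long end_pfn;	/* exclusive */
};

/*
 * Count the pages in the ranges that belong to nid. Passing
 * MAX_NUMNODES, which is never a valid node id, selects every range.
 */
static unsigned long pages_for_node(const struct node_range *map,
				    size_t n, int nid)
{
	unsigned long pages = 0;
	size_t i;

	assert(nid >= 0 && nid <= MAX_NUMNODES);

	for (i = 0; i < n; i++) {
		if (nid != MAX_NUMNODES && map[i].nid != nid)
			continue;
		pages += map[i].end_pfn - map[i].start_pfn;
	}
	return pages;
}

/* Made-up layout in which node 0 has no RAM and everything is on node 1. */
static const struct node_range example_map[] = {
	{ 1, 0x010, 0x099 },
	{ 1, 0x100, 0x200 },
};
```

With this layout, asking for node 0 finds no pages at all, whereas asking for MAX_NUMNODES (or node 1) finds both ranges, which is the situation the patch in this thread is about.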
* Re: Config NO_BOOTMEM breaks my amd64 box 2010-04-01 0:35 ` H. Peter Anvin @ 2010-04-01 1:07 ` Yinghai Lu 2010-04-01 2:02 ` [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0 Yinghai Lu 1 sibling, 0 replies; 44+ messages in thread From: Yinghai Lu @ 2010-04-01 1:07 UTC (permalink / raw) To: H. Peter Anvin; +Cc: Ingo Molnar, James Morris, linux-kernel, airlied On 03/31/2010 05:35 PM, H. Peter Anvin wrote: > On 03/31/2010 04:54 PM, Yinghai Lu wrote: > > This patch looks a lot better. The whole use of MAX_NUMNODES as a > sentinel (which appears inherited from mm/page_alloc.c, and as such is a > pre-existing convention which is also invoked here) really could use a > comment, though. Sure, will send an updated one with comments there. Thanks Yinghai ^ permalink raw reply [flat|nested] 44+ messages in thread
* [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0 2010-04-01 0:35 ` H. Peter Anvin 2010-04-01 1:07 ` Yinghai Lu @ 2010-04-01 2:02 ` Yinghai Lu 2010-04-01 3:18 ` H. Peter Anvin 1 sibling, 1 reply; 44+ messages in thread From: Yinghai Lu @ 2010-04-01 2:02 UTC (permalink / raw) To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner Cc: linux-kernel, Johannes Weiner, Andrew Morton on one system without RAM on nod0, got following dump with 32bit numa kernel early_node_map[4] active PFN ranges 1: 0x00000010 -> 0x00000099 1: 0x00000100 -> 0x0007da00 1: 0x0007e800 -> 0x0007ffa0 1: 0x0007ffae -> 0x0007ffb0 Subtract (29 early reservations) #000 [0000001000 - 0000002000] #001 [0000089000 - 000008f000] #002 [0000091000 - 0000093500] #003 [0000094000 - 0000099000] #004 [0000099400 - 0000100000] #005 [0000200000 - 0000eb7644] #006 [0000eb8000 - 0000ec327c] #007 [007c400000 - 007c40e000] #008 [007c440000 - 007c44e000] #009 [007c480000 - 007c48e000] #010 [007c4c0000 - 007c4ce000] #011 [007c500000 - 007c50e000] #012 [007c540000 - 007c54e000] #013 [007c580000 - 007c58e000] #014 [007c5c0000 - 007c5ce000] #015 [007c674000 - 007cbfe000] #016 [007cbfe500 - 007cbfe530] #017 [007cbfe540 - 007cbfe5d0] #018 [007cbfe600 - 007cbfe620] #019 [007cbfe640 - 007cbfe660] #020 [007cbfe680 - 007cbfe684] #021 [007cbfe6c0 - 007cbfe6c4] #022 [007cbfe700 - 007cbfe77e] #023 [007cbfe780 - 007cbfe7fe] #024 [007cbfe800 - 007cbfec54] #025 [007cbfec80 - 007cbfeede] #026 [007cbfef00 - 007cbfef2d] #027 [007cbfef40 - 007e800000] #028 [007e9ca000 - 007ff95000] (0 free memory ranges) Initializing HighMem for node 0 (00000000:00000000) Initializing HighMem for node 1 (00000000:00000000) Memory: 0k/2096832k available (6662k kernel code, 2096300k reserved, 4829k data, 484k init, 0k highmem) virtual kernel memory layout: fixmap : 0xff637000 - 0xfffff000 (10016 kB) pkmap : 0xff200000 - 0xff400000 (2048 kB) vmalloc : 0xc07b0000 - 0xff1fe000 (1002 MB) lowmem : 0x40000000 - 0xbffb0000 (2047 MB) 
.init : 0x40d39000 - 0x40db2000 ( 484 kB) .data : 0x40881924 - 0x40d38e1c (4829 kB) .text : 0x40200000 - 0x40881924 (6662 kB) Checking if this processor honours the WP bit even in supervisor mode...Ok. swapper: page allocation failure. order:0, mode:0x0 Pid: 0, comm: swapper Not tainted 2.6.34-rc3-tip-03818-g4b1ea6c-dirty #35 Call Trace: [<4087a5dc>] ? printk+0xf/0x11 [<40286728>] __alloc_pages_nodemask+0x417/0x487 [<402a9ce1>] new_slab+0xe2/0x1fe [<402aa5b2>] kmem_cache_open+0x185/0x358 [<402abbc0>] T.954+0x1c/0x60 [<40d52a29>] kmem_cache_init+0x24/0x113 [<40d39738>] start_kernel+0x166/0x2e4 [<40d3940e>] ? unknown_bootoption+0x0/0x18e [<40d390ce>] i386_start_kernel+0xce/0xd5 Mem-Info: Node 1 DMA per-cpu: CPU 0: hi: 0, btch: 1 usd: 0 Node 1 Normal per-cpu: CPU 0: hi: 0, btch: 1 usd: 0 active_anon:0 inactive_anon:0 isolated_anon:0 active_file:0 inactive_file:0 isolated_file:0 unevictable:0 dirty:0 writeback:0 unstable:0 free:0 slab_reclaimable:0 slab_unreclaimable:0 mapped:0 shmem:0 pagetables:0 bounce:0 When 32bit numa is used, free_all_bootmem() will still only go over with node id 0. If node 0 doesn't have RAM installed, We need to go with node1 because early_node_map still use 1 for all ranges, and ram from node1 become low ram. Try to use MAX_NUMNODES like 64 numa does. Also fixes BOOTMEM path by loop bdata_list. Note: this bug exist before We have NO_BOOTMEM support. -v3: add more comments, and fix bootmem path too. 
Signed-off-by: Yinghai Lu <yinghai@kernel.org> --- mm/bootmem.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) Index: linux-2.6/mm/bootmem.c =================================================================== --- linux-2.6.orig/mm/bootmem.c +++ linux-2.6/mm/bootmem.c @@ -303,9 +303,22 @@ unsigned long __init free_all_bootmem_no unsigned long __init free_all_bootmem(void) { #ifdef CONFIG_NO_BOOTMEM - return free_all_memory_core_early(NODE_DATA(0)->node_id); + /* + * We need to use MAX_NUMNODES instead of NODE_DATA(0)->node_id + * because in some case like Node0 doesnt have RAM installed + * low ram will be on Node1 + * Use MAX_NUMNODES will make sure all ranges in early_node_map[] + * will be used instead of only Node0 related + */ + return free_all_memory_core_early(MAX_NUMNODES); #else - return free_all_bootmem_core(NODE_DATA(0)->bdata); + unsigned long total_pages = 0; + bootmem_data_t *bdata; + + list_for_each_entry(bdata, &bdata_list, list) + total_pages = free_all_bootmem_core(bdata); + + return total_pages; #endif } ^ permalink raw reply [flat|nested] 44+ messages in thread
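Note that the bootmem branch of the patch above assigns each free_all_bootmem_core() result to total_pages with a plain "=", so as posted it would return only the count for the last entry on bdata_list. If the intent is the total across all nodes, the loop has to accumulate. A minimal userspace sketch of that, assuming a simplified singly linked list; struct bdata_sketch, free_core_sketch() and free_all_sketch() are hypothetical stand-ins, not kernel APIs:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for bootmem_data_t on a list. */
struct bdata_sketch {
	unsigned long free_pages;	/* pages this node would release */
	struct bdata_sketch *next;
};

/* Stub for free_all_bootmem_core(): pages freed for one node. */
static unsigned long free_core_sketch(const struct bdata_sketch *bdata)
{
	return bdata->free_pages;
}

/* Sum the per-node results so the return value covers every node. */
static unsigned long free_all_sketch(const struct bdata_sketch *head)
{
	unsigned long total_pages = 0;
	const struct bdata_sketch *bdata;

	for (bdata = head; bdata != NULL; bdata = bdata->next)
		total_pages += free_core_sketch(bdata);

	return total_pages;
}
```

With one node contributing 0 pages and another 300, free_all_sketch() returns 300 regardless of list order, whereas a plain assignment would return whichever node happened to come last.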
* Re: [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0 2010-04-01 2:02 ` [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0 Yinghai Lu @ 2010-04-01 3:18 ` H. Peter Anvin 2010-04-01 3:30 ` Yinghai Lu 2010-04-01 3:44 ` [PATCH -v4 1/2] nobootmem, " Yinghai Lu 0 siblings, 2 replies; 44+ messages in thread From: H. Peter Anvin @ 2010-04-01 3:18 UTC (permalink / raw) To: Yinghai Lu, Ingo Molnar, Thomas Gleixner Cc: linux-kernel, Johannes Weiner, Andrew Morton [-- Attachment #1: Type: text/plain, Size: 4700 bytes --] Please address the separate bug fix in a separate patch. "Yinghai Lu" <yinghai@kernel.org> wrote: > >on one system without RAM on nod0, got following dump with 32bit numa kernel > >early_node_map[4] active PFN ranges > 1: 0x00000010 -> 0x00000099 > 1: 0x00000100 -> 0x0007da00 > 1: 0x0007e800 -> 0x0007ffa0 > 1: 0x0007ffae -> 0x0007ffb0 > >Subtract (29 early reservations) > #000 [0000001000 - 0000002000] > #001 [0000089000 - 000008f000] > #002 [0000091000 - 0000093500] > #003 [0000094000 - 0000099000] > #004 [0000099400 - 0000100000] > #005 [0000200000 - 0000eb7644] > #006 [0000eb8000 - 0000ec327c] > #007 [007c400000 - 007c40e000] > #008 [007c440000 - 007c44e000] > #009 [007c480000 - 007c48e000] > #010 [007c4c0000 - 007c4ce000] > #011 [007c500000 - 007c50e000] > #012 [007c540000 - 007c54e000] > #013 [007c580000 - 007c58e000] > #014 [007c5c0000 - 007c5ce000] > #015 [007c674000 - 007cbfe000] > #016 [007cbfe500 - 007cbfe530] > #017 [007cbfe540 - 007cbfe5d0] > #018 [007cbfe600 - 007cbfe620] > #019 [007cbfe640 - 007cbfe660] > #020 [007cbfe680 - 007cbfe684] > #021 [007cbfe6c0 - 007cbfe6c4] > #022 [007cbfe700 - 007cbfe77e] > #023 [007cbfe780 - 007cbfe7fe] > #024 [007cbfe800 - 007cbfec54] > #025 [007cbfec80 - 007cbfeede] > #026 [007cbfef00 - 007cbfef2d] > #027 [007cbfef40 - 007e800000] > #028 [007e9ca000 - 007ff95000] >(0 free memory ranges) >Initializing HighMem for node 0 (00000000:00000000) >Initializing 
HighMem for node 1 (00000000:00000000) >Memory: 0k/2096832k available (6662k kernel code, 2096300k reserved, 4829k data, 484k init, 0k highmem) >virtual kernel memory layout: > fixmap : 0xff637000 - 0xfffff000 (10016 kB) > pkmap : 0xff200000 - 0xff400000 (2048 kB) > vmalloc : 0xc07b0000 - 0xff1fe000 (1002 MB) > lowmem : 0x40000000 - 0xbffb0000 (2047 MB) > .init : 0x40d39000 - 0x40db2000 ( 484 kB) > .data : 0x40881924 - 0x40d38e1c (4829 kB) > .text : 0x40200000 - 0x40881924 (6662 kB) >Checking if this processor honours the WP bit even in supervisor mode...Ok. >swapper: page allocation failure. order:0, mode:0x0 >Pid: 0, comm: swapper Not tainted 2.6.34-rc3-tip-03818-g4b1ea6c-dirty #35 >Call Trace: > [<4087a5dc>] ? printk+0xf/0x11 > [<40286728>] __alloc_pages_nodemask+0x417/0x487 > [<402a9ce1>] new_slab+0xe2/0x1fe > [<402aa5b2>] kmem_cache_open+0x185/0x358 > [<402abbc0>] T.954+0x1c/0x60 > [<40d52a29>] kmem_cache_init+0x24/0x113 > [<40d39738>] start_kernel+0x166/0x2e4 > [<40d3940e>] ? unknown_bootoption+0x0/0x18e > [<40d390ce>] i386_start_kernel+0xce/0xd5 >Mem-Info: >Node 1 DMA per-cpu: >CPU 0: hi: 0, btch: 1 usd: 0 >Node 1 Normal per-cpu: >CPU 0: hi: 0, btch: 1 usd: 0 >active_anon:0 inactive_anon:0 isolated_anon:0 > active_file:0 inactive_file:0 isolated_file:0 > unevictable:0 dirty:0 writeback:0 unstable:0 > free:0 slab_reclaimable:0 slab_unreclaimable:0 > mapped:0 shmem:0 pagetables:0 bounce:0 > >When 32bit numa is used, free_all_bootmem() will still only go over with >node id 0. > >If node 0 doesn't have RAM installed, We need to go with node1 >because early_node_map still use 1 for all ranges, and ram from node1 >become low ram. > >Try to use MAX_NUMNODES like 64 numa does. > >Also fixes BOOTMEM path by loop bdata_list. >Note: this bug exist before We have NO_BOOTMEM support. > >-v3: add more comments, and fix bootmem path too. 
> >Signed-off-by: Yinghai Lu <yinghai@kernel.org> > >--- > mm/bootmem.c | 17 +++++++++++++++-- > 1 file changed, 15 insertions(+), 2 deletions(-) > >Index: linux-2.6/mm/bootmem.c >=================================================================== >--- linux-2.6.orig/mm/bootmem.c >+++ linux-2.6/mm/bootmem.c >@@ -303,9 +303,22 @@ unsigned long __init free_all_bootmem_no > unsigned long __init free_all_bootmem(void) > { > #ifdef CONFIG_NO_BOOTMEM >- return free_all_memory_core_early(NODE_DATA(0)->node_id); >+ /* >+ * We need to use MAX_NUMNODES instead of NODE_DATA(0)->node_id >+ * because in some case like Node0 doesnt have RAM installed >+ * low ram will be on Node1 >+ * Use MAX_NUMNODES will make sure all ranges in early_node_map[] >+ * will be used instead of only Node0 related >+ */ >+ return free_all_memory_core_early(MAX_NUMNODES); > #else >- return free_all_bootmem_core(NODE_DATA(0)->bdata); >+ unsigned long total_pages = 0; >+ bootmem_data_t *bdata; >+ >+ list_for_each_entry(bdata, &bdata_list, list) >+ total_pages = free_all_bootmem_core(bdata); >+ >+ return total_pages; > #endif > } > -- Sent from my mobile phone, pardon any lack of formatting. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0
  2010-04-01 3:18 ` H. Peter Anvin
@ 2010-04-01 3:30 ` Yinghai Lu
  2010-04-01 3:44 ` [PATCH -v4 1/2] nobootmem, " Yinghai Lu
  1 sibling, 0 replies; 44+ messages in thread
From: Yinghai Lu @ 2010-04-01 3:30 UTC
To: H. Peter Anvin
Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, Johannes Weiner, Andrew Morton

On 03/31/2010 08:18 PM, H. Peter Anvin wrote:
> Please address the separate bug fix in a separate patch.

ok.

* [PATCH -v4 1/2] nobootmem, x86: Fix 32bit numa system without RAM on Node0
  2010-04-01 3:18 ` H. Peter Anvin
  2010-04-01 3:30 ` Yinghai Lu
@ 2010-04-01 3:44 ` Yinghai Lu
  2010-04-01 3:45 ` [PATCH -v4 2/2] bootmem, " Yinghai Lu
  2010-04-01 22:57 ` [tip:x86/urgent] nobootmem, " tip-bot for Yinghai Lu
  1 sibling, 2 replies; 44+ messages in thread
From: Yinghai Lu @ 2010-04-01 3:44 UTC
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner
Cc: linux-kernel, Johannes Weiner, Andrew Morton

On one system without RAM on node 0, I got the following dump with a
32-bit NUMA kernel:

early_node_map[4] active PFN ranges
    1: 0x00000010 -> 0x00000099
    1: 0x00000100 -> 0x0007da00
    1: 0x0007e800 -> 0x0007ffa0
    1: 0x0007ffae -> 0x0007ffb0
...
Subtract (29 early reservations)
  #000 [0000001000 - 0000002000]
  #001 [0000089000 - 000008f000]
  #002 [0000091000 - 0000093500]
...
  #027 [007cbfef40 - 007e800000]
  #028 [007e9ca000 - 007ff95000]
(0 free memory ranges)
Initializing HighMem for node 0 (00000000:00000000)
Initializing HighMem for node 1 (00000000:00000000)
Memory: 0k/2096832k available (6662k kernel code, 2096300k reserved, 4829k data, 484k init, 0k highmem)
...
Checking if this processor honours the WP bit even in supervisor mode...Ok.
swapper: page allocation failure. order:0, mode:0x0
Pid: 0, comm: swapper Not tainted 2.6.34-rc3-tip-03818-g4b1ea6c-dirty #35
Call Trace:
 [<4087a5dc>] ? printk+0xf/0x11
 [<40286728>] __alloc_pages_nodemask+0x417/0x487
 [<402a9ce1>] new_slab+0xe2/0x1fe
 [<402aa5b2>] kmem_cache_open+0x185/0x358
 [<402abbc0>] T.954+0x1c/0x60
 [<40d52a29>] kmem_cache_init+0x24/0x113
 [<40d39738>] start_kernel+0x166/0x2e4
 [<40d3940e>] ? unknown_bootoption+0x0/0x18e
 [<40d390ce>] i386_start_kernel+0xce/0xd5
Mem-Info:
Node 1 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Node 1 Normal per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:0 inactive_file:0 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:0 slab_reclaimable:0 slab_unreclaimable:0
 mapped:0 shmem:0 pagetables:0 bounce:0

When 32-bit NUMA is used, free_all_bootmem() still only covers
node id 0.

If node 0 doesn't have RAM installed, we need to go with node 1,
because early_node_map[] still uses nid 1 for all ranges, and RAM from
node 1 becomes low RAM.

Use MAX_NUMNODES, as 64-bit NUMA does.

Note: the BOOTMEM path has the same problem.
This bug existed before we had NO_BOOTMEM support.

-v3: add more comments, and fix bootmem path too.
-v4: separate bootmem path fix

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 mm/bootmem.c |    9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/bootmem.c
===================================================================
--- linux-2.6.orig/mm/bootmem.c
+++ linux-2.6/mm/bootmem.c
@@ -303,7 +303,14 @@ unsigned long __init free_all_bootmem_no
 unsigned long __init free_all_bootmem(void)
 {
 #ifdef CONFIG_NO_BOOTMEM
-	return free_all_memory_core_early(NODE_DATA(0)->node_id);
+	/*
+	 * We need to use MAX_NUMNODES instead of NODE_DATA(0)->node_id
+	 * because in some case like Node0 doesnt have RAM installed
+	 * low ram will be on Node1
+	 * Use MAX_NUMNODES will make sure all ranges in early_node_map[]
+	 * will be used instead of only Node0 related
+	 */
+	return free_all_memory_core_early(MAX_NUMNODES);
 #else
 	return free_all_bootmem_core(NODE_DATA(0)->bdata);
 #endif

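[Illustration, not part of the posted patch.] The effect of passing MAX_NUMNODES rather than node id 0 can be sketched with a small userspace model. Everything here is an assumption made for illustration: `struct toy_range`, `toy_map` and `toy_count_pages()` are hypothetical names, the value of `MAX_NUMNODES` is arbitrary, and the "MAX_NUMNODES matches every node" rule is modelled on how the thread describes the 64-bit path, not copied from kernel source. The PFN ranges are the four early_node_map[] entries from the dump above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMNODES 8 /* toy value; the real one depends on the kernel config */

/* Hypothetical stand-in for one early_node_map[] entry. */
struct toy_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
	int nid;
};

/* The four active PFN ranges from the boot dump: all of them sit on node 1. */
static const struct toy_range toy_map[4] = {
	{ 0x00000010UL, 0x00000099UL, 1 },
	{ 0x00000100UL, 0x0007da00UL, 1 },
	{ 0x0007e800UL, 0x0007ffa0UL, 1 },
	{ 0x0007ffaeUL, 0x0007ffb0UL, 1 },
};

/*
 * Count the pages in every range that belongs to nid.
 * Passing MAX_NUMNODES acts as a wildcard and selects all ranges.
 */
static unsigned long toy_count_pages(const struct toy_range *map, size_t n, int nid)
{
	unsigned long pages = 0;
	size_t i;

	assert(map != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (nid != MAX_NUMNODES && map[i].nid != nid)
			continue; /* range belongs to another node */
		pages += map[i].end_pfn - map[i].start_pfn;
	}
	return pages;
}
```

In this model `toy_count_pages(toy_map, 4, 0)` finds nothing, which matches the "(0 free memory ranges)" line in the dump, while `toy_count_pages(toy_map, 4, MAX_NUMNODES)` covers all four node 1 ranges.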
* [PATCH -v4 2/2] bootmem, x86: Fix 32bit numa system without RAM on Node0
  2010-04-01 3:44 ` [PATCH -v4 1/2] nobootmem, " Yinghai Lu
@ 2010-04-01 3:45 ` Yinghai Lu
  2010-04-01 22:57 ` [tip:x86/urgent] bootmem, x86: Fix 32bit numa system without RAM on node 0 tip-bot for Yinghai Lu
  2010-04-01 22:57 ` [tip:x86/urgent] nobootmem, " tip-bot for Yinghai Lu
  1 sibling, 1 reply; 44+ messages in thread
From: Yinghai Lu @ 2010-04-01 3:45 UTC
To: H. Peter Anvin, Ingo Molnar, Thomas Gleixner
Cc: linux-kernel, Johannes Weiner, Andrew Morton

When 32-bit NUMA is used, free_all_bootmem() still only covers
node id 0.

If node 0 doesn't have RAM installed, we need to go with node 1,
because early_node_map[] still uses nid 1 for all ranges, and RAM from
node 1 becomes low RAM.

This patch fixes the BOOTMEM path by looping over bdata_list.

-v3: add more comments, and fix bootmem path too.
-v4: separate from one big patch

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 mm/bootmem.c |    8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

Index: linux-2.6/mm/bootmem.c
===================================================================
--- linux-2.6.orig/mm/bootmem.c
+++ linux-2.6/mm/bootmem.c
@@ -312,7 +312,13 @@ unsigned long __init free_all_bootmem(vo
 	 */
 	return free_all_memory_core_early(MAX_NUMNODES);
 #else
-	return free_all_bootmem_core(NODE_DATA(0)->bdata);
+	unsigned long total_pages = 0;
+	bootmem_data_t *bdata;
+
+	list_for_each_entry(bdata, &bdata_list, list)
+		total_pages += free_all_bootmem_core(bdata);
+
+	return total_pages;
 #endif
 }

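[Illustration, not part of the posted patch.] The difference between releasing only node 0 and walking every registered node can be sketched in userspace C. This is a toy model under stated assumptions: `struct toy_bdata`, `toy_free_core()`, `toy_free_node0_only()` and `toy_free_all()` are hypothetical stand-ins for `bootmem_data_t`, `free_all_bootmem_core()` and the old and patched bodies of `free_all_bootmem()`, and a plain `next` pointer replaces the kernel's `list_for_each_entry()` walk over `bdata_list`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for bootmem_data_t: one entry per node. */
struct toy_bdata {
	unsigned long free_pages; /* pages this node can hand to the page allocator */
	struct toy_bdata *next;   /* next node on the list, NULL at the end */
};

/* Release one node's pages and report how many were released. */
static unsigned long toy_free_core(struct toy_bdata *bdata)
{
	unsigned long released;

	assert(bdata != NULL);
	released = bdata->free_pages;
	bdata->free_pages = 0;
	return released;
}

/* Old behaviour: only node 0 is considered. */
static unsigned long toy_free_node0_only(struct toy_bdata *node0)
{
	return toy_free_core(node0);
}

/* Patched behaviour: walk every node and accumulate the totals. */
static unsigned long toy_free_all(struct toy_bdata *head)
{
	unsigned long total_pages = 0;
	struct toy_bdata *bdata;

	for (bdata = head; bdata != NULL; bdata = bdata->next)
		total_pages += toy_free_core(bdata); /* "+=", so no node's count is lost */

	return total_pages;
}
```

With an empty node 0 followed by a populated node 1, `toy_free_node0_only()` returns 0 while `toy_free_all()` returns node 1's page count. The accumulation matters as well: with plain assignment in the loop, only the last node's count would be returned.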
* [tip:x86/urgent] bootmem, x86: Fix 32bit numa system without RAM on node 0
  2010-04-01 3:45 ` [PATCH -v4 2/2] bootmem, " Yinghai Lu
@ 2010-04-01 22:57 ` tip-bot for Yinghai Lu
  0 siblings, 0 replies; 44+ messages in thread
From: tip-bot for Yinghai Lu @ 2010-04-01 22:57 UTC
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, yinghai, tglx

Commit-ID:  aa235fc712f379d4194cff9217f07026c452c141
Gitweb:     http://git.kernel.org/tip/aa235fc712f379d4194cff9217f07026c452c141
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Wed, 31 Mar 2010 20:45:27 -0700
Committer:  H. Peter Anvin <hpa@zytor.com>
CommitDate: Thu, 1 Apr 2010 14:41:19 -0700

bootmem, x86: Fix 32bit numa system without RAM on node 0

When 32-bit NUMA is used, free_all_bootmem() still only covers
node id 0.

If node 0 doesn't have RAM installed, the lowest populated node
becomes low RAM. This fixes the BOOTMEM path by iterating over
bdata_list.

-v3: add more comments, and fix bootmem path too.
-v4: separate from one big patch

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4BB416D7.6090203@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
 mm/bootmem.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 2058cb7..ba37d62 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -312,7 +312,13 @@ unsigned long __init free_all_bootmem(void)
 	 */
 	return free_all_memory_core_early(MAX_NUMNODES);
 #else
-	return free_all_bootmem_core(NODE_DATA(0)->bdata);
+	unsigned long total_pages = 0;
+	bootmem_data_t *bdata;
+
+	list_for_each_entry(bdata, &bdata_list, list)
+		total_pages += free_all_bootmem_core(bdata);
+
+	return total_pages;
 #endif
 }

* [tip:x86/urgent] nobootmem, x86: Fix 32bit numa system without RAM on node 0
  2010-04-01 3:44 ` [PATCH -v4 1/2] nobootmem, " Yinghai Lu
  2010-04-01 3:45 ` [PATCH -v4 2/2] bootmem, " Yinghai Lu
@ 2010-04-01 22:57 ` tip-bot for Yinghai Lu
  1 sibling, 0 replies; 44+ messages in thread
From: tip-bot for Yinghai Lu @ 2010-04-01 22:57 UTC
To: linux-tip-commits
Cc: linux-kernel, hpa, mingo, yinghai, tglx

Commit-ID:  337998587f802535896e9ed16d19f97915ccd368
Gitweb:     http://git.kernel.org/tip/337998587f802535896e9ed16d19f97915ccd368
Author:     Yinghai Lu <yinghai@kernel.org>
AuthorDate: Wed, 31 Mar 2010 20:44:09 -0700
Committer:  H. Peter Anvin <hpa@zytor.com>
CommitDate: Thu, 1 Apr 2010 14:39:29 -0700

nobootmem, x86: Fix 32bit numa system without RAM on node 0

On one system without RAM on node 0, we got the following boot dump
with a 32-bit NUMA kernel:

early_node_map[4] active PFN ranges
    1: 0x00000010 -> 0x00000099
    1: 0x00000100 -> 0x0007da00
    1: 0x0007e800 -> 0x0007ffa0
    1: 0x0007ffae -> 0x0007ffb0
...
Subtract (29 early reservations)
  #000 [0000001000 - 0000002000]
  #001 [0000089000 - 000008f000]
  #002 [0000091000 - 0000093500]
...
  #027 [007cbfef40 - 007e800000]
  #028 [007e9ca000 - 007ff95000]
(0 free memory ranges)
Initializing HighMem for node 0 (00000000:00000000)
Initializing HighMem for node 1 (00000000:00000000)
Memory: 0k/2096832k available (6662k kernel code, 2096300k reserved, 4829k data, 484k init, 0k highmem)
...
Checking if this processor honours the WP bit even in supervisor mode...Ok.
swapper: page allocation failure. order:0, mode:0x0
Pid: 0, comm: swapper Not tainted 2.6.34-rc3-tip-03818-g4b1ea6c-dirty #35
Call Trace:
 [<4087a5dc>] ? printk+0xf/0x11
 [<40286728>] __alloc_pages_nodemask+0x417/0x487
 [<402a9ce1>] new_slab+0xe2/0x1fe
 [<402aa5b2>] kmem_cache_open+0x185/0x358
 [<402abbc0>] T.954+0x1c/0x60
 [<40d52a29>] kmem_cache_init+0x24/0x113
 [<40d39738>] start_kernel+0x166/0x2e4
 [<40d3940e>] ? unknown_bootoption+0x0/0x18e
 [<40d390ce>] i386_start_kernel+0xce/0xd5
Mem-Info:
Node 1 DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
Node 1 Normal per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
active_anon:0 inactive_anon:0 isolated_anon:0
 active_file:0 inactive_file:0 isolated_file:0
 unevictable:0 dirty:0 writeback:0 unstable:0
 free:0 slab_reclaimable:0 slab_unreclaimable:0
 mapped:0 shmem:0 pagetables:0 bounce:0

When 32-bit NUMA is used, free_all_bootmem() still only covers
node id 0.

If node 0 doesn't have RAM installed, we need to go with node 1,
because early_node_map[] still uses nid 1 for all ranges, and RAM from
node 1 becomes low RAM.

Use MAX_NUMNODES like 64-bit NUMA does.

Note: the BOOTMEM path has the same problem. This bug existed before
we had NO_BOOTMEM support.

-v3: add more comments, and fix bootmem path too.
-v4: separate bootmem path fix

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
LKML-Reference: <4BB41689.9090502@kernel.org>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
---
 mm/bootmem.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/mm/bootmem.c b/mm/bootmem.c
index 9b13446..2058cb7 100644
--- a/mm/bootmem.c
+++ b/mm/bootmem.c
@@ -303,7 +303,14 @@ unsigned long __init free_all_bootmem_node(pg_data_t *pgdat)
 unsigned long __init free_all_bootmem(void)
 {
 #ifdef CONFIG_NO_BOOTMEM
-	return free_all_memory_core_early(NODE_DATA(0)->node_id);
+	/*
+	 * We need to use MAX_NUMNODES instead of NODE_DATA(0)->node_id
+	 * because in some case like Node0 doesnt have RAM installed
+	 * low ram will be on Node1
+	 * Use MAX_NUMNODES will make sure all ranges in early_node_map[]
+	 * will be used instead of only Node0 related
+	 */
+	return free_all_memory_core_early(MAX_NUMNODES);
 #else
 	return free_all_bootmem_core(NODE_DATA(0)->bdata);
 #endif

* Re: Config NO_BOOTMEM breaks my amd64 box
  2010-03-31 4:49 Config NO_BOOTMEM breaks my amd64 box James Morris
  2010-03-31 6:26 ` H. Peter Anvin
@ 2010-03-31 10:51 ` Stefan Richter
  1 sibling, 0 replies; 44+ messages in thread
From: Stefan Richter @ 2010-03-31 10:51 UTC
To: James Morris
Cc: Ingo Molnar, H. Peter Anvin, Yinghai Lu, linux-kernel, airlied

James Morris wrote:
> Also, the help text for the item makes little sense to a non-expert in
> this area:

I too noticed this absolutely catastrophic "help" text but forgot to
send a bug report. Either this option can be explained and the text
fixed, or it cannot be explained and shouldn't be an option in the
first place.
-- 
Stefan Richter
-=====-==-=- --== =====
http://arcgraph.de/sr/

end of thread, other threads:[~2010-04-09 2:44 UTC | newest]

Thread overview: 44+ messages
2010-03-31 4:49 Config NO_BOOTMEM breaks my amd64 box James Morris
2010-03-31 6:26 ` H. Peter Anvin
2010-03-31 6:47 ` James Morris
2010-03-31 16:25 ` Yinghai Lu
2010-03-31 18:59 ` Ingo Molnar
2010-03-31 20:57 ` Dave Airlie
2010-03-31 21:02 ` Linus Torvalds
2010-03-31 21:40 ` Ingo Molnar
2010-03-31 21:47 ` Ingo Molnar
2010-03-31 21:14 ` Dave Airlie
2010-03-31 22:02 ` Yinghai Lu
2010-03-31 22:28 ` H. Peter Anvin
2010-03-31 22:58 ` James Morris
2010-03-31 23:02 ` Ingo Molnar
2010-03-31 23:35 ` H. Peter Anvin
2010-03-31 23:43 ` James Morris
2010-03-31 23:48 ` H. Peter Anvin
2010-04-01 1:00 ` James Morris
2010-04-01 12:52 ` Ingo Molnar
2010-04-08 6:32 ` Ingo Molnar
2010-04-08 7:00 ` Yinghai
2010-04-08 7:27 ` Ingo Molnar
2010-04-09 2:43 ` Dave Airlie
2010-04-08 8:05 ` James Morris
2010-04-08 8:22 ` Ingo Molnar
2010-03-31 22:05 ` Yinghai Lu
2010-03-31 22:13 ` Ingo Molnar
2010-03-31 22:16 ` Yinghai Lu
2010-03-31 22:41 ` Ingo Molnar
2010-03-31 22:47 ` Yinghai Lu
2010-03-31 22:56 ` Ingo Molnar
2010-04-01 0:01 ` Johannes Weiner
2010-03-31 23:34 ` H. Peter Anvin
2010-03-31 23:54 ` Yinghai Lu
2010-04-01 0:35 ` H. Peter Anvin
2010-04-01 1:07 ` Yinghai Lu
2010-04-01 2:02 ` [PATCH -v3] nobootmem/bootmem, x86: Fix 32bit numa system without RAM on Node0 Yinghai Lu
2010-04-01 3:18 ` H. Peter Anvin
2010-04-01 3:30 ` Yinghai Lu
2010-04-01 3:44 ` [PATCH -v4 1/2] nobootmem, " Yinghai Lu
2010-04-01 3:45 ` [PATCH -v4 2/2] bootmem, " Yinghai Lu
2010-04-01 22:57 ` [tip:x86/urgent] bootmem, x86: Fix 32bit numa system without RAM on node 0 tip-bot for Yinghai Lu
2010-04-01 22:57 ` [tip:x86/urgent] nobootmem, " tip-bot for Yinghai Lu
2010-03-31 10:51 ` Config NO_BOOTMEM breaks my amd64 box Stefan Richter