* Re: + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree [not found] <200812250806.mBP86LPQ018399@imap1.linux-foundation.org> @ 2009-02-08 20:47 ` Johannes Weiner 2009-02-09 0:31 ` Benjamin Herrenschmidt 2009-02-10 11:35 ` Chandru 0 siblings, 2 replies; 8+ messages in thread From: Johannes Weiner @ 2009-02-08 20:47 UTC (permalink / raw) To: linux-kernel; +Cc: mm-commits, chandru, benh, chandru, paulus, akpm On Thu, Dec 25, 2008 at 12:06:21AM -0800, akpm@linux-foundation.org wrote: > > The patch titled > powerpc: fix code for reserved memory spanning across nodes > has been added to the -mm tree. Its filename is > powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch > > Before you just go and hit "reply", please: > a) Consider who else should be cc'ed > b) Prefer to cc a suitable mailing list as well > c) Ideally: find the original patch on the mailing list and do a > reply-to-all to that, adding suitable additional cc's > > *** Remember to use Documentation/SubmitChecklist when testing your code *** > > See http://userweb.kernel.org/~akpm/stuff/added-to-mm.txt to find > out what to do about this > > The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/ > > ------------------------------------------------------ > Subject: powerpc: fix code for reserved memory spanning across nodes > From: Chandru <chandru@in.ibm.com> > > When booted with crashkernel=224M@32M or any memory size less than this, > the system boots properly. The following was the observation.. The > system comes up with two nodes (0-256M and 256M-4GB). The crashkernel > memory reservation spans across these two nodes. The > mark_reserved_regions_for_nid() in arch/powerpc/mm/numa.c resizes the > reserved part of the memory within it as: > > if (end_pfn > node_ar.end_pfn) > reserve_size = (node_ar.end_pfn << PAGE_SHIFT) > - (start_pfn << PAGE_SHIFT); > > > but the reserve_bootmem_node() in mm/bootmem.c raises the pfn value of end > > end = PFN_UP(physaddr + size); > > This causes end to get a value past the last page in the 0-256M node. > Again when reserve_bootmem_node() returns, mark_reserved_regions_for_nid() > loops around to set the rest of the crashkernel memory in the next node as > reserved. It references NODE_DATA(node_ar.nid) and this causes another > 'Oops: kernel access of bad area' problem. The following changes made the > system to boot with any amount of crashkernel memory size. > > Signed-off-by: Chandru S <chandru@linux.vnet.ibm.com> > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Cc: Paul Mackerras <paulus@samba.org> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > --- > > arch/powerpc/mm/numa.c | 7 ++++--- > mm/bootmem.c | 4 ++++ > 2 files changed, 8 insertions(+), 3 deletions(-) > > diff -puN arch/powerpc/mm/numa.c~powerpc-fix-code-for-reserved-memory-spanning-across-nodes arch/powerpc/mm/numa.c > --- a/arch/powerpc/mm/numa.c~powerpc-fix-code-for-reserved-memory-spanning-across-nodes > +++ a/arch/powerpc/mm/numa.c > @@ -995,10 +995,11 @@ void __init do_init_bootmem(void) > start_pfn, end_pfn); > > free_bootmem_with_active_regions(nid, end_pfn); > + } > + > + for_each_online_node(nid) { > /* > - * Be very careful about moving this around. Future > - * calls to careful_allocation() depend on this getting > - * done correctly. > + * Be very careful about moving this around. > */ > mark_reserved_regions_for_nid(nid); > sparse_memory_present_with_active_regions(nid); > diff -puN mm/bootmem.c~powerpc-fix-code-for-reserved-memory-spanning-across-nodes mm/bootmem.c > --- a/mm/bootmem.c~powerpc-fix-code-for-reserved-memory-spanning-across-nodes > +++ a/mm/bootmem.c > @@ -375,10 +375,14 @@ int __init reserve_bootmem_node(pg_data_ > unsigned long size, int flags) > { > unsigned long start, end; > + bootmem_data_t *bdata = pgdat->bdata; > > start = PFN_DOWN(physaddr); > end = PFN_UP(physaddr + size); > > + if (end > bdata->node_low_pfn) > + end = bdata->node_low_pfn; Chandru, from your patch descriptions I read two reasons why your bootup fails: 1. Rounding a contained range so that it exceeds the node This is impossible. Either your range includes that page already or it doesn't. bootmem doesn't reserve more pages than requested, it just rounds up to full pages if your range includes partials. Your size must already overflow the node boundary if the end pfn is too high. 2. Node-spanning range If you have ranges that span nodes you are using the wrong interface. By passing a node descriptor to reserve_bootmem_node(), you expect the specified range to be on that node. If it is not, please use the node-agnostic reserve_bootmem() interface. The above patch brings a kludge back to bootmem that silently cuts bogus ranges. I don't agree, these bugs should be fixed, not worked around. What I can spot in the callsite is the following: unsigned long physbase = lmb.reserved.region[i].base; unsigned long size = lmb.reserved.region[i].size; unsigned long start_pfn = physbase >> PAGE_SHIFT; unsigned long end_pfn = ((physbase + size) >> PAGE_SHIFT); When physbase + size is not a multiple of PAGE_SIZE, the overlap is cut off and end_pfn is too low. if (end_pfn > node_ar.end_pfn) reserve_size = (node_ar.end_pfn << PAGE_SHIFT) - (start_pfn << PAGE_SHIFT); The test could then fail even if the size should have been trimmed. Since the overlap must be less than PAGE_SIZE for this to happen, this would explain what you noticed: the bootmem range exceeds the node boundary by exactly one PFN because of the rounding (if I got you right). Patch attached, can not test, sorry, eat carefully. Hannes PS: Andrew, I wasn't CC'd but this patch was listed in this `patches that might be from hannes@etcppp'-list from another mm-commit message. It's magical! --- From: Johannes Weiner <hannes@cmpxchg.org> Subject: powerpc: fix rounding error in teaching bootmem about LMB If the reserved LMB does not exactly span complete pages, treating (start + size) >> PAGE_SHIFT as the ending PFN is an off by one error. The subsequent check for whether the region needs to be trimmed to fit the underlying node can now fail if the range exceeds the node by 1 to PAGE_SIZE - 1 bytes. The excessive range is then passed to bootmem which BUG()s out on it correctly. Fix up the rounding to include all pages the LMB spans, even partial ones. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> --- diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index 7393bd7..38016a2 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -12,6 +12,7 @@ #include <linux/bootmem.h> #include <linux/init.h> #include <linux/mm.h> +#include <linux/pfn.h> #include <linux/mmzone.h> #include <linux/module.h> #include <linux/nodemask.h> @@ -881,8 +882,8 @@ static void mark_reserved_regions_for_nid(int nid) for (i = 0; i < lmb.reserved.cnt; i++) { unsigned long physbase = lmb.reserved.region[i].base; unsigned long size = lmb.reserved.region[i].size; - unsigned long start_pfn = physbase >> PAGE_SHIFT; - unsigned long end_pfn = ((physbase + size) >> PAGE_SHIFT); + unsigned long start_pfn = PFN_DOWN(physbase); + unsigned long end_pfn = PFN_UP(physbase + size); struct node_active_region node_ar; unsigned long node_end_pfn = node->node_start_pfn + node->node_spanned_pages; ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree 2009-02-08 20:47 ` + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree Johannes Weiner @ 2009-02-09 0:31 ` Benjamin Herrenschmidt 2009-02-10 17:51 ` Dave Hansen 2009-02-10 11:35 ` Chandru 1 sibling, 1 reply; 8+ messages in thread From: Benjamin Herrenschmidt @ 2009-02-09 0:31 UTC (permalink / raw) To: Dave Hansen Cc: linux-kernel, mm-commits, chandru, chandru, paulus, akpm, Johannes Weiner Dave, you've had your hands in that code more than I did lately, do that look ok to you ? (BTW. On another note, do you still have pending bug fixes that didn't got in .29 ?) Cheers, Ben. On Sun, 2009-02-08 at 21:47 +0100, Johannes Weiner wrote: > On Thu, Dec 25, 2008 at 12:06:21AM -0800, akpm@linux-foundation.org wrote: > > > > The patch titled > > powerpc: fix code for reserved memory spanning across nodes > > has been added to the -mm tree. Its filename is > > powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch > > > > Before you just go and hit "reply", please: > > a) Consider who else should be cc'ed > > b) Prefer to cc a suitable mailing list as well > > c) Ideally: find the original patch on the mailing list and do a > > reply-to-all to that, adding suitable additional cc's > > > > *** Remember to use Documentation/SubmitChecklist when testing your code *** > > > > See http://userweb.kernel.org/~akpm/stuff/added-to-mm.txt to find > > out what to do about this > > > > The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/ > > > > ------------------------------------------------------ > > Subject: powerpc: fix code for reserved memory spanning across nodes > > From: Chandru <chandru@in.ibm.com> > > > > When booted with crashkernel=224M@32M or any memory size less than this, > > the system boots properly. The following was the observation.. The > > system comes up with two nodes (0-256M and 256M-4GB). The crashkernel > > memory reservation spans across these two nodes. The > > mark_reserved_regions_for_nid() in arch/powerpc/mm/numa.c resizes the > > reserved part of the memory within it as: > > > > if (end_pfn > node_ar.end_pfn) > > reserve_size = (node_ar.end_pfn << PAGE_SHIFT) > > - (start_pfn << PAGE_SHIFT); > > > > > > but the reserve_bootmem_node() in mm/bootmem.c raises the pfn value of end > > > > end = PFN_UP(physaddr + size); > > > > This causes end to get a value past the last page in the 0-256M node. > > Again when reserve_bootmem_node() returns, mark_reserved_regions_for_nid() > > loops around to set the rest of the crashkernel memory in the next node as > > reserved. It references NODE_DATA(node_ar.nid) and this causes another > > 'Oops: kernel access of bad area' problem. The following changes made the > > system to boot with any amount of crashkernel memory size. > > > > Signed-off-by: Chandru S <chandru@linux.vnet.ibm.com> > > Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> > > Cc: Paul Mackerras <paulus@samba.org> > > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > > --- > > > > arch/powerpc/mm/numa.c | 7 ++++--- > > mm/bootmem.c | 4 ++++ > > 2 files changed, 8 insertions(+), 3 deletions(-) > > > > diff -puN arch/powerpc/mm/numa.c~powerpc-fix-code-for-reserved-memory-spanning-across-nodes arch/powerpc/mm/numa.c > > --- a/arch/powerpc/mm/numa.c~powerpc-fix-code-for-reserved-memory-spanning-across-nodes > > +++ a/arch/powerpc/mm/numa.c > > @@ -995,10 +995,11 @@ void __init do_init_bootmem(void) > > start_pfn, end_pfn); > > > > free_bootmem_with_active_regions(nid, end_pfn); > > + } > > + > > + for_each_online_node(nid) { > > /* > > - * Be very careful about moving this around. Future > > - * calls to careful_allocation() depend on this getting > > - * done correctly. > > + * Be very careful about moving this around. > > */ > > mark_reserved_regions_for_nid(nid); > > sparse_memory_present_with_active_regions(nid); > > diff -puN mm/bootmem.c~powerpc-fix-code-for-reserved-memory-spanning-across-nodes mm/bootmem.c > > --- a/mm/bootmem.c~powerpc-fix-code-for-reserved-memory-spanning-across-nodes > > +++ a/mm/bootmem.c > > @@ -375,10 +375,14 @@ int __init reserve_bootmem_node(pg_data_ > > unsigned long size, int flags) > > { > > unsigned long start, end; > > + bootmem_data_t *bdata = pgdat->bdata; > > > > start = PFN_DOWN(physaddr); > > end = PFN_UP(physaddr + size); > > > > + if (end > bdata->node_low_pfn) > > + end = bdata->node_low_pfn; > > Chandru, > > from your patch descriptions I read two reasons why your bootup fails: > > 1. Rounding a contained range so that it exceeds the node > > This is impossible. Either your range includes that page already or > it doesn't. bootmem doesn't reserve more pages than requested, it > just rounds up to full pages if your range includes partials. Your > size must already overflow the node boundary if the end pfn is too > high. > > 2. Node-spanning range > > If you have ranges that span nodes you are using the wrong interface. > By passing a node descriptor to reserve_bootmem_node(), you expect the > specified range to be on that node. If it is not, please use the > node-agnostic reserve_bootmem() interface. > > The above patch brings a kludge back to bootmem that silently cuts > bogus ranges. I don't agree, these bugs should be fixed, not worked > around. > > What I can spot in the callsite is the following: > > unsigned long physbase = lmb.reserved.region[i].base; > unsigned long size = lmb.reserved.region[i].size; > unsigned long start_pfn = physbase >> PAGE_SHIFT; > unsigned long end_pfn = ((physbase + size) >> PAGE_SHIFT); > > When physbase + size is not a multiple of PAGE_SIZE, the overlap is > cut off and end_pfn is too low. > > if (end_pfn > node_ar.end_pfn) > reserve_size = (node_ar.end_pfn << PAGE_SHIFT) > - (start_pfn << PAGE_SHIFT); > > The test could then fail even if the size should have been trimmed. > Since the overlap must be less than PAGE_SIZE for this to happen, this > would explain what you noticed: the bootmem range exceeds the node > boundary by exactly one PFN because of the rounding (if I got you > right). > > Patch attached, can not test, sorry, eat carefully. > > Hannes > > PS: Andrew, I wasn't CC'd but this patch was listed in this `patches > that might be from hannes@etcppp'-list from another mm-commit message. > It's magical! > > --- > From: Johannes Weiner <hannes@cmpxchg.org> > Subject: powerpc: fix rounding error in teaching bootmem about LMB > > If the reserved LMB does not exactly span complete pages, treating > (start + size) >> PAGE_SHIFT as the ending PFN is an off by one error. > > The subsequent check for whether the region needs to be trimmed to fit > the underlying node can now fail if the range exceeds the node by 1 to > PAGE_SIZE - 1 bytes. The excessive range is then passed to bootmem > which BUG()s out on it correctly. > > Fix up the rounding to include all pages the LMB spans, even partial > ones. > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> > --- > > diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c > index 7393bd7..38016a2 100644 > --- a/arch/powerpc/mm/numa.c > +++ b/arch/powerpc/mm/numa.c > @@ -12,6 +12,7 @@ > #include <linux/bootmem.h> > #include <linux/init.h> > #include <linux/mm.h> > +#include <linux/pfn.h> > #include <linux/mmzone.h> > #include <linux/module.h> > #include <linux/nodemask.h> > @@ -881,8 +882,8 @@ static void mark_reserved_regions_for_nid(int nid) > for (i = 0; i < lmb.reserved.cnt; i++) { > unsigned long physbase = lmb.reserved.region[i].base; > unsigned long size = lmb.reserved.region[i].size; > - unsigned long start_pfn = physbase >> PAGE_SHIFT; > - unsigned long end_pfn = ((physbase + size) >> PAGE_SHIFT); > + unsigned long start_pfn = PFN_DOWN(physbase); > + unsigned long end_pfn = PFN_UP(physbase + size); > struct node_active_region node_ar; > unsigned long node_end_pfn = node->node_start_pfn + > node->node_spanned_pages; > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree 2009-02-09 0:31 ` Benjamin Herrenschmidt @ 2009-02-10 17:51 ` Dave Hansen 2009-02-10 18:27 ` Chandru 0 siblings, 1 reply; 8+ messages in thread From: Dave Hansen @ 2009-02-10 17:51 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: linux-kernel, mm-commits, chandru, chandru, paulus, akpm, Johannes Weiner, Jon M. Tollefson [imap] On Mon, 2009-02-09 at 11:31 +1100, Benjamin Herrenschmidt wrote: > Dave, you've had your hands in that code more than I did lately, do > that > look ok to you ? I still think that patch is bogus. I don't know how it is still around. I described the problem pretty completely here: http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4a6186696e7f15b3ea4dafcdb64ee0703e0e4487 Chandru effectively reverted my bugfix from that patch in this one. > (BTW. On another note, do you still have pending bug fixes that didn't > got in .29 ?) No, I don't think so. There were some distro fixes but I couldn't reproduce the bugs on mainline. -- Dave ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree 2009-02-10 17:51 ` Dave Hansen @ 2009-02-10 18:27 ` Chandru 2009-02-13 11:01 ` Chandru 0 siblings, 1 reply; 8+ messages in thread From: Chandru @ 2009-02-10 18:27 UTC (permalink / raw) To: Dave Hansen Cc: Benjamin Herrenschmidt, linux-kernel, mm-commits, chandru, paulus, akpm, Johannes Weiner, Jon M. Tollefson [imap] Dave Hansen wrote: > On Mon, 2009-02-09 at 11:31 +1100, Benjamin Herrenschmidt wrote: > >> Dave, you've had your hands in that code more than I did lately, do >> that >> look ok to you ? >> > > I still think that patch is bogus. I don't know how it is still around. > I described the problem pretty completely here: > > http://git.kernel.org/gitweb.cgi?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=4a6186696e7f15b3ea4dafcdb64ee0703e0e4487 > > Chandru effectively reverted my bugfix from that patch in this one. > > >> (BTW. On another note, do you still have pending bug fixes that didn't >> got in .29 ?) >> > > No, I don't think so. There were some distro fixes but I couldn't > reproduce the bugs on mainline. > > -- Dave > > Dave, If I remember, the first mail that I sent on this subject said 2.6.28-rc9 panics. Unless something has changed in the way lmb's are added or are reserved, this should be also reproducible with the mainline kernel on the machines that had exhibited this problem earlier. I will verify your patch on these machines again and let you know the results. Thanks, Chandru ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree 2009-02-10 18:27 ` Chandru @ 2009-02-13 11:01 ` Chandru 0 siblings, 0 replies; 8+ messages in thread From: Chandru @ 2009-02-13 11:01 UTC (permalink / raw) To: Dave Hansen Cc: Benjamin Herrenschmidt, linux-kernel, mm-commits, chandru, paulus, akpm, Johannes Weiner, Jon M. Tollefson [imap] Chandru wrote: > Dave Hansen wrote: >> On Mon, 2009-02-09 at 11:31 +1100, Benjamin Herrenschmidt wrote: >> >> >>> (BTW. On another note, do you still have pending bug fixes that didn't >>> got in .29 ?) >>> >> >> No, I don't think so. There were some distro fixes but I couldn't >> reproduce the bugs on mainline. >> >> -- Dave >> >> > Dave, > > If I remember, the first mail that I sent on this subject said > 2.6.28-rc9 panics. Unless something has changed in the way > lmb's are added or are reserved, this should be also > reproducible with the mainline kernel on the machines that > had exhibited this problem earlier. I will verify your patch > on these machines again and let you know the results. > > Thanks, > Chandru This still occurs with 2.6.29.rc4 on one of the machines that had shown this problem earlier and Dave's patch solves the issue here. The system comes up with two nodes... <snip> NUMA associativity depth for CPU/Memory: 3 Node 0 Memory: 0x0-0x40000000 Node 1 Memory: 0x40000000-0x80000000 I had to increase the crashkernel memory size to beyond 1G and hence used crashkernel=1280M@32M. This caused the kernel to hit the bug as with earlier kernels. -Chandru ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree 2009-02-08 20:47 ` + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree Johannes Weiner 2009-02-09 0:31 ` Benjamin Herrenschmidt @ 2009-02-10 11:35 ` Chandru 2009-02-10 17:59 ` Johannes Weiner 1 sibling, 1 reply; 8+ messages in thread From: Chandru @ 2009-02-10 11:35 UTC (permalink / raw) To: Johannes Weiner; +Cc: linux-kernel, mm-commits, benh, chandru, paulus, akpm Johannes Weiner wrote: > --- > From: Johannes Weiner <hannes@cmpxchg.org> > Subject: powerpc: fix rounding error in teaching bootmem about LMB > > If the reserved LMB does not exactly span complete pages, treating > (start + size) >> PAGE_SHIFT as the ending PFN is an off by one error. > > The subsequent check for whether the region needs to be trimmed to fit > the underlying node can now fail if the range exceeds the node by 1 to > PAGE_SIZE - 1 bytes. The excessive range is then passed to bootmem > which BUG()s out on it correctly. > > Fix up the rounding to include all pages the LMB spans, even partial > ones. > > Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> > --- Hello Hannes, Dave Hansen gave a fix for this problem which looks similar to the changes that you have made here , but has an additional one line change too. Here is the patch ( probably may not apply cleanly to the latest kernel ) . I don' t know if Dave submitted this patch to lkml for it's inclusion into the latest tree. Thanks for looking into this issue. We may also have to remove the powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch from the -mm tree. Thanks, Chandru ================= Snippet from Dave's patch without change log --- linux-2.6.git-dave/arch/powerpc/mm/numa.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff -puN arch/powerpc/mm/numa.c~reserve-over-fix arch/powerpc/mm/numa.c --- linux-2.6.git/arch/powerpc/mm/numa.c~reserve-over-fix 2009-01-26 10:17:20.000000000 -0800 +++ linux-2.6.git-dave/arch/powerpc/mm/numa.c 2009-01-26 10:17:30.000000000 -0800 @@ -19,6 +19,7 @@ #include <linux/notifier.h> #include <linux/lmb.h> #include <linux/of.h> +#include <linux/pfn.h> #include <asm/sparsemem.h> #include <asm/prom.h> #include <asm/system.h> @@ -882,7 +883,7 @@ static void mark_reserved_regions_for_ni unsigned long physbase = lmb.reserved.region[i].base; unsigned long size = lmb.reserved.region[i].size; unsigned long start_pfn = physbase >> PAGE_SHIFT; - unsigned long end_pfn = ((physbase + size) >> PAGE_SHIFT); + unsigned long end_pfn = PFN_UP(physbase + size); struct node_active_region node_ar; unsigned long node_end_pfn = node->node_start_pfn + node->node_spanned_pages; @@ -908,7 +909,7 @@ static void mark_reserved_regions_for_ni */ if (end_pfn > node_ar.end_pfn) reserve_size = (node_ar.end_pfn << PAGE_SHIFT) - - (start_pfn << PAGE_SHIFT); + - physbase; /* * Only worry about *this* node, others may not * yet have valid NODE_DATA(). diff -puN arch/powerpc/kernel/prom_init.c~reserve-over-fix arch/powerpc/kernel/prom_init.c ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree 2009-02-10 11:35 ` Chandru @ 2009-02-10 17:59 ` Johannes Weiner 2009-02-13 11:18 ` Chandru 0 siblings, 1 reply; 8+ messages in thread From: Johannes Weiner @ 2009-02-10 17:59 UTC (permalink / raw) To: Chandru; +Cc: linux-kernel, mm-commits, benh, chandru, paulus, akpm On Tue, Feb 10, 2009 at 05:05:45PM +0530, Chandru wrote: > Johannes Weiner wrote: > >--- > >From: Johannes Weiner <hannes@cmpxchg.org> > >Subject: powerpc: fix rounding error in teaching bootmem about LMB > > > >If the reserved LMB does not exactly span complete pages, treating > >(start + size) >> PAGE_SHIFT as the ending PFN is an off by one error. > > > >The subsequent check for whether the region needs to be trimmed to fit > >the underlying node can now fail if the range exceeds the node by 1 to > >PAGE_SIZE - 1 bytes. The excessive range is then passed to bootmem > >which BUG()s out on it correctly. > > > >Fix up the rounding to include all pages the LMB spans, even partial > >ones. > > > >Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> > >--- > > Hello Hannes, > > Dave Hansen gave a fix for this problem which looks similar > to the changes that you have made here , but has an additional > one line change too. Here is the patch ( probably may not apply > cleanly to the latest kernel ) . I don' t know if Dave > submitted this patch to lkml for it's inclusion into the > latest tree. Thanks for looking into this issue. We may also have > to remove the > powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch > from the -mm tree. > > Thanks, > Chandru > > ================= > Snippet from Dave's patch without change log > > --- > > linux-2.6.git-dave/arch/powerpc/mm/numa.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff -puN arch/powerpc/mm/numa.c~reserve-over-fix arch/powerpc/mm/numa.c > --- linux-2.6.git/arch/powerpc/mm/numa.c~reserve-over-fix 2009-01-26 > 10:17:20.000000000 -0800 > +++ linux-2.6.git-dave/arch/powerpc/mm/numa.c 2009-01-26 > 10:17:30.000000000 -0800 > @@ -19,6 +19,7 @@ > #include <linux/notifier.h> > #include <linux/lmb.h> > #include <linux/of.h> > +#include <linux/pfn.h> > #include <asm/sparsemem.h> > #include <asm/prom.h> > #include <asm/system.h> > @@ -882,7 +883,7 @@ static void mark_reserved_regions_for_ni > unsigned long physbase = lmb.reserved.region[i].base; > unsigned long size = lmb.reserved.region[i].size; > unsigned long start_pfn = physbase >> PAGE_SHIFT; > - unsigned long end_pfn = ((physbase + size) >> PAGE_SHIFT); > + unsigned long end_pfn = PFN_UP(physbase + size); > struct node_active_region node_ar; > unsigned long node_end_pfn = node->node_start_pfn + > node->node_spanned_pages; > @@ -908,7 +909,7 @@ static void mark_reserved_regions_for_ni > */ > if (end_pfn > node_ar.end_pfn) > reserve_size = (node_ar.end_pfn << > PAGE_SHIFT) > - - (start_pfn << PAGE_SHIFT); > + - physbase; > /* > * Only worry about *this* node, others may not > * yet have valid NODE_DATA(). > diff -puN arch/powerpc/kernel/prom_init.c~reserve-over-fix > arch/powerpc/kernel/prom_init.c This is fine, too. Subtracting physbase yields a smaller reserve_size if not page aligned (bootmem will round it up again, no problem there) and at the end of the loop, size is not zero. This has no practical impacts, though, as far as I can see. The interesting question, however, is whether this patch actually fixes the problem you encountered? If so, I would be glad if we could drop the workaround we currently have in -mm. Hannes ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree 2009-02-10 17:59 ` Johannes Weiner @ 2009-02-13 11:18 ` Chandru 0 siblings, 0 replies; 8+ messages in thread From: Chandru @ 2009-02-13 11:18 UTC (permalink / raw) To: Johannes Weiner; +Cc: linux-kernel, mm-commits, benh, chandru, paulus, akpm Johannes Weiner wrote: > > This is fine, too. Subtracting physbase yields a smaller reserve_size > if not page aligned (bootmem will round it up again, no problem there) > and at the end of the loop, size is not zero. This has no practical > impacts, though, as far as I can see. > > The interesting question, however, is whether this patch actually > fixes the problem you encountered? If so, I would be glad if we could > drop the workaround we currently have in -mm. > > Hannes > Hello Hannes, Yes, the patch fixes the problem of hitting the BUG_ON() in bootmem. Andrew removed the workaround from the -mm tree. Thanks to you for running your eyes on this issue and arriving at an almost workable patch. thanks, Chandru ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2009-02-13 11:19 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <200812250806.mBP86LPQ018399@imap1.linux-foundation.org>
2009-02-08 20:47 ` + powerpc-fix-code-for-reserved-memory-spanning-across-nodes.patch added to -mm tree Johannes Weiner
2009-02-09 0:31 ` Benjamin Herrenschmidt
2009-02-10 17:51 ` Dave Hansen
2009-02-10 18:27 ` Chandru
2009-02-13 11:01 ` Chandru
2009-02-10 11:35 ` Chandru
2009-02-10 17:59 ` Johannes Weiner
2009-02-13 11:18 ` Chandru
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox