* too much memory for node
@ 2010-01-20 15:34 Ben Dooks
2010-01-20 15:55 ` Russell King - ARM Linux
0 siblings, 1 reply; 5+ messages in thread
From: Ben Dooks @ 2010-01-20 15:34 UTC (permalink / raw)
To: linux-arm-kernel
I've been looking at support for a new machine type where the
current system is using DISCONTIGMEM as the 1G memory map has
4x256M RAM ranges which may not all be filled.
After removing a hack in the machine support file which fills in
its own memory table thus obliteratign the bootloader supplied ATAG_MEM
I found a problem. The board has the full complement of memory and is
being passed a single ATAG_MEM from the bootloader for all 1G.
This causes arm_add_memory to add a membank that has more memory
than the bank.node it is describing and thus the memory setup code
is then failing.
My current hack to get this working is to do a check for the end
of memory being in the same node as the start, as so:
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index c6c57b6..a5f7b21 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -410,6 +410,21 @@ static int __init arm_add_memory(unsigned long start, unsigned long size)
return -EINVAL;
meminfo.nr_banks++;
+
+ if (PHYS_TO_NID(start + size) != bank->node) {
+ /* more memory than a single node can handle */
+
+ unsigned long next = (1024*1024);
+
+ while (PHYS_TO_NID(start + next) == bank->node)
+ next += (1024*1024);
+
+ bank->size = next;
+ printk(KERN_INFO "adding new bank 0x%08x size 0x%08x\n",
+ start + next, size - next);
+ arm_add_memory(start + next, size - next);
+ }
+
return 0;
}
Note, this code uses a search as I couldn't find a good way of calculating
how mucch space was left in a node given a starting address.
Another option is to do similar to sanity_check_meminfo() in the mmu
setup code (we would still need some form of nodemask or similar from
the memory.h). Is NODE_MEM_SIZE_MASK usable if !CONIFG_DISCONTIGMEM
to allow us to avoid the search and still have compilable code?
A third option is to add some code to the specific cpu type to sanitise
the memory info in the machine fixups, but this is getting into hack
teritory.
There is the option of changing the bootloader, but this would required
updating a number of extant boards and would also require the kernel to
at least warn if the problems happens instead of stopping before the
console is started.
--
Ben
Q: What's a light-year?
A: One-third less calories than a regular year.
^ permalink raw reply related [flat|nested] 5+ messages in thread
* too much memory for node
2010-01-20 15:34 too much memory for node Ben Dooks
@ 2010-01-20 15:55 ` Russell King - ARM Linux
2010-01-20 16:05 ` Ben Dooks
0 siblings, 1 reply; 5+ messages in thread
From: Russell King - ARM Linux @ 2010-01-20 15:55 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Jan 20, 2010 at 03:34:11PM +0000, Ben Dooks wrote:
> I've been looking at support for a new machine type where the
> current system is using DISCONTIGMEM as the 1G memory map has
> 4x256M RAM ranges which may not all be filled.
Firstly, use sparsemem, unless you intentionally want to spend more
cycles in the kernel.
> After removing a hack in the machine support file which fills in
> its own memory table thus obliteratign the bootloader supplied ATAG_MEM
> I found a problem. The board has the full complement of memory and is
> being passed a single ATAG_MEM from the bootloader for all 1G.
You're into the classic problem which happens when you don't tackle
these kinds of issues as soon as the first one occurs. If you allow
them to persist for any time, getting rid of them later becomes very
difficult. I suspect that you're rather stuck now.
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index c6c57b6..a5f7b21 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -410,6 +410,21 @@ static int __init arm_add_memory(unsigned long start, unsigned long size)
> return -EINVAL;
>
> meminfo.nr_banks++;
> +
> + if (PHYS_TO_NID(start + size) != bank->node) {
> + /* more memory than a single node can handle */
> +
> + unsigned long next = (1024*1024);
> +
> + while (PHYS_TO_NID(start + next) == bank->node)
> + next += (1024*1024);
> +
> + bank->size = next;
> + printk(KERN_INFO "adding new bank 0x%08x size 0x%08x\n",
> + start + next, size - next);
> + arm_add_memory(start + next, size - next);
> + }
> +
> return 0;
> }
>
> Note, this code uses a search as I couldn't find a good way of calculating
> how mucch space was left in a node given a starting address.
There is no way to do this in the kernel.
> Another option is to do similar to sanity_check_meminfo() in the mmu
> setup code (we would still need some form of nodemask or similar from
> the memory.h). Is NODE_MEM_SIZE_MASK usable if !CONIFG_DISCONTIGMEM
> to allow us to avoid the search and still have compilable code?
NODE_MEM_SIZE_MASK is DISCONTIGMEM specific; it doesn't exist for
flatmem nor sparsemem (where it's not relevant.)
It's also complicated by platforms which compress their physical RAM
space to fit inside the virtual mappings, eg:
* 256MB @ 0x00000000 -> PAGE_OFFSET
* 512MB @ 0x20000000 -> PAGE_OFFSET + 0x10000000
* 256MB @ 0x80000000 -> PAGE_OFFSET + 0x30000000
The only other solution would be for platforms to declare valid areas
of physical memory back to the setup_arch() code, so that it can
verify the parameters passed - but that's an awful lot of effort just
to fix idiotically implemented boot loaders.
^ permalink raw reply [flat|nested] 5+ messages in thread
* too much memory for node
2010-01-20 15:55 ` Russell King - ARM Linux
@ 2010-01-20 16:05 ` Ben Dooks
2010-01-20 16:10 ` Russell King - ARM Linux
0 siblings, 1 reply; 5+ messages in thread
From: Ben Dooks @ 2010-01-20 16:05 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Jan 20, 2010 at 03:55:10PM +0000, Russell King - ARM Linux wrote:
> On Wed, Jan 20, 2010 at 03:34:11PM +0000, Ben Dooks wrote:
> > I've been looking at support for a new machine type where the
> > current system is using DISCONTIGMEM as the 1G memory map has
> > 4x256M RAM ranges which may not all be filled.
>
> Firstly, use sparsemem, unless you intentionally want to spend more
> cycles in the kernel.
Ok, any idea if that will fix the problem in this case?
> > After removing a hack in the machine support file which fills in
> > its own memory table thus obliteratign the bootloader supplied ATAG_MEM
> > I found a problem. The board has the full complement of memory and is
> > being passed a single ATAG_MEM from the bootloader for all 1G.
>
> You're into the classic problem which happens when you don't tackle
> these kinds of issues as soon as the first one occurs. If you allow
> them to persist for any time, getting rid of them later becomes very
> difficult. I suspect that you're rather stuck now.
Unfortunately the bootloader is actually correct in this case, the
version of the chip on this board has a complete 1G of memory, it
just happens there are versions which have holey memory maps depending
on how the DRAM is bonded/wired to the chip.
Will look at SPARSEMEM tomorrow.
> > diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> > index c6c57b6..a5f7b21 100644
> > --- a/arch/arm/kernel/setup.c
> > +++ b/arch/arm/kernel/setup.c
> > @@ -410,6 +410,21 @@ static int __init arm_add_memory(unsigned long start, unsigned long size)
> > return -EINVAL;
> >
> > meminfo.nr_banks++;
> > +
> > + if (PHYS_TO_NID(start + size) != bank->node) {
> > + /* more memory than a single node can handle */
> > +
> > + unsigned long next = (1024*1024);
> > +
> > + while (PHYS_TO_NID(start + next) == bank->node)
> > + next += (1024*1024);
> > +
> > + bank->size = next;
> > + printk(KERN_INFO "adding new bank 0x%08x size 0x%08x\n",
> > + start + next, size - next);
> > + arm_add_memory(start + next, size - next);
> > + }
> > +
> > return 0;
> > }
> >
> > Note, this code uses a search as I couldn't find a good way of calculating
> > how mucch space was left in a node given a starting address.
>
> There is no way to do this in the kernel.
>
> > Another option is to do similar to sanity_check_meminfo() in the mmu
> > setup code (we would still need some form of nodemask or similar from
> > the memory.h). Is NODE_MEM_SIZE_MASK usable if !CONIFG_DISCONTIGMEM
> > to allow us to avoid the search and still have compilable code?
>
> NODE_MEM_SIZE_MASK is DISCONTIGMEM specific; it doesn't exist for
> flatmem nor sparsemem (where it's not relevant.)
Yeah, my secodn solution had to be ?#ifdef CONFIG_DISCONTIGMEM
> It's also complicated by platforms which compress their physical RAM
> space to fit inside the virtual mappings, eg:
>
> * 256MB @ 0x00000000 -> PAGE_OFFSET
> * 512MB @ 0x20000000 -> PAGE_OFFSET + 0x10000000
> * 256MB @ 0x80000000 -> PAGE_OFFSET + 0x30000000
>
> The only other solution would be for platforms to declare valid areas
> of physical memory back to the setup_arch() code, so that it can
> verify the parameters passed - but that's an awful lot of effort just
> to fix idiotically implemented boot loaders.
thanks for the help.
Should I send a patch to detect bad DISCONTIGMEM setup? I think that
a check of PHYS_TO_PFN((start+end) != PHYS_TO_PFN(start)) should be
safe in all cases?
--
Ben
Q: What's a light-year?
A: One-third less calories than a regular year.
^ permalink raw reply [flat|nested] 5+ messages in thread
* too much memory for node
2010-01-20 16:05 ` Ben Dooks
@ 2010-01-20 16:10 ` Russell King - ARM Linux
2010-01-25 11:58 ` Ben Dooks
0 siblings, 1 reply; 5+ messages in thread
From: Russell King - ARM Linux @ 2010-01-20 16:10 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Jan 20, 2010 at 04:05:25PM +0000, Ben Dooks wrote:
> On Wed, Jan 20, 2010 at 03:55:10PM +0000, Russell King - ARM Linux wrote:
> > On Wed, Jan 20, 2010 at 03:34:11PM +0000, Ben Dooks wrote:
> > > I've been looking at support for a new machine type where the
> > > current system is using DISCONTIGMEM as the 1G memory map has
> > > 4x256M RAM ranges which may not all be filled.
> >
> > Firstly, use sparsemem, unless you intentionally want to spend more
> > cycles in the kernel.
>
> Ok, any idea if that will fix the problem in this case?
It _may_ do, because it doesn't have the concept of NUMA nodes (which
is what discontigmem is all about.)
> Should I send a patch to detect bad DISCONTIGMEM setup? I think that
> a check of PHYS_TO_PFN((start+end) != PHYS_TO_PFN(start)) should be
> safe in all cases?
Just move away from discontigmem completely. Sparsemem can do everything
we need on ARM at a lesser cost. As I say above, discontigmem is really
for NUMA platforms, and ARM is not a NUMA platform. No one on ARM should
be going anywhere near discontigmem.
^ permalink raw reply [flat|nested] 5+ messages in thread
* too much memory for node
2010-01-20 16:10 ` Russell King - ARM Linux
@ 2010-01-25 11:58 ` Ben Dooks
0 siblings, 0 replies; 5+ messages in thread
From: Ben Dooks @ 2010-01-25 11:58 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Jan 20, 2010 at 04:10:53PM +0000, Russell King - ARM Linux wrote:
> On Wed, Jan 20, 2010 at 04:05:25PM +0000, Ben Dooks wrote:
> > On Wed, Jan 20, 2010 at 03:55:10PM +0000, Russell King - ARM Linux wrote:
> > > On Wed, Jan 20, 2010 at 03:34:11PM +0000, Ben Dooks wrote:
> > > > I've been looking at support for a new machine type where the
> > > > current system is using DISCONTIGMEM as the 1G memory map has
> > > > 4x256M RAM ranges which may not all be filled.
> > >
> > > Firstly, use sparsemem, unless you intentionally want to spend more
> > > cycles in the kernel.
> >
> > Ok, any idea if that will fix the problem in this case?
>
> It _may_ do, because it doesn't have the concept of NUMA nodes (which
> is what discontigmem is all about.)
>
> > Should I send a patch to detect bad DISCONTIGMEM setup? I think that
> > a check of PHYS_TO_PFN((start+end) != PHYS_TO_PFN(start)) should be
> > safe in all cases?
>
> Just move away from discontigmem completely. Sparsemem can do everything
> we need on ARM at a lesser cost. As I say above, discontigmem is really
> for NUMA platforms, and ARM is not a NUMA platform. No one on ARM should
> be going anywhere near discontigmem.
Thanks, we seem to have working sparsemem support now so I will try and
ensure that is what gets sumitted.
--
Ben
Q: What's a light-year?
A: One-third less calories than a regular year.
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2010-01-25 11:58 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2010-01-20 15:34 too much memory for node Ben Dooks
2010-01-20 15:55 ` Russell King - ARM Linux
2010-01-20 16:05 ` Ben Dooks
2010-01-20 16:10 ` Russell King - ARM Linux
2010-01-25 11:58 ` Ben Dooks
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).