linux-mm.kvack.org archive mirror
* Missing initialization of pages removed with memblock_remove
@ 2012-04-05  1:10 Laura Abbott
  2012-04-14 10:13 ` Russell King - ARM Linux
  0 siblings, 1 reply; 2+ messages in thread
From: Laura Abbott @ 2012-04-05  1:10 UTC (permalink / raw)
  To: linux-arm-kernel, linux-arm-msm, linux-mm; +Cc: vgandhi, ohaugan

Hi,

We seem to have hit an odd edge case related to the use of 
memblock_remove. We carve out memory for certain use cases using 
memblock_remove, which gives a layout such as:

<4>[    0.000000] Zone PFN ranges:
<4>[    0.000000]   Normal   0x00080200 -> 0x000a1200
<4>[    0.000000]   HighMem  0x000a1200 -> 0x000c0000
<4>[    0.000000] Movable zone start PFN for each node
<4>[    0.000000] early_node_map[3] active PFN ranges
<4>[    0.000000]     0: 0x00080200 -> 0x00088f00
<4>[    0.000000]     0: 0x00090000 -> 0x000ac680
<4>[    0.000000]     0: 0x000b7a02 -> 0x000c0000

Since pfn_valid uses memblock_is_memory, pfn_valid will return false on 
all memory removed with memblock_remove. As a result, none of the page 
structures for the memblock_remove regions will have been initialized 
since memmap_init_zone calls pfn_valid before trying to initialize the 
memmap. Normally this isn't an issue but a recent test case ends up 
hitting a BUG_ON in move_freepages_block identical to the case in 
http://lists.infradead.org/pipermail/linux-arm-kernel/2011-August/059934.html
(BUG_ON(page_zone(start_page) != page_zone(end_page)))

What's happening is the calculation of start_page in 
move_freepages_block returns a page within a range removed by 
memblock_remove which means the page structure is uninitialized. (e.g. 
0xb7a02 -> 0xb7800)
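
The rounding described above can be sketched in userspace C. This is only a model, not the kernel source: the value of 1024 pages per pageblock is an assumption (it is consistent with the 0xb7a02 -> 0xb7800 example), and the struct and helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: 1024 pages per pageblock (pageblock order 10). */
#define PAGEBLOCK_NR_PAGES_MODEL 1024UL

/* Hypothetical model of one active PFN range, [start, end). */
struct pfn_range {
    unsigned long start;
    unsigned long end;
};

/* Round a PFN down to the first PFN of its pageblock. */
unsigned long pageblock_start_model(unsigned long pfn)
{
    /* The mask trick is only valid for power-of-two block sizes. */
    assert((PAGEBLOCK_NR_PAGES_MODEL & (PAGEBLOCK_NR_PAGES_MODEL - 1)) == 0);
    return pfn & ~(PAGEBLOCK_NR_PAGES_MODEL - 1);
}

/* Return true if pfn lies inside any of the n active ranges. */
bool pfn_in_ranges(const struct pfn_range *ranges, int n, unsigned long pfn)
{
    for (int i = 0; i < n; i++) {
        if (pfn >= ranges[i].start && pfn < ranges[i].end)
            return true;
    }
    return false;
}
```

With the three active ranges from the log, pageblock_start_model(0xb7a02) yields 0xb7800, and pfn_in_ranges() reports that 0xb7800 is in none of them, which is the situation that leads to the uninitialized page structure.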

I've read through that thread and several others which have discouraged 
use of CONFIG_HOLES_IN_ZONE due to the runtime overhead. The best 
alternative solution I've come up with is to align the memory removed 
via memblock_remove to MAX_ORDER_NR_PAGES but this will have a very high 
memory overhead for certain use cases.
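
To give a rough feel for that overhead, here is a small sketch that widens a removed PFN range outward to the alignment and counts the extra pages lost. It assumes MAX_ORDER_NR_PAGES is 1024 (a MAX_ORDER of 11); the macro and function names are hypothetical and only serve the illustration.

```c
#include <assert.h>

/* Assumption: MAX_ORDER_NR_PAGES is 1024 pages (MAX_ORDER of 11). */
#define MAX_ORDER_NR_PAGES_MODEL 1024UL

/* Round pfn down to a multiple of a (a must be a power of two). */
unsigned long align_down_pfn(unsigned long pfn, unsigned long a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return pfn & ~(a - 1);
}

/* Round pfn up to a multiple of a (a must be a power of two). */
unsigned long align_up_pfn(unsigned long pfn, unsigned long a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (pfn + a - 1) & ~(a - 1);
}

/*
 * Extra pages given up when the removed range [start, end) is widened
 * so that both of its edges sit on a MAX_ORDER_NR_PAGES boundary.
 */
unsigned long carveout_padding(unsigned long start, unsigned long end)
{
    unsigned long lo = align_down_pfn(start, MAX_ORDER_NR_PAGES_MODEL);
    unsigned long hi = align_up_pfn(end, MAX_ORDER_NR_PAGES_MODEL);

    return (start - lo) + (hi - end);
}
```

For the two holes in the layout above, 0x88f00 -> 0x90000 and 0xac680 -> 0xb7a02, this gives 768 and 1150 extra pages, 1918 pages in total, or roughly 7.5 MiB if pages are 4 KiB.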

A more fundamental question I have is should the page structures be 
initialized for the regions removed with memblock_remove? Internally, 
we've been divided on this issue and reading the source code hasn't 
given any indication of if this is expected behavior or not.

Any suggestions on what's the cleanest solution?

Thanks,
Laura
-- 
Sent by an employee of the Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>


* Re: Missing initialization of pages removed with memblock_remove
  2012-04-05  1:10 Missing initialization of pages removed with memblock_remove Laura Abbott
@ 2012-04-14 10:13 ` Russell King - ARM Linux
  0 siblings, 0 replies; 2+ messages in thread
From: Russell King - ARM Linux @ 2012-04-14 10:13 UTC (permalink / raw)
  To: Laura Abbott; +Cc: linux-arm-kernel, linux-arm-msm, linux-mm, ohaugan, vgandhi

On Wed, Apr 04, 2012 at 06:10:07PM -0700, Laura Abbott wrote:
> We seem to have hit an odd edge case related to the use of  
> memblock_remove. We carve out memory for certain use cases using  
> memblock_remove, which gives a layout such as:
>
> <4>[    0.000000] Zone PFN ranges:
> <4>[    0.000000]   Normal   0x00080200 -> 0x000a1200
> <4>[    0.000000]   HighMem  0x000a1200 -> 0x000c0000
> <4>[    0.000000] Movable zone start PFN for each node
> <4>[    0.000000] early_node_map[3] active PFN ranges
> <4>[    0.000000]     0: 0x00080200 -> 0x00088f00
> <4>[    0.000000]     0: 0x00090000 -> 0x000ac680
> <4>[    0.000000]     0: 0x000b7a02 -> 0x000c0000
>
> Since pfn_valid uses memblock_is_memory, pfn_valid will return false on  
> all memory removed with memblock_remove.

Correct.  memblock_remove() removes the range from the 'memory' array.
memblock_is_memory() searches the 'memory' array to discover whether
the address is within a region described by that array.  So, having called
memblock_remove() on a region, memblock_is_memory() will then return false
for that region.

This provably works, because all those platforms using arm_memblock_steal()
and then subsequently using ioremap() on the same physical address range
rely upon this behaviour - and this is the desired behaviour.
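
The relationship between the two calls can be modelled in a few lines of userspace C. This is a simplified sketch, not the memblock implementation: the fixed-size array, the linear search, and all names (memory_map, map_add, map_remove, map_is_memory) are assumptions made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REGIONS 16

/* One entry of the modelled 'memory' array: [base, base + size). */
struct region {
    unsigned long base;
    unsigned long size;
};

struct memory_map {
    struct region r[MAX_REGIONS];
    size_t cnt;
};

/* Append a region; fails if the array is full or the size is zero. */
bool map_add(struct memory_map *m, unsigned long base, unsigned long size)
{
    if (m->cnt == MAX_REGIONS || size == 0)
        return false;
    m->r[m->cnt].base = base;
    m->r[m->cnt].size = size;
    m->cnt++;
    return true;
}

/* Remove [base, base + size), trimming or splitting overlapping regions. */
bool map_remove(struct memory_map *m, unsigned long base, unsigned long size)
{
    unsigned long end = base + size;
    struct memory_map out = { .cnt = 0 };

    for (size_t i = 0; i < m->cnt; i++) {
        unsigned long rb = m->r[i].base;
        unsigned long re = rb + m->r[i].size;

        if (end <= rb || base >= re) {
            /* No overlap: keep the region unchanged. */
            if (!map_add(&out, rb, re - rb))
                return false;
            continue;
        }
        /* Keep whatever sticks out below and above the removed range. */
        if (rb < base && !map_add(&out, rb, base - rb))
            return false;
        if (re > end && !map_add(&out, end, re - end))
            return false;
    }
    *m = out;
    return true;
}

/* True if addr lies inside any region still present in the array. */
bool map_is_memory(const struct memory_map *m, unsigned long addr)
{
    for (size_t i = 0; i < m->cnt; i++) {
        if (addr >= m->r[i].base && addr - m->r[i].base < m->r[i].size)
            return true;
    }
    return false;
}
```

After map_remove() on a range, map_is_memory() returns false for every address in that range while the addresses on either side remain valid, which mirrors the behaviour described above.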

> As a result, none of the page structures for the memblock_remove regions
> will have been initialized since memmap_init_zone calls pfn_valid before
> trying to initialize the memmap. Normally this isn't an issue but a recent
> test case ends up hitting a BUG_ON in move_freepages_block identical to
> the case in  
> http://lists.infradead.org/pipermail/linux-arm-kernel/2011-August/059934.html
> (BUG_ON(page_zone(start_page) != page_zone(end_page)))

Yes, welcome to the sad fact that sparsemem can't handle... sparse memory.
sparsemem apparently was designed to handle fully populated memory sections,
but we've had some forward progress to get it sorted.  So if your memory
size is a multiple of 1MB, and you have memory in the upper half of the 4GB
space, you'll need an insane number of sections to cover this if you follow
this - you will need 4GB / 1MB = 4096 sections.

> What's happening is the calculation of start_page in  
> move_freepages_block returns a page within a range removed by  
> memblock_remove which means the page structure is uninitialized. (e.g.  
> 0xb7a02 -> 0xb7800)
>
> I've read through that thread and several others which have discouraged  
> use of CONFIG_HOLES_IN_ZONE due to the runtime overhead. The best  
> alternative solution I've come up with is to align the memory removed  
> via memblock_remove to MAX_ORDER_NR_PAGES but this will have a very high  
> memory overhead for certain use cases.
>
> A more fundamental question I have is should the page structures be  
> initialized for the regions removed with memblock_remove? Internally,  
> we've been divided on this issue and reading the source code hasn't  
> given any indication of if this is expected behavior or not.

One of the problems with that is you may have a GB or so between memblock
memory regions, and you certainly do not want to try and populate all
those page structs.

> Any suggestions on what's the cleanest solution?

I think CONFIG_HOLES_IN_ZONE=y is the best solution short of writing a
memory support subsystem which _can_ cope with all the various broken
ideas of system memory layout on ARM.

This would be a lot less of a problem had ARM Ltd mandated as part of
the architecture that memory was to be contiguous (and preferably
starting at physical address zero in normal system operation) but alas
every silicon vendor is free to create whatever abortion they like
here - we've even had cases where people want the order of physical
memory reversed because the first populated memory region is at a
higher address, which we've had to say a definite no to.
