* [PATCH] powerpc/mm: Ensure Huge-page memory is free before allocation
@ 2019-06-18 4:46 Vaibhav Jain
2019-06-18 12:07 ` Michael Ellerman
0 siblings, 1 reply; 4+ messages in thread
From: Vaibhav Jain @ 2019-06-18 4:46 UTC (permalink / raw)
To: linuxppc-dev
Cc: Hari Bathini, Vaibhav Jain, Mahesh J Salgaonkar, Aneesh Kumar K.V
We recently discovered an bug where physical memory meant for
allocation of Huge-pages was inadvertently allocated by another component
during early boot. The behavior of memblock_reserve() where it wont
indicate whether an existing reserved block overlaps with the
requested reservation only makes such bugs hard to investigate.
Hence this patch proposes adding a memblock reservation check in
htab_dt_scan_hugepage_blocks() just before call to memblock_reserve()
to ensure that the physical memory thats being reserved for is not
already reserved by someone else. In case this happens we panic the
the kernel to ensure that user of this huge-page doesn't accidentally
stomp on memory allocated to someone else.
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
arch/powerpc/mm/book3s64/hash_utils.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 28ced26f2a00..a05be3adb8c9 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -516,6 +516,11 @@ static int __init htab_dt_scan_hugepage_blocks(unsigned long node,
printk(KERN_INFO "Huge page(16GB) memory: "
"addr = 0x%lX size = 0x%lX pages = %d\n",
phys_addr, block_size, expected_pages);
+
+ /* Ensure no one else has reserved memory for huge pages before */
+ BUG_ON(memblock_is_region_reserved(phys_addr,
+ block_size * expected_pages));
+
if (phys_addr + block_size * expected_pages <= memblock_end_of_DRAM()) {
memblock_reserve(phys_addr, block_size * expected_pages);
pseries_add_gpage(phys_addr, block_size, expected_pages);
--
2.21.0
^ permalink raw reply related [flat|nested] 4+ messages in thread* Re: [PATCH] powerpc/mm: Ensure Huge-page memory is free before allocation
2019-06-18 4:46 [PATCH] powerpc/mm: Ensure Huge-page memory is free before allocation Vaibhav Jain
@ 2019-06-18 12:07 ` Michael Ellerman
2019-06-18 12:58 ` Aneesh Kumar K.V
2019-06-18 13:27 ` Vaibhav Jain
0 siblings, 2 replies; 4+ messages in thread
From: Michael Ellerman @ 2019-06-18 12:07 UTC (permalink / raw)
To: Vaibhav Jain, linuxppc-dev
Cc: Hari Bathini, Vaibhav Jain, Mahesh J Salgaonkar, Aneesh Kumar K.V
Vaibhav Jain <vaibhav@linux.ibm.com> writes:
> We recently discovered an bug where physical memory meant for
> allocation of Huge-pages was inadvertently allocated by another component
> during early boot.
Can you give me some more detail on what that was? You're seemingly the
only person who's ever hit this :)
> The behavior of memblock_reserve() where it wont
> indicate whether an existing reserved block overlaps with the
> requested reservation only makes such bugs hard to investigate.
>
> Hence this patch proposes adding a memblock reservation check in
> htab_dt_scan_hugepage_blocks() just before call to memblock_reserve()
> to ensure that the physical memory thats being reserved for is not
> already reserved by someone else. In case this happens we panic the
> the kernel to ensure that user of this huge-page doesn't accidentally
> stomp on memory allocated to someone else.
Do we really need to panic? Can't we just leave the block alone and not
register it as huge page memory? With a big warning obviously.
cheers
> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 28ced26f2a00..a05be3adb8c9 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -516,6 +516,11 @@ static int __init htab_dt_scan_hugepage_blocks(unsigned long node,
> printk(KERN_INFO "Huge page(16GB) memory: "
> "addr = 0x%lX size = 0x%lX pages = %d\n",
> phys_addr, block_size, expected_pages);
> +
> + /* Ensure no one else has reserved memory for huge pages before */
> + BUG_ON(memblock_is_region_reserved(phys_addr,
> + block_size * expected_pages));
> +
> if (phys_addr + block_size * expected_pages <= memblock_end_of_DRAM()) {
> memblock_reserve(phys_addr, block_size * expected_pages);
> pseries_add_gpage(phys_addr, block_size, expected_pages);
> --
> 2.21.0
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: [PATCH] powerpc/mm: Ensure Huge-page memory is free before allocation
2019-06-18 12:07 ` Michael Ellerman
@ 2019-06-18 12:58 ` Aneesh Kumar K.V
2019-06-18 13:27 ` Vaibhav Jain
1 sibling, 0 replies; 4+ messages in thread
From: Aneesh Kumar K.V @ 2019-06-18 12:58 UTC (permalink / raw)
To: Michael Ellerman, Vaibhav Jain, linuxppc-dev, Paul Mackerras
Cc: Hari Bathini, Mahesh J Salgaonkar
On 6/18/19 5:37 PM, Michael Ellerman wrote:
> Vaibhav Jain <vaibhav@linux.ibm.com> writes:
>> We recently discovered an bug where physical memory meant for
>> allocation of Huge-pages was inadvertently allocated by another component
>> during early boot.
>
> Can you give me some more detail on what that was? You're seemingly the
> only person who's ever hit this :)
>
>> The behavior of memblock_reserve() where it wont
>> indicate whether an existing reserved block overlaps with the
>> requested reservation only makes such bugs hard to investigate.
>>
>> Hence this patch proposes adding a memblock reservation check in
>> htab_dt_scan_hugepage_blocks() just before call to memblock_reserve()
>> to ensure that the physical memory thats being reserved for is not
>> already reserved by someone else. In case this happens we panic the
>> the kernel to ensure that user of this huge-page doesn't accidentally
>> stomp on memory allocated to someone else.
>
> Do we really need to panic? Can't we just leave the block alone and not
> register it as huge page memory? With a big warning obviously.
>
I need to check this again with Paul. But IIRC we have issues w.r.t hash
page table size, if we use 16G pages as 64K mapped pages. ie, hypervisor
create hash page table size with the assumptions that these pfns will
only be mapped as 16G pages. If we try to put 64K hash pte entries for
them we will not have enough space in hash page table.
That is one of the reason we never allowed freeing the hugetlb reserved
pool with 16G pages back to buddy.
Note: We should switch that BUG to panic. I guess using BUG that early
don't work.
-aneesh
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] powerpc/mm: Ensure Huge-page memory is free before allocation
2019-06-18 12:07 ` Michael Ellerman
2019-06-18 12:58 ` Aneesh Kumar K.V
@ 2019-06-18 13:27 ` Vaibhav Jain
1 sibling, 0 replies; 4+ messages in thread
From: Vaibhav Jain @ 2019-06-18 13:27 UTC (permalink / raw)
To: Michael Ellerman, linuxppc-dev
Cc: Hari Bathini, Aneesh Kumar K.V, Mahesh J Salgaonkar
Michael Ellerman <mpe@ellerman.id.au> writes:
> Vaibhav Jain <vaibhav@linux.ibm.com> writes:
>> We recently discovered an bug where physical memory meant for
>> allocation of Huge-pages was inadvertently allocated by another component
>> during early boot.
>
> Can you give me some more detail on what that was? You're seemingly the
> only person who's ever hit this :)
The specific bug I investigated was in fadump which was trying to
reserve a large chunk of physically contiguous memory from memblock and
inadvertently stomped into a memory region that was reserved for
allocation of 16G hugepages. This happened because fadump reservation
happens much earlier in prom-init compared to hugepage reservation.
The bug manifested as a panic seen when trying to 'cat
/proc/pagetypeinfo' that dumps the buddy stats for each
zone/migrate-type.
Incidentally fadump after reserving this memory, would carve out a CMA
region that was then entered into the buddy-allocater. This would cause
pagetypeinfo_showfree_print() to fail when it tries to iterate over the
free list of this CMA migrate type as the corresponding memmap for these
pages was never initialized.
>
>> The behavior of memblock_reserve() where it wont
>> indicate whether an existing reserved block overlaps with the
>> requested reservation only makes such bugs hard to investigate.
>>
>> Hence this patch proposes adding a memblock reservation check in
>> htab_dt_scan_hugepage_blocks() just before call to memblock_reserve()
>> to ensure that the physical memory thats being reserved for is not
>> already reserved by someone else. In case this happens we panic the
>> the kernel to ensure that user of this huge-page doesn't accidentally
>> stomp on memory allocated to someone else.
>
> Do we really need to panic? Can't we just leave the block alone and not
> register it as huge page memory? With a big warning obviously.
Possibly yes, but Aneesh pointed out that this memory is supposed to be backed
only by 16G pages mapping due to limitation in phyp for hash page table size.
>
> cheers
>
>> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
>> index 28ced26f2a00..a05be3adb8c9 100644
>> --- a/arch/powerpc/mm/book3s64/hash_utils.c
>> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
>> @@ -516,6 +516,11 @@ static int __init htab_dt_scan_hugepage_blocks(unsigned long node,
>> printk(KERN_INFO "Huge page(16GB) memory: "
>> "addr = 0x%lX size = 0x%lX pages = %d\n",
>> phys_addr, block_size, expected_pages);
>> +
>> + /* Ensure no one else has reserved memory for huge pages before */
>> + BUG_ON(memblock_is_region_reserved(phys_addr,
>> + block_size * expected_pages));
>> +
>> if (phys_addr + block_size * expected_pages <= memblock_end_of_DRAM()) {
>> memblock_reserve(phys_addr, block_size * expected_pages);
>> pseries_add_gpage(phys_addr, block_size, expected_pages);
>> --
>> 2.21.0
>
--
Vaibhav Jain <vaibhav@linux.ibm.com>
Linux Technology Center, IBM India Pvt. Ltd.
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2019-06-18 13:30 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-06-18 4:46 [PATCH] powerpc/mm: Ensure Huge-page memory is free before allocation Vaibhav Jain
2019-06-18 12:07 ` Michael Ellerman
2019-06-18 12:58 ` Aneesh Kumar K.V
2019-06-18 13:27 ` Vaibhav Jain
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox