* [PATCH v3 1/5] powerpc/mm: set the radix linear page mapping size
2016-12-15 19:50 [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix Reza Arbab
@ 2016-12-15 19:50 ` Reza Arbab
2016-12-19 8:58 ` Aneesh Kumar K.V
2016-12-15 19:50 ` [PATCH v3 2/5] powerpc/mm: refactor {create, remove}_section_mapping() Reza Arbab
` (4 subsequent siblings)
5 siblings, 1 reply; 23+ messages in thread
From: Reza Arbab @ 2016-12-15 19:50 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, Aneesh Kumar K.V, Balbir Singh, Alistair Popple
This was defaulting to 4K, regardless of PAGE_SIZE.
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
---
arch/powerpc/mm/pgtable-radix.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 623a0dc..54bd70e 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -351,8 +351,10 @@ void __init radix__early_init_mmu(void)
#ifdef CONFIG_PPC_64K_PAGES
/* PAGE_SIZE mappings */
mmu_virtual_psize = MMU_PAGE_64K;
+ mmu_linear_psize = MMU_PAGE_64K;
#else
mmu_virtual_psize = MMU_PAGE_4K;
+ mmu_linear_psize = MMU_PAGE_4K;
#endif
#ifdef CONFIG_SPARSEMEM_VMEMMAP
--
1.8.3.1
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v3 1/5] powerpc/mm: set the radix linear page mapping size
2016-12-15 19:50 ` [PATCH v3 1/5] powerpc/mm: set the radix linear page mapping size Reza Arbab
@ 2016-12-19 8:58 ` Aneesh Kumar K.V
2016-12-19 20:53 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 23+ messages in thread
From: Aneesh Kumar K.V @ 2016-12-19 8:58 UTC (permalink / raw)
To: Reza Arbab, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras
Cc: linuxppc-dev, Balbir Singh, Alistair Popple
Reza Arbab <arbab@linux.vnet.ibm.com> writes:
> This was defaulting to 4K, regardless of PAGE_SIZE.
>
> Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
> ---
> arch/powerpc/mm/pgtable-radix.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
> index 623a0dc..54bd70e 100644
> --- a/arch/powerpc/mm/pgtable-radix.c
> +++ b/arch/powerpc/mm/pgtable-radix.c
> @@ -351,8 +351,10 @@ void __init radix__early_init_mmu(void)
> #ifdef CONFIG_PPC_64K_PAGES
> /* PAGE_SIZE mappings */
> mmu_virtual_psize = MMU_PAGE_64K;
> + mmu_linear_psize = MMU_PAGE_64K;
That is not clearly correct, we map the linear address with either 64K,
2M or 1G depending on the memory available. Take a look at
static void __init radix_init_pgtable(void)
> #else
> mmu_virtual_psize = MMU_PAGE_4K;
> + mmu_linear_psize = MMU_PAGE_4K;
> #endif
>
> #ifdef CONFIG_SPARSEMEM_VMEMMAP
> --
> 1.8.3.1
* Re: [PATCH v3 1/5] powerpc/mm: set the radix linear page mapping size
2016-12-19 8:58 ` Aneesh Kumar K.V
@ 2016-12-19 20:53 ` Benjamin Herrenschmidt
2016-12-20 2:02 ` Aneesh Kumar K.V
0 siblings, 1 reply; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2016-12-19 20:53 UTC (permalink / raw)
To: Aneesh Kumar K.V, Reza Arbab, Michael Ellerman, Paul Mackerras
Cc: linuxppc-dev, Balbir Singh, Alistair Popple
On Mon, 2016-12-19 at 14:28 +0530, Aneesh Kumar K.V wrote:
> Reza Arbab <arbab@linux.vnet.ibm.com> writes:
>
> > This was defaulting to 4K, regardless of PAGE_SIZE.
> >
> > Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
> > ---
> > arch/powerpc/mm/pgtable-radix.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/arch/powerpc/mm/pgtable-radix.c
> > b/arch/powerpc/mm/pgtable-radix.c
> > index 623a0dc..54bd70e 100644
> > --- a/arch/powerpc/mm/pgtable-radix.c
> > +++ b/arch/powerpc/mm/pgtable-radix.c
> > @@ -351,8 +351,10 @@ void __init radix__early_init_mmu(void)
> > #ifdef CONFIG_PPC_64K_PAGES
> > /* PAGE_SIZE mappings */
> > mmu_virtual_psize = MMU_PAGE_64K;
> > + mmu_linear_psize = MMU_PAGE_64K;
>
> That is not clearly correct, we map the linear address with either
> 64K,
> 2M or 1G depending on the memory available. Take a look at
> static void __init radix_init_pgtable(void)
>
So should we fix that initialization regardless or take it out ?
>
> > #else
> > mmu_virtual_psize = MMU_PAGE_4K;
> > + mmu_linear_psize = MMU_PAGE_4K;
> > #endif
> >
> > #ifdef CONFIG_SPARSEMEM_VMEMMAP
> > --
> > 1.8.3.1
* Re: [PATCH v3 1/5] powerpc/mm: set the radix linear page mapping size
2016-12-19 20:53 ` Benjamin Herrenschmidt
@ 2016-12-20 2:02 ` Aneesh Kumar K.V
0 siblings, 0 replies; 23+ messages in thread
From: Aneesh Kumar K.V @ 2016-12-20 2:02 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Reza Arbab, Michael Ellerman,
Paul Mackerras
Cc: linuxppc-dev, Balbir Singh, Alistair Popple
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:
> On Mon, 2016-12-19 at 14:28 +0530, Aneesh Kumar K.V wrote:
>> Reza Arbab <arbab@linux.vnet.ibm.com> writes:
>>
>> > This was defaulting to 4K, regardless of PAGE_SIZE.
>> >
>> > Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
>> > ---
>> >  arch/powerpc/mm/pgtable-radix.c | 2 ++
>> >  1 file changed, 2 insertions(+)
>> >
>> > diff --git a/arch/powerpc/mm/pgtable-radix.c
>> > b/arch/powerpc/mm/pgtable-radix.c
>> > index 623a0dc..54bd70e 100644
>> > --- a/arch/powerpc/mm/pgtable-radix.c
>> > +++ b/arch/powerpc/mm/pgtable-radix.c
>> > @@ -351,8 +351,10 @@ void __init radix__early_init_mmu(void)
>> >  #ifdef CONFIG_PPC_64K_PAGES
>> >  	/* PAGE_SIZE mappings */
>> >  	mmu_virtual_psize = MMU_PAGE_64K;
>> > +	mmu_linear_psize = MMU_PAGE_64K;
>>
>> That is not clearly correct, we map the linear address with either
>> 64K,
>> 2M or 1G depending on the memory available. Take a look at
>> static void __init radix_init_pgtable(void)
>>
>
> So should we fix that initialization regardless or take it out ?
We should not be using mmu_linear_psize on radix. Hence we can skip
that initialization.
-aneesh
* [PATCH v3 2/5] powerpc/mm: refactor {create, remove}_section_mapping()
2016-12-15 19:50 [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix Reza Arbab
2016-12-15 19:50 ` [PATCH v3 1/5] powerpc/mm: set the radix linear page mapping size Reza Arbab
@ 2016-12-15 19:50 ` Reza Arbab
2016-12-19 9:00 ` Aneesh Kumar K.V
2016-12-20 5:26 ` Balbir Singh
2016-12-15 19:50 ` [PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping() Reza Arbab
` (3 subsequent siblings)
5 siblings, 2 replies; 23+ messages in thread
From: Reza Arbab @ 2016-12-15 19:50 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, Aneesh Kumar K.V, Balbir Singh, Alistair Popple
Change {create,remove}_section_mapping() to be wrappers around functions
prefixed with "hash__".
This is preparation for the addition of their "radix__" variants. No
functional change.
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/book3s/64/hash.h | 5 +++++
arch/powerpc/mm/hash_utils_64.c | 4 ++--
arch/powerpc/mm/pgtable-book3s64.c | 18 ++++++++++++++++++
3 files changed, 25 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index f61cad3..dd90574 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -201,6 +201,11 @@ extern int __meminit hash__vmemmap_create_mapping(unsigned long start,
unsigned long phys);
extern void hash__vmemmap_remove_mapping(unsigned long start,
unsigned long page_size);
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int hash__create_section_mapping(unsigned long start, unsigned long end);
+int hash__remove_section_mapping(unsigned long start, unsigned long end);
+#endif /* CONFIG_MEMORY_HOTPLUG */
#endif /* !__ASSEMBLY__ */
#endif /* __KERNEL__ */
#endif /* _ASM_POWERPC_BOOK3S_64_HASH_H */
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index b9a062f..96a4fb7 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -743,7 +743,7 @@ static unsigned long __init htab_get_table_size(void)
}
#ifdef CONFIG_MEMORY_HOTPLUG
-int create_section_mapping(unsigned long start, unsigned long end)
+int hash__create_section_mapping(unsigned long start, unsigned long end)
{
int rc = htab_bolt_mapping(start, end, __pa(start),
pgprot_val(PAGE_KERNEL), mmu_linear_psize,
@@ -757,7 +757,7 @@ int create_section_mapping(unsigned long start, unsigned long end)
return rc;
}
-int remove_section_mapping(unsigned long start, unsigned long end)
+int hash__remove_section_mapping(unsigned long start, unsigned long end)
{
int rc = htab_remove_mapping(start, end, mmu_linear_psize,
mmu_kernel_ssize);
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index ebf9782..653ff6c 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -126,3 +126,21 @@ void mmu_cleanup_all(void)
else if (mmu_hash_ops.hpte_clear_all)
mmu_hash_ops.hpte_clear_all();
}
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int create_section_mapping(unsigned long start, unsigned long end)
+{
+ if (radix_enabled())
+ return -ENODEV;
+
+ return hash__create_section_mapping(start, end);
+}
+
+int remove_section_mapping(unsigned long start, unsigned long end)
+{
+ if (radix_enabled())
+ return -ENODEV;
+
+ return hash__remove_section_mapping(start, end);
+}
+#endif /* CONFIG_MEMORY_HOTPLUG */
--
1.8.3.1
* Re: [PATCH v3 2/5] powerpc/mm: refactor {create, remove}_section_mapping()
2016-12-15 19:50 ` [PATCH v3 2/5] powerpc/mm: refactor {create, remove}_section_mapping() Reza Arbab
@ 2016-12-19 9:00 ` Aneesh Kumar K.V
2016-12-19 18:00 ` [PATCH v3 2/5] powerpc/mm: refactor {create,remove}_section_mapping() Reza Arbab
2016-12-20 5:26 ` Balbir Singh
1 sibling, 1 reply; 23+ messages in thread
From: Aneesh Kumar K.V @ 2016-12-19 9:00 UTC (permalink / raw)
To: Reza Arbab, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras
Cc: linuxppc-dev, Balbir Singh, Alistair Popple
Reza Arbab <arbab@linux.vnet.ibm.com> writes:
> Change {create,remove}_section_mapping() to be wrappers around functions
> prefixed with "hash__".
>
> This is preparation for the addition of their "radix__" variants. No
> functional change.
>
I think this can go upstream now ? To fixup broken hotplug with radix ?
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/book3s/64/hash.h | 5 +++++
> arch/powerpc/mm/hash_utils_64.c | 4 ++--
> arch/powerpc/mm/pgtable-book3s64.c | 18 ++++++++++++++++++
> 3 files changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
> index f61cad3..dd90574 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -201,6 +201,11 @@ extern int __meminit hash__vmemmap_create_mapping(unsigned long start,
> unsigned long phys);
> extern void hash__vmemmap_remove_mapping(unsigned long start,
> unsigned long page_size);
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +int hash__create_section_mapping(unsigned long start, unsigned long end);
> +int hash__remove_section_mapping(unsigned long start, unsigned long end);
> +#endif /* CONFIG_MEMORY_HOTPLUG */
> #endif /* !__ASSEMBLY__ */
> #endif /* __KERNEL__ */
> #endif /* _ASM_POWERPC_BOOK3S_64_HASH_H */
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index b9a062f..96a4fb7 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -743,7 +743,7 @@ static unsigned long __init htab_get_table_size(void)
> }
>
> #ifdef CONFIG_MEMORY_HOTPLUG
> -int create_section_mapping(unsigned long start, unsigned long end)
> +int hash__create_section_mapping(unsigned long start, unsigned long end)
> {
> int rc = htab_bolt_mapping(start, end, __pa(start),
> pgprot_val(PAGE_KERNEL), mmu_linear_psize,
> @@ -757,7 +757,7 @@ int create_section_mapping(unsigned long start, unsigned long end)
> return rc;
> }
>
> -int remove_section_mapping(unsigned long start, unsigned long end)
> +int hash__remove_section_mapping(unsigned long start, unsigned long end)
> {
> int rc = htab_remove_mapping(start, end, mmu_linear_psize,
> mmu_kernel_ssize);
> diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
> index ebf9782..653ff6c 100644
> --- a/arch/powerpc/mm/pgtable-book3s64.c
> +++ b/arch/powerpc/mm/pgtable-book3s64.c
> @@ -126,3 +126,21 @@ void mmu_cleanup_all(void)
> else if (mmu_hash_ops.hpte_clear_all)
> mmu_hash_ops.hpte_clear_all();
> }
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +int create_section_mapping(unsigned long start, unsigned long end)
> +{
> + if (radix_enabled())
> + return -ENODEV;
> +
> + return hash__create_section_mapping(start, end);
> +}
> +
> +int remove_section_mapping(unsigned long start, unsigned long end)
> +{
> + if (radix_enabled())
> + return -ENODEV;
> +
> + return hash__remove_section_mapping(start, end);
> +}
> +#endif /* CONFIG_MEMORY_HOTPLUG */
> --
> 1.8.3.1
* Re: [PATCH v3 2/5] powerpc/mm: refactor {create,remove}_section_mapping()
2016-12-19 9:00 ` Aneesh Kumar K.V
@ 2016-12-19 18:00 ` Reza Arbab
0 siblings, 0 replies; 23+ messages in thread
From: Reza Arbab @ 2016-12-19 18:00 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
linuxppc-dev, Balbir Singh, Alistair Popple
On Mon, Dec 19, 2016 at 02:30:28PM +0530, Aneesh Kumar K.V wrote:
>Reza Arbab <arbab@linux.vnet.ibm.com> writes:
>> Change {create,remove}_section_mapping() to be wrappers around
>> functions prefixed with "hash__".
>>
>> This is preparation for the addition of their "radix__" variants. No
>> functional change.
>>
>
>I think this can go upstream now ? To fixup broken hotplug with radix ?
Yes, this one might be worth separating as a fix for the BUG() on radix.
--
Reza Arbab
* Re: [PATCH v3 2/5] powerpc/mm: refactor {create,remove}_section_mapping()
2016-12-15 19:50 ` [PATCH v3 2/5] powerpc/mm: refactor {create, remove}_section_mapping() Reza Arbab
2016-12-19 9:00 ` Aneesh Kumar K.V
@ 2016-12-20 5:26 ` Balbir Singh
1 sibling, 0 replies; 23+ messages in thread
From: Balbir Singh @ 2016-12-20 5:26 UTC (permalink / raw)
To: Reza Arbab, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras
Cc: linuxppc-dev, Aneesh Kumar K.V, Alistair Popple
On 12/16/2016 06:50 AM, Reza Arbab wrote:
> Change {create,remove}_section_mapping() to be wrappers around functions
> prefixed with "hash__".
>
> This is preparation for the addition of their "radix__" variants. No
> functional change.
>
> Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Acked-by: Balbir Singh <bsingharora@gmail.com>
* [PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping()
2016-12-15 19:50 [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix Reza Arbab
2016-12-15 19:50 ` [PATCH v3 1/5] powerpc/mm: set the radix linear page mapping size Reza Arbab
2016-12-15 19:50 ` [PATCH v3 2/5] powerpc/mm: refactor {create, remove}_section_mapping() Reza Arbab
@ 2016-12-15 19:50 ` Reza Arbab
2016-12-19 9:04 ` Aneesh Kumar K.V
2016-12-20 6:28 ` Balbir Singh
2016-12-15 19:50 ` [PATCH v3 4/5] powerpc/mm: add radix__remove_section_mapping() Reza Arbab
` (2 subsequent siblings)
5 siblings, 2 replies; 23+ messages in thread
From: Reza Arbab @ 2016-12-15 19:50 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, Aneesh Kumar K.V, Balbir Singh, Alistair Popple
Add the linear page mapping function for radix, used by memory hotplug.
This is similar to vmemmap_populate().
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/book3s/64/radix.h | 4 ++++
arch/powerpc/mm/pgtable-book3s64.c | 2 +-
arch/powerpc/mm/pgtable-radix.c | 19 +++++++++++++++++++
3 files changed, 24 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index b4d1302..43c2571 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -291,5 +291,9 @@ static inline unsigned long radix__get_tree_size(void)
}
return rts_field;
}
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int radix__create_section_mapping(unsigned long start, unsigned long end);
+#endif /* CONFIG_MEMORY_HOTPLUG */
#endif /* __ASSEMBLY__ */
#endif
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 653ff6c..2b13f6b 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -131,7 +131,7 @@ void mmu_cleanup_all(void)
int create_section_mapping(unsigned long start, unsigned long end)
{
if (radix_enabled())
- return -ENODEV;
+ return radix__create_section_mapping(start, end);
return hash__create_section_mapping(start, end);
}
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 54bd70e..8201d1f 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -465,6 +465,25 @@ void radix__setup_initial_memory_limit(phys_addr_t first_memblock_base,
memblock_set_current_limit(first_memblock_base + first_memblock_size);
}
+#ifdef CONFIG_MEMORY_HOTPLUG
+int radix__create_section_mapping(unsigned long start, unsigned long end)
+{
+ unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
+
+ /* Align to the page size of the linear mapping. */
+ start = _ALIGN_DOWN(start, page_size);
+
+ for (; start < end; start += page_size) {
+ int rc = radix__map_kernel_page(start, __pa(start),
+ PAGE_KERNEL, page_size);
+ if (rc)
+ return rc;
+ }
+
+ return 0;
+}
+#endif /* CONFIG_MEMORY_HOTPLUG */
+
#ifdef CONFIG_SPARSEMEM_VMEMMAP
int __meminit radix__vmemmap_create_mapping(unsigned long start,
unsigned long page_size,
--
1.8.3.1
* Re: [PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping()
2016-12-15 19:50 ` [PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping() Reza Arbab
@ 2016-12-19 9:04 ` Aneesh Kumar K.V
2016-12-19 18:06 ` Reza Arbab
2016-12-21 7:03 ` Anshuman Khandual
2016-12-20 6:28 ` Balbir Singh
1 sibling, 2 replies; 23+ messages in thread
From: Aneesh Kumar K.V @ 2016-12-19 9:04 UTC (permalink / raw)
To: Reza Arbab, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras
Cc: linuxppc-dev, Balbir Singh, Alistair Popple
Reza Arbab <arbab@linux.vnet.ibm.com> writes:
> Add the linear page mapping function for radix, used by memory hotplug.
> This is similar to vmemmap_populate().
>
Ok with this patch your first patch becomes useful. Can you merge that
with this and rename mmu_linear_psize to mmu_hotplug_psize or even use
mmu_virtual_psize. The linear naming is confusing.
> Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/book3s/64/radix.h | 4 ++++
> arch/powerpc/mm/pgtable-book3s64.c | 2 +-
> arch/powerpc/mm/pgtable-radix.c | 19 +++++++++++++++++++
> 3 files changed, 24 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index b4d1302..43c2571 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -291,5 +291,9 @@ static inline unsigned long radix__get_tree_size(void)
> }
> return rts_field;
> }
> +
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +int radix__create_section_mapping(unsigned long start, unsigned long end);
> +#endif /* CONFIG_MEMORY_HOTPLUG */
> #endif /* __ASSEMBLY__ */
> #endif
> diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
> index 653ff6c..2b13f6b 100644
> --- a/arch/powerpc/mm/pgtable-book3s64.c
> +++ b/arch/powerpc/mm/pgtable-book3s64.c
> @@ -131,7 +131,7 @@ void mmu_cleanup_all(void)
> int create_section_mapping(unsigned long start, unsigned long end)
> {
> if (radix_enabled())
> - return -ENODEV;
> + return radix__create_section_mapping(start, end);
>
> return hash__create_section_mapping(start, end);
> }
> diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
> index 54bd70e..8201d1f 100644
> --- a/arch/powerpc/mm/pgtable-radix.c
> +++ b/arch/powerpc/mm/pgtable-radix.c
> @@ -465,6 +465,25 @@ void radix__setup_initial_memory_limit(phys_addr_t first_memblock_base,
> memblock_set_current_limit(first_memblock_base + first_memblock_size);
> }
>
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +int radix__create_section_mapping(unsigned long start, unsigned long end)
> +{
> + unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
> +
> + /* Align to the page size of the linear mapping. */
> + start = _ALIGN_DOWN(start, page_size);
> +
> + for (; start < end; start += page_size) {
> + int rc = radix__map_kernel_page(start, __pa(start),
> + PAGE_KERNEL, page_size);
> + if (rc)
> + return rc;
> + }
> +
> + return 0;
> +}
> +#endif /* CONFIG_MEMORY_HOTPLUG */
> +
> #ifdef CONFIG_SPARSEMEM_VMEMMAP
> int __meminit radix__vmemmap_create_mapping(unsigned long start,
> unsigned long page_size,
> --
> 1.8.3.1
* Re: [PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping()
2016-12-19 9:04 ` Aneesh Kumar K.V
@ 2016-12-19 18:06 ` Reza Arbab
2016-12-21 7:03 ` Anshuman Khandual
1 sibling, 0 replies; 23+ messages in thread
From: Reza Arbab @ 2016-12-19 18:06 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
linuxppc-dev, Balbir Singh, Alistair Popple
On Mon, Dec 19, 2016 at 02:34:13PM +0530, Aneesh Kumar K.V wrote:
>Reza Arbab <arbab@linux.vnet.ibm.com> writes:
>> Add the linear page mapping function for radix, used by memory
>> hotplug. This is similar to vmemmap_populate().
>>
>
>Ok with this patch your first patch becomes useful. Can you merge that
>with this and rename mmu_linear_psize to mmu_hotplug_psize or even use
>mmu_virtual_psize. The linear naming is confusing.
Thanks for pointing out radix_init_pgtable(). I think the right thing to
do here is create these mappings the same way it does. We can probably
factor out a common function.
--
Reza Arbab
* Re: [PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping()
2016-12-19 9:04 ` Aneesh Kumar K.V
2016-12-19 18:06 ` Reza Arbab
@ 2016-12-21 7:03 ` Anshuman Khandual
1 sibling, 0 replies; 23+ messages in thread
From: Anshuman Khandual @ 2016-12-21 7:03 UTC (permalink / raw)
To: Aneesh Kumar K.V, Reza Arbab, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, Alistair Popple
On 12/19/2016 02:34 PM, Aneesh Kumar K.V wrote:
> Reza Arbab <arbab@linux.vnet.ibm.com> writes:
>
>> > Add the linear page mapping function for radix, used by memory hotplug.
>> > This is similar to vmemmap_populate().
>> >
> Ok with this patch your first patch becomes useful. Can you merge that
> with this and rename mmu_linear_psize to mmu_hotplug_psize or even use
> mmu_virtual_psize. The linear naming is confusing.
mmu_linear_psize variable was referring to the page size used to create
kernel linear mapping (the 0xc00.... range) for the newly added memory
section. Why the name should be changed ?
* Re: [PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping()
2016-12-15 19:50 ` [PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping() Reza Arbab
2016-12-19 9:04 ` Aneesh Kumar K.V
@ 2016-12-20 6:28 ` Balbir Singh
2016-12-20 15:32 ` Reza Arbab
1 sibling, 1 reply; 23+ messages in thread
From: Balbir Singh @ 2016-12-20 6:28 UTC (permalink / raw)
To: Reza Arbab, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras
Cc: linuxppc-dev, Aneesh Kumar K.V, Alistair Popple
>
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +int radix__create_section_mapping(unsigned long start, unsigned long end)
> +{
> + unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
Can we refactor bits from radix_init_pgtable() and reuse?
Balbir
* Re: [PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping()
2016-12-20 6:28 ` Balbir Singh
@ 2016-12-20 15:32 ` Reza Arbab
0 siblings, 0 replies; 23+ messages in thread
From: Reza Arbab @ 2016-12-20 15:32 UTC (permalink / raw)
To: Balbir Singh
Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
linuxppc-dev, Aneesh Kumar K.V, Alistair Popple
On Tue, Dec 20, 2016 at 05:28:40PM +1100, Balbir Singh wrote:
>>+#ifdef CONFIG_MEMORY_HOTPLUG
>>+int radix__create_section_mapping(unsigned long start, unsigned long end)
>>+{
>>+ unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
>
>Can we refactor bits from radix_init_pgtable() and reuse?
Yes, that's my plan for v4.
--
Reza Arbab
* [PATCH v3 4/5] powerpc/mm: add radix__remove_section_mapping()
2016-12-15 19:50 [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix Reza Arbab
` (2 preceding siblings ...)
2016-12-15 19:50 ` [PATCH v3 3/5] powerpc/mm: add radix__create_section_mapping() Reza Arbab
@ 2016-12-15 19:50 ` Reza Arbab
2016-12-19 9:48 ` Aneesh Kumar K.V
2016-12-15 19:50 ` [PATCH v3 5/5] powerpc/mm: unstub radix__vmemmap_remove_mapping() Reza Arbab
2016-12-16 14:38 ` [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix Balbir Singh
5 siblings, 1 reply; 23+ messages in thread
From: Reza Arbab @ 2016-12-15 19:50 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, Aneesh Kumar K.V, Balbir Singh, Alistair Popple
Tear down and free the four-level page tables of the linear mapping
during memory hotremove.
We borrow the basic structure of remove_pagetable() and friends from the
identically-named x86 functions.
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
---
arch/powerpc/include/asm/book3s/64/radix.h | 1 +
arch/powerpc/mm/pgtable-book3s64.c | 2 +-
arch/powerpc/mm/pgtable-radix.c | 163 +++++++++++++++++++++++++++++
3 files changed, 165 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 43c2571..0032b66 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -294,6 +294,7 @@ static inline unsigned long radix__get_tree_size(void)
#ifdef CONFIG_MEMORY_HOTPLUG
int radix__create_section_mapping(unsigned long start, unsigned long end);
+int radix__remove_section_mapping(unsigned long start, unsigned long end);
#endif /* CONFIG_MEMORY_HOTPLUG */
#endif /* __ASSEMBLY__ */
#endif
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 2b13f6b..b798ff6 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -139,7 +139,7 @@ int create_section_mapping(unsigned long start, unsigned long end)
int remove_section_mapping(unsigned long start, unsigned long end)
{
if (radix_enabled())
- return -ENODEV;
+ return radix__remove_section_mapping(start, end);
return hash__remove_section_mapping(start, end);
}
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 8201d1f..315237c 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -466,6 +466,159 @@ void radix__setup_initial_memory_limit(phys_addr_t first_memblock_base,
}
#ifdef CONFIG_MEMORY_HOTPLUG
+static void free_pte_table(pte_t *pte_start, pmd_t *pmd)
+{
+ pte_t *pte;
+ int i;
+
+ for (i = 0; i < PTRS_PER_PTE; i++) {
+ pte = pte_start + i;
+ if (!pte_none(*pte))
+ return;
+ }
+
+ pte_free_kernel(&init_mm, pte_start);
+ spin_lock(&init_mm.page_table_lock);
+ pmd_clear(pmd);
+ spin_unlock(&init_mm.page_table_lock);
+}
+
+static void free_pmd_table(pmd_t *pmd_start, pud_t *pud)
+{
+ pmd_t *pmd;
+ int i;
+
+ for (i = 0; i < PTRS_PER_PMD; i++) {
+ pmd = pmd_start + i;
+ if (!pmd_none(*pmd))
+ return;
+ }
+
+ pmd_free(&init_mm, pmd_start);
+ spin_lock(&init_mm.page_table_lock);
+ pud_clear(pud);
+ spin_unlock(&init_mm.page_table_lock);
+}
+
+static void free_pud_table(pud_t *pud_start, pgd_t *pgd)
+{
+ pud_t *pud;
+ int i;
+
+ for (i = 0; i < PTRS_PER_PUD; i++) {
+ pud = pud_start + i;
+ if (!pud_none(*pud))
+ return;
+ }
+
+ pud_free(&init_mm, pud_start);
+ spin_lock(&init_mm.page_table_lock);
+ pgd_clear(pgd);
+ spin_unlock(&init_mm.page_table_lock);
+}
+
+static void remove_pte_table(pte_t *pte_start, unsigned long addr,
+ unsigned long end)
+{
+ unsigned long next;
+ pte_t *pte;
+
+ pte = pte_start + pte_index(addr);
+ for (; addr < end; addr = next, pte++) {
+ next = (addr + PAGE_SIZE) & PAGE_MASK;
+ if (next > end)
+ next = end;
+
+ if (!pte_present(*pte))
+ continue;
+
+ spin_lock(&init_mm.page_table_lock);
+ pte_clear(&init_mm, addr, pte);
+ spin_unlock(&init_mm.page_table_lock);
+ }
+
+ flush_tlb_mm(&init_mm);
+}
+
+static void remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
+ unsigned long end, unsigned long map_page_size)
+{
+ unsigned long next;
+ pte_t *pte_base;
+ pmd_t *pmd;
+
+ pmd = pmd_start + pmd_index(addr);
+ for (; addr < end; addr = next, pmd++) {
+ next = pmd_addr_end(addr, end);
+
+ if (!pmd_present(*pmd))
+ continue;
+
+ if (map_page_size == PMD_SIZE) {
+ spin_lock(&init_mm.page_table_lock);
+ pte_clear(&init_mm, addr, (pte_t *)pmd);
+ spin_unlock(&init_mm.page_table_lock);
+
+ continue;
+ }
+
+ pte_base = (pte_t *)pmd_page_vaddr(*pmd);
+ remove_pte_table(pte_base, addr, next);
+ free_pte_table(pte_base, pmd);
+ }
+}
+
+static void remove_pud_table(pud_t *pud_start, unsigned long addr,
+ unsigned long end, unsigned long map_page_size)
+{
+ unsigned long next;
+ pmd_t *pmd_base;
+ pud_t *pud;
+
+ pud = pud_start + pud_index(addr);
+ for (; addr < end; addr = next, pud++) {
+ next = pud_addr_end(addr, end);
+
+ if (!pud_present(*pud))
+ continue;
+
+ if (map_page_size == PUD_SIZE) {
+ spin_lock(&init_mm.page_table_lock);
+ pte_clear(&init_mm, addr, (pte_t *)pud);
+ spin_unlock(&init_mm.page_table_lock);
+
+ continue;
+ }
+
+ pmd_base = (pmd_t *)pud_page_vaddr(*pud);
+ remove_pmd_table(pmd_base, addr, next, map_page_size);
+ free_pmd_table(pmd_base, pud);
+ }
+}
+
+static void remove_pagetable(unsigned long start, unsigned long end,
+ unsigned long map_page_size)
+{
+ unsigned long next;
+ unsigned long addr;
+ pgd_t *pgd;
+ pud_t *pud;
+
+ for (addr = start; addr < end; addr = next) {
+ next = pgd_addr_end(addr, end);
+
+ pgd = pgd_offset_k(addr);
+ if (!pgd_present(*pgd))
+ continue;
+
+ pud = (pud_t *)pgd_page_vaddr(*pgd);
+ remove_pud_table(pud, addr, next, map_page_size);
+ free_pud_table(pud, pgd);
+ }
+
+ flush_tlb_mm(&init_mm);
+}
+
int radix__create_section_mapping(unsigned long start, unsigned long end)
{
unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
@@ -482,6 +635,16 @@ int radix__create_section_mapping(unsigned long start, unsigned long end)
return 0;
}
+
+int radix__remove_section_mapping(unsigned long start, unsigned long end)
+{
+ unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
+
+ start = _ALIGN_DOWN(start, page_size);
+ remove_pagetable(start, end, page_size);
+
+ return 0;
+}
#endif /* CONFIG_MEMORY_HOTPLUG */
#ifdef CONFIG_SPARSEMEM_VMEMMAP
--
1.8.3.1
* Re: [PATCH v3 4/5] powerpc/mm: add radix__remove_section_mapping()
2016-12-15 19:50 ` [PATCH v3 4/5] powerpc/mm: add radix__remove_section_mapping() Reza Arbab
@ 2016-12-19 9:48 ` Aneesh Kumar K.V
2016-12-19 18:11 ` Reza Arbab
2016-12-19 20:59 ` Benjamin Herrenschmidt
0 siblings, 2 replies; 23+ messages in thread
From: Aneesh Kumar K.V @ 2016-12-19 9:48 UTC (permalink / raw)
To: Reza Arbab, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras
Cc: linuxppc-dev, Balbir Singh, Alistair Popple
Reza Arbab <arbab@linux.vnet.ibm.com> writes:
> Tear down and free the four-level page tables of the linear mapping
> during memory hotremove.
>
> We borrow the basic structure of remove_pagetable() and friends from the
> identically-named x86 functions.
>
Can you add more details here explaining why we don't need to follow
the RCU page table freeing scheme when doing memory hot-unplug?
> Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
> ---
> arch/powerpc/include/asm/book3s/64/radix.h | 1 +
> arch/powerpc/mm/pgtable-book3s64.c | 2 +-
> arch/powerpc/mm/pgtable-radix.c | 163 +++++++++++++++++++++++++++++
> 3 files changed, 165 insertions(+), 1 deletion(-)
>
....
....
> +static void remove_pte_table(pte_t *pte_start, unsigned long addr,
> + unsigned long end)
> +{
> + unsigned long next;
> + pte_t *pte;
> +
> + pte = pte_start + pte_index(addr);
> + for (; addr < end; addr = next, pte++) {
> + next = (addr + PAGE_SIZE) & PAGE_MASK;
> + if (next > end)
> + next = end;
> +
> + if (!pte_present(*pte))
> + continue;
> +
> + spin_lock(&init_mm.page_table_lock);
> + pte_clear(&init_mm, addr, pte);
> + spin_unlock(&init_mm.page_table_lock);
> + }
> +
> + flush_tlb_mm(&init_mm);
Why call a flush here? We do that at the end of remove_pagetable().
Isn't that sufficient?
> +}
> +
> +static void remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
> + unsigned long end, unsigned long map_page_size)
> +{
> + unsigned long next;
> + pte_t *pte_base;
> + pmd_t *pmd;
> +
> + pmd = pmd_start + pmd_index(addr);
> + for (; addr < end; addr = next, pmd++) {
> + next = pmd_addr_end(addr, end);
> +
> + if (!pmd_present(*pmd))
> + continue;
> +
> + if (map_page_size == PMD_SIZE) {
> + spin_lock(&init_mm.page_table_lock);
> + pte_clear(&init_mm, addr, (pte_t *)pmd);
> + spin_unlock(&init_mm.page_table_lock);
> +
> + continue;
> + }
> +
> + pte_base = (pte_t *)pmd_page_vaddr(*pmd);
> + remove_pte_table(pte_base, addr, next);
> + free_pte_table(pte_base, pmd);
> + }
> +}
> +
> +static void remove_pud_table(pud_t *pud_start, unsigned long addr,
> + unsigned long end, unsigned long map_page_size)
> +{
> + unsigned long next;
> + pmd_t *pmd_base;
> + pud_t *pud;
> +
> + pud = pud_start + pud_index(addr);
> + for (; addr < end; addr = next, pud++) {
> + next = pud_addr_end(addr, end);
> +
> + if (!pud_present(*pud))
> + continue;
> +
> + if (map_page_size == PUD_SIZE) {
> + spin_lock(&init_mm.page_table_lock);
> + pte_clear(&init_mm, addr, (pte_t *)pud);
> + spin_unlock(&init_mm.page_table_lock);
> +
> + continue;
> + }
> +
> + pmd_base = (pmd_t *)pud_page_vaddr(*pud);
> + remove_pmd_table(pmd_base, addr, next, map_page_size);
> + free_pmd_table(pmd_base, pud);
> + }
> +}
> +
> +static void remove_pagetable(unsigned long start, unsigned long end,
> + unsigned long map_page_size)
> +{
> + unsigned long next;
> + unsigned long addr;
> + pgd_t *pgd;
> + pud_t *pud;
> +
> + for (addr = start; addr < end; addr = next) {
> + next = pgd_addr_end(addr, end);
> +
> + pgd = pgd_offset_k(addr);
> + if (!pgd_present(*pgd))
> + continue;
> +
> + pud = (pud_t *)pgd_page_vaddr(*pgd);
> + remove_pud_table(pud, addr, next, map_page_size);
> + free_pud_table(pud, pgd);
> + }
> +
> + flush_tlb_mm(&init_mm);
So we want to flush the full kernel TLB when we do a hotplug?
Maybe check using flush_tlb_kernel_range(). Also, flush_tlb_mm() does
check for mm_is_thread_local(). Do we update init_mm correctly to handle
that check? I assume we want a tlbie() here instead of tlbiel()?
> +}
> +
> int radix__create_section_mapping(unsigned long start, unsigned long end)
> {
> unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
> @@ -482,6 +635,16 @@ int radix__create_section_mapping(unsigned long start, unsigned long end)
>
> return 0;
> }
> +
> +int radix__remove_section_mapping(unsigned long start, unsigned long end)
> +{
> + unsigned long page_size = 1 << mmu_psize_defs[mmu_linear_psize].shift;
> +
> + start = _ALIGN_DOWN(start, page_size);
> + remove_pagetable(start, end, page_size);
> +
> + return 0;
> +}
> #endif /* CONFIG_MEMORY_HOTPLUG */
>
> #ifdef CONFIG_SPARSEMEM_VMEMMAP
-aneesh
* Re: [PATCH v3 4/5] powerpc/mm: add radix__remove_section_mapping()
2016-12-19 9:48 ` Aneesh Kumar K.V
@ 2016-12-19 18:11 ` Reza Arbab
2016-12-19 20:59 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 23+ messages in thread
From: Reza Arbab @ 2016-12-19 18:11 UTC (permalink / raw)
To: Aneesh Kumar K.V
Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
linuxppc-dev, Balbir Singh, Alistair Popple
On Mon, Dec 19, 2016 at 03:18:07PM +0530, Aneesh Kumar K.V wrote:
>Reza Arbab <arbab@linux.vnet.ibm.com> writes:
>> +static void remove_pte_table(pte_t *pte_start, unsigned long addr,
>> + unsigned long end)
>> +{
>> + unsigned long next;
>> + pte_t *pte;
>> +
>> + pte = pte_start + pte_index(addr);
>> + for (; addr < end; addr = next, pte++) {
>> + next = (addr + PAGE_SIZE) & PAGE_MASK;
>> + if (next > end)
>> + next = end;
>> +
>> + if (!pte_present(*pte))
>> + continue;
>> +
>> + spin_lock(&init_mm.page_table_lock);
>> + pte_clear(&init_mm, addr, pte);
>> + spin_unlock(&init_mm.page_table_lock);
>> + }
>> +
>> + flush_tlb_mm(&init_mm);
>
>Why call a flush here? We do that at the end of remove_pagetable().
>Isn't that sufficient?
This was carried over from the x86 version of the function, where they
do flush_tlb_all(). I can experiment to make sure things work without
it.
>> +static void remove_pagetable(unsigned long start, unsigned long end,
>> + unsigned long map_page_size)
>> +{
>> + unsigned long next;
>> + unsigned long addr;
>> + pgd_t *pgd;
>> + pud_t *pud;
>> +
>> + for (addr = start; addr < end; addr = next) {
>> + next = pgd_addr_end(addr, end);
>> +
>> + pgd = pgd_offset_k(addr);
>> + if (!pgd_present(*pgd))
>> + continue;
>> +
>> + pud = (pud_t *)pgd_page_vaddr(*pgd);
>> + remove_pud_table(pud, addr, next, map_page_size);
>> + free_pud_table(pud, pgd);
>> + }
>> +
>> + flush_tlb_mm(&init_mm);
>
>
>So we want to flush the full kernel TLB when we do a hotplug?
>Maybe check using flush_tlb_kernel_range(). Also, flush_tlb_mm() does
>check for mm_is_thread_local(). Do we update init_mm correctly to handle
>that check? I assume we want a tlbie() here instead of tlbiel()?
I'll try using flush_tlb_kernel_range() instead. That sure does seem
more appropriate.
--
Reza Arbab
* Re: [PATCH v3 4/5] powerpc/mm: add radix__remove_section_mapping()
2016-12-19 9:48 ` Aneesh Kumar K.V
2016-12-19 18:11 ` Reza Arbab
@ 2016-12-19 20:59 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 23+ messages in thread
From: Benjamin Herrenschmidt @ 2016-12-19 20:59 UTC (permalink / raw)
To: Aneesh Kumar K.V, Reza Arbab, Michael Ellerman, Paul Mackerras
Cc: linuxppc-dev, Balbir Singh, Alistair Popple
On Mon, 2016-12-19 at 15:18 +0530, Aneesh Kumar K.V wrote:
> > + pte = pte_start + pte_index(addr);
> > + for (; addr < end; addr = next, pte++) {
> > + next = (addr + PAGE_SIZE) & PAGE_MASK;
> > + if (next > end)
> > + next = end;
> > +
> > + if (!pte_present(*pte))
> > + continue;
> > +
> > + spin_lock(&init_mm.page_table_lock);
> > + pte_clear(&init_mm, addr, pte);
> > + spin_unlock(&init_mm.page_table_lock);
> > + }
> > +
> > + flush_tlb_mm(&init_mm);
>
> Why call a flush here? We do that at the end of remove_pagetable().
> Isn't that sufficient?
All those lock/unlock ... what for? Can't we just do the whole page?
Also, I agree, we can delay the flush of the PTEs to the end.
Ben.
* [PATCH v3 5/5] powerpc/mm: unstub radix__vmemmap_remove_mapping()
2016-12-15 19:50 [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix Reza Arbab
` (3 preceding siblings ...)
2016-12-15 19:50 ` [PATCH v3 4/5] powerpc/mm: add radix__remove_section_mapping() Reza Arbab
@ 2016-12-15 19:50 ` Reza Arbab
2016-12-16 14:38 ` [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix Balbir Singh
5 siblings, 0 replies; 23+ messages in thread
From: Reza Arbab @ 2016-12-15 19:50 UTC (permalink / raw)
To: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras
Cc: linuxppc-dev, Aneesh Kumar K.V, Balbir Singh, Alistair Popple
Use remove_pagetable() and friends for radix vmemmap removal.
We do not require the special-case handling of vmemmap done in the x86
versions of these functions. This is because vmemmap_free() has already
freed the mapped pages, and calls us with an aligned address range.
So, add a few failsafe WARNs, but otherwise the code to remove linear
mappings is already sufficient for vmemmap.
Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
---
arch/powerpc/mm/pgtable-radix.c | 23 ++++++++++++++++++++++-
1 file changed, 22 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 315237c..9d1d51e 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -532,6 +532,15 @@ static void remove_pte_table(pte_t *pte_start, unsigned long addr,
if (!pte_present(*pte))
continue;
+ if (!PAGE_ALIGNED(addr) || !PAGE_ALIGNED(next)) {
+ /*
+ * The vmemmap_free() and remove_section_mapping()
+ * codepaths call us with aligned addresses.
+ */
+ WARN_ONCE(1, "%s: unaligned range\n", __func__);
+ continue;
+ }
+
spin_lock(&init_mm.page_table_lock);
pte_clear(&init_mm, addr, pte);
spin_unlock(&init_mm.page_table_lock);
@@ -555,6 +564,12 @@ static void remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
continue;
if (map_page_size == PMD_SIZE) {
+ if (!IS_ALIGNED(addr, PMD_SIZE) ||
+ !IS_ALIGNED(next, PMD_SIZE)) {
+ WARN_ONCE(1, "%s: unaligned range\n", __func__);
+ continue;
+ }
+
spin_lock(&init_mm.page_table_lock);
pte_clear(&init_mm, addr, (pte_t *)pmd);
spin_unlock(&init_mm.page_table_lock);
@@ -583,6 +598,12 @@ static void remove_pud_table(pud_t *pud_start, unsigned long addr,
continue;
if (map_page_size == PUD_SIZE) {
+ if (!IS_ALIGNED(addr, PUD_SIZE) ||
+ !IS_ALIGNED(next, PUD_SIZE)) {
+ WARN_ONCE(1, "%s: unaligned range\n", __func__);
+ continue;
+ }
+
spin_lock(&init_mm.page_table_lock);
pte_clear(&init_mm, addr, (pte_t *)pud);
spin_unlock(&init_mm.page_table_lock);
@@ -662,7 +683,7 @@ int __meminit radix__vmemmap_create_mapping(unsigned long start,
#ifdef CONFIG_MEMORY_HOTPLUG
void radix__vmemmap_remove_mapping(unsigned long start, unsigned long page_size)
{
- /* FIXME!! intel does more. We should free page tables mapping vmemmap ? */
+ remove_pagetable(start, start + page_size, page_size);
}
#endif
#endif
--
1.8.3.1
* Re: [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix
2016-12-15 19:50 [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix Reza Arbab
` (4 preceding siblings ...)
2016-12-15 19:50 ` [PATCH v3 5/5] powerpc/mm: unstub radix__vmemmap_remove_mapping() Reza Arbab
@ 2016-12-16 14:38 ` Balbir Singh
2016-12-19 17:58 ` Reza Arbab
5 siblings, 1 reply; 23+ messages in thread
From: Balbir Singh @ 2016-12-16 14:38 UTC (permalink / raw)
To: Reza Arbab, Michael Ellerman, Benjamin Herrenschmidt,
Paul Mackerras
Cc: linuxppc-dev, Aneesh Kumar K.V, Alistair Popple
On 16/12/16 06:50, Reza Arbab wrote:
> Memory hotplug is leading to hash page table calls, even on radix:
>
> ...
> arch_add_memory
> create_section_mapping
> htab_bolt_mapping
> BUG_ON(!ppc_md.hpte_insert);
>
> To fix, refactor {create,remove}_section_mapping() into hash__ and radix__
> variants. Implement the radix versions by borrowing from existing vmemmap
> and x86 code.
>
> This passes basic verification of plugging and removing memory, but this
> stuff is tricky and I'd appreciate extra scrutiny of the series for
> correctness--in particular, the adaptation of remove_pagetable() from x86.
At a quick glance everything seemed alright to me. I'll review each patch and ack/provide comments.
Do we care about alt maps yet?
Balbir Singh.
* Re: [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix
2016-12-16 14:38 ` [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix Balbir Singh
@ 2016-12-19 17:58 ` Reza Arbab
2016-12-21 6:54 ` Anshuman Khandual
0 siblings, 1 reply; 23+ messages in thread
From: Reza Arbab @ 2016-12-19 17:58 UTC (permalink / raw)
To: Balbir Singh
Cc: Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
linuxppc-dev, Aneesh Kumar K.V, Alistair Popple
On Sat, Dec 17, 2016 at 01:38:40AM +1100, Balbir Singh wrote:
>Do we care about alt maps yet?
Good question. I'll try to see if/how altmaps might need special
consideration here.
--
Reza Arbab
* Re: [PATCH v3 0/5] powerpc/mm: enable memory hotplug on radix
2016-12-19 17:58 ` Reza Arbab
@ 2016-12-21 6:54 ` Anshuman Khandual
0 siblings, 0 replies; 23+ messages in thread
From: Anshuman Khandual @ 2016-12-21 6:54 UTC (permalink / raw)
To: Reza Arbab, Balbir Singh
Cc: Alistair Popple, Paul Mackerras, Aneesh Kumar K.V, linuxppc-dev
On 12/19/2016 11:28 PM, Reza Arbab wrote:
> On Sat, Dec 17, 2016 at 01:38:40AM +1100, Balbir Singh wrote:
>> Do we care about alt maps yet?
>
> Good question. I'll try to see if/how altmaps might need special
> consideration here.
vmemmap alt-map support should be enabled separately and should not be mixed
into this series, which tries to just get hotplug working on radix.