* [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t pointer
@ 2013-05-12 8:35 Aneesh Kumar K.V
2013-05-13 13:48 ` Aneesh Kumar K.V
0 siblings, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2013-05-12 8:35 UTC (permalink / raw)
To: linux-mm, akpm, aarcange; +Cc: Aneesh Kumar K.V
From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
We should not use set_pmd_at to update pmd_t with pgtable_t pointer. set_pmd_at
is used to set pmd with huge pte entries and architectures like ppc64, clear
few flags from the pte when saving a new entry. Without this change we observe
bad pte errors like below on ppc64 with THP enabled.
BUG: Bad page map in process ld mm=0xc000001ee39f4780 pte:7fc3f37848000001 pmd:c000001ec0000000
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
mm/huge_memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index bf281ce..cc47b29 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2333,7 +2333,7 @@ static void collapse_huge_page(struct mm_struct *mm,
pte_unmap(pte);
spin_lock(&mm->page_table_lock);
BUG_ON(!pmd_none(*pmd));
- set_pmd_at(mm, address, pmd, _pmd);
+ pmd_populate(mm, pmd, _pmd);
spin_unlock(&mm->page_table_lock);
anon_vma_unlock_write(vma->anon_vma);
goto out;
--
1.8.1.2
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t pointer
2013-05-12 8:35 [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t pointer Aneesh Kumar K.V
@ 2013-05-13 13:48 ` Aneesh Kumar K.V
2013-05-13 14:13 ` Andrea Arcangeli
0 siblings, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2013-05-13 13:48 UTC (permalink / raw)
To: linux-mm, akpm, aarcange
updated one fixing a compile warning.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t pointer
2013-05-13 13:48 ` Aneesh Kumar K.V
@ 2013-05-13 14:13 ` Andrea Arcangeli
2013-05-13 14:44 ` Aneesh Kumar K.V
0 siblings, 1 reply; 8+ messages in thread
From: Andrea Arcangeli @ 2013-05-13 14:13 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linux-mm, akpm
Hi Aneesh,
On Mon, May 13, 2013 at 07:18:57PM +0530, Aneesh Kumar K.V wrote:
>
> updated one fixing a compile warning.
>
> From f721c77eb0d6aaf75758e8e93991a05207680ac8 Mon Sep 17 00:00:00 2001
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> Date: Sun, 12 May 2013 01:59:00 +0530
> Subject: [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t
> pointer
>
> We should not use set_pmd_at to update pmd_t with pgtable_t pointer. set_pmd_at
> is used to set pmd with huge pte entries and architectures like ppc64, clear
> few flags from the pte when saving a new entry. Without this change we observe
> bad pte errors like below on ppc64 with THP enabled.
>
> BUG: Bad page map in process ld mm=0xc000001ee39f4780 pte:7fc3f37848000001 pmd:c000001ec0000000
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
> mm/huge_memory.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 03a89a2..f0bad1f 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2325,7 +2325,7 @@ static void collapse_huge_page(struct mm_struct *mm,
> pte_unmap(pte);
> spin_lock(&mm->page_table_lock);
> BUG_ON(!pmd_none(*pmd));
> - set_pmd_at(mm, address, pmd, _pmd);
> + pmd_populate(mm, pmd, (pgtable_t)_pmd);
> spin_unlock(&mm->page_table_lock);
> anon_vma_unlock_write(vma->anon_vma);
> goto out;
Great, looks like you found the ppc problem with gcc builds and that
explains also why it cannot happen on x86.
But about the fix, did you test it? The above should be:
pmd_populate(mm, pmd, pmd_pgtable(_pmd)) instead.
_pmd is not a pointer to a page struct and the cast seems to be hiding
a bug. _pmd if something is a physical address potentially with some
high bit set not making it a good physical address either.
So you can only use set_pmd_at when establishing hugepmds, and never
for establishing regular pmds that points to regular pagetables. I
guess a comment would be good to add too.
Thanks!
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t pointer
2013-05-13 14:13 ` Andrea Arcangeli
@ 2013-05-13 14:44 ` Aneesh Kumar K.V
2013-05-13 15:06 ` Aneesh Kumar K.V
0 siblings, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2013-05-13 14:44 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: linux-mm, akpm
Andrea Arcangeli <aarcange@redhat.com> writes:
> Hi Aneesh,
>
> On Mon, May 13, 2013 at 07:18:57PM +0530, Aneesh Kumar K.V wrote:
>>
>> updated one fixing a compile warning.
>>
>> From f721c77eb0d6aaf75758e8e93991a05207680ac8 Mon Sep 17 00:00:00 2001
>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>> Date: Sun, 12 May 2013 01:59:00 +0530
>> Subject: [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t
>> pointer
>>
>> We should not use set_pmd_at to update pmd_t with pgtable_t pointer. set_pmd_at
>> is used to set pmd with huge pte entries and architectures like ppc64, clear
>> few flags from the pte when saving a new entry. Without this change we observe
>> bad pte errors like below on ppc64 with THP enabled.
>>
>> BUG: Bad page map in process ld mm=0xc000001ee39f4780 pte:7fc3f37848000001 pmd:c000001ec0000000
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>> mm/huge_memory.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 03a89a2..f0bad1f 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -2325,7 +2325,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>> pte_unmap(pte);
>> spin_lock(&mm->page_table_lock);
>> BUG_ON(!pmd_none(*pmd));
>> - set_pmd_at(mm, address, pmd, _pmd);
>> + pmd_populate(mm, pmd, (pgtable_t)_pmd);
>> spin_unlock(&mm->page_table_lock);
>> anon_vma_unlock_write(vma->anon_vma);
>> goto out;
>
> Great, looks like you found the ppc problem with gcc builds and that
> explains also why it cannot happen on x86.
yes. That was the reason for the failure.
>
> But about the fix, did you test it? The above should be:
> pmd_populate(mm, pmd, pmd_pgtable(_pmd)) instead.
>
Yes and it worked in powerpc because we have for ppc64
static inline pgtable_t pmd_pgtable(pmd_t pmd)
{
return (pgtable_t)(pmd_val(pmd) & ~PMD_MASKED_BITS);
}
That is because we share the PTE page with multiple pmds
> _pmd is not a pointer to a page struct and the cast seems to be hiding
> a bug. _pmd if something is a physical address potentially with some
> high bit set not making it a good physical address either.
>
> So you can only use set_pmd_at when establishing hugepmds, and never
> for establishing regular pmds that points to regular pagetables. I
> guess a comment would be good to add too.
>
I will send an updated patch.
-aneesh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t pointer
2013-05-13 14:44 ` Aneesh Kumar K.V
@ 2013-05-13 15:06 ` Aneesh Kumar K.V
2013-05-16 13:18 ` Andrea Arcangeli
0 siblings, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2013-05-13 15:06 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: linux-mm, akpm
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
> Andrea Arcangeli <aarcange@redhat.com> writes:
>
>> Hi Aneesh,
>>
>> On Mon, May 13, 2013 at 07:18:57PM +0530, Aneesh Kumar K.V wrote:
>>>
>>> updated one fixing a compile warning.
>>>
>>> From f721c77eb0d6aaf75758e8e93991a05207680ac8 Mon Sep 17 00:00:00 2001
>>> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
>>> Date: Sun, 12 May 2013 01:59:00 +0530
>>> Subject: [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t
>>> pointer
>>>
>>> We should not use set_pmd_at to update pmd_t with pgtable_t pointer. set_pmd_at
>>> is used to set pmd with huge pte entries and architectures like ppc64, clear
>>> few flags from the pte when saving a new entry. Without this change we observe
>>> bad pte errors like below on ppc64 with THP enabled.
>>>
>>> BUG: Bad page map in process ld mm=0xc000001ee39f4780 pte:7fc3f37848000001 pmd:c000001ec0000000
>>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>>> ---
>>> mm/huge_memory.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 03a89a2..f0bad1f 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -2325,7 +2325,7 @@ static void collapse_huge_page(struct mm_struct *mm,
>>> pte_unmap(pte);
>>> spin_lock(&mm->page_table_lock);
>>> BUG_ON(!pmd_none(*pmd));
>>> - set_pmd_at(mm, address, pmd, _pmd);
>>> + pmd_populate(mm, pmd, (pgtable_t)_pmd);
>>> spin_unlock(&mm->page_table_lock);
>>> anon_vma_unlock_write(vma->anon_vma);
>>> goto out;
>>
>> Great, looks like you found the ppc problem with gcc builds and that
>> explains also why it cannot happen on x86.
>
> yes. That was the reason for the failure.
>
The compiler crash was due to the fact that we didn't do hardware hash
pte flush on pmdp_clear_flush. That means we had previous translations
available while we did a hugepage copy on collapse. Once we fixed that we started
hitting the above Bad page map error. The interesting part for THP on
ppc64 is that we have
static inline void flush_tlb_range(struct vm_area_struct *vma,
unsigned long start, unsigned long end)
{
}
That means we don't really wait for those page table walks with local irq
disabled to finish in both split page and collapse huge page. I handled
that by looking at _PAGE_SPLITTING before we mark the pte _PAGE_BUSY.
Details are captured here
https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-May/106406.html
https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-May/106410.html
It would be nice if you could review the patches.
-aneesh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t pointer
2013-05-13 15:06 ` Aneesh Kumar K.V
@ 2013-05-16 13:18 ` Andrea Arcangeli
2013-05-16 14:25 ` Aneesh Kumar K.V
0 siblings, 1 reply; 8+ messages in thread
From: Andrea Arcangeli @ 2013-05-16 13:18 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linux-mm, akpm
Hi Aneesh,
On Mon, May 13, 2013 at 08:36:38PM +0530, Aneesh Kumar K.V wrote:
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-May/106406.html
You need ACCESS_ONCE() in all "pgd = ACCESS_ONCE(*pgdp)", "pud =
ACCESS_ONCE(*pudp)" otherwise the compiler could decide your change is
a noop.
I think you could remove the #ifdef CONFIG_TRANSPARENT_HUGEPAGE too.
Thanks,
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t pointer
2013-05-16 13:18 ` Andrea Arcangeli
@ 2013-05-16 14:25 ` Aneesh Kumar K.V
2013-05-16 14:51 ` Andrea Arcangeli
0 siblings, 1 reply; 8+ messages in thread
From: Aneesh Kumar K.V @ 2013-05-16 14:25 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: linux-mm, akpm
Andrea Arcangeli <aarcange@redhat.com> writes:
> Hi Aneesh,
>
> On Mon, May 13, 2013 at 08:36:38PM +0530, Aneesh Kumar K.V wrote:
>> https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-May/106406.html
>
> You need ACCESS_ONCE() in all "pgd = ACCESS_ONCE(*pgdp)", "pud =
> ACCESS_ONCE(*pudp)" otherwise the compiler could decide your change is
> a noop.
Will do. I guess we have similar one for x86 here
http://article.gmane.org/gmane.linux.kernel/1483617
May be ppc64 gup walk also need similar changes ?
>
> I think you could remove the #ifdef CONFIG_TRANSPARENT_HUGEPAGE too.
That was becaue i had pte_pmd available only with that config. I will
see if we can fix that.
-aneesh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t pointer
2013-05-16 14:25 ` Aneesh Kumar K.V
@ 2013-05-16 14:51 ` Andrea Arcangeli
0 siblings, 0 replies; 8+ messages in thread
From: Andrea Arcangeli @ 2013-05-16 14:51 UTC (permalink / raw)
To: Aneesh Kumar K.V; +Cc: linux-mm, akpm
On Thu, May 16, 2013 at 07:55:23PM +0530, Aneesh Kumar K.V wrote:
> Andrea Arcangeli <aarcange@redhat.com> writes:
>
> > Hi Aneesh,
> >
> > On Mon, May 13, 2013 at 08:36:38PM +0530, Aneesh Kumar K.V wrote:
> >> https://lists.ozlabs.org/pipermail/linuxppc-dev/2013-May/106406.html
> >
> > You need ACCESS_ONCE() in all "pgd = ACCESS_ONCE(*pgdp)", "pud =
> > ACCESS_ONCE(*pudp)" otherwise the compiler could decide your change is
> > a noop.
>
> Will do. I guess we have similar one for x86 here
Exactly.
>
> http://article.gmane.org/gmane.linux.kernel/1483617
>
> May be ppc64 gup walk also need similar changes ?
I think so, yes.
> > I think you could remove the #ifdef CONFIG_TRANSPARENT_HUGEPAGE too.
>
> That was becaue i had pte_pmd available only with that config. I will
> see if we can fix that.
Ok, up to you.
Thanks,
Andrea
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2013-05-16 14:51 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-05-12 8:35 [PATCH] mm/THP: Use pmd_populate to update the pmd with pgtable_t pointer Aneesh Kumar K.V
2013-05-13 13:48 ` Aneesh Kumar K.V
2013-05-13 14:13 ` Andrea Arcangeli
2013-05-13 14:44 ` Aneesh Kumar K.V
2013-05-13 15:06 ` Aneesh Kumar K.V
2013-05-16 13:18 ` Andrea Arcangeli
2013-05-16 14:25 ` Aneesh Kumar K.V
2013-05-16 14:51 ` Andrea Arcangeli
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).