From: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
To: Michael Ellerman <mpe@ellerman.id.au>,
npiggin@gmail.com, benh@kernel.crashing.org, paulus@samba.org,
akpm@linux-foundation.org, x86@kernel.org
Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH V5 3/5] arch/powerpc/mm: Nest MMU workaround for mprotect RW upgrade.
Date: Thu, 31 Jan 2019 10:37:13 +0530
Message-ID: <87k1ilo1oe.fsf@linux.ibm.com>
In-Reply-To: <87fttaqux5.fsf@concordia.ellerman.id.au>

Michael Ellerman <mpe@ellerman.id.au> writes:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>> NestMMU requires us to mark the pte invalid and flush the tlb when we do a
>> RW upgrade of pte. We fixed a variant of this in the fault path in commit
>> Fixes: bd5050e38aec ("powerpc/mm/radix: Change pte relax sequence to handle nest MMU hang")
>
> You don't want the "Fixes:" there.
>
>>
>> Do the same for mprotect upgrades.
>>
>> Hugetlb is handled in the next patch.
>>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>> arch/powerpc/include/asm/book3s/64/pgtable.h | 18 ++++++++++++++
>> arch/powerpc/include/asm/book3s/64/radix.h | 4 ++++
>> arch/powerpc/mm/pgtable-book3s64.c | 25 ++++++++++++++++++++
>> arch/powerpc/mm/pgtable-radix.c | 18 ++++++++++++++
>> 4 files changed, 65 insertions(+)
>>
>> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> index 2e6ada28da64..92eaea164700 100644
>> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
>> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
>> @@ -1314,6 +1314,24 @@ static inline int pud_pfn(pud_t pud)
>> BUILD_BUG();
>> return 0;
>> }
>
> Can we get a blank line here?
>
>> +#define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
>> +pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
>> +void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
>> + pte_t *, pte_t, pte_t);
>
> So these are not inline ...
>
>> +/*
>> + * Returns true for a R -> RW upgrade of pte
>> + */
>> +static inline bool is_pte_rw_upgrade(unsigned long old_val, unsigned long new_val)
>> +{
>> +	if (!(old_val & _PAGE_READ))
>> +		return false;
>> +
>> +	if ((!(old_val & _PAGE_WRITE)) && (new_val & _PAGE_WRITE))
>> +		return true;
>> +
>> +	return false;
>> +}
>>
>> #endif /* __ASSEMBLY__ */
>> #endif /* _ASM_POWERPC_BOOK3S_64_PGTABLE_H_ */
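For readers following the hunk above, here is a minimal user-space sketch
of the R -> RW predicate. The DEMO_* bit values and the demo_ function
name are assumptions made for this illustration only; the real _PAGE_READ
and _PAGE_WRITE definitions live in the powerpc book3s/64 headers.

```c
#include <stdbool.h>

/*
 * Hypothetical bit positions, chosen only for this sketch. They are
 * not the values used by the powerpc headers.
 */
#define DEMO_PAGE_READ	0x4UL
#define DEMO_PAGE_WRITE	0x2UL

/* Same logic as the quoted is_pte_rw_upgrade(), on the stand-in bits. */
static bool demo_is_pte_rw_upgrade(unsigned long old_val, unsigned long new_val)
{
	/* A pte that was not readable is not an R -> RW upgrade. */
	if (!(old_val & DEMO_PAGE_READ))
		return false;

	/* Upgrade only if write was absent before and is present now. */
	return !(old_val & DEMO_PAGE_WRITE) && (new_val & DEMO_PAGE_WRITE);
}
```

Under these assumed bit values, only the read-only to read-write
transition returns true. A pte that was not readable, or that was already
writable, does not count as an upgrade.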
>> diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
>> index f3c31f5e1026..47c742f002ea 100644
>> --- a/arch/powerpc/mm/pgtable-book3s64.c
>> +++ b/arch/powerpc/mm/pgtable-book3s64.c
>> @@ -400,3 +400,28 @@ void arch_report_meminfo(struct seq_file *m)
>> atomic_long_read(&direct_pages_count[MMU_PAGE_1G]) << 20);
>> }
>> #endif /* CONFIG_PROC_FS */
>> +
>> +pte_t ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr,
>> + pte_t *ptep)
>> +{
>> +	unsigned long pte_val;
>> +
>> +	/*
>> +	 * Clear the _PAGE_PRESENT so that no hardware parallel update is
>> +	 * possible. Also keep the pte_present true so that we don't take
>> +	 * wrong fault.
>> +	 */
>> +	pte_val = pte_update(vma->vm_mm, addr, ptep, _PAGE_PRESENT, _PAGE_INVALID, 0);
>> +
>> +	return __pte(pte_val);
>> +
>> +}
>> +
>> +void ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr,
>> + pte_t *ptep, pte_t old_pte, pte_t pte)
>> +{
>
> Which means we're going to be doing a function call to get to here ...
>
>> +	if (radix_enabled())
>> +		return radix__ptep_modify_prot_commit(vma, addr,
>> +						      ptep, old_pte, pte);
>
> And then another function call to get to the radix version ...
>
>> +	set_pte_at(vma->vm_mm, addr, ptep, pte);
>> +}
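The start/commit pair quoted above can be modelled in user space as
follows. This is only a sketch under stated assumptions: the TX_* bit
values, the tx_ helper names, and the use of C11 atomics are all
hypothetical stand-ins, and the present check is assumed to accept either
the present bit or the invalid bit, which is what the comment in
ptep_modify_prot_start() implies.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit values for this sketch only. */
#define TX_PAGE_PRESENT	0x8UL
#define TX_PAGE_INVALID	0x1UL

/* Atomically clear 'clr', set 'set', and return the previous value. */
static unsigned long tx_pte_update(_Atomic unsigned long *ptep,
				   unsigned long clr, unsigned long set)
{
	unsigned long old = atomic_load(ptep);

	/* On failure 'old' is refreshed with the current value; retry. */
	while (!atomic_compare_exchange_weak(ptep, &old, (old & ~clr) | set))
		;
	return old;
}

/* Assumed software view: present or temporarily invalid both count. */
static bool tx_pte_present(unsigned long val)
{
	return (val & (TX_PAGE_PRESENT | TX_PAGE_INVALID)) != 0;
}

/* Start: hide the entry from hardware, hand back the old value. */
static unsigned long tx_modify_prot_start(_Atomic unsigned long *ptep)
{
	return tx_pte_update(ptep, TX_PAGE_PRESENT, TX_PAGE_INVALID);
}

/* Commit: publish the new value computed from the old one. */
static void tx_modify_prot_commit(_Atomic unsigned long *ptep,
				  unsigned long new_val)
{
	atomic_store(ptep, new_val);
}
```

Between start and commit the stored value has the present bit clear and
the invalid bit set, so in this model hardware would not use the entry
while software still treats it as present.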
>> diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
>> index 931156069a81..dced3cd241c2 100644
>> --- a/arch/powerpc/mm/pgtable-radix.c
>> +++ b/arch/powerpc/mm/pgtable-radix.c
>> @@ -1063,3 +1063,21 @@ void radix__ptep_set_access_flags(struct vm_area_struct *vma, pte_t *ptep,
>> }
>> /* See ptesync comment in radix__set_pte_at */
>> }
>> +
>> +void radix__ptep_modify_prot_commit(struct vm_area_struct *vma,
>> + unsigned long addr, pte_t *ptep,
>> + pte_t old_pte, pte_t pte)
>> +{
>> +	struct mm_struct *mm = vma->vm_mm;
>> +
>> +	/*
>> +	 * To avoid NMMU hang while relaxing access we need to flush the tlb before
>> +	 * we set the new value. We need to do this only for radix, because hash
>> +	 * translation does flush when updating the linux pte.
>> +	 */
>> +	if (is_pte_rw_upgrade(pte_val(old_pte), pte_val(pte)) &&
>> +	    (atomic_read(&mm->context.copros) > 0))
>> +		radix__flush_tlb_page(vma, addr);
>
> To finally get here, where we'll realise that 99.99% of processes don't
> use copros and so we have nothing to do except set the PTE.
>
>> +
>> +	set_pte_at(mm, addr, ptep, pte);
>> +}
>
> So can we just make it all inline in the header? Or do we think it's not
> a hot enough path to worry about it?
>
I did try that earlier, but IIRC it didn't work because of header
inclusion issues. I can try again in an add-on patch. That would
require moving things around so that the various struct definitions
are found correctly.
-aneesh
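
To make the inlining question concrete, here is a rough user-space sketch
of what an inline commit path could look like: the cheap checks stay
inline and only the flush is an out-of-line call. Everything here is an
assumption for illustration (the sketch_ names, the SKETCH_* bit values,
the struct layout, and a counter standing in for the TLB flush); it is
not the kernel code and it sidesteps the header dependencies mentioned
above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical bit values for this sketch only. */
#define SKETCH_PAGE_READ	0x4UL
#define SKETCH_PAGE_WRITE	0x2UL

/* Stand-in for the mm context; only the fields the sketch needs. */
struct sketch_mm {
	atomic_int copros;	/* number of coprocessors using this mm */
	unsigned long flushes;	/* counts calls to the stand-in flush */
};

static inline bool sketch_is_rw_upgrade(unsigned long old_val,
					unsigned long new_val)
{
	if (!(old_val & SKETCH_PAGE_READ))
		return false;
	return !(old_val & SKETCH_PAGE_WRITE) && (new_val & SKETCH_PAGE_WRITE);
}

/* Out-of-line slow path; a real version would flush the TLB entry. */
void sketch_flush_page(struct sketch_mm *mm, unsigned long addr)
{
	(void)addr;
	mm->flushes++;
}

/* Inline commit: the common no-copro case never leaves this function. */
static inline void sketch_modify_prot_commit(struct sketch_mm *mm,
					     unsigned long addr,
					     unsigned long *ptep,
					     unsigned long old_val,
					     unsigned long new_val)
{
	if (sketch_is_rw_upgrade(old_val, new_val) &&
	    atomic_load(&mm->copros) > 0)
		sketch_flush_page(mm, addr);

	*ptep = new_val;
}
```

With copros at zero the call reduces to the two inline checks and the
store, which is the common case Michael describes.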