From: mchitale@ventanamicro.com
To: Alexandre Ghiti <alex@ghiti.fr>,
Paul Walmsley <paul.walmsley@sifive.com>,
Palmer Dabbelt <palmer@dabbelt.com>,
Albert Ou <aou@eecs.berkeley.edu>,
linux-riscv@lists.infradead.org
Cc: Nanyong Sun <sunnanyong@huawei.com>,
Anup Patel <apatel@ventanamicro.com>,
Andrew Jones <ajones@ventanamicro.com>
Subject: Re: [PATCH] riscv: mm: Implement pmdp_collapse_flush for THP
Date: Mon, 30 Jan 2023 12:59:42 +0530 [thread overview]
Message-ID: <80649c8a017f7e5cbb4b23d7625bbb4737bfe5db.camel@ventanamicro.com> (raw)
In-Reply-To: <117752cf-c02b-4264-87c3-42c81aa429f7@ghiti.fr>
On Thu, 2023-01-26 at 16:33 +0100, Alexandre Ghiti wrote:
> Hi Mayuresh,
>
> On 1/25/23 13:55, Mayuresh Chitale wrote:
> > When THP is enabled, 4K pages are collapsed into a single huge
> > page using the generic pmdp_collapse_flush() which will further
> > use flush_tlb_range() to shoot-down stale TLB entries.
> > Unfortunately,
> > the generic pmdp_collapse_flush() only invalidates cached leaf PTEs
> > using address specific SFENCEs which results in repetitive (or
> > unpredictable) page faults on RISC-V implementations which cache
> > non-leaf PTEs.
>
> That's interesting! I'm wondering if the same issue will happen if a
> user maps 4K, unmaps it and at the same address maps a 2MB hugepage:
> I'm
> not sure the mm code would correctly flush the non-leaf PTE when
> unmapping the 4KB page. In that case, your patch only fixes the THP
> usecase and maybe we should try to catch this non-leaf -> leaf
> upgrade
> at some lower level page table functions, what do you think?
I will look into it but I dont know how to reproduce the issue without
the THP use case. It would be great if you could share the test case or
test code to reproduce it.
>
> Alex
>
>
> > Provide a RISC-V specific pmdp_collapse_flush() which ensures both
> > cached leaf and non-leaf PTEs are invalidated by using non-address
> > specific SFENCEs as recommended by the RISC-V privileged
> > specification.
> >
> > Fixes: e88b333142e4 ("riscv: mm: add THP support on 64-bit")
> > Signed-off-by: Mayuresh Chitale <mchitale@ventanamicro.com>
> > ---
> > arch/riscv/include/asm/pgtable.h | 24 ++++++++++++++++++++++++
> > 1 file changed, 24 insertions(+)
> >
> > diff --git a/arch/riscv/include/asm/pgtable.h
> > b/arch/riscv/include/asm/pgtable.h
> > index 4eba9a98d0e3..6d948dec6020 100644
> > --- a/arch/riscv/include/asm/pgtable.h
> > +++ b/arch/riscv/include/asm/pgtable.h
> > @@ -721,6 +721,30 @@ static inline pmd_t pmdp_establish(struct
> > vm_area_struct *vma,
> > page_table_check_pmd_set(vma->vm_mm, address, pmdp, pmd);
> > return __pmd(atomic_long_xchg((atomic_long_t *)pmdp,
> > pmd_val(pmd)));
> > }
> > +
> > +#define pmdp_collapse_flush pmdp_collapse_flush
> > +static inline pmd_t pmdp_collapse_flush(struct vm_area_struct
> > *vma,
> > + unsigned long address, pmd_t
> > *pmdp)
> > +{
> > + pmd_t pmd = pmdp_huge_get_and_clear(vma->vm_mm, address, pmdp);
> > +
> > + /*
> > + * When leaf PTE enteries (regular pages) are collapsed into a
> > leaf
> > + * PMD entry (huge page), a valid non-leaf PTE is converted
> > into a
> > + * valid leaf PTE at the level 1 page table. The RISC-V
> > privileged v1.12
> > + * specification allows implementations to cache valid non-leaf
> > PTEs,
> > + * but the section "4.2.1 Supervisor Memory-Management Fence
> > + * Instruction" recommends the following:
> > + * "If software modifies a non-leaf PTE, it should execute
> > SFENCE.VMA
> > + * with rs1=x0. If any PTE along the traversal path had its G
> > bit set,
> > + * rs2 must be x0; otherwise, rs2 should be set to the ASID for
> > which
> > + * the translation is being modified."
> > + * Based on the above recommendation, we should do full flush
> > whenever
> > + * leaf PTE entries are collapsed into a leaf PMD entry.
> > + */
> > + flush_tlb_mm(vma->vm_mm);
> > + return pmd;
> > +}
> > #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
> >
> > /*
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv
prev parent reply other threads:[~2023-01-30 7:30 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-25 12:55 [PATCH] riscv: mm: Implement pmdp_collapse_flush for THP Mayuresh Chitale
2023-01-25 15:35 ` Andrew Jones
2023-01-30 7:26 ` mchitale
2023-01-26 15:33 ` Alexandre Ghiti
2023-01-26 18:14 ` Anup Patel
2023-01-27 8:41 ` Alexandre Ghiti
2023-01-30 7:34 ` mchitale
2023-01-30 7:29 ` mchitale [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=80649c8a017f7e5cbb4b23d7625bbb4737bfe5db.camel@ventanamicro.com \
--to=mchitale@ventanamicro.com \
--cc=ajones@ventanamicro.com \
--cc=alex@ghiti.fr \
--cc=aou@eecs.berkeley.edu \
--cc=apatel@ventanamicro.com \
--cc=linux-riscv@lists.infradead.org \
--cc=palmer@dabbelt.com \
--cc=paul.walmsley@sifive.com \
--cc=sunnanyong@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.