From: Matthew Wilcox <willy@infradead.org>
To: James Houghton <jthoughton@google.com>
Cc: Khalid Aziz <khalid.aziz@oracle.com>,
	Peter Xu <peterx@redhat.com>,
	Vishal Moola <vishal.moola@gmail.com>,
	Jane Chu <jane.chu@oracle.com>,
	Muchun Song <muchun.song@linux.dev>,
	linux-mm@kvack.org
Subject: Re: Unifying page table walkers
Date: Thu, 6 Jun 2024 22:21:13 +0100
Message-ID: <ZmIoSWK6k2MNsLmv@casper.infradead.org>
In-Reply-To: <CADrL8HXAyYhV=pKJyy5JRZDRgBed4UTSos=z2pRXAX9C0P7d2w@mail.gmail.com>

On Thu, Jun 06, 2024 at 01:23:08PM -0700, James Houghton wrote:
> On Thu, Jun 6, 2024 at 1:04 PM Matthew Wilcox <willy@infradead.org> wrote:
> > Right, so we ignore hugetlb_fault() and call into __handle_mm_fault().
> > Once there, we'll do:
> >
> >         vmf.pud = pud_alloc(mm, p4d, address);
> >         if (pud_none(*vmf.pud) &&
> >             thp_vma_allowable_order(vma, vm_flags,
> >                                 TVA_IN_PF | TVA_ENFORCE_SYSFS, PUD_ORDER)) {
> >                 ret = create_huge_pud(&vmf);
> >
> > which will call vma->vm_ops->huge_fault(vmf, PUD_ORDER);
> >
> > So all we need to do is implement huge_fault in hugetlb_vm_ops.  I
> > don't think that's the same as creating a hugetlbfs2 because it's just
> > another entry point.  You can mmap() the same file both ways and it's
> > all cache coherent.
> 
> That makes a lot of sense. FWIW, this sounds good to me (though I'm
> curious what Peter thinks :)).
> 
> But I think you'll need to be careful to ensure that, for now anyway,
> huge_fault() is always called with the exact same ptep/pmdp/pudp that
> hugetlb_walk() would have returned (ignoring sharing). If you allow
> PMD mapping of what would otherwise be PUD-mapped hugetlb pages right
> now, you'll break the vmemmap optimization (and probably other
> things).

Why is that?  This sounds like you know something I don't ;-)
Is it the mapcount issue?

> Also I'm not sure how this will interact with arm64's hugetlb pages
> implemented with contiguous PTEs/PMDs. You might have to round
> `address` down to make sure you've picked the first PTE/PMD in the
> group.

I hadn't thought about the sub-PMD-size hugetlb issue either.  We can
certainly restrict support to addresses aligned to the appropriate
hugetlb page size.
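
To make the alignment point concrete, here is a minimal userspace sketch
of rounding a faulting address down to the first entry of a contiguous
group.  It assumes the arm64 4K-granule geometry (16 contiguous PTEs for
a 64 KiB page, 16 contiguous PMDs for a 32 MiB page); the helper names
hpage_round_down() and hpage_aligned() are hypothetical, not kernel API,
and none of this is taken from an actual patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed arm64 4K-granule geometry; illustrative only. */
#define PAGE_SHIFT      12
#define PMD_SHIFT       21
#define CONT_PTE_SHIFT  (PAGE_SHIFT + 4)  /* 16 PTEs -> 64 KiB */
#define CONT_PMD_SHIFT  (PMD_SHIFT + 4)   /* 16 PMDs -> 32 MiB */

/*
 * Hypothetical helper: round addr down to the start of the hugetlb
 * page of size (1 << shift), i.e. the address mapped by the first
 * PTE/PMD in the contiguous group.
 */
static inline uint64_t hpage_round_down(uint64_t addr, unsigned int shift)
{
	assert(shift < 64);
	return addr & ~((UINT64_C(1) << shift) - 1);
}

/*
 * Hypothetical helper: nonzero if addr already names the first entry
 * of its group, which is what a "require alignment" policy would check.
 */
static inline int hpage_aligned(uint64_t addr, unsigned int shift)
{
	return hpage_round_down(addr, shift) == addr;
}
```

A fault handler taking the "require alignment" approach would reject (or
fall back on) any address for which hpage_aligned() is false; one taking
the "round down" approach would use hpage_round_down() before looking up
the page table entry.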

