From: Johannes Weiner <hannes@cmpxchg.org>
To: David Hildenbrand <david@redhat.com>
Cc: Usama Arif <usamaarif642@gmail.com>,
akpm@linux-foundation.org, linux-mm@kvack.org, riel@surriel.com,
shakeel.butt@linux.dev, roman.gushchin@linux.dev,
yuzhao@google.com, baohua@kernel.org, ryan.roberts@arm.com,
rppt@kernel.org, willy@infradead.org,
cerasuolodomenico@gmail.com, corbet@lwn.net,
linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
kernel-team@meta.com
Subject: Re: [PATCH 0/6] mm: split underutilized THPs
Date: Tue, 6 Aug 2024 13:28:30 -0400
Message-ID: <20240806172830.GD322282@cmpxchg.org>
In-Reply-To: <58025293-c70f-4377-b8be-39994136af83@redhat.com>
On Thu, Aug 01, 2024 at 08:36:32AM +0200, David Hildenbrand wrote:
> I just added another printf to postcopy_ram_supported_by_host(), where
> we temporarily do a UFFDIO_REGISTER on some test area.
>
> Sensing UFFD support # postcopy_ram_supported_by_host()
> Sensing UFFD support
> Writing received pages during precopy # ram_load_precopy()
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Writing received pages during precopy
> Disabling THP: MADV_NOHUGEPAGE # postcopy_ram_prepare_discard()
> Discarding pages # loadvm_postcopy_ram_handle_discard()
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Discarding pages
> Registering UFFD # postcopy_ram_incoming_setup()
>
> We could think about using this "ever used uffd" to avoid the shared
> zeropage in most processes.
>
> Of course, there might be other applications where that wouldn't work,
> but I think this behavior (write to area before enabling uffd) might be
> fairly QEMU specific already.

It makes me a bit uneasy to hardcode this into the kernel. It's fairly
specific to QEMU/CRIU, and won't protect use cases that behave slightly
differently.

It would also give userfaultfd users that aren't susceptible to this
particular scenario a different code path.

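For reference, the ordering in the quoted log (write to the area first,
then MADV_NOHUGEPAGE, then discard, with the uffd registration only at
the very end) can be sketched in userspace roughly as below. This is an
illustration under assumptions, not QEMU's code: write_then_discard() is
a made-up name, the discard is assumed to be MADV_DONTNEED on private
anonymous memory, and the final UFFDIO_REGISTER step is left out because
it often needs extra privileges.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: mimic the write -> MADV_NOHUGEPAGE -> discard
 * ordering from the log on a private anonymous mapping.
 * Returns the number of non-zero bytes seen after the discard,
 * or -1 on error.
 */
long write_then_discard(size_t npages)
{
	long page = sysconf(_SC_PAGESIZE);
	unsigned char *area;
	long nonzero = 0;
	size_t len, i;

	if (page <= 0 || npages == 0)
		return -1;
	len = npages * (size_t)page;

	area = mmap(NULL, len, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (area == MAP_FAILED)
		return -1;

	/* "Writing received pages during precopy": populate the area. */
	memset(area, 0xab, len);

	/*
	 * "Disabling THP": may fail with EINVAL on kernels built without
	 * THP, which does not matter for this sketch, so ignore the result.
	 */
	(void)madvise(area, len, MADV_NOHUGEPAGE);

	/* "Discarding pages": assumed here to be MADV_DONTNEED. */
	if (madvise(area, len, MADV_DONTNEED) != 0) {
		munmap(area, len);
		return -1;
	}

	/* After the discard, private anonymous memory reads back as zero. */
	for (i = 0; i < len; i++)
		if (area[i] != 0)
			nonzero++;

	munmap(area, len);
	return nonzero;
}
```

Calling write_then_discard(16) should return 0: everything written
before the discard is gone by the time uffd would be registered.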
> Avoiding the shared zeropage has the benefit that a later write fault
> won't have to do a TLB flush and can simply install a fresh anon page.

That's true - although if that happens frequently, it's something we
might want to tune the shrinker for anyway. If subpages do get used
later, we probably shouldn't have split the THP to begin with.

IMO the safest bet would be to use the zero page unconditionally.

> > 		return false;
> >
> > 	newpte = pte_mkspecial(pfn_pte(page_to_pfn(ZERO_PAGE(pvmw->address)),
> > 					pvmw->vma->vm_page_prot));
> >
> > 	set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
>
> We're replacing a present page by another present page without doing a
> TLB flush in between. I *think* this should be fine because the new
> present page is R/O and cannot possibly be written to.

It's safe because it's replacing a migration entry, not a present
page. The TLB was flushed when the migration entry was installed, and
since a migration PTE is not marked present, no TLB entry could have
been re-established in the meantime.
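
To spell out that argument, here is a toy userspace model. It is not
kernel code: struct toy_pte, struct toy_tlb and all toy_*() helpers are
invented for illustration, and the "TLB" is a single cached entry. It
only shows that once a non-present entry has been installed and the TLB
flushed, nothing can be cached until a present PTE appears again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one PTE slot and one cached translation. */
struct toy_pte {
	bool present;	/* only present entries can be walked and cached */
	bool writable;
	uint64_t pfn;
};

struct toy_tlb {
	bool valid;
	bool writable;
	uint64_t pfn;
};

/* Model of an access: hit the TLB, or refill it from a present PTE. */
bool toy_tlb_fill(struct toy_tlb *tlb, const struct toy_pte *pte)
{
	if (tlb->valid)
		return true;
	if (!pte->present)
		return false;	/* access would fault, nothing gets cached */
	tlb->valid = true;
	tlb->writable = pte->writable;
	tlb->pfn = pte->pfn;
	return true;
}

/* Step 1: replace the mapping with a non-present entry and flush. */
void toy_install_migration_entry(struct toy_pte *pte, struct toy_tlb *tlb)
{
	pte->present = false;
	pte->writable = false;
	tlb->valid = false;	/* the TLB flush happens here */
}

/* Step 2: install the read-only zero page mapping, without a flush. */
void toy_map_zero_page(struct toy_pte *pte, uint64_t zero_pfn)
{
	pte->present = true;
	pte->writable = false;
	pte->pfn = zero_pfn;
}

/* Returns true if a stale or writable translation could be observed. */
bool toy_stale_translation_possible(void)
{
	struct toy_pte pte = { .present = true, .writable = true, .pfn = 42 };
	struct toy_tlb tlb = { 0 };
	bool refilled;

	toy_tlb_fill(&tlb, &pte);		/* old mapping gets cached */
	toy_install_migration_entry(&pte, &tlb);

	/* Any access in this window faults instead of refilling the TLB. */
	refilled = toy_tlb_fill(&tlb, &pte);

	toy_map_zero_page(&pte, 7);
	toy_tlb_fill(&tlb, &pte);

	return refilled || tlb.pfn != 7 || tlb.writable;
}
```

In this model toy_stale_translation_possible() returns false: the only
translation that can be cached after step 2 is the read-only zero page
one.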