From: Brendan Jackman <jackmanb@google.com>
To: Dave Hansen <dave.hansen@intel.com>,
	"Roy, Patrick" <roypat@amazon.co.uk>
Cc: "pbonzini@redhat.com" <pbonzini@redhat.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	 "maz@kernel.org" <maz@kernel.org>,
	"oliver.upton@linux.dev" <oliver.upton@linux.dev>,
	 "joey.gouly@arm.com" <joey.gouly@arm.com>,
	"suzuki.poulose@arm.com" <suzuki.poulose@arm.com>,
	 "yuzenghui@huawei.com" <yuzenghui@huawei.com>,
	"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
	 "will@kernel.org" <will@kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	 "mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	 "dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"x86@kernel.org" <x86@kernel.org>,
	 "hpa@zytor.com" <hpa@zytor.com>,
	"luto@kernel.org" <luto@kernel.org>,
	 "peterz@infradead.org" <peterz@infradead.org>,
	"willy@infradead.org" <willy@infradead.org>,
	 "akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"david@redhat.com" <david@redhat.com>,
	 "lorenzo.stoakes@oracle.com" <lorenzo.stoakes@oracle.com>,
	 "Liam.Howlett@oracle.com" <Liam.Howlett@oracle.com>,
	"vbabka@suse.cz" <vbabka@suse.cz>,
	 "rppt@kernel.org" <rppt@kernel.org>,
	"surenb@google.com" <surenb@google.com>,
	"mhocko@suse.com" <mhocko@suse.com>,
	 "song@kernel.org" <song@kernel.org>,
	"jolsa@kernel.org" <jolsa@kernel.org>,
	"ast@kernel.org" <ast@kernel.org>,
	 "daniel@iogearbox.net" <daniel@iogearbox.net>,
	"andrii@kernel.org" <andrii@kernel.org>,
	 "martin.lau@linux.dev" <martin.lau@linux.dev>,
	"eddyz87@gmail.com" <eddyz87@gmail.com>,
	 "yonghong.song@linux.dev" <yonghong.song@linux.dev>,
	 "john.fastabend@gmail.com" <john.fastabend@gmail.com>,
	"kpsingh@kernel.org" <kpsingh@kernel.org>,
	 "sdf@fomichev.me" <sdf@fomichev.me>,
	"haoluo@google.com" <haoluo@google.com>,
	"jgg@ziepe.ca" <jgg@ziepe.ca>,
	 "jhubbard@nvidia.com" <jhubbard@nvidia.com>,
	"peterx@redhat.com" <peterx@redhat.com>,
	 "jannh@google.com" <jannh@google.com>,
	"pfalcato@suse.de" <pfalcato@suse.de>,
	 "shuah@kernel.org" <shuah@kernel.org>,
	"seanjc@google.com" <seanjc@google.com>,
	 "kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	 "linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	 "linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	 "kvmarm@lists.linux.dev" <kvmarm@lists.linux.dev>,
	 "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	 "bpf@vger.kernel.org" <bpf@vger.kernel.org>,
	 "linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	"Cali, Marco" <xmarcalx@amazon.co.uk>,
	 "Kalyazin, Nikita" <kalyazin@amazon.co.uk>,
	"Thomson, Jack" <jackabt@amazon.co.uk>,
	 "derekmn@amazon.co.uk" <derekmn@amazon.co.uk>,
	"tabba@google.com" <tabba@google.com>,
	 "ackerleytng@google.com" <ackerleytng@google.com>
Subject: Re: [PATCH v7 06/12] KVM: guest_memfd: add module param for disabling TLB flushing
Date: Thu, 30 Oct 2025 16:05:05 +0000
Message-ID: <DDVS9ITBCE2Z.RSTLCU79EX8G@google.com>
In-Reply-To: <e25867b6-ffc0-4c7c-9635-9b3f47b186ca@intel.com>

On Thu Sep 25, 2025 at 6:27 PM UTC, Dave Hansen wrote:
> On 9/24/25 08:22, Roy, Patrick wrote:
>> Add an option to not perform TLB flushes after direct map manipulations.
>
> I'd really prefer this be left out for now. It's a massive can of worms.
> Let's agree on something that works and has well-defined behavior before
> we go breaking it on purpose.

As David pointed out in the MM Alignment Session yesterday, I might be
able to help here. In [0] I've proposed a way to break up the direct map
by ASI's "sensitivity" concept, which is a weaker property than the
"totally absent from the direct map" one being proposed here, but it has
kinda similar implementation challenges.

Basically it introduces a thing called a "freetype" that extends the
idea of migratetype. Like the existing idea of migratetype, it's used to
physically group pages when allocating, and you can index free pages by
it, i.e. each freetype gets its own freelist. But it can also encode
information other than mobility (and the other stuff that's encoded in
migratetype...).
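
To make the "each freetype gets its own freelist" idea concrete, here is
a toy user-space model in Python. It is only a sketch of the concept as
described above, not the code from [0]: the names FreeType, FreeLists
and the "sensitive" attribute are hypothetical, chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum


class MigrateType(Enum):
    # Two of the kernel's migratetypes, enough for the illustration.
    UNMOVABLE = 0
    MOVABLE = 1


@dataclass(frozen=True)
class FreeType:
    """Hypothetical: a migratetype plus one extra attribute."""
    migratetype: MigrateType
    sensitive: bool = False


class FreeLists:
    """One free list per freetype, like per-migratetype free lists."""

    def __init__(self):
        self._lists = defaultdict(list)

    def free(self, pfn, freetype):
        # Return a page frame number to the list for its freetype.
        self._lists[freetype].append(pfn)

    def alloc(self, freetype):
        # Hand out a page of exactly the requested freetype, or None.
        pages = self._lists[freetype]
        return pages.pop() if pages else None


lists = FreeLists()
plain = FreeType(MigrateType.MOVABLE)
sensitive = FreeType(MigrateType.MOVABLE, sensitive=True)
lists.free(100, plain)
lists.free(200, sensitive)
```

Because the freetype is the index, two pages with the same mobility but
different extra attributes never end up on the same list.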

Could it make sense to use that logic to just have entire pageblocks
that are absent from the direct map? Then when allocating memory for the
guest_memfd we get it from one of those pageblocks. Then we only have to
flush the TLB if there's no memory left in pageblocks of this freetype
(so the allocator has to flip another pageblock over to the "no direct
map" freetype, after removing it from the direct map).
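
A toy Python model of that allocation policy, to show where the flush
would land. Everything here is an assumption made for illustration:
ToyAllocator, its method names and the 4-page pageblock are made up, and
a counter stands in for the real direct map update plus TLB flush.

```python
PAGES_PER_PAGEBLOCK = 4  # toy value; real pageblocks are much larger


class ToyAllocator:
    def __init__(self, nr_pageblocks):
        # Every pageblock starts out present in the direct map.
        self.mapped_blocks = list(range(nr_pageblocks))
        # Free pages that already have the "no direct map" freetype.
        self.unmapped_free_pages = []
        self.tlb_flushes = 0

    def _flip_pageblock(self):
        # Convert one whole pageblock to the "no direct map" freetype.
        block = self.mapped_blocks.pop()
        # Stand-in for unmapping the block and flushing the TLB once.
        self.tlb_flushes += 1
        first = block * PAGES_PER_PAGEBLOCK
        self.unmapped_free_pages.extend(
            range(first, first + PAGES_PER_PAGEBLOCK))

    def alloc_unmapped_page(self):
        # Only pay for a flush when the unmapped pool has run dry.
        if not self.unmapped_free_pages:
            if not self.mapped_blocks:
                raise MemoryError("no pageblocks left to convert")
            self._flip_pageblock()
        return self.unmapped_free_pages.pop()

    def free_unmapped_page(self, pfn):
        # The page stays out of the direct map, so no flush is needed.
        self.unmapped_free_pages.append(pfn)


allocator = ToyAllocator(nr_pageblocks=2)
pages = [allocator.alloc_unmapped_page() for _ in range(5)]
```

In this model the five allocations cost two flushes (one per converted
pageblock), and freeing a page and allocating it again costs none.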

I haven't investigated this properly yet; I'll start doing that now.
But I thought I'd drop this note right away in case anyone can
immediately see a reason why this doesn't work.

[0] https://lore.kernel.org/all/20250924-b4-asi-page-alloc-v1-0-2d861768041f@google.com/T/#t

BTW, I think if the skip-flush flag is the only thing blocking this
patchset, it would be great to merge the series without that flag. Even
if that means it's no use for Firecracker use cases, that doesn't mean
the underlying feature isn't valuable for _someone_. Then we can figure
out how to make it work for Firecracker afterwards, one way or another.

(Just to be transparent: my nefarious ulterior motive is that it would
give me an angle to start merging code that will eventually support ASI.
But, I'm serious that there are probably users who would like this
feature even if it's slow!)

