From: Nikita Kalyazin
Date: Mon, 26 Jan 2026 16:56:10 +0000
Subject: Re: [PATCH v9 07/13] KVM: guest_memfd: Add flag to remove from direct map
To: Ackerley Tng, "Edgecombe, Rick P", linux-riscv@lists.infradead.org,
 kalyazin@amazon.co.uk, kernel@xen0n.name, linux-kselftest@vger.kernel.org,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-s390@vger.kernel.org,
 kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, kvm@vger.kernel.org, bpf@vger.kernel.org,
 linux-doc@vger.kernel.org, loongarch@lists.linux.dev
CC: david@kernel.org, palmer@dabbelt.com, catalin.marinas@arm.com,
 svens@linux.ibm.com, jgross@suse.com, surenb@google.com, riel@surriel.com,
 pfalcato@suse.de, peterx@redhat.com, x86@kernel.org, rppt@kernel.org,
 thuth@redhat.com, maz@kernel.org, dave.hansen@linux.intel.com, ast@kernel.org,
 vbabka@suse.cz, "Annapurve, Vishal", borntraeger@linux.ibm.com, alex@ghiti.fr,
 pjw@kernel.org, tglx@linutronix.de, willy@infradead.org, hca@linux.ibm.com,
 wyihan@google.com, ryan.roberts@arm.com, jolsa@kernel.org,
 yang@os.amperecomputing.com, jmattson@google.com, luto@kernel.org,
 aneesh.kumar@kernel.org, haoluo@google.com, patrick.roy@linux.dev,
 akpm@linux-foundation.org, coxu@redhat.com, mhocko@suse.com,
 mlevitsk@redhat.com, jgg@ziepe.ca, hpa@zytor.com, song@kernel.org,
 oupton@kernel.org, peterz@infradead.org, maobibo@loongson.cn,
 lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, jthoughton@google.com,
 martin.lau@linux.dev, jhubbard@nvidia.com, "Yu, Yu-cheng",
 Jonathan.Cameron@huawei.com, eddyz87@gmail.com, yonghong.song@linux.dev,
 chenhuacai@kernel.org, shuah@kernel.org, prsampat@amd.com,
 kevin.brodsky@arm.com, shijie@os.amperecomputing.com, suzuki.poulose@arm.com,
 itazur@amazon.co.uk, pbonzini@redhat.com, yuzenghui@huawei.com,
 dev.jain@arm.com, gor@linux.ibm.com, jackabt@amazon.co.uk,
 daniel@iogearbox.net, agordeev@linux.ibm.com, andrii@kernel.org,
 mingo@redhat.com, aou@eecs.berkeley.edu, joey.gouly@arm.com,
 derekmn@amazon.com, xmarcalx@amazon.co.uk, kpsingh@kernel.org,
 sdf@fomichev.me, jackmanb@google.com, bp@alien8.de, corbet@lwn.net,
 jannh@google.com, john.fastabend@gmail.com, kas@kernel.org, will@kernel.org,
 seanjc@google.com
References: <20260114134510.1835-1-kalyazin@amazon.com>
 <20260114134510.1835-8-kalyazin@amazon.com>
 <294bca75-2f3e-46db-bb24-7c471a779cc1@amazon.com>
Reply-To: kalyazin@amazon.com

On 22/01/2026 18:37, Ackerley Tng wrote:
> Nikita Kalyazin writes:
>
>> On 16/01/2026 00:00, Edgecombe, Rick P wrote:
>>> On Wed, 2026-01-14 at 13:46 +0000, Kalyazin, Nikita wrote:
>>>> +static void kvm_gmem_folio_restore_direct_map(struct folio *folio)
>>>> +{
>>>> +	/*
>>>> +	 * Direct map restoration cannot fail, as the only error condition
>>>> +	 * for direct map manipulation is failure to allocate page tables
>>>> +	 * when splitting huge pages, but this split would have already
>>>> +	 * happened in folio_zap_direct_map() in kvm_gmem_folio_zap_direct_map().
>
> Do you know if folio_restore_direct_map() will also end up merging page
> table entries to a higher level?
>
>>>> +	 * Thus folio_restore_direct_map() here only updates prot bits.
>>>> +	 */
>>>> +	if (kvm_gmem_folio_no_direct_map(folio)) {
>>>> +		WARN_ON_ONCE(folio_restore_direct_map(folio));
>>>> +		folio->private = (void *)((u64)folio->private & ~KVM_GMEM_FOLIO_NO_DIRECT_MAP);
>>>> +	}
>>>> +}
>>>> +
>>>
>>> Does this assume the folio would not have been split after it was zapped? As in,
>>> if it was zapped at 2MB granularity (no 4KB direct map split required) but then
>>> restored at 4KB (split required)? Or it gets merged somehow before this?
>
> I agree with the rest of the discussion that this will probably land
> before huge page support, so I will have to figure out the intersection
> of the two later.
>
>>
>> AFAIK it can't be zapped at 2MB granularity as the zapping code will
>> inevitably cause splitting because guest_memfd faults occur at the base
>> page granularity as of now.
>
> Here's what I'm thinking for now:
>
> [HugeTLB, no conversions]
> With initial HugeTLB support (no conversions), host userspace
> guest_memfd faults will be:
>
> + For guest_memfd with PUD-sized pages
>   + At PUD level or PTE level
> + For guest_memfd with PMD-sized pages
>   + At PMD level or PTE level
>
> Since this guest_memfd doesn't support conversions, the folio is never
> split/merged, so the direct map is restored at whatever level it was
> zapped. I think this works out well.
>
> [HugeTLB + conversions]
> For a guest_memfd with HugeTLB support and conversions, host userspace
> guest_memfd faults will always be at PTE level, so the direct map will
> be split and the faulted pages have the direct map zapped in 4K chunks
> as they are faulted.
>
> On conversion back to private, put those back into the direct map
> (putting aside whether to merge the direct map PTEs for now).

Makes sense to me.

>
>
> Unfortunately there's no unmapping callback for guest_memfd to use, so
> perhaps the principle should be to put the folios back into the direct
> map ASAP - at unmapping if guest_memfd is doing the unmapping, otherwise
> at freeing time?

I'm not sure I fully understand what you mean here. What would be the
purpose for hooking up to unmapping? Why would making sure we put
folios back into the direct map whenever they are freed or converted to
private not be sufficient?
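
For illustration only, the flag bookkeeping in the quoted
kvm_gmem_folio_restore_direct_map() can be modelled in userspace roughly as
below. This is a sketch, not kernel code: struct model_folio, the model_*()
helpers and MODEL_NO_DIRECT_MAP are hypothetical stand-ins, the bit position
is an assumption, and nothing here touches real page tables. It only shows
that, with the flag check in place, calling restore on both conversion and
free is harmless because the second call is a no-op.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit kept in the low bits of ->private (assumed position). */
#define MODEL_NO_DIRECT_MAP ((uintptr_t)1 << 0)

/* Toy stand-in for struct folio; NOT the kernel structure. */
struct model_folio {
	void *private;       /* carries the flag bit, as in the quoted patch */
	bool in_direct_map;  /* models whether a direct map entry is present */
};

/* True if the folio is marked as removed from the (modelled) direct map. */
static bool model_no_direct_map(const struct model_folio *folio)
{
	return ((uintptr_t)folio->private & MODEL_NO_DIRECT_MAP) != 0;
}

/* Remove the folio from the modelled direct map and set the flag. */
static void model_zap_direct_map(struct model_folio *folio)
{
	if (model_no_direct_map(folio))
		return; /* already zapped */
	folio->in_direct_map = false;
	folio->private = (void *)((uintptr_t)folio->private | MODEL_NO_DIRECT_MAP);
}

/* Put the folio back and clear the flag; a no-op if the flag is not set. */
static void model_restore_direct_map(struct model_folio *folio)
{
	if (!model_no_direct_map(folio))
		return; /* nothing to restore */
	/* Invariant of this model: a flagged folio is not in the direct map. */
	assert(!folio->in_direct_map);
	folio->in_direct_map = true;
	folio->private = (void *)((uintptr_t)folio->private & ~MODEL_NO_DIRECT_MAP);
}
```

In this model, a sequence of zap, restore (say, at conversion) and restore
again (at free) leaves the folio in the direct map with the flag clear,
regardless of whether an unmapping hook ever ran in between.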