From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net [23.128.96.19]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 765C528E12 for ; Mon, 6 Nov 2023 15:56:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="jfgAGp2H" Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com [IPv6:2607:f8b0:4864:20::b49]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 08F7F10A for ; Mon, 6 Nov 2023 07:56:06 -0800 (PST) Received: by mail-yb1-xb49.google.com with SMTP id 3f1490d57ef6-da04fb79246so5346041276.2 for ; Mon, 06 Nov 2023 07:56:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1699286165; x=1699890965; darn=vger.kernel.org; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:from:to:cc:subject:date:message-id :reply-to; bh=2j2Mtk4lW9HEtWavGEnwteIJF5J/7dhiq9rSq3IBx+E=; b=jfgAGp2HE5/xK9g4UWhCfe0wF8pepGfMwluQ7Me8zjfyk6kURtfjawQkzEKUL79x23 GA7yRpH7oYJ7PauF34lFwNQOzPOo074h3+rK5Nbw9zxFiNBJVL7dPbJmt303qnicGXhF nEfy/sj/763fXTU7n9f9aw4t+i+ogT4UuEHWx8nRxxHclr16shV25An5cvb61i6Now6A HZvr0t1B/eWzYe7LBjBhD2PvN13RVPGJ0vCSFvcA0giqJQMP2Ia9JWL28QQRK4AcKjCE yKF3NGLMBB0N+rtxWhEtTE6umISScKcM05FoJLvnyi5f4Ogi9MTGBHt+4K6q2C36ozTG krEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699286165; x=1699890965; h=content-transfer-encoding:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:x-gm-message-state:from:to:cc:subject :date:message-id:reply-to; bh=2j2Mtk4lW9HEtWavGEnwteIJF5J/7dhiq9rSq3IBx+E=; b=ZridInnYLavezebxKBP0E5KIrZ+pG9Z8sFrHOVCqV98M9vZwS+s7jwP2cQ1/jEPE0Z LGMm7pAdCcZlr2y26pHLFXAdQGH4S6tsDZ0HzkG6tEbnAHY8WfZtKIHa/KvNEQuImsH8 7lR1JlaT6cnUzKOJNhxMBiS9+7VLfLxnQpieGrxAMDSl+n5Lx9Ms77+B+fw0MEtm1txt nwky9MxBLay3U7TE6aPd6L6V2EDVBVgv1GU9taewT4Ia8sYxYdbh3udzKOyZZH27ntQN GxX4edkV7TB6hLPiBSfm8ub9vEfoMl9embJU7QlXPWpVf2iRkbju2DnzDHGc3rt30u1z qc2g== X-Gm-Message-State: AOJu0Yz4Jbtw5dWUEXxoqJINTfrVLXBegRsIV7m1PoQ3ftKDLBpeawf+ 7IvVYhhxwIqbiYNUcOJWg7mrlaHTtOg= X-Google-Smtp-Source: AGHT+IERk2b/M7d3G4J3vDGzrJ+T8jAsirv4bBIzvQT8OzuTMUe5AwPvC/SZu8YyGY6aMLoBT08jDOLmS/Y= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a05:6902:1083:b0:d9a:c3b8:4274 with SMTP id v3-20020a056902108300b00d9ac3b84274mr683795ybu.7.1699286165114; Mon, 06 Nov 2023 07:56:05 -0800 (PST) Date: Mon, 6 Nov 2023 07:56:03 -0800 In-Reply-To: Precedence: bulk X-Mailing-List: linux-fsdevel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20231027182217.3615211-1-seanjc@google.com> <20231027182217.3615211-21-seanjc@google.com> Message-ID: Subject: Re: [PATCH v13 20/35] KVM: x86/mmu: Handle page fault for private memory From: Sean Christopherson To: Xu Yilun Cc: Paolo Bonzini , Marc Zyngier , Oliver Upton , Huacai Chen , Michael Ellerman , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Alexander Viro , Christian Brauner , "Matthew Wilcox (Oracle)" , Andrew Morton , kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Xiaoyao Li , Xu Yilun , Chao Peng , Fuad Tabba , Jarkko Sakkinen , Anish Moorthy , David Matlack , Yu Zhang , Isaku Yamahata , "=?utf-8?Q?Micka=C3=ABl_Sala=C3=BCn?=" , Vlastimil Babka , Vishal Annapurve , Ackerley Tng , Maciej Szmigiero , David Hildenbrand , Quentin Perret , Michael Roth , Wang , Liam Merwick , Isaku Yamahata , "Kirill A . Shutemov" Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable On Mon, Nov 06, 2023, Xu Yilun wrote: > On Sun, Nov 05, 2023 at 05:19:36PM +0100, Paolo Bonzini wrote: > > On Sun, Nov 5, 2023 at 2:04=E2=80=AFPM Xu Yilun wrote: > > > > > > > +static void kvm_mmu_prepare_memory_fault_exit(struct kvm_vcpu *vcp= u, > > > > + struct kvm_page_fault *= fault) > > > > +{ > > > > + kvm_prepare_memory_fault_exit(vcpu, fault->gfn << PAGE_SHIFT, > > > > + PAGE_SIZE, fault->write, fault-= >exec, > > > > + fault->is_private); > > > > +} > > > > + > > > > +static int kvm_faultin_pfn_private(struct kvm_vcpu *vcpu, > > > > + struct kvm_page_fault *fault) > > > > +{ > > > > + int max_order, r; > > > > + > > > > + if (!kvm_slot_can_be_private(fault->slot)) { > > > > + kvm_mmu_prepare_memory_fault_exit(vcpu, fault); > > > > + return -EFAULT; > > > > + } > > > > + > > > > + r =3D kvm_gmem_get_pfn(vcpu->kvm, fault->slot, fault->gfn, &f= ault->pfn, > > > > + &max_order); > > > > + if (r) { > > > > + kvm_mmu_prepare_memory_fault_exit(vcpu, fault); > > > > + return r; > > > > > > Why report KVM_EXIT_MEMORY_FAULT here? even with a ret !=3D -EFAULT? > >=20 > > The cases are EFAULT, EHWPOISON (which can report > > KVM_EXIT_MEMORY_FAULT) and ENOMEM. I think it's fine > > that even -ENOMEM can return KVM_EXIT_MEMORY_FAULT, > > and it doesn't violate the documentation. The docs tell you "what > > can you do if error if EFAULT or EHWPOISON?"; they don't > > exclude that other errnos result in KVM_EXIT_MEMORY_FAULT, > > it's just that you're not supposed to look at it >=20 > Thanks, it's OK for ENOMEM + KVM_EXIT_MEMORY_FAULT. >=20 > Another concern is, now 3 places to report EFAULT + KVM_EXIT_MEMORY_FAULT= : >=20 > if (!kvm_slot_can_be_private(fault->slot)) { > kvm_mmu_prepare_memory_fault_exit(vcpu, fault); > return -EFAULT; > } >=20 > file =3D kvm_gmem_get_file(slot); > if (!file) > return -EFAULT; >=20 > if (fault->is_private !=3D kvm_mem_is_private(vcpu->kvm, fault->gfn)) { > kvm_mmu_prepare_memory_fault_exit(vcpu, fault); > return -EFAULT; > } >=20 > They are different cases, and seems userspace should handle them > differently, but not enough information to distinguish them. For the first, the memory_fault exit will inform userspace that the guest w= ants to map memory as private, and userspace will see that the memslot isn't con= figured to support private mappings. Userspace may not even need to query memslots= , e.g. if the gfn in question has been enumerated to the guest as something that c= an only be mapped shared. For the second (no valid guest_memfd file), userspace put the last referenc= e to the guest_memfd file without informing the guest or creating a memslot. Th= at's firmly a userspace bug. For the third and last, userspace will see that the guest is requesting a p= rivate mapping but the gfn is configured for shared mappings. In all cases, userspace has the necessary information to resolve the issue,= where "resolving the issue" may mean terminating the guest. If userspace isn't t= racking memslots or the private attribute, then userspace has far bigger problems.