From mboxrd@z Thu Jan 1 00:00:00 1970
From: Sean Christopherson
Date: Thu, 1 Aug 2024 07:43:13 -0700
Subject: [PATCH v12 05/84] KVM: Add kvm_release_page_unused() API to put pages that KVM never consumes
In-Reply-To: <87wml0egzo.fsf@draig.linaro.org>
References: <20240726235234.228822-1-seanjc@google.com> <20240726235234.228822-6-seanjc@google.com> <87wml0egzo.fsf@draig.linaro.org>
Message-ID:
List-Id:
To: kvm-riscv@lists.infradead.org
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

On Thu, Aug 01, 2024, Alex Bennée wrote:
> Sean Christopherson <seanjc@google.com> writes:
>
> > Add an API to release an unused page, i.e. to put a page without marking
> > it accessed or dirty.  The API will be used when KVM faults-in a page but
> > bails before installing the guest mapping (and other similar flows).
> >
> > Signed-off-by: Sean Christopherson <seanjc@google.com>
> > ---
> >  include/linux/kvm_host.h | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 3d9617d1de41..c5d39a337aa3 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -1201,6 +1201,15 @@ unsigned long gfn_to_hva_prot(struct kvm *kvm, gfn_t gfn, bool *writable);
> >  unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
> >  unsigned long gfn_to_hva_memslot_prot(struct kvm_memory_slot *slot, gfn_t gfn,
> >  				      bool *writable);
> > +
> > +static inline void kvm_release_page_unused(struct page *page)
> > +{
> > +	if (!page)
> > +		return;
> > +
> > +	put_page(page);
> > +}
>
> I guess it's unfamiliarity with the mm layout but I was trying to find
> where the get_pages come from to see the full pattern of allocate and
> return. I guess somewhere in the depths of hva_to_pfn() from
> hva_to_pfn_retry()?

If successful, get_user_page_fast_only() and get_user_pages_unlocked() grab a
reference on behalf of the caller.

As of this patch, hva_to_pfn_remapped() also grabs a reference to pages that
appear to be refcounted, which is the underlying wart this series aims to fix.
In KVM's early days, it _only_ supported GUP, i.e. if KVM got a pfn, that pfn
was (a) backed by struct page and (b) KVM had a reference to said page.  That
led to the current mess, as KVM didn't get reworked to properly track pages vs.
pfns when support for VM_MIXEDMAP was added.

	/*
	 * Get a reference here because callers of *hva_to_pfn* and
	 * *gfn_to_pfn* ultimately call kvm_release_pfn_clean on the
	 * returned pfn.  This is only needed if the VMA has VM_MIXEDMAP
	 * set, but the kvm_try_get_pfn/kvm_release_pfn_clean pair will
	 * simply do nothing for reserved pfns.
	 *
	 * Whoever called remap_pfn_range is also going to call e.g.
	 * unmap_mapping_range before the underlying pages are freed,
	 * causing a call to our MMU notifier.
	 *
	 * Certain IO or PFNMAP mappings can be backed with valid
	 * struct pages, but be allocated without refcounting e.g.,
	 * tail pages of non-compound higher order allocations, which
	 * would then underflow the refcount when the caller does the
	 * required put_page. Don't allow those pages here.
	 */
	if (!kvm_try_get_pfn(pfn))
		r = -EFAULT;
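
The get/put pairing described above can be sketched in userspace C. This is
only an illustrative model, not kernel code: struct mock_page and every
mock_*() helper are hypothetical names invented for the sketch, and the
try-get helper simply assumes that a page with a zero refcount must be
refused, which is the behavior the quoted comment asks for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the refcount is modeled. */
struct mock_page {
	int refcount;
};

/* Models a successful GUP call: a reference is taken for the caller. */
static struct mock_page *mock_gup(struct mock_page *page)
{
	page->refcount++;
	return page;
}

/*
 * Models the "try get" used on the remapped path (assumption: a page whose
 * refcount is zero, e.g. a tail page of a non-compound allocation, is
 * rejected so that a later put cannot underflow the count).
 */
static bool mock_try_get_page(struct mock_page *page)
{
	if (page->refcount == 0)
		return false;

	page->refcount++;
	return true;
}

/* Same shape as kvm_release_page_unused() in the quoted patch. */
static void mock_release_page_unused(struct mock_page *page)
{
	if (!page)
		return;

	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Hypothetical fault path.  If bail_early is set, the mapping is never
 * installed, so the page was neither accessed nor dirtied and the
 * reference is dropped via the "unused" release.
 */
static int mock_fault(struct mock_page *backing, bool bail_early)
{
	struct mock_page *page = mock_gup(backing);

	if (bail_early) {
		mock_release_page_unused(page);
		return -1;
	}

	/* Mapping "installed": the caller still holds the reference. */
	return 0;
}
```

In this model, a fault that bails early leaves the refcount where it started,
while a fault that completes leaves one extra reference for the caller to put
later.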