Date: Thu, 1 Aug 2024 07:43:13 -0700
In-Reply-To: <87wml0egzo.fsf@draig.linaro.org>
References: <20240726235234.228822-1-seanjc@google.com> <20240726235234.228822-6-seanjc@google.com> <87wml0egzo.fsf@draig.linaro.org>
Subject: Re: [PATCH v12 05/84] KVM: Add kvm_release_page_unused() API to put pages that KVM never consumes
From: Sean Christopherson
To: Alex Bennée
Cc: Paolo Bonzini, Marc Zyngier, Oliver Upton, Tianrui Zhao, Bibo Mao, Huacai Chen, Michael Ellerman, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou, Christian Borntraeger, Janosch Frank, Claudio Imbrenda, kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, loongarch@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, David Matlack, David Stevens

On Thu, Aug 01, 2024, Alex Bennée wrote:
> Sean Christopherson writes:
> 
> > Add an API to release an unused page, i.e. to put a page without marking
> > it accessed or dirty.  The API will be used when KVM faults-in a page but
> > bails before installing the guest mapping (and other similar flows).
> >
> > Signed-off-by: Sean Christopherson
> > ---
> >  include/linux/kvm_host.h | 9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> > index 3d9617d1de41..c5d39a337aa3 100644
> > --- a/include/linux/kvm_host.h
> > +++ b/include/linux/kvm_host.h
> > @@ -1201,6 +1201,15 @@ unsigned long gfn_to_hva_prot(struct kvm *kvm, gfn_t gfn, bool *writable);
> >  unsigned long gfn_to_hva_memslot(struct kvm_memory_slot *slot, gfn_t gfn);
> >  unsigned long gfn_to_hva_memslot_prot(struct kvm_memory_slot *slot, gfn_t gfn,
> >  				      bool *writable);
> > +
> > +static inline void kvm_release_page_unused(struct page *page)
> > +{
> > +	if (!page)
> > +		return;
> > +
> > +	put_page(page);
> > +}
> 
> I guess it's unfamiliarity with the mm layout but I was trying to find
> where the get_pages come from to see the full pattern of allocate and
> return. I guess somewhere in the depths of hva_to_pfn() from
> hva_to_pfn_retry()?

If successful, get_user_page_fast_only() and get_user_pages_unlocked() grab a
reference on behalf of the caller.

As of this patch, hva_to_pfn_remapped() also grabs a reference to pages that
appear to be refcounted, which is the underlying wart this series aims to fix.

In KVM's early days, it _only_ supported GUP, i.e. if KVM got a pfn, that pfn
was (a) backed by struct page and (b) KVM had a reference to said page.  That
led to the current mess, as KVM didn't get reworked to properly track pages vs.
pfns when support for VM_MIXEDMAP was added.

	/*
	 * Get a reference here because callers of *hva_to_pfn* and
	 * *gfn_to_pfn* ultimately call kvm_release_pfn_clean on the
	 * returned pfn.  This is only needed if the VMA has VM_MIXEDMAP
	 * set, but the kvm_try_get_pfn/kvm_release_pfn_clean pair will
	 * simply do nothing for reserved pfns.
	 *
	 * Whoever called remap_pfn_range is also going to call e.g.
	 * unmap_mapping_range before the underlying pages are freed,
	 * causing a call to our MMU notifier.
	 *
	 * Certain IO or PFNMAP mappings can be backed with valid
	 * struct pages, but be allocated without refcounting e.g.,
	 * tail pages of non-compound higher order allocations, which
	 * would then underflow the refcount when the caller does the
	 * required put_page.  Don't allow those pages here.
	 */
	if (!kvm_try_get_pfn(pfn))
		r = -EFAULT;
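For readers without the mm background, the underflow hazard that comment guards
against can be modeled in plain userspace C.  This is only a toy sketch: the
struct and helpers below borrow the kernel's names but are invented for
illustration, and a refcount of 0 here stands in for a page that was allocated
without refcounting (e.g. a tail page of a non-compound higher-order
allocation).

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy userspace model of the hazard: some IO/PFNMAP mappings are backed
 * by struct pages allocated without refcounting, so an unconditional
 * get/put pair would underflow the count.  Not the real kernel types.
 */
struct page {
	int refcount;	/* 0 models a non-refcounted page */
};

/* Take a reference only if the page is actually refcounted. */
static bool kvm_try_get_pfn(struct page *page)
{
	if (page->refcount == 0)
		return false;	/* the later put_page() would underflow */
	page->refcount++;
	return true;
}

/* Drop a reference; underflow would be a bug, hence the assert. */
static void put_page(struct page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}
```

A refcounted page round-trips through the try-get/put pair with its count
intact, while a non-refcounted one is rejected up front, which is what makes
bailing with -EFAULT safe instead of corrupting the refcount.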