linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Lijun Pan <ljp@linux.vnet.ibm.com>
To: Greg Kurz <groug@kaod.org>
Cc: "Laurent Vivier" <lvivier@redhat.com>,
	linux-kernel@vger.kernel.org, kvm-ppc@vger.kernel.org,
	"Satheesh Rajendran" <sathnaga@linux.vnet.ibm.com>,
	"Cédric Le Goater" <clg@kaod.org>,
	"Lijun Pan" <ljp@linux.ibm.com>,
	stable@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
	"David Gibson" <david@gibson.dropbear.id.au>
Subject: Re: [PATCH v2 1/2] KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one
Date: Wed, 20 Nov 2019 10:19:40 -0600	[thread overview]
Message-ID: <CD112E07-53C1-4FD3-AD4D-69AA7A2CB5C5@linux.vnet.ibm.com> (raw)
In-Reply-To: <157366357346.1026356.14522564753643067538.stgit@bahia.lan>

[-- Attachment #1: Type: text/plain, Size: 4150 bytes --]



> On Nov 13, 2019, at 10:46 AM, Greg Kurz <groug@kaod.org> wrote:
> 
> The EQ page is allocated by the guest and then passed to the hypervisor
> with the H_INT_SET_QUEUE_CONFIG hcall. A reference is taken on the page
> before handing it over to the HW. This reference is dropped either when
> the guest issues the H_INT_RESET hcall or when the KVM device is released.
> But, the guest can legitimately call H_INT_SET_QUEUE_CONFIG several times,
> either to reset the EQ (vCPU hot unplug) or to set a new EQ (guest reboot).
> In both cases the existing EQ page reference is leaked because we simply
> overwrite it in the XIVE queue structure without calling put_page().
> 
> This is especially visible when the guest memory is backed with huge pages:
> start a VM up to the guest userspace, either reboot it or unplug a vCPU,
> quit QEMU. The leak is observed by comparing the value of HugePages_Free in
> /proc/meminfo before and after the VM is run.
> 
> Ideally we'd want the XIVE code to handle the EQ page de-allocation at the
> platform level. This isn't the case right now because the various XIVE
> drivers have different allocation needs. It could maybe worth introducing
> hooks for this purpose instead of exposing XIVE internals to the drivers,
> but this is certainly a huge work to be done later.
> 
> In the meantime, for easier backport, fix both vCPU unplug and guest reboot
> leaks by introducing a wrapper around xive_native_configure_queue() that
> does the necessary cleanup.
> 
> Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
> Cc: stable@vger.kernel.org # v5.2
> Fixes: 13ce3297c576 ("KVM: PPC: Book3S HV: XIVE: Add controls for the EQ configuration")
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> Signed-off-by: Greg Kurz <groug@kaod.org>

Tested-by: Lijun Pan <ljp@linux.ibm <mailto:ljp@linux.ibm>.com>

> ---
> v2: use wrapper as suggested by Cedric
> ---
> arch/powerpc/kvm/book3s_xive_native.c |   31 ++++++++++++++++++++++---------
> 1 file changed, 22 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
> index 34bd123fa024..0e1fc5a16729 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -50,6 +50,24 @@ static void kvmppc_xive_native_cleanup_queue(struct kvm_vcpu *vcpu, int prio)
> 	}
> }
> 
> +static int kvmppc_xive_native_configure_queue(u32 vp_id, struct xive_q *q,
> +					      u8 prio, __be32 *qpage,
> +					      u32 order, bool can_escalate)
> +{
> +	int rc;
> +	__be32 *qpage_prev = q->qpage;
> +
> +	rc = xive_native_configure_queue(vp_id, q, prio, qpage, order,
> +					 can_escalate);
> +	if (rc)
> +		return rc;
> +
> +	if (qpage_prev)
> +		put_page(virt_to_page(qpage_prev));
> +
> +	return rc;
> +}
> +
> void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu)
> {
> 	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
> @@ -575,19 +593,14 @@ static int kvmppc_xive_native_set_queue_config(struct kvmppc_xive *xive,
> 		q->guest_qaddr  = 0;
> 		q->guest_qshift = 0;
> 
> -		rc = xive_native_configure_queue(xc->vp_id, q, priority,
> -						 NULL, 0, true);
> +		rc = kvmppc_xive_native_configure_queue(xc->vp_id, q, priority,
> +							NULL, 0, true);
> 		if (rc) {
> 			pr_err("Failed to reset queue %d for VCPU %d: %d\n",
> 			       priority, xc->server_num, rc);
> 			return rc;
> 		}
> 
> -		if (q->qpage) {
> -			put_page(virt_to_page(q->qpage));
> -			q->qpage = NULL;
> -		}
> -
> 		return 0;
> 	}
> 
> @@ -646,8 +659,8 @@ static int kvmppc_xive_native_set_queue_config(struct kvmppc_xive *xive,
> 	  * OPAL level because the use of END ESBs is not supported by
> 	  * Linux.
> 	  */
> -	rc = xive_native_configure_queue(xc->vp_id, q, priority,
> -					 (__be32 *) qaddr, kvm_eq.qshift, true);
> +	rc = kvmppc_xive_native_configure_queue(xc->vp_id, q, priority,
> +					(__be32 *) qaddr, kvm_eq.qshift, true);
> 	if (rc) {
> 		pr_err("Failed to configure queue %d for VCPU %d: %d\n",
> 		       priority, xc->server_num, rc);
> 


[-- Attachment #2: Type: text/html, Size: 11679 bytes --]

  parent reply	other threads:[~2019-11-20 20:17 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-13 16:46 [PATCH v2 1/2] KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one Greg Kurz
2019-11-13 16:46 ` [PATCH v2 2/2] KVM: PPC: Book3S HV: XIVE: Fix potential page leak on error path Greg Kurz
2019-11-13 17:00   ` Cédric Le Goater
2019-11-20 16:19 ` Lijun Pan [this message]
2019-11-20 16:38 ` [PATCH v2 1/2] KVM: PPC: Book3S HV: XIVE: Free previous EQ page when setting up a new one Lijun Pan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CD112E07-53C1-4FD3-AD4D-69AA7A2CB5C5@linux.vnet.ibm.com \
    --to=ljp@linux.vnet.ibm.com \
    --cc=clg@kaod.org \
    --cc=david@gibson.dropbear.id.au \
    --cc=groug@kaod.org \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=ljp@linux.ibm.com \
    --cc=lvivier@redhat.com \
    --cc=sathnaga@linux.vnet.ibm.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).