From: David Gibson <david@gibson.dropbear.id.au>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: linuxppc-dev@lists.ozlabs.org, Paul Mackerras <paulus@samba.org>,
	kvm@vger.kernel.org, kvm-ppc@vger.kernel.org
Subject: Re: [PATCH v3 09/17] KVM: PPC: Book3S HV: XIVE: add a control to dirty the XIVE EQ pages
Date: Mon, 18 Mar 2019 14:31:10 +1100
Message-ID: <20190318033110.GK6874@umbus.fritz.box>
In-Reply-To: <20190315120609.25910-10-clg@kaod.org>

On Fri, Mar 15, 2019 at 01:06:01PM +0100, Cédric Le Goater wrote:
> When migration of a VM is initiated, a first copy of the RAM is
> transferred to the destination before the VM is stopped, but there is
> no guarantee that the EQ pages in which the event notifications are
> queued have not been modified.
> 
> To make sure migration will capture a consistent memory state, the
> XIVE device should perform a XIVE quiesce sequence to stop the flow of
> event notifications and stabilize the EQs. This is the purpose of the
> KVM_DEV_XIVE_EQ_SYNC control, which also marks the EQ pages dirty
> to force their transfer.
> 
> Signed-off-by: Cédric Le Goater <clg@kaod.org>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
> 
>  Changes since v2 :
> 
>  - Extra comments
>  - fixed locking on source block
> 
>  arch/powerpc/include/uapi/asm/kvm.h        |  1 +
>  arch/powerpc/kvm/book3s_xive_native.c      | 85 ++++++++++++++++++++++
>  Documentation/virtual/kvm/devices/xive.txt | 29 ++++++++
>  3 files changed, 115 insertions(+)
> 
> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
> index fc9211dbfec8..caf52be89494 100644
> --- a/arch/powerpc/include/uapi/asm/kvm.h
> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> @@ -678,6 +678,7 @@ struct kvm_ppc_cpu_char {
>  /* POWER9 XIVE Native Interrupt Controller */
>  #define KVM_DEV_XIVE_GRP_CTRL		1
>  #define   KVM_DEV_XIVE_RESET		1
> +#define   KVM_DEV_XIVE_EQ_SYNC		2
>  #define KVM_DEV_XIVE_GRP_SOURCE		2	/* 64-bit source identifier */
>  #define KVM_DEV_XIVE_GRP_SOURCE_CONFIG	3	/* 64-bit source identifier */
>  #define KVM_DEV_XIVE_GRP_EQ_CONFIG	4	/* 64-bit EQ identifier */
> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
> index 26ac3c505cd2..ea091c0a8fb6 100644
> --- a/arch/powerpc/kvm/book3s_xive_native.c
> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> @@ -669,6 +669,88 @@ static int kvmppc_xive_reset(struct kvmppc_xive *xive)
>  	return 0;
>  }
>  
> +static void kvmppc_xive_native_sync_sources(struct kvmppc_xive_src_block *sb)
> +{
> +	int j;
> +
> +	for (j = 0; j < KVMPPC_XICS_IRQ_PER_ICS; j++) {
> +		struct kvmppc_xive_irq_state *state = &sb->irq_state[j];
> +		struct xive_irq_data *xd;
> +		u32 hw_num;
> +
> +		if (!state->valid)
> +			continue;
> +
> +		/*
> +		 * The struct kvmppc_xive_irq_state reflects the state
> +		 * of the EAS configuration and not the state of the
> +		 * source. The source is masked by setting the PQ bits
> +		 * to '-Q', which is done before calling the
> +		 * KVM_DEV_XIVE_EQ_SYNC control.
> +		 *
> +		 * If a source EAS is configured, OPAL syncs the XIVE
> +		 * IC of the source and the XIVE IC of the previous
> +		 * target if any.
> +		 *
> +		 * So it should be fine ignoring MASKED sources as
> +		 * they have been synced already.
> +		 */
> +		if (state->act_priority == MASKED)
> +			continue;
> +
> +		kvmppc_xive_select_irq(state, &hw_num, &xd);
> +		xive_native_sync_source(hw_num);
> +		xive_native_sync_queue(hw_num);
> +	}
> +}
> +
> +static int kvmppc_xive_native_vcpu_eq_sync(struct kvm_vcpu *vcpu)
> +{
> +	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
> +	unsigned int prio;
> +
> +	if (!xc)
> +		return -ENOENT;
> +
> +	for (prio = 0; prio < KVMPPC_XIVE_Q_COUNT; prio++) {
> +		struct xive_q *q = &xc->queues[prio];
> +
> +		if (!q->qpage)
> +			continue;
> +
> +		/* Mark EQ page dirty for migration */
> +		mark_page_dirty(vcpu->kvm, gpa_to_gfn(q->guest_qpage));
> +	}
> +	return 0;
> +}
> +
> +static int kvmppc_xive_native_eq_sync(struct kvmppc_xive *xive)
> +{
> +	struct kvm *kvm = xive->kvm;
> +	struct kvm_vcpu *vcpu;
> +	unsigned int i;
> +
> +	pr_devel("%s\n", __func__);
> +
> +	mutex_lock(&kvm->lock);
> +	for (i = 0; i <= xive->max_sbid; i++) {
> +		struct kvmppc_xive_src_block *sb = xive->src_blocks[i];
> +
> +		if (sb) {
> +			arch_spin_lock(&sb->lock);
> +			kvmppc_xive_native_sync_sources(sb);
> +			arch_spin_unlock(&sb->lock);
> +		}
> +	}
> +
> +	kvm_for_each_vcpu(i, vcpu, kvm) {
> +		kvmppc_xive_native_vcpu_eq_sync(vcpu);
> +	}
> +	mutex_unlock(&kvm->lock);
> +
> +	return 0;
> +}
> +
>  static int kvmppc_xive_native_set_attr(struct kvm_device *dev,
>  				       struct kvm_device_attr *attr)
>  {
> @@ -679,6 +761,8 @@ static int kvmppc_xive_native_set_attr(struct kvm_device *dev,
>  		switch (attr->attr) {
>  		case KVM_DEV_XIVE_RESET:
>  			return kvmppc_xive_reset(xive);
> +		case KVM_DEV_XIVE_EQ_SYNC:
> +			return kvmppc_xive_native_eq_sync(xive);
>  		}
>  		break;
>  	case KVM_DEV_XIVE_GRP_SOURCE:
> @@ -717,6 +801,7 @@ static int kvmppc_xive_native_has_attr(struct kvm_device *dev,
>  	case KVM_DEV_XIVE_GRP_CTRL:
>  		switch (attr->attr) {
>  		case KVM_DEV_XIVE_RESET:
> +		case KVM_DEV_XIVE_EQ_SYNC:
>  			return 0;
>  		}
>  		break;
> diff --git a/Documentation/virtual/kvm/devices/xive.txt b/Documentation/virtual/kvm/devices/xive.txt
> index 055aed0c2abb..e6a984592189 100644
> --- a/Documentation/virtual/kvm/devices/xive.txt
> +++ b/Documentation/virtual/kvm/devices/xive.txt
> @@ -23,6 +23,12 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
>      queues. To be used by kexec and kdump.
>      Errors: none
>  
> +    1.2 KVM_DEV_XIVE_EQ_SYNC (write only)
> +    Sync all the sources and queues and mark the EQ pages dirty. This
> +    makes sure that a consistent memory state is captured when
> +    migrating the VM.
> +    Errors: none
> +
>    2. KVM_DEV_XIVE_GRP_SOURCE (write only)
>    Initializes a new source in the XIVE device and mask it.
>    Attributes:
> @@ -97,3 +103,26 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
>    Errors:
>      -ENOENT: Unknown source number
>      -EINVAL: Not initialized source number
> +
> +* Migration:
> +
> +  Saving the state of a VM using the XIVE native exploitation mode
> +  should follow a specific sequence. When the VM is stopped:
> +
> +  1. Mask all sources (PQ=01) to stop the flow of events.
> +
> +  2. Sync the XIVE device with the KVM control KVM_DEV_XIVE_EQ_SYNC to
> +  flush any in-flight event notification and to stabilize the EQs. At
> +  this stage, the EQ pages are marked dirty to make sure they are
> +  transferred in the migration sequence.
> +
> +  3. Capture the state of the source targeting, the EQ configuration
> +  and the state of the thread interrupt context registers.
> +
> +  Restore follows a similar sequence:
> +
> +  1. Restore the EQ configuration, as targeting depends on it.
> +  2. Restore targeting
> +  3. Restore the thread interrupt contexts
> +  4. Restore the source states
> +  5. Let the vCPU run
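
For reference, step 2 of the save sequence documented above could be
driven from userspace roughly as follows. This is only a minimal sketch,
not taken from the series: the helper names xive_eq_sync_attr() and
xive_eq_sync() are hypothetical, the file descriptor is assumed to be
the one returned by KVM_CREATE_DEVICE for the XIVE native device, and
the two constants are copied from the uapi hunk quoted above as a
fallback for headers that do not carry them yet. It also assumes all
sources were masked beforehand (step 1).

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Values taken from the uapi hunk quoted above. They are defined here
 * only when the installed kernel headers do not provide them.
 */
#ifndef KVM_DEV_XIVE_GRP_CTRL
#define KVM_DEV_XIVE_GRP_CTRL	1
#endif
#ifndef KVM_DEV_XIVE_EQ_SYNC
#define KVM_DEV_XIVE_EQ_SYNC	2
#endif

/* Hypothetical helper: describe the EQ_SYNC control as a device attribute. */
static struct kvm_device_attr xive_eq_sync_attr(void)
{
	struct kvm_device_attr attr = {
		.flags = 0,
		.group = KVM_DEV_XIVE_GRP_CTRL,
		.attr  = KVM_DEV_XIVE_EQ_SYNC,
		.addr  = 0,	/* the quoted set_attr handler ignores addr */
	};

	return attr;
}

/*
 * Hypothetical helper: ask KVM to sync the sources and queues and to
 * mark the EQ pages dirty. xive_dev_fd is assumed to be the XIVE
 * device fd. Returns 0 on success or a negative errno value.
 */
static int xive_eq_sync(int xive_dev_fd)
{
	struct kvm_device_attr attr = xive_eq_sync_attr();

	if (ioctl(xive_dev_fd, KVM_SET_DEVICE_ATTR, &attr) < 0)
		return -errno;

	return 0;
}
```

A caller would invoke xive_eq_sync() once the VM is stopped and the
sources are masked, then go on to capture the targeting, EQ and thread
context state.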

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
