linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: David Gibson <david@gibson.dropbear.id.au>
To: "Cédric Le Goater" <clg@kaod.org>
Cc: kvm@vger.kernel.org, kvm-ppc@vger.kernel.org,
	Paul Mackerras <paulus@samba.org>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v2 10/16] KVM: PPC: Book3S HV: XIVE: add get/set accessors for the VP XIVE state
Date: Thu, 14 Mar 2019 14:09:14 +1100	[thread overview]
Message-ID: <20190314030914.GN8211@umbus.fritz.box> (raw)
In-Reply-To: <d33a2cb1-5370-7e72-07d9-ed0028c85b8c@kaod.org>

[-- Attachment #1: Type: text/plain, Size: 11922 bytes --]

On Wed, Mar 13, 2019 at 02:19:13PM +0100, Cédric Le Goater wrote:
> On 2/25/19 4:31 AM, David Gibson wrote:
> > On Fri, Feb 22, 2019 at 12:28:34PM +0100, Cédric Le Goater wrote:
> >> At a VCPU level, the state of the thread interrupt management
> >> registers needs to be collected. These registers are cached under the
> >> 'xive_saved_state.w01' field of the VCPU when the VPCU context is
> >> pulled from the HW thread. An OPAL call retrieves the backup of the
> >> IPB register in the underlying XIVE NVT structure and merges it in the
> >> KVM state.
> >>
> >> The structures of the interface between QEMU and KVM provisions some
> >> extra room (two u64) for further extensions if more state needs to be
> >> transferred back to QEMU.
> >>
> >> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> >> ---
> >>  arch/powerpc/include/asm/kvm_ppc.h         | 11 +++
> >>  arch/powerpc/include/uapi/asm/kvm.h        |  2 +
> >>  arch/powerpc/kvm/book3s.c                  | 24 +++++++
> >>  arch/powerpc/kvm/book3s_xive_native.c      | 82 ++++++++++++++++++++++
> >>  Documentation/virtual/kvm/devices/xive.txt | 19 +++++
> >>  5 files changed, 138 insertions(+)
> >>
> >> diff --git a/arch/powerpc/include/asm/kvm_ppc.h b/arch/powerpc/include/asm/kvm_ppc.h
> >> index 1e61877fe147..664c65051612 100644
> >> --- a/arch/powerpc/include/asm/kvm_ppc.h
> >> +++ b/arch/powerpc/include/asm/kvm_ppc.h
> >> @@ -272,6 +272,7 @@ union kvmppc_one_reg {
> >>  		u64	addr;
> >>  		u64	length;
> >>  	}	vpaval;
> >> +	u64	xive_timaval[4];
> > 
> > This is doubling the size of the userspace visible one_reg union.  Is
> > that safe?
> 
> 'safe' as in compatibility on an older KVM which would still use the old 
> kvmppc_one_reg definition ?

I was more thinking of old qemu with a new kernel.

> It should be fine as KVM_REG_PPC_VP_STATE would not be handled. Am I
> wrong ?

Looks like it should be ok, because we only partially copy the
structure to/from userspace due to the one_reg_size() logic.  If the
whole union was always copied, it would be hilariously unsafe.

> 
> >>  };
> >>  
> >>  struct kvmppc_ops {
> >> @@ -604,6 +605,10 @@ extern int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
> >>  extern void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu);
> >>  extern void kvmppc_xive_native_init_module(void);
> >>  extern void kvmppc_xive_native_exit_module(void);
> >> +extern int kvmppc_xive_native_get_vp(struct kvm_vcpu *vcpu,
> >> +				     union kvmppc_one_reg *val);
> >> +extern int kvmppc_xive_native_set_vp(struct kvm_vcpu *vcpu,
> >> +				     union kvmppc_one_reg *val);
> >>  
> >>  #else
> >>  static inline int kvmppc_xive_set_xive(struct kvm *kvm, u32 irq, u32 server,
> >> @@ -636,6 +641,12 @@ static inline int kvmppc_xive_native_connect_vcpu(struct kvm_device *dev,
> >>  static inline void kvmppc_xive_native_cleanup_vcpu(struct kvm_vcpu *vcpu) { }
> >>  static inline void kvmppc_xive_native_init_module(void) { }
> >>  static inline void kvmppc_xive_native_exit_module(void) { }
> >> +static inline int kvmppc_xive_native_get_vp(struct kvm_vcpu *vcpu,
> >> +					    union kvmppc_one_reg *val)
> >> +{ return 0; }
> >> +static inline int kvmppc_xive_native_set_vp(struct kvm_vcpu *vcpu,
> >> +					    union kvmppc_one_reg *val)
> >> +{ return -ENOENT; }
> >>  
> >>  #endif /* CONFIG_KVM_XIVE */
> >>  
> >> diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
> >> index cd78ad1020fe..42d4ef93ec2d 100644
> >> --- a/arch/powerpc/include/uapi/asm/kvm.h
> >> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> >> @@ -480,6 +480,8 @@ struct kvm_ppc_cpu_char {
> >>  #define  KVM_REG_PPC_ICP_PPRI_SHIFT	16	/* pending irq priority */
> >>  #define  KVM_REG_PPC_ICP_PPRI_MASK	0xff
> >>  
> >> +#define KVM_REG_PPC_VP_STATE	(KVM_REG_PPC | KVM_REG_SIZE_U256 | 0x8d)
> >> +
> >>  /* Device control API: PPC-specific devices */
> >>  #define KVM_DEV_MPIC_GRP_MISC		1
> >>  #define   KVM_DEV_MPIC_BASE_ADDR	0	/* 64-bit */
> >> diff --git a/arch/powerpc/kvm/book3s.c b/arch/powerpc/kvm/book3s.c
> >> index 96d43f091255..f85a9211f30c 100644
> >> --- a/arch/powerpc/kvm/book3s.c
> >> +++ b/arch/powerpc/kvm/book3s.c
> >> @@ -641,6 +641,18 @@ int kvmppc_get_one_reg(struct kvm_vcpu *vcpu, u64 id,
> >>  				*val = get_reg_val(id, kvmppc_xics_get_icp(vcpu));
> >>  			break;
> >>  #endif /* CONFIG_KVM_XICS */
> >> +#ifdef CONFIG_KVM_XIVE
> >> +		case KVM_REG_PPC_VP_STATE:
> >> +			if (!vcpu->arch.xive_vcpu) {
> >> +				r = -ENXIO;
> >> +				break;
> >> +			}
> >> +			if (xive_enabled())
> >> +				r = kvmppc_xive_native_get_vp(vcpu, val);
> >> +			else
> >> +				r = -ENXIO;
> >> +			break;
> >> +#endif /* CONFIG_KVM_XIVE */
> >>  		case KVM_REG_PPC_FSCR:
> >>  			*val = get_reg_val(id, vcpu->arch.fscr);
> >>  			break;
> >> @@ -714,6 +726,18 @@ int kvmppc_set_one_reg(struct kvm_vcpu *vcpu, u64 id,
> >>  				r = kvmppc_xics_set_icp(vcpu, set_reg_val(id, *val));
> >>  			break;
> >>  #endif /* CONFIG_KVM_XICS */
> >> +#ifdef CONFIG_KVM_XIVE
> >> +		case KVM_REG_PPC_VP_STATE:
> >> +			if (!vcpu->arch.xive_vcpu) {
> >> +				r = -ENXIO;
> >> +				break;
> >> +			}
> >> +			if (xive_enabled())
> >> +				r = kvmppc_xive_native_set_vp(vcpu, val);
> >> +			else
> >> +				r = -ENXIO;
> >> +			break;
> >> +#endif /* CONFIG_KVM_XIVE */
> >>  		case KVM_REG_PPC_FSCR:
> >>  			vcpu->arch.fscr = set_reg_val(id, *val);
> >>  			break;
> >> diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
> >> index 3debc876d5a0..132bff52d70a 100644
> >> --- a/arch/powerpc/kvm/book3s_xive_native.c
> >> +++ b/arch/powerpc/kvm/book3s_xive_native.c
> >> @@ -845,6 +845,88 @@ static int kvmppc_xive_native_create(struct kvm_device *dev, u32 type)
> >>  	return ret;
> >>  }
> >>  
> >> +/*
> >> + * Interrupt Pending Buffer (IPB) offset
> >> + */
> >> +#define TM_IPB_SHIFT 40
> >> +#define TM_IPB_MASK  (((u64) 0xFF) << TM_IPB_SHIFT)
> >> +
> >> +int kvmppc_xive_native_get_vp(struct kvm_vcpu *vcpu, union kvmppc_one_reg *val)
> >> +{
> >> +	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
> >> +	u64 opal_state;
> >> +	int rc;
> >> +
> >> +	if (!kvmppc_xive_enabled(vcpu))
> >> +		return -EPERM;
> >> +
> >> +	if (!xc)
> >> +		return -ENOENT;
> >> +
> >> +	/* Thread context registers. We only care about IPB and CPPR */
> >> +	val->xive_timaval[0] = vcpu->arch.xive_saved_state.w01;
> >> +
> >> +	/*
> >> +	 * Return the OS CAM line to print out the VP identifier in
> >> +	 * the QEMU monitor. This is not restored.
> >> +	 */
> >> +	val->xive_timaval[1] = vcpu->arch.xive_cam_word;
> > 
> > I'm pretty dubious about this mixing of vital state information with
> > what's basically debug information. 
> 
> I think QEMU deserves to know about the OS CAM line value. I was even 
> thinking about adding the POOL CAM line value for future use (nested) 
> 
> > Doubly so since it requires changing the ABI to increase 
> > the one_reg union's size.
> 
> OK. That's one argument.
>  
> > Might be better to have this control only return the 0th and 2nd u64s
> > from the TIMA, with the CAM debug information returned via some other
> > mechanism.
> 
> Like an extra reg : KVM_REG_PPC_VP_CAM ? 

That would be the obvious choice, yes.

> >> +
> >> +	/* Get the VP state from OPAL */
> >> +	rc = xive_native_get_vp_state(xc->vp_id, &opal_state);
> >> +	if (rc)
> >> +		return rc;
> >> +
> >> +	/*
> >> +	 * Capture the backup of IPB register in the NVT structure and
> >> +	 * merge it in our KVM VP state.
> >> +	 */
> >> +	val->xive_timaval[0] |= cpu_to_be64(opal_state & TM_IPB_MASK);
> >> +
> >> +	pr_devel("%s NSR=%02x CPPR=%02x IBP=%02x PIPR=%02x w01=%016llx w2=%08x opal=%016llx\n",
> >> +		 __func__,
> >> +		 vcpu->arch.xive_saved_state.nsr,
> >> +		 vcpu->arch.xive_saved_state.cppr,
> >> +		 vcpu->arch.xive_saved_state.ipb,
> >> +		 vcpu->arch.xive_saved_state.pipr,
> >> +		 vcpu->arch.xive_saved_state.w01,
> >> +		 (u32) vcpu->arch.xive_cam_word, opal_state);
> > 
> > Hrm.. except you don't seem to be using the last half of the timaval
> > field anyway.
> 
> Yes. The two u64 are extras. We can do without. 
> 
> Would that be ok if I stored the w01 regs in the first u64, the CAM line(s) 
> in the second and remove the extra two u64 ?

I'd still prefer them in separate regs.  They kind of belong to
different categories of information, and I can't think of any
particular reason you'd have to update or fetch them as a unit.

>  
> >> +
> >> +	return 0;
> >> +}
> >> +
> >> +int kvmppc_xive_native_set_vp(struct kvm_vcpu *vcpu, union kvmppc_one_reg *val)
> >> +{
> >> +	struct kvmppc_xive_vcpu *xc = vcpu->arch.xive_vcpu;
> >> +	struct kvmppc_xive *xive = vcpu->kvm->arch.xive;
> >> +
> >> +	pr_devel("%s w01=%016llx vp=%016llx\n", __func__,
> >> +		 val->xive_timaval[0], val->xive_timaval[1]);
> >> +
> >> +	if (!kvmppc_xive_enabled(vcpu))
> >> +		return -EPERM;
> >> +
> >> +	if (!xc || !xive)
> >> +		return -ENOENT;
> >> +
> >> +	/* We can't update the state of a "pushed" VCPU	 */
> >> +	if (WARN_ON(vcpu->arch.xive_pushed))
> > 
> > What prevents userspace from tripping this WARN_ON()?
> 
> if the vCPU is executing a vCPU ioctl, it means that it exited the guest 
> and that its interrupt context has been pulled out of XIVE.

But couldn't one user thread call the vcpu ioctl() while another is
inside the guest?

> >> +		return -EIO;
> > 
> > EBUSY might be more appropriate here.
> 
> OK.
> 
> Thanks,
> 
> C. 
> 
> > 
> >> +
> >> +	/*
> >> +	 * Restore the thread context registers. IPB and CPPR should
> >> +	 * be the only ones that matter.
> >> +	 */
> >> +	vcpu->arch.xive_saved_state.w01 = val->xive_timaval[0];
> >> +
> >> +	/*
> >> +	 * There is no need to restore the XIVE internal state (IPB
> >> +	 * stored in the NVT) as the IPB register was merged in KVM VP
> >> +	 * state when captured.
> >> +	 */
> >> +	return 0;
> >> +}
> >> +
> >>  static int xive_native_debug_show(struct seq_file *m, void *private)
> >>  {
> >>  	struct kvmppc_xive *xive = m->private;
> >> diff --git a/Documentation/virtual/kvm/devices/xive.txt b/Documentation/virtual/kvm/devices/xive.txt
> >> index a26be635cff9..1b8957c50c53 100644
> >> --- a/Documentation/virtual/kvm/devices/xive.txt
> >> +++ b/Documentation/virtual/kvm/devices/xive.txt
> >> @@ -102,6 +102,25 @@ the legacy interrupt mode, referred as XICS (POWER7/8).
> >>      -EINVAL: Not initialized source number, invalid priority or
> >>               invalid CPU number.
> >>  
> >> +* VCPU state
> >> +
> >> +  The XIVE IC maintains VP interrupt state in an internal structure
> >> +  called the NVT. When a VP is not dispatched on a HW processor
> >> +  thread, this structure can be updated by HW if the VP is the target
> >> +  of an event notification.
> >> +
> >> +  It is important for migration to capture the cached IPB from the NVT
> >> +  as it synthesizes the priorities of the pending interrupts. We
> >> +  capture a bit more to report debug information.
> >> +
> >> +  KVM_REG_PPC_VP_STATE (4 * 64bits)
> >> +  bits:     |  63  ....  32  |  31  ....  0  |
> >> +  values:   |   TIMA word0   |   TIMA word1  |
> >> +  bits:     | 127       ..........       64  |
> >> +  values:   |         VP CAM Line            |
> >> +  bits:     | 255       ..........      128  |
> >> +  values:   |            unused              |
> >> +
> >>  * Migration:
> >>  
> >>    Saving the state of a VM using the XIVE native exploitation mode
> > 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

  reply	other threads:[~2019-03-14  3:17 UTC|newest]

Thread overview: 71+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-02-22 11:28 [PATCH v2 00/16] KVM: PPC: Book3S HV: add XIVE native exploitation mode Cédric Le Goater
2019-02-22 11:28 ` [PATCH v2 01/16] powerpc/xive: add OPAL extensions for the XIVE native exploitation support Cédric Le Goater
2019-02-24 23:42   ` David Gibson
2019-02-25  3:50   ` Michael Ellerman
2019-02-25 10:11     ` Cédric Le Goater
2019-02-26  4:21       ` David Gibson
2019-03-12 18:25         ` Cédric Le Goater
2019-02-22 11:28 ` [PATCH v2 02/16] KVM: PPC: Book3S HV: add a new KVM device for the XIVE native exploitation mode Cédric Le Goater
2019-02-25  0:08   ` David Gibson
2019-03-12 11:14     ` Cédric Le Goater
2019-02-22 11:28 ` [PATCH v2 03/16] KVM: PPC: Book3S HV: XIVE: introduce a new capability KVM_CAP_PPC_IRQ_XIVE Cédric Le Goater
2019-02-25  0:35   ` David Gibson
2019-02-25  4:59     ` Paul Mackerras
2019-03-12 14:10       ` Cédric Le Goater
2019-03-12 14:03     ` Cédric Le Goater
2019-03-13  4:05       ` David Gibson
2019-02-25  4:35   ` Paul Mackerras
2019-03-13  8:34     ` Cédric Le Goater
2019-03-14  2:29       ` David Gibson
2019-02-22 11:28 ` [PATCH v2 04/16] KVM: PPC: Book3S HV: XIVE: add a control to initialize a source Cédric Le Goater
2019-02-25  2:10   ` David Gibson
2019-02-26  4:25     ` Paul Mackerras
2019-02-26 23:20       ` David Gibson
2019-03-12 15:19     ` Cédric Le Goater
2019-03-14  2:15       ` David Gibson
2019-02-25  5:30   ` Paul Mackerras
2019-02-22 11:28 ` [PATCH v2 05/16] KVM: PPC: Book3S HV: XIVE: add a control to configure " Cédric Le Goater
2019-02-25  2:21   ` David Gibson
2019-02-22 11:28 ` [PATCH v2 06/16] KVM: PPC: Book3S HV: XIVE: add controls for the EQ configuration Cédric Le Goater
2019-02-25  2:39   ` David Gibson
2019-03-12 17:00     ` Cédric Le Goater
2019-03-13  4:03       ` David Gibson
2019-03-13  8:46         ` Cédric Le Goater
2019-03-14  3:29           ` David Gibson
2019-02-26  5:24   ` Paul Mackerras
2019-03-13  9:40     ` Cédric Le Goater
2019-03-14  2:32       ` David Gibson
2019-03-14  7:11         ` Cédric Le Goater
2019-03-15  0:29           ` David Gibson
2019-02-22 11:28 ` [PATCH v2 07/16] KVM: PPC: Book3S HV: XIVE: add a global reset control Cédric Le Goater
2019-02-25  2:43   ` David Gibson
2019-02-22 11:28 ` [PATCH v2 08/16] KVM: PPC: Book3S HV: XIVE: add a control to sync the sources Cédric Le Goater
2019-02-25  2:45   ` David Gibson
2019-02-22 11:28 ` [PATCH v2 09/16] KVM: PPC: Book3S HV: XIVE: add a control to dirty the XIVE EQ pages Cédric Le Goater
2019-02-25  2:53   ` David Gibson
2019-03-13 11:48     ` Cédric Le Goater
2019-03-14  2:33       ` David Gibson
2019-02-22 11:28 ` [PATCH v2 10/16] KVM: PPC: Book3S HV: XIVE: add get/set accessors for the VP XIVE state Cédric Le Goater
2019-02-25  3:31   ` David Gibson
2019-03-13 13:19     ` Cédric Le Goater
2019-03-14  3:09       ` David Gibson [this message]
2019-03-14  7:08         ` Cédric Le Goater
2019-02-22 11:28 ` [PATCH v2 11/16] KVM: introduce a 'mmap' method for KVM devices Cédric Le Goater
2019-02-25  3:33   ` David Gibson
2019-02-25 10:57     ` Cédric Le Goater
2019-02-26 12:52       ` Paolo Bonzini
2019-02-26 23:22         ` David Gibson
2019-02-22 11:28 ` [PATCH v2 12/16] KVM: PPC: Book3S HV: XIVE: add a TIMA mapping Cédric Le Goater
2019-02-25  3:42   ` David Gibson
2019-02-22 11:28 ` [PATCH v2 13/16] KVM: PPC: Book3S HV: XIVE: add a mapping for the source ESB pages Cédric Le Goater
2019-02-25  3:47   ` David Gibson
2019-02-22 11:28 ` [PATCH v2 14/16] KVM: PPC: Book3S HV: XIVE: add passthrough support Cédric Le Goater
2019-02-25  4:13   ` David Gibson
2019-02-22 11:28 ` [PATCH v2 15/16] KVM: introduce a KVM_DESTROY_DEVICE ioctl Cédric Le Goater
2019-02-25  4:15   ` David Gibson
2019-03-13  8:02     ` Cédric Le Goater
2019-03-15 17:57       ` Paolo Bonzini
2019-02-22 11:28 ` [PATCH v2 16/16] KVM: PPC: Book3S HV: XIVE: clear the vCPU interrupt presenters Cédric Le Goater
2019-02-25  4:18   ` David Gibson
2019-03-13  8:17     ` Cédric Le Goater
2019-03-14  2:26       ` David Gibson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190314030914.GN8211@umbus.fritz.box \
    --to=david@gibson.dropbear.id.au \
    --cc=clg@kaod.org \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=paulus@samba.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).