From mboxrd@z Thu Jan  1 00:00:00 1970
From: Scott Wood <scottwood@freescale.com>
Subject: Re: [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE
Date: Thu, 25 Apr 2013 14:03:52 -0500
Message-ID: <1366916632.30341.8@snotra>
References: <1366397465.8828.3@snotra>
	<78049BE8-79DD-4339-B330-C6FB5084ADE3@suse.de>
	<1A9486DB-F145-4CD2-A6C8-383107C35B5A@suse.de>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; delsp=Yes; format=Flowed
Content-Transfer-Encoding: 8BIT
Cc: <kvm-ppc@vger.kernel.org>,
	"kvm@vger.kernel.org mailing list" <kvm@vger.kernel.org>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Gleb Natapov <gleb@redhat.com>
To: Alexander Graf <agraf@suse.de>
Return-path: <kvm-ppc-owner@vger.kernel.org>
In-Reply-To: <1A9486DB-F145-4CD2-A6C8-383107C35B5A@suse.de> (from
	agraf@suse.de on Thu Apr 25 09:49:23 2013)
Content-Disposition: inline
Sender: kvm-ppc-owner@vger.kernel.org
List-Id: kvm.vger.kernel.org

On 04/25/2013 09:49:23 AM, Alexander Graf wrote:
> 
> On 25.04.2013, at 13:30, Alexander Graf wrote:
> 
> >
> > On 19.04.2013, at 20:51, Scott Wood wrote:
> >
> >> On 04/19/2013 09:06:27 AM, Alexander Graf wrote:
> >>> Now that all pieces are in place for reusing generic irq infrastructure,
> >>> we can copy x86's implementation of KVM_IRQ_LINE irq injection and simply
> >>> reuse it for PPC, as it will work there just as well.
> >>> Signed-off-by: Alexander Graf <agraf@suse.de>
> >>> ---
> >>> arch/powerpc/include/uapi/asm/kvm.h |    1 +
> >>> arch/powerpc/kvm/powerpc.c          |   13 +++++++++++++
> >>> 2 files changed, 14 insertions(+), 0 deletions(-)
> >>> diff --git a/arch/powerpc/include/uapi/asm/kvm.h  
> b/arch/powerpc/include/uapi/asm/kvm.h
> >>> index 3537bf3..dbb2ac2 100644
> >>> --- a/arch/powerpc/include/uapi/asm/kvm.h
> >>> +++ b/arch/powerpc/include/uapi/asm/kvm.h
> >>> @@ -26,6 +26,7 @@
> >>> #define __KVM_HAVE_SPAPR_TCE
> >>> #define __KVM_HAVE_PPC_SMT
> >>> #define __KVM_HAVE_IRQCHIP
> >>> +#define __KVM_HAVE_IRQ_LINE
> >>> struct kvm_regs {
> >>> 	__u64 pc;
> >>> diff --git a/arch/powerpc/kvm/powerpc.c  
> b/arch/powerpc/kvm/powerpc.c
> >>> index c431fea..874c106 100644
> >>> --- a/arch/powerpc/kvm/powerpc.c
> >>> +++ b/arch/powerpc/kvm/powerpc.c
> >>> @@ -33,6 +33,7 @@
> >>> #include <asm/cputhreads.h>
> >>> #include <asm/irqflags.h>
> >>> #include "timing.h"
> >>> +#include "irq.h"
> >>> #include "../mm/mmu_decl.h"
> >>> #define CREATE_TRACE_POINTS
> >>> @@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct  
> kvm_ppc_pvinfo *pvinfo)
> >>> 	return 0;
> >>> }
> >>> +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level *irq_event,
> >>> +			  bool line_status)
> >>> +{
> >>> +	if (!irqchip_in_kernel(kvm))
> >>> +		return -ENXIO;
> >>> +
> >>> +	irq_event->status = kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID,
> >>> +					irq_event->irq, irq_event->level,
> >>> +					line_status);
> >>> +	return 0;
> >>> +}
> >>
> >> As Paul noted in the XICS patchset, this could reference an MPIC  
> that has gone away if the user never attached any vcpus and then  
> closed the MPIC fd.  It's not a reasonable use case, but it could be  
> used maliciously to get the kernel to access a bad pointer.  The  
> irqchip_in_kernel check helps somewhat, but it's meant for ensuring  
> that the creation has happened -- it's racy if used for ensuring that  
> destruction hasn't happened.
> >>
> >> The problem is rooted in the awkwardness of performing an  
> operation that logically should be on the MPIC fd, but is instead  
> being done on the vm fd.
> >>
> >> I think these three steps would fix it (the first two seem like  
> things we should be doing anyway):
> >> - During MPIC destruction, make sure MPIC deregisters all routes  
> that reference it.
> >> - In kvm_set_irq(), do not release the RCU read lock until after  
> the set() function has been called.
> >> - Do not hook up kvm_send_userspace_msi() to MPIC or other new  
> irqchips, as that bypasses the RCU lock.  It could be supported as a  
> device fd ioctl if desired, or it could be reworked to operate on an  
> RCU-managed list of MSI handlers, though MPIC really doesn't need  
> this at all.
> >
> > Can't we just add an RCU lock in the send_userspace_msi case? I  
> don't think we should handle MSIs any differently from normal IRQs.

Well, you can't *just* add the RCU lock -- you need to add data to be  
managed via RCU (e.g. a list of MSI callbacks, or at least a boolean  
indicating whether calling the MSI code is OK).
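
To illustrate the kind of data I mean, here is a minimal userspace sketch.
It is not KVM code: struct msi_route, msi_register(), msi_send(),
msi_unregister() and demo_handler() are all hypothetical names, and a plain
atomic reader count stands in for the RCU read-side critical section and
grace period.  The point is only that the sender checks a published handler
inside the read-side section, and teardown unpublishes it and then waits.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical MSI callback type; not an existing KVM interface. */
typedef int (*msi_fn)(void *chip, unsigned int msi_data);

struct msi_route {
	_Atomic(msi_fn) handler;	/* NULL once the irqchip is gone */
	void *chip;			/* irqchip private state */
	atomic_int readers;		/* crude stand-in for RCU readers */
};

static void msi_register(struct msi_route *r, msi_fn fn, void *chip)
{
	r->chip = chip;
	atomic_init(&r->readers, 0);
	atomic_init(&r->handler, fn);
}

/* Sender: roughly rcu_read_lock() + rcu_dereference() + call + unlock. */
static int msi_send(struct msi_route *r, unsigned int msi_data)
{
	int ret = -ENXIO;
	msi_fn fn;

	atomic_fetch_add(&r->readers, 1);
	fn = atomic_load(&r->handler);
	if (fn)
		ret = fn(r->chip, msi_data);
	atomic_fetch_sub(&r->readers, 1);
	return ret;
}

/* Teardown: roughly rcu_assign_pointer(NULL) + synchronize_rcu(). */
static void msi_unregister(struct msi_route *r)
{
	atomic_store(&r->handler, (msi_fn)0);
	while (atomic_load(&r->readers) != 0)
		;	/* wait for in-flight senders to finish */
	r->chip = NULL;	/* only now may the chip state be freed */
}

/* Hypothetical handler for demonstration: counts delivered MSIs. */
static int demo_handler(void *chip, unsigned int msi_data)
{
	(void)msi_data;
	++*(int *)chip;
	return 0;
}
```

After msi_unregister() returns, msi_send() fails with -ENXIO instead of
touching the chip.  Taking the read lock without a handler pointer (or at
least a flag) to check under it would not give you that.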

> In fact I'm having a hard time verifying that we're always accessing  
> things with proper locks held. I'm pretty sure we're missing a few  
> cases.

Any path in particular?

> So how about we delay mpic destruction to vm destruction? We simply  
> add one user too many when we spawn the mpic and put it on  
> vm_destruct. That way users _can_ destroy mpics, but they will only  
> be really freed once the vm is also gone.

That's what we originally had before the fd conversion.  If we want it  
again, we'll need to go back to maintaining a list of devices in KVM  
(though it could be a linked list now that we don't need to use it for  
lookups), or have some hardcoded MPIC hack.
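
Roughly, the list-based variant could look like the sketch below.  Again
this is a single-threaded userspace illustration, not the KVM device API:
struct vm, struct device, device_create(), device_fd_release(),
vm_destroy() and the devices_freed counter are hypothetical, and the
refcount is a plain int where real code would need an atomic one.

```c
#include <stdlib.h>

struct device {
	int refcount;		/* one ref for the fd, one for the VM */
	struct device *next;	/* per-VM list, used only for teardown */
};

struct vm {
	struct device *devices;
};

static int devices_freed;	/* demonstration only: counts real frees */

static void device_put(struct device *dev)
{
	if (--dev->refcount == 0) {
		free(dev);
		devices_freed++;
	}
}

/* Create a device holding the "one user too many" on behalf of the VM. */
static struct device *device_create(struct vm *vm)
{
	struct device *dev = calloc(1, sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcount = 2;
	dev->next = vm->devices;
	vm->devices = dev;
	return dev;
}

/* Closing the device fd drops only the fd's reference. */
static void device_fd_release(struct device *dev)
{
	device_put(dev);
}

/* VM destruction drops the VM's reference on every listed device. */
static void vm_destroy(struct vm *vm)
{
	struct device *dev = vm->devices;

	while (dev) {
		struct device *next = dev->next;

		device_put(dev);
		dev = next;
	}
	vm->devices = NULL;
}
```

This assumes every device fd is released before vm_destroy() runs, which
holds if each device fd pins the VM.  The list is walked only at teardown,
so a singly linked list is enough.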

IIRC I said back then that converting to fd would make destruction  
ordering more of a pain...

-Scott