From mboxrd@z Thu Jan 1 00:00:00 1970 From: Scott Wood Subject: Re: [PATCH 16/17] KVM: PPC: MPIC: Add support for KVM_IRQ_LINE Date: Thu, 25 Apr 2013 14:03:52 -0500 Message-ID: <1366916632.30341.8@snotra> References: <1366397465.8828.3@snotra> <78049BE8-79DD-4339-B330-C6FB5084ADE3@suse.de> <1A9486DB-F145-4CD2-A6C8-383107C35B5A@suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; delsp=Yes; format=Flowed Content-Transfer-Encoding: 8BIT Cc: , "kvm@vger.kernel.org mailing list" , Marcelo Tosatti , Gleb Natapov To: Alexander Graf Return-path: In-Reply-To: <1A9486DB-F145-4CD2-A6C8-383107C35B5A@suse.de> (from agraf@suse.de on Thu Apr 25 09:49:23 2013) Content-Disposition: inline Sender: kvm-ppc-owner@vger.kernel.org List-Id: kvm.vger.kernel.org On 04/25/2013 09:49:23 AM, Alexander Graf wrote: > > On 25.04.2013, at 13:30, Alexander Graf wrote: > > > > > On 19.04.2013, at 20:51, Scott Wood wrote: > > > >> On 04/19/2013 09:06:27 AM, Alexander Graf wrote: > >>> Now that all pieces are in place for reusing generic irq > infrastructure, > >>> we can copy x86's implementation of KVM_IRQ_LINE irq injection > and simply > >>> reuse it for PPC, as it will work there just as well. > >>> Signed-off-by: Alexander Graf > >>> --- > >>> arch/powerpc/include/uapi/asm/kvm.h | 1 + > >>> arch/powerpc/kvm/powerpc.c | 13 +++++++++++++ > >>> 2 files changed, 14 insertions(+), 0 deletions(-) > >>> diff --git a/arch/powerpc/include/uapi/asm/kvm.h > b/arch/powerpc/include/uapi/asm/kvm.h > >>> index 3537bf3..dbb2ac2 100644 > >>> --- a/arch/powerpc/include/uapi/asm/kvm.h > >>> +++ b/arch/powerpc/include/uapi/asm/kvm.h > >>> @@ -26,6 +26,7 @@ > >>> #define __KVM_HAVE_SPAPR_TCE > >>> #define __KVM_HAVE_PPC_SMT > >>> #define __KVM_HAVE_IRQCHIP > >>> +#define __KVM_HAVE_IRQ_LINE > >>> struct kvm_regs { > >>> __u64 pc; > >>> diff --git a/arch/powerpc/kvm/powerpc.c > b/arch/powerpc/kvm/powerpc.c > >>> index c431fea..874c106 100644 > >>> --- a/arch/powerpc/kvm/powerpc.c > >>> +++ b/arch/powerpc/kvm/powerpc.c > >>> @@ -33,6 +33,7 @@ > >>> #include > >>> #include > >>> #include "timing.h" > >>> +#include "irq.h" > >>> #include "../mm/mmu_decl.h" > >>> #define CREATE_TRACE_POINTS > >>> @@ -945,6 +946,18 @@ static int kvm_vm_ioctl_get_pvinfo(struct > kvm_ppc_pvinfo *pvinfo) > >>> return 0; > >>> } > >>> +int kvm_vm_ioctl_irq_line(struct kvm *kvm, struct kvm_irq_level > *irq_event, > >>> + bool line_status) > >>> +{ > >>> + if (!irqchip_in_kernel(kvm)) > >>> + return -ENXIO; > >>> + > >>> + irq_event->status = kvm_set_irq(kvm, > KVM_USERSPACE_IRQ_SOURCE_ID, > >>> + irq_event->irq, > irq_event->level, > >>> + line_status); > >>> + return 0; > >>> +} > >> > >> As Paul noted in the XICS patchset, this could reference an MPIC > that has gone away if the user never attached any vcpus and then > closed the MPIC fd. It's not a reasonable use case, but it could be > used malicously to get the kernel to access a bad pointer. The > irqchip_in_kernel check helps somewhat, but it's meant for ensuring > that the creation has happened -- it's racy if used for ensuring that > destruction hasn't happened. > >> > >> The problem is rooted in the awkwardness of performing an > operation that logically should be on the MPIC fd, but is instead > being done on the vm fd. > >> > >> I think these three steps would fix it (the first two seem like > things we should be doing anyway): > >> - During MPIC destruction, make sure MPIC deregisters all routes > that reference it. > >> - In kvm_set_irq(), do not release the RCU read lock until after > the set() function has been called. > >> - Do not hook up kvm_send_userspace_msi() to MPIC or other new > irqchips, as that bypasses the RCU lock. It could be supported as a > device fd ioctl if desired, or it could be reworked to operate on an > RCU-managed list of MSI handlers, though MPIC really doesn't need > this at all. > > > > Can't we just add an RCU lock in the send_userspace_msi case? I > don't think we should handle MSIs any differently from normal IRQs. Well, you can't *just* add the RCU lock -- you need to add data to be managed via RCU (e.g. a list of MSI callbacks, or at least a boolean indicating whether calling the MSI code is OK). > In fact I'm having a hard time verifying that we're always accessing > things with proper locks held. I'm pretty sure we're missing a few > cases. Any path in particular? > So how about we delay mpic destruction to vm destruction? We simply > add one user too many when we spawn the mpic and put it on > vm_destruct. That way users _can_ destroy mpics, but they will only > be really free'd once the vm is also gone. That's what we originally had before the fd conversion. If we want it again, we'll need to go back to maintaining a list of devices in KVM (though it could be a linked list now that we don't need to use it for lookups), or have some hardcoded MPIC hack. IIRC I said back then that converting to fd would make destruction ordering more of a pain... -Scott