From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Michael S. Tsirkin"
Subject: Re: [PATCH 2/2][RFC] KVM: Emulate MSI-X table and PBA in kernel
Date: Sun, 2 Jan 2011 13:51:35 +0200
Message-ID: <20110102115135.GA712@redhat.com>
References: <1293007495-32325-1-git-send-email-sheng@linux.intel.com>
 <4D1C5124.2090409@redhat.com>
 <20101230103256.GB6441@redhat.com>
 <201012311105.28371.sheng@linux.intel.com>
 <4D2052C3.3020901@redhat.com>
 <20110102103928.GA32272@redhat.com>
 <4D205A6A.10900@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Sheng Yang, Marcelo Tosatti, kvm@vger.kernel.org, Alex Williamson
To: Avi Kivity
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:44041 "EHLO mx1.redhat.com"
 rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
 id S1751093Ab1ABLxK (ORCPT ); Sun, 2 Jan 2011 06:53:10 -0500
Content-Disposition: inline
In-Reply-To: <4D205A6A.10900@redhat.com>
Sender: kvm-owner@vger.kernel.org
List-ID:

On Sun, Jan 02, 2011 at 12:58:50PM +0200, Avi Kivity wrote:
> On 01/02/2011 12:39 PM, Michael S. Tsirkin wrote:
> >> >
> >> >I agree. At least it's not a regression. And in fact we haven't seen any device
> >> >driver use this. I've checked Linux kernel code, found no one used PCI_MSIX_PBA or
> >> >msix_pba_offset_reg().
> >> >
> >> >I guess it's fine to get MSI-X mask part in first, then deal with PBA part if
> >> >necessary - though we haven't seen any driver use it so far. It won't be worse
> >> >with this patch anyway...
> >>
> >> In a way it is worse because before, the fix would belong in user
> >> space, which is easier to test and distribute. Now we have to fix
> >> it in the kernel.
> >>
> >> However I recognize that drivers which rely on the pending bit are
> >> rare/nonexistent (likely only in preboot environments where interrupts
> >> are hard), so even if we do code it, it will likely be incorrect
> >> (certainly without a test).
> >>
> >> So I'll accept the patch without PBA.
> >> Michael, what about
> >> supporting virtio? Can we base something on this patch?
> >
> >I don't see how userspace can send interrupts with this
> >interface unfortunately. We also need irqfd support ...
>
> Sure we'll need additions to that interface.

What I suggested is
1. an ioctl to map phys address + size to table id
2. a new gsi type with a table id + entry number.

If we have that, assigned devices, virtio and vhost-net
can work mostly as is, with just the mask bits accelerated.

> What about vhost-net and vfio? I thought that they could emulate
> the mask bits:
>
> - KVM_MMIOFD(vmfd, mmio_range, fd1, fd2) associates an mmio range with an fd
> - writel(mmio_range) or readl(mmio_range) from the guest causes a
>   command to be written to fd1
> - for readl(), read from fd2 to see the result (works nicely for
>   "pci read flushes posted writes")
>
> this allows interesting stuff to be implemented in separate
> processes, threads, or kernel modules.

This could work. Some thought needs to be given to how we make
sure that an appropriate type of file is passed in.
Maybe using a netlink based connector for this is a good idea?

OTOH if we have MSI-X mask bit emulation in kvm anyway, using it makes sense ...

> --
> error compiling committee.c: too many arguments to function
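
To make items 1 and 2 above a bit more concrete, here is a rough userspace
sketch in C of the bookkeeping they imply: register a table by physical
address + size under a table id, then decode an MMIO address into
(table id, entry number). Everything named here (msix_table_map,
msix_entry_ref, msix_map_table, msix_decode, MAX_TABLES) is hypothetical and
not an existing KVM interface. The only fixed facts are from the PCI spec:
table entries are 16 bytes and the Vector Control dword, whose bit 0 is the
mask bit, sits at offset 12.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PCI spec: 16-byte MSI-X table entries, Vector Control dword at offset 12. */
#define MSIX_ENTRY_SIZE      16u
#define MSIX_VECTOR_CTRL_OFF 12u

#define MAX_TABLES 8 /* hypothetical limit, for this sketch only */

/* Hypothetical record for item 1: phys address + size -> table id. */
struct msix_table_map {
	uint64_t gpa;      /* guest physical base address of the table */
	uint32_t size;     /* table size in bytes */
	uint32_t table_id;
};

/* Hypothetical payload for item 2: a gsi type naming table id + entry. */
struct msix_entry_ref {
	uint32_t table_id;
	uint32_t entry;
};

static struct msix_table_map tables[MAX_TABLES];
static size_t nr_tables;

/* Register a table. Returns 0 on success, -1 if full or the size is bad. */
static int msix_map_table(uint64_t gpa, uint32_t size, uint32_t table_id)
{
	if (nr_tables == MAX_TABLES || size == 0 || size % MSIX_ENTRY_SIZE)
		return -1;
	tables[nr_tables++] = (struct msix_table_map){ gpa, size, table_id };
	return 0;
}

/*
 * Decode an MMIO address.
 * Returns  1 if it hits the Vector Control dword of an entry (mask access),
 *          0 if it hits another part of a registered table,
 *         -1 if it is outside every registered table.
 */
static int msix_decode(uint64_t addr, struct msix_entry_ref *ref)
{
	assert(ref != NULL);
	for (size_t i = 0; i < nr_tables; i++) {
		const struct msix_table_map *t = &tables[i];
		if (addr < t->gpa || addr - t->gpa >= t->size)
			continue;
		uint64_t off = addr - t->gpa;
		ref->table_id = t->table_id;
		ref->entry = (uint32_t)(off / MSIX_ENTRY_SIZE);
		return (off % MSIX_ENTRY_SIZE) == MSIX_VECTOR_CTRL_OFF;
	}
	return -1;
}
```

With something along these lines, only accesses for which msix_decode()
returns 1 would need the accelerated mask-bit path; the rest could keep going
to userspace as they do today.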