From: Sheng Yang
Subject: Re: [PATCH 2/2][RFC] KVM: Emulate MSI-X table and PBA in kernel
Date: Fri, 31 Dec 2010 11:05:28 +0800
Message-ID: <201012311105.28371.sheng@linux.intel.com>
References: <1293007495-32325-1-git-send-email-sheng@linux.intel.com> <4D1C5124.2090409@redhat.com> <20101230103256.GB6441@redhat.com>
In-Reply-To: <20101230103256.GB6441@redhat.com>
To: Avi Kivity
Cc: "Michael S. Tsirkin", Marcelo Tosatti, kvm@vger.kernel.org, Alex Williamson

On Thursday 30 December 2010 18:32:56 Michael S. Tsirkin wrote:
> On Thu, Dec 30, 2010 at 11:30:12AM +0200, Avi Kivity wrote:
> > On 12/30/2010 09:47 AM, Michael S. Tsirkin wrote:
> > >I am not really suggesting this. What I say is PBA is unimplemented
> > >let us not commit to an interface yet.
> >
> > What happens to a guest that tries to use PBA?
> > It's a mandatory part of MSI-X, no?
>
> Yes. Unfortunately the pending bit is in fact a communication channel
> used for function specific purposes when mask bit is set,
> and 0 when unset. The spec even seems to *require* this use:
>
> I refer to this:
>
> For MSI and MSI-X, while a vector is masked, the function is prohibited
> from sending the associated message, and the function must set the
> associated Pending bit whenever the function would otherwise send the
> message. When software unmasks a vector whose associated Pending bit is
> set, the function must schedule sending the associated message, and
> clear the Pending bit as soon as the message has been sent. Note that
> clearing the MSI-X Function Mask bit may result in many messages needing
> to be sent.
>
>
> If a masked vector has its Pending bit set, and the associated
> underlying interrupt events are somehow satisfied (usually by software
> though the exact manner is function-specific), the function must clear
> the Pending bit, to avoid sending a spurious interrupt message later
> when software unmasks the vector. However, if a subsequent interrupt
> event occurs while the vector is still masked, the function must again
> set the Pending bit.
>
>
> Software is permitted to mask one or more vectors indefinitely, and
> service their associated interrupt events strictly based on polling
> their Pending bits. A function must set and clear its Pending bits as
> necessary to support this “pure polling” mode of operation.
>
> For assigned devices, supporting this would require
> that the mask bits on the device are set if the mask bit in
> guest is set (otherwise pending bits are disabled).

For an assigned device, I think the result we should return is the
IRQ_PENDING bit of the related IRQ. It seems to fit the pending bit
definition here perfectly: it is set while masked, and if we don't clear
it, one interrupt is retriggered after unmask. But it's an internal
flag, and using it would require some core changes (more needs to be
considered if we want to operate on the flag bit outside the core
kernel).

>
> Existing code does not support PBA in assigned devices, so at least it's
> not a regression there, and the virtio spec says nothing about this so
> we should be fine.

I agree. At least it's not a regression. And in fact we haven't seen any
device driver use this. I've checked the Linux kernel code and found
nothing that uses PCI_MSIX_PBA or msix_pba_offset_reg().

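For reference, the Pending bit rules quoted above can be modelled
roughly as below. This is only a sketch of the spec text, not KVM or PCI
core code; the struct and the vec_*() helpers are hypothetical names
made up for illustration, and a real device would schedule the message
on unmask rather than count it synchronously.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a single MSI-X vector (illustration only). */
struct msix_vec_model {
	bool masked;        /* per-vector Mask bit */
	bool pending;       /* Pending bit as seen in the PBA */
	unsigned int sent;  /* number of messages delivered so far */
};

/* The function has an interrupt event to signal. */
static void vec_raise(struct msix_vec_model *v)
{
	assert(v != NULL);
	if (v->masked)
		v->pending = true;  /* masked: must not send, latch instead */
	else
		v->sent++;          /* unmasked: send the message right away */
}

/*
 * The underlying event was satisfied while the vector is masked
 * (e.g. software polled and serviced it): drop the Pending bit so no
 * spurious message is sent on a later unmask.
 */
static void vec_service(struct msix_vec_model *v)
{
	assert(v != NULL);
	v->pending = false;
}

/* Software writes the Mask bit; unmasking flushes a latched event. */
static void vec_set_mask(struct msix_vec_model *v, bool masked)
{
	assert(v != NULL);
	v->masked = masked;
	if (!masked && v->pending) {
		v->sent++;          /* deliver the deferred message */
		v->pending = false; /* cleared once the message is sent */
	}
}
```

In this model, an event raised while masked only sets the pending flag,
and the message goes out on the later unmask unless vec_service() ran
first. That is the same set-when-masked, retrigger-on-unmask behaviour
as described for IRQ_PENDING.
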
I guess it's fine to get the MSI-X mask part in first, then deal with
the PBA part if necessary - though we haven't seen any driver use it so
far. It won't be worse with this patch anyway...

--
regards
Yang, Sheng