From: "Michael S. Tsirkin"
Subject: Re: [PATCH 2/2][RFC] KVM: Emulate MSI-X table and PBA in kernel
Date: Sun, 2 Jan 2011 11:26:19 +0200
Message-ID: <20110102092619.GA31061@redhat.com>
References: <1293007495-32325-1-git-send-email-sheng@linux.intel.com> <4D1C5124.2090409@redhat.com> <20101230103256.GB6441@redhat.com> <201012311105.28371.sheng@linux.intel.com>
In-Reply-To: <201012311105.28371.sheng@linux.intel.com>
To: Sheng Yang
Cc: Avi Kivity, Marcelo Tosatti, kvm@vger.kernel.org, Alex Williamson

On Fri, Dec 31, 2010 at 11:05:28AM +0800, Sheng Yang wrote:
> On Thursday 30 December 2010 18:32:56 Michael S. Tsirkin wrote:
> > On Thu, Dec 30, 2010 at 11:30:12AM +0200, Avi Kivity wrote:
> > > On 12/30/2010 09:47 AM, Michael S. Tsirkin wrote:
> > > >I am not really suggesting this. What I say is PBA is unimplemented
> > > >let us not commit to an interface yet.
> > >
> > > What happens to a guest that tries to use PBA?
> > > It's a mandatory part of MSI-X, no?
> >
> > Yes. Unfortunately the pending bit is in fact a communication channel
> > used for function specific purposes when mask bit is set,
> > and 0 when unset. The spec even seems to *require* this use:
> >
> > I refer to this:
> >
> > For MSI and MSI-X, while a vector is masked, the function is prohibited
> > from sending the associated message, and the function must set the
> > associated Pending bit whenever the function would otherwise send the
> > message.
> > When software unmasks a vector whose associated Pending bit is
> > set, the function must schedule sending the associated message, and
> > clear the Pending bit as soon as the message has been sent. Note that
> > clearing the MSI-X Function Mask bit may result in many messages needing
> > to be sent.
> >
> >
> > If a masked vector has its Pending bit set, and the associated
> > underlying interrupt events are somehow satisfied (usually by software
> > though the exact manner is function-specific), the function must clear
> > the Pending bit, to avoid sending a spurious interrupt message later
> > when software unmasks the vector. However, if a subsequent interrupt
> > event occurs while the vector is still masked, the function must again
> > set the Pending bit.
> >
> >
> > Software is permitted to mask one or more vectors indefinitely, and
> > service their associated interrupt events strictly based on polling
> > their Pending bits. A function must set and clear its Pending bits as
> > necessary to support this “pure polling” mode of operation.
> >
> > For assigned devices, supporting this would require
> > that the mask bits on the device are set if the mask bit in
> > guest is set (otherwise pending bits are disabled).
>
> For assigned device, I think the result we should return is IRQ_PENDING bit of
> related IRQ. Seems it perfectly fits the meaning of pending bit definition here -
> set when masked, and if we didn't clean it, one interrupt would be retriggered
> after unmask.

Well, it doesn't seem to fit this part of the definition

> > If a masked vector has its Pending bit set, and the associated
> > underlying interrupt events are somehow satisfied (usually by software
> > though the exact manner is function-specific), the function must clear
> > the Pending bit, to avoid sending a spurious interrupt message later
> > when software unmasks the vector.
> > However, if a subsequent interrupt
> > event occurs while the vector is still masked, the function must again
> > set the Pending bit.
> >
> > Software is permitted to mask one or more vectors indefinitely, and
> > service their associated interrupt events strictly based on polling
> > their Pending bits. A function must set and clear its Pending bits as
> > necessary to support this “pure polling” mode of operation.

Looking at IRQ_PENDING will make the pending bit *never* clear
while the vector is masked.

> But it's an internal flag, and use it would lead to some core
> change (more need to be considered if we want to operate the flag bit outside core
> kernel part).
> >
> > Existing code does not support PBA in assigned devices, so at least it's
> > not a regression there, and the virtio spec says nothing about this so
> > we should be fine.
>
> I agree. At least it's not a regression. And in fact we haven't seen any device
> driver use this. I've checked Linux kernel code, found no one used PCI_MSIX_PBA or
> msix_pba_offset_reg().
>
> I guess it's fine to get MSI-X mask part in first, then deal with PBA part if
> necessary - though we haven't seen any driver use it so far. It won't be worse
> with this patch anyway...
>
> --
> regards
> Yang, Sheng
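
To make the Pending bit rules quoted above concrete, here is a rough model
of a single vector in C. It is only a sketch of the spec text as quoted in
this thread: struct msix_vec_model and the vec_*() helpers are hypothetical
names made up for illustration, not KVM or PCI core code.

```c
#include <stdbool.h>

/* Hypothetical model of one MSI-X vector, following the quoted spec text. */
struct msix_vec_model {
    bool masked;    /* per-vector Mask bit */
    bool pending;   /* Pending bit as it would read from the PBA */
    unsigned sent;  /* number of messages sent so far */
};

/* The function has an interrupt event to signal. */
static void vec_raise(struct msix_vec_model *v)
{
    if (v->masked)
        v->pending = true;  /* masked: set Pending instead of sending */
    else
        v->sent++;          /* unmasked: send the message right away */
}

/* The underlying event was satisfied (e.g. by polling software). */
static void vec_event_satisfied(struct msix_vec_model *v)
{
    if (v->masked)
        v->pending = false; /* clear, so unmask sends no spurious message */
}

/* Software writes the Mask bit. */
static void vec_set_mask(struct msix_vec_model *v, bool masked)
{
    v->masked = masked;
    if (!masked && v->pending) {
        v->sent++;          /* send the deferred message */
        v->pending = false; /* and clear Pending once it has been sent */
    }
}
```

The step that matters for the argument here is vec_event_satisfied(): a
flag that is only ever cleared on unmask would have nothing corresponding
to it, which is why it would not support the "pure polling" mode.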