From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([140.186.70.92]:52806) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QPV7H-000295-JN for qemu-devel@nongnu.org; Thu, 26 May 2011 03:36:24 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1QPV7G-0003qL-C4 for qemu-devel@nongnu.org; Thu, 26 May 2011 03:36:23 -0400 Received: from mail-iw0-f173.google.com ([209.85.214.173]:61758) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1QPV7G-0003qF-7V for qemu-devel@nongnu.org; Thu, 26 May 2011 03:36:22 -0400 Received: by iwl42 with SMTP id 42so457522iwl.4 for ; Thu, 26 May 2011 00:36:21 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <4DDDFF83.6010404@siemens.com> References: <4DDDFF83.6010404@siemens.com> From: Wei Liu Date: Thu, 26 May 2011 15:35:51 +0800 Message-ID: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [Qemu-devel] Xen: enabling emulated MSI injection List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Jan Kiszka Cc: Anthony Perard , qemu-devel@nongnu.org, Stefano Stabellini On Thu, May 26, 2011 at 3:21 PM, Jan Kiszka wrote: > A cleaner way of hooking into this would be registering on the MSI MMIO > target region (replacing the APIC). That would give you MSI support as we= ll. > > However, KVM has much more complex requirements for MSI handling when > using the in-kernel irqchip. That's because we need to support, e.g., > eventfd-based MSI injection in the hypervisor without looping over QEMU, > ie. we need to translate a binary event into a MSI address/data tuple. > This is currently solved a bit too pragmatic in our qemu-kvm tree - > not mergeable to upstream. So I started working out a better plan, > partly based on ideas Avi provided. > > One element in this concept will be a hook into the MSI delivery path, > both for events triggered by msi/msix_notify as well as those raised by > weird DMA (to be architecturally correct). See below for a snapshot > (depends on other patches). That will obsolete any open-coded hooking > like this here. > > If Xen support is urging, I could try to break out the relevant bits a > bit earlier. Otherwise I would prefer to postpone the topic by a few > weeks until we are done with evaluating the concept against KVM's > requirements so that we can find a truly generic solution. > > Jan > > > ------8<------- > > diff --git a/hw/apic.c b/hw/apic.c > index a45b57f..7447d91 100644 > --- a/hw/apic.c > +++ b/hw/apic.c > @@ -19,6 +19,7 @@ > =C2=A0#include "hw.h" > =C2=A0#include "apic.h" > =C2=A0#include "ioapic.h" > +#include "msi.h" > =C2=A0#include "qemu-timer.h" > =C2=A0#include "host-utils.h" > =C2=A0#include "sysbus.h" > @@ -795,7 +796,7 @@ static uint32_t apic_mem_readl(void *opaque, target_p= hys_addr_t addr) > =C2=A0 =C2=A0 return val; > =C2=A0} > > -static void apic_send_msi(target_phys_addr_t addr, uint32_t data) > +void apic_deliver_msi(uint64_t addr, uint32_t data) > =C2=A0{ > =C2=A0 =C2=A0 uint8_t dest =3D (addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR= _DEST_ID_SHIFT; > =C2=A0 =C2=A0 uint8_t vector =3D (data & MSI_DATA_VECTOR_MASK) >> MSI_DAT= A_VECTOR_SHIFT; > @@ -817,7 +818,9 @@ static void apic_mem_writel(void *opaque, target_phys= _addr_t addr, uint32_t val) > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* APIC is connected directly to the CPU= . > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* Mapping them on the global bus happen= s to work because > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0* MSI registers are reserved in APIC MM= IO and vice versa. */ > - =C2=A0 =C2=A0 =C2=A0 =C2=A0apic_send_msi(addr, val); > + =C2=A0 =C2=A0 =C2=A0 =C2=A0MSIMessage msg =3D { .address =3D addr, .dat= a =3D val }; > + > + =C2=A0 =C2=A0 =C2=A0 =C2=A0msi_deliver(&msg); > =C2=A0 =C2=A0 =C2=A0 =C2=A0 return; > =C2=A0 =C2=A0 } > > diff --git a/hw/apic.h b/hw/apic.h > index c857d52..d0971ae 100644 > --- a/hw/apic.h > +++ b/hw/apic.h > @@ -20,6 +20,7 @@ void cpu_set_apic_tpr(DeviceState *s, uint8_t val); > =C2=A0uint8_t cpu_get_apic_tpr(DeviceState *s); > =C2=A0void apic_init_reset(DeviceState *s); > =C2=A0void apic_sipi(DeviceState *s); > +void apic_deliver_msi(uint64_t addr, uint32_t data); > > =C2=A0/* pc.c */ > =C2=A0int cpu_is_bsp(CPUState *env); > diff --git a/hw/msi.c b/hw/msi.c > index e111ceb..b88dcc4 100644 > --- a/hw/msi.c > +++ b/hw/msi.c > @@ -37,6 +37,14 @@ > > =C2=A0#define PCI_MSI_VECTORS_MAX =C2=A0 =C2=A0 32 > > +static void msi_unsupported(MSIMessage *msg) > +{ > + =C2=A0 =C2=A0/* If we get here, the board failed to register a delivery= handler. */ > + =C2=A0 =C2=A0abort(); > +} > + > +void (*msi_deliver)(MSIMessage *msg) =3D msi_unsupported; > + > =C2=A0/* If we get rid of cap allocator, we won't need this. */ > =C2=A0static inline uint8_t msi_cap_sizeof(uint16_t flags) > =C2=A0{ > @@ -361,7 +369,7 @@ void msi_notify(PCIDevice *dev, unsigned int vector) > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0"not= ify vector 0x%x" > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0" ad= dress: 0x%"PRIx64" data: 0x%"PRIx32"\n", > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0vect= or, msg.address, msg.data); > - =C2=A0 =C2=A0stl_phys(msg.address, msg.data); > + =C2=A0 =C2=A0msi_deliver(&msg); > =C2=A0} > > =C2=A0/* call this function after updating configs by pci_default_write_c= onfig(). */ > diff --git a/hw/msi.h b/hw/msi.h > index 7a0e76f..ba76eca 100644 > --- a/hw/msi.h > +++ b/hw/msi.h > @@ -44,4 +44,6 @@ static inline bool msi_present(const PCIDevice *dev) > =C2=A0 =C2=A0 return dev->cap_present & QEMU_PCI_CAP_MSI; > =C2=A0} > > +extern void (*msi_deliver)(MSIMessage *msg); > + > =C2=A0#endif /* QEMU_MSI_H */ > diff --git a/hw/msix.c b/hw/msix.c > index 6e55422..d893f88 100644 > --- a/hw/msix.c > +++ b/hw/msix.c > @@ -514,7 +514,7 @@ void msix_notify(PCIDevice *dev, unsigned vector) > > =C2=A0 =C2=A0 msix_message_from_vector(dev, vector, &msg); > > - =C2=A0 =C2=A0stl_phys(msg.address, msg.data); > + =C2=A0 =C2=A0msi_deliver(&msg); > =C2=A0} > > =C2=A0void msix_reset(PCIDevice *dev) > diff --git a/hw/pc.c b/hw/pc.c > index 73398eb..f88ef03 100644 > --- a/hw/pc.c > +++ b/hw/pc.c > @@ -24,6 +24,7 @@ > =C2=A0#include "hw.h" > =C2=A0#include "pc.h" > =C2=A0#include "apic.h" > +#include "msi.h" > =C2=A0#include "fdc.h" > =C2=A0#include "ide.h" > =C2=A0#include "pci.h" > @@ -102,6 +103,15 @@ void isa_irq_handler(void *opaque, int n, int level) > =C2=A0 =C2=A0 =C2=A0 =C2=A0 qemu_set_irq(isa->ioapic[n], level); > =C2=A0}; > > +static void pc_msi_deliver(MSIMessage *msg) > +{ > + =C2=A0 =C2=A0if ((msg->address & 0xfff00000) =3D=3D MSI_ADDR_BASE) { > + =C2=A0 =C2=A0 =C2=A0 =C2=A0apic_deliver_msi(msg->address, msg->data); > + =C2=A0 =C2=A0} else { > + =C2=A0 =C2=A0 =C2=A0 =C2=A0stl_phys(msg->address, msg->data); > + =C2=A0 =C2=A0} > +} > + > =C2=A0static void ioport80_write(void *opaque, uint32_t addr, uint32_t da= ta) > =C2=A0{ > =C2=A0} > @@ -889,6 +899,7 @@ static DeviceState *apic_init(void *env, uint8_t apic= _id) > =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0on the global memory bus. */ > =C2=A0 =C2=A0 =C2=A0 =C2=A0 /* XXX: what if the base changes? */ > =C2=A0 =C2=A0 =C2=A0 =C2=A0 sysbus_mmio_map(d, 0, MSI_ADDR_BASE); > + =C2=A0 =C2=A0 =C2=A0 =C2=A0msi_deliver =3D pc_msi_deliver; > =C2=A0 =C2=A0 =C2=A0 =C2=A0 apic_mapped =3D 1; > =C2=A0 =C2=A0 } > > -- > 1.7.1 > > -- > Siemens AG, Corporate Technology, CT T DE IT 1 > Corporate Competence Center Embedded Linux > Hmm... This patch is just part of GSoC project Virtio on Xen. AFAICS, this is not so urgent. I would prefer a cleaner way. A standalone function that doesn't belong to any QEMU subsystem annoys me. And the hooking looks ugly. Thanks for your patch, I think we can wait for a better design and cleaner solution for both Xen and KVM. --=20 Best regards Wei Liu Twitter: @iliuw Site: http://liuw.name