* [Qemu-devel] [PATCH] xen: fix interrupt routing
@ 2011-05-18 17:53 stefano.stabellini
2011-05-19 2:14 ` Isaku Yamahata
0 siblings, 1 reply; 19+ messages in thread
From: stefano.stabellini @ 2011-05-18 17:53 UTC (permalink / raw)
To: qemu-devel; +Cc: xen-devel, agraf, Stefano Stabellini
From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Match the routing informations built by seabios:
- remove i440fx_write_config_xen
we don't need to intercept pci config writes to i440FX;
- introduce piix3_write_config_xen
we do need to intercept pci config write to the PCI-ISA bridge to update
the PCI link routing;
- remove xen_pci_slot_get_pirq
we are now using the same link routing as seabios and qemu so we don't
need a diffirent get_pirq function;
- fix xen_piix3_set_irq
we always inject one of the 4 pci intx, so we can use
xc_hvm_set_isa_irq_level to inject the interrupt. Use pci_irqs as
initialized by seabios to map a pirq into an ISA irq. This has the
benefit of removing all the calls to xc_hvm_set_pci_intx_level that
doesn't work correctly anymore because from the same device number and
intx Xen calculates a different PCI link compared to Qemu and Seabios.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
---
hw/piix_pci.c | 21 ++++++++++++---------
xen-all.c | 16 ++++++++--------
2 files changed, 20 insertions(+), 17 deletions(-)
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 7f1c4cc..aae6a48 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -78,6 +78,8 @@ struct PCII440FXState {
#define I440FX_SMRAM 0x72
static void piix3_set_irq(void *opaque, int pirq, int level);
+static void piix3_write_config_xen(PCIDevice *dev,
+ uint32_t address, uint32_t val, int len);
/* return the global irq number corresponding to a given device irq
pin. We could also use the bus number to have a more precise
@@ -173,13 +175,6 @@ static void i440fx_write_config(PCIDevice *dev,
}
}
-static void i440fx_write_config_xen(PCIDevice *dev,
- uint32_t address, uint32_t val, int len)
-{
- xen_piix_pci_write_config_client(address, val, len);
- i440fx_write_config(dev, address, val, len);
-}
-
static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
{
PCII440FXState *d = opaque;
@@ -301,7 +296,8 @@ PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
PCIBus *b;
b = i440fx_common_init("i440FX-xen", pi440fx_state, piix3_devfn, pic, ram_size);
- pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
+ (*pi440fx_state)->piix3->dev.config_write = piix3_write_config_xen;
+ pci_bus_irqs(b, xen_piix3_set_irq, pci_slot_get_pirq,
(*pi440fx_state)->piix3, PIIX_NUM_PIRQS);
return b;
@@ -365,6 +361,13 @@ static void piix3_write_config(PCIDevice *dev,
}
}
+static void piix3_write_config_xen(PCIDevice *dev,
+ uint32_t address, uint32_t val, int len)
+{
+ xen_piix_pci_write_config_client(address, val, len);
+ piix3_write_config(dev, address, val, len);
+}
+
static void piix3_reset(void *opaque)
{
PIIX3State *d = opaque;
@@ -471,7 +474,7 @@ static PCIDeviceInfo i440fx_info[] = {
.qdev.vmsd = &vmstate_i440fx,
.qdev.no_user = 1,
.init = i440fx_initfn,
- .config_write = i440fx_write_config_xen,
+ .config_write = i440fx_write_config,
},{
.qdev.name = "PIIX3",
.qdev.desc = "ISA bridge",
diff --git a/xen-all.c b/xen-all.c
index 0eac202..7d7863f 100644
--- a/xen-all.c
+++ b/xen-all.c
@@ -70,17 +70,16 @@ typedef struct XenIOState {
Notifier exit;
} XenIOState;
-/* Xen specific function for piix pci */
+/* pci_irqs as initialized by seabios */
+uint8_t pci_irqs[4] = {
+ 10, 10, 11, 11
+};
-int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
-{
- return irq_num + ((pci_dev->devfn >> 3) << 2);
-}
+/* Xen specific function for piix pci */
-void xen_piix3_set_irq(void *opaque, int irq_num, int level)
+void xen_piix3_set_irq(void *opaque, int pirq, int level)
{
- xc_hvm_set_pci_intx_level(xen_xc, xen_domid, 0, 0, irq_num >> 2,
- irq_num & 3, level);
+ xc_hvm_set_isa_irq_level(xen_xc, xen_domid, pci_irqs[pirq], level);
}
void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
@@ -95,6 +94,7 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
}
v &= 0xf;
if (((address + i) >= 0x60) && ((address + i) <= 0x63)) {
+ pci_irqs[address + i - 0x60] = v;
xc_hvm_set_pci_link_route(xen_xc, xen_domid, address + i - 0x60, v);
}
}
--
1.7.2.3
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
2011-05-18 17:53 [Qemu-devel] " stefano.stabellini
@ 2011-05-19 2:14 ` Isaku Yamahata
2011-05-19 19:16 ` Stefano Stabellini
0 siblings, 1 reply; 19+ messages in thread
From: Isaku Yamahata @ 2011-05-19 2:14 UTC (permalink / raw)
To: stefano.stabellini; +Cc: xen-devel, qemu-devel, agraf
On Wed, May 18, 2011 at 06:53:40PM +0100, stefano.stabellini@eu.citrix.com wrote:
> From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>
> Match the routing informations built by seabios:
>
> - remove i440fx_write_config_xen
> we don't need to intercept pci config writes to i440FX;
>
> - introduce piix3_write_config_xen
> we do need to intercept pci config write to the PCI-ISA bridge to update
> the PCI link routing;
So "i440FX-xen" isn't needed anymore, and something like PIIX3-xen
is necessary instead of overwriting piix3->dev.config_write.
thanks,
>
> - remove xen_pci_slot_get_pirq
> we are now using the same link routing as seabios and qemu so we don't
> need a diffirent get_pirq function;
>
> - fix xen_piix3_set_irq
> we always inject one of the 4 pci intx, so we can use
> xc_hvm_set_isa_irq_level to inject the interrupt. Use pci_irqs as
> initialized by seabios to map a pirq into an ISA irq. This has the
> benefit of removing all the calls to xc_hvm_set_pci_intx_level that
> doesn't work correctly anymore because from the same device number and
> intx Xen calculates a different PCI link compared to Qemu and Seabios.
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> ---
> hw/piix_pci.c | 21 ++++++++++++---------
> xen-all.c | 16 ++++++++--------
> 2 files changed, 20 insertions(+), 17 deletions(-)
>
> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> index 7f1c4cc..aae6a48 100644
> --- a/hw/piix_pci.c
> +++ b/hw/piix_pci.c
> @@ -78,6 +78,8 @@ struct PCII440FXState {
> #define I440FX_SMRAM 0x72
>
> static void piix3_set_irq(void *opaque, int pirq, int level);
> +static void piix3_write_config_xen(PCIDevice *dev,
> + uint32_t address, uint32_t val, int len);
>
> /* return the global irq number corresponding to a given device irq
> pin. We could also use the bus number to have a more precise
> @@ -173,13 +175,6 @@ static void i440fx_write_config(PCIDevice *dev,
> }
> }
>
> -static void i440fx_write_config_xen(PCIDevice *dev,
> - uint32_t address, uint32_t val, int len)
> -{
> - xen_piix_pci_write_config_client(address, val, len);
> - i440fx_write_config(dev, address, val, len);
> -}
> -
> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
> {
> PCII440FXState *d = opaque;
> @@ -301,7 +296,8 @@ PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
> PCIBus *b;
>
> b = i440fx_common_init("i440FX-xen", pi440fx_state, piix3_devfn, pic, ram_size);
> - pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> + (*pi440fx_state)->piix3->dev.config_write = piix3_write_config_xen;
> + pci_bus_irqs(b, xen_piix3_set_irq, pci_slot_get_pirq,
> (*pi440fx_state)->piix3, PIIX_NUM_PIRQS);
>
> return b;
> @@ -365,6 +361,13 @@ static void piix3_write_config(PCIDevice *dev,
> }
> }
>
> +static void piix3_write_config_xen(PCIDevice *dev,
> + uint32_t address, uint32_t val, int len)
> +{
> + xen_piix_pci_write_config_client(address, val, len);
> + piix3_write_config(dev, address, val, len);
> +}
> +
> static void piix3_reset(void *opaque)
> {
> PIIX3State *d = opaque;
> @@ -471,7 +474,7 @@ static PCIDeviceInfo i440fx_info[] = {
> .qdev.vmsd = &vmstate_i440fx,
> .qdev.no_user = 1,
> .init = i440fx_initfn,
> - .config_write = i440fx_write_config_xen,
> + .config_write = i440fx_write_config,
> },{
> .qdev.name = "PIIX3",
> .qdev.desc = "ISA bridge",
> diff --git a/xen-all.c b/xen-all.c
> index 0eac202..7d7863f 100644
> --- a/xen-all.c
> +++ b/xen-all.c
> @@ -70,17 +70,16 @@ typedef struct XenIOState {
> Notifier exit;
> } XenIOState;
>
> -/* Xen specific function for piix pci */
> +/* pci_irqs as initialized by seabios */
> +uint8_t pci_irqs[4] = {
> + 10, 10, 11, 11
> +};
>
> -int xen_pci_slot_get_pirq(PCIDevice *pci_dev, int irq_num)
> -{
> - return irq_num + ((pci_dev->devfn >> 3) << 2);
> -}
> +/* Xen specific function for piix pci */
>
> -void xen_piix3_set_irq(void *opaque, int irq_num, int level)
> +void xen_piix3_set_irq(void *opaque, int pirq, int level)
> {
> - xc_hvm_set_pci_intx_level(xen_xc, xen_domid, 0, 0, irq_num >> 2,
> - irq_num & 3, level);
> + xc_hvm_set_isa_irq_level(xen_xc, xen_domid, pci_irqs[pirq], level);
> }
>
> void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
> @@ -95,6 +94,7 @@ void xen_piix_pci_write_config_client(uint32_t address, uint32_t val, int len)
> }
> v &= 0xf;
> if (((address + i) >= 0x60) && ((address + i) <= 0x63)) {
> + pci_irqs[address + i - 0x60] = v;
> xc_hvm_set_pci_link_route(xen_xc, xen_domid, address + i - 0x60, v);
> }
> }
> --
> 1.7.2.3
>
>
--
yamahata
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
2011-05-19 2:14 ` Isaku Yamahata
@ 2011-05-19 19:16 ` Stefano Stabellini
0 siblings, 0 replies; 19+ messages in thread
From: Stefano Stabellini @ 2011-05-19 19:16 UTC (permalink / raw)
To: Isaku Yamahata
Cc: agraf@suse.de, xen-devel@lists.xensource.com,
qemu-devel@nongnu.org, Stefano Stabellini
On Thu, 19 May 2011, Isaku Yamahata wrote:
> On Wed, May 18, 2011 at 06:53:40PM +0100, stefano.stabellini@eu.citrix.com wrote:
> > From: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> >
> > Match the routing informations built by seabios:
> >
> > - remove i440fx_write_config_xen
> > we don't need to intercept pci config writes to i440FX;
> >
> > - introduce piix3_write_config_xen
> > we do need to intercept pci config write to the PCI-ISA bridge to update
> > the PCI link routing;
>
> So "i440FX-xen" isn't needed anymore, and something like PIIX3-xen
> is necessary instead of overwriting piix3->dev.config_write.
That is a good idea. I am also going to remove i440fx_xen_init to get rid of
a 'if (xen_enabled())' but to do that I have to move the call to
pci_bus_irqs in i440fx_common_init.
I'll send an update of the patch as soon as we figure out what to do
with the BIOS tables.
^ permalink raw reply [flat|nested] 19+ messages in thread
* [Qemu-devel] [PATCH] xen: fix interrupt routing
@ 2011-05-26 15:48 Stefano Stabellini
2011-05-27 23:17 ` Alexander Graf
0 siblings, 1 reply; 19+ messages in thread
From: Stefano Stabellini @ 2011-05-26 15:48 UTC (permalink / raw)
To: qemu-devel; +Cc: Isaku Yamahata, xen-devel, Alexander Graf
xen: fix interrupt routing
- remove i440FX-xen and i440fx_write_config_xen
we don't need to intercept pci config writes to i440FX anymore;
- introduce PIIX3-xen and piix3_write_config_xen
we do need to intercept pci config write to the PCI-ISA bridge to update
the PCI link routing;
- set the number of PIIX3-xen interrupts lines to 128;
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
diff --git a/hw/pc.h b/hw/pc.h
index 0dcbee7..6d5730b 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -176,7 +176,6 @@ struct PCII440FXState;
typedef struct PCII440FXState PCII440FXState;
PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size);
-PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
void i440fx_init_memory_mappings(PCII440FXState *d);
/* piix4.c */
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 9a22a8a..ba198de 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
if (pci_enabled) {
- if (!xen_enabled()) {
- pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
- } else {
- pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
- }
+ pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
} else {
pci_bus = NULL;
i440fx_state = NULL;
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 7f1c4cc..3d73a42 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
#define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */
#define PIIX_NUM_PIRQS 4ULL /* PIRQ[A-D] */
+#define XEN_PIIX_NUM_PIRQS 128ULL
#define PIIX_PIRQC 0x60
typedef struct PIIX3State {
@@ -78,6 +79,8 @@ struct PCII440FXState {
#define I440FX_SMRAM 0x72
static void piix3_set_irq(void *opaque, int pirq, int level);
+static void piix3_write_config_xen(PCIDevice *dev,
+ uint32_t address, uint32_t val, int len);
/* return the global irq number corresponding to a given device irq
pin. We could also use the bus number to have a more precise
@@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
}
}
-static void i440fx_write_config_xen(PCIDevice *dev,
- uint32_t address, uint32_t val, int len)
-{
- xen_piix_pci_write_config_client(address, val, len);
- i440fx_write_config(dev, address, val, len);
-}
-
static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
{
PCII440FXState *d = opaque;
@@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
d = pci_create_simple(b, 0, device_name);
*pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
- piix3 = DO_UPCAST(PIIX3State, dev,
- pci_create_simple_multifunction(b, -1, true, "PIIX3"));
+ if (xen_enabled()) {
+ piix3 = DO_UPCAST(PIIX3State, dev,
+ pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
+ pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
+ piix3, XEN_PIIX_NUM_PIRQS);
+ } else {
+ piix3 = DO_UPCAST(PIIX3State, dev,
+ pci_create_simple_multifunction(b, -1, true, "PIIX3"));
+ pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3,
+ PIIX_NUM_PIRQS);
+ }
piix3->pic = pic;
(*pi440fx_state)->piix3 = piix3;
@@ -289,21 +294,6 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
PCIBus *b;
b = i440fx_common_init("i440FX", pi440fx_state, piix3_devfn, pic, ram_size);
- pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, (*pi440fx_state)->piix3,
- PIIX_NUM_PIRQS);
-
- return b;
-}
-
-PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
- qemu_irq *pic, ram_addr_t ram_size)
-{
- PCIBus *b;
-
- b = i440fx_common_init("i440FX-xen", pi440fx_state, piix3_devfn, pic, ram_size);
- pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
- (*pi440fx_state)->piix3, PIIX_NUM_PIRQS);
-
return b;
}
@@ -365,6 +355,13 @@ static void piix3_write_config(PCIDevice *dev,
}
}
+static void piix3_write_config_xen(PCIDevice *dev,
+ uint32_t address, uint32_t val, int len)
+{
+ xen_piix_pci_write_config_client(address, val, len);
+ piix3_write_config(dev, address, val, len);
+}
+
static void piix3_reset(void *opaque)
{
PIIX3State *d = opaque;
@@ -465,14 +462,6 @@ static PCIDeviceInfo i440fx_info[] = {
.init = i440fx_initfn,
.config_write = i440fx_write_config,
},{
- .qdev.name = "i440FX-xen",
- .qdev.desc = "Host bridge",
- .qdev.size = sizeof(PCII440FXState),
- .qdev.vmsd = &vmstate_i440fx,
- .qdev.no_user = 1,
- .init = i440fx_initfn,
- .config_write = i440fx_write_config_xen,
- },{
.qdev.name = "PIIX3",
.qdev.desc = "ISA bridge",
.qdev.size = sizeof(PIIX3State),
@@ -482,6 +471,15 @@ static PCIDeviceInfo i440fx_info[] = {
.init = piix3_initfn,
.config_write = piix3_write_config,
},{
+ .qdev.name = "PIIX3-xen",
+ .qdev.desc = "ISA bridge",
+ .qdev.size = sizeof(PIIX3State),
+ .qdev.vmsd = &vmstate_piix3,
+ .qdev.no_user = 1,
+ .no_hotplug = 1,
+ .init = piix3_initfn,
+ .config_write = piix3_write_config_xen,
+ },{
/* end of list */
}
};
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
2011-05-26 15:48 [Qemu-devel] [PATCH] xen: fix interrupt routing Stefano Stabellini
@ 2011-05-27 23:17 ` Alexander Graf
2011-05-31 11:05 ` Stefano Stabellini
0 siblings, 1 reply; 19+ messages in thread
From: Alexander Graf @ 2011-05-27 23:17 UTC (permalink / raw)
To: Stefano Stabellini; +Cc: Isaku Yamahata, xen-devel, qemu-devel
On 26.05.2011, at 17:48, Stefano Stabellini wrote:
> xen: fix interrupt routing
>
> - remove i440FX-xen and i440fx_write_config_xen
> we don't need to intercept pci config writes to i440FX anymore;
Why not? In which version? Did anything below change? What about compat code? Older hypervisor versions?
> - introduce PIIX3-xen and piix3_write_config_xen
> we do need to intercept pci config write to the PCI-ISA bridge to update
> the PCI link routing;
>
> - set the number of PIIX3-xen interrupts lines to 128;
>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>
> diff --git a/hw/pc.h b/hw/pc.h
> index 0dcbee7..6d5730b 100644
> --- a/hw/pc.h
> +++ b/hw/pc.h
> @@ -176,7 +176,6 @@ struct PCII440FXState;
> typedef struct PCII440FXState PCII440FXState;
>
> PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size);
> -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
> void i440fx_init_memory_mappings(PCII440FXState *d);
>
> /* piix4.c */
> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> index 9a22a8a..ba198de 100644
> --- a/hw/pc_piix.c
> +++ b/hw/pc_piix.c
> @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
> isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
>
> if (pci_enabled) {
> - if (!xen_enabled()) {
> - pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
> - } else {
> - pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
> - }
> + pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
> } else {
> pci_bus = NULL;
> i440fx_state = NULL;
> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> index 7f1c4cc..3d73a42 100644
> --- a/hw/piix_pci.c
> +++ b/hw/piix_pci.c
> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
>
> #define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */
> #define PIIX_NUM_PIRQS 4ULL /* PIRQ[A-D] */
> +#define XEN_PIIX_NUM_PIRQS 128ULL
> #define PIIX_PIRQC 0x60
>
> typedef struct PIIX3State {
> @@ -78,6 +79,8 @@ struct PCII440FXState {
> #define I440FX_SMRAM 0x72
>
> static void piix3_set_irq(void *opaque, int pirq, int level);
> +static void piix3_write_config_xen(PCIDevice *dev,
> + uint32_t address, uint32_t val, int len);
>
> /* return the global irq number corresponding to a given device irq
> pin. We could also use the bus number to have a more precise
> @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
> }
> }
>
> -static void i440fx_write_config_xen(PCIDevice *dev,
> - uint32_t address, uint32_t val, int len)
> -{
> - xen_piix_pci_write_config_client(address, val, len);
> - i440fx_write_config(dev, address, val, len);
> -}
> -
> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
> {
> PCII440FXState *d = opaque;
> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
> d = pci_create_simple(b, 0, device_name);
> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>
> - piix3 = DO_UPCAST(PIIX3State, dev,
> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
> + if (xen_enabled()) {
> + piix3 = DO_UPCAST(PIIX3State, dev,
> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> + piix3, XEN_PIIX_NUM_PIRQS);
But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
Alex
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
2011-05-27 23:17 ` Alexander Graf
@ 2011-05-31 11:05 ` Stefano Stabellini
2011-06-14 9:25 ` Alexander Graf
0 siblings, 1 reply; 19+ messages in thread
From: Stefano Stabellini @ 2011-05-31 11:05 UTC (permalink / raw)
To: Alexander Graf
Cc: Isaku Yamahata, xen-devel@lists.xensource.com,
qemu-devel@nongnu.org, Stefano Stabellini
On Sat, 28 May 2011, Alexander Graf wrote:
>
> On 26.05.2011, at 17:48, Stefano Stabellini wrote:
>
> > xen: fix interrupt routing
> >
> > - remove i440FX-xen and i440fx_write_config_xen
> > we don't need to intercept pci config writes to i440FX anymore;
>
> Why not? In which version? Did anything below change? What about compat code? Older hypervisor versions?
Nothing changed, I think it was a genuine mistake in the original patch
series: the intention was to catch the pci config writes to 0x60-0x63 to
reprogram the xen pci link routes accordingly (see
xen_piix_pci_write_config_client). If I am not mistaken a similar thing
is done by non-Xen Qemu in piix3_update_irq_levels.
> > - introduce PIIX3-xen and piix3_write_config_xen
> > we do need to intercept pci config write to the PCI-ISA bridge to update
> > the PCI link routing;
> >
> > - set the number of PIIX3-xen interrupts lines to 128;
> >
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> >
> > diff --git a/hw/pc.h b/hw/pc.h
> > index 0dcbee7..6d5730b 100644
> > --- a/hw/pc.h
> > +++ b/hw/pc.h
> > @@ -176,7 +176,6 @@ struct PCII440FXState;
> > typedef struct PCII440FXState PCII440FXState;
> >
> > PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size);
> > -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
> > void i440fx_init_memory_mappings(PCII440FXState *d);
> >
> > /* piix4.c */
> > diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> > index 9a22a8a..ba198de 100644
> > --- a/hw/pc_piix.c
> > +++ b/hw/pc_piix.c
> > @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
> > isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
> >
> > if (pci_enabled) {
> > - if (!xen_enabled()) {
> > - pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
> > - } else {
> > - pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
> > - }
> > + pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
> > } else {
> > pci_bus = NULL;
> > i440fx_state = NULL;
> > diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> > index 7f1c4cc..3d73a42 100644
> > --- a/hw/piix_pci.c
> > +++ b/hw/piix_pci.c
> > @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
> >
> > #define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */
> > #define PIIX_NUM_PIRQS 4ULL /* PIRQ[A-D] */
> > +#define XEN_PIIX_NUM_PIRQS 128ULL
> > #define PIIX_PIRQC 0x60
> >
> > typedef struct PIIX3State {
> > @@ -78,6 +79,8 @@ struct PCII440FXState {
> > #define I440FX_SMRAM 0x72
> >
> > static void piix3_set_irq(void *opaque, int pirq, int level);
> > +static void piix3_write_config_xen(PCIDevice *dev,
> > + uint32_t address, uint32_t val, int len);
> >
> > /* return the global irq number corresponding to a given device irq
> > pin. We could also use the bus number to have a more precise
> > @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
> > }
> > }
> >
> > -static void i440fx_write_config_xen(PCIDevice *dev,
> > - uint32_t address, uint32_t val, int len)
> > -{
> > - xen_piix_pci_write_config_client(address, val, len);
> > - i440fx_write_config(dev, address, val, len);
> > -}
> > -
> > static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
> > {
> > PCII440FXState *d = opaque;
> > @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
> > d = pci_create_simple(b, 0, device_name);
> > *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
> >
> > - piix3 = DO_UPCAST(PIIX3State, dev,
> > - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
> > + if (xen_enabled()) {
> > + piix3 = DO_UPCAST(PIIX3State, dev,
> > + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
> > + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> > + piix3, XEN_PIIX_NUM_PIRQS);
>
> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
It is still a piix3, but also provides non-legacy interrupt links to the
IO-APIC.
The four pins of each PCI device on the bus not only are routed to the
normal four pirqs (programmed writing to 0x60-0x63, see above) but also
they are connected to the IO-APIC directly.
These additional routes can only be discovered through ACPI, so you need
matching ACPI tables. We used to build the old ACPI tables like this:
/* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
printf("Name(PRTA, Package() {\n");
for ( dev = 1; dev < 32; dev++ )
for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
printf("Package(){0x%04xffff, %u, 0, %u},\n",
dev, intx, ((dev*4+dev/8+intx)&31)+16);
printf("})\n");
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
2011-05-31 11:05 ` Stefano Stabellini
@ 2011-06-14 9:25 ` Alexander Graf
2011-06-14 12:17 ` [Qemu-devel] [Xen-devel] " Ian Campbell
0 siblings, 1 reply; 19+ messages in thread
From: Alexander Graf @ 2011-06-14 9:25 UTC (permalink / raw)
To: Stefano Stabellini
Cc: Isaku Yamahata, xen-devel@lists.xensource.com,
qemu-devel@nongnu.org
On 31.05.2011, at 13:05, Stefano Stabellini wrote:
> On Sat, 28 May 2011, Alexander Graf wrote:
>>
>> On 26.05.2011, at 17:48, Stefano Stabellini wrote:
>>
>>> xen: fix interrupt routing
>>>
>>> - remove i440FX-xen and i440fx_write_config_xen
>>> we don't need to intercept pci config writes to i440FX anymore;
>>
>> Why not? In which version? Did anything below change? What about compat code? Older hypervisor versions?
>
> Nothing changed, I think it was a genuine mistake in the original patch
> series: the intention was to catch the pci config writes to 0x60-0x63 to
> reprogram the xen pci link routes accordingly (see
> xen_piix_pci_write_config_client). If I am not mistaken a similar thing
> is done by non-Xen Qemu in piix3_update_irq_levels.
>
>
>>> - introduce PIIX3-xen and piix3_write_config_xen
>>> we do need to intercept pci config write to the PCI-ISA bridge to update
>>> the PCI link routing;
>>>
>>> - set the number of PIIX3-xen interrupts lines to 128;
>>>
>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>>>
>>> diff --git a/hw/pc.h b/hw/pc.h
>>> index 0dcbee7..6d5730b 100644
>>> --- a/hw/pc.h
>>> +++ b/hw/pc.h
>>> @@ -176,7 +176,6 @@ struct PCII440FXState;
>>> typedef struct PCII440FXState PCII440FXState;
>>>
>>> PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size);
>>> -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
>>> void i440fx_init_memory_mappings(PCII440FXState *d);
>>>
>>> /* piix4.c */
>>> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
>>> index 9a22a8a..ba198de 100644
>>> --- a/hw/pc_piix.c
>>> +++ b/hw/pc_piix.c
>>> @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
>>> isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
>>>
>>> if (pci_enabled) {
>>> - if (!xen_enabled()) {
>>> - pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>> - } else {
>>> - pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>> - }
>>> + pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>> } else {
>>> pci_bus = NULL;
>>> i440fx_state = NULL;
>>> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
>>> index 7f1c4cc..3d73a42 100644
>>> --- a/hw/piix_pci.c
>>> +++ b/hw/piix_pci.c
>>> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
>>>
>>> #define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */
>>> #define PIIX_NUM_PIRQS 4ULL /* PIRQ[A-D] */
>>> +#define XEN_PIIX_NUM_PIRQS 128ULL
>>> #define PIIX_PIRQC 0x60
>>>
>>> typedef struct PIIX3State {
>>> @@ -78,6 +79,8 @@ struct PCII440FXState {
>>> #define I440FX_SMRAM 0x72
>>>
>>> static void piix3_set_irq(void *opaque, int pirq, int level);
>>> +static void piix3_write_config_xen(PCIDevice *dev,
>>> + uint32_t address, uint32_t val, int len);
>>>
>>> /* return the global irq number corresponding to a given device irq
>>> pin. We could also use the bus number to have a more precise
>>> @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
>>> }
>>> }
>>>
>>> -static void i440fx_write_config_xen(PCIDevice *dev,
>>> - uint32_t address, uint32_t val, int len)
>>> -{
>>> - xen_piix_pci_write_config_client(address, val, len);
>>> - i440fx_write_config(dev, address, val, len);
>>> -}
>>> -
>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>>> {
>>> PCII440FXState *d = opaque;
>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
>>> d = pci_create_simple(b, 0, device_name);
>>> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>>
>>> - piix3 = DO_UPCAST(PIIX3State, dev,
>>> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>>> + if (xen_enabled()) {
>>> + piix3 = DO_UPCAST(PIIX3State, dev,
>>> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
>>> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>>> + piix3, XEN_PIIX_NUM_PIRQS);
>>
>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
>
> It is still a piix3, but also provides non-legacy interrupt links to the
> IO-APIC.
> The four pins of each PCI device on the bus not only are routed to the
> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
> they are connected to the IO-APIC directly.
> These additional routes can only be discovered through ACPI, so you need
> matching ACPI tables. We used to build the old ACPI tables like this:
>
> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
> printf("Name(PRTA, Package() {\n");
> for ( dev = 1; dev < 32; dev++ )
> for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
> printf("Package(){0x%04xffff, %u, 0, %u},\n",
> dev, intx, ((dev*4+dev/8+intx)&31)+16);
> printf("})\n");
>
Interesting concept, but completely non-standard and very much different from real hardware. Please at least add a comment there to show readers that Xen is doing a hack which is not at all related to how the PIIX really works.
Also, do you really need this? For KVM, we simply share interrupts or use MSI where we need the device separate.
Alex
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [Xen-devel] Re: [PATCH] xen: fix interrupt routing
2011-06-14 9:25 ` Alexander Graf
@ 2011-06-14 12:17 ` Ian Campbell
2011-06-14 12:30 ` Alexander Graf
0 siblings, 1 reply; 19+ messages in thread
From: Ian Campbell @ 2011-06-14 12:17 UTC (permalink / raw)
To: Alexander Graf
Cc: Isaku Yamahata, xen-devel@lists.xensource.com,
qemu-devel@nongnu.org, Stefano Stabellini
On Tue, 2011-06-14 at 10:25 +0100, Alexander Graf wrote:
> On 31.05.2011, at 13:05, Stefano Stabellini wrote:
>
> > On Sat, 28 May 2011, Alexander Graf wrote:
> >>
> >> On 26.05.2011, at 17:48, Stefano Stabellini wrote:
> >>
> >>> xen: fix interrupt routing
> >>>
> >>> - remove i440FX-xen and i440fx_write_config_xen
> >>> we don't need to intercept pci config writes to i440FX anymore;
> >>
> >> Why not? In which version? Did anything below change? What about compat code? Older hypervisor versions?
> >
> > Nothing changed, I think it was a genuine mistake in the original patch
> > series: the intention was to catch the pci config writes to 0x60-0x63 to
> > reprogram the xen pci link routes accordingly (see
> > xen_piix_pci_write_config_client). If I am not mistaken a similar thing
> > is done by non-Xen Qemu in piix3_update_irq_levels.
> >
> >
> >>> - introduce PIIX3-xen and piix3_write_config_xen
> >>> we do need to intercept pci config write to the PCI-ISA bridge to update
> >>> the PCI link routing;
> >>>
> >>> - set the number of PIIX3-xen interrupts lines to 128;
> >>>
> >>> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> >>>
> >>> diff --git a/hw/pc.h b/hw/pc.h
> >>> index 0dcbee7..6d5730b 100644
> >>> --- a/hw/pc.h
> >>> +++ b/hw/pc.h
> >>> @@ -176,7 +176,6 @@ struct PCII440FXState;
> >>> typedef struct PCII440FXState PCII440FXState;
> >>>
> >>> PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size);
> >>> -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
> >>> void i440fx_init_memory_mappings(PCII440FXState *d);
> >>>
> >>> /* piix4.c */
> >>> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
> >>> index 9a22a8a..ba198de 100644
> >>> --- a/hw/pc_piix.c
> >>> +++ b/hw/pc_piix.c
> >>> @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
> >>> isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
> >>>
> >>> if (pci_enabled) {
> >>> - if (!xen_enabled()) {
> >>> - pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
> >>> - } else {
> >>> - pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
> >>> - }
> >>> + pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
> >>> } else {
> >>> pci_bus = NULL;
> >>> i440fx_state = NULL;
> >>> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
> >>> index 7f1c4cc..3d73a42 100644
> >>> --- a/hw/piix_pci.c
> >>> +++ b/hw/piix_pci.c
> >>> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
> >>>
> >>> #define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */
> >>> #define PIIX_NUM_PIRQS 4ULL /* PIRQ[A-D] */
> >>> +#define XEN_PIIX_NUM_PIRQS 128ULL
> >>> #define PIIX_PIRQC 0x60
> >>>
> >>> typedef struct PIIX3State {
> >>> @@ -78,6 +79,8 @@ struct PCII440FXState {
> >>> #define I440FX_SMRAM 0x72
> >>>
> >>> static void piix3_set_irq(void *opaque, int pirq, int level);
> >>> +static void piix3_write_config_xen(PCIDevice *dev,
> >>> + uint32_t address, uint32_t val, int len);
> >>>
> >>> /* return the global irq number corresponding to a given device irq
> >>> pin. We could also use the bus number to have a more precise
> >>> @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
> >>> }
> >>> }
> >>>
> >>> -static void i440fx_write_config_xen(PCIDevice *dev,
> >>> - uint32_t address, uint32_t val, int len)
> >>> -{
> >>> - xen_piix_pci_write_config_client(address, val, len);
> >>> - i440fx_write_config(dev, address, val, len);
> >>> -}
> >>> -
> >>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
> >>> {
> >>> PCII440FXState *d = opaque;
> >>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
> >>> d = pci_create_simple(b, 0, device_name);
> >>> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
> >>>
> >>> - piix3 = DO_UPCAST(PIIX3State, dev,
> >>> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
> >>> + if (xen_enabled()) {
> >>> + piix3 = DO_UPCAST(PIIX3State, dev,
> >>> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
> >>> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> >>> + piix3, XEN_PIIX_NUM_PIRQS);
> >>
> >> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
> >
> > It is still a piix3, but also provides non-legacy interrupt links to the
> > IO-APIC.
> > The four pins of each PCI device on the bus not only are routed to the
> > normal four pirqs (programmed writing to 0x60-0x63, see above) but also
> > they are connected to the IO-APIC directly.
> > These additional routes can only be discovered through ACPI, so you need
> > matching ACPI tables. We used to build the old ACPI tables like this:
> >
> > /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
> > printf("Name(PRTA, Package() {\n");
> > for ( dev = 1; dev < 32; dev++ )
> > for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
> > printf("Package(){0x%04xffff, %u, 0, %u},\n",
> > dev, intx, ((dev*4+dev/8+intx)&31)+16);
> > printf("})\n");
> >
>
> Interesting concept, but completely non-standard and very much
> different from real hardware. Please at least add a comment there to
> show readers that Xen is doing a hack which is not at all related to
> how the PIIX really works.
Isn't this more a function of the "wires" on the motherboard than the
PIIX specifically? i.e. this just encodes the permutation of the wires
from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
IOW I guess strictly speaking Xen differs from the i440fx rather than
from the PIIX?
So XEN_PIIX_NUM_PIRQS is a bit misnamed -- it's more like
XEN_I440FX_NUM_PIRQS (or maybe even XEN_IOAPIC_NUM_...?)
Ian.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [Xen-devel] Re: [PATCH] xen: fix interrupt routing
2011-06-14 12:17 ` [Qemu-devel] [Xen-devel] " Ian Campbell
@ 2011-06-14 12:30 ` Alexander Graf
2011-06-14 13:08 ` [Qemu-devel] " Jan Kiszka
2011-06-14 13:27 ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
0 siblings, 2 replies; 19+ messages in thread
From: Alexander Graf @ 2011-06-14 12:30 UTC (permalink / raw)
To: Ian Campbell
Cc: Isaku Yamahata, xen-devel@lists.xensource.com,
qemu-devel@nongnu.org, Stefano Stabellini
Am 14.06.2011 um 14:17 schrieb Ian Campbell <Ian.Campbell@citrix.com>:
> On Tue, 2011-06-14 at 10:25 +0100, Alexander Graf wrote:
>> On 31.05.2011, at 13:05, Stefano Stabellini wrote:
>>
>>> On Sat, 28 May 2011, Alexander Graf wrote:
>>>>
>>>> On 26.05.2011, at 17:48, Stefano Stabellini wrote:
>>>>
>>>>> xen: fix interrupt routing
>>>>>
>>>>> - remove i440FX-xen and i440fx_write_config_xen
>>>>> we don't need to intercept pci config writes to i440FX anymore;
>>>>
>>>> Why not? In which version? Did anything below change? What about compat code? Older hypervisor versions?
>>>
>>> Nothing changed, I think it was a genuine mistake in the original patch
>>> series: the intention was to catch the pci config writes to 0x60-0x63 to
>>> reprogram the xen pci link routes accordingly (see
>>> xen_piix_pci_write_config_client). If I am not mistaken a similar thing
>>> is done by non-Xen Qemu in piix3_update_irq_levels.
>>>
>>>
>>>>> - introduce PIIX3-xen and piix3_write_config_xen
>>>>> we do need to intercept pci config write to the PCI-ISA bridge to update
>>>>> the PCI link routing;
>>>>>
>>>>> - set the number of PIIX3-xen interrupts lines to 128;
>>>>>
>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>>>>>
>>>>> diff --git a/hw/pc.h b/hw/pc.h
>>>>> index 0dcbee7..6d5730b 100644
>>>>> --- a/hw/pc.h
>>>>> +++ b/hw/pc.h
>>>>> @@ -176,7 +176,6 @@ struct PCII440FXState;
>>>>> typedef struct PCII440FXState PCII440FXState;
>>>>>
>>>>> PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size);
>>>>> -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
>>>>> void i440fx_init_memory_mappings(PCII440FXState *d);
>>>>>
>>>>> /* piix4.c */
>>>>> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
>>>>> index 9a22a8a..ba198de 100644
>>>>> --- a/hw/pc_piix.c
>>>>> +++ b/hw/pc_piix.c
>>>>> @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
>>>>> isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
>>>>>
>>>>> if (pci_enabled) {
>>>>> - if (!xen_enabled()) {
>>>>> - pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>>>> - } else {
>>>>> - pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>>>> - }
>>>>> + pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>>>> } else {
>>>>> pci_bus = NULL;
>>>>> i440fx_state = NULL;
>>>>> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
>>>>> index 7f1c4cc..3d73a42 100644
>>>>> --- a/hw/piix_pci.c
>>>>> +++ b/hw/piix_pci.c
>>>>> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
>>>>>
>>>>> #define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */
>>>>> #define PIIX_NUM_PIRQS 4ULL /* PIRQ[A-D] */
>>>>> +#define XEN_PIIX_NUM_PIRQS 128ULL
>>>>> #define PIIX_PIRQC 0x60
>>>>>
>>>>> typedef struct PIIX3State {
>>>>> @@ -78,6 +79,8 @@ struct PCII440FXState {
>>>>> #define I440FX_SMRAM 0x72
>>>>>
>>>>> static void piix3_set_irq(void *opaque, int pirq, int level);
>>>>> +static void piix3_write_config_xen(PCIDevice *dev,
>>>>> + uint32_t address, uint32_t val, int len);
>>>>>
>>>>> /* return the global irq number corresponding to a given device irq
>>>>> pin. We could also use the bus number to have a more precise
>>>>> @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
>>>>> }
>>>>> }
>>>>>
>>>>> -static void i440fx_write_config_xen(PCIDevice *dev,
>>>>> - uint32_t address, uint32_t val, int len)
>>>>> -{
>>>>> - xen_piix_pci_write_config_client(address, val, len);
>>>>> - i440fx_write_config(dev, address, val, len);
>>>>> -}
>>>>> -
>>>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>>>>> {
>>>>> PCII440FXState *d = opaque;
>>>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
>>>>> d = pci_create_simple(b, 0, device_name);
>>>>> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>>>>
>>>>> - piix3 = DO_UPCAST(PIIX3State, dev,
>>>>> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>>>>> + if (xen_enabled()) {
>>>>> + piix3 = DO_UPCAST(PIIX3State, dev,
>>>>> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
>>>>> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>>>>> + piix3, XEN_PIIX_NUM_PIRQS);
>>>>
>>>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
>>>
>>> It is still a piix3, but also provides non-legacy interrupt links to the
>>> IO-APIC.
>>> The four pins of each PCI device on the bus not only are routed to the
>>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
>>> they are connected to the IO-APIC directly.
>>> These additional routes can only be discovered through ACPI, so you need
>>> matching ACPI tables. We used to build the old ACPI tables like this:
>>>
>>> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
>>> printf("Name(PRTA, Package() {\n");
>>> for ( dev = 1; dev < 32; dev++ )
>>> for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
>>> printf("Package(){0x%04xffff, %u, 0, %u},\n",
>>> dev, intx, ((dev*4+dev/8+intx)&31)+16);
>>> printf("})\n");
>>>
>>
>> Interesting concept, but completely non-standard and very much
>> different from real hardware. Please at least add a comment there to
>> show readers that Xen is doing a hack which is not at all related to
>> how the PIIX really works.
>
> Isn't this more a function of the "wires" on the motherboard than the
> PIIX specifically? i.e. this just encodes the permutation of the wires
> from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
> which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
Interrupts with PCI work slightly different. PCI devices can map (themselves or by software) to one of 4 interrupt lines: INTA, INTB, INTC, INTD. These get converted using PCI host controller specific logic to 4 interrupt lines which then go into the IO-APIC.
The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could be 26 though.
I haven't seen a single case where PCI devices have a direct link to the IO-APIC. I also have not seen any PCI host controller that exports more than 4 interrupts. Giving each PCI device its own line, on top of that more than ever could be in real hardware, is a plain hack IMHO.
Did this really give you actual performance/latency/scalability gains? I still think for devices that matter, we should go with MSI rather than deriving from real hw.
Alex
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
2011-06-14 12:30 ` Alexander Graf
@ 2011-06-14 13:08 ` Jan Kiszka
2011-06-14 13:31 ` Stefano Stabellini
2011-06-14 13:27 ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
1 sibling, 1 reply; 19+ messages in thread
From: Jan Kiszka @ 2011-06-14 13:08 UTC (permalink / raw)
To: Alexander Graf
Cc: xen-devel@lists.xensource.com, Ian Campbell, Stefano Stabellini,
qemu-devel@nongnu.org, Isaku Yamahata
On 2011-06-14 14:30, Alexander Graf wrote:
>
> Am 14.06.2011 um 14:17 schrieb Ian Campbell <Ian.Campbell@citrix.com>:
>
>> On Tue, 2011-06-14 at 10:25 +0100, Alexander Graf wrote:
>>> On 31.05.2011, at 13:05, Stefano Stabellini wrote:
>>>
>>>> On Sat, 28 May 2011, Alexander Graf wrote:
>>>>>
>>>>> On 26.05.2011, at 17:48, Stefano Stabellini wrote:
>>>>>
>>>>>> xen: fix interrupt routing
>>>>>>
>>>>>> - remove i440FX-xen and i440fx_write_config_xen
>>>>>> we don't need to intercept pci config writes to i440FX anymore;
>>>>>
>>>>> Why not? In which version? Did anything below change? What about compat code? Older hypervisor versions?
>>>>
>>>> Nothing changed, I think it was a genuine mistake in the original patch
>>>> series: the intention was to catch the pci config writes to 0x60-0x63 to
>>>> reprogram the xen pci link routes accordingly (see
>>>> xen_piix_pci_write_config_client). If I am not mistaken a similar thing
>>>> is done by non-Xen Qemu in piix3_update_irq_levels.
>>>>
>>>>
>>>>>> - introduce PIIX3-xen and piix3_write_config_xen
>>>>>> we do need to intercept pci config write to the PCI-ISA bridge to update
>>>>>> the PCI link routing;
>>>>>>
>>>>>> - set the number of PIIX3-xen interrupts lines to 128;
>>>>>>
>>>>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
>>>>>>
>>>>>> diff --git a/hw/pc.h b/hw/pc.h
>>>>>> index 0dcbee7..6d5730b 100644
>>>>>> --- a/hw/pc.h
>>>>>> +++ b/hw/pc.h
>>>>>> @@ -176,7 +176,6 @@ struct PCII440FXState;
>>>>>> typedef struct PCII440FXState PCII440FXState;
>>>>>>
>>>>>> PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size);
>>>>>> -PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
>>>>>> void i440fx_init_memory_mappings(PCII440FXState *d);
>>>>>>
>>>>>> /* piix4.c */
>>>>>> diff --git a/hw/pc_piix.c b/hw/pc_piix.c
>>>>>> index 9a22a8a..ba198de 100644
>>>>>> --- a/hw/pc_piix.c
>>>>>> +++ b/hw/pc_piix.c
>>>>>> @@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
>>>>>> isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
>>>>>>
>>>>>> if (pci_enabled) {
>>>>>> - if (!xen_enabled()) {
>>>>>> - pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>>>>> - } else {
>>>>>> - pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>>>>> - }
>>>>>> + pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
>>>>>> } else {
>>>>>> pci_bus = NULL;
>>>>>> i440fx_state = NULL;
>>>>>> diff --git a/hw/piix_pci.c b/hw/piix_pci.c
>>>>>> index 7f1c4cc..3d73a42 100644
>>>>>> --- a/hw/piix_pci.c
>>>>>> +++ b/hw/piix_pci.c
>>>>>> @@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
>>>>>>
>>>>>> #define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */
>>>>>> #define PIIX_NUM_PIRQS 4ULL /* PIRQ[A-D] */
>>>>>> +#define XEN_PIIX_NUM_PIRQS 128ULL
>>>>>> #define PIIX_PIRQC 0x60
>>>>>>
>>>>>> typedef struct PIIX3State {
>>>>>> @@ -78,6 +79,8 @@ struct PCII440FXState {
>>>>>> #define I440FX_SMRAM 0x72
>>>>>>
>>>>>> static void piix3_set_irq(void *opaque, int pirq, int level);
>>>>>> +static void piix3_write_config_xen(PCIDevice *dev,
>>>>>> + uint32_t address, uint32_t val, int len);
>>>>>>
>>>>>> /* return the global irq number corresponding to a given device irq
>>>>>> pin. We could also use the bus number to have a more precise
>>>>>> @@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> -static void i440fx_write_config_xen(PCIDevice *dev,
>>>>>> - uint32_t address, uint32_t val, int len)
>>>>>> -{
>>>>>> - xen_piix_pci_write_config_client(address, val, len);
>>>>>> - i440fx_write_config(dev, address, val, len);
>>>>>> -}
>>>>>> -
>>>>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>>>>>> {
>>>>>> PCII440FXState *d = opaque;
>>>>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
>>>>>> d = pci_create_simple(b, 0, device_name);
>>>>>> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>>>>>
>>>>>> - piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>>>>>> + if (xen_enabled()) {
>>>>>> + piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
>>>>>> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>>>>>> + piix3, XEN_PIIX_NUM_PIRQS);
>>>>>
>>>>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
>>>>
>>>> It is still a piix3, but also provides non-legacy interrupt links to the
>>>> IO-APIC.
>>>> The four pins of each PCI device on the bus not only are routed to the
>>>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
>>>> they are connected to the IO-APIC directly.
>>>> These additional routes can only be discovered through ACPI, so you need
>>>> matching ACPI tables. We used to build the old ACPI tables like this:
>>>>
>>>> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
>>>> printf("Name(PRTA, Package() {\n");
>>>> for ( dev = 1; dev < 32; dev++ )
>>>> for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
>>>> printf("Package(){0x%04xffff, %u, 0, %u},\n",
>>>> dev, intx, ((dev*4+dev/8+intx)&31)+16);
>>>> printf("})\n");
>>>>
>>>
>>> Interesting concept, but completely non-standard and very much
>>> different from real hardware. Please at least add a comment there to
>>> show readers that Xen is doing a hack which is not at all related to
>>> how the PIIX really works.
>>
>> Isn't this more a function of the "wires" on the motherboard than the
>> PIIX specifically? i.e. this just encodes the permutation of the wires
>> from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
>> which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
>
> Interrupts with PCI work slightly different. PCI devices can map (themselves or by software) to one of 4 interrupt lines: INTA, INTB, INTC, INTD. These get converted using PCI host controller specific logic to 4 interrupt lines which then go into the IO-APIC.
>
> The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could be 26 though.
>
> I haven't seen a single case where PCI devices have a direct link to the IO-APIC. I also have not seen any PCI host controller that exports more than 4 interrupts. Giving each PCI device its own line, on top of that more than ever could be in real hardware, is a plain hack IMHO.
>
> Did this really give you actual performance/latency/scalability gains? I still think for devices that matter, we should go with MSI rather than deriving from real hw.
I bet the motivation is to have an IRQ route that is independent of
QEMU, thus can be discovered inside the Xen kernel and then remains
stable - or is simply hard-wired. Device assignment? Direct legacy IRQ
injection from the kernel?
KVM (qemu-kvm) has that issue as well and uses a simple hack to get a
notification on routing changes. That works as long as there is only one
hub (the PIIX3) involved. We need something smarter on the long term,
though.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [Xen-devel] Re: [PATCH] xen: fix interrupt routing
2011-06-14 12:30 ` Alexander Graf
2011-06-14 13:08 ` [Qemu-devel] " Jan Kiszka
@ 2011-06-14 13:27 ` Stefano Stabellini
2011-06-14 15:30 ` [Qemu-devel] " Jan Kiszka
2011-06-15 8:16 ` Alexander Graf
1 sibling, 2 replies; 19+ messages in thread
From: Stefano Stabellini @ 2011-06-14 13:27 UTC (permalink / raw)
To: Alexander Graf
Cc: xen-devel@lists.xensource.com, Stefano Stabellini,
qemu-devel@nongnu.org, Ian Campbell, Isaku Yamahata
On Tue, 14 Jun 2011, Alexander Graf wrote:
> >>>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
> >>>>> {
> >>>>> PCII440FXState *d = opaque;
> >>>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
> >>>>> d = pci_create_simple(b, 0, device_name);
> >>>>> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
> >>>>>
> >>>>> - piix3 = DO_UPCAST(PIIX3State, dev,
> >>>>> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
> >>>>> + if (xen_enabled()) {
> >>>>> + piix3 = DO_UPCAST(PIIX3State, dev,
> >>>>> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
> >>>>> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> >>>>> + piix3, XEN_PIIX_NUM_PIRQS);
> >>>>
> >>>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
> >>>
> >>> It is still a piix3, but also provides non-legacy interrupt links to the
> >>> IO-APIC.
> >>> The four pins of each PCI device on the bus not only are routed to the
> >>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
> >>> they are connected to the IO-APIC directly.
> >>> These additional routes can only be discovered through ACPI, so you need
> >>> matching ACPI tables. We used to build the old ACPI tables like this:
> >>>
> >>> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
> >>> printf("Name(PRTA, Package() {\n");
> >>> for ( dev = 1; dev < 32; dev++ )
> >>> for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
> >>> printf("Package(){0x%04xffff, %u, 0, %u},\n",
> >>> dev, intx, ((dev*4+dev/8+intx)&31)+16);
> >>> printf("})\n");
> >>>
> >>
> >> Interesting concept, but completely non-standard and very much
> >> different from real hardware. Please at least add a comment there to
> >> show readers that Xen is doing a hack which is not at all related to
> >> how the PIIX really works.
> >
> > Isn't this more a function of the "wires" on the motherboard than the
> > PIIX specifically? i.e. this just encodes the permutation of the wires
> > from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
> > which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
>
> Interrupts with PCI work slightly different. PCI devices can map (themselves or by software) to one of 4 interrupt lines: INTA, INTB, INTC, INTD. These get converted using PCI host controller specific logic to 4 interrupt lines which then go into the IO-APIC.
>
> The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could be 26 though.
The number of redirection entries in the IOAPIC can be discovered
reading from the IOAPICVER register and it is a property of a specific
model of IOAPIC. As a matter of fact Xen's emulated IOAPIC supports more
pins than the most popular IOAPIC used with PIIX3.
> I haven't seen a single case where PCI devices have a direct link to the IO-APIC. I also have not seen any PCI host controller that exports more than 4 interrupts. Giving each PCI device its own line, on top of that more than ever could be in real hardware, is a plain hack IMHO.
Actually this happens quite often: if I am not mistaken all the GSIs
higher than 15 are actually the result of a direct connection between
an interrupt source and the IOAPIC. I have several on my testboxes.
Also give a look at the Intel Multiprocessor Specification, section
3.6.2.3: as you can see from the diagram in "Symmetric I/O Mode" all the
interrupts are routed through the IOAPIC directly.
> Did this really give you actual performance/latency/scalability gains? I still think for devices that matter, we should go with MSI rather than deriving from real hw.
>
Not all the operating systems support MSIs, it is nice to be able to
avoid interrupt sharing without recurring to MSIs.
Also this is how Xen has been working for more then 5 years in HVM mode,
so this configuration is well tested and supported by most operating
systems (at least all the ones we tried so far).
In any case I think it is a good idea to add a comment to better explain
what we are doing, see below.
commit 973bb091a967fdec37a1bc8fe30d46a483d2903d
Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Date: Tue May 17 12:10:36 2011 +0000
xen: fix interrupt routing
- remove i440FX-xen and i440fx_write_config_xen
we don't need to intercept pci config writes to i440FX anymore;
- introduce PIIX3-xen and piix3_write_config_xen
we do need to intercept pci config write to the PCI-ISA bridge to update
the PCI link routing;
- set the number of PIIX3-xen interrupts line to 128;
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
diff --git a/hw/pc.h b/hw/pc.h
index 0dcbee7..6d5730b 100644
--- a/hw/pc.h
+++ b/hw/pc.h
@@ -176,7 +176,6 @@ struct PCII440FXState;
typedef struct PCII440FXState PCII440FXState;
PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn, qemu_irq *pic, ram_addr_t ram_size);
-PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn, qemu_irq *pic, ram_addr_t ram_size);
void i440fx_init_memory_mappings(PCII440FXState *d);
/* piix4.c */
diff --git a/hw/pc_piix.c b/hw/pc_piix.c
index 9a22a8a..ba198de 100644
--- a/hw/pc_piix.c
+++ b/hw/pc_piix.c
@@ -124,11 +124,7 @@ static void pc_init1(ram_addr_t ram_size,
isa_irq = qemu_allocate_irqs(isa_irq_handler, isa_irq_state, 24);
if (pci_enabled) {
- if (!xen_enabled()) {
- pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
- } else {
- pci_bus = i440fx_xen_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
- }
+ pci_bus = i440fx_init(&i440fx_state, &piix3_devfn, isa_irq, ram_size);
} else {
pci_bus = NULL;
i440fx_state = NULL;
diff --git a/hw/piix_pci.c b/hw/piix_pci.c
index 7f1c4cc..1e29ec4 100644
--- a/hw/piix_pci.c
+++ b/hw/piix_pci.c
@@ -40,6 +40,7 @@ typedef PCIHostState I440FXState;
#define PIIX_NUM_PIC_IRQS 16 /* i8259 * 2 */
#define PIIX_NUM_PIRQS 4ULL /* PIRQ[A-D] */
+#define XEN_PIIX_NUM_PIRQS 128ULL
#define PIIX_PIRQC 0x60
typedef struct PIIX3State {
@@ -78,6 +79,8 @@ struct PCII440FXState {
#define I440FX_SMRAM 0x72
static void piix3_set_irq(void *opaque, int pirq, int level);
+static void piix3_write_config_xen(PCIDevice *dev,
+ uint32_t address, uint32_t val, int len);
/* return the global irq number corresponding to a given device irq
pin. We could also use the bus number to have a more precise
@@ -173,13 +176,6 @@ static void i440fx_write_config(PCIDevice *dev,
}
}
-static void i440fx_write_config_xen(PCIDevice *dev,
- uint32_t address, uint32_t val, int len)
-{
- xen_piix_pci_write_config_client(address, val, len);
- i440fx_write_config(dev, address, val, len);
-}
-
static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
{
PCII440FXState *d = opaque;
@@ -267,8 +263,21 @@ static PCIBus *i440fx_common_init(const char *device_name,
d = pci_create_simple(b, 0, device_name);
*pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
- piix3 = DO_UPCAST(PIIX3State, dev,
- pci_create_simple_multifunction(b, -1, true, "PIIX3"));
+ /* Xen supports additional interrupt routes from the PCI devices to
+ * the IOAPIC: the four pins of each PCI device on the bus are also
+ * connected to the IOAPIC directly.
+ * These additional routes can be discovered through ACPI. */
+ if (xen_enabled()) {
+ piix3 = DO_UPCAST(PIIX3State, dev,
+ pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
+ pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
+ piix3, XEN_PIIX_NUM_PIRQS);
+ } else {
+ piix3 = DO_UPCAST(PIIX3State, dev,
+ pci_create_simple_multifunction(b, -1, true, "PIIX3"));
+ pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, piix3,
+ PIIX_NUM_PIRQS);
+ }
piix3->pic = pic;
(*pi440fx_state)->piix3 = piix3;
@@ -289,21 +298,6 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
PCIBus *b;
b = i440fx_common_init("i440FX", pi440fx_state, piix3_devfn, pic, ram_size);
- pci_bus_irqs(b, piix3_set_irq, pci_slot_get_pirq, (*pi440fx_state)->piix3,
- PIIX_NUM_PIRQS);
-
- return b;
-}
-
-PCIBus *i440fx_xen_init(PCII440FXState **pi440fx_state, int *piix3_devfn,
- qemu_irq *pic, ram_addr_t ram_size)
-{
- PCIBus *b;
-
- b = i440fx_common_init("i440FX-xen", pi440fx_state, piix3_devfn, pic, ram_size);
- pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
- (*pi440fx_state)->piix3, PIIX_NUM_PIRQS);
-
return b;
}
@@ -365,6 +359,13 @@ static void piix3_write_config(PCIDevice *dev,
}
}
+static void piix3_write_config_xen(PCIDevice *dev,
+ uint32_t address, uint32_t val, int len)
+{
+ xen_piix_pci_write_config_client(address, val, len);
+ piix3_write_config(dev, address, val, len);
+}
+
static void piix3_reset(void *opaque)
{
PIIX3State *d = opaque;
@@ -465,14 +466,6 @@ static PCIDeviceInfo i440fx_info[] = {
.init = i440fx_initfn,
.config_write = i440fx_write_config,
},{
- .qdev.name = "i440FX-xen",
- .qdev.desc = "Host bridge",
- .qdev.size = sizeof(PCII440FXState),
- .qdev.vmsd = &vmstate_i440fx,
- .qdev.no_user = 1,
- .init = i440fx_initfn,
- .config_write = i440fx_write_config_xen,
- },{
.qdev.name = "PIIX3",
.qdev.desc = "ISA bridge",
.qdev.size = sizeof(PIIX3State),
@@ -482,6 +475,15 @@ static PCIDeviceInfo i440fx_info[] = {
.init = piix3_initfn,
.config_write = piix3_write_config,
},{
+ .qdev.name = "PIIX3-xen",
+ .qdev.desc = "ISA bridge",
+ .qdev.size = sizeof(PIIX3State),
+ .qdev.vmsd = &vmstate_piix3,
+ .qdev.no_user = 1,
+ .no_hotplug = 1,
+ .init = piix3_initfn,
+ .config_write = piix3_write_config_xen,
+ },{
/* end of list */
}
};
^ permalink raw reply related [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
2011-06-14 13:08 ` [Qemu-devel] " Jan Kiszka
@ 2011-06-14 13:31 ` Stefano Stabellini
0 siblings, 0 replies; 19+ messages in thread
From: Stefano Stabellini @ 2011-06-14 13:31 UTC (permalink / raw)
To: Jan Kiszka
Cc: xen-devel@lists.xensource.com, Stefano Stabellini, Alexander Graf,
qemu-devel@nongnu.org, Ian Campbell, Isaku Yamahata
On Tue, 14 Jun 2011, Jan Kiszka wrote:
> I bet the motivation is to have an IRQ route that is independent of
> QEMU, thus can be discovered inside the Xen kernel and then remains
> stable - or is simply hard-wired. Device assignment? Direct legacy IRQ
> injection from the kernel?
>
This code predates me, so Keir might have more insights on the original
motivation, but you are correct, they are hard-wired and can be used for
device-assignment as well.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
2011-06-14 13:27 ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
@ 2011-06-14 15:30 ` Jan Kiszka
2011-06-14 18:18 ` Stefano Stabellini
2011-06-15 8:16 ` Alexander Graf
1 sibling, 1 reply; 19+ messages in thread
From: Jan Kiszka @ 2011-06-14 15:30 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel@lists.xensource.com, Alexander Graf,
qemu-devel@nongnu.org, Ian Campbell, Isaku Yamahata
On 2011-06-14 15:27, Stefano Stabellini wrote:
> On Tue, 14 Jun 2011, Alexander Graf wrote:
>>>>>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>>>>>>> {
>>>>>>> PCII440FXState *d = opaque;
>>>>>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
>>>>>>> d = pci_create_simple(b, 0, device_name);
>>>>>>> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>>>>>>
>>>>>>> - piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>>> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>>>>>>> + if (xen_enabled()) {
>>>>>>> + piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>>> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
>>>>>>> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>>>>>>> + piix3, XEN_PIIX_NUM_PIRQS);
>>>>>>
>>>>>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
>>>>>
>>>>> It is still a piix3, but also provides non-legacy interrupt links to the
>>>>> IO-APIC.
>>>>> The four pins of each PCI device on the bus not only are routed to the
>>>>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
>>>>> they are connected to the IO-APIC directly.
>>>>> These additional routes can only be discovered through ACPI, so you need
>>>>> matching ACPI tables. We used to build the old ACPI tables like this:
>>>>>
>>>>> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
>>>>> printf("Name(PRTA, Package() {\n");
>>>>> for ( dev = 1; dev < 32; dev++ )
>>>>> for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
>>>>> printf("Package(){0x%04xffff, %u, 0, %u},\n",
>>>>> dev, intx, ((dev*4+dev/8+intx)&31)+16);
>>>>> printf("})\n");
>>>>>
>>>>
>>>> Interesting concept, but completely non-standard and very much
>>>> different from real hardware. Please at least add a comment there to
>>>> show readers that Xen is doing a hack which is not at all related to
>>>> how the PIIX really works.
>>>
>>> Isn't this more a function of the "wires" on the motherboard than the
>>> PIIX specifically? i.e. this just encodes the permutation of the wires
>>> from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
>>> which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
>>
>> Interrupts with PCI work slightly different. PCI devices can map (themselves or by software) to one of 4 interrupt lines: INTA, INTB, INTC, INTD. These get converted using PCI host controller specific logic to 4 interrupt lines which then go into the IO-APIC.
>>
>> The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could be 26 though.
>
> The number of redirection entries in the IOAPIC can be discovered
> reading from the IOAPICVER register and it is a property of a specific
> model of IOAPIC. As a matter of fact Xen's emulated IOAPIC supports more
> pins than the most popular IOAPIC used with PIIX3.
Do real IOAPICs exist with more than 24 pins? Otherwise there is the
risk that OSes aren't well prepared for this oddity - specifically not
when the chipset is specified to include a 24-pin IOAPIC.
>
>
>> I haven't seen a single case where PCI devices have a direct link to the IO-APIC. I also have not seen any PCI host controller that exports more than 4 interrupts. Giving each PCI device its own line, on top of that more than ever could be in real hardware, is a plain hack IMHO.
>
> Actually this happens quite often: if I am not mistaken all the GSIs
> higher than 15 are actually the result of a direct connection between
> an interrupt source and the IOAPIC. I have several on my testboxes.
Except that the interrupt source is the chipset with its PCI bridge, not
individual PCI devices.
Jan
--
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
2011-06-14 15:30 ` [Qemu-devel] " Jan Kiszka
@ 2011-06-14 18:18 ` Stefano Stabellini
2011-06-15 8:24 ` Alexander Graf
0 siblings, 1 reply; 19+ messages in thread
From: Stefano Stabellini @ 2011-06-14 18:18 UTC (permalink / raw)
To: Jan Kiszka
Cc: xen-devel@lists.xensource.com, Stefano Stabellini,
qemu-devel@nongnu.org, Alexander Graf, Ian Campbell,
Isaku Yamahata
On Tue, 14 Jun 2011, Jan Kiszka wrote:
> On 2011-06-14 15:27, Stefano Stabellini wrote:
> > On Tue, 14 Jun 2011, Alexander Graf wrote:
> >>>>>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
> >>>>>>> {
> >>>>>>> PCII440FXState *d = opaque;
> >>>>>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
> >>>>>>> d = pci_create_simple(b, 0, device_name);
> >>>>>>> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
> >>>>>>>
> >>>>>>> - piix3 = DO_UPCAST(PIIX3State, dev,
> >>>>>>> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
> >>>>>>> + if (xen_enabled()) {
> >>>>>>> + piix3 = DO_UPCAST(PIIX3State, dev,
> >>>>>>> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
> >>>>>>> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
> >>>>>>> + piix3, XEN_PIIX_NUM_PIRQS);
> >>>>>>
> >>>>>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
> >>>>>
> >>>>> It is still a piix3, but also provides non-legacy interrupt links to the
> >>>>> IO-APIC.
> >>>>> The four pins of each PCI device on the bus not only are routed to the
> >>>>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
> >>>>> they are connected to the IO-APIC directly.
> >>>>> These additional routes can only be discovered through ACPI, so you need
> >>>>> matching ACPI tables. We used to build the old ACPI tables like this:
> >>>>>
> >>>>> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
> >>>>> printf("Name(PRTA, Package() {\n");
> >>>>> for ( dev = 1; dev < 32; dev++ )
> >>>>> for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
> >>>>> printf("Package(){0x%04xffff, %u, 0, %u},\n",
> >>>>> dev, intx, ((dev*4+dev/8+intx)&31)+16);
> >>>>> printf("})\n");
> >>>>>
> >>>>
> >>>> Interesting concept, but completely non-standard and very much
> >>>> different from real hardware. Please at least add a comment there to
> >>>> show readers that Xen is doing a hack which is not at all related to
> >>>> how the PIIX really works.
> >>>
> >>> Isn't this more a function of the "wires" on the motherboard than the
> >>> PIIX specifically? i.e. this just encodes the permutation of the wires
> >>> from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
> >>> which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
> >>
> >> Interrupts with PCI work slightly different. PCI devices can map (themselves or by software) to one of 4 interrupt lines: INTA, INTB, INTC, INTD. These get converted using PCI host controller specific logic to 4 interrupt lines which then go into the IO-APIC.
> >>
> >> The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could be 26 though.
> >
> > The number of redirection entries in the IOAPIC can be discovered
> > reading from the IOAPICVER register and it is a property of a specific
> > model of IOAPIC. As a matter of fact Xen's emulated IOAPIC supports more
> > pins than the most popular IOAPIC used with PIIX3.
>
> Do real IOAPICs exist with more than 24 pins? Otherwise there is the
> risk that OSes aren't well prepared for this oddity - specifically not
> when the chipset is specified to include a 24-pin IOAPIC.
Linux supports up to 128 pins and as I wrote before all the other OSes
we tested so far seem to react well.
> >> I haven't seen a single case where PCI devices have a direct link to the IO-APIC. I also have not seen any PCI host controller that exports more than 4 interrupts. Giving each PCI device its own line, on top of that more than ever could be in real hardware, is a plain hack IMHO.
> >
> > Actually this happens quite often: if I am not mistaken all the GSIs
> > higher than 15 are actually the result of a direct connection between
> > an interrupt source and the IOAPIC. I have several on my testboxes.
>
> Except that the interrupt source is the chipset with its PCI bridge, not
> individual PCI devices.
That is the most common configuration but it is not the only one: I have
an ACPI table that has individual PCI devices as source in some test
boxes.
In fact there is even an example of it in this good article about
interrupt routing from the FreeBSD guys (it is the last figure):
http://people.freebsd.org/~jhb/papers/bsdcan/2007/article/node5.html
"Figure 6 contains a portion of an example _PRT. Specifically, it
includes the first entry in the table. This corresponds to the PCI
interrupt for PCI bus 3, slot 7"
..ZIP...
"For APIC mode, the interrupt is routed to GSI 66. For this machine,
ACPI assigns a base GSI of 64 to the I/O APIC with an APIC ID of 10.
Thus, GSI 66 corresponds to pin 2 on that I/O APIC"
Unless I am missing something I don't think that interrupt is going
through any PCI bridges...
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [Xen-devel] Re: [PATCH] xen: fix interrupt routing
2011-06-14 13:27 ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
2011-06-14 15:30 ` [Qemu-devel] " Jan Kiszka
@ 2011-06-15 8:16 ` Alexander Graf
2011-06-15 16:32 ` Stefano Stabellini
1 sibling, 1 reply; 19+ messages in thread
From: Alexander Graf @ 2011-06-15 8:16 UTC (permalink / raw)
To: Stefano Stabellini
Cc: Ian Campbell, Isaku Yamahata, xen-devel@lists.xensource.com,
qemu-devel@nongnu.org
On 14.06.2011, at 15:27, Stefano Stabellini wrote:
> On Tue, 14 Jun 2011, Alexander Graf wrote:
>>>>>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>>>>>>> {
>>>>>>> PCII440FXState *d = opaque;
>>>>>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
>>>>>>> d = pci_create_simple(b, 0, device_name);
>>>>>>> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>>>>>>
>>>>>>> - piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>>> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>>>>>>> + if (xen_enabled()) {
>>>>>>> + piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>>> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
>>>>>>> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>>>>>>> + piix3, XEN_PIIX_NUM_PIRQS);
>>>>>>
>>>>>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
>>>>>
>>>>> It is still a piix3, but also provides non-legacy interrupt links to the
>>>>> IO-APIC.
>>>>> The four pins of each PCI device on the bus not only are routed to the
>>>>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
>>>>> they are connected to the IO-APIC directly.
>>>>> These additional routes can only be discovered through ACPI, so you need
>>>>> matching ACPI tables. We used to build the old ACPI tables like this:
>>>>>
>>>>> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
>>>>> printf("Name(PRTA, Package() {\n");
>>>>> for ( dev = 1; dev < 32; dev++ )
>>>>> for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
>>>>> printf("Package(){0x%04xffff, %u, 0, %u},\n",
>>>>> dev, intx, ((dev*4+dev/8+intx)&31)+16);
>>>>> printf("})\n");
>>>>>
>>>>
>>>> Interesting concept, but completely non-standard and very much
>>>> different from real hardware. Please at least add a comment there to
>>>> show readers that Xen is doing a hack which is not at all related to
>>>> how the PIIX really works.
>>>
>>> Isn't this more a function of the "wires" on the motherboard than the
>>> PIIX specifically? i.e. this just encodes the permutation of the wires
>>> from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
>>> which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
>>
>> Interrupts with PCI work slightly different. PCI devices can map (themselves or by software) to one of 4 interrupt lines: INTA, INTB, INTC, INTD. These get converted using PCI host controller specific logic to 4 interrupt lines which then go into the IO-APIC.
>>
>> The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could be 26 though.
>
> The number of redirection entries in the IOAPIC can be discovered
> reading from the IOAPICVER register and it is a property of a specific
> model of IOAPIC. As a matter of fact Xen's emulated IOAPIC supports more
> pins than the most popular IOAPIC used with PIIX3.
which means you're emulating hardware that never existed :).
>
>
>> I haven't seen a single case where PCI devices have a direct link to the IO-APIC. I also have not seen any PCI host controller that exports more than 4 interrupts. Giving each PCI device its own line, on top of that more than ever could be in real hardware, is a plain hack IMHO.
>
> Actually this happens quite often: if I am not mistaken all the GSIs
> higher than 15 are actually the result of a direct connection between
> an interrupt source and the IOAPIC. I have several on my testboxes.
Yes. "Interrupt source" meaning a wire on the board. I haven't seen any situation so far where you get direct IO-APIC connections to PCI _device_ pins. You obviously get plenty connections to PCI _bus_ pins.
> Also give a look at the Intel Multiprocessor Specification, section
> 3.6.2.3: as you can see from the diagram in "Symmetric I/O Mode" all the
> interrupts are routed through the IOAPIC directly.
>
>
>> Did this really give you actual performance/latency/scalability gains? I still think for devices that matter, we should go with MSI rather than deriving from real hw.
>>
>
> Not all the operating systems support MSIs, it is nice to be able to
> avoid interrupt sharing without recurring to MSIs.
Yes and no. It's a tradeoff. If no interrupt sharing means that we emulate hardware that simply never could have existed the way we model it, I think it's a bad idea.
> Also this is how Xen has been working for more then 5 years in HVM mode,
> so this configuration is well tested and supported by most operating
> systems (at least all the ones we tried so far).
I'm fine with Xen breaking its own neck, as long as it doesn't affect non-Xen code paths. Just be aware that I'm not a huge fan of this approach :).
> In any case I think it is a good idea to add a comment to better explain
> what we are doing, see below.
>
>
>
> commit 973bb091a967fdec37a1bc8fe30d46a483d2903d
> Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> Date: Tue May 17 12:10:36 2011 +0000
>
> xen: fix interrupt routing
>
> - remove i440FX-xen and i440fx_write_config_xen
> we don't need to intercept pci config writes to i440FX anymore;
>
> - introduce PIIX3-xen and piix3_write_config_xen
> we do need to intercept pci config write to the PCI-ISA bridge to update
> the PCI link routing;
>
> - set the number of PIIX3-xen interrupts line to 128;
I still find it unpretty and I'm pretty sure it's completely different from real hardware, but since Xen code is your call and this doesn't affect non-Xen workloads, I won't block it, unless someone else is very much opposed to it.
Please resend as proper patch.
Alex
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [PATCH] xen: fix interrupt routing
2011-06-14 18:18 ` Stefano Stabellini
@ 2011-06-15 8:24 ` Alexander Graf
2011-06-15 16:34 ` [Qemu-devel] [Xen-devel] " Avi Kivity
0 siblings, 1 reply; 19+ messages in thread
From: Alexander Graf @ 2011-06-15 8:24 UTC (permalink / raw)
To: Stefano Stabellini
Cc: xen-devel@lists.xensource.com, Jan Kiszka, qemu-devel@nongnu.org,
Ian Campbell, Isaku Yamahata
On 14.06.2011, at 20:18, Stefano Stabellini wrote:
> On Tue, 14 Jun 2011, Jan Kiszka wrote:
>> On 2011-06-14 15:27, Stefano Stabellini wrote:
>>> On Tue, 14 Jun 2011, Alexander Graf wrote:
>>>>>>>>> static int i440fx_load_old(QEMUFile* f, void *opaque, int version_id)
>>>>>>>>> {
>>>>>>>>> PCII440FXState *d = opaque;
>>>>>>>>> @@ -267,8 +263,17 @@ static PCIBus *i440fx_common_init(const char *device_name,
>>>>>>>>> d = pci_create_simple(b, 0, device_name);
>>>>>>>>> *pi440fx_state = DO_UPCAST(PCII440FXState, dev, d);
>>>>>>>>>
>>>>>>>>> - piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>>>>> - pci_create_simple_multifunction(b, -1, true, "PIIX3"));
>>>>>>>>> + if (xen_enabled()) {
>>>>>>>>> + piix3 = DO_UPCAST(PIIX3State, dev,
>>>>>>>>> + pci_create_simple_multifunction(b, -1, true, "PIIX3-xen"));
>>>>>>>>> + pci_bus_irqs(b, xen_piix3_set_irq, xen_pci_slot_get_pirq,
>>>>>>>>> + piix3, XEN_PIIX_NUM_PIRQS);
>>>>>>>>
>>>>>>>> But with XEN_PIIX_NUM_PIRQS it's not a piix3 anymore, no? What's the reason behind this change?
>>>>>>>
>>>>>>> It is still a piix3, but also provides non-legacy interrupt links to the
>>>>>>> IO-APIC.
>>>>>>> The four pins of each PCI device on the bus not only are routed to the
>>>>>>> normal four pirqs (programmed writing to 0x60-0x63, see above) but also
>>>>>>> they are connected to the IO-APIC directly.
>>>>>>> These additional routes can only be discovered through ACPI, so you need
>>>>>>> matching ACPI tables. We used to build the old ACPI tables like this:
>>>>>>>
>>>>>>> /* PRTA: APIC routing table (via non-legacy IOAPIC GSIs). */
>>>>>>> printf("Name(PRTA, Package() {\n");
>>>>>>> for ( dev = 1; dev < 32; dev++ )
>>>>>>> for ( intx = 0; intx < 4; intx++ ) /* INTA-D */
>>>>>>> printf("Package(){0x%04xffff, %u, 0, %u},\n",
>>>>>>> dev, intx, ((dev*4+dev/8+intx)&31)+16);
>>>>>>> printf("})\n");
>>>>>>>
>>>>>>
>>>>>> Interesting concept, but completely non-standard and very much
>>>>>> different from real hardware. Please at least add a comment there to
>>>>>> show readers that Xen is doing a hack which is not at all related to
>>>>>> how the PIIX really works.
>>>>>
>>>>> Isn't this more a function of the "wires" on the motherboard than the
>>>>> PIIX specifically? i.e. this just encodes the permutation of the wires
>>>>> from the PCI slots into the IO-APIC input pins (bypassing the PIIX,
>>>>> which is only used for legacy ISA IRQs i.e. by non-APIC aware OSes)?
>>>>
>>>> Interrupts with PCI work slightly different. PCI devices can map (themselves or by software) to one of 4 interrupt lines: INTA, INTB, INTC, INTD. These get converted using PCI host controller specific logic to 4 interrupt lines which then go into the IO-APIC.
>>>>
>>>> The IO-APIC is a chip with a limited number of pins. IIRC it was 24, could be 26 though.
>>>
>>> The number of redirection entries in the IOAPIC can be discovered
>>> reading from the IOAPICVER register and it is a property of a specific
>>> model of IOAPIC. As a matter of fact Xen's emulated IOAPIC supports more
>>> pins than the most popular IOAPIC used with PIIX3.
>>
>> Do real IOAPICs exist with more than 24 pins? Otherwise there is the
>> risk that OSes aren't well prepared for this oddity - specifically not
>> when the chipset is specified to include a 24-pin IOAPIC.
>
> Linux supports up to 128 pins and as I wrote before all the other OSes
> we tested so far seem to react well.
>
>
>>>> I haven't seen a single case where PCI devices have a direct link to the IO-APIC. I also have not seen any PCI host controller that exports more than 4 interrupts. Giving each PCI device its own line, on top of that more than ever could be in real hardware, is a plain hack IMHO.
>>>
>>> Actually this happens quite often: if I am not mistaken all the GSIs
>>> higher than 15 are actually the result of a direct connection between
>>> an interrupt source and the IOAPIC. I have several on my testboxes.
>>
>> Except that the interrupt source is the chipset with its PCI bridge, not
>> individual PCI devices.
>
> That is the most common configuration but it is not the only one: I have
> an ACPI table that has individual PCI devices as source in some test
> boxes.
> In fact there is even an example of it in this good article about
> interrupt routing from the FreeBSD guys (it is the last figure):
>
> http://people.freebsd.org/~jhb/papers/bsdcan/2007/article/node5.html
>
> "Figure 6 contains a portion of an example _PRT. Specifically, it
> includes the first entry in the table. This corresponds to the PCI
> interrupt for PCI bus 3, slot 7"
>
> ..ZIP...
>
> "For APIC mode, the interrupt is routed to GSI 66. For this machine,
> ACPI assigns a base GSI of 64 to the I/O APIC with an APIC ID of 10.
> Thus, GSI 66 corresponds to pin 2 on that I/O APIC"
>
> Unless I am missing something I don't think that interrupt is going
> through any PCI bridges...
I'm actually not quite sure what exactly he's describing here. But if it's bypassing the bus logic, it's not a normal PCI device :). Sure, there are special case devices that also expose a PCI interface. But real PCI cards that you plug in onto the PCI bus can't bypass the interrupt logic of the bus, as the only interrupt wires they have go to the bus. And since the PCI adapters we use in PC machines in Qemu are all non-special, guests can possibly choke on this.
But either way, I won't block the patch as I mentioned before.
Alex
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [Xen-devel] Re: [PATCH] xen: fix interrupt routing
2011-06-15 8:16 ` Alexander Graf
@ 2011-06-15 16:32 ` Stefano Stabellini
0 siblings, 0 replies; 19+ messages in thread
From: Stefano Stabellini @ 2011-06-15 16:32 UTC (permalink / raw)
To: Alexander Graf
Cc: xen-devel@lists.xensource.com, Stefano Stabellini,
qemu-devel@nongnu.org, Ian Campbell, Isaku Yamahata
On Wed, 15 Jun 2011, Alexander Graf wrote:
> > commit 973bb091a967fdec37a1bc8fe30d46a483d2903d
> > Author: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
> > Date: Tue May 17 12:10:36 2011 +0000
> >
> > xen: fix interrupt routing
> >
> > - remove i440FX-xen and i440fx_write_config_xen
> > we don't need to intercept pci config writes to i440FX anymore;
> >
> > - introduce PIIX3-xen and piix3_write_config_xen
> > we do need to intercept pci config write to the PCI-ISA bridge to update
> > the PCI link routing;
> >
> > - set the number of PIIX3-xen interrupts line to 128;
>
> I still find it unpretty and I'm pretty sure it's completely different from real hardware, but since Xen code is your call and this doesn't affect non-Xen workloads, I won't block it, unless someone else is very much opposed to it.
>
> Please resend as proper patch.
>
Thanks, I'll do that.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [Xen-devel] Re: [PATCH] xen: fix interrupt routing
2011-06-15 8:24 ` Alexander Graf
@ 2011-06-15 16:34 ` Avi Kivity
2011-06-15 16:54 ` Alexander Graf
0 siblings, 1 reply; 19+ messages in thread
From: Avi Kivity @ 2011-06-15 16:34 UTC (permalink / raw)
To: Alexander Graf
Cc: xen-devel@lists.xensource.com, Stefano Stabellini, Jan Kiszka,
qemu-devel@nongnu.org, Ian Campbell, Isaku Yamahata
On 06/15/2011 11:24 AM, Alexander Graf wrote:
> I'm actually not quite sure what exactly he's describing here. But if it's bypassing the bus logic, it's not a normal PCI device :). Sure, there are special case devices that also expose a PCI interface. But real PCI cards that you plug in onto the PCI bus can't bypass the interrupt logic of the bus, as the only interrupt wires they have go to the bus. And since the PCI adapters we use in PC machines in Qemu are all non-special, guests can possibly choke on this.
>
There actually is a special device in qemu - acpi power management is
configured as a PCI device, but its interrupt is hard-wired to gsi 9 and
is edge-triggered (so it can't share the irq line).
I other devices that are special in this regard to also be part of the
chipset, not devices you can plug into arbitrary slots.
--
error compiling committee.c: too many arguments to function
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Qemu-devel] [Xen-devel] Re: [PATCH] xen: fix interrupt routing
2011-06-15 16:34 ` [Qemu-devel] [Xen-devel] " Avi Kivity
@ 2011-06-15 16:54 ` Alexander Graf
0 siblings, 0 replies; 19+ messages in thread
From: Alexander Graf @ 2011-06-15 16:54 UTC (permalink / raw)
To: Avi Kivity
Cc: xen-devel@lists.xensource.com, Stefano Stabellini, Jan Kiszka,
qemu-devel@nongnu.org, Ian Campbell, Isaku Yamahata
Am 15.06.2011 um 18:34 schrieb Avi Kivity <avi@redhat.com>:
> On 06/15/2011 11:24 AM, Alexander Graf wrote:
>> I'm actually not quite sure what exactly he's describing here. But if it's bypassing the bus logic, it's not a normal PCI device :). Sure, there are special case devices that also expose a PCI interface. But real PCI cards that you plug in onto the PCI bus can't bypass the interrupt logic of the bus, as the only interrupt wires they have go to the bus. And since the PCI adapters we use in PC machines in Qemu are all non-special, guests can possibly choke on this.
>>
>
> There actually is a special device in qemu - acpi power management is configured as a PCI device, but its interrupt is hard-wired to gsi 9 and is edge-triggered (so it can't share the irq line).
>
> I other devices that are special in this regard to also be part of the chipset, not devices you can plug into arbitrary slots.
Sure, platform devices can do that. Real PCI cards can not. Have you ever seen an e1000 with direct mapped interrupt lines? :)
I admit though that we also emulate platform devices that happen to expose themselves on the PCI bus. It's not common though and I wouldn't expect every OS/driver to be happy about it.
Alex
>
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2011-06-15 16:55 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2011-05-26 15:48 [Qemu-devel] [PATCH] xen: fix interrupt routing Stefano Stabellini
2011-05-27 23:17 ` Alexander Graf
2011-05-31 11:05 ` Stefano Stabellini
2011-06-14 9:25 ` Alexander Graf
2011-06-14 12:17 ` [Qemu-devel] [Xen-devel] " Ian Campbell
2011-06-14 12:30 ` Alexander Graf
2011-06-14 13:08 ` [Qemu-devel] " Jan Kiszka
2011-06-14 13:31 ` Stefano Stabellini
2011-06-14 13:27 ` [Qemu-devel] [Xen-devel] " Stefano Stabellini
2011-06-14 15:30 ` [Qemu-devel] " Jan Kiszka
2011-06-14 18:18 ` Stefano Stabellini
2011-06-15 8:24 ` Alexander Graf
2011-06-15 16:34 ` [Qemu-devel] [Xen-devel] " Avi Kivity
2011-06-15 16:54 ` Alexander Graf
2011-06-15 8:16 ` Alexander Graf
2011-06-15 16:32 ` Stefano Stabellini
-- strict thread matches above, loose matches on Subject: below --
2011-05-18 17:53 [Qemu-devel] " stefano.stabellini
2011-05-19 2:14 ` Isaku Yamahata
2011-05-19 19:16 ` Stefano Stabellini
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).