* [PATCH] kvm-ia64 irq assignment 1/2 kernel @ 2008-06-06 15:58 Xu, Anthony 2008-06-06 19:58 ` Avi Kivity 0 siblings, 1 reply; 21+ messages in thread From: Xu, Anthony @ 2008-06-06 15:58 UTC (permalink / raw) To: Avi Kivity, Jes Sorensen, kvm, kvm-ia64 In kvm-ia64, we use the same guest firmware (GFW)as in Xen, GFW uses PRT to present PCI interrupt routing, all PCI devices' interrupt pins connect to IOAPIC, which doesn't match with kvm-ia64 Qemu PCI interrupt routing. This patch modify Qemu PCI interupt routing code to match with GFW, Then PCI devices in qemu can work in kvm-ia64, for exmaple, NIC Signed-off-by: Anthony Xu < anthony.xu@intel.com > diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c index 99a1736..80c116c 100644 --- a/virt/kvm/ioapic.c +++ b/virt/kvm/ioapic.c @@ -272,7 +272,11 @@ void kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int level) if (irq >= 0 && irq < IOAPIC_NUM_PINS) { entry = ioapic->redirtbl[irq]; - level ^= entry.fields.polarity; +// polarity of all devices in qemu is active high +// regardless of ioapic setting + +// level ^= entry.fields.polarity; + if (!level) ioapic->irr &= ~mask; else { ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-06 15:58 [PATCH] kvm-ia64 irq assignment 1/2 kernel Xu, Anthony @ 2008-06-06 19:58 ` Avi Kivity 2008-06-09 8:58 ` Alexander Graf 2008-06-10 6:33 ` Xu, Anthony 0 siblings, 2 replies; 21+ messages in thread From: Avi Kivity @ 2008-06-06 19:58 UTC (permalink / raw) To: Xu, Anthony; +Cc: Jes Sorensen, kvm, kvm-ia64 Xu, Anthony wrote: > In kvm-ia64, we use the same guest firmware (GFW)as in Xen, > GFW uses PRT to present PCI interrupt routing, all PCI devices' > interrupt pins > connect to IOAPIC, which doesn't match with kvm-ia64 Qemu PCI interrupt > routing. > > This patch modify Qemu PCI interupt routing code to match with GFW, > Then PCI devices in qemu can work in kvm-ia64, for exmaple, NIC > > > Signed-off-by: Anthony Xu < anthony.xu@intel.com > > > > diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c > index 99a1736..80c116c 100644 > --- a/virt/kvm/ioapic.c > +++ b/virt/kvm/ioapic.c > @@ -272,7 +272,11 @@ void kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, > int irq, int level) > > if (irq >= 0 && irq < IOAPIC_NUM_PINS) { > entry = ioapic->redirtbl[irq]; > - level ^= entry.fields.polarity; > +// polarity of all devices in qemu is active high > +// regardless of ioapic setting > + > +// level ^= entry.fields.polarity; > + > There are two errors in this patch: - kvm is not there just for qemu; it should be possible to use kvm with some other userspace, which would assume that kvm correctly emulates ioapic polarity - you are modifying shared code and so affect x86 as well I suggest modifying the firmware to report the interrupts as active high. Since Xen does not emulate polarity, the change will not affect it and the firmware can continue to be shared. I'd also recommend fixing Xen to emulate the polarity correctly, if possible. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-06 19:58 ` Avi Kivity @ 2008-06-09 8:58 ` Alexander Graf 2008-06-09 9:16 ` Alexander Graf 2008-06-10 6:33 ` Xu, Anthony 1 sibling, 1 reply; 21+ messages in thread From: Alexander Graf @ 2008-06-09 8:58 UTC (permalink / raw) To: Avi Kivity; +Cc: Xu, Anthony, Jes Sorensen, kvm, kvm-ia64 Avi Kivity wrote: > Xu, Anthony wrote: >> In kvm-ia64, we use the same guest firmware (GFW)as in Xen, GFW uses >> PRT to present PCI interrupt routing, all PCI devices' >> interrupt pins >> connect to IOAPIC, which doesn't match with kvm-ia64 Qemu PCI interrupt >> routing. >> >> This patch modify Qemu PCI interupt routing code to match with GFW, >> Then PCI devices in qemu can work in kvm-ia64, for exmaple, NIC >> >> >> Signed-off-by: Anthony Xu < anthony.xu@intel.com > >> >> >> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c >> index 99a1736..80c116c 100644 >> --- a/virt/kvm/ioapic.c >> +++ b/virt/kvm/ioapic.c >> @@ -272,7 +272,11 @@ void kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, >> int irq, int level) >> >> if (irq >= 0 && irq < IOAPIC_NUM_PINS) { >> entry = ioapic->redirtbl[irq]; >> - level ^= entry.fields.polarity; >> +// polarity of all devices in qemu is active high >> +// regardless of ioapic setting >> + >> +// level ^= entry.fields.polarity; >> + >> > > There are two errors in this patch: > > - kvm is not there just for qemu; it should be possible to use kvm > with some other userspace, which would assume that kvm correctly > emulates ioapic polarity > - you are modifying shared code and so affect x86 as well Apparently this is broken on x86 too. I was just trying this patch with Mac OS X as target and magically the in-kernel APIC starts working, so I guess something is going wrong already here. Btw, according to the ACPI tables, all PCI interrupts are currently defined Active-Low. Alex > > I suggest modifying the firmware to report the interrupts as active > high. Since Xen does not emulate polarity, the change will not affect > it and the firmware can continue to be shared. I'd also recommend > fixing Xen to emulate the polarity correctly, if possible. > ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-09 8:58 ` Alexander Graf @ 2008-06-09 9:16 ` Alexander Graf 2008-06-12 12:24 ` Avi Kivity 0 siblings, 1 reply; 21+ messages in thread From: Alexander Graf @ 2008-06-09 9:16 UTC (permalink / raw) To: Alexander Graf; +Cc: Avi Kivity, Xu, Anthony, Jes Sorensen, kvm, kvm-ia64 Alexander Graf wrote: > Avi Kivity wrote: >> Xu, Anthony wrote: >>> In kvm-ia64, we use the same guest firmware (GFW)as in Xen, GFW uses >>> PRT to present PCI interrupt routing, all PCI devices' >>> interrupt pins >>> connect to IOAPIC, which doesn't match with kvm-ia64 Qemu PCI interrupt >>> routing. >>> >>> This patch modify Qemu PCI interupt routing code to match with GFW, >>> Then PCI devices in qemu can work in kvm-ia64, for exmaple, NIC >>> >>> >>> Signed-off-by: Anthony Xu < anthony.xu@intel.com > >>> >>> >>> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c >>> index 99a1736..80c116c 100644 >>> --- a/virt/kvm/ioapic.c >>> +++ b/virt/kvm/ioapic.c >>> @@ -272,7 +272,11 @@ void kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, >>> int irq, int level) >>> >>> if (irq >= 0 && irq < IOAPIC_NUM_PINS) { >>> entry = ioapic->redirtbl[irq]; >>> - level ^= entry.fields.polarity; >>> +// polarity of all devices in qemu is active high >>> +// regardless of ioapic setting >>> + >>> +// level ^= entry.fields.polarity; >>> + >>> >> >> There are two errors in this patch: >> >> - kvm is not there just for qemu; it should be possible to use kvm >> with some other userspace, which would assume that kvm correctly >> emulates ioapic polarity >> - you are modifying shared code and so affect x86 as well > > Apparently this is broken on x86 too. I was just trying this patch > with Mac OS X as target and magically the in-kernel APIC starts > working, so I guess something is going wrong already here. > Btw, according to the ACPI tables, all PCI interrupts are currently > defined Active-Low. Sorry, ActiveHigh that is. Nevertheless I am having trouble with this since the very first time I used osx inside KVM. Does PCI allow Active > Interrupt (, Level, ActiveHigh, Shared) According to the PCI 3.0 Spec, "Interrupts on PCI are optional and defined as 'level sensitive,' asserted low (negative true)". Alex ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-09 9:16 ` Alexander Graf @ 2008-06-12 12:24 ` Avi Kivity 0 siblings, 0 replies; 21+ messages in thread From: Avi Kivity @ 2008-06-12 12:24 UTC (permalink / raw) To: Alexander Graf; +Cc: Xu, Anthony, Jes Sorensen, kvm, kvm-ia64 Alexander Graf wrote: >> >> Apparently this is broken on x86 too. I was just trying this patch >> with Mac OS X as target and magically the in-kernel APIC starts >> working, so I guess something is going wrong already here. >> Btw, according to the ACPI tables, all PCI interrupts are currently >> defined Active-Low. > So there's something else wrong. > Sorry, ActiveHigh that is. Nevertheless I am having trouble with this > since the very first time I used osx inside KVM. Does PCI allow Active > > > Interrupt (, Level, ActiveHigh, Shared) > > According to the PCI 3.0 Spec, "Interrupts on PCI are optional and > defined as 'level sensitive,' asserted low (negative true)". The pci interrupts are active low, but they are converted to active high by the chipset qemu emulates, so active high is correct. Does OS X boot from the qemu bios or something else? If the latter, it may need adjustment. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-06 19:58 ` Avi Kivity 2008-06-09 8:58 ` Alexander Graf @ 2008-06-10 6:33 ` Xu, Anthony 2008-06-10 7:25 ` Alexander Graf 2008-06-12 12:30 ` Avi Kivity 1 sibling, 2 replies; 21+ messages in thread From: Xu, Anthony @ 2008-06-10 6:33 UTC (permalink / raw) To: Avi Kivity; +Cc: Jes Sorensen, kvm, kvm-ia64 Avi Kivity wrote: > > I suggest modifying the firmware to report the interrupts as active > high. Since Xen does not emulate polarity, the change will not affect > it and the firmware can continue to be shared. I'd also recommend > fixing Xen to emulate the polarity correctly, if possible. Thanks for your comments I agree modifying common code is not a good method. While your suggestion seems be infeasible too. According to acpi spec, only irq <=15 can be configured, such as trigger level, polarity. For irq >15 , means connect to IOAPIC directly, it can't be configured, it must be level triger, active low. I can't find any mechanism in firmware to configure irqs (> 15). Please enlighten me if you have. >From some experimental, Firmware both in IA64 and IA32 doesn't program IOAPIC, It's Guest OS to program. Guest OS gets PRT first, If the PCI device connected to IOAPIC pin directly, looks like below /* Device 1, INTA - INTD */ Package(){0x0001ffff, 0, 0, 20}, Package(){0x0001ffff, 1, 0, 21}, Package(){0x0001ffff, 2, 0, 22}, Package(){0x0001ffff, 3, 0, 23}, Guest OS configures this IOAPIC pin with level triger, active low unconditionally. If the PCI device connected to IOAPIC pin through interrupt link, the irq attribute(level, polarity) is decided by interrupt link attribute, Below is the interrupt link attribute in kvm/ia32 Device(LNKA){ Name(_HID, EISAID("PNP0C0F")) // PCI interrupt link Name(_UID, 1) Name(_PRS, ResourceTemplate(){ Interrupt (, Level, ActiveHigh, Shared) { 5, 10, 11 } }) It's defined as level trigger, activehigh, that's the reason why pci device worked well in kvm/ia32. I think below scheme is feasible, 1. all PCI devices in Qemu uses level trigger, active low interrupt. (not include ide, even though it is a PCI device, it uses legacy interrupt mechanism) 2. in Guest Firmware, all PCI devices' interrupts are configured as level trigger, active low for KVM/IA32 Guest firmware, just a little modifications Name(_PRS, ResourceTemplate(){ Interrupt (, Level, ActiveHigh, Shared)--> Interrupt (, Level, ActiveLow, Shared) There are some modifications in Qemu, But I think it's a worthwhile, it's a thoroghly solution both for KVM/IA32 and KVM/IA64. - Thanks Anthony ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-10 6:33 ` Xu, Anthony @ 2008-06-10 7:25 ` Alexander Graf 2008-06-10 7:57 ` [RFC]RE: " Xu, Anthony 2008-06-12 12:30 ` Avi Kivity 1 sibling, 1 reply; 21+ messages in thread From: Alexander Graf @ 2008-06-10 7:25 UTC (permalink / raw) To: Xu, Anthony; +Cc: Avi Kivity, Jes Sorensen, kvm, kvm-ia64 On Jun 10, 2008, at 8:33 AM, Xu, Anthony wrote: > Avi Kivity wrote: >> >> I suggest modifying the firmware to report the interrupts as active >> high. Since Xen does not emulate polarity, the change will not >> affect >> it and the firmware can continue to be shared. I'd also recommend >> fixing Xen to emulate the polarity correctly, if possible. > > Thanks for your comments > I agree modifying common code is not a good method. > > While your suggestion seems be infeasible too. > According to acpi spec, only irq <=15 can be configured, such as > trigger > level, polarity. > For irq >15 , means connect to IOAPIC directly, it can't be > configured, > it must be level triger, active low. > > I can't find any mechanism in firmware to configure irqs (> 15). > Please > enlighten me if you have. You can change the defaults on that IOAPIC-wise. IIRC this was in the MADT, but I'd have to check. > From some experimental, Firmware both in IA64 and IA32 doesn't program > IOAPIC, > It's Guest OS to program. > Guest OS gets PRT first, > If the PCI device connected to IOAPIC pin directly, looks like below > /* Device 1, INTA - INTD */ > Package(){0x0001ffff, 0, 0, 20}, > Package(){0x0001ffff, 1, 0, 21}, > Package(){0x0001ffff, 2, 0, 22}, > Package(){0x0001ffff, 3, 0, 23}, > Guest OS configures this IOAPIC pin with level triger, active low > unconditionally. Yes, the Guest OS reads the PRT and configures the IOAPIC and LAPIC accordingly. > If the PCI device connected to IOAPIC pin through interrupt link, the > irq attribute(level, polarity) is decided by interrupt link attribute, > Below is the interrupt link attribute in kvm/ia32 > Device(LNKA){ > Name(_HID, EISAID("PNP0C0F")) // PCI interrupt link > Name(_UID, 1) > Name(_PRS, ResourceTemplate(){ > Interrupt (, Level, ActiveHigh, Shared) > { 5, 10, 11 } > }) > It's defined as level trigger, activehigh, that's the reason why pci > device worked well in kvm/ia32. This is the what I call "Legacy" way of handling interrupt lanes. Usually nowadays you simply connect devices magically to IOAPIC pins, which is exactly what you showed in the previous section. This "new" behavior is usually activated when the OS calls the _PIC function in the DSDT with a parameter != 0. > I think below scheme is feasible, > 1. all PCI devices in Qemu uses level trigger, active low interrupt. > (not include ide, even though it is a PCI device, it uses legacy > interrupt mechanism) This is what the PCI spec requires anyway. The way it's done right now looks wrong to me. > 2. in Guest Firmware, all PCI devices' interrupts are configured as > level trigger, active low > for KVM/IA32 Guest firmware, just a little modifications > Name(_PRS, ResourceTemplate(){ > Interrupt (, Level, ActiveHigh, Shared)--> > Interrupt > (, Level, ActiveLow, Shared) While at it it might be a good idea to switch to the direct IOAPIC- mapping approach. I sent a patch that did this on the kvm list some time ago. Please consider reading that and tell me what you think of it. It apparently broke Windows, but that might be simply an issue of the wrong ActiveHigh/ActiveLow configuration. > There are some modifications in Qemu, But I think it's a worthwhile, > it's a thoroghly solution both for KVM/IA32 and KVM/IA64. In essence this sounds great to me! I would love if we could talk about this on the KVM Forum. Thank you, Alex ^ permalink raw reply [flat|nested] 21+ messages in thread
* [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-10 7:25 ` Alexander Graf @ 2008-06-10 7:57 ` Xu, Anthony 2008-06-11 14:24 ` Alexander Graf ` (2 more replies) 0 siblings, 3 replies; 21+ messages in thread From: Xu, Anthony @ 2008-06-10 7:57 UTC (permalink / raw) To: Alexander Graf; +Cc: Avi Kivity, Jes Sorensen, kvm, kvm-ia64 Thanks for comments Basically we are on the same page, while I didn't find your patch about irq assignment, can you post it in this thread again, thx? Below patch makes all PCI devices use level-trigger , active low interrupt, it worked well when running linux guest, I didn't try windows guest yet. (didn't have windows image in hand) Please comment! If this is acceptabled, we can figure out how to use IOAPIC in kvm/ia32 based on this. Which will reduce irq sharing dramatically. Thanks, Anthony diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl index 21fc76a..4b5e824 100755 --- a/bios/acpi-dsdt.dsl +++ b/bios/acpi-dsdt.dsl @@ -974,7 +974,7 @@ DefinitionBlock ( Name(_HID, EISAID("PNP0C0F")) // PCI interrupt link Name(_UID, 1) Name(_PRS, ResourceTemplate(){ - Interrupt (, Level, ActiveHigh, Shared) + Interrupt (, Level, ActiveLow, Shared) { 5, 10, 11 } }) Method (_STA, 0, NotSerialized) @@ -1019,7 +1019,7 @@ DefinitionBlock ( Name(_HID, EISAID("PNP0C0F")) // PCI interrupt link Name(_UID, 2) Name(_PRS, ResourceTemplate(){ - Interrupt (, Level, ActiveHigh, Shared) + Interrupt (, Level, ActiveLow, Shared) { 5, 10, 11 } }) Method (_STA, 0, NotSerialized) @@ -1064,7 +1064,7 @@ DefinitionBlock ( Name(_HID, EISAID("PNP0C0F")) // PCI interrupt link Name(_UID, 3) Name(_PRS, ResourceTemplate(){ - Interrupt (, Level, ActiveHigh, Shared) + Interrupt (, Level, ActiveLow, Shared) { 5, 10, 11 } }) Method (_STA, 0, NotSerialized) @@ -1109,7 +1109,7 @@ DefinitionBlock ( Name(_HID, EISAID("PNP0C0F")) // PCI interrupt link Name(_UID, 4) Name(_PRS, ResourceTemplate(){ - Interrupt (, Level, ActiveHigh, Shared) + Interrupt (, Level, ActiveLow, Shared) { 5, 10, 11 } }) Method (_STA, 0, NotSerialized) diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c index a23a466..df0ea33 100644 --- a/qemu/hw/pci.c +++ b/qemu/hw/pci.c @@ -548,7 +548,7 @@ static void pci_set_irq(void *opaque, int irq_num, int level) pci_dev = bus->parent_dev; } bus->irq_count[irq_num] += change; - bus->set_irq(bus->irq_opaque, irq_num, bus->irq_count[irq_num] != 0); + bus->set_irq(bus->irq_opaque, irq_num, !(bus->irq_count[irq_num] != 0)); } /***********************************************************/ Alexander Graf wrote: > On Jun 10, 2008, at 8:33 AM, Xu, Anthony wrote: > >> Avi Kivity wrote: >>> >>> I suggest modifying the firmware to report the interrupts as active >>> high. Since Xen does not emulate polarity, the change will not >>> affect it and the firmware can continue to be shared. I'd also >>> recommend fixing Xen to emulate the polarity correctly, if possible. >> >> Thanks for your comments >> I agree modifying common code is not a good method. >> >> While your suggestion seems be infeasible too. >> According to acpi spec, only irq <=15 can be configured, such as >> trigger level, polarity. >> For irq >15 , means connect to IOAPIC directly, it can't be >> configured, it must be level triger, active low. >> >> I can't find any mechanism in firmware to configure irqs (> 15). >> Please enlighten me if you have. > > > You can change the defaults on that IOAPIC-wise. IIRC this was in the > MADT, but I'd have to check. > >> From some experimental, Firmware both in IA64 and IA32 doesn't >> program IOAPIC, It's Guest OS to program. >> Guest OS gets PRT first, >> If the PCI device connected to IOAPIC pin directly, looks like below >> /* Device 1, INTA - INTD */ >> Package(){0x0001ffff, 0, 0, 20}, >> Package(){0x0001ffff, 1, 0, 21}, >> Package(){0x0001ffff, 2, 0, 22}, >> Package(){0x0001ffff, 3, 0, 23}, >> Guest OS configures this IOAPIC pin with level triger, active low >> unconditionally. > > Yes, the Guest OS reads the PRT and configures the IOAPIC and LAPIC > accordingly. > >> If the PCI device connected to IOAPIC pin through interrupt link, the >> irq attribute(level, polarity) is decided by interrupt link >> attribute, Below is the interrupt link attribute in kvm/ia32 >> Device(LNKA){ >> Name(_HID, EISAID("PNP0C0F")) // PCI interrupt >> link Name(_UID, 1) Name(_PRS, >> ResourceTemplate(){ Interrupt (, Level, >> ActiveHigh, Shared) { 5, 10, 11 } >> }) >> It's defined as level trigger, activehigh, that's the reason why pci >> device worked well in kvm/ia32. > > This is the what I call "Legacy" way of handling interrupt lanes. > Usually nowadays you simply connect devices magically to IOAPIC pins, > which is exactly what you showed in the previous section. This "new" > behavior is usually activated when the OS calls the _PIC function in > the DSDT with a parameter != 0. > >> I think below scheme is feasible, >> 1. all PCI devices in Qemu uses level trigger, active low interrupt. >> (not include ide, even though it is a PCI device, it uses legacy >> interrupt mechanism) > > This is what the PCI spec requires anyway. The way it's done right now > looks wrong to me. > >> 2. in Guest Firmware, all PCI devices' interrupts are configured as >> level trigger, active low for KVM/IA32 Guest firmware, just a >> little modifications Name(_PRS, ResourceTemplate(){ >> Interrupt (, Level, ActiveHigh, Shared)--> >> Interrupt (, Level, ActiveLow, Shared) > > While at it it might be a good idea to switch to the direct IOAPIC- > mapping approach. I sent a patch that did this on the kvm list some > time ago. Please consider reading that and tell me what you think of > it. It apparently broke Windows, but that might be simply an issue of > the wrong ActiveHigh/ActiveLow configuration. > >> There are some modifications in Qemu, But I think it's a worthwhile, >> it's a thoroghly solution both for KVM/IA32 and KVM/IA64. > > In essence this sounds great to me! I would love if we could talk > about this on the KVM Forum. > > Thank you, > > Alex ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-10 7:57 ` [RFC]RE: " Xu, Anthony @ 2008-06-11 14:24 ` Alexander Graf 2008-06-11 16:02 ` Marcelo Tosatti 2008-06-12 16:08 ` Xu, Anthony 2008-06-11 16:16 ` Marcelo Tosatti 2008-06-12 12:34 ` Avi Kivity 2 siblings, 2 replies; 21+ messages in thread From: Alexander Graf @ 2008-06-11 14:24 UTC (permalink / raw) To: Xu, Anthony; +Cc: Avi Kivity, Jes Sorensen, kvm, kvm-ia64 [-- Attachment #1: Type: text/plain, Size: 2148 bytes --] On Jun 10, 2008, at 12:57 AM, Xu, Anthony wrote: > Thanks for comments > > Basically we are on the same page, while I didn't find your patch > about > irq assignment, can you post it in this thread again, thx? I'll attach it to this mail. > Below patch makes all PCI devices use level-trigger , active low > interrupt, it worked well when running linux guest, I didn't try > windows > guest yet. > (didn't have windows image in hand) > > Please comment! > > If this is acceptabled, we can figure out how to use IOAPIC in kvm/ > ia32 > based on this. Which will reduce irq sharing dramatically. > > > Thanks, > Anthony > > > > diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl > index 21fc76a..4b5e824 100755 > --- a/bios/acpi-dsdt.dsl > +++ b/bios/acpi-dsdt.dsl > @@ -974,7 +974,7 @@ DefinitionBlock ( > Name(_HID, EISAID("PNP0C0F")) // PCI interrupt > link > Name(_UID, 1) > Name(_PRS, ResourceTemplate(){ > - Interrupt (, Level, ActiveHigh, Shared) > + Interrupt (, Level, ActiveLow, Shared) This looks pretty much correct to me ;-). You might also want to add the GSI functionality I have in my patch. The only thing we have not discussed so far is, how do interrupts get routed when _PIC is not set to 1, aka the "boot case"? > > { 5, 10, 11 } > }) > Method (_STA, 0, NotSerialized) [...snip...] > > diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c > index a23a466..df0ea33 100644 > --- a/qemu/hw/pci.c > +++ b/qemu/hw/pci.c > @@ -548,7 +548,7 @@ static void pci_set_irq(void *opaque, int irq_num, > int level) > pci_dev = bus->parent_dev; > } > bus->irq_count[irq_num] += change; > - bus->set_irq(bus->irq_opaque, irq_num, bus->irq_count[irq_num] != > 0); > + bus->set_irq(bus->irq_opaque, irq_num, !(bus- > >irq_count[irq_num] != > 0)); > } I don't think this is the right place to do it. Probably it would be a better idea to have either the APIC emulation know that the levels are reversed or make every device trigger ActiveLow interrupts. Alex [-- Attachment #2: apci.gsi.patch --] [-- Type: application/octet-stream, Size: 10602 bytes --] Hi, in the DSDT there are two different ways of defining, how an interrupt is supposed to be routed. Currently we are using the LNKA - LNKD method, which afaict is for legacy support. The other method is to directly tell the Operating System, which APIC pin the device is attached to. We can get that information from the very same entry, the LNKA to LNKD pseudo devices receive it. For now this does not give any obvious improvement. It does leave room for more advanced mappings, with several IOAPICs that can handle more devices separately. This might help when we have a lot of devices, as currently all devices sit on two interrupt lanes. More importantly (for me) though, is that Darwin enables the APIC mode unconditionally, so it won't easily run in legacy mode. Regards, Alex Signed-off-by: Alexander Graf <agraf@suse.de> diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl index d2e33f4..f718b2e 100755 --- a/bios/acpi-dsdt.dsl +++ b/bios/acpi-dsdt.dsl @@ -199,4 +199,10 @@ DefinitionBlock ( { DBGL, 32, } + /* PIC mode setting */ + Name (PICF, 0x00) + Method (_PIC, 1, NotSerialized) + { + Store(Arg0, PICF) + } } @@ -199,10 +199,204 @@ DefinitionBlock ( Device(PCI0) { Name (_HID, EisaId ("PNP0A03")) Name (_ADR, 0x00) Name (_UID, 1) - Name(_PRT, Package() { + Name(APRT, Package() { + // PCI Slot 0 + Package() {0x0000ffff, 0, 0, ARQ3}, + Package() {0x0000ffff, 1, 0, ARQ0}, + Package() {0x0000ffff, 2, 0, ARQ1}, + Package() {0x0000ffff, 3, 0, ARQ2}, + + // PCI Slot 1 + Package() {0x0001ffff, 0, 0, ARQ0}, + Package() {0x0001ffff, 1, 0, ARQ1}, + Package() {0x0001ffff, 2, 0, ARQ2}, + Package() {0x0001ffff, 3, 0, ARQ3}, + + // PCI Slot 2 + Package() {0x0002ffff, 0, 0, ARQ1}, + Package() {0x0002ffff, 1, 0, ARQ2}, + Package() {0x0002ffff, 2, 0, ARQ3}, + Package() {0x0002ffff, 3, 0, ARQ0}, + + // PCI Slot 3 + Package() {0x0003ffff, 0, 0, ARQ2}, + Package() {0x0003ffff, 1, 0, ARQ3}, + Package() {0x0003ffff, 2, 0, ARQ0}, + Package() {0x0003ffff, 3, 0, ARQ1}, + + // PCI Slot 4 + Package() {0x0004ffff, 0, 0, ARQ3}, + Package() {0x0004ffff, 1, 0, ARQ0}, + Package() {0x0004ffff, 2, 0, ARQ1}, + Package() {0x0004ffff, 3, 0, ARQ2}, + + // PCI Slot 5 + Package() {0x0005ffff, 0, 0, ARQ0}, + Package() {0x0005ffff, 1, 0, ARQ1}, + Package() {0x0005ffff, 2, 0, ARQ2}, + Package() {0x0005ffff, 3, 0, ARQ3}, + + // PCI Slot 6 + Package() {0x0006ffff, 0, 0, ARQ1}, + Package() {0x0006ffff, 1, 0, ARQ2}, + Package() {0x0006ffff, 2, 0, ARQ3}, + Package() {0x0006ffff, 3, 0, ARQ0}, + + // PCI Slot 7 + Package() {0x0007ffff, 0, 0, ARQ2}, + Package() {0x0007ffff, 1, 0, ARQ3}, + Package() {0x0007ffff, 2, 0, ARQ0}, + Package() {0x0007ffff, 3, 0, ARQ1}, + + // PCI Slot 8 + Package() {0x0008ffff, 0, 0, ARQ3}, + Package() {0x0008ffff, 1, 0, ARQ0}, + Package() {0x0008ffff, 2, 0, ARQ1}, + Package() {0x0008ffff, 3, 0, ARQ2}, + + // PCI Slot 9 + Package() {0x0009ffff, 0, 0, ARQ0}, + Package() {0x0009ffff, 1, 0, ARQ1}, + Package() {0x0009ffff, 2, 0, ARQ2}, + Package() {0x0009ffff, 3, 0, ARQ3}, + + // PCI Slot 10 + Package() {0x000affff, 0, 0, ARQ1}, + Package() {0x000affff, 1, 0, ARQ2}, + Package() {0x000affff, 2, 0, ARQ3}, + Package() {0x000affff, 3, 0, ARQ0}, + + // PCI Slot 11 + Package() {0x000bffff, 0, 0, ARQ2}, + Package() {0x000bffff, 1, 0, ARQ3}, + Package() {0x000bffff, 2, 0, ARQ0}, + Package() {0x000bffff, 3, 0, ARQ1}, + + // PCI Slot 12 + Package() {0x000cffff, 0, 0, ARQ3}, + Package() {0x000cffff, 1, 0, ARQ0}, + Package() {0x000cffff, 2, 0, ARQ1}, + Package() {0x000cffff, 3, 0, ARQ2}, + + // PCI Slot 13 + Package() {0x000dffff, 0, 0, ARQ0}, + Package() {0x000dffff, 1, 0, ARQ1}, + Package() {0x000dffff, 2, 0, ARQ2}, + Package() {0x000dffff, 3, 0, ARQ3}, + + // PCI Slot 14 + Package() {0x000effff, 0, 0, ARQ1}, + Package() {0x000effff, 1, 0, ARQ2}, + Package() {0x000effff, 2, 0, ARQ3}, + Package() {0x000effff, 3, 0, ARQ0}, + + // PCI Slot 15 + Package() {0x000fffff, 0, 0, ARQ2}, + Package() {0x000fffff, 1, 0, ARQ3}, + Package() {0x000fffff, 2, 0, ARQ0}, + Package() {0x000fffff, 3, 0, ARQ1}, + + // PCI Slot 16 + Package() {0x0010ffff, 0, 0, ARQ3}, + Package() {0x0010ffff, 1, 0, ARQ0}, + Package() {0x0010ffff, 2, 0, ARQ1}, + Package() {0x0010ffff, 3, 0, ARQ2}, + + // PCI Slot 17 + Package() {0x0011ffff, 0, 0, ARQ0}, + Package() {0x0011ffff, 1, 0, ARQ1}, + Package() {0x0011ffff, 2, 0, ARQ2}, + Package() {0x0011ffff, 3, 0, ARQ3}, + + // PCI Slot 18 + Package() {0x0012ffff, 0, 0, ARQ1}, + Package() {0x0012ffff, 1, 0, ARQ2}, + Package() {0x0012ffff, 2, 0, ARQ3}, + Package() {0x0012ffff, 3, 0, ARQ0}, + + // PCI Slot 19 + Package() {0x0013ffff, 0, 0, ARQ2}, + Package() {0x0013ffff, 1, 0, ARQ3}, + Package() {0x0013ffff, 2, 0, ARQ0}, + Package() {0x0013ffff, 3, 0, ARQ1}, + + // PCI Slot 20 + Package() {0x0014ffff, 0, 0, ARQ3}, + Package() {0x0014ffff, 1, 0, ARQ0}, + Package() {0x0014ffff, 2, 0, ARQ1}, + Package() {0x0014ffff, 3, 0, ARQ2}, + + // PCI Slot 21 + Package() {0x0015ffff, 0, 0, ARQ0}, + Package() {0x0015ffff, 1, 0, ARQ1}, + Package() {0x0015ffff, 2, 0, ARQ2}, + Package() {0x0015ffff, 3, 0, ARQ3}, + + // PCI Slot 22 + Package() {0x0016ffff, 0, 0, ARQ1}, + Package() {0x0016ffff, 1, 0, ARQ2}, + Package() {0x0016ffff, 2, 0, ARQ3}, + Package() {0x0016ffff, 3, 0, ARQ0}, + + // PCI Slot 23 + Package() {0x0017ffff, 0, 0, ARQ2}, + Package() {0x0017ffff, 1, 0, ARQ3}, + Package() {0x0017ffff, 2, 0, ARQ0}, + Package() {0x0017ffff, 3, 0, ARQ1}, + + // PCI Slot 24 + Package() {0x0018ffff, 0, 0, ARQ3}, + Package() {0x0018ffff, 1, 0, ARQ0}, + Package() {0x0018ffff, 2, 0, ARQ1}, + Package() {0x0018ffff, 3, 0, ARQ2}, + + // PCI Slot 25 + Package() {0x0019ffff, 0, 0, ARQ0}, + Package() {0x0019ffff, 1, 0, ARQ1}, + Package() {0x0019ffff, 2, 0, ARQ2}, + Package() {0x0019ffff, 3, 0, ARQ3}, + + // PCI Slot 26 + Package() {0x001affff, 0, 0, ARQ1}, + Package() {0x001affff, 1, 0, ARQ2}, + Package() {0x001affff, 2, 0, ARQ3}, + Package() {0x001affff, 3, 0, ARQ0}, + + // PCI Slot 27 + Package() {0x001bffff, 0, 0, ARQ2}, + Package() {0x001bffff, 1, 0, ARQ3}, + Package() {0x001bffff, 2, 0, ARQ0}, + Package() {0x001bffff, 3, 0, ARQ1}, + + // PCI Slot 28 + Package() {0x001cffff, 0, 0, ARQ3}, + Package() {0x001cffff, 1, 0, ARQ0}, + Package() {0x001cffff, 2, 0, ARQ1}, + Package() {0x001cffff, 3, 0, ARQ2}, + + // PCI Slot 29 + Package() {0x001dffff, 0, 0, ARQ0}, + Package() {0x001dffff, 1, 0, ARQ1}, + Package() {0x001dffff, 2, 0, ARQ2}, + Package() {0x001dffff, 3, 0, ARQ3}, + + // PCI Slot 30 + Package() {0x001effff, 0, 0, ARQ1}, + Package() {0x001effff, 1, 0, ARQ2}, + Package() {0x001effff, 2, 0, ARQ3}, + Package() {0x001effff, 3, 0, ARQ0}, + + // PCI Slot 31 + Package() {0x001fffff, 0, 0, ARQ2}, + Package() {0x001fffff, 1, 0, ARQ3}, + Package() {0x001fffff, 2, 0, ARQ0}, + Package() {0x001fffff, 3, 0, ARQ1}, + }) + + Name(LPRT, Package() { /* PCI IRQ routing table, example from ACPI 2.0a specification, section 6.2.8.1 */ /* Note: we provide the same info as the PCI routing table of the Bochs BIOS */ @@ -407,6 +681,18 @@ DefinitionBlock ( Package() {0x001fffff, 3, LNKB, 0}, }) + Method (_PRT, 0, NotSerialized) + { + If (\PICF) + { + Return (APRT) + } + Else + { + Return (LPRT) + } + } + OperationRegion(PCST, SystemIO, 0xae00, 0x08) Field (PCST, DWordAcc, NoLock, WriteAsZeros) { @@ -939,5 +1258,25 @@ DefinitionBlock ( PRQ3, 8 } + Method (ARQ0, 0, NotSerialized) + { + Return ( And ( PRQ0, 0x7f ) ) + } + + Method (ARQ1, 0, NotSerialized) + { + Return ( And ( PRQ1, 0x7f ) ) + } + + Method (ARQ2, 0, NotSerialized) + { + Return ( And ( PRQ2, 0x7f ) ) + } + + Method (ARQ3, 0, NotSerialized) + { + Return ( And ( PRQ3, 0x7f ) ) + } + Device(LNKA){ Name(_HID, EISAID("PNP0C0F")) // PCI interrupt link [-- Attachment #3: Type: text/plain, Size: 2 bytes --] ^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-11 14:24 ` Alexander Graf @ 2008-06-11 16:02 ` Marcelo Tosatti 2008-06-12 16:15 ` Xu, Anthony 2008-06-12 16:08 ` Xu, Anthony 1 sibling, 1 reply; 21+ messages in thread From: Marcelo Tosatti @ 2008-06-11 16:02 UTC (permalink / raw) To: Alexander Graf; +Cc: Xu, Anthony, Avi Kivity, Jes Sorensen, kvm, kvm-ia64 On Wed, Jun 11, 2008 at 07:24:09AM -0700, Alexander Graf wrote: > > On Jun 10, 2008, at 12:57 AM, Xu, Anthony wrote: > >> Thanks for comments >> >> Basically we are on the same page, while I didn't find your patch about >> irq assignment, can you post it in this thread again, thx? > > I'll attach it to this mail. > >> Below patch makes all PCI devices use level-trigger , active low >> interrupt, it worked well when running linux guest, I didn't try windows >> guest yet. >> (didn't have windows image in hand) >> >> Please comment! >> >> If this is acceptabled, we can figure out how to use IOAPIC in kvm/ia32 >> based on this. Which will reduce irq sharing dramatically. >> >> >> Thanks, >> Anthony >> >> >> >> diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl >> index 21fc76a..4b5e824 100755 >> --- a/bios/acpi-dsdt.dsl >> +++ b/bios/acpi-dsdt.dsl >> @@ -974,7 +974,7 @@ DefinitionBlock ( >> Name(_HID, EISAID("PNP0C0F")) // PCI interrupt link >> Name(_UID, 1) >> Name(_PRS, ResourceTemplate(){ >> - Interrupt (, Level, ActiveHigh, Shared) >> + Interrupt (, Level, ActiveLow, Shared) > > This looks pretty much correct to me ;-). You might also want to add the > GSI functionality I have in my patch. The only thing we have not discussed > so far is, how do interrupts get routed when _PIC is not set to 1, aka the > "boot case"? Hi Alexander, Anthony, I think it would be better to avoid static PCI pin -> IOAPIC pin assignments, if PCI link devices can be used (allowing the OS to route IRQ's as it wishes to). Take a look at http://www.microsoft.com/whdc/archive/acpi-mp.mspx. It seems cleaner to use "bimodal link nodes" (using the parlance from URL above) instead of "bimodal _PRT" as your present GSI patch is using. My current inclination is: - Move the PCI interrupt link device code to a method in a separate file (with arguments such as _UID, IRQ list, etc). - Create separate PCI interrupt link devices for each PCI pin of each slot (see the example table at end of URL). - Assign all available IRQ's covered by the single IOAPIC (0-24) in the interrupt list for these interrupt link devices. I'll try Anthony's patch with Windows. ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-11 16:02 ` Marcelo Tosatti @ 2008-06-12 16:15 ` Xu, Anthony 2008-06-14 22:58 ` Marcelo Tosatti 0 siblings, 1 reply; 21+ messages in thread From: Xu, Anthony @ 2008-06-12 16:15 UTC (permalink / raw) To: Marcelo Tosatti, Alexander Graf; +Cc: Avi Kivity, Jes Sorensen, kvm, kvm-ia64 Marcelo Tosatti wrote: > On Wed, Jun 11, 2008 at 07:24:09AM -0700, Alexander Graf wrote: >> >> On Jun 10, 2008, at 12:57 AM, Xu, Anthony wrote: >> >>> Thanks for comments >>> >>> Basically we are on the same page, while I didn't find your patch >>> about irq assignment, can you post it in this thread again, thx? >> >> I'll attach it to this mail. >> >>> Below patch makes all PCI devices use level-trigger , active low >>> interrupt, it worked well when running linux guest, I didn't try >>> windows guest yet. (didn't have windows image in hand) >>> >>> Please comment! >>> >>> If this is acceptabled, we can figure out how to use IOAPIC in >>> kvm/ia32 based on this. Which will reduce irq sharing dramatically. >>> >>> >>> Thanks, >>> Anthony >>> >>> >>> >>> diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl >>> index 21fc76a..4b5e824 100755 >>> --- a/bios/acpi-dsdt.dsl >>> +++ b/bios/acpi-dsdt.dsl >>> @@ -974,7 +974,7 @@ DefinitionBlock ( >>> Name(_HID, EISAID("PNP0C0F")) // PCI interrupt >>> link Name(_UID, 1) Name(_PRS, >>> ResourceTemplate(){ - Interrupt (, Level, >>> ActiveHigh, Shared) + Interrupt (, Level, >>> ActiveLow, Shared) >> >> This looks pretty much correct to me ;-). You might also want to add >> the GSI functionality I have in my patch. The only thing we have not >> discussed so far is, how do interrupts get routed when _PIC is not >> set to 1, aka the "boot case"? > > Hi Alexander, Anthony, > > I think it would be better to avoid static PCI pin -> IOAPIC pin > assignments, if PCI link devices can be used (allowing the OS to route > IRQ's as it wishes to). Seems PCI link device only support irq-pin < 16, IOAPIC pin 16~23 can not be used. > > Take a look at http://www.microsoft.com/whdc/archive/acpi-mp.mspx. It > seems cleaner to use "bimodal link nodes" (using the parlance from URL > above) instead of "bimodal _PRT" as your present GSI patch is using. Bimodal _PRT is a great idea, I never thought of it before, thanks. While in PIIX platform there are only 4 PCI link entries, how can we introduce more? Where to put these added entries? still in ISA bridge configure space. Another concern is, can this link use irq-pin > 15? In the example ASL code in the web page you provided, they use irq-pin <=15 - Anthony > > My current inclination is: > > - Move the PCI interrupt link device code to a method in a separate > file (with arguments such as _UID, IRQ list, etc). > - Create separate PCI interrupt link devices for each PCI pin of each > slot (see the example table at end of URL). > - Assign all available IRQ's covered by the single IOAPIC (0-24) in > the interrupt list for these interrupt link devices. > > I'll try Anthony's patch with Windows. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-12 16:15 ` Xu, Anthony @ 2008-06-14 22:58 ` Marcelo Tosatti 2008-06-16 1:13 ` Xu, Anthony 0 siblings, 1 reply; 21+ messages in thread From: Marcelo Tosatti @ 2008-06-14 22:58 UTC (permalink / raw) To: Xu, Anthony; +Cc: Alexander Graf, Avi Kivity, Jes Sorensen, kvm, kvm-ia64 On Fri, Jun 13, 2008 at 12:15:23AM +0800, Xu, Anthony wrote: > > I think it would be better to avoid static PCI pin -> IOAPIC pin > > assignments, if PCI link devices can be used (allowing the OS to route > > IRQ's as it wishes to). > Seems PCI link device only support irq-pin < 16, > IOAPIC pin 16~23 can not be used. > > > > > > > Take a look at http://www.microsoft.com/whdc/archive/acpi-mp.mspx. It > > seems cleaner to use "bimodal link nodes" (using the parlance from URL > > above) instead of "bimodal _PRT" as your present GSI patch is using. > Bimodal _PRT is a great idea, I never thought of it before, thanks. > > While in PIIX platform there are only 4 PCI link entries, how can we > introduce more? Where to put these added entries? > still in ISA bridge configure space. Would have to write an ACPI-IOAPIC "IRQ router" to replace PIIX. It would be queried via a SystemIO region, so QEMU can know what IRQ has been assigned to a particular slot/func (OS can then change IRQ assignment via link device _SRS method). That seems to be necessary for dynamic IRQ assignment of slots/function once you have more than one IOAPIC (note we can also assign one IRQ to each function inside each slot, currently there's one IRQ per _slot_). > Another concern is, can this link use irq-pin > 15? > In the example ASL code in the web page you provided, they use irq-pin > <=15 Sure it can, as long as the OS has notified its not using PIIX's PIC (via the _PIC method). ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-14 22:58 ` Marcelo Tosatti @ 2008-06-16 1:13 ` Xu, Anthony 2008-07-02 9:11 ` Jes Sorensen 0 siblings, 1 reply; 21+ messages in thread From: Xu, Anthony @ 2008-06-16 1:13 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Alexander Graf, Avi Kivity, Jes Sorensen, kvm, kvm-ia64 Marcelo Tosatti wrote: > On Fri, Jun 13, 2008 at 12:15:23AM +0800, Xu, Anthony wrote: >>> I think it would be better to avoid static PCI pin -> IOAPIC pin >>> assignments, if PCI link devices can be used (allowing the OS to >>> route IRQ's as it wishes to). >> Seems PCI link device only support irq-pin < 16, >> IOAPIC pin 16~23 can not be used. >> >> >> >>> >>> Take a look at http://www.microsoft.com/whdc/archive/acpi-mp.mspx. >>> It seems cleaner to use "bimodal link nodes" (using the parlance >>> from URL above) instead of "bimodal _PRT" as your present GSI patch >>> is using. >> Bimodal _PRT is a great idea, I never thought of it before, thanks. >> >> While in PIIX platform there are only 4 PCI link entries, how can we >> introduce more? Where to put these added entries? >> still in ISA bridge configure space. > > Would have to write an ACPI-IOAPIC "IRQ router" to replace PIIX. It > would be queried via a SystemIO region, so QEMU can know what IRQ > has been assigned to a particular slot/func (OS can then change IRQ > assignment via link device _SRS method). > > That seems to be necessary for dynamic IRQ assignment of > slots/function once you have more than one IOAPIC (note we can also > assign one IRQ to each function inside each slot, currently there's > one IRQ per _slot_). Agree, it is necessary for dynamic IRQ assignment. > >> Another concern is, can this link use irq-pin > 15? >> In the example ASL code in the web page you provided, they use >> irq-pin <=15 > > Sure it can, as long as the OS has notified its not using PIIX's PIC > (via the _PIC method). Good. Anthony ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-16 1:13 ` Xu, Anthony @ 2008-07-02 9:11 ` Jes Sorensen 2008-07-03 1:22 ` Xu, Anthony 0 siblings, 1 reply; 21+ messages in thread From: Jes Sorensen @ 2008-07-02 9:11 UTC (permalink / raw) To: Xu, Anthony; +Cc: Marcelo Tosatti, Alexander Graf, Avi Kivity, kvm, kvm-ia64 Hi Anthony, Looking back through this thread - do you happen to have an updated patch for the irq problem? If not, could you let me know which is your latest version and I will try and look at it. Cheers, Jes ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-07-02 9:11 ` Jes Sorensen @ 2008-07-03 1:22 ` Xu, Anthony 0 siblings, 0 replies; 21+ messages in thread From: Xu, Anthony @ 2008-07-03 1:22 UTC (permalink / raw) To: Jes Sorensen; +Cc: Marcelo Tosatti, Alexander Graf, Avi Kivity, kvm, kvm-ia64 [-- Attachment #1: Type: text/plain, Size: 365 bytes --] Hi Jes, You can use this one, I'm considering how to make it more generic per AVI comment. Anthony Jes Sorensen wrote: > Hi Anthony, > > Looking back through this thread - do you happen to have an updated > patch for the irq problem? If not, could you let me know which is your > latest version and I will try and look at it. > > Cheers, > Jes [-- Attachment #2: 0002-Irq-assignment.patch --] [-- Type: application/octet-stream, Size: 6407 bytes --] From 67806cc642cb666fb59fec60115fee37b80ecb3a Mon Sep 17 00:00:00 2001 From: Anthony Xu <anthony.xu@intel.com> Date: Wed, 18 Jun 2008 14:32:46 -0400 Subject: [PATCH] Irq assignment 1. use bimodal _PRT 2. pci device can use irq > 15, reduce interrupt sharing 3. test by running linux guest in kvm-ia64, kvm-i32(w/ wo/ -no-kvm) signed-off-by: Anthony Xu <anthony.xu@intel.com> --- bios/acpi-dsdt.dsl | 59 +++++++++++++++++++++++++++++++++++++++++++++++++++- qemu/hw/apic.c | 22 ++++++++++++++++++- qemu/hw/ipf.c | 24 +++++++++++++++++++++ qemu/hw/pc.c | 2 +- qemu/hw/pc.h | 4 +++ qemu/hw/pci.c | 8 +++++++ 6 files changed, 116 insertions(+), 3 deletions(-) diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl index d1bfa2c..62dd612 100755 --- a/bios/acpi-dsdt.dsl +++ b/bios/acpi-dsdt.dsl @@ -89,6 +89,12 @@ DefinitionBlock ( } } + Name (PICD, 0) + + Method(_PIC, 1) + { + Store(Arg0, PICD) + } /* PCI Bus definition */ Scope(\_SB) { @@ -96,7 +102,15 @@ DefinitionBlock ( Name (_HID, EisaId ("PNP0A03")) Name (_ADR, 0x00) Name (_UID, 1) - Name(_PRT, Package() { + + Method (_PRT,0) { + If (PICD) { + Return (PRTA) + } + Return (PRTP) + } + + Name (PRTP, Package() { /* PCI IRQ routing table, example from ACPI 2.0a specification, section 6.2.8.1 */ /* Note: we provide the same info as the PCI routing @@ -147,6 +161,49 @@ DefinitionBlock ( prt_slot3(0x001f), }) + Name (PRTA, Package() { + /* IOAPIC use fixed connection */ + +#define aprt_slot(nr, irq) \ + Package() { nr##ffff, 0, 0, irq }, \ + Package() { nr##ffff, 1, 0, irq }, \ + Package() { nr##ffff, 2, 0, irq }, \ + Package() { nr##ffff, 3, 0, irq } + + aprt_slot(0x0000, 16), + aprt_slot(0x0001, 17), + aprt_slot(0x0002, 18), + aprt_slot(0x0003, 19), + aprt_slot(0x0004, 20), + aprt_slot(0x0005, 21), + aprt_slot(0x0006, 22), + aprt_slot(0x0007, 23), + aprt_slot(0x0008, 16), + aprt_slot(0x0009, 17), + aprt_slot(0x000a, 18), + aprt_slot(0x000b, 19), + aprt_slot(0x000c, 20), + aprt_slot(0x000d, 21), + aprt_slot(0x000e, 22), + aprt_slot(0x000f, 23), + aprt_slot(0x0010, 16), + aprt_slot(0x0011, 17), + aprt_slot(0x0012, 18), + aprt_slot(0x0013, 19), + aprt_slot(0x0014, 20), + aprt_slot(0x0015, 21), + aprt_slot(0x0016, 22), + aprt_slot(0x0017, 23), + aprt_slot(0x0018, 16), + aprt_slot(0x0019, 17), + aprt_slot(0x001a, 18), + aprt_slot(0x001b, 19), + aprt_slot(0x001c, 20), + aprt_slot(0x001d, 21), + aprt_slot(0x001e, 22), + aprt_slot(0x001f, 23), + }) + OperationRegion(PCST, SystemIO, 0xae00, 0x08) Field (PCST, DWordAcc, NoLock, WriteAsZeros) { diff --git a/qemu/hw/apic.c b/qemu/hw/apic.c index 4ebf1ff..98bf3ad 100644 --- a/qemu/hw/apic.c +++ b/qemu/hw/apic.c @@ -1053,10 +1053,30 @@ static void ioapic_service(IOAPICState *s) } } +int ioapic_map_irq(int devfn, int irq_num) +{ + int irq; + irq = ((devfn >> 3) & 7) + 16; + return irq; +} + +static int ioapic_irq_count[IOAPIC_NUM_PINS]; + void ioapic_set_irq(void *opaque, int vector, int level) { IOAPICState *s = opaque; - + if( vector >= 16 ){ + if( level ) + ioapic_irq_count[vector] += 1; + else + ioapic_irq_count[vector] -= 1; + level = (ioapic_irq_count[vector] != 0); + } +#ifdef KVM_CAP_IRQCHIP + if (kvm_enabled()) + if (kvm_set_irq(vector, ioapic_irq_count[vector] == 0)) + return; +#endif if (vector >= 0 && vector < IOAPIC_NUM_PINS) { uint32_t mask = 1 << vector; uint64_t entry = s->ioredtbl[vector]; diff --git a/qemu/hw/ipf.c b/qemu/hw/ipf.c index b11e328..dabd7cc 100644 --- a/qemu/hw/ipf.c +++ b/qemu/hw/ipf.c @@ -672,3 +672,27 @@ QEMUMachine ipf_machine = { ipf_init_pci, VGA_RAM_SIZE + VGA_RAM_SIZE, }; + +#define IOAPIC_NUM_PINS 48 +static int ioapic_irq_count[IOAPIC_NUM_PINS]; + +int ioapic_map_irq(int devfn, int irq_num) +{ + int irq, dev; + dev = devfn >> 3; + irq = ((((dev << 2) + (dev >> 3) + irq_num) & 31) + 16); + return irq; +} + +void ioapic_set_irq(void *opaque, int vector, int level) +{ + if( level ) + ioapic_irq_count[vector] += 1; + else + ioapic_irq_count[vector] -= 1; + + if (kvm_enabled()) + if (kvm_set_irq(vector, ioapic_irq_count[vector] == 0)) + return; +} + diff --git a/qemu/hw/pc.c b/qemu/hw/pc.c index 42c2687..bf49689 100644 --- a/qemu/hw/pc.c +++ b/qemu/hw/pc.c @@ -53,8 +53,8 @@ static fdctrl_t *floppy_controller; static RTCState *rtc_state; static PITState *pit; -static IOAPICState *ioapic; static PCIDevice *i440fx_state; +IOAPICState *ioapic; static void ioport80_write(void *opaque, uint32_t addr, uint32_t data) { diff --git a/qemu/hw/pc.h b/qemu/hw/pc.h index c284bf1..379b386 100644 --- a/qemu/hw/pc.h +++ b/qemu/hw/pc.h @@ -47,6 +47,7 @@ int apic_accept_pic_intr(CPUState *env); void apic_local_deliver(CPUState *env, int vector); int apic_get_interrupt(CPUState *env); IOAPICState *ioapic_init(void); +int ioapic_map_irq(int devfn, int irq_num); void ioapic_set_irq(void *opaque, int vector, int level); /* i8254.c */ @@ -95,6 +96,9 @@ void ioport_set_a20(int enable); int ioport_get_a20(void); CPUState *pc_new_cpu(int cpu, const char *cpu_model, int pci_enabled); +/* pc.c */ +extern IOAPICState *ioapic; + /* acpi.c */ extern int acpi_enabled; i2c_bus *piix4_pm_init(PCIBus *bus, int devfn, uint32_t smb_io_base, diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c index 92683d1..c1a0361 100644 --- a/qemu/hw/pci.c +++ b/qemu/hw/pci.c @@ -538,12 +538,20 @@ static void pci_set_irq(void *opaque, int irq_num, int level) PCIDevice *pci_dev = (PCIDevice *)opaque; PCIBus *bus; int change; + int irq; change = level - pci_dev->irq_state[irq_num]; if (!change) return; pci_dev->irq_state[irq_num] = level; + + irq = ioapic_map_irq(pci_dev->devfn, irq_num); +#if defined(TARGET_IA64) + ioapic_set_irq(NULL, irq, level); +#else + ioapic_set_irq(ioapic, irq, level); +#endif for (;;) { bus = pci_dev->bus; irq_num = bus->map_irq(pci_dev, irq_num); -- 1.5.5 ^ permalink raw reply related [flat|nested] 21+ messages in thread
* RE: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-11 14:24 ` Alexander Graf 2008-06-11 16:02 ` Marcelo Tosatti @ 2008-06-12 16:08 ` Xu, Anthony 1 sibling, 0 replies; 21+ messages in thread From: Xu, Anthony @ 2008-06-12 16:08 UTC (permalink / raw) To: Alexander Graf; +Cc: Avi Kivity, Jes Sorensen, kvm, kvm-ia64 Alexander Graf wrote: > On Jun 10, 2008, at 12:57 AM, Xu, Anthony wrote: > >> Thanks for comments >> >> Basically we are on the same page, while I didn't find your patch >> about irq assignment, can you post it in this thread again, thx? > > I'll attach it to this mail. This patch is stilling use legacy LNKA~LNKD for ioapic, As you said irq-pins > 15 are not used. > >> Below patch makes all PCI devices use level-trigger , active low >> interrupt, it worked well when running linux guest, I didn't try >> windows guest yet. >> (didn't have windows image in hand) >> >> Please comment! >> >> If this is acceptabled, we can figure out how to use IOAPIC in kvm/ >> ia32 based on this. Which will reduce irq sharing dramatically. >> >> >> Thanks, >> Anthony >> >> >> >> diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl >> index 21fc76a..4b5e824 100755 >> --- a/bios/acpi-dsdt.dsl >> +++ b/bios/acpi-dsdt.dsl >> @@ -974,7 +974,7 @@ DefinitionBlock ( >> Name(_HID, EISAID("PNP0C0F")) // PCI interrupt >> link Name(_UID, 1) >> Name(_PRS, ResourceTemplate(){ >> - Interrupt (, Level, ActiveHigh, Shared) >> + Interrupt (, Level, ActiveLow, Shared) > > This looks pretty much correct to me ;-). You might also want to add > the GSI functionality I have in my patch. The only thing we have not > discussed so far is, how do interrupts get routed when _PIC is not set > to 1, aka the "boot case"? Here Avi is correct, PIC only support activehigh level-triggered interrupt. From spec, PCI device uses activelow level-triggered interrupt. I guess it is interrupt Link to reverse it. > >> >> { 5, 10, 11 } >> }) >> Method (_STA, 0, NotSerialized) > > [...snip...] > >> >> diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c >> index a23a466..df0ea33 100644 >> --- a/qemu/hw/pci.c >> +++ b/qemu/hw/pci.c >> @@ -548,7 +548,7 @@ static void pci_set_irq(void *opaque, int >> irq_num, int level) pci_dev = bus->parent_dev; >> } >> bus->irq_count[irq_num] += change; >> - bus->set_irq(bus->irq_opaque, irq_num, bus->irq_count[irq_num] >> != 0); + bus->set_irq(bus->irq_opaque, irq_num, !(bus- >>> irq_count[irq_num] != >> 0)); >> } > > I don't think this is the right place to do it. Probably it would be a > better idea to have either the APIC emulation know that the levels are > reversed or make every device trigger ActiveLow interrupts. Maybe it not a right place:-) Making every device trigger activelow interrupt introduce a lot of modifications, it's last resort. APIC emulation is inside kernel now, it is not a good idea to introduce qemu-special piece of code into kernel as Avi mentioned. Thanks. ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-10 7:57 ` [RFC]RE: " Xu, Anthony 2008-06-11 14:24 ` Alexander Graf @ 2008-06-11 16:16 ` Marcelo Tosatti 2008-06-12 16:19 ` Xu, Anthony 2008-06-12 12:34 ` Avi Kivity 2 siblings, 1 reply; 21+ messages in thread From: Marcelo Tosatti @ 2008-06-11 16:16 UTC (permalink / raw) To: Xu, Anthony; +Cc: Alexander Graf, Avi Kivity, Jes Sorensen, kvm, kvm-ia64 On Tue, Jun 10, 2008 at 03:57:30PM +0800, Xu, Anthony wrote: > diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c > index a23a466..df0ea33 100644 > --- a/qemu/hw/pci.c > +++ b/qemu/hw/pci.c > @@ -548,7 +548,7 @@ static void pci_set_irq(void *opaque, int irq_num, > int level) > pci_dev = bus->parent_dev; > } > bus->irq_count[irq_num] += change; > - bus->set_irq(bus->irq_opaque, irq_num, bus->irq_count[irq_num] != > 0); > + bus->set_irq(bus->irq_opaque, irq_num, !(bus->irq_count[irq_num] != > 0)); > } Ideally this should detect if the PCI interrupts are active low or high and act accordingly (so that older BIOSes still work and no assumptions are made). Will probably have to be done anyway for merging into QEMU. One way of doing it would be to talk to ACPI via a new SystemIO region, but there must be an easier way. ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-11 16:16 ` Marcelo Tosatti @ 2008-06-12 16:19 ` Xu, Anthony 0 siblings, 0 replies; 21+ messages in thread From: Xu, Anthony @ 2008-06-12 16:19 UTC (permalink / raw) To: Marcelo Tosatti; +Cc: Alexander Graf, Avi Kivity, Jes Sorensen, kvm, kvm-ia64 Marcelo Tosatti wrote: > On Tue, Jun 10, 2008 at 03:57:30PM +0800, Xu, Anthony wrote: >> diff --git a/qemu/hw/pci.c b/qemu/hw/pci.c >> index a23a466..df0ea33 100644 >> --- a/qemu/hw/pci.c >> +++ b/qemu/hw/pci.c >> @@ -548,7 +548,7 @@ static void pci_set_irq(void *opaque, int >> irq_num, int level) pci_dev = bus->parent_dev; >> } >> bus->irq_count[irq_num] += change; >> - bus->set_irq(bus->irq_opaque, irq_num, bus->irq_count[irq_num] >> != 0); + bus->set_irq(bus->irq_opaque, irq_num, >> !(bus->irq_count[irq_num] != 0)); } > > Ideally this should detect if the PCI interrupts are active low or > high and act accordingly (so that older BIOSes still work and no > assumptions are made). Will probably have to be done anyway for > merging into QEMU. As I mentioned before, all PCI devices in Qemu used active high level-triggerred interrupt. While IOAPIC pin with fixed connection to slot uses active low level-triggerred interrupt by default. Seems we need to find a place to reverse it. > > One way of doing it would be to talk to ACPI via a new SystemIO > region, but there must be an easier way. Good point, qemu needs to know whether it is in PIC mode or in APIC mode. We need define the communication mechanism, SystemIO region is one choice. Thanks, Anthony ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-10 7:57 ` [RFC]RE: " Xu, Anthony 2008-06-11 14:24 ` Alexander Graf 2008-06-11 16:16 ` Marcelo Tosatti @ 2008-06-12 12:34 ` Avi Kivity 2008-06-12 16:20 ` Xu, Anthony 2 siblings, 1 reply; 21+ messages in thread From: Avi Kivity @ 2008-06-12 12:34 UTC (permalink / raw) To: Xu, Anthony; +Cc: Alexander Graf, Jes Sorensen, kvm, kvm-ia64 Xu, Anthony wrote: > Thanks for comments > > Basically we are on the same page, while I didn't find your patch about > irq assignment, can you post it in this thread again, thx? > Below patch makes all PCI devices use level-trigger , active low > interrupt, it worked well when running linux guest, I didn't try windows > guest yet. > (didn't have windows image in hand) > > Please comment! > > If this is acceptabled, we can figure out how to use IOAPIC in kvm/ia32 > based on this. Which will reduce irq sharing dramatically. > > > Thanks, > Anthony > > > > diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl > index 21fc76a..4b5e824 100755 > --- a/bios/acpi-dsdt.dsl > +++ b/bios/acpi-dsdt.dsl > @@ -974,7 +974,7 @@ DefinitionBlock ( > Name(_HID, EISAID("PNP0C0F")) // PCI interrupt link > Name(_UID, 1) > Name(_PRS, ResourceTemplate(){ > - Interrupt (, Level, ActiveHigh, Shared) > + Interrupt (, Level, ActiveLow, Shared) > { 5, 10, 11 } > I think this will fail for guests which use the PIC, since the PIC is always active high. For x86 the interrupts will have to be active high since that's how piix works. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. ^ permalink raw reply [flat|nested] 21+ messages in thread
* RE: [RFC]RE: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-12 12:34 ` Avi Kivity @ 2008-06-12 16:20 ` Xu, Anthony 0 siblings, 0 replies; 21+ messages in thread From: Xu, Anthony @ 2008-06-12 16:20 UTC (permalink / raw) To: Avi Kivity; +Cc: Alexander Graf, Jes Sorensen, kvm, kvm-ia64 Avi Kivity wrote: > Xu, Anthony wrote: >> Thanks for comments >> >> Basically we are on the same page, while I didn't find your patch >> about irq assignment, can you post it in this thread again, thx? >> Below patch makes all PCI devices use level-trigger , active low >> interrupt, it worked well when running linux guest, I didn't try >> windows guest yet. (didn't have windows image in hand) >> >> Please comment! >> >> If this is acceptabled, we can figure out how to use IOAPIC in >> kvm/ia32 based on this. Which will reduce irq sharing dramatically. >> >> >> Thanks, >> Anthony >> >> >> >> diff --git a/bios/acpi-dsdt.dsl b/bios/acpi-dsdt.dsl >> index 21fc76a..4b5e824 100755 >> --- a/bios/acpi-dsdt.dsl >> +++ b/bios/acpi-dsdt.dsl >> @@ -974,7 +974,7 @@ DefinitionBlock ( >> Name(_HID, EISAID("PNP0C0F")) // PCI interrupt >> link Name(_UID, 1) >> Name(_PRS, ResourceTemplate(){ >> - Interrupt (, Level, ActiveHigh, Shared) >> + Interrupt (, Level, ActiveLow, Shared) >> { 5, 10, 11 } >> > > I think this will fail for guests which use the PIC, since the PIC is > always active high. > > For x86 the interrupts will have to be active high since that's how > piix works. Understand this concern. Anthony ^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH] kvm-ia64 irq assignment 1/2 kernel 2008-06-10 6:33 ` Xu, Anthony 2008-06-10 7:25 ` Alexander Graf @ 2008-06-12 12:30 ` Avi Kivity 1 sibling, 0 replies; 21+ messages in thread From: Avi Kivity @ 2008-06-12 12:30 UTC (permalink / raw) To: Xu, Anthony; +Cc: Jes Sorensen, kvm, kvm-ia64 Xu, Anthony wrote: > Avi Kivity wrote: > >> I suggest modifying the firmware to report the interrupts as active >> high. Since Xen does not emulate polarity, the change will not affect >> it and the firmware can continue to be shared. I'd also recommend >> fixing Xen to emulate the polarity correctly, if possible. >> > > Thanks for your comments > I agree modifying common code is not a good method. > > While your suggestion seems be infeasible too. > According to acpi spec, only irq <=15 can be configured, such as trigger > level, polarity. > For irq >15 , means connect to IOAPIC directly, it can't be configured, > it must be level triger, active low. > Yes. > I can't find any mechanism in firmware to configure irqs (> 15). Please > enlighten me if you have. > > Okay. In any case we should emulate hardware as closely as possible to reality, so mu suggestion wasn't a good one. > I think below scheme is feasible, > 1. all PCI devices in Qemu uses level trigger, active low interrupt. > (not include ide, even though it is a PCI device, it uses legacy > interrupt mechanism) > > 2. in Guest Firmware, all PCI devices' interrupts are configured as > level trigger, active low > for KVM/IA32 Guest firmware, just a little modifications > Name(_PRS, ResourceTemplate(){ > Interrupt (, Level, ActiveHigh, Shared)--> Interrupt > (, Level, ActiveLow, Shared) > > > There are some modifications in Qemu, But I think it's a worthwhile, > it's a thoroghly solution both for KVM/IA32 and KVM/IA64. > > I agree. Note that the piix chipset used on x86 inverts the pci interrupts again so they become active high. But for ioapic mode we may be able to use active low interrupts. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain. ^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2008-07-03 1:22 UTC | newest] Thread overview: 21+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2008-06-06 15:58 [PATCH] kvm-ia64 irq assignment 1/2 kernel Xu, Anthony 2008-06-06 19:58 ` Avi Kivity 2008-06-09 8:58 ` Alexander Graf 2008-06-09 9:16 ` Alexander Graf 2008-06-12 12:24 ` Avi Kivity 2008-06-10 6:33 ` Xu, Anthony 2008-06-10 7:25 ` Alexander Graf 2008-06-10 7:57 ` [RFC]RE: " Xu, Anthony 2008-06-11 14:24 ` Alexander Graf 2008-06-11 16:02 ` Marcelo Tosatti 2008-06-12 16:15 ` Xu, Anthony 2008-06-14 22:58 ` Marcelo Tosatti 2008-06-16 1:13 ` Xu, Anthony 2008-07-02 9:11 ` Jes Sorensen 2008-07-03 1:22 ` Xu, Anthony 2008-06-12 16:08 ` Xu, Anthony 2008-06-11 16:16 ` Marcelo Tosatti 2008-06-12 16:19 ` Xu, Anthony 2008-06-12 12:34 ` Avi Kivity 2008-06-12 16:20 ` Xu, Anthony 2008-06-12 12:30 ` Avi Kivity
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox