kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* RFC: ioapic polarity vs. qemu os-x guest
       [not found] <20140130204423.GK29329@ERROL.INI.CMU.EDU>
@ 2014-02-11 18:23 ` Gabriel L. Somlo
  2014-02-11 19:54   ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-11 18:23 UTC (permalink / raw)
  To: kvm, qemu-devel; +Cc: mst, eddie.dong, agraf

Hi,

I'm trying to get OS X to work as a QEMU guest, and one of the few
remaining "mysteries" I need to solve is that the OS X guest hangs
during boot, waiting for its boot disk to be available, unless the
following KVM patch is applied:


diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index ce9ed99..1539d37 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
 	irq_level = __kvm_irq_line_state(&ioapic->irq_states[irq],
 					 irq_source_id, level);
 	entry = ioapic->redirtbl[irq];
-	irq_level ^= entry.fields.polarity;
 	if (!irq_level) {
 		ioapic->irr &= ~mask;
 		ret = 1;
--


After digging around the KVM source for a bit, and printk-ing things
from Windows 7, Fedora 20, and OS X (10.9), I figured out the following:


1. Edge-triggered interrupts are invariably unaffected by the xor line
being removed by the patch. On all three guest types, edge-triggered
interrupts have polarity set to 0, so the xor is essentially a no-op,
and we can forget about it altogether.


2. Windows and Linux always configure all level-triggered interrupts
with polarity 0 (active-high, consistent with QEMU's ACPI/DSDT, in
particular q35-acpi-dsdt.dsl, which is what I'm using with -M q35).
As such, on Windows and Linux, the xor line in question is still a
no-op.


3. OS X (all versions I tried, at least since 10.5/Leopard) always
configures all level-triggered interrupts with polarity 1 (active-low),
regardless of what the QEMU DSDT says. As such, the xor line acts as
a negation of "irq_level", which at first glance sounds reasonable.

However: when KVM negates "irq_level" due to "polarity == 1", the OS X
guest hangs during boot.

OS X works fine when "polarity == 1" is ignored (with the xor line
commented out).

This may be another instance (similar to how OS X didn't use to check
with CPUID regarding monitor/mwait instruction availability) where
apple devs know that any of their supported hardware advertises
active-low in the DSDT, so no need to check, just hardcode that
assumption... :)


4. With s/ActiveHigh/ActiveLow/ in QEMU's q35-acpi-dsdt.dsl, Linux
actually switches to "polarity == 1" (active-low), and works fine
*with the xor line removed* !!!. With the xor line left intact (i.e.
without the above patch), the active-low fedora guest worked extremely
poorly, and printed out multiple error messages during boot:

	irq XX: nobody cared (try booting with the "irqpoll" option)
	...
	Disabling IRQ #XX

for XX in [16, 18, 19, ...].


So, right now, I'm wondering about the following:


1. Regarding KVM and the polarity xor line in the patch above: Does
anyone have experience with any *other* guests which insist on setting
level-triggered interrupt polarity to 1/active-low ? Is that xor line
actually doing anything useful in practice, for any other guest, on
either QEMU or any other platform ?


2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
has a hardcoded assumption re. "polarity == 0", or active-high, for
level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
and a bunch of other files, but couldn't isolate anything that I could
"flip" to fix things in userspace.


Any ideas or suggestions about the appropriate way to move forward would
be much appreciated !!!


Thanks much,
--Gabriel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-11 18:23 ` RFC: ioapic polarity vs. qemu os-x guest Gabriel L. Somlo
@ 2014-02-11 19:54   ` Michael S. Tsirkin
  2014-02-11 21:35     ` Gabriel L. Somlo
  2014-02-14 21:13     ` Gabriel L. Somlo
  0 siblings, 2 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2014-02-11 19:54 UTC (permalink / raw)
  To: Gabriel L. Somlo; +Cc: kvm, qemu-devel, eddie.dong, agraf

On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
> Hi,
> 
> I'm trying to get OS X to work as a QEMU guest, and one of the few
> remaining "mysteries" I need to solve is that the OS X guest hangs
> during boot, waiting for its boot disk to be available, unless the
> following KVM patch is applied:
> 
> 
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index ce9ed99..1539d37 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
>  	irq_level = __kvm_irq_line_state(&ioapic->irq_states[irq],
>  					 irq_source_id, level);
>  	entry = ioapic->redirtbl[irq];
> -	irq_level ^= entry.fields.polarity;
>  	if (!irq_level) {
>  		ioapic->irr &= ~mask;
>  		ret = 1;
> --
> 
> 
> After digging around the KVM source for a bit, and printk-ing things
> from Windows 7, Fedora 20, and OS X (10.9), I figured out the following:
> 
> 
> 1. Edge-triggered interrupts are invariably unaffected by the xor line
> being removed by the patch. On all three guest types, edge-triggered
> interrupts have polarity set to 0, so the xor is essentially a no-op,
> and we can forget about it altogether.
> 
> 
> 2. Windows and Linux always configure all level-triggered interrupts
> with polarity 0 (active-high, consistent with QEMU's ACPI/DSDT, in
> particular q35-acpi-dsdt.dsl, which is what I'm using with -M q35).
> As such, on Windows and Linux, the xor line in question is still a
> no-op.
> 
> 
> 3. OS X (all versions I tried, at least since 10.5/Leopard) always
> configures all level-triggered interrupts with polarity 1 (active-low),
> regardless of what the QEMU DSDT says. As such, the xor line acts as
> a negation of "irq_level", which at first glance sounds reasonable.
> 
> However: when KVM negates "irq_level" due to "polarity == 1", the OS X
> guest hangs during boot.
> 
> OS X works fine when "polarity == 1" is ignored (with the xor line
> commented out).
> 
> This may be another instance (similar to how OS X didn't use to check
> with CPUID regarding monitor/mwait instruction availability) where
> apple devs know that any of their supported hardware advertises
> active-low in the DSDT, so no need to check, just hardcode that
> assumption... :)
> 
> 
> 4. With s/ActiveHigh/ActiveLow/ in QEMU's q35-acpi-dsdt.dsl, Linux
> actually switches to "polarity == 1" (active-low), and works fine
> *with the xor line removed* !!!. With the xor line left intact (i.e.
> without the above patch), the active-low fedora guest worked extremely
> poorly, and printed out multiple error messages during boot:
> 
> 	irq XX: nobody cared (try booting with the "irqpoll" option)
> 	...
> 	Disabling IRQ #XX
> 
> for XX in [16, 18, 19, ...].
> 
> 
> So, right now, I'm wondering about the following:
> 
> 
> 1. Regarding KVM and the polarity xor line in the patch above: Does
> anyone have experience with any *other* guests which insist on setting
> level-triggered interrupt polarity to 1/active-low ? Is that xor line
> actually doing anything useful in practice, for any other guest, on
> either QEMU or any other platform ?
> 
> 
> 2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
> has a hardcoded assumption re. "polarity == 0", or active-high, for
> level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
> and a bunch of other files, but couldn't isolate anything that I could
> "flip" to fix things in userspace.
> 
> 
> Any ideas or suggestions about the appropriate way to move forward would
> be much appreciated !!!
> 
> 
> Thanks much,
> --Gabriel

I think changing ACPI is the right thing to
do really. But we'll need to fix some things
first of course.

I think it's PC Q35 that has this assumption.
hw/i386/pc_q35.c

        gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
                                 GSI_NUM_PINS);

kvm_pc_gsi_handler simply forwards interrupts to kvm.

and

hw/isa/lpc_ich9.c
static void ich9_lpc_update_pic(ICH9LPCState *lpc, int pic_irq)
{
    int i, pic_level;

    /* The pic level is the logical OR of all the PCI irqs mapped to it */
    /* The pic level is the logical OR of all the PCI irqs mapped to it
 * */
    pic_level = 0;
    for (i = 0; i < ICH9_LPC_NB_PIRQS; i++) {
        int tmp_irq;
        int tmp_dis;
        ich9_lpc_pic_irq(lpc, i, &tmp_irq, &tmp_dis);
        if (!tmp_dis && pic_irq == tmp_irq) {
            pic_level |= pci_bus_get_irq_level(lpc->d.bus, i);
        }
    }

so somewhere we need to flip it, I am guessing in ich9
along the lines of:

-    pic_level = 0;
-            pic_level |= pci_bus_get_irq_level(lpc->d.bus, i);
+    pic_level = 1;
+            pic_level &= !pci_bus_get_irq_level(lpc->d.bus, i);




-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-11 19:54   ` Michael S. Tsirkin
@ 2014-02-11 21:35     ` Gabriel L. Somlo
  2014-02-14 21:13     ` Gabriel L. Somlo
  1 sibling, 0 replies; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-11 21:35 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, qemu-devel, eddie.dong, agraf

On Tue, Feb 11, 2014 at 09:54:44PM +0200, Michael S. Tsirkin wrote:
> On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
> > I'm trying to get OS X to work as a QEMU guest, and one of the few
> > remaining "mysteries" I need to solve is that the OS X guest hangs
> > during boot, waiting for its boot disk to be available, unless the
> > following KVM patch is applied:
> > [...]
> > 2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
> > has a hardcoded assumption re. "polarity == 0", or active-high, for
> > level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
> > and a bunch of other files, but couldn't isolate anything that I could
> > "flip" to fix things in userspace.
> > 
> > 
> > Any ideas or suggestions about the appropriate way to move forward would
> > be much appreciated !!!
> > 
> > 
> > Thanks much,
> > --Gabriel
> 
> I think changing ACPI is the right thing to
> do really. But we'll need to fix some things
> first of course.
> 
> I think it's PC Q35 that has this assumption.
> hw/i386/pc_q35.c
> 
>         gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
>                                  GSI_NUM_PINS);
> 
> kvm_pc_gsi_handler simply forwards interrupts to kvm.
> 
> and
> 
> hw/isa/lpc_ich9.c
> static void ich9_lpc_update_pic(ICH9LPCState *lpc, int pic_irq)
> {
>     int i, pic_level;
> 
>     /* The pic level is the logical OR of all the PCI irqs mapped to it */
>     /* The pic level is the logical OR of all the PCI irqs mapped to it
>  * */
>     pic_level = 0;
>     for (i = 0; i < ICH9_LPC_NB_PIRQS; i++) {
>         int tmp_irq;
>         int tmp_dis;
>         ich9_lpc_pic_irq(lpc, i, &tmp_irq, &tmp_dis);
>         if (!tmp_dis && pic_irq == tmp_irq) {
>             pic_level |= pci_bus_get_irq_level(lpc->d.bus, i);
>         }
>     }
> 
> so somewhere we need to flip it, I am guessing in ich9
> along the lines of:
> 
> -    pic_level = 0;
> -            pic_level |= pci_bus_get_irq_level(lpc->d.bus, i);
> +    pic_level = 1;
> +            pic_level &= !pci_bus_get_irq_level(lpc->d.bus, i);

I think now we're on to something!

I managed to boot OS X on q35 with absolutely no kernel patches, but
Linux still hated it ("irqXX: nobody cared"). At least now I know what
I'm looking for, so I'll try to come up with a way to flip
level-triggered polarity to ActiveLow across all of i386, in a way
that works for Linux and Windows guests as well.

Thanks again for getting me unstuck!
--Gabriel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-11 19:54   ` Michael S. Tsirkin
  2014-02-11 21:35     ` Gabriel L. Somlo
@ 2014-02-14 21:13     ` Gabriel L. Somlo
  2014-02-14 21:21       ` Alexander Graf
                         ` (2 more replies)
  1 sibling, 3 replies; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-14 21:13 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, qemu-devel, eddie.dong, agraf

On Tue, Feb 11, 2014 at 09:54:44PM +0200, Michael S. Tsirkin wrote:
> On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
> > 1. Regarding KVM and the polarity xor line in the patch above: Does
> > anyone have experience with any *other* guests which insist on setting
> > level-triggered interrupt polarity to 1/active-low ? Is that xor line
> > actually doing anything useful in practice, for any other guest, on
> > either QEMU or any other platform ?
> > 
> > 
> > 2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
> > has a hardcoded assumption re. "polarity == 0", or active-high, for
> > level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
> > and a bunch of other files, but couldn't isolate anything that I could
> > "flip" to fix things in userspace.
> > 
> > 
> > Any ideas or suggestions about the appropriate way to move forward would
> > be much appreciated !!!
> > 
> > 
> > Thanks much,
> > --Gabriel
> 
> I think changing ACPI is the right thing to
> do really. But we'll need to fix some things
> first of course.

So I followed your advice, and was able to boot OS X just fine (but
booting Linux after this patch still resulted in multiple "no one
cared" complaints on IRQs 17, 18, 19, etc.:

diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl
index d618e9e..9c52f64 100644
--- a/hw/i386/q35-acpi-dsdt.dsl
+++ b/hw/i386/q35-acpi-dsdt.dsl
@@ -353,7 +353,7 @@ DefinitionBlock (
         Method(IQCR, 1, Serialized) {
             // _CRS method - get current settings
             Name(PRR0, ResourceTemplate() {
-                Interrupt(, Level, ActiveHigh, Shared) { 0 }
+                Interrupt(, Level, ActiveLow, Shared) { 0 }
             })
             CreateDWordField(PRR0, 0x05, PRRI)
             Store(And(Arg0, 0x0F), PRRI)
@@ -365,7 +365,7 @@ DefinitionBlock (
             Name(_HID, EISAID("PNP0C0F"))                       \
             Name(_UID, uid)                                     \
             Name(_PRS, ResourceTemplate() {                     \
-                Interrupt(, Level, ActiveHigh, Shared) {        \
+                Interrupt(, Level, ActiveLow, Shared) {        \
                     5, 10, 11                                   \
                 }                                               \
             })                                                  \
@@ -398,12 +398,12 @@ DefinitionBlock (
             Name(_HID, EISAID("PNP0C0F"))                       \
             Name(_UID, uid)                                     \
             Name(_PRS, ResourceTemplate() {                     \
-                Interrupt(, Level, ActiveHigh, Shared) {        \
+                Interrupt(, Level, ActiveLow, Shared) {        \
                     gsi                                         \
                 }                                               \
             })                                                  \
             Name(_CRS, ResourceTemplate() {                     \
-                Interrupt(, Level, ActiveHigh, Shared) {        \
+                Interrupt(, Level, ActiveLow, Shared) {        \
                     gsi                                         \
                 }                                               \
             })                                                  \
diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
index 51ce12d..fe1527a 100644
--- a/hw/isa/lpc_ich9.c
+++ b/hw/isa/lpc_ich9.c
@@ -206,17 +206,17 @@ static void ich9_lpc_update_pic(ICH9LPCState *lpc, int pic_irq)
     int i, pic_level;
 
     /* The pic level is the logical OR of all the PCI irqs mapped to it */
-    pic_level = 0;
+    pic_level = 1;
     for (i = 0; i < ICH9_LPC_NB_PIRQS; i++) {
         int tmp_irq;
         int tmp_dis;
         ich9_lpc_pic_irq(lpc, i, &tmp_irq, &tmp_dis);
         if (!tmp_dis && pic_irq == tmp_irq) {
-            pic_level |= pci_bus_get_irq_level(lpc->d.bus, i);
+            pic_level &= !pci_bus_get_irq_level(lpc->d.bus, i);
         }
     }
     if (pic_irq == ich9_lpc_sci_irq(lpc)) {
-        pic_level |= lpc->sci_level;
+        pic_level &= !lpc->sci_level;
     }
 
     qemu_set_irq(lpc->pic[pic_irq], pic_level);
--

However, even on OS X, the Ethernet (e1000) card won't link up at all.
Fixing that requires another patch:

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 58ba93b..c7a2c07 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -301,7 +301,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
     s->mac_reg[ICS] = val;
 
     pending_ints = (s->mac_reg[IMS] & s->mac_reg[ICR]);
-    if (!s->mit_irq_level && pending_ints) {
+    if (s->mit_irq_level && pending_ints) {
         /*
          * Here we detect a potential raising edge. We postpone raising the
          * interrupt line if we are inside the mitigation delay window
@@ -339,7 +339,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
         }
     }
 
-    s->mit_irq_level = (pending_ints != 0);
+    s->mit_irq_level = (pending_ints == 0);
     pci_set_irq(d, s->mit_irq_level);
 }
 
@@ -393,7 +393,7 @@ static void e1000_reset(void *opaque)
     timer_del(d->autoneg_timer);
     timer_del(d->mit_timer);
     d->mit_timer_on = 0;
-    d->mit_irq_level = 0;
+    d->mit_irq_level = 1;
     d->mit_ide = 0;
     memset(d->phy_reg, 0, sizeof d->phy_reg);
     memmove(d->phy_reg, phy_reg_init, sizeof phy_reg_init);
@@ -1324,7 +1324,7 @@ static int e1000_post_load(void *opaque, int version_id)
     if (!(s->compat_flags & E1000_FLAG_MIT)) {
         s->mac_reg[ITR] = s->mac_reg[RDTR] = s->mac_reg[RADV] =
             s->mac_reg[TADV] = 0;
-        s->mit_irq_level = false;
+        s->mit_irq_level = true;
     }
     s->mit_ide = 0;
     s->mit_timer_on = false;
---

At this point, I'm beginning to realize that the "ActiveHigh"
assumption is rather pervasively baked in throughout the QEMU
source code...

And I'm wondering whether a ton of changes to make it either
programatically configurable or change the hard-codded assumption
to "ActiveLow" would be feasible, welcome, etc... I personally
would prefer configurable
(e.g. "-machine foo,kernel_irqchip=on,polarity=high", or some such).

Thanks much for any ideas,
--Gabriel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-14 21:13     ` Gabriel L. Somlo
@ 2014-02-14 21:21       ` Alexander Graf
  2014-02-14 22:06         ` Gabriel L. Somlo
  2014-02-16 11:34       ` RFC: ioapic polarity vs. qemu os-x guest Michael S. Tsirkin
  2014-02-16 11:37       ` Michael S. Tsirkin
  2 siblings, 1 reply; 32+ messages in thread
From: Alexander Graf @ 2014-02-14 21:21 UTC (permalink / raw)
  To: Gabriel L. Somlo
  Cc: Michael S. Tsirkin, kvm@vger.kernel.org, qemu-devel@nongnu.org,
	eddie.dong@intel.com



> Am 14.02.2014 um 22:13 schrieb "Gabriel L. Somlo" <gsomlo@gmail.com>:
> 
>> On Tue, Feb 11, 2014 at 09:54:44PM +0200, Michael S. Tsirkin wrote:
>>> On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
>>> 1. Regarding KVM and the polarity xor line in the patch above: Does
>>> anyone have experience with any *other* guests which insist on setting
>>> level-triggered interrupt polarity to 1/active-low ? Is that xor line
>>> actually doing anything useful in practice, for any other guest, on
>>> either QEMU or any other platform ?
>>> 
>>> 
>>> 2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
>>> has a hardcoded assumption re. "polarity == 0", or active-high, for
>>> level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
>>> and a bunch of other files, but couldn't isolate anything that I could
>>> "flip" to fix things in userspace.
>>> 
>>> 
>>> Any ideas or suggestions about the appropriate way to move forward would
>>> be much appreciated !!!
>>> 
>>> 
>>> Thanks much,
>>> --Gabriel
>> 
>> I think changing ACPI is the right thing to
>> do really. But we'll need to fix some things
>> first of course.
> 
> So I followed your advice, and was able to boot OS X just fine (but
> booting Linux after this patch still resulted in multiple "no one
> cared" complaints on IRQs 17, 18, 19, etc.:
> 
> diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl
> index d618e9e..9c52f64 100644
> --- a/hw/i386/q35-acpi-dsdt.dsl
> +++ b/hw/i386/q35-acpi-dsdt.dsl
> @@ -353,7 +353,7 @@ DefinitionBlock (
>         Method(IQCR, 1, Serialized) {
>             // _CRS method - get current settings
>             Name(PRR0, ResourceTemplate() {
> -                Interrupt(, Level, ActiveHigh, Shared) { 0 }
> +                Interrupt(, Level, ActiveLow, Shared) { 0 }
>             })
>             CreateDWordField(PRR0, 0x05, PRRI)
>             Store(And(Arg0, 0x0F), PRRI)
> @@ -365,7 +365,7 @@ DefinitionBlock (
>             Name(_HID, EISAID("PNP0C0F"))                       \
>             Name(_UID, uid)                                     \
>             Name(_PRS, ResourceTemplate() {                     \
> -                Interrupt(, Level, ActiveHigh, Shared) {        \
> +                Interrupt(, Level, ActiveLow, Shared) {        \
>                     5, 10, 11                                   \
>                 }                                               \
>             })                                                  \
> @@ -398,12 +398,12 @@ DefinitionBlock (
>             Name(_HID, EISAID("PNP0C0F"))                       \
>             Name(_UID, uid)                                     \
>             Name(_PRS, ResourceTemplate() {                     \
> -                Interrupt(, Level, ActiveHigh, Shared) {        \
> +                Interrupt(, Level, ActiveLow, Shared) {        \
>                     gsi                                         \
>                 }                                               \
>             })                                                  \
>             Name(_CRS, ResourceTemplate() {                     \
> -                Interrupt(, Level, ActiveHigh, Shared) {        \
> +                Interrupt(, Level, ActiveLow, Shared) {        \
>                     gsi                                         \
>                 }                                               \
>             })                                                  \
> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> index 51ce12d..fe1527a 100644
> --- a/hw/isa/lpc_ich9.c
> +++ b/hw/isa/lpc_ich9.c
> @@ -206,17 +206,17 @@ static void ich9_lpc_update_pic(ICH9LPCState *lpc, int pic_irq)
>     int i, pic_level;
> 
>     /* The pic level is the logical OR of all the PCI irqs mapped to it */
> -    pic_level = 0;
> +    pic_level = 1;
>     for (i = 0; i < ICH9_LPC_NB_PIRQS; i++) {
>         int tmp_irq;
>         int tmp_dis;
>         ich9_lpc_pic_irq(lpc, i, &tmp_irq, &tmp_dis);
>         if (!tmp_dis && pic_irq == tmp_irq) {
> -            pic_level |= pci_bus_get_irq_level(lpc->d.bus, i);
> +            pic_level &= !pci_bus_get_irq_level(lpc->d.bus, i);
>         }
>     }
>     if (pic_irq == ich9_lpc_sci_irq(lpc)) {
> -        pic_level |= lpc->sci_level;
> +        pic_level &= !lpc->sci_level;
>     }
> 
>     qemu_set_irq(lpc->pic[pic_irq], pic_level);
> --
> 
> However, even on OS X, the Ethernet (e1000) card won't link up at all.
> Fixing that requires another patch:
> 
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index 58ba93b..c7a2c07 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -301,7 +301,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
>     s->mac_reg[ICS] = val;
> 
>     pending_ints = (s->mac_reg[IMS] & s->mac_reg[ICR]);
> -    if (!s->mit_irq_level && pending_ints) {
> +    if (s->mit_irq_level && pending_ints) {
>         /*
>          * Here we detect a potential raising edge. We postpone raising the
>          * interrupt line if we are inside the mitigation delay window
> @@ -339,7 +339,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
>         }
>     }
> 
> -    s->mit_irq_level = (pending_ints != 0);
> +    s->mit_irq_level = (pending_ints == 0);
>     pci_set_irq(d, s->mit_irq_level);
> }
> 
> @@ -393,7 +393,7 @@ static void e1000_reset(void *opaque)
>     timer_del(d->autoneg_timer);
>     timer_del(d->mit_timer);
>     d->mit_timer_on = 0;
> -    d->mit_irq_level = 0;
> +    d->mit_irq_level = 1;
>     d->mit_ide = 0;
>     memset(d->phy_reg, 0, sizeof d->phy_reg);
>     memmove(d->phy_reg, phy_reg_init, sizeof phy_reg_init);
> @@ -1324,7 +1324,7 @@ static int e1000_post_load(void *opaque, int version_id)
>     if (!(s->compat_flags & E1000_FLAG_MIT)) {
>         s->mac_reg[ITR] = s->mac_reg[RDTR] = s->mac_reg[RADV] =
>             s->mac_reg[TADV] = 0;
> -        s->mit_irq_level = false;
> +        s->mit_irq_level = true;
>     }
>     s->mit_ide = 0;
>     s->mit_timer_on = false;
> ---
> 
> At this point, I'm beginning to realize that the "ActiveHigh"
> assumption is rather pervasively baked in throughout the QEMU
> source code...
> 
> And I'm wondering whether a ton of changes to make it either
> programatically configurable or change the hard-codded assumption
> to "ActiveLow" would be feasible, welcome, etc... I personally
> would prefer configurable
> (e.g. "-machine foo,kernel_irqchip=on,polarity=high", or some such).

Can't you just turn the polarity around in the pci host adapter?

Alex

> 
> Thanks much for any ideas,
> --Gabriel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-14 21:21       ` Alexander Graf
@ 2014-02-14 22:06         ` Gabriel L. Somlo
  2014-02-14 22:13           ` Alexander Graf
  0 siblings, 1 reply; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-14 22:06 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Michael S. Tsirkin, kvm@vger.kernel.org, qemu-devel@nongnu.org,
	eddie.dong@intel.com

On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
> 
> Can't you just turn the polarity around in the pci host adapter?

I tried this:

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 1221f32..0e86d21 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
 
 static inline int pci_irq_state(PCIDevice *d, int irq_num)
 {
-	return (d->irq_state >> irq_num) & 0x1;
+	return !(d->irq_state >> irq_num) & 0x1;
 }
 
 static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
 {
 	d->irq_state &= ~(0x1 << irq_num);
-	d->irq_state |= level << irq_num;
+	d->irq_state &= ~(level << irq_num);
 }
 
 static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change)
@@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
     }
 
     for (i = 0; i < bus->nirq; i++) {
-        assert(bus->irq_count[i] == 0);
+        assert(bus->irq_count[i] != 0);
     }
 }
 
---

but now OS X freezes during boot right after

	[ PCI configuration begin ]
	[ PCI configuration end, bridges 1, devices 10 ]
	RTC: Only single RAM bank (128 bytes)

which all looks normal, except the process is supposed to continue on
from there and doesn't :)

On Linux, I get Fedora 20 live all the way up with no obvious/loud
complaints, but mouse and keyboard don't work at all...

I have to admit I'm a bit out of my depth here, though :)

--Gabriel

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-14 22:06         ` Gabriel L. Somlo
@ 2014-02-14 22:13           ` Alexander Graf
  2014-02-16 11:18             ` Michael S. Tsirkin
  2014-02-16 11:41             ` Michael S. Tsirkin
  0 siblings, 2 replies; 32+ messages in thread
From: Alexander Graf @ 2014-02-14 22:13 UTC (permalink / raw)
  To: Gabriel L. Somlo
  Cc: Michael S. Tsirkin, kvm@vger.kernel.org, qemu-devel@nongnu.org,
	eddie.dong@intel.com


On 14.02.2014, at 23:06, Gabriel L. Somlo <gsomlo@gmail.com> wrote:

> On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
>> 
>> Can't you just turn the polarity around in the pci host adapter?
> 
> I tried this:
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 1221f32..0e86d21 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
> 
> static inline int pci_irq_state(PCIDevice *d, int irq_num)
> {
> -	return (d->irq_state >> irq_num) & 0x1;
> +	return !(d->irq_state >> irq_num) & 0x1;
> }
> 
> static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
> {
> 	d->irq_state &= ~(0x1 << irq_num);
> -	d->irq_state |= level << irq_num;
> +	d->irq_state &= ~(level << irq_num);
> }
> 
> static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change)
> @@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
>     }
> 
>     for (i = 0; i < bus->nirq; i++) {
> -        assert(bus->irq_count[i] == 0);
> +        assert(bus->irq_count[i] != 0);
>     }
> }
> 
> ---
> 
> but now OS X freezes during boot right after
> 
> 	[ PCI configuration begin ]
> 	[ PCI configuration end, bridges 1, devices 10 ]
> 	RTC: Only single RAM bank (128 bytes)
> 
> which all looks normal, except the process is supposed to continue on
> from there and doesn't :)
> 
> On Linux, I get Fedora 20 live all the way up with no obvious/loud
> complaints, but mouse and keyboard don't work at all...
> 
> I have to admit I'm a bit out of my depth here, though :)

Yeah, another thing we have to take into account is vhost-net which generates IRQs directly through irqfd. I guess for those we'll have to configure the polarity in the irq routing table?


Alex


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-14 22:13           ` Alexander Graf
@ 2014-02-16 11:18             ` Michael S. Tsirkin
  2014-02-16 11:41             ` Michael S. Tsirkin
  1 sibling, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2014-02-16 11:18 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Gabriel L. Somlo, kvm@vger.kernel.org, qemu-devel@nongnu.org,
	eddie.dong@intel.com

On Fri, Feb 14, 2014 at 11:13:04PM +0100, Alexander Graf wrote:
> 
> On 14.02.2014, at 23:06, Gabriel L. Somlo <gsomlo@gmail.com> wrote:
> 
> > On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
> >> 
> >> Can't you just turn the polarity around in the pci host adapter?
> > 
> > I tried this:
> > 
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index 1221f32..0e86d21 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
> > 
> > static inline int pci_irq_state(PCIDevice *d, int irq_num)
> > {
> > -	return (d->irq_state >> irq_num) & 0x1;
> > +	return !(d->irq_state >> irq_num) & 0x1;
> > }
> > 
> > static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
> > {
> > 	d->irq_state &= ~(0x1 << irq_num);
> > -	d->irq_state |= level << irq_num;
> > +	d->irq_state &= ~(level << irq_num);
> > }
> > 
> > static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change)
> > @@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
> >     }
> > 
> >     for (i = 0; i < bus->nirq; i++) {
> > -        assert(bus->irq_count[i] == 0);
> > +        assert(bus->irq_count[i] != 0);
> >     }
> > }
> > 
> > ---
> > 
> > but now OS X freezes during boot right after
> > 
> > 	[ PCI configuration begin ]
> > 	[ PCI configuration end, bridges 1, devices 10 ]
> > 	RTC: Only single RAM bank (128 bytes)
> > 
> > which all looks normal, except the process is supposed to continue on
> > from there and doesn't :)
> > 
> > On Linux, I get Fedora 20 live all the way up with no obvious/loud
> > complaints, but mouse and keyboard don't work at all...
> > 
> > I have to admit I'm a bit out of my depth here, though :)
> 
> Yeah, another thing we have to take into account is vhost-net which generates IRQs directly through irqfd. I guess for those we'll have to configure the polarity in the irq routing table?
> 
> 
> Alex

This is using MSI-X interrupts which are edge though,
not going through IOAPIC at all.


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-14 21:13     ` Gabriel L. Somlo
  2014-02-14 21:21       ` Alexander Graf
@ 2014-02-16 11:34       ` Michael S. Tsirkin
  2014-02-16 15:12         ` [Qemu-devel] " Peter Maydell
  2014-02-16 11:37       ` Michael S. Tsirkin
  2 siblings, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2014-02-16 11:34 UTC (permalink / raw)
  To: Gabriel L. Somlo; +Cc: kvm, qemu-devel, eddie.dong, agraf

On Fri, Feb 14, 2014 at 04:13:11PM -0500, Gabriel L. Somlo wrote:
> On Tue, Feb 11, 2014 at 09:54:44PM +0200, Michael S. Tsirkin wrote:
> > On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
> > > 1. Regarding KVM and the polarity xor line in the patch above: Does
> > > anyone have experience with any *other* guests which insist on setting
> > > level-triggered interrupt polarity to 1/active-low ? Is that xor line
> > > actually doing anything useful in practice, for any other guest, on
> > > either QEMU or any other platform ?
> > > 
> > > 
> > > 2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
> > > has a hardcoded assumption re. "polarity == 0", or active-high, for
> > > level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
> > > and a bunch of other files, but couldn't isolate anything that I could
> > > "flip" to fix things in userspace.
> > > 
> > > 
> > > Any ideas or suggestions about the appropriate way to move forward would
> > > be much appreciated !!!
> > > 
> > > 
> > > Thanks much,
> > > --Gabriel
> > 
> > I think changing ACPI is the right thing to
> > do really. But we'll need to fix some things
> > first of course.
> 
> So I followed your advice, and was able to boot OS X just fine (but
> booting Linux after this patch still resulted in multiple "no one
> cared" complaints on IRQs 17, 18, 19, etc.:
> 
> diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl
> index d618e9e..9c52f64 100644
> --- a/hw/i386/q35-acpi-dsdt.dsl
> +++ b/hw/i386/q35-acpi-dsdt.dsl
> @@ -353,7 +353,7 @@ DefinitionBlock (
>          Method(IQCR, 1, Serialized) {
>              // _CRS method - get current settings
>              Name(PRR0, ResourceTemplate() {
> -                Interrupt(, Level, ActiveHigh, Shared) { 0 }
> +                Interrupt(, Level, ActiveLow, Shared) { 0 }
>              })
>              CreateDWordField(PRR0, 0x05, PRRI)
>              Store(And(Arg0, 0x0F), PRRI)
> @@ -365,7 +365,7 @@ DefinitionBlock (
>              Name(_HID, EISAID("PNP0C0F"))                       \
>              Name(_UID, uid)                                     \
>              Name(_PRS, ResourceTemplate() {                     \
> -                Interrupt(, Level, ActiveHigh, Shared) {        \
> +                Interrupt(, Level, ActiveLow, Shared) {        \
>                      5, 10, 11                                   \
>                  }                                               \
>              })                                                  \
> @@ -398,12 +398,12 @@ DefinitionBlock (
>              Name(_HID, EISAID("PNP0C0F"))                       \
>              Name(_UID, uid)                                     \
>              Name(_PRS, ResourceTemplate() {                     \
> -                Interrupt(, Level, ActiveHigh, Shared) {        \
> +                Interrupt(, Level, ActiveLow, Shared) {        \
>                      gsi                                         \
>                  }                                               \
>              })                                                  \
>              Name(_CRS, ResourceTemplate() {                     \
> -                Interrupt(, Level, ActiveHigh, Shared) {        \
> +                Interrupt(, Level, ActiveLow, Shared) {        \
>                      gsi                                         \
>                  }                                               \
>              })                                                  \
> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> index 51ce12d..fe1527a 100644
> --- a/hw/isa/lpc_ich9.c
> +++ b/hw/isa/lpc_ich9.c
> @@ -206,17 +206,17 @@ static void ich9_lpc_update_pic(ICH9LPCState *lpc, int pic_irq)
>      int i, pic_level;
>  
>      /* The pic level is the logical OR of all the PCI irqs mapped to it */
> -    pic_level = 0;
> +    pic_level = 1;
>      for (i = 0; i < ICH9_LPC_NB_PIRQS; i++) {
>          int tmp_irq;
>          int tmp_dis;
>          ich9_lpc_pic_irq(lpc, i, &tmp_irq, &tmp_dis);
>          if (!tmp_dis && pic_irq == tmp_irq) {
> -            pic_level |= pci_bus_get_irq_level(lpc->d.bus, i);
> +            pic_level &= !pci_bus_get_irq_level(lpc->d.bus, i);
>          }
>      }
>      if (pic_irq == ich9_lpc_sci_irq(lpc)) {
> -        pic_level |= lpc->sci_level;
> +        pic_level &= !lpc->sci_level;
>      }
>  
>      qemu_set_irq(lpc->pic[pic_irq], pic_level);
> --
> 
> However, even on OS X, the Ethernet (e1000) card won't link up at all.
> Fixing that requires another patch:
> 
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index 58ba93b..c7a2c07 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -301,7 +301,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
>      s->mac_reg[ICS] = val;
>  
>      pending_ints = (s->mac_reg[IMS] & s->mac_reg[ICR]);
> -    if (!s->mit_irq_level && pending_ints) {
> +    if (s->mit_irq_level && pending_ints) {
>          /*
>           * Here we detect a potential raising edge. We postpone raising the
>           * interrupt line if we are inside the mitigation delay window
> @@ -339,7 +339,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
>          }
>      }
>  
> -    s->mit_irq_level = (pending_ints != 0);
> +    s->mit_irq_level = (pending_ints == 0);
>      pci_set_irq(d, s->mit_irq_level);
>  }
>  
> @@ -393,7 +393,7 @@ static void e1000_reset(void *opaque)
>      timer_del(d->autoneg_timer);
>      timer_del(d->mit_timer);
>      d->mit_timer_on = 0;
> -    d->mit_irq_level = 0;
> +    d->mit_irq_level = 1;
>      d->mit_ide = 0;
>      memset(d->phy_reg, 0, sizeof d->phy_reg);
>      memmove(d->phy_reg, phy_reg_init, sizeof phy_reg_init);
> @@ -1324,7 +1324,7 @@ static int e1000_post_load(void *opaque, int version_id)
>      if (!(s->compat_flags & E1000_FLAG_MIT)) {
>          s->mac_reg[ITR] = s->mac_reg[RDTR] = s->mac_reg[RADV] =
>              s->mac_reg[TADV] = 0;
> -        s->mit_irq_level = false;
> +        s->mit_irq_level = true;
>      }
>      s->mit_ide = 0;
>      s->mit_timer_on = false;

Hmm no this is all wrong, from API point of view,
devices shoud not care about value of interrupt.
They just assert/deassert interrupts.
It so happens that 1 means assert 0 means deassert.
It's the PCI host or the PIC that needs to flip it.
It can't be host ATM because PIC does the logical OR
so it must be encoded as 1 == assert.

So how about simply
-      qemu_set_irq(lpc->pic[pic_irq], pic_level);
+      qemu_set_irq(lpc->pic[pic_irq], !pic_level);
plus the ACPI tweaks ... ?

> ---
> 
> At this point, I'm beginning to realize that the "ActiveHigh"
> assumption is rather pervasively baked in throughout the QEMU
> source code...
> 
> And I'm wondering whether a ton of changes to make it either
> programatically configurable or change the hard-codded assumption
> to "ActiveLow" would be feasible, welcome, etc... I personally
> would prefer configurable
> (e.g. "-machine foo,kernel_irqchip=on,polarity=high", or some such).
> 
> Thanks much for any ideas,
> --Gabriel

I don't think we need to make this configurable unless
there's real hardware that makes PCI active low.
Don't give up yet :)

-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-14 21:13     ` Gabriel L. Somlo
  2014-02-14 21:21       ` Alexander Graf
  2014-02-16 11:34       ` RFC: ioapic polarity vs. qemu os-x guest Michael S. Tsirkin
@ 2014-02-16 11:37       ` Michael S. Tsirkin
  2 siblings, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2014-02-16 11:37 UTC (permalink / raw)
  To: Gabriel L. Somlo; +Cc: kvm, qemu-devel, eddie.dong, agraf

On Fri, Feb 14, 2014 at 04:13:11PM -0500, Gabriel L. Somlo wrote:
> On Tue, Feb 11, 2014 at 09:54:44PM +0200, Michael S. Tsirkin wrote:
> > On Tue, Feb 11, 2014 at 01:23:31PM -0500, Gabriel L. Somlo wrote:
> > > 1. Regarding KVM and the polarity xor line in the patch above: Does
> > > anyone have experience with any *other* guests which insist on setting
> > > level-triggered interrupt polarity to 1/active-low ? Is that xor line
> > > actually doing anything useful in practice, for any other guest, on
> > > either QEMU or any other platform ?
> > > 
> > > 
> > > 2. Is there anything in QEMU (besides the ACPI DSDT .dsl files) which
> > > has a hardcoded assumption re. "polarity == 0", or active-high, for
> > > level-triggered interrupts? I tried to dig through hw/i386/kvm/ioapic.c
> > > and a bunch of other files, but couldn't isolate anything that I could
> > > "flip" to fix things in userspace.
> > > 
> > > 
> > > Any ideas or suggestions about the appropriate way to move forward would
> > > be much appreciated !!!
> > > 
> > > 
> > > Thanks much,
> > > --Gabriel
> > 
> > I think changing ACPI is the right thing to
> > do really. But we'll need to fix some things
> > first of course.
> 
> So I followed your advice, and was able to boot OS X just fine (but
> booting Linux after this patch still resulted in multiple "no one
> cared" complaints on IRQs 17, 18, 19, etc.:
> 
> diff --git a/hw/i386/q35-acpi-dsdt.dsl b/hw/i386/q35-acpi-dsdt.dsl
> index d618e9e..9c52f64 100644
> --- a/hw/i386/q35-acpi-dsdt.dsl
> +++ b/hw/i386/q35-acpi-dsdt.dsl
> @@ -353,7 +353,7 @@ DefinitionBlock (
>          Method(IQCR, 1, Serialized) {
>              // _CRS method - get current settings
>              Name(PRR0, ResourceTemplate() {
> -                Interrupt(, Level, ActiveHigh, Shared) { 0 }
> +                Interrupt(, Level, ActiveLow, Shared) { 0 }
>              })
>              CreateDWordField(PRR0, 0x05, PRRI)
>              Store(And(Arg0, 0x0F), PRRI)
> @@ -365,7 +365,7 @@ DefinitionBlock (
>              Name(_HID, EISAID("PNP0C0F"))                       \
>              Name(_UID, uid)                                     \
>              Name(_PRS, ResourceTemplate() {                     \
> -                Interrupt(, Level, ActiveHigh, Shared) {        \
> +                Interrupt(, Level, ActiveLow, Shared) {        \
>                      5, 10, 11                                   \
>                  }                                               \
>              })                                                  \
> @@ -398,12 +398,12 @@ DefinitionBlock (
>              Name(_HID, EISAID("PNP0C0F"))                       \
>              Name(_UID, uid)                                     \
>              Name(_PRS, ResourceTemplate() {                     \
> -                Interrupt(, Level, ActiveHigh, Shared) {        \
> +                Interrupt(, Level, ActiveLow, Shared) {        \
>                      gsi                                         \
>                  }                                               \
>              })                                                  \
>              Name(_CRS, ResourceTemplate() {                     \
> -                Interrupt(, Level, ActiveHigh, Shared) {        \
> +                Interrupt(, Level, ActiveLow, Shared) {        \
>                      gsi                                         \
>                  }                                               \
>              })                                                  \
> diff --git a/hw/isa/lpc_ich9.c b/hw/isa/lpc_ich9.c
> index 51ce12d..fe1527a 100644
> --- a/hw/isa/lpc_ich9.c
> +++ b/hw/isa/lpc_ich9.c
> @@ -206,17 +206,17 @@ static void ich9_lpc_update_pic(ICH9LPCState *lpc, int pic_irq)
>      int i, pic_level;
>  
>      /* The pic level is the logical OR of all the PCI irqs mapped to it */
> -    pic_level = 0;
> +    pic_level = 1;
>      for (i = 0; i < ICH9_LPC_NB_PIRQS; i++) {
>          int tmp_irq;
>          int tmp_dis;
>          ich9_lpc_pic_irq(lpc, i, &tmp_irq, &tmp_dis);
>          if (!tmp_dis && pic_irq == tmp_irq) {
> -            pic_level |= pci_bus_get_irq_level(lpc->d.bus, i);
> +            pic_level &= !pci_bus_get_irq_level(lpc->d.bus, i);
>          }
>      }
>      if (pic_irq == ich9_lpc_sci_irq(lpc)) {
> -        pic_level |= lpc->sci_level;
> +        pic_level &= !lpc->sci_level;
>      }
>  
>      qemu_set_irq(lpc->pic[pic_irq], pic_level);
> --
> 
> However, even on OS X, the Ethernet (e1000) card won't link up at all.
> Fixing that requires another patch:
> 
> diff --git a/hw/net/e1000.c b/hw/net/e1000.c
> index 58ba93b..c7a2c07 100644
> --- a/hw/net/e1000.c
> +++ b/hw/net/e1000.c
> @@ -301,7 +301,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
>      s->mac_reg[ICS] = val;
>  
>      pending_ints = (s->mac_reg[IMS] & s->mac_reg[ICR]);
> -    if (!s->mit_irq_level && pending_ints) {
> +    if (s->mit_irq_level && pending_ints) {
>          /*
>           * Here we detect a potential raising edge. We postpone raising the
>           * interrupt line if we are inside the mitigation delay window
> @@ -339,7 +339,7 @@ set_interrupt_cause(E1000State *s, int index, uint32_t val)
>          }
>      }
>  
> -    s->mit_irq_level = (pending_ints != 0);
> +    s->mit_irq_level = (pending_ints == 0);
>      pci_set_irq(d, s->mit_irq_level);
>  }
>  

If we really wanted to change it, we could
change pci_set_irq to reverse the polarity.
But I think doing this in PIC is cleaner.


> @@ -393,7 +393,7 @@ static void e1000_reset(void *opaque)
>      timer_del(d->autoneg_timer);
>      timer_del(d->mit_timer);
>      d->mit_timer_on = 0;
> -    d->mit_irq_level = 0;
> +    d->mit_irq_level = 1;
>      d->mit_ide = 0;
>      memset(d->phy_reg, 0, sizeof d->phy_reg);
>      memmove(d->phy_reg, phy_reg_init, sizeof phy_reg_init);
> @@ -1324,7 +1324,7 @@ static int e1000_post_load(void *opaque, int version_id)
>      if (!(s->compat_flags & E1000_FLAG_MIT)) {
>          s->mac_reg[ITR] = s->mac_reg[RDTR] = s->mac_reg[RADV] =
>              s->mac_reg[TADV] = 0;
> -        s->mit_irq_level = false;
> +        s->mit_irq_level = true;
>      }
>      s->mit_ide = 0;
>      s->mit_timer_on = false;
> ---
> 
> At this point, I'm beginning to realize that the "ActiveHigh"
> assumption is rather pervasively baked in throughout the QEMU
> source code...
> 
> And I'm wondering whether a ton of changes to make it either
> programatically configurable or change the hard-codded assumption
> to "ActiveLow" would be feasible, welcome, etc... I personally
> would prefer configurable
> (e.g. "-machine foo,kernel_irqchip=on,polarity=high", or some such).
> 
> Thanks much for any ideas,
> --Gabriel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-14 22:13           ` Alexander Graf
  2014-02-16 11:18             ` Michael S. Tsirkin
@ 2014-02-16 11:41             ` Michael S. Tsirkin
  2014-02-16 14:47               ` Alex Williamson
  1 sibling, 1 reply; 32+ messages in thread
From: Michael S. Tsirkin @ 2014-02-16 11:41 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Gabriel L. Somlo, kvm@vger.kernel.org, qemu-devel@nongnu.org,
	eddie.dong@intel.com, pbonzini

On Fri, Feb 14, 2014 at 11:13:04PM +0100, Alexander Graf wrote:
> 
> On 14.02.2014, at 23:06, Gabriel L. Somlo <gsomlo@gmail.com> wrote:
> 
> > On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
> >> 
> >> Can't you just turn the polarity around in the pci host adapter?
> > 
> > I tried this:
> > 
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index 1221f32..0e86d21 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
> > 
> > static inline int pci_irq_state(PCIDevice *d, int irq_num)
> > {
> > -	return (d->irq_state >> irq_num) & 0x1;
> > +	return !(d->irq_state >> irq_num) & 0x1;
> > }
> > 
> > static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
> > {
> > 	d->irq_state &= ~(0x1 << irq_num);
> > -	d->irq_state |= level << irq_num;
> > +	d->irq_state &= ~(level << irq_num);
> > }
> > 
> > static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change)
> > @@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
> >     }
> > 
> >     for (i = 0; i < bus->nirq; i++) {
> > -        assert(bus->irq_count[i] == 0);
> > +        assert(bus->irq_count[i] != 0);
> >     }
> > }
> > 
> > ---
> > 
> > but now OS X freezes during boot right after
> > 
> > 	[ PCI configuration begin ]
> > 	[ PCI configuration end, bridges 1, devices 10 ]
> > 	RTC: Only single RAM bank (128 bytes)
> > 
> > which all looks normal, except the process is supposed to continue on
> > from there and doesn't :)
> > 
> > On Linux, I get Fedora 20 live all the way up with no obvious/loud
> > complaints, but mouse and keyboard don't work at all...
> > 
> > I have to admit I'm a bit out of my depth here, though :)
> 
> Yeah, another thing we have to take into account is vhost-net which generates IRQs directly through irqfd. I guess for those we'll have to configure the polarity in the irq routing table?
> 
> 
> Alex

What will be affected is VFIO which uses IRQFD
for level interrupts with KVM_IRQFD_FLAG_RESAMPLE.
I suspect this will need a kernel change, maybe
a new flag for IRQFD: KVM_IRQFD_FLAG_ACTIVE_LOW,
since at the moment that does:

static void
irqfd_inject(struct work_struct *work)
{
        struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
        struct kvm *kvm = irqfd->kvm;

        if (!irqfd->resampler) {
                kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1,
                                false);
                kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0,
                                false);
        } else
                kvm_set_irq(kvm, KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
                            irqfd->gsi, 1, false);
}



-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-16 11:41             ` Michael S. Tsirkin
@ 2014-02-16 14:47               ` Alex Williamson
  2014-02-16 16:23                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 32+ messages in thread
From: Alex Williamson @ 2014-02-16 14:47 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alexander Graf, Gabriel L. Somlo, kvm@vger.kernel.org,
	qemu-devel@nongnu.org, eddie.dong@intel.com, pbonzini

On Sun, 2014-02-16 at 13:41 +0200, Michael S. Tsirkin wrote:
> On Fri, Feb 14, 2014 at 11:13:04PM +0100, Alexander Graf wrote:
> > 
> > On 14.02.2014, at 23:06, Gabriel L. Somlo <gsomlo@gmail.com> wrote:
> > 
> > > On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
> > >> 
> > >> Can't you just turn the polarity around in the pci host adapter?
> > > 
> > > I tried this:
> > > 
> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > index 1221f32..0e86d21 100644
> > > --- a/hw/pci/pci.c
> > > +++ b/hw/pci/pci.c
> > > @@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
> > > 
> > > static inline int pci_irq_state(PCIDevice *d, int irq_num)
> > > {
> > > -	return (d->irq_state >> irq_num) & 0x1;
> > > +	return !(d->irq_state >> irq_num) & 0x1;
> > > }
> > > 
> > > static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
> > > {
> > > 	d->irq_state &= ~(0x1 << irq_num);
> > > -	d->irq_state |= level << irq_num;
> > > +	d->irq_state &= ~(level << irq_num);
> > > }
> > > 
> > > static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change)
> > > @@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
> > >     }
> > > 
> > >     for (i = 0; i < bus->nirq; i++) {
> > > -        assert(bus->irq_count[i] == 0);
> > > +        assert(bus->irq_count[i] != 0);
> > >     }
> > > }
> > > 
> > > ---
> > > 
> > > but now OS X freezes during boot right after
> > > 
> > > 	[ PCI configuration begin ]
> > > 	[ PCI configuration end, bridges 1, devices 10 ]
> > > 	RTC: Only single RAM bank (128 bytes)
> > > 
> > > which all looks normal, except the process is supposed to continue on
> > > from there and doesn't :)
> > > 
> > > On Linux, I get Fedora 20 live all the way up with no obvious/loud
> > > complaints, but mouse and keyboard don't work at all...
> > > 
> > > I have to admit I'm a bit out of my depth here, though :)
> > 
> > Yeah, another thing we have to take into account is vhost-net which generates IRQs directly through irqfd. I guess for those we'll have to configure the polarity in the irq routing table?
> > 
> > 
> > Alex
> 
> What will be affected is VFIO which uses IRQFD
> for level interrupts with KVM_IRQFD_FLAG_RESAMPLE.
> I suspect this will need a kernel change, maybe
> a new flag for IRQFD: KVM_IRQFD_FLAG_ACTIVE_LOW,
> since at the moment that does:
> 
> static void
> irqfd_inject(struct work_struct *work)
> {
>         struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
>         struct kvm *kvm = irqfd->kvm;
> 
>         if (!irqfd->resampler) {
>                 kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1,
>                                 false);
>                 kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0,
>                                 false);
>         } else
>                 kvm_set_irq(kvm, KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
>                             irqfd->gsi, 1, false);
> }



As you said in a previous message, devices just want assert & de-assert,
1 & 0, which is what we have here.  I would think that what asserted
means only needs to be interpreted at the IOAPIC, so I'd hope we could
get it right w/o an API change.  Thanks,

Alex


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [Qemu-devel] RFC: ioapic polarity vs. qemu os-x guest
  2014-02-16 11:34       ` RFC: ioapic polarity vs. qemu os-x guest Michael S. Tsirkin
@ 2014-02-16 15:12         ` Peter Maydell
  0 siblings, 0 replies; 32+ messages in thread
From: Peter Maydell @ 2014-02-16 15:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Gabriel L. Somlo, eddie.dong, QEMU Developers, kvm-devel,
	Alexander Graf

On 16 February 2014 11:34, Michael S. Tsirkin <mst@redhat.com> wrote:
> Hmm no this is all wrong, from API point of view,
> devices shoud not care about value of interrupt.
> They just assert/deassert interrupts.
> It so happens that 1 means assert 0 means deassert.

Yeah, we generally model things as active-high even if the
hardware really treats the signal as active-low. (Among other
things there are some issues around how exactly device reset
should interact with a signal that is supposed to be high coming
out of reset, given you don't know whether the device at the
other end of the line has reset yet or not.)
This is great up until the point where you have a generic
GPIO device one of whose GPIO output lines happens to
be wired to an interrupt controller, of course.

thanks
-- PMM

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-16 14:47               ` Alex Williamson
@ 2014-02-16 16:23                 ` Michael S. Tsirkin
  2014-02-17 17:57                   ` Gabriel L. Somlo
  2014-02-27 17:05                   ` [PATCH RFC] kvm: ignore apic polarity Michael S. Tsirkin
  0 siblings, 2 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2014-02-16 16:23 UTC (permalink / raw)
  To: Alex Williamson
  Cc: Alexander Graf, Gabriel L. Somlo, kvm@vger.kernel.org,
	qemu-devel@nongnu.org, eddie.dong@intel.com, pbonzini

On Sun, Feb 16, 2014 at 07:47:00AM -0700, Alex Williamson wrote:
> On Sun, 2014-02-16 at 13:41 +0200, Michael S. Tsirkin wrote:
> > On Fri, Feb 14, 2014 at 11:13:04PM +0100, Alexander Graf wrote:
> > > 
> > > On 14.02.2014, at 23:06, Gabriel L. Somlo <gsomlo@gmail.com> wrote:
> > > 
> > > > On Fri, Feb 14, 2014 at 10:21:09PM +0100, Alexander Graf wrote:
> > > >> 
> > > >> Can't you just turn the polarity around in the pci host adapter?
> > > > 
> > > > I tried this:
> > > > 
> > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > index 1221f32..0e86d21 100644
> > > > --- a/hw/pci/pci.c
> > > > +++ b/hw/pci/pci.c
> > > > @@ -118,13 +118,13 @@ static int pci_bar(PCIDevice *d, int reg)
> > > > 
> > > > static inline int pci_irq_state(PCIDevice *d, int irq_num)
> > > > {
> > > > -	return (d->irq_state >> irq_num) & 0x1;
> > > > +	return !(d->irq_state >> irq_num) & 0x1;
> > > > }
> > > > 
> > > > static inline void pci_set_irq_state(PCIDevice *d, int irq_num, int level)
> > > > {
> > > > 	d->irq_state &= ~(0x1 << irq_num);
> > > > -	d->irq_state |= level << irq_num;
> > > > +	d->irq_state &= ~(level << irq_num);
> > > > }
> > > > 
> > > > static void pci_change_irq_level(PCIDevice *pci_dev, int irq_num, int change)
> > > > @@ -229,7 +229,7 @@ static void pcibus_reset(BusState *qbus)
> > > >     }
> > > > 
> > > >     for (i = 0; i < bus->nirq; i++) {
> > > > -        assert(bus->irq_count[i] == 0);
> > > > +        assert(bus->irq_count[i] != 0);
> > > >     }
> > > > }
> > > > 
> > > > ---
> > > > 
> > > > but now OS X freezes during boot right after
> > > > 
> > > > 	[ PCI configuration begin ]
> > > > 	[ PCI configuration end, bridges 1, devices 10 ]
> > > > 	RTC: Only single RAM bank (128 bytes)
> > > > 
> > > > which all looks normal, except the process is supposed to continue on
> > > > from there and doesn't :)
> > > > 
> > > > On Linux, I get Fedora 20 live all the way up with no obvious/loud
> > > > complaints, but mouse and keyboard don't work at all...
> > > > 
> > > > I have to admit I'm a bit out of my depth here, though :)
> > > 
> > > Yeah, another thing we have to take into account is vhost-net which generates IRQs directly through irqfd. I guess for those we'll have to configure the polarity in the irq routing table?
> > > 
> > > 
> > > Alex
> > 
> > What will be affected is VFIO which uses IRQFD
> > for level interrupts with KVM_IRQFD_FLAG_RESAMPLE.
> > I suspect this will need a kernel change, maybe
> > a new flag for IRQFD: KVM_IRQFD_FLAG_ACTIVE_LOW,
> > since at the moment that does:
> > 
> > static void
> > irqfd_inject(struct work_struct *work)
> > {
> >         struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
> >         struct kvm *kvm = irqfd->kvm;
> > 
> >         if (!irqfd->resampler) {
> >                 kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 1,
> >                                 false);
> >                 kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0,
> >                                 false);
> >         } else
> >                 kvm_set_irq(kvm, KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID,
> >                             irqfd->gsi, 1, false);
> > }
> 
> 
> 
> As you said in a previous message, devices just want assert & de-assert,
> 1 & 0, which is what we have here.  I would think that what asserted
> means only needs to be interpreted at the IOAPIC, so I'd hope we could
> get it right w/o an API change.


Well there is a bigger issue: any interrupt with
multiple sources is broken.

__kvm_irq_line_state does a logical OR of all sources,
before XOR with polarity.

This makes no sense if polarity is active low.


One is beginning to think the simplest fix
would be Gabriel's patch after all:
-      irq_level ^= entry.fields.polarity;


although it's ugly in that it perpetuates the
bug in more places instead of fixing it.


>  Thanks,
> 
> Alex

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-16 16:23                 ` Michael S. Tsirkin
@ 2014-02-17 17:57                   ` Gabriel L. Somlo
  2014-02-17 18:01                     ` Gabriel L. Somlo
  2014-02-27 17:05                   ` [PATCH RFC] kvm: ignore apic polarity Michael S. Tsirkin
  1 sibling, 1 reply; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-17 17:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm@vger.kernel.org, eddie.dong@intel.com, Alexander Graf,
	qemu-devel@nongnu.org, Alex Williamson, pbonzini

On Sun, Feb 16, 2014 at 06:23:11PM +0200, Michael S. Tsirkin wrote:
> Well there is a bigger issue: any interrupt with
> multiple sources is broken.
> 
> __kvm_irq_line_state does a logical OR of all sources,
> before XOR with polarity.
> 
> This makes no sense if polarity is active low.

So, do you think something like this would make sense, to address
active-low polarity in __kvm_irq_line_state ?
(this would be independent of the subsequent xor in
kvm_ioapic_set_irq()):

-static inline int __kvm_irq_line_state(unsigned long *irq_state,
+static inline int __kvm_irq_line_state(unsigned long *irq_state, int polarity,
					int irq_source_id, int level)
{
-        /* Logical OR for level trig interrupt */
	if (level)
		__set_bit(irq_source_id, irq_state);
	else
		__clear_bit(irq_source_id, irq_state);

-	return !!(*irq_state);
+	if (polarity) {
+		/* Logical OR for level trig interrupt, active-high */
+		return !!(*irq_state);
+	} else { // active-low
+		/* Logical AND for level trig interrupt, active-low */
+		return !~(*irq_state);
+	}
}

Thanks,
--Gabriel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-17 17:57                   ` Gabriel L. Somlo
@ 2014-02-17 18:01                     ` Gabriel L. Somlo
  2014-02-17 18:06                       ` Paolo Bonzini
  0 siblings, 1 reply; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-17 18:01 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, Alexander Graf, kvm@vger.kernel.org,
	qemu-devel@nongnu.org, eddie.dong@intel.com, pbonzini

On Mon, Feb 17, 2014 at 12:57:00PM -0500, Gabriel L. Somlo wrote:
> On Sun, Feb 16, 2014 at 06:23:11PM +0200, Michael S. Tsirkin wrote:
> > Well there is a bigger issue: any interrupt with
> > multiple sources is broken.
> > 
> > __kvm_irq_line_state does a logical OR of all sources,
> > before XOR with polarity.
> > 
> > This makes no sense if polarity is active low.
> 
> So, do you think something like this would make sense, to address
> active-low polarity in __kvm_irq_line_state ?
> (this would be independent of the subsequent xor in
> kvm_ioapic_set_irq()):
 
Make that rather:

-static inline int __kvm_irq_line_state(unsigned long *irq_state,
+static inline int __kvm_irq_line_state(unsigned long *irq_state, int polarity,
					int irq_source_id, int level)
{
-        /* Logical OR for level trig interrupt */
	if (level)
		__set_bit(irq_source_id, irq_state);
	else
		__clear_bit(irq_source_id, irq_state);

-	return !!(*irq_state);
+	if (polarity) {
+		/* Logical AND for level trig interrupt, active-low */
+		return !~(*irq_state);
+	} else {
+		/* Logical OR for level trig interrupt, active-high */
+		return !!(*irq_state);
+	}
}

Thanks, and sorry for the noise :)
--Gabriel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-17 18:01                     ` Gabriel L. Somlo
@ 2014-02-17 18:06                       ` Paolo Bonzini
  2014-02-17 19:38                         ` Gabriel L. Somlo
  0 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2014-02-17 18:06 UTC (permalink / raw)
  To: Gabriel L. Somlo, Michael S. Tsirkin
  Cc: Alex Williamson, Alexander Graf, kvm@vger.kernel.org,
	qemu-devel@nongnu.org, eddie.dong@intel.com

Il 17/02/2014 19:01, Gabriel L. Somlo ha scritto:
> On Mon, Feb 17, 2014 at 12:57:00PM -0500, Gabriel L. Somlo wrote:
>> On Sun, Feb 16, 2014 at 06:23:11PM +0200, Michael S. Tsirkin wrote:
>>> Well there is a bigger issue: any interrupt with
>>> multiple sources is broken.
>>>
>>> __kvm_irq_line_state does a logical OR of all sources,
>>> before XOR with polarity.
>>>
>>> This makes no sense if polarity is active low.
>>
>> So, do you think something like this would make sense, to address
>> active-low polarity in __kvm_irq_line_state ?
>> (this would be independent of the subsequent xor in
>> kvm_ioapic_set_irq()):
>
> Make that rather:
>
> -static inline int __kvm_irq_line_state(unsigned long *irq_state,
> +static inline int __kvm_irq_line_state(unsigned long *irq_state, int polarity,
> 					int irq_source_id, int level)
> {
> -        /* Logical OR for level trig interrupt */
> 	if (level)
> 		__set_bit(irq_source_id, irq_state);
> 	else
> 		__clear_bit(irq_source_id, irq_state);
>
> -	return !!(*irq_state);
> +	if (polarity) {
> +		/* Logical AND for level trig interrupt, active-low */
> +		return !~(*irq_state);

This is ~*irq_state == 0, i.e. *irq_state == ~0.

What if high-order bits of *irq_state are never used?  That is, do you 
need to consider the maximum valid irq_source_id too?


> +	} else {
> +		/* Logical OR for level trig interrupt, active-high */
> +		return !!(*irq_state);

Better rewrite this as *irq_state != 0.

Paolo

> +	}
> }
>
> Thanks, and sorry for the noise :)
> --Gabriel
>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-17 18:06                       ` Paolo Bonzini
@ 2014-02-17 19:38                         ` Gabriel L. Somlo
  2014-02-18  0:58                           ` Gabriel L. Somlo
  0 siblings, 1 reply; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-17 19:38 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michael S. Tsirkin, Alex Williamson, Alexander Graf,
	kvm@vger.kernel.org, qemu-devel@nongnu.org, eddie.dong@intel.com

On Mon, Feb 17, 2014 at 07:06:11PM +0100, Paolo Bonzini wrote:
> Il 17/02/2014 19:01, Gabriel L. Somlo ha scritto:
> >On Mon, Feb 17, 2014 at 12:57:00PM -0500, Gabriel L. Somlo wrote:
> >>On Sun, Feb 16, 2014 at 06:23:11PM +0200, Michael S. Tsirkin wrote:
> >>>Well there is a bigger issue: any interrupt with
> >>>multiple sources is broken.
> >>>
> >>>__kvm_irq_line_state does a logical OR of all sources,
> >>>before XOR with polarity.
> >>>
> >>>This makes no sense if polarity is active low.
> >>
> >>So, do you think something like this would make sense, to address
> >>active-low polarity in __kvm_irq_line_state ?
> >>(this would be independent of the subsequent xor in
> >>kvm_ioapic_set_irq()):
> >
> >-	return !!(*irq_state);
> >+	if (polarity) {
> >+		/* Logical AND for level trig interrupt, active-low */
> >+		return !~(*irq_state);
> 
> This is ~*irq_state == 0, i.e. *irq_state == ~0.
> 
> What if high-order bits of *irq_state are never used?  That is, do
> you need to consider the maximum valid irq_source_id too?

Oh, I think I'm starting to comprehend the problem here. The bits of
"*irq_state" are indexed by "irq_source_id", which is dynamically
assigned by kvm_request_irq_source_id().

So, doing the OR thing when assuming always-active-high makes
sense. Doing AND based on an active-low assumption doesn't make
sense, because there could ALWAYS be 0 bits that just weren't
allocated (yet), and I'm having trouble imagining how I'd keep
track of where the current allocation boundary is in a sane way :)

Which I *think* was Michael's original point...

--Gabriel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: RFC: ioapic polarity vs. qemu os-x guest
  2014-02-17 19:38                         ` Gabriel L. Somlo
@ 2014-02-18  0:58                           ` Gabriel L. Somlo
  0 siblings, 0 replies; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-18  0:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michael S. Tsirkin, Alex Williamson, Alexander Graf,
	kvm@vger.kernel.org, qemu-devel@nongnu.org, eddie.dong@intel.com

On Mon, Feb 17, 2014 at 02:38:09PM -0500, Gabriel L. Somlo wrote:
> Oh, I think I'm starting to comprehend the problem here. The bits of
> "*irq_state" are indexed by "irq_source_id", which is dynamically
> assigned by kvm_request_irq_source_id().
> 
> So, doing the OR thing when assuming always-active-high makes
> sense. Doing AND based on an active-low assumption doesn't make
> sense, because there could ALWAYS be 0 bits that just weren't
> allocated (yet), and I'm having trouble imagining how I'd keep
> track of where the current allocation boundary is in a sane way :)

Hmm, I thought maybe I could use kvm->arch.irq_sources_bitmap, but
that's global across the whole VM, whereas irq_state belongs to
one given GSI. So, the per-GSI bitmap is sparse, so it's at least
as bad as I thought earlier, if not worse :)

Am I missing anything that would put this in a better light ?

Thanks,
--Gabriel

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH RFC] kvm: ignore apic polarity
  2014-02-16 16:23                 ` Michael S. Tsirkin
  2014-02-17 17:57                   ` Gabriel L. Somlo
@ 2014-02-27 17:05                   ` Michael S. Tsirkin
  2014-02-27 21:41                     ` Gabriel L. Somlo
  2014-03-01  5:03                     ` Alex Williamson
  1 sibling, 2 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2014-02-27 17:05 UTC (permalink / raw)
  To: Alex Williamson
  Cc: kvm@vger.kernel.org, eddie.dong@intel.com, Alexander Graf,
	qemu-devel@nongnu.org, Gabriel L. Somlo, pbonzini

apic polarity in KVM does not work: too many things assume active high.
Let's not pretend it works, let's just ignore polarity flag.  If we ever
want to emulate it exactly, this will need a feature flag anyway.

Also report this to userspace: this makes it
possible to report the interrupt active-low
in ACPI, this way we are closer to real hardware.

This patch fixes OSX running on KVM.

Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

---

Gabriel, could you confirm this fixes OSX for you?
If you can play with linux tweaking interrupt
to active low, that would be very nice too:
it's weekend here.

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 902f124..db29b27 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -674,6 +676,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_ARM_EL1_32BIT 93
 #define KVM_CAP_SPAPR_MULTITCE 94
 #define KVM_CAP_EXT_EMUL_CPUID 95
+#define KVM_CAP_X86_IOAPIC_POLARITY_IGNORED 96
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index 2d68297..ad170b4 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
 	irq_level = __kvm_irq_line_state(&ioapic->irq_states[irq],
 					 irq_source_id, level);
 	entry = ioapic->redirtbl[irq];
-	irq_level ^= entry.fields.polarity;
 	if (!irq_level) {
 		ioapic->irr &= ~mask;
 		ret = 1;

^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] kvm: ignore apic polarity
  2014-02-27 17:05                   ` [PATCH RFC] kvm: ignore apic polarity Michael S. Tsirkin
@ 2014-02-27 21:41                     ` Gabriel L. Somlo
  2014-02-27 22:30                       ` Paolo Bonzini
  2014-03-01  5:03                     ` Alex Williamson
  1 sibling, 1 reply; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-27 21:41 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Alex Williamson, kvm@vger.kernel.org, eddie.dong@intel.com,
	Alexander Graf, qemu-devel@nongnu.org, pbonzini

On Thu, Feb 27, 2014 at 07:05:49PM +0200, Michael S. Tsirkin wrote:
> apic polarity in KVM does not work: too many things assume active high.
> Let's not pretend it works, let's just ignore polarity flag.  If we ever
> want to emulate it exactly, this will need a feature flag anyway.
> 
> Also report this to userspace: this makes it
> possible to report the interrupt active-low
> in ACPI, this way we are closer to real hardware.
> 
> This patch fixes OSX running on KVM.
> 
> Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ---
> 
> Gabriel, could you confirm this fixes OSX for you?
> If you can play with linux tweaking interrupt
> to active low, that would be very nice too:
> it's weekend here.

With Fedora 20 Live x86_64:

If all I do is 's/ActiveHigh/ActiveLow/' in hw/i386/[q36-]acpi-dsdt.dsl,
but otherwise don't try to change how QEMU deals with "logical" vs.
"physical" ioapic polarity, things work great. Printk's show polarity
set to 1, but with the ignore-polarity patch things work fine.

With normal (ActiveHigh) ACPI, printk reports polarity set to 0, and
things *still* work exactly the same.


So, the way I understand this (and I'm writing this mainly for myself,
to make sure I understand correctly, so please kick me if I got it
wrong), ACPI tells the guest OS how to configure "physical" ioapic polarity.

With ActiveHigh, "physical" == "logical", i.e. "high" == "asserted"
and "low" == "deasserted".

With ActiveLow, "physical" == !"logical", so the other way around.

QEMU being hard-coded to ActiveHigh is the moral equivalent of always
sending the kernel (KVM) "logical" line states, rather than "physical"
ones.

Assuming KVM's userland clients are all coded for ActiveHigh, we can
(should, for sanity's sake) just assume line states from userland are
logical, and stop paying attention the polarity bits. That way,
misbehaving guests [*] can configure their ioapics as they please, and
things will just work OK regardless.

As you pointed out earlier, even KVM itself already kind-of assumes
ActiveHigh (e.g. in __kvm_irq_line_state(), which should be coded
differently if ActiveLow were a serious possibility, and, BTW,
irq_states[irq] would probably have to be initialized to all-1's if
ActiveLow wre used, etc, etc).


[*] All Apple hardware is ActiveLow, and OS X has this as a hardcoded
assumption (it ignores ACPI's hints to the contrary). So OS X is one
of the misbehaving clients I mention above...


> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 902f124..db29b27 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -674,6 +676,7 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_ARM_EL1_32BIT 93
>  #define KVM_CAP_SPAPR_MULTITCE 94
>  #define KVM_CAP_EXT_EMUL_CPUID 95
> +#define KVM_CAP_X86_IOAPIC_POLARITY_IGNORED 96

For one thing, on current kvm git, this conflicts with

#define KVM_CAP_HYPERV_TIME 96

but the rest of the patch works (I've already been using it for quite
a while to get OS X guests to work...

Thanks much,
--Gabriel

>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index 2d68297..ad170b4 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
>  	irq_level = __kvm_irq_line_state(&ioapic->irq_states[irq],
>  					 irq_source_id, level);
>  	entry = ioapic->redirtbl[irq];
> -	irq_level ^= entry.fields.polarity;
>  	if (!irq_level) {
>  		ioapic->irr &= ~mask;
>  		ret = 1;
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] kvm: ignore apic polarity
  2014-02-27 21:41                     ` Gabriel L. Somlo
@ 2014-02-27 22:30                       ` Paolo Bonzini
  2014-02-27 23:13                         ` Gabriel L. Somlo
  2014-02-28  4:55                         ` [PATCH RFC] kvm: ignore apic polarity Michael S. Tsirkin
  0 siblings, 2 replies; 32+ messages in thread
From: Paolo Bonzini @ 2014-02-27 22:30 UTC (permalink / raw)
  To: Gabriel L. Somlo, Michael S. Tsirkin
  Cc: Alex Williamson, eddie.dong@intel.com, Alexander Graf,
	kvm@vger.kernel.org, qemu-devel@nongnu.org

Il 27/02/2014 22:41, Gabriel L. Somlo ha scritto:
> On Thu, Feb 27, 2014 at 07:05:49PM +0200, Michael S. Tsirkin wrote:
>> apic polarity in KVM does not work: too many things assume active high.
>> Let's not pretend it works, let's just ignore polarity flag.  If we ever
>> want to emulate it exactly, this will need a feature flag anyway.
>>
>> Also report this to userspace: this makes it
>> possible to report the interrupt active-low
>> in ACPI, this way we are closer to real hardware.
>>
>> This patch fixes OSX running on KVM.
>>
>> Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>
>> ---
>>
>> Gabriel, could you confirm this fixes OSX for you?
>> If you can play with linux tweaking interrupt
>> to active low, that would be very nice too:
>> it's weekend here.
>
> With Fedora 20 Live x86_64:
>
> If all I do is 's/ActiveHigh/ActiveLow/' in hw/i386/[q36-]acpi-dsdt.dsl,
> but otherwise don't try to change how QEMU deals with "logical" vs.
> "physical" ioapic polarity, things work great. Printk's show polarity
> set to 1, but with the ignore-polarity patch things work fine.
>
> With normal (ActiveHigh) ACPI, printk reports polarity set to 0, and
> things *still* work exactly the same.
>
>
> So, the way I understand this (and I'm writing this mainly for myself,
> to make sure I understand correctly, so please kick me if I got it
> wrong), ACPI tells the guest OS how to configure "physical" ioapic polarity.
>
> With ActiveHigh, "physical" == "logical", i.e. "high" == "asserted"
> and "low" == "deasserted".
>
> With ActiveLow, "physical" == !"logical", so the other way around.
>
> QEMU being hard-coded to ActiveHigh is the moral equivalent of always
> sending the kernel (KVM) "logical" line states, rather than "physical"
> ones.
>
> Assuming KVM's userland clients are all coded for ActiveHigh, we can
> (should, for sanity's sake) just assume line states from userland are
> logical, and stop paying attention the polarity bits. That way,
> misbehaving guests [*] can configure their ioapics as they please, and
> things will just work OK regardless.
>
> As you pointed out earlier, even KVM itself already kind-of assumes
> ActiveHigh (e.g. in __kvm_irq_line_state(), which should be coded
> differently if ActiveLow were a serious possibility, and, BTW,
> irq_states[irq] would probably have to be initialized to all-1's if
> ActiveLow wre used, etc, etc).

This is a much better description.  Can you turn it into a patch to 
Documentation/virtual/kvm/api.txt and a more complete commit message?

Also, there is a problem in this: we definitely do not want to have 
different ACPI tables for TCG vs. KVM.  Have you guys tested what 
happens with Linux guests + TCG if interrupts are declared active-low?

QEMU likely has many other places that hard-code active-high.  One 
approach could be to add a QOM property to the ioapic that is a bitmask 
of which GSIs are active-low.  The ioapic can consult it like this:

      if (vector >= 0 && vector < IOAPIC_NUM_PINS) {
          uint32_t mask = 1 << vector;
          uint64_t entry = s->ioredtbl[vector];

          if (entry & (1 << IOAPIC_LVT_POLARITY_SHIFT)) {
              level = !level;
          }
+        if (s->active_low_mask & (1 << vector)) {
+            level = !level;
+        }
          if (((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) ==
              IOAPIC_TRIGGER_LEVEL) {
              /* level triggered */

etc. so that the two NOTs undo each other, making the input to QEMU's 
ioapic also "logical" rather than "physical".

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] kvm: ignore apic polarity
  2014-02-27 22:30                       ` Paolo Bonzini
@ 2014-02-27 23:13                         ` Gabriel L. Somlo
  2014-02-27 23:31                           ` Paolo Bonzini
  2014-02-28  4:55                         ` [PATCH RFC] kvm: ignore apic polarity Michael S. Tsirkin
  1 sibling, 1 reply; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-27 23:13 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm@vger.kernel.org, Michael S. Tsirkin, eddie.dong@intel.com,
	qemu-devel@nongnu.org, Alexander Graf, Alex Williamson

On Thu, Feb 27, 2014 at 11:30:55PM +0100, Paolo Bonzini wrote:
> Il 27/02/2014 22:41, Gabriel L. Somlo ha scritto:
> >On Thu, Feb 27, 2014 at 07:05:49PM +0200, Michael S. Tsirkin wrote:
> >>apic polarity in KVM does not work: too many things assume active high.
> >>Let's not pretend it works, let's just ignore polarity flag.  If we ever
> >>want to emulate it exactly, this will need a feature flag anyway.
> >>
> >>Also report this to userspace: this makes it
> >>possible to report the interrupt active-low
> >>in ACPI, this way we are closer to real hardware.
> >>
> >>This patch fixes OSX running on KVM.
> >>
> >>Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com>
> >>Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >>
> >>---
> >
> >So, the way I understand this (and I'm writing this mainly for myself,
> >to make sure I understand correctly, so please kick me if I got it
> >wrong), ACPI tells the guest OS how to configure "physical" ioapic polarity.
> >
> >With ActiveHigh, "physical" == "logical", i.e. "high" == "asserted"
> >and "low" == "deasserted".
> >
> >With ActiveLow, "physical" == !"logical", so the other way around.
> >
> >QEMU being hard-coded to ActiveHigh is the moral equivalent of always
> >sending the kernel (KVM) "logical" line states, rather than "physical"
> >ones.
> >
> >Assuming KVM's userland clients are all coded for ActiveHigh, we can
> >(should, for sanity's sake) just assume line states from userland are
> >logical, and stop paying attention the polarity bits. That way,
> >misbehaving guests [*] can configure their ioapics as they please, and
> >things will just work OK regardless.
> >
> >As you pointed out earlier, even KVM itself already kind-of assumes
> >ActiveHigh (e.g. in __kvm_irq_line_state(), which should be coded
> >differently if ActiveLow were a serious possibility, and, BTW,
> >irq_states[irq] would probably have to be initialized to all-1's if
> >ActiveLow wre used, etc, etc).
> 
> This is a much better description.  Can you turn it into a patch to
> Documentation/virtual/kvm/api.txt and a more complete commit
> message?

Do you mean one patch to change both virt/kvm/ioapic.c and
Documentation/virtual/kvm/api.txt ? Or a separate documentation patch ?
(sorry for my ignorance, I'm new to being a KVM contributor :) )

> >With Fedora 20 Live x86_64:
> >
> >If all I do is 's/ActiveHigh/ActiveLow/' in hw/i386/[q36-]acpi-dsdt.dsl,
> >but otherwise don't try to change how QEMU deals with "logical" vs.
> >"physical" ioapic polarity, things work great. Printk's show polarity
> >set to 1, but with the ignore-polarity patch things work fine.
> >
> >With normal (ActiveHigh) ACPI, printk reports polarity set to 0, and
> >things *still* work exactly the same.
>
> Also, there is a problem in this: we definitely do not want to have
> different ACPI tables for TCG vs. KVM.  Have you guys tested what
> happens with Linux guests + TCG if interrupts are declared
> active-low?

I think removing the polarity xor from KVM is about giving up on
trying to add ActiveLow support to QEMU altogether. What I tested
was what would happen if Linux (which pays attention to ACPI) were
told to use ActiveLow, but thre rest of QEMU continued being hardcoded
as ActiveHigh. Basically, another datapoint similar to what happens
with OS X, which completely ignores ACPI and configures the ioapic as
ActiveLow (even while running on ActiveHigh "hardware", i.e. QEMU).

With KVM no longer paying attention to the polarity bit, things work
fine, both with Linux-thinking-it's-ActiveLow, and with OS X.

But, since QEMU will stay ActiveHigh, I don't think TCG will be
impacted in any way by this change.

(Hmmm, maybe this one of the reasons I never got OS X to boot without
-enable-kvm... I should look at the QEMU hw/intc/ioapic*.c, and see if
*it* cares about guest-configured polarity, and maybe get it to *stop*
caring :)

Thanks,
--Gabriel

> 
> QEMU likely has many other places that hard-code active-high.  One
> approach could be to add a QOM property to the ioapic that is a
> bitmask of which GSIs are active-low.  The ioapic can consult it
> like this:
> 
>      if (vector >= 0 && vector < IOAPIC_NUM_PINS) {
>          uint32_t mask = 1 << vector;
>          uint64_t entry = s->ioredtbl[vector];
> 
>          if (entry & (1 << IOAPIC_LVT_POLARITY_SHIFT)) {
>              level = !level;
>          }
> +        if (s->active_low_mask & (1 << vector)) {
> +            level = !level;
> +        }
>          if (((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) ==
>              IOAPIC_TRIGGER_LEVEL) {
>              /* level triggered */
> 
> etc. so that the two NOTs undo each other, making the input to
> QEMU's ioapic also "logical" rather than "physical".
> 
> Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] kvm: ignore apic polarity
  2014-02-27 23:13                         ` Gabriel L. Somlo
@ 2014-02-27 23:31                           ` Paolo Bonzini
  2014-02-28  4:06                             ` [RFC PATCH v2] kvm: x86: ignore ioapic polarity Gabriel L. Somlo
  0 siblings, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2014-02-27 23:31 UTC (permalink / raw)
  To: Gabriel L. Somlo
  Cc: Michael S. Tsirkin, Alex Williamson, kvm@vger.kernel.org,
	eddie.dong@intel.com, Alexander Graf, qemu-devel@nongnu.org

Il 28/02/2014 00:13, Gabriel L. Somlo ha scritto:
> On Thu, Feb 27, 2014 at 11:30:55PM +0100, Paolo Bonzini wrote:
>> Il 27/02/2014 22:41, Gabriel L. Somlo ha scritto:
>>> On Thu, Feb 27, 2014 at 07:05:49PM +0200, Michael S. Tsirkin wrote:
>>>> apic polarity in KVM does not work: too many things assume active high.
>>>> Let's not pretend it works, let's just ignore polarity flag.  If we ever
>>>> want to emulate it exactly, this will need a feature flag anyway.
>>>>
>>>> Also report this to userspace: this makes it
>>>> possible to report the interrupt active-low
>>>> in ACPI, this way we are closer to real hardware.
>>>>
>>>> This patch fixes OSX running on KVM.
>>>>
>>>> Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com>
>>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>>>>
>>>> ---
>>>
>>> So, the way I understand this (and I'm writing this mainly for myself,
>>> to make sure I understand correctly, so please kick me if I got it
>>> wrong), ACPI tells the guest OS how to configure "physical" ioapic polarity.
>>>
>>> With ActiveHigh, "physical" == "logical", i.e. "high" == "asserted"
>>> and "low" == "deasserted".
>>>
>>> With ActiveLow, "physical" == !"logical", so the other way around.
>>>
>>> QEMU being hard-coded to ActiveHigh is the moral equivalent of always
>>> sending the kernel (KVM) "logical" line states, rather than "physical"
>>> ones.
>>>
>>> Assuming KVM's userland clients are all coded for ActiveHigh, we can
>>> (should, for sanity's sake) just assume line states from userland are
>>> logical, and stop paying attention the polarity bits. That way,
>>> misbehaving guests [*] can configure their ioapics as they please, and
>>> things will just work OK regardless.
>>>
>>> As you pointed out earlier, even KVM itself already kind-of assumes
>>> ActiveHigh (e.g. in __kvm_irq_line_state(), which should be coded
>>> differently if ActiveLow were a serious possibility, and, BTW,
>>> irq_states[irq] would probably have to be initialized to all-1's if
>>> ActiveLow wre used, etc, etc).
>>
>> This is a much better description.  Can you turn it into a patch to
>> Documentation/virtual/kvm/api.txt and a more complete commit
>> message?
>
> Do you mean one patch to change both virt/kvm/ioapic.c and
> Documentation/virtual/kvm/api.txt ? Or a separate documentation patch ?
> (sorry for my ignorance, I'm new to being a KVM contributor :) )

Yes.

> I think removing the polarity xor from KVM is about giving up on
> trying to add ActiveLow support to QEMU altogether. What I tested
> was what would happen if Linux (which pays attention to ACPI) were
> told to use ActiveLow, but thre rest of QEMU continued being hardcoded
> as ActiveHigh. Basically, another datapoint similar to what happens
> with OS X, which completely ignores ACPI and configures the ioapic as
> ActiveLow (even while running on ActiveHigh "hardware", i.e. QEMU).
>
> With KVM no longer paying attention to the polarity bit, things work
> fine, both with Linux-thinking-it's-ActiveLow, and with OS X.
>
> But, since QEMU will stay ActiveHigh, I don't think TCG will be
> impacted in any way by this change.

If you change ACPI tables to ActiveLow, and leave the polarity inversion 
in hw/intc/ioapic.c, then it will matter.

> (Hmmm, maybe this one of the reasons I never got OS X to boot without
> -enable-kvm... I should look at the QEMU hw/intc/ioapic*.c, and see if
> *it* cares about guest-configured polarity, and maybe get it to *stop*
> caring :)

Exactly my point. :)

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [RFC PATCH v2] kvm: x86: ignore ioapic polarity
  2014-02-27 23:31                           ` Paolo Bonzini
@ 2014-02-28  4:06                             ` Gabriel L. Somlo
  2014-03-02 14:55                               ` Michael S. Tsirkin
  2014-03-13 10:53                               ` Paolo Bonzini
  0 siblings, 2 replies; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-02-28  4:06 UTC (permalink / raw)
  To: kvm; +Cc: mst, pbonzini, alex.williamson, eddie.dong, agraf, qemu-devel

Both QEMU and KVM have already accumulated a significant number of
optimizations based on the hard-coded assumption that ioapic polarity
will always use the ActiveHigh convention, where the logical and
physical states of level-triggered irq lines always match (i.e.,
active(asserted) == high == 1, inactive == low == 0). QEMU guests
are expected to follow directions given via ACPI and configure the
ioapic with polarity 0 (ActiveHigh). However, even when misbehaving
guests (e.g. OS X <= 10.9) set the ioapic polarity to 1 (ActiveLow),
QEMU will still use the ActiveHigh signaling convention when
interfacing with KVM.

This patch modifies KVM to completely ignore ioapic polarity as set by
the guest OS, enabling misbehaving guests to work alongside those which
comply with the ActiveHigh polarity specified by QEMU's ACPI tables.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gabriel L. Somlo <somlo@cmu.edu>
---

On Fri, Feb 28, 2014 at 12:31:27AM +0100, Paolo Bonzini wrote:
>>>This is a much better description.  Can you turn it into a patch to
>>>Documentation/virtual/kvm/api.txt and a more complete commit
>>>message?

OK, let me know what you all think of this version.

> If you change ACPI tables to ActiveLow, and leave the polarity
> inversion in hw/intc/ioapic.c, then it will matter.

I only changed ACPI to ActiveLow to cause Linux to misbehave in the
same way OS X does, it was a temporary/test hack. ACPI should stay
ActiveHigh to reflect the rest of QEMU/KVM, and misbehaving guest OSs
should ignore it at their own peril :)

> >(Hmmm, maybe this one of the reasons I never got OS X to boot without
> >-enable-kvm... I should look at the QEMU hw/intc/ioapic*.c, and see
> >if *it* cares about guest-configured polarity, and maybe get it to
> >*stop* caring :)

> Exactly my point. :)

So this patch gets the accelerated path to work regardless of how
well-behaved a guest OS may or may not be w.r.t. ACPI and polarity.
I'll try to find out whether it's possible to be *extra* nice to these
types of guests by also allowing them to ignore ACPI polarity when not
using acceleration :)

Thanks again,
  Gabriel

 Documentation/virtual/kvm/api.txt | 18 ++++++++++++++++++
 include/uapi/linux/kvm.h          |  1 +
 virt/kvm/ioapic.c                 |  1 -
 3 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 6cd63a9..0492a94 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1365,6 +1365,24 @@ struct kvm_irq_routing_msi {
 	__u32 pad;
 };
 
+NOTE: For each level-triggered interrupt managed by a virtual ioapic,
+the guest OS may set a polarity value (bit 13 of each corresponding I/O
+redirection table register). The polarity bit defines the relationship
+between an irq line's logical state (active/asserted = 1, inactive = 0)
+and its physical state (high = 1, low = 0). When the polarity bit is 0
+(ActiveHigh), logical and physical states are matched (active == high == 1,
+inactive == low == 0). When polarity is set to 1 (ActiveLow), logical and
+physical state are opposites (logical == !physical). Typically, guests are
+expected to use the same polarity across all level-triggered interrupts,
+as directed by ACPI. Historically, both KVM and QEMU have accumulated a
+significant number of optimizations based on the hard-coded assumption
+that polarity will always be set to ActiveHigh. When interfacing with KVM,
+QEMU will use the ActiveHigh convention for all level-triggered irqs,
+regardless of how the guest OS has configured the polarity bits in the
+ioapic registers. As such, KVM must also ignore these bits, and always act
+as if the logical and physical states of an irq line exactly match each
+other (i.e., follow the ActiveHigh convention regardless of polarity).
+
 
 4.53 KVM_ASSIGN_SET_MSIX_NR
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 932d7f2..5bfc0e6 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -675,6 +675,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_SPAPR_MULTITCE 94
 #define KVM_CAP_EXT_EMUL_CPUID 95
 #define KVM_CAP_HYPERV_TIME 96
+#define KVM_CAP_X86_IOAPIC_POLARITY_IGNORED 97
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
index ce9ed99..1539d37 100644
--- a/virt/kvm/ioapic.c
+++ b/virt/kvm/ioapic.c
@@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
 	irq_level = __kvm_irq_line_state(&ioapic->irq_states[irq],
 					 irq_source_id, level);
 	entry = ioapic->redirtbl[irq];
-	irq_level ^= entry.fields.polarity;
 	if (!irq_level) {
 		ioapic->irr &= ~mask;
 		ret = 1;
-- 
1.8.1.4


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] kvm: ignore apic polarity
  2014-02-27 22:30                       ` Paolo Bonzini
  2014-02-27 23:13                         ` Gabriel L. Somlo
@ 2014-02-28  4:55                         ` Michael S. Tsirkin
  2014-02-28  8:10                           ` Paolo Bonzini
  2014-02-28  8:11                           ` Paolo Bonzini
  1 sibling, 2 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2014-02-28  4:55 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Gabriel L. Somlo, Alex Williamson, kvm@vger.kernel.org,
	eddie.dong@intel.com, Alexander Graf, qemu-devel@nongnu.org

On Thu, Feb 27, 2014 at 11:30:55PM +0100, Paolo Bonzini wrote:
> Il 27/02/2014 22:41, Gabriel L. Somlo ha scritto:
> >On Thu, Feb 27, 2014 at 07:05:49PM +0200, Michael S. Tsirkin wrote:
> >>apic polarity in KVM does not work: too many things assume active high.
> >>Let's not pretend it works, let's just ignore polarity flag.  If we ever
> >>want to emulate it exactly, this will need a feature flag anyway.
> >>
> >>Also report this to userspace: this makes it
> >>possible to report the interrupt active-low
> >>in ACPI, this way we are closer to real hardware.
> >>
> >>This patch fixes OSX running on KVM.
> >>
> >>Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com>
> >>Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >>
> >>---
> >>
> >>Gabriel, could you confirm this fixes OSX for you?
> >>If you can play with linux tweaking interrupt
> >>to active low, that would be very nice too:
> >>it's weekend here.
> >
> >With Fedora 20 Live x86_64:
> >
> >If all I do is 's/ActiveHigh/ActiveLow/' in hw/i386/[q36-]acpi-dsdt.dsl,
> >but otherwise don't try to change how QEMU deals with "logical" vs.
> >"physical" ioapic polarity, things work great. Printk's show polarity
> >set to 1, but with the ignore-polarity patch things work fine.
> >
> >With normal (ActiveHigh) ACPI, printk reports polarity set to 0, and
> >things *still* work exactly the same.
> >
> >
> >So, the way I understand this (and I'm writing this mainly for myself,
> >to make sure I understand correctly, so please kick me if I got it
> >wrong), ACPI tells the guest OS how to configure "physical" ioapic polarity.
> >
> >With ActiveHigh, "physical" == "logical", i.e. "high" == "asserted"
> >and "low" == "deasserted".
> >
> >With ActiveLow, "physical" == !"logical", so the other way around.
> >
> >QEMU being hard-coded to ActiveHigh is the moral equivalent of always
> >sending the kernel (KVM) "logical" line states, rather than "physical"
> >ones.
> >
> >Assuming KVM's userland clients are all coded for ActiveHigh, we can
> >(should, for sanity's sake) just assume line states from userland are
> >logical, and stop paying attention the polarity bits. That way,
> >misbehaving guests [*] can configure their ioapics as they please, and
> >things will just work OK regardless.
> >
> >As you pointed out earlier, even KVM itself already kind-of assumes
> >ActiveHigh (e.g. in __kvm_irq_line_state(), which should be coded
> >differently if ActiveLow were a serious possibility, and, BTW,
> >irq_states[irq] would probably have to be initialized to all-1's if
> >ActiveLow wre used, etc, etc).
> 
> This is a much better description.  Can you turn it into a patch to
> Documentation/virtual/kvm/api.txt and a more complete commit
> message?
> 
> Also, there is a problem in this: we definitely do not want to have
> different ACPI tables for TCG vs. KVM.

Why?

We already have different ACPI tables for CG vs KVM.
Specifically apic interrupt override flag in MADT is set
for KVM but not TCG.

As KVM hardware differs from TCG, I expect them to move
further apart with time.

> Have you guys tested what
> happens with Linux guests + TCG if interrupts are declared
> active-low?
> 
> QEMU likely has many other places that hard-code active-high.  One
> approach could be to add a QOM property to the ioapic that is a
> bitmask of which GSIs are active-low.  The ioapic can consult it
> like this:
> 
>      if (vector >= 0 && vector < IOAPIC_NUM_PINS) {
>          uint32_t mask = 1 << vector;
>          uint64_t entry = s->ioredtbl[vector];
> 
>          if (entry & (1 << IOAPIC_LVT_POLARITY_SHIFT)) {
>              level = !level;
>          }
> +        if (s->active_low_mask & (1 << vector)) {
> +            level = !level;
> +        }
>          if (((entry >> IOAPIC_LVT_TRIGGER_MODE_SHIFT) & 1) ==
>              IOAPIC_TRIGGER_LEVEL) {
>              /* level triggered */
> 
> etc. so that the two NOTs undo each other, making the input to
> QEMU's ioapic also "logical" rather than "physical".
> 
> Paolo

Or we can do a patch like we did for kvm and ignore polarity,
treating levels as logical rather than physical.

-- 
MST

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] kvm: ignore apic polarity
  2014-02-28  4:55                         ` [PATCH RFC] kvm: ignore apic polarity Michael S. Tsirkin
@ 2014-02-28  8:10                           ` Paolo Bonzini
  2014-02-28  8:11                           ` Paolo Bonzini
  1 sibling, 0 replies; 32+ messages in thread
From: Paolo Bonzini @ 2014-02-28  8:10 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Gabriel L. Somlo, Alex Williamson, kvm@vger.kernel.org,
	eddie.dong@intel.com, Alexander Graf, qemu-devel@nongnu.org

Il 28/02/2014 05:55, Michael S. Tsirkin ha scritto:
> Why?
>
> We already have different ACPI tables for TCG vs KVM.
> Specifically apic interrupt override flag in MADT is set
> for KVM but not TCG.

It used to be this way, but

   bool kvm_allows_irq0_override(void)
   {
       return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
   }

means that these days it is usually set for both TCG and KVM.

> As KVM hardware differs from TCG, I expect them to move
> further apart with time.

I don't expect this, because there are some kind of guests that people 
will unknowingly end up running with TCG---for example libguestfs.  So 
minimizing the differences in hardware between KVM and TCG _is_ a goal.

> Or we can do a patch like we did for kvm and ignore polarity,
> treating levels as logical rather than physical.

Yes.

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] kvm: ignore apic polarity
  2014-02-28  4:55                         ` [PATCH RFC] kvm: ignore apic polarity Michael S. Tsirkin
  2014-02-28  8:10                           ` Paolo Bonzini
@ 2014-02-28  8:11                           ` Paolo Bonzini
  1 sibling, 0 replies; 32+ messages in thread
From: Paolo Bonzini @ 2014-02-28  8:11 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Gabriel L. Somlo, Alex Williamson, kvm@vger.kernel.org,
	eddie.dong@intel.com, Alexander Graf, qemu-devel@nongnu.org

Il 28/02/2014 05:55, Michael S. Tsirkin ha scritto:
> Why?
>
> We already have different ACPI tables for TCG vs KVM.
> Specifically apic interrupt override flag in MADT is set
> for KVM but not TCG.

It used to be this way, but

   bool kvm_allows_irq0_override(void)
   {
       return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
   }

means that these days it is usually set for both TCG and KVM.

> As KVM hardware differs from TCG, I expect them to move
> further apart with time.

I don't expect this, because there are some kind of guests that people 
will unknowingly end up running with TCG---for example libguestfs.  So 
minimizing the differences in hardware between KVM and TCG _is_ a goal.

> Or we can do a patch like we did for kvm and ignore polarity,
> treating levels as logical rather than physical.

Yes.

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH RFC] kvm: ignore apic polarity
  2014-02-27 17:05                   ` [PATCH RFC] kvm: ignore apic polarity Michael S. Tsirkin
  2014-02-27 21:41                     ` Gabriel L. Somlo
@ 2014-03-01  5:03                     ` Alex Williamson
  1 sibling, 0 replies; 32+ messages in thread
From: Alex Williamson @ 2014-03-01  5:03 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: kvm@vger.kernel.org, eddie.dong@intel.com, Alexander Graf,
	qemu-devel@nongnu.org, Gabriel L. Somlo, pbonzini

On Thu, 2014-02-27 at 19:05 +0200, Michael S. Tsirkin wrote:
> apic polarity in KVM does not work: too many things assume active high.
> Let's not pretend it works, let's just ignore polarity flag.  If we ever
> want to emulate it exactly, this will need a feature flag anyway.
> 
> Also report this to userspace: this makes it
> possible to report the interrupt active-low
> in ACPI, this way we are closer to real hardware.
> 
> This patch fixes OSX running on KVM.
> 
> Reported-by: "Gabriel L. Somlo" <gsomlo@gmail.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> 
> ---
> 
> Gabriel, could you confirm this fixes OSX for you?
> If you can play with linux tweaking interrupt
> to active low, that would be very nice too:
> it's weekend here.
> 
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 902f124..db29b27 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -674,6 +676,7 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_ARM_EL1_32BIT 93
>  #define KVM_CAP_SPAPR_MULTITCE 94
>  #define KVM_CAP_EXT_EMUL_CPUID 95
> +#define KVM_CAP_X86_IOAPIC_POLARITY_IGNORED 96
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index 2d68297..ad170b4 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
>  	irq_level = __kvm_irq_line_state(&ioapic->irq_states[irq],
>  					 irq_source_id, level);
>  	entry = ioapic->redirtbl[irq];
> -	irq_level ^= entry.fields.polarity;
>  	if (!irq_level) {
>  		ioapic->irr &= ~mask;
>  		ret = 1;

I've run Windows and Linux guests with vfio assigned devices with this
change and all the interrupt types seem to work.

Tested-by: Alex Williamson <alex.williamson@redhat.com>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC PATCH v2] kvm: x86: ignore ioapic polarity
  2014-02-28  4:06                             ` [RFC PATCH v2] kvm: x86: ignore ioapic polarity Gabriel L. Somlo
@ 2014-03-02 14:55                               ` Michael S. Tsirkin
  2014-03-13 10:53                               ` Paolo Bonzini
  1 sibling, 0 replies; 32+ messages in thread
From: Michael S. Tsirkin @ 2014-03-02 14:55 UTC (permalink / raw)
  To: Gabriel L. Somlo
  Cc: kvm, pbonzini, alex.williamson, eddie.dong, agraf, qemu-devel

On Thu, Feb 27, 2014 at 11:06:17PM -0500, Gabriel L. Somlo wrote:
> Both QEMU and KVM have already accumulated a significant number of
> optimizations based on the hard-coded assumption that ioapic polarity
> will always use the ActiveHigh convention, where the logical and
> physical states of level-triggered irq lines always match (i.e.,
> active(asserted) == high == 1, inactive == low == 0). QEMU guests
> are expected to follow directions given via ACPI and configure the
> ioapic with polarity 0 (ActiveHigh). However, even when misbehaving
> guests (e.g. OS X <= 10.9) set the ioapic polarity to 1 (ActiveLow),
> QEMU will still use the ActiveHigh signaling convention when
> interfacing with KVM.
> 
> This patch modifies KVM to completely ignore ioapic polarity as set by
> the guest OS, enabling misbehaving guests to work alongside those which
> comply with the ActiveHigh polarity specified by QEMU's ACPI tables.
> 
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Gabriel L. Somlo <somlo@cmu.edu>
> ---
> 
> On Fri, Feb 28, 2014 at 12:31:27AM +0100, Paolo Bonzini wrote:
> >>>This is a much better description.  Can you turn it into a patch to
> >>>Documentation/virtual/kvm/api.txt and a more complete commit
> >>>message?
> 
> OK, let me know what you all think of this version.

Looks good to me.

Acked-by: Michael S. Tsirkin <mst@redhat.com>



> > If you change ACPI tables to ActiveLow, and leave the polarity
> > inversion in hw/intc/ioapic.c, then it will matter.
> 
> I only changed ACPI to ActiveLow to cause Linux to misbehave in the
> same way OS X does, it was a temporary/test hack. ACPI should stay
> ActiveHigh to reflect the rest of QEMU/KVM, and misbehaving guest OSs
> should ignore it at their own peril :)

I actually think now that you've fixed intc.c as well, we should change
it as the next step.



> > >(Hmmm, maybe this one of the reasons I never got OS X to boot without
> > >-enable-kvm... I should look at the QEMU hw/intc/ioapic*.c, and see
> > >if *it* cares about guest-configured polarity, and maybe get it to
> > >*stop* caring :)
> 
> > Exactly my point. :)
> 
> So this patch gets the accelerated path to work regardless of how
> well-behaved a guest OS may or may not be w.r.t. ACPI and polarity.
> I'll try to find out whether it's possible to be *extra* nice to these
> types of guests by also allowing them to ignore ACPI polarity when not
> using acceleration :)
> 
> Thanks again,
>   Gabriel
> 
>  Documentation/virtual/kvm/api.txt | 18 ++++++++++++++++++
>  include/uapi/linux/kvm.h          |  1 +
>  virt/kvm/ioapic.c                 |  1 -
>  3 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index 6cd63a9..0492a94 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1365,6 +1365,24 @@ struct kvm_irq_routing_msi {
>  	__u32 pad;
>  };
>  
> +NOTE: For each level-triggered interrupt managed by a virtual ioapic,
> +the guest OS may set a polarity value (bit 13 of each corresponding I/O
> +redirection table register). The polarity bit defines the relationship
> +between an irq line's logical state (active/asserted = 1, inactive = 0)
> +and its physical state (high = 1, low = 0). When the polarity bit is 0
> +(ActiveHigh), logical and physical states are matched (active == high == 1,
> +inactive == low == 0). When polarity is set to 1 (ActiveLow), logical and
> +physical state are opposites (logical == !physical). Typically, guests are
> +expected to use the same polarity across all level-triggered interrupts,
> +as directed by ACPI. Historically, both KVM and QEMU have accumulated a
> +significant number of optimizations based on the hard-coded assumption
> +that polarity will always be set to ActiveHigh. When interfacing with KVM,
> +QEMU will use the ActiveHigh convention for all level-triggered irqs,
> +regardless of how the guest OS has configured the polarity bits in the
> +ioapic registers. As such, KVM must also ignore these bits, and always act
> +as if the logical and physical states of an irq line exactly match each
> +other (i.e., follow the ActiveHigh convention regardless of polarity).
> +
>  
>  4.53 KVM_ASSIGN_SET_MSIX_NR
>  
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 932d7f2..5bfc0e6 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -675,6 +675,7 @@ struct kvm_ppc_smmu_info {
>  #define KVM_CAP_SPAPR_MULTITCE 94
>  #define KVM_CAP_EXT_EMUL_CPUID 95
>  #define KVM_CAP_HYPERV_TIME 96
> +#define KVM_CAP_X86_IOAPIC_POLARITY_IGNORED 97
>  
>  #ifdef KVM_CAP_IRQ_ROUTING
>  
> diff --git a/virt/kvm/ioapic.c b/virt/kvm/ioapic.c
> index ce9ed99..1539d37 100644
> --- a/virt/kvm/ioapic.c
> +++ b/virt/kvm/ioapic.c
> @@ -328,7 +328,6 @@ int kvm_ioapic_set_irq(struct kvm_ioapic *ioapic, int irq, int irq_source_id,
>  	irq_level = __kvm_irq_line_state(&ioapic->irq_states[irq],
>  					 irq_source_id, level);
>  	entry = ioapic->redirtbl[irq];
> -	irq_level ^= entry.fields.polarity;
>  	if (!irq_level) {
>  		ioapic->irr &= ~mask;
>  		ret = 1;
> -- 
> 1.8.1.4

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC PATCH v2] kvm: x86: ignore ioapic polarity
  2014-02-28  4:06                             ` [RFC PATCH v2] kvm: x86: ignore ioapic polarity Gabriel L. Somlo
  2014-03-02 14:55                               ` Michael S. Tsirkin
@ 2014-03-13 10:53                               ` Paolo Bonzini
  2014-03-13 13:43                                 ` Gabriel L. Somlo
  1 sibling, 1 reply; 32+ messages in thread
From: Paolo Bonzini @ 2014-03-13 10:53 UTC (permalink / raw)
  To: Gabriel L. Somlo, kvm; +Cc: mst, alex.williamson, eddie.dong, agraf, qemu-devel

Il 28/02/2014 05:06, Gabriel L. Somlo ha scritto:
> +NOTE: For each level-triggered interrupt managed by a virtual ioapic,
> +the guest OS may set a polarity value (bit 13 of each corresponding I/O
> +redirection table register). The polarity bit defines the relationship
> +between an irq line's logical state (active/asserted = 1, inactive = 0)
> +and its physical state (high = 1, low = 0). When the polarity bit is 0
> +(ActiveHigh), logical and physical states are matched (active == high == 1,
> +inactive == low == 0). When polarity is set to 1 (ActiveLow), logical and
> +physical state are opposites (logical == !physical). Typically, guests are
> +expected to use the same polarity across all level-triggered interrupts,
> +as directed by ACPI. Historically, both KVM and QEMU have accumulated a
> +significant number of optimizations based on the hard-coded assumption
> +that polarity will always be set to ActiveHigh. When interfacing with KVM,
> +QEMU will use the ActiveHigh convention for all level-triggered irqs,
> +regardless of how the guest OS has configured the polarity bits in the
> +ioapic registers. As such, KVM must also ignore these bits, and always act
> +as if the logical and physical states of an irq line exactly match each
> +other (i.e., follow the ActiveHigh convention regardless of polarity).
> +

Instead of this, I'm adding the following to the KVM_IRQ_LINE ioctl:

+On real hardware, interrupt pins can be active-low or active-high.  This
+does not matter for the level field of struct kvm_irq_level: 1 always
+means active (asserted), 0 means inactive (deasserted).
+
+x86 allows the operating system to program the interrupt polarity
+(active-low/active-high) for level-triggered interrupts, and KVM used
+to consider the polarity.  However, due to bitrot in the handling of
+active-low interrupts, the above convention is now valid on x86 too.
+This is signaled by KVM_CAP_X86_IOAPIC_POLARITY_IGNORED.  Userspace
+should not present interrupts to the guest as active-low unless this
+capability is present (or unless it is not using the in-kernel irqchip,
+of course).

and applying the patch to kvm/queue.

Paolo

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [RFC PATCH v2] kvm: x86: ignore ioapic polarity
  2014-03-13 10:53                               ` Paolo Bonzini
@ 2014-03-13 13:43                                 ` Gabriel L. Somlo
  0 siblings, 0 replies; 32+ messages in thread
From: Gabriel L. Somlo @ 2014-03-13 13:43 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, mst, eddie.dong, qemu-devel, agraf, alex.williamson

On Thu, Mar 13, 2014 at 11:53:04AM +0100, Paolo Bonzini wrote:
> Instead of this, I'm adding the following to the KVM_IRQ_LINE ioctl:
> 
> +On real hardware, interrupt pins can be active-low or active-high.  This
> +does not matter for the level field of struct kvm_irq_level: 1 always
> +means active (asserted), 0 means inactive (deasserted).
> +
> +x86 allows the operating system to program the interrupt polarity
> +(active-low/active-high) for level-triggered interrupts, and KVM used
> +to consider the polarity.  However, due to bitrot in the handling of
> +active-low interrupts, the above convention is now valid on x86 too.
> +This is signaled by KVM_CAP_X86_IOAPIC_POLARITY_IGNORED.  Userspace
> +should not present interrupts to the guest as active-low unless this
> +capability is present (or unless it is not using the in-kernel irqchip,
> +of course).
> 
> and applying the patch to kvm/queue.

Sounds great to me, thanks !

--Gabriel

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2014-03-13 13:43 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <20140130204423.GK29329@ERROL.INI.CMU.EDU>
2014-02-11 18:23 ` RFC: ioapic polarity vs. qemu os-x guest Gabriel L. Somlo
2014-02-11 19:54   ` Michael S. Tsirkin
2014-02-11 21:35     ` Gabriel L. Somlo
2014-02-14 21:13     ` Gabriel L. Somlo
2014-02-14 21:21       ` Alexander Graf
2014-02-14 22:06         ` Gabriel L. Somlo
2014-02-14 22:13           ` Alexander Graf
2014-02-16 11:18             ` Michael S. Tsirkin
2014-02-16 11:41             ` Michael S. Tsirkin
2014-02-16 14:47               ` Alex Williamson
2014-02-16 16:23                 ` Michael S. Tsirkin
2014-02-17 17:57                   ` Gabriel L. Somlo
2014-02-17 18:01                     ` Gabriel L. Somlo
2014-02-17 18:06                       ` Paolo Bonzini
2014-02-17 19:38                         ` Gabriel L. Somlo
2014-02-18  0:58                           ` Gabriel L. Somlo
2014-02-27 17:05                   ` [PATCH RFC] kvm: ignore apic polarity Michael S. Tsirkin
2014-02-27 21:41                     ` Gabriel L. Somlo
2014-02-27 22:30                       ` Paolo Bonzini
2014-02-27 23:13                         ` Gabriel L. Somlo
2014-02-27 23:31                           ` Paolo Bonzini
2014-02-28  4:06                             ` [RFC PATCH v2] kvm: x86: ignore ioapic polarity Gabriel L. Somlo
2014-03-02 14:55                               ` Michael S. Tsirkin
2014-03-13 10:53                               ` Paolo Bonzini
2014-03-13 13:43                                 ` Gabriel L. Somlo
2014-02-28  4:55                         ` [PATCH RFC] kvm: ignore apic polarity Michael S. Tsirkin
2014-02-28  8:10                           ` Paolo Bonzini
2014-02-28  8:11                           ` Paolo Bonzini
2014-03-01  5:03                     ` Alex Williamson
2014-02-16 11:34       ` RFC: ioapic polarity vs. qemu os-x guest Michael S. Tsirkin
2014-02-16 15:12         ` [Qemu-devel] " Peter Maydell
2014-02-16 11:37       ` Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).