qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: kwolf@redhat.com, jiri@resnulli.us, qemu-block@nongnu.org,
	jasowang@redhat.com, qemu-devel@nongnu.org,
	keith.busch@intel.com, dmitry@daynix.com,
	alex.williamson@redhat.com, sfeldma@gmail.com, kraxel@redhat.com,
	Cao jin <caoj.fnst@cn.fujitsu.com>,
	Marcel Apfelbaum <marcel@redhat.com>,
	pbonzini@redhat.com, jsnow@redhat.com,
	Jan Kiszka <jan.kiszka@web.de>,
	hare@suse.de
Subject: Re: [Qemu-devel] [PATCH RFC v2 1/2] Add param Error** to msi_init() & modify the callers
Date: Fri, 4 Mar 2016 15:10:06 +0200	[thread overview]
Message-ID: <20160304150138-mutt-send-email-mst@redhat.com> (raw)
In-Reply-To: <87mvqexuxq.fsf@blackfin.pond.sub.org>

On Fri, Mar 04, 2016 at 01:57:05PM +0100, Markus Armbruster wrote:
> "Michael S. Tsirkin" <mst@redhat.com> writes:
> 
> > On Fri, Mar 04, 2016 at 09:42:02AM +0100, Markus Armbruster wrote:
> >> Plugging an MSI-capable device into an MSI-incapable board works just
> >> fine, both for physical and for virtual hardware.  What doesn't work is
> >> plugging an MSI-capable device into an MSI-capable board with *broken*
> >> MSI support.
> >> 
> >> As a convenience feature, we summarily and silently remove a device's
> >> MSI capability when we detect such a broken board.  At least that's what
> >> the commit message you quoted claims.
> >
> > And this makes sense, right?
> 
> Yes, except for the part where we ignore the user's explicit orders,
> and, to a lesser degree, for the part where we create variants of
> physical devices that don't exist physically.
> 
> >> In reality, we remove it not just for broken boards, but even for
> >> MSI-incapable boards.
> >> 
> >> I take issue with "summarily and silently", and "even for MSI-incapable
> >> boards".
> >> 
> >> When multiple variants of a device exist, and the user didn't ask for a
> >> specific one, then picking the one that works best with the board is
> >> just fine.
> >> 
> >> It's absolutely not fine when the user did ask for a specific one.  When
> >> I ask for msi=on, I mean it.  If it can't work with this board, tell me.
> >> But silently ignoring my orders is a bug.
> >
> > I agree. msi is not the only case either.  We really need - for any boolean
> > flag - a way to figure out whether it was set by user.
> > With that in place we could fix it.
> 
> QMP provides that directly as "optional bool", but qdev properties do
> "optional" diffently, and when you see the default value, you don't know
> whether it comes from the user or not.
> 
> Another solution is an on/off/auto type.  We got it already in the QAPI
> schema, as OnOffAuto, and my recent "[PATCH 32/38] qdev: New
> DEFINE_PROP_ON_OFF_AUTO" makes it available as qdev property.  With
> default set to auto, we should be set.

Should we somehow change all on/off properties to on/off/auto?

> > However, almost no one uses the msi/msi-x flag - we introduced
> > them only for one reason - for backwards compatibility.
> > The fact that each time we need a compatibility flag
> > we also expose a new user interface is very unfortunate.
> > IMO it was a design mistake made a long time ago and it won't
> > be easy to fix now.
> 
> I agree there's no easy fix, but we can try to find a non-easy one.
> 
> For backward compatibility, we need to configure some device models for
> certain machine types.  We use the only object configuration mechanism
> we have: properties.  The properties we use for compatibility are all
> exposed to the user.
> 
> We could easily provide a flag to mark properties private, and only
> accept non-private properties at external interfaces.  This should help
> stopping growth of the problem, but it's not an easy fix for reducing
> it, because making a property private retroactively is problematic.  We
> could have a flag to mark them deprecated instead, warn on use, remove
> them from documentation, and perhaps drop them a couple of releases
> later.

Sounds good.

> > And for the above reason I personally do not intend to
> > spend time designing a specific hack just for the msi
> > property.
> >
> >> It's fine to have emulations of MSI-capable boards where MSI doesn't yet
> >> work.  Even if that means we have to reject MSI-capable devices.
> >
> > I don't know what does reject mean here. Removing msi capability?
> > In that case I agree.
> 
> By "reject" I mean "reject the user's order to plug in an MSI-capable
> device."

I don't believe anyone tweaks these properties in practice.

However, I have to wonder. Assume that you have supplied
a device with 10 properties. QEMU can not support them.
At your suggesion, device is rejected. How does user
know which property to tweak? Try all values for them all?


> >> It's absolutely not fine to reject them for MSI-incapable boards, where
> >> they'd work just fine.
> >
> > I think that as long as users did not ask for msi explicitly,
> > and board is msi incapable, it does not matter much whether
> > device has msi capability or not - guest will not try
> > to use it anyway.
> 
> If the device comes in MSI-capable and MSI-incapable variants, and the
> user didn't specify a variant, then picking one that fits the board is
> fine.
> 
> If the device does not come in variants (and many physical devices
> don't), then rejecting it just because it's MSI-capable and the board
> isn't is not fine.
> 
> To fix, we'd have to limit what is now !msi_supported to the boards that
> are nominally MSI-capable, except our emulation doesn't work, i.e. do
> exactly what the commit message promised.

I rather think it's academic. MSI is a performance optimization, and is
optional for guests anyway.  It's hard to see when may users absolutely
require it.

> The PCI core encourages creating both variants, and most (but not all)
> device models do, but:
> 
> >> Furthermore, I doubt the wisdom of creating virtual devices that don't
> >> exist physically just to have something that works in our broken boards.
> >> If the physical device exists in MSI-capable and MSI-incapable variants,
> >> emulating both is fine.  But if it only ever existed MSI-capable, having
> >> the PCI core(!) drop the MSI capability is a questionable idea.  I
> >> suspect that the need for this dubious hack would be much smaller if we
> >> didn't foolishly treat every MSI-incapable board as broken MSI-capable
> >> board.
> >> 
> >> If you conclude that cleaning up this carelessly made mess is not worth
> >> the bother now, then at least explain the mess in the code, please.
> >> It's not obvious, and figuring out what's going on and why it is the way
> >> it is has taken me several hours, and Marcel's help.
> >
> > I think it's worth cleaning up, or at least documenting.
> > Fixing it will take much more than the patch proposed here,
> > and we can not start with this patch as it will cause
> > regressions.
> > Adding a comment won't be too much work.
> > How about the below?
> >
> > -->
> >
> > msi_supported -> msi_nonbroken
> >
> > Rename controller flag to make it clearer what it means.
> > Add some documentation as well.
> >
> > Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >
> > ---
> >
> > diff --git a/include/hw/pci/msi.h b/include/hw/pci/msi.h
> > index 50e452b..8124908 100644
> > --- a/include/hw/pci/msi.h
> > +++ b/include/hw/pci/msi.h
> > @@ -29,7 +29,7 @@ struct MSIMessage {
> >      uint32_t data;
> >  };
> >  
> > -extern bool msi_supported;
> > +extern bool msi_nonbroken;
> >  
> >  void msi_set_message(PCIDevice *dev, MSIMessage msg);
> >  MSIMessage msi_get_message(PCIDevice *dev, unsigned int vector);
> > diff --git a/hw/i386/kvm/apic.c b/hw/i386/kvm/apic.c
> > index 694d398..3c7c8fa 100644
> > --- a/hw/i386/kvm/apic.c
> > +++ b/hw/i386/kvm/apic.c
> > @@ -186,7 +186,7 @@ static void kvm_apic_realize(DeviceState *dev, Error **errp)
> >                            APIC_SPACE_SIZE);
> >  
> >      if (kvm_has_gsi_routing()) {
> > -        msi_supported = true;
> > +        msi_nonbroken = true;
> >      }
> >  }
> >  
> > diff --git a/hw/i386/xen/xen_apic.c b/hw/i386/xen/xen_apic.c
> > index 2b8d709..21d68ee 100644
> > --- a/hw/i386/xen/xen_apic.c
> > +++ b/hw/i386/xen/xen_apic.c
> > @@ -44,7 +44,7 @@ static void xen_apic_realize(DeviceState *dev, Error **errp)
> >      s->vapic_control = 0;
> >      memory_region_init_io(&s->io_memory, OBJECT(s), &xen_apic_io_ops, s,
> >                            "xen-apic-msi", APIC_SPACE_SIZE);
> > -    msi_supported = true;
> > +    msi_nonbroken = true;
> >  }
> >  
> >  static void xen_apic_set_base(APICCommonState *s, uint64_t val)
> > diff --git a/hw/intc/apic.c b/hw/intc/apic.c
> > index a299462..28c2ea5 100644
> > --- a/hw/intc/apic.c
> > +++ b/hw/intc/apic.c
> > @@ -874,7 +874,7 @@ static void apic_realize(DeviceState *dev, Error **errp)
> >      s->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, apic_timer, s);
> >      local_apics[s->idx] = s;
> >  
> > -    msi_supported = true;
> > +    msi_nonbroken = true;
> >  }
> >  
> >  static void apic_class_init(ObjectClass *klass, void *data)
> > diff --git a/hw/intc/arm_gicv2m.c b/hw/intc/arm_gicv2m.c
> > index 70c0b97..ebd368b 100644
> > --- a/hw/intc/arm_gicv2m.c
> > +++ b/hw/intc/arm_gicv2m.c
> > @@ -148,7 +148,7 @@ static void gicv2m_realize(DeviceState *dev, Error **errp)
> >          sysbus_init_irq(SYS_BUS_DEVICE(dev), &s->spi[i]);
> >      }
> >  
> > -    msi_supported = true;
> > +    msi_nonbroken = true;
> >      kvm_gsi_direct_mapping = true;
> >      kvm_msi_via_irqfd_allowed = kvm_irqfds_enabled();
> >  }
> > diff --git a/hw/intc/openpic.c b/hw/intc/openpic.c
> > index 903888c..7685250 100644
> > --- a/hw/intc/openpic.c
> > +++ b/hw/intc/openpic.c
> > @@ -1375,7 +1375,7 @@ static void fsl_common_init(OpenPICState *opp)
> >  
> >      opp->irq_msi = 224;
> >  
> > -    msi_supported = true;
> > +    msi_nonbroken = true;
> >      for (i = 0; i < opp->fsl->max_ext; i++) {
> >          opp->src[i].level = false;
> >      }
> > diff --git a/hw/intc/openpic_kvm.c b/hw/intc/openpic_kvm.c
> > index 4dcdb61..778af4a 100644
> > --- a/hw/intc/openpic_kvm.c
> > +++ b/hw/intc/openpic_kvm.c
> > @@ -239,7 +239,7 @@ static void kvm_openpic_realize(DeviceState *dev, Error **errp)
> >      memory_listener_register(&opp->mem_listener, &address_space_memory);
> >  
> >      /* indicate pic capabilities */
> > -    msi_supported = true;
> > +    msi_nonbroken = true;
> >      kvm_kernel_irqchip = true;
> >      kvm_async_interrupts_allowed = true;
> >  
> > diff --git a/hw/pci-bridge/pci_bridge_dev.c b/hw/pci-bridge/pci_bridge_dev.c
> > index 100bb5e..862a2366 100644
> > --- a/hw/pci-bridge/pci_bridge_dev.c
> > +++ b/hw/pci-bridge/pci_bridge_dev.c
> > @@ -72,7 +72,7 @@ static int pci_bridge_dev_initfn(PCIDevice *dev)
> >          goto slotid_error;
> >      }
> >      if ((bridge_dev->flags & (1 << PCI_BRIDGE_DEV_F_MSI_REQ)) &&
> > -        msi_supported) {
> > +        msi_nonbroken) {
> >          err = msi_init(dev, 0, 1, true, true);
> >          if (err < 0) {
> >              goto msi_error;
> > diff --git a/hw/pci/msi.c b/hw/pci/msi.c
> > index 85f21b8..e0e64c2 100644
> > --- a/hw/pci/msi.c
> > +++ b/hw/pci/msi.c
> > @@ -34,8 +34,21 @@
> >  
> >  #define PCI_MSI_VECTORS_MAX     32
> >  
> > -/* Flag for interrupt controller to declare MSI/MSI-X support */
> > -bool msi_supported;
> > +/*
> > + * Flag for interrupt controllers to declare broken MSI/MSI-X support.
> > + * values: false - broken; true - non-broken.
> > + *
> > + * Setting this flag to false will remove MSI/MSI-X capability from all devices.
> > + *
> > + * It is preferrable for controllers to set this to true (non-broken) even if
> > + * they do not actually support MSI/MSI-X: guests normally probe the controller
> > + * type and do not attempt to enable MSI/MSI-X with interrupt controllers not
> > + * supporting such, so removing the capability is not required, and
> > + * it seems cleaner to have a given device look the same for all boards.
> > + *
> > + * TODO: some existing controllers violate the above rule. Identify and fix them.
> > + */
> > +bool msi_nonbroken;
> 
> Good description.  I'd use FIXME rather than TODO, but that's detail.
> 
> >  
> >  /* If we get rid of cap allocator, we won't need this. */
> >  static inline uint8_t msi_cap_sizeof(uint16_t flags)
> > @@ -160,7 +173,7 @@ int msi_init(struct PCIDevice *dev, uint8_t offset,
> >      uint8_t cap_size;
> >      int config_offset;
> >  
> > -    if (!msi_supported) {
> > +    if (!msi_nonbroken) {
> >          return -ENOTSUP;
> >      }
> >  
> > diff --git a/hw/pci/msix.c b/hw/pci/msix.c
> > index 537fdba..b75f0e9 100644
> > --- a/hw/pci/msix.c
> > +++ b/hw/pci/msix.c
> > @@ -249,7 +249,7 @@ int msix_init(struct PCIDevice *dev, unsigned short nentries,
> >      uint8_t *config;
> >  
> >      /* Nothing to do if MSI is not supported by interrupt controller */
> > -    if (!msi_supported) {
> > +    if (!msi_nonbroken) {
> >          return -ENOTSUP;
> >      }
> >  
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index e9d4abf..c4fb206 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -439,7 +439,7 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
> >      _FDT((fdt_property_cell(fdt, "rtas-event-scan-rate",
> >                              RTAS_EVENT_SCAN_RATE)));
> >  
> > -    if (msi_supported) {
> > +    if (msi_nonbroken) {
> >          _FDT((fdt_property(fdt, "ibm,change-msix-capable", NULL, 0)));
> >      }
> >  
> > @@ -1743,7 +1743,7 @@ static void ppc_spapr_init(MachineState *machine)
> >      bool kernel_le = false;
> >      char *filename;
> >  
> > -    msi_supported = true;
> > +    msi_nonbroken = true;
> >  
> >      QLIST_INIT(&spapr->phbs);
> >  
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index e8edad3..3fc7895 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -1790,7 +1790,7 @@ void spapr_pci_rtas_init(void)
> >                          rtas_ibm_read_pci_config);
> >      spapr_rtas_register(RTAS_IBM_WRITE_PCI_CONFIG, "ibm,write-pci-config",
> >                          rtas_ibm_write_pci_config);
> > -    if (msi_supported) {
> > +    if (msi_nonbroken) {
> >          spapr_rtas_register(RTAS_IBM_QUERY_INTERRUPT_SOURCE_NUMBER,
> >                              "ibm,query-interrupt-source-number",
> >                              rtas_ibm_query_interrupt_source_number);
> > diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c
> > index dba0202..f5f679f 100644
> > --- a/hw/s390x/s390-pci-bus.c
> > +++ b/hw/s390x/s390-pci-bus.c
> > @@ -597,7 +597,7 @@ static void s390_pcihost_class_init(ObjectClass *klass, void *data)
> >      k->init = s390_pcihost_init;
> >      hc->plug = s390_pcihost_hot_plug;
> >      hc->unplug = s390_pcihost_hot_unplug;
> > -    msi_supported = true;
> > +    msi_nonbroken = true;
> >  }
> >  
> >  static const TypeInfo s390_pcihost_info = {
> 
> Much appreciated!
> 
> Reviewed-by: Markus Armbruster <armbru@redhat.com>

  reply	other threads:[~2016-03-04 13:10 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-15 10:43 [Qemu-devel] [PATCH RFC v2 0/2] MSI/MSIX: fix to catch and report errors Cao jin
2015-12-15 10:43 ` [Qemu-devel] [PATCH RFC v2 1/2] Add param Error** to msi_init() & modify the callers Cao jin
2016-03-02  9:13   ` Markus Armbruster
2016-03-03  3:56     ` Cao jin
2016-03-03 10:12     ` Marcel Apfelbaum
2016-03-03 10:45       ` Michael S. Tsirkin
2016-03-03 11:19         ` Marcel Apfelbaum
2016-03-03 11:33           ` Michael S. Tsirkin
2016-03-03 15:03             ` Markus Armbruster
2016-03-03 18:33               ` Michael S. Tsirkin
2016-03-04  8:42                 ` Markus Armbruster
2016-03-04  9:24                   ` Michael S. Tsirkin
2016-03-04 12:57                     ` Markus Armbruster
2016-03-04 13:10                       ` Michael S. Tsirkin [this message]
2016-03-03 14:24       ` Markus Armbruster
2016-03-23  4:04     ` Cao jin
2016-03-23  8:12       ` Markus Armbruster
2016-03-23  9:23         ` Cao jin
2015-12-15 10:43 ` [Qemu-devel] [PATCH v2 RFC 2/2] Add param Error** to msix_init() " Cao jin
2015-12-17  8:02 ` [Qemu-devel] [PATCH RFC v2 0/2] MSI/MSIX: fix to catch and report errors Cao jin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160304150138-mutt-send-email-mst@redhat.com \
    --to=mst@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=armbru@redhat.com \
    --cc=caoj.fnst@cn.fujitsu.com \
    --cc=dmitry@daynix.com \
    --cc=hare@suse.de \
    --cc=jan.kiszka@web.de \
    --cc=jasowang@redhat.com \
    --cc=jiri@resnulli.us \
    --cc=jsnow@redhat.com \
    --cc=keith.busch@intel.com \
    --cc=kraxel@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=marcel@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    --cc=sfeldma@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).