Re: [Qemu-devel] [PATCH] virtio-pci: Set the QEMU_PCI_CAP_EXPRESS capability early in its DeviceClass realize method

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Marcel Apfelbaum <marcel@redhat.com>
To: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] virtio-pci: Set the QEMU_PCI_CAP_EXPRESS capability early in its DeviceClass realize method
Date: Wed, 2 Dec 2015 11:51:46 +0200	[thread overview]
Message-ID: <565EBF32.80803@redhat.com> (raw)
In-Reply-To: <20151202100109.72be507a@pixies>

On 12/02/2015 10:01 AM, Shmulik Ladkani wrote:
> Thanks Marcel,
>
> On Tue, 1 Dec 2015 22:46:33 +0200, marcel@redhat.com wrote:
>>>> The reason is the device becomes express only if *all* the conditions
>>>> are met.
>>>
>>> I'm ok with either approaches.
>>>
>>> However it seems common practice to set QEMU_PCI_CAP_EXPRESS
>>> unconditionally for PCIE devices.
>>>
>>> The few existing PCIE devices do so by assigning their
>>> PCIDeviceClass.is_express to 1 within their 'class_init', regardless the
>>> properties of the bus their on.
>>> (e.g. xhci_class_init, megasas_class_init, vfio_pci_dev_class_init,
>>>    nvme_class_init, and more)
>>>
>>> Some devices later call pcie_endpoint_cap_init conditionally.
>>> (e.g. usb_xhci_realize).
>>>
>>> Can you please examine this and let me know the preferred approach?
>>
>> Yes, I saw that..., as always not a walk in the park.
>>
>> - So we have "is_express = true" <=> QEMU_PCI_CAP_EXPRESS on <=> "config size = PCIe"
>> - Not related to the above (!!), if (some condition) => add PCIe express capability
>>     (megasas is the exception)
>>
>> Let's take "usb_xhci":
>>    - If we put it under a PCI bus it will not be an express device, but
>>      it will have a "big" config space. Also pci_is_express(dev) will still return true!
>>    - This is probably a bug. (or I am missing something)
>
> I actually assumed this is the right behavior.
>
> A device class reports whether its instances *could* be pcie by arming
> its PCIDeviceClass.is_express.
> As such, the "big" config space is allocated for the instance. This is
> harmless.
>
> Such a device may (or may not) be connected to a pcie bus, and only if
> so, we report it is a pcie endpoint.
>
> Also, pcie_add_capability is allowed on that device, in order to setup
> whatever capabilities on its pcie config space (even if finally not on a
> pcie bus).
>
> Moreover, VMSTATE_PCIE_DEVICE (which uses vmstate_pcie_device rather
> than vmstate_pci_device) can be used for that device's
> VMStateDescription fields *without* worrying whether the actual config
> space is "big" or "small".
> Otherwise one should examine whether vmstate_pcie_device or
> vmstate_pci_device need to be used. Seems tedious.
>
> This is the reasoning I can think of, why assigning QEMU_PCI_CAP_EXPRESS
> and reporting pcie_endpoint_cap_init are not tightly coupled.

I agree it may be the reason, but that does not make it right.
I still see two *possible* problems:
1. Pci config space is guest visible. The guest can read/write to
    a place it shouldn't. I don't know if is a *real* issue, but it needs checking.
2. We still have pci_is_express returning true, this is error prone because one
    can use this function assuming the device is express. Maybe we should call it "can_be_express" ?
If the migration construct (VMSTATE) is *the only* reason for doing this, maybe is not a good
enough reason (I am not the one to decide :)  ). Is still it seems a little off to me.

If you think this is good enough, you can simply do the same:
   - Instead of replacing the realize method, just advertise it with "is_express" (meaning it can be express)
   - Leave all the conditions as they were in prev patch.
As a result, the pci config space will have the right length.
The consequences are obvious now, if virtio/pci maintainers are OK with that, so am I.

Thanks,
Marcel




>
> Indeed, no strict solution here, both approaches seem reasoanble (and
> both are used!).
>
> WDYT? Is my above interpretation makes sense?
>
> Regards,
> Shmulik
>

next prev parent reply	other threads:[~2015-12-02  9:51 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-01 16:23 [Qemu-devel] [PATCH] virtio-pci: Set the QEMU_PCI_CAP_EXPRESS capability early in its DeviceClass realize method Shmulik Ladkani
2015-12-01 16:36 ` Marcel Apfelbaum
2015-12-01 19:30   ` Shmulik Ladkani
2015-12-01 20:46     ` Marcel Apfelbaum
2015-12-02  8:01       ` Shmulik Ladkani
2015-12-02  9:51         ` Marcel Apfelbaum [this message]
2015-12-02 13:30           ` Shmulik Ladkani
2015-12-02 14:00             ` Marcel Apfelbaum
2015-12-02 14:27               ` Shmulik Ladkani

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=565EBF32.80803@redhat.com \
    --to=marcel@redhat.com \
    --cc=mst@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=shmulik.ladkani@ravellosystems.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.