Re: [Qemu-devel] [PATCH] virtio-pci: Set the QEMU_PCI_CAP_EXPRESS capability early in its DeviceClass realize method

From: Marcel Apfelbaum <marcel@redhat.com>
To: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] virtio-pci: Set the QEMU_PCI_CAP_EXPRESS capability early in its DeviceClass realize method
Date: Wed, 2 Dec 2015 11:51:46 +0200
Message-ID: <565EBF32.80803@redhat.com>
In-Reply-To: <20151202100109.72be507a@pixies>

On 12/02/2015 10:01 AM, Shmulik Ladkani wrote:
> Thanks Marcel,
>
> On Tue, 1 Dec 2015 22:46:33 +0200, marcel@redhat.com wrote:
>>>> The reason is the device becomes express only if *all* the conditions
>>>> are met.
>>>
>>> I'm ok with either approach.
>>>
>>> However it seems common practice to set QEMU_PCI_CAP_EXPRESS
>>> unconditionally for PCIE devices.
>>>
>>> The few existing PCIE devices do so by assigning their
>>> PCIDeviceClass.is_express to 1 within their 'class_init', regardless of
>>> the properties of the bus they're on.
>>> (e.g. xhci_class_init, megasas_class_init, vfio_pci_dev_class_init,
>>>    nvme_class_init, and more)
>>>
>>> Some devices later call pcie_endpoint_cap_init conditionally.
>>> (e.g. usb_xhci_realize).
>>>
>>> Can you please examine this and let me know the preferred approach?
>>
>> Yes, I saw that..., as always not a walk in the park.
>>
>> - So we have "is_express = true" <=> QEMU_PCI_CAP_EXPRESS on <=> "config size = PCIe"
>> - Not related to the above (!!), if (some condition) => add PCIe express capability
>>     (megasas is the exception)
>>
>> Let's take "usb_xhci":
>>    - If we put it under a PCI bus it will not be an express device, but
>>      it will have a "big" config space. Also pci_is_express(dev) will still return true!
>>    - This is probably a bug. (or I am missing something)
>
> I actually assumed this is the right behavior.
>
> A device class reports whether its instances *could* be pcie by arming
> its PCIDeviceClass.is_express.
> As such, the "big" config space is allocated for the instance. This is
> harmless.
>
> Such a device may (or may not) be connected to a pcie bus, and only if
> so, we report it is a pcie endpoint.
>
> Also, pcie_add_capability is allowed on that device, in order to set up
> whatever capabilities on its pcie config space (even if finally not on a
> pcie bus).
>
> Moreover, VMSTATE_PCIE_DEVICE (which uses vmstate_pcie_device rather
> than vmstate_pci_device) can be used for that device's
> VMStateDescription fields *without* worrying whether the actual config
> space is "big" or "small".
> Otherwise one should examine whether vmstate_pcie_device or
> vmstate_pci_device need to be used. Seems tedious.
>
> This is the reasoning I can think of, why assigning QEMU_PCI_CAP_EXPRESS
> and reporting pcie_endpoint_cap_init are not tightly coupled.

I agree it may be the reason, but that does not make it right.
I still see two *possible* problems:
1. PCI config space is guest visible. The guest can read/write to
    a place it shouldn't. I don't know if it is a *real* issue, but it needs checking.
2. We still have pci_is_express returning true. This is error prone because one
    can use this function assuming the device is express. Maybe we should call it "can_be_express"?
If the migration construct (VMSTATE) is *the only* reason for doing this, maybe it is not a good
enough reason (I am not the one to decide :)  ). It still seems a little off to me.
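
To make point 2 concrete, here is a small stand-alone model. It is not
QEMU code: the Model* types, the model_* functions, and the bit chosen
for MODEL_CAP_EXPRESS are made up for illustration, and only the two
config space sizes (256 and 4096 bytes) are the usual PCI/PCIe values.
It separates "the class says the device can be express" from "the device
actually sits on an express bus":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for illustration; not the real QEMU definitions. */
#define MODEL_PCI_CONFIG_SIZE   0x100u    /* conventional PCI: 256 bytes */
#define MODEL_PCIE_CONFIG_SIZE  0x1000u   /* PCIe: 4096 bytes */
#define MODEL_CAP_EXPRESS       (1u << 2) /* bit position is arbitrary here */

typedef struct ModelBus {
    bool is_express;
} ModelBus;

typedef struct ModelDev {
    uint32_t cap_present;   /* capability flags copied from the class */
    const ModelBus *bus;    /* bus the device is plugged into */
} ModelDev;

/* What the current check really answers: "can this device be express?" */
static bool model_can_be_express(const ModelDev *d)
{
    return (d->cap_present & MODEL_CAP_EXPRESS) != 0;
}

/* Config space size follows the flag only, not the bus type. */
static uint32_t model_config_size(const ModelDev *d)
{
    return model_can_be_express(d) ? MODEL_PCIE_CONFIG_SIZE
                                   : MODEL_PCI_CONFIG_SIZE;
}

/* What a caller usually wants to know: flag set AND express bus. */
static bool model_acts_as_express(const ModelDev *d)
{
    assert(d->bus != NULL);
    return model_can_be_express(d) && d->bus->is_express;
}
```

With the flag set and a conventional PCI bus, model_can_be_express() is
true and the config space is the big one, while model_acts_as_express()
is false. That gap is what a caller of a function named "is_express" can
easily miss.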

If you think this is good enough, you can simply do the same:
   - Instead of replacing the realize method, just advertise it with "is_express" (meaning it can be express)
   - Leave all the conditions as they were in the previous patch.
As a result, the PCI config space will have the right length.
The consequences are clear now; if the virtio/pci maintainers are OK with that, so am I.
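
A rough sketch of that flow, again as a stand-alone model rather than
the actual virtio-pci code. All Sketch*/sketch_* names are hypothetical,
and "all_conditions_met" stands for whatever checks the previous patch
performed; they are not spelled out here:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; they only mimic the flow discussed above. */
#define SKETCH_PCI_CONFIG_SIZE   0x100u
#define SKETCH_PCIE_CONFIG_SIZE  0x1000u
#define SKETCH_CAP_EXPRESS       (1u << 0)

typedef struct SketchClass {
    bool is_express;            /* "instances can be express" */
} SketchClass;

typedef struct SketchDev {
    uint32_t cap_present;
    uint32_t config_size;
    bool has_pcie_endpoint_cap; /* guest-visible PCIe capability added? */
} SketchDev;

/* Step 1: advertise "can be express" at class level, unconditionally. */
static void sketch_class_init(SketchClass *k)
{
    k->is_express = true;
}

/* Registration sizes the config space from the class flag alone. */
static void sketch_register(SketchDev *d, const SketchClass *k)
{
    d->cap_present = k->is_express ? SKETCH_CAP_EXPRESS : 0;
    d->config_size = k->is_express ? SKETCH_PCIE_CONFIG_SIZE
                                   : SKETCH_PCI_CONFIG_SIZE;
    d->has_pcie_endpoint_cap = false;
}

/* Step 2: realize keeps its conditions before adding the capability. */
static void sketch_realize(SketchDev *d, bool all_conditions_met)
{
    assert(d->config_size != 0); /* device must be registered first */
    if ((d->cap_present & SKETCH_CAP_EXPRESS) && all_conditions_met) {
        d->has_pcie_endpoint_cap = true;
    }
}
```

In this model a device whose conditions are not met still gets the
4096-byte config space but no PCIe endpoint capability, which is exactly
the trade-off described above.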

Thanks,
Marcel




>
> Indeed, no strict solution here, both approaches seem reasonable (and
> both are used!).
>
> WDYT? Does my interpretation above make sense?
>
> Regards,
> Shmulik
>
