qemu-devel.nongnu.org archive mirror
From: Akihiko Odaki <akihiko.odaki@daynix.com>
To: Ani Sinha <anisinha@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
	Julia Suvorova <jusual@redhat.com>
Subject: Re: [PATCH v7 5/6] hw/pci: ensure PCIE devices are plugged into only slot 0 of PCIE port
Date: Wed, 5 Jul 2023 19:42:51 +0900
Message-ID: <cec1bb4e-813a-fd27-25a2-4d547b91613e@daynix.com>
In-Reply-To: <C3053F47-2C39-4CB4-BEBD-9EC95CF1C4BC@redhat.com>

On 2023/07/05 14:43, Ani Sinha wrote:
> 
> 
>> On 05-Jul-2023, at 7:09 AM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>
>>
>>
>> On 2023/07/05 0:07, Ani Sinha wrote:
>>>> On 04-Jul-2023, at 7:58 PM, Igor Mammedov <imammedo@redhat.com> wrote:
>>>>
>>>> On Tue, 4 Jul 2023 19:20:00 +0530
>>>> Ani Sinha <anisinha@redhat.com> wrote:
>>>>
>>>>>> On 04-Jul-2023, at 6:18 PM, Igor Mammedov <imammedo@redhat.com> wrote:
>>>>>>
>>>>>> On Tue, 4 Jul 2023 21:02:09 +0900
>>>>>> Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>>>>>
>>>>>>> On 2023/07/04 20:59, Ani Sinha wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> On 04-Jul-2023, at 5:24 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>>>>>>>>
>>>>>>>>> On 2023/07/04 20:25, Ani Sinha wrote:
>>>>>>>>>> PCI Express ports only have one slot, so PCI Express devices can only be
>>>>>>>>>> plugged into slot 0 on a PCIE port. Add a warning to let users know when the
>>>>>>>>>> invalid configuration is used. We may enforce this more strongly later on once
>>>>>>>>>> we get more clarity on whether we are introducing a bad regression for users
>>>>>>>>>> currently using the wrong configuration.
>>>>>>>>>> The change has been tested to not break or alter behaviors of ARI capable
>>>>>>>>>> devices by instantiating seven vfs on an emulated igb device (the maximum
>>>>>>>>>> number of vfs the linux igb driver supports). The vfs instantiated correctly
>>>>>>>>>> and are seen to have non-zero device/slot numbers in the conventional PCI BDF
>>>>>>>>>> representation.
>>>>>>>>>> CC: jusual@redhat.com
>>>>>>>>>> CC: imammedo@redhat.com
>>>>>>>>>> CC: mst@redhat.com
>>>>>>>>>> CC: akihiko.odaki@daynix.com
>>>>>>>>>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929
>>>>>>>>>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
>>>>>>>>>> Reviewed-by: Julia Suvorova <jusual@redhat.com>
>>>>>>>>>> ---
>>>>>>>>>> hw/pci/pci.c | 15 +++++++++++++++
>>>>>>>>>> 1 file changed, 15 insertions(+)
>>>>>>>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>>>>>>>>>> index e2eb4c3b4a..47517ba3db 100644
>>>>>>>>>> --- a/hw/pci/pci.c
>>>>>>>>>> +++ b/hw/pci/pci.c
>>>>>>>>>> @@ -65,6 +65,7 @@ bool pci_available = true;
>>>>>>>>>> static char *pcibus_get_dev_path(DeviceState *dev);
>>>>>>>>>> static char *pcibus_get_fw_dev_path(DeviceState *dev);
>>>>>>>>>> static void pcibus_reset(BusState *qbus);
>>>>>>>>>> +static bool pcie_has_upstream_port(PCIDevice *dev);
>>>>>>>>>>    static Property pci_props[] = {
>>>>>>>>>>      DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1),
>>>>>>>>>> @@ -2121,6 +2122,20 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
>>>>>>>>>>          }
>>>>>>>>>>      }
>>>>>>>>>> +    /*
>>>>>>>>>> +     * With SRIOV and ARI, vfs can have non-zero slot in the conventional
>>>>>>>>>> +     * PCI interpretation as all five bits reserved for slot addresses are
>>>>>>>>>> +     * also used for function bits for the various vfs. Ignore that case.
>>>>>>>>>
>>>>>>>>> You don't have to mention SR-IOV; it affects all ARI-capable devices. A PF can also have a non-zero slot number in the conventional interpretation, so you shouldn't call it a VF either.
>>>>>>>>
>>>>>>>> Can you please help write a comment that explains this properly for all cases - ARI/non-ARI, PFs and VFs? Once everyone agrees that it's clear and correct, I will re-spin.
>>>>>>>
>>>>>>> Simply, you can say:
>>>>>>> With ARI, the slot number field in the conventional PCI interpretation
>>>>>>> can have a non-zero value as the field bits are reused to extend the
>>>>>>> function number bits. Ignore that case.
>>>>>>
>>>>>> mentioning 'conventional PCI interpretation' in comment and then immediately
>>>>>> checking 'pci_is_express(pci_dev)' is confusing. Since comment belongs
>>>>>> only to PCIE branch it would be better to talk in only about PCIe stuff
>>>>>> and referring to relevant portions of spec.
>>>>>
>>>>> Ok so how about this?
>>>>>
>>>>>    * With ARI, devices can have non-zero slot in the traditional BDF
>>>>>      * representation as all five bits reserved for slot addresses are
>>>>>      * also used for function bits. Ignore that case.
>>>>
>>>> you still refer to traditional (which I misread as 'conventional'),
>>>> steal the linux comment and augment it with ARI if necessary,
>>>> something like this (probably needs some more massaging):
>>> The comment messaging in these patches seems to exceed the value of the patch itself :-)
>>> How about this?
>>>      /*
>>>       * A PCIe Downstream Port normally leads to a Link with only Device
>>>       * 0 on it (PCIe spec r3.1, sec 7.3.1).
>>>       * With ARI, PCI_SLOT() can return non-zero value as all five bits
>>>       * reserved for slot addresses are also used for function bits.
>>>       * Hence, ignore ARI capable devices.
>>>       */
>>
>> Perhaps: s/normally leads to/must lead to/
>>
>> From the kernel's perspective, they may need to deal with quirky hardware that does not conform to the specification, but from QEMU's perspective, the specification is what we *must* conform to.
> 
> PCI base spec 4.0, rev 3, section 7.3.1 says:
> 
> "
> Downstream Ports that do not have ARI Forwarding enabled must associate only Device 0 with the device attached to the Logical Bus representing the Link from the Port. Configuration Requests targeting the Bus Number associated with a Link specifying Device Number 0 are delivered to the device attached to the Link; Configuration Requests specifying all other Device Numbers (1-31) must be terminated by the Switch Downstream Port or the Root Port with an Unsupported Request Completion Status (equivalent to Master Abort in PCI). Non-ARI Devices must not assume that Device Number 0 is associated with their Upstream Port, but must capture their assigned Device Number as discussed in Section 2.2.6.2. Non-ARI Devices must respond to all Type 0 Configuration Read Requests, regardless of the Device Number specified in the Request.
> 
> …
> 
> With an ARI Device, its Device Number is implied to be 0 rather than specified by a field within an ID. The traditional 5-bit Device Number and 3-bit Function Number fields in its associated Routing IDs, Requester IDs, and Completer IDs are interpreted as a single 8-bit Function Number. See Section 6.13. Any Type 0 Configuration Request targeting an unimplemented Function in an ARI Device must be handled as an Unsupported Request.
> 
> "
> 
> So it seems they do indeed use the "must" clause. I prefer to use the wording from the spec as close to verbatim as possible. Hence, this is what I am going with, to be done with this patchset:
> 
>      /*
>       * A PCIe Downstream Port that does not have ARI Forwarding enabled must
>       * associate only Device 0 with the device attached to the bus
>       * representing the Link from the Port (PCIe base spec rev 4.0 ver 0.3,
>       * sec 7.3.1).
>       * With ARI, PCI_SLOT() can return non-zero value as the traditional
>       * 5-bit Device Number and 3-bit Function Number fields in its associated
>       * Routing IDs, Requester IDs and Completer IDs are interpreted as a
>       * single 8-bit Function Number. Hence, ignore ARI capable devices.
>       */

Looks perfect.

> 
> 
>>
>> Otherwise looks good to me.
>>
>>>>
>>>>
>>>>          /*
>>>>          * A PCIe Downstream Port normally leads to a Link with only Device
>>>>          * 0 on it (PCIe spec r3.1, sec 7.3.1).
>>>>           However PCI_SLOT() is broken if ARI is enabled, hence work around it
>>>>           by skipping check if the latter cap is present.
>>>>          */
>>>>
>>>>>
>>>>>
>>>>>> (for example see how it's done in kernel code: only_one_child(...))
>>>>>>
>>>>>> PS:
>>>>>> kernel can be forced  to scan for !0 device numbers, but that's rather
>>>>>> a hack, so we shouldn't really care about that.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>> +     */
>>>>>>>>>> +    if (pci_is_express(pci_dev) &&
>>>>>>>>>> +        !pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI) &&
>>>>>>>>>> +        pcie_has_upstream_port(pci_dev) &&
>>>>>>>>>> +        PCI_SLOT(pci_dev->devfn)) {
>>>>>>>>>> +        warn_report("PCI: slot %d is not valid for %s,"
>>>>>>>>>> +                    " parent device only allows plugging into slot 0.",
>>>>>>>>>> +                    PCI_SLOT(pci_dev->devfn), pci_dev->name);
>>>>>>>>>> +    }
>>>>>>>>>> +
>>>>>>>>>>      if (pci_dev->failover_pair_id) {
>>>>>>>>>>          if (!pci_bus_is_express(pci_get_bus(pci_dev))) {
>>>>>>>>>>              error_setg(errp, "failover primary device must be on "
>>>>>
>>>>
>>
> 
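
For readers of the archive, the condition discussed in this thread can be
sketched outside of QEMU as follows. This is only an illustration under stated
assumptions, not the code from the patch: struct fake_pci_dev and the helpers
slot_warning_applies(), ari_function_number() and check_examples() are
hypothetical names, and the boolean fields stand in for what the real code
obtains through pci_is_express(), pcie_find_capability(..., PCI_EXT_CAP_ID_ARI)
and pcie_has_upstream_port(). The PCI_SLOT()/PCI_FUNC() macros use the usual
5-bit device / 3-bit function split of devfn.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Usual devfn layout: upper 5 bits device (slot), lower 3 bits function. */
#define PCI_SLOT(devfn) (((devfn) >> 3) & 0x1f)
#define PCI_FUNC(devfn) ((devfn) & 0x07)

/* Hypothetical stand-in for the device state the real check inspects. */
struct fake_pci_dev {
    uint8_t devfn;          /* encoded device/function number */
    bool is_express;        /* assumed result of pci_is_express() */
    bool has_ari_cap;       /* assumed: ARI extended capability present */
    bool has_upstream_port; /* assumed result of pcie_has_upstream_port() */
};

/*
 * Mirrors the shape of the condition in the quoted hunk: warn only for a
 * PCIe device without ARI that sits below a port and uses a non-zero slot.
 */
static bool slot_warning_applies(const struct fake_pci_dev *d)
{
    return d->is_express && !d->has_ari_cap && d->has_upstream_port &&
           PCI_SLOT(d->devfn) != 0;
}

/* With ARI, all 8 bits of devfn are read as a single Function Number. */
static unsigned ari_function_number(uint8_t devfn)
{
    return devfn;
}

/* A few worked cases; the devfn values are made up for illustration. */
static void check_examples(void)
{
    struct fake_pci_dev slot0 = { 0x00, true, false, true };
    struct fake_pci_dev slot1 = { 0x08, true, false, true };
    struct fake_pci_dev ari   = { 0x81, true, true,  true };

    assert(!slot_warning_applies(&slot0)); /* slot 0: valid */
    assert(slot_warning_applies(&slot1));  /* slot 1, no ARI: warn */
    assert(!slot_warning_applies(&ari));   /* ARI device: ignored */

    /* 0x81 looks like slot 16, function 1 in the traditional split... */
    assert(PCI_SLOT(ari.devfn) == 16 && PCI_FUNC(ari.devfn) == 1);
    /* ...but is simply function 129 of device 0 under ARI. */
    assert(ari_function_number(ari.devfn) == 129);
}
```

The ARI case shows why the check skips ARI-capable devices: PCI_SLOT() on such
a devfn yields a non-zero value even though the Device Number is implied to be 0.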



Thread overview: 23+ messages
2023-07-04 11:25 [PATCH v7 0/6] test and QEMU fixes to ensure proper PCIE device usage Ani Sinha
2023-07-04 11:25 ` [PATCH v7 1/6] tests/acpi: allow changes in DSDT.noacpihp table blob Ani Sinha
2023-07-04 11:25 ` [PATCH v7 2/6] tests/acpi/bios-tables-test: use the correct slot on the pcie-root-port Ani Sinha
2023-07-04 11:25 ` [PATCH v7 3/6] tests/acpi/bios-tables-test: update acpi blob q35/DSDT.noacpihp Ani Sinha
2023-07-04 11:25 ` [PATCH v7 4/6] tests/qtest/hd-geo-test: fix incorrect pcie-root-port usage and simplify test Ani Sinha
2023-07-04 11:25 ` [PATCH v7 5/6] hw/pci: ensure PCIE devices are plugged into only slot 0 of PCIE port Ani Sinha
2023-07-04 11:38   ` Ani Sinha
2023-07-04 11:54   ` Akihiko Odaki
2023-07-04 11:59     ` Ani Sinha
2023-07-04 12:02       ` Akihiko Odaki
2023-07-04 12:08         ` Ani Sinha
2023-07-04 12:09           ` Akihiko Odaki
2023-07-04 12:28             ` Ani Sinha
2023-07-04 12:48         ` Igor Mammedov
2023-07-04 13:50           ` Ani Sinha
2023-07-04 14:28             ` Igor Mammedov
2023-07-04 15:07               ` Ani Sinha
2023-07-05  1:39                 ` Akihiko Odaki
2023-07-05  5:43                   ` Ani Sinha
2023-07-05 10:42                     ` Akihiko Odaki [this message]
2023-07-04 11:25 ` [PATCH v7 6/6] hw/pci: add comment explaining the reason for checking function 0 in hotplug Ani Sinha
2023-07-04 12:15   ` Igor Mammedov
2023-07-04 12:31     ` Ani Sinha
