From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from eggs.gnu.org ([2001:4830:134:3::10]:55946) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZqgJE-0000as-3t for qemu-devel@nongnu.org; Mon, 26 Oct 2015 07:51:30 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZqgJA-0001Ed-LC for qemu-devel@nongnu.org; Mon, 26 Oct 2015 07:51:28 -0400 Received: from [59.151.112.132] (port=44875 helo=heian.cn.fujitsu.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZqgJ6-0001Aj-Tf for qemu-devel@nongnu.org; Mon, 26 Oct 2015 07:51:24 -0400 References: <1445830158-20721-1-git-send-email-caoj.fnst@cn.fujitsu.com> <1445830158-20721-3-git-send-email-caoj.fnst@cn.fujitsu.com> <20151026100131-mutt-send-email-mst@redhat.com> From: Cao jin Message-ID: <562E151C.1040208@cn.fujitsu.com> Date: Mon, 26 Oct 2015 19:57:16 +0800 MIME-Version: 1.0 In-Reply-To: <20151026100131-mutt-send-email-mst@redhat.com> Content-Type: multipart/mixed; boundary="------------080008090703040600000802" Subject: Re: [Qemu-devel] [PATCH v5 2/2] enable multi-function hot-add List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: "Michael S. Tsirkin" Cc: izumi.taku@jp.fujitsu.com, alex.williamson@redhat.com, qemu-devel@nongnu.org --------------080008090703040600000802 Content-Type: text/plain; charset="windows-1252"; format=flowed Content-Transfer-Encoding: 7bit Hi, warning, there is a long story below O:-) On 10/26/2015 04:29 PM, Michael S. Tsirkin wrote: > On Mon, Oct 26, 2015 at 11:29:18AM +0800, Cao jin wrote: >> Enable pcie device multifunction hot, just ensure the function 0 >> added last, then driver will got the notification to scan all the >> function in the slot. >> >> Signed-off-by: Cao jin >> --- >> hw/pci/pci.c | 31 ++++++++++++++++++++++++++++++- >> hw/pci/pci_host.c | 13 +++++++++++-- >> hw/pci/pcie.c | 18 +++++++++--------- >> include/hw/pci/pci.h | 1 + >> 4 files changed, 51 insertions(+), 12 deletions(-) >> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >> index b0bf540..6f43b12 100644 >> --- a/hw/pci/pci.c >> +++ b/hw/pci/pci.c >> @@ -847,6 +847,9 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus, >> PCIConfigWriteFunc *config_write = pc->config_write; >> Error *local_err = NULL; >> AddressSpace *dma_as; >> + DeviceState *dev = DEVICE(pci_dev); >> + >> + pci_dev->bus = bus; >> >> if (devfn < 0) { >> for(devfn = bus->devfn_min ; devfn < ARRAY_SIZE(bus->devices); >> @@ -864,9 +867,17 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, PCIBus *bus, >> PCI_SLOT(devfn), PCI_FUNC(devfn), name, >> bus->devices[devfn]->name); >> return NULL; >> + } else if (dev->hotplugged && >> + pci_get_function_0(pci_dev)) { >> + error_setg(errp, "PCI: slot %d function 0 already ocuppied by %s," >> + " new func %s cannot be exposed to guest.", >> + PCI_SLOT(devfn), >> + bus->devices[PCI_DEVFN(PCI_SLOT(devfn), 0)]->name, >> + name); >> + >> + return NULL; >> } >> >> - pci_dev->bus = bus; >> pci_dev->devfn = devfn; >> dma_as = pci_device_iommu_address_space(pci_dev); >> >> @@ -2454,6 +2465,24 @@ void pci_bus_get_w64_range(PCIBus *bus, Range *range) >> pci_for_each_device_under_bus(bus, pci_dev_get_w64, range); >> } >> >> +/* ARI device function number range is 0-255, means has only 1 function0; >> + * while non-ARI device has 1 function0 in each slot. non-ARI device could >> + * be PCI or PCIe, and there is up to 32 slots for PCI */ >> +PCIDevice *pci_get_function_0(PCIDevice *pci_dev) >> +{ >> + PCIDevice *parent_dev; >> + >> + parent_dev = pci_bridge_get_device(pci_dev->bus); >> + if (pcie_cap_is_arifwd_enabled(parent_dev) && >> + pci_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI)) { >> + /* ARI enabled */ >> + return pci_dev->bus->devices[0]; > > That's wrong I think since software might enable ARI after hotplug. According to the spec, ARI feature is enabled only based on the following 2 conditions: 1. For an ARI Downstream Port, the capability is communicated through the Device Capabilities 2 register. 2. For an ARI Device, the capability is communicated through the ARI Capability structure. also according to the driver code pci_configure_ari(), I think my implementation does follows spec? And as you know, only ARI feature is enabled, we return pci_dev->bus->devices[0] > And I'm not sure all functions must have the ARI capability. > do you means there maybe the following condition: ARI forwarding bit is enabled in downstream port, but the functions below, some have ARI Capability structure while the others don`t. Shouldn`t we forbid the condition? Because If this condition happens, I am not sure whether the device could work normally. In the IMPLEMENTATION NOTE: ARI Forwarding Enable Being Set Inappropriately, It seems the spec don`t want that complicated condition? While actually, qemu calls pcie_cap_arifwd_init() in root port/downstream port initialization first. > I don't see why don't you just go by spec: > I do read the spec, and also referred the pcie driver code(see pci_configure_ari()). I think it is my inaccurate understanding about "upstream port" results in my implementation(I also consult PCISIG support team for this question, see attachment). The concept of "upstream port" in ARI device definition confuse me a lot. Talking about the definition of ARI device, I always thought the "upstream port" should on the ARI device itself(like the figure I drew before), or else why the definition add the words "with an upstream port"? It seems to me that the emphasis is on "with an upstream port", and implies to me that the non-ARI device doesn`t have an upstream port. Seeing their replies in attachment, says that I am wrong at this point, both of them have an upstream port.(their saying actually make me more confused at that time, but now I think I am clear about this concept after reading your implementation below) After reading your implementation, and the PCISIG support team replies again, finally I figure out that, "upstream port" in ARI device definition is just a port whose position is closer to root complex, and the point is, the "upstream port" doesn`t need to exist on the ARI device itself, which also means, take Figure 6-13 in PCIe spec 3.1, the Root Port A is the "upstream port" for ARI Device X, and the Downstream Port D in Switch is the "upstream port" for ARI Device Y. Now, do I understand it right? Hope I can get you understood from the long description above. If I still don`t understand it right, please point it out, after all, it confused me for a long time. > static > bool pcie_has_upstream_port(PCIDevice *dev) > { > PCIDevice *parent_dev = pci_bridge_get_device(pci_dev->bus); > > /* > * Device associated with an upstream port. > * As there are several types of these, it's easier to check the > * parent device: upstream ports are always connected to > * root or downstream ports. > */ > return parent_dev && > pci_is_express(parent_dev) && > parent_dev->exp.exp_cap && > (pcie_cap_get_type(parent_dev) == PCI_EXP_TYPE_ROOT_PORT || > pcie_cap_get_type(parent_dev) == PCI_EXP_TYPE_DOWNSTREAM); > } > Assume my understanding is right, which means both ARI and non-ARI device have the upstream port(root port or downstream port), could the existence of upstream port be the judgment condition? > > PCIDevice *pci_get_function_0(PCIDevice *pci_dev) > { > if (pcie_has_upstream_port(dev)) { > /* With an upstream PCIE port, we only support 1 device at slot 0 */ > return pci_dev->bus->devices[0]; > } else { > /* Other bus types might support multiple devices at slots 0-31 */ > return pci_dev->bus->devices[PCI_DEVFN(PCI_SLOT(pci_dev->devfn), 0)]; > } > } > >> + } else { >> + /* no ARI */ >> + return pci_dev->bus->devices[PCI_DEVFN(PCI_SLOT(pci_dev->devfn), 0)]; >> + } >> +} >> + >> static const TypeInfo pci_device_type_info = { >> .name = TYPE_PCI_DEVICE, >> .parent = TYPE_DEVICE, >> diff --git a/hw/pci/pci_host.c b/hw/pci/pci_host.c >> index 3e26f92..63d7d2f 100644 >> --- a/hw/pci/pci_host.c >> +++ b/hw/pci/pci_host.c >> @@ -20,6 +20,7 @@ >> >> #include "hw/pci/pci.h" >> #include "hw/pci/pci_host.h" >> +#include "hw/pci/pci_bus.h" >> #include "trace.h" >> >> /* debug PCI */ >> @@ -75,7 +76,11 @@ void pci_data_write(PCIBus *s, uint32_t addr, uint32_t val, int len) >> PCIDevice *pci_dev = pci_dev_find_by_addr(s, addr); >> uint32_t config_addr = addr & (PCI_CONFIG_SPACE_SIZE - 1); >> >> - if (!pci_dev) { >> + /* non-zero functions are only exposed when function 0 is present, >> + * allowing direct removal of unexposed functions. >> + */ >> + if (!pci_dev || >> + (pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev))) { >> return; >> } >> >> @@ -91,7 +96,11 @@ uint32_t pci_data_read(PCIBus *s, uint32_t addr, int len) >> uint32_t config_addr = addr & (PCI_CONFIG_SPACE_SIZE - 1); >> uint32_t val; >> >> - if (!pci_dev) { >> + /* non-zero functions are only exposed when function 0 is present, >> + * allowing direct removal of unexposed functions. >> + */ >> + if (!pci_dev || >> + (pci_dev->qdev.hotplugged && !pci_get_function_0(pci_dev))) { >> return ~0x0; >> } >> >> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c >> index b1adeaf..4ba9501 100644 >> --- a/hw/pci/pcie.c >> +++ b/hw/pci/pcie.c >> @@ -249,16 +249,16 @@ void pcie_cap_slot_hotplug_cb(HotplugHandler *hotplug_dev, DeviceState *dev, >> return; >> } >> >> - /* TODO: multifunction hot-plug. >> - * Right now, only a device of function = 0 is allowed to be >> - * hot plugged/unplugged. >> + /* To enable multifunction hot-plug, we just ensure the function >> + * 0 added last. When function 0 is added, we set the sltsta and >> + * inform OS via event notification. >> */ >> - assert(PCI_FUNC(pci_dev->devfn) == 0); >> - >> - pci_word_test_and_set_mask(exp_cap + PCI_EXP_SLTSTA, >> - PCI_EXP_SLTSTA_PDS); >> - pcie_cap_slot_event(PCI_DEVICE(hotplug_dev), >> - PCI_EXP_HP_EV_PDC | PCI_EXP_HP_EV_ABP); >> + if (pci_get_function_0(pci_dev)) { >> + pci_word_test_and_set_mask(exp_cap + PCI_EXP_SLTSTA, >> + PCI_EXP_SLTSTA_PDS); >> + pcie_cap_slot_event(PCI_DEVICE(hotplug_dev), >> + PCI_EXP_HP_EV_PDC | PCI_EXP_HP_EV_ABP); >> + } >> } >> >> static void pcie_unplug_device(PCIBus *bus, PCIDevice *dev, void *opaque) >> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h >> index f5e7fd8..379b6e1 100644 >> --- a/include/hw/pci/pci.h >> +++ b/include/hw/pci/pci.h >> @@ -397,6 +397,7 @@ void pci_for_each_bus_depth_first(PCIBus *bus, >> void *(*begin)(PCIBus *bus, void *parent_state), >> void (*end)(PCIBus *bus, void *state), >> void *parent_state); >> +PCIDevice *pci_get_function_0(PCIDevice *pci_dev); >> >> /* Use this wrapper when specific scan order is not required. */ >> static inline >> -- >> 2.1.0 > . > -- Yours Sincerely, Cao Jin --------------080008090703040600000802 Content-Type: message/rfc822; name="Re: [Tech Support] question about ARI Device definition.eml" Content-Transfer-Encoding: 7bit Content-Disposition: attachment; filename="Re: [Tech Support] question about ARI Device definition.eml" X-Account-Key: account1 X-Mozilla-Keys: Received: from edo.cn.fujitsu.com (10.167.33.5) by G08CNEXCHPEKD02.g08.fujitsu.local (10.167.33.89) with Microsoft SMTP Server id 14.3.181.6; Sat, 24 Oct 2015 02:52:57 +0800 Received: from heian.cn.fujitsu.com (localhost.localdomain [127.0.0.1]) by edo.cn.fujitsu.com (8.14.3/8.13.1) with ESMTP id t9NIqKZ7000375 for ; Sat, 24 Oct 2015 02:52:25 +0800 X-IronPort-AV: E=Sophos;i="5.15,520,1432569600"; d="scan'208,217";a="102209019" Received: from nm19-vm1.access.bullet.mail.gq1.yahoo.com ([216.39.63.17]) by heian.cn.fujitsu.com with ESMTP; 24 Oct 2015 02:55:13 +0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bellsouth.net; s=s2048; t=1445626371; bh=arQKBlmALBt528mq496k+ymrn7rxLq/SBhUWw8iwof4=; h=From:To:References:In-Reply-To:Subject:Date:From:Subject; b=SrvSyC9iXHs5GVkedQ+ha49Ga3RxI8Z5uhLe+zbitq5HcB8KtuCqVZYUoZ3dkg2IJM42a+iWv0e90Qk18S1jbUbhKwSbvX2SsdKgy32iDZTLzitsD76oXTT2rmZuCthfhcY4NlbpWbiMqxPCBy7RH/1mxK99tOetXjAi9/wvj1rYY8JQ7AQJfO5KDWxe2eEsmBuGAvkU21xBNDcNAObvNqAAi+xPPkKFqHSbzBTUv1tVycR7Y85jU87FAaFCiWyOGMtateug+d5FgdaEQEvNgfwqC4uInTS6MMJ+tLwyoek3kiaiLLXGaj3VzyNNpxve/31g6JsqoVptgU0RGYQgZw== Received: from [216.39.60.167] by nm19.access.bullet.mail.gq1.yahoo.com with NNFMP; 23 Oct 2015 18:52:51 -0000 Received: from [67.195.23.146] by tm3.access.bullet.mail.gq1.yahoo.com with NNFMP; 23 Oct 2015 18:52:51 -0000 Received: from [127.0.0.1] by smtp118.sbc.mail.gq1.yahoo.com with NNFMP; 23 Oct 2015 18:52:51 -0000 X-Yahoo-Newman-Id: 594588.61665.bm@smtp118.sbc.mail.gq1.yahoo.com X-Yahoo-Newman-Property: ymail-3 X-YMail-OSG: 3TMtA0QVM1l.kCmQHlBGrFsZXX.Z8X701RmsF7UJuD2BY9z c5TJ_CowlWgu.6ZQTwFVlNnzXZteLayBNZtVrJpLa.4oBC1c8zMHatlzCneG VyjVC3FAdT5Ixk2sZErLrbOMmuUnL3aHpZFWIwBe3dNt4thp8EepjFTiJ7HT vtvLUz2UCQ.tUUNWCh_ouTbuYmZ0k0lnWGLFOG6C9Oz8q30c_.5Mzn6HthoF ayTC3oIB_ssAXZC6gXj53cxEj0zQcI425XXkZV3Mp0I0uVk6RowViWt.UB1G MguIXhVi1Bd9fdAmAj4irNvyq64RPXDw.AHNuQ_LRUYNFD80F9.voskHnHmS FFCHV1DGKwBosaEcLLAac0W7vOpOg0v1TGZagNm0iUlvRqwEVkUOoUkhx7u8 3DsMqXThNb4XLpkIS5Y7nlNqJuVDM2dQ6QSVScQ605la1yV2yGrYYUekqmSR npPh5Mp84kl6hv_0fTCkX8foP9BAKpIB0v24v6QnhuPzTdkBjcG50CCXv4.v 4W9qyA8vuw60_RtiefdpMBUo6Q2zpE6dnp68fiMvqMRpXmQ-- X-Yahoo-SMTP: 2W8Z.8SswBC69ljk8iHLxrZ7pZ4jzBlYQZ8znCn3qAs- From: Dick K. To: 'Cao jin' References: <5629EE4D.7060806@cn.fujitsu.com> In-Reply-To: <5629EE4D.7060806@cn.fujitsu.com> Subject: RE: [Tech Support] question about ARI Device definition Date: Fri, 23 Oct 2015 14:52:53 -0400 Message-ID: <007901d10dc4$07446910$15cd3b30$@bellsouth.net> Content-Type: multipart/alternative; boundary="----=_NextPart_000_007A_01D10DA2.80344FB0" X-Mailer: Microsoft Outlook 14.0 Thread-Index: AQE0WVdaDLuBJDjdUM0DMYcJVmxJzZ+yzkNw Content-Language: en-us Return-Path: dckly@bellsouth.net X-MS-Exchange-Organization-AuthSource: G08CNEXCHPEKD02.g08.fujitsu.local X-MS-Exchange-Organization-AuthAs: Anonymous X-MS-Exchange-Organization-AVStamp-Mailbox: SMEXutTf;1199460;0;This mail has been scanned by Trend Micro ScanMail for Microsoft Exchange; X-MS-Exchange-Organization-SCL: 0 MIME-Version: 1.0 ------=_NextPart_000_007A_01D10DA2.80344FB0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Please see the response in your appended note. The response is based on = the PCIe 3.1 spec. Regards, Richard (Please send responses to this inquiry (and future technical support = inquiries) to pcitech@pcisig.com) Sending technical questions to administration@pcisig.com delays the = response to the question. ----------------------------------2242 613 From: Cao jin [mailto:caoj.fnst@cn.fujitsu.com]=20 Sent: Friday, October 23, 2015 4:23 AM To: pcitech@pcisig.com Subject: [Tech Support] question about ARI Device definition hello there, thanks for quick response to my last mail. I have a new question = here. the definition of ARI Device in spec: A Device associated with an = Upstream Port. If I understand right, ARI device has an upstream port on the device, = while non-ARI device don`t have.=20 ***********************2242 RESPONSE: An ARI Endpoint and a non-ARI = Endpoint both have an Upstream Port. But in figure 6-13 of PCIe spec 3.1, on ARI device X & Y, don`t see any = notation of upstream port on it, should they have a notation of upstream = port? ***********************613 RESPONSE: it should be obvious that a device = connected to a Root Port or switch Downstream Port must be a device with = an Upstream Port. and also, If I understand right, ARI device has a upstream port = associated, so the main functionality of upstream port is routing msg to = the functions immediately below the port? ***********************2242 RESPONSE: The main feature of an ARI device = is a Function Number field of 8 bits.=20 If you have any additional questions, please let us know. =20 -- Yours Sincerely, Cao Jin ------=_NextPart_000_007A_01D10DA2.80344FB0 Content-Type: text/html; charset="utf-8" Content-Transfer-Encoding: quoted-printable RE: [Tech Support] question about ARI Device definition

Please see the response in your appended note.  The respo= nse is based on the PCIe 3.1 spec.

Regards,<= /span>

Richard

(Please send res= ponses to this inquiry (and future technical support inquiries) to pcitech@= pcisig.com)

Sending technica= l questions to administration@pcisig.com delays the response to the questio= n.

----------------= ------------------2242  613
From: Cao jin [mailto:caoj.fnst@cn.fujitsu.com]
Sent: Friday, October 23, 2015 4:23 AM
To: pcitech@pcisig.com
Subject: [Tech Support] question about ARI Device definition
<= span lang=3D"en-us">

hello there,

  &nbs= p;  thanks for quick response to my last mail. I have a new question h= ere.

  &nbs= p;  the definition of ARI Device in spec: A Device associated with an = Upstream Port.

If I understand = right, ARI device has an upstream port on the device, while non-ARI device = don`t have.

<= font color=3D"#0070C0" face=3D"Calibri">***********************<= /span>2242<= b> RESPONSE:  An ARI Endpoint and a = non-ARI Endpoint both have an Upstream Port.

But in figure 6-= 13 of PCIe spec 3.1, on ARI device X & Y, don`t see any notation of ups= tream port on it, should they have a notation of upstream port?

<= font color=3D"#0070C0" face=3D"Calibri">***********************<= /span>613 RESPONSE:  it should be obvious that a device connected to a Root Port or switch= Downstream Port must be a device with an Upstream Port.

and also, If I u= nderstand right, ARI device has a upstream port associated, so the main fun= ctionality of upstream port is routing msg to the functions immediately bel= ow the port?

<= font color=3D"#0070C0" face=3D"Calibri">***********************<= /span>2242<= b> RESPONSE:  The main feature of an ARI device is a Function Number field of 8 bits.

If you have any a= dditional questions, please let us know.

 

--=

Yours Sincerely,=

Cao Jin

= ------=_NextPart_000_007A_01D10DA2.80344FB0-- --------------080008090703040600000802--