* [PATCH 1/4] docs: Fix next function numbers in SR/IOV documentation
2023-07-01 7:01 [PATCH 0/4] pci: Compare function number and ARI next function number Akihiko Odaki
@ 2023-07-01 7:01 ` Akihiko Odaki
2023-07-01 14:31 ` Ani Sinha
2023-07-01 7:01 ` [PATCH 2/4] hw/nvme: Fix ARI next function numbers Akihiko Odaki
` (3 subsequent siblings)
4 siblings, 1 reply; 17+ messages in thread
From: Akihiko Odaki @ 2023-07-01 7:01 UTC (permalink / raw)
Cc: qemu-devel, qemu-block, Ani Sinha, Michael S . Tsirkin,
Marcel Apfelbaum, Sriram Yagnaraman, Jason Wang, Keith Busch,
Klaus Jensen, Akihiko Odaki
The next function numbers are expected to form a linked list ending with
0.
Fixes: 2503461691 ("pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
docs/pcie_sriov.txt | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/docs/pcie_sriov.txt b/docs/pcie_sriov.txt
index 7eff7f2703..cc4232e49a 100644
--- a/docs/pcie_sriov.txt
+++ b/docs/pcie_sriov.txt
@@ -48,7 +48,7 @@ setting up a BAR for a VF.
...
int ret = pcie_endpoint_cap_init(d, 0x70);
...
- pcie_ari_init(d, 0x100, 1);
+ pcie_ari_init(d, 0x100, fun_offset);
...
/* Add and initialize the SR/IOV capability */
@@ -76,9 +76,10 @@ setting up a BAR for a VF.
pci_your_vf_dev_realize( ... )
{
...
+ uint16_t nextvfn = pcie_sriov_vf_number(dev) + 1;
int ret = pcie_endpoint_cap_init(d, 0x60);
...
- pcie_ari_init(d, 0x100, 1);
+ pcie_ari_init(d, 0x100, nextvfn < total_vfs ? fun_offset + nextvfn * stride : 0);
...
memory_region_init(mr, ... )
pcie_sriov_vf_register_bar(d, bar_nr, mr);
--
2.41.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 1/4] docs: Fix next function numbers in SR/IOV documentation
2023-07-01 7:01 ` [PATCH 1/4] docs: Fix next function numbers in SR/IOV documentation Akihiko Odaki
@ 2023-07-01 14:31 ` Ani Sinha
2023-07-02 3:51 ` Akihiko Odaki
0 siblings, 1 reply; 17+ messages in thread
From: Ani Sinha @ 2023-07-01 14:31 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, qemu-block, Michael S . Tsirkin, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
> On 01-Jul-2023, at 12:31 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>
> The next function numbers are expected to form a linked list ending with
> 0.
>
> Fixes: 2503461691 ("pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
> docs/pcie_sriov.txt | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/docs/pcie_sriov.txt b/docs/pcie_sriov.txt
> index 7eff7f2703..cc4232e49a 100644
> --- a/docs/pcie_sriov.txt
> +++ b/docs/pcie_sriov.txt
> @@ -48,7 +48,7 @@ setting up a BAR for a VF.
> ...
> int ret = pcie_endpoint_cap_init(d, 0x70);
> ...
> - pcie_ari_init(d, 0x100, 1);
> + pcie_ari_init(d, 0x100, fun_offset);
> ...
>
> /* Add and initialize the SR/IOV capability */
> @@ -76,9 +76,10 @@ setting up a BAR for a VF.
> pci_your_vf_dev_realize( ... )
> {
> ...
> + uint16_t nextvfn = pcie_sriov_vf_number(dev) + 1;
> int ret = pcie_endpoint_cap_init(d, 0x60);
> ...
> - pcie_ari_init(d, 0x100, 1);
> + pcie_ari_init(d, 0x100, nextvfn < total_vfs ? fun_offset + nextvfn * stride : 0);
^^^^
I think this will be fun_offset and not just 0
Same with the other patches ..
> ...
> memory_region_init(mr, ... )
> pcie_sriov_vf_register_bar(d, bar_nr, mr);
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 1/4] docs: Fix next function numbers in SR/IOV documentation
2023-07-01 14:31 ` Ani Sinha
@ 2023-07-02 3:51 ` Akihiko Odaki
0 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-07-02 3:51 UTC (permalink / raw)
To: Ani Sinha
Cc: qemu-devel, qemu-block, Michael S . Tsirkin, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
On 2023/07/01 23:31, Ani Sinha wrote:
>
>
>> On 01-Jul-2023, at 12:31 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>>
>> The next function numbers are expected to form a linked list ending with
>> 0.
>>
>> Fixes: 2503461691 ("pcie: Add some SR/IOV API documentation in docs/pcie_sriov.txt")
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> ---
>> docs/pcie_sriov.txt | 5 +++--
>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/docs/pcie_sriov.txt b/docs/pcie_sriov.txt
>> index 7eff7f2703..cc4232e49a 100644
>> --- a/docs/pcie_sriov.txt
>> +++ b/docs/pcie_sriov.txt
>> @@ -48,7 +48,7 @@ setting up a BAR for a VF.
>> ...
>> int ret = pcie_endpoint_cap_init(d, 0x70);
>> ...
>> - pcie_ari_init(d, 0x100, 1);
>> + pcie_ari_init(d, 0x100, fun_offset);
>> ...
>>
>> /* Add and initialize the SR/IOV capability */
>> @@ -76,9 +76,10 @@ setting up a BAR for a VF.
>> pci_your_vf_dev_realize( ... )
>> {
>> ...
>> + uint16_t nextvfn = pcie_sriov_vf_number(dev) + 1;
>> int ret = pcie_endpoint_cap_init(d, 0x60);
>> ...
>> - pcie_ari_init(d, 0x100, 1);
>> + pcie_ari_init(d, 0x100, nextvfn < total_vfs ? fun_offset + nextvfn * stride : 0);
> ^^^^
> I think this will be fun_offset and not just 0
> Same with the other patches ..
It is intended to point to the PF. fun_offset points to the first VF.
>
>> ...
>> memory_region_init(mr, ... )
>> pcie_sriov_vf_register_bar(d, bar_nr, mr);
>> --
>> 2.41.0
>>
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 2/4] hw/nvme: Fix ARI next function numbers
2023-07-01 7:01 [PATCH 0/4] pci: Compare function number and ARI next function number Akihiko Odaki
2023-07-01 7:01 ` [PATCH 1/4] docs: Fix next function numbers in SR/IOV documentation Akihiko Odaki
@ 2023-07-01 7:01 ` Akihiko Odaki
2023-07-01 7:01 ` [PATCH 3/4] igb: " Akihiko Odaki
` (2 subsequent siblings)
4 siblings, 0 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-07-01 7:01 UTC (permalink / raw)
Cc: qemu-devel, qemu-block, Ani Sinha, Michael S . Tsirkin,
Marcel Apfelbaum, Sriram Yagnaraman, Jason Wang, Keith Busch,
Klaus Jensen, Akihiko Odaki
The next function numbers are expected to form a linked list ending with
0.
Fixes: 44c2c09488 ("hw/nvme: Add support for SR-IOV")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
hw/nvme/ctrl.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index fd917fcda1..12500dc80b 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -8088,7 +8088,12 @@ static bool nvme_init_pci(NvmeCtrl *n, PCIDevice *pci_dev, Error **errp)
pcie_endpoint_cap_init(pci_dev, 0x80);
pcie_cap_flr_init(pci_dev);
if (n->params.sriov_max_vfs) {
- pcie_ari_init(pci_dev, 0x100, 1);
+ uint16_t nextvfn = pci_is_vf(pci_dev) ?
+ pcie_sriov_vf_number(pci_dev) + 1 : 0;
+ uint16_t nextfn = nextvfn < n->params.sriov_max_vfs ?
+ NVME_VF_OFFSET + nextvfn * NVME_VF_STRIDE : 0;
+
+ pcie_ari_init(pci_dev, 0x100, nextfn);
}
/* add one to max_ioqpairs to account for the admin queue pair */
--
2.41.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* [PATCH 3/4] igb: Fix ARI next function numbers
2023-07-01 7:01 [PATCH 0/4] pci: Compare function number and ARI next function number Akihiko Odaki
2023-07-01 7:01 ` [PATCH 1/4] docs: Fix next function numbers in SR/IOV documentation Akihiko Odaki
2023-07-01 7:01 ` [PATCH 2/4] hw/nvme: Fix ARI next function numbers Akihiko Odaki
@ 2023-07-01 7:01 ` Akihiko Odaki
2023-07-02 5:05 ` Michael S. Tsirkin
2023-07-01 7:01 ` [PATCH 4/4] pci: Compare function number and ARI next function number Akihiko Odaki
2023-07-02 5:10 ` [PATCH 0/4] " Michael S. Tsirkin
4 siblings, 1 reply; 17+ messages in thread
From: Akihiko Odaki @ 2023-07-01 7:01 UTC (permalink / raw)
Cc: qemu-devel, qemu-block, Ani Sinha, Michael S . Tsirkin,
Marcel Apfelbaum, Sriram Yagnaraman, Jason Wang, Keith Busch,
Klaus Jensen, Akihiko Odaki
The next function numbers are expected to form a linked list ending with
0.
Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
hw/net/igb_core.h | 3 +++
hw/net/igb.c | 4 +---
hw/net/igbvf.c | 5 ++++-
3 files changed, 8 insertions(+), 4 deletions(-)
diff --git a/hw/net/igb_core.h b/hw/net/igb_core.h
index 9cbbfd516b..e1dab76995 100644
--- a/hw/net/igb_core.h
+++ b/hw/net/igb_core.h
@@ -49,6 +49,9 @@
#define IGB_NUM_QUEUES (16)
#define IGB_NUM_VM_POOLS (8)
+#define IGB_VF_OFFSET (0x80)
+#define IGB_VF_STRIDE (2)
+
typedef struct IGBCore IGBCore;
enum { PHY_R = BIT(0),
diff --git a/hw/net/igb.c b/hw/net/igb.c
index 1c989d7677..543ca0114a 100644
--- a/hw/net/igb.c
+++ b/hw/net/igb.c
@@ -81,8 +81,6 @@ struct IGBState {
};
#define IGB_CAP_SRIOV_OFFSET (0x160)
-#define IGB_VF_OFFSET (0x80)
-#define IGB_VF_STRIDE (2)
#define E1000E_MMIO_IDX 0
#define E1000E_FLASH_IDX 1
@@ -431,7 +429,7 @@ static void igb_pci_realize(PCIDevice *pci_dev, Error **errp)
hw_error("Failed to initialize AER capability");
}
- pcie_ari_init(pci_dev, 0x150, 1);
+ pcie_ari_init(pci_dev, 0x150, IGB_VF_OFFSET);
pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, TYPE_IGBVF,
IGB_82576_VF_DEV_ID, IGB_MAX_VF_FUNCTIONS, IGB_MAX_VF_FUNCTIONS,
diff --git a/hw/net/igbvf.c b/hw/net/igbvf.c
index 284ea61184..bf2f237ab5 100644
--- a/hw/net/igbvf.c
+++ b/hw/net/igbvf.c
@@ -240,6 +240,9 @@ static const MemoryRegionOps mmio_ops = {
static void igbvf_pci_realize(PCIDevice *dev, Error **errp)
{
IgbVfState *s = IGBVF(dev);
+ uint16_t nextvfn = pcie_sriov_vf_number(dev) + 1;
+ uint16_t nextfn = nextvfn < IGB_MAX_VF_FUNCTIONS ?
+ IGB_VF_OFFSET + nextvfn * IGB_VF_STRIDE : 0;
int ret;
int i;
@@ -270,7 +273,7 @@ static void igbvf_pci_realize(PCIDevice *dev, Error **errp)
hw_error("Failed to initialize AER capability");
}
- pcie_ari_init(dev, 0x150, 1);
+ pcie_ari_init(dev, 0x150, nextfn);
}
static void igbvf_pci_uninit(PCIDevice *dev)
--
2.41.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 3/4] igb: Fix ARI next function numbers
2023-07-01 7:01 ` [PATCH 3/4] igb: " Akihiko Odaki
@ 2023-07-02 5:05 ` Michael S. Tsirkin
2023-07-02 8:38 ` Akihiko Odaki
0 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2023-07-02 5:05 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, qemu-block, Ani Sinha, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
On Sat, Jul 01, 2023 at 04:01:21PM +0900, Akihiko Odaki wrote:
> The next function numbers are expected to form a linked list ending with
> 0.
>
> Fixes: 3a977deebe ("Intrdocue igb device emulation")
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
> hw/net/igb_core.h | 3 +++
> hw/net/igb.c | 4 +---
> hw/net/igbvf.c | 5 ++++-
> 3 files changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/hw/net/igb_core.h b/hw/net/igb_core.h
> index 9cbbfd516b..e1dab76995 100644
> --- a/hw/net/igb_core.h
> +++ b/hw/net/igb_core.h
> @@ -49,6 +49,9 @@
> #define IGB_NUM_QUEUES (16)
> #define IGB_NUM_VM_POOLS (8)
>
> +#define IGB_VF_OFFSET (0x80)
> +#define IGB_VF_STRIDE (2)
> +
> typedef struct IGBCore IGBCore;
>
> enum { PHY_R = BIT(0),
> diff --git a/hw/net/igb.c b/hw/net/igb.c
> index 1c989d7677..543ca0114a 100644
> --- a/hw/net/igb.c
> +++ b/hw/net/igb.c
> @@ -81,8 +81,6 @@ struct IGBState {
> };
>
> #define IGB_CAP_SRIOV_OFFSET (0x160)
> -#define IGB_VF_OFFSET (0x80)
> -#define IGB_VF_STRIDE (2)
>
> #define E1000E_MMIO_IDX 0
> #define E1000E_FLASH_IDX 1
> @@ -431,7 +429,7 @@ static void igb_pci_realize(PCIDevice *pci_dev, Error **errp)
> hw_error("Failed to initialize AER capability");
> }
>
> - pcie_ari_init(pci_dev, 0x150, 1);
> + pcie_ari_init(pci_dev, 0x150, IGB_VF_OFFSET);
>
> pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, TYPE_IGBVF,
> IGB_82576_VF_DEV_ID, IGB_MAX_VF_FUNCTIONS, IGB_MAX_VF_FUNCTIONS,
I think this change would break migrations from 8.0. No?
More importantly your commit log says linked list should end
with 0, but you make it point at a VF instead.
> diff --git a/hw/net/igbvf.c b/hw/net/igbvf.c
> index 284ea61184..bf2f237ab5 100644
> --- a/hw/net/igbvf.c
> +++ b/hw/net/igbvf.c
> @@ -240,6 +240,9 @@ static const MemoryRegionOps mmio_ops = {
> static void igbvf_pci_realize(PCIDevice *dev, Error **errp)
> {
> IgbVfState *s = IGBVF(dev);
> + uint16_t nextvfn = pcie_sriov_vf_number(dev) + 1;
> + uint16_t nextfn = nextvfn < IGB_MAX_VF_FUNCTIONS ?
> + IGB_VF_OFFSET + nextvfn * IGB_VF_STRIDE : 0;
> int ret;
> int i;
>
> @@ -270,7 +273,7 @@ static void igbvf_pci_realize(PCIDevice *dev, Error **errp)
> hw_error("Failed to initialize AER capability");
> }
>
> - pcie_ari_init(dev, 0x150, 1);
> + pcie_ari_init(dev, 0x150, nextfn);
For this one I don't see why it matters at all:
The presence of Shadow Functions does not affect this field.
For VFs, this field is undefined since VFs are located using First VF Offset (see § Section 9.3.3.9 ) and VF
Stride (see § Section 9.3.3.10 ).
> }
>
> static void igbvf_pci_uninit(PCIDevice *dev)
> --
> 2.41.0
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/4] igb: Fix ARI next function numbers
2023-07-02 5:05 ` Michael S. Tsirkin
@ 2023-07-02 8:38 ` Akihiko Odaki
2023-07-02 8:40 ` Michael S. Tsirkin
0 siblings, 1 reply; 17+ messages in thread
From: Akihiko Odaki @ 2023-07-02 8:38 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: qemu-devel, qemu-block, Ani Sinha, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
On 2023/07/02 14:05, Michael S. Tsirkin wrote:
> On Sat, Jul 01, 2023 at 04:01:21PM +0900, Akihiko Odaki wrote:
>> The next function numbers are expected to form a linked list ending with
>> 0.
>>
>> Fixes: 3a977deebe ("Intrdocue igb device emulation")
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>> ---
>> hw/net/igb_core.h | 3 +++
>> hw/net/igb.c | 4 +---
>> hw/net/igbvf.c | 5 ++++-
>> 3 files changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/hw/net/igb_core.h b/hw/net/igb_core.h
>> index 9cbbfd516b..e1dab76995 100644
>> --- a/hw/net/igb_core.h
>> +++ b/hw/net/igb_core.h
>> @@ -49,6 +49,9 @@
>> #define IGB_NUM_QUEUES (16)
>> #define IGB_NUM_VM_POOLS (8)
>>
>> +#define IGB_VF_OFFSET (0x80)
>> +#define IGB_VF_STRIDE (2)
>> +
>> typedef struct IGBCore IGBCore;
>>
>> enum { PHY_R = BIT(0),
>> diff --git a/hw/net/igb.c b/hw/net/igb.c
>> index 1c989d7677..543ca0114a 100644
>> --- a/hw/net/igb.c
>> +++ b/hw/net/igb.c
>> @@ -81,8 +81,6 @@ struct IGBState {
>> };
>>
>> #define IGB_CAP_SRIOV_OFFSET (0x160)
>> -#define IGB_VF_OFFSET (0x80)
>> -#define IGB_VF_STRIDE (2)
>>
>> #define E1000E_MMIO_IDX 0
>> #define E1000E_FLASH_IDX 1
>> @@ -431,7 +429,7 @@ static void igb_pci_realize(PCIDevice *pci_dev, Error **errp)
>> hw_error("Failed to initialize AER capability");
>> }
>>
>> - pcie_ari_init(pci_dev, 0x150, 1);
>> + pcie_ari_init(pci_dev, 0x150, IGB_VF_OFFSET);
>>
>> pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, TYPE_IGBVF,
>> IGB_82576_VF_DEV_ID, IGB_MAX_VF_FUNCTIONS, IGB_MAX_VF_FUNCTIONS,
>
>
> I think this change would break migrations from 8.0. No?
Well, I don't have a reason to concern more for this change than other
bug fixes with behavioral changes observable from the guest.
>
>
> More importantly your commit log says linked list should end
> with 0, but you make it point at a VF instead.
>
>
>> diff --git a/hw/net/igbvf.c b/hw/net/igbvf.c
>> index 284ea61184..bf2f237ab5 100644
>> --- a/hw/net/igbvf.c
>> +++ b/hw/net/igbvf.c
>> @@ -240,6 +240,9 @@ static const MemoryRegionOps mmio_ops = {
>> static void igbvf_pci_realize(PCIDevice *dev, Error **errp)
>> {
>> IgbVfState *s = IGBVF(dev);
>> + uint16_t nextvfn = pcie_sriov_vf_number(dev) + 1;
>> + uint16_t nextfn = nextvfn < IGB_MAX_VF_FUNCTIONS ?
>> + IGB_VF_OFFSET + nextvfn * IGB_VF_STRIDE : 0;
>> int ret;
>> int i;
>>
>> @@ -270,7 +273,7 @@ static void igbvf_pci_realize(PCIDevice *dev, Error **errp)
>> hw_error("Failed to initialize AER capability");
>> }
>>
>> - pcie_ari_init(dev, 0x150, 1);
>> + pcie_ari_init(dev, 0x150, nextfn);
>
>
>
> For this one I don't see why it matters at all:
>
> The presence of Shadow Functions does not affect this field.
> For VFs, this field is undefined since VFs are located using First VF Offset (see § Section 9.3.3.9 ) and VF
> Stride (see § Section 9.3.3.10 ).
I missed the statements saying the field is undefined for VFs. I posted
an alternative series ("[PATCH 0/3] pci: Fix ARI next function numbers")
so please review it.
>
>
>
>
>> }
>>
>> static void igbvf_pci_uninit(PCIDevice *dev)
>> --
>> 2.41.0
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 3/4] igb: Fix ARI next function numbers
2023-07-02 8:38 ` Akihiko Odaki
@ 2023-07-02 8:40 ` Michael S. Tsirkin
0 siblings, 0 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2023-07-02 8:40 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, qemu-block, Ani Sinha, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
On Sun, Jul 02, 2023 at 05:38:42PM +0900, Akihiko Odaki wrote:
> On 2023/07/02 14:05, Michael S. Tsirkin wrote:
> > On Sat, Jul 01, 2023 at 04:01:21PM +0900, Akihiko Odaki wrote:
> > > The next function numbers are expected to form a linked list ending with
> > > 0.
> > >
> > > Fixes: 3a977deebe ("Intrdocue igb device emulation")
> > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> > > ---
> > > hw/net/igb_core.h | 3 +++
> > > hw/net/igb.c | 4 +---
> > > hw/net/igbvf.c | 5 ++++-
> > > 3 files changed, 8 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/hw/net/igb_core.h b/hw/net/igb_core.h
> > > index 9cbbfd516b..e1dab76995 100644
> > > --- a/hw/net/igb_core.h
> > > +++ b/hw/net/igb_core.h
> > > @@ -49,6 +49,9 @@
> > > #define IGB_NUM_QUEUES (16)
> > > #define IGB_NUM_VM_POOLS (8)
> > > +#define IGB_VF_OFFSET (0x80)
> > > +#define IGB_VF_STRIDE (2)
> > > +
> > > typedef struct IGBCore IGBCore;
> > > enum { PHY_R = BIT(0),
> > > diff --git a/hw/net/igb.c b/hw/net/igb.c
> > > index 1c989d7677..543ca0114a 100644
> > > --- a/hw/net/igb.c
> > > +++ b/hw/net/igb.c
> > > @@ -81,8 +81,6 @@ struct IGBState {
> > > };
> > > #define IGB_CAP_SRIOV_OFFSET (0x160)
> > > -#define IGB_VF_OFFSET (0x80)
> > > -#define IGB_VF_STRIDE (2)
> > > #define E1000E_MMIO_IDX 0
> > > #define E1000E_FLASH_IDX 1
> > > @@ -431,7 +429,7 @@ static void igb_pci_realize(PCIDevice *pci_dev, Error **errp)
> > > hw_error("Failed to initialize AER capability");
> > > }
> > > - pcie_ari_init(pci_dev, 0x150, 1);
> > > + pcie_ari_init(pci_dev, 0x150, IGB_VF_OFFSET);
> > > pcie_sriov_pf_init(pci_dev, IGB_CAP_SRIOV_OFFSET, TYPE_IGBVF,
> > > IGB_82576_VF_DEV_ID, IGB_MAX_VF_FUNCTIONS, IGB_MAX_VF_FUNCTIONS,
> >
> >
> > I think this change would break migrations from 8.0. No?
>
> Well, I don't have a reason to concern more for this change than other bug
> fixes with behavioral changes observable from the guest.
Try it and see it fail.
> >
> >
> > More importantly your commit log says linked list should end
> > with 0, but you make it point at a VF instead.
> >
> >
> > > diff --git a/hw/net/igbvf.c b/hw/net/igbvf.c
> > > index 284ea61184..bf2f237ab5 100644
> > > --- a/hw/net/igbvf.c
> > > +++ b/hw/net/igbvf.c
> > > @@ -240,6 +240,9 @@ static const MemoryRegionOps mmio_ops = {
> > > static void igbvf_pci_realize(PCIDevice *dev, Error **errp)
> > > {
> > > IgbVfState *s = IGBVF(dev);
> > > + uint16_t nextvfn = pcie_sriov_vf_number(dev) + 1;
> > > + uint16_t nextfn = nextvfn < IGB_MAX_VF_FUNCTIONS ?
> > > + IGB_VF_OFFSET + nextvfn * IGB_VF_STRIDE : 0;
> > > int ret;
> > > int i;
> > > @@ -270,7 +273,7 @@ static void igbvf_pci_realize(PCIDevice *dev, Error **errp)
> > > hw_error("Failed to initialize AER capability");
> > > }
> > > - pcie_ari_init(dev, 0x150, 1);
> > > + pcie_ari_init(dev, 0x150, nextfn);
> >
> >
> >
> > For this one I don't see why it matters at all:
> >
> > The presence of Shadow Functions does not affect this field.
> > For VFs, this field is undefined since VFs are located using First VF Offset (see § Section 9.3.3.9 ) and VF
> > Stride (see § Section 9.3.3.10 ).
>
> I missed the statements saying the field is undefined for VFs. I posted an
> alternative series ("[PATCH 0/3] pci: Fix ARI next function numbers") so
> please review it.
>
> >
> >
> >
> >
> > > }
> > > static void igbvf_pci_uninit(PCIDevice *dev)
> > > --
> > > 2.41.0
> >
^ permalink raw reply [flat|nested] 17+ messages in thread
* [PATCH 4/4] pci: Compare function number and ARI next function number
2023-07-01 7:01 [PATCH 0/4] pci: Compare function number and ARI next function number Akihiko Odaki
` (2 preceding siblings ...)
2023-07-01 7:01 ` [PATCH 3/4] igb: " Akihiko Odaki
@ 2023-07-01 7:01 ` Akihiko Odaki
2023-07-02 4:58 ` Michael S. Tsirkin
2023-07-11 7:10 ` Ani Sinha
2023-07-02 5:10 ` [PATCH 0/4] " Michael S. Tsirkin
4 siblings, 2 replies; 17+ messages in thread
From: Akihiko Odaki @ 2023-07-01 7:01 UTC (permalink / raw)
Cc: qemu-devel, qemu-block, Ani Sinha, Michael S . Tsirkin,
Marcel Apfelbaum, Sriram Yagnaraman, Jason Wang, Keith Busch,
Klaus Jensen, Akihiko Odaki
The function number must be lower than the next function number
advertised with ARI.
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
---
hw/pci/pci.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index e2eb4c3b4a..568665ee42 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -2059,6 +2059,8 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
Error *local_err = NULL;
bool is_default_rom;
uint16_t class_id;
+ uint16_t ari;
+ uint16_t nextfn;
/*
* capped by systemd (see: udev-builtin-net_id.c)
@@ -2121,6 +2123,19 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
}
}
+ if (pci_is_express(pci_dev)) {
+ ari = pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI);
+ if (ari) {
+ nextfn = (pci_get_long(pci_dev->config + ari + PCI_ARI_CAP) >> 8) & 0xff;
+ if (nextfn && (pci_dev->devfn & 0xff) >= nextfn) {
+ error_setg(errp, "PCI: function number %u is not lower than ARI next function number %u",
+ pci_dev->devfn & 0xff, nextfn);
+ pci_qdev_unrealize(DEVICE(pci_dev));
+ return;
+ }
+ }
+ }
+
if (pci_dev->failover_pair_id) {
if (!pci_bus_is_express(pci_get_bus(pci_dev))) {
error_setg(errp, "failover primary device must be on "
--
2.41.0
^ permalink raw reply related [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] pci: Compare function number and ARI next function number
2023-07-01 7:01 ` [PATCH 4/4] pci: Compare function number and ARI next function number Akihiko Odaki
@ 2023-07-02 4:58 ` Michael S. Tsirkin
2023-07-02 8:46 ` Akihiko Odaki
2023-07-11 7:10 ` Ani Sinha
1 sibling, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2023-07-02 4:58 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, qemu-block, Ani Sinha, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
On Sat, Jul 01, 2023 at 04:01:22PM +0900, Akihiko Odaki wrote:
> The function number must be lower than the next function number
> advertised with ARI.
>
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
I don't get this logic at all - where is the limitation coming from?
All I see in the spec is:
Next Function Number - With non-VFs, this field indicates the Function Number of the next higher
numbered Function in the Device, or 00h if there are no higher numbered Functions. Function 0 starts
this linked list of Functions.
The presence of Shadow Functions does not affect this field.
For VFs, this field is undefined since VFs are located using First VF Offset (see § Section 9.3.3.9 ) and VF
Stride (see § Section 9.3.3.10 ).
and
To improve the enumeration performance and create a more deterministic solution, software can
enumerate Functions through a linked list of Function Numbers. The next linked list element is
communicated through each Function’s ARI Capability Register.
i. Function 0 acts as the head of a linked list of Function Numbers. Software detects a
non-Zero Next Function Number field within the ARI Capability Register as the next
Function within the linked list. Software issues a configuration probe using the Bus Number
captured by the Device and the Function Number derived from the ARI Capability Register
to locate the next associated Function’s configuration space.
ii. Function Numbers may be sparse and non-sequential in their consumption by an ARI
Device.
> ---
> hw/pci/pci.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index e2eb4c3b4a..568665ee42 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2059,6 +2059,8 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> Error *local_err = NULL;
> bool is_default_rom;
> uint16_t class_id;
> + uint16_t ari;
> + uint16_t nextfn;
>
> /*
> * capped by systemd (see: udev-builtin-net_id.c)
> @@ -2121,6 +2123,19 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> }
> }
>
> + if (pci_is_express(pci_dev)) {
> + ari = pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI);
> + if (ari) {
> + nextfn = (pci_get_long(pci_dev->config + ari + PCI_ARI_CAP) >> 8) & 0xff;
> + if (nextfn && (pci_dev->devfn & 0xff) >= nextfn) {
> + error_setg(errp, "PCI: function number %u is not lower than ARI next function number %u",
> + pci_dev->devfn & 0xff, nextfn);
> + pci_qdev_unrealize(DEVICE(pci_dev));
> + return;
> + }
> + }
> + }
> +
> if (pci_dev->failover_pair_id) {
> if (!pci_bus_is_express(pci_get_bus(pci_dev))) {
> error_setg(errp, "failover primary device must be on "
> --
> 2.41.0
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] pci: Compare function number and ARI next function number
2023-07-02 4:58 ` Michael S. Tsirkin
@ 2023-07-02 8:46 ` Akihiko Odaki
2023-07-02 8:55 ` Michael S. Tsirkin
0 siblings, 1 reply; 17+ messages in thread
From: Akihiko Odaki @ 2023-07-02 8:46 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: qemu-devel, qemu-block, Ani Sinha, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
On 2023/07/02 13:58, Michael S. Tsirkin wrote:
> On Sat, Jul 01, 2023 at 04:01:22PM +0900, Akihiko Odaki wrote:
>> The function number must be lower than the next function number
>> advertised with ARI.
>>
>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>
> I don't get this logic at all - where is the limitation coming from?
>
> All I see in the spec is:
> Next Function Number - With non-VFs, this field indicates the Function Number of the next higher
> numbered Function in the Device, or 00h if there are no higher numbered Functions. Function 0 starts
> this linked list of Functions.
> The presence of Shadow Functions does not affect this field.
> For VFs, this field is undefined since VFs are located using First VF Offset (see § Section 9.3.3.9 ) and VF
> Stride (see § Section 9.3.3.10 ).
>
> and
>
> To improve the enumeration performance and create a more deterministic solution, software can
> enumerate Functions through a linked list of Function Numbers. The next linked list element is
> communicated through each Function’s ARI Capability Register.
> i. Function 0 acts as the head of a linked list of Function Numbers. Software detects a
> non-Zero Next Function Number field within the ARI Capability Register as the next
> Function within the linked list. Software issues a configuration probe using the Bus Number
> captured by the Device and the Function Number derived from the ARI Capability Register
> to locate the next associated Function’s configuration space.
> ii. Function Numbers may be sparse and non-sequential in their consumption by an ARI
> Device.
The statement "With non-VFs, this field indicates the Function Number of
the next higher numbered Function in the Device, or 00h if there are no
higher numbered Functions." implies the Function Number of the device
should be lower than the value advertised by the field (for non-VFs;
this patch does not check if it's VF or not.)
>
>
>
>
>
>> ---
>> hw/pci/pci.c | 15 +++++++++++++++
>> 1 file changed, 15 insertions(+)
>>
>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>> index e2eb4c3b4a..568665ee42 100644
>> --- a/hw/pci/pci.c
>> +++ b/hw/pci/pci.c
>> @@ -2059,6 +2059,8 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
>> Error *local_err = NULL;
>> bool is_default_rom;
>> uint16_t class_id;
>> + uint16_t ari;
>> + uint16_t nextfn;
>>
>> /*
>> * capped by systemd (see: udev-builtin-net_id.c)
>> @@ -2121,6 +2123,19 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
>> }
>> }
>>
>> + if (pci_is_express(pci_dev)) {
>> + ari = pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI);
>> + if (ari) {
>> + nextfn = (pci_get_long(pci_dev->config + ari + PCI_ARI_CAP) >> 8) & 0xff;
>> + if (nextfn && (pci_dev->devfn & 0xff) >= nextfn) {
>> + error_setg(errp, "PCI: function number %u is not lower than ARI next function number %u",
>> + pci_dev->devfn & 0xff, nextfn);
>> + pci_qdev_unrealize(DEVICE(pci_dev));
>> + return;
>> + }
>> + }
>> + }
>> +
>> if (pci_dev->failover_pair_id) {
>> if (!pci_bus_is_express(pci_get_bus(pci_dev))) {
>> error_setg(errp, "failover primary device must be on "
>> --
>> 2.41.0
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] pci: Compare function number and ARI next function number
2023-07-02 8:46 ` Akihiko Odaki
@ 2023-07-02 8:55 ` Michael S. Tsirkin
2023-07-02 8:57 ` Michael S. Tsirkin
0 siblings, 1 reply; 17+ messages in thread
From: Michael S. Tsirkin @ 2023-07-02 8:55 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, qemu-block, Ani Sinha, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
On Sun, Jul 02, 2023 at 05:46:38PM +0900, Akihiko Odaki wrote:
> On 2023/07/02 13:58, Michael S. Tsirkin wrote:
> > On Sat, Jul 01, 2023 at 04:01:22PM +0900, Akihiko Odaki wrote:
> > > The function number must be lower than the next function number
> > > advertised with ARI.
> > >
> > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> >
> > I don't get this logic at all - where is the limitation coming from?
> >
> > All I see in the spec is:
> > Next Function Number - With non-VFs, this field indicates the Function Number of the next higher
> > numbered Function in the Device, or 00h if there are no higher numbered Functions. Function 0 starts
> > this linked list of Functions.
> > The presence of Shadow Functions does not affect this field.
> > For VFs, this field is undefined since VFs are located using First VF Offset (see § Section 9.3.3.9 ) and VF
> > Stride (see § Section 9.3.3.10 ).
> >
> > and
> >
> > To improve the enumeration performance and create a more deterministic solution, software can
> > enumerate Functions through a linked list of Function Numbers. The next linked list element is
> > communicated through each Function’s ARI Capability Register.
> > i. Function 0 acts as the head of a linked list of Function Numbers. Software detects a
> > non-Zero Next Function Number field within the ARI Capability Register as the next
> > Function within the linked list. Software issues a configuration probe using the Bus Number
> > captured by the Device and the Function Number derived from the ARI Capability Register
> > to locate the next associated Function’s configuration space.
> > ii. Function Numbers may be sparse and non-sequential in their consumption by an ARI
> > Device.
>
> The statement "With non-VFs, this field indicates the Function Number of the
> next higher numbered Function in the Device, or 00h if there are no higher
> numbered Functions." implies the Function Number of the device should be
> lower than the value advertised by the field (for non-VFs; this patch does
> not check if it's VF or not.)
Now I get it. Good point! I'd say if we want this check we should add
it in pcie_ari_init, making that return int.
But for now it's dead code since your are changing it to 0.
So maybe a comment in pcie_ari_init is enough:
/*
* Note: nextfn must be the Function Number of the
* next higher numbered Function in the Device, or 00h if there are no higher
* numbered Functions.
* TODO: validate this.
*/
> >
> >
> >
> >
> >
> > > ---
> > > hw/pci/pci.c | 15 +++++++++++++++
> > > 1 file changed, 15 insertions(+)
> > >
> > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > index e2eb4c3b4a..568665ee42 100644
> > > --- a/hw/pci/pci.c
> > > +++ b/hw/pci/pci.c
> > > @@ -2059,6 +2059,8 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> > > Error *local_err = NULL;
> > > bool is_default_rom;
> > > uint16_t class_id;
> > > + uint16_t ari;
> > > + uint16_t nextfn;
> > > /*
> > > * capped by systemd (see: udev-builtin-net_id.c)
> > > @@ -2121,6 +2123,19 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> > > }
> > > }
> > > + if (pci_is_express(pci_dev)) {
> > > + ari = pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI);
> > > + if (ari) {
> > > + nextfn = (pci_get_long(pci_dev->config + ari + PCI_ARI_CAP) >> 8) & 0xff;
> > > + if (nextfn && (pci_dev->devfn & 0xff) >= nextfn) {
> > > + error_setg(errp, "PCI: function number %u is not lower than ARI next function number %u",
> > > + pci_dev->devfn & 0xff, nextfn);
> > > + pci_qdev_unrealize(DEVICE(pci_dev));
> > > + return;
> > > + }
> > > + }
> > > + }
> > > +
> > > if (pci_dev->failover_pair_id) {
> > > if (!pci_bus_is_express(pci_get_bus(pci_dev))) {
> > > error_setg(errp, "failover primary device must be on "
> > > --
> > > 2.41.0
> >
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] pci: Compare function number and ARI next function number
2023-07-02 8:55 ` Michael S. Tsirkin
@ 2023-07-02 8:57 ` Michael S. Tsirkin
0 siblings, 0 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2023-07-02 8:57 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, qemu-block, Ani Sinha, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
On Sun, Jul 02, 2023 at 04:55:48AM -0400, Michael S. Tsirkin wrote:
> On Sun, Jul 02, 2023 at 05:46:38PM +0900, Akihiko Odaki wrote:
> > On 2023/07/02 13:58, Michael S. Tsirkin wrote:
> > > On Sat, Jul 01, 2023 at 04:01:22PM +0900, Akihiko Odaki wrote:
> > > > The function number must be lower than the next function number
> > > > advertised with ARI.
> > > >
> > > > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> > >
> > > I don't get this logic at all - where is the limitation coming from?
> > >
> > > All I see in the spec is:
> > > Next Function Number - With non-VFs, this field indicates the Function Number of the next higher
> > > numbered Function in the Device, or 00h if there are no higher numbered Functions. Function 0 starts
> > > this linked list of Functions.
> > > The presence of Shadow Functions does not affect this field.
> > > For VFs, this field is undefined since VFs are located using First VF Offset (see § Section 9.3.3.9 ) and VF
> > > Stride (see § Section 9.3.3.10 ).
> > >
> > > and
> > >
> > > To improve the enumeration performance and create a more deterministic solution, software can
> > > enumerate Functions through a linked list of Function Numbers. The next linked list element is
> > > communicated through each Function’s ARI Capability Register.
> > > i. Function 0 acts as the head of a linked list of Function Numbers. Software detects a
> > > non-Zero Next Function Number field within the ARI Capability Register as the next
> > > Function within the linked list. Software issues a configuration probe using the Bus Number
> > > captured by the Device and the Function Number derived from the ARI Capability Register
> > > to locate the next associated Function’s configuration space.
> > > ii. Function Numbers may be sparse and non-sequential in their consumption by an ARI
> > > Device.
> >
> > The statement "With non-VFs, this field indicates the Function Number of the
> > next higher numbered Function in the Device, or 00h if there are no higher
> > numbered Functions." implies the Function Number of the device should be
> > lower than the value advertised by the field (for non-VFs; this patch does
> > not check if it's VF or not.)
>
>
> Now I get it. Good point! I'd say if we want this check we should add
> it in pcie_ari_init, making that return int.
> But for now it's dead code since your are changing it to 0.
> So maybe a comment in pcie_ari_init is enough:
>
> /*
> * Note: nextfn must be the Function Number of the
> * next higher numbered Function in the Device, or 00h if there are no higher
> * numbered Functions.
> * TODO: validate this.
> */
Or add an assert, and
TODO: in case this can ever come from command line, we'll have
to replace the assert below with a runtime check.
> > >
> > >
> > >
> > >
> > >
> > > > ---
> > > > hw/pci/pci.c | 15 +++++++++++++++
> > > > 1 file changed, 15 insertions(+)
> > > >
> > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > > > index e2eb4c3b4a..568665ee42 100644
> > > > --- a/hw/pci/pci.c
> > > > +++ b/hw/pci/pci.c
> > > > @@ -2059,6 +2059,8 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> > > > Error *local_err = NULL;
> > > > bool is_default_rom;
> > > > uint16_t class_id;
> > > > + uint16_t ari;
> > > > + uint16_t nextfn;
> > > > /*
> > > > * capped by systemd (see: udev-builtin-net_id.c)
> > > > @@ -2121,6 +2123,19 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> > > > }
> > > > }
> > > > + if (pci_is_express(pci_dev)) {
> > > > + ari = pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI);
> > > > + if (ari) {
> > > > + nextfn = (pci_get_long(pci_dev->config + ari + PCI_ARI_CAP) >> 8) & 0xff;
> > > > + if (nextfn && (pci_dev->devfn & 0xff) >= nextfn) {
> > > > + error_setg(errp, "PCI: function number %u is not lower than ARI next function number %u",
> > > > + pci_dev->devfn & 0xff, nextfn);
> > > > + pci_qdev_unrealize(DEVICE(pci_dev));
> > > > + return;
> > > > + }
> > > > + }
> > > > + }
> > > > +
> > > > if (pci_dev->failover_pair_id) {
> > > > if (!pci_bus_is_express(pci_get_bus(pci_dev))) {
> > > > error_setg(errp, "failover primary device must be on "
> > > > --
> > > > 2.41.0
> > >
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] pci: Compare function number and ARI next function number
2023-07-01 7:01 ` [PATCH 4/4] pci: Compare function number and ARI next function number Akihiko Odaki
2023-07-02 4:58 ` Michael S. Tsirkin
@ 2023-07-11 7:10 ` Ani Sinha
2023-07-11 8:33 ` Michael S. Tsirkin
1 sibling, 1 reply; 17+ messages in thread
From: Ani Sinha @ 2023-07-11 7:10 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, qemu-block, Michael S . Tsirkin, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
> On 01-Jul-2023, at 12:31 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
>
> The function number must be lower than the next function number
> advertised with ARI.
>
> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> ---
> hw/pci/pci.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index e2eb4c3b4a..568665ee42 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -2059,6 +2059,8 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> Error *local_err = NULL;
> bool is_default_rom;
> uint16_t class_id;
> + uint16_t ari;
> + uint16_t nextfn;
>
> /*
> * capped by systemd (see: udev-builtin-net_id.c)
> @@ -2121,6 +2123,19 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> }
> }
>
> + if (pci_is_express(pci_dev)) {
> + ari = pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI);
> + if (ari) {
> + nextfn = (pci_get_long(pci_dev->config + ari + PCI_ARI_CAP) >> 8) & 0xff;
> + if (nextfn && (pci_dev->devfn & 0xff) >= nextfn) {
> + error_setg(errp, "PCI: function number %u is not lower than ARI next function number %u",
> + pci_dev->devfn & 0xff, nextfn);
> + pci_qdev_unrealize(DEVICE(pci_dev));
> + return;
> + }
> + }
> + }
> +
So I kind of got lost in all the patches. What was the ultimate decision regarding checking this?
> if (pci_dev->failover_pair_id) {
> if (!pci_bus_is_express(pci_get_bus(pci_dev))) {
> error_setg(errp, "failover primary device must be on "
> --
> 2.41.0
>
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 4/4] pci: Compare function number and ARI next function number
2023-07-11 7:10 ` Ani Sinha
@ 2023-07-11 8:33 ` Michael S. Tsirkin
0 siblings, 0 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2023-07-11 8:33 UTC (permalink / raw)
To: Ani Sinha
Cc: Akihiko Odaki, qemu-devel, qemu-block, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
On Tue, Jul 11, 2023 at 12:40:47PM +0530, Ani Sinha wrote:
>
>
> > On 01-Jul-2023, at 12:31 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote:
> >
> > The function number must be lower than the next function number
> > advertised with ARI.
> >
> > Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
> > ---
> > hw/pci/pci.c | 15 +++++++++++++++
> > 1 file changed, 15 insertions(+)
> >
> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> > index e2eb4c3b4a..568665ee42 100644
> > --- a/hw/pci/pci.c
> > +++ b/hw/pci/pci.c
> > @@ -2059,6 +2059,8 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> > Error *local_err = NULL;
> > bool is_default_rom;
> > uint16_t class_id;
> > + uint16_t ari;
> > + uint16_t nextfn;
> >
> > /*
> > * capped by systemd (see: udev-builtin-net_id.c)
> > @@ -2121,6 +2123,19 @@ static void pci_qdev_realize(DeviceState *qdev, Error **errp)
> > }
> > }
> >
> > + if (pci_is_express(pci_dev)) {
> > + ari = pcie_find_capability(pci_dev, PCI_EXT_CAP_ID_ARI);
> > + if (ari) {
> > + nextfn = (pci_get_long(pci_dev->config + ari + PCI_ARI_CAP) >> 8) & 0xff;
> > + if (nextfn && (pci_dev->devfn & 0xff) >= nextfn) {
> > + error_setg(errp, "PCI: function number %u is not lower than ARI next function number %u",
> > + pci_dev->devfn & 0xff, nextfn);
> > + pci_qdev_unrealize(DEVICE(pci_dev));
> > + return;
> > + }
> > + }
> > + }
> > +
>
> So I kind of got lost in all the patches. What was the ultimate decision regarding checking this?
We still need to fix ARI for multi-function PFs.
I feel the right thing to do is to init Next Function in the ARI
capability, automatically.
For now, we have merely changed ARI setting next function to 0.
At least that's more correct for the common case of ARI PF with VFs.
> > if (pci_dev->failover_pair_id) {
> > if (!pci_bus_is_express(pci_get_bus(pci_dev))) {
> > error_setg(errp, "failover primary device must be on "
> > --
> > 2.41.0
> >
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH 0/4] pci: Compare function number and ARI next function number
2023-07-01 7:01 [PATCH 0/4] pci: Compare function number and ARI next function number Akihiko Odaki
` (3 preceding siblings ...)
2023-07-01 7:01 ` [PATCH 4/4] pci: Compare function number and ARI next function number Akihiko Odaki
@ 2023-07-02 5:10 ` Michael S. Tsirkin
4 siblings, 0 replies; 17+ messages in thread
From: Michael S. Tsirkin @ 2023-07-02 5:10 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, qemu-block, Ani Sinha, Marcel Apfelbaum,
Sriram Yagnaraman, Jason Wang, Keith Busch, Klaus Jensen
On Sat, Jul 01, 2023 at 04:01:18PM +0900, Akihiko Odaki wrote:
> The function number must be lower than the next function number
> advertised with ARI. Add a check to enforce this.
>
> I suggested this change at:
> https://lore.kernel.org/qemu-devel/bf351f8b-1c8a-8a7a-7f44-17c9ba18f179@daynix.com/
>
> Implementing this change, I found the devices implementing ARI do not set the
> correct next function numbers, which is also fixed in this series.
This isn't going to be merged with more in the way of motivation.
Analysis of at least linux guest behavious, documentation about testing,
addressing migration concerns are all not there either.
> Akihiko Odaki (4):
> docs: Fix next function numbers in SR/IOV documentation
> hw/nvme: Fix ARI next function numbers
> igb: Fix ARI next function numbers
> pci: Compare function number and ARI next function number
>
> docs/pcie_sriov.txt | 5 +++--
> hw/net/igb_core.h | 3 +++
> hw/net/igb.c | 4 +---
> hw/net/igbvf.c | 5 ++++-
> hw/nvme/ctrl.c | 7 ++++++-
> hw/pci/pci.c | 15 +++++++++++++++
> 6 files changed, 32 insertions(+), 7 deletions(-)
>
> --
> 2.41.0
^ permalink raw reply [flat|nested] 17+ messages in thread