All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Roger Pau Monné" <roger.pau@citrix.com>
To: Mykyta Poturai <Mykyta_Poturai@epam.com>
Cc: "xen-devel@lists.xenproject.org" <xen-devel@lists.xenproject.org>,
	Stewart Hildebrand <stewart.hildebrand@amd.com>
Subject: Re: [PATCH v1 6/6] vpci: add SR-IOV support for DomUs
Date: Mon, 28 Jul 2025 17:10:43 +0200	[thread overview]
Message-ID: <aIeS86j8ZOGVK3fy@macbook.local> (raw)
In-Reply-To: <24ecd4b4877f82304e5fe5a4a6c524cc263c7756.1753450965.git.mykyta_poturai@epam.com>

On Fri, Jul 25, 2025 at 02:24:33PM +0000, Mykyta Poturai wrote:
> From: Stewart Hildebrand <stewart.hildebrand@amd.com>
> 
> SR-IOV virtual functions have simplified configuration space such as
> Vendor ID is derived from the physical function and Device ID comes
> from SR-IOV extended capability.
> Emulate those, so we can provide VID/DID pair for guests which use PCI
> passthrough for SR-IOV virtual functions.
> 
> Emulate guest BAR register values based on PF BAR values for VFs.
> This allows creating a guest view of the normal BAR registers and emulates
> the size and properties as it is done during PCI device enumeration by
> the guest.
> 
> Note, that VFs ROM BAR is read-only and is all zeros, but VF may provide
> access to the PFs ROM via emulation and is not implemented.
> 
> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com>
> Signed-off-by: Mykyta Poturai <mykyta_poturai@epam.com>
> ---
>  xen/drivers/vpci/sriov.c | 119 ++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 118 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/drivers/vpci/sriov.c b/xen/drivers/vpci/sriov.c
> index 640430e3e9..bf8d9763c6 100644
> --- a/xen/drivers/vpci/sriov.c
> +++ b/xen/drivers/vpci/sriov.c
> @@ -39,7 +39,8 @@ static int vf_init_bars(const struct pci_dev *vf_pdev)
>      for ( i = 0; i < PCI_SRIOV_NUM_BARS; i++ )
>      {
>          bars[i].addr = physfn_vf_bars[i].addr + vf_idx * physfn_vf_bars[i].size;
> -        bars[i].guest_addr = bars[i].addr;
> +        if ( pf_pdev->domain == vf_pdev->domain || bars[i].guest_addr == 0 )
> +            bars[i].guest_addr = bars[i].addr;

Wouldn't this better be done based on the owner of the device being
the hardware domain?  This seems a bit ad-hoc and not quite obvious.

>          bars[i].size = physfn_vf_bars[i].size;
>          bars[i].type = physfn_vf_bars[i].type;
>          bars[i].prefetchable = physfn_vf_bars[i].prefetchable;
> @@ -166,6 +167,44 @@ static void cf_check control_write(const struct pci_dev *pdev, unsigned int reg,
>      pci_conf_write16(pdev->sbdf, reg, val);
>  }
>  
> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> +static void cf_check vf_cmd_write(const struct pci_dev *pdev, unsigned int reg,
> +                                  uint32_t cmd, void *data)
> +{
> +    struct vpci_header *header = data;
> +
> +    cmd |= PCI_COMMAND_INTX_DISABLE;
> +
> +    header->guest_cmd = cmd;
> +
> +    /*
> +     * Let Dom0 play with all the bits directly except for the memory
> +     * decoding one. Bits that are not allowed for DomU are already
> +     * handled above and by the rsvdp_mask.
> +     */
> +    if ( header->bars_mapped != !!(cmd & PCI_COMMAND_MEMORY) )
> +    {
> +        /*
> +         * Ignore the error. No memory has been added or removed from the p2m
> +         * (because the actual p2m changes are deferred in defer_map) and the
> +         * memory decoding bit has not been changed, so leave everything as-is,
> +         * hoping the guest will realize and try again.
> +         */
> +        map_vf(pdev, cmd);
> +    }
> +    else
> +        pci_conf_write16(pdev->sbdf, reg, cmd);
> +}

This seems to be (mostly) a duplication of the existing cmd_write() in
header.c.  Is there anyway that we could use the same handler and
check for whether the device is a VF for any specific VF handling?

> +
> +static uint32_t cf_check vf_cmd_read(const struct pci_dev *pdev,
> +                                     unsigned int reg, void *data)
> +{
> +    const struct vpci_header *header = data;
> +
> +    return header->guest_cmd;
> +}

This is a verbatim duplicate of guest_cmd_read() in header.c.

> +#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */
> +
>  static int vf_init_header(struct pci_dev *vf_pdev)
>  {
>      const struct pci_dev *pf_pdev;
> @@ -184,6 +223,84 @@ static int vf_init_header(struct pci_dev *vf_pdev)
>      sriov_pos = pci_find_ext_capability(pf_pdev->sbdf, PCI_EXT_CAP_ID_SRIOV);
>      ctrl = pci_conf_read16(pf_pdev->sbdf, sriov_pos + PCI_SRIOV_CTRL);
>  
> +#ifdef CONFIG_HAS_VPCI_GUEST_SUPPORT
> +    if ( pf_pdev->domain != vf_pdev->domain )
> +    {
> +        uint16_t vid = pci_conf_read16(pf_pdev->sbdf, PCI_VENDOR_ID);
> +        uint16_t did = pci_conf_read16(pf_pdev->sbdf,
> +                                       sriov_pos + PCI_SRIOV_VF_DID);
> +        struct vpci_bar *bars = vf_pdev->vpci->header.bars;
> +        unsigned int i;
> +
> +        rc = vpci_add_register(vf_pdev->vpci, vpci_read_val, NULL,
> +                               PCI_VENDOR_ID, 2, (void *)(uintptr_t)vid);
> +        if ( rc )
> +            return rc;
> +
> +        rc = vpci_add_register(vf_pdev->vpci, vpci_read_val, NULL,
> +                               PCI_DEVICE_ID, 2, (void *)(uintptr_t)did);
> +        if ( rc )
> +            return rc;
> +
> +        /* Hardcode multi-function device bit to 0 */
> +        rc = vpci_add_register(vf_pdev->vpci, vpci_read_val, NULL,
> +                               PCI_HEADER_TYPE, 1,
> +                               (void *)PCI_HEADER_TYPE_NORMAL);
> +        if ( rc )
> +            return rc;
> +
> +        rc = vpci_add_register(vf_pdev->vpci, vpci_hw_read32, NULL,
> +                               PCI_CLASS_REVISION, 4, NULL);
> +        if ( rc )
> +            return rc;
> +
> +        /* VF ROZ is covered by ro_mask */
> +        rc = vpci_add_register_mask(vf_pdev->vpci, vf_cmd_read, vf_cmd_write,
> +                                    PCI_COMMAND, 2, &vf_pdev->vpci->header,
> +                                    PCI_COMMAND_IO | PCI_COMMAND_SPECIAL |
> +                                    PCI_COMMAND_INVALIDATE |
> +                                    PCI_COMMAND_VGA_PALETTE | PCI_COMMAND_WAIT |
> +                                    PCI_COMMAND_FAST_BACK,
> +                                    0,
> +                                    PCI_COMMAND_RSVDP_MASK |
> +                                    PCI_COMMAND_PARITY | PCI_COMMAND_SERR,
> +                                    0);
> +        if ( rc )
> +            return rc;
> +
> +        rc = vpci_init_capability_list(vf_pdev);
> +        if ( rc )
> +            return rc;
> +
> +        for ( i = 0; i < PCI_SRIOV_NUM_BARS; i++ )
> +        {
> +            switch ( pf_pdev->vpci->sriov->vf_bars[i].type )
> +            {
> +            case VPCI_BAR_MEM32:
> +            case VPCI_BAR_MEM64_LO:
> +            case VPCI_BAR_MEM64_HI:
> +                rc = vpci_add_register(vf_pdev->vpci, vpci_guest_mem_bar_read,
> +                                       vpci_guest_mem_bar_write,
> +                                       PCI_BASE_ADDRESS_0 + i * 4, 4, &bars[i]);
> +                if ( rc )
> +                    return rc;
> +                break;
> +            default:
> +                rc = vpci_add_register(vf_pdev->vpci, vpci_read_val, NULL,
> +                                       PCI_BASE_ADDRESS_0 + i * 4, 4,
> +                                       (void *)0);
> +                if ( rc )
> +                    return rc;
> +                break;
> +            }
> +        }
> +
> +        rc = vf_init_bars(vf_pdev);
> +        if ( rc )
> +            return rc;
> +    }
> +#endif /* CONFIG_HAS_VPCI_GUEST_SUPPORT */

I think it would be better if the code in sr-iov.c would only deal
with accesses to the SR-IOV capability (so the hardware domain), while
the code to handle SR-IOV VFs was directly added to header.c.

Otherwise the work here will collide from the series by Jiqian that
attempts to clean up the mediation of capabilities done in vPCI:

https://lore.kernel.org/xen-devel/20250728050401.329510-1-Jiqian.Chen@amd.com/

Specifically see patch #2:

https://lore.kernel.org/xen-devel/20250728050401.329510-3-Jiqian.Chen@amd.com/

Which introduces vpci_init_capabilities(), and makes vPCI capability
initialization tied to the PCI device having the capability present.

Thanks, Roger.


  reply	other threads:[~2025-07-28 15:11 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-25 14:24 [PATCH v1 0/6] Implement SR-IOV support for PVH Mykyta Poturai
2025-07-25 14:24 ` [PATCH v1 3/6] vpci: rename and export vpci_bar_add_rangeset Mykyta Poturai
2025-07-25 14:54   ` Teddy Astie
2025-07-28  9:51   ` Roger Pau Monné
2025-07-25 14:24 ` [PATCH v1 2/6] vpci: rename and export vpci_guest_mem_bar_{read,write} Mykyta Poturai
2025-07-25 14:51   ` Teddy Astie
2025-07-25 14:24 ` [PATCH v1 1/6] vpci: rename and export vpci_modify_bars Mykyta Poturai
2025-07-25 14:42   ` Teddy Astie
2025-07-25 14:24 ` [PATCH v1 4/6] vpci: add SR-IOV support for PVH Dom0 Mykyta Poturai
2025-07-25 17:39   ` Teddy Astie
2026-02-26 10:06     ` Mykyta Poturai
2025-07-28 11:33   ` Roger Pau Monné
2026-03-04  8:43     ` Mykyta Poturai
2026-03-04 15:16       ` Roger Pau Monné
2026-03-04 15:34         ` Stewart Hildebrand
2025-07-25 14:24 ` [PATCH v1 6/6] vpci: add SR-IOV support for DomUs Mykyta Poturai
2025-07-28 15:10   ` Roger Pau Monné [this message]
2025-07-25 14:24 ` [PATCH v1 5/6] vpci: export vpci_init_capability_list() Mykyta Poturai
2025-07-28  9:04 ` [PATCH v1 0/6] Implement SR-IOV support for PVH Roger Pau Monné

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aIeS86j8ZOGVK3fy@macbook.local \
    --to=roger.pau@citrix.com \
    --cc=Mykyta_Poturai@epam.com \
    --cc=stewart.hildebrand@amd.com \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.