qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Cédric Le Goater" <clg@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [PATCH] vfio/pci: Static Resizable BAR capability
Date: Fri, 5 May 2023 10:29:36 +0200	[thread overview]
Message-ID: <fca03c69-5a5d-f743-5238-1eb43192226a@redhat.com> (raw)
In-Reply-To: <20230504204248.2774200-1-alex.williamson@redhat.com>

On 5/4/23 22:42, Alex Williamson wrote:
> The PCI Resizable BAR (ReBAR) capability is currently hidden from the
> VM because the protocol for interacting with the capability does not
> support a mechanism for the device to reject an advertised supported
> BAR size.  However, when assigned to a VM, the act of resizing the
> BAR requires adjustment of host resources for the device, which
> absolutely can fail.  Linux does not currently allow us to reserve
> resources for the device independent of the current usage.
> 
> The only writable field within the ReBAR capability is the BAR Size
> register.  The PCIe spec indicates that when written, the device
> should immediately begin to operate with the provided BAR size.  The
> spec however also notes that software must only write values
> corresponding to supported sizes as indicated in the capability and
> control registers.  Writing unsupported sizes produces undefined
> results.  Therefore, if the hypervisor were to virtualize the
> capability and control registers such that the current size is the
> only indicated available size, then a write of anything other than
> the current size falls into the category of undefined behavior,
> where we can essentially expose the modified ReBAR capability as
> read-only.
> 
> This may seem pointless, but users have reported that virtualizing
> the capability in this way not only allows guest software to expose
> related features as available (even if only cosmetic), but in some
> scenarios can resolve guest driver issues.  Additionally, no
> regressions in behavior have been reported for this change.
> 
> A caveat here is that the PCIe spec requires for compatibility that
> devices report support for a size in the range of 1MB to 512GB,
> therefore if the current BAR size falls outside that range we revert
> to hiding the capability.
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>

Reviewed-by: Cédric Le Goater <clg@redhat.com>

Some more below.

> ---
>   hw/vfio/pci.c | 49 ++++++++++++++++++++++++++++++++++++++++++++++++-
>   1 file changed, 48 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index ec9a854361ac..3b4d36ce87bf 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2066,6 +2066,49 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
>       return 0;
>   }
>   
> +static int vfio_setup_rebar_ecap(VFIOPCIDevice *vdev, uint16_t pos)
> +{
> +    uint8_t bars = pci_get_byte(vdev->pdev.config + pos + PCI_REBAR_CTRL) >>
> +                                                    PCI_REBAR_CTRL_NBAR_SHIFT;
> +    int i;
> +
> +    for (i = 0; i < bars; i++) {
> +        uint32_t cap, ctrl;
> +        uint8_t size;
> +
> +        ctrl = pci_get_long(vdev->pdev.config + pos + PCI_REBAR_CTRL + (i * 8));
> +        size = (ctrl & PCI_REBAR_CTRL_BAR_SIZE) >> PCI_REBAR_CTRL_BAR_SHIFT;
> +
> +        /*
> +         * PCIe spec requires HW to support at least one size in the range 1MB
> +         * to 512GB, we intend to mask all sizes except the one currently
May be mention "7.8.6 Resizable BAR Extended Capability" of the PCIe specs ?
Because the size encoding is different between the CAP and CTRL registers
and the '19' and '+ 4' values below are a bit confusing. I don't see how
to make things better without macros (seems unnecessary as of today)


> +         * enabled in the size field, therefore if it's outside the range,
> +         * hide the whole capability.
> +         */
> +        if (size > 19) {

should we not report a warning ?

Thanks,

C.

> +            return -EINVAL;
> +        }
> +
> +        /* Hide all sizes reported in the ctrl reg per above requirement. */
> +        ctrl &= (PCI_REBAR_CTRL_BAR_SIZE |
> +                 PCI_REBAR_CTRL_NBAR_MASK | PCI_REBAR_CTRL_BAR_IDX);
> +
> +        /* Only the current size is reported in the capabilities register. */
> +        cap = 1 << (4 + size);
> +
> +        /*
> +         * The BAR size field is RW, however we've mangled the capability
> +         * register such that we only report a single size, ie. the current
> +         * BAR size.  A write of an unsupported value is undefined, therefore
> +         * the register field is essentially RO.
> +         */
> +        vfio_add_emulated_long(vdev, pos + PCI_REBAR_CTRL + (i * 8), ctrl, ~0);
> +        vfio_add_emulated_long(vdev, pos + PCI_REBAR_CAP + (i * 8), cap, ~0);
> +    }
> +
> +    return 0;
> +}
> +
>   static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>   {
>       PCIDevice *pdev = &vdev->pdev;
> @@ -2139,9 +2182,13 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
>           case 0: /* kernel masked capability */
>           case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
>           case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function virtualization */
> -        case PCI_EXT_CAP_ID_REBAR: /* Can't expose read-only */
>               trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
>               break;
> +        case PCI_EXT_CAP_ID_REBAR:
> +            if (!vfio_setup_rebar_ecap(vdev, next)) {
> +                pcie_add_capability(pdev, cap_id, cap_ver, next, size);
> +            }
> +            break;
>           default:
>               pcie_add_capability(pdev, cap_id, cap_ver, next, size);
>           }



  reply	other threads:[~2023-05-05  8:30 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-04 20:42 [PATCH] vfio/pci: Static Resizable BAR capability Alex Williamson
2023-05-05  8:29 ` Cédric Le Goater [this message]
2023-05-05 16:06   ` Alex Williamson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fca03c69-5a5d-f743-5238-1eb43192226a@redhat.com \
    --to=clg@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).