From: "Cédric Le Goater" <clg@redhat.com>
To: Alex Williamson <alex.williamson@redhat.com>, qemu-devel@nongnu.org
Subject: Re: [PATCH v2] vfio/pci: Static Resizable BAR capability
Date: Tue, 9 May 2023 11:39:57 +0200 [thread overview]
Message-ID: <0e61fc26-369b-3a6a-8001-8be1bb49ff40@redhat.com> (raw)
In-Reply-To: <20230505232308.2869912-1-alex.williamson@redhat.com>
On 5/6/23 01:23, Alex Williamson wrote:
> The PCI Resizable BAR (ReBAR) capability is currently hidden from the
> VM because the protocol for interacting with the capability does not
> support a mechanism for the device to reject an advertised supported
> BAR size. However, when assigned to a VM, the act of resizing the
> BAR requires adjustment of host resources for the device, which
> absolutely can fail. Linux does not currently allow us to reserve
> resources for the device independent of the current usage.
>
> The only writable field within the ReBAR capability is the BAR Size
> register. The PCIe spec indicates that when written, the device
> should immediately begin to operate with the provided BAR size. The
> spec however also notes that software must only write values
> corresponding to supported sizes as indicated in the capability and
> control registers. Writing unsupported sizes produces undefined
> results. Therefore, if the hypervisor were to virtualize the
> capability and control registers such that the current size is the
> only indicated available size, then a write of anything other than
> the current size falls into the category of undefined behavior,
> where we can essentially expose the modified ReBAR capability as
> read-only.
>
> This may seem pointless, but users have reported that virtualizing
> the capability in this way not only allows guest software to expose
> related features as available (even if only cosmetic), but in some
> scenarios can resolve guest driver issues. Additionally, no
> regressions in behavior have been reported for this change.
>
> A caveat here is that the PCIe spec requires for compatibility that
> devices report support for a size in the range of 1MB to 512GB,
> therefore if the current BAR size falls outside that range we revert
> to hiding the capability.
>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
> ---
> v2:
> - Add spec reference
> - Use PCI_REBAR_CAP_SIZES to check sizes in range
> - Try to clarify capability bit generation
> - Rename s/bars/nbar/ to match #defines
> - More complete masking of NBAR value
>
> hw/vfio/pci.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++-
> 1 file changed, 53 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index cf27f28936cb..3ab849767a92 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2066,6 +2066,54 @@ static int vfio_add_std_cap(VFIOPCIDevice *vdev, uint8_t pos, Error **errp)
> return 0;
> }
>
> +static int vfio_setup_rebar_ecap(VFIOPCIDevice *vdev, uint16_t pos)
> +{
> + uint32_t ctrl;
> + int i, nbar;
> +
> + ctrl = pci_get_long(vdev->pdev.config + pos + PCI_REBAR_CTRL);
> + nbar = (ctrl & PCI_REBAR_CTRL_NBAR_MASK) >> PCI_REBAR_CTRL_NBAR_SHIFT;
> +
> + for (i = 0; i < nbar; i++) {
> + uint32_t cap;
> + int size;
> +
> + ctrl = pci_get_long(vdev->pdev.config + pos + PCI_REBAR_CTRL + (i * 8));
> + size = (ctrl & PCI_REBAR_CTRL_BAR_SIZE) >> PCI_REBAR_CTRL_BAR_SHIFT;
> +
> + /* The cap register reports sizes 1MB to 127TB, with 4 reserved bits */
s/127/128/
C.
> + cap = size <= 27 ? 1U << (size + 4) : 0;
> +
> + /*
> + * The PCIe spec (v6.0.1, 7.8.6) requires HW to support at least one
> + * size in the range 1MB to 512GB. We intend to mask all sizes except
> + * the one currently enabled in the size field, therefore if it's
> + * outside the range, hide the whole capability as this virtualization
> + * trick won't work. If >512GB resizable BARs start to appear, we
> + * might need an opt-in or reservation scheme in the kernel.
> + */
> + if (!(cap & PCI_REBAR_CAP_SIZES)) {
> + return -EINVAL;
> + }
> +
> + /* Hide all sizes reported in the ctrl reg per above requirement. */
> + ctrl &= (PCI_REBAR_CTRL_BAR_SIZE |
> + PCI_REBAR_CTRL_NBAR_MASK |
> + PCI_REBAR_CTRL_BAR_IDX);
> +
> + /*
> + * The BAR size field is RW, however we've mangled the capability
> + * register such that we only report a single size, ie. the current
> + * BAR size. A write of an unsupported value is undefined, therefore
> + * the register field is essentially RO.
> + */
> + vfio_add_emulated_long(vdev, pos + PCI_REBAR_CAP + (i * 8), cap, ~0);
> + vfio_add_emulated_long(vdev, pos + PCI_REBAR_CTRL + (i * 8), ctrl, ~0);
> + }
> +
> + return 0;
> +}
> +
> static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> {
> PCIDevice *pdev = &vdev->pdev;
> @@ -2139,9 +2187,13 @@ static void vfio_add_ext_cap(VFIOPCIDevice *vdev)
> case 0: /* kernel masked capability */
> case PCI_EXT_CAP_ID_SRIOV: /* Read-only VF BARs confuse OVMF */
> case PCI_EXT_CAP_ID_ARI: /* XXX Needs next function virtualization */
> - case PCI_EXT_CAP_ID_REBAR: /* Can't expose read-only */
> trace_vfio_add_ext_cap_dropped(vdev->vbasedev.name, cap_id, next);
> break;
> + case PCI_EXT_CAP_ID_REBAR:
> + if (!vfio_setup_rebar_ecap(vdev, next)) {
> + pcie_add_capability(pdev, cap_id, cap_ver, next, size);
> + }
> + break;
> default:
> pcie_add_capability(pdev, cap_id, cap_ver, next, size);
> }
next prev parent reply other threads:[~2023-05-09 9:40 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-05-05 23:23 [PATCH v2] vfio/pci: Static Resizable BAR capability Alex Williamson
2023-05-09 9:39 ` Cédric Le Goater [this message]
2023-05-09 14:20 ` Alex Williamson
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=0e61fc26-369b-3a6a-8001-8be1bb49ff40@redhat.com \
--to=clg@redhat.com \
--cc=alex.williamson@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).