From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: Dave Jiang <dave.jiang@intel.com>
Cc: <mhonap@nvidia.com>, <aniketa@nvidia.com>, <ankita@nvidia.com>,
<alwilliamson@nvidia.com>, <vsethi@nvidia.com>, <jgg@nvidia.com>,
<mochs@nvidia.com>, <skolothumtho@nvidia.com>,
<alejandro.lucero-palau@amd.com>, <dave@stgolabs.net>,
<alison.schofield@intel.com>, <vishal.l.verma@intel.com>,
<ira.weiny@intel.com>, <dan.j.williams@intel.com>, <jgg@ziepe.ca>,
<yishaih@nvidia.com>, <kevin.tian@intel.com>, <cjia@nvidia.com>,
<targupta@nvidia.com>, <zhiw@nvidia.com>, <kjaju@nvidia.com>,
<linux-kernel@vger.kernel.org>, <linux-cxl@vger.kernel.org>,
<kvm@vger.kernel.org>
Subject: Re: [PATCH 10/20] vfio/cxl: CXL region management
Date: Fri, 13 Mar 2026 12:52:30 +0000 [thread overview]
Message-ID: <20260313125230.000058dd@huawei.com> (raw)
In-Reply-To: <e2eaa470-c57b-4fa5-9810-4224ed314445@intel.com>
On Thu, 12 Mar 2026 15:55:32 -0700
Dave Jiang <dave.jiang@intel.com> wrote:
> On 3/11/26 1:34 PM, mhonap@nvidia.com wrote:
> > From: Manish Honap <mhonap@nvidia.com>
> >
> > Add CXL region management for future guest access.
> >
> > Region Management makes use of APIs provided by CXL_CORE as below:
> >
> > CREATE_REGION flow:
> > 1. Validate request (size, decoder availability)
> > 2. Allocate HPA via cxl_get_hpa_freespace()
> > 3. Allocate DPA via cxl_request_dpa()
> > 4. Create region via cxl_create_region() - commits HDM decoder!
> > 5. Get HPA range via cxl_get_region_range()
> >
> > DESTROY_REGION flow:
> > 1. Detach decoder via cxl_decoder_detach()
> > 2. Free DPA via cxl_dpa_free()
> > 3. Release root decoder via cxl_put_root_decoder()
> >
> > Signed-off-by: Manish Honap <mhonap@nvidia.com>
A few additional comments from me.
> > ---
> > drivers/vfio/pci/cxl/vfio_cxl_core.c | 118 ++++++++++++++++++++++++++-
> > drivers/vfio/pci/cxl/vfio_cxl_priv.h | 5 ++
> > drivers/vfio/pci/vfio_pci_priv.h | 8 ++
> > 3 files changed, 130 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/vfio/pci/cxl/vfio_cxl_core.c b/drivers/vfio/pci/cxl/vfio_cxl_core.c
> > index 2da6da1c0605..9c71f592e74e 100644
> > --- a/drivers/vfio/pci/cxl/vfio_cxl_core.c
> > +++ b/drivers/vfio/pci/cxl/vfio_cxl_core.c
> > @@ -126,6 +126,112 @@ static int vfio_cxl_setup_regs(struct vfio_pci_core_device *vdev)
> > return 0;
> > }
> >
> > +int vfio_cxl_create_cxl_region(struct vfio_pci_core_device *vdev, resource_size_t size)
> > +{
> > + struct vfio_pci_cxl_state *cxl = vdev->cxl;
> > + resource_size_t max_size;
> > + int ret;
> > +
> > + if (cxl->precommitted)
> > + return 0;
> > +
> > + cxl->cxlrd = cxl_get_hpa_freespace(cxl->cxlmd, 1,
> > + CXL_DECODER_F_RAM |
> > + CXL_DECODER_F_TYPE2,
> > + &max_size);
>
> Not sure what VFIO subsystem's policy is on scoped base resource cleanup, but a __free() here can get you out of managing put() of the root decoder.
>
> > + if (IS_ERR(cxl->cxlrd))
> > + return PTR_ERR(cxl->cxlrd);
> > +
> > + /* Insufficient HPA space */
> > + if (max_size < size) {
> > + cxl_put_root_decoder(cxl->cxlrd);
> > + cxl->cxlrd = NULL;
Similar to other cases, I'd keep assigning stuff in cxl to the point
where there are no more error paths. Use local variables until then.
(that would fit with using __free() as well which I'd also favor if
accepted in VFIO).
> > + return -ENOSPC;
> > + }
> > +
> > + cxl->cxled = cxl_request_dpa(cxl->cxlmd, CXL_PARTMODE_RAM, size);
>
> Same comment here about __free().
>
> > + if (IS_ERR(cxl->cxled)) {
> > + ret = PTR_ERR(cxl->cxled);
> > + goto err_free_hpa;
> > + }
> > +
> > + cxl->region = cxl_create_region(cxl->cxlrd, &cxl->cxled, 1);
> > + if (IS_ERR(cxl->region)) {
> > + ret = PTR_ERR(cxl->region);
You carefully NULL this in vfio_cxl_destroy_region, but if you fail
here you end up with it containing an ERR_PTR(). I'd avoid that by
using a local variable and only assigning cxl->region after this
suceeds.
> > + goto err_free_dpa;
> > + }
> > +
> > + return 0;
> > +
> > +err_free_dpa:
> > + cxl_dpa_free(cxl->cxled);
> > +err_free_hpa:
> > + if (cxl->cxlrd)
> > + cxl_put_root_decoder(cxl->cxlrd);
> > +
> > + return ret;
> > +}
> > +
> > +void vfio_cxl_destroy_cxl_region(struct vfio_pci_core_device *vdev)
> > +{
> > + struct vfio_pci_cxl_state *cxl = vdev->cxl;
> > +
> > + if (!cxl->region)
> > + return;
> > +
> > + cxl_unregister_region(cxl->region);
> > + cxl->region = NULL;
> > +
> > + if (cxl->precommitted)
> > + return;
> > +
> > + cxl_dpa_free(cxl->cxled);
> > + cxl_put_root_decoder(cxl->cxlrd);
> > +}
> > +
> > +static int vfio_cxl_create_region_helper(struct vfio_pci_core_device *vdev,
> > + resource_size_t capacity)
> > +{
> > + struct vfio_pci_cxl_state *cxl = vdev->cxl;
> > + struct pci_dev *pdev = vdev->pdev;
> > + int ret;
> > +
> > + if (cxl->precommitted) {
> > + cxl->cxled = cxl_get_committed_decoder(cxl->cxlmd,
> > + &cxl->region);
> > + if (IS_ERR(cxl->cxled))
> > + return PTR_ERR(cxl->cxled);
> > + } else {
> > + ret = vfio_cxl_create_cxl_region(vdev, capacity);
> > + if (ret)
> > + return ret;
> > + }
> > +
> > + if (cxl->region) {
>
> Maybe if you do 'if (!cxl->region)' first and just exit, then you don't need to indent the normal code path.
>
> > + struct range range;
> > +
> > + ret = cxl_get_region_range(cxl->region, &range);
> > + if (ret)
> > + goto failed;
> > +
> > + cxl->region_hpa = range.start;
> > + cxl->region_size = range_len(&range);
> > +
> > + pci_dbg(pdev, "Precommitted decoder: HPA 0x%llx size %lu MB\n",
> > + cxl->region_hpa, cxl->region_size >> 20);
> > + } else {
> > + pci_err(pdev, "Failed to create CXL region\n");
> > + ret = -ENODEV;
> > + goto failed;
> > + }
> > +
> > + return 0;
> > +
> > +failed:
> > + vfio_cxl_destroy_cxl_region(vdev);
Little bit of refactoring and this could be replaced with __free() magic.
> > + return ret;
> > +}
> > +
> > /**
> > * vfio_pci_cxl_detect_and_init - Detect and initialize CXL Type-2 device
> > * @vdev: VFIO PCI device
> > @@ -172,6 +278,12 @@ void vfio_pci_cxl_detect_and_init(struct vfio_pci_core_device *vdev)
> >
> > pci_disable_device(pdev);
> >
> > + ret = vfio_cxl_create_region_helper(vdev, SZ_256M);
>
> Maybe a comment on why this size?
:) I wondered that as well. I'm guessing your bios isn't always providing
the decoder and this lets you test.
>
> DJ
>
> > + if (ret)
> > + goto failed;
> > +
> > + cxl->precommitted = true;
> > +
> > return;
> >
> > failed:
> > @@ -181,6 +293,10 @@ void vfio_pci_cxl_detect_and_init(struct vfio_pci_core_device *vdev)
> >
> > void vfio_pci_cxl_cleanup(struct vfio_pci_core_device *vdev)
> > {
> > - if (!vdev->cxl)
> > + struct vfio_pci_cxl_state *cxl = vdev->cxl;
Do that in the earlier patch to reduce churn a tiny bit.
> > +
> > + if (!cxl || !cxl->region)
> > return;
> > +
> > + vfio_cxl_destroy_cxl_region(vdev);
> > }
next prev parent reply other threads:[~2026-03-13 12:52 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-11 20:34 [PATCH 00/20] vfio/pci: Add CXL Type-2 device passthrough support mhonap
2026-03-11 20:34 ` [PATCH 01/20] cxl: Introduce cxl_get_hdm_reg_info() mhonap
2026-03-12 11:28 ` Jonathan Cameron
2026-03-12 16:33 ` Dave Jiang
2026-03-11 20:34 ` [PATCH 02/20] cxl: Expose cxl subsystem specific functions for vfio mhonap
2026-03-12 16:49 ` Dave Jiang
2026-03-13 10:05 ` Manish Honap
2026-03-11 20:34 ` [PATCH 03/20] cxl: Move CXL spec defines to public header mhonap
2026-03-13 12:18 ` Jonathan Cameron
2026-03-13 16:56 ` Dave Jiang
2026-03-18 14:56 ` Jonathan Cameron
2026-03-18 17:51 ` Manish Honap
2026-03-11 20:34 ` [PATCH 04/20] cxl: Media ready check refactoring mhonap
2026-03-12 20:29 ` Dave Jiang
2026-03-13 10:05 ` Manish Honap
2026-03-11 20:34 ` [PATCH 05/20] cxl: Expose BAR index and offset from register map mhonap
2026-03-12 20:58 ` Dave Jiang
2026-03-13 10:11 ` Manish Honap
2026-03-11 20:34 ` [PATCH 06/20] vfio/cxl: Add UAPI for CXL Type-2 device passthrough mhonap
2026-03-12 21:04 ` Dave Jiang
2026-03-11 20:34 ` [PATCH 07/20] vfio/pci: Add CXL state to vfio_pci_core_device mhonap
2026-03-11 20:34 ` [PATCH 08/20] vfio/pci: Add vfio-cxl Kconfig and build infrastructure mhonap
2026-03-13 12:27 ` Jonathan Cameron
2026-03-18 17:21 ` Manish Honap
2026-03-11 20:34 ` [PATCH 09/20] vfio/cxl: Implement CXL device detection and HDM register probing mhonap
2026-03-12 22:31 ` Dave Jiang
2026-03-13 12:43 ` Jonathan Cameron
2026-03-18 17:43 ` Manish Honap
2026-03-11 20:34 ` [PATCH 10/20] vfio/cxl: CXL region management mhonap
2026-03-12 22:55 ` Dave Jiang
2026-03-13 12:52 ` Jonathan Cameron [this message]
2026-03-18 17:48 ` Manish Honap
2026-03-11 20:34 ` [PATCH 11/20] vfio/cxl: Expose DPA memory region to userspace with fault+zap mmap mhonap
2026-03-13 17:07 ` Dave Jiang
2026-03-18 17:54 ` Manish Honap
2026-03-11 20:34 ` [PATCH 12/20] vfio/pci: Export config access helpers mhonap
2026-03-11 20:34 ` [PATCH 13/20] vfio/cxl: Introduce HDM decoder register emulation framework mhonap
2026-03-13 19:05 ` Dave Jiang
2026-03-18 17:58 ` Manish Honap
2026-03-11 20:34 ` [PATCH 14/20] vfio/cxl: Check media readiness and create CXL memdev mhonap
2026-03-11 20:34 ` [PATCH 15/20] vfio/cxl: Introduce CXL DVSEC configuration space emulation mhonap
2026-03-13 22:07 ` Dave Jiang
2026-03-18 18:41 ` Manish Honap
2026-03-11 20:34 ` [PATCH 16/20] vfio/pci: Expose CXL device and region info via VFIO ioctl mhonap
2026-03-11 20:34 ` [PATCH 17/20] vfio/cxl: Provide opt-out for CXL feature mhonap
2026-03-11 20:34 ` [PATCH 18/20] docs: vfio-pci: Document CXL Type-2 device passthrough mhonap
2026-03-13 12:13 ` Jonathan Cameron
2026-03-17 21:24 ` Alex Williamson
2026-03-19 16:06 ` Jonathan Cameron
2026-03-23 14:36 ` Manish Honap
2026-03-11 20:34 ` [PATCH 19/20] selftests/vfio: Add CXL Type-2 passthrough tests mhonap
2026-03-11 20:34 ` [PATCH 20/20] selftests/vfio: Fix VLA initialisation in vfio_pci_irq_set() mhonap
2026-03-13 22:23 ` Dave Jiang
2026-03-18 18:07 ` Manish Honap
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260313125230.000058dd@huawei.com \
--to=jonathan.cameron@huawei.com \
--cc=alejandro.lucero-palau@amd.com \
--cc=alison.schofield@intel.com \
--cc=alwilliamson@nvidia.com \
--cc=aniketa@nvidia.com \
--cc=ankita@nvidia.com \
--cc=cjia@nvidia.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=ira.weiny@intel.com \
--cc=jgg@nvidia.com \
--cc=jgg@ziepe.ca \
--cc=kevin.tian@intel.com \
--cc=kjaju@nvidia.com \
--cc=kvm@vger.kernel.org \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mhonap@nvidia.com \
--cc=mochs@nvidia.com \
--cc=skolothumtho@nvidia.com \
--cc=targupta@nvidia.com \
--cc=vishal.l.verma@intel.com \
--cc=vsethi@nvidia.com \
--cc=yishaih@nvidia.com \
--cc=zhiw@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.