public inbox for linux-pci@vger.kernel.org
From: Jonathan Cameron <jonathan.cameron@huawei.com>
To: <smadhavan@nvidia.com>
Cc: <dave@stgolabs.net>, <dave.jiang@intel.com>,
	<alison.schofield@intel.com>, <vishal.l.verma@intel.com>,
	<ira.weiny@intel.com>, <dan.j.williams@intel.com>,
	<bhelgaas@google.com>, <ming.li@zohomail.com>, <rrichter@amd.com>,
	<Smita.KoralahalliChannabasappa@amd.com>,
	<huaisheng.ye@intel.com>, <linux-cxl@vger.kernel.org>,
	<linux-pci@vger.kernel.org>, <vaslot@nvidia.com>,
	<vsethi@nvidia.com>, <sdonthineni@nvidia.com>,
	<vidyas@nvidia.com>, <mochs@nvidia.com>, <jsequeira@nvidia.com>
Subject: Re: [PATCH v4 05/10] cxl: add reset prepare and region teardown
Date: Wed, 21 Jan 2026 11:09:31 +0000
Message-ID: <20260121110931.000063e2@huawei.com>
In-Reply-To: <20260120222610.2227109-6-smadhavan@nvidia.com>

On Tue, 20 Jan 2026 22:26:05 +0000
smadhavan@nvidia.com wrote:

> From: Srirangan Madhavan <smadhavan@nvidia.com>
> 
> Prepare a Type 2 device for cxl_reset by validating memory is offline,
> flushing device caches for region participants, and tearing down decoders
> under cxl_region_rwsem. The lock stays held across reset to prevent new
> region creation while reset is in progress.
> 
> Signed-off-by: Srirangan Madhavan <smadhavan@nvidia.com>

Some minor feedback from a quick look. I'll want to take a closer look
when we are closer to merging this.

> ---
>  drivers/cxl/pci.c | 214 ++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 214 insertions(+)
> 
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index b562e607ec46..e4134162e82a 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -1085,6 +1085,220 @@ bool cxl_is_type2_device(struct pci_dev *pdev)
>  	return cxlds->type == CXL_DEVTYPE_DEVMEM;
>  }
> 
> +static int cxl_check_region_driver_bound(struct device *dev, void *data)
> +{
> +	struct cxl_decoder *cxld = to_cxl_decoder(dev);
> +
> +	if (!is_endpoint_decoder(dev))
> +		return 0;
> +
> +	guard(rwsem_read)(&cxl_region_rwsem);
> +	if (cxld->region && cxld->region->driver)
> +		return -EBUSY;
> +
> +	return 0;
> +}
> +
> +static int cxl_decoder_kill_region_iter(struct device *dev, void *data)
> +{
> +	struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(dev);
> +	int rc;
> +
> +	if (!is_endpoint_decoder(dev))
> +		return 0;
> +
> +	if (!cxled->cxld.region)
> +		return 0;
> +
> +	cxl_decoder_kill_region_locked(cxled);
> +
> +	rc = device_for_each_child(&cxled->cxld.dev, NULL,
> +				   cxl_check_region_driver_bound);

return device_for_each_child()

If that doesn't make sense after later patches, fine to leave as it is.
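
As a rough illustration of the suggestion above, here is a userspace sketch
(not kernel code). for_each_child() and struct child are made-up stand-ins
that only model the "stop at the first non-zero return" behaviour of
device_for_each_child(); the two wrappers show that the store/test/return-0
shape and the direct return give the same result.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a child device; not the kernel's struct device. */
struct child {
	int region_driver_bound;
};

typedef int (*child_fn)(struct child *c, void *data);

/*
 * Models the iterator contract: call fn on each child and stop at the
 * first non-zero return, which is passed back to the caller.
 */
static int for_each_child(struct child *kids, size_t n, void *data,
			  child_fn fn)
{
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < n; i++) {
		int rc = fn(&kids[i], data);

		if (rc)
			return rc;
	}
	return 0;
}

static int check_region_driver_bound(struct child *c, void *data)
{
	(void)data;
	return c->region_driver_bound ? -EBUSY : 0;
}

/* Shape used in the patch: store rc, test it, then return 0. */
static int kill_region_iter_verbose(struct child *kids, size_t n)
{
	int rc;

	rc = for_each_child(kids, n, NULL, check_region_driver_bound);
	if (rc)
		return rc;

	return 0;
}

/* Suggested shape: hand the iterator's result straight back. */
static int kill_region_iter_direct(struct child *kids, size_t n)
{
	return for_each_child(kids, n, NULL, check_region_driver_bound);
}
```

Both wrappers return 0 when no child is bound and -EBUSY otherwise, so the
extra local only earns its place if a later patch adds work after the call.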


> +	if (rc)
> +		return rc;
> +
> +	return 0;
> +}
> +
> +static int cxl_device_cache_wb_invalidate(struct pci_dev *pdev)
> +{
> +	struct cxl_dev_state *cxlds = pci_get_drvdata(pdev);
> +	u16 reg, val, cap;
> +	int dvsec, rc;
> +
> +	if (!cxlds)
> +		return -ENODEV;
> +
> +	dvsec = cxlds->cxl_dvsec;
> +	if (!dvsec)
> +		return -ENODEV;
> +
> +	rc = pci_read_config_word(pdev, dvsec + CXL_DVSEC_CAP_OFFSET, &cap);
> +	if (rc)
> +		return rc;
> +
> +	if (!(cap & CXL_DVSEC_CACHE_WBI_CAPABLE))
> +		return 1;

With an unusual return value like this, the function definitely needs docs.
Given how it is used below, maybe just return 0?

If there are caches and there is no way to force WB, what does that mean
for whether we can reset the device?  Feels like maybe this at least
deserves a warning print.

My suspicion is that lack of that feature just means there is a device
specific way to do it but I'm fine with Linux not supporting that ;)
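
To make the "return 0 plus a warning" idea concrete, here is a userspace
sketch under stated assumptions. struct fake_dev, FAKE_CACHE_WBI_CAPABLE and
both function names are hypothetical; the bit value is a placeholder and is
not taken from the CXL spec or from this patch. In the driver the warning
would presumably be a dev_warn() or similar rather than fprintf().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

/* Placeholder bit for illustration only; not the real DVSEC layout. */
#define FAKE_CACHE_WBI_CAPABLE 0x1

/* Hypothetical device model standing in for config space accesses. */
struct fake_dev {
	uint16_t cap;
	int wbi_issued;
	int warned;
};

/*
 * Returns 0 on success, including the case where the device does not
 * advertise cache writeback + invalidate and nothing can be done, and a
 * negative errno on failure. No positive "special" value is needed.
 */
static int cache_wb_invalidate(struct fake_dev *dev)
{
	if (!dev)
		return -ENODEV;

	if (!(dev->cap & FAKE_CACHE_WBI_CAPABLE)) {
		/* Tell the user instead of telling the caller. */
		fprintf(stderr, "cache WB+invalidate not supported, skipping\n");
		dev->warned = 1;
		return 0;
	}

	dev->wbi_issued = 1;
	return 0;
}

/* The caller's check collapses to the usual "if (rc) return rc;". */
static int flush_targets(struct fake_dev **targets, int nr)
{
	int i, rc;

	assert(targets != NULL);
	for (i = 0; i < nr; i++) {
		rc = cache_wb_invalidate(targets[i]);
		if (rc)
			return rc;
	}
	return 0;
}
```

With this shape the caller no longer has to know about rc == 1, and the
unsupported case still leaves a trace in the log.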

> +
> +	rc = pci_read_config_word(pdev, dvsec + CXL_DVSEC_CTRL2_OFFSET, &val);
> +	if (rc)
> +		return rc;
> +
> +	val |= CXL_DVSEC_INIT_CACHE_WBI;
> +	rc = pci_write_config_word(pdev, dvsec + CXL_DVSEC_CTRL2_OFFSET, val);
> +	if (rc)
> +		return rc;
> +
> +	do {
> +		rc = pci_read_config_word(pdev, dvsec + CXL_DVSEC_STATUS2_OFFSET, &reg);
> +		if (rc)
> +			return rc;
> +	} while (!(reg & CXL_DVSEC_CACHE_INVALID));
> +
> +	return 0;
> +}
> +
> +static int cxl_region_flush_device_caches(struct device *dev, void *data)
> +{
> +	struct cxl_endpoint_decoder *cxled = to_cxl_endpoint_decoder(dev);
> +	struct cxl_region *cxlr = cxled->cxld.region;
> +	struct cxl_region_params *p = &cxlr->params;
> +	struct pci_dev *target_pdev = data;
> +	int i, rc;
> +
> +	if (!is_endpoint_decoder(dev))
> +		return 0;
> +
> +	if (!cxlr || !cxlr->params.res)
> +		return 0;
> +
> +	for (i = 0; i < p->nr_targets; i++) {
> +		struct cxl_endpoint_decoder *target_cxled = p->targets[i];
> +		struct cxl_memdev *target_cxlmd = cxled_to_memdev(target_cxled);
> +		struct cxl_dev_state *target_cxlds = target_cxlmd->cxlds;
> +
> +		if (!target_cxlds || !target_cxlds->pdev)
> +			continue;
> +
> +		if (target_cxlds->pdev != target_pdev)

Seems like target_pdev == NULL is a bug and, if possible, should be checked for
before doing anything in this function, so you could simplify this to

		if (!target_cxlds || target_cxlds->pdev != target_pdev)
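
A userspace sketch of why the separate NULL test on the target's pdev becomes
redundant once target_pdev is validated up front. The struct names and
count_matching_targets() are hypothetical, and returning -EINVAL for a NULL
target_pdev is an assumption, not something the patch does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct pci_dev and struct cxl_dev_state. */
struct fake_pdev {
	int id;
};

struct fake_cxlds {
	struct fake_pdev *pdev;
};

/* Returns how many targets sit on target_pdev, or -EINVAL on a NULL pdev. */
static int count_matching_targets(struct fake_cxlds **targets, int nr,
				  struct fake_pdev *target_pdev)
{
	int i, matches = 0;

	assert(targets != NULL);

	/* Reject the bogus argument once, before touching anything else. */
	if (!target_pdev)
		return -EINVAL;

	for (i = 0; i < nr; i++) {
		struct fake_cxlds *cxlds = targets[i];

		/*
		 * target_pdev is known to be non-NULL here, so a NULL
		 * cxlds->pdev can never compare equal to it. One
		 * comparison covers both the "no pdev" and the
		 * "different pdev" cases.
		 */
		if (!cxlds || cxlds->pdev != target_pdev)
			continue;

		matches++;
	}

	return matches;
}
```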

> +			continue;
> +
> +		rc = cxl_device_cache_wb_invalidate(target_pdev);
> +		if (rc && rc != 1)

As above, I'm not sure the rc == 1 return is helpful.

> +			return rc;
> +	}
> +
> +	return 0;
> +}
> +
> +/**
> + * cxl_reset_prepare_memdev - Prepare CXL device for reset
> + * @pdev: PCI device
> + *
> + * Validates it's safe to reset and tears down regions atomically under lock.
> + * Acquires cxl_region_rwsem and keeps it held throughout reset.

That may need some lockdep annotations. Make sure to run a lockdep build.

> + *
> + * Return: 0 on success (lock held), -EBUSY if memory online, negative on error
> + */
> +static int cxl_reset_prepare_memdev(struct pci_dev *pdev)
> +{

> 


