From: Dan Williams <dan.j.williams@intel.com>
To: Li Ming <ming.li@zohomail.com>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Rafael J. Wysocki" <rafael@kernel.org>,
"Danilo Krummrich" <dakr@kernel.org>,
Davidlohr Bueso <dave@stgolabs.net>,
"Jonathan Cameron" <jonathan.cameron@huawei.com>,
Dave Jiang <dave.jiang@intel.com>,
Alison Schofield <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Ira Weiny <ira.weiny@intel.com>,
Dan Williams <dan.j.williams@intel.com>,
Bjorn Helgaas <bhelgaas@google.com>,
Ben Cheatham <benjamin.cheatham@amd.com>
Cc: <driver-core@lists.linux.dev>, <linux-kernel@vger.kernel.org>,
<linux-cxl@vger.kernel.org>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
Li Ming <ming.li@zohomail.com>
Subject: Re: [PATCH 7/7] cxl/port: Reset cxlmd->endpoint to -ENXIO by default
Date: Tue, 10 Mar 2026 12:29:59 -0700 [thread overview]
Message-ID: <69b071373ffca_490a100d@dwillia2-mobl4.notmuch> (raw)
In-Reply-To: <20260310-fix_access_endpoint_without_drv_check-v1-7-94fe919a0b87@zohomail.com>
Li Ming wrote:
> cxlmd->endpoint is set to -ENXIO by default, This intentional invalid
> value ensures that any unintended access to the endpoint will be
> detectable. But CXL driver didn't reset it to the default value when the
> endpoint is released in delete_endpoint().
>
> Besides, cxlmd->endpoint is updated to point to an valid endpoint in
> cxl_port_add(), but if the device_add() in cxl_port_add() fails, the
> endpoint will be released, but cxlmd->endpoint remains pointing to the
> released endpoint, it may introduce a potential use-after-free issue.
>
> Fixes: 3d8be8b398e3 ("cxl: Set cxlmd->endpoint before adding port device")
> Fixes: 29317f8dc6ed ("cxl/mem: Introduce cxl_memdev_attach for CXL-dependent operation")
So I disagree that this is a fix. It may be a consistency change to make
sure that all invalid ->endpoint accesses have the same result, but it
is not a fix for any discernable end user problem.
> Signed-off-by: Li Ming <ming.li@zohomail.com>
> ---
> drivers/cxl/core/port.c | 14 ++++++++++----
> 1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index 0c5957d1d329..ec3cb62b44b7 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -834,11 +834,14 @@ static int cxl_port_add(struct cxl_port *port,
> struct cxl_dport *parent_dport)
> {
> struct device *dev __free(put_device) = &port->dev;
> + struct cxl_memdev *cxlmd = NULL;
> int rc;
>
> if (is_cxl_memdev(port->uport_dev)) {
> - struct cxl_memdev *cxlmd = to_cxl_memdev(port->uport_dev);
> - struct cxl_dev_state *cxlds = cxlmd->cxlds;
> + struct cxl_dev_state *cxlds;
> +
> + cxlmd = to_cxl_memdev(port->uport_dev);
> + cxlds = cxlmd->cxlds;
>
> rc = dev_set_name(dev, "endpoint%d", port->id);
> if (rc)
> @@ -865,8 +868,11 @@ static int cxl_port_add(struct cxl_port *port,
> }
>
> rc = device_add(dev);
> - if (rc)
> + if (rc) {
> + if (cxlmd)
> + cxlmd->endpoint = ERR_PTR(-ENXIO);
I see no need for this. All of the upper levels paths ensure that all
the results of setup are torn down on failure. Consider that if
device_add() succeeds and any of the follow on functions in
__devm_cxl_add_port() then this cleanup will also not happen.
So you must rely on upper level cleanup. This is similar to the
semantics of dev_set_drvdata(). That is only reset up after _probe()
finishes failing.
Now, if you want to do some separate cleanups in this space I would redo
the anti-patterns of scope-based cleanup in cxl_port_alloc() and
cxl_port_add(). cxl_port_alloc() does the "__free = NULL" anti-pattern.
It appears to be doing that to avoid adding a DEFINE_FREE() for ida
cleanup if the port allocation fails.
Also, if the cxl_port_alloc() call became something like:
struct cxl_port *port __free(put_cxl_port) = cxl_port_alloc(...)
...then I do not think cxl_port_add() would need to do the odd
__free(put_device()) initialization without a corresponding
get_device(). Nor the odd "dev = NULL" at the end.
> return rc;
> + }
>
> /* Inhibit the cleanup function invoked */
> dev = NULL;
> @@ -1425,7 +1431,7 @@ static void delete_endpoint(void *data)
> devm_release_action(host, cxl_unlink_uport, endpoint);
> devm_release_action(host, unregister_port, endpoint);
> }
> - cxlmd->endpoint = NULL;
> + cxlmd->endpoint = ERR_PTR(-ENXIO);
I also see no need for this. Either dereferencing NULL or ERR_PTR() will
result in what is needed, a kernel crash. How the use after-free crash
is signalled does not really matter. The only expectation that other
code has is that "->endpoint != NULL" does not mean ->endpoint is valid.
Nothing should care about that though because cxlmd->dev.driver is the
authoritative gate for ->endpoint validity.
next prev parent reply other threads:[~2026-03-10 19:30 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-10 15:57 [PATCH 0/7] cxl: Consolidate cxlmd->endpoint accessing Li Ming
2026-03-10 15:57 ` [PATCH 1/7] driver core: Add conditional guard support for device_lock() Li Ming
2026-03-10 17:45 ` Dave Jiang
2026-03-10 18:06 ` Danilo Krummrich
2026-03-10 18:09 ` Dave Jiang
2026-03-10 18:39 ` Dan Williams
2026-03-10 19:17 ` Danilo Krummrich
2026-03-10 20:37 ` Dan Williams
2026-03-10 20:41 ` Danilo Krummrich
2026-03-12 14:35 ` Greg Kroah-Hartman
2026-03-14 15:14 ` Danilo Krummrich
2026-03-10 15:57 ` [PATCH 2/7] cxl/memdev: Hold memdev lock during memdev poison injection/clear Li Ming
2026-03-10 17:53 ` Dave Jiang
2026-03-10 19:29 ` Alison Schofield
2026-03-10 21:34 ` Alison Schofield
2026-03-11 10:53 ` Li Ming
2026-03-12 4:05 ` Alison Schofield
2026-03-12 10:45 ` Li Ming
2026-03-10 15:57 ` [PATCH 3/7] cxl/region: Hold memdev lock during region " Li Ming
2026-03-10 19:54 ` Dan Williams
2026-03-10 21:57 ` Alison Schofield
2026-03-11 11:10 ` Li Ming
2026-03-17 2:10 ` Dan Williams
2026-03-10 15:57 ` [PATCH 4/7] cxl/pci: Hold memdev lock in cxl_event_trace_record() Li Ming
2026-03-10 19:33 ` Dan Williams
2026-03-11 11:11 ` Li Ming
2026-03-10 20:52 ` Dave Jiang
2026-03-11 11:12 ` Li Ming
2026-03-10 15:57 ` [PATCH 5/7] cxl/region: Ensure endpoint is valid in cxl_dpa_to_region() Li Ming
2026-03-10 20:53 ` Dave Jiang
2026-03-10 15:57 ` [PATCH 6/7] cxl/pci: Check memdev driver binding status in cxl_reset_done() Li Ming
2026-03-10 19:31 ` Dan Williams
2026-03-10 20:50 ` Dave Jiang
2026-03-10 15:57 ` [PATCH 7/7] cxl/port: Reset cxlmd->endpoint to -ENXIO by default Li Ming
2026-03-10 19:29 ` Dan Williams [this message]
2026-03-11 12:14 ` Li Ming
2026-03-10 19:20 ` [PATCH 0/7] cxl: Consolidate cxlmd->endpoint accessing Alison Schofield
2026-03-11 10:41 ` Li Ming
2026-03-10 20:33 ` Dan Williams
2026-03-11 10:44 ` Li Ming
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=69b071373ffca_490a100d@dwillia2-mobl4.notmuch \
--to=dan.j.williams@intel.com \
--cc=alison.schofield@intel.com \
--cc=benjamin.cheatham@amd.com \
--cc=bhelgaas@google.com \
--cc=dakr@kernel.org \
--cc=dave.jiang@intel.com \
--cc=dave@stgolabs.net \
--cc=driver-core@lists.linux.dev \
--cc=gregkh@linuxfoundation.org \
--cc=ira.weiny@intel.com \
--cc=jonathan.cameron@huawei.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ming.li@zohomail.com \
--cc=rafael@kernel.org \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox