From: Gregory Price <gregory.price@memverge.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Jonathan Cameron via <qemu-devel@nongnu.org>,
Lukas Wunner <lukas@wunner.de>, Michael Tsirkin <mst@redhat.com>,
Ben Widawsky <bwidawsk@kernel.org>,
linux-cxl@vger.kernel.org, linuxarm@huawei.com,
Ira Weiny <ira.weiny@intel.com>,
Gregory Price <gourry.memverge@gmail.com>,
Dan Williams <dan.j.williams@intel.com>
Subject: Re: cxl nvdimm Potential probe ordering issues.
Date: Fri, 20 Jan 2023 00:51:26 -0500 [thread overview]
Message-ID: <Y8or3tSnQm5In17u@memverge.com> (raw)
In-Reply-To: <20230119161711.000057a7@Huawei.com>
On Thu, Jan 19, 2023 at 04:17:11PM +0000, Jonathan Cameron wrote:
>
> Whilst I still have no idea if this is the same problem, I have identified
> what goes wrong if there is a module probe ordering issue.
> https://elixir.bootlin.com/linux/v6.2-rc4/source/drivers/cxl/core/pmem.c#L306
>
> /*
> * The two actions below arrange for @cxl_nvd to be deleted when either
> * the top-level PMEM bridge goes down, or the endpoint device goes
> * through ->remove().
> */
> device_lock(&cxl_nvb->dev);
> if (cxl_nvb->dev.driver)
> rc = devm_add_action_or_reset(&cxl_nvb->dev, cxl_nvd_unregister,
> cxl_nvd);
> else
> // bridge driver not loaded, so we hit this path.
> rc = -ENXIO;
> device_unlock(&cxl_nvb->dev);
>
> if (rc)
> /// and this one
> goto err_alloc;
>
> /* @cxlmd carries a reference on @cxl_nvb until cxlmd_release_nvdimm */
> return devm_add_action_or_reset(&cxlmd->dev, cxlmd_release_nvdimm, cxlmd);
>
> err:
> put_device(dev);
> err_alloc:
> cxlmd->cxl_nvb = NULL;
> cxlmd->cxl_nvd = NULL;
> put_device(&cxl_nvb->dev);
> // whilst we scrub the pointers we don't actually get rid of the
> // cxl_nvd that we registered. Hence later load of the driver tries to
> // attach to that and boom because we've scrubbed these pointers here.
> // A quick hack is to just call device_del(&cxl_nvd->dev) if rc = -ENXIO here.
> // There may well be a races though....
> return rc;
> }
> EXPORT_SYMBOL_NS_GPL(devm_cxl_add_nvdimm, CXL);
>
>
> Of course this "fix" just stops things blowing up, it doesn't leave things
> in a remotely useful state. If it's triggered because someone
> is messing with the load order that's fine. If the same issue
> is occurring for Gregory, not so much.
>
> Jonathan
>
mild hint in the dev_cxl_add_nvdimm_bridge path
driver/cxl/acpi.c
static int cxl_acpi_probe(struct platform_device *pdev)
{
... snip ...
if (IS_ENABLED(CONFIG_CXL_PMEM))
rc = device_for_each_child(&root_port->dev, root_port,
add_root_nvdimm_bridge);
if (rc < 0)
return rc;
/* In case PCI is scanned before ACPI re-trigger memdev attach */
cxl_bus_rescan();
return 0;
}
if PCI is presently written in a way that it's expecting nvdimm_bridge
to be present (via acpi_probe), then clearly this would break.
From the other discussion here... that seems to be the issue? If that's
an issue, I also imagine there are other parts that may be subject to
the same problem.
static int cxl_pmem_region_probe(struct device *dev)
{
struct nd_mapping_desc mappings[CXL_DECODER_MAX_INTERLEAVE];
struct cxl_pmem_region *cxlr_pmem = to_cxl_pmem_region(dev);
struct cxl_region *cxlr = cxlr_pmem->cxlr;
struct cxl_nvdimm_bridge *cxl_nvb = cxlr->cxl_nvb;
this may be unreachable due to prior stack traces, but you get the
point.
Reiterating my confusion a bit: I don't have an nvdimm, why am i getting
an nvdimm_bridge? The reason it no longer appears to trigger on my
memexp example is because it doesnt go down this path:
static int cxl_mem_probe(struct device *dev)
{
... snip ...
// resource size is 0 here due to type3dev->persistent_capacity=0
if (resource_size(&cxlds->pmem_res) && IS_ENABLED(CONFIG_CXL_PMEM)) {
rc = devm_cxl_add_nvdimm(cxlmd);
if (rc == -ENODEV)
dev_info(dev, "PMEM disabled by platform\n");
else
return rc;
}
... snip ...
}
This seems like more than an ordering issue.
next prev parent reply other threads:[~2023-01-20 5:52 UTC|newest]
Thread overview: 71+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-01-11 14:24 [PATCH 0/8] hw/cxl: CXL emulation cleanups and minor fixes for upstream Jonathan Cameron
2023-01-11 14:24 ` Jonathan Cameron via
2023-01-11 14:24 ` [PATCH 1/8] hw/mem/cxl_type3: Improve error handling in realize() Jonathan Cameron
2023-01-11 14:24 ` Jonathan Cameron via
2023-01-11 17:33 ` Ira Weiny
2023-01-11 14:24 ` [PATCH 2/8] hw/pci-bridge/cxl_downstream: Fix type naming mismatch Jonathan Cameron
2023-01-11 14:24 ` Jonathan Cameron via
2023-01-11 14:45 ` Philippe Mathieu-Daudé
2023-01-11 17:38 ` Ira Weiny
2023-01-11 14:24 ` [PATCH 3/8] hw/cxl: set cxl-type3 device type to PCI_CLASS_MEMORY_CXL Jonathan Cameron
2023-01-11 14:24 ` Jonathan Cameron via
2023-01-11 17:41 ` Ira Weiny
2023-01-11 14:24 ` [PATCH 4/8] hw/cxl: Add CXL_CAPACITY_MULTIPLIER definition Jonathan Cameron
2023-01-11 14:24 ` Jonathan Cameron via
2023-01-11 15:48 ` Philippe Mathieu-Daudé
2023-01-11 14:24 ` [PATCH 5/8] hw/i386/acpi: Drop duplicate _UID entry for CXL root bridge Jonathan Cameron
2023-01-11 14:24 ` Jonathan Cameron via
2023-01-11 17:48 ` Ira Weiny
2023-01-11 14:24 ` [PATCH 6/8] qemu/bswap: Add const_le64() Jonathan Cameron
2023-01-11 14:24 ` Jonathan Cameron via
2023-01-11 15:49 ` Philippe Mathieu-Daudé
2023-01-11 16:07 ` Philippe Mathieu-Daudé
2023-01-11 16:33 ` Philippe Mathieu-Daudé
2023-01-11 16:40 ` Philippe Mathieu-Daudé
2023-01-11 16:59 ` Jonathan Cameron
2023-01-11 16:59 ` Jonathan Cameron via
2023-01-11 14:24 ` [PATCH 7/8] qemu/uuid: Add UUID static initializer Jonathan Cameron
2023-01-11 14:24 ` Jonathan Cameron via
2023-01-11 14:24 ` [PATCH 8/8] hw/cxl/mailbox: Use new UUID network order define for cel_uuid Jonathan Cameron
2023-01-11 14:24 ` Jonathan Cameron via
2023-01-11 15:50 ` Philippe Mathieu-Daudé
2023-01-12 15:39 ` [PATCH 0/8] hw/cxl: CXL emulation cleanups and minor fixes for upstream Gregory Price
2023-01-12 17:21 ` Jonathan Cameron
2023-01-12 17:21 ` Jonathan Cameron via
2023-01-12 22:46 ` Gregory Price
2023-01-13 9:12 ` Jonathan Cameron
2023-01-13 9:12 ` Jonathan Cameron via
2023-01-13 14:19 ` Gregory Price
2023-01-13 14:40 ` Jonathan Cameron
2023-01-13 14:40 ` Jonathan Cameron via
2023-01-13 14:45 ` Jonathan Cameron
2023-01-13 14:45 ` Jonathan Cameron via
2023-01-13 15:12 ` Lukas Wunner
2023-01-13 15:42 ` Gregory Price
2023-01-18 19:22 ` Gregory Price
2023-01-18 19:31 ` Gregory Price
2023-01-19 12:42 ` Jonathan Cameron
2023-01-19 12:42 ` Jonathan Cameron via
2023-01-19 15:04 ` cxl nvdimm Potential probe ordering issues Jonathan Cameron
2023-01-19 15:04 ` Jonathan Cameron via
2023-01-19 16:17 ` Jonathan Cameron
2023-01-19 16:17 ` Jonathan Cameron via
2023-01-20 5:51 ` Gregory Price [this message]
2023-01-20 17:26 ` Dan Williams
2023-01-20 4:53 ` Gregory Price
2023-01-20 10:47 ` Jonathan Cameron
2023-01-20 10:47 ` Jonathan Cameron via
2023-01-20 17:38 ` Dan Williams
2023-01-20 21:54 ` Gregory Price
2023-01-20 22:41 ` Dan Williams
2023-01-23 9:44 ` Jonathan Cameron
2023-01-23 9:44 ` Jonathan Cameron via
2023-01-23 18:16 ` Gregory Price
2023-01-19 10:19 ` [PATCH 0/8] hw/cxl: CXL emulation cleanups and minor fixes for upstream Jonathan Cameron
2023-01-19 10:19 ` Jonathan Cameron via
2023-01-19 11:48 ` Michael S. Tsirkin
2023-01-19 12:16 ` Jonathan Cameron
2023-01-19 12:16 ` Jonathan Cameron via
2023-01-19 14:23 ` Gregory Price
2023-01-19 14:20 ` Gregory Price
2023-01-13 14:45 ` Gregory Price
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y8or3tSnQm5In17u@memverge.com \
--to=gregory.price@memverge.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=bwidawsk@kernel.org \
--cc=dan.j.williams@intel.com \
--cc=gourry.memverge@gmail.com \
--cc=ira.weiny@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=linuxarm@huawei.com \
--cc=lukas@wunner.de \
--cc=mst@redhat.com \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.