From: Dan Williams <dan.j.williams@intel.com>
To: Dave Jiang <dave.jiang@intel.com>, <linux-cxl@vger.kernel.org>,
<dan.j.williams@intel.com>
Cc: <ira.weiny@intel.com>, <vishal.l.verma@intel.com>,
<alison.schofield@intel.com>, <Jonathan.Cameron@huawei.com>
Subject: RE: [PATCH v2 2/2] cxl: Move cxl_await_media_ready() to before capacity info retrieval
Date: Thu, 18 May 2023 13:38:37 -0700
Message-ID: <64668ccd4280_682c129483@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <168443110845.2957452.15022248373553807511.stgit@djiang5-mobl3>
Dave Jiang wrote:
> Move cxl_await_media_ready() to cxl_pci probe before driver starts issuing
> IDENTIFY and retrieving memory device information to ensure that the
> device is ready to provide the information.
>
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices")
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>
> ---
> drivers/cxl/pci.c | 6 ++++++
> drivers/cxl/port.c | 6 ------
> 2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
> index ea38bd49b0cf..fc59ca79b2a0 100644
> --- a/drivers/cxl/pci.c
> +++ b/drivers/cxl/pci.c
> @@ -708,6 +708,12 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	if (rc)
>  		dev_dbg(&pdev->dev, "Failed to map RAS capability.\n");
>
> +	rc = cxl_await_media_ready(cxlds);
> +	if (rc) {
> +		dev_err(&pdev->dev, "Media not active (%d)\n", rc);
> +		return rc;
> +	}

So, now that I see this new failure mode here it raises another concern.
I think there is value in still trying to bring up the mailbox interface
even if media-ready never completes. For example, what if you need to
update-firmware to remediate the device?

The mailbox interface does not need the memdev to attach to the cxl_mem
driver to enable command submission, but the question becomes what do
you do once you have fixed up whatever condition caused the media to
fail?

I think the simplest answer is to just require tooling to reload cxl_pci
for that device to re-attempt init. A more complicated answer would be
to teach cxl_mem how to revalidate capacity after-the-fact, but at this
point I think the implementation is stuck with cxl_pci owning capacity
initialization.

So I think this looks like caching the media-ready timeout condition
and teaching all subsequent code that cares about capacity info to fail
and continue. That is: don't issue IDENTIFY, make sure the sysfs capacity
values return 0, and don't let cxl_mem proceed with topology
enumeration.
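
To make the "cache it, then fail and continue" idea concrete, here is a
rough userspace mock of the flow. It is not kernel code and not a proposed
patch: every struct, field, and function name below is hypothetical, with
mock_await_media_ready() standing in for cxl_await_media_ready(), and the
error codes are placeholders.

```c
/*
 * Userspace mock only. All names are hypothetical stand-ins used to
 * illustrate caching the media-ready result; none of them are the real
 * driver interfaces.
 */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct mock_cxl_dev_state {
	bool hw_media_ready;     /* what the device would report */
	uint64_t hw_total_bytes; /* what IDENTIFY would report */
	bool media_ready;        /* cached outcome of the readiness wait */
	bool mbox_ready;         /* mailbox usable, e.g. for update-firmware */
	uint64_t total_bytes;    /* capacity learned from IDENTIFY */
};

/* Stand-in for cxl_await_media_ready(): 0 on success, -ETIMEDOUT otherwise. */
static int mock_await_media_ready(const struct mock_cxl_dev_state *cxlds)
{
	return cxlds->hw_media_ready ? 0 : -ETIMEDOUT;
}

/* Stand-in for IDENTIFY: refuses to run when media never became ready. */
static int mock_identify(struct mock_cxl_dev_state *cxlds)
{
	if (!cxlds->media_ready)
		return -EBUSY;
	cxlds->total_bytes = cxlds->hw_total_bytes;
	return 0;
}

/*
 * Probe path: a media-ready timeout is recorded instead of aborting the
 * probe, so the mailbox still comes up and the device can be remediated.
 */
static int mock_pci_probe(struct mock_cxl_dev_state *cxlds)
{
	cxlds->media_ready = (mock_await_media_ready(cxlds) == 0);
	cxlds->mbox_ready = true;
	cxlds->total_bytes = 0;

	/* Only ask for capacity info when the media is actually ready. */
	if (cxlds->media_ready)
		return mock_identify(cxlds);
	return 0;
}

/* sysfs-style capacity read: reports 0 when media is not ready. */
static uint64_t mock_capacity_show(const struct mock_cxl_dev_state *cxlds)
{
	return cxlds->media_ready ? cxlds->total_bytes : 0;
}

/* Stand-in for the cxl_mem attach step: no topology enumeration. */
static int mock_mem_probe(const struct mock_cxl_dev_state *cxlds)
{
	return cxlds->media_ready ? 0 : -EBUSY;
}
```

With a device whose media never becomes ready, mock_pci_probe() still
returns 0 and leaves the mailbox up, mock_capacity_show() reads 0, and
mock_mem_probe() refuses to attach. Re-attempting init after a fix would
then mean running the probe path again, i.e. reloading cxl_pci for that
device.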