From: Yuquan Wang <wangyuquan1236@phytium.com.cn>
To: Dan Williams <dan.j.williams@intel.com>, jonathan.cameron@huawei.com
Cc: linux-cxl@vger.kernel.org
Subject: Re: [RFC PATCH] cxl: Support Global Persistent Flush (GPF)
Date: Wed, 11 Dec 2024 18:53:25 +0800
Message-ID: <Z1lvJWHYJFucba5I@phytium.com.cn>
In-Reply-To: <6758d309c9ddb_10a08329414@dwillia2-xfh.jf.intel.com.notmuch>
On Tue, Dec 10, 2024 at 03:47:21PM -0800, Dan Williams wrote:
> Davidlohr Bueso wrote:
>
> Hi Davidlohr, thanks for this:
>
> > Add support for GPF flows. The CXL specification around this is found
> > to be a bit too involved from the driver side. And while this should
> > really all be handled by the hardware, this patch takes things with a
> > grain of salt.
> >
> > - Dirty shutdown is not handled, and puts the responsibility on the
> > Admin to deal with any GPF failure - otherwise the kernel will just
> > keep using the device upon next boot. Hence no SetShutdownState DIRTY
> > upon memdev probe (and no need for clearing upon successful flush).
> >
> > - As such, the driver will only update port timeouts throughout the
> > decode hierarchy, upon device probing and hot-remove. These timeouts
> > can be over-specified, particularly T1. Set the max and rely on
> > devices to minimize GPF response times to avoid the worst case wait
> > times that those timeouts imply.
> >
> > - Energy budgeting is not supported.
>
> The missing detail for me is why are the defaults likely to be broken
> such that the kernel must get involved in setting them, and how does
> the kernel determine they are broken?
>
> It is unfortunate that the specification does not recommend default
> values for this, or even an implementation note about what system
> software is supposed to do with these registers.
>
> Can you say a bit more about the end user visible effects when the
> timeouts are violated? Because the critical enabling from my perspective
> is timeout detection (notifying the sysadmin), not timeout setting.
>
> A policy for setting the timeouts makes sense, but I am not sure that
> the kernel should pick "max" as a default policy. Especially when max is
> 80 seconds which is already longer than lockup detectors would tolerate.
> Likely anything over 30 seconds the kernel should be actively complaining
> about something being broken. If a configuration needs more time that
> likely needs a user policy override mechanism.
>
> This would be consistent with what we did for mailbox initialization
> timeouts. Pick a "kernel reasonable" default and provide an override for
> outlier cases.
Hi list,
Apart from the GPF flows, could we treat CXL pmem as a special nvdimm device
in the kernel, including use of the ADR mechanism? I ask because the cxl_pmem
module bridges the CXL pmem device from the cxl subsystem into the nvdimm
subsystem.
Many Thanks
Yuquan
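For illustration only, the timeout policy suggested in the quoted text above
(pick a "kernel reasonable" default, allow a user override for outliers, and
complain once a timeout exceeds roughly 30 seconds) could be sketched as
follows. This is a minimal Python model, not kernel code, and it does not
touch any CXL register. The function and class names and the 10-second
default are hypothetical; only the 30-second and 80-second figures come from
the thread.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical default, chosen only for this sketch; it is not taken from
# the patch or from the CXL specification.
DEFAULT_TIMEOUT_S = 10.0
# Figures mentioned in the thread: warn above ~30 s, "max" is 80 s.
WARN_THRESHOLD_S = 30.0
HW_MAX_TIMEOUT_S = 80.0


@dataclass
class TimeoutDecision:
    seconds: float   # timeout that would be programmed
    warn: bool       # True if the sysadmin should be notified
    source: str      # "default" or "override"


def pick_gpf_timeout(override_s: Optional[float] = None) -> TimeoutDecision:
    """Return the timeout to use, preferring a user override if given."""
    if override_s is None:
        # No override: fall back to the conservative default.
        return TimeoutDecision(DEFAULT_TIMEOUT_S, False, "default")
    if override_s <= 0:
        raise ValueError("timeout override must be positive")
    # Clamp to the largest value assumed to be representable.
    seconds = min(override_s, HW_MAX_TIMEOUT_S)
    # Anything above the threshold is allowed but flagged.
    return TimeoutDecision(seconds, seconds > WARN_THRESHOLD_S, "override")
```

In this model, pick_gpf_timeout() with no argument returns the default,
while pick_gpf_timeout(45.0) is accepted but flagged for a warning, and a
request above 80 seconds is clamped to 80.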
Thread overview: 3+ messages
2024-12-05 8:21 [RFC PATCH] cxl: Support Global Persistent Flush (GPF) Davidlohr Bueso
2024-12-10 23:47 ` Dan Williams
2024-12-11 10:53 ` Yuquan Wang [this message]