From: Dan Williams <dan.j.williams@intel.com>
To: Shiyang Ruan <ruansy.fnst@fujitsu.com>,
Dan Williams <dan.j.williams@intel.com>
Cc: <linux-cxl@vger.kernel.org>, <qemu-devel@nongnu.org>,
<Jonathan.Cameron@huawei.com>, <dave@stgolabs.net>,
<ira.weiny@intel.com>, <alison.schofield@intel.com>
Subject: Re: [PATCH v3 2/2] cxl/core: add poison creation event handler
Date: Tue, 21 May 2024 23:45:35 -0700
Message-ID: <664d948fb86f0_e8be294f8@dwillia2-mobl3.amr.corp.intel.com.notmuch>
In-Reply-To: <b14ed74c-7fc5-461a-9c5f-fbb94df50e7d@fujitsu.com>
Shiyang Ruan wrote:
[..]
> >> My expectation is MF_ACTION_REQUIRED is not appropriate for CXL event
> >> reported errors since action is only required for direct consumption
> >> events and those need not be reported through the device event queue.
> > Got it.
>
> I'm not very sure about 'Host write/read' type. In my opinion, these
> two types of event should be sent from device when CPU is accessing a
> bad memory address, they could be thought of a sync event which needs
Hmm, no that's not my understanding of a sync event. I expect when error
notifications are synchronous the CPU is guaranteed not to make forward
progress past the point of encountering the error. MSI-signaled
component-events are always asynchronous by that definition because the
CPU is free running while the interrupt is in-flight.
> the 'MF_ACTION_REQUIRED' flag. Then, we can determine the flag by the
> types like this:
> - CXL_EVENT_TRANSACTION_READ | CXL_EVENT_TRANSACTION_WRITE
> => MF_ACTION_REQUIRED
> - CXL_EVENT_TRANSACTION_INJECT_POISON => MF_SW_SIMULATED
> - others => 0
I doubt any reasonable policy can be inferred from the transaction type.
Consider that the CPU itself does not take a synchronous exception when
writes encounter poison. At most those are flagged via CMCI
(corrected machine check interrupt). The only events that cause
exceptions are CPU reads that consume poison. The device has no idea
whether read events are coming from a CPU or a DMA event.
MF_SW_SIMULATED is purely for software simulated poison events as
injected poison can still cause system fatal damage if the poison is
ingested in an unrecoverable path.
So, I think all CXL poison notification events should trigger an
action-optional memory_failure(), i.e. one called without
MF_ACTION_REQUIRED. I expect this needs to make sure that
duplicates are not a problem. That is, in the case of CPU consumption of CXL
poison, that causes a synchronous MF_ACTION_REQUIRED event via the MCE
path *and* it may trigger the device to send an error record for the
same page. As far as I can see, duplicate reports (MCE + CXL device) are
unavoidable.
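
To make the policy above concrete, here is a minimal userspace sketch, not
kernel code. It assumes three report sources (MCE consumption, a CXL device
event record, and purely software-simulated poison) and models a
memory_failure()-like entry point that tolerates a second report for the same
page. Every name prefixed with sketch_ or SKETCH_ is hypothetical and invented
for illustration; the flag values are placeholders, not the kernel's MF_*
values, and the duplicate handling does not claim to mirror what
mm/memory-failure.c actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's MF_* flags (values are placeholders). */
#define SKETCH_MF_ACTION_REQUIRED (1 << 0)
#define SKETCH_MF_SW_SIMULATED    (1 << 1)

/* Where a poison report came from (assumed taxonomy, not a kernel enum). */
enum sketch_source {
	SKETCH_SRC_MCE_CONSUMPTION,  /* CPU read consumed poison: synchronous */
	SKETCH_SRC_CXL_EVENT_RECORD, /* device event queue: always asynchronous */
	SKETCH_SRC_SW_SIMULATED,     /* no real poison exists in the media */
};

/*
 * Pick flags from the report source only. The CXL transaction type
 * (read, write, inject) is deliberately not an input: every device
 * event record, including one for injected poison, is action optional.
 */
static int sketch_mf_flags(enum sketch_source src)
{
	switch (src) {
	case SKETCH_SRC_MCE_CONSUMPTION:
		return SKETCH_MF_ACTION_REQUIRED;
	case SKETCH_SRC_SW_SIMULATED:
		return SKETCH_MF_SW_SIMULATED;
	case SKETCH_SRC_CXL_EVENT_RECORD:
	default:
		return 0;
	}
}

#define SKETCH_NR_PAGES 64

/* Toy model of per-page poison state for a tiny PFN range. */
struct sketch_mem {
	bool poisoned[SKETCH_NR_PAGES];
};

/*
 * Record poison for @pfn. Returns true for the first report and false
 * for a duplicate, so an MCE report followed by a CXL event record for
 * the same page is handled once and the repeat is harmless.
 */
static bool sketch_memory_failure(struct sketch_mem *m, unsigned long pfn,
				  int flags)
{
	assert(m != NULL && pfn < SKETCH_NR_PAGES);
	(void)flags; /* a real handler would act on the flags here */

	if (m->poisoned[pfn])
		return false;
	m->poisoned[pfn] = true;
	return true;
}
```

In this model, a consumed-poison page would be reported once with
SKETCH_MF_ACTION_REQUIRED from the MCE path, and the later device event record
for the same PFN would arrive with flags 0 and be recognized as a duplicate.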