From: Dave Jiang <dave.jiang@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: linux-cxl@vger.kernel.org, alison.schofield@intel.com,
vishal.l.verma@intel.com, bwidawsk@kernel.org,
dan.j.williams@intel.com, shiju.jose@huawei.com,
rrichter@amd.com
Subject: Re: [PATCH RFC v2 9/9] cxl/pci: Add (hopeful) error handling support
Date: Thu, 20 Oct 2022 09:06:50 -0700 [thread overview]
Message-ID: <032c48b4-f1d5-6458-ccde-04ce8b1a7f10@intel.com> (raw)
In-Reply-To: <20221020165203.00002101@huawei.com>
On 10/20/2022 8:52 AM, Jonathan Cameron wrote:
> On Fri, 16 Sep 2022 16:11:45 -0700
> Dave Jiang <dave.jiang@intel.com> wrote:
>
>> From: Dan Williams <dan.j.williams@intel.com>
>>
>> Add nominal error handling that tears down CXL.mem in response to error
>> notifications that imply a device reset. Given some CXL.mem may be
>> operating as System RAM, there is a high likelihood that these error
>> events are fatal. However, if the system survives the notification the
>> expectation is that the driver behavior is equivalent to a hot-unplug
>> and re-plug of an endpoint.
>>
>> Note that this does not change the mask values from the default. That
>> awaits CXL _OSC support to determine whether platform firmware is in
>> control of the mask registers.
> Hi Dave,
>
> So I just implemented correctable error reporting and it never gets
> to the handling in here. My perhaps wrong assumption is that the
> device would use ERR_COR messages to indicate those?
>
> They get to the AER handlers (which print appropriately) but because
> they have been corrected are never reported to the PCIe drivers.
>
> https://elixir.bootlin.com/linux/latest/source/drivers/pci/pcie/aer.c#L956
>
> I guess we will want a hook for those as well so we can log the
> extra info on what the error was when they occur.
Are you suggesting having that function call pdrv->err_handler with
either going through ->error_detected() or a new callback like
->correctable_error_notify()?
>
> Jonathan
next prev parent reply other threads:[~2022-10-20 16:10 UTC|newest]
Thread overview: 34+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-16 23:10 [PATCH RFC v2 0/9] cxl/pci: Add fundamental error handling Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 1/9] cxl/pci: Cleanup repeated code in cxl_probe_regs() helpers Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 2/9] cxl/pci: Cleanup cxl_map_device_regs() Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 3/9] cxl/pci: Kill cxl_map_regs() Dave Jiang
2022-10-18 13:43 ` Jonathan Cameron
2022-09-16 23:11 ` [PATCH RFC v2 4/9] cxl/core/regs: Make cxl_map_{component, device}_regs() device generic Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 5/9] cxl/port: Limit the port driver to just the HDM Decoder Capability Dave Jiang
2022-10-20 16:54 ` Jonathan Cameron
2022-09-16 23:11 ` [PATCH RFC v2 6/9] cxl/pci: Prepare for mapping RAS Capability Structure Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 7/9] cxl/pci: Find and map the " Dave Jiang
2022-09-16 23:11 ` [PATCH RFC v2 8/9] cxl/pci: add tracepoint events for CXL RAS Dave Jiang
2022-10-20 17:02 ` Jonathan Cameron
2022-10-20 17:07 ` Dave Jiang
2022-10-20 17:52 ` Steven Rostedt
2022-09-16 23:11 ` [PATCH RFC v2 9/9] cxl/pci: Add (hopeful) error handling support Dave Jiang
2022-10-20 13:45 ` Jonathan Cameron
2022-10-20 14:50 ` Dave Jiang
2022-10-20 14:03 ` Jonathan Cameron
2022-10-20 14:57 ` Dave Jiang
2022-10-20 15:52 ` Jonathan Cameron
2022-10-20 16:06 ` Dave Jiang [this message]
2022-10-20 16:11 ` Jonathan Cameron
2022-10-11 14:17 ` [PATCH RFC v2 0/9] cxl/pci: Add fundamental error handling Jonathan Cameron
2022-10-11 15:18 ` Dave Jiang
2022-10-11 17:19 ` Jonathan Cameron
2022-10-19 17:30 ` Jonathan Cameron
2022-10-19 17:38 ` Dave Jiang
2022-10-24 16:01 ` Jonathan Cameron
2022-10-25 15:22 ` Dave Jiang
2022-11-03 12:58 ` Jonathan Cameron
2022-11-03 13:27 ` Jonathan Cameron
2022-11-16 23:20 ` Dave Jiang
2022-11-17 13:50 ` Jonathan Cameron
2022-11-18 17:15 ` Dave Jiang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=032c48b4-f1d5-6458-ccde-04ce8b1a7f10@intel.com \
--to=dave.jiang@intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=bwidawsk@kernel.org \
--cc=dan.j.williams@intel.com \
--cc=linux-cxl@vger.kernel.org \
--cc=rrichter@amd.com \
--cc=shiju.jose@huawei.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox