From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: "Parthasarathy, Mohan (HPC/AI and Labs)" <mohan_parthasarathy@hpe.com>
Cc: "linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
<shiju.jose@huawei.com>
Subject: Re: CXL RAS flows on Linux
Date: Mon, 31 Jul 2023 12:35:35 +0100
Message-ID: <20230731123535.00002d5c@Huawei.com>
In-Reply-To: <PH7PR84MB1582BBE2FB47C45DC4BFE78D8805A@PH7PR84MB1582.NAMPRD84.PROD.OUTLOOK.COM>
On Mon, 31 Jul 2023 06:06:27 +0000
"Parthasarathy, Mohan (HPC/AI and Labs)" <mohan_parthasarathy@hpe.com> wrote:
> Hi all,
Hi Mohan,
Great to have more interest in this aspect.
>
> I am very interested in the RAS enablement for CXL on Linux. Is there a RAS project for CXL/Linux ?
I'm not aware of a separate project, just a mixture of work in the kernel and standard
tools such as rasdaemon. You'll find all the relevant stuff in the linux-cxl archive:
https://lore.kernel.org/linux-cxl/
We are definitely only part of the way there for RAS flows. We have reporting, but beyond
that there is a lot of work still to do.
>
> 1) Do we have a design specification somewhere on the RAS interfaces on Linux for CXL that I can read on this ? Any document describing the correctable and uncorrectable error flows for CXL.mem?
From the Linux side of things I'm not aware of any public docs (there will be various internal
ones in the companies that are contributing).
> 2) Are there any error injections tests and testcases that I can experiment with to see the RAS flows with CXL on Linux, using QEMU ?
The infrastructure is there but we don't have any automated scripted flows yet. Note we got
some of this stuff upstream only recently, so you will want to build directly from the QEMU
master branch or wait for the next QEMU release in a few weeks' time. My staging branch at
gitlab.com/jic23/qemu (cxl-<date>, whatever the latest date available is) runs ahead of that
for features, but I don't think we have much RAS stuff in the queue currently.
https://gitlab.com/jic23/qemu/-/commits/cxl-2023-07-17/
Documentation is lagging as well, so most of the instructions are in the commit messages,
e.g. for poison:
https://gitlab.com/qemu-project/qemu/-/commits/master/hw/cxl
For DRAM event records etc.:
https://lore.kernel.org/linux-cxl/20230530133603.16934-1-Jonathan.Cameron@huawei.com/
Similar for uncorrectable and correctable error injection. Those have been in for a while,
so the easiest thing is to look at the QAPI JSON file for CXL:
https://elixir.bootlin.com/qemu/v8.1.0-rc1/source/qapi/cxl.json
For now upstream rasdaemon support is lagging, though you can see the RAS events
are there and there is a pull request for the various event-queue-based reports:
https://github.com/mchehab/rasdaemon/commits/master
https://github.com/mchehab/rasdaemon/pull/104
Injection is all done via the QMP interface qemu provides.
I've not used it but I gather https://github.com/pmem/run_qemu is useful
for bringing up suitable qemu configs to poke.
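As a rough illustration of the QMP injection flow, here is a minimal Python sketch.
It assumes QEMU was started with a QMP unix socket (something like
-qmp unix:/tmp/qmp-sock,server=on,wait=off) and that the CXL type 3 device has
id cxl-pmem0. The socket path, the device id and the start/length values are
assumptions for illustration, not something stated in this thread. The command names
and arguments follow qapi/cxl.json, but check them against the QEMU version you
build. The helper names (build_poison_cmd, build_cor_error_cmd, qmp_run) are made up
for this example.

```python
import json
import socket


def build_poison_cmd(path, start, length):
    """Build a cxl-inject-poison QMP command (arguments as in qapi/cxl.json)."""
    return {
        "execute": "cxl-inject-poison",
        "arguments": {"path": path, "start": start, "length": length},
    }


def build_cor_error_cmd(path, err_type):
    """Build a cxl-inject-correctable-error QMP command.

    err_type is one of the CxlCorErrorType names from qapi/cxl.json,
    for example "physical".
    """
    return {
        "execute": "cxl-inject-correctable-error",
        "arguments": {"path": path, "type": err_type},
    }


def qmp_run(sock_path, commands):
    """Send commands over a QMP unix socket and return one reply per command.

    sock_path is whatever was passed to -qmp unix:... (an assumption here).
    """
    replies = []
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(sock_path)
        with s.makefile("rw", encoding="utf-8") as f:
            json.loads(f.readline())  # QMP greeting banner
            # Capability negotiation must come first, then the real commands.
            for cmd in [{"execute": "qmp_capabilities"}] + list(commands):
                f.write(json.dumps(cmd) + "\n")
                f.flush()
                while True:
                    line = f.readline()
                    if not line:
                        raise ConnectionError("QMP socket closed unexpectedly")
                    msg = json.loads(line)
                    # Skip asynchronous events; keep only command replies.
                    if "return" in msg or "error" in msg:
                        replies.append(msg)
                        break
    return replies[1:]  # drop the reply to qmp_capabilities
```

With a guest running, a call such as
qmp_run("/tmp/qmp-sock", [build_poison_cmd("/machine/peripheral/cxl-pmem0", 2048, 256)])
should give back one reply per command. An "error" reply usually means the
path or arguments do not match the running machine.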
If no events are coming through, check that the internal errors aren't
masked in AER as I don't think we've fully resolved how to control
that masking in the kernel yet.
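To check the masking mentioned above by hand, one option is to read the two AER
mask registers of the device and test the internal error bits. The sketch below
only decodes register values that you have already read out. Reading them with
setpci from pciutils (for example via the ECAP_AER+0x08.L and ECAP_AER+0x14.L
register names) is an assumption to verify against your pciutils version. The
offsets and bit positions follow the PCI_ERR_* definitions in the kernel's
include/uapi/linux/pci_regs.h; the function name is hypothetical.

```python
# Offsets within the AER extended capability (from pci_regs.h).
PCI_ERR_UNCOR_MASK = 0x08  # Uncorrectable Error Mask register
PCI_ERR_COR_MASK = 0x14    # Correctable Error Mask register

# Internal error bits in those registers.
PCI_ERR_UNC_INTN = 0x00400000      # Uncorrectable Internal Error (bit 22)
PCI_ERR_COR_INTERNAL = 0x00004000  # Corrected Internal Error (bit 14)


def internal_errors_masked(uncor_mask, cor_mask):
    """Report whether the internal error bits are set in the AER mask registers.

    uncor_mask and cor_mask are the raw 32-bit register values. A set bit
    means that error type is masked and will not be reported.
    """
    return {
        "uncorrectable_internal": bool(uncor_mask & PCI_ERR_UNC_INTN),
        "corrected_internal": bool(cor_mask & PCI_ERR_COR_INTERNAL),
    }
```

If either entry comes back True, injected internal errors of that kind will
not reach the kernel until the corresponding mask bit is cleared.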
Let me know how you get on. We should document this stuff better but
as ever there are too many things on the todo list :(
Jonathan
>
> Any pointers for both would be very much appreciated.
>
> Thanks and Regards,
> Mohan