From: Aravind Iddamsetty <aravind.iddamsetty@linux.intel.com>
To: Hawking.Zhang@amd.com, tao.zhou1@amd.com, Xiang Liu <xiang.liu@amd.com>
Cc: amd-gfx@lists.freedesktop.org,
"airlied@gmail.com" <airlied@gmail.com>,
"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
"alexander.deucher@amd.com" <alexander.deucher@amd.com>,
ckoenig.leichtzumerken@gmail.com,
"joonas.lahtinen@linux.intel.com"
<joonas.lahtinen@linux.intel.com>,
simona@ffwll.ch
Subject: Re: [PATCH v2 00/12] Generate CPER records for RAS and commit to CPER ring
Date: Fri, 28 Mar 2025 15:57:56 +0530
Message-ID: <fd1e4ddf-f123-4e72-beb8-1308bf7c32ab@linux.intel.com>
In-Reply-To: <cover.1739519672.git.xiang.liu@amd.com>
Hi,
Based on the discussions around using Netlink for RAS purposes, as
summarized in this blog post [1] by Dave Airlie, I had proposed a series
regarding RAS infrastructure in DRM [2].
I came across your work, which appears to address related areas, and I'm
particularly interested in understanding how it aligns with or could be
adapted to the ongoing discussions around leveraging Netlink for RAS.
Could you share your perspective on the potential integration of your
efforts with Netlink? Do you foresee any challenges or opportunities in
aligning with the approach discussed in the above-mentioned blog post
and series?
Looking forward to your insights and any additional thoughts you may
have on this topic.
[1]
https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
[2]
https://lore.kernel.org/all/20231020155835.1295524-1-aravind.iddamsetty@linux.intel.com/
Thanks,
Aravind.
On 14-02-2025 13:37, Xiang Liu wrote:
> This patch series generates RAS CPER records for UE/DE/CE/BP threshold-exceeded
> events. SMU_TYPE_CE banks, which may be CEs, DEs, or both, are combined into
> one CPER entry. UEs and BPs are encoded into separate CPER entries.
>
> RAS CPER records for CEs are generated only after the CE count has been queried.
>
> All records are committed to a pure software ring of limited size; on overflow,
> new records flush out older records. Users can access the records by reading a
> debugfs node, which is read-only.
>
> Hawking Zhang (5):
> drm/amd/include: Add amd cper header
> drm/amdgpu: Introduce funcs for populating CPER
> drm/amdgpu: Include ACA error type in aca bank
> drm/amdgpu: Introduce funcs for generating cper record
> drm/amdgpu: Generate cper records
>
> Tao Zhou (4):
> drm/amdgpu: add RAS CPER ring buffer
> drm/amdgpu: read CPER ring via debugfs
> drm/amdgpu: add data write function for CPER ring
> drm/amdgpu: add mutex lock for cper ring
>
> Xiang Liu (3):
> drm/amdgpu: Get timestamp from system time
> drm/amdgpu: Commit CPER entry
> drm/amdgpu: Generate bad page threshold cper records
>
> drivers/gpu/drm/amd/amdgpu/Makefile | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu.h | 4 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_aca.c | 46 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_aca.h | 16 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c | 559 +++++++++++++++++++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_cper.h | 104 ++++
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.c | 91 +++-
> drivers/gpu/drm/amd/amdgpu/amdgpu_ring.h | 1 +
> drivers/gpu/drm/amd/amdgpu/amdgpu_xgmi.c | 2 +
> drivers/gpu/drm/amd/amdgpu/aqua_vanjaram.c | 3 +-
> drivers/gpu/drm/amd/amdgpu/gfx_v9_4_3.c | 2 +
> drivers/gpu/drm/amd/amdgpu/jpeg_v4_0_3.c | 2 +
> drivers/gpu/drm/amd/amdgpu/mmhub_v1_8.c | 2 +
> drivers/gpu/drm/amd/amdgpu/sdma_v4_4_2.c | 2 +
> drivers/gpu/drm/amd/amdgpu/umc_v12_0.c | 1 +
> drivers/gpu/drm/amd/amdgpu/vcn_v4_0_3.c | 2 +
> drivers/gpu/drm/amd/include/amd_cper.h | 269 ++++++++++
> drivers/gpu/drm/amd/pm/amdgpu_dpm.c | 3 +
> 19 files changed, 1075 insertions(+), 40 deletions(-)
> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_cper.c
> create mode 100644 drivers/gpu/drm/amd/amdgpu/amdgpu_cper.h
> create mode 100644 drivers/gpu/drm/amd/include/amd_cper.h
>