From: Naveen Krishna Chatradhi <nchatrad@amd.com>
To: <linux-edac@vger.kernel.org>, <x86@kernel.org>
Cc: <linux-kernel@vger.kernel.org>, <bp@alien8.de>,
<mingo@redhat.com>, <mchehab@kernel.org>, <yazen.ghannam@amd.com>,
Muralidhara M K <muralimk@amd.com>
Subject: [PATCH v7 00/12] x86/edac/amd64: Add support for GPU nodes
Date: Thu, 3 Feb 2022 11:49:30 -0600
Message-ID: <20220203174942.31630-1-nchatrad@amd.com>
From: Muralidhara M K <muralimk@amd.com>
On heterogeneous systems made up of AMD CPUs and GPUs, the data
fabrics of the CPUs and GPUs are connected directly via custom links.
UMC MCA banks on GPUs can be viewed similarly to the UMC banks on the CPUs.
Hence, memory errors on GPU UMCs can be reported via the EDAC framework.
This patchset applies on top of the following series:
[v4,00/24] AMD MCA Address Translation Updates
https://patchwork.kernel.org/project/linux-edac/cover/20220127204115.384161-1-yazen.ghannam@amd.com/
Each patch was build tested individually. The entire set was
tested for address translation and error counts on GPU
memory.
This patchset does the following:
1. edac.rst:
a. Add documentation for heterogeneous systems
2. amd_nb.c:
a. Add support for northbridges on Aldebaran GPU nodes
b. Export AMD node map details to be used by the edac and mce modules
3. mce_amd module:
a. Identify the node ID where the error occurred and map it
to the Linux-enumerated node ID.
4. amd64_edac module:
a. Refactor the code, define new family ops routines and use
struct amd64_pvt, making struct fam_type obsolete.
b. Enumerate UMCs and HBMs on the GPU nodes
5. DF3.5 address translation support:
a. Support Data Fabric 3.5 address translation
b. Add fixed UMC to CS mapping for errors
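As a rough illustration of item 3 (not code taken from the patches), the
node ID lookup might be sketched as below. The bit position of the node ID
field in MCA_IPID, the GPU_NODE_START value, and both helper names are
assumptions made for this example only; the real field layout and mapping
are defined by the patches themselves.

```c
#include <stdint.h>

/* Assumption: the hardware node ID is a 4-bit field at bits [47:44] of MCA_IPID. */
#define IPID_NODE_SHIFT	44
#define IPID_NODE_MASK	0xFULL

/* Hypothetical: first hardware node ID assigned to GPU nodes. */
#define GPU_NODE_START	8

/* Extract the hardware node ID from a raw MCA_IPID value. */
static inline unsigned int ipid_to_hw_node(uint64_t ipid)
{
	return (unsigned int)((ipid >> IPID_NODE_SHIFT) & IPID_NODE_MASK);
}

/*
 * Map a hardware node ID to a Linux-enumerated node ID, assuming GPU
 * nodes are enumerated directly after the CPU nodes. CPU node IDs are
 * passed through unchanged.
 */
static inline unsigned int hw_node_to_linux_node(unsigned int hw_node,
						 unsigned int num_cpu_nodes)
{
	if (hw_node < GPU_NODE_START)
		return hw_node;

	return hw_node - GPU_NODE_START + num_cpu_nodes;
}
```

Under these assumptions, on a system with two CPU nodes, an error whose
MCA_IPID carries hardware node ID 9 would be reported against Linux node 3.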
Muralidhara M K (6):
EDAC/amd64: edac.rst: Add Doc support for heterogeneous systems
x86/amd_nb: Add support for northbridges on Aldebaran
EDAC/amd64: Move struct fam_type variables into amd64_pvt structure
EDAC/amd64: Define dynamic family ops routines
EDAC/amd64: Add AMD heterogeneous family 19h Model 30h-3fh
EDAC/amd64: Add address translation support for DF3.5
Naveen Krishna Chatradhi (3):
EDAC/mce_amd: Extract node id from MCA_IPID
EDAC/amd64: Enumerate Aldebaran GPU nodes by adding family ops
EDAC/amd64: Add Family ops to update GPU csrow and channel info
Yazen Ghannam (3):
EDAC/amd64: Add check for when to add DRAM base and hole
EDAC/amd64: Save the number of block instances
EDAC/amd64: Add fixed UMC to CS mapping
Documentation/driver-api/edac.rst | 9 +
arch/x86/include/asm/amd_nb.h | 9 +
arch/x86/kernel/amd_nb.c | 149 ++-
drivers/edac/amd64_edac.c | 1450 ++++++++++++++++++++---------
drivers/edac/amd64_edac.h | 203 +++-
drivers/edac/mce_amd.c | 23 +-
include/linux/pci_ids.h | 1 +
7 files changed, 1345 insertions(+), 499 deletions(-)
--
2.25.1
Thread overview: 22+ messages
2022-02-03 17:49 Naveen Krishna Chatradhi [this message]
2022-02-03 17:49 ` [PATCH v7 01/12] EDAC/amd64: Document heterogeneous enumeration Naveen Krishna Chatradhi
2022-02-09 22:34 ` Yazen Ghannam
2022-02-03 17:49 ` [PATCH v7 02/12] x86/amd_nb: Add support for northbridges on Aldebaran Naveen Krishna Chatradhi
2022-02-09 23:23 ` Yazen Ghannam
2022-02-03 17:49 ` [PATCH v7 03/12] EDAC/mce_amd: Extract node id from MCA_IPID Naveen Krishna Chatradhi
2022-02-09 23:31 ` Yazen Ghannam
2022-02-14 17:54 ` Chatradhi, Naveen Krishna
2022-02-03 17:49 ` [PATCH v7 04/12] EDAC/amd64: Move struct fam_type variables into amd64_pvt structure Naveen Krishna Chatradhi
2022-02-15 15:39 ` Yazen Ghannam
2022-02-03 17:49 ` [PATCH v7 05/12] EDAC/amd64: Define dynamic family ops routines Naveen Krishna Chatradhi
2022-02-15 15:49 ` Yazen Ghannam
2022-02-03 17:49 ` [PATCH v7 06/12] EDAC/amd64: Add AMD heterogeneous family 19h Model 30h-3fh Naveen Krishna Chatradhi
2022-02-15 16:20 ` Yazen Ghannam
2022-02-03 17:49 ` [PATCH v7 07/12] EDAC/amd64: Enumerate Aldebaran GPU nodes by adding family ops Naveen Krishna Chatradhi
2022-02-15 16:34 ` Yazen Ghannam
2022-02-03 17:49 ` [PATCH v7 08/12] EDAC/amd64: Add Family ops to update GPU csrow and channel info Naveen Krishna Chatradhi
2022-02-15 16:43 ` Yazen Ghannam
2022-02-03 17:49 ` [PATCH v7 09/12] EDAC/amd64: Add check for when to add DRAM base and hole Naveen Krishna Chatradhi
2022-02-03 17:49 ` [PATCH v7 10/12] EDAC/amd64: Save the number of block instances Naveen Krishna Chatradhi
2022-02-03 17:49 ` [PATCH v7 11/12] EDAC/amd64: Add address translation support for DF3.5 Naveen Krishna Chatradhi
2022-02-03 17:49 ` [PATCH v7 12/12] EDAC/amd64: Add fixed UMC to CS mapping Naveen Krishna Chatradhi