* [PATCH] EDAC/amd64: Add support for ECC on family 19h model 60h-6Fh
@ 2023-04-25 20:12 Hristo Venev
From: Hristo Venev @ 2023-04-25 20:12 UTC
To: Yazen Ghannam; +Cc: Borislav Petkov, linux-edac, Hristo Venev
Ryzen 9 7950X uses model 61h. Treat it as Epyc 9004, but with 2 channels
instead of 12.

I tested this with two 32GB dual-rank DIMMs. The sizes appear to be
reported correctly:

[ 2.122750] EDAC MC0: Giving out device to module amd64_edac controller F19h_M60h: DEV 0000:00:18.3 (INTERRUPT)
[ 2.122751] EDAC amd64: F19h_M60h detected (node 0).
[ 2.122754] EDAC MC: UMC0 chip selects:
[ 2.122754] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 2.122755] EDAC amd64: MC: 2: 16384MB 3: 16384MB
[ 2.122757] EDAC MC: UMC1 chip selects:
[ 2.122757] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 2.122758] EDAC amd64: MC: 2: 16384MB 3: 16384MB
[ 2.122759] AMD64 EDAC driver v3.5.0

ECC errors can also be detected:

[ 313.747594] mce: [Hardware Error]: Machine check events logged
[ 313.747597] [Hardware Error]: Corrected error, no action required.
[ 313.747613] [Hardware Error]: CPU:0 (19:61:2) MC21_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000400011b
[ 313.747632] [Hardware Error]: Error Addr: 0x00000007ff7e93c0
[ 313.747639] [Hardware Error]: IPID: 0x0000009600050f00, Syndrome: 0x000100010a801203
[ 313.747652] [Hardware Error]: Unified Memory Controller Ext. Error Code: 0, DRAM ECC error.
[ 313.747669] EDAC MC0: 1 CE Cannot decode normalized address on mc#0csrow#3channel#0 (csrow:3 channel:0 page:0x0 offset:0x0 grain:64 syndrome:0x1)
[ 313.747672] [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD

Signed-off-by: Hristo Venev <hristo@venev.name>
---
drivers/edac/amd64_edac.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index b55129425c81..1080784e2784 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -3816,6 +3816,10 @@ static int per_family_init(struct amd64_pvt *pvt)
 		case 0x50 ... 0x5f:
 			pvt->ctl_name = "F19h_M50h";
 			break;
+		case 0x60 ... 0x6f:
+			pvt->ctl_name = "F19h_M60h";
+			pvt->flags.zn_regs_v2 = 1;
+			break;
 		case 0xa0 ... 0xaf:
 			pvt->ctl_name = "F19h_MA0h";
 			pvt->max_mcs = 12;
--
2.40.0
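
As an illustration that is not part of the patch: the flag string printed for MC21_STATUS in the log can be reproduced from the raw value 0xdc2040000400011b. The following is a minimal userspace sketch, assuming the MCA_STATUS bit positions used by the kernel's AMD MCE decoding (Over = bit 62, UC = bit 61, MiscV = bit 59, AddrV = bit 58, PCC = bit 57, SyndV = bit 53, CECC/UECC = bits 46/45, extended error code in bits 21:16 on SMCA systems). The macro and function names are hypothetical and exist only for this example.

```c
#include <stdint.h>
#include <stdio.h>

/* Assumed MCA_STATUS bit positions; the names are local to this sketch. */
#define ST_OVER   (1ULL << 62)	/* error overflow */
#define ST_UC     (1ULL << 61)	/* uncorrected error (clear means corrected) */
#define ST_MISCV  (1ULL << 59)	/* MISC register valid */
#define ST_ADDRV  (1ULL << 58)	/* ADDR register valid */
#define ST_PCC    (1ULL << 57)	/* processor context corrupt */
#define ST_SYNDV  (1ULL << 53)	/* SYND register valid */
#define ST_CECC   (1ULL << 46)	/* correctable ECC error */
#define ST_UECC   (1ULL << 45)	/* uncorrectable ECC error */

/* Extended error code, assumed to occupy bits 21:16 on SMCA systems. */
static unsigned int ext_error_code(uint64_t status)
{
	return (unsigned int)((status >> 16) & 0x3f);
}

/* Write a '|'-separated flag summary into buf and return buf. */
static const char *status_flags(uint64_t status, char *buf, size_t len)
{
	const char *ecc = "-";

	if (status & ST_CECC)
		ecc = "CECC";
	else if (status & ST_UECC)
		ecc = "UECC";

	snprintf(buf, len, "%s|%s|%s|%s|%s|%s|%s",
		 (status & ST_OVER)  ? "Over"  : "-",
		 (status & ST_UC)    ? "UE"    : "CE",
		 (status & ST_MISCV) ? "MiscV" : "-",
		 (status & ST_ADDRV) ? "AddrV" : "-",
		 (status & ST_PCC)   ? "PCC"   : "-",
		 (status & ST_SYNDV) ? "SyndV" : "-",
		 ecc);
	return buf;
}
```

With a 64-byte buffer, status_flags(0xdc2040000400011bULL, buf, sizeof(buf)) gives "Over|CE|MiscV|AddrV|-|SyndV|CECC", and ext_error_code() returns 0 for the same value, which agrees with the "Ext. Error Code: 0" line in the log. The sketch covers fewer fields than the kernel prints.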
* Re: [PATCH] EDAC/amd64: Add support for ECC on family 19h model 60h-6Fh
From: Yazen Ghannam @ 2023-05-09 14:53 UTC
To: Hristo Venev, Limonciello, Mario
Cc: yazen.ghannam, Borislav Petkov, linux-edac

On 4/25/23 4:12 PM, Hristo Venev wrote:
> Ryzen 9 7950X uses model 61h. Treat it as Epyc 9004, but with 2 channels
> instead of 12.
[...]
> Signed-off-by: Hristo Venev <hristo@venev.name>

Hi Hristo,

Thank you for the patch. It looks good to me.

Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>

> ---
> drivers/edac/amd64_edac.c | 4 ++++
> 1 file changed, 4 insertions(+)
[...]
> + case 0x60 ... 0x6f:
> + pvt->ctl_name = "F19h_M60h";
> + pvt->flags.zn_regs_v2 = 1;
> + break;

Mario,

Are there other Client models that can leverage this change?

Thanks,
Yazen
* RE: [PATCH] EDAC/amd64: Add support for ECC on family 19h model 60h-6Fh
From: Limonciello, Mario @ 2023-05-10 23:42 UTC
To: Ghannam, Yazen, Hristo Venev
Cc: Borislav Petkov, linux-edac@vger.kernel.org

[AMD Official Use Only - General]

> -----Original Message-----
> From: Ghannam, Yazen <Yazen.Ghannam@amd.com>
> Sent: Tuesday, May 9, 2023 9:53 AM
[...]
> Mario,
>
> Are there other Client models that can leverage this change?

Yes family 0x19 models 0x70... 0x7f can too, thanks!

>
> Thanks,
> Yazen
* Re: [PATCH] EDAC/amd64: Add support for ECC on family 19h model 60h-6Fh
From: Yazen Ghannam @ 2023-05-11 13:02 UTC
To: Limonciello, Mario, Hristo Venev
Cc: yazen.ghannam, Borislav Petkov, linux-edac@vger.kernel.org

On 5/10/23 7:42 PM, Limonciello, Mario wrote:
[...]
>> Mario,
>>
>> Are there other Client models that can leverage this change?
>
> Yes family 0x19 models 0x70... 0x7f can too, thanks!
>

Thanks Mario.

Hristo,
Can you please also add those models?

Thanks,
Yazen
* Re: [PATCH] EDAC/amd64: Add support for ECC on family 19h model 60h-6Fh
From: Hristo Venev @ 2023-05-11 17:45 UTC
To: Yazen Ghannam, Limonciello, Mario
Cc: Borislav Petkov, linux-edac

I'll send the updated patch.

One thing I noticed is that in the ECC error I observed the address was
not decoded successfully. As I don't really have good test
infrastructure (getting the error involved tuning voltages over several
reboots), do you think you could look into it?
* Re: [PATCH] EDAC/amd64: Add support for ECC on family 19h model 60h-6Fh
From: Borislav Petkov @ 2023-05-15 14:27 UTC
To: Hristo Venev
Cc: Yazen Ghannam, Limonciello, Mario, linux-edac

On Thu, May 11, 2023 at 08:45:06PM +0300, Hristo Venev wrote:
> do you think you could look into it?

Yeah, that's being worked on but it'll take a while longer.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
* [PATCH v2] EDAC/amd64: Add support for ECC on family 19h model 60h-7Fh
From: Hristo Venev @ 2023-05-11 17:45 UTC
To: Yazen Ghannam, Limonciello, Mario
Cc: Borislav Petkov, linux-edac, Hristo Venev

Ryzen 9 7950X uses model 61h. Treat it as Epyc 9004, but with 2 channels
instead of 12.

I tested this with two 32GB dual-rank DIMMs. The sizes appear to be
reported correctly:

[ 2.122750] EDAC MC0: Giving out device to module amd64_edac controller F19h_M60h: DEV 0000:00:18.3 (INTERRUPT)
[ 2.122751] EDAC amd64: F19h_M60h detected (node 0).
[ 2.122754] EDAC MC: UMC0 chip selects:
[ 2.122754] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 2.122755] EDAC amd64: MC: 2: 16384MB 3: 16384MB
[ 2.122757] EDAC MC: UMC1 chip selects:
[ 2.122757] EDAC amd64: MC: 0: 0MB 1: 0MB
[ 2.122758] EDAC amd64: MC: 2: 16384MB 3: 16384MB
[ 2.122759] AMD64 EDAC driver v3.5.0

ECC errors can also be detected:

[ 313.747594] mce: [Hardware Error]: Machine check events logged
[ 313.747597] [Hardware Error]: Corrected error, no action required.
[ 313.747613] [Hardware Error]: CPU:0 (19:61:2) MC21_STATUS[Over|CE|MiscV|AddrV|-|-|SyndV|CECC|-|-|-]: 0xdc2040000400011b
[ 313.747632] [Hardware Error]: Error Addr: 0x00000007ff7e93c0
[ 313.747639] [Hardware Error]: IPID: 0x0000009600050f00, Syndrome: 0x000100010a801203
[ 313.747652] [Hardware Error]: Unified Memory Controller Ext. Error Code: 0, DRAM ECC error.
[ 313.747669] EDAC MC0: 1 CE Cannot decode normalized address on mc#0csrow#3channel#0 (csrow:3 channel:0 page:0x0 offset:0x0 grain:64 syndrome:0x1)
[ 313.747672] [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD

According to Mario Limonciello, the same code should also work for
models 70h-7Fh [1].

Link: https://lore.kernel.org/linux-edac/d619252e-35c7-814b-acdb-74714619d62a@amd.com/T/#m9fc20d5dc36074048ec5f1c0a5b01b7f972a1cc7 [1]
Signed-off-by: Hristo Venev <hristo@venev.name>
---
drivers/edac/amd64_edac.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index b55129425c81..c00f7e4ef366 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -3816,6 +3816,14 @@ static int per_family_init(struct amd64_pvt *pvt)
 		case 0x50 ... 0x5f:
 			pvt->ctl_name = "F19h_M50h";
 			break;
+		case 0x60 ... 0x6f:
+			pvt->ctl_name = "F19h_M60h";
+			pvt->flags.zn_regs_v2 = 1;
+			break;
+		case 0x70 ... 0x7f:
+			pvt->ctl_name = "F19h_M70h";
+			pvt->flags.zn_regs_v2 = 1;
+			break;
 		case 0xa0 ... 0xaf:
 			pvt->ctl_name = "F19h_MA0h";
 			pvt->max_mcs = 12;
--
2.40.1
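
As an illustration that is not part of the patch: the effect of the new case ranges can be tried in userspace. The following is a minimal sketch, assuming a default of 2 memory controllers (taken from the "2 channels instead of 12" wording of the commit message) and modelling only the family 19h model ranges visible in the diff. The struct and function names are hypothetical stand-ins, not the driver's own definitions.

```c
#include <stddef.h>

/* Hypothetical stand-in for the struct amd64_pvt fields the diff touches. */
struct pvt_sketch {
	const char *ctl_name;
	int max_mcs;
	int zn_regs_v2;
};

/*
 * Mirror the family 19h model ranges visible in the diff.
 * Returns 0 when the model falls in one of those ranges, -1 otherwise.
 */
static int f19h_model_init(struct pvt_sketch *pvt, unsigned int model)
{
	pvt->ctl_name = NULL;
	pvt->max_mcs = 2;	/* assumption: default of 2, per the commit message */
	pvt->zn_regs_v2 = 0;

	switch (model) {
	case 0x50 ... 0x5f:	/* case ranges are a GNU C extension */
		pvt->ctl_name = "F19h_M50h";
		break;
	case 0x60 ... 0x6f:
		pvt->ctl_name = "F19h_M60h";
		pvt->zn_regs_v2 = 1;
		break;
	case 0x70 ... 0x7f:
		pvt->ctl_name = "F19h_M70h";
		pvt->zn_regs_v2 = 1;
		break;
	case 0xa0 ... 0xaf:
		pvt->ctl_name = "F19h_MA0h";
		pvt->max_mcs = 12;
		/* the rest of this case lies outside the diff context */
		break;
	default:
		return -1;	/* ranges not shown in the diff are not modelled */
	}
	return 0;
}
```

Calling f19h_model_init(&p, 0x61), the model reported for the Ryzen 9 7950X, selects "F19h_M60h" with zn_regs_v2 set and max_mcs left at the assumed default of 2, while 0xa0 selects "F19h_MA0h" with max_mcs raised to 12. The case ranges need GCC or Clang, as in the kernel source.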
* RE: [PATCH v2] EDAC/amd64: Add support for ECC on family 19h model 60h-7Fh
From: Limonciello, Mario @ 2023-05-11 17:58 UTC
To: Hristo Venev, Ghannam, Yazen
Cc: Borislav Petkov, linux-edac@vger.kernel.org

[AMD Official Use Only - General]

[...]
> According to Mario Limonciello, the same code should also work for
> models 70h-7Fh [1].
>
> Link: https://lore.kernel.org/linux-edac/d619252e-35c7-814b-acdb-74714619d62a@amd.com/T/#m9fc20d5dc36074048ec5f1c0a5b01b7f972a1cc7 [1]
> Signed-off-by: Hristo Venev <hristo@venev.name>

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
* Re: [PATCH v2] EDAC/amd64: Add support for ECC on family 19h model 60h-7Fh
From: Borislav Petkov @ 2023-05-15 14:39 UTC
To: Hristo Venev
Cc: Yazen Ghannam, Limonciello, Mario, linux-edac

On Thu, May 11, 2023 at 08:45:07PM +0300, Hristo Venev wrote:
[...]
> ---
> drivers/edac/amd64_edac.c | 8 ++++++++
> 1 file changed, 8 insertions(+)

Applied, thanks.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette