* [PATCH 00/16] AMD NB and SMN rework
@ 2024-10-23 17:21 Yazen Ghannam
2024-10-23 17:21 ` [PATCH 01/16] x86/mce/amd: Remove shared threshold bank plumbing Yazen Ghannam
` (16 more replies)
0 siblings, 17 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
Hi all,
The theme of this set is decoupling the "AMD node" concept from the
legacy northbridge support.
Additionally, AMD System Management Network (SMN) access code is
decoupled and expanded too.
Patches 1-3 begin reducing the scope of AMD_NB.
Patches 4-9 begin moving generic AMD node support out of AMD_NB.
Patches 10-13 move SMN support out of AMD_NB and do some refactoring.
Patch 14 has HSMP reuse SMN functionality.
Patches 15-16 address userspace access to SMN.
I say "begin" above because there is more to do here. Ultimately, AMD_NB
should only be needed for code used on legacy systems with northbridges.
Also, any and all SMN users in the kernel need to be updated to use the
central SMN code. Local solutions should be avoided.
Thanks,
Yazen
Mario Limonciello (4):
x86/amd_nb, x86/amd_node: Simplify amd_pci_dev_to_node_id()
x86/amd_nb: Move SMN access code to a new amd_smn driver
x86/amd_smn: Add SMN offsets to exclusive region access
x86/amd_smn: Add support for debugfs access to SMN registers
Yazen Ghannam (12):
x86/mce/amd: Remove shared threshold bank plumbing
x86/amd_nb: Restrict init function to AMD-based systems
x86/amd_nb: Clean up early_is_amd_nb()
x86: Start moving AMD Node functionality out of AMD_NB
x86/amd_nb: Simplify function 4 search
x86/amd_nb: Simplify root device search
x86/amd_nb: Use topology info to get AMD node count
x86/amd_nb: Simplify function 3 search
x86/amd_smn: Fixup __amd_smn_rw()
x86/amd_smn: Remove dependency on AMD_NB
x86/amd_smn: Use defines for register offsets
x86/amd_smn, platform/x86/amd/hsmp: Have HSMP use SMN
MAINTAINERS | 15 ++
arch/x86/Kconfig | 9 +-
arch/x86/include/asm/amd_nb.h | 53 +----
arch/x86/include/asm/amd_node.h | 39 ++++
arch/x86/include/asm/amd_smn.h | 14 ++
arch/x86/kernel/Makefile | 2 +
arch/x86/kernel/amd_nb.c | 294 ++-------------------------
arch/x86/kernel/amd_node.c | 91 +++++++++
arch/x86/kernel/amd_smn.c | 269 ++++++++++++++++++++++++
arch/x86/kernel/cpu/mce/amd.c | 127 +++---------
arch/x86/pci/fixup.c | 4 +-
drivers/edac/Kconfig | 1 +
drivers/edac/amd64_edac.c | 1 +
drivers/hwmon/Kconfig | 2 +-
drivers/hwmon/k10temp.c | 2 +-
drivers/platform/x86/amd/Kconfig | 2 +-
drivers/platform/x86/amd/hsmp.c | 32 +--
drivers/platform/x86/amd/pmc/Kconfig | 2 +-
drivers/platform/x86/amd/pmc/pmc.c | 2 +-
drivers/platform/x86/amd/pmf/Kconfig | 2 +-
drivers/platform/x86/amd/pmf/core.c | 2 +-
drivers/ras/amd/atl/Kconfig | 1 +
drivers/ras/amd/atl/internal.h | 1 +
23 files changed, 495 insertions(+), 472 deletions(-)
create mode 100644 arch/x86/include/asm/amd_node.h
create mode 100644 arch/x86/include/asm/amd_smn.h
create mode 100644 arch/x86/kernel/amd_node.c
create mode 100644 arch/x86/kernel/amd_smn.c
base-commit: 94559bac4d403b1575b32a863f5c0429cdd33eaa
--
2.43.0
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH 01/16] x86/mce/amd: Remove shared threshold bank plumbing
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-23 17:21 ` [PATCH 02/16] x86/amd_nb: Restrict init function to AMD-based systems Yazen Ghannam
` (15 subsequent siblings)
16 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam, Borislav Petkov
Legacy AMD systems include an integrated Northbridge that is represented
by MCA bank 4. This is the only non-core MCA bank in legacy systems. The
Northbridge is physically shared by all the CPUs within an AMD "Node".
However, in practice the "shared" MCA bank can only by managed by a
single CPU within that AMD Node. This is known as the "Node Base Core"
(NBC). For example, only the NBC will be able to read the MCA bank 4
registers; they will be Read-as-Zero for other CPUs. Also, the MCA
Thresholding interrupt will only signal the NBC; the other CPUs will not
receive it. This is enforced by hardware, and it should not be managed by
software.
The current AMD Thresholding code attempts to deal with the "shared" MCA
bank by micromanaging the bank's sysfs kobjects. However, this does not
follow the intended kobject use cases. It is also fragile, and it has
caused bugs in the past.
Modern AMD systems do not need this shared MCA bank support, and it
should not be needed on legacy systems either.
Remove the shared threshold bank code. Also, move the threshold struct
definitions to mce/amd.c, since they are no longer needed in amd_nb.c.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
---
arch/x86/Kconfig | 2 +-
arch/x86/include/asm/amd_nb.h | 31 ---------
arch/x86/kernel/cpu/mce/amd.c | 127 +++++++---------------------------
3 files changed, 27 insertions(+), 133 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 0bdb7a394f59..c6f917b762c0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1186,7 +1186,7 @@ config X86_MCE_INTEL
config X86_MCE_AMD
def_bool y
prompt "AMD MCE features"
- depends on X86_MCE && X86_LOCAL_APIC && AMD_NB
+ depends on X86_MCE && X86_LOCAL_APIC
help
Additional support for AMD specific MCE features such as
the DRAM Error Threshold.
diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
index 6f3b6aef47ba..b09c26a551eb 100644
--- a/arch/x86/include/asm/amd_nb.h
+++ b/arch/x86/include/asm/amd_nb.h
@@ -4,7 +4,6 @@
#include <linux/ioport.h>
#include <linux/pci.h>
-#include <linux/refcount.h>
struct amd_nb_bus_dev_range {
u8 bus;
@@ -29,41 +28,11 @@ struct amd_l3_cache {
u8 subcaches[4];
};
-struct threshold_block {
- unsigned int block; /* Number within bank */
- unsigned int bank; /* MCA bank the block belongs to */
- unsigned int cpu; /* CPU which controls MCA bank */
- u32 address; /* MSR address for the block */
- u16 interrupt_enable; /* Enable/Disable APIC interrupt */
- bool interrupt_capable; /* Bank can generate an interrupt. */
-
- u16 threshold_limit; /*
- * Value upon which threshold
- * interrupt is generated.
- */
-
- struct kobject kobj; /* sysfs object */
- struct list_head miscj; /*
- * List of threshold blocks
- * within a bank.
- */
-};
-
-struct threshold_bank {
- struct kobject *kobj;
- struct threshold_block *blocks;
-
- /* initialized to the number of CPUs on the node sharing this bank */
- refcount_t cpus;
- unsigned int shared;
-};
-
struct amd_northbridge {
struct pci_dev *root;
struct pci_dev *misc;
struct pci_dev *link;
struct amd_l3_cache l3_cache;
- struct threshold_bank *bank4;
};
struct amd_northbridge_info {
diff --git a/arch/x86/kernel/cpu/mce/amd.c b/arch/x86/kernel/cpu/mce/amd.c
index 14bf8c232e45..0c80dad50664 100644
--- a/arch/x86/kernel/cpu/mce/amd.c
+++ b/arch/x86/kernel/cpu/mce/amd.c
@@ -4,8 +4,6 @@
*
* Written by Jacob Shin - AMD, Inc.
* Maintained by: Borislav Petkov <bp@alien8.de>
- *
- * All MC4_MISCi registers are shared between cores on a node.
*/
#include <linux/interrupt.h>
#include <linux/notifier.h>
@@ -20,7 +18,6 @@
#include <linux/smp.h>
#include <linux/string.h>
-#include <asm/amd_nb.h>
#include <asm/traps.h>
#include <asm/apic.h>
#include <asm/mce.h>
@@ -221,6 +218,32 @@ static const struct smca_hwid smca_hwid_mcatypes[] = {
#define MAX_MCATYPE_NAME_LEN 30
static char buf_mcatype[MAX_MCATYPE_NAME_LEN];
+struct threshold_block {
+ /* This block's number within its bank. */
+ unsigned int block;
+ /* MCA bank number that contains this block. */
+ unsigned int bank;
+ /* CPU which controls this block's MCA bank. */
+ unsigned int cpu;
+ /* MCA_MISC MSR address for this block. */
+ u32 address;
+ /* Enable/Disable APIC interrupt. */
+ bool interrupt_enable;
+ /* Bank can generate an interrupt. */
+ bool interrupt_capable;
+ /* Value upon which threshold interrupt is generated. */
+ u16 threshold_limit;
+ /* sysfs object */
+ struct kobject kobj;
+ /* List of threshold blocks within this block's MCA bank. */
+ struct list_head miscj;
+};
+
+struct threshold_bank {
+ struct kobject *kobj;
+ struct threshold_block *blocks;
+};
+
static DEFINE_PER_CPU(struct threshold_bank **, threshold_banks);
/*
@@ -333,19 +356,6 @@ struct thresh_restart {
u16 old_limit;
};
-static inline bool is_shared_bank(int bank)
-{
- /*
- * Scalable MCA provides for only one core to have access to the MSRs of
- * a shared bank.
- */
- if (mce_flags.smca)
- return false;
-
- /* Bank 4 is for northbridge reporting and is thus shared */
- return (bank == 4);
-}
-
static const char *bank4_names(const struct threshold_block *b)
{
switch (b->address) {
@@ -1194,35 +1204,10 @@ static int allocate_threshold_blocks(unsigned int cpu, struct threshold_bank *tb
return err;
}
-static int __threshold_add_blocks(struct threshold_bank *b)
-{
- struct list_head *head = &b->blocks->miscj;
- struct threshold_block *pos = NULL;
- struct threshold_block *tmp = NULL;
- int err = 0;
-
- err = kobject_add(&b->blocks->kobj, b->kobj, b->blocks->kobj.name);
- if (err)
- return err;
-
- list_for_each_entry_safe(pos, tmp, head, miscj) {
-
- err = kobject_add(&pos->kobj, b->kobj, pos->kobj.name);
- if (err) {
- list_for_each_entry_safe_reverse(pos, tmp, head, miscj)
- kobject_del(&pos->kobj);
-
- return err;
- }
- }
- return err;
-}
-
static int threshold_create_bank(struct threshold_bank **bp, unsigned int cpu,
unsigned int bank)
{
struct device *dev = this_cpu_read(mce_device);
- struct amd_northbridge *nb = NULL;
struct threshold_bank *b = NULL;
const char *name = get_name(cpu, bank, NULL);
int err = 0;
@@ -1230,26 +1215,6 @@ static int threshold_create_bank(struct threshold_bank **bp, unsigned int cpu,
if (!dev)
return -ENODEV;
- if (is_shared_bank(bank)) {
- nb = node_to_amd_nb(topology_amd_node_id(cpu));
-
- /* threshold descriptor already initialized on this node? */
- if (nb && nb->bank4) {
- /* yes, use it */
- b = nb->bank4;
- err = kobject_add(b->kobj, &dev->kobj, name);
- if (err)
- goto out;
-
- bp[bank] = b;
- refcount_inc(&b->cpus);
-
- err = __threshold_add_blocks(b);
-
- goto out;
- }
- }
-
b = kzalloc(sizeof(struct threshold_bank), GFP_KERNEL);
if (!b) {
err = -ENOMEM;
@@ -1263,17 +1228,6 @@ static int threshold_create_bank(struct threshold_bank **bp, unsigned int cpu,
goto out_free;
}
- if (is_shared_bank(bank)) {
- b->shared = 1;
- refcount_set(&b->cpus, 1);
-
- /* nb is already initialized, see above */
- if (nb) {
- WARN_ON(nb->bank4);
- nb->bank4 = b;
- }
- }
-
err = allocate_threshold_blocks(cpu, b, bank, 0, mca_msr_reg(bank, MCA_MISC));
if (err)
goto out_kobj;
@@ -1306,40 +1260,11 @@ static void deallocate_threshold_blocks(struct threshold_bank *bank)
kobject_put(&bank->blocks->kobj);
}
-static void __threshold_remove_blocks(struct threshold_bank *b)
-{
- struct threshold_block *pos = NULL;
- struct threshold_block *tmp = NULL;
-
- kobject_put(b->kobj);
-
- list_for_each_entry_safe(pos, tmp, &b->blocks->miscj, miscj)
- kobject_put(b->kobj);
-}
-
static void threshold_remove_bank(struct threshold_bank *bank)
{
- struct amd_northbridge *nb;
-
if (!bank->blocks)
goto out_free;
- if (!bank->shared)
- goto out_dealloc;
-
- if (!refcount_dec_and_test(&bank->cpus)) {
- __threshold_remove_blocks(bank);
- return;
- } else {
- /*
- * The last CPU on this node using the shared bank is going
- * away, remove that bank now.
- */
- nb = node_to_amd_nb(topology_amd_node_id(smp_processor_id()));
- nb->bank4 = NULL;
- }
-
-out_dealloc:
deallocate_threshold_blocks(bank);
out_free:
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 02/16] x86/amd_nb: Restrict init function to AMD-based systems
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
2024-10-23 17:21 ` [PATCH 01/16] x86/mce/amd: Remove shared threshold bank plumbing Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-31 8:09 ` Zhuo, Qiuxu
2024-10-23 17:21 ` [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb() Yazen Ghannam
` (14 subsequent siblings)
16 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
The code implicitly operates on AMD-based systems by matching on PCI
IDs. However, the use of these IDs is going away.
Add an explicit CPU vendor check instead of relying on PCI IDs.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/kernel/amd_nb.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 9fe9972d2071..37b8244899d8 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -582,6 +582,10 @@ static __init void fix_erratum_688(void)
static __init int init_amd_nbs(void)
{
+ if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
+ boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
+ return 0;
+
amd_cache_northbridges();
amd_cache_gart();
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb()
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
2024-10-23 17:21 ` [PATCH 01/16] x86/mce/amd: Remove shared threshold bank plumbing Yazen Ghannam
2024-10-23 17:21 ` [PATCH 02/16] x86/amd_nb: Restrict init function to AMD-based systems Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-25 15:58 ` Borislav Petkov
2024-10-23 17:21 ` [PATCH 04/16] x86: Start moving AMD Node functionality out of AMD_NB Yazen Ghannam
` (13 subsequent siblings)
16 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
The check for early_is_amd_nb() is only useful for systems with GART or
the NB_CFG register.
Zen-based systems (both AMD and Hygon) have neither, so return early for
them.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/kernel/amd_nb.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 37b8244899d8..65884d0613f8 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -385,7 +385,6 @@ static int amd_cache_northbridges(void)
*/
bool __init early_is_amd_nb(u32 device)
{
- const struct pci_device_id *misc_ids = amd_nb_misc_ids;
const struct pci_device_id *id;
u32 vendor = device & 0xffff;
@@ -393,11 +392,11 @@ bool __init early_is_amd_nb(u32 device)
boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
return false;
- if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
- misc_ids = hygon_nb_misc_ids;
+ if (boot_cpu_has(X86_FEATURE_ZEN))
+ return false;
device >>= 16;
- for (id = misc_ids; id->vendor; id++)
+ for (id = amd_nb_misc_ids; id->vendor; id++)
if (vendor == id->vendor && device == id->device)
return true;
return false;
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 04/16] x86: Start moving AMD Node functionality out of AMD_NB
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (2 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb() Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-23 17:21 ` [PATCH 05/16] x86/amd_nb: Simplify function 4 search Yazen Ghannam
` (12 subsequent siblings)
16 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
The "AMD Node" concept spans many families of systems and applies to a
number of subsystems and drivers.
Currently, the AMD Northbridge code is overloaded with AMD node
functionality. However, the node concept is broader than just
northbridges.
Start files to host common AMD node functions and definitions.
Include a helper to find an AMD node device function based on the
convention described in AMD documentation.
Anything that needs node functionality should include this rather than
amd_nb.h. The AMD_NB code will be reduced to only northbridge-specific
code needed for legacy systems.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
MAINTAINERS | 7 +++++++
arch/x86/Kconfig | 4 ++++
arch/x86/include/asm/amd_node.h | 27 ++++++++++++++++++++++++++
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/amd_node.c | 34 +++++++++++++++++++++++++++++++++
5 files changed, 73 insertions(+)
create mode 100644 arch/x86/include/asm/amd_node.h
create mode 100644 arch/x86/kernel/amd_node.c
diff --git a/MAINTAINERS b/MAINTAINERS
index cf02cbf4bef1..9ca246aef7ab 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1106,6 +1106,13 @@ L: linux-i2c@vger.kernel.org
S: Maintained
F: drivers/i2c/busses/i2c-amd-mp2*
+AMD NODE DRIVER
+M: Yazen Ghannam <yazen.ghannam@amd.com>
+L: linux-kernel@vger.kernel.org
+S: Supported
+F: arch/x86/include/asm/amd_node.h
+F: arch/x86/kernel/amd_node.c
+
AMD PDS CORE DRIVER
M: Shannon Nelson <shannon.nelson@amd.com>
M: Brett Creeley <brett.creeley@amd.com>
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index c6f917b762c0..ba5252d8e21c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -3121,6 +3121,10 @@ config TS5500
endif # X86_32
config AMD_NB
+ def_bool y
+ depends on AMD_NODE
+
+config AMD_NODE
def_bool y
depends on CPU_SUP_AMD && PCI
diff --git a/arch/x86/include/asm/amd_node.h b/arch/x86/include/asm/amd_node.h
new file mode 100644
index 000000000000..622bd3038eeb
--- /dev/null
+++ b/arch/x86/include/asm/amd_node.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * AMD Node helper functions and common defines
+ *
+ * Copyright (c) 2024, Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
+ *
+ * Note:
+ * Items in this file may only be used in a single place.
+ * However, it's prudent to keep all AMD Node functionality
+ * in a unified place rather than spreading throughout the
+ * kernel.
+ */
+
+#ifndef _ASM_X86_AMD_NODE_H_
+#define _ASM_X86_AMD_NODE_H_
+
+#include <linux/pci.h>
+
+#define MAX_AMD_NUM_NODES 8
+#define AMD_NODE0_PCI_SLOT 0x18
+
+struct pci_dev *amd_node_get_func(u16 node, u8 func);
+
+#endif /*_ASM_X86_AMD_NODE_H_*/
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index f7918980667a..b43eb7e384eb 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -119,6 +119,7 @@ obj-$(CONFIG_EARLY_PRINTK) += early_printk.o
obj-$(CONFIG_HPET_TIMER) += hpet.o
obj-$(CONFIG_AMD_NB) += amd_nb.o
+obj-$(CONFIG_AMD_NODE) += amd_node.o
obj-$(CONFIG_DEBUG_NMI_SELFTEST) += nmi_selftest.o
obj-$(CONFIG_KVM_GUEST) += kvm.o kvmclock.o
diff --git a/arch/x86/kernel/amd_node.c b/arch/x86/kernel/amd_node.c
new file mode 100644
index 000000000000..e825cd4426b9
--- /dev/null
+++ b/arch/x86/kernel/amd_node.c
@@ -0,0 +1,34 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * AMD Node helper functions and common defines
+ *
+ * Copyright (c) 2024, Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Author: Yazen Ghannam <Yazen.Ghannam@amd.com>
+ */
+
+#include <asm/amd_node.h>
+
+/*
+ * AMD Nodes are a physical collection of I/O devices within an SoC. There can be one
+ * or more nodes per package.
+ *
+ * The nodes are software-visible through PCI config space. All nodes are enumerated
+ * on segment 0 bus 0. The device (slot) numbers range from 0x18 to 0x1F (maximum 8
+ * nodes) with 0x18 corresponding to node 0, 0x19 to node 1, etc. Each node can be a
+ * multi-function device.
+ *
+ * On legacy systems, these node devices represent integrated Northbridge functionality.
+ * On Zen-based systems, these node devices represent Data Fabric functionality.
+ *
+ * See "Configuration Space Accesses" section in BKDGs or
+ * "Processor x86 Core" -> "Configuration Space" section in PPRs.
+ */
+struct pci_dev *amd_node_get_func(u16 node, u8 func)
+{
+ if (node >= MAX_AMD_NUM_NODES)
+ return NULL;
+
+ return pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(AMD_NODE0_PCI_SLOT + node, func));
+}
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 05/16] x86/amd_nb: Simplify function 4 search
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (3 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 04/16] x86: Start moving AMD Node functionality out of AMD_NB Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-31 11:15 ` Borislav Petkov
2024-10-23 17:21 ` [PATCH 06/16] x86/amd_nb: Simplify root device search Yazen Ghannam
` (11 subsequent siblings)
16 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
Use the newly added helper function to look up a CPU/Node function to
find "function 4" devices.
This avoids the need to regularly add new PCI IDs for basic discovery.
The unique PCI IDs are still useful in case of quirks or functional
changes. And they should be used only in such a manner.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/include/asm/amd_nb.h | 1 +
arch/x86/kernel/amd_nb.c | 66 ++---------------------------------
2 files changed, 4 insertions(+), 63 deletions(-)
diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
index b09c26a551eb..a0f2182107b0 100644
--- a/arch/x86/include/asm/amd_nb.h
+++ b/arch/x86/include/asm/amd_nb.h
@@ -4,6 +4,7 @@
#include <linux/ioport.h>
#include <linux/pci.h>
+#include <asm/amd_node.h>
struct amd_nb_bus_dev_range {
u8 bus;
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 65884d0613f8..34c06b25782d 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -30,26 +30,6 @@
#define PCI_DEVICE_ID_AMD_MI200_ROOT 0x14bb
#define PCI_DEVICE_ID_AMD_MI300_ROOT 0x14f8
-#define PCI_DEVICE_ID_AMD_17H_DF_F4 0x1464
-#define PCI_DEVICE_ID_AMD_17H_M10H_DF_F4 0x15ec
-#define PCI_DEVICE_ID_AMD_17H_M30H_DF_F4 0x1494
-#define PCI_DEVICE_ID_AMD_17H_M60H_DF_F4 0x144c
-#define PCI_DEVICE_ID_AMD_17H_M70H_DF_F4 0x1444
-#define PCI_DEVICE_ID_AMD_17H_MA0H_DF_F4 0x1728
-#define PCI_DEVICE_ID_AMD_19H_DF_F4 0x1654
-#define PCI_DEVICE_ID_AMD_19H_M10H_DF_F4 0x14b1
-#define PCI_DEVICE_ID_AMD_19H_M40H_DF_F4 0x167d
-#define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
-#define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
-#define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
-#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
-#define PCI_DEVICE_ID_AMD_1AH_M00H_DF_F4 0x12c4
-#define PCI_DEVICE_ID_AMD_1AH_M20H_DF_F4 0x16fc
-#define PCI_DEVICE_ID_AMD_1AH_M60H_DF_F4 0x124c
-#define PCI_DEVICE_ID_AMD_1AH_M70H_DF_F4 0x12bc
-#define PCI_DEVICE_ID_AMD_MI200_DF_F4 0x14d4
-#define PCI_DEVICE_ID_AMD_MI300_DF_F4 0x152c
-
/* Protect the PCI config register pairs used for SMN. */
static DEFINE_MUTEX(smn_mutex);
@@ -73,8 +53,6 @@ static const struct pci_device_id amd_root_ids[] = {
{}
};
-#define PCI_DEVICE_ID_AMD_CNB17H_F4 0x1704
-
static const struct pci_device_id amd_nb_misc_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_K8_NB_MISC) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_10H_NB_MISC) },
@@ -107,35 +85,6 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
{}
};
-static const struct pci_device_id amd_nb_link_ids[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_15H_NB_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_15H_M30H_NB_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_15H_M60H_NB_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_16H_NB_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_16H_M30H_NB_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M60H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M70H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_MA0H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M10H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M40H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M00H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M20H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M60H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M70H_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI200_DF_F4) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI300_DF_F4) },
- {}
-};
-
static const struct pci_device_id hygon_root_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_HYGON, PCI_DEVICE_ID_AMD_17H_ROOT) },
{}
@@ -146,11 +95,6 @@ static const struct pci_device_id hygon_nb_misc_ids[] = {
{}
};
-static const struct pci_device_id hygon_nb_link_ids[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_HYGON, PCI_DEVICE_ID_AMD_17H_DF_F4) },
- {}
-};
-
const struct amd_nb_bus_dev_range amd_nb_bus_dev_ranges[] __initconst = {
{ 0x00, 0x18, 0x20 },
{ 0xff, 0x00, 0x20 },
@@ -275,13 +219,11 @@ int __must_check amd_smn_write(u16 node, u32 address, u32 value)
}
EXPORT_SYMBOL_GPL(amd_smn_write);
-
static int amd_cache_northbridges(void)
{
const struct pci_device_id *misc_ids = amd_nb_misc_ids;
- const struct pci_device_id *link_ids = amd_nb_link_ids;
const struct pci_device_id *root_ids = amd_root_ids;
- struct pci_dev *root, *misc, *link;
+ struct pci_dev *root, *misc;
struct amd_northbridge *nb;
u16 roots_per_misc = 0;
u16 misc_count = 0;
@@ -294,7 +236,6 @@ static int amd_cache_northbridges(void)
if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) {
root_ids = hygon_root_ids;
misc_ids = hygon_nb_misc_ids;
- link_ids = hygon_nb_link_ids;
}
misc = NULL;
@@ -328,14 +269,13 @@ static int amd_cache_northbridges(void)
amd_northbridges.nb = nb;
amd_northbridges.num = misc_count;
- link = misc = root = NULL;
+ misc = root = NULL;
for (i = 0; i < amd_northbridges.num; i++) {
node_to_amd_nb(i)->root = root =
next_northbridge(root, root_ids);
node_to_amd_nb(i)->misc = misc =
next_northbridge(misc, misc_ids);
- node_to_amd_nb(i)->link = link =
- next_northbridge(link, link_ids);
+ node_to_amd_nb(i)->link = amd_node_get_func(i, 4);
/*
* If there are more PCI root devices than data fabric/
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 06/16] x86/amd_nb: Simplify root device search
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (4 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 05/16] x86/amd_nb: Simplify function 4 search Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-31 7:52 ` Zhuo, Qiuxu
2024-10-31 11:20 ` Borislav Petkov
2024-10-23 17:21 ` [PATCH 07/16] x86/amd_nb: Use topology info to get AMD node count Yazen Ghannam
` (10 subsequent siblings)
16 siblings, 2 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
The "root" device search was introduced to support SMN access for Zen
systems. This device represents a PCIe root complex. It is not the
same as the "CPU/node" devices found at slots 0x18-0x1F.
There may be multiple PCIe root complexes within an AMD node. Such is
the case with server or HEDT systems, etc. Therefore it is not enough to
assume "root <-> AMD node" is a 1-to-1 association.
Currently, this is handled by skipping "extra" root complexes during the
search. However, the hardware provides the PCI bus number of an AMD
node's root device.
Use the hardware info to get the root device's bus and drop the extra
search code and PCI IDs.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/include/asm/amd_node.h | 1 +
arch/x86/kernel/amd_nb.c | 80 ++-------------------------------
arch/x86/kernel/amd_node.c | 57 +++++++++++++++++++++++
3 files changed, 62 insertions(+), 76 deletions(-)
diff --git a/arch/x86/include/asm/amd_node.h b/arch/x86/include/asm/amd_node.h
index 622bd3038eeb..3f097dd479f8 100644
--- a/arch/x86/include/asm/amd_node.h
+++ b/arch/x86/include/asm/amd_node.h
@@ -23,5 +23,6 @@
#define AMD_NODE0_PCI_SLOT 0x18
struct pci_dev *amd_node_get_func(u16 node, u8 func);
+struct pci_dev *amd_node_get_root(u16 node);
#endif /*_ASM_X86_AMD_NODE_H_*/
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 34c06b25782d..135ecc0a0166 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -15,44 +15,11 @@
#include <linux/pci_ids.h>
#include <asm/amd_nb.h>
-#define PCI_DEVICE_ID_AMD_17H_ROOT 0x1450
-#define PCI_DEVICE_ID_AMD_17H_M10H_ROOT 0x15d0
-#define PCI_DEVICE_ID_AMD_17H_M30H_ROOT 0x1480
-#define PCI_DEVICE_ID_AMD_17H_M60H_ROOT 0x1630
-#define PCI_DEVICE_ID_AMD_17H_MA0H_ROOT 0x14b5
-#define PCI_DEVICE_ID_AMD_19H_M10H_ROOT 0x14a4
-#define PCI_DEVICE_ID_AMD_19H_M40H_ROOT 0x14b5
-#define PCI_DEVICE_ID_AMD_19H_M60H_ROOT 0x14d8
-#define PCI_DEVICE_ID_AMD_19H_M70H_ROOT 0x14e8
-#define PCI_DEVICE_ID_AMD_1AH_M00H_ROOT 0x153a
-#define PCI_DEVICE_ID_AMD_1AH_M20H_ROOT 0x1507
-#define PCI_DEVICE_ID_AMD_1AH_M60H_ROOT 0x1122
-#define PCI_DEVICE_ID_AMD_MI200_ROOT 0x14bb
-#define PCI_DEVICE_ID_AMD_MI300_ROOT 0x14f8
-
/* Protect the PCI config register pairs used for SMN. */
static DEFINE_MUTEX(smn_mutex);
static u32 *flush_words;
-static const struct pci_device_id amd_root_ids[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M60H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_MA0H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M10H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M40H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M00H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M20H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M60H_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI200_ROOT) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI300_ROOT) },
- {}
-};
-
static const struct pci_device_id amd_nb_misc_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_K8_NB_MISC) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_10H_NB_MISC) },
@@ -85,11 +52,6 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
{}
};
-static const struct pci_device_id hygon_root_ids[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_HYGON, PCI_DEVICE_ID_AMD_17H_ROOT) },
- {}
-};
-
static const struct pci_device_id hygon_nb_misc_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_HYGON, PCI_DEVICE_ID_AMD_17H_DF_F3) },
{}
@@ -222,19 +184,15 @@ EXPORT_SYMBOL_GPL(amd_smn_write);
static int amd_cache_northbridges(void)
{
const struct pci_device_id *misc_ids = amd_nb_misc_ids;
- const struct pci_device_id *root_ids = amd_root_ids;
- struct pci_dev *root, *misc;
+ struct pci_dev *misc;
struct amd_northbridge *nb;
- u16 roots_per_misc = 0;
u16 misc_count = 0;
- u16 root_count = 0;
- u16 i, j;
+ u16 i;
if (amd_northbridges.num)
return 0;
if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) {
- root_ids = hygon_root_ids;
misc_ids = hygon_nb_misc_ids;
}
@@ -245,23 +203,6 @@ static int amd_cache_northbridges(void)
if (!misc_count)
return -ENODEV;
- root = NULL;
- while ((root = next_northbridge(root, root_ids)))
- root_count++;
-
- if (root_count) {
- roots_per_misc = root_count / misc_count;
-
- /*
- * There should be _exactly_ N roots for each DF/SMN
- * interface.
- */
- if (!roots_per_misc || (root_count % roots_per_misc)) {
- pr_info("Unsupported AMD DF/PCI configuration found\n");
- return -ENODEV;
- }
- }
-
nb = kcalloc(misc_count, sizeof(struct amd_northbridge), GFP_KERNEL);
if (!nb)
return -ENOMEM;
@@ -269,25 +210,12 @@ static int amd_cache_northbridges(void)
amd_northbridges.nb = nb;
amd_northbridges.num = misc_count;
- misc = root = NULL;
+ misc = NULL;
for (i = 0; i < amd_northbridges.num; i++) {
- node_to_amd_nb(i)->root = root =
- next_northbridge(root, root_ids);
+ node_to_amd_nb(i)->root = amd_node_get_root(i);
node_to_amd_nb(i)->misc = misc =
next_northbridge(misc, misc_ids);
node_to_amd_nb(i)->link = amd_node_get_func(i, 4);
-
- /*
- * If there are more PCI root devices than data fabric/
- * system management network interfaces, then the (N)
- * PCI roots per DF/SMN interface are functionally the
- * same (for DF/SMN access) and N-1 are redundant. N-1
- * PCI roots should be skipped per DF/SMN interface so
- * the following DF/SMN interfaces get mapped to
- * correct PCI roots.
- */
- for (j = 1; j < roots_per_misc; j++)
- root = next_northbridge(root, root_ids);
}
if (amd_gart_present())
diff --git a/arch/x86/kernel/amd_node.c b/arch/x86/kernel/amd_node.c
index e825cd4426b9..3aaf7c81f0fa 100644
--- a/arch/x86/kernel/amd_node.c
+++ b/arch/x86/kernel/amd_node.c
@@ -32,3 +32,60 @@ struct pci_dev *amd_node_get_func(u16 node, u8 func)
return pci_get_domain_bus_and_slot(0, 0, PCI_DEVFN(AMD_NODE0_PCI_SLOT + node, func));
}
+
+#define DF_BLK_INST_CNT 0x040
+#define DF_CFG_ADDR_CNTL_LEGACY 0x084
+#define DF_CFG_ADDR_CNTL_DF4 0xC04
+
+#define DF_MAJOR_REVISION GENMASK(27, 24)
+
+static u16 get_cfg_addr_cntl_offset(struct pci_dev *df_f0)
+{
+ u32 reg;
+
+ /*
+ * Revision fields added for DF4 and later.
+ *
+ * Major revision of '0' is found pre-DF4. Field is Read-as-Zero.
+ */
+ if (pci_read_config_dword(df_f0, DF_BLK_INST_CNT, ®))
+ return 0;
+
+ if (reg & DF_MAJOR_REVISION)
+ return DF_CFG_ADDR_CNTL_DF4;
+
+ return DF_CFG_ADDR_CNTL_LEGACY;
+}
+
+struct pci_dev *amd_node_get_root(u16 node)
+{
+ struct pci_dev *df_f0 __free(pci_dev_put) = NULL;
+ struct pci_dev *root;
+ u16 cntl_off;
+ u8 bus;
+
+ if (!boot_cpu_has(X86_FEATURE_ZEN))
+ return NULL;
+
+ /*
+ * D18F0xXXX [Config Address Control] (DF::CfgAddressCntl)
+ * Bits [7:0] (SecBusNum) holds the bus number of the root device for
+ * this Data Fabric instance. The segment, device, and function will be 0.
+ */
+ df_f0 = amd_node_get_func(node, 0);
+ if (!df_f0)
+ return NULL;
+
+ cntl_off = get_cfg_addr_cntl_offset(df_f0);
+ if (!cntl_off)
+ return NULL;
+
+ if (pci_read_config_byte(df_f0, cntl_off, &bus))
+ return NULL;
+
+ /* Grab the pointer for the actual root device instance. */
+ root = pci_get_domain_bus_and_slot(0, bus, 0);
+
+ pci_dbg(root, "is root for AMD node %u\n", node);
+ return root;
+}
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 07/16] x86/amd_nb: Use topology info to get AMD node count
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (5 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 06/16] x86/amd_nb: Simplify root device search Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-23 17:21 ` [PATCH 08/16] x86/amd_nb: Simplify function 3 search Yazen Ghannam
` (9 subsequent siblings)
16 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
Currently, the total AMD node count is determined by searching and
counting CPU/node devices using PCI IDs.
However, AMD node information is already available through topology
CPUID/MSRs. The recent topology rework has made this info easier to
access.
Replace the node counting code with a simple product of topology info.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/include/asm/amd_node.h | 5 +++++
arch/x86/kernel/amd_nb.c | 11 ++---------
2 files changed, 7 insertions(+), 9 deletions(-)
diff --git a/arch/x86/include/asm/amd_node.h b/arch/x86/include/asm/amd_node.h
index 3f097dd479f8..419a0ad13ef2 100644
--- a/arch/x86/include/asm/amd_node.h
+++ b/arch/x86/include/asm/amd_node.h
@@ -25,4 +25,9 @@
struct pci_dev *amd_node_get_func(u16 node, u8 func);
struct pci_dev *amd_node_get_root(u16 node);
+static inline u16 amd_num_nodes(void)
+{
+ return topology_amd_nodes_per_pkg() * topology_max_packages();
+}
+
#endif /*_ASM_X86_AMD_NODE_H_*/
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 135ecc0a0166..7ccd769f9c5e 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -186,7 +186,6 @@ static int amd_cache_northbridges(void)
const struct pci_device_id *misc_ids = amd_nb_misc_ids;
struct pci_dev *misc;
struct amd_northbridge *nb;
- u16 misc_count = 0;
u16 i;
if (amd_northbridges.num)
@@ -196,19 +195,13 @@ static int amd_cache_northbridges(void)
misc_ids = hygon_nb_misc_ids;
}
- misc = NULL;
- while ((misc = next_northbridge(misc, misc_ids)))
- misc_count++;
-
- if (!misc_count)
- return -ENODEV;
+ amd_northbridges.num = amd_num_nodes();
- nb = kcalloc(misc_count, sizeof(struct amd_northbridge), GFP_KERNEL);
+ nb = kcalloc(amd_northbridges.num, sizeof(struct amd_northbridge), GFP_KERNEL);
if (!nb)
return -ENOMEM;
amd_northbridges.nb = nb;
- amd_northbridges.num = misc_count;
misc = NULL;
for (i = 0; i < amd_northbridges.num; i++) {
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 08/16] x86/amd_nb: Simplify function 3 search
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (6 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 07/16] x86/amd_nb: Use topology info to get AMD node count Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-23 17:21 ` [PATCH 09/16] x86/amd_nb, x86/amd_node: Simplify amd_pci_dev_to_node_id() Yazen Ghannam
` (8 subsequent siblings)
16 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
Use the newly introduced helper function to look up "function 3". Drop
unused PCI IDs and code.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/kernel/amd_nb.c | 46 +---------------------------------------
1 file changed, 1 insertion(+), 45 deletions(-)
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 7ccd769f9c5e..9b159f9b4a11 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -29,31 +29,6 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_15H_M60H_NB_F3) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_16H_NB_F3) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_16H_M30H_NB_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M60H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_MA0H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M70H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M10H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M40H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M00H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M20H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M60H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M70H_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI200_DF_F3) },
- { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI300_DF_F3) },
- {}
-};
-
-static const struct pci_device_id hygon_nb_misc_ids[] = {
- { PCI_DEVICE(PCI_VENDOR_ID_HYGON, PCI_DEVICE_ID_AMD_17H_DF_F3) },
{}
};
@@ -84,17 +59,6 @@ struct amd_northbridge *node_to_amd_nb(int node)
}
EXPORT_SYMBOL_GPL(node_to_amd_nb);
-static struct pci_dev *next_northbridge(struct pci_dev *dev,
- const struct pci_device_id *ids)
-{
- do {
- dev = pci_get_device(PCI_ANY_ID, PCI_ANY_ID, dev);
- if (!dev)
- break;
- } while (!pci_match_id(ids, dev));
- return dev;
-}
-
/*
* SMN accesses may fail in ways that are difficult to detect here in the called
* functions amd_smn_read() and amd_smn_write(). Therefore, callers must do
@@ -183,18 +147,12 @@ EXPORT_SYMBOL_GPL(amd_smn_write);
static int amd_cache_northbridges(void)
{
- const struct pci_device_id *misc_ids = amd_nb_misc_ids;
- struct pci_dev *misc;
struct amd_northbridge *nb;
u16 i;
if (amd_northbridges.num)
return 0;
- if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON) {
- misc_ids = hygon_nb_misc_ids;
- }
-
amd_northbridges.num = amd_num_nodes();
nb = kcalloc(amd_northbridges.num, sizeof(struct amd_northbridge), GFP_KERNEL);
@@ -203,11 +161,9 @@ static int amd_cache_northbridges(void)
amd_northbridges.nb = nb;
- misc = NULL;
for (i = 0; i < amd_northbridges.num; i++) {
node_to_amd_nb(i)->root = amd_node_get_root(i);
- node_to_amd_nb(i)->misc = misc =
- next_northbridge(misc, misc_ids);
+ node_to_amd_nb(i)->misc = amd_node_get_func(i, 3);
node_to_amd_nb(i)->link = amd_node_get_func(i, 4);
}
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 09/16] x86/amd_nb, x86/amd_node: Simplify amd_pci_dev_to_node_id()
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (7 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 08/16] x86/amd_nb: Simplify function 3 search Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-11-04 14:23 ` Borislav Petkov
2024-10-23 17:21 ` [PATCH 10/16] x86/amd_nb: Move SMN access code to a new amd_smn driver Yazen Ghannam
` (7 subsequent siblings)
16 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
From: Mario Limonciello <mario.limonciello@amd.com>
amd_pci_dev_to_node_id() tries to find the AMD node ID of a device by
searching and counting devices.
The AMD node ID of an AMD node device is simply its slot number minus
the AMD node 0 slot number.
Simplify this function and move it to amd_node.h.
[Yazen: Update commit message and simplify function]
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Co-developed-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/include/asm/amd_nb.h | 17 -----------------
arch/x86/include/asm/amd_node.h | 6 ++++++
2 files changed, 6 insertions(+), 17 deletions(-)
diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
index a0f2182107b0..b3b42e585655 100644
--- a/arch/x86/include/asm/amd_nb.h
+++ b/arch/x86/include/asm/amd_nb.h
@@ -52,23 +52,6 @@ u16 amd_nb_num(void);
bool amd_nb_has_feature(unsigned int feature);
struct amd_northbridge *node_to_amd_nb(int node);
-static inline u16 amd_pci_dev_to_node_id(struct pci_dev *pdev)
-{
- struct pci_dev *misc;
- int i;
-
- for (i = 0; i != amd_nb_num(); i++) {
- misc = node_to_amd_nb(i)->misc;
-
- if (pci_domain_nr(misc->bus) == pci_domain_nr(pdev->bus) &&
- PCI_SLOT(misc->devfn) == PCI_SLOT(pdev->devfn))
- return i;
- }
-
- WARN(1, "Unable to find AMD Northbridge id for %s\n", pci_name(pdev));
- return 0;
-}
-
static inline bool amd_gart_present(void)
{
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
diff --git a/arch/x86/include/asm/amd_node.h b/arch/x86/include/asm/amd_node.h
index 419a0ad13ef2..8e473a293706 100644
--- a/arch/x86/include/asm/amd_node.h
+++ b/arch/x86/include/asm/amd_node.h
@@ -30,4 +30,10 @@ static inline u16 amd_num_nodes(void)
return topology_amd_nodes_per_pkg() * topology_max_packages();
}
+/* Caller must ensure the input is an AMD node device. */
+static inline u16 amd_pci_dev_to_node_id(struct pci_dev *pdev)
+{
+ return PCI_SLOT(pdev->devfn) - AMD_NODE0_PCI_SLOT;
+}
+
#endif /*_ASM_X86_AMD_NODE_H_*/
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 10/16] x86/amd_nb: Move SMN access code to a new amd_smn driver
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (8 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 09/16] x86/amd_nb, x86/amd_node: Simplify amd_pci_dev_to_node_id() Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-11-04 14:29 ` Borislav Petkov
2024-10-23 17:21 ` [PATCH 11/16] x86/amd_smn: Fixup __amd_smn_rw() Yazen Ghannam
` (6 subsequent siblings)
16 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
From: Mario Limonciello <mario.limonciello@amd.com>
SMN access was bolted into amd_nb mostly as convenience. This has
limitations though that require incurring tech debt to keep it working.
Move SMN access to its own driver.
[Yazen: Adjust commit message and add MAINTAINERS entry]
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
MAINTAINERS | 8 +++
arch/x86/Kconfig | 3 +
arch/x86/include/asm/amd_nb.h | 3 -
arch/x86/include/asm/amd_smn.h | 11 +++
arch/x86/kernel/Makefile | 1 +
arch/x86/kernel/amd_nb.c | 89 -----------------------
arch/x86/kernel/amd_smn.c | 101 +++++++++++++++++++++++++++
arch/x86/pci/fixup.c | 4 +-
drivers/edac/Kconfig | 1 +
drivers/edac/amd64_edac.c | 1 +
drivers/hwmon/Kconfig | 2 +-
drivers/hwmon/k10temp.c | 2 +-
drivers/platform/x86/amd/pmc/Kconfig | 2 +-
drivers/platform/x86/amd/pmc/pmc.c | 2 +-
drivers/platform/x86/amd/pmf/Kconfig | 2 +-
drivers/platform/x86/amd/pmf/core.c | 2 +-
drivers/ras/amd/atl/Kconfig | 1 +
drivers/ras/amd/atl/internal.h | 1 +
18 files changed, 136 insertions(+), 100 deletions(-)
create mode 100644 arch/x86/include/asm/amd_smn.h
create mode 100644 arch/x86/kernel/amd_smn.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 9ca246aef7ab..cb9af4353b31 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1180,6 +1180,14 @@ S: Maintained
F: Documentation/hid/amd-sfh*
F: drivers/hid/amd-sfh-hid/
+AMD SMN DRIVER
+M: Mario Limonciello <mario.limonciello@amd.com>
+M: Yazen Ghannam <yazen.ghannam@amd.com>
+L: linux-kernel@vger.kernel.org
+S: Supported
+F: arch/x86/include/asm/amd_smn.h
+F: arch/x86/kernel/amd_smn.c
+
AMD SPI DRIVER
M: Sanjay R Mehta <sanju.mehta@amd.com>
S: Maintained
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index ba5252d8e21c..a03ffa5b6bb1 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -3128,6 +3128,9 @@ config AMD_NODE
def_bool y
depends on CPU_SUP_AMD && PCI
+config AMD_SMN
+ def_bool y
+ depends on AMD_NODE
endmenu
menu "Binary Emulations"
diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
index b3b42e585655..55c03d3495bc 100644
--- a/arch/x86/include/asm/amd_nb.h
+++ b/arch/x86/include/asm/amd_nb.h
@@ -21,9 +21,6 @@ extern int amd_numa_init(void);
extern int amd_get_subcaches(int);
extern int amd_set_subcaches(int, unsigned long);
-int __must_check amd_smn_read(u16 node, u32 address, u32 *value);
-int __must_check amd_smn_write(u16 node, u32 address, u32 value);
-
struct amd_l3_cache {
unsigned indices;
u8 subcaches[4];
diff --git a/arch/x86/include/asm/amd_smn.h b/arch/x86/include/asm/amd_smn.h
new file mode 100644
index 000000000000..6850de69f863
--- /dev/null
+++ b/arch/x86/include/asm/amd_smn.h
@@ -0,0 +1,11 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_AMD_SMN_H
+#define _ASM_X86_AMD_SMN_H
+
+#include <linux/types.h>
+#include <asm/amd_node.h>
+
+int __must_check amd_smn_read(u16 node, u32 address, u32 *value);
+int __must_check amd_smn_write(u16 node, u32 address, u32 value);
+
+#endif /* _ASM_X86_AMD_SMN_H */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index b43eb7e384eb..1df22821f72d 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -120,6 +120,7 @@ obj-$(CONFIG_HPET_TIMER) += hpet.o
obj-$(CONFIG_AMD_NB) += amd_nb.o
obj-$(CONFIG_AMD_NODE) += amd_node.o
+obj-$(CONFIG_AMD_SMN) += amd_smn.o
obj-$(CONFIG_DEBUG_NMI_SELFTEST) += nmi_selftest.o
obj-$(CONFIG_KVM_GUEST) += kvm.o kvmclock.o
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 9b159f9b4a11..10cdeddeda02 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -15,9 +15,6 @@
#include <linux/pci_ids.h>
#include <asm/amd_nb.h>
-/* Protect the PCI config register pairs used for SMN. */
-static DEFINE_MUTEX(smn_mutex);
-
static u32 *flush_words;
static const struct pci_device_id amd_nb_misc_ids[] = {
@@ -59,92 +56,6 @@ struct amd_northbridge *node_to_amd_nb(int node)
}
EXPORT_SYMBOL_GPL(node_to_amd_nb);
-/*
- * SMN accesses may fail in ways that are difficult to detect here in the called
- * functions amd_smn_read() and amd_smn_write(). Therefore, callers must do
- * their own checking based on what behavior they expect.
- *
- * For SMN reads, the returned value may be zero if the register is Read-as-Zero.
- * Or it may be a "PCI Error Response", e.g. all 0xFFs. The "PCI Error Response"
- * can be checked here, and a proper error code can be returned.
- *
- * But the Read-as-Zero response cannot be verified here. A value of 0 may be
- * correct in some cases, so callers must check that this correct is for the
- * register/fields they need.
- *
- * For SMN writes, success can be determined through a "write and read back"
- * However, this is not robust when done here.
- *
- * Possible issues:
- *
- * 1) Bits that are "Write-1-to-Clear". In this case, the read value should
- * *not* match the write value.
- *
- * 2) Bits that are "Read-as-Zero"/"Writes-Ignored". This information cannot be
- * known here.
- *
- * 3) Bits that are "Reserved / Set to 1". Ditto above.
- *
- * Callers of amd_smn_write() should do the "write and read back" check
- * themselves, if needed.
- *
- * For #1, they can see if their target bits got cleared.
- *
- * For #2 and #3, they can check if their target bits got set as intended.
- *
- * This matches what is done for RDMSR/WRMSR. As long as there's no #GP, then
- * the operation is considered a success, and the caller does their own
- * checking.
- */
-static int __amd_smn_rw(u16 node, u32 address, u32 *value, bool write)
-{
- struct pci_dev *root;
- int err = -ENODEV;
-
- if (node >= amd_northbridges.num)
- goto out;
-
- root = node_to_amd_nb(node)->root;
- if (!root)
- goto out;
-
- mutex_lock(&smn_mutex);
-
- err = pci_write_config_dword(root, 0x60, address);
- if (err) {
- pr_warn("Error programming SMN address 0x%x.\n", address);
- goto out_unlock;
- }
-
- err = (write ? pci_write_config_dword(root, 0x64, *value)
- : pci_read_config_dword(root, 0x64, value));
-
-out_unlock:
- mutex_unlock(&smn_mutex);
-
-out:
- return err;
-}
-
-int __must_check amd_smn_read(u16 node, u32 address, u32 *value)
-{
- int err = __amd_smn_rw(node, address, value, false);
-
- if (PCI_POSSIBLE_ERROR(*value)) {
- err = -ENODEV;
- *value = 0;
- }
-
- return err;
-}
-EXPORT_SYMBOL_GPL(amd_smn_read);
-
-int __must_check amd_smn_write(u16 node, u32 address, u32 value)
-{
- return __amd_smn_rw(node, address, &value, true);
-}
-EXPORT_SYMBOL_GPL(amd_smn_write);
-
static int amd_cache_northbridges(void)
{
struct amd_northbridge *nb;
diff --git a/arch/x86/kernel/amd_smn.c b/arch/x86/kernel/amd_smn.c
new file mode 100644
index 000000000000..06160a9e2444
--- /dev/null
+++ b/arch/x86/kernel/amd_smn.c
@@ -0,0 +1,101 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * AMD SMN driver, used for register access
+ * Copyright 2024 Advanced Micro Devices, Inc.
+ */
+
+#define pr_fmt(fmt) "amd_smn: " fmt
+
+#include <linux/pci.h>
+
+#include <asm/amd_nb.h>
+#include <asm/amd_smn.h>
+
+/* Protect the PCI config register pairs used for SMN. */
+static DEFINE_MUTEX(smn_mutex);
+
+/*
+ * SMN accesses may fail in ways that are difficult to detect here in the called
+ * functions amd_smn_read() and amd_smn_write(). Therefore, callers must do
+ * their own checking based on what behavior they expect.
+ *
+ * For SMN reads, the returned value may be zero if the register is Read-as-Zero.
+ * Or it may be a "PCI Error Response", e.g. all 0xFFs. The "PCI Error Response"
+ * can be checked here, and a proper error code can be returned.
+ *
+ * But the Read-as-Zero response cannot be verified here. A value of 0 may be
+ * correct in some cases, so callers must check that this correct is for the
+ * register/fields they need.
+ *
+ * For SMN writes, success can be determined through a "write and read back"
+ * However, this is not robust when done here.
+ *
+ * Possible issues:
+ *
+ * 1) Bits that are "Write-1-to-Clear". In this case, the read value should
+ * *not* match the write value.
+ *
+ * 2) Bits that are "Read-as-Zero"/"Writes-Ignored". This information cannot be
+ * known here.
+ *
+ * 3) Bits that are "Reserved / Set to 1". Ditto above.
+ *
+ * Callers of amd_smn_write() should do the "write and read back" check
+ * themselves, if needed.
+ *
+ * For #1, they can see if their target bits got cleared.
+ *
+ * For #2 and #3, they can check if their target bits got set as intended.
+ *
+ * This matches what is done for RDMSR/WRMSR. As long as there's no #GP, then
+ * the operation is considered a success, and the caller does their own
+ * checking.
+ */
+static int __amd_smn_rw(u16 node, u32 address, u32 *value, bool write)
+{
+ struct pci_dev *root;
+ int err = -ENODEV;
+
+ if (node >= amd_nb_num())
+ goto out;
+
+ root = node_to_amd_nb(node)->root;
+ if (!root)
+ goto out;
+
+ mutex_lock(&smn_mutex);
+
+ err = pci_write_config_dword(root, 0x60, address);
+ if (err) {
+ pr_warn("Error programming SMN address 0x%x.\n", address);
+ goto out_unlock;
+ }
+
+ err = (write ? pci_write_config_dword(root, 0x64, *value)
+ : pci_read_config_dword(root, 0x64, value));
+
+out_unlock:
+ mutex_unlock(&smn_mutex);
+
+out:
+ return err;
+}
+
+int __must_check amd_smn_read(u16 node, u32 address, u32 *value)
+{
+ int err = __amd_smn_rw(node, address, value, false);
+
+ if (PCI_POSSIBLE_ERROR(*value)) {
+ err = -ENODEV;
+ *value = 0;
+ }
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(amd_smn_read);
+
+int __must_check amd_smn_write(u16 node, u32 address, u32 value)
+{
+ return __amd_smn_rw(node, address, &value, true);
+}
+EXPORT_SYMBOL_GPL(amd_smn_write);
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index 98a9bb92d75c..e8ea627c22b4 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -9,7 +9,7 @@
#include <linux/pci.h>
#include <linux/suspend.h>
#include <linux/vgaarb.h>
-#include <asm/amd_nb.h>
+#include <asm/amd_smn.h>
#include <asm/hpet.h>
#include <asm/pci_x86.h>
@@ -828,7 +828,7 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7910, rs690_fix_64bit_dma);
#endif
-#ifdef CONFIG_AMD_NB
+#ifdef CONFIG_AMD_SMN
#define AMD_15B8_RCC_DEV2_EPF0_STRAP2 0x10136008
#define AMD_15B8_RCC_DEV2_EPF0_STRAP2_NO_SOFT_RESET_DEV2_F0_MASK 0x00000080L
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 81af6c344d6b..acdd920b8769 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -78,6 +78,7 @@ config EDAC_GHES
config EDAC_AMD64
tristate "AMD64 (Opteron, Athlon64)"
depends on AMD_NB && EDAC_DECODE_MCE
+ depends on AMD_SMN
imply AMD_ATL
help
Support for error detection and correction of DRAM ECC errors on
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index ddfbdb66b794..b0ee0593137b 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2,6 +2,7 @@
#include <linux/ras.h>
#include "amd64_edac.h"
#include <asm/amd_nb.h>
+#include <asm/amd_smn.h>
static struct edac_pci_ctl_info *pci_ctl;
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 08a3c863f80a..bf2bd7098f2c 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -324,7 +324,7 @@ config SENSORS_K8TEMP
config SENSORS_K10TEMP
tristate "AMD Family 10h+ temperature sensor"
- depends on X86 && PCI && AMD_NB
+ depends on X86 && PCI && AMD_SMN
help
If you say yes here you get support for the temperature
sensor(s) inside your CPU. Supported are later revisions of
diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c
index 7dc19c5d62ac..1a16640d2fa1 100644
--- a/drivers/hwmon/k10temp.c
+++ b/drivers/hwmon/k10temp.c
@@ -20,7 +20,7 @@
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/pci_ids.h>
-#include <asm/amd_nb.h>
+#include <asm/amd_smn.h>
#include <asm/processor.h>
MODULE_DESCRIPTION("AMD Family 10h+ CPU core temperature monitor");
diff --git a/drivers/platform/x86/amd/pmc/Kconfig b/drivers/platform/x86/amd/pmc/Kconfig
index 94f9563d8be7..d9cae227aff9 100644
--- a/drivers/platform/x86/amd/pmc/Kconfig
+++ b/drivers/platform/x86/amd/pmc/Kconfig
@@ -5,7 +5,7 @@
config AMD_PMC
tristate "AMD SoC PMC driver"
- depends on ACPI && PCI && RTC_CLASS && AMD_NB
+ depends on ACPI && PCI && RTC_CLASS && AMD_SMN
depends on SUSPEND
select SERIO
help
diff --git a/drivers/platform/x86/amd/pmc/pmc.c b/drivers/platform/x86/amd/pmc/pmc.c
index bbb8edb62e00..663cf8d671c7 100644
--- a/drivers/platform/x86/amd/pmc/pmc.c
+++ b/drivers/platform/x86/amd/pmc/pmc.c
@@ -10,7 +10,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-#include <asm/amd_nb.h>
+#include <asm/amd_smn.h>
#include <linux/acpi.h>
#include <linux/bitfield.h>
#include <linux/bits.h>
diff --git a/drivers/platform/x86/amd/pmf/Kconfig b/drivers/platform/x86/amd/pmf/Kconfig
index f4fa8bd8bda8..67913f64faf3 100644
--- a/drivers/platform/x86/amd/pmf/Kconfig
+++ b/drivers/platform/x86/amd/pmf/Kconfig
@@ -7,7 +7,7 @@ config AMD_PMF
tristate "AMD Platform Management Framework"
depends on ACPI && PCI
depends on POWER_SUPPLY
- depends on AMD_NB
+ depends on AMD_SMN
select ACPI_PLATFORM_PROFILE
depends on TEE && AMDTEE
depends on AMD_SFH_HID
diff --git a/drivers/platform/x86/amd/pmf/core.c b/drivers/platform/x86/amd/pmf/core.c
index d6af0ca036f1..21f9ec19941b 100644
--- a/drivers/platform/x86/amd/pmf/core.c
+++ b/drivers/platform/x86/amd/pmf/core.c
@@ -8,7 +8,7 @@
* Author: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
*/
-#include <asm/amd_nb.h>
+#include <asm/amd_smn.h>
#include <linux/debugfs.h>
#include <linux/iopoll.h>
#include <linux/module.h>
diff --git a/drivers/ras/amd/atl/Kconfig b/drivers/ras/amd/atl/Kconfig
index 551680073e43..819ac593e743 100644
--- a/drivers/ras/amd/atl/Kconfig
+++ b/drivers/ras/amd/atl/Kconfig
@@ -10,6 +10,7 @@
config AMD_ATL
tristate "AMD Address Translation Library"
depends on AMD_NB && X86_64 && RAS
+ depends on AMD_SMN
depends on MEMORY_FAILURE
default N
help
diff --git a/drivers/ras/amd/atl/internal.h b/drivers/ras/amd/atl/internal.h
index 143d04c779a8..49cd3e95e0c6 100644
--- a/drivers/ras/amd/atl/internal.h
+++ b/drivers/ras/amd/atl/internal.h
@@ -18,6 +18,7 @@
#include <linux/ras.h>
#include <asm/amd_nb.h>
+#include <asm/amd_smn.h>
#include "reg_fields.h"
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 11/16] x86/amd_smn: Fixup __amd_smn_rw()
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (9 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 10/16] x86/amd_nb: Move SMN access code to a new amd_smn driver Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-11-04 14:32 ` Borislav Petkov
2024-10-23 17:21 ` [PATCH 12/16] x86/amd_smn: Remove dependency on AMD_NB Yazen Ghannam
` (5 subsequent siblings)
16 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam, Tom Lendacky
Use guard(mutex) and convert PCI error codes to common ones.
Suggested-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/kernel/amd_smn.c | 14 +++++---------
1 file changed, 5 insertions(+), 9 deletions(-)
diff --git a/arch/x86/kernel/amd_smn.c b/arch/x86/kernel/amd_smn.c
index 06160a9e2444..e53db07d1a77 100644
--- a/arch/x86/kernel/amd_smn.c
+++ b/arch/x86/kernel/amd_smn.c
@@ -57,28 +57,24 @@ static int __amd_smn_rw(u16 node, u32 address, u32 *value, bool write)
int err = -ENODEV;
if (node >= amd_nb_num())
- goto out;
+ return err;
root = node_to_amd_nb(node)->root;
if (!root)
- goto out;
+ return err;
- mutex_lock(&smn_mutex);
+ guard(mutex)(&smn_mutex);
err = pci_write_config_dword(root, 0x60, address);
if (err) {
pr_warn("Error programming SMN address 0x%x.\n", address);
- goto out_unlock;
+ return pcibios_err_to_errno(err);
}
err = (write ? pci_write_config_dword(root, 0x64, *value)
: pci_read_config_dword(root, 0x64, value));
-out_unlock:
- mutex_unlock(&smn_mutex);
-
-out:
- return err;
+ return pcibios_err_to_errno(err);
}
int __must_check amd_smn_read(u16 node, u32 address, u32 *value)
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 12/16] x86/amd_smn: Remove dependency on AMD_NB
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (10 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 11/16] x86/amd_smn: Fixup __amd_smn_rw() Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-23 17:21 ` [PATCH 13/16] x86/amd_smn: Use defines for register offsets Yazen Ghannam
` (4 subsequent siblings)
16 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
Cache the root devices locally so that there are no more dependencies on
AMD_NB.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/kernel/amd_smn.c | 42 ++++++++++++++++++++++++++++++++++++---
1 file changed, 39 insertions(+), 3 deletions(-)
diff --git a/arch/x86/kernel/amd_smn.c b/arch/x86/kernel/amd_smn.c
index e53db07d1a77..45cf86cd9014 100644
--- a/arch/x86/kernel/amd_smn.c
+++ b/arch/x86/kernel/amd_smn.c
@@ -8,9 +8,10 @@
#include <linux/pci.h>
-#include <asm/amd_nb.h>
#include <asm/amd_smn.h>
+static struct pci_dev **amd_roots;
+
/* Protect the PCI config register pairs used for SMN. */
static DEFINE_MUTEX(smn_mutex);
@@ -56,10 +57,10 @@ static int __amd_smn_rw(u16 node, u32 address, u32 *value, bool write)
struct pci_dev *root;
int err = -ENODEV;
- if (node >= amd_nb_num())
+ if (node >= amd_num_nodes())
return err;
- root = node_to_amd_nb(node)->root;
+ root = amd_roots[node];
if (!root)
return err;
@@ -95,3 +96,38 @@ int __must_check amd_smn_write(u16 node, u32 address, u32 value)
return __amd_smn_rw(node, address, &value, true);
}
EXPORT_SYMBOL_GPL(amd_smn_write);
+
+static int amd_cache_roots(void)
+{
+ u16 node, num_nodes = amd_num_nodes();
+
+ amd_roots = kcalloc(num_nodes, sizeof(struct pci_dev *), GFP_KERNEL);
+ if (!amd_roots)
+ return -ENOMEM;
+
+ for (node = 0; node < num_nodes; node++)
+ amd_roots[node] = amd_node_get_root(node);
+
+ return 0;
+}
+
+static int __init amd_smn_init(void)
+{
+ int err;
+
+ if (!cpu_feature_enabled(X86_FEATURE_ZEN))
+ return 0;
+
+ guard(mutex)(&smn_mutex);
+
+ if (amd_roots)
+ return 0;
+
+ err = amd_cache_roots();
+ if (err)
+ return err;
+
+ return 0;
+}
+
+fs_initcall(amd_smn_init);
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 13/16] x86/amd_smn: Use defines for register offsets
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (11 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 12/16] x86/amd_smn: Remove dependency on AMD_NB Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-23 17:21 ` [PATCH 14/16] x86/amd_smn, platform/x86/amd/hsmp: Have HSMP use SMN Yazen Ghannam
` (3 subsequent siblings)
16 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
There are more than one SMN index/data pair available for software use.
The register offsets are different, but the protocol is the same.
Use defines for the SMN offset values and allow the index/data offsets
to be passed to the read/write helper function.
This eases code reuse with other SMN users in the kernel.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/kernel/amd_smn.c | 15 +++++++++------
1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/arch/x86/kernel/amd_smn.c b/arch/x86/kernel/amd_smn.c
index 45cf86cd9014..997fd3edd9c0 100644
--- a/arch/x86/kernel/amd_smn.c
+++ b/arch/x86/kernel/amd_smn.c
@@ -15,6 +15,9 @@ static struct pci_dev **amd_roots;
/* Protect the PCI config register pairs used for SMN. */
static DEFINE_MUTEX(smn_mutex);
+#define SMN_INDEX_OFFSET 0x60
+#define SMN_DATA_OFFSET 0x64
+
/*
* SMN accesses may fail in ways that are difficult to detect here in the called
* functions amd_smn_read() and amd_smn_write(). Therefore, callers must do
@@ -52,7 +55,7 @@ static DEFINE_MUTEX(smn_mutex);
* the operation is considered a success, and the caller does their own
* checking.
*/
-static int __amd_smn_rw(u16 node, u32 address, u32 *value, bool write)
+static int __amd_smn_rw(u8 i_off, u8 d_off, u16 node, u32 address, u32 *value, bool write)
{
struct pci_dev *root;
int err = -ENODEV;
@@ -66,21 +69,21 @@ static int __amd_smn_rw(u16 node, u32 address, u32 *value, bool write)
guard(mutex)(&smn_mutex);
- err = pci_write_config_dword(root, 0x60, address);
+ err = pci_write_config_dword(root, i_off, address);
if (err) {
pr_warn("Error programming SMN address 0x%x.\n", address);
return pcibios_err_to_errno(err);
}
- err = (write ? pci_write_config_dword(root, 0x64, *value)
- : pci_read_config_dword(root, 0x64, value));
+ err = (write ? pci_write_config_dword(root, d_off, *value)
+ : pci_read_config_dword(root, d_off, value));
return pcibios_err_to_errno(err);
}
int __must_check amd_smn_read(u16 node, u32 address, u32 *value)
{
- int err = __amd_smn_rw(node, address, value, false);
+ int err = __amd_smn_rw(SMN_INDEX_OFFSET, SMN_DATA_OFFSET, node, address, value, false);
if (PCI_POSSIBLE_ERROR(*value)) {
err = -ENODEV;
@@ -93,7 +96,7 @@ EXPORT_SYMBOL_GPL(amd_smn_read);
int __must_check amd_smn_write(u16 node, u32 address, u32 value)
{
- return __amd_smn_rw(node, address, &value, true);
+ return __amd_smn_rw(SMN_INDEX_OFFSET, SMN_DATA_OFFSET, node, address, &value, true);
}
EXPORT_SYMBOL_GPL(amd_smn_write);
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 14/16] x86/amd_smn, platform/x86/amd/hsmp: Have HSMP use SMN
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (12 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 13/16] x86/amd_smn: Use defines for register offsets Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-24 13:23 ` Ilpo Järvinen
2024-10-23 17:21 ` [PATCH 15/16] x86/amd_smn: Add SMN offsets to exclusive region access Yazen Ghannam
` (2 subsequent siblings)
16 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
The HSMP interface is just an SMN interface with different offsets.
Define an HSMP wrapper in the SMN code and have the HSMP platform driver
use that rather than a local solution.
Also, remove the "root" member from AMD_NB, since there are no more
users of it.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/include/asm/amd_nb.h | 1 -
arch/x86/include/asm/amd_smn.h | 3 +++
arch/x86/kernel/amd_nb.c | 1 -
arch/x86/kernel/amd_smn.c | 9 +++++++++
drivers/platform/x86/amd/Kconfig | 2 +-
drivers/platform/x86/amd/hsmp.c | 32 +++++---------------------------
6 files changed, 18 insertions(+), 30 deletions(-)
diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
index 55c03d3495bc..cbe31e316e39 100644
--- a/arch/x86/include/asm/amd_nb.h
+++ b/arch/x86/include/asm/amd_nb.h
@@ -27,7 +27,6 @@ struct amd_l3_cache {
};
struct amd_northbridge {
- struct pci_dev *root;
struct pci_dev *misc;
struct pci_dev *link;
struct amd_l3_cache l3_cache;
diff --git a/arch/x86/include/asm/amd_smn.h b/arch/x86/include/asm/amd_smn.h
index 6850de69f863..f0eb12859c42 100644
--- a/arch/x86/include/asm/amd_smn.h
+++ b/arch/x86/include/asm/amd_smn.h
@@ -8,4 +8,7 @@
int __must_check amd_smn_read(u16 node, u32 address, u32 *value);
int __must_check amd_smn_write(u16 node, u32 address, u32 value);
+/* Should only be used by the HSMP driver. */
+int __must_check amd_smn_hsmp_rdwr(u16 node, u32 address, u32 *value, bool write);
+
#endif /* _ASM_X86_AMD_SMN_H */
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 10cdeddeda02..4c22317a6dfe 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -73,7 +73,6 @@ static int amd_cache_northbridges(void)
amd_northbridges.nb = nb;
for (i = 0; i < amd_northbridges.num; i++) {
- node_to_amd_nb(i)->root = amd_node_get_root(i);
node_to_amd_nb(i)->misc = amd_node_get_func(i, 3);
node_to_amd_nb(i)->link = amd_node_get_func(i, 4);
}
diff --git a/arch/x86/kernel/amd_smn.c b/arch/x86/kernel/amd_smn.c
index 997fd3edd9c0..527dda8e3a2b 100644
--- a/arch/x86/kernel/amd_smn.c
+++ b/arch/x86/kernel/amd_smn.c
@@ -18,6 +18,9 @@ static DEFINE_MUTEX(smn_mutex);
#define SMN_INDEX_OFFSET 0x60
#define SMN_DATA_OFFSET 0x64
+#define HSMP_INDEX_OFFSET 0xc4
+#define HSMP_DATA_OFFSET 0xc8
+
/*
* SMN accesses may fail in ways that are difficult to detect here in the called
* functions amd_smn_read() and amd_smn_write(). Therefore, callers must do
@@ -100,6 +103,12 @@ int __must_check amd_smn_write(u16 node, u32 address, u32 value)
}
EXPORT_SYMBOL_GPL(amd_smn_write);
+int __must_check amd_smn_hsmp_rdwr(u16 node, u32 address, u32 *value, bool write)
+{
+ return __amd_smn_rw(HSMP_INDEX_OFFSET, HSMP_DATA_OFFSET, node, address, value, write);
+}
+EXPORT_SYMBOL_GPL(amd_smn_hsmp_rdwr);
+
static int amd_cache_roots(void)
{
u16 node, num_nodes = amd_num_nodes();
diff --git a/drivers/platform/x86/amd/Kconfig b/drivers/platform/x86/amd/Kconfig
index f88682d36447..e100b315c62b 100644
--- a/drivers/platform/x86/amd/Kconfig
+++ b/drivers/platform/x86/amd/Kconfig
@@ -8,7 +8,7 @@ source "drivers/platform/x86/amd/pmc/Kconfig"
config AMD_HSMP
tristate "AMD HSMP Driver"
- depends on AMD_NB && X86_64 && ACPI
+ depends on AMD_SMN && X86_64 && ACPI
help
The driver provides a way for user space tools to monitor and manage
system management functionality on EPYC server CPUs from AMD.
diff --git a/drivers/platform/x86/amd/hsmp.c b/drivers/platform/x86/amd/hsmp.c
index 8fcf38eed7f0..544efb0255c0 100644
--- a/drivers/platform/x86/amd/hsmp.c
+++ b/drivers/platform/x86/amd/hsmp.c
@@ -10,7 +10,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <asm/amd_hsmp.h>
-#include <asm/amd_nb.h>
+#include <asm/amd_smn.h>
#include <linux/delay.h>
#include <linux/io.h>
#include <linux/miscdevice.h>
@@ -48,9 +48,6 @@
#define SMN_HSMP_MSG_RESP 0x0010980
#define SMN_HSMP_MSG_DATA 0x00109E0
-#define HSMP_INDEX_REG 0xc4
-#define HSMP_DATA_REG 0xc8
-
#define HSMP_CDEV_NAME "hsmp_cdev"
#define HSMP_DEVNODE_NAME "hsmp"
#define HSMP_METRICS_TABLE_NAME "metrics_bin"
@@ -62,8 +59,6 @@
#define MSG_ARGOFF_STR "MsgArgOffset"
#define MSG_RESPOFF_STR "MsgRspOffset"
-#define MAX_AMD_SOCKETS 8
-
struct hsmp_mbaddr_info {
u32 base_addr;
u32 msg_id_off;
@@ -79,7 +74,6 @@ struct hsmp_socket {
void __iomem *virt_base_addr;
struct semaphore hsmp_sem;
char name[HSMP_ATTR_GRP_NAME_SIZE];
- struct pci_dev *root;
struct device *dev;
u16 sock_ind;
};
@@ -98,20 +92,7 @@ static struct hsmp_plat_device plat_dev;
static int amd_hsmp_pci_rdwr(struct hsmp_socket *sock, u32 offset,
u32 *value, bool write)
{
- int ret;
-
- if (!sock->root)
- return -ENODEV;
-
- ret = pci_write_config_dword(sock->root, HSMP_INDEX_REG,
- sock->mbinfo.base_addr + offset);
- if (ret)
- return ret;
-
- ret = (write ? pci_write_config_dword(sock->root, HSMP_DATA_REG, *value)
- : pci_read_config_dword(sock->root, HSMP_DATA_REG, value));
-
- return ret;
+ return amd_smn_hsmp_rdwr(sock->sock_ind, sock->mbinfo.base_addr + offset, value, write);
}
static void amd_hsmp_acpi_rdwr(struct hsmp_socket *sock, u32 offset,
@@ -749,10 +730,7 @@ static int init_platform_device(struct device *dev)
int ret, i;
for (i = 0; i < plat_dev.num_sockets; i++) {
- if (!node_to_amd_nb(i))
- return -ENODEV;
sock = &plat_dev.sock[i];
- sock->root = node_to_amd_nb(i)->root;
sock->sock_ind = i;
sock->dev = dev;
sock->mbinfo.base_addr = SMN_HSMP_BASE;
@@ -946,11 +924,11 @@ static int __init hsmp_plt_init(void)
int ret = -ENODEV;
/*
- * amd_nb_num() returns number of SMN/DF interfaces present in the system
+ * amd_num_nodes() returns number of SMN/DF interfaces present in the system
* if we have N SMN/DF interfaces that ideally means N sockets
*/
- plat_dev.num_sockets = amd_nb_num();
- if (plat_dev.num_sockets == 0 || plat_dev.num_sockets > MAX_AMD_SOCKETS)
+ plat_dev.num_sockets = amd_num_nodes();
+ if (plat_dev.num_sockets == 0 || plat_dev.num_sockets > MAX_AMD_NUM_NODES)
return ret;
ret = platform_driver_register(&amd_hsmp_driver);
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 15/16] x86/amd_smn: Add SMN offsets to exclusive region access
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (13 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 14/16] x86/amd_smn, platform/x86/amd/hsmp: Have HSMP use SMN Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-10-23 17:21 ` [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers Yazen Ghannam
2024-10-23 17:59 ` [PATCH 00/16] AMD NB and SMN rework Bjorn Helgaas
16 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
From: Mario Limonciello <mario.limonciello@amd.com>
Offsets 0x60 and 0x64 are used internally by kernel drivers that call
the amd_smn_read() and amd_smn_write() functions. If userspace accesses
the regions at the same time as the kernel it may cause malfunctions in
drivers using the offsets.
Add these offsets to the exclusions so that the kernel is tainted if a
non locked down userspace tries to access them.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/kernel/amd_smn.c | 41 +++++++++++++++++++++++++++++++++++++++
1 file changed, 41 insertions(+)
diff --git a/arch/x86/kernel/amd_smn.c b/arch/x86/kernel/amd_smn.c
index 527dda8e3a2b..aeebd63234f9 100644
--- a/arch/x86/kernel/amd_smn.c
+++ b/arch/x86/kernel/amd_smn.c
@@ -14,6 +14,7 @@ static struct pci_dev **amd_roots;
/* Protect the PCI config register pairs used for SMN. */
static DEFINE_MUTEX(smn_mutex);
+static bool smn_exclusive;
#define SMN_INDEX_OFFSET 0x60
#define SMN_DATA_OFFSET 0x64
@@ -72,6 +73,9 @@ static int __amd_smn_rw(u8 i_off, u8 d_off, u16 node, u32 address, u32 *value, b
guard(mutex)(&smn_mutex);
+ if (!smn_exclusive)
+ return err;
+
err = pci_write_config_dword(root, i_off, address);
if (err) {
pr_warn("Error programming SMN address 0x%x.\n", address);
@@ -123,6 +127,39 @@ static int amd_cache_roots(void)
return 0;
}
+static int reserve_root_config_spaces(void)
+{
+ struct pci_dev *root = NULL;
+ struct pci_bus *bus = NULL;
+
+ while ((bus = pci_find_next_bus(bus))) {
+ /* Root device is Device 0 Function 0 on each Primary Bus. */
+ root = pci_get_slot(bus, 0);
+ if (!root)
+ continue;
+
+ if (root->vendor != PCI_VENDOR_ID_AMD &&
+ root->vendor != PCI_VENDOR_ID_HYGON)
+ continue;
+
+ pci_dbg(root, "Reserving PCI config space\n");
+
+ /*
+ * There are a few SMN index/data pairs and other registers
+ * that shouldn't be accessed by user space.
+ * So reserve the entire PCI config space for simplicity rather
+ * than covering specific registers piecemeal.
+ */
+ if (!pci_request_config_region_exclusive(root, 0, PCI_CFG_SPACE_SIZE, NULL)) {
+ pci_err(root, "Failed to reserve config space\n");
+ return -EEXIST;
+ }
+ }
+
+ smn_exclusive = true;
+ return 0;
+}
+
static int __init amd_smn_init(void)
{
int err;
@@ -139,6 +176,10 @@ static int __init amd_smn_init(void)
if (err)
return err;
+ err = reserve_root_config_spaces();
+ if (err)
+ return err;
+
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (14 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 15/16] x86/amd_smn: Add SMN offsets to exclusive region access Yazen Ghannam
@ 2024-10-23 17:21 ` Yazen Ghannam
2024-11-05 19:21 ` Borislav Petkov
2024-10-23 17:59 ` [PATCH 00/16] AMD NB and SMN rework Bjorn Helgaas
16 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-23 17:21 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev, Yazen Ghannam
From: Mario Limonciello <mario.limonciello@amd.com>
There are certain registers on AMD Zen systems that can only be
accessed through SMN.
Introduce a new interface that provides debugfs files for accessing SMN.
As this introduces the capability for userspace to manipulate the
hardware in unpredictable ways, taint the kernel when writing.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
arch/x86/kernel/amd_smn.c | 83 +++++++++++++++++++++++++++++++++++++++
1 file changed, 83 insertions(+)
diff --git a/arch/x86/kernel/amd_smn.c b/arch/x86/kernel/amd_smn.c
index aeebd63234f9..618d9a05f3d7 100644
--- a/arch/x86/kernel/amd_smn.c
+++ b/arch/x86/kernel/amd_smn.c
@@ -7,6 +7,7 @@
#define pr_fmt(fmt) "amd_smn: " fmt
#include <linux/pci.h>
+#include <linux/debugfs.h>
#include <asm/amd_smn.h>
@@ -113,6 +114,82 @@ int __must_check amd_smn_hsmp_rdwr(u16 node, u32 address, u32 *value, bool write
}
EXPORT_SYMBOL_GPL(amd_smn_hsmp_rdwr);
+static struct dentry *debugfs_dir;
+static u16 debug_node;
+static u32 debug_address;
+
+static ssize_t smn_node_write(struct file *file, const char __user *userbuf,
+ size_t count, loff_t *ppos)
+{
+ int ret;
+
+ ret = kstrtou16_from_user(userbuf, count, 0, &debug_node);
+ if (ret)
+ return ret;
+
+ return count;
+}
+
+static int smn_node_show(struct seq_file *m, void *v)
+{
+ seq_printf(m, "0x%08x\n", debug_node);
+ return 0;
+}
+
+static ssize_t smn_address_write(struct file *file, const char __user *userbuf,
+ size_t count, loff_t *ppos)
+{
+ int ret;
+
+ ret = kstrtouint_from_user(userbuf, count, 0, &debug_address);
+ if (ret)
+ return ret;
+
+ return count;
+}
+
+static int smn_address_show(struct seq_file *m, void *v)
+{
+ seq_printf(m, "0x%08x\n", debug_address);
+ return 0;
+}
+
+static int smn_value_show(struct seq_file *m, void *v)
+{
+ u32 val;
+ int ret;
+
+ ret = amd_smn_read(debug_node, debug_address, &val);
+ if (ret)
+ return ret;
+
+ seq_printf(m, "0x%08x\n", val);
+ return 0;
+}
+
+static ssize_t smn_value_write(struct file *file, const char __user *userbuf,
+ size_t count, loff_t *ppos)
+{
+ u32 val;
+ int ret;
+
+ ret = kstrtouint_from_user(userbuf, count, 0, &val);
+ if (ret)
+ return ret;
+
+ add_taint(TAINT_USER, LOCKDEP_STILL_OK);
+
+ ret = amd_smn_write(debug_node, debug_address, val);
+ if (ret)
+ return ret;
+
+ return count;
+}
+
+DEFINE_SHOW_STORE_ATTRIBUTE(smn_node);
+DEFINE_SHOW_STORE_ATTRIBUTE(smn_address);
+DEFINE_SHOW_STORE_ATTRIBUTE(smn_value);
+
static int amd_cache_roots(void)
{
u16 node, num_nodes = amd_num_nodes();
@@ -180,6 +257,12 @@ static int __init amd_smn_init(void)
if (err)
return err;
+ debugfs_dir = debugfs_create_dir("amd_smn", arch_debugfs_dir);
+
+ debugfs_create_file("node", 0600, debugfs_dir, NULL, &smn_node_fops);
+ debugfs_create_file("address", 0600, debugfs_dir, NULL, &smn_address_fops);
+ debugfs_create_file("value", 0600, debugfs_dir, NULL, &smn_value_fops);
+
return 0;
}
--
2.43.0
^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: [PATCH 00/16] AMD NB and SMN rework
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
` (15 preceding siblings ...)
2024-10-23 17:21 ` [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers Yazen Ghannam
@ 2024-10-23 17:59 ` Bjorn Helgaas
2024-10-24 16:01 ` Yazen Ghannam
16 siblings, 1 reply; 65+ messages in thread
From: Bjorn Helgaas @ 2024-10-23 17:59 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
> Hi all,
>
> The theme of this set is decoupling the "AMD node" concept from the
> legacy northbridge support.
>
> Additionally, AMD System Management Network (SMN) access code is
> decoupled and expanded too.
>
> Patches 1-3 begin reducing the scope of AMD_NB.
>
> Patches 4-9 begin moving generic AMD node support out of AMD_NB.
>
> Patches 10-13 move SMN support out of AMD_NB and do some refactoring.
>
> Patch 14 has HSMP reuse SMN functionality.
>
> Patches 15-16 address userspace access to SMN.
>
> I say "begin" above because there is more to do here. Ultimately, AMD_NB
> should only be needed for code used on legacy systems with northbridges.
> Also, any and all SMN users in the kernel need to be updated to use the
> central SMN code. Local solutions should be avoided.
Glad to see many of the PCI device IDs going away; thanks for working
on that!
The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
ideal since all those pci_get_*() interfaces are kind of ugly in my
opinion, and using them means we have to encode topology details in
the kernel. But this still seems like a big improvement.
Bjorn
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 14/16] x86/amd_smn, platform/x86/amd/hsmp: Have HSMP use SMN
2024-10-23 17:21 ` [PATCH 14/16] x86/amd_smn, platform/x86/amd/hsmp: Have HSMP use SMN Yazen Ghannam
@ 2024-10-24 13:23 ` Ilpo Järvinen
2024-10-24 16:06 ` Yazen Ghannam
0 siblings, 1 reply; 65+ messages in thread
From: Ilpo Järvinen @ 2024-10-24 13:23 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, LKML, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, Hans de Goede, linux-pci, linux-hwmon,
platform-driver-x86, naveenkrishna.chatradhi, carlos.bilbao.osdev
On Wed, 23 Oct 2024, Yazen Ghannam wrote:
> The HSMP interface is just an SMN interface with different offsets.
>
> Define an HSMP wrapper in the SMN code and have the HSMP platform driver
> use that rather than a local solution.
>
> Also, remove the "root" member from AMD_NB, since there are no more
> users of it.
>
> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> ---
> arch/x86/include/asm/amd_nb.h | 1 -
> arch/x86/include/asm/amd_smn.h | 3 +++
> arch/x86/kernel/amd_nb.c | 1 -
> arch/x86/kernel/amd_smn.c | 9 +++++++++
> drivers/platform/x86/amd/Kconfig | 2 +-
> drivers/platform/x86/amd/hsmp.c | 32 +++++---------------------------
> 6 files changed, 18 insertions(+), 30 deletions(-)
>
> diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
> index 55c03d3495bc..cbe31e316e39 100644
> --- a/arch/x86/include/asm/amd_nb.h
> +++ b/arch/x86/include/asm/amd_nb.h
> @@ -27,7 +27,6 @@ struct amd_l3_cache {
> };
>
> struct amd_northbridge {
> - struct pci_dev *root;
> struct pci_dev *misc;
> struct pci_dev *link;
> struct amd_l3_cache l3_cache;
> diff --git a/arch/x86/include/asm/amd_smn.h b/arch/x86/include/asm/amd_smn.h
> index 6850de69f863..f0eb12859c42 100644
> --- a/arch/x86/include/asm/amd_smn.h
> +++ b/arch/x86/include/asm/amd_smn.h
> @@ -8,4 +8,7 @@
> int __must_check amd_smn_read(u16 node, u32 address, u32 *value);
> int __must_check amd_smn_write(u16 node, u32 address, u32 value);
>
> +/* Should only be used by the HSMP driver. */
> +int __must_check amd_smn_hsmp_rdwr(u16 node, u32 address, u32 *value, bool write);
> +
> #endif /* _ASM_X86_AMD_SMN_H */
> diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
> index 10cdeddeda02..4c22317a6dfe 100644
> --- a/arch/x86/kernel/amd_nb.c
> +++ b/arch/x86/kernel/amd_nb.c
> @@ -73,7 +73,6 @@ static int amd_cache_northbridges(void)
> amd_northbridges.nb = nb;
>
> for (i = 0; i < amd_northbridges.num; i++) {
> - node_to_amd_nb(i)->root = amd_node_get_root(i);
> node_to_amd_nb(i)->misc = amd_node_get_func(i, 3);
> node_to_amd_nb(i)->link = amd_node_get_func(i, 4);
> }
> diff --git a/arch/x86/kernel/amd_smn.c b/arch/x86/kernel/amd_smn.c
> index 997fd3edd9c0..527dda8e3a2b 100644
> --- a/arch/x86/kernel/amd_smn.c
> +++ b/arch/x86/kernel/amd_smn.c
> @@ -18,6 +18,9 @@ static DEFINE_MUTEX(smn_mutex);
> #define SMN_INDEX_OFFSET 0x60
> #define SMN_DATA_OFFSET 0x64
>
> +#define HSMP_INDEX_OFFSET 0xc4
> +#define HSMP_DATA_OFFSET 0xc8
> +
> /*
> * SMN accesses may fail in ways that are difficult to detect here in the called
> * functions amd_smn_read() and amd_smn_write(). Therefore, callers must do
> @@ -100,6 +103,12 @@ int __must_check amd_smn_write(u16 node, u32 address, u32 value)
> }
> EXPORT_SYMBOL_GPL(amd_smn_write);
>
> +int __must_check amd_smn_hsmp_rdwr(u16 node, u32 address, u32 *value, bool write)
> +{
> + return __amd_smn_rw(HSMP_INDEX_OFFSET, HSMP_DATA_OFFSET, node, address, value, write);
> +}
> +EXPORT_SYMBOL_GPL(amd_smn_hsmp_rdwr);
> +
> static int amd_cache_roots(void)
> {
> u16 node, num_nodes = amd_num_nodes();
> diff --git a/drivers/platform/x86/amd/Kconfig b/drivers/platform/x86/amd/Kconfig
> index f88682d36447..e100b315c62b 100644
> --- a/drivers/platform/x86/amd/Kconfig
> +++ b/drivers/platform/x86/amd/Kconfig
> @@ -8,7 +8,7 @@ source "drivers/platform/x86/amd/pmc/Kconfig"
>
> config AMD_HSMP
> tristate "AMD HSMP Driver"
> - depends on AMD_NB && X86_64 && ACPI
> + depends on AMD_SMN && X86_64 && ACPI
> help
> The driver provides a way for user space tools to monitor and manage
> system management functionality on EPYC server CPUs from AMD.
> diff --git a/drivers/platform/x86/amd/hsmp.c b/drivers/platform/x86/amd/hsmp.c
> index 8fcf38eed7f0..544efb0255c0 100644
> --- a/drivers/platform/x86/amd/hsmp.c
> +++ b/drivers/platform/x86/amd/hsmp.c
FYI, there has been major restructuring done for this driver in
pdx86/for-next.
--
i.
> @@ -10,7 +10,7 @@
> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>
> #include <asm/amd_hsmp.h>
> -#include <asm/amd_nb.h>
> +#include <asm/amd_smn.h>
> #include <linux/delay.h>
> #include <linux/io.h>
> #include <linux/miscdevice.h>
> @@ -48,9 +48,6 @@
> #define SMN_HSMP_MSG_RESP 0x0010980
> #define SMN_HSMP_MSG_DATA 0x00109E0
>
> -#define HSMP_INDEX_REG 0xc4
> -#define HSMP_DATA_REG 0xc8
> -
> #define HSMP_CDEV_NAME "hsmp_cdev"
> #define HSMP_DEVNODE_NAME "hsmp"
> #define HSMP_METRICS_TABLE_NAME "metrics_bin"
> @@ -62,8 +59,6 @@
> #define MSG_ARGOFF_STR "MsgArgOffset"
> #define MSG_RESPOFF_STR "MsgRspOffset"
>
> -#define MAX_AMD_SOCKETS 8
> -
> struct hsmp_mbaddr_info {
> u32 base_addr;
> u32 msg_id_off;
> @@ -79,7 +74,6 @@ struct hsmp_socket {
> void __iomem *virt_base_addr;
> struct semaphore hsmp_sem;
> char name[HSMP_ATTR_GRP_NAME_SIZE];
> - struct pci_dev *root;
> struct device *dev;
> u16 sock_ind;
> };
> @@ -98,20 +92,7 @@ static struct hsmp_plat_device plat_dev;
> static int amd_hsmp_pci_rdwr(struct hsmp_socket *sock, u32 offset,
> u32 *value, bool write)
> {
> - int ret;
> -
> - if (!sock->root)
> - return -ENODEV;
> -
> - ret = pci_write_config_dword(sock->root, HSMP_INDEX_REG,
> - sock->mbinfo.base_addr + offset);
> - if (ret)
> - return ret;
> -
> - ret = (write ? pci_write_config_dword(sock->root, HSMP_DATA_REG, *value)
> - : pci_read_config_dword(sock->root, HSMP_DATA_REG, value));
> -
> - return ret;
> + return amd_smn_hsmp_rdwr(sock->sock_ind, sock->mbinfo.base_addr + offset, value, write);
> }
>
> static void amd_hsmp_acpi_rdwr(struct hsmp_socket *sock, u32 offset,
> @@ -749,10 +730,7 @@ static int init_platform_device(struct device *dev)
> int ret, i;
>
> for (i = 0; i < plat_dev.num_sockets; i++) {
> - if (!node_to_amd_nb(i))
> - return -ENODEV;
> sock = &plat_dev.sock[i];
> - sock->root = node_to_amd_nb(i)->root;
> sock->sock_ind = i;
> sock->dev = dev;
> sock->mbinfo.base_addr = SMN_HSMP_BASE;
> @@ -946,11 +924,11 @@ static int __init hsmp_plt_init(void)
> int ret = -ENODEV;
>
> /*
> - * amd_nb_num() returns number of SMN/DF interfaces present in the system
> + * amd_num_nodes() returns number of SMN/DF interfaces present in the system
> * if we have N SMN/DF interfaces that ideally means N sockets
> */
> - plat_dev.num_sockets = amd_nb_num();
> - if (plat_dev.num_sockets == 0 || plat_dev.num_sockets > MAX_AMD_SOCKETS)
> + plat_dev.num_sockets = amd_num_nodes();
> + if (plat_dev.num_sockets == 0 || plat_dev.num_sockets > MAX_AMD_NUM_NODES)
> return ret;
>
> ret = platform_driver_register(&amd_hsmp_driver);
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 00/16] AMD NB and SMN rework
2024-10-23 17:59 ` [PATCH 00/16] AMD NB and SMN rework Bjorn Helgaas
@ 2024-10-24 16:01 ` Yazen Ghannam
2024-10-24 17:46 ` Bjorn Helgaas
0 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-24 16:01 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
> On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
> > Hi all,
> >
> > The theme of this set is decoupling the "AMD node" concept from the
> > legacy northbridge support.
> >
> > Additionally, AMD System Management Network (SMN) access code is
> > decoupled and expanded too.
> >
> > Patches 1-3 begin reducing the scope of AMD_NB.
> >
> > Patches 4-9 begin moving generic AMD node support out of AMD_NB.
> >
> > Patches 10-13 move SMN support out of AMD_NB and do some refactoring.
> >
> > Patch 14 has HSMP reuse SMN functionality.
> >
> > Patches 15-16 address userspace access to SMN.
> >
> > I say "begin" above because there is more to do here. Ultimately, AMD_NB
> > should only be needed for code used on legacy systems with northbridges.
> > Also, any and all SMN users in the kernel need to be updated to use the
> > central SMN code. Local solutions should be avoided.
>
> Glad to see many of the PCI device IDs going away; thanks for working
> on that!
>
> The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
> ideal since all those pci_get_*() interfaces are kind of ugly in my
> opinion, and using them means we have to encode topology details in
> the kernel. But this still seems like a big improvement.
>
Thanks for the feedback. Hopefully, we'll come to some improved
solution. :)
Can you please elaborate on your concern? Is it about saying "thing X is
always at SBDF A:B:C.D" or something else?
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 14/16] x86/amd_smn, platform/x86/amd/hsmp: Have HSMP use SMN
2024-10-24 13:23 ` Ilpo Järvinen
@ 2024-10-24 16:06 ` Yazen Ghannam
2024-10-25 13:39 ` Ilpo Järvinen
0 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-24 16:06 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: linux-edac, LKML, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, Hans de Goede, linux-pci, linux-hwmon,
platform-driver-x86, naveenkrishna.chatradhi, carlos.bilbao.osdev
On Thu, Oct 24, 2024 at 04:23:55PM +0300, Ilpo Järvinen wrote:
> On Wed, 23 Oct 2024, Yazen Ghannam wrote:
>
> > The HSMP interface is just an SMN interface with different offsets.
> >
> > Define an HSMP wrapper in the SMN code and have the HSMP platform driver
> > use that rather than a local solution.
> >
> > Also, remove the "root" member from AMD_NB, since there are no more
> > users of it.
> >
> > Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> > ---
> > arch/x86/include/asm/amd_nb.h | 1 -
> > arch/x86/include/asm/amd_smn.h | 3 +++
> > arch/x86/kernel/amd_nb.c | 1 -
> > arch/x86/kernel/amd_smn.c | 9 +++++++++
> > drivers/platform/x86/amd/Kconfig | 2 +-
> > drivers/platform/x86/amd/hsmp.c | 32 +++++---------------------------
> > 6 files changed, 18 insertions(+), 30 deletions(-)
> >
> > diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
> > index 55c03d3495bc..cbe31e316e39 100644
> > --- a/arch/x86/include/asm/amd_nb.h
> > +++ b/arch/x86/include/asm/amd_nb.h
> > @@ -27,7 +27,6 @@ struct amd_l3_cache {
> > };
> >
> > struct amd_northbridge {
> > - struct pci_dev *root;
> > struct pci_dev *misc;
> > struct pci_dev *link;
> > struct amd_l3_cache l3_cache;
> > diff --git a/arch/x86/include/asm/amd_smn.h b/arch/x86/include/asm/amd_smn.h
> > index 6850de69f863..f0eb12859c42 100644
> > --- a/arch/x86/include/asm/amd_smn.h
> > +++ b/arch/x86/include/asm/amd_smn.h
> > @@ -8,4 +8,7 @@
> > int __must_check amd_smn_read(u16 node, u32 address, u32 *value);
> > int __must_check amd_smn_write(u16 node, u32 address, u32 value);
> >
> > +/* Should only be used by the HSMP driver. */
> > +int __must_check amd_smn_hsmp_rdwr(u16 node, u32 address, u32 *value, bool write);
> > +
> > #endif /* _ASM_X86_AMD_SMN_H */
> > diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
> > index 10cdeddeda02..4c22317a6dfe 100644
> > --- a/arch/x86/kernel/amd_nb.c
> > +++ b/arch/x86/kernel/amd_nb.c
> > @@ -73,7 +73,6 @@ static int amd_cache_northbridges(void)
> > amd_northbridges.nb = nb;
> >
> > for (i = 0; i < amd_northbridges.num; i++) {
> > - node_to_amd_nb(i)->root = amd_node_get_root(i);
> > node_to_amd_nb(i)->misc = amd_node_get_func(i, 3);
> > node_to_amd_nb(i)->link = amd_node_get_func(i, 4);
> > }
> > diff --git a/arch/x86/kernel/amd_smn.c b/arch/x86/kernel/amd_smn.c
> > index 997fd3edd9c0..527dda8e3a2b 100644
> > --- a/arch/x86/kernel/amd_smn.c
> > +++ b/arch/x86/kernel/amd_smn.c
> > @@ -18,6 +18,9 @@ static DEFINE_MUTEX(smn_mutex);
> > #define SMN_INDEX_OFFSET 0x60
> > #define SMN_DATA_OFFSET 0x64
> >
> > +#define HSMP_INDEX_OFFSET 0xc4
> > +#define HSMP_DATA_OFFSET 0xc8
> > +
> > /*
> > * SMN accesses may fail in ways that are difficult to detect here in the called
> > * functions amd_smn_read() and amd_smn_write(). Therefore, callers must do
> > @@ -100,6 +103,12 @@ int __must_check amd_smn_write(u16 node, u32 address, u32 value)
> > }
> > EXPORT_SYMBOL_GPL(amd_smn_write);
> >
> > +int __must_check amd_smn_hsmp_rdwr(u16 node, u32 address, u32 *value, bool write)
> > +{
> > + return __amd_smn_rw(HSMP_INDEX_OFFSET, HSMP_DATA_OFFSET, node, address, value, write);
> > +}
> > +EXPORT_SYMBOL_GPL(amd_smn_hsmp_rdwr);
> > +
> > static int amd_cache_roots(void)
> > {
> > u16 node, num_nodes = amd_num_nodes();
> > diff --git a/drivers/platform/x86/amd/Kconfig b/drivers/platform/x86/amd/Kconfig
> > index f88682d36447..e100b315c62b 100644
> > --- a/drivers/platform/x86/amd/Kconfig
> > +++ b/drivers/platform/x86/amd/Kconfig
> > @@ -8,7 +8,7 @@ source "drivers/platform/x86/amd/pmc/Kconfig"
> >
> > config AMD_HSMP
> > tristate "AMD HSMP Driver"
> > - depends on AMD_NB && X86_64 && ACPI
> > + depends on AMD_SMN && X86_64 && ACPI
> > help
> > The driver provides a way for user space tools to monitor and manage
> > system management functionality on EPYC server CPUs from AMD.
> > diff --git a/drivers/platform/x86/amd/hsmp.c b/drivers/platform/x86/amd/hsmp.c
> > index 8fcf38eed7f0..544efb0255c0 100644
> > --- a/drivers/platform/x86/amd/hsmp.c
> > +++ b/drivers/platform/x86/amd/hsmp.c
>
> FYI, there has been major restructuring done for this driver in
> pdx86/for-next.
>
Yep, no problem. I can rebase these changes on top of those.
Any comments on the general approach?
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 00/16] AMD NB and SMN rework
2024-10-24 16:01 ` Yazen Ghannam
@ 2024-10-24 17:46 ` Bjorn Helgaas
2024-10-24 20:08 ` Mario Limonciello
0 siblings, 1 reply; 65+ messages in thread
From: Bjorn Helgaas @ 2024-10-24 17:46 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
> On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
> > On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
> > > Hi all,
> > >
> > > The theme of this set is decoupling the "AMD node" concept from the
> > > legacy northbridge support.
> > >
> > > Additionally, AMD System Management Network (SMN) access code is
> > > decoupled and expanded too.
> > >
> > > Patches 1-3 begin reducing the scope of AMD_NB.
> > >
> > > Patches 4-9 begin moving generic AMD node support out of AMD_NB.
> > >
> > > Patches 10-13 move SMN support out of AMD_NB and do some refactoring.
> > >
> > > Patch 14 has HSMP reuse SMN functionality.
> > >
> > > Patches 15-16 address userspace access to SMN.
> > >
> > > I say "begin" above because there is more to do here. Ultimately, AMD_NB
> > > should only be needed for code used on legacy systems with northbridges.
> > > Also, any and all SMN users in the kernel need to be updated to use the
> > > central SMN code. Local solutions should be avoided.
> >
> > Glad to see many of the PCI device IDs going away; thanks for working
> > on that!
> >
> > The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
> > ideal since all those pci_get_*() interfaces are kind of ugly in my
> > opinion, and using them means we have to encode topology details in
> > the kernel. But this still seems like a big improvement.
>
> Thanks for the feedback. Hopefully, we'll come to some improved
> solution. :)
>
> Can you please elaborate on your concern? Is it about saying "thing X is
> always at SBDF A:B:C.D" or something else?
"Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
nothing about the actual functionality of the device. A PCI
Vendor/Device ID or a PNP ID identifies the device programming model
independent of its geographical location. Inferring the functionality
and programming model from the location is a maintenance issue because
hardware may change the address.
PCI bus numbers are under software control, so in general it's not
safe to rely on them, although in this case these devices are probably
on root buses where the bus number is either fixed or determined by
BIOS configuration of the host bridge.
I don't like the pci_get_*() functions because they break the driver
model. The usual .probe() model binds a device to a driver, which
essentially means the driver owns the device and its resources, and
the driver and doesn't have to worry about other code interfering.
Unlike pci_get_*(), the .probe()/.remove() model automatically handles
hotplug without extra things like notifiers in the driver. Hotplug
may not be an issue in this particular case, but it requires specific
platform knowledge to be sure. Some platforms do support CPU and PCI
host bridge hotplug.
Thanks again for doing all this work. It's a huge improvement
already!
Bjorn
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 00/16] AMD NB and SMN rework
2024-10-24 17:46 ` Bjorn Helgaas
@ 2024-10-24 20:08 ` Mario Limonciello
2024-10-24 21:06 ` Bjorn Helgaas
0 siblings, 1 reply; 65+ messages in thread
From: Mario Limonciello @ 2024-10-24 20:08 UTC (permalink / raw)
To: Bjorn Helgaas, Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, bhelgaas, Shyam-sundar.S-k, richard.gong, jdelvare,
linux, clemens, hdegoede, ilpo.jarvinen, linux-pci, linux-hwmon,
platform-driver-x86, naveenkrishna.chatradhi, carlos.bilbao.osdev
On 10/24/2024 12:46, Bjorn Helgaas wrote:
> On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
>> On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
>>> On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
>>>> Hi all,
>>>>
>>>> The theme of this set is decoupling the "AMD node" concept from the
>>>> legacy northbridge support.
>>>>
>>>> Additionally, AMD System Management Network (SMN) access code is
>>>> decoupled and expanded too.
>>>>
>>>> Patches 1-3 begin reducing the scope of AMD_NB.
>>>>
>>>> Patches 4-9 begin moving generic AMD node support out of AMD_NB.
>>>>
>>>> Patches 10-13 move SMN support out of AMD_NB and do some refactoring.
>>>>
>>>> Patch 14 has HSMP reuse SMN functionality.
>>>>
>>>> Patches 15-16 address userspace access to SMN.
>>>>
>>>> I say "begin" above because there is more to do here. Ultimately, AMD_NB
>>>> should only be needed for code used on legacy systems with northbridges.
>>>> Also, any and all SMN users in the kernel need to be updated to use the
>>>> central SMN code. Local solutions should be avoided.
>>>
>>> Glad to see many of the PCI device IDs going away; thanks for working
>>> on that!
>>>
>>> The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
>>> ideal since all those pci_get_*() interfaces are kind of ugly in my
>>> opinion, and using them means we have to encode topology details in
>>> the kernel. But this still seems like a big improvement.
>>
>> Thanks for the feedback. Hopefully, we'll come to some improved
>> solution. :)
>>
>> Can you please elaborate on your concern? Is it about saying "thing X is
>> always at SBDF A:B:C.D" or something else?
>
> "Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
> nothing about the actual functionality of the device. A PCI
> Vendor/Device ID or a PNP ID identifies the device programming model
> independent of its geographical location. Inferring the functionality
> and programming model from the location is a maintenance issue because
> hardware may change the address.
>
> PCI bus numbers are under software control, so in general it's not
> safe to rely on them, although in this case these devices are probably
> on root buses where the bus number is either fixed or determined by
> BIOS configuration of the host bridge.
>
> I don't like the pci_get_*() functions because they break the driver
> model. The usual .probe() model binds a device to a driver, which
> essentially means the driver owns the device and its resources, and
> the driver and doesn't have to worry about other code interfering.
Are you suggesting that perhaps we should be introducing amd_smn (patch
10) as a PCI driver that binds "to the root device" instead?
If we made this change, I would wonder if it comes up early enough,
particularly considering quirk_clear_strap_no_soft_reset_dev2_f0() uses
the SMN symbols as PCI fixup final which happens before a driver
attaches (pci_bus_add_device()).
There are some areas that do discovery (for example amd_node_get_root()
in patch 6/16).
I think we should aspire to do is much discovery as possible but I don't
know we can get TOTALLY away from some hardcoded topology information.
>
> Unlike pci_get_*(), the .probe()/.remove() model automatically handles
> hotplug without extra things like notifiers in the driver. Hotplug
> may not be an issue in this particular case, but it requires specific
> platform knowledge to be sure. Some platforms do support CPU and PCI
> host bridge hotplug.
>
Yeah hotplug won't matter for these.
> Thanks again for doing all this work. It's a huge improvement
> already!
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 00/16] AMD NB and SMN rework
2024-10-24 20:08 ` Mario Limonciello
@ 2024-10-24 21:06 ` Bjorn Helgaas
2024-10-24 21:20 ` Mario Limonciello
0 siblings, 1 reply; 65+ messages in thread
From: Bjorn Helgaas @ 2024-10-24 21:06 UTC (permalink / raw)
To: Mario Limonciello
Cc: Yazen Ghannam, linux-edac, linux-kernel, tony.luck, x86,
avadhut.naik, john.allen, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Thu, Oct 24, 2024 at 03:08:41PM -0500, Mario Limonciello wrote:
> On 10/24/2024 12:46, Bjorn Helgaas wrote:
> > On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
> > > On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
> > > > On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
> ...
> > > > The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
> > > > ideal since all those pci_get_*() interfaces are kind of ugly in my
> > > > opinion, and using them means we have to encode topology details in
> > > > the kernel. But this still seems like a big improvement.
> > >
> > > Thanks for the feedback. Hopefully, we'll come to some improved
> > > solution. :)
> > >
> > > Can you please elaborate on your concern? Is it about saying "thing X is
> > > always at SBDF A:B:C.D" or something else?
> >
> > "Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
> > nothing about the actual functionality of the device. A PCI
> > Vendor/Device ID or a PNP ID identifies the device programming model
> > independent of its geographical location. Inferring the functionality
> > and programming model from the location is a maintenance issue because
> > hardware may change the address.
> >
> > PCI bus numbers are under software control, so in general it's not
> > safe to rely on them, although in this case these devices are probably
> > on root buses where the bus number is either fixed or determined by
> > BIOS configuration of the host bridge.
> >
> > I don't like the pci_get_*() functions because they break the driver
> > model. The usual .probe() model binds a device to a driver, which
> > essentially means the driver owns the device and its resources, and
> > the driver and doesn't have to worry about other code interfering.
>
> Are you suggesting that perhaps we should be introducing amd_smn (patch 10)
> as a PCI driver that binds "to the root device" instead?
I don't know any of the specifics, so I can't really opine on that.
The PCI specs envision that a Vendor/Device ID defines the programming
model of the device, and you would only use a new Device ID when that
programming model changes.
Of course, vendors like to define a new set of Device IDs for every
new chipset even when no driver changes are required, so even if a new
SMN works exactly the same as in previous chipsets, you're probably
back to having to add a new Device ID for every new chipset.
The Subsystem Vendor ID and Subsystem ID exist to solve a similar
problem (sort of in reverse). If AMD could allocate a Subsystem ID
for this SMN programming model and use that same ID in every chipset,
you could make a pci_driver.id_table entry that would match them all,
e.g.,
.vendor = PCI_VENDOR_ID_AMD,
.device = PCI_ANY_ID,
.subvendor = PCI_VENDOR_ID_AMD,
.subdevice = PCI_SUBSYSTEM_AMD_SMN,
(pci_device_id.subdevice is misnamed; the spec calls it "Subsystem ID")
> There are some areas that do discovery (for example amd_node_get_root() in
> patch 6/16).
Sort of. amd_node_get_root() and amd_node_get_func() both just grub
through all the devices that the PCI core has enumerated and return
the one that has the right geographical address.
There's no binding to a driver, so another driver could come along and
bind to the same device, and then you have a potential conflict.
You also give up all the standard driver model infrastructure for
hotplug, power management, etc. Granted, you probably don't care
about those things here.
Bjorn
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 00/16] AMD NB and SMN rework
2024-10-24 21:06 ` Bjorn Helgaas
@ 2024-10-24 21:20 ` Mario Limonciello
2024-10-24 21:47 ` Bjorn Helgaas
0 siblings, 1 reply; 65+ messages in thread
From: Mario Limonciello @ 2024-10-24 21:20 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Yazen Ghannam, linux-edac, linux-kernel, tony.luck, x86,
avadhut.naik, john.allen, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On 10/24/2024 16:06, Bjorn Helgaas wrote:
> On Thu, Oct 24, 2024 at 03:08:41PM -0500, Mario Limonciello wrote:
>> On 10/24/2024 12:46, Bjorn Helgaas wrote:
>>> On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
>>>> On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
>>>>> On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
>> ...
>
>>>>> The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
>>>>> ideal since all those pci_get_*() interfaces are kind of ugly in my
>>>>> opinion, and using them means we have to encode topology details in
>>>>> the kernel. But this still seems like a big improvement.
>>>>
>>>> Thanks for the feedback. Hopefully, we'll come to some improved
>>>> solution. :)
>>>>
>>>> Can you please elaborate on your concern? Is it about saying "thing X is
>>>> always at SBDF A:B:C.D" or something else?
>>>
>>> "Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
>>> nothing about the actual functionality of the device. A PCI
>>> Vendor/Device ID or a PNP ID identifies the device programming model
>>> independent of its geographical location. Inferring the functionality
>>> and programming model from the location is a maintenance issue because
>>> hardware may change the address.
>>>
>>> PCI bus numbers are under software control, so in general it's not
>>> safe to rely on them, although in this case these devices are probably
>>> on root buses where the bus number is either fixed or determined by
>>> BIOS configuration of the host bridge.
>>>
>>> I don't like the pci_get_*() functions because they break the driver
>>> model. The usual .probe() model binds a device to a driver, which
>>> essentially means the driver owns the device and its resources, and
>>> the driver and doesn't have to worry about other code interfering.
>>
>> Are you suggesting that perhaps we should be introducing amd_smn (patch 10)
>> as a PCI driver that binds "to the root device" instead?
>
> I don't know any of the specifics, so I can't really opine on that.
>
> The PCI specs envision that a Vendor/Device ID defines the programming
> model of the device, and you would only use a new Device ID when that
> programming model changes.
>
> Of course, vendors like to define a new set of Device IDs for every
> new chipset even when no driver changes are required, so even if a new
> SMN works exactly the same as in previous chipsets, you're probably
> back to having to add a new Device ID for every new chipset.
Yeah; this I believe is why we're here today and trying to find
something more manageable (IE this series).
>
> The Subsystem Vendor ID and Subsystem ID exist to solve a similar
> problem (sort of in reverse). If AMD could allocate a Subsystem ID
> for this SMN programming model and use that same ID in every chipset,
> you could make a pci_driver.id_table entry that would match them all,
> e.g.,
>
> .vendor = PCI_VENDOR_ID_AMD,
> .device = PCI_ANY_ID,
> .subvendor = PCI_VENDOR_ID_AMD,
> .subdevice = PCI_SUBSYSTEM_AMD_SMN,
>
> (pci_device_id.subdevice is misnamed; the spec calls it "Subsystem ID")
Isn't the subsystem ID based typically upon the platform it's running
on? For example I seem to recall on Dell systems it's used the value
that was in the SBMIOS ProductSKU field here (IoW not something AMD
would control).
I mean I guess maybe we could do a:
.vendor = PCI_VENDOR_ID_AMD,
.device = PCI_ANY_ID,
.class = PCI_CLASS_BRIDGE_HOST << 8
And then in probe() figure out if it's the right one, but that's still
pretty ugly, eh?
>
>> There are some areas that do discovery (for example amd_node_get_root() in
>> patch 6/16).
>
> Sort of. amd_node_get_root() and amd_node_get_func() both just grub
> through all the devices that the PCI core has enumerated and return
> the one that has the right geographical address.
>
> There's no binding to a driver, so another driver could come along and
> bind to the same device, and then you have a potential conflict.
>
> You also give up all the standard driver model infrastructure for
> hotplug, power management, etc. Granted, you probably don't care
> about those things here.
Right; I agree none of that matters here.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 00/16] AMD NB and SMN rework
2024-10-24 21:20 ` Mario Limonciello
@ 2024-10-24 21:47 ` Bjorn Helgaas
2024-10-31 16:22 ` Yazen Ghannam
0 siblings, 1 reply; 65+ messages in thread
From: Bjorn Helgaas @ 2024-10-24 21:47 UTC (permalink / raw)
To: Mario Limonciello
Cc: Yazen Ghannam, linux-edac, linux-kernel, tony.luck, x86,
avadhut.naik, john.allen, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Thu, Oct 24, 2024 at 04:20:35PM -0500, Mario Limonciello wrote:
> On 10/24/2024 16:06, Bjorn Helgaas wrote:
> > On Thu, Oct 24, 2024 at 03:08:41PM -0500, Mario Limonciello wrote:
> > > On 10/24/2024 12:46, Bjorn Helgaas wrote:
> > > > On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
> > > > > On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
> > > > > > On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
> > > ...
> >
> > > > > > The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
> > > > > > ideal since all those pci_get_*() interfaces are kind of ugly in my
> > > > > > opinion, and using them means we have to encode topology details in
> > > > > > the kernel. But this still seems like a big improvement.
> > > > >
> > > > > Thanks for the feedback. Hopefully, we'll come to some improved
> > > > > solution. :)
> > > > >
> > > > > Can you please elaborate on your concern? Is it about saying "thing X is
> > > > > always at SBDF A:B:C.D" or something else?
> > > >
> > > > "Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
> > > > nothing about the actual functionality of the device. A PCI
> > > > Vendor/Device ID or a PNP ID identifies the device programming model
> > > > independent of its geographical location. Inferring the functionality
> > > > and programming model from the location is a maintenance issue because
> > > > hardware may change the address.
> > > >
> > > > PCI bus numbers are under software control, so in general it's not
> > > > safe to rely on them, although in this case these devices are probably
> > > > on root buses where the bus number is either fixed or determined by
> > > > BIOS configuration of the host bridge.
> > > >
> > > > I don't like the pci_get_*() functions because they break the driver
> > > > model. The usual .probe() model binds a device to a driver, which
> > > > essentially means the driver owns the device and its resources, and
> > > > the driver and doesn't have to worry about other code interfering.
> > >
> > > Are you suggesting that perhaps we should be introducing amd_smn (patch 10)
> > > as a PCI driver that binds "to the root device" instead?
> >
> > I don't know any of the specifics, so I can't really opine on that.
> >
> > The PCI specs envision that a Vendor/Device ID defines the programming
> > model of the device, and you would only use a new Device ID when that
> > programming model changes.
> >
> > Of course, vendors like to define a new set of Device IDs for every
> > new chipset even when no driver changes are required, so even if a new
> > SMN works exactly the same as in previous chipsets, you're probably
> > back to having to add a new Device ID for every new chipset.
>
> Yeah; this I believe is why we're here today and trying to find something
> more manageable (IE this series).
Another alternative would be an ACPI device where you can use the same
_HID (or at least a _CID) for all the chipsets.
> > The Subsystem Vendor ID and Subsystem ID exist to solve a similar
> > problem (sort of in reverse). If AMD could allocate a Subsystem ID
> > for this SMN programming model and use that same ID in every chipset,
> > you could make a pci_driver.id_table entry that would match them all,
> > e.g.,
> >
> > .vendor = PCI_VENDOR_ID_AMD,
> > .device = PCI_ANY_ID,
> > .subvendor = PCI_VENDOR_ID_AMD,
> > .subdevice = PCI_SUBSYSTEM_AMD_SMN,
> >
> > (pci_device_id.subdevice is misnamed; the spec calls it "Subsystem ID")
>
> Isn't the subsystem ID based typically upon the platform it's
> running on? For example I seem to recall on Dell systems it's used
> the value that was in the SBMIOS ProductSKU field here (IoW not
> something AMD would control).
Right, it is typically based on the platform; that's why I said "in
reverse." I think all these devices are integrated into the chipset,
so I'm speculating that platform vendors would have no need (maybe
even no way) to use the Subsystem ID. But maybe that's not the case.
> I mean I guess maybe we could do a:
>
> .vendor = PCI_VENDOR_ID_AMD,
> .device = PCI_ANY_ID,
> .class = PCI_CLASS_BRIDGE_HOST << 8
>
> And then in probe() figure out if it's the right one, but that's still
> pretty ugly, eh?
I think there are some drivers that do this, and it's not completely
terrible. The probe() can just return failure if it doesn't want the
device.
Bjorn
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 14/16] x86/amd_smn, platform/x86/amd/hsmp: Have HSMP use SMN
2024-10-24 16:06 ` Yazen Ghannam
@ 2024-10-25 13:39 ` Ilpo Järvinen
0 siblings, 0 replies; 65+ messages in thread
From: Ilpo Järvinen @ 2024-10-25 13:39 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, LKML, tony.luck, x86, avadhut.naik, john.allen,
mario.limonciello, bhelgaas, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, Hans de Goede, linux-pci, linux-hwmon,
platform-driver-x86, naveenkrishna.chatradhi, carlos.bilbao.osdev
[-- Attachment #1: Type: text/plain, Size: 4858 bytes --]
On Thu, 24 Oct 2024, Yazen Ghannam wrote:
> On Thu, Oct 24, 2024 at 04:23:55PM +0300, Ilpo Järvinen wrote:
> > On Wed, 23 Oct 2024, Yazen Ghannam wrote:
> >
> > > The HSMP interface is just an SMN interface with different offsets.
> > >
> > > Define an HSMP wrapper in the SMN code and have the HSMP platform driver
> > > use that rather than a local solution.
> > >
> > > Also, remove the "root" member from AMD_NB, since there are no more
> > > users of it.
> > >
> > > Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> > > ---
> > > arch/x86/include/asm/amd_nb.h | 1 -
> > > arch/x86/include/asm/amd_smn.h | 3 +++
> > > arch/x86/kernel/amd_nb.c | 1 -
> > > arch/x86/kernel/amd_smn.c | 9 +++++++++
> > > drivers/platform/x86/amd/Kconfig | 2 +-
> > > drivers/platform/x86/amd/hsmp.c | 32 +++++---------------------------
> > > 6 files changed, 18 insertions(+), 30 deletions(-)
> > >
> > > diff --git a/arch/x86/include/asm/amd_nb.h b/arch/x86/include/asm/amd_nb.h
> > > index 55c03d3495bc..cbe31e316e39 100644
> > > --- a/arch/x86/include/asm/amd_nb.h
> > > +++ b/arch/x86/include/asm/amd_nb.h
> > > @@ -27,7 +27,6 @@ struct amd_l3_cache {
> > > };
> > >
> > > struct amd_northbridge {
> > > - struct pci_dev *root;
> > > struct pci_dev *misc;
> > > struct pci_dev *link;
> > > struct amd_l3_cache l3_cache;
> > > diff --git a/arch/x86/include/asm/amd_smn.h b/arch/x86/include/asm/amd_smn.h
> > > index 6850de69f863..f0eb12859c42 100644
> > > --- a/arch/x86/include/asm/amd_smn.h
> > > +++ b/arch/x86/include/asm/amd_smn.h
> > > @@ -8,4 +8,7 @@
> > > int __must_check amd_smn_read(u16 node, u32 address, u32 *value);
> > > int __must_check amd_smn_write(u16 node, u32 address, u32 value);
> > >
> > > +/* Should only be used by the HSMP driver. */
> > > +int __must_check amd_smn_hsmp_rdwr(u16 node, u32 address, u32 *value, bool write);
> > > +
> > > #endif /* _ASM_X86_AMD_SMN_H */
> > > diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
> > > index 10cdeddeda02..4c22317a6dfe 100644
> > > --- a/arch/x86/kernel/amd_nb.c
> > > +++ b/arch/x86/kernel/amd_nb.c
> > > @@ -73,7 +73,6 @@ static int amd_cache_northbridges(void)
> > > amd_northbridges.nb = nb;
> > >
> > > for (i = 0; i < amd_northbridges.num; i++) {
> > > - node_to_amd_nb(i)->root = amd_node_get_root(i);
> > > node_to_amd_nb(i)->misc = amd_node_get_func(i, 3);
> > > node_to_amd_nb(i)->link = amd_node_get_func(i, 4);
> > > }
> > > diff --git a/arch/x86/kernel/amd_smn.c b/arch/x86/kernel/amd_smn.c
> > > index 997fd3edd9c0..527dda8e3a2b 100644
> > > --- a/arch/x86/kernel/amd_smn.c
> > > +++ b/arch/x86/kernel/amd_smn.c
> > > @@ -18,6 +18,9 @@ static DEFINE_MUTEX(smn_mutex);
> > > #define SMN_INDEX_OFFSET 0x60
> > > #define SMN_DATA_OFFSET 0x64
> > >
> > > +#define HSMP_INDEX_OFFSET 0xc4
> > > +#define HSMP_DATA_OFFSET 0xc8
> > > +
> > > /*
> > > * SMN accesses may fail in ways that are difficult to detect here in the called
> > > * functions amd_smn_read() and amd_smn_write(). Therefore, callers must do
> > > @@ -100,6 +103,12 @@ int __must_check amd_smn_write(u16 node, u32 address, u32 value)
> > > }
> > > EXPORT_SYMBOL_GPL(amd_smn_write);
> > >
> > > +int __must_check amd_smn_hsmp_rdwr(u16 node, u32 address, u32 *value, bool write)
> > > +{
> > > + return __amd_smn_rw(HSMP_INDEX_OFFSET, HSMP_DATA_OFFSET, node, address, value, write);
> > > +}
> > > +EXPORT_SYMBOL_GPL(amd_smn_hsmp_rdwr);
> > > +
> > > static int amd_cache_roots(void)
> > > {
> > > u16 node, num_nodes = amd_num_nodes();
> > > diff --git a/drivers/platform/x86/amd/Kconfig b/drivers/platform/x86/amd/Kconfig
> > > index f88682d36447..e100b315c62b 100644
> > > --- a/drivers/platform/x86/amd/Kconfig
> > > +++ b/drivers/platform/x86/amd/Kconfig
> > > @@ -8,7 +8,7 @@ source "drivers/platform/x86/amd/pmc/Kconfig"
> > >
> > > config AMD_HSMP
> > > tristate "AMD HSMP Driver"
> > > - depends on AMD_NB && X86_64 && ACPI
> > > + depends on AMD_SMN && X86_64 && ACPI
> > > help
> > > The driver provides a way for user space tools to monitor and manage
> > > system management functionality on EPYC server CPUs from AMD.
> > > diff --git a/drivers/platform/x86/amd/hsmp.c b/drivers/platform/x86/amd/hsmp.c
> > > index 8fcf38eed7f0..544efb0255c0 100644
> > > --- a/drivers/platform/x86/amd/hsmp.c
> > > +++ b/drivers/platform/x86/amd/hsmp.c
> >
> > FYI, there has been major restructuring done for this driver in
> > pdx86/for-next.
> >
>
> Yep, no problem. I can rebase these changes on top of those.
>
> Any comments on the general approach?
I deemed looking deeper into the patch waste of my time due to the
expected changes, so no comments at this time.
--
i.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb()
2024-10-23 17:21 ` [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb() Yazen Ghannam
@ 2024-10-25 15:58 ` Borislav Petkov
2024-10-29 14:39 ` Yazen Ghannam
0 siblings, 1 reply; 65+ messages in thread
From: Borislav Petkov @ 2024-10-25 15:58 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Wed, Oct 23, 2024 at 05:21:37PM +0000, Yazen Ghannam wrote:
> @@ -393,11 +392,11 @@ bool __init early_is_amd_nb(u32 device)
> boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
> return false;
>
> - if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
> - misc_ids = hygon_nb_misc_ids;
> + if (boot_cpu_has(X86_FEATURE_ZEN))
check_for_deprecated_apis: WARNING: arch/x86/kernel/amd_nb.c:395: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb()
2024-10-25 15:58 ` Borislav Petkov
@ 2024-10-29 14:39 ` Yazen Ghannam
2024-10-29 15:08 ` Borislav Petkov
0 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-29 14:39 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Fri, Oct 25, 2024 at 05:58:30PM +0200, Borislav Petkov wrote:
> On Wed, Oct 23, 2024 at 05:21:37PM +0000, Yazen Ghannam wrote:
> > @@ -393,11 +392,11 @@ bool __init early_is_amd_nb(u32 device)
> > boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
> > return false;
> >
> > - if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
> > - misc_ids = hygon_nb_misc_ids;
> > + if (boot_cpu_has(X86_FEATURE_ZEN))
>
> check_for_deprecated_apis: WARNING: arch/x86/kernel/amd_nb.c:395: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.
Sure thing.
How can I enable this check myself?
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb()
2024-10-29 14:39 ` Yazen Ghannam
@ 2024-10-29 15:08 ` Borislav Petkov
2024-10-29 16:15 ` Luck, Tony
0 siblings, 1 reply; 65+ messages in thread
From: Borislav Petkov @ 2024-10-29 15:08 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Tue, Oct 29, 2024 at 10:39:28AM -0400, Yazen Ghannam wrote:
> How can I enable this check myself?
It is part of my silly patch checking script:
https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=vp
in here:
https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/tree/.tip/bin/vp.py?h=vp
but it probably isn't ready for public consumption yet.
I probably should try to package it properly when there's time...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* RE: [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb()
2024-10-29 15:08 ` Borislav Petkov
@ 2024-10-29 16:15 ` Luck, Tony
2024-10-30 14:21 ` Yazen Ghannam
0 siblings, 1 reply; 65+ messages in thread
From: Luck, Tony @ 2024-10-29 16:15 UTC (permalink / raw)
To: Borislav Petkov, Yazen Ghannam, Andy Whitcroft, Joe Perches
Cc: linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org, avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
> > > > - if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
> > > > - misc_ids = hygon_nb_misc_ids;
> > > > + if (boot_cpu_has(X86_FEATURE_ZEN))
> > >
> > > check_for_deprecated_apis: WARNING: arch/x86/kernel/amd_nb.c:395: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.
> >
> > Sure thing.
> >
> > How can I enable this check myself?
> It is part of my silly patch checking script:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=vp
>
> in here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/tree/.tip/bin/vp.py?h=vp
>
> but it probably isn't ready for public consumption yet.
>
> I probably should try to package it properly when there's time...
Sounds like it would be a valuable addition to checkpatch.
Maybe Joe or Andy will find time before you do.
-Tony
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb()
2024-10-29 16:15 ` Luck, Tony
@ 2024-10-30 14:21 ` Yazen Ghannam
2024-10-31 0:53 ` Sohil Mehta
2024-10-31 10:24 ` [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb() Borislav Petkov
0 siblings, 2 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-30 14:21 UTC (permalink / raw)
To: Luck, Tony
Cc: Borislav Petkov, Andy Whitcroft, Joe Perches,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org, avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
On Tue, Oct 29, 2024 at 04:15:33PM +0000, Luck, Tony wrote:
> > > > > - if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
> > > > > - misc_ids = hygon_nb_misc_ids;
> > > > > + if (boot_cpu_has(X86_FEATURE_ZEN))
> > > >
> > > > check_for_deprecated_apis: WARNING: arch/x86/kernel/amd_nb.c:395: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.
> > >
> > > Sure thing.
> > >
> > > How can I enable this check myself?
> > It is part of my silly patch checking script:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/log/?h=vp
> >
> > in here:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/tree/.tip/bin/vp.py?h=vp
> >
> > but it probably isn't ready for public consumption yet.
> >
> > I probably should try to package it properly when there's time...
>
> Sounds like it would be a valuable addition to checkpatch.
>
> Maybe Joe or Andy will find time before you do.
>
And if not to checkpatch, then maybe it can be included in the TIP
maintainers' handbook? That is, if others are using it or something
similar.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb()
2024-10-30 14:21 ` Yazen Ghannam
@ 2024-10-31 0:53 ` Sohil Mehta
2024-10-31 10:34 ` [PATCH] x86/cpufeature: Document cpu_feature_enabled() as the default to use Borislav Petkov
2024-10-31 10:24 ` [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb() Borislav Petkov
1 sibling, 1 reply; 65+ messages in thread
From: Sohil Mehta @ 2024-10-31 0:53 UTC (permalink / raw)
To: Yazen Ghannam, Luck, Tony, Borislav Petkov
Cc: Andy Whitcroft, Joe Perches, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, x86@kernel.org,
avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
> On Tue, Oct 29, 2024 at 04:15:33PM +0000, Luck, Tony wrote:
>>>>>> - if (boot_cpu_data.x86_vendor == X86_VENDOR_HYGON)
>>>>>> - misc_ids = hygon_nb_misc_ids;
>>>>>> + if (boot_cpu_has(X86_FEATURE_ZEN))
>>>>>
>>>>> check_for_deprecated_apis: WARNING: arch/x86/kernel/amd_nb.c:395: Do not use boot_cpu_has() - use cpu_feature_enabled() instead.
>>>>
Do the comments in cpufeature.h need updating? It seems to recommend
boot_cpu_has() in most cases and suggests using static_cpu_has() (which
is used by cpu_feature_enabled()) only in fast paths.
/*
* Static testing of CPU features. Used the same as boot_cpu_has(). It
* statically patches the target code for additional performance. Use
* static_cpu_has() only in fast paths, where every cycle counts. Which
* means that the boot_cpu_has() variant is already fast enough for the
* majority of cases and you should stick to using it as it is generally
* only two instructions: a RIP-relative MOV and a TEST.
*
...
*/
static __always_inline bool _static_cpu_has(u16 bit)
>
> And if not to checkpatch, then maybe it can be included in the TIP
> maintainers' handbook? That is, if others are using it or something
> similar.
>
^ permalink raw reply [flat|nested] 65+ messages in thread
* RE: [PATCH 06/16] x86/amd_nb: Simplify root device search
2024-10-23 17:21 ` [PATCH 06/16] x86/amd_nb: Simplify root device search Yazen Ghannam
@ 2024-10-31 7:52 ` Zhuo, Qiuxu
2024-10-31 10:08 ` Ilpo Järvinen
2024-10-31 11:20 ` Borislav Petkov
1 sibling, 1 reply; 65+ messages in thread
From: Zhuo, Qiuxu @ 2024-10-31 7:52 UTC (permalink / raw)
To: Yazen Ghannam, linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Luck, Tony, x86@kernel.org,
avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
> From: Yazen Ghannam <yazen.ghannam@amd.com>
> [...]
> +struct pci_dev *amd_node_get_root(u16 node) {
> + struct pci_dev *df_f0 __free(pci_dev_put) = NULL;
NULL pointer initialization is not necessary.
> + struct pci_dev *root;
> + u16 cntl_off;
> + u8 bus;
> +
> + if (!boot_cpu_has(X86_FEATURE_ZEN))
> + return NULL;
> +
> + /*
> + * D18F0xXXX [Config Address Control] (DF::CfgAddressCntl)
> + * Bits [7:0] (SecBusNum) holds the bus number of the root device for
> + * this Data Fabric instance. The segment, device, and function will be
> 0.
> + */
> + df_f0 = amd_node_get_func(node, 0);
> + if (!df_f0)
> + return NULL;
> [...]
^ permalink raw reply [flat|nested] 65+ messages in thread
* RE: [PATCH 02/16] x86/amd_nb: Restrict init function to AMD-based systems
2024-10-23 17:21 ` [PATCH 02/16] x86/amd_nb: Restrict init function to AMD-based systems Yazen Ghannam
@ 2024-10-31 8:09 ` Zhuo, Qiuxu
2024-10-31 10:36 ` Borislav Petkov
0 siblings, 1 reply; 65+ messages in thread
From: Zhuo, Qiuxu @ 2024-10-31 8:09 UTC (permalink / raw)
To: Yazen Ghannam, linux-edac@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Luck, Tony, x86@kernel.org,
avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
> From: Yazen Ghannam <yazen.ghannam@amd.com>
> [..]
> --- a/arch/x86/kernel/amd_nb.c
> +++ b/arch/x86/kernel/amd_nb.c
> @@ -582,6 +582,10 @@ static __init void fix_erratum_688(void)
>
> static __init int init_amd_nbs(void)
> {
> + if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD &&
> + boot_cpu_data.x86_vendor != X86_VENDOR_HYGON)
> + return 0;
> +
> amd_cache_northbridges();
This function could fail and return an error.
Is an early return with an error code needed if this function fails?
[ I know this is out of this patch scope. ]
> amd_cache_gart();
-Qiuxu
^ permalink raw reply [flat|nested] 65+ messages in thread
* RE: [PATCH 06/16] x86/amd_nb: Simplify root device search
2024-10-31 7:52 ` Zhuo, Qiuxu
@ 2024-10-31 10:08 ` Ilpo Järvinen
2024-10-31 13:10 ` Zhuo, Qiuxu
2024-10-31 15:34 ` Yazen Ghannam
0 siblings, 2 replies; 65+ messages in thread
From: Ilpo Järvinen @ 2024-10-31 10:08 UTC (permalink / raw)
To: Zhuo, Qiuxu, Yazen Ghannam
Cc: linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
Luck, Tony, x86@kernel.org, avadhut.naik@amd.com,
john.allen@amd.com, mario.limonciello@amd.com,
bhelgaas@google.com, Shyam-sundar.S-k@amd.com,
richard.gong@amd.com, jdelvare@suse.com, linux@roeck-us.net,
clemens@ladisch.de, hdegoede@redhat.com,
linux-pci@vger.kernel.org, linux-hwmon@vger.kernel.org,
platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
On Thu, 31 Oct 2024, Zhuo, Qiuxu wrote:
> > From: Yazen Ghannam <yazen.ghannam@amd.com>
> > [...]
> > +struct pci_dev *amd_node_get_root(u16 node) {
> > + struct pci_dev *df_f0 __free(pci_dev_put) = NULL;
>
> NULL pointer initialization is not necessary.
It is, because __free() is used...
> > + struct pci_dev *root;
> > + u16 cntl_off;
> > + u8 bus;
> > +
> > + if (!boot_cpu_has(X86_FEATURE_ZEN))
> > + return NULL;
...This would try to free() whatever garbage df_f0 holds...
> > + /*
> > + * D18F0xXXX [Config Address Control] (DF::CfgAddressCntl)
> > + * Bits [7:0] (SecBusNum) holds the bus number of the root device for
> > + * this Data Fabric instance. The segment, device, and function will be
> > 0.
> > + */
> > + df_f0 = amd_node_get_func(node, 0);
...However, the recommended practice when using __free() is this (as
documented in include/linux/cleanup.h):
* Given that the "__free(...) = NULL" pattern for variables defined at
* the top of the function poses this potential interdependency problem
* the recommendation is to always define and assign variables in one
* statement and not group variable definitions at the top of the
* function when __free() is used.
I know the outcome will look undesirable to some, me included, but
there's little that can be done to that because there's no other way for
the compiler to infer the order.
That being said, strictly speaking it isn't causing issue in this function
as is but it's still a bad pattern to initialize to = NULL because in
other instances it will cause problems. So better to steer away from the
pattern entirely rather than depend on reviewers noticing the a cleaup
ordering problem gets introduced by some later change to the function.
> > + if (!df_f0)
> > + return NULL;
--
i.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb()
2024-10-30 14:21 ` Yazen Ghannam
2024-10-31 0:53 ` Sohil Mehta
@ 2024-10-31 10:24 ` Borislav Petkov
1 sibling, 0 replies; 65+ messages in thread
From: Borislav Petkov @ 2024-10-31 10:24 UTC (permalink / raw)
To: Yazen Ghannam
Cc: Luck, Tony, Andy Whitcroft, Joe Perches,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org, avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
On Wed, Oct 30, 2024 at 10:21:38AM -0400, Yazen Ghannam wrote:
> And if not to checkpatch, then maybe it can be included in the TIP
> maintainers' handbook? That is, if others are using it or something
> similar.
Nah, this needs to be a normal janitor task. tglx is working on it. :)
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* [PATCH] x86/cpufeature: Document cpu_feature_enabled() as the default to use
2024-10-31 0:53 ` Sohil Mehta
@ 2024-10-31 10:34 ` Borislav Petkov
2024-10-31 18:26 ` Sohil Mehta
2024-11-05 19:59 ` Dave Hansen
0 siblings, 2 replies; 65+ messages in thread
From: Borislav Petkov @ 2024-10-31 10:34 UTC (permalink / raw)
To: Sohil Mehta
Cc: Yazen Ghannam, Luck, Tony, Andy Whitcroft, Joe Perches,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org, avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
From: "Borislav Petkov (AMD)" <bp@alien8.de>
cpu_feature_enabled() should be used in most cases when CPU feature
support needs to be tested in code. Document that.
Reported-by: Sohil Mehta <sohil.mehta@intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
---
arch/x86/include/asm/cpufeature.h | 18 ++++++------------
1 file changed, 6 insertions(+), 12 deletions(-)
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index 0b9611da6c53..de1ad09fe8d7 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -132,11 +132,12 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
x86_this_cpu_test_bit(bit, cpu_info.x86_capability))
/*
- * This macro is for detection of features which need kernel
- * infrastructure to be used. It may *not* directly test the CPU
- * itself. Use the cpu_has() family if you want true runtime
- * testing of CPU features, like in hypervisor code where you are
- * supporting a possible guest feature where host support for it
+ * This is the default CPU features testing macro to use in code.
+ *
+ * It is for detection of features which need kernel infrastructure to be
+ * used. It may *not* directly test the CPU itself. Use the cpu_has() family
+ * if you want true runtime testing of CPU features, like in hypervisor code
+ * where you are supporting a possible guest feature where host support for it
* is not relevant.
*/
#define cpu_feature_enabled(bit) \
@@ -161,13 +162,6 @@ extern void clear_cpu_cap(struct cpuinfo_x86 *c, unsigned int bit);
#define setup_force_cpu_bug(bit) setup_force_cpu_cap(bit)
/*
- * Static testing of CPU features. Used the same as boot_cpu_has(). It
- * statically patches the target code for additional performance. Use
- * static_cpu_has() only in fast paths, where every cycle counts. Which
- * means that the boot_cpu_has() variant is already fast enough for the
- * majority of cases and you should stick to using it as it is generally
- * only two instructions: a RIP-relative MOV and a TEST.
- *
* Do not use an "m" constraint for [cap_byte] here: gcc doesn't know
* that this is only used on a fallback path and will sometimes cause
* it to manifest the address of boot_cpu_data in a register, fouling
--
2.43.0
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply related [flat|nested] 65+ messages in thread
* Re: [PATCH 02/16] x86/amd_nb: Restrict init function to AMD-based systems
2024-10-31 8:09 ` Zhuo, Qiuxu
@ 2024-10-31 10:36 ` Borislav Petkov
2024-10-31 11:50 ` Zhuo, Qiuxu
0 siblings, 1 reply; 65+ messages in thread
From: Borislav Petkov @ 2024-10-31 10:36 UTC (permalink / raw)
To: Zhuo, Qiuxu
Cc: Yazen Ghannam, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, Luck, Tony, x86@kernel.org,
avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
On Thu, Oct 31, 2024 at 08:09:14AM +0000, Zhuo, Qiuxu wrote:
> This function could fail and return an error.
> Is an early return with an error code needed if this function fails?
No, grep for amd_northbridges.num checks in the code.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 05/16] x86/amd_nb: Simplify function 4 search
2024-10-23 17:21 ` [PATCH 05/16] x86/amd_nb: Simplify function 4 search Yazen Ghannam
@ 2024-10-31 11:15 ` Borislav Petkov
0 siblings, 0 replies; 65+ messages in thread
From: Borislav Petkov @ 2024-10-31 11:15 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Wed, Oct 23, 2024 at 05:21:39PM +0000, Yazen Ghannam wrote:
> Use the newly added helper function to look up a CPU/Node function to
> find "function 4" devices.
>
> This avoids the need to regularly add new PCI IDs for basic discovery.
> The unique PCI IDs are still useful in case of quirks or functional
> changes. And they should be used only in such a manner.
>
> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> ---
> arch/x86/include/asm/amd_nb.h | 1 +
> arch/x86/kernel/amd_nb.c | 66 ++---------------------------------
> 2 files changed, 4 insertions(+), 63 deletions(-)
Nice cleanup.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 06/16] x86/amd_nb: Simplify root device search
2024-10-23 17:21 ` [PATCH 06/16] x86/amd_nb: Simplify root device search Yazen Ghannam
2024-10-31 7:52 ` Zhuo, Qiuxu
@ 2024-10-31 11:20 ` Borislav Petkov
2024-10-31 15:29 ` Yazen Ghannam
1 sibling, 1 reply; 65+ messages in thread
From: Borislav Petkov @ 2024-10-31 11:20 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Wed, Oct 23, 2024 at 05:21:40PM +0000, Yazen Ghannam wrote:
> The "root" device search was introduced to support SMN access for Zen
> systems. This device represents a PCIe root complex. It is not the
> same as the "CPU/node" devices found at slots 0x18-0x1F.
>
> There may be multiple PCIe root complexes within an AMD node. Such is
> the case with server or HEDT systems, etc. Therefore it is not enough to
HEDT?
...
> +struct pci_dev *amd_node_get_root(u16 node)
> +{
> + struct pci_dev *df_f0 __free(pci_dev_put) = NULL;
> + struct pci_dev *root;
> + u16 cntl_off;
> + u8 bus;
> +
> + if (!boot_cpu_has(X86_FEATURE_ZEN))
check_for_deprecated_apis: WARNING: arch/x86/kernel/amd_node.c:67: Do not use boot_cpu_has() - use cpu_feature_enabled() instead
> + return NULL;
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* RE: [PATCH 02/16] x86/amd_nb: Restrict init function to AMD-based systems
2024-10-31 10:36 ` Borislav Petkov
@ 2024-10-31 11:50 ` Zhuo, Qiuxu
2024-10-31 13:11 ` Borislav Petkov
0 siblings, 1 reply; 65+ messages in thread
From: Zhuo, Qiuxu @ 2024-10-31 11:50 UTC (permalink / raw)
To: Borislav Petkov
Cc: Yazen Ghannam, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, Luck, Tony, x86@kernel.org,
avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
> From: Borislav Petkov <bp@alien8.de>
> [...]
>
> On Thu, Oct 31, 2024 at 08:09:14AM +0000, Zhuo, Qiuxu wrote:
> > This function could fail and return an error.
> > Is an early return with an error code needed if this function fails?
>
> No, grep for amd_northbridges.num checks in the code.
If the return value of amd_cache_northbridges() isn't used in any place,
make it return void?
-Qiuxu
^ permalink raw reply [flat|nested] 65+ messages in thread
* RE: [PATCH 06/16] x86/amd_nb: Simplify root device search
2024-10-31 10:08 ` Ilpo Järvinen
@ 2024-10-31 13:10 ` Zhuo, Qiuxu
2024-10-31 15:34 ` Yazen Ghannam
1 sibling, 0 replies; 65+ messages in thread
From: Zhuo, Qiuxu @ 2024-10-31 13:10 UTC (permalink / raw)
To: Ilpo Järvinen, Yazen Ghannam
Cc: linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
Luck, Tony, x86@kernel.org, avadhut.naik@amd.com,
john.allen@amd.com, mario.limonciello@amd.com,
bhelgaas@google.com, Shyam-sundar.S-k@amd.com,
richard.gong@amd.com, jdelvare@suse.com, linux@roeck-us.net,
clemens@ladisch.de, hdegoede@redhat.com,
linux-pci@vger.kernel.org, linux-hwmon@vger.kernel.org,
platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
> From: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
> [...]
> > > + if (!boot_cpu_has(X86_FEATURE_ZEN))
> > > + return NULL;
>
> ...This would try to free() whatever garbage df_f0 holds...
Oops ... I missed this point.
Thanks for correcting me.
-Qiuxu
^ permalink raw reply [flat|nested] 65+ messages in thread
* RE: [PATCH 02/16] x86/amd_nb: Restrict init function to AMD-based systems
2024-10-31 11:50 ` Zhuo, Qiuxu
@ 2024-10-31 13:11 ` Borislav Petkov
0 siblings, 0 replies; 65+ messages in thread
From: Borislav Petkov @ 2024-10-31 13:11 UTC (permalink / raw)
To: Zhuo, Qiuxu
Cc: Yazen Ghannam, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, Luck, Tony, x86@kernel.org,
avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
On October 31, 2024 12:50:10 PM GMT+01:00, "Zhuo, Qiuxu" <qiuxu.zhuo@intel.com> wrote:
>> From: Borislav Petkov <bp@alien8.de>
>> [...]
>>
>> On Thu, Oct 31, 2024 at 08:09:14AM +0000, Zhuo, Qiuxu wrote:
>> > This function could fail and return an error.
>> > Is an early return with an error code needed if this function fails?
>>
>> No, grep for amd_northbridges.num checks in the code.
>
>If the return value of amd_cache_northbridges() isn't used in any place,
>make it return void?
>
>-Qiuxu
Feel free to propose patches *after* the dust around that area settles.
--
Sent from a small device: formatting sucks and brevity is inevitable.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 06/16] x86/amd_nb: Simplify root device search
2024-10-31 11:20 ` Borislav Petkov
@ 2024-10-31 15:29 ` Yazen Ghannam
0 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-31 15:29 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Thu, Oct 31, 2024 at 12:20:27PM +0100, Borislav Petkov wrote:
> On Wed, Oct 23, 2024 at 05:21:40PM +0000, Yazen Ghannam wrote:
> > The "root" device search was introduced to support SMN access for Zen
> > systems. This device represents a PCIe root complex. It is not the
> > same as the "CPU/node" devices found at slots 0x18-0x1F.
> >
> > There may be multiple PCIe root complexes within an AMD node. Such is
> > the case with server or HEDT systems, etc. Therefore it is not enough to
>
> HEDT?
>
Sorry, forgot to spell it: High-end Desktop.
> ...
>
> > +struct pci_dev *amd_node_get_root(u16 node)
> > +{
> > + struct pci_dev *df_f0 __free(pci_dev_put) = NULL;
> > + struct pci_dev *root;
> > + u16 cntl_off;
> > + u8 bus;
> > +
> > + if (!boot_cpu_has(X86_FEATURE_ZEN))
>
> check_for_deprecated_apis: WARNING: arch/x86/kernel/amd_node.c:67: Do not use boot_cpu_has() - use cpu_feature_enabled() instead
>
Ack.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 06/16] x86/amd_nb: Simplify root device search
2024-10-31 10:08 ` Ilpo Järvinen
2024-10-31 13:10 ` Zhuo, Qiuxu
@ 2024-10-31 15:34 ` Yazen Ghannam
2024-10-31 15:42 ` Ilpo Järvinen
1 sibling, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-31 15:34 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Zhuo, Qiuxu, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, Luck, Tony, x86@kernel.org,
avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
linux-pci@vger.kernel.org, linux-hwmon@vger.kernel.org,
platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
On Thu, Oct 31, 2024 at 12:08:20PM +0200, Ilpo Järvinen wrote:
> On Thu, 31 Oct 2024, Zhuo, Qiuxu wrote:
>
> > > From: Yazen Ghannam <yazen.ghannam@amd.com>
> > > [...]
> > > +struct pci_dev *amd_node_get_root(u16 node) {
> > > + struct pci_dev *df_f0 __free(pci_dev_put) = NULL;
> >
> > NULL pointer initialization is not necessary.
>
> It is, because __free() is used...
>
> > > + struct pci_dev *root;
> > > + u16 cntl_off;
> > > + u8 bus;
> > > +
> > > + if (!boot_cpu_has(X86_FEATURE_ZEN))
> > > + return NULL;
>
> ...This would try to free() whatever garbage df_f0 holds...
>
> > > + /*
> > > + * D18F0xXXX [Config Address Control] (DF::CfgAddressCntl)
> > > + * Bits [7:0] (SecBusNum) holds the bus number of the root device for
> > > + * this Data Fabric instance. The segment, device, and function will be
> > > 0.
> > > + */
> > > + df_f0 = amd_node_get_func(node, 0);
>
> ...However, the recommended practice when using __free() is this (as
> documented in include/linux/cleanup.h):
>
> * Given that the "__free(...) = NULL" pattern for variables defined at
> * the top of the function poses this potential interdependency problem
> * the recommendation is to always define and assign variables in one
> * statement and not group variable definitions at the top of the
> * function when __free() is used.
>
> I know the outcome will look undesirable to some, me included, but
> there's little that can be done to that because there's no other way for
> the compiler to infer the order.
>
> That being said, strictly speaking it isn't causing issue in this function
> as is but it's still a bad pattern to initialize to = NULL because in
> other instances it will cause problems. So better to steer away from the
> pattern entirely rather than depend on reviewers noticing the a cleaup
> ordering problem gets introduced by some later change to the function.
>
I originally read that in the context of using a guard(). But really we
should do like this in any case, correct?
struct pci_dev *df_f0 __free(pci_dev_put) = amd_node_get_func(node, 0);
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 06/16] x86/amd_nb: Simplify root device search
2024-10-31 15:34 ` Yazen Ghannam
@ 2024-10-31 15:42 ` Ilpo Järvinen
2024-10-31 15:45 ` Yazen Ghannam
0 siblings, 1 reply; 65+ messages in thread
From: Ilpo Järvinen @ 2024-10-31 15:42 UTC (permalink / raw)
To: Yazen Ghannam
Cc: Zhuo, Qiuxu, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, Luck, Tony, x86@kernel.org,
avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
linux-pci@vger.kernel.org, linux-hwmon@vger.kernel.org,
platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
[-- Attachment #1: Type: text/plain, Size: 2557 bytes --]
On Thu, 31 Oct 2024, Yazen Ghannam wrote:
> On Thu, Oct 31, 2024 at 12:08:20PM +0200, Ilpo Järvinen wrote:
> > On Thu, 31 Oct 2024, Zhuo, Qiuxu wrote:
> >
> > > > From: Yazen Ghannam <yazen.ghannam@amd.com>
> > > > [...]
> > > > +struct pci_dev *amd_node_get_root(u16 node) {
> > > > + struct pci_dev *df_f0 __free(pci_dev_put) = NULL;
> > >
> > > NULL pointer initialization is not necessary.
> >
> > It is, because __free() is used...
> >
> > > > + struct pci_dev *root;
> > > > + u16 cntl_off;
> > > > + u8 bus;
> > > > +
> > > > + if (!boot_cpu_has(X86_FEATURE_ZEN))
> > > > + return NULL;
> >
> > ...This would try to free() whatever garbage df_f0 holds...
> >
> > > > + /*
> > > > + * D18F0xXXX [Config Address Control] (DF::CfgAddressCntl)
> > > > + * Bits [7:0] (SecBusNum) holds the bus number of the root device for
> > > > + * this Data Fabric instance. The segment, device, and function will be
> > > > 0.
> > > > + */
> > > > + df_f0 = amd_node_get_func(node, 0);
> >
> > ...However, the recommended practice when using __free() is this (as
> > documented in include/linux/cleanup.h):
> >
> > * Given that the "__free(...) = NULL" pattern for variables defined at
> > * the top of the function poses this potential interdependency problem
> > * the recommendation is to always define and assign variables in one
> > * statement and not group variable definitions at the top of the
> > * function when __free() is used.
> >
> > I know the outcome will look undesirable to some, me included, but
> > there's little that can be done to that because there's no other way for
> > the compiler to infer the order.
> >
> > That being said, strictly speaking it isn't causing issue in this function
> > as is but it's still a bad pattern to initialize to = NULL because in
> > other instances it will cause problems. So better to steer away from the
> > pattern entirely rather than depend on reviewers noticing the a cleaup
> > ordering problem gets introduced by some later change to the function.
> >
>
> I originally read that in the context of using a guard(). But really we
> should do like this in any case, correct?
>
> struct pci_dev *df_f0 __free(pci_dev_put) = amd_node_get_func(node, 0);
Yes, that is the recommendation. It says "always" so not only the cases
where guard() or other __free()s are used.
Of course this only applies to use of __free(), other variables should
still be declared in the usual place and not spread around.
--
i.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 06/16] x86/amd_nb: Simplify root device search
2024-10-31 15:42 ` Ilpo Järvinen
@ 2024-10-31 15:45 ` Yazen Ghannam
0 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-31 15:45 UTC (permalink / raw)
To: Ilpo Järvinen
Cc: Zhuo, Qiuxu, linux-edac@vger.kernel.org,
linux-kernel@vger.kernel.org, Luck, Tony, x86@kernel.org,
avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
linux-pci@vger.kernel.org, linux-hwmon@vger.kernel.org,
platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
On Thu, Oct 31, 2024 at 05:42:34PM +0200, Ilpo Järvinen wrote:
> On Thu, 31 Oct 2024, Yazen Ghannam wrote:
>
> > On Thu, Oct 31, 2024 at 12:08:20PM +0200, Ilpo Järvinen wrote:
> > > On Thu, 31 Oct 2024, Zhuo, Qiuxu wrote:
> > >
> > > > > From: Yazen Ghannam <yazen.ghannam@amd.com>
> > > > > [...]
> > > > > +struct pci_dev *amd_node_get_root(u16 node) {
> > > > > + struct pci_dev *df_f0 __free(pci_dev_put) = NULL;
> > > >
> > > > NULL pointer initialization is not necessary.
> > >
> > > It is, because __free() is used...
> > >
> > > > > + struct pci_dev *root;
> > > > > + u16 cntl_off;
> > > > > + u8 bus;
> > > > > +
> > > > > + if (!boot_cpu_has(X86_FEATURE_ZEN))
> > > > > + return NULL;
> > >
> > > ...This would try to free() whatever garbage df_f0 holds...
> > >
> > > > > + /*
> > > > > + * D18F0xXXX [Config Address Control] (DF::CfgAddressCntl)
> > > > > + * Bits [7:0] (SecBusNum) holds the bus number of the root device for
> > > > > + * this Data Fabric instance. The segment, device, and function will be
> > > > > 0.
> > > > > + */
> > > > > + df_f0 = amd_node_get_func(node, 0);
> > >
> > > ...However, the recommended practice when using __free() is this (as
> > > documented in include/linux/cleanup.h):
> > >
> > > * Given that the "__free(...) = NULL" pattern for variables defined at
> > > * the top of the function poses this potential interdependency problem
> > > * the recommendation is to always define and assign variables in one
> > > * statement and not group variable definitions at the top of the
> > > * function when __free() is used.
> > >
> > > I know the outcome will look undesirable to some, me included, but
> > > there's little that can be done to that because there's no other way for
> > > the compiler to infer the order.
> > >
> > > That being said, strictly speaking it isn't causing issue in this function
> > > as is but it's still a bad pattern to initialize to = NULL because in
> > > other instances it will cause problems. So better to steer away from the
> > > pattern entirely rather than depend on reviewers noticing the a cleaup
> > > ordering problem gets introduced by some later change to the function.
> > >
> >
> > I originally read that in the context of using a guard(). But really we
> > should do like this in any case, correct?
> >
> > struct pci_dev *df_f0 __free(pci_dev_put) = amd_node_get_func(node, 0);
>
> Yes, that is the recommendation. It says "always" so not only the cases
> where guard() or other __free()s are used.
>
> Of course this only applies to use of __free(), other variables should
> still be declared in the usual place and not spread around.
>
Ah right. Will make the change.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 00/16] AMD NB and SMN rework
2024-10-24 21:47 ` Bjorn Helgaas
@ 2024-10-31 16:22 ` Yazen Ghannam
0 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-10-31 16:22 UTC (permalink / raw)
To: Bjorn Helgaas
Cc: Mario Limonciello, linux-edac, linux-kernel, tony.luck, x86,
avadhut.naik, john.allen, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Thu, Oct 24, 2024 at 04:47:53PM -0500, Bjorn Helgaas wrote:
> On Thu, Oct 24, 2024 at 04:20:35PM -0500, Mario Limonciello wrote:
> > On 10/24/2024 16:06, Bjorn Helgaas wrote:
> > > On Thu, Oct 24, 2024 at 03:08:41PM -0500, Mario Limonciello wrote:
> > > > On 10/24/2024 12:46, Bjorn Helgaas wrote:
> > > > > On Thu, Oct 24, 2024 at 12:01:59PM -0400, Yazen Ghannam wrote:
> > > > > > On Wed, Oct 23, 2024 at 12:59:28PM -0500, Bjorn Helgaas wrote:
> > > > > > > On Wed, Oct 23, 2024 at 05:21:34PM +0000, Yazen Ghannam wrote:
> > > > ...
> > >
> > > > > > > The use of pci_get_slot() and pci_get_domain_bus_and_slot() is not
> > > > > > > ideal since all those pci_get_*() interfaces are kind of ugly in my
> > > > > > > opinion, and using them means we have to encode topology details in
> > > > > > > the kernel. But this still seems like a big improvement.
> > > > > >
> > > > > > Thanks for the feedback. Hopefully, we'll come to some improved
> > > > > > solution. :)
> > > > > >
> > > > > > Can you please elaborate on your concern? Is it about saying "thing X is
> > > > > > always at SBDF A:B:C.D" or something else?
> > > > >
> > > > > "Thing X is always at SBDF A:B:C.D" is one big reason. "A:B:C.D" says
> > > > > nothing about the actual functionality of the device. A PCI
> > > > > Vendor/Device ID or a PNP ID identifies the device programming model
> > > > > independent of its geographical location. Inferring the functionality
> > > > > and programming model from the location is a maintenance issue because
> > > > > hardware may change the address.
> > > > >
> > > > > PCI bus numbers are under software control, so in general it's not
> > > > > safe to rely on them, although in this case these devices are probably
> > > > > on root buses where the bus number is either fixed or determined by
> > > > > BIOS configuration of the host bridge.
> > > > >
> > > > > I don't like the pci_get_*() functions because they break the driver
> > > > > model. The usual .probe() model binds a device to a driver, which
> > > > > essentially means the driver owns the device and its resources, and
> > > > > the driver and doesn't have to worry about other code interfering.
> > > >
> > > > Are you suggesting that perhaps we should be introducing amd_smn (patch 10)
> > > > as a PCI driver that binds "to the root device" instead?
> > >
> > > I don't know any of the specifics, so I can't really opine on that.
> > >
> > > The PCI specs envision that a Vendor/Device ID defines the programming
> > > model of the device, and you would only use a new Device ID when that
> > > programming model changes.
> > >
> > > Of course, vendors like to define a new set of Device IDs for every
> > > new chipset even when no driver changes are required, so even if a new
> > > SMN works exactly the same as in previous chipsets, you're probably
> > > back to having to add a new Device ID for every new chipset.
> >
> > Yeah; this I believe is why we're here today and trying to find something
> > more manageable (IE this series).
>
> Another alternative would be an ACPI device where you can use the same
> _HID (or at least a _CID) for all the chipsets.
>
Yes, we've had some internal discussions about something like this. Of
course, any new solution will only apply to future products.
Another option could be for the platform to provide an abstracted
interface for each unique access method. For example, we could have
define UEFI PRM methods, the code can run in OS context, and the details
would be abstracted. But again, this would have to come for future
products. :/
> > > The Subsystem Vendor ID and Subsystem ID exist to solve a similar
> > > problem (sort of in reverse). If AMD could allocate a Subsystem ID
> > > for this SMN programming model and use that same ID in every chipset,
> > > you could make a pci_driver.id_table entry that would match them all,
> > > e.g.,
> > >
> > > .vendor = PCI_VENDOR_ID_AMD,
> > > .device = PCI_ANY_ID,
> > > .subvendor = PCI_VENDOR_ID_AMD,
> > > .subdevice = PCI_SUBSYSTEM_AMD_SMN,
> > >
> > > (pci_device_id.subdevice is misnamed; the spec calls it "Subsystem ID")
> >
> > Isn't the subsystem ID based typically upon the platform it's
> > running on? For example I seem to recall on Dell systems it's used
> > the value that was in the SBMIOS ProductSKU field here (IoW not
> > something AMD would control).
>
> Right, it is typically based on the platform; that's why I said "in
> reverse." I think all these devices are integrated into the chipset,
> so I'm speculating that platform vendors would have no need (maybe
> even no way) to use the Subsystem ID. But maybe that's not the case.
>
The devices are integrated. However, they aren't solely used for the
register access interfaces. The index/data pairs just happen to reside
in a root complex device, but of course the root complex is not there
just for this use.
> > I mean I guess maybe we could do a:
> >
> > .vendor = PCI_VENDOR_ID_AMD,
> > .device = PCI_ANY_ID,
> > .class = PCI_CLASS_BRIDGE_HOST << 8
> >
> > And then in probe() figure out if it's the right one, but that's still
> > pretty ugly, eh?
>
> I think there are some drivers that do this, and it's not completely
> terrible. The probe() can just return failure if it doesn't want the
> device.
>
Would it make sense to have a driver if we're not actually driving
anything? We really just need to read/write to a few registers in the
same vein of using an I/O access port.
I think a driver would really work if there was a lot more functionality
and the access port was just one of many features.
But maybe a really bare-bones driver isn't too much more than what we
have. And it could be more maintainable.
And maybe we can treat the "AMD Node" as a logical device that
collects/manages interfaces across multiple devices.
Bjorn, would you mind if we pursued this as a follow up to this set? I
think there's potential for some of these ideas. But I'll need to do
more research and discuss with others.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH] x86/cpufeature: Document cpu_feature_enabled() as the default to use
2024-10-31 10:34 ` [PATCH] x86/cpufeature: Document cpu_feature_enabled() as the default to use Borislav Petkov
@ 2024-10-31 18:26 ` Sohil Mehta
2024-10-31 19:15 ` Borislav Petkov
2024-11-05 19:59 ` Dave Hansen
1 sibling, 1 reply; 65+ messages in thread
From: Sohil Mehta @ 2024-10-31 18:26 UTC (permalink / raw)
To: Borislav Petkov
Cc: Yazen Ghannam, Luck, Tony, Andy Whitcroft, Joe Perches,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org, avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
On 10/31/2024 3:34 AM, Borislav Petkov wrote:
> From: "Borislav Petkov (AMD)" <bp@alien8.de>
>
> cpu_feature_enabled() should be used in most cases when CPU feature
> support needs to be tested in code. Document that.
>
> Reported-by: Sohil Mehta <sohil.mehta@intel.com>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> ---
> arch/x86/include/asm/cpufeature.h | 18 ++++++------------
> 1 file changed, 6 insertions(+), 12 deletions(-)
>
Looks good (a minor nit below),
Reviewed-by: Sohil Mehta <sohil.mehta@intel.com>
> diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
> index 0b9611da6c53..de1ad09fe8d7 100644
> --- a/arch/x86/include/asm/cpufeature.h
> +++ b/arch/x86/include/asm/cpufeature.h
> @@ -132,11 +132,12 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
> x86_this_cpu_test_bit(bit, cpu_info.x86_capability))
>
> /*
> - * This macro is for detection of features which need kernel
> - * infrastructure to be used. It may *not* directly test the CPU
> - * itself. Use the cpu_has() family if you want true runtime
> - * testing of CPU features, like in hypervisor code where you are
> - * supporting a possible guest feature where host support for it
> + * This is the default CPU features testing macro to use in code.
> + *
Does "default CPU feature testing macro" roll better than "default CPU
features testing macro"?
> + * It is for detection of features which need kernel infrastructure to be
> + * used. It may *not* directly test the CPU itself. Use the cpu_has() family
> + * if you want true runtime testing of CPU features, like in hypervisor code
> + * where you are supporting a possible guest feature where host support for it
> * is not relevant.
> */
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH] x86/cpufeature: Document cpu_feature_enabled() as the default to use
2024-10-31 18:26 ` Sohil Mehta
@ 2024-10-31 19:15 ` Borislav Petkov
0 siblings, 0 replies; 65+ messages in thread
From: Borislav Petkov @ 2024-10-31 19:15 UTC (permalink / raw)
To: Sohil Mehta
Cc: Yazen Ghannam, Luck, Tony, Andy Whitcroft, Joe Perches,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org, avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
On October 31, 2024 7:26:27 PM GMT+01:00, Sohil Mehta <sohil.mehta@intel.com> wrote:
>Does "default CPU feature testing macro" roll better than "default CPU
>features testing macro"?
Waaay too finicky to me. No one cares, I'd say. 🤗😂
--
Sent from a small device: formatting sucks and brevity is inevitable.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 09/16] x86/amd_nb, x86/amd_node: Simplify amd_pci_dev_to_node_id()
2024-10-23 17:21 ` [PATCH 09/16] x86/amd_nb, x86/amd_node: Simplify amd_pci_dev_to_node_id() Yazen Ghannam
@ 2024-11-04 14:23 ` Borislav Petkov
2024-11-05 14:54 ` Yazen Ghannam
0 siblings, 1 reply; 65+ messages in thread
From: Borislav Petkov @ 2024-11-04 14:23 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Wed, Oct 23, 2024 at 05:21:43PM +0000, Yazen Ghannam wrote:
> diff --git a/arch/x86/include/asm/amd_node.h b/arch/x86/include/asm/amd_node.h
> index 419a0ad13ef2..8e473a293706 100644
> --- a/arch/x86/include/asm/amd_node.h
> +++ b/arch/x86/include/asm/amd_node.h
> @@ -30,4 +30,10 @@ static inline u16 amd_num_nodes(void)
> return topology_amd_nodes_per_pkg() * topology_max_packages();
> }
>
> +/* Caller must ensure the input is an AMD node device. */
You can ensure that yourself by checking the PCI vendor in the PCI device,
right?
IOW, pdev->vendor...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 10/16] x86/amd_nb: Move SMN access code to a new amd_smn driver
2024-10-23 17:21 ` [PATCH 10/16] x86/amd_nb: Move SMN access code to a new amd_smn driver Yazen Ghannam
@ 2024-11-04 14:29 ` Borislav Petkov
2024-11-05 14:58 ` Yazen Ghannam
0 siblings, 1 reply; 65+ messages in thread
From: Borislav Petkov @ 2024-11-04 14:29 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Wed, Oct 23, 2024 at 05:21:44PM +0000, Yazen Ghannam wrote:
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index ba5252d8e21c..a03ffa5b6bb1 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -3128,6 +3128,9 @@ config AMD_NODE
> def_bool y
> depends on CPU_SUP_AMD && PCI
>
> +config AMD_SMN
> + def_bool y
> + depends on AMD_NODE
Why is this a separate compilation unit and not part of amd_node.c? Especially
if it depends on it.
I don't see the real need for having smaller compilation units. Both the node
and the smn stuff will end up being built-in in 99% of the configs. So why are
we making separate Kconfig items and yadda yadda?
Just do a single amd_node and that's it. We can always split later, if really
needed.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 11/16] x86/amd_smn: Fixup __amd_smn_rw()
2024-10-23 17:21 ` [PATCH 11/16] x86/amd_smn: Fixup __amd_smn_rw() Yazen Ghannam
@ 2024-11-04 14:32 ` Borislav Petkov
2024-11-05 14:59 ` Yazen Ghannam
0 siblings, 1 reply; 65+ messages in thread
From: Borislav Petkov @ 2024-11-04 14:32 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev, Tom Lendacky
On Wed, Oct 23, 2024 at 05:21:45PM +0000, Yazen Ghannam wrote:
> Subject: Re: [PATCH 11/16] x86/amd_smn: Fixup __amd_smn_rw()
Needs a more descriptive title...
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 09/16] x86/amd_nb, x86/amd_node: Simplify amd_pci_dev_to_node_id()
2024-11-04 14:23 ` Borislav Petkov
@ 2024-11-05 14:54 ` Yazen Ghannam
0 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-11-05 14:54 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Mon, Nov 04, 2024 at 03:23:29PM +0100, Borislav Petkov wrote:
> On Wed, Oct 23, 2024 at 05:21:43PM +0000, Yazen Ghannam wrote:
> > diff --git a/arch/x86/include/asm/amd_node.h b/arch/x86/include/asm/amd_node.h
> > index 419a0ad13ef2..8e473a293706 100644
> > --- a/arch/x86/include/asm/amd_node.h
> > +++ b/arch/x86/include/asm/amd_node.h
> > @@ -30,4 +30,10 @@ static inline u16 amd_num_nodes(void)
> > return topology_amd_nodes_per_pkg() * topology_max_packages();
> > }
> >
> > +/* Caller must ensure the input is an AMD node device. */
>
> You can ensure that yourself by checking the PCI vendor in the PCI device,
> right?
>
> IOW, pdev->vendor...
>
Yes, there can be a vendor and/or bus,device check.
I'll add them.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 10/16] x86/amd_nb: Move SMN access code to a new amd_smn driver
2024-11-04 14:29 ` Borislav Petkov
@ 2024-11-05 14:58 ` Yazen Ghannam
2024-11-05 19:42 ` Borislav Petkov
0 siblings, 1 reply; 65+ messages in thread
From: Yazen Ghannam @ 2024-11-05 14:58 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Mon, Nov 04, 2024 at 03:29:58PM +0100, Borislav Petkov wrote:
> On Wed, Oct 23, 2024 at 05:21:44PM +0000, Yazen Ghannam wrote:
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index ba5252d8e21c..a03ffa5b6bb1 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -3128,6 +3128,9 @@ config AMD_NODE
> > def_bool y
> > depends on CPU_SUP_AMD && PCI
> >
> > +config AMD_SMN
> > + def_bool y
> > + depends on AMD_NODE
>
> Why is this a separate compilation unit and not part of amd_node.c? Especially
> if it depends on it.
>
> I don't see the real need for having smaller compilation units. Both the node
> and the smn stuff will end up being built-in in 99% of the configs. So why are
> we making separate Kconfig items and yadda yadda?
>
AMD_NB (legacy systems) and AMD_SMN (Zen-based systems) would both
depend on AMD_NODE. The thinking is that a user could build a minimal
config for either legacy or Zen-based systems.
> Just do a single amd_node and that's it. We can always split later, if really
> needed.
>
Okay, will do.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 11/16] x86/amd_smn: Fixup __amd_smn_rw()
2024-11-04 14:32 ` Borislav Petkov
@ 2024-11-05 14:59 ` Yazen Ghannam
0 siblings, 0 replies; 65+ messages in thread
From: Yazen Ghannam @ 2024-11-05 14:59 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev, Tom Lendacky
On Mon, Nov 04, 2024 at 03:32:12PM +0100, Borislav Petkov wrote:
> On Wed, Oct 23, 2024 at 05:21:45PM +0000, Yazen Ghannam wrote:
> > Subject: Re: [PATCH 11/16] x86/amd_smn: Fixup __amd_smn_rw()
>
> Needs a more descriptive title...
>
Okay, will update it.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers
2024-10-23 17:21 ` [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers Yazen Ghannam
@ 2024-11-05 19:21 ` Borislav Petkov
2024-11-05 19:46 ` Mario Limonciello
0 siblings, 1 reply; 65+ messages in thread
From: Borislav Petkov @ 2024-11-05 19:21 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Wed, Oct 23, 2024 at 05:21:50PM +0000, Yazen Ghannam wrote:
> +static ssize_t smn_value_write(struct file *file, const char __user *userbuf,
> + size_t count, loff_t *ppos)
> +{
> + u32 val;
> + int ret;
> +
> + ret = kstrtouint_from_user(userbuf, count, 0, &val);
> + if (ret)
> + return ret;
> +
> + add_taint(TAINT_USER, LOCKDEP_STILL_OK);
That looks like a TAINT_CPU_OUT_OF_SPEC to me.
> + ret = amd_smn_write(debug_node, debug_address, val);
> + if (ret)
> + return ret;
> +
> + return count;
> +}
> +
> +DEFINE_SHOW_STORE_ATTRIBUTE(smn_node);
> +DEFINE_SHOW_STORE_ATTRIBUTE(smn_address);
> +DEFINE_SHOW_STORE_ATTRIBUTE(smn_value);
> +
> static int amd_cache_roots(void)
> {
> u16 node, num_nodes = amd_num_nodes();
> @@ -180,6 +257,12 @@ static int __init amd_smn_init(void)
> if (err)
> return err;
>
> + debugfs_dir = debugfs_create_dir("amd_smn", arch_debugfs_dir);
> +
> + debugfs_create_file("node", 0600, debugfs_dir, NULL, &smn_node_fops);
> + debugfs_create_file("address", 0600, debugfs_dir, NULL, &smn_address_fops);
> + debugfs_create_file("value", 0600, debugfs_dir, NULL, &smn_value_fops);
Can we pls stick this behind a module param which is off by default? I don't
want that crap exposed even in debugfs, by default.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 10/16] x86/amd_nb: Move SMN access code to a new amd_smn driver
2024-11-05 14:58 ` Yazen Ghannam
@ 2024-11-05 19:42 ` Borislav Petkov
0 siblings, 0 replies; 65+ messages in thread
From: Borislav Petkov @ 2024-11-05 19:42 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, mario.limonciello, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Tue, Nov 05, 2024 at 09:58:26AM -0500, Yazen Ghannam wrote:
> AMD_NB (legacy systems)
That's legacy and won't get anything new.
> and AMD_SMN (Zen-based systems) would both depend on AMD_NODE. The thinking
> is that a user could build a minimal config for either legacy or Zen-based
> systems.
amd_node is three functions.
amd_node_get_func() can go into a header so that both nb.c and node.c can call
it and smn.c and node.c can be a single compilation unit.
> > Just do a single amd_node and that's it. We can always split later, if really
> > needed.
> >
>
> Okay, will do.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers
2024-11-05 19:21 ` Borislav Petkov
@ 2024-11-05 19:46 ` Mario Limonciello
2024-11-05 19:53 ` Borislav Petkov
0 siblings, 1 reply; 65+ messages in thread
From: Mario Limonciello @ 2024-11-05 19:46 UTC (permalink / raw)
To: Borislav Petkov, Yazen Ghannam
Cc: linux-edac, linux-kernel, tony.luck, x86, avadhut.naik,
john.allen, bhelgaas, Shyam-sundar.S-k, richard.gong, jdelvare,
linux, clemens, hdegoede, ilpo.jarvinen, linux-pci, linux-hwmon,
platform-driver-x86, naveenkrishna.chatradhi, carlos.bilbao.osdev
On 11/5/2024 13:21, Borislav Petkov wrote:
> On Wed, Oct 23, 2024 at 05:21:50PM +0000, Yazen Ghannam wrote:
>> +static ssize_t smn_value_write(struct file *file, const char __user *userbuf,
>> + size_t count, loff_t *ppos)
>> +{
>> + u32 val;
>> + int ret;
>> +
>> + ret = kstrtouint_from_user(userbuf, count, 0, &val);
>> + if (ret)
>> + return ret;
>> +
>> + add_taint(TAINT_USER, LOCKDEP_STILL_OK);
>
> That looks like a TAINT_CPU_OUT_OF_SPEC to me.
Makes sense.
>
>> + ret = amd_smn_write(debug_node, debug_address, val);
>> + if (ret)
>> + return ret;
>> +
>> + return count;
>> +}
>> +
>> +DEFINE_SHOW_STORE_ATTRIBUTE(smn_node);
>> +DEFINE_SHOW_STORE_ATTRIBUTE(smn_address);
>> +DEFINE_SHOW_STORE_ATTRIBUTE(smn_value);
>> +
>> static int amd_cache_roots(void)
>> {
>> u16 node, num_nodes = amd_num_nodes();
>> @@ -180,6 +257,12 @@ static int __init amd_smn_init(void)
>> if (err)
>> return err;
>>
>> + debugfs_dir = debugfs_create_dir("amd_smn", arch_debugfs_dir);
>> +
>> + debugfs_create_file("node", 0600, debugfs_dir, NULL, &smn_node_fops);
>> + debugfs_create_file("address", 0600, debugfs_dir, NULL, &smn_address_fops);
>> + debugfs_create_file("value", 0600, debugfs_dir, NULL, &smn_value_fops);
>
> Can we pls stick this behind a module param which is off by default? I don't
> want that crap exposed even in debugfs, by default.
>
> Thx.
>
Why the worry about it being in debugfs by default?
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers
2024-11-05 19:46 ` Mario Limonciello
@ 2024-11-05 19:53 ` Borislav Petkov
2024-11-05 19:56 ` Mario Limonciello
0 siblings, 1 reply; 65+ messages in thread
From: Borislav Petkov @ 2024-11-05 19:53 UTC (permalink / raw)
To: Mario Limonciello
Cc: Yazen Ghannam, linux-edac, linux-kernel, tony.luck, x86,
avadhut.naik, john.allen, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On Tue, Nov 05, 2024 at 01:46:46PM -0600, Mario Limonciello wrote:
> Why the worry about it being in debugfs by default?
I want people to *actively* and *explicitly* do something before they can
brick their systems.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers
2024-11-05 19:53 ` Borislav Petkov
@ 2024-11-05 19:56 ` Mario Limonciello
2024-11-05 19:59 ` Borislav Petkov
0 siblings, 1 reply; 65+ messages in thread
From: Mario Limonciello @ 2024-11-05 19:56 UTC (permalink / raw)
To: Borislav Petkov, Bjorn Helgaas
Cc: Yazen Ghannam, linux-edac, linux-kernel, tony.luck, x86,
avadhut.naik, john.allen, bhelgaas, Shyam-sundar.S-k,
richard.gong, jdelvare, linux, clemens, hdegoede, ilpo.jarvinen,
linux-pci, linux-hwmon, platform-driver-x86,
naveenkrishna.chatradhi, carlos.bilbao.osdev
On 11/5/2024 13:53, Borislav Petkov wrote:
> On Tue, Nov 05, 2024 at 01:46:46PM -0600, Mario Limonciello wrote:
>> Why the worry about it being in debugfs by default?
>
> I want people to *actively* and *explicitly* do something before they can
> brick their systems.
>
OK got it. Considering that I think after this series lands we need to
re-open the conversation about PCI config space access to userspace;
particularly on regions that have been marked as exclusions.
By that grounds a taint is definitely not enough for that either.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers
2024-11-05 19:56 ` Mario Limonciello
@ 2024-11-05 19:59 ` Borislav Petkov
2024-11-05 20:53 ` Mario Limonciello
0 siblings, 1 reply; 65+ messages in thread
From: Borislav Petkov @ 2024-11-05 19:59 UTC (permalink / raw)
To: Mario Limonciello
Cc: Bjorn Helgaas, Yazen Ghannam, linux-edac, linux-kernel, tony.luck,
x86, avadhut.naik, john.allen, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev
On Tue, Nov 05, 2024 at 01:56:23PM -0600, Mario Limonciello wrote:
> OK got it. Considering that I think after this series lands we need to
> re-open the conversation about PCI config space access to userspace;
> particularly on regions that have been marked as exclusions.
I could imagine a patch that goes and requests those regions exclusively if
luserspace has no business poking there.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH] x86/cpufeature: Document cpu_feature_enabled() as the default to use
2024-10-31 10:34 ` [PATCH] x86/cpufeature: Document cpu_feature_enabled() as the default to use Borislav Petkov
2024-10-31 18:26 ` Sohil Mehta
@ 2024-11-05 19:59 ` Dave Hansen
1 sibling, 0 replies; 65+ messages in thread
From: Dave Hansen @ 2024-11-05 19:59 UTC (permalink / raw)
To: Borislav Petkov, Sohil Mehta
Cc: Yazen Ghannam, Luck, Tony, Andy Whitcroft, Joe Perches,
linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
x86@kernel.org, avadhut.naik@amd.com, john.allen@amd.com,
mario.limonciello@amd.com, bhelgaas@google.com,
Shyam-sundar.S-k@amd.com, richard.gong@amd.com, jdelvare@suse.com,
linux@roeck-us.net, clemens@ladisch.de, hdegoede@redhat.com,
ilpo.jarvinen@linux.intel.com, linux-pci@vger.kernel.org,
linux-hwmon@vger.kernel.org, platform-driver-x86@vger.kernel.org,
naveenkrishna.chatradhi@amd.com, carlos.bilbao.osdev@gmail.com
On 10/31/24 03:34, Borislav Petkov wrote:
> cpu_feature_enabled() should be used in most cases when CPU feature
> support needs to be tested in code. Document that.
Yes, please. BTW, I know the code generation isn't great in some cases.
But this is the right _way_ to call things no "boot_" or &boot_cpu_data
for system-wide things. Underlying code generation can continue to be
fixed up over time.
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers
2024-11-05 19:59 ` Borislav Petkov
@ 2024-11-05 20:53 ` Mario Limonciello
0 siblings, 0 replies; 65+ messages in thread
From: Mario Limonciello @ 2024-11-05 20:53 UTC (permalink / raw)
To: Borislav Petkov
Cc: Bjorn Helgaas, Yazen Ghannam, linux-edac, linux-kernel, tony.luck,
x86, avadhut.naik, john.allen, Shyam-sundar.S-k, richard.gong,
jdelvare, linux, clemens, hdegoede, ilpo.jarvinen, linux-pci,
linux-hwmon, platform-driver-x86, naveenkrishna.chatradhi,
carlos.bilbao.osdev
On 11/5/2024 13:59, Borislav Petkov wrote:
> On Tue, Nov 05, 2024 at 01:56:23PM -0600, Mario Limonciello wrote:
>> OK got it. Considering that I think after this series lands we need to
>> re-open the conversation about PCI config space access to userspace;
>> particularly on regions that have been marked as exclusions.
>
> I could imagine a patch that goes and requests those regions exclusively if
> luserspace has no business poking there.
>
Take look at what pci_write_config() does today. It basically shows a
warning and taints but lets userspace proceed.
commit 278294798ac91 ("PCI: Allow drivers to request exclusive config
regions") and the matching mailing list thread linked from the commit
message have some history from it.
I'd personally like to see that taint turned into an "return -EACCES"
instead. But given the discussion linked in that commit, I think it
should be a follow up rather than part of this series.
^ permalink raw reply [flat|nested] 65+ messages in thread
end of thread, other threads:[~2024-11-05 20:53 UTC | newest]
Thread overview: 65+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-23 17:21 [PATCH 00/16] AMD NB and SMN rework Yazen Ghannam
2024-10-23 17:21 ` [PATCH 01/16] x86/mce/amd: Remove shared threshold bank plumbing Yazen Ghannam
2024-10-23 17:21 ` [PATCH 02/16] x86/amd_nb: Restrict init function to AMD-based systems Yazen Ghannam
2024-10-31 8:09 ` Zhuo, Qiuxu
2024-10-31 10:36 ` Borislav Petkov
2024-10-31 11:50 ` Zhuo, Qiuxu
2024-10-31 13:11 ` Borislav Petkov
2024-10-23 17:21 ` [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb() Yazen Ghannam
2024-10-25 15:58 ` Borislav Petkov
2024-10-29 14:39 ` Yazen Ghannam
2024-10-29 15:08 ` Borislav Petkov
2024-10-29 16:15 ` Luck, Tony
2024-10-30 14:21 ` Yazen Ghannam
2024-10-31 0:53 ` Sohil Mehta
2024-10-31 10:34 ` [PATCH] x86/cpufeature: Document cpu_feature_enabled() as the default to use Borislav Petkov
2024-10-31 18:26 ` Sohil Mehta
2024-10-31 19:15 ` Borislav Petkov
2024-11-05 19:59 ` Dave Hansen
2024-10-31 10:24 ` [PATCH 03/16] x86/amd_nb: Clean up early_is_amd_nb() Borislav Petkov
2024-10-23 17:21 ` [PATCH 04/16] x86: Start moving AMD Node functionality out of AMD_NB Yazen Ghannam
2024-10-23 17:21 ` [PATCH 05/16] x86/amd_nb: Simplify function 4 search Yazen Ghannam
2024-10-31 11:15 ` Borislav Petkov
2024-10-23 17:21 ` [PATCH 06/16] x86/amd_nb: Simplify root device search Yazen Ghannam
2024-10-31 7:52 ` Zhuo, Qiuxu
2024-10-31 10:08 ` Ilpo Järvinen
2024-10-31 13:10 ` Zhuo, Qiuxu
2024-10-31 15:34 ` Yazen Ghannam
2024-10-31 15:42 ` Ilpo Järvinen
2024-10-31 15:45 ` Yazen Ghannam
2024-10-31 11:20 ` Borislav Petkov
2024-10-31 15:29 ` Yazen Ghannam
2024-10-23 17:21 ` [PATCH 07/16] x86/amd_nb: Use topology info to get AMD node count Yazen Ghannam
2024-10-23 17:21 ` [PATCH 08/16] x86/amd_nb: Simplify function 3 search Yazen Ghannam
2024-10-23 17:21 ` [PATCH 09/16] x86/amd_nb, x86/amd_node: Simplify amd_pci_dev_to_node_id() Yazen Ghannam
2024-11-04 14:23 ` Borislav Petkov
2024-11-05 14:54 ` Yazen Ghannam
2024-10-23 17:21 ` [PATCH 10/16] x86/amd_nb: Move SMN access code to a new amd_smn driver Yazen Ghannam
2024-11-04 14:29 ` Borislav Petkov
2024-11-05 14:58 ` Yazen Ghannam
2024-11-05 19:42 ` Borislav Petkov
2024-10-23 17:21 ` [PATCH 11/16] x86/amd_smn: Fixup __amd_smn_rw() Yazen Ghannam
2024-11-04 14:32 ` Borislav Petkov
2024-11-05 14:59 ` Yazen Ghannam
2024-10-23 17:21 ` [PATCH 12/16] x86/amd_smn: Remove dependency on AMD_NB Yazen Ghannam
2024-10-23 17:21 ` [PATCH 13/16] x86/amd_smn: Use defines for register offsets Yazen Ghannam
2024-10-23 17:21 ` [PATCH 14/16] x86/amd_smn, platform/x86/amd/hsmp: Have HSMP use SMN Yazen Ghannam
2024-10-24 13:23 ` Ilpo Järvinen
2024-10-24 16:06 ` Yazen Ghannam
2024-10-25 13:39 ` Ilpo Järvinen
2024-10-23 17:21 ` [PATCH 15/16] x86/amd_smn: Add SMN offsets to exclusive region access Yazen Ghannam
2024-10-23 17:21 ` [PATCH 16/16] x86/amd_smn: Add support for debugfs access to SMN registers Yazen Ghannam
2024-11-05 19:21 ` Borislav Petkov
2024-11-05 19:46 ` Mario Limonciello
2024-11-05 19:53 ` Borislav Petkov
2024-11-05 19:56 ` Mario Limonciello
2024-11-05 19:59 ` Borislav Petkov
2024-11-05 20:53 ` Mario Limonciello
2024-10-23 17:59 ` [PATCH 00/16] AMD NB and SMN rework Bjorn Helgaas
2024-10-24 16:01 ` Yazen Ghannam
2024-10-24 17:46 ` Bjorn Helgaas
2024-10-24 20:08 ` Mario Limonciello
2024-10-24 21:06 ` Bjorn Helgaas
2024-10-24 21:20 ` Mario Limonciello
2024-10-24 21:47 ` Bjorn Helgaas
2024-10-31 16:22 ` Yazen Ghannam
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).