* [PATCH v3 23/25] ACPI: power: Switch to sys-off handler API
From: Dmitry Osipenko @ 2021-11-08 0:45 UTC
To: Thierry Reding, Jonathan Hunter, Russell King, Catalin Marinas,
Will Deacon, Guo Ren, Geert Uytterhoeven, Greg Ungerer,
Joshua Thompson, Thomas Bogendoerfer, Nick Hu, Greentime Hu,
Vincent Chen, James E.J. Bottomley, Helge Deller,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Yoshinori Sato,
Rich Felker, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Boris Ostrovsky, Juergen Gross,
Stefano Stabellini, Rafael J. Wysocki, Len Brown,
Santosh Shilimkar, Krzysztof Kozlowski, Liam Girdwood, Mark Brown,
Pavel Machek, Lee Jones, Andrew Morton, Guenter Roeck,
Daniel Lezcano, Andy Shevchenko, Ulf Hansson
Cc: linux-ia64, linux-parisc, linux-sh, linux-pm, linux-kernel,
linux-csky, linux-mips, linux-acpi, linux-m68k, linux-tegra,
xen-devel, linux-riscv, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20211108004524.29465-1-digetx@gmail.com>
Switch to the sys-off handler API, which replaces the legacy pm_power_off
and pm_power_off_prepare callbacks.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
drivers/acpi/sleep.c | 25 +++++++++++--------------
1 file changed, 11 insertions(+), 14 deletions(-)
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index eaa47753b758..2e613fddd614 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -47,19 +47,11 @@ static void acpi_sleep_tts_switch(u32 acpi_state)
}
}
-static int tts_notify_reboot(struct notifier_block *this,
- unsigned long code, void *x)
+static void tts_reboot_prepare(struct reboot_prep_data *data)
{
acpi_sleep_tts_switch(ACPI_STATE_S5);
- return NOTIFY_DONE;
}
-static struct notifier_block tts_notifier = {
- .notifier_call = tts_notify_reboot,
- .next = NULL,
- .priority = 0,
-};
-
static int acpi_sleep_prepare(u32 acpi_state)
{
#ifdef CONFIG_ACPI_SLEEP
@@ -1020,7 +1012,7 @@ static void acpi_sleep_hibernate_setup(void)
static inline void acpi_sleep_hibernate_setup(void) {}
#endif /* !CONFIG_HIBERNATION */
-static void acpi_power_off_prepare(void)
+static void acpi_power_off_prepare(struct power_off_prep_data *data)
{
/* Prepare to power off the system */
acpi_sleep_prepare(ACPI_STATE_S5);
@@ -1028,7 +1020,7 @@ static void acpi_power_off_prepare(void)
acpi_os_wait_events_complete();
}
-static void acpi_power_off(void)
+static void acpi_power_off(struct power_off_data *data)
{
/* acpi_sleep_prepare(ACPI_STATE_S5) should have already been called */
pr_debug("%s called\n", __func__);
@@ -1036,6 +1028,11 @@ static void acpi_power_off(void)
acpi_enter_sleep_state(ACPI_STATE_S5);
}
+static struct sys_off_handler acpi_sys_off_handler = {
+ .power_off_priority = POWEROFF_PRIO_FIRMWARE,
+ .reboot_prepare_cb = tts_reboot_prepare,
+};
+
int __init acpi_sleep_init(void)
{
char supported[ACPI_S_STATE_COUNT * 3 + 1];
@@ -1052,8 +1049,8 @@ int __init acpi_sleep_init(void)
if (acpi_sleep_state_supported(ACPI_STATE_S5)) {
sleep_states[ACPI_STATE_S5] = 1;
- pm_power_off_prepare = acpi_power_off_prepare;
- pm_power_off = acpi_power_off;
+ acpi_sys_off_handler.power_off_cb = acpi_power_off;
+ acpi_sys_off_handler.power_off_prepare_cb = acpi_power_off_prepare;
} else {
acpi_no_s5 = true;
}
@@ -1069,6 +1066,6 @@ int __init acpi_sleep_init(void)
* Register the tts_notifier to reboot notifier list so that the _TTS
* object can also be evaluated when the system enters S5.
*/
- register_reboot_notifier(&tts_notifier);
+ register_sys_off_handler(&acpi_sys_off_handler);
return 0;
}
--
2.33.1
* [PATCH v3 24/25] regulator: pfuze100: Use devm_register_sys_off_handler()
From: Dmitry Osipenko @ 2021-11-08 0:45 UTC
To: Thierry Reding, Jonathan Hunter, Russell King, Catalin Marinas,
Will Deacon, Guo Ren, Geert Uytterhoeven, Greg Ungerer,
Joshua Thompson, Thomas Bogendoerfer, Nick Hu, Greentime Hu,
Vincent Chen, James E.J. Bottomley, Helge Deller,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Yoshinori Sato,
Rich Felker, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Boris Ostrovsky, Juergen Gross,
Stefano Stabellini, Rafael J. Wysocki, Len Brown,
Santosh Shilimkar, Krzysztof Kozlowski, Liam Girdwood, Mark Brown,
Pavel Machek, Lee Jones, Andrew Morton, Guenter Roeck,
Daniel Lezcano, Andy Shevchenko, Ulf Hansson
Cc: linux-ia64, linux-parisc, linux-sh, linux-pm, linux-kernel,
linux-csky, linux-mips, linux-acpi, linux-m68k, linux-tegra,
xen-devel, linux-riscv, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20211108004524.29465-1-digetx@gmail.com>
Use devm_register_sys_off_handler(), which replaces the global
pm_power_off_prepare variable and allows registering multiple
power-off handlers.
Acked-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
drivers/regulator/pfuze100-regulator.c | 38 ++++++++++----------------
1 file changed, 14 insertions(+), 24 deletions(-)
diff --git a/drivers/regulator/pfuze100-regulator.c b/drivers/regulator/pfuze100-regulator.c
index d60d7d1b7fa2..2eca8d43a097 100644
--- a/drivers/regulator/pfuze100-regulator.c
+++ b/drivers/regulator/pfuze100-regulator.c
@@ -10,6 +10,7 @@
#include <linux/of_device.h>
#include <linux/regulator/of_regulator.h>
#include <linux/platform_device.h>
+#include <linux/reboot.h>
#include <linux/regulator/driver.h>
#include <linux/regulator/machine.h>
#include <linux/regulator/pfuze100.h>
@@ -76,6 +77,7 @@ struct pfuze_chip {
struct pfuze_regulator regulator_descs[PFUZE100_MAX_REGULATOR];
struct regulator_dev *regulators[PFUZE100_MAX_REGULATOR];
struct pfuze_regulator *pfuze_regulators;
+ struct sys_off_handler sys_off;
};
static const int pfuze100_swbst[] = {
@@ -569,10 +571,10 @@ static inline struct device_node *match_of_node(int index)
return pfuze_matches[index].of_node;
}
-static struct pfuze_chip *syspm_pfuze_chip;
-
-static void pfuze_power_off_prepare(void)
+static void pfuze_power_off_prepare(struct power_off_prep_data *data)
{
+ struct pfuze_chip *syspm_pfuze_chip = data->cb_data;
+
dev_info(syspm_pfuze_chip->dev, "Configure standby mode for power off");
/* Switch from default mode: APS/APS to APS/Off */
@@ -611,24 +613,23 @@ static void pfuze_power_off_prepare(void)
static int pfuze_power_off_prepare_init(struct pfuze_chip *pfuze_chip)
{
+ int err;
+
if (pfuze_chip->chip_id != PFUZE100) {
dev_warn(pfuze_chip->dev, "Requested pm_power_off_prepare handler for not supported chip\n");
return -ENODEV;
}
- if (pm_power_off_prepare) {
- dev_warn(pfuze_chip->dev, "pm_power_off_prepare is already registered.\n");
- return -EBUSY;
- }
+ pfuze_chip->sys_off.power_off_prepare_cb = pfuze_power_off_prepare;
+ pfuze_chip->sys_off.cb_data = pfuze_chip;
- if (syspm_pfuze_chip) {
- dev_warn(pfuze_chip->dev, "syspm_pfuze_chip is already set.\n");
- return -EBUSY;
+ err = devm_register_sys_off_handler(pfuze_chip->dev, &pfuze_chip->sys_off);
+ if (err) {
+ dev_err(pfuze_chip->dev,
+ "failed to register sys-off handler: %d\n", err);
+ return err;
}
- syspm_pfuze_chip = pfuze_chip;
- pm_power_off_prepare = pfuze_power_off_prepare;
-
return 0;
}
@@ -837,23 +838,12 @@ static int pfuze100_regulator_probe(struct i2c_client *client,
return 0;
}
-static int pfuze100_regulator_remove(struct i2c_client *client)
-{
- if (syspm_pfuze_chip) {
- syspm_pfuze_chip = NULL;
- pm_power_off_prepare = NULL;
- }
-
- return 0;
-}
-
static struct i2c_driver pfuze_driver = {
.driver = {
.name = "pfuze100-regulator",
.of_match_table = pfuze_dt_ids,
},
.probe = pfuze100_regulator_probe,
- .remove = pfuze100_regulator_remove,
};
module_i2c_driver(pfuze_driver);
--
2.33.1
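The main effect of the pfuze100 conversion above is that the callback gets
its chip through data->cb_data instead of a file-scope pointer. Below is a
minimal user-space sketch of that pattern, not kernel code. The struct
layouts are simplified stand-ins that only borrow the field names seen in
the diff (power_off_prepare_cb, cb_data), and every demo_* name is
hypothetical.

```c
#include <stddef.h>

/* Simplified stand-ins for the kernel types; assumption, not the real API. */
struct power_off_prep_data {
	void *cb_data;		/* handed back to the callback by the core */
};

struct sys_off_handler {
	void (*power_off_prepare_cb)(struct power_off_prep_data *data);
	void *cb_data;		/* per-device context chosen by the driver */
};

/* Hypothetical per-device state, playing the role of struct pfuze_chip. */
struct demo_chip {
	const char *name;
	int standby_configured;
};

/* The callback recovers its device from cb_data; no global is needed. */
static void demo_power_off_prepare(struct power_off_prep_data *data)
{
	struct demo_chip *chip = data->cb_data;

	chip->standby_configured = 1;
}

/* Models the core invoking one registered handler with its own cb_data. */
static void demo_call_prepare(const struct sys_off_handler *handler)
{
	struct power_off_prep_data data = { .cb_data = handler->cb_data };

	if (handler->power_off_prepare_cb)
		handler->power_off_prepare_cb(&data);
}
```

Because each handler carries its own cb_data, two chip instances can each
register a handler without colliding, which the old single global pointer
could not express.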
* [PATCH v3 25/25] reboot: Remove pm_power_off_prepare()
From: Dmitry Osipenko @ 2021-11-08 0:45 UTC
To: Thierry Reding, Jonathan Hunter, Russell King, Catalin Marinas,
Will Deacon, Guo Ren, Geert Uytterhoeven, Greg Ungerer,
Joshua Thompson, Thomas Bogendoerfer, Nick Hu, Greentime Hu,
Vincent Chen, James E.J. Bottomley, Helge Deller,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Yoshinori Sato,
Rich Felker, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Boris Ostrovsky, Juergen Gross,
Stefano Stabellini, Rafael J. Wysocki, Len Brown,
Santosh Shilimkar, Krzysztof Kozlowski, Liam Girdwood, Mark Brown,
Pavel Machek, Lee Jones, Andrew Morton, Guenter Roeck,
Daniel Lezcano, Andy Shevchenko, Ulf Hansson
Cc: linux-ia64, linux-parisc, linux-sh, linux-pm, linux-kernel,
linux-csky, linux-mips, linux-acpi, linux-m68k, linux-tegra,
xen-devel, linux-riscv, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20211108004524.29465-1-digetx@gmail.com>
All pm_power_off_prepare() users were converted to the sys-off handler API.
Remove the obsolete callback.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
---
include/linux/pm.h | 1 -
kernel/reboot.c | 11 -----------
2 files changed, 12 deletions(-)
diff --git a/include/linux/pm.h b/include/linux/pm.h
index 1d8209c09686..d9bf1426f81e 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -20,7 +20,6 @@
* Callbacks for platform drivers to implement.
*/
extern void (*pm_power_off)(void);
-extern void (*pm_power_off_prepare)(void);
struct device; /* we have a circular dep with device.h */
#ifdef CONFIG_VT_CONSOLE_SLEEP
diff --git a/kernel/reboot.c b/kernel/reboot.c
index 4884204f9a31..a832bb660040 100644
--- a/kernel/reboot.c
+++ b/kernel/reboot.c
@@ -48,13 +48,6 @@ int reboot_cpu;
enum reboot_type reboot_type = BOOT_ACPI;
int reboot_force;
-/*
- * If set, this is used for preparing the system to power off.
- */
-
-void (*pm_power_off_prepare)(void);
-EXPORT_SYMBOL_GPL(pm_power_off_prepare);
-
/**
* emergency_restart - reboot the system
*
@@ -807,10 +800,6 @@ void do_kernel_power_off(void)
static void do_kernel_power_off_prepare(void)
{
- /* legacy pm_power_off_prepare() is unchained and has highest priority */
- if (pm_power_off_prepare)
- return pm_power_off_prepare();
-
blocking_notifier_call_chain(&power_off_handler_list, POWEROFF_PREPARE,
NULL);
}
--
2.33.1
* Re: [PATCH v3 00/25] Introduce power-off+restart call chain API
From: Dmitry Osipenko @ 2021-11-08 1:14 UTC
To: Thierry Reding, Jonathan Hunter, Russell King, Catalin Marinas,
Will Deacon, Guo Ren, Geert Uytterhoeven, Greg Ungerer,
Joshua Thompson, Thomas Bogendoerfer, Nick Hu, Greentime Hu,
Vincent Chen, James E.J. Bottomley, Helge Deller,
Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Yoshinori Sato,
Rich Felker, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, x86, H. Peter Anvin, Boris Ostrovsky, Juergen Gross,
Stefano Stabellini, Rafael J. Wysocki, Len Brown,
Santosh Shilimkar, Krzysztof Kozlowski, Liam Girdwood, Mark Brown,
Pavel Machek, Lee Jones, Andrew Morton, Guenter Roeck,
Daniel Lezcano, Andy Shevchenko, Ulf Hansson, Sebastian Reichel,
Linus Walleij, Philipp Zabel
Cc: linux-ia64, linux-parisc, linux-sh, linux-pm, linux-kernel,
linux-csky, linux-mips, linux-acpi, linux-m68k, linux-tegra,
xen-devel, linux-riscv, linuxppc-dev, linux-arm-kernel
In-Reply-To: <20211108004524.29465-1-digetx@gmail.com>
08.11.2021 03:44, Dmitry Osipenko wrote:
> Problem
> -------
>
> SoC devices require power-off call chaining functionality from kernel.
> We have a widely used restart chaining provided by restart notifier API,
> but nothing for power-off.
>
> Solution
> --------
>
> Introduce new API that provides both restart and power-off call chains.
>
> Why combine restart with power-off? Because drivers often do both.
> More practical to have API that provides both under the same roof.
>
> The new API is designed with simplicity and extensibility in mind.
> It's built upon the existing restart and reboot APIs. The simplicity
> is in new helper functions that are convenient for drivers. The
> extensibility is in the design that doesn't hardcode callback
> arguments, making it easy to add new parameters and remove old ones.
>
> This is a third attempt to introduce the new API. First was made by
> Guenter Roeck back in 2014, second was made by Thierry Reding in 2017.
> In fact the work didn't stop and recently arm_pm_restart() was removed
> from v5.14 kernel, which was a part of preparatory work started by
> Guenter Roeck. I took into account experience and ideas from the
> previous attempts, extended and polished them.
>
> Adoption plan
> -------------
>
> This patchset introduces the new API. It also converts multiple drivers
> and arch code to the new API to demonstrate how it all looks in practice.
>
> The plan is:
>
> 1. Merge new API (patches 1-8). This API will co-exist with the old APIs.
>
> 2. Convert arch code to do_kernel_power_off() (patches 9-21).
>
> 3. Convert drivers and platform code to the new API.
>
> 4. Remove obsolete pm_power_off and pm_power_off_prepare variables.
>
> 5. Make restart-notifier API private to kernel/reboot.c once no users are left.
>
> It's fully implemented here:
>
> [1] https://github.com/grate-driver/linux/commits/sys-off-handler
>
> For now I'm sending only the first 25 base patches out of ~180. It's
> preferable to squash points 1-2, and partially points 3 and 4, of the plan
> into a single patchset to ease and speed up applying the rest of the
> patches. The majority of driver and platform patches depend on the base,
> hence they will come later (and per subsystem), once the base lands.
>
> All [1] patches are compile-tested. Tegra and x86 ACPI patches are tested
> on hardware. The remaining should be covered by unit tests (unpublished).
>
> Results
> -------
>
> 1. Devices can be powered off properly.
>
> 2. Global variables are removed from drivers.
>
> 3. Global pm_power_off and pm_power_off_prepare callback variables are
> removed once all users are converted to the new API. The latter callback
> is removed by patch #25 of this series.
>
> 4. Ambiguous call chain ordering is prohibited. See patch #5 which adds
> verification of restart handlers priorities, ensuring that they are unique.
>
> Changelog:
>
> v3: - Renamed power_handler to sys_off_handler as was suggested by
> Rafael Wysocki.
>
> - Improved doc-comments as was suggested by Rafael Wysocki. Added more
> doc-comments.
>
> - Implemented full set of 180 patches which convert whole kernel in
> accordance to the plan, see link [1] above. Slightly adjusted API to
> better suit for the remaining converted drivers.
>
> * Added unregister_sys_off_handler() that is handy for a couple old
> platform drivers.
>
> * Dropped devm_register_trivial_restart_handler(), 'simple' variant
> is enough to have.
>
> - Improved "Add atomic/blocking_notifier_has_unique_priority()" patch,
> as was suggested by Andy Shevchenko. Also replaced down_write() with
> down_read() and factored out common notifier_has_unique_priority().
>
> - Added stop_chain field to struct restart_data and reboot_prep_data
> after discovering a couple of drivers wanting that feature.
>
> - Added acks that were given to v2.
>
> v2: - Replaced standalone power-off call chain demo-API with the combined
> power-off+restart API because this is what drivers want. It's a more
> comprehensive solution.
>
> - Converted multiple drivers and arch code to the new API. Suggested by
> Andy Shevchenko. I skimmed through the rest of drivers, verifying that
> new API suits them. The rest of the drivers will be converted once we
> settle on the new API, otherwise there would be too many patches here.
>
> - v2 API doesn't expose the notifier to users and requires handlers to
> have unique priorities. Suggested by Guenter Roeck.
>
> - v2 API has power-off chaining disabled by default and requires
> drivers to explicitly opt in to the chaining. This preserves old
> behaviour for existing drivers once they are converted to the new
> API.
>
> Dmitry Osipenko (25):
> notifier: Remove extern annotation from function prototypes
> notifier: Add blocking_notifier_call_chain_is_empty()
> notifier: Add atomic/blocking_notifier_has_unique_priority()
> reboot: Correct typo in a comment
> reboot: Warn if restart handler has duplicated priority
> reboot: Warn if unregister_restart_handler() fails
> reboot: Remove extern annotation from function prototypes
> kernel: Add combined power-off+restart handler call chain API
> ARM: Use do_kernel_power_off()
> csky: Use do_kernel_power_off()
> riscv: Use do_kernel_power_off()
> arm64: Use do_kernel_power_off()
> parisc: Use do_kernel_power_off()
> xen/x86: Use do_kernel_power_off()
> sh: Use do_kernel_power_off()
> x86: Use do_kernel_power_off()
> ia64: Use do_kernel_power_off()
> mips: Use do_kernel_power_off()
> nds32: Use do_kernel_power_off()
> powerpc: Use do_kernel_power_off()
> m68k: Switch to new sys-off handler API
> memory: emif: Use kernel_can_power_off()
> ACPI: power: Switch to sys-off handler API
> regulator: pfuze100: Use devm_register_sys_off_handler()
> reboot: Remove pm_power_off_prepare()
>
> arch/arm/kernel/reboot.c | 4 +-
> arch/arm64/kernel/process.c | 3 +-
> arch/csky/kernel/power.c | 6 +-
> arch/ia64/kernel/process.c | 4 +-
> arch/m68k/emu/natfeat.c | 3 +-
> arch/m68k/include/asm/machdep.h | 1 -
> arch/m68k/kernel/process.c | 5 +-
> arch/m68k/kernel/setup_mm.c | 1 -
> arch/m68k/kernel/setup_no.c | 1 -
> arch/m68k/mac/config.c | 4 +-
> arch/mips/kernel/reset.c | 3 +-
> arch/nds32/kernel/process.c | 3 +-
> arch/parisc/kernel/process.c | 4 +-
> arch/powerpc/kernel/setup-common.c | 4 +-
> arch/powerpc/xmon/xmon.c | 3 +-
> arch/riscv/kernel/reset.c | 12 +-
> arch/sh/kernel/reboot.c | 3 +-
> arch/x86/kernel/reboot.c | 4 +-
> arch/x86/xen/enlighten_pv.c | 4 +-
> drivers/acpi/sleep.c | 25 +-
> drivers/memory/emif.c | 2 +-
> drivers/regulator/pfuze100-regulator.c | 38 +-
> include/linux/notifier.h | 37 +-
> include/linux/pm.h | 1 -
> include/linux/reboot.h | 305 ++++++++++++--
> kernel/notifier.c | 83 ++++
> kernel/power/hibernate.c | 2 +-
> kernel/reboot.c | 556 ++++++++++++++++++++++++-
> 28 files changed, 985 insertions(+), 136 deletions(-)
>
+CC Linus Walleij, Sebastian Reichel and Philipp Zabel, whom I
accidentally left out.
https://lore.kernel.org/all/20211108004524.29465-1-digetx@gmail.com/T/#t
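For readers new to the thread, the call-chain idea from the quoted cover
letter (handlers ordered by priority, duplicate priorities not allowed, and
a handler able to stop the chain) can be modelled in plain user-space C.
This is only an illustrative sketch. All demo_* names are hypothetical, and
rejecting a duplicate at registration is a simplification: going by the
patch titles, the series itself warns about duplicated restart-handler
priorities.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_HANDLERS 8

/* Hypothetical handler: a priority, a callback, and per-handler data. */
struct demo_handler {
	int priority;
	bool (*cb)(void *cb_data);	/* return true to stop the chain */
	void *cb_data;
};

/* Chain kept sorted by descending priority. */
struct demo_chain {
	struct demo_handler *slots[DEMO_MAX_HANDLERS];
	size_t count;
};

/* Example callback state: records which handlers ran, in order. */
struct demo_log {
	int ids[DEMO_MAX_HANDLERS];
	size_t n;
};

struct demo_cb_arg {
	struct demo_log *log;
	int id;
	bool stop;
};

static bool demo_record(void *cb_data)
{
	struct demo_cb_arg *arg = cb_data;

	arg->log->ids[arg->log->n++] = arg->id;
	return arg->stop;
}

/* Returns 0 on success, -1 if the chain is full, -2 on duplicate priority. */
static int demo_register(struct demo_chain *chain, struct demo_handler *h)
{
	size_t i, pos = 0;

	if (chain->count == DEMO_MAX_HANDLERS)
		return -1;

	for (i = 0; i < chain->count; i++)
		if (chain->slots[i]->priority == h->priority)
			return -2;	/* ordering would be ambiguous */

	while (pos < chain->count && chain->slots[pos]->priority > h->priority)
		pos++;
	for (i = chain->count; i > pos; i--)
		chain->slots[i] = chain->slots[i - 1];
	chain->slots[pos] = h;
	chain->count++;
	return 0;
}

/* Calls handlers from highest to lowest priority; returns how many ran. */
static size_t demo_call_chain(const struct demo_chain *chain)
{
	size_t i, called = 0;

	for (i = 0; i < chain->count; i++) {
		called++;
		if (chain->slots[i]->cb(chain->slots[i]->cb_data))
			break;
	}
	return called;
}
```

With unique priorities the call order is fully determined by the numbers
alone, regardless of the order in which drivers happened to register.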
* [PATCH kernel 0/3] powerpc/pseries/ddw: Fixes for persistent memory case
From: Alexey Kardashevskiy @ 2021-11-08 4:03 UTC
To: linuxppc-dev
Cc: Frederic Barrat, Brian King, Alexey Kardashevskiy, Leonardo Bras
This is based on sha1
f855455dee0b Michael Ellerman "Automatic merge of 'next' into merge (2021-11-05 22:19)".
Please comment. Thanks.
Alexey Kardashevskiy (3):
powerpc/pseries/ddw: Revert "Extend upper limit for huge DMA window
for persistent memory"
powerpc/pseries/ddw: simplify enable_ddw()
powerpc/pseries/ddw: Do not try direct mapping with persistent memory
and one window
arch/powerpc/platforms/pseries/iommu.c | 26 ++++++++------------------
1 file changed, 8 insertions(+), 18 deletions(-)
--
2.30.2
* [PATCH kernel 1/3] powerpc/pseries/ddw: Revert "Extend upper limit for huge DMA window for persistent memory"
From: Alexey Kardashevskiy @ 2021-11-08 4:03 UTC
To: linuxppc-dev
Cc: Frederic Barrat, Brian King, Alexey Kardashevskiy, Leonardo Bras
In-Reply-To: <20211108040320.3857636-1-aik@ozlabs.ru>
This reverts commit 54fc3c681ded9437e4548e2501dc1136b23cfa9a,
which prevents 1:1 mapping even for the system RAM, where it
is usually possible.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
arch/powerpc/platforms/pseries/iommu.c | 9 ---------
1 file changed, 9 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 49b401536d29..64385d6f33c2 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1094,15 +1094,6 @@ static phys_addr_t ddw_memory_hotplug_max(void)
phys_addr_t max_addr = memory_hotplug_max();
struct device_node *memory;
- /*
- * The "ibm,pmemory" can appear anywhere in the address space.
- * Assuming it is still backed by page structs, set the upper limit
- * for the huge DMA window as MAX_PHYSMEM_BITS.
- */
- if (of_find_node_by_type(NULL, "ibm,pmemory"))
- return (sizeof(phys_addr_t) * 8 <= MAX_PHYSMEM_BITS) ?
- (phys_addr_t) -1 : (1ULL << MAX_PHYSMEM_BITS);
-
for_each_node_by_type(memory, "memory") {
unsigned long start, size;
int n_mem_addr_cells, n_mem_size_cells, len;
--
2.30.2
* [PATCH kernel 2/3] powerpc/pseries/ddw: simplify enable_ddw()
From: Alexey Kardashevskiy @ 2021-11-08 4:03 UTC
To: linuxppc-dev
Cc: Frederic Barrat, Brian King, Alexey Kardashevskiy, Leonardo Bras
In-Reply-To: <20211108040320.3857636-1-aik@ozlabs.ru>
This drops the rather useless ddw_enabled flag, as direct_mapping implies
it anyway.
While at it, fix the indentation in enable_ddw().
This should not cause any behavioral change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
This replaces "powerpc/pseries/iommu: Fix indentations"
---
arch/powerpc/platforms/pseries/iommu.c | 11 ++++-------
1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 64385d6f33c2..301fa5b3d528 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1229,7 +1229,6 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
u32 ddw_avail[DDW_APPLICABLE_SIZE];
struct dma_win *window;
struct property *win64;
- bool ddw_enabled = false;
struct failed_ddw_pdn *fpdn;
bool default_win_removed = false, direct_mapping = false;
bool pmem_present;
@@ -1244,7 +1243,6 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
if (find_existing_ddw(pdn, &dev->dev.archdata.dma_offset, &len)) {
direct_mapping = (len >= max_ram_len);
- ddw_enabled = true;
goto out_unlock;
}
@@ -1397,8 +1395,8 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
dev_info(&dev->dev, "failed to map DMA window for %pOF: %d\n",
dn, ret);
- /* Make sure to clean DDW if any TCE was set*/
- clean_dma_window(pdn, win64->value);
+ /* Make sure to clean DDW if any TCE was set*/
+ clean_dma_window(pdn, win64->value);
goto out_del_list;
}
} else {
@@ -1445,7 +1443,6 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
spin_unlock(&dma_win_list_lock);
dev->dev.archdata.dma_offset = win_addr;
- ddw_enabled = true;
goto out_unlock;
out_del_list:
@@ -1481,10 +1478,10 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
* as RAM, then we failed to create a window to cover persistent
* memory and need to set the DMA limit.
*/
- if (pmem_present && ddw_enabled && direct_mapping && len == max_ram_len)
+ if (pmem_present && direct_mapping && len == max_ram_len)
dev->dev.bus_dma_limit = dev->dev.archdata.dma_offset + (1ULL << len);
- return ddw_enabled && direct_mapping;
+ return direct_mapping;
}
static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
--
2.30.2
* [PATCH kernel 3/3] powerpc/pseries/ddw: Do not try direct mapping with persistent memory and one window
From: Alexey Kardashevskiy @ 2021-11-08 4:03 UTC
To: linuxppc-dev
Cc: Frederic Barrat, Brian King, Alexey Kardashevskiy, Leonardo Bras
In-Reply-To: <20211108040320.3857636-1-aik@ozlabs.ru>
There is a possibility of having just one DMA window available with
a limited capacity, which the existing code does not handle well.
If the window is big enough for the system RAM but smaller than
MAX_PHYSMEM_BITS (which we want when persistent memory is present),
we create a 1:1 window and leave persistent memory without DMA.
This disables 1:1 mapping entirely if there is persistent memory and
either:
- the huge DMA window does not cover the entire address space;
- the default DMA window is removed.
This relies on the revert of 54fc3c681ded
("powerpc/pseries/ddw: Extend upper limit for huge DMA window for persistent memory")
to return the actual amount of RAM in ddw_memory_hotplug_max() (posted
separately).
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---
arch/powerpc/platforms/pseries/iommu.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 301fa5b3d528..8f998e55735b 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -1356,8 +1356,10 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
len = order_base_2(query.largest_available_block << page_shift);
win_name = DMA64_PROPNAME;
} else {
- direct_mapping = true;
- win_name = DIRECT64_PROPNAME;
+ direct_mapping = !default_win_removed ||
+ (len == MAX_PHYSMEM_BITS) ||
+ (!pmem_present && (len == max_ram_len));
+ win_name = direct_mapping ? DIRECT64_PROPNAME : DMA64_PROPNAME;
}
ret = create_ddw(dev, ddw_avail, &create, page_shift, len);
--
2.30.2
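The new condition in the else branch of enable_ddw() is easier to reason
about as a pure predicate. The sketch below copies the boolean expression
from the hunk above into a stand-alone function. It is an illustration
only: DEMO_MAX_PHYSMEM_BITS is a placeholder value, the window-name strings
are hypothetical rather than the real DIRECT64_PROPNAME and DMA64_PROPNAME
contents, and both functions are invented for this example.

```c
#include <stdbool.h>
#include <string.h>

/* Placeholder; the real MAX_PHYSMEM_BITS depends on the kernel config. */
#define DEMO_MAX_PHYSMEM_BITS 51

/*
 * Mirrors the expression assigned to direct_mapping in the patch:
 * 1:1 mapping is used if the default window was kept, or the new window
 * spans the whole address space, or there is no persistent memory and
 * the window covers exactly the RAM.
 */
static bool demo_use_direct_mapping(bool default_win_removed,
				    bool pmem_present,
				    int len, int max_ram_len)
{
	return !default_win_removed ||
	       len == DEMO_MAX_PHYSMEM_BITS ||
	       (!pmem_present && len == max_ram_len);
}

/* Hypothetical names standing in for the two window property names. */
static const char *demo_win_name(bool direct_mapping)
{
	return direct_mapping ? "direct64" : "dma64";
}
```

For example, with persistent memory present, the default window removed and
a window that only covers RAM, the predicate is false and the window is set
up under the non-direct name.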
* Re: [PATCH] powerpc/pseries: Fix numa FORM2 parsing fallback code
From: Michael Ellerman @ 2021-11-08 5:20 UTC
To: Nicholas Piggin, linuxppc-dev; +Cc: Aneesh Kumar K . V, Nicholas Piggin
In-Reply-To: <20211105132909.1582449-1-npiggin@gmail.com>
Nicholas Piggin <npiggin@gmail.com> writes:
> In case the FORM2 distance table from firmware is not the expected size,
> there is fallback code that just populates the lookup table as local vs
> remote.
>
> However it then continues on to use the distance table. Fix.
>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Fixes: 1c6b5a7e7405 ("powerpc/pseries: Add support for FORM2 associativity")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/mm/numa.c | 29 +++++++++++++----------------
> 1 file changed, 13 insertions(+), 16 deletions(-)
>
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 6f14c8fb6359..0789cde7f658 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -380,6 +380,7 @@ static void initialize_form2_numa_distance_lookup_table(void)
> const __be32 *numa_lookup_index;
> int numa_dist_table_length;
> int max_numa_index, distance_index;
> + bool good = true;
numa_dist_table is a pointer, so couldn't we just set it to NULL if the
info it's pointing at is invalid?
>
> if (firmware_has_feature(FW_FEATURE_OPAL))
> root = of_find_node_by_path("/ibm,opal");
> @@ -407,30 +408,26 @@ static void initialize_form2_numa_distance_lookup_table(void)
>
> if (numa_dist_table_length != max_numa_index * max_numa_index) {
> WARN(1, "Wrong NUMA distance information\n");
> - /* consider everybody else just remote. */
> - for (i = 0; i < max_numa_index; i++) {
> - for (j = 0; j < max_numa_index; j++) {
> - int nodeA = numa_id_index_table[i];
> - int nodeB = numa_id_index_table[j];
> -
> - if (nodeA == nodeB)
> - numa_distance_table[nodeA][nodeB] = LOCAL_DISTANCE;
> - else
> - numa_distance_table[nodeA][nodeB] = REMOTE_DISTANCE;
> - }
> - }
> + good = false;
ie. numa_dist_table = NULL;
> }
> -
> distance_index = 0;
> for (i = 0; i < max_numa_index; i++) {
> for (j = 0; j < max_numa_index; j++) {
> int nodeA = numa_id_index_table[i];
> int nodeB = numa_id_index_table[j];
> -
> - numa_distance_table[nodeA][nodeB] = numa_dist_table[distance_index++];
> - pr_debug("dist[%d][%d]=%d ", nodeA, nodeB, numa_distance_table[nodeA][nodeB]);
> + int dist;
> +
> + if (good)
if (numa_dist_table)
> + dist = numa_dist_table[distance_index++];
> + else if (nodeA == nodeB)
> + dist = LOCAL_DISTANCE;
> + else
> + dist = REMOTE_DISTANCE;
> + numa_distance_table[nodeA][nodeB] = dist;
> + pr_debug("dist[%d][%d]=%d ", nodeA, nodeB, dist);
> }
> }
> +
> of_node_put(root);
> }
But maybe before we do that we can rename it, because it is really easy
to confuse numa_dist_table and numa_distance_table if you don't look
closely.
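To make the suggestion concrete, here is a user-space sketch of the "NULL
means invalid" variant, assuming a wrong-sized firmware table should simply
be treated as absent. It is not the kernel function: the demo_* names, the
fixed DEMO_MAX_NODES bound and the parameter list are invented for
illustration, and the distance constants are placeholders for
LOCAL_DISTANCE and REMOTE_DISTANCE.

```c
#include <stddef.h>

#define DEMO_LOCAL_DISTANCE	10	/* placeholder for LOCAL_DISTANCE */
#define DEMO_REMOTE_DISTANCE	20	/* placeholder for REMOTE_DISTANCE */
#define DEMO_MAX_NODES		4

/*
 * Fill out[a][b] for every pair of node ids. If the firmware table has
 * the wrong length, drop the pointer and fall back to local vs remote.
 * Every entry of node_ids must be below DEMO_MAX_NODES.
 */
static void demo_fill_distances(int out[DEMO_MAX_NODES][DEMO_MAX_NODES],
				const int *node_ids, int nr_ids,
				const unsigned char *fw_table,
				int fw_table_len)
{
	int i, j, idx = 0;

	if (fw_table_len != nr_ids * nr_ids)
		fw_table = NULL;	/* invalid table: do not read it */

	for (i = 0; i < nr_ids; i++) {
		for (j = 0; j < nr_ids; j++) {
			int a = node_ids[i];
			int b = node_ids[j];
			int dist;

			if (fw_table)
				dist = fw_table[idx++];
			else if (a == b)
				dist = DEMO_LOCAL_DISTANCE;
			else
				dist = DEMO_REMOTE_DISTANCE;

			out[a][b] = dist;
		}
	}
}
```

The pointer doubles as the validity flag, so no separate bool is needed and
the table can never be indexed after it has been judged invalid.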
cheers
* Re: [PATCH] powerpc/64s: introduce CONFIG_MAXSMP to test very large SMP
From: Michael Ellerman @ 2021-11-08 5:28 UTC
To: Nicholas Piggin, linuxppc-dev; +Cc: Nicholas Piggin
In-Reply-To: <20211105041132.1443767-1-npiggin@gmail.com>
Nicholas Piggin <npiggin@gmail.com> writes:
> Similarly to x86, add MAXSMP that should help flush out problems with
> very large SMP and other values associated with very big systems.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> arch/powerpc/Kconfig | 8 ++++++++
> arch/powerpc/platforms/Kconfig.cputype | 5 +++--
> 2 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index b8f6185d3998..d585fcfa456f 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -64,6 +64,13 @@ config NEED_PER_CPU_EMBED_FIRST_CHUNK
> config NEED_PER_CPU_PAGE_FIRST_CHUNK
> def_bool y if PPC64
>
> +config MAXSMP
> + bool "Enable Maximum number of SMP Processors and NUMA Nodes"
> + depends on SMP && DEBUG_KERNEL && PPC_BOOK3S_64
> + help
> + Enable maximum number of CPUS and NUMA Nodes for this architecture.
> + If unsure, say N.
As evidenced by the kernel robot report, I think we need to exclude this
from allyesconfig.
Because our max is 16K, larger than the 8K on x86, we are going to be
constantly hitting stack usage errors in driver code. Getting those
fixed tends to take time, because the driver authors don't see the
warnings when they build for other arches, and because the fixes go via
driver trees.
Making MAXSMP depend on !COMPILE_TEST should do the trick.
cheers
* [PATCH v3] perf vendor events power10: Add metric events json file for power10 platform
From: Kajol Jain @ 2021-11-08 6:00 UTC
To: acme
Cc: atrajeev, rnsastry, linuxppc-dev, linux-kernel, linux-perf-users,
maddy, pc, kjain, jolsa
Add PMU metric JSON file for the power10 platform.
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
---
Changelog
v2 -> v3:
- Made nit changes in BriefDescription as suggested
by Paul A. Clarke and Michael Ellerman
- Link to the v2 patch: https://lkml.org/lkml/2021/10/22/48
v1 -> v2:
- Made some nit changes in BriefDescription field
as suggested by Paul A. Clarke
- Link to the v1 patch: https://lkml.org/lkml/2021/10/6/131
.../arch/powerpc/power10/metrics.json | 676 ++++++++++++++++++
1 file changed, 676 insertions(+)
create mode 100644 tools/perf/pmu-events/arch/powerpc/power10/metrics.json
diff --git a/tools/perf/pmu-events/arch/powerpc/power10/metrics.json b/tools/perf/pmu-events/arch/powerpc/power10/metrics.json
new file mode 100644
index 000000000000..8adab5cd9934
--- /dev/null
+++ b/tools/perf/pmu-events/arch/powerpc/power10/metrics.json
@@ -0,0 +1,676 @@
+[
+ {
+ "BriefDescription": "Percentage of cycles that are run cycles",
+ "MetricExpr": "PM_RUN_CYC / PM_CYC * 100",
+ "MetricGroup": "General",
+ "MetricName": "RUN_CYCLES_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction",
+ "MetricExpr": "PM_CYC / PM_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "CYCLES_PER_INSTRUCTION"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled for any reason",
+ "MetricExpr": "PM_DISP_STALL_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled because there was a flush",
+ "MetricExpr": "PM_DISP_STALL_FLUSH / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_FLUSH_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled because the MMU was handling a translation miss",
+ "MetricExpr": "PM_DISP_STALL_TRANSLATION / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_TRANSLATION_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled waiting to resolve an instruction ERAT miss",
+ "MetricExpr": "PM_DISP_STALL_IERAT_ONLY_MISS / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_IERAT_ONLY_MISS_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled waiting to resolve an instruction TLB miss",
+ "MetricExpr": "PM_DISP_STALL_ITLB_MISS / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_ITLB_MISS_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled due to an icache miss",
+ "MetricExpr": "PM_DISP_STALL_IC_MISS / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_IC_MISS_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled while the instruction was fetched from the local L2",
+ "MetricExpr": "PM_DISP_STALL_IC_L2 / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_IC_L2_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled while the instruction was fetched from the local L3",
+ "MetricExpr": "PM_DISP_STALL_IC_L3 / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_IC_L3_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled while the instruction was fetched from any source beyond the local L3",
+ "MetricExpr": "PM_DISP_STALL_IC_L3MISS / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_IC_L3MISS_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled due to an icache miss after a branch mispredict",
+ "MetricExpr": "PM_DISP_STALL_BR_MPRED_ICMISS / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_BR_MPRED_ICMISS_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled while instruction was fetched from the local L2 after suffering a branch mispredict",
+ "MetricExpr": "PM_DISP_STALL_BR_MPRED_IC_L2 / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_BR_MPRED_IC_L2_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled while instruction was fetched from the local L3 after suffering a branch mispredict",
+ "MetricExpr": "PM_DISP_STALL_BR_MPRED_IC_L3 / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_BR_MPRED_IC_L3_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled while instruction was fetched from any source beyond the local L3 after suffering a branch mispredict",
+ "MetricExpr": "PM_DISP_STALL_BR_MPRED_IC_L3MISS / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_BR_MPRED_IC_L3MISS_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled due to a branch mispredict",
+ "MetricExpr": "PM_DISP_STALL_BR_MPRED / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_BR_MPRED_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction was held at dispatch for any reason",
+ "MetricExpr": "PM_DISP_STALL_HELD_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_HELD_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction was held at dispatch because of a synchronizing instruction that requires the ICT to be empty before dispatch",
+ "MetricExpr": "PM_DISP_STALL_HELD_SYNC_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISP_HELD_STALL_SYNC_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction was held at dispatch while waiting on the scoreboard",
+ "MetricExpr": "PM_DISP_STALL_HELD_SCOREBOARD_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISP_HELD_STALL_SCOREBOARD_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction was held at dispatch due to issue queue full",
+ "MetricExpr": "PM_DISP_STALL_HELD_ISSQ_FULL_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISP_HELD_STALL_ISSQ_FULL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction was held at dispatch because the mapper/SRB was full",
+ "MetricExpr": "PM_DISP_STALL_HELD_RENAME_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_HELD_RENAME_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction was held at dispatch because the STF mapper/SRB was full",
+ "MetricExpr": "PM_DISP_STALL_HELD_STF_MAPPER_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_HELD_STF_MAPPER_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction was held at dispatch because the XVFC mapper/SRB was full",
+ "MetricExpr": "PM_DISP_STALL_HELD_XVFC_MAPPER_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_HELD_XVFC_MAPPER_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction was held at dispatch for any other reason",
+ "MetricExpr": "PM_DISP_STALL_HELD_OTHER_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_HELD_OTHER_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction has been dispatched but not issued for any reason",
+ "MetricExpr": "PM_ISSUE_STALL / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "ISSUE_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is waiting to be finished in one of the execution units",
+ "MetricExpr": "PM_EXEC_STALL / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "EXECUTION_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction spent executing an NTC instruction that gets flushed some time after dispatch",
+ "MetricExpr": "PM_EXEC_STALL_NTC_FLUSH / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "NTC_FLUSH_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTF instruction finishes at dispatch",
+ "MetricExpr": "PM_EXEC_STALL_FIN_AT_DISP / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "FIN_AT_DISP_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is executing in the branch unit",
+ "MetricExpr": "PM_EXEC_STALL_BRU / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "BRU_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is a simple fixed point instruction that is executing in the LSU",
+ "MetricExpr": "PM_EXEC_STALL_SIMPLE_FX / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "SIMPLE_FX_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is executing in the VSU",
+ "MetricExpr": "PM_EXEC_STALL_VSU / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "VSU_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is waiting to be finished in one of the execution units",
+ "MetricExpr": "PM_EXEC_STALL_TRANSLATION / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "TRANSLATION_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is a load or store that suffered a translation miss",
+ "MetricExpr": "PM_EXEC_STALL_DERAT_ONLY_MISS / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DERAT_ONLY_MISS_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is recovering from a TLB miss",
+ "MetricExpr": "PM_EXEC_STALL_DERAT_DTLB_MISS / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DERAT_DTLB_MISS_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is executing in the LSU",
+ "MetricExpr": "PM_EXEC_STALL_LSU / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "LSU_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is a load that is executing in the LSU",
+ "MetricExpr": "PM_EXEC_STALL_LOAD / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "LOAD_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is waiting for a load miss to resolve from either the local L2 or local L3",
+ "MetricExpr": "PM_EXEC_STALL_DMISS_L2L3 / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DMISS_L2L3_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is waiting for a load miss to resolve from either the local L2 or local L3, with an RC dispatch conflict",
+ "MetricExpr": "PM_EXEC_STALL_DMISS_L2L3_CONFLICT / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DMISS_L2L3_CONFLICT_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is waiting for a load miss to resolve from either the local L2 or local L3, without an RC dispatch conflict",
+ "MetricExpr": "PM_EXEC_STALL_DMISS_L2L3_NOCONFLICT / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DMISS_L2L3_NOCONFLICT_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is waiting for a load miss to resolve from a source beyond the local L2 and local L3",
+ "MetricExpr": "PM_EXEC_STALL_DMISS_L3MISS / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DMISS_L3MISS_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is waiting for a load miss to resolve from a neighbor chiplet's L2 or L3 in the same chip",
+ "MetricExpr": "PM_EXEC_STALL_DMISS_L21_L31 / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DMISS_L21_L31_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is waiting for a load miss to resolve from L4, local memory or OpenCAPI chip",
+ "MetricExpr": "PM_EXEC_STALL_DMISS_LMEM / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DMISS_LMEM_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is waiting for a load miss to resolve from a remote chip (cache, L4, memory or OpenCAPI) in the same group",
+ "MetricExpr": "PM_EXEC_STALL_DMISS_OFF_CHIP / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DMISS_OFF_CHIP_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is waiting for a load miss to resolve from a distant chip (cache, L4, memory or OpenCAPI chip)",
+ "MetricExpr": "PM_EXEC_STALL_DMISS_OFF_NODE / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DMISS_OFF_NODE_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is executing a TLBIEL instruction",
+ "MetricExpr": "PM_EXEC_STALL_TLBIEL / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "TLBIEL_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is finishing a load after its data has been reloaded from a data source beyond the local L1, OR when the LSU is processing an L1-hit, OR when the NTF instruction merged with another load in the LMQ",
+ "MetricExpr": "PM_EXEC_STALL_LOAD_FINISH / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "LOAD_FINISH_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is a store that is executing in the LSU",
+ "MetricExpr": "PM_EXEC_STALL_STORE / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "STORE_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is in the store unit outside of handling store misses or other special store operations",
+ "MetricExpr": "PM_EXEC_STALL_STORE_PIPE / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "STORE_PIPE_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is a store whose cache line was not resident in the L1 and had to wait for allocation of the missing line into the L1",
+ "MetricExpr": "PM_EXEC_STALL_STORE_MISS / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "STORE_MISS_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is a TLBIE instruction waiting for a response from the L2",
+ "MetricExpr": "PM_EXEC_STALL_TLBIE / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "TLBIE_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is executing a PTESYNC instruction",
+ "MetricExpr": "PM_EXEC_STALL_PTESYNC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "PTESYNC_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction cannot complete because the thread was blocked",
+ "MetricExpr": "PM_CMPL_STALL / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "COMPLETION_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction cannot complete because it was interrupted by ANY exception",
+ "MetricExpr": "PM_CMPL_STALL_EXCEPTION / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "EXCEPTION_COMPLETION_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is stuck at finish waiting for the non-speculative finish of either a STCX instruction waiting for its result or a load waiting for non-critical sectors of data and ECC",
+ "MetricExpr": "PM_CMPL_STALL_MEM_ECC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "MEM_ECC_COMPLETION_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is a STCX instruction waiting for resolution from the nest",
+ "MetricExpr": "PM_CMPL_STALL_STCX / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "STCX_COMPLETION_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is a LWSYNC instruction waiting to complete",
+ "MetricExpr": "PM_CMPL_STALL_LWSYNC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "LWSYNC_COMPLETION_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction is a HWSYNC instruction stuck at finish waiting for a response from the L2",
+ "MetricExpr": "PM_CMPL_STALL_HWSYNC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "HWSYNC_COMPLETION_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction required special handling before completion",
+ "MetricExpr": "PM_CMPL_STALL_SPECIAL / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "SPECIAL_COMPLETION_STALL_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when dispatch was stalled because fetch was being held, so there was nothing in the pipeline for this thread",
+ "MetricExpr": "PM_DISP_STALL_FETCH / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_FETCH_CPI"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTC instruction was held at dispatch because of power management",
+ "MetricExpr": "PM_DISP_STALL_HELD_HALT_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "CPI",
+ "MetricName": "DISPATCHED_HELD_HALT_CPI"
+ },
+ {
+ "BriefDescription": "Percentage of flushes per completed instruction",
+ "MetricExpr": "PM_FLUSH / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Others",
+ "MetricName": "FLUSH_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of flushes due to a branch mispredict per completed instruction",
+ "MetricExpr": "PM_FLUSH_MPRED / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Others",
+ "MetricName": "BR_MPRED_FLUSH_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of branch mispredictions per completed instruction",
+ "MetricExpr": "PM_BR_MPRED_CMPL / PM_RUN_INST_CMPL",
+ "MetricGroup": "Others",
+ "MetricName": "BRANCH_MISPREDICTION_RATE"
+ },
+ {
+ "BriefDescription": "Percentage of finished loads that missed in the L1",
+ "MetricExpr": "PM_LD_MISS_L1 / PM_LD_REF_L1 * 100",
+ "MetricGroup": "Others",
+ "MetricName": "L1_LD_MISS_RATIO",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of completed instructions that were loads that missed the L1",
+ "MetricExpr": "PM_LD_MISS_L1 / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Others",
+ "MetricName": "L1_LD_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of completed instructions when the DPTEG required for the load/store instruction in execution was missing from the TLB",
+ "MetricExpr": "PM_DTLB_MISS / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Others",
+ "MetricName": "DTLB_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Average number of instructions dispatched per completed instruction",
+ "MetricExpr": "PM_INST_DISP / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "DISPATCH_PER_INST_CMPL"
+ },
+ {
+ "BriefDescription": "Percentage of completed instructions that were a demand load that did not hit in the L1 or L2",
+ "MetricExpr": "PM_DATA_FROM_L2MISS / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "General",
+ "MetricName": "L2_LD_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of completed instructions that were demand fetches that missed the L1 icache",
+ "MetricExpr": "PM_L1_ICACHE_MISS / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Instruction_Misses",
+ "MetricName": "L1_INST_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of completed instructions that were demand fetches that reloaded from beyond the L3 icache",
+ "MetricExpr": "PM_INST_FROM_L3MISS / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "General",
+ "MetricName": "L3_INST_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Average number of completed instructions per cycle",
+ "MetricExpr": "PM_INST_CMPL / PM_CYC",
+ "MetricGroup": "General",
+ "MetricName": "IPC"
+ },
+ {
+ "BriefDescription": "Average number of cycles per completed instruction group",
+ "MetricExpr": "PM_CYC / PM_1PLUS_PPC_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "CYCLES_PER_COMPLETED_INSTRUCTIONS_SET"
+ },
+ {
+ "BriefDescription": "Percentage of cycles when at least 1 instruction dispatched",
+ "MetricExpr": "PM_1PLUS_PPC_DISP / PM_RUN_CYC * 100",
+ "MetricGroup": "General",
+ "MetricName": "CYCLES_ATLEAST_ONE_INST_DISPATCHED",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Average number of finished loads per completed instruction",
+ "MetricExpr": "PM_LD_REF_L1 / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "LOADS_PER_INST"
+ },
+ {
+ "BriefDescription": "Average number of finished stores per completed instruction",
+ "MetricExpr": "PM_ST_FIN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "STORES_PER_INST"
+ },
+ {
+ "BriefDescription": "Percentage of demand loads that reloaded from beyond the L2 per completed instruction",
+ "MetricExpr": "PM_DATA_FROM_L2MISS / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "dL1_Reloads",
+ "MetricName": "DL1_RELOAD_FROM_L2_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of demand loads that reloaded from beyond the L3 per completed instruction",
+ "MetricExpr": "PM_DATA_FROM_L3MISS / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "dL1_Reloads",
+ "MetricName": "DL1_RELOAD_FROM_L3_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of DERAT misses with 4k page size per completed instruction",
+ "MetricExpr": "PM_DERAT_MISS_4K / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Translation",
+ "MetricName": "DERAT_4K_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of DERAT misses with 64k page size per completed instruction",
+ "MetricExpr": "PM_DERAT_MISS_64K / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Translation",
+ "MetricName": "DERAT_64K_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Average number of run cycles per completed instruction",
+ "MetricExpr": "PM_RUN_CYC / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "RUN_CPI"
+ },
+ {
+ "BriefDescription": "Percentage of DERAT misses per completed instruction",
+ "MetricExpr": "PM_DERAT_MISS / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Translation",
+ "MetricName": "DERAT_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Average number of completed instructions per run cycle",
+ "MetricExpr": "PM_RUN_INST_CMPL / PM_RUN_CYC",
+ "MetricGroup": "General",
+ "MetricName": "RUN_IPC"
+ },
+ {
+ "BriefDescription": "Average number of completed instructions per instruction group",
+ "MetricExpr": "PM_RUN_INST_CMPL / PM_1PLUS_PPC_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "AVERAGE_COMPLETED_INSTRUCTION_SET_SIZE"
+ },
+ {
+ "BriefDescription": "Average number of finished instructions per completed instructions",
+ "MetricExpr": "PM_INST_FIN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "INST_FIN_PER_CMPL"
+ },
+ {
+ "BriefDescription": "Average cycles per completed instruction when the NTF instruction is completing and the finish was overlooked",
+ "MetricExpr": "PM_EXEC_STALL_UNKNOWN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "EXEC_STALL_UNKOWN_CPI"
+ },
+ {
+ "BriefDescription": "Percentage of finished branches that were taken",
+ "MetricExpr": "PM_BR_TAKEN_CMPL / PM_BR_FIN * 100",
+ "MetricGroup": "General",
+ "MetricName": "TAKEN_BRANCHES",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of completed instructions that were a demand load that did not hit in the L1, L2, or the L3",
+ "MetricExpr": "PM_DATA_FROM_L3MISS / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "General",
+ "MetricName": "L3_LD_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Average number of finished branches per completed instruction",
+ "MetricExpr": "PM_BR_FIN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "BRANCHES_PER_INST"
+ },
+ {
+ "BriefDescription": "Average number of instructions finished in the LSU per completed instruction",
+ "MetricExpr": "PM_LSU_FIN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "LSU_PER_INST"
+ },
+ {
+ "BriefDescription": "Average number of instructions finished in the VSU per completed instruction",
+ "MetricExpr": "PM_VSU_FIN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "VSU_PER_INST"
+ },
+ {
+ "BriefDescription": "Average number of TLBIE instructions finished in the LSU per completed instruction",
+ "MetricExpr": "PM_TLBIE_FIN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "TLBIE_PER_INST"
+ },
+ {
+ "BriefDescription": "Average number of STCX instructions finished per completed instruction",
+ "MetricExpr": "PM_STCX_FIN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "STXC_PER_INST"
+ },
+ {
+ "BriefDescription": "Average number of LARX instructions finished per completed instruction",
+ "MetricExpr": "PM_LARX_FIN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "LARX_PER_INST"
+ },
+ {
+ "BriefDescription": "Average number of PTESYNC instructions finished per completed instruction",
+ "MetricExpr": "PM_PTESYNC_FIN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "PTESYNC_PER_INST"
+ },
+ {
+ "BriefDescription": "Average number of simple fixed-point instructions finished in the store unit per completed instruction",
+ "MetricExpr": "PM_FX_LSU_FIN / PM_RUN_INST_CMPL",
+ "MetricGroup": "General",
+ "MetricName": "FX_PER_INST"
+ },
+ {
+ "BriefDescription": "Percentage of demand load misses that reloaded the L1 cache",
+ "MetricExpr": "PM_LD_DEMAND_MISS_L1 / PM_LD_MISS_L1 * 100",
+ "MetricGroup": "General",
+ "MetricName": "DL1_MISS_RELOADS",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of demand load misses that reloaded from beyond the local L2",
+ "MetricExpr": "PM_DATA_FROM_L2MISS / PM_LD_DEMAND_MISS_L1 * 100",
+ "MetricGroup": "dL1_Reloads",
+ "MetricName": "DL1_RELOAD_FROM_L2_MISS",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of demand load misses that reloaded from beyond the local L3",
+ "MetricExpr": "PM_DATA_FROM_L3MISS / PM_LD_DEMAND_MISS_L1 * 100",
+ "MetricGroup": "dL1_Reloads",
+ "MetricName": "DL1_RELOAD_FROM_L3_MISS",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of cycles stalled due to the NTC instruction waiting for a load miss to resolve from a source beyond the local L2 and local L3",
+ "MetricExpr": "DMISS_L3MISS_STALL_CPI / RUN_CPI * 100",
+ "MetricGroup": "General",
+ "MetricName": "DCACHE_MISS_CPI",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of DERAT misses with 2M page size per completed instruction",
+ "MetricExpr": "PM_DERAT_MISS_2M / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Translation",
+ "MetricName": "DERAT_2M_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of DERAT misses with 16M page size per completed instruction",
+ "MetricExpr": "PM_DERAT_MISS_16M / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Translation",
+ "MetricName": "DERAT_16M_MISS_RATE",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "DERAT miss ratio for 4K page size",
+ "MetricExpr": "PM_DERAT_MISS_4K / PM_DERAT_MISS",
+ "MetricGroup": "Translation",
+ "MetricName": "DERAT_4K_MISS_RATIO"
+ },
+ {
+ "BriefDescription": "DERAT miss ratio for 2M page size",
+ "MetricExpr": "PM_DERAT_MISS_2M / PM_DERAT_MISS",
+ "MetricGroup": "Translation",
+ "MetricName": "DERAT_2M_MISS_RATIO"
+ },
+ {
+ "BriefDescription": "DERAT miss ratio for 16M page size",
+ "MetricExpr": "PM_DERAT_MISS_16M / PM_DERAT_MISS",
+ "MetricGroup": "Translation",
+ "MetricName": "DERAT_16M_MISS_RATIO"
+ },
+ {
+ "BriefDescription": "DERAT miss ratio for 64K page size",
+ "MetricExpr": "PM_DERAT_MISS_64K / PM_DERAT_MISS",
+ "MetricGroup": "Translation",
+ "MetricName": "DERAT_64K_MISS_RATIO"
+ },
+ {
+ "BriefDescription": "Percentage of DERAT misses that resulted in TLB reloads",
+ "MetricExpr": "PM_DTLB_MISS / PM_DERAT_MISS * 100",
+ "MetricGroup": "Translation",
+ "MetricName": "DERAT_MISS_RELOAD",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of icache misses that were reloaded from beyond the local L3",
+ "MetricExpr": "PM_INST_FROM_L3MISS / PM_L1_ICACHE_MISS * 100",
+ "MetricGroup": "Instruction_Misses",
+ "MetricName": "INST_FROM_L3_MISS",
+ "ScaleUnit": "1%"
+ },
+ {
+ "BriefDescription": "Percentage of icache reloads from beyond the L3 per completed instruction",
+ "MetricExpr": "PM_INST_FROM_L3MISS / PM_RUN_INST_CMPL * 100",
+ "MetricGroup": "Instruction_Misses",
+ "MetricName": "INST_FROM_L3_MISS_RATE",
+ "ScaleUnit": "1%"
+ }
+]
--
2.26.2
^ permalink raw reply related
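The metric expressions in the patch above are plain ratios of PMU event counts, and DCACHE_MISS_CPI is built from two other metrics. A minimal C sketch of that arithmetic follows. It is not perf code: the struct, the helper names, and the choice to return 0 for a zero denominator are all assumptions made for illustration, and the counter values used in show_example() are invented.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical container for raw counter readings. The field names
 * mirror the PMU event names used in the MetricExpr strings.
 */
struct p10_counts {
	double pm_run_cyc;
	double pm_run_inst_cmpl;
	double pm_exec_stall_dmiss_l3miss;
};

/* Guarded division; returning 0 for a zero denominator is an assumption. */
static double ratio(double num, double den)
{
	return den != 0.0 ? num / den : 0.0;
}

/* RUN_CPI: PM_RUN_CYC / PM_RUN_INST_CMPL */
static double run_cpi(const struct p10_counts *c)
{
	return ratio(c->pm_run_cyc, c->pm_run_inst_cmpl);
}

/* DMISS_L3MISS_STALL_CPI: PM_EXEC_STALL_DMISS_L3MISS / PM_RUN_INST_CMPL */
static double dmiss_l3miss_stall_cpi(const struct p10_counts *c)
{
	return ratio(c->pm_exec_stall_dmiss_l3miss, c->pm_run_inst_cmpl);
}

/* DCACHE_MISS_CPI: DMISS_L3MISS_STALL_CPI / RUN_CPI * 100 */
static double dcache_miss_cpi(const struct p10_counts *c)
{
	return ratio(dmiss_l3miss_stall_cpi(c), run_cpi(c)) * 100.0;
}

/* Prints the three metrics for one set of invented counter values. */
void show_example(void)
{
	struct p10_counts c = { 2000.0, 1000.0, 500.0 };

	assert(c.pm_run_inst_cmpl > 0.0);
	printf("RUN_CPI                = %.2f\n", run_cpi(&c));
	printf("DMISS_L3MISS_STALL_CPI = %.2f\n", dmiss_l3miss_stall_cpi(&c));
	printf("DCACHE_MISS_CPI        = %.1f%%\n", dcache_miss_cpi(&c));
}
```

Because both inputs are divided by PM_RUN_INST_CMPL, the instruction count cancels and DCACHE_MISS_CPI reduces to the share of run cycles spent in that stall.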
* Re: [PATCH 6/7] include: mfd: Remove leftovers from bd70528 watchdog
From: Vaittinen, Matti @ 2021-11-08 6:20 UTC (permalink / raw)
To: Alexandre Ghiti, Steve French, Jonathan Corbet, David Howells,
Russell King, Thomas Bogendoerfer, Michael Ellerman,
Benjamin Herrenschmidt, Paul Mackerras, Yoshinori Sato,
Rich Felker, Lee Jones, Jeff Layton, Greg Kroah-Hartman,
Arnd Bergmann, Ronnie Sahlberg, Guenter Roeck, Wim Van Sebroeck,
Lukas Bulwahn, Luis Chamberlain, Kalle Valo,
linux-cifs@vger.kernel.org, samba-technical@lists.samba.org,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-cachefs@redhat.com, linux-arm-kernel@lists.infradead.org,
linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-sh@vger.kernel.org, linux-power
In-Reply-To: <20211105154334.1841927-7-alexandre.ghiti@canonical.com>
Thanks Alexandre,
On 11/5/21 17:43, Alexandre Ghiti wrote:
> This driver was removed so remove all references to it.
>
> Fixes: 52a5502507bc ("watchdog: bd70528 drop bd70528 support")
> Signed-off-by: Alexandre Ghiti <alexandre.ghiti@canonical.com>
> ---
> include/linux/mfd/rohm-bd70528.h | 24 ------------------------
> 1 file changed, 24 deletions(-)
>
> diff --git a/include/linux/mfd/rohm-bd70528.h b/include/linux/mfd/rohm-bd70528.h
This whole header should be dropped. I've already sent a patch for this
during the previous cycle. I guess I need to respin that.
https://lore.kernel.org/lkml/b288b97d-4c5f-1966-92b0-e949588ba97e@fi.rohmeurope.com/
Best Regards
--Matti
^ permalink raw reply
* Re: [PATCH v3 21/25] m68k: Switch to new sys-off handler API
From: Geert Uytterhoeven @ 2021-11-08 7:47 UTC (permalink / raw)
To: Dmitry Osipenko
Cc: Ulf Hansson, Rich Felker, linux-ia64, Santosh Shilimkar,
Rafael J. Wysocki, Boris Ostrovsky, Dave Hansen, linux-kernel,
James E.J. Bottomley, Thierry Reding, Guo Ren, Pavel Machek,
H. Peter Anvin, linux-riscv, Vincent Chen, Will Deacon,
Greg Ungerer, Stefano Stabellini, Yoshinori Sato,
Krzysztof Kozlowski, linux-sh, Helge Deller,
the arch/x86 maintainers, Russell King, linux-csky,
Jonathan Hunter, linux-acpi, Ingo Molnar, linux-parisc,
Catalin Marinas, xen-devel, linux-mips, Guenter Roeck, Len Brown,
Albert Ou, Lee Jones, linux-m68k, Mark Brown, Borislav Petkov,
Greentime Hu, Paul Walmsley, linux-tegra, Thomas Gleixner,
Andy Shevchenko, linux-arm-kernel, Juergen Gross,
Thomas Bogendoerfer, Daniel Lezcano, Nick Hu, linux-pm,
Liam Girdwood, Palmer Dabbelt, Paul Mackerras, Andrew Morton,
linuxppc-dev, Joshua Thompson
In-Reply-To: <20211108004524.29465-22-digetx@gmail.com>
On Mon, Nov 8, 2021 at 1:48 AM Dmitry Osipenko <digetx@gmail.com> wrote:
> Kernel now supports chained power-off handlers. Use
> register_power_off_handler() that registers power-off handlers and
> do_kernel_power_off() that invokes chained power-off handlers. Legacy
> pm_power_off() will be removed once all drivers are converted to
> the new power-off API.
>
> Normally arch code should adopt only the do_kernel_power_off() at first,
> but m68k is a special case because it uses pm_power_off() "inside out",
> i.e. pm_power_off() invokes machine_power_off() [in fact it does nothing],
> while it's machine_power_off() that should invoke the pm_power_off(), and
> thus, we can't convert platforms to the new API separately. There are only
> two platforms changed here, so it's not a big deal.
>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply
* Re: [PATCH 5/5] KVM: Convert the kvm->vcpus array to a xarray
From: Marc Zyngier @ 2021-11-08 8:23 UTC (permalink / raw)
To: Sean Christopherson
Cc: Juergen Gross, Alexandru Elisei, Anup Patel, Janosch Frank, kvm,
Christian Borntraeger, Huacai Chen, David Hildenbrand, linux-mips,
Nicholas Piggin, Atish Patra, Aleksandar Markovic, Paul Mackerras,
James Morse, Paolo Bonzini, kernel-team, Claudio Imbrenda,
linuxppc-dev, kvmarm, Suzuki K Poulose
In-Reply-To: <87mtmhec88.wl-maz@kernel.org>
On 2021-11-06 11:48, Marc Zyngier wrote:
> On Fri, 05 Nov 2021 20:21:36 +0000,
> Sean Christopherson <seanjc@google.com> wrote:
>>
>> On Fri, Nov 05, 2021, Marc Zyngier wrote:
>> > At least on arm64 and x86, the vcpus array is pretty huge (512 entries),
>> > and is mostly empty in most cases (running 512 vcpu VMs is not that
>> > common). This means that we end up with a 4kB block of unused memory
>> > in the middle of the kvm structure.
>>
>> Heh, x86 is now up to 1024 entries.
>
> Humph. I don't want to know whether people are actually using that in
> practice. The only time I create VMs with 512 vcpus is to check
> whether it still works...
>
>>
>> > Instead of wasting away this memory, let's use an xarray instead,
>> > which gives us almost the same flexibility as a normal array, but
>> > with a reduced memory usage with smaller VMs.
>> >
>> > Signed-off-by: Marc Zyngier <maz@kernel.org>
>> > ---
>> > @@ -693,7 +694,7 @@ static inline struct kvm_vcpu *kvm_get_vcpu(struct kvm *kvm, int i)
>> >
>> > /* Pairs with smp_wmb() in kvm_vm_ioctl_create_vcpu. */
>> > smp_rmb();
>> > - return kvm->vcpus[i];
>> > + return xa_load(&kvm->vcpu_array, i);
>> > }
>>
>> It'd be nice for this series to convert kvm_for_each_vcpu() to use
>> xa_for_each() as well. Maybe as a patch on top so that potential
>> explosions from that are isolated from the initial conversion?
>>
>> Or maybe even use xa_for_each_range() to cap at online_vcpus?
>> That's technically a functional change, but IMO it's easier to
>> reason about iterating over a snapshot of vCPUs as opposed to being
>> able to iterate over vCPUs as they're being added. In practice I
>> doubt it matters.
>>
>> #define kvm_for_each_vcpu(idx, vcpup, kvm) \
>> xa_for_each_range(&kvm->vcpu_array, idx, vcpup, 0,
>> atomic_read(&kvm->online_vcpus))
>>
>
> I think that's already the behaviour of this iterator (we stop at the
> first empty slot, capped to online_vcpus). The only change in behaviour
> is that vcpup currently holds a pointer to the last vcpu if no empty
> slot has been encountered. xa_for_each{,_range}() would set the
> pointer to NULL at all times.
>
> I doubt anyone relies on that, but it is probably worth eyeballing
> some of the use cases...
This turned out to be an interesting exercise, as we always use an
int for the index, and the xarray iterators insist on an unsigned
long (and even on a pointer to it). On the other hand, I couldn't
spot any case where we'd rely on the last value of the vcpu pointer.
I'll repost the series once we have a solution for patch #4, and
we can then decide whether we want the iterator churn.
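
To illustrate the index-type mismatch described above, here is a minimal
userspace sketch. It does not use the real xarray API: struct sparse_array,
sparse_find(), sparse_for_each_range() and count_online() are hypothetical
stand-ins that only mimic the shape of an iterator which takes a pointer to
an unsigned long index. The upper bound is assumed to be inclusive, hence
the "online - 1".

```c
#include <stddef.h>

#define MAX_SLOTS 8

/* Hypothetical stand-in for an xarray: NULL marks an empty slot. */
struct sparse_array {
	void *slot[MAX_SLOTS];
};

/*
 * Return the first non-NULL entry in [*index, last] and store its
 * position in *index, or return NULL if there is none. The index is
 * passed by pointer and must be an unsigned long.
 */
static void *sparse_find(struct sparse_array *sa, unsigned long *index,
			 unsigned long last)
{
	unsigned long i;

	for (i = *index; i <= last && i < MAX_SLOTS; i++) {
		if (sa->slot[i]) {
			*index = i;
			return sa->slot[i];
		}
	}
	return NULL;
}

/* On exit from the loop, entry is always NULL. */
#define sparse_for_each_range(sa, index, entry, start, last)		\
	for ((index) = (start),						\
	     (entry) = sparse_find((sa), &(index), (last));		\
	     (entry);							\
	     (index)++, (entry) = sparse_find((sa), &(index), (last)))

/* Count the entries visible in a snapshot of the first "online" slots. */
static int count_online(struct sparse_array *sa, int online)
{
	unsigned long idx;	/* an int here would not match &(index) */
	void *entry;
	int n = 0;

	if (online <= 0)
		return 0;

	sparse_for_each_range(sa, idx, entry, 0, (unsigned long)online - 1)
		n++;

	return n;
}
```

Declaring idx as an int in count_online() makes the compiler complain about
an incompatible pointer type, which is the churn every caller with an int
index would face. Unlike the current iterator, this one skips empty slots
instead of stopping at the first one.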
--
Jazz is not dead. It just smells funny...
^ permalink raw reply
* [PATCH 1/2] powerpc/mce: Avoid using irq_work_queue() in realmode
From: Ganesh Goudar @ 2021-11-08 8:38 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Ganesh Goudar, mahesh, npiggin
In the realmode mce handler we use irq_work_queue() to defer
the processing of mce events. irq_work_queue() can only be
called when translation is enabled because it touches memory
outside the RMA, so we enable translation before calling
irq_work_queue() and disable it on return, even though this
is not safe to do in realmode.
To avoid this, program the decrementer and call the event
processing functions from the timer handler.
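
The deferral scheme described above can be sketched in plain userspace C.
This is only an illustration under stated assumptions: struct cpu_state,
early_handler() and timer_handler() are hypothetical names, the timer_armed
flag stands in for programming the decrementer, and draining the counter in
a loop is a choice made for this sketch rather than what the patch does (the
patch checks the counter once per timer interrupt).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-CPU state; events_pending plays the role of a
 * "work to do" counter such as mces_to_process in the patch. */
struct cpu_state {
	atomic_int events_pending;
	bool timer_armed;
	int events_handled;
};

/*
 * Early path: runs in a restricted context, so it only records that
 * there is work to do and arms the timer. No deferred-work machinery
 * is touched here.
 */
static void early_handler(struct cpu_state *cpu)
{
	atomic_fetch_add(&cpu->events_pending, 1);
	cpu->timer_armed = true;	/* stands in for set_dec(1) */
}

/*
 * Timer path: runs later in a normal context and processes whatever
 * the early path recorded.
 */
static void timer_handler(struct cpu_state *cpu)
{
	cpu->timer_armed = false;

	while (atomic_load(&cpu->events_pending) > 0) {
		cpu->events_handled++;	/* stands in for event processing */
		atomic_fetch_sub(&cpu->events_pending, 1);
	}
}
```

The point of the split is that the early path needs nothing beyond an
atomic increment on state that is always reachable, while everything that
needs translation happens in the timer path.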
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
arch/powerpc/include/asm/machdep.h | 2 +
arch/powerpc/include/asm/mce.h | 2 +
arch/powerpc/include/asm/paca.h | 1 +
arch/powerpc/kernel/mce.c | 51 +++++++++++-------------
arch/powerpc/kernel/time.c | 3 ++
arch/powerpc/platforms/pseries/pseries.h | 1 +
arch/powerpc/platforms/pseries/ras.c | 31 +-------------
arch/powerpc/platforms/pseries/setup.c | 1 +
8 files changed, 34 insertions(+), 58 deletions(-)
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 764f2732a821..c89cc03c0f97 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -103,6 +103,8 @@ struct machdep_calls {
/* Called during machine check exception to retrive fixup address. */
bool (*mce_check_early_recovery)(struct pt_regs *regs);
+ void (*machine_check_log_err)(void);
+
/* Motherboard/chipset features. This is a kind of general purpose
* hook used to control some machine specific features (like reset
* lines, chip power control, etc...).
diff --git a/arch/powerpc/include/asm/mce.h b/arch/powerpc/include/asm/mce.h
index 331d944280b8..187810f13669 100644
--- a/arch/powerpc/include/asm/mce.h
+++ b/arch/powerpc/include/asm/mce.h
@@ -235,8 +235,10 @@ extern void machine_check_print_event_info(struct machine_check_event *evt,
unsigned long addr_to_pfn(struct pt_regs *regs, unsigned long addr);
extern void mce_common_process_ue(struct pt_regs *regs,
struct mce_error_info *mce_err);
+extern void machine_check_raise_dec_intr(void);
int mce_register_notifier(struct notifier_block *nb);
int mce_unregister_notifier(struct notifier_block *nb);
+void mce_run_late_handlers(void);
#ifdef CONFIG_PPC_BOOK3S_64
void flush_and_reload_slb(void);
void flush_erat(void);
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index dc05a862e72a..f49180f8c9be 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -280,6 +280,7 @@ struct paca_struct {
#endif
#ifdef CONFIG_PPC_BOOK3S_64
struct mce_info *mce_info;
+ atomic_t mces_to_process;
#endif /* CONFIG_PPC_BOOK3S_64 */
} ____cacheline_aligned;
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index fd829f7f25a4..45baa062ebc0 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -28,19 +28,9 @@
#include "setup.h"
-static void machine_check_process_queued_event(struct irq_work *work);
-static void machine_check_ue_irq_work(struct irq_work *work);
static void machine_check_ue_event(struct machine_check_event *evt);
static void machine_process_ue_event(struct work_struct *work);
-static struct irq_work mce_event_process_work = {
- .func = machine_check_process_queued_event,
-};
-
-static struct irq_work mce_ue_event_irq_work = {
- .func = machine_check_ue_irq_work,
-};
-
static DECLARE_WORK(mce_ue_event_work, machine_process_ue_event);
static BLOCKING_NOTIFIER_HEAD(mce_notifier_list);
@@ -89,6 +79,12 @@ static void mce_set_error_info(struct machine_check_event *mce,
}
}
+/* Raise decrementer interrupt */
+void machine_check_raise_dec_intr(void)
+{
+ set_dec(1);
+}
+
/*
* Decode and save high level MCE information into per cpu buffer which
* is an array of machine_check_event structure.
@@ -135,6 +131,8 @@ void save_mce_event(struct pt_regs *regs, long handled,
if (mce->error_type == MCE_ERROR_TYPE_UE)
mce->u.ue_error.ignore_event = mce_err->ignore_event;
+ atomic_inc(&local_paca->mces_to_process);
+
if (!addr)
return;
@@ -217,7 +215,7 @@ void release_mce_event(void)
get_mce_event(NULL, true);
}
-static void machine_check_ue_irq_work(struct irq_work *work)
+static void machine_check_ue_work(void)
{
schedule_work(&mce_ue_event_work);
}
@@ -239,7 +237,7 @@ static void machine_check_ue_event(struct machine_check_event *evt)
evt, sizeof(*evt));
/* Queue work to process this event later. */
- irq_work_queue(&mce_ue_event_irq_work);
+ machine_check_raise_dec_intr();
}
/*
@@ -249,7 +247,6 @@ void machine_check_queue_event(void)
{
int index;
struct machine_check_event evt;
- unsigned long msr;
if (!get_mce_event(&evt, MCE_EVENT_RELEASE))
return;
@@ -263,20 +260,7 @@ void machine_check_queue_event(void)
memcpy(&local_paca->mce_info->mce_event_queue[index],
&evt, sizeof(evt));
- /*
- * Queue irq work to process this event later. Before
- * queuing the work enable translation for non radix LPAR,
- * as irq_work_queue may try to access memory outside RMO
- * region.
- */
- if (!radix_enabled() && firmware_has_feature(FW_FEATURE_LPAR)) {
- msr = mfmsr();
- mtmsr(msr | MSR_IR | MSR_DR);
- irq_work_queue(&mce_event_process_work);
- mtmsr(msr);
- } else {
- irq_work_queue(&mce_event_process_work);
- }
+ machine_check_raise_dec_intr();
}
void mce_common_process_ue(struct pt_regs *regs,
@@ -338,7 +322,7 @@ static void machine_process_ue_event(struct work_struct *work)
* process pending MCE event from the mce event queue. This function will be
* called during syscall exit.
*/
-static void machine_check_process_queued_event(struct irq_work *work)
+static void machine_check_process_queued_event(void)
{
int index;
struct machine_check_event *evt;
@@ -363,6 +347,17 @@ static void machine_check_process_queued_event(struct irq_work *work)
}
}
+void mce_run_late_handlers(void)
+{
+ if (unlikely(atomic_read(&local_paca->mces_to_process))) {
+ if (ppc_md.machine_check_log_err)
+ ppc_md.machine_check_log_err();
+ machine_check_process_queued_event();
+ machine_check_ue_work();
+ atomic_dec(&local_paca->mces_to_process);
+ }
+}
+
void machine_check_print_event_info(struct machine_check_event *evt,
bool user_mode, bool in_guest)
{
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 934d8ae66cc6..2dc09d75d77c 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -597,6 +597,9 @@ DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
irq_work_run();
}
+#ifdef CONFIG_PPC_BOOK3S_64
+ mce_run_late_handlers();
+#endif
now = get_tb();
if (now >= *next_tb) {
*next_tb = ~(u64)0;
diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index 3544778e06d0..0dc4f1027b30 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -21,6 +21,7 @@ struct pt_regs;
extern int pSeries_system_reset_exception(struct pt_regs *regs);
extern int pSeries_machine_check_exception(struct pt_regs *regs);
extern long pseries_machine_check_realmode(struct pt_regs *regs);
+extern void pSeries_machine_check_log_err(void);
#ifdef CONFIG_SMP
extern void smp_init_pseries(void);
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 56092dccfdb8..8613f9cc5798 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -23,11 +23,6 @@ static DEFINE_SPINLOCK(ras_log_buf_lock);
static int ras_check_exception_token;
-static void mce_process_errlog_event(struct irq_work *work);
-static struct irq_work mce_errlog_process_work = {
- .func = mce_process_errlog_event,
-};
-
#define EPOW_SENSOR_TOKEN 9
#define EPOW_SENSOR_INDEX 0
@@ -729,40 +724,16 @@ static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
error_type = mce_log->error_type;
disposition = mce_handle_err_realmode(disposition, error_type);
-
- /*
- * Enable translation as we will be accessing per-cpu variables
- * in save_mce_event() which may fall outside RMO region, also
- * leave it enabled because subsequently we will be queuing work
- * to workqueues where again per-cpu variables accessed, besides
- * fwnmi_release_errinfo() crashes when called in realmode on
- * pseries.
- * Note: All the realmode handling like flushing SLB entries for
- * SLB multihit is done by now.
- */
out:
- msr = mfmsr();
- mtmsr(msr | MSR_IR | MSR_DR);
-
disposition = mce_handle_err_virtmode(regs, errp, mce_log,
disposition);
-
- /*
- * Queue irq work to log this rtas event later.
- * irq_work_queue uses per-cpu variables, so do this in virt
- * mode as well.
- */
- irq_work_queue(&mce_errlog_process_work);
-
- mtmsr(msr);
-
return disposition;
}
/*
* Process MCE rtas errlog event.
*/
-static void mce_process_errlog_event(struct irq_work *work)
+void pSeries_machine_check_log_err(void)
{
struct rtas_error_log *err;
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index f79126f16258..54bd7bdb7e92 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -1085,6 +1085,7 @@ define_machine(pseries) {
.system_reset_exception = pSeries_system_reset_exception,
.machine_check_early = pseries_machine_check_realmode,
.machine_check_exception = pSeries_machine_check_exception,
+ .machine_check_log_err = pSeries_machine_check_log_err,
#ifdef CONFIG_KEXEC_CORE
.machine_kexec = pSeries_machine_kexec,
.kexec_cpu_down = pseries_kexec_cpu_down,
--
2.26.2
^ permalink raw reply related
* [PATCH 2/2] pseries/mce: Refactor the pseries mce handling code
From: Ganesh Goudar @ 2021-11-08 8:38 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: Ganesh Goudar, mahesh, npiggin
In-Reply-To: <20211108083804.380142-1-ganeshgr@linux.ibm.com>
Now that we are no longer switching on the mmu in the realmode
mce handler, partially revert commit 4ff753feab02 ("powerpc/pseries:
Avoid using addr_to_pfn in real mode"), which introduced the
functions mce_handle_err_virtmode/realmode() to separate out the
mce handler code that needed translation to be enabled.
Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
arch/powerpc/platforms/pseries/ras.c | 122 +++++++++++----------------
1 file changed, 49 insertions(+), 73 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 8613f9cc5798..62e1519b8355 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -511,58 +511,17 @@ int pSeries_system_reset_exception(struct pt_regs *regs)
return 0; /* need to perform reset */
}
-static int mce_handle_err_realmode(int disposition, u8 error_type)
-{
-#ifdef CONFIG_PPC_BOOK3S_64
- if (disposition == RTAS_DISP_NOT_RECOVERED) {
- switch (error_type) {
- case MC_ERROR_TYPE_ERAT:
- flush_erat();
- disposition = RTAS_DISP_FULLY_RECOVERED;
- break;
- case MC_ERROR_TYPE_SLB:
- /*
- * Store the old slb content in paca before flushing.
- * Print this when we go to virtual mode.
- * There are chances that we may hit MCE again if there
- * is a parity error on the SLB entry we trying to read
- * for saving. Hence limit the slb saving to single
- * level of recursion.
- */
- if (local_paca->in_mce == 1)
- slb_save_contents(local_paca->mce_faulty_slbs);
- flush_and_reload_slb();
- disposition = RTAS_DISP_FULLY_RECOVERED;
- break;
- default:
- break;
- }
- } else if (disposition == RTAS_DISP_LIMITED_RECOVERY) {
- /* Platform corrected itself but could be degraded */
- pr_err("MCE: limited recovery, system may be degraded\n");
- disposition = RTAS_DISP_FULLY_RECOVERED;
- }
-#endif
- return disposition;
-}
-
-static int mce_handle_err_virtmode(struct pt_regs *regs,
- struct rtas_error_log *errp,
- struct pseries_mc_errorlog *mce_log,
- int disposition)
+static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
{
struct mce_error_info mce_err = { 0 };
+ unsigned long eaddr = 0, paddr = 0;
+ struct pseries_errorlog *pseries_log;
+ struct pseries_mc_errorlog *mce_log;
+ int disposition = rtas_error_disposition(errp);
int initiator = rtas_error_initiator(errp);
int severity = rtas_error_severity(errp);
- unsigned long eaddr = 0, paddr = 0;
u8 error_type, err_sub_type;
- if (!mce_log)
- goto out;
-
- error_type = mce_log->error_type;
- err_sub_type = rtas_mc_error_sub_type(mce_log);
-
if (initiator == RTAS_INITIATOR_UNKNOWN)
mce_err.initiator = MCE_INITIATOR_UNKNOWN;
else if (initiator == RTAS_INITIATOR_CPU)
@@ -588,6 +547,8 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
mce_err.severity = MCE_SEV_SEVERE;
else if (severity == RTAS_SEVERITY_ERROR)
mce_err.severity = MCE_SEV_SEVERE;
+ else if (severity == RTAS_SEVERITY_FATAL)
+ mce_err.severity = MCE_SEV_FATAL;
else
mce_err.severity = MCE_SEV_FATAL;
@@ -599,7 +560,18 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
mce_err.error_type = MCE_ERROR_TYPE_UNKNOWN;
mce_err.error_class = MCE_ECLASS_UNKNOWN;
- switch (error_type) {
+ if (!rtas_error_extended(errp))
+ goto out;
+
+ pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
+ if (!pseries_log)
+ goto out;
+
+ mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
+ error_type = mce_log->error_type;
+ err_sub_type = rtas_mc_error_sub_type(mce_log);
+
+ switch (mce_log->error_type) {
case MC_ERROR_TYPE_UE:
mce_err.error_type = MCE_ERROR_TYPE_UE;
mce_common_process_ue(regs, &mce_err);
@@ -692,41 +664,45 @@ static int mce_handle_err_virtmode(struct pt_regs *regs,
mce_err.error_type = MCE_ERROR_TYPE_DCACHE;
break;
case MC_ERROR_TYPE_I_CACHE:
- mce_err.error_type = MCE_ERROR_TYPE_ICACHE;
+ mce_err.error_type = MCE_ERROR_TYPE_DCACHE;
break;
case MC_ERROR_TYPE_UNKNOWN:
default:
mce_err.error_type = MCE_ERROR_TYPE_UNKNOWN;
break;
}
+
+#ifdef CONFIG_PPC_BOOK3S_64
+ if (disposition == RTAS_DISP_NOT_RECOVERED) {
+ switch (error_type) {
+ case MC_ERROR_TYPE_SLB:
+ case MC_ERROR_TYPE_ERAT:
+ /*
+ * Store the old slb content in paca before flushing.
+ * Print this when we go to virtual mode.
+ * There are chances that we may hit MCE again if there
+ * is a parity error on the SLB entry we trying to read
+ * for saving. Hence limit the slb saving to single
+ * level of recursion.
+ */
+ if (local_paca->in_mce == 1)
+ slb_save_contents(local_paca->mce_faulty_slbs);
+ flush_and_reload_slb();
+ disposition = RTAS_DISP_FULLY_RECOVERED;
+ break;
+ default:
+ break;
+ }
+ } else if (disposition == RTAS_DISP_LIMITED_RECOVERY) {
+ /* Platform corrected itself but could be degraded */
+ pr_err("MCE: limited recovery, system may be degraded\n");
+ disposition = RTAS_DISP_FULLY_RECOVERED;
+ }
+#endif
out:
save_mce_event(regs, disposition == RTAS_DISP_FULLY_RECOVERED,
- &mce_err, regs->nip, eaddr, paddr);
- return disposition;
-}
+ &mce_err, regs->nip, eaddr, paddr);
-static int mce_handle_error(struct pt_regs *regs, struct rtas_error_log *errp)
-{
- struct pseries_errorlog *pseries_log;
- struct pseries_mc_errorlog *mce_log = NULL;
- int disposition = rtas_error_disposition(errp);
- unsigned long msr;
- u8 error_type;
-
- if (!rtas_error_extended(errp))
- goto out;
-
- pseries_log = get_pseries_errorlog(errp, PSERIES_ELOG_SECT_ID_MCE);
- if (!pseries_log)
- goto out;
-
- mce_log = (struct pseries_mc_errorlog *)pseries_log->data;
- error_type = mce_log->error_type;
-
- disposition = mce_handle_err_realmode(disposition, error_type);
-out:
- disposition = mce_handle_err_virtmode(regs, errp, mce_log,
- disposition);
return disposition;
}
--
2.26.2
^ permalink raw reply related
* Re: [PATCH v3 1/3] powerpc/pseries: Parse control memory access error
From: Ganesh @ 2021-11-08 8:41 UTC (permalink / raw)
To: linuxppc-dev, mpe; +Cc: mahesh, npiggin
In-Reply-To: <20210906084303.183921-1-ganeshgr@linux.ibm.com>
On 9/6/21 14:13, Ganesh Goudar wrote:
> Add support to parse and log control memory access
> error for pseries. These changes are made according to
> PAPR v2.11 10.3.2.2.12.
>
> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
> ---
> v3: Modify the commit log to mention the document according
> to which changes are made.
> Define and use a macro to check if the effective address
> is provided.
>
> v2: No changes.
> ---
> arch/powerpc/platforms/pseries/ras.c | 36 ++++++++++++++++++++++++----
> 1 file changed, 32 insertions(+), 4 deletions(-)
>
Hi mpe, Any comments on this patch series?
^ permalink raw reply
* Re: [PATCH v1 1/5] livepatch: Fix build failure on 32 bits processors
From: Petr Mladek @ 2021-11-08 9:47 UTC (permalink / raw)
To: Christophe Leroy
Cc: Joe Lawrence, Jiri Kosina, linux-kernel, Steven Rostedt,
Ingo Molnar, Josh Poimboeuf, live-patching, Naveen N . Rao,
Miroslav Benes, linuxppc-dev
In-Reply-To: <cefeeaf1447088db00c5a62e2ff03f7d15bb4c05.1635423081.git.christophe.leroy@csgroup.eu>
On Thu 2021-10-28 14:24:01, Christophe Leroy wrote:
> Trying to build livepatch on powerpc/32 results in:
>
> kernel/livepatch/core.c: In function 'klp_resolve_symbols':
> kernel/livepatch/core.c:221:23: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
> 221 | sym = (Elf64_Sym *)sechdrs[symndx].sh_addr + ELF_R_SYM(relas[i].r_info);
> | ^
> kernel/livepatch/core.c:221:21: error: assignment to 'Elf32_Sym *' {aka 'struct elf32_sym *'} from incompatible pointer type 'Elf64_Sym *' {aka 'struct elf64_sym *'} [-Werror=incompatible-pointer-types]
> 221 | sym = (Elf64_Sym *)sechdrs[symndx].sh_addr + ELF_R_SYM(relas[i].r_info);
> | ^
> kernel/livepatch/core.c: In function 'klp_apply_section_relocs':
> kernel/livepatch/core.c:312:35: error: passing argument 1 of 'klp_resolve_symbols' from incompatible pointer type [-Werror=incompatible-pointer-types]
> 312 | ret = klp_resolve_symbols(sechdrs, strtab, symndx, sec, sec_objname);
> | ^~~~~~~
> | |
> | Elf32_Shdr * {aka struct elf32_shdr *}
> kernel/livepatch/core.c:193:44: note: expected 'Elf64_Shdr *' {aka 'struct elf64_shdr *'} but argument is of type 'Elf32_Shdr *' {aka 'struct elf32_shdr *'}
> 193 | static int klp_resolve_symbols(Elf64_Shdr *sechdrs, const char *strtab,
> | ~~~~~~~~~~~~^~~~~~~
>
> Fix it by using the right types instead of forcing 64 bits types.
>
> Fixes: 7c8e2bdd5f0d ("livepatch: Apply vmlinux-specific KLP relocations early")
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Makes sense. I haven't tested it but it looks correct ;-)
Acked-by: Petr Mladek <pmladek@suse.com>
Best Regards,
Petr
^ permalink raw reply
* Re: [PATCH v1 5/5] powerpc/ftrace: Add support for livepatch to PPC32
From: Petr Mladek @ 2021-11-08 10:01 UTC (permalink / raw)
To: Christophe Leroy
Cc: Joe Lawrence, Jiri Kosina, linux-kernel, Steven Rostedt,
Ingo Molnar, Josh Poimboeuf, live-patching, Naveen N . Rao,
Miroslav Benes, linuxppc-dev
In-Reply-To: <b73d053c145245499511c4827890c9411c8b3a5a.1635423081.git.christophe.leroy@csgroup.eu>
On Thu 2021-10-28 14:24:05, Christophe Leroy wrote:
> This is heavily copied from PPC64. Not much to say about it.
>
> Livepatch sample modules all work.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
> ---
> diff --git a/arch/powerpc/include/asm/livepatch.h b/arch/powerpc/include/asm/livepatch.h
> index 4fe018cc207b..daf24d837241 100644
> --- a/arch/powerpc/include/asm/livepatch.h
> +++ b/arch/powerpc/include/asm/livepatch.h
> @@ -23,8 +23,8 @@ static inline void klp_arch_set_pc(struct ftrace_regs *fregs, unsigned long ip)
> static inline unsigned long klp_get_ftrace_location(unsigned long faddr)
> {
> /*
> - * Live patch works only with -mprofile-kernel on PPC. In this case,
> - * the ftrace location is always within the first 16 bytes.
> + * Live patch works on PPC32 and only with -mprofile-kernel on PPC64. In
> + * both cases, the ftrace location is always within the first 16 bytes.
Nit: I had some trouble parsing this. I wonder if the following is
better:
* Live patch works on PPC32 out of the box and on PPC64 only with
* -mprofile-kernel. In both cases, the ftrace location is always
* within the first 16 bytes.
> */
> return ftrace_location_range(faddr, faddr + 16);
> }
Best Regards,
Petr
^ permalink raw reply
* [PATCH v0 00/42] notifiers: Return an error when callback is already registered
From: Borislav Petkov @ 2021-11-08 10:19 UTC (permalink / raw)
To: LKML
Cc: alsa-devel, x86, linux-sh, linux-iio, linux-remoteproc,
linux-fbdev, netdev, Ayush Sawal, sparclinux, linux-clk,
linux-leds, linux-s390, linux-scsi, Rohit Maheshwari,
linux-staging, bcm-kernel-feedback-list, xen-devel, linux-xtensa,
Arnd Bergmann, linux-pm, intel-gfx, Vinay Kumar Yadav, linux-um,
Steven Rostedt, rcu, linux-tegra, openipmi-developer,
intel-gvt-dev, linux-arm-kernel, linux-edac, linux-parisc,
Greg Kroah-Hartman, linux-usb, linux-mips, linux-renesas-soc,
linux-hyperv, linux-crypto, linux-alpha, linuxppc-dev
In-Reply-To: <20211108101157.15189-1-bp@alien8.de>
From: Borislav Petkov <bp@suse.de>
Hi all,
this is a huge patchset for something which is really trivial - it
changes the notifier registration routines to return an error value
if a notifier callback is already present on the respective list of
callbacks. For more details scroll to the last patch.
Everything before it is converting the callers to check the return value
of the registration routines and issue a warning, instead of the WARN()
notifier_chain_register() does now.
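
For illustration, the intended behaviour can be sketched in userspace C.
This is not the kernel implementation: struct notifier, chain_register() and
register_or_warn() are hypothetical names, there is no locking or RCU, and
the use of -EEXIST as the error value is an assumption made for the sketch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified notifier block. */
struct notifier {
	int (*call)(struct notifier *self, unsigned long event, void *data);
	struct notifier *next;
	int priority;
};

/*
 * Insert n into the list, keeping it sorted by descending priority.
 * Refuse a block that is already on the list instead of linking it
 * in a second time, which would corrupt the list.
 */
static int chain_register(struct notifier **head, struct notifier *n)
{
	while (*head) {
		if (*head == n)
			return -EEXIST;
		if (n->priority > (*head)->priority)
			break;
		head = &(*head)->next;
	}
	n->next = *head;
	*head = n;
	return 0;
}

/* Caller-side pattern: check the return value and issue a warning. */
static int register_or_warn(struct notifier **head, struct notifier *n,
			    const char *what)
{
	int ret = chain_register(head, n);

	if (ret)
		fprintf(stderr, "%s notifier already registered\n", what);
	return ret;
}
```

A block that is already registered is always found before the priority
check can break out of the loop, because it sits among the entries whose
priority is at least its own.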
Before the last patch has been applied, though, that checking is a
NOP which would make the application of those patches trivial - every
maintainer can pick a patch at her/his discretion - only the last one
enables the build warnings and that one will be queued only after the
preceding patches have all been merged so that there are no build
warnings.
Due to the sheer volume of the patches, I have addressed the respective
patch and the last one, which enables the warning, with addressees for
each maintained area so as not to spam people unnecessarily.
If people prefer I carry some through tip, instead, I'll gladly do so -
your call.
And, if you think the warning messages need to be more precise, feel
free to adjust them before committing.
Thanks!
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ayush Sawal <ayush.sawal@chelsio.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rohit Maheshwari <rohitm@chelsio.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Cc: alsa-devel@alsa-project.org
Cc: bcm-kernel-feedback-list@broadcom.com
Cc: intel-gfx@lists.freedesktop.org
Cc: intel-gvt-dev@lists.freedesktop.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-clk@vger.kernel.org
Cc: linux-crypto@vger.kernel.org
Cc: linux-edac@vger.kernel.org
Cc: linux-fbdev@vger.kernel.org
Cc: linux-hyperv@vger.kernel.org
Cc: linux-iio@vger.kernel.org
Cc: linux-leds@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-remoteproc@vger.kernel.org
Cc: linux-renesas-soc@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: linux-staging@lists.linux.dev
Cc: linux-tegra@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-usb@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: netdev@vger.kernel.org
Cc: openipmi-developer@lists.sourceforge.net
Cc: rcu@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: x86@kernel.org
Cc: xen-devel@lists.xenproject.org
Borislav Petkov (42):
x86: Check notifier registration return value
xen/x86: Check notifier registration return value
ipmi: Check notifier registration return value
clk: renesas: Check notifier registration return value
dca: Check notifier registration return value
firmware: Check notifier registration return value
drm/i915: Check notifier registration return value
Drivers: hv: vmbus: Check notifier registration return value
iio: proximity: cros_ec: Check notifier registration return value
leds: trigger: Check notifier registration return value
misc: Check notifier registration return value
ethernet: chelsio: Check notifier registration return value
power: reset: Check notifier registration return value
remoteproc: Check notifier registration return value
scsi: target: Check notifier registration return value
USB: Check notifier registration return value
drivers: video: Check notifier registration return value
drivers/xen: Check notifier registration return value
kernel/hung_task: Check notifier registration return value
rcu: Check notifier registration return value
tracing: Check notifier registration return value
net: fib_notifier: Check notifier registration return value
ASoC: soc-jack: Check notifier registration return value
staging: olpc_dcon: Check notifier registration return value
arch/um: Check notifier registration return value
alpha: Check notifier registration return value
bus: brcmstb_gisb: Check notifier registration return value
soc: bcm: brcmstb: pm: pm-arm: Check notifier registration return
value
arm64: Check notifier registration return value
soc/tegra: Check notifier registration return value
parisc: Check notifier registration return value
macintosh/adb: Check notifier registration return value
mips: Check notifier registration return value
powerpc: Check notifier registration return value
sh: Check notifier registration return value
s390: Check notifier registration return value
sparc: Check notifier registration return value
xtensa: Check notifier registration return value
crypto: ccree - check notifier registration return value
EDAC/altera: Check notifier registration return value
power: supply: ab8500: Check notifier registration return value
notifier: Return an error when callback is already registered
arch/alpha/kernel/setup.c | 5 +--
arch/arm64/kernel/setup.c | 6 ++--
arch/mips/kernel/relocate.c | 6 ++--
arch/mips/sgi-ip22/ip22-reset.c | 4 ++-
arch/mips/sgi-ip32/ip32-reset.c | 4 ++-
arch/parisc/kernel/pdc_chassis.c | 5 +--
arch/powerpc/kernel/setup-common.c | 12 ++++---
arch/s390/kernel/ipl.c | 4 ++-
arch/s390/kvm/kvm-s390.c | 7 ++--
arch/sh/kernel/cpu/sh4a/setup-sh7724.c | 11 +++---
arch/sparc/kernel/sstate.c | 6 ++--
arch/um/drivers/mconsole_kern.c | 6 ++--
arch/um/kernel/um_arch.c | 5 +--
arch/x86/kernel/cpu/mce/core.c | 3 +-
arch/x86/kernel/cpu/mce/dev-mcelog.c | 3 +-
arch/x86/kernel/setup.c | 7 ++--
arch/x86/xen/enlighten.c | 4 ++-
arch/xtensa/platforms/iss/setup.c | 3 +-
drivers/bus/brcmstb_gisb.c | 6 ++--
drivers/char/ipmi/ipmi_msghandler.c | 3 +-
drivers/clk/renesas/clk-div6.c | 4 ++-
drivers/clk/renesas/rcar-cpg-lib.c | 4 ++-
drivers/crypto/ccree/cc_fips.c | 4 ++-
drivers/dca/dca-core.c | 3 +-
drivers/edac/altera_edac.c | 6 ++--
drivers/firmware/arm_scmi/notify.c | 3 +-
drivers/firmware/google/gsmi.c | 6 ++--
drivers/gpu/drm/i915/gvt/scheduler.c | 6 ++--
drivers/hv/vmbus_drv.c | 4 +--
.../iio/proximity/cros_ec_mkbp_proximity.c | 3 +-
drivers/leds/trigger/ledtrig-activity.c | 6 ++--
drivers/leds/trigger/ledtrig-heartbeat.c | 6 ++--
drivers/leds/trigger/ledtrig-panic.c | 4 +--
drivers/macintosh/adbhid.c | 4 +--
drivers/misc/ibmasm/heartbeat.c | 3 +-
drivers/misc/pvpanic/pvpanic.c | 3 +-
.../chelsio/inline_crypto/chtls/chtls_main.c | 5 ++-
drivers/parisc/power.c | 5 +--
drivers/power/reset/ltc2952-poweroff.c | 6 ++--
drivers/power/supply/ab8500_charger.c | 8 ++---
drivers/remoteproc/qcom_common.c | 3 +-
drivers/remoteproc/qcom_sysmon.c | 4 ++-
drivers/remoteproc/remoteproc_core.c | 4 ++-
drivers/s390/char/con3215.c | 5 ++-
drivers/s390/char/con3270.c | 5 ++-
drivers/s390/char/sclp_con.c | 4 ++-
drivers/s390/char/sclp_vt220.c | 4 ++-
drivers/s390/char/zcore.c | 4 ++-
drivers/soc/bcm/brcmstb/pm/pm-arm.c | 5 +--
drivers/soc/tegra/ari-tegra186.c | 7 ++--
drivers/staging/olpc_dcon/olpc_dcon.c | 4 ++-
drivers/target/tcm_fc/tfc_conf.c | 4 ++-
drivers/usb/core/notify.c | 3 +-
drivers/video/console/dummycon.c | 3 +-
drivers/video/fbdev/hyperv_fb.c | 5 +--
drivers/xen/manage.c | 3 +-
drivers/xen/xenbus/xenbus_probe.c | 8 +++--
include/linux/notifier.h | 8 ++---
kernel/hung_task.c | 3 +-
kernel/notifier.c | 36 ++++++++++---------
kernel/rcu/tree_stall.h | 4 ++-
kernel/trace/trace.c | 4 +--
net/core/fib_notifier.c | 4 ++-
sound/soc/soc-jack.c | 3 +-
64 files changed, 222 insertions(+), 118 deletions(-)
--
2.29.2
^ permalink raw reply
* [PATCH v0 32/42] macintosh/adb: Check notifier registration return value
From: Borislav Petkov @ 2021-11-08 10:11 UTC (permalink / raw)
To: LKML; +Cc: linuxppc-dev, kernel test robot
In-Reply-To: <20211108101157.15189-1-bp@alien8.de>
From: Borislav Petkov <bp@suse.de>
Avoid homegrown notifier registration checks.
No functional changes.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: linuxppc-dev@lists.ozlabs.org
---
drivers/macintosh/adbhid.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/macintosh/adbhid.c b/drivers/macintosh/adbhid.c
index 994ba5cb3678..c8cbf8588186 100644
--- a/drivers/macintosh/adbhid.c
+++ b/drivers/macintosh/adbhid.c
@@ -1262,8 +1262,8 @@ static int __init adbhid_init(void)
adbhid_probe();
- blocking_notifier_chain_register(&adb_client_list,
- &adbhid_adb_notifier);
+ if (blocking_notifier_chain_register(&adb_client_list, &adbhid_adb_notifier))
+ pr_warn("ADB message notifier already registered\n");
return 0;
}
--
2.29.2
^ permalink raw reply related
* [PATCH v0 34/42] powerpc: Check notifier registration return value
From: Borislav Petkov @ 2021-11-08 10:11 UTC (permalink / raw)
To: LKML; +Cc: linuxppc-dev
In-Reply-To: <20211108101157.15189-1-bp@alien8.de>
From: Borislav Petkov <bp@suse.de>
Avoid homegrown notifier registration checks.
No functional changes.
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: linuxppc-dev@lists.ozlabs.org
---
arch/powerpc/kernel/setup-common.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index b1e43b69a559..32f9900cd4f4 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -725,14 +725,18 @@ static struct notifier_block kernel_offset_notifier = {
void __init setup_panic(void)
{
- if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset() > 0)
- atomic_notifier_chain_register(&panic_notifier_list,
- &kernel_offset_notifier);
+ if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && kaslr_offset() > 0) {
+ if (atomic_notifier_chain_register(&panic_notifier_list,
+ &kernel_offset_notifier))
+ pr_warn("Kernel offset notifier already registered\n");
+ }
/* PPC64 always does a hard irq disable in its panic handler */
if (!IS_ENABLED(CONFIG_PPC64) && !ppc_md.panic)
return;
- atomic_notifier_chain_register(&panic_notifier_list, &ppc_panic_block);
+
+ if (atomic_notifier_chain_register(&panic_notifier_list, &ppc_panic_block))
+ pr_warn("Panic notifier already registered\n");
}
#ifdef CONFIG_CHECK_CACHE_COHERENCY
--
2.29.2
* [PATCH v0 42/42] notifier: Return an error when callback is already registered
From: Borislav Petkov @ 2021-11-08 10:11 UTC (permalink / raw)
To: LKML
Cc: alsa-devel, x86, linux-sh, linux-iio, linux-remoteproc,
linux-fbdev, netdev, Ayush Sawal, sparclinux, linux-clk,
linux-leds, linux-s390, linux-scsi, Rohit Maheshwari,
linux-staging, bcm-kernel-feedback-list, openipmi-developer,
xen-devel, linux-xtensa, Arnd Bergmann, linux-pm, intel-gfx,
Vinay Kumar Yadav, linux-um, Steven Rostedt, rcu, linux-tegra,
Thomas Gleixner, intel-gvt-dev, linux-arm-kernel, linux-edac,
linux-parisc, Greg Kroah-Hartman, linux-usb, linux-mips,
linux-renesas-soc, linux-hyperv, linux-crypto, linux-alpha,
linuxppc-dev
In-Reply-To: <20211108101157.15189-1-bp@alien8.de>
From: Borislav Petkov <bp@suse.de>
The notifier registration routine doesn't return a proper error value
when a callback has already been registered, leading people to track
whether that registration has happened at the call site:
https://lore.kernel.org/amd-gfx/20210512013058.6827-1-mukul.joshi@amd.com/
Such call-site tracking is unnecessary.
Return -EEXIST to signal that case so that callers can act accordingly.
Force callers to check the return value; ignoring it now makes the
build fail loudly:
arch/x86/kernel/cpu/mce/core.c: In function ‘mce_register_decode_chain’:
arch/x86/kernel/cpu/mce/core.c:167:2: error: ignoring return value of \
‘blocking_notifier_chain_register’, declared with attribute warn_unused_result [-Werror=unused-result]
blocking_notifier_chain_register(&x86_mce_decoder_chain, nb);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Drop the WARN too, while at it.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Ayush Sawal <ayush.sawal@chelsio.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Rohit Maheshwari <rohitm@chelsio.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Vinay Kumar Yadav <vinay.yadav@chelsio.com>
Cc: alsa-devel@alsa-project.org
Cc: bcm-kernel-feedback-list@broadcom.com
Cc: intel-gfx@lists.freedesktop.org
Cc: intel-gvt-dev@lists.freedesktop.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-clk@vger.kernel.org
Cc: linux-crypto@vger.kernel.org
Cc: linux-edac@vger.kernel.org
Cc: linux-fbdev@vger.kernel.org
Cc: linux-hyperv@vger.kernel.org
Cc: linux-iio@vger.kernel.org
Cc: linux-leds@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: linux-parisc@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-remoteproc@vger.kernel.org
Cc: linux-renesas-soc@vger.kernel.org
Cc: linux-s390@vger.kernel.org
Cc: linux-scsi@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: linux-staging@lists.linux.dev
Cc: linux-tegra@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-usb@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: netdev@vger.kernel.org
Cc: openipmi-developer@lists.sourceforge.net
Cc: rcu@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: x86@kernel.org
Cc: xen-devel@lists.xenproject.org
---
include/linux/notifier.h | 8 ++++----
kernel/notifier.c | 36 +++++++++++++++++++-----------------
2 files changed, 23 insertions(+), 21 deletions(-)
diff --git a/include/linux/notifier.h b/include/linux/notifier.h
index 87069b8459af..45cc5a8d0fd8 100644
--- a/include/linux/notifier.h
+++ b/include/linux/notifier.h
@@ -141,13 +141,13 @@ extern void srcu_init_notifier_head(struct srcu_notifier_head *nh);
#ifdef __KERNEL__
-extern int atomic_notifier_chain_register(struct atomic_notifier_head *nh,
+extern int __must_check atomic_notifier_chain_register(struct atomic_notifier_head *nh,
struct notifier_block *nb);
-extern int blocking_notifier_chain_register(struct blocking_notifier_head *nh,
+extern int __must_check blocking_notifier_chain_register(struct blocking_notifier_head *nh,
struct notifier_block *nb);
-extern int raw_notifier_chain_register(struct raw_notifier_head *nh,
+extern int __must_check raw_notifier_chain_register(struct raw_notifier_head *nh,
struct notifier_block *nb);
-extern int srcu_notifier_chain_register(struct srcu_notifier_head *nh,
+extern int __must_check srcu_notifier_chain_register(struct srcu_notifier_head *nh,
struct notifier_block *nb);
extern int atomic_notifier_chain_unregister(struct atomic_notifier_head *nh,
diff --git a/kernel/notifier.c b/kernel/notifier.c
index b8251dc0bc0f..451ef3f73ad2 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -20,13 +20,11 @@ BLOCKING_NOTIFIER_HEAD(reboot_notifier_list);
*/
static int notifier_chain_register(struct notifier_block **nl,
- struct notifier_block *n)
+ struct notifier_block *n)
{
while ((*nl) != NULL) {
- if (unlikely((*nl) == n)) {
- WARN(1, "double register detected");
- return 0;
- }
+ if (unlikely((*nl) == n))
+ return -EEXIST;
if (n->priority > (*nl)->priority)
break;
nl = &((*nl)->next);
@@ -134,10 +132,11 @@ static int notifier_call_chain_robust(struct notifier_block **nl,
*
* Adds a notifier to an atomic notifier chain.
*
- * Currently always returns zero.
+ * Returns 0 on success, %-EEXIST on error.
*/
-int atomic_notifier_chain_register(struct atomic_notifier_head *nh,
- struct notifier_block *n)
+int __must_check
+atomic_notifier_chain_register(struct atomic_notifier_head *nh,
+ struct notifier_block *n)
{
unsigned long flags;
int ret;
@@ -216,10 +215,11 @@ NOKPROBE_SYMBOL(atomic_notifier_call_chain);
* Adds a notifier to a blocking notifier chain.
* Must be called in process context.
*
- * Currently always returns zero.
+ * Returns 0 on success, %-EEXIST on error.
*/
-int blocking_notifier_chain_register(struct blocking_notifier_head *nh,
- struct notifier_block *n)
+int __must_check
+blocking_notifier_chain_register(struct blocking_notifier_head *nh,
+ struct notifier_block *n)
{
int ret;
@@ -335,10 +335,11 @@ EXPORT_SYMBOL_GPL(blocking_notifier_call_chain);
* Adds a notifier to a raw notifier chain.
* All locking must be provided by the caller.
*
- * Currently always returns zero.
+ * Returns 0 on success, %-EEXIST on error.
*/
-int raw_notifier_chain_register(struct raw_notifier_head *nh,
- struct notifier_block *n)
+int __must_check
+raw_notifier_chain_register(struct raw_notifier_head *nh,
+ struct notifier_block *n)
{
return notifier_chain_register(&nh->head, n);
}
@@ -406,10 +407,11 @@ EXPORT_SYMBOL_GPL(raw_notifier_call_chain);
* Adds a notifier to an SRCU notifier chain.
* Must be called in process context.
*
- * Currently always returns zero.
+ * Returns 0 on success, %-EEXIST on error.
*/
-int srcu_notifier_chain_register(struct srcu_notifier_head *nh,
- struct notifier_block *n)
+int __must_check
+srcu_notifier_chain_register(struct srcu_notifier_head *nh,
+ struct notifier_block *n)
{
int ret;
--
2.29.2
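
The behaviour this patch introduces can be sketched in user space as follows. This is a minimal model under stated assumptions, not the kernel code: the `must_check` macro stands in for the kernel's `__must_check`, the struct is a trimmed copy of `struct notifier_block`, and `register_or_warn()` is a hypothetical call site modelled on the converted callers in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: user-space stand-in for the kernel's __must_check. */
#define must_check __attribute__((warn_unused_result))

/* Trimmed model of struct notifier_block. */
struct notifier_block {
	int (*notifier_call)(struct notifier_block *nb,
			     unsigned long action, void *data);
	struct notifier_block *next;
	int priority;
};

/*
 * Insert n into the list in descending priority order.
 * Returns 0 on success, -EEXIST if n is already linked.
 */
must_check int notifier_chain_register(struct notifier_block **nl,
				       struct notifier_block *n)
{
	while (*nl != NULL) {
		if (*nl == n)
			return -EEXIST;
		if (n->priority > (*nl)->priority)
			break;
		nl = &(*nl)->next;
	}
	n->next = *nl;
	*nl = n;
	return 0;
}

/* Hypothetical call site: report a double registration instead of tracking it. */
int register_or_warn(struct notifier_block **head,
		     struct notifier_block *nb, const char *what)
{
	int ret;

	assert(head != NULL && nb != NULL);

	ret = notifier_chain_register(head, nb);
	if (ret == -EEXIST)
		fprintf(stderr, "%s notifier already registered\n", what);
	return ret;
}
```

With GCC or Clang, calling `notifier_chain_register()` and discarding the result triggers the unused-result diagnostic, which is the mechanism behind the build error quoted in the commit message.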
* Re: [PATCH v3 20/25] powerpc: Use do_kernel_power_off()
From: Michael Ellerman @ 2021-11-08 12:01 UTC (permalink / raw)
To: Dmitry Osipenko; +Cc: linuxppc-dev
In-Reply-To: <20211108004524.29465-21-digetx@gmail.com>
Dmitry Osipenko <digetx@gmail.com> writes:
> Kernel now supports chained power-off handlers. Use do_kernel_power_off()
> that invokes chained power-off handlers. It also invokes legacy
> pm_power_off() for now, which will be removed once all drivers are
> converted to the new power-off API.
>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
> arch/powerpc/kernel/setup-common.c | 4 +---
> arch/powerpc/xmon/xmon.c | 3 +--
> 2 files changed, 2 insertions(+), 5 deletions(-)
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
cheers
* Re: [PATCH 0/3] KEXEC_SIG with appended signature
From: Michal Suchánek @ 2021-11-08 12:05 UTC (permalink / raw)
To: Daniel Axtens
Cc: Thiago Jung Bauermann, Rob Herring, Vasily Gorbik, linux-s390,
Heiko Carstens, linux-kernel, Mimi Zohar, David Howells,
Lakshmi Ramasubramanian, Luis Chamberlain, keyrings,
Paul Mackerras, Frank van der Linden, Jessica Yu,
Alexander Gordeev, linuxppc-dev, Christian Borntraeger,
Hari Bathini
In-Reply-To: <87a6ifehin.fsf@dja-thinkpad.axtens.net>
Hello,
On Mon, Nov 08, 2021 at 09:18:56AM +1100, Daniel Axtens wrote:
> Michal Suchánek <msuchanek@suse.de> writes:
>
> > On Fri, Nov 05, 2021 at 09:55:52PM +1100, Daniel Axtens wrote:
> >> Michal Suchanek <msuchanek@suse.de> writes:
> >>
> >> > S390 uses appended signature for kernel but implements the check
> >> > separately from module loader.
> >> >
> >> > Support for secure boot on powerpc with appended signature is planned -
> >> > grub patches submitted upstream but not yet merged.
> >>
> >> Power Non-Virtualised / OpenPower already supports secure boot via kexec
> >> with signature verification via IMA. I think you have now sent a
> >> follow-up series that merges some of the IMA implementation, I just
> >> wanted to make sure it was clear that we actually already have support
> >
> > So are IMA_KEXEC and KEXEC_SIG redundant?
> >
> > I see some architectures have both. I also see there is a lot of overlap
> > between the IMA framework and the KEXEC_SIG and MODULE_SIG.
>
>
> Mimi would be much better placed than me to answer this.
>
> The limits of my knowledge are basically that signature verification for
> modules and kexec kernels can be enforced by IMA policies.
>
> For example a secure booted powerpc kernel with module support will have
> the following IMA policy set at the arch level:
>
> "appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig",
> (in arch/powerpc/kernel/ima_arch.c)
>
> Module signature enforcement can be set with either IMA (policy like
> "appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig" )
> or with CONFIG_MODULE_SIG_FORCE/module.sig_enforce=1.
>
> Sometimes this leads to arguably unexpected interactions - for example
> commit fa4f3f56ccd2 ("powerpc/ima: Fix secure boot rules in ima arch
> policy"), so it might be interesting to see if we can make things easier
> to understand.
I suspect that is the root of the problem here. Until distributions pick
up IMA and document, step by step and in detail, how to implement,
enable, and debug it, the _SIG options are required for users to be able
to make use of signatures.
The other part is that distributions apply 'lockdown' patches, rejected
by upstream, that change the security policy depending on secure boot
status. These patches hook only into the _SIG options, not into the IMA_
options. Of course, I expect this to change once the IMA options are
universally available across architectures and distributions pick up
the support.
Which brings me to the third point: IMA features vary across
architectures, and KEXEC_SIG is more common than IMA_KEXEC.
config/arm64/default:CONFIG_HAVE_IMA_KEXEC=y
config/ppc64le/default:CONFIG_HAVE_IMA_KEXEC=y
config/arm64/default:CONFIG_KEXEC_SIG=y
config/s390x/default:CONFIG_KEXEC_SIG=y
config/x86_64/default:CONFIG_KEXEC_SIG=y
KEXEC_SIG makes it much easier to get uniform features across
architectures.
Thanks
Michal