* [PATCH AUTOSEL 5.4 08/33] powerpc/pci: Add ppc_md.discover_phbs()
[not found] <20210302115749.62653-1-sashal@kernel.org>
@ 2021-03-02 11:57 ` Sasha Levin
2021-03-02 11:57 ` [PATCH AUTOSEL 5.4 11/33] powerpc: improve handling of unrecoverable system reset Sasha Levin
` (2 subsequent siblings)
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2021-03-02 11:57 UTC (permalink / raw)
To: linux-kernel, stable
Cc: Sasha Levin, linuxppc-dev, Oliver O'Halloran,
kernel test robot
From: Oliver O'Halloran <oohall@gmail.com>
[ Upstream commit 5537fcb319d016ce387f818dd774179bc03217f5 ]
On many powerpc platforms the discovery and initalisation of
pci_controllers (PHBs) happens inside of setup_arch(). This is very early
in boot (pre-initcalls) and means that we're initialising the PHB long
before many basic kernel services (slab allocator, debugfs, a real ioremap)
are available.
On PowerNV this causes an additional problem since we map the PHB registers
with ioremap(). As of commit d538aadc2718 ("powerpc/ioremap: warn on early
use of ioremap()") a warning is printed because we're using the "incorrect"
API to setup and MMIO mapping in searly boot. The kernel does provide
early_ioremap(), but that is not intended to create long-lived MMIO
mappings and a seperate warning is printed by generic code if
early_ioremap() mappings are "leaked."
This is all fixable with dumb hacks like using early_ioremap() to setup
the initial mapping then replacing it with a real ioremap later on in
boot, but it does raise the question: Why the hell are we setting up the
PHB's this early in boot?
The old and wise claim it's due to "hysterical rasins." Aside from amused
grapes there doesn't appear to be any real reason to maintain the current
behaviour. Already most of the newer embedded platforms perform PHB
discovery in an arch_initcall and between the end of setup_arch() and the
start of initcalls none of the generic kernel code does anything PCI
related. On powerpc scanning PHBs occurs in a subsys_initcall so it should
be possible to move the PHB discovery to a core, postcore or arch initcall.
This patch adds the ppc_md.discover_phbs hook and a core_initcall stub that
calls it. The core_initcalls are the earliest to be called so this will
any possibly issues with dependency between initcalls. This isn't just an
academic issue either since on pseries and PowerNV EEH init occurs in an
arch_initcall and depends on the pci_controllers being available, similarly
the creation of pci_dns occurs at core_initcall_sync (i.e. between core and
postcore initcalls). These problems need to be addressed seperately.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
[mpe: Make discover_phbs() static]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20201103043523.916109-1-oohall@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/powerpc/include/asm/machdep.h | 3 +++
arch/powerpc/kernel/pci-common.c | 10 ++++++++++
2 files changed, 13 insertions(+)
diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 7bcb64444a39..f71c361dc356 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -59,6 +59,9 @@ struct machdep_calls {
int (*pcibios_root_bridge_prepare)(struct pci_host_bridge
*bridge);
+ /* finds all the pci_controllers present at boot */
+ void (*discover_phbs)(void);
+
/* To setup PHBs when using automatic OF platform driver for PCI */
int (*pci_setup_phb)(struct pci_controller *host);
diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 1c448cf25506..a2c258a8d736 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -1669,3 +1669,13 @@ static void fixup_hide_host_resource_fsl(struct pci_dev *dev)
}
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_MOTOROLA, PCI_ANY_ID, fixup_hide_host_resource_fsl);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_FREESCALE, PCI_ANY_ID, fixup_hide_host_resource_fsl);
+
+
+static int __init discover_phbs(void)
+{
+ if (ppc_md.discover_phbs)
+ ppc_md.discover_phbs();
+
+ return 0;
+}
+core_initcall(discover_phbs);
--
2.30.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* [PATCH AUTOSEL 5.4 11/33] powerpc: improve handling of unrecoverable system reset
[not found] <20210302115749.62653-1-sashal@kernel.org>
2021-03-02 11:57 ` [PATCH AUTOSEL 5.4 08/33] powerpc/pci: Add ppc_md.discover_phbs() Sasha Levin
@ 2021-03-02 11:57 ` Sasha Levin
2021-03-02 11:57 ` [PATCH AUTOSEL 5.4 12/33] powerpc/perf: Record counter overflow always if SAMPLE_IP is unset Sasha Levin
2021-03-02 11:57 ` [PATCH AUTOSEL 5.4 14/33] powerpc/64: Fix stack trace not displaying final frame Sasha Levin
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2021-03-02 11:57 UTC (permalink / raw)
To: linux-kernel, stable; +Cc: Sasha Levin, linuxppc-dev, Nicholas Piggin
From: Nicholas Piggin <npiggin@gmail.com>
[ Upstream commit 11cb0a25f71818ca7ab4856548ecfd83c169aa4d ]
If an unrecoverable system reset hits in process context, the system
does not have to panic. Similar to machine check, call nmi_exit()
before die().
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210130130852.2952424-26-npiggin@gmail.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/powerpc/kernel/traps.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 206032c9b545..ecfa460f66d1 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -513,8 +513,11 @@ out:
die("Unrecoverable nested System Reset", regs, SIGABRT);
#endif
/* Must die if the interrupt is not recoverable */
- if (!(regs->msr & MSR_RI))
+ if (!(regs->msr & MSR_RI)) {
+ /* For the reason explained in die_mce, nmi_exit before die */
+ nmi_exit();
die("Unrecoverable System Reset", regs, SIGABRT);
+ }
if (saved_hsrrs) {
mtspr(SPRN_HSRR0, hsrr0);
--
2.30.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* [PATCH AUTOSEL 5.4 12/33] powerpc/perf: Record counter overflow always if SAMPLE_IP is unset
[not found] <20210302115749.62653-1-sashal@kernel.org>
2021-03-02 11:57 ` [PATCH AUTOSEL 5.4 08/33] powerpc/pci: Add ppc_md.discover_phbs() Sasha Levin
2021-03-02 11:57 ` [PATCH AUTOSEL 5.4 11/33] powerpc: improve handling of unrecoverable system reset Sasha Levin
@ 2021-03-02 11:57 ` Sasha Levin
2021-03-02 11:57 ` [PATCH AUTOSEL 5.4 14/33] powerpc/64: Fix stack trace not displaying final frame Sasha Levin
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2021-03-02 11:57 UTC (permalink / raw)
To: linux-kernel, stable; +Cc: Sasha Levin, Athira Rajeev, linuxppc-dev
From: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
[ Upstream commit d137845c973147a22622cc76c7b0bc16f6206323 ]
While sampling for marked events, currently we record the sample only
if the SIAR valid bit of Sampled Instruction Event Register (SIER) is
set. SIAR_VALID bit is used for fetching the instruction address from
Sampled Instruction Address Register(SIAR). But there are some
usecases, where the user is interested only in the PMU stats at each
counter overflow and the exact IP of the overflow event is not
required. Dropping SIAR invalid samples will fail to record some of
the counter overflows in such cases.
Example of such usecase is dumping the PMU stats (event counts) after
some regular amount of instructions/events from the userspace (ex: via
ptrace). Here counter overflow is indicated to userspace via signal
handler, and captured by monitoring and enabling I/O signaling on the
event file descriptor. In these cases, we expect to get
sample/overflow indication after each specified sample_period.
Perf event attribute will not have PERF_SAMPLE_IP set in the
sample_type if exact IP of the overflow event is not requested. So
while profiling if SAMPLE_IP is not set, just record the counter
overflow irrespective of SIAR_VALID check.
Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
[mpe: Reflow comment and if formatting]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1612516492-1428-1-git-send-email-atrajeev@linux.vnet.ibm.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/powerpc/perf/core-book3s.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index 02fc75ddcbb3..6f013e418834 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -2077,7 +2077,17 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
left += period;
if (left <= 0)
left = period;
- record = siar_valid(regs);
+
+ /*
+ * If address is not requested in the sample via
+ * PERF_SAMPLE_IP, just record that sample irrespective
+ * of SIAR valid check.
+ */
+ if (event->attr.sample_type & PERF_SAMPLE_IP)
+ record = siar_valid(regs);
+ else
+ record = 1;
+
event->hw.last_period = event->hw.sample_period;
}
if (left < 0x80000000LL)
@@ -2095,9 +2105,10 @@ static void record_and_restart(struct perf_event *event, unsigned long val,
* MMCR2. Check attr.exclude_kernel and address to drop the sample in
* these cases.
*/
- if (event->attr.exclude_kernel && record)
- if (is_kernel_addr(mfspr(SPRN_SIAR)))
- record = 0;
+ if (event->attr.exclude_kernel &&
+ (event->attr.sample_type & PERF_SAMPLE_IP) &&
+ is_kernel_addr(mfspr(SPRN_SIAR)))
+ record = 0;
/*
* Finally record data if requested.
--
2.30.1
^ permalink raw reply related [flat|nested] 4+ messages in thread* [PATCH AUTOSEL 5.4 14/33] powerpc/64: Fix stack trace not displaying final frame
[not found] <20210302115749.62653-1-sashal@kernel.org>
` (2 preceding siblings ...)
2021-03-02 11:57 ` [PATCH AUTOSEL 5.4 12/33] powerpc/perf: Record counter overflow always if SAMPLE_IP is unset Sasha Levin
@ 2021-03-02 11:57 ` Sasha Levin
3 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2021-03-02 11:57 UTC (permalink / raw)
To: linux-kernel, stable; +Cc: Sasha Levin, linuxppc-dev
From: Michael Ellerman <mpe@ellerman.id.au>
[ Upstream commit e3de1e291fa58a1ab0f471a4b458eff2514e4b5f ]
In commit bf13718bc57a ("powerpc: show registers when unwinding
interrupt frames") we changed our stack dumping logic to show the full
registers whenever we find an interrupt frame on the stack.
However we didn't notice that on 64-bit this doesn't show the final
frame, ie. the interrupt that brought us in from userspace, whereas on
32-bit it does.
That is due to confusion about the size of that last frame. The code
in show_stack() calls validate_sp(), passing it STACK_INT_FRAME_SIZE
to check the sp is at least that far below the top of the stack.
However on 64-bit that size is too large for the final frame, because
it includes the red zone, but we don't allocate a red zone for the
first frame.
So add a new define that encodes the correct size for 32-bit and
64-bit, and use it in show_stack().
This results in the full trace being shown on 64-bit, eg:
sysrq: Trigger a crash
Kernel panic - not syncing: sysrq triggered crash
CPU: 0 PID: 83 Comm: sh Not tainted 5.11.0-rc2-gcc-8.2.0-00188-g571abcb96b10-dirty #649
Call Trace:
[c00000000a1c3ac0] [c000000000897b70] dump_stack+0xc4/0x114 (unreliable)
[c00000000a1c3b00] [c00000000014334c] panic+0x178/0x41c
[c00000000a1c3ba0] [c00000000094e600] sysrq_handle_crash+0x40/0x50
[c00000000a1c3c00] [c00000000094ef98] __handle_sysrq+0xd8/0x210
[c00000000a1c3ca0] [c00000000094f820] write_sysrq_trigger+0x100/0x188
[c00000000a1c3ce0] [c0000000005559dc] proc_reg_write+0x10c/0x1b0
[c00000000a1c3d10] [c000000000479950] vfs_write+0xf0/0x360
[c00000000a1c3d60] [c000000000479d9c] ksys_write+0x7c/0x140
[c00000000a1c3db0] [c00000000002bf5c] system_call_exception+0x19c/0x2c0
[c00000000a1c3e10] [c00000000000d35c] system_call_common+0xec/0x278
--- interrupt: c00 at 0x7fff9fbab428
NIP: 00007fff9fbab428 LR: 000000001000b724 CTR: 0000000000000000
REGS: c00000000a1c3e80 TRAP: 0c00 Not tainted (5.11.0-rc2-gcc-8.2.0-00188-g571abcb96b10-dirty)
MSR: 900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 22002884 XER: 00000000
IRQMASK: 0
GPR00: 0000000000000004 00007fffc3cb8960 00007fff9fc59900 0000000000000001
GPR04: 000000002a4b32d0 0000000000000002 0000000000000063 0000000000000063
GPR08: 000000002a4b32d0 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000000000 00007fff9fcca9a0 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 00000000100b8fd0
GPR20: 000000002a4b3485 00000000100b8f90 0000000000000000 0000000000000000
GPR24: 000000002a4b0440 00000000100e77b8 0000000000000020 000000002a4b32d0
GPR28: 0000000000000001 0000000000000002 000000002a4b32d0 0000000000000001
NIP [00007fff9fbab428] 0x7fff9fbab428
LR [000000001000b724] 0x1000b724
--- interrupt: c00
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20210209141627.2898485-1-mpe@ellerman.id.au
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
arch/powerpc/include/asm/ptrace.h | 3 +++
arch/powerpc/kernel/asm-offsets.c | 2 +-
arch/powerpc/kernel/process.c | 2 +-
3 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/ptrace.h b/arch/powerpc/include/asm/ptrace.h
index c41220f4aad9..5a424f867c82 100644
--- a/arch/powerpc/include/asm/ptrace.h
+++ b/arch/powerpc/include/asm/ptrace.h
@@ -62,6 +62,9 @@ struct pt_regs
};
#endif
+
+#define STACK_FRAME_WITH_PT_REGS (STACK_FRAME_OVERHEAD + sizeof(struct pt_regs))
+
#ifdef __powerpc64__
/*
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 5c0a1e17219b..af399675248e 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -285,7 +285,7 @@ int main(void)
/* Interrupt register frame */
DEFINE(INT_FRAME_SIZE, STACK_INT_FRAME_SIZE);
- DEFINE(SWITCH_FRAME_SIZE, STACK_FRAME_OVERHEAD + sizeof(struct pt_regs));
+ DEFINE(SWITCH_FRAME_SIZE, STACK_FRAME_WITH_PT_REGS);
STACK_PT_REGS_OFFSET(GPR0, gpr[0]);
STACK_PT_REGS_OFFSET(GPR1, gpr[1]);
STACK_PT_REGS_OFFSET(GPR2, gpr[2]);
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index bd0c258a1d5d..c94bba9142e7 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -2081,7 +2081,7 @@ void show_stack(struct task_struct *tsk, unsigned long *stack)
* See if this is an exception frame.
* We look for the "regshere" marker in the current frame.
*/
- if (validate_sp(sp, tsk, STACK_INT_FRAME_SIZE)
+ if (validate_sp(sp, tsk, STACK_FRAME_WITH_PT_REGS)
&& stack[STACK_FRAME_MARKER] == STACK_FRAME_REGS_MARKER) {
struct pt_regs *regs = (struct pt_regs *)
(sp + STACK_FRAME_OVERHEAD);
--
2.30.1
^ permalink raw reply related [flat|nested] 4+ messages in thread