linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH v2] powerpc/fadump: Fix the race in crash_fadump().
@ 2016-10-24 18:21 Mahesh J Salgaonkar
  2017-02-01  1:05 ` [v2] " Michael Ellerman
From: Mahesh J Salgaonkar @ 2016-10-24 18:21 UTC
  To: linuxppc-dev, Michael Ellerman
  Cc: Hari Bathini, Balbir Singh, Anton Blanchard

From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Multiple CPUs can call crash_fadump() simultaneously, and each one then
appends the same info to the vmcoreinfo ELF note section. The duplicated
entries cause makedumpfile to fail during kdump capture. One example is
triggering dumprestart from the HMC, which sends a system reset to all the
CPUs at once.

makedumpfile --dump-dmesg /proc/vmcore
read_vmcoreinfo_basic_info: Invalid data in /tmp/vmcoreinfoyjgxlL: CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971
makedumpfile Failed.
Running makedumpfile --dump-dmesg /proc/vmcore failed (1).

makedumpfile  -d 31 -l /proc/vmcore
read_vmcoreinfo_basic_info: Invalid data in /tmp/vmcoreinfo1mmVdO: CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971
makedumpfile Failed.
Running makedumpfile  -d 31 -l /proc/vmcore failed (1).

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
---
Changes in V2:
- Use cmpxchg instead of a mutex lock in the panic path.
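
For illustration only, the first-CPU-wins claim done with cmpxchg can be
modelled in user space. This is a rough sketch, not the kernel code: C11
atomic_compare_exchange_strong() stands in for cmpxchg(), sched_yield()
stands in for cpu_relax(), threads stand in for CPUs, and the names
crash_fadump_model(), simulate_crash_race(), vmcoreinfo_writes and MAX_CPUS
are hypothetical. The winner also clears dump_registered so the waiting
threads can exit; in the kernel the winning CPU triggers the firmware dump
and does not return.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CPUS 64	/* hypothetical cap for the simulation */

/* User-space stand-ins for the kernel state; not the real globals. */
static atomic_int  crashing_cpu;	/* -1: no CPU has claimed the crash path */
static atomic_bool dump_registered;	/* models fw_dump.dump_registered */
static atomic_int  vmcoreinfo_writes;	/* counts how often crash info is saved */

/* Hypothetical model of crash_fadump() for one simulated CPU. */
static void crash_fadump_model(int this_cpu)
{
	int expected = -1;

	if (!atomic_load(&dump_registered))
		return;

	/* Models: old_cpu = cmpxchg(&crashing_cpu, -1, this_cpu); */
	if (!atomic_compare_exchange_strong(&crashing_cpu, &expected, this_cpu)) {
		/* Lost the race: wait while the dump stays registered. */
		while (atomic_load(&dump_registered))
			sched_yield();	/* stand-in for cpu_relax() */
		return;
	}

	/* Won the race: only this thread records the crash information. */
	atomic_fetch_add(&vmcoreinfo_writes, 1);

	/* Simulation only: release the waiters so the threads can be joined. */
	atomic_store(&dump_registered, false);
}

static void *cpu_thread(void *arg)
{
	crash_fadump_model((int)(intptr_t)arg);
	return NULL;
}

/* Sends ncpus threads through the model; returns the number of saves. */
int simulate_crash_race(int ncpus)
{
	pthread_t tid[MAX_CPUS];
	int started = 0;

	if (ncpus > MAX_CPUS)
		ncpus = MAX_CPUS;

	atomic_store(&crashing_cpu, -1);
	atomic_store(&dump_registered, true);
	atomic_store(&vmcoreinfo_writes, 0);

	for (int i = 0; i < ncpus; i++)
		if (pthread_create(&tid[started], NULL, cpu_thread,
				   (void *)(intptr_t)i) == 0)
			started++;
	for (int i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	return atomic_load(&vmcoreinfo_writes);
}
```

Built with -pthread and called from a small main(), simulate_crash_race(8)
should report a single save no matter how the threads interleave, which is
the property the patch relies on to keep CRASHTIME from being repeated.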
---
 arch/powerpc/kernel/fadump.c |   25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index b3a6633..a795956 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -401,12 +401,35 @@ static void register_fw_dump(struct fadump_mem_struct *fdm)
 void crash_fadump(struct pt_regs *regs, const char *str)
 {
 	struct fadump_crash_info_header *fdh = NULL;
+	int old_cpu, this_cpu;
 
 	if (!fw_dump.dump_registered || !fw_dump.fadumphdr_addr)
 		return;
 
+	/*
+	 * old_cpu == -1 means this is the first CPU which has come here,
+	 * go ahead and trigger fadump.
+	 *
+	 * old_cpu != -1 means some other CPU is already on its way
+	 * to trigger fadump, just keep looping here.
+	 */
+	this_cpu = smp_processor_id();
+	old_cpu = cmpxchg(&crashing_cpu, -1, this_cpu);
+
+	if (old_cpu != -1) {
+		/*
+		 * We can't loop here indefinitely. Wait as long as fadump
+		 * is in force. If we race with fadump un-registration this
+		 * loop will break and then we go down to normal panic path
+		 * and reboot. If fadump is in force the first crashing
+		 * cpu will definitely trigger fadump.
+		 */
+		while (fw_dump.dump_registered)
+			cpu_relax();
+		return;
+	}
+
 	fdh = __va(fw_dump.fadumphdr_addr);
-	crashing_cpu = smp_processor_id();
 	fdh->crashing_cpu = crashing_cpu;
 	crash_save_vmcoreinfo();
 


* Re: [v2] powerpc/fadump: Fix the race in crash_fadump().
  2016-10-24 18:21 [PATCH v2] powerpc/fadump: Fix the race in crash_fadump() Mahesh J Salgaonkar
@ 2017-02-01  1:05 ` Michael Ellerman
From: Michael Ellerman @ 2017-02-01  1:05 UTC
  To: Mahesh Salgaonkar, linuxppc-dev
  Cc: Hari Bathini, Anton Blanchard

On Mon, 2016-10-24 at 18:21:51 UTC, Mahesh Salgaonkar wrote:
> From: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
> 
> Multiple CPUs can call crash_fadump() simultaneously, and each one then
> appends the same info to the vmcoreinfo ELF note section. The duplicated
> entries cause makedumpfile to fail during kdump capture. One example is
> triggering dumprestart from the HMC, which sends a system reset to all the
> CPUs at once.
> 
> makedumpfile --dump-dmesg /proc/vmcore
> read_vmcoreinfo_basic_info: Invalid data in /tmp/vmcoreinfoyjgxlL: CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971
> makedumpfile Failed.
> Running makedumpfile --dump-dmesg /proc/vmcore failed (1).
> 
> makedumpfile  -d 31 -l /proc/vmcore
> read_vmcoreinfo_basic_info: Invalid data in /tmp/vmcoreinfo1mmVdO: CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971CRASHTIME=1475605971
> makedumpfile Failed.
> Running makedumpfile  -d 31 -l /proc/vmcore failed (1).
> 
> Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/f2a5e8f0023eba847ad2adb145b2f6

cheers
