Date: Tue, 5 Jun 2018 10:01:51 +0200
From: Petr Tesarik
Subject: panic kexec broken on ARM64?
Message-ID: <20180605100151.7fd54381@ezekiel.suse.cz>
To: linux-arm-kernel@lists.infradead.org
Cc: Matthias Brugger, kexec mailing list

Hi all,

I have observed hangs after crash on a Raspberry Pi 3 Model B+ board
when a panic kernel is loaded. I attached a hardware debugger and found
out that all CPU cores were stopped except one which was stuck in the
idle thread.

It seems that irq_set_irqchip_state() may sleep, which is definitely
not safe after a kernel panic.

If I'm right, then this is broken in general, but I have only ever seen
it on RPi 3 Model B+ (even RPi3 Model B works fine), so the issue may
be more subtle.

FWIW the code for 32-bit ARM seems to work just fine without this code
in machine_kexec_mask_interrupts():

		/*
		 * First try to remove the active state. If this
		 * fails, try to EOI the interrupt.
		 */
		ret = irq_set_irqchip_state(i, IRQCHIP_STATE_ACTIVE, false);

I wonder what breaks if this call to irq_set_irqchip_state() is removed.
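
To make the suspected path easier to reason about, here is a minimal
user-space sketch. It is not kernel code: every type and function name
below (model_chip, model_irq, model_clear_active and so on) is
hypothetical and exists only for illustration. It assumes that clearing
the active state runs the chip's bus-unlock hook on the way out, and
that the 32-bit ARM loop only does EOI and mask; both are assumptions
drawn from this report, not verified claims about the kernel.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct irq_chip; not the kernel definition. */
struct model_chip {
	bool can_clear_active;     /* chip can drop the active state itself */
	bool bus_unlock_may_sleep; /* bus-unlock hook blocks (assumed for lan78xx) */
};

/* Hypothetical stand-in for struct irq_desc. */
struct model_irq {
	struct model_chip chip;
	bool in_progress;
	bool active;
	bool masked;
};

/*
 * Models clearing the active state through irq_set_irqchip_state().
 * Assumption: the bus-unlock hook runs on the way out whether or not
 * the chip supports the state change. Returns 0 on success, -1 if the
 * chip cannot clear the state.
 */
static int model_clear_active(struct model_irq *irq, size_t *sleeping_calls)
{
	int ret = -1;

	if (irq->chip.can_clear_active) {
		irq->active = false;
		ret = 0;
	}
	if (irq->chip.bus_unlock_may_sleep)
		(*sleeping_calls)++;
	return ret;
}

/* Models the EOI fallback: the interrupt is no longer in progress. */
static void model_eoi(struct model_irq *irq)
{
	irq->in_progress = false;
	irq->active = false;
}

/* arm64-like loop; returns how many callbacks could have slept. */
size_t model_mask_with_state_call(struct model_irq *irqs, size_t n)
{
	size_t sleeping_calls = 0;

	for (size_t i = 0; i < n; i++) {
		int ret = model_clear_active(&irqs[i], &sleeping_calls);

		if (ret && irqs[i].in_progress)
			model_eoi(&irqs[i]);
		irqs[i].masked = true;
	}
	return sleeping_calls;
}

/* 32-bit-ARM-like loop: EOI and mask only, bus lock never involved. */
size_t model_mask_without_state_call(struct model_irq *irqs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (irqs[i].in_progress)
			model_eoi(&irqs[i]);
		irqs[i].masked = true;
	}
	return 0;
}
```

In this model both loops leave every interrupt masked, but only the
first one can enter a callback that sleeps. Whether dropping the call
is safe on real hardware is exactly the open question above.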

For reference, here is a stack trace of the process which originally
triggered the panic:

#0  __switch_to (prev=0xffff000008e62a00 , next=0xffff80002b796080) at ../arch/arm64/kernel/process.c:355
#1  0xffff0000088f584c in context_switch (rf=, next=, prev=, rq=) at ../kernel/sched/core.c:2896
#2  __schedule (preempt=false) at ../kernel/sched/core.c:3457
#3  0xffff0000088f5eac in schedule () at ../kernel/sched/core.c:3516
#4  0xffff0000088f9448 in schedule_timeout (timeout=) at ../kernel/time/timer.c:1743
#5  0xffff0000088f6afc in do_wait_for_common (state=, timeout=500, action=, x=) at ../kernel/sched/completion.c:77
#6  __wait_for_common (state=, timeout=, action=, x=) at ../kernel/sched/completion.c:96
#7  wait_for_common (x=0xffff000008e53848 , timeout=500, state=) at ../kernel/sched/completion.c:104
#8  0xffff0000088f6c1c in wait_for_completion_timeout (x=0xffff000008e53848 , timeout=500) at ../kernel/sched/completion.c:144
#9  0xffff000000a19f1c in usb_start_wait_urb (urb=0xffff80002c1cd700, timeout=5000, actual_length=0xffff000008e538dc ) at ../drivers/usb/core/message.c:61
#10 0xffff000000a1a05c in usb_internal_control_msg (timeout=, len=, data=, cmd=, pipe=, usb_dev=) at ../drivers/usb/core/message.c:100
#11 usb_control_msg (dev=0xffff80002c348000, pipe=2147484800, request=161 '\241', requesttype=192 '\300', value=0, index=152, data=0xffff80002b6fa080, size=4, timeout=5000) at ../drivers/usb/core/message.c:151
#12 0xffff000001001cd0 in lan78xx_read_reg (index=152, data=0xffff000008e5396c , dev=, dev=) at ../drivers/net/usb/lan78xx.c:425
#13 0xffff00000100365c in lan78xx_irq_bus_sync_unlock (irqd=) at ../drivers/net/usb/lan78xx.c:1909
#14 0xffff00000813e590 in chip_bus_sync_unlock (desc=) at ../kernel/irq/internals.h:129
#15 __irq_put_desc_unlock (desc=0xffff80002c361c00, flags=128, bus=true) at ../kernel/irq/irqdesc.c:804
#16 0xffff00000813f604 in irq_put_desc_busunlock (flags=, desc=) at ../kernel/irq/internals.h:155
#17 irq_set_irqchip_state (irq=, which=, val=false) at ../kernel/irq/manage.c:2136
#18 0xffff00000809b7d4 in machine_kexec_mask_interrupts () at ../arch/arm64/kernel/machine_kexec.c:233
#19 machine_crash_shutdown (regs=) at ../arch/arm64/kernel/machine_kexec.c:259
#20 0xffff000008180fd4 in __crash_kexec (regs=0xffff000008e53d70 ) at ../kernel/kexec_core.c:943
#21 0xffff0000081810e4 in crash_kexec (regs=0xffff000008e53d70 ) at ../kernel/kexec_core.c:965
#22 0xffff00000808ab58 in die (str=, regs=0xffff000008e53d70 , err=-2046820348) at ../arch/arm64/kernel/traps.c:266
#23 0xffff0000080a1c14 in __do_kernel_fault (mm=0x0, addr=0, esr=2248146948, regs=0xffff000008e53d70 ) at ../arch/arm64/mm/fault.c:226
#24 0xffff0000088fc8dc in do_page_fault (addr=0, esr=2248146948, regs=0xffff000008e53d70 ) at ../arch/arm64/mm/fault.c:476
#25 0xffff0000088fccdc in do_translation_fault (addr=0, esr=2248146948, regs=0xffff000008e53d70 ) at ../arch/arm64/mm/fault.c:502
#26 0xffff000008081478 in do_mem_abort (addr=0, esr=2248146948, regs=0xffff000008e53d70 ) at ../arch/arm64/mm/fault.c:657
#27 0xffff000008082dd0 in el1_sync () at ../arch/arm64/kernel/entry.S:548

Petr T