From mboxrd@z Thu Jan 1 00:00:00 1970
From: Thomas Gleixner
To: Peter Zijlstra
Cc: David Woodhouse, Stefan Hajnoczi, Jason Wang, "x86@kernel.org",
	hpa, dyoung, kexec, linux-ext4, "Michael S. Tsirkin",
	Stefano Garzarella, eperezma, Paolo Bonzini, ming.lei@redhat.com,
	Petr Mladek, John Ogness
Subject: [PATCH] sched: Prevent rescheduling when interrupts are disabled
In-Reply-To: <87seqr914v.ffs@tglx>
References: <1f631458c180c975c238d4d33d333f9fa9a4d2a3.camel@infradead.org>
	<20241211124240.GA310916@fedora>
	<7717fe2ac0ce5f0a2c43fdab8b11f4483d54a2a4.camel@infradead.org>
	<87ldwl9g93.ffs@tglx>
	<10f5d22150b548ec271e0a847ba2eb91139e6f61.camel@infradead.org>
	<87a5d0aibc.ffs@tglx>
	<874j38a16p.ffs@tglx>
	<20241213112028.GE21636@noisy.programming.kicks-ass.net>
	<87seqr914v.ffs@tglx>
Date: Mon, 16 Dec 2024 14:20:56 +0100
Message-ID: <87a5cv932f.ffs@tglx>
MIME-Version: 1.0
Content-Type: text/plain

David reported a warning observed while loop testing kexec jump:

  Interrupts enabled after irqrouter_resume+0x0/0x50
  WARNING: CPU: 0 PID: 560 at drivers/base/syscore.c:103 syscore_resume+0x18a/0x220
   kernel_kexec+0xf6/0x180
   __do_sys_reboot+0x206/0x250
   do_syscall_64+0x95/0x180

The corresponding interrupt flag trace:

  hardirqs last  enabled at (15573): [] __up_console_sem+0x7e/0x90
  hardirqs last disabled at (15580): [] __up_console_sem+0x63/0x90

That means __up_console_sem() was invoked with interrupts enabled. Further
instrumentation revealed that in the interrupt disabled section of kexec
jump one of the syscore_suspend() callbacks woke up a task, which set the
NEED_RESCHED flag. A later callback in the resume path invoked
cond_resched(), which in turn led to the invocation of the scheduler:

  __cond_resched+0x21/0x60
  down_timeout+0x18/0x60
  acpi_os_wait_semaphore+0x4c/0x80
  acpi_ut_acquire_mutex+0x3d/0x100
  acpi_ns_get_node+0x27/0x60
  acpi_ns_evaluate+0x1cb/0x2d0
  acpi_rs_set_srs_method_data+0x156/0x190
  acpi_pci_link_set+0x11c/0x290
  irqrouter_resume+0x54/0x60
  syscore_resume+0x6a/0x200
  kernel_kexec+0x145/0x1c0
  __do_sys_reboot+0xeb/0x240
  do_syscall_64+0x95/0x180

This is a long-standing problem, which probably got more visible with
the recent printk changes. Something does a task wakeup and the scheduler
sets the NEED_RESCHED flag. cond_resched() sees it set and invokes
schedule() from a completely bogus context. The scheduler enables
interrupts after context switching, which causes the above warning at the
end.

Quite a few of the code paths in syscore_suspend()/resume() can result in
triggering a wakeup with exactly the same consequences. They might not
have done so yet, but as they share a lot of code with normal operations
it's just a question of time.

The problem only affects the PREEMPT_NONE and PREEMPT_VOLUNTARY scheduling
models. Full preemption is not affected as cond_resched() is disabled and
the preemption check preemptible() takes the interrupt disabled flag into
account.

Cure the problem by adding a corresponding check into cond_resched().

Reported-by: David Woodhouse
Signed-off-by: Thomas Gleixner
Tested-by: David Woodhouse
Cc: stable@vger.kernel.org
Closes: https://lore.kernel.org/all/7717fe2ac0ce5f0a2c43fdab8b11f4483d54a2a4.camel@infradead.org
---
 kernel/sched/core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7276,7 +7276,7 @@ void rt_mutex_setprio(struct task_struct
 #if !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC)
 int __sched __cond_resched(void)
 {
-	if (should_resched(0)) {
+	if (should_resched(0) && !irqs_disabled()) {
 		preempt_schedule_common();
 		return 1;
 	}
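
For illustration only, and not part of the patch: the sequence described in
the changelog can be sketched as a small userspace model. Everything here is
hypothetical (struct cpu_model and all model_*() names are made up for this
sketch). It reduces should_resched(0) to a single flag, ignores the preempt
count, and only mimics the two facts stated above: a wakeup sets
NEED_RESCHED, and the scheduler returns with interrupts enabled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for per-CPU state; this is not kernel code. */
struct cpu_model {
	bool need_resched;	/* models the NEED_RESCHED flag */
	bool irqs_off;		/* models "interrupts are disabled" */
	int  schedule_calls;	/* how often the modelled scheduler ran */
};

/* A task wakeup marks the CPU as needing a reschedule. */
static void model_wake_up(struct cpu_model *cpu)
{
	cpu->need_resched = true;
}

/* The modelled scheduler returns with interrupts enabled. */
static void model_schedule(struct cpu_model *cpu)
{
	cpu->schedule_calls++;
	cpu->need_resched = false;
	cpu->irqs_off = false;
}

/* Pre-patch behaviour: only the reschedule flag is checked. */
static int model_cond_resched_old(struct cpu_model *cpu)
{
	if (cpu->need_resched) {
		model_schedule(cpu);
		return 1;
	}
	return 0;
}

/* Patched behaviour: never reschedule while interrupts are disabled. */
static int model_cond_resched_new(struct cpu_model *cpu)
{
	if (cpu->need_resched && !cpu->irqs_off) {
		model_schedule(cpu);
		return 1;
	}
	return 0;
}

/* Walks through the kexec jump scenario with both variants. */
void model_kexec_jump_demo(void)
{
	struct cpu_model old_cpu = { .irqs_off = true };
	struct cpu_model new_cpu = { .irqs_off = true };

	/* A suspend callback wakes a task inside the irq-off section. */
	model_wake_up(&old_cpu);
	model_wake_up(&new_cpu);

	/* Old check: schedules and leaks enabled interrupts. */
	assert(model_cond_resched_old(&old_cpu) == 1);
	assert(!old_cpu.irqs_off);

	/* New check: stays out of the scheduler, flag remains pending. */
	assert(model_cond_resched_new(&new_cpu) == 0);
	assert(new_cpu.irqs_off && new_cpu.need_resched);
}
```

In this model the pending flag is not lost with the new check. It is simply
acted upon at the first model_cond_resched_new() call made after irqs_off
has been cleared again.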