From: John Ogness <john.ogness@linutronix.de>
To: Juri Lelli <juri.lelli@redhat.com>
Cc: Petr Mladek <pmladek@suse.com>,
Sergey Senozhatsky <senozhatsky@chromium.org>,
Steven Rostedt <rostedt@goodmis.org>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Jiri Slaby <jirislaby@kernel.org>,
Sreenath Vijayan <sreenath.vijayan@sony.com>,
Shimoyashiki Taichi <taichi.shimoyashiki@sony.com>,
Tomas Mudrunka <tomas.mudrunka@gmail.com>,
linux-doc@vger.kernel.org, linux-serial@vger.kernel.org,
linux-fsdevel@vger.kernel.org,
"Paul E. McKenney" <paulmck@kernel.org>,
Josh Poimboeuf <jpoimboe@kernel.org>,
"Borislav Petkov (AMD)" <bp@alien8.de>,
Xiongwei Song <xiongwei.song@windriver.com>
Subject: Re: [PATCH printk v2 00/18] add threaded printing + the rest
Date: Wed, 05 Jun 2024 10:15:12 +0206
Message-ID: <875xunx13r.fsf@jogness.linutronix.de>
In-Reply-To: <aqkcpca4vgadxc3yzcu74xwq3grslj5m43f3eb5fcs23yo2gy4@gcsnqcts5tos>

Hi Juri,

On 2024-06-04, Juri Lelli <juri.lelli@redhat.com> wrote:
> Our QE reported something like the following while testing the latest
> rt-devel branch (I then could reproduce with this set applied on top of
> linux-next).
>
> ---
> ... kernel: INFO: task khugepaged:351 blocked for more than 1 seconds.
> ... kernel: Not tainted 6.9.0-thrdprintk+ #3
> ... kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> ... kernel: task:khugepaged state:D stack:0 pid:351 tgid:351 ppid:2 flags:0x00004000
> ... kernel: Call Trace:
> ... kernel: <TASK>
> ... kernel: __schedule+0x2bd/0x7f0
> ... kernel: ? __lock_release.isra.0+0x5e/0x170
> ... kernel: schedule+0x3d/0x100
> ... kernel: schedule_timeout+0x1ca/0x1f0
> ... kernel: ? mark_held_locks+0x49/0x80
> ... kernel: ? _raw_spin_unlock_irq+0x24/0x50
> ... kernel: ? lockdep_hardirqs_on+0x77/0x100
> ... kernel: __wait_for_common+0xb7/0x220
> ... kernel: ? __pfx_schedule_timeout+0x10/0x10
> ... kernel: __flush_work+0x70/0x90
> ... kernel: ? __pfx_wq_barrier_func+0x10/0x10
> ... kernel: __lru_add_drain_all+0x179/0x210
> ... kernel: khugepaged+0x73/0x200
> ... kernel: ? lockdep_hardirqs_on+0x77/0x100
> ... kernel: ? _raw_spin_unlock_irqrestore+0x38/0x60
> ... kernel: ? __pfx_khugepaged+0x10/0x10
> ... kernel: kthread+0xec/0x120
> ... kernel: ? __pfx_kthread+0x10/0x10
> ... kernel: ret_from_fork+0x2d/0x50
> ... kernel: ? __pfx_kthread+0x10/0x10
> ... kernel: ret_from_fork_asm+0x1a/0x30
> ... kernel: </TASK>
> ... kernel:
> ... Showing all locks held in the system:
> ... kernel: 1 lock held by khungtaskd/345:
> ... kernel: #0: ffffffff8cbff1c0 (rcu_read_lock){....}-{1:2}, at: debug_show_all_locks+0x32/0x1d0
> ... kernel: BUG: using smp_processor_id() in preemptible [00000000] code: khungtaskd/345
> ... kernel: caller is nbcon_get_cpu_emergency_nesting+0x25/0x40
> ... kernel: CPU: 30 PID: 345 Comm: khungtaskd Kdump: loaded Not tainted 6.9.0-thrdprintk+ #3
> ... kernel: Hardware name: Dell Inc. PowerEdge R740/04FC42, BIOS 2.10.2 02/24/2021
> ... kernel: Call Trace:
> ... kernel: <TASK>
> ... kernel: dump_stack_lvl+0x7f/0xa0
> ... kernel: check_preemption_disabled+0xbf/0xe0
> ... kernel: nbcon_get_cpu_emergency_nesting+0x25/0x40
> ... kernel: nbcon_cpu_emergency_flush+0xa/0x60
> ... kernel: debug_show_all_locks+0x9d/0x1d0
> ... kernel: check_hung_uninterruptible_tasks+0x4f0/0x540
> ... kernel: ? check_hung_uninterruptible_tasks+0x185/0x540
> ... kernel: ? __pfx_watchdog+0x10/0x10
> ... kernel: watchdog+0x99/0xa0
> ... kernel: kthread+0xec/0x120
> ... kernel: ? __pfx_kthread+0x10/0x10
> ... kernel: ret_from_fork+0x2d/0x50
> ... kernel: ? __pfx_kthread+0x10/0x10
> ... kernel: ret_from_fork_asm+0x1a/0x30
> ... kernel: </TASK>
> ---
>
> It requires DEBUG_PREEMPT and LOCKDEP enabled, sched_rt_runtime_us = -1
> and a while(1) loop running at FIFO for some time (I also set sysctl
> kernel.hung_task_timeout_secs=1 to speed up reproduction).
>
> Looks like check_hung_uninterruptible_tasks() requires some care as you
> did already in linux-next for panic, rcu and lockdep ("Make emergency
> sections ...")?

Yes, that probably is a good candidate for emergency mode.

However, your report is also identifying a real issue:
nbcon_cpu_emergency_flush() was implemented to be callable from
non-emergency contexts (in which case it should do nothing). However, in
order to check if it is an emergency context, migration needs to be
disabled.

Perhaps the below change can be made for v2 of this series?

John

diff --git a/kernel/printk/nbcon.c b/kernel/printk/nbcon.c
index 4b9645e7ed70..eeaf8465f492 100644
--- a/kernel/printk/nbcon.c
+++ b/kernel/printk/nbcon.c
@@ -1581,8 +1581,19 @@ void nbcon_cpu_emergency_exit(void)
  */
 void nbcon_cpu_emergency_flush(void)
 {
+	bool is_emergency;
+
+	/*
+	 * If the current context is not an emergency context, preemption
+	 * might be enabled. To be sure, disable preemption when checking
+	 * if this is an emergency context.
+	 */
+	preempt_disable();
+	is_emergency = (*nbcon_get_cpu_emergency_nesting() != 0);
+	preempt_enable();
+
 	/* The explicit flush is needed only in the emergency context. */
-	if (*(nbcon_get_cpu_emergency_nesting()) == 0)
+	if (!is_emergency)
 		return;

 	nbcon_atomic_flush_pending();
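
For readers who want to see the pattern outside the kernel, here is a
rough userspace model of the change above. It is only a sketch: every
model_* function and every variable in it is a hypothetical stand-in
invented for illustration, not a kernel API. It assumes that the
DEBUG_PREEMPT check can be approximated by counting a warning whenever
the per-CPU lookup happens with a preempt count of zero.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-ins for kernel state; none of this is kernel API. */
int current_cpu;                 /* CPU the modeled task is running on */
int preempt_count;               /* > 0 means preemption is disabled */
int debug_preempt_warnings;      /* number of simulated DEBUG_PREEMPT splats */
int flushes;                     /* number of simulated atomic flushes */
unsigned int emergency_nesting[MODEL_NR_CPUS];

void model_preempt_disable(void)
{
	preempt_count++;
}

void model_preempt_enable(void)
{
	assert(preempt_count > 0);
	preempt_count--;
}

/* Models smp_processor_id() under DEBUG_PREEMPT: warn if preemptible. */
int model_smp_processor_id(void)
{
	if (preempt_count == 0)
		debug_preempt_warnings++;
	return current_cpu;
}

/* Models the per-CPU lookup of the emergency nesting counter. */
unsigned int *model_get_cpu_emergency_nesting(void)
{
	return &emergency_nesting[model_smp_processor_id()];
}

/* Shape of the code before the patch: lookup while possibly preemptible. */
void model_flush_unpatched(void)
{
	if (*model_get_cpu_emergency_nesting() == 0)
		return;
	flushes++;
}

/* Shape of the code after the patch: take a snapshot with preemption off. */
void model_flush_patched(void)
{
	bool is_emergency;

	model_preempt_disable();
	is_emergency = (*model_get_cpu_emergency_nesting() != 0);
	model_preempt_enable();

	/* The flush is needed only in the emergency context. */
	if (!is_emergency)
		return;
	flushes++;
}
```

In this model, calling model_flush_unpatched() from a preemptible
context bumps debug_preempt_warnings, while model_flush_patched() does
not. The snapshot is presumably safe to use after preemption is
re-enabled because a nonzero nesting count is only expected while the
caller is already inside an emergency section.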