From: Peter Zijlstra <peterz@infradead.org>
To: Ingo Molnar <mingo@elte.hu>
Cc: Arne Jansen <lists@die-jansens.de>,
Linus Torvalds <torvalds@linux-foundation.org>,
mingo@redhat.com, hpa@zytor.com, linux-kernel@vger.kernel.org,
efault@gmx.de, npiggin@kernel.dk, akpm@linux-foundation.org,
frank.rowand@am.sony.com, tglx@linutronix.de,
linux-tip-commits@vger.kernel.org
Subject: Re: [debug patch] printk: Add a printk killswitch to robustify NMI watchdog messages
Date: Mon, 06 Jun 2011 18:44:09 +0200
Message-ID: <1307378649.2322.198.camel@twins>
In-Reply-To: <20110606161749.GA22157@elte.hu>
On Mon, 2011-06-06 at 18:17 +0200, Ingo Molnar wrote:
> * Peter Zijlstra <peterz@infradead.org> wrote:
>
> > On Mon, 2011-06-06 at 18:08 +0200, Ingo Molnar wrote:
> > > * Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > > On Mon, 2011-06-06 at 17:52 +0200, Ingo Molnar wrote:
> > > > > * Peter Zijlstra <peterz@infradead.org> wrote:
> > > > >
> > > > > > Needs more staring at, preferably by someone who actually
> > > > > > understands that horrid mess :/ Also, this all still doesn't make
> > > > > > printk() work reliably while holding rq->lock.
> > > > >
> > > > > So, what about my suggestion to just *remove* the wakeup from there
> > > > > and use the deferred wakeup mechanism that klogd uses.
> > > > >
> > > > > That would make printk() *visibly* more robust in practice.
> > > >
> > > > That's currently done from the jiffy tick, do you want to effectively
> > > > delay releasing the console_sem for the better part of a jiffy?
> > >
> > > Yes, and we already do it in some other circumstances.
> >
> > We do?
>
> Yes, see the whole printk_pending logic, it delays:
>
> wake_up_interruptible(&log_wait);
>
> to the next jiffies tick.
Again, that's log_wait, not console_sem. So "we already delay
releasing console_sem" isn't true.
> > > Can you see
> > > any problem with that? klogd is an utter slowpath anyway.
> >
> > but console_sem isn't klogd. We delay klogd and that's perfectly
> > fine, but afaict we don't delay console_sem.
>
> But console_sem is really a similar special case as klogd. See, it's
> about a *printk*. That's rare by definition.
But it's not rare; it's _the_ lock that serializes the whole console
layer. Pretty much everything a console does goes through that lock.
Delaying its release by up to 10ms (CONFIG_HZ=100) per printk could
really delay the whole boot process.
> If someone on the console sees it he'll be startled by at least 10
> msecs ;-) So delaying the wakeup to the next jiffy really fits into
> the same approach as we already do with &log_wait, hm?
Not convinced yet, I mean, don't get me wrong, I'd love to rid us of the
thing, but I'm not sure delaying the release of a resource like this is
the right approach.
Ahh, what we could do is something like the below and delay both the
acquire and release of the console_sem.
---
kernel/printk.c | 86 +++++++++++++++++++++++++-----------------------------
1 files changed, 40 insertions(+), 46 deletions(-)
diff --git a/kernel/printk.c b/kernel/printk.c
index 3518539..d3bdf5a 100644
--- a/kernel/printk.c
+++ b/kernel/printk.c
@@ -686,6 +686,7 @@ static void zap_locks(void)
oops_timestamp = jiffies;
+ debug_locks_off();
/* If a crash is occurring, make sure we can't deadlock */
spin_lock_init(&logbuf_lock);
/* And make sure that we print immediately */
@@ -774,16 +775,13 @@ static inline int can_use_console(unsigned int cpu)
* messages from a 'printk'. Return true (and with the
* console_lock held, and 'console_locked' set) if it
* is successful, false otherwise.
- *
- * This gets called with the 'logbuf_lock' spinlock held and
- * interrupts disabled. It should return with 'lockbuf_lock'
- * released but interrupts still disabled.
*/
static int console_trylock_for_printk(unsigned int cpu)
__releases(&logbuf_lock)
{
int retval = 0;
+ spin_lock(&logbuf_lock);
if (console_trylock()) {
retval = 1;
@@ -803,12 +801,27 @@ static int console_trylock_for_printk(unsigned int cpu)
spin_unlock(&logbuf_lock);
return retval;
}
+
static const char recursion_bug_msg [] =
KERN_CRIT "BUG: recent printk recursion!\n";
static int recursion_bug;
static int new_text_line = 1;
static char printk_buf[1024];
+static DEFINE_PER_CPU(int, printk_pending);
+
+int printk_needs_cpu(int cpu)
+{
+ if (cpu_is_offline(cpu))
+ printk_tick();
+ return __this_cpu_read(printk_pending);
+}
+
+void printk_set_pending(void)
+{
+ this_cpu_write(printk_pending, 1);
+}
+
int printk_delay_msec __read_mostly;
static inline void printk_delay(void)
@@ -836,9 +849,8 @@ asmlinkage int vprintk(const char *fmt, va_list args)
boot_delay_msec();
printk_delay();
- preempt_disable();
/* This stops the holder of console_sem just where we want him */
- raw_local_irq_save(flags);
+ local_irq_save(flags);
this_cpu = smp_processor_id();
/*
@@ -859,7 +871,6 @@ asmlinkage int vprintk(const char *fmt, va_list args)
zap_locks();
}
- lockdep_off();
spin_lock(&logbuf_lock);
printk_cpu = this_cpu;
@@ -942,25 +953,13 @@ asmlinkage int vprintk(const char *fmt, va_list args)
if (*p == '\n')
new_text_line = 1;
}
+ spin_unlock(&logbuf_lock);
- /*
- * Try to acquire and then immediately release the
- * console semaphore. The release will do all the
- * actual magic (print out buffers, wake up klogd,
- * etc).
- *
- * The console_trylock_for_printk() function
- * will release 'logbuf_lock' regardless of whether it
- * actually gets the semaphore or not.
- */
- if (console_trylock_for_printk(this_cpu))
- console_unlock();
+ printk_set_pending();
- lockdep_on();
out_restore_irqs:
- raw_local_irq_restore(flags);
+ local_irq_restore(flags);
- preempt_enable();
return printed_len;
}
EXPORT_SYMBOL(printk);
@@ -1201,29 +1200,6 @@ int is_console_locked(void)
return console_locked;
}
-static DEFINE_PER_CPU(int, printk_pending);
-
-void printk_tick(void)
-{
- if (__this_cpu_read(printk_pending)) {
- __this_cpu_write(printk_pending, 0);
- wake_up_interruptible(&log_wait);
- }
-}
-
-int printk_needs_cpu(int cpu)
-{
- if (cpu_is_offline(cpu))
- printk_tick();
- return __this_cpu_read(printk_pending);
-}
-
-void wake_up_klogd(void)
-{
- if (waitqueue_active(&log_wait))
- this_cpu_write(printk_pending, 1);
-}
-
/**
* console_unlock - unlock the console system
*
@@ -1273,11 +1249,29 @@ void console_unlock(void)
up(&console_sem);
spin_unlock_irqrestore(&logbuf_lock, flags);
+
if (wake_klogd)
- wake_up_klogd();
+ wake_up_interruptible(&log_wait);
}
EXPORT_SYMBOL(console_unlock);
+void printk_tick(void)
+{
+ if (!__this_cpu_read(printk_pending))
+ return;
+
+ /*
+ * Try to acquire and then immediately release the
+ * console semaphore. The release will do all the
+ * actual magic (print out buffers, wake up klogd,
+ * etc).
+ */
+ if (console_trylock_for_printk(smp_processor_id())) {
+ console_unlock();
+ __this_cpu_write(printk_pending, 0);
+ }
+}
+
/**
* console_conditional_schedule - yield the CPU if required
*