public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts
@ 2014-04-17 11:36 Will Deacon
  2014-04-17 11:36 ` [RFC PATCH 2/2] printk: report dropping of messages from logbuf Will Deacon
  2014-04-25 13:29 ` [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts Will Deacon
  0 siblings, 2 replies; 5+ messages in thread
From: Will Deacon @ 2014-04-17 11:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Will Deacon, Andrew Morton, Kay Sievers, Peter Zijlstra

When running on a hideously slow system (~10Mhz FPGA) with a bunch of
debug printk invocations on the timer interrupt path, we end up filling
the log buffer faster than we can drain it.

The reason is that console_unlock (which is responsible for moving
messages out of logbuf to hand over to the console driver) removes one
message at a time, briefly re-enabling interrupts between each of them.
If the interrupt path prints more than a single message, then we can
easily generate more messages than we can print for a regular, recurring
interrupt (e.g. a 1khz timer). This results in messages getting silently
dropped, leading to counter-intuitive, incomplete printk traces on the
console.

Rather than run the console_unlock loop with interrupts disabled (which
has obvious latency problems), this patch records the sequence number of
the last message in the log buffer after taking the logbuf_lock. We can
then print this fixed amount of work before re-enabling interrupts again,
making sure we keep up with ourself. Other CPUs could still potentially
flood the buffer, but there's little that we can do to protect against
that.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 kernel/printk/printk.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index a45b50962295..721a7d8fb853 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2033,10 +2033,13 @@ void console_unlock(void)
 again:
 	for (;;) {
 		struct printk_log *msg;
+		u64 console_end_seq;
 		size_t len;
 		int level;
 
 		raw_spin_lock_irqsave(&logbuf_lock, flags);
+		console_end_seq = log_next_seq;
+again_noirq:
 		if (seen_seq != log_next_seq) {
 			wake_klogd = true;
 			seen_seq = log_next_seq;
@@ -2081,6 +2084,12 @@ skip:
 		stop_critical_timings();	/* don't trace print latency */
 		call_console_drivers(level, text, len);
 		start_critical_timings();
+
+		if (console_seq < console_end_seq) {
+			raw_spin_lock(&logbuf_lock);
+			goto again_noirq;
+		}
+
 		local_irq_restore(flags);
 	}
 	console_locked = 0;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH 2/2] printk: report dropping of messages from logbuf
  2014-04-17 11:36 [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts Will Deacon
@ 2014-04-17 11:36 ` Will Deacon
  2014-04-25 13:29 ` [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts Will Deacon
  1 sibling, 0 replies; 5+ messages in thread
From: Will Deacon @ 2014-04-17 11:36 UTC (permalink / raw)
  To: linux-kernel; +Cc: Will Deacon, Andrew Morton, Kay Sievers, Peter Zijlstra

If the log ring buffer becomes full, we silently overwrite old messages
with new data. console_unlock will detect this case and fast-forward the
console_* pointers to skip over the corrupted data, but nothing will be
reported to the user.

This patch hijacks the first valid log message after detecting that we
dropped messages and prefixes it with a note detailing how many messages
were dropped. For long (~1000 char) messages, this will result in some
truncation of the real message, but given that we're dropping things
anyway, that doesn't seem to be the end of the world.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Kay Sievers <kay@vrfy.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 kernel/printk/printk.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 721a7d8fb853..e9e83d627ff0 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -2046,10 +2046,15 @@ again_noirq:
 		}
 
 		if (console_seq < log_first_seq) {
+			len = sprintf(text, "** %u printk messages dropped ** ",
+				      (unsigned)(log_first_seq - console_seq));
+
 			/* messages are gone, move to first one */
 			console_seq = log_first_seq;
 			console_idx = log_first_idx;
 			console_prev = 0;
+		} else {
+			len = 0;
 		}
 skip:
 		if (console_seq == log_next_seq)
@@ -2074,8 +2079,8 @@ skip:
 		}
 
 		level = msg->level;
-		len = msg_print_text(msg, console_prev, false,
-				     text, sizeof(text));
+		len += msg_print_text(msg, console_prev, false,
+				      text + len, sizeof(text) - len);
 		console_idx = log_next(console_idx);
 		console_seq++;
 		console_prev = msg->flags;
-- 
1.9.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts
  2014-04-17 11:36 [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts Will Deacon
  2014-04-17 11:36 ` [RFC PATCH 2/2] printk: report dropping of messages from logbuf Will Deacon
@ 2014-04-25 13:29 ` Will Deacon
  2014-04-25 14:01   ` Peter Zijlstra
  1 sibling, 1 reply; 5+ messages in thread
From: Will Deacon @ 2014-04-25 13:29 UTC (permalink / raw)
  To: linux-kernel@vger.kernel.org; +Cc: Andrew Morton, Kay Sievers, Peter Zijlstra

Hi all,

On Thu, Apr 17, 2014 at 12:36:09PM +0100, Will Deacon wrote:
> When running on a hideously slow system (~10Mhz FPGA) with a bunch of
> debug printk invocations on the timer interrupt path, we end up filling
> the log buffer faster than we can drain it.
> 
> The reason is that console_unlock (which is responsible for moving
> messages out of logbuf to hand over to the console driver) removes one
> message at a time, briefly re-enabling interrupts between each of them.
> If the interrupt path prints more than a single message, then we can
> easily generate more messages than we can print for a regular, recurring
> interrupt (e.g. a 1khz timer). This results in messages getting silently
> dropped, leading to counter-intuitive, incomplete printk traces on the
> console.
> 
> Rather than run the console_unlock loop with interrupts disabled (which
> has obvious latency problems), this patch records the sequence number of
> the last message in the log buffer after taking the logbuf_lock. We can
> then print this fixed amount of work before re-enabling interrupts again,
> making sure we keep up with ourself. Other CPUs could still potentially
> flood the buffer, but there's little that we can do to protect against
> that.

Any thoughts on these two patches? I can understand the reluctance to make
changes to printk, but I had a horrible time debugging timers without these
patches!

Cheers,

Will

> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Kay Sievers <kay@vrfy.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  kernel/printk/printk.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index a45b50962295..721a7d8fb853 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -2033,10 +2033,13 @@ void console_unlock(void)
>  again:
>  	for (;;) {
>  		struct printk_log *msg;
> +		u64 console_end_seq;
>  		size_t len;
>  		int level;
>  
>  		raw_spin_lock_irqsave(&logbuf_lock, flags);
> +		console_end_seq = log_next_seq;
> +again_noirq:
>  		if (seen_seq != log_next_seq) {
>  			wake_klogd = true;
>  			seen_seq = log_next_seq;
> @@ -2081,6 +2084,12 @@ skip:
>  		stop_critical_timings();	/* don't trace print latency */
>  		call_console_drivers(level, text, len);
>  		start_critical_timings();
> +
> +		if (console_seq < console_end_seq) {
> +			raw_spin_lock(&logbuf_lock);
> +			goto again_noirq;
> +		}
> +
>  		local_irq_restore(flags);
>  	}
>  	console_locked = 0;
> -- 
> 1.9.1
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts
  2014-04-25 13:29 ` [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts Will Deacon
@ 2014-04-25 14:01   ` Peter Zijlstra
  2014-04-25 14:20     ` Will Deacon
  0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2014-04-25 14:01 UTC (permalink / raw)
  To: Will Deacon; +Cc: linux-kernel@vger.kernel.org, Andrew Morton, Kay Sievers

On Fri, Apr 25, 2014 at 02:29:37PM +0100, Will Deacon wrote:
> Hi all,
> 
> On Thu, Apr 17, 2014 at 12:36:09PM +0100, Will Deacon wrote:
> > When running on a hideously slow system (~10Mhz FPGA) with a bunch of

Hey, still faster then the 4.77 MHz 8088 chips I started with :-)

> > debug printk invocations on the timer interrupt path, we end up filling
> > the log buffer faster than we can drain it.
> > 
> > The reason is that console_unlock (which is responsible for moving
> > messages out of logbuf to hand over to the console driver) removes one
> > message at a time, briefly re-enabling interrupts between each of them.
> > If the interrupt path prints more than a single message, then we can
> > easily generate more messages than we can print for a regular, recurring
> > interrupt (e.g. a 1khz timer). This results in messages getting silently
> > dropped, leading to counter-intuitive, incomplete printk traces on the
> > console.
> > 
> > Rather than run the console_unlock loop with interrupts disabled (which
> > has obvious latency problems), this patch records the sequence number of
> > the last message in the log buffer after taking the logbuf_lock. We can
> > then print this fixed amount of work before re-enabling interrupts again,
> > making sure we keep up with ourself. Other CPUs could still potentially
> > flood the buffer, but there's little that we can do to protect against
> > that.
> 
> Any thoughts on these two patches? I can understand the reluctance to make
> changes to printk, but I had a horrible time debugging timers without these
> patches!

They look fine to me.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts
  2014-04-25 14:01   ` Peter Zijlstra
@ 2014-04-25 14:20     ` Will Deacon
  0 siblings, 0 replies; 5+ messages in thread
From: Will Deacon @ 2014-04-25 14:20 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: linux-kernel@vger.kernel.org, Andrew Morton, Kay Sievers

On Fri, Apr 25, 2014 at 03:01:20PM +0100, Peter Zijlstra wrote:
> On Fri, Apr 25, 2014 at 02:29:37PM +0100, Will Deacon wrote:
> > On Thu, Apr 17, 2014 at 12:36:09PM +0100, Will Deacon wrote:
> > > When running on a hideously slow system (~10Mhz FPGA) with a bunch of
> 
> Hey, still faster then the 4.77 MHz 8088 chips I started with :-)

... but you didn't have to run KVM guests on that sucker!

> > > debug printk invocations on the timer interrupt path, we end up filling
> > > the log buffer faster than we can drain it.
> > > 
> > > The reason is that console_unlock (which is responsible for moving
> > > messages out of logbuf to hand over to the console driver) removes one
> > > message at a time, briefly re-enabling interrupts between each of them.
> > > If the interrupt path prints more than a single message, then we can
> > > easily generate more messages than we can print for a regular, recurring
> > > interrupt (e.g. a 1khz timer). This results in messages getting silently
> > > dropped, leading to counter-intuitive, incomplete printk traces on the
> > > console.
> > > 
> > > Rather than run the console_unlock loop with interrupts disabled (which
> > > has obvious latency problems), this patch records the sequence number of
> > > the last message in the log buffer after taking the logbuf_lock. We can
> > > then print this fixed amount of work before re-enabling interrupts again,
> > > making sure we keep up with ourself. Other CPUs could still potentially
> > > flood the buffer, but there's little that we can do to protect against
> > > that.
> > 
> > Any thoughts on these two patches? I can understand the reluctance to make
> > changes to printk, but I had a horrible time debugging timers without these
> > patches!
> 
> They look fine to me.

Thanks, Peter. Andrew, would you mind taking these please? They look a bit
out of place in the arm64 tree.

I can repost them if necessary.

Cheers,

Will

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2014-04-25 14:21 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2014-04-17 11:36 [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts Will Deacon
2014-04-17 11:36 ` [RFC PATCH 2/2] printk: report dropping of messages from logbuf Will Deacon
2014-04-25 13:29 ` [RFC PATCH 1/2] printk: print initial logbuf contents before re-enabling interrupts Will Deacon
2014-04-25 14:01   ` Peter Zijlstra
2014-04-25 14:20     ` Will Deacon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox