All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: David Howells <dhowells@redhat.com>
Cc: torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	dhowells@redhat.com
Subject: Re: [PATCH] PRINTK: Decouple input locking from output locking and move render outside lock
Date: Tue, 15 Apr 2008 16:26:06 -0700	[thread overview]
Message-ID: <20080415162606.50fcfaaa.akpm@linux-foundation.org> (raw)
In-Reply-To: <20080408180129.21772.9835.stgit@warthog.procyon.org.uk>

On Tue, 08 Apr 2008 19:01:29 +0100
David Howells <dhowells@redhat.com> wrote:

> Decouple the printk input locking from the output locking using a dual-tailed
> ring buffer, and move the render (vscnprintf) and processing (log level token
> inclusion and timestamp inclusion) outside of the input locked section.
> 
> 	+---------+
> 	|         |
> 	| vprintk |
> 	|         |
> 	+---------+                                   +-----------------+
> 	     |                                        |                 |
> 	     V                                   +--->| Syslog syscall  |
> 	+---------+                              |    |                 |
> 	|         |                              |    +-----------------+
> 	| Render  |                              |
> 	|         |                              |    +-----------------+
> 	+---------+      +-------------------+   |    |                 |
> 	     |           | Ring buffer       |   +--->| /proc/kmsg      |
> 	     V           |   +---------------+   |    |                 |
> 	+---------+      |   | syslog queue  |---+    +-----------------+
> 	|         |      |   +---------------+
> 	| Process |----->|                   |
> 	|         |      |   +---------------+
> 	+---------+      |   | console queue |---+    +-----------------+
> 	                 |   +---------------+   |    |                 |
> 	                 |                   |   +--->| Console drivers |
> 	                 +-------------------+        |                 |
> 	                                              +-----------------+
> 
> 
> New/renamed global variables:
> 
>  (*) printk_insertion_point is the point in the ring buffer at which data is
>      inserted.  It may only be moved if printk_insertion_lock is held.
> 
>  (*) log_extraction_point is the point in the ring buffer from which data is
>      read out by do_syslog().  It may only be moved if log_extraction_lock is
>      held.
> 
>  (*) con_extraction_point is the point in the ring buffer from which data is
>      read out to be fed to the console drivers.  It may only be moved if
>      console_sem is held.  No spinlock needs to be held whilst writing to the
>      console.
> 
>  (*) console_anytime is the number of registered drivers that have CON_ANYTIME
>      flagged.  This avoids the need to walk the console list regularly before
>      the CPU is officially online.
> 
> 
> This patch has the following effects:
> 
>  (1) vprintk() spends much less time within the insertion critical section.
> 
>  (2) vprintk() can be filling the buffer whilst do_syslog() and
>      release_console_sem are emptying it.  Memory barriers are employed to
>      prevent out-of-order access between the buffers and the pointers.
> 
>  (3) Instead of a single render buffer, vprintk() has two render buffers per
>      CPU.  This allows multiple CPUs to be rendering and processing their
>      rendered texts at the same time without the need for this to be within a
>      spinlock.
> 
>  (4) vprintk() first prints the string into the first buffer for that CPU, and
>      then post-processes it into the CPU's second buffer to make sure metadata
>      is present (log level and timestamp)
> 
>      As the rendering is done into a temporary buffer rather than a ring
>      buffer, the printed output is more likely to get truncated as there's less
>      room for metadata.  This can be alleviated by increasing the size of the
>      processing output buffer.
> 
>      These render buffers amount to 2KB * NR_CPUS in total instead of 1KB.
> 
>  (5) The timestamp inserted by vprintk() into a line is the time at which the
>      rendered line is processed, not the time it hits the buffer.  This means:
> 
>      (a) all those calls to do_div() and snprintf() are done outside the
>      	 critical section (except for recursion messages), and
> 
>      (b) lines from different CPUs may appear to be out of order (this can be
>      	 solved by throwing a lock around the processing section, a lock that's
>      	 not printk_insertion_lock).
> 
>  (6) A partial vprintk() on one CPU (ie: not ending in "\n") will have a
>      newline inserted after it if another CPU gets a line in first.
> 
>  (7) release_console_sem() does not need to take any spinlocks; console_sem is
>      probably sufficient locking for talking to console drivers.  Although the
>      function disables IRQs too, this may be unnecessary, provided the IRQ
>      handler doesn't try filling up the ring buffer without being able to empty
>      it because someone further up the stack holds console_sem.
> 
>  (8) vprintk() may have to discard messages under high load:
> 
>      (a) If the console queue can't be emptied fast enough, new messages are
>      	 discarded wholly or in part.  Possibly old messages should be
>      	 discarded instead.
> 
>      (b) If the syslog queue can't be emptied fast enough, then if vprintk()
>      	 can immediately get the appropriate lock, it will advance the
>      	 extraction point by at least 1024 characters, rounded up to the next
>      	 newline, thus discarding some old messages.  Failing that, new
>      	 messages will be discarded wholly or in part instead.
> 
> 
> Why this patch?
> ===============
> 
> We're seeing the kernel spit messages out about CPUs getting stuck on the
> console locks.  This appears to be due to I/O errors in a just-disconnected
> iSCSI block device causing lots and lots of printks.  The system on which this
> is being seen is a ppc64 system with spinlocks that yield to the hypervisor,
> which I suspect is exacerbating the situation.
> 
> This patch attempts to alleviate the situation by shortening the amount of time
> spent in the kernel logging critical sections by breaking up the locking and
> moving out stuff that can be done outside of the critical section.
> 
> 
> Testing
> =======
> 
> I've used my FRV test board (32-bit/BE/UP) firstly to step through the code and
> check the various values with GDB under various cases, and secondly to generate
> a sequence of numbered printks, which I've checked have all been correctly
> displayed, and correctly ordered.
> 
> I've also run this up on my x86_64 box with 2 CPUs (64-bit/LE/SMP), checking
> that it correctly handles output from FS-Cache and CacheFiles with full
> debugging output turned on, both in printing to tty and serial consoles and in
> extraction via syslog().
> 
> ...
> 
>  kernel/printk.c |  820 +++++++++++++++++++++++++++++++++++--------------------

geeze, that didn't do much to simplify the printk code :(

Wouldn't it be better to teach the offending iscsi code about
printk_ratelimit()?  Because I'd expect that even with this change,
undesirable things will still happen with (say) a serial console?


      reply	other threads:[~2008-04-15 23:27 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-04-08 18:01 [PATCH] PRINTK: Decouple input locking from output locking and move render outside lock David Howells
2008-04-15 23:26 ` Andrew Morton [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080415162606.50fcfaaa.akpm@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=dhowells@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.