linux-ext4.vger.kernel.org archive mirror
* [PATCH] ext4: Rate limit printk in buffer_io_error()
@ 2013-07-09 23:01 Anatol Pomazau
  2013-07-12  2:44 ` Theodore Ts'o
From: Anatol Pomazau @ 2013-07-09 23:01 UTC (permalink / raw)
  To: linux-ext4; +Cc: tytso, Anatol Pomozov

From: Anatol Pomozov <anatol.pomozov@gmail.com>

If there are a lot of outstanding buffered IOs when a device is
taken offline (due to hardware errors, etc.), ext4_end_bio() prints
out a message for each failed logical block. While this is desirable,
we see thousands of such lines being printed out before the
serial console gets overwhelmed, causing ext4_end_bio() to wait for
the printk to complete.

This in itself isn't a disaster, except for the detail that this
function is being called with the queue lock held. This causes any
other function in the block layer to spin on its spin_lock_irqsave()
while the serial console is draining. If the NMI watchdog is enabled
on this machine, it eventually comes along and shoots the machine in
the head.

The end result is that losing any one disk causes the machine to
go down. This patch rate limits the printk to bandaid around the
problem.

Tested: xfstests
Change-Id: I8ab5690dcf4f3a67e78be147d45e489fdf4a88d8
Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>
---
 fs/ext4/page-io.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/page-io.c b/fs/ext4/page-io.c
index 4acf1f7..4c9d5e7 100644
--- a/fs/ext4/page-io.c
+++ b/fs/ext4/page-io.c
@@ -25,6 +25,7 @@
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <linux/mm.h>
+#include <linux/ratelimit.h>
 
 #include "ext4_jbd2.h"
 #include "xattr.h"
@@ -214,7 +215,7 @@ ext4_io_end_t *ext4_init_io_end(struct inode *inode, gfp_t flags)
 static void buffer_io_error(struct buffer_head *bh)
 {
 	char b[BDEVNAME_SIZE];
-	printk(KERN_ERR "Buffer I/O error on device %s, logical block %llu\n",
+	printk_ratelimited(KERN_ERR "Buffer I/O error on device %s, logical block %llu\n",
 			bdevname(bh->b_bdev, b),
 			(unsigned long long)bh->b_blocknr);
 }
-- 
1.8.3
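
For readers who want to see the effect of the change outside the kernel,
here is a minimal userspace sketch of interval/burst rate limiting. It is
not the kernel's implementation: struct ratelimit, ratelimit_ok() and
report_failed_blocks() are hypothetical names made up for illustration,
and time is passed in as a plain tick counter. The values used in the
usage note (a window of 5 ticks, a burst of 10) mirror what I understand
the kernel defaults to be (DEFAULT_RATELIMIT_INTERVAL of 5 * HZ and
DEFAULT_RATELIMIT_BURST of 10). The kernel additionally reports how many
callbacks were suppressed, which this sketch only counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a rate limit state. */
struct ratelimit {
	long interval;	/* window length, in caller-defined ticks */
	int burst;	/* messages allowed per window */
	long begin;	/* tick at which the current window started */
	int printed;	/* messages emitted in the current window */
	int missed;	/* messages suppressed in the current window */
};

/* Return true if the caller may print a message at tick 'now'. */
static bool ratelimit_ok(struct ratelimit *rs, long now)
{
	assert(rs->interval > 0 && rs->burst > 0);

	if (now - rs->begin >= rs->interval) {
		/* The window has expired: start a fresh one. */
		rs->begin = now;
		rs->printed = 0;
		rs->missed = 0;
	}

	if (rs->printed < rs->burst) {
		rs->printed++;
		return true;
	}

	rs->missed++;
	return false;
}

/*
 * Simulate 'n' failed logical blocks all completing at tick 'now'.
 * Returns the number of messages that were actually logged.
 */
static int report_failed_blocks(struct ratelimit *rs, long now, int n)
{
	int logged = 0;

	for (int i = 0; i < n; i++) {
		if (ratelimit_ok(rs, now)) {
			fprintf(stderr,
				"Buffer I/O error, logical block %d\n", i);
			logged++;
		}
	}
	return logged;
}
```

With a state initialised as { .interval = 5, .burst = 10 }, a burst of
1000 failures at tick 0 logs only 10 lines, and logging resumes once the
window rolls over at tick 5. The point is that the time spent in the
console path per window is bounded, no matter how many blocks fail.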



* Re: ext4: Rate limit printk in buffer_io_error()
  2013-07-09 23:01 [PATCH] ext4: Rate limit printk in buffer_io_error() Anatol Pomazau
@ 2013-07-12  2:44 ` Theodore Ts'o
From: Theodore Ts'o @ 2013-07-12  2:44 UTC (permalink / raw)
  To: Anatol Pomazau; +Cc: linux-ext4, Anatol Pomozov

On Tue, Jul 09, 2013 at 04:01:38PM -0700, Anatol Pomazau wrote:
> From: Anatol Pomozov <anatol.pomozov@gmail.com>
> 
> If there are a lot of outstanding buffered IOs when a device is
> taken offline (due to hardware errors etc), ext4_end_bio prints
> out a message for each failed logical block. While this is desirable,
> we see thousands of such lines being printed out before the
> serial console gets overwhelmed, causing ext4_end_bio() wait for
> the printk to complete.
> 
> This in itself isn't a disaster, except for the detail that this
> function is being called with the queue lock held.
> This causes any other function in the block layer
> to spin on its spin_lock_irqsave while the serial console is
> draining. If NMI watchdog is enabled on this machine then it
> eventually comes along and shoots the machine in the head.
> 
> The end result is that losing any one disk causes the machine to
> go down. This patch rate limits the printk to bandaid around the
> problem.
> 
> Tested: xfstests
> Change-Id: I8ab5690dcf4f3a67e78be147d45e489fdf4a88d8
> Signed-off-by: Anatol Pomozov <anatol.pomozov@gmail.com>

Thanks, applied.

					- Ted
