From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: Re: [PATCH] xfs: prevent NMI timeouts in cmn_err
Date: Fri, 3 Dec 2010 15:38:46 +1100
Message-ID: <20101203043846.GB23339@dastard>
In-Reply-To: <1291341315-31338-1-git-send-email-david@fromorbit.com>

On Fri, Dec 03, 2010 at 12:55:15PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> We currently have a global error message buffer in cmn_err that is
> protected by a spin lock that disables interrupts.  Recently there
> have been reports of NMI timeouts occurring when the console is
> being flooded by SCSI error reports due to cmn_err() getting stuck
> trying to print to the console while holding this lock (i.e. with
> interrupts disabled). The NMI watchdog is seeing this CPU as
> non-responding and so is triggering a panic.  While the trigger for
> the reported case is SCSI errors, pretty much anything that spams
> the kernel log could cause this to occur.
> 
> Realistically the only reason that we have the intermediate message
> buffer is to prepend the correct kernel log level prefix to the log
> message. The only reason we have the lock is to protect the global
> message buffer and the only reason the message buffer is global is
> to keep it off the stack. Hence if we can avoid needing a global
> message buffer we avoid needing the lock, and we can do this with a
> small amount of cleanup and some preprocessor tricks:
> 
> 	1. clean up xfs_cmn_err() panic mask functionality to avoid
> 	   needing debug code in xfs_cmn_err()
> 	2. remove the couple of "!" message prefixes that still exist that
> 	   the existing cmn_err() code steps over.
> 	3. redefine CE_* levels directly to KERN_*
> 	4. redefine cmn_err() and friends to use printk() directly
> 	   via variable argument length macros.
> 
> By doing this, we can completely remove the cmn_err() code and the
> lock that is causing the problems, and rely solely on printk()
> serialisation to ensure that we don't get garbled messages.
> 
> A series of followup patches is really needed to clean up all the
> cmn_err() calls and related messages properly, but that results in a
> series that is not easily back portable to enterprise kernels. Hence
> this initial fix is only to address the direct problem in the lowest
> impact way possible.
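
For anyone reading along, items 3 and 4 above can be sketched in user
space roughly as follows. This is an illustration only, not the patch
itself: printf() stands in for printk(), the KERN_* strings are local
stand-ins, and ##__VA_ARGS__ is a GNU C extension.

```c
#include <stdio.h>

/* Local stand-ins for the kernel's log level strings (illustrative values). */
#define KERN_ALERT	"<1>"
#define KERN_WARNING	"<4>"

/* Item 3: the CE_* levels become the KERN_* prefixes themselves. */
#define CE_ALERT	KERN_ALERT
#define CE_WARN		KERN_WARNING

/* User-space stand-in for printk() so the sketch builds outside the kernel. */
#define printk(...)	printf(__VA_ARGS__)

/*
 * Item 4: cmn_err() pastes the level prefix onto the format string at
 * compile time, so no intermediate buffer (and hence no lock) is needed.
 * Both lvl and fmt must be string literals for the concatenation to work.
 */
#define cmn_err(lvl, fmt, ...) \
	printk(lvl fmt, ##__VA_ARGS__)
```

With these definitions, cmn_err(CE_WARN, "disk %d failed\n", 3) expands
to a single printk("<4>" "disk %d failed\n", 3) call.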

FWIW, while these macros are the best way to make a simple backport
possible, I just discovered that mainline has a %pV format specifier
that allows an implementation like:

void
xfs_fs_cmn_err(
	const char              *lvl,
	struct xfs_mount        *mp,
	const char              *fmt,
	...)
{
	struct va_format        vaf;
	va_list                 args;

	va_start(args, fmt);
	vaf.fmt = fmt;
	vaf.va = &args;

	printk("%sFilesystem %s: %pV", lvl, mp->m_fsname, &vaf);
	va_end(args);

	BUG_ON(strncmp(lvl, KERN_EMERG, strlen(KERN_EMERG)) == 0);
}

Would this be a preferable method for replacing the existing
implementations, or are the macros good enough as the first step of
a mainline cleanup?
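
To experiment with the calling convention outside the kernel, a rough
user-space analogue might look like the following. It is a sketch under
assumptions: %pV is kernel-only, so the nested format is expanded with
vsnprintf() into a caller-supplied buffer instead; struct fake_mount and
fs_cmn_err_buf() are made-up names, and the BUG_ON() check is omitted.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical stand-in for struct xfs_mount; only the name is needed here. */
struct fake_mount {
	const char	*m_fsname;
};

/*
 * Format "<lvl>Filesystem <name>: <message>" into buf.
 * Returns the length the full message would have, like snprintf().
 */
static int
fs_cmn_err_buf(
	char			*buf,
	size_t			len,
	const char		*lvl,
	const struct fake_mount	*mp,
	const char		*fmt,
	...)
{
	va_list			args;
	int			prefix, body;

	/* Level prefix and filesystem name first. */
	prefix = snprintf(buf, len, "%sFilesystem %s: ", lvl, mp->m_fsname);
	if (prefix < 0)
		return prefix;

	/* Then the caller's format; this is the step %pV does inside printk(). */
	va_start(args, fmt);
	if ((size_t)prefix < len)
		body = vsnprintf(buf + prefix, len - prefix, fmt, args);
	else
		body = vsnprintf(NULL, 0, fmt, args);
	va_end(args);

	return body < 0 ? body : prefix + body;
}
```

A caller would pass the level string, the mount and a printf-style
format, e.g. fs_cmn_err_buf(buf, sizeof(buf), "<4>", &mp, "inode %d is bad", 42).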

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

