linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Willy Tarreau <w@1wt.eu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Antonio SJ Musumeci <trapexit@spawn.link>,
	Miklos Szeredi <miklos@szeredi.hu>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	stable <stable@vger.kernel.org>
Subject: Re: BUG_ON() in workingset_node_shadows_dec() triggers
Date: Wed, 5 Oct 2016 21:06:04 +0200	[thread overview]
Message-ID: <20161005190604.GA8116@1wt.eu> (raw)
In-Reply-To: <CA+55aFxeamjJmckjA_n_BMwpGKkxd51Tint0z7E9CDHAnRZJAw@mail.gmail.com>

On Wed, Oct 05, 2016 at 08:52:54AM -0700, Linus Torvalds wrote:
> On Tue, Oct 4, 2016 at 10:44 PM, Willy Tarreau <w@1wt.eu> wrote:
> >
> > I think instead we should completely remove any simple way to halt the
> > system and document how to do it.
> 
> Having slept on it, I suspect you're right. I worry about some
> BUG_ON() that really relies on the killing behavior, but if it takes a
> "real" fault later, that is when it gets killed. And on the whole,
> we've had lots of problems with the killing behavior over the years,
> so we should just try switching BUG_ON() over to non-fatal. It's
> unlikely to be worse than what we have now, as exemplified by this
> event.

I have the same doubts, so at least I would not want to run the "sed"
immediately, at least to keep the initial intent. But I think everyone
is right in is own yard when he puts a BUG_ON() when he doesn't know
how to handle an unsafe situation, he's wrong from a global perspective.

For example, it could be seen as safe to crash the system in a filesystem
driver to protect against the risk of data corruption resulting from an
impossible condition, but when this happens due to a dirty FS on a USB
stick that a person inserts on the PC to save her work, actually the
BUG_ON() is the one responsible for the data loss. Even something as
painful as leaving a process in D state in this situation would have
been cleaner as it would let the admin reboot when he wants and not
have to experience it at the worst moment.

I've already met 100% reproducible panics that I never had the time to
inestigate (one involving running an mmap-based hex editor on /dev/mem,
and the other one doing stupid things with mount --move), and I'm sure
once I find the cause I'll see a BUG_ON() that should have been a warning.

I'm pretty sure there are historically valid BUG_ON() that are probably
not needed anymore just like I'm also convinced that some of them are
hard to get rid of. Maybe at least having the same as WARN_ON() but
prepending the dump with a message saying "you encountered a critical
bug which should have crashed the kernel, you must absolutely report it"
would help at the beginning.

Cheers,
Willy

  reply	other threads:[~2016-10-05 19:06 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-04  4:00 BUG_ON() in workingset_node_shadows_dec() triggers Linus Torvalds
2016-10-04  4:07 ` Andrew Morton
2016-10-04  4:12   ` Linus Torvalds
2016-10-04  7:03     ` Raymond Jennings
2016-10-04 16:03       ` Linus Torvalds
2016-10-04  8:02 ` Greg KH
2016-10-04  9:32 ` Johannes Weiner
2016-10-05  1:21   ` Linus Torvalds
2016-10-05  9:25     ` Johannes Weiner
2016-10-05  9:31       ` Johannes Weiner
2016-10-05 10:40       ` Jan Kara
2016-10-05 16:10       ` Linus Torvalds
2016-10-05 17:00         ` [PATCH] checkpatch: extend BUG warning Joe Perches
2016-10-05 17:07           ` Linus Torvalds
2016-10-05  2:43 ` BUG_ON() in workingset_node_shadows_dec() triggers Paul Gortmaker
2016-10-05  3:29   ` Linus Torvalds
2016-10-05  5:44     ` Willy Tarreau
2016-10-05 15:52       ` Linus Torvalds
2016-10-05 19:06         ` Willy Tarreau [this message]
2016-10-05 19:18           ` Linus Torvalds
2016-10-05 21:09             ` Willy Tarreau
2016-10-05 21:14             ` Kees Cook
2016-10-05 21:46               ` Linus Torvalds
2016-10-05 22:17                 ` Kees Cook
2016-10-05 22:29                   ` Linus Torvalds
2016-10-06 22:07                     ` Kees Cook
2016-10-06 22:29                       ` Linus Torvalds
2016-10-06 23:05                         ` Kees Cook
2016-10-06 23:59                           ` Linus Torvalds
2016-10-07  5:48                             ` Willy Tarreau
2016-10-07 17:16                               ` Kees Cook
2016-10-07 17:21                                 ` Linus Torvalds
2016-10-07 17:33                                   ` Kees Cook
2016-10-07 18:26                                     ` Willy Tarreau
2016-10-06  1:59     ` Dave Chinner
2016-10-06  2:12       ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20161005190604.GA8116@1wt.eu \
    --to=w@1wt.eu \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=paul.gortmaker@windriver.com \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=trapexit@spawn.link \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).