Re: [PATCH RFC 1/2] coding-style.rst: document BUG() and WARN() rules ("do not crash the kernel")

linux-mm.kvack.org archive mirror

From: David Hildenbrand <david@redhat.com>
To: Dave Young <dyoung@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-doc@vger.kernel.org, kexec@lists.infradead.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	David Laight <David.Laight@aculab.com>,
	Jonathan Corbet <corbet@lwn.net>,
	Andy Whitcroft <apw@canonical.com>, Joe Perches <joe@perches.com>,
	Dwaipayan Ray <dwaipayanray1@gmail.com>,
	Lukas Bulwahn <lukas.bulwahn@gmail.com>,
	Baoquan He <bhe@redhat.com>, Vivek Goyal <vgoyal@redhat.com>,
	Stephen Johnston <sjohnsto@redhat.com>,
	Prarit Bhargava <prarit@redhat.com>
Subject: Re: [PATCH RFC 1/2] coding-style.rst: document BUG() and WARN() rules ("do not crash the kernel")
Date: Fri, 26 Aug 2022 19:02:06 +0200
Message-ID: <fe7aee66-d9f7-e472-a13f-e4c5aa176cca@redhat.com>
In-Reply-To: <CALu+AoThhou7z+JCyv44AxGWDLDt2b7h0W6wcKRsJyLvSR1iQA@mail.gmail.com>

On 26.08.22 03:43, Dave Young wrote:
> Hi David,
> 
> [Added more people in cc]
> 

Hi Dave,

thanks for your input!

[...]

>> Side note: especially with kdump() I feel like we might see much more
>> widespread use of panic_on_warn to be able to actually extract debug
>> information in a controlled manner -- for example on enterprise distros.
>> ... which would then make these systems more likely to crash, because
>> there is no way to distinguish a rather harmless warning from a severe
>> warning :/ . But let's see if some kdump() folks will share their
>> opinion as reply to the cover letter.
> 
> I can understand the intention of this patch, and I totally agree that
> BUG() should be used carefully, this is a good proposal if we can
> clearly define the standard about when to use BUG().  But I do have

Essentially, the general rule from Linus is "absolutely no new BUG_ON()
calls ever" -- but I think the consensus in that thread was that there
are corner cases when it comes to unavoidable data corruption/security
issues. And these are rare cases, not the usual case where we'd have
used BUG_ON()/VM_BUG_ON().
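
To illustrate the usual case, here is a minimal userspace sketch, not kernel
code. WARN_ON() below is a simplified stand-in for the kernel macro, and
struct item and item_put() are hypothetical names made up for this example.
The idea is to report the inconsistency and bail out with an error instead
of taking the machine down:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Simplified stand-in for the kernel's WARN_ON(): print a message when the
 * condition is true and return the condition so the caller can branch on it.
 */
static bool warn_on(bool cond, const char *expr, const char *file, int line)
{
	if (cond)
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	return cond;
}
#define WARN_ON(cond) warn_on((cond), #cond, __FILE__, __LINE__)

/* Hypothetical object with a reference count. */
struct item {
	int refcount;
};

/* Hypothetical helper: drop one reference; returns 0 or a negative errno. */
static int item_put(struct item *it)
{
	/*
	 * An underflow is a bug, but a recoverable one: warn and refuse the
	 * operation rather than crash, as BUG_ON(it->refcount <= 0) would.
	 */
	if (WARN_ON(it->refcount <= 0))
		return -EINVAL;

	it->refcount--;
	return 0;
}
```

A caller that hits the warning gets -EINVAL back and the object is left
untouched, so the system can keep running while the problem is reported.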

> some worries,  I think this standard is different for different sub
> components, it is not clear to me at least,  so this may introduce an
> unstable running kernel and cause troubles (eg. data corruption) with
> a WARN instead of a BUG. Probably it would be better to say "Do not
> WARN lightly, and do not hesitate to use BUG if it is really needed"?

Well, I don't make the rules, I document them and share them for general
awareness/comments :) Documenting this is valuable, because there seem
to be quite a few different opinions floating around in the community --
and I've been learning different rules from different people over the years.

> 
> About "panic_on_warn", it will depend on the admin/end user to set it,
> it is not a good idea for distribution to set it. It seems we are
> leaving it to end users to take the risk of a kernel panic even with
> all kernel WARN even if it is sometimes not necessary.

My question would be what we could add/improve to keep systems with
kdump armed running as expected for end users, that is most probably:

1) don't crash on harmless WARN() that can just be reported and the
   machine will continue running mostly fine without real issues.
2) crash on severe issues (previously BUG) such that we can properly
   capture a system dump via kdump. Then restart the machine.

Of course, once one would run into 2), one could try reproducing with
"panic_on_warn" to get a reasonable system dump. But I guess that's not
what enterprise customers expect.

One wild idea (in the cover letter) was to add something new that can be
configured by user space and that expresses that something is more
severe than just some warning that can be recovered easily, yet can
still be recovered to keep the system running to some degree. Whether
such an event triggers a panic or lets the system keep running would
then be configurable.
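
Purely as a sketch of what such a knob could look like (userspace C, not a
proposal for actual kernel code): panic_on_warn exists today, while
panic_on_severe, enum report_severity, struct report_policy and
report_should_panic() are all hypothetical names invented for this example.
It also assumes that panic_on_warn would keep covering every report:

```c
#include <stdbool.h>

/* Hypothetical severity levels for a report. */
enum report_severity {
	REPORT_WARN,	/* harmless, easily recovered: like WARN() today */
	REPORT_SEVERE,	/* recoverable to some degree, but serious */
};

/*
 * Knobs user space would configure. panic_on_warn mirrors the existing
 * setting; panic_on_severe is made up for illustration.
 */
struct report_policy {
	bool panic_on_warn;
	bool panic_on_severe;
};

/* Decide whether a report should panic (and thereby trigger kdump). */
static bool report_should_panic(const struct report_policy *p,
				enum report_severity sev)
{
	/* Assumption: panic_on_warn panics on any report, severe or not. */
	if (p->panic_on_warn)
		return true;

	/* Otherwise only severe reports panic, and only if configured. */
	return sev == REPORT_SEVERE && p->panic_on_severe;
}
```

With panic_on_warn off and panic_on_severe on, a kdump-armed system would
keep running on harmless warnings but still capture a dump on severe ones.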

John mentioned PANIC_ON().

What would be your expectation for kdump users under which conditions we
want to trigger kdump and when not?

Regarding panic_on_warn, how often do e.g., RHEL users observe warnings
that we're not able to catch during testing, such that "panic_on_warn"
would be a real no-go?

-- 
Thanks,

David / dhildenb

Thread overview: 19+ messages
2022-08-24 16:30 [PATCH RFC 0/2] coding-style.rst: document BUG() and WARN() rules David Hildenbrand
2022-08-24 16:30 ` [PATCH RFC 1/2] coding-style.rst: document BUG() and WARN() rules ("do not crash the kernel") David Hildenbrand
2022-08-24 21:59   ` John Hubbard
2022-08-25 12:12     ` David Hildenbrand
2022-08-26  1:43       ` Dave Young
2022-08-26 17:02         ` David Hildenbrand [this message]
2022-08-29  1:55           ` Dave Young
2022-08-29  3:07             ` Linus Torvalds
2022-08-29  4:49               ` John Hubbard
2022-08-29 17:19                 ` Linus Torvalds
2022-08-29  8:44               ` David Hildenbrand
2022-08-29  9:25               ` Jani Nikula
2022-08-24 16:31 ` [PATCH RFC 2/2] checkpatch: warn on usage of VM_BUG_ON() and friends David Hildenbrand
2022-08-24 16:52   ` Joe Perches
2022-08-24 19:00     ` David Hildenbrand
2022-08-25  9:58     ` David Hildenbrand
2022-08-25 11:43       ` Jani Nikula
2022-08-25 11:51         ` David Hildenbrand
2022-08-25  2:30 ` [PATCH RFC 0/2] coding-style.rst: document BUG() and WARN() rules John Hubbard
