linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
	kexec@lists.infradead.org, David Hildenbrand <david@redhat.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	David Laight <David.Laight@ACULAB.COM>,
	Jonathan Corbet <corbet@lwn.net>,
	Andy Whitcroft <apw@canonical.com>, Joe Perches <joe@perches.com>,
	Dwaipayan Ray <dwaipayanray1@gmail.com>,
	Lukas Bulwahn <lukas.bulwahn@gmail.com>,
	Baoquan He <bhe@redhat.com>, Vivek Goyal <vgoyal@redhat.com>,
	Dave Young <dyoung@redhat.com>
Subject: [PATCH RFC 1/2] coding-style.rst: document BUG() and WARN() rules ("do not crash the kernel")
Date: Wed, 24 Aug 2022 18:30:59 +0200	[thread overview]
Message-ID: <20220824163100.224449-2-david@redhat.com> (raw)
In-Reply-To: <20220824163100.224449-1-david@redhat.com>

Linus notes [1] that the introduction of new code that uses VM_BUG_ON()
is just as bad as BUG_ON(), because it will crash the kernel on
distributions that enable CONFIG_DEBUG_VM (like Fedora):

    VM_BUG_ON() has the exact same semantics as BUG_ON. It is literally
    no different, the only difference is "we can make the code smaller
    because these are less important". [2]

This resulted in a more generic discussion about usage of BUG() and
friends. While there might be corner cases that still deserve a BUG_ON(),
most BUG_ON() cases should simply use WARN_ON_ONCE() and implement a
recovery path if reasonable:

    The only possible case where BUG_ON can validly be used is "I have
    some fundamental data corruption and cannot possibly return an
    error". [2]

As a very good approximation is the general rule:

    "absolutely no new BUG_ON() calls _ever_" [2]

... not even if something really shouldn't ever happen and is merely for
documenting that an invariant always has to hold.

There is only one good BUG_ON():

    Now, that said, there is one very valid sub-form of BUG_ON():
    BUILD_BUG_ON() is absolutely 100% fine. [2]

While WARN will also crash the machine with panic_on_warn set, that's
exactly to be expected:

    So we have two very different cases: the "virtual machine with good
    logging where a dead machine is fine" - use 'panic_on_warn'. And
    the actual real hardware with real drivers, running real loads by
    users. [3]

The basic idea is that warnings will similarly get reported by users
and be found during testing. However, in contrast to a BUG(), there is a
way to actually influence the expected behavior (e.g., panic_on_warn)
and to eventually keep the machine alive to extract some debug info.

Ingo notes that not all WARN_ON_ONCE cases need recovery. If we don't ever
expect this code to trigger in any case, recovery code is not really
helpful.

    I'd prefer to keep all these warnings 'simple' - i.e. no attempted
    recovery & control flow, unless we ever expect these to trigger.
    [4]

There have been different rules floating around that were never properly
documented. Let's try to clarify.

[1] https://lkml.kernel.org/r/CAHk-=wiEAH+ojSpAgx_Ep=NKPWHU8AdO3V56BXcCsU97oYJ1EA@mail.gmail.com
[2] https://lore.kernel.org/r/CAHk-=wg40EAZofO16Eviaj7mfqDhZ2gVEbvfsMf6gYzspRjYvw@mail.gmail.com
[3] https://lore.kernel.org/r/CAHk-=wgF7K2gSSpy=m_=K3Nov4zaceUX9puQf1TjkTJLA2XC_g@mail.gmail.com

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 Documentation/process/coding-style.rst | 27 ++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/Documentation/process/coding-style.rst b/Documentation/process/coding-style.rst
index 03eb53fd029a..a6d81ff578fe 100644
--- a/Documentation/process/coding-style.rst
+++ b/Documentation/process/coding-style.rst
@@ -1186,6 +1186,33 @@ expression used.  For instance:
 	#endif /* CONFIG_SOMETHING */
 
 
+22) Do not crash the kernel
+---------------------------
+
+Do not add new code that uses BUG(), BUG_ON(), VM_BUG_ON(), ... to crash
+the kernel if certain conditions are not met. Instead, use WARN_ON_ONCE()
+with recovery code (if reasonable) instead. Unavoidable data corruption /
+security issues might be a very rare exception to this rule and need good
+justification.
+
+There is no need for WARN_ON_ONCE() recovery code if we never expect it to
+possibly trigger unless something goes seriously wrong or some other code
+is changed to break invariants. Note that VM_WARN_ON_ONCE() cannot be used
+in conditionals.
+
+Usage of WARN() and friends is fine for something that is not expected to
+be triggered easily. ``panic_on_warn`` users are not an excuse to not use
+WARN(): whoever enables ``panic_on_warn`` explicitly asked the kernel to
+crash in this case.
+
+However, WARN() and friends should not be used for something that is
+expected to trigger easily, for example, by user space. pr_warn_once()
+might be a reasonable replacement to notify the user.
+
+Note that BUILD_BUG_ON() is perfectly fine because it will make compilation
+fail instead of crashing the kernel.
+
+
 Appendix I) References
 ----------------------
 
-- 
2.37.1



  reply	other threads:[~2022-08-24 16:31 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-24 16:30 [PATCH RFC 0/2] coding-style.rst: document BUG() and WARN() rules David Hildenbrand
2022-08-24 16:30 ` David Hildenbrand [this message]
2022-08-24 21:59   ` [PATCH RFC 1/2] coding-style.rst: document BUG() and WARN() rules ("do not crash the kernel") John Hubbard
2022-08-25 12:12     ` David Hildenbrand
2022-08-26  1:43       ` Dave Young
2022-08-26 17:02         ` David Hildenbrand
2022-08-29  1:55           ` Dave Young
2022-08-29  3:07             ` Linus Torvalds
2022-08-29  4:49               ` John Hubbard
2022-08-29 17:19                 ` Linus Torvalds
2022-08-29  8:44               ` David Hildenbrand
2022-08-29  9:25               ` Jani Nikula
2022-08-24 16:31 ` [PATCH RFC 2/2] checkpatch: warn on usage of VM_BUG_ON() and friends David Hildenbrand
2022-08-24 16:52   ` Joe Perches
2022-08-24 19:00     ` David Hildenbrand
2022-08-25  9:58     ` David Hildenbrand
2022-08-25 11:43       ` Jani Nikula
2022-08-25 11:51         ` David Hildenbrand
2022-08-25  2:30 ` [PATCH RFC 0/2] coding-style.rst: document BUG() and WARN() rules John Hubbard

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220824163100.224449-2-david@redhat.com \
    --to=david@redhat.com \
    --cc=David.Laight@ACULAB.COM \
    --cc=akpm@linux-foundation.org \
    --cc=apw@canonical.com \
    --cc=bhe@redhat.com \
    --cc=corbet@lwn.net \
    --cc=dwaipayanray1@gmail.com \
    --cc=dyoung@redhat.com \
    --cc=joe@perches.com \
    --cc=kexec@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lukas.bulwahn@gmail.com \
    --cc=mingo@kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vgoyal@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).