From: David Hildenbrand <david@redhat.com>
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-doc@vger.kernel.org,
kexec@lists.infradead.org, David Hildenbrand <david@redhat.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Andrew Morton <akpm@linux-foundation.org>,
Ingo Molnar <mingo@kernel.org>,
David Laight <David.Laight@ACULAB.COM>,
Jonathan Corbet <corbet@lwn.net>,
Andy Whitcroft <apw@canonical.com>, Joe Perches <joe@perches.com>,
Dwaipayan Ray <dwaipayanray1@gmail.com>,
Lukas Bulwahn <lukas.bulwahn@gmail.com>,
Baoquan He <bhe@redhat.com>, Vivek Goyal <vgoyal@redhat.com>,
Dave Young <dyoung@redhat.com>
Subject: [PATCH RFC 1/2] coding-style.rst: document BUG() and WARN() rules ("do not crash the kernel")
Date: Wed, 24 Aug 2022 18:30:59 +0200 [thread overview]
Message-ID: <20220824163100.224449-2-david@redhat.com> (raw)
In-Reply-To: <20220824163100.224449-1-david@redhat.com>
Linus notes [1] that the introduction of new code that uses VM_BUG_ON()
is just as bad as BUG_ON(), because it will crash the kernel on
distributions that enable CONFIG_DEBUG_VM (like Fedora):
VM_BUG_ON() has the exact same semantics as BUG_ON. It is literally
no different, the only difference is "we can make the code smaller
because these are less important". [2]
This resulted in a more generic discussion about usage of BUG() and
friends. While there might be corner cases that still deserve a BUG_ON(),
most BUG_ON() cases should simply use WARN_ON_ONCE() and implement a
recovery path if reasonable:
The only possible case where BUG_ON can validly be used is "I have
some fundamental data corruption and cannot possibly return an
error". [2]
As a very good approximation is the general rule:
"absolutely no new BUG_ON() calls _ever_" [2]
... not even if something really shouldn't ever happen and is merely for
documenting that an invariant always has to hold.
There is only one good BUG_ON():
Now, that said, there is one very valid sub-form of BUG_ON():
BUILD_BUG_ON() is absolutely 100% fine. [2]
While WARN will also crash the machine with panic_on_warn set, that's
exactly to be expected:
So we have two very different cases: the "virtual machine with good
logging where a dead machine is fine" - use 'panic_on_warn'. And
the actual real hardware with real drivers, running real loads by
users. [3]
The basic idea is that warnings will similarly get reported by users
and be found during testing. However, in contrast to a BUG(), there is a
way to actually influence the expected behavior (e.g., panic_on_warn)
and to eventually keep the machine alive to extract some debug info.
Ingo notes that not all WARN_ON_ONCE cases need recovery. If we don't ever
expect this code to trigger in any case, recovery code is not really
helpful.
I'd prefer to keep all these warnings 'simple' - i.e. no attempted
recovery & control flow, unless we ever expect these to trigger.
[4]
There have been different rules floating around that were never properly
documented. Let's try to clarify.
[1] https://lkml.kernel.org/r/CAHk-=wiEAH+ojSpAgx_Ep=NKPWHU8AdO3V56BXcCsU97oYJ1EA@mail.gmail.com
[2] https://lore.kernel.org/r/CAHk-=wg40EAZofO16Eviaj7mfqDhZ2gVEbvfsMf6gYzspRjYvw@mail.gmail.com
[3] https://lore.kernel.org/r/CAHk-=wgF7K2gSSpy=m_=K3Nov4zaceUX9puQf1TjkTJLA2XC_g@mail.gmail.com
Signed-off-by: David Hildenbrand <david@redhat.com>
---
Documentation/process/coding-style.rst | 27 ++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/Documentation/process/coding-style.rst b/Documentation/process/coding-style.rst
index 03eb53fd029a..a6d81ff578fe 100644
--- a/Documentation/process/coding-style.rst
+++ b/Documentation/process/coding-style.rst
@@ -1186,6 +1186,33 @@ expression used. For instance:
#endif /* CONFIG_SOMETHING */
+22) Do not crash the kernel
+---------------------------
+
+Do not add new code that uses BUG(), BUG_ON(), VM_BUG_ON(), ... to crash
+the kernel if certain conditions are not met. Instead, use WARN_ON_ONCE()
+with recovery code (if reasonable) instead. Unavoidable data corruption /
+security issues might be a very rare exception to this rule and need good
+justification.
+
+There is no need for WARN_ON_ONCE() recovery code if we never expect it to
+possibly trigger unless something goes seriously wrong or some other code
+is changed to break invariants. Note that VM_WARN_ON_ONCE() cannot be used
+in conditionals.
+
+Usage of WARN() and friends is fine for something that is not expected to
+be triggered easily. ``panic_on_warn`` users are not an excuse to not use
+WARN(): whoever enables ``panic_on_warn`` explicitly asked the kernel to
+crash in this case.
+
+However, WARN() and friends should not be used for something that is
+expected to trigger easily, for example, by user space. pr_warn_once()
+might be a reasonable replacement to notify the user.
+
+Note that BUILD_BUG_ON() is perfectly fine because it will make compilation
+fail instead of crashing the kernel.
+
+
Appendix I) References
----------------------
--
2.37.1
next prev parent reply other threads:[~2022-08-24 16:31 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-24 16:30 [PATCH RFC 0/2] coding-style.rst: document BUG() and WARN() rules David Hildenbrand
2022-08-24 16:30 ` David Hildenbrand [this message]
2022-08-24 21:59 ` [PATCH RFC 1/2] coding-style.rst: document BUG() and WARN() rules ("do not crash the kernel") John Hubbard
2022-08-25 12:12 ` David Hildenbrand
2022-08-26 1:43 ` Dave Young
2022-08-26 17:02 ` David Hildenbrand
2022-08-29 1:55 ` Dave Young
2022-08-29 3:07 ` Linus Torvalds
2022-08-29 4:49 ` John Hubbard
2022-08-29 17:19 ` Linus Torvalds
2022-08-29 8:44 ` David Hildenbrand
2022-08-29 9:25 ` Jani Nikula
2022-08-24 16:31 ` [PATCH RFC 2/2] checkpatch: warn on usage of VM_BUG_ON() and friends David Hildenbrand
2022-08-24 16:52 ` Joe Perches
2022-08-24 19:00 ` David Hildenbrand
2022-08-25 9:58 ` David Hildenbrand
2022-08-25 11:43 ` Jani Nikula
2022-08-25 11:51 ` David Hildenbrand
2022-08-25 2:30 ` [PATCH RFC 0/2] coding-style.rst: document BUG() and WARN() rules John Hubbard
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220824163100.224449-2-david@redhat.com \
--to=david@redhat.com \
--cc=David.Laight@ACULAB.COM \
--cc=akpm@linux-foundation.org \
--cc=apw@canonical.com \
--cc=bhe@redhat.com \
--cc=corbet@lwn.net \
--cc=dwaipayanray1@gmail.com \
--cc=dyoung@redhat.com \
--cc=joe@perches.com \
--cc=kexec@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lukas.bulwahn@gmail.com \
--cc=mingo@kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=vgoyal@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).