From: Benno Lossin <benno.lossin@proton.me>
To: Alice Ryhl <aliceryhl@google.com>
Cc: "Jonathan Corbet" <corbet@lwn.net>,
"Miguel Ojeda" <ojeda@kernel.org>,
"Alex Gaynor" <alex.gaynor@gmail.com>,
"Wedson Almeida Filho" <wedsonaf@gmail.com>,
"Boqun Feng" <boqun.feng@gmail.com>,
"Gary Guo" <gary@garyguo.net>,
"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
"Andreas Hindborg" <a.hindborg@samsung.com>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
rust-for-linux@vger.kernel.org
Subject: Re: [RFC PATCH 1/5] doc: rust: create safety standard
Date: Wed, 24 Jul 2024 19:36:29 +0000 [thread overview]
Message-ID: <4ed475d3-37b6-49e7-8b43-df3160f4ebbc@proton.me> (raw)
In-Reply-To: <CAH5fLgg0O=t2T2MQH2hvm4eZnCOa_NffzRw=XZPi9+7+=XsmRg@mail.gmail.com>
On 18.07.24 14:20, Alice Ryhl wrote:
> On Thu, Jul 18, 2024 at 12:12 AM Benno Lossin <benno.lossin@proton.me> wrote:
>> +Soundness
>> +=========
>> +
>> +``unsafe`` operations (e.g. ``unsafe`` functions, dereferencing raw pointers etc.) have certain
>> +conditions that need to be fulfilled in order for the operation to not be UB.
>> +To evaluate if the ``unsafe`` code usage is correct, one needs to consider the API that wraps said
>> +``unsafe`` code. If under all possible safe uses of the API, the conditions for the ``unsafe``
>> +operation are fulfilled, the API is *sound*. Otherwise it is *unsound*. Here is a simple example::
>> +
>> + pub struct Data {
>> + a: usize,
>> + }
>> +
>> + pub fn access_a(data: *mut Data) -> usize {
>> + unsafe { (*data).a }
>> + }
>> +
>> + fn main() {
>> + let mut d = Data { a: 42 };
>> + println!("{}", access_a(&mut d));
>> + }
>> +
>> +While this example has no UB, the function ``access_a`` is unsound. This is because one could just
>> +write the following safe usage::
>> +
>> + println!("{}", access_a(core::ptr::null_mut()));
>> +
>> +And this would result in a dereference of a null pointer.
>> +
>> +In its essence, a sound API means that if someone only writes safe code, they can never encounter UB
>> +even if they call safe code that calls ``unsafe`` code behind the scenes.
>
> I think this section on soundness could be more clear. You never
> really give a definition for what soundness is; you just jump into an
> example immediately.
I would argue that the sentence "If under all possible safe uses of the
API, the conditions for the ``unsafe`` operation are fulfilled, the API
is *sound*. Otherwise it is *unsound*." is a definition. Is this lacking
anything, or do you have an idea to improve the wording?
I am not really understanding what kind of additional definition you
would like to see.
> It may be nice to talk about how a sound API must prevent memory
> safety issues, even if the safe code required to do so is silly or
> unrealistic. Doing so is necessary to maintain the "safe code never
> has memory safety bugs" guarantee.
That is a good way to phrase things, I will add that.
>> +Because unsoundness issues have the potential for allowing safe code to experience UB, they are
>> +treated similarly to actual bugs with UB. Their fixes should also be included in the stable tree.
>> +
>> +Safety Documentation
>> +====================
>> +
>> +After trying to minimize and remove as much ``unsafe`` code as possible, there still is some left.
>> +This is because some things are just not possible in only safe code. This last part of ``unsafe``
>> +code must still be correct. Helping with that is the safety documentation: it meticulously documents
>> +the various requirements and justifications for every line of ``unsafe`` code. That way it can be
>> +ensured that all ``unsafe`` code is sound without anyone needing to know the whole kernel at once.
>> +The gist of the idea is this: every ``unsafe`` operation documents its requirements and every
>> +location that uses an ``unsafe`` operation documents for every requirement a justification why they
>> +are fulfilled. If now all requirements and justifications are correct, then there can only be sound
>> +``unsafe`` code.
>> +
>> +The ``unsafe`` keywords has two different meanings depending on the context it is used in:
>> +
>> +* granting access to an unchecked operation,
>> +* declaring that something is an unchecked operation.
>> +
>> +In both cases we have to add safety documentation. In the first case, we have to justify why we can
>> +always guarantee that the requirements of the unchecked operation are fulfilled. In the second case,
>> +we have to list the requirements that have to be fulfilled for the operation to be sound.
>> +
>> +In the following sections we will go over each location where ``unsafe`` can be used.
>> +
>> +.. _unsafe-Functions:
>> +
>> +``unsafe`` Functions
>> +--------------------
>> +
>> +``unsafe`` on function declarations is used to state that this function has special requirements
>> +that callers have to ensure when calling the function::
>> +
>> + unsafe fn foo() {
>> + // ...
>> + }
>> +
>> +These requirements are called the safety requirements of the function. These requirements can take
>> +any shape and range from simple requirements like "``ptr_arg`` is valid" (``ptr_arg`` refers to some
>> +argument with the type matching ``*mut T`` or ``*const T``) to more complex requirements like
>> +"``ptr`` must be valid, point to a ``NUL``-terminated C string, and it must be valid for at least
>> +``'a``. While the returned value is alive, the memory at ``ptr`` must not be mutated.".
>> +
>> +The safety requirements have to be documented in the so called safety section::
>> +
>> + /// <oneline description of the function>
>> + ///
>> + /// <full description of the function>
>> + ///
>> + /// # Safety
>> + ///
>> + /// <safety requirements>
>> + unsafe fn foo() {
>> + // ...
>> + }
>
> I would love to see a proper example here.
Depending on how the discussion goes with this series, I plan to go
through the tree and try to improve the safety comments. That way I will
have a good source for examples, since otherwise they are extremely hard
to come up with.
(I am also open to any suggestions :)
>> +.. _unsafe-Blocks:
>> +
>> +``unsafe`` Blocks
>> +-----------------
>> +
>> +``unsafe`` code blocks are used to call ``unsafe`` functions and perform built-in ``unsafe``
>> +operations such as dereferencing a raw pointer::
>> +
>> + unsafe { foo() };
>> +
>> +In order to ensure that all safety requirements of ``unsafe`` operations are upheld, a safety
>> +comment is mandatory for all ``unsafe`` blocks. This safety comment needs to provide a correct
>> +justification for every safety requirements of every operation within the block::
>> +
>> + // SAFETY: <justifications>
>> + unsafe { foo() };
>
> I think it is worth explicitly pointing out that the safety comment
> must explain why the preconditions are satisfied, *not* what the
> preconditions are. It's a really really common mistake to mix up
> these, and it probably even makes sense to include two examples
> showing the difference.
Good idea, but I will put that into justifications.rst.
>> +For transparency it is best practice to have only a single ``unsafe`` operation per ``unsafe``
>> +block, since then it is more clear what the justifications are trying to justify. Safe operations
>> +should not be included in the block, since it adds confusion as to which operation is the ``unsafe``
>> +one. In certain cases however it makes it easier to understand if there is only a single ``unsafe``
>> +block. For example::
>> +
>> + // SAFETY: `ptr` is valid for writes.
>> + unsafe {
>> + (*ptr).field1 = 42;
>> + (*ptr).field2 = 24;
>> + (*ptr).field3 = 2442;
>> + }
>> +
>> +In this case it is more readable to not split the block into multiple parts.
>
> It would be nice with an example of a completely normal safety comment
> example. Also would be nice to call out that the semicolon goes
> outside the unsafe block to improve formatting, when it's a single
> operation.
Will add.
> It would also be nice with an example that shows how to reference the
> safety requirements of the current function. (That is, an example that
> combines unsafe fn with unsafe block.)
See above.
---
Cheers,
Benno
next prev parent reply other threads:[~2024-07-24 19:36 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-07-17 22:12 [RFC PATCH 0/5] Introduce the Rust Safety Standard Benno Lossin
2024-07-17 22:12 ` [RFC PATCH 1/5] doc: rust: create safety standard Benno Lossin
2024-07-18 4:45 ` Greg KH
2024-07-24 19:13 ` Benno Lossin
2024-07-25 4:57 ` Greg KH
2024-07-18 12:20 ` Alice Ryhl
2024-07-24 19:36 ` Benno Lossin [this message]
2024-07-19 16:24 ` Daniel Almeida
2024-07-19 16:46 ` Alice Ryhl
2024-07-19 17:10 ` Danilo Krummrich
2024-07-19 18:09 ` Daniel Almeida
2024-07-19 19:20 ` Danilo Krummrich
2024-07-19 17:28 ` Miguel Ojeda
2024-07-19 18:18 ` Daniel Almeida
2024-07-19 17:56 ` Miguel Ojeda
2024-07-24 20:31 ` Benno Lossin
2024-07-24 21:20 ` Miguel Ojeda
2024-07-24 21:28 ` Benno Lossin
2024-08-08 13:01 ` Daniel Almeida
2024-08-08 13:08 ` Miguel Ojeda
2024-07-19 22:11 ` Boqun Feng
2024-07-24 22:01 ` Benno Lossin
2024-07-20 7:45 ` Dirk Behme
2024-07-17 22:12 ` [RFC PATCH 2/5] doc: rust: safety standard: add examples Benno Lossin
2024-07-19 16:28 ` Daniel Almeida
2024-07-19 16:36 ` Daniel Almeida
2024-07-25 7:47 ` Benno Lossin
2024-08-08 13:10 ` Daniel Almeida
2024-08-08 14:33 ` Benno Lossin
2024-07-17 22:12 ` [RFC PATCH 3/5] doc: rust: safety standard: add guarantees and type invariants Benno Lossin
2024-07-17 22:12 ` [RFC PATCH 4/5] doc: rust: safety standard: add safety requirements Benno Lossin
2024-07-17 22:13 ` [RFC PATCH 5/5] doc: rust: safety standard: add justifications Benno Lossin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4ed475d3-37b6-49e7-8b43-df3160f4ebbc@proton.me \
--to=benno.lossin@proton.me \
--cc=a.hindborg@samsung.com \
--cc=alex.gaynor@gmail.com \
--cc=aliceryhl@google.com \
--cc=bjorn3_gh@protonmail.com \
--cc=boqun.feng@gmail.com \
--cc=corbet@lwn.net \
--cc=gary@garyguo.net \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=ojeda@kernel.org \
--cc=rust-for-linux@vger.kernel.org \
--cc=wedsonaf@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).