rust-for-linux.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Benno Lossin <benno.lossin@proton.me>
To: Alice Ryhl <aliceryhl@google.com>
Cc: "Jonathan Corbet" <corbet@lwn.net>,
	"Miguel Ojeda" <ojeda@kernel.org>,
	"Alex Gaynor" <alex.gaynor@gmail.com>,
	"Wedson Almeida Filho" <wedsonaf@gmail.com>,
	"Boqun Feng" <boqun.feng@gmail.com>,
	"Gary Guo" <gary@garyguo.net>,
	"Björn Roy Baron" <bjorn3_gh@protonmail.com>,
	"Andreas Hindborg" <a.hindborg@samsung.com>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	rust-for-linux@vger.kernel.org
Subject: Re: [RFC PATCH 1/5] doc: rust: create safety standard
Date: Wed, 24 Jul 2024 19:36:29 +0000	[thread overview]
Message-ID: <4ed475d3-37b6-49e7-8b43-df3160f4ebbc@proton.me> (raw)
In-Reply-To: <CAH5fLgg0O=t2T2MQH2hvm4eZnCOa_NffzRw=XZPi9+7+=XsmRg@mail.gmail.com>

On 18.07.24 14:20, Alice Ryhl wrote:
> On Thu, Jul 18, 2024 at 12:12 AM Benno Lossin <benno.lossin@proton.me> wrote:
>> +Soundness
>> +=========
>> +
>> +``unsafe`` operations (e.g. ``unsafe`` functions, dereferencing raw pointers etc.) have certain
>> +conditions that need to be fulfilled in order for the operation to not be UB.
>> +To evaluate if the ``unsafe`` code usage is correct, one needs to consider the API that wraps said
>> +``unsafe`` code. If under all possible safe uses of the API, the conditions for the ``unsafe``
>> +operation are fulfilled, the API is *sound*. Otherwise it is *unsound*. Here is a simple example::
>> +
>> +    pub struct Data {
>> +        a: usize,
>> +    }
>> +
>> +    pub fn access_a(data: *mut Data) -> usize {
>> +        unsafe { (*data).a }
>> +    }
>> +
>> +    fn main() {
>> +        let mut d = Data { a: 42 };
>> +        println!("{}", access_a(&mut d));
>> +    }
>> +
>> +While this example has no UB, the function ``access_a`` is unsound. This is because one could just
>> +write the following safe usage::
>> +
>> +    println!("{}", access_a(core::ptr::null_mut()));
>> +
>> +And this would result in a dereference of a null pointer.
>> +
>> +In its essence, a sound API means that if someone only writes safe code, they can never encounter UB
>> +even if they call safe code that calls ``unsafe`` code behind the scenes.
> 
> I think this section on soundness could be more clear. You never
> really give a definition for what soundness is; you just jump into an
> example immediately.

I would argue that the sentence "If under all possible safe uses of the
API, the conditions for the ``unsafe`` operation are fulfilled, the API
is *sound*. Otherwise it is *unsound*." is a definition. Is this lacking
anything, or do you have an idea to improve the wording?
I am not really understanding what kind of additional definition you
would like to see.

> It may be nice to talk about how a sound API must prevent memory
> safety issues, even if the safe code required to do so is silly or
> unrealistic. Doing so is necessary to maintain the "safe code never
> has memory safety bugs" guarantee.

That is a good way to phrase things, I will add that.

>> +Because unsoundness issues have the potential for allowing safe code to experience UB, they are
>> +treated similarly to actual bugs with UB. Their fixes should also be included in the  stable tree.
>> +
>> +Safety Documentation
>> +====================
>> +
>> +After trying to minimize and remove as much ``unsafe`` code as possible, there still is some left.
>> +This is because some things are just not possible in only safe code. This last part of ``unsafe``
>> +code must still be correct. Helping with that is the safety documentation: it meticulously documents
>> +the various requirements and justifications for every line of ``unsafe`` code. That way it can be
>> +ensured that all ``unsafe`` code is sound without anyone needing to know the whole kernel at once.
>> +The gist of the idea is this: every ``unsafe`` operation documents its requirements and every
>> +location that uses an ``unsafe`` operation documents for every requirement a justification why they
>> +are fulfilled. If now all requirements and justifications are correct, then there can only be sound
>> +``unsafe`` code.
>> +
>> +The ``unsafe`` keywords has two different meanings depending on the context it is used in:
>> +
>> +* granting access to an unchecked operation,
>> +* declaring that something is an unchecked operation.
>> +
>> +In both cases we have to add safety documentation. In the first case, we have to justify why we can
>> +always guarantee that the requirements of the unchecked operation are fulfilled. In the second case,
>> +we have to list the requirements that have to be fulfilled for the operation to be sound.
>> +
>> +In the following sections we will go over each location where ``unsafe`` can be used.
>> +
>> +.. _unsafe-Functions:
>> +
>> +``unsafe`` Functions
>> +--------------------
>> +
>> +``unsafe`` on function declarations is used to state that this function has special requirements
>> +that callers have to ensure when calling the function::
>> +
>> +    unsafe fn foo() {
>> +        // ...
>> +    }
>> +
>> +These requirements are called the safety requirements of the function. These requirements can take
>> +any shape and range from simple requirements like "``ptr_arg`` is valid" (``ptr_arg`` refers to some
>> +argument with the type matching ``*mut T`` or ``*const T``) to more complex requirements like
>> +"``ptr`` must be valid, point to a ``NUL``-terminated C string, and it must be valid for at least
>> +``'a``. While the returned value is alive, the memory at ``ptr`` must not be mutated.".
>> +
>> +The safety requirements have to be documented in the so called safety section::
>> +
>> +    /// <oneline description of the function>
>> +    ///
>> +    /// <full description of the function>
>> +    ///
>> +    /// # Safety
>> +    ///
>> +    /// <safety requirements>
>> +    unsafe fn foo() {
>> +        // ...
>> +    }
> 
> I would love to see a proper example here.

Depending on how the discussion goes with this series, I plan to go
through the tree and try to improve the safety comments. That way I will
have a good source for examples, since otherwise they are extremely hard
to come up with.
(I am also open to any suggestions :)

>> +.. _unsafe-Blocks:
>> +
>> +``unsafe`` Blocks
>> +-----------------
>> +
>> +``unsafe`` code blocks are used to call ``unsafe`` functions and perform built-in ``unsafe``
>> +operations such as dereferencing a raw pointer::
>> +
>> +    unsafe { foo() };
>> +
>> +In order to ensure that all safety requirements of ``unsafe`` operations are upheld, a safety
>> +comment is mandatory for all ``unsafe`` blocks. This safety comment needs to provide a correct
>> +justification for every safety requirements of every operation within the block::
>> +
>> +    // SAFETY: <justifications>
>> +    unsafe { foo() };
> 
> I think it is worth explicitly pointing out that the safety comment
> must explain why the preconditions are satisfied, *not* what the
> preconditions are. It's a really really common mistake to mix up
> these, and it probably even makes sense to include two examples
> showing the difference.

Good idea, but I will put that into justifications.rst.

>> +For transparency it is best practice to have only a single ``unsafe`` operation per ``unsafe``
>> +block, since then it is more clear what the justifications are trying to justify. Safe operations
>> +should not be included in the block, since it adds confusion as to which operation is the ``unsafe``
>> +one. In certain cases however it makes it easier to understand if there is only a single ``unsafe``
>> +block. For example::
>> +
>> +    // SAFETY: `ptr` is valid for writes.
>> +    unsafe {
>> +        (*ptr).field1 = 42;
>> +        (*ptr).field2 = 24;
>> +        (*ptr).field3 = 2442;
>> +    }
>> +
>> +In this case it is more readable to not split the block into multiple parts.
> 
> It would be nice with an example of a completely normal safety comment
> example. Also would be nice to call out that the semicolon goes
> outside the unsafe block to improve formatting, when it's a single
> operation.

Will add.

> It would also be nice with an example that shows how to reference the
> safety requirements of the current function. (That is, an example that
> combines unsafe fn with unsafe block.)

See above.

---
Cheers,
Benno


  reply	other threads:[~2024-07-24 19:36 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-17 22:12 [RFC PATCH 0/5] Introduce the Rust Safety Standard Benno Lossin
2024-07-17 22:12 ` [RFC PATCH 1/5] doc: rust: create safety standard Benno Lossin
2024-07-18  4:45   ` Greg KH
2024-07-24 19:13     ` Benno Lossin
2024-07-25  4:57       ` Greg KH
2024-07-18 12:20   ` Alice Ryhl
2024-07-24 19:36     ` Benno Lossin [this message]
2024-07-19 16:24   ` Daniel Almeida
2024-07-19 16:46     ` Alice Ryhl
2024-07-19 17:10     ` Danilo Krummrich
2024-07-19 18:09       ` Daniel Almeida
2024-07-19 19:20         ` Danilo Krummrich
2024-07-19 17:28     ` Miguel Ojeda
2024-07-19 18:18       ` Daniel Almeida
2024-07-19 17:56     ` Miguel Ojeda
2024-07-24 20:31     ` Benno Lossin
2024-07-24 21:20       ` Miguel Ojeda
2024-07-24 21:28         ` Benno Lossin
2024-08-08 13:01       ` Daniel Almeida
2024-08-08 13:08         ` Miguel Ojeda
2024-07-19 22:11   ` Boqun Feng
2024-07-24 22:01     ` Benno Lossin
2024-07-20  7:45   ` Dirk Behme
2024-07-17 22:12 ` [RFC PATCH 2/5] doc: rust: safety standard: add examples Benno Lossin
2024-07-19 16:28   ` Daniel Almeida
2024-07-19 16:36   ` Daniel Almeida
2024-07-25  7:47     ` Benno Lossin
2024-08-08 13:10       ` Daniel Almeida
2024-08-08 14:33         ` Benno Lossin
2024-07-17 22:12 ` [RFC PATCH 3/5] doc: rust: safety standard: add guarantees and type invariants Benno Lossin
2024-07-17 22:12 ` [RFC PATCH 4/5] doc: rust: safety standard: add safety requirements Benno Lossin
2024-07-17 22:13 ` [RFC PATCH 5/5] doc: rust: safety standard: add justifications Benno Lossin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4ed475d3-37b6-49e7-8b43-df3160f4ebbc@proton.me \
    --to=benno.lossin@proton.me \
    --cc=a.hindborg@samsung.com \
    --cc=alex.gaynor@gmail.com \
    --cc=aliceryhl@google.com \
    --cc=bjorn3_gh@protonmail.com \
    --cc=boqun.feng@gmail.com \
    --cc=corbet@lwn.net \
    --cc=gary@garyguo.net \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=ojeda@kernel.org \
    --cc=rust-for-linux@vger.kernel.org \
    --cc=wedsonaf@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).