From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from madrid.collaboradmins.com (madrid.collaboradmins.com [46.235.227.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B0F5A22331; Fri, 19 Jul 2024 16:24:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=46.235.227.194 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721406274; cv=none; b=npX/tRxG8kabkt5IxSWwfsx/S5/eq+HLun9O4dLZeA71c7O/FicO+l+PPeWnEUHSqfczlNoubU82Ad/UmHOxQX4UH4n2ElR35GLuOPyPxNS6X+4uWrfn1++zE1JxHHlsxjhdtFLMuhkafigIgcijSNaRj9hprBukMKuiAMIPM5A= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721406274; c=relaxed/simple; bh=FcSr1TgcEitI1TYZehyv0Ew6mstsPtwktnR+RpvKb7Y=; h=Content-Type:Mime-Version:Subject:From:In-Reply-To:Date:Cc: Message-Id:References:To; b=IuqedY3x8VtP/+ipgwJjeeTpE2I05wNPFtwcoAVoRjuUkoZpQX+1/2XzjBf7vT5tgCZIT17zCz0wOr608kt+ptNSlBqdCO8X8KrnG1mCneVYfckkPWVGdXe1YC42E8aa6anl34lf693fUHTNjekPscChvOLAR5736vN9GFY6JEY= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=3bkB/Dfl; arc=none smtp.client-ip=46.235.227.194 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="3bkB/Dfl" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1721406269; bh=FcSr1TgcEitI1TYZehyv0Ew6mstsPtwktnR+RpvKb7Y=; h=Subject:From:In-Reply-To:Date:Cc:References:To:From; b=3bkB/Dflg3/cKipfhJ7HkDO3nBMnBnXHFtLV9RJQAnweeozgcXyplrkhx4sjfxdxO tm6282EsHc9SvQm4JS1xfV8SLij95eKg3YJJz3guVXGfpBElxzkW0pVddP+35RYPFs 8FifM+lVVKPq6B2UWEkUgxKQX9GMu1QLlTwibw7+YQRjnGJzrs4skU9LcEaw0eO9Y0 p3bvQtMoCp5Jec31fKikA22efboe4uAyaCO+E5Upg9mHkJL4ewmBT+sWYxrhrJyNUl GSZXR5X9peGXIY93zxwE5/JbM+n1odwOjbYfJV3fmvlQZ6NcYzkgwv5uW+UQ/CQkcj UTBe7p3GYDsuA== Received: from smtpclient.apple (cola.collaboradmins.com [IPv6:2a01:4f8:1c1c:5717::1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) (Authenticated sender: dwlsalmeida) by madrid.collaboradmins.com (Postfix) with ESMTPSA id E38B537813D3; Fri, 19 Jul 2024 16:24:23 +0000 (UTC) Content-Type: text/plain; charset=utf-8 Precedence: bulk X-Mailing-List: linux-doc@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.600.62\)) Subject: Re: [RFC PATCH 1/5] doc: rust: create safety standard From: Daniel Almeida In-Reply-To: <20240717221133.459589-2-benno.lossin@proton.me> Date: Fri, 19 Jul 2024 13:24:10 -0300 Cc: Jonathan Corbet , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho , Boqun Feng , Gary Guo , =?utf-8?Q?Bj=C3=B6rn_Roy_Baron?= , Andreas Hindborg , Alice Ryhl , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org Content-Transfer-Encoding: quoted-printable Message-Id: <99DF6A0F-BAE9-4341-8B42-6C1F1C69E2C6@collabora.com> References: <20240717221133.459589-1-benno.lossin@proton.me> <20240717221133.459589-2-benno.lossin@proton.me> To: Benno Lossin X-Mailer: Apple Mail (2.3774.600.62) Hi Benno, It=E2=80=99s nice to see this shaping up. I do agree that it=E2=80=99s a = bit of a wild west right now. IMHO, we need a lint to enforce compliance, unless we plan to have every = patch reviewed by the RFL community, which is unrealistic as time goes = forward. I myself have forgotten to properly document unsafe blocks because it=E2=80=99= s easy to miss things when submitting more than a thousand LOC. A new clippy lint would make sense here, since we already have clippy = support in the kernel anyways. > On 17 Jul 2024, at 19:12, Benno Lossin wrote: >=20 > `unsafe` Rust code in the kernel is required to have safety > documentation. This is to ensure the correctness of `unsafe` code and = is > thus very important. > However, at this point in time there does not exist a standard way of > writing safety documentation. This leads to confusion, as authors > struggle to find the right way to convey their desired intentions. > Readers struggle with correctly interpreting the existing = documentation. >=20 > Add the safety standard that will document the meaning of safety > documentation. This first document gives an overview of the problem = and > gives general information about the topic. >=20 > Signed-off-by: Benno Lossin > --- > Documentation/rust/general-information.rst | 1 + > Documentation/rust/index.rst | 1 + > Documentation/rust/safety-standard/index.rst | 246 +++++++++++++++++++ > 3 files changed, 248 insertions(+) > create mode 100644 Documentation/rust/safety-standard/index.rst >=20 > diff --git a/Documentation/rust/general-information.rst = b/Documentation/rust/general-information.rst > index e3f388ef4ee4..ddfe4e2e5307 100644 > --- a/Documentation/rust/general-information.rst > +++ b/Documentation/rust/general-information.rst > @@ -54,6 +54,7 @@ the same invocation used for compilation, e.g.:: > Please note that Clippy may change code generation, thus it should not = be > enabled while building a production kernel. >=20 > +.. _rust-abstractions: >=20 > Abstractions vs. bindings > ------------------------- > diff --git a/Documentation/rust/index.rst = b/Documentation/rust/index.rst > index 46d35bd395cf..968e9aace301 100644 > --- a/Documentation/rust/index.rst > +++ b/Documentation/rust/index.rst > @@ -39,6 +39,7 @@ configurations. > quick-start > general-information > coding-guidelines > + safety-standard/index > arch-support > testing >=20 > diff --git a/Documentation/rust/safety-standard/index.rst = b/Documentation/rust/safety-standard/index.rst > new file mode 100644 > index 000000000000..1cbc8d3dea04 > --- /dev/null > +++ b/Documentation/rust/safety-standard/index.rst > @@ -0,0 +1,246 @@ > +.. SPDX-License-Identifier: GPL-2.0 > +.. highlight:: rust > + > +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > +Rust Safety Standard > +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > + > +Safe Rust code cannot have memory related bugs. This is a guarantee = by the Rust compiler. Of course > +it is not without caveats: no compiler bugs, no bugs in the = specification etc. But the possibly most > +important caveat is that of ``unsafe`` code. ``unsafe`` code needs to = follow certain rules in order > +for safe code to enjoy the no-memory-bugs privilege. A simple example = of such a rule is that > +references must be valid for the duration of their lifetime. If any = rule is violated, it can lead > +to undefined behavior even in safe code! The term undefined behavior = in Rust has a lot stricter > +meaning than in C or C++: UB in Rust is totally forbidden. In C one = might rely on the compiler > +implementation to ensure correct code generation, but that is not the = case for Rust. You can read > +more about UB in Rust > +`here = `_= . > + > +If ``unsafe`` code makes our life this difficult, one might ask the = question "why do we even need > +it?" and the answer to that is that it gives users an escape hatch to = do things that the compiler > +normally forbids. ``unsafe`` code is a tool that enables programmers = to write more performant code, > +or code that interacts with hardware or C. These things are = particularly important in kernel > +development. > + > +The most effective way to prevent issues in ``unsafe`` code is to = just not write ``unsafe`` code in > +the first place. That is why minimizing the amount of ``unsafe`` code = is very important. For > +example, drivers are not allowed to directly interface with the C = side. Instead of directly > +communicating with C functions, they interact with Rust abstractions. = This concentrates the usage > +of ``unsafe`` code, making it easy to fix issues, since only the = abstraction needs to be fixed. > +Abstractions also allow taking advantage of other Rust language = features. Read more in > +:ref:`rust-abstractions`. This is something that I think we should discuss at Kangrejos. I do not = think that we should set in stone that the kernel crate is the only place = where unsafe code is acceptable. I am in no way disagreeing with the use of safe abstractions, but I = think we should have abstractions where they make sense. This is the case in the = vast majority of times, but not in *all* of them. A simple example is a MMIO read or write. Should a driver be forbidden = to call readX/writeX for an address it knows to be valid? How can you possibly = write an abstraction for this, when the driver is the only one aware of the = actual device addresses, and when the driver author is the person with actual = access to the HW docs? If a driver is written partially in Rust, and partially in C, and it = gets a pointer to some kcalloc=E2=80=99d memory in C, should It be forbidden to = use unsafe in order to build a slice from that pointer? How can you possibly design = a general abstraction for something that is, essentially, a = driver-internal API? For these corner cases, a simple safety comment should suffice. By all = means, let's strive to push as much of the unsafe bits into the kernel crate. = But, IMHO, we shouldn=E2=80=99t treat Rust drivers as some unprivileged = entity, they=E2=80=99re also kernel code, after all. > + > +Since the correctness of the abstractions is integral for safe code = to also be correct, extra effort > +is expended to get them right. Part of that is good safety = documentation. > + > +The goals of safety documentation are: > + > +* reduce the amount of bugs in ``unsafe`` code, > +* help readers know why a given piece of ``unsafe`` code is sound, > +* help writers write ``unsafe`` code with confidence, > +* simplify the work of reviewers. > + > +This document standardizes safety documentation. The necessity for = this is simple, only a common > +language that all parties understand is effective at the above task. = We want to avoid > +misunderstandings in safety related matters. An additional benefit is = that programmers will not have > +to ponder for the correct phrasing, since they can find it here. > + > +This document assumes that the reader is familiar with Rust code and = understands the most important > +concepts of ``unsafe`` Rust. It is recommended that the reader has = read the `Rust Book`_. Since this > +document is about safety documentation, almost all examples are going = to contain ``unsafe`` code. > +For this reason it is also recommended to read the `Rustonomicon`_, = one of the best resources on > +``unsafe`` code. > + > +.. _Rustonomicon: https://doc.rust-lang.org/nomicon/index.html > +.. _Rust Book: https://doc.rust-lang.org/stable/book/ > + > +If you need help coming up with an abstraction, or with writing the = safety documentation for an > +abstraction, feel free to reach out on `zulip`_ or the `list`_. > + > +.. _zulip: https://rust-for-linux.zulipchat.com > +.. _list: https://lore.kernel.org/rust-for-linux > + > +Soundness > +=3D=3D=3D=3D=3D=3D=3D=3D=3D > + > +``unsafe`` operations (e.g. ``unsafe`` functions, dereferencing raw = pointers etc.) have certain > +conditions that need to be fulfilled in order for the operation to = not be UB. > +To evaluate if the ``unsafe`` code usage is correct, one needs to = consider the API that wraps said > +``unsafe`` code. If under all possible safe uses of the API, the = conditions for the ``unsafe`` > +operation are fulfilled, the API is *sound*. Otherwise it is = *unsound*. Here is a simple example:: > + > + pub struct Data { > + a: usize, > + } > + > + pub fn access_a(data: *mut Data) -> usize { > + unsafe { (*data).a } > + } > + > + fn main() { > + let mut d =3D Data { a: 42 }; > + println!("{}", access_a(&mut d)); > + } > + > +While this example has no UB, the function ``access_a`` is unsound. = This is because one could just > +write the following safe usage:: > + > + println!("{}", access_a(core::ptr::null_mut())); > + > +And this would result in a dereference of a null pointer. > + > +In its essence, a sound API means that if someone only writes safe = code, they can never encounter UB > +even if they call safe code that calls ``unsafe`` code behind the = scenes. > + > +Because unsoundness issues have the potential for allowing safe code = to experience UB, they are > +treated similarly to actual bugs with UB. Their fixes should also be = included in the stable tree. > + > +Safety Documentation > +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > + > +After trying to minimize and remove as much ``unsafe`` code as = possible, there still is some left. > +This is because some things are just not possible in only safe code. = This last part of ``unsafe`` > +code must still be correct. Helping with that is the safety = documentation: it meticulously documents > +the various requirements and justifications for every line of = ``unsafe`` code. That way it can be > +ensured that all ``unsafe`` code is sound without anyone needing to = know the whole kernel at once. > +The gist of the idea is this: every ``unsafe`` operation documents = its requirements and every > +location that uses an ``unsafe`` operation documents for every = requirement a justification why they > +are fulfilled. If now all requirements and justifications are = correct, then there can only be sound > +``unsafe`` code. > + > +The ``unsafe`` keywords has two different meanings depending on the = context it is used in: > + > +* granting access to an unchecked operation, > +* declaring that something is an unchecked operation. > + > +In both cases we have to add safety documentation. In the first case, = we have to justify why we can > +always guarantee that the requirements of the unchecked operation are = fulfilled. In the second case, > +we have to list the requirements that have to be fulfilled for the = operation to be sound. > + > +In the following sections we will go over each location where = ``unsafe`` can be used. > + > +.. _unsafe-Functions: > + > +``unsafe`` Functions > +-------------------- > + > +``unsafe`` on function declarations is used to state that this = function has special requirements > +that callers have to ensure when calling the function:: > + > + unsafe fn foo() { > + // ... > + } > + > +These requirements are called the safety requirements of the = function. These requirements can take > +any shape and range from simple requirements like "``ptr_arg`` is = valid" (``ptr_arg`` refers to some > +argument with the type matching ``*mut T`` or ``*const T``) to more = complex requirements like > +"``ptr`` must be valid, point to a ``NUL``-terminated C string, and = it must be valid for at least > +``'a``. While the returned value is alive, the memory at ``ptr`` must = not be mutated.". > + > +The safety requirements have to be documented in the so called safety = section:: > + > + /// > + /// > + /// > + /// > + /// # Safety > + /// > + /// > + unsafe fn foo() { > + // ... > + } > + > +.. _unsafe-Blocks: > + > +``unsafe`` Blocks > +----------------- > + > +``unsafe`` code blocks are used to call ``unsafe`` functions and = perform built-in ``unsafe`` > +operations such as dereferencing a raw pointer:: > + > + unsafe { foo() }; > + > +In order to ensure that all safety requirements of ``unsafe`` = operations are upheld, a safety > +comment is mandatory for all ``unsafe`` blocks. This safety comment = needs to provide a correct > +justification for every safety requirements of every operation within = the block:: > + > + // SAFETY: > + unsafe { foo() }; > + > +For transparency it is best practice to have only a single ``unsafe`` = operation per ``unsafe`` > +block, since then it is more clear what the justifications are trying = to justify. Safe operations > +should not be included in the block, since it adds confusion as to = which operation is the ``unsafe`` > +one. In certain cases however it makes it easier to understand if = there is only a single ``unsafe`` > +block. For example:: > + > + // SAFETY: `ptr` is valid for writes. > + unsafe { > + (*ptr).field1 =3D 42; > + (*ptr).field2 =3D 24; > + (*ptr).field3 =3D 2442; > + } > + > +In this case it is more readable to not split the block into multiple = parts. > + > +``unsafe`` Traits > +----------------- > + > +When ``unsafe`` is on a ``trait`` declaration:: > + > + unsafe trait Foo {} > + > +The ``trait`` has special requirements for implementing it. Similar = to :ref:`unsafe-Functions`, these > +are called safety requirements and need to be documented in the same = way:: > + > + /// > + /// > + /// > + /// > + /// # Safety > + /// > + /// > + unsafe trait Foo {} > + > +``unsafe`` Impls > +---------------- > + > +When ``unsafe`` is on an ``impl`` item:: > + > + unsafe impl Foo for Bar {} > + > +The ``Foo`` ``trait`` has to be ``unsafe`` and its safety = requirements need to be justified > +similarly to :ref:`unsafe-Blocks`:: > + > + // SAFETY: > + unsafe impl Foo for Bar {} > + > +General Rules > +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > + > +The general thought behind all rules in the safety standard is that = everything that cannot be > +statically checked by the Rust compiler and guaranteed, needs to be = either checked at runtime, or > +have to have safety documentation. > + > +The Kernel uses ``deny(unsafe_op_in_unsafe_fn)``, disallowing = ``unsafe`` operations to be contained > +in ``unsafe`` functions without a surrounding ``unsafe`` block, an = example violating that would be:: > + > + unsafe fn zero_ptr(ptr: *mut u32) { > + *ptr =3D 0; > + } > + > +Denying code like this is becoming the default in modern editions of = the Rust language. It is also > +easy to see why we would want to deny such code: where would we put = the ``SAFETY`` comment for the > +pointer dereference? > + > +Further Pages > +------------- > + > +.. toctree:: > + :maxdepth: 1 > + > +.. only:: subproject and html > + > + Indices > + =3D=3D=3D=3D=3D=3D=3D > + > + * :ref:`genindex` > --=20 > 2.45.1 >=20 >=20 =E2=80=94 Daniel