* [PATCH v2 1/2] rust: add untrusted data abstraction
[not found] <20240925205244.873020-1-benno.lossin@proton.me>
@ 2024-09-25 20:53 ` Benno Lossin
2024-09-26 7:08 ` Dirk Behme
` (2 more replies)
2024-09-25 20:53 ` [PATCH v2 2/2] rust: switch uaccess to untrusted data API Benno Lossin
1 sibling, 3 replies; 15+ messages in thread
From: Benno Lossin @ 2024-09-25 20:53 UTC (permalink / raw)
To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross
Cc: Greg KH, Simona Vetter, linux-kernel, rust-for-linux
When reading data from userspace, hardware or other external untrusted
sources, the data must be validated before it is used for logic within
the kernel. This abstraction provides a generic newtype wrapper
`Untrusted`; it prevents direct access to the inner type. The only way
to use the underlying data is to call `.validate()` on such a value.
Doing so utilizes the new `Validate` trait that is responsible for all
of the validation logic. This trait gives access to the inner value of
`Untrusted` by means of another newtype wrapper `Unvalidated`. In
contrast to `Untrusted`, `Unvalidated` allows direct access and
additionally provides several helper functions for slices.
Having these two different newtype wrappers is an idea from Simona
Vetter. It has several benefits: it fully prevents safe access to the
underlying value of `Untrusted` without going through the `Validate`
API. Additionally, it allows one to grep for validation logic by simply
looking for `Unvalidated<`.
Any API that reads data from an untrusted source should return
`Untrusted<T>` where `T` is the type of the underlying untrusted data.
This generic allows other abstractions to return their custom type
wrapped by `Untrusted`, signaling to the caller that the data must be
validated before use. This allows those abstractions to be used both in
a trusted and untrusted manner, increasing their generality.
Additionally, using the arbitrary self types feature, APIs can be
designed to explicitly read untrusted data:
impl MyCustomDataSource {
pub fn read(self: &Untrusted<Self>) -> &Untrusted<[u8]>;
}
Cc: Simona Vetter <simona.vetter@ffwll.ch>
Signed-off-by: Benno Lossin <benno.lossin@proton.me>
---
rust/kernel/lib.rs | 1 +
rust/kernel/validate.rs | 602 ++++++++++++++++++++++++++++++++++++++++
2 files changed, 603 insertions(+)
create mode 100644 rust/kernel/validate.rs
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index f10b06a78b9d..3125936eae45 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -54,6 +54,7 @@
pub mod time;
pub mod types;
pub mod uaccess;
+pub mod validate;
pub mod workqueue;
#[doc(hidden)]
diff --git a/rust/kernel/validate.rs b/rust/kernel/validate.rs
new file mode 100644
index 000000000000..b325349e7dc3
--- /dev/null
+++ b/rust/kernel/validate.rs
@@ -0,0 +1,604 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Types for handling and validating untrusted data.
+//!
+//! # Overview
+//!
+//! Untrusted data is marked using the [`Untrusted<T>`] type. See [Rationale](#rationale) for the
+//! reasons to mark untrusted data throught the kernel. It is a totally opaque wrapper, it is not
+//! possible to read the data inside; but it is possible to [`Untrusted::write`] into it.
+//!
+//! The only way to "access" the data inside an [`Untrusted<T>`] is to [`Untrusted::validate`] it;
+//! turning it into a different form using the [`Validate`] trait. That trait receives the data in
+//! the form of [`Unvalidated<T>`], which in contrast to [`Untrusted<T>`], allows access to the
+//! underlying data. It additionally provides several utility functions to simplify validation.
+//!
+//! # Rationale
+//!
+//! When reading data from an untrusted source, it must be validated before it can be used for
+//! logic. For example, this is a very bad idea:
+//!
+//! ```
+//! # fn read_bytes_from_network() -> Box<[u8]> {
+//! # Box::new([1, 0], kernel::alloc::flags::GFP_KERNEL).unwrap()
+//! # }
+//! let bytes: Box<[u8]> = read_bytes_from_network();
+//! let data_index = bytes[0];
+//! let data = bytes[usize::from(data_index)];
+//! ```
+//!
+//! While this will not lead to a memory violation (because the array index checks the bounds), it
+//! might result in a kernel panic. For this reason, all untrusted data must be wrapped in
+//! [`Untrusted<T>`]. This type only allows validating the data or passing it along, since copying
+//! data from one userspace buffer into another is allowed for untrusted data.
+
+use crate::init::Init;
+use core::{
+ mem::MaybeUninit,
+ ops::{Index, IndexMut},
+ ptr, slice,
+};
+
+/// Untrusted data of type `T`.
+///
+/// When reading data from userspace, hardware or other external untrusted sources, the data must
+/// be validated before it is used for logic within the kernel. To do so, the [`validate()`]
+/// function exists and uses the [`Validate`] trait.
+///
+/// Also see the [module] description.
+///
+/// [`validate()`]: Self::validate
+/// [module]: self
+#[repr(transparent)]
+pub struct Untrusted<T: ?Sized>(Unvalidated<T>);
+
+impl<T: ?Sized> Untrusted<T> {
+ /// Marks the given value as untrusted.
+ ///
+ /// # Examples
+ ///
+ /// ```
+ /// use kernel::validate::Untrusted;
+ ///
+ /// # mod bindings { pub(crate) unsafe fn read_foo_info() -> [u8; 4] { todo!() } };
+ /// fn read_foo_info() -> Untrusted<[u8; 4]> {
+ /// // SAFETY: just an FFI call without preconditions.
+ /// Untrusted::new(unsafe { bindings::read_foo_info() })
+ /// }
+ /// ```
+ pub fn new(value: T) -> Self
+ where
+ T: Sized,
+ {
+ Self(Unvalidated::new(value))
+ }
+
+ /// Marks the value behind the reference as untrusted.
+ ///
+ /// # Examples
+ ///
+ /// In this imaginary example there exists the `foo_hardware` struct on the C side, as well as
+ /// a `foo_hardware_read` function that reads some data directly from the hardware.
+ /// ```
+ /// use kernel::{error, types::Opaque, validate::Untrusted};
+ /// use core::ptr;
+ ///
+ /// # #[allow(non_camel_case_types)]
+ /// # mod bindings {
+ /// # pub(crate) struct foo_hardware;
+ /// # pub(crate) unsafe fn foo_hardware_read(_foo: *mut foo_hardware, _len: &mut usize) -> *mut u8 {
+ /// # todo!()
+ /// # }
+ /// # }
+ /// struct Foo(Opaque<bindings::foo_hardware>);
+ ///
+ /// impl Foo {
+ /// fn read(&mut self, mut len: usize) -> Result<&Untrusted<[u8]>> {
+ /// // SAFETY: just an FFI call without preconditions.
+ /// let data: *mut u8 = unsafe { bindings::foo_hardware_read(self.0.get(), &mut len) };
+ /// let data = error::from_err_ptr(data)?;
+ /// let data = ptr::slice_from_raw_parts(data, len);
+ /// // SAFETY: `data` returned by `foo_hardware_read` is valid for reads as long as the
+ /// // `foo_hardware` object exists. That function updated the
+ /// let data = unsafe { &*data };
+ /// Ok(Untrusted::new_ref(data))
+ /// }
+ /// }
+ /// ```
+ pub fn new_ref(value: &T) -> &Self {
+ let ptr: *const T = value;
+ // CAST: `Self` and `Unvalidated` are `repr(transparent)` and contain a `T`.
+ let ptr = ptr as *const Self;
+ // SAFETY: `ptr` came from a shared reference valid for `'a`.
+ unsafe { &*ptr }
+ }
+
+ /// Marks the value behind the reference as untrusted.
+ ///
+ /// # Examples
+ ///
+ /// In this imaginary example there exists the `foo_hardware` struct on the C side, as well as
+ /// a `foo_hardware_read` function that reads some data directly from the hardware.
+ /// ```
+ /// use kernel::{error, types::Opaque, validate::Untrusted};
+ /// use core::ptr;
+ ///
+ /// # #[allow(non_camel_case_types)]
+ /// # mod bindings {
+ /// # pub(crate) struct foo_hardware;
+ /// # pub(crate) unsafe fn foo_hardware_read(_foo: *mut foo_hardware, _len: &mut usize) -> *mut u8 {
+ /// # todo!()
+ /// # }
+ /// # }
+ /// struct Foo(Opaque<bindings::foo_hardware>);
+ ///
+ /// impl Foo {
+ /// fn read(&mut self, mut len: usize) -> Result<&mut Untrusted<[u8]>> {
+ /// // SAFETY: just an FFI call without preconditions.
+ /// let data: *mut u8 = unsafe { bindings::foo_hardware_read(self.0.get(), &mut len) };
+ /// let data = error::from_err_ptr(data)?;
+ /// let data = ptr::slice_from_raw_parts_mut(data, len);
+ /// // SAFETY: `data` returned by `foo_hardware_read` is valid for reads as long as the
+ /// // `foo_hardware` object exists. That function updated the
+ /// let data = unsafe { &mut *data };
+ /// Ok(Untrusted::new_mut(data))
+ /// }
+ /// }
+ /// ```
+ pub fn new_mut(value: &mut T) -> &mut Self {
+ let ptr: *mut T = value;
+ // CAST: `Self` and `Unvalidated` are `repr(transparent)` and contain a `T`.
+ let ptr = ptr as *mut Self;
+ // SAFETY: `ptr` came from a mutable reference valid for `'a`.
+ unsafe { &mut *ptr }
+ }
+
+ /// Validates and parses the untrusted data.
+ ///
+ /// See the [`Validate`] trait on how to implement it.
+ pub fn validate<'a, V: Validate<&'a Unvalidated<T>>>(&'a self) -> Result<V, V::Err> {
+ V::validate(&self.0)
+ }
+
+ /// Validates and parses the untrusted data.
+ ///
+ /// See the [`Validate`] trait on how to implement it.
+ pub fn validate_mut<'a, V: Validate<&'a mut Unvalidated<T>>>(
+ &'a mut self,
+ ) -> Result<V, V::Err> {
+ V::validate(&mut self.0)
+ }
+
+ /// Sets the underlying untrusted value.
+ ///
+ /// # Examples
+ ///
+ /// ```
+ /// use kernel::validate::Untrusted;
+ ///
+ /// let mut untrusted = Untrusted::new(42);
+ /// untrusted.write(24);
+ /// ```
+ pub fn write(&mut self, value: impl Init<T>) {
+ let ptr: *mut T = &mut self.0 .0;
+ // SAFETY: `ptr` came from a mutable reference and the value is overwritten before it is
+ // read.
+ unsafe { ptr::drop_in_place(ptr) };
+ // SAFETY: `ptr` came from a mutable reference and the initializer cannot error.
+ match unsafe { value.__init(ptr) } {
+ Ok(()) => {}
+ }
+ }
+
+ /// Turns a slice of untrusted values into an untrusted slice of values.
+ pub fn transpose_slice(slice: &[Untrusted<T>]) -> &Untrusted<[T]>
+ where
+ T: Sized,
+ {
+ let ptr = slice.as_ptr().cast::<T>();
+ // SAFETY: `ptr` and `len` come from the same slice reference.
+ let slice = unsafe { slice::from_raw_parts(ptr, slice.len()) };
+ Untrusted::new_ref(slice)
+ }
+
+ /// Turns a slice of uninitialized, untrusted values into an untrusted slice of uninitialized
+ /// values.
+ pub fn transpose_slice_uninit(
+ slice: &[MaybeUninit<Untrusted<T>>],
+ ) -> &Untrusted<[MaybeUninit<T>]>
+ where
+ T: Sized,
+ {
+ let ptr = slice.as_ptr().cast::<MaybeUninit<T>>();
+ // SAFETY: `ptr` and `len` come from the same mutable slice reference.
+ let slice = unsafe { slice::from_raw_parts(ptr, slice.len()) };
+ Untrusted::new_ref(slice)
+ }
+
+ /// Turns a slice of uninitialized, untrusted values into an untrusted slice of uninitialized
+ /// values.
+ pub fn transpose_slice_uninit_mut(
+ slice: &mut [MaybeUninit<Untrusted<T>>],
+ ) -> &mut Untrusted<[MaybeUninit<T>]>
+ where
+ T: Sized,
+ {
+ // CAST: `MaybeUninit<T>` and `MaybeUninit<Untrusted<T>>` have the same layout.
+ let ptr = slice.as_mut_ptr().cast::<MaybeUninit<T>>();
+ // SAFETY: `ptr` and `len` come from the same mutable slice reference.
+ let slice = unsafe { slice::from_raw_parts_mut(ptr, slice.len()) };
+ Untrusted::new_mut(slice)
+ }
+}
+
+impl<T> Untrusted<MaybeUninit<T>> {
+ /// Sets the underlying untrusted value.
+ ///
+ /// # Examples
+ ///
+ /// ```
+ /// use kernel::validate::Untrusted;
+ ///
+ /// let mut untrusted = Untrusted::new(42);
+ /// untrusted.write(24);
+ /// ```
+ pub fn write_uninit<E>(&mut self, value: impl Init<T, E>) -> Result<&mut Untrusted<T>, E> {
+ let ptr: *mut MaybeUninit<T> = &mut self.0 .0;
+ // CAST: `MaybeUninit<T>` is `repr(transparent)`.
+ let ptr = ptr.cast::<T>();
+ // SAFETY: `ptr` came from a reference and if `Err` is returned, the underlying memory is
+ // considered uninitialized.
+ unsafe { value.__init(ptr) }.map(|()| {
+ let this = self.0.raw_mut();
+ // SAFETY: we initialized the memory above.
+ Untrusted::new_mut(unsafe { this.assume_init_mut() })
+ })
+ }
+}
+
+impl<T> Untrusted<[MaybeUninit<T>]> {
+ /// Sets the underlying untrusted value.
+ ///
+ /// # Examples
+ ///
+ /// ```
+ /// use kernel::validate::Untrusted;
+ ///
+ /// let mut untrusted = Untrusted::new(42);
+ /// untrusted.write(24);
+ /// ```
+ pub fn write_uninit_slice<E>(
+ &mut self,
+ value: impl Init<[T], E>,
+ ) -> Result<&mut Untrusted<[T]>, E> {
+ let ptr: *mut [MaybeUninit<T>] = &mut self.0 .0;
+ // CAST: `MaybeUninit<T>` is `repr(transparent)`.
+ let ptr = ptr as *mut [T];
+ // SAFETY: `ptr` came from a reference and if `Err` is returned, the underlying memory is
+ // considered uninitialized.
+ unsafe { value.__init(ptr) }.map(|()| {
+ let this = self.0.raw_mut().as_mut_ptr();
+ // CAST: `MaybeUninit<T>` is `repr(transparent)`.
+ let this = this.cast::<T>();
+ // SAFETY: `this` and `len` came from the same slice reference.
+ let this = unsafe { slice::from_raw_parts_mut(this, self.0.len()) };
+ Untrusted::new_mut(this)
+ })
+ }
+}
+
+/// Marks types that can be used as input to [`Validate::validate`].
+pub trait ValidateInput: private::Sealed + Sized {}
+
+mod private {
+ pub trait Sealed {}
+}
+
+impl<'a, T: ?Sized> private::Sealed for &'a Unvalidated<T> {}
+impl<'a, T: ?Sized> ValidateInput for &'a Unvalidated<T> {}
+
+impl<'a, T: ?Sized> private::Sealed for &'a mut Unvalidated<T> {}
+impl<'a, T: ?Sized> ValidateInput for &'a mut Unvalidated<T> {}
+
+/// Validates untrusted data.
+///
+/// # Examples
+///
+/// The simplest way to validate data is to just implement `Validate<&Unvalidated<[u8]>>` for the
+/// type that you wish to validate:
+///
+/// ```
+/// use kernel::{
+/// error::{code::EINVAL, Error},
+/// str::{CStr, CString},
+/// validate::{Unvalidated, Validate},
+/// };
+///
+/// struct Data {
+/// flags: u8,
+/// name: CString,
+/// }
+///
+/// impl Validate<&Unvalidated<[u8]>> for Data {
+/// type Err = Error;
+///
+/// fn validate(unvalidated: &Unvalidated<[u8]>) -> Result<Self, Self::Err> {
+/// let raw = unvalidated.raw();
+/// let (&flags, name) = raw.split_first().ok_or(EINVAL)?;
+/// let name = CStr::from_bytes_with_nul(name)?.to_cstring()?;
+/// Ok(Data { flags, name })
+/// }
+/// }
+/// ```
+///
+/// This approach copies the data and requires allocation. If you want to avoid the allocation and
+/// copying the data, you can borrow from the input like this:
+///
+/// ```
+/// use kernel::{
+/// error::{code::EINVAL, Error},
+/// str::CStr,
+/// validate::{Unvalidated, Validate},
+/// };
+///
+/// struct Data<'a> {
+/// flags: u8,
+/// name: &'a CStr,
+/// }
+///
+/// impl<'a> Validate<&'a Unvalidated<[u8]>> for Data<'a> {
+/// type Err = Error;
+///
+/// fn validate(unvalidated: &'a Unvalidated<[u8]>) -> Result<Self, Self::Err> {
+/// let raw = unvalidated.raw();
+/// let (&flags, name) = raw.split_first().ok_or(EINVAL)?;
+/// let name = CStr::from_bytes_with_nul(name)?;
+/// Ok(Data { flags, name })
+/// }
+/// }
+/// ```
+///
+/// If you need to in-place validate your data, you currently need to resort to `unsafe`:
+///
+/// ```
+/// use kernel::{
+/// error::{code::EINVAL, Error},
+/// str::CStr,
+/// validate::{Unvalidated, Validate},
+/// };
+/// use core::mem;
+///
+/// // Important: use `repr(C)`, this ensures a linear layout of this type.
+/// #[repr(C)]
+/// struct Data {
+/// version: u8,
+/// flags: u8,
+/// _reserved: [u8; 2],
+/// count: u64,
+/// // lots of other fields...
+/// }
+///
+/// impl Validate<&Unvalidated<[u8]>> for &Data {
+/// type Err = Error;
+///
+/// fn validate(unvalidated: &Unvalidated<[u8]>) -> Result<Self, Self::Err> {
+/// let raw = unvalidated.raw();
+/// if raw.len() < mem::size_of::<Data>() {
+/// return Err(EINVAL);
+/// }
+/// // can only handle version 0
+/// if raw[0] != 0 {
+/// return Err(EINVAL);
+/// }
+/// // version 0 only uses the lower 4 bits of flags
+/// if raw[1] & 0xf0 != 0 {
+/// return Err(EINVAL);
+/// }
+/// let ptr = raw.as_ptr();
+/// // CAST: `Data` only contains integers and has `repr(C)`.
+/// let ptr = ptr.cast::<Data>();
+/// // SAFETY: `ptr` came from a reference and the cast above is valid.
+/// Ok(unsafe { &*ptr })
+/// }
+/// }
+/// ```
+///
+/// To be able to modify the parsed data, while still supporting zero-copy, you can implement
+/// `Validate<&mut Unvalidated<[u8]>>`:
+///
+/// ```
+/// use kernel::{
+/// error::{code::EINVAL, Error},
+/// str::CStr,
+/// validate::{Unvalidated, Validate},
+/// };
+/// use core::mem;
+///
+/// // Important: use `repr(C)`, this ensures a linear layout of this type.
+/// #[repr(C)]
+/// struct Data {
+/// version: u8,
+/// flags: u8,
+/// _reserved: [u8; 2],
+/// count: u64,
+/// // lots of other fields...
+/// }
+///
+/// impl Validate<&mut Unvalidated<[u8]>> for &Data {
+/// type Err = Error;
+///
+/// fn validate(unvalidated: &mut Unvalidated<[u8]>) -> Result<Self, Self::Err> {
+/// let raw = unvalidated.raw_mut();
+/// if raw.len() < mem::size_of::<Data>() {
+/// return Err(EINVAL);
+/// }
+/// match raw[0] {
+/// 0 => {},
+/// 1 => {
+/// // version 1 implicitly sets the first bit.
+/// raw[1] |= 1;
+/// },
+/// // can only handle version 0 and 1
+/// _ => return Err(EINVAL),
+/// }
+/// // version 0 and 1 only use the lower 4 bits of flags
+/// if raw[1] & 0xf0 != 0 {
+/// return Err(EINVAL);
+/// }
+/// if raw[1] == 0 {}
+/// let ptr = raw.as_ptr();
+/// // CAST: `Data` only contains integers and has `repr(C)`.
+/// let ptr = ptr.cast::<Data>();
+/// // SAFETY: `ptr` came from a reference and the cast above is valid.
+/// Ok(unsafe { &*ptr })
+/// }
+/// }
+/// ```
+pub trait Validate<I: ValidateInput>: Sized {
+ /// Validation error.
+ type Err;
+
+ /// Validate the given untrusted data and parse it into the output type.
+ fn validate(unvalidated: I) -> Result<Self, Self::Err>;
+}
+
+/// Unvalidated data of type `T`.
+#[repr(transparent)]
+pub struct Unvalidated<T: ?Sized>(T);
+
+impl<T: ?Sized> Unvalidated<T> {
+ fn new(value: T) -> Self
+ where
+ T: Sized,
+ {
+ Self(value)
+ }
+
+ fn new_ref(value: &T) -> &Self {
+ let ptr: *const T = value;
+ // CAST: `Self` is `repr(transparent)` and contains a `T`.
+ let ptr = ptr as *const Self;
+ // SAFETY: `ptr` came from a mutable reference valid for `'a`.
+ unsafe { &*ptr }
+ }
+
+ fn new_mut(value: &mut T) -> &mut Self {
+ let ptr: *mut T = value;
+ // CAST: `Self` is `repr(transparent)` and contains a `T`.
+ let ptr = ptr as *mut Self;
+ // SAFETY: `ptr` came from a mutable reference valid for `'a`.
+ unsafe { &mut *ptr }
+ }
+
+ /// Validates and parses the untrusted data.
+ ///
+ /// See the [`Validate`] trait on how to implement it.
+ pub fn validate_ref<'a, V: Validate<&'a Unvalidated<T>>>(&'a self) -> Result<V, V::Err> {
+ V::validate(self)
+ }
+
+ /// Validates and parses the untrusted data.
+ ///
+ /// See the [`Validate`] trait on how to implement it.
+ pub fn validate_mut<'a, V: Validate<&'a mut Unvalidated<T>>>(
+ &'a mut self,
+ ) -> Result<V, V::Err> {
+ V::validate(self)
+ }
+
+ /// Gives immutable access to the underlying value.
+ pub fn raw(&self) -> &T {
+ &self.0
+ }
+
+ /// Gives mutable access to the underlying value.
+ pub fn raw_mut(&mut self) -> &mut T {
+ &mut self.0
+ }
+}
+
+impl<T, I> Index<I> for Unvalidated<[T]>
+where
+ I: slice::SliceIndex<[T]>,
+{
+ type Output = Unvalidated<I::Output>;
+
+ fn index(&self, index: I) -> &Self::Output {
+ Unvalidated::new_ref(self.0.index(index))
+ }
+}
+
+impl<T, I> IndexMut<I> for Unvalidated<[T]>
+where
+ I: slice::SliceIndex<[T]>,
+{
+ fn index_mut(&mut self, index: I) -> &mut Self::Output {
+ Unvalidated::new_mut(self.0.index_mut(index))
+ }
+}
+
+/// Immutable unvalidated slice iterator.
+pub struct Iter<'a, T>(slice::Iter<'a, T>);
+
+/// Mutable unvalidated slice iterator.
+pub struct IterMut<'a, T>(slice::IterMut<'a, T>);
+
+impl<'a, T> Iterator for Iter<'a, T> {
+ type Item = &'a Unvalidated<T>;
+
+ fn next(&mut self) -> Option<Self::Item> {
+ self.0.next().map(Unvalidated::new_ref)
+ }
+}
+
+impl<'a, T> IntoIterator for &'a Unvalidated<[T]> {
+ type Item = &'a Unvalidated<T>;
+ type IntoIter = Iter<'a, T>;
+
+ fn into_iter(self) -> Self::IntoIter {
+ Iter(self.0.iter())
+ }
+}
+
+impl<'a, T> Iterator for IterMut<'a, T> {
+ type Item = &'a mut Unvalidated<T>;
+
+ fn next(&mut self) -> Option<Self::Item> {
+ self.0.next().map(Unvalidated::new_mut)
+ }
+}
+
+impl<'a, T> IntoIterator for &'a mut Unvalidated<[T]> {
+ type Item = &'a mut Unvalidated<T>;
+ type IntoIter = IterMut<'a, T>;
+
+ fn into_iter(self) -> Self::IntoIter {
+ IterMut(self.0.iter_mut())
+ }
+}
+
+impl<T> Unvalidated<[T]> {
+ /// Returns the number of elements in the underlying slice.
+ pub fn len(&self) -> usize {
+ self.0.len()
+ }
+
+ /// Returns true if the underlying slice has a length of 0.
+ pub fn is_empty(&self) -> bool {
+ self.0.is_empty()
+ }
+
+ /// Iterates over all items and validates each of them individually.
+ pub fn validate_iter<'a, V: Validate<&'a Unvalidated<T>>>(
+ &'a self,
+ ) -> impl Iterator<Item = Result<V, V::Err>> + 'a {
+ self.into_iter().map(|item| V::validate(item))
+ }
+
+ /// Iterates over all items and validates each of them individually.
+ pub fn validate_iter_mut<'a, V: Validate<&'a mut Unvalidated<T>>>(
+ &'a mut self,
+ ) -> impl Iterator<Item = Result<V, V::Err>> + 'a {
+ self.into_iter().map(|item| V::validate(item))
+ }
+}
--
2.46.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v2 2/2] rust: switch uaccess to untrusted data API
[not found] <20240925205244.873020-1-benno.lossin@proton.me>
2024-09-25 20:53 ` [PATCH v2 1/2] rust: add untrusted data abstraction Benno Lossin
@ 2024-09-25 20:53 ` Benno Lossin
2024-09-26 11:09 ` Simona Vetter
2024-09-26 23:56 ` kernel test robot
1 sibling, 2 replies; 15+ messages in thread
From: Benno Lossin @ 2024-09-25 20:53 UTC (permalink / raw)
To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Björn Roy Baron, Benno Lossin, Andreas Hindborg, Alice Ryhl,
Trevor Gross
Cc: Greg KH, Simona Vetter, rust-for-linux, linux-kernel
Userspace is an untrusted data source, so all data that can be read from
there is untrusted. Therefore use the new untrusted API for any data
returned from it.
This makes it significantly harder to write TOCTOU bug, the example bug
in the documentation cannot easily be converted to use the new API. Thus
it is removed.
A possible future improvement is to change how `UserSliceWriter` exposes
writing to userspace: a trait `ToUserspace` TBB (to be bikeshed) could
allow users to write untrusted data to a userpointer without adding more
methods to it.
Signed-off-by: Benno Lossin <benno.lossin@proton.me>
---
rust/kernel/page.rs | 8 ++-
rust/kernel/uaccess.rs | 135 +++++++++++++++++++++--------------------
2 files changed, 74 insertions(+), 69 deletions(-)
diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
index 208a006d587c..74ce86326893 100644
--- a/rust/kernel/page.rs
+++ b/rust/kernel/page.rs
@@ -5,9 +5,9 @@
use crate::{
alloc::{AllocError, Flags},
bindings,
- error::code::*,
- error::Result,
+ error::{code::*, Result},
uaccess::UserSliceReader,
+ validate::Untrusted,
};
use core::ptr::{self, NonNull};
@@ -237,8 +237,10 @@ pub unsafe fn copy_from_user_slice_raw(
// SAFETY: If `with_pointer_into_page` calls into this closure, then it has performed a
// bounds check and guarantees that `dst` is valid for `len` bytes. Furthermore, we have
// exclusive access to the slice since the caller guarantees that there are no races.
- reader.read_raw(unsafe { core::slice::from_raw_parts_mut(dst.cast(), len) })
+ let slice = unsafe { core::slice::from_raw_parts_mut(dst.cast(), len) };
+ reader.read_raw(Untrusted::new_mut(slice))
})
+ .map(|_| ())
}
}
diff --git a/rust/kernel/uaccess.rs b/rust/kernel/uaccess.rs
index e9347cff99ab..3d312f845269 100644
--- a/rust/kernel/uaccess.rs
+++ b/rust/kernel/uaccess.rs
@@ -10,10 +10,12 @@
error::Result,
prelude::*,
types::{AsBytes, FromBytes},
+ validate::Untrusted,
};
use alloc::vec::Vec;
use core::ffi::{c_ulong, c_void};
use core::mem::{size_of, MaybeUninit};
+use init::init_from_closure;
/// The type used for userspace addresses.
pub type UserPtr = usize;
@@ -47,59 +49,39 @@
///
/// ```no_run
/// use alloc::vec::Vec;
-/// use core::ffi::c_void;
-/// use kernel::error::Result;
-/// use kernel::uaccess::{UserPtr, UserSlice};
+/// use core::{convert::Infallible, ffi::c_void};
+/// use kernel::{error::Result, uaccess::{UserPtr, UserSlice}, validate::{Unvalidated, Untrusted, Validate}};
///
-/// fn bytes_add_one(uptr: UserPtr, len: usize) -> Result<()> {
-/// let (read, mut write) = UserSlice::new(uptr, len).reader_writer();
+/// struct AddOne<'a>(&'a mut u8);
///
-/// let mut buf = Vec::new();
-/// read.read_all(&mut buf, GFP_KERNEL)?;
+/// impl<'a> Validate<&'a mut Unvalidated<u8>> for AddOne<'a> {
+/// type Err = Infallible;
///
-/// for b in &mut buf {
-/// *b = b.wrapping_add(1);
+/// fn validate(unvalidated: &'a mut Unvalidated<u8>) -> Result<Self, Self::Err> {
+/// // We are not doing any kind of validation here on purpose. After all, we only want to
+/// // increment the value and write it back.
+/// Ok(Self(unvalidated.raw_mut()))
/// }
-///
-/// write.write_slice(&buf)?;
-/// Ok(())
/// }
-/// ```
-///
-/// Example illustrating a TOCTOU (time-of-check to time-of-use) bug.
///
-/// ```no_run
-/// use alloc::vec::Vec;
-/// use core::ffi::c_void;
-/// use kernel::error::{code::EINVAL, Result};
-/// use kernel::uaccess::{UserPtr, UserSlice};
+/// impl AddOne<'_> {
+/// fn inc(&mut self) {
+/// *self.0 = self.0.wrapping_add(1);
+/// }
+/// }
///
-/// /// Returns whether the data in this region is valid.
-/// fn is_valid(uptr: UserPtr, len: usize) -> Result<bool> {
-/// let read = UserSlice::new(uptr, len).reader();
+/// fn bytes_add_one(uptr: UserPtr, len: usize) -> Result<()> {
+/// let (read, mut write) = UserSlice::new(uptr, len).reader_writer();
///
/// let mut buf = Vec::new();
/// read.read_all(&mut buf, GFP_KERNEL)?;
///
-/// todo!()
-/// }
-///
-/// /// Returns the bytes behind this user pointer if they are valid.
-/// fn get_bytes_if_valid(uptr: UserPtr, len: usize) -> Result<Vec<u8>> {
-/// if !is_valid(uptr, len)? {
-/// return Err(EINVAL);
+/// for b in &mut buf {
+/// b.validate_mut::<AddOne<'_>>()?.inc();
/// }
///
-/// let read = UserSlice::new(uptr, len).reader();
-///
-/// let mut buf = Vec::new();
-/// read.read_all(&mut buf, GFP_KERNEL)?;
-///
-/// // THIS IS A BUG! The bytes could have changed since we checked them.
-/// //
-/// // To avoid this kind of bug, don't call `UserSlice::new` multiple
-/// // times with the same address.
-/// Ok(buf)
+/// write.write_untrusted_slice(Untrusted::transpose_slice(&buf))?;
+/// Ok(())
/// }
/// ```
///
@@ -130,7 +112,7 @@ pub fn new(ptr: UserPtr, length: usize) -> Self {
/// Reads the entirety of the user slice, appending it to the end of the provided buffer.
///
/// Fails with [`EFAULT`] if the read happens on a bad address.
- pub fn read_all(self, buf: &mut Vec<u8>, flags: Flags) -> Result {
+ pub fn read_all(self, buf: &mut Vec<Untrusted<u8>>, flags: Flags) -> Result {
self.reader().read_all(buf, flags)
}
@@ -218,38 +200,47 @@ pub fn is_empty(&self) -> bool {
/// Fails with [`EFAULT`] if the read happens on a bad address, or if the read goes out of
/// bounds of this [`UserSliceReader`]. This call may modify `out` even if it returns an error.
///
+ /// Returns a reference to the initialized bytes in `out`.
+ ///
/// # Guarantees
///
/// After a successful call to this method, all bytes in `out` are initialized.
- pub fn read_raw(&mut self, out: &mut [MaybeUninit<u8>]) -> Result {
- let len = out.len();
- let out_ptr = out.as_mut_ptr().cast::<c_void>();
- if len > self.length {
- return Err(EFAULT);
- }
- let Ok(len_ulong) = c_ulong::try_from(len) else {
- return Err(EFAULT);
+ pub fn read_raw<'a>(
+ &'a mut self,
+ out: &'a mut Untrusted<[MaybeUninit<u8>]>,
+ ) -> Result<&'a mut Untrusted<[u8]>> {
+ let init = |ptr: *mut [u8]| {
+ let out_ptr = ptr.cast::<c_void>();
+ let len = ptr.len();
+ if len > self.length {
+ return Err(EFAULT);
+ }
+ let Ok(len_ulong) = c_ulong::try_from(len) else {
+ return Err(EFAULT);
+ };
+ // SAFETY: `out_ptr` points into a mutable slice of length `len_ulong`, so we may write
+ // that many bytes to it.
+ let res =
+ unsafe { bindings::copy_from_user(out_ptr, self.ptr as *const c_void, len_ulong) };
+ if res != 0 {
+ return Err(EFAULT);
+ }
+ self.ptr = self.ptr.wrapping_add(len);
+ self.length -= len;
+ Ok(())
};
- // SAFETY: `out_ptr` points into a mutable slice of length `len_ulong`, so we may write
- // that many bytes to it.
- let res =
- unsafe { bindings::copy_from_user(out_ptr, self.ptr as *const c_void, len_ulong) };
- if res != 0 {
- return Err(EFAULT);
- }
- self.ptr = self.ptr.wrapping_add(len);
- self.length -= len;
- Ok(())
+ out.write_uninit_slice(unsafe { init_from_closure(init) })
}
/// Reads raw data from the user slice into a kernel buffer.
///
/// Fails with [`EFAULT`] if the read happens on a bad address, or if the read goes out of
/// bounds of this [`UserSliceReader`]. This call may modify `out` even if it returns an error.
- pub fn read_slice(&mut self, out: &mut [u8]) -> Result {
+ pub fn read_slice(&mut self, out: &mut Untrusted<[u8]>) -> Result<&mut Untrusted<[u8]>> {
// SAFETY: The types are compatible and `read_raw` doesn't write uninitialized bytes to
// `out`.
- let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
+ let out =
+ unsafe { &mut *(out as *mut Untrusted<[u8]> as *mut Untrusted<[MaybeUninit<u8>]>) };
self.read_raw(out)
}
@@ -291,13 +282,15 @@ pub fn read<T: FromBytes>(&mut self) -> Result<T> {
/// Reads the entirety of the user slice, appending it to the end of the provided buffer.
///
/// Fails with [`EFAULT`] if the read happens on a bad address.
- pub fn read_all(mut self, buf: &mut Vec<u8>, flags: Flags) -> Result {
+ pub fn read_all(mut self, buf: &mut Vec<Untrusted<u8>>, flags: Flags) -> Result {
let len = self.length;
- VecExt::<u8>::reserve(buf, len, flags)?;
+ VecExt::<_>::reserve(buf, len, flags)?;
// The call to `try_reserve` was successful, so the spare capacity is at least `len` bytes
// long.
- self.read_raw(&mut buf.spare_capacity_mut()[..len])?;
+ self.read_raw(Untrusted::transpose_slice_uninit_mut(
+ &mut buf.spare_capacity_mut()[..len],
+ ))?;
// SAFETY: Since the call to `read_raw` was successful, so the next `len` bytes of the
// vector have been initialized.
@@ -333,8 +326,18 @@ pub fn is_empty(&self) -> bool {
/// bounds of this [`UserSliceWriter`]. This call may modify the associated userspace slice even
/// if it returns an error.
pub fn write_slice(&mut self, data: &[u8]) -> Result {
- let len = data.len();
- let data_ptr = data.as_ptr().cast::<c_void>();
+ self.write_untrusted_slice(Untrusted::new_ref(data))
+ }
+
+ /// Writes raw data to this user pointer from a kernel buffer.
+ ///
+ /// Fails with [`EFAULT`] if the write happens on a bad address, or if the write goes out of
+ /// bounds of this [`UserSliceWriter`]. This call may modify the associated userspace slice even
+ /// if it returns an error.
+ pub fn write_untrusted_slice(&mut self, data: &Untrusted<[u8]>) -> Result {
+ let data_ptr = (data as *const _) as *const [u8];
+ let len = data_ptr.len();
+ let data_ptr = data_ptr.cast::<c_void>();
if len > self.length {
return Err(EFAULT);
}
--
2.46.0
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-25 20:53 ` [PATCH v2 1/2] rust: add untrusted data abstraction Benno Lossin
@ 2024-09-26 7:08 ` Dirk Behme
2024-09-26 10:40 ` Simona Vetter
2024-09-26 20:31 ` kernel test robot
2 siblings, 0 replies; 15+ messages in thread
From: Dirk Behme @ 2024-09-26 7:08 UTC (permalink / raw)
To: Benno Lossin, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Björn Roy Baron, Andreas Hindborg, Alice Ryhl, Trevor Gross
Cc: Greg KH, Simona Vetter, linux-kernel, rust-for-linux
Hi Benno,
just some quick findings:
On 25.09.2024 22:53, Benno Lossin wrote:
....
> diff --git a/rust/kernel/validate.rs b/rust/kernel/validate.rs
> new file mode 100644
> index 000000000000..b325349e7dc3
> --- /dev/null
> +++ b/rust/kernel/validate.rs
> @@ -0,0 +1,604 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Types for handling and validating untrusted data.
> +//!
> +//! # Overview
> +//!
> +//! Untrusted data is marked using the [`Untrusted<T>`] type. See [Rationale](#rationale) for the
> +//! reasons to mark untrusted data throught the kernel.
Typo? throught -> through (i.e. drop the trailing 't')?
...
> //! [`Untrusted<T>`]. This type only allows validating the data or
passing it along, since copying
> //! data from one userspace buffer into another is allowed for
untrusted data.
I wonder if we should drop the "userspace"? I mean untrusted data must
not be in a user space buffer, only? It could come e.g. from hardware as
well. Like in the read_bytes_from_network() example.
...
> + /// Marks the value behind the reference as untrusted.
> + ///
> + /// # Examples
> + ///
> + /// In this imaginary example there exists the `foo_hardware` struct on the C side, as well as
> + /// a `foo_hardware_read` function that reads some data directly from the hardware.
> + /// ```
> + /// use kernel::{error, types::Opaque, validate::Untrusted};
> + /// use core::ptr;
> + ///
> + /// # #[allow(non_camel_case_types)]
> + /// # mod bindings {
> + /// # pub(crate) struct foo_hardware;
> + /// # pub(crate) unsafe fn foo_hardware_read(_foo: *mut foo_hardware, _len: &mut usize) -> *mut u8 {
> + /// # todo!()
> + /// # }
> + /// # }
> + /// struct Foo(Opaque<bindings::foo_hardware>);
> + ///
> + /// impl Foo {
> + /// fn read(&mut self, mut len: usize) -> Result<&Untrusted<[u8]>> {
> + /// // SAFETY: just an FFI call without preconditions.
> + /// let data: *mut u8 = unsafe { bindings::foo_hardware_read(self.0.get(), &mut len) };
> + /// let data = error::from_err_ptr(data)?;
With my heavily patched 6.11-rc1 base for this I get:
error[E0603]: function `from_err_ptr` is private
--> rust/doctests_kernel_generated.rs:6749:27
|
6749 | let data = error::from_err_ptr(data)?;
| ^^^^^^^^^^^^ private function
|
note: the function `from_err_ptr` is defined here
--> rust/kernel/error.rs:271:1
|
271 | pub(crate) fn from_err_ptr<T>(ptr: *mut T) -> Result<*mut T> {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
And the same for the &mut Untrusted example.
> + /// Sets the underlying untrusted value.
> + ///
> + /// # Examples
> + ///
> + /// ```
> + /// use kernel::validate::Untrusted;
> + ///
> + /// let mut untrusted = Untrusted::new(42);
> + /// untrusted.write(24);
> + /// ```
> + pub fn write(&mut self, value: impl Init<T>) {
> + let ptr: *mut T = &mut self.0 .0;
> + // SAFETY: `ptr` came from a mutable reference and the value is overwritten before it is
> + // read.
> + unsafe { ptr::drop_in_place(ptr) };
> + // SAFETY: `ptr` came from a mutable reference and the initializer cannot error.
> + match unsafe { value.__init(ptr) } {
> + Ok(()) => {}
For this I get
--> rust/kernel/validate.rs:188:15
|
188 | match unsafe { value.__init(ptr) } {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pattern `Err(_)` not
covered
|
note: `core::result::Result<(), Infallible>` defined here
-->
.rustup/toolchains/1.80.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:527:1
|
527 | pub enum Result<T, E> {
| ^^^^^^^^^^^^^^^^^^^^^
...
536 | Err(#[stable(feature = "rust1", since = "1.0.0")] E),
| --- not covered
Just fyi in case its not due to my local patching ;)
Best regards
Dirk
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-25 20:53 ` [PATCH v2 1/2] rust: add untrusted data abstraction Benno Lossin
2024-09-26 7:08 ` Dirk Behme
@ 2024-09-26 10:40 ` Simona Vetter
2024-09-30 14:04 ` Benno Lossin
2024-09-26 20:31 ` kernel test robot
2 siblings, 1 reply; 15+ messages in thread
From: Simona Vetter @ 2024-09-26 10:40 UTC (permalink / raw)
To: Benno Lossin
Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Björn Roy Baron, Andreas Hindborg, Alice Ryhl, Trevor Gross,
Greg KH, Simona Vetter, linux-kernel, rust-for-linux
On Wed, Sep 25, 2024 at 08:53:05PM +0000, Benno Lossin wrote:
> When reading data from userspace, hardware or other external untrusted
> sources, the data must be validated before it is used for logic within
> the kernel. This abstraction provides a generic newtype wrapper
> `Untrusted`; it prevents direct access to the inner type. The only way
> to use the underlying data is to call `.validate()` on such a value.
>
> Doing so utilizes the new `Validate` trait that is responsible for all
> of the validation logic. This trait gives access to the inner value of
> `Untrusted` by means of another newtype wrapper `Unvalidated`. In
> contrast to `Untrusted`, `Unvalidated` allows direct access and
> additionally provides several helper functions for slices.
>
> Having these two different newtype wrappers is an idea from Simona
> Vetter. It has several benefits: it fully prevents safe access to the
> underlying value of `Untrusted` without going through the `Validate`
> API. Additionally, it allows one to grep for validation logic by simply
> looking for `Unvalidated<`.
>
> Any API that reads data from an untrusted source should return
> `Untrusted<T>` where `T` is the type of the underlying untrusted data.
> This generic allows other abstractions to return their custom type
> wrapped by `Untrusted`, signaling to the caller that the data must be
> validated before use. This allows those abstractions to be used both in
> a trusted and untrusted manner, increasing their generality.
> Additionally, using the arbitrary self types feature, APIs can be
> designed to explicitly read untrusted data:
>
> impl MyCustomDataSource {
> pub fn read(self: &Untrusted<Self>) -> &Untrusted<[u8]>;
> }
>
> Cc: Simona Vetter <simona.vetter@ffwll.ch>
> Signed-off-by: Benno Lossin <benno.lossin@proton.me>
I think this looks overall nifty, bunch of thoughts below.
Cheers, Sima
> ---
> rust/kernel/lib.rs | 1 +
> rust/kernel/validate.rs | 602 ++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 603 insertions(+)
> create mode 100644 rust/kernel/validate.rs
>
> diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
> index f10b06a78b9d..3125936eae45 100644
> --- a/rust/kernel/lib.rs
> +++ b/rust/kernel/lib.rs
> @@ -54,6 +54,7 @@
> pub mod time;
> pub mod types;
> pub mod uaccess;
> +pub mod validate;
> pub mod workqueue;
>
> #[doc(hidden)]
> diff --git a/rust/kernel/validate.rs b/rust/kernel/validate.rs
> new file mode 100644
> index 000000000000..b325349e7dc3
> --- /dev/null
> +++ b/rust/kernel/validate.rs
> @@ -0,0 +1,604 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +//! Types for handling and validating untrusted data.
> +//!
> +//! # Overview
> +//!
> +//! Untrusted data is marked using the [`Untrusted<T>`] type. See [Rationale](#rationale) for the
> +//! reasons to mark untrusted data throught the kernel. It is a totally opaque wrapper, it is not
> +//! possible to read the data inside; but it is possible to [`Untrusted::write`] into it.
> +//!
> +//! The only way to "access" the data inside an [`Untrusted<T>`] is to [`Untrusted::validate`] it;
> +//! turning it into a different form using the [`Validate`] trait. That trait receives the data in
> +//! the form of [`Unvalidated<T>`], which in contrast to [`Untrusted<T>`], allows access to the
> +//! underlying data. It additionally provides several utility functions to simplify validation.
> +//!
> +//! # Rationale
> +//!
> +//! When reading data from an untrusted source, it must be validated before it can be used for
> +//! logic. For example, this is a very bad idea:
> +//!
> +//! ```
> +//! # fn read_bytes_from_network() -> Box<[u8]> {
> +//! # Box::new([1, 0], kernel::alloc::flags::GFP_KERNEL).unwrap()
> +//! # }
> +//! let bytes: Box<[u8]> = read_bytes_from_network();
> +//! let data_index = bytes[0];
> +//! let data = bytes[usize::from(data_index)];
> +//! ```
> +//!
> +//! While this will not lead to a memory violation (because the array index checks the bounds), it
> +//! might result in a kernel panic. For this reason, all untrusted data must be wrapped in
> +//! [`Untrusted<T>`]. This type only allows validating the data or passing it along, since copying
> +//! data from one userspace buffer into another is allowed for untrusted data.
> +
> +use crate::init::Init;
> +use core::{
> + mem::MaybeUninit,
> + ops::{Index, IndexMut},
> + ptr, slice,
> +};
> +
> +/// Untrusted data of type `T`.
> +///
> +/// When reading data from userspace, hardware or other external untrusted sources, the data must
> +/// be validated before it is used for logic within the kernel. To do so, the [`validate()`]
> +/// function exists and uses the [`Validate`] trait.
> +///
> +/// Also see the [module] description.
> +///
> +/// [`validate()`]: Self::validate
> +/// [module]: self
> +#[repr(transparent)]
> +pub struct Untrusted<T: ?Sized>(Unvalidated<T>);
> +
> +impl<T: ?Sized> Untrusted<T> {
> + /// Marks the given value as untrusted.
> + ///
> + /// # Examples
> + ///
> + /// ```
> + /// use kernel::validate::Untrusted;
> + ///
> + /// # mod bindings { pub(crate) unsafe fn read_foo_info() -> [u8; 4] { todo!() } };
> + /// fn read_foo_info() -> Untrusted<[u8; 4]> {
> + /// // SAFETY: just an FFI call without preconditions.
> + /// Untrusted::new(unsafe { bindings::read_foo_info() })
> + /// }
> + /// ```
> + pub fn new(value: T) -> Self
> + where
> + T: Sized,
> + {
> + Self(Unvalidated::new(value))
> + }
> +
> + /// Marks the value behind the reference as untrusted.
> + ///
> + /// # Examples
> + ///
> + /// In this imaginary example there exists the `foo_hardware` struct on the C side, as well as
> + /// a `foo_hardware_read` function that reads some data directly from the hardware.
> + /// ```
> + /// use kernel::{error, types::Opaque, validate::Untrusted};
> + /// use core::ptr;
> + ///
> + /// # #[allow(non_camel_case_types)]
> + /// # mod bindings {
> + /// # pub(crate) struct foo_hardware;
> + /// # pub(crate) unsafe fn foo_hardware_read(_foo: *mut foo_hardware, _len: &mut usize) -> *mut u8 {
> + /// # todo!()
> + /// # }
> + /// # }
> + /// struct Foo(Opaque<bindings::foo_hardware>);
> + ///
> + /// impl Foo {
> + /// fn read(&mut self, mut len: usize) -> Result<&Untrusted<[u8]>> {
> + /// // SAFETY: just an FFI call without preconditions.
> + /// let data: *mut u8 = unsafe { bindings::foo_hardware_read(self.0.get(), &mut len) };
> + /// let data = error::from_err_ptr(data)?;
> + /// let data = ptr::slice_from_raw_parts(data, len);
> + /// // SAFETY: `data` returned by `foo_hardware_read` is valid for reads as long as the
> + /// // `foo_hardware` object exists. That function updated the
> + /// let data = unsafe { &*data };
> + /// Ok(Untrusted::new_ref(data))
> + /// }
> + /// }
> + /// ```
> + pub fn new_ref(value: &T) -> &Self {
> + let ptr: *const T = value;
> + // CAST: `Self` and `Unvalidated` are `repr(transparent)` and contain a `T`.
> + let ptr = ptr as *const Self;
> + // SAFETY: `ptr` came from a shared reference valid for `'a`.
> + unsafe { &*ptr }
> + }
> +
> + /// Marks the value behind the reference as untrusted.
> + ///
> + /// # Examples
> + ///
> + /// In this imaginary example there exists the `foo_hardware` struct on the C side, as well as
> + /// a `foo_hardware_read` function that reads some data directly from the hardware.
> + /// ```
> + /// use kernel::{error, types::Opaque, validate::Untrusted};
> + /// use core::ptr;
> + ///
> + /// # #[allow(non_camel_case_types)]
> + /// # mod bindings {
> + /// # pub(crate) struct foo_hardware;
> + /// # pub(crate) unsafe fn foo_hardware_read(_foo: *mut foo_hardware, _len: &mut usize) -> *mut u8 {
> + /// # todo!()
> + /// # }
> + /// # }
> + /// struct Foo(Opaque<bindings::foo_hardware>);
> + ///
> + /// impl Foo {
> + /// fn read(&mut self, mut len: usize) -> Result<&mut Untrusted<[u8]>> {
> + /// // SAFETY: just an FFI call without preconditions.
> + /// let data: *mut u8 = unsafe { bindings::foo_hardware_read(self.0.get(), &mut len) };
> + /// let data = error::from_err_ptr(data)?;
> + /// let data = ptr::slice_from_raw_parts_mut(data, len);
> + /// // SAFETY: `data` returned by `foo_hardware_read` is valid for reads as long as the
> + /// // `foo_hardware` object exists. That function updated the
> + /// let data = unsafe { &mut *data };
> + /// Ok(Untrusted::new_mut(data))
> + /// }
> + /// }
> + /// ```
> + pub fn new_mut(value: &mut T) -> &mut Self {
> + let ptr: *mut T = value;
> + // CAST: `Self` and `Unvalidated` are `repr(transparent)` and contain a `T`.
> + let ptr = ptr as *mut Self;
> + // SAFETY: `ptr` came from a mutable reference valid for `'a`.
> + unsafe { &mut *ptr }
> + }
> +
> + /// Validates and parses the untrusted data.
> + ///
> + /// See the [`Validate`] trait on how to implement it.
> + pub fn validate<'a, V: Validate<&'a Unvalidated<T>>>(&'a self) -> Result<V, V::Err> {
> + V::validate(&self.0)
> + }
> +
> + /// Validates and parses the untrusted data.
> + ///
> + /// See the [`Validate`] trait on how to implement it.
> + pub fn validate_mut<'a, V: Validate<&'a mut Unvalidated<T>>>(
> + &'a mut self,
> + ) -> Result<V, V::Err> {
> + V::validate(&mut self.0)
> + }
> +
> + /// Sets the underlying untrusted value.
> + ///
> + /// # Examples
> + ///
> + /// ```
> + /// use kernel::validate::Untrusted;
> + ///
> + /// let mut untrusted = Untrusted::new(42);
> + /// untrusted.write(24);
> + /// ```
> + pub fn write(&mut self, value: impl Init<T>) {
> + let ptr: *mut T = &mut self.0 .0;
> + // SAFETY: `ptr` came from a mutable reference and the value is overwritten before it is
> + // read.
> + unsafe { ptr::drop_in_place(ptr) };
> + // SAFETY: `ptr` came from a mutable reference and the initializer cannot error.
> + match unsafe { value.__init(ptr) } {
> + Ok(()) => {}
> + }
> + }
So I think this doesn't work for a few reasons:
- there's a lot of unsafe both here and when using this (even without
MaybeUninit), at least if I understand things correctly. And forcing
unsafe for something that inheritedly does not break any of the rust
memory safety rules feels silly.
- it also looks awkward, because you have the special versions
write_uninit and write_uninit_slice, and imo if our write isn't good
enought to just work with abritrary box/container types, it's not good.
- it doesn't actually work for one of the main uaccess.rs use-cases beyond
just enabling the current interface. Example in pseudo-rust:
struct IoctlParams {
input: u32,
ouptut: u32,
}
The thing is that ioctl that use the struct approach like drm does, use
the same struct if there's both input and output paramterers, and
furthermore we are not allowed to overwrite the entire struct because
that breaks ioctl restarting. So the flow is roughly
let userptr : UserSlice;
let params : Untrusted<IoctlParams>;
userptr.read(params));
// validate params, do something interesting with it params.input
// this is _not_ allowed to overwrite params.input but must leave it
// unchanged
params.write(|x| { x.output = 42; });
userptr.write(params);
Your current write doesn't allow this case, and I think that's not good
enough. The one I propsed in private does:
Untrusted<T>::write(&mut self, Fn(&mut T)->())
It's not perfect because it allows you to do other stuff than writing,
but it's still pretty good by limiting any potentially dangerous code to
the provided closure. And with these of protection apis we need to
strike the right balance between as much protection as possible while
still allowing users to get the job done.
Now I guess you can do this with your write too using a copy constructor
approach, but that will result in less efficient code and it's probably
also too much functional programming design for kernel developers to
cope with.
> +
> + /// Turns a slice of untrusted values into an untrusted slice of values.
> + pub fn transpose_slice(slice: &[Untrusted<T>]) -> &Untrusted<[T]>
> + where
> + T: Sized,
> + {
> + let ptr = slice.as_ptr().cast::<T>();
> + // SAFETY: `ptr` and `len` come from the same slice reference.
> + let slice = unsafe { slice::from_raw_parts(ptr, slice.len()) };
> + Untrusted::new_ref(slice)
> + }
> +
> + /// Turns a slice of uninitialized, untrusted values into an untrusted slice of uninitialized
> + /// values.
> + pub fn transpose_slice_uninit(
> + slice: &[MaybeUninit<Untrusted<T>>],
> + ) -> &Untrusted<[MaybeUninit<T>]>
So these and some of the related functions to handle slice of box vs box
of slice feel a bit awkward. I think we can do better if we rename
Untrusted and Unvalidated to UntrustedBox and UnvalidatedBox, and then
make Untrusted and Unvalidated traits, with implementations for
UntrustedBox<T>, [Untrusted<T>], and all the others we want.
I expect that in the future we'll get more boxes with special semantics
that we want to use together with Untrusted, not just [] and MaybeUninit.
One example would be when the Untrusted data is shared with userspace
(mmap()ed into both kernel and userspace for example for a ringbuffer), in
which case the data is both Untrusted but also SharedUnsafe or whatever
we'll call memory that's fundamentally breaking the rust memory model
because there's no exclusive access (userspace can always do whatever it
feels like), and all access has to go through at least
READ_ONCE/WRITE_ONCE from the C side, or often even full blown atomics.
And that memory is not MaybeUninit, because we have to initialize it
before userspace can mmap it, for otherwise it's an information leak and
hence security issue.
tldr; I think going with traits for Untrusted and Unvalidated here will be
worth it, even if a bit more pain at first. Plus cleaner interfaces.
> + where
> + T: Sized,
> + {
> + let ptr = slice.as_ptr().cast::<MaybeUninit<T>>();
> + // SAFETY: `ptr` and `len` come from the same mutable slice reference.
> + let slice = unsafe { slice::from_raw_parts(ptr, slice.len()) };
> + Untrusted::new_ref(slice)
> + }
> +
> + /// Turns a slice of uninitialized, untrusted values into an untrusted slice of uninitialized
> + /// values.
> + pub fn transpose_slice_uninit_mut(
> + slice: &mut [MaybeUninit<Untrusted<T>>],
> + ) -> &mut Untrusted<[MaybeUninit<T>]>
> + where
> + T: Sized,
> + {
> + // CAST: `MaybeUninit<T>` and `MaybeUninit<Untrusted<T>>` have the same layout.
> + let ptr = slice.as_mut_ptr().cast::<MaybeUninit<T>>();
> + // SAFETY: `ptr` and `len` come from the same mutable slice reference.
> + let slice = unsafe { slice::from_raw_parts_mut(ptr, slice.len()) };
> + Untrusted::new_mut(slice)
> + }
> +}
> +
> +impl<T> Untrusted<MaybeUninit<T>> {
> + /// Sets the underlying untrusted value.
> + ///
> + /// # Examples
> + ///
> + /// ```
> + /// use kernel::validate::Untrusted;
> + ///
> + /// let mut untrusted = Untrusted::new(42);
> + /// untrusted.write(24);
> + /// ```
> + pub fn write_uninit<E>(&mut self, value: impl Init<T, E>) -> Result<&mut Untrusted<T>, E> {
> + let ptr: *mut MaybeUninit<T> = &mut self.0 .0;
> + // CAST: `MaybeUninit<T>` is `repr(transparent)`.
> + let ptr = ptr.cast::<T>();
> + // SAFETY: `ptr` came from a reference and if `Err` is returned, the underlying memory is
> + // considered uninitialized.
> + unsafe { value.__init(ptr) }.map(|()| {
> + let this = self.0.raw_mut();
> + // SAFETY: we initialized the memory above.
> + Untrusted::new_mut(unsafe { this.assume_init_mut() })
> + })
> + }
> +}
> +
> +impl<T> Untrusted<[MaybeUninit<T>]> {
> + /// Sets the underlying untrusted value.
> + ///
> + /// # Examples
> + ///
> + /// ```
> + /// use kernel::validate::Untrusted;
> + ///
> + /// let mut untrusted = Untrusted::new(42);
> + /// untrusted.write(24);
> + /// ```
> + pub fn write_uninit_slice<E>(
> + &mut self,
> + value: impl Init<[T], E>,
> + ) -> Result<&mut Untrusted<[T]>, E> {
> + let ptr: *mut [MaybeUninit<T>] = &mut self.0 .0;
> + // CAST: `MaybeUninit<T>` is `repr(transparent)`.
> + let ptr = ptr as *mut [T];
> + // SAFETY: `ptr` came from a reference and if `Err` is returned, the underlying memory is
> + // considered uninitialized.
> + unsafe { value.__init(ptr) }.map(|()| {
> + let this = self.0.raw_mut().as_mut_ptr();
> + // CAST: `MaybeUninit<T>` is `repr(transparent)`.
> + let this = this.cast::<T>();
> + // SAFETY: `this` and `len` came from the same slice reference.
> + let this = unsafe { slice::from_raw_parts_mut(this, self.0.len()) };
> + Untrusted::new_mut(this)
> + })
> + }
> +}
> +
> +/// Marks types that can be used as input to [`Validate::validate`].
> +pub trait ValidateInput: private::Sealed + Sized {}
I guess even if we go with making Unvalidated a trait we might still need
this one to seal it?
> +
> +mod private {
> + pub trait Sealed {}
> +}
> +
> +impl<'a, T: ?Sized> private::Sealed for &'a Unvalidated<T> {}
> +impl<'a, T: ?Sized> ValidateInput for &'a Unvalidated<T> {}
> +
> +impl<'a, T: ?Sized> private::Sealed for &'a mut Unvalidated<T> {}
> +impl<'a, T: ?Sized> ValidateInput for &'a mut Unvalidated<T> {}
> +
> +/// Validates untrusted data.
> +///
> +/// # Examples
> +///
> +/// The simplest way to validate data is to just implement `Validate<&Unvalidated<[u8]>>` for the
> +/// type that you wish to validate:
> +///
> +/// ```
> +/// use kernel::{
> +/// error::{code::EINVAL, Error},
> +/// str::{CStr, CString},
> +/// validate::{Unvalidated, Validate},
> +/// };
> +///
> +/// struct Data {
> +/// flags: u8,
> +/// name: CString,
> +/// }
> +///
> +/// impl Validate<&Unvalidated<[u8]>> for Data {
> +/// type Err = Error;
> +///
> +/// fn validate(unvalidated: &Unvalidated<[u8]>) -> Result<Self, Self::Err> {
> +/// let raw = unvalidated.raw();
> +/// let (&flags, name) = raw.split_first().ok_or(EINVAL)?;
> +/// let name = CStr::from_bytes_with_nul(name)?.to_cstring()?;
> +/// Ok(Data { flags, name })
> +/// }
> +/// }
> +/// ```
> +///
> +/// This approach copies the data and requires allocation. If you want to avoid the allocation and
> +/// copying the data, you can borrow from the input like this:
> +///
> +/// ```
> +/// use kernel::{
> +/// error::{code::EINVAL, Error},
> +/// str::CStr,
> +/// validate::{Unvalidated, Validate},
> +/// };
> +///
> +/// struct Data<'a> {
> +/// flags: u8,
> +/// name: &'a CStr,
> +/// }
> +///
> +/// impl<'a> Validate<&'a Unvalidated<[u8]>> for Data<'a> {
> +/// type Err = Error;
> +///
> +/// fn validate(unvalidated: &'a Unvalidated<[u8]>) -> Result<Self, Self::Err> {
> +/// let raw = unvalidated.raw();
> +/// let (&flags, name) = raw.split_first().ok_or(EINVAL)?;
> +/// let name = CStr::from_bytes_with_nul(name)?;
> +/// Ok(Data { flags, name })
> +/// }
> +/// }
> +/// ```
> +///
> +/// If you need to in-place validate your data, you currently need to resort to `unsafe`:
> +///
> +/// ```
> +/// use kernel::{
> +/// error::{code::EINVAL, Error},
> +/// str::CStr,
> +/// validate::{Unvalidated, Validate},
> +/// };
> +/// use core::mem;
> +///
> +/// // Important: use `repr(C)`, this ensures a linear layout of this type.
> +/// #[repr(C)]
> +/// struct Data {
> +/// version: u8,
> +/// flags: u8,
> +/// _reserved: [u8; 2],
> +/// count: u64,
> +/// // lots of other fields...
> +/// }
> +///
> +/// impl Validate<&Unvalidated<[u8]>> for &Data {
> +/// type Err = Error;
> +///
> +/// fn validate(unvalidated: &Unvalidated<[u8]>) -> Result<Self, Self::Err> {
> +/// let raw = unvalidated.raw();
> +/// if raw.len() < mem::size_of::<Data>() {
> +/// return Err(EINVAL);
> +/// }
> +/// // can only handle version 0
> +/// if raw[0] != 0 {
> +/// return Err(EINVAL);
> +/// }
> +/// // version 0 only uses the lower 4 bits of flags
> +/// if raw[1] & 0xf0 != 0 {
> +/// return Err(EINVAL);
> +/// }
> +/// let ptr = raw.as_ptr();
> +/// // CAST: `Data` only contains integers and has `repr(C)`.
> +/// let ptr = ptr.cast::<Data>();
> +/// // SAFETY: `ptr` came from a reference and the cast above is valid.
> +/// Ok(unsafe { &*ptr })
> +/// }
> +/// }
> +/// ```
> +///
> +/// To be able to modify the parsed data, while still supporting zero-copy, you can implement
> +/// `Validate<&mut Unvalidated<[u8]>>`:
> +///
> +/// ```
> +/// use kernel::{
> +/// error::{code::EINVAL, Error},
> +/// str::CStr,
> +/// validate::{Unvalidated, Validate},
> +/// };
> +/// use core::mem;
> +///
> +/// // Important: use `repr(C)`, this ensures a linear layout of this type.
> +/// #[repr(C)]
> +/// struct Data {
> +/// version: u8,
> +/// flags: u8,
> +/// _reserved: [u8; 2],
> +/// count: u64,
> +/// // lots of other fields...
> +/// }
> +///
> +/// impl Validate<&mut Unvalidated<[u8]>> for &Data {
> +/// type Err = Error;
I think we should make that one the default, but not sure that's doable
with associated types instead of generics.
The input parameter should definitely default to the output paramter,
because often they're just exactly the same e.g. when validating ioctl
input structures.
On this version you've also left out the in-place validation (originally
validate_bytes), but we can add that later on when a need arises I guess.
Or do you think it's just not a good idea in general?
> +///
> +/// fn validate(unvalidated: &mut Unvalidated<[u8]>) -> Result<Self, Self::Err> {
> +/// let raw = unvalidated.raw_mut();
> +/// if raw.len() < mem::size_of::<Data>() {
> +/// return Err(EINVAL);
> +/// }
> +/// match raw[0] {
> +/// 0 => {},
> +/// 1 => {
> +/// // version 1 implicitly sets the first bit.
> +/// raw[1] |= 1;
> +/// },
> +/// // can only handle version 0 and 1
> +/// _ => return Err(EINVAL),
> +/// }
> +/// // version 0 and 1 only use the lower 4 bits of flags
> +/// if raw[1] & 0xf0 != 0 {
> +/// return Err(EINVAL);
> +/// }
> +/// if raw[1] == 0 {}
> +/// let ptr = raw.as_ptr();
> +/// // CAST: `Data` only contains integers and has `repr(C)`.
> +/// let ptr = ptr.cast::<Data>();
> +/// // SAFETY: `ptr` came from a reference and the cast above is valid.
> +/// Ok(unsafe { &*ptr })
> +/// }
> +/// }
> +/// ```
> +pub trait Validate<I: ValidateInput>: Sized {
> + /// Validation error.
> + type Err;
> +
> + /// Validate the given untrusted data and parse it into the output type.
> + fn validate(unvalidated: I) -> Result<Self, Self::Err>;
> +}
> +
> +/// Unvalidated data of type `T`.
> +#[repr(transparent)]
> +pub struct Unvalidated<T: ?Sized>(T);
> +
> +impl<T: ?Sized> Unvalidated<T> {
> + fn new(value: T) -> Self
> + where
> + T: Sized,
> + {
> + Self(value)
> + }
> +
> + fn new_ref(value: &T) -> &Self {
> + let ptr: *const T = value;
> + // CAST: `Self` is `repr(transparent)` and contains a `T`.
> + let ptr = ptr as *const Self;
> + // SAFETY: `ptr` came from a mutable reference valid for `'a`.
> + unsafe { &*ptr }
> + }
> +
> + fn new_mut(value: &mut T) -> &mut Self {
> + let ptr: *mut T = value;
> + // CAST: `Self` is `repr(transparent)` and contains a `T`.
> + let ptr = ptr as *mut Self;
> + // SAFETY: `ptr` came from a mutable reference valid for `'a`.
> + unsafe { &mut *ptr }
> + }
> +
> + /// Validates and parses the untrusted data.
> + ///
> + /// See the [`Validate`] trait on how to implement it.
> + pub fn validate_ref<'a, V: Validate<&'a Unvalidated<T>>>(&'a self) -> Result<V, V::Err> {
> + V::validate(self)
> + }
> +
> + /// Validates and parses the untrusted data.
> + ///
> + /// See the [`Validate`] trait on how to implement it.
> + pub fn validate_mut<'a, V: Validate<&'a mut Unvalidated<T>>>(
> + &'a mut self,
> + ) -> Result<V, V::Err> {
> + V::validate(self)
> + }
> +
> + /// Gives immutable access to the underlying value.
> + pub fn raw(&self) -> &T {
> + &self.0
> + }
> +
> + /// Gives mutable access to the underlying value.
> + pub fn raw_mut(&mut self) -> &mut T {
> + &mut self.0
> + }
> +}
> +
> +impl<T, I> Index<I> for Unvalidated<[T]>
> +where
> + I: slice::SliceIndex<[T]>,
> +{
> + type Output = Unvalidated<I::Output>;
> +
> + fn index(&self, index: I) -> &Self::Output {
> + Unvalidated::new_ref(self.0.index(index))
> + }
> +}
> +
> +impl<T, I> IndexMut<I> for Unvalidated<[T]>
> +where
> + I: slice::SliceIndex<[T]>,
> +{
> + fn index_mut(&mut self, index: I) -> &mut Self::Output {
> + Unvalidated::new_mut(self.0.index_mut(index))
> + }
> +}
> +
> +/// Immutable unvalidated slice iterator.
> +pub struct Iter<'a, T>(slice::Iter<'a, T>);
> +
> +/// Mutable unvalidated slice iterator.
> +pub struct IterMut<'a, T>(slice::IterMut<'a, T>);
> +
> +impl<'a, T> Iterator for Iter<'a, T> {
> + type Item = &'a Unvalidated<T>;
> +
> + fn next(&mut self) -> Option<Self::Item> {
> + self.0.next().map(Unvalidated::new_ref)
> + }
> +}
> +
> +impl<'a, T> IntoIterator for &'a Unvalidated<[T]> {
> + type Item = &'a Unvalidated<T>;
> + type IntoIter = Iter<'a, T>;
> +
> + fn into_iter(self) -> Self::IntoIter {
> + Iter(self.0.iter())
> + }
> +}
> +
> +impl<'a, T> Iterator for IterMut<'a, T> {
> + type Item = &'a mut Unvalidated<T>;
> +
> + fn next(&mut self) -> Option<Self::Item> {
> + self.0.next().map(Unvalidated::new_mut)
> + }
> +}
> +
> +impl<'a, T> IntoIterator for &'a mut Unvalidated<[T]> {
> + type Item = &'a mut Unvalidated<T>;
> + type IntoIter = IterMut<'a, T>;
> +
> + fn into_iter(self) -> Self::IntoIter {
> + IterMut(self.0.iter_mut())
> + }
> +}
> +
> +impl<T> Unvalidated<[T]> {
Hm there's no Container trait that has all these in rust? Kinda surprising
tbh ...
> + /// Returns the number of elements in the underlying slice.
> + pub fn len(&self) -> usize {
> + self.0.len()
> + }
> +
> + /// Returns true if the underlying slice has a length of 0.
> + pub fn is_empty(&self) -> bool {
> + self.0.is_empty()
> + }
> +
> + /// Iterates over all items and validates each of them individually.
> + pub fn validate_iter<'a, V: Validate<&'a Unvalidated<T>>>(
> + &'a self,
> + ) -> impl Iterator<Item = Result<V, V::Err>> + 'a {
> + self.into_iter().map(|item| V::validate(item))
> + }
> +
> + /// Iterates over all items and validates each of them individually.
> + pub fn validate_iter_mut<'a, V: Validate<&'a mut Unvalidated<T>>>(
> + &'a mut self,
> + ) -> impl Iterator<Item = Result<V, V::Err>> + 'a {
> + self.into_iter().map(|item| V::validate(item))
> + }
> +}
> --
> 2.46.0
>
>
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/2] rust: switch uaccess to untrusted data API
2024-09-25 20:53 ` [PATCH v2 2/2] rust: switch uaccess to untrusted data API Benno Lossin
@ 2024-09-26 11:09 ` Simona Vetter
2024-09-26 23:56 ` kernel test robot
1 sibling, 0 replies; 15+ messages in thread
From: Simona Vetter @ 2024-09-26 11:09 UTC (permalink / raw)
To: Benno Lossin
Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Björn Roy Baron, Andreas Hindborg, Alice Ryhl, Trevor Gross,
Greg KH, Simona Vetter, rust-for-linux, linux-kernel
On Wed, Sep 25, 2024 at 08:53:11PM +0000, Benno Lossin wrote:
> Userspace is an untrusted data source, so all data that can be read from
> there is untrusted. Therefore use the new untrusted API for any data
> returned from it.
> This makes it significantly harder to write TOCTOU bug, the example bug
> in the documentation cannot easily be converted to use the new API. Thus
> it is removed.
>
> A possible future improvement is to change how `UserSliceWriter` exposes
> writing to userspace: a trait `ToUserspace` TBB (to be bikeshed) could
> allow users to write untrusted data to a userpointer without adding more
> methods to it.
>
> Signed-off-by: Benno Lossin <benno.lossin@proton.me>
> ---
> rust/kernel/page.rs | 8 ++-
> rust/kernel/uaccess.rs | 135 +++++++++++++++++++++--------------------
> 2 files changed, 74 insertions(+), 69 deletions(-)
>
> diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
> index 208a006d587c..74ce86326893 100644
> --- a/rust/kernel/page.rs
> +++ b/rust/kernel/page.rs
> @@ -5,9 +5,9 @@
> use crate::{
> alloc::{AllocError, Flags},
> bindings,
> - error::code::*,
> - error::Result,
> + error::{code::*, Result},
> uaccess::UserSliceReader,
> + validate::Untrusted,
> };
> use core::ptr::{self, NonNull};
>
> @@ -237,8 +237,10 @@ pub unsafe fn copy_from_user_slice_raw(
> // SAFETY: If `with_pointer_into_page` calls into this closure, then it has performed a
> // bounds check and guarantees that `dst` is valid for `len` bytes. Furthermore, we have
> // exclusive access to the slice since the caller guarantees that there are no races.
> - reader.read_raw(unsafe { core::slice::from_raw_parts_mut(dst.cast(), len) })
> + let slice = unsafe { core::slice::from_raw_parts_mut(dst.cast(), len) };
> + reader.read_raw(Untrusted::new_mut(slice))
> })
> + .map(|_| ())
> }
> }
>
> diff --git a/rust/kernel/uaccess.rs b/rust/kernel/uaccess.rs
> index e9347cff99ab..3d312f845269 100644
> --- a/rust/kernel/uaccess.rs
> +++ b/rust/kernel/uaccess.rs
> @@ -10,10 +10,12 @@
> error::Result,
> prelude::*,
> types::{AsBytes, FromBytes},
> + validate::Untrusted,
> };
> use alloc::vec::Vec;
> use core::ffi::{c_ulong, c_void};
> use core::mem::{size_of, MaybeUninit};
> +use init::init_from_closure;
>
> /// The type used for userspace addresses.
> pub type UserPtr = usize;
> @@ -47,59 +49,39 @@
> ///
> /// ```no_run
> /// use alloc::vec::Vec;
> -/// use core::ffi::c_void;
> -/// use kernel::error::Result;
> -/// use kernel::uaccess::{UserPtr, UserSlice};
> +/// use core::{convert::Infallible, ffi::c_void};
> +/// use kernel::{error::Result, uaccess::{UserPtr, UserSlice}, validate::{Unvalidated, Untrusted, Validate}};
> ///
> -/// fn bytes_add_one(uptr: UserPtr, len: usize) -> Result<()> {
> -/// let (read, mut write) = UserSlice::new(uptr, len).reader_writer();
> +/// struct AddOne<'a>(&'a mut u8);
> ///
> -/// let mut buf = Vec::new();
> -/// read.read_all(&mut buf, GFP_KERNEL)?;
> +/// impl<'a> Validate<&'a mut Unvalidated<u8>> for AddOne<'a> {
> +/// type Err = Infallible;
> ///
> -/// for b in &mut buf {
> -/// *b = b.wrapping_add(1);
> +/// fn validate(unvalidated: &'a mut Unvalidated<u8>) -> Result<Self, Self::Err> {
> +/// // We are not doing any kind of validation here on purpose. After all, we only want to
> +/// // increment the value and write it back.
> +/// Ok(Self(unvalidated.raw_mut()))
> /// }
> -///
> -/// write.write_slice(&buf)?;
> -/// Ok(())
> /// }
> -/// ```
> -///
> -/// Example illustrating a TOCTOU (time-of-check to time-of-use) bug.
> ///
> -/// ```no_run
> -/// use alloc::vec::Vec;
> -/// use core::ffi::c_void;
> -/// use kernel::error::{code::EINVAL, Result};
> -/// use kernel::uaccess::{UserPtr, UserSlice};
> +/// impl AddOne<'_> {
> +/// fn inc(&mut self) {
> +/// *self.0 = self.0.wrapping_add(1);
> +/// }
> +/// }
> ///
> -/// /// Returns whether the data in this region is valid.
> -/// fn is_valid(uptr: UserPtr, len: usize) -> Result<bool> {
> -/// let read = UserSlice::new(uptr, len).reader();
> +/// fn bytes_add_one(uptr: UserPtr, len: usize) -> Result<()> {
> +/// let (read, mut write) = UserSlice::new(uptr, len).reader_writer();
> ///
> /// let mut buf = Vec::new();
> /// read.read_all(&mut buf, GFP_KERNEL)?;
> ///
> -/// todo!()
> -/// }
> -///
> -/// /// Returns the bytes behind this user pointer if they are valid.
> -/// fn get_bytes_if_valid(uptr: UserPtr, len: usize) -> Result<Vec<u8>> {
> -/// if !is_valid(uptr, len)? {
> -/// return Err(EINVAL);
> +/// for b in &mut buf {
> +/// b.validate_mut::<AddOne<'_>>()?.inc();
> /// }
> ///
> -/// let read = UserSlice::new(uptr, len).reader();
> -///
> -/// let mut buf = Vec::new();
> -/// read.read_all(&mut buf, GFP_KERNEL)?;
> -///
> -/// // THIS IS A BUG! The bytes could have changed since we checked them.
> -/// //
> -/// // To avoid this kind of bug, don't call `UserSlice::new` multiple
> -/// // times with the same address.
> -/// Ok(buf)
> +/// write.write_untrusted_slice(Untrusted::transpose_slice(&buf))?;
See below and in my previous mail, I think with the right amount of traits
you can rewrite this as
write.write(&buf)?;
and it will all just neatly work.
> +/// Ok(())
> /// }
> /// ```
> ///
> @@ -130,7 +112,7 @@ pub fn new(ptr: UserPtr, length: usize) -> Self {
> /// Reads the entirety of the user slice, appending it to the end of the provided buffer.
> ///
> /// Fails with [`EFAULT`] if the read happens on a bad address.
> - pub fn read_all(self, buf: &mut Vec<u8>, flags: Flags) -> Result {
> + pub fn read_all(self, buf: &mut Vec<Untrusted<u8>>, flags: Flags) -> Result {
> self.reader().read_all(buf, flags)
> }
>
> @@ -218,38 +200,47 @@ pub fn is_empty(&self) -> bool {
> /// Fails with [`EFAULT`] if the read happens on a bad address, or if the read goes out of
> /// bounds of this [`UserSliceReader`]. This call may modify `out` even if it returns an error.
> ///
> + /// Returns a reference to the initialized bytes in `out`.
> + ///
> /// # Guarantees
> ///
> /// After a successful call to this method, all bytes in `out` are initialized.
> - pub fn read_raw(&mut self, out: &mut [MaybeUninit<u8>]) -> Result {
> - let len = out.len();
> - let out_ptr = out.as_mut_ptr().cast::<c_void>();
> - if len > self.length {
> - return Err(EFAULT);
> - }
> - let Ok(len_ulong) = c_ulong::try_from(len) else {
> - return Err(EFAULT);
> + pub fn read_raw<'a>(
> + &'a mut self,
> + out: &'a mut Untrusted<[MaybeUninit<u8>]>,
> + ) -> Result<&'a mut Untrusted<[u8]>> {
> + let init = |ptr: *mut [u8]| {
> + let out_ptr = ptr.cast::<c_void>();
> + let len = ptr.len();
> + if len > self.length {
> + return Err(EFAULT);
> + }
> + let Ok(len_ulong) = c_ulong::try_from(len) else {
> + return Err(EFAULT);
> + };
> + // SAFETY: `out_ptr` points into a mutable slice of length `len_ulong`, so we may write
> + // that many bytes to it.
> + let res =
> + unsafe { bindings::copy_from_user(out_ptr, self.ptr as *const c_void, len_ulong) };
> + if res != 0 {
> + return Err(EFAULT);
> + }
> + self.ptr = self.ptr.wrapping_add(len);
> + self.length -= len;
> + Ok(())
> };
> - // SAFETY: `out_ptr` points into a mutable slice of length `len_ulong`, so we may write
> - // that many bytes to it.
> - let res =
> - unsafe { bindings::copy_from_user(out_ptr, self.ptr as *const c_void, len_ulong) };
> - if res != 0 {
> - return Err(EFAULT);
> - }
> - self.ptr = self.ptr.wrapping_add(len);
> - self.length -= len;
> - Ok(())
> + out.write_uninit_slice(unsafe { init_from_closure(init) })
> }
>
> /// Reads raw data from the user slice into a kernel buffer.
> ///
> /// Fails with [`EFAULT`] if the read happens on a bad address, or if the read goes out of
> /// bounds of this [`UserSliceReader`]. This call may modify `out` even if it returns an error.
> - pub fn read_slice(&mut self, out: &mut [u8]) -> Result {
> + pub fn read_slice(&mut self, out: &mut Untrusted<[u8]>) -> Result<&mut Untrusted<[u8]>> {
> // SAFETY: The types are compatible and `read_raw` doesn't write uninitialized bytes to
> // `out`.
> - let out = unsafe { &mut *(out as *mut [u8] as *mut [MaybeUninit<u8>]) };
> + let out =
> + unsafe { &mut *(out as *mut Untrusted<[u8]> as *mut Untrusted<[MaybeUninit<u8>]>) };
> self.read_raw(out)
> }
>
You didn't update the typed read here, intentionally? Should be something
like:
- pub fn read<T: FromBytes>(&mut self) -> Result<T> {
+ pub fn read<T: FromBytes>(&mut self) -> Result<Untrusted<T>> {
> @@ -291,13 +282,15 @@ pub fn read<T: FromBytes>(&mut self) -> Result<T> {
> /// Reads the entirety of the user slice, appending it to the end of the provided buffer.
> ///
> /// Fails with [`EFAULT`] if the read happens on a bad address.
> - pub fn read_all(mut self, buf: &mut Vec<u8>, flags: Flags) -> Result {
> + pub fn read_all(mut self, buf: &mut Vec<Untrusted<u8>>, flags: Flags) -> Result {
> let len = self.length;
> - VecExt::<u8>::reserve(buf, len, flags)?;
> + VecExt::<_>::reserve(buf, len, flags)?;
>
> // The call to `try_reserve` was successful, so the spare capacity is at least `len` bytes
> // long.
> - self.read_raw(&mut buf.spare_capacity_mut()[..len])?;
> + self.read_raw(Untrusted::transpose_slice_uninit_mut(
> + &mut buf.spare_capacity_mut()[..len],
> + ))?;
>
> // SAFETY: Since the call to `read_raw` was successful, so the next `len` bytes of the
> // vector have been initialized.
> @@ -333,8 +326,18 @@ pub fn is_empty(&self) -> bool {
> /// bounds of this [`UserSliceWriter`]. This call may modify the associated userspace slice even
> /// if it returns an error.
> pub fn write_slice(&mut self, data: &[u8]) -> Result {
> - let len = data.len();
> - let data_ptr = data.as_ptr().cast::<c_void>();
> + self.write_untrusted_slice(Untrusted::new_ref(data))
> + }
> +
> + /// Writes raw data to this user pointer from a kernel buffer.
> + ///
> + /// Fails with [`EFAULT`] if the write happens on a bad address, or if the write goes out of
> + /// bounds of this [`UserSliceWriter`]. This call may modify the associated userspace slice even
> + /// if it returns an error.
> + pub fn write_untrusted_slice(&mut self, data: &Untrusted<[u8]>) -> Result {
I think this one we don't need, but instead we need:
unsafe impl<T: AsBytes> AsBytes for Untrusted<T> {}
and you get the untrusted write for free.
Cheers, Sima
> + let data_ptr = (data as *const _) as *const [u8];
> + let len = data_ptr.len();
> + let data_ptr = data_ptr.cast::<c_void>();
> if len > self.length {
> return Err(EFAULT);
> }
> --
> 2.46.0
>
>
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-25 20:53 ` [PATCH v2 1/2] rust: add untrusted data abstraction Benno Lossin
2024-09-26 7:08 ` Dirk Behme
2024-09-26 10:40 ` Simona Vetter
@ 2024-09-26 20:31 ` kernel test robot
2024-09-26 21:40 ` Benno Lossin
2 siblings, 1 reply; 15+ messages in thread
From: kernel test robot @ 2024-09-26 20:31 UTC (permalink / raw)
To: Benno Lossin, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Björn Roy Baron, Andreas Hindborg, Alice Ryhl, Trevor Gross
Cc: llvm, oe-kbuild-all, Greg KH, Simona Vetter, linux-kernel,
rust-for-linux
Hi Benno,
kernel test robot noticed the following build errors:
[auto build test ERROR on a2f11547052001bd448ccec81dd1e68409078fbb]
url: https://github.com/intel-lab-lkp/linux/commits/Benno-Lossin/rust-add-untrusted-data-abstraction/20240926-045445
base: a2f11547052001bd448ccec81dd1e68409078fbb
patch link: https://lore.kernel.org/r/20240925205244.873020-2-benno.lossin%40proton.me
patch subject: [PATCH v2 1/2] rust: add untrusted data abstraction
config: x86_64-buildonly-randconfig-002-20240927 (https://download.01.org/0day-ci/archive/20240927/202409270303.kUIAmOmY-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240927/202409270303.kUIAmOmY-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409270303.kUIAmOmY-lkp@intel.com/
All errors (new ones prefixed by >>):
>> error[E0004]: non-exhaustive patterns: `Err(_)` not covered
--> rust/kernel/validate.rs:188:15
|
188 | match unsafe { value.__init(ptr) } {
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pattern `Err(_)` not covered
|
note: `core::result::Result<(), Infallible>` defined here
--> /opt/cross/rustc-1.78.0-bindgen-0.65.1/rustup/toolchains/1.78.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:502:1
|
502 | pub enum Result<T, E> {
| ^^^^^^^^^^^^^^^^^^^^^
...
511 | Err(#[stable(feature = "rust1", since = "1.0.0")] E),
| --- not covered
= note: the matched value is of type `core::result::Result<(), Infallible>`
help: ensure that all possible cases are being handled by adding a match arm with a wildcard pattern or an explicit pattern as shown
|
189 ~ Ok(()) => {},
190 + Err(_) => todo!()
|
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-26 20:31 ` kernel test robot
@ 2024-09-26 21:40 ` Benno Lossin
2024-09-26 21:56 ` Miguel Ojeda
2024-09-26 21:57 ` Miguel Ojeda
0 siblings, 2 replies; 15+ messages in thread
From: Benno Lossin @ 2024-09-26 21:40 UTC (permalink / raw)
To: kernel test robot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Andreas Hindborg, Alice Ryhl,
Trevor Gross
Cc: llvm, oe-kbuild-all, Greg KH, Simona Vetter, linux-kernel,
rust-for-linux
On 26.09.24 22:31, kernel test robot wrote:
> Hi Benno,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on a2f11547052001bd448ccec81dd1e68409078fbb]
>
> url: https://github.com/intel-lab-lkp/linux/commits/Benno-Lossin/rust-add-untrusted-data-abstraction/20240926-045445
> base: a2f11547052001bd448ccec81dd1e68409078fbb
> patch link: https://lore.kernel.org/r/20240925205244.873020-2-benno.lossin%40proton.me
> patch subject: [PATCH v2 1/2] rust: add untrusted data abstraction
> config: x86_64-buildonly-randconfig-002-20240927 (https://download.01.org/0day-ci/archive/20240927/202409270303.kUIAmOmY-lkp@intel.com/config)
> compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240927/202409270303.kUIAmOmY-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202409270303.kUIAmOmY-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
>>> error[E0004]: non-exhaustive patterns: `Err(_)` not covered
> --> rust/kernel/validate.rs:188:15
> |
> 188 | match unsafe { value.__init(ptr) } {
> | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ pattern `Err(_)` not covered
> |
> note: `core::result::Result<(), Infallible>` defined here
> --> /opt/cross/rustc-1.78.0-bindgen-0.65.1/rustup/toolchains/1.78.0-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:502:1
> |
> 502 | pub enum Result<T, E> {
> | ^^^^^^^^^^^^^^^^^^^^^
> ...
> 511 | Err(#[stable(feature = "rust1", since = "1.0.0")] E),
> | --- not covered
> = note: the matched value is of type `core::result::Result<(), Infallible>`
> help: ensure that all possible cases are being handled by adding a match arm with a wildcard pattern or an explicit pattern as shown
> |
> 189 ~ Ok(()) => {},
> 190 + Err(_) => todo!()
I didn't include the `Err` variant here, since on nightly it would give
me this error:
error: unreachable pattern
--> rust/kernel/validate.rs:190:13
|
190 | Err(i) => match i {},
| ^^^^^^ matches no values because `Infallible` is uninhabited
|
= note: to learn more about uninhabited types, see https://doc.rust-lang.org/nomicon/exotic-sizes.html#empty-types
= note: `-D unreachable-patterns` implied by `-D warnings`
= help: to override `-D warnings` add `#[allow(unreachable_patterns)]`
error: aborting due to 1 previous error
I can reproduce this error locally in a Rust project, but am unable to
do so on play.rust-lang.org. That seems very strange to me, anyone else
on 1.83 or 1.82 can confirm? AFAIK, it was stabilized in 1.82, see [1].
I would like to use `#[cfg("rust version <= 1.82")]`, but I don't know
how to do that. Do we already have support for checking the version in a
cfg? The alternative would be `#[allow(unreachable_patterns)]`, but I
would like to avoid that.
[1]: https://github.com/rust-lang/rust/issues/119612
---
Cheers,
Benno
> |
>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-26 21:40 ` Benno Lossin
@ 2024-09-26 21:56 ` Miguel Ojeda
2024-09-26 22:15 ` Benno Lossin
2024-09-26 21:57 ` Miguel Ojeda
1 sibling, 1 reply; 15+ messages in thread
From: Miguel Ojeda @ 2024-09-26 21:56 UTC (permalink / raw)
To: Benno Lossin
Cc: kernel test robot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Andreas Hindborg, Alice Ryhl,
Trevor Gross, llvm, oe-kbuild-all, Greg KH, Simona Vetter,
linux-kernel, rust-for-linux
On Thu, Sep 26, 2024 at 11:40 PM Benno Lossin <benno.lossin@proton.me> wrote:
>
> I would like to use `#[cfg("rust version <= 1.82")]`, but I don't know
> how to do that. Do we already have support for checking the version in a
> cfg? The alternative would be `#[allow(unreachable_patterns)]`, but I
> would like to avoid that.
We can do it, but it requires adding a Kconfig symbol, sadly, since
`cfg` does not support numerical comparisons there.
See commit 93dc3be19450 ("docs: rust: include other expressions in
conditional compilation section") where I explained it.
So depending on how often we think we will hit this, we may want to
pay the "cost" of the extra Kconfig symbol. Otherwise, we can just
`allow` it as you say.
(There are discussions in Rust about `cfg(version(...))`, but that is
for the future -- I have some links at
https://github.com/Rust-for-Linux/linux/issues/354)
Cheers,
Miguel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-26 21:40 ` Benno Lossin
2024-09-26 21:56 ` Miguel Ojeda
@ 2024-09-26 21:57 ` Miguel Ojeda
1 sibling, 0 replies; 15+ messages in thread
From: Miguel Ojeda @ 2024-09-26 21:57 UTC (permalink / raw)
To: Benno Lossin
Cc: kernel test robot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Andreas Hindborg, Alice Ryhl,
Trevor Gross, llvm, oe-kbuild-all, Greg KH, Simona Vetter,
linux-kernel, rust-for-linux
On Thu, Sep 26, 2024 at 11:40 PM Benno Lossin <benno.lossin@proton.me> wrote:
>
> I can reproduce this error locally in a Rust project, but am unable to
> do so on play.rust-lang.org. That seems very strange to me, anyone else
> on 1.83 or 1.82 can confirm? AFAIK, it was stabilized in 1.82, see [1].
Can you try in Compiler Explorer? That allows you to select more versions etc.
Cheers,
Miguel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-26 21:56 ` Miguel Ojeda
@ 2024-09-26 22:15 ` Benno Lossin
2024-09-27 8:39 ` Miguel Ojeda
0 siblings, 1 reply; 15+ messages in thread
From: Benno Lossin @ 2024-09-26 22:15 UTC (permalink / raw)
To: Miguel Ojeda
Cc: kernel test robot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Andreas Hindborg, Alice Ryhl,
Trevor Gross, llvm, oe-kbuild-all, Greg KH, Simona Vetter,
linux-kernel, rust-for-linux
On 26.09.24 23:56, Miguel Ojeda wrote:
> On Thu, Sep 26, 2024 at 11:40 PM Benno Lossin <benno.lossin@proton.me> wrote:
>>
>> I would like to use `#[cfg("rust version <= 1.82")]`, but I don't know
>> how to do that. Do we already have support for checking the version in a
>> cfg? The alternative would be `#[allow(unreachable_patterns)]`, but I
>> would like to avoid that.
>
> We can do it, but it requires adding a Kconfig symbol, sadly, since
> `cfg` does not support numerical comparisons there.
>
> See commit 93dc3be19450 ("docs: rust: include other expressions in
> conditional compilation section") where I explained it.
Thanks!
> So depending on how often we think we will hit this, we may want to
> pay the "cost" of the extra Kconfig symbol. Otherwise, we can just
> `allow` it as you say.
Personally, I would prefer adding a symbol. Since if we allow it, then
it might take a long time until the code is removed once we increase the
minimum version.
It would be best, if it produces an error when we raise the minimum
version beyond the one represented by the symbol. Is that already the
case?
On 26.09.24 23:57, Miguel Ojeda wrote:
> On Thu, Sep 26, 2024 at 11:40 PM Benno Lossin <benno.lossin@proton.me> wrote:
>>
>> I can reproduce this error locally in a Rust project, but am unable to
>> do so on play.rust-lang.org. That seems very strange to me, anyone else
>> on 1.83 or 1.82 can confirm? AFAIK, it was stabilized in 1.82, see [1].
>
> Can you try in Compiler Explorer? That allows you to select more versions etc.
Gave it a try and I also can't reproduce the error there...
---
Cheers,
Benno
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/2] rust: switch uaccess to untrusted data API
2024-09-25 20:53 ` [PATCH v2 2/2] rust: switch uaccess to untrusted data API Benno Lossin
2024-09-26 11:09 ` Simona Vetter
@ 2024-09-26 23:56 ` kernel test robot
1 sibling, 0 replies; 15+ messages in thread
From: kernel test robot @ 2024-09-26 23:56 UTC (permalink / raw)
To: Benno Lossin, Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Björn Roy Baron, Andreas Hindborg, Alice Ryhl, Trevor Gross
Cc: llvm, oe-kbuild-all, Greg KH, Simona Vetter, rust-for-linux,
linux-kernel
Hi Benno,
kernel test robot noticed the following build errors:
[auto build test ERROR on a2f11547052001bd448ccec81dd1e68409078fbb]
url: https://github.com/intel-lab-lkp/linux/commits/Benno-Lossin/rust-add-untrusted-data-abstraction/20240926-045445
base: a2f11547052001bd448ccec81dd1e68409078fbb
patch link: https://lore.kernel.org/r/20240925205244.873020-3-benno.lossin%40proton.me
patch subject: [PATCH v2 2/2] rust: switch uaccess to untrusted data API
config: x86_64-buildonly-randconfig-002-20240927 (https://download.01.org/0day-ci/archive/20240927/202409270711.g9QhxwTf-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240927/202409270711.g9QhxwTf-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409270711.g9QhxwTf-lkp@intel.com/
All errors (new ones prefixed by >>):
>> error[E0658]: use of unstable library feature 'slice_ptr_len'
--> rust/kernel/uaccess.rs:214:27
|
214 | let len = ptr.len();
| ^^^
|
= note: see issue #71146 <https://github.com/rust-lang/rust/issues/71146> for more information
= help: add `#![feature(slice_ptr_len)]` to the crate attributes to enable
= note: this compiler was built on 2024-04-29; consider upgrading it if it is out of date
--
>> error[E0658]: use of unstable library feature 'slice_ptr_len'
--> rust/kernel/uaccess.rs:339:28
|
339 | let len = data_ptr.len();
| ^^^
|
= note: see issue #71146 <https://github.com/rust-lang/rust/issues/71146> for more information
= help: add `#![feature(slice_ptr_len)]` to the crate attributes to enable
= note: this compiler was built on 2024-04-29; consider upgrading it if it is out of date
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-26 22:15 ` Benno Lossin
@ 2024-09-27 8:39 ` Miguel Ojeda
2024-09-27 9:06 ` Benno Lossin
0 siblings, 1 reply; 15+ messages in thread
From: Miguel Ojeda @ 2024-09-27 8:39 UTC (permalink / raw)
To: Benno Lossin
Cc: kernel test robot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Andreas Hindborg, Alice Ryhl,
Trevor Gross, llvm, oe-kbuild-all, Greg KH, Simona Vetter,
linux-kernel, rust-for-linux
On Fri, Sep 27, 2024 at 12:15 AM Benno Lossin <benno.lossin@proton.me> wrote:
>
> Personally, I would prefer adding a symbol. Since if we allow it, then
> it might take a long time until the code is removed once we increase the
> minimum version.
>
> It would be best, if it produces an error when we raise the minimum
> version beyond the one represented by the symbol. Is that already the
> case?
No -- that is a nice idea that I guess could be implemented in Kconfig
somewhere (i.e. when checking conditions). However, one of the common
things to do when upgrading the minimum is to review/clean the
`*_VERSION` uses anyway (and they may occur outside Kconfig files
too), and also sometimes one wants to upgrade a minimum without doing
the cleanups immediately (e.g. the recently proposed GCC 8.1 upgrade).
> Gave it a try and I also can't reproduce the error there...
Hmm... I think CE uses the Rust-provided binaries (and I guess the
playground too). Do you have a link ("Share" in CE)?
Cheers,
Miguel
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-27 8:39 ` Miguel Ojeda
@ 2024-09-27 9:06 ` Benno Lossin
0 siblings, 0 replies; 15+ messages in thread
From: Benno Lossin @ 2024-09-27 9:06 UTC (permalink / raw)
To: Miguel Ojeda
Cc: kernel test robot, Miguel Ojeda, Alex Gaynor, Boqun Feng,
Gary Guo, Björn Roy Baron, Andreas Hindborg, Alice Ryhl,
Trevor Gross, llvm, oe-kbuild-all, Greg KH, Simona Vetter,
linux-kernel, rust-for-linux
On 27.09.24 10:39, Miguel Ojeda wrote:
> On Fri, Sep 27, 2024 at 12:15 AM Benno Lossin <benno.lossin@proton.me> wrote:
>>
>> Personally, I would prefer adding a symbol. Since if we allow it, then
>> it might take a long time until the code is removed once we increase the
>> minimum version.
>>
>> It would be best, if it produces an error when we raise the minimum
>> version beyond the one represented by the symbol. Is that already the
>> case?
>
> No -- that is a nice idea that I guess could be implemented in Kconfig
> somewhere (i.e. when checking conditions). However, one of the common
> things to do when upgrading the minimum is to review/clean the
> `*_VERSION` uses anyway (and they may occur outside Kconfig files
> too), and also sometimes one wants to upgrade a minimum without doing
> the cleanups immediately (e.g. the recently proposed GCC 8.1 upgrade).
>
>> Gave it a try and I also can't reproduce the error there...
>
> Hmm... I think CE uses the Rust-provided binaries (and I guess the
> playground too). Do you have a link ("Share" in CE)?
Here you go: https://godbolt.org/z/qE8E9M56c
The error that I get locally:
error: unreachable pattern
--> src/main.rs:8:9
|
8 | Err(e) => match e {},
| ^^^^^^ matches no values because `Infallible` is uninhabited
|
= note: to learn more about uninhabited types, see https://doc.rust-lang.org/nomicon/exotic-sizes.html#empty-types
note: the lint level is defined here
--> src/main.rs:1:9
|
1 | #![deny(unreachable_patterns)]
| ^^^^^^^^^^^^^^^^^^^^
error: could not compile `rust` (bin "rust") due to 1 previous error
---
Cheers,
Benno
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-26 10:40 ` Simona Vetter
@ 2024-09-30 14:04 ` Benno Lossin
2024-11-26 8:05 ` Simona Vetter
0 siblings, 1 reply; 15+ messages in thread
From: Benno Lossin @ 2024-09-30 14:04 UTC (permalink / raw)
To: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Björn Roy Baron, Andreas Hindborg, Alice Ryhl, Trevor Gross,
Greg KH, linux-kernel, rust-for-linux
On 26.09.24 12:40, Simona Vetter wrote:
> On Wed, Sep 25, 2024 at 08:53:05PM +0000, Benno Lossin wrote:
>> + /// Sets the underlying untrusted value.
>> + ///
>> + /// # Examples
>> + ///
>> + /// ```
>> + /// use kernel::validate::Untrusted;
>> + ///
>> + /// let mut untrusted = Untrusted::new(42);
>> + /// untrusted.write(24);
>> + /// ```
>> + pub fn write(&mut self, value: impl Init<T>) {
>> + let ptr: *mut T = &mut self.0 .0;
>> + // SAFETY: `ptr` came from a mutable reference and the value is overwritten before it is
>> + // read.
>> + unsafe { ptr::drop_in_place(ptr) };
>> + // SAFETY: `ptr` came from a mutable reference and the initializer cannot error.
>> + match unsafe { value.__init(ptr) } {
>> + Ok(()) => {}
>> + }
>> + }
>
> So I think this doesn't work for a few reasons:
>
> - there's a lot of unsafe both here and when using this (even without
> MaybeUninit), at least if I understand things correctly. And forcing
> unsafe for something that inheritedly does not break any of the rust
> memory safety rules feels silly.
>
> - it also looks awkward, because you have the special versions
> write_uninit and write_uninit_slice, and imo if our write isn't good
> enought to just work with abritrary box/container types, it's not good.
>
> - it doesn't actually work for one of the main uaccess.rs use-cases beyond
> just enabling the current interface. Example in pseudo-rust:
>
> struct IoctlParams {
> input: u32,
> ouptut: u32,
> }
>
> The thing is that ioctl that use the struct approach like drm does, use
> the same struct if there's both input and output paramterers, and
> furthermore we are not allowed to overwrite the entire struct because
> that breaks ioctl restarting. So the flow is roughly
>
> let userptr : UserSlice;
> let params : Untrusted<IoctlParams>;
>
> userptr.read(params));
>
> // validate params, do something interesting with it params.input
>
> // this is _not_ allowed to overwrite params.input but must leave it
> // unchanged
>
> params.write(|x| { x.output = 42; });
>
> userptr.write(params);
>
> Your current write doesn't allow this case, and I think that's not good
> enough.
I see. One thing that makes this API work better is field projections
(it's a rust language feature for which we do have a PoC macro
implementation and which I am working on getting into the language):
To only overwrite a part of a struct you would do
params->output.write(42);
The `->` syntax "projects" the `output` field, turning
`Untrusted<IoctlParams>` into `Untrusted<u32>`.
I think that this would solve your problem with the API. (we probably
need a stop gap solution though until field projections are part of the
language).
> The one I propsed in private does:
>
> Untrusted<T>::write(&mut self, Fn(&mut T)->())
>
> It's not perfect because it allows you to do other stuff than writing,
> but it's still pretty good by limiting any potentially dangerous code to
> the provided closure. And with these of protection apis we need to
> strike the right balance between as much protection as possible while
> still allowing users to get the job done.
Ideally I would like to avoid exposing the value.
> Now I guess you can do this with your write too using a copy constructor
> approach, but that will result in less efficient code and it's probably
> also too much functional programming design for kernel developers to
> cope with.
Agreed.
>> +
>> + /// Turns a slice of untrusted values into an untrusted slice of values.
>> + pub fn transpose_slice(slice: &[Untrusted<T>]) -> &Untrusted<[T]>
>> + where
>> + T: Sized,
>> + {
>> + let ptr = slice.as_ptr().cast::<T>();
>> + // SAFETY: `ptr` and `len` come from the same slice reference.
>> + let slice = unsafe { slice::from_raw_parts(ptr, slice.len()) };
>> + Untrusted::new_ref(slice)
>> + }
>> +
>> + /// Turns a slice of uninitialized, untrusted values into an untrusted slice of uninitialized
>> + /// values.
>> + pub fn transpose_slice_uninit(
>> + slice: &[MaybeUninit<Untrusted<T>>],
>> + ) -> &Untrusted<[MaybeUninit<T>]>
>
> So these and some of the related functions to handle slice of box vs box
> of slice feel a bit awkward. I think we can do better if we rename
> Untrusted and Unvalidated to UntrustedBox and UnvalidatedBox, and then
> make Untrusted and Unvalidated traits, with implementations for
> UntrustedBox<T>, [Untrusted<T>], and all the others we want.
Hmm that might work for `Untrusted`, I will see how that looks. For
`Unvalidated` I am less confident.
> I expect that in the future we'll get more boxes with special semantics
> that we want to use together with Untrusted, not just [] and MaybeUninit.
> One example would be when the Untrusted data is shared with userspace
> (mmap()ed into both kernel and userspace for example for a ringbuffer), in
> which case the data is both Untrusted but also SharedUnsafe or whatever
> we'll call memory that's fundamentally breaking the rust memory model
> because there's no exclusive access (userspace can always do whatever it
> feels like), and all access has to go through at least
> READ_ONCE/WRITE_ONCE from the C side, or often even full blown atomics.
I don't think that we need to wrap that in `Untrusted`, instead the type
representing shared data between kernel and userspace should return
`Untrusted<u8>` instead of `u8`.
> And that memory is not MaybeUninit, because we have to initialize it
> before userspace can mmap it, for otherwise it's an information leak and
> hence security issue.
>
> tldr; I think going with traits for Untrusted and Unvalidated here will be
> worth it, even if a bit more pain at first. Plus cleaner interfaces.
>
>> + where
>> + T: Sized,
>> + {
>> + let ptr = slice.as_ptr().cast::<MaybeUninit<T>>();
>> + // SAFETY: `ptr` and `len` come from the same mutable slice reference.
>> + let slice = unsafe { slice::from_raw_parts(ptr, slice.len()) };
>> + Untrusted::new_ref(slice)
>> + }
[...]
>> +/// Validates untrusted data.
>> +///
>> +/// # Examples
>> +///
>> +/// The simplest way to validate data is to just implement `Validate<&Unvalidated<[u8]>>` for the
>> +/// type that you wish to validate:
>> +///
>> +/// ```
>> +/// use kernel::{
>> +/// error::{code::EINVAL, Error},
>> +/// str::{CStr, CString},
>> +/// validate::{Unvalidated, Validate},
>> +/// };
>> +///
>> +/// struct Data {
>> +/// flags: u8,
>> +/// name: CString,
>> +/// }
>> +///
>> +/// impl Validate<&Unvalidated<[u8]>> for Data {
>> +/// type Err = Error;
>> +///
>> +/// fn validate(unvalidated: &Unvalidated<[u8]>) -> Result<Self, Self::Err> {
>> +/// let raw = unvalidated.raw();
>> +/// let (&flags, name) = raw.split_first().ok_or(EINVAL)?;
>> +/// let name = CStr::from_bytes_with_nul(name)?.to_cstring()?;
>> +/// Ok(Data { flags, name })
>> +/// }
>> +/// }
>> +/// ```
>> +///
>> +/// This approach copies the data and requires allocation. If you want to avoid the allocation and
>> +/// copying the data, you can borrow from the input like this:
>> +///
>> +/// ```
>> +/// use kernel::{
>> +/// error::{code::EINVAL, Error},
>> +/// str::CStr,
>> +/// validate::{Unvalidated, Validate},
>> +/// };
>> +///
>> +/// struct Data<'a> {
>> +/// flags: u8,
>> +/// name: &'a CStr,
>> +/// }
>> +///
>> +/// impl<'a> Validate<&'a Unvalidated<[u8]>> for Data<'a> {
>> +/// type Err = Error;
>> +///
>> +/// fn validate(unvalidated: &'a Unvalidated<[u8]>) -> Result<Self, Self::Err> {
>> +/// let raw = unvalidated.raw();
>> +/// let (&flags, name) = raw.split_first().ok_or(EINVAL)?;
>> +/// let name = CStr::from_bytes_with_nul(name)?;
>> +/// Ok(Data { flags, name })
>> +/// }
>> +/// }
>> +/// ```
>> +///
>> +/// If you need to in-place validate your data, you currently need to resort to `unsafe`:
>> +///
>> +/// ```
>> +/// use kernel::{
>> +/// error::{code::EINVAL, Error},
>> +/// str::CStr,
>> +/// validate::{Unvalidated, Validate},
>> +/// };
>> +/// use core::mem;
>> +///
>> +/// // Important: use `repr(C)`, this ensures a linear layout of this type.
>> +/// #[repr(C)]
>> +/// struct Data {
>> +/// version: u8,
>> +/// flags: u8,
>> +/// _reserved: [u8; 2],
>> +/// count: u64,
>> +/// // lots of other fields...
>> +/// }
>> +///
>> +/// impl Validate<&Unvalidated<[u8]>> for &Data {
>> +/// type Err = Error;
>> +///
>> +/// fn validate(unvalidated: &Unvalidated<[u8]>) -> Result<Self, Self::Err> {
>> +/// let raw = unvalidated.raw();
>> +/// if raw.len() < mem::size_of::<Data>() {
>> +/// return Err(EINVAL);
>> +/// }
>> +/// // can only handle version 0
>> +/// if raw[0] != 0 {
>> +/// return Err(EINVAL);
>> +/// }
>> +/// // version 0 only uses the lower 4 bits of flags
>> +/// if raw[1] & 0xf0 != 0 {
>> +/// return Err(EINVAL);
>> +/// }
>> +/// let ptr = raw.as_ptr();
>> +/// // CAST: `Data` only contains integers and has `repr(C)`.
>> +/// let ptr = ptr.cast::<Data>();
>> +/// // SAFETY: `ptr` came from a reference and the cast above is valid.
>> +/// Ok(unsafe { &*ptr })
>> +/// }
>> +/// }
>> +/// ```
>> +///
>> +/// To be able to modify the parsed data, while still supporting zero-copy, you can implement
>> +/// `Validate<&mut Unvalidated<[u8]>>`:
>> +///
>> +/// ```
>> +/// use kernel::{
>> +/// error::{code::EINVAL, Error},
>> +/// str::CStr,
>> +/// validate::{Unvalidated, Validate},
>> +/// };
>> +/// use core::mem;
>> +///
>> +/// // Important: use `repr(C)`, this ensures a linear layout of this type.
>> +/// #[repr(C)]
>> +/// struct Data {
>> +/// version: u8,
>> +/// flags: u8,
>> +/// _reserved: [u8; 2],
>> +/// count: u64,
>> +/// // lots of other fields...
>> +/// }
>> +///
>> +/// impl Validate<&mut Unvalidated<[u8]>> for &Data {
>> +/// type Err = Error;
>
> I think we should make that one the default, but not sure that's doable
> with associated types instead of generics.
There is the `associated_type_defaults` unstable feature:
https://github.com/rust-lang/rust/issues/29661
But I actually think that we should get away from making the `Error`
type the return error of Rust functions. It's much better to have
function-specific errors, since they can be more descriptive. In cases
where we have to return to userspace or C, sure use the existing
`Error`.
> The input parameter should definitely default to the output paramter,
> because often they're just exactly the same e.g. when validating ioctl
> input structures.
I don't understand what you mean by this? The input parameter has to be
some sort of `..Untrusted..` and the output shouldn't have that.
> On this version you've also left out the in-place validation (originally
> validate_bytes), but we can add that later on when a need arises I guess.
> Or do you think it's just not a good idea in general?
This version supports in-place validation, this example and the one
above showcase how you would implement `Validate` in that case.
You can then call `.vaildate()` on an `&[mut ]Unvalidated<[u8]>` and you
get a `&Data` that borrows from the unvalidated data without copying the
data.
---
Cheers,
Benno
>> +///
>> +/// fn validate(unvalidated: &mut Unvalidated<[u8]>) -> Result<Self, Self::Err> {
>> +/// let raw = unvalidated.raw_mut();
>> +/// if raw.len() < mem::size_of::<Data>() {
>> +/// return Err(EINVAL);
>> +/// }
>> +/// match raw[0] {
>> +/// 0 => {},
>> +/// 1 => {
>> +/// // version 1 implicitly sets the first bit.
>> +/// raw[1] |= 1;
>> +/// },
>> +/// // can only handle version 0 and 1
>> +/// _ => return Err(EINVAL),
>> +/// }
>> +/// // version 0 and 1 only use the lower 4 bits of flags
>> +/// if raw[1] & 0xf0 != 0 {
>> +/// return Err(EINVAL);
>> +/// }
>> +/// if raw[1] == 0 {}
>> +/// let ptr = raw.as_ptr();
>> +/// // CAST: `Data` only contains integers and has `repr(C)`.
>> +/// let ptr = ptr.cast::<Data>();
>> +/// // SAFETY: `ptr` came from a reference and the cast above is valid.
>> +/// Ok(unsafe { &*ptr })
>> +/// }
>> +/// }
>> +/// ```
>> +pub trait Validate<I: ValidateInput>: Sized {
>> + /// Validation error.
>> + type Err;
>> +
>> + /// Validate the given untrusted data and parse it into the output type.
>> + fn validate(unvalidated: I) -> Result<Self, Self::Err>;
>> +}
^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/2] rust: add untrusted data abstraction
2024-09-30 14:04 ` Benno Lossin
@ 2024-11-26 8:05 ` Simona Vetter
0 siblings, 0 replies; 15+ messages in thread
From: Simona Vetter @ 2024-11-26 8:05 UTC (permalink / raw)
To: Benno Lossin
Cc: Miguel Ojeda, Alex Gaynor, Boqun Feng, Gary Guo,
Björn Roy Baron, Andreas Hindborg, Alice Ryhl, Trevor Gross,
Greg KH, linux-kernel, rust-for-linux
Sorry for the super late reply, was absorbed too much with life stuff :-/
On Mon, Sep 30, 2024 at 02:04:06PM +0000, Benno Lossin wrote:
> On 26.09.24 12:40, Simona Vetter wrote:
> > On Wed, Sep 25, 2024 at 08:53:05PM +0000, Benno Lossin wrote:
> >> + /// Sets the underlying untrusted value.
> >> + ///
> >> + /// # Examples
> >> + ///
> >> + /// ```
> >> + /// use kernel::validate::Untrusted;
> >> + ///
> >> + /// let mut untrusted = Untrusted::new(42);
> >> + /// untrusted.write(24);
> >> + /// ```
> >> + pub fn write(&mut self, value: impl Init<T>) {
> >> + let ptr: *mut T = &mut self.0 .0;
> >> + // SAFETY: `ptr` came from a mutable reference and the value is overwritten before it is
> >> + // read.
> >> + unsafe { ptr::drop_in_place(ptr) };
> >> + // SAFETY: `ptr` came from a mutable reference and the initializer cannot error.
> >> + match unsafe { value.__init(ptr) } {
> >> + Ok(()) => {}
> >> + }
> >> + }
> >
> > So I think this doesn't work for a few reasons:
> >
> > - there's a lot of unsafe both here and when using this (even without
> > MaybeUninit), at least if I understand things correctly. And forcing
> > unsafe for something that inheritedly does not break any of the rust
> > memory safety rules feels silly.
> >
> > - it also looks awkward, because you have the special versions
> > write_uninit and write_uninit_slice, and imo if our write isn't good
> > enought to just work with abritrary box/container types, it's not good.
> >
> > - it doesn't actually work for one of the main uaccess.rs use-cases beyond
> > just enabling the current interface. Example in pseudo-rust:
> >
> > struct IoctlParams {
> > input: u32,
> > ouptut: u32,
> > }
> >
> > The thing is that ioctl that use the struct approach like drm does, use
> > the same struct if there's both input and output paramterers, and
> > furthermore we are not allowed to overwrite the entire struct because
> > that breaks ioctl restarting. So the flow is roughly
> >
> > let userptr : UserSlice;
> > let params : Untrusted<IoctlParams>;
> >
> > userptr.read(params));
> >
> > // validate params, do something interesting with it params.input
> >
> > // this is _not_ allowed to overwrite params.input but must leave it
> > // unchanged
> >
> > params.write(|x| { x.output = 42; });
> >
> > userptr.write(params);
> >
> > Your current write doesn't allow this case, and I think that's not good
> > enough.
>
> I see. One thing that makes this API work better is field projections
> (it's a rust language feature for which we do have a PoC macro
> implementation and which I am working on getting into the language):
> To only overwrite a part of a struct you would do
>
> params->output.write(42);
>
> The `->` syntax "projects" the `output` field, turning
> `Untrusted<IoctlParams>` into `Untrusted<u32>`.
>
> I think that this would solve your problem with the API. (we probably
> need a stop gap solution though until field projections are part of the
> language).
>
> > The one I propsed in private does:
> >
> > Untrusted<T>::write(&mut self, Fn(&mut T)->())
> >
> > It's not perfect because it allows you to do other stuff than writing,
> > but it's still pretty good by limiting any potentially dangerous code to
> > the provided closure. And with these of protection apis we need to
> > strike the right balance between as much protection as possible while
> > still allowing users to get the job done.
>
> Ideally I would like to avoid exposing the value.
>
> > Now I guess you can do this with your write too using a copy constructor
> > approach, but that will result in less efficient code and it's probably
> > also too much functional programming design for kernel developers to
> > cope with.
>
> Agreed.
>
> >> +
> >> + /// Turns a slice of untrusted values into an untrusted slice of values.
> >> + pub fn transpose_slice(slice: &[Untrusted<T>]) -> &Untrusted<[T]>
> >> + where
> >> + T: Sized,
> >> + {
> >> + let ptr = slice.as_ptr().cast::<T>();
> >> + // SAFETY: `ptr` and `len` come from the same slice reference.
> >> + let slice = unsafe { slice::from_raw_parts(ptr, slice.len()) };
> >> + Untrusted::new_ref(slice)
> >> + }
> >> +
> >> + /// Turns a slice of uninitialized, untrusted values into an untrusted slice of uninitialized
> >> + /// values.
> >> + pub fn transpose_slice_uninit(
> >> + slice: &[MaybeUninit<Untrusted<T>>],
> >> + ) -> &Untrusted<[MaybeUninit<T>]>
> >
> > So these and some of the related functions to handle slice of box vs box
> > of slice feel a bit awkward. I think we can do better if we rename
> > Untrusted and Unvalidated to UntrustedBox and UnvalidatedBox, and then
> > make Untrusted and Unvalidated traits, with implementations for
> > UntrustedBox<T>, [Untrusted<T>], and all the others we want.
>
> Hmm that might work for `Untrusted`, I will see how that looks. For
> `Unvalidated` I am less confident.
>
> > I expect that in the future we'll get more boxes with special semantics
> > that we want to use together with Untrusted, not just [] and MaybeUninit.
> > One example would be when the Untrusted data is shared with userspace
> > (mmap()ed into both kernel and userspace for example for a ringbuffer), in
> > which case the data is both Untrusted but also SharedUnsafe or whatever
> > we'll call memory that's fundamentally breaking the rust memory model
> > because there's no exclusive access (userspace can always do whatever it
> > feels like), and all access has to go through at least
> > READ_ONCE/WRITE_ONCE from the C side, or often even full blown atomics.
>
> I don't think that we need to wrap that in `Untrusted`, instead the type
> representing shared data between kernel and userspace should return
> `Untrusted<u8>` instead of `u8`.
>
> > And that memory is not MaybeUninit, because we have to initialize it
> > before userspace can mmap it, for otherwise it's an information leak and
> > hence security issue.
> >
> > tldr; I think going with traits for Untrusted and Unvalidated here will be
> > worth it, even if a bit more pain at first. Plus cleaner interfaces.
> >
> >> + where
> >> + T: Sized,
> >> + {
> >> + let ptr = slice.as_ptr().cast::<MaybeUninit<T>>();
> >> + // SAFETY: `ptr` and `len` come from the same mutable slice reference.
> >> + let slice = unsafe { slice::from_raw_parts(ptr, slice.len()) };
> >> + Untrusted::new_ref(slice)
> >> + }
>
> [...]
>
> >> +/// Validates untrusted data.
> >> +///
> >> +/// # Examples
> >> +///
> >> +/// The simplest way to validate data is to just implement `Validate<&Unvalidated<[u8]>>` for the
> >> +/// type that you wish to validate:
> >> +///
> >> +/// ```
> >> +/// use kernel::{
> >> +/// error::{code::EINVAL, Error},
> >> +/// str::{CStr, CString},
> >> +/// validate::{Unvalidated, Validate},
> >> +/// };
> >> +///
> >> +/// struct Data {
> >> +/// flags: u8,
> >> +/// name: CString,
> >> +/// }
> >> +///
> >> +/// impl Validate<&Unvalidated<[u8]>> for Data {
> >> +/// type Err = Error;
> >> +///
> >> +/// fn validate(unvalidated: &Unvalidated<[u8]>) -> Result<Self, Self::Err> {
> >> +/// let raw = unvalidated.raw();
> >> +/// let (&flags, name) = raw.split_first().ok_or(EINVAL)?;
> >> +/// let name = CStr::from_bytes_with_nul(name)?.to_cstring()?;
> >> +/// Ok(Data { flags, name })
> >> +/// }
> >> +/// }
> >> +/// ```
> >> +///
> >> +/// This approach copies the data and requires allocation. If you want to avoid the allocation and
> >> +/// copying the data, you can borrow from the input like this:
> >> +///
> >> +/// ```
> >> +/// use kernel::{
> >> +/// error::{code::EINVAL, Error},
> >> +/// str::CStr,
> >> +/// validate::{Unvalidated, Validate},
> >> +/// };
> >> +///
> >> +/// struct Data<'a> {
> >> +/// flags: u8,
> >> +/// name: &'a CStr,
> >> +/// }
> >> +///
> >> +/// impl<'a> Validate<&'a Unvalidated<[u8]>> for Data<'a> {
> >> +/// type Err = Error;
> >> +///
> >> +/// fn validate(unvalidated: &'a Unvalidated<[u8]>) -> Result<Self, Self::Err> {
> >> +/// let raw = unvalidated.raw();
> >> +/// let (&flags, name) = raw.split_first().ok_or(EINVAL)?;
> >> +/// let name = CStr::from_bytes_with_nul(name)?;
> >> +/// Ok(Data { flags, name })
> >> +/// }
> >> +/// }
> >> +/// ```
> >> +///
> >> +/// If you need to in-place validate your data, you currently need to resort to `unsafe`:
> >> +///
> >> +/// ```
> >> +/// use kernel::{
> >> +/// error::{code::EINVAL, Error},
> >> +/// str::CStr,
> >> +/// validate::{Unvalidated, Validate},
> >> +/// };
> >> +/// use core::mem;
> >> +///
> >> +/// // Important: use `repr(C)`, this ensures a linear layout of this type.
> >> +/// #[repr(C)]
> >> +/// struct Data {
> >> +/// version: u8,
> >> +/// flags: u8,
> >> +/// _reserved: [u8; 2],
> >> +/// count: u64,
> >> +/// // lots of other fields...
> >> +/// }
> >> +///
> >> +/// impl Validate<&Unvalidated<[u8]>> for &Data {
> >> +/// type Err = Error;
> >> +///
> >> +/// fn validate(unvalidated: &Unvalidated<[u8]>) -> Result<Self, Self::Err> {
> >> +/// let raw = unvalidated.raw();
> >> +/// if raw.len() < mem::size_of::<Data>() {
> >> +/// return Err(EINVAL);
> >> +/// }
> >> +/// // can only handle version 0
> >> +/// if raw[0] != 0 {
> >> +/// return Err(EINVAL);
> >> +/// }
> >> +/// // version 0 only uses the lower 4 bits of flags
> >> +/// if raw[1] & 0xf0 != 0 {
> >> +/// return Err(EINVAL);
> >> +/// }
> >> +/// let ptr = raw.as_ptr();
> >> +/// // CAST: `Data` only contains integers and has `repr(C)`.
> >> +/// let ptr = ptr.cast::<Data>();
> >> +/// // SAFETY: `ptr` came from a reference and the cast above is valid.
> >> +/// Ok(unsafe { &*ptr })
> >> +/// }
> >> +/// }
> >> +/// ```
> >> +///
> >> +/// To be able to modify the parsed data, while still supporting zero-copy, you can implement
> >> +/// `Validate<&mut Unvalidated<[u8]>>`:
> >> +///
> >> +/// ```
> >> +/// use kernel::{
> >> +/// error::{code::EINVAL, Error},
> >> +/// str::CStr,
> >> +/// validate::{Unvalidated, Validate},
> >> +/// };
> >> +/// use core::mem;
> >> +///
> >> +/// // Important: use `repr(C)`, this ensures a linear layout of this type.
> >> +/// #[repr(C)]
> >> +/// struct Data {
> >> +/// version: u8,
> >> +/// flags: u8,
> >> +/// _reserved: [u8; 2],
> >> +/// count: u64,
> >> +/// // lots of other fields...
> >> +/// }
> >> +///
> >> +/// impl Validate<&mut Unvalidated<[u8]>> for &Data {
> >> +/// type Err = Error;
> >
> > I think we should make that one the default, but not sure that's doable
> > with associated types instead of generics.
>
> There is the `associated_type_defaults` unstable feature:
> https://github.com/rust-lang/rust/issues/29661
>
> But I actually think that we should get away from making the `Error`
> type the return error of Rust functions. It's much better to have
> function-specific errors, since they can be more descriptive. In cases
> where we have to return to userspace or C, sure use the existing
> `Error`.
Do you mean a area-specific Error code that only returns the values that
are actually possible? I fear a bit that that could be a maintenance pain,
when the C side suddenly returns a new errno value. Or do you mean
something else.
> > The input parameter should definitely default to the output paramter,
> > because often they're just exactly the same e.g. when validating ioctl
> > input structures.
>
> I don't understand what you mean by this? The input parameter has to be
> some sort of `..Untrusted..` and the output shouldn't have that.
Yeah it needs to be wrapped, but I was wondering whether there's a way to
make Untrusted<T> default to T as the output. If I remember all my
thoughts correctly, it's been a month.
> > On this version you've also left out the in-place validation (originally
> > validate_bytes), but we can add that later on when a need arises I guess.
> > Or do you think it's just not a good idea in general?
>
> This version supports in-place validation, this example and the one
> above showcase how you would implement `Validate` in that case.
>
> You can then call `.vaildate()` on an `&[mut ]Unvalidated<[u8]>` and you
> get a `&Data` that borrows from the unvalidated data without copying the
> data.
Ah I missed that, thanks for explaining.
Cheers, Sima
>
> ---
> Cheers,
> Benno
>
> >> +///
> >> +/// fn validate(unvalidated: &mut Unvalidated<[u8]>) -> Result<Self, Self::Err> {
> >> +/// let raw = unvalidated.raw_mut();
> >> +/// if raw.len() < mem::size_of::<Data>() {
> >> +/// return Err(EINVAL);
> >> +/// }
> >> +/// match raw[0] {
> >> +/// 0 => {},
> >> +/// 1 => {
> >> +/// // version 1 implicitly sets the first bit.
> >> +/// raw[1] |= 1;
> >> +/// },
> >> +/// // can only handle version 0 and 1
> >> +/// _ => return Err(EINVAL),
> >> +/// }
> >> +/// // version 0 and 1 only use the lower 4 bits of flags
> >> +/// if raw[1] & 0xf0 != 0 {
> >> +/// return Err(EINVAL);
> >> +/// }
> >> +/// if raw[1] == 0 {}
> >> +/// let ptr = raw.as_ptr();
> >> +/// // CAST: `Data` only contains integers and has `repr(C)`.
> >> +/// let ptr = ptr.cast::<Data>();
> >> +/// // SAFETY: `ptr` came from a reference and the cast above is valid.
> >> +/// Ok(unsafe { &*ptr })
> >> +/// }
> >> +/// }
> >> +/// ```
> >> +pub trait Validate<I: ValidateInput>: Sized {
> >> + /// Validation error.
> >> + type Err;
> >> +
> >> + /// Validate the given untrusted data and parse it into the output type.
> >> + fn validate(unvalidated: I) -> Result<Self, Self::Err>;
> >> +}
>
--
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
^ permalink raw reply [flat|nested] 15+ messages in thread
end of thread, other threads:[~2024-11-26 8:05 UTC | newest]
Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20240925205244.873020-1-benno.lossin@proton.me>
2024-09-25 20:53 ` [PATCH v2 1/2] rust: add untrusted data abstraction Benno Lossin
2024-09-26 7:08 ` Dirk Behme
2024-09-26 10:40 ` Simona Vetter
2024-09-30 14:04 ` Benno Lossin
2024-11-26 8:05 ` Simona Vetter
2024-09-26 20:31 ` kernel test robot
2024-09-26 21:40 ` Benno Lossin
2024-09-26 21:56 ` Miguel Ojeda
2024-09-26 22:15 ` Benno Lossin
2024-09-27 8:39 ` Miguel Ojeda
2024-09-27 9:06 ` Benno Lossin
2024-09-26 21:57 ` Miguel Ojeda
2024-09-25 20:53 ` [PATCH v2 2/2] rust: switch uaccess to untrusted data API Benno Lossin
2024-09-26 11:09 ` Simona Vetter
2024-09-26 23:56 ` kernel test robot
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox