From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Wed, 11 Dec 2024 10:37:06 +0000
In-Reply-To: <20241211-vma-v11-0-466640428fc3@google.com>
Mime-Version: 1.0
References: <20241211-vma-v11-0-466640428fc3@google.com>
Message-ID: <20241211-vma-v11-2-466640428fc3@google.com>
Subject: [PATCH v11 2/8] mm: rust: add vm_area_struct methods that require read access
From: Alice Ryhl
To: Miguel Ojeda, Matthew Wilcox, Lorenzo Stoakes, Vlastimil Babka,
 John Hubbard, "Liam R.
 Howlett", Andrew Morton, Greg Kroah-Hartman, Arnd Bergmann,
 Christian Brauner, Jann Horn, Suren Baghdasaryan
Cc: Alex Gaynor, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin,
 Andreas Hindborg, Trevor Gross, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, rust-for-linux@vger.kernel.org, Alice Ryhl
Content-Type: text/plain; charset="utf-8"

This adds a type called VmAreaRef which is used when referencing a vma
that you have read access to. Here, read access means that you hold
either the mmap read lock or the vma read lock (or stronger).

Additionally, a vma_lookup method is added to the mmap read guard, which
enables you to obtain a &VmAreaRef in safe Rust code.

This patch only provides a way to lock the mmap read lock, but a
follow-up patch also provides a way to just lock the vma read lock.

Acked-by: Lorenzo Stoakes (for mm bits)
Reviewed-by: Jann Horn
Signed-off-by: Alice Ryhl
---
 rust/helpers/mm.c      |   6 ++
 rust/kernel/mm.rs      |  21 ++++++
 rust/kernel/mm/virt.rs | 191 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 218 insertions(+)

diff --git a/rust/helpers/mm.c b/rust/helpers/mm.c
index 7201747a5d31..7b72eb065a3e 100644
--- a/rust/helpers/mm.c
+++ b/rust/helpers/mm.c
@@ -37,3 +37,9 @@ void rust_helper_mmap_read_unlock(struct mm_struct *mm)
 {
 	mmap_read_unlock(mm);
 }
+
+struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm,
+					      unsigned long addr)
+{
+	return vma_lookup(mm, addr);
+}
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
index 84cba581edaa..ace8e7d57afe 100644
--- a/rust/kernel/mm.rs
+++ b/rust/kernel/mm.rs
@@ -12,6 +12,8 @@
 };
 use core::{ops::Deref, ptr::NonNull};
 
+pub mod virt;
+
 /// A wrapper for the kernel's `struct mm_struct`.
 ///
 /// Since `mm_users` may be zero, the associated address space may not exist anymore. You can use
@@ -210,6 +212,25 @@ pub struct MmapReadGuard<'a> {
     _nts: NotThreadSafe,
 }
 
+impl<'a> MmapReadGuard<'a> {
+    /// Look up a vma at the given address.
+    #[inline]
+    pub fn vma_lookup(&self, vma_addr: usize) -> Option<&virt::VmAreaRef> {
+        // SAFETY: We hold a reference to the mm, so the pointer must be valid. Any value is okay
+        // for `vma_addr`.
+        let vma = unsafe { bindings::vma_lookup(self.mm.as_raw(), vma_addr as _) };
+
+        if vma.is_null() {
+            None
+        } else {
+            // SAFETY: We just checked that a vma was found, so the pointer is valid. Furthermore,
+            // the returned area will borrow from this read lock guard, so it can only be used
+            // while the mmap read lock is still held.
+            unsafe { Some(virt::VmAreaRef::from_raw(vma)) }
+        }
+    }
+}
+
 impl Drop for MmapReadGuard<'_> {
     #[inline]
     fn drop(&mut self) {
diff --git a/rust/kernel/mm/virt.rs b/rust/kernel/mm/virt.rs
new file mode 100644
index 000000000000..68c763169cf0
--- /dev/null
+++ b/rust/kernel/mm/virt.rs
@@ -0,0 +1,191 @@
+// SPDX-License-Identifier: GPL-2.0
+
+// Copyright (C) 2024 Google LLC.
+
+//! Virtual memory.
+
+use crate::{bindings, mm::MmWithUser, types::Opaque};
+
+/// A wrapper for the kernel's `struct vm_area_struct` with read access.
+///
+/// It represents an area of virtual memory.
+///
+/// # Invariants
+///
+/// The caller must hold the mmap read lock or the vma read lock.
+#[repr(transparent)]
+pub struct VmAreaRef {
+    vma: Opaque<bindings::vm_area_struct>,
+}
+
+// Methods you can call when holding the mmap or vma read lock (or stronger). They must be usable no
+// matter what the vma flags are.
+impl VmAreaRef {
+    /// Access a virtual memory area given a raw pointer.
+    ///
+    /// # Safety
+    ///
+    /// Callers must ensure that `vma` is valid for the duration of 'a, and that the mmap or vma
+    /// read lock (or stronger) is held for at least the duration of 'a.
+    #[inline]
+    pub unsafe fn from_raw<'a>(vma: *const bindings::vm_area_struct) -> &'a Self {
+        // SAFETY: The caller ensures that the invariants are satisfied for the duration of 'a.
+        unsafe { &*vma.cast() }
+    }
+
+    /// Returns a raw pointer to this area.
+    #[inline]
+    pub fn as_ptr(&self) -> *mut bindings::vm_area_struct {
+        self.vma.get()
+    }
+
+    /// Access the underlying `mm_struct`.
+    #[inline]
+    pub fn mm(&self) -> &MmWithUser {
+        // SAFETY: By the type invariants, this `vm_area_struct` is valid and we hold the mmap/vma
+        // read lock or stronger. This implies that the underlying mm has a non-zero value of
+        // `mm_users`.
+        unsafe { MmWithUser::from_raw((*self.as_ptr()).vm_mm) }
+    }
+
+    /// Returns the flags associated with the virtual memory area.
+    ///
+    /// The possible flags are a combination of the constants in [`flags`].
+    #[inline]
+    pub fn flags(&self) -> vm_flags_t {
+        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+        // access is not a data race.
+        unsafe { (*self.as_ptr()).__bindgen_anon_2.vm_flags as _ }
+    }
+
+    /// Returns the (inclusive) start address of the virtual memory area.
+    #[inline]
+    pub fn start(&self) -> usize {
+        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+        // access is not a data race.
+        unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_start as _ }
+    }
+
+    /// Returns the (exclusive) end address of the virtual memory area.
+    #[inline]
+    pub fn end(&self) -> usize {
+        // SAFETY: By the type invariants, the caller holds at least the mmap read lock, so this
+        // access is not a data race.
+        unsafe { (*self.as_ptr()).__bindgen_anon_1.__bindgen_anon_1.vm_end as _ }
+    }
+
+    /// Zap pages in the given page range.
+    ///
+    /// This clears page table mappings for the range at the leaf level, leaving all other page
+    /// tables intact, and freeing any memory referenced by the VMA in this range. That is,
+    /// anonymous memory is completely freed, file-backed memory has its reference count on page
+    /// cache folios dropped, any dirty data will still be written back to disk as usual.
+    #[inline]
+    pub fn zap_page_range_single(&self, address: usize, size: usize) {
+        let (end, did_overflow) = address.overflowing_add(size);
+        if did_overflow || address < self.start() || self.end() < end {
+            // TODO: call WARN_ONCE once Rust version of it is added
+            return;
+        }
+
+        // SAFETY: By the type invariants, the caller has read access to this VMA, which is
+        // sufficient for this method call. This method has no requirements on the vma flags. The
+        // address range is checked to be within the vma.
+        unsafe {
+            bindings::zap_page_range_single(
+                self.as_ptr(),
+                address as _,
+                size as _,
+                core::ptr::null_mut(),
+            )
+        };
+    }
+}
+
+/// The integer type used for vma flags.
+#[doc(inline)]
+pub use bindings::vm_flags_t;
+
+/// All possible flags for [`VmAreaRef`].
+pub mod flags {
+    use super::vm_flags_t;
+    use crate::bindings;
+
+    /// No flags are set.
+    pub const NONE: vm_flags_t = bindings::VM_NONE as _;
+
+    /// Mapping allows reads.
+    pub const READ: vm_flags_t = bindings::VM_READ as _;
+
+    /// Mapping allows writes.
+    pub const WRITE: vm_flags_t = bindings::VM_WRITE as _;
+
+    /// Mapping allows execution.
+    pub const EXEC: vm_flags_t = bindings::VM_EXEC as _;
+
+    /// Mapping is shared.
+    pub const SHARED: vm_flags_t = bindings::VM_SHARED as _;
+
+    /// Mapping may be updated to allow reads.
+    pub const MAYREAD: vm_flags_t = bindings::VM_MAYREAD as _;
+
+    /// Mapping may be updated to allow writes.
+    pub const MAYWRITE: vm_flags_t = bindings::VM_MAYWRITE as _;
+
+    /// Mapping may be updated to allow execution.
+    pub const MAYEXEC: vm_flags_t = bindings::VM_MAYEXEC as _;
+
+    /// Mapping may be updated to be shared.
+    pub const MAYSHARE: vm_flags_t = bindings::VM_MAYSHARE as _;
+
+    /// Page-ranges managed without `struct page`, just pure PFN.
+    pub const PFNMAP: vm_flags_t = bindings::VM_PFNMAP as _;
+
+    /// Memory mapped I/O or similar.
+    pub const IO: vm_flags_t = bindings::VM_IO as _;
+
+    /// Do not copy this vma on fork.
+    pub const DONTCOPY: vm_flags_t = bindings::VM_DONTCOPY as _;
+
+    /// Cannot expand with mremap().
+    pub const DONTEXPAND: vm_flags_t = bindings::VM_DONTEXPAND as _;
+
+    /// Lock the pages covered when they are faulted in.
+    pub const LOCKONFAULT: vm_flags_t = bindings::VM_LOCKONFAULT as _;
+
+    /// Is a VM accounted object.
+    pub const ACCOUNT: vm_flags_t = bindings::VM_ACCOUNT as _;
+
+    /// Should the VM suppress accounting.
+    pub const NORESERVE: vm_flags_t = bindings::VM_NORESERVE as _;
+
+    /// Huge TLB Page VM.
+    pub const HUGETLB: vm_flags_t = bindings::VM_HUGETLB as _;
+
+    /// Synchronous page faults. (DAX-specific)
+    pub const SYNC: vm_flags_t = bindings::VM_SYNC as _;
+
+    /// Architecture-specific flag.
+    pub const ARCH_1: vm_flags_t = bindings::VM_ARCH_1 as _;
+
+    /// Wipe VMA contents in child on fork.
+    pub const WIPEONFORK: vm_flags_t = bindings::VM_WIPEONFORK as _;
+
+    /// Do not include in the core dump.
+    pub const DONTDUMP: vm_flags_t = bindings::VM_DONTDUMP as _;
+
+    /// Not soft dirty clean area.
+    pub const SOFTDIRTY: vm_flags_t = bindings::VM_SOFTDIRTY as _;
+
+    /// Can contain `struct page` and pure PFN pages.
+    pub const MIXEDMAP: vm_flags_t = bindings::VM_MIXEDMAP as _;
+
+    /// MADV_HUGEPAGE marked this vma.
+    pub const HUGEPAGE: vm_flags_t = bindings::VM_HUGEPAGE as _;
+
+    /// MADV_NOHUGEPAGE marked this vma.
+    pub const NOHUGEPAGE: vm_flags_t = bindings::VM_NOHUGEPAGE as _;
+
+    /// KSM may merge identical pages.
+    pub const MERGEABLE: vm_flags_t = bindings::VM_MERGEABLE as _;
+}
-- 
2.47.1.613.gc27f4b7a9f-goog
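
Illustration only, not part of the patch: the bounds check that
zap_page_range_single() performs before calling into C, and a flag test
in the style of the flags module, can be tried out in userspace with
the sketch below. The Area type, contains_range() and is_writable() are
hypothetical stand-ins for VmAreaRef and its accessors, and the READ
and WRITE values are placeholders rather than the bindings constants.

```rust
/// Hypothetical userspace stand-in for `VmAreaRef`; it only carries the fields
/// needed to illustrate the checks and is not the kernel type.
pub struct Area {
    /// Inclusive start address, as returned by `start()` in the patch.
    pub start: usize,
    /// Exclusive end address, as returned by `end()` in the patch.
    pub end: usize,
    /// Flag bits, as returned by `flags()` in the patch.
    pub flags: u64,
}

/// Placeholder flag values for this sketch (assumption, not read from bindings).
pub const READ: u64 = 0x1;
pub const WRITE: u64 = 0x2;

impl Area {
    /// Mirrors the guard at the top of `zap_page_range_single`: the range
    /// `[address, address + size)` must not overflow and must lie inside
    /// `[start, end)`.
    pub fn contains_range(&self, address: usize, size: usize) -> bool {
        let (range_end, did_overflow) = address.overflowing_add(size);
        !(did_overflow || address < self.start || self.end < range_end)
    }

    /// Tests a single flag bit the way a caller might test `flags::WRITE`.
    pub fn is_writable(&self) -> bool {
        self.flags & WRITE != 0
    }
}
```

Because the end address is exclusive, a range that ends exactly at
`end` passes the check, and so does a zero-sized range at `end`. An
`address + size` that wraps around is rejected before any comparison
against the area bounds.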