* [PATCH v5 0/2] landlock: fix SCOPE_SIGNAL bypass on the SIGIO/fowner path
From: Bryam Vargas @ 2026-06-04 23:16 UTC (permalink / raw)
To: Mickaël Salaün, Günther Noack
Cc: Justin Suess, Christian Brauner, Paul Moore, James Morris,
Serge E . Hallyn, linux-security-module, stable, linux-kernel
This series fixes a LANDLOCK_SCOPE_SIGNAL bypass on the asynchronous SIGIO
(fcntl(F_SETOWN)) delivery path, and adds regression tests.
A sandboxed process that owns a file or socket can request a signal
(F_SETSIG, e.g. SIGKILL) to be delivered to a whole process group on I/O
readiness (F_SETOWN(-pgid) + O_ASYNC). When it is the head of its own
process group -- the default after fork() -- that group still contains the
non-sandboxed process that launched it (a supervisor, a security monitor),
so the sandbox can signal processes that SCOPE_SIGNAL is meant to protect
from it.
Patch 1 has two parts:
- Narrow the same-thread-group exemption in control_current_fowner() so a
process-group fowner always records the caller's Landlock domain; the
delivery-time check in hook_file_send_sigiotask() then runs against
every group member. This closes the bypass.
- Recording the domain alone over-blocks one corner: the kernel signals a
process group through its members' thread-group leaders, and the leader
of the registrant's own process can carry a different Landlock domain
than the sibling thread that armed F_SETOWN. domain_is_scoped() would
then deny that leader, even though commit 18eb75f3af40 requires
same-process delivery to be allowed. hook_task_kill() avoids this by
checking same_thread_group() live, per recipient; the SIGIO path
delegated the whole decision to a single registration-time check that a
fan-out cannot honor. So patch 1 also records the registrant's thread
group next to its domain and exempts it at delivery, restoring the
same-process guarantee while keeping out-of-domain group members
blocked.
The direct kill() path (hook_task_kill) is unaffected.
Patch 2 adds two regression tests in scoped_signal_test.c:
sigio_to_pgid_members (out-of-domain member must not be signaled) and
sigio_to_pgid_self (the registrant's own process, reached through its
thread-group leader, must still be signaled).
The defect was introduced by commit 18eb75f3af40 ("landlock: Always allow
signals between threads of the same process") in v6.15, and is present in the
stable branches that backported it (6.12.y, 6.13.y, 6.14.y).
control_current_fowner() is identical across those branches.
Verified on 7.1.0-rc5 + CONFIG_SECURITY_LANDLOCK=y (same .config, only the
landlock change differs across arms):
- unpatched: sigio_to_pgid_members fails (out-of-domain member signaled,
bypass), sigio_to_pgid_self passes;
- patch-1-record-only (the v4 hunk): sigio_to_pgid_members passes,
sigio_to_pgid_self fails (the registrant's own leader is over-blocked);
- this series: both pass, and the landlock signal-scoping suite is 21/21.
A standalone reproducer of both invariants was also built -m32 and -m64 and
run on each arm: the fix behaves identically through the i386-compat and the
x86-64 native syscall paths.
v4 -> v5 (review feedback from Günther Noack):
- patch 1: also fix the same-process over-block introduced by recording the
domain for a process-group fowner -- record the registrant's thread group
(struct pid) in landlock_file_security and exempt it in
hook_file_send_sigiotask() (task_tgid(tsk) == fown_tg), restoring the
18eb75f3af40 guarantee for the registrant's own process;
- patch 2: add sigio_to_pgid_self covering the non-leader-registrant /
pgid-includes-self case;
- drop Tested-by: Justin Suess -- patch 1 gained the delivery-time exemption
he did not test (re-test welcome);
- posted as a fresh top-level thread (no In-Reply-To to the v4 review).
v4: https://lore.kernel.org/all/20260602172741.18760-1-hexlabsecurity@proton.me/
(v1/v2 were sent to security@kernel.org while embargoed; not in a public
archive.)
Bryam Vargas (2):
landlock: fix LANDLOCK_SCOPE_SIGNAL bypass on the SIGIO path
selftests/landlock: test SCOPE_SIGNAL on the SIGIO/fowner pgid path
security/landlock/fs.c | 15 ++
security/landlock/fs.h | 10 +
security/landlock/task.c | 11 ++
.../selftests/landlock/scoped_signal_test.c | 183 ++++++++++++++++++
4 files changed, 219 insertions(+)
base-commit: 6f3ed7fec72fc8979b2a8c7219c0a9fcfc8d07b5
--
2.43.0
^ permalink raw reply
* Re: -next status as at v7.1-rc6
From: Paul Moore @ 2026-06-04 22:23 UTC (permalink / raw)
To: Linus Torvalds
Cc: linux-security-module, Mark Brown, Blaise Boscaccy,
Alexei Starovoitov, linux-next, linux-kernel
In-Reply-To: <CAHk-=wj_rtiufEEVudMCSbh86HYrnmEa9NhYfKeWHs1SOzJVnA@mail.gmail.com>
On Wed, Jun 3, 2026 at 8:32 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Wed, 3 Jun 2026 at 17:04, Paul Moore <paul@paul-moore.com> wrote:
> >
> > It's worth mentioning that resolving the merge issue was relatively
> > straightforward and we had a tested patch ready in a few hours
>
> This is not the reason I'm not going to pull it ...
I didn't assume the merge issue was the reason, but you felt it was
worth mentioning so I figured it deserved a response.
> ... the merge issue was
> just the reminder I got about an earlier email that I had dropped on
> the floor.
>
> No, the reason I won't pull it is that the main developer I pull bpf
> code NAK'ed it.
I would like to clarify some things, not necessarily for Hornet's
sake, but because they have implications moving forward. As much as I
*love* our little chats, I think we both would prefer not to repeat
this discussion in the future.
While you didn't reply to any of my comments explaining how Hornet
works, specifically how it ties into the kernel, I'm assuming you've
read the overview. Can you help those of us in the LSM space
understand why a BPF dev's NACK on code that lives strictly under
security/ is sufficient grounds to reject an LSM patch?
Disagreement about what access controls an LSM is allowed to enforce?
Disagreement about using LSMs for additional signature verification,
despite compatibility (although one could argue we already do this
with IMA)?
Use of the bpf_map_ops::map_get_hash() method?
Other things ... ? I'm genuinely not trying to be antagonistic; I'm
trying to understand your thinking in rejecting Hornet and take some
lessons away from it. I'm currently stuck with (pardon the
paraphrase) "because Alexei said-so", and while I'm certain Alexei
would be happy to have that codified, I'd like to believe there is
more to your rejection than that.
> My tree is *not* some kind of "we are bypassing developers by sending
> a pull request directly to Linus" tree.
I've been sending you pull requests for a while now, and as you know
I'm always very upfront in the email if any of the commits contain a
NACK or even the absence of a cross-subsystem ACK. My intention is
never to deceive anyone, my goal is always to make you aware of the
situation so that you can decide what to merge into your tree.
That was why I decided to merge Hornet into the LSM dev/next branch in
preparation for the upcoming merge window. You had been CC'd on
several threads regarding the BPF signature verification work and
hadn't provided guidance on the cross-subsystem issues. Even an
off-list email I sent you received what could best be described as a
shrug. My plan was to present Hornet to you with an explanation,
likely similar to my last reply in this thread, detailing that we had
worked with the BPF devs for over a year and their solution actively
ignored our requirements (despite their claims otherwise). Hornet,
while NACK'd by Alexei, did not touch any code under kernel/bpf/, was
compatible with the existing BPF verification code, and solved real
user problems.
While I was not looking forward to the discussion, I felt responsible
for trying one last time to bring this to your attention and make a
case for the signature verification requirements the BPF community has
chosen to reject. It appears a private, off-list email from the BPF
devs preempted this, which is unfortunate regarding both visibility
and timing, but at least we are still having a discussion of sorts.
> And honestly, I also have two+ decades of history of "LSM people
> cannot agree on a single thing".
You've repeated this enough that I worry your opinion has ossified,
but I can promise you the reality is that the LSM maintainer community
is a reasonably tight knit group and has been for quite some time.
I'm laughing a bit as I write this (I know ...), but I would encourage
you to drop by the Linux Security Summit in Prague this fall; I
believe you'll already be in town for the Maintainer Summit and it
might be an opportunity to see some of us and realize we agree far
more than you envision. You might particularly enjoy our "working
with Linus" support group that meets in the evening at a local bar ;)
--
paul-moore.com
^ permalink raw reply
* Re: [PATCH v4 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass on the SIGIO path
From: Günther Noack @ 2026-06-04 20:47 UTC (permalink / raw)
To: Bryam Vargas
Cc: Mickaël Salaün, Günther Noack, Justin Suess,
Christian Brauner, Paul Moore, James Morris, Serge E . Hallyn,
linux-security-module, stable, linux-kernel
In-Reply-To: <20260604102707.133997-1-hexlabsecurity@proton.me>
Hello Bryam,
Just a brief mail to confirm the approach; this makes sense to me.
On Thu, Jun 04, 2026 at 10:27:13AM +0000, Bryam Vargas wrote:
> > I believe the result after this patch is:
> > - No threads receive the SIGIO at all.
> >
> > This is because we have been setting T2.2's Landlock domain as the
> > "sending domain" for the hook_file_sigiotask(), and that hook does on
> > its own not do the "same_thread_group()" check [...]
>
> Confirmed -- I traced the delivery path and your analysis holds.
>
> For a PGID owner the signal is anchored per process on its thread-group
> leader: a task is attached to pid->tasks[PIDTYPE_PGID] only in the
> thread_group_leader() branch of copy_process(), so send_sigio()'s
> do_each_pid_task(pid, PIDTYPE_PGID, p) walk visits exactly T2.1 for P2,
> never the non-leader T2.2. hook_file_send_sigiotask() then runs
> domain_is_scoped(recorded T2.2 domain, T2.1's live domain, SIGNAL) and,
> having no same_thread_group() exemption of its own (unlike
> hook_task_kill()), denies it -- even though T2.1 and T2.2 share P2's
> signal_struct and 18eb75f3af40 mandates that same-process delivery always
> be allowed. T2.1 is P2's only entry on the PGID list, so P2 receives
> nothing. You are right.
>
> One thing worth putting on the record: this over-block is not introduced
> by the patch. In unpatched control_current_fowner() the PGID case already
> resolves through pid_task(fown->pid, PIDTYPE_PGID), which returns an
> arbitrary hlist head -- one representative leader. Whenever that head is
> outside the caller's thread group, the domain is already recorded today and
> the same delivery-time denial of the registrant's own leader already fires.
> The patch only makes domain recording for PGID unconditional, i.e. it turns
> that order-dependent behaviour into a deterministic one while closing the
> order-dependent bypass. So the corner you describe is a pre-existing gap in
> the delivery hook, not a regression in v4.
>
> That points at the real root cause: same_thread_group is a *per-recipient*
> property, but control_current_fowner() approximates it once, at F_SETOWN
> time, against a single pid_task() representative. hook_task_kill() gets
> this right because it evaluates same_thread_group(p, current) live, per
> actual recipient. hook_file_send_sigiotask() is the SIGIO analogue but
> delegates the whole thread-group decision to that one registration-time
> check, which a PGID delivery set simply cannot be captured by.
>
> So the fully-correct fix is to move the same-process exemption to delivery
> time, keyed to the *registrant* rather than to current (at SIGIO time
> current is the fd writer, not the task that armed F_SETOWN). Concretely:
> when hook_file_set_fowner() records the domain, also pin
> get_pid(task_tgid(current)) in struct landlock_file_security; in
> hook_file_send_sigiotask(), before domain_is_scoped(), return 0 when
> task_tgid(tsk) == that recorded pid. PGID owners still record the domain
> (so P1 stays blocked -- the bypass fix), but the registrant's own process,
> including T2.1, is always allowed -- restoring 18eb75f3af40 exactly. The
> new pid is taken/put in lockstep with fown_subject.domain under the same
> file->f_owner->lock and freed in hook_file_free_security(); the equality
> test follows neither pid, so there is no extra RCU surface. Sketch:
>
> /* struct landlock_file_security */
> struct pid *fown_tg; /* registrant's thread group; NULL if no domain */
>
> /* hook_file_set_fowner(), where fown_subject is recorded */
> fown_tg = get_pid(task_tgid(current));
> ...
> put_pid(landlock_file(file)->fown_tg); /* release previous */
> landlock_file(file)->fown_tg = fown_tg;
>
> /* hook_file_free_security() */
> put_pid(landlock_file(file)->fown_tg);
>
> /* hook_file_send_sigiotask(), after the !subject->domain quick return */
> if (task_tgid(tsk) == landlock_file(fown->file)->fown_tg)
> return 0; /* same process as the registrant: always allowed */
n>
> I do not see a correct fix that avoids recording the registrant's identity:
> the registrant task is deliberately discarded after set_fowner (only its
> domain is kept), and exempting on a shared *domain* instead would be
> insecure -- sibling threads can hold different domains, and a different
> process could share one.
Yes, your approach checks out for me; I also think that storing this
additional information is the best approach; we need to know during
hook_file_send_sigiotask() what the TGID of the registering task was,
in order to tell apart signals within the same process from signals
going outwards of that process.
> > To be clear, the patch is still obviously an improvement [...] it just
> > seems to block it slightly too broadly in this corner scenario?
> > [...] Mickaël, maybe you have some thoughts on the tradeoff?
>
> Agreed on both counts. Mickaël -- two ways to land this:
>
> (a) keep v4 as is. It closes the bypass; the residual same-process
> over-block is pre-existing, deterministic only under the stacked
> conditions Günther listed (already-multithreaded enforce, no TSYNC,
> SIGIO to a PGID that includes self, registered from a non-leader
> thread in a per-thread signal-scoped domain), and arguably tolerable.
>
> (b) v5 = v4 + the delivery-time exemption above. Strictly more correct:
> it also closes the pre-existing delivery-hook gap and restores
> 18eb75f3af40's same-process invariant, at the cost of one struct pid*
> in landlock_file_security.
>
> I lean (b) -- it fixes the actual root cause rather than the one reachable
> instance -- and I am happy to spin it (with an added selftest covering the
> PGID-includes-self / non-leader-registrant case, A/B verified) or to hold at
> v4 if you would rather keep the change minimal. Your call on whether the
> corner warrants the extra state.
+1, I also think that the approach is quite clean. Some checks would
happen at a later time, but it seems unavoidable in the generic case.
Checking TGID during hook_file_send_sigiotask() sounds reasonably
cheap. (I suspect that trying to do that check early during
hook_file_set_fowner() would not save us much.)
> > P.S: [...] new patchset versions are posted at the top (no Reply-To
> > header in the cover letter) [...]
>
> Will do -- v5 (whichever option) goes out as a fresh top-level thread, no
> In-Reply-To/Reply-To pointing back at this review.
Awesome, thank you very much for looking into patching this! :)
–Günther
^ permalink raw reply
* [PATCH v17 10/10] rust: page: add `from_raw()`
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core, Andreas Hindborg
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
From: Andreas Hindborg <a.hindborg@samsung.com>
Add a method to `Page` that allows construction of an instance from `struct
page` pointer.
Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com>
---
rust/kernel/page.rs | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
index 844c75e54134..d56ae597f692 100644
--- a/rust/kernel/page.rs
+++ b/rust/kernel/page.rs
@@ -214,6 +214,18 @@ pub fn nid(&self) -> i32 {
unsafe { bindings::page_to_nid(self.as_ptr()) }
}
+ /// Create a `&Page` from a raw `struct page` pointer.
+ ///
+ /// # Safety
+ ///
+ /// `ptr` must be convertible to a shared reference with a lifetime of `'a`.
+ #[inline]
+ pub unsafe fn from_raw<'a>(ptr: *const bindings::page) -> &'a Self {
+ // SAFETY: By function safety requirements, `ptr` is not null and is convertible to a shared
+ // reference.
+ unsafe { &*ptr.cast() }
+ }
+
/// Runs a piece of code with this page mapped to an address.
///
/// The page is unmapped when this call returns.
--
2.51.2
^ permalink raw reply related
* [PATCH v17 03/10] rust: implement `ForeignOwnable` for `Owned`
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
Implement `ForeignOwnable` for `Owned<T>`. This allows use of `Owned<T>` in
places such as the `XArray`.
Note that `T` does not need to implement `ForeignOwnable` for `Owned<T>` to
implement `ForeignOwnable`.
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
rust/kernel/owned.rs | 46 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 46 insertions(+)
diff --git a/rust/kernel/owned.rs b/rust/kernel/owned.rs
index 456e239e906e..5eacdf327d12 100644
--- a/rust/kernel/owned.rs
+++ b/rust/kernel/owned.rs
@@ -15,6 +15,8 @@
ptr::NonNull, //
};
+use kernel::types::ForeignOwnable;
+
/// Types that specify their own way of performing allocation and destruction. Typically, this trait
/// is implemented on types from the C side.
///
@@ -108,6 +110,7 @@ pub trait Ownable {
///
/// - Until `T::release` is called, this `Owned<T>` exclusively owns the underlying `T`.
/// - The `T` value is pinned.
+#[repr(transparent)]
pub struct Owned<T: Ownable> {
ptr: NonNull<T>,
}
@@ -185,3 +188,46 @@ fn drop(&mut self) {
unsafe { T::release(self.ptr.as_mut()) };
}
}
+
+// SAFETY: We derive the pointer to `T` from a valid `T`, so the returned
+// pointer satisfy alignment requirements of `T`.
+unsafe impl<T: Ownable + 'static> ForeignOwnable for Owned<T> {
+ const FOREIGN_ALIGN: usize = core::mem::align_of::<Owned<T>>();
+
+ type Borrowed<'a> = &'a T;
+ type BorrowedMut<'a> = Pin<&'a mut T>;
+
+ #[inline]
+ fn into_foreign(self) -> *mut kernel::ffi::c_void {
+ let ptr = self.ptr.as_ptr().cast();
+ core::mem::forget(self);
+ ptr
+ }
+
+ #[inline]
+ unsafe fn from_foreign(ptr: *mut kernel::ffi::c_void) -> Self {
+ Self {
+ // SAFETY: By function safety contract, `ptr` came from
+ // `into_foreign` and cannot be null.
+ ptr: unsafe { NonNull::new_unchecked(ptr.cast()) },
+ }
+ }
+
+ #[inline]
+ unsafe fn borrow<'a>(ptr: *mut kernel::ffi::c_void) -> Self::Borrowed<'a> {
+ // SAFETY: By function safety requirements, `ptr` is valid for use as a
+ // reference for `'a`.
+ unsafe { &*ptr.cast() }
+ }
+
+ #[inline]
+ unsafe fn borrow_mut<'a>(ptr: *mut kernel::ffi::c_void) -> Self::BorrowedMut<'a> {
+ // SAFETY: By function safety requirements, `ptr` is valid for use as a
+ // unique reference for `'a`.
+ let inner = unsafe { &mut *ptr.cast() };
+
+ // SAFETY: We never move out of inner, and we do not hand out mutable
+ // references when `T: !Unpin`.
+ unsafe { Pin::new_unchecked(inner) }
+ }
+}
--
2.51.2
^ permalink raw reply related
* [PATCH v17 01/10] rust: alloc: add `KBox::into_non_null`
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
Add a method to consume a `Box<T, A>` and return a `NonNull<T>`. This
is a convenience wrapper around `Self::into_raw` for callers that need
a `NonNull` pointer rather than a raw pointer.
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
rust/kernel/alloc/kbox.rs | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/rust/kernel/alloc/kbox.rs b/rust/kernel/alloc/kbox.rs
index bd6da02c7ab8..6fab67704294 100644
--- a/rust/kernel/alloc/kbox.rs
+++ b/rust/kernel/alloc/kbox.rs
@@ -188,6 +188,15 @@ pub fn leak<'a>(b: Self) -> &'a mut T {
// which points to an initialized instance of `T`.
unsafe { &mut *Box::into_raw(b) }
}
+
+ /// Consumes the `Box<T,A>` and returns a `NonNull<T>`.
+ ///
+ /// Like [`Self::into_raw`], but returns a `NonNull`.
+ #[inline]
+ pub fn into_non_null(b: Self) -> NonNull<T> {
+ // SAFETY: `KBox::into_raw` returns a valid pointer.
+ unsafe { NonNull::new_unchecked(Self::into_raw(b)) }
+ }
}
impl<T, A> Box<MaybeUninit<T>, A>
--
2.51.2
^ permalink raw reply related
* [PATCH v17 09/10] rust: Add `OwnableRefCounted`
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core, Oliver Mangold
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
From: Oliver Mangold <oliver.mangold@pm.me>
Types implementing one of these traits can safely convert between an
`ARef<T>` and an `Owned<T>`.
This is useful for types which generally are accessed through an `ARef`
but have methods which can only safely be called when the reference is
unique, like e.g. `block::mq::Request::end_ok()`.
Signed-off-by: Oliver Mangold <oliver.mangold@pm.me>
[ Andreas: Fix formatting, update documentation, fix error handling in
examples. ]
Co-developed-by: Andreas Hindborg <a.hindborg@kernel.org>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
rust/kernel/owned.rs | 138 +++++++++++++++++++++++++++++++++++++++++++++--
rust/kernel/sync/aref.rs | 16 +++++-
rust/kernel/types.rs | 1 +
3 files changed, 149 insertions(+), 6 deletions(-)
diff --git a/rust/kernel/owned.rs b/rust/kernel/owned.rs
index bedd4fef84fa..9db0daab2225 100644
--- a/rust/kernel/owned.rs
+++ b/rust/kernel/owned.rs
@@ -14,20 +14,26 @@
pin::Pin,
ptr::NonNull, //
};
+use kernel::{
+ sync::aref::ARef,
+ types::RefCounted, //
+};
use kernel::types::ForeignOwnable;
/// Types that specify their own way of performing allocation and destruction. Typically, this trait
/// is implemented on types from the C side.
///
-/// Implementing this trait allows types to be referenced via the [`Owned<Self>`] pointer type. This
-/// is useful when it is desirable to tie the lifetime of the reference to an owned object, rather
-/// than pass around a bare reference. [`Ownable`] types can define custom drop logic that is
-/// executed when the owned reference [`Owned<Self>`] pointing to the object is dropped.
+/// Implementing this trait allows types to be referenced via the [`Owned<Self>`] pointer type.
+/// - This is useful when it is desirable to tie the lifetime of an object reference to an owned
+/// object, rather than pass around a bare reference.
+/// - [`Ownable`] types can define custom drop logic that is executed when the owned reference
+/// of type [`Owned<_>`] pointing to the object is dropped.
///
/// Note: The underlying object is not required to provide internal reference counting, because it
/// represents a unique, owned reference. If reference counting (on the Rust side) is required,
-/// [`RefCounted`](crate::types::RefCounted) should be implemented.
+/// [`RefCounted`] should be implemented. [`OwnableRefCounted`] should be implemented if conversion
+/// between unique and shared (reference counted) ownership is needed.
///
/// # Examples
///
@@ -231,3 +237,125 @@ unsafe fn borrow_mut<'a>(ptr: *mut kernel::ffi::c_void) -> Self::BorrowedMut<'a>
unsafe { Pin::new_unchecked(inner) }
}
}
+
+/// A trait for objects that can be wrapped in either one of the reference types [`Owned`] and
+/// [`ARef`].
+///
+/// # Examples
+///
+/// A minimal example implementation of [`OwnableRefCounted`], [`Ownable`] and its usage with
+/// [`ARef`] and [`Owned`] looks like this:
+///
+/// ```
+/// # #![expect(clippy::disallowed_names)]
+/// # use core::cell::Cell;
+/// # use core::ptr::NonNull;
+/// # use kernel::alloc::{flags, kbox::KBox, AllocError};
+/// # use kernel::sync::aref::{ARef, RefCounted};
+/// # use kernel::types::{Owned, Ownable, OwnableRefCounted};
+///
+/// // An internally refcounted struct for demonstration purposes.
+/// //
+/// // # Invariants
+/// //
+/// // - `refcount` is always non-zero for a valid object.
+/// // - `refcount` is >1 if there is more than one Rust reference to it.
+/// //
+/// struct Foo {
+/// refcount: Cell<usize>,
+/// }
+///
+/// impl Foo {
+/// fn new() -> Result<Owned<Self>> {
+/// // We are just using a `KBox` here to handle the actual allocation, as our `Foo` is
+/// // not actually a C-allocated object.
+/// let result = KBox::new(
+/// Foo {
+/// refcount: Cell::new(1),
+/// },
+/// flags::GFP_KERNEL,
+/// )?;
+/// let result = KBox::into_non_null(result);
+/// // SAFETY:
+/// // - We just allocated the `Self`, thus it is valid and we own it.
+/// // - We can transfer this ownership to the `from_raw` method.
+/// Ok(unsafe { Owned::from_raw(result) })
+/// }
+/// }
+///
+/// // SAFETY: We increment and decrement each time the respective function is called and only free
+/// // the `Foo` when the refcount reaches zero.
+/// unsafe impl RefCounted for Foo {
+/// fn inc_ref(&self) {
+/// self.refcount.replace(self.refcount.get() + 1);
+/// }
+///
+/// unsafe fn dec_ref(this: NonNull<Self>) {
+/// // SAFETY: By requirement on calling this function, the refcount is non-zero,
+/// // implying the underlying object is valid.
+/// let refcount = unsafe { &this.as_ref().refcount };
+/// let new_refcount = refcount.get() - 1;
+/// if new_refcount == 0 {
+/// // The `Foo` will be dropped when `KBox` goes out of scope.
+/// // SAFETY: The [`KBox<Foo>`] is still alive as the old refcount is 1. We can pass
+/// // ownership to the [`KBox`] as by requirement on calling this function,
+/// // the `Self` will no longer be used by the caller.
+/// unsafe { KBox::from_raw(this.as_ptr()) };
+/// } else {
+/// refcount.replace(new_refcount);
+/// }
+/// }
+/// }
+///
+/// impl OwnableRefCounted for Foo {
+/// fn try_from_shared(this: ARef<Self>) -> Result<Owned<Self>, ARef<Self>> {
+/// if this.refcount.get() == 1 {
+/// // SAFETY: The `Foo` is still alive and has no other Rust references as the refcount
+/// // is 1.
+/// Ok(unsafe { Owned::from_raw(ARef::into_raw(this)) })
+/// } else {
+/// Err(this)
+/// }
+/// }
+/// }
+///
+/// impl Ownable for Foo {
+/// unsafe fn release(&mut self) {
+/// // SAFETY: Using `dec_ref()` from [`RefCounted`] to release is okay, as the refcount is
+/// // always 1 for an [`Owned<Foo>`].
+/// unsafe{ Foo::dec_ref(NonNull::new_unchecked(self)) };
+/// }
+/// }
+///
+/// let foo = Foo::new()?;
+/// let mut foo = ARef::from(foo);
+/// {
+/// let bar = foo.clone();
+/// assert!(Owned::try_from(bar).is_err());
+/// }
+/// assert!(Owned::try_from(foo).is_ok());
+/// # Ok::<(), Error>(())
+/// ```
+pub trait OwnableRefCounted: RefCounted + Ownable + Sized {
+ /// Checks if the [`ARef`] is unique and converts it to an [`Owned`] if that is the case.
+ /// Otherwise it returns again an [`ARef`] to the same underlying object.
+ fn try_from_shared(this: ARef<Self>) -> Result<Owned<Self>, ARef<Self>>;
+
+ /// Converts the [`Owned`] into an [`ARef`].
+ #[inline]
+ fn into_shared(this: Owned<Self>) -> ARef<Self> {
+ // SAFETY: Safe by the requirements on implementing the trait.
+ unsafe { ARef::from_raw(Owned::into_raw(this)) }
+ }
+}
+
+impl<T: OwnableRefCounted> TryFrom<ARef<T>> for Owned<T> {
+ type Error = ARef<T>;
+ /// Tries to convert the [`ARef`] to an [`Owned`] by calling
+ /// [`try_from_shared()`](OwnableRefCounted::try_from_shared). In case the [`ARef`] is not
+ /// unique, it returns again an [`ARef`] to the same underlying object.
+ #[inline]
+ fn try_from(b: ARef<T>) -> Result<Owned<T>, Self::Error> {
+ T::try_from_shared(b)
+ }
+}
diff --git a/rust/kernel/sync/aref.rs b/rust/kernel/sync/aref.rs
index 818c84fa923a..68d31f43e674 100644
--- a/rust/kernel/sync/aref.rs
+++ b/rust/kernel/sync/aref.rs
@@ -23,6 +23,10 @@
ops::Deref,
ptr::NonNull, //
};
+use kernel::types::{
+ OwnableRefCounted,
+ Owned, //
+};
/// Types that are internally reference counted.
///
@@ -35,7 +39,10 @@
/// Note: Implementing this trait allows types to be wrapped in an [`ARef<Self>`]. It requires an
/// internal reference count and provides only shared references. If unique references are required
/// [`Ownable`](crate::types::Ownable) should be implemented which allows types to be wrapped in an
-/// [`Owned<Self>`](crate::types::Owned).
+/// [`Owned<Self>`](crate::types::Owned). Implementing the trait
+/// [`OwnableRefCounted`] allows to convert between unique and
+/// shared references (i.e. [`Owned<Self>`](crate::types::Owned) and
+/// [`ARef<Self>`](crate::types::Owned)).
///
/// # Safety
///
@@ -188,6 +195,13 @@ fn from(b: &T) -> Self {
}
}
+impl<T: OwnableRefCounted> From<Owned<T>> for ARef<T> {
+ #[inline]
+ fn from(b: Owned<T>) -> Self {
+ T::into_shared(b)
+ }
+}
+
impl<T: RefCounted> Drop for ARef<T> {
fn drop(&mut self) {
// SAFETY: The type invariants guarantee that the `ARef` owns the reference we're about to
diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
index 9b96aa2ebdb7..f43c091eeb8b 100644
--- a/rust/kernel/types.rs
+++ b/rust/kernel/types.rs
@@ -14,6 +14,7 @@
pub use crate::{
owned::{
Ownable,
+ OwnableRefCounted,
Owned, //
},
sync::aref::{
--
2.51.2
^ permalink raw reply related
* [PATCH v17 04/10] rust: page: update formatting of `use` statements
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
Update formatting in preparation for next patch
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
rust/kernel/page.rs | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
index adecb200c654..3bdcee0e16a8 100644
--- a/rust/kernel/page.rs
+++ b/rust/kernel/page.rs
@@ -3,17 +3,23 @@
//! Kernel page allocation and management.
use crate::{
- alloc::{AllocError, Flags},
+ alloc::{
+ AllocError,
+ Flags, //
+ },
bindings,
error::code::*,
error::Result,
- uaccess::UserSliceReader,
+ uaccess::UserSliceReader, //
};
use core::{
marker::PhantomData,
mem::ManuallyDrop,
ops::Deref,
- ptr::{self, NonNull},
+ ptr::{
+ self,
+ NonNull, //
+ }, //
};
/// A bitwise shift for the page size.
--
2.51.2
^ permalink raw reply related
* [PATCH v17 05/10] rust: page: convert to `Ownable`
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core, Asahi Lina
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
From: Asahi Lina <lina@asahilina.net>
This allows Page references to be returned as borrowed references,
without necessarily owning the struct page.
Signed-off-by: Asahi Lina <lina@asahilina.net>
[ Andreas: Fix formatting and add a safety comment. ]
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
rust/kernel/page.rs | 38 +++++++++++++++++++++++++-------------
1 file changed, 25 insertions(+), 13 deletions(-)
diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
index 3bdcee0e16a8..844c75e54134 100644
--- a/rust/kernel/page.rs
+++ b/rust/kernel/page.rs
@@ -10,6 +10,11 @@
bindings,
error::code::*,
error::Result,
+ types::{
+ Opaque,
+ Ownable,
+ Owned, //
+ },
uaccess::UserSliceReader, //
};
use core::{
@@ -105,7 +110,7 @@ pub const fn page_align(addr: usize) -> Option<usize> {
///
/// [`VBox`]: kernel::alloc::VBox
/// [`Vmalloc`]: kernel::alloc::allocator::Vmalloc
-pub struct BorrowedPage<'a>(ManuallyDrop<Page>, PhantomData<&'a Page>);
+pub struct BorrowedPage<'a>(ManuallyDrop<NonNull<Page>>, PhantomData<&'a Page>);
impl<'a> BorrowedPage<'a> {
/// Constructs a [`BorrowedPage`] from a raw pointer to a `struct page`.
@@ -115,7 +120,9 @@ impl<'a> BorrowedPage<'a> {
/// - `ptr` must point to a valid `bindings::page`.
/// - `ptr` must remain valid for the entire lifetime `'a`.
pub unsafe fn from_raw(ptr: NonNull<bindings::page>) -> Self {
- let page = Page { page: ptr };
+ let page: NonNull<Page> =
+ // SAFETY: By function safety requirements `ptr` is non null.
+ unsafe { NonNull::new_unchecked(ptr.as_ptr().cast()) };
// INVARIANT: The safety requirements guarantee that `ptr` is valid for the entire lifetime
// `'a`.
@@ -127,7 +134,8 @@ impl<'a> Deref for BorrowedPage<'a> {
type Target = Page;
fn deref(&self) -> &Self::Target {
- &self.0
+ // SAFETY: By type invariant `self.0` is convertible to a reference for `'a`.
+ unsafe { self.0.as_ref() }
}
}
@@ -148,8 +156,9 @@ pub trait AsPageIter {
/// # Invariants
///
/// The pointer is valid, and has ownership over the page.
+#[repr(transparent)]
pub struct Page {
- page: NonNull<bindings::page>,
+ page: Opaque<bindings::page>,
}
// SAFETY: Pages have no logic that relies on them staying on a given thread, so moving them across
@@ -183,19 +192,20 @@ impl Page {
/// # Ok::<(), kernel::alloc::AllocError>(())
/// ```
#[inline]
- pub fn alloc_page(flags: Flags) -> Result<Self, AllocError> {
+ pub fn alloc_page(flags: Flags) -> Result<Owned<Self>, AllocError> {
// SAFETY: Depending on the value of `gfp_flags`, this call may sleep. Other than that, it
// is always safe to call this method.
let page = unsafe { bindings::alloc_pages(flags.as_raw(), 0) };
let page = NonNull::new(page).ok_or(AllocError)?;
- // INVARIANT: We just successfully allocated a page, so we now have ownership of the newly
- // allocated page. We transfer that ownership to the new `Page` object.
- Ok(Self { page })
+ // SAFETY: We just successfully allocated a page, so we now have ownership of the newly
+ // allocated page. We transfer that ownership to the new `Owned<Page>` object.
+ // Since `Page` is transparent, we can cast the pointer directly.
+ Ok(unsafe { Owned::from_raw(page.cast()) })
}
/// Returns a raw pointer to the page.
pub fn as_ptr(&self) -> *mut bindings::page {
- self.page.as_ptr()
+ Opaque::cast_into(&self.page)
}
/// Get the node id containing this page.
@@ -370,10 +380,12 @@ pub unsafe fn copy_from_user_slice_raw(
}
}
-impl Drop for Page {
+impl Ownable for Page {
#[inline]
- fn drop(&mut self) {
- // SAFETY: By the type invariants, we have ownership of the page and can free it.
- unsafe { bindings::__free_pages(self.page.as_ptr(), 0) };
+ unsafe fn release(&mut self) {
+ let ptr: *mut Self = self;
+ // SAFETY: By the function safety requirements, we have ownership of the page and can free
+ // it. Since Page is transparent, we can cast the raw pointer directly.
+ unsafe { bindings::__free_pages(ptr.cast(), 0) };
}
}
--
2.51.2
^ permalink raw reply related
* [PATCH v17 00/10] rust: add `Ownable` trait and `Owned` type
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core, Asahi Lina, Oliver Mangold,
Viresh Kumar, Asahi Lina, Andreas Hindborg
Add a new trait `Ownable` and type `Owned` for types that specify their
own way of performing allocation and destruction. This is useful for
types from the C side.
Implement `ForeignOwnable` for `Owned`.
Convert `Page` to be `Ownable` and add a `from_raw` method.
Add the trait `OwnableRefCounted` that allows conversion between
`ARef` and `Owned`. This is analogous to conversion between `Arc` and
`UniqueArc`.
Patches 1-5 implement `Ownable` and applies it to `Page`. These patches
can be merged on their own.
Patches 6-9 add `Ownable` -> `ARef` interop and can be merged later if
consensus on their shape cannot be reached.
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
Changes in v17:
- Rebase on v7.1-rc2.
- Reorder patches so that `Ownable` can merge without `OwnableRefCounted` (Alice).
- Add `#[inline]` directives to short functions added by the series (Gary).
- Link to v16: https://msgid.link/20260224-unique-ref-v16-0-c21afcb118d3@kernel.org
Changes in v16:
- Simplify pointer to reference cast in `Page::from_raw`.
- Use `NonNull<Page>` rather than `Owned<Page>` for `BorrowedPage` internals.
- Use "convertible to reference" wording when converting pointers to references.
- Fix formatting for `Page::from_raw` docs.
- Leave imports alone when adding safety comment to aref example.
- Use `KBox::into_nonnull` for examples.
- Add patch for `KBox::into_nonnull`.
- Change invariants and safety comments of `Ownable` and make the trait safe.
- Make `Ownable::release` take a mutable reference.
- Fix error handling in example for `Ownable`
- Link to v15: https://msgid.link/20260220-unique-ref-v15-0-893ed86b06cc@kernel.org
Changes in v15:
- Update series with original SoB's.
- Rename `AlwaysRefCounted` in `kernel::usb`.
- Rename `Owned::get_pin_mut` to `Owned::as_pin_mut`.
- Link to v14: https://msgid.link/20260204-unique-ref-v14-0-17cb29ebacbb@kernel.org
Changes in v14:
- Rebase on v6.19-rc7.
- Rewrite cover letter.
- Update documentation and safety comments based on v13 feedback.
- Update commit messages.
- Reorder implementation blocks in owned.rs.
- Update example in owned.rs to use try operator rather than `expect`.
- Reformat use statements.
- Add patch: rust: page: convert to `Ownable`.
- Add patch: rust: implement `ForeignOwnable` for `Owned`.
- Add patch: rust: page: add `from_raw()`.
- Link to v13: https://lore.kernel.org/r/20251117-unique-ref-v13-0-b5b243df1250@pm.me
Changes in v13:
- Rebase onto v6.18-rc1 (Andreas's work).
- Documentation and style fixes contributed by Andreas
- Link to v12: https://lore.kernel.org/r/20251001-unique-ref-v12-0-fa5c31f0c0c4@pm.me
Changes in v12:
-
- Rebase onto v6.17-rc1 (Andreas's work).
- moved kernel/types/ownable.rs to kernel/owned.rs
- Drop OwnableMut, make DerefMut depend on Unpin instead. I understood
ML discussion as that being okay, but probably needs further scrunity.
- Lots of more documentation changes suggested by reviewers.
- Usage example for Ownable/Owned.
- Link to v11: https://lore.kernel.org/r/20250618-unique-ref-v11-0-49eadcdc0aa6@pm.me
Changes in v11:
- Rework of documentation. I tried to honor all requests for changes "in
spirit" plus some clearifications and corrections of my own.
- Dropping `SimpleOwnedRefCounted` by request from Alice, as it creates a
potentially problematic blanket implementation (which a derive macro that
could be created later would not have).
- Dropping Miguel's "kbuild: provide `RUSTC_HAS_DO_NOT_RECOMMEND` symbol"
patch, as it is not needed anymore after dropping `SimpleOwnedRefCounted`.
(I can add it again, if it is considered useful anyway).
- Link to v10: https://lore.kernel.org/r/20250502-unique-ref-v10-0-25de64c0307f@pm.me
Changes in v10:
- Moved kernel/ownable.rs to kernel/types/ownable.rs
- Fixes in documentation / comments as suggested by Andreas Hindborg
- Added Reviewed-by comment for Andreas Hindborg
- Fix rustfmt of pid_namespace.rs
- Link to v9: https://lore.kernel.org/r/20250325-unique-ref-v9-0-e91618c1de26@pm.me
Changes in v9:
- Rebase onto v6.14-rc7
- Move Ownable/OwnedRefCounted/Ownable, etc., into separate module
- Documentation fixes to Ownable/OwnableMut/OwnableRefCounted
- Add missing SAFETY documentation to ARef example
- Link to v8: https://lore.kernel.org/r/20250313-unique-ref-v8-0-3082ffc67a31@pm.me
Changes in v8:
- Fix Co-developed-by and Suggested-by tags as suggested by Miguel and Boqun
- Some small documentation fixes in Owned/Ownable patch
- removing redundant trait constraint on DerefMut for Owned as suggested by Boqun Feng
- make SimpleOwnedRefCounted no longer implement RefCounted as suggested by Boqun Feng
- documentation for RefCounted as suggested by Boqun Feng
- Link to v7: https://lore.kernel.org/r/20250310-unique-ref-v7-0-4caddb78aa05@pm.me
Changes in v7:
- Squash patch to make Owned::from_raw/into_raw public into parent
- Added Signed-off-by to other people's commits
- Link to v6: https://lore.kernel.org/r/20250310-unique-ref-v6-0-1ff53558617e@pm.me
Changes in v6:
- Changed comments/formatting as suggested by Miguel Ojeda
- Included and used new config flag RUSTC_HAS_DO_NOT_RECOMMEND,
thus no changes to types.rs will be needed when the attribute
becomes available.
- Fixed commit message for Owned patch.
- Link to v5: https://lore.kernel.org/r/20250307-unique-ref-v5-0-bffeb633277e@pm.me
Changes in v5:
- Rebase the whole thing on top of the Ownable/Owned traits by Asahi Lina.
- Rename AlwaysRefCounted to RefCounted and make AlwaysRefCounted a
marker trait instead to allow to obtain an ARef<T> from an &T,
which (as Alice pointed out) is unsound when combined with UniqueRef/Owned.
- Change the Trait design and naming to implement this feature,
UniqueRef/UniqueRefCounted is dropped in favor of Ownable/Owned and
OwnableRefCounted is used to provide the functions to convert
between Owned and ARef.
- Link to v4: https://lore.kernel.org/r/20250305-unique-ref-v4-1-a8fdef7b1c2c@pm.me
Changes in v4:
- Just a minor change in naming by request from Andreas Hindborg,
try_shared_to_unique() -> try_from_shared(),
unique_to_shared() -> into_shared(),
which is more in line with standard Rust naming conventions.
- Link to v3: https://lore.kernel.org/r/Z8Wuud2UQX6Yukyr@mango
To: Danilo Krummrich <dakr@kernel.org>
To: Lorenzo Stoakes <ljs@kernel.org>
To: Vlastimil Babka <vbabka@kernel.org>
To: "Liam R. Howlett" <liam@infradead.org>
To: Uladzislau Rezki <urezki@gmail.com>
To: Miguel Ojeda <ojeda@kernel.org>
To: Boqun Feng <boqun@kernel.org>
To: Gary Guo <gary@garyguo.net>
To: Björn Roy Baron <bjorn3_gh@protonmail.com>
To: Benno Lossin <lossin@kernel.org>
To: Andreas Hindborg <a.hindborg@kernel.org>
To: Alice Ryhl <aliceryhl@google.com>
To: Trevor Gross <tmgross@umich.edu>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: "Rafael J. Wysocki" <rafael@kernel.org>
To: Dave Ertman <david.m.ertman@intel.com>
To: Ira Weiny <ira.weiny@intel.com>
To: Leon Romanovsky <leon@kernel.org>
To: Paul Moore <paul@paul-moore.com>
To: Serge Hallyn <sergeh@kernel.org>
To: David Airlie <airlied@gmail.com>
To: Simona Vetter <simona@ffwll.ch>
To: Alexander Viro <viro@zeniv.linux.org.uk>
To: Christian Brauner <brauner@kernel.org>
To: Jan Kara <jack@suse.cz>
To: Igor Korotin <igor.korotin@linux.dev>
To: Daniel Almeida <daniel.almeida@collabora.com>
To: Viresh Kumar <vireshk@kernel.org>
To: Nishanth Menon <nm@ti.com>
To: Stephen Boyd <sboyd@kernel.org>
To: Bjorn Helgaas <bhelgaas@google.com>
To: Krzysztof Wilczyński <kwilczynski@kernel.org>
To: Pavel Tikhomirov <ptikhomirov@virtuozzo.com>
Cc: rust-for-linux@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: driver-core@lists.linux.dev
Cc: linux-block@vger.kernel.org
Cc: linux-security-module@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-pci@vger.kernel.org
---
Andreas Hindborg (5):
rust: alloc: add `KBox::into_non_null`
rust: implement `ForeignOwnable` for `Owned`
rust: page: update formatting of `use` statements
rust: aref: update formatting of use statements
rust: page: add `from_raw()`
Asahi Lina (2):
rust: types: Add Ownable/Owned types
rust: page: convert to `Ownable`
Oliver Mangold (3):
rust: rename `AlwaysRefCounted` to `RefCounted`.
rust: Add missing SAFETY documentation for `ARef` example
rust: Add `OwnableRefCounted`
rust/kernel/alloc/kbox.rs | 9 +
rust/kernel/auxiliary.rs | 7 +-
rust/kernel/block/mq/request.rs | 15 +-
rust/kernel/cred.rs | 13 +-
rust/kernel/device.rs | 12 +-
rust/kernel/device/property.rs | 11 +-
rust/kernel/drm/device.rs | 9 +-
rust/kernel/drm/gem/mod.rs | 16 +-
rust/kernel/fs/file.rs | 16 +-
rust/kernel/i2c.rs | 13 +-
rust/kernel/lib.rs | 1 +
rust/kernel/mm.rs | 15 +-
rust/kernel/mm/mmput_async.rs | 9 +-
rust/kernel/opp.rs | 10 +-
rust/kernel/owned.rs | 361 ++++++++++++++++++++++++++++++++++++++++
rust/kernel/page.rs | 62 +++++--
rust/kernel/pci.rs | 10 +-
rust/kernel/pid_namespace.rs | 12 +-
rust/kernel/platform.rs | 7 +-
rust/kernel/sync/aref.rs | 83 ++++++---
rust/kernel/task.rs | 13 +-
rust/kernel/types.rs | 13 ++
rust/kernel/usb.rs | 17 +-
23 files changed, 652 insertions(+), 82 deletions(-)
---
base-commit: 7fd2df204f342fc17d1a0bfcd474b24232fb0f32
change-id: 20250305-unique-ref-29fcd675f9e9
Best regards,
--
Andreas Hindborg <a.hindborg@kernel.org>
^ permalink raw reply
* [PATCH v17 07/10] rust: Add missing SAFETY documentation for `ARef` example
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core, Oliver Mangold
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
From: Oliver Mangold <oliver.mangold@pm.me>
SAFETY comment in rustdoc example was just 'TODO'. Fixed.
Signed-off-by: Oliver Mangold <oliver.mangold@pm.me>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Co-developed-by: Andreas Hindborg <a.hindborg@kernel.org>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
rust/kernel/sync/aref.rs | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/rust/kernel/sync/aref.rs b/rust/kernel/sync/aref.rs
index 2d656f672b97..7491382bcf29 100644
--- a/rust/kernel/sync/aref.rs
+++ b/rust/kernel/sync/aref.rs
@@ -137,7 +137,9 @@ pub unsafe fn from_raw(ptr: NonNull<T>) -> Self {
///
/// struct Empty {}
///
- /// # // SAFETY: TODO.
+ /// // SAFETY: The `RefCounted` implementation for `Empty` does not count references and never
+ /// // frees the underlying object. Thus we can act as owning an increment on the refcount for
+ /// // the object that we pass to the newly created `ARef`.
/// unsafe impl RefCounted for Empty {
/// fn inc_ref(&self) {}
/// unsafe fn dec_ref(_obj: NonNull<Self>) {}
@@ -145,7 +147,7 @@ pub unsafe fn from_raw(ptr: NonNull<T>) -> Self {
///
/// let mut data = Empty {};
/// let ptr = NonNull::<Empty>::new(&mut data).unwrap();
- /// # // SAFETY: TODO.
+ /// // SAFETY: We keep `data` around longer than the `ARef`.
/// let data_ref: ARef<Empty> = unsafe { ARef::from_raw(ptr) };
/// let raw_ptr: NonNull<Empty> = ARef::into_raw(data_ref);
///
--
2.51.2
^ permalink raw reply related
* [PATCH v17 08/10] rust: aref: update formatting of use statements
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
Update formatting if use statements in preparation for next commit.
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
rust/kernel/sync/aref.rs | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/rust/kernel/sync/aref.rs b/rust/kernel/sync/aref.rs
index 7491382bcf29..818c84fa923a 100644
--- a/rust/kernel/sync/aref.rs
+++ b/rust/kernel/sync/aref.rs
@@ -17,7 +17,12 @@
//! [`Arc`]: crate::sync::Arc
//! [`Arc<T>`]: crate::sync::Arc
-use core::{marker::PhantomData, mem::ManuallyDrop, ops::Deref, ptr::NonNull};
+use core::{
+ marker::PhantomData,
+ mem::ManuallyDrop,
+ ops::Deref,
+ ptr::NonNull, //
+};
/// Types that are internally reference counted.
///
--
2.51.2
^ permalink raw reply related
* [PATCH v17 06/10] rust: rename `AlwaysRefCounted` to `RefCounted`.
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core, Oliver Mangold, Viresh Kumar
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
From: Oliver Mangold <oliver.mangold@pm.me>
There are types where it may both be reference counted in some cases and
owned in others. In such cases, obtaining `ARef<T>` from `&T` would be
unsound as it allows creation of `ARef<T>` copy from `&Owned<T>`.
Therefore, we split `AlwaysRefCounted` into `RefCounted` (which `ARef<T>`
would require) and a marker trait to indicate that the type is always
reference counted (and not `Ownable`) so the `&T` -> `ARef<T>` conversion
is possible.
- Rename `AlwaysRefCounted` to `RefCounted`.
- Add a new unsafe trait `AlwaysRefCounted`.
- Implement the new trait `AlwaysRefCounted` for the newly renamed
`RefCounted` implementations. This leaves functionality of existing
implementers of `AlwaysRefCounted` intact.
Suggested-by: Alice Ryhl <aliceryhl@google.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
Signed-off-by: Oliver Mangold <oliver.mangold@pm.me>
[ Andreas: Updated commit message and rebase on rust-6.20-7.0 ]
Acked-by: Igor Korotin <igor.korotin.linux@gmail.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Reviewed-by: Gary Guo <gary@garyguo.net>
Co-developed-by: Andreas Hindborg <a.hindborg@kernel.org>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
rust/kernel/auxiliary.rs | 7 +++++-
rust/kernel/block/mq/request.rs | 15 ++++++++-----
rust/kernel/cred.rs | 13 +++++++++--
rust/kernel/device.rs | 12 ++++++++--
rust/kernel/device/property.rs | 11 +++++++--
rust/kernel/drm/device.rs | 9 ++++++--
rust/kernel/drm/gem/mod.rs | 16 ++++++++++----
rust/kernel/fs/file.rs | 16 ++++++++++----
rust/kernel/i2c.rs | 13 ++++++++---
rust/kernel/mm.rs | 15 +++++++++----
rust/kernel/mm/mmput_async.rs | 9 ++++++--
rust/kernel/opp.rs | 10 ++++++---
rust/kernel/owned.rs | 2 +-
rust/kernel/pci.rs | 10 ++++++++-
rust/kernel/pid_namespace.rs | 12 ++++++++--
rust/kernel/platform.rs | 7 +++++-
rust/kernel/sync/aref.rs | 49 ++++++++++++++++++++++++++---------------
rust/kernel/task.rs | 13 +++++++++--
rust/kernel/types.rs | 3 ++-
rust/kernel/usb.rs | 17 +++++++++++---
20 files changed, 195 insertions(+), 64 deletions(-)
diff --git a/rust/kernel/auxiliary.rs b/rust/kernel/auxiliary.rs
index 93c0db1f6655..49f07740f657 100644
--- a/rust/kernel/auxiliary.rs
+++ b/rust/kernel/auxiliary.rs
@@ -19,6 +19,7 @@
to_result, //
},
prelude::*,
+ sync::aref::{AlwaysRefCounted, RefCounted},
types::Opaque,
ThisModule, //
};
@@ -289,7 +290,7 @@ unsafe impl<Ctx: device::DeviceContext> device::AsBusDevice<Ctx> for Device<Ctx>
kernel::impl_device_context_into_aref!(Device);
// SAFETY: Instances of `Device` are always reference-counted.
-unsafe impl crate::sync::aref::AlwaysRefCounted for Device {
+unsafe impl RefCounted for Device {
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
unsafe { bindings::get_device(self.as_ref().as_raw()) };
@@ -308,6 +309,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
+// `&Device`.
+unsafe impl AlwaysRefCounted for Device {}
+
impl<Ctx: device::DeviceContext> AsRef<device::Device<Ctx>> for Device<Ctx> {
fn as_ref(&self) -> &device::Device<Ctx> {
// SAFETY: By the type invariant of `Self`, `self.as_raw()` is a pointer to a valid
diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
index ce3e30c81cb5..cf013b9e2cac 100644
--- a/rust/kernel/block/mq/request.rs
+++ b/rust/kernel/block/mq/request.rs
@@ -9,7 +9,7 @@
block::mq::Operations,
error::Result,
sync::{
- aref::{ARef, AlwaysRefCounted},
+ aref::{ARef, AlwaysRefCounted, RefCounted},
atomic::Relaxed,
Refcount,
},
@@ -229,11 +229,10 @@ unsafe impl<T: Operations> Send for Request<T> {}
// mutate `self` are internally synchronized`
unsafe impl<T: Operations> Sync for Request<T> {}
-// SAFETY: All instances of `Request<T>` are reference counted. This
-// implementation of `AlwaysRefCounted` ensure that increments to the ref count
-// keeps the object alive in memory at least until a matching reference count
-// decrement is executed.
-unsafe impl<T: Operations> AlwaysRefCounted for Request<T> {
+// SAFETY: All instances of `Request<T>` are reference counted. This implementation of `RefCounted`
+// ensure that increments to the ref count keeps the object alive in memory at least until a
+// matching reference count decrement is executed.
+unsafe impl<T: Operations> RefCounted for Request<T> {
fn inc_ref(&self) {
self.wrapper_ref().refcount().inc();
}
@@ -255,3 +254,7 @@ unsafe fn dec_ref(obj: core::ptr::NonNull<Self>) {
}
}
}
+
+// SAFETY: We currently do not implement `Ownable`, thus it is okay to obtain an `ARef<Request>`
+// from a `&Request` (but this will change in the future).
+unsafe impl<T: Operations> AlwaysRefCounted for Request<T> {}
diff --git a/rust/kernel/cred.rs b/rust/kernel/cred.rs
index ffa156b9df37..20ef0144094b 100644
--- a/rust/kernel/cred.rs
+++ b/rust/kernel/cred.rs
@@ -8,7 +8,12 @@
//!
//! Reference: <https://www.kernel.org/doc/html/latest/security/credentials.html>
-use crate::{bindings, sync::aref::AlwaysRefCounted, task::Kuid, types::Opaque};
+use crate::{
+ bindings,
+ sync::aref::RefCounted,
+ task::Kuid,
+ types::{AlwaysRefCounted, Opaque},
+};
/// Wraps the kernel's `struct cred`.
///
@@ -76,7 +81,7 @@ pub fn euid(&self) -> Kuid {
}
// SAFETY: The type invariants guarantee that `Credential` is always ref-counted.
-unsafe impl AlwaysRefCounted for Credential {
+unsafe impl RefCounted for Credential {
#[inline]
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference means that the refcount is nonzero.
@@ -90,3 +95,7 @@ unsafe fn dec_ref(obj: core::ptr::NonNull<Credential>) {
unsafe { bindings::put_cred(obj.cast().as_ptr()) };
}
}
+
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Credential>` from a
+// `&Credential`.
+unsafe impl AlwaysRefCounted for Credential {}
diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs
index 6d5396a43ebe..efdf33617d12 100644
--- a/rust/kernel/device.rs
+++ b/rust/kernel/device.rs
@@ -8,8 +8,12 @@
bindings,
fmt,
prelude::*,
- sync::aref::ARef,
+ sync::aref::{
+ ARef,
+ RefCounted, //
+ },
types::{
+ AlwaysRefCounted,
ForeignOwnable,
Opaque, //
}, //
@@ -508,7 +512,7 @@ pub fn name(&self) -> &CStr {
kernel::impl_device_context_into_aref!(Device);
// SAFETY: Instances of `Device` are always reference-counted.
-unsafe impl crate::sync::aref::AlwaysRefCounted for Device {
+unsafe impl RefCounted for Device {
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
unsafe { bindings::get_device(self.as_raw()) };
@@ -520,6 +524,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
+// `&Device`.
+unsafe impl AlwaysRefCounted for Device {}
+
// SAFETY: As by the type invariant `Device` can be sent to any thread.
unsafe impl Send for Device {}
diff --git a/rust/kernel/device/property.rs b/rust/kernel/device/property.rs
index 5aead835fbbc..cee7e2501368 100644
--- a/rust/kernel/device/property.rs
+++ b/rust/kernel/device/property.rs
@@ -14,7 +14,10 @@
fmt,
prelude::*,
str::{CStr, CString},
- sync::aref::ARef,
+ sync::aref::{
+ ARef,
+ AlwaysRefCounted, //
+ },
types::Opaque,
};
@@ -360,7 +363,7 @@ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
}
// SAFETY: Instances of `FwNode` are always reference-counted.
-unsafe impl crate::sync::aref::AlwaysRefCounted for FwNode {
+unsafe impl crate::sync::aref::RefCounted for FwNode {
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference guarantees that the
// refcount is non-zero.
@@ -374,6 +377,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<FwNode>` from a
+// `&FwNode`.
+unsafe impl AlwaysRefCounted for FwNode {}
+
enum Node<'a> {
Borrowed(&'a FwNode),
Owned(ARef<FwNode>),
diff --git a/rust/kernel/drm/device.rs b/rust/kernel/drm/device.rs
index adbafe8db54d..a5a040266aae 100644
--- a/rust/kernel/drm/device.rs
+++ b/rust/kernel/drm/device.rs
@@ -15,7 +15,8 @@
prelude::*,
sync::aref::{
ARef,
- AlwaysRefCounted, //
+ AlwaysRefCounted,
+ RefCounted, //
},
types::Opaque,
workqueue::{
@@ -217,7 +218,7 @@ fn deref(&self) -> &Self::Target {
// SAFETY: DRM device objects are always reference counted and the get/put functions
// satisfy the requirements.
-unsafe impl<T: drm::Driver> AlwaysRefCounted for Device<T> {
+unsafe impl<T: drm::Driver> RefCounted for Device<T> {
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
unsafe { bindings::drm_dev_get(self.as_raw()) };
@@ -232,6 +233,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
+// `&Device`.
+unsafe impl<T: drm::Driver> AlwaysRefCounted for Device<T> {}
+
impl<T: drm::Driver> AsRef<device::Device> for Device<T> {
fn as_ref(&self) -> &device::Device {
// SAFETY: `bindings::drm_device::dev` is valid as long as the DRM device itself is valid,
diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs
index 75acda7ba500..f8cc2a0ff4c7 100644
--- a/rust/kernel/drm/gem/mod.rs
+++ b/rust/kernel/drm/gem/mod.rs
@@ -17,7 +17,7 @@
prelude::*,
sync::aref::{
ARef,
- AlwaysRefCounted, //
+ RefCounted, //
},
types::Opaque,
};
@@ -29,7 +29,7 @@
#[cfg(CONFIG_RUST_DRM_GEM_SHMEM_HELPER)]
pub mod shmem;
-/// A macro for implementing [`AlwaysRefCounted`] for any GEM object type.
+/// A macro for implementing [`RefCounted`] for any GEM object type.
///
/// Since all GEM objects use the same refcounting scheme.
#[macro_export]
@@ -42,7 +42,7 @@ impl $( <$( $tparam_id:ident ),+> )? for $type:ty
)?
) => {
// SAFETY: All GEM objects are refcounted.
- unsafe impl $( <$( $tparam_id ),+> )? $crate::sync::aref::AlwaysRefCounted for $type
+ unsafe impl $( <$( $tparam_id ),+> )? $crate::sync::aref::RefCounted for $type
where
Self: IntoGEMObject,
$( $( $bind_param : $bind_trait ),+ )?
@@ -61,6 +61,14 @@ unsafe fn dec_ref(obj: core::ptr::NonNull<Self>) {
unsafe { bindings::drm_gem_object_put(obj) };
}
}
+
+ // SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<$type>` from a
+ // `&$type`.
+ unsafe impl $( <$( $tparam_id ),+> )? $crate::sync::aref::AlwaysRefCounted for $type
+ where
+ Self: IntoGEMObject,
+ $( $( $bind_param : $bind_trait ),+ )?
+ {}
};
}
#[cfg_attr(not(CONFIG_RUST_DRM_GEM_SHMEM_HELPER), allow(unused))]
@@ -98,7 +106,7 @@ fn close(_obj: &<Self::Driver as drm::Driver>::Object, _file: &DriverFile<Self>)
}
/// Trait that represents a GEM object subtype
-pub trait IntoGEMObject: Sized + super::private::Sealed + AlwaysRefCounted {
+pub trait IntoGEMObject: Sized + super::private::Sealed + RefCounted {
/// Returns a reference to the raw `drm_gem_object` structure, which must be valid as long as
/// this owning object is valid.
fn as_raw(&self) -> *mut bindings::drm_gem_object;
diff --git a/rust/kernel/fs/file.rs b/rust/kernel/fs/file.rs
index 23ee689bd240..06e457d62a93 100644
--- a/rust/kernel/fs/file.rs
+++ b/rust/kernel/fs/file.rs
@@ -12,8 +12,8 @@
cred::Credential,
error::{code::*, to_result, Error, Result},
fmt,
- sync::aref::{ARef, AlwaysRefCounted},
- types::{NotThreadSafe, Opaque},
+ sync::aref::RefCounted,
+ types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
};
use core::ptr;
@@ -197,7 +197,7 @@ unsafe impl Sync for File {}
// SAFETY: The type invariants guarantee that `File` is always ref-counted. This implementation
// makes `ARef<File>` own a normal refcount.
-unsafe impl AlwaysRefCounted for File {
+unsafe impl RefCounted for File {
#[inline]
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference means that the refcount is nonzero.
@@ -212,6 +212,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<File>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<File>` from a
+// `&File`.
+unsafe impl AlwaysRefCounted for File {}
+
/// Wraps the kernel's `struct file`. Not thread safe.
///
/// This type represents a file that is not known to be safe to transfer across thread boundaries.
@@ -233,7 +237,7 @@ pub struct LocalFile {
// SAFETY: The type invariants guarantee that `LocalFile` is always ref-counted. This implementation
// makes `ARef<LocalFile>` own a normal refcount.
-unsafe impl AlwaysRefCounted for LocalFile {
+unsafe impl RefCounted for LocalFile {
#[inline]
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference means that the refcount is nonzero.
@@ -249,6 +253,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<LocalFile>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<LocalFile>` from a
+// `&LocalFile`.
+unsafe impl AlwaysRefCounted for LocalFile {}
+
impl LocalFile {
/// Constructs a new `struct file` wrapper from a file descriptor.
///
diff --git a/rust/kernel/i2c.rs b/rust/kernel/i2c.rs
index 7b908f0c5a58..56791c1d63d7 100644
--- a/rust/kernel/i2c.rs
+++ b/rust/kernel/i2c.rs
@@ -18,7 +18,8 @@
prelude::*,
sync::aref::{
ARef,
- AlwaysRefCounted, //
+ AlwaysRefCounted,
+ RefCounted, //
},
types::Opaque, //
};
@@ -415,7 +416,7 @@ pub fn get(index: i32) -> Result<ARef<Self>> {
kernel::impl_device_context_into_aref!(I2cAdapter);
// SAFETY: Instances of `I2cAdapter` are always reference-counted.
-unsafe impl AlwaysRefCounted for I2cAdapter {
+unsafe impl RefCounted for I2cAdapter {
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
unsafe { bindings::i2c_get_adapter(self.index()) };
@@ -426,6 +427,9 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
unsafe { bindings::i2c_put_adapter(obj.as_ref().as_raw()) }
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from an
+// `&I2cAdapter`.
+unsafe impl AlwaysRefCounted for I2cAdapter {}
/// The i2c board info representation
///
@@ -491,7 +495,7 @@ unsafe impl<Ctx: device::DeviceContext> device::AsBusDevice<Ctx> for I2cClient<C
kernel::impl_device_context_into_aref!(I2cClient);
// SAFETY: Instances of `I2cClient` are always reference-counted.
-unsafe impl AlwaysRefCounted for I2cClient {
+unsafe impl RefCounted for I2cClient {
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
unsafe { bindings::get_device(self.as_ref().as_raw()) };
@@ -502,6 +506,9 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
unsafe { bindings::put_device(&raw mut (*obj.as_ref().as_raw()).dev) }
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from an
+// `&I2cClient`.
+unsafe impl AlwaysRefCounted for I2cClient {}
impl<Ctx: device::DeviceContext> AsRef<device::Device<Ctx>> for I2cClient<Ctx> {
fn as_ref(&self) -> &device::Device<Ctx> {
diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
index 4764d7b68f2a..dd9e3969e720 100644
--- a/rust/kernel/mm.rs
+++ b/rust/kernel/mm.rs
@@ -13,8 +13,8 @@
use crate::{
bindings,
- sync::aref::{ARef, AlwaysRefCounted},
- types::{NotThreadSafe, Opaque},
+ sync::aref::RefCounted,
+ types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
};
use core::{ops::Deref, ptr::NonNull};
@@ -55,7 +55,7 @@ unsafe impl Send for Mm {}
unsafe impl Sync for Mm {}
// SAFETY: By the type invariants, this type is always refcounted.
-unsafe impl AlwaysRefCounted for Mm {
+unsafe impl RefCounted for Mm {
#[inline]
fn inc_ref(&self) {
// SAFETY: The pointer is valid since self is a reference.
@@ -69,6 +69,9 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Mm>` from a `&Mm`.
+unsafe impl AlwaysRefCounted for Mm {}
+
/// A wrapper for the kernel's `struct mm_struct`.
///
/// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
@@ -91,7 +94,7 @@ unsafe impl Send for MmWithUser {}
unsafe impl Sync for MmWithUser {}
// SAFETY: By the type invariants, this type is always refcounted.
-unsafe impl AlwaysRefCounted for MmWithUser {
+unsafe impl RefCounted for MmWithUser {
#[inline]
fn inc_ref(&self) {
// SAFETY: The pointer is valid since self is a reference.
@@ -105,6 +108,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<MmWithUser>` from a
+// `&MmWithUser`.
+unsafe impl AlwaysRefCounted for MmWithUser {}
+
// Make all `Mm` methods available on `MmWithUser`.
impl Deref for MmWithUser {
type Target = Mm;
diff --git a/rust/kernel/mm/mmput_async.rs b/rust/kernel/mm/mmput_async.rs
index b8d2f051225c..aba4ce675c86 100644
--- a/rust/kernel/mm/mmput_async.rs
+++ b/rust/kernel/mm/mmput_async.rs
@@ -10,7 +10,8 @@
use crate::{
bindings,
mm::MmWithUser,
- sync::aref::{ARef, AlwaysRefCounted},
+ sync::aref::RefCounted,
+ types::{ARef, AlwaysRefCounted},
};
use core::{ops::Deref, ptr::NonNull};
@@ -34,7 +35,7 @@ unsafe impl Send for MmWithUserAsync {}
unsafe impl Sync for MmWithUserAsync {}
// SAFETY: By the type invariants, this type is always refcounted.
-unsafe impl AlwaysRefCounted for MmWithUserAsync {
+unsafe impl RefCounted for MmWithUserAsync {
#[inline]
fn inc_ref(&self) {
// SAFETY: The pointer is valid since self is a reference.
@@ -48,6 +49,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<MmWithUserAsync>`
+// from a `&MmWithUserAsync`.
+unsafe impl AlwaysRefCounted for MmWithUserAsync {}
+
// Make all `MmWithUser` methods available on `MmWithUserAsync`.
impl Deref for MmWithUserAsync {
type Target = MmWithUser;
diff --git a/rust/kernel/opp.rs b/rust/kernel/opp.rs
index a760fac28765..06fe2ca776a4 100644
--- a/rust/kernel/opp.rs
+++ b/rust/kernel/opp.rs
@@ -16,8 +16,8 @@
ffi::{c_char, c_ulong},
prelude::*,
str::CString,
- sync::aref::{ARef, AlwaysRefCounted},
- types::Opaque,
+ sync::aref::RefCounted,
+ types::{ARef, AlwaysRefCounted, Opaque},
};
#[cfg(CONFIG_CPU_FREQ)]
@@ -1041,7 +1041,7 @@ unsafe impl Send for OPP {}
unsafe impl Sync for OPP {}
/// SAFETY: The type invariants guarantee that [`OPP`] is always refcounted.
-unsafe impl AlwaysRefCounted for OPP {
+unsafe impl RefCounted for OPP {
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference means that the refcount is nonzero.
unsafe { bindings::dev_pm_opp_get(self.0.get()) };
@@ -1053,6 +1053,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<OPP>` from an
+// `&OPP`.
+unsafe impl AlwaysRefCounted for OPP {}
+
impl OPP {
/// Creates an owned reference to a [`OPP`] from a valid pointer.
///
diff --git a/rust/kernel/owned.rs b/rust/kernel/owned.rs
index 5eacdf327d12..bedd4fef84fa 100644
--- a/rust/kernel/owned.rs
+++ b/rust/kernel/owned.rs
@@ -27,7 +27,7 @@
///
/// Note: The underlying object is not required to provide internal reference counting, because it
/// represents a unique, owned reference. If reference counting (on the Rust side) is required,
-/// [`AlwaysRefCounted`](crate::types::AlwaysRefCounted) should be implemented.
+/// [`RefCounted`](crate::types::RefCounted) should be implemented.
///
/// # Examples
///
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index af74ddff6114..acf7384fea02 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -19,6 +19,10 @@
},
prelude::*,
str::CStr,
+ sync::aref::{
+ AlwaysRefCounted,
+ RefCounted, //
+ },
types::Opaque,
ThisModule, //
};
@@ -474,7 +478,7 @@ unsafe impl<Ctx: device::DeviceContext> device::AsBusDevice<Ctx> for Device<Ctx>
impl crate::dma::Device for Device<device::Core> {}
// SAFETY: Instances of `Device` are always reference-counted.
-unsafe impl crate::sync::aref::AlwaysRefCounted for Device {
+unsafe impl RefCounted for Device {
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
unsafe { bindings::pci_dev_get(self.as_raw()) };
@@ -486,6 +490,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
+// `&Device`.
+unsafe impl AlwaysRefCounted for Device {}
+
impl<Ctx: device::DeviceContext> AsRef<device::Device<Ctx>> for Device<Ctx> {
fn as_ref(&self) -> &device::Device<Ctx> {
// SAFETY: By the type invariant of `Self`, `self.as_raw()` is a pointer to a valid
diff --git a/rust/kernel/pid_namespace.rs b/rust/kernel/pid_namespace.rs
index 979a9718f153..4f6a94540e33 100644
--- a/rust/kernel/pid_namespace.rs
+++ b/rust/kernel/pid_namespace.rs
@@ -7,7 +7,11 @@
//! C header: [`include/linux/pid_namespace.h`](srctree/include/linux/pid_namespace.h) and
//! [`include/linux/pid.h`](srctree/include/linux/pid.h)
-use crate::{bindings, sync::aref::AlwaysRefCounted, types::Opaque};
+use crate::{
+ bindings,
+ sync::aref::RefCounted,
+ types::{AlwaysRefCounted, Opaque},
+};
use core::ptr;
/// Wraps the kernel's `struct pid_namespace`. Thread safe.
@@ -41,7 +45,7 @@ pub unsafe fn from_ptr<'a>(ptr: *const bindings::pid_namespace) -> &'a Self {
}
// SAFETY: Instances of `PidNamespace` are always reference-counted.
-unsafe impl AlwaysRefCounted for PidNamespace {
+unsafe impl RefCounted for PidNamespace {
#[inline]
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference means that the refcount is nonzero.
@@ -55,6 +59,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<PidNamespace>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<PidNamespace>` from
+// a `&PidNamespace`.
+unsafe impl AlwaysRefCounted for PidNamespace {}
+
// SAFETY:
// - `PidNamespace::dec_ref` can be called from any thread.
// - It is okay to send ownership of `PidNamespace` across thread boundaries.
diff --git a/rust/kernel/platform.rs b/rust/kernel/platform.rs
index 8917d4ee499f..3c35aa94e319 100644
--- a/rust/kernel/platform.rs
+++ b/rust/kernel/platform.rs
@@ -27,6 +27,7 @@
},
of,
prelude::*,
+ sync::aref::{AlwaysRefCounted, RefCounted},
types::Opaque,
ThisModule, //
};
@@ -512,7 +513,7 @@ pub fn optional_irq_by_name(&self, name: &CStr) -> Result<IrqRequest<'_>> {
impl crate::dma::Device for Device<device::Core> {}
// SAFETY: Instances of `Device` are always reference-counted.
-unsafe impl crate::sync::aref::AlwaysRefCounted for Device {
+unsafe impl RefCounted for Device {
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
unsafe { bindings::get_device(self.as_ref().as_raw()) };
@@ -524,6 +525,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
+// `&Device`.
+unsafe impl AlwaysRefCounted for Device {}
+
impl<Ctx: device::DeviceContext> AsRef<device::Device<Ctx>> for Device<Ctx> {
fn as_ref(&self) -> &device::Device<Ctx> {
// SAFETY: By the type invariant of `Self`, `self.as_raw()` is a pointer to a valid
diff --git a/rust/kernel/sync/aref.rs b/rust/kernel/sync/aref.rs
index 4ee5fac0e0b6..2d656f672b97 100644
--- a/rust/kernel/sync/aref.rs
+++ b/rust/kernel/sync/aref.rs
@@ -19,11 +19,9 @@
use core::{marker::PhantomData, mem::ManuallyDrop, ops::Deref, ptr::NonNull};
-/// Types that are _always_ reference counted.
+/// Types that are internally reference counted.
///
/// It allows such types to define their own custom ref increment and decrement functions.
-/// Additionally, it allows users to convert from a shared reference `&T` to an owned reference
-/// [`ARef<T>`].
///
/// This is usually implemented by wrappers to existing structures on the C side of the code. For
/// Rust code, the recommendation is to use [`Arc`](crate::sync::Arc) to create reference-counted
@@ -40,9 +38,8 @@
/// at least until matching decrements are performed.
///
/// Implementers must also ensure that all instances are reference-counted. (Otherwise they
-/// won't be able to honour the requirement that [`AlwaysRefCounted::inc_ref`] keep the object
-/// alive.)
-pub unsafe trait AlwaysRefCounted {
+/// won't be able to honour the requirement that [`RefCounted::inc_ref`] keep the object alive.)
+pub unsafe trait RefCounted {
/// Increments the reference count on the object.
fn inc_ref(&self);
@@ -55,11 +52,27 @@ pub unsafe trait AlwaysRefCounted {
/// Callers must ensure that there was a previous matching increment to the reference count,
/// and that the object is no longer used after its reference count is decremented (as it may
/// result in the object being freed), unless the caller owns another increment on the refcount
- /// (e.g., it calls [`AlwaysRefCounted::inc_ref`] twice, then calls
- /// [`AlwaysRefCounted::dec_ref`] once).
+ /// (e.g., it calls [`RefCounted::inc_ref`] twice, then calls [`RefCounted::dec_ref`] once).
unsafe fn dec_ref(obj: NonNull<Self>);
}
+/// Always reference-counted type.
+///
+/// It allows deriving a counted reference [`ARef<T>`] from a `&T`.
+///
+/// This provides some convenience, but it allows "escaping" borrow checks on `&T`. As it
+/// complicates attempts to ensure that a reference to T is unique, it is optional to provide for
+/// [`RefCounted`] types. See *Safety* below.
+///
+/// # Safety
+///
+/// Implementers must ensure that no safety invariants are violated by upgrading an `&T` to an
+/// [`ARef<T>`]. In particular that implies [`AlwaysRefCounted`] and [`crate::types::Ownable`]
+/// cannot be implemented for the same type, as this would allow violating the uniqueness guarantee
+/// of [`crate::types::Owned<T>`] by dereferencing it into an `&T` and obtaining an [`ARef`] from
+/// that.
+pub unsafe trait AlwaysRefCounted: RefCounted {}
+
/// An owned reference to an always-reference-counted object.
///
/// The object's reference count is automatically decremented when an instance of [`ARef`] is
@@ -70,7 +83,7 @@ pub unsafe trait AlwaysRefCounted {
///
/// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
/// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
-pub struct ARef<T: AlwaysRefCounted> {
+pub struct ARef<T: RefCounted> {
ptr: NonNull<T>,
_p: PhantomData<T>,
}
@@ -79,19 +92,19 @@ pub struct ARef<T: AlwaysRefCounted> {
// it effectively means sharing `&T` (which is safe because `T` is `Sync`); additionally, it needs
// `T` to be `Send` because any thread that has an `ARef<T>` may ultimately access `T` using a
// mutable reference, for example, when the reference count reaches zero and `T` is dropped.
-unsafe impl<T: AlwaysRefCounted + Sync + Send> Send for ARef<T> {}
+unsafe impl<T: RefCounted + Sync + Send> Send for ARef<T> {}
// SAFETY: It is safe to send `&ARef<T>` to another thread when the underlying `T` is `Sync`
// because it effectively means sharing `&T` (which is safe because `T` is `Sync`); additionally,
// it needs `T` to be `Send` because any thread that has a `&ARef<T>` may clone it and get an
// `ARef<T>` on that thread, so the thread may ultimately access `T` using a mutable reference, for
// example, when the reference count reaches zero and `T` is dropped.
-unsafe impl<T: AlwaysRefCounted + Sync + Send> Sync for ARef<T> {}
+unsafe impl<T: RefCounted + Sync + Send> Sync for ARef<T> {}
// Even if `T` is pinned, pointers to `T` can still move.
-impl<T: AlwaysRefCounted> Unpin for ARef<T> {}
+impl<T: RefCounted> Unpin for ARef<T> {}
-impl<T: AlwaysRefCounted> ARef<T> {
+impl<T: RefCounted> ARef<T> {
/// Creates a new instance of [`ARef`].
///
/// It takes over an increment of the reference count on the underlying object.
@@ -120,12 +133,12 @@ pub unsafe fn from_raw(ptr: NonNull<T>) -> Self {
///
/// ```
/// use core::ptr::NonNull;
- /// use kernel::sync::aref::{ARef, AlwaysRefCounted};
+ /// use kernel::sync::aref::{ARef, RefCounted};
///
/// struct Empty {}
///
/// # // SAFETY: TODO.
- /// unsafe impl AlwaysRefCounted for Empty {
+ /// unsafe impl RefCounted for Empty {
/// fn inc_ref(&self) {}
/// unsafe fn dec_ref(_obj: NonNull<Self>) {}
/// }
@@ -143,7 +156,7 @@ pub fn into_raw(me: Self) -> NonNull<T> {
}
}
-impl<T: AlwaysRefCounted> Clone for ARef<T> {
+impl<T: RefCounted> Clone for ARef<T> {
fn clone(&self) -> Self {
self.inc_ref();
// SAFETY: We just incremented the refcount above.
@@ -151,7 +164,7 @@ fn clone(&self) -> Self {
}
}
-impl<T: AlwaysRefCounted> Deref for ARef<T> {
+impl<T: RefCounted> Deref for ARef<T> {
type Target = T;
fn deref(&self) -> &Self::Target {
@@ -168,7 +181,7 @@ fn from(b: &T) -> Self {
}
}
-impl<T: AlwaysRefCounted> Drop for ARef<T> {
+impl<T: RefCounted> Drop for ARef<T> {
fn drop(&mut self) {
// SAFETY: The type invariants guarantee that the `ARef` owns the reference we're about to
// decrement.
diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
index 38273f4eedb5..6259430b0ca3 100644
--- a/rust/kernel/task.rs
+++ b/rust/kernel/task.rs
@@ -10,7 +10,12 @@
pid_namespace::PidNamespace,
prelude::*,
sync::aref::ARef,
- types::{NotThreadSafe, Opaque},
+ types::{
+ AlwaysRefCounted,
+ NotThreadSafe,
+ Opaque,
+ RefCounted, //
+ },
};
use core::{
ops::Deref,
@@ -347,7 +352,7 @@ pub fn group_leader(&self) -> &Task {
}
// SAFETY: The type invariants guarantee that `Task` is always refcounted.
-unsafe impl crate::sync::aref::AlwaysRefCounted for Task {
+unsafe impl RefCounted for Task {
#[inline]
fn inc_ref(&self) {
// SAFETY: The existence of a shared reference means that the refcount is nonzero.
@@ -361,6 +366,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Task>` from a
+// `&Task`.
+unsafe impl AlwaysRefCounted for Task {}
+
impl PartialEq for Task {
#[inline]
fn eq(&self, other: &Self) -> bool {
diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
index 4aec7b699269..9b96aa2ebdb7 100644
--- a/rust/kernel/types.rs
+++ b/rust/kernel/types.rs
@@ -18,7 +18,8 @@
},
sync::aref::{
ARef,
- AlwaysRefCounted, //
+ AlwaysRefCounted,
+ RefCounted, //
}, //
};
diff --git a/rust/kernel/usb.rs b/rust/kernel/usb.rs
index 9c17a672cd27..90b13e65cc82 100644
--- a/rust/kernel/usb.rs
+++ b/rust/kernel/usb.rs
@@ -18,7 +18,10 @@
to_result, //
},
prelude::*,
- sync::aref::AlwaysRefCounted,
+ sync::aref::{
+ AlwaysRefCounted,
+ RefCounted, //
+ },
types::Opaque,
ThisModule, //
};
@@ -381,7 +384,7 @@ fn as_ref(&self) -> &Device {
}
// SAFETY: Instances of `Interface` are always reference-counted.
-unsafe impl AlwaysRefCounted for Interface {
+unsafe impl RefCounted for Interface {
fn inc_ref(&self) {
// SAFETY: The invariants of `Interface` guarantee that `self.as_raw()`
// returns a valid `struct usb_interface` pointer, for which we will
@@ -395,6 +398,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Interface>` from a
+// `&Interface`.
+unsafe impl AlwaysRefCounted for Interface {}
+
// SAFETY: A `Interface` is always reference-counted and can be released from any thread.
unsafe impl Send for Interface {}
@@ -432,7 +439,7 @@ fn as_raw(&self) -> *mut bindings::usb_device {
kernel::impl_device_context_into_aref!(Device);
// SAFETY: Instances of `Device` are always reference-counted.
-unsafe impl AlwaysRefCounted for Device {
+unsafe impl RefCounted for Device {
fn inc_ref(&self) {
// SAFETY: The invariants of `Device` guarantee that `self.as_raw()`
// returns a valid `struct usb_device` pointer, for which we will
@@ -446,6 +453,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
}
}
+// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
+// `&Device`.
+unsafe impl AlwaysRefCounted for Device {}
+
impl<Ctx: device::DeviceContext> AsRef<device::Device<Ctx>> for Device<Ctx> {
fn as_ref(&self) -> &device::Device<Ctx> {
// SAFETY: By the type invariant of `Self`, `self.as_raw()` is a pointer to a valid
--
2.51.2
^ permalink raw reply related
* [PATCH v17 02/10] rust: types: Add Ownable/Owned types
From: Andreas Hindborg @ 2026-06-04 20:11 UTC (permalink / raw)
To: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, Boqun Feng, Igor Korotin, Lorenzo Stoakes,
Liam R. Howlett, Vlastimil Babka
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
Andreas Hindborg, driver-core, Asahi Lina, Oliver Mangold
In-Reply-To: <20260604-unique-ref-v17-0-7b4c3d2930b9@kernel.org>
From: Asahi Lina <lina+kernel@asahilina.net>
By analogy to `AlwaysRefCounted` and `ARef`, an `Ownable` type is a
(typically C FFI) type that *may* be owned by Rust, but need not be. Unlike
`AlwaysRefCounted`, this mechanism expects the reference to be unique
within Rust, and does not allow cloning.
Conceptually, this is similar to a `KBox<T>`, except that it delegates
resource management to the `T` instead of using a generic allocator.
[ om:
- Split code into separate file and `pub use` it from types.rs.
- Make from_raw() and into_raw() public.
- Remove OwnableMut, and make DerefMut dependent on Unpin instead.
- Usage example/doctest for Ownable/Owned.
- Fixes to documentation and commit message.
]
Link: https://lore.kernel.org/all/20250202-rust-page-v1-1-e3170d7fe55e@asahilina.net/
Signed-off-by: Asahi Lina <lina+kernel@asahilina.net>
Co-developed-by: Oliver Mangold <oliver.mangold@pm.me>
Signed-off-by: Oliver Mangold <oliver.mangold@pm.me>
Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
[ Andreas: Updated documentation, examples, and formatting. Change safety
requirements, safety comments. Use a reference for `release`. ]
Reviewed-by: Gary Guo <gary@garyguo.net>
Co-developed-by: Andreas Hindborg <a.hindborg@kernel.org>
Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
---
rust/kernel/lib.rs | 1 +
rust/kernel/owned.rs | 187 +++++++++++++++++++++++++++++++++++++++++++++++
rust/kernel/sync/aref.rs | 5 ++
rust/kernel/types.rs | 11 +++
4 files changed, 204 insertions(+)
diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs
index b72b2fbe046d..d07759eec799 100644
--- a/rust/kernel/lib.rs
+++ b/rust/kernel/lib.rs
@@ -100,6 +100,7 @@
pub mod of;
#[cfg(CONFIG_PM_OPP)]
pub mod opp;
+pub mod owned;
pub mod page;
#[cfg(CONFIG_PCI)]
pub mod pci;
diff --git a/rust/kernel/owned.rs b/rust/kernel/owned.rs
new file mode 100644
index 000000000000..456e239e906e
--- /dev/null
+++ b/rust/kernel/owned.rs
@@ -0,0 +1,187 @@
+// SPDX-License-Identifier: GPL-2.0
+
+//! Unique owned pointer types for objects with custom drop logic.
+//!
+//! These pointer types are useful for C-allocated objects which by API-contract
+//! are owned by Rust, but need to be freed through the C API.
+
+use core::{
+ mem::ManuallyDrop,
+ ops::{
+ Deref,
+ DerefMut, //
+ },
+ pin::Pin,
+ ptr::NonNull, //
+};
+
+/// Types that specify their own way of performing allocation and destruction. Typically, this trait
+/// is implemented on types from the C side.
+///
+/// Implementing this trait allows types to be referenced via the [`Owned<Self>`] pointer type. This
+/// is useful when it is desirable to tie the lifetime of the reference to an owned object, rather
+/// than pass around a bare reference. [`Ownable`] types can define custom drop logic that is
+/// executed when the owned reference [`Owned<Self>`] pointing to the object is dropped.
+///
+/// Note: The underlying object is not required to provide internal reference counting, because it
+/// represents a unique, owned reference. If reference counting (on the Rust side) is required,
+/// [`AlwaysRefCounted`](crate::types::AlwaysRefCounted) should be implemented.
+///
+/// # Examples
+///
+/// A minimal example implementation of [`Ownable`] and its usage with [`Owned`] looks like
+/// this:
+///
+/// ```
+/// # #![expect(clippy::disallowed_names)]
+/// # use core::cell::Cell;
+/// # use core::ptr::NonNull;
+/// # use kernel::sync::global_lock;
+/// # use kernel::alloc::{flags, kbox::KBox, AllocError};
+/// # use kernel::types::{Owned, Ownable};
+///
+/// // Let's count the allocations to see if freeing works.
+/// kernel::sync::global_lock! {
+/// // SAFETY: we call `init()` right below, before doing anything else.
+/// unsafe(uninit) static FOO_ALLOC_COUNT: Mutex<usize> = 0;
+/// }
+/// // SAFETY: We call `init()` only once, here.
+/// unsafe { FOO_ALLOC_COUNT.init() };
+///
+/// struct Foo;
+///
+/// impl Foo {
+/// fn new() -> Result<Owned<Self>> {
+/// // We are just using a `KBox` here to handle the actual allocation, as our `Foo` is
+/// // not actually a C-allocated object.
+/// let result = KBox::new(
+/// Foo {},
+/// flags::GFP_KERNEL,
+/// )?;
+/// let result = KBox::into_non_null(result);
+/// // Count new allocation
+/// *FOO_ALLOC_COUNT.lock() += 1;
+/// // SAFETY:
+/// // - We just allocated the `Self`, thus it is valid and we own it.
+/// // - We can transfer this ownership to the `from_raw` method.
+/// Ok(unsafe { Owned::from_raw(result) })
+/// }
+/// }
+///
+/// impl Ownable for Foo {
+/// unsafe fn release(&mut self) {
+/// // SAFETY: The [`KBox<Self>`] is still alive. We can pass ownership to the [`KBox`], as
+/// // by requirement on calling this function.
+/// drop(unsafe { KBox::from_raw(self) });
+/// // Count released allocation
+/// *FOO_ALLOC_COUNT.lock() -= 1;
+/// }
+/// }
+///
+/// {
+/// let foo = Foo::new()?;
+/// assert!(*FOO_ALLOC_COUNT.lock() == 1);
+/// }
+/// // `foo` is out of scope now, so we expect no live allocations.
+/// assert!(*FOO_ALLOC_COUNT.lock() == 0);
+/// # Ok::<(), Error>(())
+/// ```
+pub trait Ownable {
+ /// Tear down this `Ownable`.
+ ///
+ /// Implementers of `Ownable` can use this function to clean up the use of `Self`. This can
+ /// include freeing the underlying object.
+ ///
+ /// # Safety
+ ///
+ /// Callers must ensure that the caller has exclusive ownership of `T`, and this ownership can
+ /// be transferred to the `release` method.
+ unsafe fn release(&mut self);
+}
+
+/// A mutable reference to an owned `T`.
+///
+/// The [`Ownable`] is automatically freed or released when an instance of [`Owned`] is
+/// dropped.
+///
+/// # Invariants
+///
+/// - Until `T::release` is called, this `Owned<T>` exclusively owns the underlying `T`.
+/// - The `T` value is pinned.
+pub struct Owned<T: Ownable> {
+ ptr: NonNull<T>,
+}
+
+impl<T: Ownable> Owned<T> {
+ /// Creates a new instance of [`Owned`].
+ ///
+ /// This function takes over ownership of the underlying object.
+ ///
+ /// # Safety
+ ///
+ /// Callers must ensure that:
+ /// - `ptr` points to a valid instance of `T`.
+ /// - Until `T::release` is called, the returned `Owned<T>` exclusively owns the underlying `T`.
+ #[inline]
+ pub unsafe fn from_raw(ptr: NonNull<T>) -> Self {
+ // INVARIANT: By function safety requirement we satisfy the first invariant of `Self`.
+ // We treat `T` as pinned from now on.
+ Self { ptr }
+ }
+
+ /// Consumes the [`Owned`], returning a raw pointer.
+ ///
+ /// This function does not drop the underlying `T`. When this function returns, ownership of the
+ /// underlying `T` is with the caller.
+ #[inline]
+ pub fn into_raw(me: Self) -> NonNull<T> {
+ ManuallyDrop::new(me).ptr
+ }
+
+ /// Get a pinned mutable reference to the data owned by this `Owned<T>`.
+ #[inline]
+ pub fn as_pin_mut(&mut self) -> Pin<&mut T> {
+ // SAFETY: The type invariants guarantee that the object is valid, and that we can safely
+ // return a mutable reference to it.
+ let unpinned = unsafe { self.ptr.as_mut() };
+
+ // SAFETY: By type invariant `T` is pinned.
+ unsafe { Pin::new_unchecked(unpinned) }
+ }
+}
+
+// SAFETY: It is safe to send an [`Owned<T>`] to another thread when the underlying `T` is [`Send`],
+// because of the ownership invariant. Sending an [`Owned<T>`] is equivalent to sending the `T`.
+unsafe impl<T: Ownable + Send> Send for Owned<T> {}
+
+// SAFETY: It is safe to send [`&Owned<T>`] to another thread when the underlying `T` is [`Sync`],
+// because of the ownership invariant. Sending an [`&Owned<T>`] is equivalent to sending the `&T`.
+unsafe impl<T: Ownable + Sync> Sync for Owned<T> {}
+
+impl<T: Ownable> Deref for Owned<T> {
+ type Target = T;
+
+ #[inline]
+ fn deref(&self) -> &Self::Target {
+ // SAFETY: The type invariants guarantee that the object is valid.
+ unsafe { self.ptr.as_ref() }
+ }
+}
+
+impl<T: Ownable + Unpin> DerefMut for Owned<T> {
+ #[inline]
+ fn deref_mut(&mut self) -> &mut Self::Target {
+ // SAFETY: The type invariants guarantee that the object is valid, and that we can safely
+ // return a mutable reference to it.
+ unsafe { self.ptr.as_mut() }
+ }
+}
+
+impl<T: Ownable> Drop for Owned<T> {
+ #[inline]
+ fn drop(&mut self) {
+ // SAFETY: By existence of `&mut self` we exclusively own `self` and the underlying `T`. As
+ // we are dropping `self`, we can transfer ownership of the `T` to the `release` method.
+ unsafe { T::release(self.ptr.as_mut()) };
+ }
+}
diff --git a/rust/kernel/sync/aref.rs b/rust/kernel/sync/aref.rs
index 9989f56d0605..4ee5fac0e0b6 100644
--- a/rust/kernel/sync/aref.rs
+++ b/rust/kernel/sync/aref.rs
@@ -29,6 +29,11 @@
/// Rust code, the recommendation is to use [`Arc`](crate::sync::Arc) to create reference-counted
/// instances of a type.
///
+/// Note: Implementing this trait allows types to be wrapped in an [`ARef<Self>`]. It requires an
+/// internal reference count and provides only shared references. If unique references are required
+/// [`Ownable`](crate::types::Ownable) should be implemented which allows types to be wrapped in an
+/// [`Owned<Self>`](crate::types::Owned).
+///
/// # Safety
///
/// Implementers must ensure that increments to the reference count keep the object alive in memory
diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
index 4329d3c2c2e5..4aec7b699269 100644
--- a/rust/kernel/types.rs
+++ b/rust/kernel/types.rs
@@ -11,6 +11,17 @@
};
use pin_init::{PinInit, Wrapper, Zeroable};
+pub use crate::{
+ owned::{
+ Ownable,
+ Owned, //
+ },
+ sync::aref::{
+ ARef,
+ AlwaysRefCounted, //
+ }, //
+};
+
/// Used to transfer ownership to and from foreign (non-Rust) languages.
///
/// Ownership is transferred from Rust to a foreign language by calling [`Self::into_foreign`] and
--
2.51.2
^ permalink raw reply related
* [PATCH] keys: prevent slab cache merging for key_jar
From: Mohammed EL Kadiri @ 2026-06-04 12:50 UTC (permalink / raw)
To: David Howells, Jarkko Sakkinen
Cc: Paul Moore, James Morris, Serge E . Hallyn, Kees Cook,
Vlastimil Babka, keyrings, linux-security-module, linux-hardening,
linux-kernel, Mohammed EL Kadiri
The key_jar slab cache holds struct key objects containing cryptographic
keys, authentication tokens, and keyring linkage. This cache currently
lacks merge prevention, allowing the SLUB allocator to merge it with
other similarly-sized caches.
On a default Ubuntu 6.17.0-23-generic system, key_jar has 5 aliases,
meaning 5 unrelated object types share its slab pages. struct key is
224 bytes, placed in 256-byte slabs alongside biovec-16, maple_node,
ip6_dst_cache, task_delay_info, and kmalloc-256 users.
Cross-cache heap exploitation is a well-documented attack class
(CVE-2022-29582, CVE-2022-2588, CVE-2021-22555) where slab cache
merging enables type confusion between unrelated kernel objects. A
use-after-free in any subsystem sharing slab pages with key_jar could
allow an attacker to reclaim a freed slot as a struct key, or corrupt
an existing key through a dangling pointer to a different type.
Add SLAB_NO_MERGE to ensure key_jar receives dedicated slab pages,
eliminating cross-cache attacks targeting struct key. The memory
overhead is minimal: with 32 objects per slab page and typical key
usage bounded by system keyring size, the cost of dedicated pages is
negligible. There is zero performance impact on the allocation hot
path.
This follows the precedent set by skbuff_head_cache (net/core/skbuff.c)
which uses SLAB_NO_MERGE for similar isolation requirements.
Signed-off-by: Mohammed EL Kadiri <med08elkadiri@gmail.com>
---
security/keys/key.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/security/keys/key.c b/security/keys/key.c
index 3bbdde778631..592b65cf8539 100644
--- a/security/keys/key.c
+++ b/security/keys/key.c
@@ -1275,7 +1275,7 @@ void __init key_init(void)
{
/* allocate a slab in which we can store keys */
key_jar = kmem_cache_create("key_jar", sizeof(struct key),
- 0, SLAB_HWCACHE_ALIGN|SLAB_PANIC, NULL);
+ 0, SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_NO_MERGE, NULL);
/* add the special key types */
list_add_tail(&key_type_keyring.link, &key_types_list);
--
2.43.0
^ permalink raw reply related
* Re: [PATCH v4 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass on the SIGIO path
From: Bryam Vargas @ 2026-06-04 10:27 UTC (permalink / raw)
To: Günther Noack
Cc: Mickaël Salaün, Günther Noack, Justin Suess,
Christian Brauner, Paul Moore, James Morris, Serge E . Hallyn,
linux-security-module, stable, linux-kernel
In-Reply-To: <20260604.f1cb6ce9cd6b@gnoack.org>
Hi Günther,
> I believe the result after this patch is:
> - No threads receive the SIGIO at all.
>
> This is because we have been setting T2.2's Landlock domain as the
> "sending domain" for the hook_file_sigiotask(), and that hook does on
> its own not do the "same_thread_group()" check [...]
Confirmed -- I traced the delivery path and your analysis holds.
For a PGID owner the signal is anchored per process on its thread-group
leader: a task is attached to pid->tasks[PIDTYPE_PGID] only in the
thread_group_leader() branch of copy_process(), so send_sigio()'s
do_each_pid_task(pid, PIDTYPE_PGID, p) walk visits exactly T2.1 for P2,
never the non-leader T2.2. hook_file_send_sigiotask() then runs
domain_is_scoped(recorded T2.2 domain, T2.1's live domain, SIGNAL) and,
having no same_thread_group() exemption of its own (unlike
hook_task_kill()), denies it -- even though T2.1 and T2.2 share P2's
signal_struct and 18eb75f3af40 mandates that same-process delivery always
be allowed. T2.1 is P2's only entry on the PGID list, so P2 receives
nothing. You are right.
One thing worth putting on the record: this over-block is not introduced
by the patch. In unpatched control_current_fowner() the PGID case already
resolves through pid_task(fown->pid, PIDTYPE_PGID), which returns an
arbitrary hlist head -- one representative leader. Whenever that head is
outside the caller's thread group, the domain is already recorded today and
the same delivery-time denial of the registrant's own leader already fires.
The patch only makes domain recording for PGID unconditional, i.e. it turns
that order-dependent behaviour into a deterministic one while closing the
order-dependent bypass. So the corner you describe is a pre-existing gap in
the delivery hook, not a regression in v4.
That points at the real root cause: same_thread_group is a *per-recipient*
property, but control_current_fowner() approximates it once, at F_SETOWN
time, against a single pid_task() representative. hook_task_kill() gets
this right because it evaluates same_thread_group(p, current) live, per
actual recipient. hook_file_send_sigiotask() is the SIGIO analogue but
delegates the whole thread-group decision to that one registration-time
check, which a PGID delivery set simply cannot be captured by.
So the fully-correct fix is to move the same-process exemption to delivery
time, keyed to the *registrant* rather than to current (at SIGIO time
current is the fd writer, not the task that armed F_SETOWN). Concretely:
when hook_file_set_fowner() records the domain, also pin
get_pid(task_tgid(current)) in struct landlock_file_security; in
hook_file_send_sigiotask(), before domain_is_scoped(), return 0 when
task_tgid(tsk) == that recorded pid. PGID owners still record the domain
(so P1 stays blocked -- the bypass fix), but the registrant's own process,
including T2.1, is always allowed -- restoring 18eb75f3af40 exactly. The
new pid is taken/put in lockstep with fown_subject.domain under the same
file->f_owner->lock and freed in hook_file_free_security(); the equality
test follows neither pid, so there is no extra RCU surface. Sketch:
/* struct landlock_file_security */
struct pid *fown_tg; /* registrant's thread group; NULL if no domain */
/* hook_file_set_fowner(), where fown_subject is recorded */
fown_tg = get_pid(task_tgid(current));
...
put_pid(landlock_file(file)->fown_tg); /* release previous */
landlock_file(file)->fown_tg = fown_tg;
/* hook_file_free_security() */
put_pid(landlock_file(file)->fown_tg);
/* hook_file_send_sigiotask(), after the !subject->domain quick return */
if (task_tgid(tsk) == landlock_file(fown->file)->fown_tg)
return 0; /* same process as the registrant: always allowed */
I do not see a correct fix that avoids recording the registrant's identity:
the registrant task is deliberately discarded after set_fowner (only its
domain is kept), and exempting on a shared *domain* instead would be
insecure -- sibling threads can hold different domains, and a different
process could share one.
> To be clear, the patch is still obviously an improvement [...] it just
> seems to block it slightly too broadly in this corner scenario?
> [...] Mickaël, maybe you have some thoughts on the tradeoff?
Agreed on both counts. Mickaël -- two ways to land this:
(a) keep v4 as is. It closes the bypass; the residual same-process
over-block is pre-existing, deterministic only under the stacked
conditions Günther listed (already-multithreaded enforce, no TSYNC,
SIGIO to a PGID that includes self, registered from a non-leader
thread in a per-thread signal-scoped domain), and arguably tolerable.
(b) v5 = v4 + the delivery-time exemption above. Strictly more correct:
it also closes the pre-existing delivery-hook gap and restores
18eb75f3af40's same-process invariant, at the cost of one struct pid*
in landlock_file_security.
I lean (b) -- it fixes the actual root cause rather than the one reachable
instance -- and I am happy to spin it (with an added selftest covering the
PGID-includes-self / non-leader-registrant case, A/B verified) or to hold at
v4 if you would rather keep the change minimal. Your call on whether the
corner warrants the extra state.
> P.S: [...] new patchset versions are posted at the top (no Reply-To
> header in the cover letter) [...]
Will do -- v5 (whichever option) goes out as a fresh top-level thread, no
In-Reply-To/Reply-To pointing back at this review.
Bryam
^ permalink raw reply
* Re: [PATCH v4 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass on the SIGIO path
From: Günther Noack @ 2026-06-04 8:10 UTC (permalink / raw)
To: Bryam Vargas
Cc: Mickaël Salaün, Günther Noack, Justin Suess,
Christian Brauner, Paul Moore, James Morris, Serge E . Hallyn,
linux-security-module, stable, linux-kernel
In-Reply-To: <20260602172741.18760-2-hexlabsecurity@proton.me>
Hello!
Thanks for the updated patch set!
On Tue, Jun 02, 2026 at 05:27:56PM +0000, Bryam Vargas wrote:
> LANDLOCK_SCOPE_SIGNAL must prevent a sandboxed process from signaling
> processes outside its Landlock domain. It can be bypassed through the
> asynchronous SIGIO delivery path.
>
> A sandboxed process that owns any file or socket can arm it with
> fcntl(F_SETOWN, fd, -pgid), fcntl(F_SETSIG, fd, SIGKILL) and O_ASYNC, so
> that an I/O event makes the kernel deliver the chosen signal to the whole
> process group. As the head of its own process group -- the default right
> after fork() -- that group also holds the non-sandboxed process that
> launched it, e.g. a supervisor or a security monitor. The sandbox can
> thus kill or repeatedly signal exactly the processes SCOPE_SIGNAL is meant
> to protect from it.
>
> The scope is enforced in hook_file_send_sigiotask() against the Landlock
> domain recorded at F_SETOWN time, not the live domain of the sender.
> control_current_fowner() decides whether to record that domain and skips
> recording it when the fowner target is in the caller's thread group --
> safe only when the target is a single process sharing the caller's
> credentials (PIDTYPE_PID, PIDTYPE_TGID). For a process group
> (PIDTYPE_PGID) the target resolves to the caller itself when it is the
> group head, recording is skipped, and hook_file_send_sigiotask() then lets
> the signal fan out to the whole group unchecked.
>
> Skip the recording only for the single-process target types, so the scope
> is enforced against every group member at delivery time. The direct
> kill() path (hook_task_kill) already evaluates the live domain and is
> unaffected.
Consider the following scenario:
- Processes P1 and P2 are in the same process group
- Threads T2.1 and T2.2 are part of P2.
- T2.1 is the thread group leader of P2.
- T2.2 is in a signal-scoped Landlock domain
- T2.2 registers the SIGIO for the entire PGID
- Someone writes to the FD, triggering the SIGIO mechanism
What I would expect in this scenario is:
- T2.1 receives the SIGIO because it is the thread group leader for
P2. (SIGIO with PGID only sends to one thread per process)
- It is OK for it to receive the signal because signals between
sibling threads should be permitted.
- No other threads receive SIGIO.
I believe the result after this patch is:
- No threads receive the SIGIO at all.
This is because we have been setting T2.2's Landlock domain as the
"sending domain" for the hook_file_sigiotask(), and that hook does on
its own not do the "same_thread_group()" check, and the thread group
leader T2.1 is outside of the T2.2's Landlock domain.
To be clear, the patch is still obviously an improvement, given that
it fixes a bypass for the signaling policy; it just seems to block it
slightly too broadly in this corner scenario?
The scenario does not happen *much* in practice, because SIGIO is not
used much, and starting with 7.0, multithreaded processes should
ideally use TSYNC and have their threads all in the exact same
Landlock domain. (Before TSYNC, this only affected the case where a
process was already(!) multithreaded at the time of Landlock
enforcement.)
I like the simplicity of this fix, but I'm afraid it does not do 100%
the correct thing. (I have not tried it out though and I'm happy to
stand corrected if my analysis is wrong.)
The fix would be for a very fringe scenario only, where multiple
conditions come together:
- An already multithreaded process enforcing a Landlock policy
- Not using the TSYNC flag for it (since Linux 7.0)
- Using SIGIO
- Using SIGIO with signaling to a full PGID, including the current process
- SIGIO registration happens from a non-thread-leader thread
- That thread is in a signal-scoped Landlock domain
Mickaël, maybe you have some thoughts on the tradeoff?
>
> Fixes: 18eb75f3af40 ("landlock: Always allow signals between threads of the same process")
> Cc: stable@vger.kernel.org
> Tested-by: Justin Suess <utilityemal77@gmail.com>
> Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
> ---
> security/landlock/fs.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/security/landlock/fs.c b/security/landlock/fs.c
> index c1ecfe239032..2ebad70a956d 100644
> --- a/security/landlock/fs.c
> +++ b/security/landlock/fs.c
> @@ -1909,6 +1909,15 @@ static bool control_current_fowner(struct fown_struct *const fown)
> if (!p)
> return true;
>
> + /*
> + * A process-group fowner fans the signal out to every member at
> + * delivery time, so record the domain for any non single-process
> + * target -- even when it resolves to current as the group head --
> + * and let hook_file_send_sigiotask() check the live scope.
> + */
> + if (fown->pid_type != PIDTYPE_PID && fown->pid_type != PIDTYPE_TGID)
> + return true;
> +
> return !same_thread_group(p, current);
> }
>
> --
> 2.43.0
Thanks,
–Günther
P.S: The threaded mail is now in the right format. Remaining nit
though: By convention, new patchset versions are posted at the
top (no Reply-To header in the cover letter), and this is what
many maintainers filter for - it is easier to get maintainers
attention when sticking to that convention.
^ permalink raw reply
* Re: -next status as at v7.1-rc6
From: Linus Torvalds @ 2026-06-04 0:31 UTC (permalink / raw)
To: Paul Moore
Cc: linux-security-module, Mark Brown, Blaise Boscaccy,
Alexei Starovoitov, linux-next, linux-kernel
In-Reply-To: <CAHC9VhRX1=tx4+arzuH76kirYFFJozexa5L=41Bb2QbgWZXOkQ@mail.gmail.com>
On Wed, 3 Jun 2026 at 17:04, Paul Moore <paul@paul-moore.com> wrote:
>
> It's worth mentioning that resolving the merge issue was relatively
> straightforward and we had a tested patch ready in a few hours
This is not the reason I'm not going to pull it - the merge issue was
just the reminder I got about an earlier email that I had dropped on
the floor.
No, the reason I won't pull it is that the main developer I pull bpf
code NAK'ed it.
Now, I will cdertainly sometimes override maintainers, so it's not
like a NAK is always some final thing.
I don't _like_ overriding developers, but I'll do it when I feel it is
necessary to make forward progress.
But I also have to feel people have been unnecessarily difficult, and
I have been extensively informed about the decision and I feel like I
can make a reasonable judgement on it.
So it happens, but it happens with my explicit understanding.
My tree is *not* some kind of "we are bypassing developers by sending
a pull request directly to Linus" tree.
NEVER is that the way things get done.
[ Yes, that too has happened, and I have done that unwittingly because
I didn't realize what was going on ]
So I will not pull this tree. End of story.
The way to get me to override developers is to make me aware of the
conflict and convince me that yes, something needs overriding - but it
is typically not very easy to do with active developers.
And honestly, I also have two+ decades of history of "LSM people
cannot agree on a single thing".
That is _literally_ why the LSM layer exists in the first place.
So when LSM people then disagree with _other_ developers, quite
frankly my immediate and visceral reaction then is "oh, these people
who have decades of history of not being able to even agree amongst
themselves are now disagreeing with outsiders too".
Put another way: LSM people have a higher barrier to convince me that
I should take their disagreements seriously.
And no, I'm afraid that may not be entirely fair. But "history of
being disagreeable" is a thing.
Linus
^ permalink raw reply
* Re: [PATCH v6 11/12] ima: Support staging and deleting N measurements records
From: steven chen @ 2026-06-04 0:25 UTC (permalink / raw)
To: Roberto Sassu, corbet, skhan, zohar, dmitry.kasatkin,
eric.snowberg, paul, jmorris, serge
Cc: linux-doc, linux-kernel, linux-integrity, linux-security-module,
gregorylumen, nramas, Roberto Sassu, steven chen
In-Reply-To: <20260602111401.1706052-12-roberto.sassu@huaweicloud.com>
On 6/2/2026 4:14 AM, Roberto Sassu wrote:
> From: Roberto Sassu <roberto.sassu@huawei.com>
>
> Add support for sending a value N between 1 and ULONG_MAX to the IMA
> original measurement interface. This value represents the number of
> measurements that should be deleted from the current measurements list. In
> this case, measurements are staged in an internal non-user visible list,
> and immediately deleted.
>
> This staging method allows the remote attestation agents to easily separate
> the measurements that were verified (staged and deleted) from those that
> weren't due to the race between taking a TPM quote and reading the
> measurements list.
>
> In order to minimize the locking time of ima_extend_list_mutex, deleting
> N records is realized by doing a lockless walk in the current measurements
> list to determine the N-th entry to cut, to cut the current measurements
> list under the lock, and by deleting the excess records after releasing the
> lock.
>
> Flushing the hash table is not supported for N records, since it would
> require removing the N records one by one from the hash table under the
> ima_extend_list_mutex lock, which would increase the locking time.
>
> Link: https://github.com/linux-integrity/linux/issues/1
> Co-developed-by: Steven Chen <chenste@linux.microsoft.com>
Signed-off-by: Steven Chen <chenste@linux.microsoft.com>
> Co-developed-by: Roberto Sassu <roberto.sassu@huawei.com>
> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
> ---
> security/integrity/ima/Kconfig | 3 ++
> security/integrity/ima/ima.h | 1 +
> security/integrity/ima/ima_fs.c | 32 +++++++++++++--
> security/integrity/ima/ima_queue.c | 63 ++++++++++++++++++++++++++++++
> 4 files changed, 96 insertions(+), 3 deletions(-)
>
> diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
> index 02436670f746..f4d25e045808 100644
> --- a/security/integrity/ima/Kconfig
> +++ b/security/integrity/ima/Kconfig
> @@ -341,6 +341,9 @@ config IMA_STAGING
> It allows user space to stage the measurements list for deletion and
> to delete the staged measurements after confirmation.
>
> + Or, alternatively, it allows user space to specify N measurements
> + records to stage internally, so that they can be immediately deleted.
> +
> On kexec, staging is aborted and any staged measurement records are
> copied to the secondary kernel.
>
> diff --git a/security/integrity/ima/ima.h b/security/integrity/ima/ima.h
> index d2e740c8ff75..7a1b2d6a8b59 100644
> --- a/security/integrity/ima/ima.h
> +++ b/security/integrity/ima/ima.h
> @@ -320,6 +320,7 @@ struct ima_template_desc *lookup_template_desc(const char *name);
> bool ima_template_has_modsig(const struct ima_template_desc *ima_template);
> int ima_queue_stage(void);
> int ima_queue_staged_delete_all(void);
> +int ima_queue_delete_partial(unsigned long req_value);
> int ima_restore_measurement_entry(struct ima_template_entry *entry);
> int ima_restore_measurement_list(loff_t bufsize, void *buf);
> int ima_measurements_show(struct seq_file *m, void *v);
> diff --git a/security/integrity/ima/ima_fs.c b/security/integrity/ima/ima_fs.c
> index 96d7503a605b..174a94740da1 100644
> --- a/security/integrity/ima/ima_fs.c
> +++ b/security/integrity/ima/ima_fs.c
> @@ -28,6 +28,7 @@
> * Requests:
> * 'A\n': stage the entire measurements list
> * 'D\n': delete all staged measurements
> + * '[1, ULONG_MAX]\n' delete N measurements records
> */
> #define STAGED_REQ_LENGTH 21
>
> @@ -343,6 +344,7 @@ static ssize_t _ima_measurements_write(struct file *file,
> loff_t *ppos, bool staged_interface)
> {
> char req[STAGED_REQ_LENGTH];
> + unsigned long req_value;
> int ret;
>
> if (datalen < 2 || datalen > STAGED_REQ_LENGTH)
> @@ -370,7 +372,24 @@ static ssize_t _ima_measurements_write(struct file *file,
> ret = ima_queue_staged_delete_all();
> break;
> default:
> - ret = -EINVAL;
> + if (staged_interface)
> + return -EINVAL;
> +
> + if (ima_flush_htable) {
> + pr_debug("Deleting staged N measurements not supported when flushing the hash table is requested\n");
> + return -EINVAL;
> + }
> +
> + ret = kstrtoul(req, 10, &req_value);
> + if (ret < 0)
> + return ret;
> +
> + if (req_value == 0) {
> + pr_debug("Must delete at least one entry\n");
> + return -EINVAL;
> + }
> +
> + ret = ima_queue_delete_partial(req_value);
> }
>
> if (ret < 0)
> @@ -379,6 +398,12 @@ static ssize_t _ima_measurements_write(struct file *file,
> return datalen;
> }
>
> +static ssize_t ima_measurements_write(struct file *file, const char __user *buf,
> + size_t datalen, loff_t *ppos)
> +{
> + return _ima_measurements_write(file, buf, datalen, ppos, false);
> +}
> +
> static ssize_t ima_measurements_staged_write(struct file *file,
> const char __user *buf,
> size_t datalen, loff_t *ppos)
> @@ -389,6 +414,7 @@ static ssize_t ima_measurements_staged_write(struct file *file,
> static const struct file_operations ima_measurements_ops = {
> .open = ima_measurements_open,
> .read = seq_read,
> + .write = ima_measurements_write,
> .llseek = seq_lseek,
> .release = ima_measurements_release,
> };
> @@ -470,6 +496,7 @@ static int ima_ascii_measurements_open(struct inode *inode, struct file *file)
> static const struct file_operations ima_ascii_measurements_ops = {
> .open = ima_ascii_measurements_open,
> .read = seq_read,
> + .write = ima_measurements_write,
> .llseek = seq_lseek,
> .release = ima_measurements_release,
> };
> @@ -603,14 +630,13 @@ static int __init create_securityfs_measurement_lists(bool staging)
> {
> const struct file_operations *ascii_ops = &ima_ascii_measurements_ops;
> const struct file_operations *binary_ops = &ima_measurements_ops;
> - umode_t permissions = (S_IRUSR | S_IRGRP);
> + umode_t permissions = (S_IRUSR | S_IRGRP | S_IWUSR | S_IWGRP);
> const char *file_suffix = "";
> int count = NR_BANKS(ima_tpm_chip);
>
> if (staging) {
> ascii_ops = &ima_ascii_measurements_staged_ops;
> binary_ops = &ima_measurements_staged_ops;
> - permissions |= (S_IWUSR | S_IWGRP);
> file_suffix = "_staged";
> }
>
> diff --git a/security/integrity/ima/ima_queue.c b/security/integrity/ima/ima_queue.c
> index af0502f27d57..718991ba8bcd 100644
> --- a/security/integrity/ima/ima_queue.c
> +++ b/security/integrity/ima/ima_queue.c
> @@ -405,6 +405,69 @@ int ima_queue_staged_delete_all(void)
> return 0;
> }
>
> +/**
> + * ima_queue_delete_partial - Delete current measurements
> + * @req_value: Number of measurements to delete
> + *
> + * Delete the requested number of measurements from the current measurements
> + * list, and update the number of records and the binary run-time size
> + * accordingly.
> + *
> + * Refuse to delete current measurements if measurement is suspended, so that
> + * dump can be done in a lockless way and user space is notified about current
> + * measurements being carried over to the secondary kernel, so that it does not
> + * save them twice.
> + *
> + * Return: Zero on success, a negative value otherwise.
> + */
> +int ima_queue_delete_partial(unsigned long req_value)
> +{
> + unsigned long req_value_copy = req_value;
> + unsigned long size_to_remove = 0, num_to_remove = 0;
> + LIST_HEAD(ima_measurements_trim);
> + struct ima_queue_entry *qe;
> + int ret = 0;
> +
> + /*
> + * list_for_each_entry_rcu() without rcu_read_lock() is fine because
> + * only list append can happen concurrently. No list replace due to the
> + * staging/delete writers mutual exclusion.
> + */
> + list_for_each_entry_rcu(qe, &ima_measurements, later, true) {
> + size_to_remove += get_binary_runtime_size(qe->entry);
> + num_to_remove++;
> +
> + if (--req_value_copy == 0)
> + break;
> + }
> +
> + /* Not enough records to delete. */
> + if (req_value_copy > 0)
> + return -ENOENT;
> +
> + mutex_lock(&ima_extend_list_mutex);
> + if (ima_measurements_suspended) {
> + mutex_unlock(&ima_extend_list_mutex);
> + return -ESTALE;
> + }
> +
> + /*
> + * qe remains valid because ima_fs.c enforces single-writer exclusion.
> + */
> + __list_cut_position(&ima_measurements_trim, &ima_measurements,
> + &qe->later);
> +
> + atomic_long_sub(num_to_remove, &ima_num_records[BINARY]);
> +
> + if (IS_ENABLED(CONFIG_IMA_KEXEC))
> + binary_runtime_size[BINARY] -= size_to_remove;
> +
> + mutex_unlock(&ima_extend_list_mutex);
> +
> + ima_queue_delete(&ima_measurements_trim, false);
> + return ret;
> +}
> +
> /**
> * ima_queue_delete - Delete measurements
> * @head: List head measurements are deleted from
^ permalink raw reply
* Re: -next status as at v7.1-rc6
From: Paul Moore @ 2026-06-04 0:04 UTC (permalink / raw)
To: Linus Torvalds
Cc: linux-security-module, Mark Brown, Blaise Boscaccy,
Alexei Starovoitov, linux-next, linux-kernel
In-Reply-To: <CAHk-=wj6CtZS9hbwFjQcoNkPwQLoyKmk8czaBF6=bBOCYuXEUQ@mail.gmail.com>
On Tue, Jun 2, 2026 at 4:20 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Mon, 1 Jun 2026 at 11:22, Mark Brown <broonie@kernel.org> wrote:
> >
> > We're reaching the peak of conflicts for this release cycle but
> > fundamentally everything is pretty quiet at the minute.
>
> So the "Hornet" LSM thing needs to bve removed from linux-next.
>
> I'm not ever going to pull it - it has been NAK'ed by the developers
> of the actual BPF code, and now it apparently causes merge problems
> too.
It's worth mentioning that resolving the merge issue was relatively
straightforward and we had a tested patch ready in a few hours; I was
in the process of merging it when your mail hit my inbox. We see
cross-subsystem issues like this every couple of months. The good news
is that the linux-next process works well and we are usually able to
resolve the problems quickly.
I'll touch on the NACK below.
> The LSM people need to realize that they cannot override the people
> who actually write the real code.
>
> The security layer is not boss in the relationship. It's the
> subservient party. Security is important, but LSM's are not.
[NOTE: adding the LSM list to the CC line since I just realized you
didn't include it]
At this point I believe we are all aware of your dislike of LSMs, but
I once again feel compelled to speak out against the disrespect you
have shown towards the LSM developers and users. Most (all?) of the
major Linux distributions rely on at least one LSM to meet the needs
of their users. We've seen the importance of LSMs to Android, and the
real impact they have had on defending against vulnerabilities. We
also know, oddly enough in this particular case, the importance of the
LSM framework to the BPF ecosystem; the BPF LSM attach points are
critical for many BPF use cases.
LSMs may not be important to you, but they are very important for a
very large number of Linux users.
> When the maintainer of a codebase NAK's a security model, and explains
> that the code has different needs and different security models, the
> LSM people don't just ignore that and go do their thing despite the
> NAK.
I think there may be some confusion about how Hornet works and
interacts with the existing BPF subsystem. Hornet does not touch any
code inside the BPF subsystem, and outside of some minor PKCS7 patches
to enable some things from the PKCS specs, it doesn't touch any code
outside of security/. Hornet's design criteria required it to work
within the existing LSM hooks and remain compatible with the existing
BPF signature verification code. Hornet follows the traditional LSM
design pattern: it works within the LSM framework, building upon the
security functionality in other subsystems to satisfy user security
needs that have otherwise been ignored.
A BPF light skeleton signed with the Hornet tools passes both the
existing BPF signature verification code and Hornet's verification.
Hornet allows signature verification using arbitrary keys/keyrings,
just like the existing BPF signature code, enabling support for
dynamically generated/signed BPF programs. In mixed environments,
Hornet can verify the loader portion of BPF light skeletons signed
with the existing BPF signing tools, distinguishing between existing
Alexei/KP signatures and a full Hornet signature. Hornet also
supports unsigned BPF programs when needed.
As an LSM, Hornet can be enabled/disabled at build time, kernel boot,
and at runtime (subject to the LSM providing enforcement). If a user
is required to run a specific kernel build, e.g. an "enterprise" Linux
distro support situation, the admin has multiple ways to disable
Hornet if it is not desired. Any Hornet signed BPF light skeletons
will load without issue on a system without Hornet and Hornet's
presence does not block the existing BPF signature verification
mechanisms.
The BPF developers will quickly point out, if they haven't mentioned
it to you already, that Hornet calls into a 'bpf_map_ops' function.
This is true, Hornet calls the map_get_hash() method to get the hash
of a BPF map so it can be verified. The BPF devs will argue this
presents a layering violation, but I would counter that several users
in the networking stack go much further than Hornet in their use of
'bpf_map_ops'; I find it difficult to see 'bpf_map_ops' as a private
API at this point. It is also important to mention that Hornet does
not manipulate or modify the BPF program or map state in any way
beyond storing some state in a Hornet specific LSM blob, similar to
what other LSMs have done.
We've done this work, and tried collaborating with the BPF devs for
over a year, because we believe the kernel should verify the integrity
of both the BPF light skeleton loader and the associated maps. While
the existing BPF signature verification scheme verifies the light
skeleton loader, it requires the loader to self-verify its associated
maps. Relying on the loader to self-verify maps is problematic
because it adds an additional burden on system builders and admins who
must now also manage and verify the signature verification in every
BPF light skeleton loader on the system (how does one know if a
specific light skeleton loader suffers from a verification bug? what
additional steps need to be added to the deployment of third-party,
binary only BPF light skeletons?). At the very least Hornet can
ensure the integrity of both the BPF light skeleton loader and the
signed maps without requiring prior analysis of the light skeleton
loader. That is a big win for admins who care about what code is
loaded into their kernel.
Hornet lives entirely within the LSM framework, adds no additional LSM
hooks, remains compatible with the existing BPF signature verification
code, can be disabled in multiple ways if required, provides the
verification flexibility needed to support the existing BPF ecosystem,
and helps satisfy the needs of real users. Why can't we support
Hornet alongside the existing BPF signature verification code and let
the users employ the mechanism that works best for them?
--
paul-moore.com
^ permalink raw reply
* Re: [PATCH v4 0/2] Delete task_euid()
From: Alice Ryhl @ 2026-06-03 17:05 UTC (permalink / raw)
To: Paul Moore
Cc: Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman, Shuah Khan,
Alex Shi, Yanteng Si, Dongliang Mu, Miguel Ojeda, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross, Danilo Krummrich, Jann Horn, linux-security-module,
linux-doc, linux-kernel, rust-for-linux
In-Reply-To: <CAHC9VhQyNzxJgdMkmEsOeAQ7Wt2L+eW6aNLjeoYmnCQLmcYRnw@mail.gmail.com>
On Wed, Jun 3, 2026 at 6:05 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Tue, Jun 2, 2026 at 2:15 AM Alice Ryhl <aliceryhl@google.com> wrote:
> > On Mon, Jun 01, 2026 at 07:13:37PM -0400, Paul Moore wrote:
> > > On Fri, May 29, 2026 at 5:33 AM Alice Ryhl <aliceryhl@google.com> wrote:
> > > >
> > > > The task_euid() method is a very weird method, and Binder was the only
> > > > user. As of commit 65b672152289 ("binder: use current_euid() for
> > > > transaction sender identity") Binder doesn't use task_euid() anymore,
> > > > so we can delete this method.
> > >
> > > Given the problems from last time, it seems like it might be prudent
> > > to let the commit have some time to "breathe" in a proper release, I'd
> > > suggest merging this not for the upcoming v7.2 merge window but
> > > instead waiting for v7.3.
> >
> > Sure, that makes sense. I'll resend after the merge window.
>
> No need to resend if there are no changes (see below), it's in
> patchwork and I'm tracking it so you're all set. I'll send another
> notice when I merge it.
>
> > > > My suggestion would be to merge this through the LSM tree.
> > >
> > > That's fine with me. I'd also suggest updating the commit description
> > > in patch 1/2 to indicate that binder is no longer using task_euid();
> > > it currently reads like it is still being used.
> >
> > I guess this occurred because when patch 1 was written, it really *was*
> > still being used.
>
> Yeah, I understand the world has changed since patch 1/2 was written,
> which is okay, we just need to update the commit description ... which
> should be a trivial task.
>
> > Perhaps we could pick up only patch 1 now since even
> > if we run into problems and Binder has to go back to using task_euid(),
> > clarifying the docs is still useful.
>
> I assumed that was one of the reasons for splitting the changes across
> two patches (reverting patch 2/2 leaves patch 1/2 intact).
> Regardless, we're at -rc6 and with patch 1/2 being purely a comment
> update I don't see an urgent rush on this, especially considering that
> if I did pick it up now, it would be for the v7.2 merge window and the
> binder/current_euid() change will ship in v7.1.
>
> Let's update the commit description - you've got a couple of weeks to
> do that - and then we'll merge everything once the v7.2 merge window
> closes.
Sounds good, thanks!
Alice
^ permalink raw reply
* Re: [PATCH v4 0/2] Delete task_euid()
From: Paul Moore @ 2026-06-03 16:04 UTC (permalink / raw)
To: Alice Ryhl
Cc: Serge Hallyn, Jonathan Corbet, Greg Kroah-Hartman, Shuah Khan,
Alex Shi, Yanteng Si, Dongliang Mu, Miguel Ojeda, Boqun Feng,
Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg,
Trevor Gross, Danilo Krummrich, Jann Horn, linux-security-module,
linux-doc, linux-kernel, rust-for-linux
In-Reply-To: <ah51DY5yfaNZejBd@google.com>
On Tue, Jun 2, 2026 at 2:15 AM Alice Ryhl <aliceryhl@google.com> wrote:
> On Mon, Jun 01, 2026 at 07:13:37PM -0400, Paul Moore wrote:
> > On Fri, May 29, 2026 at 5:33 AM Alice Ryhl <aliceryhl@google.com> wrote:
> > >
> > > The task_euid() method is a very weird method, and Binder was the only
> > > user. As of commit 65b672152289 ("binder: use current_euid() for
> > > transaction sender identity") Binder doesn't use task_euid() anymore,
> > > so we can delete this method.
> >
> > Given the problems from last time, it seems like it might be prudent
> > to let the commit have some time to "breathe" in a proper release, I'd
> > suggest merging this not for the upcoming v7.2 merge window but
> > instead waiting for v7.3.
>
> Sure, that makes sense. I'll resend after the merge window.
No need to resend if there are no changes (see below), it's in
patchwork and I'm tracking it so you're all set. I'll send another
notice when I merge it.
> > > My suggestion would be to merge this through the LSM tree.
> >
> > That's fine with me. I'd also suggest updating the commit description
> > in patch 1/2 to indicate that binder is no longer using task_euid();
> > it currently reads like it is still being used.
>
> I guess this occurred because when patch 1 was written, it really *was*
> still being used.
Yeah, I understand the world has changed since patch 1/2 was written,
which is okay, we just need to update the commit description ... which
should be a trivial task.
> Perhaps we could pick up only patch 1 now since even
> if we run into problems and Binder has to go back to using task_euid(),
> clarifying the docs is still useful.
I assumed that was one of the reasons for splitting the changes across
two patches (reverting patch 2/2 leaves patch 1/2 intact).
Regardless, we're at -rc6 and with patch 1/2 being purely a comment
update I don't see an urgent rush on this, especially considering that
if I did pick it up now, it would be for the v7.2 merge window and the
binder/current_euid() change will ship in v7.1.
Let's update the commit description - you've got a couple of weeks to
do that - and then we'll merge everything once the v7.2 merge window
closes.
--
paul-moore.com
^ permalink raw reply
* Re: [PATCH v5 7/8] vfs: Replace security_sb_mount/security_move_mount with granular hooks
From: Song Liu @ 2026-06-03 15:42 UTC (permalink / raw)
To: brauner
Cc: paul, jmorris, serge, viro, jack, john.johansen,
stephen.smalley.work, omosnace, mic, gnoack, takedakn,
penguin-kernel, herton, kernel-team, linux-security-module,
linux-fsdevel, apparmor, selinux
In-Reply-To: <20260528182607.3150386-8-song@kernel.org>
Hi Christian,
On Thu, May 28, 2026 at 11:26 AM Song Liu <song@kernel.org> wrote:
>
> Replace the monolithic security_sb_mount() call in path_mount() and
> security_move_mount() in vfs_move_mount() with the new granular mount
> hooks:
>
> - do_loopback(): call security_mount_bind()
> - do_new_mount(): call security_mount_new()
> - do_remount(): call security_mount_remount()
> - do_reconfigure_mnt(): call security_mount_reconfigure()
> - do_move_mount_old(): call security_mount_move()
> - do_change_type(): call security_mount_change_type()
> - vfs_move_mount(): replace security_move_mount() with
> security_mount_move()
Does this version look good to you?
Thanks,
Song
^ permalink raw reply
* [PATCH v4 2/2] selftests/landlock: test SCOPE_SIGNAL on the SIGIO/fowner pgid path
From: Bryam Vargas @ 2026-06-02 17:28 UTC (permalink / raw)
To: Mickaël Salaün, Günther Noack
Cc: Justin Suess, Christian Brauner, Paul Moore, James Morris,
Serge E . Hallyn, linux-security-module, stable, linux-kernel
In-Reply-To: <20260602172741.18760-1-hexlabsecurity@proton.me>
Add a regression test for the LANDLOCK_SCOPE_SIGNAL bypass on the
asynchronous SIGIO delivery path. A sandboxed process that owns a file
via fcntl(F_SETOWN, -pgrp) while sitting at the head of its process
group's PID hlist (the default position after fork()) used to have its
Landlock domain recording skipped, letting the SIGIO fan-out reach
non-sandboxed members of the process group.
The test creates a dedicated process group, sandboxes the (hlist-head)
child with LANDLOCK_SCOPE_SIGNAL, arms F_SETSIG(SIGURG) / F_SETOWN(-pgrp)
/ O_ASYNC on a pipe and triggers the fan-out. The in-domain child must
receive the signal (proving the trigger fired); the non-sandboxed parent,
which is outside the child's domain, must not. Without the fix the parent
is signaled and the test fails.
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
.../selftests/landlock/scoped_signal_test.c | 97 +++++++++++++++++++
1 file changed, 97 insertions(+)
diff --git a/tools/testing/selftests/landlock/scoped_signal_test.c b/tools/testing/selftests/landlock/scoped_signal_test.c
index d8bf33417619..62d86a115775 100644
--- a/tools/testing/selftests/landlock/scoped_signal_test.c
+++ b/tools/testing/selftests/landlock/scoped_signal_test.c
@@ -559,4 +559,101 @@ TEST_F(fown, sigurg_socket)
_metadata->exit_code = KSFT_FAIL;
}
+/*
+ * Checks that LANDLOCK_SCOPE_SIGNAL is enforced on the asynchronous SIGIO
+ * delivery path (fcntl(F_SETOWN)) when the file owner is a process group.
+ *
+ * A sandboxed process sitting at the head of its process group's PID hlist
+ * (the default position right after fork()) used to escape the
+ * fcntl(F_SETOWN, -pgrp) domain recording: pid_task(pgrp, PIDTYPE_PGID)
+ * resolved to the process itself, so the same-thread-group exemption skipped
+ * recording its Landlock domain. At SIGIO time that domain was then unset
+ * and the signal fanned out to every group member, including non-sandboxed
+ * processes outside the domain.
+ */
+TEST(sigio_to_pgid_members)
+{
+ int trigger[2], sync_child[2];
+ char buf;
+ pid_t child;
+ int status, i;
+
+ drop_caps(_metadata);
+
+ /*
+ * Isolates the test in its own process group so the SIGIO fan-out
+ * stays bounded to this parent and the child forked below.
+ */
+ ASSERT_EQ(0, setpgid(0, 0));
+
+ /* The non-sandboxed parent is the protected (out-of-domain) target. */
+ ASSERT_EQ(0, setup_signal_handler(SIGURG));
+ signal_received = 0;
+
+ ASSERT_EQ(0, pipe2(trigger, O_CLOEXEC));
+ ASSERT_EQ(0, pipe2(sync_child, O_CLOEXEC));
+
+ child = fork();
+ ASSERT_LE(0, child);
+ if (child == 0) {
+ /*
+ * The child inherits the parent's new process group and, just
+ * attached with hlist_add_head_rcu(), is now the head of the
+ * pgid hlist: this is the case that used to skip the recording.
+ */
+ EXPECT_EQ(0, close(sync_child[0]));
+
+ /* In-domain positive control: the child must be signaled. */
+ ASSERT_EQ(0, setup_signal_handler(SIGURG));
+ signal_received = 0;
+
+ create_scoped_domain(_metadata, LANDLOCK_SCOPE_SIGNAL);
+
+ /* Owns the SIGIO source for the whole process group. */
+ ASSERT_EQ(0, fcntl(trigger[0], F_SETSIG, SIGURG));
+ ASSERT_EQ(0, fcntl(trigger[0], F_SETOWN, -getpgrp()));
+ ASSERT_EQ(0, fcntl(trigger[0], F_SETFL, O_ASYNC));
+
+ /* Fans SIGURG out to every member of the process group. */
+ ASSERT_EQ(1, write(trigger[1], ".", 1));
+
+ /*
+ * The sandboxed child is in its own domain and must always be
+ * signaled: this proves the SIGIO actually fired.
+ */
+ for (i = 0; i < 1000 && !signal_received; i++)
+ usleep(1000);
+ EXPECT_EQ(1, signal_received);
+
+ ASSERT_EQ(1, write(sync_child[1], ".", 1));
+ EXPECT_EQ(0, close(sync_child[1]));
+
+ _exit(_metadata->exit_code);
+ return;
+ }
+ EXPECT_EQ(0, close(sync_child[1]));
+ EXPECT_EQ(0, close(trigger[0]));
+ EXPECT_EQ(0, close(trigger[1]));
+
+ /* Waits for the child to generate the SIGIO. */
+ ASSERT_EQ(1, read(sync_child[0], &buf, 1));
+ EXPECT_EQ(0, close(sync_child[0]));
+
+ /* Lets a delivered-but-pending signal run our handler, if any. */
+ for (i = 0; i < 100 && !signal_received; i++)
+ usleep(1000);
+
+ /*
+ * SCOPE_SIGNAL must block the fan-out to this non-sandboxed parent,
+ * which is outside the child's Landlock domain. Before the fix the
+ * parent was signaled here.
+ */
+ EXPECT_EQ(0, signal_received);
+
+ ASSERT_EQ(child, waitpid(child, &status, 0));
+ if (WIFSIGNALED(status) || !WIFEXITED(status) ||
+ WEXITSTATUS(status) != EXIT_SUCCESS)
+ _metadata->exit_code = KSFT_FAIL;
+}
+
TEST_HARNESS_MAIN
--
2.43.0
^ permalink raw reply related
* [PATCH v4 1/2] landlock: fix LANDLOCK_SCOPE_SIGNAL bypass on the SIGIO path
From: Bryam Vargas @ 2026-06-02 17:27 UTC (permalink / raw)
To: Mickaël Salaün, Günther Noack
Cc: Justin Suess, Christian Brauner, Paul Moore, James Morris,
Serge E . Hallyn, linux-security-module, stable, linux-kernel
In-Reply-To: <20260602172741.18760-1-hexlabsecurity@proton.me>
LANDLOCK_SCOPE_SIGNAL must prevent a sandboxed process from signaling
processes outside its Landlock domain. It can be bypassed through the
asynchronous SIGIO delivery path.
A sandboxed process that owns any file or socket can arm it with
fcntl(F_SETOWN, fd, -pgid), fcntl(F_SETSIG, fd, SIGKILL) and O_ASYNC, so
that an I/O event makes the kernel deliver the chosen signal to the whole
process group. As the head of its own process group -- the default right
after fork() -- that group also holds the non-sandboxed process that
launched it, e.g. a supervisor or a security monitor. The sandbox can
thus kill or repeatedly signal exactly the processes SCOPE_SIGNAL is meant
to protect from it.
The scope is enforced in hook_file_send_sigiotask() against the Landlock
domain recorded at F_SETOWN time, not the live domain of the sender.
control_current_fowner() decides whether to record that domain and skips
recording it when the fowner target is in the caller's thread group --
safe only when the target is a single process sharing the caller's
credentials (PIDTYPE_PID, PIDTYPE_TGID). For a process group
(PIDTYPE_PGID) the target resolves to the caller itself when it is the
group head, recording is skipped, and hook_file_send_sigiotask() then lets
the signal fan out to the whole group unchecked.
Skip the recording only for the single-process target types, so the scope
is enforced against every group member at delivery time. The direct
kill() path (hook_task_kill) already evaluates the live domain and is
unaffected.
Fixes: 18eb75f3af40 ("landlock: Always allow signals between threads of the same process")
Cc: stable@vger.kernel.org
Tested-by: Justin Suess <utilityemal77@gmail.com>
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
security/landlock/fs.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/security/landlock/fs.c b/security/landlock/fs.c
index c1ecfe239032..2ebad70a956d 100644
--- a/security/landlock/fs.c
+++ b/security/landlock/fs.c
@@ -1909,6 +1909,15 @@ static bool control_current_fowner(struct fown_struct *const fown)
if (!p)
return true;
+ /*
+ * A process-group fowner fans the signal out to every member at
+ * delivery time, so record the domain for any non single-process
+ * target -- even when it resolves to current as the group head --
+ * and let hook_file_send_sigiotask() check the live scope.
+ */
+ if (fown->pid_type != PIDTYPE_PID && fown->pid_type != PIDTYPE_TGID)
+ return true;
+
return !same_thread_group(p, current);
}
--
2.43.0
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox