* [PATCH bpf-next v2 1/5] bpf: Verify signed loader metadata at load time
From: Daniel Borkmann @ 2026-06-24 14:02 UTC (permalink / raw)
To: ast
Cc: kpsingh, James.Bottomley, paul, bboscaccy, memxor, torvalds, bpf,
linux-security-module
In-Reply-To: <20260624140301.93421-1-daniel@iogearbox.net>
A signed gen_loader program carries the programs, maps and relocations it
installs in a metadata array map. The loader instructions are covered by
the PKCS#7 signature, but the metadata map is not: Today the loader
compares the map contents from within BPF against a hash baked into its
(signed) instructions, using the kernel-cached map hash. The kernel itself
never actually attests that the metadata the loader installs is the
metadata that was signed.
This split is the core of the long-standing objection to the BPF signing
scheme from the LSM / integrity side: the integrity check of a light
skeleton only completes once the loader program runs, that is, after the
security_bpf_prog_load() hook, so at admission time an LSM observes a
program whose payload has not yet been verified [0]. Auditing the chain
link is also not a purely cryptographic operation: whoever signs or reviews
an lskel has to disassemble the loader's preamble to convince themselves
that the embedded hash check is present and correct [1][2]. Two acceptable
fixes were identified in those threads: Complete the integrity check
before the admission hook fires, or add a second hook that collects the
verification result after the loader ran [3]. Let's implement the former,
without growing the UAPI.
A signed loader binds its metadata map(s) through the existing fd_array,
and an exclusive map is already bound to a program digest (excl_prog_hash).
So when a signature is present, collect the exclusive maps from fd_array
and append their frozen contents to the instructions before verification:
the signature now covers insns || metadata_0 || metadata_1 || [...] in the
fd_array order, and verification completes in bpf_check(), once the
fd_array maps are resolved into used_maps, before the LSM admission hook
and the rest of verification.
A program is either BPF_SIG_UNSIGNED or BPF_SIG_VERIFIED, with nothing in
between. While folding the fd_array maps, a non-exclusive map bound to
a signed program is rejected, so every map folded into the signature is
exclusive. A signed loader that fails to cover its metadata thus does not
load, and BPF_SIG_VERIFIED always means the instructions and every
exclusive map are authentic.
The maps must be frozen so the hashed bytes cannot change before the
loader runs; the map <-> program digest binding is enforced by the
verifier for every used map. Binding maps through fd_array_cnt makes the
verifier resolve and excl-check them (excl_prog_sha vs prog->digest)
before it would otherwise compute the digest, so compute prog->digest
up front in bpf_check(), over the unmodified instructions the
signature covers, for a load that folds metadata.
Unsigned programs are not affected. Note, signed loaders generated by
older libbpf/bpftool versions need to be regenerated; some of the recent
fixes we've had on the signed loader side require the latter already to
close gaps.
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/CAHC9VhSDkwGgPfrBUh7EgBKEJj_JjnY68c0YAmuuLT_i--GskQ@mail.gmail.com [0]
Link: https://lore.kernel.org/bpf/2f71d6c03698eb17d51f7247efde777627ee578a.camel@HansenPartnership.com [1]
Link: https://lore.kernel.org/lkml/ecf0521ed302db672672ebfbc670ecfba36a6e00.camel@HansenPartnership.com [2]
Link: https://lore.kernel.org/bpf/88703f00d5b7a779728451008626efa45e42db3d.camel@HansenPartnership.com [3]
---
include/linux/bpf_verifier.h | 1 +
kernel/bpf/syscall.c | 76 +---------------
kernel/bpf/verifier.c | 163 ++++++++++++++++++++++++++++++++++-
3 files changed, 165 insertions(+), 75 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 39a851e690ec..1431cf7c620d 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -939,6 +939,7 @@ struct bpf_verifier_env {
bool bypass_spec_v4;
bool seen_direct_write;
bool seen_exception;
+ bool check_signature;
struct bpf_insn_aux_data *insn_aux_data; /* array of per-insn state */
const struct bpf_line_info *prev_linfo;
struct bpf_verifier_log log;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b44106c8ea75..026b61d78bdb 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -40,7 +40,6 @@
#include <linux/tracepoint.h>
#include <linux/overflow.h>
#include <linux/cookie.h>
-#include <linux/verification.h>
#include <linux/btf_ids.h>
#include <net/netfilter/nf_bpf_link.h>
@@ -2886,64 +2885,6 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
}
}
-static enum bpf_sig_keyring bpf_classify_keyring(s32 keyring_id)
-{
- switch (keyring_id) {
- case 0:
- return BPF_SIG_KEYRING_BUILTIN;
- case (s32)(unsigned long)VERIFY_USE_SECONDARY_KEYRING:
- return BPF_SIG_KEYRING_SECONDARY;
- case (s32)(unsigned long)VERIFY_USE_PLATFORM_KEYRING:
- return BPF_SIG_KEYRING_PLATFORM;
- default:
- return BPF_SIG_KEYRING_USER;
- }
-}
-
-static int bpf_prog_verify_signature(struct bpf_prog *prog, union bpf_attr *attr,
- bool is_kernel, s32 *keyring_serial)
-{
- bpfptr_t usig = make_bpfptr(attr->signature, is_kernel);
- struct bpf_dynptr_kern sig_ptr, insns_ptr;
- struct bpf_key *key = NULL;
- void *sig;
- int err = 0;
-
- /*
- * Don't attempt to use kmalloc_large or vmalloc for signatures.
- * Practical signature for BPF program should be below this limit.
- */
- if (attr->signature_size > KMALLOC_MAX_CACHE_SIZE)
- return -EINVAL;
-
- if (system_keyring_id_check(attr->keyring_id) == 0)
- key = bpf_lookup_system_key(attr->keyring_id);
- else
- key = bpf_lookup_user_key(attr->keyring_id, 0);
-
- if (!key)
- return -EINVAL;
-
- sig = kvmemdup_bpfptr(usig, attr->signature_size);
- if (IS_ERR(sig)) {
- bpf_key_put(key);
- return PTR_ERR(sig);
- }
-
- bpf_dynptr_init(&sig_ptr, sig, BPF_DYNPTR_TYPE_LOCAL, 0,
- attr->signature_size);
- bpf_dynptr_init(&insns_ptr, prog->insnsi, BPF_DYNPTR_TYPE_LOCAL, 0,
- prog->len * sizeof(struct bpf_insn));
-
- err = bpf_verify_pkcs7_signature((struct bpf_dynptr *)&insns_ptr,
- (struct bpf_dynptr *)&sig_ptr, key);
- if (!err)
- *keyring_serial = bpf_key_serial(key);
- bpf_key_put(key);
- kvfree(sig);
- return err;
-}
-
static int bpf_prog_mark_insn_arrays_ready(struct bpf_prog *prog)
{
int err;
@@ -3133,17 +3074,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, struct bpf_log_at
/* eBPF programs must be GPL compatible to use GPL-ed functions */
prog->gpl_compatible = license_is_gpl_compatible(license) ? 1 : 0;
- if (attr->signature) {
- err = bpf_prog_verify_signature(prog, attr, uattr.is_kernel,
- &prog->aux->sig.keyring_serial);
- if (err)
- goto free_prog;
- prog->aux->sig.keyring_type = bpf_classify_keyring(attr->keyring_id);
- prog->aux->sig.verdict = BPF_SIG_VERIFIED;
- } else {
- prog->aux->sig.keyring_type = BPF_SIG_KEYRING_NONE;
- prog->aux->sig.verdict = BPF_SIG_UNSIGNED;
- }
+ prog->aux->sig.keyring_type = BPF_SIG_KEYRING_NONE;
+ prog->aux->sig.verdict = BPF_SIG_UNSIGNED;
prog->orig_prog = NULL;
prog->jited = 0;
@@ -3189,10 +3121,6 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, struct bpf_log_at
if (err < 0)
goto free_prog;
- err = security_bpf_prog_load(prog, attr, token, uattr.is_kernel);
- if (err)
- goto free_prog;
-
/* run eBPF verifier */
err = bpf_check(&prog, attr, uattr, attr_log);
if (err < 0)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 2abc79dbf281..9cd2b62da380 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -22,6 +22,8 @@
#include <linux/ctype.h>
#include <linux/error-injection.h>
#include <linux/bpf_lsm.h>
+#include <linux/security.h>
+#include <linux/verification.h>
#include <linux/btf_ids.h>
#include <linux/poison.h>
#include <linux/module.h>
@@ -19703,12 +19705,146 @@ int bpf_fixup_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
return 0;
}
+static enum bpf_sig_keyring bpf_classify_keyring(s32 keyring_id)
+{
+ switch (keyring_id) {
+ case 0:
+ return BPF_SIG_KEYRING_BUILTIN;
+ case (s32)(unsigned long)VERIFY_USE_SECONDARY_KEYRING:
+ return BPF_SIG_KEYRING_SECONDARY;
+ case (s32)(unsigned long)VERIFY_USE_PLATFORM_KEYRING:
+ return BPF_SIG_KEYRING_PLATFORM;
+ default:
+ return BPF_SIG_KEYRING_USER;
+ }
+}
+
+/*
+ * Verify the PKCS#7 signature of a loaded program. Called from bpf_check()
+ * once the program's metadata maps have been resolved into used_maps, so
+ * the exact maps folded into the signature are the ones the program binds.
+ *
+ * The signature covers the instructions followed by the frozen contents of
+ * each map, in @maps order: insns || map_0 || map_1 || [...]. On success the
+ * verdict and keyring info are recorded on prog->aux.
+ */
+static int bpf_prog_verify_signature(struct bpf_verifier_env *env,
+ union bpf_attr *attr, bool is_kernel)
+{
+ bpfptr_t usig = make_bpfptr(attr->signature, is_kernel);
+ struct bpf_dynptr_kern sig_ptr, data_ptr;
+ struct bpf_prog *prog = env->prog;
+ struct bpf_map **maps = env->used_maps;
+ struct bpf_key *key = NULL;
+ void *sig, *data = NULL;
+ u32 map_cnt = env->used_map_cnt;
+ u32 i, off, insns_sz;
+ u64 data_sz;
+ int err = 0;
+
+ /*
+ * Don't attempt to use kmalloc_large or vmalloc for signatures.
+ * Practical signature for BPF program should be below this limit.
+ */
+ if (attr->signature_size > KMALLOC_MAX_CACHE_SIZE)
+ return -EINVAL;
+ if (system_keyring_id_check(attr->keyring_id) == 0)
+ key = bpf_lookup_system_key(attr->keyring_id);
+ else
+ key = bpf_lookup_user_key(attr->keyring_id, 0);
+ if (!key) {
+ verbose(env, "cannot resolve signing keyring with keyring_id %d\n",
+ attr->keyring_id);
+ return -EINVAL;
+ }
+
+ sig = kvmemdup_bpfptr(usig, attr->signature_size);
+ if (IS_ERR(sig)) {
+ bpf_key_put(key);
+ return PTR_ERR(sig);
+ }
+
+ insns_sz = prog->len * sizeof(struct bpf_insn);
+ data_sz = insns_sz;
+ for (i = 0; i < map_cnt; i++) {
+ struct bpf_map *map = maps[i];
+
+ if (map->map_type != BPF_MAP_TYPE_ARRAY ||
+ !map->ops->map_direct_value_addr) {
+ verbose(env, "signed program metadata map '%s' must be an array\n",
+ map->name);
+ err = -EINVAL;
+ goto out;
+ }
+ if (!READ_ONCE(map->frozen)) {
+ verbose(env, "signed program metadata map '%s' must be frozen\n",
+ map->name);
+ err = -EPERM;
+ goto out;
+ }
+ if (!map->excl_prog_sha) {
+ verbose(env, "signed program metadata map '%s' must be exclusive\n",
+ map->name);
+ err = -EPERM;
+ goto out;
+ }
+ data_sz += map->value_size;
+ }
+ if (bpf_dynptr_check_size(data_sz)) {
+ verbose(env, "signed payload too large: %llu bytes\n", data_sz);
+ err = -E2BIG;
+ goto out;
+ }
+ data = kvmalloc(data_sz, GFP_KERNEL | __GFP_ZERO);
+ if (!data) {
+ err = -ENOMEM;
+ goto out;
+ }
+ memcpy(data, prog->insnsi, insns_sz);
+ off = insns_sz;
+ for (i = 0; i < map_cnt; i++) {
+ struct bpf_map *map = maps[i];
+ u64 addr;
+
+ err = map->ops->map_direct_value_addr(map, &addr, 0);
+ if (err) {
+ verbose(env, "failed to read signed metadata map '%s': %d\n",
+ map->name, err);
+ goto out;
+ }
+ memcpy(data + off, (void *)(unsigned long)addr,
+ map->value_size);
+ off += map->value_size;
+ }
+
+ bpf_dynptr_init(&data_ptr, data, BPF_DYNPTR_TYPE_LOCAL, 0, data_sz);
+ bpf_dynptr_init(&sig_ptr, sig, BPF_DYNPTR_TYPE_LOCAL, 0,
+ attr->signature_size);
+
+ err = bpf_verify_pkcs7_signature((struct bpf_dynptr *)&data_ptr,
+ (struct bpf_dynptr *)&sig_ptr, key);
+ if (err) {
+ verbose(env, "signature verification failed: %d\n", err);
+ } else {
+ verbose(env, "signature verification passed\n");
+ prog->aux->sig.keyring_serial = bpf_key_serial(key);
+ prog->aux->sig.keyring_type = bpf_classify_keyring(attr->keyring_id);
+ prog->aux->sig.verdict = BPF_SIG_VERIFIED;
+ }
+out:
+ kvfree(data);
+ bpf_key_put(key);
+ kvfree(sig);
+ return err;
+}
+
int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
struct bpf_log_attr *attr_log)
{
u64 start_time = ktime_get_ns();
struct bpf_verifier_env *env;
int i, len, ret = -EINVAL, err;
+ u32 signed_map_cnt = 0;
bool is_priv;
BTF_TYPE_EMIT(enum bpf_features);
@@ -19745,6 +19881,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
env->bypass_spec_v1 = bpf_bypass_spec_v1(env->prog->aux->token);
env->bypass_spec_v4 = bpf_bypass_spec_v4(env->prog->aux->token);
env->bpf_capable = is_priv = bpf_token_capable(env->prog->aux->token, CAP_BPF);
+ env->check_signature = attr->signature;
bpf_get_btf_vmlinux();
@@ -19758,11 +19895,28 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
ret = bpf_vlog_init(&env->log, attr_log->level, attr_log->ubuf, attr_log->size);
if (ret)
goto err_unlock;
+ if (env->check_signature) {
+ ret = bpf_prog_calc_tag(env->prog);
+ if (ret < 0)
+ goto skip_full_check;
+ }
ret = process_fd_array(env, attr, uattr);
if (ret)
goto skip_full_check;
+ if (env->check_signature) {
+ ret = bpf_prog_verify_signature(env, attr, uattr.is_kernel);
+ if (ret)
+ goto skip_full_check;
+ signed_map_cnt = env->used_map_cnt;
+ }
+
+ ret = security_bpf_prog_load(env->prog, attr, env->prog->aux->token,
+ uattr.is_kernel);
+ if (ret)
+ goto skip_full_check;
+
mark_verifier_state_clean(env);
if (IS_ERR(btf_vmlinux)) {
@@ -19812,7 +19966,14 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
ret = check_and_resolve_insns(env);
if (ret < 0)
goto skip_full_check;
-
+ if (env->prog->aux->sig.verdict == BPF_SIG_VERIFIED &&
+ (env->used_map_cnt != signed_map_cnt || env->used_btf_cnt)) {
+ verbose(env, "signed program uses %s not covered by the signature\n",
+ env->used_map_cnt != signed_map_cnt ?
+ (env->used_btf_cnt ? "maps and BTF" : "maps") : "BTF");
+ ret = -EACCES;
+ goto skip_full_check;
+ }
if (bpf_prog_is_offloaded(env->prog->aux)) {
ret = bpf_prog_offload_verifier_prep(env->prog);
if (ret)
--
2.43.0
^ permalink raw reply related
* Re: [PATCH] landlock: shrink tsync works[] on partial allocation failure
From: Hao Peng @ 2026-06-24 12:47 UTC (permalink / raw)
To: Günther Noack; +Cc: mic, linux-security-module
In-Reply-To: <ajpVtFfp0ZxJUVc6@google.com>
On Tue, Jun 23, 2026 at 5:45 PM Günther Noack <gnoack@google.com> wrote:
>
> Hello Peng!
>
> Thanks for your patch!
>
> On Tue, Jun 23, 2026 at 01:01:27PM +0800, Peng Hao wrote:
> > When the per-slot kzalloc fails mid-loop in tsync_works_grow_by(), the
> > already-enlarged s->works array keeps uninitialized trailing entries.
> > Shrink the array back to its used size on the error path so no waste
> > is carried over: free it outright when nothing has been allocated yet,
> > otherwise try a shrinking krealloc_array() (keep the larger array if
> > the shrink fails, since tsync_works_release() honors s->capacity).
> >
> > Signed-off-by: Peng Hao <flyingpeng@tencent.com>
> > ---
> > security/landlock/tsync.c | 18 ++++++++++++++++--
> > 1 file changed, 16 insertions(+), 2 deletions(-)
> >
> > diff --git a/security/landlock/tsync.c b/security/landlock/tsync.c
> > index c5730bbd..356ce94b 100644
> > --- a/security/landlock/tsync.c
> > +++ b/security/landlock/tsync.c
> > @@ -272,9 +272,23 @@ static int tsync_works_grow_by(struct tsync_works *s, size_t n, gfp_t flags)
> > work = kzalloc_obj(*work, flags);
> > if (!work) {
> > /*
> > - * Leave the object in a consistent state,
> > - * but return an error.
> > + * Leave the object in a consistent state, but return
> > + * an error. Shrink @s->works back to its used size to
> > + * avoid carrying uninitialized trailing entries. A
> > + * shrinking krealloc_array() should normally succeed,
> > + * but if it does not we simply keep the larger array;
> > + * tsync_works_release() iterates only up to capacity.
> > */
> > + if (i == 0) {
> > + kfree(s->works);
> > + s->works = NULL;
> > + } else {
> > + works = krealloc_array(s->works, i,
> > + sizeof(s->works[0]),
> > + flags | __GFP_NOWARN);
> > + if (works)
> > + s->works = works;
> > + }
> > s->capacity = i;
> > return -ENOMEM;
> > }
>
> Can you please clarify your motivation for this?
>
> To paraphrase my understanding
>
> * You are not addressing a logic bug
>
> The invariant for that data structure is that s->size <= s->capacity
> and s->capacity <= number of elements in the array <= number of
> sibling threads.
>
> If the array is slightly larger than the capacity, that does not break
> the invariant and should not result in out-of-bounds accesses.
>
> * You are addressing that the array is a bit larger than the capacity
>
> This is in the case where kzalloc_obj() failed. We set the capacity
> to i (making sure that only the objects 0 to i-1 are being looked at),
> and we return an error. Sure, the array is overallocated a little
> bit, but now that we are returning an error, the caller will
> fast-track to abort the tsync due to ENOMEM and the array does anyway
> get released very soon now.
>
>
> The state where we use slightly more memory (with number of entries <=
> number of sibling threads) is supposed to be very transient.
>
> Is the delay between raising the error and the final kfree() long enough
> that you have seen it cause problems in practice?
>
> Thanks,
> —Günther
You are right; the time between the error being triggered and the call to
kfree() is very short, so this patch is of little significance.
Thanks.
^ permalink raw reply
* [GIT PULL] AppArmor updates for 7.2-rc1
From: John Johansen @ 2026-06-24 7:52 UTC (permalink / raw)
To: Linus Torvalds; +Cc: LKLM, open list:SECURITY SUBSYSTEM, Georgia Garcia
Hi Linus,
This is another round of bug fixing and some code cleanups, there are no
new features. I ended up pulling in a few bug fixes late, and gave them
some extra time to bake.
The biggest thing to note is Georgia is being added to help co-maintain
apparmor.
Everything has been merge, build, and regression tested against your tree
for Monday June 22.
thanks
- john
The following changes since commit 254f49634ee16a731174d2ae34bc50bd5f45e731:
Linux 7.1-rc1 (2026-04-26 14:19:00 -0700)
are available in the Git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/jj/linux-apparmor tags/apparmor-pr-2026-06-22
for you to fetch changes up to 2f6701a5ce6257ae7a64ddc6d89d0a08d2a034f8:
apparmor: advertise the tcp fast open fix is applied (2026-06-23 22:15:15 -0700)
----------------------------------------------------------------
* add Georgia Garcia as co-maintainer of apparmor
* Cleanups
- replace get_zeroed_page() with kzalloc()
- remove unnecessary goto and associated label
- change fn_label_build() to return err on failure instead of NULL or err
- free rawdata as soon as possible
- use explicit instead of implicit flex array in rawdata_f_data
- use __label_make_stale in __aa_proxy_redirect
- return correct error by propagate -ENOMEM correctly in unpack_table
- aa_label_alloc use aa_label_free on alloc failure
- add a conditional version of get_newest_label
* Bug Fixes
- mediate the implicit connect of TCP fast open sendmsg
- fix C23ism of label immediately before a declaration
- fix kernel-doc warnings
- fix spelling mistakes
- fix use-after-free in rawdata dedup loop
- Fix inverted comparison in cache_hold_inc()
- fix uninitialized pointer passed to audit_log_untrustedstring()
- don't audit files pointing to aa_null.dentry
- put secmark label after secid lookup
- fix aa_getprocattr free procattr leak on format failure
- release exe file resources on path failure
- fail policy unpack on accept2 allocation failure
- Fix return in ns_mkdir_op
- remove or add symlinks to rawdata according to export_binary
- fix NULL pointer dereference in unpack_pdb
- fix potential UAF in aa_replace_profiles
- grab ns lock and refresh when looking up changehat child profiles
- enable differential encoding
- check label build before no_new_privs test
- conditionally compile get_loaddata_common_ref()
- fix unix socket mediation cache update, and leak
----------------------------------------------------------------
Andrew Morton (1):
security/apparmor/apparmorfs.c: conditionally compile get_loaddata_common_ref()
Bryam Vargas (1):
apparmor: mediate the implicit connect of TCP fast open sendmsg
Eduardo Vasconcelos (1):
apparmor: Fix inverted comparison in cache_hold_inc()
Georgia Garcia (3):
apparmor: fix NULL pointer dereference in unpack_pdb
apparmor: remove or add symlinks to rawdata according to export_binary
apparmor: don't audit files pointing to aa_null.dentry
Hongling Zeng (1):
apparmor: Fix return in ns_mkdir_op
John Johansen (13):
apparmor: add Georgia Garcia as co-maintainer of apparmor
apparmor: fix shadowing of plabel that prevents cache from being updated
apparmor: fix race in unix socket mediation when peer_path is used
apparmor: fix refcount leak when updating the sk_ctx
apparmor: add a conditional version of get_newest_label
apparmor: enable differential encoding
apparmor: fix rawdata_f_data implicit flex array
apparmor: free rawdata as soon as possible
apparmor: change fn_label_build() call to not return NULL
apparmor: make fn_label_build() capable of handling not supported
apparmor: remove unnecessary goto and associated label
apparmor: fix label can not be immediately before a declaration
apparmor: advertise the tcp fast open fix is applied
Maciek Borzecki (1):
apparmor: fix uninitialised pointer passed to audit_log_untrustedstring()
Maxime Bélair (2):
apparmor: propagate -ENOMEM correctly in unpack_table
apparmor: fix potential UAF in aa_replace_profiles
Mike Rapoport (Microsoft) (1):
apparmor: replace get_zeroed_page() with kzalloc()
Qingshuang Fu (1):
security: apparmor: fix two spelling mistakes
Rodrigo Zaiden (1):
apparmor: fix kernel-doc warnings
Ruoyu Wang (1):
apparmor: check label build before no_new_privs test
Ruslan Valiyev (1):
apparmor: fix use-after-free in rawdata dedup loop
Ryan Lee (2):
apparmor: use __label_make_stale in __aa_proxy_redirect
apparmor: grab ns lock and refresh when looking up changehat child profiles
Zygmunt Krynicki (5):
apparmor: aa_label_alloc use aa_label_free on alloc failure
apparmor: fail policy unpack on accept2 allocation failure
apparmor: release exe file resources on path failure
apparmor: aa_getprocattr free procattr leak on format failure
apparmor: put secmark label after secid lookup
MAINTAINERS | 1 +
security/apparmor/af_unix.c | 73 +++++++++---------
security/apparmor/apparmorfs.c | 119 ++++++++++++++++++++++--------
security/apparmor/domain.c | 97 ++++++++++++++++--------
security/apparmor/file.c | 12 +--
security/apparmor/include/apparmorfs.h | 12 +++
security/apparmor/include/label.h | 32 ++++++++
security/apparmor/include/lib.h | 21 +++---
security/apparmor/include/policy_unpack.h | 21 +++++-
security/apparmor/label.c | 26 +++----
security/apparmor/lsm.c | 20 ++++-
security/apparmor/match.c | 22 +++---
security/apparmor/mount.c | 17 ++---
security/apparmor/net.c | 3 +
security/apparmor/policy.c | 35 ++++++++-
security/apparmor/policy_unpack.c | 6 +-
security/apparmor/procattr.c | 2 +
security/apparmor/task.c | 2 +-
18 files changed, 368 insertions(+), 153 deletions(-)
^ permalink raw reply
* Re: [PATCH v3 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: Paul Moore @ 2026-06-24 0:12 UTC (permalink / raw)
To: David Windsor, viro, brauner, jack, ast, daniel, john.fastabend,
andrii, eddyz87, memxor, martin.lau, song, yonghong.song, jolsa,
emil, kpsingh, mattbobrowski, jmorris, serge, zohar,
roberto.sassu, dmitry.kasatkin, eric.snowberg,
stephen.smalley.work, omosnace, casey, shuah
Cc: linux-kernel, linux-fsdevel, bpf, linux-security-module,
linux-integrity, selinux, linux-kselftest, David Windsor
In-Reply-To: <20260618203411.73917-2-dwindsor@gmail.com>
On Jun 18, 2026 David Windsor <dwindsor@gmail.com> wrote:
>
> Add bpf_init_inode_xattr() kfunc for BPF LSM programs to atomically set
> xattrs via the inode_init_security hook using lsm_get_xattr_slot().
>
> The inode_init_security hook previously took the xattr array and count
> as two separate output parameters (struct xattr *xattrs, int
> *xattr_count), which BPF programs cannot write to. Pass the xattr state
> as a single context object (struct xattr_ctx) instead, and have
> bpf_init_inode_xattr() take that context directly. Update the existing
> in-tree callers of inode_init_security to take and forward the new
> xattr_ctx.
>
> A previous attempt [1] required a kmalloc string output protocol for
> the xattr name. Since commit 6bcdfd2cac55 ("security: Allow all LSMs to
> provide xattrs for inode_init_security hook") [2], the xattr name is no
> longer allocated; it is a static constant.
>
> Because we rely on the hook-specific ctx layout, the kfunc is
> restricted to lsm/inode_init_security. Restrict the xattr names that
> may be set via this kfunc to the bpf.* namespace.
>
> Link: https://kernsec.org/pipermail/linux-security-module-archive/2022-October/034878.html [1]
> Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6bcdfd2cac55 [2]
> Suggested-by: Song Liu <song@kernel.org>
> Signed-off-by: David Windsor <dwindsor@gmail.com>
> ---
> fs/bpf_fs_kfuncs.c | 106 +++++++++++++++++++++++++++++-
> include/linux/bpf.h | 1 +
> include/linux/bpf_lsm.h | 3 +
> include/linux/evm.h | 9 +--
> include/linux/lsm_hook_defs.h | 4 +-
> include/linux/lsm_hooks.h | 16 ++---
> include/linux/security.h | 5 ++
> kernel/bpf/bpf_lsm.c | 10 +++
> kernel/bpf/trampoline.c | 3 +
> security/bpf/hooks.c | 1 +
> security/integrity/evm/evm_main.c | 8 ++-
> security/security.c | 7 +-
> security/selinux/hooks.c | 4 +-
> security/smack/smack_lsm.c | 27 ++++----
> 14 files changed, 166 insertions(+), 38 deletions(-)
I have a few specific comments below, inline with the patch, but I wanted
to make some general comments too.
The kfunc additions really don't belong in the VFS kfunc file, please
create a LSM kfunc file (call it security/bpf_lsm_kfuncs.c) and add the
kfunc code to this new file.
While moving the kfunc additions to a LSM kfunc file does sort of convert
the VFS changes into LSM changes, Christian's comment about splitting
this patch two patches, one with the LSM hook changes and one with the
BPF additions, is still reasonable and a good suggestion. I would still
CC the VFS folks on these patches and I would encourage reviews from the
VFS folks as there is a VFS component here, albeit a somewhat small one.
> diff --git a/fs/bpf_fs_kfuncs.c b/fs/bpf_fs_kfuncs.c
> index 768aca2dc0f0..7abc3f3d1a67 100644
> --- a/fs/bpf_fs_kfuncs.c
> +++ b/fs/bpf_fs_kfuncs.c
> @@ -10,6 +10,7 @@
> #include <linux/fsnotify.h>
> #include <linux/file.h>
> #include <linux/kernfs.h>
> +#include <linux/lsm_hooks.h>
> #include <linux/mm.h>
> #include <linux/xattr.h>
>
> @@ -374,6 +375,97 @@ __bpf_kfunc struct inode *bpf_real_inode(struct dentry *dentry)
> return d_real_inode(dentry);
> }
>
> +static int bpf_xattrs_used(const struct xattr_ctx *ctx)
> +{
> + const size_t prefix_len = sizeof(XATTR_BPF_LSM_SUFFIX) - 1;
> + int i, n = 0;
> +
> + for (i = 0; i < *ctx->xattr_count; i++) {
> + const char *name = ctx->xattrs[i].name;
> +
> + if (name && !strncmp(name, XATTR_BPF_LSM_SUFFIX, prefix_len))
> + n++;
> + }
> + return n;
> +}
> +
> +static int __bpf_init_inode_xattr(struct xattr_ctx *xattr_ctx,
> + const char *name__str,
> + const struct bpf_dynptr *value_p)
> +{
> + struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
> + size_t name_len;
> + void *xattr_value;
> + struct xattr *xattr;
> + struct xattr *xattrs;
> + int *xattr_count;
> + const void *value;
> + u32 value_len;
> +
> + if (!xattr_ctx || !name__str)
> + return -EINVAL;
> +
> + xattrs = xattr_ctx->xattrs;
> + xattr_count = xattr_ctx->xattr_count;
I'm not sure why the "xattrs" and "xattrs_count" local variables are
necessary, especially since they only appear to be used in the sanity
check below. I would suggest just using the values in the struct
directly.
> + if (!xattrs || !xattr_count)
> + return -EINVAL;
> + if (bpf_xattrs_used(xattr_ctx) >= BPF_LSM_INODE_INIT_XATTRS)
> + return -ENOSPC;
> +
> + name_len = strlen(name__str);
> + if (name_len == 0 || name_len > XATTR_NAME_MAX)
> + return -EINVAL;
> + if (strncmp(name__str, XATTR_BPF_LSM_SUFFIX,
> + sizeof(XATTR_BPF_LSM_SUFFIX) - 1))
> + return -EPERM;
> +
> + value_len = __bpf_dynptr_size(value_ptr);
> + if (value_len == 0 || value_len > XATTR_SIZE_MAX)
> + return -EINVAL;
> +
> + value = __bpf_dynptr_data(value_ptr, value_len);
> + if (!value)
> + return -EINVAL;
> +
> + /* Combine xattr value + name into one allocation. */
> + xattr_value = kmalloc(value_len + name_len + 1, GFP_KERNEL);
> + if (!xattr_value)
> + return -ENOMEM;
> +
> + memcpy(xattr_value, value, value_len);
> + memcpy(xattr_value + value_len, name__str, name_len);
> + ((char *)xattr_value)[value_len + name_len] = '\0';
> +
> + xattr = lsm_get_xattr_slot(xattr_ctx);
> + if (!xattr) {
> + kfree(xattr_value);
> + return -ENOSPC;
> + }
> +
> + xattr->value = xattr_value;
> + xattr->name = (const char *)xattr_value + value_len;
> + xattr->value_len = value_len;
> +
> + return 0;
> +}
> +
> +/**
> + * bpf_init_inode_xattr - set an xattr on a new inode from inode_init_security
> + * @xattr_ctx: inode_init_security xattr state from the hook context
> + * @name__str: xattr name (e.g., "bpf.file_label")
> + * @value_p: dynptr containing the xattr value
> + *
> + * Only callable from lsm/inode_init_security programs.
> + *
> + * Return: 0 on success, negative error on failure.
> + */
> +__bpf_kfunc int bpf_init_inode_xattr(struct xattr_ctx *xattr_ctx,
> + const char *name__str,
> + const struct bpf_dynptr *value_p)
> +{
> + return __bpf_init_inode_xattr(xattr_ctx, name__str, value_p);
> +}
I'm sure there is a reason why you split the code out into
__bpf_init_inode_xattr() as opposed to just putting it directly in this
kfunc, can you help me understand?
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 153e9043058f..1f8e84e7dd7e 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -68,6 +68,11 @@ struct watch;
> struct watch_notification;
> struct lsm_ctx;
>
> +struct xattr_ctx {
> + struct xattr *xattrs;
> + int *xattr_count;
> +};
I see the bots already pointed this out, and you acknowledged it, but
since I'm looking at this I felt obliged to remind you once again about
the rename to "struct lsm_xattrs" :)
Also, looking at this closer, is there a reason why the "xattr_count"
field is an integer pointer and not just an integer? We're passing
the entire struct by reference to the individual LSMs so we shouldn't
need this to get an updated count in the caller and having it as a
regular integer should simplify things slightly (you could also
make it an unsigned int while you are it).
> @@ -315,6 +324,7 @@ BTF_ID(func, bpf_lsm_inode_create)
> BTF_ID(func, bpf_lsm_inode_free_security)
> BTF_ID(func, bpf_lsm_inode_getattr)
> BTF_ID(func, bpf_lsm_inode_getxattr)
> +BTF_ID(func, bpf_lsm_inode_init_security)
> BTF_ID(func, bpf_lsm_inode_mknod)
> BTF_ID(func, bpf_lsm_inode_need_killpriv)
> BTF_ID(func, bpf_lsm_inode_post_setxattr)
> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> index 1a721fc4bef5..b41b02173e24 100644
> --- a/kernel/bpf/trampoline.c
> +++ b/kernel/bpf/trampoline.c
> @@ -859,6 +859,9 @@ static int bpf_trampoline_add_prog(struct bpf_trampoline *tr,
> }
> if (cnt >= BPF_MAX_TRAMP_LINKS)
> return -E2BIG;
> + if (node->link->prog->aux->attach_limit &&
> + tr->progs_cnt[kind] >= node->link->prog->aux->attach_limit)
> + return -E2BIG;
Re: Alexei's comments about this - if you look back at my previous
comments, my concern was around BPF LSMs using too many slots in the
xattr array and causing issues. If the BPF folks want to do that check
in the kfunc located in the LSM framework I'm okay with that; the
important part is that the BPF LSMs don't use more space than they
previously requested.
> diff --git a/security/security.c b/security/security.c
> index 71aea8fdf014..8f82a1352356 100644
> --- a/security/security.c
> +++ b/security/security.c
> @@ -1334,6 +1334,7 @@ int security_inode_init_security(struct inode *inode, struct inode *dir,
> {
> struct lsm_static_call *scall;
> struct xattr *new_xattrs = NULL;
> + struct xattr_ctx xattr_ctx;
> int ret = -EOPNOTSUPP, xattr_count = 0;
Since we have the xattr array/pointer and count in the new "lsm_xattrs"
struct, I think we can remove the "new_xattrs" and "xattr_count" local
variables in favor of the fields in the new struct.
> if (unlikely(IS_PRIVATE(inode)))
> @@ -1349,10 +1350,12 @@ int security_inode_init_security(struct inode *inode, struct inode *dir,
> if (!new_xattrs)
> return -ENOMEM;
> }
> + xattr_ctx.xattrs = new_xattrs;
> + xattr_ctx.xattr_count = &xattr_count;
>
> lsm_for_each_hook(scall, inode_init_security) {
> - ret = scall->hl->hook.inode_init_security(inode, dir, qstr, new_xattrs,
> - &xattr_count);
> + ret = scall->hl->hook.inode_init_security(inode, dir, qstr,
> + &xattr_ctx);
> if (ret && ret != -EOPNOTSUPP)
> goto out;
> /*
--
paul-moore.com
^ permalink raw reply
* Re: [PATCH v17 06/10] rust: rename `AlwaysRefCounted` to `RefCounted`.
From: Onur Özkan @ 2026-06-23 17:58 UTC (permalink / raw)
To: Andreas Hindborg
Cc: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, linux-kernel, rust-for-linux, linux-block,
linux-security-module, dri-devel, linux-fsdevel, linux-mm,
linux-pm, linux-pci, driver-core, Oliver Mangold, Viresh Kumar
In-Reply-To: <20260604-unique-ref-v17-6-7b4c3d2930b9@kernel.org>
On Thu, 04 Jun 2026 22:11:18 +0200
Andreas Hindborg <a.hindborg@kernel.org> wrote:
> From: Oliver Mangold <oliver.mangold@pm.me>
>
> There are types where it may both be reference counted in some cases and
> owned in others. In such cases, obtaining `ARef<T>` from `&T` would be
> unsound as it allows creation of `ARef<T>` copy from `&Owned<T>`.
>
> Therefore, we split `AlwaysRefCounted` into `RefCounted` (which `ARef<T>`
> would require) and a marker trait to indicate that the type is always
> reference counted (and not `Ownable`) so the `&T` -> `ARef<T>` conversion
> is possible.
>
> - Rename `AlwaysRefCounted` to `RefCounted`.
> - Add a new unsafe trait `AlwaysRefCounted`.
> - Implement the new trait `AlwaysRefCounted` for the newly renamed
> `RefCounted` implementations. This leaves functionality of existing
> implementers of `AlwaysRefCounted` intact.
>
> Suggested-by: Alice Ryhl <aliceryhl@google.com>
> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
> Signed-off-by: Oliver Mangold <oliver.mangold@pm.me>
> [ Andreas: Updated commit message and rebase on rust-6.20-7.0 ]
> Acked-by: Igor Korotin <igor.korotin.linux@gmail.com>
> Acked-by: Danilo Krummrich <dakr@kernel.org>
> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
> Reviewed-by: Gary Guo <gary@garyguo.net>
> Co-developed-by: Andreas Hindborg <a.hindborg@kernel.org>
> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
> ---
> rust/kernel/auxiliary.rs | 7 +++++-
> rust/kernel/block/mq/request.rs | 15 ++++++++-----
> rust/kernel/cred.rs | 13 +++++++++--
> rust/kernel/device.rs | 12 ++++++++--
> rust/kernel/device/property.rs | 11 +++++++--
> rust/kernel/drm/device.rs | 9 ++++++--
> rust/kernel/drm/gem/mod.rs | 16 ++++++++++----
> rust/kernel/fs/file.rs | 16 ++++++++++----
> rust/kernel/i2c.rs | 13 ++++++++---
> rust/kernel/mm.rs | 15 +++++++++----
> rust/kernel/mm/mmput_async.rs | 9 ++++++--
> rust/kernel/opp.rs | 10 ++++++---
> rust/kernel/owned.rs | 2 +-
> rust/kernel/pci.rs | 10 ++++++++-
> rust/kernel/pid_namespace.rs | 12 ++++++++--
> rust/kernel/platform.rs | 7 +++++-
> rust/kernel/sync/aref.rs | 49 ++++++++++++++++++++++++++---------------
> rust/kernel/task.rs | 13 +++++++++--
> rust/kernel/types.rs | 3 ++-
> rust/kernel/usb.rs | 17 +++++++++++---
> 20 files changed, 195 insertions(+), 64 deletions(-)
>
> diff --git a/rust/kernel/auxiliary.rs b/rust/kernel/auxiliary.rs
> index 93c0db1f6655..49f07740f657 100644
> --- a/rust/kernel/auxiliary.rs
> +++ b/rust/kernel/auxiliary.rs
> @@ -19,6 +19,7 @@
> to_result, //
> },
> prelude::*,
> + sync::aref::{AlwaysRefCounted, RefCounted},
This patch has multiple horizontal use statements around.
Onur
> types::Opaque,
> ThisModule, //
> };
> @@ -289,7 +290,7 @@ unsafe impl<Ctx: device::DeviceContext> device::AsBusDevice<Ctx> for Device<Ctx>
> kernel::impl_device_context_into_aref!(Device);
>
> // SAFETY: Instances of `Device` are always reference-counted.
> -unsafe impl crate::sync::aref::AlwaysRefCounted for Device {
> +unsafe impl RefCounted for Device {
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
> unsafe { bindings::get_device(self.as_ref().as_raw()) };
> @@ -308,6 +309,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
> +// `&Device`.
> +unsafe impl AlwaysRefCounted for Device {}
> +
> impl<Ctx: device::DeviceContext> AsRef<device::Device<Ctx>> for Device<Ctx> {
> fn as_ref(&self) -> &device::Device<Ctx> {
> // SAFETY: By the type invariant of `Self`, `self.as_raw()` is a pointer to a valid
> diff --git a/rust/kernel/block/mq/request.rs b/rust/kernel/block/mq/request.rs
> index ce3e30c81cb5..cf013b9e2cac 100644
> --- a/rust/kernel/block/mq/request.rs
> +++ b/rust/kernel/block/mq/request.rs
> @@ -9,7 +9,7 @@
> block::mq::Operations,
> error::Result,
> sync::{
> - aref::{ARef, AlwaysRefCounted},
> + aref::{ARef, AlwaysRefCounted, RefCounted},
> atomic::Relaxed,
> Refcount,
> },
> @@ -229,11 +229,10 @@ unsafe impl<T: Operations> Send for Request<T> {}
> // mutate `self` are internally synchronized`
> unsafe impl<T: Operations> Sync for Request<T> {}
>
> -// SAFETY: All instances of `Request<T>` are reference counted. This
> -// implementation of `AlwaysRefCounted` ensure that increments to the ref count
> -// keeps the object alive in memory at least until a matching reference count
> -// decrement is executed.
> -unsafe impl<T: Operations> AlwaysRefCounted for Request<T> {
> +// SAFETY: All instances of `Request<T>` are reference counted. This implementation of `RefCounted`
> +// ensure that increments to the ref count keeps the object alive in memory at least until a
> +// matching reference count decrement is executed.
> +unsafe impl<T: Operations> RefCounted for Request<T> {
> fn inc_ref(&self) {
> self.wrapper_ref().refcount().inc();
> }
> @@ -255,3 +254,7 @@ unsafe fn dec_ref(obj: core::ptr::NonNull<Self>) {
> }
> }
> }
> +
> +// SAFETY: We currently do not implement `Ownable`, thus it is okay to obtain an `ARef<Request>`
> +// from a `&Request` (but this will change in the future).
> +unsafe impl<T: Operations> AlwaysRefCounted for Request<T> {}
> diff --git a/rust/kernel/cred.rs b/rust/kernel/cred.rs
> index ffa156b9df37..20ef0144094b 100644
> --- a/rust/kernel/cred.rs
> +++ b/rust/kernel/cred.rs
> @@ -8,7 +8,12 @@
> //!
> //! Reference: <https://www.kernel.org/doc/html/latest/security/credentials.html>
>
> -use crate::{bindings, sync::aref::AlwaysRefCounted, task::Kuid, types::Opaque};
> +use crate::{
> + bindings,
> + sync::aref::RefCounted,
> + task::Kuid,
> + types::{AlwaysRefCounted, Opaque},
> +};
>
> /// Wraps the kernel's `struct cred`.
> ///
> @@ -76,7 +81,7 @@ pub fn euid(&self) -> Kuid {
> }
>
> // SAFETY: The type invariants guarantee that `Credential` is always ref-counted.
> -unsafe impl AlwaysRefCounted for Credential {
> +unsafe impl RefCounted for Credential {
> #[inline]
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference means that the refcount is nonzero.
> @@ -90,3 +95,7 @@ unsafe fn dec_ref(obj: core::ptr::NonNull<Credential>) {
> unsafe { bindings::put_cred(obj.cast().as_ptr()) };
> }
> }
> +
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Credential>` from a
> +// `&Credential`.
> +unsafe impl AlwaysRefCounted for Credential {}
> diff --git a/rust/kernel/device.rs b/rust/kernel/device.rs
> index 6d5396a43ebe..efdf33617d12 100644
> --- a/rust/kernel/device.rs
> +++ b/rust/kernel/device.rs
> @@ -8,8 +8,12 @@
> bindings,
> fmt,
> prelude::*,
> - sync::aref::ARef,
> + sync::aref::{
> + ARef,
> + RefCounted, //
> + },
> types::{
> + AlwaysRefCounted,
> ForeignOwnable,
> Opaque, //
> }, //
> @@ -508,7 +512,7 @@ pub fn name(&self) -> &CStr {
> kernel::impl_device_context_into_aref!(Device);
>
> // SAFETY: Instances of `Device` are always reference-counted.
> -unsafe impl crate::sync::aref::AlwaysRefCounted for Device {
> +unsafe impl RefCounted for Device {
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
> unsafe { bindings::get_device(self.as_raw()) };
> @@ -520,6 +524,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
> +// `&Device`.
> +unsafe impl AlwaysRefCounted for Device {}
> +
> // SAFETY: As by the type invariant `Device` can be sent to any thread.
> unsafe impl Send for Device {}
>
> diff --git a/rust/kernel/device/property.rs b/rust/kernel/device/property.rs
> index 5aead835fbbc..cee7e2501368 100644
> --- a/rust/kernel/device/property.rs
> +++ b/rust/kernel/device/property.rs
> @@ -14,7 +14,10 @@
> fmt,
> prelude::*,
> str::{CStr, CString},
> - sync::aref::ARef,
> + sync::aref::{
> + ARef,
> + AlwaysRefCounted, //
> + },
> types::Opaque,
> };
>
> @@ -360,7 +363,7 @@ fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
> }
>
> // SAFETY: Instances of `FwNode` are always reference-counted.
> -unsafe impl crate::sync::aref::AlwaysRefCounted for FwNode {
> +unsafe impl crate::sync::aref::RefCounted for FwNode {
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference guarantees that the
> // refcount is non-zero.
> @@ -374,6 +377,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<FwNode>` from a
> +// `&FwNode`.
> +unsafe impl AlwaysRefCounted for FwNode {}
> +
> enum Node<'a> {
> Borrowed(&'a FwNode),
> Owned(ARef<FwNode>),
> diff --git a/rust/kernel/drm/device.rs b/rust/kernel/drm/device.rs
> index adbafe8db54d..a5a040266aae 100644
> --- a/rust/kernel/drm/device.rs
> +++ b/rust/kernel/drm/device.rs
> @@ -15,7 +15,8 @@
> prelude::*,
> sync::aref::{
> ARef,
> - AlwaysRefCounted, //
> + AlwaysRefCounted,
> + RefCounted, //
> },
> types::Opaque,
> workqueue::{
> @@ -217,7 +218,7 @@ fn deref(&self) -> &Self::Target {
>
> // SAFETY: DRM device objects are always reference counted and the get/put functions
> // satisfy the requirements.
> -unsafe impl<T: drm::Driver> AlwaysRefCounted for Device<T> {
> +unsafe impl<T: drm::Driver> RefCounted for Device<T> {
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
> unsafe { bindings::drm_dev_get(self.as_raw()) };
> @@ -232,6 +233,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
> +// `&Device`.
> +unsafe impl<T: drm::Driver> AlwaysRefCounted for Device<T> {}
> +
> impl<T: drm::Driver> AsRef<device::Device> for Device<T> {
> fn as_ref(&self) -> &device::Device {
> // SAFETY: `bindings::drm_device::dev` is valid as long as the DRM device itself is valid,
> diff --git a/rust/kernel/drm/gem/mod.rs b/rust/kernel/drm/gem/mod.rs
> index 75acda7ba500..f8cc2a0ff4c7 100644
> --- a/rust/kernel/drm/gem/mod.rs
> +++ b/rust/kernel/drm/gem/mod.rs
> @@ -17,7 +17,7 @@
> prelude::*,
> sync::aref::{
> ARef,
> - AlwaysRefCounted, //
> + RefCounted, //
> },
> types::Opaque,
> };
> @@ -29,7 +29,7 @@
> #[cfg(CONFIG_RUST_DRM_GEM_SHMEM_HELPER)]
> pub mod shmem;
>
> -/// A macro for implementing [`AlwaysRefCounted`] for any GEM object type.
> +/// A macro for implementing [`RefCounted`] for any GEM object type.
> ///
> /// Since all GEM objects use the same refcounting scheme.
> #[macro_export]
> @@ -42,7 +42,7 @@ impl $( <$( $tparam_id:ident ),+> )? for $type:ty
> )?
> ) => {
> // SAFETY: All GEM objects are refcounted.
> - unsafe impl $( <$( $tparam_id ),+> )? $crate::sync::aref::AlwaysRefCounted for $type
> + unsafe impl $( <$( $tparam_id ),+> )? $crate::sync::aref::RefCounted for $type
> where
> Self: IntoGEMObject,
> $( $( $bind_param : $bind_trait ),+ )?
> @@ -61,6 +61,14 @@ unsafe fn dec_ref(obj: core::ptr::NonNull<Self>) {
> unsafe { bindings::drm_gem_object_put(obj) };
> }
> }
> +
> + // SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<$type>` from a
> + // `&$type`.
> + unsafe impl $( <$( $tparam_id ),+> )? $crate::sync::aref::AlwaysRefCounted for $type
> + where
> + Self: IntoGEMObject,
> + $( $( $bind_param : $bind_trait ),+ )?
> + {}
> };
> }
> #[cfg_attr(not(CONFIG_RUST_DRM_GEM_SHMEM_HELPER), allow(unused))]
> @@ -98,7 +106,7 @@ fn close(_obj: &<Self::Driver as drm::Driver>::Object, _file: &DriverFile<Self>)
> }
>
> /// Trait that represents a GEM object subtype
> -pub trait IntoGEMObject: Sized + super::private::Sealed + AlwaysRefCounted {
> +pub trait IntoGEMObject: Sized + super::private::Sealed + RefCounted {
> /// Returns a reference to the raw `drm_gem_object` structure, which must be valid as long as
> /// this owning object is valid.
> fn as_raw(&self) -> *mut bindings::drm_gem_object;
> diff --git a/rust/kernel/fs/file.rs b/rust/kernel/fs/file.rs
> index 23ee689bd240..06e457d62a93 100644
> --- a/rust/kernel/fs/file.rs
> +++ b/rust/kernel/fs/file.rs
> @@ -12,8 +12,8 @@
> cred::Credential,
> error::{code::*, to_result, Error, Result},
> fmt,
> - sync::aref::{ARef, AlwaysRefCounted},
> - types::{NotThreadSafe, Opaque},
> + sync::aref::RefCounted,
> + types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
> };
> use core::ptr;
>
> @@ -197,7 +197,7 @@ unsafe impl Sync for File {}
>
> // SAFETY: The type invariants guarantee that `File` is always ref-counted. This implementation
> // makes `ARef<File>` own a normal refcount.
> -unsafe impl AlwaysRefCounted for File {
> +unsafe impl RefCounted for File {
> #[inline]
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference means that the refcount is nonzero.
> @@ -212,6 +212,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<File>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<File>` from a
> +// `&File`.
> +unsafe impl AlwaysRefCounted for File {}
> +
> /// Wraps the kernel's `struct file`. Not thread safe.
> ///
> /// This type represents a file that is not known to be safe to transfer across thread boundaries.
> @@ -233,7 +237,7 @@ pub struct LocalFile {
>
> // SAFETY: The type invariants guarantee that `LocalFile` is always ref-counted. This implementation
> // makes `ARef<LocalFile>` own a normal refcount.
> -unsafe impl AlwaysRefCounted for LocalFile {
> +unsafe impl RefCounted for LocalFile {
> #[inline]
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference means that the refcount is nonzero.
> @@ -249,6 +253,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<LocalFile>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<LocalFile>` from a
> +// `&LocalFile`.
> +unsafe impl AlwaysRefCounted for LocalFile {}
> +
> impl LocalFile {
> /// Constructs a new `struct file` wrapper from a file descriptor.
> ///
> diff --git a/rust/kernel/i2c.rs b/rust/kernel/i2c.rs
> index 7b908f0c5a58..56791c1d63d7 100644
> --- a/rust/kernel/i2c.rs
> +++ b/rust/kernel/i2c.rs
> @@ -18,7 +18,8 @@
> prelude::*,
> sync::aref::{
> ARef,
> - AlwaysRefCounted, //
> + AlwaysRefCounted,
> + RefCounted, //
> },
> types::Opaque, //
> };
> @@ -415,7 +416,7 @@ pub fn get(index: i32) -> Result<ARef<Self>> {
> kernel::impl_device_context_into_aref!(I2cAdapter);
>
> // SAFETY: Instances of `I2cAdapter` are always reference-counted.
> -unsafe impl AlwaysRefCounted for I2cAdapter {
> +unsafe impl RefCounted for I2cAdapter {
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
> unsafe { bindings::i2c_get_adapter(self.index()) };
> @@ -426,6 +427,9 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> unsafe { bindings::i2c_put_adapter(obj.as_ref().as_raw()) }
> }
> }
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from an
> +// `&I2cAdapter`.
> +unsafe impl AlwaysRefCounted for I2cAdapter {}
>
> /// The i2c board info representation
> ///
> @@ -491,7 +495,7 @@ unsafe impl<Ctx: device::DeviceContext> device::AsBusDevice<Ctx> for I2cClient<C
> kernel::impl_device_context_into_aref!(I2cClient);
>
> // SAFETY: Instances of `I2cClient` are always reference-counted.
> -unsafe impl AlwaysRefCounted for I2cClient {
> +unsafe impl RefCounted for I2cClient {
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
> unsafe { bindings::get_device(self.as_ref().as_raw()) };
> @@ -502,6 +506,9 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> unsafe { bindings::put_device(&raw mut (*obj.as_ref().as_raw()).dev) }
> }
> }
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from an
> +// `&I2cClient`.
> +unsafe impl AlwaysRefCounted for I2cClient {}
>
> impl<Ctx: device::DeviceContext> AsRef<device::Device<Ctx>> for I2cClient<Ctx> {
> fn as_ref(&self) -> &device::Device<Ctx> {
> diff --git a/rust/kernel/mm.rs b/rust/kernel/mm.rs
> index 4764d7b68f2a..dd9e3969e720 100644
> --- a/rust/kernel/mm.rs
> +++ b/rust/kernel/mm.rs
> @@ -13,8 +13,8 @@
>
> use crate::{
> bindings,
> - sync::aref::{ARef, AlwaysRefCounted},
> - types::{NotThreadSafe, Opaque},
> + sync::aref::RefCounted,
> + types::{ARef, AlwaysRefCounted, NotThreadSafe, Opaque},
> };
> use core::{ops::Deref, ptr::NonNull};
>
> @@ -55,7 +55,7 @@ unsafe impl Send for Mm {}
> unsafe impl Sync for Mm {}
>
> // SAFETY: By the type invariants, this type is always refcounted.
> -unsafe impl AlwaysRefCounted for Mm {
> +unsafe impl RefCounted for Mm {
> #[inline]
> fn inc_ref(&self) {
> // SAFETY: The pointer is valid since self is a reference.
> @@ -69,6 +69,9 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Mm>` from a `&Mm`.
> +unsafe impl AlwaysRefCounted for Mm {}
> +
> /// A wrapper for the kernel's `struct mm_struct`.
> ///
> /// This type is like [`Mm`], but with non-zero `mm_users`. It can only be used when `mm_users` can
> @@ -91,7 +94,7 @@ unsafe impl Send for MmWithUser {}
> unsafe impl Sync for MmWithUser {}
>
> // SAFETY: By the type invariants, this type is always refcounted.
> -unsafe impl AlwaysRefCounted for MmWithUser {
> +unsafe impl RefCounted for MmWithUser {
> #[inline]
> fn inc_ref(&self) {
> // SAFETY: The pointer is valid since self is a reference.
> @@ -105,6 +108,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<MmWithUser>` from a
> +// `&MmWithUser`.
> +unsafe impl AlwaysRefCounted for MmWithUser {}
> +
> // Make all `Mm` methods available on `MmWithUser`.
> impl Deref for MmWithUser {
> type Target = Mm;
> diff --git a/rust/kernel/mm/mmput_async.rs b/rust/kernel/mm/mmput_async.rs
> index b8d2f051225c..aba4ce675c86 100644
> --- a/rust/kernel/mm/mmput_async.rs
> +++ b/rust/kernel/mm/mmput_async.rs
> @@ -10,7 +10,8 @@
> use crate::{
> bindings,
> mm::MmWithUser,
> - sync::aref::{ARef, AlwaysRefCounted},
> + sync::aref::RefCounted,
> + types::{ARef, AlwaysRefCounted},
> };
> use core::{ops::Deref, ptr::NonNull};
>
> @@ -34,7 +35,7 @@ unsafe impl Send for MmWithUserAsync {}
> unsafe impl Sync for MmWithUserAsync {}
>
> // SAFETY: By the type invariants, this type is always refcounted.
> -unsafe impl AlwaysRefCounted for MmWithUserAsync {
> +unsafe impl RefCounted for MmWithUserAsync {
> #[inline]
> fn inc_ref(&self) {
> // SAFETY: The pointer is valid since self is a reference.
> @@ -48,6 +49,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<MmWithUserAsync>`
> +// from a `&MmWithUserAsync`.
> +unsafe impl AlwaysRefCounted for MmWithUserAsync {}
> +
> // Make all `MmWithUser` methods available on `MmWithUserAsync`.
> impl Deref for MmWithUserAsync {
> type Target = MmWithUser;
> diff --git a/rust/kernel/opp.rs b/rust/kernel/opp.rs
> index a760fac28765..06fe2ca776a4 100644
> --- a/rust/kernel/opp.rs
> +++ b/rust/kernel/opp.rs
> @@ -16,8 +16,8 @@
> ffi::{c_char, c_ulong},
> prelude::*,
> str::CString,
> - sync::aref::{ARef, AlwaysRefCounted},
> - types::Opaque,
> + sync::aref::RefCounted,
> + types::{ARef, AlwaysRefCounted, Opaque},
> };
>
> #[cfg(CONFIG_CPU_FREQ)]
> @@ -1041,7 +1041,7 @@ unsafe impl Send for OPP {}
> unsafe impl Sync for OPP {}
>
> /// SAFETY: The type invariants guarantee that [`OPP`] is always refcounted.
> -unsafe impl AlwaysRefCounted for OPP {
> +unsafe impl RefCounted for OPP {
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference means that the refcount is nonzero.
> unsafe { bindings::dev_pm_opp_get(self.0.get()) };
> @@ -1053,6 +1053,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<OPP>` from an
> +// `&OPP`.
> +unsafe impl AlwaysRefCounted for OPP {}
> +
> impl OPP {
> /// Creates an owned reference to a [`OPP`] from a valid pointer.
> ///
> diff --git a/rust/kernel/owned.rs b/rust/kernel/owned.rs
> index 5eacdf327d12..bedd4fef84fa 100644
> --- a/rust/kernel/owned.rs
> +++ b/rust/kernel/owned.rs
> @@ -27,7 +27,7 @@
> ///
> /// Note: The underlying object is not required to provide internal reference counting, because it
> /// represents a unique, owned reference. If reference counting (on the Rust side) is required,
> -/// [`AlwaysRefCounted`](crate::types::AlwaysRefCounted) should be implemented.
> +/// [`RefCounted`](crate::types::RefCounted) should be implemented.
> ///
> /// # Examples
> ///
> diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
> index af74ddff6114..acf7384fea02 100644
> --- a/rust/kernel/pci.rs
> +++ b/rust/kernel/pci.rs
> @@ -19,6 +19,10 @@
> },
> prelude::*,
> str::CStr,
> + sync::aref::{
> + AlwaysRefCounted,
> + RefCounted, //
> + },
> types::Opaque,
> ThisModule, //
> };
> @@ -474,7 +478,7 @@ unsafe impl<Ctx: device::DeviceContext> device::AsBusDevice<Ctx> for Device<Ctx>
> impl crate::dma::Device for Device<device::Core> {}
>
> // SAFETY: Instances of `Device` are always reference-counted.
> -unsafe impl crate::sync::aref::AlwaysRefCounted for Device {
> +unsafe impl RefCounted for Device {
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
> unsafe { bindings::pci_dev_get(self.as_raw()) };
> @@ -486,6 +490,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
> +// `&Device`.
> +unsafe impl AlwaysRefCounted for Device {}
> +
> impl<Ctx: device::DeviceContext> AsRef<device::Device<Ctx>> for Device<Ctx> {
> fn as_ref(&self) -> &device::Device<Ctx> {
> // SAFETY: By the type invariant of `Self`, `self.as_raw()` is a pointer to a valid
> diff --git a/rust/kernel/pid_namespace.rs b/rust/kernel/pid_namespace.rs
> index 979a9718f153..4f6a94540e33 100644
> --- a/rust/kernel/pid_namespace.rs
> +++ b/rust/kernel/pid_namespace.rs
> @@ -7,7 +7,11 @@
> //! C header: [`include/linux/pid_namespace.h`](srctree/include/linux/pid_namespace.h) and
> //! [`include/linux/pid.h`](srctree/include/linux/pid.h)
>
> -use crate::{bindings, sync::aref::AlwaysRefCounted, types::Opaque};
> +use crate::{
> + bindings,
> + sync::aref::RefCounted,
> + types::{AlwaysRefCounted, Opaque},
> +};
> use core::ptr;
>
> /// Wraps the kernel's `struct pid_namespace`. Thread safe.
> @@ -41,7 +45,7 @@ pub unsafe fn from_ptr<'a>(ptr: *const bindings::pid_namespace) -> &'a Self {
> }
>
> // SAFETY: Instances of `PidNamespace` are always reference-counted.
> -unsafe impl AlwaysRefCounted for PidNamespace {
> +unsafe impl RefCounted for PidNamespace {
> #[inline]
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference means that the refcount is nonzero.
> @@ -55,6 +59,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<PidNamespace>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<PidNamespace>` from
> +// a `&PidNamespace`.
> +unsafe impl AlwaysRefCounted for PidNamespace {}
> +
> // SAFETY:
> // - `PidNamespace::dec_ref` can be called from any thread.
> // - It is okay to send ownership of `PidNamespace` across thread boundaries.
> diff --git a/rust/kernel/platform.rs b/rust/kernel/platform.rs
> index 8917d4ee499f..3c35aa94e319 100644
> --- a/rust/kernel/platform.rs
> +++ b/rust/kernel/platform.rs
> @@ -27,6 +27,7 @@
> },
> of,
> prelude::*,
> + sync::aref::{AlwaysRefCounted, RefCounted},
> types::Opaque,
> ThisModule, //
> };
> @@ -512,7 +513,7 @@ pub fn optional_irq_by_name(&self, name: &CStr) -> Result<IrqRequest<'_>> {
> impl crate::dma::Device for Device<device::Core> {}
>
> // SAFETY: Instances of `Device` are always reference-counted.
> -unsafe impl crate::sync::aref::AlwaysRefCounted for Device {
> +unsafe impl RefCounted for Device {
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference guarantees that the refcount is non-zero.
> unsafe { bindings::get_device(self.as_ref().as_raw()) };
> @@ -524,6 +525,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
> +// `&Device`.
> +unsafe impl AlwaysRefCounted for Device {}
> +
> impl<Ctx: device::DeviceContext> AsRef<device::Device<Ctx>> for Device<Ctx> {
> fn as_ref(&self) -> &device::Device<Ctx> {
> // SAFETY: By the type invariant of `Self`, `self.as_raw()` is a pointer to a valid
> diff --git a/rust/kernel/sync/aref.rs b/rust/kernel/sync/aref.rs
> index 4ee5fac0e0b6..2d656f672b97 100644
> --- a/rust/kernel/sync/aref.rs
> +++ b/rust/kernel/sync/aref.rs
> @@ -19,11 +19,9 @@
>
> use core::{marker::PhantomData, mem::ManuallyDrop, ops::Deref, ptr::NonNull};
>
> -/// Types that are _always_ reference counted.
> +/// Types that are internally reference counted.
> ///
> /// It allows such types to define their own custom ref increment and decrement functions.
> -/// Additionally, it allows users to convert from a shared reference `&T` to an owned reference
> -/// [`ARef<T>`].
> ///
> /// This is usually implemented by wrappers to existing structures on the C side of the code. For
> /// Rust code, the recommendation is to use [`Arc`](crate::sync::Arc) to create reference-counted
> @@ -40,9 +38,8 @@
> /// at least until matching decrements are performed.
> ///
> /// Implementers must also ensure that all instances are reference-counted. (Otherwise they
> -/// won't be able to honour the requirement that [`AlwaysRefCounted::inc_ref`] keep the object
> -/// alive.)
> -pub unsafe trait AlwaysRefCounted {
> +/// won't be able to honour the requirement that [`RefCounted::inc_ref`] keep the object alive.)
> +pub unsafe trait RefCounted {
> /// Increments the reference count on the object.
> fn inc_ref(&self);
>
> @@ -55,11 +52,27 @@ pub unsafe trait AlwaysRefCounted {
> /// Callers must ensure that there was a previous matching increment to the reference count,
> /// and that the object is no longer used after its reference count is decremented (as it may
> /// result in the object being freed), unless the caller owns another increment on the refcount
> - /// (e.g., it calls [`AlwaysRefCounted::inc_ref`] twice, then calls
> - /// [`AlwaysRefCounted::dec_ref`] once).
> + /// (e.g., it calls [`RefCounted::inc_ref`] twice, then calls [`RefCounted::dec_ref`] once).
> unsafe fn dec_ref(obj: NonNull<Self>);
> }
>
> +/// Always reference-counted type.
> +///
> +/// It allows deriving a counted reference [`ARef<T>`] from a `&T`.
> +///
> +/// This provides some convenience, but it allows "escaping" borrow checks on `&T`. As it
> +/// complicates attempts to ensure that a reference to T is unique, it is optional to provide for
> +/// [`RefCounted`] types. See *Safety* below.
> +///
> +/// # Safety
> +///
> +/// Implementers must ensure that no safety invariants are violated by upgrading an `&T` to an
> +/// [`ARef<T>`]. In particular that implies [`AlwaysRefCounted`] and [`crate::types::Ownable`]
> +/// cannot be implemented for the same type, as this would allow violating the uniqueness guarantee
> +/// of [`crate::types::Owned<T>`] by dereferencing it into an `&T` and obtaining an [`ARef`] from
> +/// that.
> +pub unsafe trait AlwaysRefCounted: RefCounted {}
> +
> /// An owned reference to an always-reference-counted object.
> ///
> /// The object's reference count is automatically decremented when an instance of [`ARef`] is
> @@ -70,7 +83,7 @@ pub unsafe trait AlwaysRefCounted {
> ///
> /// The pointer stored in `ptr` is non-null and valid for the lifetime of the [`ARef`] instance. In
> /// particular, the [`ARef`] instance owns an increment on the underlying object's reference count.
> -pub struct ARef<T: AlwaysRefCounted> {
> +pub struct ARef<T: RefCounted> {
> ptr: NonNull<T>,
> _p: PhantomData<T>,
> }
> @@ -79,19 +92,19 @@ pub struct ARef<T: AlwaysRefCounted> {
> // it effectively means sharing `&T` (which is safe because `T` is `Sync`); additionally, it needs
> // `T` to be `Send` because any thread that has an `ARef<T>` may ultimately access `T` using a
> // mutable reference, for example, when the reference count reaches zero and `T` is dropped.
> -unsafe impl<T: AlwaysRefCounted + Sync + Send> Send for ARef<T> {}
> +unsafe impl<T: RefCounted + Sync + Send> Send for ARef<T> {}
>
> // SAFETY: It is safe to send `&ARef<T>` to another thread when the underlying `T` is `Sync`
> // because it effectively means sharing `&T` (which is safe because `T` is `Sync`); additionally,
> // it needs `T` to be `Send` because any thread that has a `&ARef<T>` may clone it and get an
> // `ARef<T>` on that thread, so the thread may ultimately access `T` using a mutable reference, for
> // example, when the reference count reaches zero and `T` is dropped.
> -unsafe impl<T: AlwaysRefCounted + Sync + Send> Sync for ARef<T> {}
> +unsafe impl<T: RefCounted + Sync + Send> Sync for ARef<T> {}
>
> // Even if `T` is pinned, pointers to `T` can still move.
> -impl<T: AlwaysRefCounted> Unpin for ARef<T> {}
> +impl<T: RefCounted> Unpin for ARef<T> {}
>
> -impl<T: AlwaysRefCounted> ARef<T> {
> +impl<T: RefCounted> ARef<T> {
> /// Creates a new instance of [`ARef`].
> ///
> /// It takes over an increment of the reference count on the underlying object.
> @@ -120,12 +133,12 @@ pub unsafe fn from_raw(ptr: NonNull<T>) -> Self {
> ///
> /// ```
> /// use core::ptr::NonNull;
> - /// use kernel::sync::aref::{ARef, AlwaysRefCounted};
> + /// use kernel::sync::aref::{ARef, RefCounted};
> ///
> /// struct Empty {}
> ///
> /// # // SAFETY: TODO.
> - /// unsafe impl AlwaysRefCounted for Empty {
> + /// unsafe impl RefCounted for Empty {
> /// fn inc_ref(&self) {}
> /// unsafe fn dec_ref(_obj: NonNull<Self>) {}
> /// }
> @@ -143,7 +156,7 @@ pub fn into_raw(me: Self) -> NonNull<T> {
> }
> }
>
> -impl<T: AlwaysRefCounted> Clone for ARef<T> {
> +impl<T: RefCounted> Clone for ARef<T> {
> fn clone(&self) -> Self {
> self.inc_ref();
> // SAFETY: We just incremented the refcount above.
> @@ -151,7 +164,7 @@ fn clone(&self) -> Self {
> }
> }
>
> -impl<T: AlwaysRefCounted> Deref for ARef<T> {
> +impl<T: RefCounted> Deref for ARef<T> {
> type Target = T;
>
> fn deref(&self) -> &Self::Target {
> @@ -168,7 +181,7 @@ fn from(b: &T) -> Self {
> }
> }
>
> -impl<T: AlwaysRefCounted> Drop for ARef<T> {
> +impl<T: RefCounted> Drop for ARef<T> {
> fn drop(&mut self) {
> // SAFETY: The type invariants guarantee that the `ARef` owns the reference we're about to
> // decrement.
> diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs
> index 38273f4eedb5..6259430b0ca3 100644
> --- a/rust/kernel/task.rs
> +++ b/rust/kernel/task.rs
> @@ -10,7 +10,12 @@
> pid_namespace::PidNamespace,
> prelude::*,
> sync::aref::ARef,
> - types::{NotThreadSafe, Opaque},
> + types::{
> + AlwaysRefCounted,
> + NotThreadSafe,
> + Opaque,
> + RefCounted, //
> + },
> };
> use core::{
> ops::Deref,
> @@ -347,7 +352,7 @@ pub fn group_leader(&self) -> &Task {
> }
>
> // SAFETY: The type invariants guarantee that `Task` is always refcounted.
> -unsafe impl crate::sync::aref::AlwaysRefCounted for Task {
> +unsafe impl RefCounted for Task {
> #[inline]
> fn inc_ref(&self) {
> // SAFETY: The existence of a shared reference means that the refcount is nonzero.
> @@ -361,6 +366,10 @@ unsafe fn dec_ref(obj: ptr::NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Task>` from a
> +// `&Task`.
> +unsafe impl AlwaysRefCounted for Task {}
> +
> impl PartialEq for Task {
> #[inline]
> fn eq(&self, other: &Self) -> bool {
> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
> index 4aec7b699269..9b96aa2ebdb7 100644
> --- a/rust/kernel/types.rs
> +++ b/rust/kernel/types.rs
> @@ -18,7 +18,8 @@
> },
> sync::aref::{
> ARef,
> - AlwaysRefCounted, //
> + AlwaysRefCounted,
> + RefCounted, //
> }, //
> };
>
> diff --git a/rust/kernel/usb.rs b/rust/kernel/usb.rs
> index 9c17a672cd27..90b13e65cc82 100644
> --- a/rust/kernel/usb.rs
> +++ b/rust/kernel/usb.rs
> @@ -18,7 +18,10 @@
> to_result, //
> },
> prelude::*,
> - sync::aref::AlwaysRefCounted,
> + sync::aref::{
> + AlwaysRefCounted,
> + RefCounted, //
> + },
> types::Opaque,
> ThisModule, //
> };
> @@ -381,7 +384,7 @@ fn as_ref(&self) -> &Device {
> }
>
> // SAFETY: Instances of `Interface` are always reference-counted.
> -unsafe impl AlwaysRefCounted for Interface {
> +unsafe impl RefCounted for Interface {
> fn inc_ref(&self) {
> // SAFETY: The invariants of `Interface` guarantee that `self.as_raw()`
> // returns a valid `struct usb_interface` pointer, for which we will
> @@ -395,6 +398,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Interface>` from a
> +// `&Interface`.
> +unsafe impl AlwaysRefCounted for Interface {}
> +
> // SAFETY: A `Interface` is always reference-counted and can be released from any thread.
> unsafe impl Send for Interface {}
>
> @@ -432,7 +439,7 @@ fn as_raw(&self) -> *mut bindings::usb_device {
> kernel::impl_device_context_into_aref!(Device);
>
> // SAFETY: Instances of `Device` are always reference-counted.
> -unsafe impl AlwaysRefCounted for Device {
> +unsafe impl RefCounted for Device {
> fn inc_ref(&self) {
> // SAFETY: The invariants of `Device` guarantee that `self.as_raw()`
> // returns a valid `struct usb_device` pointer, for which we will
> @@ -446,6 +453,10 @@ unsafe fn dec_ref(obj: NonNull<Self>) {
> }
> }
>
> +// SAFETY: We do not implement `Ownable`, thus it is okay to obtain an `ARef<Device>` from a
> +// `&Device`.
> +unsafe impl AlwaysRefCounted for Device {}
> +
> impl<Ctx: device::DeviceContext> AsRef<device::Device<Ctx>> for Device<Ctx> {
> fn as_ref(&self) -> &device::Device<Ctx> {
> // SAFETY: By the type invariant of `Self`, `self.as_raw()` is a pointer to a valid
>
> --
> 2.51.2
>
>
^ permalink raw reply
* Re: [PATCH v17 08/10] rust: aref: update formatting of use statements
From: Onur Özkan @ 2026-06-23 17:55 UTC (permalink / raw)
To: Andreas Hindborg
Cc: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, linux-kernel, rust-for-linux, linux-block,
linux-security-module, dri-devel, linux-fsdevel, linux-mm,
linux-pm, linux-pci, driver-core
In-Reply-To: <20260604-unique-ref-v17-8-7b4c3d2930b9@kernel.org>
On Thu, 04 Jun 2026 22:11:20 +0200
Andreas Hindborg <a.hindborg@kernel.org> wrote:
> Update formatting if use statements in preparation for next commit.
I guess you meant "formatting use statements"? Also, why not doing this in
the next commit directly?
Onur
>
> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
> ---
> rust/kernel/sync/aref.rs | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/rust/kernel/sync/aref.rs b/rust/kernel/sync/aref.rs
> index 7491382bcf29..818c84fa923a 100644
> --- a/rust/kernel/sync/aref.rs
> +++ b/rust/kernel/sync/aref.rs
> @@ -17,7 +17,12 @@
> //! [`Arc`]: crate::sync::Arc
> //! [`Arc<T>`]: crate::sync::Arc
>
> -use core::{marker::PhantomData, mem::ManuallyDrop, ops::Deref, ptr::NonNull};
> +use core::{
> + marker::PhantomData,
> + mem::ManuallyDrop,
> + ops::Deref,
> + ptr::NonNull, //
> +};
>
> /// Types that are internally reference counted.
> ///
>
> --
> 2.51.2
>
>
^ permalink raw reply
* Re: [PATCH v17 10/10] rust: page: add `from_raw()`
From: Onur Özkan @ 2026-06-23 17:52 UTC (permalink / raw)
To: Andreas Hindborg
Cc: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Alice Ryhl, Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman,
Dave Ertman, Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, linux-kernel, rust-for-linux, linux-block,
linux-security-module, dri-devel, linux-fsdevel, linux-mm,
linux-pm, linux-pci, driver-core, Onur Özkan
In-Reply-To: <20260604-unique-ref-v17-10-7b4c3d2930b9@kernel.org>
On Thu, 04 Jun 2026 22:11:22 +0200
Andreas Hindborg <a.hindborg@kernel.org> wrote:
> From: Andreas Hindborg <a.hindborg@samsung.com>
>
> Add a method to `Page` that allows construction of an instance from `struct
> page` pointer.
>
> Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com>
> ---
> rust/kernel/page.rs | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
> index 844c75e54134..d56ae597f692 100644
> --- a/rust/kernel/page.rs
> +++ b/rust/kernel/page.rs
> @@ -214,6 +214,18 @@ pub fn nid(&self) -> i32 {
> unsafe { bindings::page_to_nid(self.as_ptr()) }
> }
>
> + /// Create a `&Page` from a raw `struct page` pointer.
> + ///
> + /// # Safety
> + ///
> + /// `ptr` must be convertible to a shared reference with a lifetime of `'a`.
> + #[inline]
> + pub unsafe fn from_raw<'a>(ptr: *const bindings::page) -> &'a Self {
> + // SAFETY: By function safety requirements, `ptr` is not null and is convertible to a shared
> + // reference.
> + unsafe { &*ptr.cast() }
> + }
> +
> /// Runs a piece of code with this page mapped to an address.
> ///
> /// The page is unmapped when this call returns.
>
> --
> 2.51.2
>
>
Reviewed-by: Onur Özkan <work@onurozkan.dev>
^ permalink raw reply
* Re: [PATCH v17 02/10] rust: types: Add Ownable/Owned types
From: Andreas Hindborg @ 2026-06-23 13:09 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman, Dave Ertman,
Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, linux-kernel, rust-for-linux, linux-block,
linux-security-module, dri-devel, linux-fsdevel, linux-mm,
linux-pm, linux-pci, driver-core, Asahi Lina, Oliver Mangold
In-Reply-To: <ajE5c-5gZtJRoadx@google.com>
Alice Ryhl <aliceryhl@google.com> writes:
> On Thu, Jun 04, 2026 at 10:11:14PM +0200, Andreas Hindborg wrote:
>> From: Asahi Lina <lina+kernel@asahilina.net>
>>
>> By analogy to `AlwaysRefCounted` and `ARef`, an `Ownable` type is a
>> (typically C FFI) type that *may* be owned by Rust, but need not be. Unlike
>> `AlwaysRefCounted`, this mechanism expects the reference to be unique
>> within Rust, and does not allow cloning.
>>
>> Conceptually, this is similar to a `KBox<T>`, except that it delegates
>> resource management to the `T` instead of using a generic allocator.
>>
>> [ om:
>> - Split code into separate file and `pub use` it from types.rs.
>> - Make from_raw() and into_raw() public.
>> - Remove OwnableMut, and make DerefMut dependent on Unpin instead.
>> - Usage example/doctest for Ownable/Owned.
>> - Fixes to documentation and commit message.
>> ]
>>
>> Link: https://lore.kernel.org/all/20250202-rust-page-v1-1-e3170d7fe55e@asahilina.net/
>> Signed-off-by: Asahi Lina <lina+kernel@asahilina.net>
>> Co-developed-by: Oliver Mangold <oliver.mangold@pm.me>
>> Signed-off-by: Oliver Mangold <oliver.mangold@pm.me>
>> Reviewed-by: Boqun Feng <boqun.feng@gmail.com>
>> Reviewed-by: Daniel Almeida <daniel.almeida@collabora.com>
>> [ Andreas: Updated documentation, examples, and formatting. Change safety
>> requirements, safety comments. Use a reference for `release`. ]
>> Reviewed-by: Gary Guo <gary@garyguo.net>
>> Co-developed-by: Andreas Hindborg <a.hindborg@kernel.org>
>> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
>
> Overall looks good to me, but two nits below. With them fixed:
>
> Reviewed-by: Alice Ryhl <aliceryhl@google.com>
>
>> +pub trait Ownable {
>> + /// Tear down this `Ownable`.
>> + ///
>> + /// Implementers of `Ownable` can use this function to clean up the use of `Self`. This can
>> + /// include freeing the underlying object.
>> + ///
>> + /// # Safety
>> + ///
>> + /// Callers must ensure that the caller has exclusive ownership of `T`, and this ownership can
>> + /// be transferred to the `release` method.
>> + unsafe fn release(&mut self);
>
> I'd make this take a raw pointer because the pointer can be freed during
> the execution of release(), which references don't allow.
Ok.
>
>> diff --git a/rust/kernel/types.rs b/rust/kernel/types.rs
>> index 4329d3c2c2e5..4aec7b699269 100644
>> --- a/rust/kernel/types.rs
>> +++ b/rust/kernel/types.rs
>> @@ -11,6 +11,17 @@
>> };
>> use pin_init::{PinInit, Wrapper, Zeroable};
>>
>> +pub use crate::{
>> + owned::{
>> + Ownable,
>> + Owned, //
>> + },
>> + sync::aref::{
>> + ARef,
>> + AlwaysRefCounted, //
>> + }, //
>> +};
>
> We removed the types::ARef re-export, so you shouldn't add it back.
Looks like a rebase failure, I will remove it.
Best regards,
Andreas Hindborg
^ permalink raw reply
* Re: [PATCH v17 05/10] rust: page: convert to `Ownable`
From: Andreas Hindborg @ 2026-06-23 13:11 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman, Dave Ertman,
Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, linux-kernel, rust-for-linux, linux-block,
linux-security-module, dri-devel, linux-fsdevel, linux-mm,
linux-pm, linux-pci, driver-core, Asahi Lina
In-Reply-To: <CAH5fLgggdf0CM0o4Oa1VbY+y8gQxrU0VU5z_yB4GWCxnbqsFuQ@mail.gmail.com>
Alice Ryhl <aliceryhl@google.com> writes:
> On Thu, Jun 4, 2026 at 10:14 PM Andreas Hindborg <a.hindborg@kernel.org> wrote:
>>
>> From: Asahi Lina <lina@asahilina.net>
>>
>> This allows Page references to be returned as borrowed references,
>> without necessarily owning the struct page.
>>
>> Signed-off-by: Asahi Lina <lina@asahilina.net>
>> [ Andreas: Fix formatting and add a safety comment. ]
>> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
>
> This will not compile unless Rust Binder is also updated.
I'll be sure to build test against binder.
Best regards,
Andreas Hindborg
^ permalink raw reply
* Re: [PATCH v17 05/10] rust: page: convert to `Ownable`
From: Andreas Hindborg @ 2026-06-23 13:12 UTC (permalink / raw)
To: Alice Ryhl
Cc: Miguel Ojeda, Gary Guo, Björn Roy Baron, Benno Lossin,
Trevor Gross, Danilo Krummrich, Greg Kroah-Hartman, Dave Ertman,
Ira Weiny, Leon Romanovsky, Paul Moore, Serge Hallyn,
Rafael J. Wysocki, David Airlie, Simona Vetter, Alexander Viro,
Christian Brauner, Jan Kara, Daniel Almeida, Viresh Kumar,
Nishanth Menon, Stephen Boyd, Bjorn Helgaas,
Krzysztof Wilczyński, Boqun Feng, Uladzislau Rezki,
Lorenzo Stoakes, Vlastimil Babka, Liam R. Howlett, Igor Korotin,
Pavel Tikhomirov, linux-kernel, rust-for-linux, linux-block,
linux-security-module, dri-devel, linux-fsdevel, linux-mm,
linux-pm, linux-pci, driver-core, Asahi Lina
In-Reply-To: <ajKGv5ioLSassmND@google.com>
Alice Ryhl <aliceryhl@google.com> writes:
> On Thu, Jun 04, 2026 at 10:11:17PM +0200, Andreas Hindborg wrote:
>> From: Asahi Lina <lina@asahilina.net>
>>
>> This allows Page references to be returned as borrowed references,
>> without necessarily owning the struct page.
>>
>> Signed-off-by: Asahi Lina <lina@asahilina.net>
>> [ Andreas: Fix formatting and add a safety comment. ]
>> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
>> ---
>> rust/kernel/page.rs | 38 +++++++++++++++++++++++++-------------
>> 1 file changed, 25 insertions(+), 13 deletions(-)
>>
>> diff --git a/rust/kernel/page.rs b/rust/kernel/page.rs
>> index 3bdcee0e16a8..844c75e54134 100644
>> --- a/rust/kernel/page.rs
>> +++ b/rust/kernel/page.rs
>> @@ -10,6 +10,11 @@
>> bindings,
>> error::code::*,
>> error::Result,
>> + types::{
>> + Opaque,
>> + Ownable,
>> + Owned, //
>> + },
>> uaccess::UserSliceReader, //
>> };
>> use core::{
>> @@ -105,7 +110,7 @@ pub const fn page_align(addr: usize) -> Option<usize> {
>> ///
>> /// [`VBox`]: kernel::alloc::VBox
>> /// [`Vmalloc`]: kernel::alloc::allocator::Vmalloc
>> -pub struct BorrowedPage<'a>(ManuallyDrop<Page>, PhantomData<&'a Page>);
>> +pub struct BorrowedPage<'a>(ManuallyDrop<NonNull<Page>>, PhantomData<&'a Page>);
>
> BorrowedPage<'a> is no longer needed because it's just &Page.
I'll add some cleaning then.
Best regards,
Andreas Hindborg
^ permalink raw reply
* Re: [PATCH v17 03/10] rust: implement `ForeignOwnable` for `Owned`
From: Andreas Hindborg @ 2026-06-23 13:10 UTC (permalink / raw)
To: Gary Guo, Miguel Ojeda, Gary Guo, Björn Roy Baron,
Benno Lossin, Alice Ryhl, Trevor Gross, Danilo Krummrich,
Greg Kroah-Hartman, Dave Ertman, Ira Weiny, Leon Romanovsky,
Paul Moore, Serge Hallyn, Rafael J. Wysocki, David Airlie,
Simona Vetter, Alexander Viro, Christian Brauner, Jan Kara,
Daniel Almeida, Viresh Kumar, Nishanth Menon, Stephen Boyd,
Bjorn Helgaas, Krzysztof Wilczyński, Boqun Feng,
Uladzislau Rezki, Lorenzo Stoakes, Vlastimil Babka,
Liam R. Howlett, Igor Korotin, Pavel Tikhomirov
Cc: linux-kernel, rust-for-linux, linux-block, linux-security-module,
dri-devel, linux-fsdevel, linux-mm, linux-pm, linux-pci,
driver-core
In-Reply-To: <DJAGDGPKZ4HX.M47NMAU53PCJ@garyguo.net>
"Gary Guo" <gary@garyguo.net> writes:
> On Thu Jun 4, 2026 at 9:11 PM BST, Andreas Hindborg wrote:
>> Implement `ForeignOwnable` for `Owned<T>`. This allows use of `Owned<T>` in
>> places such as the `XArray`.
>>
>> Note that `T` does not need to implement `ForeignOwnable` for `Owned<T>` to
>> implement `ForeignOwnable`.
>>
>> Signed-off-by: Andreas Hindborg <a.hindborg@kernel.org>
>> ---
>> rust/kernel/owned.rs | 46 ++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 46 insertions(+)
>>
>> diff --git a/rust/kernel/owned.rs b/rust/kernel/owned.rs
>> index 456e239e906e..5eacdf327d12 100644
>> --- a/rust/kernel/owned.rs
>> +++ b/rust/kernel/owned.rs
>> @@ -15,6 +15,8 @@
>> ptr::NonNull, //
>> };
>>
>> +use kernel::types::ForeignOwnable;
>> +
>> /// Types that specify their own way of performing allocation and destruction. Typically, this trait
>> /// is implemented on types from the C side.
>> ///
>> @@ -108,6 +110,7 @@ pub trait Ownable {
>> ///
>> /// - Until `T::release` is called, this `Owned<T>` exclusively owns the underlying `T`.
>> /// - The `T` value is pinned.
>> +#[repr(transparent)]
>
> AFAIT this `#[repr(transparent)]` isn't actually needed.
I'll drop it.
>
>> pub struct Owned<T: Ownable> {
>> ptr: NonNull<T>,
>> }
>> @@ -185,3 +188,46 @@ fn drop(&mut self) {
>> unsafe { T::release(self.ptr.as_mut()) };
>> }
>> }
>> +
>> +// SAFETY: We derive the pointer to `T` from a valid `T`, so the returned
>> +// pointer satisfy alignment requirements of `T`.
>> +unsafe impl<T: Ownable + 'static> ForeignOwnable for Owned<T> {
>
> You should drop the `'static` bound and put where bound on the GAT below
> instead. See how `Box` is doing it.
I will take a look.
Best regards,
Andreas Hindborg
^ permalink raw reply
* Re: [PATCH] landlock: shrink tsync works[] on partial allocation failure
From: Günther Noack @ 2026-06-23 9:45 UTC (permalink / raw)
To: Peng Hao; +Cc: mic, linux-security-module
In-Reply-To: <20260623050127.48247-1-flyingpeng@tencent.com>
Hello Peng!
Thanks for your patch!
On Tue, Jun 23, 2026 at 01:01:27PM +0800, Peng Hao wrote:
> When the per-slot kzalloc fails mid-loop in tsync_works_grow_by(), the
> already-enlarged s->works array keeps uninitialized trailing entries.
> Shrink the array back to its used size on the error path so no waste
> is carried over: free it outright when nothing has been allocated yet,
> otherwise try a shrinking krealloc_array() (keep the larger array if
> the shrink fails, since tsync_works_release() honors s->capacity).
>
> Signed-off-by: Peng Hao <flyingpeng@tencent.com>
> ---
> security/landlock/tsync.c | 18 ++++++++++++++++--
> 1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/security/landlock/tsync.c b/security/landlock/tsync.c
> index c5730bbd..356ce94b 100644
> --- a/security/landlock/tsync.c
> +++ b/security/landlock/tsync.c
> @@ -272,9 +272,23 @@ static int tsync_works_grow_by(struct tsync_works *s, size_t n, gfp_t flags)
> work = kzalloc_obj(*work, flags);
> if (!work) {
> /*
> - * Leave the object in a consistent state,
> - * but return an error.
> + * Leave the object in a consistent state, but return
> + * an error. Shrink @s->works back to its used size to
> + * avoid carrying uninitialized trailing entries. A
> + * shrinking krealloc_array() should normally succeed,
> + * but if it does not we simply keep the larger array;
> + * tsync_works_release() iterates only up to capacity.
> */
> + if (i == 0) {
> + kfree(s->works);
> + s->works = NULL;
> + } else {
> + works = krealloc_array(s->works, i,
> + sizeof(s->works[0]),
> + flags | __GFP_NOWARN);
> + if (works)
> + s->works = works;
> + }
> s->capacity = i;
> return -ENOMEM;
> }
Can you please clarify your motivation for this?
To paraphrase my understanding
* You are not addressing a logic bug
The invariant for that data structure is that s->size <= s->capacity
and s->capacity <= number of elements in the array <= number of
sibling threads.
If the array is slightly larger than the capacity, that does not break
the invariant and should not result in out-of-bounds accesses.
* You are addressing that the array is a bit larger than the capacity
This is in the case where kzalloc_obj() failed. We set the capacity
to i (making sure that only the objects 0 to i-1 are being looked at),
and we return an error. Sure, the array is overallocated a little
bit, but now that we are returning an error, the caller will
fast-track to abort the tsync due to ENOMEM and the array does anyway
get released very soon now.
The state where we use slightly more memory (with number of entries <=
number of sibling threads) is supposed to be very transient.
Is the delay between raising the error and the final kfree() long enough
that you have seen it cause problems in practice?
Thanks,
—Günther
^ permalink raw reply
* Re: [PATCH] apparmor: mv get_loaddata_common_ref() into CONFIG_SECURITY_APPARMOR_EXPORT_BINARY block
From: John Johansen @ 2026-06-23 7:28 UTC (permalink / raw)
To: yaolu, paul, jmorris, serge
Cc: apparmor, linux-security-module, linux-kernel, k2ci
In-Reply-To: <20260623015049.41392-1-yaolu@kylinos.cn>
On 6/22/26 18:50, yaolu@kylinos.cn wrote:
> From: Lu Yao <yaolu@kylinos.cn>
>
> When SECURITY_APPARMOR_EXPORT_BINARY is not set, the compiler emits an
> unused-function warning which is promoted to an error with -Werror:
> security/apparmor/apparmorfs.c:177:28: error: ‘get_loaddata_common_ref’ defined but not used [-Werror=unused-function]
>
> Move the function into the #ifdef block to match its only call site,
> silencing the warning.
>
> Fixes: 8e135b8aee5a ("apparmor: fix race between freeing data and fs accessing it")
> Reported-by: k2ci <kernel-bot@kylinos.cn>
> Signed-off-by: Lu Yao <yaolu@kylinos.cn>
the patch is fine but this was already fixed by
d62d9bfe050f4 security/apparmor/apparmorfs.c: conditionally compile get_loaddata_common_ref()
that is queued up in apparmor-next, that was just adding a simple ifdef wrapper, if you want
to rework/rebase your patch to move the fn, and drop the extra ifdef, I have no objections
to pulling it in
> ---
> security/apparmor/apparmorfs.c | 16 ++++++++--------
> 1 file changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
> index ededaf46f3ca..f762b101d682 100644
> --- a/security/apparmor/apparmorfs.c
> +++ b/security/apparmor/apparmorfs.c
> @@ -174,14 +174,6 @@ static struct aa_proxy *get_proxy_common_ref(struct aa_common_ref *ref)
> return NULL;
> }
>
> -static struct aa_loaddata *get_loaddata_common_ref(struct aa_common_ref *ref)
> -{
> - if (ref)
> - return aa_get_i_loaddata(container_of(ref, struct aa_loaddata,
> - count));
> - return NULL;
> -}
> -
> static void aa_put_common_ref(struct aa_common_ref *ref)
> {
> if (!ref)
> @@ -1318,6 +1310,14 @@ static const struct file_operations seq_rawdata_ ##NAME ##_fops = { \
> .release = seq_rawdata_release, \
> } \
>
> +static struct aa_loaddata *get_loaddata_common_ref(struct aa_common_ref *ref)
> +{
> + if (ref)
> + return aa_get_i_loaddata(container_of(ref, struct aa_loaddata,
> + count));
> + return NULL;
> +}
> +
> static int seq_rawdata_open(struct inode *inode, struct file *file,
> int (*show)(struct seq_file *, void *))
> {
^ permalink raw reply
* Re: [PATCH] tests: add a regression test for the TCP Fast Open connect-mediation bypass
From: John Johansen @ 2026-06-23 7:20 UTC (permalink / raw)
To: hexlabsecurity, apparmor; +Cc: linux-security-module, Ryan Lee
In-Reply-To: <20260622-b4-disp-220b400d-v1-1-69d8f4ddf57e@proton.me>
On 6/22/26 16:27, Bryam Vargas via B4 Relay wrote:
> From: Bryam Vargas <hexlabsecurity@proton.me>
>
> sendto(MSG_FASTOPEN) performs an implicit connect that the kernel's
> apparmor_socket_sendmsg() did not mediate as a connect, so a profile
> granting inet/inet6 stream send but denying connect was bypassed. Add a
> test that, under such a profile, asserts connect(2) is denied AND
> sendto(MSG_FASTOPEN) is also denied -- the latter requires the kernel fix
> "apparmor: mediate the implicit connect of TCP fast open sendmsg".
>
> It exercises both producers the fix guards -- plain TCP (inet/inet6) and
> MPTCP (IPPROTO_MPTCP) -- plus a positive control where connect is allowed.
> The test red-baselines on a vulnerable kernel and skips cleanly when the
> required fine-grained network mediation or TCP Fast Open is unavailable
> (requires_any_of_kernel_features / requires_parser_support, plus a
> tcp_fastopen guard); the MPTCP cases are skipped if MPTCP is disabled.
>
> Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
thanks for the tests
Acked-by: John Johansen <john.johansen@canonical.com>
I have pulled this into the user space tree
> ---
> This is the userspace regression test for the AppArmor TCP Fast Open
> connect-mediation kernel fix posted to linux-security-module:
> https://lore.kernel.org/all/20260622-b4-disp-aba401c6-v1-1-9d74343c7ced@proton.me/
> It mirrors the SELinux testsuite test ("[PATCH testsuite] tests/inet_socket:
> add tests for TCP Fast Open", Stephen Smalley) and was requested by the
> AppArmor team (Ryan Lee).
>
> It covers both producers the kernel fix mediates -- plain TCP (inet/inet6)
> and MPTCP -- plus a positive control. It red-baselines on a vulnerable
> kernel (the fastopen assertions fail) and skips cleanly when TCP Fast Open
> or fine-grained network mediation is unavailable.
> ---
> tests/regression/apparmor/Makefile | 2 +
> tests/regression/apparmor/net_inet_tcp_fastopen.c | 241 +++++++++++++++++++++
> tests/regression/apparmor/net_inet_tcp_fastopen.sh | 119 ++++++++++
> 3 files changed, 362 insertions(+)
>
> diff --git a/tests/regression/apparmor/Makefile b/tests/regression/apparmor/Makefile
> index 345f39968..18e408f5c 100644
> --- a/tests/regression/apparmor/Makefile
> +++ b/tests/regression/apparmor/Makefile
> @@ -151,6 +151,7 @@ SRC=access.c \
> named_pipe.c \
> net_inet_rcv.c \
> net_inet_snd.c \
> + net_inet_tcp_fastopen.c \
> net_raw.c \
> open.c \
> openat.c \
> @@ -325,6 +326,7 @@ TESTS=aa_exec \
> namespaces \
> net_iface \
> net_inet \
> + net_inet_tcp_fastopen \
> net_raw \
> overlayfs_kernel \
> open \
> diff --git a/tests/regression/apparmor/net_inet_tcp_fastopen.c b/tests/regression/apparmor/net_inet_tcp_fastopen.c
> new file mode 100644
> index 000000000..cfc1ff9c5
> --- /dev/null
> +++ b/tests/regression/apparmor/net_inet_tcp_fastopen.c
> @@ -0,0 +1,241 @@
> +/*
> + * Copyright (C) 2026 Canonical, Ltd.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation, version 2 of the
> + * License.
> + */
> +
> +/*
> + * TCP Fast Open connect-mediation bypass regression test.
> + *
> + * Under an AppArmor profile that grants inet/inet6 stream "send" but DENIES
> + * "connect", a plain connect(2) must be refused (EACCES/EPERM). Historically
> + * the kernel's TFO fast path (sendto(..., MSG_FASTOPEN, ...), which performs
> + * an implicit connect) only checked the send permission (AA_NET_SEND 0x02)
> + * and skipped the connect permission (AA_NET_CONNECT 0x40), so a confined
> + * task could open an outbound connection that connect(2) would have blocked.
> + * The kernel fix mediates both producers: plain TCP and MPTCP (IPPROTO_MPTCP).
> + *
> + * This binary takes a mode and asserts the operation is DENIED:
> + * argv[1] = "connect" -> baseline: connect(2) must be denied
> + * argv[1] = "fastopen" -> the bug: sendto(MSG_FASTOPEN) must be denied
> + * argv[2] = family: "inet"/"inet6" (TCP) or "minet"/"minet6" (MPTCP)
> + * argv[3] = port (the listener port, set up by this same process)
> + *
> + * Output contract (parsed by checktestfg in prologue.inc):
> + * "PASS\n" -> the operation was DENIED as required (regression OK)
> + * "FAIL ..."-> the operation was ALLOWED (connect bypass) OR a setup error
> + *
> + * The .sh runs this with expected outcome "pass"; it also enables TCP Fast
> + * Open first, so an EOPNOTSUPP here is a real setup error, not a skip.
> + */
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <unistd.h>
> +#include <errno.h>
> +#include <signal.h>
> +#include <netinet/in.h>
> +#include <netinet/tcp.h>
> +#include <sys/socket.h>
> +#include <sys/types.h>
> +#include <sys/wait.h>
> +#include <arpa/inet.h>
> +
> +#ifndef MSG_FASTOPEN
> +#define MSG_FASTOPEN 0x20000000
> +#endif
> +
> +#ifndef IPPROTO_MPTCP
> +#define IPPROTO_MPTCP 262
> +#endif
> +
> +/* Map a family token to (AF_*, protocol). "minet"/"minet6" select MPTCP.
> + * Returns 0 on success, -1 on an unknown token.
> + */
> +static int parse_family(const char *tok, int *family, int *proto)
> +{
> + *proto = 0;
> + if (strcmp(tok, "inet") == 0) {
> + *family = AF_INET;
> + } else if (strcmp(tok, "inet6") == 0) {
> + *family = AF_INET6;
> + } else if (strcmp(tok, "minet") == 0) {
> + *family = AF_INET;
> + *proto = IPPROTO_MPTCP;
> + } else if (strcmp(tok, "minet6") == 0) {
> + *family = AF_INET6;
> + *proto = IPPROTO_MPTCP;
> + } else {
> + return -1;
> + }
> + return 0;
> +}
> +
> +/* Build a loopback sockaddr for the requested family. Returns addrlen. */
> +static socklen_t make_addr(int family, int port, struct sockaddr_storage *ss)
> +{
> + memset(ss, 0, sizeof(*ss));
> + if (family == AF_INET) {
> + struct sockaddr_in *a = (struct sockaddr_in *)ss;
> +
> + a->sin_family = AF_INET;
> + a->sin_port = htons(port);
> + inet_pton(AF_INET, "127.0.0.1", &a->sin_addr);
> + return sizeof(*a);
> + }
> + {
> + struct sockaddr_in6 *a = (struct sockaddr_in6 *)ss;
> +
> + a->sin6_family = AF_INET6;
> + a->sin6_port = htons(port);
> + inet_pton(AF_INET6, "::1", &a->sin6_addr);
> + return sizeof(*a);
> + }
> +}
> +
> +/* Start a plain TCP listener so the connect/TFO target exists. Returns the fd
> + * or -1. A TCP listener accepts both TCP and MPTCP clients, which keeps the
> + * test on the client-side mediation under examination. bind/listen perms are
> + * granted by the profile so this must succeed.
> + */
> +static int start_listener(int family, int port)
> +{
> + int s, one = 1;
> + struct sockaddr_storage ss;
> + socklen_t len = make_addr(family, port, &ss);
> +
> + s = socket(family, SOCK_STREAM, 0);
> + if (s < 0) {
> + printf("FAIL - listener socket: %m\n");
> + return -1;
> + }
> + (void)setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
> + /* Enable TFO on the listener (qlen). Best-effort; the mediation check
> + * under test happens on the client side. */
> + (void)setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN, &one, sizeof(one));
> + if (bind(s, (struct sockaddr *)&ss, len) < 0) {
> + printf("FAIL - listener bind: %m\n");
> + close(s);
> + return -1;
> + }
> + if (listen(s, 5) < 0) {
> + printf("FAIL - listener listen: %m\n");
> + close(s);
> + return -1;
> + }
> + return s;
> +}
> +
> +/* Returns 1 if the kernel DENIED the operation (EACCES/EPERM) => regression OK.
> + * Returns 0 if the operation was ALLOWED (connect bypass) => regression FAIL.
> + * Returns -1 on a setup error.
> + */
> +static int try_connect(int family, int proto, int port)
> +{
> + int s, rc;
> + struct sockaddr_storage ss;
> + socklen_t len = make_addr(family, port, &ss);
> +
> + s = socket(family, SOCK_STREAM, proto);
> + if (s < 0)
> + return -1;
> + rc = connect(s, (struct sockaddr *)&ss, len);
> + if (rc == 0) {
> + close(s);
> + return 0; /* allowed */
> + }
> + if (errno == EACCES || errno == EPERM) {
> + close(s);
> + return 1; /* denied by AppArmor */
> + }
> + /* ECONNREFUSED/ETIMEDOUT mean it reached the network: mediation did not
> + * block it, so count as allowed. */
> + close(s);
> + return (errno == ECONNREFUSED || errno == ETIMEDOUT) ? 0 : -1;
> +}
> +
> +static int try_fastopen(int family, int proto, int port)
> +{
> + int s;
> + ssize_t rc;
> + char msg[] = "tfo";
> + struct sockaddr_storage ss;
> + socklen_t len = make_addr(family, port, &ss);
> +
> + s = socket(family, SOCK_STREAM, proto);
> + if (s < 0)
> + return -1;
> +
> + /* The bug: this implicit-connect send must be mediated as a connect. */
> + rc = sendto(s, msg, sizeof(msg), MSG_FASTOPEN,
> + (struct sockaddr *)&ss, len);
> + if (rc >= 0) {
> + close(s);
> + return 0; /* allowed: connect bypass */
> + }
> + if (errno == EACCES || errno == EPERM) {
> + close(s);
> + return 1; /* denied by AppArmor */
> + }
> + if (errno == EOPNOTSUPP || errno == EINVAL) {
> + /* The .sh enabled TCP Fast Open before running, so this is a
> + * real setup error, not an expected condition. Fail loudly
> + * rather than masking it as a denial. */
> + close(s);
> + return -1;
> + }
> + close(s);
> + return (errno == ECONNREFUSED || errno == ETIMEDOUT) ? 0 : -1;
> +}
> +
> +int main(int argc, char *argv[])
> +{
> + int family, proto, port, denied, listener;
> + const char *mode;
> +
> + if (argc < 4) {
> + printf("FAIL - usage: %s connect|fastopen inet|inet6|minet|minet6 port\n",
> + argv[0]);
> + return 1;
> + }
> + mode = argv[1];
> + if (parse_family(argv[2], &family, &proto) < 0) {
> + printf("FAIL - unknown family '%s'\n", argv[2]);
> + return 1;
> + }
> + port = atoi(argv[3]);
> +
> + signal(SIGPIPE, SIG_IGN);
> +
> + listener = start_listener(family, port);
> + if (listener < 0)
> + return 1; /* FAIL already printed */
> +
> + if (strcmp(mode, "connect") == 0) {
> + denied = try_connect(family, proto, port);
> + } else if (strcmp(mode, "fastopen") == 0) {
> + denied = try_fastopen(family, proto, port);
> + } else {
> + printf("FAIL - unknown mode '%s'\n", mode);
> + close(listener);
> + return 1;
> + }
> +
> + close(listener);
> +
> + if (denied == 1) {
> + printf("PASS\n");
> + return 0;
> + }
> + if (denied == 0) {
> + printf("FAIL - %s was ALLOWED despite deny connect "
> + "(connect-mediation bypass)\n", mode);
> + return 1;
> + }
> + printf("FAIL - %s setup error: %m\n", mode);
> + return 1;
> +}
> diff --git a/tests/regression/apparmor/net_inet_tcp_fastopen.sh b/tests/regression/apparmor/net_inet_tcp_fastopen.sh
> new file mode 100755
> index 000000000..76300c53f
> --- /dev/null
> +++ b/tests/regression/apparmor/net_inet_tcp_fastopen.sh
> @@ -0,0 +1,119 @@
> +#! /bin/bash
> +# Copyright (C) 2026 Canonical, Ltd.
> +#
> +# This program is free software; you can redistribute it and/or
> +# modify it under the terms of the GNU General Public License as
> +# published by the Free Software Foundation, version 2 of the
> +# License.
> +
> +#=NAME net_inet_tcp_fastopen
> +#=DESCRIPTION
> +# Regression test for the TCP Fast Open connect-mediation bypass. Under a
> +# profile that grants inet/inet6 stream "send" but DENIES "connect", a plain
> +# connect(2) is refused, and sendto(..., MSG_FASTOPEN, ...) (which performs an
> +# implicit connect) MUST also be refused -- for both plain TCP and MPTCP. Pre-fix
> +# the TFO path checked only the send permission (AA_NET_SEND 0x02) and skipped
> +# connect (AA_NET_CONNECT 0x40).
> +#=END
> +
> +pwd=`dirname $0`
> +pwd=`cd $pwd ; /bin/pwd`
> +
> +bin=$pwd
> +
> +. "$bin/prologue.inc"
> +
> +# Need fine-grained inet mediation (connect/send are separable only there).
> +requires_any_of_kernel_features network_v8/af_inet network_v9/af_inet
> +requires_parser_support "network (send) ip=::1,"
> +
> +settest net_inet_tcp_fastopen
> +
> +tfo_sysctl=/proc/sys/net/ipv4/tcp_fastopen
> +tfo_saved=""
> +
> +cleanup()
> +{
> + # restore the original tcp_fastopen value if we changed it
> + if [ -n "$tfo_saved" ]; then
> + echo "$tfo_saved" > "$tfo_sysctl" 2>/dev/null || true
> + fi
> +}
> +do_onexit="cleanup"
> +
> +# The sendto(MSG_FASTOPEN) client path needs the TCP Fast Open client bit
> +# (0x1). Enable it for the run; if it is unavailable (no sysctl, or it cannot
> +# be enabled) the bug cannot be exercised at all, so skip rather than report a
> +# spurious failure.
> +if [ ! -w "$tfo_sysctl" ]; then
> + echo " TCP Fast Open sysctl ($tfo_sysctl) not available. Skipping tests ..."
> + exit 0
> +fi
> +tfo_saved=`cat "$tfo_sysctl"`
> +echo $((tfo_saved | 1)) > "$tfo_sysctl" 2>/dev/null || true
> +if [ $(($(cat "$tfo_sysctl") & 1)) -ne 1 ]; then
> + echo " Could not enable the TCP Fast Open client bit. Skipping tests ..."
> + exit 0
> +fi
> +
> +# add ::1 if not already present (loopback usually has it)
> +ip -6 addr add ::1/128 dev lo 2>/dev/null || true
> +
> +# pick a free port for the listener this binary creates
> +port=4321
> +while lsof -i:$port >/dev/null 2>&1; do
> + let port=$port+1
> +done
> +
> +# Profile: allow stream send/receive + the perms needed to stand up the
> +# in-process listener (bind/listen/accept), allow setopt/getopt for TFO
> +# sockopts, but explicitly DENY connect on both inet and inet6.
> +gen_send_no_connect()
> +{
> + genprofile \
> + "network;(send,receive,accept,listen,bind);ip=127.0.0.1;port=$port" \
> + "network;(send,receive,accept,listen,bind);ip=::1;port=$port" \
> + "network;(send,receive);peer=(ip=127.0.0.1)" \
> + "network;(send,receive);peer=(ip=::1)" \
> + "network;(setopt,getopt);ip=0.0.0.0;port=0" \
> + "network;(setopt,getopt);ip=::0;port=0" \
> + "qual=deny:network;(connect);ip=127.0.0.1" \
> + "qual=deny:network;(connect);ip=::1"
> +}
> +
> +# ---- inet (IPv4) ----
> +gen_send_no_connect
> +# baseline: a normal connect(2) must be denied -> binary prints PASS (denied),
> +# expected outcome 'pass'
> +runchecktest "TFO inet - connect(2) denied" pass connect inet $port
> +# the bug: sendto(MSG_FASTOPEN) must ALSO be denied post-fix
> +runchecktest "TFO inet - sendto(MSG_FASTOPEN) denied" pass fastopen inet $port
> +
> +# ---- inet6 (IPv6) ----
> +gen_send_no_connect
> +runchecktest "TFO inet6 - connect(2) denied" pass connect inet6 $port
> +runchecktest "TFO inet6 - sendto(MSG_FASTOPEN) denied" pass fastopen inet6 $port
> +
> +# ---- MPTCP: the second producer the fix guards (IPPROTO_MPTCP) ----
> +# The deny-connect rule is family/type based, so it covers MPTCP (inet/inet6
> +# stream) too. Only run when MPTCP is enabled.
> +if [ "`cat /proc/sys/net/mptcp/enabled 2>/dev/null`" = "1" ]; then
> + gen_send_no_connect
> + runchecktest "TFO MPTCP inet - connect(2) denied" pass connect minet $port
> + runchecktest "TFO MPTCP inet - sendto(MSG_FASTOPEN) denied" pass fastopen minet $port
> + gen_send_no_connect
> + runchecktest "TFO MPTCP inet6 - connect(2) denied" pass connect minet6 $port
> + runchecktest "TFO MPTCP inet6 - sendto(MSG_FASTOPEN) denied" pass fastopen minet6 $port
> +fi
> +
> +# ---- positive control: when connect IS allowed, both succeed (no false deny) ----
> +genprofile \
> + "network;(connect,send,receive,accept,listen,bind);ip=127.0.0.1;port=$port" \
> + "network;(connect,send,receive);peer=(ip=127.0.0.1)" \
> + "network;(setopt,getopt);ip=0.0.0.0;port=0"
> +# Here the binary's "denied" assertion is FALSE (op allowed), so it prints
> +# FAIL; we expect that, i.e. expected outcome 'fail'.
> +runchecktest "TFO inet - connect allowed (control)" fail connect inet $port
> +runchecktest "TFO inet - fastopen allowed (control)" fail fastopen inet $port
> +
> +exit 0
>
> ---
> base-commit: bdccc1ebd2e1a1b75ceb8f87b23831fe273b9ebb
> change-id: 20260622-b4-disp-220b400d-3d7fd53bce49
>
> Best regards,
^ permalink raw reply
* Re: [PATCH] apparmor: mediate the implicit connect of TCP fast open sendmsg
From: John Johansen @ 2026-06-23 7:15 UTC (permalink / raw)
To: hexlabsecurity, Ryan Lee
Cc: Mickael Salaun, Stephen Smalley, James Morris, Matthieu Buffet,
linux-security-module, Mikhail Ivanov, Serge E. Hallyn,
linux-kernel, Paul Moore, apparmor
In-Reply-To: <20260622-b4-disp-aba401c6-v1-1-9d74343c7ced@proton.me>
On 6/22/26 13:57, Bryam Vargas via B4 Relay wrote:
> From: Bryam Vargas <hexlabsecurity@proton.me>
>
> sendmsg()/sendto() with MSG_FASTOPEN is a combination of connect(2) and
> write(2): it opens the connection in the SYN. apparmor_socket_sendmsg()
> only checks AA_MAY_SEND, so a profile that grants send but denies connect
> lets a confined task open an outbound TCP/MPTCP connection that connect(2)
> would have refused, bypassing connect mediation.
>
> Mediate the implicit connect when MSG_FASTOPEN is set and a destination
> is supplied. Add it to apparmor_socket_sendmsg() (not the shared
> aa_sock_msg_perm() helper, which recvmsg also uses) and call aa_sk_perm()
> directly, mirroring the selinux and tomoyo fixes. sk_is_tcp() does not
> cover MPTCP fast open, so the SOCK_STREAM/IPPROTO_MPTCP arm is explicit.
>
> Fixes: cf60af03ca4e ("net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
> Cc: stable@vger.kernel.org
> Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
Acked-by: John Johansen <john.johansen@canonical.com>
I have pulled this into my tree
> ---
> This is the patch and reproducer requested in [1]. A userspace regression test
> (tests/regression/apparmor/net_inet_tcp_fastopen) follows separately to the
> apparmor tree, as suggested.
>
> Reproducer (behavioral; the bypassed value is policy, not bus state, so no special
> hardware). Under a profile that grants inet/inet6 stream send but denies connect, on
> the current Debian security kernel 6.12.94 (apparmor active):
>
> [TCP ] connect(2)=EACCES sendto(MSG_FASTOPEN)=OK -> connect bypassed (listener accepted)
> [TCP6] connect(2)=EACCES sendto(MSG_FASTOPEN)=OK -> connect bypassed
>
> The kernel audit shows the connect(2) denial and no connect record for the fastopen
> sendto:
>
> apparmor="DENIED" operation="connect" profile="egress_restricted" comm="lsm_tfo_ab"
> family="inet" sock_type="stream" protocol=6 requested_mask="connect" denied_mask="connect"
>
> With this patch the fastopen sendto hits that same connect denial. Full reproducer
> available on request.
>
> Same-class fixes: selinux [2], tomoyo [3]; the original cross-LSM report (landlock,
> the first instance) is [4].
>
> [1] https://lore.kernel.org/r/20260619011138.264578-1-hexlabsecurity@proton.me
> [2] https://lore.kernel.org/r/20260618175513.112443-2-stephen.smalley.work@gmail.com
> [3] https://lore.kernel.org/r/20260619002207.61104-1-matthieu@buffet.re
> [4] https://lore.kernel.org/r/20260616201615.275032-1-hexlabsecurity@proton.me
> ---
> security/apparmor/lsm.c | 16 +++++++++++++++-
> 1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
> index 3491e9f60194..e01efdf50efa 100644
> --- a/security/apparmor/lsm.c
> +++ b/security/apparmor/lsm.c
> @@ -1422,7 +1422,21 @@ static int aa_sock_msg_perm(const char *op, u32 request, struct socket *sock,
> static int apparmor_socket_sendmsg(struct socket *sock,
> struct msghdr *msg, int size)
> {
> - return aa_sock_msg_perm(OP_SENDMSG, AA_MAY_SEND, sock, msg, size);
> + int error = aa_sock_msg_perm(OP_SENDMSG, AA_MAY_SEND, sock, msg, size);
> +
> + if (error)
> + return error;
> +
> + /* TCP fast open carries connect() semantics in sendmsg(); mediate
> + * the implicit connect so it cannot bypass the connect permission.
> + */
> + if ((msg->msg_flags & MSG_FASTOPEN) && msg->msg_name &&
> + (sk_is_tcp(sock->sk) ||
> + (sk_is_inet(sock->sk) && sock->sk->sk_type == SOCK_STREAM &&
> + sock->sk->sk_protocol == IPPROTO_MPTCP)))
> + error = aa_sk_perm(OP_CONNECT, AA_MAY_CONNECT, sock->sk);
> +
> + return error;
> }
>
> static int apparmor_socket_recvmsg(struct socket *sock,
>
> ---
> base-commit: 4549871118cf616eecdd2d939f78e3b9e1dddc48
> change-id: 20260622-b4-disp-aba401c6-f02842c82975
>
> Best regards,
^ permalink raw reply
* Re: AppArmor: TCP Fast Open bypasses connect mediation (last unaddressed LSM)
From: John Johansen @ 2026-06-23 7:13 UTC (permalink / raw)
To: Bryam Vargas, linux-security-module, apparmor
Cc: Paul Moore, James Morris, Serge E . Hallyn, Mickael Salaun,
Stephen Smalley, Matthieu Buffet, Mikhail Ivanov, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, netdev, linux-kernel
In-Reply-To: <20260619011138.264578-1-hexlabsecurity@proton.me>
On 6/18/26 18:11, Bryam Vargas wrote:
> Hello John, and LSM folks,
>
> I have been working on the Landlock TCP Fast Open connect bypass [1]. Stephen
> Smalley's SELinux fix for the same issue [3] -- "Similar to Landlock, SELinux was
> not updated when TCP Fast Open support was introduced ..." -- made me go back and
> check the rest of the connect-mediating LSMs, since I had only been looking at
> Landlock. With Landlock [2], SELinux [3], and now TOMOYO [4] all getting fixes,
> AppArmor is the last one with the same gap and no fix yet.
>
> Root cause (shared with the others)
> -----------------------------------
> security_socket_connect() has a single call site, net/socket.c (the connect(2)
> syscall). TCP Fast Open performs an implicit connect inside sendmsg:
>
> tcp_sendmsg -> tcp_sendmsg_fastopen -> __inet_stream_connect(..., is_sendmsg=1)
> -> sk->sk_prot->connect() net/ipv4/{tcp.c,af_inet.c}
>
> This never calls security_socket_connect(); the only LSM hook on the path is
> security_socket_sendmsg(). mptcp_sendmsg_fastopen reaches the same code and is a
> second producer.
>
> AppArmor
> --------
> apparmor_socket_connect() requests AA_MAY_CONNECT; apparmor_socket_sendmsg() (via
> aa_sock_msg_perm) requests AA_MAY_SEND. These are distinct bits, and apparmor_parser
> compiles them independently: "network send inet stream," yields accept mask 0x02
> while "network connect inet stream," yields 0x40. So an egress-restriction profile
> that grants send but not connect is bypassed by MSG_FASTOPEN.
>
> Reproduced on 6.12.88 with apparmor active. Under a profile granting the inet/inet6
> stream lifecycle except connect:
>
> aa-exec -p egress_restricted -- ./probe
> [TCP ] connect(2)=EACCES(blocked) sendto(MSG_FASTOPEN)=OK(reached) => connection established
> [TCP6] connect(2)=EACCES(blocked) sendto(MSG_FASTOPEN)=OK(reached) => connection established
>
> (The coarse "network inet stream," idiom grants connect anyway, so this only bites the
> fine-grained "allow send, deny connect" policy that the asymmetry is meant to serve.)
>
> Fix
> ---
> Same shape as the TOMOYO [4] and SELinux [3] fixes: in apparmor_socket_sendmsg (or
> aa_sock_msg_perm), when MSG_FASTOPEN is set and msg_name carries a destination on a
> not-yet-connected stream socket, additionally require aa_sk_perm(OP_CONNECT,
> AA_MAY_CONNECT, sk). I am happy to send that patch and the reproducer.
>
If you have a patch, I'd love to take it and give you the credit other wise I can
throw it together.
> (A single core check in __inet_stream_connect(), gated on is_sendmsg, would have
> covered all five LSMs and both the TCP and MPTCP producers in one place -- the kernel
> already mediates the analogous implicit-connect-on-send for AF_UNIX via
> security_unix_may_send and for SCTP via security_sctp_bind_connect. But since the
> other four LSMs are taking per-hook fixes, AppArmor matching them is the consistent
> move; mentioning the core option only in case it is preferred.)
>
I think per LSM makes sense, at least atm, as it is probably easier. We can look
at refactoring after the fact.
> [1] Landlock: LANDLOCK_ACCESS_NET_CONNECT_TCP bypass via TCP Fast Open (report)
> https://lore.kernel.org/r/20260616201615.275032-1-hexlabsecurity@proton.me
> [2] landlock: fix TCP Fast Open connection bypass (Matthieu Buffet)
> https://lore.kernel.org/r/20260617180526.15627-2-matthieu@buffet.re
> [3] selinux: check connect-related permissions on TCP Fast Open (Stephen Smalley)
> https://lore.kernel.org/r/20260618175513.112443-2-stephen.smalley.work@gmail.com
> [4] tomoyo: Enforce connect policy in TCP Fast Open (Matthieu Buffet)
> https://lore.kernel.org/r/20260619002207.61104-1-matthieu@buffet.re
>
> Thanks,
> Bryam Vargas
>
Thanks for the detailed report Bryan
^ permalink raw reply
* [PATCH] landlock: shrink tsync works[] on partial allocation failure
From: Peng Hao @ 2026-06-23 5:01 UTC (permalink / raw)
To: mic, gnoack; +Cc: linux-security-module
When the per-slot kzalloc fails mid-loop in tsync_works_grow_by(), the
already-enlarged s->works array keeps uninitialized trailing entries.
Shrink the array back to its used size on the error path so no waste
is carried over: free it outright when nothing has been allocated yet,
otherwise try a shrinking krealloc_array() (keep the larger array if
the shrink fails, since tsync_works_release() honors s->capacity).
Signed-off-by: Peng Hao <flyingpeng@tencent.com>
---
security/landlock/tsync.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/security/landlock/tsync.c b/security/landlock/tsync.c
index c5730bbd..356ce94b 100644
--- a/security/landlock/tsync.c
+++ b/security/landlock/tsync.c
@@ -272,9 +272,23 @@ static int tsync_works_grow_by(struct tsync_works *s, size_t n, gfp_t flags)
work = kzalloc_obj(*work, flags);
if (!work) {
/*
- * Leave the object in a consistent state,
- * but return an error.
+ * Leave the object in a consistent state, but return
+ * an error. Shrink @s->works back to its used size to
+ * avoid carrying uninitialized trailing entries. A
+ * shrinking krealloc_array() should normally succeed,
+ * but if it does not we simply keep the larger array;
+ * tsync_works_release() iterates only up to capacity.
*/
+ if (i == 0) {
+ kfree(s->works);
+ s->works = NULL;
+ } else {
+ works = krealloc_array(s->works, i,
+ sizeof(s->works[0]),
+ flags | __GFP_NOWARN);
+ if (works)
+ s->works = works;
+ }
s->capacity = i;
return -ENOMEM;
}
--
2.43.7
^ permalink raw reply related
* Re: [PATCH bpf-next v3 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: David Windsor @ 2026-06-23 4:01 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Alexander Viro, Christian Brauner, Jan Kara, Alexei Starovoitov,
Daniel Borkmann, John Fastabend, Andrii Nakryiko, Eduard,
Kumar Kartikeya Dwivedi, Martin KaFai Lau, Song Liu,
Yonghong Song, Jiri Olsa, Emil Tsalapatis, KP Singh,
Matt Bobrowski, Paul Moore, James Morris, Serge E . Hallyn,
Mimi Zohar, Roberto Sassu, dmitry.kasatkin, eric.snowberg,
Stephen Smalley, Ondrej Mosnacek, Casey Schaufler, Shuah Khan,
LKML, Linux-Fsdevel, bpf, LSM List, linux-integrity, selinux,
open list:KERNEL SELFTEST FRAMEWORK
In-Reply-To: <CAADnVQJsdrL2RjxM4UE1WyWrT9KsprFP+=xLHRtFhUSccDqQcg@mail.gmail.com>
On Mon, Jun 22, 2026 at 11:59 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> >
> > > > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > > > index 1a721fc4bef5..b41b02173e24 100644
> > > > --- a/kernel/bpf/trampoline.c
> > > > +++ b/kernel/bpf/trampoline.c
> > > > @@ -859,6 +859,9 @@ static int bpf_trampoline_add_prog(struct bpf_trampoline *tr,
> > > > }
> > > > if (cnt >= BPF_MAX_TRAMP_LINKS)
> > > > return -E2BIG;
> > > > + if (node->link->prog->aux->attach_limit &&
> > > > + tr->progs_cnt[kind] >= node->link->prog->aux->attach_limit)
> > > > + return -E2BIG;
> > >
> > > No need. The check inside kfunc is enough.
> > >
> >
> > Paul wanted this check because it occurs at bpf prog attach time,
> > whereas the one in the kfunc is at inode creation time.
>
> Sorry, we're not adding redundant code to the verifier.
Thanks, will send v4 soon.
^ permalink raw reply
* Re: [PATCH bpf-next v3 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: Alexei Starovoitov @ 2026-06-23 3:59 UTC (permalink / raw)
To: David Windsor
Cc: Alexander Viro, Christian Brauner, Jan Kara, Alexei Starovoitov,
Daniel Borkmann, John Fastabend, Andrii Nakryiko, Eduard,
Kumar Kartikeya Dwivedi, Martin KaFai Lau, Song Liu,
Yonghong Song, Jiri Olsa, Emil Tsalapatis, KP Singh,
Matt Bobrowski, Paul Moore, James Morris, Serge E . Hallyn,
Mimi Zohar, Roberto Sassu, dmitry.kasatkin, eric.snowberg,
Stephen Smalley, Ondrej Mosnacek, Casey Schaufler, Shuah Khan,
LKML, Linux-Fsdevel, bpf, LSM List, linux-integrity, selinux,
open list:KERNEL SELFTEST FRAMEWORK
In-Reply-To: <CAEXv5_jVXS4JoExcd71YkXEE2WXPJ0_9STO-uCwgNF+Eia_h5w@mail.gmail.com>
On Mon, Jun 22, 2026 at 8:49 PM David Windsor <dwindsor@gmail.com> wrote:
>
> On Mon, Jun 22, 2026 at 7:57 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Thu Jun 18, 2026 at 1:34 PM PDT, David Windsor wrote:
> > > +
> > > +static int __bpf_init_inode_xattr(struct xattr_ctx *xattr_ctx,
> > > + const char *name__str,
> > > + const struct bpf_dynptr *value_p)
> > > +{
> > > + struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
> > > + size_t name_len;
> > > + void *xattr_value;
> > > + struct xattr *xattr;
> > > + struct xattr *xattrs;
> > > + int *xattr_count;
> > > + const void *value;
> > > + u32 value_len;
> > > +
> > > + if (!xattr_ctx || !name__str)
> > > + return -EINVAL;
> > > +
> > > + xattrs = xattr_ctx->xattrs;
> > > + xattr_count = xattr_ctx->xattr_count;
> > > + if (!xattrs || !xattr_count)
> > > + return -EINVAL;
> > > + if (bpf_xattrs_used(xattr_ctx) >= BPF_LSM_INODE_INIT_XATTRS)
> > > + return -ENOSPC;
> >
> > This check is good to have, but it's enough. No need to duplicate it.
> > More below.
> >
>
> > > +
> > > static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
> > > {
> > > if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
> > > - prog->type == BPF_PROG_TYPE_LSM)
> > > + prog->type == BPF_PROG_TYPE_LSM) {
> > > + /* bpf_init_inode_xattr only attaches to inode_init_security. */
> > > + if (kfunc_id == bpf_init_inode_xattr_btf_ids[0] &&
> > > + prog->aux->attach_btf_id != bpf_lsm_inode_init_security_btf_ids[0])
> > > + return -EACCES;
> >
> > This is unnecessary. Only one hook will have xattr_ctx type.
> > The normal verifier type enforcement will do its work.
> >
>
> Good point, thanks.
>
> > > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > > index 1a721fc4bef5..b41b02173e24 100644
> > > --- a/kernel/bpf/trampoline.c
> > > +++ b/kernel/bpf/trampoline.c
> > > @@ -859,6 +859,9 @@ static int bpf_trampoline_add_prog(struct bpf_trampoline *tr,
> > > }
> > > if (cnt >= BPF_MAX_TRAMP_LINKS)
> > > return -E2BIG;
> > > + if (node->link->prog->aux->attach_limit &&
> > > + tr->progs_cnt[kind] >= node->link->prog->aux->attach_limit)
> > > + return -E2BIG;
> >
> > No need. The check inside kfunc is enough.
> >
>
> Paul wanted this check because it occurs at bpf prog attach time,
> whereas the one in the kfunc is at inode creation time.
Sorry, we're not adding redundant code to the verifier.
^ permalink raw reply
* Re: [PATCH bpf-next v3 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: David Windsor @ 2026-06-23 3:49 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: viro, brauner, jack, ast, daniel, john.fastabend, andrii, eddyz87,
memxor, martin.lau, song, yonghong.song, jolsa, emil, kpsingh,
mattbobrowski, paul, jmorris, serge, zohar, roberto.sassu,
dmitry.kasatkin, eric.snowberg, stephen.smalley.work, omosnace,
casey, shuah, linux-kernel, linux-fsdevel, bpf,
linux-security-module, linux-integrity, selinux, linux-kselftest
In-Reply-To: <DJFZGYZFMN73.1799LMXW49WZO@gmail.com>
On Mon, Jun 22, 2026 at 7:57 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu Jun 18, 2026 at 1:34 PM PDT, David Windsor wrote:
> > +
> > +static int __bpf_init_inode_xattr(struct xattr_ctx *xattr_ctx,
> > + const char *name__str,
> > + const struct bpf_dynptr *value_p)
> > +{
> > + struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
> > + size_t name_len;
> > + void *xattr_value;
> > + struct xattr *xattr;
> > + struct xattr *xattrs;
> > + int *xattr_count;
> > + const void *value;
> > + u32 value_len;
> > +
> > + if (!xattr_ctx || !name__str)
> > + return -EINVAL;
> > +
> > + xattrs = xattr_ctx->xattrs;
> > + xattr_count = xattr_ctx->xattr_count;
> > + if (!xattrs || !xattr_count)
> > + return -EINVAL;
> > + if (bpf_xattrs_used(xattr_ctx) >= BPF_LSM_INODE_INIT_XATTRS)
> > + return -ENOSPC;
>
> This check is good to have, but it's enough. No need to duplicate it.
> More below.
>
> > +
> > static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
> > {
> > if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
> > - prog->type == BPF_PROG_TYPE_LSM)
> > + prog->type == BPF_PROG_TYPE_LSM) {
> > + /* bpf_init_inode_xattr only attaches to inode_init_security. */
> > + if (kfunc_id == bpf_init_inode_xattr_btf_ids[0] &&
> > + prog->aux->attach_btf_id != bpf_lsm_inode_init_security_btf_ids[0])
> > + return -EACCES;
>
> This is unnecessary. Only one hook will have xattr_ctx type.
> The normal verifier type enforcement will do its work.
>
Good point, thanks.
> > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > index 1a721fc4bef5..b41b02173e24 100644
> > --- a/kernel/bpf/trampoline.c
> > +++ b/kernel/bpf/trampoline.c
> > @@ -859,6 +859,9 @@ static int bpf_trampoline_add_prog(struct bpf_trampoline *tr,
> > }
> > if (cnt >= BPF_MAX_TRAMP_LINKS)
> > return -E2BIG;
> > + if (node->link->prog->aux->attach_limit &&
> > + tr->progs_cnt[kind] >= node->link->prog->aux->attach_limit)
> > + return -E2BIG;
>
> No need. The check inside kfunc is enough.
>
Paul wanted this check because it occurs at bpf prog attach time,
whereas the one in the kfunc is at inode creation time.
> > if (!hlist_unhashed(&node->tramp_hlist))
> > /* prog already linked */
> > return -EBUSY;
> > diff --git a/security/bpf/hooks.c b/security/bpf/hooks.c
> > index 40efde233f3a..d7c44c5c0e30 100644
> > --- a/security/bpf/hooks.c
> > +++ b/security/bpf/hooks.c
> > @@ -30,6 +30,7 @@ static int __init bpf_lsm_init(void)
> >
> > struct lsm_blob_sizes bpf_lsm_blob_sizes __ro_after_init = {
> > .lbs_inode = sizeof(struct bpf_storage_blob),
> > + .lbs_xattr_count = BPF_LSM_INODE_INIT_XATTRS,
> > };
> >
> > DEFINE_LSM(bpf) = {
> > diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
> > index b59e3f121b8a..e0a05162accc 100644
> > --- a/security/integrity/evm/evm_main.c
> > +++ b/security/integrity/evm/evm_main.c
> > @@ -1062,14 +1062,16 @@ static int evm_inode_copy_up_xattr(struct dentry *src, const char *name)
> > * evm_inode_init_security - initializes security.evm HMAC value
> > */
> > int evm_inode_init_security(struct inode *inode, struct inode *dir,
> > - const struct qstr *qstr, struct xattr *xattrs,
> > - int *xattr_count)
> > + const struct qstr *qstr,
> > + struct xattr_ctx *xattr_ctx)
>
> the rest looks good.
> Pls split the patch. Introduce xattr_ctx first across the LSMs.
> Then another patch with a new kfunc.
>
> pw-bot: cr
^ permalink raw reply
* Re: [PATCH 1/2] bpf: lsm: disable xfrm_decode_session hook attachment
From: Alexei Starovoitov @ 2026-06-23 3:11 UTC (permalink / raw)
To: include
Cc: LSM List, bpf, LKML, stable, KP Singh, Matt Bobrowski,
Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
Eduard Zingerman, Kumar Kartikeya Dwivedi, Martin KaFai Lau,
Song Liu, Yonghong Song, Jiri Olsa, Emil Tsalapatis,
Florent Revest, Brendan Jackman
In-Reply-To: <20260619130305.27779-1-include@grrlz.net>
On Fri, Jun 19, 2026 at 6:03 AM Bradley Morgan <include@grrlz.net> wrote:
>
> BPF LSM programs can currently attach to xfrm_decode_session(). That
> hook may return an error, but security_skb_classify_flow() calls it
> from a void path and triggers BUG_ON() if an error is returned.
>
> Disable BPF attachment to the hook to prevent a BPF LSM program from
> turning packet classification into a full panic.
>
> Fixes: 9e4e01dfd325 ("bpf: lsm: Implement attach, detach and execution")
> Cc: stable@vger.kernel.org
> Signed-off-by: Bradley Morgan <include@grrlz.net>
> ---
> kernel/bpf/bpf_lsm.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> index 564071a92d7d..1433809bb166 100644
> --- a/kernel/bpf/bpf_lsm.c
> +++ b/kernel/bpf/bpf_lsm.c
> @@ -51,6 +51,9 @@ BTF_ID(func, bpf_lsm_key_getsecurity)
> #ifdef CONFIG_AUDIT
> BTF_ID(func, bpf_lsm_audit_rule_match)
> #endif
> +#ifdef CONFIG_SECURITY_NETWORK_XFRM
> +BTF_ID(func, bpf_lsm_xfrm_decode_session)
> +#endif
Applied this fix to bpf tree.
^ permalink raw reply
* [PATCH] apparmor: mv get_loaddata_common_ref() into CONFIG_SECURITY_APPARMOR_EXPORT_BINARY block
From: yaolu @ 2026-06-23 1:50 UTC (permalink / raw)
To: john.johansen, paul, jmorris, serge
Cc: apparmor, linux-security-module, linux-kernel, Lu Yao, k2ci
From: Lu Yao <yaolu@kylinos.cn>
When SECURITY_APPARMOR_EXPORT_BINARY is not set, the compiler emits an
unused-function warning which is promoted to an error with -Werror:
security/apparmor/apparmorfs.c:177:28: error: ‘get_loaddata_common_ref’ defined but not used [-Werror=unused-function]
Move the function into the #ifdef block to match its only call site,
silencing the warning.
Fixes: 8e135b8aee5a ("apparmor: fix race between freeing data and fs accessing it")
Reported-by: k2ci <kernel-bot@kylinos.cn>
Signed-off-by: Lu Yao <yaolu@kylinos.cn>
---
security/apparmor/apparmorfs.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/security/apparmor/apparmorfs.c b/security/apparmor/apparmorfs.c
index ededaf46f3ca..f762b101d682 100644
--- a/security/apparmor/apparmorfs.c
+++ b/security/apparmor/apparmorfs.c
@@ -174,14 +174,6 @@ static struct aa_proxy *get_proxy_common_ref(struct aa_common_ref *ref)
return NULL;
}
-static struct aa_loaddata *get_loaddata_common_ref(struct aa_common_ref *ref)
-{
- if (ref)
- return aa_get_i_loaddata(container_of(ref, struct aa_loaddata,
- count));
- return NULL;
-}
-
static void aa_put_common_ref(struct aa_common_ref *ref)
{
if (!ref)
@@ -1318,6 +1310,14 @@ static const struct file_operations seq_rawdata_ ##NAME ##_fops = { \
.release = seq_rawdata_release, \
} \
+static struct aa_loaddata *get_loaddata_common_ref(struct aa_common_ref *ref)
+{
+ if (ref)
+ return aa_get_i_loaddata(container_of(ref, struct aa_loaddata,
+ count));
+ return NULL;
+}
+
static int seq_rawdata_open(struct inode *inode, struct file *file,
int (*show)(struct seq_file *, void *))
{
--
2.25.1
^ permalink raw reply related
* Re: [PATCH bpf-next v3 1/2] bpf: add bpf_init_inode_xattr kfunc for atomic inode labeling
From: Alexei Starovoitov @ 2026-06-22 23:57 UTC (permalink / raw)
To: David Windsor, viro, brauner, jack, ast, daniel, john.fastabend,
andrii, eddyz87, memxor, martin.lau, song, yonghong.song, jolsa,
emil, kpsingh, mattbobrowski, paul, jmorris, serge, zohar,
roberto.sassu, dmitry.kasatkin, eric.snowberg,
stephen.smalley.work, omosnace, casey, shuah
Cc: linux-kernel, linux-fsdevel, bpf, linux-security-module,
linux-integrity, selinux, linux-kselftest
In-Reply-To: <20260618203411.73917-2-dwindsor@gmail.com>
On Thu Jun 18, 2026 at 1:34 PM PDT, David Windsor wrote:
> +
> +static int __bpf_init_inode_xattr(struct xattr_ctx *xattr_ctx,
> + const char *name__str,
> + const struct bpf_dynptr *value_p)
> +{
> + struct bpf_dynptr_kern *value_ptr = (struct bpf_dynptr_kern *)value_p;
> + size_t name_len;
> + void *xattr_value;
> + struct xattr *xattr;
> + struct xattr *xattrs;
> + int *xattr_count;
> + const void *value;
> + u32 value_len;
> +
> + if (!xattr_ctx || !name__str)
> + return -EINVAL;
> +
> + xattrs = xattr_ctx->xattrs;
> + xattr_count = xattr_ctx->xattr_count;
> + if (!xattrs || !xattr_count)
> + return -EINVAL;
> + if (bpf_xattrs_used(xattr_ctx) >= BPF_LSM_INODE_INIT_XATTRS)
> + return -ENOSPC;
This check is good to have, but it's enough. No need to duplicate it.
More below.
> +
> + name_len = strlen(name__str);
> + if (name_len == 0 || name_len > XATTR_NAME_MAX)
> + return -EINVAL;
> + if (strncmp(name__str, XATTR_BPF_LSM_SUFFIX,
> + sizeof(XATTR_BPF_LSM_SUFFIX) - 1))
> + return -EPERM;
> +
> + value_len = __bpf_dynptr_size(value_ptr);
> + if (value_len == 0 || value_len > XATTR_SIZE_MAX)
> + return -EINVAL;
> +
> + value = __bpf_dynptr_data(value_ptr, value_len);
> + if (!value)
> + return -EINVAL;
> +
> + /* Combine xattr value + name into one allocation. */
> + xattr_value = kmalloc(value_len + name_len + 1, GFP_KERNEL);
> + if (!xattr_value)
> + return -ENOMEM;
> +
> + memcpy(xattr_value, value, value_len);
> + memcpy(xattr_value + value_len, name__str, name_len);
> + ((char *)xattr_value)[value_len + name_len] = '\0';
> +
> + xattr = lsm_get_xattr_slot(xattr_ctx);
> + if (!xattr) {
> + kfree(xattr_value);
> + return -ENOSPC;
> + }
> +
> + xattr->value = xattr_value;
> + xattr->name = (const char *)xattr_value + value_len;
> + xattr->value_len = value_len;
> +
> + return 0;
> +}
> +
> +/**
> + * bpf_init_inode_xattr - set an xattr on a new inode from inode_init_security
> + * @xattr_ctx: inode_init_security xattr state from the hook context
> + * @name__str: xattr name (e.g., "bpf.file_label")
> + * @value_p: dynptr containing the xattr value
> + *
> + * Only callable from lsm/inode_init_security programs.
> + *
> + * Return: 0 on success, negative error on failure.
> + */
> +__bpf_kfunc int bpf_init_inode_xattr(struct xattr_ctx *xattr_ctx,
> + const char *name__str,
> + const struct bpf_dynptr *value_p)
> +{
> + return __bpf_init_inode_xattr(xattr_ctx, name__str, value_p);
> +}
> +
> __bpf_kfunc_end_defs();
>
> BTF_KFUNCS_START(bpf_fs_kfunc_set_ids)
> @@ -385,13 +477,25 @@ BTF_ID_FLAGS(func, bpf_get_file_xattr, KF_SLEEPABLE)
> BTF_ID_FLAGS(func, bpf_set_dentry_xattr, KF_SLEEPABLE)
> BTF_ID_FLAGS(func, bpf_remove_dentry_xattr, KF_SLEEPABLE)
> BTF_ID_FLAGS(func, bpf_real_inode, KF_SLEEPABLE | KF_RET_NULL)
> +BTF_ID_FLAGS(func, bpf_init_inode_xattr, KF_SLEEPABLE)
> BTF_KFUNCS_END(bpf_fs_kfunc_set_ids)
>
> +BTF_ID_LIST(bpf_lsm_inode_init_security_btf_ids)
> +BTF_ID(func, bpf_lsm_inode_init_security)
> +
> +BTF_ID_LIST(bpf_init_inode_xattr_btf_ids)
> +BTF_ID(func, bpf_init_inode_xattr)
> +
> static int bpf_fs_kfuncs_filter(const struct bpf_prog *prog, u32 kfunc_id)
> {
> if (!btf_id_set8_contains(&bpf_fs_kfunc_set_ids, kfunc_id) ||
> - prog->type == BPF_PROG_TYPE_LSM)
> + prog->type == BPF_PROG_TYPE_LSM) {
> + /* bpf_init_inode_xattr only attaches to inode_init_security. */
> + if (kfunc_id == bpf_init_inode_xattr_btf_ids[0] &&
> + prog->aux->attach_btf_id != bpf_lsm_inode_init_security_btf_ids[0])
> + return -EACCES;
This is unnecessary. Only one hook will have xattr_ctx type.
The normal verifier type enforcement will do its work.
> return 0;
> + }
> return -EACCES;
> }
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 7719f6528445..f14bfcda78db 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1752,6 +1752,7 @@ struct bpf_prog_aux {
> u32 real_func_cnt; /* includes hidden progs, only used for JIT and freeing progs */
> u32 func_idx; /* 0 for non-func prog, the index in func array for func prog */
> u32 attach_btf_id; /* in-kernel BTF type id to attach to */
> + u32 attach_limit; /* max concurrent attachments (0 = unlimited) */
no need.
> u32 attach_st_ops_member_off;
> u32 ctx_arg_info_size;
> u32 max_rdonly_access;
> diff --git a/include/linux/bpf_lsm.h b/include/linux/bpf_lsm.h
> index 143775a27a2a..b655c708818e 100644
> --- a/include/linux/bpf_lsm.h
> +++ b/include/linux/bpf_lsm.h
> @@ -19,6 +19,9 @@
> #include <linux/lsm_hook_defs.h>
> #undef LSM_HOOK
>
> +/* max bpf xattrs per inode */
> +#define BPF_LSM_INODE_INIT_XATTRS 4
> +
> struct bpf_storage_blob {
> struct bpf_local_storage __rcu *storage;
> };
> diff --git a/include/linux/evm.h b/include/linux/evm.h
> index 913f4573b203..0aa151288b36 100644
> --- a/include/linux/evm.h
> +++ b/include/linux/evm.h
> @@ -12,6 +12,8 @@
> #include <linux/integrity.h>
> #include <linux/xattr.h>
>
> +struct xattr_ctx;
> +
> #ifdef CONFIG_EVM
> extern int evm_set_key(void *key, size_t keylen);
> extern enum integrity_status evm_verifyxattr(struct dentry *dentry,
> @@ -21,8 +23,8 @@ extern enum integrity_status evm_verifyxattr(struct dentry *dentry,
> int evm_fix_hmac(struct dentry *dentry, const char *xattr_name,
> const char *xattr_value, size_t xattr_value_len);
> int evm_inode_init_security(struct inode *inode, struct inode *dir,
> - const struct qstr *qstr, struct xattr *xattrs,
> - int *xattr_count);
> + const struct qstr *qstr,
> + struct xattr_ctx *xattr_ctx);
> extern bool evm_revalidate_status(const char *xattr_name);
> extern int evm_protected_xattr_if_enabled(const char *req_xattr_name);
> extern int evm_read_protected_xattrs(struct dentry *dentry, u8 *buffer,
> @@ -63,8 +65,7 @@ static inline int evm_fix_hmac(struct dentry *dentry, const char *xattr_name,
>
> static inline int evm_inode_init_security(struct inode *inode, struct inode *dir,
> const struct qstr *qstr,
> - struct xattr *xattrs,
> - int *xattr_count)
> + struct xattr_ctx *xattr_ctx)
> {
> return 0;
> }
> diff --git a/include/linux/lsm_hook_defs.h b/include/linux/lsm_hook_defs.h
> index 65c9609ec207..f62780fbeb9e 100644
> --- a/include/linux/lsm_hook_defs.h
> +++ b/include/linux/lsm_hook_defs.h
> @@ -116,8 +116,8 @@ LSM_HOOK(int, 0, inode_alloc_security, struct inode *inode)
> LSM_HOOK(void, LSM_RET_VOID, inode_free_security, struct inode *inode)
> LSM_HOOK(void, LSM_RET_VOID, inode_free_security_rcu, void *inode_security)
> LSM_HOOK(int, -EOPNOTSUPP, inode_init_security, struct inode *inode,
> - struct inode *dir, const struct qstr *qstr, struct xattr *xattrs,
> - int *xattr_count)
> + struct inode *dir, const struct qstr *qstr,
> + struct xattr_ctx *xattr_ctx)
> LSM_HOOK(int, 0, inode_init_security_anon, struct inode *inode,
> const struct qstr *name, const struct inode *context_inode)
> LSM_HOOK(int, 0, inode_create, struct inode *dir, struct dentry *dentry,
> diff --git a/include/linux/lsm_hooks.h b/include/linux/lsm_hooks.h
> index b4f8cad53ddb..710e48caaeba 100644
> --- a/include/linux/lsm_hooks.h
> +++ b/include/linux/lsm_hooks.h
> @@ -200,20 +200,18 @@ extern struct lsm_static_calls_table static_calls_table __ro_after_init;
>
> /**
> * lsm_get_xattr_slot - Return the next available slot and increment the index
> - * @xattrs: array storing LSM-provided xattrs
> - * @xattr_count: number of already stored xattrs (updated)
> + * @ctx: xattr state shared by inode_init_security hooks
> *
> - * Retrieve the first available slot in the @xattrs array to fill with an xattr,
> - * and increment @xattr_count.
> + * Retrieve the first available slot in the @ctx->xattrs array to fill with an
> + * xattr, and increment @ctx->xattr_count.
> *
> - * Return: The slot to fill in @xattrs if non-NULL, NULL otherwise.
> + * Return: The slot to fill in @ctx->xattrs if non-NULL, NULL otherwise.
> */
> -static inline struct xattr *lsm_get_xattr_slot(struct xattr *xattrs,
> - int *xattr_count)
> +static inline struct xattr *lsm_get_xattr_slot(struct xattr_ctx *ctx)
> {
> - if (unlikely(!xattrs))
> + if (unlikely(!ctx || !ctx->xattrs || !ctx->xattr_count))
> return NULL;
> - return &xattrs[(*xattr_count)++];
> + return &ctx->xattrs[(*ctx->xattr_count)++];
> }
>
> #endif /* ! __LINUX_LSM_HOOKS_H */
> diff --git a/include/linux/security.h b/include/linux/security.h
> index 153e9043058f..1f8e84e7dd7e 100644
> --- a/include/linux/security.h
> +++ b/include/linux/security.h
> @@ -68,6 +68,11 @@ struct watch;
> struct watch_notification;
> struct lsm_ctx;
>
> +struct xattr_ctx {
> + struct xattr *xattrs;
> + int *xattr_count;
> +};
> +
> /* Default (no) options for the capable function */
> #define CAP_OPT_NONE 0x0
> /* If capable should audit the security request */
> diff --git a/kernel/bpf/bpf_lsm.c b/kernel/bpf/bpf_lsm.c
> index 564071a92d7d..86a8e188b900 100644
> --- a/kernel/bpf/bpf_lsm.c
> +++ b/kernel/bpf/bpf_lsm.c
> @@ -113,6 +113,9 @@ void bpf_lsm_find_cgroup_shim(const struct bpf_prog *prog,
> }
> #endif
>
> +BTF_ID_LIST_SINGLE(bpf_lsm_inode_init_security_btf_ids, func,
> + bpf_lsm_inode_init_security)
> +
> int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
> const struct bpf_prog *prog)
> {
> @@ -137,6 +140,12 @@ int bpf_lsm_verify_prog(struct bpf_verifier_log *vlog,
> return -EINVAL;
> }
>
> + /* bpf reserves a fixed number of xattr slots for itself.
> + * Set the attach limit so the trampoline rejects excess attaches.
> + */
> + if (btf_id == bpf_lsm_inode_init_security_btf_ids[0])
> + prog->aux->attach_limit = BPF_LSM_INODE_INIT_XATTRS;
> +
> return 0;
> }
>
> @@ -315,6 +324,7 @@ BTF_ID(func, bpf_lsm_inode_create)
> BTF_ID(func, bpf_lsm_inode_free_security)
> BTF_ID(func, bpf_lsm_inode_getattr)
> BTF_ID(func, bpf_lsm_inode_getxattr)
> +BTF_ID(func, bpf_lsm_inode_init_security)
> BTF_ID(func, bpf_lsm_inode_mknod)
> BTF_ID(func, bpf_lsm_inode_need_killpriv)
> BTF_ID(func, bpf_lsm_inode_post_setxattr)
> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> index 1a721fc4bef5..b41b02173e24 100644
> --- a/kernel/bpf/trampoline.c
> +++ b/kernel/bpf/trampoline.c
> @@ -859,6 +859,9 @@ static int bpf_trampoline_add_prog(struct bpf_trampoline *tr,
> }
> if (cnt >= BPF_MAX_TRAMP_LINKS)
> return -E2BIG;
> + if (node->link->prog->aux->attach_limit &&
> + tr->progs_cnt[kind] >= node->link->prog->aux->attach_limit)
> + return -E2BIG;
No need. The check inside kfunc is enough.
> if (!hlist_unhashed(&node->tramp_hlist))
> /* prog already linked */
> return -EBUSY;
> diff --git a/security/bpf/hooks.c b/security/bpf/hooks.c
> index 40efde233f3a..d7c44c5c0e30 100644
> --- a/security/bpf/hooks.c
> +++ b/security/bpf/hooks.c
> @@ -30,6 +30,7 @@ static int __init bpf_lsm_init(void)
>
> struct lsm_blob_sizes bpf_lsm_blob_sizes __ro_after_init = {
> .lbs_inode = sizeof(struct bpf_storage_blob),
> + .lbs_xattr_count = BPF_LSM_INODE_INIT_XATTRS,
> };
>
> DEFINE_LSM(bpf) = {
> diff --git a/security/integrity/evm/evm_main.c b/security/integrity/evm/evm_main.c
> index b59e3f121b8a..e0a05162accc 100644
> --- a/security/integrity/evm/evm_main.c
> +++ b/security/integrity/evm/evm_main.c
> @@ -1062,14 +1062,16 @@ static int evm_inode_copy_up_xattr(struct dentry *src, const char *name)
> * evm_inode_init_security - initializes security.evm HMAC value
> */
> int evm_inode_init_security(struct inode *inode, struct inode *dir,
> - const struct qstr *qstr, struct xattr *xattrs,
> - int *xattr_count)
> + const struct qstr *qstr,
> + struct xattr_ctx *xattr_ctx)
the rest looks good.
Pls split the patch. Introduce xattr_ctx first across the LSMs.
Then another patch with a new kfunc.
pw-bot: cr
^ permalink raw reply
* [PATCH] tests: add a regression test for the TCP Fast Open connect-mediation bypass
From: Bryam Vargas via B4 Relay @ 2026-06-22 23:27 UTC (permalink / raw)
To: apparmor; +Cc: linux-security-module, John Johansen, Ryan Lee
From: Bryam Vargas <hexlabsecurity@proton.me>
sendto(MSG_FASTOPEN) performs an implicit connect that the kernel's
apparmor_socket_sendmsg() did not mediate as a connect, so a profile
granting inet/inet6 stream send but denying connect was bypassed. Add a
test that, under such a profile, asserts connect(2) is denied AND
sendto(MSG_FASTOPEN) is also denied -- the latter requires the kernel fix
"apparmor: mediate the implicit connect of TCP fast open sendmsg".
It exercises both producers the fix guards -- plain TCP (inet/inet6) and
MPTCP (IPPROTO_MPTCP) -- plus a positive control where connect is allowed.
The test red-baselines on a vulnerable kernel and skips cleanly when the
required fine-grained network mediation or TCP Fast Open is unavailable
(requires_any_of_kernel_features / requires_parser_support, plus a
tcp_fastopen guard); the MPTCP cases are skipped if MPTCP is disabled.
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
This is the userspace regression test for the AppArmor TCP Fast Open
connect-mediation kernel fix posted to linux-security-module:
https://lore.kernel.org/all/20260622-b4-disp-aba401c6-v1-1-9d74343c7ced@proton.me/
It mirrors the SELinux testsuite test ("[PATCH testsuite] tests/inet_socket:
add tests for TCP Fast Open", Stephen Smalley) and was requested by the
AppArmor team (Ryan Lee).
It covers both producers the kernel fix mediates -- plain TCP (inet/inet6)
and MPTCP -- plus a positive control. It red-baselines on a vulnerable
kernel (the fastopen assertions fail) and skips cleanly when TCP Fast Open
or fine-grained network mediation is unavailable.
---
tests/regression/apparmor/Makefile | 2 +
tests/regression/apparmor/net_inet_tcp_fastopen.c | 241 +++++++++++++++++++++
tests/regression/apparmor/net_inet_tcp_fastopen.sh | 119 ++++++++++
3 files changed, 362 insertions(+)
diff --git a/tests/regression/apparmor/Makefile b/tests/regression/apparmor/Makefile
index 345f39968..18e408f5c 100644
--- a/tests/regression/apparmor/Makefile
+++ b/tests/regression/apparmor/Makefile
@@ -151,6 +151,7 @@ SRC=access.c \
named_pipe.c \
net_inet_rcv.c \
net_inet_snd.c \
+ net_inet_tcp_fastopen.c \
net_raw.c \
open.c \
openat.c \
@@ -325,6 +326,7 @@ TESTS=aa_exec \
namespaces \
net_iface \
net_inet \
+ net_inet_tcp_fastopen \
net_raw \
overlayfs_kernel \
open \
diff --git a/tests/regression/apparmor/net_inet_tcp_fastopen.c b/tests/regression/apparmor/net_inet_tcp_fastopen.c
new file mode 100644
index 000000000..cfc1ff9c5
--- /dev/null
+++ b/tests/regression/apparmor/net_inet_tcp_fastopen.c
@@ -0,0 +1,241 @@
+/*
+ * Copyright (C) 2026 Canonical, Ltd.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation, version 2 of the
+ * License.
+ */
+
+/*
+ * TCP Fast Open connect-mediation bypass regression test.
+ *
+ * Under an AppArmor profile that grants inet/inet6 stream "send" but DENIES
+ * "connect", a plain connect(2) must be refused (EACCES/EPERM). Historically
+ * the kernel's TFO fast path (sendto(..., MSG_FASTOPEN, ...), which performs
+ * an implicit connect) only checked the send permission (AA_NET_SEND 0x02)
+ * and skipped the connect permission (AA_NET_CONNECT 0x40), so a confined
+ * task could open an outbound connection that connect(2) would have blocked.
+ * The kernel fix mediates both producers: plain TCP and MPTCP (IPPROTO_MPTCP).
+ *
+ * This binary takes a mode and asserts the operation is DENIED:
+ * argv[1] = "connect" -> baseline: connect(2) must be denied
+ * argv[1] = "fastopen" -> the bug: sendto(MSG_FASTOPEN) must be denied
+ * argv[2] = family: "inet"/"inet6" (TCP) or "minet"/"minet6" (MPTCP)
+ * argv[3] = port (the listener port, set up by this same process)
+ *
+ * Output contract (parsed by checktestfg in prologue.inc):
+ * "PASS\n" -> the operation was DENIED as required (regression OK)
+ * "FAIL ..."-> the operation was ALLOWED (connect bypass) OR a setup error
+ *
+ * The .sh runs this with expected outcome "pass"; it also enables TCP Fast
+ * Open first, so an EOPNOTSUPP here is a real setup error, not a skip.
+ */
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <errno.h>
+#include <signal.h>
+#include <netinet/in.h>
+#include <netinet/tcp.h>
+#include <sys/socket.h>
+#include <sys/types.h>
+#include <sys/wait.h>
+#include <arpa/inet.h>
+
+#ifndef MSG_FASTOPEN
+#define MSG_FASTOPEN 0x20000000
+#endif
+
+#ifndef IPPROTO_MPTCP
+#define IPPROTO_MPTCP 262
+#endif
+
+/* Map a family token to (AF_*, protocol). "minet"/"minet6" select MPTCP.
+ * Returns 0 on success, -1 on an unknown token.
+ */
+static int parse_family(const char *tok, int *family, int *proto)
+{
+ *proto = 0;
+ if (strcmp(tok, "inet") == 0) {
+ *family = AF_INET;
+ } else if (strcmp(tok, "inet6") == 0) {
+ *family = AF_INET6;
+ } else if (strcmp(tok, "minet") == 0) {
+ *family = AF_INET;
+ *proto = IPPROTO_MPTCP;
+ } else if (strcmp(tok, "minet6") == 0) {
+ *family = AF_INET6;
+ *proto = IPPROTO_MPTCP;
+ } else {
+ return -1;
+ }
+ return 0;
+}
+
+/* Build a loopback sockaddr for the requested family. Returns addrlen. */
+static socklen_t make_addr(int family, int port, struct sockaddr_storage *ss)
+{
+ memset(ss, 0, sizeof(*ss));
+ if (family == AF_INET) {
+ struct sockaddr_in *a = (struct sockaddr_in *)ss;
+
+ a->sin_family = AF_INET;
+ a->sin_port = htons(port);
+ inet_pton(AF_INET, "127.0.0.1", &a->sin_addr);
+ return sizeof(*a);
+ }
+ {
+ struct sockaddr_in6 *a = (struct sockaddr_in6 *)ss;
+
+ a->sin6_family = AF_INET6;
+ a->sin6_port = htons(port);
+ inet_pton(AF_INET6, "::1", &a->sin6_addr);
+ return sizeof(*a);
+ }
+}
+
+/* Start a plain TCP listener so the connect/TFO target exists. Returns the fd
+ * or -1. A TCP listener accepts both TCP and MPTCP clients, which keeps the
+ * test on the client-side mediation under examination. bind/listen perms are
+ * granted by the profile so this must succeed.
+ */
+static int start_listener(int family, int port)
+{
+ int s, one = 1;
+ struct sockaddr_storage ss;
+ socklen_t len = make_addr(family, port, &ss);
+
+ s = socket(family, SOCK_STREAM, 0);
+ if (s < 0) {
+ printf("FAIL - listener socket: %m\n");
+ return -1;
+ }
+ (void)setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
+ /* Enable TFO on the listener (qlen). Best-effort; the mediation check
+ * under test happens on the client side. */
+ (void)setsockopt(s, IPPROTO_TCP, TCP_FASTOPEN, &one, sizeof(one));
+ if (bind(s, (struct sockaddr *)&ss, len) < 0) {
+ printf("FAIL - listener bind: %m\n");
+ close(s);
+ return -1;
+ }
+ if (listen(s, 5) < 0) {
+ printf("FAIL - listener listen: %m\n");
+ close(s);
+ return -1;
+ }
+ return s;
+}
+
+/* Returns 1 if the kernel DENIED the operation (EACCES/EPERM) => regression OK.
+ * Returns 0 if the operation was ALLOWED (connect bypass) => regression FAIL.
+ * Returns -1 on a setup error.
+ */
+static int try_connect(int family, int proto, int port)
+{
+ int s, rc;
+ struct sockaddr_storage ss;
+ socklen_t len = make_addr(family, port, &ss);
+
+ s = socket(family, SOCK_STREAM, proto);
+ if (s < 0)
+ return -1;
+ rc = connect(s, (struct sockaddr *)&ss, len);
+ if (rc == 0) {
+ close(s);
+ return 0; /* allowed */
+ }
+ if (errno == EACCES || errno == EPERM) {
+ close(s);
+ return 1; /* denied by AppArmor */
+ }
+ /* ECONNREFUSED/ETIMEDOUT mean it reached the network: mediation did not
+ * block it, so count as allowed. */
+ close(s);
+ return (errno == ECONNREFUSED || errno == ETIMEDOUT) ? 0 : -1;
+}
+
+static int try_fastopen(int family, int proto, int port)
+{
+ int s;
+ ssize_t rc;
+ char msg[] = "tfo";
+ struct sockaddr_storage ss;
+ socklen_t len = make_addr(family, port, &ss);
+
+ s = socket(family, SOCK_STREAM, proto);
+ if (s < 0)
+ return -1;
+
+ /* The bug: this implicit-connect send must be mediated as a connect. */
+ rc = sendto(s, msg, sizeof(msg), MSG_FASTOPEN,
+ (struct sockaddr *)&ss, len);
+ if (rc >= 0) {
+ close(s);
+ return 0; /* allowed: connect bypass */
+ }
+ if (errno == EACCES || errno == EPERM) {
+ close(s);
+ return 1; /* denied by AppArmor */
+ }
+ if (errno == EOPNOTSUPP || errno == EINVAL) {
+ /* The .sh enabled TCP Fast Open before running, so this is a
+ * real setup error, not an expected condition. Fail loudly
+ * rather than masking it as a denial. */
+ close(s);
+ return -1;
+ }
+ close(s);
+ return (errno == ECONNREFUSED || errno == ETIMEDOUT) ? 0 : -1;
+}
+
+int main(int argc, char *argv[])
+{
+ int family, proto, port, denied, listener;
+ const char *mode;
+
+ if (argc < 4) {
+ printf("FAIL - usage: %s connect|fastopen inet|inet6|minet|minet6 port\n",
+ argv[0]);
+ return 1;
+ }
+ mode = argv[1];
+ if (parse_family(argv[2], &family, &proto) < 0) {
+ printf("FAIL - unknown family '%s'\n", argv[2]);
+ return 1;
+ }
+ port = atoi(argv[3]);
+
+ signal(SIGPIPE, SIG_IGN);
+
+ listener = start_listener(family, port);
+ if (listener < 0)
+ return 1; /* FAIL already printed */
+
+ if (strcmp(mode, "connect") == 0) {
+ denied = try_connect(family, proto, port);
+ } else if (strcmp(mode, "fastopen") == 0) {
+ denied = try_fastopen(family, proto, port);
+ } else {
+ printf("FAIL - unknown mode '%s'\n", mode);
+ close(listener);
+ return 1;
+ }
+
+ close(listener);
+
+ if (denied == 1) {
+ printf("PASS\n");
+ return 0;
+ }
+ if (denied == 0) {
+ printf("FAIL - %s was ALLOWED despite deny connect "
+ "(connect-mediation bypass)\n", mode);
+ return 1;
+ }
+ printf("FAIL - %s setup error: %m\n", mode);
+ return 1;
+}
diff --git a/tests/regression/apparmor/net_inet_tcp_fastopen.sh b/tests/regression/apparmor/net_inet_tcp_fastopen.sh
new file mode 100755
index 000000000..76300c53f
--- /dev/null
+++ b/tests/regression/apparmor/net_inet_tcp_fastopen.sh
@@ -0,0 +1,119 @@
+#! /bin/bash
+# Copyright (C) 2026 Canonical, Ltd.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation, version 2 of the
+# License.
+
+#=NAME net_inet_tcp_fastopen
+#=DESCRIPTION
+# Regression test for the TCP Fast Open connect-mediation bypass. Under a
+# profile that grants inet/inet6 stream "send" but DENIES "connect", a plain
+# connect(2) is refused, and sendto(..., MSG_FASTOPEN, ...) (which performs an
+# implicit connect) MUST also be refused -- for both plain TCP and MPTCP. Pre-fix
+# the TFO path checked only the send permission (AA_NET_SEND 0x02) and skipped
+# connect (AA_NET_CONNECT 0x40).
+#=END
+
+pwd=`dirname $0`
+pwd=`cd $pwd ; /bin/pwd`
+
+bin=$pwd
+
+. "$bin/prologue.inc"
+
+# Need fine-grained inet mediation (connect/send are separable only there).
+requires_any_of_kernel_features network_v8/af_inet network_v9/af_inet
+requires_parser_support "network (send) ip=::1,"
+
+settest net_inet_tcp_fastopen
+
+tfo_sysctl=/proc/sys/net/ipv4/tcp_fastopen
+tfo_saved=""
+
+cleanup()
+{
+ # restore the original tcp_fastopen value if we changed it
+ if [ -n "$tfo_saved" ]; then
+ echo "$tfo_saved" > "$tfo_sysctl" 2>/dev/null || true
+ fi
+}
+do_onexit="cleanup"
+
+# The sendto(MSG_FASTOPEN) client path needs the TCP Fast Open client bit
+# (0x1). Enable it for the run; if it is unavailable (no sysctl, or it cannot
+# be enabled) the bug cannot be exercised at all, so skip rather than report a
+# spurious failure.
+if [ ! -w "$tfo_sysctl" ]; then
+ echo " TCP Fast Open sysctl ($tfo_sysctl) not available. Skipping tests ..."
+ exit 0
+fi
+tfo_saved=`cat "$tfo_sysctl"`
+echo $((tfo_saved | 1)) > "$tfo_sysctl" 2>/dev/null || true
+if [ $(($(cat "$tfo_sysctl") & 1)) -ne 1 ]; then
+ echo " Could not enable the TCP Fast Open client bit. Skipping tests ..."
+ exit 0
+fi
+
+# add ::1 if not already present (loopback usually has it)
+ip -6 addr add ::1/128 dev lo 2>/dev/null || true
+
+# pick a free port for the listener this binary creates
+port=4321
+while lsof -i:$port >/dev/null 2>&1; do
+ let port=$port+1
+done
+
+# Profile: allow stream send/receive + the perms needed to stand up the
+# in-process listener (bind/listen/accept), allow setopt/getopt for TFO
+# sockopts, but explicitly DENY connect on both inet and inet6.
+gen_send_no_connect()
+{
+ genprofile \
+ "network;(send,receive,accept,listen,bind);ip=127.0.0.1;port=$port" \
+ "network;(send,receive,accept,listen,bind);ip=::1;port=$port" \
+ "network;(send,receive);peer=(ip=127.0.0.1)" \
+ "network;(send,receive);peer=(ip=::1)" \
+ "network;(setopt,getopt);ip=0.0.0.0;port=0" \
+ "network;(setopt,getopt);ip=::0;port=0" \
+ "qual=deny:network;(connect);ip=127.0.0.1" \
+ "qual=deny:network;(connect);ip=::1"
+}
+
+# ---- inet (IPv4) ----
+gen_send_no_connect
+# baseline: a normal connect(2) must be denied -> binary prints PASS (denied),
+# expected outcome 'pass'
+runchecktest "TFO inet - connect(2) denied" pass connect inet $port
+# the bug: sendto(MSG_FASTOPEN) must ALSO be denied post-fix
+runchecktest "TFO inet - sendto(MSG_FASTOPEN) denied" pass fastopen inet $port
+
+# ---- inet6 (IPv6) ----
+gen_send_no_connect
+runchecktest "TFO inet6 - connect(2) denied" pass connect inet6 $port
+runchecktest "TFO inet6 - sendto(MSG_FASTOPEN) denied" pass fastopen inet6 $port
+
+# ---- MPTCP: the second producer the fix guards (IPPROTO_MPTCP) ----
+# The deny-connect rule is family/type based, so it covers MPTCP (inet/inet6
+# stream) too. Only run when MPTCP is enabled.
+if [ "`cat /proc/sys/net/mptcp/enabled 2>/dev/null`" = "1" ]; then
+ gen_send_no_connect
+ runchecktest "TFO MPTCP inet - connect(2) denied" pass connect minet $port
+ runchecktest "TFO MPTCP inet - sendto(MSG_FASTOPEN) denied" pass fastopen minet $port
+ gen_send_no_connect
+ runchecktest "TFO MPTCP inet6 - connect(2) denied" pass connect minet6 $port
+ runchecktest "TFO MPTCP inet6 - sendto(MSG_FASTOPEN) denied" pass fastopen minet6 $port
+fi
+
+# ---- positive control: when connect IS allowed, both succeed (no false deny) ----
+genprofile \
+ "network;(connect,send,receive,accept,listen,bind);ip=127.0.0.1;port=$port" \
+ "network;(connect,send,receive);peer=(ip=127.0.0.1)" \
+ "network;(setopt,getopt);ip=0.0.0.0;port=0"
+# Here the binary's "denied" assertion is FALSE (op allowed), so it prints
+# FAIL; we expect that, i.e. expected outcome 'fail'.
+runchecktest "TFO inet - connect allowed (control)" fail connect inet $port
+runchecktest "TFO inet - fastopen allowed (control)" fail fastopen inet $port
+
+exit 0
---
base-commit: bdccc1ebd2e1a1b75ceb8f87b23831fe273b9ebb
change-id: 20260622-b4-disp-220b400d-3d7fd53bce49
Best regards,
--
bryamzxz <hexlabsecurity@proton.me>
^ permalink raw reply related
* [PATCH] apparmor: mediate the implicit connect of TCP fast open sendmsg
From: Bryam Vargas via B4 Relay @ 2026-06-22 20:57 UTC (permalink / raw)
To: Ryan Lee, John Johansen
Cc: Mickael Salaun, Stephen Smalley, James Morris, Matthieu Buffet,
linux-security-module, Mikhail Ivanov, Serge E. Hallyn,
linux-kernel, Paul Moore, apparmor
From: Bryam Vargas <hexlabsecurity@proton.me>
sendmsg()/sendto() with MSG_FASTOPEN is a combination of connect(2) and
write(2): it opens the connection in the SYN. apparmor_socket_sendmsg()
only checks AA_MAY_SEND, so a profile that grants send but denies connect
lets a confined task open an outbound TCP/MPTCP connection that connect(2)
would have refused, bypassing connect mediation.
Mediate the implicit connect when MSG_FASTOPEN is set and a destination
is supplied. Add it to apparmor_socket_sendmsg() (not the shared
aa_sock_msg_perm() helper, which recvmsg also uses) and call aa_sk_perm()
directly, mirroring the selinux and tomoyo fixes. sk_is_tcp() does not
cover MPTCP fast open, so the SOCK_STREAM/IPPROTO_MPTCP arm is explicit.
Fixes: cf60af03ca4e ("net-tcp: Fast Open client - sendmsg(MSG_FASTOPEN)")
Cc: stable@vger.kernel.org
Signed-off-by: Bryam Vargas <hexlabsecurity@proton.me>
---
This is the patch and reproducer requested in [1]. A userspace regression test
(tests/regression/apparmor/net_inet_tcp_fastopen) follows separately to the
apparmor tree, as suggested.
Reproducer (behavioral; the bypassed value is policy, not bus state, so no special
hardware). Under a profile that grants inet/inet6 stream send but denies connect, on
the current Debian security kernel 6.12.94 (apparmor active):
[TCP ] connect(2)=EACCES sendto(MSG_FASTOPEN)=OK -> connect bypassed (listener accepted)
[TCP6] connect(2)=EACCES sendto(MSG_FASTOPEN)=OK -> connect bypassed
The kernel audit shows the connect(2) denial and no connect record for the fastopen
sendto:
apparmor="DENIED" operation="connect" profile="egress_restricted" comm="lsm_tfo_ab"
family="inet" sock_type="stream" protocol=6 requested_mask="connect" denied_mask="connect"
With this patch the fastopen sendto hits that same connect denial. Full reproducer
available on request.
Same-class fixes: selinux [2], tomoyo [3]; the original cross-LSM report (landlock,
the first instance) is [4].
[1] https://lore.kernel.org/r/20260619011138.264578-1-hexlabsecurity@proton.me
[2] https://lore.kernel.org/r/20260618175513.112443-2-stephen.smalley.work@gmail.com
[3] https://lore.kernel.org/r/20260619002207.61104-1-matthieu@buffet.re
[4] https://lore.kernel.org/r/20260616201615.275032-1-hexlabsecurity@proton.me
---
security/apparmor/lsm.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/security/apparmor/lsm.c b/security/apparmor/lsm.c
index 3491e9f60194..e01efdf50efa 100644
--- a/security/apparmor/lsm.c
+++ b/security/apparmor/lsm.c
@@ -1422,7 +1422,21 @@ static int aa_sock_msg_perm(const char *op, u32 request, struct socket *sock,
static int apparmor_socket_sendmsg(struct socket *sock,
struct msghdr *msg, int size)
{
- return aa_sock_msg_perm(OP_SENDMSG, AA_MAY_SEND, sock, msg, size);
+ int error = aa_sock_msg_perm(OP_SENDMSG, AA_MAY_SEND, sock, msg, size);
+
+ if (error)
+ return error;
+
+ /* TCP fast open carries connect() semantics in sendmsg(); mediate
+ * the implicit connect so it cannot bypass the connect permission.
+ */
+ if ((msg->msg_flags & MSG_FASTOPEN) && msg->msg_name &&
+ (sk_is_tcp(sock->sk) ||
+ (sk_is_inet(sock->sk) && sock->sk->sk_type == SOCK_STREAM &&
+ sock->sk->sk_protocol == IPPROTO_MPTCP)))
+ error = aa_sk_perm(OP_CONNECT, AA_MAY_CONNECT, sock->sk);
+
+ return error;
}
static int apparmor_socket_recvmsg(struct socket *sock,
---
base-commit: 4549871118cf616eecdd2d939f78e3b9e1dddc48
change-id: 20260622-b4-disp-aba401c6-f02842c82975
Best regards,
--
bryamzxz <hexlabsecurity@proton.me>
^ permalink raw reply related
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox