* Re: [PATCH v6 1/4] openat2: new OPENAT2_REGULAR flag support
From: Christian Brauner @ 2026-04-16 15:15 UTC
To: Jori Koolstra
Cc: Dorjoy Chowdhury, linux-fsdevel, linux-kernel, linux-api,
ceph-devel, gfs2, linux-nfs, linux-cifs, v9fs, linux-kselftest,
viro, jack, jlayton, chuck.lever, alex.aring, arnd, adilger,
mjguzik, smfrench, richard.henderson, mattst88, linmag7, tsbogend,
James.Bottomley, deller, davem, andreas, idryomov, amarkuze,
slava, agruenba, trondmy, anna, sfrench, pc, ronniesahlberg,
sprasad, tom, bharathsm, shuah, miklos, hansg, Aleksa Sarai
In-Reply-To: <1714293523.333222.1776351806025@kpc.webmail.kpnmail.nl>
On Thu, Apr 16, 2026 at 05:03:26PM +0200, Jori Koolstra wrote:
>
> > On 16-04-2026 16:21 CEST, Dorjoy Chowdhury <dorjoychy111@gmail.com> wrote:
> >
> >
> > On Thu, Apr 16, 2026 at 7:52 PM Jori Koolstra <jkoolstra@xs4all.nl> wrote:
> > >
> > > On Sat, Mar 28, 2026 at 11:22:22PM +0600, Dorjoy Chowdhury wrote:
> > > > diff --git a/arch/alpha/include/uapi/asm/fcntl.h b/arch/alpha/include/uapi/asm/fcntl.h
> > > > index 50bdc8e8a271..fe488bf7c18e 100644
> > > > --- a/arch/alpha/include/uapi/asm/fcntl.h
> > > > +++ b/arch/alpha/include/uapi/asm/fcntl.h
> > > > @@ -34,6 +34,7 @@
> > > >
> > > > #define O_PATH 040000000
> > > > #define __O_TMPFILE 0100000000
> > > > +#define OPENAT2_REGULAR 0200000000
> > > >
> > >
> > > I don't quite understand why we are adding OPENAT2_REGULAR inside the
> > > O_* flag range. Wasn't this supposed to be only supported for openat2()?
> > > If so, I don't see the need to waste an O_* flag bit. But maybe I am
> > > missing something.
> > >
> >
> > Yes, OPENAT2_REGULAR is only supported for openat2. I am not sure if I
> > got a specific review to not add OPENAT2_REGULAR in the O_* flag 32
> > bit range. But as far as I understand, for the old open system calls
> > we can't easily add new O_* flags as the older codepaths don't strip
> > off unknown bits which openat2 does. It's not easy to add new O_*
> > flags for the old open system calls since that could break userspace
> > programs.
>
> If I recall correctly, Aleksa has suggested we might also want to add
> O_EMPTYPATH to openat() instead of only allowing this for openat2().
> I am waiting to see what Christian thinks of this.
We can do that, yes. For O_EMPTYPATH that is workable.
I don't mind too much if we leave OPENAT2_REGULAR in the 32-bit flag
space. It'll silently be ignored but the flag name should give it away.
* Re: [PATCH v6 1/4] openat2: new OPENAT2_REGULAR flag support
From: Aleksa Sarai @ 2026-04-16 15:15 UTC
To: Dorjoy Chowdhury
Cc: Jori Koolstra, linux-fsdevel, linux-kernel, linux-api, ceph-devel,
gfs2, linux-nfs, linux-cifs, v9fs, linux-kselftest, viro, brauner,
jack, jlayton, chuck.lever, alex.aring, arnd, adilger, mjguzik,
smfrench, richard.henderson, mattst88, linmag7, tsbogend,
James.Bottomley, deller, davem, andreas, idryomov, amarkuze,
slava, agruenba, trondmy, anna, sfrench, pc, ronniesahlberg,
sprasad, tom, bharathsm, shuah, miklos, hansg
In-Reply-To: <CAFfO_h6pkyX=uN5uoXda6toTtT6KsahfBNBLom9i21HdZ7JOmQ@mail.gmail.com>
On 2026-04-16, Dorjoy Chowdhury <dorjoychy111@gmail.com> wrote:
> On Thu, Apr 16, 2026 at 7:52 PM Jori Koolstra <jkoolstra@xs4all.nl> wrote:
> >
> > On Sat, Mar 28, 2026 at 11:22:22PM +0600, Dorjoy Chowdhury wrote:
> > > diff --git a/arch/alpha/include/uapi/asm/fcntl.h b/arch/alpha/include/uapi/asm/fcntl.h
> > > index 50bdc8e8a271..fe488bf7c18e 100644
> > > --- a/arch/alpha/include/uapi/asm/fcntl.h
> > > +++ b/arch/alpha/include/uapi/asm/fcntl.h
> > > @@ -34,6 +34,7 @@
> > >
> > > #define O_PATH 040000000
> > > #define __O_TMPFILE 0100000000
> > > +#define OPENAT2_REGULAR 0200000000
> > >
> >
> > I don't quite understand why we are adding OPENAT2_REGULAR inside the
> > O_* flag range. Wasn't this supposed to be only supported for openat2()?
> > If so, I don't see the need to waste an O_* flag bit. But maybe I am
> > missing something.
> >
>
> Yes, OPENAT2_REGULAR is only supported for openat2. I am not sure if I
> got a specific review to not add OPENAT2_REGULAR in the O_* flag 32
> bit range. But as far as I understand, for the old open system calls
> we can't easily add new O_* flags as the older codepaths don't strip
> off unknown bits which openat2 does. It's not easy to add new O_*
> flags for the old open system calls since that could break userspace
> programs. So I guess it's okay to add OPENAT2_REGULAR in the 32 bits
> range anyway? (Also lots of code paths take 32bit flags param right
> now and those would need changing to take uint64_t instead but this is
> of course not a reason to not add the new flag outside of the 32
> bits).
Oh, I didn't notice that this wasn't mentioned here, we had a separate
discussion about it in a thread with Jori and I must've assumed we
discussed it in both. (My brain is also really not wired up to read
large octal values easily.)
While it is hard to add new O_* flags (hence OPENAT2_REGULAR), it's not
/impossible/ (Jori has a patch for OPENAT2_EMPTY_PATH that is safe to
add to O_* flags because of some fun historical coincidences).
I would have a slight preference towards segregating the bits, ideally
at the top end but even 1<<31 would've been nice. Then again, I'm not
too fussed either way to be honest...
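To make the octal values in the quoted arch/alpha hunk easier to read, here is a minimal sketch that maps each flag to its bit index. The ALPHA_* macro names and both helper functions are hypothetical; only the three values are taken from the quoted hunk.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the quoted arch/alpha hunk; the names are illustrative. */
#define ALPHA_O_PATH          040000000u
#define ALPHA_O_TMPFILE       0100000000u	/* __O_TMPFILE in the hunk */
#define ALPHA_OPENAT2_REGULAR 0200000000u

/* Return the bit index of a single-bit flag, or -1 if it is not one. */
int flag_bit_index(uint32_t flag)
{
	int bit = 0;

	if (flag == 0 || (flag & (flag - 1)) != 0)
		return -1;
	while ((flag >>= 1) != 0)
		bit++;
	return bit;
}

/* The new flag sits directly above the existing O_* bits on alpha. */
void check_alpha_flag_layout(void)
{
	assert(flag_bit_index(ALPHA_O_PATH) == 23);
	assert(flag_bit_index(ALPHA_O_TMPFILE) == 24);
	assert(flag_bit_index(ALPHA_OPENAT2_REGULAR) == 25);
}
```

In other words, the patch takes bit 25 on alpha, adjacent to __O_TMPFILE, whereas the segregated placement suggested here would be bit 31 or higher.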
--
Aleksa Sarai
https://www.cyphar.com/
* Re: [PATCH v6 0/4] OPENAT2_REGULAR flag support for openat2
From: Dorjoy Chowdhury @ 2026-04-16 15:22 UTC
To: Christian Brauner
Cc: linux-fsdevel, linux-kernel, linux-api, ceph-devel, gfs2,
linux-nfs, linux-cifs, v9fs, linux-kselftest, viro, jack, jlayton,
chuck.lever, alex.aring, arnd, adilger, mjguzik, smfrench,
richard.henderson, mattst88, linmag7, tsbogend, James.Bottomley,
deller, davem, andreas, idryomov, amarkuze, slava, agruenba,
trondmy, anna, sfrench, pc, ronniesahlberg, sprasad, tom,
bharathsm, shuah, miklos, hansg
In-Reply-To: <20260416-abgraben-seeweg-a44ce660957f@brauner>
On Thu, Apr 16, 2026 at 7:07 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Sat, 28 Mar 2026 23:22:21 +0600, Dorjoy Chowdhury wrote:
> > I came upon this "Ability to only open regular files" uapi feature suggestion
> > from https://uapi-group.org/kernel-features/#ability-to-only-open-regular-files
> > and thought it would be something I could do as a first patch and get to
> > know the kernel code a bit better.
> >
> > The following filesystems have been tested by building and booting the kernel
> > x86 bzImage in a Fedora 43 VM in QEMU. I have tested with OPENAT2_REGULAR that
> > regular files can be successfully opened and non-regular files (directory, fifo etc)
> > return -EFTYPE.
> > - btrfs
> > - NFS (loopback)
> > - SMB (loopback)
> >
> > [...]
>
> - I've added an explanation why OPENAT2_REGULAR is only needed for some
> ->atomic_open() implementers but not others. What I don't like is that
> we need all that custom handling in there but it's manageable.
>
> - I dropped the topmost style conversions. They really don't belong
> there and if we switch to something better we should use (1 << <nr>).
>
> - I split the EFTYPE errno introduction into a separate patch.
>
> ---
Thanks for fixing up and picking this one up!
>
> Applied to the vfs-7.2.openat.regular branch of the vfs/vfs.git tree.
> Patches in the vfs-7.2.openat.regular branch should appear in linux-next soon.
>
I don't see a vfs-7.2.openat.regular branch in vfs/vfs.git tree in
git.kernel.org. Maybe this hasn't been pushed yet?
> Please report any outstanding bugs that were missed during review in a
> new review to the original patch series allowing us to drop it.
>
> It's encouraged to provide Acked-bys and Reviewed-bys even though the
> patch has now been applied. If possible patch trailers will be updated.
>
> Note that commit hashes shown below are subject to change due to rebase,
> trailer updates or similar. If in doubt, please check the listed branch.
>
> tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> branch: master
>
I guess you meant vfs-7.2.openat.regular here?
Regards,
Dorjoy
* Re: [PATCH v6 1/4] openat2: new OPENAT2_REGULAR flag support
From: Jori Koolstra @ 2026-04-16 21:36 UTC
To: Christian Brauner
Cc: Dorjoy Chowdhury, linux-fsdevel, linux-kernel, linux-api,
ceph-devel, gfs2, linux-nfs, linux-cifs, v9fs, linux-kselftest,
viro, jack, jlayton, chuck.lever, alex.aring, arnd, adilger,
mjguzik, smfrench, richard.henderson, mattst88, linmag7, tsbogend,
James.Bottomley, deller, davem, andreas, idryomov, amarkuze,
slava, agruenba, trondmy, anna, sfrench, pc, ronniesahlberg,
sprasad, tom, bharathsm, shuah, miklos, hansg, Aleksa Sarai
In-Reply-To: <20260416-aufbau-sorgenfrei-cfa87c9ddc11@brauner>
> On 16-04-2026 17:15 CEST, Christian Brauner <brauner@kernel.org> wrote:
>
>
> On Thu, Apr 16, 2026 at 05:03:26PM +0200, Jori Koolstra wrote:
> >
> > If I recall correctly, Aleksa has suggested we might also want to add
> > O_EMPTYPATH to openat() instead of only allowing this for openat2().
> > I am waiting to see what Christian thinks of this.
>
> We can do that, yes. For O_EMPTYPATH that is workable.
All right, then I'll update the patch this weekend.
>
> I don't mind too much if we leave OPENAT2_REGULAR in the 32-bit flag
> space. It'll silently be ignored but the flag name should give it away.
I would also prefer to have the bits separated. Although it is unlikely
that we will add so many O_* that we will ever run out of space, it just
seems cleaner, and at no cost. But it's not too important.
Thanks,
Jori.
* Re: [PATCH v6 1/4] openat2: new OPENAT2_REGULAR flag support
From: Jori Koolstra @ 2026-04-16 21:42 UTC
To: Aleksa Sarai, Dorjoy Chowdhury, brauner
Cc: linux-fsdevel, linux-kernel, linux-api, ceph-devel, gfs2,
linux-nfs, linux-cifs, v9fs, linux-kselftest, viro, jack, jlayton,
chuck.lever, alex.aring, arnd, adilger, mjguzik, smfrench,
richard.henderson, mattst88, linmag7, tsbogend, James.Bottomley,
deller, davem, andreas, idryomov, amarkuze, slava, agruenba,
trondmy, anna, sfrench, pc, ronniesahlberg, sprasad, tom,
bharathsm, shuah, miklos, hansg
In-Reply-To: <2026-04-16-upstate-capable-deacon-petals-0l25lH@cyphar.com>
> On 16-04-2026 17:15 CEST, Aleksa Sarai <cyphar@cyphar.com> wrote:
>
>
> Oh, I didn't notice that this wasn't mentioned here, we had a separate
> discussion about it in a thread with Jori and I must've assumed we
> discussed it in both. (My brain is also really not wired up to read
> large octal values easily.)
>
> While it is hard to add new O_* flags (hence OPENAT2_REGULAR), it's not
> /impossible/ (Jori has a patch for OPENAT2_EMPTY_PATH that is safe to
> add to O_* flags because of some fun historical coincidences).
But it would change userspace, at least in theory, right? If anyone for
some reason decided to set whatever the bit will be for O_EMPTYPATH
in a call to openat(), and pass an empty string, relying on this to fail,
that will no longer be the case. But that is just really silly. Or are you
hinting at something else?
Thanks,
Jori.
* Re: [PATCH v6 1/4] openat2: new OPENAT2_REGULAR flag support
From: Aleksa Sarai @ 2026-04-17 7:58 UTC
To: Jori Koolstra
Cc: Dorjoy Chowdhury, brauner, linux-fsdevel, linux-kernel, linux-api,
ceph-devel, gfs2, linux-nfs, linux-cifs, v9fs, linux-kselftest,
viro, jack, jlayton, chuck.lever, alex.aring, arnd, adilger,
mjguzik, smfrench, richard.henderson, mattst88, linmag7, tsbogend,
James.Bottomley, deller, davem, andreas, idryomov, amarkuze,
slava, agruenba, trondmy, anna, sfrench, pc, ronniesahlberg,
sprasad, tom, bharathsm, shuah, miklos, hansg
In-Reply-To: <2059025134.378522.1776375762839@kpc.webmail.kpnmail.nl>
On 2026-04-16, Jori Koolstra <jkoolstra@xs4all.nl> wrote:
>
> > On 16-04-2026 17:15 CEST, Aleksa Sarai <cyphar@cyphar.com> wrote:
> >
> >
> > Oh, I didn't notice that this wasn't mentioned here, we had a separate
> > discussion about it in a thread with Jori and I must've assumed we
> > discussed it in both. (My brain is also really not wired up to read
> > large octal values easily.)
> >
> > While it is hard to add new O_* flags (hence OPENAT2_REGULAR), it's not
> > /impossible/ (Jori has a patch for OPENAT2_EMPTY_PATH that is safe to
> > add to O_* flags because of some fun historical coincidences).
>
> But it would change userspace, at least in theory, right? If anyone for
> some reason decided to set whatever the bit will be for O_EMPTYPATH
> in a call to openat(), and pass an empty string, relying on this to fail,
> that will no longer be the case. But that is just really silly. Or are you
> hinting at something else?
Yes, such a program would break, but it is a fairly safe bet that no
such program actually exists in the wild. There is a limit to "never
break userspace" -- it actually needs to break a real userspace program
for it to matter.
Even then there are limits -- in theory someone could write a program
that would error out if any new flag is added to any syscall that
returns -EINVAL for invalid flags (in fact, we have selftests for
openat2(2) that would break because we test the error path) but it
wouldn't make sense to not add features to any syscall because such a
program could theoretically exist.
We change uAPI all the time, the trick is doing it so that userspace
doesn't notice.
For O_EMPTYPATH the logic is that programs that pass regular paths would
work the same way as they do now (i.e., LOOKUP_EMPTY semantics) and
programs that used to pass "" would previously get ENOENT -- it seems
quite unlikely anyone would depend on this for anything (they could
check if the string was empty themselves, after all) and it seems
astronomically unlikely that they would pass garbage *and* depend
on it for anything.
(It is a little funky that open("", O_EMPTYPATH) would give you an fd to
"." but that makes more sense than the alternatives so let's just keep
it consistent.)
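The current behaviour referred to above can be observed with a small sketch: an empty path is rejected with ENOENT today. O_EMPTYPATH is deliberately not used here because it is only a proposal in this thread, and open_errno() is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Try to open `path` read-only. Returns 0 on success (the fd is closed
 * again) or the errno value reported by open(2).
 */
int open_errno(const char *path)
{
	int fd;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;
	close(fd);
	return 0;
}
```

Calling open_errno("") is expected to return ENOENT, while open_errno(".") returns 0. A program would have to rely on the first result, and also set the future O_EMPTYPATH bit by accident, to notice the change.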
--
Aleksa Sarai
https://www.cyphar.com/
* [PATCH] [PATCH] PM: docs: Add comprehensive wakeup_count documentation
From: chenheyun @ 2026-04-19 7:23 UTC
To: rafael, pavel; +Cc: linux-pm, linux-api, linux-kernel, chenheyun
The current Documentation/power/wakeup-count.rst is empty and lacks a
description of the race-free suspend mechanism, the sysfs ABI semantics,
the blocking behavior, and standard userspace usage.
Add complete documentation for /sys/power/wakeup_count, including
overview, interface semantics, usage example, and related interfaces.
Also update Documentation/power/index.rst to include the new document.
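The sequence this patch documents (read the counter, prepare, write the same value back, and only then suspend) could look roughly like this from C. This is a minimal sketch, not code from the patch: wakeup_count_read() and wakeup_count_write() are hypothetical helper names, and the path is a parameter so the helpers can be tried against a regular file.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Read the counter as text into buf; returns 0 or a negative errno. */
int wakeup_count_read(const char *path, char *buf, size_t len)
{
	ssize_t n;
	int fd;

	assert(path != NULL && buf != NULL && len > 1);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;
	/* On the real sysfs file this may block while events are in flight. */
	n = read(fd, buf, len - 1);
	if (n < 0) {
		int err = errno;

		close(fd);
		return -err;
	}
	close(fd);
	buf[n] = '\0';
	return 0;
}

/* Write the saved value back; returns 0 or a negative errno. */
int wakeup_count_write(const char *path, const char *value)
{
	size_t len;
	ssize_t n;
	int fd, err = 0;

	assert(path != NULL && value != NULL);
	len = strlen(value);
	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;
	n = write(fd, value, len);
	if (n < 0)
		err = -errno;	/* on sysfs: a wakeup event intervened */
	else if ((size_t)n != len)
		err = -EIO;
	close(fd);
	return err;
}
```

A caller with sufficient privileges would pass "/sys/power/wakeup_count" as the path, run its preparation between the two calls, and write "mem" to /sys/power/state only if wakeup_count_write() returns 0.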
Signed-off-by: chenheyun <chen_heyun@163.com>
---
Documentation/power/index.rst | 1 +
Documentation/power/wakeup-count.rst | 63 ++++++++++++++++++++++++++++
2 files changed, 64 insertions(+)
create mode 100644 Documentation/power/wakeup-count.rst
diff --git a/Documentation/power/index.rst b/Documentation/power/index.rst
index b4581e4ae785..901268049d7c 100644
--- a/Documentation/power/index.rst
+++ b/Documentation/power/index.rst
@@ -27,6 +27,7 @@ Power Management
swsusp
video
tricks
+ wakeup-count
userland-swsusp
diff --git a/Documentation/power/wakeup-count.rst b/Documentation/power/wakeup-count.rst
new file mode 100644
index 000000000000..5f3a1ca654ce
--- /dev/null
+++ b/Documentation/power/wakeup-count.rst
@@ -0,0 +1,63 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. Copyright (C) 2025 The Linux Foundation
+
+The wakeup_count mechanism for race-free suspend
+================================================
+
+Overview
+--------
+
+The ``/sys/power/wakeup_count`` sysfs interface provides a stable userspace
+mechanism to perform race-free system suspend transitions. It eliminates the
+well-known race condition between suspend permission check and actual system
+suspend entry.
+
+Userspace may use it in a standard three-step sequence:
+
+1. Read the current global wakeup event counter. The read operation blocks
+ until all ongoing wakeup event processing is finished, returning a stable value.
+2. Perform necessary suspend preparation steps in userspace.
+3. Write the previously-read counter value back to the interface.
+ The write operation will only succeed if no new wakeup events have occurred
+ since the read.
+
+Only after a successful write may userspace safely trigger system suspend.
+
+Interface semantics
+-------------------
+
+``/sys/power/wakeup_count``
+
+**Read**
+ Returns the global monotonically-increasing wakeup event counter.
+ This call blocks until there are no wakeup events under active processing
+ inside the kernel. If interrupted by a signal, it returns -EINTR.
+
+**Write**
+ Accepts the counter value obtained from a prior read.
+ The write succeeds only if the kernel's current counter exactly matches
+ the written value. Mismatch indicates new wakeup events arrived during
+ userspace preparation, and suspend must be aborted.
+
+Standard userspace usage example
+--------------------------------
+
+.. code-block:: shell
+
+ count=$(cat /sys/power/wakeup_count)
+ do_suspend_preparation
+ echo "$count" > /sys/power/wakeup_count && echo mem > /sys/power/state
+
+Blocking behavior
+-----------------
+
+The blocking read ensures that userspace never observes an inconsistent state
+where wakeup events are still being handled within the kernel. This stability
+is the core guarantee of the interface.
+
+Related kernel interfaces
+-------------------------
+
+- ``/sys/power/state``: System suspend state control interface.
+- ``/sys/kernel/debug/wakeup_sources``: Per-device wakeup source statistics.
+- ``Documentation/power/wakeup-events.rst``: General wakeup event framework.
\ No newline at end of file
--
2.25.1
* Re: [PATCH v6 0/4] OPENAT2_REGULAR flag support for openat2
From: Christian Brauner @ 2026-04-20 13:20 UTC
To: Dorjoy Chowdhury
Cc: linux-fsdevel, linux-kernel, linux-api, ceph-devel, gfs2,
linux-nfs, linux-cifs, v9fs, linux-kselftest, viro, jack, jlayton,
chuck.lever, alex.aring, arnd, adilger, mjguzik, smfrench,
richard.henderson, mattst88, linmag7, tsbogend, James.Bottomley,
deller, davem, andreas, idryomov, amarkuze, slava, agruenba,
trondmy, anna, sfrench, pc, ronniesahlberg, sprasad, tom,
bharathsm, shuah, miklos, hansg
In-Reply-To: <CAFfO_h5mORm0OuK-d4thzBWWySmyvLSVeVa7phZc4Df-8D=1Cg@mail.gmail.com>
On Thu, Apr 16, 2026 at 09:22:03PM +0600, Dorjoy Chowdhury wrote:
> On Thu, Apr 16, 2026 at 7:07 PM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Sat, 28 Mar 2026 23:22:21 +0600, Dorjoy Chowdhury wrote:
> > > I came upon this "Ability to only open regular files" uapi feature suggestion
> > > from https://uapi-group.org/kernel-features/#ability-to-only-open-regular-files
> > > and thought it would be something I could do as a first patch and get to
> > > know the kernel code a bit better.
> > >
> > > The following filesystems have been tested by building and booting the kernel
> > > x86 bzImage in a Fedora 43 VM in QEMU. I have tested with OPENAT2_REGULAR that
> > > regular files can be successfully opened and non-regular files (directory, fifo etc)
> > > return -EFTYPE.
> > > - btrfs
> > > - NFS (loopback)
> > > - SMB (loopback)
> > >
> > > [...]
> >
> > - I've added an explanation why OPENAT2_REGULAR is only needed for some
> > ->atomic_open() implementers but not others. What I don't like is that
> > we need all that custom handling in there but it's manageable.
> >
> > - I dropped the topmost style conversions. They really don't belong
> > there and if we switch to something better we should use (1 << <nr>).
> >
> > - I split the EFTYPE errno introduction into a separate patch.
> >
> > ---
>
> Thanks for fixing up and picking this one up!
>
> >
> > Applied to the vfs-7.2.openat.regular branch of the vfs/vfs.git tree.
> > Patches in the vfs-7.2.openat.regular branch should appear in linux-next soon.
> >
>
> I don't see a vfs-7.2.openat.regular branch in vfs/vfs.git tree in
> git.kernel.org. Maybe this hasn't been pushed yet?
Nothing will get pushed prior to -rc1 which is due this Sunday.
* [PATCH bpf-next v12 0/8] bpf: Extend BPF syscall with common attributes support
From: Leon Hwang @ 2026-04-20 14:17 UTC
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Shuah Khan, Christian Brauner, Seth Forshee, Yuichiro Tsuji,
Andrey Albershteyn, Leon Hwang, Willem de Bruijn, Jason Xing,
Tao Chen, Mykyta Yatsenko, Kumar Kartikeya Dwivedi,
Anton Protopopov, Amery Hung, Rong Tao, linux-kernel, linux-api,
linux-kselftest, kernel-patches-bot
This patch series builds upon the discussion in
"[PATCH bpf-next v4 0/4] bpf: Improve error reporting for freplace attachment failure" [1].
This patch series introduces support for *common attributes* in the BPF
syscall, providing a unified mechanism for passing shared metadata across
all BPF commands, initially used by BPF_PROG_LOAD, BPF_BTF_LOAD, and
BPF_MAP_CREATE.
The initial set of common attributes includes:
1. 'log_buf': User-provided buffer for storing log output.
2. 'log_size': Size of the provided log buffer.
3. 'log_level': Verbosity level for logging.
4. 'log_true_size': Actual log size reported by kernel.
With this extension, the BPF syscall will be able to return meaningful
error messages (e.g., map creation failures), improving debuggability
and user experience.
Links:
[1] https://lore.kernel.org/bpf/20250224153352.64689-1-leon.hwang@linux.dev/
Changes:
v11 -> v12:
* Drop "log_" prefix in struct bpf_log_attr in patch #3.
* Drop "log_" prefix in struct bpf_log_opts in patch #7.
* Copy log_true_size using copy_to_bpfptr_offset() in patch #3 (per Alexei).
* v11: https://lore.kernel.org/bpf/20260216150445.68278-1-leon.hwang@linux.dev/
v10 -> v11:
* Collect Acked-by from Andrii, thanks.
* Validate whether log_buf, log_size, and log_level are valid by reusing
bpf_verifier_log_attr_valid() in patch #4 (per Andrii).
* v10: https://lore.kernel.org/bpf/20260211151115.78013-1-leon.hwang@linux.dev/
v9 -> v10:
* Collect Acked-by from Andrii, thanks.
* Address comments from Andrii:
* Drop log NULL check in bpf_log_attr_finalize().
* Return -EFAULT early in bpf_log_attr_finalize().
* Validate whether log_buf, log_size, and log_level are set.
* Keep log_buf, log_size, log_level, and user-pointer log_true_size in struct
bpf_log_attr.
* Make prog_load and btf_load work with the new struct bpf_log_attr.
* Add comment to log_true_size of struct bpf_log_opts in libbpf.
* Address comment from Alexei:
* Avoid using BPF_LOG_FIXED as log_level in tests.
* v9: https://lore.kernel.org/bpf/20260202144046.30651-1-leon.hwang@linux.dev/
v8 -> v9:
* Rework reporting 'log_true_size' for prog_load, btf_load, and map_create to
simplify struct bpf_log_attr (per Alexei).
* v8: https://lore.kernel.org/bpf/20260126151409.52072-1-leon.hwang@linux.dev/
v7 -> v8:
* Return 0 when fd < 0 and errno != EFAULT in probe_sys_bpf_ext(), then simplify
probe_bpf_syscall_common_attrs() (per Alexei and Andrii).
* v7: https://lore.kernel.org/bpf/20260123032445.125259-1-leon.hwang@linux.dev/
v6 -> v7:
* Return -errno when fd < 0 and errno != EFAULT in probe_sys_bpf_ext().
* Convert return value of probe_sys_bpf_ext() to bool in
probe_bpf_syscall_common_attrs().
* Address comments from Andrii:
* Drop the comment, and handle fd >= 0 case explicitly in
probe_sys_bpf_ext().
* Return an error when fd >= 0 in probe_sys_bpf_ext().
* v6: https://lore.kernel.org/bpf/20260120152424.40766-1-leon.hwang@linux.dev/
v5 -> v6:
* Address comments from Andrii:
* Rename some variables.
* Drop unnecessary 'close(fd)' in libbpf.
* Rename FEAT_EXTENDED_SYSCALL to FEAT_BPF_SYSCALL_COMMON_ATTRS with
updated description in libbpf.
* Use EINVAL instead of EUSERS, as EUSERS is not used in bpf yet.
* Rename struct bpf_syscall_common_attr_opts to bpf_log_opts in libbpf.
* Add 'OPTS_SET(log_opts, log_true_size, 0);' in libbpf's 'bpf_map_create()'.
* v5: https://lore.kernel.org/bpf/20260112145616.44195-1-leon.hwang@linux.dev/
v4 -> v5:
* Rework reporting 'log_true_size' for prog_load, btf_load, and map_create
(per Alexei).
* v4: https://lore.kernel.org/bpf/20260106172018.57757-1-leon.hwang@linux.dev/
RFC v3 -> v4:
* Drop RFC.
* Address comments from Andrii:
* Add parentheses in 'sys_bpf_ext()'.
* Avoid creating new fd in 'probe_sys_bpf_ext()'.
* Add a new struct to wrap log fields in libbpf.
* Address comments from Alexei:
* Do not skip writing to user space when log_true_size is zero.
* Do not use 'bool' arguments.
* Drop the added WARN_ON_ONCE()s.
* v3: https://lore.kernel.org/bpf/20251002154841.99348-1-leon.hwang@linux.dev/
RFC v2 -> RFC v3:
* Rename probe_sys_bpf_extended to probe_sys_bpf_ext.
* Refactor reporting 'log_true_size' for prog_load.
* Refactor reporting 'btf_log_true_size' for btf_load.
* Add warnings for internal bugs in map_create.
* Check log_true_size in test cases.
* Address comment from Alexei:
* Change kvzalloc/kvfree to kzalloc/kfree.
* Address comments from Andrii:
* Move BPF_COMMON_ATTRS to 'enum bpf_cmd' alongside brief comment.
* Add bpf_check_uarg_tail_zero() for extra checks.
* Rename sys_bpf_extended to sys_bpf_ext.
* Rename sys_bpf_fd_extended to sys_bpf_ext_fd.
* Probe the new feature using NULL and -EFAULT.
* Move probe_sys_bpf_ext to libbpf_internal.h and drop LIBBPF_API.
* Return -EUSERS when log attrs conflict between bpf_attr and
bpf_common_attr.
* Avoid touching bpf_vlog_init().
* Update the reason messages in map_create.
* Finalize the log using __cleanup().
* Report log size to users.
* Change type of log_buf from '__u64' to 'const char *' and cast type
using ptr_to_u64() in bpf_map_create().
* Do not return -EOPNOTSUPP when kernel doesn't support this feature
in bpf_map_create().
* Add log_level support for map creation for consistency.
* Address comment from Eduard:
* Use common_attrs->log_level instead of BPF_LOG_FIXED.
* v2: https://lore.kernel.org/bpf/20250911163328.93490-1-leon.hwang@linux.dev/
RFC v1 -> RFC v2:
* Fix build error reported by test bot.
* Address comments from Alexei:
* Drop new uapi for freplace.
* Add common attributes support for prog_load and btf_load.
* Add common attributes support for map_create.
* v1: https://lore.kernel.org/bpf/20250728142346.95681-1-leon.hwang@linux.dev/
Leon Hwang (8):
bpf: Extend BPF syscall with common attributes support
libbpf: Add support for extended BPF syscall
bpf: Refactor reporting log_true_size for prog_load
bpf: Add syscall common attributes support for prog_load
bpf: Add syscall common attributes support for btf_load
bpf: Add syscall common attributes support for map_create
libbpf: Add syscall common attributes support for map_create
selftests/bpf: Add tests to verify map create failure log
include/linux/bpf.h | 4 +-
include/linux/bpf_verifier.h | 16 ++
include/linux/btf.h | 3 +-
include/linux/syscalls.h | 3 +-
include/uapi/linux/bpf.h | 8 +
kernel/bpf/btf.c | 30 +---
kernel/bpf/log.c | 91 +++++++++-
kernel/bpf/syscall.c | 114 +++++++++---
kernel/bpf/verifier.c | 17 +-
tools/include/uapi/linux/bpf.h | 8 +
tools/lib/bpf/bpf.c | 52 +++++-
tools/lib/bpf/bpf.h | 17 +-
tools/lib/bpf/features.c | 8 +
tools/lib/bpf/libbpf_internal.h | 3 +
.../selftests/bpf/prog_tests/map_init.c | 166 ++++++++++++++++++
15 files changed, 473 insertions(+), 67 deletions(-)
--
2.53.0
* [PATCH bpf-next v12 1/8] bpf: Extend BPF syscall with common attributes support
From: Leon Hwang @ 2026-04-20 14:17 UTC
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Shuah Khan, Christian Brauner, Seth Forshee, Yuichiro Tsuji,
Andrey Albershteyn, Leon Hwang, Willem de Bruijn, Jason Xing,
Tao Chen, Mykyta Yatsenko, Kumar Kartikeya Dwivedi,
Anton Protopopov, Amery Hung, Rong Tao, linux-kernel, linux-api,
linux-kselftest, kernel-patches-bot
In-Reply-To: <20260420141804.27179-1-leon.hwang@linux.dev>
Add generic BPF syscall support for passing common attributes.
The initial set of common attributes includes:
1. 'log_buf': User-provided buffer for storing logs.
2. 'log_size': Size of the log buffer.
3. 'log_level': Log verbosity level.
4. 'log_true_size': Actual log size reported by kernel.
The common-attribute pointer and its size are passed as the 4th and 5th
syscall arguments. A new command bit, 'BPF_COMMON_ATTRS' ('1 << 16'),
indicates that common attributes are supplied.
This commit adds syscall and uapi plumbing. Command-specific handling is
added in follow-up patches.
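From userspace, the calling convention described above might be exercised as in the following sketch. It assumes the uapi additions from this patch; because released headers do not carry them, the struct and the command bit are mirrored locally under hypothetical sketch_ names, and the two helpers are illustrative rather than libbpf API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Local mirror of BPF_COMMON_ATTRS from this patch (assumption). */
#define SKETCH_BPF_COMMON_ATTRS (1u << 16)

/* Local mirror of struct bpf_common_attr from this patch (assumption). */
struct sketch_bpf_common_attr {
	uint64_t log_buf;
	uint32_t log_size;
	uint32_t log_level;
	uint32_t log_true_size;
};

/* Point the common attributes at a caller-provided log buffer. */
void sketch_common_attr_init(struct sketch_bpf_common_attr *ca,
			     char *log, uint32_t log_size, uint32_t level)
{
	assert(ca != NULL);
	memset(ca, 0, sizeof(*ca));
	ca->log_buf = (uint64_t)(uintptr_t)log;
	ca->log_size = log_size;
	ca->log_level = level;
}

/*
 * Issue bpf(2) with the common attributes as the 4th and 5th arguments
 * and the extra command bit set. Returns the syscall result, or -1 with
 * errno set.
 */
long sketch_sys_bpf_ext(int cmd, void *attr, unsigned int size,
			struct sketch_bpf_common_attr *ca)
{
	assert(ca != NULL);
	return syscall(SYS_bpf, cmd | SKETCH_BPF_COMMON_ATTRS, attr, size,
		       ca, (unsigned int)sizeof(*ca));
}
```

On a kernel without this series the extra command bit is not understood, so such a call is expected to fail rather than fill in the log.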
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
include/linux/syscalls.h | 3 ++-
include/uapi/linux/bpf.h | 8 ++++++++
kernel/bpf/syscall.c | 25 +++++++++++++++++++++----
tools/include/uapi/linux/bpf.h | 8 ++++++++
4 files changed, 39 insertions(+), 5 deletions(-)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 02bd6ddb6278..f4848c64872f 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -936,7 +936,8 @@ asmlinkage long sys_seccomp(unsigned int op, unsigned int flags,
asmlinkage long sys_getrandom(char __user *buf, size_t count,
unsigned int flags);
asmlinkage long sys_memfd_create(const char __user *uname_ptr, unsigned int flags);
-asmlinkage long sys_bpf(int cmd, union bpf_attr __user *attr, unsigned int size);
+asmlinkage long sys_bpf(int cmd, union bpf_attr __user *attr, unsigned int size,
+ struct bpf_common_attr __user *attr_common, unsigned int size_common);
asmlinkage long sys_execveat(int dfd, const char __user *filename,
const char __user *const __user *argv,
const char __user *const __user *envp, int flags);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 552bc5d9afbd..49eeb18ad050 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -994,6 +994,7 @@ enum bpf_cmd {
BPF_PROG_STREAM_READ_BY_FD,
BPF_PROG_ASSOC_STRUCT_OPS,
__MAX_BPF_CMD,
+ BPF_COMMON_ATTRS = 1 << 16, /* Indicate carrying syscall common attrs. */
};
enum bpf_map_type {
@@ -1500,6 +1501,13 @@ struct bpf_stack_build_id {
};
};
+struct bpf_common_attr {
+ __u64 log_buf;
+ __u32 log_size;
+ __u32 log_level;
+ __u32 log_true_size;
+};
+
#define BPF_OBJ_NAME_LEN 16U
enum {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b73b25c63073..5bf84ea21bf5 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -6210,8 +6210,10 @@ static int prog_assoc_struct_ops(union bpf_attr *attr)
return ret;
}
-static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size)
+static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size,
+ bpfptr_t uattr_common, unsigned int size_common)
{
+ struct bpf_common_attr attr_common;
union bpf_attr attr;
int err;
@@ -6225,6 +6227,20 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size)
if (copy_from_bpfptr(&attr, uattr, size) != 0)
return -EFAULT;
+ memset(&attr_common, 0, sizeof(attr_common));
+ if (cmd & BPF_COMMON_ATTRS) {
+ err = bpf_check_uarg_tail_zero(uattr_common, sizeof(attr_common), size_common);
+ if (err)
+ return err;
+
+ cmd &= ~BPF_COMMON_ATTRS;
+ size_common = min_t(u32, size_common, sizeof(attr_common));
+ if (copy_from_bpfptr(&attr_common, uattr_common, size_common) != 0)
+ return -EFAULT;
+ } else {
+ size_common = 0;
+ }
+
err = security_bpf(cmd, &attr, size, uattr.is_kernel);
if (err < 0)
return err;
@@ -6360,9 +6376,10 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size)
return err;
}
-SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
+SYSCALL_DEFINE5(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size,
+ struct bpf_common_attr __user *, uattr_common, unsigned int, size_common)
{
- return __sys_bpf(cmd, USER_BPFPTR(uattr), size);
+ return __sys_bpf(cmd, USER_BPFPTR(uattr), size, USER_BPFPTR(uattr_common), size_common);
}
static bool syscall_prog_is_valid_access(int off, int size,
@@ -6392,7 +6409,7 @@ BPF_CALL_3(bpf_sys_bpf, int, cmd, union bpf_attr *, attr, u32, attr_size)
default:
return -EINVAL;
}
- return __sys_bpf(cmd, KERNEL_BPFPTR(attr), attr_size);
+ return __sys_bpf(cmd, KERNEL_BPFPTR(attr), attr_size, KERNEL_BPFPTR(NULL), 0);
}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 677be9a47347..16ff0968fc21 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -994,6 +994,7 @@ enum bpf_cmd {
BPF_PROG_STREAM_READ_BY_FD,
BPF_PROG_ASSOC_STRUCT_OPS,
__MAX_BPF_CMD,
+ BPF_COMMON_ATTRS = 1 << 16, /* Indicate carrying syscall common attrs. */
};
enum bpf_map_type {
@@ -1500,6 +1501,13 @@ struct bpf_stack_build_id {
};
};
+struct bpf_common_attr {
+ __u64 log_buf;
+ __u32 log_size;
+ __u32 log_level;
+ __u32 log_true_size;
+};
+
#define BPF_OBJ_NAME_LEN 16U
enum {
--
2.53.0
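Note for readers: the common-attribute copy in __sys_bpf() above follows the usual extensible-struct pattern. Bytes beyond what the kernel knows must be zero, and only the known prefix is copied. Below is a minimal userspace sketch of that pattern. All demo_* names are hypothetical, the struct layout is copied from struct bpf_common_attr in this patch, and returning -E2BIG for a non-zero tail is an assumption about bpf_check_uarg_tail_zero(), which this diff does not show.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in with the same layout as struct bpf_common_attr in the patch. */
struct demo_common_attr {
	uint64_t log_buf;
	uint32_t log_size;
	uint32_t log_level;
	uint32_t log_true_size;
};

/*
 * Hypothetical model of the check-then-copy step:
 *  - a caller struct larger than ours is accepted only if the extra
 *    bytes are all zero (assumed error code: -E2BIG);
 *  - a smaller caller struct is accepted, and the fields it does not
 *    cover stay zero.
 */
int demo_copy_common(struct demo_common_attr *dst,
		     const unsigned char *src, size_t src_size)
{
	size_t known = sizeof(*dst);
	size_t i;

	for (i = known; i < src_size; i++)
		if (src[i] != 0)
			return -E2BIG;

	memset(dst, 0, sizeof(*dst));
	memcpy(dst, src, src_size < known ? src_size : known);
	return 0;
}
```

With this shape, an older binary passing a shorter struct and a newer binary passing a longer, zero-padded struct are both accepted by the same kernel.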
* [PATCH bpf-next v12 2/8] libbpf: Add support for extended BPF syscall
From: Leon Hwang @ 2026-04-20 14:17 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Shuah Khan, Christian Brauner, Seth Forshee, Yuichiro Tsuji,
Andrey Albershteyn, Leon Hwang, Willem de Bruijn, Jason Xing,
Tao Chen, Mykyta Yatsenko, Kumar Kartikeya Dwivedi,
Anton Protopopov, Amery Hung, Rong Tao, linux-kernel, linux-api,
linux-kselftest, kernel-patches-bot
In-Reply-To: <20260420141804.27179-1-leon.hwang@linux.dev>
To support the extended BPF syscall introduced in the previous commit,
introduce the following internal APIs:
* 'sys_bpf_ext()'
* 'sys_bpf_ext_fd()'
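Both wrap the raw 'syscall()' interface and pass the common attributes
through to the kernel.
* 'probe_sys_bpf_ext()'
Checks whether the running kernel supports the BPF syscall common
attributes. It issues BPF_PROG_LOAD with BPF_COMMON_ATTRS set and a NULL
common attribute pointer, and treats an EFAULT failure as proof of support.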
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
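Note for readers: the command encoding and the probe verdict described above can be sketched without touching the real syscall. The demo_* names are hypothetical, and DEMO_BPF_COMMON_ATTRS mirrors the 1 << 16 value from patch 1, since the flag is not in released uapi headers. Unlike the real probe, the verdict helper does not close the file descriptor.

```c
#include <errno.h>
#include <stddef.h>

/* Value taken from patch 1; defined locally for this sketch only. */
#define DEMO_BPF_COMMON_ATTRS (1u << 16)

/* Userspace side: set the flag only when a common-attr pointer is given. */
unsigned int demo_encode_cmd(unsigned int cmd, const void *attr_common)
{
	return attr_common ? (cmd | DEMO_BPF_COMMON_ATTRS)
			   : (cmd & ~DEMO_BPF_COMMON_ATTRS);
}

/* Kernel side: report whether the flag was set, then strip it. */
unsigned int demo_decode_cmd(unsigned int cmd, int *has_common)
{
	*has_common = (cmd & DEMO_BPF_COMMON_ATTRS) != 0;
	return cmd & ~DEMO_BPF_COMMON_ATTRS;
}

/*
 * Probe verdict, mirroring the tail of probe_sys_bpf_ext():
 *  - an unexpected success is reported as -EINVAL;
 *  - EFAULT means the kernel tried to read the NULL common attrs,
 *    so the feature is present (1);
 *  - any other errno means it is absent (0).
 */
int demo_probe_verdict(int fd, int err)
{
	if (fd >= 0)
		return -EINVAL;
	return err == EFAULT ? 1 : 0;
}
```

Encoding the flag in the command word keeps the three-argument call unchanged for existing callers, which never set bit 16.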
tools/lib/bpf/bpf.c | 36 +++++++++++++++++++++++++++++++++
tools/lib/bpf/features.c | 8 ++++++++
tools/lib/bpf/libbpf_internal.h | 3 +++
3 files changed, 47 insertions(+)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 5846de364209..9d8740761b7a 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -69,6 +69,42 @@ static inline __u64 ptr_to_u64(const void *ptr)
return (__u64) (unsigned long) ptr;
}
+static inline int sys_bpf_ext(enum bpf_cmd cmd, union bpf_attr *attr,
+ unsigned int size,
+ struct bpf_common_attr *attr_common,
+ unsigned int size_common)
+{
+ cmd = attr_common ? (cmd | BPF_COMMON_ATTRS) : (cmd & ~BPF_COMMON_ATTRS);
+ return syscall(__NR_bpf, cmd, attr, size, attr_common, size_common);
+}
+
+static inline int sys_bpf_ext_fd(enum bpf_cmd cmd, union bpf_attr *attr,
+ unsigned int size,
+ struct bpf_common_attr *attr_common,
+ unsigned int size_common)
+{
+ int fd;
+
+ fd = sys_bpf_ext(cmd, attr, size, attr_common, size_common);
+ return ensure_good_fd(fd);
+}
+
+int probe_sys_bpf_ext(void)
+{
+ const size_t attr_sz = offsetofend(union bpf_attr, prog_token_fd);
+ union bpf_attr attr;
+ int fd;
+
+ memset(&attr, 0, attr_sz);
+ fd = syscall(__NR_bpf, BPF_PROG_LOAD | BPF_COMMON_ATTRS, &attr, attr_sz, NULL,
+ sizeof(struct bpf_common_attr));
+ if (fd >= 0) {
+ close(fd);
+ return -EINVAL;
+ }
+ return errno == EFAULT ? 1 : 0;
+}
+
static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
unsigned int size)
{
diff --git a/tools/lib/bpf/features.c b/tools/lib/bpf/features.c
index 4f19a0d79b0c..b7e388f99d0b 100644
--- a/tools/lib/bpf/features.c
+++ b/tools/lib/bpf/features.c
@@ -615,6 +615,11 @@ static int probe_kern_btf_layout(int token_fd)
(char *)layout, token_fd));
}
+static int probe_bpf_syscall_common_attrs(int token_fd)
+{
+ return probe_sys_bpf_ext();
+}
+
typedef int (*feature_probe_fn)(int /* token_fd */);
static struct kern_feature_cache feature_cache;
@@ -699,6 +704,9 @@ static struct kern_feature_desc {
[FEAT_BTF_LAYOUT] = {
"kernel supports BTF layout", probe_kern_btf_layout,
},
+ [FEAT_BPF_SYSCALL_COMMON_ATTRS] = {
+ "BPF syscall common attributes support", probe_bpf_syscall_common_attrs,
+ },
};
bool feat_supported(struct kern_feature_cache *cache, enum kern_feature_id feat_id)
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index cabdaef79098..21ccdf4d9284 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -398,6 +398,8 @@ enum kern_feature_id {
FEAT_UPROBE_SYSCALL,
/* Kernel supports BTF layout information */
FEAT_BTF_LAYOUT,
+ /* Kernel supports BPF syscall common attributes */
+ FEAT_BPF_SYSCALL_COMMON_ATTRS,
__FEAT_CNT,
};
@@ -767,4 +769,5 @@ int probe_fd(int fd);
#define SHA256_DWORD_SIZE SHA256_DIGEST_LENGTH / sizeof(__u64)
void libbpf_sha256(const void *data, size_t len, __u8 out[SHA256_DIGEST_LENGTH]);
+int probe_sys_bpf_ext(void);
#endif /* __LIBBPF_LIBBPF_INTERNAL_H */
--
2.53.0
* [PATCH bpf-next v12 3/8] bpf: Refactor reporting log_true_size for prog_load
From: Leon Hwang @ 2026-04-20 14:17 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Shuah Khan, Christian Brauner, Seth Forshee, Yuichiro Tsuji,
Andrey Albershteyn, Leon Hwang, Willem de Bruijn, Jason Xing,
Tao Chen, Mykyta Yatsenko, Kumar Kartikeya Dwivedi,
Anton Protopopov, Amery Hung, Rong Tao, linux-kernel, linux-api,
linux-kselftest, kernel-patches-bot
In-Reply-To: <20260420141804.27179-1-leon.hwang@linux.dev>
The next commit will add support for reporting logs via extended common
attributes, including 'log_true_size'.
To prepare for that, refactor the 'log_true_size' reporting logic by
introducing a new struct bpf_log_attr to encapsulate log-related behavior:
* bpf_log_attr_init(): initialize log fields, which will support
extended common attributes in the next commit.
* bpf_log_attr_finalize(): handle log finalization and write back
'log_true_size' to userspace.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
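Note for readers: the size check that decides whether 'log_true_size' is written back can be illustrated with a small stand-in struct. This is a sketch under stated assumptions. The demo_* names are hypothetical, and the layout is borrowed from struct bpf_common_attr in patch 1 rather than from union bpf_attr, which is too large to reproduce here. An offset of 0 means "do not write back", matching the check in bpf_log_attr_finalize().

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in struct; layout copied from struct bpf_common_attr (patch 1). */
struct demo_common_attr {
	uint64_t log_buf;
	uint32_t log_size;
	uint32_t log_level;
	uint32_t log_true_size;
};

/* Local equivalent of the kernel's offsetofend() helper. */
#define demo_offsetofend(type, member) \
	(offsetof(type, member) + sizeof(((type *)0)->member))

/*
 * Return the offset to write log_true_size to, or 0 when the caller's
 * struct is too short to contain that field.
 */
size_t demo_true_size_offset(size_t user_size)
{
	if (user_size >= demo_offsetofend(struct demo_common_attr, log_true_size))
		return offsetof(struct demo_common_attr, log_true_size);
	return 0;
}
```

Computing the offset once at init time lets the finalize step stay agnostic about which struct the value is written into.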
include/linux/bpf.h | 4 +++-
include/linux/bpf_verifier.h | 12 ++++++++++++
kernel/bpf/log.c | 29 +++++++++++++++++++++++++++++
kernel/bpf/syscall.c | 12 +++++++++---
kernel/bpf/verifier.c | 17 ++++-------------
5 files changed, 57 insertions(+), 17 deletions(-)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 30d35d5fe40b..76b28153224a 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2912,7 +2912,9 @@ int bpf_check_uarg_tail_zero(bpfptr_t uaddr, size_t expected_size,
size_t actual_size);
/* verify correctness of eBPF program */
-int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size);
+struct bpf_log_attr;
+int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, bpfptr_t uattr,
+ struct bpf_log_attr *attr_log);
#ifndef CONFIG_BPF_JIT_ALWAYS_ON
void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth);
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 36bfd96d4563..1162b5492841 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -647,6 +647,18 @@ static inline bool bpf_verifier_log_needed(const struct bpf_verifier_log *log)
return log && log->level;
}
+struct bpf_log_attr {
+ char __user *ubuf;
+ u32 size;
+ u32 level;
+ u32 offsetof_true_size;
+ bpfptr_t uattr;
+};
+
+int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 log_level,
+ u32 offsetof_log_true_size, bpfptr_t uattr);
+int bpf_log_attr_finalize(struct bpf_log_attr *attr, struct bpf_verifier_log *log);
+
#define BPF_MAX_SUBPROGS 256
struct bpf_subprog_arg_info {
diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index 37d72b052192..9c5c755d5972 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -865,3 +865,32 @@ void print_insn_state(struct bpf_verifier_env *env, const struct bpf_verifier_st
}
print_verifier_state(env, vstate, frameno, false);
}
+
+int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 log_level,
+ u32 offsetof_log_true_size, bpfptr_t uattr)
+{
+ char __user *ubuf = u64_to_user_ptr(log_buf);
+
+ memset(log, 0, sizeof(*log));
+ log->ubuf = ubuf;
+ log->size = log_size;
+ log->level = log_level;
+ log->offsetof_true_size = offsetof_log_true_size;
+ log->uattr = uattr;
+ return 0;
+}
+
+int bpf_log_attr_finalize(struct bpf_log_attr *attr, struct bpf_verifier_log *log)
+{
+ u32 log_true_size;
+ int err;
+
+ err = bpf_vlog_finalize(log, &log_true_size);
+
+ if (attr->offsetof_true_size &&
+ copy_to_bpfptr_offset(attr->uattr, attr->offsetof_true_size, &log_true_size,
+ sizeof(log_true_size)))
+ return -EFAULT;
+
+ return err;
+}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 5bf84ea21bf5..b560dab9bfa8 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2861,7 +2861,7 @@ static int bpf_prog_mark_insn_arrays_ready(struct bpf_prog *prog)
/* last field in 'union bpf_attr' used by this command */
#define BPF_PROG_LOAD_LAST_FIELD keyring_id
-static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
+static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, struct bpf_log_attr *attr_log)
{
enum bpf_prog_type type = attr->prog_type;
struct bpf_prog *prog, *dst_prog = NULL;
@@ -3079,7 +3079,7 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
goto free_prog_sec;
/* run eBPF verifier */
- err = bpf_check(&prog, attr, uattr, uattr_size);
+ err = bpf_check(&prog, attr, uattr, attr_log);
if (err < 0)
goto free_used_maps;
@@ -6214,6 +6214,8 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size,
bpfptr_t uattr_common, unsigned int size_common)
{
struct bpf_common_attr attr_common;
+ u32 offsetof_log_true_size = 0;
+ struct bpf_log_attr attr_log;
union bpf_attr attr;
int err;
@@ -6265,7 +6267,11 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size,
err = map_freeze(&attr);
break;
case BPF_PROG_LOAD:
- err = bpf_prog_load(&attr, uattr, size);
+ if (size >= offsetofend(union bpf_attr, log_true_size))
+ offsetof_log_true_size = offsetof(union bpf_attr, log_true_size);
+ err = bpf_log_attr_init(&attr_log, attr.log_buf, attr.log_size, attr.log_level,
+ offsetof_log_true_size, uattr);
+ err = err ?: bpf_prog_load(&attr, uattr, &attr_log);
break;
case BPF_OBJ_PIN:
err = bpf_obj_pin(&attr);
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 594260c1f382..668a3628013e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -26526,12 +26526,12 @@ static int compute_scc(struct bpf_verifier_env *env)
return err;
}
-int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_size)
+int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr,
+ struct bpf_log_attr *attr_log)
{
u64 start_time = ktime_get_ns();
struct bpf_verifier_env *env;
int i, len, ret = -EINVAL, err;
- u32 log_true_size;
bool is_priv;
BTF_TYPE_EMIT(enum bpf_features);
@@ -26578,9 +26578,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
/* user could have requested verbose verifier output
* and supplied buffer to store the verification trace
*/
- ret = bpf_vlog_init(&env->log, attr->log_level,
- (char __user *) (unsigned long) attr->log_buf,
- attr->log_size);
+ ret = bpf_vlog_init(&env->log, attr_log->level, attr_log->ubuf, attr_log->size);
if (ret)
goto err_unlock;
@@ -26742,17 +26740,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr, __u3
env->prog->aux->verified_insns = env->insn_processed;
/* preserve original error even if log finalization is successful */
- err = bpf_vlog_finalize(&env->log, &log_true_size);
+ err = bpf_log_attr_finalize(attr_log, &env->log);
if (err)
ret = err;
- if (uattr_size >= offsetofend(union bpf_attr, log_true_size) &&
- copy_to_bpfptr_offset(uattr, offsetof(union bpf_attr, log_true_size),
- &log_true_size, sizeof(log_true_size))) {
- ret = -EFAULT;
- goto err_release_maps;
- }
-
if (ret)
goto err_release_maps;
--
2.53.0
* [PATCH bpf-next v12 4/8] bpf: Add syscall common attributes support for prog_load
From: Leon Hwang @ 2026-04-20 14:18 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Shuah Khan, Christian Brauner, Seth Forshee, Yuichiro Tsuji,
Andrey Albershteyn, Leon Hwang, Willem de Bruijn, Jason Xing,
Tao Chen, Mykyta Yatsenko, Kumar Kartikeya Dwivedi,
Anton Protopopov, Amery Hung, Rong Tao, linux-kernel, linux-api,
linux-kselftest, kernel-patches-bot
In-Reply-To: <20260420141804.27179-1-leon.hwang@linux.dev>
BPF_PROG_LOAD can now take log parameters from both union bpf_attr and
struct bpf_common_attr. The merge rules are:
- if both sides provide a complete log tuple (buf/size/level) and they
match, use it;
- if only one side provides log parameters, use that one;
- if both sides provide complete tuples but they differ, return -EINVAL.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
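Note for readers: the merge rules above can be modelled in userspace as follows. This is a sketch, not the kernel code. The demo_* names are hypothetical, buffers are plain integers rather than user pointers, and the BPF_LOG_MASK check from bpf_verifier_log_attr_valid() is left out because the mask value is not part of this diff.

```c
#include <errno.h>
#include <limits.h>
#include <stdint.h>

struct demo_log_tuple {
	uint64_t buf;	/* user buffer address, 0 when unset */
	uint32_t size;
	uint32_t level;
};

/* Sanity rules copied from the diff, minus the log level mask check. */
int demo_tuple_valid(const struct demo_log_tuple *t)
{
	if (!!t->buf != !!t->size)	/* buf and size come together */
		return 0;
	if (t->buf && t->level == 0)	/* a buffer without a level is useless */
		return 0;
	if (t->size > UINT_MAX >> 2)
		return 0;
	return 1;
}

/*
 * Pick the tuple to use:
 *  - both set and different       -> -EINVAL
 *  - only the common side is set  -> common
 *  - otherwise                    -> attr (both equal, or only attr set)
 */
int demo_merge_log(const struct demo_log_tuple *attr,
		   const struct demo_log_tuple *common,
		   struct demo_log_tuple *out)
{
	if (!demo_tuple_valid(common) || !demo_tuple_valid(attr))
		return -EINVAL;

	if (attr->buf && common->buf &&
	    (attr->buf != common->buf || attr->size != common->size ||
	     attr->level != common->level))
		return -EINVAL;

	*out = (!attr->buf && common->buf) ? *common : *attr;
	return 0;
}
```

Note that "provides log parameters" is decided by the buffer pointer alone, as in the patch, so a side that sets only a level without a buffer does not take part in the conflict check.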
include/linux/bpf_verifier.h | 3 ++-
kernel/bpf/log.c | 34 +++++++++++++++++++++++++++-------
kernel/bpf/syscall.c | 3 ++-
3 files changed, 31 insertions(+), 9 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 1162b5492841..9045dc88c5b6 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -656,7 +656,8 @@ struct bpf_log_attr {
};
int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 log_level,
- u32 offsetof_log_true_size, bpfptr_t uattr);
+ u32 offsetof_log_true_size, bpfptr_t uattr, struct bpf_common_attr *common,
+ bpfptr_t uattr_common, u32 size_common);
int bpf_log_attr_finalize(struct bpf_log_attr *attr, struct bpf_verifier_log *log);
#define BPF_MAX_SUBPROGS 256
diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index 9c5c755d5972..dc8734e27752 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -13,17 +13,17 @@
#define verbose(env, fmt, args...) bpf_verifier_log_write(env, fmt, ##args)
-static bool bpf_verifier_log_attr_valid(const struct bpf_verifier_log *log)
+static bool bpf_verifier_log_attr_valid(u32 log_level, char __user *log_buf, u32 log_size)
{
/* ubuf and len_total should both be specified (or not) together */
- if (!!log->ubuf != !!log->len_total)
+ if (!!log_buf != !!log_size)
return false;
/* log buf without log_level is meaningless */
- if (log->ubuf && log->level == 0)
+ if (log_buf && log_level == 0)
return false;
- if (log->level & ~BPF_LOG_MASK)
+ if (log_level & ~BPF_LOG_MASK)
return false;
- if (log->len_total > UINT_MAX >> 2)
+ if (log_size > UINT_MAX >> 2)
return false;
return true;
}
@@ -36,7 +36,7 @@ int bpf_vlog_init(struct bpf_verifier_log *log, u32 log_level,
log->len_total = log_size;
/* log attributes have to be sane */
- if (!bpf_verifier_log_attr_valid(log))
+ if (!bpf_verifier_log_attr_valid(log_level, log_buf, log_size))
return -EINVAL;
return 0;
@@ -867,16 +867,36 @@ void print_insn_state(struct bpf_verifier_env *env, const struct bpf_verifier_st
}
int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 log_level,
- u32 offsetof_log_true_size, bpfptr_t uattr)
+ u32 offsetof_log_true_size, bpfptr_t uattr, struct bpf_common_attr *common,
+ bpfptr_t uattr_common, u32 size_common)
{
+ char __user *ubuf_common = u64_to_user_ptr(common->log_buf);
char __user *ubuf = u64_to_user_ptr(log_buf);
+ if (!bpf_verifier_log_attr_valid(common->log_level, ubuf_common, common->log_size) ||
+ !bpf_verifier_log_attr_valid(log_level, ubuf, log_size))
+ return -EINVAL;
+
+ if (ubuf && ubuf_common && (ubuf != ubuf_common || log_size != common->log_size ||
+ log_level != common->log_level))
+ return -EINVAL;
+
memset(log, 0, sizeof(*log));
log->ubuf = ubuf;
log->size = log_size;
log->level = log_level;
log->offsetof_true_size = offsetof_log_true_size;
log->uattr = uattr;
+
+ if (!ubuf && ubuf_common) {
+ log->ubuf = ubuf_common;
+ log->size = common->log_size;
+ log->level = common->log_level;
+ log->uattr = uattr_common;
+ log->offsetof_true_size = 0;
+ if (size_common >= offsetofend(struct bpf_common_attr, log_true_size))
+ log->offsetof_true_size = offsetof(struct bpf_common_attr, log_true_size);
+ }
return 0;
}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b560dab9bfa8..1eb15a51c7dc 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -6270,7 +6270,8 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size,
if (size >= offsetofend(union bpf_attr, log_true_size))
offsetof_log_true_size = offsetof(union bpf_attr, log_true_size);
err = bpf_log_attr_init(&attr_log, attr.log_buf, attr.log_size, attr.log_level,
- offsetof_log_true_size, uattr);
+ offsetof_log_true_size, uattr, &attr_common, uattr_common,
+ size_common);
err = err ?: bpf_prog_load(&attr, uattr, &attr_log);
break;
case BPF_OBJ_PIN:
--
2.53.0
* [PATCH bpf-next v12 5/8] bpf: Add syscall common attributes support for btf_load
From: Leon Hwang @ 2026-04-20 14:18 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Shuah Khan, Christian Brauner, Seth Forshee, Yuichiro Tsuji,
Andrey Albershteyn, Leon Hwang, Willem de Bruijn, Jason Xing,
Tao Chen, Mykyta Yatsenko, Kumar Kartikeya Dwivedi,
Anton Protopopov, Amery Hung, Rong Tao, linux-kernel, linux-api,
linux-kselftest, kernel-patches-bot
In-Reply-To: <20260420141804.27179-1-leon.hwang@linux.dev>
BPF_BTF_LOAD can now take log parameters from both union bpf_attr and
struct bpf_common_attr, with the same merge rules as BPF_PROG_LOAD:
- if both sides provide a complete log tuple (buf/size/level) and they
match, use it;
- if only one side provides log parameters, use that one;
- if both sides provide complete tuples but they differ, return -EINVAL.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
include/linux/btf.h | 3 ++-
kernel/bpf/btf.c | 30 +++++++-----------------------
kernel/bpf/syscall.c | 11 ++++++++---
3 files changed, 17 insertions(+), 27 deletions(-)
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 48108471c5b1..f64d87315c5f 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -145,7 +145,8 @@ const char *btf_get_name(const struct btf *btf);
void btf_get(struct btf *btf);
void btf_put(struct btf *btf);
const struct btf_header *btf_header(const struct btf *btf);
-int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, u32 uattr_sz);
+struct bpf_log_attr;
+int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, struct bpf_log_attr *attr_log);
struct btf *btf_get_by_fd(int fd);
int btf_get_info_by_fd(const struct btf *btf,
const union bpf_attr *attr,
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index a62d78581207..9f69ef774a0b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5914,25 +5914,10 @@ static int btf_check_type_tags(struct btf_verifier_env *env,
return 0;
}
-static int finalize_log(struct bpf_verifier_log *log, bpfptr_t uattr, u32 uattr_size)
-{
- u32 log_true_size;
- int err;
-
- err = bpf_vlog_finalize(log, &log_true_size);
-
- if (uattr_size >= offsetofend(union bpf_attr, btf_log_true_size) &&
- copy_to_bpfptr_offset(uattr, offsetof(union bpf_attr, btf_log_true_size),
- &log_true_size, sizeof(log_true_size)))
- err = -EFAULT;
-
- return err;
-}
-
-static struct btf *btf_parse(const union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
+static struct btf *btf_parse(const union bpf_attr *attr, bpfptr_t uattr,
+ struct bpf_log_attr *attr_log)
{
bpfptr_t btf_data = make_bpfptr(attr->btf, uattr.is_kernel);
- char __user *log_ubuf = u64_to_user_ptr(attr->btf_log_buf);
struct btf_struct_metas *struct_meta_tab;
struct btf_verifier_env *env = NULL;
struct btf *btf = NULL;
@@ -5949,8 +5934,7 @@ static struct btf *btf_parse(const union bpf_attr *attr, bpfptr_t uattr, u32 uat
/* user could have requested verbose verifier output
* and supplied buffer to store the verification trace
*/
- err = bpf_vlog_init(&env->log, attr->btf_log_level,
- log_ubuf, attr->btf_log_size);
+ err = bpf_vlog_init(&env->log, attr_log->level, attr_log->ubuf, attr_log->size);
if (err)
goto errout_free;
@@ -6015,7 +5999,7 @@ static struct btf *btf_parse(const union bpf_attr *attr, bpfptr_t uattr, u32 uat
}
}
- err = finalize_log(&env->log, uattr, uattr_size);
+ err = bpf_log_attr_finalize(attr_log, &env->log);
if (err)
goto errout_free;
@@ -6027,7 +6011,7 @@ static struct btf *btf_parse(const union bpf_attr *attr, bpfptr_t uattr, u32 uat
btf_free_struct_meta_tab(btf);
errout:
/* overwrite err with -ENOSPC or -EFAULT */
- ret = finalize_log(&env->log, uattr, uattr_size);
+ ret = bpf_log_attr_finalize(attr_log, &env->log);
if (ret)
err = ret;
errout_free:
@@ -8196,12 +8180,12 @@ static int __btf_new_fd(struct btf *btf)
return anon_inode_getfd("btf", &btf_fops, btf, O_RDONLY | O_CLOEXEC);
}
-int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, u32 uattr_size)
+int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr, struct bpf_log_attr *attr_log)
{
struct btf *btf;
int ret;
- btf = btf_parse(attr, uattr, uattr_size);
+ btf = btf_parse(attr, uattr, attr_log);
if (IS_ERR(btf))
return PTR_ERR(btf);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 1eb15a51c7dc..06aea4e499a1 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -5473,7 +5473,7 @@ static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
#define BPF_BTF_LOAD_LAST_FIELD btf_token_fd
-static int bpf_btf_load(const union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_size)
+static int bpf_btf_load(const union bpf_attr *attr, bpfptr_t uattr, struct bpf_log_attr *attr_log)
{
struct bpf_token *token = NULL;
@@ -5500,7 +5500,7 @@ static int bpf_btf_load(const union bpf_attr *attr, bpfptr_t uattr, __u32 uattr_
bpf_token_put(token);
- return btf_new_fd(attr, uattr, uattr_size);
+ return btf_new_fd(attr, uattr, attr_log);
}
#define BPF_BTF_GET_FD_BY_ID_LAST_FIELD fd_by_id_token_fd
@@ -6317,7 +6317,12 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size,
err = bpf_raw_tracepoint_open(&attr);
break;
case BPF_BTF_LOAD:
- err = bpf_btf_load(&attr, uattr, size);
+ if (size >= offsetofend(union bpf_attr, btf_log_true_size))
+ offsetof_log_true_size = offsetof(union bpf_attr, btf_log_true_size);
+ err = bpf_log_attr_init(&attr_log, attr.btf_log_buf, attr.btf_log_size,
+ attr.btf_log_level, offsetof_log_true_size, uattr,
+ &attr_common, uattr_common, size_common);
+ err = err ?: bpf_btf_load(&attr, uattr, &attr_log);
break;
case BPF_BTF_GET_FD_BY_ID:
err = bpf_btf_get_fd_by_id(&attr);
--
2.53.0
* [PATCH bpf-next v12 6/8] bpf: Add syscall common attributes support for map_create
From: Leon Hwang @ 2026-04-20 14:18 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Shuah Khan, Christian Brauner, Seth Forshee, Yuichiro Tsuji,
Andrey Albershteyn, Leon Hwang, Willem de Bruijn, Jason Xing,
Tao Chen, Mykyta Yatsenko, Kumar Kartikeya Dwivedi,
Anton Protopopov, Amery Hung, Rong Tao, linux-kernel, linux-api,
linux-kselftest, kernel-patches-bot
In-Reply-To: <20260420141804.27179-1-leon.hwang@linux.dev>
Many BPF_MAP_CREATE validation failures currently return -EINVAL without
any explanation to userspace.
Plumb common syscall log attributes into map_create(), create a verifier
log from bpf_common_attr::log_buf/log_size/log_level, and report
map-creation failure reasons through that buffer.
This improves debuggability by allowing userspace to inspect why map
creation failed and read back log_true_size from common attributes.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
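Note for readers: the idea of reporting a failure reason through an optional log buffer, while still telling the caller how large the full message was, can be sketched in userspace as below. Everything here is hypothetical (the demo_* names, the simplified NUMA check, and counting the terminating NUL in true_size). What the kernel actually reports in log_true_size is defined by bpf_vlog_finalize(), not by this sketch.

```c
#include <errno.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

struct demo_log {
	char *buf;		/* caller buffer, may be NULL */
	size_t size;		/* capacity of buf in bytes */
	size_t true_size;	/* bytes needed for the full message plus NUL */
};

/* No-op when no log was requested; truncates when the buffer is small. */
void demo_log_write(struct demo_log *log, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (!log)
		return;

	va_start(ap, fmt);
	n = vsnprintf(log->buf, log->buf ? log->size : 0, fmt, ap);
	va_end(ap);
	if (n >= 0)
		log->true_size = (size_t)n + 1;
}

/* Two of the checks from __map_create(), reduced to plain integers. */
int demo_map_check(unsigned int map_extra, int numa_node, int nr_nodes,
		   struct demo_log *log)
{
	if (map_extra != 0) {
		demo_log_write(log, "Invalid map_extra.\n");
		return -EINVAL;
	}
	/* A negative numa_node stands for "no node requested". */
	if (numa_node >= 0 && numa_node >= nr_nodes) {
		demo_log_write(log, "Invalid numa_node.\n");
		return -EINVAL;
	}
	return 0;
}
```

A caller that sees true_size larger than its buffer can retry with a bigger buffer, which is the same use the existing log_true_size field serves for BPF_PROG_LOAD.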
include/linux/bpf_verifier.h | 3 ++
kernel/bpf/log.c | 30 +++++++++++++++++
kernel/bpf/syscall.c | 65 ++++++++++++++++++++++++++++++------
3 files changed, 88 insertions(+), 10 deletions(-)
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 9045dc88c5b6..9f565f7fd3f9 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -658,6 +658,9 @@ struct bpf_log_attr {
int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 log_level,
u32 offsetof_log_true_size, bpfptr_t uattr, struct bpf_common_attr *common,
bpfptr_t uattr_common, u32 size_common);
+struct bpf_verifier_log *bpf_log_attr_create_vlog(struct bpf_log_attr *attr_log,
+ struct bpf_common_attr *common, bpfptr_t uattr,
+ u32 size);
int bpf_log_attr_finalize(struct bpf_log_attr *attr, struct bpf_verifier_log *log);
#define BPF_MAX_SUBPROGS 256
diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
index dc8734e27752..0c972f577b60 100644
--- a/kernel/bpf/log.c
+++ b/kernel/bpf/log.c
@@ -900,6 +900,36 @@ int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 l
return 0;
}
+
+struct bpf_verifier_log *bpf_log_attr_create_vlog(struct bpf_log_attr *attr_log,
+ struct bpf_common_attr *common, bpfptr_t uattr,
+ u32 size)
+{
+ struct bpf_verifier_log *log;
+ int err;
+
+ memset(attr_log, 0, sizeof(*attr_log));
+ attr_log->uattr = uattr;
+ if (size >= offsetofend(struct bpf_common_attr, log_true_size))
+ attr_log->offsetof_true_size = offsetof(struct bpf_common_attr, log_true_size);
+
+ if (!common->log_buf)
+ return NULL;
+
+ log = kzalloc_obj(*log, GFP_KERNEL);
+ if (!log)
+ return ERR_PTR(-ENOMEM);
+
+ err = bpf_vlog_init(log, common->log_level, u64_to_user_ptr(common->log_buf),
+ common->log_size);
+ if (err) {
+ kfree(log);
+ return ERR_PTR(err);
+ }
+
+ return log;
+}
+
int bpf_log_attr_finalize(struct bpf_log_attr *attr, struct bpf_verifier_log *log)
{
u32 log_true_size;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 06aea4e499a1..39f4d4b00469 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1359,7 +1359,7 @@ static int map_check_btf(struct bpf_map *map, struct bpf_token *token,
#define BPF_MAP_CREATE_LAST_FIELD excl_prog_hash_size
/* called via syscall */
-static int map_create(union bpf_attr *attr, bpfptr_t uattr)
+static int __map_create(union bpf_attr *attr, bpfptr_t uattr, struct bpf_verifier_log *log)
{
const struct bpf_map_ops *ops;
struct bpf_token *token = NULL;
@@ -1371,8 +1371,10 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
int err;
err = CHECK_ATTR(BPF_MAP_CREATE);
- if (err)
+ if (err) {
+ bpf_log(log, "Invalid attr.\n");
return -EINVAL;
+ }
/* check BPF_F_TOKEN_FD flag, remember if it's set, and then clear it
* to avoid per-map type checks tripping on unknown flag
@@ -1381,17 +1383,25 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
attr->map_flags &= ~BPF_F_TOKEN_FD;
if (attr->btf_vmlinux_value_type_id) {
- if (attr->map_type != BPF_MAP_TYPE_STRUCT_OPS ||
- attr->btf_key_type_id || attr->btf_value_type_id)
+ if (attr->map_type != BPF_MAP_TYPE_STRUCT_OPS) {
+ bpf_log(log, "btf_vmlinux_value_type_id can only be used with struct_ops maps.\n");
return -EINVAL;
+ }
+ if (attr->btf_key_type_id || attr->btf_value_type_id) {
+ bpf_log(log, "btf_vmlinux_value_type_id is mutually exclusive with btf_key_type_id and btf_value_type_id.\n");
+ return -EINVAL;
+ }
} else if (attr->btf_key_type_id && !attr->btf_value_type_id) {
+ bpf_log(log, "Invalid btf_value_type_id.\n");
return -EINVAL;
}
if (attr->map_type != BPF_MAP_TYPE_BLOOM_FILTER &&
attr->map_type != BPF_MAP_TYPE_ARENA &&
- attr->map_extra != 0)
+ attr->map_extra != 0) {
+ bpf_log(log, "Invalid map_extra.\n");
return -EINVAL;
+ }
f_flags = bpf_get_file_flag(attr->map_flags);
if (f_flags < 0)
@@ -1399,13 +1409,17 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
if (numa_node != NUMA_NO_NODE &&
((unsigned int)numa_node >= nr_node_ids ||
- !node_online(numa_node)))
+ !node_online(numa_node))) {
+ bpf_log(log, "Invalid numa_node.\n");
return -EINVAL;
+ }
/* find map type and init map: hashtable vs rbtree vs bloom vs ... */
map_type = attr->map_type;
- if (map_type >= ARRAY_SIZE(bpf_map_types))
+ if (map_type >= ARRAY_SIZE(bpf_map_types)) {
+ bpf_log(log, "Invalid map_type.\n");
return -EINVAL;
+ }
map_type = array_index_nospec(map_type, ARRAY_SIZE(bpf_map_types));
ops = bpf_map_types[map_type];
if (!ops)
@@ -1423,8 +1437,10 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
if (token_flag) {
token = bpf_token_get_from_fd(attr->map_token_fd);
- if (IS_ERR(token))
+ if (IS_ERR(token)) {
+ bpf_log(log, "Invalid map_token_fd.\n");
return PTR_ERR(token);
+ }
/* if current token doesn't grant map creation permissions,
* then we can't use this token, so ignore it and rely on
@@ -1507,8 +1523,10 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
err = bpf_obj_name_cpy(map->name, attr->map_name,
sizeof(attr->map_name));
- if (err < 0)
+ if (err < 0) {
+ bpf_log(log, "Invalid map_name.\n");
goto free_map;
+ }
preempt_disable();
map->cookie = gen_cookie_next(&bpf_map_cookie);
@@ -1531,6 +1549,7 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
btf = btf_get_by_fd(attr->btf_fd);
if (IS_ERR(btf)) {
+ bpf_log(log, "Invalid btf_fd.\n");
err = PTR_ERR(btf);
goto free_map;
}
@@ -1558,6 +1577,7 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
bpfptr_t uprog_hash = make_bpfptr(attr->excl_prog_hash, uattr.is_kernel);
if (attr->excl_prog_hash_size != SHA256_DIGEST_SIZE) {
+ bpf_log(log, "Invalid excl_prog_hash_size.\n");
err = -EINVAL;
goto free_map;
}
@@ -1573,6 +1593,7 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
goto free_map;
}
} else if (attr->excl_prog_hash_size) {
+ bpf_log(log, "Invalid excl_prog_hash_size.\n");
err = -EINVAL;
goto free_map;
}
@@ -1611,6 +1632,30 @@ static int map_create(union bpf_attr *attr, bpfptr_t uattr)
return err;
}
+static int map_create(union bpf_attr *attr, bpfptr_t uattr, struct bpf_common_attr *attr_common,
+ bpfptr_t uattr_common, u32 size_common)
+{
+ struct bpf_verifier_log *log;
+ struct bpf_log_attr attr_log;
+ int err, ret;
+
+ log = bpf_log_attr_create_vlog(&attr_log, attr_common, uattr_common, size_common);
+ if (IS_ERR(log))
+ return PTR_ERR(log);
+
+ err = __map_create(attr, uattr, log);
+ if (err >= 0)
+ goto free;
+
+ ret = bpf_log_attr_finalize(&attr_log, log);
+ if (ret)
+ err = ret;
+
+free:
+ kfree(log);
+ return err;
+}
+
void bpf_map_inc(struct bpf_map *map)
{
atomic64_inc(&map->refcnt);
@@ -6249,7 +6294,7 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size,
switch (cmd) {
case BPF_MAP_CREATE:
- err = map_create(&attr, uattr);
+ err = map_create(&attr, uattr, &attr_common, uattr_common, size_common);
break;
case BPF_MAP_LOOKUP_ELEM:
err = map_lookup_elem(&attr);
--
2.53.0
* [PATCH bpf-next v12 7/8] libbpf: Add syscall common attributes support for map_create
From: Leon Hwang @ 2026-04-20 14:18 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Shuah Khan, Christian Brauner, Seth Forshee, Yuichiro Tsuji,
Andrey Albershteyn, Leon Hwang, Willem de Bruijn, Jason Xing,
Tao Chen, Mykyta Yatsenko, Kumar Kartikeya Dwivedi,
Anton Protopopov, Amery Hung, Rong Tao, linux-kernel, linux-api,
linux-kselftest, kernel-patches-bot
In-Reply-To: <20260420141804.27179-1-leon.hwang@linux.dev>
With the previous commit adding common attribute support for
BPF_MAP_CREATE, users can now retrieve detailed error messages when map
creation fails via the log_buf field.
Introduce struct bpf_log_opts with the following fields:
log_buf, log_size, log_level, and log_true_size.
Extend bpf_map_create_opts with a new field log_opts, allowing users to
capture and inspect log messages on map creation failures.
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
tools/lib/bpf/bpf.c | 16 +++++++++++++++-
tools/lib/bpf/bpf.h | 17 ++++++++++++++++-
2 files changed, 31 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 9d8740761b7a..483c02cf21d1 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -209,6 +209,9 @@ int bpf_map_create(enum bpf_map_type map_type,
const struct bpf_map_create_opts *opts)
{
const size_t attr_sz = offsetofend(union bpf_attr, excl_prog_hash_size);
+ const size_t attr_common_sz = sizeof(struct bpf_common_attr);
+ struct bpf_common_attr attr_common;
+ struct bpf_log_opts *log_opts;
union bpf_attr attr;
int fd;
@@ -242,7 +245,18 @@ int bpf_map_create(enum bpf_map_type map_type,
attr.excl_prog_hash = ptr_to_u64(OPTS_GET(opts, excl_prog_hash, NULL));
attr.excl_prog_hash_size = OPTS_GET(opts, excl_prog_hash_size, 0);
- fd = sys_bpf_fd(BPF_MAP_CREATE, &attr, attr_sz);
+ log_opts = OPTS_GET(opts, log_opts, NULL);
+ if (log_opts && feat_supported(NULL, FEAT_BPF_SYSCALL_COMMON_ATTRS)) {
+ memset(&attr_common, 0, attr_common_sz);
+ attr_common.log_buf = ptr_to_u64(OPTS_GET(log_opts, buf, NULL));
+ attr_common.log_size = OPTS_GET(log_opts, size, 0);
+ attr_common.log_level = OPTS_GET(log_opts, level, 0);
+ fd = sys_bpf_ext_fd(BPF_MAP_CREATE, &attr, attr_sz, &attr_common, attr_common_sz);
+ OPTS_SET(log_opts, true_size, attr_common.log_true_size);
+ } else {
+ fd = sys_bpf_fd(BPF_MAP_CREATE, &attr, attr_sz);
+ OPTS_SET(log_opts, true_size, 0);
+ }
return libbpf_err_errno(fd);
}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 2c8e88ddb674..2312900a3263 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -37,6 +37,18 @@ extern "C" {
LIBBPF_API int libbpf_set_memlock_rlim(size_t memlock_bytes);
+struct bpf_log_opts {
+ size_t sz; /* size of this struct for forward/backward compatibility */
+
+ char *buf;
+ __u32 size;
+ __u32 level;
+ __u32 true_size; /* out parameter set by kernel */
+
+ size_t :0;
+};
+#define bpf_log_opts__last_field true_size
+
struct bpf_map_create_opts {
size_t sz; /* size of this struct for forward/backward compatibility */
@@ -57,9 +69,12 @@ struct bpf_map_create_opts {
const void *excl_prog_hash;
__u32 excl_prog_hash_size;
+
+ struct bpf_log_opts *log_opts;
+
size_t :0;
};
-#define bpf_map_create_opts__last_field excl_prog_hash_size
+#define bpf_map_create_opts__last_field log_opts
LIBBPF_API int bpf_map_create(enum bpf_map_type map_type,
const char *map_name,
--
2.53.0
^ permalink raw reply related
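The "sz" member of struct bpf_log_opts in the patch above is what lets old and new callers share one ABI: a field is read or written only if the caller's struct is large enough to contain it. The sketch below models that idea in plain C. It is an illustration under assumptions, not libbpf's implementation: the model_* struct and MODEL_* macros are hypothetical and only resemble OPTS_GET()/OPTS_SET() in spirit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the fields in struct bpf_log_opts from the patch. */
struct model_log_opts {
	size_t sz;		/* size of the struct as the caller compiled it */
	char *buf;
	uint32_t size;
	uint32_t level;
	uint32_t true_size;	/* out parameter */
};

/* Offset just past a field. */
#define MODEL_OFFSETOFEND(type, field) \
	(offsetof(type, field) + sizeof(((type *)0)->field))

/* A field is usable only if the caller's struct is large enough to hold it. */
#define MODEL_OPTS_HAS(opts, field) \
	((opts) && (opts)->sz >= MODEL_OFFSETOFEND(struct model_log_opts, field))

#define MODEL_OPTS_GET(opts, field, fallback) \
	(MODEL_OPTS_HAS(opts, field) ? (opts)->field : (fallback))

/* Report the required size, but only to callers whose struct has the field. */
static void model_report_true_size(struct model_log_opts *opts, uint32_t needed)
{
	if (MODEL_OPTS_HAS(opts, true_size))
		opts->true_size = needed;
}

/* Example: a current caller and a caller built before true_size existed. */
static void model_opts_example(void)
{
	struct model_log_opts cur = { .sz = sizeof(cur), .level = 1 };
	struct model_log_opts old = {
		.sz = offsetof(struct model_log_opts, true_size),
		.level = 2,
		.true_size = 7,	/* must stay untouched */
	};

	model_report_true_size(&cur, 30);
	model_report_true_size(&old, 30);
	assert(cur.true_size == 30);
	assert(old.true_size == 7);
	assert(MODEL_OPTS_GET(&old, level, 0) == 2);
}
```

An older caller simply never sees true_size being written, which is the behavior the "forward/backward compatibility" comment in the struct refers to.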
* [PATCH bpf-next v12 8/8] selftests/bpf: Add tests to verify map create failure log
From: Leon Hwang @ 2026-04-20 14:18 UTC (permalink / raw)
To: bpf
Cc: Alexei Starovoitov, Daniel Borkmann, John Fastabend,
Andrii Nakryiko, Martin KaFai Lau, Eduard Zingerman, Song Liu,
Yonghong Song, KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa,
Shuah Khan, Christian Brauner, Seth Forshee, Yuichiro Tsuji,
Andrey Albershteyn, Leon Hwang, Willem de Bruijn, Jason Xing,
Tao Chen, Mykyta Yatsenko, Kumar Kartikeya Dwivedi,
Anton Protopopov, Amery Hung, Rong Tao, linux-kernel, linux-api,
linux-kselftest, kernel-patches-bot
In-Reply-To: <20260420141804.27179-1-leon.hwang@linux.dev>
Add tests to verify that the kernel reports the expected error messages
and correct log_true_size when map creation fails.
Signed-off-by: Leon Hwang <leon.hwang@linux.dev>
---
.../selftests/bpf/prog_tests/map_init.c | 166 ++++++++++++++++++
1 file changed, 166 insertions(+)
diff --git a/tools/testing/selftests/bpf/prog_tests/map_init.c b/tools/testing/selftests/bpf/prog_tests/map_init.c
index 14a31109dd0e..5c61c8e37306 100644
--- a/tools/testing/selftests/bpf/prog_tests/map_init.c
+++ b/tools/testing/selftests/bpf/prog_tests/map_init.c
@@ -212,3 +212,169 @@ void test_map_init(void)
if (test__start_subtest("pcpu_lru_map_init"))
test_pcpu_lru_map_init();
}
+
+static void test_map_create(enum bpf_map_type map_type, const char *map_name,
+ struct bpf_map_create_opts *opts, const char *exp_msg)
+{
+ const int key_size = 4, value_size = 4, max_entries = 1;
+ char log_buf[128];
+ int fd;
+ LIBBPF_OPTS(bpf_log_opts, log_opts);
+
+ log_buf[0] = '\0';
+ log_opts.buf = log_buf;
+ log_opts.size = sizeof(log_buf);
+ log_opts.level = 1;
+ opts->log_opts = &log_opts;
+ fd = bpf_map_create(map_type, map_name, key_size, value_size, max_entries, opts);
+ if (!ASSERT_LT(fd, 0, "bpf_map_create")) {
+ close(fd);
+ return;
+ }
+
+ ASSERT_STREQ(log_buf, exp_msg, "log_buf");
+ ASSERT_EQ(log_opts.true_size, strlen(exp_msg) + 1, "true_size");
+}
+
+static void test_map_create_array(struct bpf_map_create_opts *opts, const char *exp_msg)
+{
+ test_map_create(BPF_MAP_TYPE_ARRAY, "test_map_create", opts, exp_msg);
+}
+
+static void test_invalid_vmlinux_value_type_id_struct_ops(void)
+{
+ const char *msg = "btf_vmlinux_value_type_id can only be used with struct_ops maps.\n";
+ LIBBPF_OPTS(bpf_map_create_opts, opts,
+ .btf_vmlinux_value_type_id = 1,
+ );
+
+ test_map_create_array(&opts, msg);
+}
+
+static void test_invalid_vmlinux_value_type_id_kv_type_id(void)
+{
+ const char *msg = "btf_vmlinux_value_type_id is mutually exclusive with btf_key_type_id and btf_value_type_id.\n";
+ LIBBPF_OPTS(bpf_map_create_opts, opts,
+ .btf_vmlinux_value_type_id = 1,
+ .btf_key_type_id = 1,
+ );
+
+ test_map_create(BPF_MAP_TYPE_STRUCT_OPS, "test_map_create", &opts, msg);
+}
+
+static void test_invalid_value_type_id(void)
+{
+ const char *msg = "Invalid btf_value_type_id.\n";
+ LIBBPF_OPTS(bpf_map_create_opts, opts,
+ .btf_key_type_id = 1,
+ );
+
+ test_map_create_array(&opts, msg);
+}
+
+static void test_invalid_map_extra(void)
+{
+ const char *msg = "Invalid map_extra.\n";
+ LIBBPF_OPTS(bpf_map_create_opts, opts,
+ .map_extra = 1,
+ );
+
+ test_map_create_array(&opts, msg);
+}
+
+static void test_invalid_numa_node(void)
+{
+ const char *msg = "Invalid numa_node.\n";
+ LIBBPF_OPTS(bpf_map_create_opts, opts,
+ .map_flags = BPF_F_NUMA_NODE,
+ .numa_node = 0xFF,
+ );
+
+ test_map_create_array(&opts, msg);
+}
+
+static void test_invalid_map_type(void)
+{
+ const char *msg = "Invalid map_type.\n";
+ LIBBPF_OPTS(bpf_map_create_opts, opts);
+
+ test_map_create(__MAX_BPF_MAP_TYPE, "test_map_create", &opts, msg);
+}
+
+static void test_invalid_token_fd(void)
+{
+ const char *msg = "Invalid map_token_fd.\n";
+ LIBBPF_OPTS(bpf_map_create_opts, opts,
+ .map_flags = BPF_F_TOKEN_FD,
+ .token_fd = 0xFF,
+ );
+
+ test_map_create_array(&opts, msg);
+}
+
+static void test_invalid_map_name(void)
+{
+ const char *msg = "Invalid map_name.\n";
+ LIBBPF_OPTS(bpf_map_create_opts, opts);
+
+ test_map_create(BPF_MAP_TYPE_ARRAY, "test-!@#", &opts, msg);
+}
+
+static void test_invalid_btf_fd(void)
+{
+ const char *msg = "Invalid btf_fd.\n";
+ LIBBPF_OPTS(bpf_map_create_opts, opts,
+ .btf_fd = -1,
+ .btf_key_type_id = 1,
+ .btf_value_type_id = 1,
+ );
+
+ test_map_create_array(&opts, msg);
+}
+
+static void test_excl_prog_hash_size_1(void)
+{
+ const char *msg = "Invalid excl_prog_hash_size.\n";
+ const char *hash = "DEADCODE";
+ LIBBPF_OPTS(bpf_map_create_opts, opts,
+ .excl_prog_hash = hash,
+ );
+
+ test_map_create_array(&opts, msg);
+}
+
+static void test_excl_prog_hash_size_2(void)
+{
+ const char *msg = "Invalid excl_prog_hash_size.\n";
+ LIBBPF_OPTS(bpf_map_create_opts, opts,
+ .excl_prog_hash_size = 1,
+ );
+
+ test_map_create_array(&opts, msg);
+}
+
+void test_map_create_failure(void)
+{
+ if (test__start_subtest("invalid_vmlinux_value_type_id_struct_ops"))
+ test_invalid_vmlinux_value_type_id_struct_ops();
+ if (test__start_subtest("invalid_vmlinux_value_type_id_kv_type_id"))
+ test_invalid_vmlinux_value_type_id_kv_type_id();
+ if (test__start_subtest("invalid_value_type_id"))
+ test_invalid_value_type_id();
+ if (test__start_subtest("invalid_map_extra"))
+ test_invalid_map_extra();
+ if (test__start_subtest("invalid_numa_node"))
+ test_invalid_numa_node();
+ if (test__start_subtest("invalid_map_type"))
+ test_invalid_map_type();
+ if (test__start_subtest("invalid_token_fd"))
+ test_invalid_token_fd();
+ if (test__start_subtest("invalid_map_name"))
+ test_invalid_map_name();
+ if (test__start_subtest("invalid_btf_fd"))
+ test_invalid_btf_fd();
+ if (test__start_subtest("invalid_excl_prog_hash_size_1"))
+ test_excl_prog_hash_size_1();
+ if (test__start_subtest("invalid_excl_prog_hash_size_2"))
+ test_excl_prog_hash_size_2();
+}
--
2.53.0
^ permalink raw reply related
* Re: [PATCH v6 0/4] OPENAT2_REGULAR flag support for openat2
From: Dorjoy Chowdhury @ 2026-04-20 14:31 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, linux-kernel, linux-api, ceph-devel, gfs2,
linux-nfs, linux-cifs, v9fs, linux-kselftest, viro, jack, jlayton,
chuck.lever, alex.aring, arnd, adilger, mjguzik, smfrench,
richard.henderson, mattst88, linmag7, tsbogend, James.Bottomley,
deller, davem, andreas, idryomov, amarkuze, slava, agruenba,
trondmy, anna, sfrench, pc, ronniesahlberg, sprasad, tom,
bharathsm, shuah, miklos, hansg
In-Reply-To: <20260420-laufen-einzeln-4cf4bb364a5d@brauner>
On Mon, Apr 20, 2026 at 7:20 PM Christian Brauner <brauner@kernel.org> wrote:
>
> On Thu, Apr 16, 2026 at 09:22:03PM +0600, Dorjoy Chowdhury wrote:
> > On Thu, Apr 16, 2026 at 7:07 PM Christian Brauner <brauner@kernel.org> wrote:
> > >
> > > On Sat, 28 Mar 2026 23:22:21 +0600, Dorjoy Chowdhury wrote:
> > > > I came upon this "Ability to only open regular files" uapi feature suggestion
> > > > from https://uapi-group.org/kernel-features/#ability-to-only-open-regular-files
> > > > and thought it would be something I could do as a first patch and get to
> > > > know the kernel code a bit better.
> > > >
> > > > The following filesystems have been tested by building and booting the kernel
> > > > x86 bzImage in a Fedora 43 VM in QEMU. I have tested with OPENAT2_REGULAR that
> > > > regular files can be successfully opened and non-regular files (directory, fifo etc)
> > > > return -EFTYPE.
> > > > - btrfs
> > > > - NFS (loopback)
> > > > - SMB (loopback)
> > > >
> > > > [...]
> > >
> > > - I've added an explanation why OPENAT2_REGULAR is only needed for some
> > > ->atomic_open() implementers but not others. What I don't like is that
> > > we need all that custom handling in there but it's manageable.
> > >
> > > - I dropped the topmost style conversions. They really don't belong
> > > there and if we switch to something better we should use (1 << <nr>).
> > >
> > > - I split the EFTYPE errno introduction into a separate patch.
> > >
> > > ---
> >
> > Thanks for fixing up and picking this one up!
> >
> > >
> > > Applied to the vfs-7.2.openat.regular branch of the vfs/vfs.git tree.
> > > Patches in the vfs-7.2.openat.regular branch should appear in linux-next soon.
> > >
> >
> > I don't see a vfs-7.2.openat.regular branch in vfs/vfs.git tree in
> > git.kernel.org. Maybe this hasn't been pushed yet?
>
> Nothing will get pushed prior to -rc1 which is due this Sunday.
Understood. Thanks!
Regards,
Dorjoy
^ permalink raw reply
* Re: [PATCH bpf-next v12 5/8] bpf: Add syscall common attributes support for btf_load
From: bot+bpf-ci @ 2026-04-20 14:56 UTC (permalink / raw)
To: leon.hwang, bpf
Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, shuah, brauner,
sforshee, yuichtsu, aalbersh, leon.hwang, willemb,
kerneljasonxing, chen.dylane, yatsenko, memxor, a.s.protopopov,
ameryhung, rongtao, linux-kernel, linux-api, linux-kselftest,
kernel-patches-bot, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
In-Reply-To: <20260420141804.27179-6-leon.hwang@linux.dev>
[-- Attachment #1: Type: text/plain, Size: 1644 bytes --]
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 1eb15a51c7dc..06aea4e499a1 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
[ ... ]
> @@ -6317,7 +6317,12 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size,
> err = bpf_raw_tracepoint_open(&attr);
> break;
> case BPF_BTF_LOAD:
> - err = bpf_btf_load(&attr, uattr, size);
> + if (size >= offsetofend(union bpf_attr, btf_log_true_size))
> + offsetof_log_true_size = offsetof(union bpf_attr, btf_log_true_size);
> + err = bpf_log_attr_init(&attr_log, attr.btf_log_buf, attr.btf_log_size,
> + attr.btf_log_level, offsetof_log_true_size, uattr,
> + &attr_common, uattr_common, size_common);
> + err = err ?: bpf_btf_load(&attr, uattr, &attr_log);
Should the log_true_size offset be set only for user space calls? In v10
review, Andrii Nakryiko noted that log_true_size should only be set for
user space, not kernel space calls.
https://lore.kernel.org/bpf/CAEf4BzY3GA+860RODLaXqqO3j5AgJ24pmKU1yv3+gHJpy0DvNA@mail.gmail.com/
The current code does not check uattr.is_kernel before setting
offsetof_log_true_size. This could cause issues when bpf_sys_bpf() calls
__sys_bpf() with KERNEL_BPFPTR, as it would attempt to write log_true_size
to a kernel pointer instead of a user space pointer.
> break;
> case BPF_BTF_GET_FD_BY_ID:
> err = bpf_btf_get_fd_by_id(&attr);
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/24672239158
^ permalink raw reply
* Re: [PATCH bpf-next v12 6/8] bpf: Add syscall common attributes support for map_create
From: bot+bpf-ci @ 2026-04-20 14:56 UTC (permalink / raw)
To: leon.hwang, bpf
Cc: ast, daniel, john.fastabend, andrii, martin.lau, eddyz87, song,
yonghong.song, kpsingh, sdf, haoluo, jolsa, shuah, brauner,
sforshee, yuichtsu, aalbersh, leon.hwang, willemb,
kerneljasonxing, chen.dylane, yatsenko, memxor, a.s.protopopov,
ameryhung, rongtao, linux-kernel, linux-api, linux-kselftest,
kernel-patches-bot, ast, andrii, daniel, martin.lau, eddyz87,
yonghong.song, clm, ihor.solodrai
In-Reply-To: <20260420141804.27179-7-leon.hwang@linux.dev>
[-- Attachment #1: Type: text/plain, Size: 3737 bytes --]
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index dbe321aa8591..34500f81db6e 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -710,6 +710,9 @@ struct bpf_log_attr {
> int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 log_level,
> u32 offsetof_log_true_size, bpfptr_t uattr, struct bpf_common_attr *common,
> bpfptr_t uattr_common, u32 size_common);
> +struct bpf_verifier_log *bpf_log_attr_create_vlog(struct bpf_log_attr *attr_log,
> + struct bpf_common_attr *common, bpfptr_t uattr,
> + u32 size);
> int bpf_log_attr_finalize(struct bpf_log_attr *attr, struct bpf_verifier_log *log);
>
> #define BPF_MAX_SUBPROGS 256
> diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
> index dffe3a0180a7..5a977a52ef61 100644
> --- a/kernel/bpf/log.c
> +++ b/kernel/bpf/log.c
> @@ -860,6 +860,36 @@ int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 l
> return 0;
> }
>
> +
> +struct bpf_verifier_log *bpf_log_attr_create_vlog(struct bpf_log_attr *attr_log,
> + struct bpf_common_attr *common, bpfptr_t uattr,
> + u32 size)
> +{
> + struct bpf_verifier_log *log;
> + int err;
> +
> + memset(attr_log, 0, sizeof(*attr_log));
> + attr_log->uattr = uattr;
> + if (size >= offsetofend(struct bpf_common_attr, log_true_size))
> + attr_log->offsetof_true_size = offsetof(struct bpf_common_attr, log_true_size);
> +
> + if (!common->log_buf)
> + return NULL;
Does this skip validation that bpf_log_attr_init() performs?
The sibling helper bpf_log_attr_init() validates struct bpf_common_attr
fields via bpf_verifier_log_attr_valid() unconditionally:
kernel/bpf/log.c:bpf_log_attr_init() {
if (!bpf_verifier_log_attr_valid(common->log_level, ubuf_common, common->log_size) ||
!bpf_verifier_log_attr_valid(log_level, ubuf, log_size))
return -EINVAL;
}
This returns -EINVAL for combinations like (log_buf==0 && log_size!=0)
or (log_buf==0 && log_level!=0), enforced by:
kernel/bpf/log.c:bpf_verifier_log_attr_valid() {
/* ubuf and len_total should both be specified (or not) together */
if (!!log_buf != !!log_size)
return false;
if (log_buf && log_level == 0)
return false;
if (log_level & ~BPF_LOG_MASK)
return false;
if (log_size > UINT_MAX >> 2)
return false;
}
When bpf_log_attr_create_vlog() returns early because common->log_buf == 0,
these four checks are skipped entirely. When common->log_buf != 0, the
checks run later via bpf_vlog_init().
This creates an ABI validation inconsistency: BPF_PROG_LOAD and BPF_BTF_LOAD
reject (log_buf==0 && log_size!=0) with -EINVAL, but BPF_MAP_CREATE
silently accepts it. Userspace passing the same invalid attribute
combination receives different results depending on the syscall command.
Should bpf_log_attr_create_vlog() call
bpf_verifier_log_attr_valid(common->log_level, u64_to_user_ptr(common->log_buf), common->log_size)
before the early return, and return ERR_PTR(-EINVAL) on failure?
> +
> + log = kzalloc_obj(*log, GFP_KERNEL);
> + if (!log)
> + return ERR_PTR(-ENOMEM);
> +
> + err = bpf_vlog_init(log, common->log_level, u64_to_user_ptr(common->log_buf),
> + common->log_size);
> + if (err) {
> + kfree(log);
> + return ERR_PTR(err);
> + }
> +
> + return log;
> +}
> +
> int bpf_log_attr_finalize(struct bpf_log_attr *attr, struct bpf_verifier_log *log)
> {
> u32 log_true_size;
[ ... ]
---
AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
CI run summary: https://github.com/kernel-patches/bpf/actions/runs/24672239158
^ permalink raw reply
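The four checks quoted in the review above are easy to exercise outside the kernel. The sketch below is a userspace model of them, not the kernel function itself: model_log_attr_valid() is a hypothetical name, it takes the buffer as a plain integer, and MODEL_LOG_MASK is an assumed value standing in for BPF_LOG_MASK.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value: the union of the level, stats, and fixed bits. */
#define MODEL_LOG_MASK 0xfu

/* Userspace model of the four checks quoted in the review. */
static bool model_log_attr_valid(uint32_t log_level, uint64_t log_buf,
				 uint32_t log_size)
{
	/* buffer and size must be given together, or both omitted */
	if (!!log_buf != !!log_size)
		return false;
	/* a buffer without a level is meaningless */
	if (log_buf && log_level == 0)
		return false;
	if (log_level & ~MODEL_LOG_MASK)
		return false;
	if (log_size > UINT_MAX >> 2)
		return false;
	return true;
}

/* Example: a few attribute combinations, checked with assertions. */
static void model_log_attr_example(void)
{
	assert(model_log_attr_valid(0, 0, 0));		/* no logging requested */
	assert(model_log_attr_valid(1, 0x1000, 128));	/* buffer, size, level */
	assert(!model_log_attr_valid(1, 0, 128));	/* size without buffer */
	assert(!model_log_attr_valid(0, 0x1000, 128));	/* buffer without level */
	assert(model_log_attr_valid(1, 0, 0));		/* level alone passes */
}
```

Read literally, these checks accept a non-zero level with neither buffer nor size, so in this model only the first of the two combinations named in the review (log_buf == 0 with log_size != 0) is rejected.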
* Re: [PATCH bpf-next v12 5/8] bpf: Add syscall common attributes support for btf_load
From: Leon Hwang @ 2026-04-22 5:28 UTC (permalink / raw)
To: bot+bpf-ci
Cc: Leon Hwang, bpf, ast, daniel, john.fastabend, andrii, martin.lau,
eddyz87, song, yonghong.song, kpsingh, sdf, haoluo, jolsa, shuah,
brauner, sforshee, yuichtsu, aalbersh, willemb, kerneljasonxing,
chen.dylane, yatsenko, memxor, a.s.protopopov, ameryhung, rongtao,
linux-kernel, linux-api, linux-kselftest, kernel-patches-bot,
martin.lau, clm, ihor.solodrai
In-Reply-To: <a007ad07d6c91fc692fe29e0a57f1f6059f27eb4d147cb987d01f955b76d3c36@mail.kernel.org>
On Mon, 20 Apr 2026 14:56:31 +0000 (UTC), bot+bpf-ci@kernel.org wrote:
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index 1eb15a51c7dc..06aea4e499a1 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>
> [ ... ]
>
>> @@ -6317,7 +6317,12 @@ static int __sys_bpf(enum bpf_cmd cmd, bpfptr_t uattr, unsigned int size,
>> err = bpf_raw_tracepoint_open(&attr);
>> break;
>> case BPF_BTF_LOAD:
>> - err = bpf_btf_load(&attr, uattr, size);
>> + if (size >= offsetofend(union bpf_attr, btf_log_true_size))
>> + offsetof_log_true_size = offsetof(union bpf_attr, btf_log_true_size);
>> + err = bpf_log_attr_init(&attr_log, attr.btf_log_buf, attr.btf_log_size,
>> + attr.btf_log_level, offsetof_log_true_size, uattr,
>> + &attr_common, uattr_common, size_common);
>> + err = err ?: bpf_btf_load(&attr, uattr, &attr_log);
>
> Should the log_true_size offset be set only for user space calls? In v10
> review, Andrii Nakryiko noted that log_true_size should only be set for
> user space, not kernel space calls.
>
> https://lore.kernel.org/bpf/CAEf4BzY3GA+860RODLaXqqO3j5AgJ24pmKU1yv3+gHJpy0DvNA@mail.gmail.com/
>
> The current code does not check uattr.is_kernel before setting
> offsetof_log_true_size. This could cause issues when bpf_sys_bpf() calls
> __sys_bpf() with KERNEL_BPFPTR, as it would attempt to write log_true_size
> to a kernel pointer instead of a user space pointer.
>
Not true. copy_to_bpfptr_offset() checks 'uattr.is_kernel' itself.
When log_true_size is written to a kernel pointer, memcpy() is used.
Thanks,
Leon
>> break;
>> case BPF_BTF_GET_FD_BY_ID:
>> err = bpf_btf_get_fd_by_id(&attr);
>
>
> ---
> AI reviewed your patch. Please fix the bug or email reply why it's not a bug.
> See: https://github.com/kernel-patches/vmtest/blob/master/ci/claude/README.md
>
> CI run summary: https://github.com/kernel-patches/bpf/actions/runs/24672239158
^ permalink raw reply
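The reply above rests on the copy helper dispatching on the pointer's kernel/user tag, so the caller does not need its own is_kernel check. A minimal userspace sketch of that dispatch follows. It is only a model: struct model_bpfptr and the model_* functions are hypothetical, and the "user" path here is an ordinary memcpy() with a counter, where the kernel would use copy_to_user().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of a tagged kernel-or-user pointer. */
struct model_bpfptr {
	void *ptr;
	bool is_kernel;
};

static int model_user_copies;	/* counts trips through the "user" path */

/* Stand-in for copy_to_user(); a real kernel would validate the address. */
static int model_copy_to_user(void *dst, const void *src, size_t n)
{
	model_user_copies++;
	memcpy(dst, src, n);
	return 0;
}

/* Dispatch on the tag: plain memcpy() for kernel memory, user copy otherwise. */
static int model_copy_to_ptr_offset(struct model_bpfptr dst, size_t off,
				    const void *src, size_t n)
{
	if (dst.is_kernel) {
		memcpy((char *)dst.ptr + off, src, n);
		return 0;
	}
	return model_copy_to_user((char *)dst.ptr + off, src, n);
}

/* Example: the same call site serves both kinds of pointer. */
static void model_bpfptr_example(void)
{
	unsigned int attr[4] = { 0 };
	unsigned int true_size = 42;
	struct model_bpfptr k = { attr, true };
	struct model_bpfptr u = { attr, false };
	int before = model_user_copies;

	assert(model_copy_to_ptr_offset(k, 2 * sizeof(attr[0]),
					&true_size, sizeof(true_size)) == 0);
	assert(attr[2] == 42 && model_user_copies == before);

	assert(model_copy_to_ptr_offset(u, 1 * sizeof(attr[0]),
					&true_size, sizeof(true_size)) == 0);
	assert(attr[1] == 42 && model_user_copies == before + 1);
}
```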
* Re: [PATCH bpf-next v12 6/8] bpf: Add syscall common attributes support for map_create
From: Leon Hwang @ 2026-04-22 5:44 UTC (permalink / raw)
To: bot+bpf-ci
Cc: Leon Hwang, bpf, ast, daniel, john.fastabend, andrii, martin.lau,
eddyz87, song, yonghong.song, kpsingh, sdf, haoluo, jolsa, shuah,
brauner, sforshee, yuichtsu, aalbersh, willemb, kerneljasonxing,
chen.dylane, yatsenko, memxor, a.s.protopopov, ameryhung, rongtao,
linux-kernel, linux-api, linux-kselftest, kernel-patches-bot,
martin.lau, clm, ihor.solodrai
In-Reply-To: <caa1b9f532b97e8cbcf9b2c00fc5d39d038066d08993ab4c12bba5a6cf971528@mail.kernel.org>
On Mon, 20 Apr 2026 14:56:33 +0000 (UTC), bot+bpf-ci@kernel.org wrote:
>> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
>> index dbe321aa8591..34500f81db6e 100644
>> --- a/include/linux/bpf_verifier.h
>> +++ b/include/linux/bpf_verifier.h
>> @@ -710,6 +710,9 @@ struct bpf_log_attr {
>> int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 log_level,
>> u32 offsetof_log_true_size, bpfptr_t uattr, struct bpf_common_attr *common,
>> bpfptr_t uattr_common, u32 size_common);
>> +struct bpf_verifier_log *bpf_log_attr_create_vlog(struct bpf_log_attr *attr_log,
>> + struct bpf_common_attr *common, bpfptr_t uattr,
>> + u32 size);
>> int bpf_log_attr_finalize(struct bpf_log_attr *attr, struct bpf_verifier_log *log);
>>
>> #define BPF_MAX_SUBPROGS 256
>> diff --git a/kernel/bpf/log.c b/kernel/bpf/log.c
>> index dffe3a0180a7..5a977a52ef61 100644
>> --- a/kernel/bpf/log.c
>> +++ b/kernel/bpf/log.c
>> @@ -860,6 +860,36 @@ int bpf_log_attr_init(struct bpf_log_attr *log, u64 log_buf, u32 log_size, u32 l
>> return 0;
>> }
>>
>> +
NIT: this duplicate blank line should be dropped.
>> +struct bpf_verifier_log *bpf_log_attr_create_vlog(struct bpf_log_attr *attr_log,
>> + struct bpf_common_attr *common, bpfptr_t uattr,
>> + u32 size)
>> +{
>> + struct bpf_verifier_log *log;
>> + int err;
>> +
>> + memset(attr_log, 0, sizeof(*attr_log));
>> + attr_log->uattr = uattr;
>> + if (size >= offsetofend(struct bpf_common_attr, log_true_size))
>> + attr_log->offsetof_true_size = offsetof(struct bpf_common_attr, log_true_size);
>> +
>> + if (!common->log_buf)
>> + return NULL;
>
> Does this skip validation that bpf_log_attr_init() performs?
>
> The sibling helper bpf_log_attr_init() validates struct bpf_common_attr
> fields via bpf_verifier_log_attr_valid() unconditionally:
>
> kernel/bpf/log.c:bpf_log_attr_init() {
> if (!bpf_verifier_log_attr_valid(common->log_level, ubuf_common, common->log_size) ||
> !bpf_verifier_log_attr_valid(log_level, ubuf, log_size))
> return -EINVAL;
> }
>
> This returns -EINVAL for combinations like (log_buf==0 && log_size!=0)
> or (log_buf==0 && log_level!=0), enforced by:
>
> kernel/bpf/log.c:bpf_verifier_log_attr_valid() {
> /* ubuf and len_total should both be specified (or not) together */
> if (!!log_buf != !!log_size)
> return false;
> if (log_buf && log_level == 0)
> return false;
> if (log_level & ~BPF_LOG_MASK)
> return false;
> if (log_size > UINT_MAX >> 2)
> return false;
> }
>
> When bpf_log_attr_create_vlog() returns early because common->log_buf == 0,
> these four checks are skipped entirely. When common->log_buf != 0, the
> checks run later via bpf_vlog_init().
>
> This creates an ABI validation inconsistency: BPF_PROG_LOAD and BPF_BTF_LOAD
> reject (log_buf==0 && log_size!=0) with -EINVAL, but BPF_MAP_CREATE
> silently accepts it. Userspace passing the same invalid attribute
> combination receives different results depending on the syscall command.
>
> Should bpf_log_attr_create_vlog() call
> bpf_verifier_log_attr_valid(common->log_level, u64_to_user_ptr(common->log_buf), common->log_size)
> before the early return, and return ERR_PTR(-EINVAL) on failure?
>
It was intentional.
As with the bpf_log(log, ...) calls in this patch, if log_buf is not provided
there is nowhere to store the log message, so there is no need to allocate
the log object.
That said, allocating the 1KiB+ log object seems cheap, so it would be fine
to drop this early check and let bpf_vlog_init() validate the attributes instead.
Thanks,
Leon
>> +
[...]
^ permalink raw reply
* [PATCH v9 00/17] Exposing case folding behavior
From: Chuck Lever @ 2026-04-22 23:29 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Darrick J. Wong
Following on from
https://lore.kernel.org/linux-nfs/20251021-zypressen-bazillus-545a44af57fd@brauner/T/#m0ba197d75b7921d994cf284f3cef3a62abb11aaa
I'm attempting to implement enough support in the Linux VFS to
enable file services like NFSD and ksmbd (and user space
equivalents) to provide the actual status of case folding support
in local file systems. The default behavior for local file systems
not explicitly supported in this series is to reflect the usual
POSIX behaviors:
case-insensitive = false
case-nonpreserving = false
The case-insensitivity and case-nonpreserving booleans can be
consumed immediately by NFSD. These two attributes have been part of
the NFSv3 and NFSv4 protocols for decades, in order to support NFS
client implementations on non-POSIX systems.
Support for user space file servers is why this series exposes case
folding information via a user-space API. I don't know of any other
category of user-space application that requires access to case
folding info.
The Linux NFS community has a growing interest in supporting NFS
clients on Windows and macOS platforms, where file name behavior does
not align with traditional POSIX semantics.
One example of a Windows-based NFS client is [1]. This client
implementation explicitly requires servers to report
FATTR4_WORD0_CASE_INSENSITIVE = TRUE for proper operation, a hard
requirement for Windows client interoperability because Windows
applications expect case-insensitive behavior. When an NFS client
knows the server is case-insensitive, it can avoid issuing multiple
LOOKUP/READDIR requests to search for case variants, and applications
like Win32 programs work correctly without manual workarounds or
code changes.
Even the Linux client can take advantage of this information. Trond
merged patches 4 years ago [2] that introduce support for case
insensitivity, in support of the Hammerspace NFS server. In
particular, when a client detects a case-insensitive NFS share,
negative dentry caching must be disabled (a lookup for "FILE.TXT"
failing shouldn't cache a negative entry when "file.txt" exists)
and directory change invalidation must clear all cached case-folded
file name variants.
Hammerspace servers and several other NFS server implementations
operate in multi-protocol environments, where a single file service
instance caters to both NFS and SMB clients. In those cases, things
work more smoothly for everyone when the NFS client can see and adapt
to the case folding behavior that SMB users rely on and expect. NFSD
needs to support the case-insensitivity and case-nonpreserving
booleans properly in order to participate as a first-class citizen
in such environments.
[1] https://github.com/kofemann/ms-nfs41-client
[2] https://patchwork.kernel.org/project/linux-nfs/cover/20211217203658.439352-1-trondmy@kernel.org/
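One detail worth spelling out is the polarity of the two booleans: the series tracks "case-nonpreserving", while the NFSv4 attribute named in the patch list is FATTR4_CASE_PRESERVING, the positive sense. The sketch below shows that mapping under stated assumptions. The MODEL_XFLAG_* bit values are placeholders chosen for illustration (the real FS_XFLAG_* values are not shown in this cover letter), and struct model_case_info is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only, not the real FS_XFLAG_* values. */
#define MODEL_XFLAG_CASEFOLD		(1u << 0)
#define MODEL_XFLAG_CASENONPRESERVING	(1u << 1)

struct model_case_info {
	bool case_insensitive;
	bool case_preserving;	/* positive sense, as NFSv4 reports it */
};

/* Derive the two protocol booleans from a set of extended flags. */
static struct model_case_info model_case_info_from_xflags(uint32_t xflags)
{
	struct model_case_info ci = {
		.case_insensitive = (xflags & MODEL_XFLAG_CASEFOLD) != 0,
		.case_preserving = (xflags & MODEL_XFLAG_CASENONPRESERVING) == 0,
	};

	return ci;
}

/* Example: no flags set yields the usual POSIX behavior. */
static void model_case_info_example(void)
{
	struct model_case_info posix = model_case_info_from_xflags(0);
	struct model_case_info folded =
		model_case_info_from_xflags(MODEL_XFLAG_CASEFOLD);

	assert(!posix.case_insensitive && posix.case_preserving);
	assert(folded.case_insensitive && folded.case_preserving);
}
```

With both flags clear, the result matches the defaults listed above: case-insensitive = false and case-nonpreserving = false, which a server would report as case-sensitive and case-preserving.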
---
Changes since v8:
- Rebase on v7.0-rc1
Changes since v7:
- Split file_attr initialization changes into a separate patch
Changes since v6:
- Remove the memset from vfs_fileattr_get
Changes since v5:
- Finish the conversion to FS_XFLAGs
- NFSv4 GETATTR now clears the attr mask bit if nfsd_get_case_info()
fails
Changes since v4:
- Observe the MSDOS "nocase" mount option
- Define new FS_XFLAGs for the user API
Changes since v3:
- Change fa->case_preserving to fa_case_nonpreserving
- VFAT is case preserving
- Make new fields available to user space
Changes since v2:
- Remove unicode labels
- Replace vfs_get_case_info
- Add support for several more local file system implementations
- Add support for in-kernel SMB server
Changes since RFC:
- Use file_getattr instead of statx
- Postpone exposing Unicode version until later
- Support NTFS and ext4 in addition to FAT
- Support NFSv4 fattr4 in addition to NFSv3 PATHCONF
---
Chuck Lever (17):
fs: Move file_kattr initialization to callers
fs: Add case sensitivity flags to file_kattr
fat: Implement fileattr_get for case sensitivity
exfat: Implement fileattr_get for case sensitivity
ntfs3: Implement fileattr_get for case sensitivity
hfs: Implement fileattr_get for case sensitivity
hfsplus: Report case sensitivity in fileattr_get
ext4: Report case sensitivity in fileattr_get
xfs: Report case sensitivity in fileattr_get
cifs: Implement fileattr_get for case sensitivity
nfs: Implement fileattr_get for case sensitivity
f2fs: Add case sensitivity reporting to fileattr_get
vboxsf: Implement fileattr_get for case sensitivity
isofs: Implement fileattr_get for case sensitivity
nfsd: Report export case-folding via NFSv3 PATHCONF
nfsd: Implement NFSv4 FATTR4_CASE_INSENSITIVE and FATTR4_CASE_PRESERVING
ksmbd: Report filesystem case sensitivity via FS_ATTRIBUTE_INFORMATION
fs/exfat/exfat_fs.h | 2 ++
fs/exfat/file.c | 17 +++++++++++++++--
fs/exfat/namei.c | 1 +
fs/ext4/ioctl.c | 7 +++++++
fs/f2fs/file.c | 8 ++++++++
fs/fat/fat.h | 3 +++
fs/fat/file.c | 23 +++++++++++++++++++++++
fs/fat/namei_msdos.c | 1 +
fs/fat/namei_vfat.c | 1 +
fs/file_attr.c | 16 ++++++++--------
fs/hfs/dir.c | 1 +
fs/hfs/hfs_fs.h | 2 ++
fs/hfs/inode.c | 13 +++++++++++++
fs/hfsplus/inode.c | 10 ++++++++++
fs/isofs/dir.c | 11 +++++++++++
fs/nfs/client.c | 9 +++++++--
fs/nfs/inode.c | 21 +++++++++++++++++++++
fs/nfs/internal.h | 3 +++
fs/nfs/nfs3proc.c | 2 ++
fs/nfs/nfs3xdr.c | 7 +++++--
fs/nfs/nfs4proc.c | 2 ++
fs/nfs/proc.c | 3 +++
fs/nfs/symlink.c | 3 +++
fs/nfsd/nfs3proc.c | 18 ++++++++++--------
fs/nfsd/nfs4xdr.c | 25 +++++++++++++++++++++++--
fs/nfsd/vfs.c | 29 +++++++++++++++++++++++++++++
fs/nfsd/vfs.h | 3 +++
fs/ntfs3/file.c | 23 +++++++++++++++++++++++
fs/ntfs3/inode.c | 1 +
fs/ntfs3/namei.c | 2 ++
fs/ntfs3/ntfs_fs.h | 1 +
fs/smb/client/cifsfs.c | 20 ++++++++++++++++++++
fs/smb/server/smb2pdu.c | 25 +++++++++++++++++++------
fs/vboxsf/dir.c | 1 +
fs/vboxsf/file.c | 6 ++++--
fs/vboxsf/super.c | 7 +++++++
fs/vboxsf/utils.c | 26 ++++++++++++++++++++++++++
fs/vboxsf/vfsmod.h | 6 ++++++
fs/xfs/xfs_ioctl.c | 9 ++++++++-
include/linux/fileattr.h | 3 ++-
include/linux/nfs_xdr.h | 2 ++
include/uapi/linux/fs.h | 7 +++++++
42 files changed, 346 insertions(+), 34 deletions(-)
---
base-commit: 6596a02b207886e9e00bb0161c7fd59fea53c081
change-id: 20260422-case-sensitivity-5cbffc8f1558
Best regards,
--
Chuck Lever
^ permalink raw reply
* [PATCH v9 01/17] fs: Move file_kattr initialization to callers
From: Chuck Lever @ 2026-04-22 23:29 UTC (permalink / raw)
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Darrick J. Wong
In-Reply-To: <20260422-case-sensitivity-v9-0-be023cc070e2@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
fileattr_fill_xflags() and fileattr_fill_flags() memset the
entire file_kattr struct before populating select fields, so
callers cannot pre-set fields in fa->fsx_xflags without having
their values clobbered. Darrick Wong noted that a function
named "fill_xflags" touching more than xflags forces callers
to know implementation details beyond its apparent scope.
Drop the memset from both fill functions and initialize at the
entry points instead: ioctl_setflags(), ioctl_fssetxattr(),
the file_setattr() syscall, and xfs_ioc_fsgetxattra() now
declare fa with an aggregate initializer. ioctl_getflags(),
ioctl_fsgetxattr(), and the file_getattr() syscall already
aggregate-initialize fa to pass flags_valid/fsx_valid hints
into vfs_fileattr_get().
Subsequent patches rely on this so that ->fileattr_get()
handlers can set case-sensitivity flags (FS_XFLAG_CASEFOLD,
FS_XFLAG_CASENONPRESERVING) in fa->fsx_xflags before the fill
functions run.
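
The clobbering described above can be reproduced outside the kernel.
What follows is a minimal userspace sketch, not the kernel code: struct
demo_kattr is a hypothetical, cut-down stand-in for struct file_kattr,
and every demo_* / DEMO_* name and bit value is an assumption made for
illustration. It models only the flags-to-xflags direction, where the
fill helper ORs translated bits into ->fsx_xflags.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative bit values for this demo only. */
#define DEMO_FL_IMMUTABLE	0x00000010u
#define DEMO_XFLAG_IMMUTABLE	0x00000008u
#define DEMO_XFLAG_CASEFOLD	0x00040000u

/* Hypothetical, cut-down stand-in for struct file_kattr. */
struct demo_kattr {
	uint32_t flags;
	uint32_t fsx_xflags;
	bool flags_valid;
	bool fsx_valid;
};

/* Pre-patch shape: zero the whole struct, then translate flags. */
void demo_fill_flags_old(struct demo_kattr *fa, uint32_t flags)
{
	memset(fa, 0, sizeof(*fa));
	fa->flags_valid = true;
	fa->flags = flags;
	if (fa->flags & DEMO_FL_IMMUTABLE)
		fa->fsx_xflags |= DEMO_XFLAG_IMMUTABLE;
}

/* Post-patch shape: no memset, the caller owns initialization. */
void demo_fill_flags_new(struct demo_kattr *fa, uint32_t flags)
{
	fa->flags_valid = true;
	fa->flags = flags;
	if (fa->flags & DEMO_FL_IMMUTABLE)
		fa->fsx_xflags |= DEMO_XFLAG_IMMUTABLE;
}

/*
 * Model of a handler that sets a case flag first and then calls the
 * fill helper. Returns the resulting xflags word.
 */
uint32_t demo_get_xflags(bool use_old_fill)
{
	struct demo_kattr fa = {0};	/* caller-side initialization */

	fa.fsx_xflags |= DEMO_XFLAG_CASEFOLD;
	if (use_old_fill)
		demo_fill_flags_old(&fa, DEMO_FL_IMMUTABLE);
	else
		demo_fill_flags_new(&fa, DEMO_FL_IMMUTABLE);
	return fa.fsx_xflags;
}
```

With the old helper the pre-set case bit is lost to the memset; with the
new one it survives next to the translated immutable bit, provided the
caller zero-initialized the struct.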
Suggested-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/file_attr.c | 12 ++++--------
fs/xfs/xfs_ioctl.c | 2 +-
2 files changed, 5 insertions(+), 9 deletions(-)
diff --git a/fs/file_attr.c b/fs/file_attr.c
index da983e105d70..f429da66a317 100644
--- a/fs/file_attr.c
+++ b/fs/file_attr.c
@@ -15,12 +15,10 @@
* @fa: fileattr pointer
* @xflags: FS_XFLAG_* flags
*
- * Set ->fsx_xflags, ->fsx_valid and ->flags (translated xflags). All
- * other fields are zeroed.
+ * Set ->fsx_xflags, ->fsx_valid and ->flags (translated xflags).
*/
void fileattr_fill_xflags(struct file_kattr *fa, u32 xflags)
{
- memset(fa, 0, sizeof(*fa));
fa->fsx_valid = true;
fa->fsx_xflags = xflags;
if (fa->fsx_xflags & FS_XFLAG_IMMUTABLE)
@@ -48,11 +46,9 @@ EXPORT_SYMBOL(fileattr_fill_xflags);
* @flags: FS_*_FL flags
*
* Set ->flags, ->flags_valid and ->fsx_xflags (translated flags).
- * All other fields are zeroed.
*/
void fileattr_fill_flags(struct file_kattr *fa, u32 flags)
{
- memset(fa, 0, sizeof(*fa));
fa->flags_valid = true;
fa->flags = flags;
if (fa->flags & FS_SYNC_FL)
@@ -325,7 +321,7 @@ int ioctl_setflags(struct file *file, unsigned int __user *argp)
{
struct mnt_idmap *idmap = file_mnt_idmap(file);
struct dentry *dentry = file->f_path.dentry;
- struct file_kattr fa;
+ struct file_kattr fa = {};
unsigned int flags;
int err;
@@ -357,7 +353,7 @@ int ioctl_fssetxattr(struct file *file, void __user *argp)
{
struct mnt_idmap *idmap = file_mnt_idmap(file);
struct dentry *dentry = file->f_path.dentry;
- struct file_kattr fa;
+ struct file_kattr fa = {};
int err;
err = copy_fsxattr_from_user(&fa, argp);
@@ -431,7 +427,7 @@ SYSCALL_DEFINE5(file_setattr, int, dfd, const char __user *, filename,
struct path filepath __free(path_put) = {};
unsigned int lookup_flags = 0;
struct file_attr fattr;
- struct file_kattr fa;
+ struct file_kattr fa = {};
int error;
BUILD_BUG_ON(sizeof(struct file_attr) < FILE_ATTR_SIZE_VER0);
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 46e234863644..ed9b4846c05f 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -517,7 +517,7 @@ xfs_ioc_fsgetxattra(
xfs_inode_t *ip,
void __user *arg)
{
- struct file_kattr fa;
+ struct file_kattr fa = {};
xfs_ilock(ip, XFS_ILOCK_SHARED);
xfs_fill_fsxattr(ip, XFS_ATTR_FORK, &fa);
--
2.53.0
* [PATCH v9 02/17] fs: Add case sensitivity flags to file_kattr
From: Chuck Lever @ 2026-04-22 23:29 UTC
To: Al Viro, Christian Brauner, Jan Kara
Cc: linux-fsdevel, linux-ext4, linux-xfs, linux-cifs, linux-nfs,
linux-api, linux-f2fs-devel, hirofumi, linkinjeon, sj1557.seo,
yuezhang.mo, almaz.alexandrovich, slava, glaubitz, frank.li,
tytso, adilger.kernel, cem, sfrench, pc, ronniesahlberg, sprasad,
trondmy, anna, jaegeuk, chao, hansg, senozhatsky, Chuck Lever,
Darrick J. Wong
In-Reply-To: <20260422-case-sensitivity-v9-0-be023cc070e2@oracle.com>
From: Chuck Lever <chuck.lever@oracle.com>
Enable upper layers such as NFSD to retrieve case sensitivity
information from file systems by adding FS_XFLAG_CASEFOLD and
FS_XFLAG_CASENONPRESERVING flags.
Filesystems report case-insensitive or case-nonpreserving behavior
by setting these flags directly in fa->fsx_xflags. The default
(flags unset) indicates POSIX semantics: case-sensitive and
case-preserving. These flags are read-only; userspace cannot set
them via ioctl.
Case sensitivity information is exported to userspace via the
fsx_xflags field of the FS_IOC_FSGETXATTR ioctl and the fa_xflags
field of the file_getattr() system call.
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
fs/file_attr.c | 4 ++++
include/linux/fileattr.h | 3 ++-
include/uapi/linux/fs.h | 7 +++++++
3 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/fs/file_attr.c b/fs/file_attr.c
index f429da66a317..bfb00d256dd5 100644
--- a/fs/file_attr.c
+++ b/fs/file_attr.c
@@ -37,6 +37,8 @@ void fileattr_fill_xflags(struct file_kattr *fa, u32 xflags)
fa->flags |= FS_PROJINHERIT_FL;
if (fa->fsx_xflags & FS_XFLAG_VERITY)
fa->flags |= FS_VERITY_FL;
+ if (fa->fsx_xflags & FS_XFLAG_CASEFOLD)
+ fa->flags |= FS_CASEFOLD_FL;
}
EXPORT_SYMBOL(fileattr_fill_xflags);
@@ -67,6 +69,8 @@ void fileattr_fill_flags(struct file_kattr *fa, u32 flags)
fa->fsx_xflags |= FS_XFLAG_PROJINHERIT;
if (fa->flags & FS_VERITY_FL)
fa->fsx_xflags |= FS_XFLAG_VERITY;
+ if (fa->flags & FS_CASEFOLD_FL)
+ fa->fsx_xflags |= FS_XFLAG_CASEFOLD;
}
EXPORT_SYMBOL(fileattr_fill_flags);
diff --git a/include/linux/fileattr.h b/include/linux/fileattr.h
index 3780904a63a6..58044b598016 100644
--- a/include/linux/fileattr.h
+++ b/include/linux/fileattr.h
@@ -16,7 +16,8 @@
/* Read-only inode flags */
#define FS_XFLAG_RDONLY_MASK \
- (FS_XFLAG_PREALLOC | FS_XFLAG_HASATTR | FS_XFLAG_VERITY)
+ (FS_XFLAG_PREALLOC | FS_XFLAG_HASATTR | FS_XFLAG_VERITY | \
+ FS_XFLAG_CASEFOLD | FS_XFLAG_CASENONPRESERVING)
/* Flags to indicate valid value of fsx_ fields */
#define FS_XFLAG_VALUES_MASK \
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 13f71202845e..2ea4c81df08f 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -254,6 +254,13 @@ struct file_attr {
#define FS_XFLAG_DAX 0x00008000 /* use DAX for IO */
#define FS_XFLAG_COWEXTSIZE 0x00010000 /* CoW extent size allocator hint */
#define FS_XFLAG_VERITY 0x00020000 /* fs-verity enabled */
+/*
+ * Case handling flags (read-only, cannot be set via ioctl).
+ * Default (neither set) indicates POSIX semantics: case-sensitive
+ * lookups and case-preserving storage.
+ */
+#define FS_XFLAG_CASEFOLD 0x00040000 /* case-insensitive lookups */
+#define FS_XFLAG_CASENONPRESERVING 0x00080000 /* case not preserved */
#define FS_XFLAG_HASATTR 0x80000000 /* no DIFLAG for this */
/* the read-only stuff doesn't really belong here, but any other place is
--
2.53.0