Linux userland API discussions
 help / color / mirror / Atom feed
* Re: [RFC PATCH v1 0/2] Add O_DENY_WRITE (complement AT_EXECVE_CHECK)
From: Christian Brauner @ 2025-08-26  9:07 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Al Viro, Kees Cook, Paul Moore, Serge Hallyn, Andy Lutomirski,
	Arnd Bergmann, Christian Heimes, Dmitry Vyukov, Elliott Hughes,
	Fan Wu, Florian Weimer, Jann Horn, Jeff Xu, Jonathan Corbet,
	Jordan R Abrahams, Lakshmi Ramasubramanian, Luca Boccassi,
	Matt Bobrowski, Miklos Szeredi, Mimi Zohar, Nicolas Bouchinet,
	Robert Waite, Roberto Sassu, Scott Shell, Steve Dower,
	Steve Grubb, kernel-hardening, linux-api, linux-fsdevel,
	linux-integrity, linux-kernel, linux-security-module
In-Reply-To: <20250822170800.2116980-1-mic@digikod.net>

On Fri, Aug 22, 2025 at 07:07:58PM +0200, Mickaël Salaün wrote:
> Hi,
> 
> Script interpreters can check if a file would be allowed to be executed
> by the kernel using the new AT_EXECVE_CHECK flag. This approach works
> well on systems with write-xor-execute policies, where scripts cannot
> be modified by malicious processes. However, this protection may not be
> available on more generic distributions.
> 
> The key difference between `./script.sh` and `sh script.sh` (when using
> AT_EXECVE_CHECK) is that execve(2) prevents the script from being opened
> for writing while it's being executed. To achieve parity, the kernel
> should provide a mechanism for script interpreters to deny write access
> during script interpretation. While interpreters can copy script content
> into a buffer, a race condition remains possible after AT_EXECVE_CHECK.
> 
> This patch series introduces a new O_DENY_WRITE flag for use with
> open*(2) and fcntl(2). Both interfaces are necessary since script
> interpreters may receive either a file path or file descriptor. For
> backward compatibility, open(2) with O_DENY_WRITE will not fail on
> unsupported systems, while users requiring explicit support guarantees
> can use openat2(2).

We've said no to abusing the O_* flag space for that AT_EXECVE_* stuff
before and you've been told by Linus as well that this is a nogo.

Nothing has changed in that regard and I'm not interested in stuffing
the VFS APIs full of special-purpose behavior to work around the fact
that this is work that needs to be done in userspace. Change the apps,
stop pushing more and more cruft into the VFS that has no business
there.

That's before we get into all the issues that are introduced by this
mechanism that magically makes arbitrary files unwritable. It's not just
a DoS it's likely to cause breakage in userspace as well. I removed the
deny-write from execve because it already breaks various use-cases or
leads to spurious failures in e.g., go. We're not spreading this disease
as a first-class VFS API.

^ permalink raw reply

* Re: [PATCH v2 1/1] man2/mount.2: expand and clarify docs for MS_REMOUNT | MS_BIND
From: Alejandro Colomar @ 2025-08-26  8:51 UTC (permalink / raw)
  To: Askar Safin
  Cc: Aleksa Sarai, Alexander Viro, linux-api, linux-fsdevel,
	David Howells, Christian Brauner, linux-man
In-Reply-To: <198e5864132.1283ed42534579.7191562270325331624@zohomail.com>

[-- Attachment #1: Type: text/plain, Size: 1869 bytes --]

Hi Askar,

On Tue, Aug 26, 2025 at 12:37:17PM +0400, Askar Safin wrote:
>  ---- On Mon, 25 Aug 2025 23:13:05 +0400  Alejandro Colomar <alx@kernel.org> wrote --- 
>  > Should we say "mount point" instead?  Otherwise, it's inconsistent with
> 
> d-user@comp:/rbt/man-pages$ grep -E -r -I -i 'mount point' /rbt/man-pages/man | wc -l
> 101
> d-user@comp:/rbt/man-pages$ grep -E -r -I -i 'mount-point' /rbt/man-pages/man | wc -l
> 9
> d-user@comp:/rbt/man-pages$ grep -E -r -I -i 'mountpoint' /rbt/man-pages/man | wc -l
> 4
> 
> My experiments show that "mount point" is indeed the most popular variant.
> 
> I changed all "mountpoint" to "mount point".
> 
> I decided to keep all "per-mount-point".

Thanks!

>  > > +have its existing per-mount-point flags
>  > > +cleared and replaced with those in
>  > > +.I mountflags
>  > > +when
>  > > +.B MS_REMOUNT
>  > > +and
>  > > +.B MS_BIND
>  > > +are specified.
>  > 
>  > Maybe reverse the sentence to start with this?
> 
> I decided simply to remove that "MS_REMOUNT and MS_BIND" part
> (because it is already present in previous sentence).

Okay.

>  > > +This means that if
>  > 
>  > I would move the 'if' to the next line.
> 
> I moved it. But, please, next time do it youself.
> I don't plan to become regular man-pages contributor.

I do these small things myself if they're the only issue.  If there are
more important issues, I _also_ point these out, just because it's
useful.

In general, when writing documentation sentences, write them similarly
to how you would write them if they were code.  You never put an if at
the end of a line of code; never put it at the end of a line of
documentation text.

> I addressed all complains except for listed above and sent v3.

I'll check.


Have a lovely day!
Alex

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH v2 1/1] man2/mount.2: expand and clarify docs for MS_REMOUNT | MS_BIND
From: Askar Safin @ 2025-08-26  8:37 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Aleksa Sarai, Alexander Viro, linux-api, linux-fsdevel,
	David Howells, Christian Brauner, linux-man
In-Reply-To: <rxl7zzllf374j6osujwvpvbvsnrjwikoo5tj2o3pqntfjdmwps@isiyqms4s776>

 ---- On Mon, 25 Aug 2025 23:13:05 +0400  Alejandro Colomar <alx@kernel.org> wrote --- 
 > Should we say "mount point" instead?  Otherwise, it's inconsistent with

d-user@comp:/rbt/man-pages$ grep -E -r -I -i 'mount point' /rbt/man-pages/man | wc -l
101
d-user@comp:/rbt/man-pages$ grep -E -r -I -i 'mount-point' /rbt/man-pages/man | wc -l
9
d-user@comp:/rbt/man-pages$ grep -E -r -I -i 'mountpoint' /rbt/man-pages/man | wc -l
4

My experiments show that "mount point" is indeed the most popular variant.

I changed all "mountpoint" to "mount point".

I decided to keep all "per-mount-point".

 > > +have its existing per-mount-point flags
 > > +cleared and replaced with those in
 > > +.I mountflags
 > > +when
 > > +.B MS_REMOUNT
 > > +and
 > > +.B MS_BIND
 > > +are specified.
 > 
 > Maybe reverse the sentence to start with this?

I decided simply to remove that "MS_REMOUNT and MS_BIND" part
(because it is already present in previous sentence).

 > > +This means that if
 > 
 > I would move the 'if' to the next line.

I moved it. But, please, next time do it youself.
I don't plan to become regular man-pages contributor.

I addressed all complains except for listed above and sent v3.

====
Askar Safin
https://types.pl/@safinaskar


^ permalink raw reply

* [PATCH v3 2/2] man2/mount.2: tfix (mountpoint => mount point)
From: Askar Safin @ 2025-08-26  8:32 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Aleksa Sarai, Alexander Viro, linux-api, linux-fsdevel,
	David Howells, Christian Brauner, linux-man
In-Reply-To: <20250826083227.2611457-1-safinaskar@zohomail.com>

Here we fix the only remaining mention of "mountpoint"
in all man pages

Signed-off-by: Askar Safin <safinaskar@zohomail.com>
---
 man/man2/mount.2 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/man/man2/mount.2 b/man/man2/mount.2
index 599c2d6fa..9b11fff51 100644
--- a/man/man2/mount.2
+++ b/man/man2/mount.2
@@ -311,7 +311,7 @@ Since Linux 2.6.16,
 can be set or cleared on a per-mount-point basis as well as on
 the underlying filesystem superblock.
 The mounted filesystem will be writable only if neither the filesystem
-nor the mountpoint are flagged as read-only.
+nor the mount point are flagged as read-only.
 .\"
 .SS Remounting an existing mount
 An existing mount may be remounted by specifying
-- 
2.47.2


^ permalink raw reply related

* [PATCH v3 1/2] man2/mount.2: expand and clarify docs for MS_REMOUNT | MS_BIND
From: Askar Safin @ 2025-08-26  8:32 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Aleksa Sarai, Alexander Viro, linux-api, linux-fsdevel,
	David Howells, Christian Brauner, linux-man
In-Reply-To: <20250826083227.2611457-1-safinaskar@zohomail.com>

My edit is based on experiments and reading Linux code

Signed-off-by: Askar Safin <safinaskar@zohomail.com>
---
 man/man2/mount.2 | 27 ++++++++++++++++++++++++---
 1 file changed, 24 insertions(+), 3 deletions(-)

diff --git a/man/man2/mount.2 b/man/man2/mount.2
index 5d83231f9..599c2d6fa 100644
--- a/man/man2/mount.2
+++ b/man/man2/mount.2
@@ -405,7 +405,25 @@ flag can be used with
 to modify only the per-mount-point flags.
 .\" See https://lwn.net/Articles/281157/
 This is particularly useful for setting or clearing the "read-only"
-flag on a mount without changing the underlying filesystem.
+flag on a mount without changing the underlying filesystem parameters.
+The
+.I data
+argument is ignored if
+.B MS_REMOUNT
+and
+.B MS_BIND
+are specified.
+The mount point will
+have its existing per-mount-point flags
+cleared and replaced with those in
+.IR mountflags .
+This means that
+if you wish to preserve
+any existing per-mount-point flags,
+you need to include them in
+.IR mountflags ,
+along with the per-mount-point flags you wish to set
+(or with the flags you wish to clear missing).
 Specifying
 .I mountflags
 as:
@@ -416,8 +434,11 @@ MS_REMOUNT | MS_BIND | MS_RDONLY
 .EE
 .in
 .P
-will make access through this mountpoint read-only, without affecting
-other mounts.
+will make access through this mount point read-only
+(clearing all other per-mount-point flags),
+without affecting
+other mounts
+of this filesystem.
 .\"
 .SS Creating a bind mount
 If
-- 
2.47.2


^ permalink raw reply related

* [PATCH v3 0/2] man2/mount.2: expand and clarify docs for MS_REMOUNT | MS_BIND
From: Askar Safin @ 2025-08-26  8:32 UTC (permalink / raw)
  To: Alejandro Colomar
  Cc: Aleksa Sarai, Alexander Viro, linux-api, linux-fsdevel,
	David Howells, Christian Brauner, linux-man

My edit is based on experiments and reading Linux code

You will find C code I used for experiments below

v1: https://lore.kernel.org/linux-man/20250822114315.1571537-1-safinaskar@zohomail.com/
v2: https://lore.kernel.org/linux-man/20250825154839.2422856-1-safinaskar@zohomail.com/

Askar Safin (2):
  man2/mount.2: expand and clarify docs for MS_REMOUNT | MS_BIND
  man2/mount.2: tfix (mountpoint => mount point)

 man/man2/mount.2 | 29 +++++++++++++++++++++++++----
 1 file changed, 25 insertions(+), 4 deletions(-)

-- 
2.47.2

// You need to be root in initial user namespace

#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>
#include <sched.h>
#include <errno.h>
#include <sys/stat.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <sys/sysmacros.h>
#include <linux/openat2.h>

#define MY_ASSERT(cond) do { \
    if (!(cond)) { \
        fprintf (stderr, "%d: %s: assertion failed\n", __LINE__, #cond); \
        exit (1); \
    } \
} while (0)

int
main (void)
{
    // Init
    {
        MY_ASSERT (chdir ("/") == 0);
        MY_ASSERT (unshare (CLONE_NEWNS) == 0);
        MY_ASSERT (mount (NULL, "/", NULL, MS_PRIVATE | MS_REC, NULL) == 0);
        MY_ASSERT (mount (NULL, "/tmp", "tmpfs", 0, NULL) == 0);
    }

    MY_ASSERT (mkdir ("/tmp/a", 0777) == 0);
    MY_ASSERT (mkdir ("/tmp/b", 0777) == 0);

    // MS_REMOUNT sets options for superblock
    {
        MY_ASSERT (mount (NULL, "/tmp/a", "tmpfs", 0, NULL) == 0);
        MY_ASSERT (mount ("/tmp/a", "/tmp/b", NULL, MS_BIND, NULL) == 0);
        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT | MS_RDONLY, NULL) == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == -1);
        MY_ASSERT (errno == EROFS);
        MY_ASSERT (mkdir ("/tmp/b/c", 0777) == -1);
        MY_ASSERT (errno == EROFS);
        MY_ASSERT (umount ("/tmp/a") == 0);
        MY_ASSERT (umount ("/tmp/b") == 0);
    }

    // MS_REMOUNT | MS_BIND sets options for vfsmount
    {
        MY_ASSERT (mount (NULL, "/tmp/a", "tmpfs", 0, NULL) == 0);
        MY_ASSERT (mount ("/tmp/a", "/tmp/b", NULL, MS_BIND, NULL) == 0);
        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT | MS_BIND | MS_RDONLY, NULL) == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == -1);
        MY_ASSERT (errno == EROFS);
        MY_ASSERT (mkdir ("/tmp/b/c", 0777) == 0);
        MY_ASSERT (rmdir ("/tmp/b/c") == 0);
        MY_ASSERT (umount ("/tmp/a") == 0);
        MY_ASSERT (umount ("/tmp/b") == 0);
    }

    // fspick sets options for superblock
    {
        MY_ASSERT (mount (NULL, "/tmp/a", "tmpfs", 0, NULL) == 0);
        MY_ASSERT (mount ("/tmp/a", "/tmp/b", NULL, MS_BIND, NULL) == 0);
        {
            int fsfd = fspick (AT_FDCWD, "/tmp/a", 0);
            MY_ASSERT (fsfd >= 0);
            MY_ASSERT (fsconfig (fsfd, FSCONFIG_SET_FLAG, "ro", NULL, 0) == 0);
            MY_ASSERT (fsconfig (fsfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0) == 0);
            MY_ASSERT (close (fsfd) == 0);
        }
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == -1);
        MY_ASSERT (errno == EROFS);
        MY_ASSERT (mkdir ("/tmp/b/c", 0777) == -1);
        MY_ASSERT (errno == EROFS);
        MY_ASSERT (umount ("/tmp/a") == 0);
        MY_ASSERT (umount ("/tmp/b") == 0);
    }

    // mount_setattr sets options for vfsmount
    {
        MY_ASSERT (mount (NULL, "/tmp/a", "tmpfs", 0, NULL) == 0);
        MY_ASSERT (mount ("/tmp/a", "/tmp/b", NULL, MS_BIND, NULL) == 0);
        {
            struct mount_attr attr;
            memset (&attr, 0, sizeof attr);
            attr.attr_set = MOUNT_ATTR_RDONLY;
            MY_ASSERT (mount_setattr (AT_FDCWD, "/tmp/a", 0, &attr, sizeof attr) == 0);
        }
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == -1);
        MY_ASSERT (errno == EROFS);
        MY_ASSERT (mkdir ("/tmp/b/c", 0777) == 0);
        MY_ASSERT (rmdir ("/tmp/b/c") == 0);
        MY_ASSERT (umount ("/tmp/a") == 0);
        MY_ASSERT (umount ("/tmp/b") == 0);
    }

    // "ro" as a string works for MS_REMOUNT
    {
        MY_ASSERT (mount (NULL, "/tmp/a", "tmpfs", 0, NULL) == 0);
        MY_ASSERT (mount ("/tmp/a", "/tmp/b", NULL, MS_BIND, NULL) == 0);
        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT, "ro") == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == -1);
        MY_ASSERT (errno == EROFS);
        MY_ASSERT (mkdir ("/tmp/b/c", 0777) == -1);
        MY_ASSERT (errno == EROFS);
        MY_ASSERT (umount ("/tmp/a") == 0);
        MY_ASSERT (umount ("/tmp/b") == 0);
    }

    // "ro" as a string doesn't work for MS_REMOUNT | MS_BIND
    // Option string is ignored
    {
        MY_ASSERT (mount (NULL, "/tmp/a", "tmpfs", 0, NULL) == 0);
        MY_ASSERT (mount ("/tmp/a", "/tmp/b", NULL, MS_BIND, NULL) == 0);
        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT | MS_BIND, "ro") == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == 0);
        MY_ASSERT (rmdir ("/tmp/a/c") == 0);
        MY_ASSERT (mkdir ("/tmp/b/c", 0777) == 0);
        MY_ASSERT (rmdir ("/tmp/b/c") == 0);
        MY_ASSERT (umount ("/tmp/a") == 0);
        MY_ASSERT (umount ("/tmp/b") == 0);
    }

    // Removing MS_RDONLY makes mount writable again (in case of MS_REMOUNT | MS_BIND)
    // Same for other options (not tested, but I did read code)
    {
        MY_ASSERT (mount (NULL, "/tmp/a", "tmpfs", 0, NULL) == 0);
        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT | MS_BIND | MS_RDONLY, NULL) == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == -1);
        MY_ASSERT (errno == EROFS);
        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT | MS_BIND, NULL) == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == 0);
        MY_ASSERT (umount ("/tmp/a") == 0);
    }

    // Removing "ro" from option string makes mount writable again (in case of MS_REMOUNT)
    // I. e. mount(2) works exactly as documented
    // This works even if option string is NULL, i. e. NULL works as default option string
    {
        typedef const char *c_string;
        c_string opts[3] = {NULL, "", "rw"};
        for (int i = 0; i != 3; ++i)
            {
                for (int j = 0; j != 3; ++j)
                    {
                        MY_ASSERT (mount (NULL, "/tmp/a", "tmpfs", 0, opts[i]) == 0);
                        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == 0);
                        MY_ASSERT (rmdir ("/tmp/a/c") == 0);
                        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT, "ro") == 0);
                        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == -1);
                        MY_ASSERT (errno == EROFS);
                        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT, opts[j]) == 0);
                        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == 0);
                        MY_ASSERT (umount ("/tmp/a") == 0);
                    }
            }
    }

    // Removing MS_RDONLY makes mount writable again (in case of MS_REMOUNT)
    // I. e. mount(2) works exactly as documented
    {
        MY_ASSERT (mount (NULL, "/tmp/a", "tmpfs", 0, NULL) == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == 0);
        MY_ASSERT (rmdir ("/tmp/a/c") == 0);
        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT | MS_RDONLY, NULL) == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == -1);
        MY_ASSERT (errno == EROFS);
        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT, NULL) == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == 0);
        MY_ASSERT (rmdir ("/tmp/a/c") == 0);
        MY_ASSERT (umount ("/tmp/a") == 0);
    }

    // Setting MS_RDONLY (without other flags) removes all other flags, such as MS_NODEV (in case of MS_REMOUNT | MS_BIND)
    {
        MY_ASSERT (mount (NULL, "/tmp/a", "tmpfs", 0, NULL) == 0);
        MY_ASSERT (mknod ("/tmp/a/mynull", S_IFCHR | 0666, makedev (1, 3)) == 0);

        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == 0);
        MY_ASSERT (rmdir ("/tmp/a/c") == 0);
        {
            int fd = open ("/tmp/a/mynull", O_WRONLY);
            MY_ASSERT (fd >= 0);
            MY_ASSERT (write (fd, "a", 1) == 1);
            MY_ASSERT (close (fd) == 0);
        }
        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT | MS_BIND | MS_NODEV, NULL) == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == 0);
        MY_ASSERT (rmdir ("/tmp/a/c") == 0);
        MY_ASSERT (open ("/tmp/a/mynull", O_WRONLY) == -1);
        MY_ASSERT (mount (NULL, "/tmp/a", NULL, MS_REMOUNT | MS_BIND | MS_RDONLY, NULL) == 0);
        MY_ASSERT (mkdir ("/tmp/a/c", 0777) == -1);
        {
            int fd = open ("/tmp/a/mynull", O_WRONLY);
            MY_ASSERT (fd >= 0);
            MY_ASSERT (write (fd, "a", 1) == 1);
            MY_ASSERT (close (fd) == 0);
        }
        MY_ASSERT (umount ("/tmp/a") == 0);
    }
    printf ("All tests passed\n");
    exit (0);
}

^ permalink raw reply

* [PATCH] vdso: Remove struct getcpu_cache
From: Thomas Weißschuh @ 2025-08-26  5:29 UTC (permalink / raw)
  To: Huacai Chen, WANG Xuerui, Heiko Carstens, Vasily Gorbik,
	Alexander Gordeev, Christian Borntraeger, Sven Schnelle,
	Andy Lutomirski, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	Dave Hansen, x86, H. Peter Anvin, Richard Weinberger,
	Anton Ivanov, Johannes Berg, Vincenzo Frascino, Shuah Khan
  Cc: loongarch, linux-kernel, linux-s390, linux-um, linux-api,
	linux-kselftest, Thomas Weißschuh

The cache parameter of getcpu() is not used by the kernel and no user ever
passes it in anyways.

Remove the struct and its header.

Signed-off-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
---
We could also completely remove the parameter, but I am not sure if
that is a good idea for syscalls and vDSO entrypoints.
---
 arch/loongarch/vdso/vgetcpu.c                   |  5 ++---
 arch/s390/kernel/vdso64/getcpu.c                |  3 +--
 arch/s390/kernel/vdso64/vdso.h                  |  4 +---
 arch/x86/entry/vdso/vgetcpu.c                   |  5 ++---
 arch/x86/include/asm/vdso/processor.h           |  4 +---
 arch/x86/um/vdso/um_vdso.c                      |  7 +++----
 include/linux/getcpu.h                          | 19 -------------------
 include/linux/syscalls.h                        |  3 +--
 kernel/sys.c                                    |  4 +---
 tools/testing/selftests/vDSO/vdso_test_getcpu.c |  4 +---
 10 files changed, 13 insertions(+), 45 deletions(-)

diff --git a/arch/loongarch/vdso/vgetcpu.c b/arch/loongarch/vdso/vgetcpu.c
index 5301cd9d0f839eb0fd7b73a1d36e80aaa75d5e76..aefba899873ed035d70766a95b0b6fea881e94df 100644
--- a/arch/loongarch/vdso/vgetcpu.c
+++ b/arch/loongarch/vdso/vgetcpu.c
@@ -4,7 +4,6 @@
  */
 
 #include <asm/vdso.h>
-#include <linux/getcpu.h>
 
 static __always_inline int read_cpu_id(void)
 {
@@ -20,8 +19,8 @@ static __always_inline int read_cpu_id(void)
 }
 
 extern
-int __vdso_getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *unused);
-int __vdso_getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *unused)
+int __vdso_getcpu(unsigned int *cpu, unsigned int *node, void *unused);
+int __vdso_getcpu(unsigned int *cpu, unsigned int *node, void *unused)
 {
 	int cpu_id;
 
diff --git a/arch/s390/kernel/vdso64/getcpu.c b/arch/s390/kernel/vdso64/getcpu.c
index 5c5d4a848b7669436e73df8e3b711e5b876eb3db..1e17665616c5fa766ca66c8de276b212528934bd 100644
--- a/arch/s390/kernel/vdso64/getcpu.c
+++ b/arch/s390/kernel/vdso64/getcpu.c
@@ -2,11 +2,10 @@
 /* Copyright IBM Corp. 2020 */
 
 #include <linux/compiler.h>
-#include <linux/getcpu.h>
 #include <asm/timex.h>
 #include "vdso.h"
 
-int __s390_vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *unused)
+int __s390_vdso_getcpu(unsigned *cpu, unsigned *node, void *unused)
 {
 	union tod_clock clk;
 
diff --git a/arch/s390/kernel/vdso64/vdso.h b/arch/s390/kernel/vdso64/vdso.h
index 9e5397e7b590a23c149ccc6043d0c0b0d5ea8457..cadd307d7a365cabf53f5c8d313be3718625533d 100644
--- a/arch/s390/kernel/vdso64/vdso.h
+++ b/arch/s390/kernel/vdso64/vdso.h
@@ -4,9 +4,7 @@
 
 #include <vdso/datapage.h>
 
-struct getcpu_cache;
-
-int __s390_vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *unused);
+int __s390_vdso_getcpu(unsigned *cpu, unsigned *node, void *unused);
 int __s390_vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz);
 int __s390_vdso_clock_gettime(clockid_t clock, struct __kernel_timespec *ts);
 int __s390_vdso_clock_getres(clockid_t clock, struct __kernel_timespec *ts);
diff --git a/arch/x86/entry/vdso/vgetcpu.c b/arch/x86/entry/vdso/vgetcpu.c
index e4640306b2e3c95d74d73037ab6b09294b8e1d6c..6381b472b7c52487bccf3cbf0664c3d7a0e59699 100644
--- a/arch/x86/entry/vdso/vgetcpu.c
+++ b/arch/x86/entry/vdso/vgetcpu.c
@@ -6,17 +6,16 @@
  */
 
 #include <linux/kernel.h>
-#include <linux/getcpu.h>
 #include <asm/segment.h>
 #include <vdso/processor.h>
 
 notrace long
-__vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *unused)
+__vdso_getcpu(unsigned *cpu, unsigned *node, void *unused)
 {
 	vdso_read_cpunode(cpu, node);
 
 	return 0;
 }
 
-long getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *tcache)
+long getcpu(unsigned *cpu, unsigned *node, void *tcache)
 	__attribute__((weak, alias("__vdso_getcpu")));
diff --git a/arch/x86/include/asm/vdso/processor.h b/arch/x86/include/asm/vdso/processor.h
index 7000aeb59aa287e2119c3d43ab3eaf82befb59c4..93e0e24e5cb47f7b0056c13f2a7f2304ed4a0595 100644
--- a/arch/x86/include/asm/vdso/processor.h
+++ b/arch/x86/include/asm/vdso/processor.h
@@ -18,9 +18,7 @@ static __always_inline void cpu_relax(void)
 	native_pause();
 }
 
-struct getcpu_cache;
-
-notrace long __vdso_getcpu(unsigned *cpu, unsigned *node, struct getcpu_cache *unused);
+notrace long __vdso_getcpu(unsigned *cpu, unsigned *node, void *unused);
 
 #endif /* __ASSEMBLER__ */
 
diff --git a/arch/x86/um/vdso/um_vdso.c b/arch/x86/um/vdso/um_vdso.c
index cbae2584124fd0ff0f9d240c33fefb8d213c84cd..9aa2c62cce6b7a07bbaf8441014d347162d1950d 100644
--- a/arch/x86/um/vdso/um_vdso.c
+++ b/arch/x86/um/vdso/um_vdso.c
@@ -10,14 +10,13 @@
 #define DISABLE_BRANCH_PROFILING
 
 #include <linux/time.h>
-#include <linux/getcpu.h>
 #include <asm/unistd.h>
 
 /* workaround for -Wmissing-prototypes warnings */
 int __vdso_clock_gettime(clockid_t clock, struct __kernel_old_timespec *ts);
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz);
 __kernel_old_time_t __vdso_time(__kernel_old_time_t *t);
-long __vdso_getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *unused);
+long __vdso_getcpu(unsigned int *cpu, unsigned int *node, void *unused);
 
 int __vdso_clock_gettime(clockid_t clock, struct __kernel_old_timespec *ts)
 {
@@ -60,7 +59,7 @@ __kernel_old_time_t __vdso_time(__kernel_old_time_t *t)
 __kernel_old_time_t time(__kernel_old_time_t *t) __attribute__((weak, alias("__vdso_time")));
 
 long
-__vdso_getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *unused)
+__vdso_getcpu(unsigned int *cpu, unsigned int *node, void *unused)
 {
 	/*
 	 * UML does not support SMP, we can cheat here. :)
@@ -74,5 +73,5 @@ __vdso_getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *unused
 	return 0;
 }
 
-long getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *tcache)
+long getcpu(unsigned int *cpu, unsigned int *node, void *tcache)
 	__attribute__((weak, alias("__vdso_getcpu")));
diff --git a/include/linux/getcpu.h b/include/linux/getcpu.h
deleted file mode 100644
index c304dcdb4eac2a9117080e6a14f4e3f28d07fd56..0000000000000000000000000000000000000000
--- a/include/linux/getcpu.h
+++ /dev/null
@@ -1,19 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _LINUX_GETCPU_H
-#define _LINUX_GETCPU_H 1
-
-/* Cache for getcpu() to speed it up. Results might be a short time
-   out of date, but will be faster.
-
-   User programs should not refer to the contents of this structure.
-   I repeat they should not refer to it. If they do they will break
-   in future kernels.
-
-   It is only a private cache for vgetcpu(). It will change in future kernels.
-   The user program must store this information per thread (__thread)
-   If you want 100% accurate information pass NULL instead. */
-struct getcpu_cache {
-	unsigned long blob[128 / sizeof(long)];
-};
-
-#endif
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 77f45e5d44139da36a5dacbf9db7b65261d13398..81822d203eac5d8d91488a18ff7fcdc65670df54 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -59,7 +59,6 @@ struct compat_stat;
 struct old_timeval32;
 struct robust_list_head;
 struct futex_waitv;
-struct getcpu_cache;
 struct old_linux_dirent;
 struct perf_event_attr;
 struct file_handle;
@@ -714,7 +713,7 @@ asmlinkage long sys_getrusage(int who, struct rusage __user *ru);
 asmlinkage long sys_umask(int mask);
 asmlinkage long sys_prctl(int option, unsigned long arg2, unsigned long arg3,
 			unsigned long arg4, unsigned long arg5);
-asmlinkage long sys_getcpu(unsigned __user *cpu, unsigned __user *node, struct getcpu_cache __user *cache);
+asmlinkage long sys_getcpu(unsigned __user *cpu, unsigned __user *node, void __user *cache);
 asmlinkage long sys_gettimeofday(struct __kernel_old_timeval __user *tv,
 				struct timezone __user *tz);
 asmlinkage long sys_settimeofday(struct __kernel_old_timeval __user *tv,
diff --git a/kernel/sys.c b/kernel/sys.c
index 1e28b40053ce206d7d0ed27e8a4fce8b616c3565..a830d78c1e1eb1d6cef31294feeb6a88dc0f83f3 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -31,7 +31,6 @@
 #include <linux/tty.h>
 #include <linux/signal.h>
 #include <linux/cn_proc.h>
-#include <linux/getcpu.h>
 #include <linux/task_io_accounting_ops.h>
 #include <linux/seccomp.h>
 #include <linux/cpu.h>
@@ -2813,8 +2812,7 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 	return error;
 }
 
-SYSCALL_DEFINE3(getcpu, unsigned __user *, cpup, unsigned __user *, nodep,
-		struct getcpu_cache __user *, unused)
+SYSCALL_DEFINE3(getcpu, unsigned __user *, cpup, unsigned __user *, nodep, void __user *, unused)
 {
 	int err = 0;
 	int cpu = raw_smp_processor_id();
diff --git a/tools/testing/selftests/vDSO/vdso_test_getcpu.c b/tools/testing/selftests/vDSO/vdso_test_getcpu.c
index cdeaed45fb26c61f6314c58fe1b71fa0be3c0108..994ce569dc37c6689b1a3c79156e3dfc8bf27f22 100644
--- a/tools/testing/selftests/vDSO/vdso_test_getcpu.c
+++ b/tools/testing/selftests/vDSO/vdso_test_getcpu.c
@@ -16,9 +16,7 @@
 #include "vdso_config.h"
 #include "vdso_call.h"
 
-struct getcpu_cache;
-typedef long (*getcpu_t)(unsigned int *, unsigned int *,
-			 struct getcpu_cache *);
+typedef long (*getcpu_t)(unsigned int *, unsigned int *, void *);
 
 int main(int argc, char **argv)
 {

---
base-commit: 8f5ae30d69d7543eee0d70083daf4de8fe15d585
change-id: 20250825-getcpu_cache-3abcd2e65437

Best regards,
-- 
Thomas Weißschuh <thomas.weissschuh@linutronix.de>


^ permalink raw reply related

* Re: [RFC PATCH v1 1/2] fs: Add O_DENY_WRITE
From: Jeff Xu @ 2025-08-25 23:06 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Mickaël Salaün, Jann Horn, Al Viro, Christian Brauner,
	Kees Cook, Paul Moore, Serge Hallyn, Andy Lutomirski,
	Arnd Bergmann, Christian Heimes, Dmitry Vyukov, Elliott Hughes,
	Fan Wu, Florian Weimer, Jonathan Corbet, Jordan R Abrahams,
	Lakshmi Ramasubramanian, Luca Boccassi, Matt Bobrowski,
	Miklos Szeredi, Mimi Zohar, Nicolas Bouchinet, Robert Waite,
	Roberto Sassu, Scott Shell, Steve Dower, Steve Grubb,
	kernel-hardening, linux-api, linux-fsdevel, linux-integrity,
	linux-kernel, linux-security-module, Jeff Xu
In-Reply-To: <F0E70FC7-8DCE-4057-8E91-9FA1AC5BC758@amacapital.net>

On Mon, Aug 25, 2025 at 2:56 PM Andy Lutomirski <luto@amacapital.net> wrote:
>
>
> > On Aug 25, 2025, at 11:10 AM, Jeff Xu <jeffxu@google.com> wrote:
> >
> > On Mon, Aug 25, 2025 at 9:43 AM Andy Lutomirski <luto@amacapital.net> wrote:
> >>> On Mon, Aug 25, 2025 at 2:31 AM Mickaël Salaün <mic@digikod.net> wrote:
> >>> On Sun, Aug 24, 2025 at 11:04:03AM -0700, Andy Lutomirski wrote:
> >>>> On Sun, Aug 24, 2025 at 4:03 AM Mickaël Salaün <mic@digikod.net> wrote:
> >>>>> On Fri, Aug 22, 2025 at 09:45:32PM +0200, Jann Horn wrote:
> >>>>>> On Fri, Aug 22, 2025 at 7:08 PM Mickaël Salaün <mic@digikod.net> wrote:
> >>>>>>> Add a new O_DENY_WRITE flag usable at open time and on opened file (e.g.
> >>>>>>> passed file descriptors).  This changes the state of the opened file by
> >>>>>>> making it read-only until it is closed.  The main use case is for script
> >>>>>>> interpreters to get the guarantee that script' content cannot be altered
> >>>>>>> while being read and interpreted.  This is useful for generic distros
> >>>>>>> that may not have a write-xor-execute policy.  See commit a5874fde3c08
> >>>>>>> ("exec: Add a new AT_EXECVE_CHECK flag to execveat(2)")
> >>>>>>> Both execve(2) and the IOCTL to enable fsverity can already set this
> >>>>>>> property on files with deny_write_access().  This new O_DENY_WRITE make
> >>>>>> The kernel actually tried to get rid of this behavior on execve() in
> >>>>>> commit 2a010c41285345da60cece35575b4e0af7e7bf44.; but sadly that had
> >>>>>> to be reverted in commit 3b832035387ff508fdcf0fba66701afc78f79e3d
> >>>>>> because it broke userspace assumptions.
> >>>>> Oh, good to know.
> >>>>>>> it widely available.  This is similar to what other OSs may provide
> >>>>>>> e.g., opening a file with only FILE_SHARE_READ on Windows.
> >>>>>> We used to have the analogous mmap() flag MAP_DENYWRITE, and that was
> >>>>>> removed for security reasons; as
> >>>>>> https://man7.org/linux/man-pages/man2/mmap.2.html says:
> >>>>>> |        MAP_DENYWRITE
> >>>>>> |               This flag is ignored.  (Long ago—Linux 2.0 and earlier—it
> >>>>>> |               signaled that attempts to write to the underlying file
> >>>>>> |               should fail with ETXTBSY.  But this was a source of denial-
> >>>>>> |               of-service attacks.)"
> >>>>>> It seems to me that the same issue applies to your patch - it would
> >>>>>> allow unprivileged processes to essentially lock files such that other
> >>>>>> processes can't write to them anymore. This might allow unprivileged
> >>>>>> users to prevent root from updating config files or stuff like that if
> >>>>>> they're updated in-place.
> >>>>> Yes, I agree, but since it is the case for executed files I though it
> >>>>> was worth starting a discussion on this topic.  This new flag could be
> >>>>> restricted to executable files, but we should avoid system-wide locks
> >>>>> like this.  I'm not sure how Windows handle these issues though.
> >>>>> Anyway, we should rely on the access control policy to control write and
> >>>>> execute access in a consistent way (e.g. write-xor-execute).  Thanks for
> >>>>> the references and the background!
> >>>> I'm confused.  I understand that there are many contexts in which one
> >>>> would want to prevent execution of unapproved content, which might
> >>>> include preventing a given process from modifying some code and then
> >>>> executing it.
> >>>> I don't understand what these deny-write features have to do with it.
> >>>> These features merely prevent someone from modifying code *that is
> >>>> currently in use*, which is not at all the same thing as preventing
> >>>> modifying code that might get executed -- one can often modify
> >>>> contents *before* executing those contents.
> >>> The order of checks would be:
> >>> 1. open script with O_DENY_WRITE
> >>> 2. check executability with AT_EXECVE_CHECK
> >>> 3. read the content and interpret it
> >> Hmm.  Common LSM configurations should be able to handle this without
> >> deny write, I think.  If you don't want a program to be able to make
> >> their own scripts, then don't allow AT_EXECVE_CHECK to succeed on a
> >> script that the program can write.
> > Yes, Common LSM could handle this, however, due to historic and app
> > backward compability reason, sometimes it is impossible to enforce
> > that kind of policy in practice, therefore as an alternative, a
> > machinism such as AT_EXECVE_CHECK is really useful.
>
> Can you clarify?  I’m suspicious that we’re taking past each other.
>
Apology, my response isn't clear.

> AT_EXECVE_CHECK solves a problem that there are actions that effectively “execute” a file that don’t execute literal CPU instructions for it. Sometimes open+read has the effect of interpreting the contents of the file as something code-like.
>
Yes. We have the same understanding of this.
As an example, shell script or java byte code, their file permission
can be rw, but no x bit set. The interpreter reads those and executes
them.

> But, as I see it, deny-write is almost entirely orthogonal. If you open a file with the intent of executing it (mmap-execute or interpret — makes little practical difference here), then the kernel can enforce some policy. If the file is writable by a process that ought not have permission to execute code in the context of the opening-for-execute process, then LSMs need deny-write to be enforced so that they can verify the contents at the time of opening.
>
> But let’s step back a moment: is there any actual sensible security policy that does this?  If I want to *enforce* that a process only execute approved code, then wouldn’t I do it be only allowing executing files that the process can’t write?
>
I imagine the following situation: an app has both "rw" access to the
file that holds the script code, the "w" is needed because the app
updates the script sometimes.

What is a reasonable sandbox solution for such an app? There are maybe
two options:

1> split the app as two processes: processA has "w" access to the
script for updating when needed. Process B has "r" access but no "w",
for executing. ProcessA and ProcessB will coordinate to avoid racing
on the script update.

2> The process will use AT_EXECVE_CHECK (added by interpreter) to
validate the file before opening , and the file content held by the
process should be immutable while being validated and executed later
by interpreter.

option 1 is the ideal, and IIUC, you promote this too. However, that
requires refactoring the app as two processes.
option 2 is an alternative. Because it doesn't require the change from
the apps, therefore a solution worth considering.

> The reason that the removal of deny-write wasn’t security — it was a functionality issue: a linker accidentally modified an in-use binary. If you have permission to use gcc or lld, etc to create binaries, and you have permission to run them, then you pretty much have permission to run whatever code you like.
>
> So, if there’s a real security use case for deny-write, I’m still not seeing it.
>
Although the current patch might not be ideal due to the potential DOS
attack, it does offer a starting point to address the needs. Let's
continue the discussion based on this patch and explore different
ideas.

Thanks and regards,
-Jeff

> >> Keep in mind that trying to lock this down too hard is pointless for
> >> users who are allowed to to ptrace-write to their own processes.  Or
> >> for users who can do JIT, or for users who can run a REPL, etc.
> > The ptrace-write and /proc/pid/mem writing are on my radar, at least
> > for ChomeOS and Android.
> > AT_EXECVE_CHECK is orthogonal to those IMO, I hope eventually all
> > those paths will be hardened.
> >
> > Thanks and regards,
> > -Jeff

^ permalink raw reply

* [PATCH 07/11] tools headers: Sync syscall tables with the kernel source
From: Namhyung Kim @ 2025-08-25 21:58 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: Ian Rogers, Jiri Olsa, Adrian Hunter, Peter Zijlstra, Ingo Molnar,
	LKML, linux-perf-users, Arnd Bergmann, linux-api
In-Reply-To: <20250825215904.2594216-1-namhyung@kernel.org>

To pick up the changes in this cset:

  be7efb2d20d67f33 fs: introduce file_getattr and file_setattr syscalls

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
    diff -u tools/scripts/syscall.tbl scripts/syscall.tbl
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
    diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
    diff -u tools/perf/arch/arm/entry/syscalls/syscall.tbl arch/arm/tools/syscall.tbl
    diff -u tools/perf/arch/sh/entry/syscalls/syscall.tbl arch/sh/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/sparc/entry/syscalls/syscall.tbl arch/sparc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/xtensa/entry/syscalls/syscall.tbl arch/xtensa/kernel/syscalls/syscall.tbl

Please see tools/include/uapi/README for further details.

Cc: Arnd Bergmann <arnd@arndb.de>
CC: linux-api@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
 tools/include/uapi/asm-generic/unistd.h             | 8 +++++++-
 tools/perf/arch/arm/entry/syscalls/syscall.tbl      | 2 ++
 tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl | 2 ++
 tools/perf/arch/powerpc/entry/syscalls/syscall.tbl  | 2 ++
 tools/perf/arch/s390/entry/syscalls/syscall.tbl     | 2 ++
 tools/perf/arch/sh/entry/syscalls/syscall.tbl       | 2 ++
 tools/perf/arch/sparc/entry/syscalls/syscall.tbl    | 2 ++
 tools/perf/arch/x86/entry/syscalls/syscall_32.tbl   | 2 ++
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl   | 2 ++
 tools/perf/arch/xtensa/entry/syscalls/syscall.tbl   | 2 ++
 tools/scripts/syscall.tbl                           | 2 ++
 11 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index 2892a45023af6d3e..04e0077fb4c97a4d 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -852,8 +852,14 @@ __SYSCALL(__NR_removexattrat, sys_removexattrat)
 #define __NR_open_tree_attr 467
 __SYSCALL(__NR_open_tree_attr, sys_open_tree_attr)
 
+/* fs/inode.c */
+#define __NR_file_getattr 468
+__SYSCALL(__NR_file_getattr, sys_file_getattr)
+#define __NR_file_setattr 469
+__SYSCALL(__NR_file_setattr, sys_file_setattr)
+
 #undef __NR_syscalls
-#define __NR_syscalls 468
+#define __NR_syscalls 470
 
 /*
  * 32 bit systems traditionally used different
diff --git a/tools/perf/arch/arm/entry/syscalls/syscall.tbl b/tools/perf/arch/arm/entry/syscalls/syscall.tbl
index 27c1d5ebcd91c8c2..b07e699aaa3c2840 100644
--- a/tools/perf/arch/arm/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/arm/entry/syscalls/syscall.tbl
@@ -482,3 +482,5 @@
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
 467	common	open_tree_attr			sys_open_tree_attr
+468	common	file_getattr			sys_file_getattr
+469	common	file_setattr			sys_file_setattr
diff --git a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
index 1e8c44c7b61492ea..7a7049c2c307885f 100644
--- a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
+++ b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
@@ -382,3 +382,5 @@
 465	n64	listxattrat			sys_listxattrat
 466	n64	removexattrat			sys_removexattrat
 467	n64	open_tree_attr			sys_open_tree_attr
+468	n64	file_getattr			sys_file_getattr
+469	n64	file_setattr			sys_file_setattr
diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
index 9a084bdb892694bc..b453e80dfc003796 100644
--- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
@@ -558,3 +558,5 @@
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
 467	common	open_tree_attr			sys_open_tree_attr
+468	common	file_getattr			sys_file_getattr
+469	common	file_setattr			sys_file_setattr
diff --git a/tools/perf/arch/s390/entry/syscalls/syscall.tbl b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
index a4569b96ef06c54c..8a6744d658db3986 100644
--- a/tools/perf/arch/s390/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
@@ -470,3 +470,5 @@
 465  common	listxattrat		sys_listxattrat			sys_listxattrat
 466  common	removexattrat		sys_removexattrat		sys_removexattrat
 467  common	open_tree_attr		sys_open_tree_attr		sys_open_tree_attr
+468  common	file_getattr		sys_file_getattr		sys_file_getattr
+469  common	file_setattr		sys_file_setattr		sys_file_setattr
diff --git a/tools/perf/arch/sh/entry/syscalls/syscall.tbl b/tools/perf/arch/sh/entry/syscalls/syscall.tbl
index 52a7652fcff6394b..5e9c9eff5539e241 100644
--- a/tools/perf/arch/sh/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/sh/entry/syscalls/syscall.tbl
@@ -471,3 +471,5 @@
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
 467	common	open_tree_attr			sys_open_tree_attr
+468	common	file_getattr			sys_file_getattr
+469	common	file_setattr			sys_file_setattr
diff --git a/tools/perf/arch/sparc/entry/syscalls/syscall.tbl b/tools/perf/arch/sparc/entry/syscalls/syscall.tbl
index 83e45eb6c095a36b..ebb7d06d1044fa9b 100644
--- a/tools/perf/arch/sparc/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/sparc/entry/syscalls/syscall.tbl
@@ -513,3 +513,5 @@
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
 467	common	open_tree_attr			sys_open_tree_attr
+468	common	file_getattr			sys_file_getattr
+469	common	file_setattr			sys_file_setattr
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_32.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_32.tbl
index ac007ea00979dc28..4877e16da69a50f2 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_32.tbl
@@ -473,3 +473,5 @@
 465	i386	listxattrat		sys_listxattrat
 466	i386	removexattrat		sys_removexattrat
 467	i386	open_tree_attr		sys_open_tree_attr
+468	i386	file_getattr		sys_file_getattr
+469	i386	file_setattr		sys_file_setattr
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index cfb5ca41e30de1a4..92cf0fe2291eb99b 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -391,6 +391,8 @@
 465	common	listxattrat		sys_listxattrat
 466	common	removexattrat		sys_removexattrat
 467	common	open_tree_attr		sys_open_tree_attr
+468	common	file_getattr		sys_file_getattr
+469	common	file_setattr		sys_file_setattr
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/tools/perf/arch/xtensa/entry/syscalls/syscall.tbl b/tools/perf/arch/xtensa/entry/syscalls/syscall.tbl
index f657a77314f8667f..374e4cb788d8a6d4 100644
--- a/tools/perf/arch/xtensa/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/xtensa/entry/syscalls/syscall.tbl
@@ -438,3 +438,5 @@
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
 467	common	open_tree_attr			sys_open_tree_attr
+468	common	file_getattr			sys_file_getattr
+469	common	file_setattr			sys_file_setattr
diff --git a/tools/scripts/syscall.tbl b/tools/scripts/syscall.tbl
index 580b4e246aecd5f0..d1ae5e92c615b58e 100644
--- a/tools/scripts/syscall.tbl
+++ b/tools/scripts/syscall.tbl
@@ -408,3 +408,5 @@
 465	common	listxattrat			sys_listxattrat
 466	common	removexattrat			sys_removexattrat
 467	common	open_tree_attr			sys_open_tree_attr
+468	common	file_getattr			sys_file_getattr
+469	common	file_setattr			sys_file_setattr
-- 
2.51.0.261.g7ce5a0a67e-goog


^ permalink raw reply related

* Re: [RFC PATCH v1 1/2] fs: Add O_DENY_WRITE
From: Andy Lutomirski @ 2025-08-25 21:56 UTC (permalink / raw)
  To: Jeff Xu
  Cc: Mickaël Salaün, Jann Horn, Al Viro, Christian Brauner,
	Kees Cook, Paul Moore, Serge Hallyn, Andy Lutomirski,
	Arnd Bergmann, Christian Heimes, Dmitry Vyukov, Elliott Hughes,
	Fan Wu, Florian Weimer, Jonathan Corbet, Jordan R Abrahams,
	Lakshmi Ramasubramanian, Luca Boccassi, Matt Bobrowski,
	Miklos Szeredi, Mimi Zohar, Nicolas Bouchinet, Robert Waite,
	Roberto Sassu, Scott Shell, Steve Dower, Steve Grubb,
	kernel-hardening, linux-api, linux-fsdevel, linux-integrity,
	linux-kernel, linux-security-module, Jeff Xu


> On Aug 25, 2025, at 11:10 AM, Jeff Xu <jeffxu@google.com> wrote:
> 
> On Mon, Aug 25, 2025 at 9:43 AM Andy Lutomirski <luto@amacapital.net> wrote:
>>> On Mon, Aug 25, 2025 at 2:31 AM Mickaël Salaün <mic@digikod.net> wrote:
>>> On Sun, Aug 24, 2025 at 11:04:03AM -0700, Andy Lutomirski wrote:
>>>> On Sun, Aug 24, 2025 at 4:03 AM Mickaël Salaün <mic@digikod.net> wrote:
>>>>> On Fri, Aug 22, 2025 at 09:45:32PM +0200, Jann Horn wrote:
>>>>>> On Fri, Aug 22, 2025 at 7:08 PM Mickaël Salaün <mic@digikod.net> wrote:
>>>>>>> Add a new O_DENY_WRITE flag usable at open time and on opened file (e.g.
>>>>>>> passed file descriptors).  This changes the state of the opened file by
>>>>>>> making it read-only until it is closed.  The main use case is for script
>>>>>>> interpreters to get the guarantee that script' content cannot be altered
>>>>>>> while being read and interpreted.  This is useful for generic distros
>>>>>>> that may not have a write-xor-execute policy.  See commit a5874fde3c08
>>>>>>> ("exec: Add a new AT_EXECVE_CHECK flag to execveat(2)")
>>>>>>> Both execve(2) and the IOCTL to enable fsverity can already set this
>>>>>>> property on files with deny_write_access().  This new O_DENY_WRITE make
>>>>>> The kernel actually tried to get rid of this behavior on execve() in
>>>>>> commit 2a010c41285345da60cece35575b4e0af7e7bf44.; but sadly that had
>>>>>> to be reverted in commit 3b832035387ff508fdcf0fba66701afc78f79e3d
>>>>>> because it broke userspace assumptions.
>>>>> Oh, good to know.
>>>>>>> it widely available.  This is similar to what other OSs may provide
>>>>>>> e.g., opening a file with only FILE_SHARE_READ on Windows.
>>>>>> We used to have the analogous mmap() flag MAP_DENYWRITE, and that was
>>>>>> removed for security reasons; as
>>>>>> https://man7.org/linux/man-pages/man2/mmap.2.html says:
>>>>>> |        MAP_DENYWRITE
>>>>>> |               This flag is ignored.  (Long ago—Linux 2.0 and earlier—it
>>>>>> |               signaled that attempts to write to the underlying file
>>>>>> |               should fail with ETXTBSY.  But this was a source of denial-
>>>>>> |               of-service attacks.)"
>>>>>> It seems to me that the same issue applies to your patch - it would
>>>>>> allow unprivileged processes to essentially lock files such that other
>>>>>> processes can't write to them anymore. This might allow unprivileged
>>>>>> users to prevent root from updating config files or stuff like that if
>>>>>> they're updated in-place.
>>>>> Yes, I agree, but since it is the case for executed files I though it
>>>>> was worth starting a discussion on this topic.  This new flag could be
>>>>> restricted to executable files, but we should avoid system-wide locks
>>>>> like this.  I'm not sure how Windows handle these issues though.
>>>>> Anyway, we should rely on the access control policy to control write and
>>>>> execute access in a consistent way (e.g. write-xor-execute).  Thanks for
>>>>> the references and the background!
>>>> I'm confused.  I understand that there are many contexts in which one
>>>> would want to prevent execution of unapproved content, which might
>>>> include preventing a given process from modifying some code and then
>>>> executing it.
>>>> I don't understand what these deny-write features have to do with it.
>>>> These features merely prevent someone from modifying code *that is
>>>> currently in use*, which is not at all the same thing as preventing
>>>> modifying code that might get executed -- one can often modify
>>>> contents *before* executing those contents.
>>> The order of checks would be:
>>> 1. open script with O_DENY_WRITE
>>> 2. check executability with AT_EXECVE_CHECK
>>> 3. read the content and interpret it
>> Hmm.  Common LSM configurations should be able to handle this without
>> deny write, I think.  If you don't want a program to be able to make
>> their own scripts, then don't allow AT_EXECVE_CHECK to succeed on a
>> script that the program can write.
> Yes, Common LSM could handle this, however, due to historic and app
> backward compability reason, sometimes it is impossible to enforce
> that kind of policy in practice, therefore as an alternative, a
> machinism such as AT_EXECVE_CHECK is really useful.

Can you clarify?  I’m suspicious that we’re taking past each other.

AT_EXECVE_CHECK solves a problem that there are actions that effectively “execute” a file that don’t execute literal CPU instructions for it. Sometimes open+read has the effect of interpreting the contents of the file as something code-like.

But, as I see it, deny-write is almost entirely orthogonal. If you open a file with the intent of executing it (mmap-execute or interpret — makes little practical difference here), then the kernel can enforce some policy. If the file is writable by a process that ought not have permission to execute code in the context of the opening-for-execute process, then LSMs need deny-write to be enforced so that they can verify the contents at the time of opening.

But let’s step back a moment: is there any actual sensible security policy that does this?  If I want to *enforce* that a process only execute approved code, then wouldn’t I do it be only allowing executing files that the process can’t write?

The reason that the removal of deny-write wasn’t security — it was a functionality issue: a linker accidentally modified an in-use binary. If you have permission to use gcc or lld, etc to create binaries, and you have permission to run them, then you pretty much have permission to run whatever code you like.

So, if there’s a real security use case for deny-write, I’m still not seeing it.

>> Keep in mind that trying to lock this down too hard is pointless for
>> users who are allowed to to ptrace-write to their own processes.  Or
>> for users who can do JIT, or for users who can run a REPL, etc.
> The ptrace-write and /proc/pid/mem writing are on my radar, at least
> for ChomeOS and Android.
> AT_EXECVE_CHECK is orthogonal to those IMO, I hope eventually all
> those paths will be hardened.
> 
> Thanks and regards,
> -Jeff

^ permalink raw reply

* Re: [PATCH] uapi/fcntl: conditionally define AT_RENAME* macros
From: Randy Dunlap @ 2025-08-25 19:29 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, Amir Goldstein, Jeff Layton, Chuck Lever,
	Alexander Aring, Josef Bacik, Aleksa Sarai, Jan Kara,
	Christian Brauner, linux-fsdevel, linux-api
In-Reply-To: <aKyvO2bvPCZEzuBd@casper.infradead.org>



On 8/25/25 11:45 AM, Matthew Wilcox wrote:
> On Mon, Aug 25, 2025 at 10:52:31AM -0700, Randy Dunlap wrote:
>> $ grep -r AT_RENAME_NOREPLACE /usr/include
>> /usr/include/stdio.h:# define AT_RENAME_NOREPLACE RENAME_NOREPLACE
>> /usr/include/linux/fcntl.h:#define AT_RENAME_NOREPLACE	0x0001
>>
>> I have libc 2.42-1.1 (openSUSE).
> 
> I wonder if we can fix it by changing include/uapi/linux/fcntl.h
> from being an explicit 0x0001 to RENAME_NOREPLACE?  There's probably
> a horrendous include problem between linux/fcntl.h and linux/fs.h
> though?

I'm working on something like that now (suggested by Amir),
but it might depend on whether stdio.h has been #included first.

-- 
~Randy


^ permalink raw reply

* Re: [PATCH v2 1/1] man2/mount.2: expand and clarify docs for MS_REMOUNT | MS_BIND
From: Alejandro Colomar @ 2025-08-25 19:13 UTC (permalink / raw)
  To: Askar Safin
  Cc: Aleksa Sarai, Alexander Viro, linux-api, linux-fsdevel,
	David Howells, Christian Brauner, linux-man
In-Reply-To: <20250825154839.2422856-2-safinaskar@zohomail.com>

[-- Attachment #1: Type: text/plain, Size: 3043 bytes --]

Hi Askar,

On Mon, Aug 25, 2025 at 03:48:39PM +0000, Askar Safin wrote:
> My edit is based on experiments and reading Linux code
> 
> Signed-off-by: Askar Safin <safinaskar@zohomail.com>
> ---
>  man/man2/mount.2 | 32 +++++++++++++++++++++++++++++---
>  1 file changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/man/man2/mount.2 b/man/man2/mount.2
> index 5d83231f9..47fc2d21f 100644
> --- a/man/man2/mount.2
> +++ b/man/man2/mount.2
> @@ -405,7 +405,30 @@ flag can be used with
>  to modify only the per-mount-point flags.
>  .\" See https://lwn.net/Articles/281157/
>  This is particularly useful for setting or clearing the "read-only"
> -flag on a mount without changing the underlying filesystem.
> +flag on a mount without changing the underlying filesystem parameters.
> +The
> +.I data
> +argument is ignored if
> +.B MS_REMOUNT
> +and
> +.B MS_BIND
> +are specified.
> +Note that the mountpoint will

I would remove "Note that".  Starting with "The" already is equally
meaningful, and two less meaningless words for the reader.

Should we say "mount point" instead?  Otherwise, it's inconsistent with
"mount-point flags" below.  Also, see:

alx@debian:~/src/linux/man-pages/man-pages/master/man$ grep -rn 'mount point' | wc -l
98
alx@debian:~/src/linux/man-pages/man-pages/master/man$ grep -rn 'mountpoint' | wc -l
3


> +have its existing per-mount-point flags
> +cleared and replaced with those in
> +.I mountflags
> +when
> +.B MS_REMOUNT
> +and
> +.B MS_BIND
> +are specified.

Maybe reverse the sentence to start with this?

	When
	.B MS_REMOUNT
	and
	.B MS_BIND
	are specified,
	the ...
	will have its existing ...
	cleared and replaced with those in
	.IR mountflags .

Having conditionals at the end makes my brain have to reparse the
previous text to understand it.  If I read the conditional early on,
my branch predictor kind of knows what to expect.  :)

> +This means that if

I would move the 'if' to the next line.

> +you wish to preserve
> +any existing per-mount-point flags,
> +you need to include them in
> +.IR mountflags ,
> +along with the per-mount-point flags you wish to set
> +(or with the flags you wish to clear missing).
>  Specifying
>  .I mountflags
>  as:
> @@ -416,8 +439,11 @@ MS_REMOUNT | MS_BIND | MS_RDONLY
>  .EE
>  .in
>  .P
> -will make access through this mountpoint read-only, without affecting
> -other mounts.

Hmmm, I see this uses 'mountpoint' already.

I guess we should have a clear direction of what term we want to use.
Since the existing text already uses this, I think we should change it
in a separate commit.  Do you want to send a second patch to use
'mount point'?

> +will make access through this mountpoint read-only
> +(clearing all other per-mount-point flags),
> +without affecting
> +other mounts
> +of this filesystem.


Have a lovely night!
Alex

>  .\"
>  .SS Creating a bind mount
>  If
> -- 
> 2.47.2
> 

-- 
<https://www.alejandro-colomar.es/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH] uapi/fcntl: conditionally define AT_RENAME* macros
From: Matthew Wilcox @ 2025-08-25 18:45 UTC (permalink / raw)
  To: Randy Dunlap
  Cc: linux-kernel, Amir Goldstein, Jeff Layton, Chuck Lever,
	Alexander Aring, Josef Bacik, Aleksa Sarai, Jan Kara,
	Christian Brauner, linux-fsdevel, linux-api
In-Reply-To: <0c755ddc-9ed1-462e-a9f1-16762ebe0a19@infradead.org>

On Mon, Aug 25, 2025 at 10:52:31AM -0700, Randy Dunlap wrote:
> $ grep -r AT_RENAME_NOREPLACE /usr/include
> /usr/include/stdio.h:# define AT_RENAME_NOREPLACE RENAME_NOREPLACE
> /usr/include/linux/fcntl.h:#define AT_RENAME_NOREPLACE	0x0001
> 
> I have libc 2.42-1.1 (openSUSE).

I wonder if we can fix it by changing include/uapi/linux/fcntl.h
from being an explicit 0x0001 to RENAME_NOREPLACE?  There's probably
a horrendous include problem between linux/fcntl.h and linux/fs.h
though?

^ permalink raw reply

* [RFC PATCH v4 7/7] tools/kapi: Add kernel API specification extraction tool
From: Sasha Levin @ 2025-08-25 18:14 UTC (permalink / raw)
  To: linux-api, linux-doc, linux-kernel, tools; +Cc: Sasha Levin
In-Reply-To: <20250825181434.3340805-1-sashal@kernel.org>

The kapi tool extracts and displays kernel API specifications.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 Documentation/admin-guide/kernel-api-spec.rst | 198 +++-
 tools/kapi/.gitignore                         |   4 +
 tools/kapi/Cargo.toml                         |  19 +
 tools/kapi/src/extractor/debugfs.rs           | 442 +++++++++
 tools/kapi/src/extractor/kerneldoc_parser.rs  | 694 ++++++++++++++
 tools/kapi/src/extractor/mod.rs               | 461 +++++++++
 tools/kapi/src/extractor/source_parser.rs     | 213 +++++
 .../src/extractor/vmlinux/binary_utils.rs     | 180 ++++
 .../src/extractor/vmlinux/magic_finder.rs     | 102 ++
 tools/kapi/src/extractor/vmlinux/mod.rs       | 869 +++++++++++++++++
 tools/kapi/src/formatter/json.rs              | 468 +++++++++
 tools/kapi/src/formatter/mod.rs               | 145 +++
 tools/kapi/src/formatter/plain.rs             | 558 +++++++++++
 tools/kapi/src/formatter/rst.rs               | 621 ++++++++++++
 tools/kapi/src/formatter/shall.rs             | 891 ++++++++++++++++++
 tools/kapi/src/main.rs                        | 116 +++
 16 files changed, 5978 insertions(+), 3 deletions(-)
 create mode 100644 tools/kapi/.gitignore
 create mode 100644 tools/kapi/Cargo.toml
 create mode 100644 tools/kapi/src/extractor/debugfs.rs
 create mode 100644 tools/kapi/src/extractor/kerneldoc_parser.rs
 create mode 100644 tools/kapi/src/extractor/mod.rs
 create mode 100644 tools/kapi/src/extractor/source_parser.rs
 create mode 100644 tools/kapi/src/extractor/vmlinux/binary_utils.rs
 create mode 100644 tools/kapi/src/extractor/vmlinux/magic_finder.rs
 create mode 100644 tools/kapi/src/extractor/vmlinux/mod.rs
 create mode 100644 tools/kapi/src/formatter/json.rs
 create mode 100644 tools/kapi/src/formatter/mod.rs
 create mode 100644 tools/kapi/src/formatter/plain.rs
 create mode 100644 tools/kapi/src/formatter/rst.rs
 create mode 100644 tools/kapi/src/formatter/shall.rs
 create mode 100644 tools/kapi/src/main.rs

diff --git a/Documentation/admin-guide/kernel-api-spec.rst b/Documentation/admin-guide/kernel-api-spec.rst
index 3a63f6711e27..9b452753111a 100644
--- a/Documentation/admin-guide/kernel-api-spec.rst
+++ b/Documentation/admin-guide/kernel-api-spec.rst
@@ -31,7 +31,9 @@ The framework aims to:
    common programming errors during development and testing.
 
 3. **Support Tooling**: Export API specifications in machine-readable formats for
-   use by static analyzers, documentation generators, and development tools.
+   use by static analyzers, documentation generators, and development tools. The
+   ``kapi`` tool (see `The kapi Tool`_) provides comprehensive extraction and
+   formatting capabilities.
 
 4. **Enhance Debugging**: Provide detailed API information at runtime through debugfs
    for debugging and introspection.
@@ -71,6 +73,13 @@ The framework consists of several key components:
    - Type-safe parameter specifications
    - Context and constraint definitions
 
+5. **kapi Tool** (``tools/kapi/``)
+
+   - Userspace utility for extracting specifications
+   - Multiple input sources (source, binary, debugfs)
+   - Multiple output formats (plain, JSON, RST)
+   - Testing and validation utilities
+
 Data Model
 ----------
 
@@ -344,8 +353,177 @@ Documentation Generation
 ------------------------
 
 The framework exports specifications via debugfs that can be used
-to generate documentation. Tools for automatic documentation generation
-from specifications are planned for future development.
+to generate documentation. The ``kapi`` tool provides comprehensive
+extraction and formatting capabilities for kernel API specifications.
+
+The kapi Tool
+=============
+
+Overview
+--------
+
+The ``kapi`` tool is a userspace utility that extracts and displays kernel API
+specifications from multiple sources. It provides a unified interface to access
+API documentation whether from compiled kernels, source code, or runtime systems.
+
+Installation
+------------
+
+Build the tool from the kernel source tree::
+
+    $ cd tools/kapi
+    $ cargo build --release
+
+    # Optional: Install system-wide
+    $ cargo install --path .
+
+The tool requires Rust and Cargo to build. The binary will be available at
+``tools/kapi/target/release/kapi``.
+
+Command-Line Usage
+------------------
+
+Basic syntax::
+
+    kapi [OPTIONS] [API_NAME]
+
+Options:
+
+- ``--vmlinux <PATH>``: Extract from compiled kernel binary
+- ``--source <PATH>``: Extract from kernel source code
+- ``--debugfs <PATH>``: Extract from debugfs (default: /sys/kernel/debug)
+- ``-f, --format <FORMAT>``: Output format (plain, json, rst)
+- ``-h, --help``: Display help information
+- ``-V, --version``: Display version information
+
+Input Modes
+-----------
+
+**1. Source Code Mode**
+
+Extract specifications directly from kernel source::
+
+    # Scan entire kernel source tree
+    $ kapi --source /path/to/linux
+
+    # Extract from specific file
+    $ kapi --source kernel/sched/core.c
+
+    # Get details for specific API
+    $ kapi --source /path/to/linux sys_sched_yield
+
+**2. Vmlinux Mode**
+
+Extract from compiled kernel with debug symbols::
+
+    # List all APIs in vmlinux
+    $ kapi --vmlinux /boot/vmlinux-5.15.0
+
+    # Get specific syscall details
+    $ kapi --vmlinux ./vmlinux sys_read
+
+**3. Debugfs Mode**
+
+Extract from running kernel via debugfs::
+
+    # Use default debugfs path
+    $ kapi
+
+    # Use custom debugfs mount
+    $ kapi --debugfs /mnt/debugfs
+
+    # Get specific API from running kernel
+    $ kapi sys_write
+
+Output Formats
+--------------
+
+**Plain Text Format** (default)::
+
+    $ kapi sys_read
+
+    Detailed information for sys_read:
+    ==================================
+    Description: Read from a file descriptor
+
+    Detailed Description:
+    Reads up to count bytes from file descriptor fd into the buffer starting at buf.
+
+    Execution Context:
+      - KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE
+
+    Parameters (3):
+
+    Available since: 1.0
+
+**JSON Format**::
+
+    $ kapi --format json sys_read
+    {
+      "api_details": {
+        "name": "sys_read",
+        "description": "Read from a file descriptor",
+        "long_description": "Reads up to count bytes...",
+        "context_flags": ["KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE"],
+        "since_version": "1.0"
+      }
+    }
+
+**ReStructuredText Format**::
+
+    $ kapi --format rst sys_read
+
+    sys_read
+    ========
+
+    **Read from a file descriptor**
+
+    Reads up to count bytes from file descriptor fd into the buffer...
+
+Usage Examples
+--------------
+
+**Generate complete API documentation**::
+
+    # Export all kernel APIs to JSON
+    $ kapi --source /path/to/linux --format json > kernel-apis.json
+
+    # Generate RST documentation for all syscalls
+    $ kapi --vmlinux ./vmlinux --format rst > syscalls.rst
+
+    # List APIs from specific subsystem
+    $ kapi --source drivers/gpu/drm/
+
+**Integration with other tools**::
+
+    # Find all APIs that can sleep
+    $ kapi --format json | jq '.apis[] | select(.context_flags[] | contains("SLEEPABLE"))'
+
+    # Generate markdown documentation
+    $ kapi --format rst sys_mmap | pandoc -f rst -t markdown
+
+**Debugging and analysis**::
+
+    # Compare API between kernel versions
+    $ diff <(kapi --vmlinux vmlinux-5.10) <(kapi --vmlinux vmlinux-5.15)
+
+    # Check if specific API exists
+    $ kapi --source . my_custom_api || echo "API not found"
+
+Implementation Details
+----------------------
+
+The tool extracts API specifications from three sources:
+
+1. **Source Code**: Parses KAPI specification macros using regular expressions
+2. **Vmlinux**: Reads the ``.kapi_specs`` ELF section from compiled kernels
+3. **Debugfs**: Reads from ``/sys/kernel/debug/kapi/`` filesystem interface
+
+The tool supports all KAPI specification types:
+
+- System calls (``DEFINE_KERNEL_API_SPEC``)
+- IOCTLs (``DEFINE_IOCTL_API_SPEC``)
+- Kernel functions (``KAPI_DEFINE_SPEC``)
 
 IDE Integration
 ---------------
@@ -357,6 +535,11 @@ Modern IDEs can use the JSON export for:
 - Context validation
 - Error code documentation
 
+Example IDE integration::
+
+    # Generate IDE completion data
+    $ kapi --format json > .vscode/kernel-apis.json
+
 Testing Framework
 -----------------
 
@@ -367,6 +550,15 @@ The framework includes test helpers::
     kapi_test_api("kmalloc", test_cases);
     #endif
 
+The kapi tool can verify specifications against implementations::
+
+    # Run consistency tests
+    $ cd tools/kapi
+    $ ./test_consistency.sh
+
+    # Compare source vs binary specifications
+    $ ./compare_all_syscalls.sh
+
 Best Practices
 ==============
 
diff --git a/tools/kapi/.gitignore b/tools/kapi/.gitignore
new file mode 100644
index 000000000000..1390bfc12686
--- /dev/null
+++ b/tools/kapi/.gitignore
@@ -0,0 +1,4 @@
+# Rust build artifacts
+/target/
+**/*.rs.bk
+
diff --git a/tools/kapi/Cargo.toml b/tools/kapi/Cargo.toml
new file mode 100644
index 000000000000..4e6bcb10d132
--- /dev/null
+++ b/tools/kapi/Cargo.toml
@@ -0,0 +1,19 @@
+[package]
+name = "kapi"
+version = "0.1.0"
+edition = "2024"
+authors = ["Sasha Levin <sashal@kernel.org>"]
+description = "Tool for extracting and displaying kernel API specifications"
+license = "GPL-2.0"
+
+[dependencies]
+goblin = "0.10"
+clap = { version = "4.4", features = ["derive"] }
+anyhow = "1.0"
+serde = { version = "1.0", features = ["derive"] }
+serde_json = "1.0"
+regex = "1.10"
+walkdir = "2.4"
+
+[dev-dependencies]
+tempfile = "3.8"
diff --git a/tools/kapi/src/extractor/debugfs.rs b/tools/kapi/src/extractor/debugfs.rs
new file mode 100644
index 000000000000..698c51e50438
--- /dev/null
+++ b/tools/kapi/src/extractor/debugfs.rs
@@ -0,0 +1,442 @@
+use crate::formatter::OutputFormatter;
+use anyhow::{Context, Result, bail};
+use serde::Deserialize;
+use std::fs;
+use std::io::Write;
+use std::path::PathBuf;
+
+use super::{ApiExtractor, ApiSpec, CapabilitySpec, display_api_spec};
+
+#[derive(Deserialize)]
+struct KernelApiJson {
+    name: String,
+    api_type: Option<String>,
+    version: Option<u32>,
+    description: Option<String>,
+    long_description: Option<String>,
+    context_flags: Option<u32>,
+    since_version: Option<String>,
+    examples: Option<String>,
+    notes: Option<String>,
+    capabilities: Option<Vec<KernelCapabilityJson>>,
+}
+
+#[derive(Deserialize)]
+struct KernelCapabilityJson {
+    capability: i32,
+    name: String,
+    action: String,
+    allows: String,
+    without_cap: String,
+    check_condition: Option<String>,
+    priority: Option<u8>,
+    alternatives: Option<Vec<i32>>,
+}
+
+/// Extractor for kernel API specifications from debugfs
+pub struct DebugfsExtractor {
+    debugfs_path: PathBuf,
+}
+
+impl DebugfsExtractor {
+    /// Create a new debugfs extractor with the specified debugfs path
+    pub fn new(debugfs_path: Option<String>) -> Result<Self> {
+        let path = match debugfs_path {
+            Some(p) => PathBuf::from(p),
+            None => PathBuf::from("/sys/kernel/debug"),
+        };
+
+        // Check if the debugfs path exists
+        if !path.exists() {
+            bail!("Debugfs path does not exist: {}", path.display());
+        }
+
+        // Check if kapi directory exists
+        let kapi_path = path.join("kapi");
+        if !kapi_path.exists() {
+            bail!(
+                "Kernel API debugfs interface not found at: {}",
+                kapi_path.display()
+            );
+        }
+
+        Ok(Self { debugfs_path: path })
+    }
+
+    /// Parse the list file to get all available API names
+    fn parse_list_file(&self) -> Result<Vec<String>> {
+        let list_path = self.debugfs_path.join("kapi/list");
+        let content = fs::read_to_string(&list_path)
+            .with_context(|| format!("Failed to read {}", list_path.display()))?;
+
+        let mut apis = Vec::new();
+        let mut in_list = false;
+
+        for line in content.lines() {
+            if line.contains("===") {
+                in_list = true;
+                continue;
+            }
+
+            if in_list && line.starts_with("Total:") {
+                break;
+            }
+
+            if in_list && !line.trim().is_empty() {
+                // Extract API name from lines like "sys_read - Read from a file descriptor"
+                if let Some(name) = line.split(" - ").next() {
+                    apis.push(name.trim().to_string());
+                }
+            }
+        }
+
+        Ok(apis)
+    }
+
+    /// Try to parse JSON content, convert context flags from u32 to string representations
+    fn parse_context_flags(flags: u32) -> Vec<String> {
+        let mut result = Vec::new();
+
+        // These values should match KAPI_CTX_* flags from kernel
+        if flags & (1 << 0) != 0 {
+            result.push("PROCESS".to_string());
+        }
+        if flags & (1 << 1) != 0 {
+            result.push("SOFTIRQ".to_string());
+        }
+        if flags & (1 << 2) != 0 {
+            result.push("HARDIRQ".to_string());
+        }
+        if flags & (1 << 3) != 0 {
+            result.push("NMI".to_string());
+        }
+        if flags & (1 << 4) != 0 {
+            result.push("ATOMIC".to_string());
+        }
+        if flags & (1 << 5) != 0 {
+            result.push("SLEEPABLE".to_string());
+        }
+        if flags & (1 << 6) != 0 {
+            result.push("PREEMPT_DISABLED".to_string());
+        }
+        if flags & (1 << 7) != 0 {
+            result.push("IRQ_DISABLED".to_string());
+        }
+
+        result
+    }
+
+    /// Convert capability action from kernel representation
+    fn parse_capability_action(action: &str) -> String {
+        match action {
+            "bypass_check" => "Bypasses check".to_string(),
+            "increase_limit" => "Increases limit".to_string(),
+            "override_restriction" => "Overrides restriction".to_string(),
+            "grant_permission" => "Grants permission".to_string(),
+            "modify_behavior" => "Modifies behavior".to_string(),
+            "access_resource" => "Allows resource access".to_string(),
+            "perform_operation" => "Allows operation".to_string(),
+            _ => action.to_string(),
+        }
+    }
+
+    /// Try to parse as JSON first
+    fn try_parse_json(&self, content: &str) -> Option<ApiSpec> {
+        let json_data: KernelApiJson = serde_json::from_str(content).ok()?;
+
+        let mut spec = ApiSpec {
+            name: json_data.name,
+            api_type: json_data.api_type.unwrap_or_else(|| "unknown".to_string()),
+            description: json_data.description,
+            long_description: json_data.long_description,
+            version: json_data.version.map(|v| v.to_string()),
+            context_flags: json_data
+                .context_flags
+                .map_or_else(Vec::new, Self::parse_context_flags),
+            param_count: None,
+            error_count: None,
+            examples: json_data.examples,
+            notes: json_data.notes,
+            since_version: json_data.since_version,
+            subsystem: None,   // Not in current JSON format
+            sysfs_path: None,  // Not in current JSON format
+            permissions: None, // Not in current JSON format
+            socket_state: None,
+            protocol_behaviors: vec![],
+            addr_families: vec![],
+            buffer_spec: None,
+            async_spec: None,
+            net_data_transfer: None,
+            capabilities: vec![],
+            parameters: vec![],
+            return_spec: None,
+            errors: vec![],
+            signals: vec![],
+            signal_masks: vec![],
+            side_effects: vec![],
+            state_transitions: vec![],
+            constraints: vec![],
+            locks: vec![],
+            struct_specs: vec![],
+        };
+
+        // Convert capabilities
+        if let Some(caps) = json_data.capabilities {
+            for cap in caps {
+                spec.capabilities.push(CapabilitySpec {
+                    capability: cap.capability,
+                    name: cap.name,
+                    action: Self::parse_capability_action(&cap.action),
+                    allows: cap.allows,
+                    without_cap: cap.without_cap,
+                    check_condition: cap.check_condition,
+                    priority: cap.priority,
+                    alternatives: cap.alternatives.unwrap_or_default(),
+                });
+            }
+        }
+
+        Some(spec)
+    }
+
+    /// Parse a single API specification file
+    fn parse_spec_file(&self, api_name: &str) -> Result<ApiSpec> {
+        let spec_path = self.debugfs_path.join(format!("kapi/specs/{}", api_name));
+        let content = fs::read_to_string(&spec_path)
+            .with_context(|| format!("Failed to read {}", spec_path.display()))?;
+
+        // Try JSON parsing first
+        if let Some(spec) = self.try_parse_json(&content) {
+            return Ok(spec);
+        }
+
+        // Fall back to plain text parsing
+        let mut spec = ApiSpec {
+            name: api_name.to_string(),
+            api_type: "unknown".to_string(),
+            description: None,
+            long_description: None,
+            version: None,
+            context_flags: Vec::new(),
+            param_count: None,
+            error_count: None,
+            examples: None,
+            notes: None,
+            since_version: None,
+            subsystem: None,
+            sysfs_path: None,
+            permissions: None,
+            socket_state: None,
+            protocol_behaviors: vec![],
+            addr_families: vec![],
+            buffer_spec: None,
+            async_spec: None,
+            net_data_transfer: None,
+            capabilities: vec![],
+            parameters: vec![],
+            return_spec: None,
+            errors: vec![],
+            signals: vec![],
+            signal_masks: vec![],
+            side_effects: vec![],
+            state_transitions: vec![],
+            constraints: vec![],
+            locks: vec![],
+            struct_specs: vec![],
+        };
+
+        // Parse the content
+        let mut collecting_multiline = false;
+        let mut multiline_buffer = String::new();
+        let mut multiline_field = "";
+        let mut parsing_capability = false;
+        let mut current_capability: Option<CapabilitySpec> = None;
+
+        for line in content.lines() {
+            // Handle capability sections
+            if line.starts_with("Capabilities (") {
+                continue; // Skip the header
+            }
+            if line.starts_with("  ") && line.contains(" (") && line.ends_with("):") {
+                // Start of a capability entry like "  CAP_IPC_LOCK (14):"
+                if let Some(cap) = current_capability.take() {
+                    spec.capabilities.push(cap);
+                }
+
+                let parts: Vec<&str> = line.trim().split(" (").collect();
+                if parts.len() == 2 {
+                    let cap_name = parts[0].to_string();
+                    let cap_id = parts[1].trim_end_matches("):").parse().unwrap_or(0);
+                    current_capability = Some(CapabilitySpec {
+                        capability: cap_id,
+                        name: cap_name,
+                        action: String::new(),
+                        allows: String::new(),
+                        without_cap: String::new(),
+                        check_condition: None,
+                        priority: None,
+                        alternatives: Vec::new(),
+                    });
+                    parsing_capability = true;
+                }
+                continue;
+            }
+            if parsing_capability && line.starts_with("    ") {
+                // Parse capability fields
+                if let Some(ref mut cap) = current_capability {
+                    if let Some(action) = line.strip_prefix("    Action: ") {
+                        cap.action = action.to_string();
+                    } else if let Some(allows) = line.strip_prefix("    Allows: ") {
+                        cap.allows = allows.to_string();
+                    } else if let Some(without) = line.strip_prefix("    Without: ") {
+                        cap.without_cap = without.to_string();
+                    } else if let Some(cond) = line.strip_prefix("    Condition: ") {
+                        cap.check_condition = Some(cond.to_string());
+                    } else if let Some(prio) = line.strip_prefix("    Priority: ") {
+                        cap.priority = prio.parse().ok();
+                    } else if let Some(alts) = line.strip_prefix("    Alternatives: ") {
+                        cap.alternatives =
+                            alts.split(", ").filter_map(|s| s.parse().ok()).collect();
+                    }
+                }
+                continue;
+            }
+            if parsing_capability && !line.starts_with("  ") {
+                // End of capabilities section
+                if let Some(cap) = current_capability.take() {
+                    spec.capabilities.push(cap);
+                }
+                parsing_capability = false;
+            }
+
+            // Handle section headers
+            if line.starts_with("Parameters (") {
+                if let Some(count_str) = line
+                    .strip_prefix("Parameters (")
+                    .and_then(|s| s.strip_suffix("):"))
+                {
+                    spec.param_count = count_str.parse().ok();
+                }
+                continue;
+            } else if line.starts_with("Errors (") {
+                if let Some(count_str) = line
+                    .strip_prefix("Errors (")
+                    .and_then(|s| s.strip_suffix("):"))
+                {
+                    spec.error_count = count_str.parse().ok();
+                }
+                continue;
+            } else if line.starts_with("Examples:") {
+                collecting_multiline = true;
+                multiline_field = "examples";
+                multiline_buffer.clear();
+                continue;
+            } else if line.starts_with("Notes:") {
+                collecting_multiline = true;
+                multiline_field = "notes";
+                multiline_buffer.clear();
+                continue;
+            }
+
+            // Handle multiline sections
+            if collecting_multiline {
+                if line.trim().is_empty() && multiline_buffer.ends_with("\n\n") {
+                    collecting_multiline = false;
+                    match multiline_field {
+                        "examples" => spec.examples = Some(multiline_buffer.trim().to_string()),
+                        "notes" => spec.notes = Some(multiline_buffer.trim().to_string()),
+                        _ => {}
+                    }
+                    multiline_buffer.clear();
+                } else {
+                    if !multiline_buffer.is_empty() {
+                        multiline_buffer.push('\n');
+                    }
+                    multiline_buffer.push_str(line);
+                }
+                continue;
+            }
+
+            // Parse regular fields
+            if let Some(desc) = line.strip_prefix("Description: ") {
+                spec.description = Some(desc.to_string());
+            } else if let Some(long_desc) = line.strip_prefix("Long description: ") {
+                spec.long_description = Some(long_desc.to_string());
+            } else if let Some(version) = line.strip_prefix("Version: ") {
+                spec.version = Some(version.to_string());
+            } else if let Some(since) = line.strip_prefix("Since: ") {
+                spec.since_version = Some(since.to_string());
+            } else if let Some(flags) = line.strip_prefix("Context flags: ") {
+                spec.context_flags = flags.split_whitespace().map(str::to_string).collect();
+            } else if let Some(subsys) = line.strip_prefix("Subsystem: ") {
+                spec.subsystem = Some(subsys.to_string());
+            } else if let Some(path) = line.strip_prefix("Sysfs Path: ") {
+                spec.sysfs_path = Some(path.to_string());
+            } else if let Some(perms) = line.strip_prefix("Permissions: ") {
+                spec.permissions = Some(perms.to_string());
+            }
+        }
+
+        // Handle any remaining capability
+        if let Some(cap) = current_capability.take() {
+            spec.capabilities.push(cap);
+        }
+
+        // Determine API type based on name
+        if api_name.starts_with("sys_") {
+            spec.api_type = "syscall".to_string();
+        } else if api_name.contains("_ioctl") || api_name.starts_with("ioctl_") {
+            spec.api_type = "ioctl".to_string();
+        } else if api_name.contains("sysfs")
+            || api_name.ends_with("_show")
+            || api_name.ends_with("_store")
+        {
+            spec.api_type = "sysfs".to_string();
+        } else {
+            spec.api_type = "function".to_string();
+        }
+
+        Ok(spec)
+    }
+}
+
+impl ApiExtractor for DebugfsExtractor {
+    fn extract_all(&self) -> Result<Vec<ApiSpec>> {
+        let api_names = self.parse_list_file()?;
+        let mut specs = Vec::new();
+
+        for name in api_names {
+            match self.parse_spec_file(&name) {
+                Ok(spec) => specs.push(spec),
+                Err(_e) => {} // Silently skip files that fail to parse
+            }
+        }
+
+        Ok(specs)
+    }
+
+    fn extract_by_name(&self, name: &str) -> Result<Option<ApiSpec>> {
+        let api_names = self.parse_list_file()?;
+
+        if api_names.contains(&name.to_string()) {
+            Ok(Some(self.parse_spec_file(name)?))
+        } else {
+            Ok(None)
+        }
+    }
+
+    fn display_api_details(
+        &self,
+        api_name: &str,
+        formatter: &mut dyn OutputFormatter,
+        writer: &mut dyn Write,
+    ) -> Result<()> {
+        if let Some(spec) = self.extract_by_name(api_name)? {
+            display_api_spec(&spec, formatter, writer)?;
+        } else {
+            writeln!(writer, "API '{api_name}' not found in debugfs")?;
+        }
+
+        Ok(())
+    }
+}
diff --git a/tools/kapi/src/extractor/kerneldoc_parser.rs b/tools/kapi/src/extractor/kerneldoc_parser.rs
new file mode 100644
index 000000000000..2a6b5c896be9
--- /dev/null
+++ b/tools/kapi/src/extractor/kerneldoc_parser.rs
@@ -0,0 +1,694 @@
+use super::{
+    ApiSpec, CapabilitySpec, ConstraintSpec, ErrorSpec, LockSpec, ParamSpec,
+    ReturnSpec, SideEffectSpec, SignalSpec, StateTransitionSpec, StructSpec,
+    StructFieldSpec,
+};
+use anyhow::Result;
+use std::collections::HashMap;
+
+/// Real kerneldoc parser that extracts KAPI annotations
+pub struct KerneldocParserImpl;
+
+impl KerneldocParserImpl {
+    pub fn new() -> Self {
+        KerneldocParserImpl
+    }
+
+    pub fn parse_kerneldoc(
+        &self,
+        doc: &str,
+        name: &str,
+        api_type: &str,
+        _signature: Option<&str>,
+    ) -> Result<ApiSpec> {
+        let mut spec = ApiSpec {
+            name: name.to_string(),
+            api_type: api_type.to_string(),
+            description: None,
+            long_description: None,
+            version: None,
+            context_flags: vec![],
+            param_count: None,
+            error_count: None,
+            examples: None,
+            notes: None,
+            since_version: None,
+            subsystem: None,
+            sysfs_path: None,
+            permissions: None,
+            socket_state: None,
+            protocol_behaviors: vec![],
+            addr_families: vec![],
+            buffer_spec: None,
+            async_spec: None,
+            net_data_transfer: None,
+            capabilities: vec![],
+            parameters: vec![],
+            return_spec: None,
+            errors: vec![],
+            signals: vec![],
+            signal_masks: vec![],
+            side_effects: vec![],
+            state_transitions: vec![],
+            constraints: vec![],
+            locks: vec![],
+            struct_specs: vec![],
+        };
+
+        // Parse line by line
+        let lines: Vec<&str> = doc.lines().collect();
+        let mut i = 0;
+
+        // Extract main description from function name line
+        if let Some(first_line) = lines.first() {
+            if let Some((_, desc)) = first_line.split_once(" - ") {
+                spec.description = Some(desc.trim().to_string());
+            }
+        }
+
+        // Keep track of parameters we've seen
+        let mut param_map: HashMap<String, ParamSpec> = HashMap::new();
+        let mut struct_fields: Vec<StructFieldSpec> = Vec::new();
+        let mut current_lock: Option<LockSpec> = None;
+        let mut current_signal: Option<SignalSpec> = None;
+        let mut current_capability: Option<CapabilitySpec> = None;
+
+        while i < lines.len() {
+            let line = lines[i].trim();
+
+            // Skip empty lines
+            if line.is_empty() {
+                i += 1;
+                continue;
+            }
+
+            // Parse @param lines
+            if let Some(rest) = line.strip_prefix("@") {
+                if let Some((param_name, desc)) = rest.split_once(':') {
+                    let param_name = param_name.trim();
+                    let desc = desc.trim();
+                    if !param_name.contains('-') {
+                        // This is a basic parameter description - add to map
+                        param_map.insert(param_name.to_string(), ParamSpec {
+                            index: param_map.len() as u32,
+                            name: param_name.to_string(),
+                            type_name: String::new(),
+                            description: desc.to_string(),
+                            flags: 0,
+                            param_type: 0,
+                            constraint_type: 0,
+                            constraint: None,
+                            min_value: None,
+                            max_value: None,
+                            valid_mask: None,
+                            enum_values: vec![],
+                            size: None,
+                            alignment: None,
+                        });
+                    }
+                }
+            }
+            // Parse long-desc
+            else if let Some(rest) = line.strip_prefix("long-desc:") {
+                spec.long_description = Some(self.collect_multiline_value(&lines, i, rest));
+            }
+            // Parse context-flags
+            else if let Some(rest) = line.strip_prefix("context-flags:") {
+                spec.context_flags = self.parse_context_flags(rest.trim());
+            }
+            // Parse param-count
+            else if let Some(rest) = line.strip_prefix("param-count:") {
+                spec.param_count = rest.trim().parse().ok();
+            }
+            // Parse param-type
+            else if let Some(rest) = line.strip_prefix("param-type:") {
+                let parts: Vec<&str> = rest.split(',').map(|s| s.trim()).collect();
+                if parts.len() >= 2 {
+                    if let Some(param) = param_map.get_mut(parts[0]) {
+                        param.param_type = self.parse_param_type(parts[1]);
+                    }
+                }
+            }
+            // Parse param-flags
+            else if let Some(rest) = line.strip_prefix("param-flags:") {
+                let parts: Vec<&str> = rest.split(',').map(|s| s.trim()).collect();
+                if parts.len() >= 2 {
+                    if let Some(param) = param_map.get_mut(parts[0]) {
+                        param.flags = self.parse_param_flags(parts[1]);
+                    }
+                }
+            }
+            // Parse param-range
+            else if let Some(rest) = line.strip_prefix("param-range:") {
+                let parts: Vec<&str> = rest.split(',').map(|s| s.trim()).collect();
+                if parts.len() >= 3 {
+                    if let Some(param) = param_map.get_mut(parts[0]) {
+                        param.min_value = parts[1].parse().ok();
+                        param.max_value = parts[2].parse().ok();
+                        param.constraint_type = 1; // KAPI_CONSTRAINT_RANGE
+                    }
+                }
+            }
+            // Parse param-constraint
+            else if let Some(rest) = line.strip_prefix("param-constraint:") {
+                let parts: Vec<&str> = rest.splitn(2, ',').map(|s| s.trim()).collect();
+                if parts.len() >= 2 {
+                    if let Some(param) = param_map.get_mut(parts[0]) {
+                        param.constraint = Some(parts[1].to_string());
+                    }
+                }
+            }
+            // Parse error
+            else if let Some(rest) = line.strip_prefix("error:") {
+                // Parse error in format: "ERROR_CODE, description"
+                let parts: Vec<&str> = rest.splitn(2, ',').map(|s| s.trim()).collect();
+                if parts.len() >= 2 {
+                    let error_name = parts[0].to_string();
+                    let description = parts[1].to_string();
+
+                    // Look for desc: line on the next line
+                    let mut full_description = description;
+                    if i + 1 < lines.len() {
+                        if let Some(desc_line) = lines[i + 1].strip_prefix("*   desc:") {
+                            full_description = desc_line.trim().to_string();
+                        } else if let Some(desc_line) = lines[i + 1].strip_prefix("* desc:") {
+                            full_description = desc_line.trim().to_string();
+                        }
+                    }
+
+                    // Map common error names to codes
+                    let error_code = match error_name.as_str() {
+                        "E2BIG" => -7,
+                        "EACCES" => -13,
+                        "EAGAIN" => -11,
+                        "EBADF" => -9,
+                        "EBUSY" => -16,
+                        "EFAULT" => -14,
+                        "EINTR" => -4,
+                        "EINVAL" => -22,
+                        "EIO" => -5,
+                        "EISDIR" => -21,
+                        "ELIBBAD" => -80,
+                        "ELOOP" => -40,
+                        "EMFILE" => -24,
+                        "ENAMETOOLONG" => -36,
+                        "ENFILE" => -23,
+                        "ENOENT" => -2,
+                        "ENOEXEC" => -8,
+                        "ENOMEM" => -12,
+                        "ENOTDIR" => -20,
+                        "EOPNOTSUPP" => -95,
+                        "EPERM" => -1,
+                        "ESRCH" => -3,
+                        "ETXTBSY" => -26,
+                        _ => 0,
+                    };
+
+                    spec.errors.push(ErrorSpec {
+                        error_code,
+                        name: error_name,
+                        condition: String::new(),
+                        description: full_description,
+                    });
+                }
+            }
+            // Parse lock
+            else if let Some(rest) = line.strip_prefix("lock:") {
+                // Save previous lock if any
+                if let Some(lock) = current_lock.take() {
+                    spec.locks.push(lock);
+                }
+
+                let parts: Vec<&str> = rest.split(',').map(|s| s.trim()).collect();
+                if parts.len() >= 2 {
+                    current_lock = Some(LockSpec {
+                        lock_name: parts[0].to_string(),
+                        lock_type: self.parse_lock_type(parts[1]),
+                        acquired: false,
+                        released: false,
+                        held_on_entry: false,
+                        held_on_exit: false,
+                        description: String::new(),
+                    });
+                }
+            }
+            // Parse lock attributes
+            else if line.strip_prefix("lock-acquired:").is_some() {
+                if let Some(lock) = current_lock.as_mut() {
+                    lock.acquired = true;
+                }
+            }
+            else if line.strip_prefix("lock-released:").is_some() {
+                if let Some(lock) = current_lock.as_mut() {
+                    lock.released = true;
+                }
+            }
+            else if let Some(rest) = line.strip_prefix("lock-desc:") {
+                if let Some(lock) = current_lock.as_mut() {
+                    lock.description = self.collect_multiline_value(&lines, i, rest);
+                }
+            }
+            // Parse signal
+            else if let Some(rest) = line.strip_prefix("signal:") {
+                // Save previous signal if any
+                if let Some(signal) = current_signal.take() {
+                    spec.signals.push(signal);
+                }
+
+                let signal_name = rest.trim().to_string();
+                current_signal = Some(SignalSpec {
+                    signal_num: 0,
+                    signal_name,
+                    direction: 1,
+                    action: 0,
+                    target: None,
+                    condition: None,
+                    description: None,
+                    restartable: false,
+                    timing: 0,
+                    priority: 0,
+                    interruptible: false,
+                    queue: None,
+                    sa_flags: 0,
+                    sa_flags_required: 0,
+                    sa_flags_forbidden: 0,
+                    state_required: 0,
+                    state_forbidden: 0,
+                    error_on_signal: None,
+                });
+            }
+            // Parse signal attributes
+            else if let Some(rest) = line.strip_prefix("signal-direction:") {
+                if let Some(signal) = current_signal.as_mut() {
+                    signal.direction = self.parse_signal_direction(rest.trim());
+                }
+            }
+            else if let Some(rest) = line.strip_prefix("signal-action:") {
+                if let Some(signal) = current_signal.as_mut() {
+                    signal.action = self.parse_signal_action(rest.trim());
+                }
+            }
+            else if let Some(rest) = line.strip_prefix("signal-condition:") {
+                if let Some(signal) = current_signal.as_mut() {
+                    signal.condition = Some(self.collect_multiline_value(&lines, i, rest));
+                }
+            }
+            else if let Some(rest) = line.strip_prefix("signal-desc:") {
+                if let Some(signal) = current_signal.as_mut() {
+                    signal.description = Some(self.collect_multiline_value(&lines, i, rest));
+                }
+            }
+            else if let Some(rest) = line.strip_prefix("signal-timing:") {
+                if let Some(signal) = current_signal.as_mut() {
+                    signal.timing = self.parse_signal_timing(rest.trim());
+                }
+            }
+            else if let Some(rest) = line.strip_prefix("signal-priority:") {
+                if let Some(signal) = current_signal.as_mut() {
+                    signal.priority = rest.trim().parse().unwrap_or(0);
+                }
+            }
+            else if line.strip_prefix("signal-interruptible:").is_some() {
+                if let Some(signal) = current_signal.as_mut() {
+                    signal.interruptible = true;
+                }
+            }
+            else if let Some(rest) = line.strip_prefix("signal-state-req:") {
+                if let Some(signal) = current_signal.as_mut() {
+                    signal.state_required = self.parse_signal_state(rest.trim());
+                }
+            }
+            // Parse side-effect
+            else if let Some(rest) = line.strip_prefix("side-effect:") {
+                let full_effect = self.collect_multiline_value(&lines, i, rest);
+                let parts: Vec<&str> = full_effect.splitn(3, ',').map(|s| s.trim()).collect();
+                if parts.len() >= 3 {
+                    let mut effect = SideEffectSpec {
+                        effect_type: self.parse_effect_type(parts[0]),
+                        target: parts[1].to_string(),
+                        condition: None,
+                        description: parts[2].to_string(),
+                        reversible: false,
+                    };
+
+                    // Check for additional attributes
+                    if let Some(pos) = parts[2].find("condition=") {
+                        let cond_str = &parts[2][pos + 10..];
+                        if let Some(end) = cond_str.find(',') {
+                            effect.condition = Some(cond_str[..end].to_string());
+                        } else {
+                            effect.condition = Some(cond_str.to_string());
+                        }
+                    }
+
+                    if parts[2].contains("reversible=yes") {
+                        effect.reversible = true;
+                    }
+
+                    spec.side_effects.push(effect);
+                }
+            }
+            // Parse state-trans
+            else if let Some(rest) = line.strip_prefix("state-trans:") {
+                let parts: Vec<&str> = rest.split(',').map(|s| s.trim()).collect();
+                if parts.len() >= 4 {
+                    spec.state_transitions.push(StateTransitionSpec {
+                        object: parts[0].to_string(),
+                        from_state: parts[1].to_string(),
+                        to_state: parts[2].to_string(),
+                        condition: None,
+                        description: parts[3].to_string(),
+                    });
+                }
+            }
+            // Parse capability
+            else if let Some(rest) = line.strip_prefix("capability:") {
+                // Save previous capability if any
+                if let Some(cap) = current_capability.take() {
+                    spec.capabilities.push(cap);
+                }
+
+                let parts: Vec<&str> = rest.split(',').map(|s| s.trim()).collect();
+                if parts.len() >= 3 {
+                    current_capability = Some(CapabilitySpec {
+                        capability: self.parse_capability_value(parts[0]),
+                        action: parts[1].to_string(),
+                        name: parts[2].to_string(),
+                        allows: String::new(),
+                        without_cap: String::new(),
+                        check_condition: None,
+                        priority: Some(0),
+                        alternatives: vec![],
+                    });
+                }
+            }
+            // Parse capability attributes
+            else if let Some(rest) = line.strip_prefix("capability-allows:") {
+                if let Some(cap) = current_capability.as_mut() {
+                    cap.allows = self.collect_multiline_value(&lines, i, rest);
+                }
+            }
+            else if let Some(rest) = line.strip_prefix("capability-without:") {
+                if let Some(cap) = current_capability.as_mut() {
+                    cap.without_cap = self.collect_multiline_value(&lines, i, rest);
+                }
+            }
+            else if let Some(rest) = line.strip_prefix("capability-condition:") {
+                if let Some(cap) = current_capability.as_mut() {
+                    cap.check_condition = Some(self.collect_multiline_value(&lines, i, rest));
+                }
+            }
+            else if let Some(rest) = line.strip_prefix("capability-priority:") {
+                if let Some(cap) = current_capability.as_mut() {
+                    cap.priority = rest.trim().parse().ok();
+                }
+            }
+            // Parse constraint
+            else if let Some(rest) = line.strip_prefix("constraint:") {
+                let parts: Vec<&str> = rest.splitn(2, ',').map(|s| s.trim()).collect();
+                if parts.len() >= 2 {
+                    spec.constraints.push(ConstraintSpec {
+                        name: parts[0].to_string(),
+                        description: parts[1].to_string(),
+                        expression: None,
+                    });
+                }
+            }
+            // Parse constraint-expr
+            else if let Some(rest) = line.strip_prefix("constraint-expr:") {
+                let parts: Vec<&str> = rest.splitn(2, ',').map(|s| s.trim()).collect();
+                if parts.len() >= 2 {
+                    // Find matching constraint and update it
+                    if let Some(constraint) = spec.constraints.iter_mut().find(|c| c.name == parts[0]) {
+                        constraint.expression = Some(parts[1].to_string());
+                    }
+                }
+            }
+            // Parse struct-field
+            else if let Some(rest) = line.strip_prefix("struct-field:") {
+                let parts: Vec<&str> = rest.split(',').map(|s| s.trim()).collect();
+                if parts.len() >= 3 {
+                    struct_fields.push(StructFieldSpec {
+                        name: parts[0].to_string(),
+                        field_type: self.parse_field_type(parts[1]),
+                        type_name: parts[1].to_string(),
+                        offset: 0,
+                        size: 0,
+                        flags: 0,
+                        constraint_type: 0,
+                        min_value: 0,
+                        max_value: 0,
+                        valid_mask: 0,
+                        description: parts[2].to_string(),
+                    });
+                }
+            }
+            // Parse struct-field-range
+            else if let Some(rest) = line.strip_prefix("struct-field-range:") {
+                let parts: Vec<&str> = rest.split(',').map(|s| s.trim()).collect();
+                if parts.len() >= 3 {
+                    // Update the field with range
+                    if let Some(field) = struct_fields.iter_mut().find(|f| f.name == parts[0]) {
+                        field.min_value = parts[1].parse().unwrap_or(0);
+                        field.max_value = parts[2].parse().unwrap_or(0);
+                        field.constraint_type = 1; // KAPI_CONSTRAINT_RANGE
+                    }
+                }
+            }
+            // Parse examples
+            else if let Some(rest) = line.strip_prefix("examples:") {
+                spec.examples = Some(self.collect_multiline_value(&lines, i, rest));
+            }
+            // Parse notes
+            else if let Some(rest) = line.strip_prefix("notes:") {
+                spec.notes = Some(self.collect_multiline_value(&lines, i, rest));
+            }
+            // Parse since-version
+            else if let Some(rest) = line.strip_prefix("since-version:") {
+                spec.since_version = Some(rest.trim().to_string());
+            }
+            // Parse return-type
+            else if let Some(rest) = line.strip_prefix("return-type:") {
+                if spec.return_spec.is_none() {
+                    spec.return_spec = Some(ReturnSpec {
+                        type_name: rest.trim().to_string(),
+                        description: String::new(),
+                        return_type: self.parse_param_type(rest.trim()),
+                        check_type: 0,
+                        success_value: None,
+                        success_min: None,
+                        success_max: None,
+                        error_values: vec![],
+                    });
+                }
+            }
+            // Parse return-check-type
+            else if let Some(rest) = line.strip_prefix("return-check-type:") {
+                if let Some(ret) = spec.return_spec.as_mut() {
+                    ret.check_type = self.parse_return_check_type(rest.trim());
+                }
+            }
+            // Parse return-success
+            else if let Some(rest) = line.strip_prefix("return-success:") {
+                if let Some(ret) = spec.return_spec.as_mut() {
+                    ret.success_value = rest.trim().parse().ok();
+                }
+            }
+
+            i += 1;
+        }
+
+        // Save any remaining items
+        if let Some(lock) = current_lock {
+            spec.locks.push(lock);
+        }
+        if let Some(signal) = current_signal {
+            spec.signals.push(signal);
+        }
+        if let Some(cap) = current_capability {
+            spec.capabilities.push(cap);
+        }
+
+        // Convert param_map to vec preserving order
+        let mut params: Vec<ParamSpec> = param_map.into_values().collect();
+        params.sort_by_key(|p| p.index);
+        spec.parameters = params;
+
+        // Create struct spec if we have fields
+        if !struct_fields.is_empty() {
+            spec.struct_specs.push(StructSpec {
+                name: "struct sched_attr".to_string(),
+                size: 120, // Default for sched_attr
+                alignment: 8,
+                field_count: struct_fields.len() as u32,
+                fields: struct_fields,
+                description: "Structure specification".to_string(),
+            });
+        }
+
+        Ok(spec)
+    }
+
+    fn collect_multiline_value(&self, lines: &[&str], start_idx: usize, first_part: &str) -> String {
+        let mut result = String::from(first_part.trim());
+        let mut i = start_idx + 1;
+
+        // Continue collecting lines until we hit another annotation or end
+        while i < lines.len() {
+            let line = lines[i];
+
+            // Stop if we hit another annotation (contains ':' and starts with valid keyword)
+            if self.is_annotation_line(line) {
+                break;
+            }
+
+            // Add continuation lines
+            if !line.trim().is_empty() && line.starts_with("  ") {
+                if !result.is_empty() {
+                    result.push(' ');
+                }
+                result.push_str(line.trim());
+            } else if line.trim().is_empty() {
+                // Empty line might be part of multiline
+                i += 1;
+                continue;
+            } else {
+                // Non-continuation line, stop
+                break;
+            }
+
+            i += 1;
+        }
+
+        result
+    }
+
+    fn is_annotation_line(&self, line: &str) -> bool {
+        let annotations = [
+            "param-", "error-", "lock", "signal", "side-effect:",
+            "state-trans:", "capability", "constraint", "struct-",
+            "return-", "examples:", "notes:", "since-", "context-",
+            "long-desc:"
+        ];
+
+        for ann in &annotations {
+            if line.trim_start().starts_with(ann) {
+                return true;
+            }
+        }
+        false
+    }
+
+    fn parse_context_flags(&self, flags: &str) -> Vec<String> {
+        flags.split('|')
+            .map(|f| f.trim().to_string())
+            .collect()
+    }
+
+    fn parse_param_type(&self, type_str: &str) -> u32 {
+        match type_str {
+            "KAPI_TYPE_INT" => 1,
+            "KAPI_TYPE_UINT" => 2,
+            "KAPI_TYPE_LONG" => 3,
+            "KAPI_TYPE_ULONG" => 4,
+            "KAPI_TYPE_STRING" => 5,
+            "KAPI_TYPE_USER_PTR" => 6,
+            _ => 0,
+        }
+    }
+
+    fn parse_field_type(&self, type_str: &str) -> u32 {
+        match type_str {
+            "__s32" | "int" => 1,
+            "__u32" | "unsigned int" => 2,
+            "__s64" | "long" => 3,
+            "__u64" | "unsigned long" => 4,
+            _ => 0,
+        }
+    }
+
+    fn parse_param_flags(&self, flags: &str) -> u32 {
+        let mut result = 0;
+        for flag in flags.split('|') {
+            match flag.trim() {
+                "KAPI_PARAM_IN" => result |= 1,
+                "KAPI_PARAM_OUT" => result |= 2,
+                "KAPI_PARAM_INOUT" => result |= 3,
+                "KAPI_PARAM_USER" => result |= 4,
+                _ => {}
+            }
+        }
+        result
+    }
+
+    fn parse_lock_type(&self, type_str: &str) -> u32 {
+        match type_str {
+            "KAPI_LOCK_SPINLOCK" => 0,
+            "KAPI_LOCK_MUTEX" => 1,
+            "KAPI_LOCK_RWLOCK" => 2,
+            _ => 3,
+        }
+    }
+
+    fn parse_signal_direction(&self, dir: &str) -> u32 {
+        match dir {
+            "KAPI_SIGNAL_SEND" => 1,
+            "KAPI_SIGNAL_RECEIVE" => 2,
+            _ => 0,
+        }
+    }
+
+    fn parse_signal_action(&self, action: &str) -> u32 {
+        match action {
+            "KAPI_SIGNAL_ACTION_DEFAULT" => 0,
+            "KAPI_SIGNAL_ACTION_IGNORE" => 1,
+            "KAPI_SIGNAL_ACTION_CUSTOM" => 2,
+            _ => 0,
+        }
+    }
+
+    fn parse_signal_timing(&self, timing: &str) -> u32 {
+        match timing {
+            "KAPI_SIGNAL_TIME_BEFORE" => 0,
+            "KAPI_SIGNAL_TIME_DURING" => 1,
+            "KAPI_SIGNAL_TIME_AFTER" => 2,
+            _ => 0,
+        }
+    }
+
+    fn parse_signal_state(&self, state: &str) -> u32 {
+        match state {
+            "KAPI_SIGNAL_STATE_RUNNING" => 1,
+            "KAPI_SIGNAL_STATE_SLEEPING" => 2,
+            _ => 0,
+        }
+    }
+
+    fn parse_effect_type(&self, type_str: &str) -> u32 {
+        let mut result = 0;
+        for flag in type_str.split('|') {
+            match flag.trim() {
+                "KAPI_EFFECT_MODIFY_STATE" => result |= 1,
+                "KAPI_EFFECT_PROCESS_STATE" => result |= 2,
+                "KAPI_EFFECT_SCHEDULE" => result |= 4,
+                _ => {}
+            }
+        }
+        result
+    }
+
+    fn parse_capability_value(&self, cap: &str) -> i32 {
+        match cap {
+            "CAP_SYS_NICE" => 23,
+            _ => 0,
+        }
+    }
+
+    fn parse_return_check_type(&self, check: &str) -> u32 {
+        match check {
+            "KAPI_RETURN_ERROR_CHECK" => 1,
+            "KAPI_RETURN_SUCCESS_CHECK" => 2,
+            _ => 0,
+        }
+    }
+}
\ No newline at end of file
diff --git a/tools/kapi/src/extractor/mod.rs b/tools/kapi/src/extractor/mod.rs
new file mode 100644
index 000000000000..010851fe6a89
--- /dev/null
+++ b/tools/kapi/src/extractor/mod.rs
@@ -0,0 +1,461 @@
+use crate::formatter::OutputFormatter;
+use anyhow::Result;
+use std::convert::TryInto;
+use std::io::Write;
+
+pub mod debugfs;
+pub mod kerneldoc_parser;
+pub mod source_parser;
+pub mod vmlinux;
+
+pub use debugfs::DebugfsExtractor;
+pub use source_parser::SourceExtractor;
+pub use vmlinux::VmlinuxExtractor;
+
+/// Socket state specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct SocketStateSpec {
+    pub required_states: Vec<String>,
+    pub forbidden_states: Vec<String>,
+    pub resulting_state: Option<String>,
+    pub condition: Option<String>,
+    pub applicable_protocols: Option<String>,
+}
+
+/// Protocol behavior specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct ProtocolBehaviorSpec {
+    pub applicable_protocols: String,
+    pub behavior: String,
+    pub protocol_flags: Option<String>,
+    pub flag_description: Option<String>,
+}
+
+/// Address family specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct AddrFamilySpec {
+    pub family: i32,
+    pub family_name: String,
+    pub addr_struct_size: usize,
+    pub min_addr_len: usize,
+    pub max_addr_len: usize,
+    pub addr_format: Option<String>,
+    pub supports_wildcard: bool,
+    pub supports_multicast: bool,
+    pub supports_broadcast: bool,
+    pub special_addresses: Option<String>,
+    pub port_range_min: u32,
+    pub port_range_max: u32,
+}
+
+/// Buffer specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct BufferSpec {
+    pub buffer_behaviors: Option<String>,
+    pub min_buffer_size: Option<usize>,
+    pub max_buffer_size: Option<usize>,
+    pub optimal_buffer_size: Option<usize>,
+}
+
+/// Async specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct AsyncSpec {
+    pub supported_modes: Option<String>,
+    pub nonblock_errno: Option<i32>,
+}
+
+/// Capability specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct CapabilitySpec {
+    pub capability: i32,
+    pub name: String,
+    pub action: String,
+    pub allows: String,
+    pub without_cap: String,
+    pub check_condition: Option<String>,
+    pub priority: Option<u8>,
+    pub alternatives: Vec<i32>,
+}
+
+/// Parameter specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct ParamSpec {
+    pub index: u32,
+    pub name: String,
+    pub type_name: String,
+    pub description: String,
+    pub flags: u32,
+    pub param_type: u32,
+    pub constraint_type: u32,
+    pub constraint: Option<String>,
+    pub min_value: Option<i64>,
+    pub max_value: Option<i64>,
+    pub valid_mask: Option<u64>,
+    pub enum_values: Vec<String>,
+    pub size: Option<u32>,
+    pub alignment: Option<u32>,
+}
+
+/// Return value specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct ReturnSpec {
+    pub type_name: String,
+    pub description: String,
+    pub return_type: u32,
+    pub check_type: u32,
+    pub success_value: Option<i64>,
+    pub success_min: Option<i64>,
+    pub success_max: Option<i64>,
+    pub error_values: Vec<i32>,
+}
+
+/// Error specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct ErrorSpec {
+    pub error_code: i32,
+    pub name: String,
+    pub condition: String,
+    pub description: String,
+}
+
+/// Signal specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct SignalSpec {
+    pub signal_num: i32,
+    pub signal_name: String,
+    pub direction: u32,
+    pub action: u32,
+    pub target: Option<String>,
+    pub condition: Option<String>,
+    pub description: Option<String>,
+    pub timing: u32,
+    pub priority: u32,
+    pub restartable: bool,
+    pub interruptible: bool,
+    pub queue: Option<String>,
+    pub sa_flags: u32,
+    pub sa_flags_required: u32,
+    pub sa_flags_forbidden: u32,
+    pub state_required: u32,
+    pub state_forbidden: u32,
+    pub error_on_signal: Option<i32>,
+}
+
+/// Signal mask specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct SignalMaskSpec {
+    pub name: String,
+    pub description: String,
+}
+
+/// Side effect specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct SideEffectSpec {
+    pub effect_type: u32,
+    pub target: String,
+    pub condition: Option<String>,
+    pub description: String,
+    pub reversible: bool,
+}
+
+/// State transition specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct StateTransitionSpec {
+    pub object: String,
+    pub from_state: String,
+    pub to_state: String,
+    pub condition: Option<String>,
+    pub description: String,
+}
+
+/// Constraint specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct ConstraintSpec {
+    pub name: String,
+    pub description: String,
+    pub expression: Option<String>,
+}
+
+/// Lock specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct LockSpec {
+    pub lock_name: String,
+    pub lock_type: u32,
+    pub acquired: bool,
+    pub released: bool,
+    pub held_on_entry: bool,
+    pub held_on_exit: bool,
+    pub description: String,
+}
+
+/// Struct field specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct StructFieldSpec {
+    pub name: String,
+    pub field_type: u32,
+    pub type_name: String,
+    pub offset: usize,
+    pub size: usize,
+    pub flags: u32,
+    pub constraint_type: u32,
+    pub min_value: i64,
+    pub max_value: i64,
+    pub valid_mask: u64,
+    pub description: String,
+}
+
+/// Struct specification
+#[derive(Debug, Clone, serde::Serialize)]
+pub struct StructSpec {
+    pub name: String,
+    pub size: usize,
+    pub alignment: usize,
+    pub field_count: u32,
+    pub fields: Vec<StructFieldSpec>,
+    pub description: String,
+}
+
+/// Common API specification information that all extractors should provide
+#[derive(Debug, Clone)]
+pub struct ApiSpec {
+    pub name: String,
+    pub api_type: String,
+    pub description: Option<String>,
+    pub long_description: Option<String>,
+    pub version: Option<String>,
+    pub context_flags: Vec<String>,
+    pub param_count: Option<u32>,
+    pub error_count: Option<u32>,
+    pub examples: Option<String>,
+    pub notes: Option<String>,
+    pub since_version: Option<String>,
+    // Sysfs-specific fields
+    pub subsystem: Option<String>,
+    pub sysfs_path: Option<String>,
+    pub permissions: Option<String>,
+    // Networking-specific fields
+    pub socket_state: Option<SocketStateSpec>,
+    pub protocol_behaviors: Vec<ProtocolBehaviorSpec>,
+    pub addr_families: Vec<AddrFamilySpec>,
+    pub buffer_spec: Option<BufferSpec>,
+    pub async_spec: Option<AsyncSpec>,
+    pub net_data_transfer: Option<String>,
+    pub capabilities: Vec<CapabilitySpec>,
+    pub parameters: Vec<ParamSpec>,
+    pub return_spec: Option<ReturnSpec>,
+    pub errors: Vec<ErrorSpec>,
+    pub signals: Vec<SignalSpec>,
+    pub signal_masks: Vec<SignalMaskSpec>,
+    pub side_effects: Vec<SideEffectSpec>,
+    pub state_transitions: Vec<StateTransitionSpec>,
+    pub constraints: Vec<ConstraintSpec>,
+    pub locks: Vec<LockSpec>,
+    pub struct_specs: Vec<StructSpec>,
+}
+
+/// Trait for extracting API specifications from different sources
+pub trait ApiExtractor {
+    /// Extract all API specifications from the source
+    fn extract_all(&self) -> Result<Vec<ApiSpec>>;
+
+    /// Extract a specific API specification by name
+    fn extract_by_name(&self, name: &str) -> Result<Option<ApiSpec>>;
+
+    /// Display detailed information about a specific API
+    fn display_api_details(
+        &self,
+        api_name: &str,
+        formatter: &mut dyn OutputFormatter,
+        writer: &mut dyn Write,
+    ) -> Result<()>;
+}
+
+/// Helper function to display an ApiSpec using a formatter
+pub fn display_api_spec(
+    spec: &ApiSpec,
+    formatter: &mut dyn OutputFormatter,
+    writer: &mut dyn Write,
+) -> Result<()> {
+    formatter.begin_api_details(writer, &spec.name)?;
+
+    if let Some(desc) = &spec.description {
+        formatter.description(writer, desc)?;
+    }
+
+    if let Some(long_desc) = &spec.long_description {
+        formatter.long_description(writer, long_desc)?;
+    }
+
+    if let Some(version) = &spec.since_version {
+        formatter.since_version(writer, version)?;
+    }
+
+    if !spec.context_flags.is_empty() {
+        formatter.begin_context_flags(writer)?;
+        for flag in &spec.context_flags {
+            formatter.context_flag(writer, flag)?;
+        }
+        formatter.end_context_flags(writer)?;
+    }
+
+    if !spec.parameters.is_empty() {
+        formatter.begin_parameters(writer, spec.parameters.len().try_into().unwrap_or(u32::MAX))?;
+        for param in &spec.parameters {
+            formatter.parameter(writer, param)?;
+        }
+        formatter.end_parameters(writer)?;
+    }
+
+    if let Some(ret) = &spec.return_spec {
+        formatter.return_spec(writer, ret)?;
+    }
+
+    if !spec.errors.is_empty() {
+        formatter.begin_errors(writer, spec.errors.len().try_into().unwrap_or(u32::MAX))?;
+        for error in &spec.errors {
+            formatter.error(writer, error)?;
+        }
+        formatter.end_errors(writer)?;
+    }
+
+    if let Some(notes) = &spec.notes {
+        formatter.notes(writer, notes)?;
+    }
+
+    if let Some(examples) = &spec.examples {
+        formatter.examples(writer, examples)?;
+    }
+
+    // Display sysfs-specific fields
+    if spec.api_type == "sysfs" {
+        if let Some(subsystem) = &spec.subsystem {
+            formatter.sysfs_subsystem(writer, subsystem)?;
+        }
+        if let Some(path) = &spec.sysfs_path {
+            formatter.sysfs_path(writer, path)?;
+        }
+        if let Some(perms) = &spec.permissions {
+            formatter.sysfs_permissions(writer, perms)?;
+        }
+    }
+
+    // Display networking-specific fields
+    if let Some(socket_state) = &spec.socket_state {
+        formatter.socket_state(writer, socket_state)?;
+    }
+
+    if !spec.protocol_behaviors.is_empty() {
+        formatter.begin_protocol_behaviors(writer)?;
+        for behavior in &spec.protocol_behaviors {
+            formatter.protocol_behavior(writer, behavior)?;
+        }
+        formatter.end_protocol_behaviors(writer)?;
+    }
+
+    if !spec.addr_families.is_empty() {
+        formatter.begin_addr_families(writer)?;
+        for family in &spec.addr_families {
+            formatter.addr_family(writer, family)?;
+        }
+        formatter.end_addr_families(writer)?;
+    }
+
+    if let Some(buffer_spec) = &spec.buffer_spec {
+        formatter.buffer_spec(writer, buffer_spec)?;
+    }
+
+    if let Some(async_spec) = &spec.async_spec {
+        formatter.async_spec(writer, async_spec)?;
+    }
+
+    if let Some(net_data_transfer) = &spec.net_data_transfer {
+        formatter.net_data_transfer(writer, net_data_transfer)?;
+    }
+
+    if !spec.capabilities.is_empty() {
+        formatter.begin_capabilities(writer)?;
+        for cap in &spec.capabilities {
+            formatter.capability(writer, cap)?;
+        }
+        formatter.end_capabilities(writer)?;
+    }
+
+    // Display signals
+    if !spec.signals.is_empty() {
+        formatter.begin_signals(writer, spec.signals.len().try_into().unwrap_or(u32::MAX))?;
+        for signal in &spec.signals {
+            formatter.signal(writer, signal)?;
+        }
+        formatter.end_signals(writer)?;
+    }
+
+    // Display signal masks
+    if !spec.signal_masks.is_empty() {
+        formatter.begin_signal_masks(
+            writer,
+            spec.signal_masks.len().try_into().unwrap_or(u32::MAX),
+        )?;
+        for mask in &spec.signal_masks {
+            formatter.signal_mask(writer, mask)?;
+        }
+        formatter.end_signal_masks(writer)?;
+    }
+
+    // Display side effects
+    if !spec.side_effects.is_empty() {
+        formatter.begin_side_effects(
+            writer,
+            spec.side_effects.len().try_into().unwrap_or(u32::MAX),
+        )?;
+        for effect in &spec.side_effects {
+            formatter.side_effect(writer, effect)?;
+        }
+        formatter.end_side_effects(writer)?;
+    }
+
+    // Display state transitions
+    if !spec.state_transitions.is_empty() {
+        formatter.begin_state_transitions(
+            writer,
+            spec.state_transitions.len().try_into().unwrap_or(u32::MAX),
+        )?;
+        for trans in &spec.state_transitions {
+            formatter.state_transition(writer, trans)?;
+        }
+        formatter.end_state_transitions(writer)?;
+    }
+
+    // Display constraints
+    if !spec.constraints.is_empty() {
+        formatter.begin_constraints(
+            writer,
+            spec.constraints.len().try_into().unwrap_or(u32::MAX),
+        )?;
+        for constraint in &spec.constraints {
+            formatter.constraint(writer, constraint)?;
+        }
+        formatter.end_constraints(writer)?;
+    }
+
+    // Display locks
+    if !spec.locks.is_empty() {
+        formatter.begin_locks(writer, spec.locks.len().try_into().unwrap_or(u32::MAX))?;
+        for lock in &spec.locks {
+            formatter.lock(writer, lock)?;
+        }
+        formatter.end_locks(writer)?;
+    }
+
+    // Display struct specs
+    if !spec.struct_specs.is_empty() {
+        formatter.begin_struct_specs(writer, spec.struct_specs.len().try_into().unwrap_or(u32::MAX))?;
+        for struct_spec in &spec.struct_specs {
+            formatter.struct_spec(writer, struct_spec)?;
+        }
+        formatter.end_struct_specs(writer)?;
+    }
+
+    formatter.end_api_details(writer)?;
+
+    Ok(())
+}
diff --git a/tools/kapi/src/extractor/source_parser.rs b/tools/kapi/src/extractor/source_parser.rs
new file mode 100644
index 000000000000..7a72b85a83be
--- /dev/null
+++ b/tools/kapi/src/extractor/source_parser.rs
@@ -0,0 +1,213 @@
+use super::{
+    ApiExtractor, ApiSpec, display_api_spec,
+};
+use super::kerneldoc_parser::KerneldocParserImpl;
+use crate::formatter::OutputFormatter;
+use anyhow::{Context, Result};
+use regex::Regex;
+use std::fs;
+use std::io::Write;
+use std::path::Path;
+use walkdir::WalkDir;
+
+/// Extractor for kernel source files with KAPI-annotated kerneldoc
+pub struct SourceExtractor {
+    path: String,
+    parser: KerneldocParserImpl,
+    syscall_regex: Regex,
+    ioctl_regex: Regex,
+    function_regex: Regex,
+}
+
+impl SourceExtractor {
+    pub fn new(path: &str) -> Result<Self> {
+        Ok(SourceExtractor {
+            path: path.to_string(),
+            parser: KerneldocParserImpl::new(),
+            syscall_regex: Regex::new(r"SYSCALL_DEFINE\d+\((\w+)")?,
+            ioctl_regex: Regex::new(r"(?:static\s+)?long\s+(\w+_ioctl)\s*\(")?,
+            function_regex: Regex::new(
+                r"(?m)^(?:static\s+)?(?:inline\s+)?(?:(?:unsigned\s+)?(?:long|int|void|char|short|struct\s+\w+\s*\*?|[\w_]+_t)\s*\*?\s+)?(\w+)\s*\([^)]*\)",
+            )?,
+        })
+    }
+
+    fn extract_from_file(&self, path: &Path) -> Result<Vec<ApiSpec>> {
+        let content = fs::read_to_string(path)
+            .with_context(|| format!("Failed to read file: {}", path.display()))?;
+
+        self.extract_from_content(&content)
+    }
+
+    fn extract_from_content(&self, content: &str) -> Result<Vec<ApiSpec>> {
+        let mut specs = Vec::new();
+        let mut in_kerneldoc = false;
+        let mut current_doc = String::new();
+        let lines: Vec<&str> = content.lines().collect();
+        let mut i = 0;
+
+        while i < lines.len() {
+            let line = lines[i];
+
+            // Start of kerneldoc comment
+            if line.trim_start().starts_with("/**") {
+                in_kerneldoc = true;
+                current_doc.clear();
+                i += 1;
+                continue;
+            }
+
+            // Inside kerneldoc comment
+            if in_kerneldoc {
+                if line.contains("*/") {
+                    in_kerneldoc = false;
+
+                    // Check if this kerneldoc has KAPI annotations
+                    if current_doc.contains("context-flags:") ||
+                       current_doc.contains("param-count:") ||
+                       current_doc.contains("side-effect:") ||
+                       current_doc.contains("state-trans:") ||
+                       current_doc.contains("error-code:") {
+
+                        // Look ahead for the function declaration
+                        if let Some((name, api_type, signature)) = self.find_function_after(&lines, i + 1) {
+                            if let Ok(spec) = self.parser.parse_kerneldoc(&current_doc, &name, &api_type, Some(&signature)) {
+                                specs.push(spec);
+                            }
+                        }
+                    }
+                } else {
+                    // Remove leading asterisk and preserve content
+                    let cleaned = if let Some(stripped) = line.trim_start().strip_prefix("*") {
+                        if let Some(no_space) = stripped.strip_prefix(' ') {
+                            no_space
+                        } else {
+                            stripped
+                        }
+                    } else {
+                        line.trim_start()
+                    };
+                    current_doc.push_str(cleaned);
+                    current_doc.push('\n');
+                }
+            }
+
+            i += 1;
+        }
+
+        Ok(specs)
+    }
+
+    fn find_function_after(&self, lines: &[&str], start: usize) -> Option<(String, String, String)> {
+        for i in start..lines.len().min(start + 10) {
+            let line = lines[i];
+
+            // Skip empty lines
+            if line.trim().is_empty() {
+                continue;
+            }
+
+            // Check for SYSCALL_DEFINE
+            if let Some(caps) = self.syscall_regex.captures(line) {
+                let name = format!("sys_{}", caps.get(1).unwrap().as_str());
+                let signature = self.extract_syscall_signature(lines, i);
+                return Some((name, "syscall".to_string(), signature));
+            }
+
+            // Check for ioctl function
+            if let Some(caps) = self.ioctl_regex.captures(line) {
+                let name = caps.get(1).unwrap().as_str().to_string();
+                return Some((name, "ioctl".to_string(), line.to_string()));
+            }
+
+            // Check for regular function
+            if let Some(caps) = self.function_regex.captures(line) {
+                let name = caps.get(1).unwrap().as_str().to_string();
+                return Some((name, "function".to_string(), line.to_string()));
+            }
+
+            // Stop if we hit something that's clearly not part of the function declaration
+            if !line.starts_with(' ') && !line.starts_with('\t') && !line.trim().is_empty() {
+                break;
+            }
+        }
+
+        None
+    }
+
+    fn extract_syscall_signature(&self, lines: &[&str], start: usize) -> String {
+        // Extract the full SYSCALL_DEFINE signature
+        let mut sig = String::new();
+        let mut in_paren = false;
+        let mut paren_count = 0;
+
+        for line in lines.iter().skip(start).take(20) {
+            let line = *line;
+
+            // Start of SYSCALL_DEFINE
+            if line.contains("SYSCALL_DEFINE") {
+                if let Some(pos) = line.find('(') {
+                    sig.push_str(&line[pos..]);
+                    in_paren = true;
+                    paren_count = line[pos..].chars().filter(|&c| c == '(').count() -
+                                  line[pos..].chars().filter(|&c| c == ')').count();
+                }
+            } else if in_paren {
+                sig.push(' ');
+                sig.push_str(line.trim());
+                paren_count += line.chars().filter(|&c| c == '(').count();
+                paren_count -= line.chars().filter(|&c| c == ')').count();
+
+                if paren_count == 0 {
+                    break;
+                }
+            }
+        }
+
+        sig
+    }
+}
+
+impl ApiExtractor for SourceExtractor {
+    fn extract_all(&self) -> Result<Vec<ApiSpec>> {
+        let path = Path::new(&self.path);
+        let mut all_specs = Vec::new();
+
+        if path.is_file() {
+            // Single file
+            all_specs.extend(self.extract_from_file(path)?);
+        } else if path.is_dir() {
+            // Directory - walk all .c files
+            for entry in WalkDir::new(path)
+                .into_iter()
+                .filter_map(|e| e.ok())
+                .filter(|e| e.path().extension().is_some_and(|ext| ext == "c"))
+            {
+                if let Ok(specs) = self.extract_from_file(entry.path()) {
+                    all_specs.extend(specs);
+                }
+            }
+        }
+
+        Ok(all_specs)
+    }
+
+    fn extract_by_name(&self, name: &str) -> Result<Option<ApiSpec>> {
+        let all_specs = self.extract_all()?;
+        Ok(all_specs.into_iter().find(|s| s.name == name))
+    }
+
+    fn display_api_details(
+        &self,
+        api_name: &str,
+        formatter: &mut dyn OutputFormatter,
+        output: &mut dyn Write,
+    ) -> Result<()> {
+        if let Some(spec) = self.extract_by_name(api_name)? {
+            display_api_spec(&spec, formatter, output)?;
+        } else {
+            writeln!(output, "API '{}' not found", api_name)?;
+        }
+        Ok(())
+    }
+}
\ No newline at end of file
diff --git a/tools/kapi/src/extractor/vmlinux/binary_utils.rs b/tools/kapi/src/extractor/vmlinux/binary_utils.rs
new file mode 100644
index 000000000000..0a51943e1c02
--- /dev/null
+++ b/tools/kapi/src/extractor/vmlinux/binary_utils.rs
@@ -0,0 +1,180 @@
+// Constants for all structure field sizes
+pub mod sizes {
+    pub const NAME: usize = 128;
+    pub const DESC: usize = 512;
+    pub const MAX_PARAMS: usize = 16;
+    pub const MAX_ERRORS: usize = 32;
+    pub const MAX_CONSTRAINTS: usize = 16;
+    pub const MAX_CAPABILITIES: usize = 8;
+    pub const MAX_SIGNALS: usize = 16;
+    pub const MAX_STRUCT_SPECS: usize = 8;
+    pub const MAX_SIDE_EFFECTS: usize = 32;
+    pub const MAX_STATE_TRANS: usize = 16;
+    pub const MAX_PROTOCOL_BEHAVIORS: usize = 8;
+    pub const MAX_ADDR_FAMILIES: usize = 8;
+}
+
+// Helper for reading data at specific offsets
+pub struct DataReader<'a> {
+    pub data: &'a [u8],
+    pub pos: usize,
+}
+
+impl<'a> DataReader<'a> {
+    pub fn new(data: &'a [u8], offset: usize) -> Self {
+        Self { data, pos: offset }
+    }
+
+    pub fn read_bytes(&mut self, len: usize) -> Option<&'a [u8]> {
+        if self.pos + len <= self.data.len() {
+            let bytes = &self.data[self.pos..self.pos + len];
+            self.pos += len;
+            Some(bytes)
+        } else {
+            None
+        }
+    }
+
+    pub fn read_cstring(&mut self, max_len: usize) -> Option<String> {
+        let bytes = self.read_bytes(max_len)?;
+        if let Some(null_pos) = bytes.iter().position(|&b| b == 0) {
+            if null_pos > 0 {
+                if let Ok(s) = std::str::from_utf8(&bytes[..null_pos]) {
+                    return Some(s.to_string());
+                }
+            }
+        }
+        None
+    }
+
+    pub fn read_u32(&mut self) -> Option<u32> {
+        self.read_bytes(4).map(|b| u32::from_le_bytes(b.try_into().unwrap()))
+    }
+
+    pub fn read_u8(&mut self) -> Option<u8> {
+        self.read_bytes(1).map(|b| b[0])
+    }
+
+    pub fn read_i32(&mut self) -> Option<i32> {
+        self.read_bytes(4).map(|b| i32::from_le_bytes(b.try_into().unwrap()))
+    }
+
+    pub fn read_u64(&mut self) -> Option<u64> {
+        self.read_bytes(8).map(|b| u64::from_le_bytes(b.try_into().unwrap()))
+    }
+
+    pub fn read_i64(&mut self) -> Option<i64> {
+        self.read_bytes(8).map(|b| i64::from_le_bytes(b.try_into().unwrap()))
+    }
+
+    pub fn read_usize(&mut self) -> Option<usize> {
+        self.read_u64().map(|v| v as usize)
+    }
+
+    pub fn skip(&mut self, len: usize) {
+        self.pos = (self.pos + len).min(self.data.len());
+    }
+
+    // Helper methods for common patterns
+    pub fn read_bool(&mut self) -> Option<bool> {
+        self.read_u8().map(|v| v != 0)
+    }
+
+    pub fn read_optional_string(&mut self, max_len: usize) -> Option<String> {
+        self.read_cstring(max_len).filter(|s| !s.is_empty())
+    }
+
+    pub fn read_string_or_default(&mut self, max_len: usize) -> String {
+        self.read_cstring(max_len).unwrap_or_default()
+    }
+
+    // Skip and discard - advances position by reading and discarding
+    pub fn discard_cstring(&mut self, max_len: usize) {
+        let _ = self.read_cstring(max_len);
+    }
+
+    // Read multiple booleans at once
+    pub fn read_bools<const N: usize>(&mut self) -> Option<[bool; N]> {
+        let mut result = [false; N];
+        for item in &mut result {
+            *item = self.read_bool()?;
+        }
+        Some(result)
+    }
+
+
+}
+
+// Structure layout definitions for calculating sizes
+pub fn signal_mask_spec_layout_size() -> usize {
+    // Packed structure from struct kapi_signal_mask_spec
+    sizes::NAME + // mask_name
+    4 * sizes::MAX_SIGNALS + // signals array
+    4 + // signal_count
+    sizes::DESC // description
+}
+
+pub fn struct_field_layout_size() -> usize {
+    // Packed structure from struct kapi_struct_field
+    sizes::NAME + // name
+    4 + // type (enum)
+    sizes::NAME + // type_name
+    8 + // offset (size_t)
+    8 + // size (size_t)
+    4 + // flags
+    4 + // constraint_type (enum)
+    8 + // min_value (s64)
+    8 + // max_value (s64)
+    8 + // valid_mask (u64)
+    sizes::DESC + // enum_values
+    sizes::DESC // description
+}
+
+pub fn socket_state_spec_layout_size() -> usize {
+    // struct kapi_socket_state_spec
+    sizes::NAME * sizes::MAX_CONSTRAINTS + // required_states array
+    sizes::NAME * sizes::MAX_CONSTRAINTS + // forbidden_states array
+    sizes::NAME + // resulting_state
+    sizes::DESC + // condition
+    sizes::NAME + // applicable_protocols
+    4 + // required_count
+    4 // forbidden_count
+}
+
+pub fn protocol_behavior_spec_layout_size() -> usize {
+    // struct kapi_protocol_behavior
+    sizes::NAME + // applicable_protocols
+    sizes::DESC + // behavior
+    sizes::NAME + // protocol_flags
+    sizes::DESC // flag_description
+}
+
+pub fn buffer_spec_layout_size() -> usize {
+    // struct kapi_buffer_spec
+    sizes::DESC + // buffer_behaviors
+    8 + // min_buffer_size (size_t)
+    8 + // max_buffer_size (size_t)
+    8 // optimal_buffer_size (size_t)
+}
+
+pub fn async_spec_layout_size() -> usize {
+    // struct kapi_async_spec
+    sizes::NAME + // supported_modes
+    4 // nonblock_errno (int)
+}
+
+pub fn addr_family_spec_layout_size() -> usize {
+    // struct kapi_addr_family_spec
+    4 + // family (int)
+    sizes::NAME + // family_name
+    8 + // addr_struct_size (size_t)
+    8 + // min_addr_len (size_t)
+    8 + // max_addr_len (size_t)
+    sizes::DESC + // addr_format
+    1 + // supports_wildcard (bool)
+    1 + // supports_multicast (bool)
+    1 + // supports_broadcast (bool)
+    sizes::DESC + // special_addresses
+    4 + // port_range_min (u32)
+    4 // port_range_max (u32)
+}
diff --git a/tools/kapi/src/extractor/vmlinux/magic_finder.rs b/tools/kapi/src/extractor/vmlinux/magic_finder.rs
new file mode 100644
index 000000000000..cb7dc535801a
--- /dev/null
+++ b/tools/kapi/src/extractor/vmlinux/magic_finder.rs
@@ -0,0 +1,102 @@
+// Magic markers for each section
+pub const MAGIC_PARAM: u32 = 0x4B415031;    // 'KAP1'
+pub const MAGIC_RETURN: u32 = 0x4B415232;   // 'KAR2'
+pub const MAGIC_ERROR: u32 = 0x4B414533;    // 'KAE3'
+pub const MAGIC_LOCK: u32 = 0x4B414C34;     // 'KAL4'
+pub const MAGIC_CONSTRAINT: u32 = 0x4B414335; // 'KAC5'
+pub const MAGIC_INFO: u32 = 0x4B414936;     // 'KAI6'
+pub const MAGIC_SIGNAL: u32 = 0x4B415337;   // 'KAS7'
+pub const MAGIC_SIGMASK: u32 = 0x4B414D38;  // 'KAM8'
+pub const MAGIC_STRUCT: u32 = 0x4B415439;   // 'KAT9'
+pub const MAGIC_EFFECT: u32 = 0x4B414641;   // 'KAFA'
+pub const MAGIC_TRANS: u32 = 0x4B415442;    // 'KATB'
+pub const MAGIC_CAP: u32 = 0x4B414343;      // 'KACC'
+
+pub struct MagicOffsets {
+    pub param_offset: Option<usize>,
+    pub return_offset: Option<usize>,
+    pub error_offset: Option<usize>,
+    pub lock_offset: Option<usize>,
+    pub constraint_offset: Option<usize>,
+    pub info_offset: Option<usize>,
+    pub signal_offset: Option<usize>,
+    pub sigmask_offset: Option<usize>,
+    pub struct_offset: Option<usize>,
+    pub effect_offset: Option<usize>,
+    pub trans_offset: Option<usize>,
+    pub cap_offset: Option<usize>,
+}
+
+impl MagicOffsets {
+    /// Find magic markers in the provided data slice
+    /// data: slice of data to search (typically one spec's worth)
+    /// base_offset: absolute offset where this slice starts in the full buffer
+    pub fn find_in_data(data: &[u8], base_offset: usize) -> Self {
+        let mut offsets = MagicOffsets {
+            param_offset: None,
+            return_offset: None,
+            error_offset: None,
+            lock_offset: None,
+            constraint_offset: None,
+            info_offset: None,
+            signal_offset: None,
+            sigmask_offset: None,
+            struct_offset: None,
+            effect_offset: None,
+            trans_offset: None,
+            cap_offset: None,
+        };
+
+        // Scan through data looking for magic markers
+        // Only find the first occurrence of each magic to avoid cross-spec contamination
+        let mut i = 0;
+        while i + 4 <= data.len() {
+            let bytes = &data[i..i + 4];
+            let value = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
+
+            match value {
+                MAGIC_PARAM if offsets.param_offset.is_none() => {
+                    offsets.param_offset = Some(base_offset + i);
+                },
+                MAGIC_RETURN if offsets.return_offset.is_none() => {
+                    offsets.return_offset = Some(base_offset + i);
+                },
+                MAGIC_ERROR if offsets.error_offset.is_none() => {
+                    offsets.error_offset = Some(base_offset + i);
+                },
+                MAGIC_LOCK if offsets.lock_offset.is_none() => {
+                    offsets.lock_offset = Some(base_offset + i);
+                },
+                MAGIC_CONSTRAINT if offsets.constraint_offset.is_none() => {
+                    offsets.constraint_offset = Some(base_offset + i);
+                },
+                MAGIC_INFO if offsets.info_offset.is_none() => {
+                    offsets.info_offset = Some(base_offset + i);
+                },
+                MAGIC_SIGNAL if offsets.signal_offset.is_none() => {
+                    offsets.signal_offset = Some(base_offset + i);
+                },
+                MAGIC_SIGMASK if offsets.sigmask_offset.is_none() => {
+                    offsets.sigmask_offset = Some(base_offset + i);
+                },
+                MAGIC_STRUCT if offsets.struct_offset.is_none() => {
+                    offsets.struct_offset = Some(base_offset + i);
+                },
+                MAGIC_EFFECT if offsets.effect_offset.is_none() => {
+                    offsets.effect_offset = Some(base_offset + i);
+                },
+                MAGIC_TRANS if offsets.trans_offset.is_none() => {
+                    offsets.trans_offset = Some(base_offset + i);
+                },
+                MAGIC_CAP if offsets.cap_offset.is_none() => {
+                    offsets.cap_offset = Some(base_offset + i);
+                },
+                _ => {}
+            }
+
+            i += 1;
+        }
+
+        offsets
+    }
+}
\ No newline at end of file
diff --git a/tools/kapi/src/extractor/vmlinux/mod.rs b/tools/kapi/src/extractor/vmlinux/mod.rs
new file mode 100644
index 000000000000..a8c4ed4de626
--- /dev/null
+++ b/tools/kapi/src/extractor/vmlinux/mod.rs
@@ -0,0 +1,869 @@
+use super::{
+    ApiExtractor, ApiSpec, CapabilitySpec, ConstraintSpec, ErrorSpec, LockSpec, ParamSpec,
+    ReturnSpec, SideEffectSpec, SignalMaskSpec, SignalSpec, StateTransitionSpec, StructSpec,
+    StructFieldSpec,
+};
+use crate::formatter::OutputFormatter;
+use anyhow::{Context, Result};
+use goblin::elf::Elf;
+use std::convert::TryInto;
+use std::fs;
+use std::io::Write;
+
+mod binary_utils;
+mod magic_finder;
+use binary_utils::{
+    DataReader, addr_family_spec_layout_size, async_spec_layout_size, buffer_spec_layout_size,
+    protocol_behavior_spec_layout_size, signal_mask_spec_layout_size,
+    sizes, socket_state_spec_layout_size, struct_field_layout_size,
+};
+
+// Helper to convert empty strings to None
+fn opt_string(s: String) -> Option<String> {
+    if s.is_empty() { None } else { Some(s) }
+}
+
+pub struct VmlinuxExtractor {
+    kapi_data: Vec<u8>,
+    specs: Vec<KapiSpec>,
+}
+
+#[derive(Debug)]
+struct KapiSpec {
+    name: String,
+    api_type: String,
+    offset: usize,
+}
+
+impl VmlinuxExtractor {
+    pub fn new(vmlinux_path: &str) -> Result<Self> {
+        let vmlinux_data = fs::read(vmlinux_path)
+            .with_context(|| format!("Failed to read vmlinux file: {vmlinux_path}"))?;
+
+        let elf = Elf::parse(&vmlinux_data).context("Failed to parse ELF file")?;
+
+        // Find the .kapi_specs section
+        let kapi_section = elf
+            .section_headers
+            .iter()
+            .find(|sh| {
+                if let Some(name) = elf.shdr_strtab.get_at(sh.sh_name) {
+                    name == ".kapi_specs"
+                } else {
+                    false
+                }
+            })
+            .context("Could not find .kapi_specs section in vmlinux")?;
+
+        // Find __start_kapi_specs and __stop_kapi_specs symbols
+        let mut start_addr = None;
+        let mut stop_addr = None;
+
+        for sym in &elf.syms {
+            if let Some(name) = elf.strtab.get_at(sym.st_name) {
+                match name {
+                    "__start_kapi_specs" => start_addr = Some(sym.st_value),
+                    "__stop_kapi_specs" => stop_addr = Some(sym.st_value),
+                    _ => {}
+                }
+            }
+        }
+
+        let start = start_addr.context("Could not find __start_kapi_specs symbol")?;
+        let stop = stop_addr.context("Could not find __stop_kapi_specs symbol")?;
+
+        if stop <= start {
+            anyhow::bail!("No kernel API specifications found in vmlinux");
+        }
+
+        // Calculate the offset within the file
+        let section_vaddr = kapi_section.sh_addr;
+        let file_offset = kapi_section.sh_offset + (start - section_vaddr);
+        let data_size: usize = (stop - start)
+            .try_into()
+            .context("Data size too large for platform")?;
+
+        let file_offset_usize: usize = file_offset
+            .try_into()
+            .context("File offset too large for platform")?;
+
+        if file_offset_usize + data_size > vmlinux_data.len() {
+            anyhow::bail!("Invalid offset/size for .kapi_specs data");
+        }
+
+        // Extract the raw data
+        let kapi_data = vmlinux_data[file_offset_usize..(file_offset_usize + data_size)].to_vec();
+
+        // Parse the specifications
+        let specs = parse_kapi_specs(&kapi_data)?;
+
+        Ok(VmlinuxExtractor { kapi_data, specs })
+    }
+}
+
+fn parse_kapi_specs(data: &[u8]) -> Result<Vec<KapiSpec>> {
+    let mut specs = Vec::new();
+    let mut offset = 0;
+    let mut last_found_offset = None;
+
+    // Expected offset from struct start to param_magic based on struct layout
+    let param_magic_offset = sizes::NAME + 4 + sizes::DESC + (sizes::DESC * 4) + 4;
+
+    // Find specs by validating API name and magic marker pairs
+    while offset + param_magic_offset + 4 <= data.len() {
+        // Read potential API name
+        let name_bytes = &data[offset..offset + sizes::NAME.min(data.len() - offset)];
+
+        // Find null terminator
+        let name_len = name_bytes.iter().position(|&b| b == 0).unwrap_or(0);
+
+        if name_len > 0 && name_len < 100 {
+            let name = String::from_utf8_lossy(&name_bytes[..name_len]).to_string();
+
+            // Validate API name format
+            if is_valid_api_name(&name) {
+                // Verify magic marker at expected position
+                let magic_offset = offset + param_magic_offset;
+                if magic_offset + 4 <= data.len() {
+                    let magic_bytes = &data[magic_offset..magic_offset + 4];
+                    let magic_value = u32::from_le_bytes([magic_bytes[0], magic_bytes[1], magic_bytes[2], magic_bytes[3]]);
+
+                    if magic_value == magic_finder::MAGIC_PARAM {
+                        // Avoid duplicate detection of the same spec
+                        if last_found_offset.is_none() || offset >= last_found_offset.unwrap() + param_magic_offset {
+                            let api_type = if name.starts_with("sys_") {
+                                "syscall"
+                            } else if name.ends_with("_ioctl") {
+                                "ioctl"
+                            } else if name.contains("sysfs") {
+                                "sysfs"
+                            } else {
+                                "function"
+                            }
+                            .to_string();
+
+                            specs.push(KapiSpec {
+                                name: name.clone(),
+                                api_type,
+                                offset,
+                            });
+
+                            last_found_offset = Some(offset);
+                        }
+                    }
+                }
+            }
+        }
+
+        // Scan byte by byte to find all specs
+        offset += 1;
+    }
+
+    Ok(specs)
+}
+
+
+
+
+fn is_valid_api_name(name: &str) -> bool {
+    // Validate API name format and length
+    if name.is_empty() || name.len() < 3 || name.len() > 100 {
+        return false;
+    }
+
+    // Alphanumeric and underscore characters only
+    if !name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_') {
+        return false;
+    }
+
+    // Must start with letter or underscore
+    let first_char = name.chars().next().unwrap();
+    if !first_char.is_ascii_alphabetic() && first_char != '_' {
+        return false;
+    }
+
+    // Match common kernel API patterns
+    name.starts_with("sys_") ||
+    name.starts_with("__") ||
+    name.ends_with("_ioctl") ||
+    name.contains("_") ||
+    name.len() > 6
+}
+
+impl ApiExtractor for VmlinuxExtractor {
+    fn extract_all(&self) -> Result<Vec<ApiSpec>> {
+        Ok(self
+            .specs
+            .iter()
+            .map(|spec| {
+                // Parse the full spec for listing
+                parse_binary_to_api_spec(&self.kapi_data, spec.offset)
+                    .unwrap_or_else(|_| ApiSpec {
+                        name: spec.name.clone(),
+                        api_type: spec.api_type.clone(),
+                        description: None,
+                        long_description: None,
+                        version: None,
+                        context_flags: vec![],
+                        param_count: None,
+                        error_count: None,
+                        examples: None,
+                        notes: None,
+                        since_version: None,
+                        subsystem: None,
+                        sysfs_path: None,
+                        permissions: None,
+                        socket_state: None,
+                        protocol_behaviors: vec![],
+                        addr_families: vec![],
+                        buffer_spec: None,
+                        async_spec: None,
+                        net_data_transfer: None,
+                        capabilities: vec![],
+                        parameters: vec![],
+                        return_spec: None,
+                        errors: vec![],
+                        signals: vec![],
+                        signal_masks: vec![],
+                        side_effects: vec![],
+                        state_transitions: vec![],
+                        constraints: vec![],
+                        locks: vec![],
+                        struct_specs: vec![],
+                    })
+            })
+            .collect())
+    }
+
+    fn extract_by_name(&self, api_name: &str) -> Result<Option<ApiSpec>> {
+        if let Some(spec) = self.specs.iter().find(|s| s.name == api_name) {
+            Ok(Some(parse_binary_to_api_spec(&self.kapi_data, spec.offset)?))
+        } else {
+            Ok(None)
+        }
+    }
+
+    fn display_api_details(
+        &self,
+        api_name: &str,
+        formatter: &mut dyn OutputFormatter,
+        writer: &mut dyn Write,
+    ) -> Result<()> {
+        if let Some(spec) = self.specs.iter().find(|s| s.name == api_name) {
+            let api_spec = parse_binary_to_api_spec(&self.kapi_data, spec.offset)?;
+            super::display_api_spec(&api_spec, formatter, writer)?;
+        }
+        Ok(())
+    }
+}
+
+/// Helper to read count and parse array items with optional magic offset
+fn parse_array_with_magic<T, F>(
+    reader: &mut DataReader,
+    magic_offset: Option<usize>,
+    max_items: u32,
+    parse_fn: F,
+) -> Vec<T>
+where
+    F: Fn(&mut DataReader) -> Option<T>,
+{
+    // Read count - position at magic+4 if magic offset exists
+    let count = if let Some(offset) = magic_offset {
+        reader.pos = offset + 4;
+        reader.read_u32()
+    } else {
+        reader.read_u32()
+    };
+
+    let mut items = Vec::new();
+    if let Some(count) = count {
+        // Position at start of array data if magic offset exists
+        if let Some(offset) = magic_offset {
+            reader.pos = offset + 8; // +4 for magic, +4 for count
+        }
+        // Parse items up to max_items
+        for _ in 0..count.min(max_items) as usize {
+            if let Some(item) = parse_fn(reader) {
+                items.push(item);
+            }
+        }
+    }
+    items
+}
+
+fn parse_binary_to_api_spec(data: &[u8], offset: usize) -> Result<ApiSpec> {
+    let mut reader = DataReader::new(data, offset);
+
+    // Search for magic markers in the entire spec data
+    let search_end = (offset + 0x70000).min(data.len()); // Search full spec size
+    let spec_data = &data[offset..search_end];
+
+    // Find magic markers relative to the spec start
+    let magic_offsets = magic_finder::MagicOffsets::find_in_data(spec_data, offset);
+
+    // Read fields in exact order of struct kernel_api_spec
+
+    // Read name (128 bytes)
+    let name = reader
+        .read_cstring(sizes::NAME)
+        .ok_or_else(|| anyhow::anyhow!("Failed to read API name"))?;
+
+    // Determine API type
+    let api_type = if name.starts_with("sys_") {
+        "syscall"
+    } else if name.ends_with("_ioctl") {
+        "ioctl"
+    } else if name.contains("sysfs") {
+        "sysfs"
+    } else {
+        "function"
+    }
+    .to_string();
+
+    // Read version (u32)
+    let version = reader.read_u32().map(|v| v.to_string());
+
+    // Read description (512 bytes)
+    let description = reader.read_cstring(sizes::DESC).filter(|s| !s.is_empty());
+
+    // Read long_description (2048 bytes)
+    let long_description = reader
+        .read_cstring(sizes::DESC * 4)
+        .filter(|s| !s.is_empty());
+
+    // Read context_flags (u32)
+    let context_flags = parse_context_flags(&mut reader);
+
+    // Parse params array
+    let parameters = parse_array_with_magic(
+        &mut reader,
+        magic_offsets.param_offset,
+        sizes::MAX_PARAMS as u32,
+        |r| parse_param(r, 0),  // Index doesn't seem to be used in parse_param
+    );
+
+    // Read return_spec
+    let return_spec = parse_return_spec(&mut reader);
+
+    // Parse errors array
+    let errors = parse_array_with_magic(
+        &mut reader,
+        magic_offsets.error_offset,
+        sizes::MAX_ERRORS as u32,
+        parse_error,
+    );
+
+    // Parse locks array
+    let locks = parse_array_with_magic(
+        &mut reader,
+        magic_offsets.lock_offset,
+        sizes::MAX_CONSTRAINTS as u32,
+        parse_lock,
+    );
+
+    // Parse constraints array
+    let constraints = parse_array_with_magic(
+        &mut reader,
+        magic_offsets.constraint_offset,
+        sizes::MAX_CONSTRAINTS as u32,
+        parse_constraint,
+    );
+
+    // Read examples and notes - position reader at info section if magic found
+    let (examples, notes) = if let Some(info_offset) = magic_offsets.info_offset {
+        reader.pos = info_offset + 4; // +4 to skip magic
+        let examples = reader.read_cstring(sizes::DESC * 2).filter(|s| !s.is_empty());
+        let notes = reader.read_cstring(sizes::DESC * 2).filter(|s| !s.is_empty());
+        (examples, notes)
+    } else {
+        let examples = reader.read_cstring(sizes::DESC * 2).filter(|s| !s.is_empty());
+        let notes = reader.read_cstring(sizes::DESC * 2).filter(|s| !s.is_empty());
+        (examples, notes)
+    };
+
+    // Read since_version (32 bytes)
+    let since_version = reader.read_cstring(32).filter(|s| !s.is_empty());
+
+    // Skip deprecated (bool = 1 byte + 3 bytes padding) and replacement (128 bytes)
+    // These fields were removed from kernel but we need to skip them for binary compatibility
+    reader.skip(4); // deprecated + padding
+    reader.discard_cstring(sizes::NAME); // replacement
+
+    // Parse signals array
+    let signals = parse_array_with_magic(
+        &mut reader,
+        magic_offsets.signal_offset,
+        sizes::MAX_SIGNALS as u32,
+        parse_signal,
+    );
+
+    // Read signal_mask_count (u32)
+    let signal_mask_count = reader.read_u32();
+
+    // Parse signal_masks array
+    let mut signal_masks = Vec::new();
+    if let Some(count) = signal_mask_count {
+        for i in 0..sizes::MAX_SIGNALS {
+            if i < count as usize {
+                if let Some(mask) = parse_signal_mask(&mut reader) {
+                    signal_masks.push(mask);
+                }
+            } else {
+                reader.skip(signal_mask_spec_layout_size());
+            }
+        }
+    } else {
+        reader.skip(signal_mask_spec_layout_size() * sizes::MAX_SIGNALS);
+    }
+
+    // Parse struct_specs array
+    let struct_specs = parse_array_with_magic(
+        &mut reader,
+        magic_offsets.struct_offset,
+        sizes::MAX_STRUCT_SPECS as u32,
+        parse_struct_spec,
+    );
+
+    // According to the C struct, the order is:
+    // side_effect_count, side_effects array, state_trans_count, state_transitions array,
+    // capability_count, capabilities array
+
+    // Parse side_effects array
+    let side_effects = parse_array_with_magic(
+        &mut reader,
+        magic_offsets.effect_offset,
+        sizes::MAX_SIDE_EFFECTS as u32,
+        parse_side_effect,
+    );
+
+    // Parse state_transitions array
+    let state_transitions = parse_array_with_magic(
+        &mut reader,
+        magic_offsets.trans_offset,
+        sizes::MAX_STATE_TRANS as u32,
+        parse_state_transition,
+    );
+
+    // Parse capabilities array
+    let capabilities = parse_array_with_magic(
+        &mut reader,
+        magic_offsets.cap_offset,
+        sizes::MAX_CAPABILITIES as u32,
+        parse_capability,
+    );
+
+    // Skip remaining network/socket fields
+    reader.skip(
+        socket_state_spec_layout_size() +
+        protocol_behavior_spec_layout_size() * sizes::MAX_PROTOCOL_BEHAVIORS +
+        4 + // protocol_behavior_count
+        buffer_spec_layout_size() +
+        async_spec_layout_size() +
+        addr_family_spec_layout_size() * sizes::MAX_ADDR_FAMILIES +
+        4 + // addr_family_count
+        6 + 2 + // 6 bool flags + padding
+        sizes::DESC * 3 // 3 semantic descriptions
+    );
+
+    Ok(ApiSpec {
+        name,
+        api_type,
+        description,
+        long_description,
+        version,
+        context_flags,
+        param_count: if parameters.is_empty() { None } else { Some(parameters.len() as u32) },
+        error_count: if errors.is_empty() { None } else { Some(errors.len() as u32) },
+        examples,
+        notes,
+        since_version,
+        subsystem: None,
+        sysfs_path: None,
+        permissions: None,
+        socket_state: None,
+        protocol_behaviors: vec![],
+        addr_families: vec![],
+        buffer_spec: None,
+        async_spec: None,
+        net_data_transfer: None,
+        capabilities,
+        parameters,
+        return_spec,
+        errors,
+        signals,
+        signal_masks,
+        side_effects,
+        state_transitions,
+        constraints,
+        locks,
+        struct_specs,
+    })
+}
+
+// Helper parsing functions
+
+fn parse_context_flags(reader: &mut DataReader) -> Vec<String> {
+    const KAPI_CTX_PROCESS: u32 = 1 << 0;
+    const KAPI_CTX_SOFTIRQ: u32 = 1 << 1;
+    const KAPI_CTX_HARDIRQ: u32 = 1 << 2;
+    const KAPI_CTX_NMI: u32 = 1 << 3;
+    const KAPI_CTX_ATOMIC: u32 = 1 << 4;
+    const KAPI_CTX_SLEEPABLE: u32 = 1 << 5;
+    const KAPI_CTX_PREEMPT_DISABLED: u32 = 1 << 6;
+    const KAPI_CTX_IRQ_DISABLED: u32 = 1 << 7;
+
+    if let Some(flags) = reader.read_u32() {
+        let mut parts = Vec::new();
+
+        if flags & KAPI_CTX_PROCESS != 0 {
+            parts.push("KAPI_CTX_PROCESS");
+        }
+        if flags & KAPI_CTX_SOFTIRQ != 0 {
+            parts.push("KAPI_CTX_SOFTIRQ");
+        }
+        if flags & KAPI_CTX_HARDIRQ != 0 {
+            parts.push("KAPI_CTX_HARDIRQ");
+        }
+        if flags & KAPI_CTX_NMI != 0 {
+            parts.push("KAPI_CTX_NMI");
+        }
+        if flags & KAPI_CTX_ATOMIC != 0 {
+            parts.push("KAPI_CTX_ATOMIC");
+        }
+        if flags & KAPI_CTX_SLEEPABLE != 0 {
+            parts.push("KAPI_CTX_SLEEPABLE");
+        }
+        if flags & KAPI_CTX_PREEMPT_DISABLED != 0 {
+            parts.push("KAPI_CTX_PREEMPT_DISABLED");
+        }
+        if flags & KAPI_CTX_IRQ_DISABLED != 0 {
+            parts.push("KAPI_CTX_IRQ_DISABLED");
+        }
+
+        if !parts.is_empty() {
+            vec![parts.join(" | ")]
+        } else {
+            vec![]
+        }
+    } else {
+        vec![]
+    }
+}
+
+fn parse_param(reader: &mut DataReader, index: usize) -> Option<ParamSpec> {
+    let name = reader.read_cstring(sizes::NAME)?;
+    let type_name = reader.read_cstring(sizes::NAME)?;
+    let param_type = reader.read_u32()?;
+    let flags = reader.read_u32()?;
+    let size = reader.read_usize()?;
+    let alignment = reader.read_usize()?;
+    let min_value = reader.read_i64()?;
+    let max_value = reader.read_i64()?;
+    let valid_mask = reader.read_u64()?;
+
+    // Skip enum_values pointer (8 bytes)
+    reader.skip(8);
+    let _enum_count = reader.read_u32()?; // Must use ? to propagate errors
+    let constraint_type = reader.read_u32()?;
+    // Skip validate function pointer (8 bytes)
+    reader.skip(8);
+
+    let description = reader.read_string_or_default(sizes::DESC);
+    let constraint = reader.read_optional_string(sizes::DESC);
+    let _size_param_idx = reader.read_i32()?; // Must use ? to propagate errors
+    let _size_multiplier = reader.read_usize()?; // Must use ? to propagate errors
+
+    Some(ParamSpec {
+        index: index as u32,
+        name,
+        type_name,
+        description,
+        flags,
+        param_type,
+        constraint_type,
+        constraint,
+        min_value: Some(min_value),
+        max_value: Some(max_value),
+        valid_mask: Some(valid_mask),
+        enum_values: vec![],
+        size: Some(size as u32),
+        alignment: Some(alignment as u32),
+    })
+}
+
+fn parse_return_spec(reader: &mut DataReader) -> Option<ReturnSpec> {
+    // Read type_name, but treat empty as valid (will be empty string)
+    let type_name = reader.read_string_or_default(sizes::NAME);
+
+    // Read return_type and check_type
+    let return_type = reader.read_u32().unwrap_or(0);
+    let check_type = reader.read_u32().unwrap_or(0);
+    let success_value = reader.read_i64().unwrap_or(0);
+    let success_min = reader.read_i64().unwrap_or(0);
+    let success_max = reader.read_i64().unwrap_or(0);
+
+    // Skip error_values pointer (8 bytes)
+    reader.skip(8);
+    let _error_count = reader.read_u32().unwrap_or(0); // Don't fail on return spec
+    // Skip is_success function pointer (8 bytes)
+    reader.skip(8);
+
+    let description = reader.read_string_or_default(sizes::DESC);
+
+    // Return a spec even if type_name is empty, as long as we have some data
+    // The type_name might be a string like "KAPI_TYPE_INT" that gets stored literally
+    if type_name.is_empty() && return_type == 0 && check_type == 0 && success_value == 0 {
+        // No return spec at all
+        return None;
+    }
+
+    Some(ReturnSpec {
+        type_name,
+        description,
+        return_type,
+        check_type,
+        success_value: Some(success_value),
+        success_min: Some(success_min),
+        success_max: Some(success_max),
+        error_values: vec![],
+    })
+}
+
+fn parse_error(reader: &mut DataReader) -> Option<ErrorSpec> {
+    let error_code = reader.read_i32()?;
+    let name = reader.read_cstring(sizes::NAME)?;
+    let condition = reader.read_string_or_default(sizes::DESC);
+    let description = reader.read_string_or_default(sizes::DESC);
+
+    Some(ErrorSpec {
+        error_code,
+        name,
+        condition,
+        description,
+    })
+}
+
+fn parse_lock(reader: &mut DataReader) -> Option<LockSpec> {
+    let lock_name = reader.read_cstring(sizes::NAME)?;
+    let lock_type = reader.read_u32()?;
+    let [acquired, released, held_on_entry, held_on_exit] = reader.read_bools()?;
+    let description = reader.read_string_or_default(sizes::DESC);
+
+    Some(LockSpec {
+        lock_name,
+        lock_type,
+        acquired,
+        released,
+        held_on_entry,
+        held_on_exit,
+        description,
+    })
+}
+
+fn parse_constraint(reader: &mut DataReader) -> Option<ConstraintSpec> {
+    let name = reader.read_cstring(sizes::NAME)?;
+    let description = reader.read_string_or_default(sizes::DESC);
+    let expression = reader.read_string_or_default(sizes::DESC);
+
+    // No function pointer in packed struct
+
+    Some(ConstraintSpec {
+        name,
+        description,
+        expression: opt_string(expression),
+    })
+}
+
+fn parse_signal(reader: &mut DataReader) -> Option<SignalSpec> {
+    let signal_num = reader.read_i32()?;
+    let signal_name = reader.read_cstring(32)?; // signal_name[32]
+    let direction = reader.read_u32()?;
+    let action = reader.read_u32()?;
+    let target = reader.read_optional_string(sizes::DESC); // target[512]
+    let condition = reader.read_optional_string(sizes::DESC); // condition[512]
+    let description = reader.read_optional_string(sizes::DESC); // description[512]
+    let restartable = reader.read_bool()?;
+    let sa_flags_required = reader.read_u32()?;
+    let sa_flags_forbidden = reader.read_u32()?;
+    let error_on_signal = reader.read_i32()?;
+    let _transform_to = reader.read_i32()?; // transform_to
+    let timing_bytes = reader.read_bytes(32)?; // timing[32]
+    let timing = if let Some(end) = timing_bytes.iter().position(|&b| b == 0) {
+        String::from_utf8_lossy(&timing_bytes[..end]).parse().unwrap_or(0)
+    } else {
+        0
+    };
+    let priority = reader.read_u8()?;
+    let interruptible = reader.read_bool()?;
+    let _queue_behavior = reader.read_bytes(128)?; // queue_behavior[128]
+    let state_required = reader.read_u32()?;
+    let state_forbidden = reader.read_u32()?;
+
+    Some(SignalSpec {
+        signal_num,
+        signal_name,
+        direction,
+        action,
+        target,
+        condition,
+        description,
+        timing,
+        priority: priority as u32,
+        restartable,
+        interruptible,
+        queue: None, // queue_behavior not exposed in SignalSpec
+        sa_flags: 0, // Not directly available
+        sa_flags_required,
+        sa_flags_forbidden,
+        state_required,
+        state_forbidden,
+        error_on_signal: Some(error_on_signal),
+    })
+}
+
+fn parse_signal_mask(reader: &mut DataReader) -> Option<SignalMaskSpec> {
+    let name = reader.read_cstring(sizes::NAME)?;
+    let description = reader.read_string_or_default(sizes::DESC);
+
+    // Skip signals array
+    for _ in 0..sizes::MAX_SIGNALS {
+        reader.read_i32();
+    }
+
+    let _signal_count = reader.read_u32()?;
+
+    Some(SignalMaskSpec {
+        name,
+        description,
+    })
+}
+
+fn parse_struct_field(reader: &mut DataReader) -> Option<StructFieldSpec> {
+    let name = reader.read_cstring(sizes::NAME)?;
+    let field_type = reader.read_u32()?;
+    let type_name = reader.read_cstring(sizes::NAME)?;
+    let offset = reader.read_usize()?;
+    let size = reader.read_usize()?;
+    let flags = reader.read_u32()?;
+    let constraint_type = reader.read_u32()?;
+    let min_value = reader.read_i64()?;
+    let max_value = reader.read_i64()?;
+    let valid_mask = reader.read_u64()?;
+    // Skip enum_values field (512 bytes)
+    let _enum_values = reader.read_cstring(sizes::DESC); // Don't fail on optional field
+    let description = reader.read_string_or_default(sizes::DESC);
+
+    Some(StructFieldSpec {
+        name,
+        field_type,
+        type_name,
+        offset,
+        size,
+        flags,
+        constraint_type,
+        min_value,
+        max_value,
+        valid_mask,
+        description,
+    })
+}
+
+fn parse_struct_spec(reader: &mut DataReader) -> Option<StructSpec> {
+    let name = reader.read_cstring(sizes::NAME)?;
+    let size = reader.read_usize()?;
+    let alignment = reader.read_usize()?;
+    let field_count = reader.read_u32()?;
+
+    // Parse fields array
+    let mut fields = Vec::new();
+    for _ in 0..field_count.min(sizes::MAX_PARAMS as u32) {
+        if let Some(field) = parse_struct_field(reader) {
+            fields.push(field);
+        } else {
+            // Skip this field if we can't parse it
+            reader.skip(struct_field_layout_size());
+        }
+    }
+
+    // Skip remaining fields if any
+    let remaining = sizes::MAX_PARAMS as u32 - field_count.min(sizes::MAX_PARAMS as u32);
+    for _ in 0..remaining {
+        reader.skip(struct_field_layout_size());
+    }
+
+    let description = reader.read_string_or_default(sizes::DESC);
+
+    Some(StructSpec {
+        name,
+        size,
+        alignment,
+        field_count,
+        fields,
+        description,
+    })
+}
+
+fn parse_side_effect(reader: &mut DataReader) -> Option<SideEffectSpec> {
+    let effect_type = reader.read_u32()?;
+    let target = reader.read_cstring(sizes::NAME)?;
+    let condition = reader.read_string_or_default(sizes::DESC);
+    let description = reader.read_string_or_default(sizes::DESC);
+    let reversible = reader.read_bool()?;
+    // No padding needed for packed struct
+
+    Some(SideEffectSpec {
+        effect_type,
+        target,
+        condition: opt_string(condition),
+        description,
+        reversible,
+    })
+}
+
+fn parse_state_transition(reader: &mut DataReader) -> Option<StateTransitionSpec> {
+    let from_state = reader.read_cstring(sizes::NAME)?;
+    let to_state = reader.read_cstring(sizes::NAME)?;
+    let condition = reader.read_string_or_default(sizes::DESC);
+    let object = reader.read_cstring(sizes::NAME)?;
+    let description = reader.read_string_or_default(sizes::DESC);
+
+    Some(StateTransitionSpec {
+        object,
+        from_state,
+        to_state,
+        condition: opt_string(condition),
+        description,
+    })
+}
+
+fn parse_capability(reader: &mut DataReader) -> Option<CapabilitySpec> {
+    let capability = reader.read_i32()?;
+    let cap_name = reader.read_cstring(sizes::NAME)?;
+    let action = reader.read_u32()?;
+    let allows = reader.read_string_or_default(sizes::DESC);
+    let without_cap = reader.read_string_or_default(sizes::DESC);
+    let check_condition = reader.read_optional_string(sizes::DESC);
+    let priority = reader.read_u32()?;
+
+    let mut alternatives = Vec::new();
+    for _ in 0..sizes::MAX_CAPABILITIES {
+        if let Some(alt) = reader.read_i32() {
+            if alt != 0 {
+                alternatives.push(alt);
+            }
+        }
+    }
+
+    let _alternative_count = reader.read_u32()?; // alternative_count
+
+    Some(CapabilitySpec {
+        capability,
+        name: cap_name,
+        action: action.to_string(),
+        allows,
+        without_cap,
+        check_condition,
+        priority: Some(priority as u8),
+        alternatives,
+    })
+}
\ No newline at end of file
diff --git a/tools/kapi/src/formatter/json.rs b/tools/kapi/src/formatter/json.rs
new file mode 100644
index 000000000000..8025467409d6
--- /dev/null
+++ b/tools/kapi/src/formatter/json.rs
@@ -0,0 +1,468 @@
+use super::OutputFormatter;
+use crate::extractor::{
+    AddrFamilySpec, AsyncSpec, BufferSpec, CapabilitySpec, ConstraintSpec, ErrorSpec, LockSpec,
+    ParamSpec, ProtocolBehaviorSpec, ReturnSpec, SideEffectSpec, SignalMaskSpec, SignalSpec,
+    SocketStateSpec, StateTransitionSpec, StructSpec,
+};
+use serde::Serialize;
+use std::io::Write;
+
+pub struct JsonFormatter {
+    data: JsonData,
+}
+
+#[derive(Serialize)]
+struct JsonData {
+    #[serde(skip_serializing_if = "Option::is_none")]
+    apis: Option<Vec<JsonApi>>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    api_details: Option<JsonApiDetails>,
+}
+
+#[derive(Serialize)]
+struct JsonApi {
+    name: String,
+    api_type: String,
+}
+
+#[derive(Serialize)]
+struct JsonApiDetails {
+    name: String,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    description: Option<String>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    long_description: Option<String>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    context_flags: Vec<String>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    examples: Option<String>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    notes: Option<String>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    since_version: Option<String>,
+    // Sysfs-specific fields
+    #[serde(skip_serializing_if = "Option::is_none")]
+    subsystem: Option<String>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    sysfs_path: Option<String>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    permissions: Option<String>,
+    // Networking-specific fields
+    #[serde(skip_serializing_if = "Option::is_none")]
+    socket_state: Option<SocketStateSpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    protocol_behaviors: Vec<ProtocolBehaviorSpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    addr_families: Vec<AddrFamilySpec>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    buffer_spec: Option<BufferSpec>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    async_spec: Option<AsyncSpec>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    net_data_transfer: Option<String>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    capabilities: Vec<CapabilitySpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    state_transitions: Vec<StateTransitionSpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    side_effects: Vec<SideEffectSpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    parameters: Vec<ParamSpec>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    return_spec: Option<ReturnSpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    errors: Vec<ErrorSpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    locks: Vec<LockSpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    struct_specs: Vec<StructSpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    signals: Vec<SignalSpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    signal_masks: Vec<SignalMaskSpec>,
+    #[serde(skip_serializing_if = "Vec::is_empty")]
+    constraints: Vec<ConstraintSpec>,
+}
+
+impl JsonFormatter {
+    pub fn new() -> Self {
+        JsonFormatter {
+            data: JsonData {
+                apis: None,
+                api_details: None,
+            },
+        }
+    }
+}
+
+impl OutputFormatter for JsonFormatter {
+    fn begin_document(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn end_document(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        let json = serde_json::to_string_pretty(&self.data)?;
+        writeln!(w, "{json}")?;
+        Ok(())
+    }
+
+    fn begin_api_list(&mut self, _w: &mut dyn Write, _title: &str) -> std::io::Result<()> {
+        self.data.apis = Some(Vec::new());
+        Ok(())
+    }
+
+    fn api_item(&mut self, _w: &mut dyn Write, name: &str, api_type: &str) -> std::io::Result<()> {
+        if let Some(apis) = &mut self.data.apis {
+            apis.push(JsonApi {
+                name: name.to_string(),
+                api_type: api_type.to_string(),
+            });
+        }
+        Ok(())
+    }
+
+    fn end_api_list(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn total_specs(&mut self, _w: &mut dyn Write, _count: usize) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_api_details(&mut self, _w: &mut dyn Write, name: &str) -> std::io::Result<()> {
+        self.data.api_details = Some(JsonApiDetails {
+            name: name.to_string(),
+            description: None,
+            long_description: None,
+            context_flags: Vec::new(),
+            examples: None,
+            notes: None,
+            since_version: None,
+            subsystem: None,
+            sysfs_path: None,
+            permissions: None,
+            socket_state: None,
+            protocol_behaviors: Vec::new(),
+            addr_families: Vec::new(),
+            buffer_spec: None,
+            async_spec: None,
+            net_data_transfer: None,
+            capabilities: Vec::new(),
+            state_transitions: Vec::new(),
+            side_effects: Vec::new(),
+            parameters: Vec::new(),
+            return_spec: None,
+            errors: Vec::new(),
+            locks: Vec::new(),
+            struct_specs: Vec::new(),
+            signals: Vec::new(),
+            signal_masks: Vec::new(),
+            constraints: Vec::new(),
+        });
+        Ok(())
+    }
+
+    fn end_api_details(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn description(&mut self, _w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.description = Some(desc.to_string());
+        }
+        Ok(())
+    }
+
+    fn long_description(&mut self, _w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.long_description = Some(desc.to_string());
+        }
+        Ok(())
+    }
+
+    fn begin_context_flags(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn context_flag(&mut self, _w: &mut dyn Write, flag: &str) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.context_flags.push(flag.to_string());
+        }
+        Ok(())
+    }
+
+    fn end_context_flags(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_parameters(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn end_parameters(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_errors(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn end_errors(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn examples(&mut self, _w: &mut dyn Write, examples: &str) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.examples = Some(examples.to_string());
+        }
+        Ok(())
+    }
+
+    fn notes(&mut self, _w: &mut dyn Write, notes: &str) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.notes = Some(notes.to_string());
+        }
+        Ok(())
+    }
+
+    fn since_version(&mut self, _w: &mut dyn Write, version: &str) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.since_version = Some(version.to_string());
+        }
+        Ok(())
+    }
+
+    fn sysfs_subsystem(&mut self, _w: &mut dyn Write, subsystem: &str) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.subsystem = Some(subsystem.to_string());
+        }
+        Ok(())
+    }
+
+    fn sysfs_path(&mut self, _w: &mut dyn Write, path: &str) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.sysfs_path = Some(path.to_string());
+        }
+        Ok(())
+    }
+
+    fn sysfs_permissions(&mut self, _w: &mut dyn Write, perms: &str) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.permissions = Some(perms.to_string());
+        }
+        Ok(())
+    }
+
+    // Networking-specific methods
+    fn socket_state(&mut self, _w: &mut dyn Write, state: &SocketStateSpec) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.socket_state = Some(state.clone());
+        }
+        Ok(())
+    }
+
+    fn begin_protocol_behaviors(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn protocol_behavior(
+        &mut self,
+        _w: &mut dyn Write,
+        behavior: &ProtocolBehaviorSpec,
+    ) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.protocol_behaviors.push(behavior.clone());
+        }
+        Ok(())
+    }
+
+    fn end_protocol_behaviors(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_addr_families(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn addr_family(&mut self, _w: &mut dyn Write, family: &AddrFamilySpec) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.addr_families.push(family.clone());
+        }
+        Ok(())
+    }
+
+    fn end_addr_families(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn buffer_spec(&mut self, _w: &mut dyn Write, spec: &BufferSpec) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.buffer_spec = Some(spec.clone());
+        }
+        Ok(())
+    }
+
+    fn async_spec(&mut self, _w: &mut dyn Write, spec: &AsyncSpec) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.async_spec = Some(spec.clone());
+        }
+        Ok(())
+    }
+
+    fn net_data_transfer(&mut self, _w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.net_data_transfer = Some(desc.to_string());
+        }
+        Ok(())
+    }
+
+    fn begin_capabilities(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn capability(&mut self, _w: &mut dyn Write, cap: &CapabilitySpec) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.capabilities.push(cap.clone());
+        }
+        Ok(())
+    }
+
+    fn end_capabilities(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    // Stub implementations for new methods
+    fn parameter(&mut self, _w: &mut dyn Write, param: &ParamSpec) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.parameters.push(param.clone());
+        }
+        Ok(())
+    }
+
+    fn return_spec(&mut self, _w: &mut dyn Write, ret: &ReturnSpec) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.return_spec = Some(ret.clone());
+        }
+        Ok(())
+    }
+
+    fn error(&mut self, _w: &mut dyn Write, error: &ErrorSpec) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.errors.push(error.clone());
+        }
+        Ok(())
+    }
+
+    fn begin_signals(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn signal(&mut self, _w: &mut dyn Write, signal: &SignalSpec) -> std::io::Result<()> {
+        if let Some(api_details) = &mut self.data.api_details {
+            api_details.signals.push(signal.clone());
+        }
+        Ok(())
+    }
+
+    fn end_signals(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_signal_masks(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn signal_mask(&mut self, _w: &mut dyn Write, mask: &SignalMaskSpec) -> std::io::Result<()> {
+        if let Some(api_details) = &mut self.data.api_details {
+            api_details.signal_masks.push(mask.clone());
+        }
+        Ok(())
+    }
+
+    fn end_signal_masks(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_side_effects(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn side_effect(&mut self, _w: &mut dyn Write, effect: &SideEffectSpec) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.side_effects.push(effect.clone());
+        }
+        Ok(())
+    }
+
+    fn end_side_effects(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_state_transitions(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn state_transition(
+        &mut self,
+        _w: &mut dyn Write,
+        trans: &StateTransitionSpec,
+    ) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.state_transitions.push(trans.clone());
+        }
+        Ok(())
+    }
+
+    fn end_state_transitions(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_constraints(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn constraint(
+        &mut self,
+        _w: &mut dyn Write,
+        constraint: &ConstraintSpec,
+    ) -> std::io::Result<()> {
+        if let Some(api_details) = &mut self.data.api_details {
+            api_details.constraints.push(constraint.clone());
+        }
+        Ok(())
+    }
+
+    fn end_constraints(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_locks(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn lock(&mut self, _w: &mut dyn Write, lock: &LockSpec) -> std::io::Result<()> {
+        if let Some(details) = &mut self.data.api_details {
+            details.locks.push(lock.clone());
+        }
+        Ok(())
+    }
+
+    fn end_locks(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_struct_specs(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn struct_spec(&mut self, _w: &mut dyn Write, spec: &StructSpec) -> std::io::Result<()> {
+        if let Some(ref mut details) = self.data.api_details {
+            details.struct_specs.push(spec.clone());
+        }
+        Ok(())
+    }
+
+    fn end_struct_specs(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+}
diff --git a/tools/kapi/src/formatter/mod.rs b/tools/kapi/src/formatter/mod.rs
new file mode 100644
index 000000000000..d799ff0ba971
--- /dev/null
+++ b/tools/kapi/src/formatter/mod.rs
@@ -0,0 +1,145 @@
+use crate::extractor::{
+    AddrFamilySpec, AsyncSpec, BufferSpec, CapabilitySpec, ConstraintSpec, ErrorSpec, LockSpec,
+    ParamSpec, ProtocolBehaviorSpec, ReturnSpec, SideEffectSpec, SignalMaskSpec, SignalSpec,
+    SocketStateSpec, StateTransitionSpec, StructSpec,
+};
+use std::io::Write;
+
+mod json;
+mod plain;
+mod rst;
+mod shall;
+
+pub use json::JsonFormatter;
+pub use plain::PlainFormatter;
+pub use rst::RstFormatter;
+pub use shall::ShallFormatter;
+
+#[derive(Debug, Clone, Copy, PartialEq)]
+pub enum OutputFormat {
+    Plain,
+    Json,
+    Rst,
+    Shall,
+}
+
+impl std::str::FromStr for OutputFormat {
+    type Err = String;
+
+    fn from_str(s: &str) -> Result<Self, Self::Err> {
+        match s.to_lowercase().as_str() {
+            "plain" => Ok(OutputFormat::Plain),
+            "json" => Ok(OutputFormat::Json),
+            "rst" => Ok(OutputFormat::Rst),
+            "shall" => Ok(OutputFormat::Shall),
+            _ => Err(format!("Unknown output format: {}", s)),
+        }
+    }
+}
+
+pub trait OutputFormatter {
+    fn begin_document(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+    fn end_document(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn begin_api_list(&mut self, w: &mut dyn Write, title: &str) -> std::io::Result<()>;
+    fn api_item(&mut self, w: &mut dyn Write, name: &str, api_type: &str) -> std::io::Result<()>;
+    fn end_api_list(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn total_specs(&mut self, w: &mut dyn Write, count: usize) -> std::io::Result<()>;
+
+    fn begin_api_details(&mut self, w: &mut dyn Write, name: &str) -> std::io::Result<()>;
+    fn end_api_details(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+    fn description(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()>;
+    fn long_description(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()>;
+
+    fn begin_context_flags(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+    fn context_flag(&mut self, w: &mut dyn Write, flag: &str) -> std::io::Result<()>;
+    fn end_context_flags(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn begin_parameters(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()>;
+    fn parameter(&mut self, w: &mut dyn Write, param: &ParamSpec) -> std::io::Result<()>;
+    fn end_parameters(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn return_spec(&mut self, w: &mut dyn Write, ret: &ReturnSpec) -> std::io::Result<()>;
+
+    fn begin_errors(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()>;
+    fn error(&mut self, w: &mut dyn Write, error: &ErrorSpec) -> std::io::Result<()>;
+    fn end_errors(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn examples(&mut self, w: &mut dyn Write, examples: &str) -> std::io::Result<()>;
+    fn notes(&mut self, w: &mut dyn Write, notes: &str) -> std::io::Result<()>;
+    fn since_version(&mut self, w: &mut dyn Write, version: &str) -> std::io::Result<()>;
+
+    // Sysfs-specific methods
+    fn sysfs_subsystem(&mut self, w: &mut dyn Write, subsystem: &str) -> std::io::Result<()>;
+    fn sysfs_path(&mut self, w: &mut dyn Write, path: &str) -> std::io::Result<()>;
+    fn sysfs_permissions(&mut self, w: &mut dyn Write, perms: &str) -> std::io::Result<()>;
+
+    // Networking-specific methods
+    fn socket_state(&mut self, w: &mut dyn Write, state: &SocketStateSpec) -> std::io::Result<()>;
+
+    fn begin_protocol_behaviors(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+    fn protocol_behavior(
+        &mut self,
+        w: &mut dyn Write,
+        behavior: &ProtocolBehaviorSpec,
+    ) -> std::io::Result<()>;
+    fn end_protocol_behaviors(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn begin_addr_families(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+    fn addr_family(&mut self, w: &mut dyn Write, family: &AddrFamilySpec) -> std::io::Result<()>;
+    fn end_addr_families(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn buffer_spec(&mut self, w: &mut dyn Write, spec: &BufferSpec) -> std::io::Result<()>;
+    fn async_spec(&mut self, w: &mut dyn Write, spec: &AsyncSpec) -> std::io::Result<()>;
+    fn net_data_transfer(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()>;
+
+    fn begin_capabilities(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+    fn capability(&mut self, w: &mut dyn Write, cap: &CapabilitySpec) -> std::io::Result<()>;
+    fn end_capabilities(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    // Signal-related methods
+    fn begin_signals(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()>;
+    fn signal(&mut self, w: &mut dyn Write, signal: &SignalSpec) -> std::io::Result<()>;
+    fn end_signals(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn begin_signal_masks(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()>;
+    fn signal_mask(&mut self, w: &mut dyn Write, mask: &SignalMaskSpec) -> std::io::Result<()>;
+    fn end_signal_masks(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    // Side effects and state transitions
+    fn begin_side_effects(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()>;
+    fn side_effect(&mut self, w: &mut dyn Write, effect: &SideEffectSpec) -> std::io::Result<()>;
+    fn end_side_effects(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn begin_state_transitions(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()>;
+    fn state_transition(
+        &mut self,
+        w: &mut dyn Write,
+        trans: &StateTransitionSpec,
+    ) -> std::io::Result<()>;
+    fn end_state_transitions(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    // Constraints and locks
+    fn begin_constraints(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()>;
+    fn constraint(&mut self, w: &mut dyn Write, constraint: &ConstraintSpec)
+    -> std::io::Result<()>;
+    fn end_constraints(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn begin_locks(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()>;
+    fn lock(&mut self, w: &mut dyn Write, lock: &LockSpec) -> std::io::Result<()>;
+    fn end_locks(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+
+    fn begin_struct_specs(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()>;
+    fn struct_spec(&mut self, w: &mut dyn Write, spec: &StructSpec) -> std::io::Result<()>;
+    fn end_struct_specs(&mut self, w: &mut dyn Write) -> std::io::Result<()>;
+}
+
+pub fn create_formatter(format: OutputFormat) -> Box<dyn OutputFormatter> {
+    match format {
+        OutputFormat::Plain => Box::new(PlainFormatter::new()),
+        OutputFormat::Json => Box::new(JsonFormatter::new()),
+        OutputFormat::Rst => Box::new(RstFormatter::new()),
+        OutputFormat::Shall => Box::new(ShallFormatter::new()),
+    }
+}
diff --git a/tools/kapi/src/formatter/plain.rs b/tools/kapi/src/formatter/plain.rs
new file mode 100644
index 000000000000..4c2d930aac94
--- /dev/null
+++ b/tools/kapi/src/formatter/plain.rs
@@ -0,0 +1,558 @@
+use super::OutputFormatter;
+use crate::extractor::{
+    AddrFamilySpec, AsyncSpec, BufferSpec, CapabilitySpec, ConstraintSpec, ErrorSpec, LockSpec,
+    ParamSpec, ProtocolBehaviorSpec, ReturnSpec, SideEffectSpec, SignalMaskSpec, SignalSpec,
+    SocketStateSpec, StateTransitionSpec,
+};
+use std::io::Write;
+
+pub struct PlainFormatter;
+
+impl PlainFormatter {
+    pub fn new() -> Self {
+        PlainFormatter
+    }
+}
+
+impl OutputFormatter for PlainFormatter {
+    fn begin_document(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn end_document(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_api_list(&mut self, w: &mut dyn Write, title: &str) -> std::io::Result<()> {
+        writeln!(w, "\n{title}:")?;
+        writeln!(w, "{}", "-".repeat(title.len() + 1))
+    }
+
+    fn api_item(&mut self, w: &mut dyn Write, name: &str, _api_type: &str) -> std::io::Result<()> {
+        writeln!(w, "  {name}")
+    }
+
+    fn end_api_list(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn total_specs(&mut self, w: &mut dyn Write, count: usize) -> std::io::Result<()> {
+        writeln!(w, "\nTotal specifications found: {count}")
+    }
+
+    fn begin_api_details(&mut self, w: &mut dyn Write, name: &str) -> std::io::Result<()> {
+        writeln!(w, "\nDetailed information for {name}:")?;
+        writeln!(w, "{}=", "=".repeat(25 + name.len()))
+    }
+
+    fn end_api_details(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn description(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        writeln!(w, "Description: {desc}")
+    }
+
+    fn long_description(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        writeln!(w, "\nDetailed Description:")?;
+        writeln!(w, "{desc}")
+    }
+
+    fn begin_context_flags(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        writeln!(w, "\nExecution Context:")
+    }
+
+    fn context_flag(&mut self, w: &mut dyn Write, flag: &str) -> std::io::Result<()> {
+        writeln!(w, "  - {flag}")
+    }
+
+    fn end_context_flags(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_parameters(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nParameters ({count}):")
+    }
+
+    fn parameter(&mut self, w: &mut dyn Write, param: &ParamSpec) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "  [{}] {} ({})",
+            param.index, param.name, param.type_name
+        )?;
+        if !param.description.is_empty() {
+            writeln!(w, "      {}", param.description)?;
+        }
+
+        // Display flags
+        let mut flags = Vec::new();
+        if param.flags & 0x01 != 0 {
+            flags.push("IN");
+        }
+        if param.flags & 0x02 != 0 {
+            flags.push("OUT");
+        }
+        if param.flags & 0x04 != 0 {
+            flags.push("INOUT");
+        }
+        if param.flags & 0x08 != 0 {
+            flags.push("USER");
+        }
+        if param.flags & 0x10 != 0 {
+            flags.push("OPTIONAL");
+        }
+        if !flags.is_empty() {
+            writeln!(w, "      Flags: {}", flags.join(" | "))?;
+        }
+
+        // Display constraints
+        if let Some(constraint) = &param.constraint {
+            writeln!(w, "      Constraint: {constraint}")?;
+        }
+        if let (Some(min), Some(max)) = (param.min_value, param.max_value) {
+            writeln!(w, "      Range: {min} to {max}")?;
+        }
+        if let Some(mask) = param.valid_mask {
+            writeln!(w, "      Valid mask: 0x{mask:x}")?;
+        }
+        Ok(())
+    }
+
+    fn end_parameters(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn return_spec(&mut self, w: &mut dyn Write, ret: &ReturnSpec) -> std::io::Result<()> {
+        writeln!(w, "\nReturn Value:")?;
+        writeln!(w, "  Type: {}", ret.type_name)?;
+        writeln!(w, "  {}", ret.description)?;
+        if let Some(val) = ret.success_value {
+            writeln!(w, "  Success value: {val}")?;
+        }
+        if let (Some(min), Some(max)) = (ret.success_min, ret.success_max) {
+            writeln!(w, "  Success range: {min} to {max}")?;
+        }
+        Ok(())
+    }
+
+    fn begin_errors(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nPossible Errors ({count}):")
+    }
+
+    fn error(&mut self, w: &mut dyn Write, error: &ErrorSpec) -> std::io::Result<()> {
+        writeln!(w, "  {} ({})", error.name, error.error_code)?;
+        if !error.condition.is_empty() {
+            writeln!(w, "      Condition: {}", error.condition)?;
+        }
+        if !error.description.is_empty() {
+            writeln!(w, "      {}", error.description)?;
+        }
+        Ok(())
+    }
+
+    fn end_errors(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn examples(&mut self, w: &mut dyn Write, examples: &str) -> std::io::Result<()> {
+        writeln!(w, "\nExamples:")?;
+        writeln!(w, "{examples}")
+    }
+
+    fn notes(&mut self, w: &mut dyn Write, notes: &str) -> std::io::Result<()> {
+        writeln!(w, "\nNotes:")?;
+        writeln!(w, "{notes}")
+    }
+
+    fn since_version(&mut self, w: &mut dyn Write, version: &str) -> std::io::Result<()> {
+        writeln!(w, "\nAvailable since: {version}")
+    }
+
+    fn sysfs_subsystem(&mut self, w: &mut dyn Write, subsystem: &str) -> std::io::Result<()> {
+        writeln!(w, "Subsystem: {subsystem}")
+    }
+
+    fn sysfs_path(&mut self, w: &mut dyn Write, path: &str) -> std::io::Result<()> {
+        writeln!(w, "Sysfs Path: {path}")
+    }
+
+    fn sysfs_permissions(&mut self, w: &mut dyn Write, perms: &str) -> std::io::Result<()> {
+        writeln!(w, "Permissions: {perms}")
+    }
+
+    // Networking-specific methods
+    fn socket_state(&mut self, w: &mut dyn Write, state: &SocketStateSpec) -> std::io::Result<()> {
+        writeln!(w, "\nSocket State Requirements:")?;
+        if !state.required_states.is_empty() {
+            writeln!(w, "  Required states: {:?}", state.required_states)?;
+        }
+        if !state.forbidden_states.is_empty() {
+            writeln!(w, "  Forbidden states: {:?}", state.forbidden_states)?;
+        }
+        if let Some(result) = &state.resulting_state {
+            writeln!(w, "  Resulting state: {result}")?;
+        }
+        if let Some(cond) = &state.condition {
+            writeln!(w, "  Condition: {cond}")?;
+        }
+        if let Some(protos) = &state.applicable_protocols {
+            writeln!(w, "  Applicable protocols: {protos}")?;
+        }
+        Ok(())
+    }
+
+    fn begin_protocol_behaviors(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        writeln!(w, "\nProtocol-Specific Behaviors:")
+    }
+
+    fn protocol_behavior(
+        &mut self,
+        w: &mut dyn Write,
+        behavior: &ProtocolBehaviorSpec,
+    ) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "  {} - {}",
+            behavior.applicable_protocols, behavior.behavior
+        )?;
+        if let Some(flags) = &behavior.protocol_flags {
+            writeln!(w, "    Flags: {flags}")?;
+        }
+        Ok(())
+    }
+
+    fn end_protocol_behaviors(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_addr_families(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        writeln!(w, "\nSupported Address Families:")
+    }
+
+    fn addr_family(&mut self, w: &mut dyn Write, family: &AddrFamilySpec) -> std::io::Result<()> {
+        writeln!(w, "  {} ({}):", family.family_name, family.family)?;
+        writeln!(w, "    Struct size: {} bytes", family.addr_struct_size)?;
+        writeln!(
+            w,
+            "    Address length: {}-{} bytes",
+            family.min_addr_len, family.max_addr_len
+        )?;
+        if let Some(format) = &family.addr_format {
+            writeln!(w, "    Format: {format}")?;
+        }
+        writeln!(
+            w,
+            "    Features: wildcard={}, multicast={}, broadcast={}",
+            family.supports_wildcard, family.supports_multicast, family.supports_broadcast
+        )?;
+        if let Some(special) = &family.special_addresses {
+            writeln!(w, "    Special addresses: {special}")?;
+        }
+        if family.port_range_max > 0 {
+            writeln!(
+                w,
+                "    Port range: {}-{}",
+                family.port_range_min, family.port_range_max
+            )?;
+        }
+        Ok(())
+    }
+
+    fn end_addr_families(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn buffer_spec(&mut self, w: &mut dyn Write, spec: &BufferSpec) -> std::io::Result<()> {
+        writeln!(w, "\nBuffer Specification:")?;
+        if let Some(behaviors) = &spec.buffer_behaviors {
+            writeln!(w, "  Behaviors: {behaviors}")?;
+        }
+        if let Some(min) = spec.min_buffer_size {
+            writeln!(w, "  Min size: {min} bytes")?;
+        }
+        if let Some(max) = spec.max_buffer_size {
+            writeln!(w, "  Max size: {max} bytes")?;
+        }
+        if let Some(optimal) = spec.optimal_buffer_size {
+            writeln!(w, "  Optimal size: {optimal} bytes")?;
+        }
+        Ok(())
+    }
+
+    fn async_spec(&mut self, w: &mut dyn Write, spec: &AsyncSpec) -> std::io::Result<()> {
+        writeln!(w, "\nAsynchronous Operation:")?;
+        if let Some(modes) = &spec.supported_modes {
+            writeln!(w, "  Supported modes: {modes}")?;
+        }
+        if let Some(errno) = spec.nonblock_errno {
+            writeln!(w, "  Non-blocking errno: {errno}")?;
+        }
+        Ok(())
+    }
+
+    fn net_data_transfer(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        writeln!(w, "\nNetwork Data Transfer: {desc}")
+    }
+
+    fn begin_capabilities(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        writeln!(w, "\nRequired Capabilities:")
+    }
+
+    fn capability(&mut self, w: &mut dyn Write, cap: &CapabilitySpec) -> std::io::Result<()> {
+        writeln!(w, "  {} ({}) - {}", cap.name, cap.capability, cap.action)?;
+        if !cap.allows.is_empty() {
+            writeln!(w, "    Allows: {}", cap.allows)?;
+        }
+        if !cap.without_cap.is_empty() {
+            writeln!(w, "    Without capability: {}", cap.without_cap)?;
+        }
+        if let Some(cond) = &cap.check_condition {
+            writeln!(w, "    Condition: {cond}")?;
+        }
+        Ok(())
+    }
+
+    fn end_capabilities(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    // Signal-related methods
+    fn begin_signals(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nSignal Specifications ({count}):")
+    }
+
+    fn signal(&mut self, w: &mut dyn Write, signal: &SignalSpec) -> std::io::Result<()> {
+        write!(w, "  {} ({})", signal.signal_name, signal.signal_num)?;
+
+        // Display direction
+        let direction = match signal.direction {
+            0 => "SEND",
+            1 => "RECEIVE",
+            2 => "HANDLE",
+            3 => "IGNORE",
+            _ => "UNKNOWN",
+        };
+        write!(w, " - {direction}")?;
+
+        // Display action
+        let action = match signal.action {
+            0 => "DEFAULT",
+            1 => "TERMINATE",
+            2 => "COREDUMP",
+            3 => "STOP",
+            4 => "CONTINUE",
+            5 => "IGNORE",
+            6 => "CUSTOM",
+            7 => "DISCARD",
+            _ => "UNKNOWN",
+        };
+        writeln!(w, " - {action}")?;
+
+        if let Some(target) = &signal.target {
+            writeln!(w, "      Target: {target}")?;
+        }
+        if let Some(condition) = &signal.condition {
+            writeln!(w, "      Condition: {condition}")?;
+        }
+        if let Some(desc) = &signal.description {
+            writeln!(w, "      {desc}")?;
+        }
+
+        // Display timing
+        let timing = match signal.timing {
+            0 => "BEFORE",
+            1 => "DURING",
+            2 => "AFTER",
+            3 => "EXIT",
+            _ => "UNKNOWN",
+        };
+        writeln!(w, "      Timing: {timing}")?;
+        writeln!(w, "      Priority: {}", signal.priority)?;
+
+        if signal.restartable {
+            writeln!(w, "      Restartable: yes")?;
+        }
+        if signal.interruptible {
+            writeln!(w, "      Interruptible: yes")?;
+        }
+        if let Some(error) = signal.error_on_signal {
+            writeln!(w, "      Error on signal: {error}")?;
+        }
+        Ok(())
+    }
+
+    fn end_signals(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_signal_masks(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nSignal Masks ({count}):")
+    }
+
+    fn signal_mask(&mut self, w: &mut dyn Write, mask: &SignalMaskSpec) -> std::io::Result<()> {
+        writeln!(w, "  {}", mask.name)?;
+        if !mask.description.is_empty() {
+            writeln!(w, "      {}", mask.description)?;
+        }
+        Ok(())
+    }
+
+    fn end_signal_masks(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    // Side effects and state transitions
+    fn begin_side_effects(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nSide Effects ({count}):")
+    }
+
+    fn side_effect(&mut self, w: &mut dyn Write, effect: &SideEffectSpec) -> std::io::Result<()> {
+        writeln!(w, "  {} - {}", effect.target, effect.description)?;
+        if let Some(condition) = &effect.condition {
+            writeln!(w, "      Condition: {condition}")?;
+        }
+        if effect.reversible {
+            writeln!(w, "      Reversible: yes")?;
+        }
+        Ok(())
+    }
+
+    fn end_side_effects(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_state_transitions(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nState Transitions ({count}):")
+    }
+
+    fn state_transition(
+        &mut self,
+        w: &mut dyn Write,
+        trans: &StateTransitionSpec,
+    ) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "  {} : {} -> {}",
+            trans.object, trans.from_state, trans.to_state
+        )?;
+        if let Some(condition) = &trans.condition {
+            writeln!(w, "      Condition: {condition}")?;
+        }
+        if !trans.description.is_empty() {
+            writeln!(w, "      {}", trans.description)?;
+        }
+        Ok(())
+    }
+
+    fn end_state_transitions(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    // Constraints and locks
+    fn begin_constraints(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nAdditional Constraints ({count}):")
+    }
+
+    fn constraint(
+        &mut self,
+        w: &mut dyn Write,
+        constraint: &ConstraintSpec,
+    ) -> std::io::Result<()> {
+        writeln!(w, "  {}", constraint.name)?;
+        if !constraint.description.is_empty() {
+            writeln!(w, "      {}", constraint.description)?;
+        }
+        if let Some(expr) = &constraint.expression {
+            writeln!(w, "      Expression: {expr}")?;
+        }
+        Ok(())
+    }
+
+    fn end_constraints(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_locks(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nLocking Requirements ({count}):")
+    }
+
+    fn lock(&mut self, w: &mut dyn Write, lock: &LockSpec) -> std::io::Result<()> {
+        write!(w, "  {}", lock.lock_name)?;
+
+        // Display lock type
+        let lock_type = match lock.lock_type {
+            0 => "NONE",
+            1 => "MUTEX",
+            2 => "SPINLOCK",
+            3 => "RWLOCK",
+            4 => "SEQLOCK",
+            5 => "RCU",
+            6 => "SEMAPHORE",
+            7 => "CUSTOM",
+            _ => "UNKNOWN",
+        };
+        writeln!(w, " ({lock_type})")?;
+
+        let mut actions = Vec::new();
+        if lock.acquired {
+            actions.push("acquired");
+        }
+        if lock.released {
+            actions.push("released");
+        }
+        if lock.held_on_entry {
+            actions.push("held on entry");
+        }
+        if lock.held_on_exit {
+            actions.push("held on exit");
+        }
+
+        if !actions.is_empty() {
+            writeln!(w, "      Actions: {}", actions.join(", "))?;
+        }
+
+        if !lock.description.is_empty() {
+            writeln!(w, "      {}", lock.description)?;
+        }
+        Ok(())
+    }
+
+    fn end_locks(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_struct_specs(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nStructure Specifications ({count}):")
+    }
+
+    fn struct_spec(&mut self, w: &mut dyn Write, spec: &crate::extractor::StructSpec) -> std::io::Result<()> {
+        writeln!(w, "  {} (size={}, align={}):", spec.name, spec.size, spec.alignment)?;
+        if !spec.description.is_empty() {
+            writeln!(w, "      {}", spec.description)?;
+        }
+
+        if !spec.fields.is_empty() {
+            writeln!(w, "      Fields ({}):", spec.field_count)?;
+            for field in &spec.fields {
+                write!(w, "        - {} ({}):", field.name, field.type_name)?;
+                if !field.description.is_empty() {
+                    write!(w, " {}", field.description)?;
+                }
+                writeln!(w)?;
+
+                // Show constraints if present
+                if field.min_value != 0 || field.max_value != 0 {
+                    writeln!(w, "          Range: [{}, {}]", field.min_value, field.max_value)?;
+                }
+                if field.valid_mask != 0 {
+                    writeln!(w, "          Mask: {:#x}", field.valid_mask)?;
+                }
+            }
+        }
+        Ok(())
+    }
+
+    fn end_struct_specs(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+}
diff --git a/tools/kapi/src/formatter/rst.rs b/tools/kapi/src/formatter/rst.rs
new file mode 100644
index 000000000000..51d0be911480
--- /dev/null
+++ b/tools/kapi/src/formatter/rst.rs
@@ -0,0 +1,621 @@
+use super::OutputFormatter;
+use crate::extractor::{
+    AddrFamilySpec, AsyncSpec, BufferSpec, CapabilitySpec, ConstraintSpec, ErrorSpec, LockSpec,
+    ParamSpec, ProtocolBehaviorSpec, ReturnSpec, SideEffectSpec, SignalMaskSpec, SignalSpec,
+    SocketStateSpec, StateTransitionSpec,
+};
+use std::io::Write;
+
+pub struct RstFormatter {
+    current_section_level: usize,
+}
+
+impl RstFormatter {
+    pub fn new() -> Self {
+        RstFormatter {
+            current_section_level: 0,
+        }
+    }
+
+    fn section_char(level: usize) -> char {
+        match level {
+            0 => '=',
+            1 => '-',
+            2 => '~',
+            3 => '^',
+            _ => '"',
+        }
+    }
+}
+
+impl OutputFormatter for RstFormatter {
+    fn begin_document(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn end_document(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_api_list(&mut self, w: &mut dyn Write, title: &str) -> std::io::Result<()> {
+        writeln!(w, "\n{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(0).to_string().repeat(title.len())
+        )?;
+        writeln!(w)
+    }
+
+    fn api_item(&mut self, w: &mut dyn Write, name: &str, api_type: &str) -> std::io::Result<()> {
+        writeln!(w, "* **{name}** (*{api_type}*)")
+    }
+
+    fn end_api_list(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn total_specs(&mut self, w: &mut dyn Write, count: usize) -> std::io::Result<()> {
+        writeln!(w, "\n**Total specifications found:** {count}")
+    }
+
+    fn begin_api_details(&mut self, w: &mut dyn Write, name: &str) -> std::io::Result<()> {
+        self.current_section_level = 0;
+        writeln!(w, "\n{name}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(0).to_string().repeat(name.len())
+        )?;
+        writeln!(w)
+    }
+
+    fn end_api_details(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn description(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        writeln!(w, "**{desc}**")?;
+        writeln!(w)
+    }
+
+    fn long_description(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        writeln!(w, "{desc}")?;
+        writeln!(w)
+    }
+
+    fn begin_context_flags(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = "Execution Context";
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)
+    }
+
+    fn context_flag(&mut self, w: &mut dyn Write, flag: &str) -> std::io::Result<()> {
+        writeln!(w, "* {flag}")
+    }
+
+    fn end_context_flags(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        writeln!(w)
+    }
+
+    fn begin_parameters(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = format!("Parameters ({count})");
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)
+    }
+
+    fn end_parameters(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_errors(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = format!("Possible Errors ({count})");
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)
+    }
+
+    fn end_errors(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn examples(&mut self, w: &mut dyn Write, examples: &str) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = "Examples";
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)?;
+        writeln!(w, ".. code-block:: c")?;
+        writeln!(w)?;
+        for line in examples.lines() {
+            writeln!(w, "   {line}")?;
+        }
+        writeln!(w)
+    }
+
+    fn notes(&mut self, w: &mut dyn Write, notes: &str) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = "Notes";
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)?;
+        writeln!(w, "{notes}")?;
+        writeln!(w)
+    }
+
+    fn since_version(&mut self, w: &mut dyn Write, version: &str) -> std::io::Result<()> {
+        writeln!(w, ":Available since: {version}")?;
+        writeln!(w)
+    }
+
+    fn sysfs_subsystem(&mut self, w: &mut dyn Write, subsystem: &str) -> std::io::Result<()> {
+        writeln!(w, ":Subsystem: {subsystem}")?;
+        writeln!(w)
+    }
+
+    fn sysfs_path(&mut self, w: &mut dyn Write, path: &str) -> std::io::Result<()> {
+        writeln!(w, ":Sysfs Path: {path}")?;
+        writeln!(w)
+    }
+
+    fn sysfs_permissions(&mut self, w: &mut dyn Write, perms: &str) -> std::io::Result<()> {
+        writeln!(w, ":Permissions: {perms}")?;
+        writeln!(w)
+    }
+
+    // Networking-specific methods
+    fn socket_state(&mut self, w: &mut dyn Write, state: &SocketStateSpec) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = "Socket State Requirements";
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)?;
+
+        if !state.required_states.is_empty() {
+            writeln!(
+                w,
+                "**Required states:** {}",
+                state.required_states.join(", ")
+            )?;
+        }
+        if !state.forbidden_states.is_empty() {
+            writeln!(
+                w,
+                "**Forbidden states:** {}",
+                state.forbidden_states.join(", ")
+            )?;
+        }
+        if let Some(result) = &state.resulting_state {
+            writeln!(w, "**Resulting state:** {result}")?;
+        }
+        if let Some(cond) = &state.condition {
+            writeln!(w, "**Condition:** {cond}")?;
+        }
+        if let Some(protos) = &state.applicable_protocols {
+            writeln!(w, "**Applicable protocols:** {protos}")?;
+        }
+        writeln!(w)
+    }
+
+    fn begin_protocol_behaviors(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = "Protocol-Specific Behaviors";
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)
+    }
+
+    fn protocol_behavior(
+        &mut self,
+        w: &mut dyn Write,
+        behavior: &ProtocolBehaviorSpec,
+    ) -> std::io::Result<()> {
+        writeln!(w, "**{}**", behavior.applicable_protocols)?;
+        writeln!(w)?;
+        writeln!(w, "{}", behavior.behavior)?;
+        if let Some(flags) = &behavior.protocol_flags {
+            writeln!(w)?;
+            writeln!(w, "*Flags:* {flags}")?;
+        }
+        writeln!(w)
+    }
+
+    fn end_protocol_behaviors(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_addr_families(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = "Supported Address Families";
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)
+    }
+
+    fn addr_family(&mut self, w: &mut dyn Write, family: &AddrFamilySpec) -> std::io::Result<()> {
+        writeln!(w, "**{} ({})**", family.family_name, family.family)?;
+        writeln!(w)?;
+        writeln!(w, "* **Struct size:** {} bytes", family.addr_struct_size)?;
+        writeln!(
+            w,
+            "* **Address length:** {}-{} bytes",
+            family.min_addr_len, family.max_addr_len
+        )?;
+        if let Some(format) = &family.addr_format {
+            writeln!(w, "* **Format:** ``{format}``")?;
+        }
+        writeln!(
+            w,
+            "* **Features:** wildcard={}, multicast={}, broadcast={}",
+            family.supports_wildcard, family.supports_multicast, family.supports_broadcast
+        )?;
+        if let Some(special) = &family.special_addresses {
+            writeln!(w, "* **Special addresses:** {special}")?;
+        }
+        if family.port_range_max > 0 {
+            writeln!(
+                w,
+                "* **Port range:** {}-{}",
+                family.port_range_min, family.port_range_max
+            )?;
+        }
+        writeln!(w)
+    }
+
+    fn end_addr_families(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn buffer_spec(&mut self, w: &mut dyn Write, spec: &BufferSpec) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = "Buffer Specification";
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)?;
+
+        if let Some(behaviors) = &spec.buffer_behaviors {
+            writeln!(w, "**Behaviors:** {behaviors}")?;
+        }
+        if let Some(min) = spec.min_buffer_size {
+            writeln!(w, "**Min size:** {min} bytes")?;
+        }
+        if let Some(max) = spec.max_buffer_size {
+            writeln!(w, "**Max size:** {max} bytes")?;
+        }
+        if let Some(optimal) = spec.optimal_buffer_size {
+            writeln!(w, "**Optimal size:** {optimal} bytes")?;
+        }
+        writeln!(w)
+    }
+
+    fn async_spec(&mut self, w: &mut dyn Write, spec: &AsyncSpec) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = "Asynchronous Operation";
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)?;
+
+        if let Some(modes) = &spec.supported_modes {
+            writeln!(w, "**Supported modes:** {modes}")?;
+        }
+        if let Some(errno) = spec.nonblock_errno {
+            writeln!(w, "**Non-blocking errno:** {errno}")?;
+        }
+        writeln!(w)
+    }
+
+    fn net_data_transfer(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        writeln!(w, "**Network Data Transfer:** {desc}")?;
+        writeln!(w)
+    }
+
+    fn begin_capabilities(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = "Required Capabilities";
+        writeln!(w, "{title}")?;
+        writeln!(
+            w,
+            "{}",
+            Self::section_char(1).to_string().repeat(title.len())
+        )?;
+        writeln!(w)
+    }
+
+    fn capability(&mut self, w: &mut dyn Write, cap: &CapabilitySpec) -> std::io::Result<()> {
+        writeln!(w, "**{} ({})** - {}", cap.name, cap.capability, cap.action)?;
+        writeln!(w)?;
+        if !cap.allows.is_empty() {
+            writeln!(w, "* **Allows:** {}", cap.allows)?;
+        }
+        if !cap.without_cap.is_empty() {
+            writeln!(w, "* **Without capability:** {}", cap.without_cap)?;
+        }
+        if let Some(cond) = &cap.check_condition {
+            writeln!(w, "* **Condition:** {}", cond)?;
+        }
+        writeln!(w)
+    }
+
+    fn end_capabilities(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    // Stub implementations for new methods
+    fn parameter(&mut self, w: &mut dyn Write, param: &ParamSpec) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "**[{}] {}** (*{}*)",
+            param.index, param.name, param.type_name
+        )?;
+        writeln!(w)?;
+        writeln!(w, "  {}", param.description)?;
+
+        // Display flags
+        let mut flags = Vec::new();
+        if param.flags & 0x01 != 0 {
+            flags.push("IN");
+        }
+        if param.flags & 0x02 != 0 {
+            flags.push("OUT");
+        }
+        if param.flags & 0x04 != 0 {
+            flags.push("USER");
+        }
+        if param.flags & 0x08 != 0 {
+            flags.push("OPTIONAL");
+        }
+        if !flags.is_empty() {
+            writeln!(w, "  :Flags: {}", flags.join(", "))?;
+        }
+
+        if let Some(constraint) = &param.constraint {
+            writeln!(w, "  :Constraint: {}", constraint)?;
+        }
+
+        if let (Some(min), Some(max)) = (param.min_value, param.max_value) {
+            writeln!(w, "  :Range: {} to {}", min, max)?;
+        }
+
+        writeln!(w)
+    }
+
+    fn return_spec(&mut self, w: &mut dyn Write, ret: &ReturnSpec) -> std::io::Result<()> {
+        writeln!(w, "\nReturn Value")?;
+        writeln!(w, "{}\n", Self::section_char(1).to_string().repeat(12))?;
+        writeln!(w)?;
+        writeln!(w, ":Type: {}", ret.type_name)?;
+        writeln!(w, ":Description: {}", ret.description)?;
+        if let Some(success) = ret.success_value {
+            writeln!(w, ":Success value: {}", success)?;
+        }
+        writeln!(w)
+    }
+
+    fn error(&mut self, w: &mut dyn Write, error: &ErrorSpec) -> std::io::Result<()> {
+        writeln!(w, "**{}** ({})", error.name, error.error_code)?;
+        writeln!(w)?;
+        writeln!(w, "  :Condition: {}", error.condition)?;
+        if !error.description.is_empty() {
+            writeln!(w, "  :Description: {}", error.description)?;
+        }
+        writeln!(w)
+    }
+
+    fn begin_signals(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn signal(&mut self, _w: &mut dyn Write, _signal: &SignalSpec) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn end_signals(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_signal_masks(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn signal_mask(&mut self, _w: &mut dyn Write, _mask: &SignalMaskSpec) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn end_signal_masks(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_side_effects(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = format!("Side Effects ({count})");
+        writeln!(w, "{}\n", title)?;
+        writeln!(
+            w,
+            "{}\n",
+            Self::section_char(1).to_string().repeat(title.len())
+        )
+    }
+
+    fn side_effect(&mut self, w: &mut dyn Write, effect: &SideEffectSpec) -> std::io::Result<()> {
+        write!(w, "* **{}**", effect.target)?;
+        if effect.reversible {
+            write!(w, " *(reversible)*")?;
+        }
+        writeln!(w)?;
+        writeln!(w, "  {}", effect.description)?;
+        if let Some(cond) = &effect.condition {
+            writeln!(w, "  :Condition: {}", cond)?;
+        }
+        writeln!(w)
+    }
+
+    fn end_side_effects(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_state_transitions(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = format!("State Transitions ({count})");
+        writeln!(w, "{}\n", title)?;
+        writeln!(
+            w,
+            "{}\n",
+            Self::section_char(1).to_string().repeat(title.len())
+        )
+    }
+
+    fn state_transition(
+        &mut self,
+        w: &mut dyn Write,
+        trans: &StateTransitionSpec,
+    ) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "* **{}**: {} → {}",
+            trans.object, trans.from_state, trans.to_state
+        )?;
+        writeln!(w, "  {}", trans.description)?;
+        if let Some(cond) = &trans.condition {
+            writeln!(w, "  :Condition: {}", cond)?;
+        }
+        writeln!(w)
+    }
+
+    fn end_state_transitions(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_constraints(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn constraint(
+        &mut self,
+        _w: &mut dyn Write,
+        _constraint: &ConstraintSpec,
+    ) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn end_constraints(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_locks(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        self.current_section_level = 1;
+        let title = format!("Locks ({count})");
+        writeln!(w, "{}\n", title)?;
+        writeln!(
+            w,
+            "{}\n",
+            Self::section_char(1).to_string().repeat(title.len())
+        )
+    }
+
+    fn lock(&mut self, w: &mut dyn Write, lock: &LockSpec) -> std::io::Result<()> {
+        write!(w, "* **{}**", lock.lock_name)?;
+        let lock_type_str = match lock.lock_type {
+            1 => " *(mutex)*",
+            2 => " *(spinlock)*",
+            3 => " *(rwlock)*",
+            4 => " *(semaphore)*",
+            5 => " *(RCU)*",
+            _ => "",
+        };
+        writeln!(w, "{}", lock_type_str)?;
+        if !lock.description.is_empty() {
+            writeln!(w, "  {}", lock.description)?;
+        }
+        writeln!(w)
+    }
+
+    fn end_locks(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_struct_specs(&mut self, w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        writeln!(w)?;
+        writeln!(w, "Structure Specifications")?;
+        writeln!(w, "~~~~~~~~~~~~~~~~~~~~~~~")?;
+        writeln!(w)
+    }
+
+    fn struct_spec(&mut self, w: &mut dyn Write, spec: &crate::extractor::StructSpec) -> std::io::Result<()> {
+        writeln!(w, "**{}**", spec.name)?;
+        writeln!(w)?;
+
+        if !spec.description.is_empty() {
+            writeln!(w, "  {}", spec.description)?;
+            writeln!(w)?;
+        }
+
+        writeln!(w, "  :Size: {} bytes", spec.size)?;
+        writeln!(w, "  :Alignment: {} bytes", spec.alignment)?;
+        writeln!(w, "  :Fields: {}", spec.field_count)?;
+        writeln!(w)?;
+
+        if !spec.fields.is_empty() {
+            for field in &spec.fields {
+                writeln!(w, "  * **{}** ({})", field.name, field.type_name)?;
+                if !field.description.is_empty() {
+                    writeln!(w, "    {}", field.description)?;
+                }
+                if field.min_value != 0 || field.max_value != 0 {
+                    writeln!(w, "    Range: [{}, {}]", field.min_value, field.max_value)?;
+                }
+            }
+            writeln!(w)?;
+        }
+
+        Ok(())
+    }
+
+    fn end_struct_specs(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+}
diff --git a/tools/kapi/src/formatter/shall.rs b/tools/kapi/src/formatter/shall.rs
new file mode 100644
index 000000000000..cc169d1290ca
--- /dev/null
+++ b/tools/kapi/src/formatter/shall.rs
@@ -0,0 +1,891 @@
+use super::OutputFormatter;
+use crate::extractor::{
+    AddrFamilySpec, AsyncSpec, BufferSpec, CapabilitySpec, ConstraintSpec, ErrorSpec, LockSpec,
+    ParamSpec, ProtocolBehaviorSpec, ReturnSpec, SideEffectSpec, SignalMaskSpec, SignalSpec,
+    SocketStateSpec, StateTransitionSpec,
+};
+use std::io::Write;
+
+pub struct ShallFormatter {
+    api_name: Option<String>,
+    in_list: bool,
+}
+
+impl ShallFormatter {
+    pub fn new() -> Self {
+        ShallFormatter {
+            api_name: None,
+            in_list: false,
+        }
+    }
+}
+
+impl OutputFormatter for ShallFormatter {
+    fn begin_document(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn end_document(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_api_list(&mut self, w: &mut dyn Write, title: &str) -> std::io::Result<()> {
+        self.in_list = true;
+        writeln!(w, "\n{} API Behavioral Requirements:", title)?;
+        writeln!(w)
+    }
+
+    fn api_item(&mut self, w: &mut dyn Write, name: &str, _api_type: &str) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "- {} shall be available for {}",
+            name,
+            name.replace('_', " ")
+        )
+    }
+
+    fn end_api_list(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        self.in_list = false;
+        Ok(())
+    }
+
+    fn total_specs(&mut self, w: &mut dyn Write, count: usize) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "\nTotal: {} kernel API specifications shall be enforced.",
+            count
+        )
+    }
+
+    fn begin_api_details(&mut self, w: &mut dyn Write, name: &str) -> std::io::Result<()> {
+        self.api_name = Some(name.to_string());
+        writeln!(w, "\nBehavioral Requirements for {}:", name)?;
+        writeln!(w)
+    }
+
+    fn end_api_details(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        self.api_name = None;
+        Ok(())
+    }
+
+    fn description(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        if let Some(api_name) = &self.api_name {
+            writeln!(w, "- {} shall {}.", api_name, desc.trim_end_matches('.'))
+        } else {
+            writeln!(w, "- The API shall {}.", desc.trim_end_matches('.'))
+        }
+    }
+
+    fn long_description(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        writeln!(w)?;
+        for line in desc.lines() {
+            if !line.trim().is_empty() {
+                writeln!(w, "{}", line)?;
+            }
+        }
+        writeln!(w)
+    }
+
+    fn begin_context_flags(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        writeln!(w, "\nExecution Context Requirements:")?;
+        writeln!(w)
+    }
+
+    fn context_flag(&mut self, w: &mut dyn Write, flag: &str) -> std::io::Result<()> {
+        // Parse context flags and make them readable with specific requirements
+        match flag {
+            "Process context" => {
+                writeln!(w, "- The function shall be callable from process context.")?;
+                writeln!(
+                    w,
+                    "  Process context allows the function to sleep, allocate memory with GFP_KERNEL, and access user space."
+                )
+            }
+            "Softirq context" => {
+                writeln!(w, "- The function shall be callable from softirq context.")?;
+                writeln!(
+                    w,
+                    "  In softirq context, the function shall not sleep and shall use GFP_ATOMIC for memory allocations."
+                )
+            }
+            "Hardirq context" => {
+                writeln!(
+                    w,
+                    "- The function shall be callable from hardirq (interrupt) context."
+                )?;
+                writeln!(
+                    w,
+                    "  In hardirq context, the function shall not sleep, shall minimize execution time, and shall use GFP_ATOMIC for allocations."
+                )
+            }
+            "NMI context" => {
+                writeln!(
+                    w,
+                    "- The function shall be callable from NMI (Non-Maskable Interrupt) context."
+                )?;
+                writeln!(
+                    w,
+                    "  In NMI context, the function shall not take any locks that might be held by interrupted code."
+                )
+            }
+            "User mode" => {
+                writeln!(
+                    w,
+                    "- The function shall be callable when the CPU is in user mode."
+                )?;
+                writeln!(w, "  This typically applies to system call entry points.")
+            }
+            "Kernel mode" => {
+                writeln!(
+                    w,
+                    "- The function shall be callable when the CPU is in kernel mode."
+                )
+            }
+            "May sleep" => {
+                writeln!(w, "- The function may sleep (block) during execution.")?;
+                writeln!(
+                    w,
+                    "  Callers shall ensure they are in a context where sleeping is allowed (not in interrupt or atomic context)."
+                )
+            }
+            "Atomic context" => {
+                writeln!(w, "- The function shall be callable from atomic context.")?;
+                writeln!(
+                    w,
+                    "  In atomic context, the function shall not sleep and shall complete quickly."
+                )
+            }
+            "Preemptible" => {
+                writeln!(
+                    w,
+                    "- The function shall be callable when preemption is enabled."
+                )?;
+                writeln!(
+                    w,
+                    "  The function may be preempted by higher priority tasks."
+                )
+            }
+            "Migration disabled" => {
+                writeln!(
+                    w,
+                    "- The function shall be callable when CPU migration is disabled."
+                )?;
+                writeln!(
+                    w,
+                    "  The function shall not rely on being able to migrate between CPUs."
+                )
+            }
+            _ => {
+                // Fallback for unrecognized flags
+                writeln!(w, "- The function shall be callable from {} context.", flag)
+            }
+        }
+    }
+
+    fn end_context_flags(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_parameters(&mut self, w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nParameter Requirements:")
+    }
+
+    fn parameter(&mut self, w: &mut dyn Write, param: &ParamSpec) -> std::io::Result<()> {
+        writeln!(w)?;
+        writeln!(
+            w,
+            "- If {} is provided, it shall be {}.",
+            param.name,
+            param.description.trim_end_matches('.')
+        )?;
+
+        // Only show meaningful numeric constraints
+        if let Some(min) = param.min_value {
+            if let Some(max) = param.max_value {
+                if min != 0 || max != 0 {
+                    writeln!(
+                        w,
+                        "\n- If {} is less than {} or greater than {}, the operation shall fail.",
+                        param.name, min, max
+                    )?;
+                }
+            } else if min != 0 {
+                writeln!(
+                    w,
+                    "\n- If {} is less than {}, the operation shall fail.",
+                    param.name, min
+                )?;
+            }
+        } else if let Some(max) = param.max_value {
+            if max != 0 {
+                writeln!(
+                    w,
+                    "\n- If {} is greater than {}, the operation shall fail.",
+                    param.name, max
+                )?;
+            }
+        }
+
+        if let Some(constraint) = &param.constraint {
+            if !constraint.is_empty() {
+                let constraint_text = constraint.trim_end_matches('.');
+                // Handle constraints that start with "Must be" or similar
+                if constraint_text.to_lowercase().starts_with("must be ") {
+                    let requirement = &constraint_text[8..]; // Skip "Must be "
+                    writeln!(
+                        w,
+                        "\n- If {} is not {}, the operation shall fail.",
+                        param.name, requirement
+                    )?;
+                } else if constraint_text.to_lowercase().starts_with("must ") {
+                    let requirement = &constraint_text[5..]; // Skip "Must "
+                    writeln!(
+                        w,
+                        "\n- If {} does not {}, the operation shall fail.",
+                        param.name, requirement
+                    )?;
+                } else if constraint_text.contains(" must ") || constraint_text.contains(" should ")
+                {
+                    // Reformat constraints with must/should in the middle
+                    writeln!(w, "\n- {} shall satisfy: {}.", param.name, constraint_text)?;
+                } else {
+                    // Default format for other constraints
+                    writeln!(
+                        w,
+                        "\n- If {} is not {}, the operation shall fail.",
+                        param.name, constraint_text
+                    )?;
+                }
+            }
+        }
+
+        // Only show valid_mask if it's not 0
+        if let Some(mask) = param.valid_mask {
+            if mask != 0 {
+                writeln!(
+                    w,
+                    "\n- If {} contains bits not set in 0x{:x}, the operation shall fail.",
+                    param.name, mask
+                )?;
+            }
+        }
+
+        Ok(())
+    }
+
+    fn end_parameters(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn return_spec(&mut self, w: &mut dyn Write, ret: &ReturnSpec) -> std::io::Result<()> {
+        writeln!(w, "\nReturn Value Behavior:")?;
+        writeln!(w)?;
+
+        if let Some(success) = ret.success_value {
+            writeln!(
+                w,
+                "- If the operation succeeds, the function shall return {}.",
+                success
+            )?;
+        } else if let Some(min) = ret.success_min {
+            if let Some(max) = ret.success_max {
+                writeln!(
+                    w,
+                    "- If the operation succeeds, the function shall return a value between {} and {} inclusive.",
+                    min, max
+                )?;
+            } else {
+                writeln!(
+                    w,
+                    "- If the operation succeeds, the function shall return a value greater than or equal to {}.",
+                    min
+                )?;
+            }
+        }
+
+        if !ret.error_values.is_empty() {
+            writeln!(
+                w,
+                "\n- If the operation fails, the function shall return one of the specified negative error values."
+            )?;
+        }
+
+        Ok(())
+    }
+
+    fn begin_errors(&mut self, w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nError Handling:")?;
+        Ok(())
+    }
+
+    fn error(&mut self, w: &mut dyn Write, error: &ErrorSpec) -> std::io::Result<()> {
+        writeln!(w)?;
+        let condition = if error.condition.is_empty() {
+            error
+                .description
+                .to_lowercase()
+                .trim_end_matches('.')
+                .to_string()
+        } else {
+            error.condition.to_lowercase()
+        };
+        writeln!(
+            w,
+            "- If {condition}, the function shall return -{}.",
+            error.name
+        )?;
+
+        // Add description if available and different from condition
+        if !error.description.is_empty() && error.description != error.condition {
+            writeln!(w, "  {}", error.description)?;
+        }
+
+        Ok(())
+    }
+
+    fn end_errors(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn examples(&mut self, w: &mut dyn Write, examples: &str) -> std::io::Result<()> {
+        writeln!(w, "\nExample Usage:")?;
+        writeln!(w)?;
+        writeln!(w, "```")?;
+        write!(w, "{}", examples)?;
+        writeln!(w, "```")
+    }
+
+    fn notes(&mut self, w: &mut dyn Write, notes: &str) -> std::io::Result<()> {
+        writeln!(w, "\nImplementation Notes:")?;
+        writeln!(w)?;
+
+        // Split notes into sentences and format each as a behavioral requirement
+        let sentences: Vec<&str> = notes.split(". ").filter(|s| !s.trim().is_empty()).collect();
+
+        for sentence in sentences {
+            let trimmed = sentence.trim().trim_end_matches('.');
+            if trimmed.is_empty() {
+                continue;
+            }
+
+            // Check if it already contains "shall" or similar
+            if trimmed.contains("shall") || trimmed.contains("must") {
+                writeln!(w, "- {}.", trimmed)?;
+            } else if trimmed.starts_with("On ")
+                || trimmed.starts_with("If ")
+                || trimmed.starts_with("When ")
+            {
+                // These are already conditional, just add shall
+                writeln!(w, "- {}, the behavior shall be as described.", trimmed)?;
+            } else {
+                // Convert to a shall statement
+                writeln!(
+                    w,
+                    "- The implementation shall ensure that {}.",
+                    trimmed
+                        .chars()
+                        .next()
+                        .unwrap()
+                        .to_lowercase()
+                        .collect::<String>()
+                        + &trimmed[1..]
+                )?;
+            }
+        }
+        Ok(())
+    }
+
+    fn since_version(&mut self, w: &mut dyn Write, version: &str) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "\n- If kernel version is {} or later, this API shall be available.",
+            version
+        )
+    }
+
+    fn sysfs_subsystem(&mut self, w: &mut dyn Write, subsystem: &str) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "- If accessed through sysfs, the attribute shall be located in the {} subsystem.",
+            subsystem
+        )
+    }
+
+    fn sysfs_path(&mut self, w: &mut dyn Write, path: &str) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "\n- If the sysfs interface is mounted, the attribute shall be accessible at {}.",
+            path
+        )
+    }
+
+    fn sysfs_permissions(&mut self, w: &mut dyn Write, perms: &str) -> std::io::Result<()> {
+        writeln!(
+            w,
+            "\n- If the attribute exists, its permissions shall be set to {}.",
+            perms
+        )
+    }
+
+    fn socket_state(&mut self, w: &mut dyn Write, state: &SocketStateSpec) -> std::io::Result<()> {
+        writeln!(w, "\nSocket State Behavior:")?;
+        writeln!(w)?;
+
+        if !state.required_states.is_empty() {
+            let states_str = state.required_states.join(" or ");
+            writeln!(
+                w,
+                "- If the socket is not in {} state, the operation shall fail.",
+                states_str
+            )?;
+        }
+
+        if !state.forbidden_states.is_empty() {
+            for s in &state.forbidden_states {
+                writeln!(
+                    w,
+                    "\n- If the socket is in {} state, the operation shall fail.",
+                    s
+                )?;
+            }
+        }
+
+        if let Some(result) = &state.resulting_state {
+            writeln!(
+                w,
+                "\n- If the operation succeeds, the socket state shall transition to {}.",
+                result
+            )?;
+        }
+
+        Ok(())
+    }
+
+    fn begin_protocol_behaviors(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        writeln!(w, "\nProtocol-Specific Behavior:")
+    }
+
+    fn protocol_behavior(
+        &mut self,
+        w: &mut dyn Write,
+        behavior: &ProtocolBehaviorSpec,
+    ) -> std::io::Result<()> {
+        writeln!(w)?;
+        writeln!(
+            w,
+            "- If protocol is {}, {}.",
+            behavior.applicable_protocols, behavior.behavior
+        )?;
+
+        if let Some(flags) = &behavior.protocol_flags {
+            writeln!(
+                w,
+                "\n- If protocol is {} and flags {} are set, the behavior shall be modified accordingly.",
+                behavior.applicable_protocols, flags
+            )?;
+        }
+
+        Ok(())
+    }
+
+    fn end_protocol_behaviors(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_addr_families(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        writeln!(w, "\nAddress Family Behavior:")
+    }
+
+    fn addr_family(&mut self, w: &mut dyn Write, family: &AddrFamilySpec) -> std::io::Result<()> {
+        writeln!(w)?;
+        writeln!(
+            w,
+            "- If address family is {} ({}), the address structure size shall be {} bytes.",
+            family.family, family.family_name, family.addr_struct_size
+        )?;
+
+        writeln!(
+            w,
+            "\n- If address family is {} and address length is less than {} or greater than {}, the operation shall fail.",
+            family.family, family.min_addr_len, family.max_addr_len
+        )?;
+
+        Ok(())
+    }
+
+    fn end_addr_families(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn buffer_spec(&mut self, w: &mut dyn Write, spec: &BufferSpec) -> std::io::Result<()> {
+        writeln!(w, "\nBuffer Behavior:")?;
+        writeln!(w)?;
+
+        if let Some(min) = spec.min_buffer_size {
+            writeln!(
+                w,
+                "- If the buffer size is less than {} bytes, the operation shall fail.",
+                min
+            )?;
+        }
+
+        if let Some(max) = spec.max_buffer_size {
+            writeln!(
+                w,
+                "\n- If the buffer size exceeds {} bytes, the excess data shall be truncated.",
+                max
+            )?;
+        }
+
+        if let Some(behaviors) = &spec.buffer_behaviors {
+            writeln!(
+                w,
+                "\n- When handling buffers, the following behavior shall apply: {}.",
+                behaviors
+            )?;
+        }
+
+        Ok(())
+    }
+
+    fn async_spec(&mut self, w: &mut dyn Write, spec: &AsyncSpec) -> std::io::Result<()> {
+        writeln!(w, "\nAsynchronous Behavior:")?;
+        writeln!(w)?;
+
+        if let Some(_modes) = &spec.supported_modes {
+            writeln!(
+                w,
+                "- If O_NONBLOCK is set and the operation would block, the function shall return -EAGAIN or -EWOULDBLOCK."
+            )?;
+        }
+
+        if let Some(errno) = spec.nonblock_errno {
+            writeln!(
+                w,
+                "\n- If the file descriptor is in non-blocking mode and no data is available, the function shall return -{}.",
+                errno
+            )?;
+        }
+
+        Ok(())
+    }
+
+    fn net_data_transfer(&mut self, w: &mut dyn Write, desc: &str) -> std::io::Result<()> {
+        writeln!(w, "\nData Transfer Behavior:")?;
+        writeln!(w)?;
+        writeln!(
+            w,
+            "- When transferring data, the operation shall {}.",
+            desc.trim_end_matches('.')
+        )
+    }
+
+    fn begin_capabilities(&mut self, w: &mut dyn Write) -> std::io::Result<()> {
+        writeln!(w, "\nCapability Requirements:")
+    }
+
+    fn capability(&mut self, w: &mut dyn Write, cap: &CapabilitySpec) -> std::io::Result<()> {
+        writeln!(w)?;
+        writeln!(
+            w,
+            "- If the process attempts to {}, {} capability shall be checked.",
+            cap.action, cap.name
+        )?;
+        writeln!(w)?;
+        writeln!(w, "- If {} is present, {}.", cap.name, cap.allows)?;
+        writeln!(w)?;
+        writeln!(w, "- If {} is not present, {}.", cap.name, cap.without_cap)?;
+
+        Ok(())
+    }
+
+    fn end_capabilities(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_signals(&mut self, w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nSignal Behavior:")?;
+        Ok(())
+    }
+
+    fn signal(&mut self, w: &mut dyn Write, signal: &SignalSpec) -> std::io::Result<()> {
+        writeln!(w)?;
+
+        // Skip signals with no meaningful description
+        if let Some(desc) = &signal.description {
+            if !desc.is_empty() {
+                writeln!(w, "- {}: {}.", signal.signal_name, desc)?;
+                return Ok(());
+            }
+        }
+
+        // Default behavior based on direction
+        if signal.direction == 1 {
+            // Sends
+            writeln!(
+                w,
+                "- If the conditions for {} are met, the signal shall be sent to the target process.",
+                signal.signal_name
+            )?;
+        } else if signal.direction == 2 {
+            // Receives
+            writeln!(
+                w,
+                "- If {} is received and not blocked, the operation shall be interrupted.",
+                signal.signal_name
+            )?;
+
+            if signal.restartable {
+                writeln!(
+                    w,
+                    "\n- If {} is received and SA_RESTART is set, the operation shall be automatically restarted.",
+                    signal.signal_name
+                )?;
+            }
+        } else {
+            // Direction 0 or other - just note the signal handling
+            writeln!(
+                w,
+                "- {} shall be handled according to its default behavior.",
+                signal.signal_name
+            )?;
+        }
+
+        if let Some(errno) = signal.error_on_signal {
+            if errno != 0 {
+                writeln!(
+                    w,
+                    "\n- If interrupted by {}, the function shall return -{}.",
+                    signal.signal_name, errno
+                )?;
+            }
+        }
+
+        Ok(())
+    }
+
+    fn end_signals(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_signal_masks(&mut self, w: &mut dyn Write, count: u32) -> std::io::Result<()> {
+        writeln!(w, "\n### Signal Mask Requirements")?;
+        if count > 0 {
+            writeln!(
+                w,
+                "The API SHALL support the following signal mask operations:"
+            )?;
+        }
+        Ok(())
+    }
+
+    fn signal_mask(&mut self, w: &mut dyn Write, mask: &SignalMaskSpec) -> std::io::Result<()> {
+        writeln!(w, "\n- **{}**: {}", mask.name, mask.description)?;
+        Ok(())
+    }
+
+    fn end_signal_masks(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_side_effects(&mut self, w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nSide Effects:")?;
+        Ok(())
+    }
+
+    fn side_effect(&mut self, w: &mut dyn Write, effect: &SideEffectSpec) -> std::io::Result<()> {
+        writeln!(w)?;
+        if let Some(condition) = &effect.condition {
+            writeln!(
+                w,
+                "- If {}, {} shall be {}.",
+                condition,
+                effect.target,
+                effect.description.trim_end_matches('.')
+            )?;
+        } else {
+            writeln!(
+                w,
+                "- When the operation executes, {} shall be {}.",
+                effect.target,
+                effect.description.trim_end_matches('.')
+            )?;
+        }
+
+        if effect.reversible {
+            writeln!(
+                w,
+                "\n- If the operation is rolled back, the effect on {} shall be reversed.",
+                effect.target
+            )?;
+        }
+
+        Ok(())
+    }
+
+    fn end_side_effects(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_state_transitions(&mut self, w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nState Transitions:")?;
+        Ok(())
+    }
+
+    fn state_transition(
+        &mut self,
+        w: &mut dyn Write,
+        trans: &StateTransitionSpec,
+    ) -> std::io::Result<()> {
+        writeln!(w)?;
+        if let Some(condition) = &trans.condition {
+            writeln!(
+                w,
+                "- If {} is in {} state and {}, it shall transition to {} state.",
+                trans.object, trans.from_state, condition, trans.to_state
+            )?;
+        } else {
+            writeln!(
+                w,
+                "- If {} is in {} state, it shall transition to {} state.",
+                trans.object, trans.from_state, trans.to_state
+            )?;
+        }
+
+        Ok(())
+    }
+
+    fn end_state_transitions(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_constraints(&mut self, w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nConstraints:")?;
+        Ok(())
+    }
+
+    fn constraint(
+        &mut self,
+        w: &mut dyn Write,
+        constraint: &ConstraintSpec,
+    ) -> std::io::Result<()> {
+        writeln!(w)?;
+        if let Some(expr) = &constraint.expression {
+            if expr.is_empty() {
+                writeln!(w, "- {}: {}.", constraint.name, constraint.description)?;
+            } else {
+                writeln!(
+                    w,
+                    "- If {} is violated, the operation shall fail.",
+                    constraint.name
+                )?;
+                writeln!(w, "  Constraint: {}", expr)?;
+            }
+        } else {
+            writeln!(w, "- {}: {}.", constraint.name, constraint.description)?;
+        }
+
+        Ok(())
+    }
+
+    fn end_constraints(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_locks(&mut self, w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        writeln!(w, "\nLocking Behavior:")?;
+        Ok(())
+    }
+
+    fn lock(&mut self, w: &mut dyn Write, lock: &LockSpec) -> std::io::Result<()> {
+        writeln!(w)?;
+
+        // Always show lock information if we have a description
+        if !lock.description.is_empty() {
+            let lock_type_str = match lock.lock_type {
+                1 => "mutex",
+                2 => "spinlock",
+                3 => "rwlock",
+                4 => "semaphore",
+                5 => "RCU",
+                _ => "lock",
+            };
+            writeln!(
+                w,
+                "- The {} {} shall be used for: {}",
+                lock.lock_name, lock_type_str, lock.description
+            )?;
+        }
+
+        if lock.held_on_entry {
+            writeln!(
+                w,
+                "- If {} is not held on entry, the operation shall fail.",
+                lock.lock_name
+            )?;
+        }
+
+        if lock.acquired && !lock.held_on_entry {
+            writeln!(
+                w,
+                "- Before accessing the protected resource, {} shall be acquired.",
+                lock.lock_name
+            )?;
+        }
+
+        if lock.released && lock.held_on_exit {
+            writeln!(
+                w,
+                "- If the operation succeeds and no error path is taken, {} shall remain held on exit.",
+                lock.lock_name
+            )?;
+        } else if lock.released {
+            writeln!(
+                w,
+                "- Before returning, {} shall be released.",
+                lock.lock_name
+            )?;
+        }
+
+        Ok(())
+    }
+
+    fn end_locks(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn begin_struct_specs(&mut self, _w: &mut dyn Write, _count: u32) -> std::io::Result<()> {
+        Ok(())
+    }
+
+    fn struct_spec(&mut self, w: &mut dyn Write, spec: &crate::extractor::StructSpec) -> std::io::Result<()> {
+        writeln!(w, "[STRUCT_SPEC] The system SHALL define a structure '{}' with the following properties:", spec.name)?;
+
+        if !spec.description.is_empty() {
+            writeln!(w, "  [DESCRIPTION] {}", spec.description)?;
+        }
+
+        writeln!(w, "  [SIZE] The structure SHALL have a size of {} bytes", spec.size)?;
+        writeln!(w, "  [ALIGNMENT] The structure SHALL have an alignment of {} bytes", spec.alignment)?;
+
+        if !spec.fields.is_empty() {
+            writeln!(w, "  [FIELDS] The structure SHALL contain {} fields:", spec.field_count)?;
+            for field in &spec.fields {
+                writeln!(w, "    - Field '{}' of type '{}': {}",
+                    field.name, field.type_name, field.description)?;
+
+                if field.min_value != 0 || field.max_value != 0 {
+                    writeln!(w, "      [RANGE] SHALL be in range [{}, {}]",
+                        field.min_value, field.max_value)?;
+                }
+            }
+        }
+
+        writeln!(w)?;
+        Ok(())
+    }
+
+    fn end_struct_specs(&mut self, _w: &mut dyn Write) -> std::io::Result<()> {
+        Ok(())
+    }
+}
diff --git a/tools/kapi/src/main.rs b/tools/kapi/src/main.rs
new file mode 100644
index 000000000000..2d219046f328
--- /dev/null
+++ b/tools/kapi/src/main.rs
@@ -0,0 +1,116 @@
+//! kapi - Kernel API Specification Tool
+//!
+//! This tool extracts and displays kernel API specifications from multiple sources:
+//! - Kernel source code (KAPI macros)
+//! - Compiled vmlinux binaries (`.kapi_specs` ELF section)
+//! - Running kernel via debugfs
+
+use anyhow::Result;
+use clap::Parser;
+use std::io::{self, Write};
+
+mod extractor;
+mod formatter;
+
+use extractor::{ApiExtractor, DebugfsExtractor, SourceExtractor, VmlinuxExtractor};
+use formatter::{OutputFormat, create_formatter};
+
+#[derive(Parser, Debug)]
+#[command(author, version, about, long_about = None)]
+struct Args {
+    /// Path to the vmlinux file
+    #[arg(long, value_name = "PATH", group = "input")]
+    vmlinux: Option<String>,
+
+    /// Path to kernel source directory or file
+    #[arg(long, value_name = "PATH", group = "input")]
+    source: Option<String>,
+
+    /// Path to debugfs (defaults to /sys/kernel/debug if not specified)
+    #[arg(long, value_name = "PATH", group = "input")]
+    debugfs: Option<String>,
+
+    /// Optional: Name of specific API to show details for
+    api_name: Option<String>,
+
+    /// Output format
+    #[arg(long, short = 'f', default_value = "plain")]
+    format: String,
+}
+
+fn main() -> Result<()> {
+    let args = Args::parse();
+
+    let output_format: OutputFormat = args
+        .format
+        .parse()
+        .map_err(|e: String| anyhow::anyhow!(e))?;
+
+    let extractor: Box<dyn ApiExtractor> = match (args.vmlinux, args.source, args.debugfs.clone()) {
+        (Some(vmlinux_path), None, None) => Box::new(VmlinuxExtractor::new(&vmlinux_path)?),
+        (None, Some(source_path), None) => Box::new(SourceExtractor::new(&source_path)?),
+        (None, None, Some(_) | None) => {
+            // If debugfs is specified or no input is provided, use debugfs
+            Box::new(DebugfsExtractor::new(args.debugfs)?)
+        }
+        _ => {
+            anyhow::bail!("Please specify only one of --vmlinux, --source, or --debugfs")
+        }
+    };
+
+    display_apis(extractor.as_ref(), args.api_name, output_format)
+}
+
+fn display_apis(
+    extractor: &dyn ApiExtractor,
+    api_name: Option<String>,
+    output_format: OutputFormat,
+) -> Result<()> {
+    let mut formatter = create_formatter(output_format);
+    let mut stdout = io::stdout();
+
+    formatter.begin_document(&mut stdout)?;
+
+    if let Some(api_name_req) = api_name {
+        // Use the extractor to display API details
+        if let Some(_spec) = extractor.extract_by_name(&api_name_req)? {
+            extractor.display_api_details(&api_name_req, &mut *formatter, &mut stdout)?;
+        } else if output_format == OutputFormat::Plain {
+            writeln!(stdout, "\nAPI '{}' not found.", api_name_req)?;
+            writeln!(stdout, "\nAvailable APIs:")?;
+            for spec in extractor.extract_all()? {
+                writeln!(stdout, "  {} ({})", spec.name, spec.api_type)?;
+            }
+        }
+    } else {
+        // Display list of APIs using the extractor
+        let all_specs = extractor.extract_all()?;
+
+        // Helper to display API list for a specific type
+        let mut display_api_type = |api_type: &str, title: &str| -> Result<()> {
+            let filtered: Vec<_> = all_specs.iter()
+                .filter(|s| s.api_type == api_type)
+                .collect();
+
+            if !filtered.is_empty() {
+                formatter.begin_api_list(&mut stdout, title)?;
+                for spec in filtered {
+                    formatter.api_item(&mut stdout, &spec.name, &spec.api_type)?;
+                }
+                formatter.end_api_list(&mut stdout)?;
+            }
+            Ok(())
+        };
+
+        display_api_type("syscall", "System Calls")?;
+        display_api_type("ioctl", "IOCTLs")?;
+        display_api_type("function", "Functions")?;
+        display_api_type("sysfs", "Sysfs Attributes")?;
+
+        formatter.total_specs(&mut stdout, all_specs.len())?;
+    }
+
+    formatter.end_document(&mut stdout)?;
+
+    Ok(())
+}
-- 
2.50.1


^ permalink raw reply related

* [RFC PATCH v4 6/7] fs/exec: add API specification for execveat
From: Sasha Levin @ 2025-08-25 18:14 UTC (permalink / raw)
  To: linux-api, linux-doc, linux-kernel, tools; +Cc: Sasha Levin
In-Reply-To: <20250825181434.3340805-1-sashal@kernel.org>

Add kernel API specification for the execveat() system call.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/exec.c | 594 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 594 insertions(+)

diff --git a/fs/exec.c b/fs/exec.c
index 2a1e5e4042a1..5dab6a801040 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -2010,6 +2010,600 @@ SYSCALL_DEFINE3(execve,
 	return do_execve(getname(filename), argv, envp);
 }
 
+/**
+ * sys_execveat - Execute program relative to directory file descriptor
+ * @fd: File descriptor of directory for relative pathname
+ * @filename: Pathname of program to execute
+ * @argv: Argument vector for new program
+ * @envp: Environment vector for new program
+ * @flags: Execution flags
+ *
+ * long-desc: Executes a new program, replacing the current process image with a new
+ *   process image. Similar to execve(), but the program is specified via a
+ *   directory file descriptor and pathname. Supports execution of scripts,
+ *   ELF binaries, and other registered binary formats. Handles setuid/setgid
+ *   executables with appropriate privilege transitions. The execveat() system
+ *   call combines and extends the functionality of execve() and fexecve().
+ * context-flags: KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE
+ *
+ * param: fd
+ *   type: KAPI_TYPE_INT
+ *   flags: KAPI_PARAM_IN
+ *   constraint-type: KAPI_CONSTRAINT_CUSTOM
+ *   constraint: Valid file descriptor or AT_FDCWD (-100) for current directory
+ *
+ * param: filename
+ *   type: KAPI_TYPE_USER_PTR
+ *   flags: KAPI_PARAM_IN | KAPI_PARAM_USER
+ *   constraint-type: KAPI_CONSTRAINT_CUSTOM
+ *   constraint: Can be relative (to fd), absolute, or empty if AT_EMPTY_PATH is set
+ *
+ * param: argv
+ *   type: KAPI_TYPE_USER_PTR
+ *   flags: KAPI_PARAM_IN | KAPI_PARAM_USER
+ *   constraint-type: KAPI_CONSTRAINT_CUSTOM
+ *   constraint: NULL-terminated array of strings, total size < MAX_ARG_STRLEN * MAX_ARG_STRINGS
+ *
+ * param: envp
+ *   type: KAPI_TYPE_USER_PTR
+ *   flags: KAPI_PARAM_IN | KAPI_PARAM_USER
+ *   constraint-type: KAPI_CONSTRAINT_CUSTOM
+ *   constraint: NULL-terminated array of strings, total size < MAX_ARG_STRLEN * MAX_ARG_STRINGS
+ *
+ * param: flags
+ *   type: KAPI_TYPE_INT
+ *   flags: KAPI_PARAM_IN
+ *   constraint-type: KAPI_CONSTRAINT_BITMASK
+ *   valid-mask: AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW | AT_EXECVE_CHECK
+ *   constraint: AT_* flag validation
+ *   constraint: AT_EMPTY_PATH allows empty filename
+ *   constraint: AT_SYMLINK_NOFOLLOW prevents following symlinks
+ *   constraint: AT_EXECVE_CHECK only checks if execution would succeed
+ *
+ * return:
+ *   type: KAPI_TYPE_INT
+ *   check-type: KAPI_RETURN_NO_RETURN
+ *   success: 0
+ *   desc: On success, execveat() does not return (except with AT_EXECVE_CHECK which returns 0). On error, -1 is returned and errno is set
+ *
+ * error: E2BIG, Argument list too long
+ *   desc: Total size of argument and environment strings exceeds MAX_ARG_STRLEN * MAX_ARG_STRINGS
+ *     or a single string exceeds MAX_ARG_STRLEN, or too many arguments (> MAX_ARG_STRINGS)
+ *
+ * error: EACCES, Permission denied
+ *   desc: Execute permission denied on file, or search permission denied on path component
+ *     or file is not regular file, or filesystem mounted noexec
+ *     or file on read-only filesystem and requires writing
+ *
+ * error: EAGAIN, Resource limit exceeded
+ *   desc: RLIMIT_NPROC resource limit exceeded and process lacks CAP_SYS_ADMIN and CAP_SYS_RESOURCE
+ *     or cannot allocate necessary kernel structures due to memory pressure
+ *
+ * error: EBADF, Bad file descriptor
+ *   desc: The fd argument is not a valid file descriptor, or is not open
+ *     or filename is empty and AT_EMPTY_PATH not specified
+ *
+ * error: EFAULT, Bad address
+ *   desc: filename, argv, or envp points outside accessible address space
+ *     or argv/envp element points to invalid memory
+ *
+ * error: EINTR, Interrupted by signal
+ *   desc: A signal was caught during execution of execveat()
+ *     typically during the security checks or while setting up the new program
+ *
+ * error: EINVAL, Invalid argument
+ *   desc: Invalid flags specified, or ELF interpreter invalid
+ *     or incompatible architecture, or fd refers to something that cannot be executed
+ *
+ * error: EIO, I/O error
+ *   desc: An I/O error occurred while reading from the file system
+ *
+ * error: EISDIR, Is a directory
+ *   desc: The final component of filename or the file referred to by fd is a directory
+ *     or an ELF interpreter was a directory
+ *
+ * error: ELIBBAD, Invalid ELF interpreter
+ *   desc: An ELF interpreter was not in a recognized format
+ *
+ * error: ELOOP, Too many symbolic links
+ *   desc: Too many symbolic links encountered in resolving filename or fd
+ *     or maximum recursion depth exceeded in script interpreter resolution
+ *
+ * error: EMFILE, Too many open files
+ *   desc: The per-process limit on open file descriptors has been reached
+ *
+ * error: ENAMETOOLONG, Filename too long
+ *   desc: filename is too long, or a component of the pathname exceeds NAME_MAX
+ *     or pathname exceeds PATH_MAX
+ *
+ * error: ENFILE, System file table overflow
+ *   desc: The system-wide limit on the total number of open files has been reached
+ *
+ * error: ENOENT, No such file or directory
+ *   desc: filename or a component of the path does not exist
+ *     or the file referred to by fd does not exist (when AT_EMPTY_PATH)
+ *     or interpreter does not exist
+ *
+ * error: ENOEXEC, Exec format error
+ *   desc: The file is not in a recognized executable format, is for wrong architecture
+ *     or has some other format error that prevents execution
+ *
+ * error: ENOMEM, Out of memory
+ *   desc: Insufficient kernel memory available to execute the new program
+ *     cannot allocate page tables, or other memory structures
+ *
+ * error: ENOTDIR, Not a directory
+ *   desc: A component of the path prefix of filename or fd is not a directory
+ *
+ * error: EPERM, Operation not permitted
+ *   desc: The filesystem is mounted nosuid, the user is not root, and the file has
+ *     set-user-ID or set-group-ID bit set, or file is on a filesystem mounted
+ *     with MS_NOEXEC, or the process is being traced
+ *
+ * error: ETXTBSY, Text file busy
+ *   desc: The executable was open for writing by one or more processes
+ *
+ *
+ * lock: cred_guard_mutex
+ *   type: KAPI_LOCK_MUTEX
+ *   acquired: true
+ *   released: true
+ *   desc: Process credential guard mutex - prevents concurrent credential changes during exec
+ *
+ * lock: exec_update_lock
+ *   type: KAPI_LOCK_RWLOCK
+ *   acquired: true
+ *   released: true
+ *   desc: Signal exec update lock - taken for write during exec to prevent racing changes
+ *
+ * lock: sighand->siglock
+ *   type: KAPI_LOCK_SPINLOCK
+ *   acquired: true
+ *   released: true
+ *   desc: Signal handler spinlock - protects signal handler updates during exec
+ *
+ * lock: tasklist_lock
+ *   type: KAPI_LOCK_RWLOCK
+ *   acquired: true
+ *   released: true
+ *   desc: Global task list lock - taken for write when updating thread group during exec
+ *
+ * lock: binfmt_lock
+ *   type: KAPI_LOCK_RWLOCK
+ *   acquired: true
+ *   released: true
+ *   desc: Binary format list lock - taken for read when searching for binary handlers
+ *
+ * lock: mmap_lock
+ *   type: KAPI_LOCK_RWLOCK
+ *   acquired: true
+ *   released: true
+ *   desc: Memory map lock - taken when setting up new memory layout for executed program
+ *
+ * signal: SIGKILL
+ *   direction: KAPI_SIGNAL_RECEIVE
+ *   action: KAPI_SIGNAL_ACTION_TERMINATE
+ *   condition: Process killed during exec
+ *   desc: If the process is killed (SIGKILL) during execution, the exec
+ *     operation is aborted. This can happen at various points including
+ *     credential changes, memory setup, or binary loading. The process
+ *     terminates immediately without returning from execveat().
+ *   timing: KAPI_SIGNAL_TIME_DURING
+ *   priority: 0
+ *   restartable: no
+ *   state-req: KAPI_SIGNAL_STATE_RUNNING
+ *
+ * signal: FATAL
+ *   direction: KAPI_SIGNAL_RECEIVE
+ *   action: KAPI_SIGNAL_ACTION_RETURN
+ *   condition: Fatal signal pending
+ *   desc: Fatal signals interrupt execveat at specific checkpoints:
+ *     during argument copying, credential setup, and binary loading.
+ *     Returns -EINTR or -ERESTARTNOINTR. After point of no return,
+ *     signals cause the process to terminate rather than return.
+ *   error: -EINTR
+ *   timing: KAPI_SIGNAL_TIME_BEFORE
+ *   priority: 1
+ *   interruptible: yes
+ *   state-req: KAPI_SIGNAL_STATE_RUNNING
+ *
+ * signal: SIGKILL_THREADS
+ *   direction: KAPI_SIGNAL_SEND
+ *   action: KAPI_SIGNAL_ACTION_TERMINATE
+ *   condition: Multi-threaded process doing exec
+ *   desc: During de_thread(), zap_other_threads() sends SIGKILL to all
+ *     other threads in the thread group to ensure only the execing thread
+ *     survives. This ensures the process becomes single-threaded.
+ *   target: All other threads in thread group
+ *   timing: KAPI_SIGNAL_TIME_DURING
+ *   priority: 0
+ *
+ * signal: HANDLERS_RESET
+ *   direction: KAPI_SIGNAL_HANDLE
+ *   action: KAPI_SIGNAL_ACTION_CUSTOM
+ *   condition: Signal has a handler installed
+ *   desc: flush_signal_handlers() resets all signal handlers to SIG_DFL
+ *     except for signals that are ignored (SIG_IGN). This happens after
+ *     de_thread() completes to give the new program a clean signal state.
+ *   timing: KAPI_SIGNAL_TIME_DURING
+ *
+ *
+ * signal: IGNORED_PRESERVED
+ *   direction: KAPI_SIGNAL_IGNORE
+ *   action: KAPI_SIGNAL_ACTION_CUSTOM
+ *   condition: Signal disposition is SIG_IGN
+ *   desc: Signals set to SIG_IGN are preserved across exec. This is
+ *     POSIX-compliant behavior allowing parent processes to control
+ *     signal handling in children.
+ *   timing: KAPI_SIGNAL_TIME_DURING
+ *
+ *
+ * signal: PENDING_CLEARED
+ *   direction: KAPI_SIGNAL_HANDLE
+ *   action: KAPI_SIGNAL_ACTION_CUSTOM
+ *   condition: Any pending signals
+ *   desc: All pending signals are cleared during exec. This includes
+ *     both thread-specific and process-wide pending signals to prevent
+ *     unexpected signal delivery to the new program.
+ *   timing: KAPI_SIGNAL_TIME_DURING
+ *
+ *
+ * signal: TIMER_SIGNALS
+ *   direction: KAPI_SIGNAL_HANDLE
+ *   action: KAPI_SIGNAL_ACTION_CUSTOM
+ *   condition: Timer-generated signals pending
+ *   desc: flush_itimer_signals() clears any pending timer signals
+ *     (SIGALRM, SIGVTALRM, SIGPROF) to prevent confusion in the new program.
+ *     Timer settings are also reset.
+ *   timing: KAPI_SIGNAL_TIME_DURING
+ *
+ *
+ * signal: SIGCHLD_SETUP
+ *   direction: KAPI_SIGNAL_SEND
+ *   action: KAPI_SIGNAL_ACTION_DEFAULT
+ *   condition: Process exit after exec
+ *   desc: The exit_signal is set to SIGCHLD during exec, ensuring the
+ *     parent will receive SIGCHLD when this process terminates.
+ *   target: Parent process
+ *   timing: KAPI_SIGNAL_TIME_AFTER
+ *
+ *
+ * signal: SIGALTSTACK_CLEARED
+ *   direction: KAPI_SIGNAL_HANDLE
+ *   action: KAPI_SIGNAL_ACTION_CUSTOM
+ *   condition: Process had alternate signal stack
+ *   desc: Any alternate signal stack (sigaltstack) is not preserved
+ *     across exec. The new program starts with no alternate stack.
+ *   timing: KAPI_SIGNAL_TIME_DURING
+ *
+ *
+ * signal: SIGSEGV_FORCED
+ *   direction: KAPI_SIGNAL_SEND
+ *   action: KAPI_SIGNAL_ACTION_TERMINATE
+ *   condition: Error after point of no return
+ *   desc: If an error occurs after the point of no return and no fatal
+ *     signal is already pending, force_fatal_sig(SIGSEGV) is called to
+ *     terminate the process since it cannot return to the old state.
+ *   target: Current process
+ *   timing: KAPI_SIGNAL_TIME_AFTER
+ *   priority: 0
+ *
+ * side-effect: KAPI_EFFECT_PROCESS_STATE | KAPI_EFFECT_IRREVERSIBLE
+ *   target: process image
+ *   desc: Completely replaces the process image with new program.
+ *     The entire process address space, including code, data, heap,
+ *     and stack are replaced. Only PID, parent PID, and some signal
+ *     dispositions are preserved. This is irreversible once past the
+ *     point of no return.
+ *
+ *
+ * side-effect: KAPI_EFFECT_MODIFY_STATE | KAPI_EFFECT_CREDS
+ *   target: process credentials
+ *   desc: Updates process credentials for setuid/setgid executables.
+ *     Effective UID/GID are changed to file owner/group if setuid/setgid
+ *     bits are set and filesystem allows. Real UID/GID unchanged unless
+ *     explicitly set. Saved set-user-ID updated. Capabilities may be
+ *     gained or lost. AT_SECURE is set for security transitions.
+ *   condition: File has setuid or setgid bits
+ *   reversible: no
+ *
+ *
+ * side-effect: KAPI_EFFECT_CLOSE_FD
+ *   target: file descriptors
+ *   desc: Closes file descriptors marked with FD_CLOEXEC.
+ *     All file descriptors with FD_CLOEXEC flag are automatically closed.
+ *     Other file descriptors remain open and available to new program.
+ *     Standard streams (0,1,2) typically preserved unless explicitly marked.
+ *   reversible: no
+ *
+ *
+ * side-effect: KAPI_EFFECT_SIGNAL_STATE
+ *   target: signal handlers
+ *   desc: Resets signal handlers to default.
+ *     All caught signals are reset to default disposition (SIG_DFL).
+ *     Ignored signals (SIG_IGN) remain ignored except in special cases.
+ *     Signal mask is preserved. Pending signals are preserved unless
+ *     they would be ignored by the new program.
+ *   reversible: no
+ *
+ *
+ * side-effect: KAPI_EFFECT_MEMORY_MAP
+ *   target: memory mappings
+ *   desc: Destroys all existing memory mappings.
+ *     All memory mappings including shared memory, mmapped files, and
+ *     anonymous mappings are unmapped. New mappings are created for
+ *     the executed program's code, data, and stack. Shared memory
+ *     attachments are detached.
+ *   reversible: no
+ *
+ *
+ * side-effect: KAPI_EFFECT_THREAD_STATE
+ *   target: thread group
+ *   desc: Terminates all other threads in thread group.
+ *     If the calling thread is part of a multi-threaded process,
+ *     all other threads are terminated. The thread group becomes
+ *     single-threaded with only the execing thread surviving.
+ *     Thread group leader transfers if necessary.
+ *   condition: Multi-threaded process
+ *   reversible: no
+ *
+ *
+ * side-effect: KAPI_EFFECT_RLIMIT
+ *   target: resource limits
+ *   desc: Preserves most resource limits.
+ *     Resource limits (RLIMIT_*) are generally preserved across exec.
+ *     RLIMIT_CPU timer is reset. RLIMIT_STACK may be adjusted for
+ *     the new program's requirements.
+ *
+ *
+ * side-effect: KAPI_EFFECT_FILESYSTEM
+ *   target: working directory
+ *   desc: Preserves working directory and root.
+ *     Current working directory and root directory are preserved.
+ *     Umask is preserved. Close-on-exec file descriptors are closed.
+ *     File locks are preserved if not associated with closed descriptors.
+ *
+ *
+ * side-effect: KAPI_EFFECT_ACCOUNTING
+ *   target: process accounting
+ *   desc: Updates accounting information.
+ *     Process accounting records exec event. CPU timers reset.
+ *     Start time updated. Command name (comm) changed to new program.
+ *     Audit events generated for security-relevant transitions.
+ *
+ *
+ * side-effect: KAPI_EFFECT_NAMESPACE
+ *   target: personality
+ *   desc: May change execution personality.
+ *     Execution personality (e.g., Linux, SVR4, etc.) may change based
+ *     on binary format. This affects system call behavior, signal
+ *     numbering, and other ABI details. Usually preserved but can
+ *     change for compatibility.
+ *   condition: Binary requires different personality
+ *
+ *
+ * side-effect: KAPI_EFFECT_IO_CANCEL
+ *   target: io_uring
+ *   desc: Cancels all io_uring operations.
+ *     io_uring_task_cancel() is called to cancel any pending
+ *     io_uring operations. This prevents the new program from inheriting
+ *     incomplete asynchronous I/O operations from the old program.
+ *
+ *
+ * side-effect: KAPI_EFFECT_FILES_UNSHARE
+ *   target: file table
+ *   desc: Unshares file descriptor table.
+ *     unshare_files() ensures the process has its own file
+ *     descriptor table, not shared with other processes. This is required
+ *     for security during credential changes.
+ *
+ *
+ * side-effect: KAPI_EFFECT_PTRACE
+ *   target: ptrace event
+ *   desc: Generates PTRACE_EVENT_EXEC.
+ *     ptrace_event(PTRACE_EVENT_EXEC) notifies any process
+ *     tracing this one that an exec has occurred. The tracer can then
+ *     update its state and continue tracing the new program.
+ *
+ *
+ * side-effect: KAPI_EFFECT_CONNECTOR
+ *   target: process connector
+ *   desc: Sends exec notification.
+ *     proc_exec_connector() sends a notification through
+ *     the process connector (cn_proc) subsystem to inform interested
+ *     listeners that an exec has occurred.
+ *
+ *
+ * side-effect: KAPI_EFFECT_SCHEDULER
+ *   target: scheduler state
+ *   desc: Updates scheduler state for exec.
+ *     sched_exec() performs scheduler operations for exec,
+ *     potentially migrating the task to a less loaded CPU. Also manages
+ *     MM context IDs via sched_mm_cid_before_execve/after_execve.
+ *
+ *
+ * side-effect: KAPI_EFFECT_RSEQ
+ *   target: restartable sequences
+ *   desc: Handles rseq for exec.
+ *     rseq_execve() handles restartable sequence
+ *     state during exec. The rseq area is cleared to prevent the new
+ *     program from using stale rseq data from the old program.
+ *
+ *
+ * side-effect: KAPI_EFFECT_USER_EVENTS
+ *   target: user events
+ *   desc: Notifies user event subsystem.
+ *     user_events_execve() notifies the user events
+ *     tracing subsystem that an exec has occurred, allowing userspace
+ *     tracing tools to track process transitions.
+ *
+ *
+ * side-effect: KAPI_EFFECT_NUMA
+ *   target: NUMA state
+ *   desc: Cleans up NUMA task state.
+ *     task_numa_free() releases NUMA-related task state
+ *     including fault statistics and placement information. The new
+ *     program starts with fresh NUMA placement decisions.
+ *
+ *
+ * state-trans: executing
+ *   to: new program
+ *   condition: exec succeeds
+ *   target: process image
+ *   desc: Process transitions from executing current program to executing new program
+ *
+ *
+ * state-trans: multi-threaded
+ *   to: single-threaded
+ *   condition: exec in threaded process
+ *   target: thread group
+ *   desc: Multi-threaded process becomes single-threaded as all other threads terminate
+ *
+ *
+ * state-trans: unprivileged
+ *   to: privileged
+ *   condition: setuid/setgid exec
+ *   target: process credentials
+ *   desc: Process may gain privileges through setuid/setgid execution
+ *
+ *
+ * state-trans: privileged
+ *   to: unprivileged
+ *   condition: capability drop
+ *   target: process capabilities
+ *   desc: Process may lose capabilities when executing non-privileged binary
+ *
+ *
+ * state-trans: dumpable
+ *   to: non-dumpable
+ *   condition: security transition
+ *   target: process dumpability
+ *   desc: Process becomes non-dumpable after setuid/setgid or capability changes
+ *
+ *
+ * capability: CAP_SYS_ADMIN
+ *   type: KAPI_CAP_BYPASS
+ *   desc: Allows exceeding RLIMIT_NPROC process limit
+ *   allows: Allows exceeding RLIMIT_NPROC process limit
+ *   without: Execution fails with EAGAIN if RLIMIT_NPROC exceeded
+ *   condition: Process count at or above RLIMIT_NPROC
+ *   priority: 0
+ *
+ *
+ * capability: CAP_SYS_RESOURCE
+ *   type: KAPI_CAP_BYPASS
+ *   desc: Allows exceeding RLIMIT_NPROC process limit
+ *   allows: Allows exceeding RLIMIT_NPROC process limit
+ *   without: Execution fails with EAGAIN if RLIMIT_NPROC exceeded
+ *   condition: Process count at or above RLIMIT_NPROC
+ *   priority: 0
+ *
+ *
+ * capability: CAP_DAC_OVERRIDE
+ *   type: KAPI_CAP_BYPASS
+ *   desc: Allows execution of files without execute permission
+ *   allows: Allows execution of files without execute permission
+ *   without: Must have execute permission on file
+ *   condition: File lacks execute permission
+ *   priority: 0
+ *
+ *
+ * capability: CAP_MAC_ADMIN
+ *   type: KAPI_CAP_BYPASS
+ *   desc: May bypass MAC policy restrictions on execution
+ *   allows: May bypass MAC policy restrictions on execution
+ *   without: Subject to mandatory access control policies
+ *   condition: MAC policy would deny execution
+ *   priority: 0
+ *
+ *
+ * constraint: Binary Format Support
+ *   desc: The kernel must have support for the binary format being executed (ELF, script, etc).
+ *     Binary format handlers are registered via register_binfmt().
+ *     If no handler recognizes the format, execution fails with ENOEXEC.
+ *   expr: binfmt_handler_exists(file)
+ *
+ * constraint: Stack Size Limits
+ *   desc: The combined size of arguments and environment cannot exceed the stack limit.
+ *     The kernel enforces MAX_ARG_STRLEN (32 pages) per string and MAX_ARG_STRINGS total strings.
+ *     Additionally respects RLIMIT_STACK.
+ *   expr: total_size <= min(RLIMIT_STACK/4, MAX_ARG_STRLEN * MAX_ARG_STRINGS)
+ *
+ *
+ * constraint: Process Count Limit
+ *   desc: If RLIMIT_NPROC is exceeded, execution fails with EAGAIN unless the process has
+ *     CAP_SYS_ADMIN or CAP_SYS_RESOURCE capabilities. This prevents fork bombs and resource exhaustion.
+ *   expr: user_processes < RLIMIT_NPROC || CAP_SYS_ADMIN || CAP_SYS_RESOURCE
+ *
+ *
+ * constraint: Setuid/Setgid Execution
+ *   desc: Setuid/setgid bits are honored only if: filesystem is not mounted nosuid,
+ *     file has appropriate bits set, and user namespace allows the mapping.
+ *     AT_SECURE flag is set for security-sensitive transitions.
+ *   expr: !nosuid_mount && (S_ISUID || S_ISGID) && uid_mappable && gid_mappable
+ *
+ *
+ * constraint: Script Interpreter Limits
+ *   desc: Script execution (#! interpreter) has a maximum recursion depth of 4 levels
+ *     to prevent infinite loops. The interpreter line is limited to BINPRM_BUF_SIZE (256) bytes.
+ *   expr: interpreter_depth <= 4 && shebang_len <= BINPRM_BUF_SIZE
+ *
+ *
+ * constraint: Memory Layout Requirements
+ *   desc: The new program requires sufficient virtual memory for code, data, stack, and heap.
+ *     The kernel must be able to set up page tables and allocate initial pages.
+ *     Fails with ENOMEM if insufficient.
+ *   expr: available_memory >= program_requirements
+ *
+ *
+ * constraint: Security Module Checks
+ *   desc: LSM (Linux Security Module) hooks are called at multiple points:
+ *     security_bprm_check(), security_bprm_creds_from_file(),
+ *     security_bprm_committing_creds(), security_bprm_committed_creds().
+ *     Any can deny execution.
+ *   expr: all_lsm_checks_pass()
+ *
+ *
+ * constraint: File Descriptor Preservation
+ *   desc: File descriptors marked FD_CLOEXEC are closed during exec.
+ *     Others remain open in the new program. The AT_EMPTY_PATH flag requires
+ *     the fd to refer to a regular file with execute permission.
+ *   expr: fd_valid && (filename || AT_EMPTY_PATH) && (!FD_CLOEXEC || will_close)
+ *
+ *
+ * constraint: Point of No Return
+ *   desc: Once the point of no return is reached (bprm->point_of_no_return set),
+ *     the exec cannot fail gracefully. Any errors after this point result in
+ *     process termination via force_fatal_sig(SIGSEGV) rather than returning an error to userspace.
+ *   expr: !point_of_no_return || (error => process_terminated)
+ *
+ * examples: execveat(AT_FDCWD, "/bin/ls", argv, envp, 0);  // Execute absolute path
+ *   execveat(dirfd, "bin/ls", argv, envp, 0);    // Execute relative to dirfd
+ *   execveat(fd, "", argv, envp, AT_EMPTY_PATH); // Execute file referenced by fd
+ *   execveat(dirfd, "script", argv, envp, AT_SYMLINK_NOFOLLOW); // Don't follow symlinks
+ *   execveat(AT_FDCWD, "/bin/test", argv, envp, AT_EXECVE_CHECK); // Check if exec would succeed
+ *
+ * notes: execveat() is the most flexible exec variant, allowing execution relative to
+ *   directory file descriptors and direct execution of already-open files. The
+ *   AT_EMPTY_PATH flag enables fexecve()-like behavior. AT_EXECVE_CHECK (since
+ *   Linux 6.12) only checks if execution would be allowed without actually executing,
+ *   returning 0 on success. The function never returns on success (except with
+ *   AT_EXECVE_CHECK) - the calling process image is completely replaced. Use fork() or
+ *   clone() first if you want to preserve the parent process.
+ *
+ *   Security considerations: Check-use race conditions are avoided when AT_EMPTY_PATH
+ *   is used with a pre-opened file descriptor. Setuid/setgid bits may be ignored
+ *   in various circumstances including nosuid mounts, user namespaces without
+ *   mappings, and certain security policies. The dumpability of the process
+ *   may change, affecting ptrace attachability and core dump generation.
+ *
+ *   All threads except the calling thread are terminated. Signal handlers are
+ *   reset but the signal mask is preserved. File descriptors are preserved
+ *   except those marked FD_CLOEXEC. The point of no return is reached during
+ *   binary loading - after this point, errors are fatal to the process.
+ *
+ * since-version: 3.19
+ */
 SYSCALL_DEFINE5(execveat,
 		int, fd, const char __user *, filename,
 		const char __user *const __user *, argv,
-- 
2.50.1


^ permalink raw reply related

* [RFC PATCH v4 5/7] mm/mlock: add API specification for mlock
From: Sasha Levin @ 2025-08-25 18:14 UTC (permalink / raw)
  To: linux-api, linux-doc, linux-kernel, tools; +Cc: Sasha Levin
In-Reply-To: <20250825181434.3340805-1-sashal@kernel.org>

Add kernel API specification for the mlock() system call.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 mm/mlock.c | 134 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 134 insertions(+)

diff --git a/mm/mlock.c b/mm/mlock.c
index a1d93ad33c6d..36eac7fec17d 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -656,6 +656,140 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla
 	return 0;
 }
 
+/**
+ * sys_mlock - Lock pages in memory
+ * @start: Starting address of memory range to lock
+ * @len: Length of memory range to lock in bytes
+ *
+ * long-desc: Locks pages in the specified address range into RAM, preventing
+ *   them from being paged to swap. Requires CAP_IPC_LOCK capability
+ *   or RLIMIT_MEMLOCK resource limit.
+ *
+ * context-flags: KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE
+ *
+ * param: start, KAPI_TYPE_UINT
+ *   flags: KAPI_PARAM_IN
+ *   constraint-type: KAPI_CONSTRAINT_NONE
+ *   constraint: Automatically page-aligned down by kernel (PAGE_ALIGN_DOWN)
+ *
+ * param: len, KAPI_TYPE_UINT
+ *   flags: KAPI_PARAM_IN
+ *   constraint-type: KAPI_CONSTRAINT_RANGE
+ *   range: 0, LONG_MAX
+ *   constraint: Automatically page-aligned up by kernel (PAGE_ALIGN)
+ *
+ * return:
+ *   type: KAPI_TYPE_INT
+ *   check-type: KAPI_RETURN_ERROR_CHECK
+ *   success: 0
+ *
+ * error: ENOMEM, Address range issue
+ *   desc: Some of the specified range is not mapped, has unmapped gaps,
+ *   or the lock would cause the number of mapped regions to exceed the limit.
+ *
+ * error: EPERM, Insufficient privileges
+ *   desc: The caller is not privileged (no CAP_IPC_LOCK) and RLIMIT_MEMLOCK is 0.
+ *
+ * error: EINVAL, Address overflow
+ *   desc: The result of the addition start+len was less than start (arithmetic overflow).
+ *
+ * error: EAGAIN, Some or all memory could not be locked
+ *   desc: Some or all of the specified address range could not be locked.
+ *
+ * error: EINTR, Interrupted by signal
+ *   desc: The operation was interrupted by a fatal signal before completion.
+ *
+ * error: EFAULT, Bad address
+ *   desc: The specified address range contains invalid addresses that cannot be accessed.
+ *
+ * since-version: 2.0
+ *
+ * lock: mmap_lock, KAPI_LOCK_RWLOCK
+ *   acquired: true
+ *   released: true
+ *   desc: Process memory map write lock
+ *
+ * signal: FATAL
+ *   direction: KAPI_SIGNAL_RECEIVE
+ *   action: KAPI_SIGNAL_ACTION_RETURN
+ *   condition: Fatal signal pending
+ *   desc: Fatal signals (SIGKILL) can interrupt the operation at two points:
+ *   when acquiring mmap_write_lock_killable() and during page population
+ *   in __mm_populate(). Returns -EINTR. Non-fatal signals do NOT interrupt
+ *   mlock - the operation continues even if SIGINT/SIGTERM are received.
+ *   error: -EINTR
+ *   timing: KAPI_SIGNAL_TIME_DURING
+ *   priority: 0
+ *   interruptible: yes
+ *   state-req: KAPI_SIGNAL_STATE_RUNNING
+ *
+ * examples: mlock(addr, 4096);  // Lock one page
+ *   mlock(addr, len);   // Lock range of pages
+ *
+ * notes: Memory locks do not stack - multiple calls on the same range can be
+ *   undone by a single munlock. Locks are not inherited by child processes.
+ *   Pages are locked on whole page boundaries. Commonly used by real-time
+ *   applications to prevent page faults during time-critical operations.
+ *   Also used for security to prevent sensitive data (e.g., cryptographic keys)
+ *   from being written to swap. Note: locked pages may still be saved to
+ *   swap during system suspend/hibernate.
+ *
+ *   Tagged addresses are automatically handled via untagged_addr(). The operation
+ *   occurs in two phases: first VMAs are marked with VM_LOCKED, then pages are
+ *   populated into memory. When checking RLIMIT_MEMLOCK, the kernel optimizes
+ *   by recounting locked memory to avoid double-counting overlapping regions.
+ * side-effect: KAPI_EFFECT_MODIFY_STATE | KAPI_EFFECT_ALLOC_MEMORY
+ *   target: process memory
+ *   desc: Locks pages into physical memory, preventing swapping
+ *   reversible: yes
+ *
+ * side-effect: KAPI_EFFECT_MODIFY_STATE
+ *   target: mm->locked_vm
+ *   desc: Increases process locked memory counter
+ *   reversible: yes
+ *
+ * side-effect: KAPI_EFFECT_ALLOC_MEMORY
+ *   target: physical pages
+ *   desc: May allocate and populate page table entries
+ *   condition: Pages not already present
+ *   reversible: yes
+ *
+ * side-effect: KAPI_EFFECT_MODIFY_STATE | KAPI_EFFECT_ALLOC_MEMORY
+ *   target: page faults
+ *   desc: Triggers page faults to bring pages into memory
+ *   condition: Pages not already resident
+ *
+ * side-effect: KAPI_EFFECT_MODIFY_STATE
+ *   target: VMA splitting
+ *   desc: May split existing VMAs at lock boundaries
+ *   condition: Lock range partially overlaps existing VMA
+ *
+ * state-trans: memory pages
+ *   from: swappable
+ *   to: locked in RAM
+ *   desc: Pages become non-swappable and pinned in physical memory
+ *
+ * state-trans: VMA flags
+ *   from: unlocked
+ *   to: VM_LOCKED set
+ *   desc: Virtual memory area marked as locked
+ *
+ * capability: CAP_IPC_LOCK, KAPI_CAP_BYPASS_CHECK, CAP_IPC_LOCK capability
+ *   allows: Lock unlimited amount of memory (no RLIMIT_MEMLOCK enforcement)
+ *   without: Must respect RLIMIT_MEMLOCK resource limit
+ *   condition: Checked when RLIMIT_MEMLOCK is 0 or locking would exceed limit
+ *   priority: 0
+ *
+ * constraint: RLIMIT_MEMLOCK Resource Limit
+ *   desc: The RLIMIT_MEMLOCK soft resource limit specifies the maximum bytes of memory that may be locked into RAM. Unprivileged processes are restricted to this limit. CAP_IPC_LOCK capability allows bypassing this limit entirely. The limit is enforced per-process, not per-user.
+ *   expr: locked_memory + request_size <= RLIMIT_MEMLOCK || CAP_IPC_LOCK
+ *
+ * constraint: Memory Pressure and OOM
+ *   desc: Locking large amounts of memory can cause system-wide memory pressure and potentially trigger the OOM killer. The kernel does not prevent locking memory that would destabilize the system.
+ *
+ * constraint: Special Memory Areas
+ *   desc: Some memory types cannot be locked or are silently skipped: VM_IO/VM_PFNMAP areas (device mappings) are skipped; Hugetlb pages are inherently pinned and skipped; DAX mappings are always present in memory and skipped; Secret memory (memfd_secret) mappings are skipped; VM_DROPPABLE memory cannot be locked and is skipped; Gate VMA (kernel entry point) is skipped; VM_LOCKED areas are already locked. These special areas are silently excluded without error.
+ */
 SYSCALL_DEFINE2(mlock, unsigned long, start, size_t, len)
 {
 	return do_mlock(start, len, VM_LOCKED);
-- 
2.50.1


^ permalink raw reply related

* [RFC PATCH v4 4/7] kernel/sched: add specs for sys_sched_setattr()
From: Sasha Levin @ 2025-08-25 18:14 UTC (permalink / raw)
  To: linux-api, linux-doc, linux-kernel, tools; +Cc: Sasha Levin
In-Reply-To: <20250825181434.3340805-1-sashal@kernel.org>

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/sched/syscalls.c | 315 +++++++++++++++++++++++++++++++++++++++-
 1 file changed, 314 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/syscalls.c b/kernel/sched/syscalls.c
index 77ae87f36e84..c5eb0facdc3a 100644
--- a/kernel/sched/syscalls.c
+++ b/kernel/sched/syscalls.c
@@ -969,10 +969,323 @@ SYSCALL_DEFINE2(sched_setparam, pid_t, pid, struct sched_param __user *, param)
 }
 
 /**
- * sys_sched_setattr - same as above, but with extended sched_attr
+ * sys_sched_setattr - set/change scheduling policy and attributes
  * @pid: the pid in question.
  * @uattr: structure containing the extended parameters.
  * @flags: for future extension.
+ *
+ * long-desc: Sets the scheduling policy and attributes for a process,
+ *   supporting multiple scheduling classes including real-time,
+ *   deadline, and normal policies. Performs capability checks,
+ *   validates parameters, enforces resource limits, and ensures
+ *   bandwidth constraints for deadline tasks.
+ * context-flags: KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE
+ *
+ * param-count: 3
+ *
+ * param: pid
+ *   type: KAPI_TYPE_INT
+ *   flags: KAPI_PARAM_IN
+ *   constraint-type: KAPI_CONSTRAINT_RANGE
+ *   range: 0, INT_MAX
+ *   constraint: Must be >= 0, where 0 means current process
+ *
+ * param: uattr
+ *   type: KAPI_TYPE_USER_PTR
+ *   flags: KAPI_PARAM_IN | KAPI_PARAM_USER
+ *   constraint-type: KAPI_CONSTRAINT_CUSTOM
+ *   constraint: Valid user pointer to struct sched_attr
+ *
+ * struct: struct sched_attr
+ *   size: 120
+ *   alignment: 8
+ *   field: size
+ *     type: __u32
+ *     desc: Structure size for version compatibility
+ *     constraint-type: KAPI_CONSTRAINT_RANGE
+ *     range: 48, 512
+ *     constraint: Must be at least SCHED_ATTR_SIZE_VER0
+ *   field: sched_policy
+ *     type: __u32
+ *     desc: Scheduling policy selector
+ *     constraint-type: KAPI_CONSTRAINT_ENUM
+ *     enum: SCHED_NORMAL(0), SCHED_FIFO(1), SCHED_RR(2), SCHED_BATCH(3), SCHED_IDLE(5), SCHED_DEADLINE(6), SCHED_EXT(7)
+ *   field: sched_flags
+ *     type: __u64
+ *     desc: Policy modifier flags
+ *     constraint-type: KAPI_CONSTRAINT_MASK
+ *     mask: SCHED_FLAG_ALL
+ *   field: sched_nice
+ *     type: __s32
+ *     desc: Nice value for CFS policies
+ *     constraint-type: KAPI_CONSTRAINT_RANGE
+ *     range: -20, 19
+ *     constraint: Only used for SCHED_NORMAL, SCHED_BATCH, SCHED_IDLE
+ *   field: sched_priority
+ *     type: __u32
+ *     desc: Priority for RT policies
+ *     constraint-type: KAPI_CONSTRAINT_RANGE
+ *     range: 1, 99
+ *     constraint: Only used for SCHED_FIFO, SCHED_RR
+ *   field: sched_runtime
+ *     type: __u64
+ *     desc: Runtime budget in nanoseconds
+ *     constraint: Only used for SCHED_DEADLINE
+ *   field: sched_deadline
+ *     type: __u64
+ *     desc: Deadline in nanoseconds
+ *     constraint: Only used for SCHED_DEADLINE
+ *   field: sched_period
+ *     type: __u64
+ *     desc: Period in nanoseconds (0 = use deadline)
+ *     constraint: Only used for SCHED_DEADLINE
+ *   field: sched_util_min
+ *     type: __u32
+ *     desc: Minimum utilization hint (v1+)
+ *     constraint-type: KAPI_CONSTRAINT_RANGE
+ *     range: 0, 1024
+ *     constraint: Requires struct version >= 1 and SCHED_FLAG_UTIL_CLAMP_MIN
+ *   field: sched_util_max
+ *     type: __u32
+ *     desc: Maximum utilization hint (v1+)
+ *     constraint-type: KAPI_CONSTRAINT_RANGE
+ *     range: 0, 1024
+ *     constraint: Requires struct version >= 1 and SCHED_FLAG_UTIL_CLAMP_MAX
+ *
+ * param: flags
+ *   type: KAPI_TYPE_UINT
+ *   flags: KAPI_PARAM_IN
+ *   range: 0, 0
+ *   constraint: Must be 0 (reserved for future use)
+ *
+ * validation-group: RT Policies
+ *   policy: SCHED_FIFO, SCHED_RR
+ *   rule: sched_priority must be in [1,99]
+ *   rule: sched_nice must be 0
+ *   rule: No deadline parameters
+ *
+ * validation-group: CFS Policies
+ *   policy: SCHED_NORMAL, SCHED_BATCH, SCHED_IDLE
+ *   rule: sched_priority must be 0
+ *   rule: sched_nice must be in [-20,19]
+ *   rule: No deadline parameters
+ *
+ * validation-group: Deadline Policy
+ *   policy: SCHED_DEADLINE
+ *   rule: sched_runtime > 0
+ *   rule: sched_deadline >= sched_runtime
+ *   rule: sched_period == 0 || sched_period >= sched_deadline
+ *   rule: sched_priority must be 0
+ *   rule: sched_nice must be 0
+ *
+ * validation-group: Utilization Clamping
+ *   flag: SCHED_FLAG_UTIL_CLAMP_MIN, SCHED_FLAG_UTIL_CLAMP_MAX
+ *   rule: Requires struct version >= 1 (size >= 56)
+ *   rule: util values must be in [0,1024]
+ *   rule: util_min <= util_max
+ *
+ * return:
+ *   type: KAPI_TYPE_INT
+ *   check-type: KAPI_RETURN_ERROR_CHECK
+ *   success: 0
+ *
+ * error: EINVAL, Invalid parameters
+ *   desc: Returned when uattr is NULL, pid < 0, flags != 0,
+ *     attr.size < SCHED_ATTR_SIZE_VER0, invalid scheduling policy,
+ *     invalid priority for policy, invalid sched_flags, or malformed
+ *     sched_attr structure (e.g., DL runtime > deadline)
+ *
+ * error: ESRCH, Process not found
+ *   desc: Returned when the specified pid does not exist
+ *
+ * error: EPERM, Insufficient privileges
+ *   desc: Returned when lacking CAP_SYS_NICE for privileged operations,
+ *     trying to change another user's process without CAP_SYS_NICE,
+ *     or resetting SCHED_RESET_ON_FORK flag without privileges
+ *
+ * error: E2BIG, Structure size mismatch
+ *   desc: Returned when sched_attr size is larger than kernel expects
+ *
+ * error: EFAULT, Bad user pointer
+ *   desc: Returned when copying from user space fails or uattr is not
+ *     a valid readable user pointer
+ *
+ * error: EBUSY, Bandwidth exceeded
+ *   desc: Returned when SCHED_DEADLINE bandwidth would be exceeded or
+ *     deadline admission test fails
+ *
+ * error: EAGAIN, Transient failure
+ *   desc: Returned when unable to change cpus_allowed due to transient
+ *     cpuset or CPU hotplug conditions
+ *
+ * error: ENOMEM, Memory allocation failed
+ *   desc: Returned when unable to allocate memory for CPU masks
+ *
+ * error: EOPNOTSUPP, Feature not supported
+ *   desc: Returned when utilization clamping is requested but
+ *     CONFIG_UCLAMP_TASK is not enabled
+ *
+ * since-version: 3.14
+ *
+ * lock: rq->lock
+ *   type: KAPI_LOCK_SPINLOCK
+ *   acquired: true
+ *   released: true
+ *   desc: Process runqueue lock for scheduler state changes
+ *
+ * lock: p->pi_lock
+ *   type: KAPI_LOCK_SPINLOCK
+ *   acquired: true
+ *   released: true
+ *   desc: Priority inheritance lock for PI chain adjustments
+ *
+ * lock: cpuset_mutex
+ *   type: KAPI_LOCK_MUTEX
+ *   acquired: true
+ *   released: true
+ *   desc: Cpuset mutex for SCHED_DEADLINE bandwidth checks
+ *
+ *
+ * signal: SIGXCPU
+ *   direction: KAPI_SIGNAL_SEND
+ *   action: KAPI_SIGNAL_ACTION_DEFAULT
+ *   condition: SCHED_FLAG_DL_OVERRUN is set and deadline is missed
+ *   desc: Sent to task when it exceeds its SCHED_DEADLINE runtime.
+ *     The signal is sent asynchronously from the scheduler tick or
+ *     deadline timer. Unlike other scheduling policies, SCHED_DEADLINE
+ *     can generate SIGXCPU for runtime overruns rather than just
+ *     CPU time limit violations.
+ *   timing: KAPI_SIGNAL_TIME_DURING
+ *   priority: 0
+ *   interruptible: no
+ *   state-req: KAPI_SIGNAL_STATE_RUNNING
+ *
+ * examples: sched_setattr(0, &attr, 0);  // Set attributes for current task
+ *   sched_setattr(pid, &attr, 0);  // Set attributes for specific task
+ *
+ * notes: The sched_attr structure supports forward/backward compatibility
+ *   through its size field. Older kernels ignore newer fields. The syscall
+ *   validates all parameters based on the scheduling policy. For SCHED_DEADLINE,
+ *   it performs CBS (Constant Bandwidth Server) admission control. Priority
+ *   changes may trigger immediate reschedule. RT policies require sched_priority
+ *   in range [1,99]. Normal policies use nice values [-20,19] mapped to
+ *   static_prio. Changes are atomic - either all succeed or none are applied.
+ *
+ * side-effect: KAPI_EFFECT_MODIFY_STATE | KAPI_EFFECT_PROCESS_STATE
+ *   target: task scheduling attributes
+ *   desc: Updates policy/priority/deadline parameters atomically
+ *   reversible: yes
+ *
+ *
+ * side-effect: KAPI_EFFECT_MODIFY_STATE | KAPI_EFFECT_SCHEDULE
+ *   target: runqueue
+ *   desc: May requeue task with new priority and trigger reschedule
+ *   condition: Task is runnable
+ *
+ *
+ * side-effect: KAPI_EFFECT_MODIFY_STATE
+ *   target: deadline bandwidth
+ *   desc: Allocates CBS bandwidth for SCHED_DEADLINE tasks
+ *   condition: Policy is SCHED_DEADLINE
+ *   reversible: yes
+ *
+ *
+ * side-effect: KAPI_EFFECT_MODIFY_STATE
+ *   target: timer slack
+ *   desc: Sets timer slack to 0 for RT/DL policies
+ *   condition: RT or DEADLINE policy
+ *
+ *
+ * side-effect: KAPI_EFFECT_MODIFY_STATE
+ *   target: PI chain
+ *   desc: Updates priority inheritance chain if task has PI waiters
+ *   condition: Task has PI waiters
+ *
+ *
+ * side-effect: KAPI_EFFECT_MODIFY_STATE | KAPI_EFFECT_SCHEDULE
+ *   target: CPU
+ *   desc: May migrate task to different CPU based on affinity/bandwidth
+ *   condition: SCHED_DEADLINE or cpuset changes
+ *
+ *
+ * state-trans: task->policy
+ *   from: any policy
+ *   to: new policy
+ *   desc: Task scheduling policy changes per sched_attr
+ *
+ *
+ * state-trans: task->rt_priority
+ *   from: any
+ *   to: 0-99 or 0
+ *   desc: RT priority updated for RT policies, 0 for others
+ *
+ *
+ * state-trans: task->normal_prio
+ *   from: any
+ *   to: recalculated
+ *   desc: Normal priority recalculated based on policy/nice
+ *
+ *
+ * state-trans: task->sched_reset_on_fork
+ *   from: 0/1
+ *   to: 0/1
+ *   desc: Reset-on-fork flag updated per SCHED_FLAG_RESET_ON_FORK
+ *
+ *
+ * state-trans: task->dl
+ *   from: inactive/active
+ *   to: active/inactive
+ *   desc: Deadline entity activated for SCHED_DEADLINE
+ *
+ *
+ * capability: CAP_SYS_NICE
+ *   type: KAPI_CAP_BYPASS_CHECK
+ *   desc: CAP_SYS_NICE capability
+ *   allows: Set RT/DL policies, increase priority, nice < 0, change other users' tasks, remove SCHED_FLAG_RESET_ON_FORK
+ *   without: Can only set SCHED_NORMAL/BATCH/IDLE, decrease priority, nice >= 0, modify own tasks
+ *   condition: Checked when setting RT/DL policy, decreasing nice, or modifying other user's tasks
+ *   priority: 0
+ *
+ *
+ * constraint: Valid Scheduling Policy
+ *   desc: The sched_policy field must be one of: SCHED_NORMAL (0), SCHED_FIFO (1), SCHED_RR (2),
+ *     SCHED_BATCH (3), SCHED_IDLE (5), SCHED_DEADLINE (6), or SCHED_EXT (7) if configured.
+ *     Invalid policies result in -EINVAL.
+ *   expr: uattr->sched_policy >= 0 && (uattr->sched_policy <= SCHED_DEADLINE || (uattr->sched_policy == SCHED_EXT && IS_ENABLED(CONFIG_SCHED_CLASS_EXT)))
+ *
+ *
+ * constraint: RT Priority Range
+ *   desc: For SCHED_FIFO and SCHED_RR policies, sched_priority must be in range [1, 99]
+ *     where 1 is lowest and 99 is highest RT priority. For other policies, sched_priority must be 0.
+ *   expr: rt_policy(uattr->sched_policy) ? (uattr->sched_priority >= 1 && uattr->sched_priority <= 99) : (uattr->sched_priority == 0)
+ *
+ *
+ * constraint: Nice Value Range
+ *   desc: For SCHED_NORMAL, SCHED_BATCH, and SCHED_IDLE policies, the nice value must be in range [-20, 19]
+ *     where -20 is highest priority (least nice) and 19 is lowest priority (most nice).
+ *   expr: fair_policy(uattr->sched_policy) ? (uattr->sched_nice >= MIN_NICE && uattr->sched_nice <= MAX_NICE) : 1
+ *
+ *
+ * constraint: SCHED_DEADLINE CBS Rules
+ *   desc: For SCHED_DEADLINE, must satisfy: sched_runtime > 0, sched_deadline >= sched_runtime,
+ *     sched_period >= sched_deadline. If period is 0, it defaults to deadline.
+ *   expr: dl_policy(uattr->sched_policy) ? (uattr->sched_runtime > 0 && uattr->sched_runtime <= uattr->sched_deadline && (uattr->sched_period == 0 || uattr->sched_period >= uattr->sched_deadline)) : 1
+ *
+ *
+ * constraint: Utilization Clamping Range
+ *   desc: If sched_flags includes SCHED_FLAG_UTIL_CLAMP_MIN/MAX, the util_min and util_max values
+ *     must be in range [0, 1024] where 1024 represents 100% utilization.
+ *   expr: (uattr->sched_flags & SCHED_FLAG_UTIL_CLAMP) ? (uattr->sched_util_min >= 0 && uattr->sched_util_min <= SCHED_CAPACITY_SCALE && uattr->sched_util_max >= 0 && uattr->sched_util_max <= SCHED_CAPACITY_SCALE && uattr->sched_util_min <= uattr->sched_util_max) : 1
+ *
+ *
+ * constraint: SCHED_DEADLINE Bandwidth
+ *   desc: The sum of runtime/period ratios for all SCHED_DEADLINE tasks on the system
+ *     must not exceed the available CPU capacity. This global bandwidth check prevents system overload.
+ *
+ *
+ * constraint: Structure Size Compatibility
+ *   desc: The attr.size field must be at least SCHED_ATTR_SIZE_VER0 (48 bytes) and no larger than
+ *     the kernel's known structure size to ensure forward/backward compatibility.
  */
 SYSCALL_DEFINE3(sched_setattr, pid_t, pid, struct sched_attr __user *, uattr,
 			       unsigned int, flags)
-- 
2.50.1


^ permalink raw reply related

* [RFC PATCH v4 3/7] kernel/api: add debugfs interface for kernel API specifications
From: Sasha Levin @ 2025-08-25 18:14 UTC (permalink / raw)
  To: linux-api, linux-doc, linux-kernel, tools; +Cc: Sasha Levin
In-Reply-To: <20250825181434.3340805-1-sashal@kernel.org>

Add a debugfs interface to expose kernel API specifications at runtime.
This allows tools and users to query the complete API specifications
through the debugfs filesystem.

The interface provides:
- /sys/kernel/debug/kapi/list - lists all available API specifications
- /sys/kernel/debug/kapi/specs/<name> - detailed info for each API

Each specification file includes:
- Function name, version, and descriptions
- Execution context requirements and flags
- Parameter details with types, flags, and constraints
- Return value specifications and success conditions
- Error codes with descriptions and conditions
- Locking requirements and constraints
- Signal handling specifications
- Examples, notes, and deprecation status

This enables runtime introspection of kernel APIs for documentation
tools, static analyzers, and debugging purposes.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/api/Kconfig        |  20 +++
 kernel/api/Makefile       |   6 +-
 kernel/api/kapi_debugfs.c | 334 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 359 insertions(+), 1 deletion(-)
 create mode 100644 kernel/api/kapi_debugfs.c

diff --git a/kernel/api/Kconfig b/kernel/api/Kconfig
index fde25ec70e13..d2754b21acc4 100644
--- a/kernel/api/Kconfig
+++ b/kernel/api/Kconfig
@@ -33,3 +33,23 @@ config KAPI_RUNTIME_CHECKS
 	  development. The checks use WARN_ONCE to report violations.
 
 	  If unsure, say N.
+
+config KAPI_SPEC_DEBUGFS
+	bool "Export kernel API specifications via debugfs"
+	depends on KAPI_SPEC
+	depends on DEBUG_FS
+	help
+	  This option enables exporting kernel API specifications through
+	  the debugfs filesystem. When enabled, specifications can be
+	  accessed at /sys/kernel/debug/kapi/.
+
+	  The debugfs interface provides:
+	  - A list of all available API specifications
+	  - Detailed information for each API including parameters,
+	    return values, errors, locking requirements, and constraints
+	  - Complete machine-readable representation of the specs
+
+	  This is useful for documentation tools, static analyzers, and
+	  runtime introspection of kernel APIs.
+
+	  If unsure, say N.
diff --git a/kernel/api/Makefile b/kernel/api/Makefile
index 312d35179c78..396b2da1a109 100644
--- a/kernel/api/Makefile
+++ b/kernel/api/Makefile
@@ -10,6 +10,9 @@ obj-$(CONFIG_KAPI_SPEC)		+= kernel_api_spec.o
 ifeq ($(CONFIG_KAPI_SPEC),y)
 obj-$(CONFIG_KAPI_SPEC)		+= generated_api_specs.o
 
+# Debugfs interface for kernel API specs
+obj-$(CONFIG_KAPI_SPEC_DEBUGFS) += kapi_debugfs.o
+
 # Find all potential apispec files (this is evaluated at make time)
 apispec-files := $(shell find $(objtree) -name "*.apispec.h" -type f 2>/dev/null)
 
@@ -23,4 +26,5 @@ clean-files += generated_api_specs.c
 
 # Add explicit dependency on the generator script
 $(obj)/generated_api_specs.o: $(obj)/generated_api_specs.c
-endif
\ No newline at end of file
+endif
+
diff --git a/kernel/api/kapi_debugfs.c b/kernel/api/kapi_debugfs.c
new file mode 100644
index 000000000000..b75850b66ee3
--- /dev/null
+++ b/kernel/api/kapi_debugfs.c
@@ -0,0 +1,334 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Kernel API specification debugfs interface
+ *
+ * This provides a debugfs interface to expose kernel API specifications
+ * at runtime, allowing tools and users to query the complete API specs.
+ */
+
+#include <linux/debugfs.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/seq_file.h>
+#include <linux/kernel_api_spec.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+
+/* External symbols for kernel API spec section */
+extern struct kernel_api_spec __start_kapi_specs[];
+extern struct kernel_api_spec __stop_kapi_specs[];
+
+static struct dentry *kapi_debugfs_root;
+
+/* Helper function to print parameter type as string */
+static const char *param_type_str(enum kapi_param_type type)
+{
+	switch (type) {
+	case KAPI_TYPE_INT: return "int";
+	case KAPI_TYPE_UINT: return "uint";
+	case KAPI_TYPE_PTR: return "ptr";
+	case KAPI_TYPE_STRUCT: return "struct";
+	case KAPI_TYPE_UNION: return "union";
+	case KAPI_TYPE_ARRAY: return "array";
+	case KAPI_TYPE_FD: return "fd";
+	case KAPI_TYPE_ENUM: return "enum";
+	case KAPI_TYPE_USER_PTR: return "user_ptr";
+	case KAPI_TYPE_PATH: return "path";
+	case KAPI_TYPE_FUNC_PTR: return "func_ptr";
+	case KAPI_TYPE_CUSTOM: return "custom";
+	default: return "unknown";
+	}
+}
+
+/* Helper to print parameter flags */
+static void print_param_flags(struct seq_file *m, u32 flags)
+{
+	seq_printf(m, "    flags: ");
+	if (flags & KAPI_PARAM_IN) seq_printf(m, "IN ");
+	if (flags & KAPI_PARAM_OUT) seq_printf(m, "OUT ");
+	if (flags & KAPI_PARAM_INOUT) seq_printf(m, "INOUT ");
+	if (flags & KAPI_PARAM_OPTIONAL) seq_printf(m, "OPTIONAL ");
+	if (flags & KAPI_PARAM_CONST) seq_printf(m, "CONST ");
+	if (flags & KAPI_PARAM_USER) seq_printf(m, "USER ");
+	if (flags & KAPI_PARAM_VOLATILE) seq_printf(m, "VOLATILE ");
+	if (flags & KAPI_PARAM_DMA) seq_printf(m, "DMA ");
+	if (flags & KAPI_PARAM_ALIGNED) seq_printf(m, "ALIGNED ");
+	seq_printf(m, "\n");
+}
+
+/* Helper to print context flags */
+static void print_context_flags(struct seq_file *m, u32 flags)
+{
+	seq_printf(m, "Context flags: ");
+	if (flags & KAPI_CTX_PROCESS) seq_printf(m, "PROCESS ");
+	if (flags & KAPI_CTX_HARDIRQ) seq_printf(m, "HARDIRQ ");
+	if (flags & KAPI_CTX_SOFTIRQ) seq_printf(m, "SOFTIRQ ");
+	if (flags & KAPI_CTX_NMI) seq_printf(m, "NMI ");
+	if (flags & KAPI_CTX_SLEEPABLE) seq_printf(m, "SLEEPABLE ");
+	if (flags & KAPI_CTX_ATOMIC) seq_printf(m, "ATOMIC ");
+	if (flags & KAPI_CTX_PREEMPT_DISABLED) seq_printf(m, "PREEMPT_DISABLED ");
+	if (flags & KAPI_CTX_IRQ_DISABLED) seq_printf(m, "IRQ_DISABLED ");
+	seq_printf(m, "\n");
+}
+
+/* Show function for individual API spec */
+static int kapi_spec_show(struct seq_file *m, void *v)
+{
+	struct kernel_api_spec *spec = m->private;
+	int i;
+
+	seq_printf(m, "Kernel API Specification\n");
+	seq_printf(m, "========================\n\n");
+
+	/* Basic info */
+	seq_printf(m, "Name: %s\n", spec->name);
+	seq_printf(m, "Version: %u\n", spec->version);
+	seq_printf(m, "Description: %s\n", spec->description);
+	if (strlen(spec->long_description) > 0)
+		seq_printf(m, "Long description: %s\n", spec->long_description);
+
+	/* Context */
+	print_context_flags(m, spec->context_flags);
+	seq_printf(m, "\n");
+
+	/* Parameters */
+	if (spec->param_count > 0) {
+		seq_printf(m, "Parameters (%u):\n", spec->param_count);
+		for (i = 0; i < spec->param_count; i++) {
+			struct kapi_param_spec *param = &spec->params[i];
+			seq_printf(m, "  [%d] %s:\n", i, param->name);
+			seq_printf(m, "    type: %s (%s)\n",
+				   param_type_str(param->type), param->type_name);
+			print_param_flags(m, param->flags);
+			if (strlen(param->description) > 0)
+				seq_printf(m, "    description: %s\n", param->description);
+			if (param->size > 0)
+				seq_printf(m, "    size: %zu\n", param->size);
+			if (param->alignment > 0)
+				seq_printf(m, "    alignment: %zu\n", param->alignment);
+
+			/* Print constraints if any */
+			if (param->constraint_type != KAPI_CONSTRAINT_NONE) {
+				seq_printf(m, "    constraints:\n");
+				switch (param->constraint_type) {
+				case KAPI_CONSTRAINT_RANGE:
+					seq_printf(m, "      type: range\n");
+					seq_printf(m, "      min: %lld\n", param->min_value);
+					seq_printf(m, "      max: %lld\n", param->max_value);
+					break;
+				case KAPI_CONSTRAINT_MASK:
+					seq_printf(m, "      type: mask\n");
+					seq_printf(m, "      valid_bits: 0x%llx\n", param->valid_mask);
+					break;
+				case KAPI_CONSTRAINT_ENUM:
+					seq_printf(m, "      type: enum\n");
+					seq_printf(m, "      count: %u\n", param->enum_count);
+					break;
+				case KAPI_CONSTRAINT_CUSTOM:
+					seq_printf(m, "      type: custom\n");
+					if (strlen(param->constraints) > 0)
+						seq_printf(m, "      description: %s\n",
+							   param->constraints);
+					break;
+				default:
+					break;
+				}
+			}
+			seq_printf(m, "\n");
+		}
+	}
+
+	/* Return value */
+	seq_printf(m, "Return value:\n");
+	seq_printf(m, "  type: %s\n", spec->return_spec.type_name);
+	if (strlen(spec->return_spec.description) > 0)
+		seq_printf(m, "  description: %s\n", spec->return_spec.description);
+
+	switch (spec->return_spec.check_type) {
+	case KAPI_RETURN_EXACT:
+		seq_printf(m, "  success: == %lld\n", spec->return_spec.success_value);
+		break;
+	case KAPI_RETURN_RANGE:
+		seq_printf(m, "  success: [%lld, %lld]\n",
+			   spec->return_spec.success_min,
+			   spec->return_spec.success_max);
+		break;
+	case KAPI_RETURN_FD:
+		seq_printf(m, "  success: valid file descriptor (>= 0)\n");
+		break;
+	case KAPI_RETURN_ERROR_CHECK:
+		seq_printf(m, "  success: error check\n");
+		break;
+	case KAPI_RETURN_CUSTOM:
+		seq_printf(m, "  success: custom check\n");
+		break;
+	default:
+		break;
+	}
+	seq_printf(m, "\n");
+
+	/* Errors */
+	if (spec->error_count > 0) {
+		seq_printf(m, "Errors (%u):\n", spec->error_count);
+		for (i = 0; i < spec->error_count; i++) {
+			struct kapi_error_spec *err = &spec->errors[i];
+			seq_printf(m, "  %s (%d): %s\n",
+				   err->name, err->error_code, err->description);
+			if (strlen(err->condition) > 0)
+				seq_printf(m, "    condition: %s\n", err->condition);
+		}
+		seq_printf(m, "\n");
+	}
+
+	/* Locks */
+	if (spec->lock_count > 0) {
+		seq_printf(m, "Locks (%u):\n", spec->lock_count);
+		for (i = 0; i < spec->lock_count; i++) {
+			struct kapi_lock_spec *lock = &spec->locks[i];
+			const char *type_str;
+			switch (lock->lock_type) {
+			case KAPI_LOCK_MUTEX: type_str = "mutex"; break;
+			case KAPI_LOCK_SPINLOCK: type_str = "spinlock"; break;
+			case KAPI_LOCK_RWLOCK: type_str = "rwlock"; break;
+			case KAPI_LOCK_SEMAPHORE: type_str = "semaphore"; break;
+			case KAPI_LOCK_RCU: type_str = "rcu"; break;
+			case KAPI_LOCK_SEQLOCK: type_str = "seqlock"; break;
+			default: type_str = "unknown"; break;
+			}
+			seq_printf(m, "  %s (%s): %s\n",
+				   lock->lock_name, type_str, lock->description);
+			if (lock->acquired)
+				seq_printf(m, "    acquired by function\n");
+			if (lock->released)
+				seq_printf(m, "    released by function\n");
+		}
+		seq_printf(m, "\n");
+	}
+
+	/* Constraints */
+	if (spec->constraint_count > 0) {
+		seq_printf(m, "Additional constraints (%u):\n", spec->constraint_count);
+		for (i = 0; i < spec->constraint_count; i++) {
+			seq_printf(m, "  - %s\n", spec->constraints[i].description);
+		}
+		seq_printf(m, "\n");
+	}
+
+	/* Signals */
+	if (spec->signal_count > 0) {
+		seq_printf(m, "Signal handling (%u):\n", spec->signal_count);
+		for (i = 0; i < spec->signal_count; i++) {
+			struct kapi_signal_spec *sig = &spec->signals[i];
+			seq_printf(m, "  %s (%d):\n", sig->signal_name, sig->signal_num);
+			seq_printf(m, "    direction: ");
+			if (sig->direction & KAPI_SIGNAL_SEND) seq_printf(m, "send ");
+			if (sig->direction & KAPI_SIGNAL_RECEIVE) seq_printf(m, "receive ");
+			if (sig->direction & KAPI_SIGNAL_HANDLE) seq_printf(m, "handle ");
+			if (sig->direction & KAPI_SIGNAL_BLOCK) seq_printf(m, "block ");
+			if (sig->direction & KAPI_SIGNAL_IGNORE) seq_printf(m, "ignore ");
+			seq_printf(m, "\n");
+			seq_printf(m, "    action: ");
+			switch (sig->action) {
+			case KAPI_SIGNAL_ACTION_DEFAULT: seq_printf(m, "default"); break;
+			case KAPI_SIGNAL_ACTION_TERMINATE: seq_printf(m, "terminate"); break;
+			case KAPI_SIGNAL_ACTION_COREDUMP: seq_printf(m, "coredump"); break;
+			case KAPI_SIGNAL_ACTION_STOP: seq_printf(m, "stop"); break;
+			case KAPI_SIGNAL_ACTION_CONTINUE: seq_printf(m, "continue"); break;
+			case KAPI_SIGNAL_ACTION_CUSTOM: seq_printf(m, "custom"); break;
+			case KAPI_SIGNAL_ACTION_RETURN: seq_printf(m, "return"); break;
+			case KAPI_SIGNAL_ACTION_RESTART: seq_printf(m, "restart"); break;
+			default: seq_printf(m, "unknown"); break;
+			}
+			seq_printf(m, "\n");
+			if (strlen(sig->description) > 0)
+				seq_printf(m, "    description: %s\n", sig->description);
+		}
+		seq_printf(m, "\n");
+	}
+
+	/* Additional info */
+	if (strlen(spec->examples) > 0) {
+		seq_printf(m, "Examples:\n%s\n\n", spec->examples);
+	}
+	if (strlen(spec->notes) > 0) {
+		seq_printf(m, "Notes:\n%s\n\n", spec->notes);
+	}
+	if (strlen(spec->since_version) > 0) {
+		seq_printf(m, "Since: %s\n", spec->since_version);
+	}
+
+	return 0;
+}
+
+static int kapi_spec_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, kapi_spec_show, inode->i_private);
+}
+
+static const struct file_operations kapi_spec_fops = {
+	.open = kapi_spec_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
+/* Show all available API specs */
+static int kapi_list_show(struct seq_file *m, void *v)
+{
+	struct kernel_api_spec *spec;
+	int count = 0;
+
+	seq_printf(m, "Available Kernel API Specifications\n");
+	seq_printf(m, "===================================\n\n");
+
+	for (spec = __start_kapi_specs; spec < __stop_kapi_specs; spec++) {
+		seq_printf(m, "%s - %s\n", spec->name, spec->description);
+		count++;
+	}
+
+	seq_printf(m, "\nTotal: %d specifications\n", count);
+	return 0;
+}
+
+static int kapi_list_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, kapi_list_show, NULL);
+}
+
+static const struct file_operations kapi_list_fops = {
+	.open = kapi_list_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
+static int __init kapi_debugfs_init(void)
+{
+	struct kernel_api_spec *spec;
+	struct dentry *spec_dir;
+
+	/* Create main directory */
+	kapi_debugfs_root = debugfs_create_dir("kapi", NULL);
+
+	/* Create list file */
+	debugfs_create_file("list", 0444, kapi_debugfs_root, NULL, &kapi_list_fops);
+
+	/* Create specs subdirectory */
+	spec_dir = debugfs_create_dir("specs", kapi_debugfs_root);
+
+	/* Create a file for each API spec */
+	for (spec = __start_kapi_specs; spec < __stop_kapi_specs; spec++) {
+		debugfs_create_file(spec->name, 0444, spec_dir, spec, &kapi_spec_fops);
+	}
+
+	pr_info("Kernel API debugfs interface initialized\n");
+	return 0;
+}
+
+static void __exit kapi_debugfs_exit(void)
+{
+	debugfs_remove_recursive(kapi_debugfs_root);
+}
+
+/* Initialize as part of kernel, not as a module */
+fs_initcall(kapi_debugfs_init);
\ No newline at end of file
-- 
2.50.1


^ permalink raw reply related

* [RFC PATCH v4 2/7] kernel/api: enable kerneldoc-based API specifications
From: Sasha Levin @ 2025-08-25 18:14 UTC (permalink / raw)
  To: linux-api, linux-doc, linux-kernel, tools; +Cc: Sasha Levin
In-Reply-To: <20250825181434.3340805-1-sashal@kernel.org>

This patch adds support for extracting API specifications from
kernel-doc comments and generating C macro invocations for the
kernel API specification framework.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 kernel/api/Makefile              |  21 +-
 scripts/Makefile.build           |  28 ++
 scripts/generate_api_specs.sh    |  69 +++
 scripts/kernel-doc.py            |   5 +
 scripts/lib/kdoc/kdoc_apispec.py | 714 +++++++++++++++++++++++++++++++
 scripts/lib/kdoc/kdoc_output.py  |   9 +-
 scripts/lib/kdoc/kdoc_parser.py  |  50 ++-
 7 files changed, 891 insertions(+), 5 deletions(-)
 create mode 100755 scripts/generate_api_specs.sh
 create mode 100644 scripts/lib/kdoc/kdoc_apispec.py

diff --git a/kernel/api/Makefile b/kernel/api/Makefile
index 4120ded7e5cf..312d35179c78 100644
--- a/kernel/api/Makefile
+++ b/kernel/api/Makefile
@@ -4,4 +4,23 @@
 #
 
 # Core API specification framework
-obj-$(CONFIG_KAPI_SPEC)		+= kernel_api_spec.o
\ No newline at end of file
+obj-$(CONFIG_KAPI_SPEC)		+= kernel_api_spec.o
+
+# Auto-generated API specifications collector
+ifeq ($(CONFIG_KAPI_SPEC),y)
+obj-$(CONFIG_KAPI_SPEC)		+= generated_api_specs.o
+
+# Find all potential apispec files (this is evaluated at make time)
+apispec-files := $(shell find $(objtree) -name "*.apispec.h" -type f 2>/dev/null)
+
+# Generate the collector file
+# Note: FORCE ensures this is always regenerated to pick up new apispec files
+$(obj)/generated_api_specs.c: $(srctree)/scripts/generate_api_specs.sh FORCE
+	$(Q)$(CONFIG_SHELL) $< $(srctree) $(objtree) > $@
+
+targets += generated_api_specs.c
+clean-files += generated_api_specs.c
+
+# Add explicit dependency on the generator script
+$(obj)/generated_api_specs.o: $(obj)/generated_api_specs.c
+endif
\ No newline at end of file
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index d0ee33a487be..8d54b685debe 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -172,6 +172,34 @@ ifneq ($(KBUILD_EXTRA_WARN),)
         $<
 endif
 
+# Generate API spec headers from kernel-doc comments
+ifeq ($(CONFIG_KAPI_SPEC),y)
+# Function to check if a file has API specifications
+has-apispec = $(shell grep -qE '^\s*\*\s*(api-type|long-desc|context-flags|param-type|error-code|capability|signal|lock|side-effect|state-trans):' $(src)/$(1) 2>/dev/null && echo $(1))
+
+# Get base names without directory prefix
+c-objs-base := $(notdir $(real-obj-y) $(real-obj-m))
+# Filter to only .o files with corresponding .c source files
+c-files := $(foreach o,$(c-objs-base),$(if $(wildcard $(src)/$(o:.o=.c)),$(o:.o=.c)))
+# Also check for any additional .c files that contain API specs but are included
+extra-c-files := $(shell find $(src) -maxdepth 1 -name "*.c" -exec grep -l '^\s*\*\s*\(api-type\|long-desc\|context-flags\|param-type\|error-code\|capability\|signal\|lock\|side-effect\|state-trans\):' {} \; 2>/dev/null | xargs -r basename -a)
+# Combine both lists and remove duplicates
+all-c-files := $(sort $(c-files) $(extra-c-files))
+# Only include files that actually have API specifications
+apispec-files := $(foreach f,$(all-c-files),$(call has-apispec,$(f)))
+# Generate apispec targets with proper directory prefix
+apispec-y := $(addprefix $(obj)/,$(apispec-files:.c=.apispec.h))
+always-y += $(apispec-y)
+targets += $(apispec-y)
+
+quiet_cmd_apispec = APISPEC $@
+      cmd_apispec = PYTHONDONTWRITEBYTECODE=1 $(KERNELDOC) -apispec \
+                    $(KDOCFLAGS) $< > $@ 2>/dev/null || rm -f $@
+
+$(obj)/%.apispec.h: $(src)/%.c FORCE
+	$(call if_changed,apispec)
+endif
+
 # Compile C sources (.c)
 # ---------------------------------------------------------------------------
 
diff --git a/scripts/generate_api_specs.sh b/scripts/generate_api_specs.sh
new file mode 100755
index 000000000000..fe7797bfd135
--- /dev/null
+++ b/scripts/generate_api_specs.sh
@@ -0,0 +1,69 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# generate_api_specs.sh - Generate C file that includes all API specification headers
+#
+# Usage: generate_api_specs.sh <srctree> <objtree>
+
+SRCTREE="$1"
+OBJTREE="$2"
+
+if [ -z "$SRCTREE" ] || [ -z "$OBJTREE" ]; then
+    echo "Usage: $0 <srctree> <objtree>" >&2
+    exit 1
+fi
+
+# Generate header
+cat <<EOF
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Auto-generated file - DO NOT EDIT
+ * Generated by: scripts/generate_api_specs.sh
+ *
+ * This file includes all kernel API specification headers
+ */
+
+#include <linux/kernel.h>
+#include <linux/kernel_api_spec.h>
+#include <linux/errno.h>
+#include <linux/capability.h>
+#include <linux/fcntl.h>
+#include <uapi/linux/sched/types.h>
+
+#ifdef CONFIG_KAPI_SPEC
+
+EOF
+
+# Find all .apispec.h files and generate includes
+# Look in both source tree and object tree
+(find "$SRCTREE" -name "*.apispec.h" -type f 2>/dev/null; \
+ find "$OBJTREE" -name "*.apispec.h" -type f 2>/dev/null) | \
+    grep -v "/generated_api_specs.c" | \
+    sort -u | \
+    while read -r apispec_file; do
+        # Get relative path from srctree or objtree
+        case "$apispec_file" in
+            "$SRCTREE"*)
+                rel_path="${apispec_file#$SRCTREE/}"
+                ;;
+            *)
+                rel_path="${apispec_file#$OBJTREE/}"
+                ;;
+        esac
+
+        # Skip if file is empty
+        if [ ! -s "$apispec_file" ]; then
+            continue
+        fi
+
+        # Generate include statement
+        # For includes from kernel/api/, we need to go up two levels
+        echo "#include \"../../${rel_path}\""
+    done
+
+# Close the ifdef
+cat <<EOF
+
+#endif /* CONFIG_KAPI_SPEC */
+EOF
+
diff --git a/scripts/kernel-doc.py b/scripts/kernel-doc.py
index fc3d46ef519f..286471622c6c 100755
--- a/scripts/kernel-doc.py
+++ b/scripts/kernel-doc.py
@@ -109,6 +109,7 @@ sys.path.insert(0, os.path.join(SRC_DIR, LIB_DIR))
 
 from kdoc_files import KernelFiles                      # pylint: disable=C0413
 from kdoc_output import RestFormat, ManFormat           # pylint: disable=C0413
+from kdoc_apispec import ApiSpecFormat                  # pylint: disable=C0413
 
 DESC = """
 Read C language source or header FILEs, extract embedded documentation comments,
@@ -225,6 +226,8 @@ def main():
                          help="Output reStructuredText format (default).")
     out_fmt.add_argument("-N", "-none", "--none", action="store_true",
                          help="Do not output documentation, only warnings.")
+    out_fmt.add_argument("-apispec", "--apispec", action="store_true",
+                         help="Output C macro invocations for kernel API specifications.")
 
     # Output selection mutually-exclusive group
 
@@ -285,6 +288,8 @@ def main():
         out_style = ManFormat(modulename=args.modulename)
     elif args.none:
         out_style = None
+    elif args.apispec:
+        out_style = ApiSpecFormat()
     else:
         out_style = RestFormat()
 
diff --git a/scripts/lib/kdoc/kdoc_apispec.py b/scripts/lib/kdoc/kdoc_apispec.py
new file mode 100644
index 000000000000..7ece6848c0af
--- /dev/null
+++ b/scripts/lib/kdoc/kdoc_apispec.py
@@ -0,0 +1,714 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Generate C macro invocations for kernel API specifications from kernel-doc comments.
+
+This module creates C header files with API specification macros that match
+the kernel API specification framework introduced in commit 9688de5c25bed.
+"""
+
+from kdoc_output import OutputFormat
+import re
+
+
+# Valid KAPI effect types
+VALID_EFFECT_TYPES = {
+    'KAPI_EFFECT_NONE', 'KAPI_EFFECT_MODIFY_STATE', 'KAPI_EFFECT_PROCESS_STATE',
+    'KAPI_EFFECT_IRREVERSIBLE', 'KAPI_EFFECT_SCHEDULE', 'KAPI_EFFECT_FILESYSTEM',
+    'KAPI_EFFECT_HARDWARE', 'KAPI_EFFECT_ALLOC_MEMORY', 'KAPI_EFFECT_FREE_MEMORY',
+    'KAPI_EFFECT_SIGNAL_SEND', 'KAPI_EFFECT_FILE_POSITION', 'KAPI_EFFECT_LOCK_ACQUIRE',
+    'KAPI_EFFECT_LOCK_RELEASE', 'KAPI_EFFECT_RESOURCE_CREATE', 'KAPI_EFFECT_RESOURCE_DESTROY',
+    'KAPI_EFFECT_NETWORK'
+}
+
+# C type to KAPI type mapping
+CTYPE_TO_KAPI = {
+    '__u32': 'KAPI_TYPE_UINT', '__u64': 'KAPI_TYPE_UINT',
+    'u32': 'KAPI_TYPE_UINT', 'u64': 'KAPI_TYPE_UINT',
+    '__s32': 'KAPI_TYPE_INT', '__s64': 'KAPI_TYPE_INT',
+    's32': 'KAPI_TYPE_INT', 's64': 'KAPI_TYPE_INT',
+}
+
+
+class ApiSpecFormat(OutputFormat):
+    """Generate C macro invocations for kernel API specifications"""
+
+    def __init__(self):
+        super().__init__()
+        self.api_specs = {}
+        self.header_written = False
+
+    def msg(self, fname, name, args):
+        """Handles a single entry from kernel-doc parser"""
+        if not self.header_written:
+            self.data = self._generate_header()
+            self.header_written = True
+        else:
+            self.data = ""
+
+        result = super().msg(fname, name, args)
+        return result if result else self.data
+
+    def _generate_header(self):
+        """Generate the file header"""
+        return (
+            "/* SPDX-License-Identifier: GPL-2.0 */\n"
+            "/* Auto-generated from kerneldoc annotations - DO NOT EDIT */\n\n"
+            "#include <linux/kernel_api_spec.h>\n"
+            "#include <linux/errno.h>\n\n"
+        )
+
+    def _format_macro_param(self, value):
+        """Format a value for use in C macro parameter"""
+        if value is None:
+            return '""'
+        value = str(value).replace('\\', '\\\\').replace('"', '\\"')
+        value = value.replace('\n', '\\n"\n\t\t       "')
+        return f'"{value}"'
+
+    def _get_section(self, sections, key):
+        """Get value from sections, checking with and without @ prefix"""
+        for prefix in ['', '@']:
+            full_key = prefix + key
+            if full_key in sections:
+                content = sections[full_key].strip()
+                # Return only first line to avoid mixing sections
+                return content.split('\n')[0].strip() if content else ''
+        return None
+
+    def _get_section_lines(self, sections, key):
+        """Get all lines from a section"""
+        for prefix in ['', '@']:
+            full_key = prefix + key
+            if full_key in sections:
+                return [line.strip() for line in sections[full_key].strip().split('\n') if line.strip()]
+        return []
+
+    def _parse_indented_items(self, section_content, item_parser):
+        """Generic parser for indented items.
+
+        Args:
+            section_content: Raw section content
+            item_parser: Function that takes (lines, start_index) and returns (item, next_index)
+
+        Returns:
+            List of parsed items
+        """
+        if not section_content:
+            return []
+
+        items = []
+        lines = section_content.strip().split('\n')
+        i = 0
+
+        while i < len(lines):
+            if not lines[i].strip():
+                i += 1
+                continue
+
+            # Check if this is a main item (not indented)
+            if not lines[i].startswith((' ', '\t')):
+                item, i = item_parser(lines, i)
+                if item:
+                    items.append(item)
+            else:
+                i += 1
+
+        return items
+
+    def _parse_subfields(self, lines, start_idx):
+        """Parse indented subfields starting from start_idx+1.
+
+        Returns: (dict of subfields, next index)
+        """
+        subfields = {}
+        i = start_idx + 1
+
+        while i < len(lines) and (lines[i].startswith((' ', '\t'))):
+            line = lines[i].strip()
+            if ':' in line:
+                key, value = line.split(':', 1)
+                subfields[key.strip()] = value.strip()
+            i += 1
+
+        return subfields, i
+
+    def _parse_signal_item(self, lines, i):
+        """Parse a single signal specification"""
+        signal = {'name': lines[i].strip()}
+        subfields, next_i = self._parse_subfields(lines, i)
+
+        # Map subfields to signal attributes
+        signal.update({
+            'direction': subfields.get('direction', 'KAPI_SIGNAL_RECEIVE'),
+            'action': subfields.get('action', 'KAPI_SIGNAL_ACTION_RETURN'),
+            'condition': subfields.get('condition'),
+            'desc': subfields.get('desc'),
+            'error': subfields.get('error'),
+            'timing': subfields.get('timing'),
+            'priority': subfields.get('priority'),
+            'interruptible': subfields.get('interruptible', '').lower() == 'yes',
+            'number': subfields.get('number', '0'),
+        })
+
+        return signal, next_i
+
+    def _parse_error_item(self, lines, i):
+        """Parse a single error specification"""
+        line = lines[i].strip()
+
+        # Skip desc: lines
+        if line.startswith('desc:'):
+            return None, i + 1
+
+        # Check for error pattern
+        if not re.match(r'^[A-Z][A-Z0-9_]+,', line):
+            return None, i + 1
+
+        error = {'line': line, 'desc': ''}
+
+        # Look for desc: continuation
+        i += 1
+        desc_lines = []
+        while i < len(lines):
+            next_line = lines[i].strip()
+            if next_line.startswith('desc:'):
+                desc_lines.append(next_line[5:].strip())
+                i += 1
+            elif not next_line or re.match(r'^[A-Z][A-Z0-9_]+,', next_line):
+                break
+            else:
+                desc_lines.append(next_line)
+                i += 1
+
+        if desc_lines:
+            error['desc'] = ' '.join(desc_lines)
+
+        return error, i
+
+    def _parse_lock_item(self, lines, i):
+        """Parse a single lock specification"""
+        line = lines[i].strip()
+        if ':' not in line:
+            return None, i + 1
+
+        parts = line.split(':', 1)[1].strip().split(',', 1)
+        if len(parts) < 2:
+            return None, i + 1
+
+        lock = {
+            'name': parts[0].strip(),
+            'type': parts[1].strip()
+        }
+
+        subfields, next_i = self._parse_subfields(lines, i)
+
+        # Map boolean fields
+        for field in ['acquired', 'released', 'held-on-entry', 'held-on-exit']:
+            if subfields.get(field, '').lower() == 'true':
+                lock[field] = True
+
+        lock['desc'] = subfields.get('desc', '')
+
+        return lock, next_i
+
+    def _parse_constraint_item(self, lines, i):
+        """Parse a single constraint specification"""
+        line = lines[i].strip()
+
+        # Check for old format with comma
+        if ',' in line:
+            parts = line.split(',', 1)
+            constraint = {
+                'name': parts[0].strip(),
+                'desc': parts[1].strip() if len(parts) > 1 else '',
+                'expr': None
+            }
+        else:
+            constraint = {'name': line, 'desc': '', 'expr': None}
+
+        subfields, next_i = self._parse_subfields(lines, i)
+
+        if 'desc' in subfields:
+            constraint['desc'] = (constraint['desc'] + ' ' + subfields['desc']).strip()
+        constraint['expr'] = subfields.get('expr')
+
+        return constraint, next_i
+
+    def _parse_side_effect_item(self, lines, i):
+        """Parse a single side effect specification"""
+        line = lines[i].strip()
+
+        # Default to new format
+        effect = {
+            'type': line,
+            'target': '',
+            'desc': '',
+            'condition': None,
+            'reversible': False
+        }
+
+        # Check for old format with commas
+        if ',' in line:
+            # Handle condition and reversible flags
+            cond_match = re.search(r',\s*condition=([^,]+?)(?:\s*,\s*reversible=(yes|no)\s*)?$', line)
+            if cond_match:
+                effect['condition'] = cond_match.group(1).strip()
+                effect['reversible'] = cond_match.group(2) == 'yes'
+                line = line[:cond_match.start()]
+            elif ', reversible=yes' in line:
+                effect['reversible'] = True
+                line = line.replace(', reversible=yes', '')
+            elif ', reversible=no' in line:
+                line = line.replace(', reversible=no', '')
+
+            parts = line.split(',', 2)
+            if len(parts) >= 1:
+                effect['type'] = parts[0].strip()
+            if len(parts) >= 2:
+                effect['target'] = parts[1].strip()
+            if len(parts) >= 3:
+                effect['desc'] = parts[2].strip()
+        else:
+            # Multi-line format with subfields
+            subfields, next_i = self._parse_subfields(lines, i)
+            effect.update({
+                'target': subfields.get('target', ''),
+                'desc': subfields.get('desc', ''),
+                'condition': subfields.get('condition'),
+                'reversible': subfields.get('reversible', '').lower() == 'yes'
+            })
+            return effect, next_i
+
+        return effect, i + 1
+
+    def _parse_state_trans_item(self, lines, i):
+        """Parse a single state transition specification"""
+        line = lines[i].strip()
+
+        trans = {
+            'target': line,
+            'from': '',
+            'to': '',
+            'condition': '',
+            'desc': ''
+        }
+
+        # Check for old format with commas
+        if ',' in line:
+            parts = line.split(',', 3)
+            if len(parts) >= 1:
+                trans['target'] = parts[0].strip()
+            if len(parts) >= 2:
+                trans['from'] = parts[1].strip()
+            if len(parts) >= 3:
+                trans['to'] = parts[2].strip()
+            if len(parts) >= 4:
+                desc_part = parts[3].strip()
+                desc_parts = desc_part.split(',', 1)
+                if len(desc_parts) > 1:
+                    trans['condition'] = desc_parts[0].strip()
+                    trans['desc'] = desc_parts[1].strip()
+                else:
+                    trans['desc'] = desc_part
+            return trans, i + 1
+        else:
+            # Multi-line format with subfields
+            subfields, next_i = self._parse_subfields(lines, i)
+            trans.update({
+                'from': subfields.get('from', ''),
+                'to': subfields.get('to', ''),
+                'condition': subfields.get('condition', ''),
+                'desc': subfields.get('desc', '')
+            })
+            return trans, next_i
+
+    def _process_parameters(self, sections, parameterlist, parameterdescs, parametertypes):
+        """Process and output parameter specifications"""
+        param_count = len(parameterlist)
+        if param_count > 0:
+            self.data += f"\n\tKAPI_PARAM_COUNT({param_count})\n"
+
+        for param_idx, param in enumerate(parameterlist):
+            param_name = param.strip()
+            param_desc = parameterdescs.get(param_name, '')
+            param_ctype = parametertypes.get(param_name, '')
+
+            # Parse parameter specifications
+            param_section = sections.get('param', sections.get('@param', ''))
+            param_specs = {}
+            if param_section:
+                param_specs = self._parse_param_spec(param_section, param_name)
+
+            self.data += f"\n\tKAPI_PARAM({param_idx}, {self._format_macro_param(param_name)}, "
+            self.data += f"{self._format_macro_param(param_ctype)}, {self._format_macro_param(param_desc)})\n"
+
+            # Add parameter attributes
+            for key, macro in [
+                ('param-type', 'KAPI_PARAM_TYPE'),
+                ('param-flags', 'KAPI_PARAM_FLAGS'),
+                ('param-alignment', 'KAPI_PARAM_ALIGNMENT'),
+            ]:
+                if key in param_specs:
+                    self.data += f"\t\t{macro}({param_specs[key]})\n"
+
+            # Handle constraint type
+            if 'param-constraint-type' in param_specs:
+                ctype = param_specs['param-constraint-type']
+                if ctype == 'KAPI_CONSTRAINT_BITMASK':
+                    ctype = 'KAPI_CONSTRAINT_MASK'
+                self.data += f"\t\tKAPI_PARAM_CONSTRAINT_TYPE({ctype})\n"
+
+            # Handle range
+            if 'param-range' in param_specs and ',' in param_specs['param-range']:
+                min_val, max_val = param_specs['param-range'].split(',', 1)
+                self.data += f"\t\tKAPI_PARAM_RANGE({min_val.strip()}, {max_val.strip()})\n"
+
+            # Handle mask
+            if 'param-mask' in param_specs:
+                self.data += f"\t\tKAPI_PARAM_VALID_MASK({param_specs['param-mask']})\n"
+
+            # Handle constraint description
+            if 'param-constraint' in param_specs:
+                self.data += f"\t\tKAPI_PARAM_CONSTRAINT({self._format_macro_param(param_specs['param-constraint'])})\n"
+
+            self.data += "\tKAPI_PARAM_END\n"
+
+    def _parse_param_spec(self, section_content, param_name):
+        """Parse parameter specifications from indented format"""
+        specs = {}
+        lines = section_content.strip().split('\n')
+        current_item = None
+
+        for i, line in enumerate(lines):
+            if not line.strip():
+                continue
+
+            # Check if this is our parameter
+            if not line.startswith((' ', '\t')):
+                parts = line.strip().split(',', 1)
+                current_item = param_name if parts[0].strip() == param_name else None
+                if current_item and len(parts) > 1:
+                    specs['param-type'] = parts[1].strip()
+            elif current_item == param_name:
+                # Parse subfield
+                line = line.strip()
+                if ':' in line:
+                    key, value = line.split(':', 1)
+                    key = key.strip()
+                    value = value.strip()
+
+                    # Map to expected keys
+                    field_map = {
+                        'flags': 'param-flags',
+                        'constraint-type': 'param-constraint-type',
+                        'constraint': 'param-constraint',
+                        'range': 'param-range',
+                        'mask': 'param-mask',
+                        'valid-mask': 'param-mask',
+                        'alignment': 'param-alignment',
+                        'struct-type': 'param-struct-type',
+                    }
+
+                    if key in field_map:
+                        specs[field_map[key]] = value
+
+        return specs
+
+    def _validate_effect_type(self, effect_type):
+        """Validate and normalize effect type"""
+        if 'KAPI_EFFECT_SCHEDULER' in effect_type:
+            return effect_type.replace('KAPI_EFFECT_SCHEDULER', 'KAPI_EFFECT_SCHEDULE')
+
+        if 'KAPI_EFFECT_' in effect_type and effect_type not in VALID_EFFECT_TYPES:
+            if '|' in effect_type:
+                parts = [p.strip() for p in effect_type.split('|')]
+                valid_parts = [p if p in VALID_EFFECT_TYPES else 'KAPI_EFFECT_MODIFY_STATE' for p in parts]
+                return ' | '.join(valid_parts)
+            return 'KAPI_EFFECT_MODIFY_STATE'
+
+        return effect_type
+
+    def _has_api_spec(self, sections):
+        """Check if this function has an API specification"""
+        indicators = [
+            'api-type', 'context-flags', 'param-type', 'error-code',
+            'capability', 'signal', 'lock', 'state-trans', 'constraint',
+            'return', 'error', 'side-effects', 'struct'
+        ]
+
+        count = sum(1 for ind in indicators
+                   if any(key.lower().startswith(ind.lower()) or
+                         key.lower().startswith('@' + ind.lower())
+                         for key in sections.keys()))
+
+        return count >= 2
+
+    def out_function(self, fname, name, args):
+        """Generate API spec for a function"""
+        function_name = args.get('function', name)
+        sections = args.sections if hasattr(args, 'sections') else args.get('sections', {})
+
+        if not self._has_api_spec(sections):
+            return
+
+        parameterlist = args.parameterlist if hasattr(args, 'parameterlist') else args.get('parameterlist', [])
+        parameterdescs = args.parameterdescs if hasattr(args, 'parameterdescs') else args.get('parameterdescs', {})
+        parametertypes = args.parametertypes if hasattr(args, 'parametertypes') else args.get('parametertypes', {})
+        purpose = args.get('purpose', '')
+
+        # Start macro invocation
+        self.data += f"DEFINE_KERNEL_API_SPEC({function_name})\n"
+
+        # Basic info
+        if purpose:
+            self.data += f"\tKAPI_DESCRIPTION({self._format_macro_param(purpose)})\n"
+
+        long_desc = self._get_section(sections, 'long-desc')
+        if long_desc:
+            self.data += f"\tKAPI_LONG_DESC({self._format_macro_param(long_desc)})\n"
+
+        # Context flags
+        context = self._get_section(sections, 'context-flags') or self._get_section(sections, 'context')
+        if context:
+            self.data += f"\tKAPI_CONTEXT({context})\n"
+
+        # Process parameters
+        self._process_parameters(sections, parameterlist, parameterdescs, parametertypes)
+
+        # Process errors
+        errors = self._parse_indented_items(
+            sections.get('error', sections.get('@error', '')),
+            self._parse_error_item
+        )
+
+        if errors:
+            self.data += f"\n\tKAPI_RETURN_ERROR_COUNT({len(errors)})\n"
+            self.data += f"\n\tKAPI_ERROR_COUNT({len(errors)})\n"
+
+            for idx, error in enumerate(errors):
+                self._output_error(idx, error)
+
+        # Process signals
+        signals = self._parse_indented_items(
+            sections.get('signal', sections.get('@signal', '')),
+            self._parse_signal_item
+        )
+
+        if signals:
+            self.data += f"\n\tKAPI_SIGNAL_COUNT({len(signals)})\n"
+
+            for idx, signal in enumerate(signals):
+                self._output_signal(idx, signal)
+
+        # Process other specifications
+        self._process_locks(sections)
+        self._process_constraints(sections)
+        self._process_side_effects(sections)
+        self._process_state_transitions(sections)
+        self._process_capabilities(sections)
+
+        # Add examples and notes
+        for key, macro in [('examples', 'KAPI_EXAMPLES'), ('notes', 'KAPI_NOTES')]:
+            value = self._get_section(sections, key)
+            if value:
+                self.data += f"\n\t{macro}({self._format_macro_param(value)})\n"
+
+        self.data += "\nKAPI_END_SPEC;\n\n"
+
+    def _output_error(self, idx, error):
+        """Output a single error specification"""
+        line = error['line']
+        if line.startswith('-'):
+            line = line[1:].strip()
+
+        parts = line.split(',', 2)
+        if len(parts) == 2:
+            # Format: NAME, description
+            name = parts[0].strip()
+            short_desc = parts[1].strip()
+            code = f"-{name}"
+        elif len(parts) >= 3:
+            # Format: code, name, description
+            code = parts[0].strip()
+            name = parts[1].strip()
+            short_desc = parts[2].strip()
+            if not code.startswith('-'):
+                code = f"-{code}"
+        else:
+            return
+
+        long_desc = error.get('desc', '') or short_desc
+
+        self.data += f"\n\tKAPI_ERROR({idx}, {code}, {self._format_macro_param(name)}, "
+        self.data += f"{self._format_macro_param(short_desc)},\n\t\t   {self._format_macro_param(long_desc)})\n"
+
+    def _output_signal(self, idx, signal):
+        """Output a single signal specification"""
+        self.data += f"\n\tKAPI_SIGNAL({idx}, {signal['number']}, "
+        self.data += f"{self._format_macro_param(signal['name'])}, "
+        self.data += f"{signal['direction']}, {signal['action']})\n"
+
+        for key, macro in [
+            ('condition', 'KAPI_SIGNAL_CONDITION'),
+            ('desc', 'KAPI_SIGNAL_DESC'),
+            ('error', 'KAPI_SIGNAL_ERROR'),
+            ('timing', 'KAPI_SIGNAL_TIMING'),
+            ('priority', 'KAPI_SIGNAL_PRIORITY'),
+        ]:
+            if signal.get(key):
+                # Priority field is numeric
+                if key == 'priority':
+                    self.data += f"\t\t{macro}({signal[key]})\n"
+                else:
+                    self.data += f"\t\t{macro}({self._format_macro_param(signal[key])})\n"
+
+        if signal.get('interruptible'):
+            self.data += "\t\tKAPI_SIGNAL_INTERRUPTIBLE\n"
+
+        self.data += "\tKAPI_SIGNAL_END\n"
+
+    def _process_locks(self, sections):
+        """Process lock specifications"""
+        locks = self._parse_indented_items(
+            sections.get('lock', sections.get('@lock', '')),
+            self._parse_lock_item
+        )
+
+        if locks:
+            self.data += f"\n\tKAPI_LOCK_COUNT({len(locks)})\n"
+
+            for idx, lock in enumerate(locks):
+                self.data += f"\n\tKAPI_LOCK({idx}, {self._format_macro_param(lock['name'])}, {lock['type']})\n"
+
+                for flag in ['acquired', 'released']:
+                    if lock.get(flag):
+                        self.data += f"\t\tKAPI_LOCK_{flag.upper()}\n"
+
+                if lock.get('desc'):
+                    self.data += f"\t\tKAPI_LOCK_DESC({self._format_macro_param(lock['desc'])})\n"
+
+                self.data += "\tKAPI_LOCK_END\n"
+
+    def _process_constraints(self, sections):
+        """Process constraint specifications"""
+        constraints = self._parse_indented_items(
+            sections.get('constraint', sections.get('@constraint', '')),
+            self._parse_constraint_item
+        )
+
+        if constraints:
+            self.data += f"\n\tKAPI_CONSTRAINT_COUNT({len(constraints)})\n"
+
+            for idx, constraint in enumerate(constraints):
+                self.data += f"\n\tKAPI_CONSTRAINT({idx}, {self._format_macro_param(constraint['name'])},\n"
+                self.data += f"\t\t\t{self._format_macro_param(constraint['desc'])})\n"
+
+                if constraint.get('expr'):
+                    self.data += f"\t\tKAPI_CONSTRAINT_EXPR({self._format_macro_param(constraint['expr'])})\n"
+
+                self.data += "\tKAPI_CONSTRAINT_END\n"
+
+    def _process_side_effects(self, sections):
+        """Process side effect specifications"""
+        effects = self._parse_indented_items(
+            sections.get('side-effect', sections.get('@side-effect', '')),
+            self._parse_side_effect_item
+        )
+
+        if effects:
+            self.data += f"\n\tKAPI_SIDE_EFFECT_COUNT({len(effects)})\n"
+
+            for idx, effect in enumerate(effects):
+                effect_type = self._validate_effect_type(effect['type'])
+
+                self.data += f"\n\tKAPI_SIDE_EFFECT({idx}, {effect_type},\n"
+                self.data += f"\t\t\t {self._format_macro_param(effect['target'])},\n"
+                self.data += f"\t\t\t {self._format_macro_param(effect['desc'])})\n"
+
+                if effect.get('condition'):
+                    self.data += f"\t\tKAPI_EFFECT_CONDITION({self._format_macro_param(effect['condition'])})\n"
+
+                if effect.get('reversible'):
+                    self.data += "\t\tKAPI_EFFECT_REVERSIBLE\n"
+
+                self.data += "\tKAPI_SIDE_EFFECT_END\n"
+
+    def _process_state_transitions(self, sections):
+        """Process state transition specifications"""
+        transitions = self._parse_indented_items(
+            sections.get('state-trans', sections.get('@state-trans', '')),
+            self._parse_state_trans_item
+        )
+
+        if transitions:
+            self.data += f"\n\tKAPI_STATE_TRANS_COUNT({len(transitions)})\n"
+
+            for idx, trans in enumerate(transitions):
+                desc = trans['desc']
+                if trans.get('condition'):
+                    desc = trans['condition'] + (', ' + desc if desc else '')
+
+                self.data += f"\n\tKAPI_STATE_TRANS({idx}, {self._format_macro_param(trans['target'])}, "
+                self.data += f"{self._format_macro_param(trans['from'])}, {self._format_macro_param(trans['to'])},\n"
+                self.data += f"\t\t\t {self._format_macro_param(desc)})\n"
+                self.data += "\tKAPI_STATE_TRANS_END\n"
+
+    def _process_capabilities(self, sections):
+        """Process capability specifications"""
+        cap_section = sections.get('capability', sections.get('@capability', ''))
+        if not cap_section:
+            return
+
+        lines = cap_section.strip().split('\n')
+        capabilities = []
+        i = 0
+
+        while i < len(lines):
+            line = lines[i].strip()
+            if not line or line.startswith(('allows:', 'without:', 'condition:', 'priority:')):
+                i += 1
+                continue
+
+            cap_info = {'line': line}
+
+            # Parse subfields
+            subfields, next_i = self._parse_subfields(lines, i)
+            cap_info.update(subfields)
+            capabilities.append(cap_info)
+            i = next_i
+
+        if capabilities:
+            self.data += f"\n\tKAPI_CAPABILITY_COUNT({len(capabilities)})\n"
+
+            for idx, cap in enumerate(capabilities):
+                parts = cap['line'].split(',', 2)
+                if len(parts) >= 2:
+                    cap_name = parts[0].strip()
+                    cap_type = parts[1].strip()
+                    cap_desc = parts[2].strip() if len(parts) > 2 else cap_name
+
+                    # Fix common type issues
+                    if 'BYPASS' in cap_type and cap_type != 'KAPI_CAP_BYPASS_CHECK':
+                        cap_type = 'KAPI_CAP_BYPASS_CHECK'
+
+                    self.data += f"\n\tKAPI_CAPABILITY({idx}, {cap_name}, {self._format_macro_param(cap_desc)}, {cap_type})\n"
+
+                    for key, macro in [
+                        ('allows', 'KAPI_CAP_ALLOWS'),
+                        ('without', 'KAPI_CAP_WITHOUT'),
+                        ('condition', 'KAPI_CAP_CONDITION'),
+                        ('priority', 'KAPI_CAP_PRIORITY'),
+                    ]:
+                        if cap.get(key):
+                            value = self._format_macro_param(cap[key]) if key != 'priority' else cap[key]
+                            self.data += f"\t\t{macro}({value})\n"
+
+                    self.data += "\tKAPI_CAPABILITY_END\n"
+
+    # Skip output methods for non-function types
+    def out_enum(self, fname, name, args): pass
+    def out_typedef(self, fname, name, args): pass
+    def out_struct(self, fname, name, args): pass
+    def out_doc(self, fname, name, args): pass
\ No newline at end of file
diff --git a/scripts/lib/kdoc/kdoc_output.py b/scripts/lib/kdoc/kdoc_output.py
index ea8914537ba0..9ac8e5beddaf 100644
--- a/scripts/lib/kdoc/kdoc_output.py
+++ b/scripts/lib/kdoc/kdoc_output.py
@@ -124,8 +124,13 @@ class OutputFormat:
         Output warnings for identifiers that will be displayed.
         """
 
-        for log_msg in args.warnings:
-            self.config.warning(log_msg)
+        warnings = args.get('warnings', [])
+
+        for log_msg in warnings:
+            # Skip numeric warnings (line numbers) which are false positives
+            # from parameter-specific sections like "param-constraint: name, value"
+            if not isinstance(log_msg, int):
+                self.config.warning(log_msg)
 
     def check_doc(self, name, args):
         """Check if DOC should be output"""
diff --git a/scripts/lib/kdoc/kdoc_parser.py b/scripts/lib/kdoc/kdoc_parser.py
index fe730099eca8..55679224d578 100644
--- a/scripts/lib/kdoc/kdoc_parser.py
+++ b/scripts/lib/kdoc/kdoc_parser.py
@@ -46,7 +46,21 @@ doc_decl = doc_com + KernRe(r'(\w+)', cache=False)
 known_section_names = 'description|context|returns?|notes?|examples?'
 known_sections = KernRe(known_section_names, flags = re.I)
 doc_sect = doc_com + \
-    KernRe(r'\s*(\@[.\w]+|\@\.\.\.|' + known_section_names + r')\s*:([^:].*)?$',
+    KernRe(r'\s*(\@[.\w\-]+|\@\.\.\.|' + known_section_names + r'|' +
+           r'@?api-type|@?api-version|@?param(?!-)|' +
+           r'@?struct(?!-)|@?struct-type|@?struct-field|@?struct-field-[a-z\-]+|' +
+           r'@?validation-group|@?validation-policy|@?validation-flag|@?validation-rule|' +
+           r'@?error(?!-)|@?error-code|@?error-condition|@?error-count|' +
+           r'@?capability(?!-)|' +
+           r'@?capability-count|' +
+           r'@?signal(?!-)|@?signal-count|' +
+           r'@?lock(?!-)|@?lock-count|' +
+           r'@?since|@?since-version|' +
+           r'@?context-flags|@?return(?!-)|@?return-type|@?return-check|@?return-check-type|@?return-success|@?return-desc|' +
+           r'@?long-desc|@?constraint(?!-)|@?constraint-count|' +
+           r'@?side-effect(?!-)|@?side-effect-count|' +
+           r'@?state-trans(?!-)|@?state-trans-count|' +
+           r'@?param-count|@?kapi-.*)\s*:([^:].*)?$',
            flags=re.I, cache=False)
 
 doc_content = doc_com_body + KernRe(r'(.*)', cache=False)
@@ -183,7 +197,39 @@ class KernelEntry:
         name = self.section
         contents = self.contents()
 
-        if type_param.match(name):
+        # Check if this is an API specification section
+        # These should always be treated as sections, not parameters
+        api_sections = {
+            'api-type', 'api-version', 'param-type', 'param-flags', 'param-constraint',
+            'param-range', 'param-mask', 'param-constraint-type', 'param-size',
+            'param-alignment', 'param-enum', 'param-validate', 'param-size-param',
+            'param-size-multiplier', 'struct-type', 'struct-field', 'struct-field-range',
+            'struct-field-enum', 'struct-field-mask', 'struct-field-policy',
+            'struct-field-version', 'struct-field-flag', 'struct-field-validate',
+            'validation-group', 'validation-policy', 'validation-flag', 'validation-rule',
+            'error', 'error-code', 'error-condition', 'error-count',
+            'capability', 'capability-allows', 'capability-without', 'capability-condition',
+            'capability-priority', 'capability-count', 'signal', 'signal-direction',
+            'signal-action', 'signal-condition', 'signal-desc', 'signal-error',
+            'signal-timing', 'signal-priority', 'signal-interruptible', 'signal-state-req',
+            'signal-restartable', 'signal-count', 'lock', 'lock-type', 'lock-acquired', 'lock-released',
+            'lock-desc', 'lock-count', 'since', 'since-version', 'context-flags',
+            'return', 'return-type', 'return-check', 'return-check-type', 'return-success', 'return-desc',
+            'long-desc', 'constraint', 'constraint-expr', 'constraint-count',
+            'side-effect', 'side-effect-type', 'side-effect-desc', 'side-effect-condition',
+            'side-effect-reversible', 'side-effect-count', 'state-trans',
+            'state-trans-desc', 'state-trans-count', 'param-count',
+            # Also include notes and examples which can appear with or without @
+            'notes', 'note', 'examples', 'example'
+        }
+
+        # Check if name starts with @ and matches kapi-.* pattern
+        is_api_section = (name.lower() in api_sections or
+                         (name.startswith('@') and name[1:].lower() in api_sections) or
+                         (name.lower().startswith('kapi-')) or
+                         (name.lower().startswith('@kapi-')))
+
+        if not is_api_section and type_param.match(name):
             name = type_param.group(1)
 
             self.parameterdescs[name] = contents
-- 
2.50.1


^ permalink raw reply related

* [RFC PATCH v4 1/7] kernel/api: introduce kernel API specification framework
From: Sasha Levin @ 2025-08-25 18:14 UTC (permalink / raw)
  To: linux-api, linux-doc, linux-kernel, tools; +Cc: Sasha Levin
In-Reply-To: <20250825181434.3340805-1-sashal@kernel.org>

Add a comprehensive framework for formally documenting kernel APIs with
inline specifications. This framework provides:

- Structured API documentation with parameter specifications, return
  values, error conditions, and execution context requirements
- Runtime validation capabilities for debugging (CONFIG_KAPI_RUNTIME_CHECKS)
- Export of specifications via debugfs for tooling integration
- Support for both internal kernel APIs and system calls

The framework stores specifications in a dedicated ELF section and
provides infrastructure for:
- Compile-time validation of specifications
- Runtime querying of API documentation
- Machine-readable export formats
- Integration with existing SYSCALL_DEFINE macros

This commit introduces the core infrastructure without modifying any
existing APIs. Subsequent patches will add specifications to individual
subsystems.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .gitignore                                    |    1 +
 Documentation/admin-guide/kernel-api-spec.rst |  507 ++++++
 MAINTAINERS                                   |    9 +
 arch/um/kernel/dyn.lds.S                      |    3 +
 arch/um/kernel/uml.lds.S                      |    3 +
 arch/x86/kernel/vmlinux.lds.S                 |    3 +
 include/asm-generic/vmlinux.lds.h             |   20 +
 include/linux/kernel_api_spec.h               | 1559 +++++++++++++++++
 include/linux/syscall_api_spec.h              |  125 ++
 include/linux/syscalls.h                      |   38 +
 init/Kconfig                                  |    2 +
 kernel/Makefile                               |    1 +
 kernel/api/Kconfig                            |   35 +
 kernel/api/Makefile                           |    7 +
 kernel/api/kernel_api_spec.c                  | 1155 ++++++++++++
 15 files changed, 3468 insertions(+)
 create mode 100644 Documentation/admin-guide/kernel-api-spec.rst
 create mode 100644 include/linux/kernel_api_spec.h
 create mode 100644 include/linux/syscall_api_spec.h
 create mode 100644 kernel/api/Kconfig
 create mode 100644 kernel/api/Makefile
 create mode 100644 kernel/api/kernel_api_spec.c

diff --git a/.gitignore b/.gitignore
index 929054df5212..e5aaa26d1b68 100644
--- a/.gitignore
+++ b/.gitignore
@@ -12,6 +12,7 @@
 #
 .*
 *.a
+*.apispec.h
 *.asn1.[ch]
 *.bin
 *.bz2
diff --git a/Documentation/admin-guide/kernel-api-spec.rst b/Documentation/admin-guide/kernel-api-spec.rst
new file mode 100644
index 000000000000..3a63f6711e27
--- /dev/null
+++ b/Documentation/admin-guide/kernel-api-spec.rst
@@ -0,0 +1,507 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+======================================
+Kernel API Specification Framework
+======================================
+
+:Author: Sasha Levin <sashal@kernel.org>
+:Date: June 2025
+
+.. contents:: Table of Contents
+   :depth: 3
+   :local:
+
+Introduction
+============
+
+The Kernel API Specification Framework (KAPI) provides a comprehensive system for
+formally documenting, validating, and introspecting kernel APIs. This framework
+addresses the long-standing challenge of maintaining accurate, machine-readable
+documentation for the thousands of internal kernel APIs and system calls.
+
+Purpose and Goals
+-----------------
+
+The framework aims to:
+
+1. **Improve API Documentation**: Provide structured, inline documentation that
+   lives alongside the code and is maintained as part of the development process.
+
+2. **Enable Runtime Validation**: Optionally validate API usage at runtime to catch
+   common programming errors during development and testing.
+
+3. **Support Tooling**: Export API specifications in machine-readable formats for
+   use by static analyzers, documentation generators, and development tools.
+
+4. **Enhance Debugging**: Provide detailed API information at runtime through debugfs
+   for debugging and introspection.
+
+5. **Formalize Contracts**: Explicitly document API contracts including parameter
+   constraints, execution contexts, locking requirements, and side effects.
+
+Architecture Overview
+=====================
+
+Components
+----------
+
+The framework consists of several key components:
+
+1. **Core Framework** (``kernel/api/kernel_api_spec.c``)
+
+   - API specification registration and storage
+   - Runtime validation engine
+   - Specification lookup and querying
+
+2. **DebugFS Interface** (``kernel/api/kapi_debugfs.c``)
+
+   - Runtime introspection via ``/sys/kernel/debug/kapi/``
+   - JSON and XML export formats
+   - Per-API detailed information
+
+3. **IOCTL Support** (``kernel/api/ioctl_validation.c``)
+
+   - Extended framework for IOCTL specifications
+   - Automatic validation wrappers
+   - Structure field validation
+
+4. **Specification Macros** (``include/linux/kernel_api_spec.h``)
+
+   - Declarative macros for API documentation
+   - Type-safe parameter specifications
+   - Context and constraint definitions
+
+Data Model
+----------
+
+The framework uses a hierarchical data model::
+
+    kernel_api_spec
+    ├── Basic Information
+    │   ├── name (API function name)
+    │   ├── version (specification version)
+    │   ├── description (human-readable description)
+    │   └── kernel_version (when API was introduced)
+    │
+    ├── Parameters (up to 16)
+    │   └── kapi_param_spec
+    │       ├── name
+    │       ├── type (int, pointer, string, etc.)
+    │       ├── direction (in, out, inout)
+    │       ├── constraints (range, mask, enum values)
+    │       └── validation rules
+    │
+    ├── Return Value
+    │   └── kapi_return_spec
+    │       ├── type
+    │       ├── success conditions
+    │       └── validation rules
+    │
+    ├── Error Conditions (up to 32)
+    │   └── kapi_error_spec
+    │       ├── error code
+    │       ├── condition description
+    │       └── recovery advice
+    │
+    ├── Execution Context
+    │   ├── allowed contexts (process, interrupt, etc.)
+    │   ├── locking requirements
+    │   └── preemption/interrupt state
+    │
+    └── Side Effects
+        ├── memory allocation
+        ├── state changes
+        └── signal handling
+
+Usage Guide
+===========
+
+Basic API Specification
+-----------------------
+
+To document a kernel API, use the specification macros in the implementation file:
+
+.. code-block:: c
+
+    #include <linux/kernel_api_spec.h>
+
+    KAPI_DEFINE_SPEC(kmalloc_spec, kmalloc, "3.0")
+    KAPI_DESCRIPTION("Allocate kernel memory")
+    KAPI_PARAM(0, size, KAPI_TYPE_SIZE_T, KAPI_DIR_IN,
+               "Number of bytes to allocate")
+    KAPI_PARAM_RANGE(0, 0, KMALLOC_MAX_SIZE)
+    KAPI_PARAM(1, flags, KAPI_TYPE_FLAGS, KAPI_DIR_IN,
+               "Allocation flags (GFP_*)")
+    KAPI_PARAM_MASK(1, __GFP_BITS_MASK)
+    KAPI_RETURN(KAPI_TYPE_POINTER, "Pointer to allocated memory or NULL")
+    KAPI_ERROR(ENOMEM, "Out of memory")
+    KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SOFTIRQ | KAPI_CTX_HARDIRQ)
+    KAPI_SIDE_EFFECT("Allocates memory from kernel heap")
+    KAPI_LOCK_NOT_REQUIRED("Any lock")
+    KAPI_END_SPEC
+
+    void *kmalloc(size_t size, gfp_t flags)
+    {
+        /* Implementation */
+    }
+
+System Call Specification
+-------------------------
+
+System calls use specialized macros:
+
+.. code-block:: c
+
+    KAPI_DEFINE_SYSCALL_SPEC(open_spec, open, "1.0")
+    KAPI_DESCRIPTION("Open a file")
+    KAPI_PARAM(0, pathname, KAPI_TYPE_USER_STRING, KAPI_DIR_IN,
+               "Path to file")
+    KAPI_PARAM_PATH(0, PATH_MAX)
+    KAPI_PARAM(1, flags, KAPI_TYPE_FLAGS, KAPI_DIR_IN,
+               "Open flags (O_*)")
+    KAPI_PARAM(2, mode, KAPI_TYPE_MODE_T, KAPI_DIR_IN,
+               "File permissions (if creating)")
+    KAPI_RETURN(KAPI_TYPE_INT, "File descriptor or -1")
+    KAPI_ERROR(EACCES, "Permission denied")
+    KAPI_ERROR(ENOENT, "File does not exist")
+    KAPI_ERROR(EMFILE, "Too many open files")
+    KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)
+    KAPI_SIGNAL(EINTR, "Open can be interrupted by signal")
+    KAPI_END_SYSCALL_SPEC
+
+IOCTL Specification
+-------------------
+
+IOCTLs have extended support for structure validation:
+
+.. code-block:: c
+
+    KAPI_DEFINE_IOCTL_SPEC(vidioc_querycap_spec, VIDIOC_QUERYCAP,
+                           "VIDIOC_QUERYCAP",
+                           sizeof(struct v4l2_capability),
+                           sizeof(struct v4l2_capability),
+                           "video_fops")
+    KAPI_DESCRIPTION("Query device capabilities")
+    KAPI_IOCTL_FIELD(driver, KAPI_TYPE_CHAR_ARRAY, KAPI_DIR_OUT,
+                     "Driver name", 16)
+    KAPI_IOCTL_FIELD(card, KAPI_TYPE_CHAR_ARRAY, KAPI_DIR_OUT,
+                     "Device name", 32)
+    KAPI_IOCTL_FIELD(version, KAPI_TYPE_U32, KAPI_DIR_OUT,
+                     "Driver version")
+    KAPI_IOCTL_FIELD(capabilities, KAPI_TYPE_FLAGS, KAPI_DIR_OUT,
+                     "Device capabilities")
+    KAPI_END_IOCTL_SPEC
+
+Runtime Validation
+==================
+
+Enabling Validation
+-------------------
+
+Runtime validation is controlled by kernel configuration:
+
+1. Enable ``CONFIG_KAPI_SPEC`` to build the framework
+2. Enable ``CONFIG_KAPI_RUNTIME_CHECKS`` for runtime validation
+3. Optionally enable ``CONFIG_KAPI_SPEC_DEBUGFS`` for debugfs interface
+
+Validation Modes
+----------------
+
+The framework supports several validation modes:
+
+.. code-block:: c
+
+    /* Enable validation for specific API */
+    kapi_enable_validation("kmalloc");
+
+    /* Enable validation for all APIs */
+    kapi_enable_all_validation();
+
+    /* Set validation level */
+    kapi_set_validation_level(KAPI_VALIDATE_FULL);
+
+Validation Levels:
+
+- ``KAPI_VALIDATE_NONE``: No validation
+- ``KAPI_VALIDATE_BASIC``: Type and NULL checks only
+- ``KAPI_VALIDATE_NORMAL``: Basic + range and constraint checks
+- ``KAPI_VALIDATE_FULL``: All checks including custom validators
+
+Custom Validators
+-----------------
+
+APIs can register custom validation functions:
+
+.. code-block:: c
+
+    static bool validate_buffer_size(const struct kapi_param_spec *spec,
+                                     const void *value, void *context)
+    {
+        size_t size = *(size_t *)value;
+        struct my_context *ctx = context;
+
+        return size > 0 && size <= ctx->max_buffer_size;
+    }
+
+    KAPI_PARAM_CUSTOM_VALIDATOR(0, validate_buffer_size)
+
+DebugFS Interface
+=================
+
+The debugfs interface provides runtime access to API specifications:
+
+Directory Structure
+-------------------
+
+::
+
+    /sys/kernel/debug/kapi/
+    ├── apis/                    # All registered APIs
+    │   ├── kmalloc/
+    │   │   ├── specification   # Human-readable spec
+    │   │   ├── json           # JSON format
+    │   │   └── xml            # XML format
+    │   └── open/
+    │       └── ...
+    ├── summary                  # Overview of all APIs
+    ├── validation/              # Validation controls
+    │   ├── enabled             # Global enable/disable
+    │   ├── level               # Validation level
+    │   └── stats               # Validation statistics
+    └── export/                  # Bulk export options
+        ├── all.json            # All specs in JSON
+        └── all.xml             # All specs in XML
+
+Usage Examples
+--------------
+
+Query specific API::
+
+    $ cat /sys/kernel/debug/kapi/apis/kmalloc/specification
+    API: kmalloc
+    Version: 3.0
+    Description: Allocate kernel memory
+
+    Parameters:
+      [0] size (size_t, in): Number of bytes to allocate
+          Range: 0 - 4194304
+      [1] flags (flags, in): Allocation flags (GFP_*)
+          Mask: 0x1ffffff
+
+    Returns: pointer - Pointer to allocated memory or NULL
+
+    Errors:
+      ENOMEM: Out of memory
+
+    Context: process, softirq, hardirq
+
+    Side Effects:
+      - Allocates memory from kernel heap
+
+Export all specifications::
+
+    $ cat /sys/kernel/debug/kapi/export/all.json > kernel-apis.json
+
+Enable validation for specific API::
+
+    $ echo 1 > /sys/kernel/debug/kapi/apis/kmalloc/validate
+
+Performance Considerations
+==========================
+
+Memory Overhead
+---------------
+
+Each API specification consumes approximately 2-4KB of memory. With thousands
+of kernel APIs, this can add up to several megabytes. Consider:
+
+1. Building with ``CONFIG_KAPI_SPEC=n`` for production kernels
+2. Using ``__init`` annotations for APIs only used during boot
+3. Implementing lazy loading for rarely used specifications
+
+Runtime Overhead
+----------------
+
+When ``CONFIG_KAPI_RUNTIME_CHECKS`` is enabled:
+
+- Each validated API call adds 50-200ns overhead
+- Complex validations (custom validators) may add more
+- Use validation only in development/testing kernels
+
+Optimization Strategies
+-----------------------
+
+1. **Compile-time optimization**: When validation is disabled, all
+   validation code is optimized away by the compiler.
+
+2. **Selective validation**: Enable validation only for specific APIs
+   or subsystems under test.
+
+3. **Caching**: The framework caches validation results for repeated
+   calls with identical parameters.
+
+Documentation Generation
+------------------------
+
+The framework exports specifications via debugfs that can be used
+to generate documentation. Tools for automatic documentation generation
+from specifications are planned for future development.
+
+IDE Integration
+---------------
+
+Modern IDEs can use the JSON export for:
+
+- Parameter hints
+- Type checking
+- Context validation
+- Error code documentation
+
+Testing Framework
+-----------------
+
+The framework includes test helpers::
+
+    #ifdef CONFIG_KAPI_TESTING
+    /* Verify API behaves according to specification */
+    kapi_test_api("kmalloc", test_cases);
+    #endif
+
+Best Practices
+==============
+
+Writing Specifications
+----------------------
+
+1. **Be Comprehensive**: Document all parameters, errors, and side effects
+2. **Keep Updated**: Update specs when API behavior changes
+3. **Use Examples**: Include usage examples in descriptions
+4. **Validate Constraints**: Define realistic constraints for parameters
+5. **Document Context**: Clearly specify allowed execution contexts
+
+Maintenance
+-----------
+
+1. **Version Specifications**: Increment version when API changes
+2. **Deprecation**: Mark deprecated APIs and suggest replacements
+3. **Cross-reference**: Link related APIs in descriptions
+4. **Test Specifications**: Verify specs match implementation
+
+Common Patterns
+---------------
+
+**Optional Parameters**::
+
+    KAPI_PARAM(2, optional_arg, KAPI_TYPE_POINTER, KAPI_DIR_IN,
+               "Optional argument (may be NULL)")
+    KAPI_PARAM_OPTIONAL(2)
+
+**Variable Arguments**::
+
+    KAPI_PARAM(1, fmt, KAPI_TYPE_FORMAT_STRING, KAPI_DIR_IN,
+               "Printf-style format string")
+    KAPI_PARAM_VARIADIC(2, "Format arguments")
+
+**Callback Functions**::
+
+    KAPI_PARAM(1, callback, KAPI_TYPE_FUNCTION_PTR, KAPI_DIR_IN,
+               "Callback function")
+    KAPI_PARAM_CALLBACK(1, "int (*)(void *data)", "data")
+
+Troubleshooting
+===============
+
+Common Issues
+-------------
+
+**Specification Not Found**::
+
+    kernel: KAPI: Specification for 'my_api' not found
+
+    Solution: Ensure KAPI_DEFINE_SPEC is in the same translation unit
+    as the function implementation.
+
+**Validation Failures**::
+
+    kernel: KAPI: Validation failed for kmalloc parameter 'size':
+            value 5242880 exceeds maximum 4194304
+
+    Solution: Check parameter constraints or adjust specification if
+    the constraint is incorrect.
+
+**Build Errors**::
+
+    error: 'KAPI_TYPE_UNKNOWN' undeclared
+
+    Solution: Include <linux/kernel_api_spec.h> and ensure
+    CONFIG_KAPI_SPEC is enabled.
+
+Debug Options
+-------------
+
+Enable verbose debugging::
+
+    echo 8 > /proc/sys/kernel/printk
+    echo 1 > /sys/kernel/debug/kapi/debug/verbose
+
+Future Directions
+=================
+
+Planned Features
+----------------
+
+1. **Automatic Extraction**: Tool to extract specifications from existing
+   kernel-doc comments
+
+2. **Contract Verification**: Static analysis to verify implementation
+   matches specification
+
+3. **Performance Profiling**: Measure actual API performance against
+   documented expectations
+
+4. **Fuzzing Integration**: Use specifications to guide intelligent
+   fuzzing of kernel APIs
+
+5. **Version Compatibility**: Track API changes across kernel versions
+
+Research Areas
+--------------
+
+1. **Formal Verification**: Use specifications for mathematical proofs
+   of correctness
+
+2. **Runtime Monitoring**: Detect specification violations in production
+   with minimal overhead
+
+3. **API Evolution**: Analyze how kernel APIs change over time
+
+4. **Security Applications**: Use specifications for security policy
+   enforcement
+
+Contributing
+============
+
+Submitting Specifications
+-------------------------
+
+1. Add specifications to the same file as the API implementation
+2. Follow existing patterns and naming conventions
+3. Test with CONFIG_KAPI_RUNTIME_CHECKS enabled
+4. Verify debugfs output is correct
+5. Run scripts/checkpatch.pl on your changes
+
+Review Criteria
+---------------
+
+Specifications will be reviewed for:
+
+1. **Completeness**: All parameters and errors documented
+2. **Accuracy**: Specification matches implementation
+3. **Clarity**: Descriptions are clear and helpful
+4. **Consistency**: Follows framework conventions
+5. **Performance**: No unnecessary runtime overhead
+
+Contact
+-------
+
+- Maintainer: Sasha Levin <sashal@kernel.org>
diff --git a/MAINTAINERS b/MAINTAINERS
index fed6cd812d79..51c8ff70b8a1 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13244,6 +13244,15 @@ W:	https://linuxtv.org
 T:	git git://linuxtv.org/media.git
 F:	drivers/media/radio/radio-keene*
 
+KERNEL API SPECIFICATION FRAMEWORK (KAPI)
+M:	Sasha Levin <sashal@kernel.org>
+L:	linux-api@vger.kernel.org
+S:	Maintained
+F:	Documentation/admin-guide/kernel-api-spec.rst
+F:	include/linux/kernel_api_spec.h
+F:	kernel/api/
+F:	scripts/extract-kapi-spec.sh
+
 KERNEL AUTOMOUNTER
 M:	Ian Kent <raven@themaw.net>
 L:	autofs@vger.kernel.org
diff --git a/arch/um/kernel/dyn.lds.S b/arch/um/kernel/dyn.lds.S
index a36b7918a011..283ab11788d8 100644
--- a/arch/um/kernel/dyn.lds.S
+++ b/arch/um/kernel/dyn.lds.S
@@ -102,6 +102,9 @@ SECTIONS
   init.data : { INIT_DATA }
   __init_end = .;
 
+  /* Kernel API specifications in dedicated section */
+  KAPI_SPECS_SECTION()
+
   /* Ensure the __preinit_array_start label is properly aligned.  We
      could instead move the label definition inside the section, but
      the linker would then create the section even if it turns out to
diff --git a/arch/um/kernel/uml.lds.S b/arch/um/kernel/uml.lds.S
index a409d4b66114..e3850d829343 100644
--- a/arch/um/kernel/uml.lds.S
+++ b/arch/um/kernel/uml.lds.S
@@ -74,6 +74,9 @@ SECTIONS
   init.data : { INIT_DATA }
   __init_end = .;
 
+  /* Kernel API specifications in dedicated section */
+  KAPI_SPECS_SECTION()
+
   .data    :
   {
     INIT_TASK_DATA(KERNEL_STACK_SIZE)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 4fa0be732af1..8cc508adc9d5 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -173,6 +173,9 @@ SECTIONS
 	RO_DATA(PAGE_SIZE)
 	X86_ALIGN_RODATA_END
 
+	/* Kernel API specifications in dedicated section */
+	KAPI_SPECS_SECTION()
+
 	/* Data */
 	.data : AT(ADDR(.data) - LOAD_OFFSET) {
 		/* Start of data section */
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index ae2d2359b79e..93b6293c0259 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -279,6 +279,26 @@ defined(CONFIG_AUTOFDO_CLANG) || defined(CONFIG_PROPELLER_CLANG)
 #define TRACE_SYSCALLS()
 #endif
 
+#ifdef CONFIG_KAPI_SPEC
+#define KAPI_SPECS()				\
+	. = ALIGN(8);				\
+	__start_kapi_specs = .;			\
+	KEEP(*(.kapi_specs))			\
+	__stop_kapi_specs = .;
+
+/* For placing KAPI specs in a dedicated section */
+#define KAPI_SPECS_SECTION()			\
+	.kapi_specs : AT(ADDR(.kapi_specs) - LOAD_OFFSET) {	\
+		. = ALIGN(8);			\
+		__start_kapi_specs = .;		\
+		KEEP(*(.kapi_specs))		\
+		__stop_kapi_specs = .;		\
+	}
+#else
+#define KAPI_SPECS()
+#define KAPI_SPECS_SECTION()
+#endif
+
 #ifdef CONFIG_BPF_EVENTS
 #define BPF_RAW_TP() STRUCT_ALIGN();				\
 	BOUNDED_SECTION_BY(__bpf_raw_tp_map, __bpf_raw_tp)
diff --git a/include/linux/kernel_api_spec.h b/include/linux/kernel_api_spec.h
new file mode 100644
index 000000000000..163e3af8ca82
--- /dev/null
+++ b/include/linux/kernel_api_spec.h
@@ -0,0 +1,1559 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * kernel_api_spec.h - Kernel API Formal Specification Framework
+ *
+ * This framework provides structures and macros to formally specify kernel APIs
+ * in both human and machine-readable formats. It supports comprehensive documentation
+ * of function signatures, parameters, return values, error conditions, and constraints.
+ */
+
+#ifndef _LINUX_KERNEL_API_SPEC_H
+#define _LINUX_KERNEL_API_SPEC_H
+
+#include <linux/types.h>
+#include <linux/stringify.h>
+#include <linux/compiler.h>
+#include <linux/errno.h>
+
+struct sigaction;
+
+#define KAPI_MAX_PARAMS		16
+#define KAPI_MAX_ERRORS		32
+#define KAPI_MAX_CONSTRAINTS	32
+#define KAPI_MAX_SIGNALS	32
+#define KAPI_MAX_NAME_LEN	128
+#define KAPI_MAX_DESC_LEN	512
+#define KAPI_MAX_CAPABILITIES	8
+#define KAPI_MAX_SOCKET_STATES	16
+#define KAPI_MAX_PROTOCOL_BEHAVIORS	8
+#define KAPI_MAX_NET_ERRORS	16
+#define KAPI_MAX_SOCKOPTS	16
+#define KAPI_MAX_ADDR_FAMILIES	8
+
+/**
+ * enum kapi_param_type - Parameter type classification
+ * @KAPI_TYPE_VOID: void type
+ * @KAPI_TYPE_INT: Integer types (int, long, etc.)
+ * @KAPI_TYPE_UINT: Unsigned integer types
+ * @KAPI_TYPE_PTR: Pointer types
+ * @KAPI_TYPE_STRUCT: Structure types
+ * @KAPI_TYPE_UNION: Union types
+ * @KAPI_TYPE_ENUM: Enumeration types
+ * @KAPI_TYPE_FUNC_PTR: Function pointer types
+ * @KAPI_TYPE_ARRAY: Array types
+ * @KAPI_TYPE_FD: File descriptor - validated in process context
+ * @KAPI_TYPE_USER_PTR: User space pointer - validated for access and size
+ * @KAPI_TYPE_PATH: Pathname - validated for access and path limits
+ * @KAPI_TYPE_CUSTOM: Custom/complex types
+ */
+enum kapi_param_type {
+	KAPI_TYPE_VOID = 0,
+	KAPI_TYPE_INT,
+	KAPI_TYPE_UINT,
+	KAPI_TYPE_PTR,
+	KAPI_TYPE_STRUCT,
+	KAPI_TYPE_UNION,
+	KAPI_TYPE_ENUM,
+	KAPI_TYPE_FUNC_PTR,
+	KAPI_TYPE_ARRAY,
+	KAPI_TYPE_FD,		/* File descriptor - validated in process context */
+	KAPI_TYPE_USER_PTR,	/* User space pointer - validated for access and size */
+	KAPI_TYPE_PATH,		/* Pathname - validated for access and path limits */
+	KAPI_TYPE_CUSTOM,
+};
+
+/**
+ * enum kapi_param_flags - Parameter attribute flags
+ * @KAPI_PARAM_IN: Input parameter
+ * @KAPI_PARAM_OUT: Output parameter
+ * @KAPI_PARAM_INOUT: Input/output parameter
+ * @KAPI_PARAM_OPTIONAL: Optional parameter (can be NULL)
+ * @KAPI_PARAM_CONST: Const qualified parameter
+ * @KAPI_PARAM_VOLATILE: Volatile qualified parameter
+ * @KAPI_PARAM_USER: User space pointer
+ * @KAPI_PARAM_DMA: DMA-capable memory required
+ * @KAPI_PARAM_ALIGNED: Alignment requirements
+ */
+enum kapi_param_flags {
+	KAPI_PARAM_IN		= (1 << 0),
+	KAPI_PARAM_OUT		= (1 << 1),
+	KAPI_PARAM_INOUT	= (1 << 2),
+	KAPI_PARAM_OPTIONAL	= (1 << 3),
+	KAPI_PARAM_CONST	= (1 << 4),
+	KAPI_PARAM_VOLATILE	= (1 << 5),
+	KAPI_PARAM_USER		= (1 << 6),
+	KAPI_PARAM_DMA		= (1 << 7),
+	KAPI_PARAM_ALIGNED	= (1 << 8),
+};
+
+/**
+ * enum kapi_context_flags - Function execution context flags
+ * @KAPI_CTX_PROCESS: Can be called from process context
+ * @KAPI_CTX_SOFTIRQ: Can be called from softirq context
+ * @KAPI_CTX_HARDIRQ: Can be called from hardirq context
+ * @KAPI_CTX_NMI: Can be called from NMI context
+ * @KAPI_CTX_ATOMIC: Must be called in atomic context
+ * @KAPI_CTX_SLEEPABLE: May sleep
+ * @KAPI_CTX_PREEMPT_DISABLED: Requires preemption disabled
+ * @KAPI_CTX_IRQ_DISABLED: Requires interrupts disabled
+ */
+enum kapi_context_flags {
+	KAPI_CTX_PROCESS	= (1 << 0),
+	KAPI_CTX_SOFTIRQ	= (1 << 1),
+	KAPI_CTX_HARDIRQ	= (1 << 2),
+	KAPI_CTX_NMI		= (1 << 3),
+	KAPI_CTX_ATOMIC		= (1 << 4),
+	KAPI_CTX_SLEEPABLE	= (1 << 5),
+	KAPI_CTX_PREEMPT_DISABLED = (1 << 6),
+	KAPI_CTX_IRQ_DISABLED	= (1 << 7),
+};
+
+/**
+ * enum kapi_lock_type - Lock types used/required by the function
+ * @KAPI_LOCK_NONE: No locking requirements
+ * @KAPI_LOCK_MUTEX: Mutex lock
+ * @KAPI_LOCK_SPINLOCK: Spinlock
+ * @KAPI_LOCK_RWLOCK: Read-write lock
+ * @KAPI_LOCK_SEQLOCK: Sequence lock
+ * @KAPI_LOCK_RCU: RCU lock
+ * @KAPI_LOCK_SEMAPHORE: Semaphore
+ * @KAPI_LOCK_CUSTOM: Custom locking mechanism
+ */
+enum kapi_lock_type {
+	KAPI_LOCK_NONE = 0,
+	KAPI_LOCK_MUTEX,
+	KAPI_LOCK_SPINLOCK,
+	KAPI_LOCK_RWLOCK,
+	KAPI_LOCK_SEQLOCK,
+	KAPI_LOCK_RCU,
+	KAPI_LOCK_SEMAPHORE,
+	KAPI_LOCK_CUSTOM,
+};
+
+/**
+ * enum kapi_constraint_type - Types of parameter constraints
+ * @KAPI_CONSTRAINT_NONE: No constraint
+ * @KAPI_CONSTRAINT_RANGE: Numeric range constraint
+ * @KAPI_CONSTRAINT_MASK: Bitmask constraint
+ * @KAPI_CONSTRAINT_ENUM: Enumerated values constraint
+ * @KAPI_CONSTRAINT_ALIGNMENT: Alignment constraint (must be aligned to specified boundary)
+ * @KAPI_CONSTRAINT_POWER_OF_TWO: Value must be a power of two
+ * @KAPI_CONSTRAINT_PAGE_ALIGNED: Value must be page-aligned
+ * @KAPI_CONSTRAINT_NONZERO: Value must be non-zero
+ * @KAPI_CONSTRAINT_CUSTOM: Custom validation function
+ */
+enum kapi_constraint_type {
+	KAPI_CONSTRAINT_NONE = 0,
+	KAPI_CONSTRAINT_RANGE,
+	KAPI_CONSTRAINT_MASK,
+	KAPI_CONSTRAINT_ENUM,
+	KAPI_CONSTRAINT_ALIGNMENT,
+	KAPI_CONSTRAINT_POWER_OF_TWO,
+	KAPI_CONSTRAINT_PAGE_ALIGNED,
+	KAPI_CONSTRAINT_NONZERO,
+	KAPI_CONSTRAINT_CUSTOM,
+};
+
+/**
+ * struct kapi_param_spec - Parameter specification
+ * @name: Parameter name
+ * @type_name: Type name as string
+ * @type: Parameter type classification
+ * @flags: Parameter attribute flags
+ * @size: Size in bytes (for arrays/buffers)
+ * @alignment: Required alignment
+ * @min_value: Minimum valid value (for numeric types)
+ * @max_value: Maximum valid value (for numeric types)
+ * @valid_mask: Valid bits mask (for flag parameters)
+ * @enum_values: Array of valid enumerated values
+ * @enum_count: Number of valid enumerated values
+ * @constraint_type: Type of constraint applied
+ * @validate: Custom validation function
+ * @description: Human-readable description
+ * @constraints: Additional constraints description
+ * @size_param_idx: Index of parameter that determines size (-1 if fixed size)
+ * @size_multiplier: Multiplier for size calculation (e.g., sizeof(struct))
+ */
+struct kapi_param_spec {
+	char name[KAPI_MAX_NAME_LEN];
+	char type_name[KAPI_MAX_NAME_LEN];
+	enum kapi_param_type type;
+	u32 flags;
+	size_t size;
+	size_t alignment;
+	s64 min_value;
+	s64 max_value;
+	u64 valid_mask;
+	const s64 *enum_values;
+	u32 enum_count;
+	enum kapi_constraint_type constraint_type;
+	bool (*validate)(s64 value);
+	char description[KAPI_MAX_DESC_LEN];
+	char constraints[KAPI_MAX_DESC_LEN];
+	int size_param_idx;	/* Index of param that determines size, -1 if N/A */
+	size_t size_multiplier;	/* Size per unit (e.g., sizeof(struct epoll_event)) */
+} __attribute__((packed));
+
+/**
+ * struct kapi_error_spec - Error condition specification
+ * @error_code: Error code value
+ * @name: Error code name (e.g., "EINVAL")
+ * @condition: Condition that triggers this error
+ * @description: Detailed error description
+ */
+struct kapi_error_spec {
+	int error_code;
+	char name[KAPI_MAX_NAME_LEN];
+	char condition[KAPI_MAX_DESC_LEN];
+	char description[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/**
+ * enum kapi_return_check_type - Return value check types
+ * @KAPI_RETURN_EXACT: Success is an exact value
+ * @KAPI_RETURN_RANGE: Success is within a range
+ * @KAPI_RETURN_ERROR_CHECK: Success is when NOT in error list
+ * @KAPI_RETURN_FD: Return value is a file descriptor (>= 0 is success)
+ * @KAPI_RETURN_CUSTOM: Custom validation function
+ * @KAPI_RETURN_NO_RETURN: Function does not return (e.g., exec on success)
+ */
+enum kapi_return_check_type {
+	KAPI_RETURN_EXACT,
+	KAPI_RETURN_RANGE,
+	KAPI_RETURN_ERROR_CHECK,
+	KAPI_RETURN_FD,
+	KAPI_RETURN_CUSTOM,
+	KAPI_RETURN_NO_RETURN,
+};
+
+/**
+ * struct kapi_return_spec - Return value specification
+ * @type_name: Return type name
+ * @type: Return type classification
+ * @check_type: Type of success check to perform
+ * @success_value: Exact value indicating success (for EXACT)
+ * @success_min: Minimum success value (for RANGE)
+ * @success_max: Maximum success value (for RANGE)
+ * @error_values: Array of error values (for ERROR_CHECK)
+ * @error_count: Number of error values
+ * @is_success: Custom function to check success
+ * @description: Return value description
+ */
+struct kapi_return_spec {
+	char type_name[KAPI_MAX_NAME_LEN];
+	enum kapi_param_type type;
+	enum kapi_return_check_type check_type;
+	s64 success_value;
+	s64 success_min;
+	s64 success_max;
+	const s64 *error_values;
+	u32 error_count;
+	bool (*is_success)(s64 retval);
+	char description[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/**
+ * struct kapi_lock_spec - Lock requirement specification
+ * @lock_name: Name of the lock
+ * @lock_type: Type of lock
+ * @acquired: Whether function acquires this lock
+ * @released: Whether function releases this lock
+ * @held_on_entry: Whether lock must be held on entry
+ * @held_on_exit: Whether lock is held on exit
+ * @description: Additional lock requirements
+ */
+struct kapi_lock_spec {
+	char lock_name[KAPI_MAX_NAME_LEN];
+	enum kapi_lock_type lock_type;
+	bool acquired;
+	bool released;
+	bool held_on_entry;
+	bool held_on_exit;
+	char description[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/**
+ * struct kapi_constraint_spec - Additional constraint specification
+ * @name: Constraint name
+ * @description: Constraint description
+ * @expression: Formal expression (if applicable)
+ */
+struct kapi_constraint_spec {
+	char name[KAPI_MAX_NAME_LEN];
+	char description[KAPI_MAX_DESC_LEN];
+	char expression[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/**
+ * enum kapi_signal_direction - Signal flow direction
+ * @KAPI_SIGNAL_RECEIVE: Function may receive this signal
+ * @KAPI_SIGNAL_SEND: Function may send this signal
+ * @KAPI_SIGNAL_HANDLE: Function handles this signal specially
+ * @KAPI_SIGNAL_BLOCK: Function blocks this signal
+ * @KAPI_SIGNAL_IGNORE: Function ignores this signal
+ */
+enum kapi_signal_direction {
+	KAPI_SIGNAL_RECEIVE	= (1 << 0),
+	KAPI_SIGNAL_SEND	= (1 << 1),
+	KAPI_SIGNAL_HANDLE	= (1 << 2),
+	KAPI_SIGNAL_BLOCK	= (1 << 3),
+	KAPI_SIGNAL_IGNORE	= (1 << 4),
+};
+
+/**
+ * enum kapi_signal_action - What the function does with the signal
+ * @KAPI_SIGNAL_ACTION_DEFAULT: Default signal action applies
+ * @KAPI_SIGNAL_ACTION_TERMINATE: Causes termination
+ * @KAPI_SIGNAL_ACTION_COREDUMP: Causes termination with core dump
+ * @KAPI_SIGNAL_ACTION_STOP: Stops the process
+ * @KAPI_SIGNAL_ACTION_CONTINUE: Continues a stopped process
+ * @KAPI_SIGNAL_ACTION_CUSTOM: Custom handling described in notes
+ * @KAPI_SIGNAL_ACTION_RETURN: Returns from syscall with EINTR
+ * @KAPI_SIGNAL_ACTION_RESTART: Restarts the syscall
+ * @KAPI_SIGNAL_ACTION_QUEUE: Queues the signal for later delivery
+ * @KAPI_SIGNAL_ACTION_DISCARD: Discards the signal
+ * @KAPI_SIGNAL_ACTION_TRANSFORM: Transforms to another signal
+ */
+enum kapi_signal_action {
+	KAPI_SIGNAL_ACTION_DEFAULT = 0,
+	KAPI_SIGNAL_ACTION_TERMINATE,
+	KAPI_SIGNAL_ACTION_COREDUMP,
+	KAPI_SIGNAL_ACTION_STOP,
+	KAPI_SIGNAL_ACTION_CONTINUE,
+	KAPI_SIGNAL_ACTION_CUSTOM,
+	KAPI_SIGNAL_ACTION_RETURN,
+	KAPI_SIGNAL_ACTION_RESTART,
+	KAPI_SIGNAL_ACTION_QUEUE,
+	KAPI_SIGNAL_ACTION_DISCARD,
+	KAPI_SIGNAL_ACTION_TRANSFORM,
+};
+
+/**
+ * struct kapi_signal_spec - Signal specification
+ * @signal_num: Signal number (e.g., SIGKILL, SIGTERM)
+ * @signal_name: Signal name as string
+ * @direction: Direction flags (OR of kapi_signal_direction)
+ * @action: What happens when signal is received
+ * @target: Description of target process/thread for sent signals
+ * @condition: Condition under which signal is sent/received/handled
+ * @description: Detailed description of signal handling
+ * @restartable: Whether syscall is restartable after this signal
+ * @sa_flags_required: Required signal action flags (SA_*)
+ * @sa_flags_forbidden: Forbidden signal action flags
+ * @error_on_signal: Error code returned when signal occurs (-EINTR, etc)
+ * @transform_to: Signal number to transform to (if action is TRANSFORM)
+ * @timing: When signal can occur ("entry", "during", "exit", "anytime")
+ * @priority: Signal handling priority (lower processed first)
+ * @interruptible: Whether this operation is interruptible by this signal
+ * @queue_behavior: How signal is queued ("realtime", "standard", "coalesce")
+ * @state_required: Required process state for signal to be delivered
+ * @state_forbidden: Forbidden process state for signal delivery
+ */
+struct kapi_signal_spec {
+	int signal_num;
+	char signal_name[32];
+	u32 direction;
+	enum kapi_signal_action action;
+	char target[KAPI_MAX_DESC_LEN];
+	char condition[KAPI_MAX_DESC_LEN];
+	char description[KAPI_MAX_DESC_LEN];
+	bool restartable;
+	u32 sa_flags_required;
+	u32 sa_flags_forbidden;
+	int error_on_signal;
+	int transform_to;
+	char timing[32];
+	u8 priority;
+	bool interruptible;
+	char queue_behavior[128];
+	u32 state_required;
+	u32 state_forbidden;
+} __attribute__((packed));
+
+/**
+ * struct kapi_signal_mask_spec - Signal mask specification
+ * @mask_name: Name of the signal mask
+ * @signals: Array of signal numbers in the mask
+ * @signal_count: Number of signals in the mask
+ * @description: Description of what this mask represents
+ */
+struct kapi_signal_mask_spec {
+	char mask_name[KAPI_MAX_NAME_LEN];
+	int signals[KAPI_MAX_SIGNALS];
+	u32 signal_count;
+	char description[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/**
+ * struct kapi_struct_field - Structure field specification
+ * @name: Field name
+ * @type: Field type classification
+ * @type_name: Type name as string
+ * @offset: Offset within structure
+ * @size: Size of field in bytes
+ * @flags: Field attribute flags
+ * @constraint_type: Type of constraint applied
+ * @min_value: Minimum valid value (for numeric types)
+ * @max_value: Maximum valid value (for numeric types)
+ * @valid_mask: Valid bits mask (for flag fields)
+ * @enum_values: Comma-separated list of valid enum values (for enum types)
+ * @description: Field description
+ */
+struct kapi_struct_field {
+	char name[KAPI_MAX_NAME_LEN];
+	enum kapi_param_type type;
+	char type_name[KAPI_MAX_NAME_LEN];
+	size_t offset;
+	size_t size;
+	u32 flags;
+	enum kapi_constraint_type constraint_type;
+	s64 min_value;
+	s64 max_value;
+	u64 valid_mask;
+	char enum_values[KAPI_MAX_DESC_LEN];	/* Comma-separated list of valid enum values */
+	char description[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/**
+ * struct kapi_struct_spec - Structure type specification
+ * @name: Structure name
+ * @size: Total size of structure
+ * @alignment: Required alignment
+ * @field_count: Number of fields
+ * @fields: Field specifications
+ * @description: Structure description
+ */
+struct kapi_struct_spec {
+	char name[KAPI_MAX_NAME_LEN];
+	size_t size;
+	size_t alignment;
+	u32 field_count;
+	struct kapi_struct_field fields[KAPI_MAX_PARAMS];
+	char description[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/**
+ * enum kapi_capability_action - What the capability allows
+ * @KAPI_CAP_BYPASS_CHECK: Bypasses a check entirely
+ * @KAPI_CAP_INCREASE_LIMIT: Increases or removes a limit
+ * @KAPI_CAP_OVERRIDE_RESTRICTION: Overrides a restriction
+ * @KAPI_CAP_GRANT_PERMISSION: Grants permission that would otherwise be denied
+ * @KAPI_CAP_MODIFY_BEHAVIOR: Changes the behavior of the operation
+ * @KAPI_CAP_ACCESS_RESOURCE: Allows access to restricted resources
+ * @KAPI_CAP_PERFORM_OPERATION: Allows performing a privileged operation
+ */
+enum kapi_capability_action {
+	KAPI_CAP_BYPASS_CHECK = 0,
+	KAPI_CAP_INCREASE_LIMIT,
+	KAPI_CAP_OVERRIDE_RESTRICTION,
+	KAPI_CAP_GRANT_PERMISSION,
+	KAPI_CAP_MODIFY_BEHAVIOR,
+	KAPI_CAP_ACCESS_RESOURCE,
+	KAPI_CAP_PERFORM_OPERATION,
+};
+
+/**
+ * struct kapi_capability_spec - Capability requirement specification
+ * @capability: The capability constant (e.g., CAP_IPC_LOCK)
+ * @cap_name: Capability name as string
+ * @action: What the capability allows (kapi_capability_action)
+ * @allows: Description of what the capability allows
+ * @without_cap: What happens without the capability
+ * @check_condition: Condition when capability is checked
+ * @priority: Check priority (lower checked first)
+ * @alternative: Alternative capabilities that can be used
+ * @alternative_count: Number of alternative capabilities
+ */
+struct kapi_capability_spec {
+	int capability;
+	char cap_name[KAPI_MAX_NAME_LEN];
+	enum kapi_capability_action action;
+	char allows[KAPI_MAX_DESC_LEN];
+	char without_cap[KAPI_MAX_DESC_LEN];
+	char check_condition[KAPI_MAX_DESC_LEN];
+	u8 priority;
+	int alternative[KAPI_MAX_CAPABILITIES];
+	u32 alternative_count;
+} __attribute__((packed));
+
+/**
+ * enum kapi_side_effect_type - Types of side effects
+ * @KAPI_EFFECT_NONE: No side effects
+ * @KAPI_EFFECT_ALLOC_MEMORY: Allocates memory
+ * @KAPI_EFFECT_FREE_MEMORY: Frees memory
+ * @KAPI_EFFECT_MODIFY_STATE: Modifies global/shared state
+ * @KAPI_EFFECT_SIGNAL_SEND: Sends signals
+ * @KAPI_EFFECT_FILE_POSITION: Modifies file position
+ * @KAPI_EFFECT_LOCK_ACQUIRE: Acquires locks
+ * @KAPI_EFFECT_LOCK_RELEASE: Releases locks
+ * @KAPI_EFFECT_RESOURCE_CREATE: Creates system resources (FDs, PIDs, etc)
+ * @KAPI_EFFECT_RESOURCE_DESTROY: Destroys system resources
+ * @KAPI_EFFECT_SCHEDULE: May cause scheduling/context switch
+ * @KAPI_EFFECT_HARDWARE: Interacts with hardware
+ * @KAPI_EFFECT_NETWORK: Network I/O operation
+ * @KAPI_EFFECT_FILESYSTEM: Filesystem modification
+ * @KAPI_EFFECT_PROCESS_STATE: Modifies process state
+ * @KAPI_EFFECT_IRREVERSIBLE: Effect cannot be undone
+ */
+enum kapi_side_effect_type {
+	KAPI_EFFECT_NONE = 0,
+	KAPI_EFFECT_ALLOC_MEMORY = (1 << 0),
+	KAPI_EFFECT_FREE_MEMORY = (1 << 1),
+	KAPI_EFFECT_MODIFY_STATE = (1 << 2),
+	KAPI_EFFECT_SIGNAL_SEND = (1 << 3),
+	KAPI_EFFECT_FILE_POSITION = (1 << 4),
+	KAPI_EFFECT_LOCK_ACQUIRE = (1 << 5),
+	KAPI_EFFECT_LOCK_RELEASE = (1 << 6),
+	KAPI_EFFECT_RESOURCE_CREATE = (1 << 7),
+	KAPI_EFFECT_RESOURCE_DESTROY = (1 << 8),
+	KAPI_EFFECT_SCHEDULE = (1 << 9),
+	KAPI_EFFECT_HARDWARE = (1 << 10),
+	KAPI_EFFECT_NETWORK = (1 << 11),
+	KAPI_EFFECT_FILESYSTEM = (1 << 12),
+	KAPI_EFFECT_PROCESS_STATE = (1 << 13),
+	KAPI_EFFECT_IRREVERSIBLE = (1 << 14),
+};
+
+/**
+ * struct kapi_side_effect - Side effect specification
+ * @type: Bitmask of effect types
+ * @target: What is affected (e.g., "process memory", "file descriptor table")
+ * @condition: Condition under which effect occurs
+ * @description: Detailed description of the effect
+ * @reversible: Whether the effect can be undone
+ */
+struct kapi_side_effect {
+	u32 type;
+	char target[KAPI_MAX_NAME_LEN];
+	char condition[KAPI_MAX_DESC_LEN];
+	char description[KAPI_MAX_DESC_LEN];
+	bool reversible;
+} __attribute__((packed));
+
+/**
+ * struct kapi_state_transition - State transition specification
+ * @from_state: Starting state description
+ * @to_state: Ending state description
+ * @condition: Condition for transition
+ * @object: Object whose state changes
+ * @description: Detailed description
+ */
+struct kapi_state_transition {
+	char from_state[KAPI_MAX_NAME_LEN];
+	char to_state[KAPI_MAX_NAME_LEN];
+	char condition[KAPI_MAX_DESC_LEN];
+	char object[KAPI_MAX_NAME_LEN];
+	char description[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+#define KAPI_MAX_STRUCT_SPECS	8
+#define KAPI_MAX_SIDE_EFFECTS	32
+#define KAPI_MAX_STATE_TRANS	8
+
+/**
+ * enum kapi_socket_state - Socket states for state machine
+ */
+enum kapi_socket_state {
+	KAPI_SOCK_STATE_UNSPEC = 0,
+	KAPI_SOCK_STATE_CLOSED,
+	KAPI_SOCK_STATE_OPEN,
+	KAPI_SOCK_STATE_BOUND,
+	KAPI_SOCK_STATE_LISTEN,
+	KAPI_SOCK_STATE_SYN_SENT,
+	KAPI_SOCK_STATE_SYN_RECV,
+	KAPI_SOCK_STATE_ESTABLISHED,
+	KAPI_SOCK_STATE_FIN_WAIT1,
+	KAPI_SOCK_STATE_FIN_WAIT2,
+	KAPI_SOCK_STATE_CLOSE_WAIT,
+	KAPI_SOCK_STATE_CLOSING,
+	KAPI_SOCK_STATE_LAST_ACK,
+	KAPI_SOCK_STATE_TIME_WAIT,
+	KAPI_SOCK_STATE_CONNECTED,
+	KAPI_SOCK_STATE_DISCONNECTED,
+};
+
+/**
+ * enum kapi_socket_protocol - Socket protocol types
+ */
+enum kapi_socket_protocol {
+	KAPI_PROTO_TCP		= (1 << 0),
+	KAPI_PROTO_UDP		= (1 << 1),
+	KAPI_PROTO_UNIX		= (1 << 2),
+	KAPI_PROTO_RAW		= (1 << 3),
+	KAPI_PROTO_PACKET	= (1 << 4),
+	KAPI_PROTO_NETLINK	= (1 << 5),
+	KAPI_PROTO_SCTP		= (1 << 6),
+	KAPI_PROTO_DCCP		= (1 << 7),
+	KAPI_PROTO_ALL		= 0xFFFFFFFF,
+};
+
+/**
+ * enum kapi_buffer_behavior - Network buffer handling behaviors
+ */
+enum kapi_buffer_behavior {
+	KAPI_BUF_PEEK		= (1 << 0),
+	KAPI_BUF_TRUNCATE	= (1 << 1),
+	KAPI_BUF_SCATTER	= (1 << 2),
+	KAPI_BUF_ZERO_COPY	= (1 << 3),
+	KAPI_BUF_KERNEL_ALLOC	= (1 << 4),
+	KAPI_BUF_DMA_CAPABLE	= (1 << 5),
+	KAPI_BUF_FRAGMENT	= (1 << 6),
+};
+
+/**
+ * enum kapi_async_behavior - Asynchronous operation behaviors
+ */
+enum kapi_async_behavior {
+	KAPI_ASYNC_BLOCK	= 0,
+	KAPI_ASYNC_NONBLOCK	= (1 << 0),
+	KAPI_ASYNC_POLL_READY	= (1 << 1),
+	KAPI_ASYNC_SIGNAL_DRIVEN = (1 << 2),
+	KAPI_ASYNC_AIO		= (1 << 3),
+	KAPI_ASYNC_IO_URING	= (1 << 4),
+	KAPI_ASYNC_EPOLL	= (1 << 5),
+};
+
+/**
+ * struct kapi_socket_state_spec - Socket state requirement/transition
+ */
+struct kapi_socket_state_spec {
+	enum kapi_socket_state required_states[KAPI_MAX_SOCKET_STATES];
+	u32 required_state_count;
+	enum kapi_socket_state forbidden_states[KAPI_MAX_SOCKET_STATES];
+	u32 forbidden_state_count;
+	enum kapi_socket_state resulting_state;
+	char state_condition[KAPI_MAX_DESC_LEN];
+	u32 applicable_protocols;
+} __attribute__((packed));
+
+/**
+ * struct kapi_protocol_behavior - Protocol-specific behavior
+ */
+struct kapi_protocol_behavior {
+	u32 applicable_protocols;
+	char behavior[KAPI_MAX_DESC_LEN];
+	s64 protocol_flags;
+	char flag_description[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/**
+ * struct kapi_buffer_spec - Network buffer specification
+ */
+struct kapi_buffer_spec {
+	u32 buffer_behaviors;
+	size_t min_buffer_size;
+	size_t max_buffer_size;
+	size_t optimal_buffer_size;
+	char fragmentation_rules[KAPI_MAX_DESC_LEN];
+	bool can_partial_transfer;
+	char partial_transfer_rules[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/**
+ * struct kapi_async_spec - Asynchronous behavior specification
+ */
+struct kapi_async_spec {
+	enum kapi_async_behavior supported_modes;
+	int nonblock_errno;
+	u32 poll_events_in;
+	u32 poll_events_out;
+	char completion_condition[KAPI_MAX_DESC_LEN];
+	bool supports_timeout;
+	char timeout_behavior[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/**
+ * struct kapi_addr_family_spec - Address family specification
+ */
+struct kapi_addr_family_spec {
+	int family;
+	char family_name[32];
+	size_t addr_struct_size;
+	size_t min_addr_len;
+	size_t max_addr_len;
+	char addr_format[KAPI_MAX_DESC_LEN];
+	bool supports_wildcard;
+	bool supports_multicast;
+	bool supports_broadcast;
+	char special_addresses[KAPI_MAX_DESC_LEN];
+	u32 port_range_min;
+	u32 port_range_max;
+} __attribute__((packed));
+
+/**
+ * struct kernel_api_spec - Complete kernel API specification
+ * @name: Function name
+ * @version: API version
+ * @description: Brief description
+ * @long_description: Detailed description
+ * @context_flags: Execution context flags
+ * @param_count: Number of parameters
+ * @params: Parameter specifications
+ * @return_spec: Return value specification
+ * @error_count: Number of possible errors
+ * @errors: Error specifications
+ * @lock_count: Number of lock specifications
+ * @locks: Lock requirement specifications
+ * @constraint_count: Number of additional constraints
+ * @constraints: Additional constraint specifications
+ * @examples: Usage examples
+ * @notes: Additional notes
+ * @since_version: Kernel version when introduced
+ * @signal_count: Number of signal specifications
+ * @signals: Signal handling specifications
+ * @signal_mask_count: Number of signal mask specifications
+ * @signal_masks: Signal mask specifications
+ * @struct_spec_count: Number of structure specifications
+ * @struct_specs: Structure type specifications
+ * @side_effect_count: Number of side effect specifications
+ * @side_effects: Side effect specifications
+ * @state_trans_count: Number of state transition specifications
+ * @state_transitions: State transition specifications
+ */
+struct kernel_api_spec {
+	char name[KAPI_MAX_NAME_LEN];
+	u32 version;
+	char description[KAPI_MAX_DESC_LEN];
+	char long_description[KAPI_MAX_DESC_LEN * 4];
+	u32 context_flags;
+
+	/* Parameters */
+	u32 param_magic;  /* 0x4B415031 = 'KAP1' */
+	u32 param_count;
+	struct kapi_param_spec params[KAPI_MAX_PARAMS];
+
+	/* Return value */
+	u32 return_magic; /* 0x4B415232 = 'KAR2' */
+	struct kapi_return_spec return_spec;
+
+	/* Errors */
+	u32 error_magic;  /* 0x4B414533 = 'KAE3' */
+	u32 error_count;
+	struct kapi_error_spec errors[KAPI_MAX_ERRORS];
+
+	/* Locking */
+	u32 lock_magic;   /* 0x4B414C34 = 'KAL4' */
+	u32 lock_count;
+	struct kapi_lock_spec locks[KAPI_MAX_CONSTRAINTS];
+
+	/* Constraints */
+	u32 constraint_magic; /* 0x4B414335 = 'KAC5' */
+	u32 constraint_count;
+	struct kapi_constraint_spec constraints[KAPI_MAX_CONSTRAINTS];
+
+	/* Additional information */
+	u32 info_magic;   /* 0x4B414936 = 'KAI6' */
+	char examples[KAPI_MAX_DESC_LEN * 2];
+	char notes[KAPI_MAX_DESC_LEN * 2];
+	char since_version[32];
+
+	/* Signal specifications */
+	u32 signal_magic; /* 0x4B415337 = 'KAS7' */
+	u32 signal_count;
+	struct kapi_signal_spec signals[KAPI_MAX_SIGNALS];
+
+	/* Signal mask specifications */
+	u32 sigmask_magic; /* 0x4B414D38 = 'KAM8' */
+	u32 signal_mask_count;
+	struct kapi_signal_mask_spec signal_masks[KAPI_MAX_SIGNALS];
+
+	/* Structure specifications */
+	u32 struct_magic; /* 0x4B415439 = 'KAT9' */
+	u32 struct_spec_count;
+	struct kapi_struct_spec struct_specs[KAPI_MAX_STRUCT_SPECS];
+
+	/* Side effects */
+	u32 effect_magic; /* 0x4B414641 = 'KAFA' */
+	u32 side_effect_count;
+	struct kapi_side_effect side_effects[KAPI_MAX_SIDE_EFFECTS];
+
+	/* State transitions */
+	u32 trans_magic;  /* 0x4B415442 = 'KATB' */
+	u32 state_trans_count;
+	struct kapi_state_transition state_transitions[KAPI_MAX_STATE_TRANS];
+
+	/* Capability specifications */
+	u32 cap_magic;    /* 0x4B414343 = 'KACC' */
+	u32 capability_count;
+	struct kapi_capability_spec capabilities[KAPI_MAX_CAPABILITIES];
+
+	/* Extended fields for socket and network operations */
+	struct kapi_socket_state_spec socket_state;
+	struct kapi_protocol_behavior protocol_behaviors[KAPI_MAX_PROTOCOL_BEHAVIORS];
+	u32 protocol_behavior_count;
+	struct kapi_buffer_spec buffer_spec;
+	struct kapi_async_spec async_spec;
+	struct kapi_addr_family_spec addr_families[KAPI_MAX_ADDR_FAMILIES];
+	u32 addr_family_count;
+
+	/* Operation characteristics */
+	bool is_connection_oriented;
+	bool is_message_oriented;
+	bool supports_oob_data;
+	bool supports_peek;
+	bool supports_select_poll;
+	bool is_reentrant;
+
+	/* Semantic descriptions */
+	char connection_establishment[KAPI_MAX_DESC_LEN];
+	char connection_termination[KAPI_MAX_DESC_LEN];
+	char data_transfer_semantics[KAPI_MAX_DESC_LEN];
+} __attribute__((packed));
+
+/* Macros for defining API specifications */
+
+/**
+ * DEFINE_KERNEL_API_SPEC - Define a kernel API specification
+ * @func_name: Function name to specify
+ */
+#define DEFINE_KERNEL_API_SPEC(func_name) \
+	static struct kernel_api_spec __kapi_spec_##func_name \
+	__used __section(".kapi_specs") = {	\
+		.name = __stringify(func_name),	\
+		.version = 1,
+
+#define KAPI_END_SPEC };
+
+/**
+ * KAPI_DESCRIPTION - Set API description
+ * @desc: Description string
+ */
+#define KAPI_DESCRIPTION(desc) \
+	.description = desc,
+
+/**
+ * KAPI_LONG_DESC - Set detailed API description
+ * @desc: Detailed description string
+ */
+#define KAPI_LONG_DESC(desc) \
+	.long_description = desc,
+
+/**
+ * KAPI_CONTEXT - Set execution context flags
+ * @flags: Context flags (OR'ed KAPI_CTX_* values)
+ */
+#define KAPI_CONTEXT(flags) \
+	.context_flags = flags,
+
+/**
+ * KAPI_PARAM - Define a parameter specification
+ * @idx: Parameter index (0-based)
+ * @pname: Parameter name
+ * @ptype: Type name string
+ * @pdesc: Parameter description
+ */
+#define KAPI_PARAM(idx, pname, ptype, pdesc) \
+	.params[idx] = {			\
+		.name = pname,			\
+		.type_name = ptype,		\
+		.description = pdesc,		\
+		.size_param_idx = -1,		/* Default: no dynamic sizing */
+
+#define KAPI_PARAM_TYPE(ptype) \
+		.type = ptype,
+
+#define KAPI_PARAM_FLAGS(pflags) \
+		.flags = pflags,
+
+#define KAPI_PARAM_SIZE(psize) \
+		.size = psize,
+
+#define KAPI_PARAM_RANGE(pmin, pmax) \
+		.min_value = pmin,	\
+		.max_value = pmax,
+
+#define KAPI_PARAM_CONSTRAINT_TYPE(ctype) \
+		.constraint_type = ctype,
+
+#define KAPI_PARAM_CONSTRAINT(desc) \
+		.constraints = desc,
+
+#define KAPI_PARAM_VALID_MASK(mask) \
+		.valid_mask = mask,
+
+#define KAPI_PARAM_ENUM_VALUES(values) \
+		.enum_values = values, \
+		.enum_count = ARRAY_SIZE(values),
+
+#define KAPI_PARAM_ALIGNMENT(align) \
+		.alignment = align,
+
+#define KAPI_PARAM_SIZE_PARAM(idx) \
+		.size_param_idx = idx,
+
+#define KAPI_PARAM_END },
+
+/**
+ * KAPI_PARAM_COUNT - Set the number of parameters
+ * @n: Number of parameters
+ */
+#define KAPI_PARAM_COUNT(n) \
+	.param_magic = 0x4B415031, /* 'KAP1' */ \
+	.param_count = n,
+
+/**
+ * KAPI_RETURN - Define return value specification
+ * @rtype: Return type name
+ * @rdesc: Return value description
+ */
+#define KAPI_RETURN(rtype, rdesc) \
+	.return_spec = {		\
+		.type_name = rtype,	\
+		.description = rdesc,
+
+#define KAPI_RETURN_SUCCESS(val) \
+		.success_value = val,
+
+#define KAPI_RETURN_TYPE(rtype) \
+		.type = rtype,
+
+#define KAPI_RETURN_CHECK_TYPE(ctype) \
+		.check_type = ctype,
+
+#define KAPI_RETURN_ERROR_VALUES(values) \
+		.error_values = values,
+
+#define KAPI_RETURN_ERROR_COUNT(count) \
+		.error_count = count,
+
+#define KAPI_RETURN_SUCCESS_RANGE(min, max) \
+		.success_min = min, \
+		.success_max = max,
+
+#define KAPI_RETURN_END },
+
+/**
+ * KAPI_ERROR - Define an error condition
+ * @idx: Error index
+ * @ecode: Error code value
+ * @ename: Error name
+ * @econd: Error condition
+ * @edesc: Error description
+ */
+#define KAPI_ERROR(idx, ecode, ename, econd, edesc) \
+	.errors[idx] = {			\
+		.error_code = ecode,		\
+		.name = ename,			\
+		.condition = econd,		\
+		.description = edesc,		\
+	},
+
+/**
+ * KAPI_ERROR_COUNT - Set the number of errors
+ * @n: Number of errors
+ */
+#define KAPI_ERROR_COUNT(n) \
+	.error_magic = 0x4B414533, /* 'KAE3' */ \
+	.error_count = n,
+
+/**
+ * KAPI_LOCK - Define a lock requirement
+ * @idx: Lock index
+ * @lname: Lock name
+ * @ltype: Lock type
+ */
+#define KAPI_LOCK(idx, lname, ltype) \
+	.locks[idx] = {			\
+		.lock_name = lname,	\
+		.lock_type = ltype,
+
+#define KAPI_LOCK_ACQUIRED \
+		.acquired = true,
+
+#define KAPI_LOCK_RELEASED \
+		.released = true,
+
+#define KAPI_LOCK_HELD_ENTRY \
+		.held_on_entry = true,
+
+#define KAPI_LOCK_HELD_EXIT \
+		.held_on_exit = true,
+
+#define KAPI_LOCK_DESC(ldesc) \
+		.description = ldesc,
+
+#define KAPI_LOCK_END },
+
+/**
+ * KAPI_CONSTRAINT - Define an additional constraint
+ * @idx: Constraint index
+ * @cname: Constraint name
+ * @cdesc: Constraint description
+ */
+#define KAPI_CONSTRAINT(idx, cname, cdesc) \
+	.constraints[idx] = {		\
+		.name = cname,		\
+		.description = cdesc,
+
+#define KAPI_CONSTRAINT_EXPR(expr) \
+		.expression = expr,
+
+#define KAPI_CONSTRAINT_END },
+
+/**
+ * KAPI_EXAMPLES - Set API usage examples
+ * @examples: Examples string
+ */
+#define KAPI_EXAMPLES(ex) \
+	.info_magic = 0x4B414936, /* 'KAI6' */ \
+	.examples = ex,
+
+/**
+ * KAPI_NOTES - Set API notes
+ * @notes: Notes string
+ */
+#define KAPI_NOTES(n) \
+	.notes = n,
+
+
+/**
+ * KAPI_SIGNAL - Define a signal specification
+ * @idx: Signal index
+ * @signum: Signal number (e.g., SIGKILL)
+ * @signame: Signal name string
+ * @dir: Direction flags
+ * @act: Action taken
+ */
+#define KAPI_SIGNAL(idx, signum, signame, dir, act) \
+	.signals[idx] = {			\
+		.signal_num = signum,		\
+		.signal_name = signame,		\
+		.direction = dir,		\
+		.action = act,
+
+#define KAPI_SIGNAL_TARGET(tgt) \
+		.target = tgt,
+
+#define KAPI_SIGNAL_CONDITION(cond) \
+		.condition = cond,
+
+#define KAPI_SIGNAL_DESC(desc) \
+		.description = desc,
+
+#define KAPI_SIGNAL_RESTARTABLE \
+		.restartable = true,
+
+#define KAPI_SIGNAL_SA_FLAGS_REQ(flags) \
+		.sa_flags_required = flags,
+
+#define KAPI_SIGNAL_SA_FLAGS_FORBID(flags) \
+		.sa_flags_forbidden = flags,
+
+#define KAPI_SIGNAL_ERROR(err) \
+		.error_on_signal = err,
+
+#define KAPI_SIGNAL_TRANSFORM(sig) \
+		.transform_to = sig,
+
+#define KAPI_SIGNAL_TIMING(when) \
+		.timing = when,
+
+#define KAPI_SIGNAL_PRIORITY(prio) \
+		.priority = prio,
+
+#define KAPI_SIGNAL_INTERRUPTIBLE \
+		.interruptible = true,
+
+#define KAPI_SIGNAL_QUEUE(behavior) \
+		.queue_behavior = behavior,
+
+#define KAPI_SIGNAL_STATE_REQ(state) \
+		.state_required = state,
+
+#define KAPI_SIGNAL_STATE_FORBID(state) \
+		.state_forbidden = state,
+
+#define KAPI_SIGNAL_END },
+
+#define KAPI_SIGNAL_COUNT(n) \
+	.signal_magic = 0x4B415337, /* 'KAS7' */ \
+	.signal_count = n,
+
+/**
+ * KAPI_SIGNAL_MASK - Define a signal mask specification
+ * @idx: Mask index
+ * @name: Mask name
+ * @desc: Mask description
+ */
+#define KAPI_SIGNAL_MASK(idx, name, desc) \
+	.signal_masks[idx] = {		\
+		.mask_name = name,	\
+		.description = desc,
+
+#define KAPI_SIGNAL_MASK_ADD(signum) \
+		.signals[.signal_count++] = signum,
+
+#define KAPI_SIGNAL_MASK_END },
+
+/**
+ * KAPI_STRUCT_SPEC - Define a structure specification
+ * @idx: Structure spec index
+ * @sname: Structure name
+ * @sdesc: Structure description
+ */
+#define KAPI_STRUCT_SPEC(idx, sname, sdesc) \
+	.struct_specs[idx] = {		\
+		.name = #sname,		\
+		.description = sdesc,
+
+#define KAPI_STRUCT_SIZE(ssize, salign) \
+		.size = ssize,		\
+		.alignment = salign,
+
+#define KAPI_STRUCT_FIELD_COUNT(n) \
+		.field_count = n,
+
+/**
+ * KAPI_STRUCT_FIELD - Define a structure field
+ * @fidx: Field index
+ * @fname: Field name
+ * @ftype: Field type (KAPI_TYPE_*)
+ * @ftype_name: Type name as string
+ * @fdesc: Field description
+ */
+#define KAPI_STRUCT_FIELD(fidx, fname, ftype, ftype_name, fdesc) \
+		.fields[fidx] = {	\
+			.name = fname,	\
+			.type = ftype,	\
+			.type_name = ftype_name, \
+			.description = fdesc,
+
+#define KAPI_FIELD_OFFSET(foffset) \
+			.offset = foffset,
+
+#define KAPI_FIELD_SIZE(fsize) \
+			.size = fsize,
+
+#define KAPI_FIELD_FLAGS(fflags) \
+			.flags = fflags,
+
+#define KAPI_FIELD_CONSTRAINT_RANGE(min, max) \
+			.constraint_type = KAPI_CONSTRAINT_RANGE, \
+			.min_value = min, \
+			.max_value = max,
+
+#define KAPI_FIELD_CONSTRAINT_MASK(mask) \
+			.constraint_type = KAPI_CONSTRAINT_MASK, \
+			.valid_mask = mask,
+
+#define KAPI_FIELD_CONSTRAINT_ENUM(values) \
+			.constraint_type = KAPI_CONSTRAINT_ENUM, \
+			.enum_values = values,
+
+#define KAPI_STRUCT_FIELD_END },
+
+#define KAPI_STRUCT_SPEC_END },
+
+/* Counter for structure specifications */
+#define KAPI_STRUCT_SPEC_COUNT(n) \
+	.struct_magic = 0x4B415439, /* 'KAT9' */ \
+	.struct_spec_count = n,
+
+/* Additional lock-related macros */
+#define KAPI_LOCK_COUNT(n) \
+	.lock_magic = 0x4B414C34, /* 'KAL4' */ \
+	.lock_count = n,
+
+/**
+ * KAPI_SIDE_EFFECT - Define a side effect
+ * @idx: Side effect index
+ * @etype: Effect type bitmask (OR'ed KAPI_EFFECT_* values)
+ * @etarget: What is affected
+ * @edesc: Effect description
+ */
+#define KAPI_SIDE_EFFECT(idx, etype, etarget, edesc) \
+	.side_effects[idx] = {		\
+		.type = etype,		\
+		.target = etarget,	\
+		.description = edesc,	\
+		.reversible = false,	/* Default to non-reversible */
+
+#define KAPI_EFFECT_CONDITION(cond) \
+		.condition = cond,
+
+#define KAPI_EFFECT_REVERSIBLE \
+		.reversible = true,
+
+#define KAPI_SIDE_EFFECT_END },
+
+/**
+ * KAPI_STATE_TRANS - Define a state transition
+ * @idx: State transition index
+ * @obj: Object whose state changes
+ * @from: From state
+ * @to: To state
+ * @desc: Transition description
+ */
+#define KAPI_STATE_TRANS(idx, obj, from, to, desc) \
+	.state_transitions[idx] = {	\
+		.object = obj,		\
+		.from_state = from,	\
+		.to_state = to,		\
+		.description = desc,
+
+#define KAPI_STATE_TRANS_COND(cond) \
+		.condition = cond,
+
+#define KAPI_STATE_TRANS_END },
+
+/* Counters for side effects and state transitions */
+#define KAPI_SIDE_EFFECT_COUNT(n) \
+	.effect_magic = 0x4B414641, /* 'KAFA' */ \
+	.side_effect_count = n,
+
+#define KAPI_STATE_TRANS_COUNT(n) \
+	.trans_magic = 0x4B415442, /* 'KATB' */ \
+	.state_trans_count = n,
+
+/* Helper macros for common side effect patterns */
+#define KAPI_EFFECTS_MEMORY	(KAPI_EFFECT_ALLOC_MEMORY | KAPI_EFFECT_FREE_MEMORY)
+#define KAPI_EFFECTS_LOCKING	(KAPI_EFFECT_LOCK_ACQUIRE | KAPI_EFFECT_LOCK_RELEASE)
+#define KAPI_EFFECTS_RESOURCES	(KAPI_EFFECT_RESOURCE_CREATE | KAPI_EFFECT_RESOURCE_DESTROY)
+#define KAPI_EFFECTS_IO		(KAPI_EFFECT_NETWORK | KAPI_EFFECT_FILESYSTEM)
+
+/* Helper macros for common patterns */
+
+#define KAPI_PARAM_IN		(KAPI_PARAM_IN)
+#define KAPI_PARAM_OUT		(KAPI_PARAM_OUT)
+#define KAPI_PARAM_INOUT	(KAPI_PARAM_IN | KAPI_PARAM_OUT)
+#define KAPI_PARAM_OPTIONAL	(KAPI_PARAM_OPTIONAL)
+#define KAPI_PARAM_USER_PTR	(KAPI_PARAM_USER | KAPI_PARAM_PTR)
+
+/* Common signal timing constants */
+#define KAPI_SIGNAL_TIME_ENTRY		"entry"
+#define KAPI_SIGNAL_TIME_DURING		"during"
+#define KAPI_SIGNAL_TIME_EXIT		"exit"
+#define KAPI_SIGNAL_TIME_ANYTIME	"anytime"
+#define KAPI_SIGNAL_TIME_BLOCKING	"while_blocked"
+#define KAPI_SIGNAL_TIME_SLEEPING	"while_sleeping"
+#define KAPI_SIGNAL_TIME_BEFORE		"before"
+#define KAPI_SIGNAL_TIME_AFTER		"after"
+
+/* Common signal queue behaviors */
+#define KAPI_SIGNAL_QUEUE_STANDARD	"standard"
+#define KAPI_SIGNAL_QUEUE_REALTIME	"realtime"
+#define KAPI_SIGNAL_QUEUE_COALESCE	"coalesce"
+#define KAPI_SIGNAL_QUEUE_REPLACE	"replace"
+#define KAPI_SIGNAL_QUEUE_DISCARD	"discard"
+
+/* Process state flags for signal delivery */
+#define KAPI_SIGNAL_STATE_RUNNING	(1 << 0)
+#define KAPI_SIGNAL_STATE_SLEEPING	(1 << 1)
+#define KAPI_SIGNAL_STATE_STOPPED	(1 << 2)
+#define KAPI_SIGNAL_STATE_TRACED	(1 << 3)
+#define KAPI_SIGNAL_STATE_ZOMBIE	(1 << 4)
+#define KAPI_SIGNAL_STATE_DEAD		(1 << 5)
+
+/* Capability specification macros */
+
+/**
+ * KAPI_CAPABILITY - Define a capability requirement
+ * @idx: Capability index
+ * @cap: Capability constant (e.g., CAP_IPC_LOCK)
+ * @name: Capability name string
+ * @act: Action type (kapi_capability_action)
+ */
+#define KAPI_CAPABILITY(idx, cap, name, act) \
+	.capabilities[idx] = {		\
+		.capability = cap,	\
+		.cap_name = name,	\
+		.action = act,
+
+#define KAPI_CAP_ALLOWS(desc) \
+		.allows = desc,
+
+#define KAPI_CAP_WITHOUT(desc) \
+		.without_cap = desc,
+
+#define KAPI_CAP_CONDITION(cond) \
+		.check_condition = cond,
+
+#define KAPI_CAP_PRIORITY(prio) \
+		.priority = prio,
+
+#define KAPI_CAP_ALTERNATIVE(caps, count) \
+		.alternative = caps,	\
+		.alternative_count = count,
+
+#define KAPI_CAPABILITY_END },
+
+/* Counter for capability specifications */
+#define KAPI_CAPABILITY_COUNT(n) \
+	.cap_magic = 0x4B414343, /* 'KACC' */ \
+	.capability_count = n,
+
+/* Common signal patterns for syscalls */
+#define KAPI_SIGNAL_INTERRUPTIBLE_SLEEP \
+	KAPI_SIGNAL(0, SIGINT, "SIGINT", KAPI_SIGNAL_RECEIVE, KAPI_SIGNAL_ACTION_RETURN) \
+		KAPI_SIGNAL_TIMING(KAPI_SIGNAL_TIME_SLEEPING) \
+		KAPI_SIGNAL_ERROR(-EINTR) \
+		KAPI_SIGNAL_RESTARTABLE \
+		KAPI_SIGNAL_DESC("Interrupts sleep, returns -EINTR") \
+	KAPI_SIGNAL_END, \
+	KAPI_SIGNAL(1, SIGTERM, "SIGTERM", KAPI_SIGNAL_RECEIVE, KAPI_SIGNAL_ACTION_RETURN) \
+		KAPI_SIGNAL_TIMING(KAPI_SIGNAL_TIME_SLEEPING) \
+		KAPI_SIGNAL_ERROR(-EINTR) \
+		KAPI_SIGNAL_RESTARTABLE \
+		KAPI_SIGNAL_DESC("Interrupts sleep, returns -EINTR") \
+	KAPI_SIGNAL_END
+
+#define KAPI_SIGNAL_FATAL_DEFAULT \
+	KAPI_SIGNAL(2, SIGKILL, "SIGKILL", KAPI_SIGNAL_RECEIVE, KAPI_SIGNAL_ACTION_TERMINATE) \
+		KAPI_SIGNAL_TIMING(KAPI_SIGNAL_TIME_ANYTIME) \
+		KAPI_SIGNAL_PRIORITY(0) \
+		KAPI_SIGNAL_DESC("Process terminated immediately") \
+	KAPI_SIGNAL_END
+
+#define KAPI_SIGNAL_STOP_CONT \
+	KAPI_SIGNAL(3, SIGSTOP, "SIGSTOP", KAPI_SIGNAL_RECEIVE, KAPI_SIGNAL_ACTION_STOP) \
+		KAPI_SIGNAL_TIMING(KAPI_SIGNAL_TIME_ANYTIME) \
+		KAPI_SIGNAL_DESC("Process stopped") \
+	KAPI_SIGNAL_END, \
+	KAPI_SIGNAL(4, SIGCONT, "SIGCONT", KAPI_SIGNAL_RECEIVE, KAPI_SIGNAL_ACTION_CONTINUE) \
+		KAPI_SIGNAL_TIMING(KAPI_SIGNAL_TIME_ANYTIME) \
+		KAPI_SIGNAL_DESC("Process continued") \
+	KAPI_SIGNAL_END
+
+/* Validation and runtime checking */
+
+#ifdef CONFIG_KAPI_RUNTIME_CHECKS
+bool kapi_validate_params(const struct kernel_api_spec *spec, ...);
+bool kapi_validate_param(const struct kapi_param_spec *param_spec, s64 value);
+bool kapi_validate_param_with_context(const struct kapi_param_spec *param_spec,
+				       s64 value, const s64 *all_params, int param_count);
+int kapi_validate_syscall_param(const struct kernel_api_spec *spec,
+				int param_idx, s64 value);
+int kapi_validate_syscall_params(const struct kernel_api_spec *spec,
+				 const s64 *params, int param_count);
+bool kapi_check_return_success(const struct kapi_return_spec *return_spec, s64 retval);
+bool kapi_validate_return_value(const struct kernel_api_spec *spec, s64 retval);
+int kapi_validate_syscall_return(const struct kernel_api_spec *spec, s64 retval);
+void kapi_check_context(const struct kernel_api_spec *spec);
+void kapi_check_locks(const struct kernel_api_spec *spec);
+bool kapi_check_signal_allowed(const struct kernel_api_spec *spec, int signum);
+bool kapi_validate_signal_action(const struct kernel_api_spec *spec, int signum,
+				 struct sigaction *act);
+int kapi_get_signal_error(const struct kernel_api_spec *spec, int signum);
+bool kapi_is_signal_restartable(const struct kernel_api_spec *spec, int signum);
+#else
+static inline bool kapi_validate_params(const struct kernel_api_spec *spec, ...)
+{
+	return true;
+}
+static inline bool kapi_validate_param(const struct kapi_param_spec *param_spec, s64 value)
+{
+	return true;
+}
+static inline bool kapi_validate_param_with_context(const struct kapi_param_spec *param_spec,
+						     s64 value, const s64 *all_params, int param_count)
+{
+	return true;
+}
+static inline int kapi_validate_syscall_param(const struct kernel_api_spec *spec,
+					       int param_idx, s64 value)
+{
+	return 0;
+}
+static inline int kapi_validate_syscall_params(const struct kernel_api_spec *spec,
+					       const s64 *params, int param_count)
+{
+	return 0;
+}
+static inline bool kapi_check_return_success(const struct kapi_return_spec *return_spec, s64 retval)
+{
+	return true;
+}
+static inline bool kapi_validate_return_value(const struct kernel_api_spec *spec, s64 retval)
+{
+	return true;
+}
+static inline int kapi_validate_syscall_return(const struct kernel_api_spec *spec, s64 retval)
+{
+	return 0;
+}
+static inline void kapi_check_context(const struct kernel_api_spec *spec) {}
+static inline void kapi_check_locks(const struct kernel_api_spec *spec) {}
+static inline bool kapi_check_signal_allowed(const struct kernel_api_spec *spec, int signum)
+{
+	return true;
+}
+static inline bool kapi_validate_signal_action(const struct kernel_api_spec *spec, int signum,
+					       struct sigaction *act)
+{
+	return true;
+}
+static inline int kapi_get_signal_error(const struct kernel_api_spec *spec, int signum)
+{
+	return -EINTR;
+}
+static inline bool kapi_is_signal_restartable(const struct kernel_api_spec *spec, int signum)
+{
+	return false;
+}
+#endif
+
+/* Export/query functions */
+const struct kernel_api_spec *kapi_get_spec(const char *name);
+int kapi_export_json(const struct kernel_api_spec *spec, char *buf, size_t size);
+void kapi_print_spec(const struct kernel_api_spec *spec);
+
+/* Registration for dynamic APIs */
+int kapi_register_spec(struct kernel_api_spec *spec);
+void kapi_unregister_spec(const char *name);
+
+/* Helper to get parameter constraint info */
+static inline bool kapi_get_param_constraint(const char *api_name, int param_idx,
+					      enum kapi_constraint_type *type,
+					      u64 *valid_mask, s64 *min_val, s64 *max_val)
+{
+	const struct kernel_api_spec *spec = kapi_get_spec(api_name);
+
+	if (!spec || param_idx >= spec->param_count)
+		return false;
+
+	if (type)
+		*type = spec->params[param_idx].constraint_type;
+	if (valid_mask)
+		*valid_mask = spec->params[param_idx].valid_mask;
+	if (min_val)
+		*min_val = spec->params[param_idx].min_value;
+	if (max_val)
+		*max_val = spec->params[param_idx].max_value;
+
+	return true;
+}
+
+/* Socket state requirement macros */
+#define KAPI_SOCKET_STATE_REQ(...) \
+	.socket_state = { \
+		.required_states = { __VA_ARGS__ }, \
+		.required_state_count = sizeof((enum kapi_socket_state[]){__VA_ARGS__})/sizeof(enum kapi_socket_state),
+
+#define KAPI_SOCKET_STATE_FORBID(...) \
+		.forbidden_states = { __VA_ARGS__ }, \
+		.forbidden_state_count = sizeof((enum kapi_socket_state[]){__VA_ARGS__})/sizeof(enum kapi_socket_state),
+
+#define KAPI_SOCKET_STATE_RESULT(state) \
+		.resulting_state = state,
+
+#define KAPI_SOCKET_STATE_COND(cond) \
+		.state_condition = cond,
+
+#define KAPI_SOCKET_STATE_PROTOS(protos) \
+		.applicable_protocols = protos,
+
+#define KAPI_SOCKET_STATE_END },
+
+/* Protocol behavior macros */
+#define KAPI_PROTOCOL_BEHAVIOR(idx, protos, desc) \
+	.protocol_behaviors[idx] = { \
+		.applicable_protocols = protos, \
+		.behavior = desc,
+
+#define KAPI_PROTOCOL_FLAGS(flags, desc) \
+		.protocol_flags = flags, \
+		.flag_description = desc,
+
+#define KAPI_PROTOCOL_BEHAVIOR_END },
+
+/* Async behavior macros */
+#define KAPI_ASYNC_SPEC(modes, errno) \
+	.async_spec = { \
+		.supported_modes = modes, \
+		.nonblock_errno = errno,
+
+#define KAPI_ASYNC_POLL(in, out) \
+		.poll_events_in = in, \
+		.poll_events_out = out,
+
+#define KAPI_ASYNC_COMPLETION(cond) \
+		.completion_condition = cond,
+
+#define KAPI_ASYNC_TIMEOUT(supported, desc) \
+		.supports_timeout = supported, \
+		.timeout_behavior = desc,
+
+#define KAPI_ASYNC_END },
+
+/* Buffer behavior macros */
+#define KAPI_BUFFER_SPEC(behaviors) \
+	.buffer_spec = { \
+		.buffer_behaviors = behaviors,
+
+#define KAPI_BUFFER_SIZE(min, max, optimal) \
+		.min_buffer_size = min, \
+		.max_buffer_size = max, \
+		.optimal_buffer_size = optimal,
+
+#define KAPI_BUFFER_PARTIAL(allowed, rules) \
+		.can_partial_transfer = allowed, \
+		.partial_transfer_rules = rules,
+
+#define KAPI_BUFFER_FRAGMENT(rules) \
+		.fragmentation_rules = rules,
+
+#define KAPI_BUFFER_END },
+
+/* Address family macros */
+#define KAPI_ADDR_FAMILY(idx, fam, name, struct_sz, min_len, max_len) \
+	.addr_families[idx] = { \
+		.family = fam, \
+		.family_name = name, \
+		.addr_struct_size = struct_sz, \
+		.min_addr_len = min_len, \
+		.max_addr_len = max_len,
+
+#define KAPI_ADDR_FORMAT(fmt) \
+		.addr_format = fmt,
+
+#define KAPI_ADDR_FEATURES(wildcard, multicast, broadcast) \
+		.supports_wildcard = wildcard, \
+		.supports_multicast = multicast, \
+		.supports_broadcast = broadcast,
+
+#define KAPI_ADDR_SPECIAL(addrs) \
+		.special_addresses = addrs,
+
+#define KAPI_ADDR_PORTS(min, max) \
+		.port_range_min = min, \
+		.port_range_max = max,
+
+#define KAPI_ADDR_FAMILY_END },
+
+#define KAPI_ADDR_FAMILY_COUNT(n) \
+	.addr_family_count = n,
+
+#define KAPI_PROTOCOL_BEHAVIOR_COUNT(n) \
+	.protocol_behavior_count = n,
+
+#define KAPI_CONSTRAINT_COUNT(n) \
+	.constraint_magic = 0x4B414335, /* 'KAC5' */ \
+	.constraint_count = n,
+
+/* Network operation characteristics macros */
+#define KAPI_NET_CONNECTION_ORIENTED \
+	.is_connection_oriented = true,
+
+#define KAPI_NET_MESSAGE_ORIENTED \
+	.is_message_oriented = true,
+
+#define KAPI_NET_SUPPORTS_OOB \
+	.supports_oob_data = true,
+
+#define KAPI_NET_SUPPORTS_PEEK \
+	.supports_peek = true,
+
+#define KAPI_NET_REENTRANT \
+	.is_reentrant = true,
+
+/* Semantic description macros */
+#define KAPI_NET_CONN_ESTABLISH(desc) \
+	.connection_establishment = desc,
+
+#define KAPI_NET_CONN_TERMINATE(desc) \
+	.connection_termination = desc,
+
+#define KAPI_NET_DATA_TRANSFER(desc) \
+	.data_transfer_semantics = desc,
+
+#endif /* _LINUX_KERNEL_API_SPEC_H */
diff --git a/include/linux/syscall_api_spec.h b/include/linux/syscall_api_spec.h
new file mode 100644
index 000000000000..9317aa30e49c
--- /dev/null
+++ b/include/linux/syscall_api_spec.h
@@ -0,0 +1,125 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * syscall_api_spec.h - System Call API Specification Integration
+ *
+ * This header extends the SYSCALL_DEFINEX macros to support inline API specifications,
+ * allowing syscall documentation to be written alongside the implementation in a
+ * human-readable and machine-parseable format.
+ */
+
+#ifndef _LINUX_SYSCALL_API_SPEC_H
+#define _LINUX_SYSCALL_API_SPEC_H
+
+#include <linux/kernel_api_spec.h>
+
+
+
+/* Automatic syscall validation infrastructure */
+/*
+ * The validation is now integrated directly into the SYSCALL_DEFINEx macros
+ * in syscalls.h when CONFIG_KAPI_RUNTIME_CHECKS is enabled.
+ *
+ * The validation happens in the __do_kapi_sys##name wrapper function which:
+ * 1. Validates all parameters before calling the actual syscall
+ * 2. Calls the real syscall implementation
+ * 3. Validates the return value
+ * 4. Returns the result
+ */
+
+
+/*
+ * Helper macros for common syscall patterns
+ */
+
+/* For syscalls that can sleep */
+#define KAPI_SYSCALL_SLEEPABLE \
+	KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_SLEEPABLE)
+
+/* For syscalls that must be atomic */
+#define KAPI_SYSCALL_ATOMIC \
+	KAPI_CONTEXT(KAPI_CTX_PROCESS | KAPI_CTX_ATOMIC)
+
+/* Common parameter specifications */
+#define KAPI_PARAM_FD(idx, desc) \
+	KAPI_PARAM(idx, "fd", "int", desc) \
+		KAPI_PARAM_FLAGS(KAPI_PARAM_IN) \
+		.type = KAPI_TYPE_FD, \
+		.constraint_type = KAPI_CONSTRAINT_NONE, \
+	KAPI_PARAM_END
+
+#define KAPI_PARAM_USER_BUF(idx, name, desc) \
+	KAPI_PARAM(idx, name, "void __user *", desc) \
+		KAPI_PARAM_FLAGS(KAPI_PARAM_USER_PTR | KAPI_PARAM_IN) \
+	KAPI_PARAM_END
+
+#define KAPI_PARAM_USER_STRUCT(idx, name, struct_type, desc) \
+	KAPI_PARAM(idx, name, #struct_type " __user *", desc) \
+		KAPI_PARAM_FLAGS(KAPI_PARAM_USER | KAPI_PARAM_IN) \
+		.type = KAPI_TYPE_USER_PTR, \
+		.size = sizeof(struct_type), \
+		.constraint_type = KAPI_CONSTRAINT_NONE, \
+	KAPI_PARAM_END
+
+#define KAPI_PARAM_SIZE_T(idx, name, desc) \
+	KAPI_PARAM(idx, name, "size_t", desc) \
+		KAPI_PARAM_FLAGS(KAPI_PARAM_IN) \
+		KAPI_PARAM_RANGE(0, SIZE_MAX) \
+	KAPI_PARAM_END
+
+/* Common error specifications */
+#define KAPI_ERROR_EBADF(idx) \
+	KAPI_ERROR(idx, -EBADF, "EBADF", "Invalid file descriptor", \
+		   "The file descriptor is not valid or has been closed")
+
+#define KAPI_ERROR_EINVAL(idx, condition) \
+	KAPI_ERROR(idx, -EINVAL, "EINVAL", condition, \
+		   "Invalid argument provided")
+
+#define KAPI_ERROR_ENOMEM(idx) \
+	KAPI_ERROR(idx, -ENOMEM, "ENOMEM", "Insufficient memory", \
+		   "Cannot allocate memory for the operation")
+
+#define KAPI_ERROR_EPERM(idx) \
+	KAPI_ERROR(idx, -EPERM, "EPERM", "Operation not permitted", \
+		   "The calling process does not have the required permissions")
+
+#define KAPI_ERROR_EFAULT(idx) \
+	KAPI_ERROR(idx, -EFAULT, "EFAULT", "Bad address", \
+		   "Invalid user space address provided")
+
+/* Standard return value specifications */
+#define KAPI_RETURN_SUCCESS_ZERO \
+	KAPI_RETURN("long", "0 on success, negative error code on failure") \
+		KAPI_RETURN_SUCCESS(0, "== 0") \
+	KAPI_RETURN_END
+
+#define KAPI_RETURN_FD_SPEC \
+	KAPI_RETURN("long", "File descriptor on success, negative error code on failure") \
+		.check_type = KAPI_RETURN_FD, \
+	KAPI_RETURN_END
+
+#define KAPI_RETURN_COUNT \
+	KAPI_RETURN("long", "Number of bytes processed on success, negative error code on failure") \
+		KAPI_RETURN_SUCCESS(0, ">= 0") \
+	KAPI_RETURN_END
+
+/* KAPI_ERROR_COUNT and KAPI_PARAM_COUNT are now defined in kernel_api_spec.h */
+
+/**
+ * KAPI_SINCE_VERSION - Set the since version
+ * @version: Version string when the API was introduced
+ */
+#define KAPI_SINCE_VERSION(version) \
+	.since_version = version,
+
+
+/**
+ * KAPI_SIGNAL_MASK_COUNT - Set the signal mask count
+ * @count: Number of signal masks defined
+ */
+#define KAPI_SIGNAL_MASK_COUNT(count) \
+	.signal_mask_count = count,
+
+
+
+#endif /* _LINUX_SYSCALL_API_SPEC_H */
\ No newline at end of file
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 77f45e5d4413..403d0e09692f 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -88,6 +88,7 @@ struct file_attr;
 #include <linux/bug.h>
 #include <linux/sem.h>
 #include <asm/siginfo.h>
+#include <linux/syscall_api_spec.h>
 #include <linux/unistd.h>
 #include <linux/quota.h>
 #include <linux/key.h>
@@ -133,6 +134,7 @@ struct file_attr;
 #define __SC_TYPE(t, a)	t
 #define __SC_ARGS(t, a)	a
 #define __SC_TEST(t, a) (void)BUILD_BUG_ON_ZERO(!__TYPE_IS_LL(t) && sizeof(t) > sizeof(long))
+#define __SC_CAST_TO_S64(t, a)	(s64)(a)
 
 #ifdef CONFIG_FTRACE_SYSCALLS
 #define __SC_STR_ADECL(t, a)	#a
@@ -243,6 +245,41 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
  * done within __do_sys_*().
  */
 #ifndef __SYSCALL_DEFINEx
+#ifdef CONFIG_KAPI_RUNTIME_CHECKS
+#define __SYSCALL_DEFINEx(x, name, ...)					\
+	__diag_push();							\
+	__diag_ignore(GCC, 8, "-Wattribute-alias",			\
+		      "Type aliasing is used to sanitize syscall arguments");\
+	asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))	\
+		__attribute__((alias(__stringify(__se_sys##name))));	\
+	ALLOW_ERROR_INJECTION(sys##name, ERRNO);			\
+	static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__));\
+	static inline long __do_kapi_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)); \
+	asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__));	\
+	asmlinkage long __se_sys##name(__MAP(x,__SC_LONG,__VA_ARGS__))	\
+	{								\
+		long ret = __do_kapi_sys##name(__MAP(x,__SC_CAST,__VA_ARGS__));\
+		__MAP(x,__SC_TEST,__VA_ARGS__);				\
+		__PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__));	\
+		return ret;						\
+	}								\
+	__diag_pop();							\
+	static inline long __do_kapi_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))\
+	{								\
+		const struct kernel_api_spec *__spec = kapi_get_spec("sys_" #name); \
+		if (__spec) {						\
+			s64 __params[x] = { __MAP(x,__SC_CAST_TO_S64,__VA_ARGS__) }; \
+			int __ret = kapi_validate_syscall_params(__spec, __params, x); \
+			if (__ret) return __ret;			\
+		}							\
+		long ret = __do_sys##name(__MAP(x,__SC_ARGS,__VA_ARGS__));	\
+		if (__spec) {						\
+			kapi_validate_syscall_return(__spec, (s64)ret); \
+		}							\
+		return ret;						\
+	}								\
+	static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+#else /* !CONFIG_KAPI_RUNTIME_CHECKS */
 #define __SYSCALL_DEFINEx(x, name, ...)					\
 	__diag_push();							\
 	__diag_ignore(GCC, 8, "-Wattribute-alias",			\
@@ -261,6 +298,7 @@ static inline int is_syscall_trace_event(struct trace_event_call *tp_event)
 	}								\
 	__diag_pop();							\
 	static inline long __do_sys##name(__MAP(x,__SC_DECL,__VA_ARGS__))
+#endif /* CONFIG_KAPI_RUNTIME_CHECKS */
 #endif /* __SYSCALL_DEFINEx */
 
 /* For split 64-bit arguments on 32-bit architectures */
diff --git a/init/Kconfig b/init/Kconfig
index 836320251219..481a5a73f1ff 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -2108,6 +2108,8 @@ config TRACEPOINTS
 
 source "kernel/Kconfig.kexec"
 
+source "kernel/api/Kconfig"
+
 endmenu		# General setup
 
 source "arch/Kconfig"
diff --git a/kernel/Makefile b/kernel/Makefile
index c60623448235..c100baacb1f0 100644
--- a/kernel/Makefile
+++ b/kernel/Makefile
@@ -56,6 +56,7 @@ obj-y += dma/
 obj-y += entry/
 obj-y += unwind/
 obj-$(CONFIG_MODULES) += module/
+obj-$(CONFIG_KAPI_SPEC) += api/
 
 obj-$(CONFIG_KCMP) += kcmp.o
 obj-$(CONFIG_FREEZER) += freezer.o
diff --git a/kernel/api/Kconfig b/kernel/api/Kconfig
new file mode 100644
index 000000000000..fde25ec70e13
--- /dev/null
+++ b/kernel/api/Kconfig
@@ -0,0 +1,35 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Kernel API Specification Framework Configuration
+#
+
+config KAPI_SPEC
+	bool "Kernel API Specification Framework"
+	help
+	  This option enables the kernel API specification framework,
+	  which provides formal documentation of kernel APIs in both
+	  human and machine-readable formats.
+
+	  The framework allows developers to document APIs inline with
+	  their implementation, including parameter specifications,
+	  return values, error conditions, locking requirements, and
+	  execution context constraints.
+
+	  When enabled, API specifications can be queried at runtime
+	  and exported in various formats (JSON, XML) through debugfs.
+
+	  If unsure, say N.
+
+config KAPI_RUNTIME_CHECKS
+	bool "Runtime API specification checks"
+	depends on KAPI_SPEC
+	depends on DEBUG_KERNEL
+	help
+	  Enable runtime validation of API usage against specifications.
+	  This includes checking execution context requirements, parameter
+	  validation, and lock state verification.
+
+	  This adds overhead and should only be used for debugging and
+	  development. The checks use WARN_ONCE to report violations.
+
+	  If unsure, say N.
diff --git a/kernel/api/Makefile b/kernel/api/Makefile
new file mode 100644
index 000000000000..4120ded7e5cf
--- /dev/null
+++ b/kernel/api/Makefile
@@ -0,0 +1,7 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# Makefile for the Kernel API Specification Framework
+#
+
+# Core API specification framework
+obj-$(CONFIG_KAPI_SPEC)		+= kernel_api_spec.o
\ No newline at end of file
diff --git a/kernel/api/kernel_api_spec.c b/kernel/api/kernel_api_spec.c
new file mode 100644
index 000000000000..5500bb98c4f9
--- /dev/null
+++ b/kernel/api/kernel_api_spec.c
@@ -0,0 +1,1155 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * kernel_api_spec.c - Kernel API Specification Framework Implementation
+ *
+ * Provides runtime support for kernel API specifications including validation,
+ * export to various formats, and querying capabilities.
+ */
+
+#include <linux/kernel.h>
+#include <linux/kernel_api_spec.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/seq_file.h>
+#include <linux/debugfs.h>
+#include <linux/export.h>
+#include <linux/preempt.h>
+#include <linux/hardirq.h>
+#include <linux/file.h>
+#include <linux/fdtable.h>
+#include <linux/uaccess.h>
+#include <linux/limits.h>
+#include <linux/fcntl.h>
+#include <linux/mm.h>
+
+/* Section where API specifications are stored */
+extern struct kernel_api_spec __start_kapi_specs[];
+extern struct kernel_api_spec __stop_kapi_specs[];
+
+/* Dynamic API registration */
+static LIST_HEAD(dynamic_api_specs);
+static DEFINE_MUTEX(api_spec_mutex);
+
+struct dynamic_api_spec {
+	struct list_head list;
+	struct kernel_api_spec *spec;
+};
+
+/**
+ * kapi_get_spec - Get API specification by name
+ * @name: Function name to look up
+ *
+ * Return: Pointer to API specification or NULL if not found
+ */
+const struct kernel_api_spec *kapi_get_spec(const char *name)
+{
+	struct kernel_api_spec *spec;
+	struct dynamic_api_spec *dyn_spec;
+
+	/* Search static specifications */
+	for (spec = __start_kapi_specs; spec < __stop_kapi_specs; spec++) {
+		if (strcmp(spec->name, name) == 0)
+			return spec;
+	}
+
+	/* Search dynamic specifications */
+	mutex_lock(&api_spec_mutex);
+	list_for_each_entry(dyn_spec, &dynamic_api_specs, list) {
+		if (strcmp(dyn_spec->spec->name, name) == 0) {
+			mutex_unlock(&api_spec_mutex);
+			return dyn_spec->spec;
+		}
+	}
+	mutex_unlock(&api_spec_mutex);
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(kapi_get_spec);
+
+/**
+ * kapi_register_spec - Register a dynamic API specification
+ * @spec: API specification to register
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int kapi_register_spec(struct kernel_api_spec *spec)
+{
+	struct dynamic_api_spec *dyn_spec;
+
+	if (!spec || !spec->name[0])
+		return -EINVAL;
+
+	/* Check if already exists */
+	if (kapi_get_spec(spec->name))
+		return -EEXIST;
+
+	dyn_spec = kzalloc(sizeof(*dyn_spec), GFP_KERNEL);
+	if (!dyn_spec)
+		return -ENOMEM;
+
+	dyn_spec->spec = spec;
+
+	mutex_lock(&api_spec_mutex);
+	list_add_tail(&dyn_spec->list, &dynamic_api_specs);
+	mutex_unlock(&api_spec_mutex);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kapi_register_spec);
+
+/**
+ * kapi_unregister_spec - Unregister a dynamic API specification
+ * @name: Name of API to unregister
+ */
+void kapi_unregister_spec(const char *name)
+{
+	struct dynamic_api_spec *dyn_spec, *tmp;
+
+	mutex_lock(&api_spec_mutex);
+	list_for_each_entry_safe(dyn_spec, tmp, &dynamic_api_specs, list) {
+		if (strcmp(dyn_spec->spec->name, name) == 0) {
+			list_del(&dyn_spec->list);
+			kfree(dyn_spec);
+			break;
+		}
+	}
+	mutex_unlock(&api_spec_mutex);
+}
+EXPORT_SYMBOL_GPL(kapi_unregister_spec);
+
+/**
+ * param_type_to_string - Convert parameter type to string
+ * @type: Parameter type
+ *
+ * Return: String representation of type
+ */
+static const char *param_type_to_string(enum kapi_param_type type)
+{
+	static const char * const type_names[] = {
+		[KAPI_TYPE_VOID] = "void",
+		[KAPI_TYPE_INT] = "int",
+		[KAPI_TYPE_UINT] = "uint",
+		[KAPI_TYPE_PTR] = "pointer",
+		[KAPI_TYPE_STRUCT] = "struct",
+		[KAPI_TYPE_UNION] = "union",
+		[KAPI_TYPE_ENUM] = "enum",
+		[KAPI_TYPE_FUNC_PTR] = "function_pointer",
+		[KAPI_TYPE_ARRAY] = "array",
+		[KAPI_TYPE_FD] = "file_descriptor",
+		[KAPI_TYPE_USER_PTR] = "user_pointer",
+		[KAPI_TYPE_PATH] = "pathname",
+		[KAPI_TYPE_CUSTOM] = "custom",
+	};
+
+	if (type >= ARRAY_SIZE(type_names))
+		return "unknown";
+
+	return type_names[type];
+}
+
+/**
+ * lock_type_to_string - Convert lock type to string
+ * @type: Lock type
+ *
+ * Return: String representation of lock type
+ */
+static const char *lock_type_to_string(enum kapi_lock_type type)
+{
+	static const char * const lock_names[] = {
+		[KAPI_LOCK_NONE] = "none",
+		[KAPI_LOCK_MUTEX] = "mutex",
+		[KAPI_LOCK_SPINLOCK] = "spinlock",
+		[KAPI_LOCK_RWLOCK] = "rwlock",
+		[KAPI_LOCK_SEQLOCK] = "seqlock",
+		[KAPI_LOCK_RCU] = "rcu",
+		[KAPI_LOCK_SEMAPHORE] = "semaphore",
+		[KAPI_LOCK_CUSTOM] = "custom",
+	};
+
+	if (type >= ARRAY_SIZE(lock_names))
+		return "unknown";
+
+	return lock_names[type];
+}
+
+/**
+ * return_check_type_to_string - Convert return check type to string
+ * @type: Return check type
+ *
+ * Return: String representation of return check type
+ */
+static const char *return_check_type_to_string(enum kapi_return_check_type type)
+{
+	static const char * const check_names[] = {
+		[KAPI_RETURN_EXACT] = "exact",
+		[KAPI_RETURN_RANGE] = "range",
+		[KAPI_RETURN_ERROR_CHECK] = "error_check",
+		[KAPI_RETURN_FD] = "file_descriptor",
+		[KAPI_RETURN_CUSTOM] = "custom",
+	};
+
+	if (type >= ARRAY_SIZE(check_names))
+		return "unknown";
+
+	return check_names[type];
+}
+
+/**
+ * capability_action_to_string - Convert capability action to string
+ * @action: Capability action
+ *
+ * Return: String representation of capability action
+ */
+static const char *capability_action_to_string(enum kapi_capability_action action)
+{
+	static const char * const action_names[] = {
+		[KAPI_CAP_BYPASS_CHECK] = "bypass_check",
+		[KAPI_CAP_INCREASE_LIMIT] = "increase_limit",
+		[KAPI_CAP_OVERRIDE_RESTRICTION] = "override_restriction",
+		[KAPI_CAP_GRANT_PERMISSION] = "grant_permission",
+		[KAPI_CAP_MODIFY_BEHAVIOR] = "modify_behavior",
+		[KAPI_CAP_ACCESS_RESOURCE] = "access_resource",
+		[KAPI_CAP_PERFORM_OPERATION] = "perform_operation",
+	};
+
+	if (action >= ARRAY_SIZE(action_names))
+		return "unknown";
+
+	return action_names[action];
+}
+
+/**
+ * kapi_export_json - Export API specification to JSON format
+ * @spec: API specification to export
+ * @buf: Buffer to write JSON to
+ * @size: Size of buffer
+ *
+ * Return: Number of bytes written or negative error
+ */
+int kapi_export_json(const struct kernel_api_spec *spec, char *buf, size_t size)
+{
+	int ret = 0;
+	int i;
+
+	if (!spec || !buf || size == 0)
+		return -EINVAL;
+
+	ret = scnprintf(buf, size,
+		"{\n"
+		"  \"name\": \"%s\",\n"
+		"  \"version\": %u,\n"
+		"  \"description\": \"%s\",\n"
+		"  \"long_description\": \"%s\",\n"
+		"  \"context_flags\": \"0x%x\",\n",
+		spec->name,
+		spec->version,
+		spec->description,
+		spec->long_description,
+		spec->context_flags);
+
+	/* Parameters */
+	ret += scnprintf(buf + ret, size - ret,
+		"  \"parameters\": [\n");
+
+	for (i = 0; i < spec->param_count && i < KAPI_MAX_PARAMS; i++) {
+		const struct kapi_param_spec *param = &spec->params[i];
+
+		ret += scnprintf(buf + ret, size - ret,
+			"    {\n"
+			"      \"name\": \"%s\",\n"
+			"      \"type\": \"%s\",\n"
+			"      \"type_class\": \"%s\",\n"
+			"      \"flags\": \"0x%x\",\n"
+			"      \"description\": \"%s\"\n"
+			"    }%s\n",
+			param->name,
+			param->type_name,
+			param_type_to_string(param->type),
+			param->flags,
+			param->description,
+			(i < spec->param_count - 1) ? "," : "");
+	}
+
+	ret += scnprintf(buf + ret, size - ret, "  ],\n");
+
+	/* Return value */
+	ret += scnprintf(buf + ret, size - ret,
+		"  \"return\": {\n"
+		"    \"type\": \"%s\",\n"
+		"    \"type_class\": \"%s\",\n"
+		"    \"check_type\": \"%s\",\n",
+		spec->return_spec.type_name,
+		param_type_to_string(spec->return_spec.type),
+		return_check_type_to_string(spec->return_spec.check_type));
+
+	switch (spec->return_spec.check_type) {
+	case KAPI_RETURN_EXACT:
+		ret += scnprintf(buf + ret, size - ret,
+			"    \"success_value\": %lld,\n",
+			spec->return_spec.success_value);
+		break;
+	case KAPI_RETURN_RANGE:
+		ret += scnprintf(buf + ret, size - ret,
+			"    \"success_min\": %lld,\n"
+			"    \"success_max\": %lld,\n",
+			spec->return_spec.success_min,
+			spec->return_spec.success_max);
+		break;
+	case KAPI_RETURN_ERROR_CHECK:
+		ret += scnprintf(buf + ret, size - ret,
+			"    \"error_count\": %u,\n",
+			spec->return_spec.error_count);
+		break;
+	default:
+		break;
+	}
+
+	ret += scnprintf(buf + ret, size - ret,
+		"    \"description\": \"%s\"\n"
+		"  },\n",
+		spec->return_spec.description);
+
+	/* Errors */
+	ret += scnprintf(buf + ret, size - ret,
+		"  \"errors\": [\n");
+
+	for (i = 0; i < spec->error_count && i < KAPI_MAX_ERRORS; i++) {
+		const struct kapi_error_spec *error = &spec->errors[i];
+
+		ret += scnprintf(buf + ret, size - ret,
+			"    {\n"
+			"      \"code\": %d,\n"
+			"      \"name\": \"%s\",\n"
+			"      \"condition\": \"%s\",\n"
+			"      \"description\": \"%s\"\n"
+			"    }%s\n",
+			error->error_code,
+			error->name,
+			error->condition,
+			error->description,
+			(i < spec->error_count - 1) ? "," : "");
+	}
+
+	ret += scnprintf(buf + ret, size - ret, "  ],\n");
+
+	/* Locks */
+	ret += scnprintf(buf + ret, size - ret,
+		"  \"locks\": [\n");
+
+	for (i = 0; i < spec->lock_count && i < KAPI_MAX_CONSTRAINTS; i++) {
+		const struct kapi_lock_spec *lock = &spec->locks[i];
+
+		ret += scnprintf(buf + ret, size - ret,
+			"    {\n"
+			"      \"name\": \"%s\",\n"
+			"      \"type\": \"%s\",\n"
+			"      \"acquired\": %s,\n"
+			"      \"released\": %s,\n"
+			"      \"held_on_entry\": %s,\n"
+			"      \"held_on_exit\": %s,\n"
+			"      \"description\": \"%s\"\n"
+			"    }%s\n",
+			lock->lock_name,
+			lock_type_to_string(lock->lock_type),
+			lock->acquired ? "true" : "false",
+			lock->released ? "true" : "false",
+			lock->held_on_entry ? "true" : "false",
+			lock->held_on_exit ? "true" : "false",
+			lock->description,
+			(i < spec->lock_count - 1) ? "," : "");
+	}
+
+	ret += scnprintf(buf + ret, size - ret, "  ],\n");
+
+	/* Capabilities */
+	ret += scnprintf(buf + ret, size - ret,
+		"  \"capabilities\": [\n");
+
+	for (i = 0; i < spec->capability_count && i < KAPI_MAX_CAPABILITIES; i++) {
+		const struct kapi_capability_spec *cap = &spec->capabilities[i];
+
+		ret += scnprintf(buf + ret, size - ret,
+			"    {\n"
+			"      \"capability\": %d,\n"
+			"      \"name\": \"%s\",\n"
+			"      \"action\": \"%s\",\n"
+			"      \"allows\": \"%s\",\n"
+			"      \"without_cap\": \"%s\",\n"
+			"      \"check_condition\": \"%s\",\n"
+			"      \"priority\": %u",
+			cap->capability,
+			cap->cap_name,
+			capability_action_to_string(cap->action),
+			cap->allows,
+			cap->without_cap,
+			cap->check_condition,
+			cap->priority);
+
+		if (cap->alternative_count > 0) {
+			int j;
+			ret += scnprintf(buf + ret, size - ret,
+				",\n      \"alternatives\": [");
+			for (j = 0; j < cap->alternative_count; j++) {
+				ret += scnprintf(buf + ret, size - ret,
+					"%d%s", cap->alternative[j],
+					(j < cap->alternative_count - 1) ? ", " : "");
+			}
+			ret += scnprintf(buf + ret, size - ret, "]");
+		}
+
+		ret += scnprintf(buf + ret, size - ret,
+			"\n    }%s\n",
+			(i < spec->capability_count - 1) ? "," : "");
+	}
+
+	ret += scnprintf(buf + ret, size - ret, "  ],\n");
+
+	/* Additional info */
+	ret += scnprintf(buf + ret, size - ret,
+		"  \"since_version\": \"%s\",\n"
+		"  \"examples\": \"%s\",\n"
+		"  \"notes\": \"%s\"\n"
+		"}\n",
+		spec->since_version,
+		spec->examples,
+		spec->notes);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(kapi_export_json);
+
+
+/**
+ * kapi_print_spec - Print API specification to kernel log
+ * @spec: API specification to print
+ */
+void kapi_print_spec(const struct kernel_api_spec *spec)
+{
+	int i;
+
+	if (!spec)
+		return;
+
+	pr_info("=== Kernel API Specification ===\n");
+	pr_info("Name: %s\n", spec->name);
+	pr_info("Version: %u\n", spec->version);
+	pr_info("Description: %s\n", spec->description);
+
+	if (spec->long_description[0])
+		pr_info("Long Description: %s\n", spec->long_description);
+
+	pr_info("Context Flags: 0x%x\n", spec->context_flags);
+
+	/* Parameters */
+	if (spec->param_count > 0) {
+		pr_info("Parameters:\n");
+		for (i = 0; i < spec->param_count && i < KAPI_MAX_PARAMS; i++) {
+			const struct kapi_param_spec *param = &spec->params[i];
+			pr_info("  [%d] %s: %s (flags: 0x%x)\n",
+				i, param->name, param->type_name, param->flags);
+			if (param->description[0])
+				pr_info("      Description: %s\n", param->description);
+		}
+	}
+
+	/* Return value */
+	pr_info("Return: %s\n", spec->return_spec.type_name);
+	if (spec->return_spec.description[0])
+		pr_info("  Description: %s\n", spec->return_spec.description);
+
+	/* Errors */
+	if (spec->error_count > 0) {
+		pr_info("Possible Errors:\n");
+		for (i = 0; i < spec->error_count && i < KAPI_MAX_ERRORS; i++) {
+			const struct kapi_error_spec *error = &spec->errors[i];
+			pr_info("  %s (%d): %s\n",
+				error->name, error->error_code, error->condition);
+		}
+	}
+
+	/* Capabilities */
+	if (spec->capability_count > 0) {
+		pr_info("Capabilities:\n");
+		for (i = 0; i < spec->capability_count && i < KAPI_MAX_CAPABILITIES; i++) {
+			const struct kapi_capability_spec *cap = &spec->capabilities[i];
+			pr_info("  %s (%d):\n", cap->cap_name, cap->capability);
+			pr_info("    Action: %s\n", capability_action_to_string(cap->action));
+			pr_info("    Allows: %s\n", cap->allows);
+			pr_info("    Without: %s\n", cap->without_cap);
+			if (cap->check_condition[0])
+				pr_info("    Condition: %s\n", cap->check_condition);
+		}
+	}
+
+	pr_info("================================\n");
+}
+EXPORT_SYMBOL_GPL(kapi_print_spec);
+
+#ifdef CONFIG_KAPI_RUNTIME_CHECKS
+
+/**
+ * kapi_validate_fd - Validate that a file descriptor is valid in current context
+ * @fd: File descriptor to validate
+ *
+ * Return: true if fd is valid in current process context, false otherwise
+ */
+static bool kapi_validate_fd(int fd)
+{
+	struct fd f;
+
+	/* Special case: AT_FDCWD is always valid */
+	if (fd == AT_FDCWD)
+		return true;
+
+	/* Check basic range */
+	if (fd < 0)
+		return false;
+
+	/* Check if fd is valid in current process context */
+	f = fdget(fd);
+	if (fd_empty(f)) {
+		return false;
+	}
+
+	/* fd is valid, release reference */
+	fdput(f);
+	return true;
+}
+
+/**
+ * kapi_validate_user_ptr - Validate that a user pointer is accessible
+ * @ptr: User pointer to validate
+ * @size: Size in bytes to validate
+ * @write: Whether write access is required
+ *
+ * Return: true if user memory is accessible, false otherwise
+ */
+static bool kapi_validate_user_ptr(const void __user *ptr, size_t size, bool write)
+{
+	/* NULL is valid if parameter is marked optional */
+	if (!ptr)
+		return false;
+
+	/* Check if the user memory region is accessible */
+	if (write) {
+		return access_ok(ptr, size);
+	} else {
+		return access_ok(ptr, size);
+	}
+}
+
+/**
+ * kapi_validate_user_ptr_with_params - Validate user pointer with dynamic size
+ * @param_spec: Parameter specification
+ * @ptr: User pointer to validate
+ * @all_params: Array of all parameter values
+ * @param_count: Number of parameters
+ *
+ * Return: true if user memory is accessible, false otherwise
+ */
+static bool kapi_validate_user_ptr_with_params(const struct kapi_param_spec *param_spec,
+						const void __user *ptr,
+						const s64 *all_params,
+						int param_count)
+{
+	size_t actual_size;
+	bool write;
+
+	/* NULL is allowed for optional parameters */
+	if (!ptr && (param_spec->flags & KAPI_PARAM_OPTIONAL))
+		return true;
+
+	/* Calculate actual size based on related parameter */
+	if (param_spec->size_param_idx >= 0 &&
+	    param_spec->size_param_idx < param_count) {
+		s64 count = all_params[param_spec->size_param_idx];
+
+		/* Validate count is positive */
+		if (count <= 0) {
+			pr_warn("Parameter %s: size determinant is non-positive (%lld)\n",
+				param_spec->name, count);
+			return false;
+		}
+
+		/* Check for multiplication overflow */
+		if (param_spec->size_multiplier > 0 &&
+		    count > SIZE_MAX / param_spec->size_multiplier) {
+			pr_warn("Parameter %s: size calculation overflow\n",
+				param_spec->name);
+			return false;
+		}
+
+		actual_size = count * param_spec->size_multiplier;
+	} else {
+		/* Use fixed size */
+		actual_size = param_spec->size;
+	}
+
+	write = (param_spec->flags & KAPI_PARAM_OUT) ||
+		(param_spec->flags & KAPI_PARAM_INOUT);
+
+	return kapi_validate_user_ptr(ptr, actual_size, write);
+}
+
+/**
+ * kapi_validate_path - Validate that a pathname is accessible and within limits
+ * @path: User pointer to pathname
+ * @param_spec: Parameter specification
+ *
+ * Return: true if path is valid, false otherwise
+ */
+static bool kapi_validate_path(const char __user *path,
+				const struct kapi_param_spec *param_spec)
+{
+	size_t len;
+
+	/* NULL is allowed for optional parameters */
+	if (!path && (param_spec->flags & KAPI_PARAM_OPTIONAL))
+		return true;
+
+	if (!path) {
+		pr_warn("Parameter %s: NULL path not allowed\n", param_spec->name);
+		return false;
+	}
+
+	/* Check if the path is accessible */
+	if (!access_ok(path, 1)) {
+		pr_warn("Parameter %s: path pointer %p not accessible\n",
+			param_spec->name, path);
+		return false;
+	}
+
+	/* Use strnlen_user to get the length and validate accessibility */
+	len = strnlen_user(path, PATH_MAX + 1);
+	if (len == 0) {
+		pr_warn("Parameter %s: invalid path pointer %p\n",
+			param_spec->name, path);
+		return false;
+	}
+
+	/* Check path length limit */
+	if (len > PATH_MAX) {
+		pr_warn("Parameter %s: path too long (exceeds PATH_MAX)\n",
+			param_spec->name);
+		return false;
+	}
+
+	return true;
+}
+
+/**
+ * kapi_validate_param - Validate a parameter against its specification
+ * @param_spec: Parameter specification
+ * @value: Parameter value to validate
+ *
+ * Return: true if valid, false otherwise
+ */
+bool kapi_validate_param(const struct kapi_param_spec *param_spec, s64 value)
+{
+	int i;
+
+	/* Special handling for file descriptor type */
+	if (param_spec->type == KAPI_TYPE_FD) {
+		if (!kapi_validate_fd((int)value)) {
+			pr_warn("Parameter %s: invalid file descriptor %lld\n",
+				param_spec->name, value);
+			return false;
+		}
+		/* Continue with additional constraint checks if needed */
+	}
+
+	/* Special handling for user pointer type */
+	if (param_spec->type == KAPI_TYPE_USER_PTR) {
+		const void __user *ptr = (const void __user *)value;
+		bool write = (param_spec->flags & KAPI_PARAM_OUT) ||
+			     (param_spec->flags & KAPI_PARAM_INOUT);
+
+		/* NULL is allowed for optional parameters */
+		if (!ptr && (param_spec->flags & KAPI_PARAM_OPTIONAL))
+			return true;
+
+		if (!kapi_validate_user_ptr(ptr, param_spec->size, write)) {
+			pr_warn("Parameter %s: invalid user pointer %p (size: %zu, %s)\n",
+				param_spec->name, ptr, param_spec->size,
+				write ? "write" : "read");
+			return false;
+		}
+		/* Continue with additional constraint checks if needed */
+	}
+
+	/* Special handling for path type */
+	if (param_spec->type == KAPI_TYPE_PATH) {
+		const char __user *path = (const char __user *)value;
+
+		if (!kapi_validate_path(path, param_spec)) {
+			return false;
+		}
+		/* Continue with additional constraint checks if needed */
+	}
+
+	switch (param_spec->constraint_type) {
+	case KAPI_CONSTRAINT_NONE:
+		return true;
+
+	case KAPI_CONSTRAINT_RANGE:
+		if (value < param_spec->min_value || value > param_spec->max_value) {
+			pr_warn("Parameter %s value %lld out of range [%lld, %lld]\n",
+				param_spec->name, value,
+				param_spec->min_value, param_spec->max_value);
+			return false;
+		}
+		return true;
+
+	case KAPI_CONSTRAINT_MASK:
+		if (value & ~param_spec->valid_mask) {
+			pr_warn("Parameter %s value 0x%llx contains invalid bits (valid mask: 0x%llx)\n",
+				param_spec->name, value, param_spec->valid_mask);
+			return false;
+		}
+		return true;
+
+	case KAPI_CONSTRAINT_ENUM:
+		if (!param_spec->enum_values || param_spec->enum_count == 0)
+			return true;
+
+		for (i = 0; i < param_spec->enum_count; i++) {
+			if (value == param_spec->enum_values[i])
+				return true;
+		}
+		pr_warn("Parameter %s value %lld not in valid enumeration\n",
+			param_spec->name, value);
+		return false;
+
+	case KAPI_CONSTRAINT_ALIGNMENT:
+		if (param_spec->alignment == 0) {
+			pr_warn("Parameter %s: alignment constraint specified but alignment is 0\n",
+				param_spec->name);
+			return false;
+		}
+		if (value & (param_spec->alignment - 1)) {
+			pr_warn("Parameter %s value 0x%llx not aligned to %zu boundary\n",
+				param_spec->name, value, param_spec->alignment);
+			return false;
+		}
+		return true;
+
+	case KAPI_CONSTRAINT_POWER_OF_TWO:
+		if (value == 0 || (value & (value - 1))) {
+			pr_warn("Parameter %s value %lld is not a power of two\n",
+				param_spec->name, value);
+			return false;
+		}
+		return true;
+
+	case KAPI_CONSTRAINT_PAGE_ALIGNED:
+		if (value & (PAGE_SIZE - 1)) {
+			pr_warn("Parameter %s value 0x%llx not page-aligned (PAGE_SIZE=%ld)\n",
+				param_spec->name, value, PAGE_SIZE);
+			return false;
+		}
+		return true;
+
+	case KAPI_CONSTRAINT_NONZERO:
+		if (value == 0) {
+			pr_warn("Parameter %s must be non-zero\n", param_spec->name);
+			return false;
+		}
+		return true;
+
+	case KAPI_CONSTRAINT_CUSTOM:
+		if (param_spec->validate)
+			return param_spec->validate(value);
+		return true;
+
+	default:
+		return true;
+	}
+}
+EXPORT_SYMBOL_GPL(kapi_validate_param);
+
+/**
+ * kapi_validate_param_with_context - Validate parameter with access to all params
+ * @param_spec: Parameter specification
+ * @value: Parameter value to validate
+ * @all_params: Array of all parameter values
+ * @param_count: Number of parameters
+ *
+ * Return: true if valid, false otherwise
+ */
+bool kapi_validate_param_with_context(const struct kapi_param_spec *param_spec,
+				       s64 value, const s64 *all_params, int param_count)
+{
+	/* Special handling for user pointer type with dynamic sizing */
+	if (param_spec->type == KAPI_TYPE_USER_PTR) {
+		const void __user *ptr = (const void __user *)value;
+
+		/* NULL is allowed for optional parameters */
+		if (!ptr && (param_spec->flags & KAPI_PARAM_OPTIONAL))
+			return true;
+
+		if (!kapi_validate_user_ptr_with_params(param_spec, ptr, all_params, param_count)) {
+			pr_warn("Parameter %s: invalid user pointer %p\n",
+				param_spec->name, ptr);
+			return false;
+		}
+		/* Continue with additional constraint checks if needed */
+	}
+
+	/* For other types, fall back to regular validation */
+	return kapi_validate_param(param_spec, value);
+}
+EXPORT_SYMBOL_GPL(kapi_validate_param_with_context);
+
+/**
+ * kapi_validate_syscall_param - Validate syscall parameter with enforcement
+ * @spec: API specification
+ * @param_idx: Parameter index
+ * @value: Parameter value
+ *
+ * Return: -EINVAL if invalid, 0 if valid
+ */
+int kapi_validate_syscall_param(const struct kernel_api_spec *spec,
+				 int param_idx, s64 value)
+{
+	const struct kapi_param_spec *param_spec;
+
+	if (!spec || param_idx >= spec->param_count)
+		return 0;
+
+	param_spec = &spec->params[param_idx];
+
+	if (!kapi_validate_param(param_spec, value)) {
+		if (strncmp(spec->name, "sys_", 4) == 0) {
+			/* For syscalls, we can return EINVAL to userspace */
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kapi_validate_syscall_param);
+
+/**
+ * kapi_validate_syscall_params - Validate all syscall parameters together
+ * @spec: API specification
+ * @params: Array of parameter values
+ * @param_count: Number of parameters
+ *
+ * Return: -EINVAL if any parameter is invalid, 0 if all valid
+ */
+int kapi_validate_syscall_params(const struct kernel_api_spec *spec,
+				 const s64 *params, int param_count)
+{
+	int i;
+
+	if (!spec || !params)
+		return 0;
+
+	/* Validate that we have the expected number of parameters */
+	if (param_count != spec->param_count) {
+		pr_warn("API %s: parameter count mismatch (expected %u, got %d)\n",
+			spec->name, spec->param_count, param_count);
+		return -EINVAL;
+	}
+
+	/* Validate each parameter with context */
+	for (i = 0; i < spec->param_count && i < KAPI_MAX_PARAMS; i++) {
+		const struct kapi_param_spec *param_spec = &spec->params[i];
+
+		if (!kapi_validate_param_with_context(param_spec, params[i], params, param_count)) {
+			if (strncmp(spec->name, "sys_", 4) == 0) {
+				/* For syscalls, we can return EINVAL to userspace */
+				return -EINVAL;
+			}
+		}
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kapi_validate_syscall_params);
+
+/**
+ * kapi_check_return_success - Check if return value indicates success
+ * @return_spec: Return specification
+ * @retval: Return value to check
+ *
+ * Returns true if the return value indicates success according to the spec.
+ */
+bool kapi_check_return_success(const struct kapi_return_spec *return_spec, s64 retval)
+{
+	u32 i;
+
+	if (!return_spec)
+		return true; /* No spec means we can't validate */
+
+	switch (return_spec->check_type) {
+	case KAPI_RETURN_EXACT:
+		return retval == return_spec->success_value;
+
+	case KAPI_RETURN_RANGE:
+		return retval >= return_spec->success_min &&
+		       retval <= return_spec->success_max;
+
+	case KAPI_RETURN_ERROR_CHECK:
+		/* Success if NOT in error list */
+		if (return_spec->error_values) {
+			for (i = 0; i < return_spec->error_count; i++) {
+				if (retval == return_spec->error_values[i])
+					return false; /* Found in error list */
+			}
+		}
+		return true; /* Not in error list = success */
+
+	case KAPI_RETURN_FD:
+		/* File descriptors: >= 0 is success, < 0 is error */
+		return retval >= 0;
+
+	case KAPI_RETURN_CUSTOM:
+		if (return_spec->is_success)
+			return return_spec->is_success(retval);
+		fallthrough;
+
+	default:
+		return true; /* Unknown check type, assume success */
+	}
+}
+EXPORT_SYMBOL_GPL(kapi_check_return_success);
+
+/**
+ * kapi_validate_return_value - Validate that return value matches spec
+ * @spec: API specification
+ * @retval: Return value to validate
+ *
+ * Return: true if return value is valid according to spec, false otherwise.
+ *
+ * This function checks:
+ * 1. If the value indicates success, it must match the success criteria
+ * 2. If the value indicates error, it must be one of the specified error codes
+ */
+bool kapi_validate_return_value(const struct kernel_api_spec *spec, s64 retval)
+{
+	int i;
+	bool is_success;
+
+	if (!spec)
+		return true; /* No spec means we can't validate */
+
+	/* First check if this is a success return */
+	is_success = kapi_check_return_success(&spec->return_spec, retval);
+
+	if (is_success) {
+		/* Success case - already validated by kapi_check_return_success */
+		return true;
+	}
+
+	/* Special validation for file descriptor returns */
+	if (spec->return_spec.check_type == KAPI_RETURN_FD && is_success) {
+		/* For successful FD returns, validate it's a valid FD */
+		if (!kapi_validate_fd((int)retval)) {
+			pr_warn("API %s returned invalid file descriptor %lld\n",
+				spec->name, retval);
+			return false;
+		}
+		return true;
+	}
+
+	/* Error case - check if it's one of the specified errors */
+	if (spec->error_count == 0) {
+		/* No errors specified, so any error is potentially valid */
+		pr_debug("API %s returned unspecified error %lld\n",
+			 spec->name, retval);
+		return true;
+	}
+
+	/* Check if the error is in our list of specified errors */
+	for (i = 0; i < spec->error_count && i < KAPI_MAX_ERRORS; i++) {
+		if (retval == spec->errors[i].error_code)
+			return true;
+	}
+
+	/* Error not in spec */
+	pr_warn("API %s returned unspecified error code %lld. Valid errors are:\n",
+		spec->name, retval);
+	for (i = 0; i < spec->error_count && i < KAPI_MAX_ERRORS; i++) {
+		pr_warn("  %s (%d): %s\n",
+			spec->errors[i].name,
+			spec->errors[i].error_code,
+			spec->errors[i].condition);
+	}
+
+	return false;
+}
+EXPORT_SYMBOL_GPL(kapi_validate_return_value);
+
+/**
+ * kapi_validate_syscall_return - Validate syscall return value with enforcement
+ * @spec: API specification
+ * @retval: Return value
+ *
+ * Return: 0 if valid, -EINVAL if the return value doesn't match spec
+ *
+ * For syscalls, this can help detect kernel bugs where unspecified error
+ * codes are returned to userspace.
+ */
+int kapi_validate_syscall_return(const struct kernel_api_spec *spec, s64 retval)
+{
+	if (!spec)
+		return 0;
+
+	if (!kapi_validate_return_value(spec, retval)) {
+		/* Log the violation but don't change the return value */
+		WARN_ONCE(1, "Syscall %s returned unspecified value %lld\n",
+			  spec->name, retval);
+		/* Could return -EINVAL here to enforce, but that might break userspace */
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kapi_validate_syscall_return);
+
+/**
+ * kapi_check_context - Check if current context matches API requirements
+ * @spec: API specification to check against
+ */
+void kapi_check_context(const struct kernel_api_spec *spec)
+{
+	u32 ctx = spec->context_flags;
+	bool valid = false;
+
+	if (!ctx)
+		return;
+
+	/* Check if we're in an allowed context */
+	if ((ctx & KAPI_CTX_PROCESS) && !in_interrupt())
+		valid = true;
+
+	if ((ctx & KAPI_CTX_SOFTIRQ) && in_softirq())
+		valid = true;
+
+	if ((ctx & KAPI_CTX_HARDIRQ) && in_hardirq())
+		valid = true;
+
+	if ((ctx & KAPI_CTX_NMI) && in_nmi())
+		valid = true;
+
+	if (!valid) {
+		WARN_ONCE(1, "API %s called from invalid context\n", spec->name);
+	}
+
+	/* Check specific requirements */
+	if ((ctx & KAPI_CTX_ATOMIC) && preemptible()) {
+		WARN_ONCE(1, "API %s requires atomic context\n", spec->name);
+	}
+
+	if ((ctx & KAPI_CTX_SLEEPABLE) && !preemptible()) {
+		WARN_ONCE(1, "API %s requires sleepable context\n", spec->name);
+	}
+}
+EXPORT_SYMBOL_GPL(kapi_check_context);
+
+#endif /* CONFIG_KAPI_RUNTIME_CHECKS */
+
+/* DebugFS interface */
+#ifdef CONFIG_DEBUG_FS
+
+static struct dentry *kapi_debugfs_root;
+
+static int kapi_spec_show(struct seq_file *s, void *v)
+{
+	struct kernel_api_spec *spec = s->private;
+	char *buf;
+	int ret;
+
+	buf = kmalloc(PAGE_SIZE * 4, GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+
+	ret = kapi_export_json(spec, buf, PAGE_SIZE * 4);
+	if (ret > 0)
+		seq_printf(s, "%s", buf);
+
+	kfree(buf);
+	return 0;
+}
+
+static int kapi_spec_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, kapi_spec_show, inode->i_private);
+}
+
+static const struct file_operations kapi_spec_fops = {
+	.open = kapi_spec_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
+static int kapi_list_show(struct seq_file *s, void *v)
+{
+	struct kernel_api_spec *spec;
+	struct dynamic_api_spec *dyn_spec;
+
+	seq_printf(s, "Kernel API Specifications:\n\n");
+
+	/* List static specifications */
+	seq_printf(s, "Static APIs:\n");
+	for (spec = __start_kapi_specs; spec < __stop_kapi_specs; spec++) {
+		seq_printf(s, "  %s (v%u): %s\n",
+			spec->name, spec->version, spec->description);
+	}
+
+	/* List dynamic specifications */
+	seq_printf(s, "\nDynamic APIs:\n");
+	mutex_lock(&api_spec_mutex);
+	list_for_each_entry(dyn_spec, &dynamic_api_specs, list) {
+		spec = dyn_spec->spec;
+		seq_printf(s, "  %s (v%u): %s\n",
+			spec->name, spec->version, spec->description);
+	}
+	mutex_unlock(&api_spec_mutex);
+
+	return 0;
+}
+
+static int kapi_list_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, kapi_list_show, NULL);
+}
+
+static const struct file_operations kapi_list_fops = {
+	.open = kapi_list_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
+};
+
+static int __init kapi_debugfs_init(void)
+{
+	struct kernel_api_spec *spec;
+	struct dentry *spec_dir;
+
+	kapi_debugfs_root = debugfs_create_dir("kapi", NULL);
+	if (!kapi_debugfs_root)
+		return -ENOMEM;
+
+	/* Create list file */
+	debugfs_create_file("list", 0444, kapi_debugfs_root, NULL,
+			    &kapi_list_fops);
+
+	/* Create directory for specifications */
+	spec_dir = debugfs_create_dir("specs", kapi_debugfs_root);
+
+	/* Create files for each static specification */
+	for (spec = __start_kapi_specs; spec < __stop_kapi_specs; spec++) {
+		debugfs_create_file(spec->name, 0444, spec_dir, spec,
+				    &kapi_spec_fops);
+	}
+
+	return 0;
+}
+
+late_initcall(kapi_debugfs_init);
+
+#endif /* CONFIG_DEBUG_FS */
\ No newline at end of file
-- 
2.50.1


^ permalink raw reply related

* [RFC PATCH v4 0/7] Kernel API Specification Framework
From: Sasha Levin @ 2025-08-25 18:14 UTC (permalink / raw)
  To: linux-api, linux-doc, linux-kernel, tools; +Cc: Sasha Levin

This patch series introduces a framework for formally specifying kernel APIs,
addressing the long-standing challenge of maintaining stable interfaces between
the kernel and user-space programs. As outlined in previous discussions about
kernel ABI stability, the lack of machine-readable API specifications has led
to inadvertent breakages and inconsistent validation across system calls and
IOCTLs.

Changes since RFC v3:

1. Improved kerneldoc format for better readability.

2. Added kerneldoc parsing support in the kapi tool.

3. Use magic markers in the binary output to simplify extraction.

Sasha Levin (7):
  kernel/api: introduce kernel API specification framework
  kernel/api: enable kerneldoc-based API specifications
  kernel/api: add debugfs interface for kernel API specifications
  kernel/sched: add specs for sys_sched_setattr()
  mm/mlock: add API specification for mlock
  fs/exec: add API specification for execveat
  tools/kapi: Add kernel API specification extraction tool

 .gitignore                                    |    1 +
 Documentation/admin-guide/kernel-api-spec.rst |  699 ++++++++
 MAINTAINERS                                   |    9 +
 arch/um/kernel/dyn.lds.S                      |    3 +
 arch/um/kernel/uml.lds.S                      |    3 +
 arch/x86/kernel/vmlinux.lds.S                 |    3 +
 fs/exec.c                                     |  594 +++++++
 include/asm-generic/vmlinux.lds.h             |   20 +
 include/linux/kernel_api_spec.h               | 1559 +++++++++++++++++
 include/linux/syscall_api_spec.h              |  125 ++
 include/linux/syscalls.h                      |   38 +
 init/Kconfig                                  |    2 +
 kernel/Makefile                               |    1 +
 kernel/api/Kconfig                            |   55 +
 kernel/api/Makefile                           |   30 +
 kernel/api/kapi_debugfs.c                     |  334 ++++
 kernel/api/kernel_api_spec.c                  | 1155 ++++++++++++
 kernel/sched/syscalls.c                       |  315 +++-
 mm/mlock.c                                    |  134 ++
 scripts/Makefile.build                        |   28 +
 scripts/generate_api_specs.sh                 |   68 +
 scripts/kernel-doc.py                         |    5 +
 scripts/lib/kdoc/kdoc_apispec.py              |  714 ++++++++
 scripts/lib/kdoc/kdoc_output.py               |    9 +-
 scripts/lib/kdoc/kdoc_parser.py               |   50 +-
 tools/kapi/.gitignore                         |    4 +
 tools/kapi/Cargo.toml                         |   19 +
 tools/kapi/src/extractor/debugfs.rs           |  442 +++++
 tools/kapi/src/extractor/kerneldoc_parser.rs  |  694 ++++++++
 tools/kapi/src/extractor/mod.rs               |  461 +++++
 tools/kapi/src/extractor/source_parser.rs     |  213 +++
 .../src/extractor/vmlinux/binary_utils.rs     |  180 ++
 .../src/extractor/vmlinux/magic_finder.rs     |  102 ++
 tools/kapi/src/extractor/vmlinux/mod.rs       |  869 +++++++++
 tools/kapi/src/formatter/json.rs              |  468 +++++
 tools/kapi/src/formatter/mod.rs               |  145 ++
 tools/kapi/src/formatter/plain.rs             |  558 ++++++
 tools/kapi/src/formatter/rst.rs               |  621 +++++++
 tools/kapi/src/formatter/shall.rs             |  891 ++++++++++
 tools/kapi/src/main.rs                        |  116 ++
 40 files changed, 11732 insertions(+), 5 deletions(-)
 create mode 100644 Documentation/admin-guide/kernel-api-spec.rst
 create mode 100644 include/linux/kernel_api_spec.h
 create mode 100644 include/linux/syscall_api_spec.h
 create mode 100644 kernel/api/Kconfig
 create mode 100644 kernel/api/Makefile
 create mode 100644 kernel/api/kapi_debugfs.c
 create mode 100644 kernel/api/kernel_api_spec.c
 create mode 100755 scripts/generate_api_specs.sh
 create mode 100644 scripts/lib/kdoc/kdoc_apispec.py
 create mode 100644 tools/kapi/.gitignore
 create mode 100644 tools/kapi/Cargo.toml
 create mode 100644 tools/kapi/src/extractor/debugfs.rs
 create mode 100644 tools/kapi/src/extractor/kerneldoc_parser.rs
 create mode 100644 tools/kapi/src/extractor/mod.rs
 create mode 100644 tools/kapi/src/extractor/source_parser.rs
 create mode 100644 tools/kapi/src/extractor/vmlinux/binary_utils.rs
 create mode 100644 tools/kapi/src/extractor/vmlinux/magic_finder.rs
 create mode 100644 tools/kapi/src/extractor/vmlinux/mod.rs
 create mode 100644 tools/kapi/src/formatter/json.rs
 create mode 100644 tools/kapi/src/formatter/mod.rs
 create mode 100644 tools/kapi/src/formatter/plain.rs
 create mode 100644 tools/kapi/src/formatter/rst.rs
 create mode 100644 tools/kapi/src/formatter/shall.rs
 create mode 100644 tools/kapi/src/main.rs

-- 
2.50.1


^ permalink raw reply

* Re: [RFC PATCH v1 1/2] fs: Add O_DENY_WRITE
From: Jeff Xu @ 2025-08-25 18:10 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Mickaël Salaün, Jann Horn, Al Viro, Christian Brauner,
	Kees Cook, Paul Moore, Serge Hallyn, Andy Lutomirski,
	Arnd Bergmann, Christian Heimes, Dmitry Vyukov, Elliott Hughes,
	Fan Wu, Florian Weimer, Jonathan Corbet, Jordan R Abrahams,
	Lakshmi Ramasubramanian, Luca Boccassi, Matt Bobrowski,
	Miklos Szeredi, Mimi Zohar, Nicolas Bouchinet, Robert Waite,
	Roberto Sassu, Scott Shell, Steve Dower, Steve Grubb,
	kernel-hardening, linux-api, linux-fsdevel, linux-integrity,
	linux-kernel, linux-security-module, Jeff Xu
In-Reply-To: <CALCETrX+OpkRSvOZhaWiqOsAPr-hRb+kY5=Hh5LU3H+1xPb3qg@mail.gmail.com>

On Mon, Aug 25, 2025 at 9:43 AM Andy Lutomirski <luto@amacapital.net> wrote:
>
> On Mon, Aug 25, 2025 at 2:31 AM Mickaël Salaün <mic@digikod.net> wrote:
> >
> > On Sun, Aug 24, 2025 at 11:04:03AM -0700, Andy Lutomirski wrote:
> > > On Sun, Aug 24, 2025 at 4:03 AM Mickaël Salaün <mic@digikod.net> wrote:
> > > >
> > > > On Fri, Aug 22, 2025 at 09:45:32PM +0200, Jann Horn wrote:
> > > > > On Fri, Aug 22, 2025 at 7:08 PM Mickaël Salaün <mic@digikod.net> wrote:
> > > > > > Add a new O_DENY_WRITE flag usable at open time and on opened file (e.g.
> > > > > > passed file descriptors).  This changes the state of the opened file by
> > > > > > making it read-only until it is closed.  The main use case is for script
> > > > > > interpreters to get the guarantee that script' content cannot be altered
> > > > > > while being read and interpreted.  This is useful for generic distros
> > > > > > that may not have a write-xor-execute policy.  See commit a5874fde3c08
> > > > > > ("exec: Add a new AT_EXECVE_CHECK flag to execveat(2)")
> > > > > >
> > > > > > Both execve(2) and the IOCTL to enable fsverity can already set this
> > > > > > property on files with deny_write_access().  This new O_DENY_WRITE make
> > > > >
> > > > > The kernel actually tried to get rid of this behavior on execve() in
> > > > > commit 2a010c41285345da60cece35575b4e0af7e7bf44.; but sadly that had
> > > > > to be reverted in commit 3b832035387ff508fdcf0fba66701afc78f79e3d
> > > > > because it broke userspace assumptions.
> > > >
> > > > Oh, good to know.
> > > >
> > > > >
> > > > > > it widely available.  This is similar to what other OSs may provide
> > > > > > e.g., opening a file with only FILE_SHARE_READ on Windows.
> > > > >
> > > > > We used to have the analogous mmap() flag MAP_DENYWRITE, and that was
> > > > > removed for security reasons; as
> > > > > https://man7.org/linux/man-pages/man2/mmap.2.html says:
> > > > >
> > > > > |        MAP_DENYWRITE
> > > > > |               This flag is ignored.  (Long ago—Linux 2.0 and earlier—it
> > > > > |               signaled that attempts to write to the underlying file
> > > > > |               should fail with ETXTBSY.  But this was a source of denial-
> > > > > |               of-service attacks.)"
> > > > >
> > > > > It seems to me that the same issue applies to your patch - it would
> > > > > allow unprivileged processes to essentially lock files such that other
> > > > > processes can't write to them anymore. This might allow unprivileged
> > > > > users to prevent root from updating config files or stuff like that if
> > > > > they're updated in-place.
> > > >
> > > > Yes, I agree, but since it is the case for executed files I though it
> > > > was worth starting a discussion on this topic.  This new flag could be
> > > > restricted to executable files, but we should avoid system-wide locks
> > > > like this.  I'm not sure how Windows handle these issues though.
> > > >
> > > > Anyway, we should rely on the access control policy to control write and
> > > > execute access in a consistent way (e.g. write-xor-execute).  Thanks for
> > > > the references and the background!
> > >
> > > I'm confused.  I understand that there are many contexts in which one
> > > would want to prevent execution of unapproved content, which might
> > > include preventing a given process from modifying some code and then
> > > executing it.
> > >
> > > I don't understand what these deny-write features have to do with it.
> > > These features merely prevent someone from modifying code *that is
> > > currently in use*, which is not at all the same thing as preventing
> > > modifying code that might get executed -- one can often modify
> > > contents *before* executing those contents.
> >
> > The order of checks would be:
> > 1. open script with O_DENY_WRITE
> > 2. check executability with AT_EXECVE_CHECK
> > 3. read the content and interpret it
>
> Hmm.  Common LSM configurations should be able to handle this without
> deny write, I think.  If you don't want a program to be able to make
> their own scripts, then don't allow AT_EXECVE_CHECK to succeed on a
> script that the program can write.
>
Yes, Common LSM could handle this, however, due to historic and app
backward compability reason, sometimes it is impossible to enforce
that kind of policy in practice, therefore as an alternative, a
machinism such as AT_EXECVE_CHECK is really useful.

> Keep in mind that trying to lock this down too hard is pointless for
> users who are allowed to to ptrace-write to their own processes.  Or
> for users who can do JIT, or for users who can run a REPL, etc.
>
The ptrace-write and /proc/pid/mem writing are on my radar, at least
for ChomeOS and Android.
AT_EXECVE_CHECK is orthogonal to those IMO, I hope eventually all
those paths will be hardened.

Thanks and regards,
-Jeff

^ permalink raw reply

* Re: [RFC PATCH v1 1/2] fs: Add O_DENY_WRITE
From: Jeff Xu @ 2025-08-25 17:57 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Andy Lutomirski, Jann Horn, Al Viro, Christian Brauner, Kees Cook,
	Paul Moore, Serge Hallyn, Andy Lutomirski, Arnd Bergmann,
	Christian Heimes, Dmitry Vyukov, Elliott Hughes, Fan Wu,
	Florian Weimer, Jonathan Corbet, Jordan R Abrahams,
	Lakshmi Ramasubramanian, Luca Boccassi, Matt Bobrowski,
	Miklos Szeredi, Mimi Zohar, Nicolas Bouchinet, Robert Waite,
	Roberto Sassu, Scott Shell, Steve Dower, Steve Grubb,
	kernel-hardening, linux-api, linux-fsdevel, linux-integrity,
	linux-kernel, linux-security-module, Jeff Xu
In-Reply-To: <20250825.mahNeel0dohz@digikod.net>

Hi Mickaël

On Mon, Aug 25, 2025 at 2:31 AM Mickaël Salaün <mic@digikod.net> wrote:
>
> On Sun, Aug 24, 2025 at 11:04:03AM -0700, Andy Lutomirski wrote:
> > On Sun, Aug 24, 2025 at 4:03 AM Mickaël Salaün <mic@digikod.net> wrote:
> > >
> > > On Fri, Aug 22, 2025 at 09:45:32PM +0200, Jann Horn wrote:
> > > > On Fri, Aug 22, 2025 at 7:08 PM Mickaël Salaün <mic@digikod.net> wrote:
> > > > > Add a new O_DENY_WRITE flag usable at open time and on opened file (e.g.
> > > > > passed file descriptors).  This changes the state of the opened file by
> > > > > making it read-only until it is closed.  The main use case is for script
> > > > > interpreters to get the guarantee that script' content cannot be altered
> > > > > while being read and interpreted.  This is useful for generic distros
> > > > > that may not have a write-xor-execute policy.  See commit a5874fde3c08
> > > > > ("exec: Add a new AT_EXECVE_CHECK flag to execveat(2)")
> > > > >
> > > > > Both execve(2) and the IOCTL to enable fsverity can already set this
> > > > > property on files with deny_write_access().  This new O_DENY_WRITE make
> > > >
> > > > The kernel actually tried to get rid of this behavior on execve() in
> > > > commit 2a010c41285345da60cece35575b4e0af7e7bf44.; but sadly that had
> > > > to be reverted in commit 3b832035387ff508fdcf0fba66701afc78f79e3d
> > > > because it broke userspace assumptions.
> > >
> > > Oh, good to know.
> > >
> > > >
> > > > > it widely available.  This is similar to what other OSs may provide
> > > > > e.g., opening a file with only FILE_SHARE_READ on Windows.
> > > >
> > > > We used to have the analogous mmap() flag MAP_DENYWRITE, and that was
> > > > removed for security reasons; as
> > > > https://man7.org/linux/man-pages/man2/mmap.2.html says:
> > > >
> > > > |        MAP_DENYWRITE
> > > > |               This flag is ignored.  (Long ago—Linux 2.0 and earlier—it
> > > > |               signaled that attempts to write to the underlying file
> > > > |               should fail with ETXTBSY.  But this was a source of denial-
> > > > |               of-service attacks.)"
> > > >
> > > > It seems to me that the same issue applies to your patch - it would
> > > > allow unprivileged processes to essentially lock files such that other
> > > > processes can't write to them anymore. This might allow unprivileged
> > > > users to prevent root from updating config files or stuff like that if
> > > > they're updated in-place.
> > >
> > > Yes, I agree, but since it is the case for executed files I though it
> > > was worth starting a discussion on this topic.  This new flag could be
> > > restricted to executable files, but we should avoid system-wide locks
> > > like this.  I'm not sure how Windows handle these issues though.
> > >
> > > Anyway, we should rely on the access control policy to control write and
> > > execute access in a consistent way (e.g. write-xor-execute).  Thanks for
> > > the references and the background!
> >
> > I'm confused.  I understand that there are many contexts in which one
> > would want to prevent execution of unapproved content, which might
> > include preventing a given process from modifying some code and then
> > executing it.
> >
> > I don't understand what these deny-write features have to do with it.
> > These features merely prevent someone from modifying code *that is
> > currently in use*, which is not at all the same thing as preventing
> > modifying code that might get executed -- one can often modify
> > contents *before* executing those contents.
>
> The order of checks would be:
> 1. open script with O_DENY_WRITE
> 2. check executability with AT_EXECVE_CHECK
> 3. read the content and interpret it
>
I'm not sure about the O_DENY_WRITE approach, but the problem is worth solving.

AT_EXECVE_CHECK is not just for scripting languages. It could also
work with bytecodes like Java, for example. If we let the Java runtime
call AT_EXECVE_CHECK before loading the bytecode, the LSM could
develop a policy based on that.

> The deny-write feature was to guarantee that there is no race condition
> between step 2 and 3.  All these checks are supposed to be done by a
> trusted interpreter (which is allowed to be executed).  The
> AT_EXECVE_CHECK call enables the caller to know if the kernel (and
> associated security policies) allowed the *current* content of the file
> to be executed.  Whatever happen before or after that (wrt.
> O_DENY_WRITE) should be covered by the security policy.
>
Agree, the race problem needs to be solved in order for AT_EXECVE_CHECK.

Enforcing non-write for the path that stores scripts or bytecodes can
be challenging due to historical or backward compatibility reasons.
Since AT_EXECVE_CHECK provides a mechanism to check the file right
before it is used, we can assume it will detect any "problem" that
happened before that, (e.g. the file was overwritten). However, that
also imposes two additional requirements:
1> the file doesn't change while AT_EXECVE_CHECK does the check.
2>The file content kept by the process remains unchanged after passing
the AT_EXECVE_CHECK.

I imagine, the complete solution for AT_EXECVE_CHECK would include
those two grantees.

Thanks
-Jeff

^ permalink raw reply

* Re: [PATCH] uapi/fcntl: conditionally define AT_RENAME* macros
From: Randy Dunlap @ 2025-08-25 17:52 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: linux-kernel, Amir Goldstein, Jeff Layton, Chuck Lever,
	Alexander Aring, Josef Bacik, Aleksa Sarai, Jan Kara,
	Christian Brauner, linux-fsdevel, linux-api
In-Reply-To: <aKxfGix_o4glz8-Z@casper.infradead.org>



On 8/25/25 6:03 AM, Matthew Wilcox wrote:
> On Sun, Aug 24, 2025 at 04:54:50PM -0700, Randy Dunlap wrote:
>> In file included from ../samples/vfs/test-statx.c:23:
>> usr/include/linux/fcntl.h:159:9: warning: ‘AT_RENAME_NOREPLACE’ redefined
>>   159 | #define AT_RENAME_NOREPLACE     0x0001
>> In file included from ../samples/vfs/test-statx.c:13:
>> /usr/include/stdio.h:171:10: note: this is the location of the previous definition
>>   171 | # define AT_RENAME_NOREPLACE RENAME_NOREPLACE
> 
> Oh dear.  This is going to be libc-version-dependent.
> 

I am not surprised at that.

> $ grep -r AT_RENAME_NOREPLACE /usr/include
> /usr/include/linux/fcntl.h:#define AT_RENAME_NOREPLACE	0x0001
> 
> It's not in stdio.h at all.  This is with libc6 2.41-10

It was added 2025-04-22:
2025-04-22  Joseph Myers  <josmyers@redhat.com>

	COMMIT: cf9241107d12e79073ddb03bab9de115e5e0e856
	Add AT_* constants from Linux 6.12

$ grep -r AT_RENAME_NOREPLACE /usr/include
/usr/include/stdio.h:# define AT_RENAME_NOREPLACE RENAME_NOREPLACE
/usr/include/linux/fcntl.h:#define AT_RENAME_NOREPLACE	0x0001

I have libc 2.42-1.1 (openSUSE).

thanks.
-- 
~Randy


^ permalink raw reply

* Re: [RFC PATCH v1 1/2] fs: Add O_DENY_WRITE
From: Andy Lutomirski @ 2025-08-25 16:43 UTC (permalink / raw)
  To: Mickaël Salaün
  Cc: Jann Horn, Al Viro, Christian Brauner, Kees Cook, Paul Moore,
	Serge Hallyn, Andy Lutomirski, Arnd Bergmann, Christian Heimes,
	Dmitry Vyukov, Elliott Hughes, Fan Wu, Florian Weimer, Jeff Xu,
	Jonathan Corbet, Jordan R Abrahams, Lakshmi Ramasubramanian,
	Luca Boccassi, Matt Bobrowski, Miklos Szeredi, Mimi Zohar,
	Nicolas Bouchinet, Robert Waite, Roberto Sassu, Scott Shell,
	Steve Dower, Steve Grubb, kernel-hardening, linux-api,
	linux-fsdevel, linux-integrity, linux-kernel,
	linux-security-module, Jeff Xu
In-Reply-To: <20250825.mahNeel0dohz@digikod.net>

On Mon, Aug 25, 2025 at 2:31 AM Mickaël Salaün <mic@digikod.net> wrote:
>
> On Sun, Aug 24, 2025 at 11:04:03AM -0700, Andy Lutomirski wrote:
> > On Sun, Aug 24, 2025 at 4:03 AM Mickaël Salaün <mic@digikod.net> wrote:
> > >
> > > On Fri, Aug 22, 2025 at 09:45:32PM +0200, Jann Horn wrote:
> > > > On Fri, Aug 22, 2025 at 7:08 PM Mickaël Salaün <mic@digikod.net> wrote:
> > > > > Add a new O_DENY_WRITE flag usable at open time and on opened file (e.g.
> > > > > passed file descriptors).  This changes the state of the opened file by
> > > > > making it read-only until it is closed.  The main use case is for script
> > > > > interpreters to get the guarantee that script' content cannot be altered
> > > > > while being read and interpreted.  This is useful for generic distros
> > > > > that may not have a write-xor-execute policy.  See commit a5874fde3c08
> > > > > ("exec: Add a new AT_EXECVE_CHECK flag to execveat(2)")
> > > > >
> > > > > Both execve(2) and the IOCTL to enable fsverity can already set this
> > > > > property on files with deny_write_access().  This new O_DENY_WRITE make
> > > >
> > > > The kernel actually tried to get rid of this behavior on execve() in
> > > > commit 2a010c41285345da60cece35575b4e0af7e7bf44.; but sadly that had
> > > > to be reverted in commit 3b832035387ff508fdcf0fba66701afc78f79e3d
> > > > because it broke userspace assumptions.
> > >
> > > Oh, good to know.
> > >
> > > >
> > > > > it widely available.  This is similar to what other OSs may provide
> > > > > e.g., opening a file with only FILE_SHARE_READ on Windows.
> > > >
> > > > We used to have the analogous mmap() flag MAP_DENYWRITE, and that was
> > > > removed for security reasons; as
> > > > https://man7.org/linux/man-pages/man2/mmap.2.html says:
> > > >
> > > > |        MAP_DENYWRITE
> > > > |               This flag is ignored.  (Long ago—Linux 2.0 and earlier—it
> > > > |               signaled that attempts to write to the underlying file
> > > > |               should fail with ETXTBSY.  But this was a source of denial-
> > > > |               of-service attacks.)"
> > > >
> > > > It seems to me that the same issue applies to your patch - it would
> > > > allow unprivileged processes to essentially lock files such that other
> > > > processes can't write to them anymore. This might allow unprivileged
> > > > users to prevent root from updating config files or stuff like that if
> > > > they're updated in-place.
> > >
> > > Yes, I agree, but since it is the case for executed files I though it
> > > was worth starting a discussion on this topic.  This new flag could be
> > > restricted to executable files, but we should avoid system-wide locks
> > > like this.  I'm not sure how Windows handle these issues though.
> > >
> > > Anyway, we should rely on the access control policy to control write and
> > > execute access in a consistent way (e.g. write-xor-execute).  Thanks for
> > > the references and the background!
> >
> > I'm confused.  I understand that there are many contexts in which one
> > would want to prevent execution of unapproved content, which might
> > include preventing a given process from modifying some code and then
> > executing it.
> >
> > I don't understand what these deny-write features have to do with it.
> > These features merely prevent someone from modifying code *that is
> > currently in use*, which is not at all the same thing as preventing
> > modifying code that might get executed -- one can often modify
> > contents *before* executing those contents.
>
> The order of checks would be:
> 1. open script with O_DENY_WRITE
> 2. check executability with AT_EXECVE_CHECK
> 3. read the content and interpret it

Hmm.  Common LSM configurations should be able to handle this without
deny write, I think.  If you don't want a program to be able to make
their own scripts, then don't allow AT_EXECVE_CHECK to succeed on a
script that the program can write.

Keep in mind that trying to lock this down too hard is pointless for
users who are allowed to to ptrace-write to their own processes.  Or
for users who can do JIT, or for users who can run a REPL, etc.

> > But maybe a less kludgy version could be used for real.  What if there
> > was a syscall that would take an fd and make a snapshot of the file?
>
> Yes, that would be a clean solution.  I don't think this is achievable
> in an efficient way without involving filesystem implementations though.

It wouldn't be so terrible to involve filesystem implementations.
Most of the filesystems that people who care at all about security run
their binaries from either support reflinks or are immutable.  Things
like OCI implementations may already fit meet those criteria, and it
would be pretty nifty if the kernel was actually aware that OCI layers
are intended to be immutable.  We could even have an API to
generically query the hash of an immutable file and to ask the kernel
if it's validating the hash on reads.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox