From: Sasha Levin <sashal@kernel.org>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: kees@kernel.org, elver@google.com, linux-api@vger.kernel.org,
linux-kernel@vger.kernel.org, tools@kernel.org,
workflows@vger.kernel.org
Subject: Re: [RFC 00/19] Kernel API Specification Framework
Date: Mon, 30 Jun 2025 10:27:29 -0400
Message-ID: <aGKe0bcv1mzBnnQr@lappy>
In-Reply-To: <CACT4Y+ZB45ovD0hX3xX_yTUVSRDc1UCXnVDB57jxyWPPc7k=MA@mail.gmail.com>
On Fri, Jun 27, 2025 at 08:23:41AM +0200, Dmitry Vyukov wrote:
>On Thu, 26 Jun 2025 at 18:23, Sasha Levin <sashal@kernel.org> wrote:
>>
>> On Thu, Jun 26, 2025 at 10:37:33AM +0200, Dmitry Vyukov wrote:
>> >On Thu, 26 Jun 2025 at 10:32, Dmitry Vyukov <dvyukov@google.com> wrote:
>> >>
>> >> On Wed, 25 Jun 2025 at 17:55, Sasha Levin <sashal@kernel.org> wrote:
>> >> >
>> >> > On Wed, Jun 25, 2025 at 10:52:46AM +0200, Dmitry Vyukov wrote:
>> >> > >On Tue, 24 Jun 2025 at 22:04, Sasha Levin <sashal@kernel.org> wrote:
>> >> > >
>> >> > >> >6. What's the goal of validation of the input arguments?
>> >> > >> >Kernel code must do this validation anyway, right?
>> >> > >> >Any non-trivial validation is hard, e.g. even for open the validation function
>> >> > >> >for file name would need to have access to flags and check file presence for
>> >> > >> >some flag combinations. That may add a significant amount of non-trivial code
>> >> > >> >that duplicates main syscall logic, and that logic may also have bugs and
>> >> > >> >memory leaks.
>> >> > >>
>> >> > >> Mostly to catch divergence from the spec: think of a scenario where
>> >> > >> someone added a new param/flag/etc but forgot to update the spec - this
>> >> > >> will help catch it.
>> >> > >
>> >> > >How exactly is this supposed to work?
>> >> > >Even if we run with a unit test suite, a test suite may include some
>> >> > >incorrect inputs to check for error conditions. The framework will
>> >> > >report violations on these incorrect inputs. These are not bugs in the
>> >> > >API specifications, nor in the test suite (read: false positives).
>> >> >
>> >> > Right now it would be something along the lines of the test checking for
>> >> > an expected failure message in dmesg, something along the lines of:
>> >> >
>> >> > https://github.com/linux-test-project/ltp/blob/0c99c7915f029d32de893b15b0a213ff3de210af/testcases/commands/sysctl/sysctl02.sh#L67
>> >> >
>> >> > I'm not opposed to coming up with a better story...
>> >
>> >If the goal of validation is just indirectly validating correctness of
>> >the specification itself, then I would look for other ways of
>> >validating correctness of the spec.
>> >Either removing duplication between specification and actual code
>> >(i.e. generating it from SYSCALL_DEFINE, or the other way around),
>> >then the spec is correct by construction. Or, cross-validating it with
>> >info automatically extracted from the source (using
>> >clang/dwarf/pahole).
>> >This would be more scalable (O(1) work, rather than thousands more
>> >manually written tests).
>> >
>> >> Oh, you mean special tests for this framework (rather than existing tests).
>> >> I don't think this is going to work in practice. Besides writing all
>> >> these specifications, we will also need to write dozens of tests per
>> >> each specification (e.g. for each fd arg one needs at least 3 tests:
>> >> -1, valid fd, invalid fd; an enum may need 5 various inputs of
>> >> something; let alone netlink specifications).
>>
>> I didn't mean just for the framework: being able to specify the APIs in
>> machine readable format will enable us to automatically generate
>> exhaustive tests for each such API.
>>
>> I've been playing with the kapi tool (see last patch) which already
>> supports different formatters. Right now it outputs human readable
>> output, but I have proof-of-concept code that outputs testcases for
>> specced APIs.
>>
>> The dream here is to be able to automatically generate
>> hundreds/thousands of tests for each API in an automated fashion, and
>> verify the results with:
>>
>> 1. Simply checking expected return value.
>>
>> 2. Checking that the actual action happened (i.e. we called close(fd),
>> verify that `fd` is really closed).
>>
>> 3. Check for side effects (i.e. close(fd) isn't supposed to allocate
>> memory - verify that it didn't allocate memory).
>>
>> 4. Code coverage: our tests are supposed to cover 100% of the code in
>> that API's call chain; do we have code that didn't run (missing/incorrect
>> specs)?
>
>
>This is all good. I was asking about the argument verification part of the
>framework. Is it required for any of this? How?

Specifications without enforcement are just documentation :)

In my mind, there are a few reasons we want this:

1. For folks coding against the kernel, it's a way for them to know that
the code they're writing fits within the spec of the kernel's API.

2. Enforcement around kernel changes: think of a scenario where a flag
is added to a syscall - the author of that change will have to also
update the spec because otherwise the verification layer will complain
about the new flag. This helps prevent divergence between the code and
the spec.

3. Extra layer of security: we can choose to enable this as an
additional layer to protect us from missing checks in our userspace
facing API.
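
To make the second reason concrete, here is a minimal sketch of the idea:
a spec that records a mask of documented flag bits, and a check that
rejects any bit outside that mask. Every name below (struct flag_spec,
spec_check_flags, the DEMO_* flags, demo_spec) is hypothetical and made
up for illustration; none of it is taken from the RFC patches or from
the kapi tool.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical spec entry; NOT the structure used by the RFC patches. */
struct flag_spec {
	const char *name;	/* API name, for diagnostics only */
	uint32_t valid_mask;	/* every flag bit the spec documents */
};

/*
 * Return 0 if every bit set in 'flags' is documented in the spec,
 * -EINVAL if at least one bit is unknown to the spec.
 */
int spec_check_flags(const struct flag_spec *spec, uint32_t flags)
{
	uint32_t unknown = flags & ~spec->valid_mask;

	return unknown ? -EINVAL : 0;
}

/* Made-up flags for a made-up call. */
#define DEMO_FLAG_A	0x1u
#define DEMO_FLAG_B	0x2u
#define DEMO_FLAG_NEW	0x4u	/* added to the code, not yet to the spec */

/* The spec still lists only the two original flags. */
const struct flag_spec demo_spec = {
	.name = "demo_call",
	.valid_mask = DEMO_FLAG_A | DEMO_FLAG_B,
};
```

With this, spec_check_flags(&demo_spec, DEMO_FLAG_NEW) fails until
DEMO_FLAG_NEW is added to valid_mask, which is the kind of complaint
that would push the author of the change to update the spec as well.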
--
Thanks,
Sasha