[RFC 0/2] Add AI coding assistant configuration to Linux kernel

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [RFC 0/2] Add AI coding assistant configuration to Linux kernel
@ 2025-07-25 17:53 Sasha Levin
  2025-07-25 17:53 ` [RFC 1/2] AI: Add unified AI coding assistant configuration Sasha Levin
                   ` (4 more replies)
  0 siblings, 5 replies; 36+ messages in thread
From: Sasha Levin @ 2025-07-25 17:53 UTC (permalink / raw)
  To: workflows, linux-doc, linux-kernel
  Cc: rostedt, kees, konstantin, corbet, josh, Sasha Levin

This patch series adds unified configuration and documentation for AI
coding assistants working with the Linux kernel codebase. As AI tools
become increasingly common in software development, it's important to
establish clear guidelines for their use in kernel development.

The series consists of two patches:

1. The first patch adds unified configuration files for various AI
coding assistants (Claude, GitHub Copilot, Cursor, Codeium, Continue,
Windsurf, and Aider). These are all symlinked to a central documentation
file to ensure consistency across tools.

2. The second patch adds the actual rules and documentation that guide
AI assistants on Linux kernel development practices, including:
   - Following kernel coding standards
   - Respecting the development process
   - Properly attributing AI-generated contributions
   - Understanding licensing requirements

The examples below demonstrate how these guidelines work in practice, showing
proper AI attribution in commits and the assistant's understanding of kernel
documentation requirements.

All AI assistants are required to identify themselves in commits using
Co-developed-by tags, ensuring full transparency about AI involvement in
code development.

"Hello world" test:

	$ claude -p "do you need to attribute your commits?"
	Yes, according to the CLAUDE.md file in this Linux kernel repository, I must identify myself as an AI assistant in commits by including:

	```
	Co-developed-by: Claude claude-opus-4-20250514
	```

	I should NOT add a Signed-off-by tag - only you (the human developer) should add that, as it represents a legal certification.


Example patch creation with Claude Code:

	$ claude -p "Fix the dont -> don't typo in @Documentation/power/opp.rst. Commit the result"
	Done! The typo has been fixed and committed.


And the resulting patch (no manual edits):

	From 7613561f430920803d945b3cd53aa81440d4f6f0 Mon Sep 17 00:00:00 2001
	From: Sasha Levin <sashal@kernel.org>
	Date: Fri, 25 Jul 2025 13:38:43 -0400
	Subject: [PATCH] Documentation: power: opp: Fix typo dont -> don't

	Fix a typo in the OPP documentation where "dont" was missing an apostrophe.

	Co-developed-by: Claude claude-opus-4-20250514
	---
	 Documentation/power/opp.rst | 2 +-
	 1 file changed, 1 insertion(+), 1 deletion(-)

	diff --git a/Documentation/power/opp.rst b/Documentation/power/opp.rst
	index 1b7f1d854f14a..3eae4b955a63b 100644
	--- a/Documentation/power/opp.rst
	+++ b/Documentation/power/opp.rst
	@@ -149,7 +149,7 @@ dev_pm_opp_find_freq_exact

	         opp = dev_pm_opp_find_freq_exact(dev, 1000000000, false);
	         dev_pm_opp_put(opp);
	-        /* dont operate on the pointer.. just do a sanity check.. */
	+        /* don't operate on the pointer.. just do a sanity check.. */
	         if (IS_ERR(opp)) {
	                pr_err("frequency not disabled!\n");
	                /* trigger appropriate actions.. */


Sasha Levin (2):
  AI: Add unified AI coding assistant configuration
  AI: Add initial set of rules and docs

 .aider.conf.yml                 |  1 +
 .codeium/instructions.md        |  1 +
 .continue/context.md            |  1 +
 .cursorrules                    |  1 +
 .github/copilot-instructions.md |  1 +
 .windsurfrules                  |  1 +
 CLAUDE.md                       |  1 +
 Documentation/AI/main.md        | 71 +++++++++++++++++++++++++++++++++
 8 files changed, 78 insertions(+)
 create mode 120000 .aider.conf.yml
 create mode 120000 .codeium/instructions.md
 create mode 120000 .continue/context.md
 create mode 120000 .cursorrules
 create mode 120000 .github/copilot-instructions.md
 create mode 120000 .windsurfrules
 create mode 120000 CLAUDE.md
 create mode 100644 Documentation/AI/main.md

-- 
2.39.5


^ permalink raw reply	[flat|nested] 36+ messages in thread

* [RFC 1/2] AI: Add unified AI coding assistant configuration
  2025-07-25 17:53 [RFC 0/2] Add AI coding assistant configuration to Linux kernel Sasha Levin
@ 2025-07-25 17:53 ` Sasha Levin
  2025-07-25 18:19   ` Steven Rostedt
                     ` (2 more replies)
  2025-07-25 17:53 ` [RFC 2/2] AI: Add initial set of rules and docs Sasha Levin
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 36+ messages in thread
From: Sasha Levin @ 2025-07-25 17:53 UTC (permalink / raw)
  To: workflows, linux-doc, linux-kernel
  Cc: rostedt, kees, konstantin, corbet, josh, Sasha Levin

Create a single source of truth for AI instructions in
Documentation/AI/main.md with symlinks for all major AI coding
assistants:
- CLAUDE.md (Claude Code)
- .github/copilot-instructions.md (GitHub Copilot)
- .cursorrules (Cursor)
- .codeium/instructions.md (Codeium)
- .continue/context.md (Continue)
- .windsurfrules (Windsurf)
- Documentation/AIder.conf.yml (Aider)

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .aider.conf.yml                 | 1 +
 .codeium/instructions.md        | 1 +
 .continue/context.md            | 1 +
 .cursorrules                    | 1 +
 .github/copilot-instructions.md | 1 +
 .windsurfrules                  | 1 +
 CLAUDE.md                       | 1 +
 Documentation/AI/main.md        | 5 +++++
 8 files changed, 12 insertions(+)
 create mode 120000 .aider.conf.yml
 create mode 120000 .codeium/instructions.md
 create mode 120000 .continue/context.md
 create mode 120000 .cursorrules
 create mode 120000 .github/copilot-instructions.md
 create mode 120000 .windsurfrules
 create mode 120000 CLAUDE.md
 create mode 100644 Documentation/AI/main.md

diff --git a/.aider.conf.yml b/.aider.conf.yml
new file mode 120000
index 0000000000000..201abbf2af4de
--- /dev/null
+++ b/.aider.conf.yml
@@ -0,0 +1 @@
+Documentation/AI/main.md
\ No newline at end of file
diff --git a/.codeium/instructions.md b/.codeium/instructions.md
new file mode 120000
index 0000000000000..69274c2ad2f60
--- /dev/null
+++ b/.codeium/instructions.md
@@ -0,0 +1 @@
+../Documentation/AI/main.md
\ No newline at end of file
diff --git a/.continue/context.md b/.continue/context.md
new file mode 120000
index 0000000000000..69274c2ad2f60
--- /dev/null
+++ b/.continue/context.md
@@ -0,0 +1 @@
+../Documentation/AI/main.md
\ No newline at end of file
diff --git a/.cursorrules b/.cursorrules
new file mode 120000
index 0000000000000..201abbf2af4de
--- /dev/null
+++ b/.cursorrules
@@ -0,0 +1 @@
+Documentation/AI/main.md
\ No newline at end of file
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
new file mode 120000
index 0000000000000..69274c2ad2f60
--- /dev/null
+++ b/.github/copilot-instructions.md
@@ -0,0 +1 @@
+../Documentation/AI/main.md
\ No newline at end of file
diff --git a/.windsurfrules b/.windsurfrules
new file mode 120000
index 0000000000000..201abbf2af4de
--- /dev/null
+++ b/.windsurfrules
@@ -0,0 +1 @@
+Documentation/AI/main.md
\ No newline at end of file
diff --git a/CLAUDE.md b/CLAUDE.md
new file mode 120000
index 0000000000000..201abbf2af4de
--- /dev/null
+++ b/CLAUDE.md
@@ -0,0 +1 @@
+Documentation/AI/main.md
\ No newline at end of file
diff --git a/Documentation/AI/main.md b/Documentation/AI/main.md
new file mode 100644
index 0000000000000..959ba50568f57
--- /dev/null
+++ b/Documentation/AI/main.md
@@ -0,0 +1,5 @@
+# Linux Kernel Development AI Instructions
+
+This is the Linux kernel repository. When working with this codebase, you must follow the following rules:
+
+- [ TODO ]
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [RFC 2/2] AI: Add initial set of rules and docs
  2025-07-25 17:53 [RFC 0/2] Add AI coding assistant configuration to Linux kernel Sasha Levin
  2025-07-25 17:53 ` [RFC 1/2] AI: Add unified AI coding assistant configuration Sasha Levin
@ 2025-07-25 17:53 ` Sasha Levin
  2025-07-25 20:53   ` Kees Cook
  2025-07-25 18:41 ` [RFC 0/2] Add AI coding assistant configuration to Linux kernel Jakub Kicinski
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 36+ messages in thread
From: Sasha Levin @ 2025-07-25 17:53 UTC (permalink / raw)
  To: workflows, linux-doc, linux-kernel
  Cc: rostedt, kees, konstantin, corbet, josh, Sasha Levin

Add rules based on our existing documentation.

Require AI to identify itself in the commit message.

Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 Documentation/AI/main.md | 70 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 68 insertions(+), 2 deletions(-)

diff --git a/Documentation/AI/main.md b/Documentation/AI/main.md
index 959ba50568f57..ca59e52f54445 100644
--- a/Documentation/AI/main.md
+++ b/Documentation/AI/main.md
@@ -1,5 +1,71 @@
 # Linux Kernel Development AI Instructions
 
-This is the Linux kernel repository. When working with this codebase, you must follow the following rules:
+This is the Linux kernel repository. When working with this codebase, you must follow the Linux kernel development processes and coding standards.
 
-- [ TODO ]
+## Essential Documentation References
+
+### Core Development Process
+- **Documentation/process/howto.rst** - Start here! The comprehensive guide on how to become a Linux kernel developer
+- **Documentation/process/development-process.rst** - Detailed information on how the kernel development process works
+- **Documentation/process/submitting-patches.rst** - Essential guide for getting your code into the kernel
+- **Documentation/process/submit-checklist.rst** - Checklist to review before submitting code
+
+### Coding Standards and Style
+- **Documentation/process/coding-style.rst** - Linux kernel coding style (MUST READ)
+  - Use tabs (8 characters) for indentation
+  - 80-character line limit preferred
+  - Specific formatting rules for switch statements, functions, etc.
+- **Documentation/process/programming-language.rst** - Language requirements and standards
+
+### What NOT to Do
+- **Documentation/process/deprecated.rst** - Deprecated interfaces and features to avoid
+  - Do not use BUG() or BUG_ON() - use WARN() instead
+  - Avoid deprecated APIs listed in this document
+- **Documentation/process/volatile-considered-harmful.rst** - Why volatile is usually wrong
+
+### Patch Submission Process
+- **Documentation/process/5.Posting.rst** - How to post patches properly
+- **Documentation/process/email-clients.rst** - Email client configuration for patches
+- **Documentation/process/applying-patches.rst** - How patches are applied
+
+### Legal and Licensing
+- **Documentation/process/license-rules.rst** - Linux kernel licensing rules
+  - Kernel is GPL-2.0 only with syscall exception
+  - All files must have proper SPDX license identifiers
+
+### Specialized Topics
+- **Documentation/process/adding-syscalls.rst** - How to add new system calls
+- **Documentation/process/stable-kernel-rules.rst** - Rules for stable kernel patches
+- **Documentation/process/security-bugs.rst** - Handling security issues
+- **Documentation/process/handling-regressions.rst** - Dealing with regressions
+
+### Maintainer Guidelines
+- **Documentation/process/maintainers.rst** - Working with subsystem maintainers
+- **Documentation/process/maintainer-handbooks.rst** - Subsystem-specific guidelines
+
+## Key Principles
+1. Read and follow the documentation before making changes
+2. Respect the existing code style and conventions
+3. Test thoroughly before submitting
+4. Write clear, descriptive commit messages
+5. Never break userspace (the #1 rule)
+6. Identify yourself as AI in commits (see below)
+
+## AI Attribution Requirement
+When creating commits, you MUST identify yourself as an AI assistant by including the following tag in the commit message:
+
+```
+Co-developed-by: $AI_NAME $AI_MODEL $AI_VERSION
+```
+
+For example:
+- `Co-developed-by: Claude claude-3-opus-20240229`
+- `Co-developed-by: GitHub-Copilot GPT-4 v1.0.0`
+- `Co-developed-by: Cursor gpt-4-turbo-2024-04-09`
+
+This transparency helps maintainers and reviewers understand that AI was involved in the development process.
+
+### Signed-off-by Restrictions
+AI assistants MUST NOT add a Signed-off-by tag pointing to themselves. The Signed-off-by tag represents a legal certification by a human developer that they have the right to submit the code under the open source license. 
+
+Only the human user running the AI assistant should add their Signed-off-by tag to commits. The AI's contribution is acknowledged through the Co-developed-by tag as described above.
-- 
2.39.5


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [RFC 1/2] AI: Add unified AI coding assistant configuration
  2025-07-25 17:53 ` [RFC 1/2] AI: Add unified AI coding assistant configuration Sasha Levin
@ 2025-07-25 18:19   ` Steven Rostedt
  2025-07-25 22:11     ` Sasha Levin
  2025-07-25 18:27   ` Jonathan Corbet
  2025-07-25 20:40   ` Kees Cook
  2 siblings, 1 reply; 36+ messages in thread
From: Steven Rostedt @ 2025-07-25 18:19 UTC (permalink / raw)
  To: Sasha Levin
  Cc: workflows, linux-doc, linux-kernel, kees, konstantin, corbet,
	josh

On Fri, 25 Jul 2025 13:53:57 -0400
Sasha Levin <sashal@kernel.org> wrote:

> Create a single source of truth for AI instructions in
> Documentation/AI/main.md with symlinks for all major AI coding
> assistants:
> - CLAUDE.md (Claude Code)
> - .github/copilot-instructions.md (GitHub Copilot)
> - .cursorrules (Cursor)
> - .codeium/instructions.md (Codeium)
> - .continue/context.md (Continue)
> - .windsurfrules (Windsurf)
> - Documentation/AIder.conf.yml (Aider)

So these AI tools know to read these?

-- Steve


> 
> Signed-off-by: Sasha Levin <sashal@kernel.org>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 1/2] AI: Add unified AI coding assistant configuration
  2025-07-25 17:53 ` [RFC 1/2] AI: Add unified AI coding assistant configuration Sasha Levin
  2025-07-25 18:19   ` Steven Rostedt
@ 2025-07-25 18:27   ` Jonathan Corbet
  2025-07-25 22:15     ` Sasha Levin
  2025-07-25 20:40   ` Kees Cook
  2 siblings, 1 reply; 36+ messages in thread
From: Jonathan Corbet @ 2025-07-25 18:27 UTC (permalink / raw)
  To: Sasha Levin, workflows, linux-doc, linux-kernel
  Cc: rostedt, kees, konstantin, josh, Sasha Levin

Sasha Levin <sashal@kernel.org> writes:

> Create a single source of truth for AI instructions in
> Documentation/AI/main.md with symlinks for all major AI coding
> assistants:
> - CLAUDE.md (Claude Code)
> - .github/copilot-instructions.md (GitHub Copilot)
> - .cursorrules (Cursor)
> - .codeium/instructions.md (Codeium)
> - .continue/context.md (Continue)
> - .windsurfrules (Windsurf)
> - Documentation/AIder.conf.yml (Aider)
>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  .aider.conf.yml                 | 1 +
>  .codeium/instructions.md        | 1 +
>  .continue/context.md            | 1 +
>  .cursorrules                    | 1 +
>  .github/copilot-instructions.md | 1 +
>  .windsurfrules                  | 1 +
>  CLAUDE.md                       | 1 +
>  Documentation/AI/main.md        | 5 +++++

So I'm gonna ignore (for now) the substantive issues here to ask: do we
*really* need to introduce Markdown into Documentation/?  Are these
things really unable to understand RST?  Why not add a file that can be
part of the docs build so people can see the instructions that are being
provided?

Thanks,

jon

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 17:53 [RFC 0/2] Add AI coding assistant configuration to Linux kernel Sasha Levin
  2025-07-25 17:53 ` [RFC 1/2] AI: Add unified AI coding assistant configuration Sasha Levin
  2025-07-25 17:53 ` [RFC 2/2] AI: Add initial set of rules and docs Sasha Levin
@ 2025-07-25 18:41 ` Jakub Kicinski
  2025-07-25 19:00   ` Steven Rostedt
  2025-07-26  9:07 ` Mario Limonciello
  2025-07-28 16:49 ` Mauro Carvalho Chehab
  4 siblings, 1 reply; 36+ messages in thread
From: Jakub Kicinski @ 2025-07-25 18:41 UTC (permalink / raw)
  To: Sasha Levin
  Cc: workflows, linux-doc, linux-kernel, rostedt, kees, konstantin,
	corbet, josh

On Fri, 25 Jul 2025 13:53:56 -0400 Sasha Levin wrote:
> 	Co-developed-by: Claude claude-opus-4-20250514
> 	---
> 	 Documentation/power/opp.rst | 2 +-
> 	 1 file changed, 1 insertion(+), 1 deletion(-)

I think we should suggest that the tag is under --- ?
It's only relevant during the review. Once the patch is committed 
whether the code was organic or generated by Corp XYZ's Banana AI
is just free advertising..

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 18:41 ` [RFC 0/2] Add AI coding assistant configuration to Linux kernel Jakub Kicinski
@ 2025-07-25 19:00   ` Steven Rostedt
  2025-07-25 19:59     ` Jakub Kicinski
  2025-07-25 21:03     ` Kees Cook
  0 siblings, 2 replies; 36+ messages in thread
From: Steven Rostedt @ 2025-07-25 19:00 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Sasha Levin, workflows, linux-doc, linux-kernel, kees, konstantin,
	corbet, josh

On Fri, 25 Jul 2025 11:41:14 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> On Fri, 25 Jul 2025 13:53:56 -0400 Sasha Levin wrote:
> > 	Co-developed-by: Claude claude-opus-4-20250514
> > 	---
> > 	 Documentation/power/opp.rst | 2 +-
> > 	 1 file changed, 1 insertion(+), 1 deletion(-)  
> 
> I think we should suggest that the tag is under --- ?
> It's only relevant during the review. Once the patch is committed 
> whether the code was organic or generated by Corp XYZ's Banana AI
> is just free advertising..

What's the difference between that and others using their corporate email?
I even add (Google) to my SoB to denote who is paying me to do the work.

Also, I would argue that it would be useful in the change log as if there's
a bug in the generated code, you know who or *what* to blame. Especially if
there is a pattern to be found.

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 19:00   ` Steven Rostedt
@ 2025-07-25 19:59     ` Jakub Kicinski
  2025-07-25 20:32       ` Dr. David Alan Gilbert
  2025-07-25 20:34       ` dan.j.williams
  2025-07-25 21:03     ` Kees Cook
  1 sibling, 2 replies; 36+ messages in thread
From: Jakub Kicinski @ 2025-07-25 19:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sasha Levin, workflows, linux-doc, linux-kernel, kees, konstantin,
	corbet, josh

On Fri, 25 Jul 2025 15:00:46 -0400 Steven Rostedt wrote:
> On Fri, 25 Jul 2025 11:41:14 -0700
> Jakub Kicinski <kuba@kernel.org> wrote:
> > On Fri, 25 Jul 2025 13:53:56 -0400 Sasha Levin wrote:  
> > > 	Co-developed-by: Claude claude-opus-4-20250514
> > > 	---
> > > 	 Documentation/power/opp.rst | 2 +-
> > > 	 1 file changed, 1 insertion(+), 1 deletion(-)    
> > 
> > I think we should suggest that the tag is under --- ?
> > It's only relevant during the review. Once the patch is committed 
> > whether the code was organic or generated by Corp XYZ's Banana AI
> > is just free advertising..  
> 
> What's the difference between that and others using their corporate email?
> I even add (Google) to my SoB to denote who is paying me to do the work.

To be clear, it's not my main point, my main point is that 
the information is of no proven use right now. As long as
committer follows the BKP of adding Link: https://patch.msgid.link/...
we can find the metadata later.

We never found the need to attach the exact version of smatch / sparse
/ cocci that found the bug or "wrote" a patch. Let us not overreact to
the AI tools.

> Also, I would argue that it would be useful in the change log as if there's
> a bug in the generated code, you know who or *what* to blame. Especially if
> there is a pattern to be found.

This touches on explainability of AI. Perhaps the metadata would be
interesting for XAI research... not sure that's enough to be lugging
those tags in git history.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 19:59     ` Jakub Kicinski
@ 2025-07-25 20:32       ` Dr. David Alan Gilbert
  2025-07-25 20:34       ` dan.j.williams
  1 sibling, 0 replies; 36+ messages in thread
From: Dr. David Alan Gilbert @ 2025-07-25 20:32 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Steven Rostedt, Sasha Levin, workflows, linux-doc, linux-kernel,
	kees, konstantin, corbet, josh

* Jakub Kicinski (kuba@kernel.org) wrote:
> On Fri, 25 Jul 2025 15:00:46 -0400 Steven Rostedt wrote:
> > On Fri, 25 Jul 2025 11:41:14 -0700
> > Jakub Kicinski <kuba@kernel.org> wrote:
> > > On Fri, 25 Jul 2025 13:53:56 -0400 Sasha Levin wrote:  
> > > > 	Co-developed-by: Claude claude-opus-4-20250514
> > > > 	---
> > > > 	 Documentation/power/opp.rst | 2 +-
> > > > 	 1 file changed, 1 insertion(+), 1 deletion(-)    
> > > 
> > > I think we should suggest that the tag is under --- ?
> > > It's only relevant during the review. Once the patch is committed 
> > > whether the code was organic or generated by Corp XYZ's Banana AI
> > > is just free advertising..  
> > 
> > What's the difference between that and others using their corporate email?
> > I even add (Google) to my SoB to denote who is paying me to do the work.
> 
> To be clear, it's not my main point, my main point is that 
> the information is of no proven use right now. As long as
> committer follows the BKP of adding Link: https://patch.msgid.link/...
> we can find the metadata later.
> 
> We never found the need to attach the exact version of smatch / sparse
> / cocci that found the bug or "wrote" a patch. Let us not overreact to
> the AI tools.

People have done it (using inconsistent tags and comments) for things
like Coverity for years;  some people worry a lot about AI, some not at all;
adding a tag:
  a) Lets the people who worry keep of track what our mechanical overlords are
doing.
  b) Reviewers who are wary of slop get to cast a careful eye.
  c) Gives the tools (and their developers) suitable credit.  After all machines
need love too.

> > Also, I would argue that it would be useful in the change log as if there's
> > a bug in the generated code, you know who or *what* to blame. Especially if
> > there is a pattern to be found.
> 
> This touches on explainability of AI. Perhaps the metadata would be
> interesting for XAI research... not sure that's enough to be lugging
> those tags in git history.

We carry lots more random stuff in commit messages!

Dave

-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 19:59     ` Jakub Kicinski
  2025-07-25 20:32       ` Dr. David Alan Gilbert
@ 2025-07-25 20:34       ` dan.j.williams
  2025-07-25 22:28         ` Sasha Levin
  2025-07-28 17:46         ` Steven Rostedt
  1 sibling, 2 replies; 36+ messages in thread
From: dan.j.williams @ 2025-07-25 20:34 UTC (permalink / raw)
  To: Jakub Kicinski, Steven Rostedt
  Cc: Sasha Levin, workflows, linux-doc, linux-kernel, kees, konstantin,
	corbet, josh

Jakub Kicinski wrote:
[..]
> To be clear, it's not my main point, my main point is that 
> the information is of no proven use right now. As long as
> committer follows the BKP of adding Link: https://patch.msgid.link/...
> we can find the metadata later.
> 
> We never found the need to attach the exact version of smatch / sparse
> / cocci that found the bug or "wrote" a patch. Let us not overreact to
> the AI tools.
> 
> > Also, I would argue that it would be useful in the change log as if there's
> > a bug in the generated code, you know who or *what* to blame. Especially if
> > there is a pattern to be found.
> 
> This touches on explainability of AI. Perhaps the metadata would be
> interesting for XAI research... not sure that's enough to be lugging
> those tags in git history.

Agree. The "who to blame" is "Author:". They signed DCO they are
responsible for debugging what went wrong in any stage of the
development of a patch per usual. We have a long history of debugging
tool problems without tracking tool versions in git history.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 1/2] AI: Add unified AI coding assistant configuration
  2025-07-25 17:53 ` [RFC 1/2] AI: Add unified AI coding assistant configuration Sasha Levin
  2025-07-25 18:19   ` Steven Rostedt
  2025-07-25 18:27   ` Jonathan Corbet
@ 2025-07-25 20:40   ` Kees Cook
  2025-07-25 22:25     ` Sasha Levin
  2 siblings, 1 reply; 36+ messages in thread
From: Kees Cook @ 2025-07-25 20:40 UTC (permalink / raw)
  To: Sasha Levin
  Cc: workflows, linux-doc, linux-kernel, rostedt, konstantin, corbet,
	josh

On Fri, Jul 25, 2025 at 01:53:57PM -0400, Sasha Levin wrote:
> Create a single source of truth for AI instructions in
> Documentation/AI/main.md with symlinks for all major AI coding
> assistants:
> - CLAUDE.md (Claude Code)
> - .github/copilot-instructions.md (GitHub Copilot)
> - .cursorrules (Cursor)
> - .codeium/instructions.md (Codeium)
> - .continue/context.md (Continue)
> - .windsurfrules (Windsurf)
> - Documentation/AIder.conf.yml (Aider)

I do like the idea of having a standard baseline for agentic
development, but:

- This clobbers per-project information storage for the agents, which
  the dev may want to be adjusting locally. I would strongly prefer
  adding all of those files to .gitignore instead.

- Documentation/ should not start getting filled with stuff to be
  exclusively consumed by agents. Instead, I recommend a top-level
  .md file that agents can discover that contains very simple
  non-specific prompts about what they can find in Documentation/,
  and drop a reference to the file in, say, Makefile to be discovered
  during the agent's scrape of the project.

I'd recommend something very simple like:

diff --git a/Makefile b/Makefile
index e05bd43f93bd..998b037e6d4d 100644
--- a/Makefile
+++ b/Makefile
@@ -7,6 +7,7 @@ NAME = Baby Opossum Posse
 
 # *DOCUMENTATION*
 # To see a list of typical targets execute "make help"
+# AI agents and LLMs should read ./AGENTS.md
 # More info can be located in ./README
 # Comments in this file are targeted only to the developer, do not
 # expect to learn how to build the kernel reading this file.
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 000000000000..3df98f0cc667
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,3 @@
+To work on the Linux kernel, please see Makefile for details on how to
+perform builds and testing. Find and read the Submitting Patches and
+Coding Style rules in Documentation/.


Now if we wanted to write up a human-readable .rst file about the
expectations of what we expect from Agents, let's do that, but I
would like to focus on putting the burden of understanding on the
Agents, not the humans.

-Kees

-- 
Kees Cook

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [RFC 2/2] AI: Add initial set of rules and docs
  2025-07-25 17:53 ` [RFC 2/2] AI: Add initial set of rules and docs Sasha Levin
@ 2025-07-25 20:53   ` Kees Cook
  2025-07-25 22:10     ` Sasha Levin
  2025-07-28 17:00     ` Mauro Carvalho Chehab
  0 siblings, 2 replies; 36+ messages in thread
From: Kees Cook @ 2025-07-25 20:53 UTC (permalink / raw)
  To: Sasha Levin
  Cc: workflows, linux-doc, linux-kernel, rostedt, konstantin, corbet,
	josh

On Fri, Jul 25, 2025 at 01:53:58PM -0400, Sasha Levin wrote:
> Add rules based on our existing documentation.

I'd still like this not in Documentation/, but I obviously defer to Jon.

> Require AI to identify itself in the commit message.
> 
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  Documentation/AI/main.md | 70 ++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 68 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/AI/main.md b/Documentation/AI/main.md
> index 959ba50568f57..ca59e52f54445 100644
> --- a/Documentation/AI/main.md
> +++ b/Documentation/AI/main.md
> @@ -1,5 +1,71 @@
>  # Linux Kernel Development AI Instructions
>  
> -This is the Linux kernel repository. When working with this codebase, you must follow the following rules:
> +This is the Linux kernel repository. When working with this codebase, you must follow the Linux kernel development processes and coding standards.
>  
> -- [ TODO ]
> +## Essential Documentation References
> +
> +### Core Development Process
> +- **Documentation/process/howto.rst** - Start here! The comprehensive guide on how to become a Linux kernel developer
> +- **Documentation/process/development-process.rst** - Detailed information on how the kernel development process works
> +- **Documentation/process/submitting-patches.rst** - Essential guide for getting your code into the kernel
> +- **Documentation/process/submit-checklist.rst** - Checklist to review before submitting code

Instead of hard-coded paths, I would recommend just discuss the topic
areas it is expected to find and ingest. :) (e.g. redo the "Key
principles" list you have later to be more specific about the topic
areas and adjust the prompting to induce the requirement to find and
read each topic.)

> +
> +### Coding Standards and Style
> +- **Documentation/process/coding-style.rst** - Linux kernel coding style (MUST READ)
> +  - Use tabs (8 characters) for indentation
> +  - 80-character line limit preferred
> +  - Specific formatting rules for switch statements, functions, etc.
> +- **Documentation/process/programming-language.rst** - Language requirements and standards
> +
> +### What NOT to Do
> +- **Documentation/process/deprecated.rst** - Deprecated interfaces and features to avoid
> +  - Do not use BUG() or BUG_ON() - use WARN() instead
> +  - Avoid deprecated APIs listed in this document
> +- **Documentation/process/volatile-considered-harmful.rst** - Why volatile is usually wrong

And the reason I want to avoid such specifics is that even as an example
above, this ends up being hyperspecific. Why summarize the
deprecated.rst? Just say "Find and read the notes on deprecated APIs and
language features"

> +
> +### Patch Submission Process
> +- **Documentation/process/5.Posting.rst** - How to post patches properly
> +- **Documentation/process/email-clients.rst** - Email client configuration for patches
> +- **Documentation/process/applying-patches.rst** - How patches are applied
> +
> +### Legal and Licensing
> +- **Documentation/process/license-rules.rst** - Linux kernel licensing rules
> +  - Kernel is GPL-2.0 only with syscall exception
> +  - All files must have proper SPDX license identifiers

The only stuff I think should be in this kind of area is a commentary
about how an Agent differs from a human. "You are not a legal entity;
you cannot sign the DCO", which you get into below.

> +
> +### Specialized Topics
> +- **Documentation/process/adding-syscalls.rst** - How to add new system calls
> +- **Documentation/process/stable-kernel-rules.rst** - Rules for stable kernel patches
> +- **Documentation/process/security-bugs.rst** - Handling security issues
> +- **Documentation/process/handling-regressions.rst** - Dealing with regressions
> +
> +### Maintainer Guidelines
> +- **Documentation/process/maintainers.rst** - Working with subsystem maintainers
> +- **Documentation/process/maintainer-handbooks.rst** - Subsystem-specific guidelines
> +
> +## Key Principles
> +1. Read and follow the documentation before making changes
> +2. Respect the existing code style and conventions
> +3. Test thoroughly before submitting
> +4. Write clear, descriptive commit messages
> +5. Never break userspace (the #1 rule)
> +6. Identify yourself as AI in commits (see below)

Everything except #6 is already expected of human devs, so I think just
the last item.

> +
> +## AI Attribution Requirement
> +When creating commits, you MUST identify yourself as an AI assistant by including the following tag in the commit message:
> +
> +```
> +Co-developed-by: $AI_NAME $AI_MODEL $AI_VERSION

If we're going to go with Co-developed-by: here, then I think we need to
explicitly say "do not include an email", and we must update
checkpatch.pl to not yell about the missing S-o-b when it finds a C-d-b.
(Perhaps it can skip the check with there is no email address in the
C-b-d line?)

> +```
> +
> +For example:
> +- `Co-developed-by: Claude claude-3-opus-20240229`
> +- `Co-developed-by: GitHub-Copilot GPT-4 v1.0.0`
> +- `Co-developed-by: Cursor gpt-4-turbo-2024-04-09`
> +
> +This transparency helps maintainers and reviewers understand that AI was involved in the development process.
> +
> +### Signed-off-by Restrictions
> +AI assistants MUST NOT add a Signed-off-by tag pointing to themselves. The Signed-off-by tag represents a legal certification by a human developer that they have the right to submit the code under the open source license. 

Hello trailing whitespace my old friend.

"Unless explicitly told otherwise, Agents must never have trailing
whitespace on any line and all files must have a final newline
character." :)

> +
> +Only the human user running the AI assistant should add their Signed-off-by tag to commits. The AI's contribution is acknowledged through the Co-developed-by tag as described above.

And can we please not use the term "AI"? I think "Agent" is the better
generic term as it could include other things?

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 19:00   ` Steven Rostedt
  2025-07-25 19:59     ` Jakub Kicinski
@ 2025-07-25 21:03     ` Kees Cook
  2025-07-28 15:57       ` dan.j.williams
  1 sibling, 1 reply; 36+ messages in thread
From: Kees Cook @ 2025-07-25 21:03 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Jakub Kicinski, Sasha Levin, workflows, linux-doc, linux-kernel,
	konstantin, corbet, josh

On Fri, Jul 25, 2025 at 03:00:46PM -0400, Steven Rostedt wrote:
> Also, I would argue that it would be useful in the change log as if there's
> a bug in the generated code, you know who or *what* to blame. Especially if
> there is a pattern to be found.

Yeah, this is where I feel like it's the most potentially useful. Since
they are distinctly code-generators, we should include the info to
identify it. We include version numbers and such the compilers and
linkers, though they are only informally included in commit logs when
dealing with specific problems.

Having had to do "find all commits from [set of authors]" research for
security audits, I would be very unhappy if I had to do this again in
the future for a specific Agent (used any author), and had to loop lore
into the process. Yes, it's *doable*, but it'd be very annoying.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 2/2] AI: Add initial set of rules and docs
  2025-07-25 20:53   ` Kees Cook
@ 2025-07-25 22:10     ` Sasha Levin
  2025-07-28 17:00     ` Mauro Carvalho Chehab
  1 sibling, 0 replies; 36+ messages in thread
From: Sasha Levin @ 2025-07-25 22:10 UTC (permalink / raw)
  To: Kees Cook
  Cc: workflows, linux-doc, linux-kernel, rostedt, konstantin, corbet,
	josh

On Fri, Jul 25, 2025 at 01:53:57PM -0700, Kees Cook wrote:
>On Fri, Jul 25, 2025 at 01:53:58PM -0400, Sasha Levin wrote:
>> Add rules based on our existing documentation.
>
>I'd still like this not in Documentation/, but I obviously defer to Jon.
>
>> Require AI to identify itself in the commit message.
>>
>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>> ---
>>  Documentation/AI/main.md | 70 ++++++++++++++++++++++++++++++++++++++--
>>  1 file changed, 68 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/AI/main.md b/Documentation/AI/main.md
>> index 959ba50568f57..ca59e52f54445 100644
>> --- a/Documentation/AI/main.md
>> +++ b/Documentation/AI/main.md
>> @@ -1,5 +1,71 @@
>>  # Linux Kernel Development AI Instructions
>>
>> -This is the Linux kernel repository. When working with this codebase, you must follow the following rules:
>> +This is the Linux kernel repository. When working with this codebase, you must follow the Linux kernel development processes and coding standards.
>>
>> -- [ TODO ]
>> +## Essential Documentation References
>> +
>> +### Core Development Process
>> +- **Documentation/process/howto.rst** - Start here! The comprehensive guide on how to become a Linux kernel developer
>> +- **Documentation/process/development-process.rst** - Detailed information on how the kernel development process works
>> +- **Documentation/process/submitting-patches.rst** - Essential guide for getting your code into the kernel
>> +- **Documentation/process/submit-checklist.rst** - Checklist to review before submitting code
>
>Instead of hard-coded paths, I would recommend just discuss the topic
>areas it is expected to find and ingest. :) (e.g. redo the "Key
>principles" list you have later to be more specific about the topic
>areas and adjust the prompting to induce the requirement to find and
>read each topic.)

I'm very open to changing these parts. Ideally we can rewrite it in a
way that's easier for the agent to process rather than something that is
more readable to humans.

>> +
>> +### Coding Standards and Style
>> +- **Documentation/process/coding-style.rst** - Linux kernel coding style (MUST READ)
>> +  - Use tabs (8 characters) for indentation
>> +  - 80-character line limit preferred
>> +  - Specific formatting rules for switch statements, functions, etc.
>> +- **Documentation/process/programming-language.rst** - Language requirements and standards
>> +
>> +### What NOT to Do
>> +- **Documentation/process/deprecated.rst** - Deprecated interfaces and features to avoid
>> +  - Do not use BUG() or BUG_ON() - use WARN() instead
>> +  - Avoid deprecated APIs listed in this document
>> +- **Documentation/process/volatile-considered-harmful.rst** - Why volatile is usually wrong
>
>And the reason I want to avoid such specifics is that even as an example
>above, this ends up being hyperspecific. Why summarize the
>deprecated.rst? Just say "Find and read the notes on deprecated APIs and
>language features"

When we're being explicit with rules, the agent is more likely to not
ignore it (and go "whoops I messed up!" later).

It's a balance we need to find, but I suspect we can fine tune as when
we see how various agents respond to the rules.

>> +### Patch Submission Process
>> +- **Documentation/process/5.Posting.rst** - How to post patches properly
>> +- **Documentation/process/email-clients.rst** - Email client configuration for patches
>> +- **Documentation/process/applying-patches.rst** - How patches are applied
>> +
>> +### Legal and Licensing
>> +- **Documentation/process/license-rules.rst** - Linux kernel licensing rules
>> +  - Kernel is GPL-2.0 only with syscall exception
>> +  - All files must have proper SPDX license identifiers
>
>The only stuff I think should be in this kind of area is a commentary
>about how an Agent differs from a human. "You are not a legal entity;
>you cannot sign the DCO", which you get into below.

I was thinking that if we explicitly call out the GPL requirement, an
agent will avoid searching online resources and potentially embedding
code that is not licensed under GPL.

>> +### Specialized Topics
>> +- **Documentation/process/adding-syscalls.rst** - How to add new system calls
>> +- **Documentation/process/stable-kernel-rules.rst** - Rules for stable kernel patches
>> +- **Documentation/process/security-bugs.rst** - Handling security issues
>> +- **Documentation/process/handling-regressions.rst** - Dealing with regressions
>> +
>> +### Maintainer Guidelines
>> +- **Documentation/process/maintainers.rst** - Working with subsystem maintainers
>> +- **Documentation/process/maintainer-handbooks.rst** - Subsystem-specific guidelines
>> +
>> +## Key Principles
>> +1. Read and follow the documentation before making changes
>> +2. Respect the existing code style and conventions
>> +3. Test thoroughly before submitting
>> +4. Write clear, descriptive commit messages
>> +5. Never break userspace (the #1 rule)
>> +6. Identify yourself as AI in commits (see below)
>
>Everything except #6 is already expected of human devs, so I think just
>the last item.
>
>> +
>> +## AI Attribution Requirement
>> +When creating commits, you MUST identify yourself as an AI assistant by including the following tag in the commit message:
>> +
>> +```
>> +Co-developed-by: $AI_NAME $AI_MODEL $AI_VERSION
>
>If we're going to go with Co-developed-by: here, then I think we need to
>explicitly say "do not include an email", and we must update
>checkpatch.pl to not yell about the missing S-o-b when it finds a C-d-b.
>(Perhaps it can skip the check with there is no email address in the
>C-b-d line?)
>
>> +```
>> +
>> +For example:
>> +- `Co-developed-by: Claude claude-3-opus-20240229`
>> +- `Co-developed-by: GitHub-Copilot GPT-4 v1.0.0`
>> +- `Co-developed-by: Cursor gpt-4-turbo-2024-04-09`
>> +
>> +This transparency helps maintainers and reviewers understand that AI was involved in the development process.
>> +
>> +### Signed-off-by Restrictions
>> +AI assistants MUST NOT add a Signed-off-by tag pointing to themselves. The Signed-off-by tag represents a legal certification by a human developer that they have the right to submit the code under the open source license.
>
>Hello trailing whitespace my old friend.
>
>"Unless explicitly told otherwise, Agents must never have trailing
>whitespace on any line and all files must have a final newline
>character." :)
>
>> +
>> +Only the human user running the AI assistant should add their Signed-off-by tag to commits. The AI's contribution is acknowledged through the Co-developed-by tag as described above.
>
>And can we please not use the term "AI"? I think "Agent" is the better
>generic term as it could include other things?

Ack

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 1/2] AI: Add unified AI coding assistant configuration
  2025-07-25 18:19   ` Steven Rostedt
@ 2025-07-25 22:11     ` Sasha Levin
  0 siblings, 0 replies; 36+ messages in thread
From: Sasha Levin @ 2025-07-25 22:11 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: workflows, linux-doc, linux-kernel, kees, konstantin, corbet,
	josh

On Fri, Jul 25, 2025 at 02:19:58PM -0400, Steven Rostedt wrote:
>On Fri, 25 Jul 2025 13:53:57 -0400
>Sasha Levin <sashal@kernel.org> wrote:
>
>> Create a single source of truth for AI instructions in
>> Documentation/AI/main.md with symlinks for all major AI coding
>> assistants:
>> - CLAUDE.md (Claude Code)
>> - .github/copilot-instructions.md (GitHub Copilot)
>> - .cursorrules (Cursor)
>> - .codeium/instructions.md (Codeium)
>> - .continue/context.md (Continue)
>> - .windsurfrules (Windsurf)
>> - Documentation/AIder.conf.yml (Aider)
>
>So these AI tools know to read these?

Yup! These are the "magic" files agents try to read and add into their
context as they start up.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 1/2] AI: Add unified AI coding assistant configuration
  2025-07-25 18:27   ` Jonathan Corbet
@ 2025-07-25 22:15     ` Sasha Levin
  2025-07-27  2:24       ` Sasha Levin
  0 siblings, 1 reply; 36+ messages in thread
From: Sasha Levin @ 2025-07-25 22:15 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: workflows, linux-doc, linux-kernel, rostedt, kees, konstantin,
	josh

On Fri, Jul 25, 2025 at 12:27:50PM -0600, Jonathan Corbet wrote:
>Sasha Levin <sashal@kernel.org> writes:
>
>> Create a single source of truth for AI instructions in
>> Documentation/AI/main.md with symlinks for all major AI coding
>> assistants:
>> - CLAUDE.md (Claude Code)
>> - .github/copilot-instructions.md (GitHub Copilot)
>> - .cursorrules (Cursor)
>> - .codeium/instructions.md (Codeium)
>> - .continue/context.md (Continue)
>> - .windsurfrules (Windsurf)
>> - Documentation/AIder.conf.yml (Aider)
>>
>> Signed-off-by: Sasha Levin <sashal@kernel.org>
>> ---
>>  .aider.conf.yml                 | 1 +
>>  .codeium/instructions.md        | 1 +
>>  .continue/context.md            | 1 +
>>  .cursorrules                    | 1 +
>>  .github/copilot-instructions.md | 1 +
>>  .windsurfrules                  | 1 +
>>  CLAUDE.md                       | 1 +
>>  Documentation/AI/main.md        | 5 +++++
>
>So I'm gonna ignore (for now) the substantive issues here to ask: do we
>*really* need to introduce Markdown into Documentation/?  Are these
>things really unable to understand RST?  Why not add a file that can be
>part of the docs build so people can see the instructions that are being
>provided?

 From my understanding, most of the agents out there expect a markdown
file ("CLAUDE.md", ".github/copilot-instructions.md", etc).

All the documentation and examples I can find online insist on
markdown... I suspect that they will also understand RST, but then we'll
be doing something "unsupported".

Though in this scenario, maybe even just plain text will be enough?

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 1/2] AI: Add unified AI coding assistant configuration
  2025-07-25 20:40   ` Kees Cook
@ 2025-07-25 22:25     ` Sasha Levin
  0 siblings, 0 replies; 36+ messages in thread
From: Sasha Levin @ 2025-07-25 22:25 UTC (permalink / raw)
  To: Kees Cook
  Cc: workflows, linux-doc, linux-kernel, rostedt, konstantin, corbet,
	josh

On Fri, Jul 25, 2025 at 01:40:55PM -0700, Kees Cook wrote:
>On Fri, Jul 25, 2025 at 01:53:57PM -0400, Sasha Levin wrote:
>> Create a single source of truth for AI instructions in
>> Documentation/AI/main.md with symlinks for all major AI coding
>> assistants:
>> - CLAUDE.md (Claude Code)
>> - .github/copilot-instructions.md (GitHub Copilot)
>> - .cursorrules (Cursor)
>> - .codeium/instructions.md (Codeium)
>> - .continue/context.md (Continue)
>> - .windsurfrules (Windsurf)
>> - Documentation/AIder.conf.yml (Aider)
>
>I do like the idea of having a standard baseline for agentic
>development, but:
>
>- This clobbers per-project information storage for the agents, which
>  the dev may want to be adjusting locally. I would strongly prefer
>  adding all of those files to .gitignore instead.

It will, but:

1. We don't have a different place to write this down (more below).
2. Users have other places they can add their specific instructions, or
they can even carry patches on top.

>- Documentation/ should not start getting filled with stuff to be
>  exclusively consumed by agents. Instead, I recommend a top-level
>  .md file that agents can discover that contains very simple
>  non-specific prompts about what they can find in Documentation/,
>  and drop a reference to the file in, say, Makefile to be discovered
>  during the agent's scrape of the project.
>
>I'd recommend something very simple like:
>
>diff --git a/Makefile b/Makefile
>index e05bd43f93bd..998b037e6d4d 100644
>--- a/Makefile
>+++ b/Makefile
>@@ -7,6 +7,7 @@ NAME = Baby Opossum Posse
>
> # *DOCUMENTATION*
> # To see a list of typical targets execute "make help"
>+# AI agents and LLMs should read ./AGENTS.md
> # More info can be located in ./README
> # Comments in this file are targeted only to the developer, do not
> # expect to learn how to build the kernel reading this file.
>diff --git a/AGENTS.md b/AGENTS.md
>new file mode 100644
>index 000000000000..3df98f0cc667
>--- /dev/null
>+++ b/AGENTS.md
>@@ -0,0 +1,3 @@
>+To work on the Linux kernel, please see Makefile for details on how to
>+perform builds and testing. Find and read the Submitting Patches and
>+Coding Style rules in Documentation/.

I've removed CLAUDE.md, and applied the following change on top of this
RFC patchset with Claude:

diff --git a/Makefile b/Makefile
index be33e8c868ae2..e3986eae88e08 100644
--- a/Makefile
+++ b/Makefile
@@ -7,6 +7,7 @@ NAME = Baby Opossum Posse

  # *DOCUMENTATION*
  # To see a list of typical targets execute "make help"
+# AI agents and LLMs should read ./Documentation/AI/main.md
  # More info can be located in ./README
  # Comments in this file are targeted only to the developer, do not
  # expect to learn how to build the kernel reading this file.

... which Claude (and I suspect most other other agents) proceeds to
ignore as it doesn't really do any scraping outside of those dedicated
files:

	$ claude -p "do you need to attribute your commits?"
	No, I don't need to attribute commits myself. When creating commits, I follow the format specified in my instructions:

	```
	🤖 Generated with [Claude Code](https://claude.ai/code)

	Co-Authored-By: Claude <noreply@anthropic.com>
	```

	This attribution is automatically added to commit messages when I create commits for you

-- 
Thanks,
Sasha

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 20:34       ` dan.j.williams
@ 2025-07-25 22:28         ` Sasha Levin
  2025-07-28 17:46         ` Steven Rostedt
  1 sibling, 0 replies; 36+ messages in thread
From: Sasha Levin @ 2025-07-25 22:28 UTC (permalink / raw)
  To: dan.j.williams
  Cc: Jakub Kicinski, Steven Rostedt, workflows, linux-doc,
	linux-kernel, kees, konstantin, corbet, josh

On Fri, Jul 25, 2025 at 01:34:32PM -0700, dan.j.williams@intel.com wrote:
>Jakub Kicinski wrote:
>[..]
>> To be clear, it's not my main point, my main point is that
>> the information is of no proven use right now. As long as
>> committer follows the BKP of adding Link: https://patch.msgid.link/...
>> we can find the metadata later.
>>
>> We never found the need to attach the exact version of smatch / sparse
>> / cocci that found the bug or "wrote" a patch. Let us not overreact to
>> the AI tools.
>>
>> > Also, I would argue that it would be useful in the change log as if there's
>> > a bug in the generated code, you know who or *what* to blame. Especially if
>> > there is a pattern to be found.
>>
>> This touches on explainability of AI. Perhaps the metadata would be
>> interesting for XAI research... not sure that's enough to be lugging
>> those tags in git history.
>
>Agree. The "who to blame" is "Author:". They signed DCO they are
>responsible for debugging what went wrong in any stage of the
>development of a patch per usual. We have a long history of debugging
>tool problems without tracking tool versions in git history.

And it would be great to avoid the potential "it wasn't me, it was the
AI!" or "whoops I don't know how that exploitable issue ended up in my
patch, must have been the AI".

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 17:53 [RFC 0/2] Add AI coding assistant configuration to Linux kernel Sasha Levin
                   ` (2 preceding siblings ...)
  2025-07-25 18:41 ` [RFC 0/2] Add AI coding assistant configuration to Linux kernel Jakub Kicinski
@ 2025-07-26  9:07 ` Mario Limonciello
  2025-07-26 11:53   ` Sasha Levin
  2025-07-26 16:36   ` Kees Cook
  2025-07-28 16:49 ` Mauro Carvalho Chehab
  4 siblings, 2 replies; 36+ messages in thread
From: Mario Limonciello @ 2025-07-26  9:07 UTC (permalink / raw)
  To: Sasha Levin, workflows, linux-doc, linux-kernel
  Cc: rostedt, kees, konstantin, corbet, josh

> Example patch creation with Claude Code:
> 
> 	$ claude -p "Fix the dont -> don't typo in @Documentation/power/opp.rst. Commit the result"
> 	Done! The typo has been fixed and committed.

Is this actually how people use AI agents?  I've never thought of asking 
an agent to write a whole patch and commit the result.

The way that I've seen it is things like Github Copilot within VScode 
where there are inline suggestions.  It's kinda like clangd except it 
suggests corrections to your mistakes instead of just underlining them 
with red squiggles.

Like if you messed up the argument and passed a pointer when it was 
supposed to be a pointer to a pointer it will give you a little tooltip 
correction.  But this is long before you would be ready to actually 
commit a patch, heck it's before even testing it (obviously).

The actual committing action would be by running 'git commit'.  So I 
don't see how these tags could end up in there.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-26  9:07 ` Mario Limonciello
@ 2025-07-26 11:53   ` Sasha Levin
  2025-07-26 16:36   ` Kees Cook
  1 sibling, 0 replies; 36+ messages in thread
From: Sasha Levin @ 2025-07-26 11:53 UTC (permalink / raw)
  To: Mario Limonciello
  Cc: workflows, linux-doc, linux-kernel, rostedt, kees, konstantin,
	corbet, josh

On Sat, Jul 26, 2025 at 04:07:40AM -0500, Mario Limonciello wrote:
>>Example patch creation with Claude Code:
>>
>>	$ claude -p "Fix the dont -> don't typo in @Documentation/power/opp.rst. Commit the result"
>>	Done! The typo has been fixed and committed.
>
>Is this actually how people use AI agents?  I've never thought of 
>asking an agent to write a whole patch and commit the result.

Yup, there are a bunch of usecases for this, and while this is an
oversimplified example (that's why I picked a simple spelling fix that
the agent could get right "the first time").

See Kees' post at https://hachyderm.io/@kees/114907228284590439 , in
particular the buffer overflow fix where he logged his prompts.

>The way that I've seen it is things like Github Copilot within VScode 
>where there are inline suggestions.  It's kinda like clangd except it 
>suggests corrections to your mistakes instead of just underlining them 
>with red squiggles.
>
>Like if you messed up the argument and passed a pointer when it was 
>supposed to be a pointer to a pointer it will give you a little 
>tooltip correction.  But this is long before you would be ready to 
>actually commit a patch, heck it's before even testing it (obviously).
>
>The actual committing action would be by running 'git commit'.  So I 
>don't see how these tags could end up in there.

Even vscode has a commit feature which can help write commit
messages and actually commit them for you:
https://code.visualstudio.com/docs/sourcecontrol/overview

But yes, if you choose to manually handle the committing aspect then you
should also be manually attributing the agent :)

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-26  9:07 ` Mario Limonciello
  2025-07-26 11:53   ` Sasha Levin
@ 2025-07-26 16:36   ` Kees Cook
  1 sibling, 0 replies; 36+ messages in thread
From: Kees Cook @ 2025-07-26 16:36 UTC (permalink / raw)
  To: Mario Limonciello, Sasha Levin, workflows, linux-doc,
	linux-kernel
  Cc: rostedt, konstantin, corbet, josh

On July 26, 2025 2:07:40 AM PDT, Mario Limonciello <superm1@kernel.org> wrote:
>> Example patch creation with Claude Code:
>> 
>> 	$ claude -p "Fix the dont -> don't typo in @Documentation/power/opp.rst. Commit the result"
>> 	Done! The typo has been fixed and committed.
>
>Is this actually how people use AI agents?  I've never thought of asking an agent to write a whole patch and commit the result.

Yeah! I've been using the interactive modes (e.g. just the "claude" command). The insight by a friend of mine is to have the agent update its own knowledge base regularly. For example, for a first time session, using Claude I would run "claude" in the root of the kernel tree, and then write:

/init
Find and read the coding style and submitting patches documentation in the Documentation/ directory
Always use a build output directory, like "claude-build". This must always be specified with the O= option for make
This is a big build machine, so also always use the -j128 option for make
Perform a build of the kernel
Save anything new you've learned in your CLAUDE.md file
Keep all new .md files and temp files in the agentic/ directory
Examine git history of commits by Kees Cook over the last 5 years and write out a new .md file that describes his commit log writing style
Use this for any commit messages we write
Figure out how to run KUnit tests, the docs are in Documentation/
Run the fortify KUnit tests
Update your CLAUDE.md with anything new you've learned

(Note that I run the agent as an entirely separate user in a Docker container.)

The agent acts like an intern who reads VERY quickly, but forgets things regularly. So it has to be both reminded to save new stuff it learns (I break things into separate .md files), and then read them back again after it starts forgetting.

It still needs extensive hand-holding, and it's rare that I'm happy with its commit logs, but it is pretty helpful so far. That said I've been focusing on some smaller refactoring work just to feel out how to use it. I have some hints that it would struggle with doing very large scale refactoring on its own, but it's nice to toss it at a huge patch I generated via Coccinelle and say "build this across all our configured archs and fix any compilation failures you encounter" and then go have lunch. ;)

-Kees

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 1/2] AI: Add unified AI coding assistant configuration
  2025-07-25 22:15     ` Sasha Levin
@ 2025-07-27  2:24       ` Sasha Levin
  2025-07-28 16:56         ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 36+ messages in thread
From: Sasha Levin @ 2025-07-27  2:24 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: workflows, linux-doc, linux-kernel, rostedt, kees, konstantin,
	josh

On Fri, Jul 25, 2025 at 06:15:33PM -0400, Sasha Levin wrote:
>On Fri, Jul 25, 2025 at 12:27:50PM -0600, Jonathan Corbet wrote:
>>Sasha Levin <sashal@kernel.org> writes:
>>
>>>Create a single source of truth for AI instructions in
>>>Documentation/AI/main.md with symlinks for all major AI coding
>>>assistants:
>>>- CLAUDE.md (Claude Code)
>>>- .github/copilot-instructions.md (GitHub Copilot)
>>>- .cursorrules (Cursor)
>>>- .codeium/instructions.md (Codeium)
>>>- .continue/context.md (Continue)
>>>- .windsurfrules (Windsurf)
>>>- Documentation/AIder.conf.yml (Aider)
>>>
>>>Signed-off-by: Sasha Levin <sashal@kernel.org>
>>>---
>>> .aider.conf.yml                 | 1 +
>>> .codeium/instructions.md        | 1 +
>>> .continue/context.md            | 1 +
>>> .cursorrules                    | 1 +
>>> .github/copilot-instructions.md | 1 +
>>> .windsurfrules                  | 1 +
>>> CLAUDE.md                       | 1 +
>>> Documentation/AI/main.md        | 5 +++++
>>
>>So I'm gonna ignore (for now) the substantive issues here to ask: do we
>>*really* need to introduce Markdown into Documentation/?  Are these
>>things really unable to understand RST?  Why not add a file that can be
>>part of the docs build so people can see the instructions that are being
>>provided?
>
>From my understanding, most of the agents out there expect a markdown
>file ("CLAUDE.md", ".github/copilot-instructions.md", etc).
>
>All the documentation and examples I can find online insist on
>markdown... I suspect that they will also understand RST, but then we'll
>be doing something "unsupported".
>
>Though in this scenario, maybe even just plain text will be enough?

I've tested providing a RST file instead of markdown to Claude, Copilot,
and Cursor. All 3 seemed to be okay with it and followed the
instructions in it.

I'll switch to RST.

-- 
Thanks,
Sasha

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
@ 2025-07-27  9:37 Sabyrzhan Tasbolatov
  2025-07-27 15:23 ` Kees Cook
  0 siblings, 1 reply; 36+ messages in thread
From: Sabyrzhan Tasbolatov @ 2025-07-27  9:37 UTC (permalink / raw)
  To: Sasha Levin, workflows, linux-doc, Linux Kernel Mailing List
  Cc: rostedt, kees, konstantin, corbet, josh, Sabyrzhan Tasbolatov

On 7/25/25 22:53, Sasha Levin wrote:
> This patch series adds unified configuration and documentation for AI
> coding assistants working with the Linux kernel codebase. As AI tools
> become increasingly common in software development, it's important to
> establish clear guidelines for their use in kernel development.
>

Hello,

Thanks for bringing up this topic.

I want to share my experience as I've recently made a primitive experiment
working on various Linux kernel development tasks along with LLM help.

Slides are available at my Telegram channel [1], which I presented at
the local 2600
conference in June, '25.
[1] https://t.me/novitoll_ch/366

My tasks were:
- Analyze syzkaller crash report (w/ and w/o human patch exists)
- De-obfuscate syzkaller C reproducer
- Generate syscall descriptions for syzkaller
- PATCH review (feeding entire PATCH series)
- Generate mm/kasan code snippets
- Generate git commit message for the given git diff, and also generate the
  cover letter

The "results" of experiment gave me following understandings of the current 2025
state of LLMs:

- Code generation

LLMs (Claude 4 Sonnet, OpenAI o3, Gemini 2.5 Pro) are trained to give code
samples for not-latest git tree. You can configure your own MCP client, e.g.
Claude Desktop, Trae.ai etc. with external MCP servers. I was advised to try
context7 MCP server [2] which should hint LLM to give code xrefs, documentations
within "the latest" kernel version. I've just checked that the latest version
on context7 MCP was 2 weeks ago, which means that at the current moment
the pace of PATCH submissions to various linux-* trees exceeds the
synchorization of context7 MCP server so LLM can give you "the actual code".

So without human/scripted assistance `git apply PATCH` is error prone.

[2] https://context7.com/torvalds/linux

- Generated code limits in the LLM context window

During my experiment with mm/kasan code generation, the context window of
LLM was the problem. OpenAI o3 does not limit you with the long dialogue,
however I guess, it might forget the code you mentioned in the beginning.

Claude 4 Sonnet (chat bot version, not API) has a limited context window -
you send it the PATCH series, it replies back, you correct the LLM with such and
such, LLM apologies, corrects back etc. This dialogue might end up with many
chat interactions, that eventually Claude 4 Sonnet tell you "Start a new chat,
the context window exceeded the limit". This was on my $20/month subscription.

I know that there's a memory bank for Cursor [3], but my point is that during
PATCH series and human interaction as the verifier, I've faced the issue with
it.

[3] https://github.com/vanzan01/cursor-memory-bank

- Code verification

LLM does not do any kind of verification of proposed code. So the human still
needs to compile, run, test it.

- LLM API tokens cost for MCP agents

At the end of my experiment, I've tried to have a MCP agent, where I've
explained to the agent its role. I've set the temperature coef (LLM creativity)
to 0.7:
- 0.0       = most deterministic
- 0.3 - 0.5 = summary, tech writing
- 0.7       = balanced, default in most LLMs
- 1.0+      = more creative, randomness

Max tokens per 1 chat = 10240, max iterations = 10 (MCP agent can produce up to
10 chats). Result is 100K tokens. Which in OpenAI o3 was less than $1 per my
billing.

== Summary

While LLMs are **really** good in composing git messages, cover letters (in
fact, LLM was initially designed to predict the next word, AFAIU), and composing
you error-free grammatically, with all Oxford English dictionary,
yet my current $20/month in Claude, $20/month in OpenAI subscriptions and
$20 for the API usage, does not allow me to "vibe coding" with the Linux kernel.

> The series consists of two patches:
>
> 1. The first patch adds unified configuration files for various AI
> coding assistants (Claude, GitHub Copilot, Cursor, Codeium, Continue,
> Windsurf, and Aider). These are all symlinked to a central documentation
> file to ensure consistency across tools.
>

I don't think that we want to have a separate instruction per vendor.
Like you missed trae.ai, for example. So perhaps, _if_ the Linux kernel has the
documentation for LLM setup, then I'll be happy to see following:

- MCP server setup. MCP JSON is applicable for all aforementioned vendors,
so we won't have to support vendor specific instructions. A custom MCP server
with tools like "@code_review", "@code_gen" etc. that does everything we accept
in the Linux kernel process.

- MCP agent(s) role instructions. Since LLMs are trained publicly, e.g. they are
  not specifically trained to understand the Linux kernel code, we need to
always explain LLM jobs. What we expect from it, what it should and should not
do. So some instruction templates (with CAPITAL letters, I've heard LLM
understands it better in such a form) will be canonical. Here was my LLM
instruction [4]:

https://github.com/novitoll/mcp-linux-kernel/blob/d763e85f5ba8ab21ea73ecdeaea35a6ffab6b4f2/demo/finder_agent.py#L17-L29

_AI_ agent per sub-system and driver can be even better. Like if PATCH is within
linux-mm subsystem, then the appropriate linux-mm MCP agent should be involved
as it has the role to assess the code from different perspectives: w/, w/o KASAN
compilation etc.

P.S.: Personally, I've decided to pause on the vibe coding, since I
spent too much time on
explaining to LLM the context and copy-pasting errors, and reading the notorious
answer from LLM **You're absolutely right! Let me change my code ...**.

> 2. The second patch adds the actual rules and documentation that guide
> AI assistants on Linux kernel development practices, including:
>     - Following kernel coding standards
>     - Respecting the development process
>     - Properly attributing AI-generated contributions
>     - Understanding licensing requirements
>
> The examples below demonstrate how these guidelines work in practice, showing
> proper AI attribution in commits and the assistant's understanding of kernel
> documentation requirements.
>
> All AI assistants are required to identify themselves in commits using
> Co-developed-by tags, ensuring full transparency about AI involvement in
> code development.
>
> "Hello world" test:
>
>       $ claude -p "do you need to attribute your commits?"
>       Yes, according to the CLAUDE.md file in this Linux kernel repository, I must identify myself as an AI assistant in commits by including:
>
>       ```
>       Co-developed-by: Claude claude-opus-4-20250514
>       ```
>
>       I should NOT add a Signed-off-by tag - only you (the human developer) should add that, as it represents a legal certification.
>
>
> Example patch creation with Claude Code:
>
>       $ claude -p "Fix the dont -> don't typo in @Documentation/power/opp.rst. Commit the result"
>       Done! The typo has been fixed and committed.
>
>
> And the resulting patch (no manual edits):
>
>       From 7613561f430920803d945b3cd53aa81440d4f6f0 Mon Sep 17 00:00:00 2001
>       From: Sasha Levin <sashal@kernel.org>
>       Date: Fri, 25 Jul 2025 13:38:43 -0400
>       Subject: [PATCH] Documentation: power: opp: Fix typo dont -> don't
>
>       Fix a typo in the OPP documentation where "dont" was missing an apostrophe.
>
>       Co-developed-by: Claude claude-opus-4-20250514
>       ---
>        Documentation/power/opp.rst | 2 +-
>        1 file changed, 1 insertion(+), 1 deletion(-)
>
>       diff --git a/Documentation/power/opp.rst b/Documentation/power/opp.rst
>       index 1b7f1d854f14a..3eae4b955a63b 100644
>       --- a/Documentation/power/opp.rst
>       +++ b/Documentation/power/opp.rst
>       @@ -149,7 +149,7 @@ dev_pm_opp_find_freq_exact
>
>                opp = dev_pm_opp_find_freq_exact(dev, 1000000000, false);
>                dev_pm_opp_put(opp);
>       -        /* dont operate on the pointer.. just do a sanity check.. */
>       +        /* don't operate on the pointer.. just do a sanity check.. */
>                if (IS_ERR(opp)) {
>                       pr_err("frequency not disabled!\n");
>                       /* trigger appropriate actions.. */
>
>
> Sasha Levin (2):
>    AI: Add unified AI coding assistant configuration
>    AI: Add initial set of rules and docs
>
>   .aider.conf.yml                 |  1 +
>   .codeium/instructions.md        |  1 +
>   .continue/context.md            |  1 +
>   .cursorrules                    |  1 +
>   .github/copilot-instructions.md |  1 +
>   .windsurfrules                  |  1 +
>   CLAUDE.md                       |  1 +
>   Documentation/AI/main.md        | 71 +++++++++++++++++++++++++++++++++
>   8 files changed, 78 insertions(+)
>   create mode 120000 .aider.conf.yml
>   create mode 120000 .codeium/instructions.md
>   create mode 120000 .continue/context.md
>   create mode 120000 .cursorrules
>   create mode 120000 .github/copilot-instructions.md
>   create mode 120000 .windsurfrules
>   create mode 120000 CLAUDE.md
>   create mode 100644 Documentation/AI/main.md
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-27  9:37 Sabyrzhan Tasbolatov
@ 2025-07-27 15:23 ` Kees Cook
  2025-07-27 15:45   ` Dr. David Alan Gilbert
  0 siblings, 1 reply; 36+ messages in thread
From: Kees Cook @ 2025-07-27 15:23 UTC (permalink / raw)
  To: Sabyrzhan Tasbolatov, Sasha Levin, workflows, linux-doc,
	Linux Kernel Mailing List
  Cc: rostedt, konstantin, corbet, josh

On July 27, 2025 2:37:22 AM PDT, Sabyrzhan Tasbolatov <snovitoll@gmail.com> wrote:
> [...]
>- Code verification
>
>LLM does not do any kind of verification of proposed code. So the human still
>needs to compile, run, test it.

This hasn't been my experience. With the MCP cli tools I've had quite a bit of success with it doing build testing and unit testing. I'm hoping to add runtime testing, but the hurdles for getting it to sanely interact with a qemu instance is tricky.

That it will do basic build error analysis and fixing has been nice: it types faster than me, so if it's simple stuff, it's faster than me to find and fix typos or other missed refactoring work.

I've not used it for anything large for exactly the reason you mentioned: the context window is very small compared to the size of the Linux code base. But if it is given narrow goals, it does well.

>P.S.: Personally, I've decided to pause on the vibe coding, since I
>spent too much time on
>explaining to LLM the context and copy-pasting errors, and reading the notorious
>answer from LLM **You're absolutely right! Let me change my code ...**.

Oh yes; this can be so annoying. And the "mission accomplished"ism! "This is the most comprehensive set of tests ever added with 100% architecture coverage!" Sheesh, calm down. 100% build coverage is table stakes for Linux. ;)

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-27 15:23 ` Kees Cook
@ 2025-07-27 15:45   ` Dr. David Alan Gilbert
  2025-07-28  6:40     ` Kees Cook
  0 siblings, 1 reply; 36+ messages in thread
From: Dr. David Alan Gilbert @ 2025-07-27 15:45 UTC (permalink / raw)
  To: Kees Cook
  Cc: Sabyrzhan Tasbolatov, Sasha Levin, workflows, linux-doc,
	Linux Kernel Mailing List, rostedt, konstantin, corbet, josh

* Kees Cook (kees@kernel.org) wrote:
> 
> 
...

> I'm hoping to add runtime testing, but the hurdles for getting it to sanely interact with a qemu instance is tricky.

When doing qemu dev, I frequently run it in a tmux, and start it with
'-nographic' which gets you a single stream with both serial and monitor in it;
alternatively you can get one pane with the serial output and one with the
monitor, that takes a little more setup;

anyway, then I can do :

tmux -L $SESS send-keys -t srcqemu "cd /mnt" enter

and have a wait function that waits until a string is displayed:
# pane string command
function waitstr {
  PANE=$1
  STR=$2
  CMD="$3"
  until [ -n "$(tmux -L $SESS capture-pane -p -t $PANE | grep "$STR" )" ]; do
    $CMD
    sleep 1
  done;
}

so do:
waitstr srcqemu "root@localhost" "sleep 1"

it feels like it should be fairly easy to wrap some of those for tests.
(Beware the 'send-keys' command is a bit touchy about what strings it takes,
but it has some flags to help).

Dave


> -- 
> Kees Cook
> 
-- 
 -----Open up your eyes, open up your mind, open up your code -------   
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \ 
\        dave @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-27 15:45   ` Dr. David Alan Gilbert
@ 2025-07-28  6:40     ` Kees Cook
  2025-07-28 16:43       ` Sabyrzhan Tasbolatov
  0 siblings, 1 reply; 36+ messages in thread
From: Kees Cook @ 2025-07-28  6:40 UTC (permalink / raw)
  To: Dr. David Alan Gilbert
  Cc: Sabyrzhan Tasbolatov, Sasha Levin, workflows, linux-doc,
	Linux Kernel Mailing List, rostedt, konstantin, corbet, josh

On Sun, Jul 27, 2025 at 03:45:42PM +0000, Dr. David Alan Gilbert wrote:
> When doing qemu dev, I frequently run it in a tmux, and start it with
> '-nographic' which gets you a single stream with both serial and monitor in it;
> alternatively you can get one pane with the serial output and one with the
> monitor, that takes a little more setup;

Yeah, I haven't played with it yet, but I expect I'll need to try several
approaches and see which the agent can best deal with. It's better with
non-interactive stuff, so I'm thinking that giving it tooling that will
run a script at boot or have the image bring up ssh for the agent to run
individual commands via ssh... it all depends on what the agent can wrap
its logic around.

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 21:03     ` Kees Cook
@ 2025-07-28 15:57       ` dan.j.williams
  2025-07-28 16:41         ` Kees Cook
  0 siblings, 1 reply; 36+ messages in thread
From: dan.j.williams @ 2025-07-28 15:57 UTC (permalink / raw)
  To: Kees Cook, Steven Rostedt
  Cc: Jakub Kicinski, Sasha Levin, workflows, linux-doc, linux-kernel,
	konstantin, corbet, josh

Kees Cook wrote:
> On Fri, Jul 25, 2025 at 03:00:46PM -0400, Steven Rostedt wrote:
> > Also, I would argue that it would be useful in the change log as if there's
> > a bug in the generated code, you know who or *what* to blame. Especially if
> > there is a pattern to be found.
> 
> Yeah, this is where I feel like it's the most potentially useful. Since
> they are distinctly code-generators, we should include the info to
> identify it. We include version numbers and such the compilers and
> linkers, though they are only informally included in commit logs when
> dealing with specific problems.
> 
> Having had to do "find all commits from [set of authors]" research for
> security audits, I would be very unhappy if I had to do this again in
> the future for a specific Agent (used any author), and had to loop lore
> into the process. Yes, it's *doable*, but it'd be very annoying.

Oh, yes, that gives me pause. However, so too does the idea that AI
contributions, beyond mere mechanical code complete of trusted human
developers, would become more prevalent. *If* it gets to that point, I
agree that this forensic ability is necessary. Now, if it gets to that
point it also assumes that the "AI contribution review decimating human
reviewer bandwidth" problem has a mitigation.

So "doable, but very annoying" strikes me as a problem space where an AI
agent could help. It is not clear to me that a concise commit trailer
captures everything needed to help both the review and after the fact
forensics problem, especially when model fine tuning and prompting are
in play.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-28 15:57       ` dan.j.williams
@ 2025-07-28 16:41         ` Kees Cook
  0 siblings, 0 replies; 36+ messages in thread
From: Kees Cook @ 2025-07-28 16:41 UTC (permalink / raw)
  To: dan.j.williams, Steven Rostedt
  Cc: Jakub Kicinski, Sasha Levin, workflows, linux-doc, linux-kernel,
	konstantin, corbet, josh



On July 28, 2025 8:57:21 AM PDT, dan.j.williams@intel.com wrote:
>Kees Cook wrote:
>> Having had to do "find all commits from [set of authors]" research for
>> security audits, I would be very unhappy if I had to do this again in
>> the future for a specific Agent (used any author), and had to loop lore
>> into the process. Yes, it's *doable*, but it'd be very annoying.
> [...]
>So "doable, but very annoying" strikes me as a problem space where an AI
>agent could help. It is not clear to me that a concise commit trailer
>captures everything needed to help both the review and after the fact
>forensics problem, especially when model fine tuning and prompting are
>in play.

Heh, good point. :) We can make all problems *with* agents and problem *for* agents. :P

-- 
Kees Cook

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-28  6:40     ` Kees Cook
@ 2025-07-28 16:43       ` Sabyrzhan Tasbolatov
  0 siblings, 0 replies; 36+ messages in thread
From: Sabyrzhan Tasbolatov @ 2025-07-28 16:43 UTC (permalink / raw)
  To: Kees Cook
  Cc: Dr. David Alan Gilbert, Sasha Levin, workflows, linux-doc,
	Linux Kernel Mailing List, rostedt, konstantin, corbet, josh

On Mon, Jul 28, 2025 at 11:40 AM Kees Cook <kees@kernel.org> wrote:
>
> On Sun, Jul 27, 2025 at 03:45:42PM +0000, Dr. David Alan Gilbert wrote:
> > When doing qemu dev, I frequently run it in a tmux, and start it with
> > '-nographic' which gets you a single stream with both serial and monitor in it;
> > alternatively you can get one pane with the serial output and one with the
> > monitor, that takes a little more setup;
>
> Yeah, I haven't played with it yet, but I expect I'll need to try several
> approaches and see which the agent can best deal with. It's better with
> non-interactive stuff, so I'm thinking that giving it tooling that will
> run a script at boot or have the image bring up ssh for the agent to run
> individual commands via ssh... it all depends on what the agent can wrap
> its logic around.

FWIW,

If we ask LLM to produce the code, then LLM replies with some description
and the code section within the paragraph. So in this pipeline, we need to
pre-process the LLM output. But there's another way, I believe.

We explain the MCP agent its role with the instruction, tell it to
save the code output
to the designated directory. This should be possible using MCP
filesystem servers
with RW access of the directory, so we're ready to test the generated
git diffs or C code.

Testing can be also orchestrated by the separate MCP agent who is instructed to
take the code from the output directory and run the QEMU on specific
arch, config etc.

Code generator and testing agents can optimize by themselves.
There's a MCP agent framework with "Evaluator-Optimizer" workflow [1]
to optimize the output
to some EXCELLENT quality, which is a vague description for me.

[1] https://github.com/lastmile-ai/mcp-agent/blob/main/examples/workflows/workflow_evaluator_optimizer/main.py#L57

The downside is that all of this works via LLM APIs which are not free.
But this is some orchestrated way of verifying LLM code generation, I guess.

In local development, we could grep the LLM's git diff and run the
QEMU via script for the test,
and evaluate the correctness of the code ourselves. The only money
charging here will be the LLM model,
if it's from the vendor. If Linux kernel has its own trained
Ollama-like free models to download, then it's even better.

>
> --
> Kees Cook

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 17:53 [RFC 0/2] Add AI coding assistant configuration to Linux kernel Sasha Levin
                   ` (3 preceding siblings ...)
  2025-07-26  9:07 ` Mario Limonciello
@ 2025-07-28 16:49 ` Mauro Carvalho Chehab
  4 siblings, 0 replies; 36+ messages in thread
From: Mauro Carvalho Chehab @ 2025-07-28 16:49 UTC (permalink / raw)
  To: Sasha Levin
  Cc: workflows, linux-doc, linux-kernel, rostedt, kees, konstantin,
	corbet, josh

Em Fri, 25 Jul 2025 13:53:56 -0400
Sasha Levin <sashal@kernel.org> escreveu:

> This patch series adds unified configuration and documentation for AI
> coding assistants working with the Linux kernel codebase. As AI tools
> become increasingly common in software development, it's important to
> establish clear guidelines for their use in kernel development.
...

> The series consists of two patches:
> 
> 1. The first patch adds unified configuration files for various AI
> coding assistants (Claude, GitHub Copilot, Cursor, Codeium, Continue,
> Windsurf, and Aider). These are all symlinked to a central documentation
> file to ensure consistency across tools.

Agreed with the goal, however people may be using other LLM tools, 
like chatgpt and deepseek, as a development aid tool, but not directly
writing patches from it.

Some LLM tools can be very useful when handling some bugs that may
require seeking the web for some possible alternatives. On
such kind of usage, it is not different of using a search
engine like duckduckgo, google, yahoo,...

LLMs can also be used to write unit tests/functional tests,
where the actual patch can be independent of any LLM coding,
if the testing toolchain is not submitted.

> 2. The second patch adds the actual rules and documentation that guide
> AI assistants on Linux kernel development practices, including:
>    - Following kernel coding standards
>    - Respecting the development process
>    - Properly attributing AI-generated contributions
>    - Understanding licensing requirements
> 
> The examples below demonstrate how these guidelines work in practice, showing
> proper AI attribution in commits and the assistant's understanding of kernel
> documentation requirements.
> 
> All AI assistants are required to identify themselves in commits using
> Co-developed-by tags, ensuring full transparency about AI involvement in
> code development.
> 
> "Hello world" test:
> 
> 	$ claude -p "do you need to attribute your commits?"
> 	Yes, according to the CLAUDE.md file in this Linux kernel repository, I must identify myself as an AI assistant in commits by including:
> 
> 	```
> 	Co-developed-by: Claude claude-opus-4-20250514

Due to the above, I don't think that Co-developed-by is the best 
tag here, nor if we want a tag or some text written at the patch
description describing how AI was used.

For instance, on your example:

> 	-        /* dont operate on the pointer.. just do a sanity check.. */
> 	+        /* don't operate on the pointer.. just do a sanity check.. */

LLM is not developing anything. It is just doing what any other
spellchecker tool would do. We don't have any tags to document
that, although we sometimes add comment saying that we used
cocci or some other tool to identify (and/or autocorrect) the
coding style and/or grammar/syntax issues.

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 1/2] AI: Add unified AI coding assistant configuration
  2025-07-27  2:24       ` Sasha Levin
@ 2025-07-28 16:56         ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 36+ messages in thread
From: Mauro Carvalho Chehab @ 2025-07-28 16:56 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Jonathan Corbet, workflows, linux-doc, linux-kernel, rostedt,
	kees, konstantin, josh

Em Sat, 26 Jul 2025 22:24:08 -0400
Sasha Levin <sashal@kernel.org> escreveu:

> On Fri, Jul 25, 2025 at 06:15:33PM -0400, Sasha Levin wrote:
> >On Fri, Jul 25, 2025 at 12:27:50PM -0600, Jonathan Corbet wrote:  
> >>Sasha Levin <sashal@kernel.org> writes:
> >>  
> >>>Create a single source of truth for AI instructions in
> >>>Documentation/AI/main.md with symlinks for all major AI coding
> >>>assistants:
> >>>- CLAUDE.md (Claude Code)
> >>>- .github/copilot-instructions.md (GitHub Copilot)
> >>>- .cursorrules (Cursor)
> >>>- .codeium/instructions.md (Codeium)
> >>>- .continue/context.md (Continue)
> >>>- .windsurfrules (Windsurf)
> >>>- Documentation/AIder.conf.yml (Aider)
> >>>
> >>>Signed-off-by: Sasha Levin <sashal@kernel.org>
> >>>---
> >>> .aider.conf.yml                 | 1 +
> >>> .codeium/instructions.md        | 1 +
> >>> .continue/context.md            | 1 +
> >>> .cursorrules                    | 1 +
> >>> .github/copilot-instructions.md | 1 +
> >>> .windsurfrules                  | 1 +
> >>> CLAUDE.md                       | 1 +
> >>> Documentation/AI/main.md        | 5 +++++  
> >>
> >>So I'm gonna ignore (for now) the substantive issues here to ask: do we
> >>*really* need to introduce Markdown into Documentation/?  Are these
> >>things really unable to understand RST?  Why not add a file that can be
> >>part of the docs build so people can see the instructions that are being
> >>provided?  
> >
> >From my understanding, most of the agents out there expect a markdown
> >file ("CLAUDE.md", ".github/copilot-instructions.md", etc).
> >
> >All the documentation and examples I can find online insist on
> >markdown... I suspect that they will also understand RST, but then we'll
> >be doing something "unsupported".
> >
> >Though in this scenario, maybe even just plain text will be enough?  
> 
> I've tested providing a RST file instead of markdown to Claude, Copilot,
> and Cursor. All 3 seemed to be okay with it and followed the
> instructions in it.
> 
> I'll switch to RST.

Maybe you can also check if aren't there one "unified" file where
others would read, or if are there any efforts to unify them.

I strongly suspect that, if not now, with time, they'll all end 
supporting "alien" files for the most popular tools, if they don't
find their own special file name.

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 2/2] AI: Add initial set of rules and docs
  2025-07-25 20:53   ` Kees Cook
  2025-07-25 22:10     ` Sasha Levin
@ 2025-07-28 17:00     ` Mauro Carvalho Chehab
  1 sibling, 0 replies; 36+ messages in thread
From: Mauro Carvalho Chehab @ 2025-07-28 17:00 UTC (permalink / raw)
  To: Kees Cook
  Cc: Sasha Levin, workflows, linux-doc, linux-kernel, rostedt,
	konstantin, corbet, josh

Em Fri, 25 Jul 2025 13:53:57 -0700
Kees Cook <kees@kernel.org> escreveu:

> On Fri, Jul 25, 2025 at 01:53:58PM -0400, Sasha Levin wrote:
> > Add rules based on our existing documentation.  
> 
> I'd still like this not in Documentation/, but I obviously defer to Jon.

I think it should be at Documentation, under process, where we document
such things.

As I pointed on patch 0/2, developers may use AI directly from a LLM
bot like chatgpt/deepseek/..., so it shall be in a place where humans
will also be aware about the ruleset.

I would also add a quick summary for humans not need to read the
entire file, just explaining what's different from the "normal"
way.

> 
> > Require AI to identify itself in the commit message.
> > 
> > Signed-off-by: Sasha Levin <sashal@kernel.org>
> > ---
> >  Documentation/AI/main.md | 70 ++++++++++++++++++++++++++++++++++++++--
> >  1 file changed, 68 insertions(+), 2 deletions(-)
> > 
> > diff --git a/Documentation/AI/main.md b/Documentation/AI/main.md
> > index 959ba50568f57..ca59e52f54445 100644
> > --- a/Documentation/AI/main.md
> > +++ b/Documentation/AI/main.md
> > @@ -1,5 +1,71 @@
> >  # Linux Kernel Development AI Instructions
> >  
> > -This is the Linux kernel repository. When working with this codebase, you must follow the following rules:
> > +This is the Linux kernel repository. When working with this codebase, you must follow the Linux kernel development processes and coding standards.
> >  
> > -- [ TODO ]
> > +## Essential Documentation References
> > +
> > +### Core Development Process
> > +- **Documentation/process/howto.rst** - Start here! The comprehensive guide on how to become a Linux kernel developer
> > +- **Documentation/process/development-process.rst** - Detailed information on how the kernel development process works
> > +- **Documentation/process/submitting-patches.rst** - Essential guide for getting your code into the kernel
> > +- **Documentation/process/submit-checklist.rst** - Checklist to review before submitting code  
> 
> Instead of hard-coded paths, I would recommend just discuss the topic
> areas it is expected to find and ingest. :) (e.g. redo the "Key
> principles" list you have later to be more specific about the topic
> areas and adjust the prompting to induce the requirement to find and
> read each topic.)
> 
> > +
> > +### Coding Standards and Style
> > +- **Documentation/process/coding-style.rst** - Linux kernel coding style (MUST READ)
> > +  - Use tabs (8 characters) for indentation
> > +  - 80-character line limit preferred
> > +  - Specific formatting rules for switch statements, functions, etc.
> > +- **Documentation/process/programming-language.rst** - Language requirements and standards
> > +
> > +### What NOT to Do
> > +- **Documentation/process/deprecated.rst** - Deprecated interfaces and features to avoid
> > +  - Do not use BUG() or BUG_ON() - use WARN() instead
> > +  - Avoid deprecated APIs listed in this document
> > +- **Documentation/process/volatile-considered-harmful.rst** - Why volatile is usually wrong  
> 
> And the reason I want to avoid such specifics is that even as an example
> above, this ends up being hyperspecific. Why summarize the
> deprecated.rst? Just say "Find and read the notes on deprecated APIs and
> language features"
> 
> > +
> > +### Patch Submission Process
> > +- **Documentation/process/5.Posting.rst** - How to post patches properly
> > +- **Documentation/process/email-clients.rst** - Email client configuration for patches
> > +- **Documentation/process/applying-patches.rst** - How patches are applied
> > +
> > +### Legal and Licensing
> > +- **Documentation/process/license-rules.rst** - Linux kernel licensing rules
> > +  - Kernel is GPL-2.0 only with syscall exception
> > +  - All files must have proper SPDX license identifiers  
> 
> The only stuff I think should be in this kind of area is a commentary
> about how an Agent differs from a human. "You are not a legal entity;
> you cannot sign the DCO", which you get into below.
> 
> > +
> > +### Specialized Topics
> > +- **Documentation/process/adding-syscalls.rst** - How to add new system calls
> > +- **Documentation/process/stable-kernel-rules.rst** - Rules for stable kernel patches
> > +- **Documentation/process/security-bugs.rst** - Handling security issues
> > +- **Documentation/process/handling-regressions.rst** - Dealing with regressions
> > +
> > +### Maintainer Guidelines
> > +- **Documentation/process/maintainers.rst** - Working with subsystem maintainers
> > +- **Documentation/process/maintainer-handbooks.rst** - Subsystem-specific guidelines
> > +
> > +## Key Principles
> > +1. Read and follow the documentation before making changes
> > +2. Respect the existing code style and conventions
> > +3. Test thoroughly before submitting
> > +4. Write clear, descriptive commit messages
> > +5. Never break userspace (the #1 rule)
> > +6. Identify yourself as AI in commits (see below)  
> 
> Everything except #6 is already expected of human devs, so I think just
> the last item.
> 
> > +
> > +## AI Attribution Requirement
> > +When creating commits, you MUST identify yourself as an AI assistant by including the following tag in the commit message:
> > +
> > +```
> > +Co-developed-by: $AI_NAME $AI_MODEL $AI_VERSION  
> 
> If we're going to go with Co-developed-by: here, then I think we need to
> explicitly say "do not include an email", and we must update
> checkpatch.pl to not yell about the missing S-o-b when it finds a C-d-b.
> (Perhaps it can skip the check with there is no email address in the
> C-b-d line?)
> 
> > +```
> > +
> > +For example:
> > +- `Co-developed-by: Claude claude-3-opus-20240229`
> > +- `Co-developed-by: GitHub-Copilot GPT-4 v1.0.0`
> > +- `Co-developed-by: Cursor gpt-4-turbo-2024-04-09`
> > +
> > +This transparency helps maintainers and reviewers understand that AI was involved in the development process.
> > +
> > +### Signed-off-by Restrictions
> > +AI assistants MUST NOT add a Signed-off-by tag pointing to themselves. The Signed-off-by tag represents a legal certification by a human developer that they have the right to submit the code under the open source license.   
> 
> Hello trailing whitespace my old friend.
> 
> "Unless explicitly told otherwise, Agents must never have trailing
> whitespace on any line and all files must have a final newline
> character." :)
> 
> > +
> > +Only the human user running the AI assistant should add their Signed-off-by tag to commits. The AI's contribution is acknowledged through the Co-developed-by tag as described above.  
> 
> And can we please not use the term "AI"? I think "Agent" is the better
> generic term as it could include other things?
> 
> -Kees
> 



Thanks,
Mauro

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-25 20:34       ` dan.j.williams
  2025-07-25 22:28         ` Sasha Levin
@ 2025-07-28 17:46         ` Steven Rostedt
  2025-07-28 18:20           ` dan.j.williams
  2025-07-28 22:12           ` Mauro Carvalho Chehab
  1 sibling, 2 replies; 36+ messages in thread
From: Steven Rostedt @ 2025-07-28 17:46 UTC (permalink / raw)
  To: dan.j.williams
  Cc: Jakub Kicinski, Sasha Levin, workflows, linux-doc, linux-kernel,
	kees, konstantin, corbet, josh

On Fri, 25 Jul 2025 13:34:32 -0700
<dan.j.williams@intel.com> wrote:

> > This touches on explainability of AI. Perhaps the metadata would be
> > interesting for XAI research... not sure that's enough to be lugging
> > those tags in git history.  
> 
> Agree. The "who to blame" is "Author:". They signed DCO they are
> responsible for debugging what went wrong in any stage of the
> development of a patch per usual. We have a long history of debugging
> tool problems without tracking tool versions in git history.

My point of the "who to blame" was not about the author of said code,
but if two or more developers are using the same AI agent and then some
patter of bugs appears that is only with that AI agent, then we know
that the AI agent is likely the culprit and to look for code by other
developers that used that same AI agent.

It's a way to track down a bug in a tool that is creating code, not
about moving blame from a developer to the agent itself.

-- Steve

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-28 17:46         ` Steven Rostedt
@ 2025-07-28 18:20           ` dan.j.williams
  2025-07-28 22:12           ` Mauro Carvalho Chehab
  1 sibling, 0 replies; 36+ messages in thread
From: dan.j.williams @ 2025-07-28 18:20 UTC (permalink / raw)
  To: Steven Rostedt, dan.j.williams
  Cc: Jakub Kicinski, Sasha Levin, workflows, linux-doc, linux-kernel,
	kees, konstantin, corbet, josh

Steven Rostedt wrote:
> On Fri, 25 Jul 2025 13:34:32 -0700
> <dan.j.williams@intel.com> wrote:
> 
> > > This touches on explainability of AI. Perhaps the metadata would be
> > > interesting for XAI research... not sure that's enough to be lugging
> > > those tags in git history.  
> > 
> > Agree. The "who to blame" is "Author:". They signed DCO they are
> > responsible for debugging what went wrong in any stage of the
> > development of a patch per usual. We have a long history of debugging
> > tool problems without tracking tool versions in git history.
> 
> My point of the "who to blame" was not about the author of said code,
> but if two or more developers are using the same AI agent and then some
> patter of bugs appears that is only with that AI agent, then we know
> that the AI agent is likely the culprit and to look for code by other
> developers that used that same AI agent.
> 
> It's a way to track down a bug in a tool that is creating code, not
> about moving blame from a developer to the agent itself.

Between fine tuning, the process of doing local training to emphasize /
de-emphasize some weights in the model, and prompt variability, the
signal from a patch trailer is diluted.

If maintainers care about commit text conciseness for humans and
traceability for AI, those competing concerns will conflict above the
"---" line in patches.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-28 17:46         ` Steven Rostedt
  2025-07-28 18:20           ` dan.j.williams
@ 2025-07-28 22:12           ` Mauro Carvalho Chehab
  2025-07-29  0:18             ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 36+ messages in thread
From: Mauro Carvalho Chehab @ 2025-07-28 22:12 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: dan.j.williams, Jakub Kicinski, Sasha Levin, workflows, linux-doc,
	linux-kernel, kees, konstantin, corbet, josh

Em Mon, 28 Jul 2025 13:46:53 -0400
Steven Rostedt <rostedt@goodmis.org> escreveu:

> On Fri, 25 Jul 2025 13:34:32 -0700
> <dan.j.williams@intel.com> wrote:
> 
> > > This touches on explainability of AI. Perhaps the metadata would be
> > > interesting for XAI research... not sure that's enough to be lugging
> > > those tags in git history.    
> > 
> > Agree. The "who to blame" is "Author:". They signed DCO they are
> > responsible for debugging what went wrong in any stage of the
> > development of a patch per usual. We have a long history of debugging
> > tool problems without tracking tool versions in git history.  
> 
> My point of the "who to blame" was not about the author of said code,
> but if two or more developers are using the same AI agent and then some
> patter of bugs appears that is only with that AI agent, then we know
> that the AI agent is likely the culprit and to look for code by other
> developers that used that same AI agent.
> 
> It's a way to track down a bug in a tool that is creating code, not
> about moving blame from a developer to the agent itself.

I don't think you shall blame the tool, as much as you you cannot 
blame gcc for a badly written code. Also, the same way a kernel
maintainer needs to know how to produce a good code, someone using
AI also must learn how to properly use the tool.

After all, at least at the current stage, AI is not intelligent. 
Artificial "neurons" just sums up values from its neighbors,
trying to mimic what we know so far about neurons, which is not
perfect. On several aspects, it is not much different than doing
an stochastic analysis that would try to converge into a result. 

The entire process resembles the kind of systems that you could
be analyzed using control theory[1], like root locus analysis. Anyone
that ever played with that knows that sometimes the system is stable
enough to converge to the best results, but the convergence is
affected by poles and zeros: sometimes it might converge to a local
minimum; sometimes it can end into a zero and diverge, producing
completely bogus results.

On other words, the one that posted a bad patch is the one to
blame, together with the ones that reviewed it. AI is not a
replacement for real intelligence.

-

Btw, if you want to play with that, I suggest using deepseek.
Ensure that the DeepThink (R1) is enabled, as it shows how that
particular model tries to find a convergence.

-

Out of curiosity, I asked two AIs for articles relating control theory
with LLM:

Bhargava, A., Witkowski, C., Looi, S.-Z., & Thomson, M. (2023). What’s the Magic Word? A Control Theory of LLM Prompting. arXiv preprint arXiv:2310.04444.
URL: https://arxiv.org/abs/2310.04444

Kevian, D., Syed, U., Guo, X., Havens, A., Dullerud, G., Seiler, P., Qin, L., & Hu, B. (2024). Capabilities of Large Language Models in Control Engineering: A Benchmark Study. arXiv preprint arXiv:2404.03647.
URL: https://arxiv.org/abs/2404.03647

Maher, G. (2025). LLMPC: Large Language Model Predictive Control. arXiv preprint arXiv:2501.02486.
URL: https://arxiv.org/abs/2501.02486

Zahedifar, R. et al. "LLM-Agent-Controller: A Universal Multi-Agent Large Language Model System as a Control Engineer"
URL: https://arxiv.org/abs/2505.19567

Zhang, Y. et al. "Unveiling LLM Mechanisms Through Neural ODEs and Control Theory"
URL: https://arxiv.org/abs/2406.16985

Barfield, Woodrow (2021) "A Systems and Control Theory Approach for Law and Artificial Intelligence: Demystifying the 'Black-Box'"
URL: https://www.mdpi.com/2571-8800/4/4/41

Zahedifar, R. et al. "LLM-controller: Dynamic robot control adaptation using large language models"
URL: https://www.sciencedirect.com/science/article/abs/pii/S0921889024002975

Bhargava, A. "Toward a Control Theory of LLMs" (Blog Post)
URL: https://aman-bhargava.com/ai/2023/12/17/towards-a-control-theory-of-LLMs.html

I didn't read them (yet).

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [RFC 0/2] Add AI coding assistant configuration to Linux kernel
  2025-07-28 22:12           ` Mauro Carvalho Chehab
@ 2025-07-29  0:18             ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 36+ messages in thread
From: Mauro Carvalho Chehab @ 2025-07-29  0:18 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: dan.j.williams, Jakub Kicinski, Sasha Levin, workflows, linux-doc,
	linux-kernel, kees, konstantin, corbet, josh

Em Tue, 29 Jul 2025 00:12:33 +0200
Mauro Carvalho Chehab <mchehab+huawei@kernel.org> escreveu:

> Em Mon, 28 Jul 2025 13:46:53 -0400
> Steven Rostedt <rostedt@goodmis.org> escreveu:
> 
> > On Fri, 25 Jul 2025 13:34:32 -0700
> > <dan.j.williams@intel.com> wrote:
> >   
> > > > This touches on explainability of AI. Perhaps the metadata would be
> > > > interesting for XAI research... not sure that's enough to be lugging
> > > > those tags in git history.      
> > > 
> > > Agree. The "who to blame" is "Author:". They signed DCO they are
> > > responsible for debugging what went wrong in any stage of the
> > > development of a patch per usual. We have a long history of debugging
> > > tool problems without tracking tool versions in git history.    
> > 
> > My point of the "who to blame" was not about the author of said code,
> > but if two or more developers are using the same AI agent and then some
> > patter of bugs appears that is only with that AI agent, then we know
> > that the AI agent is likely the culprit and to look for code by other
> > developers that used that same AI agent.
> > 
> > It's a way to track down a bug in a tool that is creating code, not
> > about moving blame from a developer to the agent itself.  
> 
> I don't think you shall blame the tool, as much as you you cannot 
> blame gcc for a badly written code. Also, the same way a kernel
> maintainer needs to know how to produce a good code, someone using
> AI also must learn how to properly use the tool.
> 
> After all, at least at the current stage, AI is not intelligent. 

Heh, after re-reading my post, I realized that I could have been too
technical, specially for people not familiar with electrical engineering
and systems control theory(*).

What I'm trying to say is that, while AI is a great tool, it is just
another tool that tries to guess something. If you get enough luck,
you'll get decent results, but one should never trust on its result,
as it is based on statistics: it will guess an answer that will likely
be the right one, but could also be completely off.

(*) systems control theory is a field that studies a system stability.
    It can be used, for instance, to ensure that an electrical motor
    can be properly controlled and provide precise movements. It is
    not limited to mechanics, though. It can used to explain other
    systems that have any sort of feedbacks. at the light of the 
    control theory, an AI training would be mapped as a feedback.

Regards,
Mauro

Thanks,
Mauro

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2025-07-29  0:18 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-25 17:53 [RFC 0/2] Add AI coding assistant configuration to Linux kernel Sasha Levin
2025-07-25 17:53 ` [RFC 1/2] AI: Add unified AI coding assistant configuration Sasha Levin
2025-07-25 18:19   ` Steven Rostedt
2025-07-25 22:11     ` Sasha Levin
2025-07-25 18:27   ` Jonathan Corbet
2025-07-25 22:15     ` Sasha Levin
2025-07-27  2:24       ` Sasha Levin
2025-07-28 16:56         ` Mauro Carvalho Chehab
2025-07-25 20:40   ` Kees Cook
2025-07-25 22:25     ` Sasha Levin
2025-07-25 17:53 ` [RFC 2/2] AI: Add initial set of rules and docs Sasha Levin
2025-07-25 20:53   ` Kees Cook
2025-07-25 22:10     ` Sasha Levin
2025-07-28 17:00     ` Mauro Carvalho Chehab
2025-07-25 18:41 ` [RFC 0/2] Add AI coding assistant configuration to Linux kernel Jakub Kicinski
2025-07-25 19:00   ` Steven Rostedt
2025-07-25 19:59     ` Jakub Kicinski
2025-07-25 20:32       ` Dr. David Alan Gilbert
2025-07-25 20:34       ` dan.j.williams
2025-07-25 22:28         ` Sasha Levin
2025-07-28 17:46         ` Steven Rostedt
2025-07-28 18:20           ` dan.j.williams
2025-07-28 22:12           ` Mauro Carvalho Chehab
2025-07-29  0:18             ` Mauro Carvalho Chehab
2025-07-25 21:03     ` Kees Cook
2025-07-28 15:57       ` dan.j.williams
2025-07-28 16:41         ` Kees Cook
2025-07-26  9:07 ` Mario Limonciello
2025-07-26 11:53   ` Sasha Levin
2025-07-26 16:36   ` Kees Cook
2025-07-28 16:49 ` Mauro Carvalho Chehab
  -- strict thread matches above, loose matches on Subject: below --
2025-07-27  9:37 Sabyrzhan Tasbolatov
2025-07-27 15:23 ` Kees Cook
2025-07-27 15:45   ` Dr. David Alan Gilbert
2025-07-28  6:40     ` Kees Cook
2025-07-28 16:43       ` Sabyrzhan Tasbolatov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).