Linux Documentation
* [PATCH RFC] coding-assistants: simplify attribution
@ 2026-07-01 15:54 Christian Brauner
  2026-07-01 16:08 ` Mark Brown
                   ` (6 more replies)
  0 siblings, 7 replies; 34+ messages in thread
From: Christian Brauner @ 2026-07-01 15:54 UTC (permalink / raw)
  To: Linus Torvalds, Jonathan Corbet
  Cc: Jens Axboe, David Hildenbrand, Jeff Layton, Vlastimil Babka,
	workflows, linux-doc, linux-kernel, linux-fsdevel,
	Christian Brauner (Amutable)

I remain very confused by our coding assistant contribution guidelines.
I'm going to be a bit polemical now, but this is seriously meant in good faith.

Why precisely do we require all this detailed information about what
specific coding assistant was used?

I find it very irritating that our git history has effectively started
to function a bit like a free advertising platform for a bunch of AI
companies and their proprietary agents and models.

And it remains unclear to me what exactly we do get out of this detailed
information: Do we want to run statistical analysis on what agent and
model is used the most and publish that on LWN at some point?

I acknowledge that my stance is even more radical: imho we should just
stop it with any disclosure requirements completely. It's useless imho.
We already see that other than core contributors most people don't care
and will just not disclose their usage of AI. I think this is entirely
pointless and worse it brings in undefined legal status as well. It's
not like recent events of pulling certain models from the face of the
earth have made this any less concerning.

But fine, if we want to do this can we please just dumb it down to

Assisted-by: LLM

or

Assisted-by: Coding Assistant

or something else. That still gives the "careful review" signal to
reviewers that want to pay special attention to LLM generated work while
avoiding this slew of metadata.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
 Documentation/process/coding-assistants.rst | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/Documentation/process/coding-assistants.rst b/Documentation/process/coding-assistants.rst
index 899f4459c52d..fe34f3e7e828 100644
--- a/Documentation/process/coding-assistants.rst
+++ b/Documentation/process/coding-assistants.rst
@@ -43,12 +43,8 @@ When AI tools contribute to kernel development, proper attribution
 helps track the evolving role of AI in the development process.
 Contributions should include an Assisted-by tag in the following format::
 
-  Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
+  Assisted-by: LLM [TOOL1] [TOOL2]
 
-Where:
-
-* ``AGENT_NAME`` is the name of the AI tool or framework
-* ``MODEL_VERSION`` is the specific model version used
 * ``[TOOL1] [TOOL2]`` are optional specialized analysis tools used
   (e.g., coccinelle, sparse, smatch, clang-tidy)
 
@@ -56,4 +52,4 @@ Basic development tools (git, gcc, make, editors) should not be listed.
 
 Example::
 
-  Assisted-by: Claude:claude-3-opus coccinelle sparse
+  Assisted-by: LLM coccinelle sparse

---
base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
change-id: 20260701-work-coding-assistants-650ae1202ee0


^ permalink raw reply related	[flat|nested] 34+ messages in thread

* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
@ 2026-07-01 16:08 ` Mark Brown
  2026-07-02  7:10   ` Christian Brauner
  2026-07-01 16:08 ` Jonathan Corbet
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 34+ messages in thread
From: Mark Brown @ 2026-07-01 16:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Jonathan Corbet, Jens Axboe, David Hildenbrand,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

On Wed, Jul 01, 2026 at 05:54:48PM +0200, Christian Brauner wrote:

> And it remains unclear to me what exactly we do get out of this detailed
> information: Do we want to run statistical analysis on what agent and
> model is used the most and publish that on LWN at some point?

IIRC it was literally this, have people mention which tools they used so
we can use that to inform our assessment of the patches.  I'm not sure
the differences we're actually seeing are tool based rather than
operator skill though.


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
  2026-07-01 16:08 ` Mark Brown
@ 2026-07-01 16:08 ` Jonathan Corbet
  2026-07-01 16:12   ` David Hildenbrand (Arm)
                     ` (2 more replies)
  2026-07-01 16:10 ` David Hildenbrand (Arm)
                   ` (4 subsequent siblings)
  6 siblings, 3 replies; 34+ messages in thread
From: Jonathan Corbet @ 2026-07-01 16:08 UTC (permalink / raw)
  To: Christian Brauner, Linus Torvalds
  Cc: Jens Axboe, David Hildenbrand, Jeff Layton, Vlastimil Babka,
	workflows, linux-doc, linux-kernel, linux-fsdevel,
	Christian Brauner (Amutable)

Christian Brauner <brauner@kernel.org> writes:

> I remain very confused by our coding assistant contribution guidelines.
> I'm going to be a bit polemical now, but this is seriously meant in good faith.
>
> Why precisely do we require all this detailed information about what
> specific coding assistant was used?

From my memory of the discussions:

- If a specific LLM turns out to be in a bad position with regard to
  some copyright ruling, we can identify the commits that might have
  been tainted by it.

- Similarly should an LLM prove to have an inclination toward specific
  types of security issues.

Whether either of these would ever actually prove useful is not
something I can hazard a guess for.

> I find it very irritating that our git history has effectively started
> to function a bit like a free advertising platform for a bunch of AI
> companies and their proprietary agents and models.
>
> And it remains unclear to me what exactly we do get out of this detailed
> information: Do we want to run statistical analysis on what agent and
> model is used the most and publish that on LWN at some point?

...wasn't in my plans ...

> I acknowledge that my stance is even more radical: imho we should just
> stop it with any disclosure requirements completely. It's useless imho.
> We already see that other than core contributors most people don't care
> and will just not disclose their usage of AI.

The widespread ignoring of the disclosure rule is, IMO, something we
need to address somehow; were I still on the TAB, I'd be raising the
issue there.  Either we find a way to be serious about enforcing the
disclosure rule, or we should just drop it.  A rule that everybody
ignores is less than useful.

(That said, 706 commits in 7.2-rc1 include Assisted-by tags, so *some*
people are complying.  That's about 5% of the total.  What do we think
is the actual use of LLMs for the creation of kernel patches?)
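
A share like the one above can be measured mechanically. The following is a minimal Python sketch, assuming the commit messages have already been collected as plain strings (for instance from git log output); the function name `tagged_share` and the sample messages are hypothetical and do not come from this thread.

```python
import re

# Matches an Assisted-by trailer at the start of any line of a commit message.
TAG = re.compile(r"^Assisted-by:", re.MULTILINE)


def tagged_share(messages):
    """Return (tagged, total, percent) for an iterable of commit messages."""
    messages = list(messages)
    tagged = sum(1 for message in messages if TAG.search(message))
    total = len(messages)
    percent = 100.0 * tagged / total if total else 0.0
    return tagged, total, percent


# Hypothetical sample: one tagged commit out of four.
sample = [
    "fs: fix foo\n\nAssisted-by: LLM\nSigned-off-by: A <a@example.org>\n",
    "mm: fix bar\n\nSigned-off-by: B <b@example.org>\n",
    "net: fix baz\n\nSigned-off-by: C <c@example.org>\n",
    "doc: fix typo\n\nSigned-off-by: D <d@example.org>\n",
]
print(tagged_share(sample))
```

Taken at face value, 706 tagged commits at roughly 5% would imply a total on the order of 14,000 commits; that total is derived here from the two quoted figures and is not stated in the message.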

jon


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
  2026-07-01 16:08 ` Mark Brown
  2026-07-01 16:08 ` Jonathan Corbet
@ 2026-07-01 16:10 ` David Hildenbrand (Arm)
  2026-07-02  7:27   ` Christian Brauner
  2026-07-02  9:24   ` Lorenzo Stoakes
  2026-07-01 18:35 ` Jeff Layton
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 34+ messages in thread
From: David Hildenbrand (Arm) @ 2026-07-01 16:10 UTC (permalink / raw)
  To: Christian Brauner, Linus Torvalds, Jonathan Corbet
  Cc: Jens Axboe, Jeff Layton, Vlastimil Babka, workflows, linux-doc,
	linux-kernel, linux-fsdevel

On 7/1/26 17:54, Christian Brauner wrote:
> I remain very confused by our coding assistant contribution guidelines.
> I'm going to be a bit polemical now, but this is seriously meant in good faith.
> 
> Why precisely do we require all this detailed information about what
> specific coding assistant was used?
> 
> I find it very irritating that our git history has effectively started
> to function a bit like a free advertising platform for a bunch of AI
> companies and their proprietary agents and models.
> 
> And it remains unclear to me what exactly we do get out of this detailed
> information: Do we want to run statistical analysis on what agent and
> model is used the most and publish that on LWN at some point?
> 
> I acknowledge that my stance is even more radical: imho we should just
> stop it with any disclosure requirements completely. It's useless imho.
> We already see that other than core contributors most people don't care
> and will just not disclose their usage of AI. I think this is entirely
> pointless and worse it brings in undefined legal status as well. It's
> not like recent events of pulling certain models from the face of the
> earth have made this any less concerning.
> 
> But fine, if we want to do this can we please just dumb it down to
> 
> Assisted-by: LLM
> 
> or
> 
> Assisted-by: Coding Assistant

I'd prefer this.

The doc states "proper attribution helps track the evolving role of AI in the
development process". If there is another reason why we need the free
advertisement, we should document it.

Side note: if someone instructs an LLM exactly what to do, and would have
achieved the same thing by just typing it in, the tag is not at all helpful
to me (similarly, "Assisted-by: vim" would not be helpful).

What would be much more relevant to know is to which degree LLMs were used.

Assisted-by: LLM # translate commit message
Assisted-by: LLM # generate some test cases
Assisted-by: LLM # cleanup logic
Assisted-by: LLM # everything and I have no clue what anything in here does

I thought we ask for that in some document, but couldn't immediately find it
(and nobody does that).
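
To make the annotated form above concrete, here is a minimal Python sketch of how such a line could be split into its parts. It assumes the layout `Assisted-by: LLM [TOOL1] [TOOL2] # free-text note`, which combines the format from the patch with the `#` comments suggested above; no such grammar is defined anywhere in this thread, and the names `AssistedBy` and `parse_assisted_by` are hypothetical.

```python
from typing import List, NamedTuple, Optional


class AssistedBy(NamedTuple):
    assistant: str        # first field after the tag, e.g. "LLM"
    tools: List[str]      # optional analysis tools, e.g. ["coccinelle", "sparse"]
    note: Optional[str]   # free-text note after '#', or None if absent


def parse_assisted_by(line: str) -> Optional[AssistedBy]:
    """Parse one trailer line; return None if it is not an Assisted-by tag."""
    prefix = "Assisted-by:"
    if not line.startswith(prefix):
        return None
    # Everything after the first '#' is treated as the usage note.
    body, _, note = line[len(prefix):].partition("#")
    fields = body.split()
    if not fields:
        return None
    return AssistedBy(fields[0], fields[1:], note.strip() or None)


print(parse_assisted_by("Assisted-by: LLM # generate some test cases"))
print(parse_assisted_by("Assisted-by: LLM coccinelle sparse"))
```

One limitation of splitting on whitespace: a two-word value such as "Coding Assistant" would be read as an assistant named "Coding" plus a tool named "Assistant", so a single-token value like "LLM" is easier to handle mechanically.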

-- 
Cheers,

David


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 16:08 ` Jonathan Corbet
@ 2026-07-01 16:12   ` David Hildenbrand (Arm)
  2026-07-02  7:11   ` Christian Brauner
  2026-07-02  9:51   ` David Disseldorp
  2 siblings, 0 replies; 34+ messages in thread
From: David Hildenbrand (Arm) @ 2026-07-01 16:12 UTC (permalink / raw)
  To: Jonathan Corbet, Christian Brauner, Linus Torvalds
  Cc: Jens Axboe, Jeff Layton, Vlastimil Babka, workflows, linux-doc,
	linux-kernel, linux-fsdevel

>> I acknowledge that my stance is even more radical: imho we should just
>> stop it with any disclosure requirements completely. It's useless imho.
>> We already see that other than core contributors most people don't care
>> and will just not disclose their usage of AI.
> 
> The widespread ignoring of the disclosure rule is, IMO, something we
> need to address somehow; were I still on the TAB, I'd be raising the
> issue there.  Either we find a way to be serious about enforcing the
> disclosure rule, or we should just drop it.  A rule that everybody
> ignores is less than useful.

Note. :)

We do have people using it, and at least in MM, when we suspect AI usage and get
confirmation, we tell people to use proper tags. (This does not happen often,
IIRC.)

-- 
Cheers,

David


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
                   ` (2 preceding siblings ...)
  2026-07-01 16:10 ` David Hildenbrand (Arm)
@ 2026-07-01 18:35 ` Jeff Layton
  2026-07-01 18:53   ` Jakub Kicinski
  2026-07-02  7:28   ` Christian Brauner
  2026-07-02  8:12 ` Jori Koolstra
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 34+ messages in thread
From: Jeff Layton @ 2026-07-01 18:35 UTC (permalink / raw)
  To: Christian Brauner, Linus Torvalds, Jonathan Corbet
  Cc: Jens Axboe, David Hildenbrand, Vlastimil Babka, workflows,
	linux-doc, linux-kernel, linux-fsdevel

On Wed, 2026-07-01 at 17:54 +0200, Christian Brauner wrote:
> I remain very confused by our coding assistant contribution guidelines.
> I'm going to be a bit polemical now, but this is seriously meant in good faith.
> 
> Why precisely do we require all this detailed information about what
> specific coding assistant was used?
> 
> I find it very irritating that our git history has effectively started
> to function a bit like a free advertising platform for a bunch of AI
> companies and their proprietary agents and models.
> 
> And it remains unclear to me what exactly we do get out of this detailed
> information: Do we want to run statistical analysis on what agent and
> model is used the most and publish that on LWN at some point?
> 
> I acknowledge that my stance is even more radical: imho we should just
> stop it with any disclosure requirements completely. It's useless imho.
> We already see that other than core contributors most people don't care
> and will just not disclose their usage of AI. I think this is entirely
> pointless and worse it brings in undefined legal status as well. It's
> not like recent events of pulling certain models from the face of the
> earth have made this any less concerning.
> 
> But fine, if we want to do this can we please just dumb it down to
> 
> Assisted-by: LLM
> 
> or
> 
> Assisted-by: Coding Assistant
> 
> or something else. That still gives the "careful review" signal to
> reviewers that want to pay special attention to LLM generated work while
> avoiding this slew of metadata.
> 
> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
> ---
>  Documentation/process/coding-assistants.rst | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/process/coding-assistants.rst b/Documentation/process/coding-assistants.rst
> index 899f4459c52d..fe34f3e7e828 100644
> --- a/Documentation/process/coding-assistants.rst
> +++ b/Documentation/process/coding-assistants.rst
> @@ -43,12 +43,8 @@ When AI tools contribute to kernel development, proper attribution
>  helps track the evolving role of AI in the development process.
>  Contributions should include an Assisted-by tag in the following format::
>  
> -  Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
> +  Assisted-by: LLM [TOOL1] [TOOL2]
>  
> -Where:
> -
> -* ``AGENT_NAME`` is the name of the AI tool or framework
> -* ``MODEL_VERSION`` is the specific model version used
>  * ``[TOOL1] [TOOL2]`` are optional specialized analysis tools used
>    (e.g., coccinelle, sparse, smatch, clang-tidy)
>  
> @@ -56,4 +52,4 @@ Basic development tools (git, gcc, make, editors) should not be listed.
>  
>  Example::
>  
> -  Assisted-by: Claude:claude-3-opus coccinelle sparse
> +  Assisted-by: LLM coccinelle sparse
> 
> ---
> base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
> change-id: 20260701-work-coding-assistants-650ae1202ee0


In general, collecting data for nebulous purposes usually turns out to
be a bad idea. If we're not 100% clear on why we want this data, then
we're probably better off not collecting it at all.

With that in mind: if we're going to water down the tag, then I say
just remove the requirement altogether. If we later decide that we want
to start collecting more detailed info for some (clear) purpose then we
can revisit the idea.
-- 
Jeff Layton <jlayton@kernel.org>


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 18:35 ` Jeff Layton
@ 2026-07-01 18:53   ` Jakub Kicinski
  2026-07-02  7:29     ` Christian Brauner
  2026-07-02  7:28   ` Christian Brauner
  1 sibling, 1 reply; 34+ messages in thread
From: Jakub Kicinski @ 2026-07-01 18:53 UTC (permalink / raw)
  To: Jeff Layton, Christian Brauner
  Cc: Linus Torvalds, Jonathan Corbet, Jens Axboe, David Hildenbrand,
	Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

On Wed, 01 Jul 2026 14:35:08 -0400 Jeff Layton wrote:
> On Wed, 2026-07-01 at 17:54 +0200, Christian Brauner wrote:
> > I remain very confused by our coding assistant contribution guidelines.
> > I'm going to be a bit polemical now, but this is seriously meant in good faith.
> > 
> > Why precisely do we require all this detailed information about what
> > specific coding assistant was used?
> > 
> > I find it very irritating that our git history has effectively started
> > to function a bit like a free advertising platform for a bunch of AI
> > companies and their proprietary agents and models.

FWIW, this is exactly how I feel. I added a regex to strip these in
my git hooks. So at least the net/ history should be ads-free 🤷️
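
For readers wondering what such a filter might look like, here is a minimal Python sketch. It is an illustration only: the actual regex and hook used for net/ are not shown in this thread, so the pattern and the function name `strip_assisted_by` are assumptions.

```python
import re

# Hypothetical pattern: drop every line that starts with "Assisted-by:",
# together with its trailing newline, anywhere in the commit message.
ASSISTED_BY = re.compile(r"^Assisted-by:.*\n?", re.IGNORECASE | re.MULTILINE)


def strip_assisted_by(message: str) -> str:
    """Return the commit message with all Assisted-by trailer lines removed."""
    return ASSISTED_BY.sub("", message)


# Hypothetical commit message used only to exercise the function.
example = (
    "net: fix foo\n"
    "\n"
    "Explain the bug and its user impact here.\n"
    "\n"
    "Assisted-by: LLM coccinelle sparse\n"
    "Signed-off-by: A <a@example.org>\n"
)
print(strip_assisted_by(example))
```

A function like this could be called from whatever script rewrites the message when patches are applied; how it is wired into a given maintainer workflow is left open here.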

Inexperienced developers who just trust the LLM output, and therefore
are the group where the tags would be most useful, tend not to add
them. Either because they are ashamed or because they want full credit.
This correlation kills the utility of the tag.

> > And it remains unclear to me what exactly we do get out of this detailed
> > information: Do we want to run statistical analysis on what agent and
> > model is used the most and publish that on LWN at some point?
> > 
> > I acknowledge that my stance is even more radical: imho we should just
> > stop it with any disclosure requirements completely. It's useless imho.
> > We already see that other than core contributors most people don't care
> > and will just not disclose their usage of AI. I think this is entirely
> > pointless and worse it brings in undefined legal status as well. It's
> > not like recent events of pulling certain models from the face of the
> > earth have made this any less concerning.
> > 
> > But fine, if we want to do this can we please just dumb it down to
> > 
> > Assisted-by: LLM
> > 
> > or
> > 
> > Assisted-by: Coding Assistant
> > 
> > or something else. That still gives the "careful review" signal to
> > reviewers that want to pay special attention to LLM generated work while
> > avoiding this slew of metadata.
> > 
> > Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
> > ---
> >  Documentation/process/coding-assistants.rst | 8 ++------
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> > 
> > diff --git a/Documentation/process/coding-assistants.rst b/Documentation/process/coding-assistants.rst
> > index 899f4459c52d..fe34f3e7e828 100644
> > --- a/Documentation/process/coding-assistants.rst
> > +++ b/Documentation/process/coding-assistants.rst
> > @@ -43,12 +43,8 @@ When AI tools contribute to kernel development, proper attribution
> >  helps track the evolving role of AI in the development process.
> >  Contributions should include an Assisted-by tag in the following format::
> >  
> > -  Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
> > +  Assisted-by: LLM [TOOL1] [TOOL2]
> >  
> > -Where:
> > -
> > -* ``AGENT_NAME`` is the name of the AI tool or framework
> > -* ``MODEL_VERSION`` is the specific model version used
> >  * ``[TOOL1] [TOOL2]`` are optional specialized analysis tools used
> >    (e.g., coccinelle, sparse, smatch, clang-tidy)
> >  
> > @@ -56,4 +52,4 @@ Basic development tools (git, gcc, make, editors) should not be listed.
> >  
> >  Example::
> >  
> > -  Assisted-by: Claude:claude-3-opus coccinelle sparse
> > +  Assisted-by: LLM coccinelle sparse
> > 
> > ---
> > base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
> > change-id: 20260701-work-coding-assistants-650ae1202ee0  
> 
> 
> In general, collecting data for nebulous purposes usually turns out to
> be a bad idea. If we're not 100% clear on why we want this data, then
> we're probably better off not collecting it at all.
> 
> With that in mind: if we're going to water down the tag, then I say
> just remove the requirement altogether. If we later decide that we want
> to start collecting more detailed info for some (clear) purpose then we
> can revisit the idea.

+1

Honestly even tool attribution feels increasingly moot.
People vibe code tools and AI-in-the-loop pipelines which they never
publish. Open source tools are (hopefully?) used in pre-commit
pipelines, so they have the "kbuild bot problem" of problems getting
fixed before the code is merged. And we have the same free advertising
problem for the rest.

It's 100 times more important to drill into people to provide sufficient
information in plain English. How was the bug discovered, has it been
triggered / proven and how, what is the user impact. I wonder if
inventing tags distracts contributors from what really matters.


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 16:08 ` Mark Brown
@ 2026-07-02  7:10   ` Christian Brauner
  2026-07-02 11:35     ` Mark Brown
  0 siblings, 1 reply; 34+ messages in thread
From: Christian Brauner @ 2026-07-02  7:10 UTC (permalink / raw)
  To: Mark Brown
  Cc: Christian Brauner, Linus Torvalds, Jonathan Corbet, Jens Axboe,
	David Hildenbrand, Jeff Layton, Vlastimil Babka, workflows,
	linux-doc, linux-kernel, linux-fsdevel

On 2026-07-01 17:08 +0100, Mark Brown wrote:
> On Wed, Jul 01, 2026 at 05:54:48PM +0200, Christian Brauner wrote:
> 
> > And it remains unclear to me what exactly we do get out of this detailed
> > information: Do we want to run statistical analysis on what agent and
> > model is used the most and publish that on LWN at some point?
> 
> IIRC it was literally this, have people mention which tools they used so
> we can use that to inform our assessment of the patches.  I'm not sure

Forgive my candor but I think that is just useless for us. It's
certainly useful for AI company statistics. If we want to provide that
service I would recommend we start charging. ;)



* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 16:08 ` Jonathan Corbet
  2026-07-01 16:12   ` David Hildenbrand (Arm)
@ 2026-07-02  7:11   ` Christian Brauner
  2026-07-02  9:51   ` David Disseldorp
  2 siblings, 0 replies; 34+ messages in thread
From: Christian Brauner @ 2026-07-02  7:11 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Christian Brauner, Linus Torvalds, Jens Axboe, David Hildenbrand,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

> disclosure rule, or we should just drop it.

I agree.



* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 16:10 ` David Hildenbrand (Arm)
@ 2026-07-02  7:27   ` Christian Brauner
  2026-07-02  7:46     ` David Hildenbrand (Arm)
  2026-07-02  8:08     ` Laurent Pinchart
  2026-07-02  9:24   ` Lorenzo Stoakes
  1 sibling, 2 replies; 34+ messages in thread
From: Christian Brauner @ 2026-07-02  7:27 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Christian Brauner, Linus Torvalds, Jonathan Corbet, Jens Axboe,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

> What would be much more relevant to know is to which degree LLMs were used.
> 
> Assisted-by: LLM # translate commit message
> Assisted-by: LLM # generate some test cases
> Assisted-by: LLM # cleanup logic
> Assisted-by: LLM # everything and I have no clue what anything in here does

I think we should just drop any attribution as a general kernel-wide
rule and let subsystems require them as needed. Then you can have all
the complexity in mm for this that you think is needed for your
workflow to function. This is precisely what the subsystem profiles are
for. So maybe just add:

Documentation/process/maintainer-mm.rst

alongside

Documentation/process/maintainer-{tip,netdev,x86}.rst

and lay down the rules that you require for LLM based submissions in
whatever detail you need.

I don't see how this additional commentary you want would ever be
enforced consistently across the kernel or who would even enforce it. I
don't need more bureaucracy to chase after people in my subsystems tbh.

The other thing is that I think this Assisted-by annotation is just
noise in the changelog. If you want to know in detail what an LLM was
used for when generating the patch it's mostly a signal for how
"intense" of a review this will get afaict (already questionable imho
but sure that's just something to disagree on).

If the information is mostly useful during review then I still would
question why it has to end up in our git logs. It's completely
irrelevant information imho.

> I thought we ask for that in some document, but couldn't immediately find it
> (and nobody does that).



* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 18:35 ` Jeff Layton
  2026-07-01 18:53   ` Jakub Kicinski
@ 2026-07-02  7:28   ` Christian Brauner
  1 sibling, 0 replies; 34+ messages in thread
From: Christian Brauner @ 2026-07-02  7:28 UTC (permalink / raw)
  To: Jeff Layton
  Cc: Christian Brauner, Linus Torvalds, Jonathan Corbet, Jens Axboe,
	David Hildenbrand, Vlastimil Babka, workflows, linux-doc,
	linux-kernel, linux-fsdevel

On 2026-07-01 14:35 -0400, Jeff Layton wrote:
> On Wed, 2026-07-01 at 17:54 +0200, Christian Brauner wrote:
> > I remain very confused by our coding assistant contribution guidelines.
> > I'm going to be a bit polemical now, but this is seriously meant in good faith.
> > 
> > Why precisely do we require all this detailed information about what
> > specific coding assistant was used?
> > 
> > I find it very irritating that our git history has effectively started
> > to function a bit like a free advertising platform for a bunch of AI
> > companies and their proprietary agents and models.
> > 
> > And it remains unclear to me what exactly we do get out of this detailed
> > information: Do we want to run statistical analysis on what agent and
> > model is used the most and publish that on LWN at some point?
> > 
> > I acknowledge that my stance is even more radical: imho we should just
> > stop it with any disclosure requirements completely. It's useless imho.
> > We already see that other than core contributors most people don't care
> > and will just not disclose their usage of AI. I think this is entirely
> > pointless and worse it brings in undefined legal status as well. It's
> > not like recent events of pulling certain models from the face of the
> > earth have made this any less concerning.
> > 
> > But fine, if we want to do this can we please just dumb it down to
> > 
> > Assisted-by: LLM
> > 
> > or
> > 
> > Assisted-by: Coding Assistant
> > 
> > or something else. That still gives the "careful review" signal to
> > reviewers that want to pay special attention to LLM generated work while
> > avoiding this slew of metadata.
> > 
> > Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
> > ---
> >  Documentation/process/coding-assistants.rst | 8 ++------
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> > 
> > diff --git a/Documentation/process/coding-assistants.rst b/Documentation/process/coding-assistants.rst
> > index 899f4459c52d..fe34f3e7e828 100644
> > --- a/Documentation/process/coding-assistants.rst
> > +++ b/Documentation/process/coding-assistants.rst
> > @@ -43,12 +43,8 @@ When AI tools contribute to kernel development, proper attribution
> >  helps track the evolving role of AI in the development process.
> >  Contributions should include an Assisted-by tag in the following format::
> >  
> > -  Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
> > +  Assisted-by: LLM [TOOL1] [TOOL2]
> >  
> > -Where:
> > -
> > -* ``AGENT_NAME`` is the name of the AI tool or framework
> > -* ``MODEL_VERSION`` is the specific model version used
> >  * ``[TOOL1] [TOOL2]`` are optional specialized analysis tools used
> >    (e.g., coccinelle, sparse, smatch, clang-tidy)
> >  
> > @@ -56,4 +52,4 @@ Basic development tools (git, gcc, make, editors) should not be listed.
> >  
> >  Example::
> >  
> > -  Assisted-by: Claude:claude-3-opus coccinelle sparse
> > +  Assisted-by: LLM coccinelle sparse
> > 
> > ---
> > base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
> > change-id: 20260701-work-coding-assistants-650ae1202ee0
> 
> 
> In general, collecting data for nebulous purposes usually turns out to
> be a bad idea. If we're not 100% clear on why we want this data, then
> we're probably better off not collecting it at all.

Agreed.

> With that in mind: if we're going to water down the tag, then I say
> just remove the requirement altogether. If we later decide that we want
> to start collecting more detailed info for some (clear) purpose then we
> can revisit the idea.

Agreed.



* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 18:53   ` Jakub Kicinski
@ 2026-07-02  7:29     ` Christian Brauner
  0 siblings, 0 replies; 34+ messages in thread
From: Christian Brauner @ 2026-07-02  7:29 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jeff Layton, Christian Brauner, Linus Torvalds, Jonathan Corbet,
	Jens Axboe, David Hildenbrand, Vlastimil Babka, workflows,
	linux-doc, linux-kernel, linux-fsdevel

On 2026-07-01 11:53 -0700, Jakub Kicinski wrote:
> On Wed, 01 Jul 2026 14:35:08 -0400 Jeff Layton wrote:
> > On Wed, 2026-07-01 at 17:54 +0200, Christian Brauner wrote:
> > > I remain very confused by our coding assistant contribution guidelines.
> > > I'm going to be a bit polemical now, but this is seriously meant in good faith.
> > > 
> > > Why precisely do we require all this detailed information about what
> > > specific coding assistant was used?
> > > 
> > > I find it very irritating that our git history has effectively started
> > > to function a bit like a free advertising platform for a bunch of AI
> > > companies and their proprietary agents and models.
> 
> FWIW, this is exactly how I feel. I added a regex to strip these in
> my git hooks. So at least the net/ history should be ads-free 🤷️

Ah, that's good to know. I've been rewriting them to "LLM" but I might
just start doing what netdev does.

> > In general, collecting data for nebulous purposes usually turns out to
> > be a bad idea. If we're not 100% clear on why we want this data, then
> > we're probably better off not collecting it at all.
> > 
> > With that in mind: if we're going to water down the tag, then I say
> > just remove the requirement altogether. If we later decide that we want
> > to start collecting more detailed info for some (clear) purpose then we
> > can revisit the idea.
> 
> +1
> 
> Honestly even tool attribution feels increasingly moot.
> People vibe code tools and AI-in-the-loop pipelines which they never
> publish. Open source tools are (hopefully?) used in pre-commit
> pipelines, so they have the "kbuild bot problem" of problems getting
> fixed before the code is merged. And we have the same free advertising
> problem for the rest.

Agreed.



* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  7:27   ` Christian Brauner
@ 2026-07-02  7:46     ` David Hildenbrand (Arm)
  2026-07-02  8:10       ` Laurent Pinchart
  2026-07-02 10:04       ` Lorenzo Stoakes
  2026-07-02  8:08     ` Laurent Pinchart
  1 sibling, 2 replies; 34+ messages in thread
From: David Hildenbrand (Arm) @ 2026-07-02  7:46 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Jonathan Corbet, Jens Axboe, Jeff Layton,
	Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel, Lorenzo Stoakes (Oracle)

On 7/2/26 09:27, Christian Brauner wrote:
>> What would be much more relevant to know is to which degree LLMs were used.
>>
>> Assisted-by: LLM # translate commit message
>> Assisted-by: LLM # generate some test cases
>> Assisted-by: LLM # cleanup logic
>> Assisted-by: LLM # everything and I have no clue what anything in here does
> 
> I think we should just drop any attribution as a general kernel-wide
> rule and let subsystems require them as needed. Then you can have all
> the complexity in mm for this that you think is needed for your
> workflow to function. This is precisely what the subsystem profiles are
> for. So maybe just add:
> 
> Documentation/process/maintainer-mm.rst
> 
> alongside
> 
> Documentation/process/maintainer-{tip,netdev,x86}.rst
> 
> and lay down the rules that you require for LLM based submissions in
> whatever detail you need.

I'm not really sure if having (more?) subsystem-specific tags is the way to go.
(below)

So either we find a very simple, kernel-wide rule for such tags, or we drop them
entirely.

> 
> I don't see how this additional commentary you want would ever be
> enforced consistently across the kernel or who would even enforce it. I
> don't need more beaurocracy to chase after people in my subsystems tbh.

That's certainly a good thing to discuss. (below)

> 
> The other thing is that I think this Assisted-by annotation is just
> noise in the changelog. If you want to know in detail what an LLM was
> used for when generating the patch it's mostly a signal for how
> "intense" of a review this will get afaict (already questionable imho
> but sure that's just something to disagree on).

I'd be happy to just have such information in the cover letter, without any
tags. Having subsystem-specific rules on that disclosure might be more
reasonable.

I agree on the "enforce" aspect. It's impossible, but it's still easy to catch
people using AI irresponsibly today ... and that's what we care about. Not
people that know what they are doing using AI responsibly.

> 
> If the information is mostly useful during review then I still would
> question why it has to end up in our git logs. It's completely
> irrelevant information imho.

Fully agreed. In the tree it's irrelevant.

-- 
Cheers,

David


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  7:27   ` Christian Brauner
  2026-07-02  7:46     ` David Hildenbrand (Arm)
@ 2026-07-02  8:08     ` Laurent Pinchart
  2026-07-02  8:28       ` Christian Brauner
  1 sibling, 1 reply; 34+ messages in thread
From: Laurent Pinchart @ 2026-07-02  8:08 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Hildenbrand (Arm), Linus Torvalds, Jonathan Corbet,
	Jens Axboe, Jeff Layton, Vlastimil Babka, workflows, linux-doc,
	linux-kernel, linux-fsdevel

On Thu, Jul 02, 2026 at 09:27:37AM +0200, Christian Brauner wrote:
> > What would be much more relevant to know is to which degree LLMs were used.
> > 
> > Assisted-by: LLM # translate commit message
> > Assisted-by: LLM # generate some test cases
> > Assisted-by: LLM # cleanup logic
> > Assisted-by: LLM # everything and I have no clue what any in here does
> 
> I think we should just drop any attribution as a general kernel-wide
> rule and let subsystems require them as needed. Then you can have all
> the complexity in mm for this that you think is needed for your
> workflow to function. This is precisely what the subsystem profiles are
> for. So maybe just add:
> 
> Documentation/process/maintainer-mm.rst
> 
> alongside
> 
> Documentation/process/maintainer-{tip,netdev,x86}.rst
> 
> and lay down the rules that you require for LLM based submissions in
> whatever detail you need.
> 
> I don't see how this additional commentary you want would ever be
> enforced consistently across the kernel or who would even enforce it. I
> don't need more beaurocracy to chase after people in my subsystems tbh.
> 
> The other thing is that I think this Assisted-by annotation is just
> noise in the changelog. If you want to know in detail what an LLM was
> used for when generating the patch it's mostly a signal for how
> "intense" of a review this will get afaict (already questionable imho
> but sure that's just something to disagree on).
> 
> If the information is mostly useful during review then I still would
> question why it has to end up in our git logs. It's completely
> irrelevant information imho.

Food for thought, the Kubernetes project has published a disclosure
policy ([1], reported by LWN.net at [2], with a blog post explaining it
at [3]). Quoting LWN.net,

"Of note, the project requires disclosure when AI tools have been used
to assist in the creation of a contribution but forbids the use of
listing AI as a co-author or including "assisted-by" or "co-developed"
trailers to attribute work to an LLM tool."

I personally don't see a lot of value in the Assisted-by trailer, but I
would like the submitter to include the information in a place that
doesn't end up in the git commit history (cover letter or below the ---
line).

[1] https://www.kubernetes.dev/docs/guide/pull-requests/#ai-guidance
[2] https://lwn.net/Articles/1080144/
[3] https://kubernetes.io/blog/2026/06/26/open-source-maintainership-in-the-age-of-ai/

> > I thought we ask for that in some document, but couldn't immediately find it
> > (and nobody does that).

-- 
Regards,

Laurent Pinchart


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  7:46     ` David Hildenbrand (Arm)
@ 2026-07-02  8:10       ` Laurent Pinchart
  2026-07-02  8:16         ` David Hildenbrand (Arm)
  2026-07-02 10:04       ` Lorenzo Stoakes
  1 sibling, 1 reply; 34+ messages in thread
From: Laurent Pinchart @ 2026-07-02  8:10 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Christian Brauner, Linus Torvalds, Jonathan Corbet, Jens Axboe,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel, Lorenzo Stoakes (Oracle)

On Thu, Jul 02, 2026 at 09:46:37AM +0200, David Hildenbrand (Arm) wrote:
> On 7/2/26 09:27, Christian Brauner wrote:
> >> What would be much more relevant to know is to which degree LLMs were used.
> >>
> >> Assisted-by: LLM # translate commit message
> >> Assisted-by: LLM # generate some test cases
> >> Assisted-by: LLM # cleanup logic
> >> Assisted-by: LLM # everything and I have no clue what any in here does
> > 
> > I think we should just drop any attribution as a general kernel-wide
> > rule and let subsystems require them as needed. Then you can have all
> > the complexity in mm for this that you think is needed for your
> > workflow to function. This is precisely what the subsystem profiles are
> > for. So maybe just add:
> > 
> > Documentation/process/maintainer-mm.rst
> > 
> > alongside
> > 
> > Documentation/process/maintainer-{tip,netdev,x86}.rst
> > 
> > and lay down the rules that you require for LLM based submissions in
> > whatever detail you need.
> 
> I'm not really sure if having (more?) subsystem-specific tags is the way to go.
> (below)
> 
> So either we find a very simple, kernel-wide rule for such tags, or we drop them
> entirely.
> 
> > 
> > I don't see how this additional commentary you want would ever be
> > enforced consistently across the kernel or who would even enforce it. I
> > don't need more beaurocracy to chase after people in my subsystems tbh.
> 
> That's certainly a good thing to discuss. (below)
> 
> > 
> > The other thing is that I think this Assisted-by annotation is just
> > noise in the changelog. If you want to know in detail what an LLM was
> > used for when generating the patch it's mostly a signal for how
> > "intense" of a review this will get afaict (already questionable imho
> > but sure that's just something to disagree on).
> 
> I'd be happy to just have such information in the cover letter. Without any
> tags. Having subsystem-specific rules on the disclosure on that might be more
> reasonable.
> 
> I agree on the "enforce" aspect. It's impossible, but it's still easy to catch
> people using AI irresponsibly today ... and that's what we care about. Not
> people that know what they are doing using AI responsibly.

I have to reply to the "responsible" part: there's no possible ethical
use of generative AI in FOSS development today.

> > If the information is mostly useful during review then I still would
> > question why it has to end up in our git logs. It's completely
> > irrelevant information imho.
> 
> Fully agreed. In the tree it's irrelevant.

-- 
Regards,

Laurent Pinchart


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
                   ` (3 preceding siblings ...)
  2026-07-01 18:35 ` Jeff Layton
@ 2026-07-02  8:12 ` Jori Koolstra
  2026-07-02  8:44   ` Vlastimil Babka (SUSE)
  2026-07-02 10:27 ` Krzysztof Kozlowski
  2026-07-02 11:27 ` Christoph Hellwig
  6 siblings, 1 reply; 34+ messages in thread
From: Jori Koolstra @ 2026-07-02  8:12 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Jonathan Corbet, Jens Axboe, David Hildenbrand,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

Ah, I see we are reigniting this discussion again :)

What about a combination of what David and Jeff say? The whole point
seems to me that the salient information is not that an LLM was used (or
are we going to tag Sashiko as well or any other LLM-based code review
tool?), but what it was used to do. This information may be relevant for
how the review is approached. The latter should perhaps only be in the
cover letter and then we can drop the assisted-by tags altogether.

The question about enforcement remains.


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  8:10       ` Laurent Pinchart
@ 2026-07-02  8:16         ` David Hildenbrand (Arm)
  0 siblings, 0 replies; 34+ messages in thread
From: David Hildenbrand (Arm) @ 2026-07-02  8:16 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Christian Brauner, Linus Torvalds, Jonathan Corbet, Jens Axboe,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel, Lorenzo Stoakes (Oracle)

On 7/2/26 10:10, Laurent Pinchart wrote:
> On Thu, Jul 02, 2026 at 09:46:37AM +0200, David Hildenbrand (Arm) wrote:
>> On 7/2/26 09:27, Christian Brauner wrote:
>>>
>>> I think we should just drop any attribution as a general kernel-wide
>>> rule and let subsystems require them as needed. Then you can have all
>>> the complexity in mm for this that you think is needed for your
>>> workflow to function. This is precisely what the subsystem profiles are
>>> for. So maybe just add:
>>>
>>> Documentation/process/maintainer-mm.rst
>>>
>>> alongside
>>>
>>> Documentation/process/maintainer-{tip,netdev,x86}.rst
>>>
>>> and lay down the rules that you require for LLM based submissions in
>>> whatever detail you need.
>>
>> I'm not really sure if having (more?) subsystem-specific tags is the way to go.
>> (below)
>>
>> So either we find a very simple, kernel-wide rule for such tags, or we drop them
>> entirely.
>>
>>>
>>> I don't see how this additional commentary you want would ever be
>>> enforced consistently across the kernel or who would even enforce it. I
>>> don't need more beaurocracy to chase after people in my subsystems tbh.
>>
>> That's certainly a good thing to discuss. (below)
>>
>>>
>>> The other thing is that I think this Assisted-by annotation is just
>>> noise in the changelog. If you want to know in detail what an LLM was
>>> used for when generating the patch it's mostly a signal for how
>>> "intense" of a review this will get afaict (already questionable imho
>>> but sure that's just something to disagree on).
>>
>> I'd be happy to just have such information in the cover letter. Without any
>> tags. Having subsystem-specific rules on the disclosure on that might be more
>> reasonable.
>>
>> I agree on the "enforce" aspect. It's impossible, but it's still easy to catch
>> people using AI irresponsibly today ... and that's what we care about. Not
>> people that know what they are doing using AI responsibly.
> 
> I have to reply to the "responsible" part: there's no possible ethical
> use of generative AI in FOSS development today.

haha, fair enough. I was focusing on the technical aspect.

-- 
Cheers,

David


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  8:08     ` Laurent Pinchart
@ 2026-07-02  8:28       ` Christian Brauner
  0 siblings, 0 replies; 34+ messages in thread
From: Christian Brauner @ 2026-07-02  8:28 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Christian Brauner, David Hildenbrand (Arm), Linus Torvalds,
	Jonathan Corbet, Jens Axboe, Jeff Layton, Vlastimil Babka,
	workflows, linux-doc, linux-kernel, linux-fsdevel

On 2026-07-02 11:08 +0300, Laurent Pinchart wrote:
> On Thu, Jul 02, 2026 at 09:27:37AM +0200, Christian Brauner wrote:
> > > What would be much more relevant to know is to which degree LLMs were used.
> > > 
> > > Assisted-by: LLM # translate commit message
> > > Assisted-by: LLM # generate some test cases
> > > Assisted-by: LLM # cleanup logic
> > > Assisted-by: LLM # everything and I have no clue what any in here does
> > 
> > I think we should just drop any attribution as a general kernel-wide
> > rule and let subsystems require them as needed. Then you can have all
> > the complexity in mm for this that you think is needed for your
> > workflow to function. This is precisely what the subsystem profiles are
> > for. So maybe just add:
> > 
> > Documentation/process/maintainer-mm.rst
> > 
> > alongside
> > 
> > Documentation/process/maintainer-{tip,netdev,x86}.rst
> > 
> > and lay down the rules that you require for LLM based submissions in
> > whatever detail you need.
> > 
> > I don't see how this additional commentary you want would ever be
> > enforced consistently across the kernel or who would even enforce it. I
> > don't need more beaurocracy to chase after people in my subsystems tbh.
> > 
> > The other thing is that I think this Assisted-by annotation is just
> > noise in the changelog. If you want to know in detail what an LLM was
> > used for when generating the patch it's mostly a signal for how
> > "intense" of a review this will get afaict (already questionable imho
> > but sure that's just something to disagree on).
> > 
> > If the information is mostly useful during review then I still would
> > question why it has to end up in our git logs. It's completely
> > irrelevant information imho.
> 
> Food for thought, the Kubernetes project has published a disclosure
> policy ([1], reported by LWN.net at [2], with a blog post explaininig it
> at [3]). Quoting LWN.net,
> 
> "Of note, the project requires disclosure when AI tools have been used
> to assist in the creation of a contribution but forbids the use of
> listing AI as a co-author or including "assisted-by" or "co-developed"
> trailers to attribute work to an LLM tool."
> 
> I personally don't see a lot of value in the Assisted-by trailer, but I
> would like the submitter to include the information in a place that
> doesn't end up in the git commit history (cover letter or below the ---
> line).

Fwiw, way before k8s I had systemd adopt the following policy:

https://github.com/systemd/systemd/blob/main/docs/CONTRIBUTING.md#policy-on-the-use-of-large-language-models-llms-and-ai-tooling

    We expect everyone contributing to systemd to fully own their
    contribution, be able to reason about it, be able to explain why things
    were done a particular way and act as the full owner of that code. AI
    tools are treated the same as traditional tooling like sed, awk or
    coccinelle.

    For the purpose of this project, AI tools CANNOT be treated as author,
    co-author or be credited in any way that would suggest any ownership
    over the contribution.

    The contributor should have done all the thinking, planning and
    understanding of the changes needed to resolve an issue or implement a
    new feature prior to using automated tooling to perform the grunt work.

    Unguided use of those tools or the inability to prove understanding of
    the code contributed will result in a loss of trust in that contributor
    by project maintainers which can then lead to exclusion from any further
    contribution to the project.

    As with any other submissions, authors are responsible for doing due
    diligence and ensuring their submissions are compatible with the
    project's license as documented in LICENSES/README.md.



* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  8:12 ` Jori Koolstra
@ 2026-07-02  8:44   ` Vlastimil Babka (SUSE)
  2026-07-02  9:09     ` Jori Koolstra
                       ` (3 more replies)
  0 siblings, 4 replies; 34+ messages in thread
From: Vlastimil Babka (SUSE) @ 2026-07-02  8:44 UTC (permalink / raw)
  To: Jori Koolstra, Christian Brauner, Linus Torvalds, Jonathan Corbet,
	Jens Axboe, David Hildenbrand, Jeff Layton, workflows, linux-doc,
	linux-kernel, linux-fsdevel, Lorenzo Stoakes

On 7/2/26 10:12, Jori Koolstra wrote:
> Ah, I still reigniting this discussion again :)
> 
> What about a combination of what David and Jeff say? The whole point
> seems to me that the salient information is not that an LLM was used (or
> are we going to tag Sashiko as well or any other LLM-based code review
> tool?), but what is was used to do. This information may be relevant for
> how the review is approached. The latter should perhaps only be in the
> cover letter and then we can drop the assisted-by tags altogether.
> 
> The question about enforcement remains.

It's not possible to enforce it. If the tag is missing and you confront the
submitter, they can simply deny it: even when the submission shows many signs
of being obviously LLM-generated, there is no definite proof. We've seen that
happen (likely, as there's no proof!) in mm.

Such a situation then penalizes those who disclose, so obviously they won't. We
should drop the tag and instead think how we can empower maintainers to be
able to use their own judgment and deprioritize dealing with what they
perceive as LLM slop, without fearing consequences of not being properly
responsible etc, and not rely on any non-enforceable tags for that.


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  8:44   ` Vlastimil Babka (SUSE)
@ 2026-07-02  9:09     ` Jori Koolstra
  2026-07-02  9:39       ` Lorenzo Stoakes
  2026-07-02  9:37     ` Lorenzo Stoakes
                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 34+ messages in thread
From: Jori Koolstra @ 2026-07-02  9:09 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE), Christian Brauner, Linus Torvalds,
	Jonathan Corbet, Jens Axboe, David Hildenbrand, Jeff Layton,
	workflows, linux-doc, linux-kernel, linux-fsdevel,
	Lorenzo Stoakes


> On 02-07-2026 10:44 CEST, Vlastimil Babka (SUSE) <vbabka@kernel.org> wrote:
> 
>  
> On 7/2/26 10:12, Jori Koolstra wrote:
> > Ah, I still reigniting this discussion again :)
> > 
> > What about a combination of what David and Jeff say? The whole point
> > seems to me that the salient information is not that an LLM was used (or
> > are we going to tag Sashiko as well or any other LLM-based code review
> > tool?), but what is was used to do. This information may be relevant for
> > how the review is approached. The latter should perhaps only be in the
> > cover letter and then we can drop the assisted-by tags altogether.
> > 
> > The question about enforcement remains.
> 
> It's not possible to enforce it. People can deny it if the tag is missing
> and you confront them and even though the submission has many signs of being
> obviously LLM, there is no definite proof. We've seen (likely, as there's no
> proof!) that happen in mm.
> 

Maintainers should be free to ignore what they perceive as slop without needing
to defend that call. Reputation can be gained by submitting useful work or
being present in the community, attending conferences, giving talks, etc.
I am not saying that we should be harsh on beginning contributors (or I would
have to count myself out as well), but maintainers should be as free as
possible to invest their time only in the project and in people who may become
involved in the community. And that call is up to them.

I try to review fix-up patches from first-time contributors, but if a patch
reeks of AI I don't bother. We have the same policy in the kernel mentorship
program: we invest time to help people get involved with the community and
the kernel, not to let someone strike "kernel contributor" off their list. The
point is not that most of this clean-up work is super useful (indeed, an LLM
can do it), but to let someone feel excited about contributing and maybe get
them to stick around.

> Such situation then penalizes those who disclose so obviously they won't. We
> should drop the tag and instead think how we can empower maintainers to be
> able to use their own judgment and deprioritize dealing with what they
> perceive as LLM slop, without fearing consequences of not being properly
> responsible etc, and not rely on any non-enforceable tags for that.


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 16:10 ` David Hildenbrand (Arm)
  2026-07-02  7:27   ` Christian Brauner
@ 2026-07-02  9:24   ` Lorenzo Stoakes
  1 sibling, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2026-07-02  9:24 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Christian Brauner, Linus Torvalds, Jonathan Corbet, Jens Axboe,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

On Wed, Jul 01, 2026 at 06:10:48PM +0200, David Hildenbrand (Arm) wrote:
> On 7/1/26 17:54, Christian Brauner wrote:
> > I remain very confused by our coding assistant contribution guidelines.
> > I'm going to be a bit polemic now but this seriously in good faith.
> >
> > Why precisely do we require all this detailed information about what
> > specific coding assistant was used?
> >
> > I find it very irritating that our git history has effectively started
> > to function a bit like a free advertising platform for a bunch of AI
> > companies and their proprietary agents and models.
> >
> > And it reamins unclear to me what exactly we do get out of this detailed
> > information: Do we want to run statistical analysis on what agent and
> > model is used the most and publish that on LWN at some point?
> >
> > I acknowledge that my stance is even more radical: imho we would just
> > stop it with any disclosure requirements completely. It's useless imho.
> > We already see that other than core contributors most people don't care
> > and will just not disclose their usage of AI. I think this is entirely
> > pointless and worse it brings in undefined legal status as well. It's
> > not like recent events of pulling certain models from the face of the
> > earth have made this any less concerning.
> >
> > But fine, if we want to do this can we please just dumb it down to
> >
> > Assisted-by: LLM
> >
> > or
> >
> > Assisted-by: Coding Assistant
>
> I'd prefer this.

Yeah, I don't see any reason why we need to know precisely which model, or
which version of said model, was used.

>
> The doc states "proper attribution helps track the evolving role of AI in the
> development process". If there is another reason why we need the free
> advertisement, we should document it.

Yup.

Honestly I find the phrasing here quite vague.

While it is interesting to track the degree of AI involvement (where that's
disclosed) a really important part of this is how maintainers deal with AI
submissions.

Also, we have a schism in the documentation anyway: there's [0], which is
literally indexed as 'AI Coding Assistants' and says NOTHING about how
people are supposed to use them, and there's [1], which DOES say
something about that, but which isn't linked to by [0], nor links to it.

Before I happened across this thread, I was thinking of sending a patch to
at least link one to the other. Now I think I definitely will.

>
> Side note: if someone instructs an LLM exactly what to do, and would have
> achieved the same thing just typing it in, the use of the tag is not any helpful
> to me. (similar to "Assisted-by: vim" would not be helpful).
>
> What would be much more relevant to know is to which degree LLMs were used.

As I mentioned off-list I do agree that this is key.

Having this information helps with the most important issue we face when it
comes to AI - an EXISTENTIAL issue actually IMO - the asymmetry between how
much code can be generated, and available maintainer/reviewer resource.

Being able to see, at a glance, that a series was wholly generated and
seems substandard means we can quickly ask for more human attention.

And I know what the argument's going to be - 'bad faith people will lie
about it' - and sure, yes they will.

But now that there's been a huge surge of AI generated code in mm I can
speak from experience - many DO attribute, and for those that don't it's
very useful to have guidelines to point to.

Both aid in dealing with this asymmetry.

(as an example, I've had to push back quite strongly on an _attributed_
series ([2] and [3]) that appeared to be wholly generated. Having this
information would have helped there).

>
> Assisted-by: LLM # translate commit message
> Assisted-by: LLM # generate some test cases
> Assisted-by: LLM # cleanup logic
> Assisted-by: LLM # everything and I have no clue what any in here does

Yeah this format works I think!
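
As a rough sketch only, a reviewer-side script could pull the stated purpose
out of trailers in the format quoted above. This is not an existing kernel
tool; the function name is hypothetical, and it assumes one trailer per line
with the purpose following a "#".

```python
def parse_assisted_by(message: str) -> list[tuple[str, str]]:
    """Return (tool, purpose) pairs for each Assisted-by trailer in a message.

    The purpose is the free text after '#', or '' when none is given.
    """
    notes = []
    for line in message.splitlines():
        key, sep, value = line.partition(":")
        # Skip anything that is not an "Assisted-by:" trailer line.
        if not sep or key.lower() != "assisted-by":
            continue
        tool, _, purpose = value.partition("#")
        notes.append((tool.strip(), purpose.strip()))
    return notes
```

A maintainer script could then, for example, flag series where any purpose is
empty or reads like "everything".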

>
> I thought we ask for that in some document, but couldn't immediately find it
> (and nobody does that).

Well you're probably thinking of [1], e.g.:

	Second, when making a contribution, be transparent about the origin
	of content in cover letters and changelogs. You can be more
	transparent by adding information like this:

	...

	- Which portions of the content were affected by that tool?

	...


And also from the same document:

	If tools permit you to generate a contribution automatically,
	expect additional scrutiny in proportion to how much of it was
	generated.

	As with the output of any tooling, the result may be incorrect or
	inappropriate. You are expected to understand and to be able to
	defend everything you submit. If you are unable to do so, then do
	not submit the resulting changes.

	If you do so anyway, maintainers are entitled to reject your series
	without detailed review.

This only speaks more to the need to link the two documents together. I'll
send a patch.

>
> --
> Cheers,
>
> David

Thanks, Lorenzo

[0]:https://docs.kernel.org/process/coding-assistants.html
[1]:https://docs.kernel.org/process/generated-content.html
[2]:https://lore.kernel.org/linux-mm/aj9yrlB0TrlYCLlf@lucifer/
[3]:https://lore.kernel.org/linux-mm/akIjA_dqh4OHAYo4@lucifer/


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  8:44   ` Vlastimil Babka (SUSE)
  2026-07-02  9:09     ` Jori Koolstra
@ 2026-07-02  9:37     ` Lorenzo Stoakes
  2026-07-02  9:38     ` Laurent Pinchart
  2026-07-02 10:34     ` Krzysztof Kozlowski
  3 siblings, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2026-07-02  9:37 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE)
  Cc: Jori Koolstra, Christian Brauner, Linus Torvalds, Jonathan Corbet,
	Jens Axboe, David Hildenbrand, Jeff Layton, workflows, linux-doc,
	linux-kernel, linux-fsdevel

On Thu, Jul 02, 2026 at 10:44:34AM +0200, Vlastimil Babka (SUSE) wrote:
> On 7/2/26 10:12, Jori Koolstra wrote:
> > Ah, I still reigniting this discussion again :)
> >
> > What about a combination of what David and Jeff say? The whole point
> > seems to me that the salient information is not that an LLM was used (or
> > are we going to tag Sashiko as well or any other LLM-based code review
> > tool?), but what is was used to do. This information may be relevant for
> > how the review is approached. The latter should perhaps only be in the
> > cover letter and then we can drop the assisted-by tags altogether.
> >
> > The question about enforcement remains.
>
> It's not possible to enforce it. People can deny it if the tag is missing
> and you confront them and even though the submission has many signs of being
> obviously LLM, there is no definite proof. We've seen (likely, as there's no
> proof!) that happen in mm.

I think it's helpful to point to guidelines, and I've actively used that in
practice with the recent wave of AI slop in mm.

But yes, it quickly becomes very politically difficult if somebody adamantly
lies about that, and as you know I've found myself in that situation too :)

However, there are those who _do_ attribute, especially those working at tech
companies that are encouraging LLM-usage, and others who are in good faith, so
having the tag is, I think, helpful.

It also strengthens our case when dealing with those who are dishonest, if we
do at some point institute a 'well this seems very likely to be so sorry no'
approach in mm at least.

>
> Such situation then penalizes those who disclose so obviously they won't. We
> should drop the tag and instead think how we can empower maintainers to be
> able to use their own judgment and deprioritize dealing with what they
> perceive as LLM slop, without fearing consequences of not being properly
> responsible etc, and not rely on any non-enforceable tags for that.

I agree we should have the ability to do this.

But the amount of time wasted on AI slop is already too much and we're only at
the start of this; we really need a very low-effort way to filter it.

Tags with more information WILL help IMO, but I honestly think, in the long run,
we're simply going to have to deprioritise patches from newcomers that do more
than small changes.

Other open source communities are ahead of us in this and that seems to be the
road being taken often (e.g. [0]).

Cheers, Lorenzo

[0]:https://godotengine.org/article/contribution-policy-2026/


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  8:44   ` Vlastimil Babka (SUSE)
  2026-07-02  9:09     ` Jori Koolstra
  2026-07-02  9:37     ` Lorenzo Stoakes
@ 2026-07-02  9:38     ` Laurent Pinchart
  2026-07-02  9:44       ` Lorenzo Stoakes
  2026-07-02 10:34     ` Krzysztof Kozlowski
  3 siblings, 1 reply; 34+ messages in thread
From: Laurent Pinchart @ 2026-07-02  9:38 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE)
  Cc: Jori Koolstra, Christian Brauner, Linus Torvalds, Jonathan Corbet,
	Jens Axboe, David Hildenbrand, Jeff Layton, workflows, linux-doc,
	linux-kernel, linux-fsdevel, Lorenzo Stoakes

On Thu, Jul 02, 2026 at 10:44:34AM +0200, Vlastimil Babka (SUSE) wrote:
> On 7/2/26 10:12, Jori Koolstra wrote:
> > Ah, I still reigniting this discussion again :)
> > 
> > What about a combination of what David and Jeff say? The whole point
> > seems to me that the salient information is not that an LLM was used (or
> > are we going to tag Sashiko as well or any other LLM-based code review
> > tool?), but what is was used to do. This information may be relevant for
> > how the review is approached. The latter should perhaps only be in the
> > cover letter and then we can drop the assisted-by tags altogether.
> > 
> > The question about enforcement remains.
> 
> It's not possible to enforce it. People can deny it if the tag is missing
> and you confront them and even though the submission has many signs of being
> obviously LLM, there is no definite proof. We've seen (likely, as there's no
> proof!) that happen in mm.
> 
> Such situation then penalizes those who disclose so obviously they won't.

I think there's also a penalty for those who don't disclose when
they're told they should: it will lower trust. Kernel development is
largely based on a trust model. If a contributor decides to adopt
deceitful behaviour, they can expect maintainers to raise the bar for
accepting patches, when not rejecting them outright.

I can't quantifying which of the penalities will be higher, but I hope
(call me naive if you wish) that the vast majority of contributurs who
*know* we require disclosure to abide by that rule, even if it incurs a
penalty. After all, proponents for LLM usage claim such performance
improvements that a small penalty during review can't be that bad, right
? :-)

> We
> should drop the tag and instead think how we can empower maintainers to be
> able to use their own judgment and deprioritize dealing with what they
> perceive as LLM slop, without fearing consequences of not being properly
> responsible etc, and not rely on any non-enforceable tags for that.

-- 
Regards,

Laurent Pinchart


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  9:09     ` Jori Koolstra
@ 2026-07-02  9:39       ` Lorenzo Stoakes
  0 siblings, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2026-07-02  9:39 UTC (permalink / raw)
  To: Jori Koolstra
  Cc: Vlastimil Babka (SUSE), Christian Brauner, Linus Torvalds,
	Jonathan Corbet, Jens Axboe, David Hildenbrand, Jeff Layton,
	workflows, linux-doc, linux-kernel, linux-fsdevel

On Thu, Jul 02, 2026 at 11:09:37AM +0200, Jori Koolstra wrote:
>
> > Op 02-07-2026 10:44 CEST schreef Vlastimil Babka (SUSE) <vbabka@kernel.org>:
> >
> >
> > On 7/2/26 10:12, Jori Koolstra wrote:
> > > Ah, I still reigniting this discussion again :)
> > >
> > > What about a combination of what David and Jeff say? The whole point
> > > seems to me that the salient information is not that an LLM was used (or
> > > are we going to tag Sashiko as well or any other LLM-based code review
> > > tool?), but what is was used to do. This information may be relevant for
> > > how the review is approached. The latter should perhaps only be in the
> > > cover letter and then we can drop the assisted-by tags altogether.
> > >
> > > The question about enforcement remains.
> >
> > It's not possible to enforce it. People can deny it if the tag is missing
> > and you confront them and even though the submission has many signs of being
> > obviously LLM, there is no definite proof. We've seen (likely, as there's no
> > proof!) that happen in mm.
> >
>
> Maintainers should be free to ignore what they perceive as slop without needing
> to defend that call. Reputation can be gained by submitting useful work or
> being present in the community, attending conferences, giving talks, etc.
> I am not saying that we should be harsh on beginning contributors (or I would
> have to count myself out as well), but they should be as free as possible to
> only invest their time in the project and people that may become involved in the
> community. And that call is up to them.

Yup agreed, however I have had the experience of doing exactly this and then
being second-guessed enormously, which was exhausting honestly.

So we need total clarity that it's OK to do this.

I guess this is partly a subsystem-by-subsystem thing though.

>
> I try to review fix-up patches of first-time contributors, but if it reeks of
> AI I don't bother. We have the same policy in the kernel mentorship program,
> we invest time to help people get involved with the community and kernel, not
> to let someone strike "kernel contributor" of their list. The whole point is
> not that most of this clean-up work is super useful (and indeed an LLM can do it),
> but to let someone feel excited about contributing and maybe getting them to
> to stick around.

Yup agreed :)

>
> > Such situation then penalizes those who disclose so obviously they won't. We
> > should drop the tag and instead think how we can empower maintainers to be
> > able to use their own judgment and deprioritize dealing with what they
> > perceive as LLM slop, without fearing consequences of not being properly
> > responsible etc, and not rely on any non-enforceable tags for that.

Thanks, Lorenzo


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  9:38     ` Laurent Pinchart
@ 2026-07-02  9:44       ` Lorenzo Stoakes
  2026-07-02 11:57         ` Brian Foster
  0 siblings, 1 reply; 34+ messages in thread
From: Lorenzo Stoakes @ 2026-07-02  9:44 UTC (permalink / raw)
  To: Laurent Pinchart
  Cc: Vlastimil Babka (SUSE), Jori Koolstra, Christian Brauner,
	Linus Torvalds, Jonathan Corbet, Jens Axboe, David Hildenbrand,
	Jeff Layton, workflows, linux-doc, linux-kernel, linux-fsdevel

On Thu, Jul 02, 2026 at 12:38:44PM +0300, Laurent Pinchart wrote:
> On Thu, Jul 02, 2026 at 10:44:34AM +0200, Vlastimil Babka (SUSE) wrote:
> > On 7/2/26 10:12, Jori Koolstra wrote:
> > > Ah, I still reigniting this discussion again :)
> > >
> > > What about a combination of what David and Jeff say? The whole point
> > > seems to me that the salient information is not that an LLM was used (or
> > > are we going to tag Sashiko as well or any other LLM-based code review
> > > tool?), but what is was used to do. This information may be relevant for
> > > how the review is approached. The latter should perhaps only be in the
> > > cover letter and then we can drop the assisted-by tags altogether.
> > >
> > > The question about enforcement remains.
> >
> > It's not possible to enforce it. People can deny it if the tag is missing
> > and you confront them and even though the submission has many signs of being
> > obviously LLM, there is no definite proof. We've seen (likely, as there's no
> > proof!) that happen in mm.
> >
> > Such situation then penalizes those who disclose so obviously they won't.
>
> I think there's also a penality for those who don't disclose when
> they're told they should: it will lower trust. Kernel development is
> largely based on a trust model. If a contributor decides to adopt a
> deceiptful behaviour, they can expect maintainers to raise the bar for
> accepting patches, when not rejecting them outright.

Yes, I explicitly said this to somebody for whom there was overwhelming
evidence that they were submitting AI slop, and told them they'd need to
build that trust back up again.

It's precisely the issue as I see it.

But others within the community disagreed with me, so it turned into a very
long and draining discussion that I don't particularly wish to repeat.

So we really need clarity on it being OK to do this (I remember saying this
last year when I made an ultimately unsuccessful submission to the
maintainer's summit about all this :)

What matters overall is being able to _quickly_ dismiss AI slop so that the
asymmetry between LLM generation and maintainer time isn't exploited.

And ultimately I think the trust model will end up being 'newcomers have 0,
now build it up'.

Which sucks but this issue is simply existential for open source.

>
> I can't quantifying which of the penalities will be higher, but I hope
> (call me naive if you wish) that the vast majority of contributurs who
> *know* we require disclosure to abide by that rule, even if it incurs a
> penalty. After all, proponents for LLM usage claim such performance
> improvements that a small penalty during review can't be that bad, right
> ? :-)
>
> > We
> > should drop the tag and instead think how we can empower maintainers to be
> > able to use their own judgment and deprioritize dealing with what they
> > perceive as LLM slop, without fearing consequences of not being properly
> > responsible etc, and not rely on any non-enforceable tags for that.
>
> --
> Regards,
>
> Laurent Pinchart

Thanks, Lorenzo


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 16:08 ` Jonathan Corbet
  2026-07-01 16:12   ` David Hildenbrand (Arm)
  2026-07-02  7:11   ` Christian Brauner
@ 2026-07-02  9:51   ` David Disseldorp
  2 siblings, 0 replies; 34+ messages in thread
From: David Disseldorp @ 2026-07-02  9:51 UTC (permalink / raw)
  To: Jonathan Corbet
  Cc: Christian Brauner, Linus Torvalds, Jens Axboe, David Hildenbrand,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

On Wed, 01 Jul 2026 10:08:11 -0600, Jonathan Corbet wrote:

> > Why precisely do we require all this detailed information about what
> > specific coding assistant was used?  
> 
> From my memory of the discussions:
> 
> - If a specific LLM turns out to be in a bad position with regard to
>   some copyright ruling, we can identify the commits that might have
>   been tainted by it.
> 
> - Similarly should an LLM prove to have an inclination toward specific
>   types of security issues.
> 
> Whether either of these would ever actually prove useful is not
> something I can hazard a guess for.

In https://lwn.net/Articles/854645/ (An update on the UMN affair) you
documented a case where an organization acted maliciously, resulting in
the need to audit (and revert) numerous commits based on git authorship
metadata. IMO the existing Assisted-by tags, although far from perfect,
will be useful for audits when a specific LLM coding assistant is
similarly found to be generating malicious output.
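
A minimal sketch of what such a tag-based audit could look like, assuming
trailers of the form "Assisted-by: <tool>" with an optional "# note" suffix
as floated elsewhere in this thread. The helper names and the sample
commits are hypothetical, and a real audit would read the messages from
git rather than from a list:

```python
import re
from collections import defaultdict

# Assumed trailer shape: "Assisted-by: <tool> [# free-form note]".
# [ \t]* is used instead of \s* so a match never spills onto the next line.
TRAILER_RE = re.compile(
    r"^Assisted-by:[ \t]*(?P<tool>[^#\n]+?)[ \t]*(?:#[ \t]*(?P<note>.*))?$",
    re.MULTILINE,
)


def assisted_by(message):
    """Return (tool, note) pairs for every Assisted-by trailer in a message."""
    return [
        (m.group("tool"), m.group("note") or "")
        for m in TRAILER_RE.finditer(message)
    ]


def commits_by_tool(commits):
    """Map each tool name to the commit ids carrying its tag.

    `commits` is an iterable of (commit_id, message) pairs.
    """
    index = defaultdict(list)
    for commit_id, message in commits:
        for tool, _note in assisted_by(message):
            index[tool].append(commit_id)
    return dict(index)
```

Feeding it every (commit id, message) pair in the range under audit would
give a per-tool list of commits to look at. It can only find commits that
were actually tagged.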

Thanks, David


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  7:46     ` David Hildenbrand (Arm)
  2026-07-02  8:10       ` Laurent Pinchart
@ 2026-07-02 10:04       ` Lorenzo Stoakes
  2026-07-02 11:51         ` David Hildenbrand (Arm)
  1 sibling, 1 reply; 34+ messages in thread
From: Lorenzo Stoakes @ 2026-07-02 10:04 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Christian Brauner, Linus Torvalds, Jonathan Corbet, Jens Axboe,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

(thanks for the cc-!)

On Thu, Jul 02, 2026 at 09:46:37AM +0200, David Hildenbrand (Arm) wrote:
> On 7/2/26 09:27, Christian Brauner wrote:
> >> What would be much more relevant to know is to which degree LLMs were used.
> >>
> >> Assisted-by: LLM # translate commit message
> >> Assisted-by: LLM # generate some test cases
> >> Assisted-by: LLM # cleanup logic
> >> Assisted-by: LLM # everything and I have no clue what any in here does
> >
> > I think we should just drop any attribution as a general kernel-wide
> > rule and let subsystems require them as needed. Then you can have all
> > the complexity in mm for this that you think is needed for your
> > workflow to function. This is precisely what the subsystem profiles are
> > for. So maybe just add:

A single comment is complexity?

> >
> > Documentation/process/maintainer-mm.rst
> >
> > alongside
> >
> > Documentation/process/maintainer-{tip,netdev,x86}.rst
> >
> > and lay down the rules that you require for LLM based submissions in
> > whatever detail you need.
>
> I'm not really sure if having (more?) subsystem-specific tags is the way to go.
> (below)
>
> So either we find a very simple, kernel-wide rule for such tags, or we drop them
> entirely.

Yup I couldn't disagree more with Christian here, the whole thing feels like
trying to 'wish away' the AI issue, and now punting off to subsystem
maintainers...

Subsystems impact each other. Right now I'm writing a series that changes driver
code so we can enforce some sanity in mm APIs.

I've had to interact with fs code quite a bit that uses mm logic.

It's all interconnected, and one subsystem going with, let's say, 'let it all
in' impacts another.

Yes some people lie about it, but having the guidelines only STRENGTHENS our
position on that, and I've seen that in practice.

So yeah, sorry, I think it's beyond silly to push back on requesting somebody
disclose how much of a patch/series was AI generated.

And [0] already essentially says people NEED to do this now. But that doc has
been rather downplayed unfortunately I think.

>
> >
> > I don't see how this additional commentary you want would ever be
> > enforced consistently across the kernel or who would even enforce it. I
> > don't need more beaurocracy to chase after people in my subsystems tbh.

Again, asking LLM submitters to write a single comment is 'bureaucracy'?

>
> That's certainly a good thing to discuss. (below)
>
> >
> > The other thing is that I think this Assisted-by annotation is just
> > noise in the changelog. If you want to know in detail what an LLM was
> > used for when generating the patch it's mostly a signal for how
> > "intense" of a review this will get afaict (already questionable imho
> > but sure that's just something to disagree on).
>
> I'd be happy to just have such information in the cover letter. Without any
> tags. Having subsystem-specific rules on the disclosure on that might be more
> reasonable.

I disagree, I think it's important to have it standardised and simple.

If we make things vague, people won't do it. And reading through a cover letter
in its AI slop entirety (and boy does it generate a LOT of text) to find the
mention or not (and hey what if it's not clear?) is just objectively worse.

>
> I agree on the "enforce" aspect. It's impossible, but it's still easy to catch
> people using AI irresponsibly today ... and that's what we care about. Not
> people that know what they are doing using AI responsibly.

For me it's about empowering maintainers to push back.

>
> >
> > If the information is mostly useful during review then I still would
> > question why it has to end up in our git logs. It's completely
> > irrelevant information imho.
>
> Fully agreed. In the tree it's irrelevant.

Not sure about that. If it turns out AI-generated patches are causing, say,
95% more bugs, that's pretty useful information, no?

Or if you find that a patch somebody sent from another subsystem that has a
laissez-faire approach to AI slop completely breaks you in some subtle way,
isn't it easier to push for a revert if you see it's LLM-generated?

And is it really that egregious to include a tag? You can ignore it if you don't
care.

>
> --
> Cheers,
>
> David

Thanks, Lorenzo

[0]:https://docs.kernel.org/process/generated-content.html


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
                   ` (4 preceding siblings ...)
  2026-07-02  8:12 ` Jori Koolstra
@ 2026-07-02 10:27 ` Krzysztof Kozlowski
  2026-07-02 11:27 ` Christoph Hellwig
  6 siblings, 0 replies; 34+ messages in thread
From: Krzysztof Kozlowski @ 2026-07-02 10:27 UTC (permalink / raw)
  To: Christian Brauner, Linus Torvalds, Jonathan Corbet
  Cc: Jens Axboe, David Hildenbrand, Jeff Layton, Vlastimil Babka,
	workflows, linux-doc, linux-kernel, linux-fsdevel

On 01/07/2026 17:54, Christian Brauner wrote:
> I remain very confused by our coding assistant contribution guidelines.
> I'm going to be a bit polemic now but this seriously in good faith.
> 
> Why precisely do we require all this detailed information about what
> specific coding assistant was used?
> 
> I find it very irritating that our git history has effectively started
> to function a bit like a free advertising platform for a bunch of AI
> companies and their proprietary agents and models.
> 
> And it reamins unclear to me what exactly we do get out of this detailed
> information: Do we want to run statistical analysis on what agent and
> model is used the most and publish that on LWN at some point?
> 
> I acknowledge that my stance is even more radical: imho we would just
> stop it with any disclosure requirements completely. It's useless imho.
> We already see that other than core contributors most people don't care
> and will just not disclose their usage of AI. I think this is entirely
> pointless and worse it brings in undefined legal status as well. It's
> not like recent events of pulling certain models from the face of the
> earth have made this any less concerning.
> 
> But fine, if we want to do this can we please just dumb it down to
> 
> Assisted-by: LLM
> 

Yes, I agree. Useful information would be the model or training data,
e.g. to judge any copyright issues, but I doubt anyone will ever
provide such details.

I also see little value in keeping even the Assisted-by tag in the first
place. I am interested in seeing named respectable tools in the commit msg
as the reason for doing the commit, e.g. I am fixing a warning from Smatch
so I mention Smatch. Sashiko fits this purpose as well. I have no
interest in seeing "Claude" as the reason for doing something.

Best regards,
Krzysztof


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  8:44   ` Vlastimil Babka (SUSE)
                       ` (2 preceding siblings ...)
  2026-07-02  9:38     ` Laurent Pinchart
@ 2026-07-02 10:34     ` Krzysztof Kozlowski
  3 siblings, 0 replies; 34+ messages in thread
From: Krzysztof Kozlowski @ 2026-07-02 10:34 UTC (permalink / raw)
  To: Vlastimil Babka (SUSE), Jori Koolstra, Christian Brauner,
	Linus Torvalds, Jonathan Corbet, Jens Axboe, David Hildenbrand,
	Jeff Layton, workflows, linux-doc, linux-kernel, linux-fsdevel,
	Lorenzo Stoakes

On 02/07/2026 10:44, Vlastimil Babka (SUSE) wrote:
> On 7/2/26 10:12, Jori Koolstra wrote:
>> Ah, I still reigniting this discussion again :)
>>
>> What about a combination of what David and Jeff say? The whole point
>> seems to me that the salient information is not that an LLM was used (or
>> are we going to tag Sashiko as well or any other LLM-based code review
>> tool?), but what is was used to do. This information may be relevant for
>> how the review is approached. The latter should perhaps only be in the
>> cover letter and then we can drop the assisted-by tags altogether.
>>
>> The question about enforcement remains.
> 
> It's not possible to enforce it. People can deny it if the tag is missing
> and you confront them and even though the submission has many signs of being
> obviously LLM, there is no definite proof. We've seen (likely, as there's no
> proof!) that happen in mm.
> 
> Such situation then penalizes those who disclose so obviously they won't. We
> should drop the tag and instead think how we can empower maintainers to be
> able to use their own judgment and deprioritize dealing with what they
> perceive as LLM slop, without fearing consequences of not being properly
> responsible etc, and not rely on any non-enforceable tags for that.

+1

I see no benefit in enforcing the tag for these exact reasons. Every
LLM slop will miss the tag. OTOH, seeing a reasonable contribution with
the tag makes my spider-senses tingle and causes unnecessary
prejudice. If the contribution is reasonable, how does the tag
information help me? I trust (or not) the person, regardless of what tool
they use.

And if we think about any possible future copyright issues with LLM
contributions (like if there is ever a ruling that a model trained on BSD
data creates BSD-derivative work etc), does that tag solve it in any way?
Like, if such a ruling appears, will we go through the history and revert
the commits?

Best regards,
Krzysztof


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
                   ` (5 preceding siblings ...)
  2026-07-02 10:27 ` Krzysztof Kozlowski
@ 2026-07-02 11:27 ` Christoph Hellwig
  6 siblings, 0 replies; 34+ messages in thread
From: Christoph Hellwig @ 2026-07-02 11:27 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Jonathan Corbet, Jens Axboe, David Hildenbrand,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

On Wed, Jul 01, 2026 at 05:54:48PM +0200, Christian Brauner wrote:
> Assisted-by: LLM
> 
> or
> 
> Assisted-by: Coding Assistant

I think what is more relevant is what assistance there was.  If the
code was generated by an LLM we should outright reject it on
copyright grounds.  If it was used for validation or ideas: who
cares?

I.e. do we need this at all except as a guard against vibe-coded junk
that pulls in other copyrighted material?  And do we really rely on
a tag instead of detecting it by the usual signs?



* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  7:10   ` Christian Brauner
@ 2026-07-02 11:35     ` Mark Brown
  0 siblings, 0 replies; 34+ messages in thread
From: Mark Brown @ 2026-07-02 11:35 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Linus Torvalds, Jonathan Corbet, Jens Axboe, David Hildenbrand,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel


On Thu, Jul 02, 2026 at 09:10:16AM +0200, Christian Brauner wrote:
> On 2026-07-01 17:08 +0100, Mark Brown wrote:
> > On Wed, Jul 01, 2026 at 05:54:48PM +0200, Christian Brauner wrote:

> > > And it reamins unclear to me what exactly we do get out of this detailed
> > > information: Do we want to run statistical analysis on what agent and
> > > model is used the most and publish that on LWN at some point?

> > IIRC it was literally this, have people mention which tools they used so
> > we can use that to inform our assessment of the patches.  I'm not sure

> Forgive my candor but I think that is just useless for us. It's
> certainly useful for AI company statistics. If we want to provide that
> service I would recommend we start charging. ;)

Yeah, that was more an observation of fact than a comment on value.
Certainly now we're able to see what things look like in practice.



* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02 10:04       ` Lorenzo Stoakes
@ 2026-07-02 11:51         ` David Hildenbrand (Arm)
  0 siblings, 0 replies; 34+ messages in thread
From: David Hildenbrand (Arm) @ 2026-07-02 11:51 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Christian Brauner, Linus Torvalds, Jonathan Corbet, Jens Axboe,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

On 7/2/26 12:04, Lorenzo Stoakes wrote:
> (thanks for the cc-!)
> 
> On Thu, Jul 02, 2026 at 09:46:37AM +0200, David Hildenbrand (Arm) wrote:
>> On 7/2/26 09:27, Christian Brauner wrote:
>>>
>>> I think we should just drop any attribution as a general kernel-wide
>>> rule and let subsystems require them as needed. Then you can have all
>>> the complexity in mm for this that you think is needed for your
>>> workflow to function. This is precisely what the subsystem profiles are
>>> for. So maybe just add:
> 
> A single comment is complexity?

I think Christian meant more elaborate rules. More than just "If you used LLMs,
disclose how you used them."

>>
>> I'm not really sure if having (more?) subsystem-specific tags is the way to go.
>> (below)
>>
>> So either we find a very simple, kernel-wide rule for such tags, or we drop them
>> entirely.
> 
> Yup I couldn't disagree more with Christian here, the whole thing feels like
> trying to 'wish away' the AI issue, and now punting off to subsystem
> maintainers...
> 
> Subsystems impact each other. Right now I'm writing a series that changes driver
> code so we can enforce some sanity in mm APIs.
> 
> I've had to interact with fs code quite a bit that uses mm logic.
> 
> It's all interconnected, and one subsystem let's say going with 'let it all in'
> say, impacts another.
> 
> Yes some people lie about it, but having the guidelines only STRENGTHENS our
> position on that, and I've seen that in practice.
> 
> So yeah, sorry, I think it's beyond silly to push back on requesting somebody
> disclose how much of a patch/series was AI generated.
> 
> And [0] already essentially says people NEED to do this now. But that doc has
> been rather downplayed unfortunately I think.

[...]

>> I agree on the "enforce" aspect. It's impossible, but it's still easy to catch
>> people using AI irresponsibly today ... and that's what we care about. Not
>> people that know what they are doing using AI responsibly.
> 
> For me it's about empowering maintainers to push back.

Right, but I suspect maintainers do have this power already, it's just not
exercised that often on obvious AI slop yet.

> 
>>
>>>
>>> If the information is mostly useful during review then I still would
>>> question why it has to end up in our git logs. It's completely
>>> irrelevant information imho.
>>
>> Fully agreed. In the tree it's irrelevant.
> 
> Not sure about that, if it turns out AI-generated patches are causing 95% more
> bugs say that's pretty useful information no?

Well

a) You don't know how much AI was used. In particular, a bug could just slip in
as the submitter tries to untangle some of the mess the AI created (so not the
AI's fault). Or the submitter just used it to write+translate the patch
description. Really, the tag itself doesn't tell you much as it stands, which
is the biggest problem I am having with it.

b) You don't catch all the cases where people didn't use the tag.

> 
> Or if you find that a patch somebody sent from another subsystem that has a
> lassez faire approach to AI slop completely breaks you in some subtle way, isn't
> it easier to push for a revert if you see it's LLM-generated?

The information would have to be had from the linked mailing list posting.

Given that some subsystems already started suppressing the tags when applying
patches, that doesn't really help ... :/

> 
> And is it really that egregious to include a tag? You can ignore it if you don't
> care.

I hate the current tags as they are. The question I am asking myself: assume we
stop using the Assisted-by for LLM stuff. What to do with the other tools? Why
are LLMs suddenly no longer a tool to mention there?

-- 
Cheers,

David


* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02  9:44       ` Lorenzo Stoakes
@ 2026-07-02 11:57         ` Brian Foster
  2026-07-02 12:18           ` Lorenzo Stoakes
  0 siblings, 1 reply; 34+ messages in thread
From: Brian Foster @ 2026-07-02 11:57 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Laurent Pinchart, Vlastimil Babka (SUSE), Jori Koolstra,
	Christian Brauner, Linus Torvalds, Jonathan Corbet, Jens Axboe,
	David Hildenbrand, Jeff Layton, workflows, linux-doc,
	linux-kernel, linux-fsdevel

On Thu, Jul 02, 2026 at 10:44:09AM +0100, Lorenzo Stoakes wrote:
> On Thu, Jul 02, 2026 at 12:38:44PM +0300, Laurent Pinchart wrote:
> > On Thu, Jul 02, 2026 at 10:44:34AM +0200, Vlastimil Babka (SUSE) wrote:
> > > On 7/2/26 10:12, Jori Koolstra wrote:
> > > > Ah, I still reigniting this discussion again :)
> > > >
> > > > What about a combination of what David and Jeff say? The whole point
> > > > seems to me that the salient information is not that an LLM was used (or
> > > > are we going to tag Sashiko as well or any other LLM-based code review
> > > > tool?), but what is was used to do. This information may be relevant for
> > > > how the review is approached. The latter should perhaps only be in the
> > > > cover letter and then we can drop the assisted-by tags altogether.
> > > >
> > > > The question about enforcement remains.
> > >
> > > It's not possible to enforce it. People can deny it if the tag is missing
> > > and you confront them and even though the submission has many signs of being
> > > obviously LLM, there is no definite proof. We've seen (likely, as there's no
> > > proof!) that happen in mm.
> > >
> > > Such situation then penalizes those who disclose so obviously they won't.
> >
> > I think there's also a penality for those who don't disclose when
> > they're told they should: it will lower trust. Kernel development is
> > largely based on a trust model. If a contributor decides to adopt a
> > deceiptful behaviour, they can expect maintainers to raise the bar for
> > accepting patches, when not rejecting them outright.
> 
> Yes, I explicitly said this in response to somebody for whom there was
> overwhelming evidence they were submitting AI slop, and that they'd need to
> build it back up again.
> 
> It's precisely the issue as I see it.
> 
> But others within the community disagreed with me, so it turned into a very
> long and draining discussion that I don't particularly wish to repeat.
> 
> So we really need clarity on it being OK to do this (I remember saying this
> last year when I made an ultimately unsuccessful submission to the
> maintainer's summit about all this :)
> 
> What matters overall is being able to _quickly_ dismiss AI slop so that
> asymmetry between LLM generation + maintainer time isn't exploited.
> 
> And ultimately I think the trust model will end up being 'newcomes have 0,
> now build it up'.
> 
> Which sucks but this issue is simply existential for open source.
> 

Has anybody tried throwing any of the obvious LLM slop submissions we
have seen into one of these LLM detector things? To be clear, I've never
tried those so I'm certainly no authority on whether they even work
reliably, but if so I wonder if something like that is a potential solution
for eliminating the worst cases..

I.e., suppose we had some Sashiko type LLM/bot whose job was mainly to
detect purely LLM generated content based on some minimum level of
confidence and reply with a loud and clear message to the thread. Maybe
that would be a clear enough signal to maintainers and reviewers that
something is not worth prioritizing for review.. Maybe also some "slop
detected" feedback would help disincentivize flinging slop onto the
lists. At the very least that could be something that is more easily
configured/enabled per-subsystem without having to use per-subsystem
commit tags.

Brian

> >
> > I can't quantifying which of the penalities will be higher, but I hope
> > (call me naive if you wish) that the vast majority of contributurs who
> > *know* we require disclosure to abide by that rule, even if it incurs a
> > penalty. After all, proponents for LLM usage claim such performance
> > improvements that a small penalty during review can't be that bad, right
> > ? :-)
> >
> > > We
> > > should drop the tag and instead think how we can empower maintainers to be
> > > able to use their own judgment and deprioritize dealing with what they
> > > perceive as LLM slop, without fearing consequences of not being properly
> > > responsible etc, and not rely on any non-enforceable tags for that.
> >
> > --
> > Regards,
> >
> > Laurent Pinchart
> 
> Thanks, Lorenzo
> 



* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-02 11:57         ` Brian Foster
@ 2026-07-02 12:18           ` Lorenzo Stoakes
  0 siblings, 0 replies; 34+ messages in thread
From: Lorenzo Stoakes @ 2026-07-02 12:18 UTC (permalink / raw)
  To: Brian Foster
  Cc: Laurent Pinchart, Vlastimil Babka (SUSE), Jori Koolstra,
	Christian Brauner, Linus Torvalds, Jonathan Corbet, Jens Axboe,
	David Hildenbrand, Jeff Layton, workflows, linux-doc,
	linux-kernel, linux-fsdevel

On Thu, Jul 02, 2026 at 07:57:45AM -0400, Brian Foster wrote:
> On Thu, Jul 02, 2026 at 10:44:09AM +0100, Lorenzo Stoakes wrote:
> > On Thu, Jul 02, 2026 at 12:38:44PM +0300, Laurent Pinchart wrote:
> > > On Thu, Jul 02, 2026 at 10:44:34AM +0200, Vlastimil Babka (SUSE) wrote:
> > > > On 7/2/26 10:12, Jori Koolstra wrote:
> > > > > Ah, I still reigniting this discussion again :)
> > > > >
> > > > > What about a combination of what David and Jeff say? The whole point
> > > > > seems to me that the salient information is not that an LLM was used (or
> > > > > are we going to tag Sashiko as well or any other LLM-based code review
> > > > > tool?), but what it was used to do. This information may be relevant for
> > > > > how the review is approached. The latter should perhaps only be in the
> > > > > cover letter and then we can drop the assisted-by tags altogether.
> > > > >
> > > > > The question about enforcement remains.
> > > >
> > > > It's not possible to enforce it. People can deny it if the tag is missing
> > > > and you confront them and even though the submission has many signs of being
> > > > obviously LLM, there is no definite proof. We've seen (likely, as there's no
> > > > proof!) that happen in mm.
> > > >
> > > > Such a situation then penalizes those who disclose, so obviously they won't.
> > >
> > > I think there's also a penalty for those who don't disclose when
> > > they're told they should: it will lower trust. Kernel development is
> > > largely based on a trust model. If a contributor decides to adopt a
> > > deceitful behaviour, they can expect maintainers to raise the bar for
> > > accepting patches, when not rejecting them outright.
> >
> > Yes, I explicitly said this in response to somebody for whom there was
> > overwhelming evidence they were submitting AI slop, and that they'd need to
> > build it back up again.
> >
> > It's precisely the issue as I see it.
> >
> > But others within the community disagreed with me, so it turned into a very
> > long and draining discussion that I don't particularly wish to repeat.
> >
> > So we really need clarity on it being OK to do this (I remember saying this
> > last year when I made an ultimately unsuccessful submission to the
> > maintainer's summit about all this :)
> >
> > What matters overall is being able to _quickly_ dismiss AI slop so that the
> > asymmetry between LLM generation and maintainer time isn't exploited.
> >
> > And ultimately I think the trust model will end up being 'newcomers have 0,
> > now build it up'.
> >
> > Which sucks but this issue is simply existential for open source.
> >
>
> Has anybody tried throwing any of the obvious LLM slop submissions we
> have seen into one of these LLM detector things? To be clear, I've never
> tried those so I'm certainly no authority on if they even work reliably,
> but if so I wonder if something like that is a potential solution for
> eliminating the worst cases..
>
> I.e., suppose we had some Sashiko type LLM/bot whose job was mainly to
> detect purely LLM generated content based on some minimum level of
> confidence and reply with a loud and clear message to the thread. Maybe
> that would be a clear enough signal to maintainers and reviewers that
> something is not worth prioritizing for review.. Maybe also some "slop
> detected" feedback would help disincentivize flinging slop onto the
> lists. At the very least that could be something that is more easily
> configured/enabled per-subsystem without having to use per-subsystem
> commit tags.

Yup I thought of this, have done this on series and they do detect it
reliably.

But then it becomes an arms race. People will get AI to try to defeat AI
detection. So I'm not sure it's a safe road to go down.

>
> Brian
>

Thanks, Lorenzo


Thread overview: 34+ messages
2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
2026-07-01 16:08 ` Mark Brown
2026-07-02  7:10   ` Christian Brauner
2026-07-02 11:35     ` Mark Brown
2026-07-01 16:08 ` Jonathan Corbet
2026-07-01 16:12   ` David Hildenbrand (Arm)
2026-07-02  7:11   ` Christian Brauner
2026-07-02  9:51   ` David Disseldorp
2026-07-01 16:10 ` David Hildenbrand (Arm)
2026-07-02  7:27   ` Christian Brauner
2026-07-02  7:46     ` David Hildenbrand (Arm)
2026-07-02  8:10       ` Laurent Pinchart
2026-07-02  8:16         ` David Hildenbrand (Arm)
2026-07-02 10:04       ` Lorenzo Stoakes
2026-07-02 11:51         ` David Hildenbrand (Arm)
2026-07-02  8:08     ` Laurent Pinchart
2026-07-02  8:28       ` Christian Brauner
2026-07-02  9:24   ` Lorenzo Stoakes
2026-07-01 18:35 ` Jeff Layton
2026-07-01 18:53   ` Jakub Kicinski
2026-07-02  7:29     ` Christian Brauner
2026-07-02  7:28   ` Christian Brauner
2026-07-02  8:12 ` Jori Koolstra
2026-07-02  8:44   ` Vlastimil Babka (SUSE)
2026-07-02  9:09     ` Jori Koolstra
2026-07-02  9:39       ` Lorenzo Stoakes
2026-07-02  9:37     ` Lorenzo Stoakes
2026-07-02  9:38     ` Laurent Pinchart
2026-07-02  9:44       ` Lorenzo Stoakes
2026-07-02 11:57         ` Brian Foster
2026-07-02 12:18           ` Lorenzo Stoakes
2026-07-02 10:34     ` Krzysztof Kozlowski
2026-07-02 10:27 ` Krzysztof Kozlowski
2026-07-02 11:27 ` Christoph Hellwig
