[PATCH RFC] coding-assistants: simplify attribution

* [PATCH RFC] coding-assistants: simplify attribution
@ 2026-07-01 15:54 Christian Brauner
  2026-07-01 16:08 ` Mark Brown
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Christian Brauner @ 2026-07-01 15:54 UTC
  To: Linus Torvalds, Jonathan Corbet
  Cc: Jens Axboe, David Hildenbrand, Jeff Layton, Vlastimil Babka,
	workflows, linux-doc, linux-kernel, linux-fsdevel,
	Christian Brauner (Amutable)

I remain very confused by our coding assistant contribution guidelines.
I'm going to be a bit polemic now but this is seriously in good faith.

Why precisely do we require all this detailed information about what
specific coding assistant was used?

I find it very irritating that our git history has effectively started
to function a bit like a free advertising platform for a bunch of AI
companies and their proprietary agents and models.

And it remains unclear to me what exactly we do get out of this detailed
information: Do we want to run statistical analysis on what agent and
model is used the most and publish that on LWN at some point?

I acknowledge that my stance is even more radical: imho we should just
drop any disclosure requirements completely. It's useless imho.
We already see that other than core contributors most people don't care
and will just not disclose their usage of AI. I think this is entirely
pointless and worse it brings in undefined legal status as well. It's
not like recent events of pulling certain models from the face of the
earth have made this any less concerning.

But fine, if we want to do this can we please just dumb it down to

Assisted-by: LLM

or

Assisted-by: Coding Assistant

or something else. That still gives the "careful review" signal to
reviewers that want to pay special attention to LLM generated work while
avoiding this slew of metadata.

Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
---
 Documentation/process/coding-assistants.rst | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/Documentation/process/coding-assistants.rst b/Documentation/process/coding-assistants.rst
index 899f4459c52d..fe34f3e7e828 100644
--- a/Documentation/process/coding-assistants.rst
+++ b/Documentation/process/coding-assistants.rst
@@ -43,12 +43,8 @@ When AI tools contribute to kernel development, proper attribution
 helps track the evolving role of AI in the development process.
 Contributions should include an Assisted-by tag in the following format::

-  Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
+  Assisted-by: LLM [TOOL1] [TOOL2]

-Where:
-
-* ``AGENT_NAME`` is the name of the AI tool or framework
-* ``MODEL_VERSION`` is the specific model version used
 * ``[TOOL1] [TOOL2]`` are optional specialized analysis tools used
   (e.g., coccinelle, sparse, smatch, clang-tidy)

@@ -56,4 +52,4 @@ Basic development tools (git, gcc, make, editors) should not be listed.

 Example::

-  Assisted-by: Claude:claude-3-opus coccinelle sparse
+  Assisted-by: LLM coccinelle sparse

---
base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
change-id: 20260701-work-coding-assistants-650ae1202ee0
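
The two tag grammars touched by the diff above can be compared with a small
parser. This is only a sketch under stated assumptions: the names
`AssistedBy` and `parse_assisted_by` are hypothetical, nothing like this
ships with the kernel tree, and it assumes whitespace-separated tools with
an optional `AGENT:MODEL` first token.

```python
import re
from typing import NamedTuple, Optional


class AssistedBy(NamedTuple):
    agent: str            # "Claude" in the old format, "LLM" in the proposed one
    model: Optional[str]  # "claude-3-opus" in the old format, None otherwise
    tools: list[str]      # optional analysis tools, e.g. coccinelle, sparse


# Hypothetical pattern: the tag name, a colon, then a non-empty value.
_TAG = re.compile(r"^Assisted-by:\s*(?P<value>\S.*)$")


def parse_assisted_by(line: str) -> Optional[AssistedBy]:
    """Parse one trailer line; return None if it is not an Assisted-by tag."""
    match = _TAG.match(line.strip())
    if match is None:
        return None
    first, *tools = match.group("value").split()
    # The old format packs agent and model into one colon-separated token.
    agent, sep, model = first.partition(":")
    return AssistedBy(agent, model if sep else None, tools)
```

Under this grammar the alternative spelling "Assisted-by: Coding Assistant"
would be read as agent "Coding" with a tool named "Assistant", so a
multi-word marker would need a different rule.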

* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
@ 2026-07-01 16:08 ` Mark Brown
  2026-07-01 16:08 ` Jonathan Corbet
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Mark Brown @ 2026-07-01 16:08 UTC
  To: Christian Brauner
  Cc: Linus Torvalds, Jonathan Corbet, Jens Axboe, David Hildenbrand,
	Jeff Layton, Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

On Wed, Jul 01, 2026 at 05:54:48PM +0200, Christian Brauner wrote:

> And it remains unclear to me what exactly we do get out of this detailed
> information: Do we want to run statistical analysis on what agent and
> model is used the most and publish that on LWN at some point?

IIRC it was literally this: have people mention which tools they used so
we can use that to inform our assessment of the patches.  I'm not sure
the differences we're actually seeing are down to the tool rather than
operator skill though.

* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
  2026-07-01 16:08 ` Mark Brown
@ 2026-07-01 16:08 ` Jonathan Corbet
  2026-07-01 16:12   ` David Hildenbrand (Arm)
  2026-07-01 16:10 ` David Hildenbrand (Arm)
  2026-07-01 18:35 ` Jeff Layton
  3 siblings, 1 reply; 7+ messages in thread
From: Jonathan Corbet @ 2026-07-01 16:08 UTC
  To: Christian Brauner, Linus Torvalds
  Cc: Jens Axboe, David Hildenbrand, Jeff Layton, Vlastimil Babka,
	workflows, linux-doc, linux-kernel, linux-fsdevel,
	Christian Brauner (Amutable)

Christian Brauner <brauner@kernel.org> writes:

> I remain very confused by our coding assistant contribution guidelines.
> I'm going to be a bit polemic now but this is seriously in good faith.
>
> Why precisely do we require all this detailed information about what
> specific coding assistant was used?

From my memory of the discussions:

- If a specific LLM turns out to be in a bad position with regard to
  some copyright ruling, we can identify the commits that might have
  been tainted by it.

- Similarly should an LLM prove to have an inclination toward specific
  types of security issues.

Whether either of these would ever actually prove useful is not
something I can hazard a guess at.

> I find it very irritating that our git history has effectively started
> to function a bit like a free advertising platform for a bunch of AI
> companies and their proprietary agents and models.
>
> And it remains unclear to me what exactly we do get out of this detailed
> information: Do we want to run statistical analysis on what agent and
> model is used the most and publish that on LWN at some point?

...wasn't in my plans ...

> I acknowledge that my stance is even more radical: imho we should just
> drop any disclosure requirements completely. It's useless imho.
> We already see that other than core contributors most people don't care
> and will just not disclose their usage of AI.

The widespread ignoring of the disclosure rule is, IMO, something we
need to address somehow; were I still on the TAB, I'd be raising the
issue there.  Either we find a way to be serious about enforcing the
disclosure rule, or we should just drop it.  A rule that everybody
ignores is less than useful.

(That said, 706 commits in 7.2-rc1 include Assisted-by tags, so *some*
people are complying.  That's about 5% of the total.  What do we think
is the actual use of LLMs for the creation of kernel patches?)

jon
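
The compliance figure above (706 tagged commits, about 5%) implies a total
on the order of 14,000 commits. A share like that could be recomputed from
a list of commit messages roughly as follows. This is a sketch, not the
method used for the number quoted in the mail: `assisted_share` is a
hypothetical helper, and it only looks for lines that start with
"Assisted-by:".

```python
def assisted_share(messages):
    """Return (tagged_count, fraction) for a list of commit messages."""
    tagged = sum(
        1
        for message in messages
        # A commit counts once, however many Assisted-by lines it carries.
        if any(line.startswith("Assisted-by:") for line in message.splitlines())
    )
    fraction = tagged / len(messages) if messages else 0.0
    return tagged, fraction
```

The messages themselves would have to come from something like `git log`
over the release range; that step is left out here.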

* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
  2026-07-01 16:08 ` Mark Brown
  2026-07-01 16:08 ` Jonathan Corbet
@ 2026-07-01 16:10 ` David Hildenbrand (Arm)
  2026-07-01 18:35 ` Jeff Layton
  3 siblings, 0 replies; 7+ messages in thread
From: David Hildenbrand (Arm) @ 2026-07-01 16:10 UTC
  To: Christian Brauner, Linus Torvalds, Jonathan Corbet
  Cc: Jens Axboe, Jeff Layton, Vlastimil Babka, workflows, linux-doc,
	linux-kernel, linux-fsdevel

On 7/1/26 17:54, Christian Brauner wrote:
> I remain very confused by our coding assistant contribution guidelines.
> I'm going to be a bit polemic now but this is seriously in good faith.
> 
> Why precisely do we require all this detailed information about what
> specific coding assistant was used?
> 
> I find it very irritating that our git history has effectively started
> to function a bit like a free advertising platform for a bunch of AI
> companies and their proprietary agents and models.
> 
> And it remains unclear to me what exactly we do get out of this detailed
> information: Do we want to run statistical analysis on what agent and
> model is used the most and publish that on LWN at some point?
> 
> I acknowledge that my stance is even more radical: imho we should just
> drop any disclosure requirements completely. It's useless imho.
> We already see that other than core contributors most people don't care
> and will just not disclose their usage of AI. I think this is entirely
> pointless and worse it brings in undefined legal status as well. It's
> not like recent events of pulling certain models from the face of the
> earth have made this any less concerning.
> 
> But fine, if we want to do this can we please just dumb it down to
> 
> Assisted-by: LLM
> 
> or
> 
> Assisted-by: Coding Assistant

I'd prefer this.

The doc states "proper attribution helps track the evolving role of AI in the
development process". If there is another reason why we need the free
advertisement, we should document it.

Side note: if someone instructs an LLM exactly what to do, and would have
achieved the same thing by just typing it in, the tag is not at all helpful
to me (just as "Assisted-by: vim" would not be helpful).

What would be much more relevant to know is to what degree LLMs were used.

Assisted-by: LLM # translate commit message
Assisted-by: LLM # generate some test cases
Assisted-by: LLM # cleanup logic
Assisted-by: LLM # everything and I have no clue what anything in here does

I thought we asked for that in some document, but couldn't immediately find it
(and nobody does that).
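
If tags carried such "# purpose" notes, tallying them would be easy. A
minimal sketch, assuming the note is whatever follows the first "#" on an
Assisted-by line; `annotation_counts` is a hypothetical helper and no
existing document or tool defines this syntax.

```python
from collections import Counter


def annotation_counts(message_lines):
    """Count the free-text notes attached to Assisted-by lines."""
    counts = Counter()
    for line in message_lines:
        if not line.startswith("Assisted-by:"):
            continue
        # Everything after the first '#' is treated as the degree-of-use note.
        _, sep, note = line.partition("#")
        note = note.strip()
        counts[note if sep and note else "(unspecified)"] += 1
    return counts
```

Because the notes are free text, any grouping beyond exact string matches
would need an agreed vocabulary.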

-- 
Cheers,

David

* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 16:08 ` Jonathan Corbet
@ 2026-07-01 16:12   ` David Hildenbrand (Arm)
  0 siblings, 0 replies; 7+ messages in thread
From: David Hildenbrand (Arm) @ 2026-07-01 16:12 UTC
  To: Jonathan Corbet, Christian Brauner, Linus Torvalds
  Cc: Jens Axboe, Jeff Layton, Vlastimil Babka, workflows, linux-doc,
	linux-kernel, linux-fsdevel

>> I acknowledge that my stance is even more radical: imho we should just
>> drop any disclosure requirements completely. It's useless imho.
>> We already see that other than core contributors most people don't care
>> and will just not disclose their usage of AI.
> 
> The widespread ignoring of the disclosure rule is, IMO, something we
> need to address somehow; were I still on the TAB, I'd be raising the
> issue there.  Either we find a way to be serious about enforcing the
> disclosure rule, or we should just drop it.  A rule that everybody
> ignores is less than useful.

Note. :)

We do have people using it, and at least in MM, when we suspect AI usage and get
confirmation, we tell people to use proper tags. (This does not happen often,
IIRC.)

-- 
Cheers,

David

* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 15:54 [PATCH RFC] coding-assistants: simplify attribution Christian Brauner
                   ` (2 preceding siblings ...)
  2026-07-01 16:10 ` David Hildenbrand (Arm)
@ 2026-07-01 18:35 ` Jeff Layton
  2026-07-01 18:53   ` Jakub Kicinski
  3 siblings, 1 reply; 7+ messages in thread
From: Jeff Layton @ 2026-07-01 18:35 UTC
  To: Christian Brauner, Linus Torvalds, Jonathan Corbet
  Cc: Jens Axboe, David Hildenbrand, Vlastimil Babka, workflows,
	linux-doc, linux-kernel, linux-fsdevel

On Wed, 2026-07-01 at 17:54 +0200, Christian Brauner wrote:
> I remain very confused by our coding assistant contribution guidelines.
> I'm going to be a bit polemic now but this is seriously in good faith.
> 
> Why precisely do we require all this detailed information about what
> specific coding assistant was used?
> 
> I find it very irritating that our git history has effectively started
> to function a bit like a free advertising platform for a bunch of AI
> companies and their proprietary agents and models.
> 
> And it remains unclear to me what exactly we do get out of this detailed
> information: Do we want to run statistical analysis on what agent and
> model is used the most and publish that on LWN at some point?
> 
> I acknowledge that my stance is even more radical: imho we should just
> drop any disclosure requirements completely. It's useless imho.
> We already see that other than core contributors most people don't care
> and will just not disclose their usage of AI. I think this is entirely
> pointless and worse it brings in undefined legal status as well. It's
> not like recent events of pulling certain models from the face of the
> earth have made this any less concerning.
> 
> But fine, if we want to do this can we please just dumb it down to
> 
> Assisted-by: LLM
> 
> or
> 
> Assisted-by: Coding Assistant
> 
> or something else. That still gives the "careful review" signal to
> reviewers that want to pay special attention to LLM generated work while
> avoiding this slew of metadata.
> 
> Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
> ---
>  Documentation/process/coding-assistants.rst | 8 ++------
>  1 file changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/Documentation/process/coding-assistants.rst b/Documentation/process/coding-assistants.rst
> index 899f4459c52d..fe34f3e7e828 100644
> --- a/Documentation/process/coding-assistants.rst
> +++ b/Documentation/process/coding-assistants.rst
> @@ -43,12 +43,8 @@ When AI tools contribute to kernel development, proper attribution
>  helps track the evolving role of AI in the development process.
>  Contributions should include an Assisted-by tag in the following format::
>  
> -  Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
> +  Assisted-by: LLM [TOOL1] [TOOL2]
>  
> -Where:
> -
> -* ``AGENT_NAME`` is the name of the AI tool or framework
> -* ``MODEL_VERSION`` is the specific model version used
>  * ``[TOOL1] [TOOL2]`` are optional specialized analysis tools used
>    (e.g., coccinelle, sparse, smatch, clang-tidy)
>  
> @@ -56,4 +52,4 @@ Basic development tools (git, gcc, make, editors) should not be listed.
>  
>  Example::
>  
> -  Assisted-by: Claude:claude-3-opus coccinelle sparse
> +  Assisted-by: LLM coccinelle sparse
> 
> ---
> base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
> change-id: 20260701-work-coding-assistants-650ae1202ee0


In general, collecting data for nebulous purposes usually turns out to
be a bad idea. If we're not 100% clear on why we want this data, then
we're probably better off not collecting it at all.

With that in mind: if we're going to water down the tag, then I say
just remove the requirement altogether. If we later decide that we want
to start collecting more detailed info for some (clear) purpose then we
can revisit the idea.
-- 
Jeff Layton <jlayton@kernel.org>

* Re: [PATCH RFC] coding-assistants: simplify attribution
  2026-07-01 18:35 ` Jeff Layton
@ 2026-07-01 18:53   ` Jakub Kicinski
  0 siblings, 0 replies; 7+ messages in thread
From: Jakub Kicinski @ 2026-07-01 18:53 UTC
  To: Jeff Layton, Christian Brauner
  Cc: Linus Torvalds, Jonathan Corbet, Jens Axboe, David Hildenbrand,
	Vlastimil Babka, workflows, linux-doc, linux-kernel,
	linux-fsdevel

On Wed, 01 Jul 2026 14:35:08 -0400 Jeff Layton wrote:
> On Wed, 2026-07-01 at 17:54 +0200, Christian Brauner wrote:
> > I remain very confused by our coding assistant contribution guidelines.
> > I'm going to be a bit polemic now but this is seriously in good faith.
> > 
> > Why precisely do we require all this detailed information about what
> > specific coding assistant was used?
> > 
> > I find it very irritating that our git history has effectively started
> > to function a bit like a free advertising platform for a bunch of AI
> > companies and their proprietary agents and models.

FWIW, this is exactly how I feel. I added a regex to strip these in
my git hooks. So at least the net/ history should be ads-free 🤷️
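
For illustration, a hook-side filter of this kind could look roughly like
the following. This is not the actual hook or regex referred to above,
which the mail does not show; `strip_assisted_by` and the pattern are
assumptions, matching any line that begins with "Assisted-by:".

```python
import re

# Hypothetical pattern: a whole line starting with the tag, plus its newline.
_ASSISTED_BY = re.compile(r"^Assisted-by:.*\n?", re.MULTILINE | re.IGNORECASE)


def strip_assisted_by(message: str) -> str:
    """Return the commit message with every Assisted-by line removed."""
    return _ASSISTED_BY.sub("", message)
```

A commit-msg hook, which git calls with the path of the message file,
could read that file, pass the text through this function, and write it
back. If the tag was the only trailer, a trailing blank line is left
behind.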

Inexperienced developers who just trust the LLM output, and therefore
are the group where the tags would be most useful, tend not to add
them. Either because they are ashamed or because they want full credit.
This correlation kills the utility of the tag.

> > And it remains unclear to me what exactly we do get out of this detailed
> > information: Do we want to run statistical analysis on what agent and
> > model is used the most and publish that on LWN at some point?
> > 
> > I acknowledge that my stance is even more radical: imho we should just
> > drop any disclosure requirements completely. It's useless imho.
> > We already see that other than core contributors most people don't care
> > and will just not disclose their usage of AI. I think this is entirely
> > pointless and worse it brings in undefined legal status as well. It's
> > not like recent events of pulling certain models from the face of the
> > earth have made this any less concerning.
> > 
> > But fine, if we want to do this can we please just dumb it down to
> > 
> > Assisted-by: LLM
> > 
> > or
> > 
> > Assisted-by: Coding Assistant
> > 
> > or something else. That still gives the "careful review" signal to
> > reviewers that want to pay special attention to LLM generated work while
> > avoiding this slew of metadata.
> > 
> > Signed-off-by: Christian Brauner (Amutable) <brauner@kernel.org>
> > ---
> >  Documentation/process/coding-assistants.rst | 8 ++------
> >  1 file changed, 2 insertions(+), 6 deletions(-)
> > 
> > diff --git a/Documentation/process/coding-assistants.rst b/Documentation/process/coding-assistants.rst
> > index 899f4459c52d..fe34f3e7e828 100644
> > --- a/Documentation/process/coding-assistants.rst
> > +++ b/Documentation/process/coding-assistants.rst
> > @@ -43,12 +43,8 @@ When AI tools contribute to kernel development, proper attribution
> >  helps track the evolving role of AI in the development process.
> >  Contributions should include an Assisted-by tag in the following format::
> >  
> > -  Assisted-by: AGENT_NAME:MODEL_VERSION [TOOL1] [TOOL2]
> > +  Assisted-by: LLM [TOOL1] [TOOL2]
> >  
> > -Where:
> > -
> > -* ``AGENT_NAME`` is the name of the AI tool or framework
> > -* ``MODEL_VERSION`` is the specific model version used
> >  * ``[TOOL1] [TOOL2]`` are optional specialized analysis tools used
> >    (e.g., coccinelle, sparse, smatch, clang-tidy)
> >  
> > @@ -56,4 +52,4 @@ Basic development tools (git, gcc, make, editors) should not be listed.
> >  
> >  Example::
> >  
> > -  Assisted-by: Claude:claude-3-opus coccinelle sparse
> > +  Assisted-by: LLM coccinelle sparse
> > 
> > ---
> > base-commit: dc59e4fea9d83f03bad6bddf3fa2e52491777482
> > change-id: 20260701-work-coding-assistants-650ae1202ee0  
> 
> 
> In general, collecting data for nebulous purposes usually turns out to
> be a bad idea. If we're not 100% clear on why we want this data, then
> we're probably better off not collecting it at all.
> 
> With that in mind: if we're going to water down the tag, then I say
> just remove the requirement altogether. If we later decide that we want
> to start collecting more detailed info for some (clear) purpose then we
> can revisit the idea.

+1

Honestly even tool attribution feels increasingly moot.
People vibe code tools and AI-in-the-loop pipelines which they never
publish. Open source tools are (hopefully?) used in pre-commit
pipelines, so they have the "kbuild bot problem" of problems getting
fixed before the code is merged. And we have the same free advertising
problem for the rest.

It's 100 times more important to drill into people to provide sufficient
information in plain English. How was the bug discovered, has it been
triggered / proven and how, what is the user impact. I wonder if
inventing tags distracts contributors from what really matters.
