Re: [PATCH v2] docs/devel: relax policy on AI-generated contributions

From: "Michael S. Tsirkin" <mst@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, "Alex Bennée" <alex.bennee@linaro.org>,
	"Alistair Francis" <alistair.francis@wdc.com>,
	"BALATON Zoltan" <balaton@eik.bme.hu>,
	"Daniel P. Berrangé" <berrange@redhat.com>,
	"Fabiano Rosas" <farosas@suse.de>,
	"Kevin Wolf" <kwolf@redhat.com>, "Warner Losh" <imp@bsdimp.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Paolo Bonzini" <bonzini@gnu.org>
Subject: Re: [PATCH v2] docs/devel: relax policy on AI-generated contributions
Date: Fri, 29 May 2026 11:46:47 -0400
Message-ID: <20260529114114-mutt-send-email-mst@kernel.org>
In-Reply-To: <CAFEAcA_EWR_MjH1arCdi2y0saZ-Hc7zRAK97YTeEcesVRb9Rtg@mail.gmail.com>

On Fri, May 29, 2026 at 04:34:45PM +0100, Peter Maydell wrote:
> On Fri, 29 May 2026 at 10:46, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > Until now QEMU's code provenance policy declined any contribution
> > believed to include or derive from AI-generated content.  A blanket ban
> > was easy to maintain while LLM output was rarely usable on its own, but
> > as the tools improved an absolute prohibition has become harder to
> > justify.
> >
> > The concern that motivated the policy is unchanged, and it is worth stating
> > precisely: the DCO is about whether the submitter has the legal right to
> > contribute the code, not about "creative expression".  While the status of
> > LLM output seems to be converging towards non-copyrightability, questions
> > around unintentional reproduction of copyrighted code are still open.
> > What has shifted is the balance of risk:
> >
> > - projects accepting AI-assisted content have not run into serious
> >   legal trouble so far, which suggests the probability of the risk
> >   materializing is not high;
> >
> > - other organizations, such as Red Hat[1], have assessed the risk as
> >   acceptable -- though a community of individual developers does not
> >   have the legal backing of a company, and even an unfounded dispute
> >   would be a long-lasting distraction from work on QEMU.
> >
> > Nevertheless, even Red Hat mentions that "the possibility of occasional
> > replication cannot be ignored".  In QEMU's view, attentiveness and
> > oversight are not a practical way to address this; yet as a copyleft
> > project, copyright and code provenance are of utmost importance to us.
> > Therefore, it remains prudent to only permit AI assistance where the
> > ramifications of copyright violations are at least easy to revert and
> > unlikely to spread: tests, documentation, mechanical changes, and small
> > bug fixes.  Core code that other things depend on, and that cannot
> > simply be thrown away once a problem is noticed long after the fact,
> > stays off-limits without prior agreement from a maintainer.
> 
> This all makes sense to me, except for the part where we allow
> a maintainer to say "actually it's OK". Where our justification
> for not wanting AI contributions rests on "it's too much burden
> on maintainers to have to deal with and review it", allowing an
> individual maintainer to say "I'm OK with that burden in this case
> or for this particular contribution" logically follows as a
> possible relaxation. But if as a project we want to limit the
> blast-radius if we find we have to rip out a hypothetical tainted
> contribution, shouldn't that mean that we hold that as a project-wide
> line, rather than leaving it up to the opinion of the individual
> maintainer ?

I guess the maintainer can judge that the code is unique and
QEMU-specific enough, and follows mechanically enough from what it is
doing, that the chance of it accidentally copying something is nil?


> > Related to this, and already visible in the incredible uptick in
> > security reports, is the question of maintainer burnout and the shift in
> > effort from the author to the reviewer of the code.  AI lowers the cost of
> > producing a patch but does nothing to lower the cost of understanding and
> > reviewing one; if anything it raises it, since a reviewer can no longer
> > assume that the submitter has reasoned through every line.  The limits
> > above work just as much to keep the volume of review work sustainable.
> >
> > Revise the policy according to the above considerations, and introduce the
> > "AI-used-for:" trailer as a record of where AI was used.  The standard is
> > slightly different from the more usual "Assisted-by"; the intention is for
> > the metadata to provide more information for reviewers to judge the result.
> >
> > In any case, use of AI does not relax any other contribution requirement:
> > authors still comply with the DCO and take responsibility for the whole
> > patch via Signed-off-by.
> >
> > [Commit message largely based on
> >  https://lore.kernel.org/qemu-devel/ahXbxzB4C_lr6b0N@redhat.com/, by
> >  Kevin Wolf. - Paolo]
> 
> > +**Documentation and code comments**
> > +  While AI can help draft text, it still requires significant human
> > +  oversight.  Pay attention to the organization and flow of the generated
> > +  text, and strictly fact-check all technical details as LLMs are prone
> > +  to being confidently wrong.
> 
> I think the application to documentation and comments is the part
> I'm least enthusiastic about here.

But I am very enthusiastic about less ungrammatical English in both.
AI is super helpful for non-native speakers.

> For changes to code, we have at
> least some guardrails on the AI output, in the fact that it has to
> compile and to pass tests. For changes to documentation, the
> only guardrails are human eyeballs.
> 
> Also both comments and documentation ideally are a record of
> what we intended the behaviour to be. If an LLM is effectively
> autogenerating something documentation-shaped from the code we
> lose that.
> 
> -- PMM
