Re: [PATCH v2] docs/devel: relax policy on AI-generated contributions

From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org, "Michael S. Tsirkin" <mst@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Alistair Francis" <alistair.francis@wdc.com>,
	"BALATON Zoltan" <balaton@eik.bme.hu>,
	"Fabiano Rosas" <farosas@suse.de>,
	"Kevin Wolf" <kwolf@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Warner Losh" <imp@bsdimp.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>,
	"Paolo Bonzini" <bonzini@gnu.org>
Subject: Re: [PATCH v2] docs/devel: relax policy on AI-generated contributions
Date: Wed, 3 Jun 2026 18:54:23 +0100
Message-ID: <aiBqT1KfI-qowr4L@redhat.com>
In-Reply-To: <CABgObfZNX2rLnB=FosVopFmLB_S_5V48UPLjNTTtn543d9RfHg@mail.gmail.com>

On Wed, Jun 03, 2026 at 05:35:46PM +0200, Paolo Bonzini wrote:
> On Wed, Jun 3, 2026 at 4:59 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > @@ -288,62 +288,108 @@ content generators below.
> > >  Use of AI-generated content
> > >  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
> > > +Risks to open source projects include maintainer burnout from an
> > > +increased number of contributions, as well as the risk to the project
> > > +from unintentional inclusion of copyrighted material in the LLM's output.
> > > +In order to mitigate these risks, the QEMU project currently allows
> > > +using AI/LLM tools to produce patches in a limited set of scenarios:
> >
> > If we're opening the door to AI assisted contribution, then IMHO we
> > need to write about both the social and technical expectations.
> >
> > Admittedly that will expand the scope of your proposal here, but
> > IMHO that's somewhat unavoidable. A significant part of the downsides
> > of AI-assisted contributions comes from bad social practices, rather
> > than merely bad technical practices.
> >
> > As a general theme, I would like us to emphasize at the start that the
> > act of collaboration & contribution in QEMU is about the interaction,
> > trust and relationships between humans, not bots.
> 
> I agree that it needs to be there somewhere.  On the other hand I'm a
> bit worried about having a treatise that no one will read -- at least
> with stuff like "writing a commit message" we can point people to it.

It doesn't have to be an especially long introduction in the context
of the AI policy doc. If we want something more verbose that could
fit elsewhere in our contribution docs. The AI policy should just
make a point that we expect to be communicating with people not
bots pretending to be people.

> > > +**Small bug fixes**
> > > +  These should be limited to 20 lines of code or less, not including
> > > +  tests.  You are still expected to :ref:`understand and explain your changes
> > > +  <write_a_meaningful_commit_message>` and the rationale behind them.
> >
> > I think the "20 lines or less" is not doing a good job at expressing
> > the intent behind this point. I'd like us to emphasize the
> > "why" of this point, as that helps contributors & reviewers make a
> > decision of whether a change is "within the spirit" of the rule or
> > not.
> 
> True but we also need a rule. The spirit is better explained elsewhere
> (and also, building consensus on spirit vs. a rule are two different
> things).

Do we have a better place elsewhere in this case? It is a point
specifically about the intent of the AI policy rule.

> > Docs is an area I'm more wary of from the social expectation side rather
> > than the technical or legal side.  I don't feel like "pay attention to
> > the organization and flow" really mitigates the tendency to produce
> > vast reams of convincing sounding slop.
> 
> Reviewers have no obligation to review.  The good thing about slop is
> that saying no takes about the same effort as the author put into the
> creation of the change.

That is true, but at the same time even if you merely say "no", being on
the receiving end of it drains your mental energy, as you still have to
pay some attention in order to decide whether to say no or not. I'm
pretty wary of a policy that is liable to unleash voluminous docs
submissions on maintainers.

Can we do this more incrementally and have a more tightly constrained
guide for docs initially, and review it again at a later date if we
feel it is worth relaxing further?

> > IMHO it should not be at the discretion of individual maintainers to
> > accept large-scale AI authored changes outside these guidelines. To
> > quote the commit message rationale
> >
> >    "Therefore, it remains prudent to only permit AI assistance where
> >     the ramifications of copyright violations are at least easy to
> >     revert and unlikely to spread"
> >
> > that does not suggest we should leave it to the discretion of maintainers
> > to override the guidelines.
> 
> See my reply to Peter elsewhere in the thread. I agree with your
> concerns for both docs and discretion, but I had specific uses in mind
> that I'd like to allow.
> 
> For docs:
> - create tutorials and/or feature documentation based on functional tests

That doesn't sound too appealing to me. Reverse engineering docs
or tutorials from our functional tests is exactly the kind of
thing that feels likely to result in voluminous text of marginal
value which will place a large burden on reviewers.

> - create function comments (including Rust doctests) based on a
> high-level, human-written description of the module

This feels interesting to trial.

The key difference with function comments is that you don't really
have any of the trouble with document structure & coherence. The
API docs should be comparatively small, easier to read & digest,
and easier to give feedback on for accuracy.

> For maintainer discretion:
> - updating patches for changes to kernel APIs, before the kernel side
> is ready for inclusion
> - creation of parsing code for Rust procedural macros based on code
> examples and/or a human-written description of the macro
> - creation of boilerplate code similar to hw/core/hotplug.c
> 
> I think all of these are potentially compelling and I would like
> people to be allowed to experiment with them or similar cases.

Those are largely in direct conflict with the intent behind only
allowing "small bug fixes". Either we accept broad contributions
like this as a project, or we don't - I don't see that as something
that is suitable for per-maintainer discretion. If it will not be
intended for merge, then the policy is irrelevant and people can
just experiment out of tree at will.

> The idea of contacting maintainers beforehand comes from the policy
> currently under discussion in the Rust project.

I presume you're looking at

  https://github.com/rust-lang/rust-forge/pull/1040/changes

What I find interesting there is that their rule that is comparable
to your "small bug fixes" rule emphasizes it is about allowing
"trivial" changes and specifically references the idea of a threshold
for originality / copyrightability. I find that more satisfying than
talking about lines of code and bug fixes.

The talk about experimenting with LLMs for larger changes emphasizes
experimentation and use of PRs in order to trigger the run of tools
from their review pipeline.

That doesn't explicitly say whether such "experiments" are permissible
to be merged though. I don't know if the Rust project has specific
terminology here, but my reading of that was that people can agree
to collaborate publicly on LLM work, but that does not appear to give
individual maintainers permission to waive the LLM policy rules to
merge arbitrary LLM code in the way this QEMU proposal suggests.

> > > +There is no requirement to include your prompts or summarize the
> > > +conversation in the commit message or cover letter, but you may do so
> > > +if you think it helps a reviewer judge the result.  For example:
> >
> > IMHO we should actively discourage the inclusion of prompts
> > entirely as it is the wrong information to provide.
> 
> Why? I think it helps especially in the case where we're asking for
> maintainers to apply their discretion, and for reproducibility. It may
> not be always applicable, but it can also help.
> 
> > > +**Helpful prompts**
> > > +  These describe concrete constraints or instructions, making it easy for a
> > > +  reviewer to see how the tool's output was guided:
> > > +
> > > +  * "move field ``foo`` from ``struct aa`` to ``struct bb``.  If a
> > > +    function already has a local variable or parameter of type ``struct
> > > +    bb``, use it instead of accessing ``aa.bb``"
> > > +
> > > +  * "add an implementation of the trait for ``Mutex<T: MyTrait>``; it
> > > +    takes the lock around the calls and forwards to ``T``"
> >
> > These example prompts are just expressing an aspect that should
> > already have been described in prose in the commit message. We
> > don't need to classify them as "ai prompts" in a commit message,
> > we just need the author to write a useful commit message.
> 
> The commit message does not have to contain this information. For
> example, commit 44a9d1b86c0 does not explain that it implements the
> ToMigrationState{,Shared} traits for Mutex. The commit message could
> say something like
> 
> "The implementation of the traits for types ... were created with AI.
> The prompt was: "add a simple forwarding implementation of the traits
> in rust/migration/src/migratable.rs for the array type [T; N] and for
> the interior mutable types Mutex<T> and BqlRefCell<T>. Note that
> interior mutable types only need T: ToMigrationState in order to
> implement ToMigrationStateShared".

If that is relevant info for reviewers then regardless of whether
it was written by an LLM or a human, that should have been added
to the commit message.

 "This commit adds a simple forwarding implementation of the traits
  in rust/migration/src/migratable.rs for the array type [T; N] and for
  the interior mutable types Mutex<T> and BqlRefCell<T>. Note that
  interior mutable types only need T: ToMigrationState in order to
  implement ToMigrationStateShared".

There is no reason to call out the inclusion of "LLM prompts" as
a concept here. We should be emphasizing that commit messages
should explain their intent in all cases; this is not something
specific to AI authored code.

> I agree that this is not a typical part of a commit message. On the
> other hand we do mention occasionally how a commit was automated, and
> this falls under that case, sort of? See for example commit
> 324b2298fea ("docs/system: convert Texinfo documentation to rST",
> 2020-03-06).

With regards,
Daniel
-- 
|: https://berrange.com       ~~        https://hachyderm.io/@berrange :|
|: https://libvirt.org          ~~          https://entangle-photo.org :|
|: https://pixelfed.art/berrange   ~~    https://fstop138.berrange.com :|
