Re: [PATCH v3 3/3] docs: define policy forbidding use of AI code generators

From: Markus Armbruster <armbru@redhat.com>
To: "Philippe Mathieu-Daudé" <philmd@linaro.org>
Cc: "Daniel P. Berrangé" <berrange@redhat.com>,
	"Stefan Hajnoczi" <stefanha@gmail.com>,
	qemu-devel@nongnu.org, "Thomas Huth" <thuth@redhat.com>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	"Gerd Hoffmann" <kraxel@redhat.com>,
	"Mark Cave-Ayland" <mark.cave-ayland@ilande.co.uk>,
	"Kevin Wolf" <kwolf@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Alexander Graf" <agraf@csgraf.de>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Richard Henderson" <richard.henderson@linaro.org>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Pierrick Bouvier" <pierrick.bouvier@linaro.org>
Subject: Re: [PATCH v3 3/3] docs: define policy forbidding use of AI code generators
Date: Wed, 04 Jun 2025 11:04:47 +0200
Message-ID: <8734cfyj0w.fsf@pond.sub.org>
In-Reply-To: <3df2ae5d-c1c6-45ee-8119-ca42e17a0d98@linaro.org> ("Philippe Mathieu-Daudé"'s message of "Wed, 4 Jun 2025 09:54:33 +0200")

Philippe Mathieu-Daudé <philmd@linaro.org> writes:

> On 4/6/25 09:15, Daniel P. Berrangé wrote:
>> On Wed, Jun 04, 2025 at 08:17:27AM +0200, Markus Armbruster wrote:
>>> Stefan Hajnoczi <stefanha@gmail.com> writes:
>>>
>>>> On Tue, Jun 3, 2025 at 10:25 AM Markus Armbruster <armbru@redhat.com> wrote:
>>>>>
>>>>> From: Daniel P. Berrangé <berrange@redhat.com>
>>>>> +The increasing prevalence of AI code generators, most notably but not limited
>>>>
>>>> More detail is needed on what an "AI code generator" is. Coding
>>>> assistant tools range from autocompletion to linters to automatic code
>>>> generators. In addition there are other AI-related tools like ChatGPT
>>>> or Gemini as a chatbot that people can use like Stack Overflow or an
>>>> API documentation summarizer.
>>>>
>>>> I think the intent is to say: do not put code that comes from _any_ AI
>>>> tool into QEMU.
>>>>
>>>> It would be okay to use AI to research APIs, algorithms, brainstorm
>>>> ideas, debug the code, analyze the code, etc but the actual code
>>>> changes must not be generated by AI.
>> 
>> The scope of the policy is around contributions we receive as
>> patches with SoB. Researching / brainstorming / analysis etc
>> are not contribution activities, so not covered by the policy
>> IMHO.
>> 
>>>
>>> The existing text is about "AI code generators".  However, the "most
>>> notably LLMs" that follows it could lead readers to believe it's about
>>> more than just code generation, because LLMs are in fact used for more.
>>> I figure this is your concern.
>>>
>>> We could instead start wide, then narrow the focus to code generation.
>>> Here's my try:
>>>
>>>    The increasing prevalence of AI-assisted software development results
>>>    in a number of difficult legal questions and risks for software
>>>    projects, including QEMU.  Of particular concern is code generated by
>>>    `Large Language Models
>>>    <https://en.wikipedia.org/wiki/Large_language_model>`__ (LLMs).
>> 
>> Documentation we maintain has the same concerns as code.
>> So I'd suggest substituting 'code' with 'code / content'.
>
> Why couldn't we accept documentation patches improved using LLM?
>
> As a non-native English speaker who is often stuck trying to describe
> function APIs, I'm very tempted to use an LLM to review my sentences
> and make them easier to understand.

I understand the temptation!  Unfortunately, the "legal questions and
risks" Daniel described apply to *any* kind of copyrightable material,
not just to code.

Quote:

    To satisfy the DCO, the patch contributor has to fully understand the
    copyright and license status of code they are contributing to QEMU. With AI
    code generators, the copyright and license status of the output is ill-defined
    with no generally accepted, settled legal foundation.

    Where the training material is known, it is common for it to include large
    volumes of material under restrictive licensing/copyright terms. Even where
    the training material is all known to be under open source licenses, it is
    likely to be under a variety of terms, not all of which will be compatible
    with QEMU's licensing requirements.

    How contributors could comply with DCO terms (b) or (c) for the output of AI
    code generators commonly available today is unclear.  The QEMU project is not
    willing or able to accept the legal risks of non-compliance.

[...]
