From: "Daniel P. Berrangé" <berrange@redhat.com>
To: BALATON Zoltan <balaton@eik.bme.hu>
Cc: "Michael S. Tsirkin" <mst@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>,
	"Alex Bennée" <alex.bennee@linaro.org>,
	"Alistair Francis" <alistair.francis@wdc.com>,
	"Fabiano Rosas" <farosas@suse.de>,
	"Kevin Wolf" <kwolf@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Warner Losh" <imp@bsdimp.com>,
	"Philippe Mathieu-Daudé" <philmd@linaro.org>
Subject: Re: [PATCH v2] docs/devel: relax policy on AI-generated contributions
Date: Fri, 5 Jun 2026 14:00:35 +0100
Message-ID: <aiLIc6C7S_bVtlN_@redhat.com>
In-Reply-To: <1cb908e1-2d9a-3333-240e-1f7023c5c09e@eik.bme.hu>

On Fri, Jun 05, 2026 at 02:39:35PM +0200, BALATON Zoltan wrote:
> On Fri, 5 Jun 2026, Daniel P. Berrangé wrote:
> > On Fri, Jun 05, 2026 at 05:48:37AM -0400, Michael S. Tsirkin wrote:
> > > On Fri, Jun 05, 2026 at 10:39:15AM +0100, Daniel P. Berrangé wrote:
> > > > On Fri, Jun 05, 2026 at 05:25:36AM -0400, Michael S. Tsirkin wrote:
> > > > > On Fri, Jun 05, 2026 at 10:17:16AM +0100, Daniel P. Berrangé wrote:
> > > > > > IMHO need unconditional disclosure, because the use of the LLM impacts
> > > > > > the license of the code. QEMU is traditionally expected to be GPLv2+
> > > > > > licensed for all new code, but there's the train of thought that LLM
> > > > > > code is public domain. If it gets human editing afterwards we can
> > > > > > consider that the human edits are GPLv2+ licensed, but IMHO we still
> > > > > > want to know the origins.
> > > > > 
> > > > > Wait that's a big ask.
> > > > > 
> > > > > DOC explicitly does not ask if code might be available anywhere else
> > > > > under any other license. Just that contributor can contribute under GPL.
> > > > > If it's public domain then the human can license it under GPL.
> > > > 
> > > > For new files, in checkpatch we validate that SPDX-License-Identifier
> > > > is explicitly set as GPL-2.0-or-later. Contributors are expected to
> > > > justify any divergence in the commit message.
> > > > 
> > > > I've seen guidance that SPDX-License-Identifier for AI output code
> > > > should NOT state a license, under the theory it is public domain.
> > > 
> > > Not state a license? Recommended by a lawyer? Seen where? Why?
> > 
> > https://www.redhat.com/en/blog/ai-assisted-development-and-open-source-navigating-legal-issues
> > 
> >  "The harder case is when an entire source file, or even
> >   an entire repository, is generated by AI. Here, adding
> >   a copyright and license notice may be inappropriate
> >   unless and until human contributions transform the file
> >   into a copyrightable work. "
> > 
> > I interpret that to suggest we should not automatically use
> > SPDX-License-Identifier: GPL-2.0-or-later on LLM generated
> > code, unless subsequent human editing was non-trivial.
> 
> The presumption that LLM generated code is public domain is dubious. If you
> tell it to regenerate part of QEMU source after it has seen the GPL sources
> and it comes up with something equivalent does that make the generated
> version public domain? If so people could just rewrite GPL code and make it
> proprietary. This can't be right as the generated code will likely contain
> parts copied from the original so still fall under GPL. What if I just tell
> LLM to rewrite QEMU in C++? Will that make a public domain version that I
> can then make closed source even though it still contains large parts of GPL
> code? I don't think so. The code generated by LLM comes from somewhere but
> nobody can tell where from so also nobody knows what licence it is. If
> you're lucky it comes from examples or other sources with a free licence but
> could be anything even some open source code not compatible with GPL or
> proprietary code. The idea of public domain probably comes from the fact that
> there's no human to hold the copyright, but what about cases of an LLM copying
> copyleft code? That should not make it public domain. This is similar to the case
> when somebody who worked on a proprietary code before then writes some open
> source code that does similar things or vice versa. What is the legal status
> of those cases? Can the other party claim copyright for the code? Probably
> only if the person recalls whole parts that resemble each other closely
> which could happen. The risk is probably the same with LLMs and thus the
> handling of this should be similar probably. This seems more complex than
> assuming anything from an LLM is public domain.

Yes, I should have clarified my comments better. I did not mean to
imply that everything/anything from an LLM is public domain.

The "public domain" argument does indeed come from the idea that
only humans can own copyright, and IMHO can apply *only* in the
case where you can credibly consider it to NOT be a direct derived
work of an existing licensed work.

If you're instructing an AI to clone QEMU into a different language
there's a strong argument the result would be a derived work.

If you're instructing an AI to write a non-trivial feature with
creative work and that is following a non-trivial design pattern
that is common in other areas of QEMU, there's also a decent
argument that the result would be a derived work and thus also
liable to be GPL.

This is not the kind of usage that's being proposed for QEMU though.
The kind of scenarios being considered are borderline for creativity
and thus questionable whether they would meet the threshold for
copyrightability even for a human author. 

With regards,
Daniel
-- 
|: https://berrange.com       ~~        https://hachyderm.io/@berrange :|
|: https://libvirt.org          ~~          https://entangle-photo.org :|
|: https://pixelfed.art/berrange   ~~    https://fstop138.berrange.com :|


