Re: [RFC PATCH 4/4] docs/code-provenance: make the exception process feasible

From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Peter Maydell <peter.maydell@linaro.org>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, "Alex Bennée" <alex.bennee@linaro.org>,
	"Markus Armbruster" <armbru@redhat.com>,
	"Stefan Hajnoczi" <stefanha@redhat.com>
Subject: Re: [RFC PATCH 4/4] docs/code-provenance: make the exception process feasible
Date: Mon, 22 Sep 2025 15:03:18 +0100
Message-ID: <aNFXJtQu9gFkIwLg@redhat.com>
In-Reply-To: <CAFEAcA_rQhXdavAUCEt8atMhpZYEu0Lz6tVdu4+mfgPOK9iUuw@mail.gmail.com>

On Mon, Sep 22, 2025 at 02:26:00PM +0100, Peter Maydell wrote:
> On Mon, 22 Sept 2025 at 14:05, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Mon, Sep 22, 2025 at 12:46:51PM +0100, Peter Maydell wrote:
> > > On Mon, 22 Sept 2025 at 12:32, Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > >
> > > > I do not think that anyone knows how to demonstrate "clarity of the
> > > > copyright status in relation to training".
> > >
> > > Yes; to me this is the whole driving force behind the policy.
> > >
> > > > On the other hand, AI tools can be used as a natural language refactoring
> > > > engine for simple tasks such as modifying all callers of a given function
> > > > or even less simple ones such as adding Python type annotations.
> > > > These tasks have a very low risk of introducing training material in
> > > > the code base, and can provide noticeable time savings because they are
> > > > easily tested and reviewed; for the lack of a better term, I will call
> > > > these "tasks with limited or non-existing creative content".
> > >
> > > Does anybody know how to demonstrate "limited or non-existing
> > > creative content", which I assume is a standin here for
> > > "not copyrightable" ?
> >
> > That was something we aimed to intentionally avoid specifying in the
> > policy. It is very hard to define it in a way that will be clearly
> > understood by all contributors.
> 
> > TL;DR: I don't think we should attempt to define whether the boundary
> > is between copyrightable and non-copyrightable code changes.
> 
> Well, this is why I think a policy that just says "no" is
> more easily understandable and followable. As soon as we
> start defining and granting exceptions then we're effectively
> in the position of making judgements and defining the boundary.

Whether we have our AI policy or not, contributors are still required
to abide by the terms of the DCO, which requires them to understand
the legal situation of any contribution.

Our policy is effectively saying that most use of AI is such that we
don't think it is possible for contributions to claim DCO compliance.

If we think there are situations where it might be credible for a
contributor to claim DCO compliance, we can try to find a way to
describe that situation, without having to explicitly state our
legal interpretation of the "copyrightable vs non-copyrightable"
boundary.

At KVM Forum, what was notably raised was the topic of code
refactoring and whether it is practical to allow some such
usage.

We have historically allowed machine refactoring done by Coccinelle,
for example. Someone could ask an AI agent to write a Coccinelle
script for a given task, and then tell the AI to run that script
across the code base. I think that might be a situation where it
would be reasonable to accept the AI driven refactoring, as the
substance of the commit is clearly defined by the Coccinelle
script.

Could that be summarized by saying that we'll allow refactoring
if driven via an intermediate script ? That is still quite a
strict definition that could frustrate much usage, but it at
least feels like something that should have greatly reduced
risk compared to direct refactoring by an opaque agent.
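
To make the "intermediate script" idea concrete, here is a minimal
sketch in Python rather than Coccinelle. It is only an illustration:
the names old_helper and new_helper are hypothetical, and a regex
rename is far cruder than a real semantic patch. The point is that
the script is the reviewable artifact, and a reviewer can re-run it
to check that it reproduces the submitted change.

```python
import re

# Hypothetical function names, used only for illustration.
OLD_NAME = "old_helper"
NEW_NAME = "new_helper"

# Match OLD_NAME only as a whole identifier that is followed by "(",
# so that identifiers such as old_helper_count are left untouched.
CALL_RE = re.compile(r"\b" + re.escape(OLD_NAME) + r"\b(?=\s*\()")


def refactor(source: str) -> str:
    """Apply the rename to one file's contents."""
    return CALL_RE.sub(NEW_NAME, source)


def reproduces(before: str, committed: str) -> bool:
    """Reviewer check: does re-running the script on the old
    contents give exactly what was committed?"""
    return refactor(before) == committed
```

With this shape, whoever (or whatever) wrote the script matters less
than whether reproduces() holds for every file touched by the commit.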

As an example though, we have the scripts/clean-includes.pl script
that Markus wrote for manipulating code into our preferred style
for headers.

Whether the header change is done manually by a human, automated
with Markus' Perl script or automated by an AI agent, the end
result should be identical, as there is only one possible end
point and you can describe what that end point should look like.
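
A rough sketch of what "only one possible end point" means in
practice, written in Python. This is not the logic of the real
script: it assumes, purely for illustration, that the preferred style
is "qemu/osdep.h comes first and a few system headers it already
provides are dropped", and the REDUNDANT set below is a made-up
subset.

```python
OSDEP = '#include "qemu/osdep.h"'

# Assumption for illustration: headers treated as already provided
# by osdep.h. The real list would differ.
REDUNDANT = {
    "#include <stdio.h>",
    "#include <stdlib.h>",
    "#include <string.h>",
}


def clean_includes(source: str) -> str:
    """Normalise the include block of one C file's contents."""
    out = []
    seen_include = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("#include"):
            if not seen_include:
                # Place osdep.h where the first include used to be.
                out.append(OSDEP)
                seen_include = True
            if stripped == OSDEP or stripped in REDUNDANT:
                # Drop duplicates of osdep.h and redundant headers.
                continue
        out.append(line)
    return "".join(l + "\n" for l in out)
```

Two differently ordered inputs converge on the same output, and
running the function a second time changes nothing, which is what
makes the result checkable regardless of how it was produced.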

That said, there is still a question mark over complexity. Getting
to the end point may be a trivial & mundane exercise in some cases,
while requiring considerable intellectual thought in other cases.
The latter is perhaps especially true if wanting a simple, easily
bisected series of small steps rather than a big bang conversion.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
