Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Daniel P. Berrangé" <berrange@redhat.com>
To: Markus Armbruster <armbru@redhat.com>
Cc: "Paolo Bonzini" <pbonzini@redhat.com>,
	qemu-devel@nongnu.org, "Alex Bennée" <alex.bennee@linaro.org>,
	"John Snow" <jsnow@redhat.com>, "Kevin Wolf" <kwolf@redhat.com>,
	"Peter Maydell" <peter.maydell@linaro.org>,
	"Stefan Hajnoczi" <stefanha@redhat.com>,
	"Mads Ynddal" <mads@ynddal.dk>
Subject: Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
Date: Wed, 8 Oct 2025 11:34:36 +0100	[thread overview]
Message-ID: <aOY-PFGNPY7aOwkJ@redhat.com> (raw)
In-Reply-To: <87ikgpn9oz.fsf@pond.sub.org>

On Wed, Oct 08, 2025 at 09:18:04AM +0200, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > [People in Cc are a mix of Python people, tracing people, and people
> >  who followed the recent AI discussions. - Paolo]
> >
> > This series adds type annotations to tracetool. While useful on its own, 
> > it also served as an experiment in whether AI tools could be useful and
> > appropriate for mechanical code transformations that may not involve
> > copyrightable expression.
> >
> > In this version, the types were added mostly with the RightTyper tool
> > (https://github.com/RightTyper/RightTyper), which uses profiling to detect
> > the types of arguments and return types at run time.  However, because
> > adding type annotations is such a narrow and verifiable task, I also developed
> > a parallel version using an LLM, to provide some data on topics such as:
> >
> > - how much choice/creativity is there in writing type annotations?
> >   Is it closer to writing functional code or to refactoring?
> 
> Based on my work with John Snow on typing of the QAPI generator: there
> is some choice.
> 
> Consider typing a function's argument.  Should we pick it based on what
> the function requires from its argument?  Or should the type reflect how
> the function is used?
> 
> Say the function iterates over the argument.  So we make the argument
> Iterable[...], right?  But what if all callers pass a list?  Making it
> List[...] could be clearer then.  It's a choice.
> 
> I think the choice depends on context and taste.  At some library's
> external interface, picking a more general type can make the function
> more generally useful.  But for some internal helper, I'd pick the
> actual type.
> 
> My point isn't that an LLM could not possibly do the right thing based
> on context, and maybe even "taste" distilled from its training data.  My
> point is that this isn't entirely mechanical with basically one correct
> output.
>
> Once we have such judgement calls, there's the question how an LLM's
> choice depends on its training data (first order approximation today:
> nobody knows), and whether and when that makes the LLM's output a
> derived work of its training data (to be settled in court).

There's perhaps a missing step here. The work first has to be considered
eligible for copyright protection, before we get onto questions wrt
derivative works, but that opens a can of worms....

The big challenge is that traditionally projects did not really have
to think much about where the "threshold of originality"[1] came into
play for a work to be considered eligible for copyright.

It was fine to take the conservative view that any patch benefits
from copyright protection for the individual (or more often their
employer). There was rarely any compelling need for the project to
understand if something failed to meet the threshold. We see this
in cases where a project re-licenses code - it is simpler/cheaper
just to contact all contributors for permission, than to evaluate
which subset held copyright.

This is a very good thing. Understanding where that threshold applies
is an challenging intellectual task that consumes time better spent
on other tasks. Especially notable though is that threshold varies
across countries in the world, and some places have at times even
considered merely expending labour to make a work eligible.

In trying to create an policy that permits AI contributions in some
narrow scenarios, we're trying to thread the needle and as a global
project that implies satisfying a wide variety of interpretations
for the copyright threshold. There's no clear cut answer to that,
the only option is to mitigate risk to a tolerable level.

That's something all projects already do. For example when choosing
between a CLA, vs a DCO signup, vs implicitly expecting contributors
to be good, we're making a risk/cost tradeoff.

Back to the LLMs / type annotations question. Let us accept there
there is a judgement in situations like Iterable[...] vs List[...],
and that an LLM makes that call. The LLM's basis for that judgement
call is certainly biased from what it saw in training data, just as
our own judgement call would be biased from what we've seen before
in python type annotations in other codebases.

In either case though, I have a hard time proposing why the result
of that judgement call makes the output a derived work of anything
other than the current QEMU codebase (and possibly the LLM prompt).
To say otherwise would mean my work is tainted by every codebase
I've ever read.

The most plausible scenario of "copying" existing copyright content
is where a fork of the QEMU code already exists in the training
material with the type annotations present. If that was the case
though, that pre-existing work would already be considered a
derivative of QEMU code and thus compatibly licensed. What we'd
be loosing would be the explicit SoB of the original author (if
any), rather than facing any clear license compatibility problem.

If we consider that type annotations are (a) borderline for copyright
protection eligibility and (b) the type annotations are unlikely to be
claimed as derived work of anything other than QEMU; then this looks
like a low risk scenario for use of LLM.

The problem I have remaining is around the practicality of expressing
this in a policy and having contributors & maintainers apply it well
in practice.

There's a definite "slippery slope" situation. The incentive for
contributors will be to search out reasons to justify why a work
matches the AI exception, while overworked maintainers probably
aren't going to spend much (any) mental energy on objectively
analysing the situation, unless the proposal looks terribly out
of line with expectations.

I'm not saying we shouldn't have an exception. I think the mypy case
is an fairly compelling example of a low risk activity. We will need
to become comfortable with the implications around the incentives
for contributors & maintainers when applying it though.

With regards,
Daniel

[1] https://en.wikipedia.org/wiki/Threshold_of_originality
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

next prev parent reply	other threads:[~2025-10-08 10:35 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
2025-10-08  6:35 ` [PATCH 1/6] tracetool: rename variable with conflicting types Paolo Bonzini
2025-10-08  9:27   ` Daniel P. Berrangé
2025-10-08 18:06   ` Stefan Hajnoczi
2025-10-08  6:35 ` [PATCH 2/6] tracetool: apply isort and add check Paolo Bonzini
2025-10-08  9:29   ` Daniel P. Berrangé
2025-10-08 17:58   ` Stefan Hajnoczi
2025-10-09  7:52     ` Daniel P. Berrangé
2025-10-09  8:22       ` Paolo Bonzini
2025-10-09  8:58         ` Daniel P. Berrangé
2025-10-09 11:26           ` Paolo Bonzini
2025-10-09 12:05             ` Daniel P. Berrangé
2025-10-09  7:58     ` Paolo Bonzini
2025-10-08  6:35 ` [PATCH 3/6] tracetool: "import annotations" Paolo Bonzini
2025-10-08  9:31   ` Daniel P. Berrangé
2025-10-08 18:06   ` Stefan Hajnoczi
2025-10-08  6:35 ` [PATCH 4/6] tracetool: add type annotations Paolo Bonzini
2025-10-08 18:09   ` Stefan Hajnoczi
2025-10-08  6:35 ` [PATCH 5/6] tracetool: complete typing annotations Paolo Bonzini
2025-10-08 18:10   ` Stefan Hajnoczi
2025-10-08  6:35 ` [PATCH 6/6] tracetool: add typing checks to "make -C python check" Paolo Bonzini
2025-10-08  9:32   ` Daniel P. Berrangé
2025-10-08 18:12   ` Stefan Hajnoczi
2025-10-08  7:18 ` [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Markus Armbruster
2025-10-08 10:34   ` Daniel P. Berrangé [this message]
2025-10-10 12:38     ` Markus Armbruster
2025-10-10 13:49       ` Paolo Bonzini
2025-10-10 17:41         ` Markus Armbruster
2025-10-08 10:40 ` Daniel P. Berrangé
2025-10-08 17:23 ` Stefan Hajnoczi
2025-10-08 17:41   ` Paolo Bonzini
2025-10-14 19:02 ` Stefan Hajnoczi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aOY-PFGNPY7aOwkJ@redhat.com \
    --to=berrange@redhat.com \
    --cc=alex.bennee@linaro.org \
    --cc=armbru@redhat.com \
    --cc=jsnow@redhat.com \
    --cc=kwolf@redhat.com \
    --cc=mads@ynddal.dk \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.