From: Markus Armbruster <armbru@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-devel@nongnu.org, "Alex Bennée" <alex.bennee@linaro.org>,
"Daniel P . Berrangé" <berrange@redhat.com>,
"John Snow" <jsnow@redhat.com>, "Kevin Wolf" <kwolf@redhat.com>,
"Peter Maydell" <peter.maydell@linaro.org>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Mads Ynddal" <mads@ynddal.dk>
Subject: Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
Date: Wed, 08 Oct 2025 09:18:04 +0200 [thread overview]
Message-ID: <87ikgpn9oz.fsf@pond.sub.org> (raw)
In-Reply-To: <20251008063546.376603-1-pbonzini@redhat.com> (Paolo Bonzini's message of "Wed, 8 Oct 2025 08:35:39 +0200")
Paolo Bonzini <pbonzini@redhat.com> writes:
> [People in Cc are a mix of Python people, tracing people, and people
> who followed the recent AI discussions. - Paolo]
>
> This series adds type annotations to tracetool. While useful on its own,
> it also served as an experiment in whether AI tools could be useful and
> appropriate for mechanical code transformations that may not involve
> copyrightable expression.
>
> In this version, the types were added mostly with the RightTyper tool
> (https://github.com/RightTyper/RightTyper), which uses profiling to detect
> the types of arguments and return types at run time. However, because
> adding type annotations is such a narrow and verifiable task, I also developed
> a parallel version using an LLM, to provide some data on topics such as:
>
> - how much choice/creativity is there in writing type annotations?
> Is it closer to writing functional code or to refactoring?
Based on my work with John Snow on typing of the QAPI generator: there
is some choice.
Consider typing a function's argument. Should we pick it based on what
the function requires from its argument? Or should the type reflect how
the function is used?
Say the function iterates over the argument. So we make the argument
Iterable[...], right? But what if all callers pass a list? Making it
List[...] could be clearer then. It's a choice.
I think the choice depends on context and taste. At some library's
external interface, picking a more general type can make the function
more generally useful. But for some internal helper, I'd pick the
actual type.
My point isn't that an LLM could not possibly do the right thing based
on context, and maybe even "taste" distilled from its training data. My
point is that this isn't entirely mechanical with basically one correct
output.
Once we have such judgement calls, there's the question how an LLM's
choice depends on its training data (first order approximation today:
nobody knows), and whether and when that makes the LLM's output a
derived work of its training data (to be settled in court).
[...]
> Based on this experience, my answer to the copyrightability question is
> that, for this kind of narrow request, the output of AI can be treated as
> the output of an imperfect tool, rather than as creative content potentially
> tainted by the training material.
Maybe.
> Of course this is one data point and
> is intended as an experiment rather than a policy recommendation.
Understood. We need to develop a better understanding of capabilities,
potential benefits and risks, and such experiments can only help with
that.
next prev parent reply other threads:[~2025-10-08 7:19 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-08 6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
2025-10-08 6:35 ` [PATCH 1/6] tracetool: rename variable with conflicting types Paolo Bonzini
2025-10-08 9:27 ` Daniel P. Berrangé
2025-10-08 18:06 ` Stefan Hajnoczi
2025-10-08 6:35 ` [PATCH 2/6] tracetool: apply isort and add check Paolo Bonzini
2025-10-08 9:29 ` Daniel P. Berrangé
2025-10-08 17:58 ` Stefan Hajnoczi
2025-10-09 7:52 ` Daniel P. Berrangé
2025-10-09 8:22 ` Paolo Bonzini
2025-10-09 8:58 ` Daniel P. Berrangé
2025-10-09 11:26 ` Paolo Bonzini
2025-10-09 12:05 ` Daniel P. Berrangé
2025-10-09 7:58 ` Paolo Bonzini
2025-10-08 6:35 ` [PATCH 3/6] tracetool: "import annotations" Paolo Bonzini
2025-10-08 9:31 ` Daniel P. Berrangé
2025-10-08 18:06 ` Stefan Hajnoczi
2025-10-08 6:35 ` [PATCH 4/6] tracetool: add type annotations Paolo Bonzini
2025-10-08 18:09 ` Stefan Hajnoczi
2025-10-08 6:35 ` [PATCH 5/6] tracetool: complete typing annotations Paolo Bonzini
2025-10-08 18:10 ` Stefan Hajnoczi
2025-10-08 6:35 ` [PATCH 6/6] tracetool: add typing checks to "make -C python check" Paolo Bonzini
2025-10-08 9:32 ` Daniel P. Berrangé
2025-10-08 18:12 ` Stefan Hajnoczi
2025-10-08 7:18 ` Markus Armbruster [this message]
2025-10-08 10:34 ` [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Daniel P. Berrangé
2025-10-10 12:38 ` Markus Armbruster
2025-10-10 13:49 ` Paolo Bonzini
2025-10-10 17:41 ` Markus Armbruster
2025-10-08 10:40 ` Daniel P. Berrangé
2025-10-08 17:23 ` Stefan Hajnoczi
2025-10-08 17:41 ` Paolo Bonzini
2025-10-14 19:02 ` Stefan Hajnoczi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ikgpn9oz.fsf@pond.sub.org \
--to=armbru@redhat.com \
--cc=alex.bennee@linaro.org \
--cc=berrange@redhat.com \
--cc=jsnow@redhat.com \
--cc=kwolf@redhat.com \
--cc=mads@ynddal.dk \
--cc=pbonzini@redhat.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.