[PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
@ 2025-10-08  6:35 Paolo Bonzini
  2025-10-08  6:35 ` [PATCH 1/6] tracetool: rename variable with conflicting types Paolo Bonzini
                   ` (8 more replies)
  0 siblings, 9 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-08  6:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Daniel P . Berrangé, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

[People in Cc are a mix of Python people, tracing people, and people
 who followed the recent AI discussions. - Paolo]

This series adds type annotations to tracetool. While useful on its own, 
it also served as an experiment in whether AI tools could be useful and
appropriate for mechanical code transformations that may not involve
copyrightable expression.

In this version, the types were added mostly with the RightTyper tool
(https://github.com/RightTyper/RightTyper), which uses profiling to detect
the types of arguments and return types at run time.  However, because
adding type annotations is such a narrow and verifiable task, I also developed
a parallel version using an LLM, to provide some data on topics such as:

- how much choice/creativity is there in writing type annotations?
  Is it closer to writing functional code or to refactoring?

- how does AI output for such mechanical transformations compare to other
  automated tools, whose output is known not to be copyrightable? 

- what is the kind of time saving that the tool can provide?

- how effective is an LLM for this task?  Is the required human help a
  pain to provide, or can the human keep the fun part for itself?

While the version submitted here does not use any LLM-generated content
in compliance with QEMU's policy, the first version was developed using
an LLM tool (Claude Code) with a very simple prompt:
    
      Add type annotations to tracetool.py and all .py files in
      tracetool/*.py.  Use new-style annotations, for example list[str]
      instead of typing.List[str].  When done, use mypy to find any
      issues.

The results were surprisingly good for such a simple-minded test, and
while there were some issues they turned out to be mostly due to dead
code that confused the LLM.  Thus, I removed the dead code (this is
already in qemu.git) and then did the actual experiment, running
both the RightTyper script (which is detailed in patch 4) and the LLM
in separate branches.  The remaining errors from the LLM were also
reasonably easy to fix, so I did that manually:

      tracetool/__init__.py:239: error: Need type annotation for "_args" (hint: "_args: list[<type>] = ...")  [var-annotated]
      tracetool/__init__.py:272: error: Argument 1 to "Arguments" has incompatible type "list[tuple[str, str]]"; expected "list[tuple[str, str] | Arguments]"  [arg-type]
      tracetool/__init__.py:579: error: Incompatible types in assignment (expression has type "str | None", variable has type "None")  [assignment]
      tracetool/__init__.py:580: error: Incompatible types in assignment (expression has type "str | None", variable has type "None")  [assignment]

After reviewing the changes, I followed up with another prompt for mechanical
changes:

      Change "backend: Any" to "backend: Wrapper" in all tracetool/format/ files

Honestly, I could have done this part in less time without any help; I
was just curious to see if the LLM would also remove the unused import.
Not only it didn't; this prompt didn't even touch most of the files so
I did the change, ran isort and called it a day.

Comparing the results from RightTyper and AI, both tools benefit from human
help: the dead code removal which I mentioned, the small cleanups in patch 1,
the final manual fixes for the remaining (mostly trivial) errors.  But at
least in this case, AI is a winner:

- it left basically no unannotated code: fixing the above errors was enough
  to pass "mypy --strict", unlike RightTyper which needed a little more work
  due to its profiling-based nature and a few other limitations (see patch 5).

- most importantly, trying various other tools that didn't work, as well as
  figuring out how to use RightTyper, took a couple hours more.  Surprising
  as it was, I could not find any static type inferencing tool for Python;
  neither pytype nor pyre worked for me.  This is also why I think this
  is not apples to oranges, but a fair comparison between AI-based and
  regular tooling.

I want to highlight "in this case".  I had other experiments where the
LLM's output was all but awful, and I wasted time debugging code that
I should have thrown away immediately!

After the diffstat, you can find a diff from this series to the version
based on Claude Code.  It's impossible to be 100% objective but,
besides being functionally equivalent, I don't think either would be
identifiable as written by an LLM, by a person, by a tool+human combo,
or even by a good type inferencing tool (if it existed).

Based on this experience, my answer to the copyrightability question is
that, for this kind of narrow request, the output of AI can be treated as
the output of an imperfect tool, rather than as creative content potentially
tainted by the training material.  Of course this is one data point and
is intended as an experiment rather than a policy recommendation.

Paolo



Paolo Bonzini (6):
  tracetool: rename variable with conflicting types
  tracetool: apply isort and add check
  tracetool: "import annotations"
  tracetool: add type annotations
  tracetool: complete typing annotations
  tracetool: add typing checks to "make -C python check"

 python/tests/tracetool-isort.sh              |  4 +
 python/tests/tracetool-mypy.sh               |  5 ++
 scripts/tracetool.py                         | 12 +--
 scripts/tracetool/__init__.py                | 84 ++++++++++----------
 scripts/tracetool/backend/__init__.py        | 21 ++---
 scripts/tracetool/backend/dtrace.py          | 19 ++---
 scripts/tracetool/backend/ftrace.py          | 13 +--
 scripts/tracetool/backend/log.py             | 13 +--
 scripts/tracetool/backend/simple.py          | 19 ++---
 scripts/tracetool/backend/syslog.py          | 13 +--
 scripts/tracetool/backend/ust.py             | 11 +--
 scripts/tracetool/format/__init__.py         |  9 ++-
 scripts/tracetool/format/c.py                |  7 +-
 scripts/tracetool/format/d.py                |  7 +-
 scripts/tracetool/format/h.py                |  7 +-
 scripts/tracetool/format/log_stap.py         | 12 +--
 scripts/tracetool/format/rs.py               |  7 +-
 scripts/tracetool/format/simpletrace_stap.py |  7 +-
 scripts/tracetool/format/stap.py             | 10 ++-
 scripts/tracetool/format/ust_events_c.py     |  7 +-
 scripts/tracetool/format/ust_events_h.py     |  7 +-
 21 files changed, 173 insertions(+), 121 deletions(-)
 create mode 100755 python/tests/tracetool-isort.sh
 create mode 100755 python/tests/tracetool-mypy.sh
-- 
2.51.0


diff --git a/scripts/tracetool.py b/scripts/tracetool.py
index 13c5583fa21..06c40aa71e7 100755
--- a/scripts/tracetool.py
+++ b/scripts/tracetool.py
@@ -16,7 +16,6 @@
 
 import getopt
 import sys
-from typing import NoReturn
 
 import tracetool.backend
 import tracetool.format
@@ -24,7 +24,7 @@
 
 _SCRIPT = ""
 
-def error_opt(msg: str | None = None) -> NoReturn:
+def error_opt(msg: str | None = None) -> None:
     if msg is not None:
         error_write("Error: " + msg + "\n")
 
diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
index 58b7a289425..c9509b327f4 100644
--- a/scripts/tracetool/__init__.py
+++ b/scripts/tracetool/__init__.py
@@ -17,9 +17,8 @@
 import os
 import re
 import sys
-from io import TextIOWrapper
 from pathlib import PurePath
-from typing import Any, Iterator, Sequence
+from typing import Any, Iterator, Sequence, TextIO
 
 import tracetool.backend
 import tracetool.format
@@ -50,7 +50,7 @@ def error(*lines: str) -> None:
     'MAX': 'j',
 }
 
-def expand_format_string(c_fmt: str, prefix: str="") -> str:
+def expand_format_string(c_fmt: str, prefix: str = "") -> str:
     def pri_macro_to_fmt(pri_macro: str) -> str:
         assert pri_macro.startswith("PRI")
         fmt_type = pri_macro[3]  # 'd', 'i', 'u', or 'x'
@@ -469,13 +469,13 @@ def formats(self) -> list[str]:
     QEMU_BACKEND_DSTATE      = "TRACE_%(NAME)s_BACKEND_DSTATE"
     QEMU_EVENT               = "_TRACE_%(NAME)s_EVENT"
 
-    def api(self, fmt: str|None=None) -> str:
+    def api(self, fmt: str | None = None) -> str:
         if fmt is None:
             fmt = Event.QEMU_TRACE
         return fmt % {"name": self.name, "NAME": self.name.upper()}
 
 
-def read_events(fobj: TextIOWrapper, fname: str) -> list[Event]:
+def read_events(fobj: TextIO, fname: str) -> list[Event]:
     """Generate the output for the given (format, backends) pair.
 
     Parameters
@@ -514,7 +514,7 @@ class TracetoolError (Exception):
     pass
 
 
-def try_import(mod_name: str, attr_name: str | None=None, attr_default: Any=None) -> tuple[bool, Any]:
+def try_import(mod_name: str, attr_name: str | None = None, attr_default: Any = None) -> tuple[bool, Any]:
     """Try to import a module and get an attribute from it.
 
     Parameters
@@ -541,7 +541,7 @@ def try_import(mod_name: str, attr_name: str | None=None, attr_default: Any=None
 
 
 def generate(events: list[Event], group: str, format: str, backends: list[str],
-             binary: str|None=None, probe_prefix: str|None=None) -> None:
+             binary: str | None = None, probe_prefix: str | None = None) -> None:
     """Generate the output for the given (format, backends) pair.
 
     Parameters
@@ -581,7 +581,7 @@ def generate(events: list[Event], group: str, format: str, backends: list[str],
 
     tracetool.format.generate(events, format, backend, group)
 
-def posix_relpath(path: str, start: str|None=None) -> str:
+def posix_relpath(path: str, start: str | None = None) -> str:
     try:
         path = os.path.relpath(path, start)
     except ValueError:
diff --git a/scripts/tracetool/backend/__init__.py b/scripts/tracetool/backend/__init__.py
index 84bab3f42a0..37654a133ce 100644
--- a/scripts/tracetool/backend/__init__.py
+++ b/scripts/tracetool/backend/__init__.py
@@ -122,7 +122,7 @@ def backend_modules(self) -> Iterator[Any]:
              if module is not None:
                  yield module
 
-    def _run_function(self, name: str, *args: Any, check_trace_event_get_state: bool | None=None, **kwargs: Any) -> None:
+    def _run_function(self, name: str, *args: Any, check_trace_event_get_state: bool | None = None, **kwargs: Any) -> None:
         for backend in self.backend_modules():
             func = getattr(backend, name % self._format, None)
             if func is not None and \
@@ -133,7 +133,7 @@ def _run_function(self, name: str, *args: Any, check_trace_event_get_state: bool
     def generate_begin(self, events: list[tracetool.Event], group: str) -> None:
         self._run_function("generate_%s_begin", events, group)
 
-    def generate(self, event: tracetool.Event, group: str, check_trace_event_get_state: bool | None=None) -> None:
+    def generate(self, event: tracetool.Event, group: str, check_trace_event_get_state: bool | None = None) -> None:
         self._run_function("generate_%s", event, group, check_trace_event_get_state=check_trace_event_get_state)
 
     def generate_backend_dstate(self, event: tracetool.Event, group: str) -> None:
diff --git a/scripts/tracetool/backend/dtrace.py b/scripts/tracetool/backend/dtrace.py
index d9f4b8f09c8..e54847e96b6 100644
--- a/scripts/tracetool/backend/dtrace.py
+++ b/scripts/tracetool/backend/dtrace.py
@@ -19,7 +19,7 @@
 PUBLIC = True
 
 
-PROBEPREFIX: str|None = None
+PROBEPREFIX: str | None = None
 
 def probeprefix() -> str:
     if PROBEPREFIX is None:
@@ -27,7 +29,7 @@ def probeprefix() -> str:
     return PROBEPREFIX
 
 
-BINARY: str|None = None
+BINARY: str | None = None
 
 def binary() -> str:
     if BINARY is None:
diff --git a/scripts/tracetool/format/__init__.py b/scripts/tracetool/format/__init__.py
index c287e93ed30..d4fc452b4ee 100644
--- a/scripts/tracetool/format/__init__.py
+++ b/scripts/tracetool/format/__init__.py
@@ -40,7 +40,7 @@
 import os
 
 import tracetool
-from tracetool.backend import Wrapper
+import tracetool.backend
 
 
 def get_list() -> list[tuple[str, str]]:
@@ -76,7 +76,7 @@ def exists(name: str) -> bool:
     return tracetool.try_import("tracetool.format." + name)[0]
 
 
-def generate(events: list[tracetool.Event], format: str, backend: Wrapper, group: str) -> None:
+def generate(events: list[tracetool.Event], format: str, backend: tracetool.backend.Wrapper, group: str) -> None:
     if not exists(format):
         raise ValueError("unknown format: %s" % format)
     format = format.replace("-", "_")
diff --git a/scripts/tracetool/format/c.py b/scripts/tracetool/format/c.py
index e8848c51051..2e02365444d 100644
--- a/scripts/tracetool/format/c.py
+++ b/scripts/tracetool/format/c.py
@@ -14,11 +14,11 @@
 __email__      = "stefanha@redhat.com"
 
 
+import tracetool.backend
 from tracetool import Event, out
-from tracetool.backend import Wrapper
 
 
-def generate(events: list[Event], backend: Wrapper, group: str) -> None:
+def generate(events: list[Event], backend: tracetool.backend.Wrapper, group: str) -> None:
     active_events = [e for e in events
                      if "disable" not in e.properties]
 
diff --git a/scripts/tracetool/format/d.py b/scripts/tracetool/format/d.py
index 2abf8bc24ba..bc9bc57e0b3 100644
--- a/scripts/tracetool/format/d.py
+++ b/scripts/tracetool/format/d.py
@@ -16,8 +16,8 @@
 
 from sys import platform
 
+import tracetool.backend
 from tracetool import Event, out
-from tracetool.backend import Wrapper
 
 # Reserved keywords from
 # https://wikis.oracle.com/display/DTrace/Types,+Operators+and+Expressions
@@ -32,7 +32,7 @@
 )
 
 
-def generate(events: list[Event], backend: Wrapper, group: str) -> None:
+def generate(events: list[Event], backend: tracetool.backend.Wrapper, group: str) -> None:
     events = [e for e in events
               if "disable" not in e.properties]
 
diff --git a/scripts/tracetool/format/h.py b/scripts/tracetool/format/h.py
index 2324234dd24..6d2d5cb4a9a 100644
--- a/scripts/tracetool/format/h.py
+++ b/scripts/tracetool/format/h.py
@@ -14,11 +14,11 @@
 __email__      = "stefanha@redhat.com"
 
 
+import tracetool.backend
 from tracetool import Event, out
-from tracetool.backend import Wrapper
 
 
-def generate(events: list[Event], backend: Wrapper, group: str) -> None:
+def generate(events: list[Event], backend: tracetool.backend.Wrapper, group: str) -> None:
     header = "trace/control.h"
 
     out('/* This file is autogenerated by tracetool, do not edit. */',
diff --git a/scripts/tracetool/format/log_stap.py b/scripts/tracetool/format/log_stap.py
index 914e4674ffe..4223e2d9e2d 100644
--- a/scripts/tracetool/format/log_stap.py
+++ b/scripts/tracetool/format/log_stap.py
@@ -15,8 +15,8 @@
 
 import re
 
+import tracetool.backend
 from tracetool import Event, out
-from tracetool.backend import Wrapper
 from tracetool.backend.dtrace import binary, probeprefix
 from tracetool.backend.simple import is_string
 from tracetool.format.stap import stap_escape
@@ -86,7 +86,7 @@ def c_fmt_to_stap(fmt: str) -> str:
     fmt = re.sub(r"%(\d*)(l+|z)(x|u|d)", r"%\1\3", "".join(bits))
     return fmt
 
-def generate(events: list[Event], backend: Wrapper, group: str) -> None:
+def generate(events: list[Event], backend: tracetool.backend.Wrapper, group: str) -> None:
     out('/* This file is autogenerated by tracetool, do not edit. */',
         '/* SPDX-License-Identifier: GPL-2.0-or-later */',
         '')
diff --git a/scripts/tracetool/format/rs.py b/scripts/tracetool/format/rs.py
index 27a4025e3d6..08129d2d426 100644
--- a/scripts/tracetool/format/rs.py
+++ b/scripts/tracetool/format/rs.py
@@ -14,11 +14,11 @@
 __email__      = "stefanha@redhat.com"
 
 
+import tracetool.backend
 from tracetool import Event, out
-from tracetool.backend import Wrapper
 
 
-def generate(events: list[Event], backend: Wrapper, group: str) -> None:
+def generate(events: list[Event], backend: tracetool.backend.Wrapper, group: str) -> None:
     out('// SPDX-License-Identifier: GPL-2.0-or-later',
         '// This file is @generated by tracetool, do not edit.',
         '',
diff --git a/scripts/tracetool/format/simpletrace_stap.py b/scripts/tracetool/format/simpletrace_stap.py
index c76cf426853..b42bd3d6f02 100644
--- a/scripts/tracetool/format/simpletrace_stap.py
+++ b/scripts/tracetool/format/simpletrace_stap.py
@@ -14,14 +14,14 @@
 __email__      = "stefanha@redhat.com"
 
 
+import tracetool.backend
 from tracetool import Event, out
-from tracetool.backend import Wrapper
 from tracetool.backend.dtrace import probeprefix
 from tracetool.backend.simple import is_string
 from tracetool.format.stap import stap_escape
 
 
-def generate(events: list[Event], backend: Wrapper, group: str) -> None:
+def generate(events: list[Event], backend: tracetool.backend.Wrapper, group: str) -> None:
     out('/* This file is autogenerated by tracetool, do not edit. */',
         '/* SPDX-License-Identifier: GPL-2.0-or-later */',
         '')
diff --git a/scripts/tracetool/format/stap.py b/scripts/tracetool/format/stap.py
index af98f3c44a3..2c15c75f7c2 100644
--- a/scripts/tracetool/format/stap.py
+++ b/scripts/tracetool/format/stap.py
@@ -14,8 +14,8 @@
 __email__      = "stefanha@redhat.com"
 
 
+import tracetool.backend
 from tracetool import Event, out
-from tracetool.backend import Wrapper
 from tracetool.backend.dtrace import binary, probeprefix
 
 # Technically 'self' is not used by systemtap yet, but
@@ -35,7 +35,7 @@ def stap_escape(identifier: str) -> str:
     return identifier
 
 
-def generate(events: list[Event], backend: Wrapper, group: str) -> None:
+def generate(events: list[Event], backend: tracetool.backend.Wrapper, group: str) -> None:
     events = [e for e in events
               if "disable" not in e.properties]
 
diff --git a/scripts/tracetool/format/ust_events_c.py b/scripts/tracetool/format/ust_events_c.py
index 2c8256ed7a8..634dd9a7115 100644
--- a/scripts/tracetool/format/ust_events_c.py
+++ b/scripts/tracetool/format/ust_events_c.py
@@ -14,11 +14,11 @@
 __email__      = "stefanha@redhat.com"
 
 
+import tracetool.backend
 from tracetool import Event, out
-from tracetool.backend import Wrapper
 
 
-def generate(events: list[Event], backend: Wrapper, group: str) -> None:
+def generate(events: list[Event], backend: tracetool.backend.Wrapper, group: str) -> None:
     events = [e for e in events
               if "disabled" not in e.properties]
 
diff --git a/scripts/tracetool/format/ust_events_h.py b/scripts/tracetool/format/ust_events_h.py
index f0ffcae6941..098c28c7e7a 100644
--- a/scripts/tracetool/format/ust_events_h.py
+++ b/scripts/tracetool/format/ust_events_h.py
@@ -14,11 +14,11 @@
 __email__      = "stefanha@redhat.com"
 
 
+import tracetool.backend
 from tracetool import Event, out
-from tracetool.backend import Wrapper
 
 
-def generate(events: list[Event], backend: Wrapper, group: str) -> None:
+def generate(events: list[Event], backend: tracetool.backend.Wrapper, group: str) -> None:
     events = [e for e in events
               if "disabled" not in e.properties]
 



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 1/6] tracetool: rename variable with conflicting types
  2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
@ 2025-10-08  6:35 ` Paolo Bonzini
  2025-10-08  9:27   ` Daniel P. Berrangé
  2025-10-08 18:06   ` Stefan Hajnoczi
  2025-10-08  6:35 ` [PATCH 2/6] tracetool: apply isort and add check Paolo Bonzini
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-08  6:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Daniel P . Berrangé, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

"backend" is used as both a string and a backend.Wrapper.  In preparation
for adding type annotations, use different names.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 scripts/tracetool/__init__.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
index 74062d21a7c..85527c08c98 100644
--- a/scripts/tracetool/__init__.py
+++ b/scripts/tracetool/__init__.py
@@ -566,9 +566,9 @@ def generate(events, group, format, backends,
 
     if len(backends) == 0:
         raise TracetoolError("no backends specified")
-    for backend in backends:
-        if not tracetool.backend.exists(backend):
-            raise TracetoolError("unknown backend: %s" % backend)
+    for backend_name in backends:
+        if not tracetool.backend.exists(backend_name):
+            raise TracetoolError("unknown backend: %s" % backend_name)
     backend = tracetool.backend.Wrapper(backends, format)
 
     import tracetool.backend.dtrace
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 2/6] tracetool: apply isort and add check
  2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
  2025-10-08  6:35 ` [PATCH 1/6] tracetool: rename variable with conflicting types Paolo Bonzini
@ 2025-10-08  6:35 ` Paolo Bonzini
  2025-10-08  9:29   ` Daniel P. Berrangé
  2025-10-08 17:58   ` Stefan Hajnoczi
  2025-10-08  6:35 ` [PATCH 3/6] tracetool: "import annotations" Paolo Bonzini
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-08  6:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Daniel P . Berrangé, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

Sort imports automatically, to keep the coding style more uniform.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 python/tests/tracetool-isort.sh      | 4 ++++
 scripts/tracetool.py                 | 5 ++---
 scripts/tracetool/backend/dtrace.py  | 1 -
 scripts/tracetool/backend/ftrace.py  | 3 +--
 scripts/tracetool/backend/log.py     | 3 +--
 scripts/tracetool/backend/simple.py  | 1 -
 scripts/tracetool/backend/syslog.py  | 3 +--
 scripts/tracetool/backend/ust.py     | 1 -
 scripts/tracetool/format/d.py        | 2 +-
 scripts/tracetool/format/log_stap.py | 1 -
 scripts/tracetool/format/stap.py     | 1 -
 11 files changed, 10 insertions(+), 15 deletions(-)
 create mode 100755 python/tests/tracetool-isort.sh

diff --git a/python/tests/tracetool-isort.sh b/python/tests/tracetool-isort.sh
new file mode 100755
index 00000000000..b23f3d48448
--- /dev/null
+++ b/python/tests/tracetool-isort.sh
@@ -0,0 +1,4 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+python3 -m isort --sp . -c ../scripts/tracetool/
diff --git a/scripts/tracetool.py b/scripts/tracetool.py
index 0fdc9cb9477..22fdc29e01f 100755
--- a/scripts/tracetool.py
+++ b/scripts/tracetool.py
@@ -12,13 +12,12 @@
 __email__      = "stefanha@redhat.com"
 
 
-import sys
 import getopt
+import sys
 
-from tracetool import error_write, out, out_open
 import tracetool.backend
 import tracetool.format
-
+from tracetool import error_write, out, out_open
 
 _SCRIPT = ""
 
diff --git a/scripts/tracetool/backend/dtrace.py b/scripts/tracetool/backend/dtrace.py
index b4af403025c..0fbd98ee91b 100644
--- a/scripts/tracetool/backend/dtrace.py
+++ b/scripts/tracetool/backend/dtrace.py
@@ -14,7 +14,6 @@
 
 from tracetool import out
 
-
 PUBLIC = True
 
 
diff --git a/scripts/tracetool/backend/ftrace.py b/scripts/tracetool/backend/ftrace.py
index e03698a2edf..0d77bd23a51 100644
--- a/scripts/tracetool/backend/ftrace.py
+++ b/scripts/tracetool/backend/ftrace.py
@@ -12,8 +12,7 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out, expand_format_string
-
+from tracetool import expand_format_string, out
 
 PUBLIC = True
 CHECK_TRACE_EVENT_GET_STATE = True
diff --git a/scripts/tracetool/backend/log.py b/scripts/tracetool/backend/log.py
index 9e3e5046f5f..bbfb56911d9 100644
--- a/scripts/tracetool/backend/log.py
+++ b/scripts/tracetool/backend/log.py
@@ -12,8 +12,7 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out, expand_format_string
-
+from tracetool import expand_format_string, out
 
 PUBLIC = True
 CHECK_TRACE_EVENT_GET_STATE = True
diff --git a/scripts/tracetool/backend/simple.py b/scripts/tracetool/backend/simple.py
index b131e4fc194..b67257ce7e9 100644
--- a/scripts/tracetool/backend/simple.py
+++ b/scripts/tracetool/backend/simple.py
@@ -14,7 +14,6 @@
 
 from tracetool import out
 
-
 PUBLIC = True
 CHECK_TRACE_EVENT_GET_STATE = True
 
diff --git a/scripts/tracetool/backend/syslog.py b/scripts/tracetool/backend/syslog.py
index 12b826593db..c3efab036cf 100644
--- a/scripts/tracetool/backend/syslog.py
+++ b/scripts/tracetool/backend/syslog.py
@@ -12,8 +12,7 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out, expand_format_string
-
+from tracetool import expand_format_string, out
 
 PUBLIC = True
 CHECK_TRACE_EVENT_GET_STATE = True
diff --git a/scripts/tracetool/backend/ust.py b/scripts/tracetool/backend/ust.py
index 3aa9bb1da29..a70e3d83e1d 100644
--- a/scripts/tracetool/backend/ust.py
+++ b/scripts/tracetool/backend/ust.py
@@ -14,7 +14,6 @@
 
 from tracetool import out
 
-
 PUBLIC = True
 
 
diff --git a/scripts/tracetool/format/d.py b/scripts/tracetool/format/d.py
index e9e33dfe30a..0befd444e82 100644
--- a/scripts/tracetool/format/d.py
+++ b/scripts/tracetool/format/d.py
@@ -12,9 +12,9 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
 from sys import platform
 
+from tracetool import out
 
 # Reserved keywords from
 # https://wikis.oracle.com/display/DTrace/Types,+Operators+and+Expressions
diff --git a/scripts/tracetool/format/log_stap.py b/scripts/tracetool/format/log_stap.py
index 259303a189d..99c6181f389 100644
--- a/scripts/tracetool/format/log_stap.py
+++ b/scripts/tracetool/format/log_stap.py
@@ -18,7 +18,6 @@
 from tracetool.backend.simple import is_string
 from tracetool.format.stap import stap_escape
 
-
 STATE_SKIP = 0
 STATE_LITERAL = 1
 STATE_MACRO = 2
diff --git a/scripts/tracetool/format/stap.py b/scripts/tracetool/format/stap.py
index 285c9203ba7..f917bd7545d 100644
--- a/scripts/tracetool/format/stap.py
+++ b/scripts/tracetool/format/stap.py
@@ -15,7 +15,6 @@
 from tracetool import out
 from tracetool.backend.dtrace import binary, probeprefix
 
-
 # Technically 'self' is not used by systemtap yet, but
 # they recommended we keep it in the reserved list anyway
 RESERVED_WORDS = (
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 3/6] tracetool: "import annotations"
  2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
  2025-10-08  6:35 ` [PATCH 1/6] tracetool: rename variable with conflicting types Paolo Bonzini
  2025-10-08  6:35 ` [PATCH 2/6] tracetool: apply isort and add check Paolo Bonzini
@ 2025-10-08  6:35 ` Paolo Bonzini
  2025-10-08  9:31   ` Daniel P. Berrangé
  2025-10-08 18:06   ` Stefan Hajnoczi
  2025-10-08  6:35 ` [PATCH 4/6] tracetool: add type annotations Paolo Bonzini
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-08  6:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Daniel P . Berrangé, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

In preparations for adding type annotations, make Python process them lazily.

This avoids the need to express some annotations as strings.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 scripts/tracetool.py                         | 2 ++
 scripts/tracetool/__init__.py                | 2 ++
 scripts/tracetool/backend/__init__.py        | 2 ++
 scripts/tracetool/backend/dtrace.py          | 2 ++
 scripts/tracetool/backend/ftrace.py          | 2 ++
 scripts/tracetool/backend/log.py             | 2 ++
 scripts/tracetool/backend/simple.py          | 2 ++
 scripts/tracetool/backend/syslog.py          | 2 ++
 scripts/tracetool/backend/ust.py             | 2 ++
 scripts/tracetool/format/__init__.py         | 2 ++
 scripts/tracetool/format/c.py                | 2 ++
 scripts/tracetool/format/d.py                | 2 ++
 scripts/tracetool/format/h.py                | 2 ++
 scripts/tracetool/format/log_stap.py         | 2 ++
 scripts/tracetool/format/rs.py               | 2 ++
 scripts/tracetool/format/simpletrace_stap.py | 2 ++
 scripts/tracetool/format/stap.py             | 2 ++
 scripts/tracetool/format/ust_events_c.py     | 2 ++
 scripts/tracetool/format/ust_events_h.py     | 2 ++
 19 files changed, 38 insertions(+)

diff --git a/scripts/tracetool.py b/scripts/tracetool.py
index 22fdc29e01f..390f1a371bf 100755
--- a/scripts/tracetool.py
+++ b/scripts/tracetool.py
@@ -4,6 +4,8 @@
 Command-line wrapper for the tracetool machinery.
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2014, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
index 85527c08c98..09ff6da6612 100644
--- a/scripts/tracetool/__init__.py
+++ b/scripts/tracetool/__init__.py
@@ -4,6 +4,8 @@
 Machinery for generating tracing-related intermediate files.
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2017, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/backend/__init__.py b/scripts/tracetool/backend/__init__.py
index 9109a783c72..8cc9c821382 100644
--- a/scripts/tracetool/backend/__init__.py
+++ b/scripts/tracetool/backend/__init__.py
@@ -49,6 +49,8 @@
 
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2014, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/backend/dtrace.py b/scripts/tracetool/backend/dtrace.py
index 0fbd98ee91b..fa941b2ff5c 100644
--- a/scripts/tracetool/backend/dtrace.py
+++ b/scripts/tracetool/backend/dtrace.py
@@ -4,6 +4,8 @@
 DTrace/SystemTAP backend.
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2017, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/backend/ftrace.py b/scripts/tracetool/backend/ftrace.py
index 0d77bd23a51..40bb323f5e6 100644
--- a/scripts/tracetool/backend/ftrace.py
+++ b/scripts/tracetool/backend/ftrace.py
@@ -4,6 +4,8 @@
 Ftrace built-in backend.
 """
 
+from __future__ import annotations
+
 __author__     = "Eiichi Tsukata <eiichi.tsukata.xh@hitachi.com>"
 __copyright__  = "Copyright (C) 2013 Hitachi, Ltd."
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/backend/log.py b/scripts/tracetool/backend/log.py
index bbfb56911d9..d346a19e40f 100644
--- a/scripts/tracetool/backend/log.py
+++ b/scripts/tracetool/backend/log.py
@@ -4,6 +4,8 @@
 Stderr built-in backend.
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2017, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/backend/simple.py b/scripts/tracetool/backend/simple.py
index b67257ce7e9..9c691dc231b 100644
--- a/scripts/tracetool/backend/simple.py
+++ b/scripts/tracetool/backend/simple.py
@@ -4,6 +4,8 @@
 Simple built-in backend.
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2017, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/backend/syslog.py b/scripts/tracetool/backend/syslog.py
index c3efab036cf..9311453c5a7 100644
--- a/scripts/tracetool/backend/syslog.py
+++ b/scripts/tracetool/backend/syslog.py
@@ -4,6 +4,8 @@
 Syslog built-in backend.
 """
 
+from __future__ import annotations
+
 __author__     = "Paul Durrant <paul.durrant@citrix.com>"
 __copyright__  = "Copyright 2016, Citrix Systems Inc."
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/backend/ust.py b/scripts/tracetool/backend/ust.py
index a70e3d83e1d..f2270725128 100644
--- a/scripts/tracetool/backend/ust.py
+++ b/scripts/tracetool/backend/ust.py
@@ -4,6 +4,8 @@
 LTTng User Space Tracing backend.
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2017, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/format/__init__.py b/scripts/tracetool/format/__init__.py
index 7b9d1b57826..4c606d57579 100644
--- a/scripts/tracetool/format/__init__.py
+++ b/scripts/tracetool/format/__init__.py
@@ -27,6 +27,8 @@
 
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2014, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/format/c.py b/scripts/tracetool/format/c.py
index 50e03313cbf..5b3459f2beb 100644
--- a/scripts/tracetool/format/c.py
+++ b/scripts/tracetool/format/c.py
@@ -4,6 +4,8 @@
 trace/generated-tracers.c
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2014, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/format/d.py b/scripts/tracetool/format/d.py
index 0befd444e82..dda80eeb766 100644
--- a/scripts/tracetool/format/d.py
+++ b/scripts/tracetool/format/d.py
@@ -4,6 +4,8 @@
 trace/generated-tracers.dtrace (DTrace only).
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2014, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/format/h.py b/scripts/tracetool/format/h.py
index dd58713a158..d04cabc63e8 100644
--- a/scripts/tracetool/format/h.py
+++ b/scripts/tracetool/format/h.py
@@ -4,6 +4,8 @@
 trace/generated-tracers.h
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2017, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/format/log_stap.py b/scripts/tracetool/format/log_stap.py
index 99c6181f389..65514442030 100644
--- a/scripts/tracetool/format/log_stap.py
+++ b/scripts/tracetool/format/log_stap.py
@@ -4,6 +4,8 @@
 Generate .stp file that printfs log messages (DTrace with SystemTAP only).
 """
 
+from __future__ import annotations
+
 __author__     = "Daniel P. Berrange <berrange@redhat.com>"
 __copyright__  = "Copyright (C) 2014-2019, Red Hat, Inc."
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/format/rs.py b/scripts/tracetool/format/rs.py
index 32ac4e59770..71b847cb7bc 100644
--- a/scripts/tracetool/format/rs.py
+++ b/scripts/tracetool/format/rs.py
@@ -4,6 +4,8 @@
 trace-DIR.rs
 """
 
+from __future__ import annotations
+
 __author__     = "Tanish Desai <tanishdesai37@gmail.com>"
 __copyright__  = "Copyright 2025, Tanish Desai <tanishdesai37@gmail.com>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/format/simpletrace_stap.py b/scripts/tracetool/format/simpletrace_stap.py
index c7bde97a855..eb58b4b9592 100644
--- a/scripts/tracetool/format/simpletrace_stap.py
+++ b/scripts/tracetool/format/simpletrace_stap.py
@@ -4,6 +4,8 @@
 Generate .stp file that outputs simpletrace binary traces (DTrace with SystemTAP only).
 """
 
+from __future__ import annotations
+
 __author__     = "Stefan Hajnoczi <redhat.com>"
 __copyright__  = "Copyright (C) 2014, Red Hat, Inc."
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/format/stap.py b/scripts/tracetool/format/stap.py
index f917bd7545d..808fb478b5f 100644
--- a/scripts/tracetool/format/stap.py
+++ b/scripts/tracetool/format/stap.py
@@ -4,6 +4,8 @@
 Generate .stp file (DTrace with SystemTAP only).
 """
 
+from __future__ import annotations
+
 __author__     = "Lluís Vilanova <vilanova@ac.upc.edu>"
 __copyright__  = "Copyright 2012-2014, Lluís Vilanova <vilanova@ac.upc.edu>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/format/ust_events_c.py b/scripts/tracetool/format/ust_events_c.py
index 074226bfd37..fa7d5437984 100644
--- a/scripts/tracetool/format/ust_events_c.py
+++ b/scripts/tracetool/format/ust_events_c.py
@@ -4,6 +4,8 @@
 trace/generated-ust.c
 """
 
+from __future__ import annotations
+
 __author__     = "Mohamad Gebai <mohamad.gebai@polymtl.ca>"
 __copyright__  = "Copyright 2012, Mohamad Gebai <mohamad.gebai@polymtl.ca>"
 __license__    = "GPL version 2 or (at your option) any later version"
diff --git a/scripts/tracetool/format/ust_events_h.py b/scripts/tracetool/format/ust_events_h.py
index cee7970a403..1057d02577e 100644
--- a/scripts/tracetool/format/ust_events_h.py
+++ b/scripts/tracetool/format/ust_events_h.py
@@ -4,6 +4,8 @@
 trace/generated-ust-provider.h
 """
 
+from __future__ import annotations
+
 __author__     = "Mohamad Gebai <mohamad.gebai@polymtl.ca>"
 __copyright__  = "Copyright 2012, Mohamad Gebai <mohamad.gebai@polymtl.ca>"
 __license__    = "GPL version 2 or (at your option) any later version"
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 4/6] tracetool: add type annotations
  2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
                   ` (2 preceding siblings ...)
  2025-10-08  6:35 ` [PATCH 3/6] tracetool: "import annotations" Paolo Bonzini
@ 2025-10-08  6:35 ` Paolo Bonzini
  2025-10-08 18:09   ` Stefan Hajnoczi
  2025-10-08  6:35 ` [PATCH 5/6] tracetool: complete typing annotations Paolo Bonzini
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-08  6:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Daniel P . Berrangé, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

Created with a profiling-based tool, righttyper.  I used this script:

  python -m righttyper --generate-stubs --no-sampling --overwrite  -- ./tracetool.py --help
  find . -name "*.pyi" | while read fname; do
    merge-pyi --in-place -b bak.$fmt ${fname%i} $fname
  done

  for fmt in c h rs d log-stap simpletrace-stap stap ust-events-c ust-events-h; do
    find . -name '*.pyi*' | xargs rm
    python -m righttyper --generate-stubs --no-sampling --overwrite ./tracetool.py \
      --format=$fmt --backends=ust,simple,syslog,ftrace,dtrace,log --group=testsuite \
      --binary=qemu --probe-prefix=qemu ../tests/tracetool/trace-events outfile.$fmt
    find . -name "*.pyi" | while read fname; do
      merge-pyi --in-place -b bak.$fmt ${fname%i} $fname
    done
  done

  python -m isort $(find tracetool -name "*.py")

Running the script took about 5 minutes.  The errors were mostly
due to misunderstanding the "try_import" function:

  tracetool/backend/__init__.py:83: error: Incompatible types in assignment (expression has type Module, variable has type "tuple[bool, Module]")  [assignment]
  tracetool/backend/__init__.py:117: error: Incompatible types in assignment (expression has type Module, variable has type "str")  [assignment]
  tracetool/__init__.py:543: error: Incompatible return value type (got "tuple[bool, None]", expected "tuple[bool, Module]")  [return-value]
  tracetool/format/__init__.py:60: error: Incompatible types in assignment (expression has type Module, variable has type "tuple[bool, Module]")  [assignment]
  tracetool/format/__init__.py:85: error: Argument 2 to "try_import" has incompatible type "str"; expected "None"  [arg-type]
  tracetool/format/__init__.py:88: error: Module not callable  [operator]

On top of this I fixed a little weirdness, while leaving the unannotated
functions unchanged.  Being profiling-based, righttyper did not annotate
anything not covered by the check-tracetool testsuite:
- error cases
- PRIxxx macros

It also reported list[Never] for always-empty lists, which is incorrect.
Righttyper also has a few limitations: it does not annotate nested functions
(there were only four of them), or "*args" argument lists.  These are fixed
in the next patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 scripts/tracetool.py                         |  5 +-
 scripts/tracetool/__init__.py                | 54 ++++++++++----------
 scripts/tracetool/backend/__init__.py        | 19 +++----
 scripts/tracetool/backend/dtrace.py          | 16 +++---
 scripts/tracetool/backend/ftrace.py          | 10 ++--
 scripts/tracetool/backend/log.py             | 10 ++--
 scripts/tracetool/backend/simple.py          | 16 +++---
 scripts/tracetool/backend/syslog.py          | 10 ++--
 scripts/tracetool/backend/ust.py             |  8 +--
 scripts/tracetool/format/__init__.py         |  7 +--
 scripts/tracetool/format/c.py                |  5 +-
 scripts/tracetool/format/d.py                |  5 +-
 scripts/tracetool/format/h.py                |  5 +-
 scripts/tracetool/format/log_stap.py         |  7 +--
 scripts/tracetool/format/rs.py               |  5 +-
 scripts/tracetool/format/simpletrace_stap.py |  5 +-
 scripts/tracetool/format/stap.py             |  7 +--
 scripts/tracetool/format/ust_events_c.py     |  5 +-
 scripts/tracetool/format/ust_events_h.py     |  5 +-
 19 files changed, 109 insertions(+), 95 deletions(-)

diff --git a/scripts/tracetool.py b/scripts/tracetool.py
index 390f1a371bf..0688c9cf5f4 100755
--- a/scripts/tracetool.py
+++ b/scripts/tracetool.py
@@ -16,6 +16,7 @@
 
 import getopt
 import sys
+from typing import NoReturn
 
 import tracetool.backend
 import tracetool.format
@@ -23,7 +24,7 @@
 
 _SCRIPT = ""
 
-def error_opt(msg = None):
+def error_opt(msg: str | None = None) -> NoReturn:
     if msg is not None:
         error_write("Error: " + msg + "\n")
 
@@ -58,7 +59,7 @@ def error_opt(msg = None):
     else:
         sys.exit(1)
 
-def main(args):
+def main(args: list[str]) -> None:
     global _SCRIPT
     _SCRIPT = args[0]
 
diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
index 09ff6da6612..62895261a0d 100644
--- a/scripts/tracetool/__init__.py
+++ b/scripts/tracetool/__init__.py
@@ -17,13 +17,15 @@
 import os
 import re
 import sys
+from io import TextIOWrapper
 from pathlib import PurePath
+from typing import Any, Iterator
 
 import tracetool.backend
 import tracetool.format
 
 
-def error_write(*lines):
+def error_write(*lines) -> None:
     """Write a set of error lines."""
     sys.stderr.writelines("\n".join(lines) + "\n")
 
@@ -48,7 +50,7 @@ def error(*lines):
     'MAX': 'j',
 }
 
-def expand_format_string(c_fmt, prefix=""):
+def expand_format_string(c_fmt: str, prefix: str="") -> str:
     def pri_macro_to_fmt(pri_macro):
         assert pri_macro.startswith("PRI")
         fmt_type = pri_macro[3]  # 'd', 'i', 'u', or 'x'
@@ -80,12 +82,12 @@ def pri_macro_to_fmt(pri_macro):
 out_filename = '<none>'
 out_fobj = sys.stdout
 
-def out_open(filename):
+def out_open(filename: str) -> None:
     global out_filename, out_fobj
     out_filename = posix_relpath(filename)
     out_fobj = open(filename, 'wt')
 
-def out(*lines, **kwargs):
+def out(*lines, **kwargs) -> None:
     """Write a set of output lines.
 
     You can use kwargs as a shorthand for mapping variables when formatting all
@@ -177,7 +179,7 @@ def out(*lines, **kwargs):
     "bool",
 }
 
-def validate_type(name):
+def validate_type(name: str) -> None:
     bits = name.split(" ")
     for bit in bits:
         bit = re.sub(r"\*", "", bit)
@@ -192,7 +194,7 @@ def validate_type(name):
                              "other complex pointer types should be "
                              "declared as 'void *'" % name)
 
-def c_type_to_rust(name):
+def c_type_to_rust(name: str) -> str:
     ptr = False
     const = False
     name = name.rstrip()
@@ -227,7 +229,7 @@ def c_type_to_rust(name):
 class Arguments:
     """Event arguments description."""
 
-    def __init__(self, args):
+    def __init__(self, args: list[tuple[str, str]]) -> None:
         """
         Parameters
         ----------
@@ -242,7 +244,7 @@ def __init__(self, args):
                 self._args.append(arg)
 
     @staticmethod
-    def build(arg_str):
+    def build(arg_str: str) -> Arguments:
         """Build and Arguments instance from an argument string.
 
         Parameters
@@ -275,15 +277,15 @@ def __getitem__(self, index):
         else:
             return self._args[index]
 
-    def __iter__(self):
+    def __iter__(self) -> Iterator[tuple[str, str]]:
         """Iterate over the (type, name) pairs."""
         return iter(self._args)
 
-    def __len__(self):
+    def __len__(self) -> int:
         """Number of arguments."""
         return len(self._args)
 
-    def __str__(self):
+    def __str__(self) -> str:
         """String suitable for declaring function arguments."""
         def onearg(t, n):
             if t[-1] == '*':
@@ -300,11 +302,11 @@ def __repr__(self):
         """Evaluable string representation for this object."""
         return "Arguments(\"%s\")" % str(self)
 
-    def names(self):
+    def names(self) -> list[str]:
         """List of argument names."""
         return [ name for _, name in self._args ]
 
-    def types(self):
+    def types(self) -> list[str]:
         """List of argument types."""
         return [ type_ for type_, _ in self._args ]
 
@@ -312,12 +314,12 @@ def casted(self):
         """List of argument names casted to their type."""
         return ["(%s)%s" % (type_, name) for type_, name in self._args]
 
-    def rust_decl_extern(self):
+    def rust_decl_extern(self) -> str:
         """Return a Rust argument list for an extern "C" function"""
         return ", ".join((f"_{name}: {c_type_to_rust(type_)}"
                           for type_, name in self._args))
 
-    def rust_decl(self):
+    def rust_decl(self) -> str:
         """Return a Rust argument list for a tracepoint function"""
         def decl_type(type_):
             if type_ == "const char *":
@@ -327,7 +329,7 @@ def decl_type(type_):
         return ", ".join((f"_{name}: {decl_type(type_)}"
                           for type_, name in self._args))
 
-    def rust_call_extern(self):
+    def rust_call_extern(self) -> str:
         """Return a Rust argument list for a call to an extern "C" function"""
         def rust_cast(name, type_):
             if type_ == "const char *":
@@ -336,7 +338,7 @@ def rust_cast(name, type_):
 
         return ", ".join((rust_cast(name, type_) for type_, name in self._args))
 
-    def rust_call_varargs(self):
+    def rust_call_varargs(self) -> str:
         """Return a Rust argument list for a call to a C varargs function"""
         def rust_cast(name, type_):
             if type_ == "const char *":
@@ -379,7 +381,7 @@ class Event(object):
 
     _VALID_PROPS = set(["disable"])
 
-    def __init__(self, name, props, fmt, args, lineno, filename):
+    def __init__(self, name: str, props: list[str], fmt: str, args: Arguments, lineno: int, filename: str) -> None:
         """
         Parameters
         ----------
@@ -415,7 +417,7 @@ def __init__(self, name, props, fmt, args, lineno, filename):
 
 
     @staticmethod
-    def build(line_str, lineno, filename):
+    def build(line_str: str, lineno: int, filename: str) -> Event:
         """Build an Event instance from a string.
 
         Parameters
@@ -457,7 +459,7 @@ def __repr__(self):
     # arguments with that format, hence the non-greedy version of it.
     _FMT = re.compile(r"(%[\d\.]*\w+|%.*?PRI\S+)")
 
-    def formats(self):
+    def formats(self) -> list[str]:
         """List conversion specifiers in the argument print format string."""
         return self._FMT.findall(self.fmt)
 
@@ -467,13 +469,13 @@ def formats(self):
     QEMU_BACKEND_DSTATE      = "TRACE_%(NAME)s_BACKEND_DSTATE"
     QEMU_EVENT               = "_TRACE_%(NAME)s_EVENT"
 
-    def api(self, fmt=None):
+    def api(self, fmt: str|None=None) -> str:
         if fmt is None:
             fmt = Event.QEMU_TRACE
         return fmt % {"name": self.name, "NAME": self.name.upper()}
 
 
-def read_events(fobj, fname):
+def read_events(fobj: TextIOWrapper, fname: str) -> list[Event]:
     """Generate the output for the given (format, backends) pair.
 
     Parameters
@@ -512,7 +514,7 @@ class TracetoolError (Exception):
     pass
 
 
-def try_import(mod_name, attr_name=None, attr_default=None):
+def try_import(mod_name: str, attr_name: str | None=None, attr_default: Any=None) -> tuple[bool, Any]:
     """Try to import a module and get an attribute from it.
 
     Parameters
@@ -538,8 +540,8 @@ def try_import(mod_name, attr_name=None, attr_default=None):
         return False, None
 
 
-def generate(events, group, format, backends,
-             binary=None, probe_prefix=None):
+def generate(events: list[Event], group: str, format: str, backends: list[str],
+             binary: str|None=None, probe_prefix: str|None=None) -> None:
     """Generate the output for the given (format, backends) pair.
 
     Parameters
@@ -579,7 +581,7 @@ def generate(events, group, format, backends,
 
     tracetool.format.generate(events, format, backend, group)
 
-def posix_relpath(path, start=None):
+def posix_relpath(path: str, start: str|None=None) -> str:
     try:
         path = os.path.relpath(path, start)
     except ValueError:
diff --git a/scripts/tracetool/backend/__init__.py b/scripts/tracetool/backend/__init__.py
index 8cc9c821382..645e78ece06 100644
--- a/scripts/tracetool/backend/__init__.py
+++ b/scripts/tracetool/backend/__init__.py
@@ -60,11 +60,12 @@
 
 
 import os
+from typing import Any, Iterator
 
 import tracetool
 
 
-def get_list(only_public = False):
+def get_list(only_public: bool = False) -> list[tuple[str, str]]:
     """Get a list of (name, description) pairs."""
     res = [("nop", "Tracing disabled.")]
     modnames = []
@@ -93,7 +94,7 @@ def get_list(only_public = False):
     return res
 
 
-def exists(name):
+def exists(name: str) -> bool:
     """Return whether the given backend exists."""
     if len(name) == 0:
         return False
@@ -104,7 +105,7 @@ def exists(name):
 
 
 class Wrapper:
-    def __init__(self, backends, format):
+    def __init__(self, backends: list[str], format: str) -> None:
         self._backends = [backend.replace("-", "_") for backend in backends]
         self._format = format.replace("-", "_")
         self.check_trace_event_get_state = False
@@ -115,13 +116,13 @@ def __init__(self, backends, format):
             check_trace_event_get_state = getattr(backend, "CHECK_TRACE_EVENT_GET_STATE", False)
             self.check_trace_event_get_state = self.check_trace_event_get_state or check_trace_event_get_state
 
-    def backend_modules(self):
+    def backend_modules(self) -> Iterator[Any]:
         for backend in self._backends:
              module = tracetool.try_import("tracetool.backend." + backend)[1]
              if module is not None:
                  yield module
 
-    def _run_function(self, name, *args, check_trace_event_get_state=None, **kwargs):
+    def _run_function(self, name: str, *args, check_trace_event_get_state: bool | None=None, **kwargs) -> None:
         for backend in self.backend_modules():
             func = getattr(backend, name % self._format, None)
             if func is not None and \
@@ -129,14 +130,14 @@ def _run_function(self, name, *args, check_trace_event_get_state=None, **kwargs)
                  check_trace_event_get_state == getattr(backend, 'CHECK_TRACE_EVENT_GET_STATE', False)):
                     func(*args, **kwargs)
 
-    def generate_begin(self, events, group):
+    def generate_begin(self, events: list[tracetool.Event], group: str) -> None:
         self._run_function("generate_%s_begin", events, group)
 
-    def generate(self, event, group, check_trace_event_get_state=None):
+    def generate(self, event: tracetool.Event, group: str, check_trace_event_get_state: bool | None=None) -> None:
         self._run_function("generate_%s", event, group, check_trace_event_get_state=check_trace_event_get_state)
 
-    def generate_backend_dstate(self, event, group):
+    def generate_backend_dstate(self, event: tracetool.Event, group: str) -> None:
         self._run_function("generate_%s_backend_dstate", event, group)
 
-    def generate_end(self, events, group):
+    def generate_end(self, events: list[tracetool.Event], group: str) -> None:
         self._run_function("generate_%s_end", events, group)
diff --git a/scripts/tracetool/backend/dtrace.py b/scripts/tracetool/backend/dtrace.py
index fa941b2ff5c..d9f4b8f09c8 100644
--- a/scripts/tracetool/backend/dtrace.py
+++ b/scripts/tracetool/backend/dtrace.py
@@ -14,28 +14,28 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
+from tracetool import Event, out
 
 PUBLIC = True
 
 
-PROBEPREFIX = None
+PROBEPREFIX: str|None = None
 
-def probeprefix():
+def probeprefix() -> str:
     if PROBEPREFIX is None:
         raise ValueError("you must set PROBEPREFIX")
     return PROBEPREFIX
 
 
-BINARY = None
+BINARY: str|None = None
 
-def binary():
+def binary() -> str:
     if BINARY is None:
         raise ValueError("you must set BINARY")
     return BINARY
 
 
-def generate_h_begin(events, group):
+def generate_h_begin(events: list[Event], group: str) -> None:
     if group == "root":
         header = "trace-dtrace-root.h"
     else:
@@ -62,12 +62,12 @@ def generate_h_begin(events, group):
             '#endif',
             uppername=e.name.upper())
 
-def generate_h(event, group):
+def generate_h(event: Event, group: str) -> None:
     out('    QEMU_%(uppername)s(%(argnames)s);',
         uppername=event.name.upper(),
         argnames=", ".join(event.args.names()))
 
 
-def generate_h_backend_dstate(event, group):
+def generate_h_backend_dstate(event: Event, group: str) -> None:
     out('    QEMU_%(uppername)s_ENABLED() || \\',
         uppername=event.name.upper())
diff --git a/scripts/tracetool/backend/ftrace.py b/scripts/tracetool/backend/ftrace.py
index 40bb323f5e6..fa4a40d44a8 100644
--- a/scripts/tracetool/backend/ftrace.py
+++ b/scripts/tracetool/backend/ftrace.py
@@ -14,18 +14,18 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import expand_format_string, out
+from tracetool import Event, expand_format_string, out
 
 PUBLIC = True
 CHECK_TRACE_EVENT_GET_STATE = True
 
 
-def generate_h_begin(events, group):
+def generate_h_begin(events: list[Event], group: str) -> None:
     out('#include "trace/ftrace.h"',
         '')
 
 
-def generate_h(event, group):
+def generate_h(event: Event, group: str) -> None:
     argnames = ", ".join(event.args.names())
     if len(event.args) > 0:
         argnames = ", " + argnames
@@ -41,11 +41,11 @@ def generate_h(event, group):
         argnames=argnames)
 
 
-def generate_h_backend_dstate(event, group):
+def generate_h_backend_dstate(event: Event, group: str) -> None:
     out('    trace_event_get_state_dynamic_by_id(%(event_id)s) || \\',
         event_id="TRACE_" + event.name.upper())
 
-def generate_rs(event, group):
+def generate_rs(event: Event, group: str) -> None:
     out('        let format_string = c"%(fmt)s";',
         '        unsafe {bindings::ftrace_write(format_string.as_ptr() as *const c_char, %(args)s);}',
         fmt=expand_format_string(event.fmt),
diff --git a/scripts/tracetool/backend/log.py b/scripts/tracetool/backend/log.py
index d346a19e40f..39d777218be 100644
--- a/scripts/tracetool/backend/log.py
+++ b/scripts/tracetool/backend/log.py
@@ -14,18 +14,18 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import expand_format_string, out
+from tracetool import Event, expand_format_string, out
 
 PUBLIC = True
 CHECK_TRACE_EVENT_GET_STATE = True
 
 
-def generate_h_begin(events, group):
+def generate_h_begin(events: list[Event], group: str) -> None:
     out('#include "qemu/log-for-trace.h"',
         '')
 
 
-def generate_h(event, group):
+def generate_h(event: Event, group: str) -> None:
     argnames = ", ".join(event.args.names())
     if len(event.args) > 0:
         argnames = ", " + argnames
@@ -42,11 +42,11 @@ def generate_h(event, group):
         argnames=argnames)
 
 
-def generate_h_backend_dstate(event, group):
+def generate_h_backend_dstate(event: Event, group: str) -> None:
     out('    trace_event_get_state_dynamic_by_id(%(event_id)s) || \\',
         event_id="TRACE_" + event.name.upper())
 
-def generate_rs(event, group):
+def generate_rs(event: Event, group: str) -> None:
     out('        let format_string = c"%(fmt)s\\n";',
         '        if (unsafe { bindings::qemu_loglevel } & bindings::LOG_TRACE) != 0 {',
         '            unsafe { bindings::qemu_log(format_string.as_ptr() as *const c_char, %(args)s);}',
diff --git a/scripts/tracetool/backend/simple.py b/scripts/tracetool/backend/simple.py
index 9c691dc231b..519f7a09e5c 100644
--- a/scripts/tracetool/backend/simple.py
+++ b/scripts/tracetool/backend/simple.py
@@ -14,13 +14,13 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
+from tracetool import Event, out
 
 PUBLIC = True
 CHECK_TRACE_EVENT_GET_STATE = True
 
 
-def is_string(arg):
+def is_string(arg: str) -> bool:
     strtype = ('const char*', 'char*', 'const char *', 'char *')
     arg_strip = arg.lstrip()
     if arg_strip.startswith(strtype) and arg_strip.count('*') == 1:
@@ -29,7 +29,7 @@ def is_string(arg):
         return False
 
 
-def generate_h_begin(events, group):
+def generate_h_begin(events: list[Event], group: str) -> None:
     for event in events:
         out('void _simple_%(api)s(%(args)s);',
             api=event.api(),
@@ -37,25 +37,25 @@ def generate_h_begin(events, group):
     out('')
 
 
-def generate_h(event, group):
+def generate_h(event: Event, group: str) -> None:
     out('        _simple_%(api)s(%(args)s);',
         api=event.api(),
         args=", ".join(event.args.names()))
 
 
-def generate_h_backend_dstate(event, group):
+def generate_h_backend_dstate(event: Event, group: str) -> None:
     out('    trace_event_get_state_dynamic_by_id(%(event_id)s) || \\',
         event_id="TRACE_" + event.name.upper())
 
 
-def generate_c_begin(events, group):
+def generate_c_begin(events: list[Event], group: str) -> None:
     out('#include "qemu/osdep.h"',
         '#include "trace/control.h"',
         '#include "trace/simple.h"',
         '')
 
 
-def generate_c(event, group):
+def generate_c(event: Event, group: str) -> None:
     out('void _simple_%(api)s(%(args)s)',
         '{',
         '    TraceBufferRecord rec;',
@@ -100,7 +100,7 @@ def generate_c(event, group):
         '}',
         '')
 
-def generate_rs(event, group):
+def generate_rs(event: Event, group: str) -> None:
     out('        extern "C" { fn _simple_%(api)s(%(rust_args)s); }',
         '        unsafe { _simple_%(api)s(%(args)s); }',
         api=event.api(),
diff --git a/scripts/tracetool/backend/syslog.py b/scripts/tracetool/backend/syslog.py
index 9311453c5a7..dd39df6af6f 100644
--- a/scripts/tracetool/backend/syslog.py
+++ b/scripts/tracetool/backend/syslog.py
@@ -14,18 +14,18 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import expand_format_string, out
+from tracetool import Event, expand_format_string, out
 
 PUBLIC = True
 CHECK_TRACE_EVENT_GET_STATE = True
 
 
-def generate_h_begin(events, group):
+def generate_h_begin(events: list[Event], group: str) -> None:
     out('#include <syslog.h>',
         '')
 
 
-def generate_h(event, group):
+def generate_h(event: Event, group: str) -> None:
     argnames = ", ".join(event.args.names())
     if len(event.args) > 0:
         argnames = ", " + argnames
@@ -39,12 +39,12 @@ def generate_h(event, group):
         fmt=event.fmt.rstrip("\n"),
         argnames=argnames)
 
-def generate_rs(event, group):
+def generate_rs(event: Event, group: str) -> None:
     out('        let format_string = c"%(fmt)s";',
         '        unsafe {::trace::syslog(::trace::LOG_INFO, format_string.as_ptr() as *const c_char, %(args)s);}',
         fmt=expand_format_string(event.fmt),
         args=event.args.rust_call_varargs())
 
-def generate_h_backend_dstate(event, group):
+def generate_h_backend_dstate(event: Event, group: str) -> None:
     out('    trace_event_get_state_dynamic_by_id(%(event_id)s) || \\',
         event_id="TRACE_" + event.name.upper())
diff --git a/scripts/tracetool/backend/ust.py b/scripts/tracetool/backend/ust.py
index f2270725128..a26cd7ace61 100644
--- a/scripts/tracetool/backend/ust.py
+++ b/scripts/tracetool/backend/ust.py
@@ -14,12 +14,12 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
+from tracetool import Event, out
 
 PUBLIC = True
 
 
-def generate_h_begin(events, group):
+def generate_h_begin(events: list[Event], group: str) -> None:
     header = 'trace-ust-' + group + '.h'
     out('#include <lttng/tracepoint.h>',
         '#include "%s"' % header,
@@ -31,7 +31,7 @@ def generate_h_begin(events, group):
         '')
 
 
-def generate_h(event, group):
+def generate_h(event: Event, group: str) -> None:
     argnames = ", ".join(event.args.names())
     if len(event.args) > 0:
         argnames = ", " + argnames
@@ -41,6 +41,6 @@ def generate_h(event, group):
         tp_args=argnames)
 
 
-def generate_h_backend_dstate(event, group):
+def generate_h_backend_dstate(event: Event, group: str) -> None:
     out('    tracepoint_enabled(qemu, %(name)s) || \\',
         name=event.name)
diff --git a/scripts/tracetool/format/__init__.py b/scripts/tracetool/format/__init__.py
index 4c606d57579..c287e93ed30 100644
--- a/scripts/tracetool/format/__init__.py
+++ b/scripts/tracetool/format/__init__.py
@@ -40,9 +40,10 @@
 import os
 
 import tracetool
+from tracetool.backend import Wrapper
 
 
-def get_list():
+def get_list() -> list[tuple[str, str]]:
     """Get a list of (name, description) pairs."""
     res = []
     modnames = []
@@ -67,7 +68,7 @@ def get_list():
     return res
 
 
-def exists(name):
+def exists(name: str) -> bool:
     """Return whether the given format exists."""
     if len(name) == 0:
         return False
@@ -75,7 +76,7 @@ def exists(name):
     return tracetool.try_import("tracetool.format." + name)[0]
 
 
-def generate(events, format, backend, group):
+def generate(events: list[tracetool.Event], format: str, backend: Wrapper, group: str) -> None:
     if not exists(format):
         raise ValueError("unknown format: %s" % format)
     format = format.replace("-", "_")
diff --git a/scripts/tracetool/format/c.py b/scripts/tracetool/format/c.py
index 5b3459f2beb..e8848c51051 100644
--- a/scripts/tracetool/format/c.py
+++ b/scripts/tracetool/format/c.py
@@ -14,10 +14,11 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
+from tracetool import Event, out
+from tracetool.backend import Wrapper
 
 
-def generate(events, backend, group):
+def generate(events: list[Event], backend: Wrapper, group: str) -> None:
     active_events = [e for e in events
                      if "disable" not in e.properties]
 
diff --git a/scripts/tracetool/format/d.py b/scripts/tracetool/format/d.py
index dda80eeb766..2abf8bc24ba 100644
--- a/scripts/tracetool/format/d.py
+++ b/scripts/tracetool/format/d.py
@@ -16,7 +16,8 @@
 
 from sys import platform
 
-from tracetool import out
+from tracetool import Event, out
+from tracetool.backend import Wrapper
 
 # Reserved keywords from
 # https://wikis.oracle.com/display/DTrace/Types,+Operators+and+Expressions
@@ -31,7 +32,7 @@
 )
 
 
-def generate(events, backend, group):
+def generate(events: list[Event], backend: Wrapper, group: str) -> None:
     events = [e for e in events
               if "disable" not in e.properties]
 
diff --git a/scripts/tracetool/format/h.py b/scripts/tracetool/format/h.py
index d04cabc63e8..2324234dd24 100644
--- a/scripts/tracetool/format/h.py
+++ b/scripts/tracetool/format/h.py
@@ -14,10 +14,11 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
+from tracetool import Event, out
+from tracetool.backend import Wrapper
 
 
-def generate(events, backend, group):
+def generate(events: list[Event], backend: Wrapper, group: str) -> None:
     header = "trace/control.h"
 
     out('/* This file is autogenerated by tracetool, do not edit. */',
diff --git a/scripts/tracetool/format/log_stap.py b/scripts/tracetool/format/log_stap.py
index 65514442030..d7ad0bb6a72 100644
--- a/scripts/tracetool/format/log_stap.py
+++ b/scripts/tracetool/format/log_stap.py
@@ -15,7 +15,8 @@
 
 import re
 
-from tracetool import out
+from tracetool import Event, out
+from tracetool.backend import Wrapper
 from tracetool.backend.dtrace import binary, probeprefix
 from tracetool.backend.simple import is_string
 from tracetool.format.stap import stap_escape
@@ -30,7 +31,7 @@ def c_macro_to_format(macro):
 
     raise Exception("Unhandled macro '%s'" % macro)
 
-def c_fmt_to_stap(fmt):
+def c_fmt_to_stap(fmt: str) -> str:
     state = 0
     bits = []
     literal = ""
@@ -85,7 +86,7 @@ def c_fmt_to_stap(fmt):
     fmt = re.sub(r"%(\d*)(l+|z)(x|u|d)", r"%\1\3", "".join(bits))
     return fmt
 
-def generate(events, backend, group):
+def generate(events: list[Event], backend: Wrapper, group: str) -> None:
     out('/* This file is autogenerated by tracetool, do not edit. */',
         '/* SPDX-License-Identifier: GPL-2.0-or-later */',
         '')
diff --git a/scripts/tracetool/format/rs.py b/scripts/tracetool/format/rs.py
index 71b847cb7bc..27a4025e3d6 100644
--- a/scripts/tracetool/format/rs.py
+++ b/scripts/tracetool/format/rs.py
@@ -14,10 +14,11 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
+from tracetool import Event, out
+from tracetool.backend import Wrapper
 
 
-def generate(events, backend, group):
+def generate(events: list[Event], backend: Wrapper, group: str) -> None:
     out('// SPDX-License-Identifier: GPL-2.0-or-later',
         '// This file is @generated by tracetool, do not edit.',
         '',
diff --git a/scripts/tracetool/format/simpletrace_stap.py b/scripts/tracetool/format/simpletrace_stap.py
index eb58b4b9592..c76cf426853 100644
--- a/scripts/tracetool/format/simpletrace_stap.py
+++ b/scripts/tracetool/format/simpletrace_stap.py
@@ -14,13 +14,14 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
+from tracetool import Event, out
+from tracetool.backend import Wrapper
 from tracetool.backend.dtrace import probeprefix
 from tracetool.backend.simple import is_string
 from tracetool.format.stap import stap_escape
 
 
-def generate(events, backend, group):
+def generate(events: list[Event], backend: Wrapper, group: str) -> None:
     out('/* This file is autogenerated by tracetool, do not edit. */',
         '/* SPDX-License-Identifier: GPL-2.0-or-later */',
         '')
diff --git a/scripts/tracetool/format/stap.py b/scripts/tracetool/format/stap.py
index 808fb478b5f..af98f3c44a3 100644
--- a/scripts/tracetool/format/stap.py
+++ b/scripts/tracetool/format/stap.py
@@ -14,7 +14,8 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
+from tracetool import Event, out
+from tracetool.backend import Wrapper
 from tracetool.backend.dtrace import binary, probeprefix
 
 # Technically 'self' is not used by systemtap yet, but
@@ -27,14 +28,14 @@
     )
 
 
-def stap_escape(identifier):
+def stap_escape(identifier: str) -> str:
     # Append underscore to reserved keywords
     if identifier in RESERVED_WORDS:
         return identifier + '_'
     return identifier
 
 
-def generate(events, backend, group):
+def generate(events: list[Event], backend: Wrapper, group: str) -> None:
     events = [e for e in events
               if "disable" not in e.properties]
 
diff --git a/scripts/tracetool/format/ust_events_c.py b/scripts/tracetool/format/ust_events_c.py
index fa7d5437984..2c8256ed7a8 100644
--- a/scripts/tracetool/format/ust_events_c.py
+++ b/scripts/tracetool/format/ust_events_c.py
@@ -14,10 +14,11 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
+from tracetool import Event, out
+from tracetool.backend import Wrapper
 
 
-def generate(events, backend, group):
+def generate(events: list[Event], backend: Wrapper, group: str) -> None:
     events = [e for e in events
               if "disabled" not in e.properties]
 
diff --git a/scripts/tracetool/format/ust_events_h.py b/scripts/tracetool/format/ust_events_h.py
index 1057d02577e..f0ffcae6941 100644
--- a/scripts/tracetool/format/ust_events_h.py
+++ b/scripts/tracetool/format/ust_events_h.py
@@ -14,10 +14,11 @@
 __email__      = "stefanha@redhat.com"
 
 
-from tracetool import out
+from tracetool import Event, out
+from tracetool.backend import Wrapper
 
 
-def generate(events, backend, group):
+def generate(events: list[Event], backend: Wrapper, group: str) -> None:
     events = [e for e in events
               if "disabled" not in e.properties]
 
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 5/6] tracetool: complete typing annotations
  2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
                   ` (3 preceding siblings ...)
  2025-10-08  6:35 ` [PATCH 4/6] tracetool: add type annotations Paolo Bonzini
@ 2025-10-08  6:35 ` Paolo Bonzini
  2025-10-08 18:10   ` Stefan Hajnoczi
  2025-10-08  6:35 ` [PATCH 6/6] tracetool: add typing checks to "make -C python check" Paolo Bonzini
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-08  6:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Daniel P . Berrangé, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

Add more annotations so that "mypy --strict".  These have to be done
manually due to limitations of RightTyper.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 scripts/tracetool/__init__.py         | 30 +++++++++++++--------------
 scripts/tracetool/backend/__init__.py |  2 +-
 scripts/tracetool/format/log_stap.py  |  2 +-
 3 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
index 62895261a0d..c9509b327f4 100644
--- a/scripts/tracetool/__init__.py
+++ b/scripts/tracetool/__init__.py
@@ -19,17 +19,17 @@
 import sys
 from io import TextIOWrapper
 from pathlib import PurePath
-from typing import Any, Iterator
+from typing import Any, Iterator, Sequence
 
 import tracetool.backend
 import tracetool.format
 
 
-def error_write(*lines) -> None:
+def error_write(*lines: str) -> None:
     """Write a set of error lines."""
     sys.stderr.writelines("\n".join(lines) + "\n")
 
-def error(*lines):
+def error(*lines: str) -> None:
     """Write a set of error lines and exit."""
     error_write(*lines)
     sys.exit(1)
@@ -51,7 +51,7 @@ def error(*lines):
 }
 
 def expand_format_string(c_fmt: str, prefix: str="") -> str:
-    def pri_macro_to_fmt(pri_macro):
+    def pri_macro_to_fmt(pri_macro: str) -> str:
         assert pri_macro.startswith("PRI")
         fmt_type = pri_macro[3]  # 'd', 'i', 'u', or 'x'
         fmt_size = pri_macro[4:]  # '8', '16', '32', '64', 'PTR', 'MAX'
@@ -87,7 +87,7 @@ def out_open(filename: str) -> None:
     out_filename = posix_relpath(filename)
     out_fobj = open(filename, 'wt')
 
-def out(*lines, **kwargs) -> None:
+def out(*lines: str, **kwargs: Any) -> None:
     """Write a set of output lines.
 
     You can use kwargs as a shorthand for mapping variables when formatting all
@@ -229,14 +229,14 @@ def c_type_to_rust(name: str) -> str:
 class Arguments:
     """Event arguments description."""
 
-    def __init__(self, args: list[tuple[str, str]]) -> None:
+    def __init__(self, args: Sequence[tuple[str, str] | Arguments]) -> None:
         """
         Parameters
         ----------
         args :
             List of (type, name) tuples or Arguments objects.
         """
-        self._args = []
+        self._args: list[tuple[str, str]] = []
         for arg in args:
             if isinstance(arg, Arguments):
                 self._args.extend(arg._args)
@@ -271,7 +271,7 @@ def build(arg_str: str) -> Arguments:
             res.append((arg_type, identifier))
         return Arguments(res)
 
-    def __getitem__(self, index):
+    def __getitem__(self, index: int | slice) -> Arguments | tuple[str, str]:
         if isinstance(index, slice):
             return Arguments(self._args[index])
         else:
@@ -287,7 +287,7 @@ def __len__(self) -> int:
 
     def __str__(self) -> str:
         """String suitable for declaring function arguments."""
-        def onearg(t, n):
+        def onearg(t: str, n: str) -> str:
             if t[-1] == '*':
                 return "".join([t, n])
             else:
@@ -298,7 +298,7 @@ def onearg(t, n):
         else:
             return ", ".join([ onearg(t, n) for t,n in self._args ])
 
-    def __repr__(self):
+    def __repr__(self) -> str:
         """Evaluable string representation for this object."""
         return "Arguments(\"%s\")" % str(self)
 
@@ -310,7 +310,7 @@ def types(self) -> list[str]:
         """List of argument types."""
         return [ type_ for type_, _ in self._args ]
 
-    def casted(self):
+    def casted(self) -> list[str]:
         """List of argument names casted to their type."""
         return ["(%s)%s" % (type_, name) for type_, name in self._args]
 
@@ -321,7 +321,7 @@ def rust_decl_extern(self) -> str:
 
     def rust_decl(self) -> str:
         """Return a Rust argument list for a tracepoint function"""
-        def decl_type(type_):
+        def decl_type(type_: str) -> str:
             if type_ == "const char *":
                 return "&std::ffi::CStr"
             return c_type_to_rust(type_)
@@ -331,7 +331,7 @@ def decl_type(type_):
 
     def rust_call_extern(self) -> str:
         """Return a Rust argument list for a call to an extern "C" function"""
-        def rust_cast(name, type_):
+        def rust_cast(name: str, type_: str) -> str:
             if type_ == "const char *":
                 return f"_{name}.as_ptr()"
             return f"_{name}"
@@ -340,7 +340,7 @@ def rust_cast(name, type_):
 
     def rust_call_varargs(self) -> str:
         """Return a Rust argument list for a call to a C varargs function"""
-        def rust_cast(name, type_):
+        def rust_cast(name: str, type_: str) -> str:
             if type_ == "const char *":
                 return f"_{name}.as_ptr()"
 
@@ -449,7 +449,7 @@ def build(line_str: str, lineno: int, filename: str) -> Event:
 
         return Event(name, props, fmt, args, lineno, posix_relpath(filename))
 
-    def __repr__(self):
+    def __repr__(self) -> str:
         """Evaluable string representation for this object."""
         return "Event('%s %s(%s) %s')" % (" ".join(self.properties),
                                           self.name,
diff --git a/scripts/tracetool/backend/__init__.py b/scripts/tracetool/backend/__init__.py
index 645e78ece06..84bab3f42a0 100644
--- a/scripts/tracetool/backend/__init__.py
+++ b/scripts/tracetool/backend/__init__.py
@@ -122,7 +122,7 @@ def backend_modules(self) -> Iterator[Any]:
              if module is not None:
                  yield module
 
-    def _run_function(self, name: str, *args, check_trace_event_get_state: bool | None=None, **kwargs) -> None:
+    def _run_function(self, name: str, *args: Any, check_trace_event_get_state: bool | None=None, **kwargs: Any) -> None:
         for backend in self.backend_modules():
             func = getattr(backend, name % self._format, None)
             if func is not None and \
diff --git a/scripts/tracetool/format/log_stap.py b/scripts/tracetool/format/log_stap.py
index d7ad0bb6a72..914e4674ffe 100644
--- a/scripts/tracetool/format/log_stap.py
+++ b/scripts/tracetool/format/log_stap.py
@@ -25,7 +25,7 @@
 STATE_LITERAL = 1
 STATE_MACRO = 2
 
-def c_macro_to_format(macro):
+def c_macro_to_format(macro: str) -> str:
     if macro.startswith("PRI"):
         return macro[3]
 
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* [PATCH 6/6] tracetool: add typing checks to "make -C python check"
  2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
                   ` (4 preceding siblings ...)
  2025-10-08  6:35 ` [PATCH 5/6] tracetool: complete typing annotations Paolo Bonzini
@ 2025-10-08  6:35 ` Paolo Bonzini
  2025-10-08  9:32   ` Daniel P. Berrangé
  2025-10-08 18:12   ` Stefan Hajnoczi
  2025-10-08  7:18 ` [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Markus Armbruster
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-08  6:35 UTC (permalink / raw)
  To: qemu-devel
  Cc: Alex Bennée, Daniel P . Berrangé, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 python/tests/tracetool-mypy.sh | 5 +++++
 1 file changed, 5 insertions(+)
 create mode 100755 python/tests/tracetool-mypy.sh

diff --git a/python/tests/tracetool-mypy.sh b/python/tests/tracetool-mypy.sh
new file mode 100755
index 00000000000..7b4011e52d6
--- /dev/null
+++ b/python/tests/tracetool-mypy.sh
@@ -0,0 +1,5 @@
+#!/bin/sh -e
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+cd ../scripts
+python3 -m mypy --strict tracetool
-- 
2.51.0



^ permalink raw reply related	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
  2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
                   ` (5 preceding siblings ...)
  2025-10-08  6:35 ` [PATCH 6/6] tracetool: add typing checks to "make -C python check" Paolo Bonzini
@ 2025-10-08  7:18 ` Markus Armbruster
  2025-10-08 10:34   ` Daniel P. Berrangé
  2025-10-08 10:40 ` Daniel P. Berrangé
  2025-10-08 17:23 ` Stefan Hajnoczi
  8 siblings, 1 reply; 31+ messages in thread
From: Markus Armbruster @ 2025-10-08  7:18 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, Daniel P . Berrangé, John Snow,
	Kevin Wolf, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

Paolo Bonzini <pbonzini@redhat.com> writes:

> [People in Cc are a mix of Python people, tracing people, and people
>  who followed the recent AI discussions. - Paolo]
>
> This series adds type annotations to tracetool. While useful on its own, 
> it also served as an experiment in whether AI tools could be useful and
> appropriate for mechanical code transformations that may not involve
> copyrightable expression.
>
> In this version, the types were added mostly with the RightTyper tool
> (https://github.com/RightTyper/RightTyper), which uses profiling to detect
> the types of arguments and return types at run time.  However, because
> adding type annotations is such a narrow and verifiable task, I also developed
> a parallel version using an LLM, to provide some data on topics such as:
>
> - how much choice/creativity is there in writing type annotations?
>   Is it closer to writing functional code or to refactoring?

Based on my work with John Snow on typing of the QAPI generator: there
is some choice.

Consider typing a function's argument.  Should we pick it based on what
the function requires from its argument?  Or should the type reflect how
the function is used?

Say the function iterates over the argument.  So we make the argument
Iterable[...], right?  But what if all callers pass a list?  Making it
List[...] could be clearer then.  It's a choice.

I think the choice depends on context and taste.  At some library's
external interface, picking a more general type can make the function
more generally useful.  But for some internal helper, I'd pick the
actual type.

My point isn't that an LLM could not possibly do the right thing based
on context, and maybe even "taste" distilled from its training data.  My
point is that this isn't entirely mechanical with basically one correct
output.

Once we have such judgement calls, there's the question how an LLM's
choice depends on its training data (first order approximation today:
nobody knows), and whether and when that makes the LLM's output a
derived work of its training data (to be settled in court).

[...]

> Based on this experience, my answer to the copyrightability question is
> that, for this kind of narrow request, the output of AI can be treated as
> the output of an imperfect tool, rather than as creative content potentially
> tainted by the training material.

Maybe.

>                                    Of course this is one data point and
> is intended as an experiment rather than a policy recommendation.

Understood.  We need to develop a better understanding of capabilities,
potential benefits and risks, and such experiments can only help with
that.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/6] tracetool: rename variable with conflicting types
  2025-10-08  6:35 ` [PATCH 1/6] tracetool: rename variable with conflicting types Paolo Bonzini
@ 2025-10-08  9:27   ` Daniel P. Berrangé
  2025-10-08 18:06   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Daniel P. Berrangé @ 2025-10-08  9:27 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

On Wed, Oct 08, 2025 at 08:35:40AM +0200, Paolo Bonzini wrote:
> "backend" is used as both a string and a backend.Wrapper.  In preparation
> for adding type annotations, use different names.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  scripts/tracetool/__init__.py | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/scripts/tracetool/__init__.py b/scripts/tracetool/__init__.py
> index 74062d21a7c..85527c08c98 100644
> --- a/scripts/tracetool/__init__.py
> +++ b/scripts/tracetool/__init__.py
> @@ -566,9 +566,9 @@ def generate(events, group, format, backends,
>  
>      if len(backends) == 0:
>          raise TracetoolError("no backends specified")
> -    for backend in backends:
> -        if not tracetool.backend.exists(backend):
> -            raise TracetoolError("unknown backend: %s" % backend)
> +    for backend_name in backends:
> +        if not tracetool.backend.exists(backend_name):
> +            raise TracetoolError("unknown backend: %s" % backend_name)

IMHO it suffices to shorten the name  'for name in backends',
but either way

  Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


>      backend = tracetool.backend.Wrapper(backends, format)
>  
>      import tracetool.backend.dtrace
> -- 
> 2.51.0
> 

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] tracetool: apply isort and add check
  2025-10-08  6:35 ` [PATCH 2/6] tracetool: apply isort and add check Paolo Bonzini
@ 2025-10-08  9:29   ` Daniel P. Berrangé
  2025-10-08 17:58   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Daniel P. Berrangé @ 2025-10-08  9:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

On Wed, Oct 08, 2025 at 08:35:41AM +0200, Paolo Bonzini wrote:
> Sort imports automatically, to keep the coding style more uniform.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  python/tests/tracetool-isort.sh      | 4 ++++
>  scripts/tracetool.py                 | 5 ++---
>  scripts/tracetool/backend/dtrace.py  | 1 -
>  scripts/tracetool/backend/ftrace.py  | 3 +--
>  scripts/tracetool/backend/log.py     | 3 +--
>  scripts/tracetool/backend/simple.py  | 1 -
>  scripts/tracetool/backend/syslog.py  | 3 +--
>  scripts/tracetool/backend/ust.py     | 1 -
>  scripts/tracetool/format/d.py        | 2 +-
>  scripts/tracetool/format/log_stap.py | 1 -
>  scripts/tracetool/format/stap.py     | 1 -
>  11 files changed, 10 insertions(+), 15 deletions(-)
>  create mode 100755 python/tests/tracetool-isort.sh

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/6] tracetool: "import annotations"
  2025-10-08  6:35 ` [PATCH 3/6] tracetool: "import annotations" Paolo Bonzini
@ 2025-10-08  9:31   ` Daniel P. Berrangé
  2025-10-08 18:06   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Daniel P. Berrangé @ 2025-10-08  9:31 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

On Wed, Oct 08, 2025 at 08:35:42AM +0200, Paolo Bonzini wrote:
> In preparations for adding type annotations, make Python process them lazily.
> 
> This avoids the need to express some annotations as strings.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  scripts/tracetool.py                         | 2 ++
>  scripts/tracetool/__init__.py                | 2 ++
>  scripts/tracetool/backend/__init__.py        | 2 ++
>  scripts/tracetool/backend/dtrace.py          | 2 ++
>  scripts/tracetool/backend/ftrace.py          | 2 ++
>  scripts/tracetool/backend/log.py             | 2 ++
>  scripts/tracetool/backend/simple.py          | 2 ++
>  scripts/tracetool/backend/syslog.py          | 2 ++
>  scripts/tracetool/backend/ust.py             | 2 ++
>  scripts/tracetool/format/__init__.py         | 2 ++
>  scripts/tracetool/format/c.py                | 2 ++
>  scripts/tracetool/format/d.py                | 2 ++
>  scripts/tracetool/format/h.py                | 2 ++
>  scripts/tracetool/format/log_stap.py         | 2 ++
>  scripts/tracetool/format/rs.py               | 2 ++
>  scripts/tracetool/format/simpletrace_stap.py | 2 ++
>  scripts/tracetool/format/stap.py             | 2 ++
>  scripts/tracetool/format/ust_events_c.py     | 2 ++
>  scripts/tracetool/format/ust_events_h.py     | 2 ++
>  19 files changed, 38 insertions(+)
>

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/6] tracetool: add typing checks to "make -C python check"
  2025-10-08  6:35 ` [PATCH 6/6] tracetool: add typing checks to "make -C python check" Paolo Bonzini
@ 2025-10-08  9:32   ` Daniel P. Berrangé
  2025-10-08 18:12   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Daniel P. Berrangé @ 2025-10-08  9:32 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

On Wed, Oct 08, 2025 at 08:35:45AM +0200, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  python/tests/tracetool-mypy.sh | 5 +++++
>  1 file changed, 5 insertions(+)
>  create mode 100755 python/tests/tracetool-mypy.sh

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
  2025-10-08  7:18 ` [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Markus Armbruster
@ 2025-10-08 10:34   ` Daniel P. Berrangé
  2025-10-10 12:38     ` Markus Armbruster
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel P. Berrangé @ 2025-10-08 10:34 UTC (permalink / raw)
  To: Markus Armbruster
  Cc: Paolo Bonzini, qemu-devel, Alex Bennée, John Snow,
	Kevin Wolf, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

On Wed, Oct 08, 2025 at 09:18:04AM +0200, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
> > [People in Cc are a mix of Python people, tracing people, and people
> >  who followed the recent AI discussions. - Paolo]
> >
> > This series adds type annotations to tracetool. While useful on its own, 
> > it also served as an experiment in whether AI tools could be useful and
> > appropriate for mechanical code transformations that may not involve
> > copyrightable expression.
> >
> > In this version, the types were added mostly with the RightTyper tool
> > (https://github.com/RightTyper/RightTyper), which uses profiling to detect
> > the types of arguments and return types at run time.  However, because
> > adding type annotations is such a narrow and verifiable task, I also developed
> > a parallel version using an LLM, to provide some data on topics such as:
> >
> > - how much choice/creativity is there in writing type annotations?
> >   Is it closer to writing functional code or to refactoring?
> 
> Based on my work with John Snow on typing of the QAPI generator: there
> is some choice.
> 
> Consider typing a function's argument.  Should we pick it based on what
> the function requires from its argument?  Or should the type reflect how
> the function is used?
> 
> Say the function iterates over the argument.  So we make the argument
> Iterable[...], right?  But what if all callers pass a list?  Making it
> List[...] could be clearer then.  It's a choice.
> 
> I think the choice depends on context and taste.  At some library's
> external interface, picking a more general type can make the function
> more generally useful.  But for some internal helper, I'd pick the
> actual type.
> 
> My point isn't that an LLM could not possibly do the right thing based
> on context, and maybe even "taste" distilled from its training data.  My
> point is that this isn't entirely mechanical with basically one correct
> output.
>
> Once we have such judgement calls, there's the question how an LLM's
> choice depends on its training data (first order approximation today:
> nobody knows), and whether and when that makes the LLM's output a
> derived work of its training data (to be settled in court).

There's perhaps a missing step here. The work first has to be considered
eligible for copyright protection, before we get onto questions wrt
derivative works, but that opens a can of worms....

The big challenge is that traditionally projects did not really have
to think much about where the "threshold of originality"[1] came into
play for a work to be considered eligible for copyright.

It was fine to take the conservative view that any patch benefits
from copyright protection for the individual (or more often their
employer). There was rarely any compelling need for the project to
understand if something failed to meet the threshold. We see this
in cases where a project re-licenses code - it is simpler/cheaper
just to contact all contributors for permission, than to evaluate
which subset held copyright.

This is a very good thing. Understanding where that threshold applies
is an challenging intellectual task that consumes time better spent
on other tasks. Especially notable though is that threshold varies
across countries in the world, and some places have at times even
considered merely expending labour to make a work eligible.

In trying to create an policy that permits AI contributions in some
narrow scenarios, we're trying to thread the needle and as a global
project that implies satisfying a wide variety of interpretations
for the copyright threshold. There's no clear cut answer to that,
the only option is to mitigate risk to a tolerable level.

That's something all projects already do. For example when choosing
between a CLA, vs a DCO signup, vs implicitly expecting contributors
to be good, we're making a risk/cost tradeoff.

Back to the LLMs / type annotations question. Let us accept there
there is a judgement in situations like Iterable[...] vs List[...],
and that an LLM makes that call. The LLM's basis for that judgement
call is certainly biased from what it saw in training data, just as
our own judgement call would be biased from what we've seen before
in python type annotations in other codebases.

In either case though, I have a hard time proposing why the result
of that judgement call makes the output a derived work of anything
other than the current QEMU codebase (and possibly the LLM prompt).
To say otherwise would mean my work is tainted by every codebase
I've ever read.

The most plausible scenario of "copying" existing copyright content
is where a fork of the QEMU code already exists in the training
material with the type annotations present. If that was the case
though, that pre-existing work would already be considered a
derivative of QEMU code and thus compatibly licensed. What we'd
be loosing would be the explicit SoB of the original author (if
any), rather than facing any clear license compatibility problem.

If we consider that type annotations are (a) borderline for copyright
protection eligibility and (b) the type annotations are unlikely to be
claimed as derived work of anything other than QEMU; then this looks
like a low risk scenario for use of LLM.

The problem I have remaining is around the practicality of expressing
this in a policy and having contributors & maintainers apply it well
in practice.

There's a definite "slippery slope" situation. The incentive for
contributors will be to search out reasons to justify why a work
matches the AI exception, while overworked maintainers probably
aren't going to spend much (any) mental energy on objectively
analysing the situation, unless the proposal looks terribly out
of line with expectations.

I'm not saying we shouldn't have an exception. I think the mypy case
is an fairly compelling example of a low risk activity. We will need
to become comfortable with the implications around the incentives
for contributors & maintainers when applying it though.

With regards,
Daniel

[1] https://en.wikipedia.org/wiki/Threshold_of_originality
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
  2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
                   ` (6 preceding siblings ...)
  2025-10-08  7:18 ` [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Markus Armbruster
@ 2025-10-08 10:40 ` Daniel P. Berrangé
  2025-10-08 17:23 ` Stefan Hajnoczi
  8 siblings, 0 replies; 31+ messages in thread
From: Daniel P. Berrangé @ 2025-10-08 10:40 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, John Snow, Kevin Wolf,
	Markus Armbruster, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

On Wed, Oct 08, 2025 at 08:35:39AM +0200, Paolo Bonzini wrote:
> Comparing the results from RightTyper and AI, both tools benefit from human
> help: the dead code removal which I mentioned, the small cleanups in patch 1,
> the final manual fixes for the remaining (mostly trivial) errors.  But at
> least in this case, AI is a winner:
> 
> - it left basically no unannotated code: fixing the above errors was enough
>   to pass "mypy --strict", unlike RightTyper which needed a little more work
>   due to its profiling-based nature and a few other limitations (see patch 5).
> 
> - most importantly, trying various other tools that didn't work, as well as
>   figuring out how to use RightTyper, took a couple hours more.  Surprising
>   as it was, I could not find any static type inferencing tool for Python;
>   neither pytype nor pyre worked for me.  This is also why I think this
>   is not apples to oranges, but a fair comparison between AI-based and
>   regular tooling.

FWIW, there was a third possible option here, a sort of hybrid...

   prompt> find me tools that can add python type annotatinos to code
   prompt> show me how to invoke tool <blah> on the QEMU codebase.
   prompt> run the tool <blah> on the QEMU code, run mypy tests and
           commit the result

that would probably have saved the hours learning how to use RightTyper
making it more competitive with the pure AI approach.

> After the diffstat, you can find a diff from this series to the version
> based on Claude Code.  It's impossible to be 100% objective but,
> besides being functionally equivalent, I don't think either would be
> identifiable as written by an LLM, by a person, by a tool+human combo,
> or even by a good type inferencing tool (if it existed).

AFAICS the diff there is either whitespace changes, or a differing
style for  importing a module vs importing a name from a module.

Whitespace we should be enforcing with 'black' formatting rules.
For import style, I'd say that's an irrelevant difference.

IOW, I'd call the two patches identical for any practical purposes.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
  2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
                   ` (7 preceding siblings ...)
  2025-10-08 10:40 ` Daniel P. Berrangé
@ 2025-10-08 17:23 ` Stefan Hajnoczi
  2025-10-08 17:41   ` Paolo Bonzini
  8 siblings, 1 reply; 31+ messages in thread
From: Stefan Hajnoczi @ 2025-10-08 17:23 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, Daniel P . Berrangé, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

[-- Attachment #1: Type: text/plain, Size: 2360 bytes --]

On Wed, Oct 08, 2025 at 08:35:39AM +0200, Paolo Bonzini wrote:
> In this version, the types were added mostly with the RightTyper tool
> (https://github.com/RightTyper/RightTyper), which uses profiling to detect
> the types of arguments and return types at run time.  However, because
> adding type annotations is such a narrow and verifiable task, I also developed
> a parallel version using an LLM, to provide some data on topics such as:
> 
> - how much choice/creativity is there in writing type annotations?
>   Is it closer to writing functional code or to refactoring?
> 
> - how does AI output for such mechanical transformations compare to other
>   automated tools, whose output is known not to be copyrightable? 
> 
> - what is the kind of time saving that the tool can provide?
> 
> - how effective is an LLM for this task?  Is the required human help a
>   pain to provide, or can the human keep the fun part for itself?

Here are my thoughts:

- Type annotations are probably not copyrightable. Although I think AI
  output contributions should be allowed for non-copyrightable work, I
  understand the slippery slope argument. Also, at QEMU Summit the
  consensus was to give it some time and then revisit the AI policy next
  year. I think we should stick to that so that the QEMU community has
  time to consider other scenarios involving AI. Then we can make a
  policy change next year that combines the new items that have come up
  rather than making a series of shifting policy changes over the coming
  months. That will make it simpler for everyone to understand and
  follow the policy correctly.

- Markus pointed out why generating them automatically may not result in
  the same quality as manually-written annotations because the AI
  doesn't have the context that is missing from the code itself. It's
  the same reason why generating API docs using AI produces something
  resembling documentation, but it consists of obvious stuff apparent
  from the code itself and not the context that only the author can
  communicate about the intended use of the API. The difference between
  manually-written type hints and AI-generated ones is probably less
  significant than with API documentation though, so I think they are
  sufficiently valuable and high quality enough to merge.

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
  2025-10-08 17:23 ` Stefan Hajnoczi
@ 2025-10-08 17:41   ` Paolo Bonzini
  0 siblings, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-08 17:41 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Alex Bennée, Daniel P . Berrangé, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

[-- Attachment #1: Type: text/plain, Size: 3009 bytes --]

Il mer 8 ott 2025, 19:23 Stefan Hajnoczi <stefanha@redhat.com> ha scritto:

> On Wed, Oct 08, 2025 at 08:35:39AM +0200, Paolo Bonzini wrote:
> > In this version, the types were added mostly with the RightTyper tool
> > (https://github.com/RightTyper/RightTyper), which uses profiling to
> detect
> > the types of arguments and return types at run time.  However, because
> > adding type annotations is such a narrow and verifiable task, I also
> developed
> > a parallel version using an LLM, to provide some data on topics such as:
> >
> > - how much choice/creativity is there in writing type annotations?
> >   Is it closer to writing functional code or to refactoring?
> >
> > - how does AI output for such mechanical transformations compare to other
> >   automated tools, whose output is known not to be copyrightable?
> >
> > - what is the kind of time saving that the tool can provide?
> >
> > - how effective is an LLM for this task?  Is the required human help a
> >   pain to provide, or can the human keep the fun part for itself?
>
> Here are my thoughts:
>
> - Type annotations are probably not copyrightable. Although I think AI
>   output contributions should be allowed for non-copyrightable work, I
>   understand the slippery slope argument. Also, at QEMU Summit the
>   consensus was to give it some time and then revisit the AI policy next
>   year. I think we should stick to that so that the QEMU community has
>   time to consider other scenarios involving AI.

I agree. At the same time, rushing a policy change is bad but making an
uninformed one is too. Submitting the non-AI change, recounting the
experience with the AI tool, and explaining how they differed for me,
hopefully helps informing decisions. In fact this is probably at the upper
end of the complexity, for what can still be described as mechanical
transformation.

In other words, I want to be clear that, even though I did rush a bit the
discussion on the policy, I don't want to rush the changes but rather be
prepared for them.

- Markus pointed out why generating them automatically may not result in
>   the same quality as manually-written annotations because the AI
>   doesn't have the context that is missing from the code itself. [...] The
> difference between
>   manually-written type hints and AI-generated ones is probably less
>   significant than with API documentation though, so I think they are
>   sufficiently valuable and high quality enough to merge.
>

Indeed. And also, the nuances are more important for something where type
hints describe an API than if you're doing it only for some extra static
checking.

You can always start with a tool and refine, too. In fact I expect that
*if* an exception is granted for mechanical changes, submissions would
almost never consist of AI-generated changes only. Full isolation may not
always be practical, but separating AI- or tool-generated commits from
human refinements is something we'd ask for anyway, for the sake of
reviewability.

Paolo

>

[-- Attachment #2: Type: text/html, Size: 4269 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] tracetool: apply isort and add check
  2025-10-08  6:35 ` [PATCH 2/6] tracetool: apply isort and add check Paolo Bonzini
  2025-10-08  9:29   ` Daniel P. Berrangé
@ 2025-10-08 17:58   ` Stefan Hajnoczi
  2025-10-09  7:52     ` Daniel P. Berrangé
  2025-10-09  7:58     ` Paolo Bonzini
  1 sibling, 2 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2025-10-08 17:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, Daniel P . Berrangé, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

[-- Attachment #1: Type: text/plain, Size: 1473 bytes --]

On Wed, Oct 08, 2025 at 08:35:41AM +0200, Paolo Bonzini wrote:
> Sort imports automatically, to keep the coding style more uniform.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  python/tests/tracetool-isort.sh      | 4 ++++
>  scripts/tracetool.py                 | 5 ++---
>  scripts/tracetool/backend/dtrace.py  | 1 -
>  scripts/tracetool/backend/ftrace.py  | 3 +--
>  scripts/tracetool/backend/log.py     | 3 +--
>  scripts/tracetool/backend/simple.py  | 1 -
>  scripts/tracetool/backend/syslog.py  | 3 +--
>  scripts/tracetool/backend/ust.py     | 1 -
>  scripts/tracetool/format/d.py        | 2 +-
>  scripts/tracetool/format/log_stap.py | 1 -
>  scripts/tracetool/format/stap.py     | 1 -
>  11 files changed, 10 insertions(+), 15 deletions(-)
>  create mode 100755 python/tests/tracetool-isort.sh
> 
> diff --git a/python/tests/tracetool-isort.sh b/python/tests/tracetool-isort.sh
> new file mode 100755
> index 00000000000..b23f3d48448
> --- /dev/null
> +++ b/python/tests/tracetool-isort.sh
> @@ -0,0 +1,4 @@
> +#!/bin/sh -e
> +# SPDX-License-Identifier: GPL-2.0-or-later
> +
> +python3 -m isort --sp . -c ../scripts/tracetool/

I wonder why python/tests/isort.sh doesn't already cover this with its
`python3 -m isort -c scripts/` line?

Also, why the --settings-path (--sp) option that python/tests/isort.sh
doesn't use?

It would be great to have just 1 script that runs isort on all Python
code in QEMU.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 1/6] tracetool: rename variable with conflicting types
  2025-10-08  6:35 ` [PATCH 1/6] tracetool: rename variable with conflicting types Paolo Bonzini
  2025-10-08  9:27   ` Daniel P. Berrangé
@ 2025-10-08 18:06   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2025-10-08 18:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, Daniel P . Berrangé, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

[-- Attachment #1: Type: text/plain, Size: 410 bytes --]

On Wed, Oct 08, 2025 at 08:35:40AM +0200, Paolo Bonzini wrote:
> "backend" is used as both a string and a backend.Wrapper.  In preparation
> for adding type annotations, use different names.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  scripts/tracetool/__init__.py | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 3/6] tracetool: "import annotations"
  2025-10-08  6:35 ` [PATCH 3/6] tracetool: "import annotations" Paolo Bonzini
  2025-10-08  9:31   ` Daniel P. Berrangé
@ 2025-10-08 18:06   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2025-10-08 18:06 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, Daniel P . Berrangé, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

[-- Attachment #1: Type: text/plain, Size: 1435 bytes --]

On Wed, Oct 08, 2025 at 08:35:42AM +0200, Paolo Bonzini wrote:
> In preparations for adding type annotations, make Python process them lazily.
> 
> This avoids the need to express some annotations as strings.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  scripts/tracetool.py                         | 2 ++
>  scripts/tracetool/__init__.py                | 2 ++
>  scripts/tracetool/backend/__init__.py        | 2 ++
>  scripts/tracetool/backend/dtrace.py          | 2 ++
>  scripts/tracetool/backend/ftrace.py          | 2 ++
>  scripts/tracetool/backend/log.py             | 2 ++
>  scripts/tracetool/backend/simple.py          | 2 ++
>  scripts/tracetool/backend/syslog.py          | 2 ++
>  scripts/tracetool/backend/ust.py             | 2 ++
>  scripts/tracetool/format/__init__.py         | 2 ++
>  scripts/tracetool/format/c.py                | 2 ++
>  scripts/tracetool/format/d.py                | 2 ++
>  scripts/tracetool/format/h.py                | 2 ++
>  scripts/tracetool/format/log_stap.py         | 2 ++
>  scripts/tracetool/format/rs.py               | 2 ++
>  scripts/tracetool/format/simpletrace_stap.py | 2 ++
>  scripts/tracetool/format/stap.py             | 2 ++
>  scripts/tracetool/format/ust_events_c.py     | 2 ++
>  scripts/tracetool/format/ust_events_h.py     | 2 ++
>  19 files changed, 38 insertions(+)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 4/6] tracetool: add type annotations
  2025-10-08  6:35 ` [PATCH 4/6] tracetool: add type annotations Paolo Bonzini
@ 2025-10-08 18:09   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2025-10-08 18:09 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, Daniel P . Berrangé, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

[-- Attachment #1: Type: text/plain, Size: 3773 bytes --]

On Wed, Oct 08, 2025 at 08:35:43AM +0200, Paolo Bonzini wrote:
> Created with a profiling-based tool, righttyper.  I used this script:
> 
>   python -m righttyper --generate-stubs --no-sampling --overwrite  -- ./tracetool.py --help
>   find . -name "*.pyi" | while read fname; do
>     merge-pyi --in-place -b bak.$fmt ${fname%i} $fname
>   done
> 
>   for fmt in c h rs d log-stap simpletrace-stap stap ust-events-c ust-events-h; do
>     find . -name '*.pyi*' | xargs rm
>     python -m righttyper --generate-stubs --no-sampling --overwrite ./tracetool.py \
>       --format=$fmt --backends=ust,simple,syslog,ftrace,dtrace,log --group=testsuite \
>       --binary=qemu --probe-prefix=qemu ../tests/tracetool/trace-events outfile.$fmt
>     find . -name "*.pyi" | while read fname; do
>       merge-pyi --in-place -b bak.$fmt ${fname%i} $fname
>     done
>   done
> 
>   python -m isort $(find tracetool -name "*.py")
> 
> Running the script took about 5 minutes.  The errors were mostly
> due to misunderstanding the "try_import" function:
> 
>   tracetool/backend/__init__.py:83: error: Incompatible types in assignment (expression has type Module, variable has type "tuple[bool, Module]")  [assignment]
>   tracetool/backend/__init__.py:117: error: Incompatible types in assignment (expression has type Module, variable has type "str")  [assignment]
>   tracetool/__init__.py:543: error: Incompatible return value type (got "tuple[bool, None]", expected "tuple[bool, Module]")  [return-value]
>   tracetool/format/__init__.py:60: error: Incompatible types in assignment (expression has type Module, variable has type "tuple[bool, Module]")  [assignment]
>   tracetool/format/__init__.py:85: error: Argument 2 to "try_import" has incompatible type "str"; expected "None"  [arg-type]
>   tracetool/format/__init__.py:88: error: Module not callable  [operator]
> 
> On top of this I fixed a little weirdness, while leaving the unannotated
> functions unchanged.  Being profiling-based, righttyper did not annotate
> anything not covered by the check-tracetool testsuite:
> - error cases
> - PRIxxx macros
> 
> It also reported list[Never] for always-empty lists, which is incorrect.
> Righttyper also has a few limitations: it does not annotate nested functions
> (there were only four of them), or "*args" argument lists.  These are fixed
> in the next patch.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  scripts/tracetool.py                         |  5 +-
>  scripts/tracetool/__init__.py                | 54 ++++++++++----------
>  scripts/tracetool/backend/__init__.py        | 19 +++----
>  scripts/tracetool/backend/dtrace.py          | 16 +++---
>  scripts/tracetool/backend/ftrace.py          | 10 ++--
>  scripts/tracetool/backend/log.py             | 10 ++--
>  scripts/tracetool/backend/simple.py          | 16 +++---
>  scripts/tracetool/backend/syslog.py          | 10 ++--
>  scripts/tracetool/backend/ust.py             |  8 +--
>  scripts/tracetool/format/__init__.py         |  7 +--
>  scripts/tracetool/format/c.py                |  5 +-
>  scripts/tracetool/format/d.py                |  5 +-
>  scripts/tracetool/format/h.py                |  5 +-
>  scripts/tracetool/format/log_stap.py         |  7 +--
>  scripts/tracetool/format/rs.py               |  5 +-
>  scripts/tracetool/format/simpletrace_stap.py |  5 +-
>  scripts/tracetool/format/stap.py             |  7 +--
>  scripts/tracetool/format/ust_events_c.py     |  5 +-
>  scripts/tracetool/format/ust_events_h.py     |  5 +-
>  19 files changed, 109 insertions(+), 95 deletions(-)

I skimmed through the type annotations and they look reasonable to me:

Acked-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 5/6] tracetool: complete typing annotations
  2025-10-08  6:35 ` [PATCH 5/6] tracetool: complete typing annotations Paolo Bonzini
@ 2025-10-08 18:10   ` Stefan Hajnoczi
  0 siblings, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2025-10-08 18:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, Daniel P . Berrangé, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

[-- Attachment #1: Type: text/plain, Size: 531 bytes --]

On Wed, Oct 08, 2025 at 08:35:44AM +0200, Paolo Bonzini wrote:
> Add more annotations so that "mypy --strict".  These have to be done
> manually due to limitations of RightTyper.
> 
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  scripts/tracetool/__init__.py         | 30 +++++++++++++--------------
>  scripts/tracetool/backend/__init__.py |  2 +-
>  scripts/tracetool/format/log_stap.py  |  2 +-
>  3 files changed, 17 insertions(+), 17 deletions(-)

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 6/6] tracetool: add typing checks to "make -C python check"
  2025-10-08  6:35 ` [PATCH 6/6] tracetool: add typing checks to "make -C python check" Paolo Bonzini
  2025-10-08  9:32   ` Daniel P. Berrangé
@ 2025-10-08 18:12   ` Stefan Hajnoczi
  1 sibling, 0 replies; 31+ messages in thread
From: Stefan Hajnoczi @ 2025-10-08 18:12 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: qemu-devel, Alex Bennée, Daniel P . Berrangé, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

[-- Attachment #1: Type: text/plain, Size: 306 bytes --]

On Wed, Oct 08, 2025 at 08:35:45AM +0200, Paolo Bonzini wrote:
> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> ---
>  python/tests/tracetool-mypy.sh | 5 +++++
>  1 file changed, 5 insertions(+)
>  create mode 100755 python/tests/tracetool-mypy.sh

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] tracetool: apply isort and add check
  2025-10-08 17:58   ` Stefan Hajnoczi
@ 2025-10-09  7:52     ` Daniel P. Berrangé
  2025-10-09  8:22       ` Paolo Bonzini
  2025-10-09  7:58     ` Paolo Bonzini
  1 sibling, 1 reply; 31+ messages in thread
From: Daniel P. Berrangé @ 2025-10-09  7:52 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Paolo Bonzini, qemu-devel, Alex Bennée, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

On Wed, Oct 08, 2025 at 01:58:11PM -0400, Stefan Hajnoczi wrote:
> On Wed, Oct 08, 2025 at 08:35:41AM +0200, Paolo Bonzini wrote:
> > Sort imports automatically, to keep the coding style more uniform.
> > 
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  python/tests/tracetool-isort.sh      | 4 ++++
> >  scripts/tracetool.py                 | 5 ++---
> >  scripts/tracetool/backend/dtrace.py  | 1 -
> >  scripts/tracetool/backend/ftrace.py  | 3 +--
> >  scripts/tracetool/backend/log.py     | 3 +--
> >  scripts/tracetool/backend/simple.py  | 1 -
> >  scripts/tracetool/backend/syslog.py  | 3 +--
> >  scripts/tracetool/backend/ust.py     | 1 -
> >  scripts/tracetool/format/d.py        | 2 +-
> >  scripts/tracetool/format/log_stap.py | 1 -
> >  scripts/tracetool/format/stap.py     | 1 -
> >  11 files changed, 10 insertions(+), 15 deletions(-)
> >  create mode 100755 python/tests/tracetool-isort.sh
> > 
> > diff --git a/python/tests/tracetool-isort.sh b/python/tests/tracetool-isort.sh
> > new file mode 100755
> > index 00000000000..b23f3d48448
> > --- /dev/null
> > +++ b/python/tests/tracetool-isort.sh
> > @@ -0,0 +1,4 @@
> > +#!/bin/sh -e
> > +# SPDX-License-Identifier: GPL-2.0-or-later
> > +
> > +python3 -m isort --sp . -c ../scripts/tracetool/
> 
> I wonder why python/tests/isort.sh doesn't already cover this with its
> `python3 -m isort -c scripts/` line?
> 
> Also, why the --settings-path (--sp) option that python/tests/isort.sh
> doesn't use?
> 
> It would be great to have just 1 script that runs isort on all Python
> code in QEMU.

IMHO all of the shell scripts should really just go away. They
live in their own world outside the rest of our meson test setup
which means developers rarely even know they exist, let alone run
them on submissions.

I've proposed removing them in favour of meson rules earlier
this year:

  https://lists.gnu.org/archive/html/qemu-devel/2025-02/msg04920.html

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] tracetool: apply isort and add check
  2025-10-08 17:58   ` Stefan Hajnoczi
  2025-10-09  7:52     ` Daniel P. Berrangé
@ 2025-10-09  7:58     ` Paolo Bonzini
  1 sibling, 0 replies; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-09  7:58 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Alex Bennée, Daniel P . Berrangé, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

On Wed, Oct 8, 2025 at 7:58 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> On Wed, Oct 08, 2025 at 08:35:41AM +0200, Paolo Bonzini wrote:
> > Sort imports automatically, to keep the coding style more uniform.
> >
> > Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
> > ---
> >  python/tests/tracetool-isort.sh      | 4 ++++
> >  scripts/tracetool.py                 | 5 ++---
> >  scripts/tracetool/backend/dtrace.py  | 1 -
> >  scripts/tracetool/backend/ftrace.py  | 3 +--
> >  scripts/tracetool/backend/log.py     | 3 +--
> >  scripts/tracetool/backend/simple.py  | 1 -
> >  scripts/tracetool/backend/syslog.py  | 3 +--
> >  scripts/tracetool/backend/ust.py     | 1 -
> >  scripts/tracetool/format/d.py        | 2 +-
> >  scripts/tracetool/format/log_stap.py | 1 -
> >  scripts/tracetool/format/stap.py     | 1 -
> >  11 files changed, 10 insertions(+), 15 deletions(-)
> >  create mode 100755 python/tests/tracetool-isort.sh
> >
> > diff --git a/python/tests/tracetool-isort.sh b/python/tests/tracetool-isort.sh
> > new file mode 100755
> > index 00000000000..b23f3d48448
> > --- /dev/null
> > +++ b/python/tests/tracetool-isort.sh
> > @@ -0,0 +1,4 @@
> > +#!/bin/sh -e
> > +# SPDX-License-Identifier: GPL-2.0-or-later
> > +
> > +python3 -m isort --sp . -c ../scripts/tracetool/
>
> I wonder why python/tests/isort.sh doesn't already cover this with its
> `python3 -m isort -c scripts/` line?

Because that one is for python/scripts. Not sure why John placed
mkvenv.py and venv.py in python/scripts, but the potential for
confusion is real.

> Also, why the --settings-path (--sp) option that python/tests/isort.sh
> doesn't use?

I honestly copied it from python/tests/qapi-isort.sh. I think it's
because python/scripts/ is a subdirectory of where python/setup.cfg
lives, but scripts/ is not.

> It would be great to have just 1 script that runs isort on all Python
> code in QEMU.

Yes, there's lots of room for cleanup there.

Paolo



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] tracetool: apply isort and add check
  2025-10-09  7:52     ` Daniel P. Berrangé
@ 2025-10-09  8:22       ` Paolo Bonzini
  2025-10-09  8:58         ` Daniel P. Berrangé
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-09  8:22 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Stefan Hajnoczi, qemu-devel, Alex Bennée, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

On Thu, Oct 9, 2025 at 9:52 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> I've proposed removing them in favour of meson rules earlier
> this year:
>
>   https://lists.gnu.org/archive/html/qemu-devel/2025-02/msg04920.html

I mostly agree with the intent of integrating tests with the
configure/pyvenv/meson infrastructure, but I'd do things in a
different order:

- split part of scripts/ into contrib/scripts/ to identify one-off
scripts vs. what is really used in the build process

- move python/scripts/ to scripts/venv/, adjusting python/tests

- move tests/minreqs.txt to pythondeps.toml. switch from pip to mkvenv
in tests/Makefile.

- unifying the test scripts for all of scripts/, as much as possible

- integrate tests in meson, but keeping the shell scripts for now.
move python/tests to tests/python. add a custom_target() to do "mkvenv
ensuregroup tests"

- move tox usage to .gitlab-ci.d, placing tox outside the configure
script instead of inside. this makes the QEMU venv a "child" of the
tox venv rather than the opposite, and allows more testing than just
running the linters

- remove testing-related parts of tests/Makefile

Only then I'd look at opportunities for unifying all the settings and
moving away from the shell scripts

Paolo



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] tracetool: apply isort and add check
  2025-10-09  8:22       ` Paolo Bonzini
@ 2025-10-09  8:58         ` Daniel P. Berrangé
  2025-10-09 11:26           ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Daniel P. Berrangé @ 2025-10-09  8:58 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Stefan Hajnoczi, qemu-devel, Alex Bennée, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

On Thu, Oct 09, 2025 at 10:22:43AM +0200, Paolo Bonzini wrote:
> On Thu, Oct 9, 2025 at 9:52 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > I've proposed removing them in favour of meson rules earlier
> > this year:
> >
> >   https://lists.gnu.org/archive/html/qemu-devel/2025-02/msg04920.html
> 
> I mostly agree with the intent of integrating tests with the
> configure/pyvenv/meson infrastructure, but I'd do things in a
> different order:
> 
> - split part of scripts/ into contrib/scripts/ to identify one-off
> scripts vs. what is really used in the build process

IMHO we should rather aim to eliminate contrib. We should either care
about the contents, and find an appropriate home for it in tree & have
ability to install it, or they should live outside QEMU git in their
own home.

We shouldn't encourage a "dumping ground" for stuff that we don't
know what to do with.

> - move python/scripts/ to scripts/venv/, adjusting python/tests
> 
> - move tests/minreqs.txt to pythondeps.toml. switch from pip to mkvenv
> in tests/Makefile.
> 
> - unifying the test scripts for all of scripts/, as much as possible
> 
> - integrate tests in meson, but keeping the shell scripts for now.
> move python/tests to tests/python. add a custom_target() to do "mkvenv
> ensuregroup tests"
> 
> - move tox usage to .gitlab-ci.d, placing tox outside the configure
> script instead of inside. this makes the QEMU venv a "child" of the
> tox venv rather than the opposite, and allows more testing than just
> running the linters

If tox is outside, that seems to require that we re-run configure for
each different python version we want to run tests with, which feels
pretty undesirable.

There's the (system) python version that we want to run the overall
build system with, and then there are N other python versions which
we want to be able to run linters against.

> - remove testing-related parts of tests/Makefile
> 
> Only then I'd look at opportunities for unifying all the settings and
> moving away from the shell scripts



With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] tracetool: apply isort and add check
  2025-10-09  8:58         ` Daniel P. Berrangé
@ 2025-10-09 11:26           ` Paolo Bonzini
  2025-10-09 12:05             ` Daniel P. Berrangé
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-09 11:26 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Stefan Hajnoczi, qemu-devel, Alex Bennée, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

On Thu, Oct 9, 2025 at 10:58 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Thu, Oct 09, 2025 at 10:22:43AM +0200, Paolo Bonzini wrote:
> > On Thu, Oct 9, 2025 at 9:52 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > I've proposed removing them in favour of meson rules earlier
> > > this year:
> > >
> > >   https://lists.gnu.org/archive/html/qemu-devel/2025-02/msg04920.html
> >
> > I mostly agree with the intent of integrating tests with the
> > configure/pyvenv/meson infrastructure, but I'd do things in a
> > different order:
> >
> > - split part of scripts/ into contrib/scripts/ to identify one-off
> > scripts vs. what is really used in the build process
>
> IMHO we should rather aim to eliminate contrib. We should either care
> about the contents, and find an appropriate home for it in tree & have
> ability to install it, or they should live outside QEMU git in their
> own home.
>
> We shouldn't encourage a "dumping ground" for stuff that we don't
> know what to do with.

I don't disagree but we have a lot of scripts like that:

scripts/analyse-9p-simpletrace.py
scripts/analyse-locks-simpletrace.py
scripts/compare-machine-types.py
scripts/dump-guest-memory.py
scripts/get-wraps-from-cargo-registry.py
scripts/python_qmp_updater.py
scripts/qcow2-to-stdout.py
scripts/qemu-gdb.py
scripts/qom-cast-macro-clean-cocci-gen.py
scripts/render_block_graph.py
scripts/replay-dump.py
scripts/simpletrace.py
scripts/u2f-setup-gen.py
scripts/userfaultfd-wrlat.py

and a bunch more that aren't .py so I didn't check them; plus entire
directories:

scripts/coccinelle/
scripts/codeconverter
scripts/coverage/
scripts/kvm/
scripts/performance/
scripts/simplebench/

Separating them would be useful.

> > - move python/scripts/ to scripts/venv/, adjusting python/tests
> >
> > - move tests/minreqs.txt to pythondeps.toml. switch from pip to mkvenv
> > in tests/Makefile.
> >
> > - unifying the test scripts for all of scripts/, as much as possible
> >
> > - integrate tests in meson, but keeping the shell scripts for now.
> > move python/tests to tests/python. add a custom_target() to do "mkvenv
> > ensuregroup tests"
> >
> > - move tox usage to .gitlab-ci.d, placing tox outside the configure
> > script instead of inside. this makes the QEMU venv a "child" of the
> > tox venv rather than the opposite, and allows more testing than just
> > running the linters
>
> If tox is outside, that seems to require that we re-run configure for
> each different python version we want to run tests with, which feels
> pretty undesirable.
>
> There's the (system) python version that we want to run the overall
> build system with, and then there are N other python versions which
> we want to be able to run linters against.

I think we do want to check that configure can set up a virtual
environment for all those versions of Python. Then we can use the
virtual environment to run the linters too.

In fact I'm not even sure we need tox. Thanks to mkvenv, we would be
using "configure --python" as a much cheaper substitute of tox. The
only thing that is lost is the ability to test all versions through a
single build tree, but I'm not sure the complexity is worth it.

Paolo

> > - remove testing-related parts of tests/Makefile
> >
> > Only then I'd look at opportunities for unifying all the settings and
> > moving away from the shell scripts
>
>
>
> With regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 2/6] tracetool: apply isort and add check
  2025-10-09 11:26           ` Paolo Bonzini
@ 2025-10-09 12:05             ` Daniel P. Berrangé
  0 siblings, 0 replies; 31+ messages in thread
From: Daniel P. Berrangé @ 2025-10-09 12:05 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Stefan Hajnoczi, qemu-devel, Alex Bennée, John Snow,
	Kevin Wolf, Markus Armbruster, Peter Maydell, Mads Ynddal

On Thu, Oct 09, 2025 at 01:26:25PM +0200, Paolo Bonzini wrote:
> On Thu, Oct 9, 2025 at 10:58 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> >
> > On Thu, Oct 09, 2025 at 10:22:43AM +0200, Paolo Bonzini wrote:
> > > On Thu, Oct 9, 2025 at 9:52 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
> > > > I've proposed removing them in favour of meson rules earlier
> > > > this year:
> > > >
> > > >   https://lists.gnu.org/archive/html/qemu-devel/2025-02/msg04920.html
> > >
> > > I mostly agree with the intent of integrating tests with the
> > > configure/pyvenv/meson infrastructure, but I'd do things in a
> > > different order:
> > >
> > > - split part of scripts/ into contrib/scripts/ to identify one-off
> > > scripts vs. what is really used in the build process
> >
> > IMHO we should rather aim to eliminate contrib. We should either care
> > about the contents, and find an appropriate home for it in tree & have
> > ability to install it, or they should live outside QEMU git in their
> > own home.
> >
> > We shouldn't encourage a "dumping ground" for stuff that we don't
> > know what to do with.
> 
> I don't disagree but we have a lot of scripts like that:
> 
> scripts/analyse-9p-simpletrace.py
> scripts/analyse-locks-simpletrace.py
> scripts/compare-machine-types.py
> scripts/dump-guest-memory.py
> scripts/get-wraps-from-cargo-registry.py
> scripts/python_qmp_updater.py
> scripts/qcow2-to-stdout.py
> scripts/qemu-gdb.py
> scripts/qom-cast-macro-clean-cocci-gen.py
> scripts/render_block_graph.py
> scripts/replay-dump.py
> scripts/simpletrace.py
> scripts/u2f-setup-gen.py
> scripts/userfaultfd-wrlat.py
> 
> and a bunch more that aren't .py so I didn't check them; plus entire
> directories:
> 
> scripts/coccinelle/
> scripts/codeconverter
> scripts/coverage/
> scripts/kvm/
> scripts/performance/
> scripts/simplebench/
> 
> Separating them would be useful.
> 
> > > - move python/scripts/ to scripts/venv/, adjusting python/tests
> > >
> > > - move tests/minreqs.txt to pythondeps.toml. switch from pip to mkvenv
> > > in tests/Makefile.
> > >
> > > - unifying the test scripts for all of scripts/, as much as possible
> > >
> > > - integrate tests in meson, but keeping the shell scripts for now.
> > > move python/tests to tests/python. add a custom_target() to do "mkvenv
> > > ensuregroup tests"
> > >
> > > - move tox usage to .gitlab-ci.d, placing tox outside the configure
> > > script instead of inside. this makes the QEMU venv a "child" of the
> > > tox venv rather than the opposite, and allows more testing than just
> > > running the linters
> >
> > If tox is outside, that seems to require that we re-run configure for
> > each different python version we want to run tests with, which feels
> > pretty undesirable.
> >
> > There's the (system) python version that we want to run the overall
> > build system with, and then there are N other python versions which
> > we want to be able to run linters against.
> 
> I think we do want to check that configure can set up a virtual
> environment for all those versions of Python. Then we can use the
> virtual environment to run the linters too.
> 
> In fact I'm not even sure we need tox. Thanks to mkvenv, we would be
> using "configure --python" as a much cheaper substitute of tox. The
> only thing that is lost is the ability to test all versions through a
> single build tree, but I'm not sure the complexity is worth it.

FWIW, from my POV as a contributor, what I dislike is submitting a
patch series that has run through "make check" and then getting
complaints from maintainers that it fails various tests that are
not run as part of "make check".

If the maintainer is going to send feedback about failures under
5 different versions of python, then IMHO 'make check' needs to
be validating those 5 different versions of python in one go.
That was what my patch series aimed to address.

I'm not bothered whether we use tox or directly call mkvenv to
setup our test envs, provided we don't end up running

  configure; make; make check

for each python version we want linter checks against.

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
  2025-10-08 10:34   ` Daniel P. Berrangé
@ 2025-10-10 12:38     ` Markus Armbruster
  2025-10-10 13:49       ` Paolo Bonzini
  0 siblings, 1 reply; 31+ messages in thread
From: Markus Armbruster @ 2025-10-10 12:38 UTC (permalink / raw)
  To: Daniel P. Berrangé
  Cc: Paolo Bonzini, qemu-devel, Alex Bennée, John Snow,
	Kevin Wolf, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

Daniel P. Berrangé <berrange@redhat.com> writes:

> On Wed, Oct 08, 2025 at 09:18:04AM +0200, Markus Armbruster wrote:
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>> 
>> > [People in Cc are a mix of Python people, tracing people, and people
>> >  who followed the recent AI discussions. - Paolo]
>> >
>> > This series adds type annotations to tracetool. While useful on its own, 
>> > it also served as an experiment in whether AI tools could be useful and
>> > appropriate for mechanical code transformations that may not involve
>> > copyrightable expression.
>> >
>> > In this version, the types were added mostly with the RightTyper tool
>> > (https://github.com/RightTyper/RightTyper), which uses profiling to detect
>> > the types of arguments and return types at run time.  However, because
>> > adding type annotations is such a narrow and verifiable task, I also developed
>> > a parallel version using an LLM, to provide some data on topics such as:
>> >
>> > - how much choice/creativity is there in writing type annotations?
>> >   Is it closer to writing functional code or to refactoring?
>> 
>> Based on my work with John Snow on typing of the QAPI generator: there
>> is some choice.
>> 
>> Consider typing a function's argument.  Should we pick it based on what
>> the function requires from its argument?  Or should the type reflect how
>> the function is used?
>> 
>> Say the function iterates over the argument.  So we make the argument
>> Iterable[...], right?  But what if all callers pass a list?  Making it
>> List[...] could be clearer then.  It's a choice.
>> 
>> I think the choice depends on context and taste.  At some library's
>> external interface, picking a more general type can make the function
>> more generally useful.  But for some internal helper, I'd pick the
>> actual type.
>> 
>> My point isn't that an LLM could not possibly do the right thing based
>> on context, and maybe even "taste" distilled from its training data.  My
>> point is that this isn't entirely mechanical with basically one correct
>> output.
>>
>> Once we have such judgement calls, there's the question how an LLM's
>> choice depends on its training data (first order approximation today:
>> nobody knows), and whether and when that makes the LLM's output a
>> derived work of its training data (to be settled in court).
>
> There's perhaps a missing step here. The work first has to be considered
> eligible for copyright protection, before we get onto questions wrt
> derivative works, but that opens a can of worms....
>
>
> The big challenge is that traditionally projects did not really have
> to think much about where the "threshold of originality"[1] came into
> play for a work to be considered eligible for copyright.
>
> It was fine to take the conservative view that any patch benefits
> from copyright protection for the individual (or more often their
> employer). There was rarely any compelling need for the project to
> understand if something failed to meet the threshold. We see this
> in cases where a project re-licenses code - it is simpler/cheaper
> just to contact all contributors for permission, than to evaluate
> which subset held copyright.
>
> This is a very good thing. Understanding where that threshold applies
> is an challenging intellectual task that consumes time better spent
> on other tasks. Especially notable though is that threshold varies
> across countries in the world, and some places have at times even
> considered merely expending labour to make a work eligible.

Moreover, having software developers apply copyright law is about as
smart as having copyright lawyers write mission-critical code.  Both
tasks require education and experience.

> In trying to create an policy that permits AI contributions in some
> narrow scenarios, we're trying to thread the needle and as a global
> project that implies satisfying a wide variety of interpretations
> for the copyright threshold. There's no clear cut answer to that,
> the only option is to mitigate risk to a tolerable level.

Yes!

The boundary between legal and illegal is a superposition of fuzzy,
squiggly lines, one per jurisdiction.

We can only try to approximate it from the legal side.

The tighter we try to approximate, the more risk we take on.

In addition, tighter approximations can be difficult to understand and
apply.

> That's something all projects already do. For example when choosing
> between a CLA, vs a DCO signup, vs implicitly expecting contributors
> to be good, we're making a risk/cost tradeoff.

A tradeoff made with competent legal advice.  In addition, it's easy to
understand and apply.

[Strong argument why type annotations are low risk snipped...]

> The problem I have remaining is around the practicality of expressing
> this in a policy and having contributors & maintainers apply it well
> in practice.
>
> There's a definite "slippery slope" situation. The incentive for
> contributors will be to search out reasons to justify why a work
> matches the AI exception,

„Libenter homines id quod volunt credunt.“

>                           while overworked maintainers probably
> aren't going to spend much (any) mental energy on objectively
> analysing the situation, unless the proposal looks terribly out
> of line with expectations.

Yup.  Code review takes so much brain capacity that it can impair my
spelling.  Legal reasoning is right out.

> I'm not saying we shouldn't have an exception. I think the mypy case
> is an fairly compelling example of a low risk activity. We will need
> to become comfortable with the implications around the incentives
> for contributors & maintainers when applying it though.

Makes sense.

> With regards,
> Daniel
>
> [1] https://en.wikipedia.org/wiki/Threshold_of_originality



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
  2025-10-10 12:38     ` Markus Armbruster
@ 2025-10-10 13:49       ` Paolo Bonzini
  2025-10-10 17:41         ` Markus Armbruster
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2025-10-10 13:49 UTC (permalink / raw)
  To: Markus Armbruster, Daniel P. Berrangé
  Cc: qemu-devel, Alex Bennée, John Snow, Kevin Wolf,
	Peter Maydell, Stefan Hajnoczi, Mads Ynddal

On 10/10/25 14:38, Markus Armbruster wrote:
> The boundary between legal and illegal is a superposition of fuzzy,
> squiggly lines, one per jurisdiction.
> 
> We can only try to approximate it from the legal side.
> The tighter we try to approximate, the more risk we take on.
> 
> In addition, tighter approximations can be difficult to understand and
> apply.

I agree.

> [Strong argument why type annotations are low risk snipped...]

Note that type annotations are pretty much the upper bound of what I 
would consider a mechanical change.  I would expect, for most cases, 
that "include the prompt in the commit message" and the boringness of 
the change are together already a satisfactory explanation.

At the same time, I decided to try with a more complex change to 1) 
avoid a slippery slope; it's easier to do so if you look at the hard 
cases from the beginning, and Daniel did that very, very well; 2) probe 
the limitations of the tool and ascertain if it's even worthwhile having 
an exception.

>> There's a definite "slippery slope" situation. The incentive for
>> contributors will be to search out reasons to justify why a work
>> matches the AI exception,
> 
> „Libenter homines id quod volunt credunt.“

Yes, and that's why we should strive for simplicity if we are to have 
exceptions.  If you cannot convince me with the prompt that your change 
is mechanical/non-creative, don't even bother making complicated and 
probably wrong legal arguments.

Paying a certain price upfront (i.e., now) is fine, but in the long term 
the maintainer's job wrt AI should be and remain easy.  There has to be 
a cost, but then the same would be true with any policy other than 
"don't ask, don't tell"---including zero tolerance.

Paolo

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!]
  2025-10-10 13:49       ` Paolo Bonzini
@ 2025-10-10 17:41         ` Markus Armbruster
  0 siblings, 0 replies; 31+ messages in thread
From: Markus Armbruster @ 2025-10-10 17:41 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Daniel P. Berrangé, qemu-devel, Alex Bennée, John Snow,
	Kevin Wolf, Peter Maydell, Stefan Hajnoczi, Mads Ynddal

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 10/10/25 14:38, Markus Armbruster wrote:
>> The boundary between legal and illegal is a superposition of fuzzy,
>> squiggly lines, one per jurisdiction.
>> 
>> We can only try to approximate it from the legal side.
>> The tighter we try to approximate, the more risk we take on.
>> 
>> In addition, tighter approximations can be difficult to understand and
>> apply.
>
> I agree.
>
>> [Strong argument why type annotations are low risk snipped...]
>
> Note that type annotations are pretty much the upper bound of what I 
> would consider a mechanical change.  I would expect, for most cases, 
> that "include the prompt in the commit message" and the boringness of 
> the change are together already a satisfactory explanation.
>
> At the same time, I decided to try with a more complex change to 1) 
> avoid a slippery slope; it's easier to do so if you look at the hard 
> cases from the beginning, and Daniel did that very, very well; 2) probe 
> the limitations of the tool and ascertain if it's even worthwhile having 
> an exception.

It was a useful experiment.

>>> There's a definite "slippery slope" situation. The incentive for
>>> contributors will be to search out reasons to justify why a work
>>> matches the AI exception,
>> 
>> „Libenter homines id quod volunt credunt.“
>
> Yes, and that's why we should strive for simplicity if we are to have 
> exceptions.  If you cannot convince me with the prompt that your change 
> is mechanical/non-creative, don't even bother making complicated and 
> probably wrong legal arguments.
>
> Paying a certain price upfront (i.e., now) is fine, but in the long term 
> the maintainer's job wrt AI should be and remain easy.  There has to be 
> a cost, but then the same would be true with any policy other than 
> "don't ask, don't tell"---including zero tolerance.

Yes.

Cost is fine when the benefits are worth it.

Maintainer bandwidth is precious.  But it's not infiniyely precious :)



^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2025-10-10 17:45 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-08  6:35 [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Paolo Bonzini
2025-10-08  6:35 ` [PATCH 1/6] tracetool: rename variable with conflicting types Paolo Bonzini
2025-10-08  9:27   ` Daniel P. Berrangé
2025-10-08 18:06   ` Stefan Hajnoczi
2025-10-08  6:35 ` [PATCH 2/6] tracetool: apply isort and add check Paolo Bonzini
2025-10-08  9:29   ` Daniel P. Berrangé
2025-10-08 17:58   ` Stefan Hajnoczi
2025-10-09  7:52     ` Daniel P. Berrangé
2025-10-09  8:22       ` Paolo Bonzini
2025-10-09  8:58         ` Daniel P. Berrangé
2025-10-09 11:26           ` Paolo Bonzini
2025-10-09 12:05             ` Daniel P. Berrangé
2025-10-09  7:58     ` Paolo Bonzini
2025-10-08  6:35 ` [PATCH 3/6] tracetool: "import annotations" Paolo Bonzini
2025-10-08  9:31   ` Daniel P. Berrangé
2025-10-08 18:06   ` Stefan Hajnoczi
2025-10-08  6:35 ` [PATCH 4/6] tracetool: add type annotations Paolo Bonzini
2025-10-08 18:09   ` Stefan Hajnoczi
2025-10-08  6:35 ` [PATCH 5/6] tracetool: complete typing annotations Paolo Bonzini
2025-10-08 18:10   ` Stefan Hajnoczi
2025-10-08  6:35 ` [PATCH 6/6] tracetool: add typing checks to "make -C python check" Paolo Bonzini
2025-10-08  9:32   ` Daniel P. Berrangé
2025-10-08 18:12   ` Stefan Hajnoczi
2025-10-08  7:18 ` [PATCH 0/6] tracetool: add mypy --strict checking [AI discussion ahead!] Markus Armbruster
2025-10-08 10:34   ` Daniel P. Berrangé
2025-10-10 12:38     ` Markus Armbruster
2025-10-10 13:49       ` Paolo Bonzini
2025-10-10 17:41         ` Markus Armbruster
2025-10-08 10:40 ` Daniel P. Berrangé
2025-10-08 17:23 ` Stefan Hajnoczi
2025-10-08 17:41   ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).