* [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities
@ 2024-12-02 19:26 Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities Julian Ganz
` (12 more replies)
0 siblings, 13 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:26 UTC (permalink / raw)
To: qemu-devel; +Cc: Julian Ganz
Some analysis greatly benefits, or depends on, information about
certain types of dicontinuities such as interrupts. For example, we may
need to handle the execution of a new translation block differently if
it is not the result of normal program flow but of an interrupt.
Even with the existing interfaces, it is more or less possible to
discern these situations, e.g. as done by the cflow plugin. However,
this process poses a considerable overhead to the core analysis one may
intend to perform.
These changes introduce a generic and easy-to-use interface for plugin
authors in the form of a callback for discontinuities. Patch 1 defines
an enumeration of some trap-related discontinuities including somewhat
narrow definitions of the discontinuity evetns and a callback type.
Patch 2 defines the callback registration function. Patch 3 adds some
hooks for triggering the callbacks. Patch 4 adds an example plugin
showcasing the new API. Patches 5 through 6 call the hooks for a
selection of architectures, mapping architecture specific events to the
three categories defined in patch 1. Future non-RFC patchsets will call
these hooks for all architectures (that have some concept of trap or
interrupt). Finally, patch 11 supplies a test plugin asserting that the
next PC provided to the plugin points to the next instruction executed.
Sidenote: I'm likely doing something wrong for one architecture or
the other. These patches are untested for most of them.
Since v2 (tcg-plugins: add hooks for interrupts, exceptions and traps):
- Switched from traps as core concept to more generic discontinuities
- Switched from semihosting to hostcall as term for emulated traps
- Added enumeration of events and dedicated callback type
- Make callback receive event type as well as origin and target PC
(as requested by Pierrick Bouvier)
- Combined registration functions for different traps into a single
one for all types of discontinuities (as requested by Pierrick
Bouvier)
- Migrated records in example plugin from fully pre-allocated to a
scoreboard (as suggested by Pierrick Bouvier)
- Handle PSCI calls as hostcall (as pointed out by Peter Maydell)
- Added hooks for ARM Cortex M arches (as pointed out by Peter
Maydell)
- Added hooks for Alpha targets
- Added hooks for MIPS targets
- Added a plugin for testing some of the interface behaviour
Since v1:
- Split the one callback into multiple callbacks
- Added a target-agnostic definition of the relevant event(s)
- Call hooks from architecture-code rather than accel/tcg/cpu-exec.c
- Added a plugin showcasing API usage
Julian Ganz (11):
plugins: add types for callbacks related to certain discontinuities
plugins: add API for registering discontinuity callbacks
plugins: add hooks for new discontinuity related callbacks
contrib/plugins: add plugin showcasing new dicontinuity related API
target/alpha: call plugin trap callbacks
target/arm: call plugin trap callbacks
target/avr: call plugin trap callbacks
target/mips: call plugin trap callbacks
target/riscv: call plugin trap callbacks
target/sparc: call plugin trap callbacks
tests: add plugin asserting correctness of discon event's to_pc
contrib/plugins/meson.build | 3 +-
contrib/plugins/traps.c | 96 +++++++++++++++++++++++++++++
include/qemu/plugin-event.h | 3 +
include/qemu/plugin.h | 13 ++++
include/qemu/qemu-plugin.h | 58 +++++++++++++++++
plugins/core.c | 67 ++++++++++++++++++++
target/alpha/helper.c | 12 ++++
target/arm/helper.c | 25 ++++++++
target/arm/tcg/m_helper.c | 18 ++++++
target/avr/helper.c | 3 +
target/mips/tcg/sysemu/tlb_helper.c | 11 ++++
target/riscv/cpu_helper.c | 9 +++
target/sparc/int32_helper.c | 7 +++
target/sparc/int64_helper.c | 10 +++
tests/tcg/plugins/discons.c | 95 ++++++++++++++++++++++++++++
tests/tcg/plugins/meson.build | 2 +-
16 files changed, 430 insertions(+), 2 deletions(-)
create mode 100644 contrib/plugins/traps.c
create mode 100644 tests/tcg/plugins/discons.c
--
2.45.2
^ permalink raw reply [flat|nested] 70+ messages in thread
* [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
@ 2024-12-02 19:26 ` Julian Ganz
2024-12-03 8:45 ` Julian Ganz
2024-12-04 22:45 ` Pierrick Bouvier
2024-12-02 19:26 ` [RFC PATCH v3 02/11] plugins: add API for registering discontinuity callbacks Julian Ganz
` (11 subsequent siblings)
12 siblings, 2 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:26 UTC (permalink / raw)
To: qemu-devel; +Cc: Julian Ganz
The plugin API allows registration of callbacks for a variety of VCPU
related events, such as VCPU reset, idle and resume. However, traps of
any kind, i.e. interrupts or exceptions, were previously not covered.
These kinds of events are arguably quite significant and usually go hand
in hand with a PC discontinuity. On most platforms, the discontinuity
also includes a transition from some "mode" to another. Thus, plugins
for the analysis of (virtualized) embedded systems may benefit from or
even require the possiblity to perform work on the occurance of an
interrupt or exception.
This change introduces the concept of such a discontinuity event in the
form of an enumeration. Currently only traps are covered. Specifically
we (loosely) define interrupts, exceptions and host calls across all
platforms. In addition, this change introduces a type to use for
callback functions related to such events. Since possible modes and the
enumeration of interupts and exceptions vary greatly between different
architectures, the callback type only receives the VCPU id, the type of
event as well as the old and new PC.
---
include/qemu/plugin.h | 1 +
include/qemu/qemu-plugin.h | 43 ++++++++++++++++++++++++++++++++++++++
2 files changed, 44 insertions(+)
diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index 9726a9ebf3..27a176b631 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -59,6 +59,7 @@ union qemu_plugin_cb_sig {
qemu_plugin_udata_cb_t udata;
qemu_plugin_vcpu_simple_cb_t vcpu_simple;
qemu_plugin_vcpu_udata_cb_t vcpu_udata;
+ qemu_plugin_vcpu_discon_cb_t vcpu_discon;
qemu_plugin_vcpu_tb_trans_cb_t vcpu_tb_trans;
qemu_plugin_vcpu_mem_cb_t vcpu_mem;
qemu_plugin_vcpu_syscall_cb_t vcpu_syscall;
diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index 0fba36ae02..9c67374b7e 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -154,6 +154,49 @@ typedef void (*qemu_plugin_vcpu_simple_cb_t)(qemu_plugin_id_t id,
typedef void (*qemu_plugin_vcpu_udata_cb_t)(unsigned int vcpu_index,
void *userdata);
+
+/**
+ * enum qemu_plugin_discon_type - type of a (potential) PC discontinuity
+ *
+ * @QEMU_PLUGIN_DISCON_INTERRUPT: an interrupt, defined across all architectures
+ * as an asynchronous event, usually originating
+ * from outside the CPU
+ * @QEMU_PLUGIN_DISCON_EXCEPTION: an exception, defined across all architectures
+ * as a synchronous event in response to a
+ * specific instruction being executed
+ * @QEMU_PLUGIN_DISCON_HOSTCALL: a host call, functionally a special kind of
+ * exception that is not handled by code run by
+ * the vCPU but machinery outside the vCPU
+ * @QEMU_PLUGIN_DISCON_ALL: all types of disconinuity events currently covered
+ */
+enum qemu_plugin_discon_type {
+ QEMU_PLUGIN_DISCON_INTERRUPT = 1,
+ QEMU_PLUGIN_DISCON_EXCEPTION = 2,
+ QEMU_PLUGIN_DISCON_HOSTCALL = 4,
+ QEMU_PLUGIN_DISCON_ALL = 7
+};
+
+/**
+ * typedef qemu_plugin_vcpu_discon_cb_t - vcpu discontinuity callback
+ * @vcpu_index: the current vcpu context
+ * @type: the type of discontinuity
+ * @from_pc: the source of the discontinuity, e.g. the PC before the
+ * transition
+ * @to_pc: the PC pointing to the next instruction to be executed
+ *
+ * The excact semantics of @from_pc depends on @the type of discontinuity. For
+ * interrupts, @from_pc will point to the next instruction which would have
+ * been executed. For exceptions and host calls, @from_pc will point to the
+ * instruction that caused the exception or issued the host call. Note that
+ * in the case of exceptions, the instruction is not retired and thus not
+ * observable via general instruction exec callbacks. The same may be the case
+ * for some host calls such as hypervisor call "exceptions".
+ */
+typedef void (*qemu_plugin_vcpu_discon_cb_t)(qemu_plugin_id_t id,
+ unsigned int vcpu_index,
+ enum qemu_plugin_discon_type type,
+ uint64_t from_pc, uint64_t to_pc);
+
/**
* qemu_plugin_uninstall() - Uninstall a plugin
* @id: this plugin's opaque ID
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [RFC PATCH v3 02/11] plugins: add API for registering discontinuity callbacks
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities Julian Ganz
@ 2024-12-02 19:26 ` Julian Ganz
2024-12-04 22:45 ` Pierrick Bouvier
2025-01-09 13:57 ` Alex Bennée
2024-12-02 19:26 ` [RFC PATCH v3 03/11] plugins: add hooks for new discontinuity related callbacks Julian Ganz
` (10 subsequent siblings)
12 siblings, 2 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:26 UTC (permalink / raw)
To: qemu-devel
Cc: Julian Ganz, Alex Bennée, Alexandre Iooss, Mahmoud Mandour,
Pierrick Bouvier
The plugin API allows registration of callbacks for a variety of VCPU
related events, such as VCPU reset, idle and resume. In addition to
those events, we recently defined discontinuity events, which include
traps.
This change introduces a function to register callbacks for these
events. We define one distinct plugin event type for each type of
discontinuity, granting fine control to plugins in term of which events
they receive.
---
include/qemu/plugin-event.h | 3 +++
include/qemu/qemu-plugin.h | 15 +++++++++++++++
plugins/core.c | 15 +++++++++++++++
3 files changed, 33 insertions(+)
diff --git a/include/qemu/plugin-event.h b/include/qemu/plugin-event.h
index 7056d8427b..1100dae212 100644
--- a/include/qemu/plugin-event.h
+++ b/include/qemu/plugin-event.h
@@ -20,6 +20,9 @@ enum qemu_plugin_event {
QEMU_PLUGIN_EV_VCPU_SYSCALL_RET,
QEMU_PLUGIN_EV_FLUSH,
QEMU_PLUGIN_EV_ATEXIT,
+ QEMU_PLUGIN_EV_VCPU_INTERRUPT,
+ QEMU_PLUGIN_EV_VCPU_EXCEPTION,
+ QEMU_PLUGIN_EV_VCPU_HOSTCALL,
QEMU_PLUGIN_EV_MAX, /* total number of plugin events we support */
};
diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
index 9c67374b7e..f998a465e5 100644
--- a/include/qemu/qemu-plugin.h
+++ b/include/qemu/qemu-plugin.h
@@ -273,6 +273,21 @@ QEMU_PLUGIN_API
void qemu_plugin_register_vcpu_resume_cb(qemu_plugin_id_t id,
qemu_plugin_vcpu_simple_cb_t cb);
+/**
+ * qemu_plugin_register_vcpu_discon_cb() - register a discontinuity callback
+ * @id: plugin ID
+ * @cb: callback function
+ *
+ * The @cb function is called every time a vCPU receives a discontinuity event
+ * of the specified type(s), after the vCPU was prepared to handle the event.
+ * Preparation usually entails updating the PC to some interrupt handler or trap
+ * vector entry.
+ */
+QEMU_PLUGIN_API
+void qemu_plugin_register_vcpu_discon_cb(qemu_plugin_id_t id,
+ enum qemu_plugin_discon_type type,
+ qemu_plugin_vcpu_discon_cb_t cb);
+
/** struct qemu_plugin_tb - Opaque handle for a translation block */
struct qemu_plugin_tb;
/** struct qemu_plugin_insn - Opaque handle for a translated instruction */
diff --git a/plugins/core.c b/plugins/core.c
index bb105e8e68..a89a4a0315 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -559,6 +559,21 @@ void qemu_plugin_register_vcpu_resume_cb(qemu_plugin_id_t id,
plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_RESUME, cb);
}
+void qemu_plugin_register_vcpu_discon_cb(qemu_plugin_id_t id,
+ enum qemu_plugin_discon_type type,
+ qemu_plugin_vcpu_discon_cb_t cb)
+{
+ if (type & QEMU_PLUGIN_DISCON_INTERRUPT) {
+ plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_INTERRUPT, cb);
+ }
+ if (type & QEMU_PLUGIN_DISCON_EXCEPTION) {
+ plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_EXCEPTION, cb);
+ }
+ if (type & QEMU_PLUGIN_DISCON_HOSTCALL) {
+ plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_HOSTCALL, cb);
+ }
+}
+
void qemu_plugin_register_flush_cb(qemu_plugin_id_t id,
qemu_plugin_simple_cb_t cb)
{
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [RFC PATCH v3 03/11] plugins: add hooks for new discontinuity related callbacks
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 02/11] plugins: add API for registering discontinuity callbacks Julian Ganz
@ 2024-12-02 19:26 ` Julian Ganz
2024-12-04 22:47 ` Pierrick Bouvier
2025-01-09 13:58 ` Alex Bennée
2024-12-02 19:26 ` [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API Julian Ganz
` (9 subsequent siblings)
12 siblings, 2 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:26 UTC (permalink / raw)
To: qemu-devel
Cc: Julian Ganz, Alex Bennée, Alexandre Iooss, Mahmoud Mandour,
Pierrick Bouvier
The plugin API allows registration of callbacks for a variety of VCPU
related events, such as VCPU reset, idle and resume. In addition, we
recently introduced API for registering callbacks for discontinuity
events, specifically for interrupts, exceptions and host calls.
This change introduces the corresponding hooks called from target
specific code inside qemu.
---
include/qemu/plugin.h | 12 ++++++++++
plugins/core.c | 52 +++++++++++++++++++++++++++++++++++++++++++
2 files changed, 64 insertions(+)
diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
index 27a176b631..3de9cb3fe4 100644
--- a/include/qemu/plugin.h
+++ b/include/qemu/plugin.h
@@ -161,6 +161,9 @@ void qemu_plugin_vcpu_exit_hook(CPUState *cpu);
void qemu_plugin_tb_trans_cb(CPUState *cpu, struct qemu_plugin_tb *tb);
void qemu_plugin_vcpu_idle_cb(CPUState *cpu);
void qemu_plugin_vcpu_resume_cb(CPUState *cpu);
+void qemu_plugin_vcpu_interrupt_cb(CPUState *cpu, uint64_t from, uint64_t to);
+void qemu_plugin_vcpu_exception_cb(CPUState *cpu, uint64_t from, uint64_t to);
+void qemu_plugin_vcpu_hostcall_cb(CPUState *cpu, uint64_t from, uint64_t to);
void
qemu_plugin_vcpu_syscall(CPUState *cpu, int64_t num, uint64_t a1,
uint64_t a2, uint64_t a3, uint64_t a4, uint64_t a5,
@@ -243,6 +246,15 @@ static inline void qemu_plugin_vcpu_idle_cb(CPUState *cpu)
static inline void qemu_plugin_vcpu_resume_cb(CPUState *cpu)
{ }
+void qemu_plugin_vcpu_interrupt_cb(CPUState *cpu, uint64_t from, uint64_t to)
+{ }
+
+void qemu_plugin_vcpu_exception_cb(CPUState *cpu, uint64_t from, uint64_t to)
+{ }
+
+void qemu_plugin_vcpu_hostcall_cb(CPUState *cpu, uint64_t from, uint64_t to)
+{ }
+
static inline void
qemu_plugin_vcpu_syscall(CPUState *cpu, int64_t num, uint64_t a1, uint64_t a2,
uint64_t a3, uint64_t a4, uint64_t a5, uint64_t a6,
diff --git a/plugins/core.c b/plugins/core.c
index a89a4a0315..2c9637334f 100644
--- a/plugins/core.c
+++ b/plugins/core.c
@@ -112,6 +112,43 @@ static void plugin_vcpu_cb__simple(CPUState *cpu, enum qemu_plugin_event ev)
}
}
+/*
+ * Disable CFI checks.
+ * The callback function has been loaded from an external library so we do not
+ * have type information
+ */
+QEMU_DISABLE_CFI
+static void plugin_vcpu_cb__discon(CPUState *cpu,
+ enum qemu_plugin_discon_type type,
+ uint64_t from, uint64_t to)
+{
+ struct qemu_plugin_cb *cb, *next;
+ enum qemu_plugin_event ev;
+
+ if (cpu->cpu_index < plugin.num_vcpus) {
+ switch (type) {
+ case QEMU_PLUGIN_DISCON_INTERRUPT:
+ ev = QEMU_PLUGIN_EV_VCPU_INTERRUPT;
+ break;
+ case QEMU_PLUGIN_DISCON_EXCEPTION:
+ ev = QEMU_PLUGIN_EV_VCPU_EXCEPTION;
+ break;
+ case QEMU_PLUGIN_DISCON_HOSTCALL:
+ ev = QEMU_PLUGIN_EV_VCPU_HOSTCALL;
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ /* iterate safely; plugins might uninstall themselves at any time */
+ QLIST_FOREACH_SAFE_RCU(cb, &plugin.cb_lists[ev], entry, next) {
+ qemu_plugin_vcpu_discon_cb_t func = cb->f.vcpu_discon;
+
+ func(cb->ctx->id, cpu->cpu_index, type, from, to);
+ }
+ }
+}
+
/*
* Disable CFI checks.
* The callback function has been loaded from an external library so we do not
@@ -547,6 +584,21 @@ void qemu_plugin_vcpu_resume_cb(CPUState *cpu)
}
}
+void qemu_plugin_vcpu_interrupt_cb(CPUState *cpu, uint64_t from, uint64_t to)
+{
+ plugin_vcpu_cb__discon(cpu, QEMU_PLUGIN_DISCON_INTERRUPT, from, to);
+}
+
+void qemu_plugin_vcpu_exception_cb(CPUState *cpu, uint64_t from, uint64_t to)
+{
+ plugin_vcpu_cb__discon(cpu, QEMU_PLUGIN_DISCON_EXCEPTION, from, to);
+}
+
+void qemu_plugin_vcpu_hostcall_cb(CPUState *cpu, uint64_t from, uint64_t to)
+{
+ plugin_vcpu_cb__discon(cpu, QEMU_PLUGIN_DISCON_HOSTCALL, from, to);
+}
+
void qemu_plugin_register_vcpu_idle_cb(qemu_plugin_id_t id,
qemu_plugin_vcpu_simple_cb_t cb)
{
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
` (2 preceding siblings ...)
2024-12-02 19:26 ` [RFC PATCH v3 03/11] plugins: add hooks for new discontinuity related callbacks Julian Ganz
@ 2024-12-02 19:26 ` Julian Ganz
2024-12-04 23:14 ` Pierrick Bouvier
2025-01-09 14:04 ` Alex Bennée
2024-12-02 19:26 ` [RFC PATCH v3 05/11] target/alpha: call plugin trap callbacks Julian Ganz
` (8 subsequent siblings)
12 siblings, 2 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:26 UTC (permalink / raw)
To: qemu-devel
Cc: Julian Ganz, Alex Bennée, Alexandre Iooss, Mahmoud Mandour,
Pierrick Bouvier
We recently introduced new plugin API for registration of discontinuity
related callbacks. This change introduces a minimal plugin showcasing
the new API. It simply counts the occurances of interrupts, exceptions
and host calls per CPU and reports the counts when exitting.
---
contrib/plugins/meson.build | 3 +-
contrib/plugins/traps.c | 96 +++++++++++++++++++++++++++++++++++++
2 files changed, 98 insertions(+), 1 deletion(-)
create mode 100644 contrib/plugins/traps.c
diff --git a/contrib/plugins/meson.build b/contrib/plugins/meson.build
index 63a32c2b4f..9a3015e1c1 100644
--- a/contrib/plugins/meson.build
+++ b/contrib/plugins/meson.build
@@ -1,5 +1,6 @@
contrib_plugins = ['bbv', 'cache', 'cflow', 'drcov', 'execlog', 'hotblocks',
- 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger']
+ 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger',
+ 'traps']
if host_os != 'windows'
# lockstep uses socket.h
contrib_plugins += 'lockstep'
diff --git a/contrib/plugins/traps.c b/contrib/plugins/traps.c
new file mode 100644
index 0000000000..ecd4beac5f
--- /dev/null
+++ b/contrib/plugins/traps.c
@@ -0,0 +1,96 @@
+/*
+ * Copyright (C) 2024, Julian Ganz <neither@nut.email>
+ *
+ * Traps - count traps
+ *
+ * License: GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include <stdio.h>
+
+#include <qemu-plugin.h>
+
+QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
+
+typedef struct {
+ uint64_t interrupts;
+ uint64_t exceptions;
+ uint64_t hostcalls;
+ bool active;
+} TrapCounters;
+
+static struct qemu_plugin_scoreboard *traps;
+static size_t max_vcpus;
+
+static void vcpu_init(qemu_plugin_id_t id, unsigned int vcpu_index)
+{
+ TrapCounters* rec = qemu_plugin_scoreboard_find(traps, vcpu_index);
+ rec->active = true;
+}
+
+static void vcpu_discon(qemu_plugin_id_t id, unsigned int vcpu_index,
+ enum qemu_plugin_discon_type type, uint64_t from_pc,
+ uint64_t to_pc)
+{
+ TrapCounters* rec = qemu_plugin_scoreboard_find(traps, vcpu_index);
+ switch (type) {
+ case QEMU_PLUGIN_DISCON_INTERRUPT:
+ rec->interrupts++;
+ break;
+ case QEMU_PLUGIN_DISCON_EXCEPTION:
+ rec->exceptions++;
+ break;
+ case QEMU_PLUGIN_DISCON_HOSTCALL:
+ rec->hostcalls++;
+ break;
+ default:
+ /* unreachable */
+ break;
+ }
+}
+
+static void plugin_exit(qemu_plugin_id_t id, void *p)
+{
+ g_autoptr(GString) report;
+ report = g_string_new("VCPU, interrupts, exceptions, hostcalls\n");
+ int vcpu;
+
+ for (vcpu = 0; vcpu < max_vcpus; vcpu++) {
+ TrapCounters *rec = qemu_plugin_scoreboard_find(traps, vcpu);
+ if (rec->active) {
+ g_string_append_printf(report,
+ "% 4d, % 10"PRId64", % 10"PRId64", % 10"
+ PRId64"\n",
+ vcpu,
+ rec->interrupts, rec->exceptions,
+ rec->hostcalls);
+ }
+ }
+
+ qemu_plugin_outs(report->str);
+ qemu_plugin_scoreboard_free(traps);
+}
+
+QEMU_PLUGIN_EXPORT
+int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
+ int argc, char **argv)
+{
+ if (!info->system_emulation) {
+ fputs("trap plugin can only be used in system emulation mode.\n",
+ stderr);
+ return -1;
+ }
+
+ max_vcpus = info->system.max_vcpus;
+ traps = qemu_plugin_scoreboard_new(sizeof(TrapCounters));
+ qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
+ qemu_plugin_vcpu_for_each(id, vcpu_init);
+
+ qemu_plugin_register_vcpu_discon_cb(id, QEMU_PLUGIN_DISCON_TRAPS,
+ vcpu_discon);
+
+ qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
+
+ return 0;
+}
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [RFC PATCH v3 05/11] target/alpha: call plugin trap callbacks
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
` (3 preceding siblings ...)
2024-12-02 19:26 ` [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API Julian Ganz
@ 2024-12-02 19:26 ` Julian Ganz
2024-12-04 22:48 ` Pierrick Bouvier
2024-12-02 19:26 ` [RFC PATCH v3 06/11] target/arm: " Julian Ganz
` (7 subsequent siblings)
12 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:26 UTC (permalink / raw)
To: qemu-devel; +Cc: Julian Ganz, Richard Henderson
We recently introduced API for registering callbacks for trap related
events as well as the corresponding hook functions. Due to differences
between architectures, the latter need to be called from target specific
code.
This change places hooks for Alpha targets.
---
target/alpha/helper.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/target/alpha/helper.c b/target/alpha/helper.c
index 2f1000c99f..acc92402af 100644
--- a/target/alpha/helper.c
+++ b/target/alpha/helper.c
@@ -25,6 +25,7 @@
#include "fpu/softfloat-types.h"
#include "exec/helper-proto.h"
#include "qemu/qemu-print.h"
+#include "qemu/plugin.h"
#define CONVERT_BIT(X, SRC, DST) \
@@ -326,6 +327,7 @@ void alpha_cpu_do_interrupt(CPUState *cs)
{
CPUAlphaState *env = cpu_env(cs);
int i = cs->exception_index;
+ uint64_t last_pc = env->pc;
if (qemu_loglevel_mask(CPU_LOG_INT)) {
static int count;
@@ -429,6 +431,16 @@ void alpha_cpu_do_interrupt(CPUState *cs)
/* Switch to PALmode. */
env->flags |= ENV_FLAG_PAL_MODE;
+
+ switch (i) {
+ case EXCP_SMP_INTERRUPT:
+ case EXCP_CLK_INTERRUPT:
+ case EXCP_DEV_INTERRUPT:
+ qemu_plugin_vcpu_interrupt_cb(cs, last_pc, env->pc);
+ break;
+ qemu_plugin_vcpu_exception_cb(cs, last_pc, env->pc);
+ default:
+ }
}
bool alpha_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [RFC PATCH v3 06/11] target/arm: call plugin trap callbacks
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
` (4 preceding siblings ...)
2024-12-02 19:26 ` [RFC PATCH v3 05/11] target/alpha: call plugin trap callbacks Julian Ganz
@ 2024-12-02 19:26 ` Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 07/11] target/avr: " Julian Ganz
` (6 subsequent siblings)
12 siblings, 0 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:26 UTC (permalink / raw)
To: qemu-devel
Cc: Julian Ganz, Alex Bennée, Alexandre Iooss, Mahmoud Mandour,
Pierrick Bouvier, Peter Maydell, open list:ARM TCG CPUs
We recently introduced API for registering callbacks for trap related
events as well as the corresponding hook functions. Due to differences
between architectures, the latter need to be called from target specific
code.
This change places hooks for ARM (and Aarch64) targets. We decided to
treat the (V)IRQ, (VI/VF)NMI, (V)FIQ and VSERR exceptions as interrupts
since they are, presumably, async in nature.
---
contrib/plugins/traps.c | 2 +-
target/arm/helper.c | 25 +++++++++++++++++++++++++
target/arm/tcg/m_helper.c | 18 ++++++++++++++++++
3 files changed, 44 insertions(+), 1 deletion(-)
diff --git a/contrib/plugins/traps.c b/contrib/plugins/traps.c
index ecd4beac5f..cdb503e499 100644
--- a/contrib/plugins/traps.c
+++ b/contrib/plugins/traps.c
@@ -87,7 +87,7 @@ int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
qemu_plugin_vcpu_for_each(id, vcpu_init);
- qemu_plugin_register_vcpu_discon_cb(id, QEMU_PLUGIN_DISCON_TRAPS,
+ qemu_plugin_register_vcpu_discon_cb(id, QEMU_PLUGIN_DISCON_ALL,
vcpu_discon);
qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
diff --git a/target/arm/helper.c b/target/arm/helper.c
index f38eb054c0..57f274a037 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -31,6 +31,7 @@
#endif
#include "cpregs.h"
#include "target/arm/gtimer.h"
+#include "qemu/plugin.h"
#define ARM_CPU_FREQ 1000000000 /* FIXME: 1 GHz, should be configurable */
@@ -11166,6 +11167,25 @@ static void take_aarch32_exception(CPUARMState *env, int new_mode,
}
}
+static void arm_do_plugin_vcpu_interrupt_cb(CPUState *cs, uint64_t from,
+ uint64_t to)
+{
+ switch (cs->exception_index) {
+ case EXCP_IRQ:
+ case EXCP_VIRQ:
+ case EXCP_NMI:
+ case EXCP_VINMI:
+ case EXCP_FIQ:
+ case EXCP_VFIQ:
+ case EXCP_VFNMI:
+ case EXCP_VSERR:
+ qemu_plugin_vcpu_interrupt_cb(cs, from, to);
+ break;
+ default:
+ qemu_plugin_vcpu_exception_cb(cs, from, to);
+ }
+}
+
static void arm_cpu_do_interrupt_aarch32_hyp(CPUState *cs)
{
/*
@@ -11822,6 +11842,7 @@ void arm_cpu_do_interrupt(CPUState *cs)
ARMCPU *cpu = ARM_CPU(cs);
CPUARMState *env = &cpu->env;
unsigned int new_el = env->exception.target_el;
+ uint64_t last_pc = env->pc;
assert(!arm_feature(env, ARM_FEATURE_M));
@@ -11838,6 +11859,7 @@ void arm_cpu_do_interrupt(CPUState *cs)
if (tcg_enabled() && arm_is_psci_call(cpu, cs->exception_index)) {
arm_handle_psci_call(cpu);
qemu_log_mask(CPU_LOG_INT, "...handled as PSCI call\n");
+ qemu_plugin_vcpu_hostcall_cb(cs, last_pc, env->pc);
return;
}
@@ -11849,6 +11871,7 @@ void arm_cpu_do_interrupt(CPUState *cs)
#ifdef CONFIG_TCG
if (cs->exception_index == EXCP_SEMIHOST) {
tcg_handle_semihosting(cs);
+ qemu_plugin_vcpu_hostcall_cb(cs, last_pc, env->pc);
return;
}
#endif
@@ -11874,6 +11897,8 @@ void arm_cpu_do_interrupt(CPUState *cs)
if (!kvm_enabled()) {
cs->interrupt_request |= CPU_INTERRUPT_EXITTB;
}
+
+ arm_do_plugin_vcpu_interrupt_cb(cs, last_pc, env->pc);
}
#endif /* !CONFIG_USER_ONLY */
diff --git a/target/arm/tcg/m_helper.c b/target/arm/tcg/m_helper.c
index f7354f3c6e..3a8b55db82 100644
--- a/target/arm/tcg/m_helper.c
+++ b/target/arm/tcg/m_helper.c
@@ -24,6 +24,7 @@
#if !defined(CONFIG_USER_ONLY)
#include "hw/intc/armv7m_nvic.h"
#endif
+#include "qemu/plugin.h"
static void v7m_msr_xpsr(CPUARMState *env, uint32_t mask,
uint32_t reg, uint32_t val)
@@ -2186,6 +2187,7 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
CPUARMState *env = &cpu->env;
uint32_t lr;
bool ignore_stackfaults;
+ uint64_t last_pc = env->pc;
arm_log_exception(cs);
@@ -2353,6 +2355,7 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
g_assert_not_reached();
#endif
env->regs[15] += env->thumb ? 2 : 4;
+ qemu_plugin_vcpu_hostcall_cb(cs, last_pc, env->pc);
return;
case EXCP_BKPT:
armv7m_nvic_set_pending(env->nvic, ARMV7M_EXCP_DEBUG, false);
@@ -2419,6 +2422,21 @@ void arm_v7m_cpu_do_interrupt(CPUState *cs)
ignore_stackfaults = v7m_push_stack(cpu);
v7m_exception_taken(cpu, lr, false, ignore_stackfaults);
+
+ switch (cs->exception_index) {
+ case EXCP_IRQ:
+ case EXCP_VIRQ:
+ case EXCP_NMI:
+ case EXCP_VINMI:
+ case EXCP_FIQ:
+ case EXCP_VFIQ:
+ case EXCP_VFNMI:
+ case EXCP_VSERR:
+ qemu_plugin_vcpu_interrupt_cb(cs, last_pc, env->pc);
+ break;
+ default:
+ qemu_plugin_vcpu_exception_cb(cs, last_pc, env->pc);
+ }
}
uint32_t HELPER(v7m_mrs)(CPUARMState *env, uint32_t reg)
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [RFC PATCH v3 07/11] target/avr: call plugin trap callbacks
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
` (5 preceding siblings ...)
2024-12-02 19:26 ` [RFC PATCH v3 06/11] target/arm: " Julian Ganz
@ 2024-12-02 19:26 ` Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 08/11] target/mips: " Julian Ganz
` (5 subsequent siblings)
12 siblings, 0 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:26 UTC (permalink / raw)
To: qemu-devel; +Cc: Julian Ganz, Michael Rolnik
We recently introduced API for registering callbacks for trap related
events as well as the corresponding hook functions. Due to differences
between architectures, the latter need to be called from target specific
code.
This change places the hook for AVR targets. That architecture appears
to only know interrupts.
---
target/avr/helper.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/target/avr/helper.c b/target/avr/helper.c
index 345708a1b3..ba7704f2f1 100644
--- a/target/avr/helper.c
+++ b/target/avr/helper.c
@@ -28,6 +28,7 @@
#include "exec/cpu_ldst.h"
#include "exec/address-spaces.h"
#include "exec/helper-proto.h"
+#include "qemu/plugin.h"
bool avr_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
{
@@ -97,6 +98,8 @@ void avr_cpu_do_interrupt(CPUState *cs)
env->sregI = 0; /* clear Global Interrupt Flag */
cs->exception_index = -1;
+
+ qemu_plugin_vcpu_interrupt_cb(cs, ret, env->pc_w);
}
hwaddr avr_cpu_get_phys_page_debug(CPUState *cs, vaddr addr)
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [RFC PATCH v3 08/11] target/mips: call plugin trap callbacks
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
` (6 preceding siblings ...)
2024-12-02 19:26 ` [RFC PATCH v3 07/11] target/avr: " Julian Ganz
@ 2024-12-02 19:26 ` Julian Ganz
2025-01-09 13:43 ` Alex Bennée
2024-12-02 19:26 ` [RFC PATCH v3 09/11] target/riscv: " Julian Ganz
` (4 subsequent siblings)
12 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:26 UTC (permalink / raw)
To: qemu-devel
Cc: Julian Ganz, Philippe Mathieu-Daudé, Aurelien Jarno,
Jiaxun Yang, Aleksandar Rikalo
We recently introduced API for registering callbacks for trap related
events as well as the corresponding hook functions. Due to differences
between architectures, the latter need to be called from target specific
code.
This change places hooks for MIPS targets.
---
target/mips/tcg/sysemu/tlb_helper.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/target/mips/tcg/sysemu/tlb_helper.c b/target/mips/tcg/sysemu/tlb_helper.c
index e98bb95951..2b19975d53 100644
--- a/target/mips/tcg/sysemu/tlb_helper.c
+++ b/target/mips/tcg/sysemu/tlb_helper.c
@@ -18,6 +18,7 @@
*/
#include "qemu/osdep.h"
#include "qemu/bitops.h"
+#include "qemu/plugin.h"
#include "cpu.h"
#include "internal.h"
@@ -1033,6 +1034,7 @@ void mips_cpu_do_interrupt(CPUState *cs)
bool update_badinstr = 0;
target_ulong offset;
int cause = -1;
+ uint64_t last_pc = env->active_tc.PC;
if (qemu_loglevel_mask(CPU_LOG_INT)
&& cs->exception_index != EXCP_EXT_INTERRUPT) {
@@ -1051,6 +1053,7 @@ void mips_cpu_do_interrupt(CPUState *cs)
cs->exception_index = EXCP_NONE;
mips_semihosting(env);
env->active_tc.PC += env->error_code;
+ qemu_plugin_vcpu_hostcall_cb(cs, last_pc, env->active_tc.PC);
return;
case EXCP_DSS:
env->CP0_Debug |= 1 << CP0DB_DSS;
@@ -1335,6 +1338,14 @@ void mips_cpu_do_interrupt(CPUState *cs)
env->CP0_Status, env->CP0_Cause, env->CP0_BadVAddr,
env->CP0_DEPC);
}
+ switch (cs->exception_index) {
+ case EXCP_NMI:
+ case EXCP_EXT_INTERRUPT:
+ qemu_plugin_vcpu_interrupt_cb(cs, last_pc, env->active_tc.PC);
+ break;
+ default:
+ qemu_plugin_vcpu_exception_cb(cs, last_pc, env->active_tc.PC);
+ }
cs->exception_index = EXCP_NONE;
}
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [RFC PATCH v3 09/11] target/riscv: call plugin trap callbacks
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
` (7 preceding siblings ...)
2024-12-02 19:26 ` [RFC PATCH v3 08/11] target/mips: " Julian Ganz
@ 2024-12-02 19:26 ` Julian Ganz
2024-12-03 4:39 ` Alistair Francis
2024-12-02 19:41 ` [RFC PATCH v3 10/11] target/sparc: " Julian Ganz
` (3 subsequent siblings)
12 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:26 UTC (permalink / raw)
To: qemu-devel
Cc: Julian Ganz, Palmer Dabbelt, Alistair Francis, Bin Meng,
Weiwei Li, Daniel Henrique Barboza, Liu Zhiwei,
open list:RISC-V TCG CPUs
We recently introduced API for registering callbacks for trap related
events as well as the corresponding hook functions. Due to differences
between architectures, the latter need to be called from target specific
code.
This change places hooks for RISC-V targets.
---
target/riscv/cpu_helper.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
index 0a3ead69ea..6da9bd4629 100644
--- a/target/riscv/cpu_helper.c
+++ b/target/riscv/cpu_helper.c
@@ -34,6 +34,7 @@
#include "debug.h"
#include "tcg/oversized-guest.h"
#include "pmp.h"
+#include "qemu/plugin.h"
int riscv_env_mmu_index(CPURISCVState *env, bool ifetch)
{
@@ -1806,6 +1807,7 @@ void riscv_cpu_do_interrupt(CPUState *cs)
!(env->mip & (1 << cause));
bool vs_injected = env->hvip & (1 << cause) & env->hvien &&
!(env->mip & (1 << cause));
+ uint64_t last_pc = env-> pc;
target_ulong tval = 0;
target_ulong tinst = 0;
target_ulong htval = 0;
@@ -1820,6 +1822,7 @@ void riscv_cpu_do_interrupt(CPUState *cs)
case RISCV_EXCP_SEMIHOST:
do_common_semihosting(cs);
env->pc += 4;
+ qemu_plugin_vcpu_hostcall_cb(cs, last_pc, env->pc);
return;
#endif
case RISCV_EXCP_LOAD_GUEST_ACCESS_FAULT:
@@ -1999,6 +2002,12 @@ void riscv_cpu_do_interrupt(CPUState *cs)
riscv_cpu_set_mode(env, PRV_M, virt);
}
+ if (async) {
+ qemu_plugin_vcpu_interrupt_cb(cs, last_pc, env->pc);
+ } else {
+ qemu_plugin_vcpu_exception_cb(cs, last_pc, env->pc);
+ }
+
/*
* Interrupt/exception/trap delivery is asynchronous event and as per
* zicfilp spec CPU should clear up the ELP state. No harm in clearing
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [RFC PATCH v3 10/11] target/sparc: call plugin trap callbacks
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
` (8 preceding siblings ...)
2024-12-02 19:26 ` [RFC PATCH v3 09/11] target/riscv: " Julian Ganz
@ 2024-12-02 19:41 ` Julian Ganz
2025-01-09 13:46 ` Alex Bennée
2024-12-02 19:41 ` [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc Julian Ganz
` (2 subsequent siblings)
12 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:41 UTC (permalink / raw)
To: qemu-devel; +Cc: Julian Ganz, Mark Cave-Ayland, Artyom Tarasenko
We recently introduced API for registering callbacks for trap related
events as well as the corresponding hook functions. Due to differences
between architectures, the latter need to be called from target specific
code.
This change places hooks for SPARC (32bit and 64bit) targets. We treat
any interrupt other than EXTINT and IVEC as exceptions as they appear to
be synchroneous events.
---
target/sparc/int32_helper.c | 7 +++++++
target/sparc/int64_helper.c | 10 ++++++++++
2 files changed, 17 insertions(+)
diff --git a/target/sparc/int32_helper.c b/target/sparc/int32_helper.c
index f2dd8bcb2e..86b21eecb6 100644
--- a/target/sparc/int32_helper.c
+++ b/target/sparc/int32_helper.c
@@ -24,6 +24,7 @@
#include "exec/cpu_ldst.h"
#include "exec/log.h"
#include "sysemu/runstate.h"
+#include "qemu/plugin.h"
static const char * const excp_names[0x80] = {
[TT_TFAULT] = "Instruction Access Fault",
@@ -172,4 +173,10 @@ void sparc_cpu_do_interrupt(CPUState *cs)
env->qemu_irq_ack(env, intno);
}
#endif
+
+ if (intno == TT_EXTINT) {
+ qemu_plugin_vcpu_interrupt_cb(cs, env->regwptr[9], env->pc);
+ } else {
+ qemu_plugin_vcpu_exception_cb(cs, env->regwptr[9], env->pc);
+ }
}
diff --git a/target/sparc/int64_helper.c b/target/sparc/int64_helper.c
index bd14c7a0db..9f0e7206d3 100644
--- a/target/sparc/int64_helper.c
+++ b/target/sparc/int64_helper.c
@@ -23,6 +23,7 @@
#include "exec/helper-proto.h"
#include "exec/log.h"
#include "trace.h"
+#include "qemu/plugin.h"
#define DEBUG_PCALL
@@ -253,6 +254,15 @@ void sparc_cpu_do_interrupt(CPUState *cs)
}
env->npc = env->pc + 4;
cs->exception_index = -1;
+
+ switch (intno) {
+ case TT_EXTINT:
+ case TT_IVEC:
+ qemu_plugin_vcpu_interrupt_cb(cs, tsptr->tpc, env->pc);
+ break;
+ default:
+ qemu_plugin_vcpu_exception_cb(cs, tsptr->tpc, env->pc);
+ }
}
trap_state *cpu_tsptr(CPUSPARCState* env)
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
` (9 preceding siblings ...)
2024-12-02 19:41 ` [RFC PATCH v3 10/11] target/sparc: " Julian Ganz
@ 2024-12-02 19:41 ` Julian Ganz
2024-12-04 23:33 ` Pierrick Bouvier
2024-12-03 8:36 ` [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
2025-01-09 16:43 ` Alex Bennée
12 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-02 19:41 UTC (permalink / raw)
To: qemu-devel
Cc: Julian Ganz, Alex Bennée, Alexandre Iooss, Mahmoud Mandour,
Pierrick Bouvier
We recently introduced plugin API for the registration of callbacks for
discontinuity events, specifically for interrupts, exceptions and host
call events. The callback receives, among other information, the VCPU
index and the PC after the event. This change introduces a test plugin
asserting that particular behaviour.
---
tests/tcg/plugins/discons.c | 95 +++++++++++++++++++++++++++++++++++
tests/tcg/plugins/meson.build | 2 +-
2 files changed, 96 insertions(+), 1 deletion(-)
create mode 100644 tests/tcg/plugins/discons.c
diff --git a/tests/tcg/plugins/discons.c b/tests/tcg/plugins/discons.c
new file mode 100644
index 0000000000..54e52f563a
--- /dev/null
+++ b/tests/tcg/plugins/discons.c
@@ -0,0 +1,95 @@
+/*
+ * Copyright (C) 2024, Julian Ganz <neither@nut.email>
+ *
+ * License: GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include <stdio.h>
+
+#include <qemu-plugin.h>
+
+QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
+
+struct cpu_state {
+ uint64_t next_pc;
+ bool has_next;
+};
+
+static struct qemu_plugin_scoreboard *states;
+
+static bool abort_on_mismatch;
+
+static void vcpu_discon(qemu_plugin_id_t id, unsigned int vcpu_index,
+ enum qemu_plugin_discon_type type, uint64_t from_pc,
+ uint64_t to_pc)
+{
+ struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
+ state->next_pc = to_pc;
+ state->has_next = true;
+}
+
+static void insn_exec(unsigned int vcpu_index, void *userdata)
+{
+ struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
+ uint64_t pc = (uint64_t) userdata;
+ GString* report;
+
+ if (state->has_next) {
+ if (state->next_pc != pc) {
+ report = g_string_new("Trap target PC mismatch\n");
+ g_string_append_printf(report,
+ "Expected: %"PRIx64"\nEncountered: %"
+ PRIx64"\n",
+ state->next_pc, pc);
+ qemu_plugin_outs(report->str);
+ if (abort_on_mismatch) {
+ g_abort();
+ }
+ g_string_free(report, true);
+ }
+ state->has_next = false;
+ }
+}
+
+static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
+{
+ size_t i;
+ size_t n_insns = qemu_plugin_tb_n_insns(tb);
+
+ for (i = 0; i < n_insns; i++) {
+ struct qemu_plugin_insn * insn = qemu_plugin_tb_get_insn(tb, i);
+ uint64_t pc = qemu_plugin_insn_vaddr(insn);
+ qemu_plugin_register_vcpu_insn_exec_cb(insn, insn_exec,
+ QEMU_PLUGIN_CB_NO_REGS,
+ (void*) pc);
+ }
+}
+
+QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
+ const qemu_info_t *info,
+ int argc, char **argv)
+{
+ int i;
+
+ for (i = 0; i < argc; i++) {
+ char *opt = argv[i];
+ g_auto(GStrv) tokens = g_strsplit(opt, "=", 2);
+ if (g_strcmp0(tokens[0], "abort") == 0) {
+ if (!qemu_plugin_bool_parse(tokens[0], tokens[1], &abort_on_mismatch)) {
+ fprintf(stderr, "boolean argument parsing failed: %s\n", opt);
+ return -1;
+ }
+ } else {
+ fprintf(stderr, "option parsing failed: %s\n", opt);
+ return -1;
+ }
+ }
+
+ states = qemu_plugin_scoreboard_new(sizeof(struct cpu_state));
+
+ qemu_plugin_register_vcpu_discon_cb(id, QEMU_PLUGIN_DISCON_ALL,
+ vcpu_discon);
+ qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
+
+ return 0;
+}
diff --git a/tests/tcg/plugins/meson.build b/tests/tcg/plugins/meson.build
index f847849b1b..f057238da1 100644
--- a/tests/tcg/plugins/meson.build
+++ b/tests/tcg/plugins/meson.build
@@ -1,6 +1,6 @@
t = []
if get_option('plugins')
- foreach i : ['bb', 'empty', 'inline', 'insn', 'mem', 'syscall']
+ foreach i : ['bb', 'discons', 'empty', 'inline', 'insn', 'mem', 'syscall']
if host_os == 'windows'
t += shared_module(i, files(i + '.c') + '../../../contrib/plugins/win32_linker.c',
include_directories: '../../../include/qemu',
--
2.45.2
^ permalink raw reply related [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 09/11] target/riscv: call plugin trap callbacks
2024-12-02 19:26 ` [RFC PATCH v3 09/11] target/riscv: " Julian Ganz
@ 2024-12-03 4:39 ` Alistair Francis
0 siblings, 0 replies; 70+ messages in thread
From: Alistair Francis @ 2024-12-03 4:39 UTC (permalink / raw)
To: Julian Ganz
Cc: qemu-devel, Palmer Dabbelt, Alistair Francis, Bin Meng, Weiwei Li,
Daniel Henrique Barboza, Liu Zhiwei, open list:RISC-V TCG CPUs
On Tue, Dec 3, 2024 at 4:30 AM Julian Ganz <neither@nut.email> wrote:
>
> We recently introduced API for registering callbacks for trap related
> events as well as the corresponding hook functions. Due to differences
> between architectures, the latter need to be called from target specific
> code.
>
> This change places hooks for RISC-V targets.
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Alistair
> ---
> target/riscv/cpu_helper.c | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/target/riscv/cpu_helper.c b/target/riscv/cpu_helper.c
> index 0a3ead69ea..6da9bd4629 100644
> --- a/target/riscv/cpu_helper.c
> +++ b/target/riscv/cpu_helper.c
> @@ -34,6 +34,7 @@
> #include "debug.h"
> #include "tcg/oversized-guest.h"
> #include "pmp.h"
> +#include "qemu/plugin.h"
>
> int riscv_env_mmu_index(CPURISCVState *env, bool ifetch)
> {
> @@ -1806,6 +1807,7 @@ void riscv_cpu_do_interrupt(CPUState *cs)
> !(env->mip & (1 << cause));
> bool vs_injected = env->hvip & (1 << cause) & env->hvien &&
> !(env->mip & (1 << cause));
> + uint64_t last_pc = env-> pc;
> target_ulong tval = 0;
> target_ulong tinst = 0;
> target_ulong htval = 0;
> @@ -1820,6 +1822,7 @@ void riscv_cpu_do_interrupt(CPUState *cs)
> case RISCV_EXCP_SEMIHOST:
> do_common_semihosting(cs);
> env->pc += 4;
> + qemu_plugin_vcpu_hostcall_cb(cs, last_pc, env->pc);
> return;
> #endif
> case RISCV_EXCP_LOAD_GUEST_ACCESS_FAULT:
> @@ -1999,6 +2002,12 @@ void riscv_cpu_do_interrupt(CPUState *cs)
> riscv_cpu_set_mode(env, PRV_M, virt);
> }
>
> + if (async) {
> + qemu_plugin_vcpu_interrupt_cb(cs, last_pc, env->pc);
> + } else {
> + qemu_plugin_vcpu_exception_cb(cs, last_pc, env->pc);
> + }
> +
> /*
> * Interrupt/exception/trap delivery is asynchronous event and as per
> * zicfilp spec CPU should clear up the ELP state. No harm in clearing
> --
> 2.45.2
>
>
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
` (10 preceding siblings ...)
2024-12-02 19:41 ` [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc Julian Ganz
@ 2024-12-03 8:36 ` Julian Ganz
2024-12-04 22:51 ` Pierrick Bouvier
2025-01-09 16:43 ` Alex Bennée
12 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-03 8:36 UTC (permalink / raw)
To: qemu-devel
Hi,
I just realized that I forgot to run the checkpatch script on the
patches again before sending and did not include the Sign-Off. Sorry
about that.
Regards,
Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-02 19:26 ` [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities Julian Ganz
@ 2024-12-03 8:45 ` Julian Ganz
2024-12-04 22:41 ` Pierrick Bouvier
2024-12-04 22:45 ` Pierrick Bouvier
1 sibling, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-03 8:45 UTC (permalink / raw)
To: qemu-devel
Hi,
December 2, 2024 at 8:26 PM, "Julian Ganz" wrote:
> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
> index 0fba36ae02..9c67374b7e 100644
> --- a/include/qemu/qemu-plugin.h
> +++ b/include/qemu/qemu-plugin.h
> @@ -154,6 +154,49 @@ typedef void (*qemu_plugin_vcpu_simple_cb_t)(qemu_plugin_id_t id,
<snip>
> +/**
> + * typedef qemu_plugin_vcpu_discon_cb_t - vcpu discontinuity callback
> + * @vcpu_index: the current vcpu context
> + * @type: the type of discontinuity
> + * @from_pc: the source of the discontinuity, e.g. the PC before the
> + * transition
> + * @to_pc: the PC pointing to the next instruction to be executed
> + *
> + * The excact semantics of @from_pc depends on @the type of discontinuity. For
> + * interrupts, @from_pc will point to the next instruction which would have
> + * been executed. For exceptions and host calls, @from_pc will point to the
> + * instruction that caused the exception or issued the host call. Note that
> + * in the case of exceptions, the instruction is not retired and thus not
> + * observable via general instruction exec callbacks. The same may be the case
> + * for some host calls such as hypervisor call "exceptions".
Some more notes about this bit: I originally tried to make the from_pc
semantics independent from the type of event, i.e. either of the two
cases. I obviously did not succeed in doing so. As, in most cases, the
instruction pointed to by from_pc is not observable via exec callbacks
I could also not test this behaviour in the testing plugin (see patch
11). I am therefore in favor for dropping the from_pc for the next
iteration of this patch series.
> + */
> +typedef void (*qemu_plugin_vcpu_discon_cb_t)(qemu_plugin_id_t id,
> + unsigned int vcpu_index,
> + enum qemu_plugin_discon_type type,
> + uint64_t from_pc, uint64_t to_pc);
> +
> /**
> * qemu_plugin_uninstall() - Uninstall a plugin
> * @id: this plugin's opaque ID
> --
> 2.45.2
>
Regards,
Julian
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-03 8:45 ` Julian Ganz
@ 2024-12-04 22:41 ` Pierrick Bouvier
2024-12-05 12:40 ` Julian Ganz
0 siblings, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-04 22:41 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
On 12/3/24 00:45, Julian Ganz wrote:
> Hi,
>
> December 2, 2024 at 8:26 PM, "Julian Ganz" wrote:
>> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
>> index 0fba36ae02..9c67374b7e 100644
>> --- a/include/qemu/qemu-plugin.h
>> +++ b/include/qemu/qemu-plugin.h
>> @@ -154,6 +154,49 @@ typedef void (*qemu_plugin_vcpu_simple_cb_t)(qemu_plugin_id_t id,
> <snip>
>> +/**
>> + * typedef qemu_plugin_vcpu_discon_cb_t - vcpu discontinuity callback
>> + * @vcpu_index: the current vcpu context
>> + * @type: the type of discontinuity
>> + * @from_pc: the source of the discontinuity, e.g. the PC before the
>> + * transition
>> + * @to_pc: the PC pointing to the next instruction to be executed
>> + *
>> + * The excact semantics of @from_pc depends on @the type of discontinuity. For
>> + * interrupts, @from_pc will point to the next instruction which would have
>> + * been executed. For exceptions and host calls, @from_pc will point to the
>> + * instruction that caused the exception or issued the host call. Note that
>> + * in the case of exceptions, the instruction is not retired and thus not
>> + * observable via general instruction exec callbacks. The same may be the case
>> + * for some host calls such as hypervisor call "exceptions".
>
> Some more notes about this bit: I originally tried to make the from_pc
> semantics independent from the type of event, i.e. either of the two
> cases. I obviously did not succeed in doing so. As, in most cases, the
> instruction pointed to by from_pc is not observable via exec callbacks
> I could also not test this behaviour in the testing plugin (see patch
> 11). I am therefore in favor for dropping the from_pc for the next
> iteration of this patch series.
>
Does it mean that information returned should be dependent of type of
event, as we previously discussed on v1?
>> + */
>> +typedef void (*qemu_plugin_vcpu_discon_cb_t)(qemu_plugin_id_t id,
>> + unsigned int vcpu_index,
>> + enum qemu_plugin_discon_type type,
>> + uint64_t from_pc, uint64_t to_pc);
>> +
>> /**
>> * qemu_plugin_uninstall() - Uninstall a plugin
>> * @id: this plugin's opaque ID
>> --
>> 2.45.2
>>
>
> Regards,
> Julian
>
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-02 19:26 ` [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities Julian Ganz
2024-12-03 8:45 ` Julian Ganz
@ 2024-12-04 22:45 ` Pierrick Bouvier
2024-12-05 12:44 ` Julian Ganz
2025-01-09 13:52 ` Alex Bennée
1 sibling, 2 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-04 22:45 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Hi Julian,
thanks for the update!
Comments below.
On 12/2/24 11:26, Julian Ganz wrote:
> The plugin API allows registration of callbacks for a variety of VCPU
> related events, such as VCPU reset, idle and resume. However, traps of
> any kind, i.e. interrupts or exceptions, were previously not covered.
> These kinds of events are arguably quite significant and usually go hand
> in hand with a PC discontinuity. On most platforms, the discontinuity
> also includes a transition from some "mode" to another. Thus, plugins
> for the analysis of (virtualized) embedded systems may benefit from or
> even require the possiblity to perform work on the occurance of an
> interrupt or exception.
>
> This change introduces the concept of such a discontinuity event in the
> form of an enumeration. Currently only traps are covered. Specifically
> we (loosely) define interrupts, exceptions and host calls across all
> platforms. In addition, this change introduces a type to use for
> callback functions related to such events. Since possible modes and the
> enumeration of interupts and exceptions vary greatly between different
> architectures, the callback type only receives the VCPU id, the type of
> event as well as the old and new PC.
> ---
> include/qemu/plugin.h | 1 +
> include/qemu/qemu-plugin.h | 43 ++++++++++++++++++++++++++++++++++++++
> 2 files changed, 44 insertions(+)
>
> diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
> index 9726a9ebf3..27a176b631 100644
> --- a/include/qemu/plugin.h
> +++ b/include/qemu/plugin.h
> @@ -59,6 +59,7 @@ union qemu_plugin_cb_sig {
> qemu_plugin_udata_cb_t udata;
> qemu_plugin_vcpu_simple_cb_t vcpu_simple;
> qemu_plugin_vcpu_udata_cb_t vcpu_udata;
> + qemu_plugin_vcpu_discon_cb_t vcpu_discon;
> qemu_plugin_vcpu_tb_trans_cb_t vcpu_tb_trans;
> qemu_plugin_vcpu_mem_cb_t vcpu_mem;
> qemu_plugin_vcpu_syscall_cb_t vcpu_syscall;
> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
> index 0fba36ae02..9c67374b7e 100644
> --- a/include/qemu/qemu-plugin.h
> +++ b/include/qemu/qemu-plugin.h
> @@ -154,6 +154,49 @@ typedef void (*qemu_plugin_vcpu_simple_cb_t)(qemu_plugin_id_t id,
> typedef void (*qemu_plugin_vcpu_udata_cb_t)(unsigned int vcpu_index,
> void *userdata);
>
> +
> +/**
> + * enum qemu_plugin_discon_type - type of a (potential) PC discontinuity
> + *
> + * @QEMU_PLUGIN_DISCON_INTERRUPT: an interrupt, defined across all architectures
> + * as an asynchronous event, usually originating
> + * from outside the CPU
> + * @QEMU_PLUGIN_DISCON_EXCEPTION: an exception, defined across all architectures
> + * as a synchronous event in response to a
> + * specific instruction being executed
> + * @QEMU_PLUGIN_DISCON_HOSTCALL: a host call, functionally a special kind of
> + * exception that is not handled by code run by
> + * the vCPU but machinery outside the vCPU
> + * @QEMU_PLUGIN_DISCON_ALL: all types of disconinuity events currently covered
> + */
> +enum qemu_plugin_discon_type {
> + QEMU_PLUGIN_DISCON_INTERRUPT = 1,
> + QEMU_PLUGIN_DISCON_EXCEPTION = 2,
> + QEMU_PLUGIN_DISCON_HOSTCALL = 4,
> + QEMU_PLUGIN_DISCON_ALL = 7
> +};
Matter of style, but would be better to use:
enum qemu_plugin_discon_type {
QEMU_PLUGIN_DISCON_INTERRUPT = 1 << 0,
QEMU_PLUGIN_DISCON_EXCEPTION = 1 << 1,
QEMU_PLUGIN_DISCON_HOSTCALL = 1 << 2,
QEMU_PLUGIN_DISCON_ALL = -1
};
> +
> +/**
> + * typedef qemu_plugin_vcpu_discon_cb_t - vcpu discontinuity callback
> + * @vcpu_index: the current vcpu context
> + * @type: the type of discontinuity
> + * @from_pc: the source of the discontinuity, e.g. the PC before the
> + * transition
> + * @to_pc: the PC pointing to the next instruction to be executed
> + *
Missing those parameters when building doc.
include/qemu/qemu-plugin.h:198: warning: Function parameter or member
'id' not described in 'qemu_plugin_vcpu_discon_cb_t'
include/qemu/qemu-plugin.h:289: warning: Function parameter or member
'type' not described in 'qemu_plugin_register_vcpu_discon_cb'
2 warnings as Errors
> + * The excact semantics of @from_pc depends on @the type of discontinuity. For
> + * interrupts, @from_pc will point to the next instruction which would have
> + * been executed. For exceptions and host calls, @from_pc will point to the
> + * instruction that caused the exception or issued the host call. Note that
> + * in the case of exceptions, the instruction is not retired and thus not
> + * observable via general instruction exec callbacks. The same may be the case
> + * for some host calls such as hypervisor call "exceptions".
> + */
> +typedef void (*qemu_plugin_vcpu_discon_cb_t)(qemu_plugin_id_t id,
> + unsigned int vcpu_index,
> + enum qemu_plugin_discon_type type,
> + uint64_t from_pc, uint64_t to_pc);
> +
> /**
> * qemu_plugin_uninstall() - Uninstall a plugin
> * @id: this plugin's opaque ID
Overall I'm ok with what it looks like.
However, it seems that your conclusion was we can't always guarantee to
have from_pc, and I'm not sure that the solution to simply drop it is
the right one.
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 02/11] plugins: add API for registering discontinuity callbacks
2024-12-02 19:26 ` [RFC PATCH v3 02/11] plugins: add API for registering discontinuity callbacks Julian Ganz
@ 2024-12-04 22:45 ` Pierrick Bouvier
2025-01-09 13:57 ` Alex Bennée
1 sibling, 0 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-04 22:45 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/2/24 11:26, Julian Ganz wrote:
> The plugin API allows registration of callbacks for a variety of VCPU
> related events, such as VCPU reset, idle and resume. In addition to
> those events, we recently defined discontinuity events, which include
> traps.
>
> This change introduces a function to register callbacks for these
> events. We define one distinct plugin event type for each type of
> discontinuity, granting fine control to plugins in term of which events
> they receive.
> ---
> include/qemu/plugin-event.h | 3 +++
> include/qemu/qemu-plugin.h | 15 +++++++++++++++
> plugins/core.c | 15 +++++++++++++++
> 3 files changed, 33 insertions(+)
>
> diff --git a/include/qemu/plugin-event.h b/include/qemu/plugin-event.h
> index 7056d8427b..1100dae212 100644
> --- a/include/qemu/plugin-event.h
> +++ b/include/qemu/plugin-event.h
> @@ -20,6 +20,9 @@ enum qemu_plugin_event {
> QEMU_PLUGIN_EV_VCPU_SYSCALL_RET,
> QEMU_PLUGIN_EV_FLUSH,
> QEMU_PLUGIN_EV_ATEXIT,
> + QEMU_PLUGIN_EV_VCPU_INTERRUPT,
> + QEMU_PLUGIN_EV_VCPU_EXCEPTION,
> + QEMU_PLUGIN_EV_VCPU_HOSTCALL,
> QEMU_PLUGIN_EV_MAX, /* total number of plugin events we support */
> };
>
> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
> index 9c67374b7e..f998a465e5 100644
> --- a/include/qemu/qemu-plugin.h
> +++ b/include/qemu/qemu-plugin.h
> @@ -273,6 +273,21 @@ QEMU_PLUGIN_API
> void qemu_plugin_register_vcpu_resume_cb(qemu_plugin_id_t id,
> qemu_plugin_vcpu_simple_cb_t cb);
>
> +/**
> + * qemu_plugin_register_vcpu_discon_cb() - register a discontinuity callback
> + * @id: plugin ID
> + * @cb: callback function
> + *
> + * The @cb function is called every time a vCPU receives a discontinuity event
> + * of the specified type(s), after the vCPU was prepared to handle the event.
> + * Preparation usually entails updating the PC to some interrupt handler or trap
> + * vector entry.
> + */
> +QEMU_PLUGIN_API
> +void qemu_plugin_register_vcpu_discon_cb(qemu_plugin_id_t id,
> + enum qemu_plugin_discon_type type,
> + qemu_plugin_vcpu_discon_cb_t cb);
> +
> /** struct qemu_plugin_tb - Opaque handle for a translation block */
> struct qemu_plugin_tb;
> /** struct qemu_plugin_insn - Opaque handle for a translated instruction */
> diff --git a/plugins/core.c b/plugins/core.c
> index bb105e8e68..a89a4a0315 100644
> --- a/plugins/core.c
> +++ b/plugins/core.c
> @@ -559,6 +559,21 @@ void qemu_plugin_register_vcpu_resume_cb(qemu_plugin_id_t id,
> plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_RESUME, cb);
> }
>
> +void qemu_plugin_register_vcpu_discon_cb(qemu_plugin_id_t id,
> + enum qemu_plugin_discon_type type,
> + qemu_plugin_vcpu_discon_cb_t cb)
> +{
> + if (type & QEMU_PLUGIN_DISCON_INTERRUPT) {
> + plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_INTERRUPT, cb);
> + }
> + if (type & QEMU_PLUGIN_DISCON_EXCEPTION) {
> + plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_EXCEPTION, cb);
> + }
> + if (type & QEMU_PLUGIN_DISCON_HOSTCALL) {
> + plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_HOSTCALL, cb);
> + }
> +}
> +
> void qemu_plugin_register_flush_cb(qemu_plugin_id_t id,
> qemu_plugin_simple_cb_t cb)
> {
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 03/11] plugins: add hooks for new discontinuity related callbacks
2024-12-02 19:26 ` [RFC PATCH v3 03/11] plugins: add hooks for new discontinuity related callbacks Julian Ganz
@ 2024-12-04 22:47 ` Pierrick Bouvier
2025-01-09 13:58 ` Alex Bennée
1 sibling, 0 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-04 22:47 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/2/24 11:26, Julian Ganz wrote:
> The plugin API allows registration of callbacks for a variety of VCPU
> related events, such as VCPU reset, idle and resume. In addition, we
> recently introduced API for registering callbacks for discontinuity
> events, specifically for interrupts, exceptions and host calls.
>
> This change introduces the corresponding hooks called from target
> specific code inside qemu.
> ---
> include/qemu/plugin.h | 12 ++++++++++
> plugins/core.c | 52 +++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 64 insertions(+)
>
> diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
> index 27a176b631..3de9cb3fe4 100644
> --- a/include/qemu/plugin.h
> +++ b/include/qemu/plugin.h
> @@ -161,6 +161,9 @@ void qemu_plugin_vcpu_exit_hook(CPUState *cpu);
> void qemu_plugin_tb_trans_cb(CPUState *cpu, struct qemu_plugin_tb *tb);
> void qemu_plugin_vcpu_idle_cb(CPUState *cpu);
> void qemu_plugin_vcpu_resume_cb(CPUState *cpu);
> +void qemu_plugin_vcpu_interrupt_cb(CPUState *cpu, uint64_t from, uint64_t to);
> +void qemu_plugin_vcpu_exception_cb(CPUState *cpu, uint64_t from, uint64_t to);
> +void qemu_plugin_vcpu_hostcall_cb(CPUState *cpu, uint64_t from, uint64_t to);
> void
> qemu_plugin_vcpu_syscall(CPUState *cpu, int64_t num, uint64_t a1,
> uint64_t a2, uint64_t a3, uint64_t a4, uint64_t a5,
> @@ -243,6 +246,15 @@ static inline void qemu_plugin_vcpu_idle_cb(CPUState *cpu)
> static inline void qemu_plugin_vcpu_resume_cb(CPUState *cpu)
> { }
>
> +void qemu_plugin_vcpu_interrupt_cb(CPUState *cpu, uint64_t from, uint64_t to)
> +{ }
> +
> +void qemu_plugin_vcpu_exception_cb(CPUState *cpu, uint64_t from, uint64_t to)
> +{ }
> +
> +void qemu_plugin_vcpu_hostcall_cb(CPUState *cpu, uint64_t from, uint64_t to)
> +{ }
> +
> static inline void
> qemu_plugin_vcpu_syscall(CPUState *cpu, int64_t num, uint64_t a1, uint64_t a2,
> uint64_t a3, uint64_t a4, uint64_t a5, uint64_t a6,
> diff --git a/plugins/core.c b/plugins/core.c
> index a89a4a0315..2c9637334f 100644
> --- a/plugins/core.c
> +++ b/plugins/core.c
> @@ -112,6 +112,43 @@ static void plugin_vcpu_cb__simple(CPUState *cpu, enum qemu_plugin_event ev)
> }
> }
>
> +/*
> + * Disable CFI checks.
> + * The callback function has been loaded from an external library so we do not
> + * have type information
> + */
> +QEMU_DISABLE_CFI
> +static void plugin_vcpu_cb__discon(CPUState *cpu,
> + enum qemu_plugin_discon_type type,
> + uint64_t from, uint64_t to)
> +{
> + struct qemu_plugin_cb *cb, *next;
> + enum qemu_plugin_event ev;
> +
> + if (cpu->cpu_index < plugin.num_vcpus) {
> + switch (type) {
> + case QEMU_PLUGIN_DISCON_INTERRUPT:
> + ev = QEMU_PLUGIN_EV_VCPU_INTERRUPT;
> + break;
> + case QEMU_PLUGIN_DISCON_EXCEPTION:
> + ev = QEMU_PLUGIN_EV_VCPU_EXCEPTION;
> + break;
> + case QEMU_PLUGIN_DISCON_HOSTCALL:
> + ev = QEMU_PLUGIN_EV_VCPU_HOSTCALL;
> + break;
> + default:
> + g_assert_not_reached();
> + }
> +
> + /* iterate safely; plugins might uninstall themselves at any time */
> + QLIST_FOREACH_SAFE_RCU(cb, &plugin.cb_lists[ev], entry, next) {
> + qemu_plugin_vcpu_discon_cb_t func = cb->f.vcpu_discon;
> +
> + func(cb->ctx->id, cpu->cpu_index, type, from, to);
> + }
> + }
> +}
> +
> /*
> * Disable CFI checks.
> * The callback function has been loaded from an external library so we do not
> @@ -547,6 +584,21 @@ void qemu_plugin_vcpu_resume_cb(CPUState *cpu)
> }
> }
>
> +void qemu_plugin_vcpu_interrupt_cb(CPUState *cpu, uint64_t from, uint64_t to)
> +{
> + plugin_vcpu_cb__discon(cpu, QEMU_PLUGIN_DISCON_INTERRUPT, from, to);
> +}
> +
> +void qemu_plugin_vcpu_exception_cb(CPUState *cpu, uint64_t from, uint64_t to)
> +{
> + plugin_vcpu_cb__discon(cpu, QEMU_PLUGIN_DISCON_EXCEPTION, from, to);
> +}
> +
> +void qemu_plugin_vcpu_hostcall_cb(CPUState *cpu, uint64_t from, uint64_t to)
> +{
> + plugin_vcpu_cb__discon(cpu, QEMU_PLUGIN_DISCON_HOSTCALL, from, to);
> +}
> +
> void qemu_plugin_register_vcpu_idle_cb(qemu_plugin_id_t id,
> qemu_plugin_vcpu_simple_cb_t cb)
> {
Reviewed-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 05/11] target/alpha: call plugin trap callbacks
2024-12-02 19:26 ` [RFC PATCH v3 05/11] target/alpha: call plugin trap callbacks Julian Ganz
@ 2024-12-04 22:48 ` Pierrick Bouvier
0 siblings, 0 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-04 22:48 UTC (permalink / raw)
To: Julian Ganz, qemu-devel; +Cc: Richard Henderson
On 12/2/24 11:26, Julian Ganz wrote:
> We recently introduced API for registering callbacks for trap related
> events as well as the corresponding hook functions. Due to differences
> between architectures, the latter need to be called from target specific
> code.
>
> This change places hooks for Alpha targets.
> ---
> target/alpha/helper.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/target/alpha/helper.c b/target/alpha/helper.c
> index 2f1000c99f..acc92402af 100644
> --- a/target/alpha/helper.c
> +++ b/target/alpha/helper.c
> @@ -25,6 +25,7 @@
> #include "fpu/softfloat-types.h"
> #include "exec/helper-proto.h"
> #include "qemu/qemu-print.h"
> +#include "qemu/plugin.h"
>
>
> #define CONVERT_BIT(X, SRC, DST) \
> @@ -326,6 +327,7 @@ void alpha_cpu_do_interrupt(CPUState *cs)
> {
> CPUAlphaState *env = cpu_env(cs);
> int i = cs->exception_index;
> + uint64_t last_pc = env->pc;
>
> if (qemu_loglevel_mask(CPU_LOG_INT)) {
> static int count;
> @@ -429,6 +431,16 @@ void alpha_cpu_do_interrupt(CPUState *cs)
>
> /* Switch to PALmode. */
> env->flags |= ENV_FLAG_PAL_MODE;
> +
> + switch (i) {
> + case EXCP_SMP_INTERRUPT:
> + case EXCP_CLK_INTERRUPT:
> + case EXCP_DEV_INTERRUPT:
> + qemu_plugin_vcpu_interrupt_cb(cs, last_pc, env->pc);
> + break;
> + qemu_plugin_vcpu_exception_cb(cs, last_pc, env->pc);
> + default:
> + }
Does not compile with clang:
../target/alpha/helper.c:442:13: error: label at end of compound
statement: expected statement
default:
^
> }
>
> bool alpha_cpu_exec_interrupt(CPUState *cs, int interrupt_request)
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities
2024-12-03 8:36 ` [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
@ 2024-12-04 22:51 ` Pierrick Bouvier
0 siblings, 0 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-04 22:51 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
On 12/3/24 00:36, Julian Ganz wrote:
> Hi,
>
> I just realized that I forgot to run the checkpatch script on the
> patches again before sending and did not include the Sign-Off. Sorry
> about that.
>
> Regards,
> Julian Ganz
>
No worries, it's pretty frequent that people forgot those.
While at it, you can fix the style issues checkpatch has reported for
the series.
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API
2024-12-02 19:26 ` [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API Julian Ganz
@ 2024-12-04 23:14 ` Pierrick Bouvier
2024-12-05 13:00 ` Julian Ganz
2025-01-09 14:04 ` Alex Bennée
1 sibling, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-04 23:14 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/2/24 11:26, Julian Ganz wrote:
> We recently introduced new plugin API for registration of discontinuity
> related callbacks. This change introduces a minimal plugin showcasing
> the new API. It simply counts the occurances of interrupts, exceptions
> and host calls per CPU and reports the counts when exitting.
> ---
> contrib/plugins/meson.build | 3 +-
> contrib/plugins/traps.c | 96 +++++++++++++++++++++++++++++++++++++
> 2 files changed, 98 insertions(+), 1 deletion(-)
> create mode 100644 contrib/plugins/traps.c
>
> diff --git a/contrib/plugins/meson.build b/contrib/plugins/meson.build
> index 63a32c2b4f..9a3015e1c1 100644
> --- a/contrib/plugins/meson.build
> +++ b/contrib/plugins/meson.build
> @@ -1,5 +1,6 @@
> contrib_plugins = ['bbv', 'cache', 'cflow', 'drcov', 'execlog', 'hotblocks',
> - 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger']
> + 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger',
> + 'traps']
> if host_os != 'windows'
> # lockstep uses socket.h
> contrib_plugins += 'lockstep'
> diff --git a/contrib/plugins/traps.c b/contrib/plugins/traps.c
> new file mode 100644
> index 0000000000..ecd4beac5f
> --- /dev/null
> +++ b/contrib/plugins/traps.c
> @@ -0,0 +1,96 @@
> +/*
> + * Copyright (C) 2024, Julian Ganz <neither@nut.email>
> + *
> + * Traps - count traps
> + *
> + * License: GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
> +
> +#include <stdio.h>
> +
> +#include <qemu-plugin.h>
> +
> +QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
> +
> +typedef struct {
> + uint64_t interrupts;
> + uint64_t exceptions;
> + uint64_t hostcalls;
> + bool active;
The active field can be removed, as you can query
qemu_plugin_num_vcpus() to know (dynamically) how many vcpus were created.
> +} TrapCounters;
> +
> +static struct qemu_plugin_scoreboard *traps;
> +static size_t max_vcpus;
> +
> +static void vcpu_init(qemu_plugin_id_t id, unsigned int vcpu_index)
> +{
> + TrapCounters* rec = qemu_plugin_scoreboard_find(traps, vcpu_index);
> + rec->active = true;
> +}
Once active is removed, this function can be removed too.
> +
> +static void vcpu_discon(qemu_plugin_id_t id, unsigned int vcpu_index,
> + enum qemu_plugin_discon_type type, uint64_t from_pc,
> + uint64_t to_pc)
> +{
> + TrapCounters* rec = qemu_plugin_scoreboard_find(traps, vcpu_index);
> + switch (type) {
> + case QEMU_PLUGIN_DISCON_INTERRUPT:
> + rec->interrupts++;
> + break;
> + case QEMU_PLUGIN_DISCON_EXCEPTION:
> + rec->exceptions++;
> + break;
> + case QEMU_PLUGIN_DISCON_HOSTCALL:
> + rec->hostcalls++;
> + break;
> + default:
> + /* unreachable */
You can use g_assert_not_reached() ensure it's unreachable. We removed
assert(0) from the codebase and replaced those with it.
> + break;
> + }
> +}
> +
> +static void plugin_exit(qemu_plugin_id_t id, void *p)
> +{
> + g_autoptr(GString) report;
> + report = g_string_new("VCPU, interrupts, exceptions, hostcalls\n");
> + int vcpu;
> +
> + for (vcpu = 0; vcpu < max_vcpus; vcpu++) {
vcpu < qemu_plugin_num_vcpus()
> + TrapCounters *rec = qemu_plugin_scoreboard_find(traps, vcpu);
> + if (rec->active) {
> + g_string_append_printf(report,
> + "% 4d, % 10"PRId64", % 10"PRId64", % 10"
> + PRId64"\n",
> + vcpu,
> + rec->interrupts, rec->exceptions,
> + rec->hostcalls);
> + }
> + }
> +
> + qemu_plugin_outs(report->str);
> + qemu_plugin_scoreboard_free(traps);
> +}
> +
> +QEMU_PLUGIN_EXPORT
> +int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
> + int argc, char **argv)
> +{
> + if (!info->system_emulation) {
> + fputs("trap plugin can only be used in system emulation mode.\n",
> + stderr);
> + return -1;
> + }
> +
> + max_vcpus = info->system.max_vcpus;
> + traps = qemu_plugin_scoreboard_new(sizeof(TrapCounters));
> + qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
> + qemu_plugin_vcpu_for_each(id, vcpu_init);
> +
> + qemu_plugin_register_vcpu_discon_cb(id, QEMU_PLUGIN_DISCON_TRAPS,
> + vcpu_discon);
> +
The change from QEMU_PLUGIN_DISCON_TRAPS to QEMU_PLUGIN_DISCON_ALL
should be included in this patch, instead of next one.
> + qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
> +
> + return 0;
> +}
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-02 19:41 ` [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc Julian Ganz
@ 2024-12-04 23:33 ` Pierrick Bouvier
2024-12-05 13:10 ` Julian Ganz
2024-12-20 11:47 ` Julian Ganz
0 siblings, 2 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-04 23:33 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/2/24 11:41, Julian Ganz wrote:
> We recently introduced plugin API for the registration of callbacks for
> discontinuity events, specifically for interrupts, exceptions and host
> call events. The callback receives, among other information, the VCPU
> index and the PC after the event. This change introduces a test plugin
> asserting that particular behaviour.
> ---
> tests/tcg/plugins/discons.c | 95 +++++++++++++++++++++++++++++++++++
> tests/tcg/plugins/meson.build | 2 +-
> 2 files changed, 96 insertions(+), 1 deletion(-)
> create mode 100644 tests/tcg/plugins/discons.c
>
> diff --git a/tests/tcg/plugins/discons.c b/tests/tcg/plugins/discons.c
> new file mode 100644
> index 0000000000..54e52f563a
> --- /dev/null
> +++ b/tests/tcg/plugins/discons.c
> @@ -0,0 +1,95 @@
> +/*
> + * Copyright (C) 2024, Julian Ganz <neither@nut.email>
> + *
> + * License: GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + */
Would be nice to include a description of the plugin here.
> +#include <stdio.h>
> +
> +#include <qemu-plugin.h>
> +
> +QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
> +
> +struct cpu_state {
> + uint64_t next_pc;
> + bool has_next;
> +};
> +
> +static struct qemu_plugin_scoreboard *states;
> +
> +static bool abort_on_mismatch;
> +
> +static void vcpu_discon(qemu_plugin_id_t id, unsigned int vcpu_index,
> + enum qemu_plugin_discon_type type, uint64_t from_pc,
> + uint64_t to_pc)
> +{
> + struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
> + state->next_pc = to_pc;
> + state->has_next = true;
> +}
> +
> +static void insn_exec(unsigned int vcpu_index, void *userdata)
> +{
> + struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
> + uint64_t pc = (uint64_t) userdata;
> + GString* report;
> +
> + if (state->has_next) {
> + if (state->next_pc != pc) {
> + report = g_string_new("Trap target PC mismatch\n");
> + g_string_append_printf(report,
> + "Expected: %"PRIx64"\nEncountered: %"
> + PRIx64"\n",
> + state->next_pc, pc);
> + qemu_plugin_outs(report->str);
> + if (abort_on_mismatch) {
> + g_abort();
> + }
> + g_string_free(report, true);
> + }
> + state->has_next = false;
> + }
> +}
When booting an arm64 vm, I get this message:
Trap target PC mismatch
Expected: 23faf3a80
Encountered: 23faf3a84
From what I understand, it means that the next_pc we have is incorrect.
> +
> +static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
> +{
> + size_t i;
> + size_t n_insns = qemu_plugin_tb_n_insns(tb);
> +
> + for (i = 0; i < n_insns; i++) {
> + struct qemu_plugin_insn * insn = qemu_plugin_tb_get_insn(tb, i);
> + uint64_t pc = qemu_plugin_insn_vaddr(insn);
> + qemu_plugin_register_vcpu_insn_exec_cb(insn, insn_exec,
> + QEMU_PLUGIN_CB_NO_REGS,
> + (void*) pc);
> + }
> +}> +
> +QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
> + const qemu_info_t *info,
> + int argc, char **argv)
> +{
> + int i;
> +
> + for (i = 0; i < argc; i++) {
> + char *opt = argv[i];
> + g_auto(GStrv) tokens = g_strsplit(opt, "=", 2);
> + if (g_strcmp0(tokens[0], "abort") == 0) {
> + if (!qemu_plugin_bool_parse(tokens[0], tokens[1], &abort_on_mismatch)) {
> + fprintf(stderr, "boolean argument parsing failed: %s\n", opt);
> + return -1;
> + }
> + } else {
> + fprintf(stderr, "option parsing failed: %s\n", opt);
> + return -1;
> + }
> + }
> +
> + states = qemu_plugin_scoreboard_new(sizeof(struct cpu_state));
> +
> + qemu_plugin_register_vcpu_discon_cb(id, QEMU_PLUGIN_DISCON_ALL,
> + vcpu_discon);
> + qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
> +
> + return 0;
> +}
> diff --git a/tests/tcg/plugins/meson.build b/tests/tcg/plugins/meson.build
> index f847849b1b..f057238da1 100644
> --- a/tests/tcg/plugins/meson.build
> +++ b/tests/tcg/plugins/meson.build
> @@ -1,6 +1,6 @@
> t = []
> if get_option('plugins')
> - foreach i : ['bb', 'empty', 'inline', 'insn', 'mem', 'syscall']
> + foreach i : ['bb', 'discons', 'empty', 'inline', 'insn', 'mem', 'syscall']
> if host_os == 'windows'
> t += shared_module(i, files(i + '.c') + '../../../contrib/plugins/win32_linker.c',
> include_directories: '../../../include/qemu',
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-04 22:41 ` Pierrick Bouvier
@ 2024-12-05 12:40 ` Julian Ganz
2024-12-05 17:56 ` Pierrick Bouvier
0 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-05 12:40 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Hi Pierrick,
December 4, 2024 at 11:41 PM, "Pierrick Bouvier" wrote:
> On 12/3/24 00:45, Julian Ganz wrote:
>
> >
> > Hi,
> > December 2, 2024 at 8:26 PM, "Julian Ganz" wrote:
> >
> > >
> > > diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
> > > index 0fba36ae02..9c67374b7e 100644
> > > --- a/include/qemu/qemu-plugin.h
> > > +++ b/include/qemu/qemu-plugin.h
> > > @@ -154,6 +154,49 @@ typedef void (*qemu_plugin_vcpu_simple_cb_t)(qemu_plugin_id_t id,
> > >
> > <snip>
> >
> > >
> > > +/**
> > > + * typedef qemu_plugin_vcpu_discon_cb_t - vcpu discontinuity callback
> > > + * @vcpu_index: the current vcpu context
> > > + * @type: the type of discontinuity
> > > + * @from_pc: the source of the discontinuity, e.g. the PC before the
> > > + * transition
> > > + * @to_pc: the PC pointing to the next instruction to be executed
> > > + *
> > > + * The excact semantics of @from_pc depends on @the type of discontinuity. For
> > > + * interrupts, @from_pc will point to the next instruction which would have
> > > + * been executed. For exceptions and host calls, @from_pc will point to the
> > > + * instruction that caused the exception or issued the host call. Note that
> > > + * in the case of exceptions, the instruction is not retired and thus not
> > > + * observable via general instruction exec callbacks. The same may be the case
> > > + * for some host calls such as hypervisor call "exceptions".
> > >
> > Some more notes about this bit: I originally tried to make the from_pc
> > semantics independent from the type of event, i.e. either of the two
> > cases. I obviously did not succeed in doing so. As, in most cases, the
> > instruction pointed to by from_pc is not observable via exec callbacks
> > I could also not test this behaviour in the testing plugin (see patch
> > 11). I am therefore in favor for dropping the from_pc for the next
> > iteration of this patch series.
> >
> Does it mean that information returned should be dependent of type of event, as we previously discussed on v1?
Yes, and I don't like it.
Regards,
Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-04 22:45 ` Pierrick Bouvier
@ 2024-12-05 12:44 ` Julian Ganz
2024-12-05 17:35 ` Pierrick Bouvier
2025-01-09 13:52 ` Alex Bennée
1 sibling, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-05 12:44 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Hi Pierrick,
December 4, 2024 at 11:45 PM, "Pierrick Bouvier" wrote:
> On 12/2/24 11:26, Julian Ganz wrote:
> > include/qemu/plugin.h | 1 +
> > include/qemu/qemu-plugin.h | 43 ++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 44 insertions(+)
> > diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
> > index 9726a9ebf3..27a176b631 100644
> > --- a/include/qemu/plugin.h
> > +++ b/include/qemu/plugin.h
<snip>
> > +
> > +/**
> > + * typedef qemu_plugin_vcpu_discon_cb_t - vcpu discontinuity callback
> > + * @vcpu_index: the current vcpu context
> > + * @type: the type of discontinuity
> > + * @from_pc: the source of the discontinuity, e.g. the PC before the
> > + * transition
> > + * @to_pc: the PC pointing to the next instruction to be executed
> > + *
> >
> Missing those parameters when building doc.
> include/qemu/qemu-plugin.h:198: warning: Function parameter or member 'id' not described in 'qemu_plugin_vcpu_discon_cb_t'
> include/qemu/qemu-plugin.h:289: warning: Function parameter or member 'type' not described in 'qemu_plugin_register_vcpu_discon_cb'
> 2 warnings as Errors
Yes, I forgot about id. But type is clearly documented. Maybe the tool
is confused about the name and thinks it's a reserved word or something?
In that case I better change that to something else.
And note to self: also test-biuld the docs next time.
Regards,
Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API
2024-12-04 23:14 ` Pierrick Bouvier
@ 2024-12-05 13:00 ` Julian Ganz
2024-12-05 17:23 ` Pierrick Bouvier
0 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-05 13:00 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
Hi Pierrick,
December 5, 2024 at 12:14 AM, "Pierrick Bouvier" wrote:
> On 12/2/24 11:26, Julian Ganz wrote:
> > +typedef struct {
> > + uint64_t interrupts;
> > + uint64_t exceptions;
> > + uint64_t hostcalls;
> > + bool active;
> >
> The active field can be removed, as you can query qemu_plugin_num_vcpus() to know (dynamically) how many vcpus were created.
Yes, if the ids of dynamically initialized VCPUs are contiguous. I
wasn't sure they really are. And I distinctly remember we originally
used some query function and ended up with the maximum number of VCPUs
supported rather then those actually used. But that may have been
another function, or some unfortunate result of me being too cautious
and doing
| qemu_plugin_vcpu_for_each(id, vcpu_init);
in qemu_plugin_install.
> >
> > + break;
> > + }
> > +}
> > +
> > +static void plugin_exit(qemu_plugin_id_t id, void *p)
> > +{
> > + g_autoptr(GString) report;
> > + report = g_string_new("VCPU, interrupts, exceptions, hostcalls\n");
> > + int vcpu;
> > +
> > + for (vcpu = 0; vcpu < max_vcpus; vcpu++) {
> >
> vcpu < qemu_plugin_num_vcpus()
Yes, max_vcpus was introduced as an optimization. If we can rely on all
VCPUs with id < qemu_plugin_num_vcpus() having been active at some point
this becomes unnecessary.
> > + qemu_plugin_register_vcpu_discon_cb(id, QEMU_PLUGIN_DISCON_TRAPS,
> > + vcpu_discon);
> > +
> >
> The change from QEMU_PLUGIN_DISCON_TRAPS to QEMU_PLUGIN_DISCON_ALL should be included in this patch, instead of next one.
Ah, thanks for pointing that out. I likely fumbled this at some point when rebasing.
Regards,
Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-04 23:33 ` Pierrick Bouvier
@ 2024-12-05 13:10 ` Julian Ganz
2024-12-05 17:30 ` Pierrick Bouvier
2024-12-20 11:47 ` Julian Ganz
1 sibling, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-05 13:10 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
Hi Pierrick,
December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
> On 12/2/24 11:41, Julian Ganz wrote:
>
> > +/*
> > + * Copyright (C) 2024, Julian Ganz <neither@nut.email>
> > + *
> > + * License: GNU GPL, version 2 or later.
> > + * See the COPYING file in the top-level directory.
> > + */
> >
> Would be nice to include a description of the plugin here.
Agreed. I'll include one next time.
> When booting an arm64 vm, I get this message:
> Trap target PC mismatch
> Expected: 23faf3a80
> Encountered: 23faf3a84
>
> From what I understand, it means that the next_pc we have is incorrect.
Yes, this is indeed incorrect, and also a perfect example why this test
plugin exists. There are likely other errors lurking in target specific
code. Did you happen to also log interrupts? Do you remember what image
you used?
Btw: this also illustrates another issue I have with from_pc: we can
test the behavior for to_pc, but doing this meaningfully for from_pc
via a plugin is next to impossible because the instruction it points to
is not observable via an exec callback. At least not in the general
case, even not if we only consider a single type of event.
Regards,
Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API
2024-12-05 13:00 ` Julian Ganz
@ 2024-12-05 17:23 ` Pierrick Bouvier
0 siblings, 0 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-05 17:23 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/5/24 05:00, Julian Ganz wrote:
> Hi Pierrick,
>
> December 5, 2024 at 12:14 AM, "Pierrick Bouvier" wrote:
>> On 12/2/24 11:26, Julian Ganz wrote:
>>> +typedef struct {
>>> + uint64_t interrupts;
>>> + uint64_t exceptions;
>>> + uint64_t hostcalls;
>>> + bool active;
>>>
>> The active field can be removed, as you can query qemu_plugin_num_vcpus() to know (dynamically) how many vcpus were created.
>
> Yes, if the ids of dynamically initialized VCPUs are contiguous. I
> wasn't sure they really are. And I distinctly remember we originally
> used some query function and ended up with the maximum number of VCPUs
> supported rather then those actually used. But that may have been
> another function, or some unfortunate result of me being too cautious
> and doing
>
It's a valid concern, and the good news is that they are guaranteed to
be contiguous.
In system mode, we initialize all vcpus *before* executing anything.
In user mode, they are spawned when new threads appear.
In reality, qemu_plugin_num_vcpus() returns the maximum number of vcpus
we had at some point (vs the real number of vcpus running).
It was introduced with scoreboards, to ensure the user can know how many
entries they contain, in a safe way.
Most of the usage we had for this was to allocate structures and collect
information per vcpu, especially at the end of execution (where some
vcpus will have exited already in user-mode). So choosing to return the
max is a valid abstraction.
> | qemu_plugin_vcpu_for_each(id, vcpu_init);
>
> in qemu_plugin_install.
>
And indeed, qemu_plugin_vcpu_for_each(id, vcpu_init) will only iterate
on active cpus.
>>>
>>> + break;
>>> + }
>>> +}
>>> +
>>> +static void plugin_exit(qemu_plugin_id_t id, void *p)
>>> +{
>>> + g_autoptr(GString) report;
>>> + report = g_string_new("VCPU, interrupts, exceptions, hostcalls\n");
>>> + int vcpu;
>>> +
>>> + for (vcpu = 0; vcpu < max_vcpus; vcpu++) {
>>>
>> vcpu < qemu_plugin_num_vcpus()
>
> Yes, max_vcpus was introduced as an optimization. If we can rely on all
> VCPUs with id < qemu_plugin_num_vcpus() having been active at some point
> this becomes unnecessary.
>
>>> + qemu_plugin_register_vcpu_discon_cb(id, QEMU_PLUGIN_DISCON_TRAPS,
>>> + vcpu_discon);
>>> +
>>>
>> The change from QEMU_PLUGIN_DISCON_TRAPS to QEMU_PLUGIN_DISCON_ALL should be included in this patch, instead of next one.
>
> Ah, thanks for pointing that out. I likely fumbled this at some point when rebasing.
>
> Regards,
> Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-05 13:10 ` Julian Ganz
@ 2024-12-05 17:30 ` Pierrick Bouvier
2024-12-05 21:22 ` Julian Ganz
0 siblings, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-05 17:30 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/5/24 05:10, Julian Ganz wrote:
> Hi Pierrick,
>
> December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
>> On 12/2/24 11:41, Julian Ganz wrote:
>>
>>> +/*
>>> + * Copyright (C) 2024, Julian Ganz <neither@nut.email>
>>> + *
>>> + * License: GNU GPL, version 2 or later.
>>> + * See the COPYING file in the top-level directory.
>>> + */
>>>
>> Would be nice to include a description of the plugin here.
>
> Agreed. I'll include one next time.
>
>> When booting an arm64 vm, I get this message:
>> Trap target PC mismatch
>> Expected: 23faf3a80
>> Encountered: 23faf3a84
>>
>> From what I understand, it means that the next_pc we have is incorrect.
>
> Yes, this is indeed incorrect, and also a perfect example why this test
> plugin exists. There are likely other errors lurking in target specific
> code. Did you happen to also log interrupts? Do you remember what image
> you used?
I used exactly this:
./build/qemu-system-aarch64 -plugin
./build/tests/tcg/plugins/libdiscons.so -smp 4 -M virt -d plugin -m 8G
-device virtio-blk-pci,drive=root -drive
if=none,id=root,file=./debianaarch64.img -M virt -cpu max,pauth=off
-drive if=pflash,readonly=on,file=/usr/share/AAVMF/AAVMF_CODE.fd -drive
if=pflash,file=./AAVMF_VARS.fd
The arm64 image is a vanilla debian stable I installed.
AAVMF* files come from qemu-efi-aarch64 debian package.
>
> Btw: this also illustrates another issue I have with from_pc: we can
> test the behavior for to_pc, but doing this meaningfully for from_pc
> via a plugin is next to impossible because the instruction it points to
> is not observable via an exec callback. At least not in the general
> case, even not if we only consider a single type of event.
>
We can store the next_expected pc for each instruction (from
current_instruction + insn_length), and we should be able to compare
that with the expected from_pc.
This is mostly what contrib/plugins/cflow.c does.
With that, we can test from_pc.
> Regards,
> Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-05 12:44 ` Julian Ganz
@ 2024-12-05 17:35 ` Pierrick Bouvier
2024-12-05 21:25 ` Julian Ganz
0 siblings, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-05 17:35 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
On 12/5/24 04:44, Julian Ganz wrote:
> Hi Pierrick,
>
> December 4, 2024 at 11:45 PM, "Pierrick Bouvier" wrote:
>> On 12/2/24 11:26, Julian Ganz wrote:
>>> include/qemu/plugin.h | 1 +
>>> include/qemu/qemu-plugin.h | 43 ++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 44 insertions(+)
>>> diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
>>> index 9726a9ebf3..27a176b631 100644
>>> --- a/include/qemu/plugin.h
>>> +++ b/include/qemu/plugin.h
> <snip>
>>> +
>>> +/**
>>> + * typedef qemu_plugin_vcpu_discon_cb_t - vcpu discontinuity callback
>>> + * @vcpu_index: the current vcpu context
>>> + * @type: the type of discontinuity
>>> + * @from_pc: the source of the discontinuity, e.g. the PC before the
>>> + * transition
>>> + * @to_pc: the PC pointing to the next instruction to be executed
>>> + *
>>>
>> Missing those parameters when building doc.
>> include/qemu/qemu-plugin.h:198: warning: Function parameter or member 'id' not described in 'qemu_plugin_vcpu_discon_cb_t'
>> include/qemu/qemu-plugin.h:289: warning: Function parameter or member 'type' not described in 'qemu_plugin_register_vcpu_discon_cb'
>> 2 warnings as Errors
>
> Yes, I forgot about id. But type is clearly documented. Maybe the tool
> is confused about the name and thinks it's a reserved word or something?
> In that case I better change that to something else.
>
The type (qemu_plugin_discon_type) is documented, but the parameter
"type" is not. Even if the name is correctly chosen, you still need to
make it appear in the doc.
> And note to self: also test-biuld the docs next time.
>
I was bitten by this too when I started contributing to plugins, so no
worries. It's safer to keep the docs enabled (that's the configure
default), even if it adds some latency to the build, especially when
touching documented headers.
> Regards,
> Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-05 12:40 ` Julian Ganz
@ 2024-12-05 17:56 ` Pierrick Bouvier
2024-12-05 21:50 ` Julian Ganz
0 siblings, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-05 17:56 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
On 12/5/24 04:40, Julian Ganz wrote:
> Hi Pierrick,
>
> December 4, 2024 at 11:41 PM, "Pierrick Bouvier" wrote:
>> On 12/3/24 00:45, Julian Ganz wrote:
>>
>>>
>>> Hi,
>>> December 2, 2024 at 8:26 PM, "Julian Ganz" wrote:
>>>
>>>>
>>>> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
>>>> index 0fba36ae02..9c67374b7e 100644
>>>> --- a/include/qemu/qemu-plugin.h
>>>> +++ b/include/qemu/qemu-plugin.h
>>>> @@ -154,6 +154,49 @@ typedef void (*qemu_plugin_vcpu_simple_cb_t)(qemu_plugin_id_t id,
>>>>
>>> <snip>
>>>
>>>>
>>>> +/**
>>>> + * typedef qemu_plugin_vcpu_discon_cb_t - vcpu discontinuity callback
>>>> + * @vcpu_index: the current vcpu context
>>>> + * @type: the type of discontinuity
>>>> + * @from_pc: the source of the discontinuity, e.g. the PC before the
>>>> + * transition
>>>> + * @to_pc: the PC pointing to the next instruction to be executed
>>>> + *
>>>> + * The excact semantics of @from_pc depends on @the type of discontinuity. For
>>>> + * interrupts, @from_pc will point to the next instruction which would have
>>>> + * been executed. For exceptions and host calls, @from_pc will point to the
>>>> + * instruction that caused the exception or issued the host call. Note that
>>>> + * in the case of exceptions, the instruction is not retired and thus not
>>>> + * observable via general instruction exec callbacks. The same may be the case
>>>> + * for some host calls such as hypervisor call "exceptions".
>>>>
>>> Some more notes about this bit: I originally tried to make the from_pc
>>> semantics independent from the type of event, i.e. either of the two
>>> cases. I obviously did not succeed in doing so. As, in most cases, the
>>> instruction pointed to by from_pc is not observable via exec callbacks
>>> I could also not test this behaviour in the testing plugin (see patch
>>> 11). I am therefore in favor for dropping the from_pc for the next
>>> iteration of this patch series.
>>>
>> Does it mean that information returned should be dependent of type of event, as we previously discussed on v1?
>
> Yes, and I don't like it.
>
I respect your personal preference, but our conversation should be based
on arguments, and not only tastes.
The important thing, from my point of view, is that the API stays easy
to use and clear for the user. Having multiple callbacks is a headache,
because you can't clearly group them somewhere, and force the user to
implement all of them at once.
By having a single callback, we can force the user to handle all cases,
thanks to the type system. The user may decide to use "default: break;"
and that's ok, because they chose it deliberately.
I was, and I am still ok with the current approach, of having from/to
parameters and a "simple" callback type. But remove "from" because we
can't get it right in some cases does not seem the best decision.
Let's try to move forward, and solve the problems we have with from_pc.
The testing part can be solved already (as explained in a previous
message). In which cases can't you identify from_pc?
> Regards,
> Julian Ganz
By the way, and if you are open to talk about naming.
I understand why you picked up discontinuity, which is coming from a
mathematical background. However, I rarely see this term used in the
literature for computer science, and people use "exceptional control
flow" to qualify interrupts and traps. In more, when we'll integrate
classic control flow (including fallback between tb), the term
discontinuity will lose its meaning. For this reason, I think that
{cflow,CFLOW} makes more sense.
But, as there is some personal preference into this, I will leave the
choice up to you.
Thanks,
Pierrick
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-05 17:30 ` Pierrick Bouvier
@ 2024-12-05 21:22 ` Julian Ganz
2024-12-05 22:28 ` Pierrick Bouvier
0 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-05 21:22 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
Hi Pierrick,
December 5, 2024 at 6:30 PM, "Pierrick Bouvier" wrote:
> On 12/5/24 05:10, Julian Ganz wrote:
> > December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
> > > Trap target PC mismatch
> > > Expected: 23faf3a80
> > > Encountered: 23faf3a84
> > >
> > > From what I understand, it means that the next_pc we have is incorrect.
> > >
> > Yes, this is indeed incorrect, and also a perfect example why this test
> > plugin exists. There are likely other errors lurking in target specific
> > code. Did you happen to also log interrupts? Do you remember what image
> > you used?
> >
> I used exactly this:
>
> ./build/qemu-system-aarch64 -plugin ./build/tests/tcg/plugins/libdiscons.so -smp 4 -M virt -d plugin -m 8G -device virtio-blk-pci,drive=root -drive if=none,id=root,file=./debianaarch64.img -M virt -cpu max,pauth=off -drive if=pflash,readonly=on,file=/usr/share/AAVMF/AAVMF_CODE.fd -drive if=pflash,file=./AAVMF_VARS.fd
>
> The arm64 image is a vanilla debian stable I installed.
> AAVMF* files come from qemu-efi-aarch64 debian package.
Thanks! I will have a closer look and include a fix in the next iteration.
> > Btw: this also illustrates another issue I have with from_pc: we can
> > test the behavior for to_pc, but doing this meaningfully for from_pc
> > via a plugin is next to impossible because the instruction it points to
> > is not observable via an exec callback. At least not in the general
> > case, even not if we only consider a single type of event.
> >
> We can store the next_expected pc for each instruction (from current_instruction + insn_length), and we should be able to compare that with the expected from_pc.
> This is mostly what contrib/plugins/cflow.c does.
>
> With that, we can test from_pc.
I'm not confident that this will work reliably for branch, jump and
other "interesting" instructions. But I can have a closer look at the
cflow plugin and try to figure out how that plugin handles those cases.
Regards,
Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-05 17:35 ` Pierrick Bouvier
@ 2024-12-05 21:25 ` Julian Ganz
0 siblings, 0 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-05 21:25 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Hi Pierrick,
December 5, 2024 at 6:35 PM, "Pierrick Bouvier" wrote:
> On 12/5/24 04:44, Julian Ganz wrote:
>
> >
> > Hi Pierrick,
> > December 4, 2024 at 11:45 PM, "Pierrick Bouvier" wrote:
> >
> > >
> > > On 12/2/24 11:26, Julian Ganz wrote:
> > >
> > include/qemu/plugin.h | 1 +
> > include/qemu/qemu-plugin.h | 43 ++++++++++++++++++++++++++++++++++++++
> > 2 files changed, 44 insertions(+)
> > diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
> > index 9726a9ebf3..27a176b631 100644
> > --- a/include/qemu/plugin.h
> > +++ b/include/qemu/plugin.h
> > <snip>
> > +
> > +/**
> > + * typedef qemu_plugin_vcpu_discon_cb_t - vcpu discontinuity callback
> > + * @vcpu_index: the current vcpu context
> > + * @type: the type of discontinuity
> > + * @from_pc: the source of the discontinuity, e.g. the PC before the
> > + * transition
> > + * @to_pc: the PC pointing to the next instruction to be executed
> > + *
> >
> > >
> > > Missing those parameters when building doc.
> > > include/qemu/qemu-plugin.h:198: warning: Function parameter or member 'id' not described in 'qemu_plugin_vcpu_discon_cb_t'
> > > include/qemu/qemu-plugin.h:289: warning: Function parameter or member 'type' not described in 'qemu_plugin_register_vcpu_discon_cb'
> > > 2 warnings as Errors
> > >
> > Yes, I forgot about id. But type is clearly documented. Maybe the tool
> > is confused about the name and thinks it's a reserved word or something?
> > In that case I better change that to something else.
> >
> The type (qemu_plugin_discon_type) is documented, but the parameter "type" is not. Even if the name is correctly chosen, you still need to make it appear in the doc.
Yes. I didn't realize that the two warnings were for different entities
because I somehow failed to parse the entire line.
Regards,
Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-05 17:56 ` Pierrick Bouvier
@ 2024-12-05 21:50 ` Julian Ganz
2024-12-05 22:14 ` Julian Ganz
2024-12-05 23:03 ` Pierrick Bouvier
0 siblings, 2 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-05 21:50 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Hi Pierrick,
December 5, 2024 at 6:56 PM, "Pierrick Bouvier" wrote:
> On 12/5/24 04:40, Julian Ganz wrote:
>
> >
> > Hi Pierrick,
> > December 4, 2024 at 11:41 PM, "Pierrick Bouvier" wrote:
> > > Does it mean that information returned should be dependent of type of event, as we previously discussed on v1?
> > >
> > Yes, and I don't like it.
> >
> I respect your personal preference, but our conversation should be based on arguments, and not only tastes.
>
> The important thing, from my point of view, is that the API stays easy to use and clear for the user. Having multiple callbacks is a headache, because you can't clearly group them somewhere, and force the user to implement all of them at once.
Having only one callback is not something I'm against.
> I was, and I am still ok with the current approach, of having from/to parameters and a "simple" callback type. But remove "from" because we can't get it right in some cases does not seem the best decision.
If you cannot rely on an input being a sensible value, doesn't that
render the input useless?
> Let's try to move forward, and solve the problems we have with from_pc. The testing part can be solved already (as explained in a previous message). In which cases can't you identify from_pc?
I'll have to check, but problems that I discussed with a colleague
included jumps to an unmapped page resulting in the appropriate
exception. We ultimately agreed that in such a situation from_pc should
point to the jump target inside the unmapped page, instead of, say, the
jump. We assume that most targets should already behave this way without
further changes. However, in order to compute the correct from_pc, we
need to know the jump target before the exception is raised (i.e. right
after the jump instruction is executed), and that's not necessarily
straight-forward to do in a plugin.
But as I wrote before in another message, I need to take another look at
the cflow plugin.
> By the way, and if you are open to talk about naming.
I'm open to suggestions.
> I understand why you picked up discontinuity, which is coming from a mathematical background. However, I rarely see this term used in the literature for computer science, and people use "exceptional control flow" to qualify interrupts and traps. In more, when we'll integrate classic control flow (including fallback between tb), the term discontinuity will lose its meaning. For this reason, I think that {cflow,CFLOW} makes more sense.
Using the term "discontinuity" was, in fact, inspired by "uninferable PC
discontinuities" defined in the RISC-V ETrace spec [1], arguably a
technical document. We chose discontinuity over the notion of control
flow because the PC is not the (only) thing affected by the event. In
the case of host calls, we ideally don't even observe an effect on the
PC. Thus control flow doesn't really fit the bill for those.
Regards,
Julian Ganz
[1] https://github.com/riscv-non-isa/riscv-trace-spec
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-05 21:50 ` Julian Ganz
@ 2024-12-05 22:14 ` Julian Ganz
2024-12-05 23:03 ` Pierrick Bouvier
1 sibling, 0 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-05 22:14 UTC (permalink / raw)
To: Julian Ganz, Pierrick Bouvier, qemu-devel
Hi Pierrick,
December 5, 2024 at 10:50 PM, "Julian Ganz" wrote:
> December 5, 2024 at 6:56 PM, "Pierrick Bouvier" wrote:
> > Let's try to move forward, and solve the problems we have with from_pc. The testing part can be solved already (as explained in a previous message). In which cases can't you identify from_pc?
> >
> I'll have to check, but problems that I discussed with a colleague
> included jumps to an unmapped page resulting in the appropriate
> exception. We ultimately agreed that in such a situation from_pc should
> point to the jump target inside the unmapped page, instead of, say, the
> jump. We assume that most targets should already behave this way without
> further changes. However, in order to compute the correct from_pc, we
> need to know the jump target before the exception is raised (i.e. right
> after the jump instruction is executed), and that's not necessarily
> straight-forward to do in a plugin.
Just remembered the joyful existence of double traps.
Regards,
Julian
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-05 21:22 ` Julian Ganz
@ 2024-12-05 22:28 ` Pierrick Bouvier
2024-12-06 8:42 ` Julian Ganz
0 siblings, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-05 22:28 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/5/24 13:22, Julian Ganz wrote:
> Hi Pierrick,
>
> December 5, 2024 at 6:30 PM, "Pierrick Bouvier" wrote:
>> On 12/5/24 05:10, Julian Ganz wrote:
>>> December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
>>>> Trap target PC mismatch
>>>> Expected: 23faf3a80
>>>> Encountered: 23faf3a84
>>>>
>>>> From what I understand, it means that the next_pc we have is incorrect.
>>>>
>>> Yes, this is indeed incorrect, and also a perfect example why this test
>>> plugin exists. There are likely other errors lurking in target specific
>>> code. Did you happen to also log interrupts? Do you remember what image
>>> you used?
>>>
>> I used exactly this:
>>
>> ./build/qemu-system-aarch64 -plugin ./build/tests/tcg/plugins/libdiscons.so -smp 4 -M virt -d plugin -m 8G -device virtio-blk-pci,drive=root -drive if=none,id=root,file=./debianaarch64.img -M virt -cpu max,pauth=off -drive if=pflash,readonly=on,file=/usr/share/AAVMF/AAVMF_CODE.fd -drive if=pflash,file=./AAVMF_VARS.fd
>>
>> The arm64 image is a vanilla debian stable I installed.
>> AAVMF* files come from qemu-efi-aarch64 debian package.
>
> Thanks! I will have a closer look and include a fix in the next iteration.
>
>>> Btw: this also illustrates another issue I have with from_pc: we can
>>> test the behavior for to_pc, but doing this meaningfully for from_pc
>>> via a plugin is next to impossible because the instruction it points to
>>> is not observable via an exec callback. At least not in the general
>>> case, even not if we only consider a single type of event.
>>>
>> We can store the next_expected pc for each instruction (from current_instruction + insn_length), and we should be able to compare that with the expected from_pc.
>> This is mostly what contrib/plugins/cflow.c does.
>>
>> With that, we can test from_pc.
>
> I'm not confident that this will work reliably for branch, jump and
> other "interesting" instructions. But I can have a closer look at the
> cflow plugin and try to figure out how that plugin handles those cases.
>
It won't work for latest instructions in a tb (because we don't know
what will be the next one), but should be good for all the others cases.
> Regards,
> Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-05 21:50 ` Julian Ganz
2024-12-05 22:14 ` Julian Ganz
@ 2024-12-05 23:03 ` Pierrick Bouvier
2024-12-06 8:58 ` Julian Ganz
1 sibling, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-05 23:03 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
On 12/5/24 13:50, Julian Ganz wrote:
> Hi Pierrick,
>
> December 5, 2024 at 6:56 PM, "Pierrick Bouvier" wrote:
>> On 12/5/24 04:40, Julian Ganz wrote:
>>
>>>
>>> Hi Pierrick,
>>> December 4, 2024 at 11:41 PM, "Pierrick Bouvier" wrote:
>>>> Does it mean that information returned should be dependent of type of event, as we previously discussed on v1?
>>>>
>>> Yes, and I don't like it.
>>>
>> I respect your personal preference, but our conversation should be based on arguments, and not only tastes.
>>
>> The important thing, from my point of view, is that the API stays easy to use and clear for the user. Having multiple callbacks is a headache, because you can't clearly group them somewhere, and force the user to implement all of them at once.
>
> Having only one callback is not something I'm against.
>
>> I was, and I am still ok with the current approach, of having from/to parameters and a "simple" callback type. But remove "from" because we can't get it right in some cases does not seem the best decision.
>
> If you cannot rely on an input being a sensible value, doesn't that
> render the input useless?
>
I agree. If for a specific event it's impossible to provide a value
(i.e. the value has no meaning for a real cpu), it will just point that
we need several types of data per event, and the compromise of having a
single callback won't be possible.
We should differentiate "it's hard to find this value in QEMU" vs "this
value does not exist in real life". The first can be solved if we put
effort into it. And every time a cpu changes it's flow of execution, it
makes sense to find where it was just before.
One of the end goals is to be able to build a full control flow graph,
with edges labeled on transition type (exceptions, traps, interrupts,
jump, fallback), which we can do with the triple {event,from,to}.
>> Let's try to move forward, and solve the problems we have with from_pc. The testing part can be solved already (as explained in a previous message). In which cases can't you identify from_pc?
>
> I'll have to check, but problems that I discussed with a colleague
> included jumps to an unmapped page resulting in the appropriate
> exception. We ultimately agreed that in such a situation from_pc should
> point to the jump target inside the unmapped page, instead of, say, the
> jump. We assume that most targets should already behave this way without
> further changes. However, in order to compute the correct from_pc, we
> need to know the jump target before the exception is raised (i.e. right
> after the jump instruction is executed), and that's not necessarily
> straight-forward to do in a plugin.
It's an interesting conversation. For the scope of this series, I agree
you should use the jump target, which triggered the trap.
In fine, transitions should simply follow what the cpu does.
- orig_insn: jump to A
- jump_target: execute A traps
- page_fault: load page
- jump_target: come back to A
event(JUMP, orig_insn, jump_target) // not covered by this series
event(EXCEPTION, jump_target, page_fault)
... execute page_fault (with potential other transitions)
event(JUMP, end_page_fault, jump_target)
In the case of a double trap, we could follow the same logic, and
represent the original transition that lead to the trap, and the two
consecutive traps.
Does it make sense?
>
> But as I wrote before in another message, I need to take another look at
> the cflow plugin.
>
>> By the way, and if you are open to talk about naming.
>
> I'm open to suggestions.
>
>> I understand why you picked up discontinuity, which is coming from a mathematical background. However, I rarely see this term used in the literature for computer science, and people use "exceptional control flow" to qualify interrupts and traps. In more, when we'll integrate classic control flow (including fallback between tb), the term discontinuity will lose its meaning. For this reason, I think that {cflow,CFLOW} makes more sense.
>
> Using the term "discontinuity" was, in fact, inspired by "uninferable PC
> discontinuities" defined in the RISC-V ETrace spec [1], arguably a
> technical document. We chose discontinuity over the notion of control
> flow because the PC is not the (only) thing affected by the event. In
> the case of host calls, we ideally don't even observe an effect on the
> PC. Thus control flow doesn't really fit the bill for those.
>
I'm happy to read this spec reinvents a term clearly defined elsewhere
(sigh...). Beyond the intellectual vocabulary creation pleasure, it's
good to use term people can find on Google.
We can use this term though, PC discontinuity makes sense, and a
fallback is a special case of discontinuity if we adopt this
perspective. Thanks for explaining!
> Regards,
> Julian Ganz
>
> [1] https://github.com/riscv-non-isa/riscv-trace-spec
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-05 22:28 ` Pierrick Bouvier
@ 2024-12-06 8:42 ` Julian Ganz
2024-12-06 19:02 ` Pierrick Bouvier
0 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-06 8:42 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
Hi Pierrick,
December 5, 2024 at 11:28 PM, "Pierrick Bouvier" wrote:
> On 12/5/24 13:22, Julian Ganz wrote:
> > December 5, 2024 at 6:30 PM, "Pierrick Bouvier" wrote:
> > > We can store the next_expected pc for each instruction (from current_instruction + insn_length), and we should be able to compare that with the expected from_pc.
> > > This is mostly what contrib/plugins/cflow.c does.
> > >
> > > With that, we can test from_pc.
> > >
> > I'm not confident that this will work reliably for branch, jump and
> > other "interesting" instructions. But I can have a closer look at the
> > cflow plugin and try to figure out how that plugin handles those cases.
> >
> It won't work for latest instructions in a tb (because we don't know what will be the next one), but should be good for all the others cases.
IIUC qemu will schedule interrupts "opportunistically" between tb
executions. If that's the case we'll observe interrupts exclusively
after the last instruction in a tb. That strikes me as a serious
limitation.
Regards,
Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-05 23:03 ` Pierrick Bouvier
@ 2024-12-06 8:58 ` Julian Ganz
2024-12-06 18:59 ` Pierrick Bouvier
0 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-06 8:58 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Hi Pierrick,
December 6, 2024 at 12:03 AM, "Pierrick Bouvier" wrote:
> On 12/5/24 13:50, Julian Ganz wrote:
> > If you cannot rely on an input being a sensible value, doesn't that
> > render the input useless?
> >
> I agree. If for a specific event it's impossible to provide a value (i.e. the value has no meaning for a real cpu), it will just point that we need several types of data per event, and the compromise of having a single callback won't be possible.
>
> We should differentiate "it's hard to find this value in QEMU" vs "this value does not exist in real life". The first can be solved if we put effort into it. And every time a cpu changes it's flow of execution, it makes sense to find where it was just before.
>
> One of the end goals is to be able to build a full control flow graph, with edges labeled on transition type (exceptions, traps, interrupts, jump, fallback), which we can do with the triple {event,from,to}.
I agree that that triple is sensible for any event type and likely
useful to plugin authors. At least if the semantics are sufficiently
uniform among event types. However, I also feel that given the actual
implementation (hooks sprinkled over target specific code) this is not
easily achievable reliably. At least testability should be a hard
requirement. Otherwise the API's reliability will inevitably deteriorate
over time without any way to tell how bad the situation got.
> > > Let's try to move forward, and solve the problems we have with from_pc. The testing part can be solved already (as explained in a previous message). In which cases can't you identify from_pc?
> > >
> > I'll have to check, but problems that I discussed with a colleague
> > included jumps to an unmapped page resulting in the appropriate
> > exception. We ultimately agreed that in such a situation from_pc should
> > point to the jump target inside the unmapped page, instead of, say, the
> > jump. We assume that most targets should already behave this way without
> > further changes. However, in order to compute the correct from_pc, we
> > need to know the jump target before the exception is raised (i.e. right
> > after the jump instruction is executed), and that's not necessarily
> > straight-forward to do in a plugin.
> >
> It's an interesting conversation. For the scope of this series, I agree you should use the jump target, which triggered the trap.
>
> In fine, transitions should simply follow what the cpu does.
>
> - orig_insn: jump to A
> - jump_target: execute A traps
> - page_fault: load page
> - jump_target: come back to A
>
> event(JUMP, orig_insn, jump_target) // not covered by this series
> event(EXCEPTION, jump_target, page_fault)
> ... execute page_fault (with potential other transitions)
> event(JUMP, end_page_fault, jump_target)
>
> In the case of a double trap, we could follow the same logic, and represent the original transition that lead to the trap, and the two consecutive traps.
>
> Does it make sense?
Yes, those transitions are correct imo. And if a jump event should be
introduced at some point, the call sequence would look like that. My
issue is that testing this (in a plugin) will not be straight forward
or even impossible. And overly complex tests don't exactly provoke
confidence.
Regards,
Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-06 8:58 ` Julian Ganz
@ 2024-12-06 18:59 ` Pierrick Bouvier
2024-12-07 13:38 ` Julian Ganz
0 siblings, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-06 18:59 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
On 12/6/24 00:58, Julian Ganz wrote:
> Hi Pierrick,
>
> December 6, 2024 at 12:03 AM, "Pierrick Bouvier" wrote:
>> On 12/5/24 13:50, Julian Ganz wrote:
>>> If you cannot rely on an input being a sensible value, doesn't that
>>> render the input useless?
>>>
>> I agree. If for a specific event it's impossible to provide a value (i.e. the value has no meaning for a real cpu), it will just point that we need several types of data per event, and the compromise of having a single callback won't be possible.
>>
>> We should differentiate "it's hard to find this value in QEMU" vs "this value does not exist in real life". The first can be solved if we put effort into it. And every time a cpu changes it's flow of execution, it makes sense to find where it was just before.
>>
>> One of the end goals is to be able to build a full control flow graph, with edges labeled on transition type (exceptions, traps, interrupts, jump, fallback), which we can do with the triple {event,from,to}.
>
> I agree that that triple is sensible for any event type and likely
> useful to plugin authors. At least if the semantics are sufficiently
> uniform among event types. However, I also feel that given the actual
> implementation (hooks sprinkled over target specific code) this is not
> easily achievable reliably. At least testability should be a hard
> requirement. Otherwise the API's reliability will inevitably deteriorate
> over time without any way to tell how bad the situation got.
>
>>>> Let's try to move forward, and solve the problems we have with from_pc. The testing part can be solved already (as explained in a previous message). In which cases can't you identify from_pc?
>>>>
>>> I'll have to check, but problems that I discussed with a colleague
>>> included jumps to an unmapped page resulting in the appropriate
>>> exception. We ultimately agreed that in such a situation from_pc should
>>> point to the jump target inside the unmapped page, instead of, say, the
>>> jump. We assume that most targets should already behave this way without
>>> further changes. However, in order to compute the correct from_pc, we
>>> need to know the jump target before the exception is raised (i.e. right
>>> after the jump instruction is executed), and that's not necessarily
>>> straight-forward to do in a plugin.
>>>
>> It's an interesting conversation. For the scope of this series, I agree you should use the jump target, which triggered the trap.
>>
>> In fine, transitions should simply follow what the cpu does.
>>
>> - orig_insn: jump to A
>> - jump_target: execute A traps
>> - page_fault: load page
>> - jump_target: come back to A
>>
>> event(JUMP, orig_insn, jump_target) // not covered by this series
>> event(EXCEPTION, jump_target, page_fault)
>> ... execute page_fault (with potential other transitions)
>> event(JUMP, end_page_fault, jump_target)
>>
>> In the case of a double trap, we could follow the same logic, and represent the original transition that lead to the trap, and the two consecutive traps.
>>
>> Does it make sense?
>
> Yes, those transitions are correct imo. And if a jump event should be
> introduced at some point, the call sequence would look like that. My
> issue is that testing this (in a plugin) will not be straight forward
> or even impossible. And overly complex tests don't exactly provoke
> confidence.
>
Instruction instrumentation is done before executing the instruction
itself, as you can see by running:
./build/qemu-x86_64 -plugin build/tests/tcg/plugins/libinsn.so -d
in_asm,op /usr/bin/true
I'm not entirely sure about the sequence when there is an exception
while fetching the instruction though. You can give it a try, track the
PC using insn instrumentation, and we can identify which cases are not
working.
The test plugin itself is not complicated.
You'll need:
- one callback per instruction to set the expected pc (possibly
optimized with inline operation), used to compare to from_pc, and we
check if (optional) to_pc matches the current instruction.
- when the callback for discontinuity is called, we check if from_pc is
matching, and register the next expected with to_pc.
We can then add tests targeting supported architectures using the
plugin, and ensuring it never fails.
It's hard to know we don't miss events though. Except if we write manual
assembly system mode tests, that trigger the expected behaviour. But it
would be tedious, and I'm really not sure there is a real value with
reduced examples like this.
> Regards,
> Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-06 8:42 ` Julian Ganz
@ 2024-12-06 19:02 ` Pierrick Bouvier
2024-12-06 19:42 ` Richard Henderson
0 siblings, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-06 19:02 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/6/24 00:42, Julian Ganz wrote:
> Hi Pierrick,
>
> December 5, 2024 at 11:28 PM, "Pierrick Bouvier" wrote:
>> On 12/5/24 13:22, Julian Ganz wrote:
>>> December 5, 2024 at 6:30 PM, "Pierrick Bouvier" wrote:
>>>> We can store the next_expected pc for each instruction (from current_instruction + insn_length), and we should be able to compare that with the expected from_pc.
>>>> This is mostly what contrib/plugins/cflow.c does.
>>>>
>>>> With that, we can test from_pc.
>>>>
>>> I'm not confident that this will work reliably for branch, jump and
>>> other "interesting" instructions. But I can have a closer look at the
>>> cflow plugin and try to figure out how that plugin handles those cases.
>>>
>> It won't work for latest instructions in a tb (because we don't know what will be the next one), but should be good for all the others cases.
>
> IIUC qemu will schedule interrupts "opportunistically" between tb
> executions. If that's the case we'll observe interrupts exclusively
> after the last instruction in a tb. That strikes me as a serious
> limitation.
>
To reuse fancy vocabulary, maybe we should have a distinction between
inferable interruptions (interrupt instruction) and uninferable
interrupts, triggered by an external event.
In the latter, it *might* be acceptable to not provide a from_pc (let's
say a value 0), because there is no useful information in itself, except
creating random edges in the control flow graph, which we don't want to do.
What do you think of it?
> Regards,
> Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-06 19:02 ` Pierrick Bouvier
@ 2024-12-06 19:42 ` Richard Henderson
2024-12-06 20:40 ` Pierrick Bouvier
2024-12-07 13:41 ` Julian Ganz
0 siblings, 2 replies; 70+ messages in thread
From: Richard Henderson @ 2024-12-06 19:42 UTC (permalink / raw)
To: Pierrick Bouvier, Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/6/24 13:02, Pierrick Bouvier wrote:
> On 12/6/24 00:42, Julian Ganz wrote:
>> Hi Pierrick,
>>
>> December 5, 2024 at 11:28 PM, "Pierrick Bouvier" wrote:
>>> On 12/5/24 13:22, Julian Ganz wrote:
>>>> December 5, 2024 at 6:30 PM, "Pierrick Bouvier" wrote:
>>>>> We can store the next_expected pc for each instruction (from current_instruction +
>>>>> insn_length), and we should be able to compare that with the expected from_pc.
>>>>> This is mostly what contrib/plugins/cflow.c does.
>>>>>
>>>>> With that, we can test from_pc.
>>>>>
>>>> I'm not confident that this will work reliably for branch, jump and
>>>> other "interesting" instructions. But I can have a closer look at the
>>>> cflow plugin and try to figure out how that plugin handles those cases.
>>>>
>>> It won't work for latest instructions in a tb (because we don't know what will be the
>>> next one), but should be good for all the others cases.
>>
>> IIUC qemu will schedule interrupts "opportunistically" between tb
>> executions. If that's the case we'll observe interrupts exclusively
>> after the last instruction in a tb. That strikes me as a serious
>> limitation.
>>
>
> To reuse fancy vocabulary, maybe we should have a distinction between inferable
> interruptions (interrupt instruction) and uninferable interrupts, triggered by an external
> event.
>
> In the latter, it *might* be acceptable to not provide a from_pc (let's say a value 0),
> because there is no useful information in itself, except creating random edges in the
> control flow graph, which we don't want to do.
>
> What do you think of it?
I think you both are over-complicating things.
Always, env->pc (or whatever) within cc->cpu_exec_interrupt *is* where the interrupt is
recognized, and *is* where the discontinuity occurs. Report that.
Just because some device interrupts are deferred to the end of the TB, that makes no
difference. There is no "right" answer for timing between execution and asynchronous
interrupts.
r~
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-06 19:42 ` Richard Henderson
@ 2024-12-06 20:40 ` Pierrick Bouvier
2024-12-06 22:56 ` Richard Henderson
2024-12-07 13:41 ` Julian Ganz
1 sibling, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-06 20:40 UTC (permalink / raw)
To: Richard Henderson, Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/6/24 11:42, Richard Henderson wrote:
> On 12/6/24 13:02, Pierrick Bouvier wrote:
>> On 12/6/24 00:42, Julian Ganz wrote:
>>> Hi Pierrick,
>>>
>>> December 5, 2024 at 11:28 PM, "Pierrick Bouvier" wrote:
>>>> On 12/5/24 13:22, Julian Ganz wrote:
>>>>> December 5, 2024 at 6:30 PM, "Pierrick Bouvier" wrote:
>>>>>> We can store the next_expected pc for each instruction (from current_instruction +
>>>>>> insn_length), and we should be able to compare that with the expected from_pc.
>>>>>> This is mostly what contrib/plugins/cflow.c does.
>>>>>>
>>>>>> With that, we can test from_pc.
>>>>>>
>>>>> I'm not confident that this will work reliably for branch, jump and
>>>>> other "interesting" instructions. But I can have a closer look at the
>>>>> cflow plugin and try to figure out how that plugin handles those cases.
>>>>>
>>>> It won't work for latest instructions in a tb (because we don't know what will be the
>>>> next one), but should be good for all the others cases.
>>>
>>> IIUC qemu will schedule interrupts "opportunistically" between tb
>>> executions. If that's the case we'll observe interrupts exclusively
>>> after the last instruction in a tb. That strikes me as a serious
>>> limitation.
>>>
>>
>> To reuse fancy vocabulary, maybe we should have a distinction between inferable
>> interruptions (interrupt instruction) and uninferable interrupts, triggered by an external
>> event.
>>
>> In the latter, it *might* be acceptable to not provide a from_pc (let's say a value 0),
>> because there is no useful information in itself, except creating random edges in the
>> control flow graph, which we don't want to do.
>>
>> What do you think of it?
>
> I think you both are over-complicating things.
>
> Always, env->pc (or whatever) within cc->cpu_exec_interrupt *is* where the interrupt is
> recognized, and *is* where the discontinuity occurs. Report that.
>
Do we have an architecture agnostic pc representation, or do we have to
add this for every target in {arch}_cpu_exec_interrupt?
Beyond the scope of interruptions, are we guaranteed this instruction
pointer (per arch) is always updated between instructions? Any corner cases?
> Just because some device interrupts are deferred to the end of the TB, that makes no
> difference. There is no "right" answer for timing between execution and asynchronous
> interrupts.
>
>
> r~
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-06 20:40 ` Pierrick Bouvier
@ 2024-12-06 22:56 ` Richard Henderson
2024-12-07 13:47 ` Julian Ganz
0 siblings, 1 reply; 70+ messages in thread
From: Richard Henderson @ 2024-12-06 22:56 UTC (permalink / raw)
To: Pierrick Bouvier, Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/6/24 14:40, Pierrick Bouvier wrote:
> Do we have an architecture agnostic pc representation, or do we have to add this for every
> target in {arch}_cpu_exec_interrupt?
We have CPUClass.get_pc, which is almost certainly what you want.
The call to TCGCPUOps.cpu_exec_interrupt is only a hint that an interrupt might be ready:
interrupts can still be masked, etc.
From the current bool return value you can tell if a discontinuity actually occurred.
But if you want to categorize that event in any way you need to update each architecture.
You could simplify such updates by changing the return type from bool to an enum. While
you would have to simultaneously update all targets for the change in function signature,
if you select enumerators such that 0 -> no-op and 1 -> uncategorized, then you can also
tell if a target has been updated. Because this is still C, the current return true/false
statements will Just Work. :-)
On the other hand, the previous patches to add plugin calls to each cpu_exec_interrupt are
in the end approximately the same level of difficulty, and is more straightforward, so YMMV.
> Beyond the scope of interruptions, are we guaranteed this instruction pointer (per arch)
> is always updated between instructions? Any corner cases?
Not "between instructions" or even "between TB". But you are guaranteed that pc is
updated by the time we get to cpu_handle_interrupt, where cpu_exec_interrupt is called.
r~
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-06 18:59 ` Pierrick Bouvier
@ 2024-12-07 13:38 ` Julian Ganz
2024-12-09 18:52 ` Pierrick Bouvier
0 siblings, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2024-12-07 13:38 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Hi Pierrick,
December 6, 2024 at 7:59 PM, "Pierrick Bouvier" wrote:
> On 12/6/24 00:58, Julian Ganz wrote:
> > December 6, 2024 at 12:03 AM, "Pierrick Bouvier" wrote:
> > > It's an interesting conversation. For the scope of this series, I agree you should use the jump target, which triggered the trap.
> > >
> > > In fine, transitions should simply follow what the cpu does.
> > >
> > > - orig_insn: jump to A
> > > - jump_target: execute A traps
> > > - page_fault: load page
> > > - jump_target: come back to A
> > >
> > > event(JUMP, orig_insn, jump_target) // not covered by this series
> > > event(EXCEPTION, jump_target, page_fault)
> > > ... execute page_fault (with potential other transitions)
> > > event(JUMP, end_page_fault, jump_target)
> > >
> > > In the case of a double trap, we could follow the same logic, and represent the original transition that lead to the trap, and the two consecutive traps.
> > >
> > > Does it make sense?
> > >
> > Yes, those transitions are correct imo. And if a jump event should be
> > introduced at some point, the call sequence would look like that. My
> > issue is that testing this (in a plugin) will not be straight forward
> > or even impossible. And overly complex tests don't exactly provoke
> > confidence.
> >
> Instruction instrumentation is done before executing the instruction itself, as you can see by running:
> ./build/qemu-x86_64 -plugin build/tests/tcg/plugins/libinsn.so -d in_asm,op /usr/bin/true
>
> I'm not entirely sure about the sequence when there is an exception while fetching the instruction though. You can give it a try, track the PC using insn instrumentation, and we can identify which cases are not working.
>
> The test plugin itself is not complicated.
> You'll need:
> - one callback per instruction to set the expected pc (possibly optimized with inline operation), used to compare to from_pc, and we check if (optional) to_pc matches the current instruction.
What I'm saying ist that this exactly is not feasible for quite a few
relevant instructions. As someone who isn't all too intimate with TCG
itself, it's not even clear if we can rely on jump and branch
instructions only occuring at the end of a tb. At least a superficial
glance at the documentation tells me we can, but if this should in fact
not be the case, or if someone introduces something like zero overhead
loops inside a tb, all bets are off.
> - when the callback for discontinuity is called, we check if from_pc is matching, and register the next expected with to_pc.
>
> We can then add tests targeting supported architectures using the plugin, and ensuring it never fails.
> It's hard to know we don't miss events though. Except if we write manual assembly system mode tests, that trigger the expected behaviour. But it would be tedious, and I'm really not sure there is a real value with reduced examples like this.
So currently my thinking goes in the direction of having the plugin
print a warning every time we don't have an expected from_pc to check
against. Probably also have this be a feature you can deactivate.
Regards,
Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-06 19:42 ` Richard Henderson
2024-12-06 20:40 ` Pierrick Bouvier
@ 2024-12-07 13:41 ` Julian Ganz
1 sibling, 0 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-07 13:41 UTC (permalink / raw)
To: Richard Henderson, Pierrick Bouvier, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
Hi Richard,
December 6, 2024 at 8:42 PM, "Richard Henderson" wrote:
> On 12/6/24 13:02, Pierrick Bouvier wrote:
> > On 12/6/24 00:42, Julian Ganz wrote:
> > > IIUC qemu will schedule interrupts "opportunistically" between tb
> > > executions. If that's the case we'll observe interrupts exclusively
> > > after the last instruction in a tb. That strikes me as a serious
> > > limitation.
> > >
> > To reuse fancy vocabulary, maybe we should have a distinction between inferable > interruptions (interrupt instruction) and uninferable interrupts, triggered by an external > event.
> > In the latter, it *might* be acceptable to not provide a from_pc (let's say a value 0), > because there is no useful information in itself, except creating random edges in the > control flow graph, which we don't want to do.
> > What do you think of it?
> >
> I think you both are over-complicating things.
>
> Always, env->pc (or whatever) within cc->cpu_exec_interrupt *is* where the interrupt is recognized, and *is* where the discontinuity occurs. Report that.
Glad to hear. This means what I naïvely did for most targets should be
correct at least in this regard.
Regards,
Julian
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-06 22:56 ` Richard Henderson
@ 2024-12-07 13:47 ` Julian Ganz
0 siblings, 0 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-07 13:47 UTC (permalink / raw)
To: Richard Henderson, Pierrick Bouvier, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
Hi Richard,
December 6, 2024 at 11:56 PM, "Richard Henderson" wrote:
> On 12/6/24 14:40, Pierrick Bouvier wrote:
>
> >
> > Do we have an architecture agnostic pc representation, or do we have to add this for every > target in {arch}_cpu_exec_interrupt?
> >
> We have CPUClass.get_pc, which is almost certainly what you want.
I was wondering about this but honestly failed to do so until the
current series was done. I'll definitely use this for to_pc in the next
iteration. I'll still need to gather from_pc "manually", but that's ok.
> The call to TCGCPUOps.cpu_exec_interrupt is only a hint that an interrupt might be ready: interrupts can still be masked, etc.
>
> From the current bool return value you can tell if a discontinuity actually occurred. But if you want to categorize that event in any way you need to update each architecture.
>
> You could simplify such updates by changing the return type from bool to an enum. While you would have to simultaneously update all targets for the change in function signature, if you select enumerators such that 0 -> no-op and 1 -> uncategorized, then you can also tell if a target has been updated. Because this is still C, the current return true/false statements will Just Work. :-)
>
> On the other hand, the previous patches to add plugin calls to each cpu_exec_interrupt are in the end approximately the same level of difficulty, and is more straightforward, so YMMV.
Good to hear. I think I'll stick to the current setup. It likely also
probably makes things easier or less awkward if the API is extended to
cover things like branches and jumps in the future.
Regards,
Julian
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-07 13:38 ` Julian Ganz
@ 2024-12-09 18:52 ` Pierrick Bouvier
0 siblings, 0 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-09 18:52 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
On 12/7/24 05:38, Julian Ganz wrote:
> Hi Pierrick,
>
> December 6, 2024 at 7:59 PM, "Pierrick Bouvier" wrote:
>> On 12/6/24 00:58, Julian Ganz wrote:
>>> December 6, 2024 at 12:03 AM, "Pierrick Bouvier" wrote:
>>>> It's an interesting conversation. For the scope of this series, I agree you should use the jump target, which triggered the trap.
>>>>
>>>> In fine, transitions should simply follow what the cpu does.
>>>>
>>>> - orig_insn: jump to A
>>>> - jump_target: execute A traps
>>>> - page_fault: load page
>>>> - jump_target: come back to A
>>>>
>>>> event(JUMP, orig_insn, jump_target) // not covered by this series
>>>> event(EXCEPTION, jump_target, page_fault)
>>>> ... execute page_fault (with potential other transitions)
>>>> event(JUMP, end_page_fault, jump_target)
>>>>
>>>> In the case of a double trap, we could follow the same logic, and represent the original transition that lead to the trap, and the two consecutive traps.
>>>>
>>>> Does it make sense?
>>>>
>>> Yes, those transitions are correct imo. And if a jump event should be
>>> introduced at some point, the call sequence would look like that. My
>>> issue is that testing this (in a plugin) will not be straight forward
>>> or even impossible. And overly complex tests don't exactly provoke
>>> confidence.
>>>
>> Instruction instrumentation is done before executing the instruction itself, as you can see by running:
>> ./build/qemu-x86_64 -plugin build/tests/tcg/plugins/libinsn.so -d in_asm,op /usr/bin/true
>>
>> I'm not entirely sure about the sequence when there is an exception while fetching the instruction though. You can give it a try, track the PC using insn instrumentation, and we can identify which cases are not working.
>>
>> The test plugin itself is not complicated.
>> You'll need:
>> - one callback per instruction to set the expected pc (possibly optimized with inline operation), used to compare to from_pc, and we check if (optional) to_pc matches the current instruction.
>
> What I'm saying ist that this exactly is not feasible for quite a few
> relevant instructions. As someone who isn't all too intimate with TCG
> itself, it's not even clear if we can rely on jump and branch
> instructions only occuring at the end of a tb. At least a superficial
> glance at the documentation tells me we can, but if this should in fact
> not be the case, or if someone introduces something like zero overhead
> loops inside a tb, all bets are off.
>
TCG already has the possibility to generate jumps inside a TB
(conditional branches - brcond, is an example).
However, for the scope of this series, it does not matter.
We only have to check the from_pc when a discontinuity is detected (and
not all the time), so there is no need to anticipate the "next" instruction.
>> - when the callback for discontinuity is called, we check if from_pc is matching, and register the next expected with to_pc.
>>
>> We can then add tests targeting supported architectures using the plugin, and ensuring it never fails.
>> It's hard to know we don't miss events though. Except if we write manual assembly system mode tests, that trigger the expected behaviour. But it would be tedious, and I'm really not sure there is a real value with reduced examples like this.
>
> So currently my thinking goes in the direction of having the plugin
> print a warning every time we don't have an expected from_pc to check
> against. Probably also have this be a feature you can deactivate.
>
As it's a test plugin, I think it would be a better default to abort on
error, and not be able to deactivate it. Any difference we find would be
a bug anyway, so require fixing.
> Regards,
> Julian Ganz
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-04 23:33 ` Pierrick Bouvier
2024-12-05 13:10 ` Julian Ganz
@ 2024-12-20 11:47 ` Julian Ganz
2024-12-20 21:17 ` Pierrick Bouvier
2025-01-09 16:33 ` Alex Bennée
1 sibling, 2 replies; 70+ messages in thread
From: Julian Ganz @ 2024-12-20 11:47 UTC (permalink / raw)
To: Pierrick Bouvier, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
Hi Pierrick,
December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
> On 12/2/24 11:41, Julian Ganz wrote:
> > +static void insn_exec(unsigned int vcpu_index, void *userdata)
> > +{
> > + struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
> > + uint64_t pc = (uint64_t) userdata;
> > + GString* report;
> > +
> > + if (state->has_next) {
> > + if (state->next_pc != pc) {
> > + report = g_string_new("Trap target PC mismatch\n");
> > + g_string_append_printf(report,
> > + "Expected: %"PRIx64"\nEncountered: %"
> > + PRIx64"\n",
> > + state->next_pc, pc);
> > + qemu_plugin_outs(report->str);
> > + if (abort_on_mismatch) {
> > + g_abort();
> > + }
> > + g_string_free(report, true);
> > + }
> > + state->has_next = false;
> > + }
> > +}
> >
> When booting an arm64 vm, I get this message:
> Trap target PC mismatch
> Expected: 23faf3a80
> Encountered: 23faf3a84
A colleague of mine went to great lengths trying to track and reliably
reproduce this. We think that it's something amiss with the existing
instruction exec callback infrastructure. So... it's not something I'll
be addressing with the next iteration as it's out of scope. We'll
probably continue looking into it, though.
The mismatch is reported perfectly normal and boring exceptions and
interrupts with no indication of any differences to other (not reported)
events that fire on a regular basis. Apparently, once in a blue moon
(relatively speaking), for the first instruction of a handler (even
though it is definitely executed and qemu does print a trace-line for
that instruction):
| Trace 0: 0x7fffa0b03900 [00104004/000000023fde73b4/00000021/ff020200]
| Trace 0: 0x7fffa02d9580 [00104004/000000023fde72b8/00000021/ff020200]
| Trace 0: 0x7fffa02dfc40 [00104004/000000023fde7338/00000021/ff020200]
| Trace 0: 0x7fffa0b03d00 [00104004/000000023fde73d4/00000021/ff020200]
| Trace 0: 0x7fffa0b03e80 [00104004/000000023fde73d8/00000021/ff020200]
| Trace 0: 0x7fffa0b04140 [00104004/000000023fde7408/00000021/ff020200]
| Trace 0: 0x7fffa02dd6c0 [00104004/000000023fde70b8/00000021/ff020200]
| Trace 0: 0x7fffa02dd800 [00104004/000000023fde7b90/00000021/ff020200]
| cpu_io_recompile: rewound execution of TB to 000000023fde7b90
| Taking exception 5 [IRQ] on CPU 0
| ...from EL1 to EL1
| ...with ESR 0x0/0x3800000
| ...with SPSR 0x20000305
| ...with ELR 0x23fde7b90
| ...to EL1 PC 0x23fd77a80 PSTATE 0x23c5
| Trace 0: 0x7fffa13a8340 [00104004/000000023fd77a80/00000021/ff021201]
| Trace 0: 0x7fffa13a8480 [00104004/000000023fd77a84/00000021/ff020200]
| Trap target PC mismatch CPU 0
| Expected: 23fd77a80
| Encountered: 23fd77a84
| warning: 44 ./nptl/pthread_kill.c: No such file or directory
| Couldn't get registers: No such process.
It does show up with both single-core and multi-core VMs, so that at
least eliminates some possibilities. Maybe :/
The issue is nasty to reproduce in a way that allows any meaningful
investigation. It usually involves sifting through many GBs of Qemu logs
for maybe one occurance. We could add another testing/dummy plugin that
just prints the PC for _any_ instruction executed and have a skript
check for non-alternating Trace-lines from Qemu and that Plugin. But
then we're talking nearly double the amount of Lines to look through
with probably little additional information.
Regards,
Julian
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-20 11:47 ` Julian Ganz
@ 2024-12-20 21:17 ` Pierrick Bouvier
2024-12-20 21:46 ` Pierrick Bouvier
2025-01-09 16:35 ` Alex Bennée
2025-01-09 16:33 ` Alex Bennée
1 sibling, 2 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-20 21:17 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
Hi Julian,
On 12/20/24 03:47, Julian Ganz wrote:
> Hi Pierrick,
>
> December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
>> On 12/2/24 11:41, Julian Ganz wrote:
>>> +static void insn_exec(unsigned int vcpu_index, void *userdata)
>>> +{
>>> + struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
>>> + uint64_t pc = (uint64_t) userdata;
>>> + GString* report;
>>> +
>>> + if (state->has_next) {
>>> + if (state->next_pc != pc) {
>>> + report = g_string_new("Trap target PC mismatch\n");
>>> + g_string_append_printf(report,
>>> + "Expected: %"PRIx64"\nEncountered: %"
>>> + PRIx64"\n",
>>> + state->next_pc, pc);
>>> + qemu_plugin_outs(report->str);
>>> + if (abort_on_mismatch) {
>>> + g_abort();
>>> + }
>>> + g_string_free(report, true);
>>> + }
>>> + state->has_next = false;
>>> + }
>>> +}
>>>
>> When booting an arm64 vm, I get this message:
>> Trap target PC mismatch
>> Expected: 23faf3a80
>> Encountered: 23faf3a84
>
> A colleague of mine went to great lengths trying to track and reliably
> reproduce this. We think that it's something amiss with the existing
> instruction exec callback infrastructure. So... it's not something I'll
> be addressing with the next iteration as it's out of scope. We'll
> probably continue looking into it, though.
>
> The mismatch is reported perfectly normal and boring exceptions and
> interrupts with no indication of any differences to other (not reported)
> events that fire on a regular basis. Apparently, once in a blue moon
> (relatively speaking), for the first instruction of a handler (even
> though it is definitely executed and qemu does print a trace-line for
> that instruction):
>
> | Trace 0: 0x7fffa0b03900 [00104004/000000023fde73b4/00000021/ff020200]
> | Trace 0: 0x7fffa02d9580 [00104004/000000023fde72b8/00000021/ff020200]
> | Trace 0: 0x7fffa02dfc40 [00104004/000000023fde7338/00000021/ff020200]
> | Trace 0: 0x7fffa0b03d00 [00104004/000000023fde73d4/00000021/ff020200]
> | Trace 0: 0x7fffa0b03e80 [00104004/000000023fde73d8/00000021/ff020200]
> | Trace 0: 0x7fffa0b04140 [00104004/000000023fde7408/00000021/ff020200]
> | Trace 0: 0x7fffa02dd6c0 [00104004/000000023fde70b8/00000021/ff020200]
> | Trace 0: 0x7fffa02dd800 [00104004/000000023fde7b90/00000021/ff020200]
> | cpu_io_recompile: rewound execution of TB to 000000023fde7b90
> | Taking exception 5 [IRQ] on CPU 0
> | ...from EL1 to EL1
> | ...with ESR 0x0/0x3800000
> | ...with SPSR 0x20000305
> | ...with ELR 0x23fde7b90
> | ...to EL1 PC 0x23fd77a80 PSTATE 0x23c5
> | Trace 0: 0x7fffa13a8340 [00104004/000000023fd77a80/00000021/ff021201]
> | Trace 0: 0x7fffa13a8480 [00104004/000000023fd77a84/00000021/ff020200]
> | Trap target PC mismatch CPU 0
> | Expected: 23fd77a80
> | Encountered: 23fd77a84
> | warning: 44 ./nptl/pthread_kill.c: No such file or directory
> | Couldn't get registers: No such process.
>
> It does show up with both single-core and multi-core VMs, so that at
> least eliminates some possibilities. Maybe :/
>
> The issue is nasty to reproduce in a way that allows any meaningful
> investigation. It usually involves sifting through many GBs of Qemu logs
> for maybe one occurance. We could add another testing/dummy plugin that
> just prints the PC for _any_ instruction executed and have a skript
> check for non-alternating Trace-lines from Qemu and that Plugin. But
> then we're talking nearly double the amount of Lines to look through
> with probably little additional information.
>
Thanks for the investigation.
I could reproduce this with this command line:
./build/qemu-system-aarch64 -M virt -plugin
./build/tests/tcg/plugins/libdiscons.so,abort=on -m 8G -device
virtio-blk-pci,drive=root -drive
if=none,id=root,file=/home/user/.work/images/debianaarch64.img -M virt
-cpu max,pauth=off -drive
if=pflash,readonly=on,file=/usr/share/AAVMF/AAVMF_CODE.fd -drive
if=pflash,file=/home/user/.work/images/AAVMF_VARS.fd -d plugin,in_asm,op
-D crash.log
# -d plugin,in_asm,op allows to dump asm of every translated block,
plugin output (for discon plugin), and tcg op generated.
It reliably crashes with a single address.
Looking at the debug output (crash.log):
----------------
IN:
0x23faf3a80: d108c3ff sub sp, sp, #0x230
# => This bb has a single instruction as input
OP:
# this is the TB instrumentation
ld_i32 loc0,env,$0xfffffffffffffff8
brcond_i32 loc0,$0x0,lt,$L0
st8_i32 $0x1,env,$0xfffffffffffffffc
---- 0000000000000a80 0000000000000000 0000000000000000
# => we can see that there is no call_plugin, looks like instrumentation
# is not applied
sub_i64 sp,sp,$0x230
add_i64 pc,pc,$0x4
goto_tb $0x1
exit_tb $0x7f7eedd355c1
set_label $L0
exit_tb $0x7f7eedd355c3
----------------
IN:
0x23faf3a84: a9b007e0 stp x0, x1, [sp, #-0x100]!
0x23faf3a88: a9010fe2 stp x2, x3, [sp, #0x10]
...
OP:
ld_i32 loc0,env,$0xfffffffffffffff8
brcond_i32 loc0,$0x0,lt,$L0
st8_i32 $0x0,env,$0xfffffffffffffffc
---- 0000000000000a84 0000000000000000 0000000000000000
# instruction is correctly applied
call plugin(0x7f7eec96d530),$0x1,$0,$0x0,$0x23faf3a84
mov_i64 loc2,sp
...
Trap target PC mismatch
Expected: 23faf3a80
Encountered: 23faf3a84
The thing interesting here is that we can notice that 23faf3a80 is a
translation block with a single instruction, and we can see that
instrumentation is not applied for this instruction (call_plugin is not
present).
Overall, it really looks like a bug on QEMU side, where we miss
instrumenting something. I'll take a look. You can ignore this for now.
> Regards,
> Julian
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-20 21:17 ` Pierrick Bouvier
@ 2024-12-20 21:46 ` Pierrick Bouvier
2025-01-09 16:35 ` Alex Bennée
1 sibling, 0 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2024-12-20 21:46 UTC (permalink / raw)
To: Julian Ganz, qemu-devel
Cc: Alex Bennée, Alexandre Iooss, Mahmoud Mandour
On 12/20/24 13:17, Pierrick Bouvier wrote:
> Hi Julian,
>
> On 12/20/24 03:47, Julian Ganz wrote:
>> Hi Pierrick,
>>
>> December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
>>> On 12/2/24 11:41, Julian Ganz wrote:
>>>> +static void insn_exec(unsigned int vcpu_index, void *userdata)
>>>> +{
>>>> + struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
>>>> + uint64_t pc = (uint64_t) userdata;
>>>> + GString* report;
>>>> +
>>>> + if (state->has_next) {
>>>> + if (state->next_pc != pc) {
>>>> + report = g_string_new("Trap target PC mismatch\n");
>>>> + g_string_append_printf(report,
>>>> + "Expected: %"PRIx64"\nEncountered: %"
>>>> + PRIx64"\n",
>>>> + state->next_pc, pc);
>>>> + qemu_plugin_outs(report->str);
>>>> + if (abort_on_mismatch) {
>>>> + g_abort();
>>>> + }
>>>> + g_string_free(report, true);
>>>> + }
>>>> + state->has_next = false;
>>>> + }
>>>> +}
>>>>
>>> When booting an arm64 vm, I get this message:
>>> Trap target PC mismatch
>>> Expected: 23faf3a80
>>> Encountered: 23faf3a84
>>
>> A colleague of mine went to great lengths trying to track and reliably
>> reproduce this. We think that it's something amiss with the existing
>> instruction exec callback infrastructure. So... it's not something I'll
>> be addressing with the next iteration as it's out of scope. We'll
>> probably continue looking into it, though.
>>
>> The mismatch is reported perfectly normal and boring exceptions and
>> interrupts with no indication of any differences to other (not reported)
>> events that fire on a regular basis. Apparently, once in a blue moon
>> (relatively speaking), for the first instruction of a handler (even
>> though it is definitely executed and qemu does print a trace-line for
>> that instruction):
>>
>> | Trace 0: 0x7fffa0b03900 [00104004/000000023fde73b4/00000021/ff020200]
>> | Trace 0: 0x7fffa02d9580 [00104004/000000023fde72b8/00000021/ff020200]
>> | Trace 0: 0x7fffa02dfc40 [00104004/000000023fde7338/00000021/ff020200]
>> | Trace 0: 0x7fffa0b03d00 [00104004/000000023fde73d4/00000021/ff020200]
>> | Trace 0: 0x7fffa0b03e80 [00104004/000000023fde73d8/00000021/ff020200]
>> | Trace 0: 0x7fffa0b04140 [00104004/000000023fde7408/00000021/ff020200]
>> | Trace 0: 0x7fffa02dd6c0 [00104004/000000023fde70b8/00000021/ff020200]
>> | Trace 0: 0x7fffa02dd800 [00104004/000000023fde7b90/00000021/ff020200]
>> | cpu_io_recompile: rewound execution of TB to 000000023fde7b90
>> | Taking exception 5 [IRQ] on CPU 0
>> | ...from EL1 to EL1
>> | ...with ESR 0x0/0x3800000
>> | ...with SPSR 0x20000305
>> | ...with ELR 0x23fde7b90
>> | ...to EL1 PC 0x23fd77a80 PSTATE 0x23c5
>> | Trace 0: 0x7fffa13a8340 [00104004/000000023fd77a80/00000021/ff021201]
>> | Trace 0: 0x7fffa13a8480 [00104004/000000023fd77a84/00000021/ff020200]
>> | Trap target PC mismatch CPU 0
>> | Expected: 23fd77a80
>> | Encountered: 23fd77a84
>> | warning: 44 ./nptl/pthread_kill.c: No such file or directory
>> | Couldn't get registers: No such process.
>>
>> It does show up with both single-core and multi-core VMs, so that at
>> least eliminates some possibilities. Maybe :/
>>
>> The issue is nasty to reproduce in a way that allows any meaningful
>> investigation. It usually involves sifting through many GBs of Qemu logs
>> for maybe one occurance. We could add another testing/dummy plugin that
>> just prints the PC for _any_ instruction executed and have a skript
>> check for non-alternating Trace-lines from Qemu and that Plugin. But
>> then we're talking nearly double the amount of Lines to look through
>> with probably little additional information.
>>
>
> Thanks for the investigation.
> I could reproduce this with this command line:
> ./build/qemu-system-aarch64 -M virt -plugin
> ./build/tests/tcg/plugins/libdiscons.so,abort=on -m 8G -device
> virtio-blk-pci,drive=root -drive
> if=none,id=root,file=/home/user/.work/images/debianaarch64.img -M virt
> -cpu max,pauth=off -drive
> if=pflash,readonly=on,file=/usr/share/AAVMF/AAVMF_CODE.fd -drive
> if=pflash,file=/home/user/.work/images/AAVMF_VARS.fd -d plugin,in_asm,op
> -D crash.log
>
> # -d plugin,in_asm,op allows to dump asm of every translated block,
> plugin output (for discon plugin), and tcg op generated.
>
> It reliably crashes with a single address.
> Looking at the debug output (crash.log):
> ----------------
> IN:
> 0x23faf3a80: d108c3ff sub sp, sp, #0x230
> # => This bb has a single instruction as input
>
> OP:
> # this is the TB instrumentation
> ld_i32 loc0,env,$0xfffffffffffffff8
> brcond_i32 loc0,$0x0,lt,$L0
> st8_i32 $0x1,env,$0xfffffffffffffffc
>
> ---- 0000000000000a80 0000000000000000 0000000000000000
> # => we can see that there is no call_plugin, looks like instrumentation
> # is not applied
> sub_i64 sp,sp,$0x230
> add_i64 pc,pc,$0x4
> goto_tb $0x1
> exit_tb $0x7f7eedd355c1
> set_label $L0
> exit_tb $0x7f7eedd355c3
>
> ----------------
> IN:
> 0x23faf3a84: a9b007e0 stp x0, x1, [sp, #-0x100]!
> 0x23faf3a88: a9010fe2 stp x2, x3, [sp, #0x10]
> ...
>
> OP:
> ld_i32 loc0,env,$0xfffffffffffffff8
> brcond_i32 loc0,$0x0,lt,$L0
> st8_i32 $0x0,env,$0xfffffffffffffffc
>
> ---- 0000000000000a84 0000000000000000 0000000000000000
> # instruction is correctly applied
> call plugin(0x7f7eec96d530),$0x1,$0,$0x0,$0x23faf3a84
> mov_i64 loc2,sp
> ...
>
> Trap target PC mismatch
> Expected: 23faf3a80
> Encountered: 23faf3a84
>
> The thing interesting here is that we can notice that 23faf3a80 is a
> translation block with a single instruction, and we can see that
> instrumentation is not applied for this instruction (call_plugin is not
> present).
>
> Overall, it really looks like a bug on QEMU side, where we miss
> instrumenting something. I'll take a look. You can ignore this for now.
>
It seems like we have a problem to identify tb as mem_only. This was
introduced to prevent double instrumentation of some memory access
(touching MMIO), but it seems that as a result, we skip some
instructions sometimes.
I need to dig into this further, but for now, you should be able to
workaround this on your side with this patch:
diff --git a/plugins/api.c b/plugins/api.c
index 24ea64e2de5..6cb9d81a0a2 100644
--- a/plugins/api.c
+++ b/plugins/api.c
@@ -92,6 +92,7 @@ void
qemu_plugin_register_vcpu_exit_cb(qemu_plugin_id_t id,
static bool tb_is_mem_only(void)
{
+ return false;
return tb_cflags(tcg_ctx->gen_tb) & CF_MEMI_ONLY;
}
>> Regards,
>> Julian
>
Regards,
Pierrick
^ permalink raw reply related [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 08/11] target/mips: call plugin trap callbacks
2024-12-02 19:26 ` [RFC PATCH v3 08/11] target/mips: " Julian Ganz
@ 2025-01-09 13:43 ` Alex Bennée
0 siblings, 0 replies; 70+ messages in thread
From: Alex Bennée @ 2025-01-09 13:43 UTC (permalink / raw)
To: Julian Ganz
Cc: qemu-devel, Philippe Mathieu-Daudé, Aurelien Jarno,
Jiaxun Yang, Aleksandar Rikalo
Julian Ganz <neither@nut.email> writes:
> We recently introduced API for registering callbacks for trap related
> events as well as the corresponding hook functions. Due to differences
> between architectures, the latter need to be called from target specific
> code.
>
> This change places hooks for MIPS targets.
> ---
> target/mips/tcg/sysemu/tlb_helper.c | 11 +++++++++++
> 1 file changed, 11 insertions(+)
>
> diff --git a/target/mips/tcg/sysemu/tlb_helper.c b/target/mips/tcg/sysemu/tlb_helper.c
> index e98bb95951..2b19975d53 100644
> --- a/target/mips/tcg/sysemu/tlb_helper.c
> +++ b/target/mips/tcg/sysemu/tlb_helper.c
> @@ -18,6 +18,7 @@
> */
merge failure as this is now in system/tlb_helper.c
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 10/11] target/sparc: call plugin trap callbacks
2024-12-02 19:41 ` [RFC PATCH v3 10/11] target/sparc: " Julian Ganz
@ 2025-01-09 13:46 ` Alex Bennée
0 siblings, 0 replies; 70+ messages in thread
From: Alex Bennée @ 2025-01-09 13:46 UTC (permalink / raw)
To: Julian Ganz; +Cc: qemu-devel, Mark Cave-Ayland, Artyom Tarasenko
Julian Ganz <neither@nut.email> writes:
> We recently introduced API for registering callbacks for trap related
> events as well as the corresponding hook functions. Due to differences
> between architectures, the latter need to be called from target specific
> code.
>
> This change places hooks for SPARC (32bit and 64bit) targets. We treat
> any interrupt other than EXTINT and IVEC as exceptions as they appear to
> be synchroneous events.
> ---
> target/sparc/int32_helper.c | 7 +++++++
> target/sparc/int64_helper.c | 10 ++++++++++
> 2 files changed, 17 insertions(+)
>
> diff --git a/target/sparc/int32_helper.c b/target/sparc/int32_helper.c
> index f2dd8bcb2e..86b21eecb6 100644
> --- a/target/sparc/int32_helper.c
> +++ b/target/sparc/int32_helper.c
> @@ -24,6 +24,7 @@
> #include "exec/cpu_ldst.h"
> #include "exec/log.h"
> #include "sysemu/runstate.h"
> +#include "qemu/plugin.h"
Also a merge failure due to tidying up of target/sparc/sysemu
directories.
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2024-12-04 22:45 ` Pierrick Bouvier
2024-12-05 12:44 ` Julian Ganz
@ 2025-01-09 13:52 ` Alex Bennée
2025-01-09 22:28 ` Pierrick Bouvier
2025-01-10 11:43 ` Julian Ganz
1 sibling, 2 replies; 70+ messages in thread
From: Alex Bennée @ 2025-01-09 13:52 UTC (permalink / raw)
To: Pierrick Bouvier; +Cc: Julian Ganz, qemu-devel
Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
> Hi Julian,
>
> thanks for the update!
> Comments below.
>
> On 12/2/24 11:26, Julian Ganz wrote:
>> The plugin API allows registration of callbacks for a variety of VCPU
>> related events, such as VCPU reset, idle and resume. However, traps of
>> any kind, i.e. interrupts or exceptions, were previously not covered.
>> These kinds of events are arguably quite significant and usually go hand
>> in hand with a PC discontinuity. On most platforms, the discontinuity
>> also includes a transition from some "mode" to another. Thus, plugins
>> for the analysis of (virtualized) embedded systems may benefit from or
>> even require the possiblity to perform work on the occurance of an
>> interrupt or exception.
>> This change introduces the concept of such a discontinuity event in
>> the
>> form of an enumeration. Currently only traps are covered. Specifically
>> we (loosely) define interrupts, exceptions and host calls across all
>> platforms. In addition, this change introduces a type to use for
>> callback functions related to such events. Since possible modes and the
>> enumeration of interupts and exceptions vary greatly between different
>> architectures, the callback type only receives the VCPU id, the type of
>> event as well as the old and new PC.
>> ---
>> include/qemu/plugin.h | 1 +
>> include/qemu/qemu-plugin.h | 43 ++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 44 insertions(+)
>> diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
>> index 9726a9ebf3..27a176b631 100644
>> --- a/include/qemu/plugin.h
>> +++ b/include/qemu/plugin.h
>> @@ -59,6 +59,7 @@ union qemu_plugin_cb_sig {
>> qemu_plugin_udata_cb_t udata;
>> qemu_plugin_vcpu_simple_cb_t vcpu_simple;
>> qemu_plugin_vcpu_udata_cb_t vcpu_udata;
>> + qemu_plugin_vcpu_discon_cb_t vcpu_discon;
>> qemu_plugin_vcpu_tb_trans_cb_t vcpu_tb_trans;
>> qemu_plugin_vcpu_mem_cb_t vcpu_mem;
>> qemu_plugin_vcpu_syscall_cb_t vcpu_syscall;
>> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
>> index 0fba36ae02..9c67374b7e 100644
>> --- a/include/qemu/qemu-plugin.h
>> +++ b/include/qemu/qemu-plugin.h
>> @@ -154,6 +154,49 @@ typedef void (*qemu_plugin_vcpu_simple_cb_t)(qemu_plugin_id_t id,
>> typedef void (*qemu_plugin_vcpu_udata_cb_t)(unsigned int vcpu_index,
>> void *userdata);
>> +
>> +/**
>> + * enum qemu_plugin_discon_type - type of a (potential) PC discontinuity
>> + *
>> + * @QEMU_PLUGIN_DISCON_INTERRUPT: an interrupt, defined across all architectures
>> + * as an asynchronous event, usually originating
>> + * from outside the CPU
>> + * @QEMU_PLUGIN_DISCON_EXCEPTION: an exception, defined across all architectures
>> + * as a synchronous event in response to a
>> + * specific instruction being executed
>> + * @QEMU_PLUGIN_DISCON_HOSTCALL: a host call, functionally a special kind of
>> + * exception that is not handled by code run by
>> + * the vCPU but machinery outside the vCPU
>> + * @QEMU_PLUGIN_DISCON_ALL: all types of disconinuity events currently covered
>> + */
>> +enum qemu_plugin_discon_type {
>> + QEMU_PLUGIN_DISCON_INTERRUPT = 1,
>> + QEMU_PLUGIN_DISCON_EXCEPTION = 2,
>> + QEMU_PLUGIN_DISCON_HOSTCALL = 4,
>> + QEMU_PLUGIN_DISCON_ALL = 7
>> +};
>
> Matter of style, but would be better to use:
>
> enum qemu_plugin_discon_type {
> QEMU_PLUGIN_DISCON_INTERRUPT = 1 << 0,
> QEMU_PLUGIN_DISCON_EXCEPTION = 1 << 1,
> QEMU_PLUGIN_DISCON_HOSTCALL = 1 << 2,
> QEMU_PLUGIN_DISCON_ALL = -1
> };
>
<snip>
Is this really a bit field though? If you will only report type of
discontinuity at a time a simple 0 based enum with
QEMU_PLUGIN_DISCON_MAX would be simpler.
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 02/11] plugins: add API for registering discontinuity callbacks
2024-12-02 19:26 ` [RFC PATCH v3 02/11] plugins: add API for registering discontinuity callbacks Julian Ganz
2024-12-04 22:45 ` Pierrick Bouvier
@ 2025-01-09 13:57 ` Alex Bennée
2025-01-10 11:40 ` Julian Ganz
1 sibling, 1 reply; 70+ messages in thread
From: Alex Bennée @ 2025-01-09 13:57 UTC (permalink / raw)
To: Julian Ganz
Cc: qemu-devel, Alexandre Iooss, Mahmoud Mandour, Pierrick Bouvier
Julian Ganz <neither@nut.email> writes:
> The plugin API allows registration of callbacks for a variety of VCPU
> related events, such as VCPU reset, idle and resume. In addition to
> those events, we recently defined discontinuity events, which include
> traps.
>
> This change introduces a function to register callbacks for these
> events. We define one distinct plugin event type for each type of
> discontinuity, granting fine control to plugins in term of which events
> they receive.
> ---
> include/qemu/plugin-event.h | 3 +++
> include/qemu/qemu-plugin.h | 15 +++++++++++++++
> plugins/core.c | 15 +++++++++++++++
> 3 files changed, 33 insertions(+)
>
> diff --git a/include/qemu/plugin-event.h b/include/qemu/plugin-event.h
> index 7056d8427b..1100dae212 100644
> --- a/include/qemu/plugin-event.h
> +++ b/include/qemu/plugin-event.h
> @@ -20,6 +20,9 @@ enum qemu_plugin_event {
> QEMU_PLUGIN_EV_VCPU_SYSCALL_RET,
> QEMU_PLUGIN_EV_FLUSH,
> QEMU_PLUGIN_EV_ATEXIT,
> + QEMU_PLUGIN_EV_VCPU_INTERRUPT,
> + QEMU_PLUGIN_EV_VCPU_EXCEPTION,
> + QEMU_PLUGIN_EV_VCPU_HOSTCALL,
> QEMU_PLUGIN_EV_MAX, /* total number of plugin events we support */
> };
>
> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
> index 9c67374b7e..f998a465e5 100644
> --- a/include/qemu/qemu-plugin.h
> +++ b/include/qemu/qemu-plugin.h
> @@ -273,6 +273,21 @@ QEMU_PLUGIN_API
> void qemu_plugin_register_vcpu_resume_cb(qemu_plugin_id_t id,
> qemu_plugin_vcpu_simple_cb_t cb);
>
> +/**
> + * qemu_plugin_register_vcpu_discon_cb() - register a discontinuity callback
> + * @id: plugin ID
> + * @cb: callback function
> + *
> + * The @cb function is called every time a vCPU receives a discontinuity event
> + * of the specified type(s), after the vCPU was prepared to handle the event.
> + * Preparation usually entails updating the PC to some interrupt handler or trap
> + * vector entry.
The "usually" here is a bit of a weasel word. We should be clear what
the contract is with the plugin. Can we say the PC will be updated to
the next instruction that will execute after the callback?
> + */
> +QEMU_PLUGIN_API
> +void qemu_plugin_register_vcpu_discon_cb(qemu_plugin_id_t id,
> + enum qemu_plugin_discon_type type,
> + qemu_plugin_vcpu_discon_cb_t cb);
> +
> /** struct qemu_plugin_tb - Opaque handle for a translation block */
> struct qemu_plugin_tb;
> /** struct qemu_plugin_insn - Opaque handle for a translated instruction */
> diff --git a/plugins/core.c b/plugins/core.c
> index bb105e8e68..a89a4a0315 100644
> --- a/plugins/core.c
> +++ b/plugins/core.c
> @@ -559,6 +559,21 @@ void qemu_plugin_register_vcpu_resume_cb(qemu_plugin_id_t id,
> plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_RESUME, cb);
> }
>
> +void qemu_plugin_register_vcpu_discon_cb(qemu_plugin_id_t id,
> + enum qemu_plugin_discon_type type,
> + qemu_plugin_vcpu_discon_cb_t cb)
> +{
> + if (type & QEMU_PLUGIN_DISCON_INTERRUPT) {
> + plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_INTERRUPT, cb);
> + }
> + if (type & QEMU_PLUGIN_DISCON_EXCEPTION) {
> + plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_EXCEPTION, cb);
> + }
> + if (type & QEMU_PLUGIN_DISCON_HOSTCALL) {
> + plugin_register_cb(id, QEMU_PLUGIN_EV_VCPU_HOSTCALL, cb);
> + }
> +}
> +
> void qemu_plugin_register_flush_cb(qemu_plugin_id_t id,
> qemu_plugin_simple_cb_t cb)
> {
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 03/11] plugins: add hooks for new discontinuity related callbacks
2024-12-02 19:26 ` [RFC PATCH v3 03/11] plugins: add hooks for new discontinuity related callbacks Julian Ganz
2024-12-04 22:47 ` Pierrick Bouvier
@ 2025-01-09 13:58 ` Alex Bennée
1 sibling, 0 replies; 70+ messages in thread
From: Alex Bennée @ 2025-01-09 13:58 UTC (permalink / raw)
To: Julian Ganz
Cc: qemu-devel, Alexandre Iooss, Mahmoud Mandour, Pierrick Bouvier
Julian Ganz <neither@nut.email> writes:
> The plugin API allows registration of callbacks for a variety of VCPU
> related events, such as VCPU reset, idle and resume. In addition, we
> recently introduced API for registering callbacks for discontinuity
> events, specifically for interrupts, exceptions and host calls.
>
> This change introduces the corresponding hooks called from target
> specific code inside qemu.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API
2024-12-02 19:26 ` [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API Julian Ganz
2024-12-04 23:14 ` Pierrick Bouvier
@ 2025-01-09 14:04 ` Alex Bennée
2025-01-09 22:10 ` Pierrick Bouvier
2025-01-10 11:49 ` Julian Ganz
1 sibling, 2 replies; 70+ messages in thread
From: Alex Bennée @ 2025-01-09 14:04 UTC (permalink / raw)
To: Julian Ganz
Cc: qemu-devel, Alexandre Iooss, Mahmoud Mandour, Pierrick Bouvier
Julian Ganz <neither@nut.email> writes:
> We recently introduced new plugin API for registration of discontinuity
> related callbacks. This change introduces a minimal plugin showcasing
> the new API. It simply counts the occurances of interrupts, exceptions
> and host calls per CPU and reports the counts when exitting.
> ---
> contrib/plugins/meson.build | 3 +-
> contrib/plugins/traps.c | 96 +++++++++++++++++++++++++++++++++++++
> 2 files changed, 98 insertions(+), 1 deletion(-)
> create mode 100644 contrib/plugins/traps.c
>
> diff --git a/contrib/plugins/meson.build b/contrib/plugins/meson.build
> index 63a32c2b4f..9a3015e1c1 100644
> --- a/contrib/plugins/meson.build
> +++ b/contrib/plugins/meson.build
> @@ -1,5 +1,6 @@
> contrib_plugins = ['bbv', 'cache', 'cflow', 'drcov', 'execlog', 'hotblocks',
> - 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger']
> + 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger',
> + 'traps']
I wonder if this is better in tests/tcg/plugins? We need to do something
to ensure it gets covered by CI although we might want to be smarter
about running it together with a test binary that will actually pick up
something.
> if host_os != 'windows'
> # lockstep uses socket.h
> contrib_plugins += 'lockstep'
<snip>
> +QEMU_PLUGIN_EXPORT
> +int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
> + int argc, char **argv)
> +{
> + if (!info->system_emulation) {
> + fputs("trap plugin can only be used in system emulation mode.\n",
> + stderr);
> + return -1;
> + }
> +
> + max_vcpus = info->system.max_vcpus;
> + traps = qemu_plugin_scoreboard_new(sizeof(TrapCounters));
> + qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
> + qemu_plugin_vcpu_for_each(id, vcpu_init);
Hmm at first glances this seems redundant - however I guess this is
covering the use case you load the plugin after the system is up and
running.
I wonder if you have unearthed a foot-gun in the API that is easy to
fall into? Maybe we should expand qemu_plugin_register_vcpu_init_cb to
call the call back immediately for existing vcpus?
> +
> + qemu_plugin_register_vcpu_discon_cb(id, QEMU_PLUGIN_DISCON_TRAPS,
> + vcpu_discon);
> +
> + qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
> +
> + return 0;
> +}
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-20 11:47 ` Julian Ganz
2024-12-20 21:17 ` Pierrick Bouvier
@ 2025-01-09 16:33 ` Alex Bennée
2025-01-09 22:27 ` Pierrick Bouvier
2025-01-10 11:58 ` Julian Ganz
1 sibling, 2 replies; 70+ messages in thread
From: Alex Bennée @ 2025-01-09 16:33 UTC (permalink / raw)
To: Julian Ganz
Cc: Pierrick Bouvier, qemu-devel, Alexandre Iooss, Mahmoud Mandour,
Richard Henderson
"Julian Ganz" <neither@nut.email> writes:
(Add Richard to CC)
> Hi Pierrick,
>
> December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
>> On 12/2/24 11:41, Julian Ganz wrote:
>> > +static void insn_exec(unsigned int vcpu_index, void *userdata)
>> > +{
>> > + struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
>> > + uint64_t pc = (uint64_t) userdata;
>> > + GString* report;
>> > +
>> > + if (state->has_next) {
>> > + if (state->next_pc != pc) {
>> > + report = g_string_new("Trap target PC mismatch\n");
>> > + g_string_append_printf(report,
>> > + "Expected: %"PRIx64"\nEncountered: %"
>> > + PRIx64"\n",
>> > + state->next_pc, pc);
>> > + qemu_plugin_outs(report->str);
>> > + if (abort_on_mismatch) {
>> > + g_abort();
>> > + }
>> > + g_string_free(report, true);
>> > + }
>> > + state->has_next = false;
>> > + }
>> > +}
>> >
>> When booting an arm64 vm, I get this message:
>> Trap target PC mismatch
>> Expected: 23faf3a80
>> Encountered: 23faf3a84
>
> A colleague of mine went to great lengths trying to track and reliably
> reproduce this. We think that it's something amiss with the existing
> instruction exec callback infrastructure. So... it's not something I'll
> be addressing with the next iteration as it's out of scope. We'll
> probably continue looking into it, though.
>
> The mismatch is reported perfectly normal and boring exceptions and
> interrupts with no indication of any differences to other (not reported)
> events that fire on a regular basis. Apparently, once in a blue moon
> (relatively speaking), for the first instruction of a handler (even
> though it is definitely executed and qemu does print a trace-line for
> that instruction):
>
> | Trace 0: 0x7fffa0b03900 [00104004/000000023fde73b4/00000021/ff020200]
> | Trace 0: 0x7fffa02d9580 [00104004/000000023fde72b8/00000021/ff020200]
> | Trace 0: 0x7fffa02dfc40 [00104004/000000023fde7338/00000021/ff020200]
> | Trace 0: 0x7fffa0b03d00 [00104004/000000023fde73d4/00000021/ff020200]
> | Trace 0: 0x7fffa0b03e80 [00104004/000000023fde73d8/00000021/ff020200]
> | Trace 0: 0x7fffa0b04140 [00104004/000000023fde7408/00000021/ff020200]
> | Trace 0: 0x7fffa02dd6c0 [00104004/000000023fde70b8/00000021/ff020200]
> | Trace 0: 0x7fffa02dd800 [00104004/000000023fde7b90/00000021/ff020200]
> | cpu_io_recompile: rewound execution of TB to 000000023fde7b90
So this happens when an instruction that is not the last instruction of
the block does some IO. As IO accesses can potentially change system
state we can't allow more instructions to run in the block that might
not have that change of state captured
cpu_io_recompile exits the loop and forces the next TranslationBlock to
be only one (or maybe two instructions). We have to play games with
instrumentation to avoid double counting execution:
/*
* Exit the loop and potentially generate a new TB executing the
* just the I/O insns. We also limit instrumentation to memory
* operations only (which execute after completion) so we don't
* double instrument the instruction.
*/
cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | n;
The instruction is in a weird state having both executed (from the
plugin point of view) but not changed any state (stopped from doing MMIO
until the next instruction).
> | Taking exception 5 [IRQ] on CPU 0
> | ...from EL1 to EL1
> | ...with ESR 0x0/0x3800000
> | ...with SPSR 0x20000305
> | ...with ELR 0x23fde7b90
> | ...to EL1 PC 0x23fd77a80 PSTATE 0x23c5
I guess before we re-executed the new block an asynchronous interrupt
came in?
Does changing the above to:
cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | CF_NOIRQ | n;
make the problem go away? It should ensure the next 1/2 instruction
block execute without checking for async events. See gen_tb_start() for
the gory details.
> | Trace 0: 0x7fffa13a8340 [00104004/000000023fd77a80/00000021/ff021201]
> | Trace 0: 0x7fffa13a8480 [00104004/000000023fd77a84/00000021/ff020200]
> | Trap target PC mismatch CPU 0
> | Expected: 23fd77a80
> | Encountered: 23fd77a84
> | warning: 44 ./nptl/pthread_kill.c: No such file or directory
> | Couldn't get registers: No such process.
>
> It does show up with both single-core and multi-core VMs, so that at
> least eliminates some possibilities. Maybe :/
>
> The issue is nasty to reproduce in a way that allows any meaningful
> investigation. It usually involves sifting through many GBs of Qemu logs
> for maybe one occurance. We could add another testing/dummy plugin that
> just prints the PC for _any_ instruction executed and have a skript
> check for non-alternating Trace-lines from Qemu and that Plugin. But
> then we're talking nearly double the amount of Lines to look through
> with probably little additional information.
>
> Regards,
> Julian
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2024-12-20 21:17 ` Pierrick Bouvier
2024-12-20 21:46 ` Pierrick Bouvier
@ 2025-01-09 16:35 ` Alex Bennée
1 sibling, 0 replies; 70+ messages in thread
From: Alex Bennée @ 2025-01-09 16:35 UTC (permalink / raw)
To: Pierrick Bouvier
Cc: Julian Ganz, qemu-devel, Alexandre Iooss, Mahmoud Mandour
Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
> Hi Julian,
>
> On 12/20/24 03:47, Julian Ganz wrote:
>> Hi Pierrick,
>> December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
>>> On 12/2/24 11:41, Julian Ganz wrote:
>>>> +static void insn_exec(unsigned int vcpu_index, void *userdata)
>>>> +{
>>>> + struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
>>>> + uint64_t pc = (uint64_t) userdata;
>>>> + GString* report;
>>>> +
>>>> + if (state->has_next) {
>>>> + if (state->next_pc != pc) {
>>>> + report = g_string_new("Trap target PC mismatch\n");
>>>> + g_string_append_printf(report,
>>>> + "Expected: %"PRIx64"\nEncountered: %"
>>>> + PRIx64"\n",
>>>> + state->next_pc, pc);
>>>> + qemu_plugin_outs(report->str);
>>>> + if (abort_on_mismatch) {
>>>> + g_abort();
>>>> + }
>>>> + g_string_free(report, true);
>>>> + }
>>>> + state->has_next = false;
>>>> + }
>>>> +}
>>>>
>>> When booting an arm64 vm, I get this message:
>>> Trap target PC mismatch
>>> Expected: 23faf3a80
>>> Encountered: 23faf3a84
>> A colleague of mine went to great lengths trying to track and
>> reliably
>> reproduce this. We think that it's something amiss with the existing
>> instruction exec callback infrastructure. So... it's not something I'll
>> be addressing with the next iteration as it's out of scope. We'll
>> probably continue looking into it, though.
>> The mismatch is reported perfectly normal and boring exceptions and
>> interrupts with no indication of any differences to other (not reported)
>> events that fire on a regular basis. Apparently, once in a blue moon
>> (relatively speaking), for the first instruction of a handler (even
>> though it is definitely executed and qemu does print a trace-line for
>> that instruction):
>> | Trace 0: 0x7fffa0b03900
>> [00104004/000000023fde73b4/00000021/ff020200]
>> | Trace 0: 0x7fffa02d9580 [00104004/000000023fde72b8/00000021/ff020200]
>> | Trace 0: 0x7fffa02dfc40 [00104004/000000023fde7338/00000021/ff020200]
>> | Trace 0: 0x7fffa0b03d00 [00104004/000000023fde73d4/00000021/ff020200]
>> | Trace 0: 0x7fffa0b03e80 [00104004/000000023fde73d8/00000021/ff020200]
>> | Trace 0: 0x7fffa0b04140 [00104004/000000023fde7408/00000021/ff020200]
>> | Trace 0: 0x7fffa02dd6c0 [00104004/000000023fde70b8/00000021/ff020200]
>> | Trace 0: 0x7fffa02dd800 [00104004/000000023fde7b90/00000021/ff020200]
>> | cpu_io_recompile: rewound execution of TB to 000000023fde7b90
>> | Taking exception 5 [IRQ] on CPU 0
>> | ...from EL1 to EL1
>> | ...with ESR 0x0/0x3800000
>> | ...with SPSR 0x20000305
>> | ...with ELR 0x23fde7b90
>> | ...to EL1 PC 0x23fd77a80 PSTATE 0x23c5
>> | Trace 0: 0x7fffa13a8340 [00104004/000000023fd77a80/00000021/ff021201]
>> | Trace 0: 0x7fffa13a8480 [00104004/000000023fd77a84/00000021/ff020200]
>> | Trap target PC mismatch CPU 0
>> | Expected: 23fd77a80
>> | Encountered: 23fd77a84
>> | warning: 44 ./nptl/pthread_kill.c: No such file or directory
>> | Couldn't get registers: No such process.
>> It does show up with both single-core and multi-core VMs, so that at
>> least eliminates some possibilities. Maybe :/
>> The issue is nasty to reproduce in a way that allows any meaningful
>> investigation. It usually involves sifting through many GBs of Qemu logs
>> for maybe one occurance. We could add another testing/dummy plugin that
>> just prints the PC for _any_ instruction executed and have a skript
>> check for non-alternating Trace-lines from Qemu and that Plugin. But
>> then we're talking nearly double the amount of Lines to look through
>> with probably little additional information.
>>
>
> Thanks for the investigation.
> I could reproduce this with this command line:
> ./build/qemu-system-aarch64 -M virt -plugin
> ./build/tests/tcg/plugins/libdiscons.so,abort=on -m 8G -device
> virtio-blk-pci,drive=root -drive
> if=none,id=root,file=/home/user/.work/images/debianaarch64.img -M virt
> -cpu max,pauth=off -drive
> if=pflash,readonly=on,file=/usr/share/AAVMF/AAVMF_CODE.fd -drive
> if=pflash,file=/home/user/.work/images/AAVMF_VARS.fd -d
> plugin,in_asm,op -D crash.log
>
> # -d plugin,in_asm,op allows to dump asm of every translated block,
> plugin output (for discon plugin), and tcg op generated.
>
> It reliably crashes with a single address.
> Looking at the debug output (crash.log):
> ----------------
> IN:
> 0x23faf3a80: d108c3ff sub sp, sp, #0x230
> # => This bb has a single instruction as input
>
> OP:
> # this is the TB instrumentation
> ld_i32 loc0,env,$0xfffffffffffffff8
> brcond_i32 loc0,$0x0,lt,$L0
> st8_i32 $0x1,env,$0xfffffffffffffffc
>
> ---- 0000000000000a80 0000000000000000 0000000000000000
> # => we can see that there is no call_plugin, looks like
> instrumentation # is not applied
That is expected. We have previously called the instrumentation for the
instruction in the block that triggered cpu_io_recompile() so we suppress
everything apart from memory instrumentation for the re-execution so we
don't double count.
> sub_i64 sp,sp,$0x230
> add_i64 pc,pc,$0x4
> goto_tb $0x1
> exit_tb $0x7f7eedd355c1
> set_label $L0
> exit_tb $0x7f7eedd355c3
>
> ----------------
> IN:
> 0x23faf3a84: a9b007e0 stp x0, x1, [sp, #-0x100]!
> 0x23faf3a88: a9010fe2 stp x2, x3, [sp, #0x10]
> ...
>
> OP:
> ld_i32 loc0,env,$0xfffffffffffffff8
> brcond_i32 loc0,$0x0,lt,$L0
> st8_i32 $0x0,env,$0xfffffffffffffffc
>
> ---- 0000000000000a84 0000000000000000 0000000000000000
> # instruction is correctly applied
> call plugin(0x7f7eec96d530),$0x1,$0,$0x0,$0x23faf3a84
> mov_i64 loc2,sp
> ...
>
> Trap target PC mismatch
> Expected: 23faf3a80
> Encountered: 23faf3a84
>
> The thing interesting here is that we can notice that 23faf3a80 is a
> translation block with a single instruction, and we can see that
> instrumentation is not applied for this instruction (call_plugin is
> not present).
>
> Overall, it really looks like a bug on QEMU side, where we miss
> instrumenting something. I'll take a look. You can ignore this for
> now.
See my other email. I think the bug was allowing an async IRQ to
interfere with our "special" single instruction block.
>
>> Regards,
>> Julian
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
` (11 preceding siblings ...)
2024-12-03 8:36 ` [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
@ 2025-01-09 16:43 ` Alex Bennée
12 siblings, 0 replies; 70+ messages in thread
From: Alex Bennée @ 2025-01-09 16:43 UTC (permalink / raw)
To: Julian Ganz; +Cc: qemu-devel
Julian Ganz <neither@nut.email> writes:
> Some analysis greatly benefits, or depends on, information about
> certain types of dicontinuities such as interrupts. For example, we may
> need to handle the execution of a new translation block differently if
> it is not the result of normal program flow but of an interrupt.
>
> Even with the existing interfaces, it is more or less possible to
> discern these situations, e.g. as done by the cflow plugin. However,
> this process poses a considerable overhead to the core analysis one may
> intend to perform.
>
> These changes introduce a generic and easy-to-use interface for plugin
> authors in the form of a callback for discontinuities. Patch 1 defines
> an enumeration of some trap-related discontinuities including somewhat
> narrow definitions of the discontinuity evetns and a callback type.
> Patch 2 defines the callback registration function. Patch 3 adds some
> hooks for triggering the callbacks. Patch 4 adds an example plugin
> showcasing the new API. Patches 5 through 6 call the hooks for a
> selection of architectures, mapping architecture specific events to the
> three categories defined in patch 1. Future non-RFC patchsets will call
> these hooks for all architectures (that have some concept of trap or
> interrupt). Finally, patch 11 supplies a test plugin asserting that the
> next PC provided to the plugin points to the next instruction executed.
>
> Sidenote: I'm likely doing something wrong for one architecture or
> the other. These patches are untested for most of them.
I've finished my review pass. Overall I think the API is fine but I
would like the arch maintainers to be happy the individual hooks capture
the right semantics for their arches.
I think Pierrick has already picked up some compile failures, you can
see more from my gitlab CI run:
https://gitlab.com/stsquad/qemu/-/pipelines/1618014020
As you have discovered with the discontinuity issue making sure the
execution state is consistent with JIT'ed code has a few landmines in
it. Given it is hard to trigger with our basic softmmu tests you should
consider a few more aggressive tests like:
tests/functional/test_aarch64_tcg_plugins.py
where we can pick exactly which plugin we want to use and run something
that will have a lot of IRQs and exceptions in it. It doesn't have to be
Aarch64 - whichever arch you are most familiar with. A test that
includes a hypervisor would be ideal as that will trigger a wider range
of execution state changes.
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API
2025-01-09 14:04 ` Alex Bennée
@ 2025-01-09 22:10 ` Pierrick Bouvier
2025-01-10 11:49 ` Julian Ganz
1 sibling, 0 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2025-01-09 22:10 UTC (permalink / raw)
To: Alex Bennée, Julian Ganz
Cc: qemu-devel, Alexandre Iooss, Mahmoud Mandour
On 1/9/25 06:04, Alex Bennée wrote:
> Julian Ganz <neither@nut.email> writes:
>
>> We recently introduced new plugin API for registration of discontinuity
>> related callbacks. This change introduces a minimal plugin showcasing
>> the new API. It simply counts the occurances of interrupts, exceptions
>> and host calls per CPU and reports the counts when exitting.
>> ---
>> contrib/plugins/meson.build | 3 +-
>> contrib/plugins/traps.c | 96 +++++++++++++++++++++++++++++++++++++
>> 2 files changed, 98 insertions(+), 1 deletion(-)
>> create mode 100644 contrib/plugins/traps.c
>>
>> diff --git a/contrib/plugins/meson.build b/contrib/plugins/meson.build
>> index 63a32c2b4f..9a3015e1c1 100644
>> --- a/contrib/plugins/meson.build
>> +++ b/contrib/plugins/meson.build
>> @@ -1,5 +1,6 @@
>> contrib_plugins = ['bbv', 'cache', 'cflow', 'drcov', 'execlog', 'hotblocks',
>> - 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger']
>> + 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger',
>> + 'traps']
>
> I wonder if this is better in tests/tcg/plugins? We need to do something
> to ensure it gets covered by CI although we might want to be smarter
> about running it together with a test binary that will actually pick up
> something.
>
>> if host_os != 'windows'
>> # lockstep uses socket.h
>> contrib_plugins += 'lockstep'
> <snip>
>> +QEMU_PLUGIN_EXPORT
>> +int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
>> + int argc, char **argv)
>> +{
>> + if (!info->system_emulation) {
>> + fputs("trap plugin can only be used in system emulation mode.\n",
>> + stderr);
>> + return -1;
>> + }
>> +
>> + max_vcpus = info->system.max_vcpus;
>> + traps = qemu_plugin_scoreboard_new(sizeof(TrapCounters));
>> + qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
>> + qemu_plugin_vcpu_for_each(id, vcpu_init);
>
> Hmm at first glances this seems redundant - however I guess this is
> covering the use case you load the plugin after the system is up and
> running.
>
> I wonder if you have unearthed a foot-gun in the API that is easy to
> fall into? Maybe we should expand qemu_plugin_register_vcpu_init_cb to
> call the call back immediately for existing vcpus?
>
I'm not sure to see how we can load a plugin after the cpus have been
initialized?
In case someone wants to dynamically create a new vcpu_init callback
(after some event or translation for instance), it should be the
responsibility of plugin to call it for existing vcpus.
qemu_plugin_vcpu_for_each is still useful in case you want to only
iterate currently active cpus (and not the one who exited already in
user mode).
>> +
>> + qemu_plugin_register_vcpu_discon_cb(id, QEMU_PLUGIN_DISCON_TRAPS,
>> + vcpu_discon);
>> +
>> + qemu_plugin_register_atexit_cb(id, plugin_exit, NULL);
>> +
>> + return 0;
>> +}
>
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2025-01-09 16:33 ` Alex Bennée
@ 2025-01-09 22:27 ` Pierrick Bouvier
2025-01-10 11:58 ` Julian Ganz
1 sibling, 0 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2025-01-09 22:27 UTC (permalink / raw)
To: Alex Bennée, Julian Ganz
Cc: qemu-devel, Alexandre Iooss, Mahmoud Mandour, Richard Henderson
On 1/9/25 08:33, Alex Bennée wrote:
> "Julian Ganz" <neither@nut.email> writes:
>
> (Add Richard to CC)
>
>> Hi Pierrick,
>>
>> December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
>>> On 12/2/24 11:41, Julian Ganz wrote:
>>>> +static void insn_exec(unsigned int vcpu_index, void *userdata)
>>>> +{
>>>> + struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
>>>> + uint64_t pc = (uint64_t) userdata;
>>>> + GString* report;
>>>> +
>>>> + if (state->has_next) {
>>>> + if (state->next_pc != pc) {
>>>> + report = g_string_new("Trap target PC mismatch\n");
>>>> + g_string_append_printf(report,
>>>> + "Expected: %"PRIx64"\nEncountered: %"
>>>> + PRIx64"\n",
>>>> + state->next_pc, pc);
>>>> + qemu_plugin_outs(report->str);
>>>> + if (abort_on_mismatch) {
>>>> + g_abort();
>>>> + }
>>>> + g_string_free(report, true);
>>>> + }
>>>> + state->has_next = false;
>>>> + }
>>>> +}
>>>>
>>> When booting an arm64 vm, I get this message:
>>> Trap target PC mismatch
>>> Expected: 23faf3a80
>>> Encountered: 23faf3a84
>>
>> A colleague of mine went to great lengths trying to track and reliably
>> reproduce this. We think that it's something amiss with the existing
>> instruction exec callback infrastructure. So... it's not something I'll
>> be addressing with the next iteration as it's out of scope. We'll
>> probably continue looking into it, though.
>>
>> The mismatch is reported perfectly normal and boring exceptions and
>> interrupts with no indication of any differences to other (not reported)
>> events that fire on a regular basis. Apparently, once in a blue moon
>> (relatively speaking), for the first instruction of a handler (even
>> though it is definitely executed and qemu does print a trace-line for
>> that instruction):
>>
>> | Trace 0: 0x7fffa0b03900 [00104004/000000023fde73b4/00000021/ff020200]
>> | Trace 0: 0x7fffa02d9580 [00104004/000000023fde72b8/00000021/ff020200]
>> | Trace 0: 0x7fffa02dfc40 [00104004/000000023fde7338/00000021/ff020200]
>> | Trace 0: 0x7fffa0b03d00 [00104004/000000023fde73d4/00000021/ff020200]
>> | Trace 0: 0x7fffa0b03e80 [00104004/000000023fde73d8/00000021/ff020200]
>> | Trace 0: 0x7fffa0b04140 [00104004/000000023fde7408/00000021/ff020200]
>> | Trace 0: 0x7fffa02dd6c0 [00104004/000000023fde70b8/00000021/ff020200]
>> | Trace 0: 0x7fffa02dd800 [00104004/000000023fde7b90/00000021/ff020200]
>> | cpu_io_recompile: rewound execution of TB to 000000023fde7b90
>
> So this happens when an instruction that is not the last instruction of
> the block does some IO. As IO accesses can potentially change system
> state we can't allow more instructions to run in the block that might
> not have that change of state captured
>
> cpu_io_recompile exits the loop and forces the next TranslationBlock to
> be only one (or maybe two instructions). We have to play games with
> instrumentation to avoid double counting execution:
>
> /*
> * Exit the loop and potentially generate a new TB executing the
> * just the I/O insns. We also limit instrumentation to memory
> * operations only (which execute after completion) so we don't
> * double instrument the instruction.
> */
> cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | n;
>
> The instruction is in a weird state having both executed (from the
> plugin point of view) but not changed any state (stopped from doing MMIO
> until the next instruction).
>
>> | Taking exception 5 [IRQ] on CPU 0
>> | ...from EL1 to EL1
>> | ...with ESR 0x0/0x3800000
>> | ...with SPSR 0x20000305
>> | ...with ELR 0x23fde7b90
>> | ...to EL1 PC 0x23fd77a80 PSTATE 0x23c5
>
> I guess before we re-executed the new block an asynchronous interrupt
> came in?
>
> Does changing the above to:
>
> cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | CF_NOIRQ | n;
>
> make the problem go away? It should ensure the next 1/2 instruction
> block execute without checking for async events. See gen_tb_start() for
> the gory details.
>
Thanks, it solves the problem indeed.
I was not sure why this specific block was reexecuted in the case of an IRQ.
>> | Trace 0: 0x7fffa13a8340 [00104004/000000023fd77a80/00000021/ff021201]
>> | Trace 0: 0x7fffa13a8480 [00104004/000000023fd77a84/00000021/ff020200]
>> | Trap target PC mismatch CPU 0
>> | Expected: 23fd77a80
>> | Encountered: 23fd77a84
>> | warning: 44 ./nptl/pthread_kill.c: No such file or directory
>> | Couldn't get registers: No such process.
>>
>> It does show up with both single-core and multi-core VMs, so that at
>> least eliminates some possibilities. Maybe :/
>>
>> The issue is nasty to reproduce in a way that allows any meaningful
>> investigation. It usually involves sifting through many GBs of Qemu logs
>> for maybe one occurance. We could add another testing/dummy plugin that
>> just prints the PC for _any_ instruction executed and have a skript
>> check for non-alternating Trace-lines from Qemu and that Plugin. But
>> then we're talking nearly double the amount of Lines to look through
>> with probably little additional information.
>>
>> Regards,
>> Julian
>
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2025-01-09 13:52 ` Alex Bennée
@ 2025-01-09 22:28 ` Pierrick Bouvier
2025-01-10 11:43 ` Julian Ganz
1 sibling, 0 replies; 70+ messages in thread
From: Pierrick Bouvier @ 2025-01-09 22:28 UTC (permalink / raw)
To: Alex Bennée; +Cc: Julian Ganz, qemu-devel
On 1/9/25 05:52, Alex Bennée wrote:
> Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
>
>> Hi Julian,
>>
>> thanks for the update!
>> Comments below.
>>
>> On 12/2/24 11:26, Julian Ganz wrote:
>>> The plugin API allows registration of callbacks for a variety of VCPU
>>> related events, such as VCPU reset, idle and resume. However, traps of
>>> any kind, i.e. interrupts or exceptions, were previously not covered.
>>> These kinds of events are arguably quite significant and usually go hand
>>> in hand with a PC discontinuity. On most platforms, the discontinuity
>>> also includes a transition from some "mode" to another. Thus, plugins
>>> for the analysis of (virtualized) embedded systems may benefit from or
>>> even require the possiblity to perform work on the occurance of an
>>> interrupt or exception.
>>> This change introduces the concept of such a discontinuity event in
>>> the
>>> form of an enumeration. Currently only traps are covered. Specifically
>>> we (loosely) define interrupts, exceptions and host calls across all
>>> platforms. In addition, this change introduces a type to use for
>>> callback functions related to such events. Since possible modes and the
>>> enumeration of interupts and exceptions vary greatly between different
>>> architectures, the callback type only receives the VCPU id, the type of
>>> event as well as the old and new PC.
>>> ---
>>> include/qemu/plugin.h | 1 +
>>> include/qemu/qemu-plugin.h | 43 ++++++++++++++++++++++++++++++++++++++
>>> 2 files changed, 44 insertions(+)
>>> diff --git a/include/qemu/plugin.h b/include/qemu/plugin.h
>>> index 9726a9ebf3..27a176b631 100644
>>> --- a/include/qemu/plugin.h
>>> +++ b/include/qemu/plugin.h
>>> @@ -59,6 +59,7 @@ union qemu_plugin_cb_sig {
>>> qemu_plugin_udata_cb_t udata;
>>> qemu_plugin_vcpu_simple_cb_t vcpu_simple;
>>> qemu_plugin_vcpu_udata_cb_t vcpu_udata;
>>> + qemu_plugin_vcpu_discon_cb_t vcpu_discon;
>>> qemu_plugin_vcpu_tb_trans_cb_t vcpu_tb_trans;
>>> qemu_plugin_vcpu_mem_cb_t vcpu_mem;
>>> qemu_plugin_vcpu_syscall_cb_t vcpu_syscall;
>>> diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
>>> index 0fba36ae02..9c67374b7e 100644
>>> --- a/include/qemu/qemu-plugin.h
>>> +++ b/include/qemu/qemu-plugin.h
>>> @@ -154,6 +154,49 @@ typedef void (*qemu_plugin_vcpu_simple_cb_t)(qemu_plugin_id_t id,
>>> typedef void (*qemu_plugin_vcpu_udata_cb_t)(unsigned int vcpu_index,
>>> void *userdata);
>>> +
>>> +/**
>>> + * enum qemu_plugin_discon_type - type of a (potential) PC discontinuity
>>> + *
>>> + * @QEMU_PLUGIN_DISCON_INTERRUPT: an interrupt, defined across all architectures
>>> + * as an asynchronous event, usually originating
>>> + * from outside the CPU
>>> + * @QEMU_PLUGIN_DISCON_EXCEPTION: an exception, defined across all architectures
>>> + * as a synchronous event in response to a
>>> + * specific instruction being executed
>>> + * @QEMU_PLUGIN_DISCON_HOSTCALL: a host call, functionally a special kind of
>>> + * exception that is not handled by code run by
>>> + * the vCPU but machinery outside the vCPU
>>> + * @QEMU_PLUGIN_DISCON_ALL: all types of disconinuity events currently covered
>>> + */
>>> +enum qemu_plugin_discon_type {
>>> + QEMU_PLUGIN_DISCON_INTERRUPT = 1,
>>> + QEMU_PLUGIN_DISCON_EXCEPTION = 2,
>>> + QEMU_PLUGIN_DISCON_HOSTCALL = 4,
>>> + QEMU_PLUGIN_DISCON_ALL = 7
>>> +};
>>
>> Matter of style, but would be better to use:
>>
>> enum qemu_plugin_discon_type {
>> QEMU_PLUGIN_DISCON_INTERRUPT = 1 << 0,
>> QEMU_PLUGIN_DISCON_EXCEPTION = 1 << 1,
>> QEMU_PLUGIN_DISCON_HOSTCALL = 1 << 2,
>> QEMU_PLUGIN_DISCON_ALL = -1
>> };
>>
> <snip>
>
> Is this really a bit field though? If you will only report type of
> discontinuity at a time a simple 0 based enum with
> QEMU_PLUGIN_DISCON_MAX would be simpler.
>
Here we want to express ALL, not MAX.
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 02/11] plugins: add API for registering discontinuity callbacks
2025-01-09 13:57 ` Alex Bennée
@ 2025-01-10 11:40 ` Julian Ganz
0 siblings, 0 replies; 70+ messages in thread
From: Julian Ganz @ 2025-01-10 11:40 UTC (permalink / raw)
To: Alex Bennée
Cc: qemu-devel, Alexandre Iooss, Mahmoud Mandour, Pierrick Bouvier
Hi Alex,
January 9, 2025 at 2:57 PM, "Alex Bennée" wrote:
> Julian Ganz <neither@nut.email> writes:
> > diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
> > index 9c67374b7e..f998a465e5 100644
> > --- a/include/qemu/qemu-plugin.h
> > +++ b/include/qemu/qemu-plugin.h
> > @@ -273,6 +273,21 @@ QEMU_PLUGIN_API
> > void qemu_plugin_register_vcpu_resume_cb(qemu_plugin_id_t id,
> > qemu_plugin_vcpu_simple_cb_t cb);
> >
> > +/**
> > + * qemu_plugin_register_vcpu_discon_cb() - register a discontinuity callback
> > + * @id: plugin ID
> > + * @cb: callback function
> > + *
> > + * The @cb function is called every time a vCPU receives a discontinuity event
> > + * of the specified type(s), after the vCPU was prepared to handle the event.
> > + * Preparation usually entails updating the PC to some interrupt handler or trap
> > + * vector entry.
> >
> The "usually" here is a bit of a weasel word. We should be clear what
> the contract is with the plugin. Can we say the PC will be updated to
> the next instruction that will execute after the callback?
The contract is indeed clear: the PC will always be updated to the
instruction that will be executed next, at least if we don't have a
second discontinuity (e.g. jump to an unmapped page). The "usually"
refers to the discontinuity itself: in the case of host calls, we don't
observe a "jump" and the next instruction executed will just be the
instruction following the call.
I could have phrased this better, and will make this more clear in the
next update.
Regards,
Julian
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities
2025-01-09 13:52 ` Alex Bennée
2025-01-09 22:28 ` Pierrick Bouvier
@ 2025-01-10 11:43 ` Julian Ganz
1 sibling, 0 replies; 70+ messages in thread
From: Julian Ganz @ 2025-01-10 11:43 UTC (permalink / raw)
To: Alex Bennée, Pierrick Bouvier; +Cc: qemu-devel
Hi Alex,
January 9, 2025 at 2:52 PM, "Alex Bennée" wrote:
> Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
> > On 12/2/24 11:26, Julian Ganz wrote:
> > > diff --git a/include/qemu/qemu-plugin.h b/include/qemu/qemu-plugin.h
> > > index 0fba36ae02..9c67374b7e 100644
> > > --- a/include/qemu/qemu-plugin.h
> > > +++ b/include/qemu/qemu-plugin.h
> > > @@ -154,6 +154,49 @@ typedef void (*qemu_plugin_vcpu_simple_cb_t)(qemu_plugin_id_t id,
> > > typedef void (*qemu_plugin_vcpu_udata_cb_t)(unsigned int vcpu_index,
> > > void *userdata);
> > > +
> > > +/**
> > > + * enum qemu_plugin_discon_type - type of a (potential) PC discontinuity
> > > + *
> > > + * @QEMU_PLUGIN_DISCON_INTERRUPT: an interrupt, defined across all architectures
> > > + * as an asynchronous event, usually originating
> > > + * from outside the CPU
> > > + * @QEMU_PLUGIN_DISCON_EXCEPTION: an exception, defined across all architectures
> > > + * as a synchronous event in response to a
> > > + * specific instruction being executed
> > > + * @QEMU_PLUGIN_DISCON_HOSTCALL: a host call, functionally a special kind of
> > > + * exception that is not handled by code run by
> > > + * the vCPU but machinery outside the vCPU
> > > + * @QEMU_PLUGIN_DISCON_ALL: all types of disconinuity events currently covered
> > > + */
> > > +enum qemu_plugin_discon_type {
> > > + QEMU_PLUGIN_DISCON_INTERRUPT = 1,
> > > + QEMU_PLUGIN_DISCON_EXCEPTION = 2,
> > > + QEMU_PLUGIN_DISCON_HOSTCALL = 4,
> > > + QEMU_PLUGIN_DISCON_ALL = 7
> > > +};
> > >
> > Matter of style, but would be better to use:
> >
> > enum qemu_plugin_discon_type {
> > QEMU_PLUGIN_DISCON_INTERRUPT = 1 << 0,
> > QEMU_PLUGIN_DISCON_EXCEPTION = 1 << 1,
> > QEMU_PLUGIN_DISCON_HOSTCALL = 1 << 2,
> > QEMU_PLUGIN_DISCON_ALL = -1
> > };
> >
> <snip>
>
> Is this really a bit field though? If you will only report type of
> discontinuity at a time a simple 0 based enum with
> QEMU_PLUGIN_DISCON_MAX would be simpler.
We don't only use this type to communicate the kind of discontinuity but
also when registering callbacks. I'll make this more clear in the commit
message and/or documentation in the next series.
Regards,
Julian
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API
2025-01-09 14:04 ` Alex Bennée
2025-01-09 22:10 ` Pierrick Bouvier
@ 2025-01-10 11:49 ` Julian Ganz
2025-01-10 15:15 ` Alex Bennée
1 sibling, 1 reply; 70+ messages in thread
From: Julian Ganz @ 2025-01-10 11:49 UTC (permalink / raw)
To: Alex Bennée
Cc: qemu-devel, Alexandre Iooss, Mahmoud Mandour, Pierrick Bouvier
Hi Alex,
January 9, 2025 at 3:04 PM, "Alex Bennée" wrote:
> Julian Ganz <neither@nut.email> writes:
> > We recently introduced new plugin API for registration of discontinuity
> > related callbacks. This change introduces a minimal plugin showcasing
> > the new API. It simply counts the occurances of interrupts, exceptions
> > and host calls per CPU and reports the counts when exitting.
> > ---
> > contrib/plugins/meson.build | 3 +-
> > contrib/plugins/traps.c | 96 +++++++++++++++++++++++++++++++++++++
> > 2 files changed, 98 insertions(+), 1 deletion(-)
> > create mode 100644 contrib/plugins/traps.c
> >
> > diff --git a/contrib/plugins/meson.build b/contrib/plugins/meson.build
> > index 63a32c2b4f..9a3015e1c1 100644
> > --- a/contrib/plugins/meson.build
> > +++ b/contrib/plugins/meson.build
> > @@ -1,5 +1,6 @@
> > contrib_plugins = ['bbv', 'cache', 'cflow', 'drcov', 'execlog', 'hotblocks',
> > - 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger']
> > + 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger',
> > + 'traps']
> >
> I wonder if this is better in tests/tcg/plugins? We need to do something
> to ensure it gets covered by CI although we might want to be smarter
> about running it together with a test binary that will actually pick up
> something.
The callback is intended as an example. The patch-series does contain a
dedicated testing plugin. And iirc the contrib plugins are now built
with the rest of qemu anyway?
> > +QEMU_PLUGIN_EXPORT
> > +int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
> > + int argc, char **argv)
> > +{
> > + if (!info->system_emulation) {
> > + fputs("trap plugin can only be used in system emulation mode.\n",
> > + stderr);
> > + return -1;
> > + }
> > +
> > + max_vcpus = info->system.max_vcpus;
> > + traps = qemu_plugin_scoreboard_new(sizeof(TrapCounters));
> > + qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
> > + qemu_plugin_vcpu_for_each(id, vcpu_init);
> >
> Hmm at first glances this seems redundant - however I guess this is
> covering the use case you load the plugin after the system is up and
> running.
Yep, but really that was just me being paranoid.
> I wonder if you have unearthed a foot-gun in the API that is easy to
> fall into? Maybe we should expand qemu_plugin_register_vcpu_init_cb to
> call the call back immediately for existing vcpus?
Would probably not hurt.
Regards,
Julian
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc
2025-01-09 16:33 ` Alex Bennée
2025-01-09 22:27 ` Pierrick Bouvier
@ 2025-01-10 11:58 ` Julian Ganz
1 sibling, 0 replies; 70+ messages in thread
From: Julian Ganz @ 2025-01-10 11:58 UTC (permalink / raw)
To: Alex Bennée
Cc: Pierrick Bouvier, qemu-devel, Alexandre Iooss, Mahmoud Mandour,
Richard Henderson
Hi Alex,
Sorry for tha late reply.
January 9, 2025 at 5:33 PM, "Alex Bennée" wrote:
> "Julian Ganz" <neither@nut.email> writes:
>
> (Add Richard to CC)
>
> >
> > Hi Pierrick,
> >
> > December 5, 2024 at 12:33 AM, "Pierrick Bouvier" wrote:
> >
> > >
> > > On 12/2/24 11:41, Julian Ganz wrote:
> > > > +static void insn_exec(unsigned int vcpu_index, void *userdata)
> > > > +{
> > > > + struct cpu_state *state = qemu_plugin_scoreboard_find(states, vcpu_index);
> > > > + uint64_t pc = (uint64_t) userdata;
> > > > + GString* report;
> > > > +
> > > > + if (state->has_next) {
> > > > + if (state->next_pc != pc) {
> > > > + report = g_string_new("Trap target PC mismatch\n");
> > > > + g_string_append_printf(report,
> > > > + "Expected: %"PRIx64"\nEncountered: %"
> > > > + PRIx64"\n",
> > > > + state->next_pc, pc);
> > > > + qemu_plugin_outs(report->str);
> > > > + if (abort_on_mismatch) {
> > > > + g_abort();
> > > > + }
> > > > + g_string_free(report, true);
> > > > + }
> > > > + state->has_next = false;
> > > > + }
> > > > +}
> > > >
> > > When booting an arm64 vm, I get this message:
> > > Trap target PC mismatch
> > > Expected: 23faf3a80
> > > Encountered: 23faf3a84
> > >
> > A colleague of mine went to great lengths trying to track and reliably
> > reproduce this. We think that it's something amiss with the existing
> > instruction exec callback infrastructure. So... it's not something I'll
> > be addressing with the next iteration as it's out of scope. We'll
> > probably continue looking into it, though.
> >
> > The mismatch is reported perfectly normal and boring exceptions and
> > interrupts with no indication of any differences to other (not reported)
> > events that fire on a regular basis. Apparently, once in a blue moon
> > (relatively speaking), for the first instruction of a handler (even
> > though it is definitely executed and qemu does print a trace-line for
> > that instruction):
> >
> > | Trace 0: 0x7fffa0b03900 [00104004/000000023fde73b4/00000021/ff020200]
> > | Trace 0: 0x7fffa02d9580 [00104004/000000023fde72b8/00000021/ff020200]
> > | Trace 0: 0x7fffa02dfc40 [00104004/000000023fde7338/00000021/ff020200]
> > | Trace 0: 0x7fffa0b03d00 [00104004/000000023fde73d4/00000021/ff020200]
> > | Trace 0: 0x7fffa0b03e80 [00104004/000000023fde73d8/00000021/ff020200]
> > | Trace 0: 0x7fffa0b04140 [00104004/000000023fde7408/00000021/ff020200]
> > | Trace 0: 0x7fffa02dd6c0 [00104004/000000023fde70b8/00000021/ff020200]
> > | Trace 0: 0x7fffa02dd800 [00104004/000000023fde7b90/00000021/ff020200]
> > | cpu_io_recompile: rewound execution of TB to 000000023fde7b90
> >
> So this happens when an instruction that is not the last instruction of
> the block does some IO. As IO accesses can potentially change system
> state we can't allow more instructions to run in the block that might
> not have that change of state captured
>
> cpu_io_recompile exits the loop and forces the next TranslationBlock to
> be only one (or maybe two instructions). We have to play games with
> instrumentation to avoid double counting execution:
>
> /*
> * Exit the loop and potentially generate a new TB executing the
> * just the I/O insns. We also limit instrumentation to memory
> * operations only (which execute after completion) so we don't
> * double instrument the instruction.
> */
> cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | n;
>
> The instruction is in a weird state having both executed (from the
> plugin point of view) but not changed any state (stopped from doing MMIO
> until the next instruction).
>
> >
> > | Taking exception 5 [IRQ] on CPU 0
> > | ...from EL1 to EL1
> > | ...with ESR 0x0/0x3800000
> > | ...with SPSR 0x20000305
> > | ...with ELR 0x23fde7b90
> > | ...to EL1 PC 0x23fd77a80 PSTATE 0x23c5
> >
> I guess before we re-executed the new block an asynchronous interrupt
> came in?
>
> Does changing the above to:
>
> cpu->cflags_next_tb = curr_cflags(cpu) | CF_MEMI_ONLY | CF_NOIRQ | n;
>
> make the problem go away? It should ensure the next 1/2 instruction
> block execute without checking for async events. See gen_tb_start() for
> the gory details.
So my collegue was tracing this issue himself over the last week(a) and
arrived more or less at the same conclusion. According to him we don't
observe the issue anymore, either.
Thanks for the fix!
Regards,
Julian
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API
2025-01-10 11:49 ` Julian Ganz
@ 2025-01-10 15:15 ` Alex Bennée
2025-01-10 21:02 ` Pierrick Bouvier
0 siblings, 1 reply; 70+ messages in thread
From: Alex Bennée @ 2025-01-10 15:15 UTC (permalink / raw)
To: Julian Ganz
Cc: qemu-devel, Alexandre Iooss, Mahmoud Mandour, Pierrick Bouvier
"Julian Ganz" <neither@nut.email> writes:
> Hi Alex,
>
> January 9, 2025 at 3:04 PM, "Alex Bennée" wrote:
>> Julian Ganz <neither@nut.email> writes:
>> > We recently introduced new plugin API for registration of discontinuity
>> > related callbacks. This change introduces a minimal plugin showcasing
>> > the new API. It simply counts the occurances of interrupts, exceptions
>> > and host calls per CPU and reports the counts when exitting.
>> > ---
>> > contrib/plugins/meson.build | 3 +-
>> > contrib/plugins/traps.c | 96 +++++++++++++++++++++++++++++++++++++
>> > 2 files changed, 98 insertions(+), 1 deletion(-)
>> > create mode 100644 contrib/plugins/traps.c
>> >
>> > diff --git a/contrib/plugins/meson.build b/contrib/plugins/meson.build
>> > index 63a32c2b4f..9a3015e1c1 100644
>> > --- a/contrib/plugins/meson.build
>> > +++ b/contrib/plugins/meson.build
>> > @@ -1,5 +1,6 @@
>> > contrib_plugins = ['bbv', 'cache', 'cflow', 'drcov', 'execlog', 'hotblocks',
>> > - 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger']
>> > + 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger',
>> > + 'traps']
>> >
>> I wonder if this is better in tests/tcg/plugins? We need to do something
>> to ensure it gets covered by CI although we might want to be smarter
>> about running it together with a test binary that will actually pick up
>> something.
>
> The callback is intended as an example. The patch-series does contain a
> dedicated testing plugin. And iirc the contrib plugins are now built
> with the rest of qemu anyway?
They do - however we generate additional tests with tests/tcg/plugins
with the existing multiarch linux-user and softmmu check-tcg tests. Its
a fairly dumb expansion though:
# We need to ensure expand the run-plugin-TEST-with-PLUGIN
# pre-requistes manually here as we can't use stems to handle it. We
# only expand MULTIARCH_TESTS which are common on most of our targets
# to avoid an exponential explosion as new tests are added. We also
# add some special helpers the run-plugin- rules can use below.
# In more, extra tests can be added using ADDITIONAL_PLUGINS_TESTS variable.
ifneq ($(MULTIARCH_TESTS),)
$(foreach p,$(PLUGINS), \
$(foreach t,$(MULTIARCH_TESTS) $(ADDITIONAL_PLUGINS_TESTS),\
$(eval run-plugin-$(t)-with-$(p): $t $p) \
$(eval RUN_TESTS+=run-plugin-$(t)-with-$(p))))
endif # MULTIARCH_TESTS
endif # CONFIG_PLUGIN
We also have a hand-hacked test for validating memory instrumentation:
# Test plugin memory access instrumentation
run-plugin-test-plugin-mem-access-with-libmem.so: \
PLUGIN_ARGS=$(COMMA)print-accesses=true
run-plugin-test-plugin-mem-access-with-libmem.so: \
CHECK_PLUGIN_OUTPUT_COMMAND= \
$(SRC_PATH)/tests/tcg/multiarch/check-plugin-output.sh \
$(QEMU) $<
test-plugin-mem-access: CFLAGS+=-pthread -O0
test-plugin-mem-access: LDFLAGS+=-pthread -O0
That said as I mention in the reply to the cover letter the traps stuff
might be better exercised with the functional test so could utilise a
plugin built in contrib just as easily.
>
>> > +QEMU_PLUGIN_EXPORT
>> > +int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
>> > + int argc, char **argv)
>> > +{
>> > + if (!info->system_emulation) {
>> > + fputs("trap plugin can only be used in system emulation mode.\n",
>> > + stderr);
>> > + return -1;
>> > + }
>> > +
>> > + max_vcpus = info->system.max_vcpus;
>> > + traps = qemu_plugin_scoreboard_new(sizeof(TrapCounters));
>> > + qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
>> > + qemu_plugin_vcpu_for_each(id, vcpu_init);
>> >
>> Hmm at first glances this seems redundant - however I guess this is
>> covering the use case you load the plugin after the system is up and
>> running.
>
> Yep, but really that was just me being paranoid.
>
>> I wonder if you have unearthed a foot-gun in the API that is easy to
>> fall into? Maybe we should expand qemu_plugin_register_vcpu_init_cb to
>> call the call back immediately for existing vcpus?
>
> Would probably not hurt.
>
> Regards,
> Julian
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API
2025-01-10 15:15 ` Alex Bennée
@ 2025-01-10 21:02 ` Pierrick Bouvier
2025-01-11 12:15 ` Alex Bennée
0 siblings, 1 reply; 70+ messages in thread
From: Pierrick Bouvier @ 2025-01-10 21:02 UTC (permalink / raw)
To: Alex Bennée, Julian Ganz
Cc: qemu-devel, Alexandre Iooss, Mahmoud Mandour
On 1/10/25 07:15, Alex Bennée wrote:
> "Julian Ganz" <neither@nut.email> writes:
>
>> Hi Alex,
>>
>> January 9, 2025 at 3:04 PM, "Alex Bennée" wrote:
>>> Julian Ganz <neither@nut.email> writes:
>>>> We recently introduced new plugin API for registration of discontinuity
>>>> related callbacks. This change introduces a minimal plugin showcasing
>>>> the new API. It simply counts the occurances of interrupts, exceptions
>>>> and host calls per CPU and reports the counts when exitting.
>>>> ---
>>>> contrib/plugins/meson.build | 3 +-
>>>> contrib/plugins/traps.c | 96 +++++++++++++++++++++++++++++++++++++
>>>> 2 files changed, 98 insertions(+), 1 deletion(-)
>>>> create mode 100644 contrib/plugins/traps.c
>>>>
>>>> diff --git a/contrib/plugins/meson.build b/contrib/plugins/meson.build
>>>> index 63a32c2b4f..9a3015e1c1 100644
>>>> --- a/contrib/plugins/meson.build
>>>> +++ b/contrib/plugins/meson.build
>>>> @@ -1,5 +1,6 @@
>>>> contrib_plugins = ['bbv', 'cache', 'cflow', 'drcov', 'execlog', 'hotblocks',
>>>> - 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger']
>>>> + 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger',
>>>> + 'traps']
>>>>
>>> I wonder if this is better in tests/tcg/plugins? We need to do something
>>> to ensure it gets covered by CI although we might want to be smarter
>>> about running it together with a test binary that will actually pick up
>>> something.
>>
>> The callback is intended as an example. The patch-series does contain a
>> dedicated testing plugin. And iirc the contrib plugins are now built
>> with the rest of qemu anyway?
>
> They do - however we generate additional tests with tests/tcg/plugins
> with the existing multiarch linux-user and softmmu check-tcg tests. Its
> a fairly dumb expansion though:
>
> # We need to ensure expand the run-plugin-TEST-with-PLUGIN
> # pre-requistes manually here as we can't use stems to handle it. We
> # only expand MULTIARCH_TESTS which are common on most of our targets
> # to avoid an exponential explosion as new tests are added. We also
> # add some special helpers the run-plugin- rules can use below.
> # In more, extra tests can be added using ADDITIONAL_PLUGINS_TESTS variable.
>
> ifneq ($(MULTIARCH_TESTS),)
> $(foreach p,$(PLUGINS), \
> $(foreach t,$(MULTIARCH_TESTS) $(ADDITIONAL_PLUGINS_TESTS),\
> $(eval run-plugin-$(t)-with-$(p): $t $p) \
> $(eval RUN_TESTS+=run-plugin-$(t)-with-$(p))))
> endif # MULTIARCH_TESTS
> endif # CONFIG_PLUGIN
>
> We also have a hand-hacked test for validating memory instrumentation:
>
> # Test plugin memory access instrumentation
> run-plugin-test-plugin-mem-access-with-libmem.so: \
> PLUGIN_ARGS=$(COMMA)print-accesses=true
> run-plugin-test-plugin-mem-access-with-libmem.so: \
> CHECK_PLUGIN_OUTPUT_COMMAND= \
> $(SRC_PATH)/tests/tcg/multiarch/check-plugin-output.sh \
> $(QEMU) $<
>
> test-plugin-mem-access: CFLAGS+=-pthread -O0
> test-plugin-mem-access: LDFLAGS+=-pthread -O0
>
> That said as I mention in the reply to the cover letter the traps stuff
> might be better exercised with the functional test so could utilise a
> plugin built in contrib just as easily.
>
I agree, as it was discussed in previous versions, we should add a
functional test for this. I'm not sure if we should write a custom and
complicated test, or simply boot and shutdown an existing image, and
call it a day.
Do you have any opinion on this Alex?
>>
>>>> +QEMU_PLUGIN_EXPORT
>>>> +int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
>>>> + int argc, char **argv)
>>>> +{
>>>> + if (!info->system_emulation) {
>>>> + fputs("trap plugin can only be used in system emulation mode.\n",
>>>> + stderr);
>>>> + return -1;
>>>> + }
>>>> +
>>>> + max_vcpus = info->system.max_vcpus;
>>>> + traps = qemu_plugin_scoreboard_new(sizeof(TrapCounters));
>>>> + qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
>>>> + qemu_plugin_vcpu_for_each(id, vcpu_init);
>>>>
>>> Hmm at first glances this seems redundant - however I guess this is
>>> covering the use case you load the plugin after the system is up and
>>> running.
>>
>> Yep, but really that was just me being paranoid.
>>
>>> I wonder if you have unearthed a foot-gun in the API that is easy to
>>> fall into? Maybe we should expand qemu_plugin_register_vcpu_init_cb to
>>> call the call back immediately for existing vcpus?
>>
>> Would probably not hurt.
>>
>> Regards,
>> Julian
>
^ permalink raw reply [flat|nested] 70+ messages in thread
* Re: [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API
2025-01-10 21:02 ` Pierrick Bouvier
@ 2025-01-11 12:15 ` Alex Bennée
0 siblings, 0 replies; 70+ messages in thread
From: Alex Bennée @ 2025-01-11 12:15 UTC (permalink / raw)
To: Pierrick Bouvier
Cc: Julian Ganz, qemu-devel, Alexandre Iooss, Mahmoud Mandour
Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
> On 1/10/25 07:15, Alex Bennée wrote:
>> "Julian Ganz" <neither@nut.email> writes:
>>
>>> Hi Alex,
>>>
>>> January 9, 2025 at 3:04 PM, "Alex Bennée" wrote:
>>>> Julian Ganz <neither@nut.email> writes:
>>>>> We recently introduced new plugin API for registration of discontinuity
>>>>> related callbacks. This change introduces a minimal plugin showcasing
>>>>> the new API. It simply counts the occurances of interrupts, exceptions
>>>>> and host calls per CPU and reports the counts when exitting.
>>>>> ---
>>>>> contrib/plugins/meson.build | 3 +-
>>>>> contrib/plugins/traps.c | 96 +++++++++++++++++++++++++++++++++++++
>>>>> 2 files changed, 98 insertions(+), 1 deletion(-)
>>>>> create mode 100644 contrib/plugins/traps.c
>>>>>
>>>>> diff --git a/contrib/plugins/meson.build b/contrib/plugins/meson.build
>>>>> index 63a32c2b4f..9a3015e1c1 100644
>>>>> --- a/contrib/plugins/meson.build
>>>>> +++ b/contrib/plugins/meson.build
>>>>> @@ -1,5 +1,6 @@
>>>>> contrib_plugins = ['bbv', 'cache', 'cflow', 'drcov', 'execlog', 'hotblocks',
>>>>> - 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger']
>>>>> + 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger',
>>>>> + 'traps']
>>>>>
>>>> I wonder if this is better in tests/tcg/plugins? We need to do something
>>>> to ensure it gets covered by CI although we might want to be smarter
>>>> about running it together with a test binary that will actually pick up
>>>> something.
>>>
>>> The callback is intended as an example. The patch-series does contain a
>>> dedicated testing plugin. And iirc the contrib plugins are now built
>>> with the rest of qemu anyway?
>> They do - however we generate additional tests with
>> tests/tcg/plugins
>> with the existing multiarch linux-user and softmmu check-tcg tests. Its
>> a fairly dumb expansion though:
>> # We need to ensure expand the run-plugin-TEST-with-PLUGIN
>> # pre-requistes manually here as we can't use stems to handle it. We
>> # only expand MULTIARCH_TESTS which are common on most of our targets
>> # to avoid an exponential explosion as new tests are added. We also
>> # add some special helpers the run-plugin- rules can use below.
>> # In more, extra tests can be added using ADDITIONAL_PLUGINS_TESTS variable.
>> ifneq ($(MULTIARCH_TESTS),)
>> $(foreach p,$(PLUGINS), \
>> $(foreach t,$(MULTIARCH_TESTS) $(ADDITIONAL_PLUGINS_TESTS),\
>> $(eval run-plugin-$(t)-with-$(p): $t $p) \
>> $(eval RUN_TESTS+=run-plugin-$(t)-with-$(p))))
>> endif # MULTIARCH_TESTS
>> endif # CONFIG_PLUGIN
>> We also have a hand-hacked test for validating memory
>> instrumentation:
>> # Test plugin memory access instrumentation
>> run-plugin-test-plugin-mem-access-with-libmem.so: \
>> PLUGIN_ARGS=$(COMMA)print-accesses=true
>> run-plugin-test-plugin-mem-access-with-libmem.so: \
>> CHECK_PLUGIN_OUTPUT_COMMAND= \
>> $(SRC_PATH)/tests/tcg/multiarch/check-plugin-output.sh \
>> $(QEMU) $<
>> test-plugin-mem-access: CFLAGS+=-pthread -O0
>> test-plugin-mem-access: LDFLAGS+=-pthread -O0
>> That said as I mention in the reply to the cover letter the traps
>> stuff
>> might be better exercised with the functional test so could utilise a
>> plugin built in contrib just as easily.
>>
>
> I agree, as it was discussed in previous versions, we should add a
> functional test for this. I'm not sure if we should write a custom and
> complicated test, or simply boot and shutdown an existing image, and
> call it a day.
>
> Do you have any opinion on this Alex?
An existing image based test would be fine although I'd favour one that
had multiple exception types (e.g. something with firmware and
hypervisor transitions on Arm or equivalent on other arches.)
>
>>>
>>>>> +QEMU_PLUGIN_EXPORT
>>>>> +int qemu_plugin_install(qemu_plugin_id_t id, const qemu_info_t *info,
>>>>> + int argc, char **argv)
>>>>> +{
>>>>> + if (!info->system_emulation) {
>>>>> + fputs("trap plugin can only be used in system emulation mode.\n",
>>>>> + stderr);
>>>>> + return -1;
>>>>> + }
>>>>> +
>>>>> + max_vcpus = info->system.max_vcpus;
>>>>> + traps = qemu_plugin_scoreboard_new(sizeof(TrapCounters));
>>>>> + qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
>>>>> + qemu_plugin_vcpu_for_each(id, vcpu_init);
>>>>>
>>>> Hmm at first glances this seems redundant - however I guess this is
>>>> covering the use case you load the plugin after the system is up and
>>>> running.
>>>
>>> Yep, but really that was just me being paranoid.
>>>
>>>> I wonder if you have unearthed a foot-gun in the API that is easy to
>>>> fall into? Maybe we should expand qemu_plugin_register_vcpu_init_cb to
>>>> call the call back immediately for existing vcpus?
>>>
>>> Would probably not hurt.
>>>
>>> Regards,
>>> Julian
>>
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 70+ messages in thread
end of thread, other threads:[~2025-01-11 12:16 UTC | newest]
Thread overview: 70+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-12-02 19:26 [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 01/11] plugins: add types for callbacks related to certain discontinuities Julian Ganz
2024-12-03 8:45 ` Julian Ganz
2024-12-04 22:41 ` Pierrick Bouvier
2024-12-05 12:40 ` Julian Ganz
2024-12-05 17:56 ` Pierrick Bouvier
2024-12-05 21:50 ` Julian Ganz
2024-12-05 22:14 ` Julian Ganz
2024-12-05 23:03 ` Pierrick Bouvier
2024-12-06 8:58 ` Julian Ganz
2024-12-06 18:59 ` Pierrick Bouvier
2024-12-07 13:38 ` Julian Ganz
2024-12-09 18:52 ` Pierrick Bouvier
2024-12-04 22:45 ` Pierrick Bouvier
2024-12-05 12:44 ` Julian Ganz
2024-12-05 17:35 ` Pierrick Bouvier
2024-12-05 21:25 ` Julian Ganz
2025-01-09 13:52 ` Alex Bennée
2025-01-09 22:28 ` Pierrick Bouvier
2025-01-10 11:43 ` Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 02/11] plugins: add API for registering discontinuity callbacks Julian Ganz
2024-12-04 22:45 ` Pierrick Bouvier
2025-01-09 13:57 ` Alex Bennée
2025-01-10 11:40 ` Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 03/11] plugins: add hooks for new discontinuity related callbacks Julian Ganz
2024-12-04 22:47 ` Pierrick Bouvier
2025-01-09 13:58 ` Alex Bennée
2024-12-02 19:26 ` [RFC PATCH v3 04/11] contrib/plugins: add plugin showcasing new dicontinuity related API Julian Ganz
2024-12-04 23:14 ` Pierrick Bouvier
2024-12-05 13:00 ` Julian Ganz
2024-12-05 17:23 ` Pierrick Bouvier
2025-01-09 14:04 ` Alex Bennée
2025-01-09 22:10 ` Pierrick Bouvier
2025-01-10 11:49 ` Julian Ganz
2025-01-10 15:15 ` Alex Bennée
2025-01-10 21:02 ` Pierrick Bouvier
2025-01-11 12:15 ` Alex Bennée
2024-12-02 19:26 ` [RFC PATCH v3 05/11] target/alpha: call plugin trap callbacks Julian Ganz
2024-12-04 22:48 ` Pierrick Bouvier
2024-12-02 19:26 ` [RFC PATCH v3 06/11] target/arm: " Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 07/11] target/avr: " Julian Ganz
2024-12-02 19:26 ` [RFC PATCH v3 08/11] target/mips: " Julian Ganz
2025-01-09 13:43 ` Alex Bennée
2024-12-02 19:26 ` [RFC PATCH v3 09/11] target/riscv: " Julian Ganz
2024-12-03 4:39 ` Alistair Francis
2024-12-02 19:41 ` [RFC PATCH v3 10/11] target/sparc: " Julian Ganz
2025-01-09 13:46 ` Alex Bennée
2024-12-02 19:41 ` [RFC PATCH v3 11/11] tests: add plugin asserting correctness of discon event's to_pc Julian Ganz
2024-12-04 23:33 ` Pierrick Bouvier
2024-12-05 13:10 ` Julian Ganz
2024-12-05 17:30 ` Pierrick Bouvier
2024-12-05 21:22 ` Julian Ganz
2024-12-05 22:28 ` Pierrick Bouvier
2024-12-06 8:42 ` Julian Ganz
2024-12-06 19:02 ` Pierrick Bouvier
2024-12-06 19:42 ` Richard Henderson
2024-12-06 20:40 ` Pierrick Bouvier
2024-12-06 22:56 ` Richard Henderson
2024-12-07 13:47 ` Julian Ganz
2024-12-07 13:41 ` Julian Ganz
2024-12-20 11:47 ` Julian Ganz
2024-12-20 21:17 ` Pierrick Bouvier
2024-12-20 21:46 ` Pierrick Bouvier
2025-01-09 16:35 ` Alex Bennée
2025-01-09 16:33 ` Alex Bennée
2025-01-09 22:27 ` Pierrick Bouvier
2025-01-10 11:58 ` Julian Ganz
2024-12-03 8:36 ` [RFC PATCH v3 00/11] tcg-plugins: add hooks for discontinuities Julian Ganz
2024-12-04 22:51 ` Pierrick Bouvier
2025-01-09 16:43 ` Alex Bennée
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).