* [PATCH v4 0/7] contrib/plugins: uftrace
@ 2025-07-24 20:45 Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 1/7] contrib/plugins/uftrace: new uftrace plugin Pierrick Bouvier
` (7 more replies)
0 siblings, 8 replies; 14+ messages in thread
From: Pierrick Bouvier @ 2025-07-24 20:45 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Maydell, Alexandre Iooss, Gustavo Romero, Alex Bennée,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson, Pierrick Bouvier
This plugin generates a binary trace compatible with the excellent uftrace:
https://github.com/namhyung/uftrace
In short, it tracks all function calls performed during execution, based on
frame pointer analysis. A big advantage over "uftrace record" is that it works
in system mode, allowing to trace a full system execution, which was the
original goal. It works as well in user mode, but uftrace itself already does
this. It's implemented for aarch64 only (with the intent to add x86_64 later).
Let's start with concrete examples of the result.
First, in system mode, booting a stack using TF-A + U-boot + Linux:
- Two first stages of boot sequence in Arm Trusted Firmware (EL3 and S-EL1)
https://fileserver.linaro.org/s/kkxBS552W7nYESX/preview
- Stat and open syscalls in kernel
https://fileserver.linaro.org/s/dXe4MfraKg2F476/preview
- Poweroff sequence (from kernel back to firmware, NS-EL2 to EL3)
https://fileserver.linaro.org/s/oR2PtyGKJrqnfRf/preview
Full trace is available here:
https://fileserver.linaro.org/s/WsemLboPEzo24nw/download/aarch64_boot.json.gz
You can download and open it on https://ui.perfetto.dev/ to explore it.
Second, in user mode, tracing qemu-aarch64 (itself) running git --help:
- Loading program and its interpreter
https://fileserver.linaro.org/s/fie8JgX76yyL5cq/preview
- TB creation
https://fileserver.linaro.org/s/GXY6NKMw5EeRCew/preview
Full trace is available here:
https://fileserver.linaro.org/s/N8X8fnZ5yGRZLsT/download/qemu_aarch64_git_help.json.gz
If you had curiosity and now you're ready to give some attention, most of the
details you want to read are included in the documentation patch (final one).
Overhead is around x2 (sampling only) to x10-x15 (precise), and long traces can
be directly filtered with uftrace if needed.
The series is splitted in:
- implementing the plugin
- adding useful options (especially sampling and privilege level tracing)
- add a companion script to symbolize traces generated
- add documentation with examples
I hope this plugin can help people trying to understand what happens out of the
user space, and get a better grasp of how firmwares, bootloader, and kernel
interact behind the curtain.
v2
--
- trace active stacks on exit
- do not erase map generated in system_emulation
- add documentation to generate restricted visual traces around specific events
of execution
v3
--
- fix missing include unistd.h (build failed on MacOS only)
v4
--
- add support for x64
Pierrick Bouvier (7):
contrib/plugins/uftrace: new uftrace plugin
contrib/plugins/uftrace: add trace-sample option
contrib/plugins/uftrace: add trace-privilege-level option
contrib/plugins/uftrace: add timestamp-based-on-real-time option
contrib/plugins/uftrace_symbols.py
contrib/plugins/uftrace: add documentation
contrib/plugins/uftrace: add support for x64
docs/about/emulation.rst | 213 ++++++
contrib/plugins/uftrace.c | 1014 ++++++++++++++++++++++++++++
contrib/plugins/meson.build | 3 +-
contrib/plugins/uftrace_symbols.py | 152 +++++
4 files changed, 1381 insertions(+), 1 deletion(-)
create mode 100644 contrib/plugins/uftrace.c
create mode 100755 contrib/plugins/uftrace_symbols.py
--
2.47.2
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH v4 1/7] contrib/plugins/uftrace: new uftrace plugin
2025-07-24 20:45 [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
@ 2025-07-24 20:45 ` Pierrick Bouvier
2025-08-05 10:04 ` Alex Bennée
2025-07-24 20:45 ` [PATCH v4 2/7] contrib/plugins/uftrace: add trace-sample option Pierrick Bouvier
` (6 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Pierrick Bouvier @ 2025-07-24 20:45 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Maydell, Alexandre Iooss, Gustavo Romero, Alex Bennée,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson, Pierrick Bouvier
This plugin generates a binary trace compatible with:
https://github.com/namhyung/uftrace
It tracks frame pointer during execution, detecting function
calls/returns and works in system and user mode.
It's implemented for aarch64 only (adding other architecture should
be trivial, especially x86_64 that share the same stack layout).
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
contrib/plugins/uftrace.c | 630 ++++++++++++++++++++++++++++++++++++
contrib/plugins/meson.build | 3 +-
2 files changed, 632 insertions(+), 1 deletion(-)
create mode 100644 contrib/plugins/uftrace.c
diff --git a/contrib/plugins/uftrace.c b/contrib/plugins/uftrace.c
new file mode 100644
index 00000000000..4ca6d3fd68d
--- /dev/null
+++ b/contrib/plugins/uftrace.c
@@ -0,0 +1,630 @@
+/*
+ * Copyright (C) 2025, Pierrick Bouvier <pierrick.bouvier@linaro.org>
+ *
+ * Generates a trace compatible with uftrace (similar to uftrace record).
+ * https://github.com/namhyung/uftrace
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include <qemu-plugin.h>
+#include <glib.h>
+#include <stdio.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <time.h>
+#include <unistd.h>
+
+QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
+
+typedef struct {
+ GArray *s;
+} callstack;
+
+typedef struct {
+ uint64_t pc;
+ uint64_t frame_pointer;
+} callstack_entry;
+
+typedef struct {
+ GArray *t;
+ GString *path;
+ GString *name;
+ uint32_t id;
+} trace;
+
+typedef struct Cpu Cpu;
+
+typedef struct {
+ void (*init)(Cpu *cpu);
+ void (*end)(Cpu *cpu);
+ uint64_t (*get_frame_pointer)(Cpu *cpu);
+ bool (*does_insn_modify_frame_pointer)(const char *disas);
+} CpuOps;
+
+typedef struct Cpu {
+ uint64_t insn_count;
+ trace *trace;
+ callstack *cs;
+ GArray *callstacks; /* callstack *callstacks[] */
+ GArray *traces; /* trace *traces [] */
+ GByteArray *buf;
+ CpuOps ops;
+ void *arch;
+} Cpu;
+
+typedef struct {
+ struct qemu_plugin_register *reg_fp;
+} Aarch64Cpu;
+
+typedef struct {
+ uint64_t timestamp;
+ uint64_t data;
+} uftrace_entry;
+
+enum uftrace_record_type {
+ UFTRACE_ENTRY,
+ UFTRACE_EXIT,
+ UFTRACE_LOST,
+ UFTRACE_EVENT
+};
+
+static struct qemu_plugin_scoreboard *score;
+static CpuOps arch_ops;
+
+static void uftrace_write_map(bool system_emulation)
+{
+ const char *path = "./uftrace.data/sid-0.map";
+
+ if (system_emulation && access(path, F_OK) == 0) {
+ /* do not erase existing map in system emulation, as a custom one might
+ * already have been generated by uftrace_symbols.py */
+ return;
+ }
+
+ FILE *sid_map = fopen(path, "w");
+ g_assert(sid_map);
+
+ if (system_emulation) {
+ fprintf(sid_map,
+ "# map stack on highest address possible, to prevent uftrace\n"
+ "# from considering any kernel address\n");
+ fprintf(sid_map,
+ "ffffffffffff-ffffffffffff rw-p 00000000 00:00 0 [stack]\n");
+ } else {
+ /* in user mode, copy /proc/self/maps instead */
+ FILE *self_map = fopen("/proc/self/maps", "r");
+ g_assert(self_map);
+ for (;;) {
+ int c = fgetc(self_map);
+ if (c == EOF) {
+ break;
+ }
+ fputc(c, sid_map);
+ }
+ fclose(self_map);
+ }
+ fclose(sid_map);
+}
+
+static void uftrace_write_task(const GArray *traces)
+{
+ FILE *task = fopen("./uftrace.data/task.txt", "w");
+ g_assert(task);
+ for (int i = 0; i < traces->len; ++i) {
+ trace *t = g_array_index(traces, trace*, i);
+ fprintf(task, "SESS timestamp=0.0 pid=%"PRIu32" sid=0 exename=\"%s\"\n",
+ t->id, t->name->str);
+ fprintf(task, "TASK timestamp=0.0 tid=%"PRIu32" pid=%"PRIu32"\n",
+ t->id, t->id);
+ }
+ fclose(task);
+}
+
+static void uftrace_write_info(const GArray *traces)
+{
+ g_autoptr(GString) taskinfo_tids = g_string_new("taskinfo:tids=");
+ for (int i = 0; i < traces->len; ++i) {
+ trace *t = g_array_index(traces, trace*, i);
+ const char *delim = i > 0 ? "," : "";
+ g_string_append_printf(taskinfo_tids, "%s%"PRIu32, delim, t->id);
+ }
+
+ g_autoptr(GString) taskinfo_nr_tid = g_string_new("taskinfo:nr_tid=");
+ g_string_append_printf(taskinfo_nr_tid, "%d", traces->len);
+
+ FILE *info = fopen("./uftrace.data/info", "w");
+ g_assert(info);
+ /*
+ * $ uftrace dump --debug
+ * uftrace file header: magic = 4674726163652100
+ * uftrace file header: version = 4
+ * uftrace file header: header size = 40
+ * uftrace file header: endian = 1 (little)
+ * ftrace file header: class = 2 (64 bit)
+ * uftrace file header: features = 0x1263 (PLTHOOK | ...
+ * uftrace file header: info = 0x7bff (EXE_NAME | ...
+ * <0000000000000000>: 46 74 72 61 63 65 21 00 04 00 00 00 28 00 01 02
+ * <0000000000000010>: 63 12 00 00 00 00 00 00 ff 7b 00 00 00 00 00 00
+ * <0000000000000020>: 00 04 00 00 00 00 00 00
+ */
+ const uint8_t header[] = {0x46, 0x74, 0x72, 0x61, 0x63, 0x65, 0x21, 0x00,
+ 0x04, 0x00, 0x00, 0x00, 0x28, 0x00, 0x01, 0x02,
+ 0x63, 0x12, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0xff, 0x7b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
+ fwrite(header, sizeof(header), 1, info);
+ const char *info_data[] = {
+ "exename:from_qemu",
+ "build_id:0123456789abcdef0123456789abcdef01234567",
+ "exit_status:0",
+ "cmdline:uftrace record qemu",
+ "cpuinfo:lines=2",
+ "cpuinfo:nr_cpus=1 / 1 (online/possible)",
+ "cpuinfo:desc=Intel 8086",
+ "meminfo:1.0 / 1.0 GB (free / total)",
+ "osinfo:lines=3",
+ "osinfo:kernel=Linux 6.12.33",
+ "osinfo:hostname=pc",
+ "osinfo:distro=\"Debian GNU/Linux 13 (trixie)\"",
+ "taskinfo:lines=2",
+ taskinfo_nr_tid->str,
+ taskinfo_tids->str,
+ "usageinfo:lines=6",
+ "usageinfo:systime=0.000000",
+ "usageinfo:usrtime=0.000000",
+ "usageinfo:ctxsw=0 / 0 (voluntary / involuntary)",
+ "usageinfo:maxrss=8016",
+ "usageinfo:pagefault=0 / 0 (major / minor)",
+ "usageinfo:iops=0 / 0 (read / write)",
+ "loadinfo:0.0 / 0.0 / 0.0",
+ "record_date:Mon Jan 1 00:00:00 2025",
+ "elapsed_time:1000000000000.0 sec",
+ "pattern_type:regex",
+ "uftrace_version:v0.17 ( x86_64 dwarf python3 luajit tui perf sched dynamic kernel )",
+ "utc_offset:1751552954",
+ 0};
+ const char **info_data_it = info_data;
+ while (*(info_data_it)) {
+ fprintf(info, "%s\n", *info_data_it);
+ ++info_data_it;
+ }
+ fclose(info);
+}
+
+static callstack *callstack_new(void)
+{
+ callstack *cs = g_malloc0(sizeof(callstack));
+ cs->s = g_array_new(false, false, sizeof(callstack_entry));
+ return cs;
+}
+
+static void callstack_free(callstack *cs)
+{
+ g_array_free(cs->s, true);
+ cs->s = NULL;
+ g_free(cs);
+}
+
+static size_t callstack_depth(const callstack *cs)
+{
+ return cs->s->len;
+}
+
+static size_t callstack_empty(const callstack *cs)
+{
+ return callstack_depth(cs) == 0;
+}
+
+static void callstack_clear(callstack *cs)
+{
+ g_array_set_size(cs->s, 0);
+}
+
+static const callstack_entry *callstack_at(const callstack *cs, size_t depth)
+{
+ g_assert(depth > 0);
+ g_assert(depth <= callstack_depth(cs));
+ return &g_array_index(cs->s, callstack_entry, depth - 1);
+}
+
+static callstack_entry callstack_top(const callstack *cs)
+{
+ if (callstack_depth(cs) >= 1) {
+ return *callstack_at(cs, callstack_depth(cs));
+ }
+ return (callstack_entry){};
+}
+
+static callstack_entry callstack_caller(const callstack *cs)
+{
+ if (callstack_depth(cs) >= 2) {
+ return *callstack_at(cs, callstack_depth(cs) - 1);
+ }
+ return (callstack_entry){};
+}
+
+static void callstack_push(callstack *cs, callstack_entry e)
+{
+ g_array_append_val(cs->s, e);
+}
+
+static callstack_entry callstack_pop(callstack *cs)
+{
+ g_assert(!callstack_empty(cs));
+ callstack_entry e = callstack_top(cs);
+ g_array_set_size(cs->s, callstack_depth(cs) - 1);
+ return e;
+}
+
+static trace *trace_new(uint32_t id, GString *name)
+{
+ trace *t = g_malloc0(sizeof(trace));
+ t->t = g_array_new(false, false, sizeof(uftrace_entry));
+ t->path = g_string_new(NULL);
+ g_string_append_printf(t->path, "./uftrace.data/%"PRIu32".dat", id);
+ t->name = g_string_new(name->str);
+ t->id = id;
+ return t;
+}
+
+static void trace_free(trace *t)
+{
+ g_assert(t->t->len == 0);
+ g_array_free(t->t, true);
+ t->t = NULL;
+ g_string_free(t->path, true);
+ t->path = NULL;
+ g_string_free(t->name, true);
+ t->name = NULL;
+ g_free(t);
+}
+
+static void trace_flush(trace *t, bool append)
+{
+ int create_dir = g_mkdir_with_parents("./uftrace.data",
+ S_IRWXU | S_IRWXG | S_IRWXO);
+ g_assert(create_dir == 0);
+ FILE *dat = fopen(t->path->str, append ? "a" : "w");
+ g_assert(dat);
+ GArray *data = t->t;
+ if (data->len) {
+ fwrite(data->data, data->len, sizeof(uftrace_entry), dat);
+ }
+ fclose(dat);
+ g_array_set_size(data, 0);
+}
+
+static void trace_add_entry(trace *t, uint64_t timestamp, uint64_t pc,
+ size_t depth, enum uftrace_record_type type)
+{
+ /* libmcount/record.c:record_event */
+ const uint64_t record_magic = 0x5;
+ uint64_t data = type | record_magic << 3;
+ data += depth << 6;
+ data += pc << 16;
+ uftrace_entry e = {.timestamp = timestamp, .data = data};
+ g_array_append_val(t->t, e);
+ if (t->t->len * sizeof(uftrace_entry) > 32 * 1024 * 1024) {
+ /* flush every 32 MB */
+ trace_flush(t, true);
+ }
+}
+
+static void trace_enter_function(trace *t, uint64_t timestamp,
+ uint64_t pc, size_t depth)
+{
+ trace_add_entry(t, timestamp, pc, depth, UFTRACE_ENTRY);
+}
+
+static void trace_exit_function(trace *t, uint64_t timestamp,
+ uint64_t pc, size_t depth)
+{
+ trace_add_entry(t, timestamp, pc, depth, UFTRACE_EXIT);
+}
+
+static void trace_enter_stack(trace *t, callstack *cs, uint64_t timestamp)
+{
+ for (size_t depth = 1; depth <= callstack_depth(cs); ++depth) {
+ trace_enter_function(t, timestamp, callstack_at(cs, depth)->pc, depth);
+ }
+}
+
+static void trace_exit_stack(trace *t, callstack *cs, uint64_t timestamp)
+{
+ for (size_t depth = callstack_depth(cs); depth > 0; --depth) {
+ trace_exit_function(t, timestamp, callstack_at(cs, depth)->pc, depth);
+ }
+}
+
+static uint64_t cpu_read_register64(Cpu *cpu, struct qemu_plugin_register *reg)
+{
+ GByteArray *buf = cpu->buf;
+ g_byte_array_set_size(buf, 0);
+ size_t sz = qemu_plugin_read_register(reg, buf);
+ g_assert(sz == 8);
+ g_assert(buf->len == 8);
+ return *((uint64_t *) buf->data);
+}
+
+static uint64_t cpu_read_memory64(Cpu *cpu, uint64_t addr)
+{
+ g_assert(addr);
+ GByteArray *buf = cpu->buf;
+ g_byte_array_set_size(buf, 0);
+ bool read = qemu_plugin_read_memory_vaddr(addr, buf, 8);
+ if (!read) {
+ return 0;
+ }
+ g_assert(buf->len == 8);
+ return *((uint64_t *) buf->data);
+}
+
+static void cpu_unwind_stack(Cpu *cpu, uint64_t frame_pointer, uint64_t pc)
+{
+ g_assert(callstack_empty(cpu->cs));
+
+ #define UNWIND_STACK_MAX_DEPTH 1024
+ callstack_entry unwind[UNWIND_STACK_MAX_DEPTH];
+ size_t depth = 0;
+ do {
+ /* check we don't have an infinite stack */
+ for (size_t i = 0; i < depth; ++i) {
+ if (frame_pointer == unwind[i].frame_pointer) {
+ break;
+ }
+ }
+ callstack_entry e = {.frame_pointer = frame_pointer, .pc = pc};
+ unwind[depth] = e;
+ depth++;
+ if (frame_pointer) {
+ frame_pointer = cpu_read_memory64(cpu, frame_pointer);
+ }
+ pc = cpu_read_memory64(cpu, frame_pointer + 8); /* read previous lr */
+ } while (frame_pointer && pc && depth < UNWIND_STACK_MAX_DEPTH);
+ #undef UNWIND_STACK_MAX_DEPTH
+
+ /* push it from bottom to top */
+ while (depth) {
+ callstack_push(cpu->cs, unwind[depth - 1]);
+ --depth;
+ }
+}
+
+static uint64_t cpu_get_timestamp(const Cpu *cpu)
+{
+ return cpu->insn_count;
+}
+
+static uint64_t aarch64_get_frame_pointer(Cpu *cpu_)
+{
+ Aarch64Cpu *cpu = cpu_->arch;
+ return cpu_read_register64(cpu_, cpu->reg_fp);
+}
+
+static void aarch64_init(Cpu *cpu_)
+{
+ Aarch64Cpu *cpu = g_malloc0(sizeof(Aarch64Cpu));
+ cpu_->arch = cpu;
+ g_autoptr(GArray) regs = qemu_plugin_get_registers();
+ for (int i = 0; i < regs->len; ++i) {
+ qemu_plugin_reg_descriptor *reg;
+ reg = &g_array_index(regs, qemu_plugin_reg_descriptor, i);
+ if (!strcmp(reg->name, "x29")) {
+ cpu->reg_fp = reg->handle;
+ }
+ }
+ if (!cpu->reg_fp) {
+ fprintf(stderr, "uftrace plugin: frame pointer register (x29) is not "
+ "available. Please use an AArch64 cpu (or -cpu max).\n");
+ g_abort();
+ }
+}
+
+static void aarch64_end(Cpu *cpu)
+{
+ g_free(cpu->arch);
+}
+
+static bool aarch64_does_insn_modify_frame_pointer(const char *disas)
+{
+ /*
+ * Check if current instruction concerns fp register "x29".
+ * We add a prefix space to make sure we don't match addresses dump
+ * in disassembly.
+ */
+ return strstr(disas, " x29");
+}
+
+static CpuOps aarch64_ops = {
+ .init = aarch64_init,
+ .end = aarch64_end,
+ .get_frame_pointer = aarch64_get_frame_pointer,
+ .does_insn_modify_frame_pointer = aarch64_does_insn_modify_frame_pointer,
+};
+
+static void track_callstack(unsigned int cpu_index, void *udata)
+{
+ uint64_t pc = (uintptr_t) udata;
+ Cpu *cpu = qemu_plugin_scoreboard_find(score, cpu_index);
+ uint64_t timestamp = cpu_get_timestamp(cpu);
+ callstack *cs = cpu->cs;
+ trace *t = cpu->trace;
+
+ uint64_t fp = cpu->ops.get_frame_pointer(cpu);
+ if (!fp && callstack_empty(cs)) {
+ /*
+ * We simply push current pc. Note that we won't detect symbol change as
+ * long as a proper call does not happen.
+ */
+ callstack_push(cs, (callstack_entry){.frame_pointer = fp,
+ .pc = pc});
+ trace_enter_function(t, timestamp, pc, callstack_depth(cs));
+ return;
+ }
+
+ callstack_entry top = callstack_top(cs);
+ if (fp == top.frame_pointer) {
+ /* same function */
+ return;
+ }
+
+ callstack_entry caller = callstack_caller(cs);
+ if (fp == caller.frame_pointer) {
+ /* return */
+ callstack_entry e = callstack_pop(cs);
+ trace_exit_function(t, timestamp, e.pc, callstack_depth(cs));
+ return;
+ }
+
+ uint64_t caller_fp = fp ? cpu_read_memory64(cpu, fp) : 0;
+ if (caller_fp == top.frame_pointer) {
+ /* call */
+ callstack_push(cs, (callstack_entry){.frame_pointer = fp,
+ .pc = pc});
+ trace_enter_function(t, timestamp, pc, callstack_depth(cs));
+ return;
+ }
+
+ /* discontinuity, exit current stack and unwind new one */
+ trace_exit_stack(t, cs, timestamp);
+ callstack_clear(cs);
+
+ cpu_unwind_stack(cpu, fp, pc);
+ trace_enter_stack(t, cs, timestamp);
+}
+
+static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
+{
+ size_t n_insns = qemu_plugin_tb_n_insns(tb);
+
+ qemu_plugin_u64 insn_count = qemu_plugin_scoreboard_u64_in_struct(
+ score, Cpu, insn_count);
+
+ /*
+ * We instrument all instructions following one that might have updated
+ * the frame pointer. We always instrument first instruction in block, as
+ * last executed instruction, in previous tb, may have modified it.
+ */
+ bool instrument_insn = true;
+ for (int i = 0; i < n_insns; i++) {
+ struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);
+ qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(
+ insn, QEMU_PLUGIN_INLINE_ADD_U64, insn_count, 1);
+
+ if (instrument_insn) {
+ uintptr_t pc = qemu_plugin_insn_vaddr(insn);
+ qemu_plugin_register_vcpu_insn_exec_cb(insn, track_callstack,
+ QEMU_PLUGIN_CB_R_REGS,
+ (void *) pc);
+ instrument_insn = false;
+ }
+
+ char *disas = qemu_plugin_insn_disas(insn);
+ if (arch_ops.does_insn_modify_frame_pointer(disas)) {
+ instrument_insn = true;
+ }
+ }
+}
+
+static void vcpu_init(qemu_plugin_id_t id, unsigned int vcpu_index)
+{
+ Cpu *cpu = qemu_plugin_scoreboard_find(score, vcpu_index);
+ cpu->ops = arch_ops;
+
+ cpu->ops.init(cpu);
+ cpu->buf = g_byte_array_new();
+ cpu->callstacks = g_array_new(0, 0, sizeof(callstack *));
+ cpu->traces = g_array_new(0, 0, sizeof(trace *));
+
+ g_assert(vcpu_index < 1000);
+ uint32_t trace_id = 1000 * 1000 + vcpu_index * 1000;
+
+ g_autoptr(GString) trace_name = g_string_new(NULL);
+ g_string_append_printf(trace_name, "cpu%u", vcpu_index);
+ trace *t = trace_new(trace_id, trace_name);
+ g_array_append_val(cpu->traces, t);
+ callstack *cs = callstack_new();
+ g_array_append_val(cpu->callstacks, cs);
+ /* create/truncate trace file */
+ trace_flush(t, false);
+
+ cpu->cs = cs;
+ cpu->trace = t;
+}
+
+static void vcpu_end(unsigned int vcpu_index)
+{
+ Cpu *cpu = qemu_plugin_scoreboard_find(score, vcpu_index);
+ g_byte_array_free(cpu->buf, true);
+
+ for (size_t i = 0; i < cpu->traces->len; ++i) {
+ trace *t = g_array_index(cpu->traces, trace*, i);
+ trace_free(t);
+ }
+
+ for (size_t i = 0; i < cpu->callstacks->len; ++i) {
+ callstack *cs = g_array_index(cpu->callstacks, callstack*, i);
+ callstack_free(cs);
+ }
+
+ g_array_free(cpu->traces, true);
+ g_array_free(cpu->callstacks, true);
+ memset(cpu, 0, sizeof(Cpu));
+}
+
+static void at_exit(qemu_plugin_id_t id, void *data)
+{
+ bool system_emulation = (bool) data;
+ g_autoptr(GArray) traces = g_array_new(0, 0, sizeof(trace *));
+
+ /* exit last active stacks */
+ for (size_t i = 0; i < qemu_plugin_num_vcpus(); ++i) {
+ Cpu *cpu = qemu_plugin_scoreboard_find(score, i);
+ uint64_t timestamp = cpu_get_timestamp(cpu);
+ if (trace_sample) {
+ cpu_trace_last_sample(cpu, timestamp);
+ }
+ trace_exit_stack(cpu->trace, cpu->cs, timestamp);
+ callstack_clear(cpu->cs);
+ }
+
+ for (size_t i = 0; i < qemu_plugin_num_vcpus(); ++i) {
+ Cpu *cpu = qemu_plugin_scoreboard_find(score, i);
+ for (size_t j = 0; j < cpu->traces->len; ++j) {
+ trace *t = g_array_index(cpu->traces, trace*, j);
+ trace_flush(t, true);
+ g_array_append_val(traces, t);
+ }
+ }
+
+ uftrace_write_map(system_emulation);
+ uftrace_write_info(traces);
+ uftrace_write_task(traces);
+
+ for (size_t i = 0; i < qemu_plugin_num_vcpus(); ++i) {
+ vcpu_end(i);
+ }
+
+ qemu_plugin_scoreboard_free(score);
+}
+
+QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
+ const qemu_info_t *info,
+ int argc, char **argv)
+{
+ if (!strcmp(info->target_name, "aarch64")) {
+ arch_ops = aarch64_ops;
+ } else {
+ fprintf(stderr, "plugin uftrace: %s target is not supported\n",
+ info->target_name);
+ return 1;
+ }
+
+ score = qemu_plugin_scoreboard_new(sizeof(Cpu));
+ qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
+ qemu_plugin_register_atexit_cb(id, at_exit, (void *) info->system_emulation);
+ qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
+
+ return 0;
+}
diff --git a/contrib/plugins/meson.build b/contrib/plugins/meson.build
index 1876bc78438..7eb3629c95d 100644
--- a/contrib/plugins/meson.build
+++ b/contrib/plugins/meson.build
@@ -1,5 +1,6 @@
contrib_plugins = ['bbv', 'cache', 'cflow', 'drcov', 'execlog', 'hotblocks',
- 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger']
+ 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger',
+ 'uftrace']
if host_os != 'windows'
# lockstep uses socket.h
contrib_plugins += 'lockstep'
--
2.47.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v4 2/7] contrib/plugins/uftrace: add trace-sample option
2025-07-24 20:45 [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 1/7] contrib/plugins/uftrace: new uftrace plugin Pierrick Bouvier
@ 2025-07-24 20:45 ` Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 3/7] contrib/plugins/uftrace: add trace-privilege-level option Pierrick Bouvier
` (5 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Pierrick Bouvier @ 2025-07-24 20:45 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Maydell, Alexandre Iooss, Gustavo Romero, Alex Bennée,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson, Pierrick Bouvier
usage: trace-sample=N
Allow to use sampling (every N instructions) for tracking the stack.
We implement a fast mode, where instrumentation is only per tb, and
simply dump current stack, and unwind new one, instead of tracking every
frame pointer change.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
contrib/plugins/uftrace.c | 111 +++++++++++++++++++++++++++++++++++---
1 file changed, 105 insertions(+), 6 deletions(-)
diff --git a/contrib/plugins/uftrace.c b/contrib/plugins/uftrace.c
index 4ca6d3fd68d..a626030b7da 100644
--- a/contrib/plugins/uftrace.c
+++ b/contrib/plugins/uftrace.c
@@ -44,6 +44,9 @@ typedef struct {
typedef struct Cpu {
uint64_t insn_count;
+ uint64_t sample_insn_count;
+ uint64_t sample_timestamp;
+ callstack *sample_cs;
trace *trace;
callstack *cs;
GArray *callstacks; /* callstack *callstacks[] */
@@ -70,6 +73,7 @@ enum uftrace_record_type {
};
static struct qemu_plugin_scoreboard *score;
+static uint64_t trace_sample;
static CpuOps arch_ops;
static void uftrace_write_map(bool system_emulation)
@@ -199,6 +203,13 @@ static callstack *callstack_new(void)
return cs;
}
+static callstack *callstack_clone(const callstack *cs)
+{
+ callstack *clone = g_malloc0(sizeof(callstack));
+ clone->s = g_array_copy(cs->s);
+ return clone;
+}
+
static void callstack_free(callstack *cs)
{
g_array_free(cs->s, true);
@@ -391,6 +402,26 @@ static void cpu_unwind_stack(Cpu *cpu, uint64_t frame_pointer, uint64_t pc)
}
}
+static void cpu_trace_last_sample(Cpu *cpu, uint64_t timestamp)
+{
+ if (!cpu->sample_cs) {
+ return;
+ }
+ uint64_t elapsed = timestamp - cpu->sample_timestamp;
+ uint64_t middle_timestamp = cpu->sample_timestamp + (elapsed / 2);
+ trace_exit_stack(cpu->trace, cpu->sample_cs, middle_timestamp);
+ callstack_free(cpu->sample_cs);
+ cpu->sample_cs = NULL;
+ trace_enter_stack(cpu->trace, cpu->cs, middle_timestamp);
+}
+
+static void cpu_set_new_sample(Cpu *cpu, uint64_t timestamp)
+{
+ cpu->sample_insn_count = 0;
+ cpu->sample_cs = callstack_clone(cpu->cs);
+ cpu->sample_timestamp = timestamp;
+}
+
static uint64_t cpu_get_timestamp(const Cpu *cpu)
{
return cpu->insn_count;
@@ -451,6 +482,13 @@ static void track_callstack(unsigned int cpu_index, void *udata)
callstack *cs = cpu->cs;
trace *t = cpu->trace;
+ if (trace_sample && cpu->sample_insn_count >= trace_sample) {
+ cpu_trace_last_sample(cpu, timestamp);
+ cpu_set_new_sample(cpu, timestamp);
+ }
+
+ bool trace_change = !trace_sample;
+
uint64_t fp = cpu->ops.get_frame_pointer(cpu);
if (!fp && callstack_empty(cs)) {
/*
@@ -459,7 +497,9 @@ static void track_callstack(unsigned int cpu_index, void *udata)
*/
callstack_push(cs, (callstack_entry){.frame_pointer = fp,
.pc = pc});
- trace_enter_function(t, timestamp, pc, callstack_depth(cs));
+ if (trace_change) {
+ trace_enter_function(t, timestamp, pc, callstack_depth(cs));
+ }
return;
}
@@ -473,7 +513,9 @@ static void track_callstack(unsigned int cpu_index, void *udata)
if (fp == caller.frame_pointer) {
/* return */
callstack_entry e = callstack_pop(cs);
- trace_exit_function(t, timestamp, e.pc, callstack_depth(cs));
+ if (trace_change) {
+ trace_exit_function(t, timestamp, e.pc, callstack_depth(cs));
+ }
return;
}
@@ -482,27 +524,65 @@ static void track_callstack(unsigned int cpu_index, void *udata)
/* call */
callstack_push(cs, (callstack_entry){.frame_pointer = fp,
.pc = pc});
- trace_enter_function(t, timestamp, pc, callstack_depth(cs));
+ if (trace_change) {
+ trace_enter_function(t, timestamp, pc, callstack_depth(cs));
+ }
return;
}
/* discontinuity, exit current stack and unwind new one */
- trace_exit_stack(t, cs, timestamp);
+ if (trace_change) {
+ trace_exit_stack(t, cs, timestamp);
+ }
callstack_clear(cs);
cpu_unwind_stack(cpu, fp, pc);
- trace_enter_stack(t, cs, timestamp);
+ if (trace_change) {
+ trace_enter_stack(t, cs, timestamp);
+ }
+}
+
+static void sample_callstack(unsigned int cpu_index, void *udata)
+{
+ uint64_t pc = (uintptr_t) udata;
+ Cpu *cpu = qemu_plugin_scoreboard_find(score, cpu_index);
+ uint64_t timestamp = cpu_get_timestamp(cpu);
+
+ trace_exit_stack(cpu->trace, cpu->cs, timestamp);
+ callstack_clear(cpu->cs);
+
+ cpu_unwind_stack(cpu, cpu->ops.get_frame_pointer(cpu), pc);
+ trace_enter_stack(cpu->trace, cpu->cs, timestamp);
+
+ /* reset counter */
+ cpu->sample_insn_count = 0;
}
static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
{
size_t n_insns = qemu_plugin_tb_n_insns(tb);
+ uintptr_t tb_pc = qemu_plugin_tb_vaddr(tb);
qemu_plugin_u64 insn_count = qemu_plugin_scoreboard_u64_in_struct(
score, Cpu, insn_count);
+ qemu_plugin_u64 sample_insn_count = qemu_plugin_scoreboard_u64_in_struct(
+ score, Cpu, sample_insn_count);
+
+ if (trace_sample) {
+ /* We can do a light instrumentation, per tb only */
+ qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(
+ tb, QEMU_PLUGIN_INLINE_ADD_U64, insn_count, n_insns);
+ qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(
+ tb, QEMU_PLUGIN_INLINE_ADD_U64, sample_insn_count, n_insns);
+ qemu_plugin_register_vcpu_tb_exec_cond_cb(
+ tb, sample_callstack, QEMU_PLUGIN_CB_R_REGS,
+ QEMU_PLUGIN_COND_GE, sample_insn_count, trace_sample,
+ (void *) tb_pc);
+ return;
+ }
/*
- * We instrument all instructions following one that might have updated
+ * We now instrument all instructions following one that might have updated
* the frame pointer. We always instrument first instruction in block, as
* last executed instruction, in previous tb, may have modified it.
*/
@@ -511,6 +591,8 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);
qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(
insn, QEMU_PLUGIN_INLINE_ADD_U64, insn_count, 1);
+ qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(
+ insn, QEMU_PLUGIN_INLINE_ADD_U64, sample_insn_count, 1);
if (instrument_insn) {
uintptr_t pc = qemu_plugin_insn_vaddr(insn);
@@ -536,6 +618,7 @@ static void vcpu_init(qemu_plugin_id_t id, unsigned int vcpu_index)
cpu->buf = g_byte_array_new();
cpu->callstacks = g_array_new(0, 0, sizeof(callstack *));
cpu->traces = g_array_new(0, 0, sizeof(trace *));
+ cpu->sample_timestamp = cpu_get_timestamp(cpu);
g_assert(vcpu_index < 1000);
uint32_t trace_id = 1000 * 1000 + vcpu_index * 1000;
@@ -613,6 +696,22 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
const qemu_info_t *info,
int argc, char **argv)
{
+ for (int i = 0; i < argc; i++) {
+ char *opt = argv[i];
+ g_auto(GStrv) tokens = g_strsplit(opt, "=", 2);
+ if (g_strcmp0(tokens[0], "trace-sample") == 0) {
+ gint64 value = g_ascii_strtoll(tokens[1], NULL, 10);
+ if (value <= 0) {
+ fprintf(stderr, "bad trace-sample value: %s\n", tokens[1]);
+ return -1;
+ }
+ trace_sample = value;
+ } else {
+ fprintf(stderr, "option parsing failed: %s\n", opt);
+ return -1;
+ }
+ }
+
if (!strcmp(info->target_name, "aarch64")) {
arch_ops = aarch64_ops;
} else {
--
2.47.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v4 3/7] contrib/plugins/uftrace: add trace-privilege-level option
2025-07-24 20:45 [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 1/7] contrib/plugins/uftrace: new uftrace plugin Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 2/7] contrib/plugins/uftrace: add trace-sample option Pierrick Bouvier
@ 2025-07-24 20:45 ` Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 4/7] contrib/plugins/uftrace: add timestamp-based-on-real-time option Pierrick Bouvier
` (4 subsequent siblings)
7 siblings, 0 replies; 14+ messages in thread
From: Pierrick Bouvier @ 2025-07-24 20:45 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Maydell, Alexandre Iooss, Gustavo Romero, Alex Bennée,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson, Pierrick Bouvier
usage: trace-privilege-level=[on|off]
This option generates different traces (represented as different
processes in uftrace), allowing to follow privilege level changes.
For aarch64, we track current EL and Security State.
As well, we make sure that sampling works correctly with this option. If
user requests sampling and privilege-level tracing, we perform an
accurate callstack tracking (slower), and only sample the information we
present.
This gives accurate backtraces on all privilege switches, while keeping
the trace sampled if needed.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
contrib/plugins/uftrace.c | 182 +++++++++++++++++++++++++++++++++++---
1 file changed, 171 insertions(+), 11 deletions(-)
diff --git a/contrib/plugins/uftrace.c b/contrib/plugins/uftrace.c
index a626030b7da..bd4219db693 100644
--- a/contrib/plugins/uftrace.c
+++ b/contrib/plugins/uftrace.c
@@ -39,6 +39,9 @@ typedef struct {
void (*init)(Cpu *cpu);
void (*end)(Cpu *cpu);
uint64_t (*get_frame_pointer)(Cpu *cpu);
+ uint8_t (*get_privilege_level)(Cpu *cpu);
+ uint8_t (*num_privilege_levels)(void);
+ const char *(*get_privilege_level_name)(uint8_t pl);
bool (*does_insn_modify_frame_pointer)(const char *disas);
} CpuOps;
@@ -49,6 +52,7 @@ typedef struct Cpu {
callstack *sample_cs;
trace *trace;
callstack *cs;
+ uint8_t privilege_level;
GArray *callstacks; /* callstack *callstacks[] */
GArray *traces; /* trace *traces [] */
GByteArray *buf;
@@ -56,8 +60,23 @@ typedef struct Cpu {
void *arch;
} Cpu;
+typedef enum {
+ AARCH64_EL0_SECURE,
+ AARCH64_EL0_NONSECURE,
+ AARCH64_EL0_REALM,
+ AARCH64_EL1_SECURE,
+ AARCH64_EL1_NONSECURE,
+ AARCH64_EL1_REALM,
+ AARCH64_EL2_SECURE,
+ AARCH64_EL2_NONSECURE,
+ AARCH64_EL2_REALM,
+ AARCH64_EL3,
+} Aarch64PrivilegeLevel;
+
typedef struct {
struct qemu_plugin_register *reg_fp;
+ struct qemu_plugin_register *reg_cpsr;
+ struct qemu_plugin_register *reg_scr_el3;
} Aarch64Cpu;
typedef struct {
@@ -74,6 +93,7 @@ enum uftrace_record_type {
static struct qemu_plugin_scoreboard *score;
static uint64_t trace_sample;
+static bool trace_privilege_level;
static CpuOps arch_ops;
static void uftrace_write_map(bool system_emulation)
@@ -358,6 +378,16 @@ static uint64_t cpu_read_register64(Cpu *cpu, struct qemu_plugin_register *reg)
return *((uint64_t *) buf->data);
}
+static uint32_t cpu_read_register32(Cpu *cpu, struct qemu_plugin_register *reg)
+{
+ GByteArray *buf = cpu->buf;
+ g_byte_array_set_size(buf, 0);
+ size_t sz = qemu_plugin_read_register(reg, buf);
+ g_assert(sz == 4);
+ g_assert(buf->len == 4);
+ return *((uint32_t *) buf->data);
+}
+
static uint64_t cpu_read_memory64(Cpu *cpu, uint64_t addr)
{
g_assert(addr);
@@ -427,6 +457,68 @@ static uint64_t cpu_get_timestamp(const Cpu *cpu)
return cpu->insn_count;
}
+static uint8_t aarch64_num_privilege_levels(void)
+{
+ return AARCH64_EL3 + 1;
+}
+
+static const char *aarch64_get_privilege_level_name(uint8_t pl)
+{
+ switch (pl) {
+ case AARCH64_EL0_SECURE: return "S-EL0";
+ case AARCH64_EL0_NONSECURE: return "NS-EL0";
+ case AARCH64_EL0_REALM: return "R-EL0";
+ case AARCH64_EL1_SECURE: return "S-EL1";
+ case AARCH64_EL1_NONSECURE: return "NS-EL1";
+ case AARCH64_EL1_REALM: return "R-EL1";
+ case AARCH64_EL2_SECURE: return "S-EL2";
+ case AARCH64_EL2_NONSECURE: return "NS-EL2";
+ case AARCH64_EL2_REALM: return "R-EL2";
+ case AARCH64_EL3: return "EL3";
+ default:
+ g_assert_not_reached();
+ }
+}
+
+static uint8_t aarch64_get_privilege_level(Cpu *cpu_)
+{
+ Aarch64Cpu *cpu = cpu_->arch;
+ /*
+ * QEMU gdbstub does not provide access to CurrentEL,
+ * so we use CPSR instead.
+ */
+ uint8_t el = cpu_read_register32(cpu_, cpu->reg_cpsr) >> 2 & 0b11;
+
+ if (el == 3) {
+ return AARCH64_EL3;
+ }
+
+ uint8_t ss = AARCH64_EL0_SECURE;
+ if (!cpu->reg_scr_el3) {
+ ss = AARCH64_EL0_NONSECURE;
+ }
+ uint64_t scr_el3 = cpu_read_register64(cpu_, cpu->reg_scr_el3);
+ uint64_t ns = (scr_el3 >> 0) & 0b1;
+ uint64_t nse = (scr_el3 >> 62) & 0b1;
+ switch (nse << 1 | ns) {
+ case 0b00:
+ ss = AARCH64_EL0_SECURE;
+ break;
+ case 0b01:
+ ss = AARCH64_EL0_NONSECURE;
+ break;
+ case 0b11:
+ ss = AARCH64_EL0_REALM;
+ break;
+ default:
+ g_assert_not_reached();
+ }
+
+ const uint8_t num_ss = 3;
+ Aarch64PrivilegeLevel pl = el * num_ss + ss;
+ return pl;
+}
+
static uint64_t aarch64_get_frame_pointer(Cpu *cpu_)
{
Aarch64Cpu *cpu = cpu_->arch;
@@ -443,6 +535,10 @@ static void aarch64_init(Cpu *cpu_)
reg = &g_array_index(regs, qemu_plugin_reg_descriptor, i);
if (!strcmp(reg->name, "x29")) {
cpu->reg_fp = reg->handle;
+ } else if (!strcmp(reg->name, "cpsr")) {
+ cpu->reg_cpsr = reg->handle;
+ } else if (!strcmp(reg->name, "SCR_EL3")) {
+ cpu->reg_scr_el3 = reg->handle;
}
}
if (!cpu->reg_fp) {
@@ -450,6 +546,8 @@ static void aarch64_init(Cpu *cpu_)
"available. Please use an AArch64 cpu (or -cpu max).\n");
g_abort();
}
+ g_assert(cpu->reg_cpsr);
+ /* scr_el3 is optional */
}
static void aarch64_end(Cpu *cpu)
@@ -471,9 +569,43 @@ static CpuOps aarch64_ops = {
.init = aarch64_init,
.end = aarch64_end,
.get_frame_pointer = aarch64_get_frame_pointer,
+ .get_privilege_level = aarch64_get_privilege_level,
+ .num_privilege_levels = aarch64_num_privilege_levels,
+ .get_privilege_level_name = aarch64_get_privilege_level_name,
.does_insn_modify_frame_pointer = aarch64_does_insn_modify_frame_pointer,
};
+static void track_privilege_change(unsigned int cpu_index, void *udata)
+{
+ Cpu *cpu = qemu_plugin_scoreboard_find(score, cpu_index);
+ uint8_t new_pl = cpu->ops.get_privilege_level(cpu);
+
+ if (new_pl == cpu->privilege_level) {
+ return;
+ }
+
+ uint64_t pc = (uintptr_t) udata;
+ uint64_t timestamp = cpu_get_timestamp(cpu);
+
+ if (trace_sample) {
+ cpu_trace_last_sample(cpu, timestamp);
+ }
+
+ trace_exit_stack(cpu->trace, cpu->cs, timestamp);
+ callstack_clear(cpu->cs);
+
+ cpu->privilege_level = new_pl;
+ cpu->cs = g_array_index(cpu->callstacks, callstack*, new_pl);
+ cpu->trace = g_array_index(cpu->traces, trace*, new_pl);
+
+ cpu_unwind_stack(cpu, cpu->ops.get_frame_pointer(cpu), pc);
+ trace_enter_stack(cpu->trace, cpu->cs, timestamp);
+
+ if (trace_sample) {
+ cpu_set_new_sample(cpu, timestamp);
+ }
+}
+
static void track_callstack(unsigned int cpu_index, void *udata)
{
uint64_t pc = (uintptr_t) udata;
@@ -568,7 +700,7 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
qemu_plugin_u64 sample_insn_count = qemu_plugin_scoreboard_u64_in_struct(
score, Cpu, sample_insn_count);
- if (trace_sample) {
+ if (trace_sample && !trace_privilege_level) {
/* We can do a light instrumentation, per tb only */
qemu_plugin_register_vcpu_tb_exec_inline_per_vcpu(
tb, QEMU_PLUGIN_INLINE_ADD_U64, insn_count, n_insns);
@@ -581,6 +713,12 @@ static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
return;
}
+ if (trace_privilege_level) {
+ qemu_plugin_register_vcpu_tb_exec_cb(tb, track_privilege_change,
+ QEMU_PLUGIN_CB_R_REGS,
+ (void *) tb_pc);
+ }
+
/*
* We now instrument all instructions following one that might have updated
* the frame pointer. We always instrument first instruction in block, as
@@ -623,17 +761,33 @@ static void vcpu_init(qemu_plugin_id_t id, unsigned int vcpu_index)
g_assert(vcpu_index < 1000);
uint32_t trace_id = 1000 * 1000 + vcpu_index * 1000;
- g_autoptr(GString) trace_name = g_string_new(NULL);
- g_string_append_printf(trace_name, "cpu%u", vcpu_index);
- trace *t = trace_new(trace_id, trace_name);
- g_array_append_val(cpu->traces, t);
- callstack *cs = callstack_new();
- g_array_append_val(cpu->callstacks, cs);
- /* create/truncate trace file */
- trace_flush(t, false);
+ if (trace_privilege_level) {
+ for (uint8_t pl = 0; pl < cpu->ops.num_privilege_levels(); ++pl) {
+ g_autoptr(GString) trace_name = g_string_new(NULL);
+ g_string_append_printf(trace_name, "cpu%u %s", vcpu_index,
+ cpu->ops.get_privilege_level_name(pl));
+ trace *t = trace_new(trace_id + pl, trace_name);
+ g_array_append_val(cpu->traces, t);
+ callstack *cs = callstack_new();
+ g_array_append_val(cpu->callstacks, cs);
+ }
+ } else {
+ g_autoptr(GString) trace_name = g_string_new(NULL);
+ g_string_append_printf(trace_name, "cpu%u", vcpu_index);
+ trace *t = trace_new(trace_id, trace_name);
+ g_array_append_val(cpu->traces, t);
+ callstack *cs = callstack_new();
+ g_array_append_val(cpu->callstacks, cs);
+ }
- cpu->cs = cs;
- cpu->trace = t;
+ for (size_t i = 0; i < cpu->traces->len; ++i) {
+ /* create/truncate trace files */
+ trace *t = g_array_index(cpu->traces, trace*, i);
+ trace_flush(t, false);
+ }
+
+ cpu->cs = g_array_index(cpu->callstacks, callstack*, cpu->privilege_level);
+ cpu->trace = g_array_index(cpu->traces, trace*, cpu->privilege_level);
}
static void vcpu_end(unsigned int vcpu_index)
@@ -706,6 +860,12 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
return -1;
}
trace_sample = value;
+ } else if (g_strcmp0(tokens[0], "trace-privilege-level") == 0) {
+ if (!qemu_plugin_bool_parse(tokens[0], tokens[1],
+ &trace_privilege_level)) {
+ fprintf(stderr, "boolean argument parsing failed: %s\n", opt);
+ return -1;
+ }
} else {
fprintf(stderr, "option parsing failed: %s\n", opt);
return -1;
--
2.47.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v4 4/7] contrib/plugins/uftrace: add timestamp-based-on-real-time option
2025-07-24 20:45 [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
` (2 preceding siblings ...)
2025-07-24 20:45 ` [PATCH v4 3/7] contrib/plugins/uftrace: add trace-privilege-level option Pierrick Bouvier
@ 2025-07-24 20:45 ` Pierrick Bouvier
2025-08-05 10:23 ` Alex Bennée
2025-07-24 20:45 ` [PATCH v4 5/7] contrib/plugins/uftrace_symbols.py Pierrick Bouvier
` (3 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Pierrick Bouvier @ 2025-07-24 20:45 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Maydell, Alexandre Iooss, Gustavo Romero, Alex Bennée,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson, Pierrick Bouvier
usage: timestamp-based-on-real-time=[on|off]
Instead of using number of instructions executed (which is per vcpu), we
use the wall time for timestamps. This is useful when tracing user mode
programs as well.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
contrib/plugins/uftrace.c | 29 +++++++++++++++++++++++++++++
1 file changed, 29 insertions(+)
diff --git a/contrib/plugins/uftrace.c b/contrib/plugins/uftrace.c
index bd4219db693..9211f273700 100644
--- a/contrib/plugins/uftrace.c
+++ b/contrib/plugins/uftrace.c
@@ -94,8 +94,28 @@ enum uftrace_record_type {
static struct qemu_plugin_scoreboard *score;
static uint64_t trace_sample;
static bool trace_privilege_level;
+static bool timestamp_based_on_real_time;
static CpuOps arch_ops;
+static uint64_t gettime_ns(void)
+{
+#ifdef _WIN32
+ /*
+ * On Windows, timespec_get is available only with UCRT, but not with
+ * MinGW64 environment. Simplify by using only gettimeofday on this
+ * platform.
+ */
+ struct timeval tv;
+ gettimeofday(&tv, NULL);
+ uint64_t now_ns = tv.tv_sec * 1000 * 1000 * 1000 + tv.tv_usec * 1000;
+#else
+ struct timespec ts;
+ timespec_get(&ts, TIME_UTC);
+ uint64_t now_ns = ts.tv_sec * 1000 * 1000 * 1000 + ts.tv_nsec;
+#endif
+ return now_ns;
+}
+
static void uftrace_write_map(bool system_emulation)
{
const char *path = "./uftrace.data/sid-0.map";
@@ -454,6 +474,9 @@ static void cpu_set_new_sample(Cpu *cpu, uint64_t timestamp)
static uint64_t cpu_get_timestamp(const Cpu *cpu)
{
+ if (timestamp_based_on_real_time) {
+ return gettime_ns();
+ }
return cpu->insn_count;
}
@@ -866,6 +889,12 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
fprintf(stderr, "boolean argument parsing failed: %s\n", opt);
return -1;
}
+ } else if (g_strcmp0(tokens[0], "timestamp-based-on-real-time") == 0) {
+ if (!qemu_plugin_bool_parse(tokens[0], tokens[1],
+ ×tamp_based_on_real_time)) {
+ fprintf(stderr, "boolean argument parsing failed: %s\n", opt);
+ return -1;
+ }
} else {
fprintf(stderr, "option parsing failed: %s\n", opt);
return -1;
--
2.47.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v4 5/7] contrib/plugins/uftrace_symbols.py
2025-07-24 20:45 [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
` (3 preceding siblings ...)
2025-07-24 20:45 ` [PATCH v4 4/7] contrib/plugins/uftrace: add timestamp-based-on-real-time option Pierrick Bouvier
@ 2025-07-24 20:45 ` Pierrick Bouvier
2025-08-05 10:28 ` Alex Bennée
2025-07-24 20:45 ` [PATCH v4 6/7] contrib/plugins/uftrace: add documentation Pierrick Bouvier
` (2 subsequent siblings)
7 siblings, 1 reply; 14+ messages in thread
From: Pierrick Bouvier @ 2025-07-24 20:45 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Maydell, Alexandre Iooss, Gustavo Romero, Alex Bennée,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson, Pierrick Bouvier
usage: contrib/plugins/uftrace_symbols.py \
--prefix-symbols \
arm-trusted-firmware/build/qemu/debug/bl1/bl1.elf \
arm-trusted-firmware/build/qemu/debug/bl2/bl2.elf \
arm-trusted-firmware/build/qemu/debug/bl31/bl31.elf \
u-boot/u-boot:0x60000000 \
u-boot/u-boot.relocated:0x000000023f6b6000 \
linux/vmlinux
Will generate symbols and memory mapping files for uftrace, allowing to
have an enhanced trace, instead of raw addresses.
It takes a collection of elf files, and automatically find all their
symbols, and generate an ordered memory map based on that.
This script uses the python (native) pyelftools module.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
contrib/plugins/uftrace_symbols.py | 152 +++++++++++++++++++++++++++++
1 file changed, 152 insertions(+)
create mode 100755 contrib/plugins/uftrace_symbols.py
diff --git a/contrib/plugins/uftrace_symbols.py b/contrib/plugins/uftrace_symbols.py
new file mode 100755
index 00000000000..b49e03203c8
--- /dev/null
+++ b/contrib/plugins/uftrace_symbols.py
@@ -0,0 +1,152 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+#
+# Create symbols and mapping files for uftrace.
+#
+# Copyright 2025 Linaro Ltd
+# Author: Pierrick Bouvier <pierrick.bouvier@linaro.org>
+#
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+import argparse
+import elftools # pip install pyelftools
+import os
+
+from elftools.elf.elffile import ELFFile
+from elftools.elf.sections import SymbolTableSection
+
+def elf_func_symbols(elf):
+ symbol_tables = [(idx, s) for idx, s in enumerate(elf.iter_sections())
+ if isinstance(s, SymbolTableSection)]
+ symbols = []
+ for _, section in symbol_tables:
+ for _, symbol in enumerate(section.iter_symbols()):
+ if symbol_size(symbol) == 0:
+ continue
+ type = symbol['st_info']['type']
+ if type == 'STT_FUNC' or type == 'STT_NOTYPE':
+ symbols.append(symbol)
+ symbols.sort(key = lambda x: symbol_addr(x))
+ return symbols
+
+def symbol_size(symbol):
+ return symbol['st_size']
+
+def symbol_addr(symbol):
+ addr = symbol['st_value']
+ # clamp addr to 48 bits, like uftrace entries
+ return addr & 0xffffffffffff
+
+def symbol_name(symbol):
+ return symbol.name
+
+class BinaryFile:
+ def __init__(self, path, map_offset):
+ self.fullpath = os.path.realpath(path)
+ self.map_offset = map_offset
+ with open(path, 'rb') as f:
+ self.elf = ELFFile(f)
+ self.symbols = elf_func_symbols(self.elf)
+
+ def path(self):
+ return self.fullpath
+
+ def addr_start(self):
+ return self.map_offset
+
+ def addr_end(self):
+ last_sym = self.symbols[-1]
+ return symbol_addr(last_sym) + symbol_size(last_sym) + self.map_offset
+
+ def generate_symbol_file(self, prefix_symbols):
+ binary_name = os.path.basename(self.fullpath)
+ sym_file_path = f'./uftrace.data/{binary_name}.sym'
+ print(f'{sym_file_path} ({len(self.symbols)} symbols)')
+ with open(sym_file_path, 'w') as sym_file:
+ # print hexadecimal addresses on 48 bits
+ addrx = "0>12x"
+ for s in self.symbols:
+ addr = symbol_addr(s)
+ addr = f'{addr:{addrx}}'
+ size = f'{symbol_size(s):{addrx}}'
+ name = symbol_name(s)
+ if prefix_symbols:
+ name = f'{binary_name}:{name}'
+ print(addr, size, 'T', name, file=sym_file)
+
+def parse_parameter(p):
+ s = p.split(":")
+ path = s[0]
+ if len(s) == 1:
+ return path, 0
+ if len(s) > 2:
+ raise ValueError('only one offset can be set')
+ offset = s[1]
+ if not offset.startswith('0x'):
+ err = f'offset "{offset}" is not an hexadecimal constant. '
+ err += 'It should starts with "0x".'
+ raise ValueError(err)
+ offset = int(offset, 16)
+ return path, offset
+
+def is_from_user_mode(map_file_path):
+ if os.path.exists(map_file_path):
+ with open(map_file_path, 'r') as map_file:
+ if not map_file.readline().startswith('# map stack on'):
+ return True
+ return False
+
+def generate_map(binaries):
+ map_file_path = './uftrace.data/sid-0.map'
+
+ if is_from_user_mode(map_file_path):
+ print(f'do not overwrite {map_file_path} generated from qemu-user')
+ return
+
+ mappings = []
+
+ # print hexadecimal addresses on 48 bits
+ addrx = "0>12x"
+
+ mappings += ['# map stack on highest address possible, to prevent uftrace']
+ mappings += ['# from considering any kernel address']
+ mappings += ['ffffffffffff-ffffffffffff rw-p 00000000 00:00 0 [stack]']
+
+ for b in binaries:
+ m = f'{b.addr_start():{addrx}}-{b.addr_end():{addrx}}'
+ m += f' r--p 00000000 00:00 0 {b.path()}'
+ mappings.append(m)
+
+ with open(map_file_path, 'w') as map_file:
+ print('\n'.join(mappings), file=map_file)
+ print(f'{map_file_path}')
+ print('\n'.join(mappings))
+
+def main():
+ parser = argparse.ArgumentParser(description=
+ 'generate symbol files for uftrace')
+ parser.add_argument('elf_file', nargs='+',
+ help='path to an ELF file. '
+ 'Use /path/to/file:0xdeadbeef to add a mapping offset.')
+ parser.add_argument('--prefix-symbols',
+ help='prepend binary name to symbols',
+ action=argparse.BooleanOptionalAction)
+ args = parser.parse_args()
+
+ if not os.path.exists('./uftrace.data'):
+ os.mkdir('./uftrace.data')
+
+ binaries = []
+ for file in args.elf_file:
+ path, offset = parse_parameter(file)
+ b = BinaryFile(path, offset)
+ binaries.append(b)
+ binaries.sort(key = lambda b: b.addr_end());
+
+ for b in binaries:
+ b.generate_symbol_file(args.prefix_symbols)
+
+ generate_map(binaries)
+
+if __name__ == '__main__':
+ main()
--
2.47.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v4 6/7] contrib/plugins/uftrace: add documentation
2025-07-24 20:45 [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
` (4 preceding siblings ...)
2025-07-24 20:45 ` [PATCH v4 5/7] contrib/plugins/uftrace_symbols.py Pierrick Bouvier
@ 2025-07-24 20:45 ` Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 7/7] contrib/plugins/uftrace: add support for x64 Pierrick Bouvier
2025-08-04 16:19 ` [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
7 siblings, 0 replies; 14+ messages in thread
From: Pierrick Bouvier @ 2025-07-24 20:45 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Maydell, Alexandre Iooss, Gustavo Romero, Alex Bennée,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson, Pierrick Bouvier
This documentation summarizes how to use the plugin, and present two
examples of the possibilities offered by it.
As well, it explains how to rebuild and reproduce easily the system boot
example.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
docs/about/emulation.rst | 207 ++++++++++++++++++++++++++++++++++++++
contrib/plugins/uftrace.c | 2 +
2 files changed, 209 insertions(+)
diff --git a/docs/about/emulation.rst b/docs/about/emulation.rst
index 456d01d5b08..118f059711f 100644
--- a/docs/about/emulation.rst
+++ b/docs/about/emulation.rst
@@ -816,6 +816,213 @@ This plugin can limit the number of Instructions Per Second that are executed::
The lower the number the more accurate time will be, but the less efficient the plugin.
Defaults to ips/10
+Uftrace
+.......
+
+``contrib/plugins/uftrace.c``
+
+This plugin generates a binary trace compatible with
+`uftrace <https://github.com/namhyung/uftrace>`_.
+
+Plugin supports aarch64 only (x64 support should be trivial to add), and works
+in user and system mode, allowing to trace a system boot, which is not something
+possible usually.
+
+In user mode, the memory mapping is directly copied from ``/proc/self/maps`` at
+the end of execution. Uftrace should be able to retrieve symbols by itself,
+without any additional step.
+In system mode, the default memory mapping is empty, and you can generate
+one (and associated symbols) using ``contrib/plugins/uftrace_symbols.py``.
+Symbols must be present in ELF binaries.
+
+It tracks the call stack (based on frame pointer analysis). Thus, your program
+and its dependencies must be compiled using ``-fno-omit-frame-pointer
+-mno-omit-leaf-frame-pointer``. In 2024, `Ubuntu and Fedora enabled it by
+default again on x64
+<https://www.brendangregg.com/blog/2024-03-17/the-return-of-the-frame-pointers.html>`_.
+On aarch64, this is less of a problem, as they are usually part of the ABI,
+except for leaf functions. That's true for user space applications, but not
+necessarily for bare metal code.
+
+Timestamps used for events are the number of instructions executed so far by
+default. As it's tracked per vcpu, each timeline should be considered
+separately. It's possible to use real timestamps by using option
+``timestamp-based-on-real-time``. This is not the default, as considering real
+time when doing emulation and instrumentation may not necessarily report correct
+things. However, it is quite useful when running multiple cpus scenarios, or if
+you want to generate a trace around a particular time of the execution.
+
+When tracing long scenarios (> 1 min), the generated trace can become very long,
+making it hard to extract data from it. In this case, a simple solution is to
+trace execution using ``timestamp-based-on-real-time=on``, and generate a
+timestamped output log using ``qemu-system-aarch64 ... | ts "%s"``. Then,
+``uftrace --time-range=start~end`` can be used to reduce trace for only this
+part of execution.
+
+Performance wise, overhead compared to normal tcg execution can vary from x2
+(sampling only) to x10-x15 (precise stack tracking).
+
+.. list-table:: Uftrace plugin arguments
+ :widths: 20 80
+ :header-rows: 1
+
+ * - Option
+ - Description
+ * - trace-privilege-level=[on|off]
+ - Generate one trace per privilege level (Exception Level + Security State
+ on aarch64).
+ * - trace-sample=N
+ - Instead of precise tracking, perform stack sampling every N instructions.
+ If combined with ``trace-privilege-level``, it will still contain precise
+ stacks for privilege level changes, and will sample stack between those.
+ * - timestamp-based-on-real-time=[on|off]
+ - Use real time for timestamps instead of number of instructions executed.
+
+.. list-table:: uftrace_symbols.py arguments
+ :widths: 20 80
+ :header-rows: 1
+
+ * - Option
+ - Description
+ * - elf_file [elf_file ...]
+ - path to an ELF file. Use /path/to/file:0xdeadbeef to add a mapping offset.
+ * - --prefix-symbols
+ - prepend binary name to symbols
+
+Example user trace
+++++++++++++++++++
+
+As an example, we can trace qemu itself running git::
+
+ $ ./build/qemu-aarch64 -plugin \
+ build/contrib/plugins/libuftrace.so,timestamp-based-on-real-time=on \
+ ./build/qemu-aarch64 /usr/bin/git --help
+
+ # and generate a chrome trace directly
+ $ uftrace dump --chrome | gzip > ~/qemu_aarch64_git_help.json.gz
+
+For convenience, you can download this trace `qemu_aarch64_git_help.json.gz
+<https://fileserver.linaro.org/s/N8X8fnZ5yGRZLsT/download/qemu_aarch64_git_help.json.gz>`_.
+Download it and open this trace on https://ui.perfetto.dev/. You can zoom in/out
+using w,a,s,d keys. Some sequences taken from this trace:
+
+- Loading program and its interpreter
+
+.. image:: https://fileserver.linaro.org/s/fie8JgX76yyL5cq/preview
+ :height: 200px
+
+- open syscall
+
+.. image:: https://fileserver.linaro.org/s/rsXPTeZZPza4PcE/preview
+ :height: 200px
+
+- TB creation
+
+.. image:: https://fileserver.linaro.org/s/GXY6NKMw5EeRCew/preview
+ :height: 200px
+
+It's usually better to use ``uftrace record`` directly. However, tracing
+binaries through qemu-user can be convenient when you don't want to recompile
+them (``uftrace record`` requires instrumentation), as long as symbols are
+present.
+
+Example system trace
+++++++++++++++++++++
+
+A full trace example (chrome trace, from instructions below) generated from a
+system boot can be found `here
+<https://fileserver.linaro.org/s/WsemLboPEzo24nw/download/aarch64_boot.json.gz>`_.
+Download it and open this trace on https://ui.perfetto.dev/. You can see code
+executed for all privilege levels, and zoom in/out using w,a,s,d keys. You can
+find below some sequences taken from this trace:
+
+- Two first stages of boot sequence in Arm Trusted Firmware (EL3 and S-EL1)
+
+.. image:: https://fileserver.linaro.org/s/kkxBS552W7nYESX/preview
+ :height: 200px
+
+- U-boot initialization (until code relocation, after which we can't track it)
+
+.. image:: https://fileserver.linaro.org/s/LKTgsXNZFi5GFNC/preview
+ :height: 200px
+
+- Stat and open syscalls in kernel
+
+.. image:: https://fileserver.linaro.org/s/dXe4MfraKg2F476/preview
+ :height: 200px
+
+- Timer interrupt
+
+.. image:: https://fileserver.linaro.org/s/TM5yobYzJtP7P3C/preview
+ :height: 200px
+
+- Poweroff sequence (from kernel back to firmware, NS-EL2 to EL3)
+
+.. image:: https://fileserver.linaro.org/s/oR2PtyGKJrqnfRf/preview
+ :height: 200px
+
+Build and run system example
+++++++++++++++++++++++++++++
+
+Building a full system image with frame pointers is not trivial.
+
+We provide a `simple way <https://github.com/pbo-linaro/qemu-linux-stack>`_ to
+build an aarch64 system, combining Arm Trusted firmware, U-boot, Linux kernel
+and debian userland. It's based on containers (``podman`` only) and
+``qemu-user-static (binfmt)`` to make sure it's easily reproducible and does not depend
+on machine where you build it.
+
+To build the system::
+
+ # Install dependencies
+ $ sudo apt install -y podman qemu-user-static
+
+ $ git clone https://github.com/pbo-linaro/qemu-linux-stack
+ $ cd qemu-linux-stack
+ $ ./build.sh
+
+ # system can be started using:
+ $ ./run.sh /path/to/qemu-system-aarch64
+
+To generate a uftrace for a system boot from that::
+
+ # run true and poweroff the system
+ $ env INIT=true ./run.sh path/to/qemu-system-aarch64 \
+ -plugin path/to/contrib/plugins/libuftrace.so,trace-privilege-level=on
+
+ # generate symbols and memory mapping
+ $ path/to/contrib/plugins/uftrace_symbols.py \
+ --prefix-symbols \
+ arm-trusted-firmware/build/qemu/debug/bl1/bl1.elf \
+ arm-trusted-firmware/build/qemu/debug/bl2/bl2.elf \
+ arm-trusted-firmware/build/qemu/debug/bl31/bl31.elf \
+ u-boot/u-boot:0x60000000 \
+ linux/vmlinux
+
+ # inspect trace with
+ $ uftrace replay
+
+Uftrace allows to filter the trace, and dump flamegraphs, or a chrome trace.
+This last one is very interesting to see visually the boot process::
+
+ $ uftrace dump --chrome > boot.json
+ # Open your browser, and load boot.json on https://ui.perfetto.dev/.
+
+Long visual chrome traces can't be easily opened, thus, it might be
+interesting to generate them around a particular point of execution::
+
+ # execute qemu and timestamp output log
+ $ env INIT=true ./run.sh path/to/qemu-system-aarch64 \
+ -plugin path/to/contrib/plugins/libuftrace.so,trace-privilege-level=on,timestamp-based-on-real-time=on |&
+ ts "%s" | tee exec.log
+
+ $ cat exec.log | grep 'Run /init'
+ 1753122320 [ 11.834391] Run /init as init process
+ # init was launched at 1753122320
+
+ # generate trace around init execution (2 seconds):
+ $ uftrace dump --chrome --time-range=1753122320~1753122322 > init.json
+
Other emulation features
------------------------
diff --git a/contrib/plugins/uftrace.c b/contrib/plugins/uftrace.c
index 9211f273700..9cbadda0aae 100644
--- a/contrib/plugins/uftrace.c
+++ b/contrib/plugins/uftrace.c
@@ -4,6 +4,8 @@
* Generates a trace compatible with uftrace (similar to uftrace record).
* https://github.com/namhyung/uftrace
*
+ * See docs/about/emulation.rst|Uftrace for details and examples.
+ *
* SPDX-License-Identifier: GPL-2.0-or-later
*/
--
2.47.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* [PATCH v4 7/7] contrib/plugins/uftrace: add support for x64
2025-07-24 20:45 [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
` (5 preceding siblings ...)
2025-07-24 20:45 ` [PATCH v4 6/7] contrib/plugins/uftrace: add documentation Pierrick Bouvier
@ 2025-07-24 20:45 ` Pierrick Bouvier
2025-08-04 16:19 ` [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
7 siblings, 0 replies; 14+ messages in thread
From: Pierrick Bouvier @ 2025-07-24 20:45 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Maydell, Alexandre Iooss, Gustavo Romero, Alex Bennée,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson, Pierrick Bouvier
Stack layout is the same as aarch64, so we can reuse the same code.
We don't track stack in real mode, but it does not really matter as it's
usually very short, when the machine boots.
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
docs/about/emulation.rst | 14 ++++--
contrib/plugins/uftrace.c | 94 +++++++++++++++++++++++++++++++++++++++
2 files changed, 104 insertions(+), 4 deletions(-)
diff --git a/docs/about/emulation.rst b/docs/about/emulation.rst
index 118f059711f..7f45dc209b6 100644
--- a/docs/about/emulation.rst
+++ b/docs/about/emulation.rst
@@ -824,9 +824,8 @@ Uftrace
This plugin generates a binary trace compatible with
`uftrace <https://github.com/namhyung/uftrace>`_.
-Plugin supports aarch64 only (x64 support should be trivial to add), and works
-in user and system mode, allowing to trace a system boot, which is not something
-possible usually.
+Plugin supports aarch64 and x64, and works in user and system mode, allowing to
+trace a system boot, which is not something possible usually.
In user mode, the memory mapping is directly copied from ``/proc/self/maps`` at
the end of execution. Uftrace should be able to retrieve symbols by itself,
@@ -842,7 +841,8 @@ default again on x64
<https://www.brendangregg.com/blog/2024-03-17/the-return-of-the-frame-pointers.html>`_.
On aarch64, this is less of a problem, as they are usually part of the ABI,
except for leaf functions. That's true for user space applications, but not
-necessarily for bare metal code.
+necessarily for bare metal code. You can read this `section
+<uftrace_build_system_example>` to easily build a system with frame pointers.
Timestamps used for events are the number of instructions executed so far by
default. As it's tracked per vcpu, each timeline should be considered
@@ -964,6 +964,8 @@ find below some sequences taken from this trace:
Build and run system example
++++++++++++++++++++++++++++
+.. _uftrace_build_system_example:
+
Building a full system image with frame pointers is not trivial.
We provide a `simple way <https://github.com/pbo-linaro/qemu-linux-stack>`_ to
@@ -972,6 +974,10 @@ and debian userland. It's based on containers (``podman`` only) and
``qemu-user-static (binfmt)`` to make sure it's easily reproducible and does not depend
on machine where you build it.
+You can follow the exact same instructions for a x64 system, combining edk2,
+Linux, and Ubuntu, simply by switching to
+`x86_64 <https://github.com/pbo-linaro/qemu-linux-stack/tree/x86_64>`_ branch.
+
To build the system::
# Install dependencies
diff --git a/contrib/plugins/uftrace.c b/contrib/plugins/uftrace.c
index 9cbadda0aae..8fc22302989 100644
--- a/contrib/plugins/uftrace.c
+++ b/contrib/plugins/uftrace.c
@@ -81,6 +81,20 @@ typedef struct {
struct qemu_plugin_register *reg_scr_el3;
} Aarch64Cpu;
+typedef enum {
+ X64_RING0,
+ X64_RING1,
+ X64_RING2,
+ X64_RING3,
+ X64_REAL_MODE,
+} X64PrivilegeLevel;
+
+typedef struct {
+ struct qemu_plugin_register *reg_rbp;
+ struct qemu_plugin_register *reg_cs;
+ struct qemu_plugin_register *reg_cr0;
+} X64Cpu;
+
typedef struct {
uint64_t timestamp;
uint64_t data;
@@ -600,6 +614,84 @@ static CpuOps aarch64_ops = {
.does_insn_modify_frame_pointer = aarch64_does_insn_modify_frame_pointer,
};
+static uint8_t x64_num_privilege_levels(void)
+{
+ return X64_REAL_MODE + 1;
+}
+
+static const char *x64_get_privilege_level_name(uint8_t pl)
+{
+ switch (pl) {
+ case X64_RING0: return "Ring0";
+ case X64_RING1: return "Ring1";
+ case X64_RING2: return "Ring2";
+ case X64_RING3: return "Ring3";
+ case X64_REAL_MODE: return "RealMode";
+ default:
+ g_assert_not_reached();
+ }
+}
+
+static uint8_t x64_get_privilege_level(Cpu *cpu_)
+{
+ X64Cpu *cpu = cpu_->arch;
+ uint64_t cr0 = cpu_read_register64(cpu_, cpu->reg_cr0);
+ uint64_t protected_mode = (cr0 >> 0) & 0b1;
+ if (!protected_mode) {
+ return X64_REAL_MODE;
+ }
+ uint32_t cs = cpu_read_register32(cpu_, cpu->reg_cs);
+ uint32_t ring_level = (cs >> 0) & 0b11;
+ return ring_level;
+}
+
+static uint64_t x64_get_frame_pointer(Cpu *cpu_)
+{
+ X64Cpu *cpu = cpu_->arch;
+ return cpu_read_register64(cpu_, cpu->reg_rbp);
+}
+
+static void x64_init(Cpu *cpu_)
+{
+ X64Cpu *cpu = g_malloc0(sizeof(X64Cpu));
+ cpu_->arch = cpu;
+ g_autoptr(GArray) regs = qemu_plugin_get_registers();
+ for (int i = 0; i < regs->len; ++i) {
+ qemu_plugin_reg_descriptor *reg;
+ reg = &g_array_index(regs, qemu_plugin_reg_descriptor, i);
+ if (!strcmp(reg->name, "rbp")) {
+ cpu->reg_rbp = reg->handle;
+ } else if (!strcmp(reg->name, "cs")) {
+ cpu->reg_cs = reg->handle;
+ } else if (!strcmp(reg->name, "cr0")) {
+ cpu->reg_cr0 = reg->handle;
+ }
+ }
+ g_assert(cpu->reg_rbp);
+ g_assert(cpu->reg_cs);
+ g_assert(cpu->reg_cr0);
+}
+
+static void x64_end(Cpu *cpu)
+{
+ g_free(cpu->arch);
+}
+
+static bool x64_does_insn_modify_frame_pointer(const char *disas)
+{
+ return strstr(disas, "rbp");
+}
+
+static CpuOps x64_ops = {
+ .init = x64_init,
+ .end = x64_end,
+ .get_frame_pointer = x64_get_frame_pointer,
+ .get_privilege_level = x64_get_privilege_level,
+ .num_privilege_levels = x64_num_privilege_levels,
+ .get_privilege_level_name = x64_get_privilege_level_name,
+ .does_insn_modify_frame_pointer = x64_does_insn_modify_frame_pointer,
+};
+
static void track_privilege_change(unsigned int cpu_index, void *udata)
{
Cpu *cpu = qemu_plugin_scoreboard_find(score, cpu_index);
@@ -905,6 +997,8 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
if (!strcmp(info->target_name, "aarch64")) {
arch_ops = aarch64_ops;
+ } else if (!strcmp(info->target_name, "x86_64")) {
+ arch_ops = x64_ops;
} else {
fprintf(stderr, "plugin uftrace: %s target is not supported\n",
info->target_name);
--
2.47.2
^ permalink raw reply related [flat|nested] 14+ messages in thread
* Re: [PATCH v4 0/7] contrib/plugins: uftrace
2025-07-24 20:45 [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
` (6 preceding siblings ...)
2025-07-24 20:45 ` [PATCH v4 7/7] contrib/plugins/uftrace: add support for x64 Pierrick Bouvier
@ 2025-08-04 16:19 ` Pierrick Bouvier
7 siblings, 0 replies; 14+ messages in thread
From: Pierrick Bouvier @ 2025-08-04 16:19 UTC (permalink / raw)
To: qemu-devel
Cc: Peter Maydell, Alexandre Iooss, Gustavo Romero, Alex Bennée,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson
On 7/24/25 1:45 PM, Pierrick Bouvier wrote:
> This plugin generates a binary trace compatible with the excellent uftrace:
> https://github.com/namhyung/uftrace
>
> In short, it tracks all function calls performed during execution, based on
> frame pointer analysis. A big advantage over "uftrace record" is that it works
> in system mode, allowing to trace a full system execution, which was the
> original goal. It works as well in user mode, but uftrace itself already does
> this. It's implemented for aarch64 only (with the intent to add x86_64 later).
>
> Let's start with concrete examples of the result.
>
> First, in system mode, booting a stack using TF-A + U-boot + Linux:
> - Two first stages of boot sequence in Arm Trusted Firmware (EL3 and S-EL1)
> https://fileserver.linaro.org/s/kkxBS552W7nYESX/preview
> - Stat and open syscalls in kernel
> https://fileserver.linaro.org/s/dXe4MfraKg2F476/preview
> - Poweroff sequence (from kernel back to firmware, NS-EL2 to EL3)
> https://fileserver.linaro.org/s/oR2PtyGKJrqnfRf/preview
>
> Full trace is available here:
> https://fileserver.linaro.org/s/WsemLboPEzo24nw/download/aarch64_boot.json.gz
> You can download and open it on https://ui.perfetto.dev/ to explore it.
>
> Second, in user mode, tracing qemu-aarch64 (itself) running git --help:
> - Loading program and its interpreter
> https://fileserver.linaro.org/s/fie8JgX76yyL5cq/preview
> - TB creation
> https://fileserver.linaro.org/s/GXY6NKMw5EeRCew/preview
>
> Full trace is available here:
> https://fileserver.linaro.org/s/N8X8fnZ5yGRZLsT/download/qemu_aarch64_git_help.json.gz
>
> If you had curiosity and now you're ready to give some attention, most of the
> details you want to read are included in the documentation patch (final one).
>
> Overhead is around x2 (sampling only) to x10-x15 (precise), and long traces can
> be directly filtered with uftrace if needed.
>
> The series is splitted in:
> - implementing the plugin
> - adding useful options (especially sampling and privilege level tracing)
> - add a companion script to symbolize traces generated
> - add documentation with examples
>
> I hope this plugin can help people trying to understand what happens out of the
> user space, and get a better grasp of how firmwares, bootloader, and kernel
> interact behind the curtain.
>
> v2
> --
>
> - trace active stacks on exit
> - do not erase map generated in system_emulation
> - add documentation to generate restricted visual traces around specific events
> of execution
>
> v3
> --
>
> - fix missing include unistd.h (build failed on MacOS only)
>
> v4
> --
>
> - add support for x64
>
> Pierrick Bouvier (7):
> contrib/plugins/uftrace: new uftrace plugin
> contrib/plugins/uftrace: add trace-sample option
> contrib/plugins/uftrace: add trace-privilege-level option
> contrib/plugins/uftrace: add timestamp-based-on-real-time option
> contrib/plugins/uftrace_symbols.py
> contrib/plugins/uftrace: add documentation
> contrib/plugins/uftrace: add support for x64
>
> docs/about/emulation.rst | 213 ++++++
> contrib/plugins/uftrace.c | 1014 ++++++++++++++++++++++++++++
> contrib/plugins/meson.build | 3 +-
> contrib/plugins/uftrace_symbols.py | 152 +++++
> 4 files changed, 1381 insertions(+), 1 deletion(-)
> create mode 100644 contrib/plugins/uftrace.c
> create mode 100755 contrib/plugins/uftrace_symbols.py
>
Ping on this series.
Regards,
Pierrick
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 1/7] contrib/plugins/uftrace: new uftrace plugin
2025-07-24 20:45 ` [PATCH v4 1/7] contrib/plugins/uftrace: new uftrace plugin Pierrick Bouvier
@ 2025-08-05 10:04 ` Alex Bennée
2025-08-06 0:41 ` Pierrick Bouvier
0 siblings, 1 reply; 14+ messages in thread
From: Alex Bennée @ 2025-08-05 10:04 UTC (permalink / raw)
To: Pierrick Bouvier
Cc: qemu-devel, Peter Maydell, Alexandre Iooss, Gustavo Romero,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson
Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
> This plugin generates a binary trace compatible with:
> https://github.com/namhyung/uftrace
>
> It tracks frame pointer during execution, detecting function
> calls/returns and works in system and user mode.
>
> It's implemented for aarch64 only (adding other architecture should
> be trivial, especially x86_64 that share the same stack layout).
>
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
> contrib/plugins/uftrace.c | 630
> ++++++++++++++++++++++++++++++++++++
Is it possible to break this down into smaller commits to ease review? I
appreciate it takes a bit until you have something working but this is a
relatively big plugin.
Anyway some thoughts bellow.
> contrib/plugins/meson.build | 3 +-
> 2 files changed, 632 insertions(+), 1 deletion(-)
> create mode 100644 contrib/plugins/uftrace.c
>
> diff --git a/contrib/plugins/uftrace.c b/contrib/plugins/uftrace.c
> new file mode 100644
> index 00000000000..4ca6d3fd68d
> --- /dev/null
> +++ b/contrib/plugins/uftrace.c
> @@ -0,0 +1,630 @@
> +/*
> + * Copyright (C) 2025, Pierrick Bouvier <pierrick.bouvier@linaro.org>
> + *
> + * Generates a trace compatible with uftrace (similar to uftrace record).
> + * https://github.com/namhyung/uftrace
> + *
> + * SPDX-License-Identifier: GPL-2.0-or-later
> + */
> +
> +#include <qemu-plugin.h>
> +#include <glib.h>
> +#include <stdio.h>
> +#include <sys/stat.h>
> +#include <sys/time.h>
> +#include <time.h>
> +#include <unistd.h>
> +
> +QEMU_PLUGIN_EXPORT int qemu_plugin_version = QEMU_PLUGIN_VERSION;
> +
> +typedef struct {
> + GArray *s;
> +} callstack;
> +
> +typedef struct {
> + uint64_t pc;
> + uint64_t frame_pointer;
> +} callstack_entry;
> +
> +typedef struct {
> + GArray *t;
> + GString *path;
> + GString *name;
Do these actually build over time or could they just be g_strdup'ed
strings? Could you infact use g_intern_string to save memory by only
having one canonical reference? You wouldn't free them until the end
though.
> + uint32_t id;
> +} trace;
> +
> +typedef struct Cpu Cpu;
> +
> +typedef struct {
> + void (*init)(Cpu *cpu);
> + void (*end)(Cpu *cpu);
> + uint64_t (*get_frame_pointer)(Cpu *cpu);
> + bool (*does_insn_modify_frame_pointer)(const char *disas);
> +} CpuOps;
> +
> +typedef struct Cpu {
> + uint64_t insn_count;
> + trace *trace;
> + callstack *cs;
> + GArray *callstacks; /* callstack *callstacks[] */
> + GArray *traces; /* trace *traces [] */
> + GByteArray *buf;
> + CpuOps ops;
> + void *arch;
> +} Cpu;
> +
> +typedef struct {
> + struct qemu_plugin_register *reg_fp;
> +} Aarch64Cpu;
> +
> +typedef struct {
> + uint64_t timestamp;
> + uint64_t data;
> +} uftrace_entry;
The typedefs are inconsistent in naming style. QEMU style says typenames
should use CamelCase.
> +
> +enum uftrace_record_type {
> + UFTRACE_ENTRY,
> + UFTRACE_EXIT,
> + UFTRACE_LOST,
> + UFTRACE_EVENT
> +};
> +
> +static struct qemu_plugin_scoreboard *score;
> +static CpuOps arch_ops;
> +
> +static void uftrace_write_map(bool system_emulation)
> +{
> + const char *path = "./uftrace.data/sid-0.map";
> +
> + if (system_emulation && access(path, F_OK) == 0) {
> + /* do not erase existing map in system emulation, as a custom one might
> + * already have been generated by uftrace_symbols.py */
> + return;
> + }
> +
> + FILE *sid_map = fopen(path, "w");
> + g_assert(sid_map);
I'm not sure an assert is the correct way to deal with a possible error
condition here.
> +
> + if (system_emulation) {
> + fprintf(sid_map,
> + "# map stack on highest address possible, to prevent uftrace\n"
> + "# from considering any kernel address\n");
> + fprintf(sid_map,
> + "ffffffffffff-ffffffffffff rw-p 00000000 00:00 0 [stack]\n");
> + } else {
> + /* in user mode, copy /proc/self/maps instead */
> + FILE *self_map = fopen("/proc/self/maps", "r");
> + g_assert(self_map);
> + for (;;) {
> + int c = fgetc(self_map);
> + if (c == EOF) {
> + break;
> + }
> + fputc(c, sid_map);
> + }
> + fclose(self_map);
> + }
> + fclose(sid_map);
> +}
> +
> +static void uftrace_write_task(const GArray *traces)
> +{
> + FILE *task = fopen("./uftrace.data/task.txt", "w");
> + g_assert(task);
ditto
> + for (int i = 0; i < traces->len; ++i) {
> + trace *t = g_array_index(traces, trace*, i);
> + fprintf(task, "SESS timestamp=0.0 pid=%"PRIu32" sid=0 exename=\"%s\"\n",
> + t->id, t->name->str);
> + fprintf(task, "TASK timestamp=0.0 tid=%"PRIu32" pid=%"PRIu32"\n",
> + t->id, t->id);
> + }
> + fclose(task);
> +}
> +
> +static void uftrace_write_info(const GArray *traces)
> +{
> + g_autoptr(GString) taskinfo_tids = g_string_new("taskinfo:tids=");
> + for (int i = 0; i < traces->len; ++i) {
> + trace *t = g_array_index(traces, trace*, i);
> + const char *delim = i > 0 ? "," : "";
> + g_string_append_printf(taskinfo_tids, "%s%"PRIu32, delim, t->id);
> + }
> +
> + g_autoptr(GString) taskinfo_nr_tid = g_string_new("taskinfo:nr_tid=");
> + g_string_append_printf(taskinfo_nr_tid, "%d", traces->len);
> +
> + FILE *info = fopen("./uftrace.data/info", "w");
> + g_assert(info);
> + /*
> + * $ uftrace dump --debug
> + * uftrace file header: magic = 4674726163652100
> + * uftrace file header: version = 4
> + * uftrace file header: header size = 40
> + * uftrace file header: endian = 1 (little)
> + * ftrace file header: class = 2 (64 bit)
> + * uftrace file header: features = 0x1263 (PLTHOOK | ...
> + * uftrace file header: info = 0x7bff (EXE_NAME | ...
> + * <0000000000000000>: 46 74 72 61 63 65 21 00 04 00 00 00 28 00 01 02
> + * <0000000000000010>: 63 12 00 00 00 00 00 00 ff 7b 00 00 00 00 00 00
> + * <0000000000000020>: 00 04 00 00 00 00 00 00
> + */
> + const uint8_t header[] = {0x46, 0x74, 0x72, 0x61, 0x63, 0x65, 0x21, 0x00,
> + 0x04, 0x00, 0x00, 0x00, 0x28, 0x00, 0x01, 0x02,
> + 0x63, 0x12, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> + 0xff, 0x7b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
> + 0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00};
> + fwrite(header, sizeof(header), 1, info);
> + const char *info_data[] = {
> + "exename:from_qemu",
> + "build_id:0123456789abcdef0123456789abcdef01234567",
> + "exit_status:0",
> + "cmdline:uftrace record qemu",
> + "cpuinfo:lines=2",
> + "cpuinfo:nr_cpus=1 / 1 (online/possible)",
> + "cpuinfo:desc=Intel 8086",
> + "meminfo:1.0 / 1.0 GB (free / total)",
> + "osinfo:lines=3",
> + "osinfo:kernel=Linux 6.12.33",
> + "osinfo:hostname=pc",
> + "osinfo:distro=\"Debian GNU/Linux 13 (trixie)\"",
> + "taskinfo:lines=2",
This seems all rather like a dump from a developer machine. Should we be
filling this out with either the local host setup or even the guest
details (at least for CPU type etc)?
> + taskinfo_nr_tid->str,
> + taskinfo_tids->str,
> + "usageinfo:lines=6",
> + "usageinfo:systime=0.000000",
> + "usageinfo:usrtime=0.000000",
> + "usageinfo:ctxsw=0 / 0 (voluntary / involuntary)",
> + "usageinfo:maxrss=8016",
> + "usageinfo:pagefault=0 / 0 (major / minor)",
> + "usageinfo:iops=0 / 0 (read / write)",
> + "loadinfo:0.0 / 0.0 / 0.0",
> + "record_date:Mon Jan 1 00:00:00 2025",
> + "elapsed_time:1000000000000.0 sec",
> + "pattern_type:regex",
> + "uftrace_version:v0.17 ( x86_64 dwarf python3 luajit tui perf sched dynamic kernel )",
> + "utc_offset:1751552954",
> + 0};
> + const char **info_data_it = info_data;
> + while (*(info_data_it)) {
> + fprintf(info, "%s\n", *info_data_it);
> + ++info_data_it;
> + }
> + fclose(info);
> +}
> +
> +static callstack *callstack_new(void)
> +{
> + callstack *cs = g_malloc0(sizeof(callstack));
g_new0(callstack, 1)?
> + cs->s = g_array_new(false, false, sizeof(callstack_entry));
> + return cs;
> +}
> +
> +static void callstack_free(callstack *cs)
> +{
> + g_array_free(cs->s, true);
> + cs->s = NULL;
> + g_free(cs);
> +}
> +
> +static size_t callstack_depth(const callstack *cs)
> +{
> + return cs->s->len;
> +}
> +
> +static size_t callstack_empty(const callstack *cs)
> +{
> + return callstack_depth(cs) == 0;
> +}
> +
> +static void callstack_clear(callstack *cs)
> +{
> + g_array_set_size(cs->s, 0);
> +}
> +
> +static const callstack_entry *callstack_at(const callstack *cs, size_t depth)
> +{
> + g_assert(depth > 0);
> + g_assert(depth <= callstack_depth(cs));
> + return &g_array_index(cs->s, callstack_entry, depth - 1);
> +}
> +
> +static callstack_entry callstack_top(const callstack *cs)
> +{
> + if (callstack_depth(cs) >= 1) {
> + return *callstack_at(cs, callstack_depth(cs));
> + }
> + return (callstack_entry){};
> +}
> +
> +static callstack_entry callstack_caller(const callstack *cs)
> +{
> + if (callstack_depth(cs) >= 2) {
> + return *callstack_at(cs, callstack_depth(cs) - 1);
> + }
> + return (callstack_entry){};
> +}
> +
> +static void callstack_push(callstack *cs, callstack_entry e)
> +{
> + g_array_append_val(cs->s, e);
> +}
> +
> +static callstack_entry callstack_pop(callstack *cs)
> +{
> + g_assert(!callstack_empty(cs));
> + callstack_entry e = callstack_top(cs);
> + g_array_set_size(cs->s, callstack_depth(cs) - 1);
> + return e;
> +}
> +
> +static trace *trace_new(uint32_t id, GString *name)
> +{
> + trace *t = g_malloc0(sizeof(trace));
g_new0()
> + t->t = g_array_new(false, false, sizeof(uftrace_entry));
> + t->path = g_string_new(NULL);
> + g_string_append_printf(t->path, "./uftrace.data/%"PRIu32".dat", id);
> + t->name = g_string_new(name->str);
> + t->id = id;
> + return t;
> +}
> +
> +static void trace_free(trace *t)
> +{
> + g_assert(t->t->len == 0);
> + g_array_free(t->t, true);
> + t->t = NULL;
> + g_string_free(t->path, true);
> + t->path = NULL;
> + g_string_free(t->name, true);
> + t->name = NULL;
> + g_free(t);
> +}
> +
> +static void trace_flush(trace *t, bool append)
> +{
> + int create_dir = g_mkdir_with_parents("./uftrace.data",
> + S_IRWXU | S_IRWXG | S_IRWXO);
> + g_assert(create_dir == 0);
> + FILE *dat = fopen(t->path->str, append ? "a" : "w");
> + g_assert(dat);
> + GArray *data = t->t;
> + if (data->len) {
> + fwrite(data->data, data->len, sizeof(uftrace_entry), dat);
> + }
> + fclose(dat);
> + g_array_set_size(data, 0);
> +}
> +
> +static void trace_add_entry(trace *t, uint64_t timestamp, uint64_t pc,
> + size_t depth, enum uftrace_record_type type)
> +{
> + /* libmcount/record.c:record_event */
> + const uint64_t record_magic = 0x5;
> + uint64_t data = type | record_magic << 3;
> + data += depth << 6;
> + data += pc << 16;
> + uftrace_entry e = {.timestamp = timestamp, .data = data};
> + g_array_append_val(t->t, e);
> + if (t->t->len * sizeof(uftrace_entry) > 32 * 1024 * 1024) {
A good candidate for:
#define MiB (INT64_C(1) << 20)
like units.h and then having > (32 * MiB)
> + /* flush every 32 MB */
> + trace_flush(t, true);
> + }
> +}
> +
> +static void trace_enter_function(trace *t, uint64_t timestamp,
> + uint64_t pc, size_t depth)
> +{
> + trace_add_entry(t, timestamp, pc, depth, UFTRACE_ENTRY);
> +}
> +
> +static void trace_exit_function(trace *t, uint64_t timestamp,
> + uint64_t pc, size_t depth)
> +{
> + trace_add_entry(t, timestamp, pc, depth, UFTRACE_EXIT);
> +}
> +
> +static void trace_enter_stack(trace *t, callstack *cs, uint64_t timestamp)
> +{
> + for (size_t depth = 1; depth <= callstack_depth(cs); ++depth) {
> + trace_enter_function(t, timestamp, callstack_at(cs, depth)->pc, depth);
> + }
> +}
> +
> +static void trace_exit_stack(trace *t, callstack *cs, uint64_t timestamp)
> +{
> + for (size_t depth = callstack_depth(cs); depth > 0; --depth) {
> + trace_exit_function(t, timestamp, callstack_at(cs, depth)->pc, depth);
> + }
> +}
> +
> +static uint64_t cpu_read_register64(Cpu *cpu, struct qemu_plugin_register *reg)
> +{
> + GByteArray *buf = cpu->buf;
> + g_byte_array_set_size(buf, 0);
> + size_t sz = qemu_plugin_read_register(reg, buf);
> + g_assert(sz == 8);
> + g_assert(buf->len == 8);
> + return *((uint64_t *) buf->data);
> +}
> +
> +static uint64_t cpu_read_memory64(Cpu *cpu, uint64_t addr)
> +{
> + g_assert(addr);
> + GByteArray *buf = cpu->buf;
> + g_byte_array_set_size(buf, 0);
> + bool read = qemu_plugin_read_memory_vaddr(addr, buf, 8);
> + if (!read) {
> + return 0;
> + }
> + g_assert(buf->len == 8);
> + return *((uint64_t *) buf->data);
> +}
> +
A little comment about what we are doing here would be useful.
> +static void cpu_unwind_stack(Cpu *cpu, uint64_t frame_pointer, uint64_t pc)
> +{
> + g_assert(callstack_empty(cpu->cs));
> +
> + #define UNWIND_STACK_MAX_DEPTH 1024
> + callstack_entry unwind[UNWIND_STACK_MAX_DEPTH];
> + size_t depth = 0;
> + do {
> + /* check we don't have an infinite stack */
> + for (size_t i = 0; i < depth; ++i) {
> + if (frame_pointer == unwind[i].frame_pointer) {
> + break;
> + }
> + }
> + callstack_entry e = {.frame_pointer = frame_pointer, .pc = pc};
> + unwind[depth] = e;
> + depth++;
> + if (frame_pointer) {
> + frame_pointer = cpu_read_memory64(cpu, frame_pointer);
> + }
> + pc = cpu_read_memory64(cpu, frame_pointer + 8); /* read previous lr */
> + } while (frame_pointer && pc && depth < UNWIND_STACK_MAX_DEPTH);
> + #undef UNWIND_STACK_MAX_DEPTH
> +
> + /* push it from bottom to top */
> + while (depth) {
> + callstack_push(cpu->cs, unwind[depth - 1]);
> + --depth;
> + }
> +}
> +
> +static uint64_t cpu_get_timestamp(const Cpu *cpu)
> +{
> + return cpu->insn_count;
> +}
> +
> +static uint64_t aarch64_get_frame_pointer(Cpu *cpu_)
> +{
> + Aarch64Cpu *cpu = cpu_->arch;
> + return cpu_read_register64(cpu_, cpu->reg_fp);
> +}
> +
> +static void aarch64_init(Cpu *cpu_)
> +{
> + Aarch64Cpu *cpu = g_malloc0(sizeof(Aarch64Cpu));
> + cpu_->arch = cpu;
> + g_autoptr(GArray) regs = qemu_plugin_get_registers();
> + for (int i = 0; i < regs->len; ++i) {
> + qemu_plugin_reg_descriptor *reg;
> + reg = &g_array_index(regs, qemu_plugin_reg_descriptor, i);
> + if (!strcmp(reg->name, "x29")) {
> + cpu->reg_fp = reg->handle;
> + }
> + }
It doesn't seem aarch64_init/Aarch64Cpu is very aarch64 specific, you could easily
make a table indexed by info->target_name or allow the user to specific
the fpreg via an arg.
> + if (!cpu->reg_fp) {
> + fprintf(stderr, "uftrace plugin: frame pointer register (x29) is not "
> + "available. Please use an AArch64 cpu (or -cpu max).\n");
> + g_abort();
> + }
> +}
> +
> +static void aarch64_end(Cpu *cpu)
> +{
> + g_free(cpu->arch);
> +}
> +
> +static bool aarch64_does_insn_modify_frame_pointer(const char *disas)
> +{
> + /*
> + * Check if current instruction concerns fp register "x29".
> + * We add a prefix space to make sure we don't match addresses dump
> + * in disassembly.
> + */
> + return strstr(disas, " x29");
there was another fix in another plugin to handle /t as a potential
terminator.
> +}
> +
> +static CpuOps aarch64_ops = {
> + .init = aarch64_init,
> + .end = aarch64_end,
> + .get_frame_pointer = aarch64_get_frame_pointer,
> + .does_insn_modify_frame_pointer =
> aarch64_does_insn_modify_frame_pointer,
arguably does_insn_modify_frame_pointer is the only specialised handler
you need for ops although maybe even then you could save the text of the
register when you setup so you can have a common helper when frame
pointers are explicit in the assembly.
> +};
> +
> +static void track_callstack(unsigned int cpu_index, void *udata)
> +{
> + uint64_t pc = (uintptr_t) udata;
> + Cpu *cpu = qemu_plugin_scoreboard_find(score, cpu_index);
> + uint64_t timestamp = cpu_get_timestamp(cpu);
> + callstack *cs = cpu->cs;
> + trace *t = cpu->trace;
> +
> + uint64_t fp = cpu->ops.get_frame_pointer(cpu);
> + if (!fp && callstack_empty(cs)) {
> + /*
> + * We simply push current pc. Note that we won't detect symbol change as
> + * long as a proper call does not happen.
> + */
> + callstack_push(cs, (callstack_entry){.frame_pointer = fp,
> + .pc = pc});
> + trace_enter_function(t, timestamp, pc, callstack_depth(cs));
> + return;
> + }
> +
> + callstack_entry top = callstack_top(cs);
> + if (fp == top.frame_pointer) {
> + /* same function */
> + return;
> + }
> +
> + callstack_entry caller = callstack_caller(cs);
> + if (fp == caller.frame_pointer) {
> + /* return */
> + callstack_entry e = callstack_pop(cs);
> + trace_exit_function(t, timestamp, e.pc, callstack_depth(cs));
> + return;
> + }
> +
> + uint64_t caller_fp = fp ? cpu_read_memory64(cpu, fp) : 0;
> + if (caller_fp == top.frame_pointer) {
> + /* call */
> + callstack_push(cs, (callstack_entry){.frame_pointer = fp,
> + .pc = pc});
> + trace_enter_function(t, timestamp, pc, callstack_depth(cs));
> + return;
> + }
> +
> + /* discontinuity, exit current stack and unwind new one */
> + trace_exit_stack(t, cs, timestamp);
> + callstack_clear(cs);
> +
> + cpu_unwind_stack(cpu, fp, pc);
> + trace_enter_stack(t, cs, timestamp);
> +}
> +
> +static void vcpu_tb_trans(qemu_plugin_id_t id, struct qemu_plugin_tb *tb)
> +{
> + size_t n_insns = qemu_plugin_tb_n_insns(tb);
> +
> + qemu_plugin_u64 insn_count = qemu_plugin_scoreboard_u64_in_struct(
> + score, Cpu, insn_count);
> +
> + /*
> + * We instrument all instructions following one that might have updated
> + * the frame pointer. We always instrument first instruction in block, as
> + * last executed instruction, in previous tb, may have modified it.
> + */
> + bool instrument_insn = true;
> + for (int i = 0; i < n_insns; i++) {
> + struct qemu_plugin_insn *insn = qemu_plugin_tb_get_insn(tb, i);
> + qemu_plugin_register_vcpu_insn_exec_inline_per_vcpu(
> + insn, QEMU_PLUGIN_INLINE_ADD_U64, insn_count, 1);
> +
> + if (instrument_insn) {
> + uintptr_t pc = qemu_plugin_insn_vaddr(insn);
> + qemu_plugin_register_vcpu_insn_exec_cb(insn, track_callstack,
> + QEMU_PLUGIN_CB_R_REGS,
> + (void *) pc);
> + instrument_insn = false;
> + }
> +
> + char *disas = qemu_plugin_insn_disas(insn);
> + if (arch_ops.does_insn_modify_frame_pointer(disas)) {
> + instrument_insn = true;
> + }
> + }
> +}
> +
> +static void vcpu_init(qemu_plugin_id_t id, unsigned int vcpu_index)
> +{
> + Cpu *cpu = qemu_plugin_scoreboard_find(score, vcpu_index);
> + cpu->ops = arch_ops;
> +
> + cpu->ops.init(cpu);
> + cpu->buf = g_byte_array_new();
> + cpu->callstacks = g_array_new(0, 0, sizeof(callstack *));
> + cpu->traces = g_array_new(0, 0, sizeof(trace *));
> +
> + g_assert(vcpu_index < 1000);
Seems arbitrary?
./qemu-aarch64 -plugin ./contrib/plugins/libuftrace.so ./tests/tcg/aarch64-linux-user/threadcount 1001
**
ERROR:../../contrib/plugins/uftrace.c:878:vcpu_init: assertion failed: (vcpu_index < 1000)
Bail out! ERROR:../../contrib/plugins/uftrace.c:878:vcpu_init: assertion failed: (vcpu_index < 1000)
fish: Job 1, './qemu-aarch64 -plugin ./contri…' terminated by signal SIGABRT (Abort)
> + uint32_t trace_id = 1000 * 1000 + vcpu_index * 1000;
> +
> + g_autoptr(GString) trace_name = g_string_new(NULL);
> + g_string_append_printf(trace_name, "cpu%u", vcpu_index);
g_strdup_printf would work just as well here.
> + trace *t = trace_new(trace_id, trace_name);
> + g_array_append_val(cpu->traces, t);
> + callstack *cs = callstack_new();
> + g_array_append_val(cpu->callstacks, cs);
> + /* create/truncate trace file */
> + trace_flush(t, false);
> +
> + cpu->cs = cs;
> + cpu->trace = t;
> +}
> +
> +static void vcpu_end(unsigned int vcpu_index)
> +{
> + Cpu *cpu = qemu_plugin_scoreboard_find(score, vcpu_index);
> + g_byte_array_free(cpu->buf, true);
> +
> + for (size_t i = 0; i < cpu->traces->len; ++i) {
> + trace *t = g_array_index(cpu->traces, trace*, i);
> + trace_free(t);
> + }
> +
> + for (size_t i = 0; i < cpu->callstacks->len; ++i) {
> + callstack *cs = g_array_index(cpu->callstacks, callstack*, i);
> + callstack_free(cs);
> + }
> +
> + g_array_free(cpu->traces, true);
> + g_array_free(cpu->callstacks, true);
> + memset(cpu, 0, sizeof(Cpu));
> +}
> +
> +static void at_exit(qemu_plugin_id_t id, void *data)
> +{
> + bool system_emulation = (bool) data;
> + g_autoptr(GArray) traces = g_array_new(0, 0, sizeof(trace *));
> +
> + /* exit last active stacks */
> + for (size_t i = 0; i < qemu_plugin_num_vcpus(); ++i) {
> + Cpu *cpu = qemu_plugin_scoreboard_find(score, i);
> + uint64_t timestamp = cpu_get_timestamp(cpu);
> + if (trace_sample) {
> + cpu_trace_last_sample(cpu, timestamp);
> + }
> + trace_exit_stack(cpu->trace, cpu->cs, timestamp);
> + callstack_clear(cpu->cs);
> + }
> +
> + for (size_t i = 0; i < qemu_plugin_num_vcpus(); ++i) {
> + Cpu *cpu = qemu_plugin_scoreboard_find(score, i);
> + for (size_t j = 0; j < cpu->traces->len; ++j) {
> + trace *t = g_array_index(cpu->traces, trace*, j);
> + trace_flush(t, true);
> + g_array_append_val(traces, t);
> + }
> + }
> +
> + uftrace_write_map(system_emulation);
> + uftrace_write_info(traces);
> + uftrace_write_task(traces);
> +
> + for (size_t i = 0; i < qemu_plugin_num_vcpus(); ++i) {
> + vcpu_end(i);
> + }
> +
> + qemu_plugin_scoreboard_free(score);
> +}
> +
> +QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
> + const qemu_info_t *info,
> + int argc, char **argv)
> +{
> + if (!strcmp(info->target_name, "aarch64")) {
> + arch_ops = aarch64_ops;
> + } else {
> + fprintf(stderr, "plugin uftrace: %s target is not supported\n",
> + info->target_name);
> + return 1;
> + }
> +
> + score = qemu_plugin_scoreboard_new(sizeof(Cpu));
> + qemu_plugin_register_vcpu_init_cb(id, vcpu_init);
> + qemu_plugin_register_atexit_cb(id, at_exit, (void *) info->system_emulation);
> + qemu_plugin_register_vcpu_tb_trans_cb(id, vcpu_tb_trans);
> +
> + return 0;
> +}
> diff --git a/contrib/plugins/meson.build b/contrib/plugins/meson.build
> index 1876bc78438..7eb3629c95d 100644
> --- a/contrib/plugins/meson.build
> +++ b/contrib/plugins/meson.build
> @@ -1,5 +1,6 @@
> contrib_plugins = ['bbv', 'cache', 'cflow', 'drcov', 'execlog', 'hotblocks',
> - 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger']
> + 'hotpages', 'howvec', 'hwprofile', 'ips', 'stoptrigger',
> + 'uftrace']
> if host_os != 'windows'
> # lockstep uses socket.h
> contrib_plugins += 'lockstep'
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 4/7] contrib/plugins/uftrace: add timestamp-based-on-real-time option
2025-07-24 20:45 ` [PATCH v4 4/7] contrib/plugins/uftrace: add timestamp-based-on-real-time option Pierrick Bouvier
@ 2025-08-05 10:23 ` Alex Bennée
2025-08-06 0:44 ` Pierrick Bouvier
0 siblings, 1 reply; 14+ messages in thread
From: Alex Bennée @ 2025-08-05 10:23 UTC (permalink / raw)
To: Pierrick Bouvier
Cc: qemu-devel, Peter Maydell, Alexandre Iooss, Gustavo Romero,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson
Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
> usage: timestamp-based-on-real-time=[on|off]
>
> Instead of using number of instructions executed (which is per vcpu), we
> use the wall time for timestamps. This is useful when tracing user mode
> programs as well.
Arguably it is the more useful default when examining flame graphs. The
instruction count is entirely artificial and wouldn't map onto real
execution time anyway.
Maybe use-icount-for-ts and default to off?
>
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> ---
> contrib/plugins/uftrace.c | 29 +++++++++++++++++++++++++++++
> 1 file changed, 29 insertions(+)
>
> diff --git a/contrib/plugins/uftrace.c b/contrib/plugins/uftrace.c
> index bd4219db693..9211f273700 100644
> --- a/contrib/plugins/uftrace.c
> +++ b/contrib/plugins/uftrace.c
> @@ -94,8 +94,28 @@ enum uftrace_record_type {
> static struct qemu_plugin_scoreboard *score;
> static uint64_t trace_sample;
> static bool trace_privilege_level;
> +static bool timestamp_based_on_real_time;
> static CpuOps arch_ops;
>
> +static uint64_t gettime_ns(void)
> +{
> +#ifdef _WIN32
> + /*
> + * On Windows, timespec_get is available only with UCRT, but not with
> + * MinGW64 environment. Simplify by using only gettimeofday on this
> + * platform.
> + */
> + struct timeval tv;
> + gettimeofday(&tv, NULL);
> + uint64_t now_ns = tv.tv_sec * 1000 * 1000 * 1000 + tv.tv_usec * 1000;
> +#else
> + struct timespec ts;
> + timespec_get(&ts, TIME_UTC);
> + uint64_t now_ns = ts.tv_sec * 1000 * 1000 * 1000 + ts.tv_nsec;
Why use timspec_get here, QEMU's own timer uses:
/* get host real time in nanosecond */
static inline int64_t get_clock_realtime(void)
{
struct timeval tv;
gettimeofday(&tv, NULL);
return tv.tv_sec * 1000000000LL + (tv.tv_usec * 1000);
}
> +#endif
> + return now_ns;
> +}
> +
> static void uftrace_write_map(bool system_emulation)
> {
> const char *path = "./uftrace.data/sid-0.map";
> @@ -454,6 +474,9 @@ static void cpu_set_new_sample(Cpu *cpu, uint64_t timestamp)
>
> static uint64_t cpu_get_timestamp(const Cpu *cpu)
> {
> + if (timestamp_based_on_real_time) {
> + return gettime_ns();
> + }
> return cpu->insn_count;
> }
>
> @@ -866,6 +889,12 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
> fprintf(stderr, "boolean argument parsing failed: %s\n", opt);
> return -1;
> }
> + } else if (g_strcmp0(tokens[0], "timestamp-based-on-real-time") == 0) {
> + if (!qemu_plugin_bool_parse(tokens[0], tokens[1],
> + ×tamp_based_on_real_time)) {
> + fprintf(stderr, "boolean argument parsing failed: %s\n", opt);
> + return -1;
> + }
> } else {
> fprintf(stderr, "option parsing failed: %s\n", opt);
> return -1;
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 5/7] contrib/plugins/uftrace_symbols.py
2025-07-24 20:45 ` [PATCH v4 5/7] contrib/plugins/uftrace_symbols.py Pierrick Bouvier
@ 2025-08-05 10:28 ` Alex Bennée
0 siblings, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2025-08-05 10:28 UTC (permalink / raw)
To: Pierrick Bouvier
Cc: qemu-devel, Peter Maydell, Alexandre Iooss, Gustavo Romero,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson
Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
> usage: contrib/plugins/uftrace_symbols.py \
> --prefix-symbols \
> arm-trusted-firmware/build/qemu/debug/bl1/bl1.elf \
> arm-trusted-firmware/build/qemu/debug/bl2/bl2.elf \
> arm-trusted-firmware/build/qemu/debug/bl31/bl31.elf \
> u-boot/u-boot:0x60000000 \
> u-boot/u-boot.relocated:0x000000023f6b6000 \
> linux/vmlinux
>
> Will generate symbols and memory mapping files for uftrace, allowing to
> have an enhanced trace, instead of raw addresses.
>
> It takes a collection of elf files, and automatically find all their
> symbols, and generate an ordered memory map based on that.
>
> This script uses the python (native) pyelftools module.
>
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
--
Alex Bennée
Virtualisation Tech Lead @ Linaro
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 1/7] contrib/plugins/uftrace: new uftrace plugin
2025-08-05 10:04 ` Alex Bennée
@ 2025-08-06 0:41 ` Pierrick Bouvier
0 siblings, 0 replies; 14+ messages in thread
From: Pierrick Bouvier @ 2025-08-06 0:41 UTC (permalink / raw)
To: Alex Bennée
Cc: qemu-devel, Peter Maydell, Alexandre Iooss, Gustavo Romero,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson
On 8/5/25 3:04 AM, Alex Bennée wrote:
> Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
>
>> This plugin generates a binary trace compatible with:
>> https://github.com/namhyung/uftrace
>>
>> It tracks frame pointer during execution, detecting function
>> calls/returns and works in system and user mode.
>>
>> It's implemented for aarch64 only (adding other architecture should
>> be trivial, especially x86_64 that share the same stack layout).
>>
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>> ---
>> contrib/plugins/uftrace.c | 630
>> ++++++++++++++++++++++++++++++++++++
>
> Is it possible to break this down into smaller commits to ease review? I
> appreciate it takes a bit until you have something working but this is a
> relatively big plugin.
>
I'll post another version breaking down the first version, so it's
easier to review all the different structures used and methods applying
to them.
Regards,
Pierrick
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH v4 4/7] contrib/plugins/uftrace: add timestamp-based-on-real-time option
2025-08-05 10:23 ` Alex Bennée
@ 2025-08-06 0:44 ` Pierrick Bouvier
0 siblings, 0 replies; 14+ messages in thread
From: Pierrick Bouvier @ 2025-08-06 0:44 UTC (permalink / raw)
To: Alex Bennée
Cc: qemu-devel, Peter Maydell, Alexandre Iooss, Gustavo Romero,
Mahmoud Mandour, rowan Hart, Manos Pitsidianakis,
Philippe Mathieu-Daudé, Richard Henderson
On 8/5/25 3:23 AM, Alex Bennée wrote:
> Pierrick Bouvier <pierrick.bouvier@linaro.org> writes:
>
>> usage: timestamp-based-on-real-time=[on|off]
>>
>> Instead of using number of instructions executed (which is per vcpu), we
>> use the wall time for timestamps. This is useful when tracing user mode
>> programs as well.
>
> Arguably it is the more useful default when examining flame graphs. The
> instruction count is entirely artificial and wouldn't map onto real
> execution time anyway.
>
> Maybe use-icount-for-ts and default to off?
>
I don't mind changing the default if you prefer so.
>>
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>> ---
>> contrib/plugins/uftrace.c | 29 +++++++++++++++++++++++++++++
>> 1 file changed, 29 insertions(+)
>>
>> diff --git a/contrib/plugins/uftrace.c b/contrib/plugins/uftrace.c
>> index bd4219db693..9211f273700 100644
>> --- a/contrib/plugins/uftrace.c
>> +++ b/contrib/plugins/uftrace.c
>> @@ -94,8 +94,28 @@ enum uftrace_record_type {
>> static struct qemu_plugin_scoreboard *score;
>> static uint64_t trace_sample;
>> static bool trace_privilege_level;
>> +static bool timestamp_based_on_real_time;
>> static CpuOps arch_ops;
>>
>> +static uint64_t gettime_ns(void)
>> +{
>> +#ifdef _WIN32
>> + /*
>> + * On Windows, timespec_get is available only with UCRT, but not with
>> + * MinGW64 environment. Simplify by using only gettimeofday on this
>> + * platform.
>> + */
>> + struct timeval tv;
>> + gettimeofday(&tv, NULL);
>> + uint64_t now_ns = tv.tv_sec * 1000 * 1000 * 1000 + tv.tv_usec * 1000;
>> +#else
>> + struct timespec ts;
>> + timespec_get(&ts, TIME_UTC);
>> + uint64_t now_ns = ts.tv_sec * 1000 * 1000 * 1000 + ts.tv_nsec;
>
> Why use timspec_get here, QEMU's own timer uses:
>
> /* get host real time in nanosecond */
> static inline int64_t get_clock_realtime(void)
> {
> struct timeval tv;
>
> gettimeofday(&tv, NULL);
> return tv.tv_sec * 1000000000LL + (tv.tv_usec * 1000);
> }
>
We need nanoseconds timestamps (which timespec offers and not
gettimeofday), or we lose precision for short lived functions, resulting
in bad shaped traces.
>
>> +#endif
>> + return now_ns;
>> +}
>> +
>> static void uftrace_write_map(bool system_emulation)
>> {
>> const char *path = "./uftrace.data/sid-0.map";
>> @@ -454,6 +474,9 @@ static void cpu_set_new_sample(Cpu *cpu, uint64_t timestamp)
>>
>> static uint64_t cpu_get_timestamp(const Cpu *cpu)
>> {
>> + if (timestamp_based_on_real_time) {
>> + return gettime_ns();
>> + }
>> return cpu->insn_count;
>> }
>>
>> @@ -866,6 +889,12 @@ QEMU_PLUGIN_EXPORT int qemu_plugin_install(qemu_plugin_id_t id,
>> fprintf(stderr, "boolean argument parsing failed: %s\n", opt);
>> return -1;
>> }
>> + } else if (g_strcmp0(tokens[0], "timestamp-based-on-real-time") == 0) {
>> + if (!qemu_plugin_bool_parse(tokens[0], tokens[1],
>> + ×tamp_based_on_real_time)) {
>> + fprintf(stderr, "boolean argument parsing failed: %s\n", opt);
>> + return -1;
>> + }
>> } else {
>> fprintf(stderr, "option parsing failed: %s\n", opt);
>> return -1;
>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2025-08-06 0:45 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-07-24 20:45 [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 1/7] contrib/plugins/uftrace: new uftrace plugin Pierrick Bouvier
2025-08-05 10:04 ` Alex Bennée
2025-08-06 0:41 ` Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 2/7] contrib/plugins/uftrace: add trace-sample option Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 3/7] contrib/plugins/uftrace: add trace-privilege-level option Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 4/7] contrib/plugins/uftrace: add timestamp-based-on-real-time option Pierrick Bouvier
2025-08-05 10:23 ` Alex Bennée
2025-08-06 0:44 ` Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 5/7] contrib/plugins/uftrace_symbols.py Pierrick Bouvier
2025-08-05 10:28 ` Alex Bennée
2025-07-24 20:45 ` [PATCH v4 6/7] contrib/plugins/uftrace: add documentation Pierrick Bouvier
2025-07-24 20:45 ` [PATCH v4 7/7] contrib/plugins/uftrace: add support for x64 Pierrick Bouvier
2025-08-04 16:19 ` [PATCH v4 0/7] contrib/plugins: uftrace Pierrick Bouvier
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).