[RFC PATCH v2] xen: add libafl-qemu fuzzer support

All of lore.kernel.org
 help / color / mirror / Atom feed

* [RFC PATCH v2] xen: add libafl-qemu fuzzer support
@ 2025-03-15  0:36 Volodymyr Babchuk
  2025-03-21 22:32 ` Stefano Stabellini
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Volodymyr Babchuk @ 2025-03-15  0:36 UTC (permalink / raw)
  To: xen-devel@lists.xenproject.org
  Cc: Volodymyr Babchuk, Andrew Cooper, Anthony PERARD, Michal Orzel,
	Jan Beulich, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, Bertrand Marquis, Volodymyr Babchuk,
	Dario Faggioli, Juergen Gross, George Dunlap

LibAFL, which is a part of AFL++ project is a instrument that allows
us to perform fuzzing on beremetal code (Xen hypervisor in this case)
using QEMU as an emulator. It employs QEMU's ability to create
snapshots to run many tests relatively quickly: system state is saved
right before executing a new test and restored after the test is
finished.

This patch adds all necessary plumbing to run aarch64 build of Xen
inside that LibAFL-QEMU fuzzer. From the Xen perspective we need to
do following things:

1. Able to communicate with LibAFL-QEMU fuzzer. This is done by
executing special opcodes, that only LibAFL-QEMU can handle.

2. Use interface from p.1 to tell the fuzzer about code Xen section,
so fuzzer know which part of code to track and gather coverage data.

3. Report fuzzer about crash. This is done in panic() function.

4. Prevent test harness from shooting itself in knee.

Right now test harness is an external component, because we want to
test external Xen interfaces, but it is possible to fuzz internal code
if we want to.

Test harness is implemented XTF-based test-case(s). As test harness
can issue hypercall that shuts itself down, KConfig option
CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING was added. It basically tells
fuzzer that test was completed successfully if Dom0 tries to shut
itself (or the whole machine) down.

Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>

---

I tried to fuzz the vGIC emulator and hypercall interface. While vGIC
fuzzing didn't yield any interesting results, hypercall fuzzing found a
way to crash the hypervisor from Dom0 on aarch64, using
"XEN_SYSCTL_page_offline_op" with "sysctl_query_page_offline" sub-op,
because it leads to page_is_ram_type() call which is marked
UNREACHABLE on ARM.

In v2:

 - Moved to XTF-based test harness
 - Severely reworked the fuzzer itself. Now it has user-friendly
   command-line interface and is capable of running in CI, as it now
   returns an appropriate error code if any faults were found
 - Also I found, debugged and fixed a nasty bug in LibAFL-QEMU fork,
   which crashed the whole fuzzer.

Right now the fuzzer is lockated at Xen Troops repo:

https://github.com/xen-troops/xen-fuzzer-rs

But I believe that it is ready to be included into
gitlab.com/xen-project/

XTF-based harness is at

https://gitlab.com/vlad.babchuk/xtf/-/tree/mr_libafl

and there is corresponding MR for including it into

https://gitlab.com/xen-project/fusa/xtf/-/tree/xtf-arm

So, to sum up. All components are basically ready for initial
inclusion. There will be smaller, integration-related changes
later. For example - we will need to update URLs for various
components after they are moved to correct places.
---
 docs/hypervisor-guide/fuzzing.rst           |  90 ++++++++++++
 xen/arch/arm/Kconfig.debug                  |  26 ++++
 xen/arch/arm/Makefile                       |   1 +
 xen/arch/arm/include/asm/libafl_qemu.h      |  54 +++++++
 xen/arch/arm/include/asm/libafl_qemu_defs.h |  37 +++++
 xen/arch/arm/libafl_qemu.c                  | 152 ++++++++++++++++++++
 xen/arch/arm/psci.c                         |  13 ++
 xen/common/sched/core.c                     |  17 +++
 xen/common/shutdown.c                       |   7 +
 xen/drivers/char/console.c                  |   8 ++
 10 files changed, 405 insertions(+)
 create mode 100644 docs/hypervisor-guide/fuzzing.rst
 create mode 100644 xen/arch/arm/include/asm/libafl_qemu.h
 create mode 100644 xen/arch/arm/include/asm/libafl_qemu_defs.h
 create mode 100644 xen/arch/arm/libafl_qemu.c

diff --git a/docs/hypervisor-guide/fuzzing.rst b/docs/hypervisor-guide/fuzzing.rst
new file mode 100644
index 0000000000..a5de71dd25
--- /dev/null
+++ b/docs/hypervisor-guide/fuzzing.rst
@@ -0,0 +1,90 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Fuzzing
+=======
+
+It is possible to use LibAFL-QEMU for fuzzing hypervisor. Right now
+only aarch64 is supported and only hypercall fuzzing is enabled in the
+test harness, but there are plans to add vGIC interface fuzzing, PSCI
+fuzzing and vPL011 fuzzing as well.
+
+
+Principle of operation
+----------------------
+
+LibAFL-QEMU is a part of American Fuzzy lop plus plus (AKA AFL++)
+project. It uses special build of QEMU, that allows to fuzz baremetal
+software like Xen hypervisor or Linux kernel. Basic idea is that we
+have software under test (Xen hypervisor in our case) and a test
+harness application. Test harness uses special protocol to communicate
+with LibAFL outside of QEMU to get input data and report test
+result. LibAFL monitors which branches are taken by Xen and mutates
+input data in attempt to discover new code paths that eventually can
+lead to a crash or other unintended behavior.
+
+LibAFL uses QEMU's `snapshot` feature to run multiple test without
+restarting the whole system every time. This speeds up fuzzing process
+greatly.
+
+So, to try Xen fuzzing we need three components: LibAFL-based fuzzer,
+test harness and Xen itself.
+
+Building Xen for fuzzing
+------------------------
+
+Xen hypervisor should be built with these two options::
+
+ CONFIG_LIBAFL_QEMU_FUZZER=y
+ CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING=y
+
+Building LibAFL-QEMU based fuzzer
+---------------------------------
+
+Fuzzer is written in Rust, so you need Rust toolchain and `cargo` tool
+in your system. Please refer to your distro documentation on how to
+obtain them.
+
+Once Rust is ready, fetch and build the fuzzer::
+
+  # git clone https://github.com/xen-troops/xen-fuzzer-rs
+  # cd xen-fuzzer-rs
+  # cargo build
+
+Building test harness
+---------------------
+
+We need to make low-level actions, like issuing random hypercalls, so
+for test harness we use special build of Zephyr application. We use
+XTF as a test harness. You can build XTF manually, or let fuzzer to do this::
+
+  # cargo make build_xtf
+
+This fill download and build XTF for ARM.
+
+Running the fuzzer
+------------------
+
+Please refer to README.md that comes with the fuzzer, but the most
+versatile way is to run it like this::
+
+  # target/debug/xen_fuzzer -t 3600 /path/to/xen \
+      target/xtf/tests/arm-vgic-fuzzer/test-mmu64le-arm-vgic-fuzzer
+
+(assuming that you built XTF with `cargo make build_xtf`)
+
+Any inputs that led to crashes will be found in `crashes` directory.
+
+You can replay a crash with `-r` option::
+
+  # target/debug/xen_fuzzer -r crashes/0195e4fc65828c17 run \
+      /path/to/xen \
+      /path/to/harness
+
+
+Fuzzer will return non-zero error code if it encountered any crashes.
+
+TODOs
+-----
+
+ - Add x86 support.
+ - Implement fuzzing of other external hypervisor interfaces.
diff --git a/xen/arch/arm/Kconfig.debug b/xen/arch/arm/Kconfig.debug
index 5a03b220ac..3b00c77d3a 100644
--- a/xen/arch/arm/Kconfig.debug
+++ b/xen/arch/arm/Kconfig.debug
@@ -190,3 +190,29 @@ config EARLY_PRINTK_INC
 	default "debug-mvebu.inc" if EARLY_UART_MVEBU
 	default "debug-pl011.inc" if EARLY_UART_PL011
 	default "debug-scif.inc" if EARLY_UART_SCIF
+
+config LIBAFL_QEMU_FUZZER
+	bool "Enable LibAFL-QEMU calls"
+	help
+	  This option enables support for LibAFL-QEMU calls. Enable this
+	  only when you are going to run hypervisor inside LibAFL-QEMU.
+	  Xen will report code section to LibAFL and will report about
+	  crash when it panics.
+
+	  Do not try to run Xen built on this option on any real hardware
+	  or plain QEMU, because it will just crash during startup.
+
+config LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+	depends on LIBAFL_QEMU_FUZZER
+	bool "LibAFL: Report any attempt to suspend/destroy a domain as a success"
+	help
+	  When fuzzing hypercalls, fuzzer sometimes will issue an hypercall that
+	  leads to a domain shutdown, or machine shutdown, or vCPU being
+	  blocked, or something similar. In this case test harness will not be
+	  able to report about successfully handled call to the fuzzer. Fuzzer
+	  will report timeout and mark this as a crash, which is not true. So,
+	  in such cases we need to report about successfully test case from the
+	  hypervisor itself.
+
+          Enable this option only if fuzzing attempt can lead to a correct
+	  stoppage, like when fuzzing hypercalls or PSCI.
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index fb0948f067..7b4eaab680 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -12,6 +12,7 @@ obj-$(CONFIG_TEE) += tee/
 obj-$(CONFIG_HAS_VPCI) += vpci.o
 
 obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
+obj-${CONFIG_LIBAFL_QEMU_FUZZER} += libafl_qemu.o
 obj-y += cpuerrata.o
 obj-y += cpufeature.o
 obj-y += decode.o
diff --git a/xen/arch/arm/include/asm/libafl_qemu.h b/xen/arch/arm/include/asm/libafl_qemu.h
new file mode 100644
index 0000000000..b90cf48b9a
--- /dev/null
+++ b/xen/arch/arm/include/asm/libafl_qemu.h
@@ -0,0 +1,54 @@
+#ifndef LIBAFL_QEMU_H
+#define LIBAFL_QEMU_H
+
+#include <xen/stdint.h>
+#include "libafl_qemu_defs.h"
+#define LIBAFL_QEMU_PRINTF_MAX_SIZE 4096
+
+typedef uint64_t libafl_word;
+
+/**
+ * LibAFL QEMU header file.
+ *
+ * This file is a portable header file used to build target harnesses more
+ * conveniently. Its main purpose is to generate ready-to-use calls to
+ * communicate with the fuzzer. The list of commands is available at the bottom
+ * of this file. The rest mostly consists of macros generating the code used by
+ * the commands.
+ */
+
+enum LibaflQemuEndStatus {
+  LIBAFL_QEMU_END_UNKNOWN = 0,
+  LIBAFL_QEMU_END_OK = 1,
+  LIBAFL_QEMU_END_CRASH = 2,
+};
+
+libafl_word libafl_qemu_start_virt(void *buf_vaddr, libafl_word max_len);
+
+libafl_word libafl_qemu_start_phys(void *buf_paddr, libafl_word max_len);
+
+libafl_word libafl_qemu_input_virt(void *buf_vaddr, libafl_word max_len);
+
+libafl_word libafl_qemu_input_phys(void *buf_paddr, libafl_word max_len);
+
+void libafl_qemu_end(enum LibaflQemuEndStatus status);
+
+void libafl_qemu_save(void);
+
+void libafl_qemu_load(void);
+
+libafl_word libafl_qemu_version(void);
+
+void libafl_qemu_page_current_allow(void);
+
+void libafl_qemu_internal_error(void);
+
+void __attribute__((format(printf, 1, 2))) lqprintf(const char *fmt, ...);
+
+void libafl_qemu_test(void);
+
+void libafl_qemu_trace_vaddr_range(libafl_word start, libafl_word end);
+
+void libafl_qemu_trace_vaddr_size(libafl_word start, libafl_word size);
+
+#endif
diff --git a/xen/arch/arm/include/asm/libafl_qemu_defs.h b/xen/arch/arm/include/asm/libafl_qemu_defs.h
new file mode 100644
index 0000000000..2866cadaac
--- /dev/null
+++ b/xen/arch/arm/include/asm/libafl_qemu_defs.h
@@ -0,0 +1,37 @@
+#ifndef LIBAFL_QEMU_DEFS
+#define LIBAFL_QEMU_DEFS
+
+#define LIBAFL_STRINGIFY(s) #s
+#define XSTRINGIFY(s) LIBAFL_STRINGIFY(s)
+
+#if __STDC_VERSION__ >= 201112L
+  #define STATIC_CHECKS                                   \
+    _Static_assert(sizeof(void *) <= sizeof(libafl_word), \
+                   "pointer type should not be larger and libafl_word");
+#else
+  #define STATIC_CHECKS
+#endif
+
+#define LIBAFL_SYNC_EXIT_OPCODE 0x66f23a0f
+#define LIBAFL_BACKDOOR_OPCODE 0x44f23a0f
+
+#define LIBAFL_QEMU_TEST_VALUE 0xcafebabe
+
+#define LIBAFL_QEMU_HDR_VERSION_NUMBER 0111  // TODO: find a nice way to set it.
+
+typedef enum LibaflQemuCommand {
+  LIBAFL_QEMU_COMMAND_START_VIRT = 0,
+  LIBAFL_QEMU_COMMAND_START_PHYS = 1,
+  LIBAFL_QEMU_COMMAND_INPUT_VIRT = 2,
+  LIBAFL_QEMU_COMMAND_INPUT_PHYS = 3,
+  LIBAFL_QEMU_COMMAND_END = 4,
+  LIBAFL_QEMU_COMMAND_SAVE = 5,
+  LIBAFL_QEMU_COMMAND_LOAD = 6,
+  LIBAFL_QEMU_COMMAND_VERSION = 7,
+  LIBAFL_QEMU_COMMAND_VADDR_FILTER_ALLOW = 8,
+  LIBAFL_QEMU_COMMAND_INTERNAL_ERROR = 9,
+  LIBAFL_QEMU_COMMAND_LQPRINTF = 10,
+  LIBAFL_QEMU_COMMAND_TEST = 11,
+} LibaflExit;
+
+#endif
diff --git a/xen/arch/arm/libafl_qemu.c b/xen/arch/arm/libafl_qemu.c
new file mode 100644
index 0000000000..58924ce6c6
--- /dev/null
+++ b/xen/arch/arm/libafl_qemu.c
@@ -0,0 +1,152 @@
+/* SPDX-License-Identifier: Apache-2.0 */
+/*
+   This file is based on libafl_qemu_impl.h and libafl_qemu_qemu_arch.h
+   from LibAFL project.
+*/
+#include <xen/lib.h>
+#include <xen/init.h>
+#include <xen/kernel.h>
+#include <asm/libafl_qemu.h>
+
+#define LIBAFL_DEFINE_FUNCTIONS(name, opcode)				\
+	libafl_word _libafl_##name##_call0(	\
+		libafl_word action) {					\
+		libafl_word ret;					\
+		__asm__ volatile (					\
+			"mov x0, %1\n"					\
+			".word " XSTRINGIFY(opcode) "\n"		\
+			"mov %0, x0\n"					\
+			: "=r"(ret)					\
+			: "r"(action)					\
+			: "x0"						\
+			);						\
+		return ret;						\
+	}								\
+									\
+	libafl_word _libafl_##name##_call1(	\
+		libafl_word action, libafl_word arg1) {			\
+		libafl_word ret;					\
+		__asm__ volatile (					\
+			"mov x0, %1\n"					\
+			"mov x1, %2\n"					\
+			".word " XSTRINGIFY(opcode) "\n"		\
+			"mov %0, x0\n"					\
+			: "=r"(ret)					\
+			: "r"(action), "r"(arg1)			\
+			: "x0", "x1"					\
+			);						\
+		return ret;						\
+	}								\
+									\
+	libafl_word _libafl_##name##_call2(	\
+		libafl_word action, libafl_word arg1, libafl_word arg2) { \
+		libafl_word ret;					\
+		__asm__ volatile (					\
+			"mov x0, %1\n"					\
+			"mov x1, %2\n"					\
+			"mov x2, %3\n"					\
+			".word " XSTRINGIFY(opcode) "\n"		\
+			"mov %0, x0\n"					\
+			: "=r"(ret)					\
+			: "r"(action), "r"(arg1), "r"(arg2)		\
+			: "x0", "x1", "x2"				\
+			);						\
+		return ret;						\
+	}
+
+// Generates sync exit functions
+LIBAFL_DEFINE_FUNCTIONS(sync_exit, LIBAFL_SYNC_EXIT_OPCODE)
+
+// Generates backdoor functions
+LIBAFL_DEFINE_FUNCTIONS(backdoor, LIBAFL_BACKDOOR_OPCODE)
+
+static char _lqprintf_buffer[LIBAFL_QEMU_PRINTF_MAX_SIZE] = {0};
+
+libafl_word libafl_qemu_start_virt(void       *buf_vaddr,
+                                            libafl_word max_len) {
+  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_START_VIRT,
+                                 (libafl_word)buf_vaddr, max_len);
+}
+
+libafl_word libafl_qemu_start_phys(void       *buf_paddr,
+                                            libafl_word max_len) {
+  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_START_PHYS,
+                                 (libafl_word)buf_paddr, max_len);
+}
+
+libafl_word libafl_qemu_input_virt(void       *buf_vaddr,
+                                            libafl_word max_len) {
+  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_INPUT_VIRT,
+                                 (libafl_word)buf_vaddr, max_len);
+}
+
+libafl_word libafl_qemu_input_phys(void       *buf_paddr,
+                                            libafl_word max_len) {
+  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_INPUT_PHYS,
+                                 (libafl_word)buf_paddr, max_len);
+}
+
+void libafl_qemu_end(enum LibaflQemuEndStatus status) {
+  _libafl_sync_exit_call1(LIBAFL_QEMU_COMMAND_END, status);
+}
+
+void libafl_qemu_save(void) {
+  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_SAVE);
+}
+
+void libafl_qemu_load(void) {
+  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_LOAD);
+}
+
+libafl_word libafl_qemu_version(void) {
+  return _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_VERSION);
+}
+
+void libafl_qemu_internal_error(void) {
+  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_INTERNAL_ERROR);
+}
+
+void lqprintf(const char *fmt, ...) {
+  va_list args;
+  int res;
+  va_start(args, fmt);
+  res = vsnprintf(_lqprintf_buffer, LIBAFL_QEMU_PRINTF_MAX_SIZE, fmt, args);
+  va_end(args);
+
+  if (res >= LIBAFL_QEMU_PRINTF_MAX_SIZE) {
+    // buffer is not big enough, either recompile the target with more
+    // space or print less things
+    libafl_qemu_internal_error();
+  }
+
+  _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_LQPRINTF,
+                          (libafl_word)_lqprintf_buffer, res);
+}
+
+void libafl_qemu_test(void) {
+  _libafl_sync_exit_call1(LIBAFL_QEMU_COMMAND_TEST, LIBAFL_QEMU_TEST_VALUE);
+}
+
+void libafl_qemu_trace_vaddr_range(libafl_word start,
+                                            libafl_word end) {
+  _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_VADDR_FILTER_ALLOW, start, end);
+}
+
+void libafl_qemu_trace_vaddr_size(libafl_word start,
+                                           libafl_word size) {
+  libafl_qemu_trace_vaddr_range(start, start + size);
+}
+
+static int init_afl(void)
+{
+	vaddr_t xen_text_start = (vaddr_t)_stext;
+	vaddr_t xen_text_end = (vaddr_t)_etext;
+
+	lqprintf("Telling AFL about code section: %lx - %lx\n", xen_text_start, xen_text_end);
+
+	libafl_qemu_trace_vaddr_range(xen_text_start, xen_text_end);
+
+	return 0;
+}
+
+__initcall(init_afl);
diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
index b6860a7760..c7a51a1144 100644
--- a/xen/arch/arm/psci.c
+++ b/xen/arch/arm/psci.c
@@ -17,6 +17,7 @@
 #include <asm/cpufeature.h>
 #include <asm/psci.h>
 #include <asm/acpi.h>
+#include <asm/libafl_qemu.h>
 
 /*
  * While a 64-bit OS can make calls with SMC32 calling conventions, for
@@ -49,6 +50,10 @@ int call_psci_cpu_on(int cpu)
 
 void call_psci_cpu_off(void)
 {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+    libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
+
     if ( psci_ver > PSCI_VERSION(0, 1) )
     {
         struct arm_smccc_res res;
@@ -62,12 +67,20 @@ void call_psci_cpu_off(void)
 
 void call_psci_system_off(void)
 {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+    libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
+
     if ( psci_ver > PSCI_VERSION(0, 1) )
         arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL);
 }
 
 void call_psci_system_reset(void)
 {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+    libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
+
     if ( psci_ver > PSCI_VERSION(0, 1) )
         arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL);
 }
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 9043414290..55eb132568 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -47,6 +47,10 @@
 #define pv_shim false
 #endif
 
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER
+#include <asm/libafl_qemu.h>
+#endif
+
 /* opt_sched: scheduler - default to configured value */
 static char __initdata opt_sched[10] = CONFIG_SCHED_DEFAULT;
 string_param("sched", opt_sched);
@@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
     if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
         return -EFAULT;
 
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+    libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
+
     set_bit(_VPF_blocked, &v->pause_flags);
     v->poll_evtchn = -1;
     set_bit(v->vcpu_id, d->poll_mask);
@@ -1904,12 +1912,18 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
     {
     case SCHEDOP_yield:
     {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+        libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
         ret = vcpu_yield();
         break;
     }
 
     case SCHEDOP_block:
     {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+        libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
         vcpu_block_enable_events();
         break;
     }
@@ -1924,6 +1938,9 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
 
         TRACE_TIME(TRC_SCHED_SHUTDOWN, current->domain->domain_id,
                    current->vcpu_id, sched_shutdown.reason);
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+        libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
         ret = domain_shutdown(current->domain, (u8)sched_shutdown.reason);
 
         break;
diff --git a/xen/common/shutdown.c b/xen/common/shutdown.c
index c47341b977..1340f4b606 100644
--- a/xen/common/shutdown.c
+++ b/xen/common/shutdown.c
@@ -11,6 +11,10 @@
 #include <xen/kexec.h>
 #include <public/sched.h>
 
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER
+#include <asm/libafl_qemu.h>
+#endif
+
 /* opt_noreboot: If true, machine will need manual reset on error. */
 bool __ro_after_init opt_noreboot;
 boolean_param("noreboot", opt_noreboot);
@@ -32,6 +36,9 @@ static void noreturn reboot_or_halt(void)
 
 void hwdom_shutdown(unsigned char reason)
 {
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
+    libafl_qemu_end(LIBAFL_QEMU_END_OK);
+#endif
     switch ( reason )
     {
     case SHUTDOWN_poweroff:
diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
index ba428199d2..55d33fa744 100644
--- a/xen/drivers/char/console.c
+++ b/xen/drivers/char/console.c
@@ -40,6 +40,9 @@
 #ifdef CONFIG_SBSA_VUART_CONSOLE
 #include <asm/vpl011.h>
 #endif
+#ifdef CONFIG_LIBAFL_QEMU_FUZZER
+#include <asm/libafl_qemu.h>
+#endif
 
 /* console: comma-separated list of console outputs. */
 static char __initdata opt_console[30] = OPT_CONSOLE_STR;
@@ -1289,6 +1292,11 @@ void panic(const char *fmt, ...)
 
     kexec_crash(CRASHREASON_PANIC);
 
+    #ifdef CONFIG_LIBAFL_QEMU_FUZZER
+    /* Tell the fuzzer that we crashed */
+    libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
+    #endif
+
     if ( opt_noreboot )
         machine_halt();
     else
-- 
2.48.1


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v2] xen: add libafl-qemu fuzzer support
  2025-03-15  0:36 [RFC PATCH v2] xen: add libafl-qemu fuzzer support Volodymyr Babchuk
@ 2025-03-21 22:32 ` Stefano Stabellini
  2025-03-21 22:57   ` Julien Grall
  2025-03-21 23:34   ` Julien Grall
  2025-03-21 23:31 ` Julien Grall
  2025-04-08 15:40 ` Jan Beulich
  2 siblings, 2 replies; 9+ messages in thread
From: Stefano Stabellini @ 2025-03-21 22:32 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Anthony PERARD,
	Michal Orzel, Jan Beulich, Julien Grall, Roger Pau Monné,
	Stefano Stabellini, Bertrand Marquis, Dario Faggioli,
	Juergen Gross, George Dunlap

On Sat, 15 Mar 2025, Volodymyr Babchuk wrote:
> LibAFL, which is a part of AFL++ project is a instrument that allows
> us to perform fuzzing on beremetal code (Xen hypervisor in this case)
> using QEMU as an emulator. It employs QEMU's ability to create
> snapshots to run many tests relatively quickly: system state is saved
> right before executing a new test and restored after the test is
> finished.
> 
> This patch adds all necessary plumbing to run aarch64 build of Xen
> inside that LibAFL-QEMU fuzzer. From the Xen perspective we need to
> do following things:
> 
> 1. Able to communicate with LibAFL-QEMU fuzzer. This is done by
> executing special opcodes, that only LibAFL-QEMU can handle.
> 
> 2. Use interface from p.1 to tell the fuzzer about code Xen section,
> so fuzzer know which part of code to track and gather coverage data.
> 
> 3. Report fuzzer about crash. This is done in panic() function.
> 
> 4. Prevent test harness from shooting itself in knee.
> 
> Right now test harness is an external component, because we want to
> test external Xen interfaces, but it is possible to fuzz internal code
> if we want to.
> 
> Test harness is implemented XTF-based test-case(s). As test harness
> can issue hypercall that shuts itself down, KConfig option
> CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING was added. It basically tells
> fuzzer that test was completed successfully if Dom0 tries to shut
> itself (or the whole machine) down.
> 
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>

I would appreciate if you could add a gitlab test for this. While I
realize that fuzzers are meant to run overnight and that's not something
we might be able to do with gitlab it would make it a lot easier to run
by anyone and it would also serve as documentation itself.

I think initially you can use your git branches you listed below but we
can create repositories under gitlab.com/xen-project when we commit this
patch to xen.


> ---
> 
> I tried to fuzz the vGIC emulator and hypercall interface. While vGIC
> fuzzing didn't yield any interesting results, hypercall fuzzing found a
> way to crash the hypervisor from Dom0 on aarch64, using
> "XEN_SYSCTL_page_offline_op" with "sysctl_query_page_offline" sub-op,
> because it leads to page_is_ram_type() call which is marked
> UNREACHABLE on ARM.
> 
> In v2:
> 
>  - Moved to XTF-based test harness
>  - Severely reworked the fuzzer itself. Now it has user-friendly
>    command-line interface and is capable of running in CI, as it now
>    returns an appropriate error code if any faults were found
>  - Also I found, debugged and fixed a nasty bug in LibAFL-QEMU fork,
>    which crashed the whole fuzzer.
> 
> Right now the fuzzer is lockated at Xen Troops repo:
> 
> https://github.com/xen-troops/xen-fuzzer-rs
> 
> But I believe that it is ready to be included into
> gitlab.com/xen-project/
> 
> XTF-based harness is at
> 
> https://gitlab.com/vlad.babchuk/xtf/-/tree/mr_libafl
> 
> and there is corresponding MR for including it into
> 
> https://gitlab.com/xen-project/fusa/xtf/-/tree/xtf-arm
> 
> So, to sum up. All components are basically ready for initial
> inclusion. There will be smaller, integration-related changes
> later. For example - we will need to update URLs for various
> components after they are moved to correct places.
> ---
>  docs/hypervisor-guide/fuzzing.rst           |  90 ++++++++++++
>  xen/arch/arm/Kconfig.debug                  |  26 ++++
>  xen/arch/arm/Makefile                       |   1 +
>  xen/arch/arm/include/asm/libafl_qemu.h      |  54 +++++++
>  xen/arch/arm/include/asm/libafl_qemu_defs.h |  37 +++++
>  xen/arch/arm/libafl_qemu.c                  | 152 ++++++++++++++++++++
>  xen/arch/arm/psci.c                         |  13 ++
>  xen/common/sched/core.c                     |  17 +++
>  xen/common/shutdown.c                       |   7 +
>  xen/drivers/char/console.c                  |   8 ++
>  10 files changed, 405 insertions(+)
>  create mode 100644 docs/hypervisor-guide/fuzzing.rst
>  create mode 100644 xen/arch/arm/include/asm/libafl_qemu.h
>  create mode 100644 xen/arch/arm/include/asm/libafl_qemu_defs.h
>  create mode 100644 xen/arch/arm/libafl_qemu.c
> 
> diff --git a/docs/hypervisor-guide/fuzzing.rst b/docs/hypervisor-guide/fuzzing.rst
> new file mode 100644
> index 0000000000..a5de71dd25
> --- /dev/null
> +++ b/docs/hypervisor-guide/fuzzing.rst
> @@ -0,0 +1,90 @@
> +.. SPDX-License-Identifier: CC-BY-4.0
> +
> +Fuzzing
> +=======
> +
> +It is possible to use LibAFL-QEMU for fuzzing hypervisor. Right now
> +only aarch64 is supported and only hypercall fuzzing is enabled in the
> +test harness, but there are plans to add vGIC interface fuzzing, PSCI
> +fuzzing and vPL011 fuzzing as well.
> +
> +
> +Principle of operation
> +----------------------
> +
> +LibAFL-QEMU is a part of American Fuzzy lop plus plus (AKA AFL++)
> +project. It uses special build of QEMU, that allows to fuzz baremetal
> +software like Xen hypervisor or Linux kernel. Basic idea is that we
> +have software under test (Xen hypervisor in our case) and a test
> +harness application. Test harness uses special protocol to communicate
> +with LibAFL outside of QEMU to get input data and report test
> +result. LibAFL monitors which branches are taken by Xen and mutates
> +input data in attempt to discover new code paths that eventually can
> +lead to a crash or other unintended behavior.
> +
> +LibAFL uses QEMU's `snapshot` feature to run multiple test without
> +restarting the whole system every time. This speeds up fuzzing process
> +greatly.
> +
> +So, to try Xen fuzzing we need three components: LibAFL-based fuzzer,
> +test harness and Xen itself.
> +
> +Building Xen for fuzzing
> +------------------------
> +
> +Xen hypervisor should be built with these two options::
> +
> + CONFIG_LIBAFL_QEMU_FUZZER=y
> + CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING=y
> +
> +Building LibAFL-QEMU based fuzzer
> +---------------------------------
> +
> +Fuzzer is written in Rust, so you need Rust toolchain and `cargo` tool
> +in your system. Please refer to your distro documentation on how to
> +obtain them.
> +
> +Once Rust is ready, fetch and build the fuzzer::
> +
> +  # git clone https://github.com/xen-troops/xen-fuzzer-rs
> +  # cd xen-fuzzer-rs
> +  # cargo build
> +
> +Building test harness
> +---------------------
> +
> +We need to make low-level actions, like issuing random hypercalls, so
> +for test harness we use special build of Zephyr application. We use

You mean a special build of an XTF application?


> +XTF as a test harness. You can build XTF manually, or let fuzzer to do this::
> +
> +  # cargo make build_xtf
> +
> +This fill download and build XTF for ARM.
> +
> +Running the fuzzer
> +------------------
> +
> +Please refer to README.md that comes with the fuzzer, but the most
> +versatile way is to run it like this::
> +
> +  # target/debug/xen_fuzzer -t 3600 /path/to/xen \
> +      target/xtf/tests/arm-vgic-fuzzer/test-mmu64le-arm-vgic-fuzzer
> +
> +(assuming that you built XTF with `cargo make build_xtf`)
> +
> +Any inputs that led to crashes will be found in `crashes` directory.
> +
> +You can replay a crash with `-r` option::
> +
> +  # target/debug/xen_fuzzer -r crashes/0195e4fc65828c17 run \
> +      /path/to/xen \
> +      /path/to/harness
> +
> +
> +Fuzzer will return non-zero error code if it encountered any crashes.
> +
> +TODOs
> +-----
> +
> + - Add x86 support.
> + - Implement fuzzing of other external hypervisor interfaces.
> diff --git a/xen/arch/arm/Kconfig.debug b/xen/arch/arm/Kconfig.debug
> index 5a03b220ac..3b00c77d3a 100644
> --- a/xen/arch/arm/Kconfig.debug
> +++ b/xen/arch/arm/Kconfig.debug
> @@ -190,3 +190,29 @@ config EARLY_PRINTK_INC
>  	default "debug-mvebu.inc" if EARLY_UART_MVEBU
>  	default "debug-pl011.inc" if EARLY_UART_PL011
>  	default "debug-scif.inc" if EARLY_UART_SCIF
> +
> +config LIBAFL_QEMU_FUZZER
> +	bool "Enable LibAFL-QEMU calls"
> +	help
> +	  This option enables support for LibAFL-QEMU calls. Enable this
> +	  only when you are going to run hypervisor inside LibAFL-QEMU.
> +	  Xen will report code section to LibAFL and will report about
> +	  crash when it panics.
> +
> +	  Do not try to run Xen built on this option on any real hardware
> +	  or plain QEMU, because it will just crash during startup.
> +
> +config LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +	depends on LIBAFL_QEMU_FUZZER
> +	bool "LibAFL: Report any attempt to suspend/destroy a domain as a success"
> +	help
> +	  When fuzzing hypercalls, fuzzer sometimes will issue an hypercall that
> +	  leads to a domain shutdown, or machine shutdown, or vCPU being
> +	  blocked, or something similar. In this case test harness will not be
> +	  able to report about successfully handled call to the fuzzer. Fuzzer
> +	  will report timeout and mark this as a crash, which is not true. So,
> +	  in such cases we need to report about successfully test case from the
> +	  hypervisor itself.
> +
> +          Enable this option only if fuzzing attempt can lead to a correct
> +	  stoppage, like when fuzzing hypercalls or PSCI.
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index fb0948f067..7b4eaab680 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -12,6 +12,7 @@ obj-$(CONFIG_TEE) += tee/
>  obj-$(CONFIG_HAS_VPCI) += vpci.o
>  
>  obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
> +obj-${CONFIG_LIBAFL_QEMU_FUZZER} += libafl_qemu.o
>  obj-y += cpuerrata.o
>  obj-y += cpufeature.o
>  obj-y += decode.o
> diff --git a/xen/arch/arm/include/asm/libafl_qemu.h b/xen/arch/arm/include/asm/libafl_qemu.h
> new file mode 100644
> index 0000000000..b90cf48b9a
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/libafl_qemu.h
> @@ -0,0 +1,54 @@
> +#ifndef LIBAFL_QEMU_H
> +#define LIBAFL_QEMU_H
> +
> +#include <xen/stdint.h>
> +#include "libafl_qemu_defs.h"
> +#define LIBAFL_QEMU_PRINTF_MAX_SIZE 4096
> +
> +typedef uint64_t libafl_word;
> +
> +/**
> + * LibAFL QEMU header file.
> + *
> + * This file is a portable header file used to build target harnesses more
> + * conveniently. Its main purpose is to generate ready-to-use calls to
> + * communicate with the fuzzer. The list of commands is available at the bottom
> + * of this file. The rest mostly consists of macros generating the code used by
> + * the commands.
> + */
> +
> +enum LibaflQemuEndStatus {
> +  LIBAFL_QEMU_END_UNKNOWN = 0,
> +  LIBAFL_QEMU_END_OK = 1,
> +  LIBAFL_QEMU_END_CRASH = 2,
> +};
> +
> +libafl_word libafl_qemu_start_virt(void *buf_vaddr, libafl_word max_len);
> +
> +libafl_word libafl_qemu_start_phys(void *buf_paddr, libafl_word max_len);
> +
> +libafl_word libafl_qemu_input_virt(void *buf_vaddr, libafl_word max_len);
> +
> +libafl_word libafl_qemu_input_phys(void *buf_paddr, libafl_word max_len);
> +
> +void libafl_qemu_end(enum LibaflQemuEndStatus status);
> +
> +void libafl_qemu_save(void);
> +
> +void libafl_qemu_load(void);
> +
> +libafl_word libafl_qemu_version(void);
> +
> +void libafl_qemu_page_current_allow(void);
> +
> +void libafl_qemu_internal_error(void);
> +
> +void __attribute__((format(printf, 1, 2))) lqprintf(const char *fmt, ...);
> +
> +void libafl_qemu_test(void);
> +
> +void libafl_qemu_trace_vaddr_range(libafl_word start, libafl_word end);
> +
> +void libafl_qemu_trace_vaddr_size(libafl_word start, libafl_word size);
> +
> +#endif
> diff --git a/xen/arch/arm/include/asm/libafl_qemu_defs.h b/xen/arch/arm/include/asm/libafl_qemu_defs.h
> new file mode 100644
> index 0000000000..2866cadaac
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/libafl_qemu_defs.h
> @@ -0,0 +1,37 @@
> +#ifndef LIBAFL_QEMU_DEFS
> +#define LIBAFL_QEMU_DEFS
> +
> +#define LIBAFL_STRINGIFY(s) #s
> +#define XSTRINGIFY(s) LIBAFL_STRINGIFY(s)
> +
> +#if __STDC_VERSION__ >= 201112L
> +  #define STATIC_CHECKS                                   \
> +    _Static_assert(sizeof(void *) <= sizeof(libafl_word), \
> +                   "pointer type should not be larger and libafl_word");
> +#else
> +  #define STATIC_CHECKS
> +#endif

I think this could be a BUILD_BUG_ON ?


> +#define LIBAFL_SYNC_EXIT_OPCODE 0x66f23a0f
> +#define LIBAFL_BACKDOOR_OPCODE 0x44f23a0f
> +
> +#define LIBAFL_QEMU_TEST_VALUE 0xcafebabe
> +
> +#define LIBAFL_QEMU_HDR_VERSION_NUMBER 0111  // TODO: find a nice way to set it.
> +
> +typedef enum LibaflQemuCommand {
> +  LIBAFL_QEMU_COMMAND_START_VIRT = 0,
> +  LIBAFL_QEMU_COMMAND_START_PHYS = 1,
> +  LIBAFL_QEMU_COMMAND_INPUT_VIRT = 2,
> +  LIBAFL_QEMU_COMMAND_INPUT_PHYS = 3,
> +  LIBAFL_QEMU_COMMAND_END = 4,
> +  LIBAFL_QEMU_COMMAND_SAVE = 5,
> +  LIBAFL_QEMU_COMMAND_LOAD = 6,
> +  LIBAFL_QEMU_COMMAND_VERSION = 7,
> +  LIBAFL_QEMU_COMMAND_VADDR_FILTER_ALLOW = 8,
> +  LIBAFL_QEMU_COMMAND_INTERNAL_ERROR = 9,
> +  LIBAFL_QEMU_COMMAND_LQPRINTF = 10,
> +  LIBAFL_QEMU_COMMAND_TEST = 11,
> +} LibaflExit;
> +
> +#endif
> diff --git a/xen/arch/arm/libafl_qemu.c b/xen/arch/arm/libafl_qemu.c
> new file mode 100644
> index 0000000000..58924ce6c6
> --- /dev/null
> +++ b/xen/arch/arm/libafl_qemu.c
> @@ -0,0 +1,152 @@
> +/* SPDX-License-Identifier: Apache-2.0 */
> +/*
> +   This file is based on libafl_qemu_impl.h and libafl_qemu_qemu_arch.h
> +   from LibAFL project.
> +*/
> +#include <xen/lib.h>
> +#include <xen/init.h>
> +#include <xen/kernel.h>
> +#include <asm/libafl_qemu.h>
> +
> +#define LIBAFL_DEFINE_FUNCTIONS(name, opcode)				\
> +	libafl_word _libafl_##name##_call0(	\
> +		libafl_word action) {					\
> +		libafl_word ret;					\
> +		__asm__ volatile (					\
> +			"mov x0, %1\n"					\
> +			".word " XSTRINGIFY(opcode) "\n"		\
> +			"mov %0, x0\n"					\
> +			: "=r"(ret)					\
> +			: "r"(action)					\
> +			: "x0"						\
> +			);						\
> +		return ret;						\
> +	}								\
> +									\
> +	libafl_word _libafl_##name##_call1(	\
> +		libafl_word action, libafl_word arg1) {			\
> +		libafl_word ret;					\
> +		__asm__ volatile (					\
> +			"mov x0, %1\n"					\
> +			"mov x1, %2\n"					\
> +			".word " XSTRINGIFY(opcode) "\n"		\
> +			"mov %0, x0\n"					\
> +			: "=r"(ret)					\
> +			: "r"(action), "r"(arg1)			\
> +			: "x0", "x1"					\
> +			);						\
> +		return ret;						\
> +	}								\
> +									\
> +	libafl_word _libafl_##name##_call2(	\
> +		libafl_word action, libafl_word arg1, libafl_word arg2) { \
> +		libafl_word ret;					\
> +		__asm__ volatile (					\
> +			"mov x0, %1\n"					\
> +			"mov x1, %2\n"					\
> +			"mov x2, %3\n"					\
> +			".word " XSTRINGIFY(opcode) "\n"		\
> +			"mov %0, x0\n"					\
> +			: "=r"(ret)					\
> +			: "r"(action), "r"(arg1), "r"(arg2)		\
> +			: "x0", "x1", "x2"				\
> +			);						\
> +		return ret;						\
> +	}
> +
> +// Generates sync exit functions
> +LIBAFL_DEFINE_FUNCTIONS(sync_exit, LIBAFL_SYNC_EXIT_OPCODE)
> +
> +// Generates backdoor functions
> +LIBAFL_DEFINE_FUNCTIONS(backdoor, LIBAFL_BACKDOOR_OPCODE)
> +
> +static char _lqprintf_buffer[LIBAFL_QEMU_PRINTF_MAX_SIZE] = {0};
> +
> +libafl_word libafl_qemu_start_virt(void       *buf_vaddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_START_VIRT,
> +                                 (libafl_word)buf_vaddr, max_len);
> +}
> +
> +libafl_word libafl_qemu_start_phys(void       *buf_paddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_START_PHYS,
> +                                 (libafl_word)buf_paddr, max_len);
> +}
> +
> +libafl_word libafl_qemu_input_virt(void       *buf_vaddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_INPUT_VIRT,
> +                                 (libafl_word)buf_vaddr, max_len);
> +}
> +
> +libafl_word libafl_qemu_input_phys(void       *buf_paddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_INPUT_PHYS,
> +                                 (libafl_word)buf_paddr, max_len);
> +}
> +
> +void libafl_qemu_end(enum LibaflQemuEndStatus status) {
> +  _libafl_sync_exit_call1(LIBAFL_QEMU_COMMAND_END, status);
> +}
> +
> +void libafl_qemu_save(void) {
> +  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_SAVE);
> +}
> +
> +void libafl_qemu_load(void) {
> +  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_LOAD);
> +}
> +
> +libafl_word libafl_qemu_version(void) {
> +  return _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_VERSION);
> +}
> +
> +void libafl_qemu_internal_error(void) {
> +  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_INTERNAL_ERROR);
> +}
> +
> +void lqprintf(const char *fmt, ...) {
> +  va_list args;
> +  int res;
> +  va_start(args, fmt);
> +  res = vsnprintf(_lqprintf_buffer, LIBAFL_QEMU_PRINTF_MAX_SIZE, fmt, args);
> +  va_end(args);
> +
> +  if (res >= LIBAFL_QEMU_PRINTF_MAX_SIZE) {
> +    // buffer is not big enough, either recompile the target with more
> +    // space or print less things
> +    libafl_qemu_internal_error();
> +  }
> +
> +  _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_LQPRINTF,
> +                          (libafl_word)_lqprintf_buffer, res);
> +}
> +
> +void libafl_qemu_test(void) {
> +  _libafl_sync_exit_call1(LIBAFL_QEMU_COMMAND_TEST, LIBAFL_QEMU_TEST_VALUE);
> +}
> +
> +void libafl_qemu_trace_vaddr_range(libafl_word start,
> +                                            libafl_word end) {
> +  _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_VADDR_FILTER_ALLOW, start, end);
> +}
> +
> +void libafl_qemu_trace_vaddr_size(libafl_word start,
> +                                           libafl_word size) {
> +  libafl_qemu_trace_vaddr_range(start, start + size);
> +}
> +
> +static int init_afl(void)
> +{
> +	vaddr_t xen_text_start = (vaddr_t)_stext;
> +	vaddr_t xen_text_end = (vaddr_t)_etext;
> +
> +	lqprintf("Telling AFL about code section: %lx - %lx\n", xen_text_start, xen_text_end);
> +
> +	libafl_qemu_trace_vaddr_range(xen_text_start, xen_text_end);
> +
> +	return 0;
> +}
> +
> +__initcall(init_afl);
> diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
> index b6860a7760..c7a51a1144 100644
> --- a/xen/arch/arm/psci.c
> +++ b/xen/arch/arm/psci.c
> @@ -17,6 +17,7 @@
>  #include <asm/cpufeature.h>
>  #include <asm/psci.h>
>  #include <asm/acpi.h>
> +#include <asm/libafl_qemu.h>
>  
>  /*
>   * While a 64-bit OS can make calls with SMC32 calling conventions, for
> @@ -49,6 +50,10 @@ int call_psci_cpu_on(int cpu)
>  
>  void call_psci_cpu_off(void)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif

As discussed, please add a wrapper with an empty implementation in the
regular case and the call to libafl_qemu_end when the fuzzer is enabled.
So that here it becomes just something like:

  fuzzer_success();

Other than that, the code changes to Xen look OK to me


> +
>      if ( psci_ver > PSCI_VERSION(0, 1) )
>      {
>          struct arm_smccc_res res;
> @@ -62,12 +67,20 @@ void call_psci_cpu_off(void)
>  
>  void call_psci_system_off(void)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
>      if ( psci_ver > PSCI_VERSION(0, 1) )
>          arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL);
>  }
>  
>  void call_psci_system_reset(void)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
>      if ( psci_ver > PSCI_VERSION(0, 1) )
>          arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL);
>  }
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 9043414290..55eb132568 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -47,6 +47,10 @@
>  #define pv_shim false
>  #endif
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>  /* opt_sched: scheduler - default to configured value */
>  static char __initdata opt_sched[10] = CONFIG_SCHED_DEFAULT;
>  string_param("sched", opt_sched);
> @@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
>      if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
>          return -EFAULT;
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
>      set_bit(_VPF_blocked, &v->pause_flags);
>      v->poll_evtchn = -1;
>      set_bit(v->vcpu_id, d->poll_mask);
> @@ -1904,12 +1912,18 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>      {
>      case SCHEDOP_yield:
>      {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          ret = vcpu_yield();
>          break;
>      }
>  
>      case SCHEDOP_block:
>      {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          vcpu_block_enable_events();
>          break;
>      }
> @@ -1924,6 +1938,9 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>          TRACE_TIME(TRC_SCHED_SHUTDOWN, current->domain->domain_id,
>                     current->vcpu_id, sched_shutdown.reason);
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          ret = domain_shutdown(current->domain, (u8)sched_shutdown.reason);
>  
>          break;
> diff --git a/xen/common/shutdown.c b/xen/common/shutdown.c
> index c47341b977..1340f4b606 100644
> --- a/xen/common/shutdown.c
> +++ b/xen/common/shutdown.c
> @@ -11,6 +11,10 @@
>  #include <xen/kexec.h>
>  #include <public/sched.h>
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>  /* opt_noreboot: If true, machine will need manual reset on error. */
>  bool __ro_after_init opt_noreboot;
>  boolean_param("noreboot", opt_noreboot);
> @@ -32,6 +36,9 @@ static void noreturn reboot_or_halt(void)
>  
>  void hwdom_shutdown(unsigned char reason)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>      switch ( reason )
>      {
>      case SHUTDOWN_poweroff:
> diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
> index ba428199d2..55d33fa744 100644
> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -40,6 +40,9 @@
>  #ifdef CONFIG_SBSA_VUART_CONSOLE
>  #include <asm/vpl011.h>
>  #endif
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
>  
>  /* console: comma-separated list of console outputs. */
>  static char __initdata opt_console[30] = OPT_CONSOLE_STR;
> @@ -1289,6 +1292,11 @@ void panic(const char *fmt, ...)
>  
>      kexec_crash(CRASHREASON_PANIC);
>  
> +    #ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +    /* Tell the fuzzer that we crashed */
> +    libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
> +    #endif
> +
>      if ( opt_noreboot )
>          machine_halt();
>      else
> -- 
> 2.48.1
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v2] xen: add libafl-qemu fuzzer support
  2025-03-21 22:32 ` Stefano Stabellini
@ 2025-03-21 22:57   ` Julien Grall
  2025-03-21 23:34   ` Julien Grall
  1 sibling, 0 replies; 9+ messages in thread
From: Julien Grall @ 2025-03-21 22:57 UTC (permalink / raw)
  To: Stefano Stabellini, Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Anthony PERARD,
	Michal Orzel, Jan Beulich, Roger Pau Monné, Bertrand Marquis,
	Dario Faggioli, Juergen Gross, George Dunlap

Hi Stefano, Volodymyr,

On 21/03/2025 22:32, Stefano Stabellini wrote:
>> diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
>> index b6860a7760..c7a51a1144 100644
>> --- a/xen/arch/arm/psci.c
>> +++ b/xen/arch/arm/psci.c
>> @@ -17,6 +17,7 @@
>>   #include <asm/cpufeature.h>
>>   #include <asm/psci.h>
>>   #include <asm/acpi.h>
>> +#include <asm/libafl_qemu.h>
>>   
>>   /*
>>    * While a 64-bit OS can make calls with SMC32 calling conventions, for
>> @@ -49,6 +50,10 @@ int call_psci_cpu_on(int cpu)
>>   
>>   void call_psci_cpu_off(void)
>>   {
>> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
>> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
>> +#endif
> 
> As discussed, please add a wrapper with an empty implementation in the
> regular case and the call to libafl_qemu_end when the fuzzer is enabled.
> So that here it becomes just something like:
> 
>    fuzzer_success();
> 
> Other than that, the code changes to Xen look OK to me

I am a bit surprised this was resent without addressing the licensing 
issue pointed out by Andrew [1] (I don't see a reply). And if there is 
no issue, then I would have a least expected a mention in the commit 
message why this is ok.

Cheers,

[1] https://lore.kernel.org/ae2dbe98-57cf-4aba-bc48-6d7212cfc859@citrix.com

-- 
Julien Grall



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v2] xen: add libafl-qemu fuzzer support
  2025-03-15  0:36 [RFC PATCH v2] xen: add libafl-qemu fuzzer support Volodymyr Babchuk
  2025-03-21 22:32 ` Stefano Stabellini
@ 2025-03-21 23:31 ` Julien Grall
  2025-04-30  2:17   ` Volodymyr Babchuk
  2025-04-08 15:40 ` Jan Beulich
  2 siblings, 1 reply; 9+ messages in thread
From: Julien Grall @ 2025-03-21 23:31 UTC (permalink / raw)
  To: Volodymyr Babchuk, xen-devel@lists.xenproject.org
  Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Jan Beulich,
	Roger Pau Monné, Stefano Stabellini, Bertrand Marquis,
	Dario Faggioli, Juergen Gross, George Dunlap

Hi Volodymyr,

On 15/03/2025 00:36, Volodymyr Babchuk wrote:
> LibAFL, which is a part of AFL++ project is a instrument that allows
> us to perform fuzzing on beremetal code (Xen hypervisor in this case)
> using QEMU as an emulator. It employs QEMU's ability to create
> snapshots to run many tests relatively quickly: system state is saved
> right before executing a new test and restored after the test is
> finished.
> 
> This patch adds all necessary plumbing to run aarch64 build of Xen
> inside that LibAFL-QEMU fuzzer. From the Xen perspective we need to
> do following things:
> 
> 1. Able to communicate with LibAFL-QEMU fuzzer. This is done by
> executing special opcodes, that only LibAFL-QEMU can handle.
> 
> 2. Use interface from p.1 to tell the fuzzer about code Xen section,
> so fuzzer know which part of code to track and gather coverage data.
> 
> 3. Report fuzzer about crash. This is done in panic() function.
> 
> 4. Prevent test harness from shooting itself in knee.
> 
> Right now test harness is an external component, because we want to
> test external Xen interfaces, but it is possible to fuzz internal code
> if we want to.
> 
> Test harness is implemented XTF-based test-case(s). As test harness
> can issue hypercall that shuts itself down, KConfig option
> CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING was added. It basically tells
> fuzzer that test was completed successfully if Dom0 tries to shut
> itself (or the whole machine) down.
> 
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> 
> ---
> 
> I tried to fuzz the vGIC emulator and hypercall interface. While vGIC
> fuzzing didn't yield any interesting results, hypercall fuzzing found a
> way to crash the hypervisor from Dom0 on aarch64, using
> "XEN_SYSCTL_page_offline_op" with "sysctl_query_page_offline" sub-op,
> because it leads to page_is_ram_type() call which is marked
> UNREACHABLE on ARM.
> 
> In v2:
> 
>   - Moved to XTF-based test harness
>   - Severely reworked the fuzzer itself. Now it has user-friendly
>     command-line interface and is capable of running in CI, as it now
>     returns an appropriate error code if any faults were found
>   - Also I found, debugged and fixed a nasty bug in LibAFL-QEMU fork,
>     which crashed the whole fuzzer.
> 
> Right now the fuzzer is lockated at Xen Troops repo:
> 
> https://github.com/xen-troops/xen-fuzzer-rs
> 
> But I believe that it is ready to be included into
> gitlab.com/xen-project/
> 
> XTF-based harness is at
> 
> https://gitlab.com/vlad.babchuk/xtf/-/tree/mr_libafl
> 
> and there is corresponding MR for including it into
> 
> https://gitlab.com/xen-project/fusa/xtf/-/tree/xtf-arm
> 
> So, to sum up. All components are basically ready for initial
> inclusion. There will be smaller, integration-related changes
> later. For example - we will need to update URLs for various
> components after they are moved to correct places.
> ---
>   docs/hypervisor-guide/fuzzing.rst           |  90 ++++++++++++
>   xen/arch/arm/Kconfig.debug                  |  26 ++++
>   xen/arch/arm/Makefile                       |   1 +
>   xen/arch/arm/include/asm/libafl_qemu.h      |  54 +++++++
>   xen/arch/arm/include/asm/libafl_qemu_defs.h |  37 +++++
>   xen/arch/arm/libafl_qemu.c                  | 152 ++++++++++++++++++++
>   xen/arch/arm/psci.c                         |  13 ++
>   xen/common/sched/core.c                     |  17 +++
>   xen/common/shutdown.c                       |   7 +
>   xen/drivers/char/console.c                  |   8 ++
>   10 files changed, 405 insertions(+)
>   create mode 100644 docs/hypervisor-guide/fuzzing.rst
>   create mode 100644 xen/arch/arm/include/asm/libafl_qemu.h
>   create mode 100644 xen/arch/arm/include/asm/libafl_qemu_defs.h
>   create mode 100644 xen/arch/arm/libafl_qemu.c
> 
> diff --git a/docs/hypervisor-guide/fuzzing.rst b/docs/hypervisor-guide/fuzzing.rst
> new file mode 100644
> index 0000000000..a5de71dd25
> --- /dev/null
> +++ b/docs/hypervisor-guide/fuzzing.rst
> @@ -0,0 +1,90 @@
> +.. SPDX-License-Identifier: CC-BY-4.0
> +
> +Fuzzing
> +=======
> +
> +It is possible to use LibAFL-QEMU for fuzzing hypervisor. Right now
> +only aarch64 is supported and only hypercall fuzzing is enabled in the
> +test harness, but there are plans to add vGIC interface fuzzing, PSCI
> +fuzzing and vPL011 fuzzing as well.
> +
> +
> +Principle of operation
> +----------------------
> +
> +LibAFL-QEMU is a part of American Fuzzy lop plus plus (AKA AFL++)
> +project. It uses special build of QEMU, that allows to fuzz baremetal
> +software like Xen hypervisor or Linux kernel. Basic idea is that we
> +have software under test (Xen hypervisor in our case) and a test
> +harness application. Test harness uses special protocol to communicate
> +with LibAFL outside of QEMU to get input data and report test
> +result. LibAFL monitors which branches are taken by Xen and mutates
> +input data in attempt to discover new code paths that eventually can
> +lead to a crash or other unintended behavior.
> +
> +LibAFL uses QEMU's `snapshot` feature to run multiple test without
> +restarting the whole system every time. This speeds up fuzzing process
> +greatly.
> +
> +So, to try Xen fuzzing we need three components: LibAFL-based fuzzer,
> +test harness and Xen itself.
> +
> +Building Xen for fuzzing
> +------------------------
> +
> +Xen hypervisor should be built with these two options::
> +
> + CONFIG_LIBAFL_QEMU_FUZZER=y
> + CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING=y
> +
> +Building LibAFL-QEMU based fuzzer
> +---------------------------------
> +
> +Fuzzer is written in Rust, so you need Rust toolchain and `cargo` tool
> +in your system. Please refer to your distro documentation on how to
> +obtain them.
> +
> +Once Rust is ready, fetch and build the fuzzer::
> +
> +  # git clone https://github.com/xen-troops/xen-fuzzer-rs
> +  # cd xen-fuzzer-rs
> +  # cargo build
> +
> +Building test harness
> +---------------------
> +
> +We need to make low-level actions, like issuing random hypercalls, so
> +for test harness we use special build of Zephyr application. We use
> +XTF as a test harness. You can build XTF manually, or let fuzzer to do this::
> +
> +  # cargo make build_xtf
> +
> +This fill download and build XTF for ARM.
> +
> +Running the fuzzer
> +------------------
> +
> +Please refer to README.md that comes with the fuzzer, but the most
> +versatile way is to run it like this::
> +
> +  # target/debug/xen_fuzzer -t 3600 /path/to/xen \
> +      target/xtf/tests/arm-vgic-fuzzer/test-mmu64le-arm-vgic-fuzzer
> +
> +(assuming that you built XTF with `cargo make build_xtf`)
> +
> +Any inputs that led to crashes will be found in `crashes` directory.
> +
> +You can replay a crash with `-r` option::
> +
> +  # target/debug/xen_fuzzer -r crashes/0195e4fc65828c17 run \
> +      /path/to/xen \
> +      /path/to/harness
> +
> +
> +Fuzzer will return non-zero error code if it encountered any crashes.
> +
> +TODOs
> +-----
> +
> + - Add x86 support.
> + - Implement fuzzing of other external hypervisor interfaces.
> diff --git a/xen/arch/arm/Kconfig.debug b/xen/arch/arm/Kconfig.debug
> index 5a03b220ac..3b00c77d3a 100644
> --- a/xen/arch/arm/Kconfig.debug
> +++ b/xen/arch/arm/Kconfig.debug
> @@ -190,3 +190,29 @@ config EARLY_PRINTK_INC
>   	default "debug-mvebu.inc" if EARLY_UART_MVEBU
>   	default "debug-pl011.inc" if EARLY_UART_PL011
>   	default "debug-scif.inc" if EARLY_UART_SCIF
> +
> +config LIBAFL_QEMU_FUZZER
> +	bool "Enable LibAFL-QEMU calls"

Looking at the code below, I kind of doubt this is working on arm32. Can 
you confirm? If it doesn't work, then this needs to be "depends on".

> +	help
> +	  This option enables support for LibAFL-QEMU calls. Enable this
> +	  only when you are going to run hypervisor inside LibAFL-QEMU.
> +	  Xen will report code section to LibAFL and will report about
> +	  crash when it panics.
> +
> +	  Do not try to run Xen built on this option on any real hardware
> +	  or plain QEMU, because it will just crash during startup.
> +
> +config LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +	depends on LIBAFL_QEMU_FUZZER
> +	bool "LibAFL: Report any attempt to suspend/destroy a domain as a success"
> +	help
> +	  When fuzzing hypercalls, fuzzer sometimes will issue an hypercall that
> +	  leads to a domain shutdown, or machine shutdown, or vCPU being
> +	  blocked, or something similar. In this case test harness will not be
> +	  able to report about successfully handled call to the fuzzer. Fuzzer
> +	  will report timeout and mark this as a crash, which is not true. So,
> +	  in such cases we need to report about successfully test case from the
> +	  hypervisor itself.
> +
> +          Enable this option only if fuzzing attempt can lead to a correct
> +	  stoppage, like when fuzzing hypercalls or PSCI.
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index fb0948f067..7b4eaab680 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -12,6 +12,7 @@ obj-$(CONFIG_TEE) += tee/
>   obj-$(CONFIG_HAS_VPCI) += vpci.o
>   
>   obj-$(CONFIG_HAS_ALTERNATIVE) += alternative.o
> +obj-${CONFIG_LIBAFL_QEMU_FUZZER} += libafl_qemu.o
>   obj-y += cpuerrata.o
>   obj-y += cpufeature.o
>   obj-y += decode.o
> diff --git a/xen/arch/arm/include/asm/libafl_qemu.h b/xen/arch/arm/include/asm/libafl_qemu.h
> new file mode 100644
> index 0000000000..b90cf48b9a
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/libafl_qemu.h
> @@ -0,0 +1,54 @@
> +#ifndef LIBAFL_QEMU_H
> +#define LIBAFL_QEMU_H
> +
> +#include <xen/stdint.h>
> +#include "libafl_qemu_defs.h"
> +#define LIBAFL_QEMU_PRINTF_MAX_SIZE 4096

Is this defined by libafl or an internal decision?

[...]

> diff --git a/xen/arch/arm/include/asm/libafl_qemu_defs.h b/xen/arch/arm/include/asm/libafl_qemu_defs.h
> new file mode 100644
> index 0000000000..2866cadaac
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/libafl_qemu_defs.h
> @@ -0,0 +1,37 @@

Missing license. Also, is this file taken from somewhere?

> +#ifndef LIBAFL_QEMU_DEFS
> +#define LIBAFL_QEMU_DEFS
> +
> +#define LIBAFL_STRINGIFY(s) #s
> +#define XSTRINGIFY(s) LIBAFL_STRINGIFY(s)
> +
> +#if __STDC_VERSION__ >= 201112L
> +  #define STATIC_CHECKS                                   \
> +    _Static_assert(sizeof(void *) <= sizeof(libafl_word), \
> +                   "pointer type should not be larger and libafl_word");
> +#else
> +  #define STATIC_CHECKS
> +#endif

No-one seems to use STATIC_CHECKS? Is this intended?

> +
> +#define LIBAFL_SYNC_EXIT_OPCODE 0x66f23a0f
 > +#define LIBAFL_BACKDOOR_OPCODE 0x44f23a0f

Are the opcode valid for arm32? If not, they should be protected with 
#ifdef CONFIG_ARM_64.

> +
> +#define LIBAFL_QEMU_TEST_VALUE 0xcafebabe
 > +> +#define LIBAFL_QEMU_HDR_VERSION_NUMBER 0111  // TODO: find a nice 
way to set it.
> +
> +typedef enum LibaflQemuCommand {
> +  LIBAFL_QEMU_COMMAND_START_VIRT = 0,
> +  LIBAFL_QEMU_COMMAND_START_PHYS = 1,
> +  LIBAFL_QEMU_COMMAND_INPUT_VIRT = 2,
> +  LIBAFL_QEMU_COMMAND_INPUT_PHYS = 3,
> +  LIBAFL_QEMU_COMMAND_END = 4,
> +  LIBAFL_QEMU_COMMAND_SAVE = 5,
> +  LIBAFL_QEMU_COMMAND_LOAD = 6,
> +  LIBAFL_QEMU_COMMAND_VERSION = 7,
> +  LIBAFL_QEMU_COMMAND_VADDR_FILTER_ALLOW = 8,
> +  LIBAFL_QEMU_COMMAND_INTERNAL_ERROR = 9,
> +  LIBAFL_QEMU_COMMAND_LQPRINTF = 10,
> +  LIBAFL_QEMU_COMMAND_TEST = 11,
> +} LibaflExit;
> +
> +#endif

Missing emacs magic.

> diff --git a/xen/arch/arm/libafl_qemu.c b/xen/arch/arm/libafl_qemu.c
> new file mode 100644
> index 0000000000..58924ce6c6
> --- /dev/null
> +++ b/xen/arch/arm/libafl_qemu.c
> @@ -0,0 +1,152 @@
> +/* SPDX-License-Identifier: Apache-2.0 */

See my other reply about the license. I think this need to be resolved 
before sending a new version.

> +/*
> +   This file is based on libafl_qemu_impl.h and libafl_qemu_qemu_arch.h
> +   from LibAFL project.
> +*/
> +#include <xen/lib.h>
> +#include <xen/init.h>
> +#include <xen/kernel.h>
> +#include <asm/libafl_qemu.h>
> +
> +#define LIBAFL_DEFINE_FUNCTIONS(name, opcode)				\
> +	libafl_word _libafl_##name##_call0(	\
> +		libafl_word action) {					\
> +		libafl_word ret;					\
> +		__asm__ volatile (					\
> +			"mov x0, %1\n"					\
> +			".word " XSTRINGIFY(opcode) "\n"		\
> +			"mov %0, x0\n"					\
> +			: "=r"(ret)					\
> +			: "r"(action)					\
> +			: "x0"						\

Can we store the action directly in x0 (same for the other argunments 
below)? This would avoid to clobber two registers (See smccc.h as an 
example).

> +			);						\
> +		return ret;						\
> +	}								\
> +									\
> +	libafl_word _libafl_##name##_call1(	\
> +		libafl_word action, libafl_word arg1) {			\
> +		libafl_word ret;					\
> +		__asm__ volatile (					\
> +			"mov x0, %1\n"					\
> +			"mov x1, %2\n"					\
> +			".word " XSTRINGIFY(opcode) "\n"		\
> +			"mov %0, x0\n"					\
> +			: "=r"(ret)					\
> +			: "r"(action), "r"(arg1)			\
> +			: "x0", "x1"					\
> +			);						\
> +		return ret;						\
> +	}								\
> +									\
> +	libafl_word _libafl_##name##_call2(	\
> +		libafl_word action, libafl_word arg1, libafl_word arg2) { \
> +		libafl_word ret;					\
> +		__asm__ volatile (					\
> +			"mov x0, %1\n"					\
> +			"mov x1, %2\n"					\
> +			"mov x2, %3\n"					\
> +			".word " XSTRINGIFY(opcode) "\n"		\
> +			"mov %0, x0\n"					\
> +			: "=r"(ret)					\
> +			: "r"(action), "r"(arg1), "r"(arg2)		\
> +			: "x0", "x1", "x2"				\
> +			);						\
> +		return ret;						\
> +	}
> +
> +// Generates sync exit functions
> +LIBAFL_DEFINE_FUNCTIONS(sync_exit, LIBAFL_SYNC_EXIT_OPCODE)
> +
> +// Generates backdoor functions
> +LIBAFL_DEFINE_FUNCTIONS(backdoor, LIBAFL_BACKDOOR_OPCODE)
> +
> +static char _lqprintf_buffer[LIBAFL_QEMU_PRINTF_MAX_SIZE] = {0};

AFAICT, this buffer is only used by lqprintf(). So it would be better to 
move it in lqprintf(). Also, you don't need {0}.

> +
> +libafl_word libafl_qemu_start_virt(void       *buf_vaddr,
> +                                            libafl_word max_len) {

What coding style is this file meant to use?

> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_START_VIRT,
> +                                 (libafl_word)buf_vaddr, max_len);
> +}
> +
> +libafl_word libafl_qemu_start_phys(void       *buf_paddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_START_PHYS,
> +                                 (libafl_word)buf_paddr, max_len);
> +}
> +
> +libafl_word libafl_qemu_input_virt(void       *buf_vaddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_INPUT_VIRT,
> +                                 (libafl_word)buf_vaddr, max_len);
> +}
> +
> +libafl_word libafl_qemu_input_phys(void       *buf_paddr,
> +                                            libafl_word max_len) {
> +  return _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_INPUT_PHYS,
> +                                 (libafl_word)buf_paddr, max_len);
> +}
> +
> +void libafl_qemu_end(enum LibaflQemuEndStatus status) {
> +  _libafl_sync_exit_call1(LIBAFL_QEMU_COMMAND_END, status);
> +}
> +
> +void libafl_qemu_save(void) {
> +  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_SAVE);
> +}
> +
> +void libafl_qemu_load(void) {
> +  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_LOAD);
> +}
> +
> +libafl_word libafl_qemu_version(void) {
> +  return _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_VERSION);
> +}
> +
> +void libafl_qemu_internal_error(void) {
> +  _libafl_sync_exit_call0(LIBAFL_QEMU_COMMAND_INTERNAL_ERROR);
> +}
> +
> +void lqprintf(const char *fmt, ...) {

I am not sure I understand the value of lqprinf(). Why can't we use the 
console? When is this meant to be used?

> +  va_list args;
> +  int res;
> +  va_start(args, fmt);
> +  res = vsnprintf(_lqprintf_buffer, LIBAFL_QEMU_PRINTF_MAX_SIZE, fmt, args);
 > +  va_end(args);

What if lqprintf() is called concurrently?

> +
> +  if (res >= LIBAFL_QEMU_PRINTF_MAX_SIZE) {
> +    // buffer is not big enough, either recompile the target with more
> +    // space or print less things
> +    libafl_qemu_internal_error();
> +  }
> +
> +  _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_LQPRINTF,
> +                          (libafl_word)_lqprintf_buffer, res);
> +}
> +
> +void libafl_qemu_test(void) {
> +  _libafl_sync_exit_call1(LIBAFL_QEMU_COMMAND_TEST, LIBAFL_QEMU_TEST_VALUE);
> +}
> +
> +void libafl_qemu_trace_vaddr_range(libafl_word start,
> +                                            libafl_word end) {
> +  _libafl_sync_exit_call2(LIBAFL_QEMU_COMMAND_VADDR_FILTER_ALLOW, start, end);
> +}
> +
> +void libafl_qemu_trace_vaddr_size(libafl_word start,
> +                                           libafl_word size) {
> +  libafl_qemu_trace_vaddr_range(start, start + size);
> +}
> +
> +static int init_afl(void)
> +{
> +	vaddr_t xen_text_start = (vaddr_t)_stext;
> +	vaddr_t xen_text_end = (vaddr_t)_etext;
> +
> +	lqprintf("Telling AFL about code section: %lx - %lx\n", xen_text_start, xen_text_end);
 > +> +	libafl_qemu_trace_vaddr_range(xen_text_start, xen_text_end);
 > +> +	return 0;
> +}
> +
> +__initcall(init_afl);
> diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
> index b6860a7760..c7a51a1144 100644
> --- a/xen/arch/arm/psci.c
> +++ b/xen/arch/arm/psci.c
> @@ -17,6 +17,7 @@
>   #include <asm/cpufeature.h>
>   #include <asm/psci.h>
>   #include <asm/acpi.h>
> +#include <asm/libafl_qemu.h>
>   
>   /*
>    * While a 64-bit OS can make calls with SMC32 calling conventions, for
> @@ -49,6 +50,10 @@ int call_psci_cpu_on(int cpu)
>   
>   void call_psci_cpu_off(void)
>   {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif

I am a bit confused with this call. For a first, this cannot be reached 
from a VM (or even dom0). Then, even if it is reached, shouldn't we 
allow the test continue while other pCPUs are running?

That said, the call to QEMU is not PSCI related. So shouldn't this be 
called from the callers (same applies to all the changes in PSCI)?

> +
>       if ( psci_ver > PSCI_VERSION(0, 1) )
>       {
>           struct arm_smccc_res res;
> @@ -62,12 +67,20 @@ void call_psci_cpu_off(void)
>   
>   void call_psci_system_off(void)
>   {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
>       if ( psci_ver > PSCI_VERSION(0, 1) )
>           arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL);
>   }
>   
>   void call_psci_system_reset(void)
>   {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
>       if ( psci_ver > PSCI_VERSION(0, 1) )
>           arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL);
>   }
> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
> index 9043414290..55eb132568 100644
> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -47,6 +47,10 @@
>   #define pv_shim false
>   #endif
>   
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER

This Kconfig is only defined on Arm but you are using in common code. 
Even if this can't be supported right now, shouldn't this be defined in 
common code?

> +#include <asm/libafl_qemu.h>
> +#endif
> +
>   /* opt_sched: scheduler - default to configured value */
>   static char __initdata opt_sched[10] = CONFIG_SCHED_DEFAULT;
>   string_param("sched", opt_sched);
> @@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
>       if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
>           return -EFAULT;
>   
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif

I think this and all the changes in sched/core need a comment explaning 
why we want to stop the fuzzing. For instance, this one sort of makes 
sense but...

> +
>       set_bit(_VPF_blocked, &v->pause_flags);
>       v->poll_evtchn = -1;
>       set_bit(v->vcpu_id, d->poll_mask);
> @@ -1904,12 +1912,18 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>       {
>       case SCHEDOP_yield:
>       {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif

... this doesn't not. It is just a yield, there is no blocking operations.

>           ret = vcpu_yield();
>           break;
>       }
>   
>       case SCHEDOP_block:
>       {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>           vcpu_block_enable_events();
>           break;
>       }
> @@ -1924,6 +1938,9 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>   
>           TRACE_TIME(TRC_SCHED_SHUTDOWN, current->domain->domain_id,
>                      current->vcpu_id, sched_shutdown.reason);
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif

Shouldn't this be called from domain_shutdown() to cover all the 
possible shutdown case? I am mainly thinking about domain_crash() which 
you don't seem to handle.

>           ret = domain_shutdown(current->domain, (u8)sched_shutdown.reason);
>   
>           break;
> diff --git a/xen/common/shutdown.c b/xen/common/shutdown.c
> index c47341b977..1340f4b606 100644
> --- a/xen/common/shutdown.c
> +++ b/xen/common/shutdown.c
> @@ -11,6 +11,10 @@
>   #include <xen/kexec.h>
>   #include <public/sched.h>
>   
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>   /* opt_noreboot: If true, machine will need manual reset on error. */
>   bool __ro_after_init opt_noreboot;
>   boolean_param("noreboot", opt_noreboot);
> @@ -32,6 +36,9 @@ static void noreturn reboot_or_halt(void)
>   
>   void hwdom_shutdown(unsigned char reason)
>   {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif

If you call libalf_qemu_end() from domain_shutdown(), then you shouldn't 
need a special case for the hardware domain.

>       switch ( reason )
>       {
>       case SHUTDOWN_poweroff:
> diff --git a/xen/drivers/char/console.c b/xen/drivers/char/console.c
> index ba428199d2..55d33fa744 100644
> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -40,6 +40,9 @@
>   #ifdef CONFIG_SBSA_VUART_CONSOLE
>   #include <asm/vpl011.h>
>   #endif
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
>   
>   /* console: comma-separated list of console outputs. */
>   static char __initdata opt_console[30] = OPT_CONSOLE_STR;
> @@ -1289,6 +1292,11 @@ void panic(const char *fmt, ...)
>   
>       kexec_crash(CRASHREASON_PANIC);
>   
> +    #ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +    /* Tell the fuzzer that we crashed */
> +    libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
> +    #endif
> +
>       if ( opt_noreboot )
>           machine_halt();
>       else

Cheers,

-- 
Julien Grall



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v2] xen: add libafl-qemu fuzzer support
  2025-03-21 22:32 ` Stefano Stabellini
  2025-03-21 22:57   ` Julien Grall
@ 2025-03-21 23:34   ` Julien Grall
  1 sibling, 0 replies; 9+ messages in thread
From: Julien Grall @ 2025-03-21 23:34 UTC (permalink / raw)
  To: Stefano Stabellini, Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Anthony PERARD,
	Michal Orzel, Jan Beulich, Roger Pau Monné, Bertrand Marquis,
	Dario Faggioli, Juergen Gross, George Dunlap

Hi Stefano,

On 21/03/2025 22:32, Stefano Stabellini wrote:
> As discussed, please add a wrapper with an empty implementation in the
> regular case and the call to libafl_qemu_end when the fuzzer is enabled.
> So that here it becomes just something like:
> 
>    fuzzer_success();

I was thinking the same when reviewing the code. It would make the code 
a bit more readable. We would also want fuzzer_failure(). Both would 
need to be implemented in a common header.

Cheers,

-- 
Julien Grall



^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v2] xen: add libafl-qemu fuzzer support
  2025-03-15  0:36 [RFC PATCH v2] xen: add libafl-qemu fuzzer support Volodymyr Babchuk
  2025-03-21 22:32 ` Stefano Stabellini
  2025-03-21 23:31 ` Julien Grall
@ 2025-04-08 15:40 ` Jan Beulich
  2 siblings, 0 replies; 9+ messages in thread
From: Jan Beulich @ 2025-04-08 15:40 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: Andrew Cooper, Anthony PERARD, Michal Orzel, Julien Grall,
	Roger Pau Monné, Stefano Stabellini, Bertrand Marquis,
	Dario Faggioli, Juergen Gross, George Dunlap,
	xen-devel@lists.xenproject.org

On 15.03.2025 01:36, Volodymyr Babchuk wrote:
> LibAFL, which is a part of AFL++ project is a instrument that allows
> us to perform fuzzing on beremetal code (Xen hypervisor in this case)
> using QEMU as an emulator. It employs QEMU's ability to create
> snapshots to run many tests relatively quickly: system state is saved
> right before executing a new test and restored after the test is
> finished.
> 
> This patch adds all necessary plumbing to run aarch64 build of Xen
> inside that LibAFL-QEMU fuzzer. From the Xen perspective we need to
> do following things:
> 
> 1. Able to communicate with LibAFL-QEMU fuzzer. This is done by
> executing special opcodes, that only LibAFL-QEMU can handle.
> 
> 2. Use interface from p.1 to tell the fuzzer about code Xen section,
> so fuzzer know which part of code to track and gather coverage data.
> 
> 3. Report fuzzer about crash. This is done in panic() function.
> 
> 4. Prevent test harness from shooting itself in knee.
> 
> Right now test harness is an external component, because we want to
> test external Xen interfaces, but it is possible to fuzz internal code
> if we want to.
> 
> Test harness is implemented XTF-based test-case(s). As test harness
> can issue hypercall that shuts itself down, KConfig option
> CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING was added. It basically tells
> fuzzer that test was completed successfully if Dom0 tries to shut
> itself (or the whole machine) down.
> 
> Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com>
> 
> ---
> 
> I tried to fuzz the vGIC emulator and hypercall interface. While vGIC
> fuzzing didn't yield any interesting results, hypercall fuzzing found a
> way to crash the hypervisor from Dom0 on aarch64, using
> "XEN_SYSCTL_page_offline_op" with "sysctl_query_page_offline" sub-op,
> because it leads to page_is_ram_type() call which is marked
> UNREACHABLE on ARM.
> 
> In v2:
> 
>  - Moved to XTF-based test harness
>  - Severely reworked the fuzzer itself. Now it has user-friendly
>    command-line interface and is capable of running in CI, as it now
>    returns an appropriate error code if any faults were found
>  - Also I found, debugged and fixed a nasty bug in LibAFL-QEMU fork,
>    which crashed the whole fuzzer.
> 
> Right now the fuzzer is lockated at Xen Troops repo:
> 
> https://github.com/xen-troops/xen-fuzzer-rs
> 
> But I believe that it is ready to be included into
> gitlab.com/xen-project/
> 
> XTF-based harness is at
> 
> https://gitlab.com/vlad.babchuk/xtf/-/tree/mr_libafl
> 
> and there is corresponding MR for including it into
> 
> https://gitlab.com/xen-project/fusa/xtf/-/tree/xtf-arm
> 
> So, to sum up. All components are basically ready for initial
> inclusion. There will be smaller, integration-related changes
> later. For example - we will need to update URLs for various
> components after they are moved to correct places.
> ---
>  docs/hypervisor-guide/fuzzing.rst           |  90 ++++++++++++
>  xen/arch/arm/Kconfig.debug                  |  26 ++++
>  xen/arch/arm/Makefile                       |   1 +
>  xen/arch/arm/include/asm/libafl_qemu.h      |  54 +++++++
>  xen/arch/arm/include/asm/libafl_qemu_defs.h |  37 +++++
>  xen/arch/arm/libafl_qemu.c                  | 152 ++++++++++++++++++++
>  xen/arch/arm/psci.c                         |  13 ++
>  xen/common/sched/core.c                     |  17 +++
>  xen/common/shutdown.c                       |   7 +
>  xen/drivers/char/console.c                  |   8 ++
>  10 files changed, 405 insertions(+)
>  create mode 100644 docs/hypervisor-guide/fuzzing.rst
>  create mode 100644 xen/arch/arm/include/asm/libafl_qemu.h
>  create mode 100644 xen/arch/arm/include/asm/libafl_qemu_defs.h
>  create mode 100644 xen/arch/arm/libafl_qemu.c

This looks to be about Arm only, which would be nice if that was visible
right from the subject.

Also, nit: New files' names are to use dashes in favor of underscores.

> --- a/xen/common/sched/core.c
> +++ b/xen/common/sched/core.c
> @@ -47,6 +47,10 @@
>  #define pv_shim false
>  #endif
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>  /* opt_sched: scheduler - default to configured value */
>  static char __initdata opt_sched[10] = CONFIG_SCHED_DEFAULT;
>  string_param("sched", opt_sched);
> @@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
>      if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
>          return -EFAULT;
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
> +
>      set_bit(_VPF_blocked, &v->pause_flags);
>      v->poll_evtchn = -1;
>      set_bit(v->vcpu_id, d->poll_mask);
> @@ -1904,12 +1912,18 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>      {
>      case SCHEDOP_yield:
>      {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          ret = vcpu_yield();
>          break;
>      }
>  
>      case SCHEDOP_block:
>      {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          vcpu_block_enable_events();
>          break;
>      }
> @@ -1924,6 +1938,9 @@ ret_t do_sched_op(int cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
>  
>          TRACE_TIME(TRC_SCHED_SHUTDOWN, current->domain->domain_id,
>                     current->vcpu_id, sched_shutdown.reason);
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +        libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>          ret = domain_shutdown(current->domain, (u8)sched_shutdown.reason);
>  
>          break;

If I was a scheduler maintainer, I'd likely object to this kind of #ifdef-ary.

> --- a/xen/common/shutdown.c
> +++ b/xen/common/shutdown.c
> @@ -11,6 +11,10 @@
>  #include <xen/kexec.h>
>  #include <public/sched.h>
>  
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
> +
>  /* opt_noreboot: If true, machine will need manual reset on error. */
>  bool __ro_after_init opt_noreboot;
>  boolean_param("noreboot", opt_noreboot);
> @@ -32,6 +36,9 @@ static void noreturn reboot_or_halt(void)
>  
>  void hwdom_shutdown(unsigned char reason)
>  {
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
> +#endif
>      switch ( reason )
>      {
>      case SHUTDOWN_poweroff:

It's not as bad here and ...

> --- a/xen/drivers/char/console.c
> +++ b/xen/drivers/char/console.c
> @@ -40,6 +40,9 @@
>  #ifdef CONFIG_SBSA_VUART_CONSOLE
>  #include <asm/vpl011.h>
>  #endif
> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +#include <asm/libafl_qemu.h>
> +#endif
>  
>  /* console: comma-separated list of console outputs. */
>  static char __initdata opt_console[30] = OPT_CONSOLE_STR;
> @@ -1289,6 +1292,11 @@ void panic(const char *fmt, ...)
>  
>      kexec_crash(CRASHREASON_PANIC);
>  
> +    #ifdef CONFIG_LIBAFL_QEMU_FUZZER
> +    /* Tell the fuzzer that we crashed */
> +    libafl_qemu_end(LIBAFL_QEMU_END_CRASH);
> +    #endif

... here, but still.

Also, pre-processor directives want their # to live at the beginning of the
line.

Jan


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v2] xen: add libafl-qemu fuzzer support
  2025-03-21 23:31 ` Julien Grall
@ 2025-04-30  2:17   ` Volodymyr Babchuk
  2025-04-30  6:42     ` Jan Beulich
  0 siblings, 1 reply; 9+ messages in thread
From: Volodymyr Babchuk @ 2025-04-30  2:17 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Anthony PERARD,
	Michal Orzel, Jan Beulich, Roger Pau Monné,
	Stefano Stabellini, Bertrand Marquis, Dario Faggioli,
	Juergen Gross, George Dunlap


Hi Julien,

Julien Grall <julien@xen.org> writes:

[...]

>> diff --git a/xen/arch/arm/include/asm/libafl_qemu.h b/xen/arch/arm/include/asm/libafl_qemu.h
>> new file mode 100644
>> index 0000000000..b90cf48b9a
>> --- /dev/null
>> +++ b/xen/arch/arm/include/asm/libafl_qemu.h
>> @@ -0,0 +1,54 @@
>> +#ifndef LIBAFL_QEMU_H
>> +#define LIBAFL_QEMU_H
>> +
>> +#include <xen/stdint.h>
>> +#include "libafl_qemu_defs.h"
>> +#define LIBAFL_QEMU_PRINTF_MAX_SIZE 4096
>
> Is this defined by libafl or an internal decision?

It is defined by libafl

>
> [...]
>
>> diff --git a/xen/arch/arm/include/asm/libafl_qemu_defs.h b/xen/arch/arm/include/asm/libafl_qemu_defs.h
>> new file mode 100644
>> index 0000000000..2866cadaac
>> --- /dev/null
>> +++ b/xen/arch/arm/include/asm/libafl_qemu_defs.h
>> @@ -0,0 +1,37 @@
>
> Missing license. Also, is this file taken from somewhere?
>

I add MIT license, as libafl is dual licensed under Apache-2 and
MIT. This file is based on libafl_qemu [1]

>> +#ifndef LIBAFL_QEMU_DEFS
>> +#define LIBAFL_QEMU_DEFS
>> +
>> +#define LIBAFL_STRINGIFY(s) #s
>> +#define XSTRINGIFY(s) LIBAFL_STRINGIFY(s)
>> +
>> +#if __STDC_VERSION__ >= 201112L
>> +  #define STATIC_CHECKS                                   \
>> +    _Static_assert(sizeof(void *) <= sizeof(libafl_word), \
>> +                   "pointer type should not be larger and libafl_word");
>> +#else
>> +  #define STATIC_CHECKS
>> +#endif
>
> No-one seems to use STATIC_CHECKS? Is this intended?

I used this file as is... But I'll rework this part.

>> +
>> +#define LIBAFL_SYNC_EXIT_OPCODE 0x66f23a0f
>> +#define LIBAFL_BACKDOOR_OPCODE 0x44f23a0f
>
> Are the opcode valid for arm32? If not, they should be protected with
> #ifdef CONFIG_ARM_64.
>

It is valid even for x86_64. They use the same opcode for x86_64, arm,
aarch64 and riscv.

[...]

>> +
>> +#define LIBAFL_DEFINE_FUNCTIONS(name, opcode)				\
>> +	libafl_word _libafl_##name##_call0(	\
>> +		libafl_word action) {					\
>> +		libafl_word ret;					\
>> +		__asm__ volatile (					\
>> +			"mov x0, %1\n"					\
>> +			".word " XSTRINGIFY(opcode) "\n"		\
>> +			"mov %0, x0\n"					\
>> +			: "=r"(ret)					\
>> +			: "r"(action)					\
>> +			: "x0"						\
>
> Can we store the action directly in x0 (same for the other argunments
> below)? This would avoid to clobber two registers (See smccc.h as an
> example).

Yes, this part bothers me also. I'll try to rework it to be more
efficient.


[...]

>> +
>> +libafl_word libafl_qemu_start_virt(void       *buf_vaddr,
>> +                                            libafl_word max_len) {
>
> What coding style is this file meant to use?

Well, LibAFL people is very lax in their coding style. I copied this
file as is, but probably it should be tidied up and minimized.


[...]

>> +void lqprintf(const char *fmt, ...) {
>
> I am not sure I understand the value of lqprinf(). Why can't we use
> the console? When is this meant to be used?

This is alternative way to output something. It skips all the
abstractions around console and outputs straight to stdout. At least,
this is a nice way to check that communication with the fuzzer is
working.

>> +  va_list args;
>> +  int res;
>> +  va_start(args, fmt);
>> +  res = vsnprintf(_lqprintf_buffer, LIBAFL_QEMU_PRINTF_MAX_SIZE, fmt, args);
>> +  va_end(args);
>
> What if lqprintf() is called concurrently?
>

I'll add a spinlock.

[...]

>>     void call_psci_cpu_off(void)
>>   {
>> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
>> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
>> +#endif
>
> I am a bit confused with this call. For a first, this cannot be
> reached from a VM (or even dom0). Then, even if it is reached,
> shouldn't we allow the test continue while other pCPUs are running?

Yes, looks like this particular call is not accessible by a dom0. I'll
remove it.

> That said, the call to QEMU is not PSCI related. So shouldn't this be
> called from the callers (same applies to all the changes in PSCI)?

Purpose of fuzzing to cover as much code paths as possible. So it is
natural to put this as late as possible. I also reworked changes to
sched/core.c in accordance to this. I.e. moved fuzzer_on_block() calls
as late as possible.

>
>> +
>>       if ( psci_ver > PSCI_VERSION(0, 1) )
>>       {
>>           struct arm_smccc_res res;
>> @@ -62,12 +67,20 @@ void call_psci_cpu_off(void)
>>     void call_psci_system_off(void)
>>   {
>> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
>> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
>> +#endif
>> +
>>       if ( psci_ver > PSCI_VERSION(0, 1) )
>>           arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_OFF, NULL);
>>   }
>>     void call_psci_system_reset(void)
>>   {
>> +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
>> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
>> +#endif
>> +
>>       if ( psci_ver > PSCI_VERSION(0, 1) )
>>           arm_smccc_smc(PSCI_0_2_FN32_SYSTEM_RESET, NULL);
>>   }
>> diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
>> index 9043414290..55eb132568 100644
>> --- a/xen/common/sched/core.c
>> +++ b/xen/common/sched/core.c
>> @@ -47,6 +47,10 @@
>>   #define pv_shim false
>>   #endif
>>   +#ifdef CONFIG_LIBAFL_QEMU_FUZZER
>
> This Kconfig is only defined on Arm but you are using in common
> code. Even if this can't be supported right now, shouldn't this be
> defined in common code?

Yes, I am going to move it to the common code, but will it be fine to
have "depends on ARM_64" in the global ./Kconfig.debug for a time being?

>
>> +#include <asm/libafl_qemu.h>
>> +#endif
>> +
>>   /* opt_sched: scheduler - default to configured value */
>>   static char __initdata opt_sched[10] = CONFIG_SCHED_DEFAULT;
>>   string_param("sched", opt_sched);
>> @@ -1452,6 +1456,10 @@ static long do_poll(const struct sched_poll *sched_poll)
>>       if ( !guest_handle_okay(sched_poll->ports, sched_poll->nr_ports) )
>>           return -EFAULT;
>>   +#ifdef CONFIG_LIBAFL_QEMU_FUZZER_PASS_BLOCKING
>> +    libafl_qemu_end(LIBAFL_QEMU_END_OK);
>> +#endif
>
> I think this and all the changes in sched/core need a comment
> explaning why we want to stop the fuzzing.

I introduced fuzzer_on_block() function and put the following comment
for it:

/*
 * Conditional success
 *
 * Sometimes a fuzzer might make Xen to do something that prevents
 * from returning to the caller: reboot or turn off the machine, block
 * calling vCPU, crash a domain, etc. Depending on fuzzing goal this
 * may be a valid behavior, but as control is not returned to the
 * fuzzing harness, it can't tell the fuzzer about success, so we need
 * to do this ourselves.
 */

Will it be enough? Or do you want to have a comment before each call to fuzzer_on_block()?


[1] https://github.com/AFLplusplus/LibAFL/blob/main/libafl_qemu/runtime/libafl_qemu_defs.h

-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v2] xen: add libafl-qemu fuzzer support
  2025-04-30  2:17   ` Volodymyr Babchuk
@ 2025-04-30  6:42     ` Jan Beulich
  2025-04-30 12:19       ` Volodymyr Babchuk
  0 siblings, 1 reply; 9+ messages in thread
From: Jan Beulich @ 2025-04-30  6:42 UTC (permalink / raw)
  To: Volodymyr Babchuk
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Anthony PERARD,
	Michal Orzel, Roger Pau Monné, Stefano Stabellini,
	Bertrand Marquis, Dario Faggioli, Juergen Gross, George Dunlap,
	Julien Grall

On 30.04.2025 04:17, Volodymyr Babchuk wrote:
> Julien Grall <julien@xen.org> writes:
>>> --- /dev/null
>>> +++ b/xen/arch/arm/include/asm/libafl_qemu_defs.h
>>> @@ -0,0 +1,37 @@
>>
>> Missing license. Also, is this file taken from somewhere?
>>
> 
> I add MIT license, as libafl is dual licensed under Apache-2 and
> MIT. This file is based on libafl_qemu [1]
> 
>>> +#ifndef LIBAFL_QEMU_DEFS
>>> +#define LIBAFL_QEMU_DEFS
>>> +
>>> +#define LIBAFL_STRINGIFY(s) #s
>>> +#define XSTRINGIFY(s) LIBAFL_STRINGIFY(s)
>>> +
>>> +#if __STDC_VERSION__ >= 201112L
>>> +  #define STATIC_CHECKS                                   \
>>> +    _Static_assert(sizeof(void *) <= sizeof(libafl_word), \
>>> +                   "pointer type should not be larger and libafl_word");
>>> +#else
>>> +  #define STATIC_CHECKS
>>> +#endif
>>
>> No-one seems to use STATIC_CHECKS? Is this intended?
> 
> I used this file as is... But I'll rework this part.
> 
>>> +
>>> +#define LIBAFL_SYNC_EXIT_OPCODE 0x66f23a0f
>>> +#define LIBAFL_BACKDOOR_OPCODE 0x44f23a0f
>>
>> Are the opcode valid for arm32? If not, they should be protected with
>> #ifdef CONFIG_ARM_64.
>>
> 
> It is valid even for x86_64. They use the same opcode for x86_64, arm,
> aarch64 and riscv.

Wow. On x86-64 they rely on the (prefix-less) opcode 0f3af2 to not gain
any meaning. Somewhat similar on RISC-V, somewhere in MISC_MEM opcode
space. Pretty fragile. Not to speak of what the effect of using such an
opcode is on disassembly of surrounding code (at least for x86).

Jan


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC PATCH v2] xen: add libafl-qemu fuzzer support
  2025-04-30  6:42     ` Jan Beulich
@ 2025-04-30 12:19       ` Volodymyr Babchuk
  0 siblings, 0 replies; 9+ messages in thread
From: Volodymyr Babchuk @ 2025-04-30 12:19 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel@lists.xenproject.org, Andrew Cooper, Anthony PERARD,
	Michal Orzel, Roger Pau Monné, Stefano Stabellini,
	Bertrand Marquis, Dario Faggioli, Juergen Gross, George Dunlap,
	Julien Grall

Jan Beulich <jbeulich@suse.com> writes:

> On 30.04.2025 04:17, Volodymyr Babchuk wrote:
>> Julien Grall <julien@xen.org> writes:
>>>> --- /dev/null
>>>> +++ b/xen/arch/arm/include/asm/libafl_qemu_defs.h
>>>> @@ -0,0 +1,37 @@
>>>
>>> Missing license. Also, is this file taken from somewhere?
>>>
>> 
>> I add MIT license, as libafl is dual licensed under Apache-2 and
>> MIT. This file is based on libafl_qemu [1]
>> 
>>>> +#ifndef LIBAFL_QEMU_DEFS
>>>> +#define LIBAFL_QEMU_DEFS
>>>> +
>>>> +#define LIBAFL_STRINGIFY(s) #s
>>>> +#define XSTRINGIFY(s) LIBAFL_STRINGIFY(s)
>>>> +
>>>> +#if __STDC_VERSION__ >= 201112L
>>>> +  #define STATIC_CHECKS                                   \
>>>> +    _Static_assert(sizeof(void *) <= sizeof(libafl_word), \
>>>> +                   "pointer type should not be larger and libafl_word");
>>>> +#else
>>>> +  #define STATIC_CHECKS
>>>> +#endif
>>>
>>> No-one seems to use STATIC_CHECKS? Is this intended?
>> 
>> I used this file as is... But I'll rework this part.
>> 
>>>> +
>>>> +#define LIBAFL_SYNC_EXIT_OPCODE 0x66f23a0f
>>>> +#define LIBAFL_BACKDOOR_OPCODE 0x44f23a0f
>>>
>>> Are the opcode valid for arm32? If not, they should be protected with
>>> #ifdef CONFIG_ARM_64.
>>>
>> 
>> It is valid even for x86_64. They use the same opcode for x86_64, arm,
>> aarch64 and riscv.
>
> Wow. On x86-64 they rely on the (prefix-less) opcode 0f3af2 to not gain
> any meaning. Somewhat similar on RISC-V, somewhere in MISC_MEM opcode
> space. Pretty fragile. Not to speak of what the effect of using such an
> opcode is on disassembly of surrounding code (at least for x86).

Yeah, they made some questionable choices, and opcode selection is one
of them. Also, the whole libafl-qemu code quality is not to the highest
standard, but there are no better alternatives.

They just hacked into TCG translator code and are looking for the their
special opcodes byte-per-byte:

[1] https://github.com/AFLplusplus/qemu-libafl-bridge/blob/main/accel/tcg/translator.c#L184


-- 
WBR, Volodymyr

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-04-30 12:19 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-15  0:36 [RFC PATCH v2] xen: add libafl-qemu fuzzer support Volodymyr Babchuk
2025-03-21 22:32 ` Stefano Stabellini
2025-03-21 22:57   ` Julien Grall
2025-03-21 23:34   ` Julien Grall
2025-03-21 23:31 ` Julien Grall
2025-04-30  2:17   ` Volodymyr Babchuk
2025-04-30  6:42     ` Jan Beulich
2025-04-30 12:19       ` Volodymyr Babchuk
2025-04-08 15:40 ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.