qemu-devel.nongnu.org archive mirror
* [PATCH 1/1] user: add runtime switch to call safe_syscall via libc
@ 2025-11-02 13:26 Balint Reczey via
       [not found] ` <CAFEAcA9jzkZ_PSXvjDEXyGB1=NAKuMEphhKeMeahXcfXXDmWUg@mail.gmail.com>
  0 siblings, 1 reply; 3+ messages in thread
From: Balint Reczey via @ 2025-11-02 13:26 UTC (permalink / raw)
  Cc: Warner Losh, Kyle Evans, Riku Voipio, Laurent Vivier

Add a libc-backed path for safe_syscall() that makes syscalls via
libc's syscall(). This enables interposing syscalls via LD_PRELOAD when
running static guest binaries under a dynamically linked qemu-user.

The assembly implementation (safe_syscall_base()) remains the default.
A command-line switch or an environment variable selects the libc path:

Command line: -libc-syscall
Environment: QEMU_LIBC_SYSCALL

Signed-off-by: Balint Reczey <balint@balintreczey.hu>
---
 bsd-user/main.c             | 11 +++++++
 common-user/meson.build     |  1 +
 common-user/safe-syscall.c  | 57 +++++++++++++++++++++++++++++++++++++
 docs/user/main.rst          | 28 ++++++++++++++++--
 include/user/safe-syscall.h | 25 +++++++++++-----
 linux-user/main.c           |  9 ++++++
 6 files changed, 122 insertions(+), 9 deletions(-)
 create mode 100644 common-user/safe-syscall.c

diff --git a/bsd-user/main.c b/bsd-user/main.c
index 73aae8c327..9b3ff67859 100644
--- a/bsd-user/main.c
+++ b/bsd-user/main.c
@@ -38,6 +38,7 @@
 #include "qemu/plugin.h"
 #include "user/guest-base.h"
 #include "user/page-protection.h"
+#include "user/safe-syscall.h"
 #include "accel/accel-ops.h"
 #include "tcg/startup.h"
 #include "qemu/timer.h"
@@ -166,6 +167,7 @@ static void usage(void)
            "-E var=value      sets/modifies targets environment variable(s)\n"
            "-U var            unsets targets environment variable(s)\n"
            "-B address        set guest_base address to address\n"
+           "-libc-syscall     use libc syscall() instead of assembly safe-syscall\n"
            "\n"
            "Debug options:\n"
            "-d item1[,...]    enable logging of specified items\n"
@@ -183,6 +185,8 @@ static void usage(void)
            "Environment variables:\n"
            "QEMU_STRACE       Print system calls and arguments similar to the\n"
            "                  'strace' program.  Enable by setting to any value.\n"
+           "QEMU_LIBC_SYSCALL Use libc syscall() instead of assembly safe-syscall.\n"
+           "                  Enable by setting to any value.\n"
            "You can use -E and -U options to set/unset environment variables\n"
            "for target process.  It is possible to provide several variables\n"
            "by repeating the option.  For example:\n"
@@ -310,6 +314,11 @@ int main(int argc, char **argv)
     qemu_add_opts(&qemu_trace_opts);
     qemu_plugin_add_opts();
 
+    /* Check QEMU_LIBC_SYSCALL environment variable */
+    if (getenv("QEMU_LIBC_SYSCALL")) {
+        qemu_use_libc_syscall = true;
+    }
+
     optind = 1;
     for (;;) {
         if (optind >= argc) {
@@ -380,6 +389,8 @@ int main(int argc, char **argv)
             have_guest_base = true;
         } else if (!strcmp(r, "drop-ld-preload")) {
             (void) envlist_unsetenv(envlist, "LD_PRELOAD");
+        } else if (!strcmp(r, "libc-syscall")) {
+            qemu_use_libc_syscall = true;
         } else if (!strcmp(r, "seed")) {
             seed_optarg = optarg;
         } else if (!strcmp(r, "one-insn-per-tb")) {
diff --git a/common-user/meson.build b/common-user/meson.build
index ac9de5b9e3..d44ffe1f56 100644
--- a/common-user/meson.build
+++ b/common-user/meson.build
@@ -7,4 +7,5 @@ common_user_inc += include_directories('host/' / host_arch)
 user_ss.add(files(
   'safe-syscall.S',
   'safe-syscall-error.c',
+  'safe-syscall.c',
 ))
diff --git a/common-user/safe-syscall.c b/common-user/safe-syscall.c
new file mode 100644
index 0000000000..d1476c3113
--- /dev/null
+++ b/common-user/safe-syscall.c
@@ -0,0 +1,57 @@
+/*
+ * safe-syscall.c: C implementation using libc's syscall()
+ * to handle signals occurring at the same time as system calls.
+ *
+ * Written by Balint Reczey <balint@balintreczey.hu>
+ *
+ * Copyright (C) 2025 Balint Reczey
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ */
+
+#include "qemu/osdep.h"
+#if defined(__linux__)
+# include "special-errno.h"
+#elif defined(__FreeBSD__)
+# include "errno_defs.h"
+#endif
+#include "user/safe-syscall.h"
+#include <stdarg.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+#include "qemu/atomic.h"
+
+/* Global runtime toggle (default: false). */
+bool qemu_use_libc_syscall;
+
+/*
+ * libc-backed implementation: Make a system call via libc's syscall()
+ * if no guest signal is pending.
+ */
+long safe_syscall_libc(int *pending, long number, ...)
+{
+    va_list ap;
+    long arg1, arg2, arg3, arg4, arg5, arg6;
+    long ret;
+
+    /* Check if a guest signal is pending */
+    if (qatomic_read(pending)) {
+        errno = QEMU_ERESTARTSYS;
+        return -1;
+    }
+
+    va_start(ap, number);
+    /* Extract up to 6 syscall arguments */
+    arg1 = va_arg(ap, long);
+    arg2 = va_arg(ap, long);
+    arg3 = va_arg(ap, long);
+    arg4 = va_arg(ap, long);
+    arg5 = va_arg(ap, long);
+    arg6 = va_arg(ap, long);
+    va_end(ap);
+
+    /* Make the actual system call using libc's syscall() */
+    ret = syscall(number, arg1, arg2, arg3, arg4, arg5, arg6);
+
+    return ret;
+}
diff --git a/docs/user/main.rst b/docs/user/main.rst
index a8ddf91424..6b7e76dfe1 100644
--- a/docs/user/main.rst
+++ b/docs/user/main.rst
@@ -70,7 +70,7 @@ Command line options
 
 ::
 
-   qemu-i386 [-h] [-d] [-L path] [-s size] [-cpu model] [-g endpoint] [-B offset] [-R size] program [arguments...]
+   qemu-i386 [-h] [-d] [-L path] [-s size] [-cpu model] [-g endpoint] [-B offset] [-R size] [-libc-syscall] program [arguments...]
 
 ``-h``
    Print the help
@@ -101,6 +101,15 @@ Command line options
    bytes). \"G\", \"M\", and \"k\" suffixes may be used when specifying
    the size.
 
+``-libc-syscall``
+   Use the host C library's ``syscall()`` entry point for guest system calls
+   instead of QEMU's built-in safe-syscall trampoline. By default this option
+   is disabled and QEMU uses its internal assembly implementation for
+   performance and precise control of signal-restart semantics. This switch is
+   primarily intended for debugging and integration scenarios (for example
+   when interposing on ``syscall()`` via ``LD_PRELOAD``). Available on Linux
+   and BSD user-mode builds.
+
 Debug options:
 
 ``-d item1,...``
@@ -135,6 +144,10 @@ QEMU_STRACE
    format are printed with information for six arguments. Many
    flag-style arguments don't have decoders and will show up as numbers.
 
+QEMU_LIBC_SYSCALL
+   When set to any value, behave as if ``-libc-syscall`` was specified
+   on the command line. Defaults to disabled.
+
 Other binaries
 ~~~~~~~~~~~~~~
 
@@ -231,7 +244,7 @@ Command line options
 
 ::
 
-   qemu-sparc64 [-h] [-d] [-L path] [-s size] [-bsd type] program [arguments...]
+   qemu-sparc64 [-h] [-d] [-L path] [-s size] [-bsd type] [-libc-syscall] program [arguments...]
 
 ``-h``
    Print the help
@@ -256,6 +269,11 @@ Command line options
    Set the type of the emulated BSD Operating system. Valid values are
    FreeBSD, NetBSD and OpenBSD (default).
 
+``-libc-syscall``
+   Use the host C library's ``syscall()`` entry point for guest system calls
+   instead of QEMU's built-in safe-syscall trampoline. See the Linux user-mode
+   option of the same name for details. Defaults to disabled.
+
 Debug options:
 
 ``-d item1,...``
@@ -266,3 +284,9 @@ Debug options:
    Run the emulation with one guest instruction per translation block.
    This slows down emulation a lot, but can be useful in some situations,
    such as when trying to analyse the logs produced by the ``-d`` option.
+
+Environment variables:
+
+QEMU_LIBC_SYSCALL
+   When set to any value, behave as if ``-libc-syscall`` was specified
+   on the command line. Defaults to disabled.
diff --git a/include/user/safe-syscall.h b/include/user/safe-syscall.h
index aa075f4d5c..02a95c24e9 100644
--- a/include/user/safe-syscall.h
+++ b/include/user/safe-syscall.h
@@ -125,16 +125,27 @@
  * kinds of restartability.
  */
 
-/* The core part of this function is implemented in assembly */
-long safe_syscall_base(int *pending, long number, ...);
-long safe_syscall_set_errno_tail(int value);
+/*
+ * The core part remains implemented in assembly; a C dispatcher selects
+ * runtime path.
+ */
+extern long safe_syscall_base(int *pending, long number, ...);
+extern long safe_syscall_set_errno_tail(int value);
+extern long safe_syscall_libc(int *pending, long number, ...);
+extern bool qemu_use_libc_syscall;
 
-/* These are defined by the safe-syscall.inc.S file */
+/*
+ * These symbols are defined for compatibility with signal handling code.
+ * In the C implementation, they are dummy symbols.
+ */
 extern char safe_syscall_start[];
 extern char safe_syscall_end[];
 
-#define safe_syscall(...)                                                 \
-    safe_syscall_base(&get_task_state(thread_cpu)->signal_pending,        \
-                      __VA_ARGS__)
+#define safe_syscall(...)                                               \
+    (qemu_use_libc_syscall ?                                            \
+     safe_syscall_libc(&get_task_state(thread_cpu)->signal_pending,     \
+                       __VA_ARGS__) :                                   \
+     safe_syscall_base(&get_task_state(thread_cpu)->signal_pending,     \
+                       __VA_ARGS__))
 
 #endif
diff --git a/linux-user/main.c b/linux-user/main.c
index db751c0757..de2a20efb4 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -40,6 +40,7 @@
 #include "qemu/plugin.h"
 #include "user/guest-base.h"
 #include "user/page-protection.h"
+#include "user/safe-syscall.h"
 #include "exec/gdbstub.h"
 #include "gdbstub/user.h"
 #include "accel/accel-ops.h"
@@ -456,6 +457,12 @@ static void handle_arg_jitdump(const char *arg)
     perf_enable_jitdump();
 }
 
+static void handle_arg_libc_syscall(const char *arg)
+{
+    /* Enable libc-backed syscall implementation */
+    qemu_use_libc_syscall = true;
+}
+
 static QemuPluginList plugins = QTAILQ_HEAD_INITIALIZER(plugins);
 
 #ifdef CONFIG_PLUGIN
@@ -534,6 +541,8 @@ static const struct qemu_argument arg_table[] = {
      "",           "Generate a /tmp/perf-${pid}.map file for perf"},
     {"jitdump",    "QEMU_JITDUMP",     false, handle_arg_jitdump,
      "",           "Generate a jit-${pid}.dump file for perf"},
+    {"libc-syscall", "QEMU_LIBC_SYSCALL", false, handle_arg_libc_syscall,
+     "",           "use libc syscall() instead of assembly safe-syscall"},
     {NULL, NULL, false, NULL, NULL, NULL}
 };
 
-- 
2.43.0
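
To illustrate the dispatch that the patch adds to
include/user/safe-syscall.h, here is a minimal sketch of a macro that picks
one of two variadic functions based on a runtime flag. All demo_* names are
hypothetical stand-ins rather than QEMU symbols, and the stub bodies only
echo their input so that the selected path is observable.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in QEMU. */
static bool demo_use_libc;   /* plays the role of qemu_use_libc_syscall */
static int demo_pending;     /* plays the role of the signal_pending flag */

/* Stub for the assembly path: returns the syscall number unchanged. */
static long demo_base(int *pending, long number, ...)
{
    (void)pending;
    return number;
}

/* Stub for the libc path: returns the negated syscall number. */
static long demo_libc(int *pending, long number, ...)
{
    (void)pending;
    return -number;
}

/* Same shape as the patched safe_syscall(): one flag test per call. */
#define demo_safe_syscall(...)                              \
    (demo_use_libc ? demo_libc(&demo_pending, __VA_ARGS__)  \
                   : demo_base(&demo_pending, __VA_ARGS__))

/* Usage: flipping the flag changes which function the macro calls. */
void demo_usage(void)
{
    demo_use_libc = false;
    assert(demo_safe_syscall(7L, 1L, 2L) == 7);
    demo_use_libc = true;
    assert(demo_safe_syscall(7L, 1L, 2L) == -7);
}
```

Because the flag is read on every call, the choice costs one branch per
syscall and needs no rebuild to switch between the two paths.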




* Re: [PATCH 1/1] user: add runtime switch to call safe_syscall via libc
       [not found] ` <CAFEAcA9jzkZ_PSXvjDEXyGB1=NAKuMEphhKeMeahXcfXXDmWUg@mail.gmail.com>
@ 2025-11-06 15:39   ` Bálint Réczey via
  0 siblings, 0 replies; 3+ messages in thread
From: Bálint Réczey via @ 2025-11-06 15:39 UTC (permalink / raw)
  To: Peter Maydell, qemu-devel
  Cc: Warner Losh, Kyle Evans, Riku Voipio, Laurent Vivier

Hi Peter,

Thank you for the review!

On Wed, 5 Nov 2025 at 14:53, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Tue, 4 Nov 2025 at 22:08, Balint Reczey via <qemu-devel@nongnu.org> wrote:
> >
> > Add a libc-backed path for safe_syscall() that make syscalls via
> > libc's syscall(). This enables interposing syscalls via LD_PRELOAD when
> > running static guest binaries under a dynamically linked qemu-user.
> >
> > The assembly implementation (safe_syscall_base()) remains the default.
> > A runtime switch or a set environment variable changes the behavior:
> >
> > Command line: -libc-syscall
> > Environment: QEMU_LIBC_SYSCALL
>
> > +/*
> > + * libc-backed implementation: Make a system call via libc's syscall()
> > + * if no guest signal is pending.
> > + */
> > +long safe_syscall_libc(int *pending, long number, ...)
> > +{
> > +    va_list ap;
> > +    long arg1, arg2, arg3, arg4, arg5, arg6;
> > +    long ret;
> > +
> > +    /* Check if a guest signal is pending */
> > +    if (qatomic_read(pending)) {
> > +        errno = QEMU_ERESTARTSYS;
> > +        return -1;
> > +    }
>
> We check for a pending signal here...
>
> > +
> > +    va_start(ap, number);
> > +    /* Extract up to 6 syscall arguments */
> > +    arg1 = va_arg(ap, long);
> > +    arg2 = va_arg(ap, long);
> > +    arg3 = va_arg(ap, long);
> > +    arg4 = va_arg(ap, long);
> > +    arg5 = va_arg(ap, long);
> > +    arg6 = va_arg(ap, long);
> > +    va_end(ap);
>
> ...but if a signal arrives after we checked but somewhere in here
> before we actually make the host syscall, then we may incorrectly
> block in the syscall.
>
> > +
> > +    /* Make the actual system call using libc's syscall() */
> > +    ret = syscall(number, arg1, arg2, arg3, arg4, arg5, arg6);
>
> This is the race condition which is the reason why safe_syscall
> is implemented in assembly: we need to be able to control exactly
> which code we're in so that the signal handler can adjust the PC
> if it sees that we were attempting to do a syscall when the
> signal arrived. (There's a longer explanation of this in a comment
> in include/user/safe-syscall.h.)

Yes, indeed. I moved the pending check right before the syscall()
call, but a race remains, and I don't think it can be closed as
thoroughly as the assembly implementation does.
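
The remaining window can be sketched as follows. This is a minimal,
Linux-only illustration, not the code from the patch: demo_checked_syscall
and DEMO_ERESTARTSYS are hypothetical names, the errno value is an assumed
stand-in for QEMU_ERESTARTSYS, and the arguments are passed as fixed
parameters instead of through va_arg.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed stand-in for QEMU_ERESTARTSYS; illustration only. */
#define DEMO_ERESTARTSYS 512

/* Hypothetical helper: check the pending flag, then call libc's syscall(). */
static long demo_checked_syscall(atomic_int *pending, long number,
                                 long a1, long a2, long a3,
                                 long a4, long a5, long a6)
{
    assert(pending != NULL);

    /* A signal was already noted: ask the caller to restart later. */
    if (atomic_load(pending)) {
        errno = DEMO_ERESTARTSYS;
        return -1;
    }

    /*
     * Race window: a signal handler that sets *pending right here is not
     * noticed, and the call below may block even though a signal is now
     * pending. C code cannot tell the handler exactly where it was
     * interrupted, so the window cannot be closed from this side.
     */
    return syscall(number, a1, a2, a3, a4, a5, a6);
}
```

Moving the check closer to the call narrows the window but does not remove
it, since the check and the kernel entry are still separate instructions.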

> Not getting this right results in various hangs and
> other misbehaviour when a signal arrives to the guest
> program at the wrong moment. We don't want to regress
> that behaviour. Any proposal for having QEMU call syscall()
> needs to avoid reintroducing the races.

Yes, this is why the less safe behavior is opt-in via a run-time switch,
but I understand that this still may not be acceptable for the project.
It works beautifully, though, for wrapping statically linked tools in
builds, so for anyone interested the patch will be maintained at:
https://github.com/firebuild/qemu/tree/safe-syscalls-via-libc
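
For contrast, the reason the assembly path is safer can be modelled in a
few lines. This is a simplified sketch of the PC adjustment Peter
describes, under the assumption that the check-and-syscall sequence sits
between two known addresses (the role of safe_syscall_start and
safe_syscall_end). demo_rewind_pc is a hypothetical name and works on plain
integers, not on a real signal context.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: given the PC at which a host signal interrupted the
 * thread, return the PC to resume at. region_start is the first
 * instruction of the "check pending, then syscall" sequence; region_end
 * is the address just past the syscall instruction.
 */
static uintptr_t demo_rewind_pc(uintptr_t pc, uintptr_t region_start,
                                uintptr_t region_end)
{
    assert(region_start < region_end);

    /*
     * Interrupted after the sequence began but before the syscall
     * completed: restart from the top so the pending check runs again.
     */
    if (pc > region_start && pc < region_end) {
        return region_start;
    }

    /* Outside the sequence (or exactly at its start): leave the PC alone. */
    return pc;
}
```

A call through libc's syscall() offers no such address range to the signal
handler, which is why the libc path cannot apply the same fix.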

Cheers,
Balint

> thanks
> -- PMM

