qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] linux-user: Fix siginfo_t contents when jumping to non-readable pages
@ 2022-08-04 18:23 Ilya Leoshkevich
  2022-08-04 18:23 ` [PATCH 1/2] " Ilya Leoshkevich
  2022-08-04 18:23 ` [PATCH 2/2] tests/tcg: Test " Ilya Leoshkevich
  0 siblings, 2 replies; 6+ messages in thread
From: Ilya Leoshkevich @ 2022-08-04 18:23 UTC (permalink / raw)
  To: Laurent Vivier, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Richard Henderson,
	Paolo Bonzini, David Hildenbrand
  Cc: qemu-devel, qemu-s390x, Christian Borntraeger, Ilya Leoshkevich

Hi,

I noticed that when we get a SEGV due to jumping to non-readable
memory, sometimes si_addr and program counter in siginfo_t are slightly
off. I tracked this down to the assumption that translators stop before
the end of a page, while in reality they may stop right after it.

Patch 1 fixes the issue, patch 2 adds tests.

Best regards,
Ilya

Ilya Leoshkevich (2):
  linux-user: Fix siginfo_t contents when jumping to non-readable pages
  tests/tcg: Test siginfo_t contents when jumping to non-readable pages

 accel/tcg/translate-all.c        |  16 ++--
 accel/tcg/translator.c           |  25 ++++++
 include/hw/core/cpu.h            |   2 +
 linux-user/signal.c              |   5 ++
 tests/tcg/multiarch/noexec.h     | 114 ++++++++++++++++++++++++
 tests/tcg/s390x/Makefile.target  |   1 +
 tests/tcg/s390x/noexec.c         | 145 +++++++++++++++++++++++++++++++
 tests/tcg/x86_64/Makefile.target |   3 +-
 tests/tcg/x86_64/noexec.c        | 116 +++++++++++++++++++++++++
 9 files changed, 421 insertions(+), 6 deletions(-)
 create mode 100644 tests/tcg/multiarch/noexec.h
 create mode 100644 tests/tcg/s390x/noexec.c
 create mode 100644 tests/tcg/x86_64/noexec.c

-- 
2.35.3



^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/2] linux-user: Fix siginfo_t contents when jumping to non-readable pages
  2022-08-04 18:23 [PATCH 0/2] linux-user: Fix siginfo_t contents when jumping to non-readable pages Ilya Leoshkevich
@ 2022-08-04 18:23 ` Ilya Leoshkevich
  2022-08-05  8:50   ` Peter Maydell
  2022-08-04 18:23 ` [PATCH 2/2] tests/tcg: Test " Ilya Leoshkevich
  1 sibling, 1 reply; 6+ messages in thread
From: Ilya Leoshkevich @ 2022-08-04 18:23 UTC (permalink / raw)
  To: Laurent Vivier, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Richard Henderson,
	Paolo Bonzini, David Hildenbrand
  Cc: qemu-devel, qemu-s390x, Christian Borntraeger, Ilya Leoshkevich

When the first instruction of a translation block is located in a
non-readable page, qemu-user fills siginfo_t correctly. For the other
instructions the result is as if it were the first instruction, which
is not correct.

The reason is that the current logic expects translate_insn() hook to
stop at the page boundary. This way only the first instruction can
cause a SEGV. However, this is quite difficult to properly implement
when the problematic instruction crosses a page boundary, and indeed
the actual implementations do not do this. Note that this can also
break self-modifying code detection when only bytes on the second page
are modified, but this is outside of the scope of this patch.

Instead of chaning all the translators, do a much simpler thing: when
such a situation is detected, start from scratch and stop right before
the problematic instruction.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 accel/tcg/translate-all.c | 16 +++++++++++-----
 accel/tcg/translator.c    | 25 +++++++++++++++++++++++++
 include/hw/core/cpu.h     |  2 ++
 linux-user/signal.c       |  5 +++++
 4 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index ef62a199c7..b4766f4661 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -2295,12 +2295,18 @@ void page_set_flags(target_ulong start, target_ulong end, int flags)
          len != 0;
          len -= TARGET_PAGE_SIZE, addr += TARGET_PAGE_SIZE) {
         PageDesc *p = page_find_alloc(addr >> TARGET_PAGE_BITS, 1);
+        bool invalidate;
 
-        /* If the write protection bit is set, then we invalidate
-           the code inside.  */
-        if (!(p->flags & PAGE_WRITE) &&
-            (flags & PAGE_WRITE) &&
-            p->first_tb) {
+        /*
+         * If the write protection bit is set, then we invalidate the code
+         * inside.  For qemu-user, we need to do this when PAGE_READ is cleared
+         * as well, in order to force a SEGV when trying to run this code.
+         */
+        invalidate = !(p->flags & PAGE_WRITE) && (flags & PAGE_WRITE);
+#ifdef CONFIG_USER_ONLY
+        invalidate |= (p->flags & PAGE_READ) && !(flags & PAGE_READ);
+#endif
+        if (invalidate && p->first_tb) {
             tb_invalidate_phys_page(addr, 0);
         }
         if (reset_target_data) {
diff --git a/accel/tcg/translator.c b/accel/tcg/translator.c
index fe7af9b943..e444c17515 100644
--- a/accel/tcg/translator.c
+++ b/accel/tcg/translator.c
@@ -57,6 +57,18 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
     uint32_t cflags = tb_cflags(tb);
     bool plugin_enabled;
 
+    /*
+     * In case translate_insn hook touched an unreadable page, redo the
+     * translation until the problematic instruction.  We cannot just throw
+     * away the trailing ops, because the hook could have changed DisasContext.
+     */
+    tcg_debug_assert(!cpu->translator_jmp);
+    if (sigsetjmp(cpu->translator_jmp_env, 1) != 0) {
+        cpu->translator_jmp = false;
+        tcg_remove_ops_after(NULL);
+        max_insns = db->num_insns - 1;
+    }
+
     /* Initialize DisasContext */
     db->tb = tb;
     db->pc_first = tb->pc;
@@ -122,8 +134,21 @@ void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
             db->is_jmp = DISAS_TOO_MANY;
             break;
         }
+
+        /*
+         * Propagate SEGVs from the first instruction to the guest and handle
+         * the rest. This way guest's siginfo_t gets accurate pc and si_addr.
+         */
+        cpu->translator_jmp = true;
     }
 
+    /*
+     * Clear translator_jmp on all ways out of this function, otherwise
+     * instructions that fetch code as part of their operation will be
+     * confused.
+     */
+    cpu->translator_jmp = false;
+
     /* Emit code to exit the TB, as indicated by db->is_jmp.  */
     ops->tb_stop(db, cpu);
     gen_tb_end(db->tb, db->num_insns);
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index 500503da13..6c1829b7f5 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -349,6 +349,8 @@ struct CPUState {
     int64_t icount_extra;
     uint64_t random_seed;
     sigjmp_buf jmp_env;
+    bool translator_jmp;
+    sigjmp_buf translator_jmp_env;
 
     QemuMutex work_mutex;
     QSIMPLEQ_HEAD(, qemu_work_item) work_list;
diff --git a/linux-user/signal.c b/linux-user/signal.c
index 8d29bfaa6b..f7e77c8d2e 100644
--- a/linux-user/signal.c
+++ b/linux-user/signal.c
@@ -833,6 +833,11 @@ static void host_signal_handler(int host_sig, siginfo_t *info, void *puc)
         abi_ptr guest_addr;
         bool is_write;
 
+        /* Translator wants to handle this. */
+        if (helper_retaddr == 1 && cpu->translator_jmp) {
+            siglongjmp(cpu->translator_jmp_env, 1);
+        }
+
         host_addr = (uintptr_t)info->si_addr;
 
         /*
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/2] tests/tcg: Test siginfo_t contents when jumping to non-readable pages
  2022-08-04 18:23 [PATCH 0/2] linux-user: Fix siginfo_t contents when jumping to non-readable pages Ilya Leoshkevich
  2022-08-04 18:23 ` [PATCH 1/2] " Ilya Leoshkevich
@ 2022-08-04 18:23 ` Ilya Leoshkevich
  1 sibling, 0 replies; 6+ messages in thread
From: Ilya Leoshkevich @ 2022-08-04 18:23 UTC (permalink / raw)
  To: Laurent Vivier, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Richard Henderson,
	Paolo Bonzini, David Hildenbrand
  Cc: qemu-devel, qemu-s390x, Christian Borntraeger, Ilya Leoshkevich

Add x86_64 and s390x tests to prevent regressions.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---
 tests/tcg/multiarch/noexec.h     | 114 ++++++++++++++++++++++++
 tests/tcg/s390x/Makefile.target  |   1 +
 tests/tcg/s390x/noexec.c         | 145 +++++++++++++++++++++++++++++++
 tests/tcg/x86_64/Makefile.target |   3 +-
 tests/tcg/x86_64/noexec.c        | 116 +++++++++++++++++++++++++
 5 files changed, 378 insertions(+), 1 deletion(-)
 create mode 100644 tests/tcg/multiarch/noexec.h
 create mode 100644 tests/tcg/s390x/noexec.c
 create mode 100644 tests/tcg/x86_64/noexec.c

diff --git a/tests/tcg/multiarch/noexec.h b/tests/tcg/multiarch/noexec.h
new file mode 100644
index 0000000000..a76e0aa9ea
--- /dev/null
+++ b/tests/tcg/multiarch/noexec.h
@@ -0,0 +1,114 @@
+/*
+ * Common code for arch-specific MMU_INST_FETCH fault testing.
+ *
+ * Declare struct arch_noexec_test before including this file and define
+ * arch_check_mcontext() after that.
+ */
+
+#include <assert.h>
+#include <signal.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/ucontext.h>
+#include <unistd.h>
+
+/* Forward declarations. */
+
+static void arch_check_mcontext(const struct arch_noexec_test *test,
+                                const mcontext_t *ctx);
+
+/* Utility functions. */
+
+static void safe_print(const char *s)
+{
+    write(0, s, strlen(s));
+}
+
+static void safe_puts(const char *s)
+{
+    safe_print(s);
+    safe_print("\n");
+}
+
+#define PAGE_ALIGN(p) (void *)((unsigned long)(p) & ~0xfffUL)
+
+/* Testing infrastructure. */
+
+struct noexec_test {
+    const char *name;
+    void (*func)(int);
+    void *page;
+    void *expected_si_addr;
+    struct arch_noexec_test arch;
+};
+
+static const struct noexec_test *current_noexec_test;
+
+static void handle_segv(int sig, siginfo_t *info, void *ucontext)
+{
+    int err;
+
+    if (current_noexec_test == NULL) {
+        safe_puts("[  FAILED  ] unexpected SEGV");
+        _exit(1);
+    }
+
+    if (info->si_addr != current_noexec_test->expected_si_addr) {
+        safe_puts("[  FAILED  ] wrong si_addr");
+        _exit(1);
+    }
+
+    arch_check_mcontext(&current_noexec_test->arch,
+                        &((ucontext_t *)ucontext)->uc_mcontext);
+
+    err = mprotect(current_noexec_test->page, 0x1000, PROT_READ | PROT_EXEC);
+    if (err != 0) {
+        safe_puts("[  FAILED  ] mprotect() failed");
+        _exit(1);
+    }
+
+    current_noexec_test = NULL;
+}
+
+static void test_noexec_1(const struct noexec_test *test)
+{
+    int ret;
+
+    /* Trigger TB creation in order to test invalidation. */
+    test->func(0);
+
+    ret = mprotect(test->page, 0x1000, PROT_NONE);
+    assert(ret == 0);
+
+    /* Trigger SEGV and check that handle_segv() ran. */
+    current_noexec_test = test;
+    test->func(0);
+    assert(current_noexec_test == NULL);
+}
+
+static int test_noexec(struct noexec_test *tests, size_t n_tests)
+{
+    struct sigaction act;
+    size_t i;
+    int err;
+
+    memset(&act, 0, sizeof(act));
+    act.sa_sigaction = handle_segv;
+    act.sa_flags = SA_SIGINFO;
+    err = sigaction(SIGSEGV, &act, NULL);
+    assert(err == 0);
+
+    for (i = 0; i < n_tests; i++) {
+        struct noexec_test *test = &tests[i];
+
+        safe_print("[ RUN      ] ");
+        safe_puts(test->name);
+        test_noexec_1(test);
+        safe_puts("[       OK ]");
+    }
+
+    safe_puts("[  PASSED  ]");
+
+    return EXIT_SUCCESS;
+}
diff --git a/tests/tcg/s390x/Makefile.target b/tests/tcg/s390x/Makefile.target
index 1a7a4a2f59..5e13a41c3f 100644
--- a/tests/tcg/s390x/Makefile.target
+++ b/tests/tcg/s390x/Makefile.target
@@ -16,6 +16,7 @@ TESTS+=shift
 TESTS+=trap
 TESTS+=signals-s390x
 TESTS+=branch-relative-long
+TESTS+=noexec
 
 Z14_TESTS=vfminmax
 vfminmax: LDFLAGS+=-lm
diff --git a/tests/tcg/s390x/noexec.c b/tests/tcg/s390x/noexec.c
new file mode 100644
index 0000000000..2dfc9ee817
--- /dev/null
+++ b/tests/tcg/s390x/noexec.c
@@ -0,0 +1,145 @@
+#define _GNU_SOURCE
+
+struct arch_noexec_test {
+    void *expected_pswa;
+    unsigned long expected_r2;
+};
+
+#include "../multiarch/noexec.h"
+
+static void arch_check_mcontext(const struct arch_noexec_test *test,
+                                const mcontext_t *ctx) {
+    if (ctx->psw.addr != (unsigned long)test->expected_pswa) {
+        safe_puts("[  FAILED  ] wrong psw.addr");
+        _exit(1);
+    }
+
+    if (ctx->gregs[2] != test->expected_r2) {
+        safe_puts("[  FAILED  ] wrong r2");
+        _exit(1);
+    }
+}
+
+#define DEFINE_NX(name, offset) \
+    void name ## _1(int); \
+    void name ## _2(int); \
+    void name ## _exrl(int); \
+    extern const short name ## _end[]; \
+    asm(/* Go to the specified page offset. */ \
+        ".align 0x1000\n" \
+        ".org .+" #offset "\n" \
+        /* %r2 is 0 on entry, overwrite it with 1. */ \
+        ".globl " #name "_1\n" \
+        #name "_1:\n" \
+        ".cfi_startproc\n" \
+        "lgfi %r2,1\n" \
+        /* Overwrite %2 with 2. */ \
+        ".globl " #name "_2\n" \
+        #name "_2:\n" \
+        "lgfi %r2,2\n" \
+        "br %r14\n" \
+        /* End of code. */ \
+        ".globl " #name "_end\n" \
+        #name "_end:\n" \
+        ".cfi_endproc\n" \
+        /* Go to the next page. */ \
+        ".align 0x1000\n" \
+        /* Break alignment. */ \
+        "nopr %r7\n" \
+        ".globl " #name "_exrl\n" \
+        #name "_exrl:\n" \
+        ".cfi_startproc\n" \
+        "exrl %r0," #name "_2\n" \
+        "br %r14\n" \
+        ".cfi_endproc");
+
+/* noexec_1 is executable, noexec_2 is non-executable. */
+DEFINE_NX(noexec, 0xffa);
+
+/*
+ * noexec_cross_1 is executable, noexec_cross_2 crosses non-executable page
+ * boundary.
+ */
+DEFINE_NX(noexec_cross, 0xff8);
+
+/* noexec_full_1 and noexec_full_2 are non-executable. */
+DEFINE_NX(noexec_full, 0x322);
+
+int main(void)
+{
+    struct noexec_test noexec_tests[] = {
+        {
+            .name = "Fallthrough",
+            .func = noexec_1,
+            .page = noexec_2,
+            .expected_si_addr = noexec_2,
+            .arch = {
+                .expected_pswa = noexec_2,
+                .expected_r2 = 1,
+            },
+        },
+        {
+            .name = "Jump",
+            .func = noexec_2,
+            .page = noexec_2,
+            .expected_si_addr = noexec_2,
+            .arch = {
+                .expected_pswa = noexec_2,
+                .expected_r2 = 0,
+            },
+        },
+        {
+            .name = "EXRL",
+            .func = noexec_exrl,
+            .page = noexec_2,
+            .expected_si_addr = PAGE_ALIGN(noexec_end),
+            .arch = {
+                .expected_pswa = noexec_exrl,
+                .expected_r2 = 0,
+            },
+        },
+        {
+            .name = "Fallthrough [cross]",
+            .func = noexec_cross_1,
+            .page = PAGE_ALIGN(noexec_cross_end),
+            .expected_si_addr = PAGE_ALIGN(noexec_cross_end),
+            .arch = {
+                .expected_pswa = noexec_cross_2,
+                .expected_r2 = 1,
+            },
+        },
+        {
+            .name = "Jump [cross]",
+            .func = noexec_cross_2,
+            .page = PAGE_ALIGN(noexec_cross_end),
+            .expected_si_addr = PAGE_ALIGN(noexec_cross_end),
+            .arch = {
+                .expected_pswa = noexec_cross_2,
+                .expected_r2 = 0,
+            },
+        },
+        {
+            .name = "EXRL [cross]",
+            .func = noexec_cross_exrl,
+            .page = PAGE_ALIGN(noexec_cross_end),
+            .expected_si_addr = PAGE_ALIGN(noexec_cross_end),
+            .arch = {
+                .expected_pswa = noexec_cross_exrl,
+                .expected_r2 = 0,
+            },
+        },
+        {
+            .name = "Jump [full]",
+            .func = noexec_full_1,
+            .page = PAGE_ALIGN(noexec_full_1),
+            .expected_si_addr = PAGE_ALIGN(noexec_full_1),
+            .arch = {
+                .expected_pswa = noexec_full_1,
+                .expected_r2 = 0,
+            },
+        },
+    };
+
+    return test_noexec(noexec_tests,
+                       sizeof(noexec_tests) / sizeof(noexec_tests[0]));
+}
diff --git a/tests/tcg/x86_64/Makefile.target b/tests/tcg/x86_64/Makefile.target
index b71a6bcd5e..c0e7e5b005 100644
--- a/tests/tcg/x86_64/Makefile.target
+++ b/tests/tcg/x86_64/Makefile.target
@@ -10,6 +10,7 @@ include $(SRC_PATH)/tests/tcg/i386/Makefile.target
 
 ifeq ($(filter %-linux-user, $(TARGET)),$(TARGET))
 X86_64_TESTS += vsyscall
+X86_64_TESTS += noexec
 TESTS=$(MULTIARCH_TESTS) $(X86_64_TESTS) test-x86_64
 else
 TESTS=$(MULTIARCH_TESTS)
@@ -20,5 +21,5 @@ test-x86_64: LDFLAGS+=-lm -lc
 test-x86_64: test-i386.c test-i386.h test-i386-shift.h test-i386-muldiv.h
 	$(CC) $(CFLAGS) $< -o $@ $(LDFLAGS)
 
-vsyscall: $(SRC_PATH)/tests/tcg/x86_64/vsyscall.c
+%: $(SRC_PATH)/tests/tcg/x86_64/%.c
 	$(CC) $(CFLAGS) $< -o $@ $(LDFLAGS)
diff --git a/tests/tcg/x86_64/noexec.c b/tests/tcg/x86_64/noexec.c
new file mode 100644
index 0000000000..ec07c9f0ba
--- /dev/null
+++ b/tests/tcg/x86_64/noexec.c
@@ -0,0 +1,116 @@
+#define _GNU_SOURCE
+
+struct arch_noexec_test {
+    void *expected_rip;
+    unsigned long expected_rdi;
+};
+
+#include "../multiarch/noexec.h"
+
+static void arch_check_mcontext(const struct arch_noexec_test *test,
+                                const mcontext_t *ctx) {
+    if (ctx->gregs[REG_RIP] != (unsigned long)test->expected_rip) {
+        safe_puts("[  FAILED  ] wrong rip");
+        _exit(1);
+    }
+
+    if (ctx->gregs[REG_RDI] != test->expected_rdi) {
+        safe_puts("[  FAILED  ] wrong rdi");
+        _exit(1);
+    }
+}
+
+#define DEFINE_NX(name, offset) \
+    void name ## _1(int); \
+    void name ## _2(int); \
+    extern const short name ## _end[]; \
+    asm(/* Go to the specified page offset. */ \
+        ".align 0x1000\n" \
+        ".org .+" #offset "\n" \
+        /* %rdi is 0 on entry, overwrite it with 1. */ \
+        ".globl " #name "_1\n" \
+        #name "_1:\n" \
+        ".cfi_startproc\n" \
+        "movq $1,%rdi\n" \
+        /* Overwrite %rdi with 2. */ \
+        ".globl " #name "_2\n" \
+        #name "_2:\n" \
+        "movq $2,%rdi\n" \
+        "ret\n" \
+        /* End of code. */ \
+        ".globl " #name "_end\n" \
+        #name "_end:\n" \
+        ".cfi_endproc\n" \
+        /* Go to the next page. */ \
+        ".align 0x1000");
+
+/* noexec_1 is executable, noexec_2 is non-executable. */
+DEFINE_NX(noexec, 0xff9);
+
+/*
+ * noexec_cross_1 is executable, noexec_cross_2 crosses non-executable page
+ * boundary.
+ */
+DEFINE_NX(noexec_cross, 0xff8);
+
+/* noexec_full_1 and noexec_full_2 are non-executable. */
+DEFINE_NX(noexec_full, 0x321);
+
+int main(void)
+{
+    struct noexec_test noexec_tests[] = {
+        {
+            .name = "Fallthrough",
+            .func = noexec_1,
+            .page = noexec_2,
+            .expected_si_addr = noexec_2,
+            .arch = {
+                .expected_rip = noexec_2,
+                .expected_rdi = 1,
+            },
+        },
+        {
+            .name = "Jump",
+            .func = noexec_2,
+            .page = noexec_2,
+            .expected_si_addr = noexec_2,
+            .arch = {
+                .expected_rip = noexec_2,
+                .expected_rdi = 0,
+            },
+        },
+        {
+            .name = "Fallthrough [cross]",
+            .func = noexec_cross_1,
+            .page = PAGE_ALIGN(noexec_cross_end),
+            .expected_si_addr = PAGE_ALIGN(noexec_cross_end),
+            .arch = {
+                .expected_rip = noexec_cross_2,
+                .expected_rdi = 1,
+            },
+        },
+        {
+            .name = "Jump [cross]",
+            .func = noexec_cross_2,
+            .page = PAGE_ALIGN(noexec_cross_end),
+            .expected_si_addr = PAGE_ALIGN(noexec_cross_end),
+            .arch = {
+                .expected_rip = noexec_cross_2,
+                .expected_rdi = 0,
+            },
+        },
+        {
+            .name = "Jump [full]",
+            .func = noexec_full_1,
+            .page = PAGE_ALIGN(noexec_full_1),
+            .expected_si_addr = noexec_full_1,
+            .arch = {
+                .expected_rip = noexec_full_1,
+                .expected_rdi = 0,
+            },
+        },
+    };
+
+    return test_noexec(noexec_tests,
+                       sizeof(noexec_tests) / sizeof(noexec_tests[0]));
+}
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] linux-user: Fix siginfo_t contents when jumping to non-readable pages
  2022-08-04 18:23 ` [PATCH 1/2] " Ilya Leoshkevich
@ 2022-08-05  8:50   ` Peter Maydell
  2022-08-05 10:28     ` Ilya Leoshkevich
  0 siblings, 1 reply; 6+ messages in thread
From: Peter Maydell @ 2022-08-05  8:50 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Laurent Vivier, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Richard Henderson,
	Paolo Bonzini, David Hildenbrand, qemu-devel, qemu-s390x,
	Christian Borntraeger

On Thu, 4 Aug 2022 at 19:50, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> When the first instruction of a translation block is located in a
> non-readable page, qemu-user fills siginfo_t correctly. For the other
> instructions the result is as if it were the first instruction, which
> is not correct.
>
> The reason is that the current logic expects translate_insn() hook to
> stop at the page boundary. This way only the first instruction can
> cause a SEGV. However, this is quite difficult to properly implement
> when the problematic instruction crosses a page boundary, and indeed
> the actual implementations do not do this. Note that this can also
> break self-modifying code detection when only bytes on the second page
> are modified, but this is outside of the scope of this patch.

Which guests do you observe this on ? I think we should indeed
fix this in the translators. More specifically, I think we should
get this correct already on Arm, and I would expect it to work
correctly on all the fixed-insn-width architectures, which can't
have page-crossing-insns to start with. x86 probably gets this wrong.

thanks
-- PMM


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] linux-user: Fix siginfo_t contents when jumping to non-readable pages
  2022-08-05  8:50   ` Peter Maydell
@ 2022-08-05 10:28     ` Ilya Leoshkevich
  2022-08-05 10:55       ` Peter Maydell
  0 siblings, 1 reply; 6+ messages in thread
From: Ilya Leoshkevich @ 2022-08-05 10:28 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Laurent Vivier, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Richard Henderson,
	Paolo Bonzini, David Hildenbrand, qemu-devel, qemu-s390x,
	Christian Borntraeger

On Fri, 2022-08-05 at 09:50 +0100, Peter Maydell wrote:
> On Thu, 4 Aug 2022 at 19:50, Ilya Leoshkevich <iii@linux.ibm.com>
> wrote:
> > 
> > When the first instruction of a translation block is located in a
> > non-readable page, qemu-user fills siginfo_t correctly. For the
> > other
> > instructions the result is as if it were the first instruction,
> > which
> > is not correct.
> > 
> > The reason is that the current logic expects translate_insn() hook
> > to
> > stop at the page boundary. This way only the first instruction can
> > cause a SEGV. However, this is quite difficult to properly
> > implement
> > when the problematic instruction crosses a page boundary, and
> > indeed
> > the actual implementations do not do this. Note that this can also
> > break self-modifying code detection when only bytes on the second
> > page
> > are modified, but this is outside of the scope of this patch.
> 
> Which guests do you observe this on ? I think we should indeed
> fix this in the translators. More specifically, I think we should
> get this correct already on Arm, and I would expect it to work
> correctly on all the fixed-insn-width architectures, which can't
> have page-crossing-insns to start with. x86 probably gets this wrong.
> 
> thanks
> -- PMM

I first discovered this on s390x, and then realized x86_64 is broken as
well. Fixing this in translators means adding page boundary checks to
all code loads. Actually, on s390x it doesn't look as nasty as I
thought it would, since we quickly figure out the length and load
everything in one place:

$ grep ld.*code target/s390x/tcg/translate.c | wc -l
6

On x86_64 it's as bad as expected:

$ grep x86_ld.*code target/i386/tcg/translate.c | wc -l
96

Implementing this there would mean changing x86_ldub_code() and friends
to macros, and then making sure we undo everything that we did since
the start of the instruction. E.g. bt/bts/btr/btc mix parsing and
op emission. There might be something that touches DisasContext as
well. Therefore I thought that the generic approach from this patch
would be more reliable.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH 1/2] linux-user: Fix siginfo_t contents when jumping to non-readable pages
  2022-08-05 10:28     ` Ilya Leoshkevich
@ 2022-08-05 10:55       ` Peter Maydell
  0 siblings, 0 replies; 6+ messages in thread
From: Peter Maydell @ 2022-08-05 10:55 UTC (permalink / raw)
  To: Ilya Leoshkevich
  Cc: Laurent Vivier, Eduardo Habkost, Marcel Apfelbaum,
	Philippe Mathieu-Daudé, Yanan Wang, Richard Henderson,
	Paolo Bonzini, David Hildenbrand, qemu-devel, qemu-s390x,
	Christian Borntraeger

On Fri, 5 Aug 2022 at 11:28, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> On Fri, 2022-08-05 at 09:50 +0100, Peter Maydell wrote:
> > Which guests do you observe this on ? I think we should indeed
> > fix this in the translators. More specifically, I think we should
> > get this correct already on Arm, and I would expect it to work
> > correctly on all the fixed-insn-width architectures, which can't
> > have page-crossing-insns to start with. x86 probably gets this wrong.

> I first discovered this on s390x, and then realized x86_64 is broken as
> well. Fixing this in translators means adding page boundary checks to
> all code loads. Actually, on s390x it doesn't look as nasty as I
> thought it would, since we quickly figure out the length and load
> everything in one place:
>
> $ grep ld.*code target/s390x/tcg/translate.c | wc -l
> 6

Yes, it depends a bit on the translator and the architecture
how painful it is to get this right. Note that it's OK to
be pessimistic about whether an insn might cross the page
boundary (you end up with a 1-insn TB for it, which is a bit
inefficient but not wrong behaviour). For Arm we check this kind
of thing in insn_crosses_page() (original fix in commit
541ebcd401ee in 2015, cleaned up to be a bit more efficient later).

I think also that this bug is not specific to linux-user.
In a case with a TB like:

 load/store insn that should fault
 other insn that spans page boundary into a non-executable page

ie where the translator failed to break the TB before the
page-boundary-spanning instruction, we will report the fault for
the execution on the non-executable page, when we should have
reported the fault for the load/store, and this happens in system
mode as well as linux-user.

> On x86_64 it's as bad as expected:
>
> $ grep x86_ld.*code target/i386/tcg/translate.c | wc -l
> 96
>
> Implementing this there would mean changing x86_ldub_code() and friends
> to macros, and then making sure we undo everything that we did since
> the start of the instruction. E.g. bt/bts/btr/btc mix parsing and
> op emission. There might be something that touches DisasContext as
> well. Therefore I thought that the generic approach from this patch
> would be more reliable.

No surprise that x86 is a mess :-)

-- PMM


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-08-05 10:57 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-08-04 18:23 [PATCH 0/2] linux-user: Fix siginfo_t contents when jumping to non-readable pages Ilya Leoshkevich
2022-08-04 18:23 ` [PATCH 1/2] " Ilya Leoshkevich
2022-08-05  8:50   ` Peter Maydell
2022-08-05 10:28     ` Ilya Leoshkevich
2022-08-05 10:55       ` Peter Maydell
2022-08-04 18:23 ` [PATCH 2/2] tests/tcg: Test " Ilya Leoshkevich

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).