* [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation
@ 2017-09-27 17:00 David Hildenbrand
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 1/3] accel/tcg: allow to invalidate a write TLB entry immediately David Hildenbrand
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: David Hildenbrand @ 2017-09-27 17:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf,
	Richard Henderson
Details about Low-Address Protection can be found in description of
patch 1 and 2. It is basically a subpage protection of the first two
pages of every address space (for which it is enabled).
We can achieve this by simply directly invalidating the TLB entry and
therefore forcing every write accesses onto these two pages into the slow
path.
With this patch, I can boot Linux just fine (which uses LAP). This also
makes all related kvm-unit-tests that we have pass.
The checks are working that good, that I discovered a STFL bug. STFL
stores into the low addresses but low-address protection does explicitly
not apply. The Linux kernel calls STFL while LAP is active. So without
patch nr 3, booting Linux will fail. (this change is also part of a patch
of my SMP series).
Based on: https://github.com/cohuck/qemu.git s390-next
Available on: https://github.com/dhildenb/qemu.git s390x_lap
David Hildenbrand (3):
  accel/tcg: allow to invalidate a write TLB entry immediately
  s390x/tcg: low-address protection support
  s390x/tcg: make STFL store into the lowcore
 accel/tcg/cputlb.c           |  5 ++-
 accel/tcg/softmmu_template.h |  4 +-
 include/exec/cpu-all.h       |  3 ++
 target/s390x/excp_helper.c   |  3 +-
 target/s390x/helper.h        |  2 +-
 target/s390x/mem_helper.c    |  8 ----
 target/s390x/misc_helper.c   |  7 +++-
 target/s390x/mmu_helper.c    | 96 ++++++++++++++++++++++++++++----------------
 8 files changed, 78 insertions(+), 50 deletions(-)
-- 
2.13.5
^ permalink raw reply	[flat|nested] 21+ messages in thread
* [Qemu-devel] [PATCH RFC 1/3] accel/tcg: allow to invalidate a write TLB entry immediately
  2017-09-27 17:00 [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation David Hildenbrand
@ 2017-09-27 17:00 ` David Hildenbrand
  2017-09-27 17:48   ` Richard Henderson
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support David Hildenbrand
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand @ 2017-09-27 17:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf,
	Richard Henderson, David Hildenbrand
Background: s390x implements Low-Address Protection (LAP). If LAP is
enabled, writing to effective addresses (before any transaltion)
0-511 and 4096-4607 triggers a protection exception.
So we have subpage protection on the first two pages of every address
space (where the lowcore - the CPU private data resides).
By immediately invalidating the write entry but allowing the caller to
continue, we force every write access onto these first two pages into
the slow path. we will get a tlb fault with the specific accessed
addresses and can then evaluate if protection applies or not.
We have to make sure to ignore the invalid bit if tlb_fill() succeeds.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 accel/tcg/cputlb.c           | 5 ++++-
 accel/tcg/softmmu_template.h | 4 ++--
 include/exec/cpu-all.h       | 3 +++
 3 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
index bcbcc4db6c..5bc4233961 100644
--- a/accel/tcg/cputlb.c
+++ b/accel/tcg/cputlb.c
@@ -683,6 +683,9 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
         } else {
             tn.addr_write = address;
         }
+        if (prot & PAGE_WRITE_INV) {
+            tn.addr_write |= TLB_INVALID_MASK;
+        }
     }
 
     /* Pairs with flag setting in tlb_reset_dirty_range */
@@ -967,7 +970,7 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
         if (!VICTIM_TLB_HIT(addr_write, addr)) {
             tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
         }
-        tlb_addr = tlbe->addr_write;
+        tlb_addr = tlbe->addr_write & ~TLB_INVALID_MASK;
     }
 
     /* Check notdirty */
diff --git a/accel/tcg/softmmu_template.h b/accel/tcg/softmmu_template.h
index d7563292a5..3fc5144316 100644
--- a/accel/tcg/softmmu_template.h
+++ b/accel/tcg/softmmu_template.h
@@ -285,7 +285,7 @@ void helper_le_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
         if (!VICTIM_TLB_HIT(addr_write, addr)) {
             tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
         }
-        tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
+        tlb_addr = env->tlb_table[mmu_idx][index].addr_write & ~TLB_INVALID_MASK;
     }
 
     /* Handle an IO access.  */
@@ -361,7 +361,7 @@ void helper_be_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
         if (!VICTIM_TLB_HIT(addr_write, addr)) {
             tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
         }
-        tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
+        tlb_addr = env->tlb_table[mmu_idx][index].addr_write & ~TLB_INVALID_MASK;
     }
 
     /* Handle an IO access.  */
diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index ffe43d5654..24b9509604 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -247,6 +247,9 @@ extern intptr_t qemu_host_page_mask;
 /* original state of the write flag (used when tracking self-modifying
    code */
 #define PAGE_WRITE_ORG 0x0010
+/* Invalidate the TLB entry immediately, helpful for s390x
+ * Low-Address-Protection. Used with PAGE_WRITE in tlb_set_page_with_attrs() */
+#define PAGE_WRITE_INV 0x0040
 #if defined(CONFIG_BSD) && defined(CONFIG_USER_ONLY)
 /* FIXME: Code that sets/uses this is broken and needs to go away.  */
 #define PAGE_RESERVED  0x0020
-- 
2.13.5
^ permalink raw reply related	[flat|nested] 21+ messages in thread
* [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support
  2017-09-27 17:00 [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation David Hildenbrand
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 1/3] accel/tcg: allow to invalidate a write TLB entry immediately David Hildenbrand
@ 2017-09-27 17:00 ` David Hildenbrand
  2017-09-27 17:51   ` Richard Henderson
  2017-09-28  4:50   ` Thomas Huth
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 3/3] s390x/tcg: make STFL store into the lowcore David Hildenbrand
  2017-09-29 11:49 ` [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation Cornelia Huck
  3 siblings, 2 replies; 21+ messages in thread
From: David Hildenbrand @ 2017-09-27 17:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf,
	Richard Henderson, David Hildenbrand
This is a neat way to implement low address protection, whereby
only the first 512 bytes of the first two pages (each 4096 bytes) of
every address space are protected.
Store a tec of 0 for the access exception, this is what is defined by
Enhanced Suppression on Protection in case of a low address protection
(Bit 61 set to 0, rest undefined).
We have to make sure to to pass the access address, not the masked page
address into mmu_translate*().
Drop the check from testblock. So we can properly test this via
kvm-unit-tests.
This will check every access going through one of the MMUs.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/excp_helper.c |  3 +-
 target/s390x/mem_helper.c  |  8 ----
 target/s390x/mmu_helper.c  | 96 +++++++++++++++++++++++++++++-----------------
 3 files changed, 62 insertions(+), 45 deletions(-)
diff --git a/target/s390x/excp_helper.c b/target/s390x/excp_helper.c
index 3e4349d00b..aa0cbf67ac 100644
--- a/target/s390x/excp_helper.c
+++ b/target/s390x/excp_helper.c
@@ -95,7 +95,6 @@ int s390_cpu_handle_mmu_fault(CPUState *cs, vaddr orig_vaddr,
     DPRINTF("%s: address 0x%" VADDR_PRIx " rw %d mmu_idx %d\n",
             __func__, orig_vaddr, rw, mmu_idx);
 
-    orig_vaddr &= TARGET_PAGE_MASK;
     vaddr = orig_vaddr;
 
     if (mmu_idx < MMU_REAL_IDX) {
@@ -127,7 +126,7 @@ int s390_cpu_handle_mmu_fault(CPUState *cs, vaddr orig_vaddr,
     qemu_log_mask(CPU_LOG_MMU, "%s: set tlb %" PRIx64 " -> %" PRIx64 " (%x)\n",
             __func__, (uint64_t)vaddr, (uint64_t)raddr, prot);
 
-    tlb_set_page(cs, orig_vaddr, raddr, prot,
+    tlb_set_page(cs, orig_vaddr & TARGET_PAGE_MASK, raddr, prot,
                  mmu_idx, TARGET_PAGE_SIZE);
 
     return 0;
diff --git a/target/s390x/mem_helper.c b/target/s390x/mem_helper.c
index bbbe1c62b3..69a16867d4 100644
--- a/target/s390x/mem_helper.c
+++ b/target/s390x/mem_helper.c
@@ -1687,18 +1687,10 @@ void HELPER(stctl)(CPUS390XState *env, uint32_t r1, uint64_t a2, uint32_t r3)
 uint32_t HELPER(testblock)(CPUS390XState *env, uint64_t real_addr)
 {
     uintptr_t ra = GETPC();
-    CPUState *cs = CPU(s390_env_get_cpu(env));
     int i;
 
     real_addr = wrap_address(env, real_addr) & TARGET_PAGE_MASK;
 
-    /* Check low-address protection */
-    if ((env->cregs[0] & CR0_LOWPROT) && real_addr < 0x2000) {
-        cpu_restore_state(cs, ra);
-        program_interrupt(env, PGM_PROTECTION, 4);
-        return 1;
-    }
-
     for (i = 0; i < TARGET_PAGE_SIZE; i += 8) {
         cpu_stq_real_ra(env, real_addr + i, 0, ra);
     }
diff --git a/target/s390x/mmu_helper.c b/target/s390x/mmu_helper.c
index 9daa0fd8e2..44a15449d2 100644
--- a/target/s390x/mmu_helper.c
+++ b/target/s390x/mmu_helper.c
@@ -106,6 +106,37 @@ static void trigger_page_fault(CPUS390XState *env, target_ulong vaddr,
     trigger_access_exception(env, type, ilen, tec);
 }
 
+/* check whether the address would be proteted by Low-Address Protection */
+static bool is_low_address(uint64_t addr)
+{
+    return addr < 512 || (addr >= 4096 && addr < 4607);
+}
+
+/* check whether Low-Address Protection is enabled for mmu_translate() */
+static bool lowprot_enabled(const CPUS390XState *env, uint64_t asc)
+{
+    if (!(env->cregs[0] & CR0_LOWPROT)) {
+        return false;
+    }
+    if (!(env->psw.mask & PSW_MASK_DAT)) {
+        return true;
+    }
+
+    /* Check the private-space control bit */
+    switch (asc) {
+    case PSW_ASC_PRIMARY:
+        return !(env->cregs[1] & _ASCE_PRIVATE_SPACE);
+    case PSW_ASC_SECONDARY:
+        return !(env->cregs[7] & _ASCE_PRIVATE_SPACE);
+    case PSW_ASC_HOME:
+        return !(env->cregs[13] & _ASCE_PRIVATE_SPACE);
+    default:
+        /* We don't support access register mode */
+        error_report("unsupported addressing mode");
+        exit(1);
+    }
+}
+
 /**
  * Translate real address to absolute (= physical)
  * address by taking care of the prefix mapping.
@@ -323,6 +354,24 @@ int mmu_translate(CPUS390XState *env, target_ulong vaddr, int rw, uint64_t asc,
     }
 
     *flags = PAGE_READ | PAGE_WRITE | PAGE_EXEC;
+    if (is_low_address(vaddr & TARGET_PAGE_MASK) && lowprot_enabled(env, asc)) {
+        /*
+         * If any part of this page is currently protected, make sure the
+         * TLB entry will not be reused.
+         *
+         * As the protected range is always the first 512 bytes of the
+         * two first pages, we are able to catch all writes to these areas
+         * just by looking at the start address (triggering the tlb miss).
+         */
+        *flags |= PAGE_WRITE_INV;
+        if (is_low_address(vaddr) && rw == MMU_DATA_STORE) {
+            if (exc) {
+                trigger_access_exception(env, PGM_PROTECTION, ILEN_AUTO, 0);
+            }
+            return -EACCES;
+        }
+    }
+
     vaddr &= TARGET_PAGE_MASK;
 
     if (!(env->psw.mask & PSW_MASK_DAT)) {
@@ -392,50 +441,17 @@ int mmu_translate(CPUS390XState *env, target_ulong vaddr, int rw, uint64_t asc,
 }
 
 /**
- * lowprot_enabled: Check whether low-address protection is enabled
- */
-static bool lowprot_enabled(const CPUS390XState *env)
-{
-    if (!(env->cregs[0] & CR0_LOWPROT)) {
-        return false;
-    }
-    if (!(env->psw.mask & PSW_MASK_DAT)) {
-        return true;
-    }
-
-    /* Check the private-space control bit */
-    switch (env->psw.mask & PSW_MASK_ASC) {
-    case PSW_ASC_PRIMARY:
-        return !(env->cregs[1] & _ASCE_PRIVATE_SPACE);
-    case PSW_ASC_SECONDARY:
-        return !(env->cregs[7] & _ASCE_PRIVATE_SPACE);
-    case PSW_ASC_HOME:
-        return !(env->cregs[13] & _ASCE_PRIVATE_SPACE);
-    default:
-        /* We don't support access register mode */
-        error_report("unsupported addressing mode");
-        exit(1);
-    }
-}
-
-/**
  * translate_pages: Translate a set of consecutive logical page addresses
  * to absolute addresses
  */
 static int translate_pages(S390CPU *cpu, vaddr addr, int nr_pages,
                            target_ulong *pages, bool is_write)
 {
-    bool lowprot = is_write && lowprot_enabled(&cpu->env);
     uint64_t asc = cpu->env.psw.mask & PSW_MASK_ASC;
     CPUS390XState *env = &cpu->env;
     int ret, i, pflags;
 
     for (i = 0; i < nr_pages; i++) {
-        /* Low-address protection? */
-        if (lowprot && (addr < 512 || (addr >= 4096 && addr < 4096 + 512))) {
-            trigger_access_exception(env, PGM_PROTECTION, ILEN_AUTO, 0);
-            return -EACCES;
-        }
         ret = mmu_translate(env, addr, is_write, asc, &pages[i], &pflags, true);
         if (ret) {
             return ret;
@@ -509,9 +525,19 @@ int s390_cpu_virt_mem_rw(S390CPU *cpu, vaddr laddr, uint8_t ar, void *hostbuf,
 int mmu_translate_real(CPUS390XState *env, target_ulong raddr, int rw,
                        target_ulong *addr, int *flags)
 {
-    /* TODO: low address protection once we flush the tlb on cr changes */
+    const bool lowprot_enabled = env->cregs[0] & CR0_LOWPROT;
+
     *flags = PAGE_READ | PAGE_WRITE;
-    *addr = mmu_real2abs(env, raddr);
+    if (is_low_address(raddr & TARGET_PAGE_MASK) && lowprot_enabled) {
+        /* see comment in mmu_translate() how this works */
+        *flags |= PAGE_WRITE_INV;
+        if (is_low_address(raddr) && rw == MMU_DATA_STORE) {
+            trigger_access_exception(env, PGM_PROTECTION, ILEN_AUTO, 0);
+            return -EACCES;
+        }
+    }
+
+    *addr = mmu_real2abs(env, raddr & TARGET_PAGE_MASK);
 
     /* TODO: storage key handling */
     return 0;
-- 
2.13.5
^ permalink raw reply related	[flat|nested] 21+ messages in thread
* [Qemu-devel] [PATCH RFC 3/3] s390x/tcg: make STFL store into the lowcore
  2017-09-27 17:00 [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation David Hildenbrand
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 1/3] accel/tcg: allow to invalidate a write TLB entry immediately David Hildenbrand
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support David Hildenbrand
@ 2017-09-27 17:00 ` David Hildenbrand
  2017-09-27 17:52   ` Richard Henderson
  2017-09-29 12:43   ` Cornelia Huck
  2017-09-29 11:49 ` [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation Cornelia Huck
  3 siblings, 2 replies; 21+ messages in thread
From: David Hildenbrand @ 2017-09-27 17:00 UTC (permalink / raw)
  To: qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf,
	Richard Henderson, David Hildenbrand
Using virtual memory access is wrong and will soon include low-address
protection checks, which is to be bypassed for STFL.
This was originally part of a bigger STFL(E) refactoring.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 target/s390x/helper.h      | 2 +-
 target/s390x/misc_helper.c | 7 ++++++-
 2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/target/s390x/helper.h b/target/s390x/helper.h
index 75ba04fc15..52c2963baa 100644
--- a/target/s390x/helper.h
+++ b/target/s390x/helper.h
@@ -104,7 +104,6 @@ DEF_HELPER_FLAGS_5(calc_cc, TCG_CALL_NO_RWG_SE, i32, env, i32, i64, i64, i64)
 DEF_HELPER_FLAGS_2(sfpc, TCG_CALL_NO_RWG, void, env, i64)
 DEF_HELPER_FLAGS_2(sfas, TCG_CALL_NO_WG, void, env, i64)
 DEF_HELPER_FLAGS_1(popcnt, TCG_CALL_NO_RWG_SE, i64, i64)
-DEF_HELPER_FLAGS_1(stfl, TCG_CALL_NO_RWG, void, env)
 DEF_HELPER_2(stfle, i32, env, i64)
 DEF_HELPER_FLAGS_2(lpq, TCG_CALL_NO_WG, i64, env, i64)
 DEF_HELPER_FLAGS_4(stpq, TCG_CALL_NO_WG, void, env, i64, i64, i64)
@@ -153,6 +152,7 @@ DEF_HELPER_FLAGS_3(sturg, TCG_CALL_NO_WG, void, env, i64, i64)
 DEF_HELPER_1(per_check_exception, void, env)
 DEF_HELPER_FLAGS_3(per_branch, TCG_CALL_NO_RWG, void, env, i64, i64)
 DEF_HELPER_FLAGS_2(per_ifetch, TCG_CALL_NO_RWG, void, env, i64)
+DEF_HELPER_FLAGS_1(stfl, TCG_CALL_NO_RWG, void, env)
 
 DEF_HELPER_2(xsch, void, env, i64)
 DEF_HELPER_2(csch, void, env, i64)
diff --git a/target/s390x/misc_helper.c b/target/s390x/misc_helper.c
index 293fc8428a..0b93381188 100644
--- a/target/s390x/misc_helper.c
+++ b/target/s390x/misc_helper.c
@@ -541,13 +541,18 @@ static unsigned do_stfle(CPUS390XState *env, uint64_t words[MAX_STFL_WORDS])
     return max_bit / 64;
 }
 
+#ifndef CONFIG_USER_ONLY
 void HELPER(stfl)(CPUS390XState *env)
 {
     uint64_t words[MAX_STFL_WORDS];
+    LowCore *lowcore;
 
+    lowcore = cpu_map_lowcore(env);
     do_stfle(env, words);
-    cpu_stl_data(env, 200, words[0] >> 32);
+    lowcore->stfl_fac_list = cpu_to_be32(words[0] >> 32);
+    cpu_unmap_lowcore(lowcore);
 }
+#endif
 
 uint32_t HELPER(stfle)(CPUS390XState *env, uint64_t addr)
 {
-- 
2.13.5
^ permalink raw reply related	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 1/3] accel/tcg: allow to invalidate a write TLB entry immediately
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 1/3] accel/tcg: allow to invalidate a write TLB entry immediately David Hildenbrand
@ 2017-09-27 17:48   ` Richard Henderson
  2017-09-27 18:50     ` David Hildenbrand
  2017-10-16  7:24     ` David Hildenbrand
  0 siblings, 2 replies; 21+ messages in thread
From: Richard Henderson @ 2017-09-27 17:48 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf,
	Peter Maydell
On 09/27/2017 10:00 AM, David Hildenbrand wrote:
> Background: s390x implements Low-Address Protection (LAP). If LAP is
> enabled, writing to effective addresses (before any transaltion)
> 0-511 and 4096-4607 triggers a protection exception.
> 
> So we have subpage protection on the first two pages of every address
> space (where the lowcore - the CPU private data resides).
> 
> By immediately invalidating the write entry but allowing the caller to
> continue, we force every write access onto these first two pages into
> the slow path. we will get a tlb fault with the specific accessed
> addresses and can then evaluate if protection applies or not.
> 
> We have to make sure to ignore the invalid bit if tlb_fill() succeeds.
This is similar to a scheme I proposed to PMM wrt handling ARM v8M translation.
 Reusing TLB_INVALID_MASK would appear to work, but I wonder if it wouldn't be
clearer to use another bit.  I believe I had proposed a TLB_FORCE_SLOW_MASK.
Thoughts, Peter?
r~
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support David Hildenbrand
@ 2017-09-27 17:51   ` Richard Henderson
  2017-09-28  4:50   ` Thomas Huth
  1 sibling, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2017-09-27 17:51 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf
On 09/27/2017 10:00 AM, David Hildenbrand wrote:
> +    case PSW_ASC_HOME:
> +        return !(env->cregs[13] & _ASCE_PRIVATE_SPACE);
> +    default:
> +        /* We don't support access register mode */
> +        error_report("unsupported addressing mode");
> +        exit(1);
I think g_assert_not_reached() is sufficient for cases like this.
Oh, it's just code movement.  Nevermind, perhaps.
Modulo the outcome of discussion on patch 1,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 3/3] s390x/tcg: make STFL store into the lowcore
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 3/3] s390x/tcg: make STFL store into the lowcore David Hildenbrand
@ 2017-09-27 17:52   ` Richard Henderson
  2017-09-27 18:46     ` David Hildenbrand
  2017-09-29 12:43   ` Cornelia Huck
  1 sibling, 1 reply; 21+ messages in thread
From: Richard Henderson @ 2017-09-27 17:52 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf
On 09/27/2017 10:00 AM, David Hildenbrand wrote:
> Using virtual memory access is wrong and will soon include low-address
> protection checks, which is to be bypassed for STFL.
> 
> This was originally part of a bigger STFL(E) refactoring.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h      | 2 +-
>  target/s390x/misc_helper.c | 7 ++++++-
>  2 files changed, 7 insertions(+), 2 deletions(-)
Need to sort this patch first, so that the series is bisectable.
>  DEF_HELPER_FLAGS_2(sfpc, TCG_CALL_NO_RWG, void, env, i64)
>  DEF_HELPER_FLAGS_2(sfas, TCG_CALL_NO_WG, void, env, i64)
>  DEF_HELPER_FLAGS_1(popcnt, TCG_CALL_NO_RWG_SE, i64, i64)
> -DEF_HELPER_FLAGS_1(stfl, TCG_CALL_NO_RWG, void, env)
>  DEF_HELPER_2(stfle, i32, env, i64)
>  DEF_HELPER_FLAGS_2(lpq, TCG_CALL_NO_WG, i64, env, i64)
>  DEF_HELPER_FLAGS_4(stpq, TCG_CALL_NO_WG, void, env, i64, i64, i64)
> @@ -153,6 +152,7 @@ DEF_HELPER_FLAGS_3(sturg, TCG_CALL_NO_WG, void, env, i64, i64)
>  DEF_HELPER_1(per_check_exception, void, env)
>  DEF_HELPER_FLAGS_3(per_branch, TCG_CALL_NO_RWG, void, env, i64, i64)
>  DEF_HELPER_FLAGS_2(per_ifetch, TCG_CALL_NO_RWG, void, env, i64)
> +DEF_HELPER_FLAGS_1(stfl, TCG_CALL_NO_RWG, void, env)
>  
Why?  Otherwise,
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
r~
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 3/3] s390x/tcg: make STFL store into the lowcore
  2017-09-27 17:52   ` Richard Henderson
@ 2017-09-27 18:46     ` David Hildenbrand
  2017-09-28  4:23       ` Thomas Huth
  0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand @ 2017-09-27 18:46 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf
On 27.09.2017 19:52, Richard Henderson wrote:
> On 09/27/2017 10:00 AM, David Hildenbrand wrote:
>> Using virtual memory access is wrong and will soon include low-address
>> protection checks, which is to be bypassed for STFL.
>>
>> This was originally part of a bigger STFL(E) refactoring.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  target/s390x/helper.h      | 2 +-
>>  target/s390x/misc_helper.c | 7 ++++++-
>>  2 files changed, 7 insertions(+), 2 deletions(-)
> 
> Need to sort this patch first, so that the series is bisectable.
Right, this should become #2.
> 
>>  DEF_HELPER_FLAGS_2(sfpc, TCG_CALL_NO_RWG, void, env, i64)
>>  DEF_HELPER_FLAGS_2(sfas, TCG_CALL_NO_WG, void, env, i64)
>>  DEF_HELPER_FLAGS_1(popcnt, TCG_CALL_NO_RWG_SE, i64, i64)
>> -DEF_HELPER_FLAGS_1(stfl, TCG_CALL_NO_RWG, void, env)
>>  DEF_HELPER_2(stfle, i32, env, i64)
>>  DEF_HELPER_FLAGS_2(lpq, TCG_CALL_NO_WG, i64, env, i64)
>>  DEF_HELPER_FLAGS_4(stpq, TCG_CALL_NO_WG, void, env, i64, i64, i64)
>> @@ -153,6 +152,7 @@ DEF_HELPER_FLAGS_3(sturg, TCG_CALL_NO_WG, void, env, i64, i64)
>>  DEF_HELPER_1(per_check_exception, void, env)
>>  DEF_HELPER_FLAGS_3(per_branch, TCG_CALL_NO_RWG, void, env, i64, i64)
>>  DEF_HELPER_FLAGS_2(per_ifetch, TCG_CALL_NO_RWG, void, env, i64)
>> +DEF_HELPER_FLAGS_1(stfl, TCG_CALL_NO_RWG, void, env)
>>  
> 
> Why?  Otherwise,
struct LowCore is only available for !CONFIG_USER_ONLY. Therefore I also
have to move the helper declaration into !CONFIG_USER_ONLY.
Thanks!
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> 
> r~
> 
-- 
Thanks,
David
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 1/3] accel/tcg: allow to invalidate a write TLB entry immediately
  2017-09-27 17:48   ` Richard Henderson
@ 2017-09-27 18:50     ` David Hildenbrand
  2017-10-16  7:24     ` David Hildenbrand
  1 sibling, 0 replies; 21+ messages in thread
From: David Hildenbrand @ 2017-09-27 18:50 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf,
	Peter Maydell
On 27.09.2017 19:48, Richard Henderson wrote:
> On 09/27/2017 10:00 AM, David Hildenbrand wrote:
>> Background: s390x implements Low-Address Protection (LAP). If LAP is
>> enabled, writing to effective addresses (before any transaltion)
>> 0-511 and 4096-4607 triggers a protection exception.
>>
>> So we have subpage protection on the first two pages of every address
>> space (where the lowcore - the CPU private data resides).
>>
>> By immediately invalidating the write entry but allowing the caller to
>> continue, we force every write access onto these first two pages into
>> the slow path. we will get a tlb fault with the specific accessed
>> addresses and can then evaluate if protection applies or not.
>>
>> We have to make sure to ignore the invalid bit if tlb_fill() succeeds.
> 
> This is similar to a scheme I proposed to PMM wrt handling ARM v8M translation.
>  Reusing TLB_INVALID_MASK would appear to work, but I wonder if it wouldn't be
> clearer to use another bit.  I believe I had proposed a TLB_FORCE_SLOW_MASK.
The only downside of another bit is that we have to duplicate all
checks. Using TLB_INVALID_MASK avoids this change (because it simply works).
Of course, we could come up with something as simple as
#define TLB_INVALID_MASK (TLB_INVALID | TLB_FORCE_SLOW)
and fixup the few places where TLB_INVALID_MASK really just has to be
TLB_INVALID.
Have no strong opinion on this. This way requires minimal changes.
> 
> Thoughts, Peter?
> 
> 
> r~
> 
-- 
Thanks,
David
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 3/3] s390x/tcg: make STFL store into the lowcore
  2017-09-27 18:46     ` David Hildenbrand
@ 2017-09-28  4:23       ` Thomas Huth
  0 siblings, 0 replies; 21+ messages in thread
From: Thomas Huth @ 2017-09-28  4:23 UTC (permalink / raw)
  To: David Hildenbrand, Richard Henderson, qemu-devel
  Cc: cohuck, Christian Borntraeger, Alexander Graf
On 27.09.2017 20:46, David Hildenbrand wrote:
> On 27.09.2017 19:52, Richard Henderson wrote:
>> On 09/27/2017 10:00 AM, David Hildenbrand wrote:
>>> Using virtual memory access is wrong and will soon include low-address
>>> protection checks, which is to be bypassed for STFL.
>>>
>>> This was originally part of a bigger STFL(E) refactoring.
>>>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>> ---
>>>  target/s390x/helper.h      | 2 +-
>>>  target/s390x/misc_helper.c | 7 ++++++-
>>>  2 files changed, 7 insertions(+), 2 deletions(-)
>>
>> Need to sort this patch first, so that the series is bisectable.
> 
> Right, this should become #2.
> 
>>
>>>  DEF_HELPER_FLAGS_2(sfpc, TCG_CALL_NO_RWG, void, env, i64)
>>>  DEF_HELPER_FLAGS_2(sfas, TCG_CALL_NO_WG, void, env, i64)
>>>  DEF_HELPER_FLAGS_1(popcnt, TCG_CALL_NO_RWG_SE, i64, i64)
>>> -DEF_HELPER_FLAGS_1(stfl, TCG_CALL_NO_RWG, void, env)
>>>  DEF_HELPER_2(stfle, i32, env, i64)
>>>  DEF_HELPER_FLAGS_2(lpq, TCG_CALL_NO_WG, i64, env, i64)
>>>  DEF_HELPER_FLAGS_4(stpq, TCG_CALL_NO_WG, void, env, i64, i64, i64)
>>> @@ -153,6 +152,7 @@ DEF_HELPER_FLAGS_3(sturg, TCG_CALL_NO_WG, void, env, i64, i64)
>>>  DEF_HELPER_1(per_check_exception, void, env)
>>>  DEF_HELPER_FLAGS_3(per_branch, TCG_CALL_NO_RWG, void, env, i64, i64)
>>>  DEF_HELPER_FLAGS_2(per_ifetch, TCG_CALL_NO_RWG, void, env, i64)
>>> +DEF_HELPER_FLAGS_1(stfl, TCG_CALL_NO_RWG, void, env)
>>>  
>>
>> Why?  Otherwise,
> 
> struct LowCore is only available for !CONFIG_USER_ONLY. Therefore I also
> have to move the helper declaration into !CONFIG_USER_ONLY.
You should mention that in the patch description, that it is a
privileged instruction and thus you also mark it here accordingly.
With that update:
Reviewed-by: Thomas Huth <thuth@redhat.com>
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support David Hildenbrand
  2017-09-27 17:51   ` Richard Henderson
@ 2017-09-28  4:50   ` Thomas Huth
  2017-09-28 13:08     ` David Hildenbrand
  1 sibling, 1 reply; 21+ messages in thread
From: Thomas Huth @ 2017-09-28  4:50 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: cohuck, Christian Borntraeger, Alexander Graf, Richard Henderson
On 27.09.2017 19:00, David Hildenbrand wrote:
> This is a neat way to implement low address protection, whereby
> only the first 512 bytes of the first two pages (each 4096 bytes) of
> every address space are protected.
> 
> Store a tec of 0 for the access exception, this is what is defined by
> Enhanced Suppression on Protection in case of a low address protection
> (Bit 61 set to 0, rest undefined).
> 
> We have to make sure to to pass the access address, not the masked page
> address into mmu_translate*().
> 
> Drop the check from testblock. So we can properly test this via
> kvm-unit-tests.
> 
> This will check every access going through one of the MMUs.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/excp_helper.c |  3 +-
>  target/s390x/mem_helper.c  |  8 ----
>  target/s390x/mmu_helper.c  | 96 +++++++++++++++++++++++++++++-----------------
>  3 files changed, 62 insertions(+), 45 deletions(-)
[...]
> diff --git a/target/s390x/mmu_helper.c b/target/s390x/mmu_helper.c
> index 9daa0fd8e2..44a15449d2 100644
> --- a/target/s390x/mmu_helper.c
> +++ b/target/s390x/mmu_helper.c
> @@ -106,6 +106,37 @@ static void trigger_page_fault(CPUS390XState *env, target_ulong vaddr,
>      trigger_access_exception(env, type, ilen, tec);
>  }
>  
> +/* check whether the address would be proteted by Low-Address Protection */
> +static bool is_low_address(uint64_t addr)
> +{
> +    return addr < 512 || (addr >= 4096 && addr < 4607);
> +}
I like the check from the kernel sources better:
static inline int is_low_address(unsigned long ga)
{
    /* Check for address ranges 0..511 and 4096..4607 */
    return (ga & ~0x11fful) == 0;
}
... that might result in slightly faster code (depending on the
compiler, of course).
 Thomas
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support
  2017-09-28  4:50   ` Thomas Huth
@ 2017-09-28 13:08     ` David Hildenbrand
  2017-09-29 11:27       ` Cornelia Huck
  0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand @ 2017-09-28 13:08 UTC (permalink / raw)
  To: Thomas Huth, qemu-devel
  Cc: cohuck, Christian Borntraeger, Alexander Graf, Richard Henderson
On 28.09.2017 06:50, Thomas Huth wrote:
> On 27.09.2017 19:00, David Hildenbrand wrote:
>> This is a neat way to implement low address protection, whereby
>> only the first 512 bytes of the first two pages (each 4096 bytes) of
>> every address space are protected.
>>
>> Store a tec of 0 for the access exception, this is what is defined by
>> Enhanced Suppression on Protection in case of a low address protection
>> (Bit 61 set to 0, rest undefined).
>>
>> We have to make sure to to pass the access address, not the masked page
>> address into mmu_translate*().
>>
>> Drop the check from testblock. So we can properly test this via
>> kvm-unit-tests.
>>
>> This will check every access going through one of the MMUs.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  target/s390x/excp_helper.c |  3 +-
>>  target/s390x/mem_helper.c  |  8 ----
>>  target/s390x/mmu_helper.c  | 96 +++++++++++++++++++++++++++++-----------------
>>  3 files changed, 62 insertions(+), 45 deletions(-)
> [...]
>> diff --git a/target/s390x/mmu_helper.c b/target/s390x/mmu_helper.c
>> index 9daa0fd8e2..44a15449d2 100644
>> --- a/target/s390x/mmu_helper.c
>> +++ b/target/s390x/mmu_helper.c
>> @@ -106,6 +106,37 @@ static void trigger_page_fault(CPUS390XState *env, target_ulong vaddr,
>>      trigger_access_exception(env, type, ilen, tec);
>>  }
>>  
>> +/* check whether the address would be proteted by Low-Address Protection */
>> +static bool is_low_address(uint64_t addr)
>> +{
>> +    return addr < 512 || (addr >= 4096 && addr < 4607);
>> +}
> 
> I like the check from the kernel sources better:
> 
> static inline int is_low_address(unsigned long ga)
> {
>     /* Check for address ranges 0..511 and 4096..4607 */
>     return (ga & ~0x11fful) == 0;
> }
> 
> ... that might result in slightly faster code (depending on the
> compiler, of course).
I think that lim (readability) -> 0. Without that comment you're at
first sight really clueless what this is about.
My check exactly corresponds to the wording in the PoP (and smart
compilers should be able to optimize).
But I don't have a strong opinion on this micro optimization.
Thanks
> 
>  Thomas
> 
-- 
Thanks,
David
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support
  2017-09-28 13:08     ` David Hildenbrand
@ 2017-09-29 11:27       ` Cornelia Huck
  2017-10-12  8:41         ` Thomas Huth
  0 siblings, 1 reply; 21+ messages in thread
From: Cornelia Huck @ 2017-09-29 11:27 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Thomas Huth, qemu-devel, Christian Borntraeger, Alexander Graf,
	Richard Henderson
On Thu, 28 Sep 2017 15:08:11 +0200
David Hildenbrand <david@redhat.com> wrote:
> On 28.09.2017 06:50, Thomas Huth wrote:
> > On 27.09.2017 19:00, David Hildenbrand wrote:  
> >> This is a neat way to implement low address protection, whereby
> >> only the first 512 bytes of the first two pages (each 4096 bytes) of
> >> every address space are protected.
> >>
> >> Store a tec of 0 for the access exception, this is what is defined by
> >> Enhanced Suppression on Protection in case of a low address protection
> >> (Bit 61 set to 0, rest undefined).
> >>
> >> We have to make sure to to pass the access address, not the masked page
> >> address into mmu_translate*().
> >>
> >> Drop the check from testblock. So we can properly test this via
> >> kvm-unit-tests.
> >>
> >> This will check every access going through one of the MMUs.
> >>
> >> Signed-off-by: David Hildenbrand <david@redhat.com>
> >> ---
> >>  target/s390x/excp_helper.c |  3 +-
> >>  target/s390x/mem_helper.c  |  8 ----
> >>  target/s390x/mmu_helper.c  | 96 +++++++++++++++++++++++++++++-----------------
> >>  3 files changed, 62 insertions(+), 45 deletions(-)  
> > [...]  
> >> diff --git a/target/s390x/mmu_helper.c b/target/s390x/mmu_helper.c
> >> index 9daa0fd8e2..44a15449d2 100644
> >> --- a/target/s390x/mmu_helper.c
> >> +++ b/target/s390x/mmu_helper.c
> >> @@ -106,6 +106,37 @@ static void trigger_page_fault(CPUS390XState *env, target_ulong vaddr,
> >>      trigger_access_exception(env, type, ilen, tec);
> >>  }
> >>  
> >> +/* check whether the address would be proteted by Low-Address Protection */
> >> +static bool is_low_address(uint64_t addr)
> >> +{
> >> +    return addr < 512 || (addr >= 4096 && addr < 4607);
> >> +}  
> > 
> > I like the check from the kernel sources better:
> > 
> > static inline int is_low_address(unsigned long ga)
> > {
> >     /* Check for address ranges 0..511 and 4096..4607 */
> >     return (ga & ~0x11fful) == 0;
> > }
> > 
> > ... that might result in slightly faster code (depending on the
> > compiler, of course).  
> 
> I think that lim (readability) -> 0. Without that comment you're at
> first sight really clueless what this is about.
> 
> My check exactly corresponds to the wording in the PoP (and smart
> compilers should be able to optimize).
> 
> But I don't have a strong opinion on this micro optimization.
FWIW, I'd be happy with both, but has anyone actually looked at the
generated code?
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation
  2017-09-27 17:00 [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation David Hildenbrand
                   ` (2 preceding siblings ...)
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 3/3] s390x/tcg: make STFL store into the lowcore David Hildenbrand
@ 2017-09-29 11:49 ` Cornelia Huck
  2017-09-29 12:09   ` David Hildenbrand
  3 siblings, 1 reply; 21+ messages in thread
From: Cornelia Huck @ 2017-09-29 11:49 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: qemu-devel, thuth, Christian Borntraeger, Alexander Graf,
	Richard Henderson
On Wed, 27 Sep 2017 19:00:24 +0200
David Hildenbrand <david@redhat.com> wrote:
> Details about Low-Address Protection can be found in description of
> patch 1 and 2. It is basically a subpage protection of the first two
> pages of every address space (for which it is enabled).
> 
> We can achieve this by simply directly invalidating the TLB entry and
> therefore forcing every write accesses onto these two pages into the slow
> path.
> 
> With this patch, I can boot Linux just fine (which uses LAP). This also
> makes all related kvm-unit-tests that we have pass.
> 
> The checks are working that good, that I discovered a STFL bug. STFL
> stores into the low addresses but low-address protection does explicitly
> not apply. The Linux kernel calls STFL while LAP is active. So without
> patch nr 3, booting Linux will fail. (this change is also part of a patch
> of my SMP series).
I fear I have lost track a bit with all those patches floating around.
IIUC, patch 3 fixes a real bug that is only exposed by your LAP
changes. It used to be part of the stfl changes in v1 of your smp
series but is no longer in v2. So, is this a patch that can be applied
to current s390-next?
> 
> Based on: https://github.com/cohuck/qemu.git s390-next
> Available on: https://github.com/dhildenb/qemu.git s390x_lap
> 
> 
> David Hildenbrand (3):
>   accel/tcg: allow to invalidate a write TLB entry immediately
>   s390x/tcg: low-address protection support
>   s390x/tcg: make STFL store into the lowcore
> 
>  accel/tcg/cputlb.c           |  5 ++-
>  accel/tcg/softmmu_template.h |  4 +-
>  include/exec/cpu-all.h       |  3 ++
>  target/s390x/excp_helper.c   |  3 +-
>  target/s390x/helper.h        |  2 +-
>  target/s390x/mem_helper.c    |  8 ----
>  target/s390x/misc_helper.c   |  7 +++-
>  target/s390x/mmu_helper.c    | 96 ++++++++++++++++++++++++++++----------------
>  8 files changed, 78 insertions(+), 50 deletions(-)
> 
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation
  2017-09-29 11:49 ` [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation Cornelia Huck
@ 2017-09-29 12:09   ` David Hildenbrand
  2017-09-29 12:13     ` Cornelia Huck
  0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand @ 2017-09-29 12:09 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: qemu-devel, thuth, Christian Borntraeger, Alexander Graf,
	Richard Henderson
On 29.09.2017 13:49, Cornelia Huck wrote:
> On Wed, 27 Sep 2017 19:00:24 +0200
> David Hildenbrand <david@redhat.com> wrote:
> 
>> Details about Low-Address Protection can be found in description of
>> patch 1 and 2. It is basically a subpage protection of the first two
>> pages of every address space (for which it is enabled).
>>
>> We can achieve this by simply directly invalidating the TLB entry and
>> therefore forcing every write accesses onto these two pages into the slow
>> path.
>>
>> With this patch, I can boot Linux just fine (which uses LAP). This also
>> makes all related kvm-unit-tests that we have pass.
>>
>> The checks are working that good, that I discovered a STFL bug. STFL
>> stores into the low addresses but low-address protection does explicitly
>> not apply. The Linux kernel calls STFL while LAP is active. So without
>> patch nr 3, booting Linux will fail. (this change is also part of a patch
>> of my SMP series).
> 
> I fear I have lost track a bit with all those patches floating around.
> IIUC, patch 3 fixes a real bug that is only exposed by your LAP
> changes. It used to be part of the stfl changes in v1 of your smp
> series but is no longer in v2. So, is this a patch that can be applied
> to current s390-next?
> 
The SMP series is based on both, this series and the CPU cleanup series
you already picked up.
You can apply Patch 3 with the following modified description (requested
by Thomas):
s390x/tcg: make STFL store into the lowcore
Using virtual memory access is wrong and will soon include low-address
protection checks, which is to be bypassed for STFL.
STFL is a privileged instruction and using LowCore requires
!CONFIG_USER_ONLY, so add the ifdef and move the declaration to the
right place.
This was originally part of a bigger STFL(E) refactoring.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
The other two patches, I will resend once we know how to handle the TLB
invalidation.
>>
>> Based on: https://github.com/cohuck/qemu.git s390-next
>> Available on: https://github.com/dhildenb/qemu.git s390x_lap
>>
>>
>> David Hildenbrand (3):
>>   accel/tcg: allow to invalidate a write TLB entry immediately
>>   s390x/tcg: low-address protection support
>>   s390x/tcg: make STFL store into the lowcore
>>
>>  accel/tcg/cputlb.c           |  5 ++-
>>  accel/tcg/softmmu_template.h |  4 +-
>>  include/exec/cpu-all.h       |  3 ++
>>  target/s390x/excp_helper.c   |  3 +-
>>  target/s390x/helper.h        |  2 +-
>>  target/s390x/mem_helper.c    |  8 ----
>>  target/s390x/misc_helper.c   |  7 +++-
>>  target/s390x/mmu_helper.c    | 96 ++++++++++++++++++++++++++++----------------
>>  8 files changed, 78 insertions(+), 50 deletions(-)
>>
> 
-- 
Thanks,
David
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation
  2017-09-29 12:09   ` David Hildenbrand
@ 2017-09-29 12:13     ` Cornelia Huck
  0 siblings, 0 replies; 21+ messages in thread
From: Cornelia Huck @ 2017-09-29 12:13 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: qemu-devel, thuth, Christian Borntraeger, Alexander Graf,
	Richard Henderson
On Fri, 29 Sep 2017 14:09:04 +0200
David Hildenbrand <david@redhat.com> wrote:
> On 29.09.2017 13:49, Cornelia Huck wrote:
> > On Wed, 27 Sep 2017 19:00:24 +0200
> > David Hildenbrand <david@redhat.com> wrote:
> >   
> >> Details about Low-Address Protection can be found in description of
> >> patch 1 and 2. It is basically a subpage protection of the first two
> >> pages of every address space (for which it is enabled).
> >>
> >> We can achieve this by simply directly invalidating the TLB entry and
> >> therefore forcing every write accesses onto these two pages into the slow
> >> path.
> >>
> >> With this patch, I can boot Linux just fine (which uses LAP). This also
> >> makes all related kvm-unit-tests that we have pass.
> >>
> >> The checks are working that good, that I discovered a STFL bug. STFL
> >> stores into the low addresses but low-address protection does explicitly
> >> not apply. The Linux kernel calls STFL while LAP is active. So without
> >> patch nr 3, booting Linux will fail. (this change is also part of a patch
> >> of my SMP series).  
> > 
> > I fear I have lost track a bit with all those patches floating around.
> > IIUC, patch 3 fixes a real bug that is only exposed by your LAP
> > changes. It used to be part of the stfl changes in v1 of your smp
> > series but is no longer in v2. So, is this a patch that can be applied
> > to current s390-next?
> >   
> 
> The SMP series is based on both, this series and the CPU cleanup series
> you already picked up.
> 
> You can apply Patch 3 with the following modified description (requested
> by Thomas):
> 
> 
> s390x/tcg: make STFL store into the lowcore
> 
> Using virtual memory access is wrong and will soon include low-address
> protection checks, which is to be bypassed for STFL.
> 
> STFL is a privileged instruction and using LowCore requires
> !CONFIG_USER_ONLY, so add the ifdef and move the declaration to the
> right place.
> 
> This was originally part of a bigger STFL(E) refactoring.
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> Reviewed-by: Thomas Huth <thuth@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> 
> The other two patches, I will resend once we know how to handle the TLB
> invalidation.
OK, thanks for the clarification!
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 3/3] s390x/tcg: make STFL store into the lowcore
  2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 3/3] s390x/tcg: make STFL store into the lowcore David Hildenbrand
  2017-09-27 17:52   ` Richard Henderson
@ 2017-09-29 12:43   ` Cornelia Huck
  1 sibling, 0 replies; 21+ messages in thread
From: Cornelia Huck @ 2017-09-29 12:43 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: qemu-devel, thuth, Christian Borntraeger, Alexander Graf,
	Richard Henderson
On Wed, 27 Sep 2017 19:00:27 +0200
David Hildenbrand <david@redhat.com> wrote:
> Using virtual memory access is wrong and will soon include low-address
> protection checks, which is to be bypassed for STFL.
> 
> This was originally part of a bigger STFL(E) refactoring.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  target/s390x/helper.h      | 2 +-
>  target/s390x/misc_helper.c | 7 ++++++-
>  2 files changed, 7 insertions(+), 2 deletions(-)
Thanks, applied (with updated patch description).
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support
  2017-09-29 11:27       ` Cornelia Huck
@ 2017-10-12  8:41         ` Thomas Huth
  2017-10-16  7:20           ` David Hildenbrand
  0 siblings, 1 reply; 21+ messages in thread
From: Thomas Huth @ 2017-10-12  8:41 UTC (permalink / raw)
  To: Cornelia Huck, David Hildenbrand
  Cc: qemu-devel, Christian Borntraeger, Alexander Graf,
	Richard Henderson, qemu-s390x
On 29.09.2017 13:27, Cornelia Huck wrote:
> On Thu, 28 Sep 2017 15:08:11 +0200
> David Hildenbrand <david@redhat.com> wrote:
> 
>> On 28.09.2017 06:50, Thomas Huth wrote:
>>> On 27.09.2017 19:00, David Hildenbrand wrote:  
>>>> This is a neat way to implement low address protection, whereby
>>>> only the first 512 bytes of the first two pages (each 4096 bytes) of
>>>> every address space are protected.
>>>>
>>>> Store a tec of 0 for the access exception, this is what is defined by
>>>> Enhanced Suppression on Protection in case of a low address protection
>>>> (Bit 61 set to 0, rest undefined).
>>>>
>>>> We have to make sure to to pass the access address, not the masked page
>>>> address into mmu_translate*().
>>>>
>>>> Drop the check from testblock. So we can properly test this via
>>>> kvm-unit-tests.
>>>>
>>>> This will check every access going through one of the MMUs.
>>>>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>> ---
>>>>  target/s390x/excp_helper.c |  3 +-
>>>>  target/s390x/mem_helper.c  |  8 ----
>>>>  target/s390x/mmu_helper.c  | 96 +++++++++++++++++++++++++++++-----------------
>>>>  3 files changed, 62 insertions(+), 45 deletions(-)  
>>> [...]  
>>>> diff --git a/target/s390x/mmu_helper.c b/target/s390x/mmu_helper.c
>>>> index 9daa0fd8e2..44a15449d2 100644
>>>> --- a/target/s390x/mmu_helper.c
>>>> +++ b/target/s390x/mmu_helper.c
>>>> @@ -106,6 +106,37 @@ static void trigger_page_fault(CPUS390XState *env, target_ulong vaddr,
>>>>      trigger_access_exception(env, type, ilen, tec);
>>>>  }
>>>>  
>>>> +/* check whether the address would be proteted by Low-Address Protection */
>>>> +static bool is_low_address(uint64_t addr)
>>>> +{
>>>> +    return addr < 512 || (addr >= 4096 && addr < 4607);
>>>> +}  
>>>
>>> I like the check from the kernel sources better:
>>>
>>> static inline int is_low_address(unsigned long ga)
>>> {
>>>     /* Check for address ranges 0..511 and 4096..4607 */
>>>     return (ga & ~0x11fful) == 0;
>>> }
>>>
>>> ... that might result in slightly faster code (depending on the
>>> compiler, of course).  
>>
>> I think that lim (readability) -> 0. Without that comment you're at
>> first sight really clueless what this is about.
>>
>> My check exactly corresponds to the wording in the PoP (and smart
>> compilers should be able to optimize).
>>
>> But I don't have a strong opinion on this micro optimization.
> 
> FWIW, I'd be happy with both, but has anyone actually looked at the
> generated code?
This is what I get for David's original code:
    80000510:   c4 18 00 00 0d a4       lgrl    %r1,80002058 <x1>
    80000516:   a7 29 01 ff             lghi    %r2,511
    8000051a:   ec 12 00 4f c0 65       clgrjnh %r1,%r2,800005b8 <main+0xd8>
    80000520:   a7 1b f0 00             aghi    %r1,-4096
    80000524:   c2 1e 00 00 01 fe       clgfi   %r1,510
    8000052a:   a7 18 00 00             lhi     %r1,0
    8000052e:   b9 99 00 11             slbr    %r1,%r1
    80000532:   13 11                   lcr     %r1,%r1
    80000534:   c4 1f 00 00 0d 96       strl    %r1,80002060 <b1>
And this is the optimized kernel version:
    8000054a:   c4 18 00 00 0d 7f       lgrl    %r1,80002048 <x2>
    80000550:   a5 17 ee 00             nill    %r1,60928
    80000554:   b9 00 00 11             lpgr    %r1,%r1
    80000558:   a7 1b ff ff             aghi    %r1,-1
    8000055c:   eb 11 00 3f 00 0c       srlg    %r1,%r1,63
    80000562:   c4 1f 00 00 0d 77       strl    %r1,80002050 <b2>
So that's indeed a little bit better :-)
(I was using GCC 4.8.5 from RHEL7, with -O2)
By the way, I think there's a bug in David's code: It should either be
"addr <= 4607" or "addr < 4608" instead of "addr < 4607".
With that bug fixed, David's version get's optimized even more:
    80000510:   c4 18 00 00 0d a4       lgrl    %r1,80002058 <x1>
    80000516:   a5 17 ef ff             nill    %r1,61439
    8000051a:   c2 1e 00 00 01 ff       clgfi   %r1,511
    80000520:   a7 18 00 00             lhi     %r1,0
    80000524:   b9 99 00 11             slbr    %r1,%r1
    80000528:   13 11                   lcr     %r1,%r1
    8000052a:   c4 1f 00 00 0d 9b       strl    %r1,80002060 <b1>
... so the difference is really very minimal in that case --> We could
really use the more readable version, I think.
 Thomas
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support
  2017-10-12  8:41         ` Thomas Huth
@ 2017-10-16  7:20           ` David Hildenbrand
  0 siblings, 0 replies; 21+ messages in thread
From: David Hildenbrand @ 2017-10-16  7:20 UTC (permalink / raw)
  To: Thomas Huth, Cornelia Huck
  Cc: qemu-devel, Christian Borntraeger, Alexander Graf,
	Richard Henderson, qemu-s390x
On 12.10.2017 10:41, Thomas Huth wrote:
> On 29.09.2017 13:27, Cornelia Huck wrote:
>> On Thu, 28 Sep 2017 15:08:11 +0200
>> David Hildenbrand <david@redhat.com> wrote:
>>
>>> On 28.09.2017 06:50, Thomas Huth wrote:
>>>> On 27.09.2017 19:00, David Hildenbrand wrote:  
>>>>> This is a neat way to implement low address protection, whereby
>>>>> only the first 512 bytes of the first two pages (each 4096 bytes) of
>>>>> every address space are protected.
>>>>>
>>>>> Store a tec of 0 for the access exception, this is what is defined by
>>>>> Enhanced Suppression on Protection in case of a low address protection
>>>>> (Bit 61 set to 0, rest undefined).
>>>>>
>>>>> We have to make sure to to pass the access address, not the masked page
>>>>> address into mmu_translate*().
>>>>>
>>>>> Drop the check from testblock. So we can properly test this via
>>>>> kvm-unit-tests.
>>>>>
>>>>> This will check every access going through one of the MMUs.
>>>>>
>>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>>> ---
>>>>>  target/s390x/excp_helper.c |  3 +-
>>>>>  target/s390x/mem_helper.c  |  8 ----
>>>>>  target/s390x/mmu_helper.c  | 96 +++++++++++++++++++++++++++++-----------------
>>>>>  3 files changed, 62 insertions(+), 45 deletions(-)  
>>>> [...]  
>>>>> diff --git a/target/s390x/mmu_helper.c b/target/s390x/mmu_helper.c
>>>>> index 9daa0fd8e2..44a15449d2 100644
>>>>> --- a/target/s390x/mmu_helper.c
>>>>> +++ b/target/s390x/mmu_helper.c
>>>>> @@ -106,6 +106,37 @@ static void trigger_page_fault(CPUS390XState *env, target_ulong vaddr,
>>>>>      trigger_access_exception(env, type, ilen, tec);
>>>>>  }
>>>>>  
>>>>> +/* check whether the address would be proteted by Low-Address Protection */
>>>>> +static bool is_low_address(uint64_t addr)
>>>>> +{
>>>>> +    return addr < 512 || (addr >= 4096 && addr < 4607);
>>>>> +}  
>>>>
>>>> I like the check from the kernel sources better:
>>>>
>>>> static inline int is_low_address(unsigned long ga)
>>>> {
>>>>     /* Check for address ranges 0..511 and 4096..4607 */
>>>>     return (ga & ~0x11fful) == 0;
>>>> }
>>>>
>>>> ... that might result in slightly faster code (depending on the
>>>> compiler, of course).  
>>>
>>> I think that lim (readability) -> 0. Without that comment you're at
>>> first sight really clueless what this is about.
>>>
>>> My check exactly corresponds to the wording in the PoP (and smart
>>> compilers should be able to optimize).
>>>
>>> But I don't have a strong opinion on this micro optimization.
>>
>> FWIW, I'd be happy with both, but has anyone actually looked at the
>> generated code?
> 
> This is what I get for David's original code:
> 
>     80000510:   c4 18 00 00 0d a4       lgrl    %r1,80002058 <x1>
>     80000516:   a7 29 01 ff             lghi    %r2,511
>     8000051a:   ec 12 00 4f c0 65       clgrjnh %r1,%r2,800005b8 <main+0xd8>
>     80000520:   a7 1b f0 00             aghi    %r1,-4096
>     80000524:   c2 1e 00 00 01 fe       clgfi   %r1,510
>     8000052a:   a7 18 00 00             lhi     %r1,0
>     8000052e:   b9 99 00 11             slbr    %r1,%r1
>     80000532:   13 11                   lcr     %r1,%r1
>     80000534:   c4 1f 00 00 0d 96       strl    %r1,80002060 <b1>
> 
> And this is the optimized kernel version:
> 
>     8000054a:   c4 18 00 00 0d 7f       lgrl    %r1,80002048 <x2>
>     80000550:   a5 17 ee 00             nill    %r1,60928
>     80000554:   b9 00 00 11             lpgr    %r1,%r1
>     80000558:   a7 1b ff ff             aghi    %r1,-1
>     8000055c:   eb 11 00 3f 00 0c       srlg    %r1,%r1,63
>     80000562:   c4 1f 00 00 0d 77       strl    %r1,80002050 <b2>
> 
> So that's indeed a little bit better :-)
> (I was using GCC 4.8.5 from RHEL7, with -O2)
> 
> By the way, I think there's a bug in David's code: It should either be
> "addr <= 4607" or "addr < 4608" instead of "addr < 4607".
> 
> With that bug fixed, David's version get's optimized even more:
> 
>     80000510:   c4 18 00 00 0d a4       lgrl    %r1,80002058 <x1>
>     80000516:   a5 17 ef ff             nill    %r1,61439
>     8000051a:   c2 1e 00 00 01 ff       clgfi   %r1,511
>     80000520:   a7 18 00 00             lhi     %r1,0
>     80000524:   b9 99 00 11             slbr    %r1,%r1
>     80000528:   13 11                   lcr     %r1,%r1
>     8000052a:   c4 1f 00 00 0d 9b       strl    %r1,80002060 <b1>
> 
> ... so the difference is really very minimal in that case --> We could
> really use the more readable version, I think.
> 
>  Thomas
> 
Very right, I'll fix that.
Nice way to find BUGs - comparing generated code :)
Thanks!
-- 
Thanks,
David
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 1/3] accel/tcg: allow to invalidate a write TLB entry immediately
  2017-09-27 17:48   ` Richard Henderson
  2017-09-27 18:50     ` David Hildenbrand
@ 2017-10-16  7:24     ` David Hildenbrand
  2017-10-16 18:06       ` Richard Henderson
  1 sibling, 1 reply; 21+ messages in thread
From: David Hildenbrand @ 2017-10-16  7:24 UTC (permalink / raw)
  To: Richard Henderson, qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf,
	Peter Maydell
On 27.09.2017 19:48, Richard Henderson wrote:
> On 09/27/2017 10:00 AM, David Hildenbrand wrote:
>> Background: s390x implements Low-Address Protection (LAP). If LAP is
>> enabled, writing to effective addresses (before any transaltion)
>> 0-511 and 4096-4607 triggers a protection exception.
>>
>> So we have subpage protection on the first two pages of every address
>> space (where the lowcore - the CPU private data resides).
>>
>> By immediately invalidating the write entry but allowing the caller to
>> continue, we force every write access onto these first two pages into
>> the slow path. we will get a tlb fault with the specific accessed
>> addresses and can then evaluate if protection applies or not.
>>
>> We have to make sure to ignore the invalid bit if tlb_fill() succeeds.
> 
> This is similar to a scheme I proposed to PMM wrt handling ARM v8M translation.
>  Reusing TLB_INVALID_MASK would appear to work, but I wonder if it wouldn't be
> clearer to use another bit.  I believe I had proposed a TLB_FORCE_SLOW_MASK.
> 
> Thoughts, Peter?
As two weeks have passed:
Any further opinions? Richard, how do you want me to continue with this?
Thanks!
-- 
Thanks,
David
^ permalink raw reply	[flat|nested] 21+ messages in thread
* Re: [Qemu-devel] [PATCH RFC 1/3] accel/tcg: allow to invalidate a write TLB entry immediately
  2017-10-16  7:24     ` David Hildenbrand
@ 2017-10-16 18:06       ` Richard Henderson
  0 siblings, 0 replies; 21+ messages in thread
From: Richard Henderson @ 2017-10-16 18:06 UTC (permalink / raw)
  To: David Hildenbrand, qemu-devel
  Cc: thuth, cohuck, Christian Borntraeger, Alexander Graf,
	Peter Maydell
On 10/16/2017 12:24 AM, David Hildenbrand wrote:
> On 27.09.2017 19:48, Richard Henderson wrote:
>> On 09/27/2017 10:00 AM, David Hildenbrand wrote:
>>> Background: s390x implements Low-Address Protection (LAP). If LAP is
>>> enabled, writing to effective addresses (before any transaltion)
>>> 0-511 and 4096-4607 triggers a protection exception.
>>>
>>> So we have subpage protection on the first two pages of every address
>>> space (where the lowcore - the CPU private data resides).
>>>
>>> By immediately invalidating the write entry but allowing the caller to
>>> continue, we force every write access onto these first two pages into
>>> the slow path. we will get a tlb fault with the specific accessed
>>> addresses and can then evaluate if protection applies or not.
>>>
>>> We have to make sure to ignore the invalid bit if tlb_fill() succeeds.
>>
>> This is similar to a scheme I proposed to PMM wrt handling ARM v8M translation.
>>  Reusing TLB_INVALID_MASK would appear to work, but I wonder if it wouldn't be
>> clearer to use another bit.  I believe I had proposed a TLB_FORCE_SLOW_MASK.
>>
>> Thoughts, Peter?
> 
> As two weeks have passed:
> 
> Any further opinions? Richard, how do you want me to continue with this?
Let's just go ahead with TLB_INVALID_MASK; we'll revisit if it gets to be
confusing.
r~
^ permalink raw reply	[flat|nested] 21+ messages in thread
end of thread, other threads:[~2017-10-16 18:06 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-09-27 17:00 [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation David Hildenbrand
2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 1/3] accel/tcg: allow to invalidate a write TLB entry immediately David Hildenbrand
2017-09-27 17:48   ` Richard Henderson
2017-09-27 18:50     ` David Hildenbrand
2017-10-16  7:24     ` David Hildenbrand
2017-10-16 18:06       ` Richard Henderson
2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 2/3] s390x/tcg: low-address protection support David Hildenbrand
2017-09-27 17:51   ` Richard Henderson
2017-09-28  4:50   ` Thomas Huth
2017-09-28 13:08     ` David Hildenbrand
2017-09-29 11:27       ` Cornelia Huck
2017-10-12  8:41         ` Thomas Huth
2017-10-16  7:20           ` David Hildenbrand
2017-09-27 17:00 ` [Qemu-devel] [PATCH RFC 3/3] s390x/tcg: make STFL store into the lowcore David Hildenbrand
2017-09-27 17:52   ` Richard Henderson
2017-09-27 18:46     ` David Hildenbrand
2017-09-28  4:23       ` Thomas Huth
2017-09-29 12:43   ` Cornelia Huck
2017-09-29 11:49 ` [Qemu-devel] [PATCH RFC 0/3] s390x/tcg: LAP support using immediate TLB invalidation Cornelia Huck
2017-09-29 12:09   ` David Hildenbrand
2017-09-29 12:13     ` Cornelia Huck
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).