[Qemu-devel] Making cputlb.c operations safe for MTTCG

All of lore.kernel.org
 help / color / mirror / Atom feed

* [Qemu-devel] Making cputlb.c operations safe for MTTCG
@ 2016-07-26 12:09 Alex Bennée
  2016-07-26 14:03 ` [Qemu-devel] [PATCH] cputlb: make tlb_reset_dirty " Alex Bennée
  2016-08-01 14:54 ` [Qemu-devel] Making cputlb.c operations " Paolo Bonzini
  0 siblings, 2 replies; 14+ messages in thread
From: Alex Bennée @ 2016-07-26 12:09 UTC (permalink / raw)
  To: MTTCG Devel, QEMU Developers, Paolo Bonzini, Richard Henderson,
	Sergey Fedorov, Emilio G. Cota, Frederic Konrad, a.rigo

Hi,

While I've been re-spinning the base patches I've brought forward some
of the async work for cputlb done on the ARM enabling set. Thanks to
Sergey's consolidation work we have a robust mechanism for halting all
vCPUs to get work done if we need to. The cputlb changes are actually
independent of any specific architecture fixes needed so it makes sense
to fix them all in the base set. This works well for the various
tlb_flush type operations.

Going through cputlb though I have come across one use case where
deferring the work until later seems like a potential bottleneck and
also introduces a potential race.

When we do code generation we use tlb_protect_code() to set the region
as DIRTY_MEMORY_CODE and update the SoftMMU TLB flags to force the slow
path if anything tries to write to areas which have generated code
blocks associated with them. This operation is intrinsically cross-vCPU
as any vCPU writing to the code needs to be trapped:

    static void tlb_reset_dirty_range_all(ram_addr_t start, ram_addr_t length)
    {
        CPUState *cpu;
        ram_addr_t start1;
        RAMBlock *block;
        ram_addr_t end;

        end = TARGET_PAGE_ALIGN(start + length);
        start &= TARGET_PAGE_MASK;

        rcu_read_lock();
        block = qemu_get_ram_block(start);
        assert(block == qemu_get_ram_block(end - 1));
        start1 = (uintptr_t)ramblock_ptr(block, start - block->offset);
        CPU_FOREACH(cpu) {
            tlb_reset_dirty(cpu, start1, length);
        }
        rcu_read_unlock();
    }

If we defer the updating of the other vCPUs to later we'll introduce a
potential race which while I'm sure would be tricky to hit could result
in for example a guest probe not getting picked up if placed just after
code generation.

As the eventual operation is the setting of a flag I'm wondering if we
can simply use atomic primitives to ensure we don't corrupt the lookup
address when setting the TLB_NOTDIRTY flag?

Of course the TLB structure itself covers a number of values but AFAICT
erroneously setting TLB_NOTDIRTY on a entry that gets updated to a new
address wouldn't cause a problem except triggering an additional
slow-path write. If we are careful about the filling of the TLB entries
can we be sure we are always safe?

I hope to have some patches to show by the end of the week.

--
Alex Bennée

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [Qemu-devel] [PATCH] cputlb: make tlb_reset_dirty safe for MTTCG
  2016-07-26 12:09 [Qemu-devel] Making cputlb.c operations safe for MTTCG Alex Bennée
@ 2016-07-26 14:03 ` Alex Bennée
  2016-08-01 14:54 ` [Qemu-devel] Making cputlb.c operations " Paolo Bonzini
  1 sibling, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2016-07-26 14:03 UTC (permalink / raw)
  To: MTTCG Devel, QEMU Developers, Paolo Bonzini, Richard Henderson,
	Sergey Fedorov, Emilio G. Cota, Frederic Konrad, a.rigo


The main use case for tlb_reset_dirty is to set the TLB_NOTDIRTY flags
in TLB entries to force the slow-path on writes. This is used to mark
page ranges containing code which has been translated so it can be
invalidated if written to. To do this safely we need to ensure the TLB
entries in question for all vCPUs are updated before we attempt to run
the code otherwise a race could be introduced.

To achieve this we atomically set the flag in tlb_reset_dirty_range and
take care when setting it when the TLB entry is filled.

The helper function is made static as it isn't used outside of cputlb.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 cputlb.c              | 57 ++++++++++++++++++++++++++++++++++++---------------
 include/exec/cputlb.h |  2 --
 2 files changed, 40 insertions(+), 19 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index e0d5bdd..e7b6a08 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -275,32 +275,52 @@ void tlb_unprotect_code(ram_addr_t ram_addr)
     cpu_physical_memory_set_dirty_flag(ram_addr, DIRTY_MEMORY_CODE);
 }

-static bool tlb_is_dirty_ram(CPUTLBEntry *tlbe)
-{
-    return (tlbe->addr_write & (TLB_INVALID_MASK|TLB_MMIO|TLB_NOTDIRTY)) == 0;
-}

-void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
+/*
+ * Dirty write flag handling
+ *
+ * When the TCG code writes to a location it looks up the address in
+ * the TLB and uses that data to compute the final address. If any of
+ * the lower bits of the address are set then the slow path is forced.
+ * There are a number of reasons to do this but for normal RAM the
+ * most usual is detecting writes to code regions which may invalidate
+ * generated code.
+ *
+ * Because we want other vCPUs to respond to changes straight away we
+ * update the te->addr_write field atomically. If the TLB entry has
+ * been changed by the vCPU in the mean time we skip the update.
+ */
+
+static void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
                            uintptr_t length)
 {
-    uintptr_t addr;
+    /* paired with atomic_mb_set in tlb_set_page_with_attrs */
+    uintptr_t orig_addr = atomic_mb_read(&tlb_entry->addr_write);
+    uintptr_t addr = orig_addr;

-    if (tlb_is_dirty_ram(tlb_entry)) {
-        addr = (tlb_entry->addr_write & TARGET_PAGE_MASK) + tlb_entry->addend;
+    if ((addr & (TLB_INVALID_MASK|TLB_MMIO|TLB_NOTDIRTY)) == 0) {
+        addr &= TARGET_PAGE_MASK;
+        addr += atomic_read(&tlb_entry->addend);
         if ((addr - start) < length) {
-            tlb_entry->addr_write |= TLB_NOTDIRTY;
+            uintptr_t notdirty_addr = orig_addr | TLB_NOTDIRTY;
+            if (!atomic_bool_cmpxchg(&tlb_entry->addr_write, orig_addr, notdirty_addr)) {
+                fprintf(stderr,"%s: raced setting the flag\n", __func__);
+            }
         }
     }
 }

+/* This is a cross vCPU call (i.e. another vCPU resetting the flags of
+ * the target vCPU). As such care needs to be taken that we don't
+ * dangerously race with another vCPU update. The only thing actually
+ * updated is the target TLB entry ->addr_write flags.
+ */
 void tlb_reset_dirty(CPUState *cpu, ram_addr_t start1, ram_addr_t length)
 {
     CPUArchState *env;

     int mmu_idx;

-    assert_cpu_is_self(cpu);
-
     env = cpu->env_ptr;
     for (mmu_idx = 0; mmu_idx < NB_MMU_MODES; mmu_idx++) {
         unsigned int i;
@@ -386,7 +406,7 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
     MemoryRegionSection *section;
     unsigned int index;
     target_ulong address;
-    target_ulong code_address;
+    target_ulong code_address, write_address;
     uintptr_t addend;
     CPUTLBEntry *te;
     hwaddr iotlb, xlat, sz;
@@ -443,21 +463,24 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
     } else {
         te->addr_code = -1;
     }
+
+    write_address = -1;
     if (prot & PAGE_WRITE) {
         if ((memory_region_is_ram(section->mr) && section->readonly)
             || memory_region_is_romd(section->mr)) {
             /* Write access calls the I/O callback.  */
-            te->addr_write = address | TLB_MMIO;
+            write_address = address | TLB_MMIO;
         } else if (memory_region_is_ram(section->mr)
                    && cpu_physical_memory_is_clean(
                         memory_region_get_ram_addr(section->mr) + xlat)) {
-            te->addr_write = address | TLB_NOTDIRTY;
+            write_address = address | TLB_NOTDIRTY;
         } else {
-            te->addr_write = address;
+            write_address = address;
         }
-    } else {
-        te->addr_write = -1;
     }
+
+    /* Pairs with flag setting in tlb_reset_dirty_range */
+    atomic_mb_set(&te->addr_write, write_address);
 }

 /* Add a new TLB entry, but without specifying the memory
diff --git a/include/exec/cputlb.h b/include/exec/cputlb.h
index d454c00..3f94178 100644
--- a/include/exec/cputlb.h
+++ b/include/exec/cputlb.h
@@ -23,8 +23,6 @@
 /* cputlb.c */
 void tlb_protect_code(ram_addr_t ram_addr);
 void tlb_unprotect_code(ram_addr_t ram_addr);
-void tlb_reset_dirty_range(CPUTLBEntry *tlb_entry, uintptr_t start,
-                           uintptr_t length);
 extern int tlb_flush_count;

 #endif
--
2.7.4

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-07-26 12:09 [Qemu-devel] Making cputlb.c operations safe for MTTCG Alex Bennée
  2016-07-26 14:03 ` [Qemu-devel] [PATCH] cputlb: make tlb_reset_dirty " Alex Bennée
@ 2016-08-01 14:54 ` Paolo Bonzini
  2016-08-02  6:37   ` Alex Bennée
  1 sibling, 1 reply; 14+ messages in thread
From: Paolo Bonzini @ 2016-08-01 14:54 UTC (permalink / raw)
  To: Alex Bennée, MTTCG Devel, QEMU Developers, Richard Henderson,
	Sergey Fedorov, Emilio G. Cota, Frederic Konrad, a.rigo



On 26/07/2016 14:09, Alex Bennée wrote:
> 
> As the eventual operation is the setting of a flag I'm wondering if we
> can simply use atomic primitives to ensure we don't corrupt the lookup
> address when setting the TLB_NOTDIRTY flag?

In theory tlb_reset_dirty and tlb_set_dirty1 can use atomic_* on
tlb_entry->addr_write, but careful because:

- you need to order reads and writes to tlb_entry->addr_write and
tlb_entry->addend properly

- addr_write cannot be written atomically for 32-bit host/64-bit target.
 Probably you can use something like

    union {
        target_ulong addr_write;
#if TARGET_LONG_BITS == 32
        struct { uint32_t lo_and_lfags; } addr_write_w;
#elif defined HOST_WORDS_BIGENDIAN
        struct { uint32_t hi, lo_and_flags; } addr_write_w;
#else
        struct { uint32_t lo_and_flags, hi; } addr_write_w;
#endif
    };

IIRC "foreign" accesses only set TLB_NOTDIRTY, so they can use a cmpxchg
on lo_and_flags (worst case you end up with an unnecessary call to
notdirty_mem_write).

- When removing TLB_NOTDIRTY from a TLB entry
(notdirty_mem_write/tlb_unprotect_code), as well as filling in a TLB
entry without TLB_NOTDIRTY (tlb_set_page_with_attrs) you need to protect
from a concurrent tb_alloc_page and hence take the tb_lock.

In particular:

- in notdirty_mem_write, care must be put in the ordering of
tb_invalidate_phys_page_fast (which itself calls tlb_unprotect_code and
takes the tb_lock in tb_invalidate_phys_page_range) and tlb_set_dirty.
At least it seems to me that the call to tb_invalidate_phys_page_fast
should be after the write, but that's not all.  Perhaps merge this part
of notdirty_mem_write:

    /* Set both VGA and migration bits for simplicity and to remove
     * the notdirty callback faster.
     */
    cpu_physical_memory_set_dirty_range(ram_addr, size,
                                        DIRTY_CLIENTS_NOCODE);
    /* we remove the notdirty callback only if the code has been
       flushed */
    if (!cpu_physical_memory_is_clean(ram_addr)) {
        tlb_set_dirty(current_cpu, current_cpu->mem_io_vaddr);
    }

into tlb_unprotect_code?!?  Or perhaps do tlb_set_dirty _first_, and
then add back the callback if cpu_physical_memory_is_clean(ram_addr) is
true.  I haven't put much thought into it.

- tlb_set_page_with_attrs is also hard-ish to get right, but perhaps the
same idea of adding the callback last would work:

    /* First set addr_write so that concurrent tlb_reset_dirty_range
     * finds a match.
     */
    te->addr_write = address;
    if (memory_region_is_ram(section->mr)) {
        if (cpu_physical_memory_is_clean(
                     memory_region_get_ram_addr(section->mr) + xlat)) {
            te->addr_write = address | TLB_NOTDIRTY;
        }
    }

Paolo

> Of course the TLB structure itself covers a number of values but AFAICT
> erroneously setting TLB_NOTDIRTY on a entry that gets updated to a new
> address wouldn't cause a problem except triggering an additional
> slow-path write. If we are careful about the filling of the TLB entries
> can we be sure we are always safe?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-08-01 14:54 ` [Qemu-devel] Making cputlb.c operations " Paolo Bonzini
@ 2016-08-02  6:37   ` Alex Bennée
  2016-08-02 10:26     ` Paolo Bonzini
                       ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Alex Bennée @ 2016-08-02  6:37 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: MTTCG Devel, QEMU Developers, Richard Henderson, Sergey Fedorov,
	Emilio G. Cota, Frederic Konrad, a.rigo


Paolo Bonzini <pbonzini@redhat.com> writes:

> On 26/07/2016 14:09, Alex Bennée wrote:
>>
>> As the eventual operation is the setting of a flag I'm wondering if we
>> can simply use atomic primitives to ensure we don't corrupt the lookup
>> address when setting the TLB_NOTDIRTY flag?
>
> In theory tlb_reset_dirty and tlb_set_dirty1 can use atomic_* on
> tlb_entry->addr_write, but careful because:
>
> - you need to order reads and writes to tlb_entry->addr_write and
> tlb_entry->addend properly
>
> - addr_write cannot be written atomically for 32-bit host/64-bit target.
>  Probably you can use something like
>
>     union {
>         target_ulong addr_write;
> #if TARGET_LONG_BITS == 32
>         struct { uint32_t lo_and_lfags; } addr_write_w;
> #elif defined HOST_WORDS_BIGENDIAN
>         struct { uint32_t hi, lo_and_flags; } addr_write_w;
> #else
>         struct { uint32_t lo_and_flags, hi; } addr_write_w;
> #endif
>     };


This will work but I wonder if it is time to call it a day for 32 on 64
support? I mean all this can be worked around but I wonder if it is
worth the effort if no one actually uses this combination.

I might prepare a patch for the next dev cycle to promote discussion.

>
> IIRC "foreign" accesses only set TLB_NOTDIRTY, so they can use a cmpxchg
> on lo_and_flags (worst case you end up with an unnecessary call to
> notdirty_mem_write).

Yeah I used a cmpxchg in the RFC patch. AFAICT you wouldn't see a change
to addend that didn't involve addr_write changing. So as long as
addr_write was always the last thing written things should be fine.

>
> - When removing TLB_NOTDIRTY from a TLB entry
> (notdirty_mem_write/tlb_unprotect_code), as well as filling in a TLB
> entry without TLB_NOTDIRTY (tlb_set_page_with_attrs) you need to protect
> from a concurrent tb_alloc_page and hence take the tb_lock.

tlb_set_page_with_attrs is also only ever filled by the vCPU in question
so I just ensured the final addr_write was calculated and written in one
go at the end, after all other entries had been updated.

>
> In particular:
>
> - in notdirty_mem_write, care must be put in the ordering of
> tb_invalidate_phys_page_fast (which itself calls tlb_unprotect_code and
> takes the tb_lock in tb_invalidate_phys_page_range) and tlb_set_dirty.
> At least it seems to me that the call to tb_invalidate_phys_page_fast
> should be after the write, but that's not all.  Perhaps merge this part
> of notdirty_mem_write:
>
>     /* Set both VGA and migration bits for simplicity and to remove
>      * the notdirty callback faster.
>      */
>     cpu_physical_memory_set_dirty_range(ram_addr, size,
>                                         DIRTY_CLIENTS_NOCODE);
>     /* we remove the notdirty callback only if the code has been
>        flushed */
>     if (!cpu_physical_memory_is_clean(ram_addr)) {
>         tlb_set_dirty(current_cpu, current_cpu->mem_io_vaddr);
>     }
>
> into tlb_unprotect_code?!?  Or perhaps do tlb_set_dirty _first_, and
> then add back the callback if cpu_physical_memory_is_clean(ram_addr) is
> true.  I haven't put much thought into it.

I'll have a closer look at these bits.

>
> - tlb_set_page_with_attrs is also hard-ish to get right, but perhaps the
> same idea of adding the callback last would work:
>
>     /* First set addr_write so that concurrent tlb_reset_dirty_range
>      * finds a match.
>      */
>     te->addr_write = address;
>     if (memory_region_is_ram(section->mr)) {
>         if (cpu_physical_memory_is_clean(
>                      memory_region_get_ram_addr(section->mr) + xlat)) {
>             te->addr_write = address | TLB_NOTDIRTY;
>         }
>     }

Why not just write addr_write completely at the end?

>
> Paolo
>
>> Of course the TLB structure itself covers a number of values but AFAICT
>> erroneously setting TLB_NOTDIRTY on a entry that gets updated to a new
>> address wouldn't cause a problem except triggering an additional
>> slow-path write. If we are careful about the filling of the TLB entries
>> can we be sure we are always safe?


--
Alex Bennée

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-08-02  6:37   ` Alex Bennée
@ 2016-08-02 10:26     ` Paolo Bonzini
  2016-08-03 17:25     ` Richard Henderson
  2016-09-27 16:16     ` Paolo Bonzini
  2 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2016-08-02 10:26 UTC (permalink / raw)
  To: Alex Bennée
  Cc: MTTCG Devel, QEMU Developers, Richard Henderson, Sergey Fedorov,
	Emilio G. Cota, Frederic Konrad, a rigo


> > - tlb_set_page_with_attrs is also hard-ish to get right, but perhaps the
> > same idea of adding the callback last would work:
> >
> >     /* First set addr_write so that concurrent tlb_reset_dirty_range
> >      * finds a match.
> >      */
> >     te->addr_write = address;
> >     if (memory_region_is_ram(section->mr)) {
> >         if (cpu_physical_memory_is_clean(
> >                      memory_region_get_ram_addr(section->mr) + xlat)) {
> >             te->addr_write = address | TLB_NOTDIRTY;
> >         }
> >     }
> 
> Why not just write addr_write completely at the end?

I suspect that another thread can come around and make the page "not dirty",
while the CPU that "owns" the TLB has decided that the page is dirty and does
not set the bit.  So you have

    CPU thread                               other thread
    -----------------------------            -----------------------------------
    compute addr_write
                                             page is now clean
    write addr_write without TLB_NOTDIRTY

By the way, the same happens in victim_tlb_hit.  You have to recompute
TLB_NOTDIRTY after setting *tlb, something like:

    tmptlb = *vtlb;
    *vtlb = *tlb;

    if (tmptlb.addr_write & (TLB_INVALID_MASK|TLB_MMIO)) {
        *tlb = tmptlb;
    } else {
        /* First write TLB entry without TLB_NOTDIRTY.  */
        tmptlb.addr_write &= ~TLB_NOTDIRTY;
        *tlb = tmptlb;
        smp_wmb();
        /* Then set NOTDIRTY in tlb->addr_write if necessary.
         * IIRC the IOTLB lets you get back to the ram_addr, and
         * pass it to cpu_physical_memory_is_clean.
         */
        if (cpu_physical_memory_is_clean(tlb_get_ram_addr(&tmptlb))) {
            tlb->addr_write = tmptlb.addr_write | TLB_NOTDIRTY;
        }
    }

The silver lining is that tlb_set_dirty now does not need anymore to check
the victim TLB.

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-08-02  6:37   ` Alex Bennée
  2016-08-02 10:26     ` Paolo Bonzini
@ 2016-08-03 17:25     ` Richard Henderson
  2016-08-03 20:56       ` Paolo Bonzini
  2016-09-27 16:16     ` Paolo Bonzini
  2 siblings, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2016-08-03 17:25 UTC (permalink / raw)
  To: Alex Bennée, Paolo Bonzini
  Cc: MTTCG Devel, QEMU Developers, Sergey Fedorov, Emilio G. Cota,
	Frederic Konrad, a.rigo

On 08/02/2016 12:07 PM, Alex Bennée wrote:
> This will work but I wonder if it is time to call it a day for 32 on 64
> support? I mean all this can be worked around but I wonder if it is
> worth the effort if no one actually uses this combination.

I've been meaning to bring up exactly this question during the 2.8 cycle.

Given the changes we want to make with target atomic operations, I think it's a 
complete waste of time to continue to support 64-on-32.

Indeed, it's half-tempting to drop 32-bit host support entirely.  But while 
i686, arm, and mips hosts continue to function for simple guests, it is perhaps 
a step too far to drop it all (yet).

> I might prepare a patch for the next dev cycle to promote discussion.

Please do.

r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-08-03 17:25     ` Richard Henderson
@ 2016-08-03 20:56       ` Paolo Bonzini
  0 siblings, 0 replies; 14+ messages in thread
From: Paolo Bonzini @ 2016-08-03 20:56 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Alex Bennée, MTTCG Devel, QEMU Developers, Sergey Fedorov,
	Emilio G. Cota, Frederic Konrad, a rigo

> On 08/02/2016 12:07 PM, Alex Bennée wrote:
> > This will work but I wonder if it is time to call it a day for 32 on 64
> > support? I mean all this can be worked around but I wonder if it is
> > worth the effort if no one actually uses this combination.
> 
> I've been meaning to bring up exactly this question during the 2.8 cycle.
> 
> Given the changes we want to make with target atomic operations, I think it's
> a complete waste of time to continue to support 64-on-32.

The only interesting 32-bit host IMO is arm, and it does support double-word
atomics on ARMv7+ (only through ldrdex/strdex if < Cortex-A15).  MIPS doesn't,
and PPC (POWER8) has 128-bit atomics in 64-bit mode, but lacks 64-bit atomics
in 32-bit mode.

FWIW, OpenRISC also lacks double-word (64-bit in 32-bit processors, 128-bit
in 64-bit processors) atomics.  Sad.

So if we want to keep 64-on-32 emulation, it might still make sense to restrict
it to ARMv7+ and x86 hosts.  But otherwise it might be indeed time to say goodbye...

Paolo

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-08-02  6:37   ` Alex Bennée
  2016-08-02 10:26     ` Paolo Bonzini
  2016-08-03 17:25     ` Richard Henderson
@ 2016-09-27 16:16     ` Paolo Bonzini
  2016-09-27 22:15       ` Alex Bennée
  2016-09-27 22:29       ` Emilio G. Cota
  2 siblings, 2 replies; 14+ messages in thread
From: Paolo Bonzini @ 2016-09-27 16:16 UTC (permalink / raw)
  To: Alex Bennée
  Cc: MTTCG Devel, QEMU Developers, a.rigo, Emilio G. Cota,
	Sergey Fedorov, Richard Henderson, Frederic Konrad



On 02/08/2016 08:37, Alex Bennée wrote:
>> - in notdirty_mem_write, care must be put in the ordering of
>> tb_invalidate_phys_page_fast (which itself calls tlb_unprotect_code and
>> takes the tb_lock in tb_invalidate_phys_page_range) and tlb_set_dirty.
>> At least it seems to me that the call to tb_invalidate_phys_page_fast
>> should be after the write, but that's not all.  Perhaps merge this part
>> of notdirty_mem_write:

I looked at it again and you are already doing the right thing in patch 19.
It's possible to simplify it a bit though like this:

diff --git a/exec.c b/exec.c
index c8389f9..7850c39 100644
--- a/exec.c
+++ b/exec.c
@@ -1944,9 +1944,6 @@ ram_addr_t qemu_ram_addr_from_host(void *ptr)
 static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
                                uint64_t val, unsigned size)
 {
-    if (!cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE)) {
-        tb_invalidate_phys_page_fast(ram_addr, size);
-    }
     switch (size) {
     case 1:
         stb_p(qemu_map_ram_ptr(NULL, ram_addr), val);
@@ -1960,11 +1957,19 @@ static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
      */
     cpu_physical_memory_set_dirty_range(ram_addr, size,
                                         DIRTY_CLIENTS_NOCODE);
+    tb_lock();
+    if (!cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE)) {
+        /* tb_invalidate_phys_page_range will call tlb_unprotect_code
+         * once the last TB in this page is gone.
+         */
+        tb_invalidate_phys_page_fast(ram_addr, size);
+    }
     /* we remove the notdirty callback only if the code has been
        flushed */
     if (!cpu_physical_memory_is_clean(ram_addr)) {
         tlb_set_dirty(current_cpu, current_cpu->mem_io_vaddr);
     }
+    tb_unlock();
 }
 
 static bool notdirty_mem_accepts(void *opaque, hwaddr addr,


Anyhow, the next step is to merge either cmpxchg-based atomics
or iothread-free single-threaded TCG.  Either will do. :)

I think that even iothread-free single-threaded TCG requires this
TLB stuff, because the iothread's address_space_write (and hence
invalidate_and_set_dirty) can race against the TCG thread's
code generation.

Thanks,

Paolo

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-09-27 16:16     ` Paolo Bonzini
@ 2016-09-27 22:15       ` Alex Bennée
  2016-09-27 22:29       ` Emilio G. Cota
  1 sibling, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2016-09-27 22:15 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: MTTCG Devel, QEMU Developers, a.rigo, Emilio G. Cota,
	Sergey Fedorov, Richard Henderson, Frederic Konrad


Paolo Bonzini <pbonzini@redhat.com> writes:

> On 02/08/2016 08:37, Alex Bennée wrote:
>>> - in notdirty_mem_write, care must be put in the ordering of
>>> tb_invalidate_phys_page_fast (which itself calls tlb_unprotect_code and
>>> takes the tb_lock in tb_invalidate_phys_page_range) and tlb_set_dirty.
>>> At least it seems to me that the call to tb_invalidate_phys_page_fast
>>> should be after the write, but that's not all.  Perhaps merge this part
>>> of notdirty_mem_write:
>
> I looked at it again and you are already doing the right thing in patch 19.
> It's possible to simplify it a bit though like this:
>
> diff --git a/exec.c b/exec.c
> index c8389f9..7850c39 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -1944,9 +1944,6 @@ ram_addr_t qemu_ram_addr_from_host(void *ptr)
>  static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
>                                 uint64_t val, unsigned size)
>  {
> -    if (!cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE)) {
> -        tb_invalidate_phys_page_fast(ram_addr, size);
> -    }
>      switch (size) {
>      case 1:
>          stb_p(qemu_map_ram_ptr(NULL, ram_addr), val);
> @@ -1960,11 +1957,19 @@ static void notdirty_mem_write(void *opaque, hwaddr ram_addr,
>       */
>      cpu_physical_memory_set_dirty_range(ram_addr, size,
>                                          DIRTY_CLIENTS_NOCODE);
> +    tb_lock();
> +    if (!cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE)) {
> +        /* tb_invalidate_phys_page_range will call tlb_unprotect_code
> +         * once the last TB in this page is gone.
> +         */
> +        tb_invalidate_phys_page_fast(ram_addr, size);
> +    }
>      /* we remove the notdirty callback only if the code has been
>         flushed */
>      if (!cpu_physical_memory_is_clean(ram_addr)) {
>          tlb_set_dirty(current_cpu, current_cpu->mem_io_vaddr);
>      }
> +    tb_unlock();
>  }
>
>  static bool notdirty_mem_accepts(void *opaque, hwaddr addr,
>
>
> Anyhow, the next step is to merge either cmpxchg-based atomics
> or iothread-free single-threaded TCG.  Either will do. :)

By iothread-free single-threaded TCG you mean dropping the need to grab
the BQL when we start the TCG thread and making the BQL purely an
on-demand/when needed thing?

The cmpxchg stuff is looking good to me - I still have to do a pass over
rth's patch set since he re-based on async safe work. In fact once your
updated PULL req is in even better ;-)

> I think that even iothread-free single-threaded TCG requires this
> TLB stuff, because the iothread's address_space_write (and hence
> invalidate_and_set_dirty) can race against the TCG thread's
> code generation.

Yes.

>
> Thanks,
>
> Paolo


--
Alex Bennée

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-09-27 16:16     ` Paolo Bonzini
  2016-09-27 22:15       ` Alex Bennée
@ 2016-09-27 22:29       ` Emilio G. Cota
  2016-09-27 23:04         ` Alex Bennée
  2016-09-27 23:05         ` Richard Henderson
  1 sibling, 2 replies; 14+ messages in thread
From: Emilio G. Cota @ 2016-09-27 22:29 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Alex Bennée, MTTCG Devel, QEMU Developers, a.rigo,
	Sergey Fedorov, Richard Henderson, Frederic Konrad

On Tue, Sep 27, 2016 at 18:16:45 +0200, Paolo Bonzini wrote:
> Anyhow, the next step is to merge either cmpxchg-based atomics
> or iothread-free single-threaded TCG.  Either will do. :)
> 
> I think that even iothread-free single-threaded TCG requires this
> TLB stuff, because the iothread's address_space_write (and hence
> invalidate_and_set_dirty) can race against the TCG thread's
> code generation.

What's a quick-and-dirty way to disable the fast-path TLB lookups?
Alex: you told me the monitor has an option for this, but I can't
find it. I'm looking for something that'd go in tcg/i386 to simply
bypass the fast path.

Forcing the slow TLB lookup would be an easy way to then implement
a per-TLB seqlock. I think TLB corruption might explain the crashes I
see when booting Ubuntu in a many-core guest (running on a many-core
host).

Thanks,

		Emilio

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-09-27 22:29       ` Emilio G. Cota
@ 2016-09-27 23:04         ` Alex Bennée
  2016-09-27 23:05         ` Richard Henderson
  1 sibling, 0 replies; 14+ messages in thread
From: Alex Bennée @ 2016-09-27 23:04 UTC (permalink / raw)
  To: Emilio G. Cota
  Cc: Paolo Bonzini, MTTCG Devel, QEMU Developers, a.rigo,
	Sergey Fedorov, Richard Henderson, Frederic Konrad


Emilio G. Cota <cota@braap.org> writes:

> On Tue, Sep 27, 2016 at 18:16:45 +0200, Paolo Bonzini wrote:
>> Anyhow, the next step is to merge either cmpxchg-based atomics
>> or iothread-free single-threaded TCG.  Either will do. :)
>>
>> I think that even iothread-free single-threaded TCG requires this
>> TLB stuff, because the iothread's address_space_write (and hence
>> invalidate_and_set_dirty) can race against the TCG thread's
>> code generation.
>
> What's a quick-and-dirty way to disable the fast-path TLB lookups?
> Alex: you told me the monitor has an option for this, but I can't
> find it. I'm looking for something that'd go in tcg/i386 to simply
> bypass the fast path.

Hack up tlb_set_page_with_attrs() to always set one of the TLB_FOO bits
(you might want to invent a new one as the other do have meanings).

>
> Forcing the slow TLB lookup would be an easy way to then implement
> a per-TLB seqlock. I think TLB corruption might explain the crashes I
> see when booting Ubuntu in a many-core guest (running on a many-core
> host).

TLB corruption is suspected but I've never come up with a clean test
case to force it. I find heavy compiles in a system image can do it but
my SMC torture test never crashes.

>
> Thanks,
>
> 		Emilio


--
Alex Bennée

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-09-27 22:29       ` Emilio G. Cota
  2016-09-27 23:04         ` Alex Bennée
@ 2016-09-27 23:05         ` Richard Henderson
  2016-09-27 23:32           ` Alex Bennée
  1 sibling, 1 reply; 14+ messages in thread
From: Richard Henderson @ 2016-09-27 23:05 UTC (permalink / raw)
  To: Emilio G. Cota, Paolo Bonzini
  Cc: Alex Bennée, MTTCG Devel, QEMU Developers, a.rigo,
	Sergey Fedorov, Frederic Konrad

On 09/27/2016 03:29 PM, Emilio G. Cota wrote:
> What's a quick-and-dirty way to disable the fast-path TLB lookups?
> Alex: you told me the monitor has an option for this, but I can't
> find it. I'm looking for something that'd go in tcg/i386 to simply
> bypass the fast path.

There is no easy way.  If you need that, you'd have to significantly modify the 
tcg backend.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-09-27 23:05         ` Richard Henderson
@ 2016-09-27 23:32           ` Alex Bennée
  2016-09-28  0:34             ` Richard Henderson
  0 siblings, 1 reply; 14+ messages in thread
From: Alex Bennée @ 2016-09-27 23:32 UTC (permalink / raw)
  To: Richard Henderson
  Cc: Emilio G. Cota, Paolo Bonzini, MTTCG Devel, QEMU Developers,
	a.rigo, Sergey Fedorov, Frederic Konrad


Richard Henderson <rth@twiddle.net> writes:

> On 09/27/2016 03:29 PM, Emilio G. Cota wrote:
>> What's a quick-and-dirty way to disable the fast-path TLB lookups?
>> Alex: you told me the monitor has an option for this, but I can't
>> find it. I'm looking for something that'd go in tcg/i386 to simply
>> bypass the fast path.
>
> There is no easy way.  If you need that, you'd have to significantly modify the 
> tcg backend.

Surely all the backends force the slow-path when any of TLB_FLAGS_MASK
are set. Unless adding an extra bit is going to run out of spare bits on
some backends?

>
>
> r~


-- 
Alex Bennée

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [Qemu-devel] Making cputlb.c operations safe for MTTCG
  2016-09-27 23:32           ` Alex Bennée
@ 2016-09-28  0:34             ` Richard Henderson
  0 siblings, 0 replies; 14+ messages in thread
From: Richard Henderson @ 2016-09-28  0:34 UTC (permalink / raw)
  To: Alex Bennée
  Cc: Emilio G. Cota, Paolo Bonzini, MTTCG Devel, QEMU Developers,
	a.rigo, Sergey Fedorov, Frederic Konrad

On 09/27/2016 04:32 PM, Alex Bennée wrote:
>
> Richard Henderson <rth@twiddle.net> writes:
>
>> On 09/27/2016 03:29 PM, Emilio G. Cota wrote:
>>> What's a quick-and-dirty way to disable the fast-path TLB lookups?
>>> Alex: you told me the monitor has an option for this, but I can't
>>> find it. I'm looking for something that'd go in tcg/i386 to simply
>>> bypass the fast path.
>>
>> There is no easy way.  If you need that, you'd have to significantly modify the
>> tcg backend.
>
> Surely all the backends force the slow-path when any of TLB_FLAGS_MASK
> are set. Unless adding an extra bit is going to run out of spare bits on
> some backends?

You could do that, yes.  You also need to adjust softmmu_template.h to match.


r~

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-09-28  0:34 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-07-26 12:09 [Qemu-devel] Making cputlb.c operations safe for MTTCG Alex Bennée
2016-07-26 14:03 ` [Qemu-devel] [PATCH] cputlb: make tlb_reset_dirty " Alex Bennée
2016-08-01 14:54 ` [Qemu-devel] Making cputlb.c operations " Paolo Bonzini
2016-08-02  6:37   ` Alex Bennée
2016-08-02 10:26     ` Paolo Bonzini
2016-08-03 17:25     ` Richard Henderson
2016-08-03 20:56       ` Paolo Bonzini
2016-09-27 16:16     ` Paolo Bonzini
2016-09-27 22:15       ` Alex Bennée
2016-09-27 22:29       ` Emilio G. Cota
2016-09-27 23:04         ` Alex Bennée
2016-09-27 23:05         ` Richard Henderson
2016-09-27 23:32           ` Alex Bennée
2016-09-28  0:34             ` Richard Henderson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.