Re: [PATCH v4 11/16] cputlb: Merge and move memory_notdirty_write_{prepare,complete}

All of lore.kernel.org
 help / color / mirror / Atom feed

From: "Alex Bennée" <alex.bennee@linaro.org>
To: Richard Henderson <richard.henderson@linaro.org>
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com,
	david@redhat.com
Subject: Re: [PATCH v4 11/16] cputlb: Merge and move memory_notdirty_write_{prepare,complete}
Date: Wed, 25 Sep 2019 17:15:03 +0100	[thread overview]
Message-ID: <87pnjo9wu0.fsf@linaro.org> (raw)
In-Reply-To: <20190923230004.9231-12-richard.henderson@linaro.org>


Richard Henderson <richard.henderson@linaro.org> writes:

> Since 9458a9a1df1a, all readers of the dirty bitmaps wait
> for the rcu lock, which means that they wait until the end
> of any executing TranslationBlock.
>
> As a consequence, there is no need for the actual access
> to happen in between the _prepare and _complete.  Therefore,
> we can improve things by merging the two functions into
> notdirty_write and dropping the NotDirtyInfo structure.
>
> In addition, the only users of notdirty_write are in cputlb.c,
> so move the merged function there.  Pass in the CPUIOTLBEntry
> from which the ram_addr_t may be computed.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  include/exec/memory-internal.h | 65 -----------------------------
>  accel/tcg/cputlb.c             | 76 +++++++++++++++++++---------------
>  exec.c                         | 44 --------------------
>  3 files changed, 42 insertions(+), 143 deletions(-)
>
> diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
> index ef4fb92371..9fcc2af25c 100644
> --- a/include/exec/memory-internal.h
> +++ b/include/exec/memory-internal.h
> @@ -49,70 +49,5 @@ void address_space_dispatch_free(AddressSpaceDispatch *d);
>
>  void mtree_print_dispatch(struct AddressSpaceDispatch *d,
>                            MemoryRegion *root);
> -
> -struct page_collection;
> -
> -/* Opaque struct for passing info from memory_notdirty_write_prepare()
> - * to memory_notdirty_write_complete(). Callers should treat all fields
> - * as private, with the exception of @active.
> - *
> - * @active is a field which is not touched by either the prepare or
> - * complete functions, but which the caller can use if it wishes to
> - * track whether it has called prepare for this struct and so needs
> - * to later call the complete function.
> - */
> -typedef struct {
> -    CPUState *cpu;
> -    struct page_collection *pages;
> -    ram_addr_t ram_addr;
> -    vaddr mem_vaddr;
> -    unsigned size;
> -    bool active;
> -} NotDirtyInfo;
> -
> -/**
> - * memory_notdirty_write_prepare: call before writing to non-dirty memory
> - * @ndi: pointer to opaque NotDirtyInfo struct
> - * @cpu: CPU doing the write
> - * @mem_vaddr: virtual address of write
> - * @ram_addr: the ram address of the write
> - * @size: size of write in bytes
> - *
> - * Any code which writes to the host memory corresponding to
> - * guest RAM which has been marked as NOTDIRTY must wrap those
> - * writes in calls to memory_notdirty_write_prepare() and
> - * memory_notdirty_write_complete():
> - *
> - *  NotDirtyInfo ndi;
> - *  memory_notdirty_write_prepare(&ndi, ....);
> - *  ... perform write here ...
> - *  memory_notdirty_write_complete(&ndi);
> - *
> - * These calls will ensure that we flush any TCG translated code for
> - * the memory being written, update the dirty bits and (if possible)
> - * remove the slowpath callback for writing to the memory.
> - *
> - * This must only be called if we are using TCG; it will assert otherwise.
> - *
> - * We may take locks in the prepare call, so callers must ensure that
> - * they don't exit (via longjump or otherwise) without calling complete.
> - *
> - * This call must only be made inside an RCU critical section.
> - * (Note that while we're executing a TCG TB we're always in an
> - * RCU critical section, which is likely to be the case for callers
> - * of these functions.)
> - */
> -void memory_notdirty_write_prepare(NotDirtyInfo *ndi,
> -                                   CPUState *cpu,
> -                                   vaddr mem_vaddr,
> -                                   ram_addr_t ram_addr,
> -                                   unsigned size);
> -/**
> - * memory_notdirty_write_complete: finish write to non-dirty memory
> - * @ndi: pointer to the opaque NotDirtyInfo struct which was initialized
> - * by memory_not_dirty_write_prepare().
> - */
> -void memory_notdirty_write_complete(NotDirtyInfo *ndi);
> -
>  #endif
>  #endif
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 05530a8b0c..09b0df87c6 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -33,6 +33,7 @@
>  #include "exec/helper-proto.h"
>  #include "qemu/atomic.h"
>  #include "qemu/atomic128.h"
> +#include "translate-all.h"
>
>  /* DEBUG defines, enable DEBUG_TLB_LOG to log to the CPU_LOG_MMU target */
>  /* #define DEBUG_TLB */
> @@ -1084,6 +1085,37 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
>      return qemu_ram_addr_from_host_nofail(p);
>  }
>
> +static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
> +                           CPUIOTLBEntry *iotlbentry, uintptr_t retaddr)
> +{
> +    ram_addr_t ram_addr = mem_vaddr + iotlbentry->addr;
> +
> +    trace_memory_notdirty_write_access(mem_vaddr, ram_addr, size);
> +
> +    if (!cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE)) {
> +        struct page_collection *pages
> +            = page_collection_lock(ram_addr, ram_addr + size);
> +
> +        /* We require mem_io_pc in tb_invalidate_phys_page_range.  */
> +        cpu->mem_io_pc = retaddr;
> +
> +        tb_invalidate_phys_page_fast(pages, ram_addr, size);
> +        page_collection_unlock(pages);
> +    }
> +
> +    /*
> +     * Set both VGA and migration bits for simplicity and to remove
> +     * the notdirty callback faster.
> +     */
> +    cpu_physical_memory_set_dirty_range(ram_addr, size, DIRTY_CLIENTS_NOCODE);
> +
> +    /* We remove the notdirty callback only if the code has been flushed. */
> +    if (!cpu_physical_memory_is_clean(ram_addr)) {
> +        trace_memory_notdirty_set_dirty(mem_vaddr);
> +        tlb_set_dirty(cpu, mem_vaddr);
> +    }
> +}
> +
>  /*
>   * Probe for whether the specified guest access is permitted. If it is not
>   * permitted then an exception will be taken in the same way as if this
> @@ -1203,8 +1235,7 @@ void *tlb_vaddr_to_host(CPUArchState *env, abi_ptr addr,
>  /* Probe for a read-modify-write atomic operation.  Do not allow unaligned
>   * operations, or io operations to proceed.  Return the host address.  */
>  static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
> -                               TCGMemOpIdx oi, uintptr_t retaddr,
> -                               NotDirtyInfo *ndi)
> +                               TCGMemOpIdx oi, uintptr_t retaddr)
>  {
>      size_t mmu_idx = get_mmuidx(oi);
>      uintptr_t index = tlb_index(env, mmu_idx, addr);
> @@ -1264,12 +1295,9 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
>
>      hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
>
> -    ndi->active = false;
>      if (unlikely(tlb_addr & TLB_NOTDIRTY)) {
> -        ndi->active = true;
> -        memory_notdirty_write_prepare(ndi, env_cpu(env), addr,
> -                                      qemu_ram_addr_from_host_nofail(hostaddr),
> -                                      1 << s_bits);
> +        notdirty_write(env_cpu(env), addr, 1 << s_bits,
> +                       &env_tlb(env)->d[mmu_idx].iotlb[index], retaddr);
>      }
>
>      return hostaddr;
> @@ -1636,28 +1664,13 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
>              return;
>          }
>
> -        haddr = (void *)((uintptr_t)addr + entry->addend);
> -
>          /* Handle clean RAM pages.  */
>          if (tlb_addr & TLB_NOTDIRTY) {
> -            NotDirtyInfo ndi;
> -
> -            /* We require mem_io_pc in tb_invalidate_phys_page_range.  */
> -            env_cpu(env)->mem_io_pc = retaddr;
> -
> -            memory_notdirty_write_prepare(&ndi, env_cpu(env), addr,
> -                                          addr + iotlbentry->addr, size);
> -
> -            if (unlikely(need_swap)) {
> -                store_memop(haddr, val, op ^ MO_BSWAP);
> -            } else {
> -                store_memop(haddr, val, op);
> -            }
> -
> -            memory_notdirty_write_complete(&ndi);
> -            return;
> +            notdirty_write(env_cpu(env), addr, size, iotlbentry, retaddr);
>          }
>
> +        haddr = (void *)((uintptr_t)addr + entry->addend);
> +
>          if (unlikely(need_swap)) {
>              store_memop(haddr, val, op ^ MO_BSWAP);
>          } else {
> @@ -1783,14 +1796,9 @@ void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
>  #define EXTRA_ARGS     , TCGMemOpIdx oi, uintptr_t retaddr
>  #define ATOMIC_NAME(X) \
>      HELPER(glue(glue(glue(atomic_ ## X, SUFFIX), END), _mmu))
> -#define ATOMIC_MMU_DECLS NotDirtyInfo ndi
> -#define ATOMIC_MMU_LOOKUP atomic_mmu_lookup(env, addr, oi, retaddr, &ndi)
> -#define ATOMIC_MMU_CLEANUP                              \
> -    do {                                                \
> -        if (unlikely(ndi.active)) {                     \
> -            memory_notdirty_write_complete(&ndi);       \
> -        }                                               \
> -    } while (0)
> +#define ATOMIC_MMU_DECLS
> +#define ATOMIC_MMU_LOOKUP atomic_mmu_lookup(env, addr, oi, retaddr)
> +#define ATOMIC_MMU_CLEANUP
>
>  #define DATA_SIZE 1
>  #include "atomic_template.h"
> @@ -1818,7 +1826,7 @@ void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
>  #undef ATOMIC_MMU_LOOKUP
>  #define EXTRA_ARGS         , TCGMemOpIdx oi
>  #define ATOMIC_NAME(X)     HELPER(glue(glue(atomic_ ## X, SUFFIX), END))
> -#define ATOMIC_MMU_LOOKUP  atomic_mmu_lookup(env, addr, oi, GETPC(), &ndi)
> +#define ATOMIC_MMU_LOOKUP  atomic_mmu_lookup(env, addr, oi, GETPC())
>
>  #define DATA_SIZE 1
>  #include "atomic_template.h"
> diff --git a/exec.c b/exec.c
> index 961d7d6497..7d835b1a2b 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2718,50 +2718,6 @@ ram_addr_t qemu_ram_addr_from_host(void *ptr)
>      return block->offset + offset;
>  }
>
> -/* Called within RCU critical section. */
> -void memory_notdirty_write_prepare(NotDirtyInfo *ndi,
> -                          CPUState *cpu,
> -                          vaddr mem_vaddr,
> -                          ram_addr_t ram_addr,
> -                          unsigned size)
> -{
> -    ndi->cpu = cpu;
> -    ndi->ram_addr = ram_addr;
> -    ndi->mem_vaddr = mem_vaddr;
> -    ndi->size = size;
> -    ndi->pages = NULL;
> -
> -    trace_memory_notdirty_write_access(mem_vaddr, ram_addr, size);
> -
> -    assert(tcg_enabled());
> -    if (!cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE)) {
> -        ndi->pages = page_collection_lock(ram_addr, ram_addr + size);
> -        tb_invalidate_phys_page_fast(ndi->pages, ram_addr, size);
> -    }
> -}
> -
> -/* Called within RCU critical section. */
> -void memory_notdirty_write_complete(NotDirtyInfo *ndi)
> -{
> -    if (ndi->pages) {
> -        assert(tcg_enabled());
> -        page_collection_unlock(ndi->pages);
> -        ndi->pages = NULL;
> -    }
> -
> -    /* Set both VGA and migration bits for simplicity and to remove
> -     * the notdirty callback faster.
> -     */
> -    cpu_physical_memory_set_dirty_range(ndi->ram_addr, ndi->size,
> -                                        DIRTY_CLIENTS_NOCODE);
> -    /* we remove the notdirty callback only if the code has been
> -       flushed */
> -    if (!cpu_physical_memory_is_clean(ndi->ram_addr)) {
> -        trace_memory_notdirty_set_dirty(ndi->mem_vaddr);
> -        tlb_set_dirty(ndi->cpu, ndi->mem_vaddr);
> -    }
> -}
> -
>  /* Generate a debug exception if a watchpoint has been hit.  */
>  void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
>                            MemTxAttrs attrs, int flags, uintptr_t ra)


--
Alex Bennée

next prev parent reply	other threads:[~2019-09-25 16:18 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-23 22:59 [PATCH v4 00/16] Move rom and notdirty handling to cputlb Richard Henderson
2019-09-23 22:59 ` [PATCH v4 01/16] exec: Use TARGET_PAGE_BITS_MIN for TLB flags Richard Henderson
2019-09-24 13:53   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 02/16] cputlb: Disable __always_inline__ without optimization Richard Henderson
2019-09-24 13:56   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 03/16] qemu/compiler.h: Add optimize_away Richard Henderson
2019-09-24  7:47   ` David Hildenbrand
2019-09-24 17:27     ` Richard Henderson
2019-09-24 17:29       ` David Hildenbrand
2019-09-24 15:47   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 04/16] cputlb: Use optimize_away in load/store_helpers Richard Henderson
2019-09-24  7:47   ` David Hildenbrand
2019-09-24 15:47   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 05/16] cputlb: Split out load/store_memop Richard Henderson
2019-09-24  7:48   ` David Hildenbrand
2019-09-24 15:51   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 06/16] cputlb: Introduce TLB_BSWAP Richard Henderson
2019-09-24 18:25   ` Alex Bennée
2019-09-25 17:36     ` Richard Henderson
2019-09-23 22:59 ` [PATCH v4 07/16] exec: Adjust notdirty tracing Richard Henderson
2019-09-24 21:53   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 08/16] cputlb: Move ROM handling from I/O path to TLB path Richard Henderson
2019-09-25  0:16   ` Alex Bennée
2019-09-25  6:59     ` David Hildenbrand
2019-09-25 16:01       ` Alex Bennée
2019-09-25 17:01       ` Richard Henderson
2019-09-23 22:59 ` [PATCH v4 09/16] cputlb: Move NOTDIRTY " Richard Henderson
2019-09-25 16:06   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 10/16] cputlb: Partially inline memory_region_section_get_iotlb Richard Henderson
2019-09-24  7:59   ` David Hildenbrand
2019-09-25 17:55     ` Richard Henderson
2019-09-25 19:40       ` David Hildenbrand
2019-09-25 16:12   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 11/16] cputlb: Merge and move memory_notdirty_write_{prepare, complete} Richard Henderson
2019-09-24  8:04   ` [PATCH v4 11/16] cputlb: Merge and move memory_notdirty_write_{prepare,complete} David Hildenbrand
2019-09-25 16:15   ` Alex Bennée [this message]
2019-09-23 23:00 ` [PATCH v4 12/16] cputlb: Handle TLB_NOTDIRTY in probe_access Richard Henderson
2019-09-24  8:05   ` David Hildenbrand
2019-09-25 16:21   ` Alex Bennée
2019-09-23 23:00 ` [PATCH v4 13/16] cputlb: Remove cpu->mem_io_vaddr Richard Henderson
2019-09-25 16:22   ` Alex Bennée
2019-09-23 23:00 ` [PATCH v4 14/16] cputlb: Remove tb_invalidate_phys_page_range is_cpu_write_access Richard Henderson
2019-09-25 16:23   ` Alex Bennée
2019-09-23 23:00 ` [PATCH v4 15/16] cputlb: Pass retaddr to tb_invalidate_phys_page_fast Richard Henderson
2019-09-25 16:28   ` Alex Bennée
2019-09-23 23:00 ` [PATCH v4 16/16] cputlb: Pass retaddr to tb_check_watchpoint Richard Henderson
2019-09-25 16:30   ` Alex Bennée
2019-09-25 18:52 ` [PATCH v4 00/16] Move rom and notdirty handling to cputlb Mark Cave-Ayland
2019-09-25 18:54   ` Mark Cave-Ayland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pnjo9wu0.fsf@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=david@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.