Re: [PATCH v4 11/16] cputlb: Merge and move memory_notdirty_write_{prepare,complete}

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: "Alex Bennée" <alex.bennee@linaro.org>
To: Richard Henderson <richard.henderson@linaro.org>
Cc: pbonzini@redhat.com, qemu-devel@nongnu.org, stefanha@redhat.com,
	david@redhat.com
Subject: Re: [PATCH v4 11/16] cputlb: Merge and move memory_notdirty_write_{prepare,complete}
Date: Wed, 25 Sep 2019 17:15:03 +0100	[thread overview]
Message-ID: <87pnjo9wu0.fsf@linaro.org> (raw)
In-Reply-To: <20190923230004.9231-12-richard.henderson@linaro.org>


Richard Henderson <richard.henderson@linaro.org> writes:

> Since 9458a9a1df1a, all readers of the dirty bitmaps wait
> for the rcu lock, which means that they wait until the end
> of any executing TranslationBlock.
>
> As a consequence, there is no need for the actual access
> to happen in between the _prepare and _complete.  Therefore,
> we can improve things by merging the two functions into
> notdirty_write and dropping the NotDirtyInfo structure.
>
> In addition, the only users of notdirty_write are in cputlb.c,
> so move the merged function there.  Pass in the CPUIOTLBEntry
> from which the ram_addr_t may be computed.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  include/exec/memory-internal.h | 65 -----------------------------
>  accel/tcg/cputlb.c             | 76 +++++++++++++++++++---------------
>  exec.c                         | 44 --------------------
>  3 files changed, 42 insertions(+), 143 deletions(-)
>
> diff --git a/include/exec/memory-internal.h b/include/exec/memory-internal.h
> index ef4fb92371..9fcc2af25c 100644
> --- a/include/exec/memory-internal.h
> +++ b/include/exec/memory-internal.h
> @@ -49,70 +49,5 @@ void address_space_dispatch_free(AddressSpaceDispatch *d);
>
>  void mtree_print_dispatch(struct AddressSpaceDispatch *d,
>                            MemoryRegion *root);
> -
> -struct page_collection;
> -
> -/* Opaque struct for passing info from memory_notdirty_write_prepare()
> - * to memory_notdirty_write_complete(). Callers should treat all fields
> - * as private, with the exception of @active.
> - *
> - * @active is a field which is not touched by either the prepare or
> - * complete functions, but which the caller can use if it wishes to
> - * track whether it has called prepare for this struct and so needs
> - * to later call the complete function.
> - */
> -typedef struct {
> -    CPUState *cpu;
> -    struct page_collection *pages;
> -    ram_addr_t ram_addr;
> -    vaddr mem_vaddr;
> -    unsigned size;
> -    bool active;
> -} NotDirtyInfo;
> -
> -/**
> - * memory_notdirty_write_prepare: call before writing to non-dirty memory
> - * @ndi: pointer to opaque NotDirtyInfo struct
> - * @cpu: CPU doing the write
> - * @mem_vaddr: virtual address of write
> - * @ram_addr: the ram address of the write
> - * @size: size of write in bytes
> - *
> - * Any code which writes to the host memory corresponding to
> - * guest RAM which has been marked as NOTDIRTY must wrap those
> - * writes in calls to memory_notdirty_write_prepare() and
> - * memory_notdirty_write_complete():
> - *
> - *  NotDirtyInfo ndi;
> - *  memory_notdirty_write_prepare(&ndi, ....);
> - *  ... perform write here ...
> - *  memory_notdirty_write_complete(&ndi);
> - *
> - * These calls will ensure that we flush any TCG translated code for
> - * the memory being written, update the dirty bits and (if possible)
> - * remove the slowpath callback for writing to the memory.
> - *
> - * This must only be called if we are using TCG; it will assert otherwise.
> - *
> - * We may take locks in the prepare call, so callers must ensure that
> - * they don't exit (via longjump or otherwise) without calling complete.
> - *
> - * This call must only be made inside an RCU critical section.
> - * (Note that while we're executing a TCG TB we're always in an
> - * RCU critical section, which is likely to be the case for callers
> - * of these functions.)
> - */
> -void memory_notdirty_write_prepare(NotDirtyInfo *ndi,
> -                                   CPUState *cpu,
> -                                   vaddr mem_vaddr,
> -                                   ram_addr_t ram_addr,
> -                                   unsigned size);
> -/**
> - * memory_notdirty_write_complete: finish write to non-dirty memory
> - * @ndi: pointer to the opaque NotDirtyInfo struct which was initialized
> - * by memory_not_dirty_write_prepare().
> - */
> -void memory_notdirty_write_complete(NotDirtyInfo *ndi);
> -
>  #endif
>  #endif
> diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c
> index 05530a8b0c..09b0df87c6 100644
> --- a/accel/tcg/cputlb.c
> +++ b/accel/tcg/cputlb.c
> @@ -33,6 +33,7 @@
>  #include "exec/helper-proto.h"
>  #include "qemu/atomic.h"
>  #include "qemu/atomic128.h"
> +#include "translate-all.h"
>
>  /* DEBUG defines, enable DEBUG_TLB_LOG to log to the CPU_LOG_MMU target */
>  /* #define DEBUG_TLB */
> @@ -1084,6 +1085,37 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env, target_ulong addr)
>      return qemu_ram_addr_from_host_nofail(p);
>  }
>
> +static void notdirty_write(CPUState *cpu, vaddr mem_vaddr, unsigned size,
> +                           CPUIOTLBEntry *iotlbentry, uintptr_t retaddr)
> +{
> +    ram_addr_t ram_addr = mem_vaddr + iotlbentry->addr;
> +
> +    trace_memory_notdirty_write_access(mem_vaddr, ram_addr, size);
> +
> +    if (!cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE)) {
> +        struct page_collection *pages
> +            = page_collection_lock(ram_addr, ram_addr + size);
> +
> +        /* We require mem_io_pc in tb_invalidate_phys_page_range.  */
> +        cpu->mem_io_pc = retaddr;
> +
> +        tb_invalidate_phys_page_fast(pages, ram_addr, size);
> +        page_collection_unlock(pages);
> +    }
> +
> +    /*
> +     * Set both VGA and migration bits for simplicity and to remove
> +     * the notdirty callback faster.
> +     */
> +    cpu_physical_memory_set_dirty_range(ram_addr, size, DIRTY_CLIENTS_NOCODE);
> +
> +    /* We remove the notdirty callback only if the code has been flushed. */
> +    if (!cpu_physical_memory_is_clean(ram_addr)) {
> +        trace_memory_notdirty_set_dirty(mem_vaddr);
> +        tlb_set_dirty(cpu, mem_vaddr);
> +    }
> +}
> +
>  /*
>   * Probe for whether the specified guest access is permitted. If it is not
>   * permitted then an exception will be taken in the same way as if this
> @@ -1203,8 +1235,7 @@ void *tlb_vaddr_to_host(CPUArchState *env, abi_ptr addr,
>  /* Probe for a read-modify-write atomic operation.  Do not allow unaligned
>   * operations, or io operations to proceed.  Return the host address.  */
>  static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
> -                               TCGMemOpIdx oi, uintptr_t retaddr,
> -                               NotDirtyInfo *ndi)
> +                               TCGMemOpIdx oi, uintptr_t retaddr)
>  {
>      size_t mmu_idx = get_mmuidx(oi);
>      uintptr_t index = tlb_index(env, mmu_idx, addr);
> @@ -1264,12 +1295,9 @@ static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
>
>      hostaddr = (void *)((uintptr_t)addr + tlbe->addend);
>
> -    ndi->active = false;
>      if (unlikely(tlb_addr & TLB_NOTDIRTY)) {
> -        ndi->active = true;
> -        memory_notdirty_write_prepare(ndi, env_cpu(env), addr,
> -                                      qemu_ram_addr_from_host_nofail(hostaddr),
> -                                      1 << s_bits);
> +        notdirty_write(env_cpu(env), addr, 1 << s_bits,
> +                       &env_tlb(env)->d[mmu_idx].iotlb[index], retaddr);
>      }
>
>      return hostaddr;
> @@ -1636,28 +1664,13 @@ store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
>              return;
>          }
>
> -        haddr = (void *)((uintptr_t)addr + entry->addend);
> -
>          /* Handle clean RAM pages.  */
>          if (tlb_addr & TLB_NOTDIRTY) {
> -            NotDirtyInfo ndi;
> -
> -            /* We require mem_io_pc in tb_invalidate_phys_page_range.  */
> -            env_cpu(env)->mem_io_pc = retaddr;
> -
> -            memory_notdirty_write_prepare(&ndi, env_cpu(env), addr,
> -                                          addr + iotlbentry->addr, size);
> -
> -            if (unlikely(need_swap)) {
> -                store_memop(haddr, val, op ^ MO_BSWAP);
> -            } else {
> -                store_memop(haddr, val, op);
> -            }
> -
> -            memory_notdirty_write_complete(&ndi);
> -            return;
> +            notdirty_write(env_cpu(env), addr, size, iotlbentry, retaddr);
>          }
>
> +        haddr = (void *)((uintptr_t)addr + entry->addend);
> +
>          if (unlikely(need_swap)) {
>              store_memop(haddr, val, op ^ MO_BSWAP);
>          } else {
> @@ -1783,14 +1796,9 @@ void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
>  #define EXTRA_ARGS     , TCGMemOpIdx oi, uintptr_t retaddr
>  #define ATOMIC_NAME(X) \
>      HELPER(glue(glue(glue(atomic_ ## X, SUFFIX), END), _mmu))
> -#define ATOMIC_MMU_DECLS NotDirtyInfo ndi
> -#define ATOMIC_MMU_LOOKUP atomic_mmu_lookup(env, addr, oi, retaddr, &ndi)
> -#define ATOMIC_MMU_CLEANUP                              \
> -    do {                                                \
> -        if (unlikely(ndi.active)) {                     \
> -            memory_notdirty_write_complete(&ndi);       \
> -        }                                               \
> -    } while (0)
> +#define ATOMIC_MMU_DECLS
> +#define ATOMIC_MMU_LOOKUP atomic_mmu_lookup(env, addr, oi, retaddr)
> +#define ATOMIC_MMU_CLEANUP
>
>  #define DATA_SIZE 1
>  #include "atomic_template.h"
> @@ -1818,7 +1826,7 @@ void helper_be_stq_mmu(CPUArchState *env, target_ulong addr, uint64_t val,
>  #undef ATOMIC_MMU_LOOKUP
>  #define EXTRA_ARGS         , TCGMemOpIdx oi
>  #define ATOMIC_NAME(X)     HELPER(glue(glue(atomic_ ## X, SUFFIX), END))
> -#define ATOMIC_MMU_LOOKUP  atomic_mmu_lookup(env, addr, oi, GETPC(), &ndi)
> +#define ATOMIC_MMU_LOOKUP  atomic_mmu_lookup(env, addr, oi, GETPC())
>
>  #define DATA_SIZE 1
>  #include "atomic_template.h"
> diff --git a/exec.c b/exec.c
> index 961d7d6497..7d835b1a2b 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -2718,50 +2718,6 @@ ram_addr_t qemu_ram_addr_from_host(void *ptr)
>      return block->offset + offset;
>  }
>
> -/* Called within RCU critical section. */
> -void memory_notdirty_write_prepare(NotDirtyInfo *ndi,
> -                          CPUState *cpu,
> -                          vaddr mem_vaddr,
> -                          ram_addr_t ram_addr,
> -                          unsigned size)
> -{
> -    ndi->cpu = cpu;
> -    ndi->ram_addr = ram_addr;
> -    ndi->mem_vaddr = mem_vaddr;
> -    ndi->size = size;
> -    ndi->pages = NULL;
> -
> -    trace_memory_notdirty_write_access(mem_vaddr, ram_addr, size);
> -
> -    assert(tcg_enabled());
> -    if (!cpu_physical_memory_get_dirty_flag(ram_addr, DIRTY_MEMORY_CODE)) {
> -        ndi->pages = page_collection_lock(ram_addr, ram_addr + size);
> -        tb_invalidate_phys_page_fast(ndi->pages, ram_addr, size);
> -    }
> -}
> -
> -/* Called within RCU critical section. */
> -void memory_notdirty_write_complete(NotDirtyInfo *ndi)
> -{
> -    if (ndi->pages) {
> -        assert(tcg_enabled());
> -        page_collection_unlock(ndi->pages);
> -        ndi->pages = NULL;
> -    }
> -
> -    /* Set both VGA and migration bits for simplicity and to remove
> -     * the notdirty callback faster.
> -     */
> -    cpu_physical_memory_set_dirty_range(ndi->ram_addr, ndi->size,
> -                                        DIRTY_CLIENTS_NOCODE);
> -    /* we remove the notdirty callback only if the code has been
> -       flushed */
> -    if (!cpu_physical_memory_is_clean(ndi->ram_addr)) {
> -        trace_memory_notdirty_set_dirty(ndi->mem_vaddr);
> -        tlb_set_dirty(ndi->cpu, ndi->mem_vaddr);
> -    }
> -}
> -
>  /* Generate a debug exception if a watchpoint has been hit.  */
>  void cpu_check_watchpoint(CPUState *cpu, vaddr addr, vaddr len,
>                            MemTxAttrs attrs, int flags, uintptr_t ra)


--
Alex Bennée

next prev parent reply	other threads:[~2019-09-25 16:18 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-09-23 22:59 [PATCH v4 00/16] Move rom and notdirty handling to cputlb Richard Henderson
2019-09-23 22:59 ` [PATCH v4 01/16] exec: Use TARGET_PAGE_BITS_MIN for TLB flags Richard Henderson
2019-09-24 13:53   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 02/16] cputlb: Disable __always_inline__ without optimization Richard Henderson
2019-09-24 13:56   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 03/16] qemu/compiler.h: Add optimize_away Richard Henderson
2019-09-24  7:47   ` David Hildenbrand
2019-09-24 17:27     ` Richard Henderson
2019-09-24 17:29       ` David Hildenbrand
2019-09-24 15:47   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 04/16] cputlb: Use optimize_away in load/store_helpers Richard Henderson
2019-09-24  7:47   ` David Hildenbrand
2019-09-24 15:47   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 05/16] cputlb: Split out load/store_memop Richard Henderson
2019-09-24  7:48   ` David Hildenbrand
2019-09-24 15:51   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 06/16] cputlb: Introduce TLB_BSWAP Richard Henderson
2019-09-24 18:25   ` Alex Bennée
2019-09-25 17:36     ` Richard Henderson
2019-09-23 22:59 ` [PATCH v4 07/16] exec: Adjust notdirty tracing Richard Henderson
2019-09-24 21:53   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 08/16] cputlb: Move ROM handling from I/O path to TLB path Richard Henderson
2019-09-25  0:16   ` Alex Bennée
2019-09-25  6:59     ` David Hildenbrand
2019-09-25 16:01       ` Alex Bennée
2019-09-25 17:01       ` Richard Henderson
2019-09-23 22:59 ` [PATCH v4 09/16] cputlb: Move NOTDIRTY " Richard Henderson
2019-09-25 16:06   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 10/16] cputlb: Partially inline memory_region_section_get_iotlb Richard Henderson
2019-09-24  7:59   ` David Hildenbrand
2019-09-25 17:55     ` Richard Henderson
2019-09-25 19:40       ` David Hildenbrand
2019-09-25 16:12   ` Alex Bennée
2019-09-23 22:59 ` [PATCH v4 11/16] cputlb: Merge and move memory_notdirty_write_{prepare, complete} Richard Henderson
2019-09-24  8:04   ` [PATCH v4 11/16] cputlb: Merge and move memory_notdirty_write_{prepare,complete} David Hildenbrand
2019-09-25 16:15   ` Alex Bennée [this message]
2019-09-23 23:00 ` [PATCH v4 12/16] cputlb: Handle TLB_NOTDIRTY in probe_access Richard Henderson
2019-09-24  8:05   ` David Hildenbrand
2019-09-25 16:21   ` Alex Bennée
2019-09-23 23:00 ` [PATCH v4 13/16] cputlb: Remove cpu->mem_io_vaddr Richard Henderson
2019-09-25 16:22   ` Alex Bennée
2019-09-23 23:00 ` [PATCH v4 14/16] cputlb: Remove tb_invalidate_phys_page_range is_cpu_write_access Richard Henderson
2019-09-25 16:23   ` Alex Bennée
2019-09-23 23:00 ` [PATCH v4 15/16] cputlb: Pass retaddr to tb_invalidate_phys_page_fast Richard Henderson
2019-09-25 16:28   ` Alex Bennée
2019-09-23 23:00 ` [PATCH v4 16/16] cputlb: Pass retaddr to tb_check_watchpoint Richard Henderson
2019-09-25 16:30   ` Alex Bennée
2019-09-25 18:52 ` [PATCH v4 00/16] Move rom and notdirty handling to cputlb Mark Cave-Ayland
2019-09-25 18:54   ` Mark Cave-Ayland

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87pnjo9wu0.fsf@linaro.org \
    --to=alex.bennee@linaro.org \
    --cc=david@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=stefanha@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).