Re: [Qemu-devel] [RFC 02/10] softmmu_llsc_template.h: Move to multi-threading

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

From: Sergey Fedorov <serge.fdrv@gmail.com>
To: "Alex Bennée" <alex.bennee@linaro.org>
Cc: Alvise Rigo <a.rigo@virtualopensystems.com>,
	mttcg@listserver.greensocs.com, qemu-devel@nongnu.org,
	jani.kokkonen@huawei.com, claudio.fontana@huawei.com,
	tech@virtualopensystems.com, fred.konrad@greensocs.com,
	pbonzini@redhat.com, rth@twiddle.net, cota@braap.org,
	peter.maydell@linaro.org
Subject: Re: [Qemu-devel] [RFC 02/10] softmmu_llsc_template.h: Move to multi-threading
Date: Tue, 14 Jun 2016 12:31:19 +0300	[thread overview]
Message-ID: <575FCEE7.5060807@gmail.com> (raw)
In-Reply-To: <87bn34go84.fsf@linaro.org>

On 14/06/16 11:37, Alex Bennée wrote:
> Alex Bennée <alex.bennee@linaro.org> writes:
>
>> Sergey Fedorov <serge.fdrv@gmail.com> writes:
>>
>>> On 26/05/16 19:35, Alvise Rigo wrote:
>>>> Using tcg_exclusive_{lock,unlock}(), make the emulation of
>>>> LoadLink/StoreConditional thread safe.
>>>>
>>>> During an LL access, this lock protects the load access itself, the
>>>> update of the exclusive history and the update of the VCPU's protected
>>>> range.  In a SC access, the lock protects the store access itself, the
>>>> possible reset of other VCPUs' protected range and the reset of the
>>>> exclusive context of calling VCPU.
>>>>
>>>> The lock is also taken when a normal store happens to access an
>>>> exclusive page to reset other VCPUs' protected range in case of
>>>> collision.
>>> I think the key problem here is that the load in LL helper can race with
>>> a concurrent regular fast-path store. It's probably easier to annotate
>>> the source here:
>>>
>>>      1  WORD_TYPE helper_ldlink_name(CPUArchState *env, target_ulong addr,
>>>      2                                  TCGMemOpIdx oi, uintptr_t retaddr)
>>>      3  {
>>>      4      WORD_TYPE ret;
>>>      5      int index;
>>>      6      CPUState *this_cpu = ENV_GET_CPU(env);
>>>      7      CPUClass *cc = CPU_GET_CLASS(this_cpu);
>>>      8      hwaddr hw_addr;
>>>      9      unsigned mmu_idx = get_mmuidx(oi);
>>>
>>>     10      index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
>>>
>>>     11      tcg_exclusive_lock();
>>>
>>>     12      /* Use the proper load helper from cpu_ldst.h */
>>>     13      ret = helper_ld(env, addr, oi, retaddr);
>>>
>>>     14      /* hw_addr = hwaddr of the page (i.e. section->mr->ram_addr
>>> + xlat)
>>>     15       * plus the offset (i.e. addr & ~TARGET_PAGE_MASK) */
>>>     16      hw_addr = (env->iotlb[mmu_idx][index].addr &
>>> TARGET_PAGE_MASK) + addr;
>>>     17      if (likely(!(env->tlb_table[mmu_idx][index].addr_read &
>>> TLB_MMIO))) {
>>>     18          /* If all the vCPUs have the EXCL bit set for this page
>>> there is no need
>>>     19           * to request any flush. */
>>>     20          if (cpu_physical_memory_set_excl(hw_addr)) {
>>>     21              CPUState *cpu;
>>>
>>>     22              excl_history_put_addr(hw_addr);
>>>     23              CPU_FOREACH(cpu) {
>>>     24                  if (this_cpu != cpu) {
>>>     25                      tlb_flush_other(this_cpu, cpu, 1);
>>>     26                  }
>>>     27              }
>>>     28          }
>>>     29          /* For this vCPU, just update the TLB entry, no need to
>>> flush. */
>>>     30          env->tlb_table[mmu_idx][index].addr_write |= TLB_EXCL;
>>>     31      } else {
>>>     32          /* Set a pending exclusive access in the MemoryRegion */
>>>     33          MemoryRegion *mr = iotlb_to_region(this_cpu,
>>>     34
>>> env->iotlb[mmu_idx][index].addr,
>>>     35
>>> env->iotlb[mmu_idx][index].attrs);
>>>     36          mr->pending_excl_access = true;
>>>     37      }
>>>
>>>     38      cc->cpu_set_excl_protected_range(this_cpu, hw_addr, DATA_SIZE);
>>>
>>>     39      tcg_exclusive_unlock();
>>>
>>>     40      /* From now on we are in LL/SC context */
>>>     41      this_cpu->ll_sc_context = true;
>>>
>>>     42      return ret;
>>>     43  }
>>>
>>>
>>> The exclusive lock at line 11 doesn't help if concurrent fast-patch
>>> store at this address occurs after we finished load at line 13 but
>>> before TLB is flushed as a result of line 25. If we reorder the load to
>>> happen after the TLB flush request we still must be sure that the flush
>>> is complete before we can do the load safely.
>> I think this can be fixed using async_safe_run_on_cpu and tweaking the
>> ldlink helper.
>>
>>   * Change the helper_ldlink call
>>     - pass it offset-of(cpu->reg[n]) so it can store result of load
>>     - maybe pass it next-pc (unless there is some other way to know)
>>
>>   vCPU runs until the ldlink instruction occurs and jumps to the helper
>>
>>   * Once in the helper_ldlink
>>     - queue up an async helper function with info of offset
>>     - cpu_loop_exit_restore(with next PC)
>>
>>   vCPU the issued the ldlink exits immediately, waits until all vCPUs are
>>   out of generated code.
>>
>>   * Once in helper_ldlink async helper
>>     - Everything at this point is quiescent, no vCPU activity
>>     - Flush all TLBs/set flags
>>     - Do the load from memory, store directly into cpu->reg[n]
>>
>> The key thing is once we are committed to load in the async helper
>> nothing else can get in the way. Any stores before we are in the helper
>> happen as normal, once we exit the async helper all potential
>> conflicting stores will slow path.
>>
>> There is a little messing about in knowing the next PC which is simple
>> in the ARM case but gets a bit more complicated for architectures that
>> have deferred jump slots. I haven't looked into this nit yet.
> Thinking on it further the messing about with offset and PCs could be
> solved just by having a simple flag:
>
>  - First run, no flag set, queue safe work, restart loop at PC
>  - Second run, flag set, do load, clear flag
>
> As long as the flag is per-vCPU I think its safe from races.

As soon as we flushed TLBs properly, we can set an exclusive flag for
the page. Next attempt to execute the LL we can do actual load and
proceed further. I think there's no need to introduce some special flag.
That is basically what Alvise suggested, see
http://thread.gmane.org/gmane.comp.emulators.qemu/413978/focus=417787.

Kind regards,
Sergey

next prev parent reply	other threads:[~2016-06-14  9:31 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-05-26 16:35 [Qemu-devel] [RFC 00/10] MTTCG: Slow-path for atomic insns Alvise Rigo
2016-05-26 16:35 ` [Qemu-devel] [RFC 01/10] exec: Introduce tcg_exclusive_{lock, unlock}() Alvise Rigo
2016-05-31 15:03   ` Pranith Kumar
2016-06-02 16:21     ` alvise rigo
2016-06-08  9:21   ` Alex Bennée
2016-06-08 10:00     ` alvise rigo
2016-06-08 11:32       ` Peter Maydell
2016-06-08 13:52       ` Alex Bennée
2016-05-26 16:35 ` [Qemu-devel] [RFC 02/10] softmmu_llsc_template.h: Move to multi-threading Alvise Rigo
2016-06-10 15:21   ` Sergey Fedorov
2016-06-10 15:53     ` alvise rigo
2016-06-10 16:00       ` Sergey Fedorov
2016-06-10 16:04         ` alvise rigo
2016-06-14 12:00       ` Alex Bennée
2016-06-14 12:58         ` alvise rigo
2016-06-14 13:14           ` Alex Bennée
2016-06-10 16:15     ` Alex Bennée
2016-06-11 19:53       ` Sergey Fedorov
2016-06-14  8:37       ` Alex Bennée
2016-06-14  9:31         ` Sergey Fedorov [this message]
2016-05-26 16:35 ` [Qemu-devel] [RFC 03/10] cpus: Introduce async_wait_run_on_cpu() Alvise Rigo
2016-06-08 13:54   ` Alex Bennée
2016-06-08 14:10     ` alvise rigo
2016-06-08 14:53       ` Sergey Fedorov
2016-06-08 15:20         ` Alex Bennée
2016-06-08 16:24           ` alvise rigo
2016-06-13  9:26             ` Alex Bennée
2016-05-26 16:35 ` [Qemu-devel] [RFC 04/10] cputlb: Introduce tlb_flush_other() Alvise Rigo
2016-05-26 16:35 ` [Qemu-devel] [RFC 05/10] target-arm: End TB after ldrex instruction Alvise Rigo
2016-05-26 16:35 ` [Qemu-devel] [RFC 06/10] cputlb: Add tlb_tables_flush_bitmap() Alvise Rigo
2016-05-26 16:35 ` [Qemu-devel] [RFC 07/10] cputlb: Query tlb_flush_by_mmuidx Alvise Rigo
2016-05-26 16:35 ` [Qemu-devel] [RFC 08/10] cputlb: Query tlb_flush_page_by_mmuidx Alvise Rigo
2016-05-26 16:35 ` [Qemu-devel] [RFC 09/10] cputlb: Query tlb_flush_page_all Alvise Rigo
2016-05-26 16:35 ` [Qemu-devel] [RFC 10/10] cpus: Do not sleep if some work item is pending Alvise Rigo
2016-06-10 15:21 ` [Qemu-devel] [RFC 00/10] MTTCG: Slow-path for atomic insns Alex Bennée
2016-06-10 15:30   ` alvise rigo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=575FCEE7.5060807@gmail.com \
    --to=serge.fdrv@gmail.com \
    --cc=a.rigo@virtualopensystems.com \
    --cc=alex.bennee@linaro.org \
    --cc=claudio.fontana@huawei.com \
    --cc=cota@braap.org \
    --cc=fred.konrad@greensocs.com \
    --cc=jani.kokkonen@huawei.com \
    --cc=mttcg@listserver.greensocs.com \
    --cc=pbonzini@redhat.com \
    --cc=peter.maydell@linaro.org \
    --cc=qemu-devel@nongnu.org \
    --cc=rth@twiddle.net \
    --cc=tech@virtualopensystems.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).