Re: [Qemu-devel] [RFC PATCH V7 00/19] Multithread TCG.

From: Claudio Fontana <claudio.fontana@huawei.com>
To: Frederic Konrad <fred.konrad@greensocs.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: mttcg@greensocs.com, mark.burton@greensocs.com,
	qemu-devel@nongnu.org, a.rigo@virtualopensystems.com,
	guillaume.delbergue@greensocs.com, pbonzini@redhat.com,
	alex.bennee@linaro.org
Subject: Re: [Qemu-devel] [RFC PATCH V7 00/19] Multithread TCG.
Date: Wed, 7 Oct 2015 14:46:31 +0200
Message-ID: <56151427.4080809@huawei.com>
In-Reply-To: <55C995CB.3080300@greensocs.com>

Hello Frederic,

On 11.08.2015 08:27, Frederic Konrad wrote:
> On 11/08/2015 08:15, Benjamin Herrenschmidt wrote:
>> On Mon, 2015-08-10 at 17:26 +0200, fred.konrad@greensocs.com wrote:
>>> From: KONRAD Frederic <fred.konrad@greensocs.com>
>>>
>>> This is the 7th round of the MTTCG patch series.
>>>
>>>
>>> It can be cloned from:
>>> git@git.greensocs.com:fkonrad/mttcg.git branch multi_tcg_v7.

Would it be possible to rebase on the latest QEMU? I wonder if mttcg is diverging a bit too much from mainline,
which will make it more difficult to rebase later. (Or did I get confused about all these repos?)

Thank you!

Claudio

>>>
>>> This patch-set tries to address the different issues in the global picture of
>>> MTTCG, presented on the wiki.
>>>
>>> == Needed patch for our work ==
>>>
>>> Some preliminaries are needed for our work:
>>>   * current_cpu doesn't make sense in mttcg so a tcg_executing flag is added to
>>>     the CPUState.
>> Can't you just make it a TLS ?
> 
> True, that can be done as well. But the tcg_exec_flag has a second meaning: it says
> "you can't start executing code right now because I want to do a safe_work".
>>
>>>   * We need to run some work safely when all VCPUs are outside their execution
>>>     loop. This is done with the async_run_safe_work_on_cpu function introduced
>>>     in this series.
>>>   * QemuSpin lock is introduced (on POSIX only so far) to allow faster handling
>>>     of atomic instructions.
>> How do you handle the memory model? I.e., ARM and PPC are out-of-order while x86
>> is (mostly) in order, so emulating ARM/PPC on x86 is fine but emulating
>> x86 on ARM or PPC will lead to problems unless you generate memory
>> barriers with every load/store...
> 
> For the moment we are trying to do the first case.
>>
>> At least on POWER7 and later on PPC we have the possibility of setting
>> the attribute "Strong Access Ordering" with mremap/mprotect (I don't
>> remember which one) which gives us x86-like memory semantics...
>>
>> I don't know if ARM supports something similar. On the other hand, when
>> emulating ARM on PPC or vice-versa, we can probably get away with no
>> barriers.
>>
>> Do you expose some kind of guest memory model info to the TCG backend so
>> it can decide how to handle these things ?
>>
>>> == Code generation and cache ==
>>>
>>> As Qemu stands, there is no protection at all against two threads attempting to
>>> generate code at the same time or modifying a TranslationBlock.
>>> The "protect TBContext with tb_lock" patch addresses the issue of code generation
>>> and makes all the tb_* functions thread safe (except tb_flush).
>>> This raised the question of one or multiple caches. We chose to use one
>>> unified cache because it's easier as a first step and since the structure of
>>> QEMU effectively has a ‘local’ cache per CPU in the form of the jump cache, we
>>> don't see the benefit of having two pools of tbs.
>>>
>>> == Dirty tracking ==
>>>
>>> Protecting the IOs:
>>> To allow all VCPU threads to run at the same time we need to drop the
>>> global_mutex as soon as possible. I/O accesses need to take the mutex. This is
>>> likely to change when http://thread.gmane.org/gmane.comp.emulators.qemu/345258
>>> is upstreamed.
>>>
>>> Invalidation of TranslationBlocks:
>>> We can have all VCPUs running during an invalidation. Each VCPU is able to clean
>>> its jump cache itself as it is in CPUState so that can be handled by a simple
>>> call to async_run_on_cpu. However tb_invalidate also writes to the
>>> TranslationBlock which is shared as we have only one pool.
>>> Hence this part of invalidate requires all VCPUs to exit before it can be done.
>>> Hence the async_run_safe_work_on_cpu is introduced to handle this case.
>> What about the host MMU emulation ? Is that multithreaded ? It has
>> potential issues when doing things like dirty bit updates into guest
>> memory, those need to be done atomically. Also TLB invalidations on ARM
>> and PPC are global, so they will need to invalidate the remote SW TLBs
>> as well.
>>
>> Do you have a mechanism to synchronize with another thread ? IE, make it
>> pop out of TCG if already in and prevent it from getting in ? That way
>> you can "remotely" invalidate its TLB...
> Yes, that's what the safe_work is doing. Ask everybody to exit, prevent VCPUs from
> resuming (tcg_exec_flag), and do the work when everybody is outside cpu-exec.
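
The safe_work idea described above (ask every VCPU to leave its execution loop, keep them from re-entering, then run the work) can be illustrated with a small sketch. This is not the code from the series: the names vcpu_enter_exec, vcpu_leave_exec, run_safe_work and example_work are hypothetical, a single pending flag stands in for tcg_exec_flag, and the step that kicks a running VCPU out of translated code is left out.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical state; a real implementation would keep this per CPUState. */
static pthread_mutex_t safe_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t safe_cond = PTHREAD_COND_INITIALIZER;
static int vcpus_executing;     /* VCPUs currently inside their execution loop */
static int safe_work_pending;   /* non-zero: nobody may (re)enter execution */

/* A VCPU thread calls this before it starts executing guest code. */
static void vcpu_enter_exec(void)
{
    pthread_mutex_lock(&safe_lock);
    while (safe_work_pending) {
        pthread_cond_wait(&safe_cond, &safe_lock);
    }
    vcpus_executing++;
    pthread_mutex_unlock(&safe_lock);
}

/* A VCPU thread calls this after it has left its execution loop. */
static void vcpu_leave_exec(void)
{
    pthread_mutex_lock(&safe_lock);
    vcpus_executing--;
    pthread_cond_broadcast(&safe_cond);
    pthread_mutex_unlock(&safe_lock);
}

/*
 * Run func while no VCPU is executing. Must be called by a thread that is
 * not itself between vcpu_enter_exec() and vcpu_leave_exec().
 */
static void run_safe_work(void (*func)(void *), void *opaque)
{
    pthread_mutex_lock(&safe_lock);
    while (safe_work_pending) {             /* one requester at a time */
        pthread_cond_wait(&safe_cond, &safe_lock);
    }
    safe_work_pending = 1;                  /* block new entries */
    /* Omitted: signal running VCPUs so that they exit promptly. */
    while (vcpus_executing > 0) {
        pthread_cond_wait(&safe_cond, &safe_lock);
    }
    func(opaque);                           /* everybody is outside */
    safe_work_pending = 0;
    pthread_cond_broadcast(&safe_cond);
    pthread_mutex_unlock(&safe_lock);
}

/* Example work item: bump a counter passed through the opaque pointer. */
static void example_work(void *opaque)
{
    ++*(int *)opaque;
}
```

Holding the mutex while func runs keeps the sketch short; it also means the work item must not call back into these helpers.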
> 
>>
>>> == Atomic instruction ==
>>>
>>> For now only ARM on x64 is supported by using a cmpxchg instruction.
>>> Specifically the limitation of this approach is that it is harder to support
>>> 64bit ARM on a host architecture that is multi-core, but only supports 32 bit
>>> cmpxchg (we believe this could be the case for some PPC cores).
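
As a rough illustration of the cmpxchg approach for a 32-bit exclusive pair: the load-exclusive remembers the value it read, and the store-exclusive succeeds only if a compare-and-swap against that value succeeds. GuestCPU, emu_ldrex and emu_strex are hypothetical names, not the ones used in the series, and the sketch ignores the exclusive address and access size.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-VCPU exclusive monitor state. */
typedef struct {
    uint32_t exclusive_val;   /* value seen by the last load-exclusive */
    bool exclusive_valid;     /* true between LDREX and the next STREX */
} GuestCPU;

/* Emulate LDREX: load the word and remember what was read. */
static uint32_t emu_ldrex(GuestCPU *cpu, _Atomic uint32_t *addr)
{
    uint32_t val = atomic_load(addr);
    cpu->exclusive_val = val;
    cpu->exclusive_valid = true;
    return val;
}

/*
 * Emulate STREX: store newval only if memory still holds the remembered
 * value. Returns 0 on success and 1 on failure, as the ARM instruction does.
 */
static int emu_strex(GuestCPU *cpu, _Atomic uint32_t *addr, uint32_t newval)
{
    uint32_t expected = cpu->exclusive_val;

    if (!cpu->exclusive_valid) {
        return 1;
    }
    cpu->exclusive_valid = false;
    return atomic_compare_exchange_strong(addr, &expected, newval) ? 0 : 1;
}
```

Because only the value is compared, a location that was changed and then changed back in between is not detected, which a real exclusive monitor would catch. A 64-bit guest access would need a 64-bit compare-and-swap on the host, which is the limitation mentioned above.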
>> Right, on the other hand 64-bit will do fine. But then x86 has 2-value
>> atomics nowadays, doesn't it ? And that will be hard to emulate on
>> anything. You might need to have some kind of global hashed lock list
>> used by atomics (hash the physical address) as a fallback if you don't
>> have a 1:1 match between host and guest capabilities.
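
The hashed lock list suggested above could look roughly like the following. This is only a sketch under assumptions: the table size, the hash, and the names atomic_locks_init, lock_for_paddr and emulated_cmpxchg16 are all made up for illustration, and the scheme is only correct if every access that can race with the emulated atomic takes the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NUM_ATOMIC_LOCKS 64   /* hypothetical table size (2^6 entries) */

static pthread_mutex_t atomic_locks[NUM_ATOMIC_LOCKS];

/* Call once before any emulated atomic runs. */
static void atomic_locks_init(void)
{
    for (int i = 0; i < NUM_ATOMIC_LOCKS; i++) {
        pthread_mutex_init(&atomic_locks[i], NULL);
    }
}

/*
 * Pick a lock from the guest physical address. The low 4 bits are dropped
 * so that both halves of a 16-byte aligned pair map to the same lock.
 */
static pthread_mutex_t *lock_for_paddr(uint64_t paddr)
{
    uint64_t h = (paddr >> 4) * 0x9E3779B97F4A7C15ULL;  /* multiplicative hash */
    return &atomic_locks[h >> 58];                      /* top 6 bits: 0..63 */
}

/*
 * Fallback for a 2-value compare-and-swap when the host has no matching
 * instruction. Returns 1 if the swap happened, 0 otherwise.
 */
static int emulated_cmpxchg16(uint64_t paddr, uint64_t mem[2],
                              const uint64_t expected[2],
                              const uint64_t desired[2])
{
    pthread_mutex_t *lock = lock_for_paddr(paddr);
    int ok;

    pthread_mutex_lock(lock);
    ok = (mem[0] == expected[0] && mem[1] == expected[1]);
    if (ok) {
        mem[0] = desired[0];
        mem[1] = desired[1];
    }
    pthread_mutex_unlock(lock);
    return ok;
}
```

Hashing the physical rather than the virtual address means two mappings of the same guest page end up on the same lock.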
> VOS did a "Slow path for atomic instruction translation" series you can find here:
> https://lists.gnu.org/archive/html/qemu-devel/2015-08/msg00971.html
> 
> Which will be used in the end.
> 
> Thanks,
> Fred
>>
>> Cheers,
>> Ben.
