From: Alex Bennée
To: Benjamin Herrenschmidt
Cc: mttcg@greensocs.com, mark.burton@greensocs.com,
    a.rigo@virtualopensystems.com, qemu-devel@nongnu.org,
    guillaume.delbergue@greensocs.com, pbonzini@redhat.com,
    fred.konrad@greensocs.com
Subject: Re: [Qemu-devel] [RFC PATCH V7 00/19] Multithread TCG.
Date: Tue, 11 Aug 2015 08:54:15 +0100
Message-ID: <87614mgt08.fsf@linaro.org>
In-Reply-To: <1439273709.14448.102.camel@kernel.crashing.org>
References: <1439220437-23957-1-git-send-email-fred.konrad@greensocs.com>
 <1439273709.14448.102.camel@kernel.crashing.org>

Benjamin Herrenschmidt writes:

> On Mon, 2015-08-10 at 17:26 +0200, fred.konrad@greensocs.com wrote:
>> From: KONRAD Frederic
>>
>> This is the 7th round of the MTTCG patch series.
>>
>> It can be cloned from:
>> git@git.greensocs.com:fkonrad/mttcg.git branch multi_tcg_v7.
>>
>> This patch set tries to address the different issues in the global
>> picture of MTTCG, presented on the wiki.
>>
>> == Needed patch for our work ==
>>
>> Some preliminaries are needed for our work:
>> * current_cpu doesn't make sense in mttcg, so a tcg_executing flag is
>>   added to the CPUState.
>
> Can't you just make it a TLS ?
>
>> * We need to run some work safely when all VCPUs are outside their
>>   execution loop. This is done with the async_run_safe_work_on_cpu
>>   function introduced in this series.
>> * QemuSpin lock is introduced (POSIX only for now) to allow faster
>>   handling of atomic instructions.
>
> How do you handle the memory model ? IE, ARM and PPC are out-of-order
> while x86 is (mostly) in order, so emulating ARM/PPC on x86 is fine but
> emulating x86 on ARM or PPC will lead to problems unless you generate
> memory barriers with every load/store ..

This is the next chunk of work. We have Alvise's LL/SC patches, which
allow us to do proper emulation of ARM's load/store exclusive behaviour,
and any weakly ordered target will have to use such constructs.

Currently the plan is to introduce a barrier TCG op which will translate
to the strongest backend barrier available. Even x86 should be using
barriers to ensure cross-core visibility, which then leaves load/store
re-ordering on the same core.

> At least on POWER7 and later on PPC we have the possibility of setting
> the attribute "Strong Access Ordering" with mremap/mprotect (I don't
> remember which one) which gives us x86-like memory semantics...
>
> I don't know if ARM supports something similar. On the other hand, when
> emulating ARM on PPC or vice versa, we can probably get away with no
> barriers.
>
> Do you expose some kind of guest memory model info to the TCG backend so
> it can decide how to handle these things ?
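
As a rough standalone sketch of that barrier plan (and of handing some
guest ordering information down to the backend), the lowering could look
something like the snippet below. None of this is code from the series;
GuestBarrierKind and backend_emit_barrier are names invented for the
example, and a real backend would pick a cheaper fence where it can.

/* Standalone sketch only -- not code from this series.  The front end
 * emits one "barrier" op for a guest fence (e.g. an ARM DMB) and each
 * backend lowers it to whatever it has, falling back to its strongest
 * fence.  All names below are invented for the example. */
#include <stdatomic.h>
#include <stdio.h>

typedef enum {
    GUEST_BAR_LOAD_LOAD,    /* guest only needs load/load ordering   */
    GUEST_BAR_STORE_STORE,  /* guest only needs store/store ordering */
    GUEST_BAR_FULL,         /* guest wants a full barrier            */
} GuestBarrierKind;

/* A conservative backend can simply ignore the hint and issue its
 * strongest fence; a smarter one maps each kind to the cheapest host
 * barrier that is still strong enough. */
static void backend_emit_barrier(GuestBarrierKind kind)
{
    (void)kind;
    atomic_thread_fence(memory_order_seq_cst);
}

int main(void)
{
    /* e.g. translating a guest DMB would end up here */
    backend_emit_barrier(GUEST_BAR_FULL);
    printf("full fence issued on the host\n");
    return 0;
}
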
>> == Code generation and cache ==
>>
>> As QEMU stands, there is no protection at all against two threads
>> attempting to generate code at the same time or modifying a
>> TranslationBlock.
>> The "protect TBContext with tb_lock" patch addresses the issue of code
>> generation and makes all the tb_* functions thread safe (except
>> tb_flush).
>> This raised the question of one or multiple caches. We chose to use one
>> unified cache because it's easier as a first step, and since the
>> structure of QEMU effectively has a 'local' cache per CPU in the form
>> of the jump cache, we don't see the benefit of having two pools of TBs.
>>
>> == Dirty tracking ==
>>
>> Protecting the IOs:
>> To allow all VCPU threads to run at the same time we need to drop the
>> global_mutex as soon as possible. The IO accesses need to take the
>> mutex. This is likely to change when
>> http://thread.gmane.org/gmane.comp.emulators.qemu/345258 is upstreamed.
>>
>> Invalidation of TranslationBlocks:
>> We can have all VCPUs running during an invalidation. Each VCPU is able
>> to clean its jump cache itself, as it is in CPUState, so that can be
>> handled by a simple call to async_run_on_cpu. However, tb_invalidate
>> also writes to the TranslationBlock, which is shared as we have only
>> one pool. Hence this part of the invalidation requires all VCPUs to
>> exit before it can be done, and async_run_safe_work_on_cpu is
>> introduced to handle this case.
>
> What about the host MMU emulation ? Is that multithreaded ? It has
> potential issues when doing things like dirty bit updates into guest
> memory, those need to be done atomically. Also TLB invalidations on ARM
> and PPC are global, so they will need to invalidate the remote SW TLBs
> as well.
>
> Do you have a mechanism to synchronize with another thread ? IE, make it
> pop out of TCG if already in and prevent it from getting in ? That way
> you can "remotely" invalidate its TLB...
>
>> == Atomic instruction ==
>>
>> For now only ARM on x64 is supported by using a cmpxchg instruction.
>> Specifically, the limitation of this approach is that it is harder to
>> support 64-bit ARM on a host architecture that is multi-core but only
>> supports 32-bit cmpxchg (we believe this could be the case for some PPC
>> cores).
>
> Right, on the other hand 64-bit will do fine. But then x86 has 2-value
> atomics nowadays, doesn't it ? And that will be hard to emulate on
> anything. You might need to have some kind of global hashed lock list
> used by atomics (hash the physical address) as a fallback if you don't
> have a 1:1 match between host and guest capabilities.
>
> Cheers,
> Ben.

--
Alex Bennée
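
As for the global hashed lock list Ben suggests above, a minimal
standalone sketch (assuming a plain pthread mutex per bucket; none of
the names or sizes below come from the series) might look like this:

/* Hash the guest physical address into a global table of host locks and
 * take the bucket lock around any atomic the host cannot do natively,
 * e.g. a double-width cmpxchg on a host with only 64-bit atomics.
 * Illustrative only -- every name here is invented for the example. */
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

#define ATOMIC_LOCK_BUCKETS 256        /* power of two; size is a guess */

static pthread_mutex_t atomic_locks[ATOMIC_LOCK_BUCKETS];
static pthread_once_t atomic_locks_once = PTHREAD_ONCE_INIT;

static void atomic_locks_init(void)
{
    for (int i = 0; i < ATOMIC_LOCK_BUCKETS; i++) {
        pthread_mutex_init(&atomic_locks[i], NULL);
    }
}

/* Drop the low bits so neighbouring words in one cache line share a
 * lock, then mask down to the table size. */
static pthread_mutex_t *atomic_lock_for(uint64_t guest_paddr)
{
    pthread_once(&atomic_locks_once, atomic_locks_init);
    return &atomic_locks[(guest_paddr >> 6) & (ATOMIC_LOCK_BUCKETS - 1)];
}

/* Emulated double-width compare-and-swap: serialise against every other
 * emulated atomic that hashes to the same bucket. */
bool emulated_cmpxchg128(uint64_t guest_paddr, uint64_t mem[2],
                         uint64_t expected[2], const uint64_t desired[2])
{
    pthread_mutex_t *lock = atomic_lock_for(guest_paddr);
    bool success = false;

    pthread_mutex_lock(lock);
    if (mem[0] == expected[0] && mem[1] == expected[1]) {
        mem[0] = desired[0];
        mem[1] = desired[1];
        success = true;
    } else {
        expected[0] = mem[0];          /* report the value actually seen */
        expected[1] = mem[1];
    }
    pthread_mutex_unlock(lock);
    return success;
}

The obvious catch is that a scheme like this only serialises the
emulated atomics against each other, so every access that must be atomic
with respect to them has to go through the same locked path.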