From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <5581EA90.5020004@greensocs.com>
Date: Wed, 17 Jun 2015 23:45:52 +0200
From: Frederic Konrad
References: <878uborigh.fsf@linaro.org> <20150617165716.GM2122@work-vm> <63D89881-446B-4523-B877-D2110E361345@greensocs.com>
In-Reply-To: <63D89881-446B-4523-B877-D2110E361345@greensocs.com>
Subject: Re: [Qemu-devel] RFC Multi-threaded TCG design document
To: Mark Burton, "Dr. David Alan Gilbert"
Cc: mttcg@greensocs.com, Peter Maydell, QEMU Developers, Alexander Graf, Guillaume Delbergue, Paolo Bonzini, Alex Bennée

On 17/06/2015 20:23, Mark Burton wrote:
>> On 17 Jun 2015, at 18:57, Dr. David Alan Gilbert wrote:
>>
>> * Alex Bennée (alex.bennee@linaro.org) wrote:
>>> Hi,
>>> Shared Data Structures
>>> ======================
>>>
>>> Global TCG State
>>> ----------------
>>>
>>> We need to protect the entire code generation cycle including any post
>>> generation patching of the translated code. This also implies a shared
>>> translation buffer which contains code running on all cores. Any
>>> execution path that comes to the main run loop will need to hold a
>>> mutex for code generation.
>>> This also includes times when we need to flush
>>> code or jumps from the tb_cache.
>>>
>>> DESIGN REQUIREMENT: Add locking around all code generation, patching
>>> and jump cache modification
>> I don't think that you require a shared translation buffer between
>> cores to do this - although it *might* be the easiest way.
>> You could have a per-core translation buffer, the only requirement is
>> that most invalidation operations happen on all the buffers
>> (although that might depend on the emulated architecture).
>> With a per-core translation buffer, each core could generate new translations
>> without locking the other cores as long as no one is doing invalidations.
> I agree it’s not a design requirement - however we’ve kind of gone round this loop in terms of getting things to work.
> Fred will doubtless fill in some details, but basically it looks like making the TCG so you could run several in parallel is a nightmare. We seem to get reasonable performance having just one CPU at a time generating TBs. At the same time, of course, the way QEMU is constructed there are actually several ‘layers’ of buffer - from the CPU local ones through to the TB ‘pool’. So, actually, by accident or design, we benefit from a sort of caching structure.
>

True, running several in parallel seems to be very complex, at least on ARM,
because of the disassembly context etc. On the other hand, invalidation
might be easier with per-core buffers, I guess.

For performance, I'm not sure which way is better.

Fred

>>> Memory maps and TLBs
>>> --------------------
>>>
>>> The memory handling code is fairly critical to the speed of memory
>>> access in the emulated system.
>>>
>>> - Memory regions (dividing up access to PIO, MMIO and RAM)
>>> - Dirty page tracking (for code gen, migration and display)
>>> - Virtual TLB (for translating guest address->real address)
>>>
>>> There is both a fast path walked by the generated code and a slow
>>> path when resolution is required.
>>> When the TLB tables are updated we
>>> need to ensure they are done in a safe way by bringing all executing
>>> threads to a halt before making the modifications.
>>>
>>> DESIGN REQUIREMENTS:
>>>
>>> - TLB Flush All/Page
>>>   - can be across-CPUs
>>>   - will need all other CPUs brought to a halt
>>> - TLB Update (update a CPUTLBEntry, via tlb_set_page_with_attrs)
>>>   - This is a per-CPU table - by definition can't race
>>>   - updated by its own thread when the slow-path is forced
>>>
>>> Emulated hardware state
>>> -----------------------
>>>
>>> Currently the hardware emulation has no protection against
>>> multiple-accesses. However guest systems accessing emulated hardware
>>> should be carrying out their own locking to prevent multiple CPUs
>>> confusing the hardware. Of course there is no guarantee that there
>>> couldn't be a broken guest that doesn't lock so you could get racing
>>> accesses to the hardware.
>>>
>>> There is the class of paravirtualized hardware (VIRTIO) that works in
>>> a purely mmio mode, often setting flags directly in guest memory as a
>>> result of a guest triggered transaction.
>>>
>>> DESIGN REQUIREMENTS:
>>>
>>> - Access to IO Memory should be serialised by an IOMem mutex
>>> - The mutex should be recursive (e.g. allowing pid to relock itself)
>>>
>>> IO Subsystem
>>> ------------
>>>
>>> The I/O subsystem is heavily used by KVM and has seen a lot of
>>> improvements to offload I/O tasks to dedicated IOThreads. There should
>>> be no additional locking required once we reach the Block Driver.
>>>
>>> DESIGN REQUIREMENTS:
>>>
>>> - The dataplane should continue to be protected by the iothread locks
>> Watch out for where DMA invalidates the translated code.
>>
>
> need to check - that might be a great catch!
>
> Cheers
>
> Mark.
>
>> Dave
>>
>>>
>>> References
>>> ==========
>>>
>>> [1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/Documentation/memory-barriers.txt
>>> [2] http://thread.gmane.org/gmane.comp.emulators.qemu/334561
>>> [3] http://thread.gmane.org/gmane.comp.emulators.qemu/335297
>>>
>>>
>>>
>>> -- 
>>> Alex Bennée
>> --
>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> +44 (0)20 7100 3485 x 210
> +33 (0)5 33 52 01 77x 210
>
> +33 (0)603762104
> mark.burton
>
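
To illustrate the quoted IOMem requirement (accesses serialised by one mutex, which must be recursive so the thread holding it can relock it), here is a minimal sketch. It is written in Python rather than QEMU's C purely for brevity, and every name in it (iomem_lock, io_write, run_vcpus, the "doorbell" and "status" registers) is hypothetical; none of this is an existing QEMU API or the design the thread settled on.

```python
import threading

# Hypothetical stand-in for the proposed IOMem mutex (not a QEMU API).
# RLock is re-entrant: the owning thread may acquire it again.
iomem_lock = threading.RLock()


def io_write(regs, addr, value):
    """Perform one emulated register write under the IOMem lock."""
    with iomem_lock:
        regs[addr] = value
        # Assumed device behaviour for illustration: a doorbell write
        # triggers a nested access to a status register from the same
        # thread. With a non-recursive lock this call would deadlock.
        if addr == "doorbell":
            io_write(regs, "status", regs.get("status", 0) + 1)


def run_vcpus(n_threads=4, n_writes=1000):
    """Run several 'vCPU' threads that all write to the same device."""
    regs = {}

    def vcpu():
        for i in range(n_writes):
            io_write(regs, "doorbell", i)

    threads = [threading.Thread(target=vcpu) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return regs
```

Because the read-modify-write of "status" happens while the outer lock is held, run_vcpus(4, 1000) should leave regs["status"] equal to the total number of doorbell writes. In C the comparable primitive would be a pthread mutex created with the PTHREAD_MUTEX_RECURSIVE type.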