Re: [Qemu-devel] RFC Multi-threaded TCG design document

From: Frederic Konrad <fred.konrad@greensocs.com>
To: Mark Burton <mark.burton@greensocs.com>,
	"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: mttcg@greensocs.com, Peter Maydell <peter.maydell@linaro.org>,
	QEMU Developers <qemu-devel@nongnu.org>,
	Alexander Graf <agraf@suse.de>,
	Guillaume Delbergue <guillaume.delbergue@greensocs.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Alex Bennée <alex.bennee@linaro.org>
Subject: Re: [Qemu-devel] RFC Multi-threaded TCG design document
Date: Wed, 17 Jun 2015 23:45:52 +0200
Message-ID: <5581EA90.5020004@greensocs.com>
In-Reply-To: <63D89881-446B-4523-B877-D2110E361345@greensocs.com>

On 17/06/2015 20:23, Mark Burton wrote:
>> On 17 Jun 2015, at 18:57, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>>
>> * Alex Bennée (alex.bennee@linaro.org) wrote:
>>> Hi,
>>> Shared Data Structures
>>> ======================
>>>
>>> Global TCG State
>>> ----------------
>>>
>>> We need to protect the entire code generation cycle including any post
>>> generation patching of the translated code. This also implies a shared
>>> translation buffer which contains code running on all cores. Any
>>> execution path that comes to the main run loop will need to hold a
>>> mutex for code generation. This also includes times when we need to
>>> flush code or jumps from the tb_cache.
>>>
>>> DESIGN REQUIREMENT: Add locking around all code generation, patching
>>> and jump cache modification
>> I don't think that you require a shared translation buffer between
>> cores to do this - although it *might* be the easiest way.
>> You could have a per-core translation buffer, the only requirement is
>> that most invalidation operations happen on all the buffers
>> (although that might depend on the emulated architecture).
>> With a per-core translation buffer, each core could generate new translations
>> without locking the other cores as long as no one is doing invalidations.
> I agree it’s not a design requirement - however we’ve kind of gone round this loop in terms of getting things to work.
> Fred will doubtless fill in some details, but basically it looks like making the TCG so you could run several in parallel is a nightmare. We seem to get reasonable performance having just one CPU at a time generating TBs. At the same time, of course, the way QEMU is constructed there are actually several ‘layers’ of buffer - from the CPU-local ones through to the TB ‘pool’. So, actually, by accident or design, we benefit from a sort of caching structure.
>
True, it seems to be very complex, at least on ARM, because of the
disassembly context, etc. On the other hand, invalidation might be easier
with per-core buffers, I guess.
As for performance, I'm not sure which way is better.
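
As an illustration of the "one CPU at a time generating TBs" approach
discussed above, here is a minimal sketch in C. It assumes a single
shared code buffer guarded by one pthread mutex. The names SharedCodeBuf
and tb_gen are hypothetical and are not QEMU's actual structures or
functions.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define CODE_BUF_SIZE 1024

/* Hypothetical shared translation buffer; not QEMU's real data structures. */
typedef struct {
    pthread_mutex_t lock;        /* held for generation, patching and flushes */
    uint8_t code[CODE_BUF_SIZE]; /* translated code for all vCPUs */
    size_t used;                 /* bytes handed out so far */
    unsigned flushes;            /* number of whole-buffer flushes */
} SharedCodeBuf;

/*
 * Reserve len bytes for a new translated block and return its offset.
 * Every vCPU thread would call this, so only one of them generates
 * code at any time. A full buffer is flushed before the reservation.
 */
size_t tb_gen(SharedCodeBuf *b, size_t len)
{
    size_t off;

    assert(len <= CODE_BUF_SIZE);
    pthread_mutex_lock(&b->lock);
    if (b->used + len > CODE_BUF_SIZE) {
        /* Assumption: a real flush must also stop vCPUs running old code. */
        b->used = 0;
        b->flushes++;
    }
    off = b->used;
    b->used += len;
    /* The translator would emit host code into b->code + off here. */
    pthread_mutex_unlock(&b->lock);
    return off;
}
```

A per-core variant would give each vCPU its own SharedCodeBuf and drop
the lock for generation, at the cost of visiting every buffer on
invalidation.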

Fred
>>> Memory maps and TLBs
>>> --------------------
>>>
>>> The memory handling code is fairly critical to the speed of memory
>>> access in the emulated system.
>>>
>>>   - Memory regions (dividing up access to PIO, MMIO and RAM)
>>>   - Dirty page tracking (for code gen, migration and display)
>>>   - Virtual TLB (for translating guest address->real address)
>>>
>>> There is both a fast path walked by the generated code and a slow
>>> path when resolution is required. When the TLB tables are updated we
>>> need to ensure they are done in a safe way by bringing all executing
>>> threads to a halt before making the modifications.
>>>
>>> DESIGN REQUIREMENTS:
>>>
>>>   - TLB Flush All/Page
>>>     - can be across-CPUs
>>>     - will need all other CPUs brought to a halt
>>>   - TLB Update (update a CPUTLBEntry, via tlb_set_page_with_attrs)
>>>     - This is a per-CPU table - by definition can't race
>>>     - updated by its own thread when the slow-path is forced
>>>
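To make the fast-path/slow-path split and the per-CPU table concrete,
here is a small sketch of a direct-mapped software TLB. It is a toy
model under stated assumptions: the toy_tlb_* names, the 4 KiB page
size and the 256-entry table are all hypothetical and do not reflect
QEMU's CPUTLBEntry layout or tlb_set_page_with_attrs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_BITS   12                        /* assumed 4 KiB pages */
#define PAGE_MASK   ((UINT64_C(1) << PAGE_BITS) - 1)
#define TLB_SIZE    256                       /* assumed table size */
#define TLB_INVALID UINT64_MAX                /* never a valid page number */

typedef struct {
    uint64_t vpage;   /* guest virtual page number, or TLB_INVALID */
    uint64_t hbase;   /* page-aligned host address */
} ToyTLBEntry;

/* One table per vCPU; only that vCPU's thread writes to it. */
typedef struct {
    ToyTLBEntry tlb[TLB_SIZE];
} ToyCPUTLB;

void toy_tlb_flush_all(ToyCPUTLB *c)
{
    for (size_t i = 0; i < TLB_SIZE; i++) {
        c->tlb[i].vpage = TLB_INVALID;
    }
}

void toy_tlb_flush_page(ToyCPUTLB *c, uint64_t vaddr)
{
    uint64_t vpage = vaddr >> PAGE_BITS;
    ToyTLBEntry *e = &c->tlb[vpage % TLB_SIZE];

    if (e->vpage == vpage) {
        e->vpage = TLB_INVALID;
    }
}

/* Slow path: install a mapping after a (not shown) page-table walk. */
void toy_tlb_set_page(ToyCPUTLB *c, uint64_t vaddr, uint64_t haddr)
{
    uint64_t vpage = vaddr >> PAGE_BITS;
    ToyTLBEntry *e = &c->tlb[vpage % TLB_SIZE];

    e->vpage = vpage;
    e->hbase = haddr & ~PAGE_MASK;
}

/* Fast path: a single compare; returns false when the slow path is needed. */
bool toy_tlb_lookup(const ToyCPUTLB *c, uint64_t vaddr, uint64_t *haddr)
{
    uint64_t vpage = vaddr >> PAGE_BITS;
    const ToyTLBEntry *e = &c->tlb[vpage % TLB_SIZE];

    assert(haddr != NULL);
    if (e->vpage != vpage) {
        return false;
    }
    *haddr = e->hbase | (vaddr & PAGE_MASK);
    return true;
}
```

In this model a cross-CPU flush means calling toy_tlb_flush_all or
toy_tlb_flush_page on another vCPU's table, which is only safe while
that vCPU is halted, as the requirement above states.
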
>>> Emulated hardware state
>>> -----------------------
>>>
>>> Currently the hardware emulation has no protection against
>>> multiple-accesses. However guest systems accessing emulated hardware
>>> should be carrying out their own locking to prevent multiple CPUs
>>> confusing the hardware. Of course there is no guarantee that there
>>> couldn't be a broken guest that doesn't lock, so you could get racing
>>> accesses to the hardware.
>>>
>>> There is the class of paravirtualized hardware (VIRTIO) that works in
>>> a purely MMIO mode, often setting flags directly in guest memory as a
>>> result of a guest-triggered transaction.
>>>
>>> DESIGN REQUIREMENTS:
>>>
>>>   - Access to IO Memory should be serialised by an IOMem mutex
>>>   - The mutex should be recursive (e.g. allowing pid to relock itself)
>>>
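A recursive IOMem mutex of the kind requested above can be sketched
with a POSIX recursive mutex. This is only an illustration: the
two-register device and the iomem_lock_init, io_write and io_read
names are hypothetical, and a write to register 0 triggering a nested
write to register 1 is an invented example of re-entrant device code.

```c
#define _XOPEN_SOURCE 700   /* expose PTHREAD_MUTEX_RECURSIVE under strict -std */
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t iomem_lock;
static uint32_t regs[2];    /* hypothetical device registers */

/* Create the IOMem mutex as recursive; returns 0 on success. */
int iomem_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);

    if (rc != 0) {
        return rc;
    }
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0) {
        rc = pthread_mutex_init(&iomem_lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* All device accesses are serialised by iomem_lock. */
void io_write(unsigned reg, uint32_t val)
{
    assert(reg < 2);
    pthread_mutex_lock(&iomem_lock);
    regs[reg] = val;
    if (reg == 0) {
        /* Nested access from the same thread: relocks without deadlock. */
        io_write(1, val + 1);
    }
    pthread_mutex_unlock(&iomem_lock);
}

uint32_t io_read(unsigned reg)
{
    uint32_t val;

    assert(reg < 2);
    pthread_mutex_lock(&iomem_lock);
    val = regs[reg];
    pthread_mutex_unlock(&iomem_lock);
    return val;
}
```

With a non-recursive mutex the nested io_write call would deadlock,
which is why the requirement asks for a recursive lock.
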
>>> IO Subsystem
>>> ------------
>>>
>>> The I/O subsystem is heavily used by KVM and has seen a lot of
>>> improvements to offload I/O tasks to dedicated IOThreads. There should
>>> be no additional locking required once we reach the Block Driver.
>>>
>>> DESIGN REQUIREMENTS:
>>>
>>>   - The dataplane should continue to be protected by the iothread locks
>> Watch out for where DMA invalidates the translated code.
>>
>
> Need to check - that might be a great catch!
>
> Cheers
>
> Mark.
>
>> Dave
>>
>>>
>>> References
>>> ==========
>>>
>>> [1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/Documentation/memory-barriers.txt
>>> [2] http://thread.gmane.org/gmane.comp.emulators.qemu/334561
>>> [3] http://thread.gmane.org/gmane.comp.emulators.qemu/335297
>>>
>>>
>>>
>>> -- 
>>> Alex Bennée
>> --
>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
