From: Frederic Konrad <fred.konrad@greensocs.com>
To: Mark Burton <mark.burton@greensocs.com>,
"Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: mttcg@greensocs.com, Peter Maydell <peter.maydell@linaro.org>,
QEMU Developers <qemu-devel@nongnu.org>,
Alexander Graf <agraf@suse.de>,
Guillaume Delbergue <guillaume.delbergue@greensocs.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Alex Bennée <alex.bennee@linaro.org>
Subject: Re: [Qemu-devel] RFC Multi-threaded TCG design document
Date: Wed, 17 Jun 2015 23:45:52 +0200
Message-ID: <5581EA90.5020004@greensocs.com>
In-Reply-To: <63D89881-446B-4523-B877-D2110E361345@greensocs.com>
On 17/06/2015 20:23, Mark Burton wrote:
>> On 17 Jun 2015, at 18:57, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
>>
>> * Alex Bennée (alex.bennee@linaro.org) wrote:
>>> Hi,
>>> Shared Data Structures
>>> ======================
>>>
>>> Global TCG State
>>> ----------------
>>>
>>> We need to protect the entire code generation cycle including any post
>>> generation patching of the translated code. This also implies a shared
>>> translation buffer which contains code running on all cores. Any
>>> execution path that comes to the main run loop will need to hold a
>>> mutex for code generation. This also includes times when we need to flush
>>> code or jumps from the tb_cache.
>>>
>>> DESIGN REQUIREMENT: Add locking around all code generation, patching
>>> and jump cache modification
>> I don't think that you require a shared translation buffer between
>> cores to do this - although it *might* be the easiest way.
>> You could have a per-core translation buffer, the only requirement is
>> that most invalidation operations happen on all the buffers
>> (although that might depend on the emulated architecture).
>> With a per-core translation buffer, each core could generate new translations
>> without locking the other cores as long as no one is doing invalidations.
> I agree it’s not a design requirement - however we’ve kind of gone round this loop in terms of getting things to work.
> Fred will doubtless fill in some details, but basically it looks like making the TCG so you could run several in parallel is a nightmare. We seem to get reasonable performance having just one CPU at a time generating TBs. At the same time, of course, the way QEMU is constructed there are actually several ‘layers’ of buffer - from the CPU local ones through to the TB ‘pool’. So, actually, by accident or design, we benefit from a sort of caching structure.
>
True, it seems to be very complex, at least on ARM, because of the disassembly
context etc. On the other hand, the invalidation might be easier, I guess.
For performance, I'm not sure which approach is better.
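
To make the per-core option concrete, here is a minimal sketch (Python only
for brevity, this is not QEMU code). `CoreBuffer`, `translate` and
`invalidate` are hypothetical names, and a dict stands in for a real
translation buffer. It only shows the locking shape: generation takes the
local lock, invalidation has to visit every buffer.

```python
import threading

N_CORES = 4  # hypothetical core count, for illustration only


class CoreBuffer:
    """Toy per-core translation buffer: maps guest PC -> translated code."""

    def __init__(self):
        self.lock = threading.Lock()
        self.blocks = {}


buffers = [CoreBuffer() for _ in range(N_CORES)]


def translate(core, pc):
    """Generate into this core's buffer only; other cores are not blocked."""
    buf = buffers[core]
    with buf.lock:
        # Stand-in for real code generation.
        return buf.blocks.setdefault(pc, f"host code for {pc:#x}")


def invalidate(pc):
    """Drop a guest PC from every buffer; return how many copies were removed.

    Locks are taken in ascending core order so that two concurrent
    invalidations cannot deadlock against each other.
    """
    for buf in buffers:
        buf.lock.acquire()
    try:
        return sum(buf.blocks.pop(pc, None) is not None for buf in buffers)
    finally:
        for buf in reversed(buffers):
            buf.lock.release()
```

In this toy model the cost moves from generation to invalidation: every
invalidation stalls all cores, and the same guest code may be translated
once per core.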
Fred
>>> Memory maps and TLBs
>>> --------------------
>>>
>>> The memory handling code is fairly critical to the speed of memory
>>> access in the emulated system.
>>>
>>> - Memory regions (dividing up access to PIO, MMIO and RAM)
>>> - Dirty page tracking (for code gen, migration and display)
>>> - Virtual TLB (for translating guest address->real address)
>>>
>>> There is both a fast path walked by the generated code and a slow
>>> path when resolution is required. When the TLB tables are updated we
>>> need to ensure they are done in a safe way by bringing all executing
>>> threads to a halt before making the modifications.
>>>
>>> DESIGN REQUIREMENTS:
>>>
>>> - TLB Flush All/Page
>>>   - can be across-CPUs
>>>   - will need all other CPUs brought to a halt
>>> - TLB Update (update a CPUTLBEntry, via tlb_set_page_with_attrs)
>>>   - This is a per-CPU table - by definition can't race
>>>   - updated by its own thread when the slow-path is forced
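
The two TLB requirements above can be sketched as follows (Python for
brevity, not QEMU code). Everything here is an assumption made for
illustration: `HaltGate`, `enter_guest`, `leave_guest`, `halt_others` and
`resume_others` are hypothetical names, and each TLB is just a dict from
guest page to host page.

```python
import threading


class HaltGate:
    """Toy 'bring every vCPU to a halt' gate (all names are hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._running = 0        # vCPUs currently executing guest code
        self._exclusive = False  # True while one thread holds the others halted

    def enter_guest(self):
        with self._cond:
            # Park here while someone is modifying the shared TLB state.
            while self._exclusive:
                self._cond.wait()
            self._running += 1

    def leave_guest(self):
        with self._cond:
            self._running -= 1
            self._cond.notify_all()

    def halt_others(self):
        with self._cond:
            while self._exclusive:
                self._cond.wait()
            self._exclusive = True
            # Wait for every running vCPU to finish its current chunk.
            while self._running:
                self._cond.wait()

    def resume_others(self):
        with self._cond:
            self._exclusive = False
            self._cond.notify_all()


N_CPUS = 3  # hypothetical vCPU count
gate = HaltGate()
tlbs = [dict() for _ in range(N_CPUS)]  # per-CPU: guest page -> host page


def tlb_update(cpu, guest_page, host_page):
    """Per-CPU table written only by its own thread, so no lock is taken."""
    tlbs[cpu][guest_page] = host_page


def tlb_flush_page_all_cpus(guest_page):
    """Cross-CPU flush: halt everyone, then edit every table."""
    gate.halt_others()
    try:
        for tlb in tlbs:
            tlb.pop(guest_page, None)
    finally:
        gate.resume_others()


def vcpu_thread(cpu, pages):
    """Toy vCPU loop: each iteration is one chunk of guest execution."""
    for page in pages:
        gate.enter_guest()
        try:
            tlb_update(cpu, page, page + 0x1000)  # stand-in for a slow-path refill
        finally:
            gate.leave_guest()
```

Note that in this sketch a vCPU that wants to flush must call
`leave_guest()` before `halt_others()`, otherwise it would wait for itself.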
>>>
>>> Emulated hardware state
>>> -----------------------
>>>
>>> Currently the hardware emulation has no protection against
>>> multiple-accesses. However guest systems accessing emulated hardware
>>> should be carrying out their own locking to prevent multiple CPUs
>>> confusing the hardware. Of course there is no guarantee that there
>>> couldn't be a broken guest that doesn't lock so you could get racing
>>> accesses to the hardware.
>>>
>>> There is also the class of paravirtualized hardware (VIRTIO) that works
>>> in a purely MMIO mode, often setting flags directly in guest memory as a
>>> result of a guest-triggered transaction.
>>>
>>> DESIGN REQUIREMENTS:
>>>
>>> - Access to IO Memory should be serialised by an IOMem mutex
>>> - The mutex should be recursive (e.g. allowing the owning thread to relock it)
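
A small sketch of why the IOMem mutex would need to be recursive (Python
for brevity, not QEMU code). The register names and the `mmio_read` /
`mmio_write` helpers are hypothetical; the point is only that a device
model may touch I/O memory again while already handling an access on the
same thread.

```python
import threading

# Recursive lock: the thread that owns it may acquire it again.
iomem_lock = threading.RLock()

# Hypothetical device registers, for illustration only.
regs = {"ctrl": 0, "status": 0}


def mmio_write(reg, value):
    """Serialised write; a 'ctrl' write triggers a nested 'status' write."""
    with iomem_lock:
        regs[reg] = value
        if reg == "ctrl":
            # Side effect inside the device model: another I/O memory access
            # on the same thread. With a non-recursive lock this nested
            # acquire would deadlock.
            mmio_write("status", value & 1)


def mmio_read(reg):
    """Serialised read of a device register."""
    with iomem_lock:
        return regs[reg]
```

For example, `mmio_write("ctrl", 3)` takes the lock twice on the same
thread and leaves `status` set to 1.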
>>>
>>> IO Subsystem
>>> ------------
>>>
>>> The I/O subsystem is heavily used by KVM and has seen a lot of
>>> improvements to offload I/O tasks to dedicated IOThreads. There should
>>> be no additional locking required once we reach the Block Driver.
>>>
>>> DESIGN REQUIREMENTS:
>>>
>>> - The dataplane should continue to be protected by the iothread locks
>> Watch out for where DMA invalidates the translated code.
>>
>
> need to check - that might be a great catch !
>
> Cheers
>
> Mark.
>
>> Dave
>>
>>>
>>> References
>>> ==========
>>>
>>> [1] https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/Documentation/memory-barriers.txt
>>> [2] http://thread.gmane.org/gmane.comp.emulators.qemu/334561
>>> [3] http://thread.gmane.org/gmane.comp.emulators.qemu/335297
>>>
>>>
>>>
>>> --
>>> Alex Bennée
>> --
>> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
>
> +44 (0)20 7100 3485 x 210
> +33 (0)5 33 52 01 77x 210
>
> +33 (0)603762104
> mark.burton
>
>