* [Qemu-devel] Should we introduce a TranslationRegion with its own codegen buffer?
@ 2016-04-04  8:54 Alex Bennée
From: Alex Bennée @ 2016-04-04  8:54 UTC (permalink / raw)
  To: QEMU Developers, MTTCG Devel
  Cc: Peter Maydell, Sergey Fedorov, Richard Henderson, ".Cota",
	Paolo Bonzini

Hi,

While reviewing the recent TB patching cleanup patches I wondered if
there is a cleaner way of handling TB invalidation. Currently we have a
single code generation buffer which slowly fills up with TBs as we
execute code. These TBs are chained together if they exist in the same
physical page (so we always exit to the run-loop if crossing a page
boundary).

We hold a bunch of extra information in the TBs to facilitate looking
things up. We have:

    struct TranslationBlock *phys_hash_next;

to facilitate looking up TBs which have matching hashes in the physical
address lookup. We also have:

    uintptr_t jmp_list_next[2];
    uintptr_t jmp_list_first;

These are used to unwind any jump patching when we invalidate a page,
since we don't want code jumping to potentially invalid translations.
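
To illustrate the phys_hash_next chaining, here is a minimal standalone
sketch of a chained hash keyed on physical address. Everything apart
from the phys_hash_next field name is made up for the example
(ChainedTB, the demo_* helpers, the bucket count and the modulo hash);
it is not the real QEMU lookup code, just the general shape of it:

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_HASH_BUCKETS 8   /* illustrative size only */

/* Hypothetical cut-down TB: only the fields the lookup needs. */
typedef struct ChainedTB {
    uint64_t phys_pc;                   /* guest physical address of the block */
    struct ChainedTB *phys_hash_next;   /* next TB in the same hash bucket */
} ChainedTB;

static ChainedTB *demo_phys_hash[DEMO_HASH_BUCKETS];

static unsigned demo_hash(uint64_t phys_pc)
{
    return (unsigned)(phys_pc % DEMO_HASH_BUCKETS);
}

/* Push a TB onto the front of its bucket's chain. */
static void demo_tb_insert(ChainedTB *tb)
{
    unsigned h = demo_hash(tb->phys_pc);

    tb->phys_hash_next = demo_phys_hash[h];
    demo_phys_hash[h] = tb;
}

/* Walk the chain until a TB with a matching physical address turns up. */
static ChainedTB *demo_tb_find(uint64_t phys_pc)
{
    ChainedTB *tb;

    for (tb = demo_phys_hash[demo_hash(phys_pc)]; tb; tb = tb->phys_hash_next) {
        if (tb->phys_pc == phys_pc) {
            return tb;
        }
    }
    return NULL;
}

/* Invalidation has to unlink the TB from its chain again. */
static void demo_tb_remove(ChainedTB *tb)
{
    ChainedTB **pp = &demo_phys_hash[demo_hash(tb->phys_pc)];

    while (*pp) {
        if (*pp == tb) {
            *pp = tb->phys_hash_next;
            return;
        }
        pp = &(*pp)->phys_hash_next;
    }
}
```

The point is that every invalidated TB has to be unlinked individually
from structures like this one, on top of having its jumps unwound.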

We also have a number of associated jump caches held against each CPU
which are used to optimise re-entry into generated code as we go round
the main run-loop. These also have to be cleanly invalidated as TBs are
marked invalid.

Finally as the TBs are generated on demand the actual code may not be
locally jump-able which makes atomic patching of the jumps trickier to
do.

TB invalidation is almost always due to page mapping changes, although
self-modifying code (SMC) and debugging are also causes for throwing
away translations.
I'm wondering if it is time to add a layer of indirection to simplify
the process?

We could introduce a TranslationRegion which would initially cover a
page's worth of code. It would have its own code generation buffer
protected by an RCU lock to make it easier to swap out code buffers in a
clean manner:

  Normal Execution (cpu_exec):
    - lookup TranslationRegion
    - take RCU read lock
    - lookup-or-generate TB
    - jump into code
    - exit TB
    - release RCU read lock

  Invalidation of Page:
    - lookup TranslationRegion
    - take RCU write lock
      - create fresh empty region
      - signal cpu_exit to all vCPUs
    - release RCU write lock
    - take RCU read lock
    - lookup-or-generate TB
    - jump into code
    - exit TB
    - release RCU read lock*

* when the last vCPU releases the read lock on the old code it can be
  cleanly thrown away. No fiddly jump patching required.
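
As a rough sketch of the lifetime rule in the two sequences above, the
following models it with a plain reader count standing in for RCU. All
of the names (CodeBuffer, region_enter, region_exit, region_invalidate,
buffers_freed) are hypothetical and nothing here is thread-safe; a real
version would rely on RCU grace periods rather than a counter:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical types for illustration; none of these exist in QEMU. */
typedef struct CodeBuffer {
    int readers;    /* vCPUs currently inside this buffer (stand-in for RCU readers) */
    bool retired;   /* true once a fresh buffer has replaced this one */
} CodeBuffer;

typedef struct TranslationRegion {
    CodeBuffer *current;   /* buffer that new lookups translate into */
} TranslationRegion;

static int buffers_freed;  /* demo counter so the reclaim can be observed */

static CodeBuffer *code_buffer_new(void)
{
    CodeBuffer *buf = calloc(1, sizeof(*buf));

    if (!buf) {
        abort();
    }
    return buf;
}

static void region_init(TranslationRegion *r)
{
    r->current = code_buffer_new();
}

/* "take RCU read lock": pin the buffer the vCPU is about to execute from. */
static CodeBuffer *region_enter(TranslationRegion *r)
{
    r->current->readers++;
    return r->current;
}

/* "release RCU read lock": the last reader of a retired buffer frees it. */
static void region_exit(CodeBuffer *buf)
{
    if (--buf->readers == 0 && buf->retired) {
        free(buf);
        buffers_freed++;
    }
}

/* Invalidation: publish a fresh empty buffer and retire the old one. */
static void region_invalidate(TranslationRegion *r)
{
    CodeBuffer *old = r->current;

    r->current = code_buffer_new();
    old->retired = true;
    if (old->readers == 0) {
        free(old);
        buffers_freed++;
    }
}
```

A vCPU that entered before the invalidation keeps running out of the
old buffer until it exits; anything entering afterwards sees the fresh
one. The old buffer goes away as a whole, so no individual jump needs
to be unpatched.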

There are some potential optimisations that could be made to this system
as well.

Jump patching would become easier on backends with limited jump ranges
as local code is kept together in a shared code buffer.

There is also no reason the area covered by a TranslationRegion has
to be a page. For example the kernel segment once mapped will never
change. Then all internal TBs could still be chained together.

I'm sure there is scope for localising the jump cache to regions as
there are likely to be only a few entry points to any given page with
the rest all being internal branches for loops and conditionals.
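
One possible shape for a region-local jump cache is a small
direct-mapped table indexed by guest PC. This is only a sketch: the
names (DemoTB, RegionJumpCache, region_jmp_*), the table size and the
assumption of 4-byte aligned guest instructions are all mine and are
not taken from the existing per-CPU cache:

```c
#include <stddef.h>
#include <stdint.h>

#define REGION_JMP_CACHE_BITS 4   /* illustrative: 16 entries per region */
#define REGION_JMP_CACHE_SIZE (1u << REGION_JMP_CACHE_BITS)

/* Hypothetical cut-down TB: just the guest PC it starts at. */
typedef struct DemoTB {
    uint64_t pc;
} DemoTB;

typedef struct RegionJumpCache {
    DemoTB *entry[REGION_JMP_CACHE_SIZE];
} RegionJumpCache;

static unsigned region_jmp_hash(uint64_t pc)
{
    /* Assumes 4-byte aligned guest instructions, so drop the low two bits. */
    return (unsigned)(pc >> 2) & (REGION_JMP_CACHE_SIZE - 1);
}

/* Return the cached TB for pc, or NULL on a miss or a slot collision. */
static DemoTB *region_jmp_lookup(const RegionJumpCache *c, uint64_t pc)
{
    DemoTB *tb = c->entry[region_jmp_hash(pc)];

    return (tb && tb->pc == pc) ? tb : NULL;
}

/* Direct-mapped: a new entry simply evicts whatever shared its slot. */
static void region_jmp_insert(RegionJumpCache *c, DemoTB *tb)
{
    c->entry[region_jmp_hash(tb->pc)] = tb;
}

/* Dropping a region only needs its own small table cleared. */
static void region_jmp_flush(RegionJumpCache *c)
{
    for (size_t i = 0; i < REGION_JMP_CACHE_SIZE; i++) {
        c->entry[i] = NULL;
    }
}
```

If the cache lived inside the region, throwing the region away would
take its cache entries with it and nothing would need to be scrubbed
on each vCPU.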

The only thing I can currently think of that may be a problem is
potentially causing heap fragmentation by having a large number of code
buffers. This could probably be ameliorated by using custom allocation
routines for the code buffers.
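
As an example of what such a custom allocation routine might look like,
here is a fixed-size pool that carves equally sized code buffers out of
one contiguous arena, so freed buffers are reused without touching the
heap. The names (CodeBufPool, pool_*) and the sizes are invented for
the sketch, and a real arena would need to be mapped executable rather
than being a plain array:

```c
#include <stddef.h>
#include <stdint.h>

#define POOL_BUF_SIZE  (16 * 1024)   /* illustrative per-region buffer size */
#define POOL_BUF_COUNT 8             /* illustrative number of buffers */

/* Hypothetical pool: one contiguous arena plus a stack of free slots. */
typedef struct CodeBufPool {
    uint8_t arena[POOL_BUF_COUNT][POOL_BUF_SIZE];
    int free_slots[POOL_BUF_COUNT];  /* stack of free slot indices */
    int free_count;                  /* number of valid entries in free_slots */
} CodeBufPool;

static void pool_init(CodeBufPool *p)
{
    for (int i = 0; i < POOL_BUF_COUNT; i++) {
        /* Fill in reverse so that slot 0 is handed out first. */
        p->free_slots[i] = POOL_BUF_COUNT - 1 - i;
    }
    p->free_count = POOL_BUF_COUNT;
}

/* Pop a free slot, or return NULL when the pool is exhausted. */
static void *pool_alloc(CodeBufPool *p)
{
    if (p->free_count == 0) {
        return NULL;
    }
    return p->arena[p->free_slots[--p->free_count]];
}

/* Push the slot back; buf must have come from pool_alloc() on this pool. */
static void pool_free(CodeBufPool *p, void *buf)
{
    size_t slot = (size_t)((uint8_t *)buf - (uint8_t *)p->arena) / POOL_BUF_SIZE;

    p->free_slots[p->free_count++] = (int)slot;
}
```

Because every buffer is the same size, a retired region's buffer can be
handed straight to the next fresh region. The cost is that a region
cannot grow past POOL_BUF_SIZE without some overflow scheme.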

I'm going to have a bit of a play to see what this sort of solution
would look like in the code but I thought I'd sketch the idea out to see
if there are any obvious glaring holes or other things to consider.

Thoughts, objections? Discuss ;-)

--
Alex Bennée
