qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Update on TCG Multithreading
@ 2014-12-01 19:33 Mark Burton
  2014-12-01 21:00 ` Lluís Vilanova
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Mark Burton @ 2014-12-01 19:33 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Bastian Koppelmann, Dr. David Alan Gilbert,
	Alexander Graf, pavel Dovgaluk, Paolo Bonzini, Alex Bennée,
	Lluís Vilanova, KONRAD Frédéric

All - first, a huge thanks to those who have contributed, and to those who have expressed an interest in helping out.

One issue I’d like to see more opinions on is the question of a cache per core versus a shared cache.
I have heard anecdotal evidence that a shared cache gives a major performance benefit…
Does anybody have anything more concrete?
(Of course, we will get numbers in the end if we implement the hybrid scheme suggested in the wiki, but I’d still appreciate any feedback.)

Our next step is to start putting an implementation plan together. It will probably be quite sketchy at this point, and we hope to start coding shortly.


Cheers

Mark.

* Re: [Qemu-devel] Update on TCG Multithreading
From: Lluís Vilanova @ 2014-12-01 21:00 UTC (permalink / raw)
  To: Mark Burton
  Cc: Peter Maydell, Bastian Koppelmann, Dr. David Alan Gilbert,
	qemu-devel, Alexander Graf, pavel Dovgaluk, Paolo Bonzini,
	Alex Bennée, KONRAD Frédéric

Mark Burton writes:

> All - first a huge thanks for those who have contributed, and those who have
> expressed an interest in helping out.

> One issue I’d like to see more opinions on is the question of a cache per core,
> or a shared cache.
> I have heard anecdotal evidence that a shared cache gives a major performance
> benefit….
> Does anybody have anything more concrete?
> (of course we will get numbers in the end if we implement the hybrid scheme as
> suggested in the wiki - but I’d still appreciate any feedback).

I think it makes sense to have a per-core pointer to a qom TCGCacheClass. That
can then have its own methods for working with updates, making it much simpler
to work with different implementations, like completely avoiding locks (per-cpu
cache) or a hybrid approach like the one described in the wiki.
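
A minimal sketch of what such a per-core pointer to a method table could look like, assuming plain C function pointers rather than real QOM machinery. Every name here (SketchCache, SketchCacheOps, per_cpu_ops, shared_ops) is hypothetical and the direct-mapped table is only a stand-in for a real translation cache:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CACHE_SLOTS 64

/* Hypothetical stand-in for a translated block; not QEMU's TranslationBlock. */
typedef struct SketchTB {
    uint64_t pc;
    int valid;
} SketchTB;

typedef struct SketchCache SketchCache;

/* Hypothetical method table, playing the role of the proposed class. */
typedef struct SketchCacheOps {
    SketchTB *(*lookup)(SketchCache *c, uint64_t pc);
    void (*insert)(SketchCache *c, uint64_t pc);
} SketchCacheOps;

struct SketchCache {
    const SketchCacheOps *ops;
    pthread_mutex_t lock;               /* only taken by the shared variant */
    SketchTB slots[SKETCH_CACHE_SLOTS]; /* direct-mapped, indexed by pc */
};

/* Lock-free helpers: safe only when a single core owns the cache. */
static SketchTB *per_cpu_lookup(SketchCache *c, uint64_t pc)
{
    SketchTB *tb = &c->slots[pc % SKETCH_CACHE_SLOTS];
    return (tb->valid && tb->pc == pc) ? tb : NULL;
}

static void per_cpu_insert(SketchCache *c, uint64_t pc)
{
    SketchTB *tb = &c->slots[pc % SKETCH_CACHE_SLOTS];
    tb->pc = pc;
    tb->valid = 1;
}

/* Shared variant: same table, but every access is serialized by a mutex.
 * The returned pointer is not protected after unlock; a real design would
 * need a lifetime rule for it. */
static SketchTB *shared_lookup(SketchCache *c, uint64_t pc)
{
    pthread_mutex_lock(&c->lock);
    SketchTB *tb = per_cpu_lookup(c, pc);
    pthread_mutex_unlock(&c->lock);
    return tb;
}

static void shared_insert(SketchCache *c, uint64_t pc)
{
    pthread_mutex_lock(&c->lock);
    per_cpu_insert(c, pc);
    pthread_mutex_unlock(&c->lock);
}

static const SketchCacheOps per_cpu_ops = { per_cpu_lookup, per_cpu_insert };
static const SketchCacheOps shared_ops  = { shared_lookup, shared_insert };

/* Each core would hold a SketchCache pointer set up with one of the tables. */
static void sketch_cache_init(SketchCache *c, const SketchCacheOps *ops)
{
    assert(c != NULL && ops != NULL);
    c->ops = ops;
    pthread_mutex_init(&c->lock, NULL);
    for (size_t i = 0; i < SKETCH_CACHE_SLOTS; i++) {
        c->slots[i].valid = 0;
    }
}
```

A caller would go through c->ops->lookup(c, pc), so switching between the lock-free and the locked variant only changes which table is passed to sketch_cache_init().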


> Our next plan is to start putting an implementation plan together. Probably
> quite sketchy at this point, and we hope to start coding shortly.

BTW, I've added some links to the COREMU project, which was discussed long ago
on this list.


Best,
  Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth


* Re: [Qemu-devel] Update on TCG Multithreading
From: Alexander Graf @ 2014-12-01 23:45 UTC (permalink / raw)
  To: Mark Burton, qemu-devel, Bastian Koppelmann, pavel Dovgaluk,
	KONRAD Frédéric, Peter Maydell, Dr. David Alan Gilbert,
	Alex Bennée, Paolo Bonzini



On 01.12.14 22:00, Lluís Vilanova wrote:
> Mark Burton writes:
> 
>> All - first a huge thanks for those who have contributed, and those who have
>> expressed an interest in helping out.
> 
>> One issue I’d like to see more opinions on is the question of a cache per core,
>> or a shared cache.
>> I have heard anecdotal evidence that a shared cache gives a major performance
>> benefit….
>> Does anybody have anything more concrete?
>> (of course we will get numbers in the end if we implement the hybrid scheme as
>> suggested in the wiki - but I’d still appreciate any feedback).
> 
> I think it makes sense to have a per-core pointer to a qom TCGCacheClass. That
> can then have its own methods for working with updates, making it much simpler
> to work with different implementations, like completely avoiding locks (per-cpu
> cache) or a hybrid approach like the one described in the wiki.

I don't think you want to have indirect function calls in the fast path ;).


Alex


* Re: [Qemu-devel] Update on TCG Multithreading
From: Lluís Vilanova @ 2014-12-02  1:41 UTC (permalink / raw)
  To: Alexander Graf
  Cc: Peter Maydell, Bastian Koppelmann, Mark Burton, qemu-devel,
	Dr. David Alan Gilbert, pavel Dovgaluk, Paolo Bonzini,
	Alex Bennée, KONRAD Frédéric

Alexander Graf writes:

> On 01.12.14 22:00, Lluís Vilanova wrote:
>> Mark Burton writes:
>> 
>>> All - first a huge thanks for those who have contributed, and those who have
>>> expressed an interest in helping out.
>> 
>>> One issue I’d like to see more opinions on is the question of a cache per core,
>>> or a shared cache.
>>> I have heard anecdotal evidence that a shared cache gives a major performance
>>> benefit….
>>> Does anybody have anything more concrete?
>>> (of course we will get numbers in the end if we implement the hybrid scheme as
>>> suggested in the wiki - but I’d still appreciate any feedback).
>> 
>> I think it makes sense to have a per-core pointer to a qom TCGCacheClass. That
>> can then have its own methods for working with updates, making it much simpler
>> to work with different implementations, like completely avoiding locks (per-cpu
>> cache) or a hybrid approach like the one described in the wiki.

> I don't think you want to have indirect function calls in the fast path ;).

Ooops, true; at least probably, since you're never sure how much the HW
prefetcher is going to outsmart you :)

Well, I guess that a define will have to do then. But I think it still makes
sense to refactor tb_* functions and such to have a TCGCache as first argument.
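
A rough sketch of that idea, assuming a build-time define picks the cache layout and the tb_*-style helpers take the cache as their first argument, so the fast path has no indirect calls. SKETCH_TCG_CACHE_SHARED, SketchTCGCache and the sketch_tb_* names are all made up for illustration, and locking for the shared case is left out:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical build-time switch; not an existing QEMU configure option. */
#ifndef SKETCH_TCG_CACHE_SHARED
#define SKETCH_TCG_CACHE_SHARED 0
#endif

#define SKETCH_MAX_CPUS 4
#define SKETCH_SLOTS    64

/* Hypothetical cache object; a direct-mapped table keyed by guest pc. */
typedef struct SketchTCGCache {
    uint64_t pcs[SKETCH_SLOTS];
    unsigned char valid[SKETCH_SLOTS];
} SketchTCGCache;

/* Static storage is zero-initialized, so every slot starts out invalid. */
static SketchTCGCache sketch_caches[SKETCH_MAX_CPUS];

/* The define decides which cache a core uses; callers never need to know. */
#if SKETCH_TCG_CACHE_SHARED
#define SKETCH_CACHE_FOR_CPU(idx) (&sketch_caches[0])
#else
#define SKETCH_CACHE_FOR_CPU(idx) (&sketch_caches[(idx)])
#endif

/* tb_*-style helpers with the cache as explicit first argument. They are
 * direct, inlinable calls, unlike a method table of function pointers. */
static inline void sketch_tb_add(SketchTCGCache *cache, uint64_t pc)
{
    assert(cache != NULL);
    size_t slot = (size_t)(pc % SKETCH_SLOTS);
    cache->pcs[slot] = pc;
    cache->valid[slot] = 1;
}

static inline int sketch_tb_find(SketchTCGCache *cache, uint64_t pc)
{
    assert(cache != NULL);
    size_t slot = (size_t)(pc % SKETCH_SLOTS);
    return cache->valid[slot] && cache->pcs[slot] == pc;
}
```

With the default (per-CPU) setting, a block added through SKETCH_CACHE_FOR_CPU(0) is not visible through SKETCH_CACHE_FOR_CPU(1); building with -DSKETCH_TCG_CACHE_SHARED=1 makes both expand to the same cache.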


Best,
  Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth


* Re: [Qemu-devel] Update on TCG Multithreading
From: Dr. David Alan Gilbert @ 2014-12-02 10:14 UTC (permalink / raw)
  To: Mark Burton
  Cc: Peter Maydell, Bastian Koppelmann, qemu-devel, Alexander Graf,
	pavel Dovgaluk, Paolo Bonzini, Alex Bennée, Lluís Vilanova,
	KONRAD Frédéric

* Mark Burton (mark.burton@greensocs.com) wrote:
> 
> All - first a huge thanks for those who have contributed, and those who have expressed an interest in helping out.
> 
> One issue I’d like to see more opinions on is the question of a cache per core, or a shared cache.
> I have heard anecdotal evidence that a shared cache gives a major performance benefit….
> Does anybody have anything more concrete?
> (of course we will get numbers in the end if we implement the hybrid scheme as suggested in the wiki - but I’d still appreciate any feedback).
> 
> Our next plan is to start putting an implementation plan together. Probably quite sketchy at this point, and we hope to start coding shortly.

I'd expect a shared one to be able to take advantage
of code that's translated by one core and then used on
another.
On the other hand with one per core you can perform updates
on the caches with a lot less locking; however you've still
got to be able to do invalidates across all the caches if any
core does the write, and that could also get tricky.
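
To make the invalidate problem concrete, here is a small sketch, assuming one cache per core keyed by guest page number. The names (SketchCoreCache, sketch_invalidate_page and friends) are hypothetical and the table layout is only illustrative. Each core normally touches just its own lock, but a write has to walk every core's cache:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NCORES 4
#define SKETCH_SLOTS  32

/* Hypothetical per-core cache: records which guest pages have translations. */
typedef struct SketchCoreCache {
    pthread_mutex_t lock;   /* owner core plus remote invalidators */
    uint64_t page[SKETCH_SLOTS];
    unsigned char valid[SKETCH_SLOTS];
} SketchCoreCache;

static SketchCoreCache sketch_core_cache[SKETCH_NCORES];

static void sketch_caches_init(void)
{
    for (size_t core = 0; core < SKETCH_NCORES; core++) {
        pthread_mutex_init(&sketch_core_cache[core].lock, NULL);
        for (size_t i = 0; i < SKETCH_SLOTS; i++) {
            sketch_core_cache[core].valid[i] = 0;
        }
    }
}

/* Common path: a core records a translation in its own cache only, so it
 * does not contend with translations happening on other cores. */
static void sketch_cache_add(size_t core, uint64_t page)
{
    assert(core < SKETCH_NCORES);
    SketchCoreCache *c = &sketch_core_cache[core];
    size_t slot = (size_t)(page % SKETCH_SLOTS);
    pthread_mutex_lock(&c->lock);
    c->page[slot] = page;
    c->valid[slot] = 1;
    pthread_mutex_unlock(&c->lock);
}

static int sketch_cache_has(size_t core, uint64_t page)
{
    assert(core < SKETCH_NCORES);
    SketchCoreCache *c = &sketch_core_cache[core];
    size_t slot = (size_t)(page % SKETCH_SLOTS);
    pthread_mutex_lock(&c->lock);
    int hit = c->valid[slot] && c->page[slot] == page;
    pthread_mutex_unlock(&c->lock);
    return hit;
}

/* Tricky path: a write to a page by any core must drop matching entries
 * from every core's cache. Returns how many entries were dropped. */
static int sketch_invalidate_page(uint64_t page)
{
    int dropped = 0;
    size_t slot = (size_t)(page % SKETCH_SLOTS);
    for (size_t core = 0; core < SKETCH_NCORES; core++) {
        SketchCoreCache *c = &sketch_core_cache[core];
        pthread_mutex_lock(&c->lock);
        if (c->valid[slot] && c->page[slot] == page) {
            c->valid[slot] = 0;
            dropped++;
        }
        pthread_mutex_unlock(&c->lock);
    }
    return dropped;
}
```

The sketch ignores the harder part, namely a core that is currently executing the code being invalidated.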

Dave

> 
> 
> Cheers
> 
> Mark.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


* Re: [Qemu-devel] Update on TCG Multithreading
From: Kirill Batuzov @ 2014-12-02 15:14 UTC (permalink / raw)
  To: Mark Burton
  Cc: Peter Maydell, Bastian Koppelmann, Alexander Graf, qemu-devel,
	pavel Dovgaluk, Paolo Bonzini, Alex Bennée,
	Lluís Vilanova, Dr. David Alan Gilbert,
	KONRAD Frédéric

On Mon, 1 Dec 2014, Mark Burton wrote:
> 
> One issue I’d like to see more opinions on is the question of a cache per core, or a shared cache.
> I have heard anecdotal evidence that a shared cache gives a major performance benefit….
> Does anybody have anything more concrete?

There is a theoretical and experimental comparison of these approaches in
the PQEMU article (you've cited it on the wiki page). The authors just use
different names: they call cache-per-core "Separate Code Cache" (SCC) and
the shared cache "Unified Code Cache" (UCC).

-- 
Kirill


Thread overview: 6+ messages
2014-12-01 19:33 [Qemu-devel] Update on TCG Multithreading Mark Burton
2014-12-01 21:00 ` Lluís Vilanova
2014-12-01 23:45   ` Alexander Graf
2014-12-02  1:41     ` Lluís Vilanova
2014-12-02 10:14 ` Dr. David Alan Gilbert
2014-12-02 15:14 ` Kirill Batuzov
