[Qemu-devel] Qemu + CUDA: a new possible way?

* [Qemu-devel] Qemu + CUDA: a new possible way?
@ 2009-06-05 20:09 OneSoul
  2009-06-05 20:59 ` Lennart Sorensen
                   ` (2 more replies)
  0 siblings, 3 replies; 5+ messages in thread
From: OneSoul @ 2009-06-05 20:09 UTC
  To: qemu-devel

Hello all!

I have been a Qemu user for a long time and I'm very satisfied with its
flexibility, power and portability - really a good project!

Recently, reading some technical articles on the internet, I discovered
the great potential of the CUDA framework for scientific and graphics
computing, which takes strong advantage of the most recent GPUs. People
have used it for password recovery, realtime rendering, etc., with great
results.

Would it be possible to use this technology in the Qemu project to
achieve better performance?
Could it be a significant step for the development of virtualization
technology?

Someone, for example, has experimentally (re)written the md-raid
kernel modules using the CUDA framework to accelerate the Reed-Solomon
computations... and it seems to work fine.
Why not for Qemu or related components?

The main question is about the dynamic translation engine: can it be
modified for this framework?
Some say that Qemu is NOT parallelizable... but that seems strange,
because by definition it is "Fast and Portable".
Is it not portable to this framework?
Note that computing on the GPU is driven through a kernel module,
not directly.

What do you think about this draft idea? It's just a proof-of-concept,
but I hope it is useful.

Any feedback is welcome...

* Re: [Qemu-devel] Qemu + CUDA: a new possible way?
  2009-06-05 20:09 [Qemu-devel] Qemu + CUDA: a new possible way? OneSoul
@ 2009-06-05 20:59 ` Lennart Sorensen
  2009-06-05 21:31 ` Blue Swirl
  2009-06-06  2:42 ` Paul Brook
  2 siblings, 0 replies; 5+ messages in thread
From: Lennart Sorensen @ 2009-06-05 20:59 UTC
  To: OneSoul; +Cc: qemu-devel

On Fri, Jun 05, 2009 at 09:09:27PM +0100, OneSoul wrote:
> I have been a Qemu user for a long time and I'm very satisfied with its
> flexibility, power and portability - really a good project!
>
> Recently, reading some technical articles on the internet, I discovered
> the great potential of the CUDA framework for scientific and graphics
> computing, which takes strong advantage of the most recent GPUs. People
> have used it for password recovery, realtime rendering, etc., with great
> results.
>
> Would it be possible to use this technology in the Qemu project to
> achieve better performance?
> Could it be a significant step for the development of virtualization
> technology?
>
> Someone, for example, has experimentally (re)written the md-raid
> kernel modules using the CUDA framework to accelerate the Reed-Solomon
> computations... and it seems to work fine.
> Why not for Qemu or related components?
>
> The main question is about the dynamic translation engine: can it be
> modified for this framework?
> Some say that Qemu is NOT parallelizable... but that seems strange,
> because by definition it is "Fast and Portable".
> Is it not portable to this framework?
> Note that computing on the GPU is driven through a kernel module,
> not directly.
>
> What do you think about this draft idea? It's just a proof-of-concept,
> but I hope it is useful.
>
> Any feedback is welcome...

Password cracking involves lots and lots of identical attempts with
different values.  Trivial to make parallel.

md-raid involves doing xor on lots of data.  Trivial to make parallel.

Rendering involves calculating lots of pixels.  Trivial to make parallel.

qemu is emulating a single CPU doing one thing at a time.  That is
pretty much not parallel work.  Now adding threaded IO so that device
emulation is done in parallel with the CPU emulation (which I believe is
currently happening) does make sense, but I don't think there is much
chance of making the CPU emulation and translation particularly parallel.
At best you might get one thread doing the translation and another doing
the actual execution of the translated code.  Nothing parallel on the
scale needed to make CUDA useful.
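
The contrast above can be illustrated with a small Python sketch. It is
only a toy under stated assumptions: check_candidate, hash_all_parallel
and run_guest are hypothetical names invented for this illustration, and
the three-instruction "guest" has nothing to do with how qemu is really
implemented. The first half is work whose items are independent, the
second half is a loop where every step needs the state left by the
previous one.

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib


def check_candidate(candidate: str) -> str:
    """Independent unit of work: it needs nothing but its own input."""
    return hashlib.sha256(candidate.encode()).hexdigest()


def hash_all_parallel(candidates):
    """Items do not depend on each other, so they can be handed to any
    number of workers; pool.map still returns results in input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(check_candidate, candidates))


def run_guest(program, acc=0):
    """Toy single-CPU emulator (hypothetical instruction set).

    Every instruction reads the accumulator and program counter left by
    the previous one, so the loop body cannot be split across workers.
    """
    pc = 0
    while pc < len(program):
        op, arg = program[pc]
        if op == "add":
            acc += arg
        elif op == "mul":
            acc *= arg
        elif op == "jnz" and acc != 0:
            pc = arg          # taken branch: next pc depends on acc
            continue
        pc += 1
    return acc
```

In the toy emulator even the question "which instruction comes next?"
cannot be answered before the current instruction has finished, which is
the dependency described above.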

-- 
Len Sorensen

* Re: [Qemu-devel] Qemu + CUDA: a new possible way?
  2009-06-05 20:09 [Qemu-devel] Qemu + CUDA: a new possible way? OneSoul
  2009-06-05 20:59 ` Lennart Sorensen
@ 2009-06-05 21:31 ` Blue Swirl
  2009-06-06  2:42 ` Paul Brook
  2 siblings, 0 replies; 5+ messages in thread
From: Blue Swirl @ 2009-06-05 21:31 UTC
  To: OneSoul; +Cc: qemu-devel

On 6/5/09, OneSoul <onesoul@autistici.org> wrote:
> Hello all!
>
>  I have been a Qemu user for a long time and I'm very satisfied with its
>  flexibility, power and portability - really a good project!
>
>  Recently, reading some technical articles on the internet, I discovered
>  the great potential of the CUDA framework for scientific and graphics
>  computing, which takes strong advantage of the most recent GPUs. People
>  have used it for password recovery, realtime rendering, etc., with great
>  results.
>
>  Would it be possible to use this technology in the Qemu project to
>  achieve better performance?
>  Could it be a significant step for the development of virtualization
>  technology?
>
>  Someone, for example, has experimentally (re)written the md-raid
>  kernel modules using the CUDA framework to accelerate the Reed-Solomon
>  computations... and it seems to work fine.
>  Why not for Qemu or related components?
>
>  The main question is about the dynamic translation engine: can it be
>  modified for this framework?

It should be possible to make a CUDA target for TCG, judging from a
quick look at the PTX documentation.

I don't know whether that makes sense from a performance point of view:
how much time do PTX compilation and transfer to the GPU take? Native
GPU machine code would be faster.
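
The compile-and-transfer question can be framed as a break-even
calculation. The Python sketch below is a simple cost model, not a
measurement: the function name and all timing parameters are
hypothetical, and it assumes that a translated block is compiled once,
then transferred and executed on every run. Times are integers (for
example nanoseconds) to keep the arithmetic exact.

```python
def min_runs_to_break_even(cpu_ns, gpu_ns, transfer_ns, compile_ns):
    """Smallest number of runs of one translated block after which the
    GPU path is cheaper than the CPU path in total.

    Assumed cost model (all inputs are hypothetical integers):
      CPU path: n * cpu_ns
      GPU path: compile_ns + n * (gpu_ns + transfer_ns)
    Returns None if a single GPU run is not cheaper than a CPU run,
    because then the compile cost can never be amortized.
    """
    gain_per_run = cpu_ns - (gpu_ns + transfer_ns)
    if gain_per_run <= 0:
        return None
    # Smallest n with n * gain_per_run > compile_ns.
    return compile_ns // gain_per_run + 1
```

With made-up numbers such as cpu_ns=50, gpu_ns=30, transfer_ns=10 and
compile_ns=1000, a block would have to run 101 times before offloading
pays off; if the GPU run plus transfer is no faster than the CPU run,
it never does.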

>  Some say that Qemu is NOT parallelizable... but that seems strange,
>  because by definition it is "Fast and Portable".
>  Is it not portable to this framework?
>  Note that computing on the GPU is driven through a kernel module,
>  not directly.

The problem for CPUs is that emulation of atomic operations is costly.
Maybe the native thread synchronization operations in CUDA could help.

If QEMU runs in user space, the user to kernel to GPU switches will
increase latency. At least the dynamic translation code should then
also run on the GPU, leaving only the IO device handling to the CPU.
Obviously VGA emulation should reside on the GPU if possible.

>  What do you think about this draft idea? It's just a proof-of-concept,
>  but I hope it is useful.
>
>  Any feedback is welcome...

What is the performance of a single execution unit? If you emulate an
x86 system, I'd think you get more cycles to run the emulator on a CPU
running at 2 GHz than on a GPU running at only 500 MHz. Maybe
you could emulate a system with 16384 CPUs @ 500 MHz? Even if the
performance of a single emulator is not great, it may still be
attractive for server farms.
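
The arithmetic behind these figures, as a small Python sketch. It uses
only the clock rates and the unit count mentioned above and assumes,
purely for illustration, that one cycle does the same amount of useful
work on either kind of processor, which is certainly too generous to
the GPU. The function names are made up for this example.

```python
CPU_HZ = 2_000_000_000       # one host CPU at 2 GHz
GPU_UNIT_HZ = 500_000_000    # one GPU execution unit at 500 MHz
GPU_UNITS = 16384            # number of emulated CPUs suggested above


def single_guest_ratio(gpu_unit_hz=GPU_UNIT_HZ, cpu_hz=CPU_HZ):
    """Cycles available to ONE emulated CPU on a GPU unit, relative to
    running the same emulator on the host CPU."""
    return gpu_unit_hz / cpu_hz


def aggregate_ratio(units=GPU_UNITS, gpu_unit_hz=GPU_UNIT_HZ, cpu_hz=CPU_HZ):
    """Total cycles across all GPU units relative to one host CPU."""
    return units * gpu_unit_hz / cpu_hz


print(single_guest_ratio())  # 0.25
print(aggregate_ratio())     # 4096.0
```

Under that assumption each guest runs at a quarter of the speed, while
the total cycle budget is 4096 times larger, which is why the idea is
only interesting for many independent guests.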

Taking the idea one step further: could the CUDA framework be
virtualized? Though it looks like there are no exceptions or privilege
levels, so CUDA can't run an OS.

* Re: [Qemu-devel] Qemu + CUDA: a new possible way?
  2009-06-05 20:09 [Qemu-devel] Qemu + CUDA: a new possible way? OneSoul
  2009-06-05 20:59 ` Lennart Sorensen
  2009-06-05 21:31 ` Blue Swirl
@ 2009-06-06  2:42 ` Paul Brook
  2 siblings, 0 replies; 5+ messages in thread
From: Paul Brook @ 2009-06-06  2:42 UTC
  To: qemu-devel; +Cc: OneSoul

> Recently, reading some technical articles on the internet, I discovered
> the great potential of the CUDA framework for scientific and graphics
> computing, which takes strong advantage of the most recent GPUs. People
> have used it for password recovery, realtime rendering, etc., with great
> results.

Most of these problems are what's known as embarrassingly parallel. It's 
trivial to split them up into many small independent pieces. A GPU contains 
hundreds or thousands of small, loosely coupled, low power processing units so 
is a good fit for this kind of problem.

Most of the work that qemu does is completely the opposite. Every step is 
highly dependent on the preceding steps, so you have to execute them in 
series.

If you really think you have figured out some magic way around this then your
first step should be to make qemu work over a small number (say 8 or 16) of
tightly coupled CPU cores. If you can't do that then you haven't got a hope in
hell of making it work over a vast number of remote GPU cores.
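
One standard way to put numbers on this is Amdahl's law, which is not
mentioned in this thread and is brought in here only as an illustration.
The Python sketch below assumes a hypothetical parallel fraction of 10%;
that figure is invented for the example and is not a measurement of
qemu.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only `parallel_fraction` of the work
    can be spread over `workers` and the rest must run in series."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Hypothetical: assume 10% of the emulation work could run in parallel.
for workers in (8, 16, 16384):
    print(workers, round(amdahl_speedup(0.10, workers), 3))
```

With a 10% parallel fraction the bound is about 1.096 for 8 workers and
about 1.103 for 16, and it can never exceed 1 / 0.9 (about 1.111)
however many cores are added. Going from a handful of CPU cores to
thousands of GPU cores changes almost nothing unless the serial part
shrinks first.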

Paul

Thread overview: 5+ messages
2009-06-05 20:09 [Qemu-devel] Qemu + CUDA: a new possible way? OneSoul
2009-06-05 20:59 ` Lennart Sorensen
2009-06-05 21:31 ` Blue Swirl
2009-06-06  2:42 ` Paul Brook
  -- strict thread matches above, loose matches on Subject: below --
2009-06-05  8:01 OneSoul
