qemu-devel.nongnu.org archive mirror
* [Qemu-devel] Para-virtualized ram-based filesystem?
@ 2011-04-15 21:09 Ritchie, Stuart
  2011-04-15 21:43 ` Anthony Liguori
  0 siblings, 1 reply; 10+ messages in thread
From: Ritchie, Stuart @ 2011-04-15 21:09 UTC
  To: qemu-devel@nongnu.org

Hi all,

Has anyone looked at implementing a para-virtualized ram-based filesystem
for qemu?  Or any similar dynamic memory mapping techniques for running
guests?

What I had in mind would be a convenient, zero-copy mechanism for sharing
dynamically allocated, memory mapped files between host and guests.

The host provides a primary memory-mapped file system (ramfs, tmpfs,
hugetlbfs, etc), and the guest kernel and qemu use this host fs to provide
the illusion to guest applications that the filesystem is local.
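
A minimal sketch of the zero-copy idea, with no qemu involved: two mappings
of one file stand in for the host side and the guest side. The function name
and the optional `directory` argument are illustrative assumptions; on a real
host the directory would be a tmpfs or ramfs mount such as /dev/shm.

```python
import mmap
import os
import tempfile


def shared_mapping_demo(directory=None):
    """Map one file twice; a write through one mapping is visible through
    the other without any read()/write() copy in between."""
    # Assumption: `directory` would be a tmpfs/ramfs mount. None falls back
    # to the default temp directory so the sketch runs anywhere.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.ftruncate(fd, mmap.PAGESIZE)        # size the file to one page
        writer = mmap.mmap(fd, mmap.PAGESIZE)  # stands in for the host side
        reader = mmap.mmap(fd, mmap.PAGESIZE)  # stands in for the guest side
        writer[0:5] = b"hello"
        seen = bytes(reader[0:5])              # same pages, no copy made
        writer.close()
        reader.close()
        return seen
    finally:
        os.close(fd)
        os.unlink(path)
```

Both mappings are shared (the default for `mmap.mmap` on Unix), so they are
backed by the same pages.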

The guest kernel contains a new filesystem, call it vramfs, say,
implementing the various VFS handlers for a para-virt filesystem.  These
handlers call out to qemu, which in turn emulates them by invoking the
required host system calls.

Handling mmap/munmap is tricky -- but this is where the magic is.  There
does seem to be some qemu infrastructure to dynamically map memory into a
running system, though it may be designed for different requirements
(e.g., device memory).

I currently have the resources to work on this and am looking forward to
contributing my work back to the community.  I would appreciate any help
or pointers on this effort.

Cheers,
--Stuart


============================================================
The information contained in this message may be privileged
and confidential and protected from disclosure. If the reader
of this message is not the intended recipient, or an employee
or agent responsible for delivering this message to the
intended recipient, you are hereby notified that any reproduction,
dissemination or distribution of this communication is strictly
prohibited. If you have received this communication in error,
please notify us immediately by replying to the message and
deleting it from your computer. Thank you. Tellabs
============================================================


* Re: [Qemu-devel] Para-virtualized ram-based filesystem?
  2011-04-15 21:09 [Qemu-devel] Para-virtualized ram-based filesystem? Ritchie, Stuart
@ 2011-04-15 21:43 ` Anthony Liguori
  2011-04-15 23:58   ` Ritchie, Stuart
  0 siblings, 1 reply; 10+ messages in thread
From: Anthony Liguori @ 2011-04-15 21:43 UTC
  To: Ritchie, Stuart; +Cc: qemu-devel@nongnu.org

On 04/15/2011 04:09 PM, Ritchie, Stuart wrote:
> Hi all,
>
> Has anyone looked at implementing a para-virtualized ram-based filesystem
> for qemu?  Or any similar dynamic memory mapping techniques for running
> guests?
>
> What I had in mind would be a convenient, zero-copy mechanism for sharing
> dynamically allocated, memory mapped files between host and guests.
>
> The host provides a primary memory-mapped file system (ramfs, tmpfs,
> hugetlbfs, etc), and the guest kernel and qemu use this host fs to provide
> the illusion to guest applications that the filesystem is local.
>
> The guest kernel contains a new filesystem, say call it vramfs,
> implementing the various VFS handlers for a para-virt filesystem.  These
> handlers call out to qemu, which in turn emulates them by invoking the
> required host system calls.

You can do this with ivshmem today.  You give it a path to a shared 
memory file, and then there's a path in sysfs that you can mmap() in 
userspace in the guest.
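
A rough sketch of that guest-side step, assuming the shared region shows up
as a PCI resource file under sysfs. The path in the comment is a hypothetical
example, not taken from this thread, and the demo maps an ordinary temporary
file instead so that it runs without an ivshmem device.

```python
import mmap
import os
import tempfile


def map_region(path, length):
    """Map `length` bytes of `path` read/write and shared."""
    fd = os.open(path, os.O_RDWR)
    try:
        return mmap.mmap(fd, length, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE)
    finally:
        os.close(fd)  # mmap keeps its own duplicate of the descriptor


def demo():
    # Hypothetical guest path (assumption), something like:
    #   /sys/bus/pci/devices/<address>/resource2
    # A temporary file is used here as a stand-in for that region.
    with tempfile.NamedTemporaryFile() as backing:
        backing.truncate(mmap.PAGESIZE)
        region = map_region(backing.name, mmap.PAGESIZE)
        region[0:4] = b"ping"      # store directly into the mapped pages
        region.close()
        backing.seek(0)
        return backing.read(4)
```

The region size is fixed when it is mapped; nothing here grows or shrinks it.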

Regards,

Anthony Liguori

> Handling mmap/munmap is tricky -- but this is where the magic is.  There
> does seem to be some qemu infrastructure to dynamically map memory into a
> running system, though it may be designed for different requirements
> (e.g., device memory).
>
> I currently have the resources to work on this and am looking forward to
> contributing my work back to the community.  I would appreciate any help
> or pointers on this effort.
>
> Cheers,
> --Stuart

* Re: [Qemu-devel] Para-virtualized ram-based filesystem?
  2011-04-15 21:43 ` Anthony Liguori
@ 2011-04-15 23:58   ` Ritchie, Stuart
  2011-04-16  0:27     ` Brad Hards
  2011-04-17 12:43     ` Avi Kivity
  0 siblings, 2 replies; 10+ messages in thread
From: Ritchie, Stuart @ 2011-04-15 23:58 UTC
  To: Anthony Liguori; +Cc: qemu-devel@nongnu.org

On 4/15/11 2:43 PM, "Anthony Liguori" <anthony@codemonkey.ws> wrote:

>On 04/15/2011 04:09 PM, Ritchie, Stuart wrote:
>> Hi all,
>>
>> Has anyone looked at implementing a para-virtualized ram-based filesystem
>> for qemu?  Or any similar dynamic memory mapping techniques for running
>> guests?
>>
>> What I had in mind would be a convenient, zero-copy mechanism for sharing
>> dynamically allocated, memory mapped files between host and guests.
>>
>> The host provides a primary memory-mapped file system (ramfs, tmpfs,
>> hugetlbfs, etc), and the guest kernel and qemu use this host fs to provide
>> the illusion to guest applications that the filesystem is local.
>>
>> The guest kernel contains a new filesystem, say call it vramfs,
>> implementing the various VFS handlers for a para-virt filesystem.  These
>> handlers call out to qemu, which in turn emulates them by invoking the
>> required host system calls.
>
>You can do this with ivshmem today.  You give it a path to a shared
>memory file, and then there's a path in sysfs that you can mmap() in
>userspace in the guest.

Please correct me if I am wrong, but with ivshmem you must manage your
world within a single, fixed size region.  I appreciate the simplicity of
mapping the whole region all in one go, but our requirements are a bit
different.  Even if you could pass multiple -device ivshmem instances,
it's still a fixed environment.  Right?

Guest applications need to manage an arbitrary number of dynamic files.
Say, a few dozen.  The files can be created, deleted, grow and shrink
arbitrarily as applications see fit.  Some are small like 2KB, others
could incrementally grow to 1GB or more.  Existing code depends on this
file-based abstraction and there is pressure against change.
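
A small sketch of the file behaviour described above, assuming a host
directory on tmpfs or ramfs: a file starts at 2 KB, grows to 1 MB, and
shrinks again, with the file system doing the memory management. The function
name and `directory` argument are illustrative only.

```python
import mmap
import os
import tempfile


def grow_and_shrink(directory=None):
    """Create a 2 KB file, grow it to 1 MB, then shrink it back."""
    # Assumption: `directory` would be a tmpfs/ramfs mount on the host.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.ftruncate(fd, 2048)
        small = mmap.mmap(fd, 2048)
        small[0:4] = b"head"
        small.close()

        os.ftruncate(fd, 1 << 20)       # grow: the file system handles allocation
        large = mmap.mmap(fd, 1 << 20)  # remap at the new size
        kept = bytes(large[0:4])        # earlier contents are still in place
        large.close()

        os.ftruncate(fd, 2048)          # shrink: space past 2 KB is given back
        return kept, os.fstat(fd).st_size
    finally:
        os.close(fd)
        os.unlink(path)
```

Each size change needs a fresh mapping, which is the dynamic map/unmap
traffic a fixed-size shared region does not have to deal with.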

Each file is created and owned by its own process; thus synchronization
within a file is not necessary.  A single guest contains a number of
processes that are creating and owning these files.

A guest may exit and restart.  When it restarts, its processes should be
able to open their files, map them in, and carry on.

It must be possible to hand control of files from one guest to another, again
using zero-copy memory mapping.

Seems to me that a para-virt ram-based filesystem fits the bill here.  The
idea leverages the host fs for indexing, memory management and
synchronization.  This is otherwise what we would have to do within a
single fixed region ourselves.

The cost for all this flexibility is lots of micro-mappings.  I'm not sure
the current qemu infrastructure is designed for this.  For starters,
RAMBlocks are managed on a singly-linked list.  Probably there are lots of
other scaling issues.
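
To illustrate the scaling concern with many small mappings: finding the block
that contains an address is linear in the number of blocks when walking a
list, and logarithmic with a sorted index. This is a generic sketch;
`linear_find` and `RangeIndex` are hypothetical names and do not reflect
qemu's actual data structures.

```python
import bisect


def linear_find(blocks, addr):
    """Walk (start, length, name) tuples one by one, as a list walk would."""
    for start, length, name in blocks:
        if start <= addr < start + length:
            return name
    return None


class RangeIndex:
    """Sorted index of non-overlapping [start, start + length) ranges."""

    def __init__(self):
        self._starts = []   # sorted start addresses
        self._blocks = []   # (start, length, name), same order as _starts

    def add(self, start, length, name):
        i = bisect.bisect_left(self._starts, start)
        self._starts.insert(i, start)
        self._blocks.insert(i, (start, length, name))

    def find(self, addr):
        # Binary search for the last block starting at or before addr.
        i = bisect.bisect_right(self._starts, addr) - 1
        if i >= 0:
            start, length, name = self._blocks[i]
            if addr < start + length:
                return name
        return None
```

With a few dozen files the difference is small; it matters once each file
contributes many separate mappings.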

How does that sound?

Cheers,
--Stuart

PS. Sorry about the corporate disclaimer, maybe folks in other Fortune
500s can give me tips on how to fix it. :-)

>
>Regards,
>
>Anthony Liguori
>
>> Handling mmap/munmap is tricky -- but this is where the magic is.  There
>> does seem to be some qemu infrastructure to dynamically map memory into a
>> running system, though it may be designed for different requirements
>> (e.g., device memory).
>>
>> I currently have the resources to work on this and am looking forward to
>> contributing my work back to the community.  I would appreciate any help
>> or pointers on this effort.
>>
>> Cheers,
>> --Stuart



* Re: [Qemu-devel] Para-virtualized ram-based filesystem?
  2011-04-15 23:58   ` Ritchie, Stuart
@ 2011-04-16  0:27     ` Brad Hards
  2011-04-16  8:52       ` Stefan Hajnoczi
  2011-04-17 12:43     ` Avi Kivity
  1 sibling, 1 reply; 10+ messages in thread
From: Brad Hards @ 2011-04-16  0:27 UTC
  To: qemu-devel

On Saturday 16 April 2011 09:58:32 Ritchie, Stuart wrote:
> How does that sound?
As a general user: Confusing.

Is there a concrete example (specific applications, specific performance issues, 
specific requirements) that you can share?

Brad


* Re: [Qemu-devel] Para-virtualized ram-based filesystem?
  2011-04-16  0:27     ` Brad Hards
@ 2011-04-16  8:52       ` Stefan Hajnoczi
  2011-04-16  8:54         ` Stefan Hajnoczi
  2011-04-18  4:12         ` Ritchie, Stuart
  0 siblings, 2 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2011-04-16  8:52 UTC
  To: Brad Hards; +Cc: qemu-devel

On Sat, Apr 16, 2011 at 1:27 AM, Brad Hards <bradh@frogmouth.net> wrote:
> On Saturday 16 April 2011 09:58:32 Ritchie, Stuart wrote:
>> How does that sound?
> As a general user: Confusing.
>
> Is there a concrete example (specific applications, specific performance issues,
> specific requirements) that you can share?

I'm also wondering why you want this.

Does it matter if the files get pushed out to swap on the host?

It's tempting to take advantage of running virtualized but then things
like migration get in the way.  Have you actually tried out network
file systems and determined they won't work for some reason?

Stefan


* Re: [Qemu-devel] Para-virtualized ram-based filesystem?
  2011-04-16  8:52       ` Stefan Hajnoczi
@ 2011-04-16  8:54         ` Stefan Hajnoczi
  2011-04-18  4:12         ` Ritchie, Stuart
  1 sibling, 0 replies; 10+ messages in thread
From: Stefan Hajnoczi @ 2011-04-16  8:54 UTC
  To: Stuart.Ritchie; +Cc: qemu-devel, Brad Hards

Resent because Stuart dropped from the recipients list.

On Sat, Apr 16, 2011 at 9:52 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Sat, Apr 16, 2011 at 1:27 AM, Brad Hards <bradh@frogmouth.net> wrote:
>> On Saturday 16 April 2011 09:58:32 Ritchie, Stuart wrote:
>>> How does that sound?
>> As a general user: Confusing.
>>
>> Is there a concrete example (specific applications, specific performance issues,
>> specific requirements) that you can share?
>
> I'm also wondering why you want this.
>
> Does it matter if the files get pushed out to swap on the host?
>
> It's tempting to take advantage of running virtualized but then things
> like migration get in the way.  Have you actually tried out network
> file systems and determined they won't work for some reason?
>
> Stefan
>


* Re: [Qemu-devel] Para-virtualized ram-based filesystem?
  2011-04-15 23:58   ` Ritchie, Stuart
  2011-04-16  0:27     ` Brad Hards
@ 2011-04-17 12:43     ` Avi Kivity
  2011-04-18  3:28       ` Ritchie, Stuart
  1 sibling, 1 reply; 10+ messages in thread
From: Avi Kivity @ 2011-04-17 12:43 UTC
  To: Ritchie, Stuart; +Cc: qemu-devel@nongnu.org

On 04/16/2011 02:58 AM, Ritchie, Stuart wrote:
> >
> >You can do this with ivshmem today.  You give it a path to a shared
> >memory file, and then there's a path in sysfs that you can mmap() in
> >userspace in the guest.
>
> Please correct me if I am wrong, but with ivshmem you must manage your
> world within a single, fixed size region.  I appreciate the simplicity of
> mapping the whole region all in one go, but our requirements are a bit
> different.  Even if you could pass multiple -device ivshmem instances,
> it's still a fixed environment.  Right?
>

You could place a read-only filesystem (say iso9660) in the region and 
mount it; it will then appear as a complete filesystem.

-- 
error compiling committee.c: too many arguments to function


* Re: [Qemu-devel] Para-virtualized ram-based filesystem?
  2011-04-17 12:43     ` Avi Kivity
@ 2011-04-18  3:28       ` Ritchie, Stuart
  2011-04-18  6:31         ` Avi Kivity
  0 siblings, 1 reply; 10+ messages in thread
From: Ritchie, Stuart @ 2011-04-18  3:28 UTC
  To: Avi Kivity; +Cc: qemu-devel@nongnu.org

On 4/17/11 5:43 AM, "Avi Kivity" <avi@redhat.com> wrote:

>On 04/16/2011 02:58 AM, Ritchie, Stuart wrote:
>> >
>> >You can do this with ivshmem today.  You give it a path to a shared
>> >memory file, and then there's a path in sysfs that you can mmap() in
>> >userspace in the guest.
>>
>> Please correct me if I am wrong, but with ivshmem you must manage your
>> world within a single, fixed size region.  I appreciate the simplicity of
>> mapping the whole region all in one go, but our requirements are a bit
>> different.  Even if you could pass multiple -device ivshmem instances,
>> it's still a fixed environment.  Right?
>>
>
>You could place a read-only filesystem (say iso9660) in the region and
>mount it; it will then appear as a complete filesystem.

We've thought about formatting the region as a ramdisk, but the block
layer shields mmap() from the storage, thus requiring a data copy into the
page-cache.  The great thing about ramfs/tmpfs is the data is used
in-place; we'd lose that when going with a ramdisk or other real
filesystem.
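
A small user-space illustration of the copy versus in-place distinction. It
only shows the difference an application can observe between a read() style
copy and a mapping; the page-cache copy discussed above happens inside the
guest kernel and is not reproduced here. The function name and `directory`
argument are illustrative assumptions.

```python
import mmap
import os
import tempfile


def copy_versus_in_place(directory=None):
    """Return (bytes obtained by reading, bytes seen through a mapping)
    after the file is changed behind both of them."""
    # Assumption: `directory` would be a tmpfs/ramfs mount on the host.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"old data")
        copied = os.pread(fd, 8, 0)   # read path: bytes land in a new buffer
        view = mmap.mmap(fd, 8)       # mmap path: the file's pages are mapped
        os.pwrite(fd, b"NEW", 0)      # change the file behind both of them
        in_place = bytes(view[0:8])   # the mapping reflects the change
        view.close()
        return copied, in_place
    finally:
        os.close(fd)
        os.unlink(path)
```

The buffer filled by the read keeps the stale bytes, while the mapping sees
the update because it refers to the data where it lives.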

--Stuart



* Re: [Qemu-devel] Para-virtualized ram-based filesystem?
  2011-04-16  8:52       ` Stefan Hajnoczi
  2011-04-16  8:54         ` Stefan Hajnoczi
@ 2011-04-18  4:12         ` Ritchie, Stuart
  1 sibling, 0 replies; 10+ messages in thread
From: Ritchie, Stuart @ 2011-04-18  4:12 UTC
  To: Stefan Hajnoczi, Brad Hards; +Cc: qemu-devel@nongnu.org

On 4/16/11 1:52 AM, "Stefan Hajnoczi" <stefanha@gmail.com> wrote:

>On Sat, Apr 16, 2011 at 1:27 AM, Brad Hards <bradh@frogmouth.net> wrote:
>> On Saturday 16 April 2011 09:58:32 Ritchie, Stuart wrote:
>>> How does that sound?
>> As a general user: Confusing.
>>
>> Is there a concrete example (specific applications, specific performance
>> issues, specific requirements) that you can share?
>
>I'm also wondering why you want this.

The reason why we want this is the same reason why anyone would want
mmap() and tmpfs/ramfs in the first place: zero-copy, in-place access to
your data.

>
>Does it matter if the files get pushed out to swap on the host?

We don't run in a swapping environment.  But for someone who does, and
accepts the performance hit, I don't see why it would matter.

Ramfs does not kick pages out of the page-cache.  But tmpfs does -- the
host should make this transparent, as it does now.

>
>It's tempting to take advantage of running virtualized but then things
>like migration get in the way.  Have you actually tried out network
>file systems and determined they won't work for some reason?
>
>Stefan

For performance reasons it's very important for our system that the data
be as close to the app as possible.  We can't afford to push data through
an I/O channel.

What I'm really suggesting here is a way for guest applications to mmap()
host memory.

Combine that with a virt-aware robust mutex and you've probably got the
most flexible, performant, inter-guest sharing/communication mechanism
possible.  (Semaphores through a socket?  On the same system?  You gotta
be kidding. :-)

Cheers,
--Stuart



* Re: [Qemu-devel] Para-virtualized ram-based filesystem?
  2011-04-18  3:28       ` Ritchie, Stuart
@ 2011-04-18  6:31         ` Avi Kivity
  0 siblings, 0 replies; 10+ messages in thread
From: Avi Kivity @ 2011-04-18  6:31 UTC
  To: Ritchie, Stuart; +Cc: qemu-devel@nongnu.org

On 04/18/2011 06:28 AM, Ritchie, Stuart wrote:
> On 4/17/11 5:43 AM, "Avi Kivity"<avi@redhat.com>  wrote:
>
> >On 04/16/2011 02:58 AM, Ritchie, Stuart wrote:
> >>  >
> >>  >You can do this with ivshmem today.  You give it a path to a shared
> >>  >memory file, and then there's a path in sysfs that you can mmap() in
> >>  >userspace in the guest.
> >>
> >>  Please correct me if I am wrong, but with ivshmem you must manage your
> >>  world within a single, fixed size region.  I appreciate the simplicity of
> >>  mapping the whole region all in one go, but our requirements are a bit
> >>  different.  Even if you could pass multiple -device ivshmem instances,
> >>  it's still a fixed environment.  Right?
> >>
> >
> >You could place a read-only filesystem (say iso9660) in the region and
> >mount it; it will then appear as a complete filesystem.
>
> We've thought about formatting the region as a ramdisk, but the block
> layer shields mmap() from the storage, thus requiring a data copy into the
> page-cache.  The great thing about ramfs/tmpfs is the data is used
> in-place; we'd lose that when going with a ramdisk or other real
> filesystem.

s390 uses a trick to achieve this (XIP).

Look at fs/ext2/xip.c.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
