From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:45132) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1X0AQA-0002KU-C6 for qemu-devel@nongnu.org; Thu, 26 Jun 2014 10:13:14 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1X0AQ3-0003Hu-SW for qemu-devel@nongnu.org; Thu, 26 Jun 2014 10:13:02 -0400
Received: from mail-oa0-f54.google.com ([209.85.219.54]:40418) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1X0AQ3-0003HZ-JN for qemu-devel@nongnu.org; Thu, 26 Jun 2014 10:12:55 -0400
Received: by mail-oa0-f54.google.com with SMTP id eb12so3910048oac.41 for ; Thu, 26 Jun 2014 07:12:54 -0700 (PDT)
MIME-Version: 1.0
Sender: camm@ualberta.ca
In-Reply-To: <53A83722.60408@huawei.com>
References: <1403266532-13231-1-git-send-email-david.marchand@6wind.com> <1403266532-13231-2-git-send-email-david.marchand@6wind.com> <53A83722.60408@huawei.com>
Date: Thu, 26 Jun 2014 08:12:53 -0600
Message-ID:
From: Cam Macdonell
Content-Type: multipart/alternative; boundary=e89a8fb1f39ef0afba04fcbdc90c
Subject: Re: [Qemu-devel] [PATCH 1/2] docs: update ivshmem device spec
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
To: Claudio Fontana
Cc: Paolo Bonzini , Jani Kokkonen , David Marchand , KVM General , "qemu-devel@nongnu.org Developers"

--e89a8fb1f39ef0afba04fcbdc90c
Content-Type: text/plain; charset=UTF-8

Hi,

Thank you for everyone's interest and work on this. Sorry I haven't
been...better. I will offer my knowledge where it helps. And the server is
GPL in case that was seen as an issue.

On Mon, Jun 23, 2014 at 8:18 AM, Claudio Fontana wrote:

> Hi,
>
> we were reading through this quickly today, and these are some of the
> questions that we think can came up when reading this. Answers to some of
> these questions we think we have figured out, but I think it's important
> to put this information into the documentation.
>
> I will quote the file in its entirety, and insert some questions inline.
>
> > Device Specification for Inter-VM shared memory device
> > ------------------------------------------------------
> >
> > The Inter-VM shared memory device is designed to share a region of
> > memory to userspace in multiple virtual guests.
>
> What does "to userspace" mean in this context? The userspace of the host,
> or the userspace in the guest?

The memory is intended to be shared between userspaces in the guests.
However, since the memory is a POSIX shm region, it is visible on the host
too.

> What about "The Inter-VM shared memory device is designed to share a
> memory region (created on the host via the POSIX shared memory API) between
> multiple QEMU processes running different guests. In order for all guests
> to be able to pick up the shared memory area, it is modeled by QEMU as a
> PCI device exposing said memory to the guest as a PCI BAR."
>
> Whether in those guests the memory region is used in kernel space or
> userspace, or there is even any meaning for those terms is guest-dependent
> I would think (I think of an OSv here, where the application and kernel
> execute at the same privilege level and in the same address space).

I'm not exactly clear what you're asking here. The region is visible to
both the guest kernel and userspace (once mounted).

> > The memory region does not belong to any
> > guest, but is a POSIX memory object on the host.
>
> Ok that's clear.
> One thing I would ask is, but I don't know if it makes sense to mention
> here, is who creates this memory object on the host?
> I understand in some cases it's the contributed server (what you provide in
> contrib/), in some cases it's the "user" of this device who has to write
> some server code for that, but is it true that also the qemu process itself
> can create this memory object on its own, without any external process
> needed? Is this the use case for host<->guest only?
>

(Answering based on my original server code) When using the server, the
server creates it. Without the server, each qemu process will check if it
exists and if it does, it will use it. If it does not exist, the qemu
process will create it.

> > Optionally, the device may
> > support sending interrupts to other guests sharing the same memory
> > region.
>
> This opens a lot of questions here which are partly answered later (If I
> understand correctly, not only interrupts are involved, but a complete
> communication protocol involving registers in BAR0), but what about staying
> a bit general here, like
> "Optionally, the device may also provide a communication mechanism between
> guests sharing the same memory region. More details about that in the
> section 'OPTIONAL ivshmem guest to guest communication protocol'.
>
> Thinking out loud, I wonder if this communication mechanism should be part
> of this device in QEMU, or it should be provided at another layer..
>
> >
> >
> > The Inter-VM PCI device
> > -----------------------
> >
> > *BARs*
> >
> > The device supports three BARs. BAR0 is a 1 Kbyte MMIO region to support
> > registers. BAR1 is used for MSI-X when it is enabled in the device.
> > BAR2 is used to map the shared memory object from the host. The size of
> > BAR2 is specified when the guest is started and must be a power of 2 in
> > size.
>
> Are BAR0 and BAR1 optional? That's what I would think by reading the
> whole, but I'm still not sure.
> Am I forced to map BAR0 and BAR1 anyway? I don't think so, but..

They do not need to be mapped if you don't want to use them.

> If so, can we separate the explanation into the base shared memory
> feature, and a separate section which explains the OPTIONAL communication
> mechanism, and the OPTIONAL MSI-X BAR?
>
> For example, say that I am a potential ivshmem user (which I am), and I am
> interested in the shared memory but I want to use my own communication
> mechanism and protocol between guests, can we make it so that I don't have
> to wonder whether some of the info I read applies or not?
> The solution to that I think is to put all the OPTIONAL parts into
> separate sections.
>
> > *Registers*
>
> Ok, so this should I think go into one such OPTIONAL sections.
>
> > The device currently supports 4 registers of 32-bits each. Registers
> > are used for synchronization between guests sharing the same memory
> > object when interrupts are supported (this requires using the shared
> > memory server).
>
> So use of BAR0 goes together with interrupts, and goes together with the
> shared memory server (is it the one contributed in contrib/?)
>
> > The server assigns each VM an ID number and sends this ID number to the
> > QEMU process when the guest starts.
> >
> > enum ivshmem_registers {
> >     IntrMask = 0,
> >     IntrStatus = 4,
> >     IVPosition = 8,
> >     Doorbell = 12
> > };
> >
> > The first two registers are the interrupt mask and status registers.
> > Mask and status are only used with pin-based interrupts. They are unused
> > with MSI interrupts.
> >
> > Status Register: The status register is set to 1 when an interrupt
> > occurs.
> >
> > Mask Register: The mask register is bitwise ANDed with the interrupt
> > status and the result will raise an interrupt if it is non-zero.
> > However, since 1 is the only value the status will be set to, it is only
> > the first bit of the mask that has any effect. Therefore interrupts can
> > be masked by setting the first bit to 0 and unmasked by setting the first
> > bit to 1.
> >
> > IVPosition Register: The IVPosition register is read-only and reports the
> > guest's ID number. The guest IDs are non-negative integers.
> > When using the server, since the server is a separate process, the VM ID
> > will only be set when the device is ready (shared memory is received from
> > the server and accessible via the device). If the device is not ready,
> > the IVPosition will return -1. Applications should ensure that they have
> > a valid VM ID before accessing the shared memory.
>
> So the guest ID number is 32bits, but actually the doorbell is 16-bit, can
> we be more explicit about this? So does it follow that the maximum number
> of guests is 65536?

Yes, for each server and its corresponding memory region.

> > Doorbell Register: To interrupt another guest, a guest must write to the
> > Doorbell register. The doorbell register is 32-bits, logically divided
> > into two 16-bit fields. The high 16-bits are the guest ID to interrupt
> > and the low 16-bits are the interrupt vector to trigger. The semantics
> > of the value written to the doorbell depends on whether the device is
> > using MSI or a regular pin-based interrupt. In short, MSI uses vectors
> > while regular interrupts set the status register.
> >
> > Regular Interrupts
> >
> > If regular interrupts are used (due to either a guest not supporting MSI
> > or the user specifying not to use them on startup) then the value written
> > to the lower 16-bits of the Doorbell register results is arbitrary and
> > will trigger an interrupt in the destination guest.
> >
> > Message Signalled Interrupts
> >
> > A ivshmem device may support multiple MSI vectors. If so, the lower
> > 16-bits written to the Doorbell register must be between 0 and the
> > maximum number of vectors the guest supports. The lower 16 bits written
> > to the doorbell is the MSI vector that will be raised in the destination
> > guest. The number of MSI vectors is configurable but it is set when the
> > VM is started.
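
To make the Doorbell layout quoted above concrete, here is a minimal sketch
in C. It only assumes what the spec text says: the high 16 bits carry the
destination guest ID, the low 16 bits carry the vector, and BAR0 can be
treated as an array of 32-bit ints once it is mapped. The helper names
ivshmem_doorbell_value and ivshmem_ring are hypothetical and are not taken
from QEMU or from any existing guest driver.

```c
#include <stdint.h>

/* Register offsets in BAR0, as listed in the spec quoted above. */
enum ivshmem_registers {
    IntrMask = 0,
    IntrStatus = 4,
    IVPosition = 8,
    Doorbell = 12
};

/*
 * Hypothetical helper: pack the destination guest ID (high 16 bits)
 * and the interrupt vector (low 16 bits) into one Doorbell value.
 */
static inline uint32_t ivshmem_doorbell_value(uint16_t peer_id,
                                              uint16_t vector)
{
    return ((uint32_t)peer_id << 16) | (uint32_t)vector;
}

/*
 * Hypothetical helper: ring a peer by storing the packed value in the
 * Doorbell register. bar0 is assumed to point at an already mapped
 * BAR0, viewed as an array of 32-bit registers.
 */
static inline void ivshmem_ring(volatile uint32_t *bar0,
                                uint16_t peer_id, uint16_t vector)
{
    bar0[Doorbell / sizeof(uint32_t)] =
        ivshmem_doorbell_value(peer_id, vector);
}
```

For example, ivshmem_ring(bar0, 2, 1) stores 0x00020001, which targets guest
2 with vector 1. With pin-based interrupts the low 16 bits can be anything,
since only the destination ID matters in that mode.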
> >
> > The important thing to remember with MSI is that it is only a signal, no
> > status is set (since MSI interrupts are not shared). All information
> > other than the interrupt itself should be communicated via the shared
> > memory region. Devices supporting multiple MSI vectors can use different
> > vectors to indicate different events have occurred. The semantics of
> > interrupt vectors are left to the user's discretion.
>
> Maybe an example of a full exchange would be useful to explain the use of
> these registers, making the protocol used for communication clear; or does
> this only provide mechanisms that can be used by someone else to implement
> a protocol?
>
> > IVSHMEM host services
> > ---------------------
> >
> > This part is optional (see *Usage in the Guest* section below)
>
> Ok this section is optional, but its role is not that clear to me.
>
> So are there exactly 3 ways this can be used:
>
> 1) shared memory only, PCI BAR2
> 2) full device including registers in BAR0 but no MSI
> 3) full device including registers in BAR0 and MSI support in BAR1
> ?
>
> > To handle notifications between users of a ivshmem device, a ivshmem
> > server has been added. This server is responsible for creating the
> > shared memory and creating a set of eventfds for each users of the
> > shared memory.
>
> Ok this is the first time eventfds are mentioned, after we spoke about
> interrupts in the other section before..

The interrupts are transported between QEMU processes using eventfds. The
interrupts are delivered into the guest using regular interrupts or MSI-X.
The interrupts can be delivered to user-level using eventfds in UIO.

> > It behaves as a
> > proxy between the different ivshmem clients (QEMU): giving the shared
> > memory fd to each client,
>
> telling each client which /dev/name to shm_open?

No, it passes a file descriptor to the region using SCM_RIGHTS.
When using the server, the qemu clients do not know the name of the shm
region.

> > allocating eventfds to new clients and broadcasting to all
> > clients when a client disappears.
>
> What about VM Ids, are they also decided and shared by the server?

Yes, the server hands out increasing VM Ids.

> > Apart from the current ivshmem implementation in QEMU, a ivshmem client
> > can be written for debug, for development purposes, or to implement
> > notifications between host and guests.
> >
> >
> > Usage in the Guest
> > ------------------
> >
> > The guest should map BAR0 to access the registers (an array of 32-bit
> > ints allows simple writing) and map BAR2 to access the shared memory
> > region itself.
>
> Ok, but can I avoid mapping BAR0 if I don't use the registers?

Yes

> > The size of the shared memory region is specified when the guest (or
> > shared memory server) is started. A guest may map the whole shared
> > memory region or only part of it.
>
> So what does it mean here, I can choose to start the optional server
> contributed in contrib/ with a shared memory size parameter determining
> the size of the actual shared memory region, and then the guest has the
> option to map only part of that?

You do not need to map the whole region.

> Or can also the guest (or better, the QEMU process running the guest)
> create the shared memory region by itself?
> Which parameters control these behaviours?

When giving a shared memory region name "foo"

    -device ivshmem,shm=foo,size=2048,use64=1

1) if the 'foo' memory object doesn't exist, the qemu process will create it
2) if 'foo' already exists it will use it
3) if the object exists but does not match the size specified, ivshmem will
   exit.

> Btw I would expect there to be a separate section with all the QEMU
> command line configuration parameters and their effect on behavior of this
> device.
> Also for the contributed code in contrib/, especially for the server, we
> need documentation about the command line parameters, env variables,
> whatever can be configured and which effect they have on this.
>
> > ivshmem provides an optional notification mechanism through eventfds
> > handled by QEMU that will trigger interrupts in guests. This mechanism
> > is enabled when using a ivshmem-server which must be started prior to
> > VMs and which serves as a proxy for exchanging eventfds.
>
> Here also, a simple description of such a sequence of exchanges would be
> welcome, I would not mind some ASCII art as well.
>
> > It is your choice how to use the ivshmem device.
>
> Good :)
>
> > - the simple way, you don't need anything else than what is already in
> >   QEMU.
>
> If the server becomes part of the QEMU package, then this sentence is a
> bit unclear right? This was probably written at the time the server was not
> contributed to QEMU, right?
>
> >   You can map the shared memory in guest, then use it in userland as you
> >   see fit
>
> In userland.. ? Can I create the shared memory by just running a qemu
> process with some parameters? Does this mean I now share memory between
> guest and host? If I run multiple guest providing the same device name, can
> I make them use the same shared memory without the need of any server?

Yes, the server is only necessary for the interrupt behaviour.

> >   (memnic for example works this way http://dpdk.org/browse/memnic),
>
> I'll check that out..
>
> > - the more advanced way, basically, if you want an event mechanism
> >   between the VMs using your ivshmem device. In this case, then you will
> >   most likely want to write a kernel driver that will handle interrupts.
>
> Ok.
>
> Let me ask you this, what about virtio?
> Can I take this shared memory implementation, and then run virtio on top
> of that, which already has primitives for communication?
>
> I understand this would restricts me to 1 vs 1 communication, while with
> the optional server in contrib/ I would have any to any communication
> available.
>
> But what about the 1 to 1 guest-to-guest communication, is in this case in
> theory possible to put virtio on top of ivshmem and use that to make the
> two guests communicate?
>
> This is just a list of questions that we came up with, but anybody please
> weigh in with your additional questions, comments, feedback. Especially I
> would like to know if the idea to have a virtio guest to guest
> communication is possible and realistic, maybe with minimal extension of
> virtio, or if I am being insane.

There was originally a virtio-based version of ivshmem. You can find the
discussion around that from sometime in 2009. I think you could use virtio
over ivshmem but the 1-to-1 case is quite limiting. Virtio is well
optimized for what it does and so it was decided to keep the two separate.

HTH,
Cam

> Thank you,
>
> Claudio

--e89a8fb1f39ef0afba04fcbdc90c
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

--e89a8fb1f39ef0afba04fcbdc90c--