* [Qemu-devel] TCP based PCIE request forwarding
@ 2012-11-16 8:39 lementec fabien
2012-11-16 12:10 ` Stefan Hajnoczi
2012-11-20 16:41 ` Jason Baron
0 siblings, 2 replies; 13+ messages in thread
From: lementec fabien @ 2012-11-16 8:39 UTC (permalink / raw)
To: qemu-devel
Hi,
I am a software engineer working in an electronics group. Using QEMU
to emulate devices allows me to start writing and testing Linux software
before the device is actually available. In the group, we mostly work
with Xilinx FPGAs, communicating with the host via PCIE. The
devices are implemented in VHDL.
I wanted to be able to reuse our VHDL designs in QEMU. To this end,
I implemented a QEMU TCP based PCIE request forwarder, so that I can
emulate our device in a standard process, and use the GHDL GCC frontend
plus some glue.
The fact that it is TCP based allows me to run the device on another
machine, which is a requirement.
The whole thing is available here:
https://github.com/texane/vpcie
The request forwarder is available here:
https://github.com/texane/vpcie/blob/master/qemu/pciefw.c
It requires a patch to QEMU, available here:
https://github.com/texane/vpcie/blob/master/qemu/qemu_58617a795c8067b2f9800cffce60f38707d3aa31.diff
Since I am the only one using it and I wanted a working version soon,
I use a naive method to plug into QEMU, which can block the VM. Plus,
I did not take care of some PCIE related details. But it works well
enough.
Do you think the approach of forwarding PCIE requests over TCP could
be integrated into QEMU? If so, what kind of modifications should
be made to this patch?
Best regards,
Fabien Le Mentec.
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-16 8:39 [Qemu-devel] TCP based PCIE request forwarding lementec fabien
@ 2012-11-16 12:10 ` Stefan Hajnoczi
2012-11-16 13:05 ` lementec fabien
2012-11-20 16:41 ` Jason Baron
1 sibling, 1 reply; 13+ messages in thread
From: Stefan Hajnoczi @ 2012-11-16 12:10 UTC (permalink / raw)
To: lementec fabien; +Cc: Cam Macdonell, Nick Gasson, qemu-devel, fred.konrad
On Fri, Nov 16, 2012 at 9:39 AM, lementec fabien
<fabien.lementec@gmail.com> wrote:
> I am a software engineer who works in an electronic group. Using QEMU
> to emulate devices allows me to start writing and testing LINUX software
> before the device is actually available. In the group, we are mostly
> working with XILINX FPGAs, communicating with the host via PCIE. The
> devices are implemented in VHDL.
>
> I wanted to be able to reuse our VHDL designs in QEMU. To this end,
> I implemented a QEMU TCP based PCIE request forwarder, so that I can
> emulate our device in a standard process, and use the GHDL GCC frontend
> plus some glue.
>
> The fact that it is TCP based allows me to run the device on another
> machine, which is a requirement.
>
> The whole thing is available here:
> https://github.com/texane/vpcie
>
> The request forwarder is available here:
> https://github.com/texane/vpcie/blob/master/qemu/pciefw.c
>
> It requires a patch to QEMU, available here:
> https://github.com/texane/vpcie/blob/master/qemu/qemu_58617a795c8067b2f9800cffce60f38707d3aa31.diff
>
> Since I am the only one using it and I wanted a working version soon,
> I use a naive method to plug into QEMU, which can block the VM. Plus,
> I did not take care of some PCIE related details. But it works well
> enough.
>
> Do you think the approach of forwarding PCIE requests over TCP could
> be integrated to QEMU? If positive, what kind of modifications should
> be done to this patch?
Thanks for sharing your code. There is definitely interest in
integrating hardware simulation with QEMU in the wider community.
There is a little bit of overlap with hw/ivshmem.c but I don't think
ivshmem is as flexible for modelling arbitrary PCIe adapters.
I guess the reason you didn't try linking the GHDL object files
against QEMU is that you wanted full control over the process (e.g. so
you don't need to worry about QEMU's event loop)?
Stefan
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-16 12:10 ` Stefan Hajnoczi
@ 2012-11-16 13:05 ` lementec fabien
2012-11-19 8:55 ` Stefan Hajnoczi
0 siblings, 1 reply; 13+ messages in thread
From: lementec fabien @ 2012-11-16 13:05 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Cam Macdonell, Nick Gasson, qemu-devel, fred.konrad
Hi,
Thanks for your reply.
Actually, I wanted to be independent of the QEMU event loop. Plus,
some proprietary simulation environments provide a closed, socket-based
interface to 'stimulate' the emulated device, at the PCIE level
for instance. These environments are sometimes installed on clusters
not running QEMU. The socket-based approach fits quite well.
Not knowing about QEMU internals, I spent some hours trying to find
out the best way to plug into QEMU, and did not find ivshmem appropriate.
Honestly, I wanted to have a working solution asap, and it did not take
long before I opted for the socket-based approach. Now that it is working,
I can take time to reconsider things according to others' needs, and ideally
an integration into QEMU.
Fabien.
2012/11/16 Stefan Hajnoczi <stefanha@gmail.com>:
> On Fri, Nov 16, 2012 at 9:39 AM, lementec fabien
> <fabien.lementec@gmail.com> wrote:
>> I am a software engineer who works in an electronic group. Using QEMU
>> to emulate devices allows me to start writing and testing LINUX software
>> before the device is actually available. In the group, we are mostly
>> working with XILINX FPGAs, communicating with the host via PCIE. The
>> devices are implemented in VHDL.
>>
>> I wanted to be able to reuse our VHDL designs in QEMU. To this end,
>> I implemented a QEMU TCP based PCIE request forwarder, so that I can
>> emulate our device in a standard process, and use the GHDL GCC frontend
>> plus some glue.
>>
>> The fact that it is TCP based allows me to run the device on another
>> machine, which is a requirement.
>>
>> The whole thing is available here:
>> https://github.com/texane/vpcie
>>
>> The request forwarder is available here:
>> https://github.com/texane/vpcie/blob/master/qemu/pciefw.c
>>
>> It requires a patch to QEMU, available here:
>> https://github.com/texane/vpcie/blob/master/qemu/qemu_58617a795c8067b2f9800cffce60f38707d3aa31.diff
>>
>> Since I am the only one using it and I wanted a working version soon,
>> I use a naive method to plug into QEMU, which can block the VM. Plus,
>> I did not take care of some PCIE related details. But it works well
>> enough.
>>
>> Do you think the approach of forwarding PCIE requests over TCP could
>> be integrated to QEMU? If positive, what kind of modifications should
>> be done to this patch?
>
> Thanks for sharing your code. There is definitely interest in
> integrating hardware simulation with QEMU in the wider community.
>
> There is a little bit of overlap with hw/ivshmem.c but I don't think
> ivshmem is as flexible for modelling arbitrary PCIe adapters.
>
> I guess the reason you didn't try linking the GHDL object files
> against QEMU is that you wanted full control over the process (e.g. so
> you don't need to worry about QEMU's event loop)?
>
> Stefan
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-16 13:05 ` lementec fabien
@ 2012-11-19 8:55 ` Stefan Hajnoczi
2012-11-19 16:00 ` lementec fabien
2012-11-21 14:27 ` lementec fabien
0 siblings, 2 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2012-11-19 8:55 UTC (permalink / raw)
To: lementec fabien; +Cc: Cam Macdonell, Nick Gasson, qemu-devel, fred.konrad
On Fri, Nov 16, 2012 at 02:05:29PM +0100, lementec fabien wrote:
> Actually, I wanted to be independant of the QEMU event loop. Plus,
> some proprietary simulation environment provides a closed socket
> based interface to 'stimulate' the emulated device, at the PCIE level
> for instance. These environments are sometimes installed on cluster
> not running QEMU. The socket based approach fits quite well.
>
> Not knowing about QEMU internals, I spent some hours trying to find
> out the best way to plug into QEMU, and did not find ivhsmem appropriate.
> Honestly, I wanted to have a working solution asap, and it did not take
> long before I opted for the socket based approach. Now that it is working,
> I can take time to reconsider stuffs according to others need, and ideally
> an integration to QEMU.
I suggest writing up a spec for the socket protocol. It can be put in
docs/specs/ (like the ivshmem spec).
This is both a good way to increase discussion and important for others
who may wish to make use of this feature.
Stefan
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-19 8:55 ` Stefan Hajnoczi
@ 2012-11-19 16:00 ` lementec fabien
2012-11-21 14:27 ` lementec fabien
1 sibling, 0 replies; 13+ messages in thread
From: lementec fabien @ 2012-11-19 16:00 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Cam Macdonell, Nick Gasson, qemu-devel, fred.konrad
Hi,
Thanks, it is actually a good idea to start with. I will write a spec
based on an improved version of what I have already implemented.
I think I will have some time this week; I will keep you updated soon.
Best regards,
Fabien.
2012/11/19 Stefan Hajnoczi <stefanha@gmail.com>:
> On Fri, Nov 16, 2012 at 02:05:29PM +0100, lementec fabien wrote:
>> Actually, I wanted to be independant of the QEMU event loop. Plus,
>> some proprietary simulation environment provides a closed socket
>> based interface to 'stimulate' the emulated device, at the PCIE level
>> for instance. These environments are sometimes installed on cluster
>> not running QEMU. The socket based approach fits quite well.
>>
>> Not knowing about QEMU internals, I spent some hours trying to find
>> out the best way to plug into QEMU, and did not find ivhsmem appropriate.
>> Honestly, I wanted to have a working solution asap, and it did not take
>> long before I opted for the socket based approach. Now that it is working,
>> I can take time to reconsider stuffs according to others need, and ideally
>> an integration to QEMU.
>
> I suggest writing up a spec for the socket protocol. It can be put in
> docs/specs/ (like the ivshmem spec).
>
> This is both a good way to increase discussion and important for others
> who may wish to make use of this feature.
>
> Stefan
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-16 8:39 [Qemu-devel] TCP based PCIE request forwarding lementec fabien
2012-11-16 12:10 ` Stefan Hajnoczi
@ 2012-11-20 16:41 ` Jason Baron
2012-11-21 13:13 ` lementec fabien
1 sibling, 1 reply; 13+ messages in thread
From: Jason Baron @ 2012-11-20 16:41 UTC (permalink / raw)
To: lementec fabien; +Cc: qemu-devel
On Fri, Nov 16, 2012 at 09:39:07AM +0100, lementec fabien wrote:
> Hi,
>
> I am a software engineer who works in an electronic group. Using QEMU
> to emulate devices allows me to start writing and testing LINUX software
> before the device is actually available. In the group, we are mostly
> working with XILINX FPGAs, communicating with the host via PCIE. The
> devices are implemented in VHDL.
As you know, the current PCI config space is limited to 256 bytes on x86. I was
wondering, then, whether you needed to work around this limitation in any way,
since you've mentioned you're using PCIE (which has a 4k config space)?
Thanks,
-Jason
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-20 16:41 ` Jason Baron
@ 2012-11-21 13:13 ` lementec fabien
0 siblings, 0 replies; 13+ messages in thread
From: lementec fabien @ 2012-11-21 13:13 UTC (permalink / raw)
To: Jason Baron; +Cc: qemu-devel
Hi,
As far as I know, all the PCIE devices implemented here
work with a 256-byte config header.
Cheers,
Fabien.
2012/11/20 Jason Baron <jbaron@redhat.com>:
> On Fri, Nov 16, 2012 at 09:39:07AM +0100, lementec fabien wrote:
>> Hi,
>>
>> I am a software engineer who works in an electronic group. Using QEMU
>> to emulate devices allows me to start writing and testing LINUX software
>> before the device is actually available. In the group, we are mostly
>> working with XILINX FPGAs, communicating with the host via PCIE. The
>> devices are implemented in VHDL.
>
> As you know the current PCI config space is limited to 256 bytes on x86. I was
> wondering then, if you needed to work around this limitation in any way
> since you've mentioned you're using PCIE (which has a 4k config space)?
>
> Thanks,
>
> -Jason
>
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-19 8:55 ` Stefan Hajnoczi
2012-11-19 16:00 ` lementec fabien
@ 2012-11-21 14:27 ` lementec fabien
2012-11-22 8:19 ` Stefan Hajnoczi
1 sibling, 1 reply; 13+ messages in thread
From: lementec fabien @ 2012-11-21 14:27 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Cam Macdonell, Nick Gasson, qemu-devel, fred.konrad
[-- Attachment #1: Type: text/plain, Size: 1188 bytes --]
I have attached a doc describing the current small protocol implementation.
2012/11/19 Stefan Hajnoczi <stefanha@gmail.com>:
> On Fri, Nov 16, 2012 at 02:05:29PM +0100, lementec fabien wrote:
>> Actually, I wanted to be independant of the QEMU event loop. Plus,
>> some proprietary simulation environment provides a closed socket
>> based interface to 'stimulate' the emulated device, at the PCIE level
>> for instance. These environments are sometimes installed on cluster
>> not running QEMU. The socket based approach fits quite well.
>>
>> Not knowing about QEMU internals, I spent some hours trying to find
>> out the best way to plug into QEMU, and did not find ivhsmem appropriate.
>> Honestly, I wanted to have a working solution asap, and it did not take
>> long before I opted for the socket based approach. Now that it is working,
>> I can take time to reconsider stuffs according to others need, and ideally
>> an integration to QEMU.
>
> I suggest writing up a spec for the socket protocol. It can be put in
> docs/specs/ (like the ivshmem spec).
>
> This is both a good way to increase discussion and important for others
> who may wish to make use of this feature.
>
> Stefan
[-- Attachment #2: pciefw.protocol --]
[-- Type: application/octet-stream, Size: 2837 bytes --]
rationale
---------
PCIE access forwarding was made to implement a PCIE endpoint in a process
external to QEMU, possibly on a remote host. The main reason is to allow
interfacing QEMU with PCIE devices simulated in a third party environment.
Being an external process also has several benefits: independence from the
QEMU event loop, the ability to use compilation tools not supported by the
QEMU build system ...
usage
-----
PCIEFW devices are instantiated using the following QEMU options:
-device \
pciefw,\
laddr=<local_addr>,\
lport=<local_port>,\
raddr=<remote_addr>,\
rport=<remote_port>
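For example (addresses and ports below are purely illustrative, and it is
assumed that raddr/rport designate the endpoint process, which acts as the
server, while laddr/lport give the local side used by QEMU), the option
could read:

-device pciefw,laddr=192.168.0.1,lport=40001,raddr=192.168.0.2,rport=40000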
implementation
--------------
PCIEFW is a PCIE access forwarding device added to the QEMU source tree. At
initialization, this device opens a bidirectional point-to-point communication
channel with an external process. This process actually implements the PCIE
endpoint. That is, a PCIE access made by QEMU is forwarded to the process.
Reciprocally, replies and interrupt messages from the process are forwarded
to QEMU.
The commnication currently relies on a bidirectionnal point to point TCP
socket based channel. Byte ordering is little endian.
PCIEFW initiates a request upon access from QEMU. It sends a message whose
format is described by the pciefw_msg_t type:
typedef struct pciefw_msg
{
#define PCIEFW_MSG_MAX_SIZE (offsetof(pciefw_msg_t, data) + 0x1000)

  pciefw_header_t header;

#define PCIEFW_OP_READ_CONFIG  0
#define PCIEFW_OP_WRITE_CONFIG 1
#define PCIEFW_OP_READ_MEM     2
#define PCIEFW_OP_WRITE_MEM    3
#define PCIEFW_OP_READ_IO      4
#define PCIEFW_OP_WRITE_IO     5
#define PCIEFW_OP_INT          6
#define PCIEFW_OP_MSI          7
#define PCIEFW_OP_MSIX         8

  uint8_t  op;    /* in PCIEFW_OP_XXX */
  uint8_t  bar;   /* in [0:5] */
  uint8_t  width; /* access in 1, 2, 4, 8 */
  uint64_t addr;
  uint16_t size;  /* data size, in bytes */
  uint8_t  data[1];

} __attribute__((packed)) pciefw_msg_t;
Note that data is a variable length field.
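As an illustration only (this is not an excerpt from pciefw.c), a 4-byte
memory write to a BAR 0 register could be encoded as follows. write_all()
is an assumed helper that loops until all bytes have been sent, and fields
are filled in host order; a big-endian host would have to convert them to
little endian first.

  #include <stddef.h>   /* offsetof */
  #include <stdint.h>
  #include <stdlib.h>
  #include <string.h>

  /* sketch: encode and send a 4-byte memory write to BAR 0 */
  static void send_bar0_write32(int fd, uint64_t addr, uint32_t value)
  {
    const size_t len = offsetof(pciefw_msg_t, data) + sizeof(value);
    pciefw_msg_t* const msg = calloc(1, len);

    msg->header.size = (uint16_t)len;   /* total message size */
    msg->op = PCIEFW_OP_WRITE_MEM;
    msg->bar = 0;
    msg->width = 4;
    msg->addr = addr;
    msg->size = sizeof(value);          /* payload size, in bytes */
    memcpy(msg->data, &value, sizeof(value));

    write_all(fd, msg, len);            /* assumed helper */
    free(msg);
  }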
The PCIE endpoint process replies with a pciefw_reply_t formatted message:
typedef struct pciefw_reply
{
  pciefw_header_t header;
  uint8_t status;
  uint8_t data[8];
} __attribute__((packed)) pciefw_reply_t;
The PCIE endpoint process can initiate pciefw_msg_t messages to perform write
operations of its own. This is used to perform data transfers (DMA engines ...)
and send interrupts.
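For instance (a sketch only; the encoding of the interrupt number in addr is
an assumption, it is not specified above), the endpoint could signal MSI
vector 0 with a message carrying no payload:

  /* sketch: endpoint-initiated MSI */
  pciefw_msg_t msg;

  memset(&msg, 0, sizeof(msg));
  msg.header.size = offsetof(pciefw_msg_t, data);  /* no data bytes */
  msg.op = PCIEFW_OP_MSI;
  msg.addr = 0;                              /* assumed: vector number */
  msg.size = 0;
  write_all(fd, &msg, msg.header.size);      /* assumed helper */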
Both types start with a pciefw_header containing the total size:
typedef struct pciefw_header
{
  uint16_t size;
} __attribute__((packed)) pciefw_header_t;
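Because every message is prefixed by its total size, a receiver can frame
messages on the TCP stream with two reads, along these lines (a sketch;
read_all() is an assumed helper that loops until the requested number of
bytes has been received, and hdr.size would need byte swapping on a
big-endian host):

  /* sketch: receive one framed message */
  uint8_t buf[PCIEFW_MSG_MAX_SIZE];
  pciefw_header_t hdr;

  read_all(fd, &hdr, sizeof(hdr));
  /* a real implementation would validate hdr.size here */
  memcpy(buf, &hdr, sizeof(hdr));
  read_all(fd, buf + sizeof(hdr), hdr.size - sizeof(hdr));
  /* buf now holds a complete pciefw_msg_t or pciefw_reply_t */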
limitations
-----------
The process acts as the server, and must be started before QEMU starts.
The QEMU event loop is blocked while awaiting a device reply.
Protocol messages are not tagged; in-order delivery is assumed.
Read operations from the process are not supported.
The transport layer should be abstracted, allowing transports other than TCP to be used.
MSIX is not supported.
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-21 14:27 ` lementec fabien
@ 2012-11-22 8:19 ` Stefan Hajnoczi
2012-11-22 10:08 ` lementec fabien
2012-11-22 10:21 ` Paolo Bonzini
0 siblings, 2 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2012-11-22 8:19 UTC (permalink / raw)
To: lementec fabien; +Cc: Cam Macdonell, Nick Gasson, qemu-devel, fred.konrad
On Wed, Nov 21, 2012 at 03:27:48PM +0100, lementec fabien wrote:
> usage
> -----
> PCIEFW devices are instanciated using the following QEMU options:
> -device \
> pciefw,\
> laddr=<local_addr>,\
> lport=<local_port>,\
> raddr=<remote_addr>,\
> rport=<remote_port>
Take a look at qemu_socket.h:socket_parse(). It should allow you to
support TCP, UNIX domain sockets, and arbitrary file descriptors.
> implementation
> --------------
> PCIEFW is a PCIE accesses forwarding device added to the QEMU source tree. At
> initialization, this device opens a bidirectionnal point to point communication
> channel with an external process. This process actually implements the PCIE
> endpoint. That is, a PCIE access made by QEMU is forwarded to the process.
> Reciprocally, replies and interrupts messages from the process are forwarded
> to QEMU.
>
> The commnication currently relies on a bidirectionnal point to point TCP
s/commnication/communication/
> socket based channel. Byte ordering is little endian.
>
> PCIEFW initiates a request upon access from QEMU. It sends a message whose
> format is described by the pciefw_msg_t type:
>
> typedef struct pciefw_msg
> {
> #define PCIEFW_MSG_MAX_SIZE (offsetof(pciefw_msg_t, data) + 0x1000)
The size field is uint16_t. Do you really want to limit to 4 KB of
data?
>
> pciefw_header_t header;
>
> #define PCIEFW_OP_READ_CONFIG 0
> #define PCIEFW_OP_WRITE_CONFIG 1
> #define PCIEFW_OP_READ_MEM 2
> #define PCIEFW_OP_WRITE_MEM 3
> #define PCIEFW_OP_READ_IO 4
> #define PCIEFW_OP_WRITE_IO 5
> #define PCIEFW_OP_INT 6
> #define PCIEFW_OP_MSI 7
> #define PCIEFW_OP_MSIX 8
>
> uint8_t op; /* in PCIEFW_OP_XXX */
> uint8_t bar; /* in [0:5] */
> uint8_t width; /* access in 1, 2, 4, 8 */
> uint64_t addr;
> uint16_t size; /* data size, in bytes */
Why are there both width and size fields? For read-type operations the
size field would indicate how many bytes to read. For write-type
operations the size field would indicate how many bytes are included in
data[].
> uint8_t data[1];
>
> } __attribute__((packed)) pciefw_msg_t;
>
> Note that data is a variable length field.
>
> The PCIE endpoint process replies with a pciefw_reply_t formatted message:
>
> typedef struct pciefw_reply
> {
> pciefw_header_t header;
> uint8_t status;
What values does this field take?
> uint8_t data[8];
> } __attribute__((packed)) pciefw_reply_t;
>
> The PCIE endpoint process can initiate pciefw_msg_t to perform write operations
> of its own. This is used to perform data transfer (DMA engines ...) and send
> interrupts.
Any flow control rules? For example, can the endpoint raise an
interrupt while processing a message (before it sends a reply)?
> Both types start with a pciefw_header containing the total size:
>
> typedef struct pciefw_header
> {
> uint16_t size;
> } __attribute__((packed)) pciefw_header_t;
A "hello" message type would be useful so that you can extend the
protocol in the future. The message would contain feature bits or a
version number.
Stefan
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-22 8:19 ` Stefan Hajnoczi
@ 2012-11-22 10:08 ` lementec fabien
2012-11-22 10:21 ` Paolo Bonzini
1 sibling, 0 replies; 13+ messages in thread
From: lementec fabien @ 2012-11-22 10:08 UTC (permalink / raw)
To: Stefan Hajnoczi; +Cc: Cam Macdonell, Nick Gasson, qemu-devel, fred.konrad
Hi,
Thanks for the feedback. I will modify the previous document
to include the changes you mentioned; I reply here too.
2012/11/22 Stefan Hajnoczi <stefanha@gmail.com>:
> On Wed, Nov 21, 2012 at 03:27:48PM +0100, lementec fabien wrote:
>> usage
>> -----
>> PCIEFW devices are instanciated using the following QEMU options:
>> -device \
>> pciefw,\
>> laddr=<local_addr>,\
>> lport=<local_port>,\
>> raddr=<remote_addr>,\
>> rport=<remote_port>
>
> Take a look at qemu_socket.h:socket_parse(). It should allow you to
> support TCP, UNIX domain sockets, and arbitrary file descriptors.
>
OK, I will have a look at what it implies to support arbitrary file
descriptors. For instance, my current implementation does not work with
UDP sockets: it assumes a reliable, ordered transport layer whose OS API
is not datagram oriented.
>> implementation
>> --------------
>> PCIEFW is a PCIE accesses forwarding device added to the QEMU source tree. At
>> initialization, this device opens a bidirectionnal point to point communication
>> channel with an external process. This process actually implements the PCIE
>> endpoint. That is, a PCIE access made by QEMU is forwarded to the process.
>> Reciprocally, replies and interrupts messages from the process are forwarded
>> to QEMU.
>>
>> The commnication currently relies on a bidirectionnal point to point TCP
>
> s/commnication/communication/
>
>> socket based channel. Byte ordering is little endian.
>>
>> PCIEFW initiates a request upon access from QEMU. It sends a message whose
>> format is described by the pciefw_msg_t type:
>>
>> typedef struct pciefw_msg
>> {
>> #define PCIEFW_MSG_MAX_SIZE (offsetof(pciefw_msg_t, data) + 0x1000)
>
> The size field is uint16_t. Do you really want to limit to 4 KB of
> data?
>
My first implementation required allocating a fixed-size buffer. That is
no longer the case (with non-datagram-oriented I/O operations) since I
included the header that contains the message size. Since the PCIE maximum
payload size is 0x1000 bytes, it was an obvious choice. Of course, it
remains an arbitrary choice.
>>
>> pciefw_header_t header;
>>
>> #define PCIEFW_OP_READ_CONFIG 0
>> #define PCIEFW_OP_WRITE_CONFIG 1
>> #define PCIEFW_OP_READ_MEM 2
>> #define PCIEFW_OP_WRITE_MEM 3
>> #define PCIEFW_OP_READ_IO 4
>> #define PCIEFW_OP_WRITE_IO 5
>> #define PCIEFW_OP_INT 6
>> #define PCIEFW_OP_MSI 7
>> #define PCIEFW_OP_MSIX 8
>>
>> uint8_t op; /* in PCIEFW_OP_XXX */
>> uint8_t bar; /* in [0:5] */
>> uint8_t width; /* access in 1, 2, 4, 8 */
>> uint64_t addr;
>> uint16_t size; /* data size, in bytes */
>
> Why is are both width and size fields? For read-type operations the
> size field would indicate how many bytes to read. For write-type
> operations the size field would indicate how many bytes are included in
> data[].
>
Actually, the width field is currently not required. I included it to
allow multiple contiguous accesses in one operation (where count =
size / width). The device would still need to know the width of individual
accesses in this case, but this is not used.
>> uint8_t data[1];
>>
>> } __attribute__((packed)) pciefw_msg_t;
>>
>> Note that data is a variable length field.
>>
>> The PCIE endpoint process replies with a pciefw_reply_t formatted message:
>>
>> typedef struct pciefw_reply
>> {
>> pciefw_header_t header;
>> uint8_t status;
>
> What values does this field take?
>
I will define PCIEFW_STATUS_XXX values.
>> uint8_t data[8];
>> } __attribute__((packed)) pciefw_reply_t;
>>
>> The PCIE endpoint process can initiate pciefw_msg_t to perform write operations
>> of its own. This is used to perform data transfer (DMA engines ...) and send
>> interrupts.
>
> Any flow control rules? For example, can the endpoint raise an
> interrupt while processing a message (before it sends a reply)?
>
Currently, messages are not identified, so delivery is assumed to be
in order. In practice it works because the Linux application I use
does not start two DMA transfers in parallel, but a protocol cannot
rely on such assumptions. Plus, I assume QEMU can eventually make two
concurrent PCIE accesses to the same device, which would lead to two
outstanding replies, so I will add an identifier field.
>> Both types start with a pciefw_header containing the total size:
>>
>> typedef struct pciefw_header
>> {
>> uint16_t size;
>> } __attribute__((packed)) pciefw_header_t;
>
> A "hello" message type would be useful so that you can extend the
> protocol in the future. The message would contain feature bits or a
> version number.
>
I did think about it. More generally, it would be useful to have a control
message to allow an endpoint to be disconnected and then reconnected
without having to reboot QEMU. That is very useful when developing a
new device.
> Stefan
I will send you the modified document,
Thanks,
Fabien.
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-22 8:19 ` Stefan Hajnoczi
2012-11-22 10:08 ` lementec fabien
@ 2012-11-22 10:21 ` Paolo Bonzini
2012-11-22 11:26 ` lementec fabien
2012-11-22 12:38 ` Stefan Hajnoczi
1 sibling, 2 replies; 13+ messages in thread
From: Paolo Bonzini @ 2012-11-22 10:21 UTC (permalink / raw)
To: Stefan Hajnoczi
Cc: lementec fabien, Cam Macdonell, Nick Gasson, qemu-devel,
fred.konrad
On 22/11/2012 09:19, Stefan Hajnoczi wrote:
>> > usage
>> > -----
>> > PCIEFW devices are instanciated using the following QEMU options:
>> > -device \
>> > pciefw,\
>> > laddr=<local_addr>,\
>> > lport=<local_port>,\
>> > raddr=<remote_addr>,\
>> > rport=<remote_port>
> Take a look at qemu_socket.h:socket_parse(). It should allow you to
> support TCP, UNIX domain sockets, and arbitrary file descriptors.
>
Even better it could just be a chardev. socket_parse() is only used by
the (human) monitor interface.
Paolo
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-22 10:21 ` Paolo Bonzini
@ 2012-11-22 11:26 ` lementec fabien
2012-11-22 12:38 ` Stefan Hajnoczi
1 sibling, 0 replies; 13+ messages in thread
From: lementec fabien @ 2012-11-22 11:26 UTC (permalink / raw)
To: Paolo Bonzini
Cc: Stefan Hajnoczi, Cam Macdonell, Nick Gasson, qemu-devel,
fred.konrad
[-- Attachment #1: Type: text/plain, Size: 1417 bytes --]
Hi,
I modified the protocol so that new message types can be
added easily. It is necessary for control related messages,
such as the hello one (I called it init). A type field has
been added to the header.
I did not include an is_reply (or is_request) field, and
preferred having two distinct message types. This is because
one may imagine a message type that has no reply (e.g. ping ...).
Out-of-order reception is allowed by the use of a tag field
in requests and replies. I did not include the tag in the
header, since not all messages may need a tag. I plan
to implement this tag as a simple incrementing counter, so I
made it large enough.
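For illustration (nothing of this exists yet, and the helper names
are hypothetical), the tag handling could be as simple as:

  static uint32_t next_tag;          /* simple incrementing counter */

  req.tag = next_tag++;              /* allocate a tag per request */
  send_request(&req);
  /* ... on reception, match the reply to the outstanding request ... */
  if (rep.tag == req.tag)
    complete_request(&rep);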
I have not implemented these modifications yet, since I prefer
to get feedback first. Neither have I looked at the
command line option parsing.
Regards,
Fabien.
2012/11/22 Paolo Bonzini <pbonzini@redhat.com>:
> On 22/11/2012 09:19, Stefan Hajnoczi wrote:
>>> > usage
>>> > -----
>>> > PCIEFW devices are instanciated using the following QEMU options:
>>> > -device \
>>> > pciefw,\
>>> > laddr=<local_addr>,\
>>> > lport=<local_port>,\
>>> > raddr=<remote_addr>,\
>>> > rport=<remote_port>
>> Take a look at qemu_socket.h:socket_parse(). It should allow you to
>> support TCP, UNIX domain sockets, and arbitrary file descriptors.
>>
>
> Even better it could just be a chardev. socket_parse() is only used by
> the (human) monitor interface.
>
> Paolo
[-- Attachment #2: pciefw.protocol --]
[-- Type: application/octet-stream, Size: 4239 bytes --]
* rationale
-----------
PCIE access forwarding was made to implement a PCIE endpoint in a process
external to QEMU, possibly on a remote host. The main reason is to allow
interfacing QEMU with PCIE devices simulated in a third party environment.
Being an external process also has several benefits: independence from the
QEMU event loop, the ability to use compilation tools not supported by the
QEMU build system ...
* usage
-------
PCIEFW devices are instantiated using the following QEMU options:
-device \
pciefw,\
laddr=<local_addr>,\
lport=<local_port>,\
raddr=<remote_addr>,\
rport=<remote_port>
* theory of operation
---------------------
PCIEFW is a PCIE access forwarding device added to the QEMU source
tree. At initialization, this device opens a bidirectional point-to-point
communication channel with an external process. This process
actually implements the PCIE endpoint. That is, a PCIE access made by
QEMU is forwarded to the process. Reciprocally, replies and access
messages from the process are forwarded to QEMU.
* communication protocol
------------------------
The communication assumes a reliable transport layer. Currently, a
bidirectional point-to-point TCP socket-based channel is used. Byte
ordering is little endian.
** communication messages
-------------------------
A protocol message always starts with a header:
typedef struct pciefw_header
{
  uint16_t size;

#define PCIEFW_TYPE_INIT_REQ   0x00
#define PCIEFW_TYPE_INIT_REP   0x01
#define PCIEFW_TYPE_ACCESS_REQ 0x02
#define PCIEFW_TYPE_ACCESS_REP 0x03

  uint8_t type;

} __attribute__((packed)) pciefw_header_t;
Where:
. size is the total message size,
. type is the message type, one of PCIEFW_TYPE_XXX.
** initialization sequence
--------------------------
At initialization, QEMU sends an initialization message:
typedef struct pciefw_init_req
{
  pciefw_header_t header;
  uint8_t version;
} __attribute__((packed)) pciefw_init_req_t;
Where:
. version is used to identify the protocol version.
The process answers with a pciefw_init_rep message:
typedef struct pciefw_init_rep
{
  pciefw_header_t header;

#define PCIEFW_STATUS_SUCCESS 0x00
#define PCIEFW_STATUS_FAILURE 0xff

  uint8_t status;

} __attribute__((packed)) pciefw_init_rep_t;
Where:
. status, if not PCIEFW_STATUS_SUCCESS, indicates that the protocol version is
not supported and that no further communication should be attempted.
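A minimal handshake on the endpoint (server) side could therefore look
like the following sketch; read_all()/write_all() are assumed helpers
that loop until all bytes are transferred, and MY_PROTOCOL_VERSION is a
hypothetical constant:

  /* sketch: answer the initialization request sent by QEMU */
  pciefw_init_req_t req;
  pciefw_init_rep_t rep;

  read_all(fd, &req, sizeof(req));          /* fixed-size message */

  rep.header.size = sizeof(rep);
  rep.header.type = PCIEFW_TYPE_INIT_REP;
  rep.status = (req.version == MY_PROTOCOL_VERSION) ?
    PCIEFW_STATUS_SUCCESS : PCIEFW_STATUS_FAILURE;

  write_all(fd, &rep, sizeof(rep));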
** access messages
------------------
PCIEFW initiates a request upon access from QEMU. It sends a message
whose format is described by the pciefw_access_req_t type:
typedef struct pciefw_access_req
{
#define PCIEFW_ACCESS_MAX_SIZE (offsetof(pciefw_access_req_t, data) + 0x1000)

  pciefw_header_t header;

  uint32_t tag;

#define PCIEFW_OP_READ_CONFIG  0
#define PCIEFW_OP_WRITE_CONFIG 1
#define PCIEFW_OP_READ_MEM     2
#define PCIEFW_OP_WRITE_MEM    3
#define PCIEFW_OP_READ_IO      4
#define PCIEFW_OP_WRITE_IO     5
#define PCIEFW_OP_INT          6
#define PCIEFW_OP_MSI          7
#define PCIEFW_OP_MSIX         8

  uint8_t  op;
  uint8_t  bar;
  uint64_t addr;
  uint16_t size;
  uint8_t  data[1];

} __attribute__((packed)) pciefw_access_req_t;
Where:
. tag is sent back in the reply as an opaque field,
. op is the operation type, one of PCIEFW_OP_XXX,
. bar is the PCIE BAR, in [0:5],
. addr is the target address,
. size is the data size, in bytes,
. data is a variable length field containing the access data.
The PCIE endpoint process can initiate pciefw_access_req_t messages to
perform write operations of its own. This is used to perform data transfers
(DMA engines ...) and send interrupts.
In the case of a read operation, the PCIE endpoint process replies with
a pciefw_access_rep_t formatted message:
typedef struct pciefw_access_rep
{
  pciefw_header_t header;
  uint32_t tag;
  uint8_t status;
  uint8_t data[8];
} __attribute__((packed)) pciefw_access_rep_t;
Where:
. tag is the initiating access tag,
. status is the access status, one of PCIEFW_STATUS_XXX,
. data contains the replied data.
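Putting the message types together, an endpoint process could service
requests with a loop along these lines (a sketch only; read_all(),
write_all() and the device_read()/device_write() callbacks are assumed
helpers, not part of this specification; register reads are assumed to
fit in the 8-byte reply data field):

  /* sketch: endpoint service loop */
  for (;;)
  {
    uint8_t buf[PCIEFW_ACCESS_MAX_SIZE];
    pciefw_header_t* const hdr = (pciefw_header_t*)buf;
    pciefw_access_req_t* const req = (pciefw_access_req_t*)buf;
    pciefw_access_rep_t rep;

    read_all(fd, buf, sizeof(pciefw_header_t));
    read_all(fd, buf + sizeof(pciefw_header_t),
             hdr->size - sizeof(pciefw_header_t));

    if (hdr->type != PCIEFW_TYPE_ACCESS_REQ) continue;

    memset(&rep, 0, sizeof(rep));
    rep.header.size = sizeof(rep);
    rep.header.type = PCIEFW_TYPE_ACCESS_REP;
    rep.tag = req->tag;                   /* echo the tag back */
    rep.status = PCIEFW_STATUS_SUCCESS;

    switch (req->op)
    {
    case PCIEFW_OP_READ_MEM:
      device_read(req->bar, req->addr, req->size, rep.data);
      write_all(fd, &rep, sizeof(rep));   /* reads are replied to */
      break;
    case PCIEFW_OP_WRITE_MEM:
      device_write(req->bar, req->addr, req->size, req->data);
      break;                              /* writes need no reply */
    default:
      break;                              /* other PCIEFW_OP_XXX cases ... */
    }
  }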
* limitations
-------------
The process acts as the server, and must be started before QEMU starts.
The QEMU event loop is blocked while awaiting a device reply.
Read operations from the process are not supported.
The transport layer should be abstracted, allowing transports other than TCP to be used.
MSIX is not supported.
* Re: [Qemu-devel] TCP based PCIE request forwarding
2012-11-22 10:21 ` Paolo Bonzini
2012-11-22 11:26 ` lementec fabien
@ 2012-11-22 12:38 ` Stefan Hajnoczi
1 sibling, 0 replies; 13+ messages in thread
From: Stefan Hajnoczi @ 2012-11-22 12:38 UTC (permalink / raw)
To: Paolo Bonzini
Cc: lementec fabien, Cam Macdonell, Nick Gasson, qemu-devel,
fred.konrad
On Thu, Nov 22, 2012 at 11:21:58AM +0100, Paolo Bonzini wrote:
> On 22/11/2012 09:19, Stefan Hajnoczi wrote:
> >> > usage
> >> > -----
> >> > PCIEFW devices are instanciated using the following QEMU options:
> >> > -device \
> >> > pciefw,\
> >> > laddr=<local_addr>,\
> >> > lport=<local_port>,\
> >> > raddr=<remote_addr>,\
> >> > rport=<remote_port>
> > Take a look at qemu_socket.h:socket_parse(). It should allow you to
> > support TCP, UNIX domain sockets, and arbitrary file descriptors.
> >
>
> Even better it could just be a chardev. socket_parse() is only used by
> the (human) monitor interface.
The issue with chardev is that it's asynchronous.
In this case we cannot return from MemoryRegionOps->read() or
MemoryRegionOps->write() back to the event loop.
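To illustrate the constraint (a sketch only, not the actual pciefw code;
the callback signature is simplified, and PCIEFWState and the helpers are
hypothetical): a guest read must produce its value before the callback
returns, so the reply has to be awaited synchronously right there.

  static uint64_t pciefw_mmio_read(void *opaque, uint64_t addr, unsigned size)
  {
      PCIEFWState *s = opaque;
      uint64_t value = 0;

      pciefw_send_read_request(s, addr, size); /* forward the access */
      pciefw_wait_for_reply(s, &value);        /* blocks the event loop */
      return value;                            /* needed before returning */
  }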
Stefan
end of thread, other threads:[~2012-11-22 12:38 UTC | newest]
Thread overview: 13+ messages
2012-11-16 8:39 [Qemu-devel] TCP based PCIE request forwarding lementec fabien
2012-11-16 12:10 ` Stefan Hajnoczi
2012-11-16 13:05 ` lementec fabien
2012-11-19 8:55 ` Stefan Hajnoczi
2012-11-19 16:00 ` lementec fabien
2012-11-21 14:27 ` lementec fabien
2012-11-22 8:19 ` Stefan Hajnoczi
2012-11-22 10:08 ` lementec fabien
2012-11-22 10:21 ` Paolo Bonzini
2012-11-22 11:26 ` lementec fabien
2012-11-22 12:38 ` Stefan Hajnoczi
2012-11-20 16:41 ` Jason Baron
2012-11-21 13:13 ` lementec fabien