[Qemu-devel] [RFC] QOMification of AXI stream

qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed

* [Qemu-devel] [RFC] QOMification of AXI stream
@ 2012-06-08  4:23 Peter Crosthwaite
  2012-06-08  9:13 ` Paul Brook
                   ` (2 more replies)
  0 siblings, 3 replies; 63+ messages in thread
From: Peter Crosthwaite @ 2012-06-08  4:23 UTC (permalink / raw)
  To: qemu-devel@nongnu.org Developers, Andreas Färber,
	Anthony Liguori, Edgar E. Iglesias
  Cc: Michal Simek, Paul Brook, John Williams

Hi all,

Im looking to QOMifying and refactoring the AXI stream interfaces
between the AXI ethernet and AXI DMA modules. I could use some
guidance on how to do this as I can think of about 6 different
solutions. Sources are hw/xilinx_axienet.c and hw/xilinx_axidma.c.

First ill start off by describing the real hardware:

Each of the two core has three interfaces (+interrupt pins):

1: Sysbus attachment for device control
2: AXI stream TX link
3: AXI stream RX link

Ethernet packet data is transferred from the ethernet device to/from
memory via the AXI stream links and the DMA core. Basically the DMA
core can be summed up as simply taking data to/from memory and putting
to/from the axi stream links. Axi stream is a trival point to point
protocol that allows for pushing 32-bit data words at a time.

>From an architecture point of view, the TX and RX links are completely
independent of each other. It doesnt make a lot of sense to have tx or
rx without the other for the ethernet with DMA case, but other
applications of the DMA could use only one of tx and rx. For this
reason I think its best we decouple the tx/rx pair. Currently it is
coupled in qemu (hw/xilinx_axdma.h):

struct XilinxDMAConnection {
    void *dma;
    void *client;

    DMAPushFn to_dma;
    DMAPushFn to_client;
};

So what im proposing is AXI stream is implemented as a unidirectional
point to point bus. The xilinx ethernet system would consist of two of
these buses one for tx, one for rx.

Onto the QOM stuff:

Currently the DMA interconnect is handled as this struct I pasted
above and a QDEV_PROP_PTR (which i understand is evil). The
interconnect mechanism obviously needs to change. So lets assume that
AXI stream is turned into a proper QEMU bus and devices can create
sub-busses which they are the tx'ing master:

s->axi_stream_master = axi_stream_create_bus(&dev->qdev, "axi_stream");

Machine models can grab the bus to attach slaves:

foo = qdev_get_child_bus(dev, "axi_stream");

Where my thinking goes pear shaped though is having proper QOMified
slaves. Each IP is a slave to both the sysbus and their respective
rx'ing AXI stream bus. This is something of a multiple inheritance
problem, I cant inherit from both SYSBUS and AXI_STREAM_SLAVE. So to
overcome this should I ...

A: Make AXI_STREAM_SLAVE an interface (not a sub-class of DEVICE). Its
kind of annoying though if someone in the future whats the create a
device thats only and axi stream slave, as they would have to
explicitly inherit from DEVICE as well.

or

B: Have the slave attachment be a device within a device. Hard part is
getting an accessor so machine models can retrieve the slave
attachment and hook it up.

Let me know what to do,

Regards,
Peter

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08  4:23 [Qemu-devel] [RFC] QOMification of AXI stream Peter Crosthwaite
@ 2012-06-08  9:13 ` Paul Brook
  2012-06-08  9:34   ` Peter Maydell
  2012-06-08 13:41   ` Anthony Liguori
  2012-06-08  9:45 ` Andreas Färber
  2012-06-08 14:15 ` Anthony Liguori
  2 siblings, 2 replies; 63+ messages in thread
From: Paul Brook @ 2012-06-08  9:13 UTC (permalink / raw)
  To: Peter Crosthwaite
  Cc: Michal Simek, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Edgar E. Iglesias, Andreas Färber, John Williams

> Im looking to QOMifying and refactoring the AXI stream interfaces
> between the AXI ethernet and AXI DMA modules. I could use some
> guidance on how to do this as I can think of about 6 different
> solutions. Sources are hw/xilinx_axienet.c and hw/xilinx_axidma.c.
> 
>...
>
> So what im proposing is AXI stream is implemented as a unidirectional
> point to point bus. The xilinx ethernet system would consist of two of
> these buses one for tx, one for rx.

I thought the idea was that with QOM the bus/device model would go away.
The DMA controller implements an AXIDMA interface, and the device has a AXIDMA 
link that's connected to that interface.

Of course we then hit the usual problem with QOM that we can only link to 
objects, and it's impossible to expose multiple interfaces of the same type. 
The DMA controller probably needs a proxy object for each DMA channel.

Paul

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08  9:13 ` Paul Brook
@ 2012-06-08  9:34   ` Peter Maydell
  2012-06-08 13:13     ` Paul Brook
  2012-06-08 13:41   ` Anthony Liguori
  1 sibling, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2012-06-08  9:34 UTC (permalink / raw)
  To: Paul Brook
  Cc: Michal Simek, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Edgar E. Iglesias, Andreas Färber,
	John Williams

On 8 June 2012 10:13, Paul Brook <paul@codesourcery.com> wrote:
> Of course we then hit the usual problem with QOM that we can only link to
> objects, and it's impossible to expose multiple interfaces of the same type.

I'm pretty sure Anthony claimed this was entirely possible --
presumably that's how Pins are going to work.

> The DMA controller probably needs a proxy object for each DMA channel.

I think there's some extra objects floating around in there, yes...

-- PMM

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08  4:23 [Qemu-devel] [RFC] QOMification of AXI stream Peter Crosthwaite
  2012-06-08  9:13 ` Paul Brook
@ 2012-06-08  9:45 ` Andreas Färber
  2012-06-09  1:53   ` Peter Crosthwaite
  2012-06-08 14:15 ` Anthony Liguori
  2 siblings, 1 reply; 63+ messages in thread
From: Andreas Färber @ 2012-06-08  9:45 UTC (permalink / raw)
  To: Peter Crosthwaite
  Cc: Anthony Liguori, qemu-devel@nongnu.org Developers, Michal Simek,
	Paul Brook, Edgar E. Iglesias, John Williams

Am 08.06.2012 06:23, schrieb Peter Crosthwaite:
> Each of the two core has three interfaces (+interrupt pins):
> 
> 1: Sysbus attachment for device control
> 2: AXI stream TX link
> 3: AXI stream RX link
[...]
> struct XilinxDMAConnection {
>     void *dma;
>     void *client;
> 
>     DMAPushFn to_dma;
>     DMAPushFn to_client;
> };
> 
> So what im proposing is AXI stream is implemented as a unidirectional
> point to point bus. The xilinx ethernet system would consist of two of
> these buses one for tx, one for rx.
[...]
> A: Make AXI_STREAM_SLAVE an interface (not a sub-class of DEVICE). Its
> kind of annoying though if someone in the future whats the create a
> device thats only and axi stream slave, as they would have to
> explicitly inherit from DEVICE as well.
> 
> or
> 
> B: Have the slave attachment be a device within a device. Hard part is
> getting an accessor so machine models can retrieve the slave
> attachment and hook it up.

If you dive into busses, note that Anthony has refactored QBus on
qom-next branch.

As Paul has already mentioned, the concept of tree-structured qdev
busses is deprecated by QOM in favor of link<> properties.

SysBus is also in the process of being deprecated, and Anthony is
working on Pin objects for simple qemu_irq-style messaging.

What you could try as option C today is deriving a type from SysBus with
one added DMAPushFn and a link<> property of that type for
unidirectional connection and form circular links for RX and TX...
That would of course limit the number of channels to one. Otherwise you
need a dedicated child<> object, of which a device can have multiple.

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08  9:34   ` Peter Maydell
@ 2012-06-08 13:13     ` Paul Brook
  2012-06-08 13:39       ` Anthony Liguori
  0 siblings, 1 reply; 63+ messages in thread
From: Paul Brook @ 2012-06-08 13:13 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Michal Simek, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Edgar E. Iglesias, Andreas Färber,
	John Williams

> On 8 June 2012 10:13, Paul Brook <paul@codesourcery.com> wrote:
> > Of course we then hit the usual problem with QOM that we can only link to
> > objects, and it's impossible to expose multiple interfaces of the same
> > type.
> 
> I'm pretty sure Anthony claimed this was entirely possible --
> presumably that's how Pins are going to work.

Really?  Every time I've talked to him I've got the opposite impression.  Part 
of the response has been that interrupt pins are the only case where this 
actually occurs, so It's not worth fixing properly.  I disagree with this 
assesment.

Given we do need to expose multiple instances of the same interface, I see a 
few different options:

- Create a proxy object for each reciever which multiplexes onto a different 
interface on the main object.  For interrupt pins this basically means making 
the qemu_irq object part of the device tree, and have the actual device 
implement qemu_irq_handler (see hw/irq.h).  The equivalent of qemu_irq (i.e. 
irq.c/h) needs to be created for every duplicated interface.  It's worth 
noting that qemu_irq is about as simple as it gets, it's a single 
unidirectional call.

- Make some form of handle an explicit part of the API.  IMO this is a really 
bad idea, and a step backwards.  In the qemu_irq case it means that the device 
raising the interrupt needs to know how the interrupt controller enumerates 
its input pins, and which one it's connected to.  Instead of making 
connections via a nice clean links we have a link and some other device 
specific information.  It's worse than the old callback+opaque pointer pair 
because user [machine description] has to provide that device specific 
additional value.

- Link to properties, not objects.  This probably ends up similar to the first 
option, except with a framework and consistent implementation across different 
interfaces.

Paul

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08 13:13     ` Paul Brook
@ 2012-06-08 13:39       ` Anthony Liguori
  2012-06-08 13:59         ` Paul Brook
  0 siblings, 1 reply; 63+ messages in thread
From: Anthony Liguori @ 2012-06-08 13:39 UTC (permalink / raw)
  To: Paul Brook
  Cc: Peter Maydell, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Edgar E. Iglesias, Andreas Färber,
	John Williams

On 06/08/2012 09:13 PM, Paul Brook wrote:
>> On 8 June 2012 10:13, Paul Brook<paul@codesourcery.com>  wrote:
>>> Of course we then hit the usual problem with QOM that we can only link to
>>> objects, and it's impossible to expose multiple interfaces of the same
>>> type.
>>
>> I'm pretty sure Anthony claimed this was entirely possible --
>> presumably that's how Pins are going to work.
>
> Really?  Every time I've talked to him I've got the opposite impression.  Part
> of the response has been that interrupt pins are the only case where this
> actually occurs, so It's not worth fixing properly.

I think it depends on your definition of "properly".

There's really only three concepts in QOM that matter for this discussion: 1) 
objects 2) children and 3) links.

There is absolutely no difference between a Pin object and a SerialState object. 
  They both are first-class objects are far as QOM and concerned.  Both can have 
links to other objects.

The most common way for other objects to create objects is via children.  A 
device could have a bunch of Pin child objects with that being the sole 
communication mechanism with the outside world.

A device could also have a 'PCISocket' child object (which inherits from 
PCIDevice) in order to expose a PCI interface to the world.

For most bus-based devices, I think the above is poor design.  But that's my 
opinion from a modeling PoV, QOM doesn't have an opinion from an infrastructure PoV.

> I disagree with this
> assesment.
>
> Given we do need to expose multiple instances of the same interface, I see a
> few different options:
>
> - Create a proxy object for each reciever which multiplexes onto a different
> interface on the main object.  For interrupt pins this basically means making
> the qemu_irq object part of the device tree, and have the actual device
> implement qemu_irq_handler (see hw/irq.h).  The equivalent of qemu_irq (i.e.
> irq.c/h) needs to be created for every duplicated interface.  It's worth
> noting that qemu_irq is about as simple as it gets, it's a single
> unidirectional call.
>
> - Make some form of handle an explicit part of the API.  IMO this is a really
> bad idea, and a step backwards.  In the qemu_irq case it means that the device
> raising the interrupt needs to know how the interrupt controller enumerates
> its input pins, and which one it's connected to.  Instead of making
> connections via a nice clean links we have a link and some other device
> specific information.  It's worse than the old callback+opaque pointer pair
> because user [machine description] has to provide that device specific
> additional value.
>
> - Link to properties, not objects.  This probably ends up similar to the first
> option, except with a framework and consistent implementation across different
> interfaces.

Let me read and respond to the rest of the thread.  I'm not sure a super 
abstract discussion here is going to be helpful

Regards,

Anthony Liguori

>
> Paul
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08  9:13 ` Paul Brook
  2012-06-08  9:34   ` Peter Maydell
@ 2012-06-08 13:41   ` Anthony Liguori
  2012-06-08 13:53     ` Paul Brook
  2012-06-08 13:55     ` Peter Maydell
  1 sibling, 2 replies; 63+ messages in thread
From: Anthony Liguori @ 2012-06-08 13:41 UTC (permalink / raw)
  To: Paul Brook
  Cc: Michal Simek, qemu-devel@nongnu.org Developers, Peter Crosthwaite,
	Edgar E. Iglesias, Andreas Färber, John Williams

On 06/08/2012 05:13 PM, Paul Brook wrote:
>> Im looking to QOMifying and refactoring the AXI stream interfaces
>> between the AXI ethernet and AXI DMA modules. I could use some
>> guidance on how to do this as I can think of about 6 different
>> solutions. Sources are hw/xilinx_axienet.c and hw/xilinx_axidma.c.
>>
>> ...
>>
>> So what im proposing is AXI stream is implemented as a unidirectional
>> point to point bus. The xilinx ethernet system would consist of two of
>> these buses one for tx, one for rx.
>
> I thought the idea was that with QOM the bus/device model would go away.
> The DMA controller implements an AXIDMA interface, and the device has a AXIDMA
> link that's connected to that interface.
>
> Of course we then hit the usual problem with QOM that we can only link to
> objects, and it's impossible to expose multiple interfaces of the same type.

No, QOM supports multiple inheritance of interfaces so you absolutely can 
inherit from multiple different interfaces.

So no need for proxy objects.

Regards,

Anthony Liguori

> The DMA controller probably needs a proxy object for each DMA channel.
>
> Paul
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08 13:41   ` Anthony Liguori
@ 2012-06-08 13:53     ` Paul Brook
  2012-06-08 13:55     ` Peter Maydell
  1 sibling, 0 replies; 63+ messages in thread
From: Paul Brook @ 2012-06-08 13:53 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Michal Simek, qemu-devel@nongnu.org Developers, Peter Crosthwaite,
	Edgar E. Iglesias, Andreas Färber, John Williams

> >> So what im proposing is AXI stream is implemented as a unidirectional
> >> point to point bus. The xilinx ethernet system would consist of two of
> >> these buses one for tx, one for rx.
> > 
> > I thought the idea was that with QOM the bus/device model would go away.
> > The DMA controller implements an AXIDMA interface, and the device has a
> > AXIDMA link that's connected to that interface.
> > 
> > Of course we then hit the usual problem with QOM that we can only link to
> > objects, and it's impossible to expose multiple interfaces of the same
> > type.
> 
> No, QOM supports multiple inheritance of interfaces so you absolutely can
> inherit from multiple different interfaces.

But you can't have multiple instances of the same interface.  And the 
interfaces must be stateless.  Hence you need the proxy object.

Paul

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08 13:41   ` Anthony Liguori
  2012-06-08 13:53     ` Paul Brook
@ 2012-06-08 13:55     ` Peter Maydell
  1 sibling, 0 replies; 63+ messages in thread
From: Peter Maydell @ 2012-06-08 13:55 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Michal Simek, qemu-devel@nongnu.org Developers, Peter Crosthwaite,
	Paul Brook, Edgar E. Iglesias, Andreas Färber, John Williams

On 8 June 2012 14:41, Anthony Liguori <aliguori@us.ibm.com> wrote:
> On 06/08/2012 05:13 PM, Paul Brook wrote:
>> Of course we then hit the usual problem with QOM that we can only link to
>> objects, and it's impossible to expose multiple interfaces of the same
>> type.

> No, QOM supports multiple inheritance of interfaces so you absolutely can
> inherit from multiple different interfaces.

That would be multiple interfaces of different types, not multiple
interfaces of the same type.

-- PMM

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08 13:39       ` Anthony Liguori
@ 2012-06-08 13:59         ` Paul Brook
  2012-06-08 14:17           ` Anthony Liguori
  0 siblings, 1 reply; 63+ messages in thread
From: Paul Brook @ 2012-06-08 13:59 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Edgar E. Iglesias, Andreas Färber,
	John Williams

> >>> Of course we then hit the usual problem with QOM that we can only link
> >>> to objects, and it's impossible to expose multiple interfaces of the
> >>> same type.
> >> 
> >> I'm pretty sure Anthony claimed this was entirely possible --
> >> presumably that's how Pins are going to work.
> > 
> > Really?  Every time I've talked to him I've got the opposite impression. 
> > Part of the response has been that interrupt pins are the only case
> > where this actually occurs, so It's not worth fixing properly.
> 
> I think it depends on your definition of "properly".
> 
> There's really only three concepts in QOM that matter for this discussion:
> 1) objects 2) children and 3) links.
> 
> There is absolutely no difference between a Pin object and a SerialState
> object. They both are first-class objects are far as QOM and concerned. 
> Both can have links to other objects.
> 
> The most common way for other objects to create objects is via children.  A
> device could have a bunch of Pin child objects with that being the sole
> communication mechanism with the outside world.

And those pin objects would presumably communicate back to the device via some 
as-yet unimplemented PinMultiplex interface link?  Or are you expecting the 
Pin objects to have an API call that allows the device to register an 
arbitrary QEMUBH (or equivalent)?

> A device could also have a 'PCISocket' child object (which inherits from
> PCIDevice) in order to expose a PCI interface to the world.
> 
> For most bus-based devices, I think the above is poor design.  But that's
> my opinion from a modeling PoV, QOM doesn't have an opinion from an
> infrastructure PoV.

So what is a good design?  Are you hoping most of your interfaces are 
stateless, so can be implemented directly on the device object?

Paul

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08  4:23 [Qemu-devel] [RFC] QOMification of AXI stream Peter Crosthwaite
  2012-06-08  9:13 ` Paul Brook
  2012-06-08  9:45 ` Andreas Färber
@ 2012-06-08 14:15 ` Anthony Liguori
  2012-06-09  1:24   ` Peter Crosthwaite
  2 siblings, 1 reply; 63+ messages in thread
From: Anthony Liguori @ 2012-06-08 14:15 UTC (permalink / raw)
  To: Peter Crosthwaite
  Cc: Michal Simek, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Paul Brook, Edgar E. Iglesias, Andreas Färber, John Williams

On 06/08/2012 12:23 PM, Peter Crosthwaite wrote:
> Hi all,
>
> Im looking to QOMifying and refactoring the AXI stream interfaces
> between the AXI ethernet and AXI DMA modules. I could use some
> guidance on how to do this as I can think of about 6 different
> solutions. Sources are hw/xilinx_axienet.c and hw/xilinx_axidma.c.
>
> First ill start off by describing the real hardware:
>
> Each of the two core has three interfaces (+interrupt pins):
>
> 1: Sysbus attachment for device control
> 2: AXI stream TX link
> 3: AXI stream RX link
>
> Ethernet packet data is transferred from the ethernet device to/from
> memory via the AXI stream links and the DMA core. Basically the DMA
> core can be summed up as simply taking data to/from memory and putting
> to/from the axi stream links. Axi stream is a trival point to point
> protocol that allows for pushing 32-bit data words at a time.
>
>  From an architecture point of view, the TX and RX links are completely
> independent of each other. It doesnt make a lot of sense to have tx or
> rx without the other for the ethernet with DMA case, but other
> applications of the DMA could use only one of tx and rx. For this
> reason I think its best we decouple the tx/rx pair. Currently it is
> coupled in qemu (hw/xilinx_axdma.h):

So is this something that should be modeled as a DMAContext ala dwg/benh's IOMMU 
patch series?

Wouldn't the DMA controller expose two or more DMAContextes and then the 
Ethernet device expects a DMAContext for tx and rx.

Of course, DMAContext could be made trivially into an Object and tx/rx could be 
turned into link<> properties.

Regards,

Anthony Liguori

>
> struct XilinxDMAConnection {
>      void *dma;
>      void *client;
>
>      DMAPushFn to_dma;
>      DMAPushFn to_client;
> };
>
> So what im proposing is AXI stream is implemented as a unidirectional
> point to point bus. The xilinx ethernet system would consist of two of
> these buses one for tx, one for rx.
>
> Onto the QOM stuff:
>
> Currently the DMA interconnect is handled as this struct I pasted
> above and a QDEV_PROP_PTR (which i understand is evil). The
> interconnect mechanism obviously needs to change. So lets assume that
> AXI stream is turned into a proper QEMU bus and devices can create
> sub-busses which they are the tx'ing master:
>
> s->axi_stream_master = axi_stream_create_bus(&dev->qdev, "axi_stream");
>
> Machine models can grab the bus to attach slaves:
>
> foo = qdev_get_child_bus(dev, "axi_stream");
>
> Where my thinking goes pear shaped though is having proper QOMified
> slaves. Each IP is a slave to both the sysbus and their respective
> rx'ing AXI stream bus. This is something of a multiple inheritance
> problem, I cant inherit from both SYSBUS and AXI_STREAM_SLAVE. So to
> overcome this should I ...
>
> A: Make AXI_STREAM_SLAVE an interface (not a sub-class of DEVICE). Its
> kind of annoying though if someone in the future whats the create a
> device thats only and axi stream slave, as they would have to
> explicitly inherit from DEVICE as well.
>
> or
>
> B: Have the slave attachment be a device within a device. Hard part is
> getting an accessor so machine models can retrieve the slave
> attachment and hook it up.
>
> Let me know what to do,
>
> Regards,
> Peter
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08 13:59         ` Paul Brook
@ 2012-06-08 14:17           ` Anthony Liguori
  0 siblings, 0 replies; 63+ messages in thread
From: Anthony Liguori @ 2012-06-08 14:17 UTC (permalink / raw)
  To: Paul Brook
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Peter Crosthwaite,
	Edgar E. Iglesias, Andreas Färber, John Williams

On 06/08/2012 09:59 PM, Paul Brook wrote:
>>>>> Of course we then hit the usual problem with QOM that we can only link
>>>>> to objects, and it's impossible to expose multiple interfaces of the
>>>>> same type.
>>>>
>>>> I'm pretty sure Anthony claimed this was entirely possible --
>>>> presumably that's how Pins are going to work.
>>>
>>> Really?  Every time I've talked to him I've got the opposite impression.
>>> Part of the response has been that interrupt pins are the only case
>>> where this actually occurs, so It's not worth fixing properly.
>>
>> I think it depends on your definition of "properly".
>>
>> There's really only three concepts in QOM that matter for this discussion:
>> 1) objects 2) children and 3) links.
>>
>> There is absolutely no difference between a Pin object and a SerialState
>> object. They both are first-class objects are far as QOM and concerned.
>> Both can have links to other objects.
>>
>> The most common way for other objects to create objects is via children.  A
>> device could have a bunch of Pin child objects with that being the sole
>> communication mechanism with the outside world.
>
> And those pin objects would presumably communicate back to the device via some
> as-yet unimplemented PinMultiplex interface link?  Or are you expecting the
> Pin objects to have an API call that allows the device to register an
> arbitrary QEMUBH (or equivalent)?

Notifiers actually.  The rough model is Signal/Slots similar to GObject and/or QT.

>
>> A device could also have a 'PCISocket' child object (which inherits from
>> PCIDevice) in order to expose a PCI interface to the world.
>>
>> For most bus-based devices, I think the above is poor design.  But that's
>> my opinion from a modeling PoV, QOM doesn't have an opinion from an
>> infrastructure PoV.
>
> So what is a good design?  Are you hoping most of your interfaces are
> stateless, so can be implemented directly on the device object?

QOM is not a panacea.  We still need to have good objects/interfaces.  For this 
specific example, it sounds like a job for the DMA API dwg/benh have proposed. 
I responded with more detail in another part of the thread.

Regards,

Anthony Liguori

>
> Paul
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08 14:15 ` Anthony Liguori
@ 2012-06-09  1:24   ` Peter Crosthwaite
  2012-06-11 13:15     ` Anthony Liguori
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Crosthwaite @ 2012-06-09  1:24 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Michal Simek, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Paul Brook, Edgar E. Iglesias, Andreas Färber, John Williams

On Sat, Jun 9, 2012 at 12:15 AM, Anthony Liguori <anthony@codemonkey.ws> wrote:
> On 06/08/2012 12:23 PM, Peter Crosthwaite wrote:
>>
>> Hi all,
>>
>> Im looking to QOMifying and refactoring the AXI stream interfaces
>> between the AXI ethernet and AXI DMA modules. I could use some
>> guidance on how to do this as I can think of about 6 different
>> solutions. Sources are hw/xilinx_axienet.c and hw/xilinx_axidma.c.
>>
>> First ill start off by describing the real hardware:
>>
>> Each of the two core has three interfaces (+interrupt pins):
>>
>> 1: Sysbus attachment for device control
>> 2: AXI stream TX link
>> 3: AXI stream RX link
>>
>> Ethernet packet data is transferred from the ethernet device to/from
>> memory via the AXI stream links and the DMA core. Basically the DMA
>> core can be summed up as simply taking data to/from memory and putting
>> to/from the axi stream links. Axi stream is a trival point to point
>> protocol that allows for pushing 32-bit data words at a time.
>>
>>  From an architecture point of view, the TX and RX links are completely
>> independent of each other. It doesnt make a lot of sense to have tx or
>> rx without the other for the ethernet with DMA case, but other
>> applications of the DMA could use only one of tx and rx. For this
>> reason I think its best we decouple the tx/rx pair. Currently it is
>> coupled in qemu (hw/xilinx_axdma.h):
>
>
> So is this something that should be modeled as a DMAContext ala dwg/benh's
> IOMMU patch series?
>
> Wouldn't the DMA controller expose two or more DMAContextes and then the
> Ethernet device expects a DMAContext for tx and rx.
>

Hi Anthony,

This doesn't sound right to me at all WRT to modelling the real
hardware. The ethernet controller on its own has no notion that it is
attached to a DMA controller. I can attached the device to any
AXI-stream tx/rx pair. The ethernet device should only has awareness
of the axi stream protocol and have no exposure to any DMA related
functionality.

Heres the (futuristic) example im worried about:

I have my ethernet controller attached to the DMA as usual over the
axi-stream pairs. I then come along and decide I want to model some
hardware packet filter device on the ethernet traffic inbetween the
memory and ethernet. My packet filter is implemented as a axi-stream
master and slave (basically a filtering fifo). I need to on the
machine model level, insert it on the rx path. Heres the architecture:

Ethernet -> axistream -> Random packet filter IP -> axistream -> DMA

Ethernet <- axistream <- DMA

In this case the ethernet is insulated from DMA. I think i can
summarize in saying that assuming that an axi-stream link is for DMA
is incorrect.

Regards,
Peter

> Of course, DMAContext could be made trivially into an Object and tx/rx could
> be turned into link<> properties.
>
> Regards,
>
> Anthony Liguori
>
>
>>
>> struct XilinxDMAConnection {
>>     void *dma;
>>     void *client;
>>
>>     DMAPushFn to_dma;
>>     DMAPushFn to_client;
>> };
>>
>> So what im proposing is AXI stream is implemented as a unidirectional
>> point to point bus. The xilinx ethernet system would consist of two of
>> these buses one for tx, one for rx.
>>
>> Onto the QOM stuff:
>>
>> Currently the DMA interconnect is handled as this struct I pasted
>> above and a QDEV_PROP_PTR (which i understand is evil). The
>> interconnect mechanism obviously needs to change. So lets assume that
>> AXI stream is turned into a proper QEMU bus and devices can create
>> sub-busses which they are the tx'ing master:
>>
>> s->axi_stream_master = axi_stream_create_bus(&dev->qdev, "axi_stream");
>>
>> Machine models can grab the bus to attach slaves:
>>
>> foo = qdev_get_child_bus(dev, "axi_stream");
>>
>> Where my thinking goes pear shaped though is having proper QOMified
>> slaves. Each IP is a slave to both the sysbus and their respective
>> rx'ing AXI stream bus. This is something of a multiple inheritance
>> problem, I cant inherit from both SYSBUS and AXI_STREAM_SLAVE. So to
>> overcome this should I ...
>>
>> A: Make AXI_STREAM_SLAVE an interface (not a sub-class of DEVICE). Its
>> kind of annoying though if someone in the future whats the create a
>> device thats only and axi stream slave, as they would have to
>> explicitly inherit from DEVICE as well.
>>
>> or
>>
>> B: Have the slave attachment be a device within a device. Hard part is
>> getting an accessor so machine models can retrieve the slave
>> attachment and hook it up.
>>
>> Let me know what to do,
>>
>> Regards,
>> Peter
>>
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-08  9:45 ` Andreas Färber
@ 2012-06-09  1:53   ` Peter Crosthwaite
  2012-06-09  2:12     ` Andreas Färber
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Crosthwaite @ 2012-06-09  1:53 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Anthony Liguori, qemu-devel@nongnu.org Developers, Michal Simek,
	Paul Brook, Edgar E. Iglesias, John Williams

On Fri, Jun 8, 2012 at 7:45 PM, Andreas Färber <afaerber@suse.de> wrote:
> Am 08.06.2012 06:23, schrieb Peter Crosthwaite:
>> Each of the two core has three interfaces (+interrupt pins):
>>
>> 1: Sysbus attachment for device control
>> 2: AXI stream TX link
>> 3: AXI stream RX link
> [...]
>> struct XilinxDMAConnection {
>>     void *dma;
>>     void *client;
>>
>>     DMAPushFn to_dma;
>>     DMAPushFn to_client;
>> };
>>
>> So what im proposing is AXI stream is implemented as a unidirectional
>> point to point bus. The xilinx ethernet system would consist of two of
>> these buses one for tx, one for rx.
> [...]
>> A: Make AXI_STREAM_SLAVE an interface (not a sub-class of DEVICE). Its
>> kind of annoying though if someone in the future whats the create a
>> device thats only and axi stream slave, as they would have to
>> explicitly inherit from DEVICE as well.
>>
>> or
>>
>> B: Have the slave attachment be a device within a device. Hard part is
>> getting an accessor so machine models can retrieve the slave
>> attachment and hook it up.
>
> If you dive into busses, note that Anthony has refactored QBus on
> qom-next branch.
>

How stable is this branch? It seems like I should use it as the
development point. Is the merge immenent. If the merge is delayed, can
I at least rely on the fundamental APIs define here (around links and
stuff) no changing?

> As Paul has already mentioned, the concept of tree-structured qdev
> busses is deprecated by QOM in favor of link<> properties.

Ive had a brief look at the refactorings on qom-next, I notice that
busses are now just children of the parent object TYPE_BUS.
Essentially for point-to-point links this means that link itself has a
QOM object. So for finer clarification, for new busses should or
should I not have an object (whether it inheritTYPE_BUS or some other
parent) for the link itself? Or should The master device interface
directly with its slave? Im thinking the latter, no need for an object
for a trivial point-to-point link.

>
> SysBus is also in the process of being deprecated, and Anthony is
> working on Pin objects for simple qemu_irq-style messaging.
>
> What you could try as option C today is deriving a type from SysBus with
> one added DMAPushFn and a link<> property of that type for
> unidirectional connection and form circular links for RX and TX...

Ok,

Heres what i'm thinking now. each device will

Inherit from SYSBUS
implement interface AXI_STREAM_SLAVE
have a link property "axi_stream_connected_slave"

AXI_STREAM_SLAVE has a single function to push data down the link
(what I believe you called DMAPushFn), but I will rename to
axi_stream_push or the like as its not DMA specific.

Machine model then just sets axi_stream_connected_slave to each other.

> That would of course limit the number of channels to one. Otherwise you
> need a dedicated child<> object, of which a device can have multiple.

Im not too worried about that, but Peter and Paul have opened the
discussion. Is the straight up interface on the sysbus device fine for
what im trying to do - or should I have proxy objects for the sake of
consistency?

Regards,
Peter

>
> Regards,
> Andreas
>
> --
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-09  1:53   ` Peter Crosthwaite
@ 2012-06-09  2:12     ` Andreas Färber
  2012-06-09  3:28       ` Peter Crosthwaite
  2012-06-11  5:54       ` Paolo Bonzini
  0 siblings, 2 replies; 63+ messages in thread
From: Andreas Färber @ 2012-06-09  2:12 UTC (permalink / raw)
  To: Peter Crosthwaite
  Cc: Anthony Liguori, qemu-devel@nongnu.org Developers, Michal Simek,
	Paul Brook, Paolo Bonzini, Edgar E. Iglesias, John Williams

Am 09.06.2012 03:53, schrieb Peter Crosthwaite:
> On Fri, Jun 8, 2012 at 7:45 PM, Andreas Färber <afaerber@suse.de> wrote:
>> Am 08.06.2012 06:23, schrieb Peter Crosthwaite:
>>> Each of the two core has three interfaces (+interrupt pins):
>>>
>>> 1: Sysbus attachment for device control
>>> 2: AXI stream TX link
>>> 3: AXI stream RX link
>> [...]
>>> struct XilinxDMAConnection {
>>>     void *dma;
>>>     void *client;
>>>
>>>     DMAPushFn to_dma;
>>>     DMAPushFn to_client;
>>> };
>>>
>>> So what im proposing is AXI stream is implemented as a unidirectional
>>> point to point bus. The xilinx ethernet system would consist of two of
>>> these buses one for tx, one for rx.
>> [...]
>>> A: Make AXI_STREAM_SLAVE an interface (not a sub-class of DEVICE). Its
>>> kind of annoying though if someone in the future whats the create a
>>> device thats only and axi stream slave, as they would have to
>>> explicitly inherit from DEVICE as well.
>>>
>>> or
>>>
>>> B: Have the slave attachment be a device within a device. Hard part is
>>> getting an accessor so machine models can retrieve the slave
>>> attachment and hook it up.
>>
>> If you dive into busses, note that Anthony has refactored QBus on
>> qom-next branch.
>>
> 
> How stable is this branch? It seems like I should use it as the
> development point. Is the merge immenent. If the merge is delayed, can
> I at least rely on the fundamental APIs define here (around links and
> stuff) no changing?

At this point we're pretty close to merging (hopefully next week) so I
would advise against basing new work on that branch. Just be prepared to
rebase onto the "qdev: Convert busses to QEMU Object Model" patch, i.e.
BusInfo gets replaced by TypeInfo and creation uses TYPE_FOO.

>> As Paul has already mentioned, the concept of tree-structured qdev
>> busses is deprecated by QOM in favor of link<> properties.
> 
> Ive had a brief look at the refactorings on qom-next, I notice that
> busses are now just children of the parent object TYPE_BUS.
> Essentially for point-to-point links this means that link itself has a
> QOM object. So for finer clarification, for new busses should or
> should I not have an object (whether it inheritTYPE_BUS or some other
> parent) for the link itself? Or should The master device interface
> directly with its slave? Im thinking the latter, no need for an object
> for a trivial point-to-point link.

No bus expert myself, deferring to Anthony and Paolo.

> Heres what i'm thinking now. each device will
> 
> Inherit from SYSBUS
> implement interface AXI_STREAM_SLAVE
> have a link property "axi_stream_connected_slave"
> 
> AXI_STREAM_SLAVE has a single function to push data down the link
> (what I believe you called DMAPushFn), but I will rename to
> axi_stream_push or the like as its not DMA specific.
> 
> Machine model then just sets axi_stream_connected_slave to each other.

Doesn't sound wrong so far under the premise of that simplistic
modelling approach. Not that I'm specifically advocating this approach.

>> That would of course limit the number of channels to one. Otherwise you
>> need a dedicated child<> object, of which a device can have multiple.
> 
> Im not too worried about that, but Peter and Paul have opened the
> discussion. Is the straight up interface on the sysbus device fine for
> what im trying to do - or should I have proxy objects for the sake of
> consistency?

I'm not aware of any use of interfaces in upstream nor of any proxy
object. In the end it'll be a compromise between fancy and quick... ;)

Regards,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-09  2:12     ` Andreas Färber
@ 2012-06-09  3:28       ` Peter Crosthwaite
  2012-06-11  5:54       ` Paolo Bonzini
  1 sibling, 0 replies; 63+ messages in thread
From: Peter Crosthwaite @ 2012-06-09  3:28 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Anthony Liguori, qemu-devel@nongnu.org Developers, Michal Simek,
	Paul Brook, Paolo Bonzini, Edgar E. Iglesias, John Williams

On Sat, Jun 9, 2012 at 12:12 PM, Andreas Färber <afaerber@suse.de> wrote:
> Am 09.06.2012 03:53, schrieb Peter Crosthwaite:
>> On Fri, Jun 8, 2012 at 7:45 PM, Andreas Färber <afaerber@suse.de> wrote:
>>> Am 08.06.2012 06:23, schrieb Peter Crosthwaite:
>>>> Each of the two core has three interfaces (+interrupt pins):
>>>>
>>>> 1: Sysbus attachment for device control
>>>> 2: AXI stream TX link
>>>> 3: AXI stream RX link
>>> [...]
>>>> struct XilinxDMAConnection {
>>>>     void *dma;
>>>>     void *client;
>>>>
>>>>     DMAPushFn to_dma;
>>>>     DMAPushFn to_client;
>>>> };
>>>>
>>>> So what im proposing is AXI stream is implemented as a unidirectional
>>>> point to point bus. The xilinx ethernet system would consist of two of
>>>> these buses one for tx, one for rx.
>>> [...]
>>>> A: Make AXI_STREAM_SLAVE an interface (not a sub-class of DEVICE). Its
>>>> kind of annoying though if someone in the future whats the create a
>>>> device thats only and axi stream slave, as they would have to
>>>> explicitly inherit from DEVICE as well.
>>>>
>>>> or
>>>>
>>>> B: Have the slave attachment be a device within a device. Hard part is
>>>> getting an accessor so machine models can retrieve the slave
>>>> attachment and hook it up.
>>>
>>> If you dive into busses, note that Anthony has refactored QBus on
>>> qom-next branch.
>>>
>>
>> How stable is this branch? It seems like I should use it as the
>> development point. Is the merge immenent. If the merge is delayed, can
>> I at least rely on the fundamental APIs define here (around links and
>> stuff) no changing?
>
> At this point we're pretty close to merging (hopefully next week) so I
> would advise against basing new work on that branch. Just be prepared to
> rebase onto the "qdev: Convert busses to QEMU Object Model" patch, i.e.
> BusInfo gets replaced by TypeInfo and creation uses TYPE_FOO.
>
>>> As Paul has already mentioned, the concept of tree-structured qdev
>>> busses is deprecated by QOM in favor of link<> properties.
>>
>> Ive had a brief look at the refactorings on qom-next, I notice that
>> busses are now just children of the parent object TYPE_BUS.
>> Essentially for point-to-point links this means that link itself has a
>> QOM object. So for finer clarification, for new busses should or
>> should I not have an object (whether it inheritTYPE_BUS or some other
>> parent) for the link itself? Or should The master device interface
>> directly with its slave? Im thinking the latter, no need for an object
>> for a trivial point-to-point link.
>
> No bus expert myself, deferring to Anthony and Paolo.
>
>> Heres what i'm thinking now. each device will
>>
>> Inherit from SYSBUS
>> implement interface AXI_STREAM_SLAVE
>> have a link property "axi_stream_connected_slave"
>>
>> AXI_STREAM_SLAVE has a single function to push data down the link
>> (what I believe you called DMAPushFn), but I will rename to
>> axi_stream_push or the like as its not DMA specific.
>>
>> Machine model then just sets axi_stream_connected_slave to each other.
>
> Doesn't sound wrong so far under the premise of that simplistic
> modelling approach. Not that I'm specifically advocating this approach.
>
>>> That would of course limit the number of channels to one. Otherwise you
>>> need a dedicated child<> object, of which a device can have multiple.
>>
>> Im not too worried about that, but Peter and Paul have opened the
>> discussion. Is the straight up interface on the sysbus device fine for
>> what im trying to do - or should I have proxy objects for the sake of
>> consistency?
>
> I'm not aware of any use of interfaces in upstream nor of any proxy
> object. In the end it'll be a compromise between fancy and quick... ;)
>

Do you have a repo/branch you can point me to some good examples of
using qom-interfaces?

Regards,
Peter

> Regards,
> Andreas
>
> --
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
> GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-09  2:12     ` Andreas Färber
  2012-06-09  3:28       ` Peter Crosthwaite
@ 2012-06-11  5:54       ` Paolo Bonzini
  2012-06-11 13:05         ` Peter Maydell
  2012-06-11 13:17         ` Anthony Liguori
  1 sibling, 2 replies; 63+ messages in thread
From: Paolo Bonzini @ 2012-06-11  5:54 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Paul Brook, Edgar E. Iglesias,
	John Williams

> > Ive had a brief look at the refactorings on qom-next, I notice that
> > busses are now just children of the parent object TYPE_BUS.
> > Essentially for point-to-point links this means that link itself
> > has a QOM object. So for finer clarification, for new busses should or
> > should I not have an object (whether it inherit TYPE_BUS or some
> > other parent) for the link itself? Or should The master device interface
> > directly with its slave? Im thinking the latter, no need for an
> > object for a trivial point-to-point link.

I think AXI counts as a point-to-point link, so no bus.

In my prop-ptr branch you can find some code (untested beyond compilation)
that tried to model AXI using interfaces (git://github.com/pbonzini/qemu.git).
I don't know much about AXI, so I tried to reverse engineer a decent model
from what was done without QOM.

Paolo

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11  5:54       ` Paolo Bonzini
@ 2012-06-11 13:05         ` Peter Maydell
  2012-06-11 13:17         ` Anthony Liguori
  1 sibling, 0 replies; 63+ messages in thread
From: Peter Maydell @ 2012-06-11 13:05 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michal Simek, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Edgar E. Iglesias,
	Andreas Färber, John Williams

On 11 June 2012 06:54, Paolo Bonzini <pbonzini@redhat.com> wrote:
> I think AXI counts as a point-to-point link, so no bus.

Depends what you mean by AXI. In general an AXI bus fabric
can have multiple masters and slaves connected via an
interconnect, but we probably don't actually want to model
the interconnect as a separate device...

-- PMM

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-09  1:24   ` Peter Crosthwaite
@ 2012-06-11 13:15     ` Anthony Liguori
  2012-06-11 13:39       ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Anthony Liguori @ 2012-06-11 13:15 UTC (permalink / raw)
  To: Peter Crosthwaite
  Cc: Michal Simek, qemu-devel@nongnu.org Developers, Paul Brook,
	Anthony Liguori, Edgar E. Iglesias, Andreas Färber,
	John Williams

On 06/08/2012 08:24 PM, Peter Crosthwaite wrote:
> On Sat, Jun 9, 2012 at 12:15 AM, Anthony Liguori<anthony@codemonkey.ws>  wrote:
>> On 06/08/2012 12:23 PM, Peter Crosthwaite wrote:
>>>
>>> Hi all,
>>>
>>> Im looking to QOMifying and refactoring the AXI stream interfaces
>>> between the AXI ethernet and AXI DMA modules. I could use some
>>> guidance on how to do this as I can think of about 6 different
>>> solutions. Sources are hw/xilinx_axienet.c and hw/xilinx_axidma.c.
>>>
>>> First ill start off by describing the real hardware:
>>>
>>> Each of the two core has three interfaces (+interrupt pins):
>>>
>>> 1: Sysbus attachment for device control
>>> 2: AXI stream TX link
>>> 3: AXI stream RX link
>>>
>>> Ethernet packet data is transferred from the ethernet device to/from
>>> memory via the AXI stream links and the DMA core. Basically the DMA
>>> core can be summed up as simply taking data to/from memory and putting
>>> to/from the axi stream links. Axi stream is a trival point to point
>>> protocol that allows for pushing 32-bit data words at a time.
>>>
>>>   From an architecture point of view, the TX and RX links are completely
>>> independent of each other. It doesnt make a lot of sense to have tx or
>>> rx without the other for the ethernet with DMA case, but other
>>> applications of the DMA could use only one of tx and rx. For this
>>> reason I think its best we decouple the tx/rx pair. Currently it is
>>> coupled in qemu (hw/xilinx_axdma.h):
>>
>>
>> So is this something that should be modeled as a DMAContext ala dwg/benh's
>> IOMMU patch series?
>>
>> Wouldn't the DMA controller expose two or more DMAContextes and then the
>> Ethernet device expects a DMAContext for tx and rx.
>>
>
> Hi Anthony,
>
> This doesn't sound right to me at all WRT to modelling the real
> hardware. The ethernet controller on its own has no notion that it is
> attached to a DMA controller. I can attached the device to any
> AXI-stream tx/rx pair. The ethernet device should only has awareness
> of the axi stream protocol and have no exposure to any DMA related
> functionality.

What is the AXI stream protocol?

 From what you said earlier, it's basically:

'write data to this address'
'read data from this address'

An interface that implements this is DMAContext.  Forget about the fact that 
'DMA' is in the name.  It's really the symmetric version of a MemoryRegion.

> Heres the (futuristic) example im worried about:
>
> I have my ethernet controller attached to the DMA as usual over the
> axi-stream pairs. I then come along and decide I want to model some
> hardware packet filter device on the ethernet traffic inbetween the
> memory and ethernet. My packet filter is implemented as a axi-stream
> master and slave (basically a filtering fifo). I need to on the
> machine model level, insert it on the rx path. Heres the architecture:
>
> Ethernet ->  axistream ->  Random packet filter IP ->  axistream ->  DMA
>
> Ethernet<- axistream<- DMA
>
> In this case the ethernet is insulated from DMA. I think i can
> summarize in saying that assuming that an axi-stream link is for DMA
> is incorrect.

So first thing you need to do is figure out what the stream interface consists 
of.  You can model this as an Interface or use an existing Interface (like 
DMAContext).  You can then expose a pair of these interfaces from the controller 
using a child object (as Paul likes to say, a proxy object).  This is how 
MemoryRegions and DMAContexts work.

Regards,

Anthony Liguori

>
> Regards,
> Peter
>
>> Of course, DMAContext could be made trivially into an Object and tx/rx could
>> be turned into link<>  properties.
>>
>> Regards,
>>
>> Anthony Liguori
>>
>>
>>>
>>> struct XilinxDMAConnection {
>>>      void *dma;
>>>      void *client;
>>>
>>>      DMAPushFn to_dma;
>>>      DMAPushFn to_client;
>>> };
>>>
>>> So what im proposing is AXI stream is implemented as a unidirectional
>>> point to point bus. The xilinx ethernet system would consist of two of
>>> these buses one for tx, one for rx.
>>>
>>> Onto the QOM stuff:
>>>
>>> Currently the DMA interconnect is handled as this struct I pasted
>>> above and a QDEV_PROP_PTR (which i understand is evil). The
>>> interconnect mechanism obviously needs to change. So lets assume that
>>> AXI stream is turned into a proper QEMU bus and devices can create
>>> sub-busses which they are the tx'ing master:
>>>
>>> s->axi_stream_master = axi_stream_create_bus(&dev->qdev, "axi_stream");
>>>
>>> Machine models can grab the bus to attach slaves:
>>>
>>> foo = qdev_get_child_bus(dev, "axi_stream");
>>>
>>> Where my thinking goes pear shaped though is having proper QOMified
>>> slaves. Each IP is a slave to both the sysbus and their respective
>>> rx'ing AXI stream bus. This is something of a multiple inheritance
>>> problem, I cant inherit from both SYSBUS and AXI_STREAM_SLAVE. So to
>>> overcome this should I ...
>>>
>>> A: Make AXI_STREAM_SLAVE an interface (not a sub-class of DEVICE). Its
>>> kind of annoying though if someone in the future whats the create a
>>> device thats only and axi stream slave, as they would have to
>>> explicitly inherit from DEVICE as well.
>>>
>>> or
>>>
>>> B: Have the slave attachment be a device within a device. Hard part is
>>> getting an accessor so machine models can retrieve the slave
>>> attachment and hook it up.
>>>
>>> Let me know what to do,
>>>
>>> Regards,
>>> Peter
>>>
>>
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11  5:54       ` Paolo Bonzini
  2012-06-11 13:05         ` Peter Maydell
@ 2012-06-11 13:17         ` Anthony Liguori
  2012-06-11 13:41           ` Paolo Bonzini
  1 sibling, 1 reply; 63+ messages in thread
From: Anthony Liguori @ 2012-06-11 13:17 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Michal Simek, qemu-devel@nongnu.org Developers, Peter Crosthwaite,
	Paul Brook, Edgar E. Iglesias, Andreas Färber, John Williams

On 06/11/2012 12:54 AM, Paolo Bonzini wrote:
>>> Ive had a brief look at the refactorings on qom-next, I notice that
>>> busses are now just children of the parent object TYPE_BUS.
>>> Essentially for point-to-point links this means that link itself
>>> has a QOM object. So for finer clarification, for new busses should or
>>> should I not have an object (whether it inherit TYPE_BUS or some
>>> other parent) for the link itself? Or should The master device interface
>>> directly with its slave? Im thinking the latter, no need for an
>>> object for a trivial point-to-point link.
>
> I think AXI counts as a point-to-point link, so no bus.
>
> In my prop-ptr branch you can find some code (untested beyond compilation)
> that tried to model AXI using interfaces (git://github.com/pbonzini/qemu.git).

s/pbonzini/bonzini/g, right?

And I don't see a prop-ptr branch.  Did you forget to push?

Regards,

Anthony Liguori

> I don't know much about AXI, so I tried to reverse engineer a decent model
> from what was done without QOM.
>
> Paolo
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 13:15     ` Anthony Liguori
@ 2012-06-11 13:39       ` Peter Maydell
  2012-06-11 14:38         ` Edgar E. Iglesias
  2012-06-11 15:01         ` Anthony Liguori
  0 siblings, 2 replies; 63+ messages in thread
From: Peter Maydell @ 2012-06-11 13:39 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Michal Simek, qemu-devel@nongnu.org Developers, Peter Crosthwaite,
	Paul Brook, Anthony Liguori, Edgar E. Iglesias,
	Andreas Färber, John Williams

On 11 June 2012 14:15, Anthony Liguori <aliguori@us.ibm.com> wrote:
> From what you said earlier, it's basically:
>
> 'write data to this address'
> 'read data from this address'
>
> An interface that implements this is DMAContext.  Forget about the fact that
> 'DMA' is in the name.  It's really the symmetric version of a MemoryRegion.

...so can we fix the name?

Ideally the interface used by DMA controllers should be identical to
the interface used by CPUs to talk to the rest of the system: it's
exactly the same bus interface in hardware, after all.

-- PMM

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 13:17         ` Anthony Liguori
@ 2012-06-11 13:41           ` Paolo Bonzini
  0 siblings, 0 replies; 63+ messages in thread
From: Paolo Bonzini @ 2012-06-11 13:41 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Michal Simek, qemu-devel@nongnu.org Developers, Peter Crosthwaite,
	Paul Brook, Edgar E. Iglesias, Andreas Färber, John Williams

Il 11/06/2012 15:17, Anthony Liguori ha scritto:
>>>>
>>
>> I think AXI counts as a point-to-point link, so no bus.
>>
>> In my prop-ptr branch you can find some code (untested beyond
>> compilation)
>> that tried to model AXI using interfaces
>> (git://github.com/pbonzini/qemu.git).
> 
> s/pbonzini/bonzini/g, right?

Yes.

> And I don't see a prop-ptr branch.  Did you forget to push?

"git push" says "Everything up-to-date", and github web interface
agrees: https://github.com/bonzini/qemu/commits/prop-ptr

Paolo

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 13:39       ` Peter Maydell
@ 2012-06-11 14:38         ` Edgar E. Iglesias
  2012-06-11 14:53           ` Peter Maydell
  2012-06-11 15:01         ` Anthony Liguori
  1 sibling, 1 reply; 63+ messages in thread
From: Edgar E. Iglesias @ 2012-06-11 14:38 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Anthony Liguori, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Anthony Liguori,
	Andreas Färber, John Williams

On Mon, Jun 11, 2012 at 02:39:56PM +0100, Peter Maydell wrote:
> On 11 June 2012 14:15, Anthony Liguori <aliguori@us.ibm.com> wrote:
> > From what you said earlier, it's basically:
> >
> > 'write data to this address'
> > 'read data from this address'
> >
> > An interface that implements this is DMAContext.  Forget about the fact that
> > 'DMA' is in the name.  It's really the symmetric version of a MemoryRegion.
> 
> ...so can we fix the name?
> 
> Ideally the interface used by DMA controllers should be identical to
> the interface used by CPUs to talk to the rest of the system: it's
> exactly the same bus interface in hardware, after all.

I thought we were talking about the interface between the DMA ctrl
and the I/O (devices). Not between the DMA and the "memory" bus system.

Cheers

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 14:38         ` Edgar E. Iglesias
@ 2012-06-11 14:53           ` Peter Maydell
  2012-06-11 14:58             ` Edgar E. Iglesias
  2012-06-11 15:03             ` Anthony Liguori
  0 siblings, 2 replies; 63+ messages in thread
From: Peter Maydell @ 2012-06-11 14:53 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Anthony Liguori, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Anthony Liguori,
	Andreas Färber, John Williams

On 11 June 2012 15:38, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> On Mon, Jun 11, 2012 at 02:39:56PM +0100, Peter Maydell wrote:
>> Ideally the interface used by DMA controllers should be identical to
>> the interface used by CPUs to talk to the rest of the system: it's
>> exactly the same bus interface in hardware, after all.
>
> I thought we were talking about the interface between the DMA ctrl
> and the I/O (devices). Not between the DMA and the "memory" bus system.

In hardware (at least for AXI) they're the same thing. A DMA
controller is a bus master, just like a CPU. They don't care
whether the slave is RAM or a device, they're just issuing
memory transactions to addresses.

-- PMM

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 14:53           ` Peter Maydell
@ 2012-06-11 14:58             ` Edgar E. Iglesias
  2012-06-11 15:03             ` Anthony Liguori
  1 sibling, 0 replies; 63+ messages in thread
From: Edgar E. Iglesias @ 2012-06-11 14:58 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Anthony Liguori, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Anthony Liguori,
	Andreas Färber, John Williams

On Mon, Jun 11, 2012 at 03:53:23PM +0100, Peter Maydell wrote:
> On 11 June 2012 15:38, Edgar E. Iglesias <edgar.iglesias@gmail.com> wrote:
> > On Mon, Jun 11, 2012 at 02:39:56PM +0100, Peter Maydell wrote:
> >> Ideally the interface used by DMA controllers should be identical to
> >> the interface used by CPUs to talk to the rest of the system: it's
> >> exactly the same bus interface in hardware, after all.
> >
> > I thought we were talking about the interface between the DMA ctrl
> > and the I/O (devices). Not between the DMA and the "memory" bus system.
> 
> In hardware (at least for AXI) they're the same thing. A DMA
> controller is a bus master, just like a CPU. They don't care
> whether the slave is RAM or a device, they're just issuing
> memory transactions to addresses.

In many cases they are not the same thing. The DMA controller has
an interface to the bus (the one you refer to) and in many cases
dedicated channels connected to the various devices. These channels
may or may not resemble the bus a CPU uses, typically they don't.

There are of course systems of any kind, even ones that use the
system bus to transfer data from the DMA to/from the devices.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 13:39       ` Peter Maydell
  2012-06-11 14:38         ` Edgar E. Iglesias
@ 2012-06-11 15:01         ` Anthony Liguori
  2012-06-11 17:31           ` Avi Kivity
  1 sibling, 1 reply; 63+ messages in thread
From: Anthony Liguori @ 2012-06-11 15:01 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Anthony Liguori, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Edgar E. Iglesias,
	Andreas Färber, John Williams, Avi Kivity

On 06/11/2012 08:39 AM, Peter Maydell wrote:
> On 11 June 2012 14:15, Anthony Liguori<aliguori@us.ibm.com>  wrote:
>>  From what you said earlier, it's basically:
>>
>> 'write data to this address'
>> 'read data from this address'
>>
>> An interface that implements this is DMAContext.  Forget about the fact that
>> 'DMA' is in the name.  It's really the symmetric version of a MemoryRegion.
>
> ...so can we fix the name?

Perhaps we should just make MemoryRegion work in both directions?

Ben/Avi, what do you guys think?

Regards,

Anthony Liguori

>
> Ideally the interface used by DMA controllers should be identical to
> the interface used by CPUs to talk to the rest of the system: it's
> exactly the same bus interface in hardware, after all.
>
> -- PMM

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 14:53           ` Peter Maydell
  2012-06-11 14:58             ` Edgar E. Iglesias
@ 2012-06-11 15:03             ` Anthony Liguori
  2012-06-11 15:34               ` Peter Maydell
  2012-06-11 15:56               ` Edgar E. Iglesias
  1 sibling, 2 replies; 63+ messages in thread
From: Anthony Liguori @ 2012-06-11 15:03 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Anthony Liguori, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Edgar E. Iglesias,
	Andreas Färber, John Williams

On 06/11/2012 09:53 AM, Peter Maydell wrote:
> On 11 June 2012 15:38, Edgar E. Iglesias<edgar.iglesias@gmail.com>  wrote:
>> On Mon, Jun 11, 2012 at 02:39:56PM +0100, Peter Maydell wrote:
>>> Ideally the interface used by DMA controllers should be identical to
>>> the interface used by CPUs to talk to the rest of the system: it's
>>> exactly the same bus interface in hardware, after all.
>>
>> I thought we were talking about the interface between the DMA ctrl
>> and the I/O (devices). Not between the DMA and the "memory" bus system.
>
> In hardware (at least for AXI) they're the same thing. A DMA
> controller is a bus master, just like a CPU. They don't care
> whether the slave is RAM or a device, they're just issuing
> memory transactions to addresses.

It looks like the AXI stream interface also includes a word array.  I can't tell 
though whether this is just a decomposed scatter/gather list though.

There doesn't appear to be a notion of an address though.  You could make all 
operations go to address 0 though but it makes me wonder if that's stretching 
the concept of DMA a bit too much.

Regards,

Anthony Liguori

>
> -- PMM

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 15:03             ` Anthony Liguori
@ 2012-06-11 15:34               ` Peter Maydell
  2012-06-11 15:56               ` Edgar E. Iglesias
  1 sibling, 0 replies; 63+ messages in thread
From: Peter Maydell @ 2012-06-11 15:34 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Anthony Liguori, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Edgar E. Iglesias,
	Andreas Färber, John Williams

On 11 June 2012 16:03, Anthony Liguori <anthony@codemonkey.ws> wrote:
> On 06/11/2012 09:53 AM, Peter Maydell wrote:
>> In hardware (at least for AXI) they're the same thing. A DMA
>> controller is a bus master, just like a CPU. They don't care
>> whether the slave is RAM or a device, they're just issuing
>> memory transactions to addresses.
>
> It looks like the AXI stream interface also includes a word array.  I can't
> tell though whether this is just a decomposed scatter/gather list though.
>
> There doesn't appear to be a notion of an address though.  You could make
> all operations go to address 0 though but it makes me wonder if that's
> stretching the concept of DMA a bit too much.

Yeah, I've probably been confusing things a little there, sorry.
AXI-Stream is an addressless data stream (not necessarily point-to-point,
the protocol includes TDEST signals to indicate the 'destination' for
a data stream, so you could have a multiple-master-multiple-slave config,
or a config with a slave that accepted several incoming streams over
the same interface). AXI is the general address-based interface.

(Have just read the spec, and AXI-Stream has a number of interesting
features including optional end-of-packet signals, and the ability to
say "these bytes in the data stream are to be ignored" so you can have
a transfer which updates eg bytes 1-5 and 8-15 at the destination
but leaves 6-7 untouched.)

-- PMM

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 15:03             ` Anthony Liguori
  2012-06-11 15:34               ` Peter Maydell
@ 2012-06-11 15:56               ` Edgar E. Iglesias
  2012-06-12  0:33                 ` Peter Crosthwaite
  1 sibling, 1 reply; 63+ messages in thread
From: Edgar E. Iglesias @ 2012-06-11 15:56 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Peter Crosthwaite, Paul Brook,
	Andreas Färber, John Williams

On Mon, Jun 11, 2012 at 10:03:56AM -0500, Anthony Liguori wrote:
> On 06/11/2012 09:53 AM, Peter Maydell wrote:
> >On 11 June 2012 15:38, Edgar E. Iglesias<edgar.iglesias@gmail.com>  wrote:
> >>On Mon, Jun 11, 2012 at 02:39:56PM +0100, Peter Maydell wrote:
> >>>Ideally the interface used by DMA controllers should be identical to
> >>>the interface used by CPUs to talk to the rest of the system: it's
> >>>exactly the same bus interface in hardware, after all.
> >>
> >>I thought we were talking about the interface between the DMA ctrl
> >>and the I/O (devices). Not between the DMA and the "memory" bus system.
> >
> >In hardware (at least for AXI) they're the same thing. A DMA
> >controller is a bus master, just like a CPU. They don't care
> >whether the slave is RAM or a device, they're just issuing
> >memory transactions to addresses.
> 
> It looks like the AXI stream interface also includes a word array.
> I can't tell though whether this is just a decomposed scatter/gather
> list though.

Hi,

IIRC the word array thing is device specific, not really AXI stream.
I think the whole connection to AXI is a bit unfortunate, these devices
are pretty much the same devices that in other contexts where connected
to other bus standards. Xilinx choose to name them AXI-xxx and I used
the name in our models but I didn't really model anything that is AXI
stream specific..


> There doesn't appear to be a notion of an address though.  You could
> make all operations go to address 0 though but it makes me wonder if
> that's stretching the concept of DMA a bit too much.

Yes, IMO we need a different abstraction..

Cheers

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 15:01         ` Anthony Liguori
@ 2012-06-11 17:31           ` Avi Kivity
  2012-06-11 18:35             ` Anthony Liguori
                               ` (2 more replies)
  0 siblings, 3 replies; 63+ messages in thread
From: Avi Kivity @ 2012-06-11 17:31 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Peter Crosthwaite, Paul Brook,
	Edgar E. Iglesias, Andreas Färber, John Williams

On 06/11/2012 06:01 PM, Anthony Liguori wrote:
> On 06/11/2012 08:39 AM, Peter Maydell wrote:
>> On 11 June 2012 14:15, Anthony Liguori<aliguori@us.ibm.com>  wrote:
>>>  From what you said earlier, it's basically:
>>>
>>> 'write data to this address'
>>> 'read data from this address'
>>>
>>> An interface that implements this is DMAContext.  Forget about the
>>> fact that
>>> 'DMA' is in the name.  It's really the symmetric version of a
>>> MemoryRegion.
>>
>> ...so can we fix the name?
>
> Perhaps we should just make MemoryRegion work in both directions?
>
> Ben/Avi, what do you guys think?
>

The other direction is currently cpu_physical_memory_rw().  We do need
to support issuing transactions from arbitrary points in the memory
hierarchy, but I don't think a device's MemoryRegion is the right
interface.  Being able to respond to memory transactions, and being able
to issue them are two different things.

In fact we will probably have to add more details to the memory
hierarchy.  Currently (for example) we model the memory hub passing
transactions destined for the various pci windows to the pci bus, but we
don't model the memory hub receiving a pci-initiated transaction and
passing it to the system bus.  We simply pretend it originated on the
system bus in the first place.  Perhaps we need parallel hierarchies:

   system_memory
      alias -> pci
      alias -> ram
   pci
      bar1
      bar2
   pcibm
      alias -> pci  (prio 1)
      alias -> system_memory (prio 0)

cpu_physical_memory_rw() would be implemented as
memory_region_rw(system_memory, ...) while pci_dma_rw() would be
implemented as memory_region_rw(pcibm, ...).  This would allow different
address transformations for the two accesses.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 17:31           ` Avi Kivity
@ 2012-06-11 18:35             ` Anthony Liguori
  2012-06-11 22:00               ` [Qemu-devel] [RFC] QOMification of AXI streams Benjamin Herrenschmidt
  2012-06-12  9:31               ` [Qemu-devel] [RFC] QOMification of AXI stream Avi Kivity
  2012-06-11 18:36             ` Anthony Liguori
  2012-06-12 12:58             ` Peter Maydell
  2 siblings, 2 replies; 63+ messages in thread
From: Anthony Liguori @ 2012-06-11 18:35 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Peter Crosthwaite, Paul Brook,
	Edgar E. Iglesias, Andreas Färber, John Williams

On 06/11/2012 12:31 PM, Avi Kivity wrote:
> On 06/11/2012 06:01 PM, Anthony Liguori wrote:
>> On 06/11/2012 08:39 AM, Peter Maydell wrote:
>>> On 11 June 2012 14:15, Anthony Liguori<aliguori@us.ibm.com>   wrote:
>>>>    From what you said earlier, it's basically:
>>>>
>>>> 'write data to this address'
>>>> 'read data from this address'
>>>>
>>>> An interface that implements this is DMAContext.  Forget about the
>>>> fact that
>>>> 'DMA' is in the name.  It's really the symmetric version of a
>>>> MemoryRegion.
>>>
>>> ...so can we fix the name?
>>
>> Perhaps we should just make MemoryRegion work in both directions?
>>
>> Ben/Avi, what do you guys think?
>>
>
> The other direction is currently cpu_physical_memory_rw().

Right, and with benh's proposal, it's dma_memory_rw().  It also adds a 
DMAContext parameter.

I can't help think that the contents of DMAContext is awfully similar to 
MemoryRegionOps.

>  We do need
> to support issuing transactions from arbitrary points in the memory
> hierarchy, but I don't think a device's MemoryRegion is the right
> interface.  Being able to respond to memory transactions, and being able
> to issue them are two different things.

But an IOMMU has to be able to respond to a memory transaction.  Many of the 
things it may do (like endianness conversion) also happen to already exist in 
the memory API.

> In fact we will probably have to add more details to the memory
> hierarchy.  Currently (for example) we model the memory hub passing
> transactions destined for the various pci windows to the pci bus, but we
> don't model the memory hub receiving a pci-initiated transaction and
> passing it to the system bus.  We simply pretend it originated on the
> system bus in the first place.  Perhaps we need parallel hierarchies:
>
>     system_memory
>        alias ->  pci
>        alias ->  ram
>     pci
>        bar1
>        bar2
>     pcibm
>        alias ->  pci  (prio 1)
>        alias ->  system_memory (prio 0)
>
> cpu_physical_memory_rw() would be implemented as
> memory_region_rw(system_memory, ...) while pci_dma_rw() would be
> implemented as memory_region_rw(pcibm, ...).  This would allow different
> address transformations for the two accesses.

Yeah, this is what I'm basically thinking although I don't quite understand what 
'pcibm' stands for.

My biggest worry is that we'll end up with parallel memory API implementations 
split between memory.c and dma.c.

Regards,

Anthony Liguori

>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 17:31           ` Avi Kivity
  2012-06-11 18:35             ` Anthony Liguori
@ 2012-06-11 18:36             ` Anthony Liguori
  2012-06-12  9:51               ` Avi Kivity
  2012-06-12 12:58             ` Peter Maydell
  2 siblings, 1 reply; 63+ messages in thread
From: Anthony Liguori @ 2012-06-11 18:36 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Peter Crosthwaite, Paul Brook,
	Edgar E. Iglesias, Andreas Färber, John Williams

On 06/11/2012 12:31 PM, Avi Kivity wrote:
> On 06/11/2012 06:01 PM, Anthony Liguori wrote:
>> On 06/11/2012 08:39 AM, Peter Maydell wrote:
>>> On 11 June 2012 14:15, Anthony Liguori<aliguori@us.ibm.com>   wrote:
>>>>    From what you said earlier, it's basically:
>>>>
>>>> 'write data to this address'
>>>> 'read data from this address'
>>>>
>>>> An interface that implements this is DMAContext.  Forget about the
>>>> fact that
>>>> 'DMA' is in the name.  It's really the symmetric version of a
>>>> MemoryRegion.
>>>
>>> ...so can we fix the name?
>>
>> Perhaps we should just make MemoryRegion work in both directions?
>>
>> Ben/Avi, what do you guys think?
>>
>
> The other direction is currently cpu_physical_memory_rw().  We do need
> to support issuing transactions from arbitrary points in the memory
> hierarchy, but I don't think a device's MemoryRegion is the right
> interface.  Being able to respond to memory transactions, and being able
> to issue them are two different things.
>
> In fact we will probably have to add more details to the memory
> hierarchy.  Currently (for example) we model the memory hub passing
> transactions destined for the various pci windows to the pci bus, but we
> don't model the memory hub receiving a pci-initiated transaction and
> passing it to the system bus.  We simply pretend it originated on the
> system bus in the first place.  Perhaps we need parallel hierarchies:
>
>     system_memory
>        alias ->  pci
>        alias ->  ram
>     pci
>        bar1
>        bar2
>     pcibm
>        alias ->  pci  (prio 1)
>        alias ->  system_memory (prio 0)
>
> cpu_physical_memory_rw() would be implemented as
> memory_region_rw(system_memory, ...) while pci_dma_rw() would be
> implemented as memory_region_rw(pcibm, ...).  This would allow different
> address transformations for the two accesses.

BTW, the main problem with the memory API right now is that there isn't a 
'MemoryRegion *mr' as the first argument to the dispatch functions.  This could 
be fixed by introducing yet another set of function pointers and keeping the 
existing callers unchanged though.

Regards,

Anthony Liguori

>
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-11 18:35             ` Anthony Liguori
@ 2012-06-11 22:00               ` Benjamin Herrenschmidt
  2012-06-11 22:29                 ` Anthony Liguori
  2012-06-12  1:04                 ` Andreas Färber
  2012-06-12  9:31               ` [Qemu-devel] [RFC] QOMification of AXI stream Avi Kivity
  1 sibling, 2 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-11 22:00 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Edgar E. Iglesias,
	Andreas Färber, John Williams, Paul Brook

> >     system_memory
> >        alias ->  pci
> >        alias ->  ram
> >     pci
> >        bar1
> >        bar2
> >     pcibm
> >        alias ->  pci  (prio 1)
> >        alias ->  system_memory (prio 0)
> >
> > cpu_physical_memory_rw() would be implemented as
> > memory_region_rw(system_memory, ...) while pci_dma_rw() would be
> > implemented as memory_region_rw(pcibm, ...).  This would allo
> different address transformations for the two accesses.
> 
> Yeah, this is what I'm basically thinking although I don't quite
> understand what  'pcibm' stands for.
> 
> My biggest worry is that we'll end up with parallel memory API
> implementations split between memory.c and dma.c.

So it makes some amount of sense to use the same structure. For example,
if a device issues accesses, those could be caught by a sibling device
memory region... or go upstream.

Let's just look at downstream transformation for a minute...

We do need to be a bit careful about transformation here: I need to
double check but I don't think we do transformation downstream today in
a clean way and we'd have to do that. IE. On pseries for example, the
PCI host bridge has a window in the CPU address space of [A...A+S], but
accesses to that window generates PCI cycles with different addresses
[B...B+S] (with typically A and B both being naturally aligned on S so
it's just a bit masking in HW).

We somewhat implements that in spapr_pci today since it works but I
don't quite understand how :-) Or rather, the terminology "alias" seems
to be fairly bogus, we aren't talking about aliases here...

So today we create a memory region with an "alias" (whatever that means)
that is [B...B+S] and add a subregion which is [A...A+S]. That seems to
work but but it's obscure.

If I was to implement that, I would make it so that the struct
MemoryRegion used in that hierarchy contains the address in the local
domain -and- the transformed address in the CPU domain, so you can still
sort them by CPU addresses for quick access and make this offsetting a
standard property of any memory region since it's very common that
busses drop address bits along the way.

Now, if you want to use that structure for DMA, what you need to do
first is when an access happens, walk up the region tree and scan for
all siblings at every level, which can be costly.

Additionally to handle iommu's etc... you need the option for a given
memory region to have functions to perform the transformation in the
upstream direction.

To be true to the HW, each bridge should have its memory region, so a
setup with

      /pci-host
          |
          |--/p2p
               |
	       |--/device 

Any DMA done by the device would walk through the p2p region to the host
which would contain a region with transform ops.

However, at each level, you'd have to search for sibling regions that
may decode the address at that level before moving up, ie implement
essentially the equivalent of the PCI substractive decoding scheme.

That will be a significant overhead for your DMA ops I believe, though
doable.

Then we'd have to add map/unmap to MemoryRegion as well, with the
understanding that they may not be supported at every level...

So yeah, it sounds doable and it would handle what DMAContext doesn't
handle which is access to peer devices without going all the way back to
the "top level", but it's complex and ... I need something in qemu
1.2 :-)

In addition there's the memory barrier business so we probably want to
keep the idea of having DMA specific accessors ...

Could we keep the DMAContext for now and just rename it to MemoryRegion
(keeping the accessors) when we go for a more in depth transformation ?

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-11 22:00               ` [Qemu-devel] [RFC] QOMification of AXI streams Benjamin Herrenschmidt
@ 2012-06-11 22:29                 ` Anthony Liguori
  2012-06-11 23:46                   ` Benjamin Herrenschmidt
  2012-06-12  9:46                   ` Avi Kivity
  2012-06-12  1:04                 ` Andreas Färber
  1 sibling, 2 replies; 63+ messages in thread
From: Anthony Liguori @ 2012-06-11 22:29 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Edgar E. Iglesias,
	Andreas Färber, John Williams, Paul Brook

On 06/11/2012 05:00 PM, Benjamin Herrenschmidt wrote:
>>>      system_memory
>>>         alias ->   pci
>>>         alias ->   ram
>>>      pci
>>>         bar1
>>>         bar2
>>>      pcibm
>>>         alias ->   pci  (prio 1)
>>>         alias ->   system_memory (prio 0)
>>>
>>> cpu_physical_memory_rw() would be implemented as
>>> memory_region_rw(system_memory, ...) while pci_dma_rw() would be
>>> implemented as memory_region_rw(pcibm, ...).  This would allo
>> different address transformations for the two accesses.
>>
>> Yeah, this is what I'm basically thinking although I don't quite
>> understand what  'pcibm' stands for.
>>
>> My biggest worry is that we'll end up with parallel memory API
>> implementations split between memory.c and dma.c.
>
> So it makes some amount of sense to use the same structure. For example,
> if a device issues accesses, those could be caught by a sibling device
> memory region... or go upstream.
>
> Let's just look at downstream transformation for a minute...
>
> We do need to be a bit careful about transformation here: I need to
> double check but I don't think we do transformation downstream today in
> a clean way and we'd have to do that. IE. On pseries for example, the
> PCI host bridge has a window in the CPU address space of [A...A+S], but
> accesses to that window generates PCI cycles with different addresses
> [B...B+S] (with typically A and B both being naturally aligned on S so
> it's just a bit masking in HW).

I don't know that we really have bit masking done right in the memory API.

When we add a subregion, it always removes the offset from the address when it 
dispatches.  This more often than not works out well but for what you're 
describing above, it sounds like you'd really want to get an adjusted size (that 
could be transformed).

Today we generate a linear dispatch table.  This prevents us from applying 
device-level transforms.

> We somewhat implements that in spapr_pci today since it works but I
> don't quite understand how :-) Or rather, the terminology "alias" seems
> to be fairly bogus, we aren't talking about aliases here...
>
> So today we create a memory region with an "alias" (whatever that means)
> that is [B...B+S] and add a subregion which is [A...A+S]. That seems to
> work but but it's obscure.
>
> If I was to implement that, I would make it so that the struct
> MemoryRegion used in that hierarchy contains the address in the local
> domain -and- the transformed address in the CPU domain, so you can still
> sort them by CPU addresses for quick access and make this offsetting a
> standard property of any memory region since it's very common that
> busses drop address bits along the way.
>
> Now, if you want to use that structure for DMA, what you need to do
> first is when an access happens, walk up the region tree and scan for
> all siblings at every level, which can be costly.

So if you stick with the notion of subregions, you would still have a single 
MemoryRegion at the PCI bus layer that has all of it's children as sub regions. 
  Presumably that "scan for all siblings" is a binary search which shouldn't 
really be that expensive considering that we're likely to have a shallow depth 
in the memory hierarchy.

>
> Additionally to handle iommu's etc... you need the option for a given
> memory region to have functions to perform the transformation in the
> upstream direction.

I think that transformation function lives in the bus layer MemoryRegion.  It's 
a bit tricky though because you need some sort of notion of "who is asking".  So 
you need:

dma_memory_write(MemoryRegion *parent, DeviceState *caller,
                  const void *data, size_t size);

This could be simplified at each layer via:

void pci_device_write(PCIDevice *dev, const void *data, size_t size) {
     dma_memory_write(dev->bus->mr, DEVICE(dev), data, size);
}

> To be true to the HW, each bridge should have its memory region, so a
> setup with
>
>        /pci-host
>            |
>            |--/p2p
>                 |
> 	       |--/device
>
> Any DMA done by the device would walk through the p2p region to the host
> which would contain a region with transform ops.
>
> However, at each level, you'd have to search for sibling regions that
> may decode the address at that level before moving up, ie implement
> essentially the equivalent of the PCI substractive decoding scheme.

Not quite...  subtractive decoding only happens for very specific devices IIUC. 
  For instance, an PCI-ISA bridge.  Normally, it's positive decoding and a 
bridge has to describe the full region of MMIO/PIO that it handles.

So it's only necessary to transverse down the tree again for the very special 
case of PCI-ISA bridges.  Normally you can tell just by looking at siblings.

> That will be a significant overhead for your DMA ops I believe, though
> doable.

Worst case scenario, 256 devices with what, a 3 level deep hierarchy?  we're 
still talking about 24 simple address compares.  That shouldn't be so bad.

> Then we'd have to add map/unmap to MemoryRegion as well, with the
> understanding that they may not be supported at every level...

map/unmap can always fall back to bounce buffers.

> So yeah, it sounds doable and it would handle what DMAContext doesn't
> handle which is access to peer devices without going all the way back to
> the "top level", but it's complex and ... I need something in qemu
> 1.2 :-)

I think we need a longer term vision here.  We can find incremental solutions 
for the short term but I'm pretty nervous about having two parallel APIs only to 
discover that we need to converge in 2 years.

Regards,

Anthony Liguori


> In addition there's the memory barrier business so we probably want to
> keep the idea of having DMA specific accessors ...
>
> Could we keep the DMAContext for now and just rename it to MemoryRegion
> (keeping the accessors) when we go for a more in depth transformation ?
>
> Cheers,
> Ben.
>
>
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-11 22:29                 ` Anthony Liguori
@ 2012-06-11 23:46                   ` Benjamin Herrenschmidt
  2012-06-12  1:33                     ` Anthony Liguori
  2012-06-12  9:46                   ` Avi Kivity
  1 sibling, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-11 23:46 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Edgar E. Iglesias,
	Andreas Färber, John Williams, Paul Brook

On Mon, 2012-06-11 at 17:29 -0500, Anthony Liguori wrote:

> I don't know that we really have bit masking done right in the memory API.

That's not a big deal:

> When we add a subregion, it always removes the offset from the address when it 
> dispatches.  This more often than not works out well but for what you're 
> describing above, it sounds like you'd really want to get an adjusted size (that 
> could be transformed).
> 
> Today we generate a linear dispatch table.  This prevents us from applying 
> device-level transforms.

subregions being relative to the parent would probably work for me.

Typically I have something like:

  0x1234_0000_0000 .. 0x1234_3fff_ffff : window to PCI memory 

Which generates PCI cycles into the range:

  0x0000_c000_0000 .. 0x0000_ffff_ffff

Which is a combination of bit masking & offset, ie, the HW algorithm is
to forward some bits (0x3fff_ffff in this case) and replace the other
ones (with 0xc000_0000) in this case.

Is that properly represented in the current scheme by a subregion ?

> > We somewhat implements that in spapr_pci today since it works but I
> > don't quite understand how :-) Or rather, the terminology "alias" seems
> > to be fairly bogus, we aren't talking about aliases here...
> >
> > So today we create a memory region with an "alias" (whatever that means)
> > that is [B...B+S] and add a subregion which is [A...A+S]. That seems to
> > work but but it's obscure.
> >
> > If I was to implement that, I would make it so that the struct
> > MemoryRegion used in that hierarchy contains the address in the local
> > domain -and- the transformed address in the CPU domain, so you can still
> > sort them by CPU addresses for quick access and make this offsetting a
> > standard property of any memory region since it's very common that
> > busses drop address bits along the way.
> >
> > Now, if you want to use that structure for DMA, what you need to do
> > first is when an access happens, walk up the region tree and scan for
> > all siblings at every level, which can be costly.
> 
> So if you stick with the notion of subregions, you would still have a single 
> MemoryRegion at the PCI bus layer that has all of it's children as sub regions. 
>   Presumably that "scan for all siblings" is a binary search which shouldn't 
> really be that expensive considering that we're likely to have a shallow depth 
> in the memory hierarchy.

Possibly but on every DMA access ...

> > Additionally to handle iommu's etc... you need the option for a given
> > memory region to have functions to perform the transformation in the
> > upstream direction.
> 
> I think that transformation function lives in the bus layer MemoryRegion.  It's 
> a bit tricky though because you need some sort of notion of "who is asking".  So 
> you need:
> 
> dma_memory_write(MemoryRegion *parent, DeviceState *caller,
>                   const void *data, size_t size);

Why do you need the parent argument ? Can't it be implied from the
DeviceState ? Or do you envision devices sitting on multiple busses ? Or
it's just a matter that each device "class" will store that separately ?

Also you want the parent to be the direct parent P2P no ? Not the host
bridge ? Ie. to be completely correct, you want to look at every sibling
device under a given bridge, then go up if there is no match, etc...
until you reach the PHB at which point you hit the iommu and eventually
the system fabric.

> This could be simplified at each layer via:
> 
> void pci_device_write(PCIDevice *dev, const void *data, size_t size) {
>      dma_memory_write(dev->bus->mr, DEVICE(dev), data, size);
> }

Ok.

> > To be true to the HW, each bridge should have its memory region, so a
> > setup with
> >
> >        /pci-host
> >            |
> >            |--/p2p
> >                 |
> > 	       |--/device
> >
> > Any DMA done by the device would walk through the p2p region to the host
> > which would contain a region with transform ops.
> >
> > However, at each level, you'd have to search for sibling regions that
> > may decode the address at that level before moving up, ie implement
> > essentially the equivalent of the PCI substractive decoding scheme.
> 
> Not quite...  subtractive decoding only happens for very specific devices IIUC. 
>   For instance, an PCI-ISA bridge.  Normally, it's positive decoding and a 
> bridge has to describe the full region of MMIO/PIO that it handles.

That's for downstream. Upstream is essentially substractive as far as I
can tell. IE. If no sibling device decodes the cycle, then it goes up
no ? I don't remember off hand (actually I wonder how it works if a
device does a cycle to address A and it's both below a P2P bridge and
there's a substractive decoding bridge next door... who gets the cycle ?
upstream of substractive ?).

The problem is non-existent on PCIe of course.

> So it's only necessary to transverse down the tree again for the very special 
> case of PCI-ISA bridges.  Normally you can tell just by looking at siblings.
> 
> > That will be a significant overhead for your DMA ops I believe, though
> > doable.
> 
> Worst case scenario, 256 devices with what, a 3 level deep hierarchy?  we're 
> still talking about 24 simple address compares.  That shouldn't be so bad.

Per DMA access...

> > Then we'd have to add map/unmap to MemoryRegion as well, with the
> > understanding that they may not be supported at every level...
> 
> map/unmap can always fall back to bounce buffers.

Right.

> > So yeah, it sounds doable and it would handle what DMAContext doesn't
> > handle which is access to peer devices without going all the way back to
> > the "top level", but it's complex and ... I need something in qemu
> > 1.2 :-)
> 
> I think we need a longer term vision here.  We can find incremental solutions 
> for the short term but I'm pretty nervous about having two parallel APIs only to 
> discover that we need to converge in 2 years.

Can we agree that what we need to avoid having to change every second
day is the driver API ?

In which case, what about I come up with some pci_* DMA APIs such as the
ones you suggested above and fixup a handful of devices we care about to
them.

Under the hood, we can make it use the DMAContext for now and have that
working for us, but we can easily "merge" DMAContext and MemoryRegion
later since it's not directly exposed to devices.

IE. The devices would only use:

  pci_device_read/write/map/unmap(PCIDevice,...)

And DMAContext remains burried in the implementation.

Would that be ok with you ?

Cheers,
Ben.

> Regards,
> 
> Anthony Liguori
> 
> 
> > In addition there's the memory barrier business so we probably want to
> > keep the idea of having DMA specific accessors ...
> >
> > Could we keep the DMAContext for now and just rename it to MemoryRegion
> > (keeping the accessors) when we go for a more in depth transformation ?
> >
> > Cheers,
> > Ben.
> >
> >
> >

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 15:56               ` Edgar E. Iglesias
@ 2012-06-12  0:33                 ` Peter Crosthwaite
  2012-06-12  7:58                   ` Edgar E. Iglesias
  0 siblings, 1 reply; 63+ messages in thread
From: Peter Crosthwaite @ 2012-06-12  0:33 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Paul Brook, Anthony Liguori,
	Andreas Färber, John Williams

>
> Hi,
>
> IIRC the word array thing is device specific, not really AXI stream.
> I think the whole connection to AXI is a bit unfortunate,

Yes, so I think we can summaries the confusion here with AXI stream is
completely unreleated to AXI. The connection between the two comes
with the way low level signalling is handled. (e.g. the SRC_RDY and
DST_RDY pins etc. etc.), thats common between AXI and AXI stream hence
the shared name. This is of course completely irrelevant to QEMU.

these devices
> are pretty much the same devices that in other contexts where connected
> to other bus standards. Xilinx choose to name them AXI-xxx and I used
> the name in our models but I didn't really model anything that is AXI
> stream specific..
>

Theres not a lot to model. With the signalling stuff taken away, its
pretty much just a unidirectional point-to-point data bus. There is
the T_LAST bit, and the extended capabilities PMM mentioned.

>
>> There doesn't appear to be a notion of an address though.  You could
>> make all operations go to address 0 though but it makes me wonder if
>> that's stretching the concept of DMA a bit too much.
>
> Yes, IMO we need a different abstraction..
>

Seconded, and the other argument, is that the protocol should be
usable for non-memory-access purposes (i.e. AXI stream sources and
sinks do not have to be RAM, ROM etc).

> Cheers

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-11 22:00               ` [Qemu-devel] [RFC] QOMification of AXI streams Benjamin Herrenschmidt
  2012-06-11 22:29                 ` Anthony Liguori
@ 2012-06-12  1:04                 ` Andreas Färber
  2012-06-12  2:42                   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 63+ messages in thread
From: Andreas Färber @ 2012-06-12  1:04 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Avi Kivity
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Peter Crosthwaite, Paul Brook,
	Anthony Liguori, Edgar E. Iglesias, John Williams

Am 12.06.2012 00:00, schrieb Benjamin Herrenschmidt:
>>>     system_memory
>>>        alias ->  pci
>>>        alias ->  ram
>>>     pci
>>>        bar1
>>>        bar2
>>>     pcibm
>>>        alias ->  pci  (prio 1)
>>>        alias ->  system_memory (prio 0)
>>>
>>> cpu_physical_memory_rw() would be implemented as
>>> memory_region_rw(system_memory, ...) while pci_dma_rw() would be
>>> implemented as memory_region_rw(pcibm, ...).  This would allo
>> different address transformations for the two accesses.
>>
>> Yeah, this is what I'm basically thinking although I don't quite
>> understand what  'pcibm' stands for.
>>
>> My biggest worry is that we'll end up with parallel memory API
>> implementations split between memory.c and dma.c.
> 
> So it makes some amount of sense to use the same structure. For example,
> if a device issues accesses, those could be caught by a sibling device
> memory region... or go upstream.
> 
> Let's just look at downstream transformation for a minute...
> 
> We do need to be a bit careful about transformation here: I need to
> double check but I don't think we do transformation downstream today in
> a clean way and we'd have to do that. IE. On pseries for example, the
> PCI host bridge has a window in the CPU address space of [A...A+S], but
[snip]

That's not quite the way we're modelling it yet as shown by Avi above,
there is no CPU address space, only a "system" address space.

The way we're modelling it today is shoving everything into a global
machine-level adress space which many devices access themselves via
get_system_memory() and get_system_io() because there's no easy way to
pass it to them except for exposing their struct and setting a field
before qdev_init_nofail().
http://git.qemu.org/?p=qemu.git;a=blob;f=hw/ppc_prep.c;h=be2b26830deed1f728c01aa8f0582b63540919a8;hb=HEAD#l588

Can't each CPUState get a MemoryRegion for its CPU address space, which
then can have subregions/aliases for the one system_memory with its
subregions for PCI host bridge etc.? Then there's no need any more to
have a cpu_physical_memory_rw(), is there?

Cheers,
Andreas

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-11 23:46                   ` Benjamin Herrenschmidt
@ 2012-06-12  1:33                     ` Anthony Liguori
  2012-06-12  2:06                       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 63+ messages in thread
From: Anthony Liguori @ 2012-06-12  1:33 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Maydell, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Edgar E. Iglesias,
	Andreas Färber, John Williams, Avi Kivity

On 06/11/2012 06:46 PM, Benjamin Herrenschmidt wrote:
> On Mon, 2012-06-11 at 17:29 -0500, Anthony Liguori wrote:
>
>> When we add a subregion, it always removes the offset from the address when it
>> dispatches.  This more often than not works out well but for what you're
>> describing above, it sounds like you'd really want to get an adjusted size (that
>> could be transformed).
>>
>> Today we generate a linear dispatch table.  This prevents us from applying
>> device-level transforms.
>
> subregions being relative to the parent would probably work for me.
>
> Typically I have something like:
>
>    0x1234_0000_0000 .. 0x1234_3fff_ffff : window to PCI memory
>
> Which generates PCI cycles into the range:
>
>    0x0000_c000_0000 .. 0x0000_ffff_ffff
>
> Which is a combination of bit masking&  offset, ie, the HW algorithm is
> to forward some bits (0x3fff_ffff in this case) and replace the other
> ones (with 0xc000_0000) in this case.
>
> Is that properly represented in the current scheme by a subregion ?

Does this mean that both 0x1234..0000_0000is a valid address to issue PCI 
requests along with 0x0000_c000_000?

If they both are valid, that sounds pretty much like what aliases were made for.

>> So if you stick with the notion of subregions, you would still have a single
>> MemoryRegion at the PCI bus layer that has all of it's children as sub regions.
>>    Presumably that "scan for all siblings" is a binary search which shouldn't
>> really be that expensive considering that we're likely to have a shallow depth
>> in the memory hierarchy.
>
> Possibly but on every DMA access ...

We can always as a TLB :-)

>>> Additionally to handle iommu's etc... you need the option for a given
>>> memory region to have functions to perform the transformation in the
>>> upstream direction.
>>
>> I think that transformation function lives in the bus layer MemoryRegion.  It's
>> a bit tricky though because you need some sort of notion of "who is asking".  So
>> you need:
>>
>> dma_memory_write(MemoryRegion *parent, DeviceState *caller,
>>                    const void *data, size_t size);
>
> Why do you need the parent argument ? Can't it be implied from the
> DeviceState ? Or do you envision devices sitting on multiple busses ? Or
> it's just a matter that each device "class" will store that separately ?

Yeah, you may have multiple MemoryRegions you want to write to.  I don't think 
we should assume all devices have exactly one address space it writes to.

> Also you want the parent to be the direct parent P2P no ? Not the host
> bridge ? Ie. to be completely correct, you want to look at every sibling
> device under a given bridge, then go up if there is no match, etc...
> until you reach the PHB at which point you hit the iommu and eventually
> the system fabric.

Yes.

>> Not quite...  subtractive decoding only happens for very specific devices IIUC.
>>    For instance, an PCI-ISA bridge.  Normally, it's positive decoding and a
>> bridge has to describe the full region of MMIO/PIO that it handles.
>
> That's for downstream. Upstream is essentially substractive as far as I
> can tell. IE. If no sibling device decodes the cycle, then it goes up
> no ?

Why would that be necessary?  Are upstream requests p2p?  I would have assumed 
that they went to the host bus and then were dispatched by the host bus (but I'm 
just guess).

>> So it's only necessary to transverse down the tree again for the very special
>> case of PCI-ISA bridges.  Normally you can tell just by looking at siblings.
>>
>>> That will be a significant overhead for your DMA ops I believe, though
>>> doable.
>>
>> Worst case scenario, 256 devices with what, a 3 level deep hierarchy?  we're
>> still talking about 24 simple address compares.  That shouldn't be so bad.
>
> Per DMA access...
>
>>> Then we'd have to add map/unmap to MemoryRegion as well, with the
>>> understanding that they may not be supported at every level...
>>
>> map/unmap can always fall back to bounce buffers.
>
> Right.
>
>>> So yeah, it sounds doable and it would handle what DMAContext doesn't
>>> handle which is access to peer devices without going all the way back to
>>> the "top level", but it's complex and ... I need something in qemu
>>> 1.2 :-)
>>
>> I think we need a longer term vision here.  We can find incremental solutions
>> for the short term but I'm pretty nervous about having two parallel APIs only to
>> discover that we need to converge in 2 years.
>
> Can we agree that what we need to avoid having to change every second
> day is the driver API ?
>
> In which case, what about I come up with some pci_* DMA APIs such as the
> ones you suggested above and fixup a handful of devices we care about to
> them.

Yes!

If I recall, that was what I suggested originally and you talked me out of it :-)

I think that's the best future proof design.

Regards,

Anthony Liguori

>
> Under the hood, we can make it use the DMAContext for now and have that
> working for us, but we can easily "merge" DMAContext and MemoryRegion
> later since it's not directly exposed to devices.
>
> IE. The devices would only use:
>
>    pci_device_read/write/map/unmap(PCIDevice,...)
>
> And DMAContext remains burried in the implementation.
>
> Would that be ok with you ?
>
> Cheers,
> Ben.
>
>> Regards,
>>
>> Anthony Liguori
>>
>>
>>> In addition there's the memory barrier business so we probably want to
>>> keep the idea of having DMA specific accessors ...
>>>
>>> Could we keep the DMAContext for now and just rename it to MemoryRegion
>>> (keeping the accessors) when we go for a more in depth transformation ?
>>>
>>> Cheers,
>>> Ben.
>>>
>>>
>>>
>
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-12  1:33                     ` Anthony Liguori
@ 2012-06-12  2:06                       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-12  2:06 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Edgar E. Iglesias,
	Andreas Färber, John Williams, Avi Kivity

On Mon, 2012-06-11 at 20:33 -0500, Anthony Liguori wrote:
> On 06/11/2012 06:46 PM, Benjamin Herrenschmidt wrote:
> > On Mon, 2012-06-11 at 17:29 -0500, Anthony Liguori wrote:
> >
> >> When we add a subregion, it always removes the offset from the address when it
> >> dispatches.  This more often than not works out well but for what you're
> >> describing above, it sounds like you'd really want to get an adjusted size (that
> >> could be transformed).
> >>
> >> Today we generate a linear dispatch table.  This prevents us from applying
> >> device-level transforms.
> >
> > subregions being relative to the parent would probably work for me.
> >
> > Typically I have something like:
> >
> >    0x1234_0000_0000 .. 0x1234_3fff_ffff : window to PCI memory
> >
> > Which generates PCI cycles into the range:
> >
> >    0x0000_c000_0000 .. 0x0000_ffff_ffff
> >
> > Which is a combination of bit masking&  offset, ie, the HW algorithm is
> > to forward some bits (0x3fff_ffff in this case) and replace the other
> > ones (with 0xc000_0000) in this case.
> >
> > Is that properly represented in the current scheme by a subregion ?
> 
> Does this mean that both 0x1234..0000_0000is a valid address to issue PCI 
> requests along with 0x0000_c000_000?
> 
> If they both are valid, that sounds pretty much like what aliases were made for.

No. 0x1234_0000_0000 is what the CPU accesses (the window into the CPU
space) and an access to that address will generate a PCI access with
address 0xc000_0000 on the PCI bus.

> >> So if you stick with the notion of subregions, you would still have a single
> >> MemoryRegion at the PCI bus layer that has all of it's children as sub regions.
> >>    Presumably that "scan for all siblings" is a binary search which shouldn't
> >> really be that expensive considering that we're likely to have a shallow depth
> >> in the memory hierarchy.
> >
> > Possibly but on every DMA access ...
> 
> We can always as a TLB :-)

Yeah... at least up to the first guy who tries to translate... unless we
start caching translations too :-)

> >>> Additionally to handle iommu's etc... you need the option for a given
> >>> memory region to have functions to perform the transformation in the
> >>> upstream direction.
> >>
> >> I think that transformation function lives in the bus layer MemoryRegion.  It's
> >> a bit tricky though because you need some sort of notion of "who is asking".  So
> >> you need:
> >>
> >> dma_memory_write(MemoryRegion *parent, DeviceState *caller,
> >>                    const void *data, size_t size);
> >
> > Why do you need the parent argument ? Can't it be implied from the
> > DeviceState ? Or do you envision devices sitting on multiple busses ? Or
> > it's just a matter that each device "class" will store that separately ?
> 
> Yeah, you may have multiple MemoryRegions you want to write to.  I don't think 
> we should assume all devices have exactly one address space it writes to.

Ok.

> > Also you want the parent to be the direct parent P2P no ? Not the host
> > bridge ? Ie. to be completely correct, you want to look at every sibling
> > device under a given bridge, then go up if there is no match, etc...
> > until you reach the PHB at which point you hit the iommu and eventually
> > the system fabric.
> 
> Yes.
> 
> >> Not quite...  subtractive decoding only happens for very specific devices IIUC.
> >>    For instance, an PCI-ISA bridge.  Normally, it's positive decoding and a
> >> bridge has to describe the full region of MMIO/PIO that it handles.
> >
> > That's for downstream. Upstream is essentially substractive as far as I
> > can tell. IE. If no sibling device decodes the cycle, then it goes up
> > no ?
> 
> Why would that be necessary?  Are upstream requests p2p?  I would have assumed 
> that they went to the host bus and then were dispatched by the host bus (but I'm 
> just guess).

On classic PCI there's no such thing:

  Host
   |
  P2P
   |-----|-----|
 dev0  dev1  dev2

In a setup like the above, if dev0 does a write to address A, from the
perspective of the bus segment where dev0,1 and 2 sit, this is just an
initiator requesting arbitration, and when granted it, doing a write to
address A. There is no concept of "upstream". The P2P bridge downstream
leg, for all intent and purpose is just another device on that bus
segment.

So it depends who decodes A and if neither dev1 nor dev2 does, then the
P2P gets it and passes it up the food chain (it doesn't have a
configurable "range" of upstream addresses to decode). 

> >> So it's only necessary to transverse down the tree again for the very special
> >> case of PCI-ISA bridges.  Normally you can tell just by looking at siblings.
> >>
> >>> That will be a significant overhead for your DMA ops I believe, though
> >>> doable.
> >>
> >> Worst case scenario, 256 devices with what, a 3 level deep hierarchy?  we're
> >> still talking about 24 simple address compares.  That shouldn't be so bad.
> >
> > Per DMA access...
> >
> >>> Then we'd have to add map/unmap to MemoryRegion as well, with the
> >>> understanding that they may not be supported at every level...
> >>
> >> map/unmap can always fall back to bounce buffers.
> >
> > Right.
> >
> >>> So yeah, it sounds doable and it would handle what DMAContext doesn't
> >>> handle which is access to peer devices without going all the way back to
> >>> the "top level", but it's complex and ... I need something in qemu
> >>> 1.2 :-)
> >>
> >> I think we need a longer term vision here.  We can find incremental solutions
> >> for the short term but I'm pretty nervous about having two parallel APIs only to
> >> discover that we need to converge in 2 years.
> >
> > Can we agree that what we need to avoid having to change every second
> > day is the driver API ?
> >
> > In which case, what about I come up with some pci_* DMA APIs such as the
> > ones you suggested above and fixup a handful of devices we care about to
> > them.
> 
> Yes!
> 
> If I recall, that was what I suggested originally and you talked me out of it :-)
> 
> I think that's the best future proof design.

Yeah well, I don't remember exactly and we all make mistakes but bus
specific helpers are a good idea, I don't remember talking you out of
that specifically :-)

I'll respin the patches along those lines so we can easily move toward
something fancier without having to impact all devices.

Cheers,
Ben.

> Regards,
> 
> Anthony Liguori
> 
> >
> > Under the hood, we can make it use the DMAContext for now and have that
> > working for us, but we can easily "merge" DMAContext and MemoryRegion
> > later since it's not directly exposed to devices.
> >
> > IE. The devices would only use:
> >
> >    pci_device_read/write/map/unmap(PCIDevice,...)
> >
> > And DMAContext remains burried in the implementation.
> >
> > Would that be ok with you ?
> >
> > Cheers,
> > Ben.
> >
> >> Regards,
> >>
> >> Anthony Liguori
> >>
> >>
> >>> In addition there's the memory barrier business so we probably want to
> >>> keep the idea of having DMA specific accessors ...
> >>>
> >>> Could we keep the DMAContext for now and just rename it to MemoryRegion
> >>> (keeping the accessors) when we go for a more in depth transformation ?
> >>>
> >>> Cheers,
> >>> Ben.
> >>>
> >>>
> >>>
> >
> >

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-12  1:04                 ` Andreas Färber
@ 2012-06-12  2:42                   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-12  2:42 UTC (permalink / raw)
  To: Andreas Färber
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Anthony Liguori,
	Edgar E. Iglesias, John Williams, Paul Brook

On Tue, 2012-06-12 at 03:04 +0200, Andreas Färber wrote:

> That's not quite the way we're modelling it yet as shown by Avi above,
> there is no CPU address space, only a "system" address space.

That can be considered as roughly equivalent for now, though it might
become problematic when modelling crazy embedded setups, I think at this
stage, we can consider that all the CPUs have the same view of the world
which is more/less the "system bus".

> The way we're modelling it today is shoving everything into a global
> machine-level adress space which many devices access themselves via
> get_system_memory() and get_system_io() because there's no easy way to
> pass it to them except for exposing their struct and setting a field
> before qdev_init_nofail().
> http://git.qemu.org/?p=qemu.git;a=blob;f=hw/ppc_prep.c;h=be2b26830deed1f728c01aa8f0582b63540919a8;hb=HEAD#l588
> 
> Can't each CPUState get a MemoryRegion for its CPU address space, which
> then can have subregions/aliases for the one system_memory with its
> subregions for PCI host bridge etc.? Then there's no need any more to
> have a cpu_physical_memory_rw(), is there?

I think that's less urgent. The problem is really when going from that
"system bus" down to various IO busses, whether they are AXI, PLB4/5/6,
PCI or PCI behind any of the previous ones, etc....

There's potentially translation happening when crossing those bridges,
though most of the time, this can be represented in the form of one or
more "window" that forward a given region of the parent bus down to the
child bus with an offset.

>From there, the child bus will itself potentially have a collection of
stacked up bridges that may or may not translate or may or may not
convey portions of that address space.

The rules for translation "upstream" are similar, except that often you
have an iommu on the way (though potentially there could be one on the
way downstream as well).

The "MemoryRegion" structure is a bit of an odd way to represent this
but it could be re-used, though I reckon we should probably generalize
the concept of "offset".

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-12  0:33                 ` Peter Crosthwaite
@ 2012-06-12  7:58                   ` Edgar E. Iglesias
  2012-06-14  1:01                     ` Peter Crosthwaite
  0 siblings, 1 reply; 63+ messages in thread
From: Edgar E. Iglesias @ 2012-06-12  7:58 UTC (permalink / raw)
  To: Peter Crosthwaite
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Paul Brook, Anthony Liguori,
	Andreas Färber, John Williams

On Tue, Jun 12, 2012 at 10:33:51AM +1000, Peter Crosthwaite wrote:
> >
> > Hi,
> >
> > IIRC the word array thing is device specific, not really AXI stream.
> > I think the whole connection to AXI is a bit unfortunate,
> 
> Yes, so I think we can summaries the confusion here with AXI stream is
> completely unreleated to AXI. The connection between the two comes
> with the way low level signalling is handled. (e.g. the SRC_RDY and
> DST_RDY pins etc. etc.), thats common between AXI and AXI stream hence
> the shared name. This is of course completely irrelevant to QEMU.
> 
> these devices
> > are pretty much the same devices that in other contexts where connected
> > to other bus standards. Xilinx choose to name them AXI-xxx and I used
> > the name in our models but I didn't really model anything that is AXI
> > stream specific..
> >
> 
> Theres not a lot to model. With the signalling stuff taken away, its
> pretty much just a unidirectional point-to-point data bus. There is
> the T_LAST bit, and the extended capabilities PMM mentioned.

IMO:
Yes, the end-of-packet needs to be modeled,
(e.g one function-call -> one packet). The exteneded stuff is AFAICS just
optimization, the abstract concept is still data moving. We are not modeling
AXI Stream, we should be modeling the common abstraction between DMA ctrls
and devices. Error reporting from the sink/device might be needed in some cases
(overrun/underrun etc).

Another issue is that these data paths are typically high-speed. So a zerocopy
approach might be worth looking at. For example, if all the chunks in
a packet are backed by host memory, we could just pass pointers to
the final sink, which in turn can do the equivalent of a writev() without
memory_rw into temporary linear bufs etc. maybe...

Cheers

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 18:35             ` Anthony Liguori
  2012-06-11 22:00               ` [Qemu-devel] [RFC] QOMification of AXI streams Benjamin Herrenschmidt
@ 2012-06-12  9:31               ` Avi Kivity
  2012-06-12  9:42                 ` Edgar E. Iglesias
  1 sibling, 1 reply; 63+ messages in thread
From: Avi Kivity @ 2012-06-12  9:31 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Peter Crosthwaite, Paul Brook,
	Edgar E. Iglesias, Andreas Färber, John Williams

On 06/11/2012 09:35 PM, Anthony Liguori wrote:
>>
>> The other direction is currently cpu_physical_memory_rw().
> 
> Right, and with benh's proposal, it's dma_memory_rw().  It also adds a
> DMAContext parameter.
> 
> I can't help think that the contents of DMAContext is awfully similar to
> MemoryRegionOps.
> 
>>  We do need
>> to support issuing transactions from arbitrary points in the memory
>> hierarchy, but I don't think a device's MemoryRegion is the right
>> interface.  Being able to respond to memory transactions, and being able
>> to issue them are two different things.
> 
> But an IOMMU has to be able to respond to a memory transaction.  Many of
> the things it may do (like endianness conversion) also happen to already
> exist in the memory API.

Right; it's both an initiator and a target.  We'll need something for
initiators, we have MemoryRegion for targets, and iommus will use both.

> 
>> In fact we will probably have to add more details to the memory
>> hierarchy.  Currently (for example) we model the memory hub passing
>> transactions destined for the various pci windows to the pci bus, but we
>> don't model the memory hub receiving a pci-initiated transaction and
>> passing it to the system bus.  We simply pretend it originated on the
>> system bus in the first place.  Perhaps we need parallel hierarchies:
>>
>>     system_memory
>>        alias ->  pci
>>        alias ->  ram
>>     pci
>>        bar1
>>        bar2
>>     pcibm
>>        alias ->  pci  (prio 1)
>>        alias ->  system_memory (prio 0)
>>
>> cpu_physical_memory_rw() would be implemented as
>> memory_region_rw(system_memory, ...) while pci_dma_rw() would be
>> implemented as memory_region_rw(pcibm, ...).  This would allow different
>> address transformations for the two accesses.
> 
> Yeah, this is what I'm basically thinking although I don't quite
> understand what 'pcibm' stands for.

PCI bus master

> 
> My biggest worry is that we'll end up with parallel memory API
> implementations split between memory.c and dma.c.

It definitely belongs together.


-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-12  9:31               ` [Qemu-devel] [RFC] QOMification of AXI stream Avi Kivity
@ 2012-06-12  9:42                 ` Edgar E. Iglesias
  0 siblings, 0 replies; 63+ messages in thread
From: Edgar E. Iglesias @ 2012-06-12  9:42 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Peter Crosthwaite, Paul Brook,
	Anthony Liguori, Andreas Färber, John Williams

On Tue, Jun 12, 2012 at 12:31:58PM +0300, Avi Kivity wrote:
> On 06/11/2012 09:35 PM, Anthony Liguori wrote:
> >>
> >> The other direction is currently cpu_physical_memory_rw().
> > 
> > Right, and with benh's proposal, it's dma_memory_rw().  It also adds a
> > DMAContext parameter.
> > 
> > I can't help think that the contents of DMAContext is awfully similar to
> > MemoryRegionOps.
> > 
> >>  We do need
> >> to support issuing transactions from arbitrary points in the memory
> >> hierarchy, but I don't think a device's MemoryRegion is the right
> >> interface.  Being able to respond to memory transactions, and being able
> >> to issue them are two different things.
> > 
> > But an IOMMU has to be able to respond to a memory transaction.  Many of
> > the things it may do (like endianness conversion) also happen to already
> > exist in the memory API.
> 
> Right; it's both an initiator and a target.  We'll need something for
> initiators, we have MemoryRegion for targets, and iommus will use both.
> 
> > 
> >> In fact we will probably have to add more details to the memory
> >> hierarchy.  Currently (for example) we model the memory hub passing
> >> transactions destined for the various pci windows to the pci bus, but we
> >> don't model the memory hub receiving a pci-initiated transaction and
> >> passing it to the system bus.  We simply pretend it originated on the
> >> system bus in the first place.  Perhaps we need parallel hierarchies:
> >>
> >>     system_memory
> >>        alias ->  pci
> >>        alias ->  ram
> >>     pci
> >>        bar1
> >>        bar2
> >>     pcibm
> >>        alias ->  pci  (prio 1)
> >>        alias ->  system_memory (prio 0)
> >>
> >> cpu_physical_memory_rw() would be implemented as
> >> memory_region_rw(system_memory, ...) while pci_dma_rw() would be
> >> implemented as memory_region_rw(pcibm, ...).  This would allow different
> >> address transformations for the two accesses.
> > 
> > Yeah, this is what I'm basically thinking although I don't quite
> > understand what 'pcibm' stands for.
> 
> PCI bus master
> 
> > 
> > My biggest worry is that we'll end up with parallel memory API
> > implementations split between memory.c and dma.c.
> 
> It definitely belongs together.

I agree. Would be good to keep DMA out of the naming. It's just
about adding support for multiple initiators/masters (DMA is just
one of the use-cases)..

Cheers,
Edgar

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-11 22:29                 ` Anthony Liguori
  2012-06-11 23:46                   ` Benjamin Herrenschmidt
@ 2012-06-12  9:46                   ` Avi Kivity
  2012-06-13  0:37                     ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 63+ messages in thread
From: Avi Kivity @ 2012-06-12  9:46 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Paul Brook, Edgar E. Iglesias,
	Andreas Färber, John Williams

On 06/12/2012 01:29 AM, Anthony Liguori wrote:
>>
>> So it makes some amount of sense to use the same structure. For example,
>> if a device issues accesses, those could be caught by a sibling device
>> memory region... or go upstream.
>>
>> Let's just look at downstream transformation for a minute...
>>
>> We do need to be a bit careful about transformation here: I need to
>> double check but I don't think we do transformation downstream today in
>> a clean way and we'd have to do that. IE. On pseries for example, the
>> PCI host bridge has a window in the CPU address space of [A...A+S], but
>> accesses to that window generates PCI cycles with different addresses
>> [B...B+S] (with typically A and B both being naturally aligned on S so
>> it's just a bit masking in HW).
> 
> I don't know that we really have bit masking done right in the memory API.
> 
> When we add a subregion, it always removes the offset from the address
> when it dispatches.  This more often than not works out well but for
> what you're describing above, it sounds like you'd really want to get an
> adjusted size (that could be transformed).
> 
> Today we generate a linear dispatch table.  This prevents us from
> applying device-level transforms.

We can perform arbitrary transformations to the address during dispatch
(except it_shift style, but I think that should be added).  The blocker
is that we have just a single dispatch table where we should have
several - one for each initiator group (cpus, each bci bus, etc.).


> 
>> We somewhat implements that in spapr_pci today since it works but I
>> don't quite understand how :-) Or rather, the terminology "alias" seems
>> to be fairly bogus, we aren't talking about aliases here...
>>
>> So today we create a memory region with an "alias" (whatever that means)
>> that is [B...B+S] and add a subregion which is [A...A+S]. That seems to
>> work but but it's obscure.
>>
>> If I was to implement that, I would make it so that the struct
>> MemoryRegion used in that hierarchy contains the address in the local
>> domain -and- the transformed address in the CPU domain, so you can still
>> sort them by CPU addresses for quick access and make this offsetting a
>> standard property of any memory region since it's very common that
>> busses drop address bits along the way.
>>
>> Now, if you want to use that structure for DMA, what you need to do
>> first is when an access happens, walk up the region tree and scan for
>> all siblings at every level, which can be costly.
> 
> So if you stick with the notion of subregions, you would still have a
> single MemoryRegion at the PCI bus layer that has all of it's children
> as sub regions.  Presumably that "scan for all siblings" is a binary
> search which shouldn't really be that expensive considering that we're
> likely to have a shallow depth in the memory hierarchy.

We can just render the memory hierarchy from the point of the devices.
This is needed in cases we don't support dynamic dispatch (virtual iommu
that is implemented by host hardware), and more efficient elsewhere.

> 
>>
>> Additionally to handle iommu's etc... you need the option for a given
>> memory region to have functions to perform the transformation in the
>> upstream direction.
> 
> I think that transformation function lives in the bus layer
> MemoryRegion.  It's a bit tricky though because you need some sort of
> notion of "who is asking".  So you need:
> 
> dma_memory_write(MemoryRegion *parent, DeviceState *caller,
>                  const void *data, size_t size);

It is not the parent here, but rather the root of the memory hierarchy
as viewed from the device (the enigmatically named 'pcibm' above).  The
pci memory region simply doesn't have the information about where system
memory lives, because it is a sibling region.

Note that the address transformations are not necessarily symmetric (for
example, iommus transform device->system transactions, but not
cpu->device transactions).  Each initiator has a separate DAG to follow.

> 
> This could be simplified at each layer via:
> 
> void pci_device_write(PCIDevice *dev, const void *data, size_t size) {
>     dma_memory_write(dev->bus->mr, DEVICE(dev), data, size);
> }
> 
>> To be true to the HW, each bridge should have its memory region, so a
>> setup with
>>
>>        /pci-host
>>            |
>>            |--/p2p
>>                 |
>>            |--/device
>>
>> Any DMA done by the device would walk through the p2p region to the host
>> which would contain a region with transform ops.
>>
>> However, at each level, you'd have to search for sibling regions that
>> may decode the address at that level before moving up, ie implement
>> essentially the equivalent of the PCI substractive decoding scheme.
> 
> Not quite...  subtractive decoding only happens for very specific
> devices IIUC.  For instance, an PCI-ISA bridge.  Normally, it's positive
> decoding and a bridge has to describe the full region of MMIO/PIO that
> it handles.
> 
> So it's only necessary to transverse down the tree again for the very
> special case of PCI-ISA bridges.  Normally you can tell just by looking
> at siblings.
> 
>> That will be a significant overhead for your DMA ops I believe, though
>> doable.
> 
> Worst case scenario, 256 devices with what, a 3 level deep hierarchy? 
> we're still talking about 24 simple address compares.  That shouldn't be
> so bad.

Or just lookup the device-local phys_map.

> 
>> Then we'd have to add map/unmap to MemoryRegion as well, with the
>> understanding that they may not be supported at every level...
> 
> map/unmap can always fall back to bounce buffers.
> 
>> So yeah, it sounds doable and it would handle what DMAContext doesn't
>> handle which is access to peer devices without going all the way back to
>> the "top level", but it's complex and ... I need something in qemu
>> 1.2 :-)
> 
> I think we need a longer term vision here.  We can find incremental
> solutions for the short term but I'm pretty nervous about having two
> parallel APIs only to discover that we need to converge in 2 years.

The API already exists, we just need to fill up the data structures.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 18:36             ` Anthony Liguori
@ 2012-06-12  9:51               ` Avi Kivity
  0 siblings, 0 replies; 63+ messages in thread
From: Avi Kivity @ 2012-06-12  9:51 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Peter Crosthwaite, Paul Brook,
	Edgar E. Iglesias, Andreas Färber, John Williams

On 06/11/2012 09:36 PM, Anthony Liguori wrote:
>>
>> cpu_physical_memory_rw() would be implemented as
>> memory_region_rw(system_memory, ...) while pci_dma_rw() would be
>> implemented as memory_region_rw(pcibm, ...).  This would allow different
>> address transformations for the two accesses.
> 
> BTW, the main problem with the memory API right now is that there isn't
> a 'MemoryRegion *mr' as the first argument to the dispatch functions. 

Why is it a problem?

> This could be fixed by introducing yet another set of function pointers
> and keeping the existing callers unchanged though.

I prefer a full s///, I hate the huge number of thunks we go through to
deliver some transaction.

Note we had a *mr parameter originally, it was removed in favour of *opaque.


-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-11 17:31           ` Avi Kivity
  2012-06-11 18:35             ` Anthony Liguori
  2012-06-11 18:36             ` Anthony Liguori
@ 2012-06-12 12:58             ` Peter Maydell
  2012-06-12 13:18               ` Avi Kivity
  2 siblings, 1 reply; 63+ messages in thread
From: Peter Maydell @ 2012-06-12 12:58 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Anthony Liguori, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Anthony Liguori, Edgar E. Iglesias,
	Andreas Färber, John Williams

On 11 June 2012 18:31, Avi Kivity <avi@redhat.com> wrote:
> On 06/11/2012 06:01 PM, Anthony Liguori wrote:
>> Perhaps we should just make MemoryRegion work in both directions?

> The other direction is currently cpu_physical_memory_rw().  We do need
> to support issuing transactions from arbitrary points in the memory
> hierarchy, but I don't think a device's MemoryRegion is the right
> interface.  Being able to respond to memory transactions, and being able
> to issue them are two different things.

...they're just opposite sides of the same interface, though,
really. For instance you could say that any memory transaction
master (cpu, dma controller, whatever) should take a single
MemoryRegion* and must issue all its memory accesses to that MR*.
(obviously that would usually be a container region.)

> In fact we will probably have to add more details to the memory
> hierarchy.  Currently (for example) we model the memory hub passing
> transactions destined for the various pci windows to the pci bus, but we
> don't model the memory hub receiving a pci-initiated transaction and
> passing it to the system bus.  We simply pretend it originated on the
> system bus in the first place.  Perhaps we need parallel hierarchies:
>
>   system_memory
>      alias -> pci
>      alias -> ram
>   pci
>      bar1
>      bar2
>   pcibm
>      alias -> pci  (prio 1)
>      alias -> system_memory (prio 0)
>
> cpu_physical_memory_rw() would be implemented as
> memory_region_rw(system_memory, ...) while pci_dma_rw() would be
> implemented as memory_region_rw(pcibm, ...).  This would allow different
> address transformations for the two accesses.

Ideally system_memory shouldn't exist at all. Anything
which can issue transactions should do it by exposing
a suitable interface which the system model can then
just wire up suitably. Obviously mostly that's going
to be "all these devices go here in the memory map and
that view of the world is what all the CPUs see", but
we shouldn't have a globally accessible and implicitly
used address space IMHO.

(For instance, the right way to model per-CPU devices
is to have each individual CPU core expose its transaction
master, and then you wire the per-CPU devices up only
to their corresponding CPU.)

-- PMM

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-12 12:58             ` Peter Maydell
@ 2012-06-12 13:18               ` Avi Kivity
  2012-06-12 13:32                 ` Peter Maydell
  0 siblings, 1 reply; 63+ messages in thread
From: Avi Kivity @ 2012-06-12 13:18 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Anthony Liguori, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Anthony Liguori, Edgar E. Iglesias,
	Andreas Färber, John Williams

On 06/12/2012 03:58 PM, Peter Maydell wrote:
> On 11 June 2012 18:31, Avi Kivity <avi@redhat.com> wrote:
>> On 06/11/2012 06:01 PM, Anthony Liguori wrote:
>>> Perhaps we should just make MemoryRegion work in both directions?
> 
>> The other direction is currently cpu_physical_memory_rw().  We do need
>> to support issuing transactions from arbitrary points in the memory
>> hierarchy, but I don't think a device's MemoryRegion is the right
>> interface.  Being able to respond to memory transactions, and being able
>> to issue them are two different things.
> 
> ...they're just opposite sides of the same interface, though,
> really. For instance you could say that any memory transaction
> master (cpu, dma controller, whatever) should take a single
> MemoryRegion* and must issue all its memory accesses to that MR*.
> (obviously that would usually be a container region.)

It would be a container region, and it would be unrelated to any other
regions held by the device (the device might not have any memory
regions; instead it would only be able to do dma).

> 
>> In fact we will probably have to add more details to the memory
>> hierarchy.  Currently (for example) we model the memory hub passing
>> transactions destined for the various pci windows to the pci bus, but we
>> don't model the memory hub receiving a pci-initiated transaction and
>> passing it to the system bus.  We simply pretend it originated on the
>> system bus in the first place.  Perhaps we need parallel hierarchies:
>>
>>   system_memory
>>      alias -> pci
>>      alias -> ram
>>   pci
>>      bar1
>>      bar2
>>   pcibm
>>      alias -> pci  (prio 1)
>>      alias -> system_memory (prio 0)
>>
>> cpu_physical_memory_rw() would be implemented as
>> memory_region_rw(system_memory, ...) while pci_dma_rw() would be
>> implemented as memory_region_rw(pcibm, ...).  This would allow different
>> address transformations for the two accesses.
> 
> Ideally system_memory shouldn't exist at all. Anything
> which can issue transactions should do it by exposing
> a suitable interface which the system model can then
> just wire up suitably. Obviously mostly that's going
> to be "all these devices go here in the memory map and
> that view of the world is what all the CPUs see", but
> we shouldn't have a globally accessible and implicitly
> used address space IMHO.
> 
> (For instance, the right way to model per-CPU devices
> is to have each individual CPU core expose its transaction
> master, and then you wire the per-CPU devices up only
> to their corresponding CPU.)

Correct.  system_memory would be renamed cpu_memory, and would be cpu
local (since some systems allow each cpu to have a different memory map).

phys_map would be a cache for the memory map that can be seen through
that region; we might make it a field of MemoryRegion.


-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-12 13:18               ` Avi Kivity
@ 2012-06-12 13:32                 ` Peter Maydell
  2012-06-12 13:48                   ` Avi Kivity
  2012-06-12 13:55                   ` Andreas Färber
  0 siblings, 2 replies; 63+ messages in thread
From: Peter Maydell @ 2012-06-12 13:32 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Anthony Liguori, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Anthony Liguori, Edgar E. Iglesias,
	Andreas Färber, John Williams

On 12 June 2012 14:18, Avi Kivity <avi@redhat.com> wrote:
> On 06/12/2012 03:58 PM, Peter Maydell wrote:
>> On 11 June 2012 18:31, Avi Kivity <avi@redhat.com> wrote:
>>> On 06/11/2012 06:01 PM, Anthony Liguori wrote:
>>>> Perhaps we should just make MemoryRegion work in both directions?
>>
>>> The other direction is currently cpu_physical_memory_rw().  We do need
>>> to support issuing transactions from arbitrary points in the memory
>>> hierarchy, but I don't think a device's MemoryRegion is the right
>>> interface.  Being able to respond to memory transactions, and being able
>>> to issue them are two different things.
>>
>> ...they're just opposite sides of the same interface, though,
>> really. For instance you could say that any memory transaction
>> master (cpu, dma controller, whatever) should take a single
>> MemoryRegion* and must issue all its memory accesses to that MR*.
>> (obviously that would usually be a container region.)
>
> It would be a container region, and it would be unrelated to any other
> regions held by the device (the device might not have any memory
> regions; instead it would only be able to do dma).

It shouldn't actually be owned by the transaction master, but
by whatever the parent object is that created the transaction
master. So for instance for an ARM board you'd have something
like:
 * top level machine QOM object creates a 'system-memory'
   container region, and puts all the devices in it in their
   correct locations
 * top level object also creates the cortex-a9 device, and
   passes it a pointer to the system-memory container
 * the cortex-a9 device instantiates the CPU cores and the
   per-cpu devices, and creates a container region for
   each cpu containing (devices for that cpu, plus the
   system-memory region it got passed). It passes a pointer
   to the right region to each cpu core
 * the cpu cores just use the region they're given
 * if there's a dma controller in the system, the top level
   machine object creates the controller and hands it a
   pointer to the system-memory container region too. (So
   the dma controller correctly doesn't see the per-cpu
   devices.)

(when I say 'passes a pointer' I mean something involving
QOM links I expect. I'm not sure if anybody's thought about
how we expose memory regions in a QOM manner.)

Notice that in this approach it's perfectly valid to have
a board model which creates a single device and a single
CPU and passes the device's MemoryRegion directly to the
CPU. This corresponds to a hardware design where the CPU's
address lines just connect straight to the device's, ie
there's no bus fabric or address decoding.

-- PMM

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-12 13:32                 ` Peter Maydell
@ 2012-06-12 13:48                   ` Avi Kivity
  2012-06-12 13:55                   ` Andreas Färber
  1 sibling, 0 replies; 63+ messages in thread
From: Avi Kivity @ 2012-06-12 13:48 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Anthony Liguori, Michal Simek, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Paul Brook, Anthony Liguori, Edgar E. Iglesias,
	Andreas Färber, John Williams

On 06/12/2012 04:32 PM, Peter Maydell wrote:
> On 12 June 2012 14:18, Avi Kivity <avi@redhat.com> wrote:
>> On 06/12/2012 03:58 PM, Peter Maydell wrote:
>>> On 11 June 2012 18:31, Avi Kivity <avi@redhat.com> wrote:
>>>> On 06/11/2012 06:01 PM, Anthony Liguori wrote:
>>>>> Perhaps we should just make MemoryRegion work in both directions?
>>>
>>>> The other direction is currently cpu_physical_memory_rw().  We do need
>>>> to support issuing transactions from arbitrary points in the memory
>>>> hierarchy, but I don't think a device's MemoryRegion is the right
>>>> interface.  Being able to respond to memory transactions, and being able
>>>> to issue them are two different things.
>>>
>>> ...they're just opposite sides of the same interface, though,
>>> really. For instance you could say that any memory transaction
>>> master (cpu, dma controller, whatever) should take a single
>>> MemoryRegion* and must issue all its memory accesses to that MR*.
>>> (obviously that would usually be a container region.)
>>
>> It would be a container region, and it would be unrelated to any other
>> regions held by the device (the device might not have any memory
>> regions; instead it would only be able to do dma).
> 
> It shouldn't actually be owned by the transaction master, but
> by whatever the parent object is that created the transaction
> master. So for instance for an ARM board you'd have something
> like:
>  * top level machine QOM object creates a 'system-memory'
>    container region, and puts all the devices in it in their
>    correct locations
>  * top level object also creates the cortex-a9 device, and
>    passes it a pointer to the system-memory container
>  * the cortex-a9 device instantiates the CPU cores and the
>    per-cpu devices, and creates a container region for
>    each cpu containing (devices for that cpu, plus the
>    system-memory region it got passed). It passes a pointer
>    to the right region to each cpu core
>  * the cpu cores just use the region they're given
>  * if there's a dma controller in the system, the top level
>    machine object creates the controller and hands it a
>    pointer to the system-memory container region too. (So
>    the dma controller correctly doesn't see the per-cpu
>    devices.)
> 
> (when I say 'passes a pointer' I mean something involving
> QOM links I expect. I'm not sure if anybody's thought about
> how we expose memory regions in a QOM manner.)
> 
> Notice that in this approach it's perfectly valid to have
> a board model which creates a single device and a single
> CPU and passes the device's MemoryRegion directly to the
> CPU. This corresponds to a hardware design where the CPU's
> address lines just connect straight to the device's, ie
> there's no bus fabric or address decoding.

Yes, exactly.

If the devices sees a byte-swapping bus, then instead of giving it a
container region, we give it an io region; the callbacks byte-swap and
write the contents to a container that does the rest of the forwarding.
 If it's an address remapping iommu, then we pass it a container region
with an alias-per-page that remaps the device addresses to addresses in
another container:

iommu
 |
 +-alias-page-0 ---> system_memory[7]
 |-alias-page-1 ---> system_memory[3]
 .
 .
 .

So a device write to page 1 is redirected to page 3.  Of course in both
cases we'll want to fold the functionality into the memory API instead
of making the iommu writer work so hard.

-- 
error compiling committee.c: too many arguments to function

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-12 13:32                 ` Peter Maydell
  2012-06-12 13:48                   ` Avi Kivity
@ 2012-06-12 13:55                   ` Andreas Färber
  1 sibling, 0 replies; 63+ messages in thread
From: Andreas Färber @ 2012-06-12 13:55 UTC (permalink / raw)
  To: Peter Maydell
  Cc: Michal Simek, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Avi Kivity, Anthony Liguori, Edgar E. Iglesias,
	John Williams, Paul Brook

Am 12.06.2012 15:32, schrieb Peter Maydell:
> [...] I'm not sure if anybody's thought about
> how we expose memory regions in a QOM manner.)

Anthony had that in his i440fx RFC.

/-F

-- 
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-12  9:46                   ` Avi Kivity
@ 2012-06-13  0:37                     ` Benjamin Herrenschmidt
  2012-06-13 20:57                       ` Anthony Liguori
  2012-06-14  0:00                       ` Edgar E. Iglesias
  0 siblings, 2 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-13  0:37 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Peter Maydell, Anthony Liguori, Michal Simek,
	qemu-devel@nongnu.org Developers, Peter Crosthwaite, Paul Brook,
	Anthony Liguori, Edgar E. Iglesias, Andreas Färber,
	John Williams

On Tue, 2012-06-12 at 12:46 +0300, Avi Kivity wrote:
> > I think that transformation function lives in the bus layer
> > MemoryRegion.  It's a bit tricky though because you need some sort of
> > notion of "who is asking".  So you need:
> > 
> > dma_memory_write(MemoryRegion *parent, DeviceState *caller,
> >                  const void *data, size_t size);
> 
> It is not the parent here, but rather the root of the memory hierarchy
> as viewed from the device (the enigmatically named 'pcibm' above).  The
> pci memory region simply doesn't have the information about where system
> memory lives, because it is a sibling region.

Right and it has to be hierarchical, you can have CPU -> PCI transform
followed by PCI -> AXI (or whatever stupid bus they use on the Broadcom
wireless cards), etc...

There can be any amount of transform. There's also the need at each
level to handle sibling decoding. IE. It's the same BARs used for
downstream and upstream access that will decode an access at any given
level.

So it's not a separate hierarchy, it's the same hierarchy walked both
ways with potentially different transforms depending on what direction
it's walked .

> Note that the address transformations are not necessarily symmetric (for
> example, iommus transform device->system transactions, but not
> cpu->device transactions).  Each initiator has a separate DAG to follow.

Right. Or rather they might transform CPU -> device but differently (ie,
we do have several windows with different offsets on power for example
etc...) so it's a different transform which -might- be an iommu of some
sort as well.

I think the whole mechanism should be symetrical, with a fast path for
transforms that can be represented by a direct map + offset (ie no iommu
case).

> > This could be simplified at each layer via:
> > 
> > void pci_device_write(PCIDevice *dev, const void *data, size_t size) {
> >     dma_memory_write(dev->bus->mr, DEVICE(dev), data, size);
> > }
> > 
> >> To be true to the HW, each bridge should have its memory region, so a
> >> setup with
> >>
> >>        /pci-host
> >>            |
> >>            |--/p2p
> >>                 |
> >>            |--/device
> >>
> >> Any DMA done by the device would walk through the p2p region to the host
> >> which would contain a region with transform ops.
> >>
> >> However, at each level, you'd have to search for sibling regions that
> >> may decode the address at that level before moving up, ie implement
> >> essentially the equivalent of the PCI substractive decoding scheme.
> > 
> > Not quite...  subtractive decoding only happens for very specific
> > devices IIUC.  For instance, an PCI-ISA bridge.  Normally, it's positive
> > decoding and a bridge has to describe the full region of MMIO/PIO that
> > it handles.
> > 
> > So it's only necessary to transverse down the tree again for the very
> > special case of PCI-ISA bridges.  Normally you can tell just by looking
> > at siblings.
> > 
> >> That will be a significant overhead for your DMA ops I believe, though
> >> doable.
> > 
> > Worst case scenario, 256 devices with what, a 3 level deep hierarchy? 
> > we're still talking about 24 simple address compares.  That shouldn't be
> > so bad.
> 
> Or just lookup the device-local phys_map.

> > 
> >> Then we'd have to add map/unmap to MemoryRegion as well, with the
> >> understanding that they may not be supported at every level...
> > 
> > map/unmap can always fall back to bounce buffers.
> > 
> >> So yeah, it sounds doable and it would handle what DMAContext doesn't
> >> handle which is access to peer devices without going all the way back to
> >> the "top level", but it's complex and ... I need something in qemu
> >> 1.2 :-)
> > 
> > I think we need a longer term vision here.  We can find incremental
> > solutions for the short term but I'm pretty nervous about having two
> > parallel APIs only to discover that we need to converge in 2 years.
> 
> The API already exists, we just need to fill up the data structures.

Not really no, we don't have proper DMA APIs to shoot from devices.

What the DMAContext patches provide is a generic dma_* API but if we are
going to get rid of DMAContext in favor of a (modified ?) MemoryRegion
I'd rather not expose that to devices.

Since I need something _now_ for 1.2 (this has been going on for way too
long), I'm going to go for a quick kill for PCI & PAPR VIO only using a
slightly modified version of the existing iommu patches that provides
pci_* wrappers to the DMA ops that take the PCIDevice as an argument.

That way we can replace the infrastructure and remove DMAContext without
affecting devices in a second stage (unless you think you can come up
with a new scheme in the next few days :-) I really am not familiar
enough with those parts of qemu to aim for the full schebang for 1.2 but
maybe you guys can :-)

Cheers,
Ben.

> -- 
> error compiling committee.c: too many arguments to function
> 
> 
> 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-13  0:37                     ` Benjamin Herrenschmidt
@ 2012-06-13 20:57                       ` Anthony Liguori
  2012-06-13 21:25                         ` Benjamin Herrenschmidt
  2012-06-14  0:00                       ` Edgar E. Iglesias
  1 sibling, 1 reply; 63+ messages in thread
From: Anthony Liguori @ 2012-06-13 20:57 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Edgar E. Iglesias,
	Andreas Färber, John Williams, Paul Brook

On 06/12/2012 07:37 PM, Benjamin Herrenschmidt wrote:
> Not really no, we don't have proper DMA APIs to shoot from devices.
>
> What the DMAContext patches provide is a generic dma_* API but if we are
> going to get rid of DMAContext in favor of a (modified ?) MemoryRegion
> I'd rather not expose that to devices.
>
> Since I need something _now_ for 1.2 (this has been going on for way too
> long), I'm going to go for a quick kill for PCI&  PAPR VIO only using a
> slightly modified version of the existing iommu patches that provides
> pci_* wrappers to the DMA ops that take the PCIDevice as an argument.
>
> That way we can replace the infrastructure and remove DMAContext without
> affecting devices in a second stage (unless you think you can come up
> with a new scheme in the next few days :-) I really am not familiar
> enough with those parts of qemu to aim for the full schebang for 1.2 but
> maybe you guys can :-)

I think pci_* wrappers is the right thing to do in the short term (and probably 
long term too).

Regards,

Anthony Liguori

>
> Cheers,
> Ben.
>
>> --
>> error compiling committee.c: too many arguments to function
>>
>>
>>
>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-13 20:57                       ` Anthony Liguori
@ 2012-06-13 21:25                         ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-13 21:25 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Edgar E. Iglesias,
	Andreas Färber, John Williams, Paul Brook

On Wed, 2012-06-13 at 15:57 -0500, Anthony Liguori wrote:

> I think pci_* wrappers is the right thing to do in the short term (and probably 
> long term too).

Oh I agree absolutely. Same for vio, I'll do some wrappers. One
remaining question is where do the barriers go in that scheme...

I'll figure something out for now, again, goal is purely to have a
stable device interface, we can rip the guts of the implementation out
for something better. I'll try to spend time on that today (it's HARD to
work while looking after a sick 3 yrs old :-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-13  0:37                     ` Benjamin Herrenschmidt
  2012-06-13 20:57                       ` Anthony Liguori
@ 2012-06-14  0:00                       ` Edgar E. Iglesias
  2012-06-14  1:34                         ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 63+ messages in thread
From: Edgar E. Iglesias @ 2012-06-14  0:00 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Anthony Liguori,
	Andreas Färber, John Williams, Paul Brook

On Wed, Jun 13, 2012 at 10:37:41AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2012-06-12 at 12:46 +0300, Avi Kivity wrote:
> > > I think that transformation function lives in the bus layer
> > > MemoryRegion.  It's a bit tricky though because you need some sort of
> > > notion of "who is asking".  So you need:
> > > 
> > > dma_memory_write(MemoryRegion *parent, DeviceState *caller,
> > >                  const void *data, size_t size);
> > 
> > It is not the parent here, but rather the root of the memory hierarchy
> > as viewed from the device (the enigmatically named 'pcibm' above).  The
> > pci memory region simply doesn't have the information about where system
> > memory lives, because it is a sibling region.
> 
> Right and it has to be hierarchical, you can have CPU -> PCI transform
> followed by PCI -> AXI (or whatever stupid bus they use on the Broadcom
> wireless cards), etc...
> 
> There can be any amount of transform. There's also the need at each
> level to handle sibling decoding. IE. It's the same BARs used for
> downstream and upstream access that will decode an access at any given
> level.
> 
> So it's not a separate hierarchy, it's the same hierarchy walked both
> ways with potentially different transforms depending on what direction
> it's walked .
> 
> > Note that the address transformations are not necessarily symmetric (for
> > example, iommus transform device->system transactions, but not
> > cpu->device transactions).  Each initiator has a separate DAG to follow.
> 
> Right. Or rather they might transform CPU -> device but differently (ie,
> we do have several windows with different offsets on power for example
> etc...) so it's a different transform which -might- be an iommu of some
> sort as well.
> 
> I think the whole mechanism should be symetrical, with a fast path for
> transforms that can be represented by a direct map + offset (ie no iommu
> case).
> 
> > > This could be simplified at each layer via:
> > > 
> > > void pci_device_write(PCIDevice *dev, const void *data, size_t size) {
> > >     dma_memory_write(dev->bus->mr, DEVICE(dev), data, size);
> > > }
> > > 
> > >> To be true to the HW, each bridge should have its memory region, so a
> > >> setup with
> > >>
> > >>        /pci-host
> > >>            |
> > >>            |--/p2p
> > >>                 |
> > >>            |--/device
> > >>
> > >> Any DMA done by the device would walk through the p2p region to the host
> > >> which would contain a region with transform ops.
> > >>
> > >> However, at each level, you'd have to search for sibling regions that
> > >> may decode the address at that level before moving up, ie implement
> > >> essentially the equivalent of the PCI substractive decoding scheme.
> > > 
> > > Not quite...  subtractive decoding only happens for very specific
> > > devices IIUC.  For instance, an PCI-ISA bridge.  Normally, it's positive
> > > decoding and a bridge has to describe the full region of MMIO/PIO that
> > > it handles.
> > > 
> > > So it's only necessary to transverse down the tree again for the very
> > > special case of PCI-ISA bridges.  Normally you can tell just by looking
> > > at siblings.
> > > 
> > >> That will be a significant overhead for your DMA ops I believe, though
> > >> doable.
> > > 
> > > Worst case scenario, 256 devices with what, a 3 level deep hierarchy? 
> > > we're still talking about 24 simple address compares.  That shouldn't be
> > > so bad.
> > 
> > Or just lookup the device-local phys_map.
> 
> > > 
> > >> Then we'd have to add map/unmap to MemoryRegion as well, with the
> > >> understanding that they may not be supported at every level...
> > > 
> > > map/unmap can always fall back to bounce buffers.
> > > 
> > >> So yeah, it sounds doable and it would handle what DMAContext doesn't
> > >> handle which is access to peer devices without going all the way back to
> > >> the "top level", but it's complex and ... I need something in qemu
> > >> 1.2 :-)
> > > 
> > > I think we need a longer term vision here.  We can find incremental
> > > solutions for the short term but I'm pretty nervous about having two
> > > parallel APIs only to discover that we need to converge in 2 years.
> > 
> > The API already exists, we just need to fill up the data structures.
> 
> Not really no, we don't have proper DMA APIs to shoot from devices.

TBH, I don't understand any of the "upstream" access discussion nor the
specifics of DMA accesses for the memory/system bus accesses.

When a device, like a DMA unit accesses the memory/system bus it,
AFAIK, does it from a different port (its master port). That port is
NOT the same as it's slave port. There is no reverese decoding of the
addresses, the access is made top-down. If you are talking about NoCs
we need a compleet different infrastructure but I don't think that
is the case.

I agree with Avi, most of it is already in place.

Cheers

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI stream
  2012-06-12  7:58                   ` Edgar E. Iglesias
@ 2012-06-14  1:01                     ` Peter Crosthwaite
  0 siblings, 0 replies; 63+ messages in thread
From: Peter Crosthwaite @ 2012-06-14  1:01 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Michal Simek, Paul Brook, Anthony Liguori, pbonzini,
	Andreas Färber, John Williams

On Tue, 2012-06-12 at 09:58 +0200, Edgar E. Iglesias wrote:
> On Tue, Jun 12, 2012 at 10:33:51AM +1000, Peter Crosthwaite wrote:
> > >
> > > Hi,
> > >
> > > IIRC the word array thing is device specific, not really AXI stream.
> > > I think the whole connection to AXI is a bit unfortunate,
> > 
> > Yes, so I think we can summaries the confusion here with AXI stream is
> > completely unreleated to AXI. The connection between the two comes
> > with the way low level signalling is handled. (e.g. the SRC_RDY and
> > DST_RDY pins etc. etc.), thats common between AXI and AXI stream hence
> > the shared name. This is of course completely irrelevant to QEMU.
> > 
> > these devices
> > > are pretty much the same devices that in other contexts where connected
> > > to other bus standards. Xilinx choose to name them AXI-xxx and I used
> > > the name in our models but I didn't really model anything that is AXI
> > > stream specific..
> > >
> > 
> > Theres not a lot to model. With the signalling stuff taken away, its
> > pretty much just a unidirectional point-to-point data bus. There is
> > the T_LAST bit, and the extended capabilities PMM mentioned.
> 
> IMO:
> Yes, the end-of-packet needs to be modeled,
> (e.g one function-call -> one packet). The exteneded stuff is AFAICS just
> optimization, the abstract concept is still data moving. We are not modeling
> AXI Stream, we should be modeling the common abstraction between DMA ctrls
> and devices. Error reporting from the sink/device might be needed in some cases
> (overrun/underrun etc).
> 
> Another issue is that these data paths are typically high-speed. So a zerocopy
> approach might be worth looking at. For example, if all the chunks in
> a packet are backed by host memory, we could just pass pointers to
> the final sink, which in turn can do the equivalent of a writev() without
> memory_rw into temporary linear bufs etc. maybe...
> 

Hi All,

Edgar and myself discussed this on IRC and came to the follow
conclusions:

1: AXI-stream is too specific for QEMU.

A good example of why, is (our of tree) we have another DMA and ethernet
controller that works over the core-connect local-link interface, which
is also a trivial interface. The thinking is the same basic abstraction
should be used for both for simplicity and reduced code volume. Also
means bridges can be modelled transparently. Its also consistent with
sysbus - sysbus models but has no AXI awareness so stream shouldnt have
axi awareness either.

2: This Interconnect is not DMA

Because its and addressless stream interface that pushes and pops. DMA
is an application of the interconnect, not part of it.

So if its not AXI-stream and its not DMA what is it? I guess its just a
"stream". In my current RFC I have TYPE_AXI_STREAM_SLAVE etc etc, change
this to TYPE_STREAM_SLAVE? Any other input on naming?

Regards,
Peter

> Cheers

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-14  0:00                       ` Edgar E. Iglesias
@ 2012-06-14  1:34                         ` Benjamin Herrenschmidt
  2012-06-14  2:03                           ` Edgar E. Iglesias
  0 siblings, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-14  1:34 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Anthony Liguori,
	Andreas Färber, John Williams, Paul Brook

On Thu, 2012-06-14 at 02:00 +0200, Edgar E. Iglesias wrote:
> 
> TBH, I don't understand any of the "upstream" access discussion nor
> the specifics of DMA accesses for the memory/system bus accesses.
> 
> When a device, like a DMA unit accesses the memory/system bus it,
> AFAIK, does it from a different port (its master port). That port is
> NOT the same as it's slave port.

It may or may not be but yes, in most case the slave and master
interfaces can be seen as different interfaces to the bus though in many
case they aren't completely independent (there are ordering guarantees
in some cases for example).

>  There is no reverese decoding of the
> addresses, the access is made top-down. If you are talking about NoCs
> we need a compleet different infrastructure but I don't think that
> is the case.

I don't parse the above. But the simple example is to look at PCI.

When a PCI device initiates an access, what happens is that it gets
arbitration on its bus segment, and from there it's purely an ordinary
PCI access for the address specified by that device.

Depending on other devices, bridges, etc... that access may travel
upstream or to a sibling (which can be a bridge and thus go back down)
etc...

Along that hierarchy, transforms might occur, which can be different
depending on the direction. For example the PCI host controller might
contain an iommu (typically that's the case with IBM pseries one) while
on the downstream direction it has a handful of relatively fixed
windows.

But there are other cases, the processor bus can be bridged to some
other IO bus (PLB4, PLB6, AXI) which can itself be bridge to some other
simpler bus (OPB, AHB, ...) and in some case there can be transforms
done along the way. Some may and some may not support DMA, or with
address limitation, etc... there's a bit of everything out there.

> I agree with Avi, most of it is already in place.

Well, not quite in that we need to traverse the memory hierarchy
starting from the device itself and have rules at the various "bridge"
levels on how to forward/transform things in the upstream direction
(though as the discussion showed earlier, we could use some better
support for downstream transforms as well).

I think we can probably stick to a mechanism that has two basic
function:

 - Either the transform (or lack of) can be represented by an offset to
be applied to the address (which would match most of the bit
mask/replace cases)

 - Or the transform is a function pointer

The former would represent most cases and would provide the ability to
"flatten the tree" by storing fully transformed ranges from the CPU
point of view (it might be tricky to have a variant of these for every
device) in order to speed up accesses. The latter would typically be
used by iommu's.

Now if you guys think you can cook up the necessary infrastructure
changes in the next couple of weeks, I'm happy to completely drop
DMAContext.

In any case, better accessors are a good thing so I'll start with the
first few patches of David's series and polish them to add proper pci_
and vio_ accessors for use by devices while we iron out what we want to
do with the infrastructure.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-14  1:34                         ` Benjamin Herrenschmidt
@ 2012-06-14  2:03                           ` Edgar E. Iglesias
  2012-06-14  2:16                             ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 63+ messages in thread
From: Edgar E. Iglesias @ 2012-06-14  2:03 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Anthony Liguori,
	Andreas Färber, John Williams, Paul Brook

On Thu, Jun 14, 2012 at 11:34:10AM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2012-06-14 at 02:00 +0200, Edgar E. Iglesias wrote:
> > 
> > TBH, I don't understand any of the "upstream" access discussion nor
> > the specifics of DMA accesses for the memory/system bus accesses.
> > 
> > When a device, like a DMA unit accesses the memory/system bus it,
> > AFAIK, does it from a different port (its master port). That port is
> > NOT the same as it's slave port.
> 
> It may or may not be but yes, in most case the slave and master
> interfaces can be seen as different interfaces to the bus though in many
> case they aren't completely independent (there are ordering guarantees
> in some cases for example).
> 
> >  There is no reverese decoding of the
> > addresses, the access is made top-down. If you are talking about NoCs
> > we need a compleet different infrastructure but I don't think that
> > is the case.
> 
> I don't parse the above. But the simple example is to look at PCI.
> 
> When a PCI device initiates an access, what happens is that it gets
> arbitration on its bus segment, and from there it's purely an ordinary
> PCI access for the address specified by that device.
> 
> Depending on other devices, bridges, etc... that access may travel
> upstream or to a sibling (which can be a bridge and thus go back down)
> etc...
> 
> Along that hierarchy, transforms might occur, which can be different
> depending on the direction. For example the PCI host controller might
> contain an iommu (typically that's the case with IBM pseries one) while
> on the downstream direction it has a handful of relatively fixed
> windows.
> 
> But there are other cases, the processor bus can be bridged to some
> other IO bus (PLB4, PLB6, AXI) which can itself be bridge to some other
> simpler bus (OPB, AHB, ...) and in some case there can be transforms
> done along the way. Some may and some may not support DMA, or with
> address limitation, etc... there's a bit of everything out there.
> 
> > I agree with Avi, most of it is already in place.
> 
> Well, not quite in that we need to traverse the memory hierarchy
> starting from the device itself and have rules at the various "bridge"
> levels on how to forward/transform things in the upstream direction
> (though as the discussion showed earlier, we could use some better
> support for downstream transforms as well).
> 
> I think we can probably stick to a mechanism that has two basic
> function:
> 
>  - Either the transform (or lack of) can be represented by an offset to
> be applied to the address (which would match most of the bit
> mask/replace cases)
> 
>  - Or the transform is a function pointer
> 
> The former would represent most cases and would provide the ability to
> "flatten the tree" by storing fully transformed ranges from the CPU
> point of view (it might be tricky to have a variant of these for every
> device) in order to speed up accesses. The latter would typically be
> used by iommu's.
> 
> Now if you guys think you can cook up the necessary infrastructure
> changes in the next couple of weeks, I'm happy to completely drop
> DMAContext.
> 
> In any case, better accessors are a good thing so I'll start with the
> first few patches of David's series and polish them to add proper pci_
> and vio_ accessors for use by devices while we iron out what we want to
> do with the infrastructure.

Thanks for the clarificatino Ben.

I don't know much about PCI but in the embedded world I've never seen
anything that resemblems what you describe. Devices at the bottom of
the hierharcy (or at any location) that make acceses to the memory system
do it through a different port located at a differnt logical position in
the hierarchy. In fact, a DMA ctrl can through it's master port access it's
slave port without it even noticing that it is the same device accessing
itself and the access will travel top-down through the bus hierarchy.

I've never seen address decoding beeing done in reverse.. bottom-up.

Cheers

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-14  2:03                           ` Edgar E. Iglesias
@ 2012-06-14  2:16                             ` Benjamin Herrenschmidt
  2012-06-14  2:31                               ` Edgar E. Iglesias
  0 siblings, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-14  2:16 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Anthony Liguori,
	Andreas Färber, John Williams, Paul Brook

On Thu, 2012-06-14 at 04:03 +0200, Edgar E. Iglesias wrote:
> Thanks for the clarificatino Ben.
> 
> I don't know much about PCI but in the embedded world I've never seen
> anything that resemblems what you describe. Devices at the bottom of
> the hierharcy (or at any location) that make acceses to the memory system
> do it through a different port located at a differnt logical position in
> the hierarchy. 

Right, that's what I meant when I said that master and slave interfaces
don't have to be the same, but that's not a problem. A given device
could depend on two memory regions, one for downstream accesses only and
one for upstream accesses, both located in different positions in the
hierarchy.

> In fact, a DMA ctrl can through it's master port access it's
> slave port without it even noticing that it is the same device accessing
> itself and the access will travel top-down through the bus hierarchy.

That sometimes work ... and sometimes doesn't ... some combinations of
busses/devices will not allow that, some will deadlock, it really
depends on the setup.

> I've never seen address decoding beeing done in reverse.. bottom-up.

It really depends on the bus type and happens on embedded as well.

If your device master "port" is directly on the processor bus, you still
have to deal with sibling devices potentially decoding right ? Now, what
about you have a bridge from that bus to another bus. For example ppc
PLB to AXI (that stuff exist out on the field).

An AXI device might issue a cycle on the AXI portion, that can be
decoded by either a sibling AXI device ... or go up. In most cases
though, "upstream" is some kind of substractive decoding (ie. anything
-not- decoded by a sibling goes up).

This gets even more obvious with PCI of course. Then add a b43 wifi card
on PCI, you have AXI -> PCI -> SSB (silicon backplane). Now what happens
when one of the sub-devices on SBB (the MAC interface for example) does
a bus master ? You have at least 2 layers to cross before you hit your
processor bus. 

Add iommu's to the soup and you get into a serious mess :-)

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-14  2:16                             ` Benjamin Herrenschmidt
@ 2012-06-14  2:31                               ` Edgar E. Iglesias
  2012-06-14  2:41                                 ` Benjamin Herrenschmidt
  2012-06-14  5:16                                 ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 63+ messages in thread
From: Edgar E. Iglesias @ 2012-06-14  2:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Anthony Liguori,
	Andreas Färber, John Williams, Paul Brook

On Thu, Jun 14, 2012 at 12:16:45PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2012-06-14 at 04:03 +0200, Edgar E. Iglesias wrote:
> > Thanks for the clarificatino Ben.
> > 
> > I don't know much about PCI but in the embedded world I've never seen
> > anything that resemblems what you describe. Devices at the bottom of
> > the hierharcy (or at any location) that make acceses to the memory system
> > do it through a different port located at a differnt logical position in
> > the hierarchy. 
> 
> Right, that's what I meant when I said that master and slave interfaces
> don't have to be the same, but that's not a problem. A given device
> could depend on two memory regions, one for downstream accesses only and
> one for upstream accesses, both located in different positions in the
> hierarchy.
> 
> > In fact, a DMA ctrl can through it's master port access it's
> > slave port without it even noticing that it is the same device accessing
> > itself and the access will travel top-down through the bus hierarchy.
> 
> That sometimes work ... and sometimes doesn't ... some combinations of
> busses/devices will not allow that, some will deadlock, it really
> depends on the setup.
> 
> > I've never seen address decoding beeing done in reverse.. bottom-up.
> 
> It really depends on the bus type and happens on embedded as well.
> 
> If your device master "port" is directly on the processor bus, you still
> have to deal with sibling devices potentially decoding right ? Now, what
> about you have a bridge from that bus to another bus. For example ppc
> PLB to AXI (that stuff exist out on the field).
> 
> An AXI device might issue a cycle on the AXI portion, that can be
> decoded by either a sibling AXI device ... or go up. In most cases

No, it doesn't really go up.. This is where we disagree.


> though, "upstream" is some kind of substractive decoding (ie. anything
> -not- decoded by a sibling goes up).

I've never seen this happen, but I won't say it doesn't exist because
you'll find all kinds of HW out there that'll do all kind of stuff
you wouldn't imagine.


> 
> This gets even more obvious with PCI of course. Then add a b43 wifi card
> on PCI, you have AXI -> PCI -> SSB (silicon backplane). Now what happens
> when one of the sub-devices on SBB (the MAC interface for example) does
> a bus master ? You have at least 2 layers to cross before you hit your
> processor bus. 
> 
> Add iommu's to the soup and you get into a serious mess :-)

IOmmus messify things a bit, but the translation is only done from
the masters above the iommu going down theough the iommu. Not
vice versa. At least AFAIK.

Cheers

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-14  2:31                               ` Edgar E. Iglesias
@ 2012-06-14  2:41                                 ` Benjamin Herrenschmidt
  2012-06-14  3:17                                   ` Edgar E. Iglesias
  2012-06-14  5:16                                 ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-14  2:41 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Anthony Liguori,
	Andreas Färber, John Williams, Paul Brook

On Thu, 2012-06-14 at 04:31 +0200, Edgar E. Iglesias wrote:
> > An AXI device might issue a cycle on the AXI portion, that can be
> > decoded by either a sibling AXI device ... or go up. In most cases
> 
> No, it doesn't really go up.. This is where we disagree.

Well, not really indeed as the memory subsystem is just another sibling
in this specific case, bad choice of words here. Though it does go "up"
in a way if you are in a cache coherent system thanks to snooping :-)
But right, memory is just a sibling.

> > though, "upstream" is some kind of substractive decoding (ie. anything
> > -not- decoded by a sibling goes up).
> 
> I've never seen this happen, but I won't say it doesn't exist because
> you'll find all kinds of HW out there that'll do all kind of stuff
> you wouldn't imagine.

Its quite common. You seem to have a vision limited to a single flat bus
with all the devices connected to it :-) The minute you start having bus
hierarchies things become much more interesting... Even on ARM, with PCI
for example.

> > This gets even more obvious with PCI of course. Then add a b43 wifi card
> > on PCI, you have AXI -> PCI -> SSB (silicon backplane). Now what happens
> > when one of the sub-devices on SBB (the MAC interface for example) does
> > a bus master ? You have at least 2 layers to cross before you hit your
> > processor bus. 
> > 
> > Add iommu's to the soup and you get into a serious mess :-)
> 
> IOmmus messify things a bit, but the translation is only done from
> the masters above the iommu going down theough the iommu. Not
> vice versa. At least AFAIK.

Sure, for the very simple reason that the iommu typically sits behind
the master. So the other direction is handled by the CPU's MMU.

However, iommu's aren't necessarily at the top level either ... some
designs have a global one that handle all type of devices, some designs
have them in the PCI bridge itself, etc...

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-14  2:41                                 ` Benjamin Herrenschmidt
@ 2012-06-14  3:17                                   ` Edgar E. Iglesias
  2012-06-14  3:43                                     ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 63+ messages in thread
From: Edgar E. Iglesias @ 2012-06-14  3:17 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Anthony Liguori,
	Andreas Färber, John Williams, Paul Brook

On Thu, Jun 14, 2012 at 12:41:06PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2012-06-14 at 04:31 +0200, Edgar E. Iglesias wrote:
> > > An AXI device might issue a cycle on the AXI portion, that can be
> > > decoded by either a sibling AXI device ... or go up. In most cases
> > 
> > No, it doesn't really go up.. This is where we disagree.
> 
> Well, not really indeed as the memory subsystem is just another sibling
> in this specific case, bad choice of words here. Though it does go "up"
> in a way if you are in a cache coherent system thanks to snooping :-)
> But right, memory is just a sibling.
> 
> > > though, "upstream" is some kind of substractive decoding (ie. anything
> > > -not- decoded by a sibling goes up).
> > 
> > I've never seen this happen, but I won't say it doesn't exist because
> > you'll find all kinds of HW out there that'll do all kind of stuff
> > you wouldn't imagine.
> 
> Its quite common. You seem to have a vision limited to a single flat bus
> with all the devices connected to it :-) The minute you start having bus
> hierarchies things become much more interesting... Even on ARM, with PCI
> for example.

I probably have a limited view, but I've worked with developing devices
and connecting them to various busses for a while so I'm not completely
lost.


> 
> > > This gets even more obvious with PCI of course. Then add a b43 wifi card
> > > on PCI, you have AXI -> PCI -> SSB (silicon backplane). Now what happens
> > > when one of the sub-devices on SBB (the MAC interface for example) does
> > > a bus master ? You have at least 2 layers to cross before you hit your
> > > processor bus. 
> > > 
> > > Add iommu's to the soup and you get into a serious mess :-)
> > 
> > IOmmus messify things a bit, but the translation is only done from
> > the masters above the iommu going down theough the iommu. Not
> > vice versa. At least AFAIK.
> 
> Sure, for the very simple reason that the iommu typically sits behind
> the master. So the other direction is handled by the CPU's MMU.

The CPU's MMU is a CPU local thing, it can be ignored in this context...

Anyway, I might very well be missing or missunderstganding something so
I'm not claiming I'm having the absolute "thruth" here but it seems to me
like implementing device accesses bottom-up or "upstream", is something
that is extremely rare (to put it mildely), something that is unnecessary
unless we __really__ need it.

Cheers

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-14  3:17                                   ` Edgar E. Iglesias
@ 2012-06-14  3:43                                     ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-14  3:43 UTC (permalink / raw)
  To: Edgar E. Iglesias
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Anthony Liguori,
	Andreas Färber, John Williams, Paul Brook

On Thu, 2012-06-14 at 05:17 +0200, Edgar E. Iglesias wrote:

> The CPU's MMU is a CPU local thing, it can be ignored in this context...
> 
> Anyway, I might very well be missing or missunderstganding something so
> I'm not claiming I'm having the absolute "thruth" here but it seems to me
> like implementing device accesses bottom-up or "upstream", is something
> that is extremely rare (to put it mildely), something that is unnecessary
> unless we __really__ need it.

Well, I'm happy to stick with my existing DMAContext patches that take
care of most of the requirement so here I'm siding with you :-)

As for the "extremely rare" I don't know ... technically speaking
anything with PCI qualifies. However we only rarely get into a situation
where we need that level of correctness in the model, because the lack
of a "proper" model in this case will probably only affect peer to peer
DMAs which are fairly rare and only when such peer to peer DMA is itself
done in a system with funky remapping and/or iommu...

That's why I'd like for 1.2 to simply use the existing DMAContext/iommu
proposal, however, in order to avoid having to change the devices
if/when something "better" is implemented, I'll provide "accessors" at
least for PCI (feel free to do accessors for AXI etc... as well if you
wish) so that the accessors themselves can be changed if needed without
impacting all the users.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [Qemu-devel] [RFC] QOMification of AXI streams
  2012-06-14  2:31                               ` Edgar E. Iglesias
  2012-06-14  2:41                                 ` Benjamin Herrenschmidt
@ 2012-06-14  5:16                                 ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2012-06-14  5:16 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Peter Maydell, Anthony Liguori, qemu-devel@nongnu.org Developers,
	Peter Crosthwaite, Michal Simek, Avi Kivity, Andreas Färber,
	John Williams, Paul Brook

< snip thread >

So I was looking at this accessor business. We already have them for
PCI. PAPR VIO already has its own as well.

That leaves us with various devices such as OHCI that can exist
on different bus types and use the lower-level "DMAContext" based
variant...

Now I'm keen to keep it (essentially, keep the series as-is really) and
say let's just merge it for 1.2, unless you think you can come up with
something better before that deadline.

At the end of the day, DMAContext is a fine structure to hold the
iommu ops and identify the top level DMA translation, we might just
end up burying it elsewhere in the long run, but for now it's easy
to keep as-is.

There's few enough drivers who directly use the dma_* API, really.
Mostly those who sit on multiple bus, and a bit of search/replace in
there won't be hard if needed in the future.

After the discussion we had with a few other folks on the list it looks
like a "proper" model isn't something that we need today to fix a
specific issue, the simpler DMAContext approach will work fine, unless I
missed something.

And we can always make DMAContext itself hierarchical or burry it in the
MemoryRegion and leave the type itself as a handle for upstream
transactions.

Cheers,
Ben.

^ permalink raw reply	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2012-06-14  5:17 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-08  4:23 [Qemu-devel] [RFC] QOMification of AXI stream Peter Crosthwaite
2012-06-08  9:13 ` Paul Brook
2012-06-08  9:34   ` Peter Maydell
2012-06-08 13:13     ` Paul Brook
2012-06-08 13:39       ` Anthony Liguori
2012-06-08 13:59         ` Paul Brook
2012-06-08 14:17           ` Anthony Liguori
2012-06-08 13:41   ` Anthony Liguori
2012-06-08 13:53     ` Paul Brook
2012-06-08 13:55     ` Peter Maydell
2012-06-08  9:45 ` Andreas Färber
2012-06-09  1:53   ` Peter Crosthwaite
2012-06-09  2:12     ` Andreas Färber
2012-06-09  3:28       ` Peter Crosthwaite
2012-06-11  5:54       ` Paolo Bonzini
2012-06-11 13:05         ` Peter Maydell
2012-06-11 13:17         ` Anthony Liguori
2012-06-11 13:41           ` Paolo Bonzini
2012-06-08 14:15 ` Anthony Liguori
2012-06-09  1:24   ` Peter Crosthwaite
2012-06-11 13:15     ` Anthony Liguori
2012-06-11 13:39       ` Peter Maydell
2012-06-11 14:38         ` Edgar E. Iglesias
2012-06-11 14:53           ` Peter Maydell
2012-06-11 14:58             ` Edgar E. Iglesias
2012-06-11 15:03             ` Anthony Liguori
2012-06-11 15:34               ` Peter Maydell
2012-06-11 15:56               ` Edgar E. Iglesias
2012-06-12  0:33                 ` Peter Crosthwaite
2012-06-12  7:58                   ` Edgar E. Iglesias
2012-06-14  1:01                     ` Peter Crosthwaite
2012-06-11 15:01         ` Anthony Liguori
2012-06-11 17:31           ` Avi Kivity
2012-06-11 18:35             ` Anthony Liguori
2012-06-11 22:00               ` [Qemu-devel] [RFC] QOMification of AXI streams Benjamin Herrenschmidt
2012-06-11 22:29                 ` Anthony Liguori
2012-06-11 23:46                   ` Benjamin Herrenschmidt
2012-06-12  1:33                     ` Anthony Liguori
2012-06-12  2:06                       ` Benjamin Herrenschmidt
2012-06-12  9:46                   ` Avi Kivity
2012-06-13  0:37                     ` Benjamin Herrenschmidt
2012-06-13 20:57                       ` Anthony Liguori
2012-06-13 21:25                         ` Benjamin Herrenschmidt
2012-06-14  0:00                       ` Edgar E. Iglesias
2012-06-14  1:34                         ` Benjamin Herrenschmidt
2012-06-14  2:03                           ` Edgar E. Iglesias
2012-06-14  2:16                             ` Benjamin Herrenschmidt
2012-06-14  2:31                               ` Edgar E. Iglesias
2012-06-14  2:41                                 ` Benjamin Herrenschmidt
2012-06-14  3:17                                   ` Edgar E. Iglesias
2012-06-14  3:43                                     ` Benjamin Herrenschmidt
2012-06-14  5:16                                 ` Benjamin Herrenschmidt
2012-06-12  1:04                 ` Andreas Färber
2012-06-12  2:42                   ` Benjamin Herrenschmidt
2012-06-12  9:31               ` [Qemu-devel] [RFC] QOMification of AXI stream Avi Kivity
2012-06-12  9:42                 ` Edgar E. Iglesias
2012-06-11 18:36             ` Anthony Liguori
2012-06-12  9:51               ` Avi Kivity
2012-06-12 12:58             ` Peter Maydell
2012-06-12 13:18               ` Avi Kivity
2012-06-12 13:32                 ` Peter Maydell
2012-06-12 13:48                   ` Avi Kivity
2012-06-12 13:55                   ` Andreas Färber

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).