From: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Peter Maydell" <peter.maydell@linaro.org>,
"Anthony Liguori" <aliguori@us.ibm.com>,
"qemu-devel@nongnu.org Developers" <qemu-devel@nongnu.org>,
"Peter Crosthwaite" <peter.crosthwaite@petalogix.com>,
"Michal Simek" <monstr@monstr.eu>, "Avi Kivity" <avi@redhat.com>,
"Anthony Liguori" <anthony@codemonkey.ws>,
"Andreas Färber" <afaerber@suse.de>,
"John Williams" <john.williams@petalogix.com>,
"Paul Brook" <paul@codesourcery.com>
Subject: Re: [Qemu-devel] [RFC] QOMification of AXI streams
Date: Thu, 14 Jun 2012 02:00:01 +0200 [thread overview]
Message-ID: <20120614000001.GA26263@zapo> (raw)
In-Reply-To: <1339547861.9220.77.camel@pasglop>
On Wed, Jun 13, 2012 at 10:37:41AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2012-06-12 at 12:46 +0300, Avi Kivity wrote:
> > > I think that transformation function lives in the bus layer
> > > MemoryRegion. It's a bit tricky though because you need some sort of
> > > notion of "who is asking". So you need:
> > >
> > > dma_memory_write(MemoryRegion *parent, DeviceState *caller,
> > > const void *data, size_t size);
> >
> > It is not the parent here, but rather the root of the memory hierarchy
> > as viewed from the device (the enigmatically named 'pcibm' above). The
> > pci memory region simply doesn't have the information about where system
> > memory lives, because it is a sibling region.
>
> Right and it has to be hierarchical, you can have CPU -> PCI transform
> followed by PCI -> AXI (or whatever stupid bus they use on the Broadcom
> wireless cards), etc...
>
> There can be any amount of transform. There's also the need at each
> level to handle sibling decoding. IE. It's the same BARs used for
> downstream and upstream access that will decode an access at any given
> level.
>
> So it's not a separate hierarchy, it's the same hierarchy walked both
> ways with potentially different transforms depending on what direction
> it's walked .
>
> > Note that the address transformations are not necessarily symmetric (for
> > example, iommus transform device->system transactions, but not
> > cpu->device transactions). Each initiator has a separate DAG to follow.
>
> Right. Or rather they might transform CPU -> device but differently (ie,
> we do have several windows with different offsets on power for example
> etc...) so it's a different transform which -might- be an iommu of some
> sort as well.
>
> I think the whole mechanism should be symetrical, with a fast path for
> transforms that can be represented by a direct map + offset (ie no iommu
> case).
>
> > > This could be simplified at each layer via:
> > >
> > > void pci_device_write(PCIDevice *dev, const void *data, size_t size) {
> > > dma_memory_write(dev->bus->mr, DEVICE(dev), data, size);
> > > }
> > >
> > >> To be true to the HW, each bridge should have its memory region, so a
> > >> setup with
> > >>
> > >> /pci-host
> > >> |
> > >> |--/p2p
> > >> |
> > >> |--/device
> > >>
> > >> Any DMA done by the device would walk through the p2p region to the host
> > >> which would contain a region with transform ops.
> > >>
> > >> However, at each level, you'd have to search for sibling regions that
> > >> may decode the address at that level before moving up, ie implement
> > >> essentially the equivalent of the PCI substractive decoding scheme.
> > >
> > > Not quite... subtractive decoding only happens for very specific
> > > devices IIUC. For instance, an PCI-ISA bridge. Normally, it's positive
> > > decoding and a bridge has to describe the full region of MMIO/PIO that
> > > it handles.
> > >
> > > So it's only necessary to transverse down the tree again for the very
> > > special case of PCI-ISA bridges. Normally you can tell just by looking
> > > at siblings.
> > >
> > >> That will be a significant overhead for your DMA ops I believe, though
> > >> doable.
> > >
> > > Worst case scenario, 256 devices with what, a 3 level deep hierarchy?
> > > we're still talking about 24 simple address compares. That shouldn't be
> > > so bad.
> >
> > Or just lookup the device-local phys_map.
>
> > >
> > >> Then we'd have to add map/unmap to MemoryRegion as well, with the
> > >> understanding that they may not be supported at every level...
> > >
> > > map/unmap can always fall back to bounce buffers.
> > >
> > >> So yeah, it sounds doable and it would handle what DMAContext doesn't
> > >> handle which is access to peer devices without going all the way back to
> > >> the "top level", but it's complex and ... I need something in qemu
> > >> 1.2 :-)
> > >
> > > I think we need a longer term vision here. We can find incremental
> > > solutions for the short term but I'm pretty nervous about having two
> > > parallel APIs only to discover that we need to converge in 2 years.
> >
> > The API already exists, we just need to fill up the data structures.
>
> Not really no, we don't have proper DMA APIs to shoot from devices.
TBH, I don't understand any of the "upstream" access discussion nor the
specifics of DMA accesses for the memory/system bus accesses.
When a device, like a DMA unit accesses the memory/system bus it,
AFAIK, does it from a different port (its master port). That port is
NOT the same as it's slave port. There is no reverese decoding of the
addresses, the access is made top-down. If you are talking about NoCs
we need a compleet different infrastructure but I don't think that
is the case.
I agree with Avi, most of it is already in place.
Cheers
next prev parent reply other threads:[~2012-06-14 0:00 UTC|newest]
Thread overview: 63+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-06-08 4:23 [Qemu-devel] [RFC] QOMification of AXI stream Peter Crosthwaite
2012-06-08 9:13 ` Paul Brook
2012-06-08 9:34 ` Peter Maydell
2012-06-08 13:13 ` Paul Brook
2012-06-08 13:39 ` Anthony Liguori
2012-06-08 13:59 ` Paul Brook
2012-06-08 14:17 ` Anthony Liguori
2012-06-08 13:41 ` Anthony Liguori
2012-06-08 13:53 ` Paul Brook
2012-06-08 13:55 ` Peter Maydell
2012-06-08 9:45 ` Andreas Färber
2012-06-09 1:53 ` Peter Crosthwaite
2012-06-09 2:12 ` Andreas Färber
2012-06-09 3:28 ` Peter Crosthwaite
2012-06-11 5:54 ` Paolo Bonzini
2012-06-11 13:05 ` Peter Maydell
2012-06-11 13:17 ` Anthony Liguori
2012-06-11 13:41 ` Paolo Bonzini
2012-06-08 14:15 ` Anthony Liguori
2012-06-09 1:24 ` Peter Crosthwaite
2012-06-11 13:15 ` Anthony Liguori
2012-06-11 13:39 ` Peter Maydell
2012-06-11 14:38 ` Edgar E. Iglesias
2012-06-11 14:53 ` Peter Maydell
2012-06-11 14:58 ` Edgar E. Iglesias
2012-06-11 15:03 ` Anthony Liguori
2012-06-11 15:34 ` Peter Maydell
2012-06-11 15:56 ` Edgar E. Iglesias
2012-06-12 0:33 ` Peter Crosthwaite
2012-06-12 7:58 ` Edgar E. Iglesias
2012-06-14 1:01 ` Peter Crosthwaite
2012-06-11 15:01 ` Anthony Liguori
2012-06-11 17:31 ` Avi Kivity
2012-06-11 18:35 ` Anthony Liguori
2012-06-11 22:00 ` [Qemu-devel] [RFC] QOMification of AXI streams Benjamin Herrenschmidt
2012-06-11 22:29 ` Anthony Liguori
2012-06-11 23:46 ` Benjamin Herrenschmidt
2012-06-12 1:33 ` Anthony Liguori
2012-06-12 2:06 ` Benjamin Herrenschmidt
2012-06-12 9:46 ` Avi Kivity
2012-06-13 0:37 ` Benjamin Herrenschmidt
2012-06-13 20:57 ` Anthony Liguori
2012-06-13 21:25 ` Benjamin Herrenschmidt
2012-06-14 0:00 ` Edgar E. Iglesias [this message]
2012-06-14 1:34 ` Benjamin Herrenschmidt
2012-06-14 2:03 ` Edgar E. Iglesias
2012-06-14 2:16 ` Benjamin Herrenschmidt
2012-06-14 2:31 ` Edgar E. Iglesias
2012-06-14 2:41 ` Benjamin Herrenschmidt
2012-06-14 3:17 ` Edgar E. Iglesias
2012-06-14 3:43 ` Benjamin Herrenschmidt
2012-06-14 5:16 ` Benjamin Herrenschmidt
2012-06-12 1:04 ` Andreas Färber
2012-06-12 2:42 ` Benjamin Herrenschmidt
2012-06-12 9:31 ` [Qemu-devel] [RFC] QOMification of AXI stream Avi Kivity
2012-06-12 9:42 ` Edgar E. Iglesias
2012-06-11 18:36 ` Anthony Liguori
2012-06-12 9:51 ` Avi Kivity
2012-06-12 12:58 ` Peter Maydell
2012-06-12 13:18 ` Avi Kivity
2012-06-12 13:32 ` Peter Maydell
2012-06-12 13:48 ` Avi Kivity
2012-06-12 13:55 ` Andreas Färber
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120614000001.GA26263@zapo \
--to=edgar.iglesias@gmail.com \
--cc=afaerber@suse.de \
--cc=aliguori@us.ibm.com \
--cc=anthony@codemonkey.ws \
--cc=avi@redhat.com \
--cc=benh@kernel.crashing.org \
--cc=john.williams@petalogix.com \
--cc=monstr@monstr.eu \
--cc=paul@codesourcery.com \
--cc=peter.crosthwaite@petalogix.com \
--cc=peter.maydell@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).