From: jonathan.derrick@intel.com (Jon Derrick)
Subject: phys_addr_t instead of dma_addr_t for nvme_dev->cmb_dma_addr
Date: Wed, 11 Jan 2017 08:36:26 -0700
Message-ID: <20170111153626.GA5485@localhost.localdomain>
In-Reply-To: <1484122522.26936.9.camel@mellanox.com>
On Wed, Jan 11, 2017 at 08:15:23AM +0000, Haggai Eran wrote:
> On Mon, 2017-01-09 at 14:54 -0700, Jon Derrick wrote:
> > On Sun, Jan 08, 2017 at 10:55:28AM +0200, Haggai Eran wrote:
> > >
> > > On 1/5/2017 8:39 PM, Jon Derrick wrote:
> > > >
> > > > >
> > > > > > Perhaps I'm mistaken, but shouldn't the code use
> > > > > > pcibios_resource_to_bus() in this case to convert the resource
> > > > > > to bus addresses? I see cmb_dma_addr is later passed directly
> > > > > > to the device as the sq_dma_addr.
> > > > >
> > > > > That gets us a region from a window within a larger region, but
> > > > > to me it looks like resource_contains() would fail to match if
> > > > > the CMB region went beyond the window.
> > > > I thought that the CMB must fit in its BAR, and therefore in the
> > > > window that contains it. Isn't it so?
> > >
> > The spec is unclear whether it's the host's responsibility to stay
> > within the BAR, or the device's to reduce CMBLOC and CMBSZ to fit:
> >
> > "If the Offset + Size exceeds the length of
> > the indicated BAR, the size available to the host is limited by the
> > length of the BAR."
>
> If the BAR is smaller than (offset + size) then any address that is
> outside the BAR must be treated by the device as if it is not in the
> CMB (otherwise some other devices / host memory will simply be
> inaccessible by the NVMe device).
> >
> > I think this would only happen if we're behind a bridge with a
> > smaller window than the BAR.
>
> I'm pretty sure that the bridge window must contain the underlying
> device BARs. If it can't contain them, they can be simply left
> disabled.
>
Oh good. I wasn't aware of those restrictions. That should make
pcibios_resource_to_bus() a possibility.
> The situation can still happen in case the NVMe device exposes a
> smaller BAR than the CMB, or if it supports the resizeable BAR PCIe
> capability and the BIOS resized it to a smaller size (although I
> haven't heard of any device or BIOS that supports that).
>
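For that case, here is a rough userspace sketch of the clamping I read
into the spec sentence quoted above ("the size available to the host is
limited by the length of the BAR"). It is only an illustration:
cmb_usable_size() and its parameters are made-up names, not driver code,
and all values are assumed to already be converted to bytes.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not taken from the nvme driver.
 *
 * bar_len:    length of the BAR that CMBLOC points at, in bytes
 * cmb_offset: CMB offset within that BAR, in bytes
 * cmb_size:   CMB size advertised by CMBSZ, in bytes
 *
 * Stores in *usable the part of the CMB that actually fits inside the
 * BAR. Returns false if the CMB starts at or beyond the end of the BAR,
 * in which case nothing is usable and *usable is left untouched.
 */
static bool cmb_usable_size(uint64_t bar_len, uint64_t cmb_offset,
                            uint64_t cmb_size, uint64_t *usable)
{
    if (cmb_offset >= bar_len)
        return false;

    /* Compare against the remaining room so offset + size cannot overflow. */
    uint64_t room = bar_len - cmb_offset;

    *usable = cmb_size < room ? cmb_size : room;
    return true;
}
```

With a 16 KiB BAR, a CMB at offset 4 KiB that advertises 32 KiB would be
limited to the 12 KiB that remain inside the BAR.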
> >
> >
> > >
> > > >
> > > > > There's another option - pci_bus_addr_t/pci_bus_region takes the
> > > > > largest of phys_addr_t's width and dma_addr_t's width. So in the
> > > > > cases where those two types might differ it should still be able
> > > > > to hold a valid physical address, which is what both the resource
> > > > > API and Create-SQes expect.
> > > > I don't think the issue is just the width of the types. What
> > > > happens on architectures where phys_addr_t addresses are
> > > > translated before going to the PCIe bus?
> > > If we have a DMA translation, we get the host side addresses from the
> > > ioremapping and I believe the device is still expecting the
> > > untranslated addresses, since it needs to DMA over the fabric. Do
> > > archs exist that don't fit this model?
> I'm not talking about DMA translation. I'm talking about MMIO
> translation. From what I understand this can happen on POWER systems.
> The physical addresses for MMIO that are used by the CPU are different
> from the ones that are used on the PCIe bus.
>
I hadn't considered those, but the address given in the Create SQes
command still should be in the range of the addresses in the device's
BARs. I'm guessing we may have to go through the IOMMU subsystem to
untranslate those.
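
To make sure I follow the MMIO translation case, here is a toy userspace
model of it. It assumes the host bridge applies one fixed offset per
window between CPU physical addresses and bus addresses, which as far as
I understand is the kind of conversion pcibios_resource_to_bus() does.
struct bridge_window and cpu_to_bus() are hypothetical names for
illustration only, not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified, hypothetical model of a host bridge window: CPU physical
 * addresses in [cpu_start, cpu_end] show up on the PCI bus at
 * (cpu address - offset).
 */
struct bridge_window {
    uint64_t cpu_start;
    uint64_t cpu_end;   /* inclusive */
    uint64_t offset;    /* CPU physical address minus bus address */
};

/*
 * Convert a CPU physical MMIO address to the address the device sees on
 * the bus. Returns false if the address is not inside the window.
 */
static bool cpu_to_bus(const struct bridge_window *w, uint64_t cpu_addr,
                       uint64_t *bus_addr)
{
    if (cpu_addr < w->cpu_start || cpu_addr > w->cpu_end)
        return false;

    *bus_addr = cpu_addr - w->offset;
    return true;
}
```

In this model, with a window whose CPU side starts at 0x3fe000000000 and
whose bus side starts at 0x80000000, the CPU address 0x3fe000100000
corresponds to the bus address 0x80100000, and it is the bus-side value
that would have to land inside the device's BAR.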
> Regards,
> Haggai