From: Mark Hounschell <markh@compro.net>
To: William Davis <wdavis@nvidia.com>, Bjorn Helgaas <bhelgaas@google.com>
Cc: "joro@8bytes.org" <joro@8bytes.org>,
"iommu@lists.linux-foundation.org"
<iommu@lists.linux-foundation.org>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
Terence Ripperda <TRipperda@nvidia.com>,
John Hubbard <jhubbard@nvidia.com>,
"jglisse@redhat.com" <jglisse@redhat.com>,
"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>,
Jonathan Corbet <corbet@lwn.net>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH v2 4/7] DMA-API: Add dma_(un)map_resource() documentation
Date: Wed, 20 May 2015 15:15:59 -0400 [thread overview]
Message-ID: <555CDD6F.40304@compro.net> (raw)
In-Reply-To: <ea891abe867444d193742030348a6ad1@HQMAIL106.nvidia.com>
On 05/20/2015 01:30 PM, William Davis wrote:
>
>
>> -----Original Message-----
>> From: Mark Hounschell [mailto:markh@compro.net]
>> Sent: Wednesday, May 20, 2015 7:11 AM
>> To: Bjorn Helgaas; William Davis
>> Cc: joro@8bytes.org; iommu@lists.linux-foundation.org; linux-
>> pci@vger.kernel.org; Terence Ripperda; John Hubbard; jglisse@redhat.com;
>> konrad.wilk@oracle.com; Jonathan Corbet; David S. Miller
>> Subject: Re: [PATCH v2 4/7] DMA-API: Add dma_(un)map_resource()
>> documentation
>>
>> On 05/19/2015 07:43 PM, Bjorn Helgaas wrote:
>>> [+cc Dave, Jonathan]
>>>
>>> On Mon, May 18, 2015 at 01:25:01PM -0500, wdavis@nvidia.com wrote:
>>>> From: Will Davis <wdavis@nvidia.com>
>>>>
>>>> Add references to both the general API documentation as well as the
>> HOWTO.
>>>>
>>>> Signed-off-by: Will Davis <wdavis@nvidia.com>
>>>> ---
>>>> Documentation/DMA-API-HOWTO.txt | 39
>> +++++++++++++++++++++++++++++++++++++--
>>>> Documentation/DMA-API.txt | 36 +++++++++++++++++++++++++++++++--
>> ---
>>>> 2 files changed, 68 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/Documentation/DMA-API-HOWTO.txt
>>>> b/Documentation/DMA-API-HOWTO.txt index 0f7afb2..89bd730 100644
>>>> --- a/Documentation/DMA-API-HOWTO.txt
>>>> +++ b/Documentation/DMA-API-HOWTO.txt
>>>> @@ -138,6 +138,10 @@ What about block I/O and networking buffers? The
>> block I/O and
>>>> networking subsystems make sure that the buffers they use are valid
>>>> for you to DMA from/to.
>>>>
>>>> +In some systems, it may also be possible to DMA to and/or from a
>>>> +peer device's MMIO region, as described by a 'struct resource'. This
>>>> +is referred to as a peer-to-peer mapping.
>>>> +
>>>> DMA addressing limitations
>>>>
>>>> Does your device have any DMA addressing limitations? For example,
>>>> is @@ -648,6 +652,35 @@ Every dma_map_{single,sg}() call should have its
>> dma_unmap_{single,sg}()
>>>> counterpart, because the bus address space is a shared resource and
>>>> you could render the machine unusable by consuming all bus addresses.
>>>>
>>>> +Peer-to-peer DMA mappings can be obtained using dma_map_resource()
>>>> +to map another device's MMIO region for the given device:
>>>> +
>>>> + struct resource *peer_mmio_res = &other_dev->resource[0];
>>>> + dma_addr_t dma_handle = dma_map_resource(dev, peer_mmio_res,
>>>> + offset, size, direction);
>>>> + if (dma_handle == 0 || dma_mapping_error(dev, dma_handle))
>>>> + {
>>>> + /*
>>>> + * If dma_handle == 0, dma_map_resource() is not
>>>> + * implemented, and peer-to-peer transactions will not
>>>> + * work.
>>>> + */
>>>> + goto map_error_handling;
>>>> + }
>>>> +
>>>> + ...
>>>> +
>>>> + dma_unmap_resource(dev, dma_handle, size, direction);
>>>> +
>>>> +Here, "offset" means byte offset within the given resource.
>>>> +
>>>> +You should both check for a 0 return value and call
>>>> +dma_mapping_error(), as dma_map_resource() can either be not
>>>> +implemented or fail and return error as outlined under the
>> dma_map_single() discussion.
>>>> +
>>>> +You should call dma_unmap_resource() when DMA activity is finished,
>>>> +e.g., from the interrupt which told you that the DMA transfer is done.
>>>> +
>>>> If you need to use the same streaming DMA region multiple times and
>> touch
>>>> the data in between the DMA transfers, the buffer needs to be synced
>>>> properly in order for the CPU and device to see the most up-to-date
>>>> and @@ -765,8 +798,8 @@ failure can be determined by:
>>>>
>>>> - checking if dma_alloc_coherent() returns NULL or dma_map_sg
>>>> returns 0
>>>>
>>>> -- checking the dma_addr_t returned from dma_map_single() and
>>>> dma_map_page()
>>>> - by using dma_mapping_error():
>>>> +- checking the dma_addr_t returned from dma_map_single(),
>>>> +dma_map_resource(),
>>>> + and dma_map_page() by using dma_mapping_error():
>>>>
>>>> dma_addr_t dma_handle;
>>>>
>>>> @@ -780,6 +813,8 @@ failure can be determined by:
>>>> goto map_error_handling;
>>>> }
>>>>
>>>> +- checking if dma_map_resource() returns 0
>>>> +
>>>> - unmap pages that are already mapped, when mapping error occurs in
>> the middle
>>>> of a multiple page mapping attempt. These example are applicable to
>>>> dma_map_page() as well.
>>>> diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
>>>> index 5208840..c25c549 100644
>>>> --- a/Documentation/DMA-API.txt
>>>> +++ b/Documentation/DMA-API.txt
>>>> @@ -283,14 +283,40 @@ and <size> parameters are provided to do partial
>> page mapping, it is
>>>> recommended that you never use these unless you really know what the
>>>> cache width is.
>>>>
>>>> +dma_addr_t
>>>> +dma_map_resource(struct device *dev, struct resource *res,
>>>> + unsigned long offset, size_t size,
>>>> + enum dma_data_direction_direction)
>>>> +
>>>> +API for mapping resources. This API allows a driver to map a peer
>>>> +device's resource for DMA. All the notes and warnings for the other
>>>> +APIs apply here. Also, the success of this API does not validate or
>>>> +guarantee that peer-to-peer transactions between the device and its
>>>> +peer will be functional. They only grant access so that if such
>>>> +transactions are possible, an IOMMU will not prevent them from
>>>> +succeeding.
>>>
>>> If the driver can't tell whether peer-to-peer accesses will actually
>>> work, this seems like sort of a dubious API. I'm trying to imagine
>>> how a driver would handle this. I guess whether peer-to-peer works
>>> depends on the underlying platform (not the devices themselves)? If
>>> we run the driver on a platform where peer-to-peer *doesn't* work,
>>> what happens? The driver can't tell, so we just rely on the user to
>>> say "this isn't working as expected"?
>>>
>>
>
> Yes, it's quite difficult to tell whether peer-to-peer will actually work,
> and it usually involves some probing and heuristics on the driver's part.
> I wouldn't say that this makes it a dubious API - it's a piece of the
> puzzle that's absolutely necessary for a driver to set up peer-to-peer in
> an IOMMU environment.
>
I currently just do
page = virt_to_page(__va(bus_address));
then just use the normal API. Works for writes anyway.
>> Most currently available hardware doesn't allow reads but will allow writes
>> on PCIe peer-to-peer transfers. All current AMD chipsets are this way. I'm
>> pretty sure all Intel chipsets are this way also.
>
> Most != all. As an example, Mellanox offers the ability to do peer-to-peer
> transfers:
>
> http://www.mellanox.com/page/products_dyn?product_family=116
>
> which would indicate there is at least some platform out there which allows
> peer-to-peer reads. I don't think that that being a minority configuration
> should preclude it from support.
>
>> What happens with reads
>> is they are just dropped with no indication of error other than the data
>> will not be as expected. Supposedly the PCIe spec does not even require any
>> peer-to-peer support. Regular PCI there is no problem and this API could be
>> useful. However I doubt seriously you will find a pure PCI motherboard that
>> has an IOMMU.
>>
>> I don't understand the chipset manufactures reasoning for disabling PCIe
>> peer-to-peer reads. We would like to make PCIe versions of our cards but
>> their application requires peer-to-peer reads and writes. So we cannot
>> develop PCIe versions of the cards.
>>
>> Again, Regular PCI there is no problem and this API could be useful.
>> IOMMU or not.
>> If we had a pure PCI with IOMMU env, how will this API handle when the 2
>> devices are on the same PCI bus. There will be NO IOMMU between the devices
>> on the same bus. Does this API address that configuration?
>>
>
> What is the expected behavior in this configuration? That the "mapping"
> simply be the bus address (as in the nommu case)?
>
I suspect just using the bus address would sort of defeat one or more
purposes of the IOMMU. The bus address would certainly be what I would
want to use though.
> In an IOMMU environment, the DMA ops would be one of the IOMMU
> implementations, so these APIs would create a mapping for the peer device
> resource, even if it's on the same bus. Would a transaction targeting that
> mapping be forwarded upstream until it hits an IOMMU, which would then send
> the translated request back downstream? Or is my understanding of this
> configuration incorrect?
>
It's my understanding of the IOMMU that is lacking here. I have no idea
if that is actually what would happen. Does it?
Regards
Mark
> Thanks,
> Will
>
>> Mark
>>
>>>> +If this API is not provided by the underlying implementation, 0 is
>>>> +returned and the driver must take appropriate action. Otherwise, the
>>>> +DMA address is returned, and that DMA address should be checked by
>>>> +the driver (see dma_mapping_error() below).
>>>> +
>>>> +void
>>>> +dma_unmap_resource(struct device *dev, dma_addr_t dma_address, size_t
>> size,
>>>> + enum dma_data_direction direction)
>>>> +
>>>> +Unmaps the resource previously mapped. All the parameters passed in
>>>> +must be identical to those passed in to (and returned by) the
>>>> +mapping API.
>>>> +
>>>> int
>>>> dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
>>>>
>>>> -In some circumstances dma_map_single() and dma_map_page() will fail
>>>> to create -a mapping. A driver can check for these errors by testing
>>>> the returned -DMA address with dma_mapping_error(). A non-zero return
>>>> value means the mapping -could not be created and the driver should take
>> appropriate action (e.g.
>>>> -reduce current DMA mapping usage or delay and try again later).
>>>> +In some circumstances dma_map_single(), dma_map_page() and
>>>> +dma_map_resource() will fail to create a mapping. A driver can check
>>>> +for these errors by testing the returned DMA address with
>>>> +dma_mapping_error(). A non-zero return value means the mapping could
>>>> +not be created and the driver should take appropriate action (e.g.
>> reduce current DMA mapping usage or delay and try again later).
>>>>
>>>> int
>>>> dma_map_sg(struct device *dev, struct scatterlist *sg,
>>>> --
>>>> 2.4.0
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-pci"
>>> in the body of a message to majordomo@vger.kernel.org More majordomo
>>> info at http://vger.kernel.org/majordomo-info.html
>>>
> --
> nvpublic
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
next prev parent reply other threads:[~2015-05-20 19:16 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-05-18 18:24 [PATCH v2 0/7] IOMMU/DMA map_resource support for peer-to-peer wdavis
2015-05-18 18:24 ` [PATCH v2 1/7] dma-debug: add checking for map/unmap_resource wdavis
2015-05-18 18:24 ` [PATCH v2 2/7] DMA-API: Introduce dma_(un)map_resource wdavis
2015-05-29 8:16 ` Joerg Roedel
2015-05-18 18:25 ` [PATCH v2 3/7] dma-mapping: pci: add pci_(un)map_resource wdavis
2015-05-18 18:25 ` [PATCH v2 4/7] DMA-API: Add dma_(un)map_resource() documentation wdavis
2015-05-19 23:43 ` Bjorn Helgaas
2015-05-20 12:11 ` Mark Hounschell
2015-05-20 17:30 ` William Davis
2015-05-20 19:15 ` Mark Hounschell [this message]
2015-05-20 19:51 ` William Davis
2015-05-20 20:07 ` Mark Hounschell
2015-05-27 18:31 ` William Davis
2015-05-29 8:24 ` joro
2015-07-07 15:15 ` Bjorn Helgaas
2015-07-07 15:41 ` Alex Williamson
2015-07-07 16:16 ` Bjorn Helgaas
2015-07-07 16:41 ` Alex Williamson
2015-07-07 17:14 ` Mark Hounschell
2015-07-07 17:28 ` Alex Williamson
2015-07-07 19:17 ` Mark Hounschell
2015-07-07 19:54 ` Alex Williamson
2015-07-08 15:11 ` Bjorn Helgaas
2015-07-08 16:40 ` Mark Hounschell
2015-07-09 0:50 ` Rafael J. Wysocki
2015-06-01 21:25 ` Konrad Rzeszutek Wilk
2015-06-02 14:27 ` William Davis
2015-05-18 18:25 ` [PATCH v2 5/7] iommu/amd: Implement (un)map_resource wdavis
2015-05-18 18:25 ` [PATCH v2 6/7] iommu/vt-d: implement (un)map_resource wdavis
2015-05-18 18:25 ` [PATCH v2 7/7] x86: add pci-nommu implementation of map_resource wdavis
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=555CDD6F.40304@compro.net \
--to=markh@compro.net \
--cc=TRipperda@nvidia.com \
--cc=bhelgaas@google.com \
--cc=corbet@lwn.net \
--cc=davem@davemloft.net \
--cc=iommu@lists.linux-foundation.org \
--cc=jglisse@redhat.com \
--cc=jhubbard@nvidia.com \
--cc=joro@8bytes.org \
--cc=konrad.wilk@oracle.com \
--cc=linux-pci@vger.kernel.org \
--cc=wdavis@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).