Re: [PATCH v2 4/7] DMA-API: Add dma_(un)map_resource() documentation

linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Mark Hounschell <markh@compro.net>
To: William Davis <wdavis@nvidia.com>, Bjorn Helgaas <bhelgaas@google.com>
Cc: "joro@8bytes.org" <joro@8bytes.org>,
	"iommu@lists.linux-foundation.org"
	<iommu@lists.linux-foundation.org>,
	"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
	Terence Ripperda <TRipperda@nvidia.com>,
	John Hubbard <jhubbard@nvidia.com>,
	"jglisse@redhat.com" <jglisse@redhat.com>,
	"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>,
	Jonathan Corbet <corbet@lwn.net>,
	"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH v2 4/7] DMA-API: Add dma_(un)map_resource() documentation
Date: Wed, 20 May 2015 15:15:59 -0400	[thread overview]
Message-ID: <555CDD6F.40304@compro.net> (raw)
In-Reply-To: <ea891abe867444d193742030348a6ad1@HQMAIL106.nvidia.com>

On 05/20/2015 01:30 PM, William Davis wrote:
>
>
>> -----Original Message-----
>> From: Mark Hounschell [mailto:markh@compro.net]
>> Sent: Wednesday, May 20, 2015 7:11 AM
>> To: Bjorn Helgaas; William Davis
>> Cc: joro@8bytes.org; iommu@lists.linux-foundation.org; linux-
>> pci@vger.kernel.org; Terence Ripperda; John Hubbard; jglisse@redhat.com;
>> konrad.wilk@oracle.com; Jonathan Corbet; David S. Miller
>> Subject: Re: [PATCH v2 4/7] DMA-API: Add dma_(un)map_resource()
>> documentation
>>
>> On 05/19/2015 07:43 PM, Bjorn Helgaas wrote:
>>> [+cc Dave, Jonathan]
>>>
>>> On Mon, May 18, 2015 at 01:25:01PM -0500, wdavis@nvidia.com wrote:
>>>> From: Will Davis <wdavis@nvidia.com>
>>>>
>>>> Add references to both the general API documentation as well as the
>> HOWTO.
>>>>
>>>> Signed-off-by: Will Davis <wdavis@nvidia.com>
>>>> ---
>>>>    Documentation/DMA-API-HOWTO.txt | 39
>> +++++++++++++++++++++++++++++++++++++--
>>>>    Documentation/DMA-API.txt       | 36 +++++++++++++++++++++++++++++++--
>> ---
>>>>    2 files changed, 68 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/Documentation/DMA-API-HOWTO.txt
>>>> b/Documentation/DMA-API-HOWTO.txt index 0f7afb2..89bd730 100644
>>>> --- a/Documentation/DMA-API-HOWTO.txt
>>>> +++ b/Documentation/DMA-API-HOWTO.txt
>>>> @@ -138,6 +138,10 @@ What about block I/O and networking buffers?  The
>> block I/O and
>>>>    networking subsystems make sure that the buffers they use are valid
>>>>    for you to DMA from/to.
>>>>
>>>> +In some systems, it may also be possible to DMA to and/or from a
>>>> +peer device's MMIO region, as described by a 'struct resource'. This
>>>> +is referred to as a peer-to-peer mapping.
>>>> +
>>>>    			DMA addressing limitations
>>>>
>>>>    Does your device have any DMA addressing limitations?  For example,
>>>> is @@ -648,6 +652,35 @@ Every dma_map_{single,sg}() call should have its
>> dma_unmap_{single,sg}()
>>>>    counterpart, because the bus address space is a shared resource and
>>>>    you could render the machine unusable by consuming all bus addresses.
>>>>
>>>> +Peer-to-peer DMA mappings can be obtained using dma_map_resource()
>>>> +to map another device's MMIO region for the given device:
>>>> +
>>>> +	struct resource *peer_mmio_res = &other_dev->resource[0];
>>>> +	dma_addr_t dma_handle = dma_map_resource(dev, peer_mmio_res,
>>>> +						 offset, size, direction);
>>>> +	if (dma_handle == 0 || dma_mapping_error(dev, dma_handle))
>>>> +	{
>>>> +		/*
>>>> +		 * If dma_handle == 0, dma_map_resource() is not
>>>> +		 * implemented, and peer-to-peer transactions will not
>>>> +		 * work.
>>>> +		 */
>>>> +		goto map_error_handling;
>>>> +	}
>>>> +
>>>> +	...
>>>> +
>>>> +	dma_unmap_resource(dev, dma_handle, size, direction);
>>>> +
>>>> +Here, "offset" means byte offset within the given resource.
>>>> +
>>>> +You should both check for a 0 return value and call
>>>> +dma_mapping_error(), as dma_map_resource() can either be not
>>>> +implemented or fail and return error as outlined under the
>> dma_map_single() discussion.
>>>> +
>>>> +You should call dma_unmap_resource() when DMA activity is finished,
>>>> +e.g., from the interrupt which told you that the DMA transfer is done.
>>>> +
>>>>    If you need to use the same streaming DMA region multiple times and
>> touch
>>>>    the data in between the DMA transfers, the buffer needs to be synced
>>>>    properly in order for the CPU and device to see the most up-to-date
>>>> and @@ -765,8 +798,8 @@ failure can be determined by:
>>>>
>>>>    - checking if dma_alloc_coherent() returns NULL or dma_map_sg
>>>> returns 0
>>>>
>>>> -- checking the dma_addr_t returned from dma_map_single() and
>>>> dma_map_page()
>>>> -  by using dma_mapping_error():
>>>> +- checking the dma_addr_t returned from dma_map_single(),
>>>> +dma_map_resource(),
>>>> +  and dma_map_page() by using dma_mapping_error():
>>>>
>>>>    	dma_addr_t dma_handle;
>>>>
>>>> @@ -780,6 +813,8 @@ failure can be determined by:
>>>>    		goto map_error_handling;
>>>>    	}
>>>>
>>>> +- checking if dma_map_resource() returns 0
>>>> +
>>>>    - unmap pages that are already mapped, when mapping error occurs in
>> the middle
>>>>      of a multiple page mapping attempt. These example are applicable to
>>>>      dma_map_page() as well.
>>>> diff --git a/Documentation/DMA-API.txt b/Documentation/DMA-API.txt
>>>> index 5208840..c25c549 100644
>>>> --- a/Documentation/DMA-API.txt
>>>> +++ b/Documentation/DMA-API.txt
>>>> @@ -283,14 +283,40 @@ and <size> parameters are provided to do partial
>> page mapping, it is
>>>>    recommended that you never use these unless you really know what the
>>>>    cache width is.
>>>>
>>>> +dma_addr_t
>>>> +dma_map_resource(struct device *dev, struct resource *res,
>>>> +		 unsigned long offset, size_t size,
>>>> +		 enum dma_data_direction_direction)
>>>> +
>>>> +API for mapping resources. This API allows a driver to map a peer
>>>> +device's resource for DMA. All the notes and warnings for the other
>>>> +APIs apply here. Also, the success of this API does not validate or
>>>> +guarantee that peer-to-peer transactions between the device and its
>>>> +peer will be functional. They only grant access so that if such
>>>> +transactions are possible, an IOMMU will not prevent them from
>>>> +succeeding.
>>>
>>> If the driver can't tell whether peer-to-peer accesses will actually
>>> work, this seems like sort of a dubious API.  I'm trying to imagine
>>> how a driver would handle this.  I guess whether peer-to-peer works
>>> depends on the underlying platform (not the devices themselves)?  If
>>> we run the driver on a platform where peer-to-peer *doesn't* work,
>>> what happens?  The driver can't tell, so we just rely on the user to
>>> say "this isn't working as expected"?
>>>
>>
>
> Yes, it's quite difficult to tell whether peer-to-peer will actually work,
> and it usually involves some probing and heuristics on the driver's part.
> I wouldn't say that this makes it a dubious API - it's a piece of the
> puzzle that's absolutely necessary for a driver to set up peer-to-peer in
> an IOMMU environment.
>

I currently just do

page = virt_to_page(__va(bus_address));

then just use the normal API. Works for writes anyway.

>> Most currently available hardware doesn't allow reads but will allow writes
>> on PCIe peer-to-peer transfers. All current AMD chipsets are this way. I'm
>> pretty sure all Intel chipsets are this way also.
>
> Most != all. As an example, Mellanox offers the ability to do peer-to-peer
> transfers:
>
> http://www.mellanox.com/page/products_dyn?product_family=116
>
> which would indicate there is at least some platform out there which allows
> peer-to-peer reads. I don't think that that being a minority configuration
> should preclude it from support.
>
>> What happens with reads
>> is they are just dropped with no indication of error other than the data
>> will not be as expected. Supposedly the PCIe spec does not even require any
>> peer-to-peer support. Regular PCI there is no problem and this API could be
>> useful. However I doubt seriously you will find a pure PCI motherboard that
>> has an IOMMU.
>>
>> I don't understand the chipset manufactures reasoning for disabling PCIe
>> peer-to-peer reads. We would like to make PCIe versions of our cards but
>> their application requires  peer-to-peer reads and writes. So we cannot
>> develop PCIe versions of the cards.
>>
>> Again, Regular PCI there is no problem and this API could be useful.
>> IOMMU or not.
>> If we had a pure PCI with IOMMU env, how will this API handle when the 2
>> devices are on the same PCI bus. There will be NO IOMMU between the devices
>> on the same bus. Does this API address that configuration?
>>
>
> What is the expected behavior in this configuration? That the "mapping"
> simply be the bus address (as in the nommu case)?
>

I suspect just using the bus address would sort of defeat one or more 
purposes of the IOMMU. The bus address would certainly be what I would 
want to use though.

> In an IOMMU environment, the DMA ops would be one of the IOMMU
> implementations, so these APIs would create a mapping for the peer device
> resource, even if it's on the same bus. Would a transaction targeting that
> mapping be forwarded upstream until it hits an IOMMU, which would then send
> the translated request back downstream? Or is my understanding of this
> configuration incorrect?
>

It's my understanding of the IOMMU that is lacking here. I have no idea 
if that is actually what would happen. Does it?

Regards
Mark


> Thanks,
> Will
>
>> Mark
>>
>>>> +If this API is not provided by the underlying implementation, 0 is
>>>> +returned and the driver must take appropriate action. Otherwise, the
>>>> +DMA address is returned, and that DMA address should be checked by
>>>> +the driver (see dma_mapping_error() below).
>>>> +
>>>> +void
>>>> +dma_unmap_resource(struct device *dev, dma_addr_t dma_address, size_t
>> size,
>>>> +		   enum dma_data_direction direction)
>>>> +
>>>> +Unmaps the resource previously mapped. All the parameters passed in
>>>> +must be identical to those passed in to (and returned by) the
>>>> +mapping API.
>>>> +
>>>>    int
>>>>    dma_mapping_error(struct device *dev, dma_addr_t dma_addr)
>>>>
>>>> -In some circumstances dma_map_single() and dma_map_page() will fail
>>>> to create -a mapping. A driver can check for these errors by testing
>>>> the returned -DMA address with dma_mapping_error(). A non-zero return
>>>> value means the mapping -could not be created and the driver should take
>> appropriate action (e.g.
>>>> -reduce current DMA mapping usage or delay and try again later).
>>>> +In some circumstances dma_map_single(), dma_map_page() and
>>>> +dma_map_resource() will fail to create a mapping. A driver can check
>>>> +for these errors by testing the returned DMA address with
>>>> +dma_mapping_error(). A non-zero return value means the mapping could
>>>> +not be created and the driver should take appropriate action (e.g.
>> reduce current DMA mapping usage or delay and try again later).
>>>>
>>>>    	int
>>>>    	dma_map_sg(struct device *dev, struct scatterlist *sg,
>>>> --
>>>> 2.4.0
>>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-pci"
>>> in the body of a message to majordomo@vger.kernel.org More majordomo
>>> info at  http://vger.kernel.org/majordomo-info.html
>>>
> --
> nvpublic
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

next prev parent reply	other threads:[~2015-05-20 19:16 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-18 18:24 [PATCH v2 0/7] IOMMU/DMA map_resource support for peer-to-peer wdavis
2015-05-18 18:24 ` [PATCH v2 1/7] dma-debug: add checking for map/unmap_resource wdavis
2015-05-18 18:24 ` [PATCH v2 2/7] DMA-API: Introduce dma_(un)map_resource wdavis
2015-05-29  8:16   ` Joerg Roedel
2015-05-18 18:25 ` [PATCH v2 3/7] dma-mapping: pci: add pci_(un)map_resource wdavis
2015-05-18 18:25 ` [PATCH v2 4/7] DMA-API: Add dma_(un)map_resource() documentation wdavis
2015-05-19 23:43   ` Bjorn Helgaas
2015-05-20 12:11     ` Mark Hounschell
2015-05-20 17:30       ` William Davis
2015-05-20 19:15         ` Mark Hounschell [this message]
2015-05-20 19:51           ` William Davis
2015-05-20 20:07             ` Mark Hounschell
2015-05-27 18:31               ` William Davis
2015-05-29  8:24           ` joro
2015-07-07 15:15       ` Bjorn Helgaas
2015-07-07 15:41         ` Alex Williamson
2015-07-07 16:16           ` Bjorn Helgaas
2015-07-07 16:41             ` Alex Williamson
2015-07-07 17:14         ` Mark Hounschell
2015-07-07 17:28           ` Alex Williamson
2015-07-07 19:17             ` Mark Hounschell
2015-07-07 19:54               ` Alex Williamson
2015-07-08 15:11           ` Bjorn Helgaas
2015-07-08 16:40             ` Mark Hounschell
2015-07-09  0:50             ` Rafael J. Wysocki
2015-06-01 21:25   ` Konrad Rzeszutek Wilk
2015-06-02 14:27     ` William Davis
2015-05-18 18:25 ` [PATCH v2 5/7] iommu/amd: Implement (un)map_resource wdavis
2015-05-18 18:25 ` [PATCH v2 6/7] iommu/vt-d: implement (un)map_resource wdavis
2015-05-18 18:25 ` [PATCH v2 7/7] x86: add pci-nommu implementation of map_resource wdavis

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=555CDD6F.40304@compro.net \
    --to=markh@compro.net \
    --cc=TRipperda@nvidia.com \
    --cc=bhelgaas@google.com \
    --cc=corbet@lwn.net \
    --cc=davem@davemloft.net \
    --cc=iommu@lists.linux-foundation.org \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=joro@8bytes.org \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=wdavis@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).