* dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute
@ 2015-06-28 20:32 Sylvain Munaut
2015-06-28 22:45 ` Russell King - ARM Linux
0 siblings, 1 reply; 9+ messages in thread
From: Sylvain Munaut @ 2015-06-28 20:32 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
I've found this patch from several month ago :
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-February/325489.html
And I find it very useful, but I couldn't find why it wasn't merged ?
What was the problem with it ?
In my use case, I need to allocate fairly large buffers (holding video
data) and the hardware requires them to be contiguous.
I ensure the consistency by using the DMA streaming API
(dma_sync_single_for_{cpu,device}) and so when the buffer is "owned"
by the CPU, I definitely want cached access because coherent memory
speed is just abysmal ... (about 10x slower to read the whole buffer
sequentially).
Since the buffer is > 4M, I can't rely on kmalloc or alloc_pages to
allocate it and dma_alloc_attrs with this patch applied does exactly
what I need.
Did I miss something ? Is there another way to do what I need ? (well
beside reimplementing the whole dma_alloc_attr logic myself ...)
Cheers,
Sylvain
^ permalink raw reply [flat|nested] 9+ messages in thread* dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute
2015-06-28 20:32 dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute Sylvain Munaut
@ 2015-06-28 22:45 ` Russell King - ARM Linux
2015-06-29 5:24 ` Sylvain Munaut
0 siblings, 1 reply; 9+ messages in thread
From: Russell King - ARM Linux @ 2015-06-28 22:45 UTC (permalink / raw)
To: linux-arm-kernel
On Sun, Jun 28, 2015 at 10:32:41PM +0200, Sylvain Munaut wrote:
> I've found this patch from several month ago :
>
> http://lists.infradead.org/pipermail/linux-arm-kernel/2015-February/325489.html
>
> And I find it very useful, but I couldn't find why it wasn't merged ?
It's wrong.
> What was the problem with it ?
It's based on a bad assumption, the same bad assumption that you seem to
be making below, which is an abuse of the DMA API.
> In my use case, I need to allocate fairly large buffers (holding video
> data) and the hardware requires them to be contiguous.
So, you're allocating the buffers using the *coherent* DMA API...
> I ensure the consistency by using the DMA streaming API
> (dma_sync_single_for_{cpu,device}) and so when the buffer is "owned"
> by the CPU, I definitely want cached access because coherent memory
> speed is just abysmal ... (about 10x slower to read the whole buffer
> sequentially).
... and then using the *streaming* API dma_sync_* calls on it. That
is invalid, and will lead to undefined results on implementations of
dma_alloc_coherent() which return remapped memory (as for older ARMs.)
This is an abuse of the API.
See Documentation/DMA-API.txt. The API is split into two separate major
chunks, the coherent API, and the streaming API. It is not permissible
to use functions from one API with memory obtained from the other API.
Enable DMA debug (part III of that document) to have the kernel tell you
that you're abusing the API. :)
--
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 9+ messages in thread* dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute
2015-06-28 22:45 ` Russell King - ARM Linux
@ 2015-06-29 5:24 ` Sylvain Munaut
2015-06-29 6:44 ` Mike Looijmans
0 siblings, 1 reply; 9+ messages in thread
From: Sylvain Munaut @ 2015-06-29 5:24 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
> ... and then using the *streaming* API dma_sync_* calls on it. That
> is invalid, and will lead to undefined results on implementations of
> dma_alloc_coherent() which return remapped memory (as for older ARMs.)
> This is an abuse of the API.
Ok, fine, but then
_How_ do I do it properly ? From a driver module, how do I allocate a
large chunk ( > 4M so no alloc_pages and no kmalloc ) of contiguous
memory that's cacheable and that I can use with the streaming API ?
Or alternatively, what are the correct "sync points" that the doc refers to ?
""" By using this API, you are guaranteeing to the platform that you
have all the correct and necessary sync points for this memory in the
driver should it choose to return non-consistent memory."""
I had assumed it was the dma_sync_* calls, but apparently not.
Cheers,
Sylvain
^ permalink raw reply [flat|nested] 9+ messages in thread
* dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute
2015-06-29 5:24 ` Sylvain Munaut
@ 2015-06-29 6:44 ` Mike Looijmans
2015-06-29 9:45 ` Russell King - ARM Linux
2015-06-29 10:05 ` Sylvain Munaut
0 siblings, 2 replies; 9+ messages in thread
From: Mike Looijmans @ 2015-06-29 6:44 UTC (permalink / raw)
To: linux-arm-kernel
?On 29-06-15 07:24, Sylvain Munaut wrote:
>> ... and then using the *streaming* API dma_sync_* calls on it. That
>> is invalid, and will lead to undefined results on implementations of
>> dma_alloc_coherent() which return remapped memory (as for older ARMs.)
>> This is an abuse of the API.
>
> Ok, fine, but then
>
> _How_ do I do it properly ? From a driver module, how do I allocate a
> large chunk ( > 4M so no alloc_pages and no kmalloc ) of contiguous
> memory that's cacheable and that I can use with the streaming API ?
To the best of my knowledge, and after asking around here, the only answer
I've so far been able to get is that alloc_pages (and hence, kmalloc) is the
method to call.
It would be nice if there were a dma_alloc_streaming() interface, which might
just translate to kalloc directly, but it would at least get rid of all the
obscurities of allocating DMA memory.
I fell into the same pit as Sylvain here, initially I was also allocating with
dma_alloc_...
I submitted a patch to amend the dma documentation a while ago, but never got
a response, so I probably sent it to the wrong group.
> Or alternatively, what are the correct "sync points" that the doc refers to ?
>
> """ By using this API, you are guaranteeing to the platform that you
> have all the correct and necessary sync points for this memory in the
> driver should it choose to return non-consistent memory."""
>
> I had assumed it was the dma_sync_* calls, but apparently not.
I think the answer is "the dma_sync_* calls plus whatever it is your DMA
controller hardware needs to do"
If that's the case, where should I send the documentation patch?
Kind regards,
Mike Looijmans
System Expert
TOPIC Embedded Products
Eindhovenseweg 32-C, NL-5683 KH Best
Postbus 440, NL-5680 AK Best
Telefoon: +31 (0) 499 33 69 79
Telefax: +31 (0) 499 33 69 70
E-mail: mike.looijmans at topicproducts.com
Website: www.topicproducts.com
Please consider the environment before printing this e-mail
^ permalink raw reply [flat|nested] 9+ messages in thread
* dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute
2015-06-29 6:44 ` Mike Looijmans
@ 2015-06-29 9:45 ` Russell King - ARM Linux
2015-06-29 13:08 ` Mike Looijmans
2015-06-29 10:05 ` Sylvain Munaut
1 sibling, 1 reply; 9+ messages in thread
From: Russell King - ARM Linux @ 2015-06-29 9:45 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jun 29, 2015 at 08:44:50AM +0200, Mike Looijmans wrote:
> ?On 29-06-15 07:24, Sylvain Munaut wrote:
> >""" By using this API, you are guaranteeing to the platform that you
> >have all the correct and necessary sync points for this memory in the
> >driver should it choose to return non-consistent memory."""
> >
> >I had assumed it was the dma_sync_* calls, but apparently not.
>
> I think the answer is "the dma_sync_* calls plus whatever it is your DMA
> controller hardware needs to do"
No it is not. When something is clearly the wrong API, you do not go
around telling people to use it. I'm pretty disgusted that you think
that is an appropraite way to behave.
> If that's the case, where should I send the documentation patch?
It's not for me to say whether your patch is correct or not, it's for
the wider community to do so. As there is no one "responsible" for
that documentation, I'd suggest sending it to Andrew Morton as nothing
else happened with it.
--
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 9+ messages in thread
* dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute
2015-06-29 9:45 ` Russell King - ARM Linux
@ 2015-06-29 13:08 ` Mike Looijmans
0 siblings, 0 replies; 9+ messages in thread
From: Mike Looijmans @ 2015-06-29 13:08 UTC (permalink / raw)
To: linux-arm-kernel
On 29-06-15 11:45, Russell King - ARM Linux wrote:
> On Mon, Jun 29, 2015 at 08:44:50AM +0200, Mike Looijmans wrote:
>> ?On 29-06-15 07:24, Sylvain Munaut wrote:
>>> """ By using this API, you are guaranteeing to the platform that you
>>> have all the correct and necessary sync points for this memory in the
>>> driver should it choose to return non-consistent memory."""
>>>
>>> I had assumed it was the dma_sync_* calls, but apparently not.
>>
>> I think the answer is "the dma_sync_* calls plus whatever it is your DMA
>> controller hardware needs to do"
>
> No it is not. When something is clearly the wrong API, you do not go
> around telling people to use it. I'm pretty disgusted that you think
> that is an appropraite way to behave.
I trying to merely refer to the "correct and necessary sync points" part, once
the decision to (properly) use the streaming API has been made.
To the best of my knowledge, the dma_sync_... methods take care of things from
the ARM (CPU) side.
That leaves only the device side, you need to make sure that for example its
FIFOs are empty before starting to flush caches. Something that the API cannot
do for you, but your driver and hardware (or logic in the case of a Zynq) must
do a proper job too.
Saying something about those things would make the documentation a lot less
cryptic than just dropping a "correct and necessary sync points" bomb into the
text and then offering no advise on where one could hope to find some clay
tablets with engravings explaining what is "correct and necessary".
M.
^ permalink raw reply [flat|nested] 9+ messages in thread
* dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute
2015-06-29 6:44 ` Mike Looijmans
2015-06-29 9:45 ` Russell King - ARM Linux
@ 2015-06-29 10:05 ` Sylvain Munaut
2015-06-29 10:08 ` Russell King - ARM Linux
1 sibling, 1 reply; 9+ messages in thread
From: Sylvain Munaut @ 2015-06-29 10:05 UTC (permalink / raw)
To: linux-arm-kernel
>> _How_ do I do it properly ? From a driver module, how do I allocate a
>> large chunk ( > 4M so no alloc_pages and no kmalloc ) of contiguous
>> memory that's cacheable and that I can use with the streaming API ?
>
> To the best of my knowledge, and after asking around here, the only answer
> I've so far been able to get is that alloc_pages (and hence, kmalloc) is the
> method to call.
But as I mentioned that doesn't work for chunks larger than 4M due to
the MAX_ORDER.
I'd want that memory to come out of the CMA pool ideally (since afaict
it's the most reliable way to get large chunks of contiguous memory).
But the CMA stuff is not available directly and the only "visible"
interface to it I could find is the dma_alloc_xxx methods.
> It would be nice if there were a dma_alloc_streaming() interface, which
> might just translate to kalloc directly, but it would at least get rid of
> all the obscurities of allocating DMA memory.
+1
>> """ By using this API, you are guaranteeing to the platform that you
>> have all the correct and necessary sync points for this memory in the
>> driver should it choose to return non-consistent memory."""
>>
>> I had assumed it was the dma_sync_* calls, but apparently not.
>
>
> I think the answer is "the dma_sync_* calls plus whatever it is your DMA
> controller hardware needs to do"
Well that can't be since the dma_sync_* are part of streaming API and
the dma_alloc_attrs is part of the "coherent" API (even though you
have the NON_COHERENT flag in it ... that's a bit weird).
But I don't see the alternative, unless the doc is suggesting I
basically do all the cache invalidation myself manually and doing arch
specific stuff directly in the driver ?
Cheers,
Sylvain
^ permalink raw reply [flat|nested] 9+ messages in thread
* dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute
2015-06-29 10:05 ` Sylvain Munaut
@ 2015-06-29 10:08 ` Russell King - ARM Linux
2015-06-29 13:22 ` Sylvain Munaut
0 siblings, 1 reply; 9+ messages in thread
From: Russell King - ARM Linux @ 2015-06-29 10:08 UTC (permalink / raw)
To: linux-arm-kernel
On Mon, Jun 29, 2015 at 12:05:09PM +0200, Sylvain Munaut wrote:
> Well that can't be since the dma_sync_* are part of streaming API and
> the dma_alloc_attrs is part of the "coherent" API (even though you
> have the NON_COHERENT flag in it ... that's a bit weird).
>
> But I don't see the alternative, unless the doc is suggesting I
> basically do all the cache invalidation myself manually and doing arch
> specific stuff directly in the driver ?
Maybe someone with this problem should work on a solution - augmenting
the coherent API with a set of functions to do what you want?
Please, if you find something lacking like this, make a proposal and
send a patch - but bear in mind that such an extension is not a matter
for just ARM, but every other architecture that the kernel supports,
so merely sending it to ARM mailing lists isn't going to get you very
far.
--
FTTC broadband for 0.8mile line: currently at 10.5Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 9+ messages in thread
* dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute
2015-06-29 10:08 ` Russell King - ARM Linux
@ 2015-06-29 13:22 ` Sylvain Munaut
0 siblings, 0 replies; 9+ messages in thread
From: Sylvain Munaut @ 2015-06-29 13:22 UTC (permalink / raw)
To: linux-arm-kernel
Hi Russell,
On Mon, Jun 29, 2015 at 12:08 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Mon, Jun 29, 2015 at 12:05:09PM +0200, Sylvain Munaut wrote:
>> Well that can't be since the dma_sync_* are part of streaming API and
>> the dma_alloc_attrs is part of the "coherent" API (even though you
>> have the NON_COHERENT flag in it ... that's a bit weird).
>>
>> But I don't see the alternative, unless the doc is suggesting I
>> basically do all the cache invalidation myself manually and doing arch
>> specific stuff directly in the driver ?
>
> Maybe someone with this problem should work on a solution - augmenting
> the coherent API with a set of functions to do what you want?
>
> Please, if you find something lacking like this, make a proposal and
> send a patch - but bear in mind that such an extension is not a matter
> for just ARM, but every other architecture that the kernel supports,
> so merely sending it to ARM mailing lists isn't going to get you very
> far.
I think I'd do it the other way, and augment the streaming API with a
method to allocate memory (either using kmalloc for small size or from
the CMA for large chunks).
This seems to be much less architecture dependent.
Would that be acceptable ?
All in all, the DMA_ATTR_NON_CONSISTENT flag seems to be used by only
2 drivers (and I couldn't really trace easily how they handle the
sync).
Cheers,
Sylvain
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2015-06-29 13:22 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-06-28 20:32 dma-mapping: support for DMA_ATTR_NON_CONSISTENT DMA attribute Sylvain Munaut
2015-06-28 22:45 ` Russell King - ARM Linux
2015-06-29 5:24 ` Sylvain Munaut
2015-06-29 6:44 ` Mike Looijmans
2015-06-29 9:45 ` Russell King - ARM Linux
2015-06-29 13:08 ` Mike Looijmans
2015-06-29 10:05 ` Sylvain Munaut
2015-06-29 10:08 ` Russell King - ARM Linux
2015-06-29 13:22 ` Sylvain Munaut
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).