From: Thomas Hellstrom <thellstrom@vmware.com>
To: Lucas Stach <l.stach@pengutronix.de>
Cc: Thomas Hellstrom <thellstrom@vmware.com>,
Jerome Glisse <jglisse@redhat.com>,
"dri-devel@lists.freedesktop.org"
<dri-devel@lists.freedesktop.org>
Subject: Re: Use of pci_map_page in nouveau, radeon TTM.
Date: Tue, 01 Oct 2013 13:13:35 +0200
Message-ID: <524AAE5F.1080308@vmware.com>
In-Reply-To: <1380623653.4060.4.camel@weser.hi.pengutronix.de>
On 10/01/2013 12:34 PM, Lucas Stach wrote:
> On Tuesday, 01.10.2013, at 12:16 +0200, Thomas Hellstrom wrote:
>> Jerome, Konrad
>>
>> Forgive an ignorant question, but it appears that both Nouveau and
>> Radeon may use pci_map_page() when populating TTMs on
>> pages obtained from the ordinary (not DMA) pool. These pages will, if I
>> understand things correctly, not be pages allocated with
>> dma_alloc_coherent().
>>
>> From what I understand, at least for the corresponding dma_map_page()
>> it's illegal for the CPU to access these pages without calling
>> dma_sync_xx_for_cpu(). And before the device is allowed to access them
>> again, you need to call dma_sync_xx_for_device().
>> So mapping for PCI really invalidates the TTM interleaved CPU / device
>> access model.
>>
> That's right. The API says you need to sync for device or cpu, but on
> x86 you can get away with not doing so, as on x86 the calls end up just
> being WB buffer flushes.
OK, but what about the cases where the DMA subsystem allocates a bounce
buffer? (Although I think the TTM page selection works around this
situation.) Perhaps at the very least this deserves a comment in the
code...
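
To make the bounce-buffer concern concrete, here is a minimal userspace
sketch. It is not kernel code and not how any driver is written: the
toy_* names are hypothetical, and the model only assumes that a bounced
streaming mapping gives the device a separate copy of the page, with
data moving between the copies at map time and at the sync calls.

```c
/* Userspace toy model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <string.h>

#define TOY_BUF_SIZE 16

struct toy_mapping {
	unsigned char cpu_page[TOY_BUF_SIZE]; /* what the CPU reads and writes */
	unsigned char bounce[TOY_BUF_SIZE];   /* what the device actually sees */
};

/* Models a bidirectional map that bounces: copy the page to the bounce buffer. */
static void toy_map(struct toy_mapping *m)
{
	memcpy(m->bounce, m->cpu_page, TOY_BUF_SIZE);
}

/* Models sync-for-device: push CPU writes into the bounce buffer. */
static void toy_sync_for_device(struct toy_mapping *m)
{
	memcpy(m->bounce, m->cpu_page, TOY_BUF_SIZE);
}

/* Models sync-for-cpu: pull device writes back into the CPU page. */
static void toy_sync_for_cpu(struct toy_mapping *m)
{
	memcpy(m->cpu_page, m->bounce, TOY_BUF_SIZE);
}

/* Returns 1 if the device sees a CPU write made after mapping, else 0. */
static int toy_device_sees_cpu_write(int do_sync)
{
	struct toy_mapping m;

	memset(&m, 0, sizeof(m));
	toy_map(&m);
	assert(m.bounce[0] == 0);  /* mapping copied the zeroed page */
	m.cpu_page[0] = 0xAB;      /* CPU write after the mapping exists */
	if (do_sync)
		toy_sync_for_device(&m);
	return m.bounce[0] == 0xAB;
}

/* Returns 1 if the CPU sees a device write made after mapping, else 0. */
static int toy_cpu_sees_device_write(int do_sync)
{
	struct toy_mapping m;

	memset(&m, 0, sizeof(m));
	toy_map(&m);
	m.bounce[0] = 0xCD;        /* device write lands in the bounce buffer */
	if (do_sync)
		toy_sync_for_cpu(&m);
	return m.cpu_page[0] == 0xCD;
}
```

In this model, skipping the sync loses the write in either direction
regardless of cache coherency, which is why the x86 shortcut would not
cover the bounced case.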
>
> For ARM, or similar non-coherent arches you absolutely have to do the
> syncs, or you'll end up with different contents in cache vs sysram. For
> my nouveau on ARM work I introduced some simple helpers to do the right
> thing. And it really isn't hard doing the syncs at the right points in
> time, just sync for CPU when getting a cpu_prep ioctl and then sync for
> device when validating a buffer for GPU use.
Yes, this will probably work for drivers where a buffer is bound either
for the CPU or for the GPU. However, on drivers using user-space
sub-allocation of buffers, or for partial updates of vertex buffers
etc., that isn't sufficient. In that case one either has to use coherent
memory or implement an elaborate scheme where we sync for device and
kill user-space mappings on validation, and sync for CPU in the CPU
fault handler. Unfortunately the latter triggers a fence wait for the
whole buffer, not just the part of the buffer we want to write to.
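
The cost of that scheme can be illustrated with a small userspace state
machine. Again this is only a sketch under assumptions, not TTM code:
the toy_* names are hypothetical, the 4096-byte buffer size is
arbitrary, and the fence is assumed to be tracked per buffer object
rather than per sub-range.

```c
/* Userspace toy model, NOT kernel or TTM code. toy_* names are hypothetical. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BO_SIZE 4096

enum toy_owner { TOY_OWNER_CPU, TOY_OWNER_DEVICE };

struct toy_bo {
	enum toy_owner owner;   /* who may touch the pages right now */
	bool cpu_mapped;        /* are user-space mappings present? */
	bool fence_pending;     /* GPU work outstanding on this buffer */
	size_t bytes_waited_on; /* bytes covered by fence waits so far */
};

/* Validation for GPU use: models sync-for-device plus killing user mappings. */
static void toy_validate(struct toy_bo *bo)
{
	bo->owner = TOY_OWNER_DEVICE;
	bo->cpu_mapped = false;
	bo->fence_pending = true;
}

/* CPU fault path: wait for the fence, then sync-for-cpu and remap. */
static void toy_cpu_fault(struct toy_bo *bo)
{
	if (bo->fence_pending) {
		/* The fence covers the whole buffer object, not a sub-range. */
		bo->bytes_waited_on += TOY_BO_SIZE;
		bo->fence_pending = false;
	}
	bo->owner = TOY_OWNER_CPU;
	bo->cpu_mapped = true;
}

/* CPU write of len bytes at off; faults first if the mapping was killed. */
static void toy_cpu_write(struct toy_bo *bo, size_t off, size_t len)
{
	assert(off <= TOY_BO_SIZE && len <= TOY_BO_SIZE - off);
	if (!bo->cpu_mapped)
		toy_cpu_fault(bo);
	assert(bo->owner == TOY_OWNER_CPU);
}

/* Returns how many bytes a 64-byte partial update ends up waiting on. */
static size_t toy_partial_update_cost(void)
{
	struct toy_bo bo = { TOY_OWNER_CPU, true, false, 0 };

	toy_validate(&bo);           /* buffer handed to the GPU */
	toy_cpu_write(&bo, 128, 64); /* small update of one sub-range */
	return bo.bytes_waited_on;
}
```

In this model a 64-byte update still waits on all TOY_BO_SIZE bytes,
which is the over-synchronization described above for sub-allocated
buffers.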
>
> Regards,
> Lucas
Regards,
Thomas