From: Ben Dooks <ben-linux@fluff.org>
To: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Cc: Hugh Dickins <hugh.dickins@tiscali.co.uk>,
Robin Holt <holt@sgi.com>,
linux-kernel@vger.kernel.org,
v4l2_linux <linux-media@vger.kernel.org>,
linux-arm-kernel@lists.arm.linux.org.uk
Subject: Re: How to efficiently handle DMA and cache on ARMv7 ? (was "Is get_user_pages() enough to prevent pages from being swapped out ?")
Date: Thu, 6 Aug 2009 12:46:19 +0100
Message-ID: <20090806114619.GW2080@trinity.fluff.org>
In-Reply-To: <200908061208.22131.laurent.pinchart@ideasonboard.com>
On Thu, Aug 06, 2009 at 12:08:21PM +0200, Laurent Pinchart wrote:
> [Resent with an updated subject, this time CC'ing linux-arm-kernel]
>
> I've spent the last few days "playing" with get_user_pages() and mlock() and
> got some interesting results. It turned out that cache coherency comes into
> play at some point, making the overall problem more complex.
>
> Here's my current setup:
>
> - OMAP processor, based on an ARMv7 core
> - MMU and IOMMU
> - VIPT non-aliasing data cache
> - video capture driver that transfers data to memory using DMA
> - video capture application that passes userspace pointers to video buffers
> to the driver
>
> My goal is to make sure that, upon DMA completion, the correct data will be
> available to the userspace application.
>
> The first problem was to pin pages to memory, to make sure they will not be
> freed while the DMA is in progress. videobuf-dma-sg uses get_user_pages() for
> that, and Hugh Dickins nicely explained to me why this is enough.
>
> The second problem is to ensure cache coherency. As the userspace application
> will read data from the video buffers, those buffers will end up being cached
> in the processor's data cache. The driver does need to invalidate the cache
> before starting the DMA operation (userspace could in theory write to the
> buffers, but the data will be overwritten by DMA anyway, so there's no need to
> clean the cache).
You'll need to clean the write buffers, otherwise the CPU may have data
queued that it has yet to write back to memory.
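As a rough illustration of that ordering problem, here is a toy user-space
model in C. It is not kernel code: the single-entry write buffer,
struct toy_system and every function name are made up for the example, and
real hardware is considerably more involved. It only shows that a store still
queued on the CPU side can land in memory after the DMA has finished and
clobber the captured data, unless it is drained first.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 16

/* Toy model (hypothetical): main memory plus a one-entry CPU write buffer. */
struct toy_system {
    uint8_t mem[MEM_SIZE];
    bool    wb_valid;   /* a store is queued but not yet in memory */
    size_t  wb_addr;
    uint8_t wb_data;
};

/* CPU store: queued in the write buffer, not yet visible in memory. */
static void cpu_store(struct toy_system *s, size_t addr, uint8_t val)
{
    s->wb_valid = true;
    s->wb_addr = addr;
    s->wb_data = val;
}

/* Push any queued store out to memory. */
static void drain_write_buffer(struct toy_system *s)
{
    if (s->wb_valid) {
        s->mem[s->wb_addr] = s->wb_data;
        s->wb_valid = false;
    }
}

/* Device DMA: writes straight to memory, bypassing the CPU side. */
static void dma_fill(struct toy_system *s, uint8_t val)
{
    memset(s->mem, val, MEM_SIZE);
}

/* Returns true if memory holds only DMA data once everything has settled. */
static bool run_capture(bool drain_before_dma)
{
    struct toy_system s = {0};

    cpu_store(&s, 3, 0x11);      /* earlier write to the buffer */
    if (drain_before_dma)
        drain_write_buffer(&s);  /* the step being argued for */
    dma_fill(&s, 0xAA);          /* capture frame arrives */
    drain_write_buffer(&s);      /* the hardware drains eventually anyway */

    for (size_t i = 0; i < MEM_SIZE; i++)
        if (s.mem[i] != 0xAA)
            return false;
    return true;
}
```

In this model run_capture(false) ends with the stale byte overwriting part of
the frame, while run_capture(true) leaves the DMA data intact.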
> As the cache is of the VIPT (Virtual Index Physical Tag) type, cache
> invalidation can either be done globally (in which case the cache is flushed
> instead of being invalidated) or based on virtual addresses. In the last case
> the processor will need to look physical addresses up, either in the TLB or
> through hardware table walk.
>
> I can see three solutions to the DMA/cache problem.
>
> 1. Flushing the whole data cache right before starting the DMA transfer.
> There's no API for that in the ARM architecture, so a whole I+D cache flush
> is required. This is quite costly, we're talking about around 30 flushes per
> second, but it doesn't involve the MMU. That's the solution that I currently
> use.
>
> 2. Invalidating only the cache lines that store video buffer data. This
> requires a TLB lookup or a hardware table walk, so the userspace application
> MM context needs to be available (no problem there as we're flushing in
> userspace context) and all pages need to be mapped properly. This can be a
> problem as, as Hugh pointed out, pages can still be unmapped from the
> userspace context after get_user_pages() returns. I have experienced one oops
> due to a kernel paging request failure:
If you already know the virtual addresses of the buffers, why do you need
a TLB lookup (or am I being dense here?)
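For reference, a by-virtual-address range operation comes down to one
maintenance operation per cache line over the buffer. The sketch below is
plain user-space C that only does the address arithmetic; it is not the
kernel's v7_dma_inv_range. The 64-byte line size, struct inv_plan and
plan_inv_range() are assumptions made for the example. It also flags lines
that are only partly covered by the buffer, since those generally need to be
cleaned as well as invalidated so that neighbouring data is not thrown away.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64u  /* assumed D-cache line size in bytes */

/* Hypothetical description of the lines a range invalidate would touch. */
struct inv_plan {
    uintptr_t first_line;   /* address of the first cache line touched */
    size_t    nr_lines;     /* number of cache lines touched */
    int       head_partial; /* first line also holds bytes before the buffer */
    int       tail_partial; /* last line also holds bytes after the buffer */
};

/* Compute the cache lines covering [start, start + len). */
static struct inv_plan plan_inv_range(uintptr_t start, size_t len)
{
    struct inv_plan p = {0};
    const uintptr_t mask = ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end, first, last_end;

    if (len == 0)
        return p;

    end = start + len;
    first = start & mask;                       /* round start down */
    last_end = (end + CACHE_LINE - 1) & mask;   /* round end up */

    p.first_line = first;
    p.nr_lines = (last_end - first) / CACHE_LINE;
    p.head_partial = (start != first);
    p.tail_partial = (end != last_end);
    return p;
}
```

A buffer that starts and ends on line boundaries yields no partial lines; any
other buffer shares its first or last line with unrelated data.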
> Unable to handle kernel paging request at virtual address 44e12000
> pgd = c8698000
> [44e12000] *pgd=8a4fd031, *pte=8cfda1cd, *ppte=00000000
> Internal error: Oops: 817 [#1] PREEMPT
> PC is at v7_dma_inv_range+0x2c/0x44
>
> Fixing this requires more investigation, and I'm not sure how to proceed to
> find out if the page fault is really caused by pages being unmapped from the
> userspace context. Help would be appreciated.
>
> 3. Mark the pages as non-cacheable. Depending on how the buffers are then used
> by userspace, the additional cache misses might destroy any benefit I would
> get from not flushing the cache before DMA. I'm not sure how to mark a bunch
> of pages as non-cacheable though. What usually happens is that video drivers
> allocate DMA-coherent memory themselves, but in this case I need to deal with
> an arbitrary buffer allocated by userspace. If someone has any experience with
> this, it would be appreciated.
>
> Regards,
>
> Laurent Pinchart
--
Ben
Q: What's a light-year?
A: One-third less calories than a regular year.