From: linux@arm.linux.org.uk (Russell King - ARM Linux)
To: linux-arm-kernel@lists.infradead.org
Subject: [PATCH v2 06/17] ARM: dma-mapping: fix for speculative accesses
Date: Tue, 24 Nov 2009 16:22:50 +0000
Message-ID: <20091124162250.GA17481@n2100.arm.linux.org.uk>
In-Reply-To: <c70ff3ad0911240746v3438bba9y556cca3f027b3dd7@mail.gmail.com>
On Tue, Nov 24, 2009 at 05:46:02PM +0200, saeed bishara wrote:
> On Mon, Nov 23, 2009 at 7:27 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Mon, Nov 23, 2009 at 05:25:06PM +0200, saeed bishara wrote:
> >> > +       if (dir == DMA_FROM_DEVICE) {
> >> > +               outer_inv_range(paddr, paddr + size);
> >> > +               dmac_inv_range(kaddr, kaddr + size);
> >> it's not clear why the outer cache is invalidated before the inner
> >> cache, and I think it may be incorrect: how can you be sure that a
> >> dirty line in the inner cache can't be moved down to the outer cache?
> >
> > I think you have a point, but also see Catalin's point.  When we are
> > invalidating, we have to invalidate the outer caches before the inner
> > caches, otherwise there is a danger that the inner caches could
> > prefetch from the outer between the two operations.
> >
> > I think that the only thing we can do is as my previous version did,
> > and always clean the caches at this point - first the inner and then
> > the outer caches.
>
> This makes sense. If a specific CPU needs the outer cache to be
> invalidated before the inner one, then that had better be done in a
> separate patch, as this is a different issue and has nothing to do with
> the speculative prefetch.

That will mean redoing most of the patch series, which won't be pretty.

> > Yes, but we need to do this for bidirectional mappings as well (which
> > would've only been cleaned.)
> >
> > What would be useful is if someone could run some tests on the performance
> > impact of this (eg, measuring the DMA performance of a SATA hard drive)
> > and comparing the resulting performance.
> >
> > I _could_ do that, but it would mean taking down my network and running
> > the tests on a fairly old machine.
>
> I want to help with that testing. Is there a git tree I can pull the
> code from?

There isn't, because apparently under git rules, as soon as I publish it
in git form it must be fixed. So I'm keeping the stuff private and only
issuing patches on this mailing list.

> > If it does turn out to be a problem, we can change the 'inv_range' and
> > 'clean_range' functions to be 'dev_to_cpu' and 'cpu_to_dev' functions,
> > and push the decisions about what to do at each point down into the
> > per-CPU assembly.
>
> The thing is that inv_range (or dev_to_cpu) is called both before and
> after the DMA operation, and for systems that don't have speculative
> prefetch, we don't want that function to be called at all after the
> DMA operation.
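
To make the 'cpu_to_dev'/'dev_to_cpu' idea and the ordering argument
above concrete, here is a rough user-space sketch. It is not the kernel
code and not what the patch does: the enum, the struct cache_ops hooks,
the logging helper and the stand-in cache routines are all hypothetical,
and the choice of a clean+invalidate for the non-speculating case is an
assumption made only to keep the model safe.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical direction enum; the kernel's own is enum dma_data_direction. */
enum dir { DIR_BIDIRECTIONAL, DIR_TO_DEVICE, DIR_FROM_DEVICE };

/* Records the order of the (simulated) cache operations. */
static char op_log[256];

static void log_op(const char *op)
{
	size_t used = strlen(op_log);

	snprintf(op_log + used, sizeof(op_log) - used, "%s%s",
		 used ? " " : "", op);
}

/* Stand-ins for the real inner/outer cache maintenance routines. */
static void inner_clean(void) { log_op("inner_clean"); }
static void outer_clean(void) { log_op("outer_clean"); }
static void inner_flush(void) { log_op("inner_flush"); }
static void outer_flush(void) { log_op("outer_flush"); }
static void inner_inv(void)   { log_op("inner_inv"); }
static void outer_inv(void)   { log_op("outer_inv"); }

/* Hypothetical per-CPU hooks, one per ownership transition. */
struct cache_ops {
	void (*cpu_to_dev)(enum dir dir);	/* before the DMA starts */
	void (*dev_to_cpu)(enum dir dir);	/* after the DMA completes */
};

/* CPU without speculative prefetch: all the work is done up front. */
static void nospec_cpu_to_dev(enum dir dir)
{
	if (dir == DIR_TO_DEVICE) {
		inner_clean();
		outer_clean();
	} else {
		/* Assumption: clean+invalidate so no stale line survives. */
		inner_flush();
		outer_flush();
	}
}

static void nospec_dev_to_cpu(enum dir dir)
{
	(void)dir;	/* nothing can have been prefetched in the meantime */
}

/* CPU with speculative prefetch: clean before, invalidate after. */
static void spec_cpu_to_dev(enum dir dir)
{
	(void)dir;
	inner_clean();	/* inner first, so dirty data moves outwards */
	outer_clean();
}

static void spec_dev_to_cpu(enum dir dir)
{
	if (dir == DIR_TO_DEVICE)
		return;
	outer_inv();	/* outer first, so the inner cache cannot */
	inner_inv();	/* refill from a stale outer line */
}

static const struct cache_ops nospec_ops = { nospec_cpu_to_dev, nospec_dev_to_cpu };
static const struct cache_ops spec_ops = { spec_cpu_to_dev, spec_dev_to_cpu };
```

Calling spec_ops.cpu_to_dev() and then spec_ops.dev_to_cpu() for a
DIR_FROM_DEVICE buffer leaves the sequence of operations in op_log,
while nospec_ops.dev_to_cpu() records nothing, which is the behaviour
asked for above on systems without speculative prefetch.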

Well, the numbers based on trivial looping tests on Versatile/PB926
(in other words, it's not a real test) are:

--- original ---
Test: Null Function: 108.000ns
Test: Map 4K DMA_FROM_DEVICE: 10393.000ns
Test: Unmap 4K DMA_FROM_DEVICE: 107.000ns
Test: Map+Unmap 4K DMA_FROM_DEVICE: 10388.999ns
Test: Map 4K DMA_TO_DEVICE: 10374.000ns
Test: Unmap 4K DMA_TO_DEVICE: 106.000ns
Test: Map+Unmap 4K DMA_TO_DEVICE: 10374.000ns
Test: Map 32K DMA_FROM_DEVICE: 78653.998ns
Test: Unmap 32K DMA_FROM_DEVICE: 106.000ns
Test: Map+Unmap 32K DMA_FROM_DEVICE: 78654.003ns
Test: Map 32K DMA_TO_DEVICE: 78639.984ns
Test: Unmap 32K DMA_TO_DEVICE: 107.000ns
Test: Map+Unmap 32K DMA_TO_DEVICE: 78640.041ns

--- replacement ---
Test: Null Function: 109.000ns
Test: Map 4K DMA_FROM_DEVICE: 10196.000ns
Test: Unmap 4K DMA_FROM_DEVICE: 10185.999ns
Test: Map+Unmap 4K DMA_FROM_DEVICE: 20290.999ns
Test: Map 4K DMA_TO_DEVICE: 10164.000ns
Test: Unmap 4K DMA_TO_DEVICE: 327.000ns
Test: Map+Unmap 4K DMA_TO_DEVICE: 10454.000ns
Test: Map 32K DMA_FROM_DEVICE: 78457.995ns
Test: Unmap 32K DMA_FROM_DEVICE: 78473.016ns
Test: Map+Unmap 32K DMA_FROM_DEVICE: 156819.917ns
Test: Map 32K DMA_TO_DEVICE: 78454.342ns
Test: Unmap 32K DMA_TO_DEVICE: 325.995ns
Test: Map+Unmap 32K DMA_TO_DEVICE: 78715.267ns

which shows minimal impact for the DMA_TO_DEVICE case, but a major impact
on the DMA_FROM_DEVICE case, which I don't think we can live with.

So I think it's time to re-think this.
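
For reference, the size of the impact can be worked out from the
Map+Unmap figures above. A small sketch, where the struct and function
names are purely illustrative and the values are copied from the two
tables:

```c
#include <stdio.h>

/* One Map+Unmap measurement, before and after the change (ns). */
struct sample {
	const char *name;
	double before_ns;
	double after_ns;
};

/* Values taken from the "original" and "replacement" tables. */
static const struct sample samples[] = {
	{ "Map+Unmap 4K DMA_FROM_DEVICE",  10388.999,  20290.999 },
	{ "Map+Unmap 4K DMA_TO_DEVICE",    10374.000,  10454.000 },
	{ "Map+Unmap 32K DMA_FROM_DEVICE", 78654.003, 156819.917 },
	{ "Map+Unmap 32K DMA_TO_DEVICE",   78640.041,  78715.267 },
};

/* Ratio of the new cost to the old cost; 1.0 means no change. */
static double slowdown(const struct sample *s)
{
	return s->after_ns / s->before_ns;
}

/* Print one line per measurement with its slowdown factor. */
static void print_slowdowns(void)
{
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		printf("%-32s %.3fx\n", samples[i].name,
		       slowdown(&samples[i]));
}
```

By this arithmetic the DMA_FROM_DEVICE round trip roughly doubles at
both sizes, while the DMA_TO_DEVICE round trip moves by under one
percent.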