From mboxrd@z Thu Jan 1 00:00:00 1970
From: catalin.marinas@arm.com (Catalin Marinas)
Date: Thu, 28 Jan 2016 11:49:25 +0000
Subject: Speeding up dma_unmap
In-Reply-To: <24650183.WoXLVmr0Vj@wuerfel>
References: <20160127180944.GZ10826@n2100.arm.linux.org.uk>
 <20160128103105.GR14823@e104818-lin.cambridge.arm.com>
 <24650183.WoXLVmr0Vj@wuerfel>
Message-ID: <20160128114925.GU14823@e104818-lin.cambridge.arm.com>
To: linux-arm-kernel@lists.infradead.org
List-Id: linux-arm-kernel.lists.infradead.org

On Thu, Jan 28, 2016 at 12:20:55PM +0100, Arnd Bergmann wrote:
> On Thursday 28 January 2016 10:31:06 Catalin Marinas wrote:
> > On Wed, Jan 27, 2016 at 06:09:45PM +0000, Russell King - ARM Linux wrote:
> > > On Wed, Jan 27, 2016 at 04:06:30PM +0000, Catalin Marinas wrote:
> > > > On Wed, Jan 27, 2016 at 01:23:27PM +0100, Arnd Bergmann wrote:
> > > > > up reading cache lines back in randomly on a speculative prefetch,
> > > > > but as far as I can tell, the Cortex-A8 (or A5/A7) won't do that.
> > > >
> > > > Are you sure about A5 and A7? I'm not even sure about the A8 but there
> > > > are good chances that A7 and A5 do speculative prefetches.
> > >
> > > I thought when I was re-implementing the DMA API on ARM (which was
> > > around early v7 times) that there were CPUs that did speculative
> > > prefetching, which included the A8. I seem to remember it was pretty
> > > urgent to have the DMA API fixed for _any_ ARMv7 CPU because of the
> > > speculative prefetching.
> >
> > Indeed, it's a safe assumption to say that any ARMv7 CPU perform
> > speculative accesses. Even if some of them may only do I-cache
> > prefetching (just guessing), in the presence of a unified L2 this
> > distinction no longer matters.
[...]
> This means that there are still some cores on which one could try
> if disabling the prefetching and the flushes in DMA unmap provides
> any serious performance boost.

I think we need to look at the original use-case. There seems to be a
4MB buffer, how often is this mapped/unmapped? Would it be better off
with the coherent API than the streaming one?

-- 
Catalin