From mboxrd@z Thu Jan 1 00:00:00 1970 From: santosh.shilimkar@ti.com (Santosh Shilimkar) Date: Tue, 22 Apr 2014 11:18:08 -0400 Subject: [PATCH] ARM: mm: dma: Update coherent streaming apis with missing memory barrier In-Reply-To: <20140422150725.GD10224@arm.com> References: <1398103390-31968-1-git-send-email-santosh.shilimkar@ti.com> <20140422102834.GE7484@arm.com> <53567352.1080306@ti.com> <20140422150725.GD10224@arm.com> Message-ID: <53568830.50801@ti.com> To: linux-arm-kernel@lists.infradead.org List-Id: linux-arm-kernel.lists.infradead.org On Tuesday 22 April 2014 11:07 AM, Catalin Marinas wrote: > On Tue, Apr 22, 2014 at 02:49:06PM +0100, Santosh Shilimkar wrote: >> On Tuesday 22 April 2014 06:28 AM, Will Deacon wrote: >>> On Mon, Apr 21, 2014 at 07:03:10PM +0100, Santosh Shilimkar wrote: >>>> ARM coherent CPU dma map APIS are assumed to be nops on cache coherent >>>> machines. While this is true, one still needs to ensure that no >>>> outstanding writes are pending in CPU write buffers. To take care >>>> of that, we at least need a memory barrier to commit those changes >>>> to main memory. >>>> >>>> Patch is trying to fix those cases. Without such a patch, you will >>>> end up patching device drivers to avoid the synchronisation issues. >>> >>> Don't you only need these barriers if you're passing ownership of a CPU >>> buffer to a device? In that case, I would expect a subsequent writel to tell >>> the device about the new buffer, which includes the required __iowmb(). >>> That's the reason for the relaxed accessors: to avoid this barrier when it's >>> not needed. Perhaps you're using the relaxed accessors where you actually >>> need the stronger ordering guarantees? >> >> I kind of guessed some one will bring up above point. Infact this is how >> mostly people have been living with the issue on coherent machines. On >> Keystone too, we did explicit barriers in respective drivers. >> >> I have added these barriers only on CPU to device streaming APIs because on >> other direction, the memory is already upto date from CPU's perspective. >> >> But if you look at the actual problem, its really responsibility of >> DMA streaming APIs which we are trying to push on to drivers. A device >> driver should be independent of whether it is running on a coherent or >> a non-coherent CPU. >> >> Lets take a example.... >> MMC controller driver running on a non-coherent and coherent machine. >> Driver has below code sequence which is generic. >> 1. Prepare SG list >> 2. Perform CMO using DMA streaming API >> 3. Start DMA transfer... > > The key here is how you start the DMA transfer. So far we assumed it's > done via an I/O operation like writel() and it has the right barriers. > If we have other ways for starting this (like writing the dma_addr in > some other memory descriptor), should we use explicit memory barriers? > My point is why we want to rely on the DMA kick mechanism barrier for the operation which is suppose to be completed in step 2. Regards, Santosh