Hi,

I am attaching a patch that attempts to reduce cache maintenance operations during MMC transactions. I have tested it only on ARM, and tests with benchmarks such as iozone and bonnie showed that data integrity is maintained while I/O bandwidth increases. I tested it with kernel 3.1, and I believe it can be applied to 3.12 as well.

My understanding of the important APIs dealing with DMA memory is as follows:

1) dma_map_sg / dma_sync_sg_for_device: make sure the cache is flushed after the CPU has finished updating the memory allocated for DMA; called before handing ownership of the DMA memory to the device.

2) dma_unmap_sg / dma_sync_sg_for_cpu: make sure the cache is invalidated before the CPU reads from a DMA area that the device used to write data.

About the patch:

The changes in sdhci_adma_table_pre make sure that we only flush if we have updated the DMA area after the call to dma_map_sg.

The changes in sdhci_adma_table_post take care of the following:

1) Remove cache invalidation for memory locations that are about to be updated by the CPU, as they are not being read.

2) Perform the unmap of the scatterlist before the CPU accesses the DMA area, since the fix-ups we apply for the unaligned cases might otherwise be lost to the invalidation that follows. I was not able to induce unaligned buffer accesses using normal filesystem or raw-device operations; maybe that is why this issue has not been discovered so far.

3) The only drawback is that sg->dma_address gets used after the call to dma_unmap_sg.

I would like to understand whether this patch can cause regressions on any architecture or in the MMC functionality. To make the intended semantics and ordering concrete, I have appended two small illustrative sketches at the end of this mail.

Thanks & Regards,
Vishal Annapurve
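
For concreteness, here is a minimal sketch of the streaming-DMA ownership rules described in points 1) and 2) above. It is illustrative only: the function and parameter names are placeholders rather than code from the patch, and DMA_BIDIRECTIONAL is used just so both sync directions can be shown in one place.

#include <linux/dma-mapping.h>
#include <linux/errno.h>
#include <linux/scatterlist.h>

/* Placeholder helper, not part of the patch. */
static int dma_ownership_sketch(struct device *dev,
                                struct scatterlist *sg, int nents)
{
        int mapped;

        /* CPU has finished filling the buffers behind 'sg'. */

        /*
         * Hand ownership to the device. On ARM this cleans (flushes)
         * the cache lines covering the buffers, so the device sees
         * the CPU's writes.
         */
        mapped = dma_map_sg(dev, sg, nents, DMA_BIDIRECTIONAL);
        if (!mapped)
                return -ENOMEM;

        /*
         * If the CPU must touch the buffers while they stay mapped,
         * ownership has to be bounced explicitly in both directions.
         */
        dma_sync_sg_for_cpu(dev, sg, nents, DMA_BIDIRECTIONAL);
        /* ... CPU reads/updates the buffers here ... */
        dma_sync_sg_for_device(dev, sg, nents, DMA_BIDIRECTIONAL);

        /* ... the device performs the actual DMA transfer here ... */

        /*
         * Take ownership back. For the from-device direction this
         * invalidates the cache lines, so it must happen before the
         * CPU reads data written by the device and, as argued in
         * point 2) above, before the CPU writes anything that has
         * to survive.
         */
        dma_unmap_sg(dev, sg, nents, DMA_BIDIRECTIONAL);

        /* The CPU may now safely read the device-written data. */
        return 0;
}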
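
And here is a schematic of the reordering in sdhci_adma_table_post, again not the actual diff: the 'align' bounce buffer, the has_unaligned flag, and the 4-byte alignment handling are simplified assumptions about the driver's internals, and any kmap handling of highmem pages is elided. Note that it also exhibits drawback 3) above, since sg_dma_address() is consulted after dma_unmap_sg().

#include <linux/dma-mapping.h>
#include <linux/mmc/core.h>
#include <linux/scatterlist.h>
#include <linux/string.h>
#include <linux/types.h>

/* Simplified stand-in for sdhci_adma_table_post, not the real code. */
static void adma_table_post_sketch(struct device *dev,
                                   struct mmc_data *data,
                                   char *align, bool has_unaligned)
{
        struct scatterlist *sg;
        char *buffer;
        int i;

        /*
         * Unmap (and, on ARM, invalidate) first, while the CPU has
         * not written anything yet. If the bounced bytes were copied
         * back before this point, the invalidation could discard them.
         */
        dma_unmap_sg(dev, data->sg, data->sg_len,
                     (data->flags & MMC_DATA_READ) ?
                     DMA_FROM_DEVICE : DMA_TO_DEVICE);

        if (has_unaligned && (data->flags & MMC_DATA_READ)) {
                for_each_sg(data->sg, sg, data->sg_len, i) {
                        /* Drawback 3): dma_address read after unmap. */
                        if (sg_dma_address(sg) & 0x3) {
                                buffer = sg_virt(sg);
                                /*
                                 * Copy the bytes the device placed in
                                 * the bounce buffer back to the start
                                 * of the unaligned entry. Nothing will
                                 * invalidate these lines afterwards.
                                 */
                                memcpy(buffer, align,
                                       4 - (sg_dma_address(sg) & 0x3));
                                align += 4;
                        }
                }
        }
}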