From mboxrd@z Thu Jan 1 00:00:00 1970
From: Arnd Bergmann
Subject: Re: [PATCH V3 06/26] csky: Cache and TLB routines
Date: Fri, 7 Sep 2018 10:14:38 +0200
Message-ID:
References: <16105a3e54f1c4bb65a5ec81d77af7c176e705c6.1536138304.git.ren_guo@c-sky.com>
 <20180907030447.GA10434@guoren-Inspiron-7460>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Return-path:
In-Reply-To: <20180907030447.GA10434@guoren-Inspiron-7460>
Sender: linux-kernel-owner@vger.kernel.org
To: Guo Ren
Cc: linux-arch, Linux Kernel Mailing List, Thomas Gleixner,
 Daniel Lezcano, Jason Cooper, c-sky_gcc_upstream@c-sky.com,
 gnu-csky@mentor.com, Thomas Petazzoni, wbx@uclibc-ng.org, Greentime Hu
List-Id: linux-arch.vger.kernel.org

On Fri, Sep 7, 2018 at 5:04 AM Guo Ren wrote:
>
> On Thu, Sep 06, 2018 at 04:31:16PM +0200, Arnd Bergmann wrote:
> > On Wed, Sep 5, 2018 at 2:08 PM Guo Ren wrote:
> >
> > Can you describe how C-Sky hardware implements MMIO?
> Our mmio is uncachable and strong-order address, so there is no need
> barriers for access these io addr.
>
> #define ioremap_wc ioremap_nocache
> #define ioremap_wt ioremap_nocache
>
> Current ioremap_wc and ioremap_wt implementation are too simple and
> we'll improve it in future.
>
> > In particular:
> >
> > - Is a read from uncached memory always serialized with DMA, and with
> >   other CPUs doing MMIO access to a different address?
> CPU use ld.w to get data from uncached strong order memory.
> Other CPUs use the same mmio vaddr to access the uncachable strong order
> memory paddr.

Ok, but what about the DMA? The most common requirement for
serialization here is with a DMA transfer, where you first write
into a buffer in memory, then write to an MMIO register to trigger
a DMA-load, and then the device reads the data from memory.
Without a barrier before the MMIO, the data may still be in a
store queue of the CPU, and the DMA gets stale data.

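To make the write-side ordering concrete, here is a minimal user-space
sketch. It is not C-Sky or kernel code: dma_buf, doorbell_reg,
mmio_write32(), device_checksum() and start_dma() are hypothetical names
invented for illustration, and the C11 release fence only stands in for
whatever barrier instruction the architecture would really need. The
"device" is simulated by an ordinary function call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins: a DMA buffer in normal memory and a doorbell register. */
static uint8_t dma_buf[16];
static volatile uint32_t doorbell_reg;      /* models an MMIO register */

/* Hypothetical helper: issue the barrier first, then do the MMIO store. */
static void mmio_write32(volatile uint32_t *reg, uint32_t val)
{
    /* Stand-in for the barrier that drains earlier stores to normal memory. */
    atomic_thread_fence(memory_order_release);
    *reg = val;
}

/* Simulated device: once the doorbell is set, it fetches the buffer. */
static uint32_t device_checksum(void)
{
    uint32_t sum = 0;

    if (doorbell_reg)
        for (size_t i = 0; i < sizeof(dma_buf); i++)
            sum += dma_buf[i];
    return sum;
}

static uint32_t start_dma(const uint8_t *data, size_t len)
{
    assert(len <= sizeof(dma_buf));
    memcpy(dma_buf, data, len);         /* 1. fill the buffer in memory */
    mmio_write32(&doorbell_reg, 1);     /* 2. barrier, then trigger the DMA-load */
    return device_checksum();           /* 3. device reads the buffer */
}
```

If step 2 had no barrier and the CPU could reorder the buffer stores
after the doorbell store, the device in step 3 could see the old buffer
contents, which is the stale-data case described above.
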
Similarly, an MMIO read may be used to see if a DMA has completed
and the device register tells you that the DMA has left the device,
but without a barrier, the CPU may have prefetched the DMA data
while waiting for the MMIO read to complete. The __io_ar() barrier()
in asm-generic/io.h prevents the compiler from reordering the two
reads, but if a weakly ordered read (in a coherent DMA buffer) can
bypass a strongly ordered read (MMIO), then it's still broken.

> > - How does endianess work? Are there any buses that flip bytes around
> >   when running big-endian, or do you always do that in software?
> Currently we only support little-endian and soc will follow it.

Ok, that makes it easier. If you think that you won't even need
big-endian support in the long run, you could also remove your
asm/byteorder.h header. If you're not sure, it doesn't hurt to
keep it of course.

       Arnd
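
The read-side case from the message (check a status register over MMIO,
then read the coherent DMA buffer) can be modelled the same way. Again
this is only a user-space sketch under assumptions: status_reg,
dma_result, DMA_DONE, mmio_read32(), device_complete() and
read_dma_result() are hypothetical names, and the C11 acquire fence
stands in for a real CPU read barrier after the MMIO load.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DMA_DONE 0x1u                       /* hypothetical status bit */

static uint32_t dma_result;                 /* models a coherent DMA buffer */
static volatile uint32_t status_reg;        /* models an MMIO status register */

/* Hypothetical helper: do the MMIO load, then a barrier after it. */
static uint32_t mmio_read32(const volatile uint32_t *reg)
{
    uint32_t val = *reg;

    /* Stand-in for the barrier that keeps later loads behind this read. */
    atomic_thread_fence(memory_order_acquire);
    return val;
}

/* Simulated device: writes the data first, then signals completion. */
static void device_complete(uint32_t value)
{
    dma_result = value;
    status_reg = DMA_DONE;
}

/* Returns 0 and stores the data if the DMA is done, -1 otherwise. */
static int read_dma_result(uint32_t *out)
{
    assert(out != NULL);
    if (!(mmio_read32(&status_reg) & DMA_DONE))
        return -1;
    *out = dma_result;      /* must not be fetched before the status read */
    return 0;
}
```

A compiler-only barrier() would keep the two loads in program order in
the generated code, but it would not stop a CPU that can speculate the
buffer load ahead of the MMIO load, which is the concern raised above.
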