From mboxrd@z Thu Jan 1 00:00:00 1970
From: Guo Ren
Subject: Re: [PATCH 09/21] riscv: dma-mapping: skip invalidation before bidirectional DMA
Date: Fri, 5 May 2023 13:47:03 +0800
References: <20230327121317.4081816-1-arnd@kernel.org> <20230327121317.4081816-10-arnd@kernel.org>
In-Reply-To: <20230327121317.4081816-10-arnd@kernel.org>
To: Arnd Bergmann, Arnd Bergmann, Christoph Hellwig
Cc: linux-kernel@vger.kernel.org, Vineet Gupta, Will Deacon, Russell King,
 Neil Armstrong, Linus Walleij, Catalin Marinas, Brian Cain,
 Geert Uytterhoeven, Michal Simek, Thomas Bogendoerfer, Dinh Nguyen,
 Stafford Horne, Helge Deller, Michael Ellerman, Christophe Leroy,
 Paul Walmsley, Palmer Dabbelt, Rich Felker, John Paul Adrian Glaubitz

On Mon, Mar 27, 2023 at 8:15 PM Arnd Bergmann wrote:
>
> From: Arnd Bergmann
>
> For a DMA_BIDIRECTIONAL transfer, the caches have to be cleaned
> first to let the device see data written by the CPU, and invalidated
> after the transfer to let the CPU see data written by the device.
>
> riscv also invalidates the caches before the transfer, which does
> not appear to serve any purpose.
Yes, we can't guarantee that the CPU won't pre-load cache lines at
random while the DMA is in progress.

But I have two reasons to keep the invalidation before the DMA transfer:
 - We clearly tell the CPU these cache lines are invalid. The caching
   algorithm would use these invalid slots first instead of replacing
   valid ones.
 - Invalidating is very cheap. Actually, flush and clean have the same
   performance on our machine.

So, how about:

diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index d919efab6eba..2c52fbc15064 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -22,8 +22,6 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
 		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
 		break;
 	case DMA_FROM_DEVICE:
-		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
-		break;
 	case DMA_BIDIRECTIONAL:
 		ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
 		break;
@@ -42,7 +40,7 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
 		break;
 	case DMA_FROM_DEVICE:
 	case DMA_BIDIRECTIONAL:
 /* I'm not sure all drivers have guaranteed cacheline
    alignment. If not, this inval would cause problems */
-		ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+		ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
 		break;
 	default:
 		break;

>
> Signed-off-by: Arnd Bergmann
> ---
>  arch/riscv/mm/dma-noncoherent.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> index 640f4c496d26..69c80b2155a1 100644
> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c
> @@ -25,7 +25,7 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
>  		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
>  		break;
>  	case DMA_BIDIRECTIONAL:
> -		ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> +		ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
>  		break;
>  	default:
>  		break;
> --
> 2.39.2
>

-- 
Best Regards
 Guo Ren
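
For readers following the thread, the diff proposed in this reply can be summarized as a per-direction table of cache operations. The following is only a userspace sketch of that table, not kernel code: the enum and function names (dma_dir, cmo, proposed_for_device, proposed_for_cpu, cmo_name, print_table) are hypothetical stand-ins, and the DMA_TO_DEVICE rows are assumed from the diff context rather than shown in full in the hunks. Here "clean" means write back dirty lines, "inval" means discard lines, and "flush" means clean followed by inval.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for the kernel's enum dma_data_direction. */
enum dma_dir { DIR_BIDIRECTIONAL, DIR_TO_DEVICE, DIR_FROM_DEVICE };

/* clean = write back, inval = discard, flush = clean + inval. */
enum cmo { CMO_NONE, CMO_CLEAN, CMO_INVAL, CMO_FLUSH };

/* Operation before the transfer (models arch_sync_dma_for_device). */
static enum cmo proposed_for_device(enum dma_dir dir)
{
	switch (dir) {
	case DIR_TO_DEVICE:
		return CMO_CLEAN;	/* assumed from the diff context */
	case DIR_FROM_DEVICE:		/* falls through after the proposed change */
	case DIR_BIDIRECTIONAL:
		return CMO_FLUSH;
	}
	return CMO_NONE;
}

/* Operation after the transfer (models arch_sync_dma_for_cpu). */
static enum cmo proposed_for_cpu(enum dma_dir dir)
{
	switch (dir) {
	case DIR_TO_DEVICE:
		return CMO_NONE;	/* assumed from the diff context */
	case DIR_FROM_DEVICE:
	case DIR_BIDIRECTIONAL:
		return CMO_INVAL;	/* was flush before the proposed change */
	}
	return CMO_NONE;
}

static const char *cmo_name(enum cmo op)
{
	static const char *const names[] = { "none", "clean", "inval", "flush" };

	assert((unsigned)op <= CMO_FLUSH);
	return names[op];
}

/* Print one row per direction: operation for device, operation for CPU. */
static void print_table(void)
{
	static const enum dma_dir dirs[] = {
		DIR_TO_DEVICE, DIR_FROM_DEVICE, DIR_BIDIRECTIONAL
	};
	static const char *const dir_names[] = {
		"TO_DEVICE", "FROM_DEVICE", "BIDIRECTIONAL"
	};

	for (size_t i = 0; i < sizeof(dirs) / sizeof(dirs[0]); i++)
		printf("%-14s for_device=%-6s for_cpu=%s\n", dir_names[i],
		       cmo_name(proposed_for_device(dirs[i])),
		       cmo_name(proposed_for_cpu(dirs[i])));
}
```

Calling print_table() from a main() prints the three rows. The inval in the for_cpu column is the operation the inline comment in the diff worries about: if a buffer is not cacheline-aligned, discarding the line also discards whatever unrelated data shares it.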