From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0F2AEC433EF for ; Fri, 17 Dec 2021 12:00:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:In-Reply-To:MIME-Version:References: Message-ID:Subject:Cc:To:From:Date:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=HOUcAJxWAaZDhs41qegtxFfxJyY8JE3LVgo4inKTGHY=; b=P19/Vet9lm0BpT hIwqsactHZmq3+p7UjIuGEmh+dskPFBRWX40N6X7vgzQTSWzZ+wxlM28jGH14cqYoBq/WTWgQoR8P aCfFwxWqk5FgVXkz8NBgajvGHmE0yjQaMu/8UaQqMtUZTgzQ369DgA3Z+J+fD7G5/PUQDa4kSxWMw GTIrOz+/dDbXDGDGTDyrYPB/tPWWZsAhKd6NBniU9Qm8l1+B95o+rp+oU2F7WDf1tk7tlmAnDVNN2 8WZUADts+vgxqxWPzxY9rVFO8GqkFoASBf0jwjZ+IwvoelUByzZtoNPA06XTGrDjuAQ6hlx8yqc16 pQhMwddMxhGw4x6VnLQg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1myBtV-009zVj-VV; Fri, 17 Dec 2021 11:59:26 +0000 Received: from dfw.source.kernel.org ([139.178.84.217]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1myBtS-009zTg-Le for linux-arm-kernel@lists.infradead.org; Fri, 17 Dec 2021 11:59:24 +0000 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 0992D62159; Fri, 17 Dec 2021 11:59:22 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 19FE9C36AE5; Fri, 17 Dec 2021 11:59:19 +0000 (UTC) Date: Fri, 17 Dec 2021 11:59:16 +0000 From: Catalin Marinas To: Xiongfeng Wang Cc: will@kernel.org, mark.rutland@arm.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, moyufeng@huawei.com, linux-arch@vger.kernel.org Subject: Re: [PATCH] asm-generic: introduce io_stop_wc() and add implementation for ARM64 Message-ID: References: <20211217085611.111999-1-wangxiongfeng2@huawei.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20211217085611.111999-1-wangxiongfeng2@huawei.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20211217_035922_823487_2B2A3B63 X-CRM114-Status: GOOD ( 25.60 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Fri, Dec 17, 2021 at 04:56:11PM +0800, Xiongfeng Wang wrote: > For memory accesses with Normal-Non Cacheable attributes, the CPU may do You may want to mention "arm64 Normal Non-Cacheable" as other architectures have a different meaning of NC. > write combining. But in some situation, this is bad for the performance > because the prior access may wait too long just to be merged. > > We introduce io_stop_wc() to prevent the Normal-NC memory accesses before > this instruction to be merged with memory accesses after this > instruction. > > We add implementation for ARM64 using DGH instruction and provide NOP > implementation for other architectures. > > Signed-off-by: Xiongfeng Wang > --- > Documentation/memory-barriers.txt | 9 +++++++++ > arch/arm64/include/asm/barrier.h | 9 +++++++++ > include/asm-generic/barrier.h | 11 +++++++++++ > 3 files changed, 29 insertions(+) > > diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt > index 7367ada13208..b868b51b1801 100644 > --- a/Documentation/memory-barriers.txt > +++ b/Documentation/memory-barriers.txt > @@ -1950,6 +1950,15 @@ There are some more advanced barrier functions: > For load from persistent memory, existing read memory barriers are sufficient > to ensure read ordering. > > + (*) io_stop_wc(); > + > + For memory accesses with Normal-Non Cacheable attributes (e.g. those > + returned by ioremap_wc()), the CPU may do write combining. But in some > + situation, this is bad for the performance because the prior access may > + wait too long just to be merged. io_stop_wc() can be used to prevent > + merging memory accesses with Normal-Non Cacheable attributes before this > + instruction with any memory accesses appearing after this instruction. I'm fine with the patch in general but the comment here and in asm-generic/barrier.h should avoid Normal Non-Cacheable as that's an arm-specific term. Looking at Documentation/driver-api/device-io.rst, we could simply say "write-combining". Something like: For memory accesses with write-combining attributes (e.g. those returned by ioremap_wc()), the CPU may wait for prior accesses to be merged with subsequent ones. io_stop_wc() can be used to prevent the merging of write-combining memory accesses before this macro with those after it when such wait has performance implications. (feel free to rephrase it but avoid Normal NC here) > + > =============================== > IMPLICIT KERNEL MEMORY BARRIERS > =============================== > diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h > index 1c5a00598458..62217be36217 100644 > --- a/arch/arm64/include/asm/barrier.h > +++ b/arch/arm64/include/asm/barrier.h > @@ -26,6 +26,14 @@ > #define __tsb_csync() asm volatile("hint #18" : : : "memory") > #define csdb() asm volatile("hint #20" : : : "memory") > > +/* > + * Data Gathering Hint: > + * This instruction prevents merging memory accesses with Normal-NC or > + * Device-GRE attributes before the hint instruction with any memory accesses > + * appearing after the hint instruction. > + */ > +#define dgh() asm volatile("hint #6" : : : "memory") This is fine, arm-specific code. > + > #ifdef CONFIG_ARM64_PSEUDO_NMI > #define pmr_sync() \ > do { \ > @@ -46,6 +54,7 @@ > #define dma_rmb() dmb(oshld) > #define dma_wmb() dmb(oshst) > > +#define io_stop_wc() dgh() > > #define tsb_csync() \ > do { \ > diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h > index 640f09479bdf..083be6d34cb9 100644 > --- a/include/asm-generic/barrier.h > +++ b/include/asm-generic/barrier.h > @@ -251,5 +251,16 @@ do { \ > #define pmem_wmb() wmb() > #endif > > +/* > + * ioremap_wc() maps I/O memory as memory with Normal-Non Cacheable attributes. > + * The CPU may do write combining for this kind of memory access. io_stop_wc() > + * prevents merging memory accesses with Normal-Non Cacheable attributes > + * before this instruction with any memory accesses appearing after this > + * instruction. Please change this as well along the lines of my comment above. > + */ > +#ifndef io_stop_wc > +#define io_stop_wc do { } while (0) > +#endif > + > #endif /* !__ASSEMBLY__ */ > #endif /* __ASM_GENERIC_BARRIER_H */ Thanks. -- Catalin _______________________________________________ linux-arm-kernel mailing list linux-arm-kernel@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-arm-kernel