Date: Mon, 31 Jul 2023 23:43:52 +0800
From: Jisheng Zhang
To: Arnd Bergmann
Cc: Guo Ren, Heiko Stübner, Emil Renner Berthing, "Lad, Prabhakar",
	Conor Dooley, Geert Uytterhoeven, Andrew Jones, Paul Walmsley,
	Palmer Dabbelt, Albert Ou, Samuel Holland, Christoph Hellwig,
	Rob Herring, Krzysztof Kozlowski, Biju Das,
	linux-riscv@lists.infradead.org, devicetree@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-renesas-soc@vger.kernel.org
Subject: Re: [PATCH v10 3/6] riscv: mm: dma-noncoherent: nonstandard cache operations support
References: <20230702203429.237615-1-prabhakar.mahadev-lad.rj@bp.renesas.com>
 <20230702203429.237615-4-prabhakar.mahadev-lad.rj@bp.renesas.com>
 <92c00ddb-e956-4861-af80-5f5558c9a8f5@app.fastmail.com>
 <8b3466e4-a295-4249-bd05-2edbf7b3f6e3@app.fastmail.com>
In-Reply-To: <8b3466e4-a295-4249-bd05-2edbf7b3f6e3@app.fastmail.com>

On Mon, Jul 31, 2023 at 07:39:30AM +0200, Arnd Bergmann wrote:
> On Mon, Jul 31, 2023, at 02:49, Guo Ren wrote:
> > On Mon, Jul 31, 2023 at 4:36 AM Arnd Bergmann wrote:
> >>
> >> On Sun, Jul 30, 2023, at 17:42, Emil Renner Berthing wrote:
> >> > On Sun, 30 Jul 2023 at 17:11, Jisheng Zhang wrote:
> >>
> >> >> > +
> >> >> > static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
> >> >> > {
> >> >> >         void *vaddr = phys_to_virt(paddr);
> >> >> >
> >> >> > +#ifdef CONFIG_RISCV_NONSTANDARD_CACHE_OPS
> >> >> > +       if (unlikely(noncoherent_cache_ops.wback)) {
> >> >>
> >> >> I'm worried about the performance impact here. To keep a unified
> >> >> kernel Image, RISCV_NONSTANDARD_CACHE_OPS will be enabled by default,
> >> >> so performance on platforms with the standard CMO and with T-HEAD's
> >> >> CMO will be impacted: even with an unlikely() here, the check still
> >> >> has to be done on every call.
> >> >
> >> > On IRC I asked why not use a static key so the overhead is just a
> >> > single nop when the standard CMO ops are available, but the consensus
> >> > seemed to be that the flushing would completely dominate this branch.
> >> > And on platforms with the standard CMO ops the branch would be
> >> > correctly predicted anyway.
> >>
> >> Not just the flushing, but also loading back the invalidated
> >> cache lines afterwards is just very expensive. I don't think
> >> you would be able to measure a difference between the static

I read this as: the cache clean/inv is so expensive that the percentage
saved by a static key is trivial. Is this understanding right? This could
be measured by writing a small benchmark kernel module which just calls
cache clean/inv on a buffer (for example 1500 bytes) in a loop.

> >> key and a correctly predicted branch on any relevant usecase here.
> > Maybe we should move CMO & THEAD ops to the noncoherent_cache_ops, and
> > only keep one of them.
> >
> > I prefer noncoherent_cache_ops, it's more maintainable than ALTERNATIVE.
>
> I think moving the THEAD ops to the same level as all nonstandard
> operations makes sense, but I'd still leave CMO as an explicit
> fast path that avoids the indirect branch. This seems like the right
> thing to do both for readability and for platforms on which the
> indirect branch has a noticeable overhead.
>
>      Arnd