From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id F2AC622069;
 Wed, 6 Dec 2023 11:09:23 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5016FC433C7;
 Wed, 6 Dec 2023 11:09:21 +0000 (UTC)
Date: Wed, 6 Dec 2023 11:09:18 +0000
From: Catalin Marinas
To: Jason Gunthorpe
Cc: Niklas Schnelle, Mark Rutland, Leon Romanovsky, Arnd Bergmann,
 linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rdma@vger.kernel.org, llvm@lists.linux.dev, Michael Guralnik,
 Nathan Chancellor, Nick Desaulniers, Will Deacon
Subject: Re: [PATCH rdma-next 1/2] arm64/io: add memcpy_toio_64
Message-ID:
References: <20231124122352.GB436702@nvidia.com>
 <20231127134505.GI436702@nvidia.com>
 <20231204182330.GK1493156@nvidia.com>
 <20231205175127.GJ2692119@nvidia.com>
 <20231205195130.GM2692119@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-arch@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20231205195130.GM2692119@nvidia.com>

On Tue, Dec 05, 2023 at 03:51:30PM -0400, Jason Gunthorpe wrote:
> On Tue, Dec 05, 2023 at 07:34:45PM +0000, Catalin Marinas wrote:
> > > 2) You want to #define __iowrite512_copy() to memcpy_toio() on ARM and
> > >    implement some quad STP optimization for this case?
> >
> > We can have the generic __iowrite512_copy() do memcpy_toio() and have
> > the arm64 implement an optimised version.
> >
> > What I'm not entirely sure of is the DGH (whatever the io_* barrier name
> > is). I'd put it in the same __iowrite512_copy() function and remove it
> > from the driver code. Otherwise when ST64B is added, we have an
> > unnecessary DGH in the driver. If this does not match the other
> > __iowrite*_copy() semantics, we can come up with another name. But start
> > with this for now and document the function.
>
> I think the iowrite is only used for WC and the DGH is functionally
> harmless for non-WC, so it makes sense.
>
> In this case we should just remove the DGH macro from the generic
> architecture code and tell people to use iowrite - since we now
> understand that callers basically have to in order to use DGH on new
> ARM CPUs.

That works for me but what would the semantics be for __iowrite64_copy()
for example? Is there a DGH at the end of the whole write or after each
iteration? I'd go with the former since e.g. hns3_tx_push_bd() does that
(and doesn't seem to be a 64 byte copy). Similarly for
__iowrite512_copy(), if you want the DGH after each iteration you should
only pass a count of 1.

-- 
Catalin
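
To make the two options concrete, here is a rough userspace model of the
semantics discussed above: one DGH after the whole copy, with callers that
want a DGH per 64-byte block passing a count of 1. This is only a sketch
and not kernel code. All the model_* names and the dgh_count counter are
hypothetical, the DGH is replaced by a counter increment, and the count
unit for __iowrite512_copy() (64-byte blocks) is an assumption taken from
the name proposed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only: counts how often the DGH hint would be issued. */
static unsigned int dgh_count;

/* Hypothetical stand-in for the arm64 DGH hint; here it only counts. */
static void model_dgh(void)
{
	dgh_count++;
}

/*
 * Model of __iowrite64_copy() with the semantics suggested above:
 * copy `count` 64-bit words, then issue a single DGH for the whole write.
 */
static void model_iowrite64_copy(volatile uint64_t *to, const uint64_t *from,
				 size_t count)
{
	assert(to && from);
	for (size_t i = 0; i < count; i++)
		to[i] = from[i];
	model_dgh();
}

/*
 * Model of the proposed __iowrite512_copy(): `count` is assumed to be in
 * units of 64-byte (512-bit) blocks. Again one DGH at the very end.
 */
static void model_iowrite512_copy(volatile uint64_t *to, const uint64_t *from,
				  size_t count)
{
	assert(to && from);
	for (size_t i = 0; i < count * 8; i++)
		to[i] = from[i];
	model_dgh();
}

/*
 * A caller that wants a DGH after every 64-byte block calls the helper
 * once per block with a count of 1.
 */
static void model_push_blocks_flush_each(volatile uint64_t *to,
					 const uint64_t *from, size_t blocks)
{
	for (size_t b = 0; b < blocks; b++)
		model_iowrite512_copy(to + b * 8, from + b * 8, 1);
}
```

With these semantics the DGH stays inside the copy helper rather than in
the driver, so an ST64B-based implementation could drop it without any
driver change.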