From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 26 Jan 2024 14:56:31 +0000
From: Catalin Marinas
To: Jason Gunthorpe
Cc: Mark Rutland, Niklas Schnelle, Leon Romanovsky, Arnd Bergmann,
 linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-rdma@vger.kernel.org, llvm@lists.linux.dev, Michael Guralnik,
 Nathan Chancellor, Nick Desaulniers, Will Deacon, Marc Zyngier
Subject: Re: [PATCH rdma-next 1/2] arm64/io: add memcpy_toio_64
References: <20240117123618.GD734935@nvidia.com>
 <20240124132719.GF1455070@nvidia.com>
 <20240124192634.GJ1455070@nvidia.com>
 <20240125174333.GA2192844@nvidia.com>
In-Reply-To: <20240125174333.GA2192844@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-arch@vger.kernel.org
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, Jan 25, 2024 at 01:43:33PM -0400, Jason Gunthorpe wrote:
> On Wed, Jan 24, 2024 at 03:26:34PM -0400, Jason Gunthorpe wrote:
> > The suggestion that it should not have any interleaving instructions
> > and use STP came from our CPU architecture team.
>
> I got some more details here.
>
> They point to the ARM publication about write combining
>
> https://community.arm.com/cfs-file/__key/telligent-evolution-components-attachments/13-150-00-00-00-00-10-12/Understanding_5F00_Write_5F00_Combining_5F00_on_5F00_Arm_5F00_V.1.0.pdf
>
> specifically to the example code using 4x 128 bit NEON stores.

That's an example but this document doesn't make any statements about
64-bit writes.

> They point at the actual CPU design and say it is optimized for 128
> bit stores (STP and ST4 included, it seems).
>
> 64 bit stores trigger some different behavior.

This is highly microarchitecture specific. The best bet in the future is
the ST64B instruction but in the meantime it's pretty much guessing.

> I have no way to know if it will be OK for other drivers that expect
> this to be a performance path in the kernel.
>
> Are you *sure* you want to do this str version? If it works for mlx5 I
> will send the patch and the other companies can come later with
> performance data.

Yeah, I'd stick to the STR for now, it makes things simpler as we don't
have to care about what emulation does.

-- 
Catalin
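
For readers following the STR versus STP point, here is a minimal portable C sketch of the "STR shape" under discussion: one 64-byte block written as eight separate 64-bit stores. It is an illustration only and not the patch from this thread. The name copy64_str_style is hypothetical, the alignment requirement is an assumption of the sketch, and a real arm64 implementation would more likely use inline assembly so that the compiler cannot choose different instructions or interleave others between the stores.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's memcpy_toio_64: copy one 64-byte
 * block as eight separate 64-bit stores (the "STR" shape), as opposed to
 * four 128-bit stores (the "STP" shape).
 *
 * The volatile qualifier asks the compiler to perform each store as its
 * own access. It does not control which instruction is selected or what
 * gets scheduled between the stores, and whether the CPU combines them
 * into one 64-byte write is microarchitecture specific.
 */
static void copy64_str_style(volatile uint64_t *dst, const uint64_t *src)
{
	/* Assumption of this sketch: both pointers are 8-byte aligned. */
	assert(((uintptr_t)dst & 7) == 0);
	assert(((uintptr_t)src & 7) == 0);

	for (size_t i = 0; i < 8; i++)
		dst[i] = src[i];
}
```

A caller would pass a pointer to the 64-byte destination window and a pointer to eight 64-bit source words. In kernel code the destination would be an __iomem mapping written through the I/O accessors rather than a plain pointer.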