From: Leon Romanovsky
To: Jason Gunthorpe
Cc: Arnd Bergmann, Catalin Marinas, linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-rdma@vger.kernel.org, llvm@lists.linux.dev, Michael Guralnik, Nathan Chancellor, Nick Desaulniers, Will Deacon
Subject: [PATCH rdma-next 2/2] IB/mlx5: Use memcpy_toio_64() for write combining stores
Date: Thu, 23 Nov 2023 21:04:32 +0200
Message-ID: <744fdfcd61fa8efa6da8ed432883b5f016c3a86f.1700766072.git.leon@kernel.org>

From: Jason Gunthorpe

mlx5 has a built-in self-test at driver startup to evaluate whether the platform supports write combining to generate a 64 byte PCIe TLP. This has proven necessary because a lot of common scenarios end up with broken write combining (especially inside virtual machines) and there is no other way to learn this information.
This self-test has been consistently failing on new ARM64 CPU designs (specifically with NVIDIA Grace's implementation of Neoverse V2). The C loop around writel() generates some pretty terrible ARM64 assembly, but historically this has worked on a lot of existing ARM64 CPUs until now.

We see it succeed about 1 time in 10,000 on the worst affected systems. The CPU architects speculate that the load instructions interspersed with the stores make it very unreliable.

Change this to use memcpy_toio_64(), which provides a block of 4 STP instructions on ARM64, and the same writel loop on everything else.

Fixes: 11f552e21755 ("IB/mlx5: Test write combining support")
Signed-off-by: Jason Gunthorpe
Signed-off-by: Leon Romanovsky
---
 drivers/infiniband/hw/mlx5/mem.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/mem.c b/drivers/infiniband/hw/mlx5/mem.c
index 96ffbbaf0a73..26b5590d2164 100644
--- a/drivers/infiniband/hw/mlx5/mem.c
+++ b/drivers/infiniband/hw/mlx5/mem.c
@@ -108,7 +108,6 @@ static int post_send_nop(struct mlx5_ib_dev *dev, struct ib_qp *ibqp, u64 wr_id,
 	__be32 mmio_wqe[16] = {};
 	unsigned long flags;
 	unsigned int idx;
-	int i;
 
 	if (unlikely(dev->mdev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR))
 		return -EIO;
@@ -148,9 +147,7 @@ static int post_send_nop(struct mlx5_ib_dev *dev, struct ib_qp *ibqp, u64 wr_id,
 	 * we hit doorbell
 	 */
 	wmb();
-	for (i = 0; i < 8; i++)
-		mlx5_write64(&mmio_wqe[i * 2],
-			     bf->bfreg->map + bf->offset + i * 8);
+	memcpy_toio_64(bf->bfreg->map + bf->offset, mmio_wqe);
 	io_stop_wc();
 
 	bf->offset ^= bf->buf_size;
-- 
2.42.0
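For readers outside the kernel tree, the difference between the two store patterns discussed in this patch can be modeled in plain user-space C. This is only a sketch under stated assumptions: store64(), copy64_loop() and copy64_block() are hypothetical names invented for illustration, ordinary memory stands in for the write-combining BAR, and nothing here reproduces the ARM64 STP sequence that the commit message attributes to memcpy_toio_64().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one 64-bit MMIO store (not a kernel API). */
static inline void store64(volatile uint64_t *dst, uint64_t val)
{
	*dst = val;
}

/*
 * Pattern being removed: eight stores driven by a C loop. The compiler may
 * emit a load from src between consecutive stores to dst.
 */
static void copy64_loop(volatile uint64_t *dst, const uint64_t *src)
{
	assert(dst != NULL && src != NULL);
	for (size_t i = 0; i < 8; i++)
		store64(&dst[i], src[i]);
}

/*
 * Pattern being introduced, modeled in portable C: read all eight values
 * first, then write the eight stores out back to back. A C compiler is
 * still free to schedule the loads differently, so this only expresses the
 * intent; it does not guarantee the instruction sequence.
 */
static void copy64_block(volatile uint64_t *dst, const uint64_t *src)
{
	assert(dst != NULL && src != NULL);
	const uint64_t v0 = src[0], v1 = src[1], v2 = src[2], v3 = src[3];
	const uint64_t v4 = src[4], v5 = src[5], v6 = src[6], v7 = src[7];

	dst[0] = v0;
	dst[1] = v1;
	dst[2] = v2;
	dst[3] = v3;
	dst[4] = v4;
	dst[5] = v5;
	dst[6] = v6;
	dst[7] = v7;
}
```

Both helpers copy the same 64 bytes, so a caller holding an eight-element uint64_t source array can swap copy64_loop(dst, src) for copy64_block(dst, src) without changing the data written. Only the shape of the store sequence differs, which is the property the commit message is concerned with.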