netdev.vger.kernel.org archive mirror
From: Catalin Marinas <catalin.marinas@arm.com>
To: Jason Gunthorpe <jgg@nvidia.com>
Cc: Will Deacon <will@kernel.org>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Justin Stitt <justinstitt@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Leon Romanovsky <leon@kernel.org>,
	linux-rdma@vger.kernel.org, linux-s390@vger.kernel.org,
	llvm@lists.linux.dev, Ingo Molnar <mingo@redhat.com>,
	Bill Wendling <morbo@google.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	netdev@vger.kernel.org, Paolo Abeni <pabeni@redhat.com>,
	Salil Mehta <salil.mehta@huawei.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	x86@kernel.org, Yisen Zhuang <yisen.zhuang@huawei.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Leon Romanovsky <leonro@mellanox.com>,
	linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Mark Rutland <mark.rutland@arm.com>,
	Michael Guralnik <michaelgur@mellanox.com>,
	patches@lists.linux.dev, Niklas Schnelle <schnelle@linux.ibm.com>,
	Jijie Shao <shaojijie@huawei.com>
Subject: Re: [PATCH v3 6/6] IB/mlx5: Use __iowrite64_copy() for write combining stores
Date: Fri, 18 Jul 2025 19:10:06 +0100
Message-ID: <aHqN_hpJl84T1Usi@arm.com>
In-Reply-To: <20250715115200.GJ2067380@nvidia.com>

On Tue, Jul 15, 2025 at 08:52:00AM -0300, Jason Gunthorpe wrote:
> On Tue, Jul 15, 2025 at 11:15:25AM +0100, Will Deacon wrote:
> > > Since STP was rejected already we've only tested the Neon version. It
> > > does make a huge improvement, but it still somehow fails to combine
> > > on rare occasions. The CPU is really bad at this :(
> > 
> > I think the thread was from last year so I've forgotten most of the
> > details, but wasn't STP rejected because it wasn't virtualisable? 
> 
> Yes, that was the claim.
> 
> > In which case, doesn't NEON suffer from exactly the same (or possibly
> > worse) problem?
> 
> In general yes, in specific no.

For a generic iowrite function, I wouldn't use STP or Neon since it may
end up being used on emulated MMIO.

BTW, for Neon, don't you need kernel_neon_begin/end()? That may have its
own overhead and can also hit a BUG_ON in contexts where kernel-mode
Neon is not allowed. Again, not suitable for a generic function.

Unfortunately, there's no way to know what kind of MMIO this function is
being called on. We might try to infer bare metal from the kernel having
started at EL2, but even that is not entirely correct with nested virt.
Or the OS may start at EL1 but have direct access to mlx5, where we'd
want the faster option.

> mlx5 (and other RDMA devices) have long used Neon for MMIO in
> userspace, so any VMM assigning mlx5 devices simply must make this
> work - it is already not optional. So we know that all VMs out there
> with mlx5 support neon for mlx5, and it is safe for mlx5 to use.

I can't think of any generic solution here; it may have to be a hack
specific to mlx5. We can also add support for ST64B and have some
condition on system_supports_st64b() for future systems.
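
As a rough illustration of that kind of condition, here is a minimal
userspace sketch in plain C, not kernel code. All names are hypothetical,
the capability flag stands in for something like system_supports_st64b(),
and the 64-byte path is only modelled as eight 64-bit stores per block
(a real implementation would issue one ST64B instruction to a suitably
aligned destination).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy routine type; count is in 64-bit words. */
typedef void (*wc_copy_fn)(volatile uint64_t *to, const uint64_t *from,
			   size_t count);

/* Baseline: one plain 64-bit store per word. */
static void copy_64bit_stores(volatile uint64_t *to, const uint64_t *from,
			      size_t count)
{
	for (size_t i = 0; i < count; i++)
		to[i] = from[i];
}

/*
 * Stand-in for a 64-byte atomic store path. Each full 64-byte block is
 * modelled as eight 64-bit stores; any tail falls back to single stores.
 */
static void copy_64byte_blocks(volatile uint64_t *to, const uint64_t *from,
			       size_t count)
{
	size_t i = 0;

	for (; i + 8 <= count; i += 8)
		for (size_t j = 0; j < 8; j++)
			to[i + j] = from[i + j];
	for (; i < count; i++)
		to[i] = from[i];
}

/* Pick the routine once, based on an assumed capability flag. */
static wc_copy_fn select_wc_copy(bool has_st64b)
{
	return has_st64b ? copy_64byte_blocks : copy_64bit_stores;
}
```

The point of the sketch is only that the choice is made on a CPU
capability, with the plain 64-bit loop kept as the fallback; it says
nothing about whether the MMIO behind the pointer is emulated.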

Even if we could handle virtualisation, I wonder whether
__iowrite64_copy() is the right function to implement 128-bit stores or
the larger 64-byte atomic stores. At least the comment for the generic
function suggests that it writes in 64-bit quantities. Some MMIO may
only handle such writes. A function like memcpy_toio() is more generic:
it doesn't imply any restrictions on the size of the writes (though I
think it guarantees natural alignment for the stores).
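
To make the "writes in 64-bit quantities" contract concrete, a minimal
userspace model in plain C follows. It is a sketch, not the kernel
implementation: sketch_write64_copy() is a hypothetical name, and a
volatile pointer stands in for an MMIO mapping. As with
__iowrite64_copy(), count is in units of 64-bit words, not bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical model of a copy that only ever issues 64-bit stores to
 * the destination. A device that accepts nothing but 64-bit writes can
 * rely on this; it could not rely on a copy that is free to pick wider
 * (128-bit or 64-byte) stores.
 */
static void sketch_write64_copy(volatile uint64_t *to, const void *from,
				size_t count)
{
	const unsigned char *src = from;

	for (size_t i = 0; i < count; i++) {
		uint64_t v;

		/* Load via memcpy so the source need not be 8-byte aligned. */
		memcpy(&v, src + i * sizeof(v), sizeof(v));
		to[i] = v;	/* exactly one 64-bit store per word */
	}
}
```

A caller copying a 64-byte work queue entry would pass count = 8. A
memcpy_toio()-style interface, by contrast, takes a byte count and makes
no promise about the width of the individual stores.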

-- 
Catalin

Thread overview (17+ messages):
2024-04-11 16:46 [PATCH v3 0/6] Fix mlx5 write combining support on new ARM64 cores Jason Gunthorpe
2024-04-11 16:46 ` [PATCH v3 1/6] x86: Stop using weak symbols for __iowrite32_copy() Jason Gunthorpe
2024-04-11 20:24   ` Arnd Bergmann
2024-04-11 16:46 ` [PATCH v3 2/6] s390: Implement __iowrite32_copy() Jason Gunthorpe
2024-04-11 16:46 ` [PATCH v3 3/6] s390: Stop using weak symbols for __iowrite64_copy() Jason Gunthorpe
2024-04-11 20:23   ` Arnd Bergmann
2024-04-11 16:46 ` [PATCH v3 4/6] arm64/io: Provide a WC friendly __iowriteXX_copy() Jason Gunthorpe
2024-04-11 16:46 ` [PATCH v3 5/6] net: hns3: Remove io_stop_wc() calls after __iowrite64_copy() Jason Gunthorpe
2024-04-11 16:46 ` [PATCH v3 6/6] IB/mlx5: Use __iowrite64_copy() for write combining stores Jason Gunthorpe
2024-04-16  8:29   ` Leon Romanovsky
2025-07-14 21:55   ` Jason Gunthorpe
2025-07-15  5:57     ` Leon Romanovsky
2025-07-15 10:15     ` Will Deacon
2025-07-15 11:52       ` Jason Gunthorpe
2025-07-18 18:10         ` Catalin Marinas [this message]
2025-07-18 20:00           ` Jason Gunthorpe
2024-04-23  0:18 ` [PATCH v3 0/6] Fix mlx5 write combining support on new ARM64 cores Jason Gunthorpe
