From: Jason Gunthorpe <jgg@nvidia.com>
To: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: David Laight <David.Laight@aculab.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Borislav Petkov <bp@alien8.de>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	Eric Dumazet <edumazet@google.com>,
	Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Heiko Carstens <hca@linux.ibm.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Justin Stitt <justinstitt@google.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Leon Romanovsky <leon@kernel.org>,
	"linux-rdma@vger.kernel.org" <linux-rdma@vger.kernel.org>,
	"linux-s390@vger.kernel.org" <linux-s390@vger.kernel.org>,
	"llvm@lists.linux.dev" <llvm@lists.linux.dev>,
	Ingo Molnar <mingo@redhat.com>, Bill Wendling <morbo@google.com>,
	Nathan Chancellor <nathan@kernel.org>,
	Nick Desaulniers <ndesaulniers@google.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	Paolo Abeni <pabeni@redhat.com>,
	Salil Mehta <salil.mehta@huawei.com>,
	Jijie Shao <shaojijie@huawei.com>,
	Sven Schnelle <svens@linux.ibm.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"x86@kernel.org" <x86@kernel.org>,
	Yisen Zhuang <yisen.zhuang@huawei.com>,
	Arnd Bergmann <arnd@arndb.de>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Leon Romanovsky <leonro@mellanox.com>,
	"linux-arch@vger.kernel.org" <linux-arch@vger.kernel.org>,
	"linux-arm-kernel@lists.infradead.org"
	<linux-arm-kernel@lists.infradead.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Michael Guralnik <michaelgur@mellanox.com>,
	"patches@lists.linux.dev" <patches@lists.linux.dev>,
	Will Deacon <will@kernel.org>
Subject: Re: [PATCH 4/6] arm64/io: Provide a WC friendly __iowriteXX_copy()
Date: Fri, 23 Feb 2024 08:58:52 -0400
Message-ID: <20240223125852.GE13330@nvidia.com>
In-Reply-To: <e78f6e6294c31d889ace4de3a3c3cebad04f4213.camel@linux.ibm.com>

On Fri, Feb 23, 2024 at 12:38:18PM +0100, Niklas Schnelle wrote:
> > Although I doubt that generating long TLP from byte writes is
> > really necessary.
> 
> I might have gotten confused but I think these are not byte writes.
> Remember that the count is in terms of the number of bits sized
> quantities to copy so "count == 1" is 4/8 bytes here.

Right.
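
To make the count semantics concrete, here is a rough userspace sketch, not
the kernel implementation. The model_* names are made up for illustration,
a plain array stands in for the MMIO region, and a volatile store stands in
for the raw register write:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace model only (hypothetical names). "count" is the number of
 * 32-bit quantities to copy, not a byte count, so count == 1 moves 4 bytes.
 */
static void model_iowrite32_copy(volatile uint32_t *to, const uint32_t *from,
                                 size_t count)
{
    assert(to != NULL && from != NULL);
    for (size_t i = 0; i < count; i++)
        to[i] = from[i];        /* one 32-bit store per element */
}

/* Same idea for 64-bit quantities: count == 1 moves 8 bytes. */
static void model_iowrite64_copy(volatile uint64_t *to, const uint64_t *from,
                                 size_t count)
{
    assert(to != NULL && from != NULL);
    for (size_t i = 0; i < count; i++)
        to[i] = from[i];        /* one 64-bit store per element */
}

/* Bytes moved for a given count, to make the unit explicit. */
static size_t model_bytes32(size_t count) { return count * sizeof(uint32_t); }
static size_t model_bytes64(size_t count) { return count * sizeof(uint64_t); }
```

So model_bytes32(1) is 4 and model_bytes64(1) is 8, which is the "4/8 bytes"
above.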

There seem to be two callers of this API in the kernel. One calls it
with a constant size and wants a large TLP.

The other seems to want memcpy_toio() with a guaranteed 32/64 bit store.
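
Roughly, the two caller shapes look like the userspace sketch below. This is
only a model: the names, the 64-byte block size, and the store counter are
all assumptions for illustration and are not taken from either driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical log of what the "device" side would see. */
struct store_log {
    size_t stores;  /* number of 64-bit stores issued */
    size_t bytes;   /* total bytes stored */
};

/* Model of a 64-bit copy loop: exactly one full-width store per element. */
static void model_copy64(struct store_log *log, volatile uint64_t *to,
                         const uint64_t *from, size_t count)
{
    assert(log != NULL && to != NULL && from != NULL);
    for (size_t i = 0; i < count; i++) {
        to[i] = from[i];
        log->stores++;
        log->bytes += sizeof(uint64_t);
    }
}

/* Assumed block size for the constant-size caller. */
#define MODEL_BLOCK_BYTES 64

/*
 * Shape 1: constant size. The count is known at compile time (8 here), so
 * with a write-combining mapping the stores can merge into one large TLP.
 */
static void push_fixed_block(struct store_log *log, volatile uint64_t *to,
                             const uint64_t *blk)
{
    model_copy64(log, to, blk, MODEL_BLOCK_BYTES / sizeof(uint64_t));
}

/*
 * Shape 2: runtime length. The caller mainly wants each store to be a full
 * 64 bits, so a length that is not a multiple of 8 bytes is rejected.
 */
static int push_variable(struct store_log *log, volatile uint64_t *to,
                         const uint64_t *src, size_t len_bytes)
{
    if (len_bytes % sizeof(uint64_t))
        return -1;
    model_copy64(log, to, src, len_bytes / sizeof(uint64_t));
    return 0;
}
```

In the first shape the size is what matters; in the second the width of each
individual store is what matters.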

> > IIRC you were merging at most 4 writes.
> > So better to do a single 32bit write instead.
> > (Unless you have misaligned source data - unlikely.)
> > 
> > While write-combining to generate long TLP is probably mostly
> > safe for PCIe targets, there are some that will only handle
> > TLP for single 32bit data items.
> > Which might be why the code is explicitly requesting 4 byte copies.
> > So it may be entirely wrong to write-combine anything except
> > the generic memcpy_toio().
> 
> On anything other than s390x this should only do write-combine if the
> memory mapping allows it, no? Meaning a driver that can't handle larger
> TLPs really shouldn't use ioremap_wc() then.

Right.

> On s390x one could argue that our version of __iowriteXX_copy() is
> strictly speaking not correct in that zpci_memcpy_toio() doesn't really
> use XX bit writes which is why for us memcpy_toio() was actually a
> better fit indeed. On the other hand doing 32 bit PCI stores (an s390x
> thing) can't combine multiple stores into a single TLP which these
> functions are used for and which has much more use cases than forcing a
> copy loop with 32/64 bit sized writes which would also be a lot slower
> on s390x than an aligned zpci_memcpy_toio().

mlx5 will definitely not work right if __iowrite64_copy() results in
anything smaller than 32/64 bit PCIe TLPs.

Jason
