From: Jason Gunthorpe <jgg@nvidia.com>
To: David Ahern <dsahern@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>,
Saeed Mahameed <saeed@kernel.org>, Arnd Bergmann <arnd@arndb.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Leon Romanovsky <leonro@nvidia.com>, Jiri Pirko <jiri@nvidia.com>,
Leonid Bloch <lbloch@nvidia.com>,
Itay Avraham <itayavr@nvidia.com>,
Saeed Mahameed <saeedm@nvidia.com>,
Aron Silverton <aron.silverton@oracle.com>,
Christoph Hellwig <hch@infradead.org>,
andrew.gospodarek@broadcom.com, linux-kernel@vger.kernel.org,
netdev@vger.kernel.org
Subject: Re: [PATCH V4 0/5] mlx5 ConnectX control misc driver
Date: Fri, 9 Feb 2024 21:01:29 -0400
Message-ID: <20240210010129.GA1010957@nvidia.com>
In-Reply-To: <2bdc5510-801a-4601-87a3-56eb941d661a@kernel.org>

On Fri, Feb 09, 2024 at 03:42:16PM -0700, David Ahern wrote:
> On 2/8/24 7:15 PM, Jakub Kicinski wrote:
> >>> Ah yes, the high frequency counters. Something that is definitely
> >>> impossible to implement in a generic way. You were literally in the
> >>> room at netconf when David Ahern described his proposal for this.
>
> The key point of that proposal is host memory mapped to userspace where
> H/W counters land (either via direct DMA by a H/W push or a
> kthread/timer pulling in updates). That is similar to what is proposed here.

The counter experiment that inspired Saeed to write about it here was
done using mlx5ctl interfaces and some other POC stuff on an RDMA
network, monitoring RDMA workloads and inspecting RDMA objects.

So if your proposal also considers how to select RDMA object counters,
control the detailed sampling hardware with RDMA stuff, and works
on a netdev-free InfiniBand network, then it might be interesting.

It was actually interesting research; I hope some information will be
made public.

> BTW, there is already a broadcom driver under drivers/misc that seems to
> have a lot of overlap capability wise to this driver. Perhaps a Broadcom
> person could chime in.

Yeah, there are lots of examples of drivers that use this kind of FW API
direct to userspace. It is a common design pattern across the kernel
in many subsystems. At the core it is following the general philosophy
of pushing things to userspace that don't need to be in the kernel. It
is more secure, more hackable and easier to deploy.

It becomes a userspace decision what kind of tooling community will
develop and what the ultimate user experience will be.

> > Why don't you repost it to netdev and see how many acks you get?
> > I'm not the only netdev maintainer.
>
> I'll go out on that limb and say I would have no problem ACK'ing the
> driver. It's been proven time and time again that these kinds of
> debugging facilities are needed for these kinds of complex,
> multifunction devices.

Agree as well. Ack for RDMA community. This is perfectly consistent
with the subsystem's existing design of directly exposing the device
to userspace. It is essential as we can't piggyback on any "generic"
netdev stuff on InfiniBand HW. Further, I anticipate most of the
mlx5ctl users would actually be running primarily RDMA related
workloads anyhow.

There are not that many people who buy these expensive cards and don't
use them to their full capability.

Recently at USENIX Microsoft shared some details of their production
network in the paper "Empowering Azure Storage with RDMA".

Notably they shared that "Traffic statistics of all Azure public
regions between January 18 and February 16, 2023. Traffic was measured
by collecting switch counters of server-facing ports on all Top of
Rack (ToR) switches. Around 70% of traffic was RDMA."

It is a rare public insight into what is going on in the industry at
large, and why RDMA is a significant and important subsystem.

Jason