From: Jason Gunthorpe <jgg@nvidia.com>
To: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>,
dri-devel <dri-devel@lists.freedesktop.org>,
Jiho Chu <jiho.chu@samsung.com>,
Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
Subject: Re: New subsystem for acceleration devices
Date: Tue, 9 Aug 2022 09:43:41 -0300 [thread overview]
Message-ID: <YvJWfS5h2SeWGAEM@nvidia.com> (raw)
In-Reply-To: <CAFCwf12MFVmBOEMw37Cdh4O3n13LosR4yDi007eH9BhF3kRC4w@mail.gmail.com>
On Mon, Aug 08, 2022 at 11:26:11PM +0300, Oded Gabbay wrote:
> So if you want a common uAPI and a common userspace library to use it,
> you need to expose the same device character files for every device,
> regardless of the driver. e.g. you need all devices to be called
> /dev/accelX and not /dev/habanaX or /dev/nvidiaX
So, this is an interesting idea. One of the things we did in RDMA that
turned our very well is to have the user side of the kernel/user API
in a single git repo for all the drivers, including the lowest layer
of the driver-specific APIs.
It gives a reasonable target for a DRM-like test of "you must have a
userspace". Ie send your userspace and userspace documentation/tests
before your kernel side can be merged.
Even if it is just a git repo collecting and curating driver-specific
libraries under the "accel" banner it could be quite a useful
activity.
But, probably this boils down to things that look like:
device = habana_open_device()
habana_mooo(device)
device = nvidia_open_device()
nvidia_baaa(device)
> That's what I mean by abstracting all this kernel API from the
> drivers. Not because it is an API that is hard to use, but because the
> drivers should *not* use it at all.
>
> I think drm did that pretty well. Their code defines objects for
> driver, device and minors, with resource manager that will take care
> of releasing the objects automatically (it is based on devres.c).
We have lots of examples of subsystems doing this - the main thing
unique about accel is that that there is really no shared uAPI between
the drivers, and not 'abstraction' provided by the kernel. Maybe that
is the point..
> So actually I do want an ioctl but as you said, not for the main
> device char, but to an accompanied control device char.
There is a general problem across all these "thick" devices in the
kernel to support their RAS & configuration requirements and IMHO we
don't have a good answer at all.
We've been talking on and off here about having some kind of
subsystem/methodology specifically for this area - how to monitor,
configure, service, etc a very complicated off-CPU device. I think
there would be a lot of interest in this and maybe it shouldn't be
coupled to this accel idea.
Eg we already have some established mechinisms - I would expect any
accel device to be able to introspect and upgrade its flash FW using
the 'devlink flash' common API.
> an application only has access to the information ioctl through this
> device char (so it can't submit anything, allocate memory, etc.) and
> can only retrieve metrics which do not leak information about the
> compute application.
This is often being done over a netlink socket as the "second char"
Jason
WARNING: multiple messages have this Message-ID (diff)
From: Jason Gunthorpe <jgg@nvidia.com>
To: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Dave Airlie <airlied@gmail.com>,
dri-devel <dri-devel@lists.freedesktop.org>,
Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>,
Jiho Chu <jiho.chu@samsung.com>, Arnd Bergmann <arnd@arndb.de>,
"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>
Subject: Re: New subsystem for acceleration devices
Date: Tue, 9 Aug 2022 09:43:41 -0300 [thread overview]
Message-ID: <YvJWfS5h2SeWGAEM@nvidia.com> (raw)
In-Reply-To: <CAFCwf12MFVmBOEMw37Cdh4O3n13LosR4yDi007eH9BhF3kRC4w@mail.gmail.com>
On Mon, Aug 08, 2022 at 11:26:11PM +0300, Oded Gabbay wrote:
> So if you want a common uAPI and a common userspace library to use it,
> you need to expose the same device character files for every device,
> regardless of the driver. e.g. you need all devices to be called
> /dev/accelX and not /dev/habanaX or /dev/nvidiaX
So, this is an interesting idea. One of the things we did in RDMA that
turned our very well is to have the user side of the kernel/user API
in a single git repo for all the drivers, including the lowest layer
of the driver-specific APIs.
It gives a reasonable target for a DRM-like test of "you must have a
userspace". Ie send your userspace and userspace documentation/tests
before your kernel side can be merged.
Even if it is just a git repo collecting and curating driver-specific
libraries under the "accel" banner it could be quite a useful
activity.
But, probably this boils down to things that look like:
device = habana_open_device()
habana_mooo(device)
device = nvidia_open_device()
nvidia_baaa(device)
> That's what I mean by abstracting all this kernel API from the
> drivers. Not because it is an API that is hard to use, but because the
> drivers should *not* use it at all.
>
> I think drm did that pretty well. Their code defines objects for
> driver, device and minors, with resource manager that will take care
> of releasing the objects automatically (it is based on devres.c).
We have lots of examples of subsystems doing this - the main thing
unique about accel is that that there is really no shared uAPI between
the drivers, and not 'abstraction' provided by the kernel. Maybe that
is the point..
> So actually I do want an ioctl but as you said, not for the main
> device char, but to an accompanied control device char.
There is a general problem across all these "thick" devices in the
kernel to support their RAS & configuration requirements and IMHO we
don't have a good answer at all.
We've been talking on and off here about having some kind of
subsystem/methodology specifically for this area - how to monitor,
configure, service, etc a very complicated off-CPU device. I think
there would be a lot of interest in this and maybe it shouldn't be
coupled to this accel idea.
Eg we already have some established mechinisms - I would expect any
accel device to be able to introspect and upgrade its flash FW using
the 'devlink flash' common API.
> an application only has access to the information ioctl through this
> device char (so it can't submit anything, allocate memory, etc.) and
> can only retrieve metrics which do not leak information about the
> compute application.
This is often being done over a netlink socket as the "second char"
Jason
next prev parent reply other threads:[~2022-08-09 12:44 UTC|newest]
Thread overview: 78+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20220731114605epcas1p1afff6b948f542e2062b60d49a8023f6f@epcas1p1.samsung.com>
2022-07-31 11:45 ` New subsystem for acceleration devices Oded Gabbay
2022-07-31 15:37 ` Greg Kroah-Hartman
2022-08-01 2:29 ` yuji2.ishikawa
2022-08-01 8:21 ` Oded Gabbay
2022-08-03 4:39 ` yuji2.ishikawa
2022-08-03 5:34 ` Greg KH
2022-08-03 20:28 ` Oded Gabbay
2022-08-02 17:25 ` Jiho Chu
2022-08-02 19:07 ` Oded Gabbay
2022-08-03 19:04 ` Dave Airlie
2022-08-03 20:20 ` Oded Gabbay
2022-08-03 23:31 ` Daniel Stone
2022-08-04 6:46 ` Oded Gabbay
2022-08-04 9:27 ` Jiho Chu
2022-08-03 23:54 ` Dave Airlie
2022-08-03 23:54 ` Dave Airlie
2022-08-04 7:43 ` Oded Gabbay
2022-08-04 7:43 ` Oded Gabbay
2022-08-04 14:50 ` Jason Gunthorpe
2022-08-04 14:50 ` Jason Gunthorpe
2022-08-04 17:48 ` Oded Gabbay
2022-08-04 17:48 ` Oded Gabbay
2022-08-05 0:22 ` Jason Gunthorpe
2022-08-05 0:22 ` Jason Gunthorpe
2022-08-07 6:43 ` Oded Gabbay
2022-08-07 6:43 ` Oded Gabbay
2022-08-07 11:25 ` Oded Gabbay
2022-08-07 11:25 ` Oded Gabbay
2022-08-08 6:10 ` Greg Kroah-Hartman
2022-08-08 6:10 ` Greg Kroah-Hartman
2022-08-08 17:55 ` Jason Gunthorpe
2022-08-08 17:55 ` Jason Gunthorpe
2022-08-09 6:23 ` Greg Kroah-Hartman
2022-08-09 6:23 ` Greg Kroah-Hartman
2022-08-09 8:04 ` Christoph Hellwig
2022-08-09 8:32 ` Arnd Bergmann
2022-08-09 8:32 ` Arnd Bergmann
2022-08-09 12:18 ` Jason Gunthorpe
2022-08-09 12:18 ` Jason Gunthorpe
2022-08-09 12:46 ` Arnd Bergmann
2022-08-09 12:46 ` Arnd Bergmann
2022-08-09 14:22 ` Jason Gunthorpe
2022-08-09 14:22 ` Jason Gunthorpe
2022-08-09 8:45 ` Greg Kroah-Hartman
2022-08-09 8:45 ` Greg Kroah-Hartman
2022-08-08 17:46 ` Jason Gunthorpe
2022-08-08 17:46 ` Jason Gunthorpe
2022-08-08 20:26 ` Oded Gabbay
2022-08-08 20:26 ` Oded Gabbay
2022-08-09 12:43 ` Jason Gunthorpe [this message]
2022-08-09 12:43 ` Jason Gunthorpe
2022-08-05 3:02 ` Dave Airlie
2022-08-05 3:02 ` Dave Airlie
2022-08-07 6:50 ` Oded Gabbay
2022-08-07 6:50 ` Oded Gabbay
2022-08-09 21:42 ` Oded Gabbay
2022-08-09 21:42 ` Oded Gabbay
2022-08-10 9:00 ` Jiho Chu
2022-08-10 9:00 ` Jiho Chu
2022-08-10 14:05 ` yuji2.ishikawa
2022-08-10 14:05 ` yuji2.ishikawa
2022-08-10 14:37 ` Oded Gabbay
2022-08-10 14:37 ` Oded Gabbay
2022-08-23 18:23 ` Kevin Hilman
2022-08-23 18:23 ` Kevin Hilman
2022-08-23 20:45 ` Oded Gabbay
2022-08-23 20:45 ` Oded Gabbay
2022-08-29 20:54 ` Kevin Hilman
2022-08-29 20:54 ` Kevin Hilman
2022-09-23 16:21 ` Oded Gabbay
2022-09-23 16:21 ` Oded Gabbay
2022-09-26 8:16 ` Christoph Hellwig
2022-09-29 6:50 ` Oded Gabbay
2022-09-29 6:50 ` Oded Gabbay
2022-08-04 12:00 ` Tvrtko Ursulin
2022-08-04 15:03 ` Jeffrey Hugo
2022-08-04 17:53 ` Oded Gabbay
2022-08-04 17:53 ` Oded Gabbay
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YvJWfS5h2SeWGAEM@nvidia.com \
--to=jgg@nvidia.com \
--cc=arnd@arndb.de \
--cc=dri-devel@lists.freedesktop.org \
--cc=gregkh@linuxfoundation.org \
--cc=jiho.chu@samsung.com \
--cc=linux-kernel@vger.kernel.org \
--cc=oded.gabbay@gmail.com \
--cc=yuji2.ishikawa@toshiba.co.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.