From: Kevin Hilman <khilman@baylibre.com>
To: Oded Gabbay <oded.gabbay@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"Linux-Kernel@Vger. Kernel. Org" <linux-kernel@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	Alexandre Bailon <abailon@baylibre.com>,
	Jason Gunthorpe <jgg@nvidia.com>, Jiho Chu <jiho.chu@samsung.com>,
	Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>
Subject: Re: New subsystem for acceleration devices
Date: Mon, 29 Aug 2022 13:54:42 -0700
Message-ID: <7hh71uixd9.fsf@baylibre.com>
In-Reply-To: <CAFCwf12P6DckVUJL7V_Z7ASj+8A3yyx9eX5MpZPF47Rzg6CjEA@mail.gmail.com>

Hi Oded (and sorry I misspelled your name last time),

Oded Gabbay <oded.gabbay@gmail.com> writes:

> On Tue, Aug 23, 2022 at 9:24 PM Kevin Hilman <khilman@baylibre.com> wrote:
>>
>> Hi Obed,
>>
>> Oded Gabbay <oded.gabbay@gmail.com> writes:
>>
>> [...]
>>
>> > I want to update that I'm currently in discussions with Dave to figure
>> > out what's the best way to move forward. We are writing it down to do
>> > a proper comparison between the two paths (new accel subsystem or
>> > using drm). I guess it will take a week or so.
>>
>> Any update on the discussions with Dave? and/or are there any plans to
>> discuss this further at LPC/ksummit yet?
> Hi Kevin.
>
> We are still discussing the details, as at least the habanalabs driver
> is very complex and there are multiple parts that I need to see if and
> how they can be mapped to drm.
> Some of us will attend LPC so we will probably take advantage of that
> to talk more about this.

OK, looking forward to some more conversations at LPC.

>>
>> We (BayLibre) are upstreaming support for APUs on Mediatek SoCs, and are
>> using the DRM-based approach.  I'll also be at LPC and happy to discuss
>> in person.
>>
>> For some context on my/our interest: back in Sept 2020 we initially
>> submitted an rpmsg-based driver for kernel communication[1].  After
>> review comments, we rewrote that based on DRM[2] and are now using it
>> for some MTK SoCs[3] and supporting our MTK customers with it.
>>
>> Hopefully we will get the kernel interfaces sorted out soon, but next,
>> there's the userspace side of things.  To that end, we're also working
>> on libAPU, a common, open userspace stack.  Alex Bailon presented a
>> proposal earlier this year at Embedded Recipes in Paris
>> (video[4], slides[5]).
>>
>> libAPU would include abstractions of the kernel interfaces for DRM
>> (using libdrm), remoteproc/rpmsg, virtio etc. but also goes farther and
>> proposes an open firmware for the accelerator side using
>> libMetal/OpenAMP + rpmsg for communication with (most likely closed
>> source) vendor firmware.  Think of this like sound open firmware (SOF[6]),
>> but for accelerators.
>
> I think your device and the habana device are very different in
> nature, and it is part of what Dave and I discussed, whether these two
> classes of devices can live together. I guess they can live together
> in the kernel, but in the userspace, not so much imo.

Yeah, for now I think focusing on how to handle both classes of devices
in the kernel is the most important.

> The first class is the edge inference devices (usually as part of some
> SoC). I think your description of the APU on MTK SoC is a classic
> example of such a device.

Correct.

> You usually have some firmware you load, you give it a graph and
> pointers for input and output and then you just execute the graph
> again and again to perform inference and just replace the inputs.
>
> The second class is the data-center training accelerators, of which
> habana's gaudi device is an example. These devices usually
> have a number of different compute engines, a fabric for scaling out,
> on-device memory, internal MMUs and RAS monitoring requirements. Those
> devices are usually operated via command queues, either through their
> kernel driver or directly from user-space. They have multiple APIs for
> memory management, RAS, scaling-out and command-submissions.

OK, I see.

>>
>> We've been using this successfully for Mediatek SoCs (which have a
>> Cadence VP6 APU) and have submitted/published the code, including the
>> OpenAMP[7] and libmetal[8] parts in addition to the kernel parts already
>> mentioned.
> What's the difference between libmetal and other open-source low-level
> runtime drivers, such as oneAPI level-zero ?

TBH, I'd never heard of oneAPI before, so I'm assuming it's mainly
focused on the data center.  libmetal/OpenAMP are widely used
in the consumer and industrial embedded space, and heavily used by FPGAs
in many market segments.

> Currently we have our own runtime driver which is tightly coupled with
> our h/w. For example, the method the userspace "talks" to the
> data-plane firmware is very proprietary as it is hard-wired into the
> architecture of the entire ASIC and how it performs deep-learning
> training. Therefore, I don't see how this can be shared with other
> vendors. Not because of secrecy but because it is simply not relevant
> to any other ASIC.

OK, makes sense.

Thanks for clarifying your use case in more detail.

Kevin
