All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sumit Garg <sumit.garg@kernel.org>
To: Jens Wiklander <jens.wiklander@linaro.org>
Cc: "Daniel Stone" <daniel@fooishbar.org>,
	"Boris Brezillon" <boris.brezillon@collabora.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Linux Media Mailing List" <linux-media@vger.kernel.org>,
	dri-devel <dri-devel@lists.freedesktop.org>,
	"moderated list:DMA BUFFER SHARING FRAMEWORK"
	<linaro-mm-sig@lists.linaro.org>,
	op-tee@lists.trustedfirmware.org,
	linux-arm-kernel <linux-arm-kernel@lists.infradead.org>,
	"Olivier Masse" <olivier.masse@nxp.com>,
	"Thierry Reding" <thierry.reding@gmail.com>,
	"Yong Wu" <yong.wu@mediatek.com>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Benjamin Gaignard" <benjamin.gaignard@collabora.com>,
	"Brian Starkey" <Brian.Starkey@arm.com>,
	"John Stultz" <jstultz@google.com>,
	"T . J . Mercier" <tjmercier@google.com>,
	"Christian König" <christian.koenig@amd.com>,
	"Matthias Brugger" <matthias.bgg@gmail.com>,
	"AngeloGioacchino Del Regno"
	<angelogioacchino.delregno@collabora.com>,
	azarrabi@qti.qualcomm.com,
	"Florent Tomasin" <florent.tomasin@arm.com>
Subject: Re: [PATCH v4 0/6] TEE subsystem for restricted dma-buf allocations
Date: Tue, 4 Mar 2025 13:15:54 +0530	[thread overview]
Message-ID: <Z8avsigZJ4vqmiA4@sumit-X1> (raw)
In-Reply-To: <CAHUa44FkG1NAWpoW8UVBywv44XW_mjAJa32PcC9mcmiOLdiRqw@mail.gmail.com>

On Tue, Mar 04, 2025 at 08:17:23AM +0100, Jens Wiklander wrote:
> Hi Daniel,
> 
> On Fri, Feb 21, 2025 at 3:12 PM Daniel Stone <daniel@fooishbar.org> wrote:
> >
> > Hi Sumit,
> >
> > On Fri, 21 Feb 2025 at 11:24, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > On Tue, 18 Feb 2025 at 21:52, Daniel Stone <daniel@fooishbar.org> wrote:
> > > > dma-heaps was created to solve the problem of having too many
> > > > 'allocate $n bytes from $specialplace' uAPIs. The proliferation was
> > > > painful and making it difficult for userspace to do what it needed to
> > > > do. Userspace doesn't _yet_ make full use of it, but the solution is
> > > > to make userspace make full use of it, not to go create entirely
> > > > separate allocation paths for unclear reasons.
> > > >
> > > > Besides, I'm writing this from a platform that implements SVP not via
> > > > TEE. I've worked on platforms which implement SVP without any TEE,
> > > > where the TEE implementation would be at best a no-op stub, and at
> > > > worst flat-out impossible.
> > >
> > > Can you elaborate the non-TEE use-case for Secure Video Path (SVP) a
> > > bit more? As to how the protected/encrypted media content pipeline
> > > works? Which architecture support does your use-case require? Is there
> > > any higher privileged level firmware interaction required to perform
> > > media content decryption into restricted memory? Do you plan to
> > > upstream corresponding support in near future?
> >
> > You can see the MTK SVP patches on list which use the MTK SMC to mediate it.
> >
> > There are TI Jacinto platforms which implement a 'secure' area
> > configured statically by (IIRC) BL2, with static permissions defined
> > for each AXI endpoint, e.g. CPU write + codec RW + dispc read. I've
> > heard of another SoC vendor doing the same, but I don't think I can
> > share those details. There is no TEE interaction.
> >
> > I'm writing this message from an AMD laptop which implements
> > restricted content paths outside of TEE. I don't have the full picture
> > of how SVP is implemented on AMD systems, but I do know that I don't
> > have any TEE devices exposed.
> >
> > > Let me try to elaborate on the Secure Video Path (SVP) flow requiring
> > > a TEE implementation (in general terms a higher privileged firmware
> > > managing the pipeline as the kernel/user-space has no access
> > > permissions to the plain text media content):
> > >
> > > - [...]
> >
> > Yeah, I totally understand the TEE usecase. I think that TEE is a good
> > design to implement this. I think that TEE should be used for SVP
> > where it makes sense.
> >
> > Please understand that I am _not_ arguing that no-one should use TEE for SVP!
> >
> > > > So, again, let's
> > > > please turn this around: _why_ TEE? Who benefits from exposing this as
> > > > completely separate to the more generic uAPI that we specifically
> > > > designed to handle things like this?
> > >
> > > The bridging between DMA heaps and TEE would still require user-space
> > > to perform an IOCTL into TEE to register the DMA-bufs as you can see
> > > here [1]. Then it will rather be two handles for user-space to manage.
> >
> > Yes, the decoder would need to do this. That's common though: if you
> > want to share a buffer between V4L2 and DRM, you have three handles:
> > the V4L2 buffer handle, the DRM GEM handle, and the dmabuf you use to
> > bridge the two.
> >
> > > Similarly during restricted memory allocation/free we need another
> > > glue layer under DMA heaps to TEE subsystem.
> >
> > Yep.
> >
> > > The reason is simply which has been iterated over many times in the
> > > past threads that:
> > >
> > >     "If user-space has to interact with a TEE device for SVP use-case
> > > then why it's not better to ask TEE to allocate restricted DMA-bufs
> > > too"
> >
> > The first word in your proposition is load-bearing.
> >
> > Build out the usecase a little more here. You have a DRMed video
> > stream coming in, which you need to decode (involving TEE for this
> > usecase). You get a dmabuf handle to the decoded frame. You need to
> > pass the dmabuf across to the Wayland compositor. The compositor needs
> > to pass it to EGL/Vulkan to import and do composition, which in turn
> > passes it to the GPU DRM driver. The output of the composition is in
> > turn shared between the GPU DRM driver and the separate KMS DRM
> > driver, with the involvement of GBM.
> >
> > For the platforms I'm interested in, the GPU DRM driver needs to
> > switch into protected mode, which has no involvement at all with TEE -
> > it's architecturally impossible to have TEE involved without moving
> > most of the GPU driver into TEE and destroying performance. The
> > display hardware also needs to engage protected mode, which again has
> > no involvement with TEE and again would need to have half the driver
> > moved into TEE for no benefit in order to do so. The Wayland
> > compositor also has no interest in TEE: it tells the GPU DRM driver
> > about the protected status of its buffers, and that's it.
> >
> > What these components _are_ opinionated about, is the way buffers are
> > allocated and managed. We built out dmabuf modifiers for this usecase,
> > and we have a good negotiation protocol around that. We also really
> > care about buffer placement in some usecases - e.g. some display/codec
> > hardware requires buffers to be sourced from contiguous memory, other
> > hardware needs to know that when it shares buffers with another
> > device, it needs to place the buffers outside of inaccessible/slow
> > local RAM. So we built out dma-heaps, so every part of the component
> > in the stack can communicate their buffer-placement needs in the same
> > way as we do modifiers, and negotiate an acceptable allocation.
> >
> > That's my starting point for this discussion. We have a mechanism to
> > deal with the fact that buffers need to be shared between different IP
> > blocks which have their own constraints on buffer placement, avoiding
> > the current problem of having every subsystem reinvent their own
> > allocation uAPI which was burying us in impedance mismatch and
> > confusion. That mechanism is dma-heaps. It seems like your starting
> > point from this discussion is that you've implemented a TEE-centric
> > design for SVP, and so all of userspace should bypass our existing
> > cross-subsystem special-purpose allocation mechanism, and write
> > specifically to one implementation. I believe that is a massive step
> > backwards and an immediate introduction of technical debt.
> >
> > Again, having an implementation of SVP via TEE makes a huge amount of
> > sense. Having _most_ SVP implementations via TEE still makes a lot of
> > sense. Having _all_ SVP implementations eventually be via TEE would
> > still make sense. But even if we were at that point - which we aren't
> > - it still doesn't justify telling userspace 'use the generic dma-heap
> > uAPI for every device-specific allocation constraint, apart from SVP
> > which has a completely different way to allocate some bytes'.
> 
> I must admit that I don't see how this makes a significant difference,
> but then I haven't hacked much in the stacks you're talking about, so
> I'm going to take your word for it.
> 
> I've experimented with providing a dma-heap replacing the TEE API. The
> implementation is more complex than I first anticipated, adding about
> 400 lines to the patch set.

I did anticipated this but let's give it a try and see if DMA heaps
really adds any value from user-space point of view. If it does then it
will be worth the maintenence overhead.

> From user space, it looks like another
> dma-heap. I'm using the names you gave earlier,
> protected,secure-video, protected,trusted-ui, and
> protected,secure-video-record. However, I wonder if we shouldn't use
> "restricted" instead of "protected" since we had agreed to call it
> restricted memory earlier.

Let's stick with "restricted" memory buffer references only.

-Sumit


WARNING: multiple messages have this Message-ID (diff)
From: Sumit Garg <sumit.garg@kernel.org>
To: op-tee@lists.trustedfirmware.org
Subject: Re: [PATCH v4 0/6] TEE subsystem for restricted dma-buf allocations
Date: Tue, 04 Mar 2025 13:15:54 +0530	[thread overview]
Message-ID: <Z8avsigZJ4vqmiA4@sumit-X1> (raw)
In-Reply-To: < <CAHUa44FkG1NAWpoW8UVBywv44XW_mjAJa32PcC9mcmiOLdiRqw@mail.gmail.com>>

[-- Attachment #1: Type: text/plain, Size: 8156 bytes --]

On Tue, Mar 04, 2025 at 08:17:23AM +0100, Jens Wiklander wrote:
> Hi Daniel,
> 
> On Fri, Feb 21, 2025 at 3:12 PM Daniel Stone <daniel@fooishbar.org> wrote:
> >
> > Hi Sumit,
> >
> > On Fri, 21 Feb 2025 at 11:24, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > On Tue, 18 Feb 2025 at 21:52, Daniel Stone <daniel@fooishbar.org> wrote:
> > > > dma-heaps was created to solve the problem of having too many
> > > > 'allocate $n bytes from $specialplace' uAPIs. The proliferation was
> > > > painful and making it difficult for userspace to do what it needed to
> > > > do. Userspace doesn't _yet_ make full use of it, but the solution is
> > > > to make userspace make full use of it, not to go create entirely
> > > > separate allocation paths for unclear reasons.
> > > >
> > > > Besides, I'm writing this from a platform that implements SVP not via
> > > > TEE. I've worked on platforms which implement SVP without any TEE,
> > > > where the TEE implementation would be at best a no-op stub, and at
> > > > worst flat-out impossible.
> > >
> > > Can you elaborate the non-TEE use-case for Secure Video Path (SVP) a
> > > bit more? As to how the protected/encrypted media content pipeline
> > > works? Which architecture support does your use-case require? Is there
> > > any higher privileged level firmware interaction required to perform
> > > media content decryption into restricted memory? Do you plan to
> > > upstream corresponding support in near future?
> >
> > You can see the MTK SVP patches on list which use the MTK SMC to mediate it.
> >
> > There are TI Jacinto platforms which implement a 'secure' area
> > configured statically by (IIRC) BL2, with static permissions defined
> > for each AXI endpoint, e.g. CPU write + codec RW + dispc read. I've
> > heard of another SoC vendor doing the same, but I don't think I can
> > share those details. There is no TEE interaction.
> >
> > I'm writing this message from an AMD laptop which implements
> > restricted content paths outside of TEE. I don't have the full picture
> > of how SVP is implemented on AMD systems, but I do know that I don't
> > have any TEE devices exposed.
> >
> > > Let me try to elaborate on the Secure Video Path (SVP) flow requiring
> > > a TEE implementation (in general terms a higher privileged firmware
> > > managing the pipeline as the kernel/user-space has no access
> > > permissions to the plain text media content):
> > >
> > > - [...]
> >
> > Yeah, I totally understand the TEE usecase. I think that TEE is a good
> > design to implement this. I think that TEE should be used for SVP
> > where it makes sense.
> >
> > Please understand that I am _not_ arguing that no-one should use TEE for SVP!
> >
> > > > So, again, let's
> > > > please turn this around: _why_ TEE? Who benefits from exposing this as
> > > > completely separate to the more generic uAPI that we specifically
> > > > designed to handle things like this?
> > >
> > > The bridging between DMA heaps and TEE would still require user-space
> > > to perform an IOCTL into TEE to register the DMA-bufs as you can see
> > > here [1]. Then it will rather be two handles for user-space to manage.
> >
> > Yes, the decoder would need to do this. That's common though: if you
> > want to share a buffer between V4L2 and DRM, you have three handles:
> > the V4L2 buffer handle, the DRM GEM handle, and the dmabuf you use to
> > bridge the two.
> >
> > > Similarly during restricted memory allocation/free we need another
> > > glue layer under DMA heaps to TEE subsystem.
> >
> > Yep.
> >
> > > The reason is simply which has been iterated over many times in the
> > > past threads that:
> > >
> > >     "If user-space has to interact with a TEE device for SVP use-case
> > > then why it's not better to ask TEE to allocate restricted DMA-bufs
> > > too"
> >
> > The first word in your proposition is load-bearing.
> >
> > Build out the usecase a little more here. You have a DRMed video
> > stream coming in, which you need to decode (involving TEE for this
> > usecase). You get a dmabuf handle to the decoded frame. You need to
> > pass the dmabuf across to the Wayland compositor. The compositor needs
> > to pass it to EGL/Vulkan to import and do composition, which in turn
> > passes it to the GPU DRM driver. The output of the composition is in
> > turn shared between the GPU DRM driver and the separate KMS DRM
> > driver, with the involvement of GBM.
> >
> > For the platforms I'm interested in, the GPU DRM driver needs to
> > switch into protected mode, which has no involvement at all with TEE -
> > it's architecturally impossible to have TEE involved without moving
> > most of the GPU driver into TEE and destroying performance. The
> > display hardware also needs to engage protected mode, which again has
> > no involvement with TEE and again would need to have half the driver
> > moved into TEE for no benefit in order to do so. The Wayland
> > compositor also has no interest in TEE: it tells the GPU DRM driver
> > about the protected status of its buffers, and that's it.
> >
> > What these components _are_ opinionated about, is the way buffers are
> > allocated and managed. We built out dmabuf modifiers for this usecase,
> > and we have a good negotiation protocol around that. We also really
> > care about buffer placement in some usecases - e.g. some display/codec
> > hardware requires buffers to be sourced from contiguous memory, other
> > hardware needs to know that when it shares buffers with another
> > device, it needs to place the buffers outside of inaccessible/slow
> > local RAM. So we built out dma-heaps, so every part of the component
> > in the stack can communicate their buffer-placement needs in the same
> > way as we do modifiers, and negotiate an acceptable allocation.
> >
> > That's my starting point for this discussion. We have a mechanism to
> > deal with the fact that buffers need to be shared between different IP
> > blocks which have their own constraints on buffer placement, avoiding
> > the current problem of having every subsystem reinvent their own
> > allocation uAPI which was burying us in impedance mismatch and
> > confusion. That mechanism is dma-heaps. It seems like your starting
> > point from this discussion is that you've implemented a TEE-centric
> > design for SVP, and so all of userspace should bypass our existing
> > cross-subsystem special-purpose allocation mechanism, and write
> > specifically to one implementation. I believe that is a massive step
> > backwards and an immediate introduction of technical debt.
> >
> > Again, having an implementation of SVP via TEE makes a huge amount of
> > sense. Having _most_ SVP implementations via TEE still makes a lot of
> > sense. Having _all_ SVP implementations eventually be via TEE would
> > still make sense. But even if we were at that point - which we aren't
> > - it still doesn't justify telling userspace 'use the generic dma-heap
> > uAPI for every device-specific allocation constraint, apart from SVP
> > which has a completely different way to allocate some bytes'.
> 
> I must admit that I don't see how this makes a significant difference,
> but then I haven't hacked much in the stacks you're talking about, so
> I'm going to take your word for it.
> 
> I've experimented with providing a dma-heap replacing the TEE API. The
> implementation is more complex than I first anticipated, adding about
> 400 lines to the patch set.

I did anticipated this but let's give it a try and see if DMA heaps
really adds any value from user-space point of view. If it does then it
will be worth the maintenence overhead.

> From user space, it looks like another
> dma-heap. I'm using the names you gave earlier,
> protected,secure-video, protected,trusted-ui, and
> protected,secure-video-record. However, I wonder if we shouldn't use
> "restricted" instead of "protected" since we had agreed to call it
> restricted memory earlier.

Let's stick with "restricted" memory buffer references only.

-Sumit

  reply	other threads:[~2025-03-04  8:03 UTC|newest]

Thread overview: 86+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-17 10:07 [PATCH v4 0/6] TEE subsystem for restricted dma-buf allocations Jens Wiklander
2024-12-17 10:07 ` Jens Wiklander
2024-12-17 10:07 ` [PATCH v4 1/6] tee: add restricted memory allocation Jens Wiklander
2024-12-17 10:07   ` Jens Wiklander
2025-01-08 16:54   ` Simona Vetter
2025-01-08 16:54     ` Simona Vetter
2025-01-09  7:17     ` Jens Wiklander
2025-01-09  7:17       ` Jens Wiklander
2025-02-13  8:44   ` Boris Brezillon
2025-02-13  8:44     ` Boris Brezillon
2024-12-17 10:07 ` [PATCH v4 2/6] optee: account for direction while converting parameters Jens Wiklander
2024-12-17 10:07   ` Jens Wiklander
2024-12-17 10:07 ` [PATCH v4 3/6] optee: sync secure world ABI headers Jens Wiklander
2024-12-17 10:07   ` Jens Wiklander
2024-12-17 10:07 ` [PATCH v4 4/6] optee: support restricted memory allocation Jens Wiklander
2024-12-17 10:07   ` Jens Wiklander
2024-12-17 10:07 ` [PATCH v4 5/6] optee: FF-A: dynamic " Jens Wiklander
2024-12-17 10:07   ` Jens Wiklander
2024-12-20 14:38   ` kernel test robot
2024-12-20 14:38     ` kernel test robot
2024-12-21  9:40   ` kernel test robot
2024-12-21  9:40     ` kernel test robot
2024-12-17 10:07 ` [PATCH v4 6/6] optee: smc abi: " Jens Wiklander
2024-12-17 10:07   ` Jens Wiklander
2024-12-18 11:06 ` [PATCH v4 0/6] TEE subsystem for restricted dma-buf allocations Simona Vetter
2024-12-18 11:06   ` Simona Vetter
2024-12-24  6:35   ` Sumit Garg
2024-12-24  6:35     ` Sumit Garg
2024-12-24  9:28     ` Lukas Wunner
2024-12-24  9:28       ` Lukas Wunner
2024-12-24  9:32       ` Lukas Wunner
2024-12-24  9:32         ` Lukas Wunner
2024-12-24 10:00         ` Dmitry Baryshkov
2024-12-24 10:00           ` Dmitry Baryshkov
2024-12-26  5:59       ` Sumit Garg
2024-12-26  5:59         ` Sumit Garg
2024-12-26 11:26         ` Lukas Wunner
2024-12-26 11:26           ` Lukas Wunner
2025-01-08 17:00           ` Simona Vetter
2025-01-08 17:00             ` Simona Vetter
2025-01-08 16:57     ` Simona Vetter
2025-01-08 16:57       ` Simona Vetter
2025-01-09  6:08       ` Sumit Garg
2025-01-09  6:08         ` Sumit Garg
2025-02-12 19:56 ` Boris Brezillon
2025-02-12 19:56   ` Boris Brezillon
2025-02-13  6:41   ` Sumit Garg
2025-02-13  6:41     ` Sumit Garg
2025-02-13  8:19     ` Jens Wiklander
2025-02-13  8:19       ` Jens Wiklander
2025-02-13  8:35     ` Boris Brezillon
2025-02-13  8:35       ` Boris Brezillon
2025-02-13  9:16       ` Sumit Garg
2025-02-13  9:16         ` Sumit Garg
2025-02-13 12:40         ` Boris Brezillon
2025-02-13 12:40           ` Boris Brezillon
2025-02-13 14:05           ` Daniel Stone
2025-02-13 14:05             ` Daniel Stone
2025-02-13 15:57             ` Jens Wiklander
2025-02-13 15:57               ` Jens Wiklander
2025-02-13 17:39               ` Daniel Stone
2025-02-13 17:39                 ` Daniel Stone
2025-02-14 10:07                 ` Jens Wiklander
2025-02-14 10:07                   ` Jens Wiklander
2025-02-14 13:07                   ` Sumit Garg
2025-02-14 13:07                     ` Sumit Garg
2025-02-14 15:48                     ` Boris Brezillon
2025-02-14 15:48                       ` Boris Brezillon
2025-02-17  6:12                       ` Sumit Garg
2025-02-17  6:12                         ` Sumit Garg
2025-02-18 16:22                         ` Daniel Stone
2025-02-18 16:22                           ` Daniel Stone
2025-02-19 13:22                           ` Simona Vetter
2025-02-19 13:22                             ` Simona Vetter
2025-02-21 11:24                           ` Sumit Garg
2025-02-21 11:24                             ` Sumit Garg
2025-02-21 14:12                             ` Daniel Stone
2025-02-21 14:12                               ` Daniel Stone
2025-03-04  7:17                               ` Jens Wiklander
2025-03-04  7:17                                 ` Jens Wiklander
2025-03-04  7:45                                 ` Sumit Garg [this message]
2025-03-04  7:45                                   ` Sumit Garg
2025-03-18 18:38                                   ` Nicolas Dufresne
2025-03-18 18:38                                     ` Nicolas Dufresne
2025-03-19  7:37                                     ` Jens Wiklander
2025-03-19  7:37                                       ` Jens Wiklander

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z8avsigZJ4vqmiA4@sumit-X1 \
    --to=sumit.garg@kernel.org \
    --cc=Brian.Starkey@arm.com \
    --cc=angelogioacchino.delregno@collabora.com \
    --cc=azarrabi@qti.qualcomm.com \
    --cc=benjamin.gaignard@collabora.com \
    --cc=boris.brezillon@collabora.com \
    --cc=christian.koenig@amd.com \
    --cc=daniel@fooishbar.org \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=florent.tomasin@arm.com \
    --cc=jens.wiklander@linaro.org \
    --cc=jstultz@google.com \
    --cc=linaro-mm-sig@lists.linaro.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=matthias.bgg@gmail.com \
    --cc=olivier.masse@nxp.com \
    --cc=op-tee@lists.trustedfirmware.org \
    --cc=sumit.semwal@linaro.org \
    --cc=thierry.reding@gmail.com \
    --cc=tjmercier@google.com \
    --cc=yong.wu@mediatek.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.