linux-kernel.vger.kernel.org archive mirror
* [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
@ 2025-01-30 13:08 Florent Tomasin
  2025-01-30 13:08 ` [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings Florent Tomasin
                   ` (6 more replies)
  0 siblings, 7 replies; 48+ messages in thread
From: Florent Tomasin @ 2025-01-30 13:08 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel,
	Florent Tomasin

Hi,

This patch series adds support for protected mode execution to the
Mali Panthor CSF kernel driver.

Mali CSF GPUs support protected mode execution at the HW level. This
feature requires two main changes in the kernel driver:

1) Configure the GPU with a protected buffer. The system must provide a DMA
   heap from which the driver can allocate a protected buffer.
   It can be a carved-out memory region or a dynamically allocated protected
   memory region. Some systems include a trusted FW that is in charge of the
   protected memory. Since this problem is integration specific, the Mali
   Panthor CSF kernel driver must import the protected memory from a device
   specific exporter.

2) Handle entering and exiting protected mode on the GPU HW.
   The FW sends a request for protected mode entry to the kernel driver.
   The acknowledgment of that request is a scheduling decision: protected
   mode execution must not overrule the normal mode of execution.
   A fair distribution of execution time guarantees that the overall
   performance of the device, including the UI (usually executing in
   normal mode), will not regress when a protected mode job is submitted
   by an application.


Background
----------

The current Mali Panthor CSF driver does not allow a user space application
to execute protected jobs on the GPU. This use case is quite common on
end-user devices: a user may want to watch a video or render content that is
under "Digital Rights Management" protection, or launch an application
handling private user data.

1) User-space:

   In order for an application to execute protected jobs on a Mali CSF GPU,
   the user space application must submit jobs to the GPU within a
   "protected region" (a range of commands to execute in protected mode).

   Here is an example of a command buffer that contains protected commands:

```
          <--- Normal mode ---><--- Protected mode ---><--- Normal mode --->
   +-------------------------------------------------------------------------+
   | ... | CMD_0 | ... | CMD_N | PROT_REGION | CMD_N+1 | ... | CMD_N+M | ... |
   +-------------------------------------------------------------------------+
```

   The PROT_REGION command acts as a barrier to notify the HW of upcoming
   protected jobs. It also defines the number of commands to execute in protected
   mode.

   The Mesa definition of the opcode can be found here:

     https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/panfrost/lib/genxml/v10.xml?ref_type=heads#L763

2) Kernel-space:

   When loading the FW image, the kernel driver must also load the data
   section of the CSF FW that is placed in protected memory, in order to
   allow the FW to execute in protected mode.

   Important: this memory is not owned by any process. It is GPU device
              level protected memory.

   In addition, when a CSG (group) is created, it must have a protected suspend buffer.
   This memory is allocated within the kernel but bound to a specific CSG that belongs
   to a process. The kernel owns this allocation and does not allow user space mapping.
   The format of the data in this buffer is only known by the FW and does not need to
   be shared with other entities. The purpose of this buffer is the same as the normal
   suspend buffer but for protected mode. FW will use it to suspend the execution of
   PROT_REGION before returning to normal mode of execution.


Design decisions
----------------

The Mali Panthor CSF kernel driver will allocate protected DMA buffers
using a global protected DMA heap. The name of the heap can vary on
the system and is integration specific. Therefore, the kernel driver
will retrieve it using the DTB entry: "protected-heap-name".

The Mali Panthor CSF kernel driver will handle enter/exit of protected
mode with a fair consideration of the job scheduling.

If the system integrator does not provide a protected DMA heap, the driver
will not allow any protected mode execution.
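
For illustration, the integration could look like the following DT fragment
(node names, addresses, and the abbreviated GPU node are hypothetical; the
"linux,cma" heap binding is the one introduced in patches 1-2, and the heap
name matches the node name of the reserved memory region):

```
reserved-memory {
    #address-cells = <2>;
    #size-cells = <2>;

    protected_heap: protected-heap {
        compatible = "shared-dma-pool";
        reg = <0x0 0x90600000 0x0 0x1000000>;
        reusable;
    };
};

gpu_protected_heap: gpu-protected-heap {
    compatible = "linux,cma";
    memory-region = <&protected_heap>;
};

gpu@2d000000 {
    compatible = "arm,mali-valhall-csf";
    /* ...remaining GPU properties elided... */
    protected-heap-name = "protected-heap";
};
```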


Patch series
------------

The series is based on:

  https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t

[PATCHES 1-2]:
  These patches were used for the development of the feature and are not
  initially intended to land in the Linux kernel. They are mostly relevant
  if someone wants to reproduce the testing environment.

  Note: Please raise interest if you think those patches have value in
        the Linux kernel.

  * dt-bindings: dma: Add CMA Heap bindings
  * cma-heap: Allow registration of custom cma heaps

[PATCHES 3-4]:
  These patches introduce the Mali Panthor CSF driver DTB binding to pass
  the protected DMA Heap name and the handling of the protected DMA memory
  allocations in the driver.

  Note: The registration of the protected DMA buffers is done via GEM prime.
  The direct call to the registration function may seem controversial, and
  I would appreciate feedback on that matter.

  * dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
  * drm/panthor: Add support for protected memory allocation in panthor

[PATCH 5]:
  This patch implements the logic to handle enter/exit of the GPU protected
  mode in Panthor CSF driver.

  Note: to prevent scheduler priority inversion, only a single CSG is allowed
        to execute while in protected mode. It must be the top priority one.

  * drm/panthor: Add support for entering and exiting protected mode

Testing
-------

1) Platform and development environment

   Any platform containing a Mali CSF GPU and a protected memory allocator
   based on a DMA heap can be used: for example, a physical platform or a
   simulator such as the Arm Total Compute FVP platforms. Reference to the
   latter:

     https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms/Total%20Compute%20FVPs

   To ease the development of the feature, a carved-out protected memory heap was
   defined and managed using a modified version of the CMA heap driver.

2) Protected job submission:

   A protected mode job can be created in Mesa following this approach:

```
diff --git a/src/gallium/drivers/panfrost/pan_csf.c b/src/gallium/drivers/panfrost/pan_csf.c
index da6ce875f86..13d54abf5a1 100644
--- a/src/gallium/drivers/panfrost/pan_csf.c
+++ b/src/gallium/drivers/panfrost/pan_csf.c
@@ -803,8 +803,25 @@ GENX(csf_emit_fragment_job)(struct panfrost_batch *batch,
       }
    }

+   if (protected_cmd) {
+      /* Number of commands to execute in protected mode in bytes.
+       * The run fragment and wait commands. */
+      unsigned const size = 2 * sizeof(u64);
+
+      /* Wait for all previous commands to complete before evaluating
+       * the protected commands. */
+      cs_wait_slots(b, SB_ALL_MASK, false);
+      cs_prot_region(b, size);
+   }
+
    /* Run the fragment job and wait */
    cs_run_fragment(b, false, MALI_TILE_RENDER_ORDER_Z_ORDER, false);
+
+   /* Wait for all protected commands to complete before evaluating
+    * the normal mode commands. */
+   if (protected_cmd)
+      cs_wait_slots(b, SB_ALL_MASK, false);
+
    cs_wait_slot(b, 2, false);

    /* Gather freed heap chunks and add them to the heap context free list
```


Constraints
-----------

At the time of developing the feature, the Linux kernel does not have a
generic way of implementing protected DMA heaps. The patch series relies on
previous work to expose the DMA heap API to kernel drivers.

The Mali CSF GPU requires device level allocated protected memory, which
does not belong to any process. The current Linux implementation of DMA
heaps only allows user space programs to allocate from such heaps. Having
the ability to allocate this memory at the kernel level via the DMA heap
API would allow support for protected mode on Mali CSF GPUs.


Conclusion
----------

This patch series aims to initiate the discussion around the handling of
protected mode in Mali CSF GPUs and highlights constraints found during the
development of the feature.

Some Mesa changes are required to construct a protected mode job in user space,
which can be submitted to the GPU.

Some of the changes may seem controversial, and we would appreciate getting
opinions from the community.


Regards,

Florent Tomasin (5):
  dt-bindings: dma: Add CMA Heap bindings
  cma-heap: Allow registration of custom cma heaps
  dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
  drm/panthor: Add support for protected memory allocation in panthor
  drm/panthor: Add support for entering and exiting protected mode

 .../devicetree/bindings/dma/linux,cma.yml     |  43 ++++++
 .../bindings/gpu/arm,mali-valhall-csf.yaml    |   6 +
 drivers/dma-buf/heaps/cma_heap.c              | 120 +++++++++++------
 drivers/gpu/drm/panthor/Kconfig               |   1 +
 drivers/gpu/drm/panthor/panthor_device.c      |  22 +++-
 drivers/gpu/drm/panthor/panthor_device.h      |  10 ++
 drivers/gpu/drm/panthor/panthor_fw.c          |  46 ++++++-
 drivers/gpu/drm/panthor/panthor_fw.h          |   2 +
 drivers/gpu/drm/panthor/panthor_gem.c         |  49 ++++++-
 drivers/gpu/drm/panthor/panthor_gem.h         |  16 ++-
 drivers/gpu/drm/panthor/panthor_heap.c        |   2 +
 drivers/gpu/drm/panthor/panthor_sched.c       | 124 ++++++++++++++++--
 12 files changed, 382 insertions(+), 59 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/dma/linux,cma.yml

--
2.34.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-01-30 13:08 [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Florent Tomasin
@ 2025-01-30 13:08 ` Florent Tomasin
  2025-01-30 13:28   ` Maxime Ripard
  2025-01-30 23:20   ` Rob Herring
  2025-01-30 13:08 ` [RFC PATCH 2/5] cma-heap: Allow registration of custom cma heaps Florent Tomasin
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 48+ messages in thread
From: Florent Tomasin @ 2025-01-30 13:08 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel,
	Florent Tomasin

Introduce a CMA Heap dt-binding allowing custom
CMA heap registrations.

* Note to the reviewers:
The patch was used for the development of the protected mode
feature in the Panthor CSF kernel driver and is not initially
intended to land in the Linux kernel. It is mostly relevant if
someone wants to reproduce the testing environment. Please raise
interest if you think the patch has value in the Linux kernel.

Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
---
 .../devicetree/bindings/dma/linux,cma.yml     | 43 +++++++++++++++++++
 1 file changed, 43 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/dma/linux,cma.yml

diff --git a/Documentation/devicetree/bindings/dma/linux,cma.yml b/Documentation/devicetree/bindings/dma/linux,cma.yml
new file mode 100644
index 000000000000..c532e016bbe5
--- /dev/null
+++ b/Documentation/devicetree/bindings/dma/linux,cma.yml
@@ -0,0 +1,43 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/dma/linux-cma.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Custom Linux CMA heap
+
+description:
+  The custom Linux CMA heap device tree node allows registering
+  of multiple CMA heaps.
+
+  The CMA heap name will match the node name of the "memory-region".
+
+properties:
+  compatible:
+    enum:
+      - linux,cma
+
+  memory-region:
+    maxItems: 1
+    description: |
+      Phandle to the reserved memory node associated with the CMA Heap.
+      The reserved memory node must follow this binding convention:
+       - Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+
+examples:
+  - |
+    reserved-memory {
+      #address-cells = <2>;
+      #size-cells = <2>;
+
+      custom_cma_heap: custom-cma-heap {
+        compatible = "shared-dma-pool";
+        reg = <0x0 0x90600000 0x0 0x1000000>;
+        reusable;
+      };
+    };
+
+    device_cma_heap: device-cma-heap {
+      compatible = "linux,cma";
+      memory-region = <&custom_cma_heap>;
+    };
-- 
2.34.1



* [RFC PATCH 2/5] cma-heap: Allow registration of custom cma heaps
  2025-01-30 13:08 [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Florent Tomasin
  2025-01-30 13:08 ` [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings Florent Tomasin
@ 2025-01-30 13:08 ` Florent Tomasin
  2025-01-30 13:34   ` Maxime Ripard
  2025-01-30 13:08 ` [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding Florent Tomasin
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 48+ messages in thread
From: Florent Tomasin @ 2025-01-30 13:08 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel,
	Florent Tomasin

This patch introduces a cma-heap probe function, allowing
users to register custom CMA heaps in the device tree.

A "memory-region" is bound to the CMA heap at probe time,
allowing allocation of DMA buffers from that heap.

Use cases:
- registration of carved-out secure heaps. Some devices
  implement secure memory by reserving specific memory
  regions for that purpose. For example, this is the case
  on platforms making use of early versions of
  ARM TrustZone.
- registration of multiple memory regions at different
  locations for efficiency or HW integration reasons.
  For example, a peripheral may expect to share data at a
  specific location in RAM. This information could have been
  programmed by a FW prior to kernel boot.

* Zeroing of CMA heap allocations:
In the case of secure CMA heaps used along with ARM TrustZone,
zeroing the secure memory could result in a bus fault if
performed with `kmap_atomic()` or `page_address()`.
To prevent such a scenario, the zeroing of the pages is done
through a virtual pointer acquired from `vmap()` with
`pgprot_writecombine(PAGE_KERNEL)` as argument.

* Idea of improvement:
This patch could impact the performance of some devices as
a result of using `pgprot_writecombine(PAGE_KERNEL)`.
This could be avoided by allowing control of this argument
via a parameter of some sort. The driver could then use
`pgprot_writecombine(PAGE_KERNEL)` or not, according to
the use case defined by the system integrator.

* Note to the reviewers:
The patch was used for the development of the protected mode
feature in the Panthor CSF kernel driver and is not initially
intended to land in the Linux kernel. It is mostly relevant if
someone wants to reproduce the testing environment. Please raise
interest if you think the patch has value in the Linux kernel.

Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
---
 drivers/dma-buf/heaps/cma_heap.c | 120 +++++++++++++++++++++----------
 1 file changed, 81 insertions(+), 39 deletions(-)

diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c
index 9512d050563a..8f17221311fd 100644
--- a/drivers/dma-buf/heaps/cma_heap.c
+++ b/drivers/dma-buf/heaps/cma_heap.c
@@ -18,6 +18,9 @@
 #include <linux/io.h>
 #include <linux/mm.h>
 #include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_reserved_mem.h>
+#include <linux/platform_device.h>
 #include <linux/scatterlist.h>
 #include <linux/slab.h>
 #include <linux/vmalloc.h>
@@ -186,6 +189,7 @@ static int cma_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
 
 	vma->vm_ops = &dma_heap_vm_ops;
 	vma->vm_private_data = buffer;
+	vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
 
 	return 0;
 }
@@ -194,7 +198,7 @@ static void *cma_heap_do_vmap(struct cma_heap_buffer *buffer)
 {
 	void *vaddr;
 
-	vaddr = vmap(buffer->pages, buffer->pagecount, VM_MAP, PAGE_KERNEL);
+	vaddr = vmap(buffer->pages, buffer->pagecount, VM_MAP, pgprot_writecombine(PAGE_KERNEL));
 	if (!vaddr)
 		return ERR_PTR(-ENOMEM);
 
@@ -286,6 +290,7 @@ static struct dma_buf *cma_heap_allocate(struct dma_heap *heap,
 	struct page *cma_pages;
 	struct dma_buf *dmabuf;
 	int ret = -ENOMEM;
+	void *vaddr;
 	pgoff_t pg;
 
 	buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
@@ -303,29 +308,6 @@ static struct dma_buf *cma_heap_allocate(struct dma_heap *heap,
 	if (!cma_pages)
 		goto free_buffer;
 
-	/* Clear the cma pages */
-	if (PageHighMem(cma_pages)) {
-		unsigned long nr_clear_pages = pagecount;
-		struct page *page = cma_pages;
-
-		while (nr_clear_pages > 0) {
-			void *vaddr = kmap_local_page(page);
-
-			memset(vaddr, 0, PAGE_SIZE);
-			kunmap_local(vaddr);
-			/*
-			 * Avoid wasting time zeroing memory if the process
-			 * has been killed by SIGKILL.
-			 */
-			if (fatal_signal_pending(current))
-				goto free_cma;
-			page++;
-			nr_clear_pages--;
-		}
-	} else {
-		memset(page_address(cma_pages), 0, size);
-	}
-
 	buffer->pages = kmalloc_array(pagecount, sizeof(*buffer->pages), GFP_KERNEL);
 	if (!buffer->pages) {
 		ret = -ENOMEM;
@@ -335,6 +317,14 @@ static struct dma_buf *cma_heap_allocate(struct dma_heap *heap,
 	for (pg = 0; pg < pagecount; pg++)
 		buffer->pages[pg] = &cma_pages[pg];
 
+	/* Clear the cma pages */
+	vaddr = vmap(buffer->pages, pagecount, VM_MAP, pgprot_writecombine(PAGE_KERNEL));
+	if (!vaddr)
+		goto free_pages;
+
+	memset(vaddr, 0, size);
+	vunmap(vaddr);
+
 	buffer->cma_pages = cma_pages;
 	buffer->heap = cma_heap;
 	buffer->pagecount = pagecount;
@@ -366,17 +356,79 @@ static const struct dma_heap_ops cma_heap_ops = {
 	.allocate = cma_heap_allocate,
 };
 
-static int __init __add_cma_heap(struct cma *cma, void *data)
+static int cma_heap_probe(struct platform_device *pdev)
 {
+	struct dma_heap_export_info *exp_info;
+	struct cma_heap *cma_heap;
+	int ret;
+
+	exp_info = devm_kzalloc(&pdev->dev, sizeof(*exp_info), GFP_KERNEL);
+	if (IS_ERR_OR_NULL(exp_info))
+		return -ENOMEM;
+
+	cma_heap = devm_kzalloc(&pdev->dev, sizeof(*cma_heap), GFP_KERNEL);
+	if (IS_ERR_OR_NULL(cma_heap))
+		return -ENOMEM;
+
+	ret = of_reserved_mem_device_init(&pdev->dev);
+	if (ret)
+		return ret;
+
+	cma_heap->cma = dev_get_cma_area(&pdev->dev);
+	if (!cma_heap->cma) {
+		ret = -EINVAL;
+		goto error_reserved_mem;
+	}
+
+	exp_info->name = cma_get_name(cma_heap->cma);
+	exp_info->ops = &cma_heap_ops;
+	exp_info->priv = cma_heap;
+
+	cma_heap->heap = dma_heap_add(exp_info);
+	if (IS_ERR(cma_heap->heap)) {
+		ret = PTR_ERR(cma_heap->heap);
+		goto error_reserved_mem;
+	}
+
+	return 0;
+
+error_reserved_mem:
+	of_reserved_mem_device_release(&pdev->dev);
+
+	return ret;
+}
+
+static const struct of_device_id dt_match[] = {
+	{ .compatible = "linux,cma" },
+	{}
+};
+MODULE_DEVICE_TABLE(of, dt_match);
+
+static struct platform_driver cma_heap_driver = {
+	.probe = cma_heap_probe,
+	.driver = {
+		.name = "linux,cma",
+		.of_match_table = dt_match,
+	},
+};
+
+static int __init cma_heap_init(void)
+{
+	struct cma *cma_area = dev_get_cma_area(NULL);
 	struct cma_heap *cma_heap;
 	struct dma_heap_export_info exp_info;
 
+	if (!cma_area)
+		return -EINVAL;
+
+	/* Add default CMA heap */
 	cma_heap = kzalloc(sizeof(*cma_heap), GFP_KERNEL);
 	if (!cma_heap)
 		return -ENOMEM;
-	cma_heap->cma = cma;
 
-	exp_info.name = cma_get_name(cma);
+	cma_heap->cma = cma_area;
+
+	exp_info.name = cma_get_name(cma_area);
 	exp_info.ops = &cma_heap_ops;
 	exp_info.priv = cma_heap;
 
@@ -388,18 +440,8 @@ static int __init __add_cma_heap(struct cma *cma, void *data)
 		return ret;
 	}
 
-	return 0;
+	return platform_driver_register(&cma_heap_driver);
 }
 
-static int __init add_default_cma_heap(void)
-{
-	struct cma *default_cma = dev_get_cma_area(NULL);
-	int ret = 0;
-
-	if (default_cma)
-		ret = __add_cma_heap(default_cma, NULL);
-
-	return ret;
-}
-module_init(add_default_cma_heap);
+module_init(cma_heap_init);
 MODULE_DESCRIPTION("DMA-BUF CMA Heap");
-- 
2.34.1



* [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
  2025-01-30 13:08 [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Florent Tomasin
  2025-01-30 13:08 ` [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings Florent Tomasin
  2025-01-30 13:08 ` [RFC PATCH 2/5] cma-heap: Allow registration of custom cma heaps Florent Tomasin
@ 2025-01-30 13:08 ` Florent Tomasin
  2025-01-30 13:25   ` Krzysztof Kozlowski
  2025-01-30 13:09 ` [RFC PATCH 4/5] drm/panthor: Add support for protected memory allocation in panthor Florent Tomasin
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 48+ messages in thread
From: Florent Tomasin @ 2025-01-30 13:08 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel,
	Florent Tomasin

Allow the mali-valhall-csf driver to retrieve a protected
heap at probe time by passing the name of the heap as an
attribute of the GPU device tree node.

Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
---
 .../devicetree/bindings/gpu/arm,mali-valhall-csf.yaml       | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
index a5b4e0021758..dc633b037ede 100644
--- a/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
+++ b/Documentation/devicetree/bindings/gpu/arm,mali-valhall-csf.yaml
@@ -85,6 +85,12 @@ properties:
 
   dma-coherent: true
 
+  protected-heap-name:
+    $ref: /schemas/types.yaml#/definitions/string
+    description:
+      Specifies the name of the protected Heap from
+      which the GPU driver allocates protected memory.
+
 required:
   - compatible
   - reg
-- 
2.34.1



* [RFC PATCH 4/5] drm/panthor: Add support for protected memory allocation in panthor
  2025-01-30 13:08 [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Florent Tomasin
                   ` (2 preceding siblings ...)
  2025-01-30 13:08 ` [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding Florent Tomasin
@ 2025-01-30 13:09 ` Florent Tomasin
  2025-02-11 11:04   ` Boris Brezillon
  2025-03-12 20:05   ` Adrian Larumbe
  2025-01-30 13:09 ` [RFC PATCH 5/5] drm/panthor: Add support for entering and exiting protected mode Florent Tomasin
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 48+ messages in thread
From: Florent Tomasin @ 2025-01-30 13:09 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel,
	Florent Tomasin

This patch allows Panthor to allocate buffer objects from a
protected heap. The Panthor driver should be seen as a consumer
of the heap and not an exporter.

To help with the review of this patch, here is important information
about the Mali GPU protected mode support:
- On CSF FW load, the Panthor driver must allocate a protected
  buffer object to hold data to use by the FW when in protected
  mode. This protected buffer object is owned by the device
  and does not belong to a process.
- On CSG creation, the Panthor driver must allocate a protected
  suspend buffer object for the FW to store data when suspending
  the CSG while in protected mode. The kernel owns this allocation
  and does not allow user space mapping. The format of the data
  in this buffer is only known by the FW and does not need to be
  shared with other entities.

To summarize, Mali GPUs require allocations of protected buffer
objects at the kernel level.

* How is the protected heap accessed by the Panthor driver?
The driver retrieves the protected heap using the heap name
provided to the driver via a DTB attribute.
If the heap is not yet available, the Panthor driver defers
the probe until the heap is created. It is an integration
error to provide a heap name that does not exist or is never
created in the DTB.

* How does the Panthor driver allocate from the heap?
Panthor calls the DMA heap allocation function and
obtains a DMA buffer from it. This buffer is then
registered with GEM via PRIME by importing the DMA buffer.

Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
---
 drivers/gpu/drm/panthor/Kconfig          |  1 +
 drivers/gpu/drm/panthor/panthor_device.c | 22 ++++++++++-
 drivers/gpu/drm/panthor/panthor_device.h |  7 ++++
 drivers/gpu/drm/panthor/panthor_fw.c     | 36 +++++++++++++++--
 drivers/gpu/drm/panthor/panthor_fw.h     |  2 +
 drivers/gpu/drm/panthor/panthor_gem.c    | 49 ++++++++++++++++++++++--
 drivers/gpu/drm/panthor/panthor_gem.h    | 16 +++++++-
 drivers/gpu/drm/panthor/panthor_heap.c   |  2 +
 drivers/gpu/drm/panthor/panthor_sched.c  |  5 ++-
 9 files changed, 130 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/panthor/Kconfig b/drivers/gpu/drm/panthor/Kconfig
index 55b40ad07f3b..c0208b886d9f 100644
--- a/drivers/gpu/drm/panthor/Kconfig
+++ b/drivers/gpu/drm/panthor/Kconfig
@@ -7,6 +7,7 @@ config DRM_PANTHOR
 	depends on !GENERIC_ATOMIC64  # for IOMMU_IO_PGTABLE_LPAE
 	depends on MMU
 	select DEVFREQ_GOV_SIMPLE_ONDEMAND
+	select DMABUF_HEAPS
 	select DRM_EXEC
 	select DRM_GEM_SHMEM_HELPER
 	select DRM_GPUVM
diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
index 00f7b8ce935a..1018e5c90a0e 100644
--- a/drivers/gpu/drm/panthor/panthor_device.c
+++ b/drivers/gpu/drm/panthor/panthor_device.c
@@ -4,7 +4,9 @@
 /* Copyright 2023 Collabora ltd. */
 
 #include <linux/clk.h>
+#include <linux/dma-heap.h>
 #include <linux/mm.h>
+#include <linux/of.h>
 #include <linux/platform_device.h>
 #include <linux/pm_domain.h>
 #include <linux/pm_runtime.h>
@@ -102,6 +104,9 @@ void panthor_device_unplug(struct panthor_device *ptdev)
 	panthor_mmu_unplug(ptdev);
 	panthor_gpu_unplug(ptdev);
 
+	if (ptdev->protm.heap)
+		dma_heap_put(ptdev->protm.heap);
+
 	pm_runtime_dont_use_autosuspend(ptdev->base.dev);
 	pm_runtime_put_sync_suspend(ptdev->base.dev);
 
@@ -172,6 +177,7 @@ int panthor_device_init(struct panthor_device *ptdev)
 	u32 *dummy_page_virt;
 	struct resource *res;
 	struct page *p;
+	const char *protm_heap_name;
 	int ret;
 
 	ret = panthor_gpu_coherency_init(ptdev);
@@ -246,9 +252,19 @@ int panthor_device_init(struct panthor_device *ptdev)
 			return ret;
 	}
 
+	/* If a protected heap is specified but not found, defer the probe until created */
+	if (!of_property_read_string(ptdev->base.dev->of_node, "protected-heap-name",
+				     &protm_heap_name)) {
+		ptdev->protm.heap = dma_heap_find(protm_heap_name);
+		if (!ptdev->protm.heap) {
+			ret = -EPROBE_DEFER;
+			goto err_rpm_put;
+		}
+	}
+
 	ret = panthor_gpu_init(ptdev);
 	if (ret)
-		goto err_rpm_put;
+		goto err_dma_heap_put;
 
 	ret = panthor_mmu_init(ptdev);
 	if (ret)
@@ -286,6 +302,10 @@ int panthor_device_init(struct panthor_device *ptdev)
 err_unplug_gpu:
 	panthor_gpu_unplug(ptdev);
 
+err_dma_heap_put:
+	if (ptdev->protm.heap)
+		dma_heap_put(ptdev->protm.heap);
+
 err_rpm_put:
 	pm_runtime_put_sync_suspend(ptdev->base.dev);
 	return ret;
diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 0e68f5a70d20..406de9e888e2 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -7,6 +7,7 @@
 #define __PANTHOR_DEVICE_H__
 
 #include <linux/atomic.h>
+#include <linux/dma-heap.h>
 #include <linux/io-pgtable.h>
 #include <linux/regulator/consumer.h>
 #include <linux/sched.h>
@@ -190,6 +191,12 @@ struct panthor_device {
 
 	/** @fast_rate: Maximum device clock frequency. Set by DVFS */
 	unsigned long fast_rate;
+
+	/** @protm: Protected mode related data. */
+	struct {
+		/** @heap: Pointer to the protected heap */
+		struct dma_heap *heap;
+	} protm;
 };
 
 struct panthor_gpu_usage {
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 4a2e36504fea..7822af1533b4 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -458,6 +458,7 @@ panthor_fw_alloc_queue_iface_mem(struct panthor_device *ptdev,
 
 	mem = panthor_kernel_bo_create(ptdev, ptdev->fw->vm, SZ_8K,
 				       DRM_PANTHOR_BO_NO_MMAP,
+				       0,
 				       DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
 				       DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
 				       PANTHOR_VM_KERNEL_AUTO_VA);
@@ -491,6 +492,28 @@ panthor_fw_alloc_suspend_buf_mem(struct panthor_device *ptdev, size_t size)
 {
 	return panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev), size,
 					DRM_PANTHOR_BO_NO_MMAP,
+					0,
+					DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
+					PANTHOR_VM_KERNEL_AUTO_VA);
+}
+
+/**
+ * panthor_fw_alloc_protm_suspend_buf_mem() - Allocate a protm suspend buffer
+ * for a command stream group.
+ * @ptdev: Device.
+ * @size: Size of the protm suspend buffer.
+ *
+ * Return: A valid pointer in case of success, NULL if no protected heap, an ERR_PTR() otherwise.
+ */
+struct panthor_kernel_bo *
+panthor_fw_alloc_protm_suspend_buf_mem(struct panthor_device *ptdev, size_t size)
+{
+	if (!ptdev->protm.heap)
+		return NULL;
+
+	return panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev), size,
+					DRM_PANTHOR_BO_NO_MMAP,
+					DRM_PANTHOR_KBO_PROTECTED_HEAP,
 					DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
 					PANTHOR_VM_KERNEL_AUTO_VA);
 }
@@ -503,6 +526,7 @@ static int panthor_fw_load_section_entry(struct panthor_device *ptdev,
 	ssize_t vm_pgsz = panthor_vm_page_size(ptdev->fw->vm);
 	struct panthor_fw_binary_section_entry_hdr hdr;
 	struct panthor_fw_section *section;
+	bool is_protm_section = false;
 	u32 section_size;
 	u32 name_len;
 	int ret;
@@ -541,10 +565,13 @@ static int panthor_fw_load_section_entry(struct panthor_device *ptdev,
 		return -EINVAL;
 	}
 
-	if (hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_PROT) {
+	if ((hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_PROT) && !ptdev->protm.heap) {
 		drm_warn(&ptdev->base,
 			 "Firmware protected mode entry not be supported, ignoring");
 		return 0;
+	} else if ((hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_PROT) && ptdev->protm.heap) {
+		drm_info(&ptdev->base, "Firmware protected mode entry supported");
+		is_protm_section = true;
 	}
 
 	if (hdr.va.start == CSF_MCU_SHARED_REGION_START &&
@@ -610,9 +637,10 @@ static int panthor_fw_load_section_entry(struct panthor_device *ptdev,
 			vm_map_flags |= DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED;
 
 		section->mem = panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev),
-							section_size,
-							DRM_PANTHOR_BO_NO_MMAP,
-							vm_map_flags, va);
+					section_size,
+					DRM_PANTHOR_BO_NO_MMAP,
+					(is_protm_section ? DRM_PANTHOR_KBO_PROTECTED_HEAP : 0),
+					vm_map_flags, va);
 		if (IS_ERR(section->mem))
 			return PTR_ERR(section->mem);
 
diff --git a/drivers/gpu/drm/panthor/panthor_fw.h b/drivers/gpu/drm/panthor/panthor_fw.h
index 22448abde992..29042d0dc60c 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.h
+++ b/drivers/gpu/drm/panthor/panthor_fw.h
@@ -481,6 +481,8 @@ panthor_fw_alloc_queue_iface_mem(struct panthor_device *ptdev,
 				 u32 *input_fw_va, u32 *output_fw_va);
 struct panthor_kernel_bo *
 panthor_fw_alloc_suspend_buf_mem(struct panthor_device *ptdev, size_t size);
+struct panthor_kernel_bo *
+panthor_fw_alloc_protm_suspend_buf_mem(struct panthor_device *ptdev, size_t size);
 
 struct panthor_vm *panthor_fw_vm(struct panthor_device *ptdev);
 
diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
index 8244a4e6c2a2..88caf928acd0 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.c
+++ b/drivers/gpu/drm/panthor/panthor_gem.c
@@ -9,10 +9,14 @@
 
 #include <drm/panthor_drm.h>
 
+#include <uapi/linux/dma-heap.h>
+
 #include "panthor_device.h"
 #include "panthor_gem.h"
 #include "panthor_mmu.h"
 
+MODULE_IMPORT_NS(DMA_BUF);
+
 static void panthor_gem_free_object(struct drm_gem_object *obj)
 {
 	struct panthor_gem_object *bo = to_panthor_bo(obj);
@@ -31,6 +35,7 @@ static void panthor_gem_free_object(struct drm_gem_object *obj)
  */
 void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
 {
+	struct dma_buf *dma_bo = NULL;
 	struct panthor_vm *vm;
 	int ret;
 
@@ -38,6 +43,10 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
 		return;
 
 	vm = bo->vm;
+
+	if (bo->flags & DRM_PANTHOR_KBO_PROTECTED_HEAP)
+		dma_bo = bo->obj->import_attach->dmabuf;
+
 	panthor_kernel_bo_vunmap(bo);
 
 	if (drm_WARN_ON(bo->obj->dev,
@@ -51,6 +60,9 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
 	panthor_vm_free_va(vm, &bo->va_node);
 	drm_gem_object_put(bo->obj);
 
+	if (dma_bo)
+		dma_buf_put(dma_bo);
+
 out_free_bo:
 	panthor_vm_put(vm);
 	kfree(bo);
@@ -62,6 +74,7 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
  * @vm: VM to map the GEM to. If NULL, the kernel object is not GPU mapped.
  * @size: Size of the buffer object.
  * @bo_flags: Combination of drm_panthor_bo_flags flags.
+ * @kbo_flags: Combination of drm_panthor_kbo_flags flags.
  * @vm_map_flags: Combination of drm_panthor_vm_bind_op_flags (only those
  * that are related to map operations).
  * @gpu_va: GPU address assigned when mapping to the VM.
@@ -72,9 +85,11 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
  */
 struct panthor_kernel_bo *
 panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
-			 size_t size, u32 bo_flags, u32 vm_map_flags,
+			 size_t size, u32 bo_flags, u32 kbo_flags, u32 vm_map_flags,
 			 u64 gpu_va)
 {
+	struct dma_buf *dma_bo = NULL;
+	struct drm_gem_object *gem_obj = NULL;
 	struct drm_gem_shmem_object *obj;
 	struct panthor_kernel_bo *kbo;
 	struct panthor_gem_object *bo;
@@ -87,14 +102,38 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
 	if (!kbo)
 		return ERR_PTR(-ENOMEM);
 
-	obj = drm_gem_shmem_create(&ptdev->base, size);
+	if (kbo_flags & DRM_PANTHOR_KBO_PROTECTED_HEAP) {
+		if (!ptdev->protm.heap) {
+			ret = -EINVAL;
+			goto err_free_bo;
+		}
+
+		dma_bo = dma_heap_buffer_alloc(ptdev->protm.heap, size,
+					       DMA_HEAP_VALID_FD_FLAGS, DMA_HEAP_VALID_HEAP_FLAGS);
+		if (!dma_bo) {
+			ret = -ENOMEM;
+			goto err_free_bo;
+		}
+
+		gem_obj = drm_gem_prime_import(&ptdev->base, dma_bo);
+		if (IS_ERR(gem_obj)) {
+			ret = PTR_ERR(gem_obj);
+			goto err_free_dma_bo;
+		}
+
+		obj = to_drm_gem_shmem_obj(gem_obj);
+	} else {
+		obj = drm_gem_shmem_create(&ptdev->base, size);
+	}
+
 	if (IS_ERR(obj)) {
 		ret = PTR_ERR(obj);
-		goto err_free_bo;
+		goto err_free_dma_bo;
 	}
 
 	bo = to_panthor_bo(&obj->base);
 	kbo->obj = &obj->base;
+	kbo->flags = kbo_flags;
 	bo->flags = bo_flags;
 
 	/* The system and GPU MMU page size might differ, which becomes a
@@ -124,6 +163,10 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
 err_put_obj:
 	drm_gem_object_put(&obj->base);
 
+err_free_dma_bo:
+	if (dma_bo)
+		dma_buf_put(dma_bo);
+
 err_free_bo:
 	kfree(kbo);
 	return ERR_PTR(ret);
diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
index e43021cf6d45..d4fe8ae9f0a8 100644
--- a/drivers/gpu/drm/panthor/panthor_gem.h
+++ b/drivers/gpu/drm/panthor/panthor_gem.h
@@ -13,6 +13,17 @@
 
 struct panthor_vm;
 
+/**
+ * enum drm_panthor_kbo_flags -  Kernel buffer object flags, passed at creation time
+ */
+enum drm_panthor_kbo_flags {
+	/**
+	 * @DRM_PANTHOR_KBO_PROTECTED_HEAP: The buffer object will be allocated
+	 * from a DMA-Buf protected heap.
+	 */
+	DRM_PANTHOR_KBO_PROTECTED_HEAP = (1 << 0),
+};
+
 /**
  * struct panthor_gem_object - Driver specific GEM object.
  */
@@ -75,6 +86,9 @@ struct panthor_kernel_bo {
 	 * @kmap: Kernel CPU mapping of @gem.
 	 */
 	void *kmap;
+
+	/** @flags: Combination of drm_panthor_kbo_flags flags. */
+	u32 flags;
 };
 
 static inline
@@ -138,7 +152,7 @@ panthor_kernel_bo_vunmap(struct panthor_kernel_bo *bo)
 
 struct panthor_kernel_bo *
 panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
-			 size_t size, u32 bo_flags, u32 vm_map_flags,
+			 size_t size, u32 bo_flags, u32 kbo_flags, u32 vm_map_flags,
 			 u64 gpu_va);
 
 void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo);
diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
index 3796a9eb22af..5395f0d90360 100644
--- a/drivers/gpu/drm/panthor/panthor_heap.c
+++ b/drivers/gpu/drm/panthor/panthor_heap.c
@@ -146,6 +146,7 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
 
 	chunk->bo = panthor_kernel_bo_create(ptdev, vm, heap->chunk_size,
 					     DRM_PANTHOR_BO_NO_MMAP,
+					     0,
 					     DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
 					     PANTHOR_VM_KERNEL_AUTO_VA);
 	if (IS_ERR(chunk->bo)) {
@@ -549,6 +550,7 @@ panthor_heap_pool_create(struct panthor_device *ptdev, struct panthor_vm *vm)
 
 	pool->gpu_contexts = panthor_kernel_bo_create(ptdev, vm, bosize,
 						      DRM_PANTHOR_BO_NO_MMAP,
+						      0,
 						      DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
 						      PANTHOR_VM_KERNEL_AUTO_VA);
 	if (IS_ERR(pool->gpu_contexts)) {
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index ef4bec7ff9c7..e260ed8aef5b 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -3298,6 +3298,7 @@ group_create_queue(struct panthor_group *group,
 	queue->ringbuf = panthor_kernel_bo_create(group->ptdev, group->vm,
 						  args->ringbuf_size,
 						  DRM_PANTHOR_BO_NO_MMAP,
+						  0,
 						  DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
 						  DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
 						  PANTHOR_VM_KERNEL_AUTO_VA);
@@ -3328,6 +3329,7 @@ group_create_queue(struct panthor_group *group,
 					 queue->profiling.slot_count *
 					 sizeof(struct panthor_job_profiling_data),
 					 DRM_PANTHOR_BO_NO_MMAP,
+					 0,
 					 DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
 					 DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
 					 PANTHOR_VM_KERNEL_AUTO_VA);
@@ -3435,7 +3437,7 @@ int panthor_group_create(struct panthor_file *pfile,
 	}
 
 	suspend_size = csg_iface->control->protm_suspend_size;
-	group->protm_suspend_buf = panthor_fw_alloc_suspend_buf_mem(ptdev, suspend_size);
+	group->protm_suspend_buf = panthor_fw_alloc_protm_suspend_buf_mem(ptdev, suspend_size);
 	if (IS_ERR(group->protm_suspend_buf)) {
 		ret = PTR_ERR(group->protm_suspend_buf);
 		group->protm_suspend_buf = NULL;
@@ -3446,6 +3448,7 @@ int panthor_group_create(struct panthor_file *pfile,
 						   group_args->queues.count *
 						   sizeof(struct panthor_syncobj_64b),
 						   DRM_PANTHOR_BO_NO_MMAP,
+						   0,
 						   DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
 						   DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
 						   PANTHOR_VM_KERNEL_AUTO_VA);
-- 
2.34.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [RFC PATCH 5/5] drm/panthor: Add support for entering and exiting protected mode
  2025-01-30 13:08 [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Florent Tomasin
                   ` (3 preceding siblings ...)
  2025-01-30 13:09 ` [RFC PATCH 4/5] drm/panthor: Add support for protected memory allocation in panthor Florent Tomasin
@ 2025-01-30 13:09 ` Florent Tomasin
  2025-02-10 14:01   ` Boris Brezillon
  2025-01-30 13:46 ` [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Maxime Ripard
  2025-01-30 16:15 ` Simona Vetter
  6 siblings, 1 reply; 48+ messages in thread
From: Florent Tomasin @ 2025-01-30 13:09 UTC (permalink / raw)
  To: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel,
	Florent Tomasin

This patch modifies the Panthor driver code to allow handling
of GPU HW protected mode entry and exit.

The logic added by this patch only includes the mechanisms
needed for entering and exiting protected mode. The submission
of protected mode jobs is not covered by this patch series
and is the responsibility of the user space program.

To help with the review, here is some important information
about Mali GPU protected mode entry and exit:
- When the GPU detects a protected mode job needs to be
  executed, an IRQ is sent to the CPU to notify the kernel
  driver that the job is blocked until the GPU has entered
  protected mode. The entering of protected mode is controlled
  by the kernel driver.
- The Mali Panthor CSF driver will schedule a tick and
  evaluate which CSs, among the CSGs scheduled on slots,
  need protected mode. If the priority of the CSG is not
  sufficiently high, the protected mode job will not
  progress until the CSG is scheduled at top priority.
- The Panthor scheduler notifies the GPU that the blocked
  protected jobs will soon be able to progress.
- Once all CSG and CS slots are updated, the scheduler
  requests the GPU to enter protected mode and waits for
  it to be acknowledged.
- If successful, all protected mode jobs will resume execution
  while normal mode jobs block until the GPU exits
  protected mode, or the kernel driver rotates the CSGs
  and forces the GPU to exit protected mode.
- If unsuccessful, the scheduler will request a GPU reset.
- When a protected mode job is suspended as a result of
  the CSGs rotation, the GPU will send an IRQ to the CPU
  to notify that the protected mode job needs to resume.

This sequence will continue for as long as user space keeps
submitting protected mode jobs.

Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
---
 drivers/gpu/drm/panthor/panthor_device.h |   3 +
 drivers/gpu/drm/panthor/panthor_fw.c     |  10 +-
 drivers/gpu/drm/panthor/panthor_sched.c  | 119 +++++++++++++++++++++--
 3 files changed, 122 insertions(+), 10 deletions(-)

diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
index 406de9e888e2..0c76bfd392a0 100644
--- a/drivers/gpu/drm/panthor/panthor_device.h
+++ b/drivers/gpu/drm/panthor/panthor_device.h
@@ -196,6 +196,9 @@ struct panthor_device {
 	struct {
 		/** @heap: Pointer to the protected heap */
 		struct dma_heap *heap;
+
+		/** @pending: Set to true if a protected mode enter request is pending. */
+		bool pending;
 	} protm;
 };
 
diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
index 7822af1533b4..2006d652f4db 100644
--- a/drivers/gpu/drm/panthor/panthor_fw.c
+++ b/drivers/gpu/drm/panthor/panthor_fw.c
@@ -1025,13 +1025,19 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
 	glb_iface->input->progress_timer = PROGRESS_TIMEOUT_CYCLES >> PROGRESS_TIMEOUT_SCALE_SHIFT;
 	glb_iface->input->idle_timer = panthor_fw_conv_timeout(ptdev, IDLE_HYSTERESIS_US);
 
-	/* Enable interrupts we care about. */
+	/* Enable interrupts we care about.
+	 *
+	 * GLB_PROTM_ENTER and GLB_PROTM_EXIT interrupts are only
+	 * relevant if a protected memory heap is present.
+	 */
 	glb_iface->input->ack_irq_mask = GLB_CFG_ALLOC_EN |
 					 GLB_PING |
 					 GLB_CFG_PROGRESS_TIMER |
 					 GLB_CFG_POWEROFF_TIMER |
 					 GLB_IDLE_EN |
-					 GLB_IDLE;
+					 GLB_IDLE |
+					 (ptdev->protm.heap ?
+					 (GLB_PROTM_ENTER | GLB_PROTM_EXIT) : 0);
 
 	panthor_fw_update_reqs(glb_iface, req, GLB_IDLE_EN, GLB_IDLE_EN);
 	panthor_fw_toggle_reqs(glb_iface, req, ack,
diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
index e260ed8aef5b..c10a21f9d075 100644
--- a/drivers/gpu/drm/panthor/panthor_sched.c
+++ b/drivers/gpu/drm/panthor/panthor_sched.c
@@ -573,6 +573,9 @@ struct panthor_group {
 	/** @fatal_queues: Bitmask reflecting the queues that hit a fatal exception. */
 	u32 fatal_queues;
 
+	/** @protm_queues: Bitmask reflecting the queues that are waiting on a CS_PROTM_PENDING. */
+	u32 protm_queues;
+
 	/** @tiler_oom: Mask of queues that have a tiler OOM event to process. */
 	atomic_t tiler_oom;
 
@@ -870,6 +873,31 @@ panthor_queue_get_syncwait_obj(struct panthor_group *group, struct panthor_queue
 	return NULL;
 }
 
+static int glb_protm_enter(struct panthor_device *ptdev)
+{
+	struct panthor_fw_global_iface *glb_iface;
+	u32 acked;
+	int ret;
+
+	lockdep_assert_held(&ptdev->scheduler->lock);
+
+	if (!ptdev->protm.pending)
+		return 0;
+
+	glb_iface = panthor_fw_get_glb_iface(ptdev);
+
+	panthor_fw_toggle_reqs(glb_iface, req, ack, GLB_PROTM_ENTER);
+	gpu_write(ptdev, CSF_DOORBELL(CSF_GLB_DOORBELL_ID), 1);
+
+	ret = panthor_fw_glb_wait_acks(ptdev, GLB_PROTM_ENTER, &acked, 4000);
+	if (ret)
+		drm_err(&ptdev->base, "FW protm enter timeout, scheduling a reset");
+	else
+		ptdev->protm.pending = false;
+
+	return ret;
+}
+
 static void group_free_queue(struct panthor_group *group, struct panthor_queue *queue)
 {
 	if (IS_ERR_OR_NULL(queue))
@@ -1027,6 +1055,7 @@ group_unbind_locked(struct panthor_group *group)
  * @ptdev: Device.
  * @csg_id: Group slot ID.
  * @cs_id: Queue slot ID.
+ * @protm_ack: Acknowledge pending protected mode queues
  *
  * Program a queue slot with the queue information so things can start being
  * executed on this queue.
@@ -1034,10 +1063,13 @@ group_unbind_locked(struct panthor_group *group)
  * The group slot must have a group bound to it already (group_bind_locked()).
  */
 static void
-cs_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 cs_id)
+cs_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 cs_id, bool protm_ack)
 {
-	struct panthor_queue *queue = ptdev->scheduler->csg_slots[csg_id].group->queues[cs_id];
+	struct panthor_group * const group = ptdev->scheduler->csg_slots[csg_id].group;
+	struct panthor_queue *queue = group->queues[cs_id];
 	struct panthor_fw_cs_iface *cs_iface = panthor_fw_get_cs_iface(ptdev, csg_id, cs_id);
+	u32 const cs_protm_pending_mask =
+		protm_ack && (group->protm_queues & BIT(cs_id)) ? CS_PROTM_PENDING : 0;
 
 	lockdep_assert_held(&ptdev->scheduler->lock);
 
@@ -1055,15 +1087,22 @@ cs_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 cs_id)
 			       CS_IDLE_SYNC_WAIT |
 			       CS_IDLE_EMPTY |
 			       CS_STATE_START |
-			       CS_EXTRACT_EVENT,
+			       CS_EXTRACT_EVENT |
+			       cs_protm_pending_mask,
 			       CS_IDLE_SYNC_WAIT |
 			       CS_IDLE_EMPTY |
 			       CS_STATE_MASK |
-			       CS_EXTRACT_EVENT);
+			       CS_EXTRACT_EVENT |
+			       CS_PROTM_PENDING);
 	if (queue->iface.input->insert != queue->iface.input->extract && queue->timeout_suspended) {
 		drm_sched_resume_timeout(&queue->scheduler, queue->remaining_time);
 		queue->timeout_suspended = false;
 	}
+
+	if (cs_protm_pending_mask) {
+		group->protm_queues &= ~BIT(cs_id);
+		ptdev->protm.pending = true;
+	}
 }
 
 /**
@@ -1274,7 +1313,7 @@ csg_slot_sync_state_locked(struct panthor_device *ptdev, u32 csg_id)
 }
 
 static int
-csg_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 priority)
+csg_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 priority, bool protm_ack)
 {
 	struct panthor_fw_csg_iface *csg_iface;
 	struct panthor_csg_slot *csg_slot;
@@ -1291,14 +1330,14 @@ csg_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 priority)
 
 	csg_slot = &ptdev->scheduler->csg_slots[csg_id];
 	group = csg_slot->group;
-	if (!group || group->state == PANTHOR_CS_GROUP_ACTIVE)
+	if (!group || (group->state == PANTHOR_CS_GROUP_ACTIVE && !protm_ack))
 		return 0;
 
 	csg_iface = panthor_fw_get_csg_iface(group->ptdev, csg_id);
 
 	for (i = 0; i < group->queue_count; i++) {
 		if (group->queues[i]) {
-			cs_slot_prog_locked(ptdev, csg_id, i);
+			cs_slot_prog_locked(ptdev, csg_id, i, protm_ack);
 			queue_mask |= BIT(i);
 		}
 	}
@@ -1329,6 +1368,34 @@ csg_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 priority)
 	return 0;
 }
 
+static void
+cs_slot_process_protm_pending_event_locked(struct panthor_device *ptdev,
+					   u32 csg_id, u32 cs_id)
+{
+	struct panthor_scheduler *sched = ptdev->scheduler;
+	struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id];
+	struct panthor_group *group = csg_slot->group;
+
+	lockdep_assert_held(&sched->lock);
+
+	if (!group)
+		return;
+
+	/* No protected memory heap: a user space program tried to
+	 * submit a protected mode job, resulting in the GPU raising
+	 * a CS_PROTM_PENDING request.
+	 *
+	 * This scenario is invalid and the protected mode jobs must
+	 * not be allowed to progress.
+	 */
+	if (drm_WARN_ON_ONCE(&ptdev->base, !ptdev->protm.heap))
+		return;
+
+	group->protm_queues |= BIT(cs_id);
+
+	sched_queue_delayed_work(sched, tick, 0);
+}
+
 static void
 cs_slot_process_fatal_event_locked(struct panthor_device *ptdev,
 				   u32 csg_id, u32 cs_id)
@@ -1566,6 +1633,9 @@ static bool cs_slot_process_irq_locked(struct panthor_device *ptdev,
 	if (events & CS_TILER_OOM)
 		cs_slot_process_tiler_oom_event_locked(ptdev, csg_id, cs_id);
 
+	if (events & CS_PROTM_PENDING)
+		cs_slot_process_protm_pending_event_locked(ptdev, csg_id, cs_id);
+
 	/* We don't acknowledge the TILER_OOM event since its handling is
 	 * deferred to a separate work.
 	 */
@@ -1703,6 +1773,17 @@ static void sched_process_idle_event_locked(struct panthor_device *ptdev)
 	sched_queue_delayed_work(ptdev->scheduler, tick, 0);
 }
 
+static void sched_process_protm_exit_event_locked(struct panthor_device *ptdev)
+{
+	struct panthor_fw_global_iface *glb_iface = panthor_fw_get_glb_iface(ptdev);
+
+	lockdep_assert_held(&ptdev->scheduler->lock);
+
+	/* Acknowledge the protm exit and schedule a tick. */
+	panthor_fw_update_reqs(glb_iface, req, glb_iface->output->ack, GLB_PROTM_EXIT);
+	sched_queue_delayed_work(ptdev->scheduler, tick, 0);
+}
+
 /**
  * sched_process_global_irq_locked() - Process the scheduling part of a global IRQ
  * @ptdev: Device.
@@ -1720,6 +1801,9 @@ static void sched_process_global_irq_locked(struct panthor_device *ptdev)
 
 	if (evts & GLB_IDLE)
 		sched_process_idle_event_locked(ptdev);
+
+	if (evts & GLB_PROTM_EXIT)
+		sched_process_protm_exit_event_locked(ptdev);
 }
 
 static void process_fw_events_work(struct work_struct *work)
@@ -2238,9 +2322,22 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
 		list_for_each_entry(group, &ctx->groups[prio], run_node) {
 			int csg_id = group->csg_id;
 			struct panthor_fw_csg_iface *csg_iface;
+			bool protm_ack = false;
+
+			/* The highest priority group has pending protected mode queues */
+			if (new_csg_prio == MAX_CSG_PRIO && group->protm_queues)
+				protm_ack = true;
 
 			if (csg_id >= 0) {
 				new_csg_prio--;
+
+				/* This group is on slot but at least one queue
+				 * is waiting for PROTM_ENTER.
+				 */
+				if (protm_ack)
+					csg_slot_prog_locked(ptdev, csg_id,
+							     new_csg_prio, protm_ack);
+
 				continue;
 			}
 
@@ -2251,7 +2348,7 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
 			csg_iface = panthor_fw_get_csg_iface(ptdev, csg_id);
 			csg_slot = &sched->csg_slots[csg_id];
 			group_bind_locked(group, csg_id);
-			csg_slot_prog_locked(ptdev, csg_id, new_csg_prio--);
+			csg_slot_prog_locked(ptdev, csg_id, new_csg_prio--, protm_ack);
 			csgs_upd_ctx_queue_reqs(ptdev, &upd_ctx, csg_id,
 						group->state == PANTHOR_CS_GROUP_SUSPENDED ?
 						CSG_STATE_RESUME : CSG_STATE_START,
@@ -2303,6 +2400,12 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
 
 	sched->used_csg_slot_count = ctx->group_count;
 	sched->might_have_idle_groups = ctx->idle_group_count > 0;
+
+	ret = glb_protm_enter(ptdev);
+	if (ret) {
+		panthor_device_schedule_reset(ptdev);
+		ctx->csg_upd_failed_mask = U32_MAX;
+	}
 }
 
 static u64
-- 
2.34.1



* Re: [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
  2025-01-30 13:08 ` [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding Florent Tomasin
@ 2025-01-30 13:25   ` Krzysztof Kozlowski
  2025-02-03 15:31     ` Florent Tomasin
  0 siblings, 1 reply; 48+ messages in thread
From: Krzysztof Kozlowski @ 2025-01-30 13:25 UTC (permalink / raw)
  To: Florent Tomasin, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Sumit Semwal, Benjamin Gaignard, Brian Starkey,
	John Stultz, T . J . Mercier, Christian König,
	Matthias Brugger, AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel

On 30/01/2025 14:08, Florent Tomasin wrote:
> Allow mali-valhall-csf driver to retrieve a protected
> heap at probe time by passing the name of the heap
> as attribute to the device tree GPU node.

Please wrap commit message according to Linux coding style / submission
process (neither too early nor over the limit):
https://elixir.bootlin.com/linux/v6.4-rc1/source/Documentation/process/submitting-patches.rst#L597

Why this cannot be passed by phandle, just like all reserved regions?

From where do you take these protected heaps? Firmware? This would
explain why no relation is here (no probe ordering, no device links,
nothing connecting separate devices).

Best regards,
Krzysztof


* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-01-30 13:08 ` [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings Florent Tomasin
@ 2025-01-30 13:28   ` Maxime Ripard
  2025-02-03 13:36     ` Florent Tomasin
  2025-01-30 23:20   ` Rob Herring
  1 sibling, 1 reply; 48+ messages in thread
From: Maxime Ripard @ 2025-01-30 13:28 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

[-- Attachment #1: Type: text/plain, Size: 2404 bytes --]

Hi,

On Thu, Jan 30, 2025 at 01:08:57PM +0000, Florent Tomasin wrote:
> Introduce a CMA Heap dt-binding allowing custom
> CMA heap registrations.
> 
> * Note to the reviewers:
> The patch was used for the development of the protected mode
> feature in the Panthor CSF kernel driver and is not initially
> intended to land in the Linux kernel. It is mostly relevant if
> someone wants to reproduce the testing environment. Please, raise
> interest if you think the patch has value in the Linux kernel.
> 
> Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
> ---
>  .../devicetree/bindings/dma/linux,cma.yml     | 43 +++++++++++++++++++
>  1 file changed, 43 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/dma/linux,cma.yml
> 
> diff --git a/Documentation/devicetree/bindings/dma/linux,cma.yml b/Documentation/devicetree/bindings/dma/linux,cma.yml
> new file mode 100644
> index 000000000000..c532e016bbe5
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/dma/linux,cma.yml
> @@ -0,0 +1,43 @@
> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/dma/linux-cma.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Custom Linux CMA heap
> +
> +description:
> +  The custom Linux CMA heap device tree node allows registering
> +  of multiple CMA heaps.
> +
> +  The CMA heap name will match the node name of the "memory-region".
> +
> +properties:
> +  compatible:
> +    enum:
> +      - linux,cma
> +
> +  memory-region:
> +    maxItems: 1
> +    description: |
> +      Phandle to the reserved memory node associated with the CMA Heap.
> +      The reserved memory node must follow this binding convention:
> +       - Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> +
> +examples:
> +  - |
> +    reserved-memory {
> +      #address-cells = <2>;
> +      #size-cells = <2>;
> +
> +      custom_cma_heap: custom-cma-heap {
> +        compatible = "shared-dma-pool";
> +        reg = <0x0 0x90600000 0x0 0x1000000>;
> +        reusable;
> +      };
> +    };
> +
> +    device_cma_heap: device-cma-heap {
> +      compatible = "linux,cma";
> +      memory-region = <&custom_cma_heap>;
> +    };

Isn't it redundant with the linux,cma-default shared-dma-pool property
already?

Maxime


* Re: [RFC PATCH 2/5] cma-heap: Allow registration of custom cma heaps
  2025-01-30 13:08 ` [RFC PATCH 2/5] cma-heap: Allow registration of custom cma heaps Florent Tomasin
@ 2025-01-30 13:34   ` Maxime Ripard
  2025-02-03 13:52     ` Florent Tomasin
  0 siblings, 1 reply; 48+ messages in thread
From: Maxime Ripard @ 2025-01-30 13:34 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel


Hi,

On Thu, Jan 30, 2025 at 01:08:58PM +0000, Florent Tomasin wrote:
> This patch introduces a cma-heap probe function, allowing
> users to register custom cma heaps in the device tree.
> 
> A "memory-region" is bound to the cma heap at probe time
> allowing allocation of DMA buffers from that heap.
> 
> Use cases:
> - registration of carved out secure heaps. Some devices
>   are implementing secure memory by reserving specific
>   memory regions for that purpose. For example, this is the
>   case of platforms making use of early versions of
>   ARM TrustZone.

In such a case, the CMA heap would de-facto become un-mappable for
userspace, right?

> - registration of multiple memory regions at different
>   locations for efficiency or HW integration reasons.
>   For example, a peripheral may expect to share data at a
>   specific location in RAM. This information could have been
>   programmed by a FW prior to the kernel boot.

How would you differentiate between them?

Maxime


* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-01-30 13:08 [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Florent Tomasin
                   ` (4 preceding siblings ...)
  2025-01-30 13:09 ` [RFC PATCH 5/5] drm/panthor: Add support for entering and exiting protected mode Florent Tomasin
@ 2025-01-30 13:46 ` Maxime Ripard
  2025-01-30 15:59   ` Nicolas Dufresne
  2025-01-30 16:15 ` Simona Vetter
  6 siblings, 1 reply; 48+ messages in thread
From: Maxime Ripard @ 2025-01-30 13:46 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel


Hi,

I started to review it, but it's probably best to discuss it here.

On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> Hi,
> 
> This is a patch series covering the support for protected mode execution in
> Mali Panthor CSF kernel driver.
> 
> The Mali CSF GPUs come with the support for protected mode execution at the
> HW level. This feature requires two main changes in the kernel driver:
> 
> 1) Configure the GPU with a protected buffer. The system must provide a DMA
>    heap from which the driver can allocate a protected buffer.
>    It can be a carved-out memory or dynamically allocated protected memory region.
>    Some system includes a trusted FW which is in charge of the protected memory.
>    Since this problem is integration specific, the Mali Panthor CSF kernel
>    driver must import the protected memory from a device specific exporter.

Why do you need a heap for it in the first place? My understanding of
your series is that you have a carved out memory region somewhere, and
you want to allocate from that carved out memory region your buffers.

How is that any different from using a reserved-memory region, adding
the reserved-memory property to the GPU device and doing all your
allocation through the usual dma_alloc_* API?

Or do you expect to have another dma-buf heap that would call into the
firmware to perform the allocations?

The semantics of the CMA heap allocations are a concern too.

Another question is how would you expect something like a secure
video-playback pipeline to operate with that kind of interface. Assuming
you have a secure codec, you would likely get that protected buffer from
the codec, right? So the most likely scenario would be to then import
that dma-buf into the GPU driver, but not allocate the buffer from it.

Overall, I think a better abstraction would indeed be to have a heap to
allocate your protected buffers from, and then import them into the
devices that need them. The responsibility would be on the userspace to
do so, but it already kind of does with your design anyway.

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-01-30 13:46 ` [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Maxime Ripard
@ 2025-01-30 15:59   ` Nicolas Dufresne
  2025-01-30 16:38     ` Maxime Ripard
  0 siblings, 1 reply; 48+ messages in thread
From: Nicolas Dufresne @ 2025-01-30 15:59 UTC (permalink / raw)
  To: Maxime Ripard, Florent Tomasin
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
> Hi,
> 
> I started to review it, but it's probably best to discuss it here.
> 
> On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> > Hi,
> > 
> > This is a patch series covering the support for protected mode execution in
> > Mali Panthor CSF kernel driver.
> > 
> > The Mali CSF GPUs come with the support for protected mode execution at the
> > HW level. This feature requires two main changes in the kernel driver:
> > 
> > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> >    heap from which the driver can allocate a protected buffer.
> >    It can be a carved-out memory region or a dynamically allocated protected memory region.
> >    Some systems include a trusted FW which is in charge of the protected memory.
> >    Since this problem is integration specific, the Mali Panthor CSF kernel
> >    driver must import the protected memory from a device specific exporter.
> 
> Why do you need a heap for it in the first place? My understanding of
> your series is that you have a carved-out memory region somewhere, and
> you want to allocate your buffers from that carved-out memory region.
> 
> How is that any different from using a reserved-memory region, adding
> the reserved-memory property to the GPU device and doing all your
> allocation through the usual dma_alloc_* API?

How do you then multiplex this region so it can be shared between
GPU/Camera/Display/Codec drivers and also userspace? Also, how the secure
memory is allocated/obtained is a process that can vary a lot between SoCs, so
implementation-detail assumptions should not be coded into the driver.

Nicolas

> 
> Or do you expect to have another dma-buf heap that would call into the
> firmware to perform the allocations?
> 
> The semantics of the CMA heap allocations are a concern too.
> 
> Another question is how would you expect something like a secure
> video-playback pipeline to operate with that kind of interface. Assuming
> you have a secure codec, you would likely get that protected buffer from
> the codec, right? So the most likely scenario would be to then import
> that dma-buf into the GPU driver, but not allocate the buffer from it.
> 
> Overall, I think a better abstraction would indeed be to have a heap to
> allocate your protected buffers from, and then import them into the
> devices that need them. The responsibility would be on the userspace to
> do so, but it already kind of does with your design anyway.
> 
> Maxime


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-01-30 13:08 [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Florent Tomasin
                   ` (5 preceding siblings ...)
  2025-01-30 13:46 ` [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Maxime Ripard
@ 2025-01-30 16:15 ` Simona Vetter
  2025-02-03  9:25   ` Boris Brezillon
  6 siblings, 1 reply; 48+ messages in thread
From: Simona Vetter @ 2025-01-30 16:15 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> Hi,
> 
> This is a patch series covering the support for protected mode execution in
> Mali Panthor CSF kernel driver.
> 
> The Mali CSF GPUs come with the support for protected mode execution at the
> HW level. This feature requires two main changes in the kernel driver:
> 
> 1) Configure the GPU with a protected buffer. The system must provide a DMA
>    heap from which the driver can allocate a protected buffer.
>    It can be a carved-out memory region or a dynamically allocated protected memory region.
>    Some systems include a trusted FW which is in charge of the protected memory.
>    Since this problem is integration specific, the Mali Panthor CSF kernel
>    driver must import the protected memory from a device specific exporter.
> 
> 2) Handle enter and exit of the GPU HW from normal to protected mode of execution.
>    FW sends a request for protected mode entry to the kernel driver.
>    The acknowledgment of that request is a scheduling decision. Effectively,
>    protected mode execution should not overrule normal mode of execution.
>    A fair distribution of execution time will guarantee that the overall performance
>    of the device, including the UI (usually executing in normal mode),
>    will not regress when a protected mode job is submitted by an application.
> 
> 
> Background
> ----------
> 
> The current Mali Panthor CSF driver does not allow a user space application to
> execute protected jobs on the GPU. This use case is quite common on end-user devices.
> A user may want to watch a video or render content that is under "Digital Rights
> Management" protection, or launch an application with private user data.
> 
> 1) User-space:
> 
>    In order for an application to execute protected jobs on a Mali CSF GPU, the
>    user space application must submit jobs to the GPU within a "protected region"
>    (a range of commands to execute in protected mode).
> 
>    Find here an example of a command buffer that contains protected commands:
> 
> ```
>           <--- Normal mode ---><--- Protected mode ---><--- Normal mode --->
>    +-------------------------------------------------------------------------+
>    | ... | CMD_0 | ... | CMD_N | PROT_REGION | CMD_N+1 | ... | CMD_N+M | ... |
>    +-------------------------------------------------------------------------+
> ```
> 
>    The PROT_REGION command acts as a barrier to notify the HW of upcoming
>    protected jobs. It also defines the number of commands to execute in protected
>    mode.
> 
>    The Mesa definition of the opcode can be found here:
> 
>      https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/panfrost/lib/genxml/v10.xml?ref_type=heads#L763

Is there also something around that implements egl_ext_protected_context
or similar in mesa? I think that's the minimal bar all the protected GPU
workload kernel support patches have cleared thus far, since usually getting
the actual video codec stuff published seems to be impossible.
-Sima

> 
> 2) Kernel-space:
> 
>    When loading the FW image, the kernel driver must also load the data section of
>    the CSF FW that comes from protected memory, in order to allow the FW to execute in
>    protected mode.
> 
>    Important: this memory is not owned by any process. It is a GPU device level
>               protected memory.
> 
>    In addition, when a CSG (group) is created, it must have a protected suspend buffer.
>    This memory is allocated within the kernel but bound to a specific CSG that belongs
>    to a process. The kernel owns this allocation and does not allow user space mapping.
>    The format of the data in this buffer is only known by the FW and does not need to
>    be shared with other entities. The purpose of this buffer is the same as the normal
>    suspend buffer but for protected mode. FW will use it to suspend the execution of
>    PROT_REGION before returning to normal mode of execution.
> 
> 
> Design decisions
> ----------------
> 
> The Mali Panthor CSF kernel driver will allocate protected DMA buffers
> using a global protected DMA heap. The name of the heap can vary on
> the system and is integration specific. Therefore, the kernel driver
> will retrieve it using the DTB entry: "protected-heap-name".
> 
> The Mali Panthor CSF kernel driver will handle enter/exit of protected
> mode with a fair consideration of the job scheduling.
> 
> If the system integrator does not provide a protected DMA heap, the driver
> will not allow any protected mode execution.
> 
> 
> Patch series
> ------------
> 
> The series is based on:
> 
>   https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
> 
> [PATCHES 1-2]:
>   These patches were used for the development of the feature and are not
>   initially thought to land in the Linux kernel. They are mostly relevant
>   if someone wants to reproduce the environment of testing.
> 
>   Note: Please, raise interest if you think those patches have value in
>         the Linux kernel.
> 
>   * dt-bindings: dma: Add CMA Heap bindings
>   * cma-heap: Allow registration of custom cma heaps
> 
> [PATCHES 3-4]:
>   These patches introduce the Mali Panthor CSF driver DTB binding to pass
>   the protected DMA Heap name and the handling of the protected DMA memory
>   allocations in the driver.
> 
>   Note: The registration of the protected DMA buffers is done via GEM prime.
>   The direct call to the registration function may seem controversial, and
>   I would appreciate feedback on that matter.
> 
>   * dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
>   * drm/panthor: Add support for protected memory allocation in panthor
> 
> [PATCH 5]:
>   This patch implements the logic to handle enter/exit of the GPU protected
>   mode in Panthor CSF driver.
> 
>   Note: to prevent scheduler priority inversion, only a single CSG is allowed
>         to execute while in protected mode. It must be the top priority one.
> 
>   * drm/panthor: Add support for entering and exiting protected mode
> 
> Testing
> -------
> 
> 1) Platform and development environment
> 
>    Any platform containing a Mali CSF type of GPU and a protected memory allocator
>    that is based on DMA Heap can be used. For example, it can be a physical platform
>    or a simulator such as Arm Total Compute FVPs platforms. Reference to the latter:
> 
>      https://developer.arm.com/Tools%20and%20Software/Fixed%20Virtual%20Platforms/Total%20Compute%20FVPs
> 
>    To ease the development of the feature, a carved-out protected memory heap was
>    defined and managed using a modified version of the CMA heap driver.
> 
> 2) Protected job submission:
> 
>    A protected mode job can be created in Mesa following this approach:
> 
> ```
> diff --git a/src/gallium/drivers/panfrost/pan_csf.c b/src/gallium/drivers/panfrost/pan_csf.c
> index da6ce875f86..13d54abf5a1 100644
> --- a/src/gallium/drivers/panfrost/pan_csf.c
> +++ b/src/gallium/drivers/panfrost/pan_csf.c
> @@ -803,8 +803,25 @@ GENX(csf_emit_fragment_job)(struct panfrost_batch *batch,
>        }
>     }
> 
> +   if (protected_cmd) {
> +      /* Number of commands to execute in protected mode in bytes.
> +       * The run fragment and wait commands. */
> +      unsigned const size = 2 * sizeof(u64);
> +
> +      /* Wait for all previous commands to complete before evaluating
> +       * the protected commands. */
> +      cs_wait_slots(b, SB_ALL_MASK, false);
> +      cs_prot_region(b, size);
> +   }
> +
>     /* Run the fragment job and wait */
>     cs_run_fragment(b, false, MALI_TILE_RENDER_ORDER_Z_ORDER, false);
> +
> +   /* Wait for all protected commands to complete before evaluating
> +    * the normal mode commands. */
> +   if (protected_cmd)
> +      cs_wait_slots(b, SB_ALL_MASK, false);
> +
>     cs_wait_slot(b, 2, false);
> 
>     /* Gather freed heap chunks and add them to the heap context free list
> ```
> 
> 
> Constraints
> -----------
> 
> At the time of developing the feature, Linux kernel does not have a generic
> way of implementing protected DMA heaps. The patch series relies on previous
> work to expose the DMA heap API to the kernel drivers.
> 
> The Mali CSF GPU requires device-level allocated protected memory, which does
> not belong to a process. The current Linux implementation of DMA heap only
> allows a user space program to allocate from such heap. Having the ability
> to allocate this memory at the kernel level via the DMA heap API would allow
> support for protected mode on Mali CSF GPUs.
> 
> 
> Conclusion
> ----------
> 
> This patch series aims to initiate the discussion around handling of protected
> mode in Mali CSF GPUs and highlights constraints found during the development
> of the feature.
> 
> Some Mesa changes are required to construct a protected mode job in user space,
> which can be submitted to the GPU.
> 
> Some of the changes may seem controversial, and we would appreciate
> opinions from the community.
> 
> 
> Regards,
> 
> Florent Tomasin (5):
>   dt-bindings: dma: Add CMA Heap bindings
>   cma-heap: Allow registration of custom cma heaps
>   dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
>   drm/panthor: Add support for protected memory allocation in panthor
>   drm/panthor: Add support for entering and exiting protected mode
> 
>  .../devicetree/bindings/dma/linux,cma.yml     |  43 ++++++
>  .../bindings/gpu/arm,mali-valhall-csf.yaml    |   6 +
>  drivers/dma-buf/heaps/cma_heap.c              | 120 +++++++++++------
>  drivers/gpu/drm/panthor/Kconfig               |   1 +
>  drivers/gpu/drm/panthor/panthor_device.c      |  22 +++-
>  drivers/gpu/drm/panthor/panthor_device.h      |  10 ++
>  drivers/gpu/drm/panthor/panthor_fw.c          |  46 ++++++-
>  drivers/gpu/drm/panthor/panthor_fw.h          |   2 +
>  drivers/gpu/drm/panthor/panthor_gem.c         |  49 ++++++-
>  drivers/gpu/drm/panthor/panthor_gem.h         |  16 ++-
>  drivers/gpu/drm/panthor/panthor_heap.c        |   2 +
>  drivers/gpu/drm/panthor/panthor_sched.c       | 124 ++++++++++++++++--
>  12 files changed, 382 insertions(+), 59 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/dma/linux,cma.yml
> 
> --
> 2.34.1
> 

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-01-30 15:59   ` Nicolas Dufresne
@ 2025-01-30 16:38     ` Maxime Ripard
  2025-01-30 17:47       ` Nicolas Dufresne
  0 siblings, 1 reply; 48+ messages in thread
From: Maxime Ripard @ 2025-01-30 16:38 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Florent Tomasin, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

[-- Attachment #1: Type: text/plain, Size: 2117 bytes --]

Hi Nicolas,

On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:
> Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
> > Hi,
> > 
> > I started to review it, but it's probably best to discuss it here.
> > 
> > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> > > Hi,
> > > 
> > > This is a patch series covering the support for protected mode execution in
> > > Mali Panthor CSF kernel driver.
> > > 
> > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > HW level. This feature requires two main changes in the kernel driver:
> > > 
> > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > >    heap from which the driver can allocate a protected buffer.
> > >    It can be a carved-out memory region or a dynamically allocated protected memory region.
> > >    Some systems include a trusted FW which is in charge of the protected memory.
> > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > >    driver must import the protected memory from a device specific exporter.
> > 
> > Why do you need a heap for it in the first place? My understanding of
> > your series is that you have a carved-out memory region somewhere, and
> > you want to allocate your buffers from that carved-out memory region.
> > 
> > How is that any different from using a reserved-memory region, adding
> > the reserved-memory property to the GPU device and doing all your
> > allocation through the usual dma_alloc_* API?
> 
> How do you then multiplex this region so it can be shared between
> GPU/Camera/Display/Codec drivers and also userspace?

You could point all the devices to the same reserved memory region, and
they would all allocate from there, including for their userspace-facing
allocations.

> Also, how the secure memory is allocated/obtained is a process that
> can vary a lot between SoCs, so implementation-detail assumptions
> should not be coded into the driver.

But yeah, we agree there, it's also the point I was trying to make :)

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-01-30 16:38     ` Maxime Ripard
@ 2025-01-30 17:47       ` Nicolas Dufresne
  2025-02-03 16:43         ` Florent Tomasin
  0 siblings, 1 reply; 48+ messages in thread
From: Nicolas Dufresne @ 2025-01-30 17:47 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Florent Tomasin, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :
> Hi Nicolas,
> 
> On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:
> > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
> > > Hi,
> > > 
> > > I started to review it, but it's probably best to discuss it here.
> > > 
> > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> > > > Hi,
> > > > 
> > > > This is a patch series covering the support for protected mode execution in
> > > > Mali Panthor CSF kernel driver.
> > > > 
> > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > HW level. This feature requires two main changes in the kernel driver:
> > > > 
> > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > >    heap from which the driver can allocate a protected buffer.
> > > >    It can be a carved-out memory region or a dynamically allocated protected memory region.
> > > >    Some systems include a trusted FW which is in charge of the protected memory.
> > > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > > >    driver must import the protected memory from a device specific exporter.
> > > 
> > > Why do you need a heap for it in the first place? My understanding of
> > > your series is that you have a carved-out memory region somewhere, and
> > > you want to allocate your buffers from that carved-out memory region.
> > > 
> > > How is that any different from using a reserved-memory region, adding
> > > the reserved-memory property to the GPU device and doing all your
> > > allocation through the usual dma_alloc_* API?
> > 
> > How do you then multiplex this region so it can be shared between
> > GPU/Camera/Display/Codec drivers and also userspace?
> 
> You could point all the devices to the same reserved memory region, and
> they would all allocate from there, including for their userspace-facing
> allocations.

I get that using a memory region is somewhat more of a HW description, and
aligned with what a DT is supposed to describe. One of the challenges is that
the Mediatek heap proposal ends up calling into their TEE, meaning knowing the
region is not that useful. You actually need the TEE app GUID and its IPC
protocol. If we can tell drivers to use a heap instead, we can abstract that
SoC-specific complexity. I believe each allocated address has to be mapped to a
zone, and that can only be done in the secure application. I can imagine
similar needs when the protection is done using some sort of VM/hypervisor.
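A TEE-backed allocate path of this kind could be sketched with the
in-kernel TEE client API. Note the TA command ID and parameter layout
below are invented for illustration; the real IPC protocol is SoC
specific, which is exactly why hiding it behind a heap is attractive:

```c
/*
 * Illustrative only: the piece of a dma-buf heap's allocate path that
 * asks a trusted application to assign a physical range to a protected
 * zone.  EXAMPLE_TA_CMD_PROTECT_RANGE and the value-parameter layout
 * are hypothetical; a real TA defines its own protocol.
 */
#include <linux/tee_drv.h>

#define EXAMPLE_TA_CMD_PROTECT_RANGE	0x1	/* hypothetical command */

static int example_tee_protect_range(struct tee_context *ctx, u32 session,
				     phys_addr_t base, size_t size)
{
	struct tee_ioctl_invoke_arg arg = {
		.func = EXAMPLE_TA_CMD_PROTECT_RANGE,
		.session = session,
		.num_params = 1,
	};
	struct tee_param param = {
		.attr = TEE_IOCTL_PARAM_ATTR_TYPE_VALUE_INPUT,
		.u.value = { .a = base, .b = size },
	};
	int ret;

	/* Ask the secure side to map the range into a protected zone. */
	ret = tee_client_invoke_func(ctx, &arg, &param);
	if (ret)
		return ret;
	return arg.ret ? -EIO : 0;
}
```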

Nicolas

> 
> > Also, how the secure memory is allocated/obtained is a process that
> > can vary a lot between SoCs, so implementation-detail assumptions
> > should not be coded into the driver.
> 
> But yeah, we agree there, it's also the point I was trying to make :)
> 
> Maxime


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-01-30 13:08 ` [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings Florent Tomasin
  2025-01-30 13:28   ` Maxime Ripard
@ 2025-01-30 23:20   ` Rob Herring
  2025-02-03 16:18     ` Florent Tomasin
  1 sibling, 1 reply; 48+ messages in thread
From: Rob Herring @ 2025-01-30 23:20 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Vinod Koul, Krzysztof Kozlowski, Conor Dooley, Boris Brezillon,
	Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

On Thu, Jan 30, 2025 at 01:08:57PM +0000, Florent Tomasin wrote:
> Introduce a CMA Heap dt-binding allowing custom
> CMA heap registrations.
> 
> * Note to the reviewers:
> The patch was used for the development of the protected mode
> feature in Panthor CSF kernel driver and is not initially thought
> to land in the Linux kernel. It is mostly relevant if someone
> wants to reproduce the environment of testing. Please, raise
> interest if you think the patch has value in the Linux kernel.

Why would panthor need CMA? It has an MMU.

In any case, I agree with Maxime that this is redundant.

Rob


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-01-30 16:15 ` Simona Vetter
@ 2025-02-03  9:25   ` Boris Brezillon
  0 siblings, 0 replies; 48+ messages in thread
From: Boris Brezillon @ 2025-02-03  9:25 UTC (permalink / raw)
  To: Simona Vetter
  Cc: Florent Tomasin, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

On Thu, 30 Jan 2025 17:15:24 +0100
Simona Vetter <simona.vetter@ffwll.ch> wrote:

> On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> > Hi,
> > 
> > This is a patch series covering the support for protected mode execution in
> > Mali Panthor CSF kernel driver.
> > 
> > The Mali CSF GPUs come with the support for protected mode execution at the
> > HW level. This feature requires two main changes in the kernel driver:
> > 
> > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> >    heap from which the driver can allocate a protected buffer.
> >    It can be a carved-out memory region or a dynamically allocated protected memory region.
> >    Some systems include a trusted FW which is in charge of the protected memory.
> >    Since this problem is integration specific, the Mali Panthor CSF kernel
> >    driver must import the protected memory from a device specific exporter.
> > 
> > 2) Handle enter and exit of the GPU HW from normal to protected mode of execution.
> >    FW sends a request for protected mode entry to the kernel driver.
> >    The acknowledgment of that request is a scheduling decision. Effectively,
> >    protected mode execution should not overrule normal mode of execution.
> >    A fair distribution of execution time will guarantee that the overall performance
> >    of the device, including the UI (usually executing in normal mode),
> >    will not regress when a protected mode job is submitted by an application.
> > 
> > 
> > Background
> > ----------
> > 
> > The current Mali Panthor CSF driver does not allow a user space application to
> > execute protected jobs on the GPU. This use case is quite common on end-user devices.
> > A user may want to watch a video or render content that is under "Digital Rights
> > Management" protection, or launch an application with private user data.
> > 
> > 1) User-space:
> > 
> >    In order for an application to execute protected jobs on a Mali CSF GPU, the
> >    user space application must submit jobs to the GPU within a "protected region"
> >    (a range of commands to execute in protected mode).
> > 
> >    Find here an example of a command buffer that contains protected commands:
> > 
> > ```
> >           <--- Normal mode ---><--- Protected mode ---><--- Normal mode --->
> >    +-------------------------------------------------------------------------+
> >    | ... | CMD_0 | ... | CMD_N | PROT_REGION | CMD_N+1 | ... | CMD_N+M | ... |
> >    +-------------------------------------------------------------------------+
> > ```
> > 
> >    The PROT_REGION command acts as a barrier to notify the HW of upcoming
> >    protected jobs. It also defines the number of commands to execute in protected
> >    mode.
> > 
> >    The Mesa definition of the opcode can be found here:
> > 
> >      https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/panfrost/lib/genxml/v10.xml?ref_type=heads#L763  
> 
> Is there also something around that implements egl_ext_protected_context
> or similar in mesa?

I'll be looking at a mesa implementation for EGL_EXT_protected_content
in the coming weeks. I'll probably get back to reviewing the panthor
implementation when I have something working in mesa.

> I think that's the minimal bar all the protected GPU
> workload kernel support patches have cleared thus far, since usually getting
> the actual video codec stuff published seems to be impossible.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-01-30 13:28   ` Maxime Ripard
@ 2025-02-03 13:36     ` Florent Tomasin
  2025-02-04 18:12       ` Nicolas Dufresne
  0 siblings, 1 reply; 48+ messages in thread
From: Florent Tomasin @ 2025-02-03 13:36 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel



On 30/01/2025 13:28, Maxime Ripard wrote:
> Hi,
> 
> On Thu, Jan 30, 2025 at 01:08:57PM +0000, Florent Tomasin wrote:
>> Introduce a CMA Heap dt-binding allowing custom
>> CMA heap registrations.
>>
>> * Note to the reviewers:
>> The patch was used for the development of the protected mode
>> feature in Panthor CSF kernel driver and is not initially thought
>> to land in the Linux kernel. It is mostly relevant if someone
>> wants to reproduce the environment of testing. Please, raise
>> interest if you think the patch has value in the Linux kernel.
>>
>> Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
>> ---
>>  .../devicetree/bindings/dma/linux,cma.yml     | 43 +++++++++++++++++++
>>  1 file changed, 43 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/dma/linux,cma.yml
>>
>> diff --git a/Documentation/devicetree/bindings/dma/linux,cma.yml b/Documentation/devicetree/bindings/dma/linux,cma.yml
>> new file mode 100644
>> index 000000000000..c532e016bbe5
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/dma/linux,cma.yml
>> @@ -0,0 +1,43 @@
>> +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
>> +%YAML 1.2
>> +---
>> +$id: http://devicetree.org/schemas/dma/linux-cma.yaml#
>> +$schema: http://devicetree.org/meta-schemas/core.yaml#
>> +
>> +title: Custom Linux CMA heap
>> +
>> +description:
>> +  The custom Linux CMA heap device tree node allows registering
>> +  of multiple CMA heaps.
>> +
>> +  The CMA heap name will match the node name of the "memory-region".
>> +
>> +properties:
>> +  compatible:
>> +    enum:
>> +      - linux,cma
>> +
>> +  memory-region:
>> +    maxItems: 1
>> +    description: |
>> +      Phandle to the reserved memory node associated with the CMA Heap.
>> +      The reserved memory node must follow this binding convention:
>> +       - Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
>> +
>> +examples:
>> +  - |
>> +    reserved-memory {
>> +      #address-cells = <2>;
>> +      #size-cells = <2>;
>> +
>> +      custom_cma_heap: custom-cma-heap {
>> +        compatible = "shared-dma-pool";
>> +        reg = <0x0 0x90600000 0x0 0x1000000>;
>> +        reusable;
>> +      };
>> +    };
>> +
>> +    device_cma_heap: device-cma-heap {
>> +      compatible = "linux,cma";
>> +      memory-region = <&custom_cma_heap>;
>> +    };
> 
> Isn't it redundant with the linux,cma-default shared-dma-pool property
> already?
> 
> Maxime

Hi Maxime,

Please correct me if my understanding is wrong,

The existing properties, linux,cma-default and shared-dma-pool, do not
allow the creation of multiple standalone CMA heaps; they will create
a single CMA heap: `dma_contiguous_default_area`. Other CMA heaps will
be bound to a driver.

I introduced the "linux,cma" binding to allow creating multiple standalone
CMA heaps, with the intention of validating protected mode support on
Mali CSF GPUs. It was included in the RFC in case there is interest in
this approach.

Since the Panthor CSF kernel driver does not own or manage a heap,
I needed a way to create a standalone heap. The idea here is for the
kernel driver to be an importer. I relied on a patch series to retrieve
the heap and allocate a DMA buffer from it:
- dma_heap_find()
- dma_heap_buffer_alloc()
- dma_heap_put()

Ref:
https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
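Put together, the consumer-side flow with these helpers would look roughly like
the sketch below. Note this is only an illustration assuming the API of the
series above: `dma_heap_find()`, `dma_heap_buffer_alloc()` and
`dma_heap_put()` are not mainline APIs, and the exact signatures and flags may
differ.

```c
/* Sketch of importing a protected buffer through the DMA heap kernel
 * API proposed in the series referenced above. All three dma_heap_*
 * calls are assumptions taken from that series, not mainline APIs.
 */
static struct dma_buf *import_protected_buffer(const char *heap_name,
					       size_t size)
{
	struct dma_heap *heap;
	struct dma_buf *buf;

	/* Look up the heap registered by the integration-specific exporter. */
	heap = dma_heap_find(heap_name);
	if (!heap)
		return ERR_PTR(-EPROBE_DEFER);

	/* The heap driver implements the TEE/carve-out specific logic. */
	buf = dma_heap_buffer_alloc(heap, size, O_RDWR, 0);

	/* Drop the lookup reference; the buffer keeps the heap alive. */
	dma_heap_put(heap);

	return buf;
}
```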


Since the protected/secure memory management is integration specific,
I needed a generic way for Panthor to allocate from such heap.

In some scenarios it might be a carved-out memory; in others, a FW will
reside in the system (TEE) and require a secure heap driver to allocate
memory (e.g. a similar approach is followed by MTK). Such a driver would
implement the allocation and free logic.

Florent



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 2/5] cma-heap: Allow registration of custom cma heaps
  2025-01-30 13:34   ` Maxime Ripard
@ 2025-02-03 13:52     ` Florent Tomasin
  0 siblings, 0 replies; 48+ messages in thread
From: Florent Tomasin @ 2025-02-03 13:52 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Hi,

On 30/01/2025 13:34, Maxime Ripard wrote:
>> This patch introduces a cma-heap probe function, allowing
>> users to register custom cma heaps in the device tree.
>>
>> A "memory-region" is bound to the cma heap at probe time
>> allowing allocation of DMA buffers from that heap.
>>
>> Use cases:
>> - registration of carved out secure heaps. Some devices
>>   are implementing secure memory by reserving a specific
>>   memory regions for that purpose. For example, this is the
>>   case of platforms making use of early version of
>>   ARM TrustZone.
> 
> In such a case, the CMA heap would de-facto become un-mappable for
> userspace, right?
> 

It could be that the CMA heap, or alternative carved-out types of heaps,
are made mappable to user space. An example would be an integrator
deciding to implement a single carved-out secure heap and having both
user and kernel space programs allocate from it (using the DMA heap
framework).

In the case of Mali CSF GPUs, this same integrator could have decided to
share the secure heap with the whole system and protect its usage with a
secure FW.

>> - registration of multiple memory regions at different
>>   locations for efficiency or HW integration reasons.
>>   For example, a peripheral may expect to share data at a
>>   specific location in RAM. This information could have been
>>   programmed by a FW prior to the kernel boot.
> 
> How would you differentiate between them?

For that situation, I relied on the API exposed by this proposal:

-
https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t

The heaps would be distinguished by the name they are given. Therefore,
in the CMA patch, I retrieved the name of the heap from the label of the
DTB node. We could do it differently and have a specific field in the
DTB node to assign the name.

I assumed it would be possible to call `dma_heap_find()` from the kernel
driver. The name of the heap would be known by the integrator. This
person may decide to hard-code the name of the heap in the importer
kernel driver, or pass it as a property of some sort (a module
parameter, a DTB property, etc.) to make it generic.
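For instance, the module-parameter option could be as small as the following
sketch (the parameter name and default are hypothetical, not part of the
posted patches):

```c
#include <linux/module.h>
#include <linux/moduleparam.h>

/* Hypothetical module parameter letting the integrator override the
 * name of the DMA heap the driver imports protected memory from. */
static char *protected_heap_name = "protected-heap";
module_param(protected_heap_name, charp, 0444);
MODULE_PARM_DESC(protected_heap_name,
		 "Name of the DMA heap providing protected memory");
```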

Florent



* Re: [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
  2025-01-30 13:25   ` Krzysztof Kozlowski
@ 2025-02-03 15:31     ` Florent Tomasin
  2025-02-05  9:13       ` Krzysztof Kozlowski
  0 siblings, 1 reply; 48+ messages in thread
From: Florent Tomasin @ 2025-02-03 15:31 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Sumit Semwal, Benjamin Gaignard, Brian Starkey,
	John Stultz, T . J . Mercier, Christian König,
	Matthias Brugger, AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel

Hi Krzysztof

On 30/01/2025 13:25, Krzysztof Kozlowski wrote:
> On 30/01/2025 14:08, Florent Tomasin wrote:
>> Allow mali-valhall-csf driver to retrieve a protected
>> heap at probe time by passing the name of the heap
>> as attribute to the device tree GPU node.
> 
> Please wrap commit message according to Linux coding style / submission
> process (neither too early nor over the limit):
> https://elixir.bootlin.com/linux/v6.4-rc1/source/Documentation/process/submitting-patches.rst#L597

Apologies, I think I made quite a few other mistakes in the style of the
patches I sent. I will work on improving this aspect, much appreciated.

> Why this cannot be passed by phandle, just like all reserved regions?
> 
> From where do you take these protected heaps? Firmware? This would
> explain why no relation is here (no probe ordering, no device links,
> nothing connecting separate devices).

The protected heap is generally obtained from a firmware (TEE) and can
sometimes be a carved-out memory with restricted access.

The Panthor CSF kernel driver does not own or manage the protected heap
and is instead a consumer of it (assuming the heap is made available by
the system integrator).

I initially used a phandle, but then I realised it would introduce a new
API to share the heap across kernel drivers. In addition, I found this
patch series:
-
https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t

which introduces a DMA Heap API to the rest of the kernel to find a
heap by name:
- dma_heap_find()

I then decided to follow that approach to help isolate the heap
management from the GPU driver code. In the Panthor driver, if the
heap is not found at probe time, the driver will defer the probe until
the exporter has made it available.
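The probe-time deferral described here could be sketched as follows. This is
illustrative only: `dma_heap_find()` and `dma_heap_put()` are assumed from the
patch series referenced above, and the function name is hypothetical.

```c
/* Sketch of the probe-time heap lookup with deferral. dma_heap_find()
 * is an assumption taken from the referenced patch series; the helper
 * name here is illustrative only. */
static int panthor_get_protected_heap(struct device *dev,
				      const char *heap_name,
				      struct dma_heap **out)
{
	struct dma_heap *heap = dma_heap_find(heap_name);

	if (!heap) {
		/* Exporter not registered yet: ask the driver core to
		 * retry the probe later. */
		dev_dbg(dev, "protected heap \"%s\" not ready, deferring\n",
			heap_name);
		return -EPROBE_DEFER;
	}

	*out = heap;
	return 0;
}
```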

Best regards,
Florent


* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-01-30 23:20   ` Rob Herring
@ 2025-02-03 16:18     ` Florent Tomasin
  0 siblings, 0 replies; 48+ messages in thread
From: Florent Tomasin @ 2025-02-03 16:18 UTC (permalink / raw)
  To: Rob Herring
  Cc: Vinod Koul, Krzysztof Kozlowski, Conor Dooley, Boris Brezillon,
	Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Hi Rob

On 30/01/2025 23:20, Rob Herring wrote:
> 
> Why would panthor need CMA, it has an MMU.
> 
> In any case, I agree with Maxime that this is redundant.
> 

This is correct: the GPU has an MMU. The reason I introduced this custom
CMA DTB entry is to allow the creation of a standalone DMA heap which
can be retrieved by Panthor using the API exposed by:
-
https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t

My understanding might be wrong, but I am under the impression that the
current CMA driver only has `dma_contiguous_default_area` as a
standalone carved-out heap, and we cannot have more than one. Please
correct me if this is invalid.

With the DMA Heap API I based the RFC on, the Panthor kernel driver does
not manage the protected heap itself; it relies on an exporter to do so.
On some systems the secure heap will communicate with a secure FW; on
others it will be a carved-out memory with restricted access. This is
integration specific. The Panthor kernel driver will expect to import a
DMA buffer obtained from a heap.

For the development of the protected mode feature, I decided to modify
the CMA driver to create a standalone DMA heap and allocate a DMA buffer
from it. It helped me abstract the importing of a heap in the Panthor
kernel driver. Someone may use a different heap driver to reproduce the
setup.

* Additional information to help with the context:
Mali CSF GPUs require protected memory at the device level, which does
not belong to a user space process, in order to allow the FW to enter
protected mode. There is a single FW per GPU instance and the FW is
loaded at probe time.

Regards,
Florent


* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-01-30 17:47       ` Nicolas Dufresne
@ 2025-02-03 16:43         ` Florent Tomasin
  2025-02-04 18:22           ` Nicolas Dufresne
  2025-02-05 14:52           ` Maxime Ripard
  0 siblings, 2 replies; 48+ messages in thread
From: Florent Tomasin @ 2025-02-03 16:43 UTC (permalink / raw)
  To: Nicolas Dufresne, Maxime Ripard
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Hi Maxime, Nicolas

On 30/01/2025 17:47, Nicolas Dufresne wrote:
> Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :
>> Hi Nicolas,
>>
>> On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:
>>> Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
>>>> Hi,
>>>>
>>>> I started to review it, but it's probably best to discuss it here.
>>>>
>>>> On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
>>>>> Hi,
>>>>>
>>>>> This is a patch series covering the support for protected mode execution in
>>>>> Mali Panthor CSF kernel driver.
>>>>>
>>>>> The Mali CSF GPUs come with the support for protected mode execution at the
>>>>> HW level. This feature requires two main changes in the kernel driver:
>>>>>
>>>>> 1) Configure the GPU with a protected buffer. The system must provide a DMA
>>>>>    heap from which the driver can allocate a protected buffer.
>>>>>    It can be a carved-out memory or dynamically allocated protected memory region.
>>>>>    Some system includes a trusted FW which is in charge of the protected memory.
>>>>>    Since this problem is integration specific, the Mali Panthor CSF kernel
>>>>>    driver must import the protected memory from a device specific exporter.
>>>>
>>>> Why do you need a heap for it in the first place? My understanding of
>>>> your series is that you have a carved out memory region somewhere, and
>>>> you want to allocate from that carved out memory region your buffers.
>>>>
>>>> How is that any different from using a reserved-memory region, adding
>>>> the reserved-memory property to the GPU device and doing all your
>>>> allocation through the usual dma_alloc_* API?
>>>
>>> How do you then multiplex this region so it can be shared between
>>> GPU/Camera/Display/Codec drivers and also userspace ?
>>
>> You could point all the devices to the same reserved memory region, and
>> they would all allocate from there, including for their userspace-facing
>> allocations.
> 
> I get that using a memory region is somewhat more of an HW description, and
> aligned with what a DT is supposed to describe. One of the challenges is that
> the Mediatek heap proposal ends up calling into their TEE, meaning knowing the
> region is not that useful. You actually need the TEE APP GUID and its IPC
> protocol. If we can tell drivers to use a heap instead, we can abstract that
> SoC-specific complexity. I believe each allocated address has to be mapped to
> a zone, and that can only be done in the secure application. I can imagine
> similar needs when the protection is done using some sort of a VM / hypervisor.
> 
> Nicolas
> 

The idea in this design is to abstract the heap management from the
Panthor kernel driver (which consumes a DMA buffer from it).

In a system, an integrator would have implemented a secure heap driver,
which could be based on a TEE, a carved-out memory with restricted
access, or something else. This heap driver would be responsible for
implementing the logic to allocate, free, refcount, etc.

The heap would be retrieved by the Panthor kernel driver in order to
allocate protected memory to load the FW and allow the GPU to enter/exit
protected mode. This memory would not belong to a user space process.
The driver allocates it at FW load time, during initialization of the
GPU HW. This is a device-globally-owned protected memory.

When I came across this patch series:
-
https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
I found it could help abstract the interface between the secure heap and
the integration of protected memory in Panthor.

A kernel driver would have to find the heap with `dma_heap_find()`, then
request allocation of a DMA buffer from it. The heap driver would deal
with the specifics of the protected memory on the system.

>>
>>> Also, how the secure memory is allocted / obtained is a process that
>>> can vary a lot between SoC, so implementation details assumption
>>> should not be coded in the driver.
>>
>> But yeah, we agree there, it's also the point I was trying to make :)
>>
>> Maxime
> 

I agree with your point: the Panthor kernel driver should not need to be
aware of the heap management logic. As an alternative to the DMA heap
API used here, I also tried to expose the heap by passing the phandle of
a "heap" device to Panthor. The reference to the DMA heap was stored as
private data of the heap device with a new type, `struct
dma_heap_import_info`, similar to the existing type `struct
dma_heap_export_info`. This made me think it could be problematic, as
the private data type would have to be cast before accessing it from the
importer driver. I worried about a misuse of the types with this
approach.

Regards,
Florent


* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-02-03 13:36     ` Florent Tomasin
@ 2025-02-04 18:12       ` Nicolas Dufresne
  2025-02-12  9:49         ` Florent Tomasin
  0 siblings, 1 reply; 48+ messages in thread
From: Nicolas Dufresne @ 2025-02-04 18:12 UTC (permalink / raw)
  To: Florent Tomasin, Maxime Ripard
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Hi Florent,

Le lundi 03 février 2025 à 13:36 +0000, Florent Tomasin a écrit :
> 
> On 30/01/2025 13:28, Maxime Ripard wrote:
> > Hi,
> > 
> > On Thu, Jan 30, 2025 at 01:08:57PM +0000, Florent Tomasin wrote:
> > > Introduce a CMA Heap dt-binding allowing custom
> > > CMA heap registrations.
> > > 
> > > * Note to the reviewers:
> > > The patch was used for the development of the protected mode

Just to avoid divergence in nomenclature, and because this is not a new
subject, perhaps you should also adhere to the name "restricted". Both
Linaro and Mediatek have moved from "secure" to that name in their
proposals. As you are the third person proposing this (at least among
the proposals CCed on linux-media), I would have expected a summary in
your cover letter of how the other requirements have been blended into
your proposal.

regards,
Nicolas

> > > feature in Panthor CSF kernel driver and is not initially thought
> > > to land in the Linux kernel. It is mostly relevant if someone
> > > wants to reproduce the environment of testing. Please, raise
> > > interest if you think the patch has value in the Linux kernel.
> > > 
> > > Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
> > > ---
> > >  .../devicetree/bindings/dma/linux,cma.yml     | 43 +++++++++++++++++++
> > >  1 file changed, 43 insertions(+)
> > >  create mode 100644 Documentation/devicetree/bindings/dma/linux,cma.yml
> > > 
> > > diff --git a/Documentation/devicetree/bindings/dma/linux,cma.yml b/Documentation/devicetree/bindings/dma/linux,cma.yml
> > > new file mode 100644
> > > index 000000000000..c532e016bbe5
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/dma/linux,cma.yml
> > > @@ -0,0 +1,43 @@
> > > +# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
> > > +%YAML 1.2
> > > +---
> > > +$id: http://devicetree.org/schemas/dma/linux-cma.yaml#
> > > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > > +
> > > +title: Custom Linux CMA heap
> > > +
> > > +description:
> > > +  The custom Linux CMA heap device tree node allows registering
> > > +  of multiple CMA heaps.
> > > +
> > > +  The CMA heap name will match the node name of the "memory-region".
> > > +
> > > +properties:
> > > +  compatible:
> > > +    enum:
> > > +      - linux,cma
> > > +
> > > +  memory-region:
> > > +    maxItems: 1
> > > +    description: |
> > > +      Phandle to the reserved memory node associated with the CMA Heap.
> > > +      The reserved memory node must follow this binding convention:
> > > +       - Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > +
> > > +examples:
> > > +  - |
> > > +    reserved-memory {
> > > +      #address-cells = <2>;
> > > +      #size-cells = <2>;
> > > +
> > > +      custom_cma_heap: custom-cma-heap {
> > > +        compatible = "shared-dma-pool";
> > > +        reg = <0x0 0x90600000 0x0 0x1000000>;
> > > +        reusable;
> > > +      };
> > > +    };
> > > +
> > > +    device_cma_heap: device-cma-heap {
> > > +      compatible = "linux,cma";
> > > +      memory-region = <&custom_cma_heap>;
> > > +    };
> > 
> > Isn't it redundant with the linux,cma-default shared-dma-pool property
> > already?
> > 
> > Maxime
> 
> Hi Maxime,
> 
> Please correct me if my understanding is wrong,
> 
> The existing properties, linux,cma-default and shared-dma-pool, do not
> allow the creation of multiple standalone CMA heaps; they will create
> a single CMA heap: `dma_contiguous_default_area`. Other CMA heaps will
> be bound to a driver.
> 
> I introduced the "linux,cma" compatible to allow creating multiple
> standalone CMA heaps, with the intention of validating the protected
> mode support on Mali CSF GPUs. It was included in the RFC in case there
> is interest in this approach.
> 
> Since the Panthor CSF kernel driver does not own or manage a heap,
> I needed a way to create a standalone heap. The idea here is for the
> kernel driver to be an importer. I relied on a patch series to retrieve
> the heap and allocate a DMA buffer from it:
> - dma_heap_find()
> - dma_heap_buffer_alloc()
> - dma_heap_put()
> 
> Ref:
> https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
> 
> 
> Since the protected/secure memory management is integration specific,
> I needed a generic way for Panthor to allocate from such heap.
> 
> In some scenarios it might be a carved-out memory; in others, a FW will
> reside in the system (TEE) and require a secure heap driver to allocate
> memory (e.g. a similar approach is followed by MTK). Such a driver would
> implement the allocation and free logic.
> 
> Florent
> 
> 
> 



* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-03 16:43         ` Florent Tomasin
@ 2025-02-04 18:22           ` Nicolas Dufresne
  2025-02-05 14:53             ` Maxime Ripard
  2025-02-05 14:52           ` Maxime Ripard
  1 sibling, 1 reply; 48+ messages in thread
From: Nicolas Dufresne @ 2025-02-04 18:22 UTC (permalink / raw)
  To: Florent Tomasin, Maxime Ripard
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Le lundi 03 février 2025 à 16:43 +0000, Florent Tomasin a écrit :
> Hi Maxime, Nicolas
> 
> On 30/01/2025 17:47, Nicolas Dufresne wrote:
> > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :
> > > Hi Nicolas,
> > > 
> > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:
> > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
> > > > > Hi,
> > > > > 
> > > > > I started to review it, but it's probably best to discuss it here.
> > > > > 
> > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> > > > > > Hi,
> > > > > > 
> > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > Mali Panthor CSF kernel driver.
> > > > > > 
> > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > 
> > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > >    heap from which the driver can allocate a protected buffer.
> > > > > >    It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > >    Some system includes a trusted FW which is in charge of the protected memory.
> > > > > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > >    driver must import the protected memory from a device specific exporter.
> > > > > 
> > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > your series is that you have a carved out memory region somewhere, and
> > > > > you want to allocate from that carved out memory region your buffers.
> > > > > 
> > > > > How is that any different from using a reserved-memory region, adding
> > > > > the reserved-memory property to the GPU device and doing all your
> > > > > allocation through the usual dma_alloc_* API?
> > > > 
> > > > How do you then multiplex this region so it can be shared between
> > > > GPU/Camera/Display/Codec drivers and also userspace ?
> > > 
> > > You could point all the devices to the same reserved memory region, and
> > > they would all allocate from there, including for their userspace-facing
> > > allocations.
> > 
> > I get that using a memory region is somewhat more of an HW description, and
> > aligned with what a DT is supposed to describe. One of the challenges is that
> > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the
> > region is not that useful. You actually need the TEE APP GUID and its IPC
> > protocol. If we can tell drivers to use a heap instead, we can abstract that
> > SoC-specific complexity. I believe each allocated address has to be mapped to
> > a zone, and that can only be done in the secure application. I can imagine
> > similar needs when the protection is done using some sort of a VM / hypervisor.
> > 
> > Nicolas
> > 
> 
> The idea in this design is to abstract the heap management from the
> Panthor kernel driver (which consumes a DMA buffer from it).
> 
> In a system, an integrator would have implemented a secure heap driver,
> which could be based on a TEE, a carved-out memory with restricted
> access, or something else. This heap driver would be responsible for
> implementing the logic to allocate, free, refcount, etc.
> 
> The heap would be retrieved by the Panthor kernel driver in order to
> allocate protected memory to load the FW and allow the GPU to enter/exit
> protected mode. This memory would not belong to a user space process.
> The driver allocates it at FW load time, during initialization of the
> GPU HW. This is a device-globally-owned protected memory.

This use case also applies well to codecs. The Mediatek SCP firmware
needs to be loaded in restricted memory too; it's a very similar
scenario, plus Mediatek chips often include a Mali. On top of that, V4L2
codecs (in general) do need to allocate internal scratch buffers for the
IP to write to, for things like motion vectors, reconstruction frames,
entropy statistics, etc. The IP will only be able to write if the memory
is restricted.

Nicolas

> 
> When I came across this patch series:
> -
> https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
> I found it could help abstract the interface between the secure heap and
> the integration of protected memory in Panthor.
> 
> A kernel driver would have to find the heap: `dma_heap_find()`, then
> request allocation of a DMA buffer from it. The heap driver would deal
> with the specifities of the protected memory on the system.
> 
> > > 
> > > > Also, how the secure memory is allocted / obtained is a process that
> > > > can vary a lot between SoC, so implementation details assumption
> > > > should not be coded in the driver.
> > > 
> > > But yeah, we agree there, it's also the point I was trying to make :)
> > > 
> > > Maxime
> > 
> 
> Agree with your point, the Panthor kernel driver may not be aware of the
> heap management logic. As an alternative to the DMA heap API used here,
> I also tried to expose the heap by passing the phandle of a "heap"
> device to Panthor. The reference to the DMA heap was stored as a private
> data of the heap device as a new type: `struct dma_heap_import_info`,
> similar to the existing type: `struct dma_heap_export_info`.
> This made me think it could be problematic, as the private data type
> would have to be cast before accessing it from the importer driver. I
> worried about a mis-use of the types with this approach.
> 
> Regards,
> Florent



* Re: [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
  2025-02-03 15:31     ` Florent Tomasin
@ 2025-02-05  9:13       ` Krzysztof Kozlowski
  2025-02-06 21:21         ` Nicolas Dufresne
  0 siblings, 1 reply; 48+ messages in thread
From: Krzysztof Kozlowski @ 2025-02-05  9:13 UTC (permalink / raw)
  To: Florent Tomasin, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
	Simona Vetter, Sumit Semwal, Benjamin Gaignard, Brian Starkey,
	John Stultz, T . J . Mercier, Christian König,
	Matthias Brugger, AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel

On 03/02/2025 16:31, Florent Tomasin wrote:
> Hi Krzysztof
> 
> On 30/01/2025 13:25, Krzysztof Kozlowski wrote:
>> On 30/01/2025 14:08, Florent Tomasin wrote:
>>> Allow mali-valhall-csf driver to retrieve a protected
>>> heap at probe time by passing the name of the heap
>>> as attribute to the device tree GPU node.
>>
>> Please wrap commit message according to Linux coding style / submission
>> process (neither too early nor over the limit):
>> https://elixir.bootlin.com/linux/v6.4-rc1/source/Documentation/process/submitting-patches.rst#L597
> Apologies, I think I made quite a few other mistakes in the style of the
> patches I sent. I will work on improving this aspect, much appreciated.
> 
>> Why this cannot be passed by phandle, just like all reserved regions?
>>
>> From where do you take these protected heaps? Firmware? This would
>> explain why no relation is here (no probe ordering, no device links,
>> nothing connecting separate devices).
> 
> The protected heap is generally obtained from a firmware (TEE) and can
> sometimes be a carved-out memory with restricted access.

Which is a reserved memory, isn't it?

> 
> The Panthor CSF kernel driver does not own or manage the protected heap
> and is instead a consumer of it (assuming the heap is made available by
> the system integrator).
> 
> I initially used a phandle, but then I realised it would introduce a new
> API to share the heap across kernel drivers. In addition, I found this
> patch series:
> -
> https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
> 
> which introduces a DMA Heap API to the rest of the kernel to find a
> heap by name:
> - dma_heap_find()
> 
> I then decided to follow that approach to help isolate the heap
> management from the GPU driver code. In the Panthor driver, if the
> heap is not found at probe time, the driver will defer the probe until
> the exporter has made it available.


I am not really talking about the driver here; even the MediaTek
patchset above uses reserved-memory bindings.

You explained some things about the driver, yet you did not answer the
question. This looks like reserved memory. If it does not, bring
arguments why this binding cannot be a reserved memory, and why the
hardware is not a carved-out memory.

Best regards,
Krzysztof


* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-03 16:43         ` Florent Tomasin
  2025-02-04 18:22           ` Nicolas Dufresne
@ 2025-02-05 14:52           ` Maxime Ripard
  2025-02-05 18:14             ` Nicolas Dufresne
  1 sibling, 1 reply; 48+ messages in thread
From: Maxime Ripard @ 2025-02-05 14:52 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Nicolas Dufresne, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel


On Mon, Feb 03, 2025 at 04:43:23PM +0000, Florent Tomasin wrote:
> Hi Maxime, Nicolas
> 
> On 30/01/2025 17:47, Nicolas Dufresne wrote:
> > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :
> >> Hi Nicolas,
> >>
> >> On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:
> >>> Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
> >>>> Hi,
> >>>>
> >>>> I started to review it, but it's probably best to discuss it here.
> >>>>
> >>>> On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> >>>>> Hi,
> >>>>>
> >>>>> This is a patch series covering the support for protected mode execution in
> >>>>> Mali Panthor CSF kernel driver.
> >>>>>
> >>>>> The Mali CSF GPUs come with the support for protected mode execution at the
> >>>>> HW level. This feature requires two main changes in the kernel driver:
> >>>>>
> >>>>> 1) Configure the GPU with a protected buffer. The system must provide a DMA
> >>>>>    heap from which the driver can allocate a protected buffer.
> >>>>>    It can be a carved-out memory or dynamically allocated protected memory region.
> >>>>>    Some system includes a trusted FW which is in charge of the protected memory.
> >>>>>    Since this problem is integration specific, the Mali Panthor CSF kernel
> >>>>>    driver must import the protected memory from a device specific exporter.
> >>>>
> >>>> Why do you need a heap for it in the first place? My understanding of
> >>>> your series is that you have a carved out memory region somewhere, and
> >>>> you want to allocate from that carved out memory region your buffers.
> >>>>
> >>>> How is that any different from using a reserved-memory region, adding
> >>>> the reserved-memory property to the GPU device and doing all your
> >>>> allocation through the usual dma_alloc_* API?
> >>>
> >>> How do you then multiplex this region so it can be shared between
> >>> GPU/Camera/Display/Codec drivers and also userspace ?
> >>
> >> You could point all the devices to the same reserved memory region, and
> >> they would all allocate from there, including for their userspace-facing
> >> allocations.
> > 
> > I get that using a memory region is somewhat more of a HW description, and
> > aligned with what a DT is supposed to describe. One of the challenges is that
> > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the
> > region is not that useful. You actually need the TEE APP GUID and its IPC
> > protocol. If we can tell drivers to use a heap instead, we can abstract that
> > SoC-specific complexity. I believe each allocated address has to be mapped to
> > a zone, and that can only be done in the secure application. I can imagine
> > similar needs when the protection is done using some sort of a VM / hypervisor.
> > 
> > Nicolas
> > 
> 
> The idea in this design is to abstract the heap management from the
> Panthor kernel driver (which consumes a DMA buffer from it).
> 
> In a system, an integrator would have implemented a secure heap driver,
> which could be based on a TEE, a carved-out memory with restricted access,
> or something else. This heap driver would be responsible for implementing
> the logic to allocate, free, refcount, etc.
> 
> The heap would be retrieved by the Panthor kernel driver in order to
> allocate protected memory to load the FW and allow the GPU to enter/exit
> protected mode. This memory would not belong to a user space process.
> The driver allocates it at the time of loading the FW and initialization
> of the GPU HW. This is device-global protected memory, owned by the driver.

The thing is, it's really not clear why you absolutely need to have the
Panthor driver involved there. It won't be transparent to userspace,
since you'd need an extra flag at allocation time, and the buffers
behave differently. If userspace has to be aware of it, what's the
advantage to your approach compared to just exposing a heap for those
secure buffers, and letting userspace allocate its buffers from there?

> When I came across this patch series:
> -
> https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
> I found it could help abstract the interface between the secure heap and
> the integration of protected memory in Panthor.
> 
> A kernel driver would have to find the heap: `dma_heap_find()`, then
> request allocation of a DMA buffer from it. The heap driver would deal
> with the specificities of the protected memory on the system.

Sure, but we still have to address *why* it would be a good idea for the
driver to do it in the first place. The mediatek series had the same
feedback.

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-04 18:22           ` Nicolas Dufresne
@ 2025-02-05 14:53             ` Maxime Ripard
  2025-02-05 18:07               ` Nicolas Dufresne
  0 siblings, 1 reply; 48+ messages in thread
From: Maxime Ripard @ 2025-02-05 14:53 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Florent Tomasin, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

[-- Attachment #1: Type: text/plain, Size: 4581 bytes --]

On Tue, Feb 04, 2025 at 01:22:58PM -0500, Nicolas Dufresne wrote:
> Le lundi 03 février 2025 à 16:43 +0000, Florent Tomasin a écrit :
> > Hi Maxime, Nicolas
> > 
> > On 30/01/2025 17:47, Nicolas Dufresne wrote:
> > > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :
> > > > Hi Nicolas,
> > > > 
> > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:
> > > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
> > > > > > Hi,
> > > > > > 
> > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > > 
> > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > Mali Panthor CSF kernel driver.
> > > > > > > 
> > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > > 
> > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > >    heap from which the driver can allocate a protected buffer.
> > > > > > >    It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > >    Some system includes a trusted FW which is in charge of the protected memory.
> > > > > > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > > >    driver must import the protected memory from a device specific exporter.
> > > > > > 
> > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > > 
> > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > allocation through the usual dma_alloc_* API?
> > > > > 
> > > > > How do you then multiplex this region so it can be shared between
> > > > > GPU/Camera/Display/Codec drivers and also userspace ?
> > > > 
> > > > You could point all the devices to the same reserved memory region, and
> > > > they would all allocate from there, including for their userspace-facing
> > > > allocations.
> > > 
> > > I get that using a memory region is somewhat more of a HW description, and
> > > aligned with what a DT is supposed to describe. One of the challenges is that
> > > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the
> > > region is not that useful. You actually need the TEE APP GUID and its IPC
> > > protocol. If we can tell drivers to use a heap instead, we can abstract that
> > > SoC-specific complexity. I believe each allocated address has to be mapped to
> > > a zone, and that can only be done in the secure application. I can imagine
> > > similar needs when the protection is done using some sort of a VM / hypervisor.
> > > 
> > > Nicolas
> > > 
> > 
> > The idea in this design is to abstract the heap management from the
> > Panthor kernel driver (which consumes a DMA buffer from it).
> > 
> > In a system, an integrator would have implemented a secure heap driver,
> > which could be based on a TEE, a carved-out memory with restricted access,
> > or something else. This heap driver would be responsible for implementing
> > the logic to allocate, free, refcount, etc.
> > 
> > The heap would be retrieved by the Panthor kernel driver in order to
> > allocate protected memory to load the FW and allow the GPU to enter/exit
> > protected mode. This memory would not belong to a user space process.
> > The driver allocates it at the time of loading the FW and initialization
> > of the GPU HW. This is device-global protected memory, owned by the driver.
> 
> > This use case also applies well to codecs. The Mediatek SCP firmware needs to
> > be loaded into restricted memory too; it's a very similar scenario. Plus,
> > Mediatek chips often include a Mali. On top of that, V4L2 codecs (in general)
> > do need to allocate internal scratch buffers for the IP to write to, for
> > things like motion vectors, reconstruction frames, entropy statistics, etc.
> > The IP will only be able to write if the memory is restricted.

BTW, in such a case, do the scratch buffers need to be
protected/secure/whatever too, or would codecs be able to use any buffer
as a scratch buffer?

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-05 14:53             ` Maxime Ripard
@ 2025-02-05 18:07               ` Nicolas Dufresne
  0 siblings, 0 replies; 48+ messages in thread
From: Nicolas Dufresne @ 2025-02-05 18:07 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Florent Tomasin, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Le mercredi 05 février 2025 à 15:53 +0100, Maxime Ripard a écrit :
> On Tue, Feb 04, 2025 at 01:22:58PM -0500, Nicolas Dufresne wrote:
> > Le lundi 03 février 2025 à 16:43 +0000, Florent Tomasin a écrit :
> > > Hi Maxime, Nicolas
> > > 
> > > On 30/01/2025 17:47, Nicolas Dufresne wrote:
> > > > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :
> > > > > Hi Nicolas,
> > > > > 
> > > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:
> > > > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
> > > > > > > Hi,
> > > > > > > 
> > > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > > > 
> > > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > > Mali Panthor CSF kernel driver.
> > > > > > > > 
> > > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > > > 
> > > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > > >    heap from which the driver can allocate a protected buffer.
> > > > > > > >    It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > > >    Some system includes a trusted FW which is in charge of the protected memory.
> > > > > > > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > > > >    driver must import the protected memory from a device specific exporter.
> > > > > > > 
> > > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > > > 
> > > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > > allocation through the usual dma_alloc_* API?
> > > > > > 
> > > > > > How do you then multiplex this region so it can be shared between
> > > > > > GPU/Camera/Display/Codec drivers and also userspace ?
> > > > > 
> > > > > You could point all the devices to the same reserved memory region, and
> > > > > they would all allocate from there, including for their userspace-facing
> > > > > allocations.
> > > > 
> > > > I get that using a memory region is somewhat more of a HW description, and
> > > > aligned with what a DT is supposed to describe. One of the challenges is that
> > > > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the
> > > > region is not that useful. You actually need the TEE APP GUID and its IPC
> > > > protocol. If we can tell drivers to use a heap instead, we can abstract that
> > > > SoC-specific complexity. I believe each allocated address has to be mapped to
> > > > a zone, and that can only be done in the secure application. I can imagine
> > > > similar needs when the protection is done using some sort of a VM / hypervisor.
> > > > 
> > > > Nicolas
> > > > 
> > > 
> > > The idea in this design is to abstract the heap management from the
> > > Panthor kernel driver (which consumes a DMA buffer from it).
> > > 
> > > In a system, an integrator would have implemented a secure heap driver,
> > > which could be based on a TEE, a carved-out memory with restricted access,
> > > or something else. This heap driver would be responsible for implementing
> > > the logic to allocate, free, refcount, etc.
> > > 
> > > The heap would be retrieved by the Panthor kernel driver in order to
> > > allocate protected memory to load the FW and allow the GPU to enter/exit
> > > protected mode. This memory would not belong to a user space process.
> > > The driver allocates it at the time of loading the FW and initialization
> > > of the GPU HW. This is device-global protected memory, owned by the driver.
> > 
> > This use case also applies well to codecs. The Mediatek SCP firmware needs to
> > be loaded into restricted memory too; it's a very similar scenario. Plus,
> > Mediatek chips often include a Mali. On top of that, V4L2 codecs (in general)
> > do need to allocate internal scratch buffers for the IP to write to, for
> > things like motion vectors, reconstruction frames, entropy statistics, etc.
> > The IP will only be able to write if the memory is restricted.
> 
> BTW, in such a case, do the scratch buffers need to be
> protected/secure/whatever too, or would codecs be able to use any buffer
> as a scratch buffer?

They need to be protected, yes. It's not very fine-grained on the platform I
work on. When that protection is enabled, the decoder can only read and write
from protected memory. I know there are platforms where it can read from both,
but generally all IOs, regardless of what they are used for, end up with the
same restriction.

Nicolas

p.s. since Khronos seems to have adopted "protected", perhaps it would be
advisable to go with that term in the end.

> 
> Maxime


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-05 14:52           ` Maxime Ripard
@ 2025-02-05 18:14             ` Nicolas Dufresne
  2025-02-07 15:02               ` Boris Brezillon
  0 siblings, 1 reply; 48+ messages in thread
From: Nicolas Dufresne @ 2025-02-05 18:14 UTC (permalink / raw)
  To: Maxime Ripard, Florent Tomasin
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Le mercredi 05 février 2025 à 15:52 +0100, Maxime Ripard a écrit :
> On Mon, Feb 03, 2025 at 04:43:23PM +0000, Florent Tomasin wrote:
> > Hi Maxime, Nicolas
> > 
> > On 30/01/2025 17:47, Nicolas Dufresne wrote:
> > > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :
> > > > Hi Nicolas,
> > > > 
> > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:
> > > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :
> > > > > > Hi,
> > > > > > 
> > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > > 
> > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > Mali Panthor CSF kernel driver.
> > > > > > > 
> > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > > 
> > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > >    heap from which the driver can allocate a protected buffer.
> > > > > > >    It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > >    Some system includes a trusted FW which is in charge of the protected memory.
> > > > > > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > > >    driver must import the protected memory from a device specific exporter.
> > > > > > 
> > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > > 
> > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > allocation through the usual dma_alloc_* API?
> > > > > 
> > > > > How do you then multiplex this region so it can be shared between
> > > > > GPU/Camera/Display/Codec drivers and also userspace ?
> > > > 
> > > > You could point all the devices to the same reserved memory region, and
> > > > they would all allocate from there, including for their userspace-facing
> > > > allocations.
> > > 
> > > I get that using a memory region is somewhat more of a HW description, and
> > > aligned with what a DT is supposed to describe. One of the challenges is that
> > > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the
> > > region is not that useful. You actually need the TEE APP GUID and its IPC
> > > protocol. If we can tell drivers to use a heap instead, we can abstract that
> > > SoC-specific complexity. I believe each allocated address has to be mapped to
> > > a zone, and that can only be done in the secure application. I can imagine
> > > similar needs when the protection is done using some sort of a VM / hypervisor.
> > > 
> > > Nicolas
> > > 
> > 
> > The idea in this design is to abstract the heap management from the
> > Panthor kernel driver (which consumes a DMA buffer from it).
> > 
> > In a system, an integrator would have implemented a secure heap driver,
> > which could be based on a TEE, a carved-out memory with restricted access,
> > or something else. This heap driver would be responsible for implementing
> > the logic to allocate, free, refcount, etc.
> > 
> > The heap would be retrieved by the Panthor kernel driver in order to
> > allocate protected memory to load the FW and allow the GPU to enter/exit
> > protected mode. This memory would not belong to a user space process.
> > The driver allocates it at the time of loading the FW and initialization
> > of the GPU HW. This is device-global protected memory, owned by the driver.
> 
> The thing is, it's really not clear why you absolutely need to have the
> Panthor driver involved there. It won't be transparent to userspace,
> since you'd need an extra flag at allocation time, and the buffers
> behave differently. If userspace has to be aware of it, what's the
> advantage to your approach compared to just exposing a heap for those
> secure buffers, and letting userspace allocate its buffers from there?

Unless I'm mistaken, the Panthor driver loads its own firmware. Since loading
the firmware requires placing the data in a protected memory region, and this
aspect has no exposure to userspace, how can Panthor not be involved?

> 
> > When I came across this patch series:
> > -
> > https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
> > I found it could help abstract the interface between the secure heap and
> > the integration of protected memory in Panthor.
> > 
> > A kernel driver would have to find the heap: `dma_heap_find()`, then
> > request allocation of a DMA buffer from it. The heap driver would deal
> > with the specificities of the protected memory on the system.
> 
> Sure, but we still have to address *why* it would be a good idea for the
> driver to do it in the first place. The mediatek series had the same
> feedback.

Which got pretty clear replies, iirc. The drivers need scratch buffers and
secondary buffers to be protected, and these are not visible to userspace. No
one has made a proper counter-argument yet. In MTK, the remote-proc driver for
the SCP (a multi-purpose multimedia co-processor) will also need to place the
firmware data into a protected buffer (with the help of the TEE to copy the
data into it, of course).

Nicolas

> 
> Maxime


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
  2025-02-05  9:13       ` Krzysztof Kozlowski
@ 2025-02-06 21:21         ` Nicolas Dufresne
  2025-02-09 11:56           ` Krzysztof Kozlowski
  0 siblings, 1 reply; 48+ messages in thread
From: Nicolas Dufresne @ 2025-02-06 21:21 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Florent Tomasin, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Boris Brezillon, Steven Price,
	Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, Simona Vetter, Sumit Semwal, Benjamin Gaignard,
	Brian Starkey, John Stultz, T . J . Mercier, Christian König,
	Matthias Brugger, AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel

Le mercredi 05 février 2025 à 10:13 +0100, Krzysztof Kozlowski a écrit :
> On 03/02/2025 16:31, Florent Tomasin wrote:
> > Hi Krzysztof
> > 
> > On 30/01/2025 13:25, Krzysztof Kozlowski wrote:
> > > On 30/01/2025 14:08, Florent Tomasin wrote:
> > > > Allow mali-valhall-csf driver to retrieve a protected
> > > > heap at probe time by passing the name of the heap
> > > > as attribute to the device tree GPU node.
> > > 
> > > Please wrap commit message according to Linux coding style / submission
> > > process (neither too early nor over the limit):
> > > https://elixir.bootlin.com/linux/v6.4-rc1/source/Documentation/process/submitting-patches.rst#L597
> > Apologies, I think I made quite a few other mistakes in the style of the
> > patches I sent. I will work on improving this aspect, appreciated
> > 
> > > Why this cannot be passed by phandle, just like all reserved regions?
> > > 
> > > From where do you take these protected heaps? Firmware? This would
> > > explain why no relation is here (no probe ordering, no device links,
> > > nothing connecting separate devices).
> > 
> > The protected heap is generaly obtained from a firmware (TEE) and could
> > sometimes be a carved-out memory with restricted access.
> 
> Which is a reserved memory, isn't it?
> 
> > 
> > The Panthor CSF kernel driver does not own or manage the protected heap
> > and is instead a consumer of it (assuming the heap is made available by
> > the system integrator).
> > 
> > I initially used a phandle, but then I realised it would introduce a new
> > API to share the heap across kernel drivers. In addition, I found this
> > patch series:
> > -
> > https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
> > 
> > which introduces a DMA Heap API to the rest of the kernel to find a
> > heap by name:
> > - dma_heap_find()
> > 
> > I then decided to follow that approach to help isolate the heap
> > management from the GPU driver code. In the Panthor driver, if the
> > heap is not found at probe time, the driver will defer the probe until
> > the exporter has made it available.
> 
> 
> I'm not really talking about the driver here; even the Mediatek
> patchset above uses reserved-memory bindings.
> 
> You explained some things about the driver, yet you did not answer the
> question. This looks like reserved memory. If it does not, bring
> arguments for why this binding cannot be a reserved-memory region and
> why the hardware is not a carve-out memory.
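For reference, the reserved-memory alternative Krzysztof is pointing at would look roughly like this in the DT. The node names, addresses, and sizes below are purely illustrative placeholders; only the `memory-region` phandle pattern and the `shared-dma-pool`/`no-map` properties come from the existing reserved-memory binding.

```dts
/ {
	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/* Hypothetical protected carve-out; address/size are placeholders. */
		gpu_protected: gpu-protected@8f000000 {
			compatible = "shared-dma-pool";
			reg = <0x0 0x8f000000 0x0 0x1000000>;
			no-map;
		};
	};

	gpu@2d000000 {
		compatible = "arm,mali-valhall-csf";
		memory-region = <&gpu_protected>;
	};
};
```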

I think the point is that, from the Mali GPU's view, the memory does not need
to be within the range the Linux kernel actually sees, even though current
integrations have that. From the Mali GPU driver's standpoint (or codec
drivers and whatnot), the memory range is not useful for allocating
protected/restricted memory. On top of which, it's not reserved specifically
for the Mali GPU.

What's your practical suggestion here? Introduce dma_heap_find_by_region()?

Nicolas

> 
> Best regards,
> Krzysztof
> 


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-05 18:14             ` Nicolas Dufresne
@ 2025-02-07 15:02               ` Boris Brezillon
  2025-02-07 16:32                 ` Nicolas Dufresne
  2025-02-11 13:46                 ` Maxime Ripard
  0 siblings, 2 replies; 48+ messages in thread
From: Boris Brezillon @ 2025-02-07 15:02 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Maxime Ripard, Florent Tomasin, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Sorry for joining the party late, a couple of comments to back Akash
and Nicolas' concerns.

On Wed, 05 Feb 2025 13:14:14 -0500
Nicolas Dufresne <nicolas@ndufresne.ca> wrote:

> Le mercredi 05 février 2025 à 15:52 +0100, Maxime Ripard a écrit :
> > On Mon, Feb 03, 2025 at 04:43:23PM +0000, Florent Tomasin wrote:  
> > > Hi Maxime, Nicolas
> > > 
> > > On 30/01/2025 17:47, Nicolas Dufresne wrote:  
> > > > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :  
> > > > > Hi Nicolas,
> > > > > 
> > > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:  
> > > > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :  
> > > > > > > Hi,
> > > > > > > 
> > > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > > > 
> > > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:  
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > > Mali Panthor CSF kernel driver.
> > > > > > > > 
> > > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > > > 
> > > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > > >    heap from which the driver can allocate a protected buffer.
> > > > > > > >    It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > > >    Some system includes a trusted FW which is in charge of the protected memory.
> > > > > > > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > > > >    driver must import the protected memory from a device specific exporter.  
> > > > > > > 
> > > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > > > 
> > > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > > allocation through the usual dma_alloc_* API?  
> > > > > > 
> > > > > > How do you then multiplex this region so it can be shared between
> > > > > > GPU/Camera/Display/Codec drivers and also userspace ?  
> > > > > 
> > > > > You could point all the devices to the same reserved memory region, and
> > > > > they would all allocate from there, including for their userspace-facing
> > > > > allocations.  
> > > > 
> > > > I get that using a memory region is somewhat more of a HW description, and
> > > > aligned with what a DT is supposed to describe. One of the challenges is that
> > > > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the
> > > > region is not that useful. You actually need the TEE APP GUID and its IPC
> > > > protocol. If we can tell drivers to use a heap instead, we can abstract that
> > > > SoC-specific complexity. I believe each allocated address has to be mapped to
> > > > a zone, and that can only be done in the secure application. I can imagine
> > > > similar needs when the protection is done using some sort of a VM / hypervisor.
> > > > 
> > > > Nicolas
> > > >   
> > > 
> > > The idea in this design is to abstract the heap management from the
> > > Panthor kernel driver (which consumes a DMA buffer from it).
> > > 
> > > In a system, an integrator would have implemented a secure heap driver,
> > > which could be based on a TEE, a carved-out memory with restricted access,
> > > or something else. This heap driver would be responsible for implementing
> > > the logic to allocate, free, refcount, etc.
> > > 
> > > The heap would be retrieved by the Panthor kernel driver in order to
> > > allocate protected memory to load the FW and allow the GPU to enter/exit
> > > protected mode. This memory would not belong to a user space process.
> > > The driver allocates it at the time of loading the FW and initialization
> > > of the GPU HW. This is device-global protected memory, owned by the driver.
> > 
> > The thing is, it's really not clear why you absolutely need to have the
> > Panthor driver involved there. It won't be transparent to userspace,
> > since you'd need an extra flag at allocation time, and the buffers
> > behave differently. If userspace has to be aware of it, what's the
> > advantage to your approach compared to just exposing a heap for those
> > secure buffers, and letting userspace allocate its buffers from there?  
> 
> Unless I'm mistaken, the Panthor driver loads its own firmware. Since loading
> the firmware requires placing the data in a protected memory region, and this
> aspect has no exposure to userspace, how can Panthor not be involved?

Right, the very reason we need protected memory early is because some
FW sections need to be allocated from the protected pool, otherwise the
TEE will fault as soon as the FW enters the so-called 'protected mode'.

Now, it's not impossible to work around this limitation. For instance,
we could load the FW without this protected section by default (what we
do right now), and then provide a DRM_PANTHOR_ENABLE_FW_PROT_MODE
ioctl that would take a GEM object imported from a dmabuf allocated
from the protected dma-heap by userspace. We can then reset the FW and
allow it to operate in protected mode after that point. This approach
has two downsides though:

1. We have no way of checking that the memory we're passed is actually
suitable for FW execution in a protected context. If we're passed
random memory, this will likely hang the platform as soon as we enter
protected mode.

2. If the driver has already booted the FW and exposed a DRI node, we might
have GPU workloads running, and doing a FW reset might incur a slight
delay in GPU job execution.

I think #1 is a more general issue that applies to suspend buffers
allocated for GPU contexts too. If we expose ioctls where we take
protected memory buffers that can possibly lead to crashes if they are
not real protected memory regions, and we have no way to ensure the
memory is protected, we probably want to restrict these ioctls/modes to
some high-privilege CAP_SYS_.

For #2, that's probably something we can live with, since it's a
one-shot thing. If it becomes an issue, we can even make sure we enable
the FW protected-mode before the GPU starts being used for real.

This being said, I think the problem applies outside Panthor, and it
might be that the video codec can't reset the FW/HW block to switch to
protected mode as easily as Panthor.

Note that there are also downsides to the reserved-memory node approach,
where some bootloader stage would ask the secure FW to reserve a
portion of memory and pass this through the DT. This sort of thing tends to
be an integration mess, where you need all the pieces of the stack (TEE,
u-boot, MTK dma-heap driver, gbm, ...) to be at a certain version to
work properly. If we go the ioctl() way, we restrict the scope to the
TEE, gbm/mesa and the protected-dma-heap driver, which is still a lot,
but we've ripped the bootloader out of the equation at least.
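For reference, the reserved-memory route being compared here usually looks something like the following in DT (a minimal sketch; the node names, addresses, and sizes are made-up placeholders, and a vendor-specific compatible may replace "shared-dma-pool"):

```dts
reserved-memory {
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;

	/* carveout set aside by an earlier boot stage / secure FW */
	protected_heap: protected-memory@a0000000 {
		compatible = "shared-dma-pool";
		reg = <0x0 0xa0000000 0x0 0x4000000>;	/* 64 MiB, made up */
		no-map;
	};
};

gpu@2d000000 {
	/* ...existing GPU properties... */
	memory-region = <&protected_heap>;
};
```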

Regards,

Boris

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-07 15:02               ` Boris Brezillon
@ 2025-02-07 16:32                 ` Nicolas Dufresne
  2025-02-07 16:42                   ` Boris Brezillon
  2025-02-11 13:46                 ` Maxime Ripard
  1 sibling, 1 reply; 48+ messages in thread
From: Nicolas Dufresne @ 2025-02-07 16:32 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Maxime Ripard, Florent Tomasin, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Le vendredi 07 février 2025 à 16:02 +0100, Boris Brezillon a écrit :
> Sorry for joining the party late, a couple of comments to back Akash
> and Nicolas' concerns.
> 
> On Wed, 05 Feb 2025 13:14:14 -0500
> Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> 
> > Le mercredi 05 février 2025 à 15:52 +0100, Maxime Ripard a écrit :
> > > On Mon, Feb 03, 2025 at 04:43:23PM +0000, Florent Tomasin wrote:  
> > > > Hi Maxime, Nicolas
> > > > 
> > > > On 30/01/2025 17:47, Nicolas Dufresne wrote:  
> > > > > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :  
> > > > > > Hi Nicolas,
> > > > > > 
> > > > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:  
> > > > > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :  
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > > > > 
> > > > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:  
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > > > Mali Panthor CSF kernel driver.
> > > > > > > > > 
> > > > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > > > > 
> > > > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > > > >    heap from which the driver can allocate a protected buffer.
> > > > > > > > >    It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > > > >    Some system includes a trusted FW which is in charge of the protected memory.
> > > > > > > > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > > > > >    driver must import the protected memory from a device specific exporter.  
> > > > > > > > 
> > > > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > > > > 
> > > > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > > > allocation through the usual dma_alloc_* API?  
> > > > > > > 
> > > > > > > How do you then multiplex this region so it can be shared between
> > > > > > > GPU/Camera/Display/Codec drivers and also userspace ?  
> > > > > > 
> > > > > > You could point all the devices to the same reserved memory region, and
> > > > > > they would all allocate from there, including for their userspace-facing
> > > > > > allocations.  
> > > > > 
> > > > > I get that using a memory region is somewhat more of an HW description, and
> > > > > aligned with what a DT is supposed to describe. One of the challenges is that
> > > > > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the region
> > > > > is not that useful. You actually need the TEE APP guid and its IPC protocol. If
> > > > > we can tell drivers to use a heap instead, we can abstract that SoC-specific
> > > > > complexity. I believe each allocated address has to be mapped to a zone, and
> > > > > that can only be done in the secure application. I can imagine similar needs
> > > > > when the protection is done using some sort of a VM / hypervisor.
> > > > > 
> > > > > Nicolas
> > > > >   
> > > > 
> > > > The idea in this design is to abstract the heap management from the
> > > > Panthor kernel driver (which consumes a DMA buffer from it).
> > > > 
> > > > In a system, an integrator would have implemented a secure heap driver,
> > > > which could be based on a TEE or a carved-out memory with restricted access,
> > > > or something else. This heap driver would be responsible for implementing the
> > > > logic to allocate, free, refcount, etc.
> > > > 
> > > > The heap would be retrieved by the Panthor kernel driver in order to
> > > > allocate protected memory to load the FW and allow the GPU to enter/exit
> > > > protected mode. This memory would not belong to a user space process.
> > > > The driver allocates it at the time of loading the FW and initialization
> > > > of the GPU HW. This is a device globally owned protected memory.  
> > > 
> > > The thing is, it's really not clear why you absolutely need to have the
> > > Panthor driver involved there. It won't be transparent to userspace,
> > > since you'd need an extra flag at allocation time, and the buffers
> > > behave differently. If userspace has to be aware of it, what's the
> > > advantage to your approach compared to just exposing a heap for those
> > > secure buffers, and letting userspace allocate its buffers from there?  
> > 
> > Unless I'm mistaken, the Panthor driver loads its own firmware. Since loading
> > the firmware requires placing the data in a protected memory region, and this
> > aspect has no exposure to userspace, how can Panthor not be implicated ?
> 
> Right, the very reason we need protected memory early is because some
> FW sections need to be allocated from the protected pool, otherwise the
> TEE will fault as soon as the FW enters the so-called 'protected mode'.
> 
> Now, it's not impossible to work around this limitation. For instance,
> we could load the FW without this protected section by default (what we
> do right now), and then provide a DRM_PANTHOR_ENABLE_FW_PROT_MODE
> ioctl that would take a GEM object imported from a dmabuf allocated
> from the protected dma-heap by userspace. We can then reset the FW and
> allow it to operate in protected mode after that point. This approach
> has two downsides though:
> 
> 1. We have no way of checking that the memory we're passed is actually
> suitable for FW execution in a protected context. If we're passed
> random memory, this will likely hang the platform as soon as we enter
> protected mode.
> 
> 2. If the driver has already booted the FW and exposed a DRI node, we might
> have GPU workloads running, and doing a FW reset might incur a slight
> delay in GPU job execution.
> 
> I think #1 is a more general issue that applies to suspend buffers
> allocated for GPU contexts too. If we expose ioctls where we take
> protected memory buffers that can possibly lead to crashes if they are
> not real protected memory regions, and we have no way to ensure the
> memory is protected, we probably want to restrict these ioctls/modes to
> some high-privilege CAP_SYS_.
> 
> For #2, that's probably something we can live with, since it's a
> one-shot thing. If it becomes an issue, we can even make sure we enable
> the FW protected-mode before the GPU starts being used for real.
> 
> This being said, I think the problem applies outside Panthor, and it
> might be that the video codec can't reset the FW/HW block to switch to
> protected mode as easily as Panthor.

Overall, the reset-and-reboot method is pretty ugly in my opinion. But to stick
with the pure rationale, rebooting the SCP on MTK is much harder, since it's not
specific to a single HW/driver.

For other codecs like Samsung MFC, Venus/Iris, Chips&Media, etc., that approach
seems plausible, but we still can't trust the buffer, which to me is not acceptable.

> 
> Note that there are also downsides to the reserved-memory node approach,
> where some bootloader stage would ask the secure FW to reserve a
> portion of memory and pass this through the DT. This sort of thing tends to
> be an integration mess, where you need all the pieces of the stack (TEE,
> u-boot, MTK dma-heap driver, gbm, ...) to be at a certain version to
> work properly. If we go the ioctl() way, we restrict the scope to the
> TEE, gbm/mesa and the protected-dma-heap driver, which is still a lot,
> but we've ripped the bootloader out of the equation at least.
> 
> Regards,
> 
> Boris


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-07 16:32                 ` Nicolas Dufresne
@ 2025-02-07 16:42                   ` Boris Brezillon
  0 siblings, 0 replies; 48+ messages in thread
From: Boris Brezillon @ 2025-02-07 16:42 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Maxime Ripard, Florent Tomasin, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

On Fri, 07 Feb 2025 11:32:18 -0500
Nicolas Dufresne <nicolas@ndufresne.ca> wrote:

> Le vendredi 07 février 2025 à 16:02 +0100, Boris Brezillon a écrit :
> > Sorry for joining the party late, a couple of comments to back Akash
> > and Nicolas' concerns.
> > 
> > On Wed, 05 Feb 2025 13:14:14 -0500
> > Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> >   
> > > Le mercredi 05 février 2025 à 15:52 +0100, Maxime Ripard a écrit :  
> > > > On Mon, Feb 03, 2025 at 04:43:23PM +0000, Florent Tomasin wrote:    
> > > > > Hi Maxime, Nicolas
> > > > > 
> > > > > On 30/01/2025 17:47, Nicolas Dufresne wrote:    
> > > > > > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :    
> > > > > > > Hi Nicolas,
> > > > > > > 
> > > > > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:    
> > > > > > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :    
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > > > > > 
> > > > > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:    
> > > > > > > > > > Hi,
> > > > > > > > > > 
> > > > > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > > > > Mali Panthor CSF kernel driver.
> > > > > > > > > > 
> > > > > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > > > > > 
> > > > > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > > > > >    heap from which the driver can allocate a protected buffer.
> > > > > > > > > >    It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > > > > >    Some system includes a trusted FW which is in charge of the protected memory.
> > > > > > > > > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > > > > > >    driver must import the protected memory from a device specific exporter.    
> > > > > > > > > 
> > > > > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > > > > > 
> > > > > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > > > > allocation through the usual dma_alloc_* API?    
> > > > > > > > 
> > > > > > > > How do you then multiplex this region so it can be shared between
> > > > > > > > GPU/Camera/Display/Codec drivers and also userspace ?    
> > > > > > > 
> > > > > > > You could point all the devices to the same reserved memory region, and
> > > > > > > they would all allocate from there, including for their userspace-facing
> > > > > > > allocations.    
> > > > > > 
> > > > > > I get that using a memory region is somewhat more of an HW description, and
> > > > > > aligned with what a DT is supposed to describe. One of the challenges is that
> > > > > > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the region
> > > > > > is not that useful. You actually need the TEE APP guid and its IPC protocol. If
> > > > > > we can tell drivers to use a heap instead, we can abstract that SoC-specific
> > > > > > complexity. I believe each allocated address has to be mapped to a zone, and
> > > > > > that can only be done in the secure application. I can imagine similar needs
> > > > > > when the protection is done using some sort of a VM / hypervisor.
> > > > > > 
> > > > > > Nicolas
> > > > > >     
> > > > > 
> > > > > The idea in this design is to abstract the heap management from the
> > > > > Panthor kernel driver (which consumes a DMA buffer from it).
> > > > > 
> > > > > In a system, an integrator would have implemented a secure heap driver,
> > > > > which could be based on a TEE or a carved-out memory with restricted access,
> > > > > or something else. This heap driver would be responsible for implementing the
> > > > > logic to allocate, free, refcount, etc.
> > > > > 
> > > > > The heap would be retrieved by the Panthor kernel driver in order to
> > > > > allocate protected memory to load the FW and allow the GPU to enter/exit
> > > > > protected mode. This memory would not belong to a user space process.
> > > > > The driver allocates it at the time of loading the FW and initialization
> > > > > of the GPU HW. This is a device globally owned protected memory.    
> > > > 
> > > > The thing is, it's really not clear why you absolutely need to have the
> > > > Panthor driver involved there. It won't be transparent to userspace,
> > > > since you'd need an extra flag at allocation time, and the buffers
> > > > behave differently. If userspace has to be aware of it, what's the
> > > > advantage to your approach compared to just exposing a heap for those
> > > > secure buffers, and letting userspace allocate its buffers from there?    
> > > 
> > > Unless I'm mistaken, the Panthor driver loads its own firmware. Since loading
> > > the firmware requires placing the data in a protected memory region, and this
> > > aspect has no exposure to userspace, how can Panthor not be implicated ?
> > 
> > Right, the very reason we need protected memory early is because some
> > FW sections need to be allocated from the protected pool, otherwise the
> > TEE will fault as soon as the FW enters the so-called 'protected mode'.
> > 
> > Now, it's not impossible to work around this limitation. For instance,
> > we could load the FW without this protected section by default (what we
> > do right now), and then provide a DRM_PANTHOR_ENABLE_FW_PROT_MODE
> > ioctl that would take a GEM object imported from a dmabuf allocated
> > from the protected dma-heap by userspace. We can then reset the FW and
> > allow it to operate in protected mode after that point. This approach
> > has two downsides though:
> > 
> > 1. We have no way of checking that the memory we're passed is actually
> > suitable for FW execution in a protected context. If we're passed
> > random memory, this will likely hang the platform as soon as we enter
> > protected mode.
> > 
> > 2. If the driver has already booted the FW and exposed a DRI node, we might
> > have GPU workloads running, and doing a FW reset might incur a slight
> > delay in GPU job execution.
> > 
> > I think #1 is a more general issue that applies to suspend buffers
> > allocated for GPU contexts too. If we expose ioctls where we take
> > protected memory buffers that can possibly lead to crashes if they are
> > not real protected memory regions, and we have no way to ensure the
> > memory is protected, we probably want to restrict these ioctls/modes to
> > some high-privilege CAP_SYS_.
> > 
> > For #2, that's probably something we can live with, since it's a
> > one-shot thing. If it becomes an issue, we can even make sure we enable
> > the FW protected-mode before the GPU starts being used for real.
> > 
> > This being said, I think the problem applies outside Panthor, and it
> > might be that the video codec can't reset the FW/HW block to switch to
> > protected mode as easily as Panthor.  
> 
> Overall the reset and reboot method is pretty ugly in my opinion.

Yeah, I'm not entirely sold on this approach either, but I thought it
was good to mention it for completeness.

> But to stick
> with the pure rationale, rebooting the SCP on MTK is much harder, since it's not
> specific to a single HW/driver.
> 
> For other codecs like Samsung MFC, Venus/Iris, Chips&Media, etc., that approach
> seems plausible, but we still can't trust the buffer, which to me is not acceptable.

Unfortunately, there are so many ways this can go wrong, because we're
not just talking about FW exec sections, but also data buffers that
have to be dynamically allocated by userspace (suspend buffers for GPU
contexts, textures/framebuffers, decode frames, ...). If there's no
central kernel-side component we can consult to check correctness, the
only safeguard is privilege-based restriction...
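A kernel-side gate along those lines might be as simple as the sketch below. Note the thread deliberately leaves the exact CAP_SYS_* capability open; CAP_SYS_ADMIN here is only a placeholder, and the surrounding handler is hypothetical:

```c
/* In the hypothetical protected-mode ioctl handler: the kernel cannot
 * verify that the dmabuf really comes from protected memory, so only
 * accept it from sufficiently privileged callers. */
if (!capable(CAP_SYS_ADMIN))	/* placeholder capability */
	return -EPERM;
```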

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
  2025-02-06 21:21         ` Nicolas Dufresne
@ 2025-02-09 11:56           ` Krzysztof Kozlowski
  2025-02-12  9:25             ` Florent Tomasin
  0 siblings, 1 reply; 48+ messages in thread
From: Krzysztof Kozlowski @ 2025-02-09 11:56 UTC (permalink / raw)
  To: Nicolas Dufresne, Florent Tomasin, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Boris Brezillon, Steven Price,
	Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, Simona Vetter, Sumit Semwal, Benjamin Gaignard,
	Brian Starkey, John Stultz, T . J . Mercier, Christian König,
	Matthias Brugger, AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel

On 06/02/2025 22:21, Nicolas Dufresne wrote:
> Le mercredi 05 février 2025 à 10:13 +0100, Krzysztof Kozlowski a écrit :
>> On 03/02/2025 16:31, Florent Tomasin wrote:
>>> Hi Krzysztof
>>>
>>> On 30/01/2025 13:25, Krzysztof Kozlowski wrote:
>>>> On 30/01/2025 14:08, Florent Tomasin wrote:
>>>>> Allow mali-valhall-csf driver to retrieve a protected
>>>>> heap at probe time by passing the name of the heap
>>>>> as attribute to the device tree GPU node.
>>>>
>>>> Please wrap commit message according to Linux coding style / submission
>>>> process (neither too early nor over the limit):
>>>> https://elixir.bootlin.com/linux/v6.4-rc1/source/Documentation/process/submitting-patches.rst#L597
>>> Apologies, I think I made quite a few other mistakes in the style of the
>>> patches I sent. I will work on improving this aspect, appreciated
>>>
>>>> Why this cannot be passed by phandle, just like all reserved regions?
>>>>
>>>> From where do you take these protected heaps? Firmware? This would
>>>> explain why no relation is here (no probe ordering, no device links,
>>>> nothing connecting separate devices).
>>>
>>> The protected heap is generally obtained from a firmware (TEE) and could
>>> sometimes be a carved-out memory with restricted access.
>>
>> Which is a reserved memory, isn't it?
>>
>>>
>>> The Panthor CSF kernel driver does not own or manage the protected heap
>>> and is instead a consumer of it (assuming the heap is made available by
>>> the system integrator).
>>>
>>> I initially used a phandle, but then I realised it would introduce a new
>>> API to share the heap across kernel drivers. In addition, I found this
>>> patch series:
>>> -
>>> https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
>>>
>>> which introduces a DMA Heap API to the rest of the kernel to find a
>>> heap by name:
>>> - dma_heap_find()
>>>
>>> I then decided to follow that approach to help isolate the heap
>>> management from the GPU driver code. In the Panthor driver, if the
>>> heap is not found at probe time, the driver will defer the probe until
>>> the exporter made it available.
>>
>>
>> I don't talk here really about the driver but even above mediatek
>> patchset uses reserved memory bindings.
>>
>> You explained some things about driver yet you did not answer the
>> question. This looks like reserved memory. If it does not, bring
>> arguments why this binding cannot be a reserved memory, why hardware is
>> not a carve out memory.
> 
> I think the point is that from the Mali GPU view, the memory does not need to be
> within the range the Linux Kernel actually sees, even though current integrations


Do I get it right:
Memory can be outside of the kernel address range, but you put it in the
bindings as reserved memory? If yes, then I still do not understand why
the DT should keep that information. Basically, you can choose whatever
memory is there, because it won't interfere with Linux anyway, right?
Linux does not have any reasonable way to access it.

It might interfere with firmware or other processors, but then it's the
job of the firmware, which has discoverable interfaces for this.

The binding says it is about a protected heap name, but it explains
nothing about what that protected heap is. You pass it to some firmware as
a string? It does not look like it; it rather looks like a Linux thingy, but this
again is neither explained in the commit msg nor actually correct: Linux
thingies do not belong in the DT.

> have that. From the Mali GPU driver's standpoint (or codec drivers and whatnot),
> the memory range is not useful to allocate protected/restricted memory. On top
> of which, it's not reserved specifically for the Mali GPU.
> 
> What's your practical suggestion here ? Introduce dma_heap_find_by_region() ?

I did not comment about the driver, and I do not judge how you access
whatever you need to access. This is a discussion purely about the binding,
thus about hardware.


Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 5/5] drm/panthor: Add support for entering and exiting protected mode
  2025-01-30 13:09 ` [RFC PATCH 5/5] drm/panthor: Add support for entering and exiting protected mode Florent Tomasin
@ 2025-02-10 14:01   ` Boris Brezillon
  0 siblings, 0 replies; 48+ messages in thread
From: Boris Brezillon @ 2025-02-10 14:01 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

On Thu, 30 Jan 2025 13:09:01 +0000
Florent Tomasin <florent.tomasin@arm.com> wrote:

> This patch modifies the Panthor driver code to allow handling
> of the GPU HW protected mode enter and exit.
> 
> The logic added by this patch only includes the mechanisms
> needed for entering and exiting protected mode. The submission of
> a protected mode jobs is not covered by this patch series
> and is responsibility of the user space program.
> 
> To help with the review, here are some important information
> about Mali GPU protected mode enter and exit:
> - When the GPU detects a protected mode job needs to be
>   executed, an IRQ is sent to the CPU to notify the kernel
>   driver that the job is blocked until the GPU has entered
>   protected mode. The entering of protected mode is controlled
>   by the kernel driver.
> - The Mali Panthor CSF driver will schedule a tick and evaluate
>   which CS in the CSG to schedule on slot needs protected mode.
>   If the priority of the CSG is not sufficiently high, the
>   protected mode job will not progress until the CSG is
>   scheduled at top priority.
> - The Panthor scheduler notifies the GPU that the blocked
>   protected jobs will soon be able to progress.
> - Once all CSG and CS slots are updated, the scheduler
>   requests the GPU to enter protected mode and waits for
>   it to be acknowledged.
> - If successful, all protected mode jobs will resume execution
>   while normal mode jobs block until the GPU exits
>   protected mode, or the kernel driver rotates the CSGs
>   and forces the GPU to exit protected mode.
> - If unsuccessful, the scheduler will request a GPU reset.
> - When a protected mode job is suspended as a result of
>   the CSGs rotation, the GPU will send an IRQ to the CPU
>   to notify that the protected mode job needs to resume.
> 
> This sequence will continue so long as the user space is
> submitting protected mode jobs.
> 
> Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
> ---
>  drivers/gpu/drm/panthor/panthor_device.h |   3 +
>  drivers/gpu/drm/panthor/panthor_fw.c     |  10 +-
>  drivers/gpu/drm/panthor/panthor_sched.c  | 119 +++++++++++++++++++++--
>  3 files changed, 122 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 406de9e888e2..0c76bfd392a0 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -196,6 +196,9 @@ struct panthor_device {
>  	struct {
>  		/** @heap: Pointer to the protected heap */
>  		struct dma_heap *heap;
> +
> +		/** @pending: Set to true if a protected mode enter request is pending. */
> +		bool pending;

If this is only used by panthor_sched.c, I'd be tempted to keep that in
the panthor_scheduler struct.

>  	} protm;
>  };
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 7822af1533b4..2006d652f4db 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -1025,13 +1025,19 @@ static void panthor_fw_init_global_iface(struct panthor_device *ptdev)
>  	glb_iface->input->progress_timer = PROGRESS_TIMEOUT_CYCLES >> PROGRESS_TIMEOUT_SCALE_SHIFT;
>  	glb_iface->input->idle_timer = panthor_fw_conv_timeout(ptdev, IDLE_HYSTERESIS_US);
>  
> -	/* Enable interrupts we care about. */
> +	/* Enable interrupts we care about.
> +	 *
> +	 * GLB_PROTM_ENTER and GLB_PROTM_EXIT interrupts are only
> +	 * relevant if a protected memory heap is present.
> +	 */
>  	glb_iface->input->ack_irq_mask = GLB_CFG_ALLOC_EN |
>  					 GLB_PING |
>  					 GLB_CFG_PROGRESS_TIMER |
>  					 GLB_CFG_POWEROFF_TIMER |
>  					 GLB_IDLE_EN |
> -					 GLB_IDLE;
> +					 GLB_IDLE |
> +					 (ptdev->protm.heap ?
> +					 (GLB_PROTM_ENTER | GLB_PROTM_EXIT) : 0);

How about we keep things simple and unconditionally enable the PROTM
interrupts? If the group doesn't support protected mode, it shouldn't
generate PROTM events anyway. And if it does, we probably want to know
and kill the group immediately.
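Concretely, if the interrupts are enabled unconditionally, the hunk above would reduce to something like this (a sketch using only the flags already present in the patch):

```c
	/* Enable interrupts we care about. */
	glb_iface->input->ack_irq_mask = GLB_CFG_ALLOC_EN |
					 GLB_PING |
					 GLB_CFG_PROGRESS_TIMER |
					 GLB_CFG_POWEROFF_TIMER |
					 GLB_IDLE_EN |
					 GLB_IDLE |
					 GLB_PROTM_ENTER |
					 GLB_PROTM_EXIT;
```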

>  
>  	panthor_fw_update_reqs(glb_iface, req, GLB_IDLE_EN, GLB_IDLE_EN);
>  	panthor_fw_toggle_reqs(glb_iface, req, ack,
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index e260ed8aef5b..c10a21f9d075 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -573,6 +573,9 @@ struct panthor_group {
>  	/** @fatal_queues: Bitmask reflecting the queues that hit a fatal exception. */
>  	u32 fatal_queues;
>  
> +	/** @protm_queues: Bitmask reflecting the queues that are waiting on a CS_PROTM_PENDING. */
> +	u32 protm_queues;

How about protm_pending_queues or protm_pending_mask, otherwise
it gets confusing. I initially thought it was a mask encoding the CS
that are in protected mode, but the bit is cleared as soon as we
acknowledge a request. BTW, what happens when we resume a CS that was
suspended inside a protected section? Do we get a new PROTM_PEND event?

I feel like the whole sequence and corner cases should be documented
somewhere in the code, because it's not obvious.
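As a starting point, a comment derived from the cover letter could capture the happy path and the rotation corner case, e.g. (sketch only, the exact wording would need checking against the HW spec):

```c
/*
 * Protected mode flow (sketch, derived from the cover letter):
 *
 * 1. The FW raises a PROTM_PEND event on a CS; that queue is blocked
 *    until the GPU enters protected mode.
 * 2. The scheduler ticks and evaluates which CS in the on-slot CSGs
 *    needs protected mode; CS_PROTM_PENDING is only acknowledged once
 *    the group is scheduled at top priority.
 * 3. Once all CSG/CS slots are updated, the driver toggles
 *    GLB_PROTM_ENTER and waits for the FW ack; on timeout a GPU reset
 *    is scheduled.
 * 4. Protected jobs then run while normal jobs block, until the GPU
 *    exits protected mode or CSG rotation forces an exit; a protected
 *    job suspended by rotation triggers a new IRQ when it needs to
 *    resume.
 */
```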

> +
>  	/** @tiler_oom: Mask of queues that have a tiler OOM event to process. */
>  	atomic_t tiler_oom;
>  
> @@ -870,6 +873,31 @@ panthor_queue_get_syncwait_obj(struct panthor_group *group, struct panthor_queue
>  	return NULL;
>  }
>  
> +static int glb_protm_enter(struct panthor_device *ptdev)
> +{
> +	struct panthor_fw_global_iface *glb_iface;
> +	u32 acked;
> +	int ret;
> +
> +	lockdep_assert_held(&ptdev->scheduler->lock);
> +
> +	if (!ptdev->protm.pending)
> +		return 0;
> +
> +	glb_iface = panthor_fw_get_glb_iface(ptdev);
> +
> +	panthor_fw_toggle_reqs(glb_iface, req, ack, GLB_PROTM_ENTER);
> +	gpu_write(ptdev, CSF_DOORBELL(CSF_GLB_DOORBELL_ID), 1);
> +
> +	ret = panthor_fw_glb_wait_acks(ptdev, GLB_PROTM_ENTER, &acked, 4000);
> +	if (ret)
> +		drm_err(&ptdev->base, "FW protm enter timeout, scheduling a reset");
> +	else
> +		ptdev->protm.pending = false;
> +
> +	return ret;
> +}
> +
>  static void group_free_queue(struct panthor_group *group, struct panthor_queue *queue)
>  {
>  	if (IS_ERR_OR_NULL(queue))
> @@ -1027,6 +1055,7 @@ group_unbind_locked(struct panthor_group *group)
>   * @ptdev: Device.
>   * @csg_id: Group slot ID.
>   * @cs_id: Queue slot ID.
> + * @protm_ack: Acknowledge pending protected mode queues
>   *
>   * Program a queue slot with the queue information so things can start being
>   * executed on this queue.
> @@ -1034,10 +1063,13 @@ group_unbind_locked(struct panthor_group *group)
>   * The group slot must have a group bound to it already (group_bind_locked()).
>   */
>  static void
> -cs_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 cs_id)
> +cs_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 cs_id, bool protm_ack)
>  {
> -	struct panthor_queue *queue = ptdev->scheduler->csg_slots[csg_id].group->queues[cs_id];
> +	struct panthor_group * const group = ptdev->scheduler->csg_slots[csg_id].group;
> +	struct panthor_queue *queue = group->queues[cs_id];
>  	struct panthor_fw_cs_iface *cs_iface = panthor_fw_get_cs_iface(ptdev, csg_id, cs_id);
> +	u32 const cs_protm_pending_mask =
> +		protm_ack && (group->protm_queues & BIT(cs_id)) ? CS_PROTM_PENDING : 0;
>  
>  	lockdep_assert_held(&ptdev->scheduler->lock);
>  
> @@ -1055,15 +1087,22 @@ cs_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 cs_id)
>  			       CS_IDLE_SYNC_WAIT |
>  			       CS_IDLE_EMPTY |
>  			       CS_STATE_START |
> -			       CS_EXTRACT_EVENT,
> +			       CS_EXTRACT_EVENT |
> +			       cs_protm_pending_mask,
>  			       CS_IDLE_SYNC_WAIT |
>  			       CS_IDLE_EMPTY |
>  			       CS_STATE_MASK |
> -			       CS_EXTRACT_EVENT);
> +			       CS_EXTRACT_EVENT |
> +			       CS_PROTM_PENDING);
>  	if (queue->iface.input->insert != queue->iface.input->extract && queue->timeout_suspended) {
>  		drm_sched_resume_timeout(&queue->scheduler, queue->remaining_time);
>  		queue->timeout_suspended = false;
>  	}
> +
> +	if (cs_protm_pending_mask) {
> +		group->protm_queues &= ~BIT(cs_id);
> +		ptdev->protm.pending = true;
> +	}
>  }
>  
>  /**
> @@ -1274,7 +1313,7 @@ csg_slot_sync_state_locked(struct panthor_device *ptdev, u32 csg_id)
>  }
>  
>  static int
> -csg_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 priority)
> +csg_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 priority, bool protm_ack)
>  {
>  	struct panthor_fw_csg_iface *csg_iface;
>  	struct panthor_csg_slot *csg_slot;
> @@ -1291,14 +1330,14 @@ csg_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 priority)
>  
>  	csg_slot = &ptdev->scheduler->csg_slots[csg_id];
>  	group = csg_slot->group;
> -	if (!group || group->state == PANTHOR_CS_GROUP_ACTIVE)
> +	if (!group || (group->state == PANTHOR_CS_GROUP_ACTIVE && !protm_ack))
>  		return 0;
>  
>  	csg_iface = panthor_fw_get_csg_iface(group->ptdev, csg_id);
>  
>  	for (i = 0; i < group->queue_count; i++) {
>  		if (group->queues[i]) {
> -			cs_slot_prog_locked(ptdev, csg_id, i);
> +			cs_slot_prog_locked(ptdev, csg_id, i, protm_ack);
>  			queue_mask |= BIT(i);
>  		}
>  	}
> @@ -1329,6 +1368,34 @@ csg_slot_prog_locked(struct panthor_device *ptdev, u32 csg_id, u32 priority)
>  	return 0;
>  }
>  
> +static void
> +cs_slot_process_protm_pending_event_locked(struct panthor_device *ptdev,
> +					   u32 csg_id, u32 cs_id)
> +{
> +	struct panthor_scheduler *sched = ptdev->scheduler;
> +	struct panthor_csg_slot *csg_slot = &sched->csg_slots[csg_id];
> +	struct panthor_group *group = csg_slot->group;
> +
> +	lockdep_assert_held(&sched->lock);
> +
> +	if (!group)
> +		return;
> +
> +	/* No protected memory heap: a user space program tried to
> +	 * submit protected mode jobs, resulting in the GPU raising
> +	 * a CS_PROTM_PENDING request.
> +	 *
> +	 * This scenario is invalid and the protected mode jobs must
> +	 * not be allowed to progress.
> +	 */
> +	if (drm_WARN_ON_ONCE(&ptdev->base, !ptdev->protm.heap))
> +		return;
> +
> +	group->protm_queues |= BIT(cs_id);
> +
> +	sched_queue_delayed_work(sched, tick, 0);
> +}
> +
>  static void
>  cs_slot_process_fatal_event_locked(struct panthor_device *ptdev,
>  				   u32 csg_id, u32 cs_id)
> @@ -1566,6 +1633,9 @@ static bool cs_slot_process_irq_locked(struct panthor_device *ptdev,
>  	if (events & CS_TILER_OOM)
>  		cs_slot_process_tiler_oom_event_locked(ptdev, csg_id, cs_id);
>  
> +	if (events & CS_PROTM_PENDING)
> +		cs_slot_process_protm_pending_event_locked(ptdev, csg_id, cs_id);
> +
>  	/* We don't acknowledge the TILER_OOM event since its handling is
>  	 * deferred to a separate work.
>  	 */
> @@ -1703,6 +1773,17 @@ static void sched_process_idle_event_locked(struct panthor_device *ptdev)
>  	sched_queue_delayed_work(ptdev->scheduler, tick, 0);
>  }
>  
> +static void sched_process_protm_exit_event_locked(struct panthor_device *ptdev)
> +{
> +	struct panthor_fw_global_iface *glb_iface = panthor_fw_get_glb_iface(ptdev);
> +
> +	lockdep_assert_held(&ptdev->scheduler->lock);
> +
> +	/* Acknowledge the protm exit and schedule a tick. */
> +	panthor_fw_update_reqs(glb_iface, req, glb_iface->output->ack, GLB_PROTM_EXIT);

I would keep this ack in panthor_fw.c (just like I would keep the enter
handler at the _fw.c level).

> +	sched_queue_delayed_work(ptdev->scheduler, tick, 0);
> +}
> +
>  /**
>   * sched_process_global_irq_locked() - Process the scheduling part of a global IRQ
>   * @ptdev: Device.
> @@ -1720,6 +1801,9 @@ static void sched_process_global_irq_locked(struct panthor_device *ptdev)
>  
>  	if (evts & GLB_IDLE)
>  		sched_process_idle_event_locked(ptdev);
> +
> +	if (evts & GLB_PROTM_EXIT)
> +		sched_process_protm_exit_event_locked(ptdev);
>  }
>  
>  static void process_fw_events_work(struct work_struct *work)
> @@ -2238,9 +2322,22 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
>  		list_for_each_entry(group, &ctx->groups[prio], run_node) {
>  			int csg_id = group->csg_id;
>  			struct panthor_fw_csg_iface *csg_iface;
> +			bool protm_ack = false;
> +
> +			/* The highest priority group has pending protected mode queues */
> +			if (new_csg_prio == MAX_CSG_PRIO && group->protm_queues)
> +				protm_ack = true;

I'd rather have an `enter_protm` field added to panthor_sched_tick_ctx,
and then have this ctx passed to csg_slot_prog_locked() so they can
check themselves if they need to ack PROTM requests.

>  
>  			if (csg_id >= 0) {
>  				new_csg_prio--;
> +
> +				/* This group is on slot but at least one queue
> +				 * is waiting for PROTM_ENTER.
> +				 */
> +				if (protm_ack)
> +					csg_slot_prog_locked(ptdev, csg_id,
> +							     new_csg_prio, protm_ack);
> +
>  				continue;
>  			}
>  
> @@ -2251,7 +2348,7 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
>  			csg_iface = panthor_fw_get_csg_iface(ptdev, csg_id);
>  			csg_slot = &sched->csg_slots[csg_id];
>  			group_bind_locked(group, csg_id);
> -			csg_slot_prog_locked(ptdev, csg_id, new_csg_prio--);
> +			csg_slot_prog_locked(ptdev, csg_id, new_csg_prio--, protm_ack);
>  			csgs_upd_ctx_queue_reqs(ptdev, &upd_ctx, csg_id,
>  						group->state == PANTHOR_CS_GROUP_SUSPENDED ?
>  						CSG_STATE_RESUME : CSG_STATE_START,
> @@ -2303,6 +2400,12 @@ tick_ctx_apply(struct panthor_scheduler *sched, struct panthor_sched_tick_ctx *c
>  
>  	sched->used_csg_slot_count = ctx->group_count;
>  	sched->might_have_idle_groups = ctx->idle_group_count > 0;
> +
> +	ret = glb_protm_enter(ptdev);
> +	if (ret) {
> +		panthor_device_schedule_reset(ptdev);
> +		ctx->csg_upd_failed_mask = U32_MAX;
> +	}
>  }
>  
>  static u64


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 4/5] drm/panthor: Add support for protected memory allocation in panthor
  2025-01-30 13:09 ` [RFC PATCH 4/5] drm/panthor: Add support for protected memory allocation in panthor Florent Tomasin
@ 2025-02-11 11:04   ` Boris Brezillon
  2025-02-11 11:20     ` Boris Brezillon
  2025-03-12 20:05   ` Adrian Larumbe
  1 sibling, 1 reply; 48+ messages in thread
From: Boris Brezillon @ 2025-02-11 11:04 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

On Thu, 30 Jan 2025 13:09:00 +0000
Florent Tomasin <florent.tomasin@arm.com> wrote:

> This patch allows Panthor to allocate buffer objects from a
> protected heap. The Panthor driver should be seen as a consumer
> of the heap and not an exporter.
> 
> To help with the review of this patch, here are important information
> about the Mali GPU protected mode support:
> - On CSF FW load, the Panthor driver must allocate a protected
>   buffer object to hold data to use by the FW when in protected
>   mode. This protected buffer object is owned by the device
>   and does not belong to a process.
> - On CSG creation, the Panthor driver must allocate a protected
>   suspend buffer object for the FW to store data when suspending
>   the CSG while in protected mode. The kernel owns this allocation
>   and does not allow user space mapping. The format of the data
>   in this buffer is only known by the FW and does not need to be
>   shared with other entities.
> 
> To summarize, Mali GPUs require allocations of protected buffer
> objects at the kernel level.
> 
> * How is the protected heap accessed by the Panthor driver?
> The driver will retrieve the protected heap using the name of the
> heap provided to the driver via the DTB as attribute.
> If the heap is not yet available, the panthor driver will defer
> the probe until created. It is an integration error to provide
> a heap name that does not exist or is never created in the
> DTB node.
> 
> * How is the Panthor driver allocating from the heap?
> Panthor is calling the DMA heap allocation function
> and obtains a DMA buffer from it. This buffer is then
> registered to GEM via PRIME by importing the DMA buffer.
> 
> Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
> ---
>  drivers/gpu/drm/panthor/Kconfig          |  1 +
>  drivers/gpu/drm/panthor/panthor_device.c | 22 ++++++++++-
>  drivers/gpu/drm/panthor/panthor_device.h |  7 ++++
>  drivers/gpu/drm/panthor/panthor_fw.c     | 36 +++++++++++++++--
>  drivers/gpu/drm/panthor/panthor_fw.h     |  2 +
>  drivers/gpu/drm/panthor/panthor_gem.c    | 49 ++++++++++++++++++++++--
>  drivers/gpu/drm/panthor/panthor_gem.h    | 16 +++++++-
>  drivers/gpu/drm/panthor/panthor_heap.c   |  2 +
>  drivers/gpu/drm/panthor/panthor_sched.c  |  5 ++-
>  9 files changed, 130 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panthor/Kconfig b/drivers/gpu/drm/panthor/Kconfig
> index 55b40ad07f3b..c0208b886d9f 100644
> --- a/drivers/gpu/drm/panthor/Kconfig
> +++ b/drivers/gpu/drm/panthor/Kconfig
> @@ -7,6 +7,7 @@ config DRM_PANTHOR
>  	depends on !GENERIC_ATOMIC64  # for IOMMU_IO_PGTABLE_LPAE
>  	depends on MMU
>  	select DEVFREQ_GOV_SIMPLE_ONDEMAND
> +	select DMABUF_HEAPS
>  	select DRM_EXEC
>  	select DRM_GEM_SHMEM_HELPER
>  	select DRM_GPUVM
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 00f7b8ce935a..1018e5c90a0e 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -4,7 +4,9 @@
>  /* Copyright 2023 Collabora ltd. */
>  
>  #include <linux/clk.h>
> +#include <linux/dma-heap.h>
>  #include <linux/mm.h>
> +#include <linux/of.h>
>  #include <linux/platform_device.h>
>  #include <linux/pm_domain.h>
>  #include <linux/pm_runtime.h>
> @@ -102,6 +104,9 @@ void panthor_device_unplug(struct panthor_device *ptdev)
>  	panthor_mmu_unplug(ptdev);
>  	panthor_gpu_unplug(ptdev);
>  
> +	if (ptdev->protm.heap)
> +		dma_heap_put(ptdev->protm.heap);
> +
>  	pm_runtime_dont_use_autosuspend(ptdev->base.dev);
>  	pm_runtime_put_sync_suspend(ptdev->base.dev);
>  
> @@ -172,6 +177,7 @@ int panthor_device_init(struct panthor_device *ptdev)
>  	u32 *dummy_page_virt;
>  	struct resource *res;
>  	struct page *p;
> +	const char *protm_heap_name;
>  	int ret;
>  
>  	ret = panthor_gpu_coherency_init(ptdev);
> @@ -246,9 +252,19 @@ int panthor_device_init(struct panthor_device *ptdev)
>  			return ret;
>  	}
>  
> +	/* If a protected heap is specified but not found, defer the probe until created */
> +	if (!of_property_read_string(ptdev->base.dev->of_node, "protected-heap-name",
> +				     &protm_heap_name)) {
> +		ptdev->protm.heap = dma_heap_find(protm_heap_name);
> +		if (!ptdev->protm.heap) {
> +			ret = -EPROBE_DEFER;
> +			goto err_rpm_put;
> +		}
> +	}
> +
>  	ret = panthor_gpu_init(ptdev);
>  	if (ret)
> -		goto err_rpm_put;
> +		goto err_dma_heap_put;
>  
>  	ret = panthor_mmu_init(ptdev);
>  	if (ret)
> @@ -286,6 +302,10 @@ int panthor_device_init(struct panthor_device *ptdev)
>  err_unplug_gpu:
>  	panthor_gpu_unplug(ptdev);
>  
> +err_dma_heap_put:
> +	if (ptdev->protm.heap)
> +		dma_heap_put(ptdev->protm.heap);
> +
>  err_rpm_put:
>  	pm_runtime_put_sync_suspend(ptdev->base.dev);
>  	return ret;
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 0e68f5a70d20..406de9e888e2 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -7,6 +7,7 @@
>  #define __PANTHOR_DEVICE_H__
>  
>  #include <linux/atomic.h>
> +#include <linux/dma-heap.h>
>  #include <linux/io-pgtable.h>
>  #include <linux/regulator/consumer.h>
>  #include <linux/sched.h>
> @@ -190,6 +191,12 @@ struct panthor_device {
>  
>  	/** @fast_rate: Maximum device clock frequency. Set by DVFS */
>  	unsigned long fast_rate;
> +
> +	/** @protm: Protected mode related data. */
> +	struct {
> +		/** @heap: Pointer to the protected heap */
> +		struct dma_heap *heap;
> +	} protm;
>  };
>  
>  struct panthor_gpu_usage {
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 4a2e36504fea..7822af1533b4 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -458,6 +458,7 @@ panthor_fw_alloc_queue_iface_mem(struct panthor_device *ptdev,
>  
>  	mem = panthor_kernel_bo_create(ptdev, ptdev->fw->vm, SZ_8K,
>  				       DRM_PANTHOR_BO_NO_MMAP,
> +				       0,
>  				       DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
>  				       DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
>  				       PANTHOR_VM_KERNEL_AUTO_VA);
> @@ -491,6 +492,28 @@ panthor_fw_alloc_suspend_buf_mem(struct panthor_device *ptdev, size_t size)
>  {
>  	return panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev), size,
>  					DRM_PANTHOR_BO_NO_MMAP,
> +					0,
> +					DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
> +					PANTHOR_VM_KERNEL_AUTO_VA);
> +}
> +
> +/**
> + * panthor_fw_alloc_protm_suspend_buf_mem() - Allocate a protm suspend buffer
> + * for a command stream group.
> + * @ptdev: Device.
> + * @size: Size of the protm suspend buffer.
> + *
> + * Return: A valid pointer in case of success, NULL if no protected heap, an ERR_PTR() otherwise.
> + */
> +struct panthor_kernel_bo *
> +panthor_fw_alloc_protm_suspend_buf_mem(struct panthor_device *ptdev, size_t size)
> +{
> +	if (!ptdev->protm.heap)
> +		return NULL;
> +
> +	return panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev), size,
> +					DRM_PANTHOR_BO_NO_MMAP,
> +					DRM_PANTHOR_KBO_PROTECTED_HEAP,
>  					DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
>  					PANTHOR_VM_KERNEL_AUTO_VA);
>  }
> @@ -503,6 +526,7 @@ static int panthor_fw_load_section_entry(struct panthor_device *ptdev,
>  	ssize_t vm_pgsz = panthor_vm_page_size(ptdev->fw->vm);
>  	struct panthor_fw_binary_section_entry_hdr hdr;
>  	struct panthor_fw_section *section;
> +	bool is_protm_section = false;
>  	u32 section_size;
>  	u32 name_len;
>  	int ret;
> @@ -541,10 +565,13 @@ static int panthor_fw_load_section_entry(struct panthor_device *ptdev,
>  		return -EINVAL;
>  	}
>  
> -	if (hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_PROT) {
> +	if ((hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_PROT) && !ptdev->protm.heap) {
>  		drm_warn(&ptdev->base,
>  			 "Firmware protected mode entry not supported, ignoring");
>  		return 0;
> +	} else if ((hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_PROT) && ptdev->protm.heap) {
> +		drm_info(&ptdev->base, "Firmware protected mode entry supported");
> +		is_protm_section = true;
>  	}
>  
>  	if (hdr.va.start == CSF_MCU_SHARED_REGION_START &&
> @@ -610,9 +637,10 @@ static int panthor_fw_load_section_entry(struct panthor_device *ptdev,
>  			vm_map_flags |= DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED;
>  
>  		section->mem = panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev),
> -							section_size,
> -							DRM_PANTHOR_BO_NO_MMAP,
> -							vm_map_flags, va);
> +					section_size,
> +					DRM_PANTHOR_BO_NO_MMAP,
> +					(is_protm_section ? DRM_PANTHOR_KBO_PROTECTED_HEAP : 0),
> +					vm_map_flags, va);
>  		if (IS_ERR(section->mem))
>  			return PTR_ERR(section->mem);
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.h b/drivers/gpu/drm/panthor/panthor_fw.h
> index 22448abde992..29042d0dc60c 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.h
> +++ b/drivers/gpu/drm/panthor/panthor_fw.h
> @@ -481,6 +481,8 @@ panthor_fw_alloc_queue_iface_mem(struct panthor_device *ptdev,
>  				 u32 *input_fw_va, u32 *output_fw_va);
>  struct panthor_kernel_bo *
>  panthor_fw_alloc_suspend_buf_mem(struct panthor_device *ptdev, size_t size);
> +struct panthor_kernel_bo *
> +panthor_fw_alloc_protm_suspend_buf_mem(struct panthor_device *ptdev, size_t size);
>  
>  struct panthor_vm *panthor_fw_vm(struct panthor_device *ptdev);
>  
> diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
> index 8244a4e6c2a2..88caf928acd0 100644
> --- a/drivers/gpu/drm/panthor/panthor_gem.c
> +++ b/drivers/gpu/drm/panthor/panthor_gem.c
> @@ -9,10 +9,14 @@
>  
>  #include <drm/panthor_drm.h>
>  
> +#include <uapi/linux/dma-heap.h>
> +
>  #include "panthor_device.h"
>  #include "panthor_gem.h"
>  #include "panthor_mmu.h"
>  
> +MODULE_IMPORT_NS(DMA_BUF);

Uh, that's ugly. If the consensus is to let panthor allocate
its protected buffers from a heap, let's just add a dependency on
DMABUF_HEAPS instead.

> +
>  static void panthor_gem_free_object(struct drm_gem_object *obj)
>  {
>  	struct panthor_gem_object *bo = to_panthor_bo(obj);
> @@ -31,6 +35,7 @@ static void panthor_gem_free_object(struct drm_gem_object *obj)
>   */
>  void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
>  {
> +	struct dma_buf *dma_bo = NULL;
>  	struct panthor_vm *vm;
>  	int ret;
>  
> @@ -38,6 +43,10 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
>  		return;
>  
>  	vm = bo->vm;
> +
> +	if (bo->flags & DRM_PANTHOR_KBO_PROTECTED_HEAP)
> +		dma_bo = bo->obj->import_attach->dmabuf;
> +
>  	panthor_kernel_bo_vunmap(bo);
>  
>  	if (drm_WARN_ON(bo->obj->dev,
> @@ -51,6 +60,9 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
>  	panthor_vm_free_va(vm, &bo->va_node);
>  	drm_gem_object_put(bo->obj);
>  
> +	if (dma_bo)
> +		dma_buf_put(dma_bo);
> +
>  out_free_bo:
>  	panthor_vm_put(vm);
>  	kfree(bo);
> @@ -62,6 +74,7 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
>   * @vm: VM to map the GEM to. If NULL, the kernel object is not GPU mapped.
>   * @size: Size of the buffer object.
>   * @bo_flags: Combination of drm_panthor_bo_flags flags.
> + * @kbo_flags: Combination of drm_panthor_kbo_flags flags.
>   * @vm_map_flags: Combination of drm_panthor_vm_bind_op_flags (only those
>   * that are related to map operations).
>   * @gpu_va: GPU address assigned when mapping to the VM.
> @@ -72,9 +85,11 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
>   */
>  struct panthor_kernel_bo *
>  panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
> -			 size_t size, u32 bo_flags, u32 vm_map_flags,
> +			 size_t size, u32 bo_flags, u32 kbo_flags, u32 vm_map_flags,

Hm, I'm not convinced by this kbo_flags. How about we have a dedicated
panthor_kernel_protected_bo_create() helper that takes a dmabuf object
to import, and then we simply store this dmabuf in panthor_kernel_bo to
reflect the fact this is a protected BO.

>  			 u64 gpu_va)
>  {
> +	struct dma_buf *dma_bo = NULL;
> +	struct drm_gem_object *gem_obj = NULL;
>  	struct drm_gem_shmem_object *obj;
>  	struct panthor_kernel_bo *kbo;
>  	struct panthor_gem_object *bo;
> @@ -87,14 +102,38 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
>  	if (!kbo)
>  		return ERR_PTR(-ENOMEM);
>  
> -	obj = drm_gem_shmem_create(&ptdev->base, size);
> +	if (kbo_flags & DRM_PANTHOR_KBO_PROTECTED_HEAP) {
> +		if (!ptdev->protm.heap) {
> +			ret = -EINVAL;
> +			goto err_free_bo;
> +		}
> +
> +		dma_bo = dma_heap_buffer_alloc(ptdev->protm.heap, size,
> +					       DMA_HEAP_VALID_FD_FLAGS, DMA_HEAP_VALID_HEAP_FLAGS);
> +		if (!dma_bo) {
> +			ret = -ENOMEM;
> +			goto err_free_bo;
> +		}
> +
> +		gem_obj = drm_gem_prime_import(&ptdev->base, dma_bo);
> +		if (IS_ERR(gem_obj)) {
> +			ret = PTR_ERR(gem_obj);
> +			goto err_free_dma_bo;
> +		}
> +
> +		obj = to_drm_gem_shmem_obj(gem_obj);
> +	} else {
> +		obj = drm_gem_shmem_create(&ptdev->base, size);
> +	}
> +
>  	if (IS_ERR(obj)) {
>  		ret = PTR_ERR(obj);
> -		goto err_free_bo;
> +		goto err_free_dma_bo;
>  	}
>  
>  	bo = to_panthor_bo(&obj->base);
>  	kbo->obj = &obj->base;
> +	kbo->flags = kbo_flags;
>  	bo->flags = bo_flags;
>  
>  	/* The system and GPU MMU page size might differ, which becomes a
> @@ -124,6 +163,10 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
>  err_put_obj:
>  	drm_gem_object_put(&obj->base);
>  
> +err_free_dma_bo:
> +	if (dma_bo)
> +		dma_buf_put(dma_bo);
> +
>  err_free_bo:
>  	kfree(kbo);
>  	return ERR_PTR(ret);
> diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
> index e43021cf6d45..d4fe8ae9f0a8 100644
> --- a/drivers/gpu/drm/panthor/panthor_gem.h
> +++ b/drivers/gpu/drm/panthor/panthor_gem.h
> @@ -13,6 +13,17 @@
>  
>  struct panthor_vm;
>  
> +/**
> + * enum drm_panthor_kbo_flags -  Kernel buffer object flags, passed at creation time
> + */
> +enum drm_panthor_kbo_flags {
> +	/**
> +	 * @DRM_PANTHOR_KBO_PROTECTED_HEAP: The buffer object will be allocated
> +	 * from a DMA-Buf protected heap.
> +	 */
> +	DRM_PANTHOR_KBO_PROTECTED_HEAP = (1 << 0),
> +};
> +
>  /**
>   * struct panthor_gem_object - Driver specific GEM object.
>   */
> @@ -75,6 +86,9 @@ struct panthor_kernel_bo {
>  	 * @kmap: Kernel CPU mapping of @gem.
>  	 */
>  	void *kmap;
> +
> +	/** @flags: Combination of drm_panthor_kbo_flags flags. */
> +	u32 flags;
>  };
>  
>  static inline
> @@ -138,7 +152,7 @@ panthor_kernel_bo_vunmap(struct panthor_kernel_bo *bo)
>  
>  struct panthor_kernel_bo *
>  panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
> -			 size_t size, u32 bo_flags, u32 vm_map_flags,
> +			 size_t size, u32 bo_flags, u32 kbo_flags, u32 vm_map_flags,
>  			 u64 gpu_va);
>  
>  void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo);
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index 3796a9eb22af..5395f0d90360 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -146,6 +146,7 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
>  
>  	chunk->bo = panthor_kernel_bo_create(ptdev, vm, heap->chunk_size,
>  					     DRM_PANTHOR_BO_NO_MMAP,
> +					     0,
>  					     DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
>  					     PANTHOR_VM_KERNEL_AUTO_VA);
>  	if (IS_ERR(chunk->bo)) {
> @@ -549,6 +550,7 @@ panthor_heap_pool_create(struct panthor_device *ptdev, struct panthor_vm *vm)
>  
>  	pool->gpu_contexts = panthor_kernel_bo_create(ptdev, vm, bosize,
>  						      DRM_PANTHOR_BO_NO_MMAP,
> +						      0,
>  						      DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
>  						      PANTHOR_VM_KERNEL_AUTO_VA);
>  	if (IS_ERR(pool->gpu_contexts)) {
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index ef4bec7ff9c7..e260ed8aef5b 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -3298,6 +3298,7 @@ group_create_queue(struct panthor_group *group,
>  	queue->ringbuf = panthor_kernel_bo_create(group->ptdev, group->vm,
>  						  args->ringbuf_size,
>  						  DRM_PANTHOR_BO_NO_MMAP,
> +						  0,
>  						  DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
>  						  DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
>  						  PANTHOR_VM_KERNEL_AUTO_VA);
> @@ -3328,6 +3329,7 @@ group_create_queue(struct panthor_group *group,
>  					 queue->profiling.slot_count *
>  					 sizeof(struct panthor_job_profiling_data),
>  					 DRM_PANTHOR_BO_NO_MMAP,
> +					 0,
>  					 DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
>  					 DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
>  					 PANTHOR_VM_KERNEL_AUTO_VA);
> @@ -3435,7 +3437,7 @@ int panthor_group_create(struct panthor_file *pfile,
>  	}
>  
>  	suspend_size = csg_iface->control->protm_suspend_size;
> -	group->protm_suspend_buf = panthor_fw_alloc_suspend_buf_mem(ptdev, suspend_size);
> +	group->protm_suspend_buf = panthor_fw_alloc_protm_suspend_buf_mem(ptdev, suspend_size);

This predates your patchset, but I think we should refrain from
allocating a protm suspend buffer if the context is not flagged as
protected. This involves extending the uAPI to pass a new flag to the
GROUP_CREATE ioctl (repurposing the pad field in
drm_panthor_group_create into a flag field, and defining an
drm_panthor_group_create_flags enum).

>  	if (IS_ERR(group->protm_suspend_buf)) {
>  		ret = PTR_ERR(group->protm_suspend_buf);
>  		group->protm_suspend_buf = NULL;
> @@ -3446,6 +3448,7 @@ int panthor_group_create(struct panthor_file *pfile,
>  						   group_args->queues.count *
>  						   sizeof(struct panthor_syncobj_64b),
>  						   DRM_PANTHOR_BO_NO_MMAP,
> +						   0,
>  						   DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
>  						   DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
>  						   PANTHOR_VM_KERNEL_AUTO_VA);



* Re: [RFC PATCH 4/5] drm/panthor: Add support for protected memory allocation in panthor
  2025-02-11 11:04   ` Boris Brezillon
@ 2025-02-11 11:20     ` Boris Brezillon
  0 siblings, 0 replies; 48+ messages in thread
From: Boris Brezillon @ 2025-02-11 11:20 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Steven Price, Liviu Dudau, Maarten Lankhorst, Maxime Ripard,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

On Tue, 11 Feb 2025 12:04:48 +0100
Boris Brezillon <boris.brezillon@collabora.com> wrote:

> > --- a/drivers/gpu/drm/panthor/panthor_gem.c
> > +++ b/drivers/gpu/drm/panthor/panthor_gem.c
> > @@ -9,10 +9,14 @@
> >  
> >  #include <drm/panthor_drm.h>
> >  
> > +#include <uapi/linux/dma-heap.h>
> > +
> >  #include "panthor_device.h"
> >  #include "panthor_gem.h"
> >  #include "panthor_mmu.h"
> >  
> > +MODULE_IMPORT_NS(DMA_BUF);  
> 
> Uh, that's ugly. If the consensus is to let panthor allocate
> its protected buffers from a heap, let's just add a dependency on
> DMABUF_HEAPS instead.

My bad, that one is required for dma_buf_put(). Should be

  MODULE_IMPORT_NS("DMA_BUF");

though.


* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-07 15:02               ` Boris Brezillon
  2025-02-07 16:32                 ` Nicolas Dufresne
@ 2025-02-11 13:46                 ` Maxime Ripard
  2025-02-11 14:32                   ` Boris Brezillon
  1 sibling, 1 reply; 48+ messages in thread
From: Maxime Ripard @ 2025-02-11 13:46 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Nicolas Dufresne, Florent Tomasin, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel


Hi Boris,

On Fri, Feb 07, 2025 at 04:02:53PM +0100, Boris Brezillon wrote:
> Sorry for joining the party late, a couple of comments to back Akash
> and Nicolas' concerns.
> 
> On Wed, 05 Feb 2025 13:14:14 -0500
> Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> 
> > On Wednesday 5 February 2025 at 15:52 +0100, Maxime Ripard wrote:
> > > On Mon, Feb 03, 2025 at 04:43:23PM +0000, Florent Tomasin wrote:  
> > > > Hi Maxime, Nicolas
> > > > 
> > > > On 30/01/2025 17:47, Nicolas Dufresne wrote:  
> > > > > On Thursday 30 January 2025 at 17:38 +0100, Maxime Ripard wrote:
> > > > > > Hi Nicolas,
> > > > > > 
> > > > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:  
> > > > > > > > On Thursday 30 January 2025 at 14:46 +0100, Maxime Ripard wrote:
> > > > > > > > Hi,
> > > > > > > > 
> > > > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > > > > 
> > > > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:  
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > > > Mali Panthor CSF kernel driver.
> > > > > > > > > 
> > > > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > > > > 
> > > > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > > > >    heap from which the driver can allocate a protected buffer.
> > > > > > > > >    It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > > > >    Some system includes a trusted FW which is in charge of the protected memory.
> > > > > > > > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > > > > >    driver must import the protected memory from a device specific exporter.  
> > > > > > > > 
> > > > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > > > > 
> > > > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > > > allocation through the usual dma_alloc_* API?  
> > > > > > > 
> > > > > > > How do you then multiplex this region so it can be shared between
> > > > > > > GPU/Camera/Display/Codec drivers and also userspace ?  
> > > > > > 
> > > > > > You could point all the devices to the same reserved memory region, and
> > > > > > they would all allocate from there, including for their userspace-facing
> > > > > > allocations.  
> > > > > 
> > > > > I get that using memory region is somewhat more of an HW description, and
> > > > > aligned with what a DT is supposed to describe. One of the challenges is that
> > > > > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the
> > > > > region is not that useful. You actually need the TEE APP GUID and its IPC
> > > > > protocol. If we can tell drivers to use a heap instead, we can abstract that
> > > > > SoC-specific complexity. I believe each allocated address has to be mapped to
> > > > > a zone, and that can only be done in the secure application. I can imagine
> > > > > similar needs
> > > > > when the protection is done using some sort of a VM / hypervisor.
> > > > > 
> > > > > Nicolas
> > > > >   
> > > > 
> > > > The idea in this design is to abstract the heap management from the
> > > > Panthor kernel driver (which consumes a DMA buffer from it).
> > > > 
> > > > In a system, an integrator would have implemented a secure heap driver,
> > > > which could be based on a TEE, a carved-out memory with restricted
> > > > access, or something else. This heap driver would be responsible for
> > > > implementing the logic to allocate, free, refcount, etc.
> > > > 
> > > > The heap would be retrieved by the Panthor kernel driver in order to
> > > > allocate protected memory to load the FW and allow the GPU to enter/exit
> > > > protected mode. This memory would not belong to a user space process.
> > > > The driver allocates it at the time of loading the FW and initialization
> > > > of the GPU HW. This is protected memory globally owned by the device.  
> > > 
> > > The thing is, it's really not clear why you absolutely need to have the
> > > Panthor driver involved there. It won't be transparent to userspace,
> > > since you'd need an extra flag at allocation time, and the buffers
> > > behave differently. If userspace has to be aware of it, what's the
> > > advantage to your approach compared to just exposing a heap for those
> > > secure buffers, and letting userspace allocate its buffers from there?  
> > 
> > Unless I'm mistaken, the Panthor driver loads its own firmware. Since loading
> > the firmware requires placing the data in a protected memory region, and since
> > this aspect has no exposure to userspace, how can Panthor not be implicated?
> 
> Right, the very reason we need protected memory early is because some
> FW sections need to be allocated from the protected pool, otherwise the
> TEE will fault as soon as the FW enters the so-called 'protected mode'.

How does that work if you don't have some way to allocate the protected
memory? You can still submit jobs to the GPU, but you can't submit /
execute "protected jobs"?

> Now, it's not impossible to work around this limitation. For instance,
> we could load the FW without this protected section by default (what we
> do right now), and then provide a DRM_PANTHOR_ENABLE_FW_PROT_MODE
> ioctl that would take a GEM object imported from a dmabuf allocated
> from the protected dma-heap by userspace. We can then reset the FW and
> allow it to operate in protected mode after that point.

Urgh, I'd rather avoid that dance if possible :)

> This approach has two downsides though:
> 
> 1. We have no way of checking that the memory we're passed is actually
> suitable for FW execution in a protected context. If we're passed
> random memory, this will likely hang the platform as soon as we enter
> protected mode.

It's a current limitation of dma-buf in general, and you'd have the same
issue right now if someone imports an arbitrary buffer, or misconfigures
a !protected heap as a protected one.

I'd really like to have some way to store some metadata in dma_buf, if
only to tell that the buffer is protected.

I suspect you'd also need that if you do things like protected video
playback through a codec, get a protected frame, and want to import that
into the GPU. Depending on how you allocate it, either the codec or the
GPU or both will want to make sure it's protected.

> 2. If the driver already booted the FW and exposed a DRI node, we might
> have GPU workloads running, and doing a FW reset might incur a slight
> delay in GPU job execution.
> 
> I think #1 is a more general issue that applies to suspend buffers
> allocated for GPU contexts too. If we expose ioctls where we take
> protected memory buffers that can possibly lead to crashes if they are
> not real protected memory regions, and we have no way to ensure the
> memory is protected, we probably want to restrict these ioctls/modes to
> some high-privilege CAP_SYS_.
> 
> For #2, that's probably something we can live with, since it's a
> one-shot thing. If it becomes an issue, we can even make sure we enable
> the FW protected-mode before the GPU starts being used for real.
> 
> This being said, I think the problem applies outside Panthor, and it
> might be that the video codec can't reset the FW/HW block to switch to
> protected mode as easily as Panthor.
>
> Note that there are also downsides to the reserved-memory node approach,
> where some bootloader stage would ask the secure FW to reserve a
> portion of mem and pass this through the DT. This sort of thing tends to
> be an integration mess, where you need all the pieces of the stack (TEE,
> u-boot, MTK dma-heap driver, gbm, ...) to be at a certain version to
> work properly. If we go the ioctl() way, we restrict the scope to the
> TEE, gbm/mesa and the protected-dma-heap driver, which is still a lot,
> but we've ripped the bootloader out of the equation at least.

Yeah. I also think there's two discussions in parallel here:

 1) Being able to allocate protected buffers from the driver
 2) Exposing an interface to allocate those to userspace

I'm not really convinced we need 2, but 1 is obviously needed from what
you're saying.

Maxime

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 273 bytes --]

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-11 13:46                 ` Maxime Ripard
@ 2025-02-11 14:32                   ` Boris Brezillon
  2025-02-20 13:32                     ` Maxime Ripard
  0 siblings, 1 reply; 48+ messages in thread
From: Boris Brezillon @ 2025-02-11 14:32 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Nicolas Dufresne, Florent Tomasin, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

On Tue, 11 Feb 2025 14:46:56 +0100
Maxime Ripard <mripard@kernel.org> wrote:

> Hi Boris,
> 
> On Fri, Feb 07, 2025 at 04:02:53PM +0100, Boris Brezillon wrote:
> > Sorry for joining the party late, a couple of comments to back Akash
> > and Nicolas' concerns.
> > 
> > On Wed, 05 Feb 2025 13:14:14 -0500
> > Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> >   
> > > Le mercredi 05 février 2025 à 15:52 +0100, Maxime Ripard a écrit :  
> > > > On Mon, Feb 03, 2025 at 04:43:23PM +0000, Florent Tomasin wrote:    
> > > > > Hi Maxime, Nicolas
> > > > > 
> > > > > On 30/01/2025 17:47, Nicolas Dufresne wrote:    
> > > > > > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :    
> > > > > > > Hi Nicolas,
> > > > > > > 
> > > > > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:    
> > > > > > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :    
> > > > > > > > > Hi,
> > > > > > > > > 
> > > > > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > > > > > 
> > > > > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:    
> > > > > > > > > > Hi,
> > > > > > > > > > 
> > > > > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > > > > Mali Panthor CSF kernel driver.
> > > > > > > > > > 
> > > > > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > > > > > 
> > > > > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > > > > >    heap from which the driver can allocate a protected buffer.
> > > > > > > > > >    It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > > > > >    Some systems include a trusted FW which is in charge of the protected memory.
> > > > > > > > > >    Since this problem is integration-specific, the Mali Panthor CSF kernel
> > > > > > > > > >    driver must import the protected memory from a device-specific exporter.    
> > > > > > > > > 
> > > > > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > > > > > 
> > > > > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > > > > allocation through the usual dma_alloc_* API?    
> > > > > > > > 
> > > > > > > > How do you then multiplex this region so it can be shared between
> > > > > > > > GPU/Camera/Display/Codec drivers and also userspace ?    
> > > > > > > 
> > > > > > > You could point all the devices to the same reserved memory region, and
> > > > > > > they would all allocate from there, including for their userspace-facing
> > > > > > > allocations.    
> > > > > > 
> > > > > > I get that using a memory region is somewhat more of an HW description, and
> > > > > > aligned with what a DT is supposed to describe. One of the challenges is that
> > > > > > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the
> > > > > > region is not that useful. You actually need the TEE app GUID and its IPC
> > > > > > protocol. If we can tell drivers to use a heap instead, we can abstract that
> > > > > > SoC-specific complexity. I believe each allocated address has to be mapped to
> > > > > > a zone, and that can only be done in the secure application. I can imagine
> > > > > > similar needs when the protection is done using some sort of a VM / hypervisor.
> > > > > > 
> > > > > > Nicolas
> > > > > >     
> > > > > 
> > > > > The idea in this design is to abstract the heap management from the
> > > > > Panthor kernel driver (which consumes a DMA buffer from it).
> > > > > 
> > > > > In a system, an integrator would have implemented a secure heap driver,
> > > > > which could be based on a TEE, a carved-out memory with restricted
> > > > > access, or something else. This heap driver would be responsible for
> > > > > implementing the logic to allocate, free, refcount, etc.
> > > > > 
> > > > > The heap would be retrieved by the Panthor kernel driver in order to
> > > > > allocate protected memory to load the FW and allow the GPU to enter/exit
> > > > > protected mode. This memory would not belong to a user space process.
> > > > > The driver allocates it at the time of loading the FW and initialization
> > > > > of the GPU HW. This is protected memory globally owned by the device.    
> > > > 
> > > > The thing is, it's really not clear why you absolutely need to have the
> > > > Panthor driver involved there. It won't be transparent to userspace,
> > > > since you'd need an extra flag at allocation time, and the buffers
> > > > behave differently. If userspace has to be aware of it, what's the
> > > > advantage to your approach compared to just exposing a heap for those
> > > > secure buffers, and letting userspace allocate its buffers from there?    
> > > 
> > > Unless I'm mistaken, the Panthor driver loads its own firmware. Since loading
> > > the firmware requires placing the data in a protected memory region, and since
> > > this aspect has no exposure to userspace, how can Panthor not be implicated?  
> > 
> > Right, the very reason we need protected memory early is because some
> > FW sections need to be allocated from the protected pool, otherwise the
> > TEE will fault as soon as the FW enters the so-called 'protected mode'.  
> 
> How does that work if you don't have some way to allocate the protected
> memory? You can still submit jobs to the GPU, but you can't submit /
> execute "protected jobs"?

Exactly.

> 
> > Now, it's not impossible to work around this limitation. For instance,
> > we could load the FW without this protected section by default (what we
> > do right now), and then provide a DRM_PANTHOR_ENABLE_FW_PROT_MODE
> > ioctl that would take a GEM object imported from a dmabuf allocated
> > from the protected dma-heap by userspace. We can then reset the FW and
> > allow it to operate in protected mode after that point.  
> 
> Urgh, I'd rather avoid that dance if possible :)

Me too.

> 
> > This approach has two downsides though:
> > 
> > 1. We have no way of checking that the memory we're passed is actually
> > suitable for FW execution in a protected context. If we're passed
> > random memory, this will likely hang the platform as soon as we enter
> > protected mode.  
> 
> It's a current limitation of dma-buf in general, and you'd have the same
> issue right now if someone imports an arbitrary buffer, or misconfigures
> a !protected heap as a protected one.
> 
> I'd really like to have some way to store some metadata in dma_buf, if
> only to tell that the buffer is protected.

The dma_buf has a pointer to its ops, so it should be relatively easy
to add an is_dma_buf_coming_from_this_heap() helper. Of course this
implies linking the consumer driver to the heap it's supposed to take
protected buffers from, which is basically the thing being discussed
here :-).
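
For illustration, the ops-pointer check could look something like the sketch
below. This is a userspace mock: the struct definitions are stand-ins for the
real ones in <linux/dma-buf.h>, and the helper name is invented, not an
existing kernel API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mocked-up stand-ins for the kernel structures; only the ops
 * pointer matters for this check. */
struct dma_buf_ops { int dummy; };
struct dma_buf { const struct dma_buf_ops *ops; };

/* The exporter ops of the protected heap, known to the consumer. */
static const struct dma_buf_ops protected_heap_ops = { 0 };

/* Sketch of the proposed helper: a dma_buf comes from a given heap
 * iff its ops pointer is that heap's exporter ops. */
static bool dma_buf_is_from_heap(const struct dma_buf *buf,
				 const struct dma_buf_ops *heap_ops)
{
	return buf && buf->ops == heap_ops;
}
```

A consumer driver could then reject imported buffers whose ops don't match
the heap it was told to trust, which is exactly the linkage being discussed.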

> 
> I suspect you'd also need that if you do things like protected video
> playback through a codec, get a protected frame, and want to import that
> into the GPU. Depending on how you allocate it, either the codec or the
> GPU or both will want to make sure it's protected.

If it's all allocated from a central "protected" heap (even if that
goes through the driver calling the dma_heap_alloc_buffer()), it
shouldn't be an issue.
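
To make the central-heap idea concrete, here is a userspace mock of the
lookup-by-name pattern. Note that dma_heap_find() comes from the MediaTek
series linked earlier, not mainline, and the registry contents and heap
names below are invented for the example.

```c
#include <string.h>
#include <stddef.h>

/* Mock heap registry; in the kernel, dma_heap_find() would walk the
 * real list of registered dma-buf heaps. */
struct dma_heap { const char *name; };

static struct dma_heap mock_heaps[] = {
	{ .name = "system" },
	{ .name = "protected" },	/* hypothetical protected heap */
};

static struct dma_heap *dma_heap_find(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(mock_heaps) / sizeof(mock_heaps[0]); i++)
		if (!strcmp(mock_heaps[i].name, name))
			return &mock_heaps[i];
	/* A consumer driver would typically return -EPROBE_DEFER here,
	 * waiting for the exporter to register the heap. */
	return NULL;
}
```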

> 
> > 2. If the driver already booted the FW and exposed a DRI node, we might
> > have GPU workloads running, and doing a FW reset might incur a slight
> > delay in GPU job execution.
> > 
> > I think #1 is a more general issue that applies to suspend buffers
> > allocated for GPU contexts too. If we expose ioctls where we take
> > protected memory buffers that can possibly lead to crashes if they are
> > not real protected memory regions, and we have no way to ensure the
> > memory is protected, we probably want to restrict these ioctls/modes to
> > some high-privilege CAP_SYS_.
> > 
> > For #2, that's probably something we can live with, since it's a
> > one-shot thing. If it becomes an issue, we can even make sure we enable
> > the FW protected-mode before the GPU starts being used for real.
> > 
> > This being said, I think the problem applies outside Panthor, and it
> > might be that the video codec can't reset the FW/HW block to switch to
> > protected mode as easily as Panthor.
> >
> > Note that there are also downsides to the reserved-memory node approach,
> > where some bootloader stage would ask the secure FW to reserve a
> > portion of mem and pass this through the DT. This sort of thing tends to
> > be an integration mess, where you need all the pieces of the stack (TEE,
> > u-boot, MTK dma-heap driver, gbm, ...) to be at a certain version to
> > work properly. If we go the ioctl() way, we restrict the scope to the
> > TEE, gbm/mesa and the protected-dma-heap driver, which is still a lot,
> > but we've ripped the bootloader out of the equation at least.  
> 
> Yeah. I also think there's two discussions in parallel here:
> 
>  1) Being able to allocate protected buffers from the driver
>  2) Exposing an interface to allocate those to userspace
> 
> I'm not really convinced we need 2, but 1 is obviously needed from what
> you're saying.

I suspect we need #2 for GBM, still. But that's what dma-heaps are for,
so I don't think that's a problem.


* Re: [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding
  2025-02-09 11:56           ` Krzysztof Kozlowski
@ 2025-02-12  9:25             ` Florent Tomasin
  0 siblings, 0 replies; 48+ messages in thread
From: Florent Tomasin @ 2025-02-12  9:25 UTC (permalink / raw)
  To: Krzysztof Kozlowski, Nicolas Dufresne, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Boris Brezillon, Steven Price,
	Liviu Dudau, Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann,
	David Airlie, Simona Vetter, Sumit Semwal, Benjamin Gaignard,
	Brian Starkey, John Stultz, T . J . Mercier, Christian König,
	Matthias Brugger, AngeloGioacchino Del Regno, Yong Wu
  Cc: dmaengine, devicetree, linux-kernel, dri-devel, linux-media,
	linaro-mm-sig, linux-arm-kernel, linux-mediatek, nd, Akash Goel

Hi Nicolas and Krzysztof,

On 09/02/2025 11:56, Krzysztof Kozlowski wrote:
> On 06/02/2025 22:21, Nicolas Dufresne wrote:
>> Le mercredi 05 février 2025 à 10:13 +0100, Krzysztof Kozlowski a écrit :
>>> On 03/02/2025 16:31, Florent Tomasin wrote:
>>>> Hi Krzysztof
>>>>
>>>> On 30/01/2025 13:25, Krzysztof Kozlowski wrote:
>>>>> On 30/01/2025 14:08, Florent Tomasin wrote:
>>>>>> Allow mali-valhall-csf driver to retrieve a protected
>>>>>> heap at probe time by passing the name of the heap
>>>>>> as an attribute to the device-tree GPU node.
>>>>>
>>>>> Please wrap commit message according to Linux coding style / submission
>>>>> process (neither too early nor over the limit):
>>>>> https://elixir.bootlin.com/linux/v6.4-rc1/source/Documentation/process/submitting-patches.rst#L597
>>>> Apologies, I think I made quite a few other mistakes in the style of the
>>>> patches I sent. I will work on improving this aspect. Appreciated!
>>>>
>>>>> Why this cannot be passed by phandle, just like all reserved regions?
>>>>>
>>>>> From where do you take these protected heaps? Firmware? This would
>>>>> explain why no relation is here (no probe ordering, no device links,
>>>>> nothing connecting separate devices).
>>>>
>>>> The protected heap is generally obtained from a firmware (TEE) and could
>>>> sometimes be a carved-out memory with restricted access.
>>>
>>> Which is a reserved memory, isn't it?
>>>
>>>>
>>>> The Panthor CSF kernel driver does not own or manage the protected heap
>>>> and is instead a consumer of it (assuming the heap is made available by
>>>> the system integrator).
>>>>
>>>> I initially used a phandle, but then I realised it would introduce a new
>>>> API to share the heap across kernel drivers. In addition, I found this
>>>> patch series:
>>>> -
>>>> https://lore.kernel.org/lkml/20230911023038.30649-1-yong.wu@mediatek.com/#t
>>>>
>>>> which introduces a DMA Heap API to the rest of the kernel to find a
>>>> heap by name:
>>>> - dma_heap_find()
>>>>
>>>> I then decided to follow that approach to help isolate the heap
>>>> management from the GPU driver code. In the Panthor driver, if the
>>>> heap is not found at probe time, the driver will defer the probe until
>>>> the exporter makes it available.
>>>
>>>
>>> I don't talk here really about the driver but even above mediatek
>>> patchset uses reserved memory bindings.
>>>
>>> You explained some things about the driver, yet you did not answer the
>>> question. This looks like reserved memory. If it does not, give
>>> arguments for why this binding cannot be a reserved memory, and why the
>>> hardware is not a carve-out memory.
>>
>> I think the point is that from the Mali GPU view, the memory does not need to be
>> within the range the Linux kernel actually sees, even though current integration
> 
> 
> Do I get it right:
> Memory can be outside of the kernel address range, but you put it in the
> bindings as reserved memory? If yes, then I still do not understand why
> DT should keep that information. Basically, you can choose whatever
> memory is there, because it anyway won't interfere with Linux, right?
> Linux does not have any reasonable way to access it.
> 
> It might interfere with firmware or other processors, but then it's the
> job of firmware which has discoverable interfaces for this.
> 
> The binding says it is about a protected heap name, but it explains
> nothing about what that protected heap is. You pass it to some firmware
> as a string? It does not look like it; rather, it looks like a Linux
> thingy, but this again is neither explained in the commit msg nor
> actually correct: Linux thingies do not belong in DT.

Indeed, the protected heap name refers to a Linux concept: the DMA heap
name. I understand the confusion introduced by this patch. I added a SW
concept to the DTB, which is meant to describe the HW.

Following a discussion with Boris, we agreed to remove the DTB entry,
and instead rely on an alternative way to get the name of the heap
in the Panthor kernel driver. I will prepare a v2 of the RFC which
will rely on a module parameter.
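
For what it's worth, the module-parameter approach would boil down to
something like the fragment below. The parameter name is my assumption, not
necessarily what the actual v2 will use, and the module_param machinery from
<linux/moduleparam.h> is mocked out here so the fragment stands alone.

```c
#include <stddef.h>

/* Harmless stand-ins for the kernel macros from <linux/moduleparam.h>,
 * just so the shape of the proposed interface is visible. */
#define module_param(name, type, perm)	extern int __param_dummy_##name
#define MODULE_PARM_DESC(name, desc)	extern int __desc_dummy_##name

/* Hypothetical parameter, e.g. panthor.protected_heap_name=protected
 * on the kernel command line; NULL (unset) would mean no protected
 * heap, i.e. protected mode disabled. */
static char *protected_heap_name;
module_param(protected_heap_name, charp, 0444);
MODULE_PARM_DESC(protected_heap_name,
		 "Name of the DMA heap providing protected memory");
```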

Regards,
Florent



* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-02-04 18:12       ` Nicolas Dufresne
@ 2025-02-12  9:49         ` Florent Tomasin
  2025-02-12 10:01           ` Maxime Ripard
  0 siblings, 1 reply; 48+ messages in thread
From: Florent Tomasin @ 2025-02-12  9:49 UTC (permalink / raw)
  To: Nicolas Dufresne, Maxime Ripard
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
	Benjamin Gaignard, Brian Starkey, John Stultz, T . J . Mercier,
	Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Hi Nicolas,

On 04/02/2025 18:12, Nicolas Dufresne wrote:
> Hi Florent,
> 
> Le lundi 03 février 2025 à 13:36 +0000, Florent Tomasin a écrit :
>>
>> On 30/01/2025 13:28, Maxime Ripard wrote:
>>> Hi,
>>>
>>> On Thu, Jan 30, 2025 at 01:08:57PM +0000, Florent Tomasin wrote:
>>>> Introduce a CMA Heap dt-binding allowing custom
>>>> CMA heap registrations.
>>>>
>>>> * Note to the reviewers:
>>>> The patch was used for the development of the protected mode
> 
> Just to avoid divergence in nomenclature, and because this is not a new subject,
> perhaps you should also adhere to the name "restricted". Both Linaro and
> Mediatek have moved from "secure" to that name in their proposals. As you are the
> third proposing this (at least among the proposals that are CCed on linux-media), I
> would have expected in your cover letter a summary of how the other requirements
> have been blended into your proposal.

Just to be sure I understand your suggestion correctly, are you
proposing to use "restricted mode" instead of "protected mode"?

In the case of Panthor CSF driver, the term: "protected mode" refers to
a Mali CSF GPU HW concept:
-
https://developer.arm.com/documentation/100964/1127/Fast-Models-components/Media-components/Mali-G71

If preferred and to avoid confusion, I can remove the reference to
"protected mode" and "Panthor CSF driver" from the commit message to
focus only on the CMA heap changes, which are more generic and can apply
to any type of CMA memory.

Note that the CMA patches were initially shared to help reproduce my
environment of development, I can isolate them in a separate patch
series and include a reference or "base-commit:" tag to it in the
Panthor protected mode RFC, to help progress this review in another
thread. It will avoid overlapping these two topics:

- Multiple standalone CMA heaps support
- Panthor protected mode handling

Regards,
Florent


* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-02-12  9:49         ` Florent Tomasin
@ 2025-02-12 10:01           ` Maxime Ripard
  2025-02-12 10:29             ` Florent Tomasin
  2025-02-12 10:37             ` Boris Brezillon
  0 siblings, 2 replies; 48+ messages in thread
From: Maxime Ripard @ 2025-02-12 10:01 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Nicolas Dufresne, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

[-- Attachment #1: Type: text/plain, Size: 932 bytes --]

On Wed, Feb 12, 2025 at 09:49:56AM +0000, Florent Tomasin wrote:
> Note that the CMA patches were initially shared to help reproduce my
> environment of development, I can isolate them in a separate patch
> series and include a reference or "base-commit:" tag to it in the
> Panthor protected mode RFC, to help progress this review in another
> thread. It will avoid overlapping these two topics:
> 
> - Multiple standalone CMA heaps support
> - Panthor protected mode handling

You keep insisting on using CMA here, but it's really not clear to me
why you would need CMA in the first place.

By CMA, do you mean the CMA allocator, and thus would provide buffers
through the usual dma_alloc_* API, or would any allocator providing
physically contiguous memory work?

In the latter case, would something like this work:
https://lore.kernel.org/all/20240515-dma-buf-ecc-heap-v1-1-54cbbd049511@kernel.org/

Maxime



* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-02-12 10:01           ` Maxime Ripard
@ 2025-02-12 10:29             ` Florent Tomasin
  2025-02-12 10:49               ` Maxime Ripard
  2025-02-12 10:37             ` Boris Brezillon
  1 sibling, 1 reply; 48+ messages in thread
From: Florent Tomasin @ 2025-02-12 10:29 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Nicolas Dufresne, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel



On 12/02/2025 10:01, Maxime Ripard wrote:
> On Wed, Feb 12, 2025 at 09:49:56AM +0000, Florent Tomasin wrote:
>> Note that the CMA patches were initially shared to help reproduce my
>> environment of development, I can isolate them in a separate patch
>> series and include a reference or "base-commit:" tag to it in the
>> Panthor protected mode RFC, to help progress this review in another
>> thread. It will avoid overlapping these two topics:
>>
>> - Multiple standalone CMA heaps support
>> - Panthor protected mode handling
> 
> You keep insisting on using CMA here, but it's really not clear to me
> why you would need CMA in the first place.
> 
> By CMA, do you mean the CMA allocator, and thus would provide buffers
> through the usual dma_alloc_* API, or would any allocator providing
> physically contiguous memory work?

You are correct, only the CMA allocator is relevant. I needed a way to
sub-allocate from a carved-out memory.
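
As a concrete (if deliberately simplistic) picture of that sub-allocation,
here is a bump allocator over a fixed carve-out range. In the kernel this
job would be done by CMA or a gen_pool over the reserved region; the
addresses and sizes below are made up for the example.

```c
#include <stdint.h>
#include <stddef.h>

/* Userspace sketch: sub-allocating from a fixed carve-out by bumping
 * an offset. No free() support, which is one reason a real allocator
 * (CMA, gen_pool) is needed instead. */
struct carveout {
	uint64_t base;	/* carve-out physical base address */
	size_t size;	/* carve-out total size */
	size_t used;	/* bump offset into the region */
};

#define ALIGN_UP(x, a)	(((x) + (a) - 1) & ~((uint64_t)(a) - 1))

/* Returns the allocated physical address, or 0 on exhaustion. */
static uint64_t carveout_alloc(struct carveout *c, size_t size, size_t align)
{
	uint64_t addr = ALIGN_UP(c->base + c->used, align);

	if (addr + size > c->base + c->size)
		return 0;
	c->used = addr + size - c->base;
	return addr;
}
```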

> In the latter case, would something like this work:
> https://lore.kernel.org/all/20240515-dma-buf-ecc-heap-v1-1-54cbbd049511@kernel.org/

Thanks for sharing this link, I was not aware previous work was done
on this aspect. The new carveout heap introduced in the series could
probably be a good alternative. I will play-around with it and share
some updates.

Appreciated,
Florent


* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-02-12 10:01           ` Maxime Ripard
  2025-02-12 10:29             ` Florent Tomasin
@ 2025-02-12 10:37             ` Boris Brezillon
  1 sibling, 0 replies; 48+ messages in thread
From: Boris Brezillon @ 2025-02-12 10:37 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Florent Tomasin, Nicolas Dufresne, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

On Wed, 12 Feb 2025 11:01:11 +0100
Maxime Ripard <mripard@kernel.org> wrote:

> On Wed, Feb 12, 2025 at 09:49:56AM +0000, Florent Tomasin wrote:
> > Note that the CMA patches were initially shared to help reproduce my
> > environment of development, I can isolate them in a separate patch
> > series and include a reference or "base-commit:" tag to it in the
> > Panthor protected mode RFC, to help progress this review in another
> > thread. It will avoid overlapping these two topics:
> > 
> > - Multiple standalone CMA heaps support
> > - Panthor protected mode handling  
> 
> You keep insisting on using CMA here, but it's really not clear to me
> why you would need CMA in the first place.

CMA is certainly not the solution. As Florent said, it's just here to
help people test the panthor protected-mode feature without having to
pull Mediatek's protected heap implementation, which currently
lives in some vendor tree and is thus quite painful to integrate into a
vanilla kernel.

I would suggest keeping those patches in a public tree we can point to,
and dropping them from v2, to avoid the confusion.

> 
> By CMA, do you mean the CMA allocator, and thus would provide buffers
> through the usual dma_alloc_* API, or would any allocator providing
> physically contiguous memory work?

Panthor can work with the system heap too AFAICT (I've done my testing
with the system heap, and it seems to work fine). It gets tricky when
you want to allocate protected scanout buffers and import them in the
KMS device though. But as said above, we shouldn't really bother
exposing custom CMA heaps, because that's not what we want to use
ultimately. What we'll want is some ATF implementation for protected
memory that we can rely on to implement a standard protected dma-heap,
I guess.

> 
> In the latter case, would something like this work:
> https://lore.kernel.org/all/20240515-dma-buf-ecc-heap-v1-1-54cbbd049511@kernel.org/
> 
> Maxime



* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-02-12 10:29             ` Florent Tomasin
@ 2025-02-12 10:49               ` Maxime Ripard
  2025-02-12 11:02                 ` Florent Tomasin
  0 siblings, 1 reply; 48+ messages in thread
From: Maxime Ripard @ 2025-02-12 10:49 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Nicolas Dufresne, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel


On Wed, Feb 12, 2025 at 10:29:32AM +0000, Florent Tomasin wrote:
> 
> 
> On 12/02/2025 10:01, Maxime Ripard wrote:
> > On Wed, Feb 12, 2025 at 09:49:56AM +0000, Florent Tomasin wrote:
> >> Note that the CMA patches were initially shared to help reproduce my
> >> environment of development, I can isolate them in a separate patch
> >> series and include a reference or "base-commit:" tag to it in the
> >> Panthor protected mode RFC, to help progress this review in another
> >> thread. It will avoid overlapping these two topics:
> >>
> >> - Multiple standalone CMA heaps support
> >> - Panthor protected mode handling
> > 
> > You keep insisting on using CMA here, but it's really not clear to me
> > why you would need CMA in the first place.
> > 
> > By CMA, do you mean the CMA allocator, and thus would provide buffers
> > through the usual dma_alloc_* API, or would any allocator providing
> > physically contiguous memory work?
> 
> You are correct, only the CMA allocator is relevant. I needed a way to
> sub-allocate from a carved-out memory region.

I'm still confused, sorry. You're saying that you require CMA but...

> > In the latter case, would something like this work:
> > https://lore.kernel.org/all/20240515-dma-buf-ecc-heap-v1-1-54cbbd049511@kernel.org/
> 
> Thanks for sharing this link, I was not aware previous work was done
> on this aspect. The new carveout heap introduced in the series could
> probably be a good alternative. I will play-around with it and share
> some updates.

... you seem to be ok with a driver that doesn't use it?

Maxime



* Re: [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings
  2025-02-12 10:49               ` Maxime Ripard
@ 2025-02-12 11:02                 ` Florent Tomasin
  0 siblings, 0 replies; 48+ messages in thread
From: Florent Tomasin @ 2025-02-12 11:02 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Nicolas Dufresne, Vinod Koul, Rob Herring, Krzysztof Kozlowski,
	Conor Dooley, Boris Brezillon, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel



On 12/02/2025 10:49, Maxime Ripard wrote:
> On Wed, Feb 12, 2025 at 10:29:32AM +0000, Florent Tomasin wrote:
>>
>>
>> On 12/02/2025 10:01, Maxime Ripard wrote:
>>> On Wed, Feb 12, 2025 at 09:49:56AM +0000, Florent Tomasin wrote:
>>>> Note that the CMA patches were initially shared to help reproduce my
>>>> environment of development, I can isolate them in a separate patch
>>>> series and include a reference or "base-commit:" tag to it in the
>>>> Panthor protected mode RFC, to help progress this review in another
>>>> thread. It will avoid overlapping these two topics:
>>>>
>>>> - Multiple standalone CMA heaps support
>>>> - Panthor protected mode handling
>>>
>>> You keep insisting on using CMA here, but it's really not clear to me
>>> why you would need CMA in the first place.
>>>
>>> By CMA, do you mean the CMA allocator, and thus would provide buffers
>>> through the usual dma_alloc_* API, or would any allocator providing
>>> physically contiguous memory work?
>>
>> You are correct, only the CMA allocator is relevant. I needed a way to
>> sub-allocate from a carved-out memory region.
> 
> I'm still confused, sorry. You're saying that you require CMA but...

Adding to Boris's comment, the objective here was to enable
sub-allocation from a carved-out memory region. The CMA heap
was used for convenience. It can be any other heap driver that
allows allocating a protected buffer.

>>> In the latter case, would something like this work:
>>> https://lore.kernel.org/all/20240515-dma-buf-ecc-heap-v1-1-54cbbd049511@kernel.org/
>>
>> Thanks for sharing this link, I was not aware previous work was done
>> on this aspect. The new carveout heap introduced in the series could
>> probably be a good alternative. I will play-around with it and share
>> some updates.
> 
> ... you seem to be ok with a driver that doesn't use it?

I will confirm it once I have done some validation.

From the Panthor driver's point of view, it does not matter whether we use
CMA or an alternative heap. We just need to be able to allocate from a
protected heap. I used the CMA heap to simplify the development of the
feature; it could be any protected heap.

I will extract the CMA changes from the V2 of the RFC to prevent
confusion.

Regards,
Florent


* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-11 14:32                   ` Boris Brezillon
@ 2025-02-20 13:32                     ` Maxime Ripard
  2025-02-24 11:36                       ` Boris Brezillon
  0 siblings, 1 reply; 48+ messages in thread
From: Maxime Ripard @ 2025-02-20 13:32 UTC (permalink / raw)
  To: Boris Brezillon
  Cc: Nicolas Dufresne, Florent Tomasin, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel


On Tue, Feb 11, 2025 at 03:32:23PM +0100, Boris Brezillon wrote:
> On Tue, 11 Feb 2025 14:46:56 +0100
> Maxime Ripard <mripard@kernel.org> wrote:
> 
> > Hi Boris,
> > 
> > On Fri, Feb 07, 2025 at 04:02:53PM +0100, Boris Brezillon wrote:
> > > Sorry for joining the party late, a couple of comments to back Akash
> > > and Nicolas' concerns.
> > > 
> > > On Wed, 05 Feb 2025 13:14:14 -0500
> > > Nicolas Dufresne <nicolas@ndufresne.ca> wrote:
> > >   
> > > > Le mercredi 05 février 2025 à 15:52 +0100, Maxime Ripard a écrit :  
> > > > > On Mon, Feb 03, 2025 at 04:43:23PM +0000, Florent Tomasin wrote:    
> > > > > > Hi Maxime, Nicolas
> > > > > > 
> > > > > > On 30/01/2025 17:47, Nicolas Dufresne wrote:    
> > > > > > > Le jeudi 30 janvier 2025 à 17:38 +0100, Maxime Ripard a écrit :    
> > > > > > > > Hi Nicolas,
> > > > > > > > 
> > > > > > > > On Thu, Jan 30, 2025 at 10:59:56AM -0500, Nicolas Dufresne wrote:    
> > > > > > > > > Le jeudi 30 janvier 2025 à 14:46 +0100, Maxime Ripard a écrit :    
> > > > > > > > > > Hi,
> > > > > > > > > > 
> > > > > > > > > > I started to review it, but it's probably best to discuss it here.
> > > > > > > > > > 
> > > > > > > > > > On Thu, Jan 30, 2025 at 01:08:56PM +0000, Florent Tomasin wrote:    
> > > > > > > > > > > Hi,
> > > > > > > > > > > 
> > > > > > > > > > > This is a patch series covering the support for protected mode execution in
> > > > > > > > > > > Mali Panthor CSF kernel driver.
> > > > > > > > > > > 
> > > > > > > > > > > The Mali CSF GPUs come with the support for protected mode execution at the
> > > > > > > > > > > HW level. This feature requires two main changes in the kernel driver:
> > > > > > > > > > > 
> > > > > > > > > > > 1) Configure the GPU with a protected buffer. The system must provide a DMA
> > > > > > > > > > >    heap from which the driver can allocate a protected buffer.
> > > > > > > > > > >    It can be a carved-out memory or dynamically allocated protected memory region.
> > > > > > > > > > >    Some system includes a trusted FW which is in charge of the protected memory.
> > > > > > > > > > >    Since this problem is integration specific, the Mali Panthor CSF kernel
> > > > > > > > > > >    driver must import the protected memory from a device specific exporter.    
> > > > > > > > > > 
> > > > > > > > > > Why do you need a heap for it in the first place? My understanding of
> > > > > > > > > > your series is that you have a carved out memory region somewhere, and
> > > > > > > > > > you want to allocate from that carved out memory region your buffers.
> > > > > > > > > > 
> > > > > > > > > > How is that any different from using a reserved-memory region, adding
> > > > > > > > > > the reserved-memory property to the GPU device and doing all your
> > > > > > > > > > allocation through the usual dma_alloc_* API?    
> > > > > > > > > 
> > > > > > > > > How do you then multiplex this region so it can be shared between
> > > > > > > > > GPU/Camera/Display/Codec drivers and also userspace ?    
> > > > > > > > 
> > > > > > > > You could point all the devices to the same reserved memory region, and
> > > > > > > > they would all allocate from there, including for their userspace-facing
> > > > > > > > allocations.    
> > > > > > > 
> > > > > > > I get that using a memory region is somewhat more of an HW description, and
> > > > > > > aligned with what a DT is supposed to describe. One of the challenges is that
> > > > > > > the Mediatek heap proposal ends up calling into their TEE, meaning knowing the region
> > > > > > > is not that useful. You actually need the TEE APP guid and its IPC protocol. If
> > > > > > > we can tell drivers to use a heap instead, we can abstract that SoC-specific
> > > > > > > complexity. I believe each allocated address has to be mapped to a zone, and
> > > > > > > that can only be done in the secure application. I can imagine similar needs
> > > > > > > when the protection is done using some sort of VM / hypervisor.
> > > > > > > 
> > > > > > > Nicolas
> > > > > > >     
> > > > > > 
> > > > > > The idea in this design is to abstract the heap management from the
> > > > > > Panthor kernel driver (which consumes a DMA buffer from it).
> > > > > > 
> > > > > > In a system, an integrator would have implemented a secure heap driver,
> > > > > > and could be based on TEE or a carved-out memory with restricted access,
> > > > > > or else. This heap driver would be responsible of implementing the
> > > > > > logic to: allocate, free, refcount, etc.
> > > > > > 
> > > > > > The heap would be retrieved by the Panthor kernel driver in order to
> > > > > > allocate protected memory to load the FW and allow the GPU to enter/exit
> > > > > > protected mode. This memory would not belong to a user space process.
> > > > > > The driver allocates it at the time of loading the FW and initialization
> > > > > > of the GPU HW. This is a device globally owned protected memory.    
> > > > > 
> > > > > The thing is, it's really not clear why you absolutely need to have the
> > > > > Panthor driver involved there. It won't be transparent to userspace,
> > > > > since you'd need an extra flag at allocation time, and the buffers
> > > > > behave differently. If userspace has to be aware of it, what's the
> > > > > advantage to your approach compared to just exposing a heap for those
> > > > > secure buffers, and letting userspace allocate its buffers from there?    
> > > > 
> > > > Unless I'm mistaken, the Panthor driver loads its own firmware. Since loading
> > > > the firmware requires placing the data in a protected memory region, and since
> > > > this aspect has no exposure to userspace, how can Panthor not be involved?
> > > 
> > > Right, the very reason we need protected memory early is because some
> > > FW sections need to be allocated from the protected pool, otherwise the
> > > TEE will fault as soon as the FW enters the so-called 'protected mode'.  
> > 
> > How does that work if you don't have some way to allocate the protected
> > memory? You can still submit jobs to the GPU, but you can't submit /
> > execute "protected jobs"?
> 
> Exactly.
> 
> > 
> > > Now, it's not impossible to work around this limitation. For instance,
> > > we could load the FW without this protected section by default (what we
> > > do right now), and then provide a DRM_PANTHOR_ENABLE_FW_PROT_MODE
> > > ioctl that would take a GEM object imported from a dmabuf allocated
> > > from the protected dma-heap by userspace. We can then reset the FW and
> > > allow it to operate in protected mode after that point.  
> > 
> > Urgh, I'd rather avoid that dance if possible :)
> 
> Me too.
> 
> > 
> > > This approach has two downsides though:
> > > 
> > > 1. We have no way of checking that the memory we're passed is actually
> > > suitable for FW execution in a protected context. If we're passed
> > > random memory, this will likely hang the platform as soon as we enter
> > > protected mode.  
> > 
> > It's a current limitation of dma-buf in general, and you'd have the same
> > issue right now if someone imports a buffer, or misconfigure the heap
> > for a !protected heap.
> > 
> > I'd really like to have some way to store some metadata in dma_buf, if
> > only to tell that the buffer is protected.
> 
> The dma_buf has a pointer to its ops, so it should be relatively easy
> to add an is_dma_buf_coming_from_this_heap() helper. Of course this
> implies linking the consumer driver to the heap it's supposed to take
> protected buffers from, which is basically the thing being discussed
> here :-).

I'm not sure looking at the ops would be enough. Like, you can check
that the buffer you allocated comes from the heap you got from the DT,
but if that heap doesn't allocate protected buffers, you're screwed and
you have no way to tell.

It also falls apart if we have a heap driver with multiple instances,
which is pretty likely if we ever merge the carveout heap driver.

> > 
> > I suspect you'd also need that if you do things like do protected video
> > playback through a codec, get a protected frame, and want to import that
> > into the GPU. Depending on how you allocate it, either the codec or the
> > GPU or both will want to make sure it's protected.
> 
> If it's all allocated from a central "protected" heap (even if that
> goes through the driver calling the dma_heap_alloc_buffer()), it
> shouldn't be an issue.

Right, assuming we have a way to identify the heap the buffer was
allocated from somehow. This kind of assumes that you only ever get one
source of protected memory, and you'd never allocate a protected buffer
from a different one in the codec driver for example.

> > > 2. If the driver has already booted the FW and exposed a DRI node, we might
> > > have GPU workloads running, and doing a FW reset might incur a slight
> > > delay in GPU job execution.
> > > 
> > > I think #1 is a more general issue that applies to suspend buffers
> > > allocated for GPU contexts too. If we expose ioctls where we take
> > > protected memory buffers that can possibly lead to crashes if they are
> > > not real protected memory regions, and we have no way to ensure the
> > > memory is protected, we probably want to restrict these ioctls/modes to
> > > some high-privilege CAP_SYS_.
> > > 
> > > For #2, that's probably something we can live with, since it's a
> > > one-shot thing. If it becomes an issue, we can even make sure we enable
> > > the FW protected-mode before the GPU starts being used for real.
> > > 
> > > This being said, I think the problem applies outside Panthor, and it
> > > might be that the video codec can't reset the FW/HW block to switch to
> > > protected mode as easily as Panthor.
> > >
> > > Note that there's also downsides to the reserved-memory node approach,
> > > where some bootloader stage would ask the secure FW to reserve a
> > > portion of mem and pass this through the DT. This sort of thing tends to
> > > be an integration mess, where you need all the pieces of the stack (TEE,
> > > u-boot, MTK dma-heap driver, gbm, ...) to be at a certain version to
> > > work properly. If we go the ioctl() way, we restrict the scope to the
> > > TEE, gbm/mesa and the protected-dma-heap driver, which is still a lot,
> > > but we've ripped the bootloader out of the equation at least.  
> > 
> > Yeah. I also think there's two discussions in parallel here:
> > 
> >  1) Being able to allocate protected buffers from the driver
> >  2) Exposing an interface to allocate those to userspace
> > 
> > I'm not really convinced we need 2, but 1 is obviously needed from what
> > you're saying.
> 
> I suspect we need #2 for GBM, still. But that's what dma-heaps are for,
> so I don't think that's a problem.

Yeah, that was my point too, we have the heaps for that already.

Maxime



* Re: [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs
  2025-02-20 13:32                     ` Maxime Ripard
@ 2025-02-24 11:36                       ` Boris Brezillon
  0 siblings, 0 replies; 48+ messages in thread
From: Boris Brezillon @ 2025-02-24 11:36 UTC (permalink / raw)
  To: Maxime Ripard
  Cc: Nicolas Dufresne, Florent Tomasin, Vinod Koul, Rob Herring,
	Krzysztof Kozlowski, Conor Dooley, Steven Price, Liviu Dudau,
	Maarten Lankhorst, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Hi Maxime,

On Thu, 20 Feb 2025 14:32:14 +0100
Maxime Ripard <mripard@kernel.org> wrote:

> > > > This approach has two downsides though:
> > > > 
> > > > 1. We have no way of checking that the memory we're passed is actually
> > > > suitable for FW execution in a protected context. If we're passed
> > > > random memory, this will likely hang the platform as soon as we enter
> > > > protected mode.    
> > > 
> > > It's a current limitation of dma-buf in general, and you'd have the same
> > > issue right now if someone imports a buffer, or misconfigure the heap
> > > for a !protected heap.
> > > 
> > > I'd really like to have some way to store some metadata in dma_buf, if
> > > only to tell that the buffer is protected.  
> > 
> > The dma_buf has a pointer to its ops, so it should be relatively easy
> > to add an is_dma_buf_coming_from_this_heap() helper. Of course this
> > implies linking the consumer driver to the heap it's supposed to take
> > protected buffers from, which is basically the thing being discussed
> > here :-).  
> 
> I'm not sure looking at the ops would be enough. Like, you can check
> that the buffer you allocated comes from the heap you got from the DT,
> but if that heap doesn't allocate protected buffers, you're screwed and
> you have no way to tell.

If heap names are unique, the name of the heap should somehow guarantee
the protected/restricted nature of buffers allocated from this heap
though. So, from a user perspective, all you have to do is check that
the buffers you import come from this particular heap you've been
pointed to. Where we get the heap name from (DT or module param
passed through a whitelist of protected heap names?) is an
implementation detail.
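A minimal sketch of such a check, assuming the heap records its name as the
dma_buf exporter name at export time (as the existing cma/system heaps do);
is_dma_buf_from_named_heap() is a hypothetical helper, not an existing
kernel API:

```c
#include <linux/dma-buf.h>
#include <linux/string.h>

/*
 * Hypothetical helper, not an existing kernel API: check that an imported
 * dma_buf was exported by the heap we were pointed to, going by the
 * exporter name recorded at export time. A real implementation would
 * probably also want to compare the dma_buf ops pointer, so that a rogue
 * exporter cannot simply spoof the name.
 */
static bool is_dma_buf_from_named_heap(struct dma_buf *buf,
				       const char *heap_name)
{
	return buf && buf->exp_name && !strcmp(buf->exp_name, heap_name);
}
```

The importing driver would call this on every dma_buf it is handed before
treating it as protected memory.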

> 
> It also falls apart if we have a heap driver with multiple instances,
> which is pretty likely if we ever merge the carveout heap driver.

What I meant here is that checking that a buffer comes from a
particular heap is something the heap driver itself can easily do. It
can be a mix of ops+name check (or ops+property check) if there are
multiple heaps instantiated by a single driver, of course.

I guess the other option would be to have a protected property at the
dma_buf level so we don't have to go all the way back to the dma_heap
to figure it out.

> 
> > > 
> > > I suspect you'd also need that if you do things like do protected video
> > > playback through a codec, get a protected frame, and want to import that
> > > into the GPU. Depending on how you allocate it, either the codec or the
> > > GPU or both will want to make sure it's protected.  
> > 
> > If it's all allocated from a central "protected" heap (even if that
> > goes through the driver calling the dma_heap_alloc_buffer()), it
> > shouldn't be an issue.  
> 
> Right, assuming we have a way to identify the heap the buffer was
> allocated from somehow. This kind of assumes that you only ever get one
> source of protected memory, and you'd never allocate a protected buffer
> from a different one in the codec driver for example.

Yes, and that's why having the ability to check that a buffer comes
from a particular heap is key. I mean, we don't necessarily have to
restrict things to a single heap; it can be a whitelist of heaps we know
provide protected buffers if we see a value in having multiple
protected heaps coexisting on a single platform.
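For illustration, such a whitelist check could be as simple as the sketch
below (the heap names are made up; in the kernel this would sit next to the
heap lookup code):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical list of heap names trusted to provide protected buffers. */
static const char *const protected_heap_names[] = {
	"protected-carveout",
	"mtk-restricted",
};

static bool is_protected_heap_name(const char *name)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(protected_heap_names); i++) {
		if (!strcmp(name, protected_heap_names[i]))
			return true;
	}
	return false;
}
```

Whether the list comes from DT, a module parameter, or is built in is the
same implementation detail discussed above.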

Regards,

Boris


* Re: [RFC PATCH 4/5] drm/panthor: Add support for protected memory allocation in panthor
  2025-01-30 13:09 ` [RFC PATCH 4/5] drm/panthor: Add support for protected memory allocation in panthor Florent Tomasin
  2025-02-11 11:04   ` Boris Brezillon
@ 2025-03-12 20:05   ` Adrian Larumbe
  1 sibling, 0 replies; 48+ messages in thread
From: Adrian Larumbe @ 2025-03-12 20:05 UTC (permalink / raw)
  To: Florent Tomasin
  Cc: Vinod Koul, Rob Herring, Krzysztof Kozlowski, Conor Dooley,
	Boris Brezillon, Steven Price, Liviu Dudau, Maarten Lankhorst,
	Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
	Sumit Semwal, Benjamin Gaignard, Brian Starkey, John Stultz,
	T . J . Mercier, Christian König, Matthias Brugger,
	AngeloGioacchino Del Regno, Yong Wu, dmaengine, devicetree,
	linux-kernel, dri-devel, linux-media, linaro-mm-sig,
	linux-arm-kernel, linux-mediatek, nd, Akash Goel

Hi Florent,

There's a chance that whatever I say here has already been discussed in the
review. If that's the case, ignore it without further ado.

On 30.01.2025 13:09, Florent Tomasin wrote:
> This patch allows Panthor to allocate buffer objects from a
> protected heap. The Panthor driver should be seen as a consumer
> of the heap and not an exporter.
>
> To help with the review of this patch, here is some important information
> about the Mali GPU protected mode support:
> - On CSF FW load, the Panthor driver must allocate a protected
>   buffer object to hold data to use by the FW when in protected
>   mode. This protected buffer object is owned by the device
>   and does not belong to a process.
> - On CSG creation, the Panthor driver must allocate a protected
>   suspend buffer object for the FW to store data when suspending
>   the CSG while in protected mode. The kernel owns this allocation
>   and does not allow user space mapping. The format of the data
>   in this buffer is only known by the FW and does not need to be
>   shared with other entities.
>
> To summarize, Mali GPUs require allocations of protected buffer
> objects at the kernel level.
>
> * How is the protected heap accessed by the Panthor driver?
> The driver will retrieve the protected heap using the name of the
> heap provided to the driver via the DTB as an attribute.
> If the heap is not yet available, the panthor driver will defer
> the probe until it is created. It is an integration error to provide
> a heap name that does not exist or is never created in the
> DTB node.
>
> * How is the Panthor driver allocating from the heap?
> Panthor is calling the DMA heap allocation function
> and obtains a DMA buffer from it. This buffer is then
> registered to GEM via PRIME by importing the DMA buffer.
>
> Signed-off-by: Florent Tomasin <florent.tomasin@arm.com>
> ---
>  drivers/gpu/drm/panthor/Kconfig          |  1 +
>  drivers/gpu/drm/panthor/panthor_device.c | 22 ++++++++++-
>  drivers/gpu/drm/panthor/panthor_device.h |  7 ++++
>  drivers/gpu/drm/panthor/panthor_fw.c     | 36 +++++++++++++++--
>  drivers/gpu/drm/panthor/panthor_fw.h     |  2 +
>  drivers/gpu/drm/panthor/panthor_gem.c    | 49 ++++++++++++++++++++++--
>  drivers/gpu/drm/panthor/panthor_gem.h    | 16 +++++++-
>  drivers/gpu/drm/panthor/panthor_heap.c   |  2 +
>  drivers/gpu/drm/panthor/panthor_sched.c  |  5 ++-
>  9 files changed, 130 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/gpu/drm/panthor/Kconfig b/drivers/gpu/drm/panthor/Kconfig
> index 55b40ad07f3b..c0208b886d9f 100644
> --- a/drivers/gpu/drm/panthor/Kconfig
> +++ b/drivers/gpu/drm/panthor/Kconfig
> @@ -7,6 +7,7 @@ config DRM_PANTHOR
>  	depends on !GENERIC_ATOMIC64  # for IOMMU_IO_PGTABLE_LPAE
>  	depends on MMU
>  	select DEVFREQ_GOV_SIMPLE_ONDEMAND
> +	select DMABUF_HEAPS
>  	select DRM_EXEC
>  	select DRM_GEM_SHMEM_HELPER
>  	select DRM_GPUVM
> diff --git a/drivers/gpu/drm/panthor/panthor_device.c b/drivers/gpu/drm/panthor/panthor_device.c
> index 00f7b8ce935a..1018e5c90a0e 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.c
> +++ b/drivers/gpu/drm/panthor/panthor_device.c
> @@ -4,7 +4,9 @@
>  /* Copyright 2023 Collabora ltd. */
>
>  #include <linux/clk.h>
> +#include <linux/dma-heap.h>
>  #include <linux/mm.h>
> +#include <linux/of.h>
>  #include <linux/platform_device.h>
>  #include <linux/pm_domain.h>
>  #include <linux/pm_runtime.h>
> @@ -102,6 +104,9 @@ void panthor_device_unplug(struct panthor_device *ptdev)
>  	panthor_mmu_unplug(ptdev);
>  	panthor_gpu_unplug(ptdev);
>
> +	if (ptdev->protm.heap)
> +		dma_heap_put(ptdev->protm.heap);
> +
Probably beside the point here, but I think in this case you might be
better off using the CMA API directly rather than going through a
hypothetical extension of dma-heaps.

>  	pm_runtime_dont_use_autosuspend(ptdev->base.dev);
>  	pm_runtime_put_sync_suspend(ptdev->base.dev);
>
> @@ -172,6 +177,7 @@ int panthor_device_init(struct panthor_device *ptdev)
>  	u32 *dummy_page_virt;
>  	struct resource *res;
>  	struct page *p;
> +	const char *protm_heap_name;
>  	int ret;
>
>  	ret = panthor_gpu_coherency_init(ptdev);
> @@ -246,9 +252,19 @@ int panthor_device_init(struct panthor_device *ptdev)
>  			return ret;
>  	}
>
> +	/* If a protected heap is specified but not found, defer the probe until created */
> +	if (!of_property_read_string(ptdev->base.dev->of_node, "protected-heap-name",
> +				     &protm_heap_name)) {
> +		ptdev->protm.heap = dma_heap_find(protm_heap_name);
> +		if (!ptdev->protm.heap) {
> +			ret = -EPROBE_DEFER;
> +			goto err_rpm_put;
> +		}
> +	}

Since none of the DMA memory is going to be shared with other devices, I don't
think a dma-heap interface is the right tool for the job here. Whatever
protected contiguous memory you allocate for protected mode will always remain
within the boundaries of the driver, so maybe just do the usual sequence here:

of_reserved_mem_device_init();
dev_get_cma_area();
cma_alloc();
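The suggested sequence could look roughly like this at probe time (a sketch
only; error paths are trimmed, and it assumes the GPU node carries a
memory-region phandle to a carveout declared with the reusable; property):

```c
#include <linux/cma.h>
#include <linux/dma-map-ops.h>
#include <linux/of_reserved_mem.h>
#include <linux/pfn.h>

static struct page *panthor_protm_alloc_pages(struct device *dev, size_t size)
{
	struct cma *cma;
	int ret;

	/* Bind the reserved-memory region from the DT to this device. */
	ret = of_reserved_mem_device_init(dev);
	if (ret)
		return NULL;

	/* Per-device CMA area set up by the reserved-mem init above. */
	cma = dev_get_cma_area(dev);
	if (!cma)
		return NULL;

	/* Contiguous allocation straight from the carveout. */
	return cma_alloc(cma, PFN_UP(size), 0, false);
}
```

The returned pages would then be freed with cma_release() on teardown,
instead of the dma_heap_put() path in the patch.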

>  	ret = panthor_gpu_init(ptdev);
>  	if (ret)
> -		goto err_rpm_put;
> +		goto err_dma_heap_put;
>
>  	ret = panthor_mmu_init(ptdev);
>  	if (ret)
> @@ -286,6 +302,10 @@ int panthor_device_init(struct panthor_device *ptdev)
>  err_unplug_gpu:
>  	panthor_gpu_unplug(ptdev);
>
> +err_dma_heap_put:
> +	if (ptdev->protm.heap)
> +		dma_heap_put(ptdev->protm.heap);
> +
>  err_rpm_put:
>  	pm_runtime_put_sync_suspend(ptdev->base.dev);
>  	return ret;
> diff --git a/drivers/gpu/drm/panthor/panthor_device.h b/drivers/gpu/drm/panthor/panthor_device.h
> index 0e68f5a70d20..406de9e888e2 100644
> --- a/drivers/gpu/drm/panthor/panthor_device.h
> +++ b/drivers/gpu/drm/panthor/panthor_device.h
> @@ -7,6 +7,7 @@
>  #define __PANTHOR_DEVICE_H__
>
>  #include <linux/atomic.h>
> +#include <linux/dma-heap.h>
>  #include <linux/io-pgtable.h>
>  #include <linux/regulator/consumer.h>
>  #include <linux/sched.h>
> @@ -190,6 +191,12 @@ struct panthor_device {
>
>  	/** @fast_rate: Maximum device clock frequency. Set by DVFS */
>  	unsigned long fast_rate;
> +
> +	/** @protm: Protected mode related data. */
> +	struct {
> +		/** @heap: Pointer to the protected heap */
> +		struct dma_heap *heap;
> +	} protm;
>  };
>
>  struct panthor_gpu_usage {
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.c b/drivers/gpu/drm/panthor/panthor_fw.c
> index 4a2e36504fea..7822af1533b4 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.c
> +++ b/drivers/gpu/drm/panthor/panthor_fw.c
> @@ -458,6 +458,7 @@ panthor_fw_alloc_queue_iface_mem(struct panthor_device *ptdev,
>
>  	mem = panthor_kernel_bo_create(ptdev, ptdev->fw->vm, SZ_8K,
>  				       DRM_PANTHOR_BO_NO_MMAP,
> +				       0,
>  				       DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
>  				       DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
>  				       PANTHOR_VM_KERNEL_AUTO_VA);
> @@ -491,6 +492,28 @@ panthor_fw_alloc_suspend_buf_mem(struct panthor_device *ptdev, size_t size)
>  {
>  	return panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev), size,
>  					DRM_PANTHOR_BO_NO_MMAP,
> +					0,
> +					DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
> +					PANTHOR_VM_KERNEL_AUTO_VA);
> +}
> +
> +/**
> + * panthor_fw_alloc_protm_suspend_buf_mem() - Allocate a protm suspend buffer
> + * for a command stream group.
> + * @ptdev: Device.
> + * @size: Size of the protm suspend buffer.
> + *
> + * Return: A valid pointer in case of success, NULL if no protected heap, an ERR_PTR() otherwise.
> + */
> +struct panthor_kernel_bo *
> +panthor_fw_alloc_protm_suspend_buf_mem(struct panthor_device *ptdev, size_t size)
> +{
> +	if (!ptdev->protm.heap)
> +		return NULL;
> +
> +	return panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev), size,
> +					DRM_PANTHOR_BO_NO_MMAP,
> +					DRM_PANTHOR_KBO_PROTECTED_HEAP,
>  					DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
>  					PANTHOR_VM_KERNEL_AUTO_VA);
>  }

Unify the two above into a single interface like:
struct panthor_kernel_bo *
panthor_fw_alloc_suspend_buf_mem(struct panthor_device *ptdev, size_t size, bool protected)

and then

return panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev), size,
				DRM_PANTHOR_BO_NO_MMAP,
				(protected ? DRM_PANTHOR_KBO_PROTECTED_HEAP : 0),
				DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
				PANTHOR_VM_KERNEL_AUTO_VA);

> @@ -503,6 +526,7 @@ static int panthor_fw_load_section_entry(struct panthor_device *ptdev,
>  	ssize_t vm_pgsz = panthor_vm_page_size(ptdev->fw->vm);
>  	struct panthor_fw_binary_section_entry_hdr hdr;
>  	struct panthor_fw_section *section;
> +	bool is_protm_section = false;
>  	u32 section_size;
>  	u32 name_len;
>  	int ret;
> @@ -541,10 +565,13 @@ static int panthor_fw_load_section_entry(struct panthor_device *ptdev,
>  		return -EINVAL;
>  	}
>
> -	if (hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_PROT) {
> +	if ((hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_PROT) && !ptdev->protm.heap) {
>  		drm_warn(&ptdev->base,
>  			 "Firmware protected mode entry not be supported, ignoring");
>  		return 0;
> +	} else if ((hdr.flags & CSF_FW_BINARY_IFACE_ENTRY_PROT) && ptdev->protm.heap) {
> +		drm_info(&ptdev->base, "Firmware protected mode entry supported");
> +		is_protm_section = true;
>  	}
>
>  	if (hdr.va.start == CSF_MCU_SHARED_REGION_START &&
> @@ -610,9 +637,10 @@ static int panthor_fw_load_section_entry(struct panthor_device *ptdev,
>  			vm_map_flags |= DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED;
>
>  		section->mem = panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev),
> -							section_size,
> -							DRM_PANTHOR_BO_NO_MMAP,
> -							vm_map_flags, va);
> +					section_size,
> +					DRM_PANTHOR_BO_NO_MMAP,
> +					(is_protm_section ? DRM_PANTHOR_KBO_PROTECTED_HEAP : 0),
> +					vm_map_flags, va);
>  		if (IS_ERR(section->mem))
>  			return PTR_ERR(section->mem);
>
> diff --git a/drivers/gpu/drm/panthor/panthor_fw.h b/drivers/gpu/drm/panthor/panthor_fw.h
> index 22448abde992..29042d0dc60c 100644
> --- a/drivers/gpu/drm/panthor/panthor_fw.h
> +++ b/drivers/gpu/drm/panthor/panthor_fw.h
> @@ -481,6 +481,8 @@ panthor_fw_alloc_queue_iface_mem(struct panthor_device *ptdev,
>  				 u32 *input_fw_va, u32 *output_fw_va);
>  struct panthor_kernel_bo *
>  panthor_fw_alloc_suspend_buf_mem(struct panthor_device *ptdev, size_t size);
> +struct panthor_kernel_bo *
> +panthor_fw_alloc_protm_suspend_buf_mem(struct panthor_device *ptdev, size_t size);
>
>  struct panthor_vm *panthor_fw_vm(struct panthor_device *ptdev);
>
> diff --git a/drivers/gpu/drm/panthor/panthor_gem.c b/drivers/gpu/drm/panthor/panthor_gem.c
> index 8244a4e6c2a2..88caf928acd0 100644
> --- a/drivers/gpu/drm/panthor/panthor_gem.c
> +++ b/drivers/gpu/drm/panthor/panthor_gem.c
> @@ -9,10 +9,14 @@
>
>  #include <drm/panthor_drm.h>
>
> +#include <uapi/linux/dma-heap.h>
> +
>  #include "panthor_device.h"
>  #include "panthor_gem.h"
>  #include "panthor_mmu.h"
>
> +MODULE_IMPORT_NS(DMA_BUF);
> +
>  static void panthor_gem_free_object(struct drm_gem_object *obj)
>  {
>  	struct panthor_gem_object *bo = to_panthor_bo(obj);
> @@ -31,6 +35,7 @@ static void panthor_gem_free_object(struct drm_gem_object *obj)
>   */
>  void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
>  {
> +	struct dma_buf *dma_bo = NULL;
>  	struct panthor_vm *vm;
>  	int ret;
>
> @@ -38,6 +43,10 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
>  		return;
>
>  	vm = bo->vm;
> +
> +	if (bo->flags & DRM_PANTHOR_KBO_PROTECTED_HEAP)
> +		dma_bo = bo->obj->import_attach->dmabuf;
> +
>  	panthor_kernel_bo_vunmap(bo);
>
>  	if (drm_WARN_ON(bo->obj->dev,
> @@ -51,6 +60,9 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
>  	panthor_vm_free_va(vm, &bo->va_node);
>  	drm_gem_object_put(bo->obj);
>
> +	if (dma_bo)
> +		dma_buf_put(dma_bo);
> +
>  out_free_bo:
>  	panthor_vm_put(vm);
>  	kfree(bo);
> @@ -62,6 +74,7 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
>   * @vm: VM to map the GEM to. If NULL, the kernel object is not GPU mapped.
>   * @size: Size of the buffer object.
>   * @bo_flags: Combination of drm_panthor_bo_flags flags.
> + * @kbo_flags: Combination of drm_panthor_kbo_flags flags.
>   * @vm_map_flags: Combination of drm_panthor_vm_bind_op_flags (only those
>   * that are related to map operations).
>   * @gpu_va: GPU address assigned when mapping to the VM.
> @@ -72,9 +85,11 @@ void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo)
>   */
>  struct panthor_kernel_bo *
>  panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
> -			 size_t size, u32 bo_flags, u32 vm_map_flags,
> +			 size_t size, u32 bo_flags, u32 kbo_flags, u32 vm_map_flags,
>  			 u64 gpu_va)
>  {
> +	struct dma_buf *dma_bo = NULL;
> +	struct drm_gem_object *gem_obj = NULL;
>  	struct drm_gem_shmem_object *obj;
>  	struct panthor_kernel_bo *kbo;
>  	struct panthor_gem_object *bo;
> @@ -87,14 +102,38 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
>  	if (!kbo)
>  		return ERR_PTR(-ENOMEM);
>
> -	obj = drm_gem_shmem_create(&ptdev->base, size);
> +	if (kbo_flags & DRM_PANTHOR_KBO_PROTECTED_HEAP) {
> +		if (!ptdev->protm.heap) {
> +			ret = -EINVAL;
> +			goto err_free_bo;
> +		}
> +
> +		dma_bo = dma_heap_buffer_alloc(ptdev->protm.heap, size,
> +					       DMA_HEAP_VALID_FD_FLAGS, DMA_HEAP_VALID_HEAP_FLAGS);

dma_heap_buffer_alloc() is a static function, so why are you calling it from here?
I suppose you plan on extending the dma-heap interface in the future?

> +		if (!dma_bo) {
> +			ret = -ENOMEM;
> +			goto err_free_bo;
> +		}
> +
> +		gem_obj = drm_gem_prime_import(&ptdev->base, dma_bo);
> +		if (IS_ERR(gem_obj)) {
> +			ret = PTR_ERR(gem_obj);
> +			goto err_free_dma_bo;
> +		}
> +

If you chose the dma-heap interface because dma-heap offers you an easy
way to create a GEM object from an sgtable, perhaps you could create the
dma-buf right here and do a self-import. That way you wouldn't need to
extend dma-heap.
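
Something along these lines is what I had in mind — a rough sketch only, and
panthor_gem_self_import() is a hypothetical helper name, not existing code:

```c
/*
 * Sketch of the self-import idea: wrap the already-created GEM object in a
 * dma-buf via PRIME, then import it back into the same device. No dma-heap
 * extension needed.
 */
static struct drm_gem_object *
panthor_gem_self_import(struct drm_device *dev, struct drm_gem_object *obj)
{
	struct dma_buf *dmabuf;
	struct drm_gem_object *imported;

	/* Export the existing GEM object as a dma-buf. */
	dmabuf = drm_gem_prime_export(obj, O_RDWR);
	if (IS_ERR(dmabuf))
		return ERR_CAST(dmabuf);

	/*
	 * Re-import it; drm_gem_prime_import() detects the self-import
	 * case and just takes a reference on the original object.
	 */
	imported = drm_gem_prime_import(dev, dmabuf);

	/* The import path holds its own reference, drop ours. */
	dma_buf_put(dmabuf);
	return imported;
}
```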


> +		obj = to_drm_gem_shmem_obj(gem_obj);
> +	} else {
> +		obj = drm_gem_shmem_create(&ptdev->base, size);
> +	}
> +
>  	if (IS_ERR(obj)) {
>  		ret = PTR_ERR(obj);
> -		goto err_free_bo;
> +		goto err_free_dma_bo;
>  	}
>
>  	bo = to_panthor_bo(&obj->base);
>  	kbo->obj = &obj->base;
> +	kbo->flags = kbo_flags;
>  	bo->flags = bo_flags;
>
>  	/* The system and GPU MMU page size might differ, which becomes a
> @@ -124,6 +163,10 @@ panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
>  err_put_obj:
>  	drm_gem_object_put(&obj->base);
>
> +err_free_dma_bo:
> +	if (dma_bo)
> +		dma_buf_put(dma_bo);
> +
>  err_free_bo:
>  	kfree(kbo);
>  	return ERR_PTR(ret);
> diff --git a/drivers/gpu/drm/panthor/panthor_gem.h b/drivers/gpu/drm/panthor/panthor_gem.h
> index e43021cf6d45..d4fe8ae9f0a8 100644
> --- a/drivers/gpu/drm/panthor/panthor_gem.h
> +++ b/drivers/gpu/drm/panthor/panthor_gem.h
> @@ -13,6 +13,17 @@
>
>  struct panthor_vm;
>
> +/**
> + * enum drm_panthor_kbo_flags -  Kernel buffer object flags, passed at creation time
> + */
> +enum drm_panthor_kbo_flags {
> +	/**
> +	 * @DRM_PANTHOR_KBO_PROTECTED_HEAP: The buffer object will be allocated
> +	 * from a DMA-Buf protected heap.
> +	 */
> +	DRM_PANTHOR_KBO_PROTECTED_HEAP = (1 << 0),
> +};
> +
>  /**
>   * struct panthor_gem_object - Driver specific GEM object.
>   */
> @@ -75,6 +86,9 @@ struct panthor_kernel_bo {
>  	 * @kmap: Kernel CPU mapping of @gem.
>  	 */
>  	void *kmap;
> +
> +	/** @flags: Combination of drm_panthor_kbo_flags flags. */
> +	u32 flags;
>  };
>
>  static inline
> @@ -138,7 +152,7 @@ panthor_kernel_bo_vunmap(struct panthor_kernel_bo *bo)
>
>  struct panthor_kernel_bo *
>  panthor_kernel_bo_create(struct panthor_device *ptdev, struct panthor_vm *vm,
> -			 size_t size, u32 bo_flags, u32 vm_map_flags,
> +			 size_t size, u32 bo_flags, u32 kbo_flags, u32 vm_map_flags,
>  			 u64 gpu_va);
>
>  void panthor_kernel_bo_destroy(struct panthor_kernel_bo *bo);
> diff --git a/drivers/gpu/drm/panthor/panthor_heap.c b/drivers/gpu/drm/panthor/panthor_heap.c
> index 3796a9eb22af..5395f0d90360 100644
> --- a/drivers/gpu/drm/panthor/panthor_heap.c
> +++ b/drivers/gpu/drm/panthor/panthor_heap.c
> @@ -146,6 +146,7 @@ static int panthor_alloc_heap_chunk(struct panthor_device *ptdev,
>
>  	chunk->bo = panthor_kernel_bo_create(ptdev, vm, heap->chunk_size,
>  					     DRM_PANTHOR_BO_NO_MMAP,
> +					     0,
>  					     DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
>  					     PANTHOR_VM_KERNEL_AUTO_VA);
>  	if (IS_ERR(chunk->bo)) {
> @@ -549,6 +550,7 @@ panthor_heap_pool_create(struct panthor_device *ptdev, struct panthor_vm *vm)
>
>  	pool->gpu_contexts = panthor_kernel_bo_create(ptdev, vm, bosize,
>  						      DRM_PANTHOR_BO_NO_MMAP,
> +						      0,
>  						      DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
>  						      PANTHOR_VM_KERNEL_AUTO_VA);
>  	if (IS_ERR(pool->gpu_contexts)) {
> diff --git a/drivers/gpu/drm/panthor/panthor_sched.c b/drivers/gpu/drm/panthor/panthor_sched.c
> index ef4bec7ff9c7..e260ed8aef5b 100644
> --- a/drivers/gpu/drm/panthor/panthor_sched.c
> +++ b/drivers/gpu/drm/panthor/panthor_sched.c
> @@ -3298,6 +3298,7 @@ group_create_queue(struct panthor_group *group,
>  	queue->ringbuf = panthor_kernel_bo_create(group->ptdev, group->vm,
>  						  args->ringbuf_size,
>  						  DRM_PANTHOR_BO_NO_MMAP,
> +						  0,
>  						  DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
>  						  DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
>  						  PANTHOR_VM_KERNEL_AUTO_VA);
> @@ -3328,6 +3329,7 @@ group_create_queue(struct panthor_group *group,
>  					 queue->profiling.slot_count *
>  					 sizeof(struct panthor_job_profiling_data),
>  					 DRM_PANTHOR_BO_NO_MMAP,
> +					 0,
>  					 DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
>  					 DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
>  					 PANTHOR_VM_KERNEL_AUTO_VA);
> @@ -3435,7 +3437,7 @@ int panthor_group_create(struct panthor_file *pfile,
>  	}
>
>  	suspend_size = csg_iface->control->protm_suspend_size;
> -	group->protm_suspend_buf = panthor_fw_alloc_suspend_buf_mem(ptdev, suspend_size);
> +	group->protm_suspend_buf = panthor_fw_alloc_protm_suspend_buf_mem(ptdev, suspend_size);

I think you could reuse panthor_fw_alloc_suspend_buf_mem() and extend its interface
so that it takes a boolean for the case where you want to allocate a protected buffer.
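
Roughly like this (just a sketch; I'm assuming the existing helper boils down
to a panthor_kernel_bo_create() call with these arguments):

```c
/*
 * Single helper taking a "protected" flag, replacing the dedicated
 * panthor_fw_alloc_protm_suspend_buf_mem() variant.
 */
struct panthor_kernel_bo *
panthor_fw_alloc_suspend_buf_mem(struct panthor_device *ptdev, size_t size,
				 bool protm)
{
	return panthor_kernel_bo_create(ptdev, panthor_fw_vm(ptdev), size,
					DRM_PANTHOR_BO_NO_MMAP,
					protm ? DRM_PANTHOR_KBO_PROTECTED_HEAP : 0,
					DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC,
					PANTHOR_VM_KERNEL_AUTO_VA);
}
```

Then the two call sites just pass false/true instead of picking between two
near-identical functions.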

>  	if (IS_ERR(group->protm_suspend_buf)) {
>  		ret = PTR_ERR(group->protm_suspend_buf);
>  		group->protm_suspend_buf = NULL;
> @@ -3446,6 +3448,7 @@ int panthor_group_create(struct panthor_file *pfile,
>  						   group_args->queues.count *
>  						   sizeof(struct panthor_syncobj_64b),
>  						   DRM_PANTHOR_BO_NO_MMAP,
> +						   0,
>  						   DRM_PANTHOR_VM_BIND_OP_MAP_NOEXEC |
>  						   DRM_PANTHOR_VM_BIND_OP_MAP_UNCACHED,
>  						   PANTHOR_VM_KERNEL_AUTO_VA);
> --
> 2.34.1


Adrian Larumbe


Thread overview: 48+ messages
2025-01-30 13:08 [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Florent Tomasin
2025-01-30 13:08 ` [RFC PATCH 1/5] dt-bindings: dma: Add CMA Heap bindings Florent Tomasin
2025-01-30 13:28   ` Maxime Ripard
2025-02-03 13:36     ` Florent Tomasin
2025-02-04 18:12       ` Nicolas Dufresne
2025-02-12  9:49         ` Florent Tomasin
2025-02-12 10:01           ` Maxime Ripard
2025-02-12 10:29             ` Florent Tomasin
2025-02-12 10:49               ` Maxime Ripard
2025-02-12 11:02                 ` Florent Tomasin
2025-02-12 10:37             ` Boris Brezillon
2025-01-30 23:20   ` Rob Herring
2025-02-03 16:18     ` Florent Tomasin
2025-01-30 13:08 ` [RFC PATCH 2/5] cma-heap: Allow registration of custom cma heaps Florent Tomasin
2025-01-30 13:34   ` Maxime Ripard
2025-02-03 13:52     ` Florent Tomasin
2025-01-30 13:08 ` [RFC PATCH 3/5] dt-bindings: gpu: Add protected heap name to Mali Valhall CSF binding Florent Tomasin
2025-01-30 13:25   ` Krzysztof Kozlowski
2025-02-03 15:31     ` Florent Tomasin
2025-02-05  9:13       ` Krzysztof Kozlowski
2025-02-06 21:21         ` Nicolas Dufresne
2025-02-09 11:56           ` Krzysztof Kozlowski
2025-02-12  9:25             ` Florent Tomasin
2025-01-30 13:09 ` [RFC PATCH 4/5] drm/panthor: Add support for protected memory allocation in panthor Florent Tomasin
2025-02-11 11:04   ` Boris Brezillon
2025-02-11 11:20     ` Boris Brezillon
2025-03-12 20:05   ` Adrian Larumbe
2025-01-30 13:09 ` [RFC PATCH 5/5] drm/panthor: Add support for entering and exiting protected mode Florent Tomasin
2025-02-10 14:01   ` Boris Brezillon
2025-01-30 13:46 ` [RFC PATCH 0/5] drm/panthor: Protected mode support for Mali CSF GPUs Maxime Ripard
2025-01-30 15:59   ` Nicolas Dufresne
2025-01-30 16:38     ` Maxime Ripard
2025-01-30 17:47       ` Nicolas Dufresne
2025-02-03 16:43         ` Florent Tomasin
2025-02-04 18:22           ` Nicolas Dufresne
2025-02-05 14:53             ` Maxime Ripard
2025-02-05 18:07               ` Nicolas Dufresne
2025-02-05 14:52           ` Maxime Ripard
2025-02-05 18:14             ` Nicolas Dufresne
2025-02-07 15:02               ` Boris Brezillon
2025-02-07 16:32                 ` Nicolas Dufresne
2025-02-07 16:42                   ` Boris Brezillon
2025-02-11 13:46                 ` Maxime Ripard
2025-02-11 14:32                   ` Boris Brezillon
2025-02-20 13:32                     ` Maxime Ripard
2025-02-24 11:36                       ` Boris Brezillon
2025-01-30 16:15 ` Simona Vetter
2025-02-03  9:25   ` Boris Brezillon
