* [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support
@ 2025-09-02 15:46 Thierry Reding
2025-09-02 15:46 ` [PATCH 1/9] dt-bindings: reserved-memory: Document Tegra VPR Thierry Reding
` (9 more replies)
0 siblings, 10 replies; 24+ messages in thread
From: Thierry Reding @ 2025-09-02 15:46 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
From: Thierry Reding <treding@nvidia.com>
Hi,
This series adds support for the video protection region (VPR) used on
Tegra SoC devices. It's a special region of memory that is protected
from accesses by the CPU and used to store DRM protected content (both
decrypted stream data as well as decoded video frames).
Patches 1 and 2 add DT binding documentation for the VPR and add the VPR
to the list of memory-region items for display and host1x.
Patch 3 introduces new APIs needed by the Tegra VPR implementation that
allow CMA areas to be dynamically created at runtime rather than using
the fixed, system-wide list. This is used in this driver specifically
because it can use an arbitrary number of these areas (though they are
currently limited to 4).
Patch 4 adds some infrastructure for DMA heap implementations to provide
information through debugfs.
The Tegra VPR implementation is added in patch 5. See its commit message
for more details about the specifics of this implementation.
Finally, patches 6-9 add the VPR placeholder node on Tegra234 and hook
it up to the host1x and GPU nodes so that they can make use of this
region.
Thierry
Thierry Reding (9):
dt-bindings: reserved-memory: Document Tegra VPR
dt-bindings: display: tegra: Document memory regions
mm/cma: Allow dynamically creating CMA areas
dma-buf: heaps: Add debugfs support
dma-buf: heaps: Add support for Tegra VPR
arm64: tegra: Add VPR placeholder node on Tegra234
arm64: tegra: Add GPU node on Tegra234
arm64: tegra: Hook up VPR to host1x
arm64: tegra: Hook up VPR to the GPU
.../display/tegra/nvidia,tegra186-dc.yaml | 10 +
.../display/tegra/nvidia,tegra20-dc.yaml | 10 +-
.../display/tegra/nvidia,tegra20-host1x.yaml | 7 +
.../nvidia,tegra-video-protection-region.yaml | 55 ++
arch/arm64/boot/dts/nvidia/tegra234.dtsi | 57 ++
drivers/dma-buf/dma-heap.c | 56 ++
drivers/dma-buf/heaps/Kconfig | 7 +
drivers/dma-buf/heaps/Makefile | 1 +
drivers/dma-buf/heaps/tegra-vpr.c | 831 ++++++++++++++++++
include/linux/cma.h | 16 +
include/linux/dma-heap.h | 2 +
include/trace/events/tegra_vpr.h | 57 ++
mm/cma.c | 89 +-
13 files changed, 1175 insertions(+), 23 deletions(-)
create mode 100644 Documentation/devicetree/bindings/reserved-memory/nvidia,tegra-video-protection-region.yaml
create mode 100644 drivers/dma-buf/heaps/tegra-vpr.c
create mode 100644 include/trace/events/tegra_vpr.h
--
2.50.0
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH 1/9] dt-bindings: reserved-memory: Document Tegra VPR
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
@ 2025-09-02 15:46 ` Thierry Reding
2025-09-03 16:45 ` Rob Herring (Arm)
2025-09-02 15:46 ` [PATCH 2/9] dt-bindings: display: tegra: Document memory regions Thierry Reding
` (8 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Thierry Reding @ 2025-09-02 15:46 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
From: Thierry Reding <treding@nvidia.com>
The Video Protection Region (VPR) found on NVIDIA Tegra chips is a
region of memory that is protected from CPU accesses. It is used to
decode and play back DRM protected content.
It is a standard reserved memory region that can exist in two forms:
static VPR where the base address and size are fixed (uses the "reg"
property to describe the memory) and a resizable VPR where only the
size is known upfront and the OS can allocate it wherever it can be
accomodated.
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
.../nvidia,tegra-video-protection-region.yaml | 55 +++++++++++++++++++
1 file changed, 55 insertions(+)
create mode 100644 Documentation/devicetree/bindings/reserved-memory/nvidia,tegra-video-protection-region.yaml
diff --git a/Documentation/devicetree/bindings/reserved-memory/nvidia,tegra-video-protection-region.yaml b/Documentation/devicetree/bindings/reserved-memory/nvidia,tegra-video-protection-region.yaml
new file mode 100644
index 000000000000..c13292a791bb
--- /dev/null
+++ b/Documentation/devicetree/bindings/reserved-memory/nvidia,tegra-video-protection-region.yaml
@@ -0,0 +1,55 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/reserved-memory/nvidia,tegra-video-protection-region.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: NVIDIA Tegra Video Protection Region (VPR)
+
+maintainers:
+ - Thierry Reding <thierry.reding@gmail.com>
+ - Jon Hunter <jonathanh@nvidia.com>
+
+description: |
+ NVIDIA Tegra chips have long supported a mechanism to protect a single,
+ contiguous memory region from non-secure memory accesses. Typically this
+ region is used for decoding and playback of DRM protected content. Various
+ devices, such as the display controller and multimedia engines (video
+ decoder) can access this region in a secure way. Access from the CPU is
+ generally forbidden.
+
+ Two variants exist for VPR: one is fixed in both the base address and size,
+ while the other is resizable. Fixed VPR can be described by just a "reg"
+ property specifying the base address and size, whereas the resizable VPR
+ is defined by a size/alignment pair of properties. For resizable VPR the
+ memory is reusable by the rest of the system when it's unused for VPR and
+ therefore the "reusable" property must be specified along with it. For a
+ fixed VPR, the memory is permanently protected, and therefore it's not
+ reusable and must also be marked as "no-map" to prevent any (including
+ speculative) accesses to it.
+
+allOf:
+ - $ref: reserved-memory.yaml
+
+properties:
+ compatible:
+ const: nvidia,tegra-video-protection-region
+
+dependencies:
+ size: [alignment, reusable]
+ alignment: [size, reusable]
+ reusable: [alignment, size]
+
+ reg: [no-map]
+ no-map: [reg]
+
+unevaluatedProperties: false
+
+oneOf:
+ - required:
+ - compatible
+ - reg
+
+ - required:
+ - compatible
+ - size
--
2.50.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 2/9] dt-bindings: display: tegra: Document memory regions
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
2025-09-02 15:46 ` [PATCH 1/9] dt-bindings: reserved-memory: Document Tegra VPR Thierry Reding
@ 2025-09-02 15:46 ` Thierry Reding
2025-09-02 15:46 ` [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas Thierry Reding
` (7 subsequent siblings)
9 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2025-09-02 15:46 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
From: Thierry Reding <treding@nvidia.com>
Add the memory-region and memory-region-names properties to the bindings
for the display controllers and the host1x engine found on various Tegra
generations. These memory regions are used to access firmware-provided
framebuffer memory as well as the video protection region.
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
.../bindings/display/tegra/nvidia,tegra186-dc.yaml | 10 ++++++++++
.../bindings/display/tegra/nvidia,tegra20-dc.yaml | 10 +++++++++-
.../bindings/display/tegra/nvidia,tegra20-host1x.yaml | 7 +++++++
3 files changed, 26 insertions(+), 1 deletion(-)
diff --git a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra186-dc.yaml b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra186-dc.yaml
index ce4589466a18..881bfbf4764d 100644
--- a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra186-dc.yaml
+++ b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra186-dc.yaml
@@ -57,6 +57,16 @@ properties:
- const: dma-mem # read-0
- const: read-1
+ memory-region:
+ minItems: 1
+ maxItems: 2
+
+ memory-region-names:
+ items:
+ enum: [ framebuffer, protected ]
+ minItems: 1
+ maxItems: 2
+
nvidia,outputs:
description: A list of phandles of outputs that this display
controller can drive.
diff --git a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-dc.yaml b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-dc.yaml
index 69be95afd562..a012644eeb7d 100644
--- a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-dc.yaml
+++ b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-dc.yaml
@@ -65,7 +65,15 @@ properties:
items:
- description: phandle to the core power domain
- memory-region: true
+ memory-region:
+ minItems: 1
+ maxItems: 2
+
+ memory-region-names:
+ items:
+ enum: [ framebuffer, protected ]
+ minItems: 1
+ maxitems: 2
nvidia,head:
$ref: /schemas/types.yaml#/definitions/uint32
diff --git a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml
index 3563378a01af..f45be30835a8 100644
--- a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml
+++ b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.yaml
@@ -96,6 +96,13 @@ properties:
items:
- description: phandle to the HEG or core power domain
+ memory-region:
+ maxItems: 1
+
+ memory-region-names:
+ items:
+ - const: protected
+
required:
- compatible
- interrupts
--
2.50.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
2025-09-02 15:46 ` [PATCH 1/9] dt-bindings: reserved-memory: Document Tegra VPR Thierry Reding
2025-09-02 15:46 ` [PATCH 2/9] dt-bindings: display: tegra: Document memory regions Thierry Reding
@ 2025-09-02 15:46 ` Thierry Reding
2025-09-02 17:27 ` Frank van der Linden
2025-09-02 15:46 ` [PATCH 4/9] dma-buf: heaps: Add debugfs support Thierry Reding
` (6 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Thierry Reding @ 2025-09-02 15:46 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
From: Thierry Reding <treding@nvidia.com>
There is no technical reason why there should be a limited number of CMA
regions, so extract some code into helpers and use them to create extra
functions (cma_create() and cma_free()) that allow creating and freeing,
respectively, CMA regions dynamically at runtime.
Note that these dynamically created CMA areas are treated specially and
do not contribute to the number of total CMA pages so that this count
still only applies to the fixed number of CMA areas.
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
include/linux/cma.h | 16 ++++++++
mm/cma.c | 89 ++++++++++++++++++++++++++++++++++-----------
2 files changed, 83 insertions(+), 22 deletions(-)
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 62d9c1cf6326..f1e20642198a 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -61,6 +61,10 @@ extern void cma_reserve_pages_on_error(struct cma *cma);
struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
bool cma_free_folio(struct cma *cma, const struct folio *folio);
bool cma_validate_zones(struct cma *cma);
+
+struct cma *cma_create(phys_addr_t base, phys_addr_t size,
+ unsigned int order_per_bit, const char *name);
+void cma_free(struct cma *cma);
#else
static inline struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
{
@@ -71,10 +75,22 @@ static inline bool cma_free_folio(struct cma *cma, const struct folio *folio)
{
return false;
}
+
static inline bool cma_validate_zones(struct cma *cma)
{
return false;
}
+
+static inline struct cma *cma_create(phys_addr_t base, phys_addr_t size,
+ unsigned int order_per_bit,
+ const char *name)
+{
+ return NULL;
+}
+
+static inline void cma_free(struct cma *cma)
+{
+}
#endif
#endif
diff --git a/mm/cma.c b/mm/cma.c
index e56ec64d0567..8149227d319f 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -214,6 +214,18 @@ void __init cma_reserve_pages_on_error(struct cma *cma)
set_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags);
}
+static void __init cma_init_area(struct cma *cma, const char *name,
+ phys_addr_t size, unsigned int order_per_bit)
+{
+ if (name)
+ snprintf(cma->name, CMA_MAX_NAME, "%s", name);
+ else
+ snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
+
+ cma->available_count = cma->count = size >> PAGE_SHIFT;
+ cma->order_per_bit = order_per_bit;
+}
+
static int __init cma_new_area(const char *name, phys_addr_t size,
unsigned int order_per_bit,
struct cma **res_cma)
@@ -232,13 +244,8 @@ static int __init cma_new_area(const char *name, phys_addr_t size,
cma = &cma_areas[cma_area_count];
cma_area_count++;
- if (name)
- snprintf(cma->name, CMA_MAX_NAME, "%s", name);
- else
- snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
+ cma_init_area(cma, name, size, order_per_bit);
- cma->available_count = cma->count = size >> PAGE_SHIFT;
- cma->order_per_bit = order_per_bit;
*res_cma = cma;
totalcma_pages += cma->count;
@@ -251,6 +258,27 @@ static void __init cma_drop_area(struct cma *cma)
cma_area_count--;
}
+static int __init cma_check_memory(phys_addr_t base, phys_addr_t size)
+{
+ if (!size || !memblock_is_region_reserved(base, size))
+ return -EINVAL;
+
+ /*
+ * CMA uses CMA_MIN_ALIGNMENT_BYTES as alignment requirement which
+ * needs pageblock_order to be initialized. Let's enforce it.
+ */
+ if (!pageblock_order) {
+ pr_err("pageblock_order not yet initialized. Called during early boot?\n");
+ return -EINVAL;
+ }
+
+ /* ensure minimal alignment required by mm core */
+ if (!IS_ALIGNED(base | size, CMA_MIN_ALIGNMENT_BYTES))
+ return -EINVAL;
+
+ return 0;
+}
+
/**
* cma_init_reserved_mem() - create custom contiguous area from reserved memory
* @base: Base address of the reserved area
@@ -271,22 +299,9 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
struct cma *cma;
int ret;
- /* Sanity checks */
- if (!size || !memblock_is_region_reserved(base, size))
- return -EINVAL;
-
- /*
- * CMA uses CMA_MIN_ALIGNMENT_BYTES as alignment requirement which
- * needs pageblock_order to be initialized. Let's enforce it.
- */
- if (!pageblock_order) {
- pr_err("pageblock_order not yet initialized. Called during early boot?\n");
- return -EINVAL;
- }
-
- /* ensure minimal alignment required by mm core */
- if (!IS_ALIGNED(base | size, CMA_MIN_ALIGNMENT_BYTES))
- return -EINVAL;
+ ret = cma_check_memory(base, size);
+ if (ret < 0)
+ return ret;
ret = cma_new_area(name, size, order_per_bit, &cma);
if (ret != 0)
@@ -1112,3 +1127,33 @@ void __init *cma_reserve_early(struct cma *cma, unsigned long size)
return ret;
}
+
+struct cma *__init cma_create(phys_addr_t base, phys_addr_t size,
+ unsigned int order_per_bit, const char *name)
+{
+ struct cma *cma;
+ int ret;
+
+ ret = cma_check_memory(base, size);
+ if (ret < 0)
+ return ERR_PTR(ret);
+
+ cma = kzalloc(sizeof(*cma), GFP_KERNEL);
+ if (!cma)
+ return ERR_PTR(-ENOMEM);
+
+ cma_init_area(cma, name, size, order_per_bit);
+ cma->ranges[0].base_pfn = PFN_DOWN(base);
+ cma->ranges[0].early_pfn = PFN_DOWN(base);
+ cma->ranges[0].count = cma->count;
+ cma->nranges = 1;
+
+ cma_activate_area(cma);
+
+ return cma;
+}
+
+void cma_free(struct cma *cma)
+{
+ kfree(cma);
+}
--
2.50.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 4/9] dma-buf: heaps: Add debugfs support
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
` (2 preceding siblings ...)
2025-09-02 15:46 ` [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas Thierry Reding
@ 2025-09-02 15:46 ` Thierry Reding
2025-09-02 22:37 ` John Stultz
2025-09-02 15:46 ` [PATCH 5/9] dma-buf: heaps: Add support for Tegra VPR Thierry Reding
` (5 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Thierry Reding @ 2025-09-02 15:46 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
From: Thierry Reding <treding@nvidia.com>
Add a callback to struct dma_heap_ops that heap providers can implement
to show information about the state of the heap in debugfs. A top-level
directory named "dma_heap" is created in debugfs and individual files
will be named after the heaps.
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
drivers/dma-buf/dma-heap.c | 56 ++++++++++++++++++++++++++++++++++++++
include/linux/dma-heap.h | 2 ++
2 files changed, 58 insertions(+)
diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
index cdddf0e24dce..f062f88365a5 100644
--- a/drivers/dma-buf/dma-heap.c
+++ b/drivers/dma-buf/dma-heap.c
@@ -7,6 +7,7 @@
*/
#include <linux/cdev.h>
+#include <linux/debugfs.h>
#include <linux/device.h>
#include <linux/dma-buf.h>
#include <linux/dma-heap.h>
@@ -217,6 +218,46 @@ const char *dma_heap_get_name(struct dma_heap *heap)
}
EXPORT_SYMBOL(dma_heap_get_name);
+#ifdef CONFIG_DEBUG_FS
+static int dma_heap_debug_show(struct seq_file *s, void *unused)
+{
+ struct dma_heap *heap = s->private;
+ int err = 0;
+
+ if (heap->ops && heap->ops->show)
+ err = heap->ops->show(s, heap);
+
+ return err;
+}
+DEFINE_SHOW_ATTRIBUTE(dma_heap_debug);
+
+static struct dentry *dma_heap_debugfs_dir;
+
+static void dma_heap_init_debugfs(void)
+{
+ struct dentry *dir;
+
+ dir = debugfs_create_dir("dma_heap", NULL);
+ if (IS_ERR(dir))
+ return;
+
+ dma_heap_debugfs_dir = dir;
+}
+
+static void dma_heap_exit_debugfs(void)
+{
+ debugfs_remove_recursive(dma_heap_debugfs_dir);
+}
+#else
+static void dma_heap_init_debugfs(void)
+{
+}
+
+static void dma_heap_exit_debugfs(void)
+{
+}
+#endif
+
/**
* dma_heap_add - adds a heap to dmabuf heaps
* @exp_info: information needed to register this heap
@@ -291,6 +332,13 @@ struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info)
/* Add heap to the list */
list_add(&heap->list, &heap_list);
+
+#ifdef CONFIG_DEBUG_FS
+ if (heap->ops && heap->ops->show)
+ debugfs_create_file(heap->name, 0444, dma_heap_debugfs_dir,
+ heap, &dma_heap_debug_fops);
+#endif
+
mutex_unlock(&heap_list_lock);
return heap;
@@ -327,6 +375,14 @@ static int dma_heap_init(void)
}
dma_heap_class->devnode = dma_heap_devnode;
+ dma_heap_init_debugfs();
+
return 0;
}
subsys_initcall(dma_heap_init);
+
+static void __exit dma_heap_exit(void)
+{
+ dma_heap_exit_debugfs();
+}
+__exitcall(dma_heap_exit);
diff --git a/include/linux/dma-heap.h b/include/linux/dma-heap.h
index 27d15f60950a..065f537177af 100644
--- a/include/linux/dma-heap.h
+++ b/include/linux/dma-heap.h
@@ -12,6 +12,7 @@
#include <linux/types.h>
struct dma_heap;
+struct seq_file;
/**
* struct dma_heap_ops - ops to operate on a given heap
@@ -24,6 +25,7 @@ struct dma_heap_ops {
unsigned long len,
u32 fd_flags,
u64 heap_flags);
+ int (*show)(struct seq_file *s, struct dma_heap *heap);
};
/**
--
2.50.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 5/9] dma-buf: heaps: Add support for Tegra VPR
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
` (3 preceding siblings ...)
2025-09-02 15:46 ` [PATCH 4/9] dma-buf: heaps: Add debugfs support Thierry Reding
@ 2025-09-02 15:46 ` Thierry Reding
2025-09-02 15:46 ` [PATCH 6/9] arm64: tegra: Add VPR placeholder node on Tegra234 Thierry Reding
` (4 subsequent siblings)
9 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2025-09-02 15:46 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
From: Thierry Reding <treding@nvidia.com>
NVIDIA Tegra SoCs commonly define a Video-Protection-Region, which is a
region of memory dedicated to content-protected video decode and
playback. This memory cannot be accessed by the CPU and only certain
hardware devices have access to it.
Expose the VPR as a DMA heap so that applications and drivers can
allocate buffers from this region for use-cases that require this kind
of protected memory.
VPR has a few very critical peculiarities. First, it must be a single
contiguous region of memory (there is a single pair of registers that
set the base address and size of the region), which is configured by
calling back into the secure monitor. The memory region also needs to
quite large for some use-cases because it needs to fit multiple video
frames (8K video should be supported), so VPR sizes of ~2 GiB are
expected. However, some devices cannot afford to reserve this amount
of memory for a particular use-case, and therefore the VPR must be
resizable.
Unfortunately, resizing the VPR is slightly tricky because the GPU found
on Tegra SoCs must be in reset during the VPR resize operation. This is
currently implemented by freezing all userspace processes and calling
invoking the GPU's freeze() implementation, resizing and the thawing the
GPU and userspace processes. This is quite heavy-handed, so eventually
it might be better to implement thawing/freezing in the GPU driver in
such a way that they block accesses to the GPU so that the VPR resize
operation can happen without suspending all userspace.
In order to balance the memory usage versus the amount of resizing that
needs to happen, the VPR is divided into multiple chunks. Each chunk is
implemented as a CMA area that is completely allocated on first use to
guarantee the contiguity of the VPR. Once all buffers from a chunk have
been freed, the CMA area is deallocated and the memory returned to the
system.
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
drivers/dma-buf/heaps/Kconfig | 7 +
drivers/dma-buf/heaps/Makefile | 1 +
drivers/dma-buf/heaps/tegra-vpr.c | 831 ++++++++++++++++++++++++++++++
include/trace/events/tegra_vpr.h | 57 ++
4 files changed, 896 insertions(+)
create mode 100644 drivers/dma-buf/heaps/tegra-vpr.c
create mode 100644 include/trace/events/tegra_vpr.h
diff --git a/drivers/dma-buf/heaps/Kconfig b/drivers/dma-buf/heaps/Kconfig
index bb369b38b001..af97af1bb420 100644
--- a/drivers/dma-buf/heaps/Kconfig
+++ b/drivers/dma-buf/heaps/Kconfig
@@ -22,3 +22,10 @@ config DMABUF_HEAPS_CMA_LEGACY
from the CMA area's devicetree node, or "reserved" if the area is not
defined in the devicetree. This uses the same underlying allocator as
CONFIG_DMABUF_HEAPS_CMA.
+
+config DMABUF_HEAPS_TEGRA_VPR
+ bool "NVIDIA Tegra Video-Protected-Region DMA-BUF Heap"
+ depends on DMABUF_HEAPS && DMA_CMA
+ help
+ Choose this option to enable Video-Protected-Region (VPR) support on
+ a range of NVIDIA Tegra devices.
diff --git a/drivers/dma-buf/heaps/Makefile b/drivers/dma-buf/heaps/Makefile
index 974467791032..265b77a7b889 100644
--- a/drivers/dma-buf/heaps/Makefile
+++ b/drivers/dma-buf/heaps/Makefile
@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0
obj-$(CONFIG_DMABUF_HEAPS_SYSTEM) += system_heap.o
obj-$(CONFIG_DMABUF_HEAPS_CMA) += cma_heap.o
+obj-$(CONFIG_DMABUF_HEAPS_TEGRA_VPR) += tegra-vpr.o
diff --git a/drivers/dma-buf/heaps/tegra-vpr.c b/drivers/dma-buf/heaps/tegra-vpr.c
new file mode 100644
index 000000000000..a36efeb031b8
--- /dev/null
+++ b/drivers/dma-buf/heaps/tegra-vpr.c
@@ -0,0 +1,831 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * DMA-BUF restricted heap exporter for NVIDIA Video-Protection-Region (VPR)
+ *
+ * Copyright (C) 2024-2025 NVIDIA Corporation
+ */
+
+#define pr_fmt(fmt) "tegra-vpr: " fmt
+
+#include <linux/arm-smccc.h>
+#include <linux/cma.h>
+#include <linux/debugfs.h>
+#include <linux/dma-buf.h>
+#include <linux/dma-heap.h>
+#include <linux/of_reserved_mem.h>
+
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/reset.h>
+
+#include <linux/freezer.h>
+
+#define CREATE_TRACE_POINTS
+#include <trace/events/tegra_vpr.h>
+
+struct tegra_vpr;
+
+struct tegra_vpr_device {
+ struct list_head node;
+ struct device *dev;
+};
+
+struct tegra_vpr_chunk {
+ phys_addr_t start;
+ phys_addr_t limit;
+ size_t size;
+
+ struct tegra_vpr *vpr;
+ struct cma *cma;
+ bool active;
+
+ struct page *start_page;
+ unsigned long *bitmap;
+ unsigned long virt;
+ pgoff_t num_pages;
+
+ struct list_head buffers;
+ struct mutex lock;
+};
+
+struct tegra_vpr {
+ struct device_node *dev_node;
+ unsigned long align;
+ phys_addr_t base;
+ phys_addr_t size;
+ bool use_freezer;
+
+ struct tegra_vpr_chunk *chunks;
+ unsigned int num_chunks;
+
+ struct list_head devices;
+ struct mutex lock;
+};
+
+struct tegra_vpr_buffer {
+ struct tegra_vpr_chunk *chunk;
+ struct list_head attachments;
+ struct list_head list;
+ struct mutex lock;
+
+ struct page *start_page;
+ struct page **pages;
+ pgoff_t num_pages;
+ phys_addr_t start;
+ phys_addr_t limit;
+ size_t size;
+ int pageno;
+ int order;
+
+ unsigned long virt;
+};
+
+struct tegra_vpr_attachment {
+ struct device *dev;
+ struct sg_table sgt;
+ struct list_head list;
+};
+
+#define ARM_SMCCC_TE_FUNC_PROGRAM_VPR 0x3
+
+#define ARM_SMCCC_VENDOR_SIP_TE_PROGRAM_VPR_FUNC_ID \
+ ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \
+ ARM_SMCCC_SMC_32, \
+ ARM_SMCCC_OWNER_SIP, \
+ ARM_SMCCC_TE_FUNC_PROGRAM_VPR)
+
+static int tegra_vpr_set(phys_addr_t base, phys_addr_t size)
+{
+ struct arm_smccc_res res;
+
+ arm_smccc_smc(ARM_SMCCC_VENDOR_SIP_TE_PROGRAM_VPR_FUNC_ID, base, size,
+ 0, 0, 0, 0, 0, &res);
+
+ return res.a0;
+}
+
+static int tegra_vpr_get_extents(struct tegra_vpr *vpr, phys_addr_t *base,
+ phys_addr_t *size)
+{
+ phys_addr_t start = ~0, limit = 0;
+ unsigned int i;
+
+ for (i = 0; i < vpr->num_chunks; i++) {
+ struct tegra_vpr_chunk *chunk = &vpr->chunks[i];
+
+ if (!chunk->active)
+ break;
+
+ if (chunk->start < start)
+ start = chunk->start;
+
+ if (chunk->limit > limit)
+ limit = chunk->limit;
+ }
+
+ if (limit > start) {
+ *size = limit - start;
+ *base = start;
+ } else {
+ *base = *size = 0;
+ }
+
+ return 0;
+}
+
+static int tegra_vpr_resize(struct tegra_vpr *vpr)
+{
+ struct tegra_vpr_device *node;
+ phys_addr_t base, size;
+ int err;
+
+ err = tegra_vpr_get_extents(vpr, &base, &size);
+ if (err < 0) {
+ pr_err("%s(): failed to get VPR extents: %d\n", __func__, err);
+ return err;
+ }
+
+ if (vpr->use_freezer) {
+ err = freeze_processes();
+ if (err < 0) {
+ pr_err("%s(): failed to freeze processes: %d\n",
+ __func__, err);
+ return err;
+ }
+ }
+
+ list_for_each_entry(node, &vpr->devices, node) {
+ err = pm_generic_freeze(node->dev);
+ if (err < 0) {
+ pr_err("failed to runtime suspend %s\n",
+ dev_name(node->dev));
+ continue;
+ }
+ }
+
+ trace_tegra_vpr_set(base, size);
+
+ err = tegra_vpr_set(base, size);
+ if (err < 0) {
+ pr_err("failed to secure VPR: %d\n", err);
+ return err;
+ }
+
+ list_for_each_entry(node, &vpr->devices, node) {
+ err = pm_generic_thaw(node->dev);
+ if (err < 0) {
+ pr_err("failed to runtime resume %s\n",
+ dev_name(node->dev));
+ continue;
+ }
+ }
+
+ if (vpr->use_freezer)
+ thaw_processes();
+
+ return 0;
+}
+
+static int tegra_vpr_protect_pages(pte_t *ptep, unsigned long addr,
+ void *unused)
+{
+ pte_t pte = __ptep_get(ptep);
+
+ pte = clear_pte_bit(pte, __pgprot(PROT_NORMAL));
+ pte = set_pte_bit(pte, __pgprot(PROT_DEVICE_nGnRnE));
+
+ __set_pte(ptep, pte);
+
+ return 0;
+}
+
+static int tegra_vpr_unprotect_pages(pte_t *ptep, unsigned long addr,
+ void *unused)
+{
+ pte_t pte = __ptep_get(ptep);
+
+ pte = clear_pte_bit(pte, __pgprot(PROT_DEVICE_nGnRnE));
+ pte = set_pte_bit(pte, __pgprot(PROT_NORMAL));
+
+ __set_pte(ptep, pte);
+
+ return 0;
+}
+
+static int tegra_vpr_chunk_init(struct tegra_vpr *vpr,
+ struct tegra_vpr_chunk *chunk,
+ phys_addr_t start, size_t size,
+ unsigned int order, const char *name)
+{
+ INIT_LIST_HEAD(&chunk->buffers);
+ chunk->start = start;
+ chunk->limit = start + size;
+ chunk->size = size;
+ chunk->vpr = vpr;
+
+ chunk->cma = cma_create(start, size, order, name);
+ if (IS_ERR(chunk->cma))
+ return PTR_ERR(chunk->cma);
+
+ chunk->num_pages = size >> PAGE_SHIFT;
+
+ chunk->bitmap = bitmap_zalloc(chunk->num_pages, GFP_KERNEL);
+ if (!chunk->bitmap) {
+ cma_free(chunk->cma);
+ return -ENOMEM;
+ }
+
+ /* CMA area is not reserved yet */
+ chunk->start_page = NULL;
+ chunk->virt = 0;
+
+ return 0;
+}
+
+static void tegra_vpr_chunk_free(struct tegra_vpr_chunk *chunk)
+{
+ kfree(chunk->bitmap);
+ cma_free(chunk->cma);
+}
+
+static inline bool tegra_vpr_chunk_is_last(const struct tegra_vpr_chunk *chunk)
+{
+ phys_addr_t limit = chunk->vpr->base + chunk->vpr->size;
+
+ return chunk->limit == limit;
+}
+
+static inline bool tegra_vpr_chunk_is_leaf(const struct tegra_vpr_chunk *chunk)
+{
+ const struct tegra_vpr_chunk *next = chunk + 1;
+
+ if (tegra_vpr_chunk_is_last(chunk))
+ return true;
+
+ return !next->active;
+}
+
+static int tegra_vpr_chunk_activate(struct tegra_vpr_chunk *chunk)
+{
+ unsigned long align = get_order(chunk->vpr->align);
+ int err;
+
+ if (chunk->active)
+ return 0;
+
+ trace_tegra_vpr_chunk_activate(chunk->start, chunk->limit);
+
+ chunk->start_page = cma_alloc(chunk->cma, chunk->num_pages, align,
+ false);
+ if (!chunk->start_page) {
+ err = -ENOMEM;
+ goto free;
+ }
+
+ chunk->virt = (unsigned long)page_to_virt(chunk->start_page);
+
+ apply_to_existing_page_range(&init_mm, chunk->virt, chunk->size,
+ tegra_vpr_protect_pages, NULL);
+ flush_tlb_kernel_range(chunk->virt, chunk->virt + chunk->size);
+
+ chunk->active = true;
+
+ err = tegra_vpr_resize(chunk->vpr);
+ if (err < 0)
+ goto unprotect;
+
+ bitmap_zero(chunk->bitmap, chunk->num_pages);
+
+ return 0;
+
+unprotect:
+ chunk->active = false;
+ apply_to_existing_page_range(&init_mm, chunk->virt, chunk->size,
+ tegra_vpr_unprotect_pages, NULL);
+ flush_tlb_kernel_range(chunk->virt, chunk->virt + chunk->size);
+free:
+ cma_release(chunk->cma, chunk->start_page, chunk->num_pages);
+ chunk->start_page = NULL;
+ chunk->virt = 0;
+ return err;
+}
+
+static int tegra_vpr_chunk_deactivate(struct tegra_vpr_chunk *chunk)
+{
+ int err;
+
+ if (!chunk->active || !tegra_vpr_chunk_is_leaf(chunk))
+ return 0;
+
+ /* do not deactivate if there are buffers left in this chunk */
+ if (WARN_ON(!list_empty(&chunk->buffers)))
+ return 0;
+
+ trace_tegra_vpr_chunk_deactivate(chunk->start, chunk->limit);
+
+ chunk->active = false;
+
+ err = tegra_vpr_resize(chunk->vpr);
+ if (err < 0) {
+ chunk->active = true;
+ return err;
+ }
+
+ apply_to_existing_page_range(&init_mm, chunk->virt, chunk->size,
+ tegra_vpr_unprotect_pages, NULL);
+ flush_tlb_kernel_range(chunk->virt, chunk->virt + chunk->size);
+
+ cma_release(chunk->cma, chunk->start_page, chunk->num_pages);
+ chunk->start_page = NULL;
+ chunk->virt = 0;
+
+ return 0;
+}
+
+static struct tegra_vpr_buffer *
+tegra_vpr_chunk_allocate(struct tegra_vpr_chunk *chunk, size_t size)
+{
+ unsigned int order = get_order(size);
+ struct tegra_vpr_buffer *buffer;
+ int pageno, err;
+ pgoff_t i;
+
+ err = tegra_vpr_chunk_activate(chunk);
+ if (err < 0)
+ return ERR_PTR(err);
+
+ /*
+ * "order" defines the alignment and size, so this may result in
+ * fragmented memory depending on the allocation patterns. However,
+ * since this is used primarily for video frames, it is expected that
+ * a number of buffers of the same size will be allocated, so
+ * fragmentation should be negligible.
+ */
+ pageno = bitmap_find_free_region(chunk->bitmap, chunk->num_pages,
+ order);
+ if (pageno < 0)
+ return ERR_PTR(-ENOSPC);
+
+ buffer = kzalloc(sizeof(*buffer), GFP_KERNEL);
+ if (!buffer) {
+ err = -ENOMEM;
+ goto release;
+ }
+
+ INIT_LIST_HEAD(&buffer->attachments);
+ mutex_init(&buffer->lock);
+ buffer->chunk = chunk;
+ buffer->start = chunk->start + (pageno << PAGE_SHIFT);
+ buffer->limit = buffer->start + size;
+ buffer->size = size;
+ buffer->num_pages = buffer->size >> PAGE_SHIFT;
+ buffer->pageno = pageno;
+ buffer->order = order;
+
+ buffer->virt = (unsigned long)page_to_virt(chunk->start_page + pageno);
+
+ buffer->pages = kmalloc_array(buffer->num_pages,
+ sizeof(*buffer->pages),
+ GFP_KERNEL);
+ if (!buffer->pages) {
+ err = -ENOMEM;
+ goto free;
+ }
+
+ for (i = 0; i < buffer->num_pages; i++)
+ buffer->pages[i] = &chunk->start_page[pageno + i];
+
+ list_add_tail(&buffer->list, &chunk->buffers);
+
+ return buffer;
+
+free:
+ kfree(buffer);
+release:
+ bitmap_release_region(chunk->bitmap, pageno, order);
+ return ERR_PTR(err);
+}
+
+static void tegra_vpr_chunk_release(struct tegra_vpr_chunk *chunk,
+ struct tegra_vpr_buffer *buffer)
+{
+ list_del(&buffer->list);
+ kfree(buffer->pages);
+ kfree(buffer);
+
+ bitmap_release_region(chunk->bitmap, buffer->pageno, buffer->order);
+}
+
+static int tegra_vpr_attach(struct dma_buf *buf,
+ struct dma_buf_attachment *attachment)
+{
+ struct tegra_vpr_buffer *buffer = buf->priv;
+ struct tegra_vpr_attachment *attach;
+ int err;
+
+ attach = kzalloc(sizeof(*attach), GFP_KERNEL);
+ if (!attach)
+ return -ENOMEM;
+
+ err = sg_alloc_table_from_pages(&attach->sgt, buffer->pages,
+ buffer->num_pages, 0, buffer->size,
+ GFP_KERNEL);
+ if (err < 0)
+ goto free;
+
+ attach->dev = attach->dev;
+ INIT_LIST_HEAD(&attach->list);
+ attachment->priv = attach;
+
+ mutex_lock(&buffer->lock);
+ list_add(&attach->list, &buffer->attachments);
+ mutex_unlock(&buffer->lock);
+
+ return 0;
+
+free:
+ kfree(attach);
+ return err;
+}
+
+static void tegra_vpr_detach(struct dma_buf *buf,
+ struct dma_buf_attachment *attachment)
+{
+ struct tegra_vpr_buffer *buffer = buf->priv;
+ struct tegra_vpr_attachment *attach = attachment->priv;
+
+ mutex_lock(&buffer->lock);
+ list_del(&attach->list);
+ mutex_unlock(&buffer->lock);
+
+ sg_free_table(&attach->sgt);
+ kfree(attach);
+}
+
+static struct sg_table *
+tegra_vpr_map_dma_buf(struct dma_buf_attachment *attachment,
+ enum dma_data_direction direction)
+{
+ struct tegra_vpr_attachment *attach = attachment->priv;
+ struct sg_table *sgt = &attach->sgt;
+ int err;
+
+ err = dma_map_sgtable(attachment->dev, sgt, direction,
+ DMA_ATTR_SKIP_CPU_SYNC);
+ if (err < 0)
+ return ERR_PTR(err);
+
+ return sgt;
+}
+
+static void tegra_vpr_unmap_dma_buf(struct dma_buf_attachment *attachment,
+ struct sg_table *sgt,
+ enum dma_data_direction direction)
+{
+ dma_unmap_sgtable(attachment->dev, sgt, direction,
+ DMA_ATTR_SKIP_CPU_SYNC);
+}
+
+static void tegra_vpr_recycle(struct tegra_vpr *vpr)
+{
+ unsigned int i;
+ int err;
+
+ /*
+ * Walk the list of chunks in reverse order and check if they can be
+ * deactivated.
+ */
+ for (i = 0; i < vpr->num_chunks; i++) {
+ unsigned int index = vpr->num_chunks - i - 1;
+ struct tegra_vpr_chunk *chunk = &vpr->chunks[index];
+
+ /*
+ * Stop at any chunk that has remaining buffers. We cannot
+ * deactivate any chunks at lower addresses because the
+ * protected region needs to remain contiguous. Technically we
+ * could shrink from top and bottom, but for the sake of
+ * simplicity we'll only shrink from the top for now.
+ */
+ if (!list_empty(&chunk->buffers))
+ break;
+
+ err = tegra_vpr_chunk_deactivate(chunk);
+ if (err < 0)
+ pr_err("failed to deactivate chunk\n");
+ }
+}
+
+static void tegra_vpr_release(struct dma_buf *buf)
+{
+ struct tegra_vpr_buffer *buffer = buf->priv;
+ struct tegra_vpr_chunk *chunk = buffer->chunk;
+ struct tegra_vpr *vpr = chunk->vpr;
+
+ mutex_lock(&vpr->lock);
+
+ tegra_vpr_chunk_release(chunk, buffer);
+ tegra_vpr_recycle(vpr);
+
+ mutex_unlock(&vpr->lock);
+}
+
+/*
+ * Prohibit userspace mapping because the CPU cannot access this memory
+ * anyway.
+ */
+static int tegra_vpr_begin_cpu_access(struct dma_buf *buf,
+ enum dma_data_direction direction)
+{
+ return -EPERM;
+}
+
+static int tegra_vpr_end_cpu_access(struct dma_buf *buf,
+ enum dma_data_direction direction)
+{
+ return -EPERM;
+}
+
+static int tegra_vpr_mmap(struct dma_buf *buf, struct vm_area_struct *vma)
+{
+ return -EPERM;
+}
+
+static const struct dma_buf_ops tegra_vpr_buf_ops = {
+ .attach = tegra_vpr_attach,
+ .detach = tegra_vpr_detach,
+ .map_dma_buf = tegra_vpr_map_dma_buf,
+ .unmap_dma_buf = tegra_vpr_unmap_dma_buf,
+ .release = tegra_vpr_release,
+ .begin_cpu_access = tegra_vpr_begin_cpu_access,
+ .end_cpu_access = tegra_vpr_end_cpu_access,
+ .mmap = tegra_vpr_mmap,
+};
+
+static struct dma_buf *tegra_vpr_allocate(struct dma_heap *heap,
+ unsigned long len, u32 fd_flags,
+ u64 heap_flags)
+{
+ struct tegra_vpr *vpr = dma_heap_get_drvdata(heap);
+ DEFINE_DMA_BUF_EXPORT_INFO(export);
+ struct tegra_vpr_buffer *buffer;
+ struct dma_buf *buf;
+ unsigned int i;
+
+ mutex_lock(&vpr->lock);
+
+ for (i = 0; i < vpr->num_chunks; i++) {
+ struct tegra_vpr_chunk *chunk = &vpr->chunks[i];
+ size_t size = ALIGN(len, vpr->align);
+
+ buffer = tegra_vpr_chunk_allocate(chunk, size);
+ if (IS_ERR(buffer)) {
+ /* try the next chunk if the current one is exhausted */
+ if (PTR_ERR(buffer) == -ENOSPC)
+ continue;
+
+ mutex_unlock(&vpr->lock);
+ return ERR_CAST(buffer);
+ }
+
+ /*
+ * If a valid buffer was allocated, wrap it in a dma_buf and
+ * return it.
+ */
+ if (buffer) {
+ export.exp_name = dma_heap_get_name(heap);
+ export.ops = &tegra_vpr_buf_ops;
+ export.size = buffer->size;
+ export.flags = fd_flags;
+ export.priv = buffer;
+
+ buf = dma_buf_export(&export);
+ if (IS_ERR(buf)) {
+ tegra_vpr_chunk_release(chunk, buffer);
+ return ERR_CAST(buf);
+ }
+
+ mutex_unlock(&vpr->lock);
+ return buf;
+ }
+ }
+
+ mutex_unlock(&vpr->lock);
+
+ /*
+ * If we get here, none of the chunks could allocate a buffer, so
+ * there's nothing else we can do.
+ */
+ return ERR_PTR(-ENOMEM);
+}
+
+static int tegra_vpr_debugfs_show(struct seq_file *s, struct dma_heap *heap)
+{
+ struct tegra_vpr *vpr = dma_heap_get_drvdata(heap);
+ phys_addr_t limit = vpr->base + vpr->size;
+ unsigned int i;
+ char buf[16];
+
+ string_get_size(vpr->size, 1, STRING_UNITS_2, buf, sizeof(buf));
+ seq_printf(s, "%pap-%pap (%s)\n", &vpr->base, &limit, buf);
+
+ for (i = 0; i < vpr->num_chunks; i++) {
+ const struct tegra_vpr_chunk *chunk = &vpr->chunks[i];
+ struct tegra_vpr_buffer *buffer;
+
+ string_get_size(chunk->size, 1, STRING_UNITS_2, buf,
+ sizeof(buf));
+ seq_printf(s, " %pap-%pap (%s)\n", &chunk->start,
+ &chunk->limit, buf);
+
+ list_for_each_entry(buffer, &chunk->buffers, list) {
+ string_get_size(buffer->size, 1, STRING_UNITS_2, buf,
+ sizeof(buf));
+ seq_printf(s, " %pap-%pap (%s)\n", &buffer->start,
+ &buffer->limit, buf);
+ }
+ }
+
+ return 0;
+}
+
+static const struct dma_heap_ops tegra_vpr_heap_ops = {
+ .allocate = tegra_vpr_allocate,
+ .show = tegra_vpr_debugfs_show,
+};
+
+static int __init tegra_vpr_add_heap(struct reserved_mem *rmem,
+ struct device_node *np)
+{
+ struct dma_heap_export_info info = {};
+ phys_addr_t start, limit;
+ struct dma_heap *heap;
+ struct tegra_vpr *vpr;
+ unsigned int order, i;
+ size_t max_size;
+ int err;
+
+ vpr = kzalloc(sizeof(*vpr), GFP_KERNEL);
+ if (!vpr) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ INIT_LIST_HEAD(&vpr->devices);
+ vpr->use_freezer = true;
+ vpr->dev_node = np;
+ vpr->align = SZ_1M;
+ vpr->base = rmem->base;
+ vpr->size = rmem->size;
+ vpr->num_chunks = 4;
+
+ max_size = PAGE_SIZE << (get_order(vpr->size) - ilog2(vpr->num_chunks));
+ order = get_order(vpr->align);
+
+ vpr->chunks = kcalloc(vpr->num_chunks, sizeof(*vpr->chunks),
+ GFP_KERNEL);
+ if (!vpr) {
+ err = -ENOMEM;
+ goto free;
+ }
+
+ /*
+ * Allocate CMA areas for VPR. All areas will be roughtly the same
+ * size, with the last area taking up the rest.
+ */
+ start = vpr->base;
+ limit = vpr->base + vpr->size;
+
+ pr_debug("VPR: %pap-%pap (%u chunks, %lu MiB)\n", &start, &limit,
+ vpr->num_chunks, (unsigned long)vpr->size / 1024 / 1024);
+
+ for (i = 0; i < vpr->num_chunks; i++) {
+ size_t size = limit - start;
+ phys_addr_t end;
+
+ size = min_t(size_t, size, max_size);
+ end = start + size - 1;
+
+ err = tegra_vpr_chunk_init(vpr, &vpr->chunks[i], start, size,
+ order, rmem->name);
+ if (err < 0) {
+ pr_err("failed to create VPR chunk: %d\n", err);
+ goto free;
+ }
+
+ pr_debug(" %2u: %pap-%pap (%lu MiB)\n", i, &start, &end,
+ size / 1024 / 1024);
+ start += size;
+ }
+
+ info.name = vpr->dev_node->name;
+ info.ops = &tegra_vpr_heap_ops;
+ info.priv = vpr;
+
+ heap = dma_heap_add(&info);
+ if (IS_ERR(heap)) {
+ err = PTR_ERR(heap);
+ goto cma_free;
+ }
+
+ rmem->priv = heap;
+
+ return 0;
+
+cma_free:
+ while (i--)
+ tegra_vpr_chunk_free(&vpr->chunks[i]);
+free:
+ kfree(vpr->chunks);
+ kfree(vpr);
+out:
+ return err;
+}
+
+static int __init tegra_vpr_init(void)
+{
+ const char *compatible = "nvidia,tegra-video-protection-region";
+ struct device_node *parent;
+ struct reserved_mem *rmem;
+ int err;
+
+ parent = of_find_node_by_path("/reserved-memory");
+ if (!parent)
+ return 0;
+
+ for_each_child_of_node_scoped(parent, child) {
+ if (!of_device_is_compatible(child, compatible))
+ continue;
+
+ rmem = of_reserved_mem_lookup(child);
+ if (!rmem)
+ continue;
+
+ err = tegra_vpr_add_heap(rmem, child);
+ if (err < 0)
+ pr_err("failed to add VPR heap for %pOF: %d\n", child,
+ err);
+
+ /* only a single VPR heap is supported */
+ break;
+ }
+
+ return 0;
+}
+module_init(tegra_vpr_init);
+
+static int tegra_vpr_device_init(struct reserved_mem *rmem, struct device *dev)
+{
+ struct dma_heap *heap = rmem->priv;
+ struct tegra_vpr *vpr = dma_heap_get_drvdata(heap);
+ struct tegra_vpr_device *node;
+ int err = 0;
+
+ if (!dev->driver->pm->freeze || !dev->driver->pm->thaw)
+ return -EINVAL;
+
+ node = kzalloc(sizeof(*node), GFP_KERNEL);
+ if (!node) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ INIT_LIST_HEAD(&node->node);
+ node->dev = dev;
+
+ list_add_tail(&node->node, &vpr->devices);
+
+out:
+ return err;
+}
+
+static void tegra_vpr_device_release(struct reserved_mem *rmem,
+ struct device *dev)
+{
+ struct dma_heap *heap = rmem->priv;
+ struct tegra_vpr *vpr = dma_heap_get_drvdata(heap);
+ struct tegra_vpr_device *node, *tmp;
+
+ list_for_each_entry_safe(node, tmp, &vpr->devices, node) {
+ if (node->dev == dev) {
+ list_del(&node->node);
+ kfree(node);
+ }
+ }
+}
+
+static const struct reserved_mem_ops tegra_vpr_ops = {
+ .device_init = tegra_vpr_device_init,
+ .device_release = tegra_vpr_device_release,
+};
+
+static int tegra_vpr_rmem_init(struct reserved_mem *rmem)
+{
+ rmem->ops = &tegra_vpr_ops;
+
+ return 0;
+}
+RESERVEDMEM_OF_DECLARE(tegra_vpr, "nvidia,tegra-video-protection-region",
+ tegra_vpr_rmem_init);
+
+MODULE_DESCRIPTION("NVIDIA Tegra Video-Protection-Region DMA-BUF heap driver");
+MODULE_LICENSE("GPL");
diff --git a/include/trace/events/tegra_vpr.h b/include/trace/events/tegra_vpr.h
new file mode 100644
index 000000000000..f8ceb17679fe
--- /dev/null
+++ b/include/trace/events/tegra_vpr.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#if !defined(_TRACE_TEGRA_VPR_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_TEGRA_VPR_H
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM tegra_vpr
+
+#include <linux/tracepoint.h>
+
+TRACE_EVENT(tegra_vpr_chunk_activate,
+ TP_PROTO(phys_addr_t start, phys_addr_t limit),
+ TP_ARGS(start, limit),
+ TP_STRUCT__entry(
+ __field(phys_addr_t, start)
+ __field(phys_addr_t, limit)
+ ),
+ TP_fast_assign(
+ __entry->start = start;
+ __entry->limit = limit;
+ ),
+ TP_printk("%pap-%pap", &__entry->start,
+ &__entry->limit)
+);
+
+TRACE_EVENT(tegra_vpr_chunk_deactivate,
+ TP_PROTO(phys_addr_t start, phys_addr_t limit),
+ TP_ARGS(start, limit),
+ TP_STRUCT__entry(
+ __field(phys_addr_t, start)
+ __field(phys_addr_t, limit)
+ ),
+ TP_fast_assign(
+ __entry->start = start;
+ __entry->limit = limit;
+ ),
+ TP_printk("%pap-%pap", &__entry->start,
+ &__entry->limit)
+);
+
+TRACE_EVENT(tegra_vpr_set,
+ TP_PROTO(phys_addr_t base, phys_addr_t size),
+ TP_ARGS(base, size),
+ TP_STRUCT__entry(
+ __field(phys_addr_t, start)
+ __field(phys_addr_t, limit)
+ ),
+ TP_fast_assign(
+ __entry->start = base;
+ __entry->limit = base + size;
+ ),
+ TP_printk("%pap-%pap", &__entry->start, &__entry->limit)
+);
+
+#endif /* _TRACE_TEGRA_VPR_H */
+
+#include <trace/define_trace.h>
--
2.50.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 6/9] arm64: tegra: Add VPR placeholder node on Tegra234
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
` (4 preceding siblings ...)
2025-09-02 15:46 ` [PATCH 5/9] dma-buf: heaps: Add support for Tegra VPR Thierry Reding
@ 2025-09-02 15:46 ` Thierry Reding
2025-09-04 15:30 ` Thierry Reding
2025-09-02 15:46 ` [PATCH 7/9] arm64: tegra: Add GPU " Thierry Reding
` (3 subsequent siblings)
9 siblings, 1 reply; 24+ messages in thread
From: Thierry Reding @ 2025-09-02 15:46 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
From: Thierry Reding <treding@nvidia.com>
This node contains two sets of properties, one for the case where the
VPR is resizable (in which case the VPR region will be dynamically
allocated at boot time) and another case where the VPR is fixed in size
and initialized by early firmware.
The firmware running on the device is responsible for updating the node
with the real physical address for the fixed VPR case and remove the
properties needed only for resizable VPR. Similarly, if the VPR is
resizable, the firmware should remove the "reg" property since it is no
longer needed.
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
arch/arm64/boot/dts/nvidia/tegra234.dtsi | 34 ++++++++++++++++++++++++
1 file changed, 34 insertions(+)
diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
index df034dbb8285..4d572f5fa0b1 100644
--- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
@@ -28,6 +28,40 @@ aliases {
i2c8 = &dp_aux_ch3_i2c;
};
+ reserved-memory {
+ #address-cells = <2>;
+ #size-cells = <2>;
+ ranges;
+
+ vpr: video-protection-region@0 {
+ compatible = "nvidia,tegra-video-protection-region";
+ status = "disabled";
+ no-map;
+
+ /*
+ * Two variants exist for this. For fixed VPR, the
+ * firmware is supposed to update the "reg" property
+ * with the fixed memory region configured as VPR.
+ *
+ * For resizable VPR we don't care about the exact
+ * address and instead want a reserved region to be
+ * allocated with a certain size and alignment at
+ * boot time.
+ *
+ * The firmware is responsible for removing the
+ * unused set of properties.
+ */
+
+ /* fixed VPR */
+ reg = <0x0 0x0 0x0 0x0>;
+
+ /* resizable VPR */
+ size = <0x0 0x70000000>;
+ alignment = <0x0 0x100000>;
+ reusable;
+ };
+ };
+
bus@0 {
compatible = "simple-bus";
--
2.50.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 7/9] arm64: tegra: Add GPU node on Tegra234
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
` (5 preceding siblings ...)
2025-09-02 15:46 ` [PATCH 6/9] arm64: tegra: Add VPR placeholder node on Tegra234 Thierry Reding
@ 2025-09-02 15:46 ` Thierry Reding
2025-09-02 15:46 ` [PATCH 8/9] arm64: tegra: Hook up VPR to host1x Thierry Reding
` (2 subsequent siblings)
9 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2025-09-02 15:46 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
From: Thierry Reding <treding@nvidia.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
arch/arm64/boot/dts/nvidia/tegra234.dtsi | 17 +++++++++++++++++
1 file changed, 17 insertions(+)
diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
index 4d572f5fa0b1..4f8031055ad0 100644
--- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
@@ -5262,6 +5262,23 @@ pcie-ep@141e0000 {
};
};
+ gpu@17000000 {
+ compatible = "nvidia,ga10b";
+ reg = <0x0 0x17000000 0x0 0x1000000>,
+ <0x0 0x18000000 0x0 0x1000000>;
+ interrupts = <GIC_SPI 67 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 68 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 70 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 71 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-names = "nonstall", "stall0", "stall1", "stall2";
+ power-domains = <&bpmp TEGRA234_POWER_DOMAIN_GPU>;
+ clocks = <&bpmp TEGRA234_CLK_GPUSYS>,
+ <&bpmp TEGRA234_CLK_GPC0CLK>,
+ <&bpmp TEGRA234_CLK_GPC1CLK>;
+ clock-names = "sys", "gpc0", "gpc1";
+ resets = <&bpmp TEGRA234_RESET_GPU>;
+ };
+
sram@40000000 {
compatible = "nvidia,tegra234-sysram", "mmio-sram";
reg = <0x0 0x40000000 0x0 0x80000>;
--
2.50.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 8/9] arm64: tegra: Hook up VPR to host1x
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
` (6 preceding siblings ...)
2025-09-02 15:46 ` [PATCH 7/9] arm64: tegra: Add GPU " Thierry Reding
@ 2025-09-02 15:46 ` Thierry Reding
2025-09-02 15:46 ` [PATCH 9/9] arm64: tegra: Hook up VPR to the GPU Thierry Reding
2025-09-03 11:54 ` [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support David Hildenbrand
9 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2025-09-02 15:46 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
From: Thierry Reding <treding@nvidia.com>
The host1x needs access to the VPR region, so make sure to reference it
via the memory-region property.
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
arch/arm64/boot/dts/nvidia/tegra234.dtsi | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
index 4f8031055ad0..0b9c2e1b47d2 100644
--- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
@@ -4414,6 +4414,9 @@ host1x@13e00000 {
<14 &smmu_niso1 TEGRA234_SID_HOST1X_CTX6 1>,
<15 &smmu_niso1 TEGRA234_SID_HOST1X_CTX7 1>;
+ memory-region = <&vpr>;
+ memory-region-names = "protected";
+
vic@15340000 {
compatible = "nvidia,tegra234-vic";
reg = <0x0 0x15340000 0x0 0x00040000>;
--
2.50.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH 9/9] arm64: tegra: Hook up VPR to the GPU
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
` (7 preceding siblings ...)
2025-09-02 15:46 ` [PATCH 8/9] arm64: tegra: Hook up VPR to host1x Thierry Reding
@ 2025-09-02 15:46 ` Thierry Reding
2025-09-03 11:54 ` [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support David Hildenbrand
9 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2025-09-02 15:46 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
From: Thierry Reding <treding@nvidia.com>
The GPU needs to be idled before the VPR can be resized and unidled
afterwards. Associate it with the VPR using the standard memory-region
device tree property.
Signed-off-by: Thierry Reding <treding@nvidia.com>
---
arch/arm64/boot/dts/nvidia/tegra234.dtsi | 3 +++
1 file changed, 3 insertions(+)
diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
index 0b9c2e1b47d2..98d87144a2e4 100644
--- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
@@ -5280,6 +5280,9 @@ gpu@17000000 {
<&bpmp TEGRA234_CLK_GPC1CLK>;
clock-names = "sys", "gpc0", "gpc1";
resets = <&bpmp TEGRA234_RESET_GPU>;
+
+ memory-region-names = "vpr";
+ memory-region = <&vpr>;
};
sram@40000000 {
--
2.50.0
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas
2025-09-02 15:46 ` [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas Thierry Reding
@ 2025-09-02 17:27 ` Frank van der Linden
2025-09-02 19:04 ` David Hildenbrand
2025-09-03 16:05 ` Thierry Reding
0 siblings, 2 replies; 24+ messages in thread
From: Frank van der Linden @ 2025-09-02 17:27 UTC (permalink / raw)
To: Thierry Reding
Cc: David Airlie, Simona Vetter, Sumit Semwal, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
On Tue, Sep 2, 2025 at 8:46 AM Thierry Reding <thierry.reding@gmail.com> wrote:
>
> From: Thierry Reding <treding@nvidia.com>
>
> There is no technical reason why there should be a limited number of CMA
> regions, so extract some code into helpers and use them to create extra
> functions (cma_create() and cma_free()) that allow creating and freeing,
> respectively, CMA regions dynamically at runtime.
>
> Note that these dynamically created CMA areas are treated specially and
> do not contribute to the number of total CMA pages so that this count
> still only applies to the fixed number of CMA areas.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
> include/linux/cma.h | 16 ++++++++
> mm/cma.c | 89 ++++++++++++++++++++++++++++++++++-----------
> 2 files changed, 83 insertions(+), 22 deletions(-)
>
> diff --git a/include/linux/cma.h b/include/linux/cma.h
> index 62d9c1cf6326..f1e20642198a 100644
> --- a/include/linux/cma.h
> +++ b/include/linux/cma.h
> @@ -61,6 +61,10 @@ extern void cma_reserve_pages_on_error(struct cma *cma);
> struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
> bool cma_free_folio(struct cma *cma, const struct folio *folio);
> bool cma_validate_zones(struct cma *cma);
> +
> +struct cma *cma_create(phys_addr_t base, phys_addr_t size,
> + unsigned int order_per_bit, const char *name);
> +void cma_free(struct cma *cma);
> #else
> static inline struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
> {
> @@ -71,10 +75,22 @@ static inline bool cma_free_folio(struct cma *cma, const struct folio *folio)
> {
> return false;
> }
> +
> static inline bool cma_validate_zones(struct cma *cma)
> {
> return false;
> }
> +
> +static inline struct cma *cma_create(phys_addr_t base, phys_addr_t size,
> + unsigned int order_per_bit,
> + const char *name)
> +{
> + return NULL;
> +}
> +
> +static inline void cma_free(struct cma *cma)
> +{
> +}
> #endif
>
> #endif
> diff --git a/mm/cma.c b/mm/cma.c
> index e56ec64d0567..8149227d319f 100644
> --- a/mm/cma.c
> +++ b/mm/cma.c
> @@ -214,6 +214,18 @@ void __init cma_reserve_pages_on_error(struct cma *cma)
> set_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags);
> }
>
> +static void __init cma_init_area(struct cma *cma, const char *name,
> + phys_addr_t size, unsigned int order_per_bit)
> +{
> + if (name)
> + snprintf(cma->name, CMA_MAX_NAME, "%s", name);
> + else
> + snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
> +
> + cma->available_count = cma->count = size >> PAGE_SHIFT;
> + cma->order_per_bit = order_per_bit;
> +}
> +
> static int __init cma_new_area(const char *name, phys_addr_t size,
> unsigned int order_per_bit,
> struct cma **res_cma)
> @@ -232,13 +244,8 @@ static int __init cma_new_area(const char *name, phys_addr_t size,
> cma = &cma_areas[cma_area_count];
> cma_area_count++;
>
> - if (name)
> - snprintf(cma->name, CMA_MAX_NAME, "%s", name);
> - else
> - snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count);
> + cma_init_area(cma, name, size, order_per_bit);
>
> - cma->available_count = cma->count = size >> PAGE_SHIFT;
> - cma->order_per_bit = order_per_bit;
> *res_cma = cma;
> totalcma_pages += cma->count;
>
> @@ -251,6 +258,27 @@ static void __init cma_drop_area(struct cma *cma)
> cma_area_count--;
> }
>
> +static int __init cma_check_memory(phys_addr_t base, phys_addr_t size)
> +{
> + if (!size || !memblock_is_region_reserved(base, size))
> + return -EINVAL;
> +
> + /*
> + * CMA uses CMA_MIN_ALIGNMENT_BYTES as alignment requirement which
> + * needs pageblock_order to be initialized. Let's enforce it.
> + */
> + if (!pageblock_order) {
> + pr_err("pageblock_order not yet initialized. Called during early boot?\n");
> + return -EINVAL;
> + }
> +
> + /* ensure minimal alignment required by mm core */
> + if (!IS_ALIGNED(base | size, CMA_MIN_ALIGNMENT_BYTES))
> + return -EINVAL;
> +
> + return 0;
> +}
> +
> /**
> * cma_init_reserved_mem() - create custom contiguous area from reserved memory
> * @base: Base address of the reserved area
> @@ -271,22 +299,9 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
> struct cma *cma;
> int ret;
>
> - /* Sanity checks */
> - if (!size || !memblock_is_region_reserved(base, size))
> - return -EINVAL;
> -
> - /*
> - * CMA uses CMA_MIN_ALIGNMENT_BYTES as alignment requirement which
> - * needs pageblock_order to be initialized. Let's enforce it.
> - */
> - if (!pageblock_order) {
> - pr_err("pageblock_order not yet initialized. Called during early boot?\n");
> - return -EINVAL;
> - }
> -
> - /* ensure minimal alignment required by mm core */
> - if (!IS_ALIGNED(base | size, CMA_MIN_ALIGNMENT_BYTES))
> - return -EINVAL;
> + ret = cma_check_memory(base, size);
> + if (ret < 0)
> + return ret;
>
> ret = cma_new_area(name, size, order_per_bit, &cma);
> if (ret != 0)
> @@ -1112,3 +1127,33 @@ void __init *cma_reserve_early(struct cma *cma, unsigned long size)
>
> return ret;
> }
> +
> +struct cma *__init cma_create(phys_addr_t base, phys_addr_t size,
> + unsigned int order_per_bit, const char *name)
> +{
> + struct cma *cma;
> + int ret;
> +
> + ret = cma_check_memory(base, size);
> + if (ret < 0)
> + return ERR_PTR(ret);
> +
> + cma = kzalloc(sizeof(*cma), GFP_KERNEL);
> + if (!cma)
> + return ERR_PTR(-ENOMEM);
> +
> + cma_init_area(cma, name, size, order_per_bit);
> + cma->ranges[0].base_pfn = PFN_DOWN(base);
> + cma->ranges[0].early_pfn = PFN_DOWN(base);
> + cma->ranges[0].count = cma->count;
> + cma->nranges = 1;
> +
> + cma_activate_area(cma);
> +
> + return cma;
> +}
> +
> +void cma_free(struct cma *cma)
> +{
> + kfree(cma);
> +}
> --
> 2.50.0
I agree that supporting dynamic CMA areas would be good. However, by
doing it like this, these CMA areas are invisible to the rest of the
system. E.g. cma_for_each_area() does not know about them. It seems a
bit inconsistent that there will now be some areas that are globally
known, and some that are not.
I am being somewhat selfish here, as I have some WIP code that needs
the global list :-) But I think the inconsistency is a more general
point than just what I want (and the s390 code does use
cma_for_each_area()). Maybe you could keep maintaining a global
structure containing all areas? What do you think are the chances of
running out of the global count of areas?
Also, you say that "these are treated specially and do not contribute
to the number of total CMA pages". But, if I'm reading this right, you
do call cma_activate_area(), which will do
init_cma_reserved_pageblock() for each pageblock in it. Which adjusts
the CMA counters for the zone they are in. But your change does not
adjust totalcma_pages for dynamically created areas. That seems
inconsistent, too.
- Frank
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas
2025-09-02 17:27 ` Frank van der Linden
@ 2025-09-02 19:04 ` David Hildenbrand
2025-09-03 16:12 ` Thierry Reding
2025-09-03 16:05 ` Thierry Reding
1 sibling, 1 reply; 24+ messages in thread
From: David Hildenbrand @ 2025-09-02 19:04 UTC (permalink / raw)
To: Frank van der Linden, Thierry Reding
Cc: David Airlie, Simona Vetter, Sumit Semwal, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
Mike Rapoport, dri-devel, devicetree, linux-tegra, linaro-mm-sig,
linux-mm
>> +>> +struct cma *__init cma_create(phys_addr_t base, phys_addr_t size,
>> + unsigned int order_per_bit, const char *name)
>> +{
>> + struct cma *cma;
>> + int ret;
>> +
>> + ret = cma_check_memory(base, size);
>> + if (ret < 0)
>> + return ERR_PTR(ret);
>> +
>> + cma = kzalloc(sizeof(*cma), GFP_KERNEL);
>> + if (!cma)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + cma_init_area(cma, name, size, order_per_bit);
>> + cma->ranges[0].base_pfn = PFN_DOWN(base);
>> + cma->ranges[0].early_pfn = PFN_DOWN(base);
>> + cma->ranges[0].count = cma->count;
>> + cma->nranges = 1;
>> +
>> + cma_activate_area(cma);
>> +
>> + return cma;
>> +}
>> +
>> +void cma_free(struct cma *cma)
>> +{
>> + kfree(cma);
>> +}
>> --
>> 2.50.0
>
>
> I agree that supporting dynamic CMA areas would be good. However, by
> doing it like this, these CMA areas are invisible to the rest of the
> system. E.g. cma_for_each_area() does not know about them. It seems a
> bit inconsistent that there will now be some areas that are globally
> known, and some that are not.
Yeah, I'm not a fan of that.
What is the big problem we are trying to solve here? Why do they have to
be dynamic, why do they even have to support freeing?
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 4/9] dma-buf: heaps: Add debugfs support
2025-09-02 15:46 ` [PATCH 4/9] dma-buf: heaps: Add debugfs support Thierry Reding
@ 2025-09-02 22:37 ` John Stultz
2025-09-03 15:38 ` Thierry Reding
0 siblings, 1 reply; 24+ messages in thread
From: John Stultz @ 2025-09-02 22:37 UTC (permalink / raw)
To: Thierry Reding
Cc: David Airlie, Simona Vetter, Sumit Semwal, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, T.J. Mercier, Andrew Morton, David Hildenbrand,
Mike Rapoport, dri-devel, devicetree, linux-tegra, linaro-mm-sig,
linux-mm
On Tue, Sep 2, 2025 at 8:46 AM Thierry Reding <thierry.reding@gmail.com> wrote:
>
> From: Thierry Reding <treding@nvidia.com>
>
> Add a callback to struct dma_heap_ops that heap providers can implement
> to show information about the state of the heap in debugfs. A top-level
> directory named "dma_heap" is created in debugfs and individual files
> will be named after the heaps.
>
I know its debugfs, but this feels a little loosey-goosey as an uAPI.
Is there any expected format for the show function?
What would other dmabuf heaps ideally export via this interface?
Is there some consistent dma_heap-ish concept for it to justify it
being under a dma_heap directory, and not just an independent debugfs
file for the driver implementing the dmabuf heap?
thanks
-john
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
` (8 preceding siblings ...)
2025-09-02 15:46 ` [PATCH 9/9] arm64: tegra: Hook up VPR to the GPU Thierry Reding
@ 2025-09-03 11:54 ` David Hildenbrand
9 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2025-09-03 11:54 UTC (permalink / raw)
To: Thierry Reding, David Airlie, Simona Vetter, Sumit Semwal
Cc: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
Mike Rapoport, dri-devel, devicetree, linux-tegra, linaro-mm-sig,
linux-mm
On 02.09.25 17:46, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> Hi,
>
Hi,
> This series adds support for the video protection region (VPR) used on
> Tegra SoC devices. It's a special region of memory that is protected
> from accesses by the CPU and used to store DRM protected content (both
> decrypted stream data as well as decoded video frames).
>
> Patches 1 and 2 add DT binding documentation for the VPR and add the VPR
> to the list of memory-region items for display and host1x.
>
> Patch 3 introduces new APIs needed by the Tegra VPR implementation that
> allow CMA areas to be dynamically created at runtime rather than using
> the fixed, system-wide list. This is used in this driver specifically
> because it can use an arbitrary number of these areas (though they are
> currently limited to 4).
I am pretty sure we want a system-wide list. Currently we maintain all
areas in a static array limited by CONFIG_CMA_AREAS.
We can adjsut that to support for more dynamically.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 4/9] dma-buf: heaps: Add debugfs support
2025-09-02 22:37 ` John Stultz
@ 2025-09-03 15:38 ` Thierry Reding
2025-09-03 18:48 ` John Stultz
0 siblings, 1 reply; 24+ messages in thread
From: Thierry Reding @ 2025-09-03 15:38 UTC (permalink / raw)
To: John Stultz
Cc: David Airlie, Simona Vetter, Sumit Semwal, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, T.J. Mercier, Andrew Morton, David Hildenbrand,
Mike Rapoport, dri-devel, devicetree, linux-tegra, linaro-mm-sig,
linux-mm
[-- Attachment #1: Type: text/plain, Size: 1766 bytes --]
On Tue, Sep 02, 2025 at 03:37:45PM -0700, John Stultz wrote:
> On Tue, Sep 2, 2025 at 8:46 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> >
> > From: Thierry Reding <treding@nvidia.com>
> >
> > Add a callback to struct dma_heap_ops that heap providers can implement
> > to show information about the state of the heap in debugfs. A top-level
> > directory named "dma_heap" is created in debugfs and individual files
> > will be named after the heaps.
> >
>
> I know its debugfs, but this feels a little loosey-goosey as an uAPI.
Well, the whole point of debugfs is that it's not really an ABI. Nothing
should ever rely on the presence of these files.
> Is there any expected format for the show function?
>
> What would other dmabuf heaps ideally export via this interface?
I've thought about this a bit and I'm not sure it makes sense to
standardize on this. I think on one hand having a list of buffers
exported by the dma-buf heap is probably the lowest common denominator,
but then there might be a bunch of other things that are very heap-
specific that some heap might want to export.
> Is there some consistent dma_heap-ish concept for it to justify it
> being under a dma_heap directory, and not just an independent debugfs
> file for the driver implementing the dmabuf heap?
Well, I think just the fact that it's a dma-heap would qualify its
corresponding debugfs to be in a well-known location. We could of course
pick some arbitrary location, but that's just a recipe for chaos because
then everybody puts these whereever they want. There's really no
standard place for driver-specific debugfs files to go, so putting it
into some "subsystem"-specific directory seems like the better option.
Thierry
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas
2025-09-02 17:27 ` Frank van der Linden
2025-09-02 19:04 ` David Hildenbrand
@ 2025-09-03 16:05 ` Thierry Reding
2025-09-03 16:41 ` Frank van der Linden
1 sibling, 1 reply; 24+ messages in thread
From: Thierry Reding @ 2025-09-03 16:05 UTC (permalink / raw)
To: Frank van der Linden
Cc: David Airlie, Simona Vetter, Sumit Semwal, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
[-- Attachment #1: Type: text/plain, Size: 4697 bytes --]
On Tue, Sep 02, 2025 at 10:27:01AM -0700, Frank van der Linden wrote:
> On Tue, Sep 2, 2025 at 8:46 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> >
> > From: Thierry Reding <treding@nvidia.com>
> >
> > There is no technical reason why there should be a limited number of CMA
> > regions, so extract some code into helpers and use them to create extra
> > functions (cma_create() and cma_free()) that allow creating and freeing,
> > respectively, CMA regions dynamically at runtime.
> >
> > Note that these dynamically created CMA areas are treated specially and
> > do not contribute to the number of total CMA pages so that this count
> > still only applies to the fixed number of CMA areas.
> >
> > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > ---
> > include/linux/cma.h | 16 ++++++++
> > mm/cma.c | 89 ++++++++++++++++++++++++++++++++++-----------
> > 2 files changed, 83 insertions(+), 22 deletions(-)
[...]
> I agree that supporting dynamic CMA areas would be good. However, by
> doing it like this, these CMA areas are invisible to the rest of the
> system. E.g. cma_for_each_area() does not know about them. It seems a
> bit inconsistent that there will now be some areas that are globally
> known, and some that are not.
That was kind of the point of this experiment. When I started on this I
ran into the case where I was running out of predefined CMA areas and as
I went looking for ways on how to fix this, I realized that there's not
much reason to keep a global list of these areas. And even less reason
to limit the number of CMA areas to this predefined list. Very little
code outside of the core CMA code even uses this.
There's one instance of cma_for_each_area() that I don't grok. There's
another early MMU fixup for CMA areas in 32-bit ARM that. Other than
that there's a few places where the total CMA page count is shown for
informational purposes and I don't know how useful that really is
because totalcma_pages doesn't really track how many pages are used for
CMA, but pages that could potentially be used for CMA.
And that's about it.
It seems like there are cases where we might really need to globally
know about some of these areas, specifically ones that are allocated
very early during boot and then used for very specific purposes.
However, it seems to me like CMA is more universally useful than just
for these cases and I don't see the usefulness of tracking these more
generic uses.
> I am being somewhat selfish here, as I have some WIP code that needs
> the global list :-) But I think the inconsistency is a more general
> point than just what I want (and the s390 code does use
> cma_for_each_area()). Maybe you could keep maintaining a global
> structure containing all areas?
If it's really useful to be able to access all CMA areas, then we could
easily just add them all to a global linked list upon activation (we may
still want/need to keep the predefined list around for all those early
allocation cases). That way we'd get the best of both worlds.
> What do you think are the chances of running out of the global count
> of areas?
Well, I did run out of CMA areas during the early VPR testing because I
was initially testing with 16 areas and a different allocation scheme
that turned out to cause too many resizes in common cases.
However, given that the default is 8 on normal systems (20 on NUMA) and
is configurable, it means that even with restricting this to 4 for VPR
doesn't always guarantee that all 4 are available. Again, yes, we could
keep bumping that number, but why not turn this into something a bit
more robust where nobody has to know or care about how many there are?
> Also, you say that "these are treated specially and do not contribute
> to the number of total CMA pages". But, if I'm reading this right, you
> do call cma_activate_area(), which will do
> init_cma_reserved_pageblock() for each pageblock in it. Which adjusts
> the CMA counters for the zone they are in. But your change does not
> adjust totalcma_pages for dynamically created areas. That seems
> inconsistent, too.
I was referring to just totalcma_pages that isn't impacted by these
dynamically allocated regions. This is, again, because I don't see why
that information would be useful. It's a fairly easy change to update
that value, so if people prefer that, I can add that.
I don't see an immediate connection between totalcma_pages and
init_cma_reserved_pageblock(). I thought the latter was primarily useful
for making sure that the CMA pages can be migrated, which is still
critical for this use-case.
Thierry
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas
2025-09-02 19:04 ` David Hildenbrand
@ 2025-09-03 16:12 ` Thierry Reding
2025-09-03 16:14 ` David Hildenbrand
0 siblings, 1 reply; 24+ messages in thread
From: Thierry Reding @ 2025-09-03 16:12 UTC (permalink / raw)
To: David Hildenbrand
Cc: Frank van der Linden, David Airlie, Simona Vetter, Sumit Semwal,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
Mike Rapoport, dri-devel, devicetree, linux-tegra, linaro-mm-sig,
linux-mm
[-- Attachment #1: Type: text/plain, Size: 2264 bytes --]
On Tue, Sep 02, 2025 at 09:04:24PM +0200, David Hildenbrand wrote:
>
> > > +>> +struct cma *__init cma_create(phys_addr_t base, phys_addr_t size,
> > > + unsigned int order_per_bit, const char *name)
> > > +{
> > > + struct cma *cma;
> > > + int ret;
> > > +
> > > + ret = cma_check_memory(base, size);
> > > + if (ret < 0)
> > > + return ERR_PTR(ret);
> > > +
> > > + cma = kzalloc(sizeof(*cma), GFP_KERNEL);
> > > + if (!cma)
> > > + return ERR_PTR(-ENOMEM);
> > > +
> > > + cma_init_area(cma, name, size, order_per_bit);
> > > + cma->ranges[0].base_pfn = PFN_DOWN(base);
> > > + cma->ranges[0].early_pfn = PFN_DOWN(base);
> > > + cma->ranges[0].count = cma->count;
> > > + cma->nranges = 1;
> > > +
> > > + cma_activate_area(cma);
> > > +
> > > + return cma;
> > > +}
> > > +
> > > +void cma_free(struct cma *cma)
> > > +{
> > > + kfree(cma);
> > > +}
> > > --
> > > 2.50.0
> >
> >
> > I agree that supporting dynamic CMA areas would be good. However, by
> > doing it like this, these CMA areas are invisible to the rest of the
> > system. E.g. cma_for_each_area() does not know about them. It seems a
> > bit inconsistent that there will now be some areas that are globally
> > known, and some that are not.
>
> Yeah, I'm not a fan of that.
>
> What is the big problem we are trying to solve here? Why do they have to be
> dynamic, why do they even have to support freeing?
Freeing isn't necessarily something that I've needed. It just seemed
like there wasn't really a good reason not to support it. The current
implementation here is not sufficient, though, because we'd need to
properly undo everything that cma_activate_area() does. I think the
cleanup: block in cma_activate_area() is probably sufficient.
The problem that I'm trying to solve is that currently, depending on the
use-case the kernel configuration needs to be changed and the kernel
rebuilt in order to support it. However there doesn't seem to be a good
technical reason for that limitation. The only reason it is this way
seems to be that, well, it's always been this way.
Thierry
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas
2025-09-03 16:12 ` Thierry Reding
@ 2025-09-03 16:14 ` David Hildenbrand
0 siblings, 0 replies; 24+ messages in thread
From: David Hildenbrand @ 2025-09-03 16:14 UTC (permalink / raw)
To: Thierry Reding
Cc: Frank van der Linden, David Airlie, Simona Vetter, Sumit Semwal,
Rob Herring, Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
Mike Rapoport, dri-devel, devicetree, linux-tegra, linaro-mm-sig,
linux-mm
On 03.09.25 18:12, Thierry Reding wrote:
> On Tue, Sep 02, 2025 at 09:04:24PM +0200, David Hildenbrand wrote:
>>
>>>> +>> +struct cma *__init cma_create(phys_addr_t base, phys_addr_t size,
>>>> + unsigned int order_per_bit, const char *name)
>>>> +{
>>>> + struct cma *cma;
>>>> + int ret;
>>>> +
>>>> + ret = cma_check_memory(base, size);
>>>> + if (ret < 0)
>>>> + return ERR_PTR(ret);
>>>> +
>>>> + cma = kzalloc(sizeof(*cma), GFP_KERNEL);
>>>> + if (!cma)
>>>> + return ERR_PTR(-ENOMEM);
>>>> +
>>>> + cma_init_area(cma, name, size, order_per_bit);
>>>> + cma->ranges[0].base_pfn = PFN_DOWN(base);
>>>> + cma->ranges[0].early_pfn = PFN_DOWN(base);
>>>> + cma->ranges[0].count = cma->count;
>>>> + cma->nranges = 1;
>>>> +
>>>> + cma_activate_area(cma);
>>>> +
>>>> + return cma;
>>>> +}
>>>> +
>>>> +void cma_free(struct cma *cma)
>>>> +{
>>>> + kfree(cma);
>>>> +}
>>>> --
>>>> 2.50.0
>>>
>>>
>>> I agree that supporting dynamic CMA areas would be good. However, by
>>> doing it like this, these CMA areas are invisible to the rest of the
>>> system. E.g. cma_for_each_area() does not know about them. It seems a
>>> bit inconsistent that there will now be some areas that are globally
>>> known, and some that are not.
>>
>> Yeah, I'm not a fan of that.
>>
>> What is the big problem we are trying to solve here? Why do they have to be
>> dynamic, why do they even have to support freeing?
>
> Freeing isn't necessarily something that I've needed. It just seemed
> like there wasn't really a good reason not to support it. The current
> implementation here is not sufficient, though, because we'd need to
> properly undo everything that cma_activate_area() does. I think the
> cleanup: block in cma_activate_area() is probably sufficient.
>
> The problem that I'm trying to solve is that currently, depending on the
> use-case the kernel configuration needs to be changed and the kernel
> rebuilt in order to support it. However there doesn't seem to be a good
> technical reason for that limitation. The only reason it is this way
> seems to be that, well, it's always been this way.
Right, and we can just dynamically grow the array, keep them in a list etc.
--
Cheers
David / dhildenb
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas
2025-09-03 16:05 ` Thierry Reding
@ 2025-09-03 16:41 ` Frank van der Linden
2025-09-04 12:06 ` Thierry Reding
0 siblings, 1 reply; 24+ messages in thread
From: Frank van der Linden @ 2025-09-03 16:41 UTC (permalink / raw)
To: Thierry Reding
Cc: David Airlie, Simona Vetter, Sumit Semwal, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
On Wed, Sep 3, 2025 at 9:05 AM Thierry Reding <thierry.reding@gmail.com> wrote:
>
> On Tue, Sep 02, 2025 at 10:27:01AM -0700, Frank van der Linden wrote:
> > On Tue, Sep 2, 2025 at 8:46 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > >
> > > From: Thierry Reding <treding@nvidia.com>
> > >
> > > There is no technical reason why there should be a limited number of CMA
> > > regions, so extract some code into helpers and use them to create extra
> > > functions (cma_create() and cma_free()) that allow creating and freeing,
> > > respectively, CMA regions dynamically at runtime.
> > >
> > > Note that these dynamically created CMA areas are treated specially and
> > > do not contribute to the number of total CMA pages so that this count
> > > still only applies to the fixed number of CMA areas.
> > >
> > > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > ---
> > > include/linux/cma.h | 16 ++++++++
> > > mm/cma.c | 89 ++++++++++++++++++++++++++++++++++-----------
> > > 2 files changed, 83 insertions(+), 22 deletions(-)
> [...]
> > I agree that supporting dynamic CMA areas would be good. However, by
> > doing it like this, these CMA areas are invisible to the rest of the
> > system. E.g. cma_for_each_area() does not know about them. It seems a
> > bit inconsistent that there will now be some areas that are globally
> > known, and some that are not.
>
> That was kind of the point of this experiment. When I started on this I
> ran into the case where I was running out of predefined CMA areas and as
> I went looking for ways on how to fix this, I realized that there's not
> much reason to keep a global list of these areas. And even less reason
> to limit the number of CMA areas to this predefined list. Very little
> code outside of the core CMA code even uses this.
>
> There's one instance of cma_for_each_area() that I don't grok. There's
> another early MMU fixup for CMA areas in 32-bit ARM that. Other than
> that there's a few places where the total CMA page count is shown for
> informational purposes and I don't know how useful that really is
> because totalcma_pages doesn't really track how many pages are used for
> CMA, but pages that could potentially be used for CMA.
>
> And that's about it.
>
> It seems like there are cases where we might really need to globally
> know about some of these areas, specifically ones that are allocated
> very early during boot and then used for very specific purposes.
>
> However, it seems to me like CMA is more universally useful than just
> for these cases and I don't see the usefulness of tracking these more
> generic uses.
>
> > I am being somewhat selfish here, as I have some WIP code that needs
> > the global list :-) But I think the inconsistency is a more general
> > point than just what I want (and the s390 code does use
> > cma_for_each_area()). Maybe you could keep maintaining a global
> > structure containing all areas?
>
> If it's really useful to be able to access all CMA areas, then we could
> easily just add them all to a global linked list upon activation (we may
> still want/need to keep the predefined list around for all those early
> allocation cases). That way we'd get the best of both worlds.
>
> > What do you think are the chances of running out of the global count
> > of areas?
>
> Well, I did run out of CMA areas during the early VPR testing because I
> was initially testing with 16 areas and a different allocation scheme
> that turned out to cause too many resizes in common cases.
>
> However, given that the default is 8 on normal systems (20 on NUMA) and
> is configurable, it means that even with restricting this to 4 for VPR
> doesn't always guarantee that all 4 are available. Again, yes, we could
> keep bumping that number, but why not turn this into something a bit
> more robust where nobody has to know or care about how many there are?
>
> > Also, you say that "these are treated specially and do not contribute
> > to the number of total CMA pages". But, if I'm reading this right, you
> > do call cma_activate_area(), which will do
> > init_cma_reserved_pageblock() for each pageblock in it. Which adjusts
> > the CMA counters for the zone they are in. But your change does not
> > adjust totalcma_pages for dynamically created areas. That seems
> > inconsistent, too.
>
> I was referring to just totalcma_pages that isn't impacted by these
> dynamically allocated regions. This is, again, because I don't see why
> that information would be useful. It's a fairly easy change to update
> that value, so if people prefer that, I can add that.
>
> I don't see an immediate connection between totalcma_pages and
> init_cma_reserved_pageblock(). I thought the latter was primarily useful
> for making sure that the CMA pages can be migrated, which is still
> critical for this use-case.
My comment was about statistics, they would be inconsistent after your
change. E.g. currently, totalcma_pages is equal to the sum of CMA
pages in each zone. But that would no longer be true, and applications
/ administrators looking at those statistics might see the
inconsistency (between meminfo and vmstat) and wonder what's going on.
It seems best to keep those numbers in sync.
In general, I think it's fine to support dynamic allocation, and I
agree with your arguments that it doesn't seem right to set the number
of CMA areas via a config option. I would just like there to be a
canonical way to find all CMA areas.
- Frank
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 1/9] dt-bindings: reserved-memory: Document Tegra VPR
2025-09-02 15:46 ` [PATCH 1/9] dt-bindings: reserved-memory: Document Tegra VPR Thierry Reding
@ 2025-09-03 16:45 ` Rob Herring (Arm)
0 siblings, 0 replies; 24+ messages in thread
From: Rob Herring (Arm) @ 2025-09-03 16:45 UTC (permalink / raw)
To: Thierry Reding
Cc: T.J. Mercier, Conor Dooley, Simona Vetter, David Airlie,
Sumit Semwal, John Stultz, Andrew Morton, David Hildenbrand,
dri-devel, linux-tegra, Brian Starkey, linux-mm, Mike Rapoport,
Krzysztof Kozlowski, devicetree, Benjamin Gaignard, linaro-mm-sig
On Tue, 02 Sep 2025 17:46:21 +0200, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> The Video Protection Region (VPR) found on NVIDIA Tegra chips is a
> region of memory that is protected from CPU accesses. It is used to
> decode and play back DRM protected content.
>
> It is a standard reserved memory region that can exist in two forms:
> static VPR where the base address and size are fixed (uses the "reg"
> property to describe the memory) and a resizable VPR where only the
> size is known upfront and the OS can allocate it wherever it can be
> accomodated.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
> .../nvidia,tegra-video-protection-region.yaml | 55 +++++++++++++++++++
> 1 file changed, 55 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/reserved-memory/nvidia,tegra-video-protection-region.yaml
>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 4/9] dma-buf: heaps: Add debugfs support
2025-09-03 15:38 ` Thierry Reding
@ 2025-09-03 18:48 ` John Stultz
2025-09-04 12:04 ` Thierry Reding
0 siblings, 1 reply; 24+ messages in thread
From: John Stultz @ 2025-09-03 18:48 UTC (permalink / raw)
To: Thierry Reding
Cc: David Airlie, Simona Vetter, Sumit Semwal, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, T.J. Mercier, Andrew Morton, David Hildenbrand,
Mike Rapoport, dri-devel, devicetree, linux-tegra, linaro-mm-sig,
linux-mm
On Wed, Sep 3, 2025 at 8:38 AM Thierry Reding <thierry.reding@gmail.com> wrote:
>
> On Tue, Sep 02, 2025 at 03:37:45PM -0700, John Stultz wrote:
> > On Tue, Sep 2, 2025 at 8:46 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > >
> > > From: Thierry Reding <treding@nvidia.com>
> > >
> > > Add a callback to struct dma_heap_ops that heap providers can implement
> > > to show information about the state of the heap in debugfs. A top-level
> > > directory named "dma_heap" is created in debugfs and individual files
> > > will be named after the heaps.
> > >
> >
> > I know its debugfs, but this feels a little loosey-goosey as an uAPI.
>
> Well, the whole point of debugfs is that it's not really an ABI. Nothing
> should ever rely on the presence of these files.
>
> > Is there any expected format for the show function?
> >
> > What would other dmabuf heaps ideally export via this interface?
>
> I've thought about this a bit and I'm not sure it makes sense to
> standardize on this. I think on one hand having a list of buffers
> exported by the dma-buf heap is probably the lowest common denominator,
> but then there might be a bunch of other things that are very heap-
> specific that some heap might want to export.
>
> > Is there some consistent dma_heap-ish concept for it to justify it
> > being under a dma_heap directory, and not just an independent debugfs
> > file for the driver implementing the dmabuf heap?
>
> Well, I think just the fact that it's a dma-heap would qualify its
> corresponding debugfs to be in a well-known location. We could of course
> pick some arbitrary location, but that's just a recipe for chaos because
> then everybody puts these whereever they want. There's really no
> standard place for driver-specific debugfs files to go, so putting it
> into some "subsystem"-specific directory seems like the better option.
Ok, I guess I was thinking if the files are organizationally cohesive
to be under the dma-heap directory, they ought to have some
consistency between them.
But I can see your perspective here that organizing the driver
specific debug files in a directory helps with folks finding and
identifying it.
Thanks for clarifying!
-john
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 4/9] dma-buf: heaps: Add debugfs support
2025-09-03 18:48 ` John Stultz
@ 2025-09-04 12:04 ` Thierry Reding
0 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2025-09-04 12:04 UTC (permalink / raw)
To: John Stultz
Cc: David Airlie, Simona Vetter, Sumit Semwal, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, T.J. Mercier, Andrew Morton, David Hildenbrand,
Mike Rapoport, dri-devel, devicetree, linux-tegra, linaro-mm-sig,
linux-mm
[-- Attachment #1: Type: text/plain, Size: 2852 bytes --]
On Wed, Sep 03, 2025 at 11:48:38AM -0700, John Stultz wrote:
> On Wed, Sep 3, 2025 at 8:38 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> >
> > On Tue, Sep 02, 2025 at 03:37:45PM -0700, John Stultz wrote:
> > > On Tue, Sep 2, 2025 at 8:46 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > >
> > > > From: Thierry Reding <treding@nvidia.com>
> > > >
> > > > Add a callback to struct dma_heap_ops that heap providers can implement
> > > > to show information about the state of the heap in debugfs. A top-level
> > > > directory named "dma_heap" is created in debugfs and individual files
> > > > will be named after the heaps.
> > > >
> > >
> > > I know its debugfs, but this feels a little loosey-goosey as an uAPI.
> >
> > Well, the whole point of debugfs is that it's not really an ABI. Nothing
> > should ever rely on the presence of these files.
> >
> > > Is there any expected format for the show function?
> > >
> > > What would other dmabuf heaps ideally export via this interface?
> >
> > I've thought about this a bit and I'm not sure it makes sense to
> > standardize on this. I think on one hand having a list of buffers
> > exported by the dma-buf heap is probably the lowest common denominator,
> > but then there might be a bunch of other things that are very heap-
> > specific that some heap might want to export.
> >
> > > Is there some consistent dma_heap-ish concept for it to justify it
> > > being under a dma_heap directory, and not just an independent debugfs
> > > file for the driver implementing the dmabuf heap?
> >
> > Well, I think just the fact that it's a dma-heap would qualify its
> > corresponding debugfs to be in a well-known location. We could of course
> > pick some arbitrary location, but that's just a recipe for chaos because
> > then everybody puts these whereever they want. There's really no
> > standard place for driver-specific debugfs files to go, so putting it
> > into some "subsystem"-specific directory seems like the better option.
>
> Ok, I guess I was thinking if the files are organizationally cohesive
> to be under the dma-heap directory, they ought to have some
> consistency between them.
As far as I can tell there's not even enough information in a dma-heap
to add any common debugfs snippets. As I mentioned earlier, a list of
buffers allocated from a dma-heap is about the only generic piece of
information that I can think of, but we don't track these buffers in a
generic way. None of the existing heaps do so either seem to be
interested in this either.
It's also not like it's very useful information most of the time, it's
mainly in this driver so that it can be inspected at runtime to see what
the allocation pattern looks like at various stages and maybe help tune
the division into chunks.
Thierry
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas
2025-09-03 16:41 ` Frank van der Linden
@ 2025-09-04 12:06 ` Thierry Reding
0 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2025-09-04 12:06 UTC (permalink / raw)
To: Frank van der Linden
Cc: David Airlie, Simona Vetter, Sumit Semwal, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
[-- Attachment #1: Type: text/plain, Size: 6188 bytes --]
On Wed, Sep 03, 2025 at 09:41:18AM -0700, Frank van der Linden wrote:
> On Wed, Sep 3, 2025 at 9:05 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> >
> > On Tue, Sep 02, 2025 at 10:27:01AM -0700, Frank van der Linden wrote:
> > > On Tue, Sep 2, 2025 at 8:46 AM Thierry Reding <thierry.reding@gmail.com> wrote:
> > > >
> > > > From: Thierry Reding <treding@nvidia.com>
> > > >
> > > > There is no technical reason why there should be a limited number of CMA
> > > > regions, so extract some code into helpers and use them to create extra
> > > > functions (cma_create() and cma_free()) that allow creating and freeing,
> > > > respectively, CMA regions dynamically at runtime.
> > > >
> > > > Note that these dynamically created CMA areas are treated specially and
> > > > do not contribute to the number of total CMA pages so that this count
> > > > still only applies to the fixed number of CMA areas.
> > > >
> > > > Signed-off-by: Thierry Reding <treding@nvidia.com>
> > > > ---
> > > > include/linux/cma.h | 16 ++++++++
> > > > mm/cma.c | 89 ++++++++++++++++++++++++++++++++++-----------
> > > > 2 files changed, 83 insertions(+), 22 deletions(-)
> > [...]
> > > I agree that supporting dynamic CMA areas would be good. However, by
> > > doing it like this, these CMA areas are invisible to the rest of the
> > > system. E.g. cma_for_each_area() does not know about them. It seems a
> > > bit inconsistent that there will now be some areas that are globally
> > > known, and some that are not.
> >
> > That was kind of the point of this experiment. When I started on this I
> > ran into the case where I was running out of predefined CMA areas and as
> > I went looking for ways on how to fix this, I realized that there's not
> > much reason to keep a global list of these areas. And even less reason
> > to limit the number of CMA areas to this predefined list. Very little
> > code outside of the core CMA code even uses this.
> >
> > There's one instance of cma_for_each_area() that I don't grok. There's
> > another early MMU fixup for CMA areas in 32-bit ARM that. Other than
> > that there's a few places where the total CMA page count is shown for
> > informational purposes and I don't know how useful that really is
> > because totalcma_pages doesn't really track how many pages are used for
> > CMA, but pages that could potentially be used for CMA.
> >
> > And that's about it.
> >
> > It seems like there are cases where we might really need to globally
> > know about some of these areas, specifically ones that are allocated
> > very early during boot and then used for very specific purposes.
> >
> > However, it seems to me like CMA is more universally useful than just
> > for these cases and I don't see the usefulness of tracking these more
> > generic uses.
> >
> > > I am being somewhat selfish here, as I have some WIP code that needs
> > > the global list :-) But I think the inconsistency is a more general
> > > point than just what I want (and the s390 code does use
> > > cma_for_each_area()). Maybe you could keep maintaining a global
> > > structure containing all areas?
> >
> > If it's really useful to be able to access all CMA areas, then we could
> > easily just add them all to a global linked list upon activation (we may
> > still want/need to keep the predefined list around for all those early
> > allocation cases). That way we'd get the best of both worlds.
> >
> > > What do you think are the chances of running out of the global count
> > > of areas?
> >
> > Well, I did run out of CMA areas during the early VPR testing because I
> > was initially testing with 16 areas and a different allocation scheme
> > that turned out to cause too many resizes in common cases.
> >
> > However, given that the default is 8 on normal systems (20 on NUMA) and
> > is configurable, it means that even with restricting this to 4 for VPR
> > doesn't always guarantee that all 4 are available. Again, yes, we could
> > keep bumping that number, but why not turn this into something a bit
> > more robust where nobody has to know or care about how many there are?
> >
> > > Also, you say that "these are treated specially and do not contribute
> > > to the number of total CMA pages". But, if I'm reading this right, you
> > > do call cma_activate_area(), which will do
> > > init_cma_reserved_pageblock() for each pageblock in it. Which adjusts
> > > the CMA counters for the zone they are in. But your change does not
> > > adjust totalcma_pages for dynamically created areas. That seems
> > > inconsistent, too.
> >
> > I was referring to just totalcma_pages that isn't impacted by these
> > dynamically allocated regions. This is, again, because I don't see why
> > that information would be useful. It's a fairly easy change to update
> > that value, so if people prefer that, I can add that.
> >
> > I don't see an immediate connection between totalcma_pages and
> > init_cma_reserved_pageblock(). I thought the latter was primarily useful
> > for making sure that the CMA pages can be migrated, which is still
> > critical for this use-case.
>
> My comment was about statistics, they would be inconsistent after your
> change. E.g. currently, totalcma_pages is equal to the sum of CMA
> pages in each zone. But that would no longer be true, and applications
> / administrators looking at those statistics might see the
> inconsistency (between meminfo and vmstat) and wonder what's going on.
> It seems best to keep those numbers in sync.
>
> In general, I think it's fine to support dynamic allocation, and I
> agree with your arguments that it doesn't seem right to set the number
> of CMA areas via a config option. I would just like there to be a
> canonical way to find all CMA areas.
Okay, so judging by your and David's feedback, it sounds like I should
add a bit of code to track dynamically allocated areas within a global
list, along with the existing predefined regions and keep totalcma_pages
updated so that the global view is consistent.
I'll look into that. Thanks for the feedback.
Thierry
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH 6/9] arm64: tegra: Add VPR placeholder node on Tegra234
2025-09-02 15:46 ` [PATCH 6/9] arm64: tegra: Add VPR placeholder node on Tegra234 Thierry Reding
@ 2025-09-04 15:30 ` Thierry Reding
0 siblings, 0 replies; 24+ messages in thread
From: Thierry Reding @ 2025-09-04 15:30 UTC (permalink / raw)
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley
Cc: David Airlie, Simona Vetter, Sumit Semwal, Benjamin Gaignard,
Brian Starkey, John Stultz, T.J. Mercier, Andrew Morton,
David Hildenbrand, Mike Rapoport, dri-devel, devicetree,
linux-tegra, linaro-mm-sig, linux-mm
[-- Attachment #1: Type: text/plain, Size: 3277 bytes --]
On Tue, Sep 02, 2025 at 05:46:26PM +0200, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
>
> This node contains two sets of properties, one for the case where the
> VPR is resizable (in which case the VPR region will be dynamically
> allocated at boot time) and another case where the VPR is fixed in size
> and initialized by early firmware.
>
> The firmware running on the device is responsible for updating the node
> with the real physical address for the fixed VPR case and remove the
> properties needed only for resizable VPR. Similarly, if the VPR is
> resizable, the firmware should remove the "reg" property since it is no
> longer needed.
>
> Signed-off-by: Thierry Reding <treding@nvidia.com>
> ---
> arch/arm64/boot/dts/nvidia/tegra234.dtsi | 34 ++++++++++++++++++++++++
> 1 file changed, 34 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
> index df034dbb8285..4d572f5fa0b1 100644
> --- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
> +++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
> @@ -28,6 +28,40 @@ aliases {
> i2c8 = &dp_aux_ch3_i2c;
> };
>
> + reserved-memory {
> + #address-cells = <2>;
> + #size-cells = <2>;
> + ranges;
> +
> + vpr: video-protection-region@0 {
> + compatible = "nvidia,tegra-video-protection-region";
> + status = "disabled";
> + no-map;
> +
> + /*
> + * Two variants exist for this. For fixed VPR, the
> + * firmware is supposed to update the "reg" property
> + * with the fixed memory region configured as VPR.
> + *
> + * For resizable VPR we don't care about the exact
> + * address and instead want a reserved region to be
> + * allocated with a certain size and alignment at
> + * boot time.
> + *
> + * The firmware is responsible for removing the
> + * unused set of properties.
> + */
> +
> + /* fixed VPR */
> + reg = <0x0 0x0 0x0 0x0>;
> +
> + /* resizable VPR */
> + size = <0x0 0x70000000>;
> + alignment = <0x0 0x100000>;
> + reusable;
> + };
> + };
Hi DT maintainers,
I wanted to get some feedback on this type of placeholder DT node. This
doesn't actually validate properly because it contains properties for
both the fixed and resizable VPR variants, which are mutually exclusive.
However, the way that this currently works is that UEFI will remove and
update whatever properties need to change during boot, so the booted
kernel ends up with the correct, non-conflicting information.
The reason why it was done this way is because it simplifies the code in
UEFI to update this node. Also, without this being a placeholder I don't
know what to put into this. There's no "default" for this. One option is
to not have this in the DT at all and completely create it at boot time,
but then it becomes quite difficult to create the phandle references.
While at it, I'm not sure if I properly understand how to correctly name
a reserved-memory region that is dynamically allocated like in the case
of resizable VPR? It doesn't have a base address during boot and the
kernel will allocate memory where it sees fit. Do I just leave out the
unit-address in that case?
Thanks,
Thierry
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2025-09-04 15:31 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-02 15:46 [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support Thierry Reding
2025-09-02 15:46 ` [PATCH 1/9] dt-bindings: reserved-memory: Document Tegra VPR Thierry Reding
2025-09-03 16:45 ` Rob Herring (Arm)
2025-09-02 15:46 ` [PATCH 2/9] dt-bindings: display: tegra: Document memory regions Thierry Reding
2025-09-02 15:46 ` [PATCH 3/9] mm/cma: Allow dynamically creating CMA areas Thierry Reding
2025-09-02 17:27 ` Frank van der Linden
2025-09-02 19:04 ` David Hildenbrand
2025-09-03 16:12 ` Thierry Reding
2025-09-03 16:14 ` David Hildenbrand
2025-09-03 16:05 ` Thierry Reding
2025-09-03 16:41 ` Frank van der Linden
2025-09-04 12:06 ` Thierry Reding
2025-09-02 15:46 ` [PATCH 4/9] dma-buf: heaps: Add debugfs support Thierry Reding
2025-09-02 22:37 ` John Stultz
2025-09-03 15:38 ` Thierry Reding
2025-09-03 18:48 ` John Stultz
2025-09-04 12:04 ` Thierry Reding
2025-09-02 15:46 ` [PATCH 5/9] dma-buf: heaps: Add support for Tegra VPR Thierry Reding
2025-09-02 15:46 ` [PATCH 6/9] arm64: tegra: Add VPR placeholder node on Tegra234 Thierry Reding
2025-09-04 15:30 ` Thierry Reding
2025-09-02 15:46 ` [PATCH 7/9] arm64: tegra: Add GPU " Thierry Reding
2025-09-02 15:46 ` [PATCH 8/9] arm64: tegra: Hook up VPR to host1x Thierry Reding
2025-09-02 15:46 ` [PATCH 9/9] arm64: tegra: Hook up VPR to the GPU Thierry Reding
2025-09-03 11:54 ` [PATCH 0/9] dma-buf: heaps: Add Tegra VPR support David Hildenbrand
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).