devicetree.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 0/6] Add support for Verisilicon IOMMU used by media codec blocks
@ 2025-08-25 15:34 Benjamin Gaignard
  2025-08-25 15:34 ` [PATCH v7 1/6] dt-bindings: vendor-prefixes: Add Verisilicon Benjamin Gaignard
                   ` (5 more replies)
  0 siblings, 6 replies; 17+ messages in thread
From: Benjamin Gaignard @ 2025-08-25 15:34 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko,
	nicolas.dufresne, jgg, p.zabel, mchehab
  Cc: iommu, devicetree, linux-kernel, linux-arm-kernel, linux-rockchip,
	kernel, linux-media, Benjamin Gaignard

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=y, Size: 3632 bytes --]

Hi all,

This patch series adds support for the Verisilicon IOMMU, which is found in front
of hardware encoder and decoder blocks in several SoCs using Verisilicon IP. 
A first implementation of this IOMMU is available on the Rockchip RK3588 SoC.

Rockchip provides a driver for this hardware in their 6.1 kernel branch:
https://github.com/rockchip-linux/kernel/blob/develop-6.1/drivers/iommu/rockchip-iommu-av1d.c

This series includes:
- a new binding for the Verisilicon IOMMU
- a driver implementation
- DT updates for RK3588

The driver was forward-ported from Rockchip’s 6.1 implementation, 
the prefix was renamed to vsi for generality, and several fixes were applied.

AV1 decoding was tested using the stateless VPU driver and Fluster.
The test results show a score of 205/239, which confirms that no regressions
were introduced by this series.

Feedback and testing welcome.

changes in version 7:
- fix locking issues.
- add a patch in AV1 video decoder to manage per context iommu domain.
- fix compilation issues when build as module.
- remove useless "rockchip,rk3588-av1-iommu" compatible in driver code.

changes in version 6:
- rework lock schema in vsi_iommu_attach_device() so
  it protected against concurrent invalidation.
- flush the cache after changing of domain.

changes in version 5:
- change locking schema to use 2 spin_locks: one to protect vsi_domain
  data and one to protect vsi_iommu structure.
- make suspend/resume more robust by calling disable/enable function.
- rebased on top of v6.16-rc5

changes in version 4:
- rename and reorder compatibles fields.
- Kconfig dependencies
- Fix the remarks done by Jason and Robin: locking, clocks, macros
  probing, pm_runtime, atomic allocation.

changes in version 3:
- Change compatible to "rockchip,rk3588-iommu-1.2"
- Fix compatible in .yaml
- Update DT and driver to use "rockchip,rk3588-iommu-1.2" compatible
- Set CONFIG_VSI_IOMMU as module in defconfig
- Create an identity domain for the driver
- Fix double flush issue
- Rework attach/detach logic
- Simplify xlate function
- Discover iommu device like done in ARM driver
- Remove ARM_DMA_USE_IOMMU from Kconfig

changes in version 2:
- Add a compatible "rockchip,rk3588-av1-iommu"
- Fix clock-names in binding 
- Remove "vsi_mmu" label in binding example.
- Rework driver probe function
- Remove double flush
- Rework driver internal structures and avoid allocate
  in xlate function.
- Do not touch to VPU driver anymore (path removed)
- Add a patch to enable the driver in arm64 defconfig
 
Benjamin Gaignard (6):
  dt-bindings: vendor-prefixes: Add Verisilicon
  dt-bindings: iommu: verisilicon: Add binding for VSI IOMMU
  iommu: Add verisilicon IOMMU driver
  media: verisilicon: AV1: Restore IOMMU context before decoding a frame
  arm64: dts: rockchip: Add verisilicon IOMMU node on RK3588
  arm64: defconfig: enable Verisilicon IOMMU

 .../bindings/iommu/verisilicon,iommu.yaml     |  71 ++
 .../devicetree/bindings/vendor-prefixes.yaml  |   2 +
 arch/arm64/boot/dts/rockchip/rk3588-base.dtsi |  11 +
 arch/arm64/configs/defconfig                  |   1 +
 drivers/iommu/Kconfig                         |  11 +
 drivers/iommu/Makefile                        |   1 +
 drivers/iommu/vsi-iommu.c                     | 779 ++++++++++++++++++
 drivers/media/platform/verisilicon/hantro.h   |   5 +
 .../media/platform/verisilicon/hantro_drv.c   |  11 +
 .../verisilicon/rockchip_vpu981_hw_av1_dec.c  |  10 +
 10 files changed, 902 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/verisilicon,iommu.yaml
 create mode 100644 drivers/iommu/vsi-iommu.c

-- 
2.43.0


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v7 1/6] dt-bindings: vendor-prefixes: Add Verisilicon
  2025-08-25 15:34 [PATCH v7 0/6] Add support for Verisilicon IOMMU used by media codec blocks Benjamin Gaignard
@ 2025-08-25 15:34 ` Benjamin Gaignard
  2025-08-25 15:34 ` [PATCH v7 2/6] dt-bindings: iommu: verisilicon: Add binding for VSI IOMMU Benjamin Gaignard
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Benjamin Gaignard @ 2025-08-25 15:34 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko,
	nicolas.dufresne, jgg, p.zabel, mchehab
  Cc: iommu, devicetree, linux-kernel, linux-arm-kernel, linux-rockchip,
	kernel, linux-media, Benjamin Gaignard, Conor Dooley

Verisilicon Microelectronics is a company based in Shanghai, China,
developping hardware blocks for SoC.

https://verisilicon.com/

Add their name to the list of vendors.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
---
 Documentation/devicetree/bindings/vendor-prefixes.yaml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/Documentation/devicetree/bindings/vendor-prefixes.yaml b/Documentation/devicetree/bindings/vendor-prefixes.yaml
index 77160cd47f54..ef71164ab0ec 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.yaml
+++ b/Documentation/devicetree/bindings/vendor-prefixes.yaml
@@ -1654,6 +1654,8 @@ patternProperties:
     description: Variscite Ltd.
   "^vdl,.*":
     description: Van der Laan b.v.
+  "^verisilicon,.*":
+    description: VeriSilicon Microelectronics
   "^vertexcom,.*":
     description: Vertexcom Technologies, Inc.
   "^via,.*":
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v7 2/6] dt-bindings: iommu: verisilicon: Add binding for VSI IOMMU
  2025-08-25 15:34 [PATCH v7 0/6] Add support for Verisilicon IOMMU used by media codec blocks Benjamin Gaignard
  2025-08-25 15:34 ` [PATCH v7 1/6] dt-bindings: vendor-prefixes: Add Verisilicon Benjamin Gaignard
@ 2025-08-25 15:34 ` Benjamin Gaignard
  2025-08-25 15:34 ` [PATCH v7 3/6] iommu: Add verisilicon IOMMU driver Benjamin Gaignard
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Benjamin Gaignard @ 2025-08-25 15:34 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko,
	nicolas.dufresne, jgg, p.zabel, mchehab
  Cc: iommu, devicetree, linux-kernel, linux-arm-kernel, linux-rockchip,
	kernel, linux-media, Benjamin Gaignard, Conor Dooley

Add a device tree binding for the Verisilicon (VSI) IOMMU.
This IOMMU sits in front of hardware encoder and decoder
blocks on SoCs using Verisilicon IP, such as the Rockchip RK3588.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
---
 .../bindings/iommu/verisilicon,iommu.yaml     | 71 +++++++++++++++++++
 1 file changed, 71 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/verisilicon,iommu.yaml

diff --git a/Documentation/devicetree/bindings/iommu/verisilicon,iommu.yaml b/Documentation/devicetree/bindings/iommu/verisilicon,iommu.yaml
new file mode 100644
index 000000000000..d3ce9e603b61
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/verisilicon,iommu.yaml
@@ -0,0 +1,71 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/iommu/verisilicon,iommu.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Verisilicon IOMMU
+
+maintainers:
+  - Benjamin Gaignard <benjamin.gaignard@collabora.com>
+
+description: |+
+  A Versilicon iommu translates io virtual addresses to physical addresses for
+  its associated video decoder.
+
+properties:
+  compatible:
+    items:
+      - const: rockchip,rk3588-av1-iommu
+      - const: verisilicon,iommu-1.2
+
+  reg:
+    maxItems: 1
+
+  interrupts:
+    maxItems: 1
+
+  clocks:
+    items:
+      - description: Core clock
+      - description: Interface clock
+
+  clock-names:
+    items:
+      - const: core
+      - const: iface
+
+  "#iommu-cells":
+    const: 0
+
+  power-domains:
+    maxItems: 1
+
+required:
+  - compatible
+  - reg
+  - interrupts
+  - clocks
+  - clock-names
+  - "#iommu-cells"
+
+additionalProperties: false
+
+examples:
+  - |
+    #include <dt-bindings/clock/rockchip,rk3588-cru.h>
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+
+    bus {
+      #address-cells = <2>;
+      #size-cells = <2>;
+
+      iommu@fdca0000 {
+        compatible = "rockchip,rk3588-av1-iommu","verisilicon,iommu-1.2";
+        reg = <0x0 0xfdca0000 0x0 0x600>;
+        interrupts = <GIC_SPI 109 IRQ_TYPE_LEVEL_HIGH 0>;
+        clocks = <&cru ACLK_AV1>, <&cru PCLK_AV1>;
+        clock-names = "core", "iface";
+        #iommu-cells = <0>;
+      };
+    };
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v7 3/6] iommu: Add verisilicon IOMMU driver
  2025-08-25 15:34 [PATCH v7 0/6] Add support for Verisilicon IOMMU used by media codec blocks Benjamin Gaignard
  2025-08-25 15:34 ` [PATCH v7 1/6] dt-bindings: vendor-prefixes: Add Verisilicon Benjamin Gaignard
  2025-08-25 15:34 ` [PATCH v7 2/6] dt-bindings: iommu: verisilicon: Add binding for VSI IOMMU Benjamin Gaignard
@ 2025-08-25 15:34 ` Benjamin Gaignard
  2025-08-25 15:34 ` [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame Benjamin Gaignard
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 17+ messages in thread
From: Benjamin Gaignard @ 2025-08-25 15:34 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko,
	nicolas.dufresne, jgg, p.zabel, mchehab
  Cc: iommu, devicetree, linux-kernel, linux-arm-kernel, linux-rockchip,
	kernel, linux-media, Benjamin Gaignard

The Verisilicon IOMMU hardware block can be found in combination
with Verisilicon hardware video codecs (encoders or decoders) on
different SoCs.
Enable it will allow us to use non contiguous memory allocators
for Verisilicon video codecs.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
changes in version 7:
- fix locking issues, always lock domain before iommu lock
- fix compilation issues when build as module.
- remove useless "rockchip,rk3588-av1-iommu" compatible in driver code.

 drivers/iommu/Kconfig     |  11 +
 drivers/iommu/Makefile    |   1 +
 drivers/iommu/vsi-iommu.c | 779 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 791 insertions(+)
 create mode 100644 drivers/iommu/vsi-iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 70d29b14d851..d3731be630a2 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -383,4 +383,15 @@ config SPRD_IOMMU
 
 	  Say Y here if you want to use the multimedia devices listed above.
 
+config VSI_IOMMU
+	tristate "Verisilicon IOMMU Support"
+	depends on (ARCH_ROCKCHIP && ARM64) || COMPILE_TEST
+	select IOMMU_API
+	help
+	  Support for IOMMUs used by Verisilicon sub-systems like video
+	  decoders or encoder hardware blocks.
+
+	  Say Y here if you want to use this IOMMU in front of these
+	  hardware blocks.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 355294fa9033..68aeff31af8b 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -34,3 +34,4 @@ obj-$(CONFIG_IOMMU_SVA) += iommu-sva.o
 obj-$(CONFIG_IOMMU_IOPF) += io-pgfault.o
 obj-$(CONFIG_SPRD_IOMMU) += sprd-iommu.o
 obj-$(CONFIG_APPLE_DART) += apple-dart.o
+obj-$(CONFIG_VSI_IOMMU) += vsi-iommu.o
diff --git a/drivers/iommu/vsi-iommu.c b/drivers/iommu/vsi-iommu.c
new file mode 100644
index 000000000000..69b5fcb910ef
--- /dev/null
+++ b/drivers/iommu/vsi-iommu.c
@@ -0,0 +1,779 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (C) 2025 Collabora Ltd.
+ *
+ * IOMMU API for Verisilicon
+ *
+ * Module Authors:	Yandong Lin <yandong.lin@rock-chips.com>
+ *			Simon Xue <xxm@rock-chips.com>
+ *			Benjamin Gaignard <benjamin.gaignard@collabora.com>
+ */
+
+#include <linux/clk.h>
+#include <linux/compiler.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/dma-mapping.h>
+#include <linux/errno.h>
+#include <linux/init.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/iommu.h>
+#include <linux/list.h>
+#include <linux/mm.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_iommu.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+
+#include "iommu-pages.h"
+
+struct vsi_iommu {
+	struct device *dev;
+	void __iomem *regs;
+	struct clk_bulk_data *clocks;
+	int num_clocks;
+	struct iommu_device iommu;
+	struct list_head node; /* entry in vsi_iommu_domain.iommus */
+	struct iommu_domain *domain; /* domain to which iommu is attached */
+	spinlock_t lock;
+	int irq;
+};
+
+struct vsi_iommu_domain {
+	struct list_head iommus;
+	struct device *dev;
+	u32 *dt;
+	dma_addr_t dt_dma;
+	struct iommu_domain domain;
+	u64 *pta;
+	dma_addr_t pta_dma;
+	spinlock_t lock;
+};
+
+static struct iommu_domain vsi_identity_domain;
+
+#define NUM_DT_ENTRIES	1024
+#define NUM_PT_ENTRIES	1024
+#define PT_SIZE		(NUM_PT_ENTRIES * sizeof(u32))
+
+#define SPAGE_SIZE	BIT(12)
+
+/* vsi iommu regs address */
+#define VSI_MMU_CONFIG1_BASE			0x1ac
+#define VSI_MMU_AHB_EXCEPTION_BASE		0x380
+#define VSI_MMU_AHB_CONTROL_BASE		0x388
+#define VSI_MMU_AHB_TLB_ARRAY_BASE_L_BASE	0x38C
+
+/* MMU register offsets */
+#define VSI_MMU_FLUSH_BASE		0x184
+#define VSI_MMU_BIT_FLUSH		BIT(4)
+
+#define VSI_MMU_PAGE_FAULT_ADDR		0x380
+#define VSI_MMU_STATUS_BASE		0x384	/* IRQ status */
+
+#define VSI_MMU_BIT_ENABLE		BIT(0)
+
+#define VSI_MMU_OUT_OF_BOUND		BIT(28)
+/* Irq mask */
+#define VSI_MMU_IRQ_MASK		0x7
+
+#define VSI_DTE_PT_ADDRESS_MASK		0xffffffc0
+#define VSI_DTE_PT_VALID		BIT(0)
+
+#define VSI_PAGE_DESC_LO_MASK		0xfffff000
+#define VSI_PAGE_DESC_HI_MASK		GENMASK_ULL(39, 32)
+#define VSI_PAGE_DESC_HI_SHIFT		(32 - 4)
+
+static inline phys_addr_t vsi_dte_pt_address(u32 dte)
+{
+	return (phys_addr_t)dte & VSI_DTE_PT_ADDRESS_MASK;
+}
+
+static inline u32 vsi_mk_dte(u32 dte)
+{
+	return (phys_addr_t)dte | VSI_DTE_PT_VALID;
+}
+
+#define VSI_PTE_PAGE_WRITABLE		BIT(2)
+#define VSI_PTE_PAGE_VALID		BIT(0)
+
+static inline phys_addr_t vsi_pte_page_address(u64 pte)
+{
+	return pte << VSI_PAGE_DESC_HI_SHIFT;
+}
+
+static u32 vsi_mk_pte(phys_addr_t page, int prot)
+{
+	u32 flags = 0;
+
+	flags |= (prot & IOMMU_WRITE) ? VSI_PTE_PAGE_WRITABLE : 0;
+	page = (page & VSI_PAGE_DESC_LO_MASK) |
+	       ((page & VSI_PAGE_DESC_HI_MASK) >> VSI_PAGE_DESC_HI_SHIFT);
+
+	return page | flags | VSI_PTE_PAGE_VALID;
+}
+
+#define VSI_DTE_PT_VALID	BIT(0)
+
+static inline bool vsi_dte_is_pt_valid(u32 dte)
+{
+	return dte & VSI_DTE_PT_VALID;
+}
+
+static inline bool vsi_pte_is_page_valid(u32 pte)
+{
+	return pte & VSI_PTE_PAGE_VALID;
+}
+
+static u32 vsi_mk_pte_invalid(u32 pte)
+{
+	return pte & ~VSI_PTE_PAGE_VALID;
+}
+
+#define VSI_MASTER_TLB_MASK	GENMASK_ULL(31, 10)
+/* mode 0 : 4k */
+#define VSI_PTA_4K_MODE	0
+
+static u64 vsi_mk_pta(dma_addr_t dt_dma)
+{
+	u64 val = (dt_dma & VSI_MASTER_TLB_MASK) | VSI_PTA_4K_MODE;
+
+	return val;
+}
+
+static struct vsi_iommu_domain *to_vsi_domain(struct iommu_domain *dom)
+{
+	return container_of(dom, struct vsi_iommu_domain, domain);
+}
+
+static inline void vsi_table_flush(struct vsi_iommu_domain *vsi_domain, dma_addr_t dma,
+				   unsigned int count)
+{
+	size_t size = count * sizeof(u32); /* count of u32 entry */
+
+	dma_sync_single_for_device(vsi_domain->dev, dma, size, DMA_TO_DEVICE);
+}
+
+#define VSI_IOVA_DTE_MASK	0xffc00000
+#define VSI_IOVA_DTE_SHIFT	22
+#define VSI_IOVA_PTE_MASK	0x003ff000
+#define VSI_IOVA_PTE_SHIFT	12
+#define VSI_IOVA_PAGE_MASK	0x00000fff
+#define VSI_IOVA_PAGE_SHIFT	0
+
+static u32 vsi_iova_dte_index(u32 iova)
+{
+	return (iova & VSI_IOVA_DTE_MASK) >> VSI_IOVA_DTE_SHIFT;
+}
+
+static u32 vsi_iova_pte_index(u32 iova)
+{
+	return (iova & VSI_IOVA_PTE_MASK) >> VSI_IOVA_PTE_SHIFT;
+}
+
+static u32 vsi_iova_page_offset(u32 iova)
+{
+	return (iova & VSI_IOVA_PAGE_MASK) >> VSI_IOVA_PAGE_SHIFT;
+}
+
+static void vsi_iommu_flush_tlb_all(struct iommu_domain *domain)
+{
+	struct vsi_iommu_domain *vsi_domain = to_vsi_domain(domain);
+	struct list_head *pos;
+	unsigned long flags;
+
+	spin_lock_irqsave(&vsi_domain->lock, flags);
+
+	list_for_each(pos, &vsi_domain->iommus) {
+		struct vsi_iommu *iommu;
+		int ret;
+
+		iommu = list_entry(pos, struct vsi_iommu, node);
+		ret = pm_runtime_resume_and_get(iommu->dev);
+		if (ret < 0)
+			continue;
+
+		spin_lock(&iommu->lock);
+
+		writel(VSI_MMU_BIT_FLUSH, iommu->regs + VSI_MMU_FLUSH_BASE);
+		writel(0, iommu->regs + VSI_MMU_FLUSH_BASE);
+
+		spin_unlock(&iommu->lock);
+		pm_runtime_put_autosuspend(iommu->dev);
+	}
+
+	spin_unlock_irqrestore(&vsi_domain->lock, flags);
+
+}
+
+static irqreturn_t vsi_iommu_irq(int irq, void *dev_id)
+{
+	struct vsi_iommu *iommu = dev_id;
+	unsigned long flags;
+	dma_addr_t iova;
+	u32 status;
+
+	if (pm_runtime_resume_and_get(iommu->dev) < 0)
+		return IRQ_NONE;
+
+	spin_lock_irqsave(&iommu->lock, flags);
+
+	status = readl(iommu->regs + VSI_MMU_STATUS_BASE);
+	if (status & VSI_MMU_IRQ_MASK) {
+		dev_err(iommu->dev, "unexpected int_status=%08x\n", status);
+		iova = readl(iommu->regs + VSI_MMU_PAGE_FAULT_ADDR);
+		report_iommu_fault(iommu->domain, iommu->dev, iova, status);
+	}
+	writel(0, iommu->regs + VSI_MMU_STATUS_BASE);
+
+	spin_unlock_irqrestore(&iommu->lock, flags);
+	pm_runtime_put_autosuspend(iommu->dev);
+
+	return IRQ_HANDLED;
+}
+
+static struct vsi_iommu *vsi_iommu_get_from_dev(struct device *dev)
+{
+	struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+	struct device *iommu_dev = bus_find_device_by_fwnode(&platform_bus_type,
+							     fwspec->iommu_fwnode);
+
+	put_device(iommu_dev);
+
+	return iommu_dev ? dev_get_drvdata(iommu_dev) : NULL;
+}
+
+static struct iommu_domain *vsi_iommu_domain_alloc_paging(struct device *dev)
+{
+	struct vsi_iommu *iommu = dev_iommu_priv_get(dev);
+	struct vsi_iommu_domain *vsi_domain;
+
+	vsi_domain = kzalloc(sizeof(*vsi_domain), GFP_KERNEL);
+	if (!vsi_domain)
+		return NULL;
+
+	vsi_domain->dev = iommu->dev;
+	spin_lock_init(&vsi_domain->lock);
+
+	/*
+	 * iommu use a 2 level pagetable.
+	 * Each level1 (dt) and level2 (pt) table has 1024 4-byte entries.
+	 * Allocate one 4 KiB page for each table.
+	 */
+	vsi_domain->dt = iommu_alloc_pages_sz(GFP_KERNEL | GFP_DMA32,
+					      SPAGE_SIZE);
+	if (!vsi_domain->dt)
+		goto err_free_domain;
+
+	vsi_domain->dt_dma = dma_map_single(vsi_domain->dev, vsi_domain->dt,
+					    SPAGE_SIZE, DMA_TO_DEVICE);
+	if (dma_mapping_error(vsi_domain->dev, vsi_domain->dt_dma)) {
+		dev_err(dev, "DMA map error for DT\n");
+		goto err_free_dt;
+	}
+
+	vsi_domain->pta = iommu_alloc_pages_sz(GFP_KERNEL | GFP_DMA32,
+					       SPAGE_SIZE);
+	if (!vsi_domain->pta)
+		goto err_unmap_dt;
+
+	vsi_domain->pta[0] = vsi_mk_pta(vsi_domain->dt_dma);
+	vsi_domain->pta_dma = dma_map_single(vsi_domain->dev, vsi_domain->pta,
+					     SPAGE_SIZE, DMA_TO_DEVICE);
+	if (dma_mapping_error(vsi_domain->dev, vsi_domain->pta_dma)) {
+		dev_err(dev, "DMA map error for PTA\n");
+		goto err_free_pta;
+	}
+
+	INIT_LIST_HEAD(&vsi_domain->iommus);
+
+	vsi_domain->domain.geometry.aperture_start = 0;
+	vsi_domain->domain.geometry.aperture_end   = DMA_BIT_MASK(32);
+	vsi_domain->domain.geometry.force_aperture = true;
+	vsi_domain->domain.pgsize_bitmap	   = SZ_4K;
+
+	return &vsi_domain->domain;
+
+err_free_pta:
+	iommu_free_pages(vsi_domain->pta);
+err_unmap_dt:
+	dma_unmap_single(vsi_domain->dev, vsi_domain->dt_dma,
+			 SPAGE_SIZE, DMA_TO_DEVICE);
+err_free_dt:
+	iommu_free_pages(vsi_domain->dt);
+err_free_domain:
+	kfree(vsi_domain);
+
+	return NULL;
+}
+
+static phys_addr_t vsi_iommu_iova_to_phys(struct iommu_domain *domain,
+					  dma_addr_t iova)
+{
+	struct vsi_iommu_domain *vsi_domain = to_vsi_domain(domain);
+	phys_addr_t pt_phys, phys = 0;
+	unsigned long flags;
+	u32 dte, pte;
+	u32 *page_table;
+
+	spin_lock_irqsave(&vsi_domain->lock, flags);
+	dte = vsi_domain->dt[vsi_iova_dte_index(iova)];
+	if (!vsi_dte_is_pt_valid(dte))
+		goto unlock;
+
+	pt_phys = vsi_dte_pt_address(dte);
+	page_table = (u32 *)phys_to_virt(pt_phys);
+	pte = page_table[vsi_iova_pte_index(iova)];
+	if (!vsi_pte_is_page_valid(pte))
+		goto unlock;
+
+	phys = vsi_pte_page_address(pte) + vsi_iova_page_offset(iova);
+
+unlock:
+	spin_unlock_irqrestore(&vsi_domain->lock, flags);
+	return phys;
+}
+
+static size_t vsi_iommu_unmap_iova(struct vsi_iommu_domain *vsi_domain,
+				   u32 *pte_addr, dma_addr_t pte_dma,
+				   size_t size)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+
+	for (pte_count = 0;
+	     pte_count < pte_total && pte_count < NUM_PT_ENTRIES; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+
+		if (!vsi_pte_is_page_valid(pte))
+			break;
+
+		pte_addr[pte_count] = vsi_mk_pte_invalid(pte);
+	}
+
+	vsi_table_flush(vsi_domain, pte_dma, pte_count);
+
+	return pte_count * SPAGE_SIZE;
+}
+
+static int vsi_iommu_map_iova(struct vsi_iommu_domain *vsi_domain, u32 *pte_addr,
+			      dma_addr_t pte_dma, dma_addr_t iova,
+			      phys_addr_t paddr, size_t size, int prot)
+{
+	unsigned int pte_count;
+	unsigned int pte_total = size / SPAGE_SIZE;
+
+	for (pte_count = 0;
+	     pte_count < pte_total && pte_count < NUM_PT_ENTRIES; pte_count++) {
+		u32 pte = pte_addr[pte_count];
+
+		if (vsi_pte_is_page_valid(pte))
+			return (pte_count - 1) * SPAGE_SIZE;
+
+		pte_addr[pte_count] = vsi_mk_pte(paddr, prot);
+
+		paddr += SPAGE_SIZE;
+	}
+
+	vsi_table_flush(vsi_domain, pte_dma, pte_total);
+
+	return 0;
+}
+
+static size_t vsi_iommu_unmap(struct iommu_domain *domain, unsigned long _iova,
+			      size_t size, size_t count, struct iommu_iotlb_gather *gather)
+{
+	struct vsi_iommu_domain *vsi_domain = to_vsi_domain(domain);
+	dma_addr_t pte_dma, iova = (dma_addr_t)_iova;
+	unsigned long flags;
+	phys_addr_t pt_phys;
+	u32 dte;
+	u32 *pte_addr;
+	size_t unmap_size = 0;
+
+	spin_lock_irqsave(&vsi_domain->lock, flags);
+
+	dte = vsi_domain->dt[vsi_iova_dte_index(iova)];
+	/* Just return 0 if iova is unmapped */
+	if (!vsi_dte_is_pt_valid(dte))
+		goto unlock;
+
+	pt_phys = vsi_dte_pt_address(dte);
+	pte_addr = (u32 *)phys_to_virt(pt_phys) + vsi_iova_pte_index(iova);
+	pte_dma = pt_phys + vsi_iova_pte_index(iova) * sizeof(u32);
+	unmap_size = vsi_iommu_unmap_iova(vsi_domain, pte_addr, pte_dma, size);
+
+unlock:
+	spin_unlock_irqrestore(&vsi_domain->lock, flags);
+
+	return unmap_size;
+}
+
+static u32 *vsi_dte_get_page_table(struct vsi_iommu_domain *vsi_domain,
+				   dma_addr_t iova, gfp_t gfp)
+{
+	u32 *page_table, *dte_addr;
+	u32 dte_index, dte;
+	phys_addr_t pt_phys;
+	dma_addr_t pt_dma;
+	gfp_t flags;
+
+	dte_index = vsi_iova_dte_index(iova);
+	dte_addr = &vsi_domain->dt[dte_index];
+	dte = *dte_addr;
+	if (vsi_dte_is_pt_valid(dte))
+		goto done;
+
+	/* Do not allow to sleep while allocating the buffer */
+	flags = (gfp & ~GFP_KERNEL) | GFP_ATOMIC | GFP_DMA32;
+	page_table = iommu_alloc_pages_sz(flags, PAGE_SIZE);
+	if (!page_table)
+		return ERR_PTR(-ENOMEM);
+
+	pt_dma = dma_map_single(vsi_domain->dev, page_table, PAGE_SIZE, DMA_TO_DEVICE);
+	if (dma_mapping_error(vsi_domain->dev, pt_dma)) {
+		dev_err(vsi_domain->dev, "DMA mapping error while allocating page table\n");
+		iommu_free_pages(page_table);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	dte = vsi_mk_dte(pt_dma);
+	*dte_addr = dte;
+
+	vsi_table_flush(vsi_domain,
+			vsi_domain->dt_dma + dte_index * sizeof(u32), 1);
+done:
+	pt_phys = vsi_dte_pt_address(dte);
+	return (u32 *)phys_to_virt(pt_phys);
+}
+
+static int vsi_iommu_map(struct iommu_domain *domain, unsigned long _iova,
+			 phys_addr_t paddr, size_t size, size_t count,
+			 int prot, gfp_t gfp, size_t *mapped)
+{
+	struct vsi_iommu_domain *vsi_domain = to_vsi_domain(domain);
+	dma_addr_t pte_dma, iova = (dma_addr_t)_iova;
+	u32 *page_table, *pte_addr;
+	u32 dte, pte_index;
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(&vsi_domain->lock, flags);
+
+	page_table = vsi_dte_get_page_table(vsi_domain, iova, gfp);
+	if (IS_ERR(page_table)) {
+		spin_unlock_irqrestore(&vsi_domain->lock, flags);
+		return PTR_ERR(page_table);
+	}
+
+	dte = vsi_domain->dt[vsi_iova_dte_index(iova)];
+	pte_index = vsi_iova_pte_index(iova);
+	pte_addr = &page_table[pte_index];
+	pte_dma = vsi_dte_pt_address(dte) + pte_index * sizeof(u32);
+	ret = vsi_iommu_map_iova(vsi_domain, pte_addr, pte_dma, iova,
+				 paddr, size, prot);
+	if (!ret)
+		*mapped = size;
+	spin_unlock_irqrestore(&vsi_domain->lock, flags);
+
+	return ret;
+}
+
+static void vsi_iommu_disable(struct vsi_iommu *iommu)
+{
+	writel(0, iommu->regs + VSI_MMU_AHB_CONTROL_BASE);
+}
+
+static int vsi_iommu_identity_attach(struct iommu_domain *domain,
+				     struct device *dev)
+{
+	struct vsi_iommu *iommu = dev_iommu_priv_get(dev);
+	unsigned long flags;
+	int ret;
+
+	ret = pm_runtime_resume_and_get(iommu->dev);
+	if (ret < 0)
+		return ret;
+
+	spin_lock_irqsave(&iommu->lock, flags);
+	if (iommu->domain == domain)
+		goto unlock;
+
+	vsi_iommu_disable(iommu);
+	list_del_init(&iommu->node);
+
+	iommu->domain = domain;
+
+unlock:
+	spin_unlock_irqrestore(&iommu->lock, flags);
+	pm_runtime_put_autosuspend(iommu->dev);
+	return 0;
+}
+
+static const struct iommu_domain_ops vsi_identity_ops = {
+	.attach_dev = vsi_iommu_identity_attach,
+};
+
+static struct iommu_domain vsi_identity_domain = {
+	.type = IOMMU_DOMAIN_IDENTITY,
+	.ops = &vsi_identity_ops,
+};
+
+static void vsi_iommu_enable(struct vsi_iommu *iommu, struct iommu_domain *domain)
+{
+	struct vsi_iommu_domain *vsi_domain = to_vsi_domain(domain);
+
+	if (domain == &vsi_identity_domain)
+		return;
+
+	writel(vsi_domain->pta_dma, iommu->regs + VSI_MMU_AHB_TLB_ARRAY_BASE_L_BASE);
+	writel(VSI_MMU_OUT_OF_BOUND, iommu->regs + VSI_MMU_CONFIG1_BASE);
+	writel(VSI_MMU_BIT_ENABLE, iommu->regs + VSI_MMU_AHB_EXCEPTION_BASE);
+	writel(VSI_MMU_BIT_ENABLE, iommu->regs + VSI_MMU_AHB_CONTROL_BASE);
+}
+
+static int vsi_iommu_attach_device(struct iommu_domain *domain,
+				   struct device *dev)
+{
+	struct vsi_iommu *iommu = dev_iommu_priv_get(dev);
+	struct vsi_iommu_domain *vsi_domain = to_vsi_domain(domain);
+	unsigned long flags, flags2;
+	int ret = 0;
+
+	ret = pm_runtime_resume_and_get(iommu->dev);
+	if (ret < 0)
+		return ret;
+
+	spin_lock_irqsave(&vsi_domain->lock, flags);
+	spin_lock_irqsave(&iommu->lock, flags2);
+
+	vsi_iommu_enable(iommu, domain);
+	writel(VSI_MMU_BIT_FLUSH, iommu->regs + VSI_MMU_FLUSH_BASE);
+	writel(0, iommu->regs + VSI_MMU_FLUSH_BASE);
+
+	list_del_init(&iommu->node);
+	list_add_tail(&iommu->node, &vsi_domain->iommus);
+
+	iommu->domain = domain;
+
+	spin_unlock_irqrestore(&iommu->lock, flags2);
+	spin_unlock_irqrestore(&vsi_domain->lock, flags);
+	pm_runtime_put_autosuspend(iommu->dev);
+	return ret;
+}
+
+static void vsi_iommu_domain_free(struct iommu_domain *domain)
+{
+	struct vsi_iommu_domain *vsi_domain = to_vsi_domain(domain);
+	unsigned long flags;
+	int i;
+
+	spin_lock_irqsave(&vsi_domain->lock, flags);
+
+	WARN_ON(!list_empty(&vsi_domain->iommus));
+
+	for (i = 0; i < NUM_DT_ENTRIES; i++) {
+		u32 dte = vsi_domain->dt[i];
+
+		if (vsi_dte_is_pt_valid(dte)) {
+			phys_addr_t pt_phys = vsi_dte_pt_address(dte);
+			u32 *page_table = phys_to_virt(pt_phys);
+
+			dma_unmap_single(vsi_domain->dev, pt_phys,
+					 SPAGE_SIZE, DMA_TO_DEVICE);
+			iommu_free_pages(page_table);
+		}
+	}
+
+	dma_unmap_single(vsi_domain->dev, vsi_domain->dt_dma,
+			 SPAGE_SIZE, DMA_TO_DEVICE);
+	iommu_free_pages(vsi_domain->dt);
+
+	dma_unmap_single(vsi_domain->dev, vsi_domain->pta_dma,
+			 SPAGE_SIZE, DMA_TO_DEVICE);
+	iommu_free_pages(vsi_domain->pta);
+
+	spin_unlock_irqrestore(&vsi_domain->lock, flags);
+
+	kfree(vsi_domain);
+}
+
+static struct iommu_device *vsi_iommu_probe_device(struct device *dev)
+{
+	struct vsi_iommu *iommu = vsi_iommu_get_from_dev(dev);
+	struct device_link *link;
+
+	link = device_link_add(dev, iommu->dev,
+			       DL_FLAG_STATELESS | DL_FLAG_PM_RUNTIME);
+	if (!link)
+		dev_err(dev, "Unable to link %s\n", dev_name(iommu->dev));
+
+	dev_iommu_priv_set(dev, iommu);
+	return &iommu->iommu;
+}
+
+static void vsi_iommu_release_device(struct device *dev)
+{
+	struct vsi_iommu *iommu = dev_iommu_priv_get(dev);
+
+	device_link_remove(dev, iommu->dev);
+}
+
+static int vsi_iommu_of_xlate(struct device *dev, const struct of_phandle_args *args)
+{
+	return iommu_fwspec_add_ids(dev, args->args, 1);
+}
+
+static const struct iommu_ops vsi_iommu_ops = {
+	.identity_domain = &vsi_identity_domain,
+	.release_domain = &vsi_identity_domain,
+	.domain_alloc_paging = vsi_iommu_domain_alloc_paging,
+	.of_xlate = vsi_iommu_of_xlate,
+	.probe_device = vsi_iommu_probe_device,
+	.release_device = vsi_iommu_release_device,
+	.device_group = generic_single_device_group,
+	.owner = THIS_MODULE,
+	.default_domain_ops = &(const struct iommu_domain_ops) {
+		.attach_dev		= vsi_iommu_attach_device,
+		.map_pages		= vsi_iommu_map,
+		.unmap_pages		= vsi_iommu_unmap,
+		.flush_iotlb_all	= vsi_iommu_flush_tlb_all,
+		.iova_to_phys		= vsi_iommu_iova_to_phys,
+		.free			= vsi_iommu_domain_free,
+	}
+};
+
+static const struct of_device_id vsi_iommu_dt_ids[] = {
+	{
+		.compatible = "verisilicon,iommu-1.2",
+	},
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, vsi_iommu_dt_ids);
+
+static int vsi_iommu_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct vsi_iommu *iommu;
+	int err;
+
+	iommu = devm_kzalloc(dev, sizeof(*iommu), GFP_KERNEL);
+	if (!iommu)
+		return -ENOMEM;
+
+	iommu->dev = dev;
+	spin_lock_init(&iommu->lock);
+	INIT_LIST_HEAD(&iommu->node);
+
+	iommu->regs = devm_platform_ioremap_resource(pdev, 0);
+	if (IS_ERR(iommu->regs))
+		return -ENOMEM;
+
+	iommu->num_clocks = devm_clk_bulk_get_all(dev, &iommu->clocks);
+	if  (iommu->num_clocks < 0)
+		return iommu->num_clocks;
+
+	err = clk_bulk_prepare(iommu->num_clocks, iommu->clocks);
+	if (err)
+		return err;
+
+	iommu->irq = platform_get_irq(pdev, 0);
+	if (iommu->irq < 0)
+		return iommu->irq;
+
+	err = devm_request_irq(iommu->dev, iommu->irq, vsi_iommu_irq,
+			       IRQF_SHARED, dev_name(dev), iommu);
+	if (err)
+		goto err_unprepare_clocks;
+
+	dma_set_mask_and_coherent(dev, DMA_BIT_MASK(32));
+	platform_set_drvdata(pdev, iommu);
+
+	pm_runtime_set_autosuspend_delay(dev, 100);
+	pm_runtime_use_autosuspend(dev);
+	pm_runtime_enable(dev);
+
+	err = iommu_device_sysfs_add(&iommu->iommu, dev, NULL, dev_name(dev));
+	if (err)
+		goto err_runtime_disable;
+
+	err = iommu_device_register(&iommu->iommu, &vsi_iommu_ops, dev);
+	if (err)
+		goto err_remove_sysfs;
+
+	return 0;
+
+err_remove_sysfs:
+	iommu_device_sysfs_remove(&iommu->iommu);
+err_runtime_disable:
+	pm_runtime_disable(dev);
+err_unprepare_clocks:
+	clk_bulk_unprepare(iommu->num_clocks, iommu->clocks);
+	return err;
+}
+
+static void vsi_iommu_shutdown(struct platform_device *pdev)
+{
+	struct vsi_iommu *iommu = platform_get_drvdata(pdev);
+
+	disable_irq(iommu->irq);
+	pm_runtime_force_suspend(&pdev->dev);
+}
+
+static int __maybe_unused vsi_iommu_suspend(struct device *dev)
+{
+	struct vsi_iommu *iommu = dev_get_drvdata(dev);
+
+	vsi_iommu_disable(iommu);
+
+	clk_bulk_disable(iommu->num_clocks, iommu->clocks);
+
+	return 0;
+}
+
+static int __maybe_unused vsi_iommu_resume(struct device *dev)
+{
+	struct vsi_iommu *iommu = dev_get_drvdata(dev);
+	unsigned long flags, flags2;
+	int ret;
+
+	ret = clk_bulk_enable(iommu->num_clocks, iommu->clocks);
+	if (ret)
+		return ret;
+
+	if (iommu->domain) {
+		struct vsi_iommu_domain *vsi_domain = to_vsi_domain(iommu->domain);
+
+		spin_lock_irqsave(&vsi_domain->lock, flags);
+		spin_lock_irqsave(&iommu->lock, flags2);
+		vsi_iommu_enable(iommu, iommu->domain);
+		spin_unlock_irqrestore(&iommu->lock, flags2);
+		spin_unlock_irqrestore(&vsi_domain->lock, flags);
+	}
+
+	return 0;
+}
+
+static DEFINE_RUNTIME_DEV_PM_OPS(vsi_iommu_pm_ops,
+				 vsi_iommu_suspend, vsi_iommu_resume,
+				 NULL);
+
+static struct platform_driver rockchip_vsi_iommu_driver = {
+	.probe = vsi_iommu_probe,
+	.shutdown = vsi_iommu_shutdown,
+	.driver = {
+		   .name = "vsi_iommu",
+		   .of_match_table = vsi_iommu_dt_ids,
+		   .pm = pm_sleep_ptr(&vsi_iommu_pm_ops),
+		   .suppress_bind_attrs = true,
+	},
+};
+module_platform_driver(rockchip_vsi_iommu_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Benjamin Gaignard <benjamin.gaignard@collabora.com>");
+MODULE_DESCRIPTION("Verisilicon IOMMU driver");
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame
  2025-08-25 15:34 [PATCH v7 0/6] Add support for Verisilicon IOMMU used by media codec blocks Benjamin Gaignard
                   ` (2 preceding siblings ...)
  2025-08-25 15:34 ` [PATCH v7 3/6] iommu: Add verisilicon IOMMU driver Benjamin Gaignard
@ 2025-08-25 15:34 ` Benjamin Gaignard
  2025-08-25 17:05   ` Jason Gunthorpe
  2025-08-25 15:34 ` [PATCH v7 5/6] arm64: dts: rockchip: Add verisilicon IOMMU node on RK3588 Benjamin Gaignard
  2025-08-25 15:34 ` [PATCH v7 6/6] arm64: defconfig: enable Verisilicon IOMMU Benjamin Gaignard
  5 siblings, 1 reply; 17+ messages in thread
From: Benjamin Gaignard @ 2025-08-25 15:34 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko,
	nicolas.dufresne, jgg, p.zabel, mchehab
  Cc: iommu, devicetree, linux-kernel, linux-arm-kernel, linux-rockchip,
	kernel, linux-media, Benjamin Gaignard

AV1 is a stateless decoder and multiple AV1 bitstreams could be decoded
at the same time. Each decoding context got it own iommu domain which
need to be restored before each frame. To be sure that iommu context is
correctly set AV1 driver detach and attach before decoding the frame.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
changes in version 7:
- add a patch in AV1 video decoder to manage per context iommu domain.

 drivers/media/platform/verisilicon/hantro.h           |  5 +++++
 drivers/media/platform/verisilicon/hantro_drv.c       | 11 +++++++++++
 .../platform/verisilicon/rockchip_vpu981_hw_av1_dec.c | 10 ++++++++++
 3 files changed, 26 insertions(+)

diff --git a/drivers/media/platform/verisilicon/hantro.h b/drivers/media/platform/verisilicon/hantro.h
index 81328c63b796..a28a181013b9 100644
--- a/drivers/media/platform/verisilicon/hantro.h
+++ b/drivers/media/platform/verisilicon/hantro.h
@@ -12,6 +12,9 @@
 #ifndef HANTRO_H_
 #define HANTRO_H_
 
+#include <linux/dma-map-ops.h>
+#include <linux/iommu.h>
+#include <linux/iommu-dma.h>
 #include <linux/platform_device.h>
 #include <linux/videodev2.h>
 #include <linux/wait.h>
@@ -266,6 +269,8 @@ struct hantro_ctx {
 	struct hantro_postproc_ctx postproc;
 	bool need_postproc;
 
+	struct iommu_domain *iommu_domain;
+
 	/* Specific for particular codec modes. */
 	union {
 		struct hantro_h264_dec_hw_ctx h264_dec;
diff --git a/drivers/media/platform/verisilicon/hantro_drv.c b/drivers/media/platform/verisilicon/hantro_drv.c
index 8542238e0fb1..feec8e6fb504 100644
--- a/drivers/media/platform/verisilicon/hantro_drv.c
+++ b/drivers/media/platform/verisilicon/hantro_drv.c
@@ -675,6 +675,13 @@ static int hantro_open(struct file *filp)
 	}
 	ctx->fh.ctrl_handler = &ctx->ctrl_handler;
 
+	if (use_dma_iommu(ctx->dev->v4l2_dev.dev)) {
+		ctx->iommu_domain = iommu_paging_domain_alloc(ctx->dev->v4l2_dev.dev);
+
+		if (!ctx->iommu_domain)
+			vpu_err("cannot alloc new empty domain\n");
+	}
+
 	return 0;
 
 err_fh_free:
@@ -698,6 +705,10 @@ static int hantro_release(struct file *filp)
 	v4l2_fh_del(&ctx->fh);
 	v4l2_fh_exit(&ctx->fh);
 	v4l2_ctrl_handler_free(&ctx->ctrl_handler);
+
+	if (ctx->iommu_domain)
+		iommu_domain_free(ctx->iommu_domain);
+
 	kfree(ctx);
 
 	return 0;
diff --git a/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c b/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
index e4703bb6be7c..b3e52387234f 100644
--- a/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
+++ b/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
@@ -2095,12 +2095,22 @@ rockchip_vpu981_av1_dec_set_output_buffer(struct hantro_ctx *ctx)
 	hantro_write_addr(vpu, AV1_TILE_OUT_MV, mv_addr);
 }
 
+static void rockchip_vpu981_av1_restore_iommu(struct hantro_ctx *ctx)
+{
+	if (ctx->iommu_domain) {
+		iommu_attach_device(ctx->iommu_domain, ctx->dev->v4l2_dev.dev);
+		iommu_detach_device(ctx->iommu_domain, ctx->dev->v4l2_dev.dev);
+	}
+}
+
 int rockchip_vpu981_av1_dec_run(struct hantro_ctx *ctx)
 {
 	struct hantro_dev *vpu = ctx->dev;
 	struct vb2_v4l2_buffer *vb2_src;
 	int ret;
 
+	rockchip_vpu981_av1_restore_iommu(ctx);
+
 	hantro_start_prepare_run(ctx);
 
 	ret = rockchip_vpu981_av1_dec_prepare_run(ctx);
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v7 5/6] arm64: dts: rockchip: Add verisilicon IOMMU node on RK3588
  2025-08-25 15:34 [PATCH v7 0/6] Add support for Verisilicon IOMMU used by media codec blocks Benjamin Gaignard
                   ` (3 preceding siblings ...)
  2025-08-25 15:34 ` [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame Benjamin Gaignard
@ 2025-08-25 15:34 ` Benjamin Gaignard
  2025-08-25 15:34 ` [PATCH v7 6/6] arm64: defconfig: enable Verisilicon IOMMU Benjamin Gaignard
  5 siblings, 0 replies; 17+ messages in thread
From: Benjamin Gaignard @ 2025-08-25 15:34 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko,
	nicolas.dufresne, jgg, p.zabel, mchehab
  Cc: iommu, devicetree, linux-kernel, linux-arm-kernel, linux-rockchip,
	kernel, linux-media, Benjamin Gaignard

Add the device tree node for the Verisilicon IOMMU present
in the RK3588 SoC.
This IOMMU handles address translation for the VPU hardware blocks.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
 arch/arm64/boot/dts/rockchip/rk3588-base.dtsi | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
index 70f03e68ba55..8656e46ad288 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi
@@ -1263,6 +1263,17 @@ av1d: video-codec@fdc70000 {
 		clock-names = "aclk", "hclk";
 		power-domains = <&power RK3588_PD_AV1>;
 		resets = <&cru SRST_A_AV1>, <&cru SRST_P_AV1>, <&cru SRST_A_AV1_BIU>, <&cru SRST_P_AV1_BIU>;
+		iommus = <&av1d_mmu>;
+	};
+
+	av1d_mmu: iommu@fdca0000 {
+		compatible = "rockchip,rk3588-av1-iommu", "verisilicon,iommu-1.2";
+		reg = <0x0 0xfdca0000 0x0 0x600>;
+		interrupts = <GIC_SPI 109 IRQ_TYPE_LEVEL_HIGH 0>;
+		clocks = <&cru ACLK_AV1>, <&cru PCLK_AV1>;
+		clock-names = "core", "iface";
+		#iommu-cells = <0>;
+		power-domains = <&power RK3588_PD_AV1>;
 	};
 
 	vop: vop@fdd90000 {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v7 6/6] arm64: defconfig: enable Verisilicon IOMMU
  2025-08-25 15:34 [PATCH v7 0/6] Add support for Verisilicon IOMMU used by media codec blocks Benjamin Gaignard
                   ` (4 preceding siblings ...)
  2025-08-25 15:34 ` [PATCH v7 5/6] arm64: dts: rockchip: Add verisilicon IOMMU node on RK3588 Benjamin Gaignard
@ 2025-08-25 15:34 ` Benjamin Gaignard
  2025-08-26  8:25   ` Krzysztof Kozlowski
  5 siblings, 1 reply; 17+ messages in thread
From: Benjamin Gaignard @ 2025-08-25 15:34 UTC (permalink / raw)
  To: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko,
	nicolas.dufresne, jgg, p.zabel, mchehab
  Cc: iommu, devicetree, linux-kernel, linux-arm-kernel, linux-rockchip,
	kernel, linux-media, Benjamin Gaignard

Enable Verisilicon IOMMU used by RK3588 AV1 hardware codec.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
---
 arch/arm64/configs/defconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 58f87d09366c..04547bcc904b 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -1470,6 +1470,7 @@ CONFIG_ARM_SMMU=y
 CONFIG_ARM_SMMU_V3=y
 CONFIG_MTK_IOMMU=y
 CONFIG_QCOM_IOMMU=y
+CONFIG_VSI_IOMMU=m
 CONFIG_REMOTEPROC=y
 CONFIG_IMX_REMOTEPROC=y
 CONFIG_MTK_SCP=m
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame
  2025-08-25 15:34 ` [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame Benjamin Gaignard
@ 2025-08-25 17:05   ` Jason Gunthorpe
  2025-08-25 17:50     ` Nicolas Dufresne
  0 siblings, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2025-08-25 17:05 UTC (permalink / raw)
  To: Benjamin Gaignard
  Cc: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko,
	nicolas.dufresne, p.zabel, mchehab, iommu, devicetree,
	linux-kernel, linux-arm-kernel, linux-rockchip, kernel,
	linux-media

On Mon, Aug 25, 2025 at 05:34:43PM +0200, Benjamin Gaignard wrote:
> diff --git a/drivers/media/platform/verisilicon/hantro.h b/drivers/media/platform/verisilicon/hantro.h
> index 81328c63b796..a28a181013b9 100644
> --- a/drivers/media/platform/verisilicon/hantro.h
> +++ b/drivers/media/platform/verisilicon/hantro.h
> @@ -12,6 +12,9 @@
>  #ifndef HANTRO_H_
>  #define HANTRO_H_
>  
> +#include <linux/dma-map-ops.h>
> +#include <linux/iommu.h>
> +#include <linux/iommu-dma.h>

This is an internal header it should not be included in drivers.

> +static void rockchip_vpu981_av1_restore_iommu(struct hantro_ctx *ctx)
> +{
> +	if (ctx->iommu_domain) {
> +		iommu_attach_device(ctx->iommu_domain, ctx->dev->v4l2_dev.dev);
> +		iommu_detach_device(ctx->iommu_domain, ctx->dev->v4l2_dev.dev);
> +	}
> +}

What is this supposed to do? Put it back to the default domain? Who
changed it away from the default domain?

Did some other driver change the attached domain (if so that's wild
and wrong)? The commit message hints at that but it should be
explained alot more.

This just seems wrong and goofy. Driver shouldn't be changing their
iommu domains if they are using the default domain at all. We now have
APIs to allow you to allocate wide chunks of IOVA space and manage
them directly. Maybe these 'multiple stream's should be doing that
instead of mucking with iommu domains?

Jason

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame
  2025-08-25 17:05   ` Jason Gunthorpe
@ 2025-08-25 17:50     ` Nicolas Dufresne
  2025-08-25 18:31       ` Jason Gunthorpe
  0 siblings, 1 reply; 17+ messages in thread
From: Nicolas Dufresne @ 2025-08-25 17:50 UTC (permalink / raw)
  To: Jason Gunthorpe, Benjamin Gaignard
  Cc: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko, p.zabel,
	mchehab, iommu, devicetree, linux-kernel, linux-arm-kernel,
	linux-rockchip, kernel, linux-media

[-- Attachment #1: Type: text/plain, Size: 2773 bytes --]

Hi Benjmain, Jason,

Le lundi 25 août 2025 à 14:05 -0300, Jason Gunthorpe a écrit :
> On Mon, Aug 25, 2025 at 05:34:43PM +0200, Benjamin Gaignard wrote:
> > diff --git a/drivers/media/platform/verisilicon/hantro.h b/drivers/media/platform/verisilicon/hantro.h
> > index 81328c63b796..a28a181013b9 100644
> > --- a/drivers/media/platform/verisilicon/hantro.h
> > +++ b/drivers/media/platform/verisilicon/hantro.h
> > @@ -12,6 +12,9 @@
> >  #ifndef HANTRO_H_
> >  #define HANTRO_H_
> >  
> > +#include <linux/dma-map-ops.h>
> > +#include <linux/iommu.h>
> > +#include <linux/iommu-dma.h>
> 
> This is an internal header it should not be included in drivers.
> 
> > +static void rockchip_vpu981_av1_restore_iommu(struct hantro_ctx *ctx)
> > +{
> > +	if (ctx->iommu_domain) {
> > +		iommu_attach_device(ctx->iommu_domain, ctx->dev->v4l2_dev.dev);
> > +		iommu_detach_device(ctx->iommu_domain, ctx->dev->v4l2_dev.dev);
> > +	}
> > +}
> 
> What is this supposed to do? Put it back to the default domain? Who
> changed it away from the default domain?

If you rename the iommu_domain into empty_domain it adds a bit of sense. When
you attach another domain (this one is empty, but it does not matter) and detach
it, the default domain get restored. That effectively forces the reprogramming
(in the VSI case flushing) of the iommu configuration.

> 
> Did some other driver change the attached domain (if so that's wild
> and wrong)? The commit message hints at that but it should be
> explained alot more.
> 
> This just seems wrong and goofy. Driver shouldn't be changing their
> iommu domains if they are using the default domain at all. We now have
> APIs to allow you to allocate wide chunks of IOVA space and manage
> them directly. Maybe these 'multiple stream's should be doing that
> instead of mucking with iommu domains?

This is seeking inspiration of what we do in rkvdec [0], the iommu-dma.h should
not be needed.

Jason, the point is that the iommu and the VPU are not separate devices, which
comes with side effects. On RKVDec side, the iommu configuration get resets
whenever a decoding error leads to a VPU "self reset". I can't remember who from
the iommu subsystem suggested that, but the empty domain method was agreed to be
the least invasive way to workaround that issue. I believe Detlev tried multiple
time to add APIs for that before the discussion lead to this path.

Benjamin, please improve the naming, comment and description, I agree with Jason
its not completely clear. I'm also surprised that you do that every frame, seems
excessive.

regards,
Nicolas

[0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ff8c5622f9f7c644e995d013af320b59e4d61b93

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame
  2025-08-25 17:50     ` Nicolas Dufresne
@ 2025-08-25 18:31       ` Jason Gunthorpe
  2025-08-26  9:52         ` Benjamin Gaignard
  0 siblings, 1 reply; 17+ messages in thread
From: Jason Gunthorpe @ 2025-08-25 18:31 UTC (permalink / raw)
  To: Nicolas Dufresne
  Cc: Benjamin Gaignard, joro, will, robin.murphy, robh, krzk+dt,
	conor+dt, heiko, p.zabel, mchehab, iommu, devicetree,
	linux-kernel, linux-arm-kernel, linux-rockchip, kernel,
	linux-media

On Mon, Aug 25, 2025 at 01:50:16PM -0400, Nicolas Dufresne wrote:

> Jason, the point is that the iommu and the VPU are not separate devices, which
> comes with side effects. On RKVDec side, the iommu configuration get resets
> whenever a decoding error leads to a VPU "self reset". I can't remember who from
> the iommu subsystem suggested that, but the empty domain method was agreed to be

IDK, that seems really goofy too me an defiantly needs to be
extensively documented this is restoring the default with some lore
link of the original suggestion.

> the least invasive way to workaround that issue. I believe Detlev tried multiple
> time to add APIs for that before the discussion lead to this path.

You mean this:

https://lore.kernel.org/linux-iommu/20250318152049.14781-1-detlev.casanova@collabora.com/

Which came back with the same remark I would give:

 Please have some kind of proper reset notifier mechanism - in fact
 with runtime PM could you not already invoke a suspend/resume cycle
 via the device links?

Or another reasonable option:

  Or at worst just export a public interface for the other driver to
  invoke rk_iommu_resume() directly.

Sigh.

> Benjamin, please improve the naming, comment and description, I agree with Jason
> its not completely clear. I'm also surprised that you do that every frame, seems
> excessive.

Indeed if it is just error recovery.

> [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ff8c5622f9f7c644e995d013af320b59e4d61b93

This is already merged? And now you want two copies of this? I think
this is a very poor direction..

Jason

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v7 6/6] arm64: defconfig: enable Verisilicon IOMMU
  2025-08-25 15:34 ` [PATCH v7 6/6] arm64: defconfig: enable Verisilicon IOMMU Benjamin Gaignard
@ 2025-08-26  8:25   ` Krzysztof Kozlowski
  2025-08-26  9:53     ` Benjamin Gaignard
  0 siblings, 1 reply; 17+ messages in thread
From: Krzysztof Kozlowski @ 2025-08-26  8:25 UTC (permalink / raw)
  To: Benjamin Gaignard, joro, will, robin.murphy, robh, krzk+dt,
	conor+dt, heiko, nicolas.dufresne, jgg, p.zabel, mchehab
  Cc: iommu, devicetree, linux-kernel, linux-arm-kernel, linux-rockchip,
	kernel, linux-media

On 25/08/2025 17:34, Benjamin Gaignard wrote:
> Enable Verisilicon IOMMU used by RK3588 AV1 hardware codec.

Qualcomm RK3588? This is defconfig for all platforms, so be specific
which board uses it.



Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame
  2025-08-25 18:31       ` Jason Gunthorpe
@ 2025-08-26  9:52         ` Benjamin Gaignard
  2025-08-26 12:41           ` Jason Gunthorpe
  0 siblings, 1 reply; 17+ messages in thread
From: Benjamin Gaignard @ 2025-08-26  9:52 UTC (permalink / raw)
  To: Jason Gunthorpe, Nicolas Dufresne
  Cc: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko, p.zabel,
	mchehab, iommu, devicetree, linux-kernel, linux-arm-kernel,
	linux-rockchip, kernel, linux-media


Le 25/08/2025 à 20:31, Jason Gunthorpe a écrit :
> On Mon, Aug 25, 2025 at 01:50:16PM -0400, Nicolas Dufresne wrote:
>
>> Jason, the point is that the iommu and the VPU are not separate devices, which
>> comes with side effects. On RKVDec side, the iommu configuration get resets
>> whenever a decoding error leads to a VPU "self reset". I can't remember who from
>> the iommu subsystem suggested that, but the empty domain method was agreed to be
> IDK, that seems really goofy too me an defiantly needs to be
> extensively documented this is restoring the default with some lore
> link of the original suggestion.
>
>> the least invasive way to workaround that issue. I believe Detlev tried multiple
>> time to add APIs for that before the discussion lead to this path.
> You mean this:
>
> https://lore.kernel.org/linux-iommu/20250318152049.14781-1-detlev.casanova@collabora.com/
>
> Which came back with the same remark I would give:
>
>   Please have some kind of proper reset notifier mechanism - in fact
>   with runtime PM could you not already invoke a suspend/resume cycle
>   via the device links?

when doing parallel decode suspend/resume are not invoked.

>
> Or another reasonable option:
>
>    Or at worst just export a public interface for the other driver to
>    invoke rk_iommu_resume() directly.
>
> Sigh.

An other solution which is working is to call iommu_flush_iotlb_all()
before decoding each frame.
It doesn't require to allocate a domain per decoding context.
Does that sound as a better solution ?

Benjamin

>
>> Benjamin, please improve the naming, comment and description, I agree with Jason
>> its not completely clear. I'm also surprised that you do that every frame, seems
>> excessive.
> Indeed if it is just error recovery.
>
>> [0] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ff8c5622f9f7c644e995d013af320b59e4d61b93
> This is already merged? And now you want two copies of this? I think
> this is a very poor direction..
>
> Jason
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v7 6/6] arm64: defconfig: enable Verisilicon IOMMU
  2025-08-26  8:25   ` Krzysztof Kozlowski
@ 2025-08-26  9:53     ` Benjamin Gaignard
  2025-08-26 10:05       ` Krzysztof Kozlowski
  0 siblings, 1 reply; 17+ messages in thread
From: Benjamin Gaignard @ 2025-08-26  9:53 UTC (permalink / raw)
  To: Krzysztof Kozlowski, joro, will, robin.murphy, robh, krzk+dt,
	conor+dt, heiko, nicolas.dufresne, jgg, p.zabel, mchehab
  Cc: iommu, devicetree, linux-kernel, linux-arm-kernel, linux-rockchip,
	kernel, linux-media


Le 26/08/2025 à 10:25, Krzysztof Kozlowski a écrit :
> On 25/08/2025 17:34, Benjamin Gaignard wrote:
>> Enable Verisilicon IOMMU used by RK3588 AV1 hardware codec.
> Qualcomm RK3588? This is defconfig for all platforms, so be specific
> which board uses it.

It is for rockchip rk3588 SoC.

Regards,
Benjamin

>
>
>
> Best regards,
> Krzysztof
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v7 6/6] arm64: defconfig: enable Verisilicon IOMMU
  2025-08-26  9:53     ` Benjamin Gaignard
@ 2025-08-26 10:05       ` Krzysztof Kozlowski
  0 siblings, 0 replies; 17+ messages in thread
From: Krzysztof Kozlowski @ 2025-08-26 10:05 UTC (permalink / raw)
  To: Benjamin Gaignard, joro, will, robin.murphy, robh, krzk+dt,
	conor+dt, heiko, nicolas.dufresne, jgg, p.zabel, mchehab
  Cc: iommu, devicetree, linux-kernel, linux-arm-kernel, linux-rockchip,
	kernel, linux-media

On 26/08/2025 11:53, Benjamin Gaignard wrote:
> 
> Le 26/08/2025 à 10:25, Krzysztof Kozlowski a écrit :
>> On 25/08/2025 17:34, Benjamin Gaignard wrote:
>>> Enable Verisilicon IOMMU used by RK3588 AV1 hardware codec.
>> Qualcomm RK3588? This is defconfig for all platforms, so be specific
>> which board uses it.
> 
> It is for rockchip rk3588 SoC.

I guessed that, please mention it in the commit msg if there is going to
be v8. Also mention name of any board upstream having this.

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame
  2025-08-26  9:52         ` Benjamin Gaignard
@ 2025-08-26 12:41           ` Jason Gunthorpe
  2025-08-26 13:14             ` Benjamin Gaignard
  2025-08-29 16:23             ` Nicolas Dufresne
  0 siblings, 2 replies; 17+ messages in thread
From: Jason Gunthorpe @ 2025-08-26 12:41 UTC (permalink / raw)
  To: Benjamin Gaignard
  Cc: Nicolas Dufresne, joro, will, robin.murphy, robh, krzk+dt,
	conor+dt, heiko, p.zabel, mchehab, iommu, devicetree,
	linux-kernel, linux-arm-kernel, linux-rockchip, kernel,
	linux-media

On Tue, Aug 26, 2025 at 11:52:37AM +0200, Benjamin Gaignard wrote:
> 
> Le 25/08/2025 à 20:31, Jason Gunthorpe a écrit :
> > On Mon, Aug 25, 2025 at 01:50:16PM -0400, Nicolas Dufresne wrote:
> > 
> > > Jason, the point is that the iommu and the VPU are not separate devices, which
> > > comes with side effects. On RKVDec side, the iommu configuration get resets
> > > whenever a decoding error leads to a VPU "self reset". I can't remember who from
> > > the iommu subsystem suggested that, but the empty domain method was agreed to be
> > IDK, that seems really goofy too me an defiantly needs to be
> > extensively documented this is restoring the default with some lore
> > link of the original suggestion.
> > 
> > > the least invasive way to workaround that issue. I believe Detlev tried multiple
> > > time to add APIs for that before the discussion lead to this path.
> > You mean this:
> > 
> > https://lore.kernel.org/linux-iommu/20250318152049.14781-1-detlev.casanova@collabora.com/
> > 
> > Which came back with the same remark I would give:
> > 
> >   Please have some kind of proper reset notifier mechanism - in fact
> >   with runtime PM could you not already invoke a suspend/resume cycle
> >   via the device links?
> 
> when doing parallel decode suspend/resume are not invoked.

It was a proposal for an error recovery path.

> > Or another reasonable option:
> > 
> >    Or at worst just export a public interface for the other driver to
> >    invoke rk_iommu_resume() directly.
> > 
> > Sigh.
> 
> An other solution which is working is to call iommu_flush_iotlb_all()
> before decoding each frame.

That was already proposed and shot down, it makes no sense at all use
to use flushing to reset the registers because the HW weirdly lost
them, and flushing should never happen outside mapping contexts.

If the HW is really resetting the iommu registers after every frame
that is really just painfully broken, and makes me wonder if it really
should be an iommu subsystem driver at all if it is so tightly coupled
to the computing HW. :\

Like we wouldn't try to put a GPU MMU into the iommu subsystem though
they do fairly similar things.

Jason

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame
  2025-08-26 12:41           ` Jason Gunthorpe
@ 2025-08-26 13:14             ` Benjamin Gaignard
  2025-08-29 16:23             ` Nicolas Dufresne
  1 sibling, 0 replies; 17+ messages in thread
From: Benjamin Gaignard @ 2025-08-26 13:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Nicolas Dufresne, joro, will, robin.murphy, robh, krzk+dt,
	conor+dt, heiko, p.zabel, mchehab, iommu, devicetree,
	linux-kernel, linux-arm-kernel, linux-rockchip, kernel,
	linux-media


Le 26/08/2025 à 14:41, Jason Gunthorpe a écrit :
> On Tue, Aug 26, 2025 at 11:52:37AM +0200, Benjamin Gaignard wrote:
>> Le 25/08/2025 à 20:31, Jason Gunthorpe a écrit :
>>> On Mon, Aug 25, 2025 at 01:50:16PM -0400, Nicolas Dufresne wrote:
>>>
>>>> Jason, the point is that the iommu and the VPU are not separate devices, which
>>>> comes with side effects. On RKVDec side, the iommu configuration get resets
>>>> whenever a decoding error leads to a VPU "self reset". I can't remember who from
>>>> the iommu subsystem suggested that, but the empty domain method was agreed to be
>>> IDK, that seems really goofy too me an defiantly needs to be
>>> extensively documented this is restoring the default with some lore
>>> link of the original suggestion.
>>>
>>>> the least invasive way to workaround that issue. I believe Detlev tried multiple
>>>> time to add APIs for that before the discussion lead to this path.
>>> You mean this:
>>>
>>> https://lore.kernel.org/linux-iommu/20250318152049.14781-1-detlev.casanova@collabora.com/
>>>
>>> Which came back with the same remark I would give:
>>>
>>>    Please have some kind of proper reset notifier mechanism - in fact
>>>    with runtime PM could you not already invoke a suspend/resume cycle
>>>    via the device links?
>> when doing parallel decode suspend/resume are not invoked.
> It was a proposal for an error recovery path.
>
>>> Or another reasonable option:
>>>
>>>     Or at worst just export a public interface for the other driver to
>>>     invoke rk_iommu_resume() directly.
>>>
>>> Sigh.
>> An other solution which is working is to call iommu_flush_iotlb_all()
>> before decoding each frame.
> That was already proposed and shot down, it makes no sense at all use
> to use flushing to reset the registers because the HW weirdly lost
> them, and flushing should never happen outside mapping contexts.
>
> If the HW is really resetting the iommu registers after every frame
> that is really just painfully broken, and makes me wonder if it really
> should be an iommu subsystem driver at all if it is so tightly coupled
> to the computing HW. :\
>
> Like we wouldn't try to put a GPU MMU into the iommu subsystem though
> they do fairly similar things.

I will add vsi_iommu_restore_ctx() function where all locks will be
managed and call vsi_iommu_enable(). Decoder driver will call this function
before each frame.
I think it similar to some omap iommu code I have found.

Regards,
Benjamin

> Jason
>

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame
  2025-08-26 12:41           ` Jason Gunthorpe
  2025-08-26 13:14             ` Benjamin Gaignard
@ 2025-08-29 16:23             ` Nicolas Dufresne
  1 sibling, 0 replies; 17+ messages in thread
From: Nicolas Dufresne @ 2025-08-29 16:23 UTC (permalink / raw)
  To: Jason Gunthorpe, Benjamin Gaignard
  Cc: joro, will, robin.murphy, robh, krzk+dt, conor+dt, heiko, p.zabel,
	mchehab, iommu, devicetree, linux-kernel, linux-arm-kernel,
	linux-rockchip, kernel, linux-media

[-- Attachment #1: Type: text/plain, Size: 4055 bytes --]

Le mardi 26 août 2025 à 09:41 -0300, Jason Gunthorpe a écrit :
> On Tue, Aug 26, 2025 at 11:52:37AM +0200, Benjamin Gaignard wrote:
> > 
> > Le 25/08/2025 à 20:31, Jason Gunthorpe a écrit :
> > > On Mon, Aug 25, 2025 at 01:50:16PM -0400, Nicolas Dufresne wrote:
> > > 
> > > > Jason, the point is that the iommu and the VPU are not separate devices, which
> > > > comes with side effects. On RKVDec side, the iommu configuration get resets
> > > > whenever a decoding error leads to a VPU "self reset". I can't remember who from
> > > > the iommu subsystem suggested that, but the empty domain method was agreed to be
> > > IDK, that seems really goofy too me an defiantly needs to be
> > > extensively documented this is restoring the default with some lore
> > > link of the original suggestion.
> > > 
> > > > the least invasive way to workaround that issue. I believe Detlev tried multiple
> > > > time to add APIs for that before the discussion lead to this path.
> > > You mean this:
> > > 
> > > https://lore.kernel.org/linux-iommu/20250318152049.14781-1-detlev.casanova@collabora.com/
> > > 
> > > Which came back with the same remark I would give:
> > > 
> > >   Please have some kind of proper reset notifier mechanism - in fact
> > >   with runtime PM could you not already invoke a suspend/resume cycle
> > >   via the device links?
> > 
> > when doing parallel decode suspend/resume are not invoked.
> 
> It was a proposal for an error recovery path.
> 
> > > Or another reasonable option:
> > > 
> > >    Or at worst just export a public interface for the other driver to
> > >    invoke rk_iommu_resume() directly.
> > > 
> > > Sigh.
> > 
> > An other solution which is working is to call iommu_flush_iotlb_all()
> > before decoding each frame.
> 
> That was already proposed and shot down, it makes no sense at all use
> to use flushing to reset the registers because the HW weirdly lost
> them, and flushing should never happen outside mapping contexts.
> 
> If the HW is really resetting the iommu registers after every frame
> that is really just painfully broken, and makes me wonder if it really
> should be an iommu subsystem driver at all if it is so tightly coupled
> to the computing HW. :\
> 
> Like we wouldn't try to put a GPU MMU into the iommu subsystem though
> they do fairly similar things.

I didn't mention, but this is obviously close to the same IOMMU wrapped inside
etnaviv (same company making it). Note that for media driver, drivers in the
iommu subsystem are very convenient, they just work usually (except when they
don't like with codecs). I'm pretty sure rkmmu is also used by Panfrost, so I
suppose not all GPU IOMMU lives in GPU drivers (I could be wrong). Its one
instance per block, but the same programming interface. Note that we do have an
in-driver iommu implementation in the RGA2 driver.

If we can agree on solutions for this problem, which seems slightly different on
RK compared to VSI IOMMU, it will be quite beneficial to not have to override
all the media allocation ops, and re-implement rkmmu, vsimmu in every driver
needing this block. The VSI mmu could be used similarly to how the rkmmu is
used, meaning we'd have to copy its implementation in every driver. If we could
find out more accuratly how this is suppose to work it would be great, I feel we
don't really understand what is going on yet. Once that done, we can port rkvdec
driver with the unified solution.

The empty domain approach was used since there was no solution that came out
over a year, and users these days expect concurrent decoding to work. So yes,
its not all pretty, but its the best we found until this type of hardware
behaviour gets an API for that is commonly agreed.

Main question is shall we block on merging the VSI IOMMU driver for that reason
? Its there anything in the IOMMU driver that still needs more work ?

cheers,
Nicolas

p.s. RK means Rockchip, VSI means Verisilicon, the company behind Vivante GPU

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2025-08-29 16:23 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-08-25 15:34 [PATCH v7 0/6] Add support for Verisilicon IOMMU used by media codec blocks Benjamin Gaignard
2025-08-25 15:34 ` [PATCH v7 1/6] dt-bindings: vendor-prefixes: Add Verisilicon Benjamin Gaignard
2025-08-25 15:34 ` [PATCH v7 2/6] dt-bindings: iommu: verisilicon: Add binding for VSI IOMMU Benjamin Gaignard
2025-08-25 15:34 ` [PATCH v7 3/6] iommu: Add verisilicon IOMMU driver Benjamin Gaignard
2025-08-25 15:34 ` [PATCH v7 4/6] media: verisilicon: AV1: Restore IOMMU context before decoding a frame Benjamin Gaignard
2025-08-25 17:05   ` Jason Gunthorpe
2025-08-25 17:50     ` Nicolas Dufresne
2025-08-25 18:31       ` Jason Gunthorpe
2025-08-26  9:52         ` Benjamin Gaignard
2025-08-26 12:41           ` Jason Gunthorpe
2025-08-26 13:14             ` Benjamin Gaignard
2025-08-29 16:23             ` Nicolas Dufresne
2025-08-25 15:34 ` [PATCH v7 5/6] arm64: dts: rockchip: Add verisilicon IOMMU node on RK3588 Benjamin Gaignard
2025-08-25 15:34 ` [PATCH v7 6/6] arm64: defconfig: enable Verisilicon IOMMU Benjamin Gaignard
2025-08-26  8:25   ` Krzysztof Kozlowski
2025-08-26  9:53     ` Benjamin Gaignard
2025-08-26 10:05       ` Krzysztof Kozlowski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).