* Re: [PATCH v2] arm64: dts: airoha: en7581: Enable spi nand controller for EN7581 EVB
From: Christian Marangi (Ansuel) @ 2026-04-15 9:47 UTC (permalink / raw)
To: Lorenzo Bianconi
Cc: Matthias Brugger, AngeloGioacchino Del Regno, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, linux-arm-kernel,
linux-mediatek, devicetree
In-Reply-To: <abBPufvrG8I8UP69@lore-desk>
On Tue, 10 Mar 2026 at 18:07, Lorenzo Bianconi
<lorenzo@kernel.org> wrote:
>
> > Enable spi controller used for snand memory device for EN7581 evaluation
> > board.
> >
> > Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
>
> Hi all,
>
> it seems this patch has been reviewed by AngeloGioacchino, but it has never
> been applied to linux-mediatek tree (or at least I can't find it). It is marked
> as 'New, archived' in patchwork [0]. Am I missing something?
>
> Regards,
> Lorenzo
>
> [0] https://patchwork.kernel.org/project/linux-mediatek/patch/20250225-en7581-snfi-probe-fix-v2-1-92e35add701b@kernel.org/
>
Hi,
friendly ping here. There are lots of patches with review tags and ACKs,
also for 7583.
Any chance someone can ping the maintainers who take care of picking up these
patches? Or can someone reply on how to handle this? Maybe we need to sync with
them? Lorenzo and I are fully maintaining the Airoha ARM target, also in
U-Boot. The target is also starting to get traction in OpenWrt and is being
used there, so Airoha is no longer considered an abandoned target.
^ permalink raw reply
* [PATCH] pwm: atmel-tcb: Fix sleeping function called from invalid context
From: Sangyun Kim @ 2026-04-15 9:34 UTC (permalink / raw)
To: ukleinek
Cc: nicolas.ferre, alexandre.belloni, claudiu.beznea, linux-pwm,
linux-arm-kernel, linux-kernel
atmel_tcb_pwm_apply() holds tcbpwmc->lock as a spinlock via
guard(spinlock)() and then calls atmel_tcb_pwm_config(), which calls
clk_get_rate() twice. clk_get_rate() acquires clk_prepare_lock (a
mutex), so this is a sleep-in-atomic-context violation.
On CONFIG_DEBUG_ATOMIC_SLEEP kernels every pwm_apply_state() that
enables or reconfigures the PWM triggers a "BUG: sleeping function
called from invalid context" warning.
All callers of tcbpwmc->lock (the .request and .apply callbacks) run in
process context and only need mutual exclusion against each other, so
use a mutex instead of a spinlock and allow the sleeping calls inside
atmel_tcb_pwm_config().
Fixes: 37f7707077f5 ("pwm: atmel-tcb: Fix race condition and convert to guards")
Signed-off-by: Sangyun Kim <sangyun.kim@snu.ac.kr>
---
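Note (illustration only, not part of the patch): the locking rule described
in the commit message can be modeled in userspace. The sketch below is an
analogy built on POSIX threads, not kernel code. The names fake_pwm_chip,
fake_clk_get_rate() and fake_pwm_apply() are hypothetical stand-ins for the
driver state, clk_get_rate() and the .apply callback. It shows that a call
which may block is acceptable while a mutex is held, which is the property
the patch relies on after dropping the spinlock.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical stand-in for the driver state; not the real struct. */
struct fake_pwm_chip {
	pthread_mutex_t lock;	/* plays the role of tcbpwmc->lock */
	uint64_t period_ns;
	uint64_t duty_ns;
};

/* Stand-in for clk_get_rate(): a call that is allowed to sleep. */
static unsigned long fake_clk_get_rate(void)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 1000 };

	nanosleep(&ts, NULL);	/* may block, like taking a mutex would */
	return 32768;
}

static void fake_pwm_init(struct fake_pwm_chip *chip)
{
	pthread_mutex_init(&chip->lock, NULL);
	chip->period_ns = 0;
	chip->duty_ns = 0;
}

/* Returns 0 on success, -1 if the request is rejected. */
static int fake_pwm_apply(struct fake_pwm_chip *chip,
			  uint64_t period_ns, uint64_t duty_ns)
{
	int ret = -1;

	pthread_mutex_lock(&chip->lock);
	/* Blocking here is fine because the lock is a mutex. */
	if (fake_clk_get_rate() != 0 && duty_ns <= period_ns) {
		chip->period_ns = period_ns;
		chip->duty_ns = duty_ns;
		ret = 0;
	}
	pthread_mutex_unlock(&chip->lock);

	return ret;
}
```

A caller would run fake_pwm_init() once and then call fake_pwm_apply() from
any thread. The kernel equivalent of the unsafe variant would hold a spinlock
across the blocking call, which is what CONFIG_DEBUG_ATOMIC_SLEEP reports.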
drivers/pwm/pwm-atmel-tcb.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/pwm/pwm-atmel-tcb.c b/drivers/pwm/pwm-atmel-tcb.c
index f9ff78ba122d..6405e82d9f10 100644
--- a/drivers/pwm/pwm-atmel-tcb.c
+++ b/drivers/pwm/pwm-atmel-tcb.c
@@ -17,6 +17,7 @@
#include <linux/ioport.h>
#include <linux/io.h>
#include <linux/mfd/syscon.h>
+#include <linux/mutex.h>
#include <linux/platform_device.h>
#include <linux/pwm.h>
#include <linux/of.h>
@@ -47,7 +48,7 @@ struct atmel_tcb_channel {
};
struct atmel_tcb_pwm_chip {
- spinlock_t lock;
+ struct mutex lock;
u8 channel;
u8 width;
struct regmap *regmap;
@@ -81,7 +82,7 @@ static int atmel_tcb_pwm_request(struct pwm_chip *chip,
tcbpwm->period = 0;
tcbpwm->div = 0;
- guard(spinlock)(&tcbpwmc->lock);
+ guard(mutex)(&tcbpwmc->lock);
regmap_read(tcbpwmc->regmap, ATMEL_TC_REG(tcbpwmc->channel, CMR), &cmr);
/*
@@ -335,7 +336,7 @@ static int atmel_tcb_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
int duty_cycle, period;
int ret;
- guard(spinlock)(&tcbpwmc->lock);
+ guard(mutex)(&tcbpwmc->lock);
if (!state->enabled) {
atmel_tcb_pwm_disable(chip, pwm, state->polarity);
@@ -438,7 +439,7 @@ static int atmel_tcb_pwm_probe(struct platform_device *pdev)
if (err)
goto err_gclk;
- spin_lock_init(&tcbpwmc->lock);
+ mutex_init(&tcbpwmc->lock);
err = pwmchip_add(chip);
if (err < 0)
--
2.34.1
^ permalink raw reply related
* [RFC PATCH v5 8/9] media: chips-media: wave6: Add Wave6 control driver
From: Nas Chung @ 2026-04-15 9:25 UTC (permalink / raw)
To: mchehab, hverkuil, robh, krzk+dt, conor+dt, shawnguo, s.hauer
Cc: linux-media, devicetree, linux-kernel, linux-imx,
linux-arm-kernel, marek.vasut, ming.qian, Nas Chung
In-Reply-To: <20260415092529.577-1-nas.chung@chipsnmedia.com>
Add the control driver for the Chips&Media Wave6 video codec IP.
The hardware contains one control register region and four interface
register regions for a shared video processing engine.
This driver handles the control region and manages shared resources such
as firmware loading, firmware memory allocation, and coordination required
by the interface register regions.
It also instantiates and coordinates with the `wave6-core` child devices
for firmware and power state management.
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
Tested-by: Ming Qian <ming.qian@oss.nxp.com>
Tested-by: Marek Vasut <marek.vasut@mailbox.org>
---
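Note (illustration only, not part of the patch): the firmware and power state
coordination mentioned in the commit message is driven by a four-state
machine (OFF, PREPARE, ON, SLEEP) in wave6-vpu.c. The sketch below is a
standalone userspace model of the same transition rules, written as a lookup
table instead of a switch. The names vpu_state, VPU_STATE_COUNT, BIT_OF(),
allowed_next, vpu_valid_transition() and vpu_set_state() are hypothetical
and exist only for this example; the allowed transitions are taken from
wave6_vpu_valid_transition() in this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the four values of enum wave6_vpu_state from the patch. */
enum vpu_state {
	VPU_STATE_OFF,
	VPU_STATE_PREPARE,
	VPU_STATE_ON,
	VPU_STATE_SLEEP,
	VPU_STATE_COUNT		/* hypothetical helper, not in the patch */
};

#define BIT_OF(s) (1u << (s))

/* For each current state, a bitmask of the states it may move to. */
static const unsigned int allowed_next[VPU_STATE_COUNT] = {
	/* OFF: first boot attempt, or firmware already running */
	[VPU_STATE_OFF]     = BIT_OF(VPU_STATE_PREPARE) | BIT_OF(VPU_STATE_ON),
	/* PREPARE: boot failed, or boot successful */
	[VPU_STATE_PREPARE] = BIT_OF(VPU_STATE_OFF) | BIT_OF(VPU_STATE_ON),
	/* ON: sleep failed, or sleep successful */
	[VPU_STATE_ON]      = BIT_OF(VPU_STATE_OFF) | BIT_OF(VPU_STATE_SLEEP),
	/* SLEEP: resume failed, or resume successful */
	[VPU_STATE_SLEEP]   = BIT_OF(VPU_STATE_OFF) | BIT_OF(VPU_STATE_ON),
};

static bool vpu_valid_transition(enum vpu_state cur, enum vpu_state next)
{
	if ((unsigned int)cur >= VPU_STATE_COUNT ||
	    (unsigned int)next >= VPU_STATE_COUNT)
		return false;

	return (allowed_next[cur] & BIT_OF(next)) != 0;
}

/* Returns the new state, or the unchanged state if the move is invalid. */
static enum vpu_state vpu_set_state(enum vpu_state cur, enum vpu_state next)
{
	return vpu_valid_transition(cur, next) ? next : cur;
}
```

As in the driver, an invalid request leaves the state unchanged. The table
form is only one possible layout; the switch statement in the patch encodes
the same rules.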
drivers/media/platform/chips-media/Kconfig | 1 +
drivers/media/platform/chips-media/Makefile | 1 +
.../media/platform/chips-media/wave6/Kconfig | 17 +
.../media/platform/chips-media/wave6/Makefile | 17 +
.../platform/chips-media/wave6/wave6-vpu.c | 816 ++++++++++++++++++
.../platform/chips-media/wave6/wave6-vpu.h | 143 +++
6 files changed, 995 insertions(+)
create mode 100644 drivers/media/platform/chips-media/wave6/Kconfig
create mode 100644 drivers/media/platform/chips-media/wave6/Makefile
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu.h
diff --git a/drivers/media/platform/chips-media/Kconfig b/drivers/media/platform/chips-media/Kconfig
index ad350eb6b1fc..8ef7fc8029a4 100644
--- a/drivers/media/platform/chips-media/Kconfig
+++ b/drivers/media/platform/chips-media/Kconfig
@@ -4,3 +4,4 @@ comment "Chips&Media media platform drivers"
source "drivers/media/platform/chips-media/coda/Kconfig"
source "drivers/media/platform/chips-media/wave5/Kconfig"
+source "drivers/media/platform/chips-media/wave6/Kconfig"
diff --git a/drivers/media/platform/chips-media/Makefile b/drivers/media/platform/chips-media/Makefile
index 6b5d99de8b54..b9a07a91c9d6 100644
--- a/drivers/media/platform/chips-media/Makefile
+++ b/drivers/media/platform/chips-media/Makefile
@@ -2,3 +2,4 @@
obj-y += coda/
obj-y += wave5/
+obj-y += wave6/
diff --git a/drivers/media/platform/chips-media/wave6/Kconfig b/drivers/media/platform/chips-media/wave6/Kconfig
new file mode 100644
index 000000000000..63d79c56c7fc
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/Kconfig
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0
+
+config VIDEO_WAVE6_VPU
+ tristate "Chips&Media Wave6 Codec Driver"
+ depends on V4L_MEM2MEM_DRIVERS
+ depends on VIDEO_DEV && OF
+ depends on ARCH_MXC || COMPILE_TEST
+ select VIDEOBUF2_DMA_CONTIG
+ select V4L2_MEM2MEM_DEV
+ select GENERIC_ALLOCATOR
+ help
+ Chips&Media Wave6 stateful codec driver.
+ The wave6 driver manages shared resources such as firmware memory.
+ The wave6-core driver provides encoding and decoding capabilities
+ for H.264, HEVC, and other video formats.
+ To compile this driver as modules, choose M here: the
+ modules will be called wave6 and wave6-core.
diff --git a/drivers/media/platform/chips-media/wave6/Makefile b/drivers/media/platform/chips-media/wave6/Makefile
new file mode 100644
index 000000000000..06f8ac9bef14
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/Makefile
@@ -0,0 +1,17 @@
+# SPDX-License-Identifier: GPL-2.0
+
+# tell define_trace.h where to find the trace header
+CFLAGS_wave6-vpu-core.o := -I$(src)
+
+wave6-objs += wave6-vpu.o \
+ wave6-vpu-thermal.o
+obj-$(CONFIG_VIDEO_WAVE6_VPU) += wave6.o
+
+wave6-core-objs += wave6-vpu-core.o \
+ wave6-vpu-v4l2.o \
+ wave6-vpu-dbg.o \
+ wave6-vpuapi.o \
+ wave6-vpu-dec.o \
+ wave6-vpu-enc.o \
+ wave6-hw.o
+obj-$(CONFIG_VIDEO_WAVE6_VPU) += wave6-core.o
diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu.c b/drivers/media/platform/chips-media/wave6/wave6-vpu.c
new file mode 100644
index 000000000000..ca8dded9bf86
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/wave6-vpu.c
@@ -0,0 +1,816 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+/*
+ * Wave6 series multi-standard codec IP - wave6 driver
+ *
+ * Copyright (C) 2025 CHIPS&MEDIA INC
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/clk.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_irq.h>
+#include <linux/of_platform.h>
+#include <linux/firmware.h>
+#include <linux/interrupt.h>
+#include <linux/pm_runtime.h>
+#include <linux/pm_domain.h>
+#include <linux/dma-mapping.h>
+#include <linux/iopoll.h>
+#include <linux/genalloc.h>
+
+#include "wave6-vpuconfig.h"
+#include "wave6-regdefine.h"
+#include "wave6-vpu.h"
+
+static const struct wave6_vpu_resource wave633c_data = {
+ .fw_name = "cnm/wave633c_imx9_codec_fw.bin",
+ /* For HEVC, AVC, 4096x4096, 8bit */
+ .sram_size = 0x14800,
+};
+
+static const struct wave6_vpu_core_resource wave633c_core_data = {
+ .codec_types = WAVE633_CODEC_TYPE,
+ .compatible_fw_version = WAVE633_COMPATIBLE_FW_VERSION,
+};
+
+static const char *wave6_vpu_state_name(enum wave6_vpu_state state)
+{
+ switch (state) {
+ case WAVE6_VPU_STATE_OFF:
+ return "off";
+ case WAVE6_VPU_STATE_PREPARE:
+ return "prepare";
+ case WAVE6_VPU_STATE_ON:
+ return "on";
+ case WAVE6_VPU_STATE_SLEEP:
+ return "sleep";
+ default:
+ return "unknown";
+ }
+}
+
+static bool wave6_vpu_valid_transition(struct wave6_vpu_device *vpu,
+ enum wave6_vpu_state next)
+{
+ switch (vpu->state) {
+ case WAVE6_VPU_STATE_OFF:
+ /* to PREPARE: first boot attempt */
+ /* to ON: already booted before, skipping boot */
+ if (next == WAVE6_VPU_STATE_PREPARE ||
+ next == WAVE6_VPU_STATE_ON)
+ return true;
+ break;
+ case WAVE6_VPU_STATE_PREPARE:
+ /* to OFF: boot failed */
+ /* to ON: boot successful */
+ if (next == WAVE6_VPU_STATE_OFF ||
+ next == WAVE6_VPU_STATE_ON)
+ return true;
+ break;
+ case WAVE6_VPU_STATE_ON:
+ /* to OFF: sleep failed */
+ /* to SLEEP: sleep successful */
+ if (next == WAVE6_VPU_STATE_OFF ||
+ next == WAVE6_VPU_STATE_SLEEP)
+ return true;
+ break;
+ case WAVE6_VPU_STATE_SLEEP:
+ /* to OFF: resume failed */
+ /* to ON: resume successful */
+ if (next == WAVE6_VPU_STATE_OFF ||
+ next == WAVE6_VPU_STATE_ON)
+ return true;
+ break;
+ }
+
+ dev_err(vpu->dev, "invalid transition: %s -> %s\n",
+ wave6_vpu_state_name(vpu->state), wave6_vpu_state_name(next));
+
+ return false;
+}
+
+static void wave6_vpu_set_state(struct wave6_vpu_device *vpu,
+ enum wave6_vpu_state state)
+{
+ if (!wave6_vpu_valid_transition(vpu, state))
+ return;
+
+ dev_dbg(vpu->dev, "set state: %s -> %s\n",
+ wave6_vpu_state_name(vpu->state), wave6_vpu_state_name(state));
+
+ vpu->state = state;
+}
+
+static int wave6_vpu_wait_busy(struct vpu_core_device *core)
+{
+ u32 val;
+
+ return read_poll_timeout(wave6_vdi_readl, val, !val,
+ W6_VPU_POLL_DELAY_US, W6_VPU_POLL_TIMEOUT,
+ false, core->reg_base, W6_VPU_BUSY_STATUS);
+}
+
+static int wave6_vpu_check_result(struct vpu_core_device *core)
+{
+ if (wave6_vdi_readl(core->reg_base, W6_RET_SUCCESS))
+ return 0;
+
+ return wave6_vdi_readl(core->reg_base, W6_RET_FAIL_REASON);
+}
+
+static u32 wave6_vpu_get_code_buf_size(struct wave6_vpu_device *vpu)
+{
+ return min_t(u32, vpu->code_buf.size, W6_MAX_CODE_BUF_SIZE);
+}
+
+static void wave6_vpu_remap_code_buf(struct wave6_vpu_device *vpu)
+{
+ dma_addr_t code_base = vpu->code_buf.dma_addr;
+ u32 i, reg_val;
+
+ for (i = 0; i < wave6_vpu_get_code_buf_size(vpu) / W6_MAX_REMAP_PAGE_SIZE; i++) {
+ reg_val = REMAP_CTRL_ON |
+ REMAP_CTRL_INDEX(i) |
+ REMAP_CTRL_PAGE_SIZE_ON |
+ REMAP_CTRL_PAGE_SIZE(W6_MAX_REMAP_PAGE_SIZE);
+ wave6_vdi_writel(vpu->reg_base, W6_VPU_REMAP_CTRL_GB, reg_val);
+ wave6_vdi_writel(vpu->reg_base, W6_VPU_REMAP_VADDR_GB,
+ i * W6_MAX_REMAP_PAGE_SIZE);
+ wave6_vdi_writel(vpu->reg_base, W6_VPU_REMAP_PADDR_GB,
+ code_base + i * W6_MAX_REMAP_PAGE_SIZE);
+ }
+}
+
+static void wave6_vpu_init_code_buf(struct wave6_vpu_device *vpu)
+{
+ if (vpu->code_buf.size < W6_CODE_BUF_SIZE) {
+ dev_warn(vpu->dev,
+ "code buf size (%zu) is too small\n", vpu->code_buf.size);
+ vpu->code_buf.phys_addr = 0;
+ vpu->code_buf.size = 0;
+ memset(&vpu->code_buf, 0, sizeof(vpu->code_buf));
+ return;
+ }
+
+ vpu->code_buf.vaddr = devm_memremap(vpu->dev,
+ vpu->code_buf.phys_addr,
+ vpu->code_buf.size,
+ MEMREMAP_WC);
+ if (!vpu->code_buf.vaddr) {
+ memset(&vpu->code_buf, 0, sizeof(vpu->code_buf));
+ return;
+ }
+
+ vpu->code_buf.dma_addr = dma_map_resource(vpu->dev,
+ vpu->code_buf.phys_addr,
+ vpu->code_buf.size,
+ DMA_BIDIRECTIONAL,
+ 0);
+ if (dma_mapping_error(vpu->dev, vpu->code_buf.dma_addr)) {
+ memset(&vpu->code_buf, 0, sizeof(vpu->code_buf));
+ return;
+ }
+}
+
+static void wave6_vpu_allocate_work_buffers(struct wave6_vpu_device *vpu)
+{
+ struct vpu_buf *buf;
+ int i;
+
+ for (i = 0; i < MAX_NUM_INSTANCE; i++) {
+ buf = &vpu->work_buffers[i];
+ buf->size = W637DEC_WORKBUF_SIZE_FOR_CQ;
+
+ if (wave6_vdi_alloc_dma(vpu->dev, buf)) {
+ dev_warn(vpu->dev, "Failed to allocate work_buffers\n");
+ return;
+ }
+
+ vpu->work_buffers_alloc++;
+ }
+}
+
+static void wave6_vpu_free_work_buffers(struct wave6_vpu_device *vpu)
+{
+ int i;
+
+ for (i = 0; i < vpu->work_buffers_alloc; i++)
+ wave6_vdi_free_dma(&vpu->work_buffers[i]);
+
+ vpu->work_buffers_alloc = 0;
+ vpu->work_buffers_avail = 0;
+}
+
+static void wave6_vpu_init_work_buf(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core)
+{
+ int ret;
+
+ lockdep_assert_held(&vpu->lock);
+
+ wave6_vdi_writel(core->reg_base, W6_VPU_BUSY_STATUS, BUSY_STATUS_SET);
+ wave6_vdi_writel(core->reg_base, W6_COMMAND, W6_CMD_INIT_WORK_BUF);
+ wave6_vdi_writel(core->reg_base, W6_VPU_HOST_INT_REQ, HOST_INT_REQ_ON);
+
+ ret = wave6_vpu_wait_busy(core);
+ if (ret) {
+ dev_err(vpu->dev, "init work buf failed\n");
+ return;
+ }
+
+ ret = wave6_vpu_check_result(core);
+ if (ret) {
+ dev_err(vpu->dev, "init work buf failed, reason 0x%x\n", ret);
+ return;
+ }
+
+ vpu->work_buffers_avail = vpu->work_buffers_alloc;
+}
+
+static int wave6_vpu_init_vpu(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core)
+{
+ int ret;
+
+ lockdep_assert_held(&vpu->lock);
+
+ /* try init directly as firmware is running */
+ if (wave6_vdi_readl(core->reg_base, W6_VPU_VCPU_CUR_PC))
+ goto init_done;
+
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_PREPARE);
+
+ wave6_vpu_remap_code_buf(vpu);
+
+ wave6_vdi_writel(core->reg_base, W6_VPU_BUSY_STATUS, BUSY_STATUS_SET);
+ wave6_vdi_writel(core->reg_base, W6_CMD_INIT_VPU_SEC_AXI_BASE_CORE0,
+ vpu->sram_buf.dma_addr);
+ wave6_vdi_writel(core->reg_base, W6_CMD_INIT_VPU_SEC_AXI_SIZE_CORE0,
+ vpu->sram_buf.size);
+ wave6_vdi_writel(vpu->reg_base, W6_COMMAND_GB, W6_CMD_INIT_VPU);
+ wave6_vdi_writel(vpu->reg_base, W6_VPU_REMAP_CORE_START_GB,
+ REMAP_CORE_START_ON);
+
+ ret = wave6_vpu_wait_busy(core);
+ if (ret) {
+ dev_err(vpu->dev, "init vpu timeout\n");
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_OFF);
+ return -EINVAL;
+ }
+
+ ret = wave6_vpu_check_result(core);
+ if (ret) {
+ dev_err(vpu->dev, "init vpu fail, reason 0x%x\n", ret);
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_OFF);
+ return -EIO;
+ }
+
+init_done:
+ wave6_vpu_init_work_buf(vpu, core);
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_ON);
+
+ return 0;
+}
+
+static int wave6_vpu_sleep(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core)
+{
+ int ret;
+
+ lockdep_assert_held(&vpu->lock);
+
+ if (!wave6_vdi_readl(core->reg_base, W6_VPU_VCPU_CUR_PC)) {
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_OFF);
+ return 0;
+ }
+
+ wave6_vdi_writel(core->reg_base, W6_VPU_BUSY_STATUS, BUSY_STATUS_SET);
+ wave6_vdi_writel(core->reg_base, W6_COMMAND, W6_CMD_SLEEP_VPU);
+ wave6_vdi_writel(core->reg_base, W6_VPU_HOST_INT_REQ, HOST_INT_REQ_ON);
+
+ ret = wave6_vpu_wait_busy(core);
+ if (ret) {
+ dev_err(vpu->dev, "sleep vpu timeout\n");
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_OFF);
+ return -EINVAL;
+ }
+
+ ret = wave6_vpu_check_result(core);
+ if (ret) {
+ dev_err(vpu->dev, "sleep vpu fail, reason 0x%x\n", ret);
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_OFF);
+ return -EIO;
+ }
+
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_SLEEP);
+
+ return 0;
+}
+
+static int wave6_vpu_wakeup(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core)
+{
+ int ret;
+
+ lockdep_assert_held(&vpu->lock);
+
+ /* try wakeup directly as firmware is running */
+ if (wave6_vdi_readl(core->reg_base, W6_VPU_VCPU_CUR_PC))
+ goto wakeup_done;
+
+ wave6_vpu_remap_code_buf(vpu);
+
+ wave6_vdi_writel(core->reg_base, W6_VPU_BUSY_STATUS, BUSY_STATUS_SET);
+ wave6_vdi_writel(core->reg_base, W6_CMD_INIT_VPU_SEC_AXI_BASE_CORE0,
+ vpu->sram_buf.dma_addr);
+ wave6_vdi_writel(core->reg_base, W6_CMD_INIT_VPU_SEC_AXI_SIZE_CORE0,
+ vpu->sram_buf.size);
+ wave6_vdi_writel(vpu->reg_base, W6_COMMAND_GB, W6_CMD_WAKEUP_VPU);
+ wave6_vdi_writel(vpu->reg_base, W6_VPU_REMAP_CORE_START_GB,
+ REMAP_CORE_START_ON);
+
+ ret = wave6_vpu_wait_busy(core);
+ if (ret) {
+ dev_err(vpu->dev, "wakeup vpu timeout\n");
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_OFF);
+ return -EINVAL;
+ }
+
+ ret = wave6_vpu_check_result(core);
+ if (ret) {
+ dev_err(vpu->dev, "wakeup vpu fail, reason 0x%x\n", ret);
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_OFF);
+ return -EIO;
+ }
+
+wakeup_done:
+ wave6_vpu_set_state(vpu, WAVE6_VPU_STATE_ON);
+
+ return 0;
+}
+
+static int wave6_vpu_try_boot(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core)
+{
+ u32 product_code;
+ int ret;
+
+ lockdep_assert_held(&vpu->lock);
+
+ if (vpu->state != WAVE6_VPU_STATE_OFF && vpu->state != WAVE6_VPU_STATE_SLEEP)
+ return 0;
+
+ product_code = wave6_vdi_readl(core->reg_base, W6_VPU_RET_PRODUCT_CODE);
+ if (!PRODUCT_CODE_W_SERIES(product_code)) {
+ dev_err(vpu->dev, "unknown product : %08x\n", product_code);
+ return -EINVAL;
+ }
+
+ if (vpu->state == WAVE6_VPU_STATE_SLEEP) {
+ ret = wave6_vpu_wakeup(vpu, core);
+ return ret;
+ }
+
+ ret = wave6_vpu_init_vpu(vpu, core);
+
+ return ret;
+}
+
+static int wave6_vpu_get(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core)
+{
+ int ret;
+
+ if (WARN_ON(!vpu || !core))
+ return -EINVAL;
+
+ guard(mutex)(&vpu->lock);
+
+ if (!vpu->fw_available)
+ return -EINVAL;
+
+ /* Only the first core executes boot; others return */
+ if (atomic_inc_return(&vpu->core_count) > 1)
+ return 0;
+
+ ret = pm_runtime_resume_and_get(vpu->dev);
+ if (ret)
+ goto error_pm;
+
+ ret = wave6_vpu_try_boot(vpu, core);
+ if (ret)
+ goto error_boot;
+
+ return 0;
+
+error_boot:
+ pm_runtime_put_sync(vpu->dev);
+error_pm:
+ atomic_dec(&vpu->core_count);
+
+ return ret;
+}
+
+static void wave6_vpu_put(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core)
+{
+ if (WARN_ON(!vpu || !core))
+ return;
+
+ guard(mutex)(&vpu->lock);
+
+ if (!vpu->fw_available)
+ return;
+
+ /* Only the last core executes sleep; others return */
+ if (atomic_dec_return(&vpu->core_count) > 0)
+ return;
+
+ wave6_vpu_sleep(vpu, core);
+
+ if (!pm_runtime_suspended(vpu->dev))
+ pm_runtime_put_sync(vpu->dev);
+}
+
+static void wave6_vpu_require_work_buffer(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core)
+{
+ struct vpu_buf *vb;
+ u32 size;
+
+ if (WARN_ON(!vpu || !core))
+ return;
+
+ size = wave6_vdi_readl(core->reg_base, W6_CMD_SET_WORK_BUF_SIZE);
+ if (!size)
+ return;
+
+ if (WARN_ON(size > W637DEC_WORKBUF_SIZE_FOR_CQ))
+ goto exit;
+
+ if (WARN_ON(vpu->work_buffers_avail <= 0))
+ goto exit;
+
+ vpu->work_buffers_avail--;
+ vb = &vpu->work_buffers[vpu->work_buffers_avail];
+
+ wave6_vdi_writel(core->reg_base, W6_CMD_SET_WORK_BUF_ADDR, vb->daddr);
+
+exit:
+ wave6_vdi_writel(core->reg_base, W6_CMD_SET_WORK_BUF_SIZE, SET_WORK_BUF_SIZE_ACK);
+}
+
+static int wave6_vpu_create_cores(struct wave6_vpu_device *vpu)
+{
+ struct device_node *child;
+ int num_cores = 0;
+
+ for_each_available_child_of_node(vpu->dev->of_node, child) {
+ struct platform_device_info info = {};
+ struct platform_device *pdev;
+ struct resource res[2];
+ int irq;
+
+ if (num_cores >= W6_VPU_MAX_NUM_CORE) {
+ of_node_put(child);
+ break;
+ }
+
+ if (of_address_to_resource(child, 0, &res[0])) {
+ dev_warn(vpu->dev, "%pOF: missing reg property\n", child);
+ continue;
+ }
+
+ irq = of_irq_get(child, 0);
+ if (irq < 0) {
+ dev_warn(vpu->dev, "%pOF: missing interrupts property\n", child);
+ continue;
+ }
+ res[1] = DEFINE_RES_IRQ(irq);
+
+ info.fwnode = of_fwnode_handle(child);
+ info.parent = vpu->dev;
+ info.name = WAVE6_VPU_CORE_PLATFORM_DRIVER_NAME;
+ info.id = num_cores;
+ info.dma_mask = DMA_BIT_MASK(32);
+ info.res = res;
+ info.num_res = ARRAY_SIZE(res);
+ info.data = &wave633c_core_data;
+ info.size_data = sizeof(wave633c_core_data);
+
+ pdev = platform_device_register_full(&info);
+ if (IS_ERR(pdev)) {
+ dev_err(vpu->dev, "Failed to register core %d: %ld\n",
+ num_cores, PTR_ERR(pdev));
+ continue;
+ }
+
+ vpu->core_pdevs[num_cores] = pdev;
+ num_cores++;
+ }
+
+ return num_cores;
+}
+
+static void wave6_vpu_destroy_cores(struct wave6_vpu_device *vpu)
+{
+ int i;
+
+ for (i = 0; i < W6_VPU_MAX_NUM_CORE; i++) {
+ struct platform_device *pdev = vpu->core_pdevs[i];
+
+ if (!pdev)
+ continue;
+
+ platform_device_unregister(pdev);
+ vpu->core_pdevs[i] = NULL;
+ }
+}
+
+static void wave6_vpu_release(struct wave6_vpu_device *vpu)
+{
+ guard(mutex)(&vpu->lock);
+
+ vpu->fw_available = false;
+ wave6_vpu_destroy_cores(vpu);
+ wave6_vpu_free_work_buffers(vpu);
+ if (vpu->sram_pool && vpu->sram_buf.vaddr) {
+ dma_unmap_resource(vpu->dev,
+ vpu->sram_buf.dma_addr,
+ vpu->sram_buf.size,
+ DMA_BIDIRECTIONAL,
+ 0);
+ gen_pool_free(vpu->sram_pool,
+ (unsigned long)vpu->sram_buf.vaddr,
+ vpu->sram_buf.size);
+ }
+ if (vpu->code_buf.dma_addr)
+ dma_unmap_resource(vpu->dev,
+ vpu->code_buf.dma_addr,
+ vpu->code_buf.size,
+ DMA_BIDIRECTIONAL,
+ 0);
+}
+
+static void wave6_vpu_load_firmware(const struct firmware *fw, void *context)
+{
+ struct wave6_vpu_device *vpu = context;
+
+ guard(mutex)(&vpu->lock);
+
+ if (!fw || !fw->data) {
+ dev_err(vpu->dev, "No firmware.\n");
+ return;
+ }
+
+ if (!vpu->fw_available)
+ goto exit;
+
+ if (fw->size + W6_EXTRA_CODE_BUF_SIZE > wave6_vpu_get_code_buf_size(vpu)) {
+ dev_err(vpu->dev, "firmware size (%zu > %zu) is too big\n",
+ fw->size, vpu->code_buf.size);
+ vpu->fw_available = false;
+ goto exit;
+ }
+
+ memcpy(vpu->code_buf.vaddr, fw->data, fw->size);
+
+ vpu->get_vpu = wave6_vpu_get;
+ vpu->put_vpu = wave6_vpu_put;
+ vpu->req_work_buffer = wave6_vpu_require_work_buffer;
+ request_module("platform:%s", WAVE6_VPU_CORE_PLATFORM_DRIVER_NAME);
+ if (!wave6_vpu_create_cores(vpu)) {
+ dev_err(vpu->dev, "Failed to create VPU cores\n");
+ vpu->fw_available = false;
+ }
+
+exit:
+ release_firmware(fw);
+}
+
+static void wave6_vpu_detach_pm_domains(struct wave6_vpu_device *vpu)
+{
+ if (!vpu->num_pm_domains)
+ return;
+
+ for (int i = 0; i < vpu->num_pm_domains; i++) {
+ struct device *pd_dev = vpu->pd_list->pd_devs[i];
+
+ if (!IS_ERR_OR_NULL(pd_dev) && !pm_runtime_suspended(pd_dev))
+ pm_runtime_force_suspend(pd_dev);
+ }
+
+ dev_pm_domain_detach_list(vpu->pd_list);
+ vpu->pd_list = NULL;
+ vpu->num_pm_domains = 0;
+}
+
+static int wave6_vpu_attach_pm_domains(struct wave6_vpu_device *vpu)
+{
+ int ret;
+
+ vpu->num_pm_domains = of_count_phandle_with_args(vpu->dev->of_node,
+ "power-domains",
+ "#power-domain-cells");
+ if (vpu->num_pm_domains < 0) {
+ dev_err(vpu->dev, "No power domains defined for vpu node\n");
+ return vpu->num_pm_domains;
+ }
+
+ if (vpu->num_pm_domains == 1) {
+ /* genpd_dev_pm_attach() attaches automatically if count is 1 */
+ vpu->num_pm_domains = 0;
+ return 0;
+ }
+
+ ret = dev_pm_domain_attach_list(vpu->dev, NULL, &vpu->pd_list);
+ if (ret < 0) {
+ dev_err(vpu->dev, "Can't attach pm domains, ret = %d\n", ret);
+ return ret;
+ }
+
+ vpu->num_pm_domains = ret;
+
+ return 0;
+}
+
+static struct device *wave6_vpu_get_performance_dev(struct wave6_vpu_device *vpu)
+{
+ int index;
+
+ if (!vpu->num_pm_domains)
+ return vpu->dev;
+
+ index = of_property_match_string(vpu->dev->of_node,
+ "power-domain-names", "perf");
+ if (index < 0 || index >= vpu->num_pm_domains)
+ return NULL;
+
+ return vpu->pd_list->pd_devs[index];
+}
+
+static int wave6_vpu_probe(struct platform_device *pdev)
+{
+ struct device_node *np;
+ struct wave6_vpu_device *vpu;
+ const struct wave6_vpu_resource *res;
+ int ret;
+
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+ if (ret < 0) {
+ dev_err(&pdev->dev, "failed to set DMA mask: %d\n", ret);
+ return ret;
+ }
+
+ res = of_device_get_match_data(&pdev->dev);
+ if (!res)
+ return -ENODEV;
+
+ vpu = devm_kzalloc(&pdev->dev, sizeof(*vpu), GFP_KERNEL);
+ if (!vpu)
+ return -ENOMEM;
+
+ ret = devm_mutex_init(&pdev->dev, &vpu->lock);
+ if (ret)
+ return ret;
+
+ atomic_set(&vpu->core_count, 0);
+ dev_set_drvdata(&pdev->dev, vpu);
+ vpu->dev = &pdev->dev;
+ vpu->res = res;
+ vpu->reg_base = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(vpu->reg_base))
+ return PTR_ERR(vpu->reg_base);
+
+ ret = devm_clk_bulk_get_all(&pdev->dev, &vpu->clks);
+ if (ret < 0)
+ return dev_err_probe(&pdev->dev, ret, "failed to get clocks\n");
+
+ vpu->num_clks = ret;
+
+ ret = wave6_vpu_attach_pm_domains(vpu);
+ if (ret)
+ return dev_err_probe(&pdev->dev, ret, "failed to attach pm domains\n");
+
+ np = of_parse_phandle(pdev->dev.of_node, "memory-region", 0);
+ if (np) {
+ struct resource mem;
+
+ ret = of_address_to_resource(np, 0, &mem);
+ of_node_put(np);
+ if (!ret) {
+ vpu->code_buf.phys_addr = mem.start;
+ vpu->code_buf.size = resource_size(&mem);
+ wave6_vpu_init_code_buf(vpu);
+ } else {
+ dev_warn(&pdev->dev, "memory-region is not available.\n");
+ }
+ }
+
+ vpu->sram_pool = of_gen_pool_get(pdev->dev.of_node, "sram", 0);
+ if (vpu->sram_pool) {
+ vpu->sram_buf.size = vpu->res->sram_size;
+ vpu->sram_buf.vaddr = gen_pool_dma_alloc(vpu->sram_pool,
+ vpu->sram_buf.size,
+ &vpu->sram_buf.phys_addr);
+ if (!vpu->sram_buf.vaddr)
+ vpu->sram_buf.size = 0;
+ else
+ vpu->sram_buf.dma_addr = dma_map_resource(&pdev->dev,
+ vpu->sram_buf.phys_addr,
+ vpu->sram_buf.size,
+ DMA_BIDIRECTIONAL,
+ 0);
+ }
+
+ vpu->thermal.dev = wave6_vpu_get_performance_dev(vpu);
+ vpu->thermal.of_node = vpu->dev->of_node;
+ if (vpu->thermal.dev) {
+ ret = wave6_vpu_cooling_init(&vpu->thermal);
+ if (ret)
+ dev_err(&pdev->dev, "failed to initialize thermal cooling, %d\n", ret);
+ }
+
+ wave6_vpu_allocate_work_buffers(vpu);
+
+ pm_runtime_enable(&pdev->dev);
+ vpu->fw_available = true;
+
+ ret = firmware_request_nowait_nowarn(THIS_MODULE,
+ vpu->res->fw_name,
+ &pdev->dev,
+ GFP_KERNEL,
+ vpu,
+ wave6_vpu_load_firmware);
+ if (ret) {
+ dev_err(&pdev->dev, "request firmware fail, ret = %d\n", ret);
+ goto error;
+ }
+
+ return 0;
+
+error:
+ pm_runtime_disable(&pdev->dev);
+ if (vpu->thermal.dev)
+ wave6_vpu_cooling_remove(&vpu->thermal);
+ wave6_vpu_release(vpu);
+ wave6_vpu_detach_pm_domains(vpu);
+
+ return ret;
+}
+
+static void wave6_vpu_remove(struct platform_device *pdev)
+{
+ struct wave6_vpu_device *vpu = dev_get_drvdata(&pdev->dev);
+
+ pm_runtime_disable(vpu->dev);
+ if (vpu->thermal.dev)
+ wave6_vpu_cooling_remove(&vpu->thermal);
+ wave6_vpu_release(vpu);
+ wave6_vpu_detach_pm_domains(vpu);
+}
+
+static int __maybe_unused wave6_vpu_runtime_suspend(struct device *dev)
+{
+ struct wave6_vpu_device *vpu = dev_get_drvdata(dev);
+
+ clk_bulk_disable_unprepare(vpu->num_clks, vpu->clks);
+
+ return 0;
+}
+
+static int __maybe_unused wave6_vpu_runtime_resume(struct device *dev)
+{
+ struct wave6_vpu_device *vpu = dev_get_drvdata(dev);
+
+ return clk_bulk_prepare_enable(vpu->num_clks, vpu->clks);
+}
+
+static const struct dev_pm_ops wave6_vpu_pm_ops = {
+ SET_RUNTIME_PM_OPS(wave6_vpu_runtime_suspend,
+ wave6_vpu_runtime_resume, NULL)
+};
+
+static const struct of_device_id wave6_vpu_ids[] = {
+ { .compatible = "nxp,imx95-vpu", .data = &wave633c_data },
+ { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, wave6_vpu_ids);
+
+static struct platform_driver wave6_vpu_driver = {
+ .driver = {
+ .name = WAVE6_VPU_PLATFORM_DRIVER_NAME,
+ .of_match_table = wave6_vpu_ids,
+ .pm = &wave6_vpu_pm_ops,
+ },
+ .probe = wave6_vpu_probe,
+ .remove = wave6_vpu_remove,
+};
+
+module_platform_driver(wave6_vpu_driver);
+MODULE_DESCRIPTION("chips&media Wave6 VPU driver");
+MODULE_LICENSE("Dual BSD/GPL");
diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu.h b/drivers/media/platform/chips-media/wave6/wave6-vpu.h
new file mode 100644
index 000000000000..ec3c9299526b
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/wave6-vpu.h
@@ -0,0 +1,143 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
+/*
+ * Wave6 series multi-standard codec IP - wave6 driver
+ *
+ * Copyright (C) 2025 CHIPS&MEDIA INC
+ */
+
+#ifndef __WAVE6_VPU_H__
+#define __WAVE6_VPU_H__
+
+#include <linux/device.h>
+#include "wave6-vpu-thermal.h"
+#include "wave6-vdi.h"
+#include "wave6-vpuapi.h"
+
+#define WAVE6_VPU_PLATFORM_DRIVER_NAME "wave6-vpu"
+#define WAVE6_VPU_CORE_PLATFORM_DRIVER_NAME "wave6-vpu-core"
+
+struct wave6_vpu_device;
+struct vpu_core_device;
+
+/**
+ * enum wave6_vpu_state - VPU states
+ * @WAVE6_VPU_STATE_OFF: VPU is powered off
+ * @WAVE6_VPU_STATE_PREPARE: VPU is booting
+ * @WAVE6_VPU_STATE_ON: VPU is running
+ * @WAVE6_VPU_STATE_SLEEP: VPU is in a sleep mode
+ */
+enum wave6_vpu_state {
+ WAVE6_VPU_STATE_OFF,
+ WAVE6_VPU_STATE_PREPARE,
+ WAVE6_VPU_STATE_ON,
+ WAVE6_VPU_STATE_SLEEP
+};
+
+/**
+ * struct wave6_vpu_dma_buf - VPU buffer from reserved memory or gen_pool
+ * @size: Buffer size
+ * @dma_addr: Mapped address for device access
+ * @vaddr: Kernel virtual address
+ * @phys_addr: Physical address of the reserved memory region or gen_pool
+ *
+ * Represents a buffer allocated from pre-reserved device memory regions or
+ * SRAM via gen_pool_dma_alloc(). Used for code and SRAM buffers only.
+ * Managed by the VPU device.
+ */
+struct wave6_vpu_dma_buf {
+ size_t size;
+ dma_addr_t dma_addr;
+ void *vaddr;
+ phys_addr_t phys_addr;
+};
+
+/**
+ * struct wave6_vpu_resource - VPU device compatible data
+ * @fw_name: Firmware name for the device
+ * @sram_size: Required SRAM size
+ */
+struct wave6_vpu_resource {
+ const char *fw_name;
+ u32 sram_size;
+};
+
+#define WAVE6_IS_ENC BIT(0)
+#define WAVE6_IS_DEC BIT(1)
+
+#define WAVE633_CODEC_TYPE (WAVE6_IS_ENC | WAVE6_IS_DEC)
+#define WAVE633_COMPATIBLE_FW_VERSION 0x2010000
+
+/**
+ * struct wave6_vpu_core_resource - VPU CORE device compatible data
+ * @codec_types: Bitmask of supported codec types
+ * @compatible_fw_version: Firmware version compatible with driver
+ */
+struct wave6_vpu_core_resource {
+ int codec_types;
+ u32 compatible_fw_version;
+};
+
+/**
+ * struct wave6_vpu_device - VPU driver structure
+ * @get_vpu: Function pointer, boot or wake the device
+ * @put_vpu: Function pointer, power off or suspend the device
+ * @req_work_buffer: Function pointer, request allocation of a work buffer
+ * @dev: Platform device pointer
+ * @reg_base: Base address of MMIO registers
+ * @clks: Array of clock handles
+ * @num_clks: Number of entries in @clks
+ * @state: Device state
+ * @lock: Mutex protecting device data, register access
+ * @fw_available: Firmware availability flag
+ * @res: Device compatible data
+ * @sram_pool: Genalloc pool for SRAM allocations
+ * @sram_buf: Optional SRAM buffer
+ * @code_buf: Firmware code buffer
+ * @work_buffers: Array of work buffers
+ * @work_buffers_alloc: Number of allocated work buffers
+ * @work_buffers_avail: Number of available work buffers
+ * @thermal: Thermal cooling device
+ * @core_count: Number of available VPU core devices
+ *
+ * @get_vpu, @put_vpu, @req_work_buffer are called by VPU core devices.
+ *
+ * Buffers such as @sram_buf, @code_buf, and @work_buffers are managed
+ * by the VPU device and accessed exclusively by the firmware.
+ */
+struct wave6_vpu_device {
+ int (*get_vpu)(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core);
+ void (*put_vpu)(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core);
+ void (*req_work_buffer)(struct wave6_vpu_device *vpu,
+ struct vpu_core_device *core);
+ struct device *dev;
+ void __iomem *reg_base;
+ struct clk_bulk_data *clks;
+ int num_clks;
+ enum wave6_vpu_state state;
+ struct mutex lock; /* Protects device data, register access */
+
+ /* Prevents boot or sleep sequence if firmware is unavailable. */
+ bool fw_available;
+
+ const struct wave6_vpu_resource *res;
+ struct gen_pool *sram_pool;
+ struct wave6_vpu_dma_buf sram_buf;
+ struct wave6_vpu_dma_buf code_buf;
+
+ /* Allocated per instance; each buffer stores instance-specific data. */
+ struct vpu_buf work_buffers[MAX_NUM_INSTANCE];
+ u32 work_buffers_alloc;
+ u32 work_buffers_avail;
+
+ struct vpu_thermal_cooling thermal;
+ atomic_t core_count;
+
+ int num_pm_domains;
+ struct dev_pm_domain_list *pd_list;
+
+ struct platform_device *core_pdevs[W6_VPU_MAX_NUM_CORE];
+};
+
+#endif /* __WAVE6_VPU_H__ */
--
2.31.1
* [RFC PATCH v5 7/9] media: chips-media: wave6: Add Wave6 thermal cooling device
From: Nas Chung @ 2026-04-15 9:25 UTC
To: mchehab, hverkuil, robh, krzk+dt, conor+dt, shawnguo, s.hauer
Cc: linux-media, devicetree, linux-kernel, linux-imx,
linux-arm-kernel, marek.vasut, ming.qian, Nas Chung
In-Reply-To: <20260415092529.577-1-nas.chung@chipsnmedia.com>
Add a thermal cooling device for the Wave6 VPU.
The device operates within the Linux thermal framework,
adjusting the VPU performance state based on thermal conditions.
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
Tested-by: Ming Qian <ming.qian@oss.nxp.com>
Tested-by: Marek Vasut <marek.vasut@mailbox.org>
---
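For readers following the description above, the mapping from cooling state to frequency can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the driver code: `opp_floor()`, `build_freq_table()` and `MIN_FREQ_HZ` are hypothetical names, and `opp_floor()` merely stands in for the kernel's floor lookup over the OPP table. It mirrors the loop in wave6_vpu_cooling_init(), which walks the OPPs from the highest frequency downwards and stops at the first one below 100 MHz.

```c
#include <stddef.h>

/* Mirrors the 100 * HZ_PER_MHZ cut-off used in the patch. */
#define MIN_FREQ_HZ 100000000UL

/*
 * Hypothetical stand-in for the kernel's OPP floor lookup: return the
 * highest entry of opps[] that is <= limit, or 0 if there is none.
 */
static unsigned long opp_floor(const unsigned long *opps, size_t n,
			       unsigned long limit)
{
	unsigned long best = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (opps[i] <= limit && opps[i] > best)
			best = opps[i];

	return best;
}

/*
 * Fill table[] with OPP frequencies in descending order, so that cooling
 * state 0 is the fastest OPP and higher states are progressively slower.
 * Returns the highest valid state index, or -1 if no OPP qualifies.
 */
static int build_freq_table(const unsigned long *opps, size_t n,
			    unsigned long *table)
{
	unsigned long freq = ~0UL;
	int max_state = -1;
	size_t i;

	for (i = 0; i < n; i++, freq--) {
		freq = opp_floor(opps, n, freq);
		if (freq == 0 || freq < MIN_FREQ_HZ)
			break;

		table[i] = freq;
		max_state = (int)i;
	}

	return max_state;
}
```

With OPPs of 200, 800, 400 and 50 MHz the table becomes 800, 400, 200 MHz and the highest state is 2. Note that the patch itself treats a highest state of 0 (a single usable OPP) as an error, which this sketch does not reproduce.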
.../chips-media/wave6/wave6-vpu-thermal.c | 141 ++++++++++++++++++
.../chips-media/wave6/wave6-vpu-thermal.h | 26 ++++
2 files changed, 167 insertions(+)
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-thermal.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-thermal.h
diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-thermal.c b/drivers/media/platform/chips-media/wave6/wave6-vpu-thermal.c
new file mode 100644
index 000000000000..df8161fd4998
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-thermal.c
@@ -0,0 +1,141 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+/*
+ * Wave6 series multi-standard codec IP - wave6 thermal cooling interface
+ *
+ * Copyright (C) 2025 CHIPS&MEDIA INC
+ *
+ */
+
+#include <linux/pm_domain.h>
+#include <linux/pm_opp.h>
+#include <linux/units.h>
+#include <linux/slab.h>
+#include "wave6-vpu-thermal.h"
+
+static int wave6_vpu_thermal_cooling_update(struct vpu_thermal_cooling *thermal,
+ int state)
+{
+ unsigned long new_clock_rate;
+ int ret;
+
+ if (state > thermal->thermal_max || !thermal->cooling)
+ return 0;
+
+ new_clock_rate = DIV_ROUND_UP(thermal->freq_table[state], HZ_PER_KHZ);
+ dev_dbg(thermal->dev, "received cooling state: %d, new clock rate %lu\n",
+ state, new_clock_rate);
+
+ ret = dev_pm_genpd_set_performance_state(thermal->dev, new_clock_rate);
+ if (ret && ret != -ENODEV && ret != -EOPNOTSUPP) {
+ dev_err(thermal->dev, "failed to set perf to %lu, ret = %d\n",
+ new_clock_rate, ret);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int wave6_vpu_cooling_get_max_state(struct thermal_cooling_device *cdev,
+ unsigned long *state)
+{
+ struct vpu_thermal_cooling *thermal = cdev->devdata;
+
+ *state = thermal->thermal_max;
+
+ return 0;
+}
+
+static int wave6_vpu_cooling_get_cur_state(struct thermal_cooling_device *cdev,
+ unsigned long *state)
+{
+ struct vpu_thermal_cooling *thermal = cdev->devdata;
+
+ *state = thermal->thermal_event;
+
+ return 0;
+}
+
+static int wave6_vpu_cooling_set_cur_state(struct thermal_cooling_device *cdev,
+ unsigned long state)
+{
+ struct vpu_thermal_cooling *thermal = cdev->devdata;
+
+ thermal->thermal_event = state;
+ wave6_vpu_thermal_cooling_update(thermal, state);
+
+ return 0;
+}
+
+static const struct thermal_cooling_device_ops wave6_cooling_ops = {
+ .get_max_state = wave6_vpu_cooling_get_max_state,
+ .get_cur_state = wave6_vpu_cooling_get_cur_state,
+ .set_cur_state = wave6_vpu_cooling_set_cur_state,
+};
+
+int wave6_vpu_cooling_init(struct vpu_thermal_cooling *thermal)
+{
+ int i;
+ int num_opps;
+ unsigned long freq;
+
+ if (WARN_ON(!thermal || !thermal->dev))
+ return -EINVAL;
+
+ num_opps = dev_pm_opp_get_opp_count(thermal->dev);
+ if (num_opps < 0) {
+ dev_err(thermal->dev, "failed to get OPP count, ret = %d\n", num_opps);
+ return num_opps;
+ }
+ if (num_opps == 0) {
+ dev_err(thermal->dev, "no OPP entries found\n");
+ return -ENODEV;
+ }
+
+ thermal->freq_table = kcalloc(num_opps, sizeof(*thermal->freq_table), GFP_KERNEL);
+ if (!thermal->freq_table)
+ goto error;
+
+ for (i = 0, freq = ULONG_MAX; i < num_opps; i++, freq--) {
+ struct dev_pm_opp *opp;
+
+ opp = dev_pm_opp_find_freq_floor(thermal->dev, &freq);
+ if (IS_ERR(opp))
+ break;
+
+ dev_pm_opp_put(opp);
+
+ dev_dbg(thermal->dev, "[%d] = %lu\n", i, freq);
+ if (freq < 100 * HZ_PER_MHZ)
+ break;
+
+ thermal->freq_table[i] = freq;
+ thermal->thermal_max = i;
+ }
+
+ if (!thermal->thermal_max)
+ goto error;
+
+ thermal->thermal_event = 0;
+ thermal->cooling = thermal_of_cooling_device_register(thermal->dev->of_node,
+ dev_name(thermal->dev),
+ thermal,
+ &wave6_cooling_ops);
+ if (IS_ERR(thermal->cooling)) {
+ dev_err(thermal->dev, "register cooling device failed\n");
+ goto error;
+ }
+
+ return 0;
+
+error:
+ wave6_vpu_cooling_remove(thermal);
+
+ return -EINVAL;
+}
+
+void wave6_vpu_cooling_remove(struct vpu_thermal_cooling *thermal)
+{
+ thermal_cooling_device_unregister(thermal->cooling);
+ kfree(thermal->freq_table);
+ thermal->freq_table = NULL;
+}
diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-thermal.h b/drivers/media/platform/chips-media/wave6/wave6-vpu-thermal.h
new file mode 100644
index 000000000000..16e86e99540a
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-thermal.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
+/*
+ * Wave6 series multi-standard codec IP - wave6 thermal cooling interface
+ *
+ * Copyright (C) 2025 CHIPS&MEDIA INC
+ *
+ */
+
+#ifndef __WAVE6_VPU_THERMAL_H__
+#define __WAVE6_VPU_THERMAL_H__
+
+#include <linux/thermal.h>
+
+struct vpu_thermal_cooling {
+ struct device *dev;
+ struct device_node *of_node;
+ int thermal_event;
+ int thermal_max;
+ struct thermal_cooling_device *cooling;
+ unsigned long *freq_table;
+};
+
+int wave6_vpu_cooling_init(struct vpu_thermal_cooling *thermal);
+void wave6_vpu_cooling_remove(struct vpu_thermal_cooling *thermal);
+
+#endif /* __WAVE6_VPU_THERMAL_H__ */
--
2.31.1
* [RFC PATCH v5 6/9] media: chips-media: wave6: Improve debugging capabilities
From: Nas Chung @ 2026-04-15 9:25 UTC
To: mchehab, hverkuil, robh, krzk+dt, conor+dt, shawnguo, s.hauer
Cc: linux-media, devicetree, linux-kernel, linux-imx,
linux-arm-kernel, marek.vasut, ming.qian, Nas Chung
In-Reply-To: <20260415092529.577-1-nas.chung@chipsnmedia.com>
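Add debugfs entries and trace events for the Wave6 core driver.
Each instance gets a read-only debugfs file that reports its state,
formats, buffer counts and timing statistics. The trace events cover
register accesses, firmware commands, interrupts, instance state
changes and per-frame encode/decode results.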
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
Tested-by: Ming Qian <ming.qian@oss.nxp.com>
Tested-by: Marek Vasut <marek.vasut@mailbox.org>
---
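Both the trace events and the debugfs output in this patch print the V4L2 pixel format as four characters by shifting the 32-bit value. As a reading aid, here is a minimal userspace sketch of that unpacking; `fourcc_to_str()` is a hypothetical helper that is not part of the patch.

```c
#include <stdint.h>

/*
 * Unpack a V4L2-style FourCC into a NUL-terminated string.
 * The first character lives in the least significant byte, which is why
 * the patch prints pixelformat, then >> 8, >> 16 and >> 24.
 */
static void fourcc_to_str(uint32_t fourcc, char out[5])
{
	out[0] = (char)(fourcc & 0xff);
	out[1] = (char)((fourcc >> 8) & 0xff);
	out[2] = (char)((fourcc >> 16) & 0xff);
	out[3] = (char)((fourcc >> 24) & 0xff);
	out[4] = '\0';
}
```

For example, the value built from 'N', 'V', '1', '2' in that byte order decodes back to the string "NV12".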
.../platform/chips-media/wave6/wave6-trace.h | 289 ++++++++++++++++++
.../chips-media/wave6/wave6-vpu-dbg.c | 225 ++++++++++++++
.../chips-media/wave6/wave6-vpu-dbg.h | 14 +
3 files changed, 528 insertions(+)
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-trace.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-dbg.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-dbg.h
diff --git a/drivers/media/platform/chips-media/wave6/wave6-trace.h b/drivers/media/platform/chips-media/wave6/wave6-trace.h
new file mode 100644
index 000000000000..2c80923e2f29
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/wave6-trace.h
@@ -0,0 +1,289 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
+/*
+ * Wave6 series multi-standard codec IP - wave6 driver tracer
+ *
+ * Copyright (C) 2025 CHIPS&MEDIA INC
+ */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM wave6
+
+#if !defined(__WAVE6_TRACE_H__) || defined(TRACE_HEADER_MULTI_READ)
+#define __WAVE6_TRACE_H__
+
+#include <linux/tracepoint.h>
+#include <media/videobuf2-v4l2.h>
+
+DECLARE_EVENT_CLASS(wave6_vpu_register_access,
+ TP_PROTO(struct device *dev, u32 addr, u32 value),
+ TP_ARGS(dev, addr, value),
+ TP_STRUCT__entry(__string(name, dev_name(dev))
+ __field(u32, addr)
+ __field(u32, value)),
+ TP_fast_assign(__assign_str(name);
+ __entry->addr = addr;
+ __entry->value = value;),
+ TP_printk("%s:0x%03x 0x%08x",
+ __get_str(name), __entry->addr, __entry->value));
+
+DEFINE_EVENT(wave6_vpu_register_access, wave6_vpu_writel,
+ TP_PROTO(struct device *dev, u32 addr, u32 value),
+ TP_ARGS(dev, addr, value));
+DEFINE_EVENT(wave6_vpu_register_access, wave6_vpu_readl,
+ TP_PROTO(struct device *dev, u32 addr, u32 value),
+ TP_ARGS(dev, addr, value));
+
+TRACE_EVENT(wave6_vpu_send_command,
+ TP_PROTO(struct vpu_core_device *core, u32 id, u32 std, u32 cmd),
+ TP_ARGS(core, id, std, cmd),
+ TP_STRUCT__entry(__string(name, dev_name(core->dev))
+ __field(u32, id)
+ __field(u32, std)
+ __field(u32, cmd)),
+ TP_fast_assign(__assign_str(name);
+ __entry->id = id;
+ __entry->std = std;
+ __entry->cmd = cmd;),
+ TP_printk("%s: inst id %d, std 0x%x, cmd 0x%x",
+ __get_str(name), __entry->id,
+ __entry->std, __entry->cmd));
+
+TRACE_EVENT(wave6_vpu_irq,
+ TP_PROTO(struct vpu_core_device *core, u32 irq, u32 idc),
+ TP_ARGS(core, irq, idc),
+ TP_STRUCT__entry(__string(name, dev_name(core->dev))
+ __field(u32, irq)
+ __field(u32, idc)),
+ TP_fast_assign(__assign_str(name);
+ __entry->irq = irq;
+ __entry->idc = idc;),
+ TP_printk("%s: irq 0x%x, idc 0x%x",
+ __get_str(name), __entry->irq, __entry->idc));
+
+TRACE_EVENT(wave6_vpu_set_state,
+ TP_PROTO(struct vpu_instance *inst, u32 state),
+ TP_ARGS(inst, state),
+ TP_STRUCT__entry(__string(name, dev_name(inst->dev->dev))
+ __field(u32, id)
+ __string(cur_state, wave6_vpu_instance_state_name(inst->state))
+ __string(nxt_state, wave6_vpu_instance_state_name(state))),
+ TP_fast_assign(__assign_str(name);
+ __entry->id = inst->id;
+ __assign_str(cur_state);
+ __assign_str(nxt_state);),
+ TP_printk("%s: inst[%d] set state %s -> %s",
+ __get_str(name), __entry->id,
+ __get_str(cur_state), __get_str(nxt_state)));
+
+DECLARE_EVENT_CLASS(wave6_vpu_inst_internal,
+ TP_PROTO(struct vpu_instance *inst, bool is_out),
+ TP_ARGS(inst, is_out),
+ TP_STRUCT__entry(__string(name, dev_name(inst->dev->dev))
+ __field(u32, id)
+ __string(type, is_out ? "output" : "capture")
+ __field(u32, pixelformat)
+ __field(u32, width)
+ __field(u32, height)
+ __field(u32, buf_cnt_src)
+ __field(u32, buf_cnt_dst)
+ __field(u32, processed_cnt)
+ __field(u32, error_cnt)),
+ TP_fast_assign(__assign_str(name);
+ __entry->id = inst->id;
+ __assign_str(type);
+ __entry->pixelformat = is_out ? inst->src_fmt.pixelformat :
+ inst->dst_fmt.pixelformat;
+ __entry->width = is_out ? inst->src_fmt.width :
+ inst->dst_fmt.width;
+ __entry->height = is_out ? inst->src_fmt.height :
+ inst->dst_fmt.height;
+ __entry->buf_cnt_src = inst->queued_src_buf_num;
+ __entry->buf_cnt_dst = inst->queued_dst_buf_num;
+ __entry->processed_cnt = inst->processed_buf_num;
+ __entry->error_cnt = inst->error_buf_num;),
+ TP_printk("%s: inst[%d] %s %c%c%c%c %dx%d, input %d, %d, process %d, error %d",
+ __get_str(name), __entry->id, __get_str(type),
+ __entry->pixelformat,
+ __entry->pixelformat >> 8,
+ __entry->pixelformat >> 16,
+ __entry->pixelformat >> 24,
+ __entry->width, __entry->height,
+ __entry->buf_cnt_src, __entry->buf_cnt_dst,
+ __entry->processed_cnt, __entry->error_cnt));
+
+DEFINE_EVENT(wave6_vpu_inst_internal, wave6_vpu_start_streaming,
+ TP_PROTO(struct vpu_instance *inst, bool is_out),
+ TP_ARGS(inst, is_out));
+
+DEFINE_EVENT(wave6_vpu_inst_internal, wave6_vpu_stop_streaming,
+ TP_PROTO(struct vpu_instance *inst, bool is_out),
+ TP_ARGS(inst, is_out));
+
+TRACE_EVENT(wave6_vpu_dec_pic,
+ TP_PROTO(struct vpu_instance *inst, u32 srcidx, u32 size),
+ TP_ARGS(inst, srcidx, size),
+ TP_STRUCT__entry(__string(name, dev_name(inst->dev->dev))
+ __field(u32, id)
+ __field(u32, srcidx)
+ __field(u32, start)
+ __field(u32, size)),
+ TP_fast_assign(__assign_str(name);
+ __entry->id = inst->id;
+ __entry->srcidx = srcidx;
+ __entry->start = inst->codec_info->dec_info.stream_rd_ptr;
+ __entry->size = size;),
+ TP_printk("%s: inst[%d] src[%2d] %8x, %d",
+ __get_str(name), __entry->id,
+ __entry->srcidx, __entry->start, __entry->size));
+
+TRACE_EVENT(wave6_vpu_source_change,
+ TP_PROTO(struct vpu_instance *inst, struct dec_seq_info *info),
+ TP_ARGS(inst, info),
+ TP_STRUCT__entry(__string(name, dev_name(inst->dev->dev))
+ __field(u32, id)
+ __field(u32, width)
+ __field(u32, height)
+ __field(u32, profile)
+ __field(u32, level)
+ __field(u32, tier)
+ __field(u32, min_fb_cnt)
+ __field(u32, disp_delay)
+ __field(u32, quantization)
+ __field(u32, colorspace)
+ __field(u32, xfer_func)
+ __field(u32, ycbcr_enc)),
+ TP_fast_assign(__assign_str(name);
+ __entry->id = inst->id;
+ __entry->width = info->pic_width;
+ __entry->height = info->pic_height;
+ __entry->profile = info->profile;
+ __entry->level = info->level;
+ __entry->tier = info->tier;
+ __entry->min_fb_cnt = info->min_frame_buffer_count;
+ __entry->disp_delay = info->frame_buf_delay;
+ __entry->quantization = inst->quantization;
+ __entry->colorspace = inst->colorspace;
+ __entry->xfer_func = inst->xfer_func;
+ __entry->ycbcr_enc = inst->ycbcr_enc;),
+ TP_printk("%s: inst[%d] %dx%d profile %d %d %d min_fb %d delay %d color %d %d %d %d",
+ __get_str(name), __entry->id,
+ __entry->width, __entry->height,
+ __entry->profile, __entry->level, __entry->tier,
+ __entry->min_fb_cnt, __entry->disp_delay,
+ __entry->quantization, __entry->colorspace,
+ __entry->xfer_func, __entry->ycbcr_enc));
+
+TRACE_EVENT(wave6_vpu_dec_done,
+ TP_PROTO(struct vpu_instance *inst, struct dec_output_info *info),
+ TP_ARGS(inst, info),
+ TP_STRUCT__entry(__string(name, dev_name(inst->dev->dev))
+ __field(u32, id)
+ __field(u32, dec_flag)
+ __field(u32, dec_poc)
+ __field(u32, disp_flag)
+ __field(u32, disp_cnt)
+ __field(u32, rel_cnt)
+ __field(u32, src_ch)
+ __field(u32, eos)
+ __field(u32, error)
+ __field(u32, warn)),
+ TP_fast_assign(__assign_str(name);
+ __entry->id = inst->id;
+ __entry->dec_flag = info->frame_decoded;
+ __entry->dec_poc = info->decoded_poc;
+ __entry->disp_flag = info->frame_display;
+ __entry->disp_cnt = info->disp_frame_num;
+ __entry->rel_cnt = info->release_disp_frame_num;
+ __entry->src_ch = info->notification_flags & DEC_NOTI_FLAG_SEQ_CHANGE;
+ __entry->eos = info->stream_end;
+ __entry->error = info->error_reason;
+ __entry->warn = info->warn_info;),
+ TP_printk("%s: inst[%d] dec %d %d disp %d(%d) rel %d src_ch %d eos %d error 0x%x 0x%x",
+ __get_str(name), __entry->id,
+ __entry->dec_flag, __entry->dec_poc,
+ __entry->disp_flag, __entry->disp_cnt,
+ __entry->rel_cnt,
+ __entry->src_ch, __entry->eos,
+ __entry->error, __entry->warn));
+
+TRACE_EVENT(wave6_vpu_enc_pic,
+ TP_PROTO(struct vpu_instance *inst, struct enc_param *param),
+ TP_ARGS(inst, param),
+ TP_STRUCT__entry(__string(name, dev_name(inst->dev->dev))
+ __field(u32, id)
+ __field(u32, srcidx)
+ __field(u32, buf_y)
+ __field(u32, buf_cb)
+ __field(u32, buf_cr)
+ __field(u32, stride)
+ __field(u32, buf_strm)
+ __field(u32, size_strm)
+ __field(u32, force_type_enable)
+ __field(u32, force_type)
+ __field(u32, end_flag)),
+ TP_fast_assign(__assign_str(name);
+ __entry->id = inst->id;
+ __entry->srcidx = param->src_idx;
+ __entry->buf_y = param->source_frame->buf_y;
+ __entry->buf_cb = param->source_frame->buf_cb;
+ __entry->buf_cr = param->source_frame->buf_cr;
+ __entry->stride = param->source_frame->stride;
+ __entry->buf_strm = param->pic_stream_buffer_addr;
+ __entry->size_strm = param->pic_stream_buffer_size;
+ __entry->force_type_enable = param->force_pic;
+ __entry->force_type = param->force_pic_type;
+ __entry->end_flag = param->src_end;),
+ TP_printk("%s: inst[%d] src[%2d] %8x %8x %8x(%d) dst %8x(%d) force type %d(%d) end %d",
+ __get_str(name), __entry->id, __entry->srcidx,
+ __entry->buf_y, __entry->buf_cb, __entry->buf_cr,
+ __entry->stride, __entry->buf_strm, __entry->size_strm,
+ __entry->force_type_enable, __entry->force_type,
+ __entry->end_flag));
+
+TRACE_EVENT(wave6_vpu_enc_done,
+ TP_PROTO(struct vpu_instance *inst, struct enc_output_info *info),
+ TP_ARGS(inst, info),
+ TP_STRUCT__entry(__string(name, dev_name(inst->dev->dev))
+ __field(u32, id)
+ __field(u32, srcidx)
+ __field(u32, frmidx)
+ __field(u32, size)
+ __field(u32, type)
+ __field(u32, avg_qp)),
+ TP_fast_assign(__assign_str(name);
+ __entry->id = inst->id;
+ __entry->srcidx = info->enc_src_idx;
+ __entry->frmidx = info->recon_frame_index;
+ __entry->size = info->bitstream_size;
+ __entry->type = info->pic_type;
+ __entry->avg_qp = info->avg_ctu_qp;),
+ TP_printk("%s: inst[%d] src %d, frame %d, size %d, type %d, qp %d, eos %d",
+ __get_str(name), __entry->id,
+ __entry->srcidx, __entry->frmidx,
+ __entry->size, __entry->type, __entry->avg_qp,
+ __entry->frmidx == RECON_IDX_FLAG_ENC_END));
+
+TRACE_EVENT(wave6_vpu_s_ctrl,
+ TP_PROTO(struct vpu_instance *inst, struct v4l2_ctrl *ctrl),
+ TP_ARGS(inst, ctrl),
+ TP_STRUCT__entry(__string(name, dev_name(inst->dev->dev))
+ __field(u32, id)
+ __string(ctrl_name, ctrl->name)
+ __field(u32, val)),
+ TP_fast_assign(__assign_str(name);
+ __entry->id = inst->id;
+ __assign_str(ctrl_name);
+ __entry->val = ctrl->val;),
+ TP_printk("%s: inst[%d] %s = %d",
+ __get_str(name), __entry->id,
+ __get_str(ctrl_name), __entry->val));
+
+#endif /* __WAVE6_TRACE_H__ */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE wave6-trace
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-dbg.c b/drivers/media/platform/chips-media/wave6/wave6-vpu-dbg.c
new file mode 100644
index 000000000000..7f04060f0aea
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-dbg.c
@@ -0,0 +1,225 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+/*
+ * Wave6 series multi-standard codec IP - debug interface
+ *
+ * Copyright (C) 2025 CHIPS&MEDIA INC
+ */
+
+#include <linux/types.h>
+#include <linux/debugfs.h>
+#include "wave6-vpu-core.h"
+#include "wave6-vpu-dbg.h"
+
+static int wave6_vpu_dbg_instance(struct seq_file *s, void *data)
+{
+ struct vpu_instance *inst = s->private;
+ struct vpu_performance_info *perf = &inst->performance;
+ struct vb2_queue *vq;
+ char str[128];
+ int num;
+ s64 tmp;
+ s64 fps;
+
+ if (!inst->v4l2_fh.m2m_ctx)
+ return 0;
+
+ num = scnprintf(str, sizeof(str), "[%s]\n",
+ inst->type == VPU_INST_TYPE_DEC ? "Decoder" : "Encoder");
+ if (seq_write(s, str, num))
+ return 0;
+
+ num = scnprintf(str, sizeof(str), "%s : product 0x%x, fw_ver %d.%d.%d(r%d), hw_ver 0x%x\n",
+ dev_name(inst->dev->dev),
+ inst->dev->attr.product_code,
+ FW_VERSION_MAJOR(inst->dev->attr.fw_version),
+ FW_VERSION_MINOR(inst->dev->attr.fw_version),
+ FW_VERSION_REL(inst->dev->attr.fw_version),
+ inst->dev->attr.fw_revision,
+ inst->dev->attr.hw_version);
+ if (seq_write(s, str, num))
+ return 0;
+
+ num = scnprintf(str, sizeof(str), "state = %s\n",
+ wave6_vpu_instance_state_name(inst->state));
+ if (seq_write(s, str, num))
+ return 0;
+
+ vq = v4l2_m2m_get_src_vq(inst->v4l2_fh.m2m_ctx);
+ num = scnprintf(str, sizeof(str),
+ "output (%2d, %2d): fmt = %c%c%c%c %d x %d, %d;\n",
+ vb2_is_streaming(vq),
+ vb2_get_num_buffers(vq),
+ inst->src_fmt.pixelformat,
+ inst->src_fmt.pixelformat >> 8,
+ inst->src_fmt.pixelformat >> 16,
+ inst->src_fmt.pixelformat >> 24,
+ inst->src_fmt.width,
+ inst->src_fmt.height,
+ vq->last_buffer_dequeued);
+ if (seq_write(s, str, num))
+ return 0;
+
+ vq = v4l2_m2m_get_dst_vq(inst->v4l2_fh.m2m_ctx);
+ num = scnprintf(str, sizeof(str),
+ "capture(%2d, %2d): fmt = %c%c%c%c %d x %d, %d;\n",
+ vb2_is_streaming(vq),
+ vb2_get_num_buffers(vq),
+ inst->dst_fmt.pixelformat,
+ inst->dst_fmt.pixelformat >> 8,
+ inst->dst_fmt.pixelformat >> 16,
+ inst->dst_fmt.pixelformat >> 24,
+ inst->dst_fmt.width,
+ inst->dst_fmt.height,
+ vq->last_buffer_dequeued);
+ if (seq_write(s, str, num))
+ return 0;
+
+ num = scnprintf(str, sizeof(str), "crop: (%d, %d) %d x %d\n",
+ inst->crop.left,
+ inst->crop.top,
+ inst->crop.width,
+ inst->crop.height);
+ if (seq_write(s, str, num))
+ return 0;
+
+ if (inst->scaler_info.enable) {
+ num = scnprintf(str, sizeof(str), "scale: %d x %d\n",
+ inst->scaler_info.width, inst->scaler_info.height);
+ if (seq_write(s, str, num))
+ return 0;
+ }
+
+ num = scnprintf(str, sizeof(str),
+ "queued src %d, dst %d, process %d, sequence %d, error %d, drain %d:%d\n",
+ inst->queued_src_buf_num,
+ inst->queued_dst_buf_num,
+ inst->processed_buf_num,
+ inst->sequence,
+ inst->error_buf_num,
+ inst->v4l2_fh.m2m_ctx->out_q_ctx.buffered,
+ inst->eos);
+ if (seq_write(s, str, num))
+ return 0;
+
+ num = scnprintf(str, sizeof(str), "fps");
+ if (seq_write(s, str, num))
+ return 0;
+ tmp = MSEC_PER_SEC * inst->processed_buf_num;
+ if (perf->ts_last > perf->ts_first + NSEC_PER_MSEC) {
+ fps = DIV_ROUND_CLOSEST(tmp, (perf->ts_last - perf->ts_first) / NSEC_PER_MSEC);
+ num = scnprintf(str, sizeof(str), " actual: %lld;", fps);
+ if (seq_write(s, str, num))
+ return 0;
+ }
+ if (perf->total_sw_time) {
+ fps = DIV_ROUND_CLOSEST(tmp, perf->total_sw_time / NSEC_PER_MSEC);
+ num = scnprintf(str, sizeof(str), " sw: %lld;", fps);
+ if (seq_write(s, str, num))
+ return 0;
+ }
+ if (perf->total_hw_time) {
+ fps = DIV_ROUND_CLOSEST(tmp, perf->total_hw_time / NSEC_PER_MSEC);
+ num = scnprintf(str, sizeof(str), " hw: %lld", fps);
+ if (seq_write(s, str, num))
+ return 0;
+ }
+ num = scnprintf(str, sizeof(str), "\n");
+ if (seq_write(s, str, num))
+ return 0;
+
+ num = scnprintf(str, sizeof(str),
+ "latency(ms) first: %llu.%06llu, max %llu.%06llu, setup %llu.%06llu\n",
+ perf->latency_first / NSEC_PER_MSEC,
+ perf->latency_first % NSEC_PER_MSEC,
+ perf->latency_max / NSEC_PER_MSEC,
+ perf->latency_max % NSEC_PER_MSEC,
+ (perf->ts_first - perf->ts_start) / NSEC_PER_MSEC,
+ (perf->ts_first - perf->ts_start) % NSEC_PER_MSEC);
+ if (seq_write(s, str, num))
+ return 0;
+
+ num = scnprintf(str, sizeof(str),
+ "process frame time(ms) min: %llu.%06llu, max %llu.%06llu\n",
+ perf->min_process_time / NSEC_PER_MSEC,
+ perf->min_process_time % NSEC_PER_MSEC,
+ perf->max_process_time / NSEC_PER_MSEC,
+ perf->max_process_time % NSEC_PER_MSEC);
+ if (seq_write(s, str, num))
+ return 0;
+
+ if (inst->type == VPU_INST_TYPE_DEC) {
+ num = scnprintf(str, sizeof(str), "%s order\n",
+ inst->disp_mode == DISP_MODE_DISP_ORDER ? "display" : "decode");
+ if (seq_write(s, str, num))
+ return 0;
+ } else {
+ struct enc_info *p_enc_info = &inst->codec_info->enc_info;
+ struct enc_codec_param *param = &p_enc_info->open_param.codec_param;
+
+ num = scnprintf(str, sizeof(str), "profile %d, level %d, tier %d\n",
+ param->profile, param->level, param->tier);
+ if (seq_write(s, str, num))
+ return 0;
+
+ num = scnprintf(str, sizeof(str), "frame_rate %d, idr_period %d, intra_period %d\n",
+ param->frame_rate, param->idr_period, param->intra_period);
+ if (seq_write(s, str, num))
+ return 0;
+
+ num = scnprintf(str, sizeof(str), "rc %d, mode %d, bitrate %d\n",
+ param->en_rate_control,
+ param->rc_mode,
+ param->bitrate);
+ if (seq_write(s, str, num))
+ return 0;
+
+ num = scnprintf(str, sizeof(str),
+ "qp %d, i_qp [%d, %d], p_qp [%d, %d], b_qp [%d, %d]\n",
+ param->qp,
+ param->min_qp_i, param->max_qp_i,
+ param->min_qp_p, param->max_qp_p,
+ param->min_qp_b, param->max_qp_b);
+ if (seq_write(s, str, num))
+ return 0;
+ }
+
+ return 0;
+}
+
+static int wave6_vpu_dbg_open(struct inode *inode, struct file *filp)
+{
+ return single_open(filp, wave6_vpu_dbg_instance, inode->i_private);
+}
+
+static const struct file_operations wave6_vpu_dbg_fops = {
+ .owner = THIS_MODULE,
+ .open = wave6_vpu_dbg_open,
+ .release = single_release,
+ .read = seq_read,
+};
+
+int wave6_vpu_create_dbgfs_file(struct vpu_instance *inst)
+{
+ char name[64];
+
+ if (WARN_ON(!inst || !inst->dev || IS_ERR_OR_NULL(inst->dev->debugfs)))
+ return -EINVAL;
+
+ scnprintf(name, sizeof(name), "instance.%d", inst->id);
+ inst->debugfs = debugfs_create_file((const char *)name,
+ VERIFY_OCTAL_PERMISSIONS(0444),
+ inst->dev->debugfs,
+ inst,
+ &wave6_vpu_dbg_fops);
+
+ return 0;
+}
+
+void wave6_vpu_remove_dbgfs_file(struct vpu_instance *inst)
+{
+ if (WARN_ON(!inst || !inst->debugfs))
+ return;
+
+ debugfs_remove(inst->debugfs);
+ inst->debugfs = NULL;
+}
diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-dbg.h b/drivers/media/platform/chips-media/wave6/wave6-vpu-dbg.h
new file mode 100644
index 000000000000..6453eb2de76f
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-dbg.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
+/*
+ * Wave6 series multi-standard codec IP - debug interface
+ *
+ * Copyright (C) 2025 CHIPS&MEDIA INC
+ */
+
+#ifndef __WAVE6_VPU_DBG_H__
+#define __WAVE6_VPU_DBG_H__
+
+int wave6_vpu_create_dbgfs_file(struct vpu_instance *inst);
+void wave6_vpu_remove_dbgfs_file(struct vpu_instance *inst);
+
+#endif /* __WAVE6_VPU_DBG_H__ */
--
2.31.1
* [RFC PATCH v5 5/9] media: chips-media: wave6: Add Wave6 core driver
From: Nas Chung @ 2026-04-15 9:25 UTC
To: mchehab, hverkuil, robh, krzk+dt, conor+dt, shawnguo, s.hauer
Cc: linux-media, devicetree, linux-kernel, linux-imx,
linux-arm-kernel, marek.vasut, ming.qian, Nas Chung
In-Reply-To: <20260415092529.577-1-nas.chung@chipsnmedia.com>
Add the core driver for the Chips&Media Wave6 video codec IP.
The hardware contains one control register region and four interface
register regions for a shared video processing engine. This driver
handles the interface register regions, each with its own MMIO range and
interrupt, while relying on the control driver for firmware loading and
shared resource management.
It configures the V4L2 mem2mem devices and communicates with the Wave6
hardware to perform video processing tasks.
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
Tested-by: Ming Qian <ming.qian@oss.nxp.com>
Tested-by: Marek Vasut <marek.vasut@mailbox.org>
---
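The interrupt path described above reports which instance an interrupt belongs to as a bitmask, and the threaded handler picks the first registered instance whose id bit is set. The userspace sketch below illustrates only that matching rule; `struct vpu_inst_sketch` and `find_instance()` are hypothetical names, and the array replaces the spinlock-protected list that the driver walks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for the driver's instance structure. */
struct vpu_inst_sketch {
	unsigned int id;
};

/*
 * Return the first instance whose id bit is set in inst_idc, or NULL if
 * none matches. Ids of 32 or more cannot be encoded in a 32-bit mask and
 * are skipped.
 */
static struct vpu_inst_sketch *find_instance(struct vpu_inst_sketch *insts,
					     size_t n, uint32_t inst_idc)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (insts[i].id < 32 &&
		    ((UINT32_C(1) << insts[i].id) & inst_idc))
			return &insts[i];
	}

	return NULL;
}
```

If several bits are set, the result depends on the order of the instances, not on the bit order; the driver's list walk behaves the same way.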
.../chips-media/wave6/wave6-vpu-core.c | 397 ++++++++++++++++++
.../chips-media/wave6/wave6-vpu-core.h | 123 ++++++
2 files changed, 520 insertions(+)
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-core.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-core.h
diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-core.c b/drivers/media/platform/chips-media/wave6/wave6-vpu-core.c
new file mode 100644
index 000000000000..d666a6bb22e8
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-core.c
@@ -0,0 +1,397 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+/*
+ * Wave6 series multi-standard codec IP - wave6 core driver
+ *
+ * Copyright (C) 2025 CHIPS&MEDIA INC
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/clk.h>
+#include <linux/firmware.h>
+#include <linux/interrupt.h>
+#include <linux/pm_runtime.h>
+#include <linux/debugfs.h>
+#include <linux/iopoll.h>
+#include "wave6-vpu-core.h"
+#include "wave6-regdefine.h"
+#include "wave6-vpuconfig.h"
+#include "wave6-hw.h"
+#include "wave6-vpu-dbg.h"
+
+#define CREATE_TRACE_POINTS
+#include "wave6-trace.h"
+
+#define WAVE6_VPU_DEBUGFS_DIR "wave6"
+
+static irqreturn_t wave6_vpu_core_irq(int irq, void *dev_id)
+{
+ struct vpu_core_device *core = dev_id;
+ struct vpu_irq irq_info;
+
+ if (!vpu_read_reg(core, W6_VPU_VPU_INT_STS))
+ return IRQ_NONE;
+
+ irq_info.status = vpu_read_reg(core, W6_VPU_VINT_REASON);
+ irq_info.inst_idc = vpu_read_reg(core, W6_RET_INT_INSTANCE_INFO);
+
+ vpu_write_reg(core, W6_RET_INT_INSTANCE_INFO, INT_INSTANCE_INFO_CLEAR);
+ vpu_write_reg(core, W6_VPU_VINT_REASON_CLEAR, irq_info.status);
+ vpu_write_reg(core, W6_VPU_VINT_CLEAR, VINT_CLEAR);
+
+ trace_wave6_vpu_irq(core, irq_info.status, irq_info.inst_idc);
+
+ if (irq_info.status & BIT(W6_INT_BIT_REQ_WORK_BUF)) {
+ if (core->vpu)
+ core->vpu->req_work_buffer(core->vpu, core);
+
+ return IRQ_HANDLED;
+ }
+
+ kfifo_in(&core->irq_fifo, &irq_info, sizeof(struct vpu_irq));
+
+ return IRQ_WAKE_THREAD;
+}
+
+static struct vpu_instance *wave6_vpu_core_get_instance(struct vpu_core_device *core,
+ u32 inst_idc)
+{
+ struct vpu_instance *inst;
+
+ guard(spinlock)(&core->inst_lock);
+
+ list_for_each_entry(inst, &core->instances, list) {
+ if (BIT(inst->id) & inst_idc)
+ return inst;
+ }
+
+ return NULL;
+}
+
+static irqreturn_t wave6_vpu_core_irq_thread(int irq, void *dev_id)
+{
+ struct vpu_core_device *core = dev_id;
+ struct vpu_instance *inst;
+ struct vpu_irq irq_info;
+
+ while (kfifo_len(&core->irq_fifo)) {
+ bool error = false;
+
+ if (!kfifo_out(&core->irq_fifo, &irq_info, sizeof(struct vpu_irq)))
+ break;
+
+ inst = wave6_vpu_core_get_instance(core, irq_info.inst_idc);
+ if (!inst)
+ break;
+
+ if ((irq_info.status & BIT(W6_INT_BIT_INIT_SEQ)) ||
+ (irq_info.status & BIT(W6_INT_BIT_ENC_SET_PARAM))) {
+ complete(&inst->irq_done);
+ continue;
+ }
+
+ if (irq_info.status & BIT(W6_INT_BIT_BSBUF_ERROR))
+ error = true;
+
+ if (inst->ops && inst->ops->finish_process)
+ inst->ops->finish_process(inst, error);
+ }
+
+ return IRQ_HANDLED;
+}
+
+static void wave6_vpu_core_check_state(struct vpu_core_device *core)
+{
+ u32 val;
+ int ret;
+
+ guard(mutex)(&core->hw_lock);
+
+ ret = read_poll_timeout(vpu_read_reg, val, val != 0,
+ W6_VPU_POLL_DELAY_US, W6_VPU_POLL_TIMEOUT,
+ false, core, W6_VPU_VCPU_CUR_PC);
+ if (ret)
+ return;
+
+ wave6_vpu_enable_interrupt(core);
+ ret = wave6_vpu_get_version(core);
+ if (ret) {
+ dev_err(core->dev, "wave6_vpu_get_version failed: %d\n", ret);
+ return;
+ }
+
+ dev_dbg(core->dev, "product 0x%x, fw_ver %d.%d.%d(r%d), hw_ver 0x%x\n",
+ core->attr.product_code,
+ FW_VERSION_MAJOR(core->attr.fw_version),
+ FW_VERSION_MINOR(core->attr.fw_version),
+ FW_VERSION_REL(core->attr.fw_version),
+ core->attr.fw_revision,
+ core->attr.hw_version);
+
+ if (core->attr.fw_version < core->res->compatible_fw_version)
+ dev_err(core->dev, "fw version is too low (< v%d.%d.%d)\n",
+ FW_VERSION_MAJOR(core->res->compatible_fw_version),
+ FW_VERSION_MINOR(core->res->compatible_fw_version),
+ FW_VERSION_REL(core->res->compatible_fw_version));
+}
+
+void wave6_vpu_core_activate(struct vpu_core_device *core)
+{
+ core->active = true;
+}
+
+static void wave6_vpu_core_wait_activated(struct vpu_core_device *core)
+{
+ if (core->active)
+ wave6_vpu_core_check_state(core);
+}
+
+static int wave6_vpu_core_probe(struct platform_device *pdev)
+{
+ struct vpu_core_device *core;
+ const struct wave6_vpu_core_resource *res;
+ int ret;
+ int irq;
+
+ res = dev_get_platdata(&pdev->dev);
+ if (!res) {
+ dev_err(&pdev->dev, "There is no platform data\n");
+ return -ENODEV;
+ }
+
+ ret = dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
+ if (ret < 0) {
+ dev_err(&pdev->dev, "failed to set DMA mask: %d\n", ret);
+ return ret;
+ }
+
+ core = devm_kzalloc(&pdev->dev, sizeof(*core), GFP_KERNEL);
+ if (!core)
+ return -ENOMEM;
+
+ ret = devm_mutex_init(&pdev->dev, &core->dev_lock);
+ if (ret)
+ return ret;
+
+ ret = devm_mutex_init(&pdev->dev, &core->hw_lock);
+ if (ret)
+ return ret;
+
+ spin_lock_init(&core->inst_lock);
+ INIT_LIST_HEAD(&core->instances);
+ dev_set_drvdata(&pdev->dev, core);
+ core->dev = &pdev->dev;
+ core->res = res;
+
+ if (pdev->dev.parent->driver && pdev->dev.parent->driver->name &&
+ !strcmp(pdev->dev.parent->driver->name, WAVE6_VPU_PLATFORM_DRIVER_NAME))
+ core->vpu = dev_get_drvdata(pdev->dev.parent);
+
+ core->reg_base = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(core->reg_base))
+ return PTR_ERR(core->reg_base);
+
+ ret = devm_clk_bulk_get_all(&pdev->dev, &core->clks);
+ if (ret < 0)
+ return dev_err_probe(&pdev->dev, ret, "failed to get clocks\n");
+
+ core->num_clks = ret;
+
+ irq = platform_get_irq(pdev, 0);
+ if (irq < 0)
+ return irq;
+
+ ret = devm_request_threaded_irq(&pdev->dev, irq,
+ wave6_vpu_core_irq,
+ wave6_vpu_core_irq_thread,
+ 0, "vpu_irq", core);
+ if (ret) {
+ dev_err(&pdev->dev, "failed to request IRQ: %d\n", ret);
+ return ret;
+ }
+
+ ret = v4l2_device_register(&pdev->dev, &core->v4l2_dev);
+ if (ret) {
+ dev_err(&pdev->dev, "failed to register v4l2_dev: %d\n", ret);
+ return ret;
+ }
+
+ ret = wave6_vpu_init_m2m_dev(core);
+ if (ret)
+ goto err_v4l2_unregister;
+
+ ret = kfifo_alloc(&core->irq_fifo,
+ MAX_NUM_INSTANCE * sizeof(struct vpu_irq),
+ GFP_KERNEL);
+ if (ret) {
+ dev_err(&pdev->dev, "failed to allocate fifo\n");
+ goto err_m2m_dev_release;
+ }
+
+ core->temp_vbuf.size = ALIGN(W6_TEMPBUF_SIZE, 4096);
+ ret = wave6_vdi_alloc_dma(core->dev, &core->temp_vbuf);
+ if (ret) {
+ dev_err(&pdev->dev, "failed to allocate temp_vbuf: %d\n", ret);
+ goto err_kfifo_free;
+ }
+
+ core->debugfs = debugfs_lookup(WAVE6_VPU_DEBUGFS_DIR, NULL);
+ if (IS_ERR_OR_NULL(core->debugfs))
+ core->debugfs = debugfs_create_dir(WAVE6_VPU_DEBUGFS_DIR, NULL);
+
+ pm_runtime_enable(&pdev->dev);
+
+ if (core->res->codec_types & WAVE6_IS_DEC) {
+ ret = wave6_vpu_dec_register_device(core);
+ if (ret) {
+ dev_err(&pdev->dev,
+ "failed to register video_dev_dec: %d\n", ret);
+ goto err_temp_vbuf_free;
+ }
+ }
+ if (core->res->codec_types & WAVE6_IS_ENC) {
+ ret = wave6_vpu_enc_register_device(core);
+ if (ret) {
+ dev_err(&pdev->dev,
+ "failed to register video_dev_enc: %d\n", ret);
+ goto err_dec_unreg;
+ }
+ }
+
+ dev_dbg(&pdev->dev, "Added wave6 driver with caps %s %s\n",
+ core->res->codec_types & WAVE6_IS_ENC ? "'ENCODE'" : "",
+ core->res->codec_types & WAVE6_IS_DEC ? "'DECODE'" : "");
+
+ return 0;
+
+err_dec_unreg:
+ if (core->res->codec_types & WAVE6_IS_DEC)
+ wave6_vpu_dec_unregister_device(core);
+err_temp_vbuf_free:
+ wave6_vdi_free_dma(&core->temp_vbuf);
+err_kfifo_free:
+ kfifo_free(&core->irq_fifo);
+err_m2m_dev_release:
+ wave6_vpu_release_m2m_dev(core);
+err_v4l2_unregister:
+ v4l2_device_unregister(&core->v4l2_dev);
+
+ return ret;
+}
+
+static void wave6_vpu_core_remove(struct platform_device *pdev)
+{
+ struct vpu_core_device *core = dev_get_drvdata(&pdev->dev);
+
+ pm_runtime_disable(&pdev->dev);
+
+ wave6_vpu_enc_unregister_device(core);
+ wave6_vpu_dec_unregister_device(core);
+ wave6_vdi_free_dma(&core->temp_vbuf);
+ kfifo_free(&core->irq_fifo);
+ wave6_vpu_release_m2m_dev(core);
+ v4l2_device_unregister(&core->v4l2_dev);
+}
+
+static int __maybe_unused wave6_vpu_core_runtime_suspend(struct device *dev)
+{
+ struct vpu_core_device *core = dev_get_drvdata(dev);
+
+ if (WARN_ON(!core))
+ return -ENODEV;
+
+ /*
+ * Only call parent VPU put_vpu if the core has a parent and is active.
+ * - core->vpu: prevent access in core without parent VPU.
+ * - core->active: execute sleep only after m2m streaming is started.
+ */
+ if (core->vpu && core->active)
+ core->vpu->put_vpu(core->vpu, core);
+
+ if (core->num_clks)
+ clk_bulk_disable_unprepare(core->num_clks, core->clks);
+
+ return 0;
+}
+
+static int __maybe_unused wave6_vpu_core_runtime_resume(struct device *dev)
+{
+ struct vpu_core_device *core = dev_get_drvdata(dev);
+ int ret = 0;
+
+ if (WARN_ON(!core))
+ return -ENODEV;
+
+ if (core->num_clks) {
+ ret = clk_bulk_prepare_enable(core->num_clks, core->clks);
+ if (ret) {
+ dev_err(dev, "failed to enable clocks: %d\n", ret);
+ return ret;
+ }
+ }
+
+ /*
+ * Only call parent VPU get_vpu if the core has a parent and is active.
+ * - core->vpu: prevent access in core without parent VPU.
+ * - core->active: execute boot only after m2m streaming is started.
+ */
+ if (core->vpu && core->active)
+ ret = core->vpu->get_vpu(core->vpu, core);
+
+ if (!ret)
+ wave6_vpu_core_wait_activated(core);
+ else if (core->num_clks)
+ clk_bulk_disable_unprepare(core->num_clks, core->clks);
+
+ return ret;
+}
+
+static int __maybe_unused wave6_vpu_core_suspend(struct device *dev)
+{
+ struct vpu_core_device *core = dev_get_drvdata(dev);
+ int ret;
+
+ v4l2_m2m_suspend(core->m2m_dev);
+
+ ret = pm_runtime_force_suspend(dev);
+ if (ret)
+ v4l2_m2m_resume(core->m2m_dev);
+
+ return ret;
+}
+
+static int __maybe_unused wave6_vpu_core_resume(struct device *dev)
+{
+ struct vpu_core_device *core = dev_get_drvdata(dev);
+ int ret;
+
+ ret = pm_runtime_force_resume(dev);
+ if (ret)
+ return ret;
+
+ v4l2_m2m_resume(core->m2m_dev);
+
+ return 0;
+}
+
+static const struct dev_pm_ops wave6_vpu_core_pm_ops = {
+ SET_RUNTIME_PM_OPS(wave6_vpu_core_runtime_suspend,
+ wave6_vpu_core_runtime_resume, NULL)
+ SET_SYSTEM_SLEEP_PM_OPS(wave6_vpu_core_suspend,
+ wave6_vpu_core_resume)
+};
+
+static struct platform_driver wave6_vpu_core_driver = {
+ .driver = {
+ .name = WAVE6_VPU_CORE_PLATFORM_DRIVER_NAME,
+ .pm = &wave6_vpu_core_pm_ops,
+ },
+ .probe = wave6_vpu_core_probe,
+ .remove = wave6_vpu_core_remove,
+};
+
+module_platform_driver(wave6_vpu_core_driver);
+MODULE_ALIAS("platform:" WAVE6_VPU_CORE_PLATFORM_DRIVER_NAME);
+MODULE_DESCRIPTION("Chips&Media Wave6 VPU CORE V4L2 driver");
+MODULE_LICENSE("Dual BSD/GPL");
diff --git a/drivers/media/platform/chips-media/wave6/wave6-vpu-core.h b/drivers/media/platform/chips-media/wave6/wave6-vpu-core.h
new file mode 100644
index 000000000000..728ea5a0578d
--- /dev/null
+++ b/drivers/media/platform/chips-media/wave6/wave6-vpu-core.h
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause) */
+/*
+ * Wave6 series multi-standard codec IP - wave6 core driver
+ *
+ * Copyright (C) 2025 CHIPS&MEDIA INC
+ */
+
+#ifndef __WAVE6_VPU_CORE_H__
+#define __WAVE6_VPU_CORE_H__
+
+#include <media/v4l2-ctrls.h>
+#include <media/v4l2-ioctl.h>
+#include <media/v4l2-event.h>
+#include <media/v4l2-fh.h>
+#include <media/videobuf2-v4l2.h>
+#include <media/videobuf2-dma-contig.h>
+#include "wave6-vpuconfig.h"
+#include "wave6-vpuapi.h"
+
+#define vpu_write_reg(CORE, ADDR, DATA) wave6_vpu_writel(CORE, ADDR, DATA)
+#define vpu_read_reg(CORE, ADDR) wave6_vpu_readl(CORE, ADDR)
+
+struct vpu_buffer {
+ struct v4l2_m2m_buffer v4l2_m2m_buf;
+ bool consumed;
+ bool used;
+ bool error;
+ bool force_key_frame;
+ bool force_frame_qp;
+ u32 force_i_frame_qp;
+ u32 force_p_frame_qp;
+ u32 force_b_frame_qp;
+ ktime_t ts_input;
+ ktime_t ts_start;
+ ktime_t ts_finish;
+ ktime_t ts_output;
+ u64 hw_time;
+ u32 average_qp;
+};
+
+enum vpu_fmt_type {
+ VPU_FMT_TYPE_CODEC = 0,
+ VPU_FMT_TYPE_RAW = 1
+};
+
+#define VPU_FMT_FLAG_CBCR_INTERLEAVED BIT(0)
+#define VPU_FMT_FLAG_CRCB_ORDER BIT(1)
+#define VPU_FMT_FLAG_10BIT BIT(2)
+#define VPU_FMT_FLAG_RGB BIT(3)
+
+struct vpu_format {
+ unsigned int v4l2_pix_fmt;
+ unsigned int max_width;
+ unsigned int min_width;
+ unsigned int max_height;
+ unsigned int min_height;
+ unsigned int num_planes;
+ enum frame_buffer_format fb_fmt;
+ enum endian_mode endian;
+ enum csc_format_order csc_fmt_order;
+ unsigned int flags;
+};
+
+static inline struct vpu_instance *wave6_fh_to_vpu_inst(struct v4l2_fh *vfh)
+{
+ return container_of(vfh, struct vpu_instance, v4l2_fh);
+}
+
+static inline struct vpu_instance *wave6_file_to_vpu_inst(struct file *filp)
+{
+ return wave6_fh_to_vpu_inst(file_to_v4l2_fh(filp));
+}
+
+static inline struct vpu_instance *wave6_ctrl_to_vpu_inst(struct v4l2_ctrl *vctrl)
+{
+ return container_of(vctrl->handler, struct vpu_instance, v4l2_ctrl_hdl);
+}
+
+static inline struct vpu_buffer *wave6_to_vpu_buf(struct vb2_v4l2_buffer *vbuf)
+{
+ return container_of(vbuf, struct vpu_buffer, v4l2_m2m_buf.vb);
+}
+
+static inline bool wave6_vpu_both_queues_are_streaming(struct vpu_instance *inst)
+{
+ struct vb2_queue *vq_cap = v4l2_m2m_get_dst_vq(inst->v4l2_fh.m2m_ctx);
+ struct vb2_queue *vq_out = v4l2_m2m_get_src_vq(inst->v4l2_fh.m2m_ctx);
+
+ return vb2_is_streaming(vq_cap) && vb2_is_streaming(vq_out);
+}
+
+u32 wave6_vpu_get_consumed_fb_num(struct vpu_instance *inst);
+void wave6_vpu_core_activate(struct vpu_core_device *core);
+void wave6_update_pix_fmt(struct v4l2_pix_format_mplane *pix_mp,
+ unsigned int width,
+ unsigned int height);
+struct vb2_v4l2_buffer *wave6_get_dst_buf_by_addr(struct vpu_instance *inst,
+ dma_addr_t addr);
+dma_addr_t wave6_get_dma_addr(struct vb2_v4l2_buffer *buf,
+ unsigned int plane_no);
+enum codec_std wave6_to_codec_std(enum vpu_instance_type type, unsigned int v4l2_pix_fmt);
+const char *wave6_vpu_instance_state_name(enum vpu_instance_state state);
+void wave6_vpu_set_instance_state(struct vpu_instance *inst,
+ enum vpu_instance_state state);
+u64 wave6_vpu_cycle_to_ns(struct vpu_core_device *core, u64 cycle);
+int wave6_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout);
+int wave6_vpu_dec_register_device(struct vpu_core_device *core);
+void wave6_vpu_dec_unregister_device(struct vpu_core_device *core);
+int wave6_vpu_enc_register_device(struct vpu_core_device *core);
+void wave6_vpu_enc_unregister_device(struct vpu_core_device *core);
+void wave6_vpu_finish_job(struct vpu_instance *inst);
+void wave6_vpu_record_performance_timestamps(struct vpu_instance *inst);
+void wave6_vpu_handle_performance(struct vpu_instance *inst,
+ struct vpu_buffer *vpu_buf);
+void wave6_vpu_reset_performance(struct vpu_instance *inst);
+int wave6_vpu_init_m2m_dev(struct vpu_core_device *core);
+void wave6_vpu_release_m2m_dev(struct vpu_core_device *core);
+int wave6_vpu_subscribe_event(struct v4l2_fh *fh,
+ const struct v4l2_event_subscription *sub);
+void wave6_vpu_return_buffers(struct vpu_instance *inst,
+ unsigned int type, enum vb2_buffer_state state);
+
+#endif /* __WAVE6_VPU_CORE_H__ */
--
2.31.1
^ permalink raw reply related
* [RFC PATCH v5 0/9] Add support for Wave6 video codec driver
From: Nas Chung @ 2026-04-15 9:25 UTC (permalink / raw)
To: mchehab, hverkuil, robh, krzk+dt, conor+dt, shawnguo, s.hauer
Cc: linux-media, devicetree, linux-kernel, linux-imx,
linux-arm-kernel, marek.vasut, ming.qian, Nas Chung
This RFC primarily asks for feedback on the devicetree representation
for the Chips&Media Wave6 codec block on NXP i.MX95. It only includes
DT-driven changes. Non-DT driver feedback and cleanups will be
addressed once the DT structure is agreed.
On i.MX95 the Wave6 hardware exposes one control register region and
four interface register regions for one shared video processing engine.
In this RFC, the control region is described by the parent node and the
interface regions by child nodes. The control and interface regions are
distinct DMA requesters and can be associated with separate IOMMU stream
IDs, allowing DMA isolation between them. The control region has its own
MMIO range, and each interface region has its own MMIO range and
interrupt.
I also evaluated folding all resources into a single parent node, but in
that model all stream IDs end up attached to the same IOMMU domain and
we observed loss of DMA isolation. Alternatives such as iommu-map or a
vendor-specific stream ID property were considered, but they do not seem
to fit this use case.
Thanks for your time and feedback.
RFC v5:
- Move all shared resources to the parent node
- Drop child compatible and use data-only interface child nodes
- Update the VPU driver to create child devices and load the core driver
v4:
- Fixed build issues reported by CI tools
- Updated commit messages to use imperative mood
- Avoided using the same name for both nodes and labels in devicetree
- Removed unused labels from YAML examples
- Added description for child(vpu-core) node
- Added iommus property to both parent(vpu) and child(vpu-core) nodes
- Updated probe() functions to use dev_err_probe() when returning -EPROBE_DEFER
- Added wave6_vpu prefix to trace functions
- Updated HEVC decoder profile control to report MAIN_STILL profile
- Fixed bug in multiple instance creation by pre-allocating work buffer
- Fixed interrupt handling by checking INSTANCE_INFO register and instance list
v3:
- Removed ambiguous SUPPORT_FOLLOWER feature
- Used WARN_ON() for unexpected programming errors
- Split thermal device code into wave6-vpu-thermal.c/h
- Dropped wave6_cooling_disable module parameter
- Replaced mutex_lock() with guard()
- Added lockdep_assert_held() to clarify locking regions
- Removed exported function due to dual-license and used function pointer
- Added documentation and validation for state transitions
- Added documentation for device structures
- Added patch to enable VPU device in imx95 DTS
- Updated DT bindings and driver to align with the parent (vpu) and child (vpu-core) node layout
- Replaced magic numbers with mask and offset macros when accessing registers
- Placed goto statements after an empty line
- Printed HW info (e.g. product_code) via dev_dbg() for debugging
- Replaced wave6_vpu_dec_give_command() with dedicated functions
v2:
- Refined DT bindings to better represent the hardware
- Reworked driver to align with the parent (VPU) and child (CTRL, CORE) node layout
- Fixed build issues reported by CI tools (Smatch, Sparse, TRACE)
- Improved commit messages with clearer descriptions
- Added kernel-doc for exported functions
- Removed redundant print statements and unused code
- Reordered patches to prevent build failures
Nas Chung (9):
media: v4l2-common: Add YUV24 format info
dt-bindings: media: nxp: Add Wave6 video codec device
media: chips-media: wave6: Add Wave6 VPU interface
media: chips-media: wave6: Add v4l2 m2m driver support
media: chips-media: wave6: Add Wave6 core driver
media: chips-media: wave6: Improve debugging capabilities
media: chips-media: wave6: Add Wave6 thermal cooling device
media: chips-media: wave6: Add Wave6 control driver
arm64: dts: freescale: imx95: Add video codec node
.../bindings/media/nxp,imx95-vpu.yaml | 163 +
MAINTAINERS | 8 +
.../boot/dts/freescale/imx95-15x15-evk.dts | 7 +-
.../boot/dts/freescale/imx95-15x15-frdm.dts | 5 +
.../boot/dts/freescale/imx95-19x19-evk.dts | 10 +
.../dts/freescale/imx95-19x19-verdin-evk.dts | 10 +
.../dts/freescale/imx95-phycore-fpsc.dtsi | 10 +
.../dts/freescale/imx95-toradex-smarc-dev.dts | 5 +
.../dts/freescale/imx95-toradex-smarc.dtsi | 5 +
.../boot/dts/freescale/imx95-tqma9596sa.dtsi | 7 +-
arch/arm64/boot/dts/freescale/imx95.dtsi | 35 +
drivers/media/platform/chips-media/Kconfig | 1 +
drivers/media/platform/chips-media/Makefile | 1 +
.../media/platform/chips-media/wave6/Kconfig | 17 +
.../media/platform/chips-media/wave6/Makefile | 17 +
.../platform/chips-media/wave6/wave6-hw.c | 2929 +++++++++++++++++
.../platform/chips-media/wave6/wave6-hw.h | 73 +
.../chips-media/wave6/wave6-regdefine.h | 641 ++++
.../platform/chips-media/wave6/wave6-trace.h | 289 ++
.../platform/chips-media/wave6/wave6-vdi.h | 92 +
.../chips-media/wave6/wave6-vpu-core.c | 397 +++
.../chips-media/wave6/wave6-vpu-core.h | 123 +
.../chips-media/wave6/wave6-vpu-dbg.c | 225 ++
.../chips-media/wave6/wave6-vpu-dbg.h | 14 +
.../chips-media/wave6/wave6-vpu-dec.c | 1867 +++++++++++
.../chips-media/wave6/wave6-vpu-enc.c | 2691 +++++++++++++++
.../chips-media/wave6/wave6-vpu-thermal.c | 141 +
.../chips-media/wave6/wave6-vpu-thermal.h | 26 +
.../chips-media/wave6/wave6-vpu-v4l2.c | 507 +++
.../platform/chips-media/wave6/wave6-vpu.c | 816 +++++
.../platform/chips-media/wave6/wave6-vpu.h | 143 +
.../platform/chips-media/wave6/wave6-vpuapi.c | 725 ++++
.../platform/chips-media/wave6/wave6-vpuapi.h | 1026 ++++++
.../chips-media/wave6/wave6-vpuconfig.h | 72 +
.../chips-media/wave6/wave6-vpuerror.h | 262 ++
drivers/media/v4l2-core/v4l2-common.c | 1 +
36 files changed, 13359 insertions(+), 2 deletions(-)
create mode 100644 Documentation/devicetree/bindings/media/nxp,imx95-vpu.yaml
create mode 100644 drivers/media/platform/chips-media/wave6/Kconfig
create mode 100644 drivers/media/platform/chips-media/wave6/Makefile
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-hw.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-hw.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-regdefine.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-trace.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vdi.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-core.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-core.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-dbg.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-dbg.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-dec.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-enc.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-thermal.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-thermal.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu-v4l2.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpu.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpuapi.c
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpuapi.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpuconfig.h
create mode 100644 drivers/media/platform/chips-media/wave6/wave6-vpuerror.h
--
2.31.1
^ permalink raw reply
* [RFC PATCH v5 2/9] dt-bindings: media: nxp: Add Wave6 video codec device
From: Nas Chung @ 2026-04-15 9:25 UTC (permalink / raw)
To: mchehab, hverkuil, robh, krzk+dt, conor+dt, shawnguo, s.hauer
Cc: linux-media, devicetree, linux-kernel, linux-imx,
linux-arm-kernel, marek.vasut, ming.qian, Nas Chung
In-Reply-To: <20260415092529.577-1-nas.chung@chipsnmedia.com>
Add documentation for the Chips&Media Wave6 video codec on NXP i.MX SoCs.
The hardware contains one control register region and four interface
register regions for a shared video processing engine. The control region
manages shared resources such as firmware memory, while each interface
region has its own MMIO range and interrupt.
The control region and each interface region are distinct DMA requesters
and can be associated with separate IOMMU stream IDs. Represent the
control region as the parent node and the interface register regions as
child nodes to describe these resources.
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
---
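Note: in the binding example further down, the four interface regions
follow a regular pattern: consecutive 64 KiB MMIO windows, consecutive
GIC SPI numbers and consecutive SMMU stream IDs, with the control region
placed right after the last interface window. The C sketch below only
restates that arithmetic so the example values are easier to cross-check.
It is not driver code. Every identifier in it (the W6_EX_* macros,
struct w6_ex_interface, w6_ex_interface(), w6_ex_control_base()) is
hypothetical, and the constants are copied from the example in this
patch, not from a reference manual.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the binding example; all names are hypothetical. */
#define W6_EX_IF_BASE     0x4c480000u /* reg base of the first interface@ node */
#define W6_EX_IF_SIZE     0x10000u    /* size of each interface MMIO window */
#define W6_EX_IF_COUNT    4u          /* number of interface regions */
#define W6_EX_IF_SPI_BASE 299u        /* GIC_SPI number of interface 0 */
#define W6_EX_IF_SID_BASE 0x33u       /* SMMU stream ID of interface 0 */

struct w6_ex_interface {
	uint32_t reg_base;  /* start of the MMIO window */
	uint32_t reg_size;  /* length of the MMIO window */
	uint32_t spi;       /* GIC SPI number */
	uint32_t stream_id; /* stream ID used in the example's iommus property */
};

/* Return the example resources of interface region idx (0..3). */
static struct w6_ex_interface w6_ex_interface(unsigned int idx)
{
	struct w6_ex_interface itf;

	assert(idx < W6_EX_IF_COUNT);

	itf.reg_base = W6_EX_IF_BASE + idx * W6_EX_IF_SIZE;
	itf.reg_size = W6_EX_IF_SIZE;
	itf.spi = W6_EX_IF_SPI_BASE + idx;
	itf.stream_id = W6_EX_IF_SID_BASE + idx;

	return itf;
}

/* In the example, the control region starts right after the last interface. */
static uint32_t w6_ex_control_base(void)
{
	return W6_EX_IF_BASE + W6_EX_IF_COUNT * W6_EX_IF_SIZE;
}
```

For instance, w6_ex_interface(3) yields the interface@4c4b0000 node of the
example (SPI 302, stream ID 0x36), and w6_ex_control_base() yields the
0x4c4c0000 base of the parent node. Whether other SoCs keep this regular
spacing is not stated anywhere in this series.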
.../bindings/media/nxp,imx95-vpu.yaml | 163 ++++++++++++++++++
MAINTAINERS | 7 +
2 files changed, 170 insertions(+)
create mode 100644 Documentation/devicetree/bindings/media/nxp,imx95-vpu.yaml
diff --git a/Documentation/devicetree/bindings/media/nxp,imx95-vpu.yaml b/Documentation/devicetree/bindings/media/nxp,imx95-vpu.yaml
new file mode 100644
index 000000000000..9a5ca53e15a3
--- /dev/null
+++ b/Documentation/devicetree/bindings/media/nxp,imx95-vpu.yaml
@@ -0,0 +1,163 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/media/nxp,imx95-vpu.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Chips&Media Wave6 Series multi-standard codec IP on NXP i.MX SoCs
+
+maintainers:
+ - Nas Chung <nas.chung@chipsnmedia.com>
+ - Jackson Lee <jackson.lee@chipsnmedia.com>
+
+description:
+ The Chips&Media Wave6 codec IP is a multi-standard video encoder/decoder.
+ On NXP i.MX SoCs, the Wave6 codec IP exposes one control register region and
+ four interface register regions for a shared video processing engine.
+ The parent node describes the control region, which has its own MMIO range and
+ manages shared resources such as firmware memory. The child nodes describe the
+ interface register regions. Each interface region has its own MMIO range and
+ interrupt.
+ The control region and the interface regions are distinct DMA requesters.
+ The control region and each interface region can be associated with separate
+ IOMMU stream IDs, allowing DMA isolation between them.
+
+properties:
+ compatible:
+ enum:
+ - nxp,imx95-vpu
+
+ reg:
+ maxItems: 1
+
+ clocks:
+ items:
+ - description: VPU core clock
+ - description: VPU associated block clock
+
+ clock-names:
+ items:
+ - const: core
+ - const: vpublk
+
+ power-domains:
+ items:
+ - description: Main VPU power domain
+ - description: Performance power domain
+
+ power-domain-names:
+ items:
+ - const: vpu
+ - const: perf
+
+ memory-region:
+ maxItems: 1
+
+ sram:
+ $ref: /schemas/types.yaml#/definitions/phandle
+ description:
+ phandle to the SRAM node used to store reference data, reducing DMA
+ memory bandwidth.
+
+ iommus:
+ maxItems: 1
+
+ "#cooling-cells":
+ const: 2
+
+ "#address-cells":
+ const: 2
+
+ "#size-cells":
+ const: 2
+
+ ranges: true
+
+patternProperties:
+ "^interface@[0-9a-f]+$":
+ type: object
+ description:
+ An interface register region within the Chips&Media Wave6 codec IP.
+ Each region has its own MMIO range and interrupt and can be associated
+ with a separate IOMMU stream ID for DMA isolation.
+ additionalProperties: false
+
+ properties:
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ iommus:
+ maxItems: 1
+
+ required:
+ - reg
+ - interrupts
+
+required:
+ - compatible
+ - reg
+ - clocks
+ - clock-names
+ - power-domains
+ - power-domain-names
+ - memory-region
+ - "#address-cells"
+ - "#size-cells"
+ - ranges
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/clock/nxp,imx95-clock.h>
+
+ soc {
+ #address-cells = <2>;
+ #size-cells = <2>;
+
+ video-codec@4c4c0000 {
+ compatible = "nxp,imx95-vpu";
+ reg = <0x0 0x4c4c0000 0x0 0x10000>;
+ clocks = <&scmi_clk 115>,
+ <&vpu_blk_ctrl IMX95_CLK_VPUBLK_WAVE>;
+ clock-names = "core", "vpublk";
+ power-domains = <&scmi_devpd 21>,
+ <&scmi_perf 10>;
+ power-domain-names = "vpu", "perf";
+ memory-region = <&vpu_boot>;
+ sram = <&sram1>;
+ iommus = <&smmu 0x32>;
+ #cooling-cells = <2>;
+ #address-cells = <2>;
+ #size-cells = <2>;
+ ranges;
+
+ interface@4c480000 {
+ reg = <0x0 0x4c480000 0x0 0x10000>;
+ interrupts = <GIC_SPI 299 IRQ_TYPE_LEVEL_HIGH>;
+ iommus = <&smmu 0x33>;
+ };
+
+ interface@4c490000 {
+ reg = <0x0 0x4c490000 0x0 0x10000>;
+ interrupts = <GIC_SPI 300 IRQ_TYPE_LEVEL_HIGH>;
+ iommus = <&smmu 0x34>;
+ };
+
+ interface@4c4a0000 {
+ reg = <0x0 0x4c4a0000 0x0 0x10000>;
+ interrupts = <GIC_SPI 301 IRQ_TYPE_LEVEL_HIGH>;
+ iommus = <&smmu 0x35>;
+ };
+
+ interface@4c4b0000 {
+ reg = <0x0 0x4c4b0000 0x0 0x10000>;
+ interrupts = <GIC_SPI 302 IRQ_TYPE_LEVEL_HIGH>;
+ iommus = <&smmu 0x36>;
+ };
+ };
+ };
diff --git a/MAINTAINERS b/MAINTAINERS
index 32b1dfee8614..5700be993849 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -28393,6 +28393,13 @@ S: Maintained
F: Documentation/devicetree/bindings/media/cnm,wave521c.yaml
F: drivers/media/platform/chips-media/wave5/
+WAVE6 VPU CODEC DRIVER
+M: Nas Chung <nas.chung@chipsnmedia.com>
+M: Jackson Lee <jackson.lee@chipsnmedia.com>
+L: linux-media@vger.kernel.org
+S: Maintained
+F: Documentation/devicetree/bindings/media/nxp,imx95-vpu.yaml
+
WHISKEYCOVE PMIC GPIO DRIVER
M: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
L: linux-gpio@vger.kernel.org
--
2.31.1
^ permalink raw reply related
* [RFC PATCH v5 9/9] arm64: dts: freescale: imx95: Add video codec node
From: Nas Chung @ 2026-04-15 9:25 UTC (permalink / raw)
To: mchehab, hverkuil, robh, krzk+dt, conor+dt, shawnguo, s.hauer
Cc: linux-media, devicetree, linux-kernel, linux-imx,
linux-arm-kernel, marek.vasut, ming.qian, Nas Chung
In-Reply-To: <20260415092529.577-1-nas.chung@chipsnmedia.com>
Add the Chips&Media wave633 video codec node for i.MX95 SoCs.
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
---
.../boot/dts/freescale/imx95-15x15-evk.dts | 7 +++-
.../boot/dts/freescale/imx95-15x15-frdm.dts | 5 +++
.../boot/dts/freescale/imx95-19x19-evk.dts | 10 ++++++
.../dts/freescale/imx95-19x19-verdin-evk.dts | 10 ++++++
.../dts/freescale/imx95-phycore-fpsc.dtsi | 10 ++++++
.../dts/freescale/imx95-toradex-smarc-dev.dts | 5 +++
.../dts/freescale/imx95-toradex-smarc.dtsi | 5 +++
.../boot/dts/freescale/imx95-tqma9596sa.dtsi | 7 +++-
arch/arm64/boot/dts/freescale/imx95.dtsi | 35 +++++++++++++++++++
9 files changed, 92 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/boot/dts/freescale/imx95-15x15-evk.dts b/arch/arm64/boot/dts/freescale/imx95-15x15-evk.dts
index d4184fb8b28c..2c841e476d17 100644
--- a/arch/arm64/boot/dts/freescale/imx95-15x15-evk.dts
+++ b/arch/arm64/boot/dts/freescale/imx95-15x15-evk.dts
@@ -215,7 +215,7 @@ rsc_table: rsc-table@88220000 {
no-map;
};
- vpu_boot: vpu-boot@a0000000 {
+ vpu_boot: memory@a0000000 {
reg = <0 0xa0000000 0 0x100000>;
no-map;
};
@@ -1157,6 +1157,11 @@ &wdog3 {
status = "okay";
};
+&vpu {
+ memory-region = <&vpu_boot>;
+ sram = <&sram1>;
+};
+
&xcvr {
clocks = <&scmi_clk IMX95_CLK_BUSWAKEUP>,
<&scmi_clk IMX95_CLK_SPDIF>,
diff --git a/arch/arm64/boot/dts/freescale/imx95-15x15-frdm.dts b/arch/arm64/boot/dts/freescale/imx95-15x15-frdm.dts
index ca1c4966c867..106186c75f9c 100644
--- a/arch/arm64/boot/dts/freescale/imx95-15x15-frdm.dts
+++ b/arch/arm64/boot/dts/freescale/imx95-15x15-frdm.dts
@@ -962,3 +962,8 @@ &usdhc3 {
&wdog3 {
status = "okay";
};
+
+&vpu {
+ memory-region = <&vpu_boot>;
+ sram = <&sram1>;
+};
diff --git a/arch/arm64/boot/dts/freescale/imx95-19x19-evk.dts b/arch/arm64/boot/dts/freescale/imx95-19x19-evk.dts
index aaa0da55a22b..0ee5f9700fd3 100644
--- a/arch/arm64/boot/dts/freescale/imx95-19x19-evk.dts
+++ b/arch/arm64/boot/dts/freescale/imx95-19x19-evk.dts
@@ -76,6 +76,11 @@ linux_cma: linux,cma {
linux,cma-default;
reusable;
};
+
+ vpu_boot: memory@a0000000 {
+ reg = <0 0xa0000000 0 0x100000>;
+ no-map;
+ };
};
flexcan1_phy: can-phy0 {
@@ -1142,3 +1147,8 @@ &tpm6 {
pinctrl-0 = <&pinctrl_tpm6>;
status = "okay";
};
+
+&vpu {
+ memory-region = <&vpu_boot>;
+ sram = <&sram1>;
+};
diff --git a/arch/arm64/boot/dts/freescale/imx95-19x19-verdin-evk.dts b/arch/arm64/boot/dts/freescale/imx95-19x19-verdin-evk.dts
index 2b0ff232f680..c35ad2466b19 100644
--- a/arch/arm64/boot/dts/freescale/imx95-19x19-verdin-evk.dts
+++ b/arch/arm64/boot/dts/freescale/imx95-19x19-verdin-evk.dts
@@ -65,6 +65,11 @@ linux_cma: linux,cma {
linux,cma-default;
reusable;
};
+
+ vpu_boot: memory@a0000000 {
+ reg = <0 0xa0000000 0 0x100000>;
+ no-map;
+ };
};
reg_1p8v: regulator-1p8v {
@@ -693,3 +698,8 @@ pinctrl_usdhc3: usdhc3grp {
<IMX95_PAD_SD3_DATA3__USDHC3_DATA3 0x138e>;
};
};
+
+&vpu {
+ memory-region = <&vpu_boot>;
+ sram = <&sram1>;
+};
diff --git a/arch/arm64/boot/dts/freescale/imx95-phycore-fpsc.dtsi b/arch/arm64/boot/dts/freescale/imx95-phycore-fpsc.dtsi
index 7519d5bd06ba..b713d4159e35 100644
--- a/arch/arm64/boot/dts/freescale/imx95-phycore-fpsc.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx95-phycore-fpsc.dtsi
@@ -59,6 +59,11 @@ linux,cma {
size = <0 0x3c000000>;
linux,cma-default;
};
+
+ vpu_boot: memory@a0000000 {
+ reg = <0 0xa0000000 0 0x100000>;
+ no-map;
+ };
};
};
@@ -654,3 +659,8 @@ &usdhc3 { /* FPSC SDIO */
pinctrl-0 = <&pinctrl_usdhc3>;
pinctrl-names = "default";
};
+
+&vpu {
+ memory-region = <&vpu_boot>;
+ sram = <&sram1>;
+};
diff --git a/arch/arm64/boot/dts/freescale/imx95-toradex-smarc-dev.dts b/arch/arm64/boot/dts/freescale/imx95-toradex-smarc-dev.dts
index 5b05f256fd52..5bdfdab8647e 100644
--- a/arch/arm64/boot/dts/freescale/imx95-toradex-smarc-dev.dts
+++ b/arch/arm64/boot/dts/freescale/imx95-toradex-smarc-dev.dts
@@ -275,3 +275,8 @@ &usb3_phy {
&usdhc2 {
status = "okay";
};
+
+&vpu {
+ memory-region = <&vpu_boot>;
+ sram = <&sram1>;
+};
diff --git a/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi b/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi
index 5932ba238a8a..10e6d1fbb8e2 100644
--- a/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx95-toradex-smarc.dtsi
@@ -156,6 +156,11 @@ linux_cma: linux,cma {
alloc-ranges = <0 0x80000000 0 0x7f000000>;
linux,cma-default;
};
+
+ vpu_boot: memory@a0000000 {
+ reg = <0 0xa0000000 0 0x100000>;
+ no-map;
+ };
};
};
diff --git a/arch/arm64/boot/dts/freescale/imx95-tqma9596sa.dtsi b/arch/arm64/boot/dts/freescale/imx95-tqma9596sa.dtsi
index 456129f4a682..a7b5b517e021 100644
--- a/arch/arm64/boot/dts/freescale/imx95-tqma9596sa.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx95-tqma9596sa.dtsi
@@ -40,7 +40,7 @@ linux_cma: linux,cma {
linux,cma-default;
};
- vpu_boot: vpu-boot@a0000000 {
+ vpu_boot: memory@a0000000 {
reg = <0 0xa0000000 0 0x100000>;
no-map;
};
@@ -801,3 +801,8 @@ pinctrl_usdhc2_200mhz: usdhc2-200mhzgrp {
<IMX95_PAD_SD2_VSELECT__USDHC2_VSELECT 0x51e>;
};
};
+
+&vpu {
+ memory-region = <&vpu_boot>;
+ sram = <&sram1>;
+};
diff --git a/arch/arm64/boot/dts/freescale/imx95.dtsi b/arch/arm64/boot/dts/freescale/imx95.dtsi
index 55e2da094c88..de8fb19c7e3b 100644
--- a/arch/arm64/boot/dts/freescale/imx95.dtsi
+++ b/arch/arm64/boot/dts/freescale/imx95.dtsi
@@ -2005,6 +2005,41 @@ vpu_blk_ctrl: clock-controller@4c410000 {
assigned-clock-rates = <133333333>, <667000000>, <500000000>;
};
+ vpu: video-codec@4c4c0000 {
+ compatible = "nxp,imx95-vpu";
+ reg = <0x0 0x4c4c0000 0x0 0x10000>;
+ clocks = <&scmi_clk IMX95_CLK_VPU>,
+ <&vpu_blk_ctrl IMX95_CLK_VPUBLK_WAVE>;
+ clock-names = "core", "vpublk";
+ power-domains = <&scmi_devpd IMX95_PD_VPU>,
+ <&scmi_perf IMX95_PERF_VPU>;
+ power-domain-names = "vpu", "perf";
+ #cooling-cells = <2>;
+ #address-cells = <2>;
+ #size-cells = <2>;
+ ranges;
+
+ interface@4c480000 {
+ reg = <0x0 0x4c480000 0x0 0x10000>;
+ interrupts = <GIC_SPI 299 IRQ_TYPE_LEVEL_HIGH>;
+ };
+
+ interface@4c490000 {
+ reg = <0x0 0x4c490000 0x0 0x10000>;
+ interrupts = <GIC_SPI 300 IRQ_TYPE_LEVEL_HIGH>;
+ };
+
+ interface@4c4a0000 {
+ reg = <0x0 0x4c4a0000 0x0 0x10000>;
+ interrupts = <GIC_SPI 301 IRQ_TYPE_LEVEL_HIGH>;
+ };
+
+ interface@4c4b0000 {
+ reg = <0x0 0x4c4b0000 0x0 0x10000>;
+ interrupts = <GIC_SPI 302 IRQ_TYPE_LEVEL_HIGH>;
+ };
+ };
+
jpegdec: jpegdec@4c500000 {
compatible = "nxp,imx95-jpgdec", "nxp,imx8qxp-jpgdec";
reg = <0x0 0x4C500000 0x0 0x00050000>;
--
2.31.1
^ permalink raw reply related
* [RFC PATCH v5 1/9] media: v4l2-common: Add YUV24 format info
From: Nas Chung @ 2026-04-15 9:25 UTC (permalink / raw)
To: mchehab, hverkuil, robh, krzk+dt, conor+dt, shawnguo, s.hauer
Cc: linux-media, devicetree, linux-kernel, linux-imx,
linux-arm-kernel, marek.vasut, ming.qian, Nas Chung,
Nicolas Dufresne
In-Reply-To: <20260415092529.577-1-nas.chung@chipsnmedia.com>
The YUV24 format is missing an entry in v4l2_format_info().
YUV24 is a packed YUV 4:4:4 format with 8 bits per component,
i.e. 3 bytes per pixel in a single plane.
Fixes: 0376a51fbe5e ("media: v4l: Add packed YUV444 24bpp pixel format")
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
---
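Note: as a rough illustration of what the new table entry implies for
buffer sizing, the C sketch below mirrors only the fields set by this
patch for plane 0 (bpp = 3, bpp_div = 1, hdiv = vdiv = 1). It is a
simplified user-space model. The struct and function names are
hypothetical, and it assumes the usual "width * bpp / bpp_div" stride
calculation without block alignment or driver-specific padding, so
actual drivers may report larger values.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, trimmed-down mirror of the plane 0 fields of the new entry. */
struct yuv24_ex_info {
	uint8_t bpp;     /* bytes per pixel in plane 0 */
	uint8_t bpp_div; /* divisor applied to bpp */
	uint8_t hdiv;    /* horizontal chroma subsampling (1 = none) */
	uint8_t vdiv;    /* vertical chroma subsampling (1 = none) */
};

/* Values taken from the V4L2_PIX_FMT_YUV24 line added by this patch. */
static const struct yuv24_ex_info yuv24_ex = {
	.bpp = 3, .bpp_div = 1, .hdiv = 1, .vdiv = 1,
};

/* Bytes per line of the single packed plane, with no extra alignment. */
static uint32_t yuv24_ex_bytesperline(const struct yuv24_ex_info *info,
				      uint32_t width)
{
	assert(info->bpp_div != 0);

	return width * info->bpp / info->bpp_div;
}

/* Total image size: one plane, so stride multiplied by height. */
static uint32_t yuv24_ex_sizeimage(const struct yuv24_ex_info *info,
				   uint32_t width, uint32_t height)
{
	return yuv24_ex_bytesperline(info, width) * height;
}
```

Under these assumptions a 1920x1080 frame needs 5760 bytes per line and
6220800 bytes in total, which is the 3 bytes per pixel expected for a
packed 8-bit 4:4:4 format.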
drivers/media/v4l2-core/v4l2-common.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/media/v4l2-core/v4l2-common.c b/drivers/media/v4l2-core/v4l2-common.c
index 554c591e1113..55bcd5975d9f 100644
--- a/drivers/media/v4l2-core/v4l2-common.c
+++ b/drivers/media/v4l2-core/v4l2-common.c
@@ -281,6 +281,7 @@ const struct v4l2_format_info *v4l2_format_info(u32 format)
{ .format = V4L2_PIX_FMT_Y212, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 1, .bpp = { 4, 0, 0, 0 }, .bpp_div = { 1, 1, 1, 1 }, .hdiv = 2, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_Y216, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 1, .bpp = { 4, 0, 0, 0 }, .bpp_div = { 1, 1, 1, 1 }, .hdiv = 2, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_YUV48_12, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 1, .bpp = { 6, 0, 0, 0 }, .bpp_div = { 1, 1, 1, 1 }, .hdiv = 1, .vdiv = 1 },
+ { .format = V4L2_PIX_FMT_YUV24, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 1, .comp_planes = 1, .bpp = { 3, 0, 0, 0 }, .bpp_div = { 1, 1, 1, 1 }, .hdiv = 1, .vdiv = 1 },
{ .format = V4L2_PIX_FMT_MT2110T, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 2, .comp_planes = 2, .bpp = { 5, 10, 0, 0 }, .bpp_div = { 4, 4, 1, 1 }, .hdiv = 2, .vdiv = 2,
.block_w = { 16, 8, 0, 0 }, .block_h = { 32, 16, 0, 0 }},
{ .format = V4L2_PIX_FMT_MT2110R, .pixel_enc = V4L2_PIXEL_ENC_YUV, .mem_planes = 2, .comp_planes = 2, .bpp = { 5, 10, 0, 0 }, .bpp_div = { 4, 4, 1, 1 }, .hdiv = 2, .vdiv = 2,
--
2.31.1
* RE: [PATCH 4/5] media: dt-bindings: add NXP i.MX95 compatible string
From: G.N. Zhou (OSS) @ 2026-04-15 9:21 UTC (permalink / raw)
To: Krzysztof Kozlowski, G.N. Zhou (OSS)
Cc: Michael Riesch, Mauro Carvalho Chehab, Rob Herring,
Krzysztof Kozlowski, Conor Dooley, Heiko Stuebner,
Laurent Pinchart, Frank Li, linux-media@vger.kernel.org,
linux-kernel@vger.kernel.org, devicetree@vger.kernel.org,
imx@lists.linux.dev, linux-arm-kernel@lists.infradead.org,
linux-rockchip@lists.infradead.org
In-Reply-To: <20260415-glaring-premium-nuthatch-ce00fc@quoll>
Hi Krzysztof Kozlowski
Thanks for your review.
> -----Original Message-----
> From: Krzysztof Kozlowski <krzk@kernel.org>
> Sent: Wednesday, April 15, 2026 4:10 PM
> To: G.N. Zhou (OSS) <guoniu.zhou@oss.nxp.com>
> Cc: Michael Riesch <michael.riesch@collabora.com>; Mauro Carvalho Chehab
> <mchehab@kernel.org>; Rob Herring <robh@kernel.org>; Krzysztof Kozlowski
> <krzk+dt@kernel.org>; Conor Dooley <conor+dt@kernel.org>; Heiko Stuebner
> <heiko@sntech.de>; Laurent Pinchart <laurent.pinchart@ideasonboard.com>;
> Frank Li <frank.li@nxp.com>; linux-media@vger.kernel.org; linux-
> kernel@vger.kernel.org; devicetree@vger.kernel.org; imx@lists.linux.dev; linux-
> arm-kernel@lists.infradead.org; linux-rockchip@lists.infradead.org
> Subject: Re: [PATCH 4/5] media: dt-bindings: add NXP i.MX95 compatible string
>
> On Wed, Apr 15, 2026 at 11:46:55AM +0800, Guoniu Zhou wrote:
> > The i.MX95 CSI-2 controller is nearly identical to i.MX93, with the
> > only difference being the use of IDI (Image Data Interface) instead of
> > IPI (Image Pixel Interface). The binding constraints are otherwise the
> > same.
>
> Nearly identical with some difference really, really suggests they are
> compatible. Express compatibility or explain why they are not compatible
> (difference between IDI and IPI unfortunately does not help me).
You're right that they are very similar. The key difference between IDI and IPI
is in the software interface:
- IPI (Image Pixel Interface) on i.MX93 requires software configuration through
a set of registers to enable the interface and configure data routing.
- IDI (Image Data Interface) on i.MX95 is software-transparent: it requires no
register configuration, and the data routing is handled automatically by hardware.
Because of this difference in register layout and initialization requirements,
they cannot share the same compatible string. The driver needs to know which
interface is present.
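As a rough illustration of what "the driver needs to know" could mean, here
is a plain C sketch of per-compatible variant data. It is not the actual
driver code: the compatible strings are placeholders, and the struct and
field names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-variant data; the field name is illustrative only. */
struct csi2_variant {
	bool has_ipi_regs; /* true: software must program the IPI registers */
};

struct csi2_match {
	const char *compatible;
	struct csi2_variant data;
};

/* Placeholder compatible strings, not taken from the binding. */
static const struct csi2_match csi2_matches[] = {
	{ "vendor,soc-a-csi2", { .has_ipi_regs = true } },
	{ "vendor,soc-b-csi2", { .has_ipi_regs = false } },
};

/* Return the variant data for a compatible string, or NULL if unknown. */
static const struct csi2_variant *csi2_lookup(const char *compatible)
{
	size_t n = sizeof(csi2_matches) / sizeof(csi2_matches[0]);

	for (size_t i = 0; i < n; i++)
		if (!strcmp(csi2_matches[i].compatible, compatible))
			return &csi2_matches[i].data;
	return NULL;
}
```

The probe path would then skip the interface register setup entirely when
has_ipi_regs is false.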
>
> Best regards,
> Krzysztof
* Re: [PATCH 1/8] arm64/hwcap: Generate the KERNEL_HWCAP_ definitions for the hwcaps
From: Alexander Stein @ 2026-04-15 9:07 UTC (permalink / raw)
To: Catalin Marinas
Cc: Will Deacon, Jonathan Corbet, Shuah Khan, linux-arm-kernel,
linux-kernel, linux-doc, linux-kselftest, Mark Brown
In-Reply-To: <ad9H0BiD4le07P-a@arm.com>
Hi Catalin,
On Wednesday, 15 April 2026, 10:09:52 CEST, Catalin Marinas wrote:
> On Wed, Apr 15, 2026 at 08:24:22AM +0200, Alexander Stein wrote:
> > On Monday, 2 March 2026, 23:53:16 CEST, Mark Brown wrote:
> > > Currently for each hwcap we define both the HWCAPn_NAME definition which is
> > > exposed to userspace and a kernel internal KERNEL_HWCAP_NAME definition
> > > which we use internally. This is tedious and repetitive, instead use a
> > > script to generate the KERNEL_HWCAP_ definitions from the UAPI definitions.
> > >
> > > No functional changes intended.
> >
> > Somehow this change causes kernel-hwcap.h to be deleted and regenerated on
> > each make call. This results in recompiling essentially everything each time.
>
> Does this fix it:
>
> https://lore.kernel.org/r/20260413-arm64-hwcap-gen-fix-v1-1-26c56aed6908@kernel.org
>
> It's queued, it will go in before -rc1.
Ah, I didn't notice that. Thanks. Works for me.
Best regards
Alexander
--
TQ-Systems GmbH | Mühlstraße 2, Gut Delling | 82229 Seefeld, Germany
Munich District Court (Amtsgericht München), HRB 105018
Managing Directors: Detlef Schneider, Rüdiger Stahl, Stefan Schneider
http://www.tq-group.com/
* Re: [PATCH v4 3/9] coresight: etm4x: fix leaked trace id
From: Suzuki K Poulose @ 2026-04-15 8:56 UTC (permalink / raw)
To: Jie Gan, Leo Yan, Yeoreum Yun
Cc: coresight, linux-arm-kernel, linux-kernel, mike.leach,
james.clark, alexander.shishkin
In-Reply-To: <d871bc3a-c104-4e8f-8104-898b957af79c@oss.qualcomm.com>
On 15/04/2026 09:45, Jie Gan wrote:
>
>
> On 4/15/2026 4:32 PM, Leo Yan wrote:
>> On Wed, Apr 15, 2026 at 09:01:09AM +0100, Yeoreum Yun wrote:
>>
>> [...]
>>
>>>>> What I am thinking is as SoCs continue to grow more complex with an
>>>>> increasing number of subsystems, trace IDs may be exhausted in the
>>>>> near
>>>>> future. (that's why we have dynamic trace ID allocation/release).
>>>>
>>>> Thanks for the input.
>>>>
>>>> I am wondering if we can use "dev->devt" as the trace ID. A device's
>>>> major/minor number is unique in kernel and dev_t is defined as u32:
>>>>
>>>> typedef u32 __kernel_dev_t;
>>>>
>>>> And we can consolidate this for both SYSFS and PERF modes.
>>>>
>>>
>>> When I see the CORESIGHT_TRACE_ID_MAX:
>>>
>>> /* architecturally we have 128 IDs some of which are reserved */
>>> #define CORESIGHT_TRACE_IDS_MAX 128
>>>
>>> I think this came from the hardware restriction for number of TRACE_IDs.
>>> In this case, clamping the device_id to trace_id seems more complex and
>>> reduce some performance perspective.
>>
>> Sigh, my stupid. Please ignore my previous comment, let us first fix
>> ID leak issue.
>>
>> Given Jie's comment on the use-out issue, it is valid for me especially
>> if a system have many dummy tracers. We can defer to refactor it
>> later (e.g., use separate ranges for hardware and dummy tracers).
>>
>> thanks for correction!
>
> Just sharing some info:
>
> From memory, the Arm AMBA ATB protocol specification defines a 7-bit
> wide field for the trace ID; that's where the 128 comes from. (Each
> frame also has a 7-bit field containing the trace ID.)
That is true, and some IDs in the range (0-127) are reserved, so we
actually have fewer than 128. We need the dynamic allocation, preferably
isolated to a "pool" for the relevant session, to make full use of
the space.
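To illustrate the pool idea, here is a minimal userspace sketch of a
per-session bitmap allocator over the 7-bit ID space. The names are
hypothetical, and the reserved values (0x00 and 0x70-0x7f) are an assumption
made for this example rather than something stated in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

#define TRACE_IDS_MAX    128  /* 7-bit ID field */
#define TRACE_ID_RES_0   0x00 /* assumed reserved */
#define TRACE_ID_RES_TOP 0x70 /* assumed: 0x70..0x7f reserved */

/* One pool per session; a set bit means the ID is in use. */
struct trace_id_pool {
	uint64_t used[TRACE_IDS_MAX / 64];
};

static bool trace_id_valid(int id)
{
	return id > TRACE_ID_RES_0 && id < TRACE_ID_RES_TOP;
}

/* Return the lowest free valid ID, or -1 when the pool is exhausted. */
static int trace_id_alloc(struct trace_id_pool *pool)
{
	for (int id = 0; id < TRACE_IDS_MAX; id++) {
		uint64_t bit = 1ULL << (id % 64);

		if (!trace_id_valid(id) || (pool->used[id / 64] & bit))
			continue;
		pool->used[id / 64] |= bit;
		return id;
	}
	return -1;
}

/* Release an ID so that it cannot leak past the end of the session. */
static void trace_id_free(struct trace_id_pool *pool, int id)
{
	if (trace_id_valid(id))
		pool->used[id / 64] &= ~(1ULL << (id % 64));
}
```

Under that assumption a pool yields 111 usable IDs, and each session with
its own pool can use all of them independently.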
Suzuki
>
> Thanks,
> Jie
>
* Re: [PATCH v7 6/6] arm64: dts: rockchip: Add Orange Pi 5 Pro board support
From: Alexey Charkov @ 2026-04-15 8:57 UTC (permalink / raw)
To: Dennis Gilmore
Cc: Andrew Lunn, Andrzej Hajda, Chaoyi Chen, Conor Dooley,
David Airlie, devicetree, dri-devel, FUKAUMI Naoki,
Heiko Stuebner, Hsun Lai, Jernej Skrabec, Jimmy Hon, John Clark,
Jonas Karlman, Krzysztof Kozlowski, Laurent Pinchart,
linux-arm-kernel, linux-kernel, linux-rockchip, Maarten Lankhorst,
Maxime Ripard, Michael Opdenacker, Michael Riesch, Mykola Kvach,
Neil Armstrong, Peter Robinson, Quentin Schulz, Robert Foss,
Rob Herring, Simona Vetter, Thomas Zimmermann
In-Reply-To: <20260414214104.1363987-7-dennis@ausil.us>
On Wed, Apr 15, 2026 at 1:41 AM Dennis Gilmore <dennis@ausil.us> wrote:
>
> Add device tree for the Xunlong Orange Pi 5 Pro (RK3588S).
>
> - eMMC module, you can optionally solder a SPI NOR in place and turn
> off the eMMC
> - PCIe-attached NIC (pcie2x1l1)
> - PCIe NVMe slot (pcie2x1l2)
Hi Dennis,
Sashiko noticed [1] that the controller names here do not match the
nodes/comments you have in the patch body - which ones are correct?
[1] https://sashiko.dev/#/patchset/20260414214104.1363987-1-dennis%40ausil.us
> - AP6256 WiFi (BCM43456) via SDIO with mmc-pwrseq
> - BCM4345C5 Bluetooth
> - es8388 audio
> - USB 2.0 and USB 3.0
> - Two HDMI ports, the second is connected to the SoC's DP controller
> driven through a Lontium LT8711UXD bridge.
>
> Vendors schematics are available at:
> https://drive.google.com/file/d/1qs1DratHuh7C6J6MEtQIwUsiSrg8qgTi/view
>
> Signed-off-by: Dennis Gilmore <dennis@ausil.us>
> ---
> arch/arm64/boot/dts/rockchip/Makefile | 1 +
> .../dts/rockchip/rk3588s-orangepi-5-pro.dts | 442 ++++++++++++++++++
> 2 files changed, 443 insertions(+)
> create mode 100644 arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5-pro.dts
>
> diff --git a/arch/arm64/boot/dts/rockchip/Makefile b/arch/arm64/boot/dts/rockchip/Makefile
> index 4d384f153c13..c99dca2ae9e7 100644
> --- a/arch/arm64/boot/dts/rockchip/Makefile
> +++ b/arch/arm64/boot/dts/rockchip/Makefile
> @@ -214,6 +214,7 @@ dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3588s-nanopi-r6c.dtb
> dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3588s-odroid-m2.dtb
> dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3588s-orangepi-5.dtb
> dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3588s-orangepi-5b.dtb
> +dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3588s-orangepi-5-pro.dtb
> dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3588s-orangepi-cm5-base.dtb
> dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3588s-radxa-cm5-io.dtb
> dtb-$(CONFIG_ARCH_ROCKCHIP) += rk3588s-roc-pc.dtb
> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5-pro.dts b/arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5-pro.dts
> new file mode 100644
> index 000000000000..61462c66753d
> --- /dev/null
> +++ b/arch/arm64/boot/dts/rockchip/rk3588s-orangepi-5-pro.dts
> @@ -0,0 +1,442 @@
> +// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
> +
> +/dts-v1/;
> +
> +#include "rk3588s-orangepi-5.dtsi"
> +
> +/ {
> + model = "Xunlong Orange Pi 5 Pro";
> + compatible = "xunlong,orangepi-5-pro", "rockchip,rk3588s";
> +
> + aliases {
> + mmc0 = &sdhci;
> + mmc1 = &sdmmc;
> + mmc2 = &sdio;
> + };
> +
> + hdmi1-con {
> + compatible = "hdmi-connector";
> + label = "HDMI1 OUT";
> + type = "a";
> +
> + port {
> + hdmi1_con_in: endpoint {
> + remote-endpoint = <<8711uxd_out>;
> + };
> + };
> + };
> +
> + lt8711uxd {
Please use a generic node name per DT convention. "hdmi-bridge" perhaps?
> + compatible = "lontium,lt8711uxd";
Don't you want to add "vdd-supply = <&vcc3v3_dp>;" here? It costs you
nothing, as it's already in the binding and in the driver, and having
this dependency listed explicitly will let the kernel order the driver
probes correctly, and also likely let you drop the boot-on/always-on
annotation from the regulator node.
> + ports {
> + #address-cells = <1>;
> + #size-cells = <0>;
> +
> + port@0 {
> + reg = <0>;
> +
> + lt8711uxd_in: endpoint {
> + remote-endpoint = <&dp0_out_con>;
> + };
> + };
> +
> + port@1 {
> + reg = <1>;
> +
> + lt8711uxd_out: endpoint {
> + remote-endpoint = <&hdmi1_con_in>;
> + };
> + };
> + };
> + };
> +
> + analog-sound {
> + compatible = "simple-audio-card";
> + pinctrl-names = "default";
> + pinctrl-0 = <&hp_detect>;
> + simple-audio-card,format = "i2s";
> + simple-audio-card,hp-det-gpios = <&gpio1 RK_PD5 GPIO_ACTIVE_HIGH>;
> + simple-audio-card,mclk-fs = <256>;
> + simple-audio-card,name = "rockchip,es8388";
> + simple-audio-card,routing =
> + "Headphones", "LOUT1",
> + "Headphones", "ROUT1",
> + "LINPUT1", "Microphone Jack",
> + "RINPUT1", "Microphone Jack",
> + "LINPUT2", "Onboard Microphone",
> + "RINPUT2", "Onboard Microphone";
> + simple-audio-card,widgets =
> + "Microphone", "Microphone Jack",
> + "Microphone", "Onboard Microphone",
> + "Headphone", "Headphones";
> +
> + simple-audio-card,cpu {
> + sound-dai = <&i2s2_2ch>;
> + };
> +
> + simple-audio-card,codec {
> + sound-dai = <&es8388>;
> + system-clock-frequency = <12288000>;
> + };
> + };
> +
> + pwm-leds {
> + compatible = "pwm-leds";
> +
> + led-0 {
> + color = <LED_COLOR_ID_BLUE>;
> + function = LED_FUNCTION_STATUS;
> + linux,default-trigger = "heartbeat";
> + max-brightness = <255>;
> + pwms = <&pwm15 0 1000000 0>;
> + };
> +
> + led-1 {
> + color = <LED_COLOR_ID_GREEN>;
> + function = LED_FUNCTION_ACTIVITY;
> + linux,default-trigger = "heartbeat";
> + max-brightness = <255>;
> + pwms = <&pwm3 0 1000000 0>;
> + };
> + };
> +
> + fan: pwm-fan {
> + compatible = "pwm-fan";
> + #cooling-cells = <2>;
> + cooling-levels = <0 50 100 150 200 255>;
> + fan-supply = <&vcc5v0_sys>;
> + pwms = <&pwm2 0 20000000 0>;
> + };
> +
> + vcc3v3_dp: regulator-vcc3v3-dp {
> + compatible = "regulator-fixed";
> + enable-active-high;
> + gpios = <&gpio3 RK_PC2 GPIO_ACTIVE_HIGH>;
> + pinctrl-names = "default";
> + pinctrl-0 = <&dp_bridge_en>;
> + regulator-max-microvolt = <3300000>;
> + regulator-min-microvolt = <3300000>;
> + regulator-name = "vcc3v3_dp";
> + regulator-always-on;
> + regulator-boot-on;
Please see if you can drop the always-on/boot-on properties once vdd-supply
is explicitly listed in the bridge node.
> + vin-supply = <&vcc_3v3_s3>;
> + };
> +
> + vcc3v3_phy1: regulator-vcc3v3-phy1 {
> + compatible = "regulator-fixed";
> + enable-active-high;
> + gpios = <&gpio3 RK_PB7 GPIO_ACTIVE_HIGH>;
> + pinctrl-names = "default";
> + pinctrl-0 = <&vcc3v3_phy1_en>;
The board schematics call this pin "Ethernet_EN".
> + regulator-max-microvolt = <3300000>;
> + regulator-min-microvolt = <3300000>;
> + regulator-name = "vcc3v3_phy1";
> + startup-delay-us = <50000>;
> + vin-supply = <&vcc_3v3_s3>;
> + };
> +
> + vcc5v0_otg: regulator-vcc5v0-otg {
> + compatible = "regulator-fixed";
> + enable-active-high;
> + gpios = <&gpio0 RK_PC4 GPIO_ACTIVE_HIGH>;
> + pinctrl-names = "default";
> + pinctrl-0 = <&vcc5v0_otg_en>;
> + regulator-max-microvolt = <5000000>;
> + regulator-min-microvolt = <5000000>;
> + regulator-name = "vcc5v0_otg";
> + vin-supply = <&vcc5v0_sys>;
> + };
> +
> + sdio_pwrseq: sdio-pwrseq {
> + compatible = "mmc-pwrseq-simple";
> + clocks = <&hym8563>;
> + clock-names = "ext_clock";
> + pinctrl-names = "default";
> + pinctrl-0 = <&wifi_enable_h>;
> + post-power-on-delay-ms = <200>;
> + reset-gpios = <&gpio0 RK_PD0 GPIO_ACTIVE_LOW>;
> + };
> +
> + typea_con: usb-a-connector {
> + compatible = "usb-a-connector";
> + data-role = "host";
> + label = "USB3 Type-A";
> + power-role = "source";
> + vbus-supply = <&vcc5v0_otg>;
> + };
> +};
> +
> +&dp0 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&dp0m0_pins>;
> + status = "okay";
> +};
> +
> +&dp0_in {
> + dp0_in_vp1: endpoint {
> + remote-endpoint = <&vp1_out_dp0>;
> + };
> +};
> +
> +&dp0_out {
> + dp0_out_con: endpoint {
> + remote-endpoint = <<8711uxd_in>;
> + };
> +};
> +
> +&i2c1 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&i2c1m4_xfer>;
> + status = "okay";
> +};
> +
> +&i2c3 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&i2c3m0_xfer>;
> + status = "okay";
> +
> + es8388: audio-codec@11 {
> + compatible = "everest,es8388", "everest,es8328";
> + reg = <0x11>;
> + #sound-dai-cells = <0>;
> + AVDD-supply = <&vcca_3v3_s0>;
> + DVDD-supply = <&vcca_1v8_s0>;
> + HPVDD-supply = <&vcca_3v3_s0>;
> + PVDD-supply = <&vcca_1v8_s0>;
> + assigned-clock-rates = <12288000>;
> + assigned-clocks = <&cru I2S2_2CH_MCLKOUT>;
> + clocks = <&cru I2S2_2CH_MCLKOUT>;
> + pinctrl-names = "default";
> + pinctrl-0 = <&i2s2m1_mclk>;
> + };
> +};
> +
> +&i2c4 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&i2c4m3_xfer>;
> + status = "okay";
> +};
> +
> +&i2s2_2ch {
> + pinctrl-0 = <&i2s2m1_lrck &i2s2m1_sclk
> + &i2s2m1_sdi &i2s2m1_sdo>;
> + status = "okay";
> +};
> +
> +&package_thermal {
> + polling-delay = <1000>;
> +
> + cooling-maps {
> + map0 {
> + trip = <&package_fan0>;
> + cooling-device = <&fan THERMAL_NO_LIMIT 1>;
> + };
> +
> + map1 {
> + trip = <&package_fan1>;
> + cooling-device = <&fan 2 THERMAL_NO_LIMIT>;
> + };
> + };
> +
> + trips {
> + package_fan0: package-fan0 {
> + hysteresis = <2000>;
> + temperature = <55000>;
> + type = "active";
> + };
> +
> + package_fan1: package-fan1 {
> + hysteresis = <2000>;
> + temperature = <65000>;
> + type = "active";
> + };
> + };
> +};
> +
> +/* NVMe */
> +&pcie2x1l1 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&pcie2x1l1_rst &pcie30x1m1_1_clkreqn &pcie30x1m1_1_waken>;
Is there a particular reason to use the GPIO mode for the reset pin,
rather than the (confusingly named) &pcie30x1m1_1_perstn in line with
the other two?
> + reset-gpios = <&gpio4 RK_PA2 GPIO_ACTIVE_HIGH>;
> + supports-clkreq;
> + vpcie3v3-supply = <&vcc_3v3_s3>;
> + status = "okay";
> +};
> +
> +/* NIC */
> +&pcie2x1l2 {
> + pinctrl-names = "default";
> + pinctrl-0 = <&pcie2x1l2_rst>;
Similar to the above - have you tried the dedicated hardware mode for
this pin, i.e. &pcie20x1m0_perstn? You are not requesting the
&pcie20x1m0_clkreqn or &pcie20x1m0_waken either, even though they are
routed on the board - that will probably bite you if you try
suspending the board.
> + reset-gpios = <&gpio3 RK_PD1 GPIO_ACTIVE_HIGH>;
> + vpcie3v3-supply = <&vcc3v3_phy1>;
> + status = "okay";
> +};
> +
> +&pinctrl {
> + bluetooth {
> + bt_wake_gpio: bt-wake-pin {
> + rockchip,pins = <0 RK_PC6 RK_FUNC_GPIO &pcfg_pull_none>;
If you care about power consumption of the board it's probably better
to pull this down to make sure the Bluetooth module is predictably in
a sleep state when not explicitly requested, not floating randomly.
There is no dedicated pull-up/pull-down on your board.
> + };
> +
> + bt_wake_host_irq: bt-wake-host-irq {
> + rockchip,pins = <0 RK_PC5 RK_FUNC_GPIO &pcfg_pull_down>;
> + };
> + };
> +
> + dp {
> + dp_bridge_en: dp-bridge-en {
> + rockchip,pins = <3 RK_PC2 RK_FUNC_GPIO &pcfg_pull_none>;
This pin doesn't have any dedicated pull-up/pull-down on the board, so
you might end up in a weird power state for the period of time between
the probing of the pinctrl subsystem and regulators. Better set it to
&pcfg_pull_down, which matches the power-on-reset default state of
this pin.
> + };
> + };
> +
> + pcie {
> + pcie2x1l1_rst: pcie2x1l1-rst {
> + rockchip,pins = <4 RK_PA2 RK_FUNC_GPIO &pcfg_pull_none>;
> + };
> +
> + pcie2x1l2_rst: pcie2x1l2-rst {
> + rockchip,pins = <3 RK_PD1 RK_FUNC_GPIO &pcfg_pull_none>;
> + };
> +
> + vcc3v3_phy1_en: vcc3v3-phy1-en {
The schematic calls this pin "Ethernet_EN", so perhaps use that in the
label and node name for easier reference.
> + rockchip,pins = <3 RK_PB7 RK_FUNC_GPIO &pcfg_pull_none>;
As above: no dedicated pull resistors on the board, better set to
&pcfg_pull_down in line with POR default.
> + };
> + };
> +
> + usb {
> + vcc5v0_otg_en: vcc5v0-otg-en {
> + rockchip,pins = <0 RK_PC4 RK_FUNC_GPIO &pcfg_pull_none>;
As above: no dedicated pull resistors on the board, better set to
&pcfg_pull_down in line with POR default.
> + };
> + };
> +
> + wlan {
> + wifi_enable_h: wifi-enable-h {
> + rockchip,pins = <0 RK_PD0 RK_FUNC_GPIO &pcfg_pull_none>;
As above: no dedicated pull resistors on the board, better set to
&pcfg_pull_down in line with POR default.
Best regards,
Alexey
* [soc:soc/late] BUILD SUCCESS c82aee2c760aba93dce0ad354c48195be00bc9cf
From: kernel test robot @ 2026-04-15 8:48 UTC (permalink / raw)
To: Krzysztof Kozlowski; +Cc: linux-arm-kernel, arm
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git soc/late
branch HEAD: c82aee2c760aba93dce0ad354c48195be00bc9cf Merge tag 'mvebu-dt64-7.1-1' of https://git.kernel.org/pub/scm/linux/kernel/git/gclement/mvebu into soc/late
elapsed time: 805m
configs tested: 115
configs skipped: 4
The following configs have been built successfully.
More configs may be tested in the coming days.
tested configs:
alpha allnoconfig gcc-15.2.0
alpha allyesconfig gcc-15.2.0
arc allmodconfig gcc-15.2.0
arc allnoconfig gcc-15.2.0
arc allyesconfig gcc-15.2.0
arm allnoconfig clang-23
arm allyesconfig gcc-15.2.0
arm64 allmodconfig clang-19
arm64 allnoconfig gcc-15.2.0
arm64 randconfig-001-20260415 gcc-15.2.0
arm64 randconfig-002-20260415 gcc-14.3.0
arm64 randconfig-003-20260415 gcc-13.4.0
arm64 randconfig-004-20260415 clang-19
csky allmodconfig gcc-15.2.0
csky allnoconfig gcc-15.2.0
csky randconfig-001-20260415 gcc-9.5.0
csky randconfig-002-20260415 gcc-15.2.0
hexagon allmodconfig clang-17
hexagon allnoconfig clang-23
hexagon randconfig-001-20260415 clang-23
hexagon randconfig-002-20260415 clang-23
i386 allmodconfig gcc-14
i386 allnoconfig gcc-14
i386 allyesconfig gcc-14
i386 buildonly-randconfig-001-20260415 clang-20
i386 buildonly-randconfig-002-20260415 gcc-14
i386 buildonly-randconfig-003-20260415 clang-20
i386 buildonly-randconfig-004-20260415 gcc-14
i386 buildonly-randconfig-005-20260415 gcc-14
i386 buildonly-randconfig-006-20260415 clang-20
i386 randconfig-001-20260415 gcc-14
i386 randconfig-002-20260415 clang-20
i386 randconfig-003-20260415 gcc-13
i386 randconfig-004-20260415 clang-20
i386 randconfig-005-20260415 clang-20
i386 randconfig-006-20260415 clang-20
i386 randconfig-007-20260415 clang-20
i386 randconfig-011-20260415 gcc-14
i386 randconfig-012-20260415 clang-20
i386 randconfig-013-20260415 gcc-14
i386 randconfig-014-20260415 gcc-14
i386 randconfig-015-20260415 clang-20
i386 randconfig-016-20260415 gcc-14
i386 randconfig-017-20260415 gcc-14
loongarch allnoconfig clang-23
loongarch defconfig clang-19
loongarch randconfig-001-20260415 clang-18
loongarch randconfig-002-20260415 clang-23
m68k allmodconfig gcc-15.2.0
m68k allnoconfig gcc-15.2.0
m68k allyesconfig gcc-15.2.0
m68k defconfig gcc-15.2.0
microblaze allnoconfig gcc-15.2.0
microblaze allyesconfig gcc-15.2.0
microblaze defconfig gcc-15.2.0
mips allmodconfig gcc-15.2.0
mips allnoconfig gcc-15.2.0
mips allyesconfig gcc-15.2.0
nios2 allmodconfig gcc-11.5.0
nios2 allnoconfig gcc-11.5.0
nios2 defconfig gcc-11.5.0
nios2 randconfig-001-20260415 gcc-10.5.0
nios2 randconfig-002-20260415 gcc-11.5.0
openrisc allmodconfig gcc-15.2.0
openrisc allnoconfig gcc-15.2.0
parisc allmodconfig gcc-15.2.0
parisc allnoconfig gcc-15.2.0
parisc allyesconfig gcc-15.2.0
parisc64 defconfig gcc-15.2.0
powerpc allnoconfig gcc-15.2.0
riscv allmodconfig clang-23
riscv allnoconfig gcc-15.2.0
riscv allyesconfig clang-16
s390 allmodconfig clang-18
s390 allnoconfig clang-23
s390 allyesconfig gcc-15.2.0
sh alldefconfig gcc-15.2.0
sh allmodconfig gcc-15.2.0
sh allnoconfig gcc-15.2.0
sh allyesconfig gcc-15.2.0
sh defconfig gcc-15.2.0
sparc allnoconfig gcc-15.2.0
sparc randconfig-001-20260415 gcc-8.5.0
sparc randconfig-002-20260415 gcc-11.5.0
sparc64 allmodconfig clang-23
sparc64 defconfig clang-20
sparc64 randconfig-001-20260415 clang-23
sparc64 randconfig-002-20260415 gcc-12.5.0
um allmodconfig clang-19
um allnoconfig clang-23
um allyesconfig gcc-14
um defconfig clang-23
um i386_defconfig gcc-14
um randconfig-001-20260415 clang-23
um x86_64_defconfig clang-23
x86_64 allmodconfig clang-20
x86_64 allnoconfig clang-20
x86_64 allyesconfig clang-20
x86_64 buildonly-randconfig-001-20260415 clang-20
x86_64 buildonly-randconfig-002-20260415 gcc-13
x86_64 buildonly-randconfig-003-20260415 gcc-14
x86_64 buildonly-randconfig-004-20260415 clang-20
x86_64 buildonly-randconfig-005-20260415 clang-20
x86_64 buildonly-randconfig-006-20260415 clang-20
x86_64 defconfig gcc-14
x86_64 randconfig-071-20260415 clang-20
x86_64 randconfig-072-20260415 clang-20
x86_64 randconfig-073-20260415 gcc-13
x86_64 randconfig-074-20260415 clang-20
x86_64 randconfig-075-20260415 clang-20
x86_64 randconfig-076-20260415 gcc-14
x86_64 rhel-9.4-rust clang-20
xtensa allnoconfig gcc-15.2.0
xtensa randconfig-001-20260415 gcc-8.5.0
xtensa randconfig-002-20260415 gcc-8.5.0
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH v4 3/9] coresight: etm4x: fix leaked trace id
From: Jie Gan @ 2026-04-15 8:45 UTC (permalink / raw)
To: Leo Yan, Yeoreum Yun
Cc: coresight, linux-arm-kernel, linux-kernel, suzuki.poulose,
mike.leach, james.clark, alexander.shishkin
In-Reply-To: <20260415083224.GJ356832@e132581.arm.com>
On 4/15/2026 4:32 PM, Leo Yan wrote:
> On Wed, Apr 15, 2026 at 09:01:09AM +0100, Yeoreum Yun wrote:
>
> [...]
>
>>>> What I am thinking is as SoCs continue to grow more complex with an
>>>> increasing number of subsystems, trace IDs may be exhausted in the near
>>>> future. (that's why we have dynamic trace ID allocation/release).
>>>
>>> Thanks for the input.
>>>
>>> I am wondering if we can use "dev->devt" as the trace ID. A device's
>>> major/minor number is unique in kernel and dev_t is defined as u32:
>>>
>>> typedef u32 __kernel_dev_t;
>>>
>>> And we can consolidate this for both SYSFS and PERF modes.
>>>
>>
>> When I see the CORESIGHT_TRACE_ID_MAX:
>>
>> /* architecturally we have 128 IDs some of which are reserved */
>> #define CORESIGHT_TRACE_IDS_MAX 128
>>
>> I think this came from the hardware restriction for number of TRACE_IDs.
>> In this case, clamping the device_id to trace_id seems more complex and
>> reduce some performance perspective.
>
> Sigh, my stupid. Please ignore my previous comment, let us first fix
> ID leak issue.
>
> Given Jie's comment on the use-out issue, it is valid for me especially
> if a system have many dummy tracers. We can defer to refactor it
> later (e.g., use separate ranges for hardware and dummy tracers).
>
> thanks for correction!
Just sharing some info:
From memory, the Arm AMBA ATB protocol specification defines a 7-bit
wide field for the trace ID; that's where the 128 comes from. (Each
frame also has a 7-bit field containing the trace ID.)
Thanks,
Jie
* Re: [PATCH v5] nvme: Skip trace complete_rq on host path error
From: 전민식 @ 2026-04-15 8:41 UTC (permalink / raw)
To: Keith Busch, hch@lst.de
Cc: Justin Tee, axboe@kernel.dk, sven@kernel.org, j@jannau.net,
neal@gompa.dev, sagi@grimberg.me, justin.tee@broadcom.com,
nareshgottumukkala83@gmail.com, paul.ely@broadcom.com,
James Smart, kch@nvidia.com, linux-arm-kernel@lists.infradead.org,
linux-nvme@lists.infradead.org, asahi@lists.linux.dev,
linux-kernel@vger.kernel.org, 이은수,
칸찬, 전민식
In-Reply-To: <CGME20260415084107epcms2p71b9c0d252180653ab96a9f5f2121be71@epcms2p7>
Hi Keith, hch
Gentle ping for review on [PATCH v5] nvme: Skip trace complete_rq on
host path error.
I agree with your proposed approach. Should I send a new version (v6)
based on that?
* Re: [PATCH 1/1] KVM: arm64: nv: Avoid full shadow s2 unmap
From: Marc Zyngier @ 2026-04-15 8:38 UTC (permalink / raw)
To: Wei-Lin Chang
Cc: linux-arm-kernel, kvmarm, linux-kernel, Oliver Upton, Joey Gouly,
Suzuki K Poulose, Zenghui Yu, Catalin Marinas, Will Deacon
In-Reply-To: <20260411125024.3735989-2-weilin.chang@arm.com>
On Sat, 11 Apr 2026 13:50:24 +0100,
Wei-Lin Chang <weilin.chang@arm.com> wrote:
>
> Currently we are forced to fully unmap all shadow stage-2 for a VM when
> unmapping a page from the canonical stage-2, for example during an MMU
> notifier call. This is because we are not tracking what canonical IPA
> are mapped in the shadow stage-2 page tables hence there is no way to
> know what to unmap.
>
> Create a per kvm_s2_mmu maple tree to track canonical IPA range ->
> nested IPA range, so that it is possible to partially unmap shadow
> stage-2 when a canonical IPA range is unmapped. The algorithm is simple
> and conservative:
>
> At each shadow stage-2 map, insert the nested IPA range into the maple
> tree, with the canonical IPA range as the key. If the canonical IPA
> range doesn't overlap with existing ranges in the tree, insert as is,
> and a reverse mapping for this range is established. But if the
> canonical IPA range overlaps with any existing ranges in the tree,
> create a new range that spans all the overlapping ranges including the
> input range and replace those existing ranges. In the mean time, mark
> this new spanning canonical IPA range as "polluted" indicating we lost
> track of the nested IPA ranges that map to this canonical IPA range.
>
> The maple tree's 64 bit entry is enough to store the nested IPA and
> polluted status (stored as a bit called UNKNOWN_IPA), therefore besides
> maple tree's internal operation, memory allocation is avoided.
>
> Example:
> |||| means existing range, ---- means empty range
>
> input: $$$$$$$$$$$$$$$$$$$$$$$$$$
> tree: --||||-----|||||||---------||||||||||-----------
>
> insert spanning range and replace overlapping ones:
> --||||-----||||||||||||||||||||||||||-----------
> ^^^^^^^^polluted!^^^^^^^^^
I think you should stick to a single terminology. It is either
"polluted", or "unknown IPA". My preference goes to the latter, as the
former is not very descriptive in this context.
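For reference, this is the insert path as I read the description above,
reduced to a minimal userspace sketch. It uses a small fixed array instead
of a maple tree, treats any overlap conservatively, and every name in it is
made up for illustration; it is not the code in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define REVMAP_MAX 16

struct revmap_entry {
	uint64_t start, end;  /* canonical IPA range [start, end) */
	uint64_t nested_ipa;  /* meaningful only when !unknown_ipa */
	bool unknown_ipa;     /* nested IPAs for this range were lost */
};

struct revmap {
	struct revmap_entry e[REVMAP_MAX]; /* kept pairwise disjoint */
	size_t nr;
};

/* Record canonical [start, end) -> nested_ipa. Returns false when full. */
static bool revmap_insert(struct revmap *rm, uint64_t start, uint64_t end,
			  uint64_t nested_ipa)
{
	struct revmap_entry merged = { start, end, nested_ipa, false };
	size_t i = 0;

	assert(start < end);
	while (i < rm->nr) {
		struct revmap_entry *cur = &rm->e[i];

		if (cur->end <= merged.start || merged.end <= cur->start) {
			i++;
			continue;
		}
		/* Overlap: span both ranges and absorb the old entry. */
		if (cur->start < merged.start)
			merged.start = cur->start;
		if (cur->end > merged.end)
			merged.end = cur->end;
		merged.unknown_ipa = true;
		*cur = rm->e[--rm->nr]; /* swap-remove, re-check slot i */
	}
	if (rm->nr == REVMAP_MAX)
		return false;
	rm->e[rm->nr++] = merged;
	return true;
}

/* True if unmapping canonical [start, end) must invalidate everything. */
static bool revmap_needs_full_unmap(const struct revmap *rm, uint64_t start,
				    uint64_t end)
{
	for (size_t i = 0; i < rm->nr; i++)
		if (rm->e[i].start < end && start < rm->e[i].end &&
		    rm->e[i].unknown_ipa)
			return true;
	return false;
}
```

A single pass is enough here because the stored ranges are disjoint, so
growing the new range can never reach an entry that was already skipped.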
>
> With the reverse map created, when a canonical IPA range gets unmapped,
> look into each s2 mmu's maple tree and look for canonical IPA ranges
> affected, and base on their polluted status:
>
> polluted -> fall back and fully invalidate the current shadow stage-2,
> also clear the tree
> not polluted -> unmap the nested IPA range, and remove the reverse map
> entry
>
> Suggested-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Wei-Lin Chang <weilin.chang@arm.com>
> ---
> arch/arm64/include/asm/kvm_host.h | 4 +
> arch/arm64/include/asm/kvm_nested.h | 4 +
> arch/arm64/kvm/mmu.c | 30 ++++--
> arch/arm64/kvm/nested.c | 147 +++++++++++++++++++++++++++-
> 4 files changed, 177 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 851f6171751c..a97bd461c1e1 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -217,6 +217,10 @@ struct kvm_s2_mmu {
> */
> bool nested_stage2_enabled;
>
> + /* canonical IPA to nested IPA range lookup */
> + struct maple_tree nested_revmap_mt;
> + bool nested_revmap_broken;
> +
Consider moving this boolean next to the other ones so that you don't
create too many holes in the kvm_s2_mmu structure (use pahole to find out).
But I have some misgivings about the way things are structured
here. Only NV needs a revmap, yet this is present irrespective of the
nature of the VM and bloats the data structure a bit.
My naive approach would have been to only keep a pointer to the
revmap, make that pointer NULL when the tree is "broken", and have the
tree freed under RCU if the context isn't the correct one.
This would have multiple benefits: no large-ish structure embedded in
the s2_mmu structure, no extra boolean to indicate an error condition,
memory reclaimed earlier.
> #ifdef CONFIG_PTDUMP_STAGE2_DEBUGFS
> struct dentry *shadow_pt_debugfs_dentry;
> #endif
> diff --git a/arch/arm64/include/asm/kvm_nested.h b/arch/arm64/include/asm/kvm_nested.h
> index 091544e6af44..f039220e87a6 100644
> --- a/arch/arm64/include/asm/kvm_nested.h
> +++ b/arch/arm64/include/asm/kvm_nested.h
> @@ -76,6 +76,8 @@ extern void kvm_s2_mmu_iterate_by_vmid(struct kvm *kvm, u16 vmid,
> const union tlbi_info *info,
> void (*)(struct kvm_s2_mmu *,
> const union tlbi_info *));
> +extern void kvm_record_nested_revmap(gpa_t gpa, struct kvm_s2_mmu *mmu,
> + gpa_t fault_gpa, size_t map_size);
> extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu);
> extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu);
>
> @@ -164,6 +166,8 @@ extern int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu,
> struct kvm_s2_trans *trans);
> extern int kvm_inject_s2_fault(struct kvm_vcpu *vcpu, u64 esr_el2);
> extern void kvm_nested_s2_wp(struct kvm *kvm);
> +extern void kvm_unmap_gfn_range_nested(struct kvm *kvm, gpa_t gpa, size_t size,
> + bool may_block);
> extern void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block);
> extern void kvm_nested_s2_flush(struct kvm *kvm);
>
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index d089c107d9b7..4c9b9cf6dc43 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -5,6 +5,7 @@
> */
>
> #include <linux/acpi.h>
> +#include <linux/maple_tree.h>
> #include <linux/mman.h>
> #include <linux/kvm_host.h>
> #include <linux/io.h>
> @@ -1099,6 +1100,7 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
> {
> struct kvm *kvm = kvm_s2_mmu_to_kvm(mmu);
> struct kvm_pgtable *pgt = NULL;
> + struct maple_tree *mt = &mmu->nested_revmap_mt;
>
> write_lock(&kvm->mmu_lock);
> pgt = mmu->pgt;
> @@ -1108,8 +1110,11 @@ void kvm_free_stage2_pgd(struct kvm_s2_mmu *mmu)
> free_percpu(mmu->last_vcpu_ran);
> }
>
> - if (kvm_is_nested_s2_mmu(kvm, mmu))
> + if (kvm_is_nested_s2_mmu(kvm, mmu)) {
> + if (!mtree_empty(mt))
> + mtree_destroy(mt);
> kvm_init_nested_s2_mmu(mmu);
> + }
>
> write_unlock(&kvm->mmu_lock);
>
> @@ -1631,6 +1636,10 @@ static int gmem_abort(const struct kvm_s2_fault_desc *s2fd)
> goto out_unlock;
> }
>
> + if (s2fd->nested)
> + kvm_record_nested_revmap(gfn << PAGE_SHIFT, pgt->mmu,
> + s2fd->fault_ipa, PAGE_SIZE);
> +
> ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, s2fd->fault_ipa, PAGE_SIZE,
> __pfn_to_phys(pfn), prot,
> memcache, flags);
> @@ -2031,6 +2040,13 @@ static int kvm_s2_fault_map(const struct kvm_s2_fault_desc *s2fd,
> ret = KVM_PGT_FN(kvm_pgtable_stage2_relax_perms)(pgt, gfn_to_gpa(gfn),
> prot, flags);
> } else {
> + if (s2fd->nested) {
> + phys_addr_t ipa = gfn_to_gpa(get_canonical_gfn(s2fd, s2vi));
> +
> + ipa &= ~(mapping_size - 1);
I guess it'd be worth adding a helper for this instead of duplicating
the existing code.
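As a rough illustration of such a helper, here is a minimal userspace sketch. It assumes mapping_size is always a non-zero power of two; the name revmap_align_ipa() is hypothetical and does not exist in the patch or in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round an IPA down to the start of the
 * mapping_size-sized block that contains it.
 */
static inline uint64_t revmap_align_ipa(uint64_t ipa, uint64_t mapping_size)
{
	/* The mask trick is only valid for non-zero powers of two. */
	assert(mapping_size && !(mapping_size & (mapping_size - 1)));

	return ipa & ~(mapping_size - 1);
}
```

Both call sites that currently open-code the mask could then share the one helper.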
> + kvm_record_nested_revmap(ipa, pgt->mmu, gfn_to_gpa(gfn),
> + mapping_size);
This worries me a bit, see below.
> + }
> ret = KVM_PGT_FN(kvm_pgtable_stage2_map)(pgt, gfn_to_gpa(gfn), mapping_size,
> __pfn_to_phys(pfn), prot,
> memcache, flags);
> @@ -2388,14 +2404,16 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
>
> bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
> {
> + gpa_t gpa = range->start << PAGE_SHIFT;
> + size_t size = (range->end - range->start) << PAGE_SHIFT;
> + bool may_block = range->may_block;
> +
> if (!kvm->arch.mmu.pgt || kvm_vm_is_protected(kvm))
> return false;
>
> - __unmap_stage2_range(&kvm->arch.mmu, range->start << PAGE_SHIFT,
> - (range->end - range->start) << PAGE_SHIFT,
> - range->may_block);
> + __unmap_stage2_range(&kvm->arch.mmu, gpa, size, may_block);
> + kvm_unmap_gfn_range_nested(kvm, gpa, size, may_block);
>
> - kvm_nested_s2_unmap(kvm, range->may_block);
> return false;
> }
>
> @@ -2673,7 +2691,7 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
>
> write_lock(&kvm->mmu_lock);
> kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, size, true);
> - kvm_nested_s2_unmap(kvm, true);
> + kvm_unmap_gfn_range_nested(kvm, gpa, size, true);
> write_unlock(&kvm->mmu_lock);
> }
>
> diff --git a/arch/arm64/kvm/nested.c b/arch/arm64/kvm/nested.c
> index 883b6c1008fb..c9ebe969b453 100644
> --- a/arch/arm64/kvm/nested.c
> +++ b/arch/arm64/kvm/nested.c
> @@ -7,6 +7,7 @@
> #include <linux/bitfield.h>
> #include <linux/kvm.h>
> #include <linux/kvm_host.h>
> +#include <linux/maple_tree.h>
>
> #include <asm/fixmap.h>
> #include <asm/kvm_arm.h>
> @@ -43,6 +44,19 @@ struct vncr_tlb {
> */
> #define S2_MMU_PER_VCPU 2
>
> +/*
> + * Per shadow S2 reverse map (IPA -> nested IPA range) maple tree payload
> + * layout:
> + *
> + * bit 63: valid, 1 for non-polluted entries, prevents the case where the
> + * nested IPA is 0 and turns the whole value to 0
> + * bits 55-12: nested IPA bits 55-12
> + * bit 0: polluted, 1 for polluted, 0 for not
> + */
> +#define VALID_ENTRY BIT(63)
> +#define NESTED_IPA_MASK GENMASK_ULL(55, 12)
> +#define UNKNOWN_IPA BIT(0)
> +
This only works because you are using the "advanced" API, right?
Otherwise, you'd be losing the high bit. It'd be good to add a comment
so that people keep that in mind.
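For reference, the payload layout described in the quoted comment can be sketched in plain C as below. This is a userspace illustration only: the helper names (revmap_mk_entry() and friends) are hypothetical and not part of the patch, the macros are spelled out without BIT()/GENMASK_ULL(), and it assumes the nested IPA is page aligned and fits in bits 55-12.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define VALID_ENTRY	(1ULL << 63)
/* Bits 55-12, i.e. the equivalent of GENMASK_ULL(55, 12). */
#define NESTED_IPA_MASK	(((1ULL << 56) - 1) & ~((1ULL << 12) - 1))
#define UNKNOWN_IPA	(1ULL << 0)

/* Build a non-polluted entry; the valid bit keeps IPA 0 distinct from "empty". */
static inline uint64_t revmap_mk_entry(uint64_t nested_ipa)
{
	assert(!(nested_ipa & ~NESTED_IPA_MASK));

	return VALID_ENTRY | nested_ipa;
}

/* Build a polluted entry: the nested IPA is unknown. */
static inline uint64_t revmap_mk_polluted(void)
{
	return UNKNOWN_IPA;
}

static inline bool revmap_entry_polluted(uint64_t entry)
{
	return entry & UNKNOWN_IPA;
}

static inline uint64_t revmap_entry_ipa(uint64_t entry)
{
	return entry & NESTED_IPA_MASK;
}
```

Note that a valid entry for nested IPA 0 is still non-zero, which is the point of bit 63 in the layout.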
> void kvm_init_nested(struct kvm *kvm)
> {
> kvm->arch.nested_mmus = NULL;
> @@ -769,12 +783,57 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu)
> return s2_mmu;
> }
>
> +void kvm_record_nested_revmap(gpa_t ipa, struct kvm_s2_mmu *mmu,
> + gpa_t fault_ipa, size_t map_size)
> +{
> + struct maple_tree *mt = &mmu->nested_revmap_mt;
> + gpa_t start = ipa;
> + gpa_t end = ipa + map_size - 1;
> + u64 entry, new_entry = 0;
> + MA_STATE(mas, mt, start, end);
> +
> + if (mmu->nested_revmap_broken)
> + return;
> +
> + mtree_lock(mt);
> + entry = (u64)mas_find_range(&mas, end);
> +
> + if (entry) {
> + /* maybe just a perm update... */
> + if (!(entry & UNKNOWN_IPA) && mas.index == start &&
> + mas.last == end &&
> + fault_ipa == (entry & NESTED_IPA_MASK))
> + goto unlock;
> + /*
> + * Create a "polluted" range that spans all the overlapping
> + * ranges and store it.
> + */
> + while (entry && mas.index <= end) {
> + start = min(mas.index, start);
> + end = max(mas.last, end);
> + entry = (u64)mas_find_range(&mas, end);
> + }
> + new_entry |= UNKNOWN_IPA;
> + } else {
> + new_entry |= fault_ipa;
> + new_entry |= VALID_ENTRY;
> + }
> +
> + mas_set_range(&mas, start, end);
> + if (mas_store_gfp(&mas, (void *)new_entry, GFP_NOWAIT | __GFP_ACCOUNT))
> + mmu->nested_revmap_broken = true;
Can we try and minimise the risk of allocation failure here?
user_mem_abort() tries very hard to pre-allocate pages for page
tables by maintaining a memcache. Can we have a similar approach for
the revmap?
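To make the idea concrete, here is a minimal userspace sketch of the "top up before locking, consume under the lock" pattern. It is only an analogy for what a memcache does: all names (revmap_cache, revmap_cache_topup(), ...) are hypothetical, calloc() stands in for a kernel allocation, and it says nothing about how the maple tree would actually consume pre-allocated memory.

```c
#include <assert.h>
#include <stdlib.h>

#define REVMAP_CACHE_MAX	8

/* Hypothetical cache of pre-allocated objects, filled before locking. */
struct revmap_cache {
	void *objs[REVMAP_CACHE_MAX];
	int nobjs;
};

/* May fail (and could sleep in a kernel context): call it outside the lock. */
static int revmap_cache_topup(struct revmap_cache *c, int min, size_t size)
{
	assert(min <= REVMAP_CACHE_MAX);

	while (c->nobjs < min) {
		void *obj = calloc(1, size);

		if (!obj)
			return -1;
		c->objs[c->nobjs++] = obj;
	}

	return 0;
}

/* Cannot fail once the cache has been topped up: usable under the lock. */
static void *revmap_cache_alloc(struct revmap_cache *c)
{
	assert(c->nobjs > 0);

	return c->objs[--c->nobjs];
}

/* Release whatever was not consumed. */
static void revmap_cache_free(struct revmap_cache *c)
{
	while (c->nobjs)
		free(c->objs[--c->nobjs]);
}
```

The failure point moves to the top-up step, where the caller can still back out cleanly instead of marking the revmap as broken.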
> +unlock:
> + mtree_unlock(mt);
> +}
> +
> void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu)
> {
> /* CnP being set denotes an invalid entry */
> mmu->tlb_vttbr = VTTBR_CNP_BIT;
> mmu->nested_stage2_enabled = false;
> atomic_set(&mmu->refcnt, 0);
> + mt_init(&mmu->nested_revmap_mt);
> + mmu->nested_revmap_broken = false;
> }
>
> void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu)
> @@ -1150,6 +1209,90 @@ void kvm_nested_s2_wp(struct kvm *kvm)
> kvm_invalidate_vncr_ipa(kvm, 0, BIT(kvm->arch.mmu.pgt->ia_bits));
> }
>
> +static void reset_revmap_and_unmap(struct kvm_s2_mmu *mmu, bool may_block)
> +{
> + mtree_destroy(&mmu->nested_revmap_mt);
> + kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), may_block);
> + mmu->nested_revmap_broken = false;
> +}
> +
> +static void unmap_mmu_ipa_range(struct kvm_s2_mmu *mmu, gpa_t gpa,
> + size_t unmap_size, bool may_block)
> +{
> + struct maple_tree *mt = &mmu->nested_revmap_mt;
> + gpa_t start = gpa;
> + gpa_t end = gpa + unmap_size - 1;
> + u64 entry;
> + size_t entry_size;
> + bool unlock, fallback;
> + MA_STATE(mas, mt, gpa, end);
> +
> + if (mmu->nested_revmap_broken) {
> + unlock = false;
> + fallback = true;
> + goto fin;
> + }
Using booleans to affect the control flow reads really badly. I'd
expect this to simply be:
	if (...) {
		reset_revmap_and_unmap(mmu, may_block);
		return;
	}
> +
> + mtree_lock(mt);
> + entry = (u64)mas_find_range(&mas, end);
> +
> + while (entry && mas.index <= end) {
> + start = mas.last + 1;
> + entry_size = mas.last - mas.index + 1;
> + /*
> + * Give up and invalidate this s2 mmu if the unmap range
> + * touches any polluted range.
> + */
> + if (entry & UNKNOWN_IPA) {
> + unlock = true;
> + fallback = true;
> + goto fin;
> + }
and this to be:
	if (entry & UNKNOWN_IPA) {
		mtree_unlock(mt);
		reset_revmap_and_unmap(mmu, may_block);
		return;
	}
> +
> + /*
> + * Ignore result, it is okay if a reverse mapping erase
> + * fails.
> + */
> + mas_store_gfp(&mas, NULL, GFP_NOWAIT | __GFP_ACCOUNT);
> +
> + mtree_unlock(mt);
> + kvm_stage2_unmap_range(mmu, entry & NESTED_IPA_MASK, entry_size,
> + may_block);
> + mtree_lock(mt);
> + /*
> + * Other maple tree operations during preemption could render
> + * this ma_state invalid, so reset it.
> + */
> + mas_set_range(&mas, start, end);
> + entry = (u64)mas_find_range(&mas, end);
> + }
> + unlock = true;
> + fallback = false;
> +
> +fin:
> + if (unlock)
> + mtree_unlock(mt);
> + if (fallback)
> + reset_revmap_and_unmap(mmu, may_block);
and this can eventually be greatly simplified.
> +}
> +
> +void kvm_unmap_gfn_range_nested(struct kvm *kvm, gpa_t gpa, size_t size,
> + bool may_block)
> +{
> + int i;
> +
> + if (!kvm->arch.nested_mmus_size)
> + return;
> +
> + /* TODO: accelerate this using mt of canonical s2 mmu */
> + for (i = 0; i < kvm->arch.nested_mmus_size; i++) {
> + struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
> +
> + if (kvm_s2_mmu_valid(mmu))
> + unmap_mmu_ipa_range(mmu, gpa, size, may_block);
> + }
> +}
> +
> void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block)
> {
> int i;
> @@ -1163,7 +1306,7 @@ void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block)
> struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
>
> if (kvm_s2_mmu_valid(mmu))
> - kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), may_block);
> + reset_revmap_and_unmap(mmu, may_block);
> }
>
> kvm_invalidate_vncr_ipa(kvm, 0, BIT(kvm->arch.mmu.pgt->ia_bits));
> @@ -1848,7 +1991,7 @@ void check_nested_vcpu_requests(struct kvm_vcpu *vcpu)
>
> write_lock(&vcpu->kvm->mmu_lock);
> if (mmu->pending_unmap) {
> - kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), true);
> + reset_revmap_and_unmap(mmu, true);
> mmu->pending_unmap = false;
> }
> write_unlock(&vcpu->kvm->mmu_lock);
My other concern here is related to TLB invalidation. As the guest
performs TLB invalidations that remove entries from the shadow S2,
there is no way to update the revmap to account for this.
This obviously means that the revmap becomes more and more inaccurate
over time, and that it is likely to accumulate conflicting entries.
What is the plan to improve the situation on this front?
Thanks,
M.
--
Without deviation from the norm, progress is not possible.
* [PATCH v2 2/4] soc: amlogic: clk-measure: Add A1 and T7 support
From: Jian Hu via B4 Relay @ 2026-04-15 8:33 UTC
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
Cc: devicetree, linux-arm-kernel, linux-amlogic, linux-kernel,
Jian Hu
In-Reply-To: <20260415-clkmsr_a1_t7-v2-0-02b6314427e6@amlogic.com>
From: Jian Hu <jian.hu@amlogic.com>
Add support for the A1 and T7 SoC families to the Amlogic clk measure driver.
Signed-off-by: Jian Hu <jian.hu@amlogic.com>
---
drivers/soc/amlogic/meson-clk-measure.c | 272 ++++++++++++++++++++++++++++++++
1 file changed, 272 insertions(+)
diff --git a/drivers/soc/amlogic/meson-clk-measure.c b/drivers/soc/amlogic/meson-clk-measure.c
index d862e30a244e..8c4f3cc8c8ab 100644
--- a/drivers/soc/amlogic/meson-clk-measure.c
+++ b/drivers/soc/amlogic/meson-clk-measure.c
@@ -787,6 +787,258 @@ static const struct meson_msr_id clk_msr_s4[] = {
};
+static const struct meson_msr_id clk_msr_a1[] = {
+ CLK_MSR_ID(0, "tdmout_b_sclk"),
+ CLK_MSR_ID(1, "tdmout_a_sclk"),
+ CLK_MSR_ID(2, "tdmin_lb_sclk"),
+ CLK_MSR_ID(3, "tdmin_b_sclk"),
+ CLK_MSR_ID(4, "tdmin_a_sclk"),
+ CLK_MSR_ID(5, "vad"),
+ CLK_MSR_ID(6, "resamplea"),
+ CLK_MSR_ID(7, "pdm_sysclk"),
+ CLK_MSR_ID(8, "pdm_dclk"),
+ CLK_MSR_ID(9, "locker_out"),
+ CLK_MSR_ID(10, "locker_in"),
+ CLK_MSR_ID(11, "spdifin"),
+ CLK_MSR_ID(12, "tdmin_vad"),
+ CLK_MSR_ID(13, "au_adc"),
+ CLK_MSR_ID(14, "au_dac"),
+ CLK_MSR_ID(16, "spicc_a"),
+ CLK_MSR_ID(17, "spifc"),
+ CLK_MSR_ID(18, "sd_emmc_a"),
+ CLK_MSR_ID(19, "dmcx4"),
+ CLK_MSR_ID(20, "dmc"),
+ CLK_MSR_ID(21, "psram"),
+ CLK_MSR_ID(22, "cecb"),
+ CLK_MSR_ID(23, "ceca"),
+ CLK_MSR_ID(24, "ts"),
+ CLK_MSR_ID(25, "pwm_f"),
+ CLK_MSR_ID(26, "pwm_e"),
+ CLK_MSR_ID(27, "pwm_d"),
+ CLK_MSR_ID(28, "pwm_c"),
+ CLK_MSR_ID(29, "pwm_b"),
+ CLK_MSR_ID(30, "pwm_a"),
+ CLK_MSR_ID(31, "saradc"),
+ CLK_MSR_ID(32, "usb_bus"),
+ CLK_MSR_ID(33, "dsp_b"),
+ CLK_MSR_ID(34, "dsp_a"),
+ CLK_MSR_ID(35, "axi"),
+ CLK_MSR_ID(36, "sys"),
+ CLK_MSR_ID(40, "rng_ring_osc0"),
+ CLK_MSR_ID(41, "rng_ring_osc1"),
+ CLK_MSR_ID(42, "rng_ring_osc2"),
+ CLK_MSR_ID(43, "rng_ring_osc3"),
+ CLK_MSR_ID(44, "dds_out"),
+ CLK_MSR_ID(45, "cpu_clk_div16"),
+ CLK_MSR_ID(46, "gpio_msr"),
+ CLK_MSR_ID(50, "osc_ring_cpu0"),
+ CLK_MSR_ID(51, "osc_ring_cpu1"),
+ CLK_MSR_ID(54, "osc_ring_top0"),
+ CLK_MSR_ID(55, "osc_ring_top1"),
+ CLK_MSR_ID(56, "osc_ring_ddr"),
+ CLK_MSR_ID(57, "osc_ring_dmc"),
+ CLK_MSR_ID(58, "osc_ring_dspa"),
+ CLK_MSR_ID(59, "osc_ring_dspb"),
+ CLK_MSR_ID(60, "osc_ring_rama"),
+ CLK_MSR_ID(61, "osc_ring_ramb"),
+};
+
+static const struct meson_msr_id clk_msr_t7[] = {
+ CLK_MSR_ID(0, "sys"),
+ CLK_MSR_ID(1, "axi"),
+ CLK_MSR_ID(2, "rtc"),
+ CLK_MSR_ID(3, "dspa"),
+ CLK_MSR_ID(4, "dspb"),
+ CLK_MSR_ID(5, "mali"),
+ CLK_MSR_ID(6, "sys_cpu_clk_div16"),
+ CLK_MSR_ID(7, "ceca"),
+ CLK_MSR_ID(8, "cecb"),
+ CLK_MSR_ID(10, "fclk_div5"),
+ CLK_MSR_ID(11, "mpll0"),
+ CLK_MSR_ID(12, "mpll1"),
+ CLK_MSR_ID(13, "mpll2"),
+ CLK_MSR_ID(14, "mpll3"),
+ CLK_MSR_ID(15, "mpll_50m"),
+ CLK_MSR_ID(16, "pcie_inp"),
+ CLK_MSR_ID(17, "pcie_inn"),
+ CLK_MSR_ID(18, "mpll_test_out"),
+ CLK_MSR_ID(19, "hifi_pll"),
+ CLK_MSR_ID(20, "gp0_pll"),
+ CLK_MSR_ID(21, "gp1_pll"),
+ CLK_MSR_ID(22, "eth_mppll_50m"),
+ CLK_MSR_ID(23, "sys_pll_div16"),
+ CLK_MSR_ID(24, "ddr_dpll_pt"),
+ CLK_MSR_ID(25, "earcrx_pll"),
+ CLK_MSR_ID(26, "paie1_clk_inp"),
+ CLK_MSR_ID(27, "paie1_clk_inn"),
+ CLK_MSR_ID(28, "amlgdc"),
+ CLK_MSR_ID(29, "gdc"),
+ CLK_MSR_ID(30, "mod_eth_phy_ref"),
+ CLK_MSR_ID(31, "mod_eth_tx"),
+ CLK_MSR_ID(32, "eth_clk125Mhz"),
+ CLK_MSR_ID(33, "eth_clk_rmii"),
+ CLK_MSR_ID(34, "co_clkin_to_mac"),
+ CLK_MSR_ID(35, "mod_eth_rx_clk_rmii"),
+ CLK_MSR_ID(36, "co_rx"),
+ CLK_MSR_ID(37, "co_tx"),
+ CLK_MSR_ID(38, "eth_phy_rxclk"),
+ CLK_MSR_ID(39, "eth_phy_plltxclk"),
+ CLK_MSR_ID(40, "ephy_test"),
+ CLK_MSR_ID(41, "dsi_b_meas"),
+ CLK_MSR_ID(42, "hdmirx_apl"),
+ CLK_MSR_ID(43, "hdmirx_tmds"),
+ CLK_MSR_ID(44, "hdmirx_cable"),
+ CLK_MSR_ID(45, "hdmirx_apll_clk_audio"),
+ CLK_MSR_ID(46, "hdmirx_5m"),
+ CLK_MSR_ID(47, "hdmirx_2m"),
+ CLK_MSR_ID(48, "hdmirx_cfg"),
+ CLK_MSR_ID(49, "hdmirx_hdcp2x_eclk"),
+ CLK_MSR_ID(50, "vid_pll0_div"),
+ CLK_MSR_ID(51, "hdmi_vid_pll"),
+ CLK_MSR_ID(54, "vdac_clk"),
+ CLK_MSR_ID(55, "vpu_clk_buf"),
+ CLK_MSR_ID(56, "mod_tcon_clko"),
+ CLK_MSR_ID(57, "lcd_an_clk_ph2"),
+ CLK_MSR_ID(58, "lcd_an_clk_ph3"),
+ CLK_MSR_ID(59, "hdmi_tx_pixel"),
+ CLK_MSR_ID(60, "vdin_meas"),
+ CLK_MSR_ID(61, "vpu_clk"),
+ CLK_MSR_ID(62, "vpu_clkb"),
+ CLK_MSR_ID(63, "vpu_clkb_tmp"),
+ CLK_MSR_ID(64, "vpu_clkc"),
+ CLK_MSR_ID(65, "vid_lock"),
+ CLK_MSR_ID(66, "vapbclk"),
+ CLK_MSR_ID(67, "ge2d"),
+ CLK_MSR_ID(68, "aud_pll"),
+ CLK_MSR_ID(69, "aud_sck"),
+ CLK_MSR_ID(70, "dsi_a_meas"),
+ CLK_MSR_ID(72, "mipi_csi_phy"),
+ CLK_MSR_ID(73, "mipi_isp"),
+ CLK_MSR_ID(76, "hdmitx_tmds"),
+ CLK_MSR_ID(77, "hdmitx_sys"),
+ CLK_MSR_ID(78, "hdmitx_fe"),
+ CLK_MSR_ID(80, "hdmitx_prif"),
+ CLK_MSR_ID(81, "hdmitx_200m"),
+ CLK_MSR_ID(82, "hdmitx_aud"),
+ CLK_MSR_ID(83, "hdmitx_pnx"),
+ CLK_MSR_ID(84, "spicc5"),
+ CLK_MSR_ID(85, "spicc4"),
+ CLK_MSR_ID(86, "spicc3"),
+ CLK_MSR_ID(87, "spicc2"),
+ CLK_MSR_ID(93, "vdec"),
+ CLK_MSR_ID(94, "wave521_aclk"),
+ CLK_MSR_ID(95, "wave521_cclk"),
+ CLK_MSR_ID(96, "wave521_bclk"),
+ CLK_MSR_ID(97, "hcodec"),
+ CLK_MSR_ID(98, "hevcb"),
+ CLK_MSR_ID(99, "hevcf"),
+ CLK_MSR_ID(100, "hdmi_aud_pll"),
+ CLK_MSR_ID(101, "hdmi_acr_ref"),
+ CLK_MSR_ID(102, "hdmi_meter"),
+ CLK_MSR_ID(103, "hdmi_vid"),
+ CLK_MSR_ID(104, "hdmi_aud"),
+ CLK_MSR_ID(105, "hdmi_dsd"),
+ CLK_MSR_ID(108, "dsi1_phy"),
+ CLK_MSR_ID(109, "dsi0_phy"),
+ CLK_MSR_ID(110, "smartcard"),
+ CLK_MSR_ID(111, "sar_adc"),
+ CLK_MSR_ID(113, "sd_emmc_c"),
+ CLK_MSR_ID(114, "sd_emmc_b"),
+ CLK_MSR_ID(115, "sd_emmc_a"),
+ CLK_MSR_ID(116, "gpio_msr"),
+ CLK_MSR_ID(117, "spicc1"),
+ CLK_MSR_ID(118, "spicc0"),
+ CLK_MSR_ID(119, "anakin"),
+ CLK_MSR_ID(121, "ts_clk(temp sensor)"),
+ CLK_MSR_ID(122, "ts_a73"),
+ CLK_MSR_ID(123, "ts_a53"),
+ CLK_MSR_ID(124, "ts_nna"),
+ CLK_MSR_ID(130, "audio_vad"),
+ CLK_MSR_ID(131, "acodec_dac_clk_x128"),
+ CLK_MSR_ID(132, "audio_locker_in"),
+ CLK_MSR_ID(133, "audio_locker_out"),
+ CLK_MSR_ID(134, "audio_tdmout_c_sclk"),
+ CLK_MSR_ID(135, "audio_tdmout_b_sclk"),
+ CLK_MSR_ID(136, "audio_tdmout_a_sclk"),
+ CLK_MSR_ID(137, "audio_tdmin_lb_sclk"),
+ CLK_MSR_ID(138, "audio_tdmin_c_sclk"),
+ CLK_MSR_ID(139, "audio_tdmin_b_sclk"),
+ CLK_MSR_ID(140, "audio_tdmin_a_sclk"),
+ CLK_MSR_ID(141, "audio_resamplea"),
+ CLK_MSR_ID(142, "audio_pdm_sysclk"),
+ CLK_MSR_ID(143, "audio_spdifoutb_mst"),
+ CLK_MSR_ID(144, "audio_spdifout_mst"),
+ CLK_MSR_ID(145, "audio_spdifin_mst"),
+ CLK_MSR_ID(146, "audio_pdm_dclk"),
+ CLK_MSR_ID(147, "audio_resampleb"),
+ CLK_MSR_ID(148, "earcrx_pll_dmac"),
+ CLK_MSR_ID(156, "pwm_ao_h"),
+ CLK_MSR_ID(157, "pwm_ao_g"),
+ CLK_MSR_ID(158, "pwm_ao_f"),
+ CLK_MSR_ID(159, "pwm_ao_e"),
+ CLK_MSR_ID(160, "pwm_ao_d"),
+ CLK_MSR_ID(161, "pwm_ao_c"),
+ CLK_MSR_ID(162, "pwm_ao_b"),
+ CLK_MSR_ID(163, "pwm_ao_a"),
+ CLK_MSR_ID(164, "pwm_f"),
+ CLK_MSR_ID(165, "pwm_e"),
+ CLK_MSR_ID(166, "pwm_d"),
+ CLK_MSR_ID(167, "pwm_c"),
+ CLK_MSR_ID(168, "pwm_b"),
+ CLK_MSR_ID(169, "pwm_a"),
+ CLK_MSR_ID(170, "aclkm"),
+ CLK_MSR_ID(171, "mclk_pll"),
+ CLK_MSR_ID(172, "a73_sys_pll_div16"),
+ CLK_MSR_ID(173, "a73_cpu_clk_div16"),
+ CLK_MSR_ID(176, "rng_ring_0"),
+ CLK_MSR_ID(177, "rng_ring_1"),
+ CLK_MSR_ID(178, "rng_ring_2"),
+ CLK_MSR_ID(179, "rng_ring_3"),
+ CLK_MSR_ID(180, "am_ring_out0"),
+ CLK_MSR_ID(181, "am_ring_out1"),
+ CLK_MSR_ID(182, "am_ring_out2"),
+ CLK_MSR_ID(183, "am_ring_out3"),
+ CLK_MSR_ID(184, "am_ring_out4"),
+ CLK_MSR_ID(185, "am_ring_out5"),
+ CLK_MSR_ID(186, "am_ring_out6"),
+ CLK_MSR_ID(187, "am_ring_out7"),
+ CLK_MSR_ID(188, "am_ring_out8"),
+ CLK_MSR_ID(189, "am_ring_out9"),
+ CLK_MSR_ID(190, "am_ring_out10"),
+ CLK_MSR_ID(191, "am_ring_out11"),
+ CLK_MSR_ID(192, "am_ring_out12"),
+ CLK_MSR_ID(193, "am_ring_out13"),
+ CLK_MSR_ID(194, "am_ring_out14"),
+ CLK_MSR_ID(195, "am_ring_out15"),
+ CLK_MSR_ID(196, "am_ring_out16"),
+ CLK_MSR_ID(197, "am_ring_out17"),
+ CLK_MSR_ID(198, "am_ring_out18"),
+ CLK_MSR_ID(199, "am_ring_out19"),
+ CLK_MSR_ID(200, "mipi_csi_phy0"),
+ CLK_MSR_ID(201, "mipi_csi_phy1"),
+ CLK_MSR_ID(202, "mipi_csi_phy2"),
+ CLK_MSR_ID(203, "mipi_csi_phy3"),
+ CLK_MSR_ID(204, "vid_pll1_div"),
+ CLK_MSR_ID(205, "vid_pll2_div"),
+ CLK_MSR_ID(206, "am_ring_out20"),
+ CLK_MSR_ID(207, "am_ring_out21"),
+ CLK_MSR_ID(208, "am_ring_out22"),
+ CLK_MSR_ID(209, "am_ring_out23"),
+ CLK_MSR_ID(210, "am_ring_out24"),
+ CLK_MSR_ID(211, "am_ring_out25"),
+ CLK_MSR_ID(212, "am_ring_out26"),
+ CLK_MSR_ID(213, "am_ring_out27"),
+ CLK_MSR_ID(214, "am_ring_out28"),
+ CLK_MSR_ID(215, "am_ring_out29"),
+ CLK_MSR_ID(216, "am_ring_out30"),
+ CLK_MSR_ID(217, "am_ring_out31"),
+ CLK_MSR_ID(218, "am_ring_out32"),
+ CLK_MSR_ID(219, "enc0_if"),
+ CLK_MSR_ID(220, "enc2"),
+ CLK_MSR_ID(221, "enc1"),
+ CLK_MSR_ID(222, "enc0")
+};
+
static int meson_measure_id(struct meson_msr_id *clk_msr_id,
unsigned int duration)
{
@@ -1026,6 +1278,18 @@ static const struct meson_msr_data clk_msr_s4_data = {
.reg = &msr_reg_offset_v2,
};
+static const struct meson_msr_data clk_msr_a1_data = {
+ .msr_table = (void *)clk_msr_a1,
+ .msr_count = ARRAY_SIZE(clk_msr_a1),
+ .reg = &msr_reg_offset_v2,
+};
+
+static const struct meson_msr_data clk_msr_t7_data = {
+ .msr_table = (void *)clk_msr_t7,
+ .msr_count = ARRAY_SIZE(clk_msr_t7),
+ .reg = &msr_reg_offset_v2,
+};
+
static const struct of_device_id meson_msr_match_table[] = {
{
.compatible = "amlogic,meson-gx-clk-measure",
@@ -1059,6 +1323,14 @@ static const struct of_device_id meson_msr_match_table[] = {
.compatible = "amlogic,s4-clk-measure",
.data = &clk_msr_s4_data,
},
+ {
+ .compatible = "amlogic,a1-clk-measure",
+ .data = &clk_msr_a1_data,
+ },
+ {
+ .compatible = "amlogic,t7-clk-measure",
+ .data = &clk_msr_t7_data,
+ },
{ /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, meson_msr_match_table);
--
2.47.1
* [PATCH v2 0/4] soc: amlogic: clk-measure: add A1 and T7 support
From: Jian Hu via B4 Relay @ 2026-04-15 8:33 UTC
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
Cc: devicetree, linux-arm-kernel, linux-amlogic, linux-kernel,
Jian Hu
This series adds Amlogic clock measurement support for A1 and T7 SoCs,
including binding updates, driver additions, and device tree enablement.
Signed-off-by: Jian Hu <jian.hu@amlogic.com>
---
Changes in v2:
- Add const for a1 and t7 clock measure table.
- Use b4 to send this series.
- Link to v1: https://lore.kernel.org/all/20260410100329.3167482-1-jian.hu@amlogic.com
---
Jian Hu (4):
dt-bindings: soc: amlogic: clk-measure: Add A1 and T7 compatible
soc: amlogic: clk-measure: Add A1 and T7 support
arm64: dts: meson: a1: Add clk measure support
arm64: dts: amlogic: t7: Add clk measure support
.../soc/amlogic/amlogic,meson-gx-clk-measure.yaml | 2 +
arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi | 5 +
arch/arm64/boot/dts/amlogic/meson-a1.dtsi | 5 +
drivers/soc/amlogic/meson-clk-measure.c | 272 +++++++++++++++++++++
4 files changed, 284 insertions(+)
---
base-commit: 401e5c73eedde8225e87bd11c794b8409248ff41
change-id: 20260415-clkmsr_a1_t7-9820984d0af1
Best regards,
--
Jian Hu <jian.hu@amlogic.com>
* [PATCH v2 4/4] arm64: dts: amlogic: t7: Add clk measure support
From: Jian Hu via B4 Relay @ 2026-04-15 8:33 UTC
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
Cc: devicetree, linux-arm-kernel, linux-amlogic, linux-kernel,
Jian Hu
In-Reply-To: <20260415-clkmsr_a1_t7-v2-0-02b6314427e6@amlogic.com>
From: Jian Hu <jian.hu@amlogic.com>
Add the clock measure device to the T7 SoC family.
Signed-off-by: Jian Hu <jian.hu@amlogic.com>
---
arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi b/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi
index 7fe72c94ed62..cec2ea74850d 100644
--- a/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi
+++ b/arch/arm64/boot/dts/amlogic/amlogic-t7.dtsi
@@ -701,6 +701,11 @@ pwm_ao_cd: pwm@60000 {
status = "disabled";
};
+ clock-measurer@48000 {
+ compatible = "amlogic,t7-clk-measure";
+ reg = <0x0 0x48000 0x0 0x1c>;
+ };
+
sd_emmc_a: mmc@88000 {
compatible = "amlogic,t7-mmc", "amlogic,meson-axg-mmc";
reg = <0x0 0x88000 0x0 0x800>;
--
2.47.1
* [PATCH v2 3/4] arm64: dts: meson: a1: Add clk measure support
From: Jian Hu via B4 Relay @ 2026-04-15 8:33 UTC
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
Cc: devicetree, linux-arm-kernel, linux-amlogic, linux-kernel,
Jian Hu
In-Reply-To: <20260415-clkmsr_a1_t7-v2-0-02b6314427e6@amlogic.com>
From: Jian Hu <jian.hu@amlogic.com>
Add the clock measure device to the A1 SoC family.
Signed-off-by: Jian Hu <jian.hu@amlogic.com>
---
arch/arm64/boot/dts/amlogic/meson-a1.dtsi | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/arch/arm64/boot/dts/amlogic/meson-a1.dtsi b/arch/arm64/boot/dts/amlogic/meson-a1.dtsi
index 348411411f3d..6f6a6145cba1 100644
--- a/arch/arm64/boot/dts/amlogic/meson-a1.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-a1.dtsi
@@ -576,6 +576,11 @@ saradc: adc@2c00 {
status = "disabled";
};
+ clock-measurer@3400 {
+ compatible = "amlogic,a1-clk-measure";
+ reg = <0x0 0x3400 0x0 0x1c>;
+ };
+
i2c1: i2c@5c00 {
compatible = "amlogic,meson-axg-i2c";
status = "disabled";
--
2.47.1
* [PATCH v2 1/4] dt-bindings: soc: amlogic: clk-measure: Add A1 and T7 compatible
From: Jian Hu via B4 Relay @ 2026-04-15 8:33 UTC
To: Rob Herring, Krzysztof Kozlowski, Conor Dooley, Neil Armstrong,
Kevin Hilman, Jerome Brunet, Martin Blumenstingl
Cc: devicetree, linux-arm-kernel, linux-amlogic, linux-kernel,
Jian Hu
In-Reply-To: <20260415-clkmsr_a1_t7-v2-0-02b6314427e6@amlogic.com>
From: Jian Hu <jian.hu@amlogic.com>
Add the Amlogic A1 and T7 compatibles for the clk-measurer IP.
Signed-off-by: Jian Hu <jian.hu@amlogic.com>
---
.../devicetree/bindings/soc/amlogic/amlogic,meson-gx-clk-measure.yaml | 2 ++
1 file changed, 2 insertions(+)
diff --git a/Documentation/devicetree/bindings/soc/amlogic/amlogic,meson-gx-clk-measure.yaml b/Documentation/devicetree/bindings/soc/amlogic/amlogic,meson-gx-clk-measure.yaml
index 39d4637c2d08..b1200e6940ac 100644
--- a/Documentation/devicetree/bindings/soc/amlogic/amlogic,meson-gx-clk-measure.yaml
+++ b/Documentation/devicetree/bindings/soc/amlogic/amlogic,meson-gx-clk-measure.yaml
@@ -24,6 +24,8 @@ properties:
- amlogic,meson-sm1-clk-measure
- amlogic,c3-clk-measure
- amlogic,s4-clk-measure
+ - amlogic,a1-clk-measure
+ - amlogic,t7-clk-measure
reg:
maxItems: 1
--
2.47.1
* Re: [PATCH v4 3/9] coresight: etm4x: fix leaked trace id
From: Leo Yan @ 2026-04-15 8:32 UTC
To: Yeoreum Yun
Cc: Jie Gan, coresight, linux-arm-kernel, linux-kernel,
suzuki.poulose, mike.leach, james.clark, alexander.shishkin
In-Reply-To: <ad9FxS86FTqxu00d@e129823.arm.com>
On Wed, Apr 15, 2026 at 09:01:09AM +0100, Yeoreum Yun wrote:
[...]
> > > What I am thinking is as SoCs continue to grow more complex with an
> > > increasing number of subsystems, trace IDs may be exhausted in the near
> > > future. (that's why we have dynamic trace ID allocation/release).
> >
> > Thanks for the input.
> >
> > I am wondering if we can use "dev->devt" as the trace ID. A device's
> > major/minor number is unique in kernel and dev_t is defined as u32:
> >
> > typedef u32 __kernel_dev_t;
> >
> > And we can consolidate this for both SYSFS and PERF modes.
> >
>
> When I look at CORESIGHT_TRACE_IDS_MAX:
>
> /* architecturally we have 128 IDs some of which are reserved */
> #define CORESIGHT_TRACE_IDS_MAX 128
>
> I think this comes from the hardware restriction on the number of trace IDs.
> In this case, clamping the device ID to a trace ID seems more complex and
> would cost some performance.
Sigh, my mistake. Please ignore my previous comment; let us first fix
the ID leak issue.
Jie's comment about running out of IDs is valid to me, especially
if a system has many dummy tracers. We can defer refactoring this until
later (e.g., use separate ranges for hardware and dummy tracers).
Thanks for the correction!
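To illustrate why a leaked ID matters with such a small ID space, here is a minimal userspace sketch of a bitmap allocator. It assumes 128 architectural IDs, with ID 0 and IDs 0x70 and above reserved; the names (trace_id_map, trace_id_get(), trace_id_put()) are hypothetical and this is not the CoreSight implementation.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_IDS_MAX		128
/* Assumption for this sketch: ID 0 and IDs >= 0x70 are reserved. */
#define TRACE_ID_RES_TOP	0x70

struct trace_id_map {
	uint64_t used[TRACE_IDS_MAX / 64];
};

static int trace_id_valid(int id)
{
	return id > 0 && id < TRACE_ID_RES_TOP;
}

/* Return the lowest free valid ID, or -1 when the ID space is exhausted. */
static int trace_id_get(struct trace_id_map *map)
{
	for (int id = 1; id < TRACE_ID_RES_TOP; id++) {
		uint64_t bit = 1ULL << (id % 64);

		if (!(map->used[id / 64] & bit)) {
			map->used[id / 64] |= bit;
			return id;
		}
	}

	return -1;
}

/* Release an ID; skipping this call is what leaks the ID. */
static void trace_id_put(struct trace_id_map *map, int id)
{
	assert(trace_id_valid(id));

	map->used[id / 64] &= ~(1ULL << (id % 64));
}
```

Under these assumptions only 111 IDs are usable, so every trace_id_get() without a matching trace_id_put() permanently shrinks an already small pool.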
* Re: [PATCH v2] media: verisilicon: Simplify motion vectors and rfc buffers allocation
From: Benjamin Gaignard @ 2026-04-15 8:28 UTC
To: Nicolas Dufresne, p.zabel, mchehab, heiko
Cc: linux-media, linux-rockchip, linux-kernel, linux-arm-kernel,
kernel
In-Reply-To: <43b252cc6186829e021022480ebfe34274c3e572.camel@collabora.com>
On 08/04/2026 at 22:41, Nicolas Dufresne wrote:
> Hi,
>
> On Wednesday, 25 March 2026 at 14:17 +0100, Benjamin Gaignard wrote:
>> Until now we reserve the space needed for motion vectors and reference
>> frame compression at the end of the frame buffer.
>> This patch disentangles mv and rfc from frame buffers by allocating
> Use imperative tone, avoid starting a story (Once upon a time ...), drop "This patch", we know it's a patch.
>
>> distinct buffers for each purpose.
>> That simplifies the code by removing a lot of offset computation.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>> version 2:
>> - rework commit message
>> - free mv and rfc buffers before signaling buffer completion.
>>
>> drivers/media/platform/verisilicon/hantro.h | 17 +-
>> .../media/platform/verisilicon/hantro_av1.c | 7 -
>> .../media/platform/verisilicon/hantro_av1.h | 1 -
>> .../media/platform/verisilicon/hantro_g2.c | 36 --
>> .../platform/verisilicon/hantro_g2_hevc_dec.c | 24 +-
>> .../platform/verisilicon/hantro_g2_vp9_dec.c | 12 +-
>> .../media/platform/verisilicon/hantro_hevc.c | 20 +-
>> .../media/platform/verisilicon/hantro_hw.h | 99 +-----
>> .../platform/verisilicon/hantro_postproc.c | 29 +-
>> .../media/platform/verisilicon/hantro_v4l2.c | 314 ++++++++++++++++--
>> .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 16 +-
>> 11 files changed, 359 insertions(+), 216 deletions(-)
>>
>> diff --git a/drivers/media/platform/verisilicon/hantro.h b/drivers/media/platform/verisilicon/hantro.h
>> index 0353de154a1e..c4ceb9c99016 100644
>> --- a/drivers/media/platform/verisilicon/hantro.h
>> +++ b/drivers/media/platform/verisilicon/hantro.h
>> @@ -31,6 +31,9 @@ struct hantro_ctx;
>> struct hantro_codec_ops;
>> struct hantro_postproc_ops;
>>
>> +#define MAX_MV_BUFFERS MAX_POSTPROC_BUFFERS
>> +#define MAX_RFC_BUFFERS MAX_POSTPROC_BUFFERS
> Why two defines? And why 64? Isn't the maximum something per codec?
One define per new array, so the code is more readable when iterating over
these arrays. MAX_POSTPROC_BUFFERS is the maximum number of buffers for the
capture queue; it isn't codec specific.
>
>> +
>> #define HANTRO_JPEG_ENCODER BIT(0)
>> #define HANTRO_ENCODERS 0x0000ffff
>> #define HANTRO_MPEG2_DECODER BIT(16)
>> @@ -237,6 +240,9 @@ struct hantro_dev {
>> * @need_postproc: Set to true if the bitstream features require to
>> * use the post-processor.
>> *
>> + * @dec_mv: motion vectors buffers for the context.
>> + * @dec_rfc: reference frame compression buffers for the context.
>> + *
>> * @codec_ops: Set of operations related to codec mode.
>> * @postproc: Post-processing context.
>> * @h264_dec: H.264-decoding context.
>> @@ -264,6 +270,9 @@ struct hantro_ctx {
>> int jpeg_quality;
>> int bit_depth;
>>
>> + struct hantro_aux_buf dec_mv[MAX_MV_BUFFERS];
>> + struct hantro_aux_buf dec_rfc[MAX_RFC_BUFFERS];
>> +
>> const struct hantro_codec_ops *codec_ops;
>> struct hantro_postproc_ctx postproc;
>> bool need_postproc;
>> @@ -334,14 +343,14 @@ struct hantro_vp9_decoded_buffer_info {
>> unsigned short width;
>> unsigned short height;
>> size_t chroma_offset;
>> - size_t mv_offset;
>> + dma_addr_t mv_addr;
>> u32 bit_depth : 4;
>> };
>>
>> struct hantro_av1_decoded_buffer_info {
>> /* Info needed when the decoded frame serves as a reference frame. */
>> size_t chroma_offset;
>> - size_t mv_offset;
>> + dma_addr_t mv_addr;
>> };
>>
>> struct hantro_decoded_buffer {
>> @@ -507,4 +516,8 @@ void hantro_postproc_free(struct hantro_ctx *ctx);
>> int hanto_postproc_enum_framesizes(struct hantro_ctx *ctx,
>> struct v4l2_frmsizeenum *fsize);
>>
>> +dma_addr_t hantro_mv_get_buf_addr(struct hantro_ctx *ctx, int index);
>> +dma_addr_t hantro_rfc_get_luma_buf_addr(struct hantro_ctx *ctx, int index);
>> +dma_addr_t hantro_rfc_get_chroma_buf_addr(struct hantro_ctx *ctx, int index);
>> +
>> #endif /* HANTRO_H_ */
>> diff --git a/drivers/media/platform/verisilicon/hantro_av1.c b/drivers/media/platform/verisilicon/hantro_av1.c
>> index 5a51ac877c9c..3a80a7994f67 100644
>> --- a/drivers/media/platform/verisilicon/hantro_av1.c
>> +++ b/drivers/media/platform/verisilicon/hantro_av1.c
>> @@ -222,13 +222,6 @@ size_t hantro_av1_luma_size(struct hantro_ctx *ctx)
>> return ctx->ref_fmt.plane_fmt[0].bytesperline * ctx->ref_fmt.height;
>> }
>>
>> -size_t hantro_av1_chroma_size(struct hantro_ctx *ctx)
>> -{
>> - size_t cr_offset = hantro_av1_luma_size(ctx);
>> -
>> - return ALIGN((cr_offset * 3) / 2, 64);
>> -}
>> -
>> static void hantro_av1_tiles_free(struct hantro_ctx *ctx)
>> {
>> struct hantro_dev *vpu = ctx->dev;
>> diff --git a/drivers/media/platform/verisilicon/hantro_av1.h b/drivers/media/platform/verisilicon/hantro_av1.h
>> index 4e2122b95cdd..330f7938d097 100644
>> --- a/drivers/media/platform/verisilicon/hantro_av1.h
>> +++ b/drivers/media/platform/verisilicon/hantro_av1.h
>> @@ -41,7 +41,6 @@ int hantro_av1_get_order_hint(struct hantro_ctx *ctx, int ref);
>> int hantro_av1_frame_ref(struct hantro_ctx *ctx, u64 timestamp);
>> void hantro_av1_clean_refs(struct hantro_ctx *ctx);
>> size_t hantro_av1_luma_size(struct hantro_ctx *ctx);
>> -size_t hantro_av1_chroma_size(struct hantro_ctx *ctx);
>> void hantro_av1_exit(struct hantro_ctx *ctx);
>> int hantro_av1_init(struct hantro_ctx *ctx);
>> int hantro_av1_prepare_run(struct hantro_ctx *ctx);
>> diff --git a/drivers/media/platform/verisilicon/hantro_g2.c b/drivers/media/platform/verisilicon/hantro_g2.c
>> index 318673b66da8..4ae7df53dcb1 100644
>> --- a/drivers/media/platform/verisilicon/hantro_g2.c
>> +++ b/drivers/media/platform/verisilicon/hantro_g2.c
>> @@ -99,39 +99,3 @@ size_t hantro_g2_chroma_offset(struct hantro_ctx *ctx)
>> {
>> return ctx->ref_fmt.plane_fmt[0].bytesperline * ctx->ref_fmt.height;
>> }
>> -
>> -size_t hantro_g2_motion_vectors_offset(struct hantro_ctx *ctx)
>> -{
>> - size_t cr_offset = hantro_g2_chroma_offset(ctx);
>> -
>> - return ALIGN((cr_offset * 3) / 2, G2_ALIGN);
>> -}
>> -
>> -static size_t hantro_g2_mv_size(struct hantro_ctx *ctx)
>> -{
>> - const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> - const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> - unsigned int pic_width_in_ctbs, pic_height_in_ctbs;
>> - unsigned int max_log2_ctb_size;
>> -
>> - max_log2_ctb_size = sps->log2_min_luma_coding_block_size_minus3 + 3 +
>> - sps->log2_diff_max_min_luma_coding_block_size;
>> - pic_width_in_ctbs = (sps->pic_width_in_luma_samples +
>> - (1 << max_log2_ctb_size) - 1) >> max_log2_ctb_size;
>> - pic_height_in_ctbs = (sps->pic_height_in_luma_samples + (1 << max_log2_ctb_size) - 1)
>> - >> max_log2_ctb_size;
>> -
>> - return pic_width_in_ctbs * pic_height_in_ctbs * (1 << (2 * (max_log2_ctb_size - 4))) * 16;
>> -}
>> -
>> -size_t hantro_g2_luma_compress_offset(struct hantro_ctx *ctx)
>> -{
>> - return hantro_g2_motion_vectors_offset(ctx) +
>> - hantro_g2_mv_size(ctx);
>> -}
>> -
>> -size_t hantro_g2_chroma_compress_offset(struct hantro_ctx *ctx)
>> -{
>> - return hantro_g2_luma_compress_offset(ctx) +
>> - hantro_hevc_luma_compressed_size(ctx->dst_fmt.width, ctx->dst_fmt.height);
>> -}
>> diff --git a/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c b/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c
>> index e8c2e83379de..d0af9fb882ba 100644
>> --- a/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c
>> +++ b/drivers/media/platform/verisilicon/hantro_g2_hevc_dec.c
>> @@ -383,9 +383,6 @@ static int set_ref(struct hantro_ctx *ctx)
>> struct vb2_v4l2_buffer *vb2_dst;
>> struct hantro_decoded_buffer *dst;
>> size_t cr_offset = hantro_g2_chroma_offset(ctx);
>> - size_t mv_offset = hantro_g2_motion_vectors_offset(ctx);
>> - size_t compress_luma_offset = hantro_g2_luma_compress_offset(ctx);
>> - size_t compress_chroma_offset = hantro_g2_chroma_compress_offset(ctx);
>> u32 max_ref_frames;
>> u16 dpb_longterm_e;
>> static const struct hantro_reg cur_poc[] = {
>> @@ -453,14 +450,17 @@ static int set_ref(struct hantro_ctx *ctx)
>> dpb_longterm_e = 0;
>> for (i = 0; i < decode_params->num_active_dpb_entries &&
>> i < (V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1); i++) {
>> + int index = hantro_hevc_get_ref_buf_index(ctx, dpb[i].pic_order_cnt_val);
>> luma_addr = hantro_hevc_get_ref_buf(ctx, dpb[i].pic_order_cnt_val);
>> if (!luma_addr)
>> return -ENOMEM;
>>
>> chroma_addr = luma_addr + cr_offset;
>> - mv_addr = luma_addr + mv_offset;
>> - compress_luma_addr = luma_addr + compress_luma_offset;
>> - compress_chroma_addr = luma_addr + compress_chroma_offset;
>> + mv_addr = hantro_mv_get_buf_addr(ctx, index);
>> + if (ctx->hevc_dec.use_compression) {
>> + compress_luma_addr = hantro_rfc_get_luma_buf_addr(ctx, index);
>> + compress_chroma_addr = hantro_rfc_get_chroma_buf_addr(ctx, index);
>> + }
>>
>> if (dpb[i].flags & V4L2_HEVC_DPB_ENTRY_LONG_TERM_REFERENCE)
>> dpb_longterm_e |= BIT(V4L2_HEVC_DPB_ENTRIES_NUM_MAX - 1 - i);
>> @@ -478,13 +478,17 @@ static int set_ref(struct hantro_ctx *ctx)
>> if (!luma_addr)
>> return -ENOMEM;
>>
>> - if (hantro_hevc_add_ref_buf(ctx, decode_params->pic_order_cnt_val, luma_addr))
>> + if (hantro_hevc_add_ref_buf(ctx, decode_params->pic_order_cnt_val, luma_addr, vb2_dst))
>> return -EINVAL;
>>
>> chroma_addr = luma_addr + cr_offset;
>> - mv_addr = luma_addr + mv_offset;
>> - compress_luma_addr = luma_addr + compress_luma_offset;
>> - compress_chroma_addr = luma_addr + compress_chroma_offset;
>> + mv_addr = hantro_mv_get_buf_addr(ctx, dst->base.vb.vb2_buf.index);
>> + if (ctx->hevc_dec.use_compression) {
>> + compress_luma_addr =
>> + hantro_rfc_get_luma_buf_addr(ctx, dst->base.vb.vb2_buf.index);
>> + compress_chroma_addr =
>> + hantro_rfc_get_chroma_buf_addr(ctx, dst->base.vb.vb2_buf.index);
>> + }
>>
>> hantro_write_addr(vpu, G2_REF_LUMA_ADDR(i), luma_addr);
>> hantro_write_addr(vpu, G2_REF_CHROMA_ADDR(i), chroma_addr);
>> diff --git a/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c b/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c
>> index 56c79e339030..1e96d0fce72a 100644
>> --- a/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c
>> +++ b/drivers/media/platform/verisilicon/hantro_g2_vp9_dec.c
>> @@ -129,7 +129,7 @@ static void config_output(struct hantro_ctx *ctx,
>> struct hantro_decoded_buffer *dst,
>> const struct v4l2_ctrl_vp9_frame *dec_params)
>> {
>> - dma_addr_t luma_addr, chroma_addr, mv_addr;
>> + dma_addr_t luma_addr, chroma_addr;
>>
>> hantro_reg_write(ctx->dev, &g2_out_dis, 0);
>> if (!ctx->dev->variant->legacy_regs)
>> @@ -142,9 +142,8 @@ static void config_output(struct hantro_ctx *ctx,
>> hantro_write_addr(ctx->dev, G2_OUT_CHROMA_ADDR, chroma_addr);
>> dst->vp9.chroma_offset = hantro_g2_chroma_offset(ctx);
>>
>> - mv_addr = luma_addr + hantro_g2_motion_vectors_offset(ctx);
>> - hantro_write_addr(ctx->dev, G2_OUT_MV_ADDR, mv_addr);
>> - dst->vp9.mv_offset = hantro_g2_motion_vectors_offset(ctx);
>> + dst->vp9.mv_addr = hantro_mv_get_buf_addr(ctx, dst->base.vb.vb2_buf.index);
>> + hantro_write_addr(ctx->dev, G2_OUT_MV_ADDR, dst->vp9.mv_addr);
>> }
>>
>> struct hantro_vp9_ref_reg {
>> @@ -215,15 +214,12 @@ static void config_ref_registers(struct hantro_ctx *ctx,
>> .c_base = G2_REF_CHROMA_ADDR(5),
>> },
>> };
>> - dma_addr_t mv_addr;
>>
>> config_ref(ctx, dst, &ref_regs[0], dec_params, dec_params->last_frame_ts);
>> config_ref(ctx, dst, &ref_regs[1], dec_params, dec_params->golden_frame_ts);
>> config_ref(ctx, dst, &ref_regs[2], dec_params, dec_params->alt_frame_ts);
>>
>> - mv_addr = hantro_get_dec_buf_addr(ctx, &mv_ref->base.vb.vb2_buf) +
>> - mv_ref->vp9.mv_offset;
>> - hantro_write_addr(ctx->dev, G2_REF_MV_ADDR(0), mv_addr);
>> + hantro_write_addr(ctx->dev, G2_REF_MV_ADDR(0), mv_ref->vp9.mv_addr);
>>
>> hantro_reg_write(ctx->dev, &vp9_last_sign_bias,
>> dec_params->ref_frame_sign_bias & V4L2_VP9_SIGN_BIAS_LAST ? 1 : 0);
>> diff --git a/drivers/media/platform/verisilicon/hantro_hevc.c b/drivers/media/platform/verisilicon/hantro_hevc.c
>> index 83cd12b0ddd6..272ce336b1c6 100644
>> --- a/drivers/media/platform/verisilicon/hantro_hevc.c
>> +++ b/drivers/media/platform/verisilicon/hantro_hevc.c
>> @@ -54,7 +54,24 @@ dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx,
>> return 0;
>> }
>>
>> -int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr)
>> +int hantro_hevc_get_ref_buf_index(struct hantro_ctx *ctx, s32 poc)
>> +{
>> + struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> + int i;
>> +
>> + /* Find the reference buffer in already known ones */
>> + for (i = 0; i < NUM_REF_PICTURES; i++) {
>> + if (hevc_dec->ref_bufs_poc[i] == poc)
>> + return hevc_dec->ref_vb2[i]->vb2_buf.index;
> I'm a little worried that there is no flag indicating whether the entry was
> set. Notably, POC 0 is valid; do we initialize to an invalid value to prevent
> matching an unset or unused entry?
I will add a check of hevc_dec->ref_bufs_used here.
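As a rough illustration of that check, here is a minimal standalone sketch
rather than the driver code. `struct ref_table`, `ref_buf_index()` and the
value used for NUM_REF_PICTURES are hypothetical stand-ins, and returning -1
for "not found" is an assumption (the patch as posted returns 0):

```c
#include <stdint.h>

#define NUM_REF_PICTURES 17	/* stand-in value for this sketch only */

/* Hypothetical, simplified mirror of the reference bookkeeping. */
struct ref_table {
	int32_t  poc[NUM_REF_PICTURES];		/* POC stored in each slot */
	uint32_t used;				/* bit i set: slot i is valid */
	int      vb2_index[NUM_REF_PICTURES];	/* buffer index per slot */
};

/*
 * Look up the buffer index for a POC, skipping slots whose "used" bit is
 * clear so that a stale or zero-initialized entry can never match POC 0.
 * Returns -1 when no valid slot holds this POC.
 */
static int ref_buf_index(const struct ref_table *t, int32_t poc)
{
	for (int i = 0; i < NUM_REF_PICTURES; i++) {
		if (!(t->used & (1u << i)))
			continue;
		if (t->poc[i] == poc)
			return t->vb2_index[i];
	}

	return -1;
}
```

With a zero-initialized table, a lookup for POC 0 then reports "not found"
instead of matching slot 0 by accident.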
>
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx,
>> + int poc,
>> + dma_addr_t addr,
>> + struct vb2_v4l2_buffer *vb2)
>> {
>> struct hantro_hevc_dec_hw_ctx *hevc_dec = &ctx->hevc_dec;
>> int i;
>> @@ -65,6 +82,7 @@ int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr)
>> hevc_dec->ref_bufs_used |= 1 << i;
>> hevc_dec->ref_bufs_poc[i] = poc;
>> hevc_dec->ref_bufs[i].dma = addr;
>> + hevc_dec->ref_vb2[i] = vb2;
>> return 0;
>> }
>> }
>> diff --git a/drivers/media/platform/verisilicon/hantro_hw.h b/drivers/media/platform/verisilicon/hantro_hw.h
>> index f0e4bca4b2b2..6a1ee9899b60 100644
>> --- a/drivers/media/platform/verisilicon/hantro_hw.h
>> +++ b/drivers/media/platform/verisilicon/hantro_hw.h
>> @@ -162,6 +162,7 @@ struct hantro_hevc_dec_hw_ctx {
>> struct hantro_aux_buf scaling_lists;
>> s32 ref_bufs_poc[NUM_REF_PICTURES];
>> u32 ref_bufs_used;
>> + struct vb2_v4l2_buffer *ref_vb2[NUM_REF_PICTURES];
>> struct hantro_hevc_dec_ctrls ctrls;
>> unsigned int num_tile_cols_allocated;
>> bool use_compression;
>> @@ -457,7 +458,10 @@ int hantro_g2_hevc_dec_run(struct hantro_ctx *ctx);
>> int hantro_hevc_dec_prepare_run(struct hantro_ctx *ctx);
>> void hantro_hevc_ref_init(struct hantro_ctx *ctx);
>> dma_addr_t hantro_hevc_get_ref_buf(struct hantro_ctx *ctx, s32 poc);
>> -int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc, dma_addr_t addr);
>> +int hantro_hevc_add_ref_buf(struct hantro_ctx *ctx, int poc,
>> + dma_addr_t addr,
>> + struct vb2_v4l2_buffer *vb2);
>> +int hantro_hevc_get_ref_buf_index(struct hantro_ctx *ctx, s32 poc);
>>
>> int rockchip_vpu981_av1_dec_init(struct hantro_ctx *ctx);
>> void rockchip_vpu981_av1_dec_exit(struct hantro_ctx *ctx);
>> @@ -469,100 +473,7 @@ static inline unsigned short hantro_vp9_num_sbs(unsigned short dimension)
>> return (dimension + 63) / 64;
>> }
>>
>> -static inline size_t
>> -hantro_vp9_mv_size(unsigned int width, unsigned int height)
>> -{
>> - int num_ctbs;
>> -
>> - /*
>> - * There can be up to (CTBs x 64) number of blocks,
>> - * and the motion vector for each block needs 16 bytes.
>> - */
>> - num_ctbs = hantro_vp9_num_sbs(width) * hantro_vp9_num_sbs(height);
>> - return (num_ctbs * 64) * 16;
>> -}
>> -
>> -static inline size_t
>> -hantro_h264_mv_size(unsigned int width, unsigned int height)
>> -{
>> - /*
>> - * A decoded 8-bit 4:2:0 NV12 frame may need memory for up to
>> - * 448 bytes per macroblock with additional 32 bytes on
>> - * multi-core variants.
>> - *
>> - * The H264 decoder needs extra space on the output buffers
>> - * to store motion vectors. This is needed for reference
>> - * frames and only if the format is non-post-processed NV12.
>> - *
>> - * Memory layout is as follow:
>> - *
>> - * +---------------------------+
>> - * | Y-plane 256 bytes x MBs |
>> - * +---------------------------+
>> - * | UV-plane 128 bytes x MBs |
>> - * +---------------------------+
>> - * | MV buffer 64 bytes x MBs |
>> - * +---------------------------+
>> - * | MC sync 32 bytes |
>> - * +---------------------------+
>> - */
>> - return 64 * MB_WIDTH(width) * MB_WIDTH(height) + 32;
>> -}
>> -
>> -static inline size_t
>> -hantro_hevc_mv_size(unsigned int width, unsigned int height)
>> -{
>> - /*
>> - * A CTB can be 64x64, 32x32 or 16x16.
>> - * Allocated memory for the "worse" case: 16x16
>> - */
>> - return width * height / 16;
>> -}
>> -
>> -static inline size_t
>> -hantro_hevc_luma_compressed_size(unsigned int width, unsigned int height)
>> -{
>> - u32 pic_width_in_cbsy =
>> - round_up((width + CBS_LUMA - 1) / CBS_LUMA, CBS_SIZE);
>> - u32 pic_height_in_cbsy = (height + CBS_LUMA - 1) / CBS_LUMA;
>> -
>> - return round_up(pic_width_in_cbsy * pic_height_in_cbsy, CBS_SIZE);
>> -}
>> -
>> -static inline size_t
>> -hantro_hevc_chroma_compressed_size(unsigned int width, unsigned int height)
>> -{
>> - u32 pic_width_in_cbsc =
>> - round_up((width + CBS_CHROMA_W - 1) / CBS_CHROMA_W, CBS_SIZE);
>> - u32 pic_height_in_cbsc = (height / 2 + CBS_CHROMA_H - 1) / CBS_CHROMA_H;
>> -
>> - return round_up(pic_width_in_cbsc * pic_height_in_cbsc, CBS_SIZE);
>> -}
>> -
>> -static inline size_t
>> -hantro_hevc_compressed_size(unsigned int width, unsigned int height)
>> -{
>> - return hantro_hevc_luma_compressed_size(width, height) +
>> - hantro_hevc_chroma_compressed_size(width, height);
>> -}
>> -
>> -static inline unsigned short hantro_av1_num_sbs(unsigned short dimension)
>> -{
>> - return DIV_ROUND_UP(dimension, 64);
>> -}
>> -
>> -static inline size_t
>> -hantro_av1_mv_size(unsigned int width, unsigned int height)
>> -{
>> - size_t num_sbs = hantro_av1_num_sbs(width) * hantro_av1_num_sbs(height);
>> -
>> - return ALIGN(num_sbs * 384, 16) * 2 + 512;
>> -}
>> -
>> size_t hantro_g2_chroma_offset(struct hantro_ctx *ctx);
>> -size_t hantro_g2_motion_vectors_offset(struct hantro_ctx *ctx);
>> -size_t hantro_g2_luma_compress_offset(struct hantro_ctx *ctx);
>> -size_t hantro_g2_chroma_compress_offset(struct hantro_ctx *ctx);
>>
>> int hantro_g1_mpeg2_dec_run(struct hantro_ctx *ctx);
>> int rockchip_vpu2_mpeg2_dec_run(struct hantro_ctx *ctx);
>> diff --git a/drivers/media/platform/verisilicon/hantro_postproc.c b/drivers/media/platform/verisilicon/hantro_postproc.c
>> index e94d1ba5ef10..2409353c16e4 100644
>> --- a/drivers/media/platform/verisilicon/hantro_postproc.c
>> +++ b/drivers/media/platform/verisilicon/hantro_postproc.c
>> @@ -196,36 +196,11 @@ void hantro_postproc_free(struct hantro_ctx *ctx)
>> }
>> }
>>
>> -static unsigned int hantro_postproc_buffer_size(struct hantro_ctx *ctx)
>> -{
>> - unsigned int buf_size;
>> -
>> - buf_size = ctx->ref_fmt.plane_fmt[0].sizeimage;
>> - if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE)
>> - buf_size += hantro_h264_mv_size(ctx->ref_fmt.width,
>> - ctx->ref_fmt.height);
>> - else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME)
>> - buf_size += hantro_vp9_mv_size(ctx->ref_fmt.width,
>> - ctx->ref_fmt.height);
>> - else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE) {
>> - buf_size += hantro_hevc_mv_size(ctx->ref_fmt.width,
>> - ctx->ref_fmt.height);
>> - if (ctx->hevc_dec.use_compression)
>> - buf_size += hantro_hevc_compressed_size(ctx->ref_fmt.width,
>> - ctx->ref_fmt.height);
>> - }
>> - else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_AV1_FRAME)
>> - buf_size += hantro_av1_mv_size(ctx->ref_fmt.width,
>> - ctx->ref_fmt.height);
>> -
>> - return buf_size;
>> -}
>> -
>> static int hantro_postproc_alloc(struct hantro_ctx *ctx, int index)
>> {
>> struct hantro_dev *vpu = ctx->dev;
>> struct hantro_aux_buf *priv = &ctx->postproc.dec_q[index];
>> - unsigned int buf_size = hantro_postproc_buffer_size(ctx);
>> + unsigned int buf_size = ctx->ref_fmt.plane_fmt[0].sizeimage;
>>
>> if (!buf_size)
>> return -EINVAL;
>> @@ -267,7 +242,7 @@ dma_addr_t
>> hantro_postproc_get_dec_buf_addr(struct hantro_ctx *ctx, int index)
>> {
>> struct hantro_aux_buf *priv = &ctx->postproc.dec_q[index];
>> - unsigned int buf_size = hantro_postproc_buffer_size(ctx);
>> + unsigned int buf_size = ctx->ref_fmt.plane_fmt[0].sizeimage;
>> struct hantro_dev *vpu = ctx->dev;
>> int ret;
>>
>> diff --git a/drivers/media/platform/verisilicon/hantro_v4l2.c b/drivers/media/platform/verisilicon/hantro_v4l2.c
>> index fcf3bd9bcda2..f8d4dd518368 100644
>> --- a/drivers/media/platform/verisilicon/hantro_v4l2.c
>> +++ b/drivers/media/platform/verisilicon/hantro_v4l2.c
>> @@ -36,6 +36,9 @@ static int hantro_set_fmt_out(struct hantro_ctx *ctx,
>> static int hantro_set_fmt_cap(struct hantro_ctx *ctx,
>> struct v4l2_pix_format_mplane *pix_mp);
>>
>> +static void hantro_mv_free(struct hantro_ctx *ctx);
>> +static void hantro_rfc_free(struct hantro_ctx *ctx);
>> +
>> static const struct hantro_fmt *
>> hantro_get_formats(const struct hantro_ctx *ctx, unsigned int *num_fmts, bool need_postproc)
>> {
>> @@ -362,26 +365,6 @@ static int hantro_try_fmt(const struct hantro_ctx *ctx,
>> /* Fill remaining fields */
>> v4l2_fill_pixfmt_mp(pix_mp, fmt->fourcc, pix_mp->width,
>> pix_mp->height);
>> - if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_H264_SLICE &&
>> - !hantro_needs_postproc(ctx, fmt))
>> - pix_mp->plane_fmt[0].sizeimage +=
>> - hantro_h264_mv_size(pix_mp->width,
>> - pix_mp->height);
>> - else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_VP9_FRAME &&
>> - !hantro_needs_postproc(ctx, fmt))
>> - pix_mp->plane_fmt[0].sizeimage +=
>> - hantro_vp9_mv_size(pix_mp->width,
>> - pix_mp->height);
>> - else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_HEVC_SLICE &&
>> - !hantro_needs_postproc(ctx, fmt))
>> - pix_mp->plane_fmt[0].sizeimage +=
>> - hantro_hevc_mv_size(pix_mp->width,
>> - pix_mp->height);
>> - else if (ctx->vpu_src_fmt->fourcc == V4L2_PIX_FMT_AV1_FRAME &&
>> - !hantro_needs_postproc(ctx, fmt))
>> - pix_mp->plane_fmt[0].sizeimage +=
>> - hantro_av1_mv_size(pix_mp->width,
>> - pix_mp->height);
>> } else if (!pix_mp->plane_fmt[0].sizeimage) {
>> /*
>> * For coded formats the application can specify
>> @@ -984,6 +967,9 @@ static void hantro_stop_streaming(struct vb2_queue *q)
>> ctx->codec_ops->exit(ctx);
>> }
>>
>> + hantro_mv_free(ctx);
>> + hantro_rfc_free(ctx);
>> +
>> /*
>> * The mem2mem framework calls v4l2_m2m_cancel_job before
>> * .stop_streaming, so there isn't any job running and
>> @@ -1025,3 +1011,291 @@ const struct vb2_ops hantro_queue_ops = {
>> .start_streaming = hantro_start_streaming,
>> .stop_streaming = hantro_stop_streaming,
>> };
>> +
>> +static inline size_t
>> +hantro_vp9_mv_size(unsigned int width, unsigned int height)
> I don't much like that we are adding more codec-specific functions to
> hantro_v4l2.c. Can we move these into codec-specific headers (since they are
> inline), just to keep things separate?
I will do that, and maybe more cleanup, in an additional patch.
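One possible shape for that follow-up, shown only as a sketch: the VP9 helpers
living in a codec-specific header. The include guard name and the idea of a
dedicated header for the size helpers are assumptions for illustration; the
function bodies are the ones from this patch.

```c
#ifndef EXAMPLE_HANTRO_VP9_SIZES_H	/* hypothetical guard name */
#define EXAMPLE_HANTRO_VP9_SIZES_H

#include <stddef.h>

/* Number of 64x64 superblocks needed to cover one picture dimension. */
static inline unsigned short hantro_vp9_num_sbs(unsigned short dimension)
{
	return (dimension + 63) / 64;
}

static inline size_t
hantro_vp9_mv_size(unsigned int width, unsigned int height)
{
	int num_ctbs;

	/*
	 * There can be up to (CTBs x 64) number of blocks,
	 * and the motion vector for each block needs 16 bytes.
	 */
	num_ctbs = hantro_vp9_num_sbs(width) * hantro_vp9_num_sbs(height);
	return (num_ctbs * 64) * 16;
}

#endif /* EXAMPLE_HANTRO_VP9_SIZES_H */
```

For a 1920x1080 picture this works out to 30 x 17 = 510 superblocks, so the
motion vector buffer would be 510 * 64 * 16 = 522240 bytes.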
>
>> +{
>> + int num_ctbs;
>> +
>> + /*
>> + * There can be up to (CTBs x 64) number of blocks,
>> + * and the motion vector for each block needs 16 bytes.
>> + */
>> + num_ctbs = hantro_vp9_num_sbs(width) * hantro_vp9_num_sbs(height);
>> + return (num_ctbs * 64) * 16;
>> +}
>> +
>> +static inline size_t
>> +hantro_h264_mv_size(unsigned int width, unsigned int height)
>> +{
>> + /*
>> + * A decoded 8-bit 4:2:0 NV12 frame may need memory for up to
>> + * 448 bytes per macroblock with additional 32 bytes on
>> + * multi-core variants.
>> + *
>> + * The H264 decoder needs extra space on the output buffers
>> + * to store motion vectors. This is needed for reference
>> + * frames and only if the format is non-post-processed NV12.
>> + *
>> + * Memory layout is as follow:
>> + *
>> + * +---------------------------+
>> + * | Y-plane 256 bytes x MBs |
>> + * +---------------------------+
>> + * | UV-plane 128 bytes x MBs |
>> + * +---------------------------+
>> + * | MV buffer 64 bytes x MBs |
>> + * +---------------------------+
>> + * | MC sync 32 bytes |
>> + * +---------------------------+
>> + */
>> + return 64 * MB_WIDTH(width) * MB_WIDTH(height) + 32;
>> +}
>> +
>> +static inline size_t
>> +hantro_hevc_mv_size(unsigned int width, unsigned int height, int depth)
>> +{
>> + /*
>> + * A CTB can be 64x64, 32x32 or 16x16.
>> + * Allocated memory for the "worse" case: 16x16
>> + */
>> + return DIV_ROUND_UP(width * height * depth / 8, 128);
>> +}
>> +
>> +static inline unsigned short hantro_av1_num_sbs(unsigned short dimension)
>> +{
>> + return DIV_ROUND_UP(dimension, 64);
>> +}
>> +
>> +static inline size_t
>> +hantro_av1_mv_size(unsigned int width, unsigned int height)
>> +{
>> + size_t num_sbs = hantro_av1_num_sbs(width) * hantro_av1_num_sbs(height);
>> +
>> + return ALIGN(num_sbs * 384, 16) * 2 + 512;
>> +}
>> +
>> +static void hantro_mv_free(struct hantro_ctx *ctx)
>> +{
>> + struct hantro_dev *vpu = ctx->dev;
>> + int i;
>> +
>> + for (i = 0; i < MAX_MV_BUFFERS; i++) {
>> + struct hantro_aux_buf *mv = &ctx->dec_mv[i];
>> +
>> + if (!mv->cpu)
>> + continue;
>> +
>> + dma_free_attrs(vpu->dev, mv->size, mv->cpu,
>> + mv->dma, mv->attrs);
>> + mv->cpu = NULL;
>> + }
>> +}
>> +
>> +static unsigned int hantro_mv_buffer_size(struct hantro_ctx *ctx)
>> +{
>> + struct hantro_dev *vpu = ctx->dev;
>> + int fourcc = ctx->vpu_src_fmt->fourcc;
>> + int width = ctx->ref_fmt.width;
>> + int height = ctx->ref_fmt.height;
>> +
>> + switch (fourcc) {
>> + case V4L2_PIX_FMT_H264_SLICE:
>> + return hantro_h264_mv_size(width, height);
>> + case V4L2_PIX_FMT_VP9_FRAME:
>> + return hantro_vp9_mv_size(width, height);
>> + case V4L2_PIX_FMT_HEVC_SLICE:
>> + return hantro_hevc_mv_size(width, height, ctx->bit_depth);
>> + case V4L2_PIX_FMT_AV1_FRAME:
>> + return hantro_av1_mv_size(width, height);
>> + }
>> +
>> + /* Should not happen */
>> + dev_warn(vpu->dev, "Invalid motion vectors size\n");
>> + return 0;
>> +}
>> +
>> +static int hantro_mv_buffer_alloc(struct hantro_ctx *ctx, int index)
>> +{
>> + struct hantro_dev *vpu = ctx->dev;
>> + struct hantro_aux_buf *mv = &ctx->dec_mv[index];
>> + unsigned int buf_size = hantro_mv_buffer_size(ctx);
>> +
>> + if (!buf_size)
>> + return -EINVAL;
>> +
>> + /*
>> + * Motion vectors buffers are only read and write by the
>> + * hardware so no mapping is needed.
>> + */
>> + mv->attrs = DMA_ATTR_NO_KERNEL_MAPPING;
>> + mv->cpu = dma_alloc_attrs(vpu->dev, buf_size, &mv->dma,
>> + GFP_KERNEL, mv->attrs);
>> + if (!mv->cpu)
>> + return -ENOMEM;
>> + mv->size = buf_size;
>> +
>> + return 0;
>> +}
>> +
>> +dma_addr_t
>> +hantro_mv_get_buf_addr(struct hantro_ctx *ctx, int index)
>> +{
>> + struct hantro_aux_buf *mv = &ctx->dec_mv[index];
>> + unsigned int buf_size = hantro_mv_buffer_size(ctx);
>> + struct hantro_dev *vpu = ctx->dev;
>> + int ret;
>> +
>> + if (mv->size < buf_size && mv->cpu) {
>> + /* buffer is too small, release it */
>> + dma_free_attrs(vpu->dev, mv->size, mv->cpu,
>> + mv->dma, mv->attrs);
>> + mv->cpu = NULL;
>> + }
>> +
>> + if (!mv->cpu) {
>> + /* buffer not already allocated, try getting a new one */
>> + ret = hantro_mv_buffer_alloc(ctx, index);
>> + if (ret)
>> + return 0;
>> + }
>> +
>> + if (!mv->cpu)
>> + return 0;
>> +
>> + return mv->dma;
>> +}
>> +
>> +static inline size_t
>> +hantro_hevc_luma_compressed_size(unsigned int width, unsigned int height)
>> +{
>> + u32 pic_width_in_cbsy =
>> + round_up((width + CBS_LUMA - 1) / CBS_LUMA, CBS_SIZE);
>> + u32 pic_height_in_cbsy = (height + CBS_LUMA - 1) / CBS_LUMA;
>> +
>> + return round_up(pic_width_in_cbsy * pic_height_in_cbsy, CBS_SIZE);
>> +}
>> +
>> +static inline size_t
>> +hantro_hevc_chroma_compressed_size(unsigned int width, unsigned int height)
>> +{
>> + u32 pic_width_in_cbsc =
>> + round_up((width + CBS_CHROMA_W - 1) / CBS_CHROMA_W, CBS_SIZE);
>> + u32 pic_height_in_cbsc = (height / 2 + CBS_CHROMA_H - 1) / CBS_CHROMA_H;
>> +
>> + return round_up(pic_width_in_cbsc * pic_height_in_cbsc, CBS_SIZE);
>> +}
>> +
>> +static inline size_t
>> +hantro_hevc_compressed_size(unsigned int width, unsigned int height)
>> +{
>> + return hantro_hevc_luma_compressed_size(width, height) +
>> + hantro_hevc_chroma_compressed_size(width, height);
>> +}
>> +
>> +static void hantro_rfc_free(struct hantro_ctx *ctx)
>> +{
>> + struct hantro_dev *vpu = ctx->dev;
>> + int i;
>> +
>> + for (i = 0; i < MAX_MV_BUFFERS; i++) {
>> + struct hantro_aux_buf *rfc = &ctx->dec_rfc[i];
>> +
>> + if (!rfc->cpu)
>> + continue;
>> +
>> + dma_free_attrs(vpu->dev, rfc->size, rfc->cpu,
>> + rfc->dma, rfc->attrs);
>> + rfc->cpu = NULL;
>> + }
>> +}
>> +
>> +static unsigned int hantro_rfc_buffer_size(struct hantro_ctx *ctx)
>> +{
>> + struct hantro_dev *vpu = ctx->dev;
>> + int fourcc = ctx->vpu_src_fmt->fourcc;
>> + int width = ctx->ref_fmt.width;
>> + int height = ctx->ref_fmt.height;
>> +
>> + switch (fourcc) {
>> + case V4L2_PIX_FMT_HEVC_SLICE:
>> + return hantro_hevc_compressed_size(width, height);
>> + }
>> +
>> + /* Should not happen */
>> + dev_warn(vpu->dev, "Invalid rfc size\n");
>> + return 0;
>> +}
>> +
>> +static int hantro_rfc_buffer_alloc(struct hantro_ctx *ctx, int index)
>> +{
>> + struct hantro_dev *vpu = ctx->dev;
>> + struct hantro_aux_buf *rfc = &ctx->dec_rfc[index];
>> + unsigned int buf_size = hantro_rfc_buffer_size(ctx);
>> +
>> + if (!buf_size)
>> + return -EINVAL;
>> +
>> + /*
>> + * RFC buffers are only read and write by the
>> + * hardware so no mapping is needed.
>> + */
>> + rfc->attrs = DMA_ATTR_NO_KERNEL_MAPPING;
>> + rfc->cpu = dma_alloc_attrs(vpu->dev, buf_size, &rfc->dma,
>> + GFP_KERNEL, rfc->attrs);
>> + if (!rfc->cpu)
>> + return -ENOMEM;
>> + rfc->size = buf_size;
>> +
>> + return 0;
>> +}
>> +
>> +dma_addr_t
>> +hantro_rfc_get_luma_buf_addr(struct hantro_ctx *ctx, int index)
>> +{
>> + struct hantro_aux_buf *rfc = &ctx->dec_rfc[index];
>> + unsigned int buf_size = hantro_rfc_buffer_size(ctx);
>> + struct hantro_dev *vpu = ctx->dev;
>> + int ret;
>> +
>> + if (rfc->size < buf_size && rfc->cpu) {
>> + /* buffer is too small, release it */
>> + dma_free_attrs(vpu->dev, rfc->size, rfc->cpu,
>> + rfc->dma, rfc->attrs);
>> + rfc->cpu = NULL;
>> + }
>> +
>> + if (!rfc->cpu) {
>> + /* buffer not already allocated, try getting a new one */
>> + ret = hantro_rfc_buffer_alloc(ctx, index);
>> + if (ret)
>> + return 0;
>> + }
>> +
>> + if (!rfc->cpu)
>> + return 0;
>> +
>> + return rfc->dma;
>> +}
>> +
>> +dma_addr_t
>> +hantro_rfc_get_chroma_buf_addr(struct hantro_ctx *ctx, int index)
>> +{
>> + dma_addr_t luma_addr = hantro_rfc_get_luma_buf_addr(ctx, index);
>> + struct hantro_dev *vpu = ctx->dev;
>> + int fourcc = ctx->vpu_src_fmt->fourcc;
>> + int width = ctx->ref_fmt.width;
>> + int height = ctx->ref_fmt.height;
>> +
>> + if (!luma_addr)
>> + return -EINVAL;
>> +
>> + switch (fourcc) {
>> + case V4L2_PIX_FMT_HEVC_SLICE:
>> + return luma_addr + hantro_hevc_luma_compressed_size(width, height);
>> + }
>> +
>> + /* Should not happen */
>> + dev_warn(vpu->dev, "Invalid rfc chroma address\n");
>> + return 0;
>> +}
>> diff --git a/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c b/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
>> index c1ada14df4c3..21da8ddfc4b3 100644
>> --- a/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
>> +++ b/drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
>> @@ -62,7 +62,7 @@ rockchip_vpu981_av1_dec_set_ref(struct hantro_ctx *ctx, int ref, int idx,
>> const struct v4l2_ctrl_av1_frame *frame = ctrls->frame;
>> struct hantro_dev *vpu = ctx->dev;
>> struct hantro_decoded_buffer *dst;
>> - dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
>> + dma_addr_t luma_addr, chroma_addr = 0;
>> int cur_width = frame->frame_width_minus_1 + 1;
>> int cur_height = frame->frame_height_minus_1 + 1;
>> int scale_width =
>> @@ -120,11 +120,10 @@ rockchip_vpu981_av1_dec_set_ref(struct hantro_ctx *ctx, int ref, int idx,
>> dst = vb2_to_hantro_decoded_buf(&av1_dec->frame_refs[idx].vb2_ref->vb2_buf);
>> luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf);
>> chroma_addr = luma_addr + dst->av1.chroma_offset;
>> - mv_addr = luma_addr + dst->av1.mv_offset;
>>
>> hantro_write_addr(vpu, AV1_REFERENCE_Y(ref), luma_addr);
>> hantro_write_addr(vpu, AV1_REFERENCE_CB(ref), chroma_addr);
>> - hantro_write_addr(vpu, AV1_REFERENCE_MV(ref), mv_addr);
>> + hantro_write_addr(vpu, AV1_REFERENCE_MV(ref), dst->av1.mv_addr);
>>
>> return (scale_width != (1 << AV1_REF_SCALE_SHIFT)) ||
>> (scale_height != (1 << AV1_REF_SCALE_SHIFT));
>> @@ -180,11 +179,10 @@ static void rockchip_vpu981_av1_dec_set_segmentation(struct hantro_ctx *ctx)
>> if (idx >= 0) {
>> dma_addr_t luma_addr, mv_addr = 0;
>> struct hantro_decoded_buffer *seg;
>> - size_t mv_offset = hantro_av1_chroma_size(ctx);
>>
>> seg = vb2_to_hantro_decoded_buf(&av1_dec->frame_refs[idx].vb2_ref->vb2_buf);
>> luma_addr = hantro_get_dec_buf_addr(ctx, &seg->base.vb.vb2_buf);
>> - mv_addr = luma_addr + mv_offset;
>> + mv_addr = hantro_mv_get_buf_addr(ctx, seg->base.vb.vb2_buf.index);
>>
>> hantro_write_addr(vpu, AV1_SEGMENTATION, mv_addr);
>> hantro_reg_write(vpu, &av1_use_temporal3_mvs, 1);
>> @@ -1350,22 +1348,20 @@ rockchip_vpu981_av1_dec_set_output_buffer(struct hantro_ctx *ctx)
>> struct hantro_dev *vpu = ctx->dev;
>> struct hantro_decoded_buffer *dst;
>> struct vb2_v4l2_buffer *vb2_dst;
>> - dma_addr_t luma_addr, chroma_addr, mv_addr = 0;
>> + dma_addr_t luma_addr, chroma_addr = 0;
>> size_t cr_offset = hantro_av1_luma_size(ctx);
>> - size_t mv_offset = hantro_av1_chroma_size(ctx);
>>
>> vb2_dst = av1_dec->frame_refs[av1_dec->current_frame_index].vb2_ref;
>> dst = vb2_to_hantro_decoded_buf(&vb2_dst->vb2_buf);
>> luma_addr = hantro_get_dec_buf_addr(ctx, &dst->base.vb.vb2_buf);
>> chroma_addr = luma_addr + cr_offset;
>> - mv_addr = luma_addr + mv_offset;
>>
>> dst->av1.chroma_offset = cr_offset;
>> - dst->av1.mv_offset = mv_offset;
>> + dst->av1.mv_addr = hantro_mv_get_buf_addr(ctx, dst->base.vb.vb2_buf.index);
>>
>> hantro_write_addr(vpu, AV1_TILE_OUT_LU, luma_addr);
>> hantro_write_addr(vpu, AV1_TILE_OUT_CH, chroma_addr);
>> - hantro_write_addr(vpu, AV1_TILE_OUT_MV, mv_addr);
>> + hantro_write_addr(vpu, AV1_TILE_OUT_MV, dst->av1.mv_addr);
>> }
>>
>> int rockchip_vpu981_av1_dec_run(struct hantro_ctx *ctx)
> I like the direction this is going, as it removes a lot of open-coded
> stride/offset calculations, which have been a source of problems, and it also
> reduces the memory allocation overhead. My main worry is that we don't tightly
> manage the entries based on the DPB (references). So even if a reference has
> gone away, we don't explicitly reset the entry and prevent it from being used.
> I'd like to see that improved.
Sure, but I don't want to mix everything into this patch.
This needs to be solved per codec.
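For reference, a per-codec cleanup could take roughly the following form. This
is only a standalone sketch and not part of this patch: `struct
example_ref_table`, `example_prune_refs()` and EXAMPLE_NUM_REFS are
hypothetical names, and the real bookkeeping in each codec may need something
different.

```c
#include <stdbool.h>
#include <stdint.h>

#define EXAMPLE_NUM_REFS 17	/* stand-in value, not taken from the driver */

/* Hypothetical, simplified reference table: one POC per slot plus a bitmap. */
struct example_ref_table {
	int32_t  poc[EXAMPLE_NUM_REFS];
	uint32_t used;		/* bit i set: slot i holds a live reference */
};

/*
 * Clear the "used" bit of every slot whose POC is no longer present in the
 * DPB passed with the current frame, so a stale entry cannot be matched by
 * a later lookup.
 */
static void example_prune_refs(struct example_ref_table *t,
			       const int32_t *dpb_poc, unsigned int num_dpb)
{
	for (unsigned int i = 0; i < EXAMPLE_NUM_REFS; i++) {
		bool live = false;

		if (!(t->used & (1u << i)))
			continue;

		for (unsigned int j = 0; j < num_dpb; j++) {
			if (t->poc[i] == dpb_poc[j]) {
				live = true;
				break;
			}
		}

		if (!live)
			t->used &= ~(1u << i);
	}
}
```

Any auxiliary buffer tied to a pruned slot could then be released or reused at
the same point, which is where the per-codec differences would come in.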
Regards,
Benjamin
>
> Nicolas
^ permalink raw reply