* [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver
@ 2026-02-23 19:08 ` Ekansh Gupta
2026-02-23 19:08 ` [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs Ekansh Gupta
` (23 more replies)
0 siblings, 24 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:08 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
This patch series introduces the Qualcomm DSP Accelerator (QDA) driver,
a modern DRM-based accelerator implementation for Qualcomm Hexagon DSPs.
The driver provides a standardized interface for offloading computational
tasks to DSPs found on Qualcomm SoCs, supporting all DSP domains (ADSP,
CDSP, SDSP, GDSP).
The QDA driver is designed as an alternative for the FastRPC driver
in drivers/misc/, offering improved resource management, better integration
with standard kernel subsystems, and alignment with the Linux kernel's
Compute Accelerators framework.
User-space staging branch
============
https://github.com/qualcomm/fastrpc/tree/accel/staging
Key Features
============
* Standard DRM accelerator interface via /dev/accel/accelN
* GEM-based buffer management with DMA-BUF import/export support
* IOMMU-based memory isolation using per-process context banks
* FastRPC protocol implementation for DSP communication
* RPMsg transport layer for reliable message passing
* Support for all DSP domains (ADSP, CDSP, SDSP, GDSP)
* Comprehensive IOCTL interface for DSP operations
High-Level Architecture Differences with Existing FastRPC Driver
=================================================================
The QDA driver represents a significant architectural departure from the
existing FastRPC driver (drivers/misc/fastrpc.c), addressing several key
limitations while maintaining protocol compatibility:
1. DRM Accelerator Framework Integration
- FastRPC: Custom character device (/dev/fastrpc-*)
- QDA: Standard DRM accel device (/dev/accel/accelN)
- Benefit: Leverages established DRM infrastructure for device
management.
2. Memory Management
- FastRPC: Custom memory allocator with ION/DMA-BUF integration
- QDA: Native GEM objects with full PRIME support
- Benefit: Seamless buffer sharing using standard DRM mechanisms
3. IOMMU Context Bank Management
- FastRPC: Direct IOMMU domain manipulation, limited isolation
- QDA: Custom compute bus (qda_cb_bus_type) with proper device model
- Benefit: Each CB device is a proper struct device with IOMMU group
support, enabling better isolation and resource tracking.
- https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualcomm.com/
4. Memory Manager Architecture
- FastRPC: Monolithic allocator
- QDA: Pluggable memory manager with backend abstraction
- Benefit: Currently uses DMA-coherent backend, easily extensible for
future memory types (e.g., carveout, CMA)
5. Transport Layer
- FastRPC: Direct RPMsg integration in core driver
- QDA: Abstracted transport layer (qda_rpmsg.c)
- Benefit: Clean separation of concerns, easier to add alternative
transports if needed
8. Code Organization
- FastRPC: ~3000 lines in single file
- QDA: Modular design across multiple files (~4600 lines total)
* qda_drv.c: Core driver and DRM integration
* qda_gem.c: GEM object management
* qda_memory_manager.c: Memory and IOMMU management
* qda_fastrpc.c: FastRPC protocol implementation
* qda_rpmsg.c: Transport layer
* qda_cb.c: Context bank device management
- Benefit: Better maintainability, clearer separation of concerns
9. UAPI Design
- FastRPC: Custom IOCTL interface
- QDA: DRM-style IOCTLs with proper versioning support
- Benefit: Follows DRM conventions, easier userspace integration
10. Documentation
- FastRPC: Minimal in-tree documentation
- QDA: Comprehensive documentation in Documentation/accel/qda/
- Benefit: Better developer experience, clearer API contracts
11. Buffer Reference Mechanism
- FastRPC: Uses buffer file descriptors (FDs) for all book-keeping
in both kernel and DSP
- QDA: Uses GEM handles for kernel-side management, providing better
integration with DRM subsystem
- Benefit: Leverages DRM GEM infrastructure for reference counting,
lifetime management, and integration with other DRM components
Key Technical Improvements
===========================
* Proper device model: CB devices are real struct device instances on a
custom bus, enabling proper IOMMU group management and power management
integration
* Reference-counted IOMMU devices: Multiple file descriptors from the same
process share a single IOMMU device, reducing overhead
* GEM-based buffer lifecycle: Automatic cleanup via DRM GEM reference
counting, eliminating many resource leak scenarios
* Modular memory backends: The memory manager supports pluggable backends,
currently implementing DMA-coherent allocations with SID-prefixed
addresses for DSP firmware
* Context-based invocation tracking: XArray-based context management with
proper synchronization and cleanup
Patch Series Organization
==========================
Patches 1-2: Driver skeleton and documentation
Patches 3-6: RPMsg transport and IOMMU/CB infrastructure
Patches 7-9: DRM device registration and basic IOCTL
Patches 10-12: GEM buffer management and PRIME support
Patches 13-17: FastRPC protocol implementation (attach, invoke, create,
map/unmap)
Patch 18: MAINTAINERS entry
Open Items
===========
The following items are identified as open items:
1. Privilege Level Management
- Currently, daemon processes and user processes have the same access
level as both use the same accel device node. This needs to be
addressed as daemons attach to privileged DSP PDs and require
higher privilege levels for system-level operations
- Seeking guidance on the best approach: separate device nodes,
capability-based checks, or DRM master/authentication mechanisms
2. UAPI Compatibility Layer
- Add UAPI compat layer to facilitate migration of client applications
from existing FastRPC UAPI to the new QDA accel driver UAPI,
ensuring smooth transition for existing userspace code
- Seeking guidance on implementation approach: in-kernel translation
layer, userspace wrapper library, or hybrid solution
3. Documentation Improvements
- Add detailed IOCTL usage examples
- Document DSP firmware interface requirements
- Create migration guide from existing FastRPC
4. Per-Domain Memory Allocation
- Develop new userspace API to support memory allocation on a per
domain basis, enabling domain-specific memory management and
optimization
5. Audio and Sensors PD Support
- The current patch series does not handle Audio PD and Sensors PD
functionalities. These specialized protection domains require
additional support for real-time constraints and power management
Interface Compatibility
========================
The QDA driver maintains compatibility with existing FastRPC infrastructure:
* Device Tree Bindings: The driver uses the same device tree bindings as
the existing FastRPC driver, ensuring no changes are required to device
tree sources. The "qcom,fastrpc" compatible string and child node
structure remain unchanged.
* Userspace Interface: While the driver provides a new DRM-based UAPI,
the underlying FastRPC protocol and DSP firmware interface remain
compatible. This ensures that DSP firmware and libraries continue to
work without modification.
* Migration Path: The modular design allows for gradual migration, where
both drivers can coexist during the transition period. Applications can
be migrated incrementally to the new UAPI with the help of the planned
compatibility layer.
References
==========
Previous discussions on this migration:
- https://lkml.org/lkml/2024/6/24/479
- https://lkml.org/lkml/2024/6/21/1252
Testing
=======
The driver has been tested on Qualcomm platforms with:
- Basic FastRPC attach/release operations
- DSP process creation and initialization
- Memory mapping/unmapping operations
- Dynamic invocation with various buffer types
- GEM buffer allocation and mmap
- PRIME buffer import from other subsystems
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
Ekansh Gupta (18):
accel/qda: Add Qualcomm QDA DSP accelerator driver docs
accel/qda: Add Qualcomm DSP accelerator driver skeleton
accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
accel/qda: Create compute CB devices on QDA compute bus
accel/qda: Add memory manager for CB devices
accel/qda: Add DRM accel device registration for QDA driver
accel/qda: Add per-file DRM context and open/close handling
accel/qda: Add QUERY IOCTL and basic QDA UAPI header
accel/qda: Add DMA-backed GEM objects and memory manager integration
accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
accel/qda: Add PRIME dma-buf import support
accel/qda: Add initial FastRPC attach and release support
accel/qda: Add FastRPC dynamic invocation support
accel/qda: Add FastRPC DSP process creation support
accel/qda: Add FastRPC-based DSP memory mapping support
accel/qda: Add FastRPC-based DSP memory unmapping support
MAINTAINERS: Add MAINTAINERS entry for QDA driver
Documentation/accel/index.rst | 1 +
Documentation/accel/qda/index.rst | 14 +
Documentation/accel/qda/qda.rst | 129 ++++
MAINTAINERS | 9 +
arch/arm64/configs/defconfig | 2 +
drivers/accel/Kconfig | 1 +
drivers/accel/Makefile | 2 +
drivers/accel/qda/Kconfig | 35 ++
drivers/accel/qda/Makefile | 19 +
drivers/accel/qda/qda_cb.c | 182 ++++++
drivers/accel/qda/qda_cb.h | 26 +
drivers/accel/qda/qda_compute_bus.c | 23 +
drivers/accel/qda/qda_drv.c | 375 ++++++++++++
drivers/accel/qda/qda_drv.h | 171 ++++++
drivers/accel/qda/qda_fastrpc.c | 1002 ++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_fastrpc.h | 433 ++++++++++++++
drivers/accel/qda/qda_gem.c | 211 +++++++
drivers/accel/qda/qda_gem.h | 103 ++++
drivers/accel/qda/qda_ioctl.c | 271 +++++++++
drivers/accel/qda/qda_ioctl.h | 118 ++++
drivers/accel/qda/qda_memory_dma.c | 91 +++
drivers/accel/qda/qda_memory_dma.h | 46 ++
drivers/accel/qda/qda_memory_manager.c | 382 ++++++++++++
drivers/accel/qda/qda_memory_manager.h | 148 +++++
drivers/accel/qda/qda_prime.c | 194 +++++++
drivers/accel/qda/qda_prime.h | 43 ++
drivers/accel/qda/qda_rpmsg.c | 327 +++++++++++
drivers/accel/qda/qda_rpmsg.h | 57 ++
drivers/iommu/iommu.c | 4 +
include/linux/qda_compute_bus.h | 22 +
include/uapi/drm/qda_accel.h | 224 +++++++
31 files changed, 4665 insertions(+)
---
base-commit: d4906ae14a5f136ceb671bb14cedbf13fa560da6
change-id: 20260223-qda-firstpost-4ab05249e2cc
Best regards,
--
Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
^ permalink raw reply [flat|nested] 83+ messages in thread
* [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
@ 2026-02-23 19:08 ` Ekansh Gupta
2026-02-23 21:17 ` Dmitry Baryshkov
2026-02-24 3:33 ` Trilok Soni
2026-02-23 19:08 ` [PATCH RFC 02/18] accel/qda: Add Qualcomm DSP accelerator driver skeleton Ekansh Gupta
` (22 subsequent siblings)
23 siblings, 2 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:08 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
integrated in the DRM accel subsystem.
The new docs introduce QDA as a DRM/accel-based implementation of
Hexagon DSP offload that is intended as a modern alternative to the
legacy FastRPC driver in drivers/misc. The text describes the driver
motivation, high-level architecture and interaction with IOMMU context
banks, GEM-based buffer management and the RPMsg transport.
The user-space facing section documents the main QDA IOCTLs used to
establish DSP sessions, manage GEM buffer objects and invoke remote
procedures using the FastRPC protocol, along with a typical lifecycle
example for applications.
Finally, the driver is wired into the Compute Accelerators
documentation index under Documentation/accel, and a brief debugging
section shows how to enable dynamic debug for the QDA implementation.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
Documentation/accel/index.rst | 1 +
Documentation/accel/qda/index.rst | 14 +++++
Documentation/accel/qda/qda.rst | 129 ++++++++++++++++++++++++++++++++++++++
3 files changed, 144 insertions(+)
diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
index cbc7d4c3876a..5901ea7f784c 100644
--- a/Documentation/accel/index.rst
+++ b/Documentation/accel/index.rst
@@ -10,4 +10,5 @@ Compute Accelerators
introduction
amdxdna/index
qaic/index
+ qda/index
rocket/index
diff --git a/Documentation/accel/qda/index.rst b/Documentation/accel/qda/index.rst
new file mode 100644
index 000000000000..bce188f21117
--- /dev/null
+++ b/Documentation/accel/qda/index.rst
@@ -0,0 +1,14 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+==============================
+ accel/qda Qualcomm DSP Driver
+==============================
+
+The **accel/qda** driver provides support for Qualcomm Hexagon DSPs (Digital
+Signal Processors) within the DRM accelerator framework. It serves as a modern
+replacement for the legacy FastRPC driver, offering improved resource management
+and standard subsystem integration.
+
+.. toctree::
+
+ qda
diff --git a/Documentation/accel/qda/qda.rst b/Documentation/accel/qda/qda.rst
new file mode 100644
index 000000000000..742159841b95
--- /dev/null
+++ b/Documentation/accel/qda/qda.rst
@@ -0,0 +1,129 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+==================================
+Qualcomm Hexagon DSP (QDA) Driver
+==================================
+
+Introduction
+============
+
+The **QDA** (Qualcomm DSP Accelerator) driver is a new DRM-based
+accelerator driver for Qualcomm's Hexagon DSPs. It provides a standardized
+interface for user-space applications to offload computational tasks ranging
+from audio processing and sensor offload to computer vision and AI
+inference to the Hexagon DSPs found on Qualcomm SoCs.
+
+This driver is designed to align with the Linux kernel's modern **Compute
+Accelerators** subsystem (`drivers/accel/`), providing a robust and modular
+alternative to the legacy FastRPC driver in `drivers/misc/`, offering
+improved resource management and better integration with standard kernel
+subsystems.
+
+Motivation
+==========
+
+The existing FastRPC implementation in the kernel utilizes a custom character
+device and lacks integration with modern kernel memory management frameworks.
+The QDA driver addresses these limitations by:
+
+1. **Adopting the DRM accel Framework**: Leveraging standard uAPIs for device
+ management, job submission, and synchronization.
+2. **Utilizing GEM for Memory**: Providing proper buffer object management,
+ including DMA-BUF import/export capabilities.
+3. **Improving Isolation**: Using IOMMU context banks to enforce memory
+ isolation between different DSP user sessions.
+
+Key Features
+============
+
+* **Standard Accelerator Interface**: Exposes a standard character device
+ node (e.g., `/dev/accel/accel0`) via the DRM subsystem.
+* **Unified Offload Support**: Supports all DSP domains (ADSP, CDSP, SDSP,
+ GDSP) via a single driver architecture.
+* **FastRPC Protocol**: Implements the reliable Remote Procedure Call
+ (FastRPC) protocol for communication between the application processor
+ and DSP.
+* **DMA-BUF Interop**: Seamless sharing of memory buffers between the DSP
+ and other multimedia subsystems (GPU, Camera, Video) via standard DMA-BUFs.
+* **Modular Design**: Clean separation between the core DRM logic, the memory
+ manager, and the RPMsg-based transport layer.
+
+Architecture
+============
+
+The QDA driver is composed of several modular components:
+
+1. **Core Driver (`qda_drv`)**: Manages device registration, file operations,
+ and bridges the driver with the DRM accelerator subsystem.
+2. **Memory Manager (`qda_memory_manager`)**: A flexible memory management
+ layer that handles IOMMU context banks. It supports pluggable backends
+ (such as DMA-coherent) to adapt to different SoC memory architectures.
+3. **GEM Subsystem**: Implements the DRM GEM interface for buffer management:
+
+ * **`qda_gem`**: Core GEM object management, including allocation, mmap
+ operations, and buffer lifecycle management.
+ * **`qda_prime`**: PRIME import functionality for DMA-BUF interoperability,
+ enabling seamless buffer sharing with other kernel subsystems.
+
+4. **Transport Layer (`qda_rpmsg`)**: Abstraction over the RPMsg framework
+ to handle low-level message passing with the DSP firmware.
+5. **Compute Bus (`qda_compute_bus`)**: A custom virtual bus used to
+ enumerate and manage the specific compute context banks defined in the
+ device tree.
+6. **FastRPC Core (`qda_fastrpc`)**: Implements the protocol logic for
+ marshalling arguments and handling remote invocations.
+
+User-Space API
+==============
+
+The driver exposes a set of DRM-compliant IOCTLs. Note that these are designed
+to be familiar to existing FastRPC users while adhering to DRM standards.
+
+* `DRM_IOCTL_QDA_QUERY`: Query DSP type (e.g., "cdsp", "adsp")
+ and capabilities.
+* `DRM_IOCTL_QDA_INIT_ATTACH`: Attach a user session to the DSP's protection
+ domain.
+* `DRM_IOCTL_QDA_INIT_CREATE`: Initialize a new process context on the DSP.
+* `DRM_IOCTL_QDA_INVOKE`: Submit a remote method invocation (the primary
+ execution unit).
+* `DRM_IOCTL_QDA_GEM_CREATE`: Allocate a GEM buffer object for DSP usage.
+* `DRM_IOCTL_QDA_GEM_MMAP_OFFSET`: Retrieve mmap offsets for memory mapping.
+* `DRM_IOCTL_QDA_MAP` / `DRM_IOCTL_QDA_MUNMAP`: Map or unmap buffers into the
+ DSP's virtual address space.
+
+Usage Example
+=============
+
+A typical lifecycle for a user-space application:
+
+1. **Discovery**: Open `/dev/accel/accel*` and check
+ `DRM_IOCTL_QDA_QUERY` to find the desired DSP (e.g., CDSP for
+ compute workloads).
+2. **Initialization**: Call `DRM_IOCTL_QDA_INIT_ATTACH` and
+ `DRM_IOCTL_QDA_INIT_CREATE` to establish a session.
+3. **Memory**: Allocate buffers via `DRM_IOCTL_QDA_GEM_CREATE` or import
+ DMA-BUFs (PRIME fd) from other drivers using `DRM_IOCTL_PRIME_FD_TO_HANDLE`.
+4. **Execution**: Use `DRM_IOCTL_QDA_INVOKE` to pass arguments and execute
+ functions on the DSP.
+5. **Cleanup**: Close file descriptors to automatically release resources and
+ detach the session.
+
+Internal Implementation
+=======================
+
+Memory Management
+-----------------
+The driver's memory manager creates virtual "IOMMU devices" that map to
+hardware context banks. This allows the driver to manage multiple isolated
+address spaces. The implementation currently uses a **DMA-coherent backend**
+to ensure data consistency between the CPU and DSP without manual cache
+maintenance in most cases.
+
+Debugging
+=========
+The driver includes extensive dynamic debug support. Enable it via the
+kernel's dynamic debug control:
+
+.. code-block:: bash
+
+ echo "file drivers/accel/qda/* +p" > /sys/kernel/debug/dynamic_debug/control
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 02/18] accel/qda: Add Qualcomm DSP accelerator driver skeleton
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
2026-02-23 19:08 ` [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs Ekansh Gupta
@ 2026-02-23 19:08 ` Ekansh Gupta
2026-02-23 21:52 ` Bjorn Andersson
2026-02-23 19:08 ` [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator Ekansh Gupta
` (21 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:08 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
This patch introduces the initial scaffolding for the Qualcomm DSP
accelerator (QDA) driver under drivers/accel/.
The new Kconfig option DRM_ACCEL_QDA integrates the driver with the
DRM/accel subsystem, and the accel Makefile is updated to build the
driver as a loadable module. A minimal qda_drv.c file is added to
provide basic module_init/module_exit hooks so that the driver can be
built and loaded.
Subsequent patches will add:
- RPMSG-based communication with Qualcomm Hexagon DSPs
- FastRPC integration for userspace offload
- DMA-BUF support and memory management
- GEM, PRIME and IOCTL interfaces for compute job submission
This patch only wires up the basic driver framework and does not yet
provide functional DSP offload capabilities.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/Kconfig | 1 +
drivers/accel/Makefile | 1 +
drivers/accel/qda/Kconfig | 29 +++++++++++++++++++++++++++++
drivers/accel/qda/Makefile | 8 ++++++++
drivers/accel/qda/qda_drv.c | 22 ++++++++++++++++++++++
5 files changed, 61 insertions(+)
diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
index bdf48ccafcf2..74ac0f71bc9d 100644
--- a/drivers/accel/Kconfig
+++ b/drivers/accel/Kconfig
@@ -29,6 +29,7 @@ source "drivers/accel/ethosu/Kconfig"
source "drivers/accel/habanalabs/Kconfig"
source "drivers/accel/ivpu/Kconfig"
source "drivers/accel/qaic/Kconfig"
+source "drivers/accel/qda/Kconfig"
source "drivers/accel/rocket/Kconfig"
endif
diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
index 1d3a7251b950..58c08dd5f389 100644
--- a/drivers/accel/Makefile
+++ b/drivers/accel/Makefile
@@ -5,4 +5,5 @@ obj-$(CONFIG_DRM_ACCEL_ARM_ETHOSU) += ethosu/
obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/
obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/
obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/
+obj-$(CONFIG_DRM_ACCEL_QDA) += qda/
obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/
\ No newline at end of file
diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig
new file mode 100644
index 000000000000..3c78ff6189e0
--- /dev/null
+++ b/drivers/accel/qda/Kconfig
@@ -0,0 +1,29 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Qualcomm DSP accelerator driver
+#
+
+config DRM_ACCEL_QDA
+ tristate "Qualcomm DSP accelerator"
+ depends on DRM_ACCEL
+ depends on ARCH_QCOM || COMPILE_TEST
+ help
+ Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs.
+ This driver provides a standardized interface for offloading computational
+ tasks to the DSP, including audio processing, sensor offload, computer
+ vision, and AI inference workloads.
+
+ The driver supports all DSP domains (ADSP, CDSP, SDSP, GDSP) and
+ implements the FastRPC protocol for communication between the application
+ processor and DSP. It integrates with the Linux kernel's Compute
+ Accelerators subsystem (drivers/accel/) and provides a modern alternative
+ to the legacy FastRPC driver found in drivers/misc/.
+
+ Key features include DMA-BUF interoperability for seamless buffer sharing
+ with other multimedia subsystems, IOMMU-based memory isolation, and
+ standard DRM IOCTLs for device management and job submission.
+
+ If unsure, say N.
+
+ To compile this driver as a module, choose M here: the
+ module will be called qda.
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
new file mode 100644
index 000000000000..573711af1d28
--- /dev/null
+++ b/drivers/accel/qda/Makefile
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Makefile for Qualcomm DSP accelerator driver
+#
+
+obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
+
+qda-y := qda_drv.o
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
new file mode 100644
index 000000000000..18b0d3fb1598
--- /dev/null
+++ b/drivers/accel/qda/qda_drv.c
@@ -0,0 +1,22 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <linux/module.h>
+#include <linux/kernel.h>
+
+static int __init qda_core_init(void)
+{
+ pr_info("QDA: driver initialization complete\n");
+ return 0;
+}
+
+static void __exit qda_core_exit(void)
+{
+ pr_info("QDA: driver exit complete\n");
+}
+
+module_init(qda_core_init);
+module_exit(qda_core_exit);
+
+MODULE_AUTHOR("Qualcomm AI Infra Team");
+MODULE_DESCRIPTION("Qualcomm DSP Accelerator Driver");
+MODULE_LICENSE("GPL");
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
2026-02-23 19:08 ` [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs Ekansh Gupta
2026-02-23 19:08 ` [PATCH RFC 02/18] accel/qda: Add Qualcomm DSP accelerator driver skeleton Ekansh Gupta
@ 2026-02-23 19:08 ` Ekansh Gupta
2026-02-23 21:23 ` Dmitry Baryshkov
2026-02-23 19:08 ` [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU Ekansh Gupta
` (20 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:08 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Extend the Qualcomm DSP accelerator (QDA) driver with an RPMsg-based
transport used to discover and manage DSP instances.
This patch introduces:
- A core qda_dev structure with basic device state (rpmsg device,
device pointer, lock, removal flag, DSP name).
- Logging helpers that integrate with dev_* when a device is available
and fall back to pr_* otherwise.
- An RPMsg client driver that binds to the Qualcomm FastRPC service and
allocates a qda_dev instance using devm_kzalloc().
- Basic device initialization and teardown paths wired into the module
init/exit.
The RPMsg driver currently sets the DSP name from a "label" property in
the device tree, which will be used by subsequent patches to distinguish
between different DSP domains (e.g. ADSP, CDSP).
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/Kconfig | 1 +
drivers/accel/qda/Makefile | 4 +-
drivers/accel/qda/qda_drv.c | 41 ++++++++++++++-
drivers/accel/qda/qda_drv.h | 91 ++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_rpmsg.c | 119 ++++++++++++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_rpmsg.h | 17 ++++++
6 files changed, 270 insertions(+), 3 deletions(-)
diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig
index 3c78ff6189e0..484d21ff1b55 100644
--- a/drivers/accel/qda/Kconfig
+++ b/drivers/accel/qda/Kconfig
@@ -7,6 +7,7 @@ config DRM_ACCEL_QDA
tristate "Qualcomm DSP accelerator"
depends on DRM_ACCEL
depends on ARCH_QCOM || COMPILE_TEST
+ depends on RPMSG
help
Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs.
This driver provides a standardized interface for offloading computational
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 573711af1d28..e7f23182589b 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -5,4 +5,6 @@
obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
-qda-y := qda_drv.o
+qda-y := \
+ qda_drv.o \
+ qda_rpmsg.o \
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 18b0d3fb1598..389c66a9ad4f 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -2,16 +2,53 @@
// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
#include <linux/module.h>
#include <linux/kernel.h>
+#include <linux/atomic.h>
+#include "qda_drv.h"
+#include "qda_rpmsg.h"
+
+static void cleanup_device_resources(struct qda_dev *qdev)
+{
+ mutex_destroy(&qdev->lock);
+}
+
+void qda_deinit_device(struct qda_dev *qdev)
+{
+ cleanup_device_resources(qdev);
+}
+
+/* Initialize device resources */
+static void init_device_resources(struct qda_dev *qdev)
+{
+ qda_dbg(qdev, "Initializing device resources\n");
+
+ mutex_init(&qdev->lock);
+ atomic_set(&qdev->removing, 0);
+}
+
+int qda_init_device(struct qda_dev *qdev)
+{
+ init_device_resources(qdev);
+
+ qda_dbg(qdev, "QDA device initialized successfully\n");
+ return 0;
+}
static int __init qda_core_init(void)
{
- pr_info("QDA: driver initialization complete\n");
+ int ret;
+
+ ret = qda_rpmsg_register();
+ if (ret)
+ return ret;
+
+ qda_info(NULL, "QDA driver initialization complete\n");
return 0;
}
static void __exit qda_core_exit(void)
{
- pr_info("QDA: driver exit complete\n");
+ qda_rpmsg_unregister();
+ qda_info(NULL, "QDA driver exit complete\n");
}
module_init(qda_core_init);
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
new file mode 100644
index 000000000000..bec2d31ca1bb
--- /dev/null
+++ b/drivers/accel/qda/qda_drv.h
@@ -0,0 +1,91 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_DRV_H__
+#define __QDA_DRV_H__
+
+#include <linux/device.h>
+#include <linux/mutex.h>
+#include <linux/rpmsg.h>
+#include <linux/xarray.h>
+
+/* Driver identification */
+#define DRIVER_NAME "qda"
+
+/* struct qda_dev - Main device structure for QDA driver */
+struct qda_dev {
+ /* RPMsg device for communication with remote processor */
+ struct rpmsg_device *rpdev;
+ /* Underlying device structure */
+ struct device *dev;
+ /* Mutex protecting device state */
+ struct mutex lock;
+ /* Flag indicating device removal in progress */
+ atomic_t removing;
+ /* Name of the DSP (e.g., "cdsp", "adsp") */
+ char dsp_name[16];
+};
+
+/**
+ * qda_get_log_device - Get appropriate device for logging
+ * @qdev: QDA device structure
+ *
+ * Returns the most appropriate device structure for logging messages.
+ * Prefers qdev->dev, or returns NULL if the device is being removed
+ * or invalid.
+ */
+static inline struct device *qda_get_log_device(struct qda_dev *qdev)
+{
+ if (!qdev || atomic_read(&qdev->removing))
+ return NULL;
+
+ if (qdev->dev)
+ return qdev->dev;
+
+ return NULL;
+}
+
+/*
+ * Logging macros
+ *
+ * These macros provide consistent logging across the driver with automatic
+ * function name inclusion. They use dev_* functions when a device is available,
+ * falling back to pr_* functions otherwise.
+ */
+
+/* Error logging - always logs and tracks errors */
+#define qda_err(qdev, fmt, ...) do { \
+ struct device *__dev = qda_get_log_device(qdev); \
+ if (__dev) \
+ dev_err(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
+ else \
+ pr_err(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
+} while (0)
+
+/* Info logging - always logs, can be filtered via loglevel */
+#define qda_info(qdev, fmt, ...) do { \
+ struct device *__dev = qda_get_log_device(qdev); \
+ if (__dev) \
+ dev_info(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
+ else \
+ pr_info(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
+} while (0)
+
+/* Debug logging - controlled via dynamic debug (CONFIG_DYNAMIC_DEBUG) */
+#define qda_dbg(qdev, fmt, ...) do { \
+ struct device *__dev = qda_get_log_device(qdev); \
+ if (__dev) \
+ dev_dbg(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
+ else \
+ pr_debug(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
+} while (0)
+
+/*
+ * Core device management functions
+ */
+int qda_init_device(struct qda_dev *qdev);
+void qda_deinit_device(struct qda_dev *qdev);
+
+#endif /* __QDA_DRV_H__ */
diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
new file mode 100644
index 000000000000..a8b24a99ca13
--- /dev/null
+++ b/drivers/accel/qda/qda_rpmsg.c
@@ -0,0 +1,119 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <linux/module.h>
+#include <linux/rpmsg.h>
+#include <linux/of_platform.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include "qda_drv.h"
+#include "qda_rpmsg.h"
+
+static int qda_rpmsg_init(struct qda_dev *qdev)
+{
+ dev_set_drvdata(&qdev->rpdev->dev, qdev);
+ return 0;
+}
+
+/* Utility function to allocate and initialize qda_dev */
+static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
+{
+ struct qda_dev *qdev;
+
+ qdev = devm_kzalloc(&rpdev->dev, sizeof(*qdev), GFP_KERNEL);
+ if (!qdev)
+ return ERR_PTR(-ENOMEM);
+
+ qdev->dev = &rpdev->dev;
+ qdev->rpdev = rpdev;
+
+ qda_dbg(qdev, "Allocated and initialized qda_dev\n");
+ return qdev;
+}
+
+static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, void *priv, u32 src)
+{
+ /* Dummy function for rpmsg driver */
+ return 0;
+}
+
+static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
+{
+ struct qda_dev *qdev = dev_get_drvdata(&rpdev->dev);
+
+ qda_info(qdev, "Removing RPMsg device\n");
+
+ atomic_set(&qdev->removing, 1);
+
+ mutex_lock(&qdev->lock);
+ qdev->rpdev = NULL;
+ mutex_unlock(&qdev->lock);
+
+ qda_deinit_device(qdev);
+
+ qda_info(qdev, "RPMsg device removed\n");
+}
+
+static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
+{
+ struct qda_dev *qdev;
+ int ret;
+ const char *label;
+
+ qda_dbg(NULL, "QDA RPMsg probe starting\n");
+
+ qdev = alloc_and_init_qdev(rpdev);
+ if (IS_ERR(qdev))
+ return PTR_ERR(qdev);
+
+ ret = of_property_read_string(rpdev->dev.of_node, "label", &label);
+ if (!ret) {
+ strscpy(qdev->dsp_name, label, sizeof(qdev->dsp_name));
+ } else {
+ qda_info(qdev, "QDA DSP label not found in DT\n");
+ return ret;
+ }
+
+ ret = qda_rpmsg_init(qdev);
+ if (ret) {
+ qda_err(qdev, "RPMsg init failed: %d\n", ret);
+ return ret;
+ }
+
+ ret = qda_init_device(qdev);
+ if (ret)
+ return ret;
+
+ qda_info(qdev, "QDA RPMsg probe completed successfully for %s\n", qdev->dsp_name);
+ return 0;
+}
+
+static const struct of_device_id qda_rpmsg_id_table[] = {
+ { .compatible = "qcom,fastrpc" },
+ {},
+};
+MODULE_DEVICE_TABLE(of, qda_rpmsg_id_table);
+
+static struct rpmsg_driver qda_rpmsg_driver = {
+ .probe = qda_rpmsg_probe,
+ .remove = qda_rpmsg_remove,
+ .callback = qda_rpmsg_cb,
+ .drv = {
+ .name = "qcom,fastrpc",
+ .of_match_table = qda_rpmsg_id_table,
+ },
+};
+
+int qda_rpmsg_register(void)
+{
+ int ret = register_rpmsg_driver(&qda_rpmsg_driver);
+
+ if (ret)
+ qda_err(NULL, "Failed to register RPMsg driver: %d\n", ret);
+
+ return ret;
+}
+
+void qda_rpmsg_unregister(void)
+{
+ unregister_rpmsg_driver(&qda_rpmsg_driver);
+}
diff --git a/drivers/accel/qda/qda_rpmsg.h b/drivers/accel/qda/qda_rpmsg.h
new file mode 100644
index 000000000000..348827bff255
--- /dev/null
+++ b/drivers/accel/qda/qda_rpmsg.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_RPMSG_H__
+#define __QDA_RPMSG_H__
+
+#include "qda_drv.h"
+
+/*
+ * Transport layer registration
+ */
+int qda_rpmsg_register(void);
+void qda_rpmsg_unregister(void);
+
+#endif /* __QDA_RPMSG_H__ */
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (2 preceding siblings ...)
2026-02-23 19:08 ` [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator Ekansh Gupta
@ 2026-02-23 19:08 ` Ekansh Gupta
2026-02-23 22:44 ` Dmitry Baryshkov
2026-02-26 10:46 ` Krzysztof Kozlowski
2026-02-23 19:08 ` [PATCH RFC 05/18] accel/qda: Create compute CB devices on QDA compute bus Ekansh Gupta
` (19 subsequent siblings)
23 siblings, 2 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:08 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Introduce a built-in compute context-bank (CB) bus used by the Qualcomm
DSP accelerator (QDA) driver to represent DSP CB devices that require
IOMMU configuration. This separates the CB bus from the QDA driver and
allows QDA to remain a loadable module while the bus is always built-in.
A new bool Kconfig symbol DRM_ACCEL_QDA_COMPUTE_BUS is added and is
selected by the main DRM_ACCEL_QDA driver. The parent accel Makefile is
updated to descend into the QDA directory for both built-in and module
builds so that the CB bus is compiled into vmlinux while the driver
remains modular.
The CB bus is registered at postcore_initcall() time and is exposed to
the IOMMU core through iommu_buses[] in the same way as the Tegra
host1x context-bus. This enables later patches to create CB devices on
this bus and obtain IOMMU domains for them.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/Makefile | 1 +
drivers/accel/qda/Kconfig | 5 +++++
drivers/accel/qda/Makefile | 2 ++
drivers/accel/qda/qda_compute_bus.c | 23 +++++++++++++++++++++++
drivers/iommu/iommu.c | 4 ++++
include/linux/qda_compute_bus.h | 22 ++++++++++++++++++++++
6 files changed, 57 insertions(+)
diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
index 58c08dd5f389..9ed843cd293f 100644
--- a/drivers/accel/Makefile
+++ b/drivers/accel/Makefile
@@ -6,4 +6,5 @@ obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/
obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/
obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/
obj-$(CONFIG_DRM_ACCEL_QDA) += qda/
+obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda/
obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/
\ No newline at end of file
diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig
index 484d21ff1b55..ef1fa384efbe 100644
--- a/drivers/accel/qda/Kconfig
+++ b/drivers/accel/qda/Kconfig
@@ -3,11 +3,16 @@
# Qualcomm DSP accelerator driver
#
+
+config DRM_ACCEL_QDA_COMPUTE_BUS
+ bool
+
config DRM_ACCEL_QDA
tristate "Qualcomm DSP accelerator"
depends on DRM_ACCEL
depends on ARCH_QCOM || COMPILE_TEST
depends on RPMSG
+ select DRM_ACCEL_QDA_COMPUTE_BUS
help
Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs.
This driver provides a standardized interface for offloading computational
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index e7f23182589b..242684ef1af7 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -8,3 +8,5 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
qda-y := \
qda_drv.o \
qda_rpmsg.o \
+
+obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
diff --git a/drivers/accel/qda/qda_compute_bus.c b/drivers/accel/qda/qda_compute_bus.c
new file mode 100644
index 000000000000..1d9c39948fb5
--- /dev/null
+++ b/drivers/accel/qda/qda_compute_bus.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <linux/device.h>
+#include <linux/init.h>
+
+struct bus_type qda_cb_bus_type = {
+ .name = "qda-compute-cb",
+};
+EXPORT_SYMBOL_GPL(qda_cb_bus_type);
+
+static int __init qda_cb_bus_init(void)
+{
+ int err;
+
+ err = bus_register(&qda_cb_bus_type);
+ if (err < 0) {
+ pr_err("qda-compute-cb bus registration failed: %d\n", err);
+ return err;
+ }
+ return 0;
+}
+
+postcore_initcall(qda_cb_bus_init);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 4926a43118e6..5dee912686ee 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -33,6 +33,7 @@
#include <trace/events/iommu.h>
#include <linux/sched/mm.h>
#include <linux/msi.h>
+#include <linux/qda_compute_bus.h>
#include <uapi/linux/iommufd.h>
#include "dma-iommu.h"
@@ -178,6 +179,9 @@ static const struct bus_type * const iommu_buses[] = {
#ifdef CONFIG_CDX_BUS
&cdx_bus_type,
#endif
+#ifdef CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS
+ &qda_cb_bus_type,
+#endif
};
/*
diff --git a/include/linux/qda_compute_bus.h b/include/linux/qda_compute_bus.h
new file mode 100644
index 000000000000..807122d84e3f
--- /dev/null
+++ b/include/linux/qda_compute_bus.h
@@ -0,0 +1,22 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_COMPUTE_BUS_H__
+#define __QDA_COMPUTE_BUS_H__
+
+#include <linux/device.h>
+
+/*
+ * Custom bus type for QDA compute context bank (CB) devices
+ *
+ * This bus type is used for manually created CB devices that represent
+ * IOMMU context banks. The custom bus allows proper IOMMU configuration
+ * and device management for these virtual devices.
+ */
+#ifdef CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS
+extern struct bus_type qda_cb_bus_type;
+#endif
+
+#endif /* __QDA_COMPUTE_BUS_H__ */
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 05/18] accel/qda: Create compute CB devices on QDA compute bus
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (3 preceding siblings ...)
2026-02-23 19:08 ` [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU Ekansh Gupta
@ 2026-02-23 19:08 ` Ekansh Gupta
2026-02-23 22:49 ` Dmitry Baryshkov
2026-02-23 19:09 ` [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices Ekansh Gupta
` (18 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:08 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Add support for creating compute context-bank (CB) devices under
the QDA compute bus based on child nodes of the FastRPC RPMsg
device tree node. Each DT child with compatible
"qcom,fastrpc-compute-cb" is turned into a QDA-owned struct
device on qda_cb_bus_type.
A new qda_cb_dev structure and cb_devs list in qda_dev track these
CB devices. qda_populate_child_devices() walks the DT children
during QDA RPMsg probe, creates CB devices, configures their DMA
and IOMMU settings using of_dma_configure(), and associates a SID
from the "reg" property when present.
On RPMsg remove, qda_unpopulate_child_devices() tears down all CB
devices, removing them from their IOMMU groups if present and
unregistering the devices. This prepares the ground for using CB
devices as IOMMU endpoints for DSP compute workloads in later
patches.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/Makefile | 1 +
drivers/accel/qda/qda_cb.c | 150 ++++++++++++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_cb.h | 26 ++++++++
drivers/accel/qda/qda_drv.h | 3 +
drivers/accel/qda/qda_rpmsg.c | 40 +++++++++++
5 files changed, 220 insertions(+)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 242684ef1af7..4aded20b6bc2 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -8,5 +8,6 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
qda-y := \
qda_drv.o \
qda_rpmsg.o \
+ qda_cb.o \
obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
new file mode 100644
index 000000000000..77a2d8cae076
--- /dev/null
+++ b/drivers/accel/qda/qda_cb.c
@@ -0,0 +1,150 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <linux/dma-mapping.h>
+#include <linux/device.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/iommu.h>
+#include <linux/slab.h>
+#include "qda_drv.h"
+#include "qda_cb.h"
+
+static void qda_cb_dev_release(struct device *dev)
+{
+ kfree(dev);
+}
+
+static int qda_configure_cb_iommu(struct device *cb_dev, struct device_node *cb_node)
+{
+ int ret;
+
+ qda_dbg(NULL, "Configuring DMA/IOMMU for CB device %s\n", dev_name(cb_dev));
+
+ /* Use of_dma_configure which handles both DMA and IOMMU configuration */
+ ret = of_dma_configure(cb_dev, cb_node, true);
+ if (ret) {
+ qda_err(NULL, "of_dma_configure failed for %s: %d\n", dev_name(cb_dev), ret);
+ return ret;
+ }
+
+ qda_dbg(NULL, "DMA/IOMMU configured successfully for CB device %s\n", dev_name(cb_dev));
+ return 0;
+}
+
+static int qda_cb_setup_device(struct qda_dev *qdev, struct device *cb_dev)
+{
+ int rc;
+ u32 sid, pa_bits = 32;
+
+ qda_dbg(qdev, "Setting up CB device %s\n", dev_name(cb_dev));
+
+ if (of_property_read_u32(cb_dev->of_node, "reg", &sid)) {
+ qda_dbg(qdev, "No 'reg' property found, defaulting SID to 0\n");
+ sid = 0;
+ }
+
+ rc = dma_set_mask(cb_dev, DMA_BIT_MASK(pa_bits));
+ if (rc) {
+ qda_err(qdev, "%d bit DMA enable failed: %d\n", pa_bits, rc);
+ return rc;
+ }
+
+ qda_dbg(qdev, "CB device setup complete - SID: %u, PA bits: %u\n", sid, pa_bits);
+
+ return 0;
+}
+
+int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node)
+{
+ struct device *cb_dev;
+ int ret;
+ u32 sid = 0;
+ struct qda_cb_dev *entry;
+
+ qda_dbg(qdev, "Creating CB device for node: %s\n", cb_node->name);
+
+ of_property_read_u32(cb_node, "reg", &sid);
+
+ cb_dev = kzalloc_obj(*cb_dev, GFP_KERNEL);
+ if (!cb_dev)
+ return -ENOMEM;
+
+ device_initialize(cb_dev);
+ cb_dev->parent = qdev->dev;
+ cb_dev->bus = &qda_cb_bus_type; /* Use our custom bus type for IOMMU handling */
+ cb_dev->release = qda_cb_dev_release;
+ dev_set_name(cb_dev, "qda-cb-%s-%u", qdev->dsp_name, sid);
+
+ qda_dbg(qdev, "Initialized CB device: %s\n", dev_name(cb_dev));
+
+ cb_dev->of_node = of_node_get(cb_node);
+
+ cb_dev->dma_mask = &cb_dev->coherent_dma_mask;
+ cb_dev->coherent_dma_mask = DMA_BIT_MASK(32);
+
+ dev_set_drvdata(cb_dev->parent, qdev);
+
+ ret = device_add(cb_dev);
+ if (ret) {
+ qda_err(qdev, "Failed to add CB device for SID %u: %d\n", sid, ret);
+ goto cleanup_device_init;
+ }
+
+ qda_dbg(qdev, "CB device added to system\n");
+
+ ret = qda_configure_cb_iommu(cb_dev, cb_node);
+ if (ret) {
+ qda_err(qdev, "IOMMU configuration failed: %d\n", ret);
+ goto cleanup_device_add;
+ }
+
+ ret = qda_cb_setup_device(qdev, cb_dev);
+ if (ret) {
+ qda_err(qdev, "CB device setup failed: %d\n", ret);
+ goto cleanup_device_add;
+ }
+
+ entry = kzalloc(sizeof(*entry), GFP_KERNEL);
+ if (!entry) {
+ ret = -ENOMEM;
+ goto cleanup_device_add;
+ }
+
+ entry->dev = cb_dev;
+ list_add_tail(&entry->node, &qdev->cb_devs);
+
+ qda_dbg(qdev, "Successfully created CB device for SID %u\n", sid);
+ return 0;
+
+cleanup_device_add:
+ device_del(cb_dev);
+cleanup_device_init:
+ of_node_put(cb_dev->of_node);
+ put_device(cb_dev);
+ return ret;
+}
+
+void qda_destroy_cb_device(struct device *cb_dev)
+{
+ struct iommu_group *group;
+
+ if (!cb_dev) {
+ qda_dbg(NULL, "NULL CB device passed to destroy\n");
+ return;
+ }
+
+ qda_dbg(NULL, "Destroying CB device %s\n", dev_name(cb_dev));
+
+ group = iommu_group_get(cb_dev);
+ if (group) {
+ qda_dbg(NULL, "Removing %s from IOMMU group\n", dev_name(cb_dev));
+ iommu_group_remove_device(cb_dev);
+ iommu_group_put(group);
+ }
+
+ of_node_put(cb_dev->of_node);
+ cb_dev->of_node = NULL;
+ device_unregister(cb_dev);
+
+ qda_dbg(NULL, "CB device %s destroyed\n", dev_name(cb_dev));
+}
diff --git a/drivers/accel/qda/qda_cb.h b/drivers/accel/qda/qda_cb.h
new file mode 100644
index 000000000000..a4ae9fef142e
--- /dev/null
+++ b/drivers/accel/qda/qda_cb.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_CB_H__
+#define __QDA_CB_H__
+
+#include <linux/device.h>
+#include <linux/of.h>
+#include <linux/list.h>
+#include <linux/qda_compute_bus.h>
+#include "qda_drv.h"
+
+struct qda_cb_dev {
+ struct list_head node;
+ struct device *dev;
+};
+
+/*
+ * Compute bus (CB) device management
+ */
+int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node);
+void qda_destroy_cb_device(struct device *cb_dev);
+
+#endif /* __QDA_CB_H__ */
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index bec2d31ca1bb..eb732b7d8091 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -7,6 +7,7 @@
#define __QDA_DRV_H__
#include <linux/device.h>
+#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/rpmsg.h>
#include <linux/xarray.h>
@@ -26,6 +27,8 @@ struct qda_dev {
atomic_t removing;
/* Name of the DSP (e.g., "cdsp", "adsp") */
char dsp_name[16];
+ /* Compute context-bank (CB) child devices */
+ struct list_head cb_devs;
};
/**
diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
index a8b24a99ca13..5a57384de6a2 100644
--- a/drivers/accel/qda/qda_rpmsg.c
+++ b/drivers/accel/qda/qda_rpmsg.c
@@ -7,6 +7,7 @@
#include <linux/of_device.h>
#include "qda_drv.h"
#include "qda_rpmsg.h"
+#include "qda_cb.h"
static int qda_rpmsg_init(struct qda_dev *qdev)
{
@@ -25,11 +26,42 @@ static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
qdev->dev = &rpdev->dev;
qdev->rpdev = rpdev;
+ INIT_LIST_HEAD(&qdev->cb_devs);
qda_dbg(qdev, "Allocated and initialized qda_dev\n");
return qdev;
}
+static void qda_unpopulate_child_devices(struct qda_dev *qdev)
+{
+ struct qda_cb_dev *entry, *tmp;
+
+ list_for_each_entry_safe(entry, tmp, &qdev->cb_devs, node) {
+ list_del(&entry->node);
+ qda_destroy_cb_device(entry->dev);
+ kfree(entry);
+ }
+}
+
+static int qda_populate_child_devices(struct qda_dev *qdev, struct device_node *parent_node)
+{
+ struct device_node *child;
+ int count = 0, success = 0;
+
+ for_each_child_of_node(parent_node, child) {
+ if (of_device_is_compatible(child, "qcom,fastrpc-compute-cb")) {
+ count++;
+ if (qda_create_cb_device(qdev, child) == 0) {
+ success++;
+ qda_dbg(qdev, "Created CB device for node: %s\n", child->name);
+ } else {
+ qda_err(qdev, "Failed to create CB device for: %s\n", child->name);
+ }
+ }
+ }
+ return success > 0 ? 0 : (count > 0 ? -ENODEV : 0);
+}
+
static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, void *priv, u32 src)
{
/* Dummy function for rpmsg driver */
@@ -48,6 +80,7 @@ static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
qdev->rpdev = NULL;
mutex_unlock(&qdev->lock);
+ qda_unpopulate_child_devices(qdev);
qda_deinit_device(qdev);
qda_info(qdev, "RPMsg device removed\n");
@@ -83,6 +116,13 @@ static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
if (ret)
return ret;
+ ret = qda_populate_child_devices(qdev, rpdev->dev.of_node);
+ if (ret) {
+ qda_err(qdev, "Failed to populate child devices: %d\n", ret);
+ qda_deinit_device(qdev);
+ return ret;
+ }
+
qda_info(qdev, "QDA RPMsg probe completed successfully for %s\n", qdev->dsp_name);
return 0;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (4 preceding siblings ...)
2026-02-23 19:08 ` [PATCH RFC 05/18] accel/qda: Create compute CB devices on QDA compute bus Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 22:50 ` Dmitry Baryshkov
2026-02-23 23:11 ` Bjorn Andersson
2026-02-23 19:09 ` [PATCH RFC 07/18] accel/qda: Add DRM accel device registration for QDA driver Ekansh Gupta
` (17 subsequent siblings)
23 siblings, 2 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Introduce a per-device memory manager for the QDA driver that tracks
IOMMU-capable compute context-bank (CB) devices. Each CB device is
represented by a qda_iommu_device and registered with a central
qda_memory_manager instance owned by qda_dev.
The memory manager maintains an xarray of devices and assigns a
unique ID to each CB. It also provides basic lifetime management
and a workqueue for deferred device removal. qda_cb_setup_device()
now allocates a qda_iommu_device for each CB and registers it with
the memory manager after DMA configuration succeeds.
qda_init_device() is extended to allocate and initialize the memory
manager, while qda_deinit_device() will tear it down in later
patches. This prepares the QDA driver for fine-grained memory and
IOMMU domain management tied to individual CB devices.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/Makefile | 1 +
drivers/accel/qda/qda_cb.c | 32 +++++++
drivers/accel/qda/qda_drv.c | 46 ++++++++++
drivers/accel/qda/qda_drv.h | 3 +
drivers/accel/qda/qda_memory_manager.c | 152 +++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_memory_manager.h | 101 ++++++++++++++++++++++
6 files changed, 335 insertions(+)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 4aded20b6bc2..7e96ddc40a24 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -9,5 +9,6 @@ qda-y := \
qda_drv.o \
qda_rpmsg.o \
qda_cb.o \
+ qda_memory_manager.o \
obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
index 77a2d8cae076..e7b9aaeba9af 100644
--- a/drivers/accel/qda/qda_cb.c
+++ b/drivers/accel/qda/qda_cb.c
@@ -7,6 +7,7 @@
#include <linux/iommu.h>
#include <linux/slab.h>
#include "qda_drv.h"
+#include "qda_memory_manager.h"
#include "qda_cb.h"
static void qda_cb_dev_release(struct device *dev)
@@ -33,11 +34,16 @@ static int qda_configure_cb_iommu(struct device *cb_dev, struct device_node *cb_
static int qda_cb_setup_device(struct qda_dev *qdev, struct device *cb_dev)
{
+ struct qda_iommu_device *iommu_dev;
int rc;
u32 sid, pa_bits = 32;
qda_dbg(qdev, "Setting up CB device %s\n", dev_name(cb_dev));
+ iommu_dev = kzalloc_obj(*iommu_dev, GFP_KERNEL);
+ if (!iommu_dev)
+ return -ENOMEM;
+
if (of_property_read_u32(cb_dev->of_node, "reg", &sid)) {
qda_dbg(qdev, "No 'reg' property found, defaulting SID to 0\n");
sid = 0;
@@ -46,6 +52,18 @@ static int qda_cb_setup_device(struct qda_dev *qdev, struct device *cb_dev)
rc = dma_set_mask(cb_dev, DMA_BIT_MASK(pa_bits));
if (rc) {
qda_err(qdev, "%d bit DMA enable failed: %d\n", pa_bits, rc);
+ kfree(iommu_dev);
+ return rc;
+ }
+
+ iommu_dev->dev = cb_dev;
+ iommu_dev->sid = sid;
+ snprintf(iommu_dev->name, sizeof(iommu_dev->name), "qda_iommu_dev_%u", sid);
+
+ rc = qda_memory_manager_register_device(qdev->iommu_mgr, iommu_dev);
+ if (rc) {
+ qda_err(qdev, "Failed to register IOMMU device: %d\n", rc);
+ kfree(iommu_dev);
return rc;
}
@@ -127,6 +145,8 @@ int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node)
void qda_destroy_cb_device(struct device *cb_dev)
{
struct iommu_group *group;
+ struct qda_iommu_device *iommu_dev;
+ struct qda_dev *qdev;
if (!cb_dev) {
qda_dbg(NULL, "NULL CB device passed to destroy\n");
@@ -135,6 +155,18 @@ void qda_destroy_cb_device(struct device *cb_dev)
qda_dbg(NULL, "Destroying CB device %s\n", dev_name(cb_dev));
+ iommu_dev = dev_get_drvdata(cb_dev);
+ if (iommu_dev) {
+ if (cb_dev->parent) {
+ qdev = dev_get_drvdata(cb_dev->parent);
+ if (qdev && qdev->iommu_mgr) {
+ qda_dbg(NULL, "Unregistering IOMMU device for %s\n",
+ dev_name(cb_dev));
+ qda_memory_manager_unregister_device(qdev->iommu_mgr, iommu_dev);
+ }
+ }
+ }
+
group = iommu_group_get(cb_dev);
if (group) {
qda_dbg(NULL, "Removing %s from IOMMU group\n", dev_name(cb_dev));
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 389c66a9ad4f..69132737f964 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -3,9 +3,20 @@
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/atomic.h>
+#include <linux/slab.h>
#include "qda_drv.h"
#include "qda_rpmsg.h"
+static void cleanup_iommu_manager(struct qda_dev *qdev)
+{
+ if (qdev->iommu_mgr) {
+ qda_dbg(qdev, "Cleaning up IOMMU manager\n");
+ qda_memory_manager_exit(qdev->iommu_mgr);
+ kfree(qdev->iommu_mgr);
+ qdev->iommu_mgr = NULL;
+ }
+}
+
static void cleanup_device_resources(struct qda_dev *qdev)
{
mutex_destroy(&qdev->lock);
@@ -13,6 +24,7 @@ static void cleanup_device_resources(struct qda_dev *qdev)
void qda_deinit_device(struct qda_dev *qdev)
{
+ cleanup_iommu_manager(qdev);
cleanup_device_resources(qdev);
}
@@ -25,12 +37,46 @@ static void init_device_resources(struct qda_dev *qdev)
atomic_set(&qdev->removing, 0);
}
+static int init_memory_manager(struct qda_dev *qdev)
+{
+ int ret;
+
+ qda_dbg(qdev, "Initializing IOMMU manager\n");
+
+ qdev->iommu_mgr = kzalloc_obj(*qdev->iommu_mgr, GFP_KERNEL);
+ if (!qdev->iommu_mgr)
+ return -ENOMEM;
+
+ ret = qda_memory_manager_init(qdev->iommu_mgr);
+ if (ret) {
+ qda_err(qdev, "Failed to initialize memory manager: %d\n", ret);
+ kfree(qdev->iommu_mgr);
+ qdev->iommu_mgr = NULL;
+ return ret;
+ }
+
+ qda_dbg(qdev, "IOMMU manager initialized successfully\n");
+ return 0;
+}
+
int qda_init_device(struct qda_dev *qdev)
{
+ int ret;
+
init_device_resources(qdev);
+ ret = init_memory_manager(qdev);
+ if (ret) {
+ qda_err(qdev, "IOMMU manager initialization failed: %d\n", ret);
+ goto err_cleanup_resources;
+ }
+
qda_dbg(qdev, "QDA device initialized successfully\n");
return 0;
+
+err_cleanup_resources:
+ cleanup_device_resources(qdev);
+ return ret;
}
static int __init qda_core_init(void)
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index eb732b7d8091..2cb97e4eafbf 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -11,6 +11,7 @@
#include <linux/mutex.h>
#include <linux/rpmsg.h>
#include <linux/xarray.h>
+#include "qda_memory_manager.h"
/* Driver identification */
#define DRIVER_NAME "qda"
@@ -23,6 +24,8 @@ struct qda_dev {
struct device *dev;
/* Mutex protecting device state */
struct mutex lock;
+ /* IOMMU/memory manager */
+ struct qda_memory_manager *iommu_mgr;
/* Flag indicating device removal in progress */
atomic_t removing;
/* Name of the DSP (e.g., "cdsp", "adsp") */
diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
new file mode 100644
index 000000000000..b4c7047a89d4
--- /dev/null
+++ b/drivers/accel/qda/qda_memory_manager.c
@@ -0,0 +1,152 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+
+#include <linux/refcount.h>
+#include <linux/slab.h>
+#include <linux/spinlock.h>
+#include <linux/workqueue.h>
+#include <linux/xarray.h>
+#include "qda_drv.h"
+#include "qda_memory_manager.h"
+
+static void cleanup_all_memory_devices(struct qda_memory_manager *mem_mgr)
+{
+ unsigned long index;
+ void *entry;
+
+ qda_dbg(NULL, "Starting cleanup of all memory devices\n");
+
+ xa_for_each(&mem_mgr->device_xa, index, entry) {
+ struct qda_iommu_device *iommu_dev = entry;
+
+ qda_dbg(NULL, "Cleaning up device id=%lu\n", index);
+
+ xa_erase(&mem_mgr->device_xa, index);
+ kfree(iommu_dev);
+ }
+
+ qda_dbg(NULL, "Completed cleanup of all memory devices\n");
+}
+
+static void qda_memory_manager_remove_work(struct work_struct *work)
+{
+ struct qda_iommu_device *iommu_dev =
+ container_of(work, struct qda_iommu_device, remove_work);
+ struct qda_memory_manager *mem_mgr = iommu_dev->manager;
+
+ qda_dbg(NULL, "Remove work started for device id=%u\n", iommu_dev->id);
+
+ if (!mem_mgr) {
+ qda_dbg(NULL, "No manager for device id=%u\n", iommu_dev->id);
+ kfree(iommu_dev);
+ return;
+ }
+
+ xa_erase(&mem_mgr->device_xa, iommu_dev->id);
+
+ qda_dbg(NULL, "Device id=%u removed successfully\n", iommu_dev->id);
+ kfree(iommu_dev);
+}
+
+static void init_iommu_device_fields(struct qda_iommu_device *iommu_dev,
+ struct qda_memory_manager *mem_mgr)
+{
+ iommu_dev->manager = mem_mgr;
+ spin_lock_init(&iommu_dev->lock);
+ refcount_set(&iommu_dev->refcount, 0);
+ INIT_WORK(&iommu_dev->remove_work, qda_memory_manager_remove_work);
+}
+
+static int allocate_device_id(struct qda_memory_manager *mem_mgr,
+ struct qda_iommu_device *iommu_dev, u32 *id)
+{
+ int ret;
+
+ ret = xa_alloc(&mem_mgr->device_xa, id, iommu_dev,
+ xa_limit_31b, GFP_KERNEL);
+ if (ret) {
+ qda_dbg(NULL, "xa_alloc failed, using atomic counter\n");
+ *id = atomic_inc_return(&mem_mgr->next_id);
+ ret = xa_insert(&mem_mgr->device_xa, *id, iommu_dev, GFP_KERNEL);
+ if (ret) {
+ qda_err(NULL, "Failed to insert device with id=%u: %d\n", *id, ret);
+ return ret;
+ }
+ }
+
+ qda_dbg(NULL, "Allocated device id=%u\n", *id);
+ return ret;
+}
+
+int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
+ struct qda_iommu_device *iommu_dev)
+{
+ int ret;
+ u32 id;
+
+ if (!mem_mgr || !iommu_dev || !iommu_dev->dev) {
+ qda_err(NULL, "Invalid parameters for device registration\n");
+ return -EINVAL;
+ }
+
+ init_iommu_device_fields(iommu_dev, mem_mgr);
+
+ ret = allocate_device_id(mem_mgr, iommu_dev, &id);
+ if (ret) {
+ qda_err(NULL, "Failed to allocate device ID: %d (sid=%u)\n", ret, iommu_dev->sid);
+ return ret;
+ }
+
+ iommu_dev->id = id;
+
+ qda_dbg(NULL, "Registered device id=%u (sid=%u)\n", id, iommu_dev->sid);
+
+ return 0;
+}
+
+void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
+ struct qda_iommu_device *iommu_dev)
+{
+ if (!mem_mgr || !iommu_dev) {
+ qda_err(NULL, "Attempted to unregister invalid device/manager\n");
+ return;
+ }
+
+ qda_dbg(NULL, "Unregistering device id=%u (refcount=%u)\n", iommu_dev->id,
+ refcount_read(&iommu_dev->refcount));
+
+ if (refcount_read(&iommu_dev->refcount) == 0) {
+ xa_erase(&mem_mgr->device_xa, iommu_dev->id);
+ kfree(iommu_dev);
+ return;
+ }
+
+ if (refcount_dec_and_test(&iommu_dev->refcount)) {
+ qda_info(NULL, "Device id=%u refcount reached zero, queuing removal\n",
+ iommu_dev->id);
+ queue_work(mem_mgr->wq, &iommu_dev->remove_work);
+ }
+}
+
+int qda_memory_manager_init(struct qda_memory_manager *mem_mgr)
+{
+ qda_dbg(NULL, "Initializing memory manager\n");
+
+ xa_init_flags(&mem_mgr->device_xa, XA_FLAGS_ALLOC);
+ atomic_set(&mem_mgr->next_id, 0);
+ mem_mgr->wq = create_workqueue("memory_manager_wq");
+ if (!mem_mgr->wq) {
+ qda_err(NULL, "Failed to create memory manager workqueue\n");
+ return -ENOMEM;
+ }
+
+ qda_dbg(NULL, "QDA: Memory manager initialized successfully\n");
+ return 0;
+}
+
+void qda_memory_manager_exit(struct qda_memory_manager *mem_mgr)
+{
+ cleanup_all_memory_devices(mem_mgr);
+ destroy_workqueue(mem_mgr->wq);
+ qda_dbg(NULL, "QDA: Memory manager exited\n");
+}
diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h
new file mode 100644
index 000000000000..3bf4cd529909
--- /dev/null
+++ b/drivers/accel/qda/qda_memory_manager.h
@@ -0,0 +1,101 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef _QDA_MEMORY_MANAGER_H
+#define _QDA_MEMORY_MANAGER_H
+
+#include <linux/device.h>
+#include <linux/refcount.h>
+#include <linux/spinlock.h>
+#include <linux/workqueue.h>
+#include <linux/xarray.h>
+
+/**
+ * struct qda_iommu_device - IOMMU device instance for memory management
+ *
+ * This structure represents a single IOMMU-enabled device managed by the
+ * memory manager. Each device can be assigned to a specific process.
+ */
+struct qda_iommu_device {
+ /* Unique identifier for this IOMMU device */
+ u32 id;
+ /* Pointer to the underlying device */
+ struct device *dev;
+ /* Name for the device */
+ char name[32];
+ /* Spinlock protecting concurrent access to device */
+ spinlock_t lock;
+ /* Reference counter for device */
+ refcount_t refcount;
+ /* Work structure for deferred device removal */
+ struct work_struct remove_work;
+ /* Stream ID for IOMMU transactions */
+ u32 sid;
+ /* Pointer to parent memory manager */
+ struct qda_memory_manager *manager;
+};
+
+/**
+ * struct qda_memory_manager - Central memory management coordinator
+ *
+ * This is the top-level structure coordinating memory management across
+ * multiple IOMMU devices. It maintains a registry of devices and backends,
+ * and ensures thread-safe access to shared resources.
+ */
+struct qda_memory_manager {
+ /* XArray storing all registered IOMMU devices */
+ struct xarray device_xa;
+ /* Atomic counter for generating unique device IDs */
+ atomic_t next_id;
+ /* Workqueue for asynchronous device operations */
+ struct workqueue_struct *wq;
+};
+
+/**
+ * qda_memory_manager_init() - Initialize the memory manager
+ * @mem_mgr: Pointer to memory manager structure to initialize
+ *
+ * Initializes the memory manager's internal data structures including
+ * the device registry, workqueue, and synchronization primitives.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_memory_manager_init(struct qda_memory_manager *mem_mgr);
+
+/**
+ * qda_memory_manager_exit() - Clean up the memory manager
+ * @mem_mgr: Pointer to memory manager structure to clean up
+ *
+ * Releases all resources associated with the memory manager, including
+ * unregistering all devices and destroying the workqueue.
+ */
+void qda_memory_manager_exit(struct qda_memory_manager *mem_mgr);
+
+/**
+ * qda_memory_manager_register_device() - Register an IOMMU device
+ * @mem_mgr: Pointer to memory manager
+ * @iommu_dev: Pointer to IOMMU device to register
+ *
+ * Adds a new IOMMU device to the memory manager's registry and initializes
+ * its memory backend. The device becomes available for memory allocation
+ * operations.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
+ struct qda_iommu_device *iommu_dev);
+
+/**
+ * qda_memory_manager_unregister_device() - Unregister an IOMMU device
+ * @mem_mgr: Pointer to memory manager
+ * @iommu_dev: Pointer to IOMMU device to unregister
+ *
+ * Removes an IOMMU device from the memory manager's registry and cleans up
+ * its associated resources. Any remaining memory allocations are freed.
+ */
+void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
+ struct qda_iommu_device *iommu_dev);
+
+#endif /* _QDA_MEMORY_MANAGER_H */
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 07/18] accel/qda: Add DRM accel device registration for QDA driver
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (5 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 22:16 ` Dmitry Baryshkov
2026-02-23 19:09 ` [PATCH RFC 08/18] accel/qda: Add per-file DRM context and open/close handling Ekansh Gupta
` (16 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Add DRM accel integration for the QDA DSP accelerator driver. A new
qda_drm_priv structure is introduced to hold per-device DRM state,
including a pointer to the memory manager and the parent qda_dev
instance. The driver now allocates a drm_device, initializes
driver-private state, and registers the device via the DRM accel
infrastructure.
qda_register_device() performs allocation and registration of the DRM
device, while qda_unregister_device() handles device teardown and
releases references using drm_dev_unregister() and drm_dev_put().
Initialization and teardown paths are updated so DRM resources are
allocated after IOMMU/memory-manager setup and cleaned during RPMsg
remove.
This patch lays the foundation for adding GEM buffer support and IOCTL
handling in later patches as part of the compute accelerator interface.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/qda_drv.c | 103 ++++++++++++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_drv.h | 33 +++++++++++++-
drivers/accel/qda/qda_rpmsg.c | 8 ++++
3 files changed, 142 insertions(+), 2 deletions(-)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 69132737f964..a9113ec78fa2 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -4,9 +4,31 @@
#include <linux/kernel.h>
#include <linux/atomic.h>
#include <linux/slab.h>
+#include <drm/drm_accel.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_file.h>
+#include <drm/drm_gem.h>
+#include <drm/drm_ioctl.h>
#include "qda_drv.h"
#include "qda_rpmsg.h"
+DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
+
+static struct drm_driver qda_drm_driver = {
+ .driver_features = DRIVER_COMPUTE_ACCEL,
+ .fops = &qda_accel_fops,
+ .name = DRIVER_NAME,
+ .desc = "Qualcomm DSP Accelerator Driver",
+};
+
+static void cleanup_drm_private(struct qda_dev *qdev)
+{
+ if (qdev->drm_priv) {
+ qda_dbg(qdev, "Cleaning up DRM private data\n");
+ kfree(qdev->drm_priv);
+ }
+}
+
static void cleanup_iommu_manager(struct qda_dev *qdev)
{
if (qdev->iommu_mgr) {
@@ -24,6 +46,7 @@ static void cleanup_device_resources(struct qda_dev *qdev)
void qda_deinit_device(struct qda_dev *qdev)
{
+ cleanup_drm_private(qdev);
cleanup_iommu_manager(qdev);
cleanup_device_resources(qdev);
}
@@ -59,6 +82,18 @@ static int init_memory_manager(struct qda_dev *qdev)
return 0;
}
+static int init_drm_private(struct qda_dev *qdev)
+{
+ qda_dbg(qdev, "Initializing DRM private data\n");
+
+ qdev->drm_priv = kzalloc_obj(*qdev->drm_priv, GFP_KERNEL);
+ if (!qdev->drm_priv)
+ return -ENOMEM;
+
+ qda_dbg(qdev, "DRM private data initialized successfully\n");
+ return 0;
+}
+
int qda_init_device(struct qda_dev *qdev)
{
int ret;
@@ -71,14 +106,82 @@ int qda_init_device(struct qda_dev *qdev)
goto err_cleanup_resources;
}
+ ret = init_drm_private(qdev);
+ if (ret) {
+ qda_err(qdev, "DRM private data initialization failed: %d\n", ret);
+ goto err_cleanup_iommu;
+ }
+
qda_dbg(qdev, "QDA device initialized successfully\n");
return 0;
+err_cleanup_iommu:
+ cleanup_iommu_manager(qdev);
err_cleanup_resources:
cleanup_device_resources(qdev);
return ret;
}
+static int setup_and_register_drm_device(struct qda_dev *qdev)
+{
+ struct drm_device *ddev;
+ int ret;
+
+ qda_dbg(qdev, "Setting up and registering DRM device\n");
+
+ ddev = drm_dev_alloc(&qda_drm_driver, qdev->dev);
+ if (IS_ERR(ddev)) {
+ ret = PTR_ERR(ddev);
+ qda_err(qdev, "Failed to allocate DRM device: %d\n", ret);
+ return ret;
+ }
+
+ qdev->drm_priv->drm_dev = ddev;
+ qdev->drm_priv->iommu_mgr = qdev->iommu_mgr;
+ qdev->drm_priv->qdev = qdev;
+
+ ddev->dev_private = qdev->drm_priv;
+ qdev->drm_dev = ddev;
+
+ ret = drm_dev_register(ddev, 0);
+ if (ret) {
+ qda_err(qdev, "Failed to register DRM device: %d\n", ret);
+ drm_dev_put(ddev);
+ return ret;
+ }
+
+ qda_dbg(qdev, "DRM device registered successfully\n");
+ return 0;
+}
+
+int qda_register_device(struct qda_dev *qdev)
+{
+ int ret;
+
+ ret = setup_and_register_drm_device(qdev);
+ if (ret) {
+ qda_err(qdev, "DRM device setup failed: %d\n", ret);
+ return ret;
+ }
+
+ qda_dbg(qdev, "QDA device registered successfully\n");
+ return 0;
+}
+
+void qda_unregister_device(struct qda_dev *qdev)
+{
+ qda_info(qdev, "Unregistering QDA device\n");
+
+ if (qdev->drm_dev) {
+ qda_dbg(qdev, "Unregistering DRM device\n");
+ drm_dev_unregister(qdev->drm_dev);
+ drm_dev_put(qdev->drm_dev);
+ qdev->drm_dev = NULL;
+ }
+
+ qda_dbg(qdev, "QDA device unregistered successfully\n");
+}
+
static int __init qda_core_init(void)
{
int ret;
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index 2cb97e4eafbf..2b80401a3741 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -11,13 +11,35 @@
#include <linux/mutex.h>
#include <linux/rpmsg.h>
#include <linux/xarray.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_file.h>
+#include <drm/drm_device.h>
+#include <drm/drm_accel.h>
#include "qda_memory_manager.h"
/* Driver identification */
#define DRIVER_NAME "qda"
+/**
+ * struct qda_drm_priv - DRM device private data for QDA device
+ *
+ * This structure serves as the DRM device private data (stored in dev_private),
+ * bridging the DRM device context with the QDA device and providing access to
+ * shared resources like the memory manager during buffer operations.
+ */
+struct qda_drm_priv {
+ /* DRM device structure */
+ struct drm_device *drm_dev;
+ /* Global memory/IOMMU manager */
+ struct qda_memory_manager *iommu_mgr;
+ /* Back-pointer to qda_dev */
+ struct qda_dev *qdev;
+};
+
/* struct qda_dev - Main device structure for QDA driver */
struct qda_dev {
+ /* DRM device for accelerator interface */
+ struct drm_device *drm_dev;
/* RPMsg device for communication with remote processor */
struct rpmsg_device *rpdev;
/* Underlying device structure */
@@ -26,6 +48,8 @@ struct qda_dev {
struct mutex lock;
/* IOMMU/memory manager */
struct qda_memory_manager *iommu_mgr;
+ /* DRM device private data */
+ struct qda_drm_priv *drm_priv;
/* Flag indicating device removal in progress */
atomic_t removing;
/* Name of the DSP (e.g., "cdsp", "adsp") */
@@ -39,8 +63,8 @@ struct qda_dev {
* @qdev: QDA device structure
*
* Returns the most appropriate device structure for logging messages.
- * Prefers qdev->dev, or returns NULL if the device is being removed
- * or invalid.
+ * Prefers qdev->dev, falls back to qdev->drm_dev->dev, or returns NULL
+ * if the device is being removed or invalid.
*/
static inline struct device *qda_get_log_device(struct qda_dev *qdev)
{
@@ -50,6 +74,9 @@ static inline struct device *qda_get_log_device(struct qda_dev *qdev)
if (qdev->dev)
return qdev->dev;
+ if (qdev->drm_dev)
+ return qdev->drm_dev->dev;
+
return NULL;
}
@@ -93,5 +120,7 @@ static inline struct device *qda_get_log_device(struct qda_dev *qdev)
*/
int qda_init_device(struct qda_dev *qdev);
void qda_deinit_device(struct qda_dev *qdev);
+int qda_register_device(struct qda_dev *qdev);
+void qda_unregister_device(struct qda_dev *qdev);
#endif /* __QDA_DRV_H__ */
diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
index 5a57384de6a2..b2b44b4d3ca8 100644
--- a/drivers/accel/qda/qda_rpmsg.c
+++ b/drivers/accel/qda/qda_rpmsg.c
@@ -80,6 +80,7 @@ static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
qdev->rpdev = NULL;
mutex_unlock(&qdev->lock);
+ qda_unregister_device(qdev);
qda_unpopulate_child_devices(qdev);
qda_deinit_device(qdev);
@@ -123,6 +124,13 @@ static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
return ret;
}
+ ret = qda_register_device(qdev);
+ if (ret) {
+ qda_deinit_device(qdev);
+ qda_unpopulate_child_devices(qdev);
+ return ret;
+ }
+
qda_info(qdev, "QDA RPMsg probe completed successfully for %s\n", qdev->dsp_name);
return 0;
}
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 08/18] accel/qda: Add per-file DRM context and open/close handling
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (6 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 07/18] accel/qda: Add DRM accel device registration for QDA driver Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 22:20 ` Dmitry Baryshkov
2026-02-23 19:09 ` [PATCH RFC 09/18] accel/qda: Add QUERY IOCTL and basic QDA UAPI header Ekansh Gupta
` (15 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Introduce per-file and per-user context for the QDA DRM accelerator
driver. A new qda_file_priv structure is stored in file->driver_priv
for each open file descriptor, and a qda_user object is allocated per
client with a unique client_id generated from an atomic counter in
qda_dev.
The DRM driver now provides qda_open() and qda_postclose() callbacks.
qda_open() resolves the qda_dev from the drm_device, allocates the
qda_file_priv and qda_user structures, and attaches them to the DRM
file. qda_postclose() tears down the per-file context and frees the
qda_user object when the file is closed.
This prepares the QDA driver to track per-process state for future
features such as per-client memory mappings, job submission contexts,
and access control over DSP compute resources.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/qda_drv.c | 117 ++++++++++++++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_drv.h | 30 ++++++++++++
2 files changed, 147 insertions(+)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index a9113ec78fa2..bf95fc782cf8 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -12,11 +12,127 @@
#include "qda_drv.h"
#include "qda_rpmsg.h"
+static struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
+{
+ if (!dev)
+ return NULL;
+
+ return (struct qda_drm_priv *)dev->dev_private;
+}
+
+static struct qda_dev *get_qdev_from_drm_device(struct drm_device *dev)
+{
+ struct qda_drm_priv *drm_priv;
+
+ if (!dev) {
+ qda_dbg(NULL, "Invalid drm_device\n");
+ return NULL;
+ }
+
+ drm_priv = get_drm_priv_from_device(dev);
+ if (!drm_priv) {
+ qda_dbg(NULL, "No drm_priv in dev_private\n");
+ return NULL;
+ }
+
+ return drm_priv->qdev;
+}
+
+static struct qda_user *alloc_qda_user(struct qda_dev *qdev)
+{
+ struct qda_user *qda_user;
+
+ qda_user = kzalloc_obj(*qda_user, GFP_KERNEL);
+ if (!qda_user)
+ return NULL;
+
+ qda_user->client_id = atomic_inc_return(&qdev->client_id_counter);
+ qda_user->qda_dev = qdev;
+
+ qda_dbg(qdev, "Allocated qda_user with client_id=%u\n", qda_user->client_id);
+ return qda_user;
+}
+
+static void free_qda_user(struct qda_user *qda_user)
+{
+ if (!qda_user)
+ return;
+
+ qda_dbg(qda_user->qda_dev, "Freeing qda_user client_id=%u\n", qda_user->client_id);
+
+ kfree(qda_user);
+}
+
+static int qda_open(struct drm_device *dev, struct drm_file *file)
+{
+ struct qda_user *qda_user;
+ struct qda_file_priv *qda_file_priv;
+ struct qda_dev *qdev;
+
+ if (!file) {
+ qda_dbg(NULL, "Invalid file pointer\n");
+ return -EINVAL;
+ }
+
+ qdev = get_qdev_from_drm_device(dev);
+ if (!qdev) {
+ qda_dbg(NULL, "Failed to get qdev from drm_device\n");
+ return -EINVAL;
+ }
+
+ qda_file_priv = kzalloc(sizeof(*qda_file_priv), GFP_KERNEL);
+ if (!qda_file_priv)
+ return -ENOMEM;
+
+ qda_file_priv->pid = current->pid;
+
+ qda_user = alloc_qda_user(qdev);
+ if (!qda_user) {
+ qda_dbg(qdev, "Failed to allocate qda_user\n");
+ kfree(qda_file_priv);
+ return -ENOMEM;
+ }
+
+ file->driver_priv = qda_file_priv;
+ qda_file_priv->qda_user = qda_user;
+
+ qda_dbg(qdev, "Device opened successfully for PID %d\n", current->pid);
+
+ return 0;
+}
+
+static void qda_postclose(struct drm_device *dev, struct drm_file *file)
+{
+ struct qda_dev *qdev;
+ struct qda_file_priv *qda_file_priv;
+ struct qda_user *qda_user;
+
+ qdev = get_qdev_from_drm_device(dev);
+ if (!qdev || atomic_read(&qdev->removing)) {
+ qda_dbg(NULL, "Device unavailable or removing\n");
+ return;
+ }
+
+ qda_file_priv = (struct qda_file_priv *)file->driver_priv;
+ if (qda_file_priv) {
+ qda_user = qda_file_priv->qda_user;
+ if (qda_user)
+ free_qda_user(qda_user);
+
+ kfree(qda_file_priv);
+ file->driver_priv = NULL;
+ }
+
+ qda_dbg(qdev, "Device closed for PID %d\n", current->pid);
+}
+
DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
static struct drm_driver qda_drm_driver = {
.driver_features = DRIVER_COMPUTE_ACCEL,
.fops = &qda_accel_fops,
+ .open = qda_open,
+ .postclose = qda_postclose,
.name = DRIVER_NAME,
.desc = "Qualcomm DSP Accelerator Driver",
};
@@ -58,6 +174,7 @@ static void init_device_resources(struct qda_dev *qdev)
mutex_init(&qdev->lock);
atomic_set(&qdev->removing, 0);
+ atomic_set(&qdev->client_id_counter, 0);
}
static int init_memory_manager(struct qda_dev *qdev)
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index 2b80401a3741..e0ba37702a86 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -10,6 +10,7 @@
#include <linux/list.h>
#include <linux/mutex.h>
#include <linux/rpmsg.h>
+#include <linux/types.h>
#include <linux/xarray.h>
#include <drm/drm_drv.h>
#include <drm/drm_file.h>
@@ -20,6 +21,33 @@
/* Driver identification */
#define DRIVER_NAME "qda"
+/**
+ * struct qda_file_priv - Per-process private data for DRM file
+ *
+ * This structure tracks per-process state for each open file descriptor.
+ * It maintains the IOMMU device assignment and links to the legacy qda_user
+ * structure for compatibility with existing code.
+ */
+struct qda_file_priv {
+ /* Process ID for tracking */
+ pid_t pid;
+ /* Pointer to qda_user structure for backward compatibility */
+ struct qda_user *qda_user;
+};
+
+/**
+ * struct qda_user - Per-user context for remote processor interaction
+ *
+ * This structure maintains per-user state for interactions with the
+ * remote processor, including memory mappings and pending operations.
+ */
+struct qda_user {
+ /* Unique client identifier */
+ u32 client_id;
+ /* Back-pointer to device structure */
+ struct qda_dev *qda_dev;
+};
+
/**
* struct qda_drm_priv - DRM device private data for QDA device
*
@@ -52,6 +80,8 @@ struct qda_dev {
struct qda_drm_priv *drm_priv;
/* Flag indicating device removal in progress */
atomic_t removing;
+ /* Atomic counter for generating unique client IDs */
+ atomic_t client_id_counter;
/* Name of the DSP (e.g., "cdsp", "adsp") */
char dsp_name[16];
/* Compute context-bank (CB) child devices */
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 09/18] accel/qda: Add QUERY IOCTL and basic QDA UAPI header
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (7 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 08/18] accel/qda: Add per-file DRM context and open/close handling Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 22:24 ` Dmitry Baryshkov
2026-02-23 19:09 ` [PATCH RFC 10/18] accel/qda: Add DMA-backed GEM objects and memory manager integration Ekansh Gupta
` (14 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Introduce a basic UAPI for the QDA accelerator driver along with a
DRM IOCTL handler to query DSP device identity. A new UAPI header
include/uapi/drm/qda_accel.h defines DRM_QDA_QUERY, the corresponding
DRM_IOCTL_QDA_QUERY command, and struct drm_qda_query, which contains
a DSP name string.
On the kernel side, qda_ioctl_query() validates the per-file context,
resolves the qda_dev instance from dev->dev_private, and copies the
DSP name from qdev->dsp_name into the query structure. The new
qda_ioctls[] table wires this IOCTL into the QDA DRM driver so
userspace can call it through the standard DRM command interface.
This IOCTL provides a simple and stable way for userspace to discover
which DSP a given QDA device node represents and serves as the first
building block for a richer QDA UAPI in subsequent patches.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/Makefile | 1 +
drivers/accel/qda/qda_drv.c | 9 +++++++++
drivers/accel/qda/qda_ioctl.c | 45 +++++++++++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_ioctl.h | 26 ++++++++++++++++++++++++
include/uapi/drm/qda_accel.h | 47 +++++++++++++++++++++++++++++++++++++++++++
5 files changed, 128 insertions(+)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 7e96ddc40a24..f547398e1a72 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -10,5 +10,6 @@ qda-y := \
qda_rpmsg.o \
qda_cb.o \
qda_memory_manager.o \
+ qda_ioctl.o \
obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index bf95fc782cf8..86758a9cd982 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -9,7 +9,10 @@
#include <drm/drm_file.h>
#include <drm/drm_gem.h>
#include <drm/drm_ioctl.h>
+#include <drm/qda_accel.h>
+
#include "qda_drv.h"
+#include "qda_ioctl.h"
#include "qda_rpmsg.h"
static struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
@@ -128,11 +131,17 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
+static const struct drm_ioctl_desc qda_ioctls[] = {
+ DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
+};
+
static struct drm_driver qda_drm_driver = {
.driver_features = DRIVER_COMPUTE_ACCEL,
.fops = &qda_accel_fops,
.open = qda_open,
.postclose = qda_postclose,
+ .ioctls = qda_ioctls,
+ .num_ioctls = ARRAY_SIZE(qda_ioctls),
.name = DRIVER_NAME,
.desc = "Qualcomm DSP Accelerator Driver",
};
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
new file mode 100644
index 000000000000..9fa73ec2dfce
--- /dev/null
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -0,0 +1,45 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <drm/drm_ioctl.h>
+#include <drm/drm_gem.h>
+#include <drm/qda_accel.h>
+#include "qda_drv.h"
+#include "qda_ioctl.h"
+
+static int qda_validate_and_get_context(struct drm_device *dev, struct drm_file *file_priv,
+ struct qda_dev **qdev, struct qda_user **qda_user)
+{
+ struct qda_drm_priv *drm_priv = dev->dev_private;
+ struct qda_file_priv *qda_file_priv;
+
+ if (!drm_priv)
+ return -EINVAL;
+
+ *qdev = drm_priv->qdev;
+ if (!*qdev)
+ return -EINVAL;
+
+ qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv;
+ if (!qda_file_priv || !qda_file_priv->qda_user)
+ return -EINVAL;
+
+ *qda_user = qda_file_priv->qda_user;
+
+ return 0;
+}
+
+int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+ struct qda_dev *qdev;
+ struct qda_user *qda_user;
+ struct drm_qda_query *args = data;
+ int ret;
+
+ ret = qda_validate_and_get_context(dev, file_priv, &qdev, &qda_user);
+ if (ret)
+ return ret;
+
+ strscpy(args->dsp_name, qdev->dsp_name, sizeof(args->dsp_name));
+
+ return 0;
+}
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
new file mode 100644
index 000000000000..6bf3bcd28c0e
--- /dev/null
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef _QDA_IOCTL_H
+#define _QDA_IOCTL_H
+
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <drm/drm_ioctl.h>
+#include "qda_drv.h"
+
+/**
+ * qda_ioctl_query - Query DSP device information and capabilities
+ * @dev: DRM device structure
+ * @data: User-space data containing query parameters and results
+ * @file_priv: DRM file private data
+ *
+ * This IOCTL handler queries information about the DSP device.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
+
+#endif /* _QDA_IOCTL_H */
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
new file mode 100644
index 000000000000..0aad791c4832
--- /dev/null
+++ b/include/uapi/drm/qda_accel.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_ACCEL_H__
+#define __QDA_ACCEL_H__
+
+#include "drm.h"
+
+#if defined(__cplusplus)
+extern "C" {
+#endif
+
+/*
+ * QDA IOCTL command numbers
+ *
+ * These define the command numbers for QDA-specific IOCTLs.
+ * They are used with DRM_COMMAND_BASE to create the full IOCTL numbers.
+ */
+#define DRM_QDA_QUERY 0x00
+/*
+ * QDA IOCTL definitions
+ *
+ * These macros define the actual IOCTL numbers used by userspace applications.
+ * They combine the command numbers with DRM_COMMAND_BASE and specify the
+ * data structure and direction (read/write) for each IOCTL.
+ */
+#define DRM_IOCTL_QDA_QUERY DRM_IOR(DRM_COMMAND_BASE + DRM_QDA_QUERY, struct drm_qda_query)
+
+/**
+ * struct drm_qda_query - Device information query structure
+ * @dsp_name: Name of DSP (e.g., "adsp", "cdsp", "cdsp1", "gdsp0", "gdsp1")
+ *
+ * This structure is used with DRM_IOCTL_QDA_QUERY to query device type,
+ * allowing userspace to identify which DSP a device node represents. The
+ * kernel provides the DSP name directly as a null-terminated string.
+ */
+struct drm_qda_query {
+ __u8 dsp_name[16];
+};
+
+#if defined(__cplusplus)
+}
+#endif
+
+#endif /* __QDA_ACCEL_H__ */
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 10/18] accel/qda: Add DMA-backed GEM objects and memory manager integration
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (8 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 09/18] accel/qda: Add QUERY IOCTL and basic QDA UAPI header Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 22:36 ` Dmitry Baryshkov
2026-02-23 19:09 ` [PATCH RFC 11/18] accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs Ekansh Gupta
` (13 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Introduce DMA-backed GEM buffer objects for the QDA accelerator
driver and integrate them with the existing memory manager and IOMMU
device abstraction.
A new qda_gem_obj structure wraps drm_gem_object and tracks the
kernel virtual address, DMA address, size and owning qda_iommu_device.
qda_gem_create_object() allocates a GEM object, aligns the requested
size, and uses qda_memory_manager_alloc() to obtain DMA-coherent
memory from a per-process IOMMU device. The GEM object implements
a .mmap callback that validates the VMA offset and calls into
qda_dma_mmap(), which maps the DMA memory into userspace and sets
appropriate VMA flags.
The DMA backend is implemented in qda_memory_dma.c, which allocates
and frees coherent memory via dma_alloc_coherent() and
dma_free_coherent(), while storing a SID-prefixed DMA address in
the GEM object for later use by DSP firmware. The memory manager
is extended to maintain a mapping from processes to IOMMU devices
using qda_file_priv and a process_assignment_lock, and provides
qda_memory_manager_alloc() and qda_memory_manager_free() helpers
for GEM allocations.
This patch lays the groundwork for GEM allocation and mmap IOCTLs
as well as future PRIME and job submission support for QDA buffers.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/Makefile | 2 +
drivers/accel/qda/qda_drv.c | 23 +++-
drivers/accel/qda/qda_drv.h | 7 ++
drivers/accel/qda/qda_gem.c | 187 +++++++++++++++++++++++++++++++
drivers/accel/qda/qda_gem.h | 63 +++++++++++
drivers/accel/qda/qda_memory_dma.c | 91 ++++++++++++++++
drivers/accel/qda/qda_memory_dma.h | 46 ++++++++
drivers/accel/qda/qda_memory_manager.c | 194 +++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_memory_manager.h | 33 ++++++
9 files changed, 645 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index f547398e1a72..88c324fa382c 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -11,5 +11,7 @@ qda-y := \
qda_cb.o \
qda_memory_manager.o \
qda_ioctl.o \
+ qda_gem.o \
+ qda_memory_dma.o \
obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 86758a9cd982..19798359b14e 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -15,7 +15,7 @@
#include "qda_ioctl.h"
#include "qda_rpmsg.h"
-static struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
+struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
{
if (!dev)
return NULL;
@@ -88,6 +88,7 @@ static int qda_open(struct drm_device *dev, struct drm_file *file)
return -ENOMEM;
qda_file_priv->pid = current->pid;
+ qda_file_priv->assigned_iommu_dev = NULL; /* Will be assigned on first allocation */
qda_user = alloc_qda_user(qdev);
if (!qda_user) {
@@ -118,6 +119,26 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
qda_file_priv = (struct qda_file_priv *)file->driver_priv;
if (qda_file_priv) {
+ if (qda_file_priv->assigned_iommu_dev) {
+ struct qda_iommu_device *iommu_dev = qda_file_priv->assigned_iommu_dev;
+ unsigned long flags;
+
+ /* Decrement reference count - if it reaches 0, reset PID assignment */
+ if (refcount_dec_and_test(&iommu_dev->refcount)) {
+ /* Last reference released - reset PID assignment */
+ spin_lock_irqsave(&iommu_dev->lock, flags);
+ iommu_dev->assigned_pid = 0;
+ iommu_dev->assigned_file_priv = NULL;
+ spin_unlock_irqrestore(&iommu_dev->lock, flags);
+
+ qda_dbg(qdev, "Reset PID assignment for IOMMU device %u (process %d exited)\n",
+ iommu_dev->id, qda_file_priv->pid);
+ } else {
+ qda_dbg(qdev, "Decremented reference for IOMMU device %u from process %d\n",
+ iommu_dev->id, qda_file_priv->pid);
+ }
+ }
+
qda_user = qda_file_priv->qda_user;
if (qda_user)
free_qda_user(qda_user);
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index e0ba37702a86..8a2cd474958b 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -33,6 +33,8 @@ struct qda_file_priv {
pid_t pid;
/* Pointer to qda_user structure for backward compatibility */
struct qda_user *qda_user;
+ /* IOMMU device assigned to this process */
+ struct qda_iommu_device *assigned_iommu_dev;
};
/**
@@ -153,4 +155,9 @@ void qda_deinit_device(struct qda_dev *qdev);
int qda_register_device(struct qda_dev *qdev);
void qda_unregister_device(struct qda_dev *qdev);
+/*
+ * Utility function to get DRM private data from DRM device
+ */
+struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev);
+
#endif /* __QDA_DRV_H__ */
diff --git a/drivers/accel/qda/qda_gem.c b/drivers/accel/qda/qda_gem.c
new file mode 100644
index 000000000000..bbd54e2502d3
--- /dev/null
+++ b/drivers/accel/qda/qda_gem.c
@@ -0,0 +1,187 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <drm/drm_gem.h>
+#include <drm/drm_prime.h>
+#include <linux/slab.h>
+#include <linux/dma-mapping.h>
+#include "qda_drv.h"
+#include "qda_gem.h"
+#include "qda_memory_manager.h"
+#include "qda_memory_dma.h"
+
+static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
+{
+ if (qda_gem_obj->size == 0) {
+ qda_err(NULL, "Invalid GEM object size\n");
+ return -EINVAL;
+ }
+ if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
+ qda_err(NULL, "Allocated buffer missing IOMMU device\n");
+ return -EINVAL;
+ }
+ if (!qda_gem_obj->iommu_dev->dev) {
+ qda_err(NULL, "Allocated buffer missing IOMMU device\n");
+ return -EINVAL;
+ }
+ if (!qda_gem_obj->virt) {
+ qda_err(NULL, "Allocated buffer missing virtual address\n");
+ return -EINVAL;
+ }
+ if (qda_gem_obj->dma_addr == 0) {
+ qda_err(NULL, "Allocated buffer missing DMA address\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int validate_vma_offset(struct drm_gem_object *drm_obj, struct vm_area_struct *vma)
+{
+ u64 expected_offset = drm_vma_node_offset_addr(&drm_obj->vma_node);
+ u64 actual_offset = vma->vm_pgoff << PAGE_SHIFT;
+
+ if (actual_offset != expected_offset) {
+ qda_err(NULL, "VMA offset mismatch: expected=0x%llx, actual=0x%llx\n",
+ expected_offset, actual_offset);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static void setup_vma_flags(struct vm_area_struct *vma)
+{
+ vm_flags_set(vma, VM_DONTEXPAND);
+ vm_flags_set(vma, VM_DONTDUMP);
+}
+
+void qda_gem_free_object(struct drm_gem_object *gem_obj)
+{
+ struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(gem_obj);
+ struct qda_drm_priv *drm_priv = get_drm_priv_from_device(gem_obj->dev);
+
+ if (qda_gem_obj->virt) {
+ if (drm_priv && drm_priv->iommu_mgr)
+ qda_memory_manager_free(drm_priv->iommu_mgr, qda_gem_obj);
+ }
+
+ drm_gem_object_release(gem_obj);
+ kfree(qda_gem_obj);
+}
+
+int qda_gem_mmap_obj(struct drm_gem_object *drm_obj, struct vm_area_struct *vma)
+{
+ struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(drm_obj);
+ int ret;
+
+ ret = validate_gem_obj_for_mmap(qda_gem_obj);
+ if (ret) {
+ qda_err(NULL, "GEM object validation failed: %d\n", ret);
+ return ret;
+ }
+
+ ret = validate_vma_offset(drm_obj, vma);
+ if (ret) {
+ qda_err(NULL, "VMA offset validation failed: %d\n", ret);
+ return ret;
+ }
+
+ /* Reset vm_pgoff for DMA mmap */
+ vma->vm_pgoff = 0;
+
+ ret = qda_dma_mmap(qda_gem_obj, vma);
+
+ if (ret == 0) {
+ setup_vma_flags(vma);
+ qda_dbg(NULL, "GEM object mapped successfully\n");
+ } else {
+ qda_err(NULL, "GEM object mmap failed: %d\n", ret);
+ }
+
+ return ret;
+}
+
+static const struct drm_gem_object_funcs qda_gem_object_funcs = {
+ .free = qda_gem_free_object,
+ .mmap = qda_gem_mmap_obj,
+};
+
+struct qda_gem_obj *qda_gem_alloc_object(struct drm_device *drm_dev, size_t aligned_size)
+{
+ struct qda_gem_obj *qda_gem_obj;
+ int ret;
+
+ qda_gem_obj = kzalloc_obj(*qda_gem_obj, GFP_KERNEL);
+ if (!qda_gem_obj)
+ return ERR_PTR(-ENOMEM);
+
+ ret = drm_gem_object_init(drm_dev, &qda_gem_obj->base, aligned_size);
+ if (ret) {
+ qda_err(NULL, "Failed to initialize GEM object: %d\n", ret);
+ kfree(qda_gem_obj);
+ return ERR_PTR(ret);
+ }
+
+ qda_gem_obj->base.funcs = &qda_gem_object_funcs;
+ qda_gem_obj->size = aligned_size;
+
+ qda_dbg(NULL, "Allocated GEM object size=%zu\n", aligned_size);
+ return qda_gem_obj;
+}
+
+void qda_gem_cleanup_object(struct qda_gem_obj *qda_gem_obj)
+{
+ drm_gem_object_release(&qda_gem_obj->base);
+ kfree(qda_gem_obj);
+}
+
+struct drm_gem_object *qda_gem_lookup_object(struct drm_file *file_priv, u32 handle)
+{
+ struct drm_gem_object *gem_obj;
+
+ gem_obj = drm_gem_object_lookup(file_priv, handle);
+ if (!gem_obj)
+ return ERR_PTR(-ENOENT);
+
+ return gem_obj;
+}
+
+int qda_gem_create_handle(struct drm_file *file_priv, struct drm_gem_object *gem_obj, u32 *handle)
+{
+ int ret;
+
+ ret = drm_gem_handle_create(file_priv, gem_obj, handle);
+ drm_gem_object_put(gem_obj);
+
+ return ret;
+}
+
+struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
+ struct qda_memory_manager *iommu_mgr, size_t size,
+ struct drm_file *file_priv)
+{
+ struct qda_gem_obj *qda_gem_obj;
+ size_t aligned_size;
+ int ret;
+
+ if (size == 0) {
+ qda_err(NULL, "Invalid size for GEM object creation\n");
+ return ERR_PTR(-EINVAL);
+ }
+
+ aligned_size = PAGE_ALIGN(size);
+
+ qda_gem_obj = qda_gem_alloc_object(drm_dev, aligned_size);
+ if (IS_ERR(qda_gem_obj))
+ return (struct drm_gem_object *)qda_gem_obj;
+
+ ret = qda_memory_manager_alloc(iommu_mgr, qda_gem_obj, file_priv);
+ if (ret) {
+ qda_err(NULL, "Memory manager allocation failed: %d\n", ret);
+ qda_gem_cleanup_object(qda_gem_obj);
+ return ERR_PTR(ret);
+ }
+
+ qda_dbg(NULL, "GEM object created successfully size=%zu\n", aligned_size);
+ return &qda_gem_obj->base;
+}
diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h
new file mode 100644
index 000000000000..caae9cda5363
--- /dev/null
+++ b/drivers/accel/qda/qda_gem.h
@@ -0,0 +1,63 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+#ifndef _QDA_GEM_H
+#define _QDA_GEM_H
+
+#include <linux/xarray.h>
+#include <drm/drm_device.h>
+#include <drm/drm_gem.h>
+#include <linux/dma-mapping.h>
+
+/* Forward declarations */
+struct qda_memory_manager;
+struct qda_iommu_device;
+
+/**
+ * struct qda_gem_obj - QDA GEM buffer object
+ *
+ * This structure represents a GEM buffer object that can be either
+ * allocated by the driver or imported from another driver via dma-buf.
+ */
+struct qda_gem_obj {
+ /* DRM GEM object base structure */
+ struct drm_gem_object base;
+ /* Kernel virtual address of allocated memory */
+ void *virt;
+ /* DMA address for allocated buffers */
+ dma_addr_t dma_addr;
+ /* Size of the buffer in bytes */
+ size_t size;
+ /* IOMMU device that performed the allocation */
+ struct qda_iommu_device *iommu_dev;
+};
+
+/*
+ * Helper macro to cast a drm_gem_object to qda_gem_obj
+ */
+#define to_qda_gem_obj(gem_obj) container_of(gem_obj, struct qda_gem_obj, base)
+
+/*
+ * GEM object lifecycle management
+ */
+struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
+ struct qda_memory_manager *iommu_mgr,
+ size_t size, struct drm_file *file_priv);
+void qda_gem_free_object(struct drm_gem_object *gem_obj);
+int qda_gem_mmap_obj(struct drm_gem_object *gem_obj, struct vm_area_struct *vma);
+
+/*
+ * Helper functions for GEM object allocation and cleanup
+ * These are used internally and by the PRIME import code
+ */
+struct qda_gem_obj *qda_gem_alloc_object(struct drm_device *drm_dev, size_t aligned_size);
+void qda_gem_cleanup_object(struct qda_gem_obj *qda_gem_obj);
+
+/*
+ * Utility functions for GEM operations
+ */
+struct drm_gem_object *qda_gem_lookup_object(struct drm_file *file_priv, u32 handle);
+int qda_gem_create_handle(struct drm_file *file_priv, struct drm_gem_object *gem_obj, u32 *handle);
+
+#endif /* _QDA_GEM_H */
diff --git a/drivers/accel/qda/qda_memory_dma.c b/drivers/accel/qda/qda_memory_dma.c
new file mode 100644
index 000000000000..ffdd5423c88c
--- /dev/null
+++ b/drivers/accel/qda/qda_memory_dma.c
@@ -0,0 +1,91 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <linux/slab.h>
+#include <linux/dma-mapping.h>
+#include "qda_drv.h"
+#include "qda_memory_dma.h"
+
+static dma_addr_t get_actual_dma_addr(struct qda_gem_obj *gem_obj)
+{
+ return gem_obj->dma_addr - ((u64)gem_obj->iommu_dev->sid << 32);
+}
+
+static void setup_gem_object(struct qda_gem_obj *gem_obj, void *virt,
+ dma_addr_t dma_addr, struct qda_iommu_device *iommu_dev)
+{
+ gem_obj->virt = virt;
+ gem_obj->dma_addr = dma_addr;
+ gem_obj->iommu_dev = iommu_dev;
+}
+
+static void cleanup_gem_object_fields(struct qda_gem_obj *gem_obj)
+{
+ gem_obj->virt = NULL;
+ gem_obj->dma_addr = 0;
+ gem_obj->iommu_dev = NULL;
+}
+
+int qda_dma_alloc(struct qda_iommu_device *iommu_dev,
+ struct qda_gem_obj *gem_obj, size_t size)
+{
+ void *virt;
+ dma_addr_t dma_addr;
+
+ if (!iommu_dev || !iommu_dev->dev) {
+ qda_err(NULL, "Invalid iommu_dev or device for DMA allocation\n");
+ return -EINVAL;
+ }
+
+ virt = dma_alloc_coherent(iommu_dev->dev, size, &dma_addr, GFP_KERNEL);
+ if (!virt)
+ return -ENOMEM;
+
+ dma_addr += ((u64)iommu_dev->sid << 32);
+
+ qda_dbg(NULL, "DMA address with SID prefix: 0x%llx (sid=%u)\n",
+ (u64)dma_addr, iommu_dev->sid);
+
+ setup_gem_object(gem_obj, virt, dma_addr, iommu_dev);
+
+ return 0;
+}
+
+void qda_dma_free(struct qda_gem_obj *gem_obj)
+{
+ if (!gem_obj || !gem_obj->iommu_dev) {
+ qda_dbg(NULL, "Invalid gem_obj or iommu_dev for DMA free\n");
+ return;
+ }
+
+ qda_dbg(NULL, "DMA freeing: size=%zu, device_id=%u, dma_addr=0x%llx\n",
+ gem_obj->size, gem_obj->iommu_dev->id, gem_obj->dma_addr);
+
+ dma_free_coherent(gem_obj->iommu_dev->dev, gem_obj->size,
+ gem_obj->virt, get_actual_dma_addr(gem_obj));
+
+ cleanup_gem_object_fields(gem_obj);
+}
+
+int qda_dma_mmap(struct qda_gem_obj *gem_obj, struct vm_area_struct *vma)
+{
+ struct qda_iommu_device *iommu_dev;
+ int ret;
+
+ if (!gem_obj || !gem_obj->virt || !gem_obj->iommu_dev || !gem_obj->iommu_dev->dev) {
+ qda_err(NULL, "Invalid parameters for DMA mmap\n");
+ return -EINVAL;
+ }
+
+ iommu_dev = gem_obj->iommu_dev;
+
+ ret = dma_mmap_coherent(iommu_dev->dev, vma, gem_obj->virt,
+ get_actual_dma_addr(gem_obj), gem_obj->size);
+
+ if (ret)
+ qda_err(NULL, "DMA mmap failed: size=%zu, device_id=%u, ret=%d\n",
+ gem_obj->size, iommu_dev->id, ret);
+ else
+ qda_dbg(NULL, "DMA mmap successful: size=%zu\n", gem_obj->size);
+
+ return ret;
+}
diff --git a/drivers/accel/qda/qda_memory_dma.h b/drivers/accel/qda/qda_memory_dma.h
new file mode 100644
index 000000000000..79b3c4053a82
--- /dev/null
+++ b/drivers/accel/qda/qda_memory_dma.h
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef _QDA_MEMORY_DMA_H
+#define _QDA_MEMORY_DMA_H
+
+#include <linux/dma-mapping.h>
+#include "qda_memory_manager.h"
+
+/**
+ * qda_dma_alloc() - Allocate DMA coherent memory for a GEM object
+ * @iommu_dev: Pointer to the QDA IOMMU device structure
+ * @gem_obj: Pointer to GEM object to allocate memory for
+ * @size: Size of memory to allocate in bytes
+ *
+ * Allocates DMA-coherent memory and sets up the GEM object with the
+ * allocated memory details including virtual and DMA addresses.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_dma_alloc(struct qda_iommu_device *iommu_dev,
+ struct qda_gem_obj *gem_obj, size_t size);
+
+/**
+ * qda_dma_free() - Free DMA coherent memory for a GEM object
+ * @gem_obj: Pointer to GEM object to free memory for
+ *
+ * Frees DMA-coherent memory previously allocated for the GEM object
+ * and cleans up the GEM object fields.
+ */
+void qda_dma_free(struct qda_gem_obj *gem_obj);
+
+/**
+ * qda_dma_mmap() - Map DMA memory into userspace
+ * @gem_obj: Pointer to GEM object containing DMA memory
+ * @vma: Virtual memory area to map into
+ *
+ * Maps DMA-coherent memory into userspace virtual address space.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_dma_mmap(struct qda_gem_obj *gem_obj, struct vm_area_struct *vma);
+
+#endif /* _QDA_MEMORY_DMA_H */
diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
index b4c7047a89d4..e225667557ee 100644
--- a/drivers/accel/qda/qda_memory_manager.c
+++ b/drivers/accel/qda/qda_memory_manager.c
@@ -6,8 +6,11 @@
#include <linux/spinlock.h>
#include <linux/workqueue.h>
#include <linux/xarray.h>
+#include <drm/drm_file.h>
#include "qda_drv.h"
+#include "qda_gem.h"
#include "qda_memory_manager.h"
+#include "qda_memory_dma.h"
static void cleanup_all_memory_devices(struct qda_memory_manager *mem_mgr)
{
@@ -55,6 +58,8 @@ static void init_iommu_device_fields(struct qda_iommu_device *iommu_dev,
spin_lock_init(&iommu_dev->lock);
refcount_set(&iommu_dev->refcount, 0);
INIT_WORK(&iommu_dev->remove_work, qda_memory_manager_remove_work);
+ iommu_dev->assigned_pid = 0;
+ iommu_dev->assigned_file_priv = NULL;
}
static int allocate_device_id(struct qda_memory_manager *mem_mgr,
@@ -78,6 +83,194 @@ static int allocate_device_id(struct qda_memory_manager *mem_mgr,
return ret;
}
+static struct qda_iommu_device *find_device_for_pid(struct qda_memory_manager *mem_mgr,
+ pid_t pid)
+{
+ unsigned long index;
+ void *entry;
+ struct qda_iommu_device *found_dev = NULL;
+ unsigned long flags;
+
+ xa_lock(&mem_mgr->device_xa);
+ xa_for_each(&mem_mgr->device_xa, index, entry) {
+ struct qda_iommu_device *iommu_dev = entry;
+
+ spin_lock_irqsave(&iommu_dev->lock, flags);
+ if (iommu_dev->assigned_pid == pid) {
+ found_dev = iommu_dev;
+ refcount_inc(&found_dev->refcount);
+ qda_dbg(NULL, "Reusing device id=%u for PID=%d (refcount=%u)\n",
+ found_dev->id, pid, refcount_read(&found_dev->refcount));
+ spin_unlock_irqrestore(&iommu_dev->lock, flags);
+ break;
+ }
+ spin_unlock_irqrestore(&iommu_dev->lock, flags);
+ }
+ xa_unlock(&mem_mgr->device_xa);
+
+ return found_dev;
+}
+
+static struct qda_iommu_device *assign_available_device_to_pid(struct qda_memory_manager *mem_mgr,
+ pid_t pid,
+ struct drm_file *file_priv)
+{
+ unsigned long index;
+ void *entry;
+ struct qda_iommu_device *selected_dev = NULL;
+ unsigned long flags;
+
+ xa_lock(&mem_mgr->device_xa);
+ xa_for_each(&mem_mgr->device_xa, index, entry) {
+ struct qda_iommu_device *iommu_dev = entry;
+
+ spin_lock_irqsave(&iommu_dev->lock, flags);
+ if (iommu_dev->assigned_pid == 0) {
+ iommu_dev->assigned_pid = pid;
+ iommu_dev->assigned_file_priv = file_priv;
+ selected_dev = iommu_dev;
+ refcount_set(&selected_dev->refcount, 1);
+ qda_dbg(NULL, "Assigned device id=%u to PID=%d\n",
+ selected_dev->id, pid);
+ spin_unlock_irqrestore(&iommu_dev->lock, flags);
+ break;
+ }
+ spin_unlock_irqrestore(&iommu_dev->lock, flags);
+ }
+ xa_unlock(&mem_mgr->device_xa);
+
+ return selected_dev;
+}
+
+static struct qda_iommu_device *get_process_iommu_device(struct qda_memory_manager *mem_mgr,
+ struct drm_file *file_priv)
+{
+ struct qda_file_priv *qda_priv;
+
+ if (!file_priv || !file_priv->driver_priv)
+ return NULL;
+
+ qda_priv = (struct qda_file_priv *)file_priv->driver_priv;
+ return qda_priv->assigned_iommu_dev;
+}
+
+static int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
+ struct drm_file *file_priv)
+{
+ struct qda_file_priv *qda_priv;
+ struct qda_iommu_device *selected_dev = NULL;
+ int ret = 0;
+ pid_t current_pid;
+
+ if (!file_priv || !file_priv->driver_priv) {
+ qda_err(NULL, "Invalid file_priv or driver_priv\n");
+ return -EINVAL;
+ }
+
+ qda_priv = (struct qda_file_priv *)file_priv->driver_priv;
+ current_pid = qda_priv->pid;
+
+ mutex_lock(&mem_mgr->process_assignment_lock);
+
+ if (qda_priv->assigned_iommu_dev) {
+ qda_dbg(NULL, "PID=%d already has device id=%u assigned\n",
+ current_pid, qda_priv->assigned_iommu_dev->id);
+ ret = 0;
+ goto unlock_and_return;
+ }
+
+ selected_dev = find_device_for_pid(mem_mgr, current_pid);
+
+ if (selected_dev) {
+ qda_priv->assigned_iommu_dev = selected_dev;
+ goto unlock_and_return;
+ }
+
+ selected_dev = assign_available_device_to_pid(mem_mgr, current_pid, file_priv);
+
+ if (!selected_dev) {
+ qda_err(NULL, "No available device for PID=%d\n", current_pid);
+ ret = -ENOMEM;
+ goto unlock_and_return;
+ }
+
+ qda_priv->assigned_iommu_dev = selected_dev;
+
+unlock_and_return:
+ mutex_unlock(&mem_mgr->process_assignment_lock);
+ return ret;
+}
+
+static struct qda_iommu_device *get_or_assign_iommu_device(struct qda_memory_manager *mem_mgr,
+ struct drm_file *file_priv,
+ size_t size)
+{
+ struct qda_iommu_device *iommu_dev;
+ int ret;
+
+ iommu_dev = get_process_iommu_device(mem_mgr, file_priv);
+ if (iommu_dev)
+ return iommu_dev;
+
+ ret = qda_memory_manager_assign_device(mem_mgr, file_priv);
+ if (ret)
+ return NULL;
+
+ iommu_dev = get_process_iommu_device(mem_mgr, file_priv);
+ if (iommu_dev)
+ return iommu_dev;
+
+ return NULL;
+}
+
+int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj,
+ struct drm_file *file_priv)
+{
+ struct qda_iommu_device *selected_dev;
+ size_t size;
+ int ret;
+
+ if (!mem_mgr || !gem_obj || !file_priv) {
+ qda_err(NULL, "Invalid parameters for memory allocation\n");
+ return -EINVAL;
+ }
+
+ size = gem_obj->size;
+ if (size == 0) {
+ qda_err(NULL, "Invalid allocation size: 0\n");
+ return -EINVAL;
+ }
+
+ selected_dev = get_or_assign_iommu_device(mem_mgr, file_priv, size);
+
+ if (!selected_dev) {
+ qda_err(NULL, "Failed to get/assign device for allocation (size=%zu)\n", size);
+ return -ENOMEM;
+ }
+
+ ret = qda_dma_alloc(selected_dev, gem_obj, size);
+
+ if (ret) {
+ qda_err(NULL, "Allocation failed: size=%zu, device_id=%u, ret=%d\n",
+ size, selected_dev->id, ret);
+ return ret;
+ }
+
+ qda_dbg(NULL, "Successfully allocated: size=%zu, device_id=%u, dma_addr=0x%llx\n",
+ size, selected_dev->id, gem_obj->dma_addr);
+ return 0;
+}
+
+void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj)
+{
+ if (!gem_obj || !gem_obj->iommu_dev) {
+ qda_dbg(NULL, "Invalid gem_obj or iommu_dev for free\n");
+ return;
+ }
+
+ qda_dma_free(gem_obj);
+}
+
int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
struct qda_iommu_device *iommu_dev)
{
@@ -134,6 +327,7 @@ int qda_memory_manager_init(struct qda_memory_manager *mem_mgr)
xa_init_flags(&mem_mgr->device_xa, XA_FLAGS_ALLOC);
atomic_set(&mem_mgr->next_id, 0);
+ mutex_init(&mem_mgr->process_assignment_lock);
mem_mgr->wq = create_workqueue("memory_manager_wq");
if (!mem_mgr->wq) {
qda_err(NULL, "Failed to create memory manager workqueue\n");
diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h
index 3bf4cd529909..bac44284ef98 100644
--- a/drivers/accel/qda/qda_memory_manager.h
+++ b/drivers/accel/qda/qda_memory_manager.h
@@ -11,6 +11,8 @@
#include <linux/spinlock.h>
#include <linux/workqueue.h>
#include <linux/xarray.h>
+#include <drm/drm_file.h>
+#include "qda_gem.h"
/**
* struct qda_iommu_device - IOMMU device instance for memory management
@@ -35,6 +37,10 @@ struct qda_iommu_device {
u32 sid;
/* Pointer to parent memory manager */
struct qda_memory_manager *manager;
+ /* Process ID of the process assigned to this device */
+ pid_t assigned_pid;
+ /* DRM file private data for the assigned process */
+ struct drm_file *assigned_file_priv;
};
/**
@@ -51,6 +57,8 @@ struct qda_memory_manager {
atomic_t next_id;
/* Workqueue for asynchronous device operations */
struct workqueue_struct *wq;
+ /* Mutex protecting process-to-device assignments */
+ struct mutex process_assignment_lock;
};
/**
@@ -98,4 +106,29 @@ int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
struct qda_iommu_device *iommu_dev);
+/**
+ * qda_memory_manager_alloc() - Allocate memory for a GEM object
+ * @mem_mgr: Pointer to memory manager
+ * @gem_obj: Pointer to GEM object to allocate memory for
+ * @file_priv: DRM file private data for process association
+ *
+ * Allocates memory for the specified GEM object using an appropriate IOMMU
+ * device. The allocation is associated with the calling process via
+ * file_priv.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj,
+ struct drm_file *file_priv);
+
+/**
+ * qda_memory_manager_free() - Free memory for a GEM object
+ * @mem_mgr: Pointer to memory manager
+ * @gem_obj: Pointer to GEM object to free memory for
+ *
+ * Releases memory previously allocated for the specified GEM object and
+ * removes any associated IOMMU mappings.
+ */
+void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj);
+
#endif /* _QDA_MEMORY_MANAGER_H */
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 11/18] accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (9 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 10/18] accel/qda: Add DMA-backed GEM objects and memory manager integration Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 22:39 ` Dmitry Baryshkov
2026-02-24 9:05 ` Christian König
2026-02-23 19:09 ` [PATCH RFC 12/18] accel/qda: Add PRIME dma-buf import support Ekansh Gupta
` (12 subsequent siblings)
23 siblings, 2 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Add two GEM-related IOCTLs for the QDA accelerator driver and hook
them into the DRM accel driver. DRM_IOCTL_QDA_GEM_CREATE allocates
a DMA-backed GEM buffer object via qda_gem_create_object() and
returns a GEM handle to userspace, while
DRM_IOCTL_QDA_GEM_MMAP_OFFSET returns a valid mmap offset for a
given GEM handle using drm_gem_create_mmap_offset() and the
vma_node in the GEM object.
The QDA driver is updated to advertise DRIVER_GEM in its
driver_features, and the new IOCTLs are wired through the QDA
GEM and memory-manager backend. These IOCTLs allow userspace to
allocate buffers and map them into its address space as a first
step toward full compute buffer management and integration with
DSP workloads.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/qda_drv.c | 5 ++++-
drivers/accel/qda/qda_gem.h | 30 ++++++++++++++++++++++++++++++
drivers/accel/qda/qda_ioctl.c | 35 +++++++++++++++++++++++++++++++++++
include/uapi/drm/qda_accel.h | 36 ++++++++++++++++++++++++++++++++++++
4 files changed, 105 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 19798359b14e..0dd0e2bb2c0f 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -12,6 +12,7 @@
#include <drm/qda_accel.h>
#include "qda_drv.h"
+#include "qda_gem.h"
#include "qda_ioctl.h"
#include "qda_rpmsg.h"
@@ -154,10 +155,12 @@ DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
static const struct drm_ioctl_desc qda_ioctls[] = {
DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
+ DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
+ DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
};
static struct drm_driver qda_drm_driver = {
- .driver_features = DRIVER_COMPUTE_ACCEL,
+ .driver_features = DRIVER_GEM | DRIVER_COMPUTE_ACCEL,
.fops = &qda_accel_fops,
.open = qda_open,
.postclose = qda_postclose,
diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h
index caae9cda5363..cbd5d0a58fa4 100644
--- a/drivers/accel/qda/qda_gem.h
+++ b/drivers/accel/qda/qda_gem.h
@@ -47,6 +47,36 @@ struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
void qda_gem_free_object(struct drm_gem_object *gem_obj);
int qda_gem_mmap_obj(struct drm_gem_object *gem_obj, struct vm_area_struct *vma);
+/*
+ * GEM IOCTL handlers
+ */
+
+/**
+ * qda_ioctl_gem_create - Create a GEM buffer object
+ * @dev: DRM device structure
+ * @data: User-space data containing buffer creation parameters
+ * @file_priv: DRM file private data
+ *
+ * This IOCTL handler creates a new GEM buffer object with the specified
+ * size and returns a handle to the created buffer.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
+
+/**
+ * qda_ioctl_gem_mmap_offset - Get mmap offset for a GEM buffer object
+ * @dev: DRM device structure
+ * @data: User-space data containing buffer handle and offset result
+ * @file_priv: DRM file private data
+ *
+ * This IOCTL handler retrieves the mmap offset for a GEM buffer object,
+ * which can be used to map the buffer into user-space memory.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv);
+
/*
* Helper functions for GEM object allocation and cleanup
* These are used internally and by the PRIME import code
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index 9fa73ec2dfce..ef3c9c691cb7 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -43,3 +43,38 @@ int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_pr
return 0;
}
+
+int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+ struct drm_qda_gem_create *args = data;
+ struct drm_gem_object *gem_obj;
+ struct qda_drm_priv *drm_priv;
+
+ drm_priv = get_drm_priv_from_device(dev);
+ if (!drm_priv || !drm_priv->iommu_mgr)
+ return -EINVAL;
+
+ gem_obj = qda_gem_create_object(dev, drm_priv->iommu_mgr, args->size, file_priv);
+ if (IS_ERR(gem_obj))
+ return PTR_ERR(gem_obj);
+
+ return qda_gem_create_handle(file_priv, gem_obj, &args->handle);
+}
+
+int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+ struct drm_qda_gem_mmap_offset *args = data;
+ struct drm_gem_object *gem_obj;
+ int ret;
+
+ gem_obj = qda_gem_lookup_object(file_priv, args->handle);
+ if (IS_ERR(gem_obj))
+ return PTR_ERR(gem_obj);
+
+ ret = drm_gem_create_mmap_offset(gem_obj);
+ if (ret == 0)
+ args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node);
+
+ drm_gem_object_put(gem_obj);
+ return ret;
+}
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index 0aad791c4832..ed24a7f5637e 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -19,6 +19,8 @@ extern "C" {
* They are used with DRM_COMMAND_BASE to create the full IOCTL numbers.
*/
#define DRM_QDA_QUERY 0x00
+#define DRM_QDA_GEM_CREATE 0x01
+#define DRM_QDA_GEM_MMAP_OFFSET 0x02
/*
* QDA IOCTL definitions
*
@@ -27,6 +29,10 @@ extern "C" {
* data structure and direction (read/write) for each IOCTL.
*/
#define DRM_IOCTL_QDA_QUERY DRM_IOR(DRM_COMMAND_BASE + DRM_QDA_QUERY, struct drm_qda_query)
+#define DRM_IOCTL_QDA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_CREATE, \
+ struct drm_qda_gem_create)
+#define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
+ struct drm_qda_gem_mmap_offset)
/**
* struct drm_qda_query - Device information query structure
@@ -40,6 +46,36 @@ struct drm_qda_query {
__u8 dsp_name[16];
};
+/**
+ * struct drm_qda_gem_create - GEM buffer object creation parameters
+ * @size: Size of the GEM object to create in bytes (input)
+ * @handle: Allocated GEM handle (output)
+ *
+ * This structure is used with DRM_IOCTL_QDA_GEM_CREATE to allocate
+ * a new GEM buffer object.
+ */
+struct drm_qda_gem_create {
+ __u32 handle;
+ __u32 pad;
+ __u64 size;
+};
+
+/**
+ * struct drm_qda_gem_mmap_offset - GEM object mmap offset query
+ * @handle: GEM handle (input)
+ * @pad: Padding for 64-bit alignment
+ * @offset: mmap offset for the GEM object (output)
+ *
+ * This structure is used with DRM_IOCTL_QDA_GEM_MMAP_OFFSET to retrieve
+ * the mmap offset that can be used with mmap() to map the GEM object into
+ * user space.
+ */
+struct drm_qda_gem_mmap_offset {
+ __u32 handle;
+ __u32 pad;
+ __u64 offset;
+};
+
#if defined(__cplusplus)
}
#endif
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 12/18] accel/qda: Add PRIME dma-buf import support
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (10 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 11/18] accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-24 8:52 ` Matthew Brost
2026-02-24 9:12 ` Christian König
2026-02-23 19:09 ` [PATCH RFC 13/18] accel/qda: Add initial FastRPC attach and release support Ekansh Gupta
` (11 subsequent siblings)
23 siblings, 2 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Add PRIME dma-buf import support for QDA GEM buffer objects and integrate
it with the existing per-process memory manager and IOMMU device model.
The implementation extends qda_gem_obj to represent imported dma-bufs,
including dma_buf references, attachment state, scatter-gather tables
and an imported DMA address used for DSP-facing book-keeping. The
qda_gem_prime_import() path handles reimports of buffers originally
exported by QDA as well as imports of external dma-bufs, attaching them
to the assigned IOMMU device and mapping them through the memory manager
for DSP access. The GEM free path is updated to unmap and detach
imported buffers while preserving the existing behaviour for locally
allocated memory.
The PRIME fd-to-handle path is implemented in qda_prime_fd_to_handle(),
which records the calling drm_file in a driver-private import context
before invoking the core DRM helpers. The GEM import callback retrieves
this context to ensure that an IOMMU device is assigned to the process
and that imported buffers follow the same per-process IOMMU selection
rules as natively allocated GEM objects.
This patch prepares the driver for interoperable buffer sharing between
QDA and other dma-buf capable subsystems while keeping IOMMU mapping and
lifetime handling consistent with the existing GEM allocation flow.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/Makefile | 1 +
drivers/accel/qda/qda_drv.c | 8 ++
drivers/accel/qda/qda_drv.h | 4 +
drivers/accel/qda/qda_gem.c | 60 +++++++---
drivers/accel/qda/qda_gem.h | 10 ++
drivers/accel/qda/qda_ioctl.c | 7 ++
drivers/accel/qda/qda_ioctl.h | 15 +++
drivers/accel/qda/qda_memory_manager.c | 42 ++++++-
drivers/accel/qda/qda_memory_manager.h | 14 +++
drivers/accel/qda/qda_prime.c | 194 +++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_prime.h | 43 ++++++++
11 files changed, 377 insertions(+), 21 deletions(-)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 88c324fa382c..8286f5279748 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -13,5 +13,6 @@ qda-y := \
qda_ioctl.o \
qda_gem.o \
qda_memory_dma.o \
+ qda_prime.o \
obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 0dd0e2bb2c0f..4adee00b1f2c 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -10,9 +10,11 @@
#include <drm/drm_gem.h>
#include <drm/drm_ioctl.h>
#include <drm/qda_accel.h>
+#include <drm/drm_prime.h>
#include "qda_drv.h"
#include "qda_gem.h"
+#include "qda_prime.h"
#include "qda_ioctl.h"
#include "qda_rpmsg.h"
@@ -166,6 +168,8 @@ static struct drm_driver qda_drm_driver = {
.postclose = qda_postclose,
.ioctls = qda_ioctls,
.num_ioctls = ARRAY_SIZE(qda_ioctls),
+ .gem_prime_import = qda_gem_prime_import,
+ .prime_fd_to_handle = qda_ioctl_prime_fd_to_handle,
.name = DRIVER_NAME,
.desc = "Qualcomm DSP Accelerator Driver",
};
@@ -174,6 +178,7 @@ static void cleanup_drm_private(struct qda_dev *qdev)
{
if (qdev->drm_priv) {
qda_dbg(qdev, "Cleaning up DRM private data\n");
+ mutex_destroy(&qdev->drm_priv->import_lock);
kfree(qdev->drm_priv);
}
}
@@ -240,6 +245,9 @@ static int init_drm_private(struct qda_dev *qdev)
if (!qdev->drm_priv)
return -ENOMEM;
+ mutex_init(&qdev->drm_priv->import_lock);
+ qdev->drm_priv->current_import_file_priv = NULL;
+
qda_dbg(qdev, "DRM private data initialized successfully\n");
return 0;
}
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index 8a2cd474958b..bb0dd7e284c6 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -64,6 +64,10 @@ struct qda_drm_priv {
struct qda_memory_manager *iommu_mgr;
/* Back-pointer to qda_dev */
struct qda_dev *qdev;
+ /* Lock protecting import context */
+ struct mutex import_lock;
+ /* Current file_priv during prime import */
+ struct drm_file *current_import_file_priv;
};
/* struct qda_dev - Main device structure for QDA driver */
diff --git a/drivers/accel/qda/qda_gem.c b/drivers/accel/qda/qda_gem.c
index bbd54e2502d3..37279e8b46fe 100644
--- a/drivers/accel/qda/qda_gem.c
+++ b/drivers/accel/qda/qda_gem.c
@@ -8,6 +8,7 @@
#include "qda_gem.h"
#include "qda_memory_manager.h"
#include "qda_memory_dma.h"
+#include "qda_prime.h"
static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
{
@@ -15,23 +16,29 @@ static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
qda_err(NULL, "Invalid GEM object size\n");
return -EINVAL;
}
- if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
- qda_err(NULL, "Allocated buffer missing IOMMU device\n");
- return -EINVAL;
- }
- if (!qda_gem_obj->iommu_dev->dev) {
- qda_err(NULL, "Allocated buffer missing IOMMU device\n");
- return -EINVAL;
- }
- if (!qda_gem_obj->virt) {
- qda_err(NULL, "Allocated buffer missing virtual address\n");
- return -EINVAL;
- }
- if (qda_gem_obj->dma_addr == 0) {
- qda_err(NULL, "Allocated buffer missing DMA address\n");
- return -EINVAL;
+ if (qda_gem_obj->is_imported) {
+ if (!qda_gem_obj->sgt) {
+ qda_err(NULL, "Imported buffer missing sgt\n");
+ return -EINVAL;
+ }
+ if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
+ qda_err(NULL, "Imported buffer missing IOMMU device\n");
+ return -EINVAL;
+ }
+ } else {
+ if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
+ qda_err(NULL, "Allocated buffer missing IOMMU device\n");
+ return -EINVAL;
+ }
+ if (!qda_gem_obj->virt) {
+ qda_err(NULL, "Allocated buffer missing virtual address\n");
+ return -EINVAL;
+ }
+ if (qda_gem_obj->dma_addr == 0) {
+ qda_err(NULL, "Allocated buffer missing DMA address\n");
+ return -EINVAL;
+ }
}
-
return 0;
}
@@ -60,9 +67,21 @@ void qda_gem_free_object(struct drm_gem_object *gem_obj)
struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(gem_obj);
struct qda_drm_priv *drm_priv = get_drm_priv_from_device(gem_obj->dev);
- if (qda_gem_obj->virt) {
- if (drm_priv && drm_priv->iommu_mgr)
+ if (qda_gem_obj->is_imported) {
+ if (qda_gem_obj->attachment && qda_gem_obj->sgt)
+ dma_buf_unmap_attachment_unlocked(qda_gem_obj->attachment,
+ qda_gem_obj->sgt, DMA_BIDIRECTIONAL);
+ if (qda_gem_obj->attachment)
+ dma_buf_detach(qda_gem_obj->dma_buf, qda_gem_obj->attachment);
+ if (qda_gem_obj->dma_buf)
+ dma_buf_put(qda_gem_obj->dma_buf);
+ if (qda_gem_obj->iommu_dev && drm_priv && drm_priv->iommu_mgr)
qda_memory_manager_free(drm_priv->iommu_mgr, qda_gem_obj);
+ } else {
+ if (qda_gem_obj->virt) {
+ if (drm_priv && drm_priv->iommu_mgr)
+ qda_memory_manager_free(drm_priv->iommu_mgr, qda_gem_obj);
+ }
}
drm_gem_object_release(gem_obj);
@@ -174,6 +193,11 @@ struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
qda_gem_obj = qda_gem_alloc_object(drm_dev, aligned_size);
if (IS_ERR(qda_gem_obj))
return (struct drm_gem_object *)qda_gem_obj;
+ qda_gem_obj->is_imported = false;
+ qda_gem_obj->dma_buf = NULL;
+ qda_gem_obj->attachment = NULL;
+ qda_gem_obj->sgt = NULL;
+ qda_gem_obj->imported_dma_addr = 0;
ret = qda_memory_manager_alloc(iommu_mgr, qda_gem_obj, file_priv);
if (ret) {
diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h
index cbd5d0a58fa4..3566c5b2ad88 100644
--- a/drivers/accel/qda/qda_gem.h
+++ b/drivers/accel/qda/qda_gem.h
@@ -31,6 +31,16 @@ struct qda_gem_obj {
size_t size;
/* IOMMU device that performed the allocation */
struct qda_iommu_device *iommu_dev;
+ /* True if buffer is imported, false if allocated */
+ bool is_imported;
+ /* Reference to imported dma_buf */
+ struct dma_buf *dma_buf;
+ /* DMA buf attachment */
+ struct dma_buf_attachment *attachment;
+ /* Scatter-gather table */
+ struct sg_table *sgt;
+ /* DMA address of imported buffer */
+ dma_addr_t imported_dma_addr;
};
/*
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index ef3c9c691cb7..d91983048d6c 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -5,6 +5,7 @@
#include <drm/qda_accel.h>
#include "qda_drv.h"
#include "qda_ioctl.h"
+#include "qda_prime.h"
static int qda_validate_and_get_context(struct drm_device *dev, struct drm_file *file_priv,
struct qda_dev **qdev, struct qda_user **qda_user)
@@ -78,3 +79,9 @@ int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_fil
drm_gem_object_put(gem_obj);
return ret;
}
+
+int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv, int prime_fd,
+ u32 *handle)
+{
+ return qda_prime_fd_to_handle(dev, file_priv, prime_fd, handle);
+}
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index 6bf3bcd28c0e..d454256f5fc5 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -23,4 +23,19 @@
*/
int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
+/**
+ * qda_ioctl_prime_fd_to_handle - IOCTL handler for PRIME FD to handle conversion
+ * @dev: DRM device structure
+ * @file_priv: DRM file private data
+ * @prime_fd: File descriptor of the PRIME buffer
+ * @handle: Output parameter for the GEM handle
+ *
+ * This IOCTL handler converts a PRIME file descriptor to a GEM handle.
+ * It serves as both the DRM driver callback and can be used directly.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
+ int prime_fd, u32 *handle);
+
#endif /* _QDA_IOCTL_H */
diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
index e225667557ee..3fd20f17c57b 100644
--- a/drivers/accel/qda/qda_memory_manager.c
+++ b/drivers/accel/qda/qda_memory_manager.c
@@ -154,8 +154,8 @@ static struct qda_iommu_device *get_process_iommu_device(struct qda_memory_manag
return qda_priv->assigned_iommu_dev;
}
-static int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
- struct drm_file *file_priv)
+int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
+ struct drm_file *file_priv)
{
struct qda_file_priv *qda_priv;
struct qda_iommu_device *selected_dev = NULL;
@@ -223,6 +223,35 @@ static struct qda_iommu_device *get_or_assign_iommu_device(struct qda_memory_man
return NULL;
}
+static int qda_memory_manager_map_imported(struct qda_memory_manager *mem_mgr,
+ struct qda_gem_obj *gem_obj,
+ struct qda_iommu_device *iommu_dev)
+{
+ struct scatterlist *sg;
+ dma_addr_t dma_addr;
+ int ret = 0;
+
+ if (!gem_obj->is_imported || !gem_obj->sgt || !iommu_dev) {
+ qda_err(NULL, "Invalid parameters for imported buffer mapping\n");
+ return -EINVAL;
+ }
+
+ gem_obj->iommu_dev = iommu_dev;
+
+ sg = gem_obj->sgt->sgl;
+ if (sg) {
+ dma_addr = sg_dma_address(sg);
+ dma_addr += ((u64)iommu_dev->sid << 32);
+
+ gem_obj->imported_dma_addr = dma_addr;
+ } else {
+ qda_err(NULL, "Invalid scatter-gather list for imported buffer\n");
+ ret = -EINVAL;
+ }
+
+ return ret;
+}
+
int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj,
struct drm_file *file_priv)
{
@@ -248,7 +277,10 @@ int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_
return -ENOMEM;
}
- ret = qda_dma_alloc(selected_dev, gem_obj, size);
+ if (gem_obj->is_imported)
+ ret = qda_memory_manager_map_imported(mem_mgr, gem_obj, selected_dev);
+ else
+ ret = qda_dma_alloc(selected_dev, gem_obj, size);
if (ret) {
qda_err(NULL, "Allocation failed: size=%zu, device_id=%u, ret=%d\n",
@@ -268,6 +300,10 @@ void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_
return;
}
+ if (gem_obj->is_imported) {
+ qda_dbg(NULL, "Freed imported buffer tracking (no DMA free needed)\n");
+ return;
+ }
qda_dma_free(gem_obj);
}
diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h
index bac44284ef98..f6c7963cec42 100644
--- a/drivers/accel/qda/qda_memory_manager.h
+++ b/drivers/accel/qda/qda_memory_manager.h
@@ -106,6 +106,20 @@ int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
struct qda_iommu_device *iommu_dev);
+/**
+ * qda_memory_manager_assign_device() - Assign an IOMMU device to a process
+ * @mem_mgr: Pointer to memory manager
+ * @file_priv: DRM file private data for process association
+ *
+ * Assigns an IOMMU device to the calling process. If the process already has
+ * a device assigned, returns success. If another file descriptor from the same
+ * PID has a device, reuses it. Otherwise, finds an available device and assigns it.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
+ struct drm_file *file_priv);
+
/**
* qda_memory_manager_alloc() - Allocate memory for a GEM object
* @mem_mgr: Pointer to memory manager
diff --git a/drivers/accel/qda/qda_prime.c b/drivers/accel/qda/qda_prime.c
new file mode 100644
index 000000000000..3d23842e48bb
--- /dev/null
+++ b/drivers/accel/qda/qda_prime.c
@@ -0,0 +1,194 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <drm/drm_gem.h>
+#include <drm/drm_prime.h>
+#include <linux/slab.h>
+#include <linux/dma-mapping.h>
+#include "qda_drv.h"
+#include "qda_gem.h"
+#include "qda_prime.h"
+#include "qda_memory_manager.h"
+
+static struct drm_gem_object *check_own_buffer(struct drm_device *dev, struct dma_buf *dma_buf)
+{
+ if (dma_buf->priv) {
+ struct drm_gem_object *existing_gem = dma_buf->priv;
+
+ if (existing_gem->dev == dev) {
+ struct qda_gem_obj *existing_qda_gem = to_qda_gem_obj(existing_gem);
+
+ if (!existing_qda_gem->is_imported) {
+ drm_gem_object_get(existing_gem);
+ return existing_gem;
+ }
+ }
+ }
+ return NULL;
+}
+
+static struct qda_iommu_device *get_iommu_device_for_import(struct qda_drm_priv *drm_priv,
+ struct drm_file **file_priv_out,
+ struct qda_dev *qdev)
+{
+ struct drm_file *file_priv;
+ struct qda_file_priv *qda_file_priv;
+ struct qda_iommu_device *iommu_dev = NULL;
+ int ret;
+
+ file_priv = drm_priv->current_import_file_priv;
+ *file_priv_out = file_priv;
+
+ if (!file_priv || !file_priv->driver_priv)
+ return NULL;
+
+ qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv;
+ iommu_dev = qda_file_priv->assigned_iommu_dev;
+
+ if (!iommu_dev) {
+ ret = qda_memory_manager_assign_device(drm_priv->iommu_mgr, file_priv);
+ if (ret) {
+ qda_err(qdev, "Failed to assign IOMMU device: %d\n", ret);
+ return NULL;
+ }
+
+ iommu_dev = qda_file_priv->assigned_iommu_dev;
+ }
+
+ return iommu_dev;
+}
+
+static int setup_dma_buf_mapping(struct qda_gem_obj *qda_gem_obj, struct dma_buf *dma_buf,
+ struct device *attach_dev, struct qda_dev *qdev)
+{
+ struct dma_buf_attachment *attachment;
+ struct sg_table *sgt;
+ int ret;
+
+ attachment = dma_buf_attach(dma_buf, attach_dev);
+ if (IS_ERR(attachment)) {
+ ret = PTR_ERR(attachment);
+ qda_err(qdev, "Failed to attach dma_buf: %d\n", ret);
+ return ret;
+ }
+ qda_gem_obj->attachment = attachment;
+
+ sgt = dma_buf_map_attachment_unlocked(attachment, DMA_BIDIRECTIONAL);
+ if (IS_ERR(sgt)) {
+ ret = PTR_ERR(sgt);
+ qda_err(qdev, "Failed to map dma_buf attachment: %d\n", ret);
+ dma_buf_detach(dma_buf, attachment);
+ return ret;
+ }
+ qda_gem_obj->sgt = sgt;
+
+ return 0;
+}
+
+struct drm_gem_object *qda_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf)
+{
+ struct qda_drm_priv *drm_priv;
+ struct qda_gem_obj *qda_gem_obj;
+ struct drm_file *file_priv;
+ struct qda_iommu_device *iommu_dev;
+ struct qda_dev *qdev;
+ struct drm_gem_object *existing_gem;
+ size_t aligned_size;
+ int ret;
+
+ drm_priv = get_drm_priv_from_device(dev);
+ if (!drm_priv || !drm_priv->iommu_mgr) {
+ qda_err(NULL, "Invalid drm_priv or iommu_mgr\n");
+ return ERR_PTR(-EINVAL);
+ }
+
+ qdev = drm_priv->qdev;
+
+ existing_gem = check_own_buffer(dev, dma_buf);
+ if (existing_gem)
+ return existing_gem;
+
+ iommu_dev = get_iommu_device_for_import(drm_priv, &file_priv, qdev);
+ if (!iommu_dev || !iommu_dev->dev) {
+ qda_err(qdev, "No IOMMU device assigned for prime import\n");
+ return ERR_PTR(-ENODEV);
+ }
+
+ qda_dbg(qdev, "Using IOMMU device %u for prime import\n", iommu_dev->id);
+
+ aligned_size = PAGE_ALIGN(dma_buf->size);
+ qda_gem_obj = qda_gem_alloc_object(dev, aligned_size);
+ if (IS_ERR(qda_gem_obj))
+ return (struct drm_gem_object *)qda_gem_obj;
+
+ qda_gem_obj->is_imported = true;
+ qda_gem_obj->dma_buf = dma_buf;
+ qda_gem_obj->virt = NULL;
+ qda_gem_obj->dma_addr = 0;
+ qda_gem_obj->imported_dma_addr = 0;
+ qda_gem_obj->iommu_dev = iommu_dev;
+
+ get_dma_buf(dma_buf);
+
+ ret = setup_dma_buf_mapping(qda_gem_obj, dma_buf, iommu_dev->dev, qdev);
+ if (ret)
+ goto err_put_dma_buf;
+
+ ret = qda_memory_manager_alloc(drm_priv->iommu_mgr, qda_gem_obj, file_priv);
+ if (ret) {
+ qda_err(qdev, "Failed to allocate IOMMU mapping: %d\n", ret);
+ goto err_unmap;
+ }
+
+ qda_dbg(qdev, "Prime import completed successfully size=%zu\n", aligned_size);
+ return &qda_gem_obj->base;
+
+err_unmap:
+ dma_buf_unmap_attachment_unlocked(qda_gem_obj->attachment,
+ qda_gem_obj->sgt, DMA_BIDIRECTIONAL);
+ dma_buf_detach(dma_buf, qda_gem_obj->attachment);
+err_put_dma_buf:
+ dma_buf_put(dma_buf);
+ qda_gem_cleanup_object(qda_gem_obj);
+ return ERR_PTR(ret);
+}
+
+int qda_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
+ int prime_fd, u32 *handle)
+{
+ struct qda_drm_priv *drm_priv;
+ struct qda_dev *qdev;
+ int ret;
+
+ drm_priv = get_drm_priv_from_device(dev);
+ if (!drm_priv) {
+ qda_dbg(NULL, "Failed to get drm_priv from device\n");
+ return -EINVAL;
+ }
+
+ qdev = drm_priv->qdev;
+
+ if (file_priv && file_priv->driver_priv) {
+ struct qda_file_priv *qda_file_priv;
+
+ qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv;
+ } else {
+ qda_dbg(qdev, "Called with NULL file_priv or driver_priv\n");
+ }
+
+ mutex_lock(&drm_priv->import_lock);
+ drm_priv->current_import_file_priv = file_priv;
+
+ ret = drm_gem_prime_fd_to_handle(dev, file_priv, prime_fd, handle);
+
+ drm_priv->current_import_file_priv = NULL;
+ mutex_unlock(&drm_priv->import_lock);
+
+ if (!ret)
+ qda_dbg(qdev, "Completed with ret=%d, handle=%u\n", ret, *handle);
+ else
+ qda_dbg(qdev, "Completed with ret=%d\n", ret);
+
+ return ret;
+}
+
+MODULE_IMPORT_NS("DMA_BUF");
diff --git a/drivers/accel/qda/qda_prime.h b/drivers/accel/qda/qda_prime.h
new file mode 100644
index 000000000000..939902454dcd
--- /dev/null
+++ b/drivers/accel/qda/qda_prime.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef _QDA_PRIME_H
+#define _QDA_PRIME_H
+
+#include <drm/drm_device.h>
+#include <drm/drm_file.h>
+#include <drm/drm_gem.h>
+#include <linux/dma-buf.h>
+
+/**
+ * qda_gem_prime_import - Import a DMA-BUF as a GEM object
+ * @dev: DRM device structure
+ * @dma_buf: DMA-BUF to import
+ *
+ * This function imports an external DMA-BUF into the QDA driver as a GEM
+ * object. It handles both re-imports of buffers originally from this driver
+ * and imports of external buffers from other drivers.
+ *
+ * Return: Pointer to the imported GEM object on success, ERR_PTR on failure
+ */
+struct drm_gem_object *qda_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf);
+
+/**
+ * qda_prime_fd_to_handle - Core implementation for PRIME FD to GEM handle conversion
+ * @dev: DRM device structure
+ * @file_priv: DRM file private data
+ * @prime_fd: File descriptor of the PRIME buffer
+ * @handle: Output parameter for the GEM handle
+ *
+ * This core function sets up the necessary context before calling the
+ * DRM framework's prime FD to handle conversion. It ensures proper IOMMU
+ * device assignment and tracking for the import operation.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
+ int prime_fd, u32 *handle);
+
+#endif /* _QDA_PRIME_H */
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 13/18] accel/qda: Add initial FastRPC attach and release support
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (11 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 12/18] accel/qda: Add PRIME dma-buf import support Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 23:07 ` Dmitry Baryshkov
2026-02-23 19:09 ` [PATCH RFC 14/18] accel/qda: Add FastRPC dynamic invocation support Ekansh Gupta
` (10 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Add the initial FastRPC invocation plumbing to the QDA accelerator
driver to support attaching to and releasing a DSP process. A new
fastrpc_invoke_context structure tracks the state of a single remote
procedure call, including arguments, overlap handling, completion and
GEM-based message buffers. Contexts are indexed through an xarray in
qda_dev so that RPMsg callbacks can match responses back to the
originating invocation.
The new qda_fastrpc implementation provides helpers to prepare
FastRPC scalars and arguments, pack them into a QDA message backed by
a GEM buffer and unpack responses. The FastRPC INIT_ATTACH and
INIT_RELEASE methods are wired up via a new QDA_INIT_ATTACH ioctl and
a postclose hook that sends a release request when a client file
descriptor is closed. On the transport side qda_rpmsg_send_msg()
builds and sends a fastrpc_msg over RPMsg, while qda_rpmsg_cb()
decodes qda_invoke_rsp messages, looks up the context by its id and
completes the corresponding wait.
This lays the foundation for QDA FastRPC method support on top of the
existing GEM and RPMsg infrastructure, starting with the attach and
release control flows for DSP sessions.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/Makefile | 1 +
drivers/accel/qda/qda_drv.c | 5 +
drivers/accel/qda/qda_drv.h | 2 +
drivers/accel/qda/qda_fastrpc.c | 548 ++++++++++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_fastrpc.h | 303 ++++++++++++++++++++++
drivers/accel/qda/qda_ioctl.c | 107 ++++++++
drivers/accel/qda/qda_ioctl.h | 25 ++
drivers/accel/qda/qda_rpmsg.c | 164 +++++++++++-
drivers/accel/qda/qda_rpmsg.h | 40 +++
include/uapi/drm/qda_accel.h | 19 ++
10 files changed, 1212 insertions(+), 2 deletions(-)
diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
index 8286f5279748..82d40e452fa9 100644
--- a/drivers/accel/qda/Makefile
+++ b/drivers/accel/qda/Makefile
@@ -14,5 +14,6 @@ qda-y := \
qda_gem.o \
qda_memory_dma.o \
qda_prime.o \
+ qda_fastrpc.o \
obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 4adee00b1f2c..3034ea660924 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -120,6 +120,8 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
return;
}
+ fastrpc_release_current_dsp_process(qdev, file);
+
qda_file_priv = (struct qda_file_priv *)file->driver_priv;
if (qda_file_priv) {
if (qda_file_priv->assigned_iommu_dev) {
@@ -159,6 +161,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
+ DRM_IOCTL_DEF_DRV(QDA_INIT_ATTACH, qda_ioctl_attach, 0),
};
static struct drm_driver qda_drm_driver = {
@@ -195,6 +198,7 @@ static void cleanup_iommu_manager(struct qda_dev *qdev)
static void cleanup_device_resources(struct qda_dev *qdev)
{
+ xa_destroy(&qdev->ctx_xa);
mutex_destroy(&qdev->lock);
}
@@ -213,6 +217,7 @@ static void init_device_resources(struct qda_dev *qdev)
mutex_init(&qdev->lock);
atomic_set(&qdev->removing, 0);
atomic_set(&qdev->client_id_counter, 0);
+ xa_init_flags(&qdev->ctx_xa, XA_FLAGS_ALLOC1);
}
static int init_memory_manager(struct qda_dev *qdev)
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index bb0dd7e284c6..bb1d1e82036a 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -92,6 +92,8 @@ struct qda_dev {
char dsp_name[16];
/* Compute context-bank (CB) child devices */
struct list_head cb_devs;
+ /* XArray for context management */
+ struct xarray ctx_xa;
};
/**
diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
new file mode 100644
index 000000000000..eda7c90070ee
--- /dev/null
+++ b/drivers/accel/qda/qda_fastrpc.c
@@ -0,0 +1,548 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/sort.h>
+#include <linux/completion.h>
+#include <linux/dma-buf.h>
+#include <drm/drm_gem.h>
+#include <drm/qda_accel.h>
+#include "qda_fastrpc.h"
+#include "qda_drv.h"
+#include "qda_gem.h"
+#include "qda_memory_manager.h"
+
+static int copy_to_user_or_kernel(void __user *dst, const void *src, size_t size)
+{
+ if ((unsigned long)dst >= PAGE_OFFSET) {
+ memcpy(dst, src, size);
+ return 0;
+ } else {
+ return copy_to_user(dst, src, size) ? -EFAULT : 0;
+ }
+}
+
+static int get_gem_obj_from_handle(struct drm_file *file_priv, u32 handle,
+ struct drm_gem_object **gem_obj)
+{
+ if (handle == 0)
+ return -EINVAL;
+
+ if (!file_priv)
+ return -EINVAL;
+
+ *gem_obj = drm_gem_object_lookup(file_priv, handle);
+ if (*gem_obj)
+ return 0;
+
+ return -ENOENT;
+}
+
+static void setup_pages_from_gem_obj(struct qda_gem_obj *qda_gem_obj,
+ struct fastrpc_phy_page *pages)
+{
+ if (qda_gem_obj->is_imported)
+ pages->addr = qda_gem_obj->imported_dma_addr;
+ else
+ pages->addr = qda_gem_obj->dma_addr;
+
+ pages->size = qda_gem_obj->size;
+}
+
+static u64 calculate_vma_offset(u64 user_ptr)
+{
+ struct vm_area_struct *vma;
+ u64 user_ptr_page_mask = user_ptr & PAGE_MASK;
+ u64 vma_offset = 0;
+
+ mmap_read_lock(current->mm);
+ vma = find_vma(current->mm, user_ptr);
+ if (vma)
+ vma_offset = user_ptr_page_mask - vma->vm_start;
+ mmap_read_unlock(current->mm);
+
+ return vma_offset;
+}
+
+static u64 calculate_page_aligned_size(u64 ptr, u64 len)
+{
+ u64 pg_start = (ptr & PAGE_MASK) >> PAGE_SHIFT;
+ u64 pg_end = ((ptr + len - 1) & PAGE_MASK) >> PAGE_SHIFT;
+ u64 aligned_size = (pg_end - pg_start + 1) * PAGE_SIZE;
+
+ return aligned_size;
+}
+
+static void setup_single_arg(struct fastrpc_invoke_args *args, void *ptr, size_t size)
+{
+ args[0].ptr = (u64)(uintptr_t)ptr;
+ args[0].length = size;
+ args[0].fd = -1;
+}
+
+static struct fastrpc_invoke_buf *fastrpc_invoke_buf_start(union fastrpc_remote_arg *pra, int len)
+{
+ struct fastrpc_invoke_buf *buf = (struct fastrpc_invoke_buf *)(&pra[len]);
+ return buf;
+}
+
+static struct fastrpc_phy_page *fastrpc_phy_page_start(struct fastrpc_invoke_buf *buf, int len)
+{
+ struct fastrpc_phy_page *pages = (struct fastrpc_phy_page *)(&buf[len]);
+ return pages;
+}
+
+static int fastrpc_get_meta_size(struct fastrpc_invoke_context *ctx)
+{
+ int size = 0;
+
+ size = (sizeof(struct fastrpc_remote_buf) +
+ sizeof(struct fastrpc_invoke_buf) +
+ sizeof(struct fastrpc_phy_page)) * ctx->nscalars +
+ sizeof(u64) * FASTRPC_MAX_FDLIST +
+ sizeof(u32) * FASTRPC_MAX_CRCLIST;
+
+ return size;
+}
+
+static u64 fastrpc_get_payload_size(struct fastrpc_invoke_context *ctx, int metalen)
+{
+ u64 size = 0;
+ int oix;
+
+ size = ALIGN(metalen, FASTRPC_ALIGN);
+
+ for (oix = 0; oix < ctx->nbufs; oix++) {
+ int i = ctx->olaps[oix].raix;
+
+ if (ctx->args[i].fd == 0 || ctx->args[i].fd == -1) {
+ if (ctx->olaps[oix].offset == 0)
+ size = ALIGN(size, FASTRPC_ALIGN);
+
+ size += (ctx->olaps[oix].mend - ctx->olaps[oix].mstart);
+ }
+ }
+
+ return size;
+}
+
+void fastrpc_context_free(struct kref *ref)
+{
+ struct fastrpc_invoke_context *ctx;
+ int i;
+
+ ctx = container_of(ref, struct fastrpc_invoke_context, refcount);
+ if (ctx->gem_objs) {
+ for (i = 0; i < ctx->nscalars; ++i) {
+ if (ctx->gem_objs[i]) {
+ drm_gem_object_put(ctx->gem_objs[i]);
+ ctx->gem_objs[i] = NULL;
+ }
+ }
+ kfree(ctx->gem_objs);
+ ctx->gem_objs = NULL;
+ }
+
+ if (ctx->msg_gem_obj) {
+ drm_gem_object_put(&ctx->msg_gem_obj->base);
+ ctx->msg_gem_obj = NULL;
+ }
+
+ kfree(ctx->olaps);
+ ctx->olaps = NULL;
+
+ kfree(ctx->args);
+ kfree(ctx->req);
+ kfree(ctx->rsp);
+ kfree(ctx->input_pages);
+ kfree(ctx->inbuf);
+
+ kfree(ctx);
+}
+
+#define CMP(aa, bb) ((aa) == (bb) ? 0 : (aa) < (bb) ? -1 : 1)
+
+static int olaps_cmp(const void *a, const void *b)
+{
+ struct fastrpc_buf_overlap *pa = (struct fastrpc_buf_overlap *)a;
+ struct fastrpc_buf_overlap *pb = (struct fastrpc_buf_overlap *)b;
+ int st = CMP(pa->start, pb->start);
+ int ed = CMP(pb->end, pa->end);
+
+ return st == 0 ? ed : st;
+}
+
+static void fastrpc_get_buff_overlaps(struct fastrpc_invoke_context *ctx)
+{
+ u64 max_end = 0;
+ int i;
+
+ for (i = 0; i < ctx->nbufs; ++i) {
+ ctx->olaps[i].start = ctx->args[i].ptr;
+ ctx->olaps[i].end = ctx->olaps[i].start + ctx->args[i].length;
+ ctx->olaps[i].raix = i;
+ }
+
+ sort(ctx->olaps, ctx->nbufs, sizeof(*ctx->olaps), olaps_cmp, NULL);
+
+ for (i = 0; i < ctx->nbufs; ++i) {
+ if (ctx->olaps[i].start < max_end) {
+ ctx->olaps[i].mstart = max_end;
+ ctx->olaps[i].mend = ctx->olaps[i].end;
+ ctx->olaps[i].offset = max_end - ctx->olaps[i].start;
+
+ if (ctx->olaps[i].end > max_end) {
+ max_end = ctx->olaps[i].end;
+ } else {
+ ctx->olaps[i].mend = 0;
+ ctx->olaps[i].mstart = 0;
+ }
+ } else {
+ ctx->olaps[i].mend = ctx->olaps[i].end;
+ ctx->olaps[i].mstart = ctx->olaps[i].start;
+ ctx->olaps[i].offset = 0;
+ max_end = ctx->olaps[i].end;
+ }
+ }
+}
+
+struct fastrpc_invoke_context *fastrpc_context_alloc(void)
+{
+ struct fastrpc_invoke_context *ctx = NULL;
+
+ ctx = kzalloc_obj(*ctx, GFP_KERNEL);
+ if (!ctx)
+ return ERR_PTR(-ENOMEM);
+
+ INIT_LIST_HEAD(&ctx->node);
+
+ ctx->retval = -1;
+ ctx->pid = current->pid;
+ init_completion(&ctx->work);
+ ctx->msg_gem_obj = NULL;
+ kref_init(&ctx->refcount);
+
+ return ctx;
+}
+
+static int process_fd_buffer(struct fastrpc_invoke_context *ctx, int i,
+ union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages)
+{
+ struct drm_gem_object *gem_obj;
+ struct qda_gem_obj *qda_gem_obj;
+ int err;
+ u64 len = ctx->args[i].length;
+ u64 vma_offset;
+
+ err = get_gem_obj_from_handle(ctx->file_priv, ctx->args[i].fd, &gem_obj);
+ if (err)
+ return err;
+
+ ctx->gem_objs[i] = gem_obj;
+ qda_gem_obj = to_qda_gem_obj(gem_obj);
+
+ rpra[i].buf.pv = (u64)ctx->args[i].ptr;
+
+ if (qda_gem_obj->is_imported)
+ pages[i].addr = qda_gem_obj->imported_dma_addr;
+ else
+ pages[i].addr = qda_gem_obj->dma_addr;
+
+ vma_offset = calculate_vma_offset(ctx->args[i].ptr);
+ pages[i].addr += vma_offset;
+ pages[i].size = calculate_page_aligned_size(ctx->args[i].ptr, len);
+
+ return 0;
+}
+
+static int process_direct_buffer(struct fastrpc_invoke_context *ctx, int i, int oix,
+ union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages,
+ uintptr_t *args, u64 *rlen, u64 pkt_size)
+{
+ int mlen;
+ u64 len = ctx->args[i].length;
+ int inbufs = ctx->inbufs;
+
+ if (ctx->olaps[oix].offset == 0) {
+ *rlen -= ALIGN(*args, FASTRPC_ALIGN) - *args;
+ *args = ALIGN(*args, FASTRPC_ALIGN);
+ }
+
+ mlen = ctx->olaps[oix].mend - ctx->olaps[oix].mstart;
+
+ if (*rlen < mlen)
+ return -ENOSPC;
+
+ rpra[i].buf.pv = *args - ctx->olaps[oix].offset;
+
+ pages[i].addr = ctx->msg->phys - ctx->olaps[oix].offset + (pkt_size - *rlen);
+ pages[i].addr = pages[i].addr & PAGE_MASK;
+ pages[i].size = calculate_page_aligned_size(rpra[i].buf.pv, len);
+
+ *args = *args + mlen;
+ *rlen -= mlen;
+
+ if (i < inbufs) {
+ void *dst = (void *)(uintptr_t)rpra[i].buf.pv;
+ void *src = (void *)(uintptr_t)ctx->args[i].ptr;
+
+ if ((unsigned long)src >= PAGE_OFFSET) {
+ memcpy(dst, src, len);
+ } else {
+ if (copy_from_user(dst, (void __user *)src, len))
+ return -EFAULT;
+ }
+ }
+
+ return 0;
+}
+
+static int process_dma_handle(struct fastrpc_invoke_context *ctx, int i,
+ union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages)
+{
+ if (ctx->args[i].fd > 0) {
+ struct drm_gem_object *gem_obj;
+ struct qda_gem_obj *qda_gem_obj;
+ int err;
+
+ err = get_gem_obj_from_handle(ctx->file_priv, ctx->args[i].fd, &gem_obj);
+ if (err)
+ return err;
+
+ ctx->gem_objs[i] = gem_obj;
+ qda_gem_obj = to_qda_gem_obj(gem_obj);
+
+ setup_pages_from_gem_obj(qda_gem_obj, &pages[i]);
+
+ rpra[i].dma.fd = ctx->args[i].fd;
+ rpra[i].dma.len = ctx->args[i].length;
+ rpra[i].dma.offset = (u64)ctx->args[i].ptr;
+ } else {
+ rpra[i].buf.pv = ctx->args[i].ptr;
+ rpra[i].buf.len = ctx->args[i].length;
+ }
+
+ return 0;
+}
+
+int fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size)
+{
+ ctx->inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
+ ctx->metalen = fastrpc_get_meta_size(ctx);
+ ctx->pkt_size = fastrpc_get_payload_size(ctx, ctx->metalen);
+
+ ctx->aligned_pkt_size = PAGE_ALIGN(ctx->pkt_size);
+ if (ctx->aligned_pkt_size == 0)
+ return -EINVAL;
+
+ *out_size = ctx->aligned_pkt_size;
+ return 0;
+}
+
+static int fastrpc_get_args(struct fastrpc_invoke_context *ctx)
+{
+ union fastrpc_remote_arg *rpra;
+ struct fastrpc_invoke_buf *list;
+ struct fastrpc_phy_page *pages;
+ int i, oix, err = 0;
+ u64 rlen;
+ uintptr_t args;
+ size_t hdr_size;
+
+ ctx->inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
+ err = fastrpc_get_header_size(ctx, &hdr_size);
+ if (err)
+ return err;
+
+ ctx->msg->buf = ctx->msg_gem_obj->virt;
+ ctx->msg->phys = ctx->msg_gem_obj->dma_addr;
+
+ memset(ctx->msg->buf, 0, ctx->aligned_pkt_size);
+
+ rpra = (union fastrpc_remote_arg *)ctx->msg->buf;
+ ctx->list = fastrpc_invoke_buf_start(rpra, ctx->nscalars);
+ ctx->pages = fastrpc_phy_page_start(ctx->list, ctx->nscalars);
+ list = ctx->list;
+ pages = ctx->pages;
+ args = (uintptr_t)ctx->msg->buf + ctx->metalen;
+ rlen = ctx->pkt_size - ctx->metalen;
+ ctx->rpra = rpra;
+
+ for (oix = 0; oix < ctx->nbufs; ++oix) {
+ i = ctx->olaps[oix].raix;
+
+ rpra[i].buf.pv = 0;
+ rpra[i].buf.len = ctx->args[i].length;
+ list[i].num = ctx->args[i].length ? 1 : 0;
+ list[i].pgidx = i;
+
+ if (!ctx->args[i].length)
+ continue;
+
+ if (ctx->args[i].fd > 0)
+ err = process_fd_buffer(ctx, i, rpra, pages);
+ else
+ err = process_direct_buffer(ctx, i, oix, rpra, pages, &args, &rlen,
+ ctx->pkt_size);
+
+ if (err)
+ goto bail_gem;
+ }
+
+ for (i = ctx->nbufs; i < ctx->nscalars; ++i) {
+ list[i].num = ctx->args[i].length ? 1 : 0;
+ list[i].pgidx = i;
+
+ err = process_dma_handle(ctx, i, rpra, pages);
+ if (err)
+ goto bail_gem;
+ }
+
+ return 0;
+
+bail_gem:
+ if (ctx->msg_gem_obj) {
+ drm_gem_object_put(&ctx->msg_gem_obj->base);
+ ctx->msg_gem_obj = NULL;
+ }
+
+ return err;
+}
+
+static int fastrpc_put_args(struct fastrpc_invoke_context *ctx, struct qda_msg *msg)
+{
+ union fastrpc_remote_arg *rpra = ctx->rpra;
+ int i, err = 0;
+
+ if (!ctx || !rpra)
+ return -EINVAL;
+
+ for (i = ctx->inbufs; i < ctx->nbufs; ++i) {
+ if (ctx->args[i].fd <= 0) {
+ void *src = (void *)(uintptr_t)rpra[i].buf.pv;
+ void *dst = (void *)(uintptr_t)ctx->args[i].ptr;
+ u64 len = rpra[i].buf.len;
+
+ err = copy_to_user_or_kernel(dst, src, len);
+ if (err)
+ break;
+ }
+ }
+
+ return err;
+}
+
+int fastrpc_internal_invoke_pack(struct fastrpc_invoke_context *ctx,
+ struct qda_msg *msg)
+{
+ int err = 0;
+
+ if (ctx->handle == FASTRPC_INIT_HANDLE)
+ msg->client_id = 0;
+ else
+ msg->client_id = ctx->client_id;
+
+ ctx->msg = msg;
+
+ err = fastrpc_get_args(ctx);
+ if (err)
+ return err;
+
+ dma_wmb();
+
+ msg->tid = ctx->pid;
+ msg->ctx = ctx->ctxid | ctx->pd;
+ msg->handle = ctx->handle;
+ msg->sc = ctx->sc;
+ msg->addr = ctx->msg->phys;
+ msg->size = roundup(ctx->pkt_size, PAGE_SIZE);
+ msg->fastrpc_ctx = ctx;
+ msg->file_priv = ctx->file_priv;
+
+ return 0;
+}
+
+int fastrpc_internal_invoke_unpack(struct fastrpc_invoke_context *ctx,
+ struct qda_msg *msg)
+{
+ int err;
+
+ dma_rmb();
+
+ err = fastrpc_put_args(ctx, msg);
+ if (err)
+ return err;
+
+ err = ctx->retval;
+ return err;
+}
+
+static int fastrpc_prepare_args_init_attach(struct fastrpc_invoke_context *ctx)
+{
+ struct fastrpc_invoke_args *args;
+
+ args = kzalloc_obj(*args, GFP_KERNEL);
+ if (!args)
+ return -ENOMEM;
+
+ setup_single_arg(args, &ctx->client_id, sizeof(ctx->client_id));
+ ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_ATTACH, 1, 0);
+ ctx->args = args;
+ ctx->handle = FASTRPC_INIT_HANDLE;
+
+ return 0;
+}
+
+static int fastrpc_prepare_args_release_process(struct fastrpc_invoke_context *ctx)
+{
+ struct fastrpc_invoke_args *args;
+
+ args = kzalloc_obj(*args, GFP_KERNEL);
+ if (!args)
+ return -ENOMEM;
+
+ setup_single_arg(args, &ctx->client_id, sizeof(ctx->client_id));
+ ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_RELEASE, 1, 0);
+ ctx->args = args;
+ ctx->handle = FASTRPC_INIT_HANDLE;
+
+ return 0;
+}
+
+int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+ int err;
+
+ switch (ctx->type) {
+ case FASTRPC_RMID_INIT_ATTACH:
+ ctx->pd = ROOT_PD;
+ err = fastrpc_prepare_args_init_attach(ctx);
+ break;
+ case FASTRPC_RMID_INIT_RELEASE:
+ err = fastrpc_prepare_args_release_process(ctx);
+ break;
+ default:
+ return -EINVAL;
+ }
+
+ if (err)
+ return err;
+
+ ctx->nscalars = REMOTE_SCALARS_LENGTH(ctx->sc);
+ ctx->nbufs = REMOTE_SCALARS_INBUFS(ctx->sc) + REMOTE_SCALARS_OUTBUFS(ctx->sc);
+
+ if (ctx->nscalars) {
+ ctx->gem_objs = kcalloc(ctx->nscalars, sizeof(*ctx->gem_objs), GFP_KERNEL);
+ if (!ctx->gem_objs)
+ return -ENOMEM;
+ ctx->olaps = kcalloc(ctx->nscalars, sizeof(*ctx->olaps), GFP_KERNEL);
+ if (!ctx->olaps) {
+ kfree(ctx->gem_objs);
+ ctx->gem_objs = NULL;
+ return -ENOMEM;
+ }
+ fastrpc_get_buff_overlaps(ctx);
+ }
+
+ return err;
+}
diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
new file mode 100644
index 000000000000..744421382079
--- /dev/null
+++ b/drivers/accel/qda/qda_fastrpc.h
@@ -0,0 +1,303 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
+ */
+
+#ifndef __QDA_FASTRPC_H__
+#define __QDA_FASTRPC_H__
+
+#include <linux/completion.h>
+#include <linux/list.h>
+#include <linux/types.h>
+#include <drm/drm_drv.h>
+#include <drm/drm_file.h>
+
+/*
+ * FastRPC scalar extraction macros
+ *
+ * These macros extract different fields from the scalar value that describes
+ * the arguments passed in a FastRPC invocation.
+ */
+#define REMOTE_SCALARS_INBUFS(sc) (((sc) >> 16) & 0x0ff)
+#define REMOTE_SCALARS_OUTBUFS(sc) (((sc) >> 8) & 0x0ff)
+#define REMOTE_SCALARS_INHANDLES(sc) (((sc) >> 4) & 0x0f)
+#define REMOTE_SCALARS_OUTHANDLES(sc) ((sc) & 0x0f)
+#define REMOTE_SCALARS_LENGTH(sc) (REMOTE_SCALARS_INBUFS(sc) + \
+ REMOTE_SCALARS_OUTBUFS(sc) + \
+ REMOTE_SCALARS_INHANDLES(sc) + \
+ REMOTE_SCALARS_OUTHANDLES(sc))
+
+/* FastRPC configuration constants */
+#define FASTRPC_ALIGN 128 /* Alignment requirement */
+#define FASTRPC_MAX_FDLIST 16 /* Maximum file descriptors */
+#define FASTRPC_MAX_CRCLIST 64 /* Maximum CRC list entries */
+
+/*
+ * FastRPC scalar construction macros
+ *
+ * These macros build the scalar value that describes the arguments
+ * for a FastRPC invocation.
+ */
+#define FASTRPC_BUILD_SCALARS(attr, method, in, out, oin, oout) \
+ (((attr & 0x07) << 29) | \
+ ((method & 0x1f) << 24) | \
+ ((in & 0xff) << 16) | \
+ ((out & 0xff) << 8) | \
+ ((oin & 0x0f) << 4) | \
+ (oout & 0x0f))
+
+#define FASTRPC_SCALARS(method, in, out) \
+ FASTRPC_BUILD_SCALARS(0, method, in, out, 0, 0)
+
+/**
+ * struct fastrpc_buf_overlap - Buffer overlap tracking structure
+ *
+ * This structure tracks overlapping buffer regions to optimize memory
+ * mapping and avoid redundant mappings of the same physical memory.
+ */
+struct fastrpc_buf_overlap {
+ /* Start address of the buffer in user virtual address space */
+ u64 start;
+ /* End address of the buffer in user virtual address space */
+ u64 end;
+ /* Remote argument index associated with this overlap */
+ int raix;
+ /* Start address of the mapped region */
+ u64 mstart;
+ /* End address of the mapped region */
+ u64 mend;
+ /* Offset within the mapped region */
+ u64 offset;
+};
+
+/**
+ * struct fastrpc_remote_dmahandle - Structure to represent a remote DMA handle
+ */
+struct fastrpc_remote_dmahandle {
+ /* DMA handle file descriptor */
+ s32 fd;
+ /* DMA handle offset */
+ u32 offset;
+ /* DMA handle length */
+ u32 len;
+};
+
+/**
+ * struct fastrpc_remote_buf - Structure to represent a remote buffer
+ */
+struct fastrpc_remote_buf {
+ /* Buffer pointer */
+ u64 pv;
+ /* Length of buffer */
+ u64 len;
+};
+
+/**
+ * union fastrpc_remote_arg - Union to represent remote arguments
+ */
+union fastrpc_remote_arg {
+ /* Remote buffer */
+ struct fastrpc_remote_buf buf;
+ /* Remote DMA handle */
+ struct fastrpc_remote_dmahandle dma;
+};
+
+/**
+ * struct fastrpc_phy_page - Structure to represent a physical page
+ */
+struct fastrpc_phy_page {
+ /* Physical address */
+ u64 addr;
+ /* Size of contiguous region */
+ u64 size;
+};
+
+/**
+ * struct fastrpc_invoke_buf - Structure to represent an invoke buffer
+ */
+struct fastrpc_invoke_buf {
+ /* Number of contiguous regions */
+ u32 num;
+ /* Page index */
+ u32 pgidx;
+};
+
+/**
+ * struct qda_msg - Message structure for FastRPC communication
+ *
+ * This structure represents a message sent to or received from the remote
+ * processor via FastRPC protocol.
+ */
+struct qda_msg {
+ /* Process client ID */
+ int client_id;
+ /* Thread ID */
+ int tid;
+ /* Context identifier for matching responses */
+ u64 ctx;
+ /* Handle to invoke on remote processor */
+ u32 handle;
+ /* Scalars structure describing the data layout */
+ u32 sc;
+ /* Physical address of the message buffer */
+ u64 addr;
+ /* Size of contiguous region */
+ u64 size;
+ /* Kernel virtual address of the buffer */
+ void *buf;
+ /* Physical/DMA address of the buffer */
+ u64 phys;
+ /* Return value from remote processor */
+ int ret;
+ /* Pointer to qda_dev for context management */
+ struct qda_dev *qdev;
+ /* Back-pointer to FastRPC context */
+ struct fastrpc_invoke_context *fastrpc_ctx;
+ /* File private data for GEM object lookup */
+ struct drm_file *file_priv;
+};
+
+/**
+ * struct fastrpc_invoke_context - Remote procedure call invocation context
+ *
+ * This structure maintains all state for a single remote procedure call,
+ * including buffer management, synchronization, and result handling.
+ */
+struct fastrpc_invoke_context {
+ /* Unique context identifier for this invocation */
+ u64 ctxid;
+ /* Number of input buffers */
+ int inbufs;
+ /* Number of output buffers */
+ int outbufs;
+ /* Number of file descriptor handles */
+ int handles;
+ /* Number of scalar parameters */
+ int nscalars;
+ /* Total number of buffers (input + output) */
+ int nbufs;
+ /* Process ID of the calling process */
+ int pid;
+ /* Return value from the remote invocation */
+ int retval;
+ /* Length of metadata */
+ int metalen;
+ /* Client identifier for this session */
+ int client_id;
+ /* Protection domain identifier */
+ int pd;
+ /* Type of invocation request */
+ int type;
+ /* Scalars parameter encoding buffer information */
+ u32 sc;
+ /* Handle to the remote method being invoked */
+ u32 handle;
+ /* Pointer to CRC values for data integrity */
+ u32 *crc;
+ /* Pointer to array of file descriptors */
+ u64 *fdlist;
+ /* Size of the packet */
+ u64 pkt_size;
+ /* Aligned packet size for DMA transfers */
+ u64 aligned_pkt_size;
+ /* Array of invoke buffer descriptors */
+ struct fastrpc_invoke_buf *list;
+ /* Array of physical page descriptors for buffers */
+ struct fastrpc_phy_page *pages;
+ /* Array of physical page descriptors for input buffers */
+ struct fastrpc_phy_page *input_pages;
+ /* List node for linking contexts in a queue */
+ struct list_head node;
+ /* Completion object for synchronizing invocation */
+ struct completion work;
+ /* Pointer to the QDA message structure */
+ struct qda_msg *msg;
+ /* Array of remote procedure arguments */
+ union fastrpc_remote_arg *rpra;
+ /* Array of GEM objects for argument buffers */
+ struct drm_gem_object **gem_objs;
+ /* Pointer to user-space invoke arguments */
+ struct fastrpc_invoke_args *args;
+ /* Array of buffer overlap descriptors */
+ struct fastrpc_buf_overlap *olaps;
+ /* Reference counter for context lifetime management */
+ struct kref refcount;
+ /* GEM object for the main message buffer */
+ struct qda_gem_obj *msg_gem_obj;
+ /* DRM file private data */
+ struct drm_file *file_priv;
+ /* Pointer to request buffer */
+ void *req;
+ /* Pointer to response buffer */
+ void *rsp;
+ /* Pointer to input buffer */
+ void *inbuf;
+};
+
+/* Remote Method ID table - identifies initialization and control operations */
+#define FASTRPC_RMID_INIT_ATTACH 0 /* Attach to DSP session */
+#define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP session */
+
+/* Common handle for initialization operations */
+#define FASTRPC_INIT_HANDLE 0x1
+
+/* Protection Domain(PD) ids */
+#define ROOT_PD (0)
+
+/**
+ * fastrpc_context_free - Free an invocation context
+ * @ref: Reference counter for the context
+ *
+ * This function is called when the reference count reaches zero,
+ * releasing all resources associated with the invocation context.
+ */
+void fastrpc_context_free(struct kref *ref);
+
+/*
+ * FastRPC context and invocation management functions
+ */
+
+/**
+ * fastrpc_context_alloc - Allocate a new FastRPC invocation context
+ *
+ * Returns: Pointer to allocated context, or NULL on failure
+ */
+struct fastrpc_invoke_context *fastrpc_context_alloc(void);
+
+/**
+ * fastrpc_prepare_args - Prepare arguments for FastRPC invocation
+ * @ctx: FastRPC invocation context
+ * @argp: User-space pointer to invocation arguments
+ *
+ * Returns: 0 on success, negative error code on failure
+ */
+int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp);
+
+/**
+ * fastrpc_get_header_size - Get the size of the FastRPC message header
+ * @ctx: FastRPC invocation context
+ * @out_size: Pointer to store the header size in bytes
+ *
+ * Returns: 0 on success, negative error code on failure
+ */
+int fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size);
+
+/**
+ * fastrpc_internal_invoke_pack - Pack invocation context into message
+ * @ctx: FastRPC invocation context
+ * @msg: QDA message structure to pack into
+ *
+ * Returns: 0 on success, negative error code on failure
+ */
+int fastrpc_internal_invoke_pack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
+
+/**
+ * fastrpc_internal_invoke_unpack - Unpack response message into context
+ * @ctx: FastRPC invocation context
+ * @msg: QDA message structure to unpack from
+ *
+ * Returns: 0 on success, negative error code on failure
+ */
+int fastrpc_internal_invoke_unpack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
+
+#endif /* __QDA_FASTRPC_H__ */
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index d91983048d6c..1066ab6ddc7b 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -6,6 +6,8 @@
#include "qda_drv.h"
#include "qda_ioctl.h"
#include "qda_prime.h"
+#include "qda_fastrpc.h"
+#include "qda_rpmsg.h"
static int qda_validate_and_get_context(struct drm_device *dev, struct drm_file *file_priv,
struct qda_dev **qdev, struct qda_user **qda_user)
@@ -85,3 +87,108 @@ int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_p
{
return qda_prime_fd_to_handle(dev, file_priv, prime_fd, handle);
}
+
+static int fastrpc_context_get_id(struct fastrpc_invoke_context *ctx, struct qda_dev *qdev)
+{
+ int ret;
+ u32 id;
+
+ if (!qdev)
+ return -EINVAL;
+
+ if (atomic_read(&qdev->removing))
+ return -ENODEV;
+
+ ret = xa_alloc(&qdev->ctx_xa, &id, ctx, xa_limit_32b, GFP_KERNEL);
+ if (ret)
+ return ret;
+
+ ctx->ctxid = id << 4;
+ return 0;
+}
+
+static void fastrpc_context_put_id(struct fastrpc_invoke_context *ctx, struct qda_dev *qdev)
+{
+ if (qdev)
+ xa_erase(&qdev->ctx_xa, ctx->ctxid >> 4);
+}
+
+static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
+ struct drm_file *file_priv)
+{
+ struct qda_dev *qdev;
+ struct qda_user *qda_user;
+ struct qda_msg msg;
+ struct fastrpc_invoke_context *ctx;
+ struct drm_gem_object *gem_obj;
+ int err;
+ size_t hdr_size;
+
+ err = qda_validate_and_get_context(dev, file_priv, &qdev, &qda_user);
+ if (err)
+ return err;
+
+ ctx = fastrpc_context_alloc();
+ if (IS_ERR(ctx))
+ return PTR_ERR(ctx);
+
+ err = fastrpc_context_get_id(ctx, qdev);
+ if (err) {
+ kref_put(&ctx->refcount, fastrpc_context_free);
+ return err;
+ }
+
+ ctx->type = type;
+ ctx->file_priv = file_priv;
+ ctx->client_id = qda_user->client_id;
+
+ err = fastrpc_prepare_args(ctx, (char __user *)data);
+ if (err)
+ goto err_context_free;
+
+ err = fastrpc_get_header_size(ctx, &hdr_size);
+ if (err)
+ goto err_context_free;
+
+ gem_obj = qda_gem_create_object(qdev->drm_dev,
+ qdev->drm_priv->iommu_mgr,
+ hdr_size, file_priv);
+ if (IS_ERR(gem_obj)) {
+ err = PTR_ERR(gem_obj);
+ goto err_context_free;
+ }
+
+ ctx->msg_gem_obj = to_qda_gem_obj(gem_obj);
+
+ err = fastrpc_internal_invoke_pack(ctx, &msg);
+ if (err)
+ goto err_context_free;
+
+ err = qda_rpmsg_send_msg(qdev, &msg);
+ if (err)
+ goto err_context_free;
+
+ err = qda_rpmsg_wait_for_rsp(ctx);
+ if (err)
+ goto err_context_free;
+
+ err = fastrpc_internal_invoke_unpack(ctx, &msg);
+ if (err)
+ goto err_context_free;
+
+err_context_free:
+ fastrpc_context_put_id(ctx, qdev);
+ kref_put(&ctx->refcount, fastrpc_context_free);
+
+ return err;
+}
+
+int qda_ioctl_attach(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+ return fastrpc_invoke(FASTRPC_RMID_INIT_ATTACH, dev, data, file_priv);
+}
+
+int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv)
+{
+ return fastrpc_invoke(FASTRPC_RMID_INIT_RELEASE, qdev->drm_dev, NULL, file_priv);
+}
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index d454256f5fc5..044c616a51c6 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -38,4 +38,29 @@ int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_pr
int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
int prime_fd, u32 *handle);
+/**
+ * qda_ioctl_attach - Attach to DSP root protection domain
+ * @dev: DRM device structure
+ * @data: User-space data for the attach operation
+ * @file_priv: DRM file private data
+ *
+ * This IOCTL handler attaches to the DSP root PD (Protection Domain)
+ * to enable communication between the host and DSP.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_attach(struct drm_device *dev, void *data, struct drm_file *file_priv);
+
+/**
+ * fastrpc_release_current_dsp_process - Release DSP process resources
+ * @qdev: QDA device structure
+ * @file_priv: DRM file private data
+ *
+ * This function releases all resources associated with a DSP process
+ * when a user-space client closes its file descriptor.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv);
+
#endif /* _QDA_IOCTL_H */
diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
index b2b44b4d3ca8..96a08d753271 100644
--- a/drivers/accel/qda/qda_rpmsg.c
+++ b/drivers/accel/qda/qda_rpmsg.c
@@ -5,7 +5,11 @@
#include <linux/of_platform.h>
#include <linux/of.h>
#include <linux/of_device.h>
+#include <linux/completion.h>
+#include <linux/wait.h>
+#include <linux/sched.h>
#include "qda_drv.h"
+#include "qda_fastrpc.h"
#include "qda_rpmsg.h"
#include "qda_cb.h"
@@ -15,7 +19,104 @@ static int qda_rpmsg_init(struct qda_dev *qdev)
return 0;
}
-/* Utility function to allocate and initialize qda_dev */
+static int validate_device_availability(struct qda_dev *qdev)
+{
+ struct rpmsg_device *rpdev;
+
+ if (!qdev)
+ return -ENODEV;
+
+ if (atomic_read(&qdev->removing)) {
+ qda_dbg(qdev, "RPMsg device unavailable: removing\n");
+ return -ENODEV;
+ }
+
+ mutex_lock(&qdev->lock);
+ rpdev = qdev->rpdev;
+ mutex_unlock(&qdev->lock);
+
+ if (!rpdev) {
+ qda_dbg(qdev, "RPMsg device unavailable: rpdev is NULL\n");
+ return -ENODEV;
+ }
+
+ return 0;
+}
+
+static struct fastrpc_invoke_context *get_and_validate_context(struct qda_msg *msg,
+ struct qda_dev *qdev)
+{
+ struct fastrpc_invoke_context *ctx = msg->fastrpc_ctx;
+
+ if (!ctx) {
+ qda_dbg(qdev, "FastRPC context not found in message\n");
+ return ERR_PTR(-EINVAL);
+ }
+
+ kref_get(&ctx->refcount);
+ return ctx;
+}
+
+static void populate_fastrpc_msg(struct fastrpc_msg *dst, struct qda_msg *src)
+{
+ dst->client_id = src->client_id;
+ dst->tid = src->tid;
+ dst->ctx = src->ctx;
+ dst->handle = src->handle;
+ dst->sc = src->sc;
+ dst->addr = src->addr;
+ dst->size = src->size;
+}
+
+static int validate_callback_params(struct qda_dev *qdev, void *data, int len)
+{
+ if (!qdev)
+ return -ENODEV;
+
+ if (atomic_read(&qdev->removing))
+ return -ENODEV;
+
+ if (len < sizeof(struct qda_invoke_rsp)) {
+ qda_dbg(qdev, "Invalid message size from remote: %d\n", len);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static unsigned long extract_context_id(struct qda_invoke_rsp *resp_msg)
+{
+ return (resp_msg->ctx & 0xFF0) >> 4;
+}
+
+static struct fastrpc_invoke_context *find_context_by_id(struct qda_dev *qdev,
+ unsigned long ctxid)
+{
+ struct fastrpc_invoke_context *ctx;
+
+ {
+ unsigned long flags;
+
+ xa_lock_irqsave(&qdev->ctx_xa, flags);
+ ctx = xa_load(&qdev->ctx_xa, ctxid);
+ xa_unlock_irqrestore(&qdev->ctx_xa, flags);
+ }
+
+ if (!ctx) {
+ qda_dbg(qdev, "FastRPC context not found for ctxid: %lu\n", ctxid);
+ return ERR_PTR(-ENOENT);
+ }
+
+ return ctx;
+}
+
+static void complete_context_processing(struct fastrpc_invoke_context *ctx, int retval)
+{
+ ctx->retval = retval;
+ complete(&ctx->work);
+ kref_put(&ctx->refcount, fastrpc_context_free);
+}
+
static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
{
struct qda_dev *qdev;
@@ -62,9 +163,68 @@ static int qda_populate_child_devices(struct qda_dev *qdev, struct device_node *
return success > 0 ? 0 : (count > 0 ? -ENODEV : 0);
}
+int qda_rpmsg_send_msg(struct qda_dev *qdev, struct qda_msg *msg)
+{
+ int ret;
+ struct fastrpc_invoke_context *ctx;
+ struct fastrpc_msg msg1;
+ struct rpmsg_device *rpdev;
+
+ ret = validate_device_availability(qdev);
+ if (ret)
+ return ret;
+
+ ctx = get_and_validate_context(msg, qdev);
+ if (IS_ERR(ctx))
+ return PTR_ERR(ctx);
+
+ populate_fastrpc_msg(&msg1, msg);
+
+ mutex_lock(&qdev->lock);
+ rpdev = qdev->rpdev;
+ if (!rpdev) {
+ mutex_unlock(&qdev->lock);
+ kref_put(&ctx->refcount, fastrpc_context_free);
+ return -ENODEV;
+ }
+
+ ret = rpmsg_send(rpdev->ept, (void *)&msg1, sizeof(msg1));
+ mutex_unlock(&qdev->lock);
+
+ if (ret) {
+ qda_err(qdev, "rpmsg_send failed: %d\n", ret);
+ kref_put(&ctx->refcount, fastrpc_context_free);
+ return ret;
+ }
+
+ return 0;
+}
+
+int qda_rpmsg_wait_for_rsp(struct fastrpc_invoke_context *ctx)
+{
+ return wait_for_completion_interruptible(&ctx->work);
+}
+
static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, void *priv, u32 src)
{
- /* Dummy function for rpmsg driver */
+ struct qda_dev *qdev = dev_get_drvdata(&rpdev->dev);
+ struct qda_invoke_rsp *resp_msg = (struct qda_invoke_rsp *)data;
+ struct fastrpc_invoke_context *ctx;
+ unsigned long ctxid;
+ int ret;
+
+ ret = validate_callback_params(qdev, data, len);
+ if (ret)
+ return ret;
+
+ ctxid = extract_context_id(resp_msg);
+
+ ctx = find_context_by_id(qdev, ctxid);
+ if (IS_ERR(ctx))
+ return PTR_ERR(ctx);
+
+ complete_context_processing(ctx, resp_msg->retval);
+
return 0;
}
diff --git a/drivers/accel/qda/qda_rpmsg.h b/drivers/accel/qda/qda_rpmsg.h
index 348827bff255..b3e76e44f4cd 100644
--- a/drivers/accel/qda/qda_rpmsg.h
+++ b/drivers/accel/qda/qda_rpmsg.h
@@ -7,6 +7,46 @@
#define __QDA_RPMSG_H__
#include "qda_drv.h"
+#include "qda_fastrpc.h"
+
+/**
+ * struct fastrpc_msg - FastRPC message structure for remote invocations
+ *
+ * This structure represents a FastRPC message sent to the remote processor
+ * via RPMsg transport layer.
+ */
+struct fastrpc_msg {
+ /* Process client ID */
+ int client_id;
+ /* Thread ID */
+ int tid;
+ /* Context identifier for matching request/response */
+ u64 ctx;
+ /* Handle to invoke on remote processor */
+ u32 handle;
+ /* Scalars structure describing the data layout */
+ u32 sc;
+ /* Physical address of the message buffer */
+ u64 addr;
+ /* Size of contiguous region */
+ u64 size;
+};
+
+/**
+ * struct qda_invoke_rsp - Response structure for FastRPC invocations
+ */
+struct qda_invoke_rsp {
+ /* Invoke caller context for matching request/response */
+ u64 ctx;
+ /* Return value from the remote invocation */
+ int retval;
+};
+
+/*
+ * RPMsg transport layer functions
+ */
+int qda_rpmsg_send_msg(struct qda_dev *qdev, struct qda_msg *msg);
+int qda_rpmsg_wait_for_rsp(struct fastrpc_invoke_context *ctx);
/*
* Transport layer registration
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index ed24a7f5637e..4d3666c5b998 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -21,6 +21,7 @@ extern "C" {
#define DRM_QDA_QUERY 0x00
#define DRM_QDA_GEM_CREATE 0x01
#define DRM_QDA_GEM_MMAP_OFFSET 0x02
+#define DRM_QDA_INIT_ATTACH 0x03
/*
* QDA IOCTL definitions
*
@@ -33,6 +34,7 @@ extern "C" {
struct drm_qda_gem_create)
#define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
struct drm_qda_gem_mmap_offset)
+#define DRM_IOCTL_QDA_INIT_ATTACH DRM_IO(DRM_COMMAND_BASE + DRM_QDA_INIT_ATTACH)
/**
* struct drm_qda_query - Device information query structure
@@ -76,6 +78,23 @@ struct drm_qda_gem_mmap_offset {
__u64 offset;
};
+/**
+ * struct fastrpc_invoke_args - FastRPC invocation argument descriptor
+ * @ptr: Pointer to argument data (user virtual address)
+ * @length: Length of the argument data in bytes
+ * @fd: File descriptor for buffer arguments, -1 for scalar arguments
+ * @attr: Argument attributes and flags
+ *
+ * This structure describes a single argument passed to a FastRPC invocation.
+ * Arguments can be either scalar values or buffer references (via file descriptor).
+ */
+struct fastrpc_invoke_args {
+ __u64 ptr;
+ __u64 length;
+ __s32 fd;
+ __u32 attr;
+};
+
#if defined(__cplusplus)
}
#endif
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 14/18] accel/qda: Add FastRPC dynamic invocation support
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (12 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 13/18] accel/qda: Add initial FastRPC attach and release support Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 23:10 ` Dmitry Baryshkov
2026-02-23 19:09 ` [PATCH RFC 15/18] accel/qda: Add FastRPC DSP process creation support Ekansh Gupta
` (9 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Extend the QDA FastRPC implementation to support dynamic remote
procedure calls from userspace. A new DRM_QDA_INVOKE ioctl is added,
which accepts a qda_invoke_args structure containing a remote handle,
FastRPC scalars value and a pointer to an array of fastrpc_invoke_args
describing the individual arguments. The driver copies the scalar and
argument array into a fastrpc_invoke_context and reuses the existing
buffer overlap and packing logic to build a GEM-backed message buffer
for transport.
The FastRPC core gains a FASTRPC_RMID_INVOKE_DYNAMIC method type and a
fastrpc_prepare_args_invoke() helper that reads the qda_invoke_args
header and argument descriptors from user or kernel memory using a
copy_from_user_or_kernel() helper. The generic fastrpc_prepare_args()
path is updated to handle the dynamic method alongside the existing
INIT_ATTACH and INIT_RELEASE control calls, deriving the number of
buffers and scalars from the provided FastRPC scalars encoding.
On the transport side qda_ioctl_invoke() simply forwards the request
to fastrpc_invoke() with the dynamic method id, allowing the RPMsg
transport and context lookup to treat dynamic calls in the same way as
the existing control methods. This patch establishes the basic FastRPC
invoke mechanism on top of the QDA GEM and RPMsg infrastructure so
that future patches can wire up more complex DSP APIs.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/qda_drv.c | 1 +
drivers/accel/qda/qda_fastrpc.c | 48 +++++++++++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_fastrpc.h | 1 +
drivers/accel/qda/qda_ioctl.c | 5 +++++
drivers/accel/qda/qda_ioctl.h | 13 +++++++++++
include/uapi/drm/qda_accel.h | 21 ++++++++++++++++++
6 files changed, 89 insertions(+)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 3034ea660924..f94f780ea50a 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -162,6 +162,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
DRM_IOCTL_DEF_DRV(QDA_INIT_ATTACH, qda_ioctl_attach, 0),
+ DRM_IOCTL_DEF_DRV(QDA_INVOKE, qda_ioctl_invoke, 0),
};
static struct drm_driver qda_drm_driver = {
diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
index eda7c90070ee..a48b255ffb1b 100644
--- a/drivers/accel/qda/qda_fastrpc.c
+++ b/drivers/accel/qda/qda_fastrpc.c
@@ -12,6 +12,16 @@
#include "qda_gem.h"
#include "qda_memory_manager.h"
+static int copy_from_user_or_kernel(void *dst, const void __user *src, size_t size)
+{
+ if ((unsigned long)src >= PAGE_OFFSET) {
+ memcpy(dst, src, size);
+ return 0;
+ } else {
+ return copy_from_user(dst, src, size) ? -EFAULT : 0;
+ }
+}
+
static int copy_to_user_or_kernel(void __user *dst, const void *src, size_t size)
{
if ((unsigned long)dst >= PAGE_OFFSET) {
@@ -509,6 +519,41 @@ static int fastrpc_prepare_args_release_process(struct fastrpc_invoke_context *c
return 0;
}
+static int fastrpc_prepare_args_invoke(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+ struct fastrpc_invoke_args *args = NULL;
+ struct qda_invoke_args inv;
+ int err = 0;
+ int nscalars;
+
+ if (!argp)
+ return -EINVAL;
+
+ err = copy_from_user_or_kernel(&inv, argp, sizeof(inv));
+ if (err)
+ return err;
+
+ nscalars = REMOTE_SCALARS_LENGTH(inv.sc);
+
+ if (nscalars) {
+ args = kcalloc(nscalars, sizeof(*args), GFP_KERNEL);
+ if (!args)
+ return -ENOMEM;
+
+ err = copy_from_user_or_kernel(args, (const void __user *)(uintptr_t)inv.args,
+ nscalars * sizeof(*args));
+ if (err) {
+ kfree(args);
+ return err;
+ }
+ }
+ ctx->sc = inv.sc;
+ ctx->args = args;
+ ctx->handle = inv.handle;
+
+ return 0;
+}
+
int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
{
int err;
@@ -521,6 +566,9 @@ int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
case FASTRPC_RMID_INIT_RELEASE:
err = fastrpc_prepare_args_release_process(ctx);
break;
+ case FASTRPC_RMID_INVOKE_DYNAMIC:
+ err = fastrpc_prepare_args_invoke(ctx, argp);
+ break;
default:
return -EINVAL;
}
diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
index 744421382079..bcadf9437a36 100644
--- a/drivers/accel/qda/qda_fastrpc.h
+++ b/drivers/accel/qda/qda_fastrpc.h
@@ -237,6 +237,7 @@ struct fastrpc_invoke_context {
/* Remote Method ID table - identifies initialization and control operations */
#define FASTRPC_RMID_INIT_ATTACH 0 /* Attach to DSP session */
#define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP session */
+#define FASTRPC_RMID_INVOKE_DYNAMIC 0xFFFFFFFF /* Dynamic method invocation */
/* Common handle for initialization operations */
#define FASTRPC_INIT_HANDLE 0x1
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index 1066ab6ddc7b..e90aceabd30d 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -192,3 +192,8 @@ int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *f
{
return fastrpc_invoke(FASTRPC_RMID_INIT_RELEASE, qdev->drm_dev, NULL, file_priv);
}
+
+int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+ return fastrpc_invoke(FASTRPC_RMID_INVOKE_DYNAMIC, dev, data, file_priv);
+}
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index 044c616a51c6..e186c5183171 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -63,4 +63,17 @@ int qda_ioctl_attach(struct drm_device *dev, void *data, struct drm_file *file_p
*/
int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv);
+/**
+ * qda_ioctl_invoke - Invoke a remote procedure on the DSP
+ * @dev: DRM device structure
+ * @data: User-space data containing invocation parameters
+ * @file_priv: DRM file private data
+ *
+ * This IOCTL handler initiates a remote procedure call on the DSP,
+ * marshalling arguments, executing the call, and returning results.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv);
+
#endif /* _QDA_IOCTL_H */
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index 4d3666c5b998..01072a9d0a91 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -22,6 +22,9 @@ extern "C" {
#define DRM_QDA_GEM_CREATE 0x01
#define DRM_QDA_GEM_MMAP_OFFSET 0x02
#define DRM_QDA_INIT_ATTACH 0x03
+/* Indexes 0x04 to 0x06 are reserved for other requests */
+#define DRM_QDA_INVOKE 0x07
+
/*
* QDA IOCTL definitions
*
@@ -35,6 +38,8 @@ extern "C" {
#define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
struct drm_qda_gem_mmap_offset)
#define DRM_IOCTL_QDA_INIT_ATTACH DRM_IO(DRM_COMMAND_BASE + DRM_QDA_INIT_ATTACH)
+#define DRM_IOCTL_QDA_INVOKE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_INVOKE, \
+ struct qda_invoke_args)
/**
* struct drm_qda_query - Device information query structure
@@ -95,6 +100,22 @@ struct fastrpc_invoke_args {
__u32 attr;
};
+/**
+ * struct qda_invoke_args - User-space IOCTL arguments for invoking a function
+ * @handle: Handle identifying the remote function to invoke
+ * @sc: Scalars parameter encoding buffer counts and attributes
+ * @args: User-space pointer to the argument array
+ *
+ * This structure is passed from user-space to invoke a remote function
+ * on the DSP. The scalars parameter encodes the number and types of
+ * input/output buffers.
+ */
+struct qda_invoke_args {
+ __u32 handle;
+ __u32 sc;
+ __u64 args;
+};
+
#if defined(__cplusplus)
}
#endif
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 15/18] accel/qda: Add FastRPC DSP process creation support
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (13 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 14/18] accel/qda: Add FastRPC dynamic invocation support Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 16/18] accel/qda: Add FastRPC-based DSP memory mapping support Ekansh Gupta
` (8 subsequent siblings)
23 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Add support for creating a DSP process through the QDA FastRPC
interface. A new DRM_QDA_INIT_CREATE ioctl accepts a qda_init_create
structure describing the executable image, process attributes and
optional signature. The driver allocates a GEM-backed initialization
buffer, prepares a fastrpc_create_process_inbuf and a single
fastrpc_phy_page entry pointing to the initialization memory and
packages these into a set of FastRPC arguments.
The FastRPC core gains FASTRPC_RMID_INIT_CREATE and
FASTRPC_RMID_INIT_CREATE_ATTR method identifiers along with a
fastrpc_prepare_args_init_create() helper that reads the
qda_init_create parameters from user space, validates the ELF length,
optionally verifies a GEM handle for the image and fills a
FASTRPC_CREATE_PROCESS_NARGS-sized fastrpc_invoke_args array. The
scalars value is built from the FastRPC method id and buffer counts
so that the existing overlap and packing logic can treat process
creation like any other call.
On the IOCTL side qda_ioctl_create() forwards requests to
fastrpc_invoke() with the INIT_CREATE method id, ensuring that the
message buffer, per-process initialization memory and RPMsg
transport are reused for process creation in the same way as attach,
release and dynamic invocation. This patch lays the groundwork for
loading and running DSP user PDs under the QDA driver.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/qda_drv.c | 1 +
drivers/accel/qda/qda_drv.h | 2 +
drivers/accel/qda/qda_fastrpc.c | 109 ++++++++++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_fastrpc.h | 31 ++++++++++++
drivers/accel/qda/qda_ioctl.c | 28 ++++++++++-
drivers/accel/qda/qda_ioctl.h | 13 +++++
include/uapi/drm/qda_accel.h | 29 ++++++++++-
7 files changed, 211 insertions(+), 2 deletions(-)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index f94f780ea50a..2b080d5d51c5 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -162,6 +162,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
DRM_IOCTL_DEF_DRV(QDA_INIT_ATTACH, qda_ioctl_attach, 0),
+ DRM_IOCTL_DEF_DRV(QDA_INIT_CREATE, qda_ioctl_create, 0),
DRM_IOCTL_DEF_DRV(QDA_INVOKE, qda_ioctl_invoke, 0),
};
diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
index bb1d1e82036a..950e8d44995d 100644
--- a/drivers/accel/qda/qda_drv.h
+++ b/drivers/accel/qda/qda_drv.h
@@ -48,6 +48,8 @@ struct qda_user {
u32 client_id;
/* Back-pointer to device structure */
struct qda_dev *qda_dev;
+ /* GEM object for PD initialization memory */
+ struct qda_gem_obj *init_mem_gem_obj;
};
/**
diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
index a48b255ffb1b..f03dcf7e21e4 100644
--- a/drivers/accel/qda/qda_fastrpc.c
+++ b/drivers/accel/qda/qda_fastrpc.c
@@ -487,6 +487,36 @@ int fastrpc_internal_invoke_unpack(struct fastrpc_invoke_context *ctx,
return err;
}
+static void setup_create_process_args(struct fastrpc_invoke_args *args,
+ struct fastrpc_create_process_inbuf *inbuf,
+ struct qda_init_create *init,
+ struct fastrpc_phy_page *pages)
+{
+ args[0].ptr = (u64)(uintptr_t)inbuf;
+ args[0].length = sizeof(*inbuf);
+ args[0].fd = -1;
+
+ args[1].ptr = (u64)(uintptr_t)current->comm;
+ args[1].length = inbuf->namelen;
+ args[1].fd = -1;
+
+ args[2].ptr = (u64)init->file;
+ args[2].length = inbuf->filelen;
+ args[2].fd = init->filefd;
+
+ args[3].ptr = (u64)(uintptr_t)pages;
+ args[3].length = 1 * sizeof(*pages);
+ args[3].fd = -1;
+
+ args[4].ptr = (u64)(uintptr_t)&inbuf->attrs;
+ args[4].length = sizeof(inbuf->attrs);
+ args[4].fd = -1;
+
+ args[5].ptr = (u64)(uintptr_t)&inbuf->siglen;
+ args[5].length = sizeof(inbuf->siglen);
+ args[5].fd = -1;
+}
+
static int fastrpc_prepare_args_init_attach(struct fastrpc_invoke_context *ctx)
{
struct fastrpc_invoke_args *args;
@@ -554,6 +584,80 @@ static int fastrpc_prepare_args_invoke(struct fastrpc_invoke_context *ctx, char
return 0;
}
+static int fastrpc_prepare_args_init_create(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+ struct qda_init_create init;
+ struct fastrpc_invoke_args *args;
+ struct fastrpc_create_process_inbuf *inbuf;
+ int err;
+ u32 sc;
+ struct drm_gem_object *file_gem_obj = NULL;
+
+ args = kcalloc(FASTRPC_CREATE_PROCESS_NARGS, sizeof(*args), GFP_KERNEL);
+ if (!args)
+ return -ENOMEM;
+
+ ctx->input_pages = kcalloc(1, sizeof(*ctx->input_pages), GFP_KERNEL);
+ if (!ctx->input_pages) {
+ err = -ENOMEM;
+ goto err_free_args;
+ }
+
+ ctx->inbuf = kcalloc(1, sizeof(*inbuf), GFP_KERNEL);
+ if (!ctx->inbuf) {
+ err = -ENOMEM;
+ goto err_free_input_pages;
+ }
+ inbuf = ctx->inbuf;
+
+ err = copy_from_user_or_kernel(&init, argp, sizeof(init));
+ if (err)
+ goto err_free_inbuf;
+
+ if (init.filelen > INIT_FILELEN_MAX) {
+ err = -EINVAL;
+ goto err_free_inbuf;
+ }
+ inbuf->client_id = ctx->client_id;
+ inbuf->namelen = strlen(current->comm) + 1;
+ inbuf->filelen = init.filelen;
+ inbuf->pageslen = 1;
+ inbuf->attrs = init.attrs;
+ inbuf->siglen = init.siglen;
+
+ setup_pages_from_gem_obj(ctx->init_mem_gem_obj, &ctx->input_pages[0]);
+
+ if (init.filelen && init.filefd) {
+ err = get_gem_obj_from_handle(ctx->file_priv, init.filefd, &file_gem_obj);
+ if (err) {
+ err = -EINVAL;
+ goto err_free_inbuf;
+ }
+ drm_gem_object_put(file_gem_obj);
+ }
+
+ setup_create_process_args(args, inbuf, &init, ctx->input_pages);
+
+ sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE, 4, 0);
+ if (init.attrs)
+ sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_CREATE_ATTR, 4, 0);
+ ctx->sc = sc;
+ ctx->args = args;
+ ctx->handle = FASTRPC_INIT_HANDLE;
+
+ return 0;
+
+err_free_inbuf:
+ kfree(ctx->inbuf);
+ ctx->inbuf = NULL;
+err_free_input_pages:
+ kfree(ctx->input_pages);
+ ctx->input_pages = NULL;
+err_free_args:
+ kfree(args);
+ return err;
+}
+
int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
{
int err;
@@ -569,6 +673,11 @@ int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
case FASTRPC_RMID_INVOKE_DYNAMIC:
err = fastrpc_prepare_args_invoke(ctx, argp);
break;
+ case FASTRPC_RMID_INIT_CREATE:
+ case FASTRPC_RMID_INIT_CREATE_ATTR:
+ ctx->pd = USER_PD;
+ err = fastrpc_prepare_args_init_create(ctx, argp);
+ break;
default:
return -EINVAL;
}
diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
index bcadf9437a36..a8deb7efec86 100644
--- a/drivers/accel/qda/qda_fastrpc.h
+++ b/drivers/accel/qda/qda_fastrpc.h
@@ -122,6 +122,27 @@ struct fastrpc_invoke_buf {
u32 pgidx;
};
+/**
+ * struct fastrpc_create_process_inbuf - Input buffer for process creation
+ *
+ * This structure defines the input buffer format for creating a new
+ * process on the remote DSP.
+ */
+struct fastrpc_create_process_inbuf {
+ /* Client identifier for the session */
+ int client_id;
+ /* Length of the process name string */
+ u32 namelen;
+ /* Length of the shell file */
+ u32 filelen;
+ /* Length of the pages list */
+ u32 pageslen;
+ /* Process attributes flags */
+ u32 attrs;
+ /* Length of the signature data */
+ u32 siglen;
+};
+
/**
* struct qda_msg - Message structure for FastRPC communication
*
@@ -226,6 +247,8 @@ struct fastrpc_invoke_context {
struct qda_gem_obj *msg_gem_obj;
/* DRM file private data */
struct drm_file *file_priv;
+ /* GEM object for PD initialization memory */
+ struct qda_gem_obj *init_mem_gem_obj;
/* Pointer to request buffer */
void *req;
/* Pointer to response buffer */
@@ -237,6 +260,8 @@ struct fastrpc_invoke_context {
/* Remote Method ID table - identifies initialization and control operations */
#define FASTRPC_RMID_INIT_ATTACH 0 /* Attach to DSP session */
#define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP session */
+#define FASTRPC_RMID_INIT_CREATE 6 /* Create DSP process */
+#define FASTRPC_RMID_INIT_CREATE_ATTR 7 /* Create DSP process with attributes */
#define FASTRPC_RMID_INVOKE_DYNAMIC 0xFFFFFFFF /* Dynamic method invocation */
/* Common handle for initialization operations */
@@ -244,6 +269,12 @@ struct fastrpc_invoke_context {
/* Protection Domain(PD) ids */
#define ROOT_PD (0)
+#define USER_PD (1)
+
+/* Number of arguments for process creation */
+#define FASTRPC_CREATE_PROCESS_NARGS 6
+/* Maximum initialization file size (4MB) */
+#define INIT_FILELEN_MAX (4 * 1024 * 1024)
/**
* fastrpc_context_free - Free an invocation context
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index e90aceabd30d..477112ad6664 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -122,7 +122,7 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
struct fastrpc_invoke_context *ctx;
struct drm_gem_object *gem_obj;
int err;
- size_t hdr_size;
+ size_t hdr_size, initmem_size = 4 * 1024 * 1024;
err = qda_validate_and_get_context(dev, file_priv, &qdev, &qda_user);
if (err)
@@ -142,6 +142,22 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
ctx->file_priv = file_priv;
ctx->client_id = qda_user->client_id;
+ if (type == FASTRPC_RMID_INIT_CREATE) {
+ struct drm_gem_object *gem_obj;
+
+ gem_obj = qda_gem_create_object(qdev->drm_dev, qdev->drm_priv->iommu_mgr,
+ initmem_size, file_priv);
+ if (IS_ERR(gem_obj)) {
+ err = PTR_ERR(gem_obj);
+ goto err_context_free;
+ }
+
+ ctx->init_mem_gem_obj = to_qda_gem_obj(gem_obj);
+ qda_user->init_mem_gem_obj = ctx->init_mem_gem_obj;
+ } else if (type == FASTRPC_RMID_INIT_RELEASE) {
+ ctx->init_mem_gem_obj = qda_user->init_mem_gem_obj;
+ }
+
err = fastrpc_prepare_args(ctx, (char __user *)data);
if (err)
goto err_context_free;
@@ -177,6 +193,11 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
goto err_context_free;
err_context_free:
+ if (type == FASTRPC_RMID_INIT_RELEASE && qda_user->init_mem_gem_obj) {
+ drm_gem_object_put(&qda_user->init_mem_gem_obj->base);
+ qda_user->init_mem_gem_obj = NULL;
+ }
+
fastrpc_context_put_id(ctx, qdev);
kref_put(&ctx->refcount, fastrpc_context_free);
@@ -197,3 +218,8 @@ int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_p
{
return fastrpc_invoke(FASTRPC_RMID_INVOKE_DYNAMIC, dev, data, file_priv);
}
+
+int qda_ioctl_create(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+ return fastrpc_invoke(FASTRPC_RMID_INIT_CREATE, dev, data, file_priv);
+}
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index e186c5183171..181ed50b19dc 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -76,4 +76,17 @@ int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *f
*/
int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv);
+/**
+ * qda_ioctl_create - Create a DSP process
+ * @dev: DRM device structure
+ * @data: User-space data containing process creation parameters
+ * @file_priv: DRM file private data
+ *
+ * This IOCTL handler creates a new process on the DSP, loading the
+ * specified executable and initializing its runtime environment.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
+
#endif /* _QDA_IOCTL_H */
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index 01072a9d0a91..2b7f500db52c 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -22,7 +22,8 @@ extern "C" {
#define DRM_QDA_GEM_CREATE 0x01
#define DRM_QDA_GEM_MMAP_OFFSET 0x02
#define DRM_QDA_INIT_ATTACH 0x03
-/* Indexes 0x04 to 0x06 are reserved for other requests */
+#define DRM_QDA_INIT_CREATE 0x04
+/* Indexes 0x05-0x06 are reserved for other requests */
#define DRM_QDA_INVOKE 0x07
/*
@@ -38,6 +39,8 @@ extern "C" {
#define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
struct drm_qda_gem_mmap_offset)
#define DRM_IOCTL_QDA_INIT_ATTACH DRM_IO(DRM_COMMAND_BASE + DRM_QDA_INIT_ATTACH)
+#define DRM_IOCTL_QDA_INIT_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_INIT_CREATE, \
+ struct qda_init_create)
#define DRM_IOCTL_QDA_INVOKE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_INVOKE, \
struct qda_invoke_args)
@@ -116,6 +119,30 @@ struct qda_invoke_args {
__u64 args;
};
+/**
+ * struct qda_init_create - Accelerator process initialization parameters
+ * @filelen: Length of the ELF file in bytes
+ * @filefd: File descriptor containing the ELF file
+ * @attrs: Process attributes flags
+ * @siglen: Length of signature data in bytes
+ * @file: Pointer to ELF file data if not using filefd
+ *
+ * This structure is used with DRM_IOCTL_QDA_INIT_CREATE to initialize
+ * a new process on the accelerator. The process code is provided either
+ * via a file descriptor (filefd, typically a GEM object) or a direct
+ * pointer (file). Set file to 0 if using filefd.
+ *
+ * The attrs field contains bit flags for debug mode, privileged execution,
+ * and other process attributes.
+ */
+struct qda_init_create {
+ __u32 filelen;
+ __s32 filefd;
+ __u32 attrs;
+ __u32 siglen;
+ __u64 file;
+};
+
#if defined(__cplusplus)
}
#endif
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 16/18] accel/qda: Add FastRPC-based DSP memory mapping support
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (14 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 15/18] accel/qda: Add FastRPC DSP process creation support Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-26 10:48 ` Krzysztof Kozlowski
2026-02-23 19:09 ` [PATCH RFC 17/18] accel/qda: Add FastRPC-based DSP memory unmapping support Ekansh Gupta
` (7 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Add a DRM_QDA_MAP ioctl and supporting FastRPC plumbing to map GEM
backed buffers into the DSP virtual address space. The new
qda_mem_map UAPI structure allows userspace to request legacy MMAP
style mappings or handle-based MEM_MAP mappings with attributes, and
encodes flags, offsets and optional virtual address hints that are
forwarded to the DSP.
On the FastRPC side new method identifiers FASTRPC_RMID_INIT_MMAP
and FASTRPC_RMID_INIT_MEM_MAP are introduced together with message
structures for map requests and responses. The fastrpc_prepare_args
path is extended to build the appropriate request headers, serialize
the physical page information derived from a GEM object into a
fastrpc_phy_page array and pack the arguments into the shared message
buffer used by the existing invoke infrastructure.
The qda_ioctl_mmap() handler dispatches mapping requests based on the
qda_mem_map request type, reusing the generic fastrpc_invoke()
machinery and the RPMsg transport to communicate with the DSP. This
provides the foundation for explicit buffer mapping into the DSP
address space for subsequent FastRPC calls, aligned with the
traditional FastRPC user space model.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
arch/arm64/configs/defconfig | 2 +
drivers/accel/qda/qda_drv.c | 1 +
drivers/accel/qda/qda_fastrpc.c | 217 ++++++++++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_fastrpc.h | 64 ++++++++++++
drivers/accel/qda/qda_ioctl.c | 24 +++++
drivers/accel/qda/qda_ioctl.h | 13 +++
include/uapi/drm/qda_accel.h | 44 +++++++-
7 files changed, 364 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index b67d5b1fc45b..e53a7984c9be 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -1046,6 +1046,8 @@ CONFIG_DRM_TIDSS=m
CONFIG_DRM_ZYNQMP_DPSUB=m
CONFIG_DRM_ZYNQMP_DPSUB_AUDIO=y
CONFIG_DRM_POWERVR=m
+CONFIG_DRM_ACCEL=y
+CONFIG_DRM_ACCEL_QDA=m
CONFIG_FB=y
CONFIG_FB_EFI=y
CONFIG_FB_MODE_HELPERS=y
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 2b080d5d51c5..5f43c97ebc25 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -163,6 +163,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
DRM_IOCTL_DEF_DRV(QDA_INIT_ATTACH, qda_ioctl_attach, 0),
DRM_IOCTL_DEF_DRV(QDA_INIT_CREATE, qda_ioctl_create, 0),
+ DRM_IOCTL_DEF_DRV(QDA_MAP, qda_ioctl_mmap, 0),
DRM_IOCTL_DEF_DRV(QDA_INVOKE, qda_ioctl_invoke, 0),
};
diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
index f03dcf7e21e4..25b5d53ba2d6 100644
--- a/drivers/accel/qda/qda_fastrpc.c
+++ b/drivers/accel/qda/qda_fastrpc.c
@@ -487,6 +487,40 @@ int fastrpc_internal_invoke_unpack(struct fastrpc_invoke_context *ctx,
return err;
}
+static int fastrpc_return_result_mem_map(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+ struct qda_mem_map margs;
+ struct fastrpc_map_rsp_msg *rsp_msg;
+ int err;
+
+ rsp_msg = ctx->rsp;
+
+ err = copy_from_user_or_kernel(&margs, argp, sizeof(margs));
+ if (err)
+ return err;
+
+ margs.vaddrout = rsp_msg->vaddrout;
+
+ err = copy_to_user_or_kernel(argp, &margs, sizeof(margs));
+ return err;
+}
+
+int fastrpc_return_result(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+ int err = 0;
+
+ switch (ctx->type) {
+ case FASTRPC_RMID_INIT_MMAP:
+ case FASTRPC_RMID_INIT_MEM_MAP:
+ err = fastrpc_return_result_mem_map(ctx, argp);
+ break;
+ default:
+ break;
+ }
+
+ return err;
+}
+
static void setup_create_process_args(struct fastrpc_invoke_args *args,
struct fastrpc_create_process_inbuf *inbuf,
struct qda_init_create *init,
@@ -517,6 +551,29 @@ static void setup_create_process_args(struct fastrpc_invoke_args *args,
args[5].fd = -1;
}
+static int setup_mmap_pages(struct drm_file *file_priv, int fd, struct fastrpc_phy_page *pages)
+{
+ struct drm_gem_object *gem_obj;
+ struct qda_gem_obj *qda_gem_obj;
+ int err;
+
+ if (fd <= 0) {
+ pages->addr = 0;
+ pages->size = 0;
+ return 0;
+ }
+
+ err = get_gem_obj_from_handle(file_priv, fd, &gem_obj);
+ if (err)
+ return err;
+
+ qda_gem_obj = to_qda_gem_obj(gem_obj);
+ setup_pages_from_gem_obj(qda_gem_obj, pages);
+
+ drm_gem_object_put(gem_obj);
+ return 0;
+}
+
static int fastrpc_prepare_args_init_attach(struct fastrpc_invoke_context *ctx)
{
struct fastrpc_invoke_args *args;
@@ -658,6 +715,160 @@ static int fastrpc_prepare_args_init_create(struct fastrpc_invoke_context *ctx,
return err;
}
+static int fastrpc_prepare_args_map(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+ struct qda_mem_map margs;
+ struct fastrpc_invoke_args *args;
+ void *req, *rsp;
+ struct fastrpc_map_req_msg *req_msg;
+ struct fastrpc_map_rsp_msg *rsp_msg;
+ int err;
+
+ err = copy_from_user_or_kernel(&margs, argp, sizeof(margs));
+ if (err)
+ return err;
+
+ args = kzalloc_objs(*args, 3, GFP_KERNEL);
+ if (!args)
+ return -ENOMEM;
+
+ req = kzalloc_obj(*req_msg, GFP_KERNEL);
+ if (!req) {
+ err = -ENOMEM;
+ goto err_free_args;
+ }
+ req_msg = (struct fastrpc_map_req_msg *)req;
+
+ rsp = kzalloc_obj(*rsp_msg, GFP_KERNEL);
+ if (!rsp) {
+ err = -ENOMEM;
+ goto err_free_req;
+ }
+ rsp_msg = (struct fastrpc_map_rsp_msg *)rsp;
+
+ ctx->input_pages = kzalloc_objs(*ctx->input_pages, 1, GFP_KERNEL);
+ if (!ctx->input_pages) {
+ err = -ENOMEM;
+ goto err_free_rsp;
+ }
+
+ req_msg->client_id = ctx->client_id;
+ req_msg->flags = margs.flags;
+ req_msg->vaddr = margs.vaddrin;
+ req_msg->num = sizeof(*ctx->input_pages);
+
+ args[0].ptr = (u64)(uintptr_t)req;
+ args[0].length = sizeof(*req_msg);
+
+ err = setup_mmap_pages(ctx->file_priv, margs.fd, ctx->input_pages);
+ if (err)
+ goto err_free_input_pages;
+
+ args[1].ptr = (u64)(uintptr_t)ctx->input_pages;
+ args[1].length = sizeof(*ctx->input_pages);
+
+ args[2].ptr = (u64)(uintptr_t)rsp;
+ args[2].length = sizeof(*rsp_msg);
+
+ ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_MMAP, 2, 1);
+ ctx->args = args;
+ ctx->req = req;
+ ctx->rsp = rsp;
+ ctx->handle = FASTRPC_INIT_HANDLE;
+
+ return 0;
+
+err_free_input_pages:
+ kfree(ctx->input_pages);
+ ctx->input_pages = NULL;
+err_free_rsp:
+ kfree(rsp);
+err_free_req:
+ kfree(req);
+err_free_args:
+ kfree(args);
+ return err;
+}
+
+static int fastrpc_prepare_args_mem_map_attr(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+ struct qda_mem_map margs;
+ struct fastrpc_invoke_args *args;
+ void *req, *rsp;
+ struct fastrpc_mem_map_req_msg *req_msg;
+ struct fastrpc_map_rsp_msg *rsp_msg;
+ int err;
+
+ err = copy_from_user_or_kernel(&margs, argp, sizeof(margs));
+ if (err)
+ return err;
+
+ args = kzalloc_objs(*args, 4, GFP_KERNEL);
+ if (!args)
+ return -ENOMEM;
+
+ req = kzalloc_obj(*req_msg, GFP_KERNEL);
+ if (!req) {
+ kfree(args);
+ return -ENOMEM;
+ }
+ req_msg = (struct fastrpc_mem_map_req_msg *)req;
+
+ rsp = kzalloc_obj(*rsp_msg, GFP_KERNEL);
+ if (!rsp) {
+ kfree(args);
+ kfree(req);
+ return -ENOMEM;
+ }
+ rsp_msg = (struct fastrpc_map_rsp_msg *)rsp;
+
+ ctx->input_pages = kzalloc_objs(*ctx->input_pages, 1, GFP_KERNEL);
+ if (!ctx->input_pages) {
+ kfree(args);
+ kfree(req);
+ kfree(rsp);
+ return -ENOMEM;
+ }
+
+ req_msg->client_id = ctx->client_id;
+ req_msg->fd = margs.fd;
+ req_msg->offset = margs.offset;
+ req_msg->flags = margs.flags;
+ req_msg->vaddrin = margs.vaddrin;
+ req_msg->num = sizeof(*ctx->input_pages);
+ req_msg->data_len = 0;
+
+ args[0].ptr = (u64)(uintptr_t)req;
+ args[0].length = sizeof(*req_msg);
+
+ err = setup_mmap_pages(ctx->file_priv, margs.fd, ctx->input_pages);
+ if (err) {
+ kfree(args);
+ kfree(req);
+ kfree(rsp);
+ kfree(ctx->input_pages);
+ ctx->input_pages = NULL;
+ return err;
+ }
+
+ args[1].ptr = (u64)(uintptr_t)ctx->input_pages;
+ args[1].length = sizeof(*ctx->input_pages);
+
+ args[2].ptr = (u64)(uintptr_t)ctx->input_pages;
+ args[2].length = 0;
+
+ args[3].ptr = (u64)(uintptr_t)rsp;
+ args[3].length = sizeof(*rsp_msg);
+
+ ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_MEM_MAP, 3, 1);
+ ctx->args = args;
+ ctx->req = req;
+ ctx->rsp = rsp;
+ ctx->handle = FASTRPC_INIT_HANDLE;
+
+ return 0;
+}
+
int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
{
int err;
@@ -678,6 +889,12 @@ int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
ctx->pd = USER_PD;
err = fastrpc_prepare_args_init_create(ctx, argp);
break;
+ case FASTRPC_RMID_INIT_MMAP:
+ err = fastrpc_prepare_args_map(ctx, argp);
+ break;
+ case FASTRPC_RMID_INIT_MEM_MAP:
+ err = fastrpc_prepare_args_mem_map_attr(ctx, argp);
+ break;
default:
return -EINVAL;
}
diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
index a8deb7efec86..b45ccc77d9d1 100644
--- a/drivers/accel/qda/qda_fastrpc.h
+++ b/drivers/accel/qda/qda_fastrpc.h
@@ -260,8 +260,10 @@ struct fastrpc_invoke_context {
/* Remote Method ID table - identifies initialization and control operations */
#define FASTRPC_RMID_INIT_ATTACH 0 /* Attach to DSP session */
#define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP session */
+#define FASTRPC_RMID_INIT_MMAP 4 /* Map memory region to DSP */
#define FASTRPC_RMID_INIT_CREATE 6 /* Create DSP process */
#define FASTRPC_RMID_INIT_CREATE_ATTR 7 /* Create DSP process with attributes */
+#define FASTRPC_RMID_INIT_MEM_MAP 10 /* Map DMA buffer with attributes to DSP */
#define FASTRPC_RMID_INVOKE_DYNAMIC 0xFFFFFFFF /* Dynamic method invocation */
/* Common handle for initialization operations */
@@ -276,6 +278,59 @@ struct fastrpc_invoke_context {
/* Maximum initialization file size (4MB) */
#define INIT_FILELEN_MAX (4 * 1024 * 1024)
+/* Message structures for internal FastRPC calls */
+
+/**
+ * struct fastrpc_mem_map_req_msg - Memory map request message with attributes
+ *
+ * This message structure is sent to the DSP to request mapping
+ * of a DMA buffer with custom attributes (ATTR request).
+ */
+struct fastrpc_mem_map_req_msg {
+ /* Client identifier for the session */
+ s32 client_id;
+ /* Handle of the buffer */
+ s32 fd;
+ /* Offset within the buffer */
+ s32 offset;
+ /* Mapping flags */
+ u32 flags;
+ /* Virtual address hint for mapping */
+ u64 vaddrin;
+ /* Pages in the mapping */
+ s32 num;
+ /* Length of additional data */
+ s32 data_len;
+};
+
+/**
+ * struct fastrpc_map_req_msg - Legacy memory map request message
+ *
+ * This message structure is sent to the DSP to request mapping
+ * of a DMA buffer into the DSP's virtual address space.
+ */
+struct fastrpc_map_req_msg {
+ /* Client identifier for the session */
+ s32 client_id;
+ /* Mapping flags */
+ u32 flags;
+ /* Virtual address hint for mapping */
+ u64 vaddr;
+ /* Pages in the mapping */
+ s32 num;
+};
+
+/**
+ * struct fastrpc_map_rsp_msg - Memory map response message
+ *
+ * This message structure is returned by the DSP after successfully
+ * mapping a buffer, providing the virtual address for future access.
+ */
+struct fastrpc_map_rsp_msg {
+ /* DSP virtual address assigned to the mapped buffer */
+ u64 vaddrout;
+};
+
/**
* fastrpc_context_free - Free an invocation context
* @ref: Reference counter for the context
@@ -332,4 +387,13 @@ int fastrpc_internal_invoke_pack(struct fastrpc_invoke_context *ctx, struct qda_
*/
int fastrpc_internal_invoke_unpack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
+/**
+ * fastrpc_return_result - Return invocation result to user-space
+ * @ctx: FastRPC invocation context
+ * @argp: User-space pointer to return result
+ *
+ * Returns: 0 on success, negative error code on failure
+ */
+int fastrpc_return_result(struct fastrpc_invoke_context *ctx, char __user *argp);
+
#endif /* __QDA_FASTRPC_H__ */
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index 477112ad6664..4eb932e2c9ae 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -192,6 +192,10 @@ static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
if (err)
goto err_context_free;
+ err = fastrpc_return_result(ctx, (char __user *)data);
+ if (err)
+ goto err_context_free;
+
err_context_free:
if (type == FASTRPC_RMID_INIT_RELEASE && qda_user->init_mem_gem_obj) {
drm_gem_object_put(&qda_user->init_mem_gem_obj->base);
@@ -223,3 +227,23 @@ int qda_ioctl_create(struct drm_device *dev, void *data, struct drm_file *file_p
{
return fastrpc_invoke(FASTRPC_RMID_INIT_CREATE, dev, data, file_priv);
}
+
+int qda_ioctl_mmap(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+ struct qda_mem_map *map_req;
+
+ if (!data)
+ return -EINVAL;
+
+ map_req = (struct qda_mem_map *)data;
+
+ switch (map_req->request) {
+ case QDA_MAP_REQUEST_LEGACY:
+ return fastrpc_invoke(FASTRPC_RMID_INIT_MMAP, dev, data, file_priv);
+ case QDA_MAP_REQUEST_ATTR:
+ return fastrpc_invoke(FASTRPC_RMID_INIT_MEM_MAP, dev, data, file_priv);
+ default:
+ qda_err(NULL, "Invalid map request type: %u\n", map_req->request);
+ return -EINVAL;
+ }
+}
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index 181ed50b19dc..d402d6715b41 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -89,4 +89,17 @@ int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_p
*/
int qda_ioctl_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
+/**
+ * qda_ioctl_mmap - Map memory to DSP address space
+ * @dev: DRM device structure
+ * @data: User-space data containing memory mapping parameters
+ * @file_priv: DRM file private data
+ *
+ * This IOCTL handler maps a DMA buffer into the DSP's virtual address
+ * space, enabling the DSP to access the buffer during remote calls.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_mmap(struct drm_device *dev, void *data, struct drm_file *file_priv);
+
#endif /* _QDA_IOCTL_H */
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index 2b7f500db52c..9151ba7adfaf 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -23,7 +23,8 @@ extern "C" {
#define DRM_QDA_GEM_MMAP_OFFSET 0x02
#define DRM_QDA_INIT_ATTACH 0x03
#define DRM_QDA_INIT_CREATE 0x04
-/* Indexes 0x05-0x06 are reserved for other requests */
+#define DRM_QDA_MAP 0x05
+/* 0x06 is reserved for other request */
#define DRM_QDA_INVOKE 0x07
/*
@@ -41,9 +42,14 @@ extern "C" {
#define DRM_IOCTL_QDA_INIT_ATTACH DRM_IO(DRM_COMMAND_BASE + DRM_QDA_INIT_ATTACH)
#define DRM_IOCTL_QDA_INIT_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_INIT_CREATE, \
struct qda_init_create)
+#define DRM_IOCTL_QDA_MAP DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_MAP, struct qda_mem_map)
#define DRM_IOCTL_QDA_INVOKE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_INVOKE, \
struct qda_invoke_args)
+/* Request type definitions for qda_mem_map */
+#define QDA_MAP_REQUEST_LEGACY 1 /* Legacy MMAP operation */
+#define QDA_MAP_REQUEST_ATTR 2 /* Handle-based MEM_MAP operation with attributes */
+
/**
* struct drm_qda_query - Device information query structure
* @dsp_name: Name of DSP (e.g., "adsp", "cdsp", "cdsp1", "gdsp0", "gdsp1")
@@ -143,6 +149,42 @@ struct qda_init_create {
__u64 file;
};
+/**
+ * struct qda_mem_map - Memory mapping request structure
+ * @request: Request type (QDA_MAP_REQUEST_LEGACY or QDA_MAP_REQUEST_ATTR)
+ * @flags: Mapping flags for DSP (cache attributes, permissions)
+ * @fd: Handle of the buffer to map
+ * @attrs: Mapping attributes (used for ATTR request)
+ * @offset: Offset within buffer (used for ATTR request)
+ * @reserved: Reserved for alignment/future use
+ * @vaddrin: Optional virtual address hint for mapping
+ * @size: Size of the memory region to map in bytes
+ * @vaddrout: Output DSP virtual address after successful mapping
+ *
+ * This structure is used to request mapping of a DMA buffer into the
+ * DSP's virtual address space. The DSP will map the buffer according
+ * to the specified flags and return the virtual address in vaddrout.
+ *
+ * For QDA_MAP_REQUEST_LEGACY (value 1):
+ * - Uses fields: fd, flags, vaddrin, size, vaddrout
+ * - Legacy MMAP operation for backward compatibility
+ *
+ * For QDA_MAP_REQUEST_ATTR (value 2):
+ * - Uses all fields including attrs and offset
+ * - FD-based MEM_MAP operation with custom SMMU attributes
+ */
+struct qda_mem_map {
+ __u32 request;
+ __u32 flags;
+ __s32 fd;
+ __u32 attrs;
+ __u32 offset;
+ __u32 reserved;
+ __u64 vaddrin;
+ __u64 size;
+ __u64 vaddrout;
+};
+
#if defined(__cplusplus)
}
#endif
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 17/18] accel/qda: Add FastRPC-based DSP memory unmapping support
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (15 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 16/18] accel/qda: Add FastRPC-based DSP memory mapping support Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 18/18] MAINTAINERS: Add MAINTAINERS entry for QDA driver Ekansh Gupta
` (6 subsequent siblings)
23 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Add a DRM_QDA_MUNMAP ioctl and corresponding FastRPC plumbing to
unmap previously mapped buffers from the DSP virtual address space.
The new qda_mem_unmap UAPI structure supports both legacy unmap
semantics, where a DSP virtual address is provided directly, and
handle-based MEM_UNMAP semantics using a buffer handle, virtual
address and size.
On the FastRPC side new method identifiers FASTRPC_RMID_INIT_MUNMAP
and FASTRPC_RMID_INIT_MEM_UNMAP are introduced along with request
message structures for legacy and attribute-based unmap operations.
The fastrpc_prepare_args() path gains handlers that copy the
qda_mem_unmap parameters from user space, build the appropriate
unmap request payload and encode a single input buffer in the
scalars so that the existing invoke infrastructure can be reused.
The qda_ioctl_munmap() handler selects the appropriate FastRPC method
based on the qda_mem_unmap request type and forwards the unmap
operation through fastrpc_invoke(), allowing RPMsg to deliver the
request to the DSP. This completes the basic memory management flow
for QDA FastRPC clients by providing explicit unmap operations to
release DSP mappings established via DRM_QDA_MAP.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
drivers/accel/qda/qda_drv.c | 1 +
drivers/accel/qda/qda_fastrpc.c | 80 +++++++++++++++++++++++++++++++++++++++++
drivers/accel/qda/qda_fastrpc.h | 34 ++++++++++++++++++
drivers/accel/qda/qda_ioctl.c | 22 ++++++++++++
drivers/accel/qda/qda_ioctl.h | 13 +++++++
include/uapi/drm/qda_accel.h | 34 +++++++++++++++++-
6 files changed, 183 insertions(+), 1 deletion(-)
diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
index 5f43c97ebc25..072a788b0980 100644
--- a/drivers/accel/qda/qda_drv.c
+++ b/drivers/accel/qda/qda_drv.c
@@ -164,6 +164,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
DRM_IOCTL_DEF_DRV(QDA_INIT_ATTACH, qda_ioctl_attach, 0),
DRM_IOCTL_DEF_DRV(QDA_INIT_CREATE, qda_ioctl_create, 0),
DRM_IOCTL_DEF_DRV(QDA_MAP, qda_ioctl_mmap, 0),
+ DRM_IOCTL_DEF_DRV(QDA_MUNMAP, qda_ioctl_munmap, 0),
DRM_IOCTL_DEF_DRV(QDA_INVOKE, qda_ioctl_invoke, 0),
};
diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
index 25b5d53ba2d6..53d505b76aad 100644
--- a/drivers/accel/qda/qda_fastrpc.c
+++ b/drivers/accel/qda/qda_fastrpc.c
@@ -869,6 +869,80 @@ static int fastrpc_prepare_args_mem_map_attr(struct fastrpc_invoke_context *ctx,
return 0;
}
+static int fastrpc_prepare_args_munmap(struct fastrpc_invoke_context *ctx, char __user *argp)
+{
+ struct fastrpc_invoke_args *args;
+ struct fastrpc_munmap_req_msg *req_msg;
+ struct qda_mem_unmap uargs;
+ void *req;
+ int err;
+
+ err = copy_from_user_or_kernel(&uargs, argp, sizeof(uargs));
+ if (err)
+ return err;
+
+ args = kzalloc_obj(*args, GFP_KERNEL);
+ if (!args)
+ return -ENOMEM;
+
+ req = kzalloc_obj(*req_msg, GFP_KERNEL);
+ if (!req) {
+ kfree(args);
+ return -ENOMEM;
+ }
+ req_msg = (struct fastrpc_munmap_req_msg *)req;
+
+ req_msg->client_id = ctx->client_id;
+ req_msg->size = uargs.size;
+ req_msg->vaddr = uargs.vaddrout;
+
+ setup_single_arg(args, req_msg, sizeof(*req_msg));
+ ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_MUNMAP, 1, 0);
+ ctx->args = args;
+ ctx->req = req;
+ ctx->handle = FASTRPC_INIT_HANDLE;
+
+ return 0;
+}
+
+static int fastrpc_prepare_args_mem_unmap_attr(struct fastrpc_invoke_context *ctx,
+ char __user *argp)
+{
+ struct fastrpc_invoke_args *args;
+ struct fastrpc_mem_unmap_req_msg *req_msg;
+ struct qda_mem_unmap uargs;
+ void *req;
+ int err;
+
+ err = copy_from_user_or_kernel(&uargs, argp, sizeof(uargs));
+ if (err)
+ return err;
+
+ args = kzalloc_obj(*args, GFP_KERNEL);
+ if (!args)
+ return -ENOMEM;
+
+ req = kzalloc_obj(*req_msg, GFP_KERNEL);
+ if (!req) {
+ kfree(args);
+ return -ENOMEM;
+ }
+ req_msg = (struct fastrpc_mem_unmap_req_msg *)req;
+
+ req_msg->client_id = ctx->client_id;
+ req_msg->fd = uargs.fd;
+ req_msg->vaddrin = uargs.vaddr;
+ req_msg->len = uargs.size;
+
+ setup_single_arg(args, req_msg, sizeof(*req_msg));
+ ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_MEM_UNMAP, 1, 0);
+ ctx->args = args;
+ ctx->req = req;
+ ctx->handle = FASTRPC_INIT_HANDLE;
+
+ return 0;
+}
+
int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
{
int err;
@@ -895,6 +969,12 @@ int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
case FASTRPC_RMID_INIT_MEM_MAP:
err = fastrpc_prepare_args_mem_map_attr(ctx, argp);
break;
+ case FASTRPC_RMID_INIT_MUNMAP:
+ err = fastrpc_prepare_args_munmap(ctx, argp);
+ break;
+ case FASTRPC_RMID_INIT_MEM_UNMAP:
+ err = fastrpc_prepare_args_mem_unmap_attr(ctx, argp);
+ break;
default:
return -EINVAL;
}
diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
index b45ccc77d9d1..aa396fdc8e7f 100644
--- a/drivers/accel/qda/qda_fastrpc.h
+++ b/drivers/accel/qda/qda_fastrpc.h
@@ -261,9 +261,11 @@ struct fastrpc_invoke_context {
#define FASTRPC_RMID_INIT_ATTACH 0 /* Attach to DSP session */
#define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP session */
#define FASTRPC_RMID_INIT_MMAP 4 /* Map memory region to DSP */
+#define FASTRPC_RMID_INIT_MUNMAP 5 /* Unmap DSP memory region */
#define FASTRPC_RMID_INIT_CREATE 6 /* Create DSP process */
#define FASTRPC_RMID_INIT_CREATE_ATTR 7 /* Create DSP process with attributes */
#define FASTRPC_RMID_INIT_MEM_MAP 10 /* Map DMA buffer with attributes to DSP */
+#define FASTRPC_RMID_INIT_MEM_UNMAP 11 /* Unmap DMA buffer from DSP */
#define FASTRPC_RMID_INVOKE_DYNAMIC 0xFFFFFFFF /* Dynamic method invocation */
/* Common handle for initialization operations */
@@ -280,6 +282,38 @@ struct fastrpc_invoke_context {
/* Message structures for internal FastRPC calls */
+/**
+ * struct fastrpc_mem_unmap_req_msg - Memory unmap request message with attributes
+ *
+ * This message structure is sent to the DSP to request unmapping
+ * of a previously mapped memory region (ATTR request).
+ */
+struct fastrpc_mem_unmap_req_msg {
+ /* Client identifier for the session */
+ s32 client_id;
+ /* Handle of the buffer */
+ s32 fd;
+ /* Virtual address to unmap from DSP */
+ u64 vaddrin;
+ /* Size of the region to unmap in bytes */
+ u64 len;
+};
+
+/**
+ * struct fastrpc_munmap_req_msg - Legacy memory unmap request message
+ *
+ * This message structure is sent to the DSP to request unmapping
+ * of a previously mapped memory region.
+ */
+struct fastrpc_munmap_req_msg {
+ /* Client identifier for the session */
+ int client_id;
+ /* Virtual address to unmap from DSP */
+ u64 vaddr;
+ /* Size of the region to unmap in bytes */
+ u64 size;
+};
+
/**
* struct fastrpc_mem_map_req_msg - Memory map request message with attributes
*
diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
index 4eb932e2c9ae..a7a8ff283498 100644
--- a/drivers/accel/qda/qda_ioctl.c
+++ b/drivers/accel/qda/qda_ioctl.c
@@ -247,3 +247,25 @@ int qda_ioctl_mmap(struct drm_device *dev, void *data, struct drm_file *file_pri
return -EINVAL;
}
}
+
+int qda_ioctl_munmap(struct drm_device *dev, void *data, struct drm_file *file_priv)
+{
+ struct qda_mem_unmap *unmap_req;
+
+ if (!data)
+ return -EINVAL;
+
+ unmap_req = (struct qda_mem_unmap *)data;
+
+ switch (unmap_req->request) {
+ case QDA_MUNMAP_REQUEST_LEGACY:
+ return fastrpc_invoke(FASTRPC_RMID_INIT_MUNMAP, dev, data, file_priv);
+
+ case QDA_MUNMAP_REQUEST_ATTR:
+ return fastrpc_invoke(FASTRPC_RMID_INIT_MEM_UNMAP, dev, data, file_priv);
+
+ default:
+ qda_err(NULL, "Invalid munmap request type: %u\n", unmap_req->request);
+ return -EINVAL;
+ }
+}
diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
index d402d6715b41..759ba3b98394 100644
--- a/drivers/accel/qda/qda_ioctl.h
+++ b/drivers/accel/qda/qda_ioctl.h
@@ -102,4 +102,17 @@ int qda_ioctl_create(struct drm_device *dev, void *data, struct drm_file *file_p
*/
int qda_ioctl_mmap(struct drm_device *dev, void *data, struct drm_file *file_priv);
+/**
+ * qda_ioctl_munmap - Unmap memory from DSP address space
+ * @dev: DRM device structure
+ * @data: User-space data containing memory unmapping parameters
+ * @file_priv: DRM file private data
+ *
+ * This IOCTL handler unmaps a previously mapped buffer from the DSP's
+ * virtual address space, releasing the associated resources.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int qda_ioctl_munmap(struct drm_device *dev, void *data, struct drm_file *file_priv);
+
#endif /* _QDA_IOCTL_H */
diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
index 9151ba7adfaf..53f4a9955a87 100644
--- a/include/uapi/drm/qda_accel.h
+++ b/include/uapi/drm/qda_accel.h
@@ -24,7 +24,7 @@ extern "C" {
#define DRM_QDA_INIT_ATTACH 0x03
#define DRM_QDA_INIT_CREATE 0x04
#define DRM_QDA_MAP 0x05
-/* 0x06 is reserved for other request */
+#define DRM_QDA_MUNMAP 0x06
#define DRM_QDA_INVOKE 0x07
/*
@@ -43,6 +43,8 @@ extern "C" {
#define DRM_IOCTL_QDA_INIT_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_INIT_CREATE, \
struct qda_init_create)
#define DRM_IOCTL_QDA_MAP DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_MAP, struct qda_mem_map)
+#define DRM_IOCTL_QDA_MUNMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_MUNMAP, \
+ struct qda_mem_unmap)
#define DRM_IOCTL_QDA_INVOKE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_INVOKE, \
struct qda_invoke_args)
@@ -50,6 +52,9 @@ extern "C" {
#define QDA_MAP_REQUEST_LEGACY 1 /* Legacy MMAP operation */
#define QDA_MAP_REQUEST_ATTR 2 /* Handle-based MEM_MAP operation with attributes */
+/* Request type definitions for qda_mem_unmap */
+#define QDA_MUNMAP_REQUEST_LEGACY 1 /* Legacy MUNMAP operation */
+#define QDA_MUNMAP_REQUEST_ATTR 2 /* Handle-based MEM_UNMAP operation */
/**
* struct drm_qda_query - Device information query structure
* @dsp_name: Name of DSP (e.g., "adsp", "cdsp", "cdsp1", "gdsp0", "gdsp1")
@@ -185,6 +190,33 @@ struct qda_mem_map {
__u64 vaddrout;
};
+/**
+ * struct qda_mem_unmap - Memory unmapping request structure
+ * @request: Request type (QDA_MUNMAP_REQUEST_LEGACY or QDA_MUNMAP_REQUEST_ATTR)
+ * @fd: Handle (used for ATTR request)
+ * @vaddr: Virtual address (used for ATTR request)
+ * @vaddrout: DSP virtual address (used for LEGACY request)
+ * @size: Size of the memory region to unmap in bytes
+ *
+ * This structure is used to request unmapping of a previously mapped
+ * memory region from the DSP's virtual address space.
+ *
+ * For QDA_MUNMAP_REQUEST_LEGACY (value 1):
+ * - Uses fields: vaddrout, size
+ * - Legacy MUNMAP operation for backward compatibility
+ *
+ * For QDA_MUNMAP_REQUEST_ATTR (value 2):
+ * - Uses fields: fd, vaddr, size
+ * - Handle-based MEM_UNMAP operation
+ */
+struct qda_mem_unmap {
+ __u32 request;
+ __s32 fd;
+ __u64 vaddr;
+ __u64 vaddrout;
+ __u64 size;
+};
+
#if defined(__cplusplus)
}
#endif
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* [PATCH RFC 18/18] MAINTAINERS: Add MAINTAINERS entry for QDA driver
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (16 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 17/18] accel/qda: Add FastRPC-based DSP memory unmapping support Ekansh Gupta
@ 2026-02-23 19:09 ` Ekansh Gupta
2026-02-23 22:40 ` Dmitry Baryshkov
2026-02-23 22:03 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Bjorn Andersson
` (5 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-23 19:09 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju, Ekansh Gupta
Add a new MAINTAINERS entry for the Qualcomm DSP Accelerator (QDA)
driver. The entry lists the primary maintainer, the linux-arm-msm and
dri-devel mailing lists, and covers all source files under
drivers/accel/qda, Documentation/accel/qda and the UAPI header
include/uapi/drm/qda_accel.h.
This ensures that patches to the QDA driver and its public API are
tracked and routed to the appropriate reviewers as the driver is
integrated into the DRM accel subsystem.
Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
---
MAINTAINERS | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 71f76fddebbf..78b8b82a6370 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -21691,6 +21691,15 @@ S: Maintained
F: Documentation/devicetree/bindings/crypto/qcom-qce.yaml
F: drivers/crypto/qce/
+QUALCOMM DSP ACCELERATOR (QDA) DRIVER
+M: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
+L: linux-arm-msm@vger.kernel.org
+L: dri-devel@lists.freedesktop.org
+S: Supported
+F: Documentation/accel/qda/
+F: drivers/accel/qda/
+F: include/uapi/drm/qda_accel.h
+
QUALCOMM EMAC GIGABIT ETHERNET DRIVER
M: Timur Tabi <timur@kernel.org>
L: netdev@vger.kernel.org
--
2.34.1
^ permalink raw reply related [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
2026-02-23 19:08 ` [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs Ekansh Gupta
@ 2026-02-23 21:17 ` Dmitry Baryshkov
2026-02-25 13:57 ` Ekansh Gupta
2026-02-24 3:33 ` Trilok Soni
1 sibling, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 21:17 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:38:55AM +0530, Ekansh Gupta wrote:
> Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
> integrated in the DRM accel subsystem.
>
> The new docs introduce QDA as a DRM/accel-based implementation of
> Hexagon DSP offload that is intended as a modern alternative to the
> legacy FastRPC driver in drivers/misc. The text describes the driver
> motivation, high-level architecture and interaction with IOMMU context
> banks, GEM-based buffer management and the RPMsg transport.
>
> The user-space facing section documents the main QDA IOCTLs used to
> establish DSP sessions, manage GEM buffer objects and invoke remote
> procedures using the FastRPC protocol, along with a typical lifecycle
> example for applications.
>
> Finally, the driver is wired into the Compute Accelerators
> documentation index under Documentation/accel, and a brief debugging
> section shows how to enable dynamic debug for the QDA implementation.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> Documentation/accel/index.rst | 1 +
> Documentation/accel/qda/index.rst | 14 +++++
> Documentation/accel/qda/qda.rst | 129 ++++++++++++++++++++++++++++++++++++++
> 3 files changed, 144 insertions(+)
>
> diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
> index cbc7d4c3876a..5901ea7f784c 100644
> --- a/Documentation/accel/index.rst
> +++ b/Documentation/accel/index.rst
> @@ -10,4 +10,5 @@ Compute Accelerators
> introduction
> amdxdna/index
> qaic/index
> + qda/index
> rocket/index
> diff --git a/Documentation/accel/qda/index.rst b/Documentation/accel/qda/index.rst
> new file mode 100644
> index 000000000000..bce188f21117
> --- /dev/null
> +++ b/Documentation/accel/qda/index.rst
> @@ -0,0 +1,14 @@
> +.. SPDX-License-Identifier: GPL-2.0-only
> +
> +==============================
> + accel/qda Qualcomm DSP Driver
> +==============================
> +
> +The **accel/qda** driver provides support for Qualcomm Hexagon DSPs (Digital
> +Signal Processors) within the DRM accelerator framework. It serves as a modern
> +replacement for the legacy FastRPC driver, offering improved resource management
> +and standard subsystem integration.
> +
> +.. toctree::
> +
> + qda
> diff --git a/Documentation/accel/qda/qda.rst b/Documentation/accel/qda/qda.rst
> new file mode 100644
> index 000000000000..742159841b95
> --- /dev/null
> +++ b/Documentation/accel/qda/qda.rst
> @@ -0,0 +1,129 @@
> +.. SPDX-License-Identifier: GPL-2.0-only
> +
> +==================================
> +Qualcomm Hexagon DSP (QDA) Driver
> +==================================
> +
> +Introduction
> +============
> +
> +The **QDA** (Qualcomm DSP Accelerator) driver is a new DRM-based
> +accelerator driver for Qualcomm's Hexagon DSPs. It provides a standardized
> +interface for user-space applications to offload computational tasks ranging
> +from audio processing and sensor offload to computer vision and AI
> +inference to the Hexagon DSPs found on Qualcomm SoCs.
> +
> +This driver is designed to align with the Linux kernel's modern **Compute
> +Accelerators** subsystem (`drivers/accel/`), providing a robust and modular
> +alternative to the legacy FastRPC driver in `drivers/misc/`, offering
> +improved resource management and better integration with standard kernel
> +subsystems.
> +
> +Motivation
> +==========
> +
> +The existing FastRPC implementation in the kernel utilizes a custom character
> +device and lacks integration with modern kernel memory management frameworks.
> +The QDA driver addresses these limitations by:
> +
> +1. **Adopting the DRM accel Framework**: Leveraging standard uAPIs for device
> + management, job submission, and synchronization.
> +2. **Utilizing GEM for Memory**: Providing proper buffer object management,
> + including DMA-BUF import/export capabilities.
> +3. **Improving Isolation**: Using IOMMU context banks to enforce memory
> + isolation between different DSP user sessions.
> +
> +Key Features
> +============
> +
> +* **Standard Accelerator Interface**: Exposes a standard character device
> + node (e.g., `/dev/accel/accel0`) via the DRM subsystem.
> +* **Unified Offload Support**: Supports all DSP domains (ADSP, CDSP, SDSP,
> + GDSP) via a single driver architecture.
> +* **FastRPC Protocol**: Implements the reliable Remote Procedure Call
> + (FastRPC) protocol for communication between the application processor
> + and DSP.
> +* **DMA-BUF Interop**: Seamless sharing of memory buffers between the DSP
> + and other multimedia subsystems (GPU, Camera, Video) via standard DMA-BUFs.
> +* **Modular Design**: Clean separation between the core DRM logic, the memory
> + manager, and the RPMsg-based transport layer.
> +
> +Architecture
> +============
> +
> +The QDA driver is composed of several modular components:
> +
> +1. **Core Driver (`qda_drv`)**: Manages device registration, file operations,
> + and bridges the driver with the DRM accelerator subsystem.
> +2. **Memory Manager (`qda_memory_manager`)**: A flexible memory management
> + layer that handles IOMMU context banks. It supports pluggable backends
> + (such as DMA-coherent) to adapt to different SoC memory architectures.
> +3. **GEM Subsystem**: Implements the DRM GEM interface for buffer management:
> +
> + * **`qda_gem`**: Core GEM object management, including allocation, mmap
> + operations, and buffer lifecycle management.
> + * **`qda_prime`**: PRIME import functionality for DMA-BUF interoperability,
> + enabling seamless buffer sharing with other kernel subsystems.
> +
> +4. **Transport Layer (`qda_rpmsg`)**: Abstraction over the RPMsg framework
> + to handle low-level message passing with the DSP firmware.
> +5. **Compute Bus (`qda_compute_bus`)**: A custom virtual bus used to
> + enumerate and manage the specific compute context banks defined in the
> + device tree.
I'm really not sure if it's a bonus or not. I'm waiting for iommu-map
improvements to land to send patches reworking FastRPC CB from using
probe into being created by the main driver: it would remove some of the
possible race conditions between main driver finishing probe and the CB
devices probing in the background.
What's the actual benefit of the CB bus?
> +6. **FastRPC Core (`qda_fastrpc`)**: Implements the protocol logic for
> + marshalling arguments and handling remote invocations.
> +
> +User-Space API
> +==============
> +
> +The driver exposes a set of DRM-compliant IOCTLs. Note that these are designed
> +to be familiar to existing FastRPC users while adhering to DRM standards.
> +
> +* `DRM_IOCTL_QDA_QUERY`: Query DSP type (e.g., "cdsp", "adsp")
> + and capabilities.
> +* `DRM_IOCTL_QDA_INIT_ATTACH`: Attach a user session to the DSP's protection
> + domain.
> +* `DRM_IOCTL_QDA_INIT_CREATE`: Initialize a new process context on the DSP.
You need to explain the difference between these two.
> +* `DRM_IOCTL_QDA_INVOKE`: Submit a remote method invocation (the primary
> + execution unit).
> +* `DRM_IOCTL_QDA_GEM_CREATE`: Allocate a GEM buffer object for DSP usage.
> +* `DRM_IOCTL_QDA_GEM_MMAP_OFFSET`: Retrieve mmap offsets for memory mapping.
> +* `DRM_IOCTL_QDA_MAP` / `DRM_IOCTL_QDA_MUNMAP`: Map or unmap buffers into the
> + DSP's virtual address space.
Do we need to make this separate? Can we map/unmap buffers on their
usage? Or when they are created? I'm thinking about that the
virtualization. An alternative approach would be to merge
GET_MMAP_OFFSET with _MAP: once you map it to the DSP memory, you will
get the offset.
> +
> +Usage Example
> +=============
> +
> +A typical lifecycle for a user-space application:
> +
> +1. **Discovery**: Open `/dev/accel/accel*` and check
> + `DRM_IOCTL_QDA_QUERY` to find the desired DSP (e.g., CDSP for
> + compute workloads).
> +2. **Initialization**: Call `DRM_IOCTL_QDA_INIT_ATTACH` and
> + `DRM_IOCTL_QDA_INIT_CREATE` to establish a session.
> +3. **Memory**: Allocate buffers via `DRM_IOCTL_QDA_GEM_CREATE` or import
> + DMA-BUFs (PRIME fd) from other drivers using `DRM_IOCTL_PRIME_FD_TO_HANDLE`.
> +4. **Execution**: Use `DRM_IOCTL_QDA_INVOKE` to pass arguments and execute
> + functions on the DSP.
> +5. **Cleanup**: Close file descriptors to automatically release resources and
> + detach the session.
> +
> +Internal Implementation
> +=======================
> +
> +Memory Management
> +-----------------
> +The driver's memory manager creates virtual "IOMMU devices" that map to
> +hardware context banks. This allows the driver to manage multiple isolated
> +address spaces. The implementation currently uses a **DMA-coherent backend**
> +to ensure data consistency between the CPU and DSP without manual cache
> +maintenance in most cases.
> +
> +Debugging
> +=========
> +The driver includes extensive dynamic debug support. Enable it via the
> +kernel's dynamic debug control:
> +
> +.. code-block:: bash
> +
> + echo "file drivers/accel/qda/* +p" > /sys/kernel/debug/dynamic_debug/control
Please add documentation on how to build the test apps and how to load
them to the DSP.
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
2026-02-23 19:08 ` [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator Ekansh Gupta
@ 2026-02-23 21:23 ` Dmitry Baryshkov
2026-02-23 21:50 ` Bjorn Andersson
2026-02-25 17:16 ` Ekansh Gupta
0 siblings, 2 replies; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 21:23 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:38:57AM +0530, Ekansh Gupta wrote:
> Extend the Qualcomm DSP accelerator (QDA) driver with an RPMsg-based
> transport used to discover and manage DSP instances.
>
> This patch introduces:
>
> - A core qda_dev structure with basic device state (rpmsg device,
> device pointer, lock, removal flag, DSP name).
> - Logging helpers that integrate with dev_* when a device is available
> and fall back to pr_* otherwise.
> - An RPMsg client driver that binds to the Qualcomm FastRPC service and
> allocates a qda_dev instance using devm_kzalloc().
> - Basic device initialization and teardown paths wired into the module
> init/exit.
>
> The RPMsg driver currently sets the DSP name from a "label" property in
> the device tree, which will be used by subsequent patches to distinguish
> between different DSP domains (e.g. ADSP, CDSP).
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/Kconfig | 1 +
> drivers/accel/qda/Makefile | 4 +-
> drivers/accel/qda/qda_drv.c | 41 ++++++++++++++-
> drivers/accel/qda/qda_drv.h | 91 ++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_rpmsg.c | 119 ++++++++++++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_rpmsg.h | 17 ++++++
> 6 files changed, 270 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig
> index 3c78ff6189e0..484d21ff1b55 100644
> --- a/drivers/accel/qda/Kconfig
> +++ b/drivers/accel/qda/Kconfig
> @@ -7,6 +7,7 @@ config DRM_ACCEL_QDA
> tristate "Qualcomm DSP accelerator"
> depends on DRM_ACCEL
> depends on ARCH_QCOM || COMPILE_TEST
> + depends on RPMSG
> help
> Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs.
> This driver provides a standardized interface for offloading computational
> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
> index 573711af1d28..e7f23182589b 100644
> --- a/drivers/accel/qda/Makefile
> +++ b/drivers/accel/qda/Makefile
> @@ -5,4 +5,6 @@
>
> obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
>
> -qda-y := qda_drv.o
> +qda-y := \
> + qda_drv.o \
Squash these parts into the previous patch.
> + qda_rpmsg.o \
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> index 18b0d3fb1598..389c66a9ad4f 100644
> --- a/drivers/accel/qda/qda_drv.c
> +++ b/drivers/accel/qda/qda_drv.c
> @@ -2,16 +2,53 @@
> // Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> #include <linux/module.h>
> #include <linux/kernel.h>
> +#include <linux/atomic.h>
> +#include "qda_drv.h"
> +#include "qda_rpmsg.h"
> +
> +static void cleanup_device_resources(struct qda_dev *qdev)
> +{
> + mutex_destroy(&qdev->lock);
> +}
> +
> +void qda_deinit_device(struct qda_dev *qdev)
> +{
> + cleanup_device_resources(qdev);
> +}
> +
> +/* Initialize device resources */
> +static void init_device_resources(struct qda_dev *qdev)
> +{
> + qda_dbg(qdev, "Initializing device resources\n");
> +
> + mutex_init(&qdev->lock);
> + atomic_set(&qdev->removing, 0);
> +}
> +
> +int qda_init_device(struct qda_dev *qdev)
> +{
> + init_device_resources(qdev);
> +
> + qda_dbg(qdev, "QDA device initialized successfully\n");
> + return 0;
> +}
>
> static int __init qda_core_init(void)
> {
> - pr_info("QDA: driver initialization complete\n");
> + int ret;
> +
> + ret = qda_rpmsg_register();
> + if (ret)
> + return ret;
> +
> + qda_info(NULL, "QDA driver initialization complete\n");
> return 0;
> }
>
> static void __exit qda_core_exit(void)
> {
> - pr_info("QDA: driver exit complete\n");
> + qda_rpmsg_unregister();
> + qda_info(NULL, "QDA driver exit complete\n");
> }
>
> module_init(qda_core_init);
> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> new file mode 100644
> index 000000000000..bec2d31ca1bb
> --- /dev/null
> +++ b/drivers/accel/qda/qda_drv.h
> @@ -0,0 +1,91 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +
> +#ifndef __QDA_DRV_H__
> +#define __QDA_DRV_H__
> +
> +#include <linux/device.h>
> +#include <linux/mutex.h>
> +#include <linux/rpmsg.h>
> +#include <linux/xarray.h>
> +
> +/* Driver identification */
> +#define DRIVER_NAME "qda"
> +
> +/* struct qda_dev - Main device structure for QDA driver */
> +struct qda_dev {
> + /* RPMsg device for communication with remote processor */
> + struct rpmsg_device *rpdev;
> + /* Underlying device structure */
> + struct device *dev;
> + /* Mutex protecting device state */
> + struct mutex lock;
Which parts of the state?
> + /* Flag indicating device removal in progress */
> + atomic_t removing;
Why do you need it if we have dev->unplugged and drm_dev_enter() /
drm_dev_exit()?
> + /* Name of the DSP (e.g., "cdsp", "adsp") */
> + char dsp_name[16];
Please replace with the pointers to the static array.
> +};
> +
> +/**
> + * qda_get_log_device - Get appropriate device for logging
> + * @qdev: QDA device structure
> + *
> + * Returns the most appropriate device structure for logging messages.
> + * Prefers qdev->dev, or returns NULL if the device is being removed
> + * or invalid.
> + */
> +static inline struct device *qda_get_log_device(struct qda_dev *qdev)
> +{
> + if (!qdev || atomic_read(&qdev->removing))
> + return NULL;
> +
> + if (qdev->dev)
> + return qdev->dev;
> +
> + return NULL;
> +}
> +
> +/*
> + * Logging macros
> + *
> + * These macros provide consistent logging across the driver with automatic
> + * function name inclusion. They use dev_* functions when a device is available,
> + * falling back to pr_* functions otherwise.
> + */
> +
> +/* Error logging - always logs and tracks errors */
> +#define qda_err(qdev, fmt, ...) do { \
> + struct device *__dev = qda_get_log_device(qdev); \
> + if (__dev) \
> + dev_err(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
> + else \
> + pr_err(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
What /why? You are under drm, so you can use drm_* helpers instead.
> +} while (0)
> +
> +/* Info logging - always logs, can be filtered via loglevel */
> +#define qda_info(qdev, fmt, ...) do { \
> + struct device *__dev = qda_get_log_device(qdev); \
> + if (__dev) \
> + dev_info(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
> + else \
> + pr_info(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
> +} while (0)
> +
> +/* Debug logging - controlled via dynamic debug (CONFIG_DYNAMIC_DEBUG) */
> +#define qda_dbg(qdev, fmt, ...) do { \
> + struct device *__dev = qda_get_log_device(qdev); \
> + if (__dev) \
> + dev_dbg(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
> + else \
> + pr_debug(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
> +} while (0)
> +
> +/*
> + * Core device management functions
> + */
> +int qda_init_device(struct qda_dev *qdev);
> +void qda_deinit_device(struct qda_dev *qdev);
> +
> +#endif /* __QDA_DRV_H__ */
> diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
> new file mode 100644
> index 000000000000..a8b24a99ca13
> --- /dev/null
> +++ b/drivers/accel/qda/qda_rpmsg.c
> @@ -0,0 +1,119 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> +#include <linux/module.h>
> +#include <linux/rpmsg.h>
> +#include <linux/of_platform.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include "qda_drv.h"
> +#include "qda_rpmsg.h"
> +
> +static int qda_rpmsg_init(struct qda_dev *qdev)
> +{
> + dev_set_drvdata(&qdev->rpdev->dev, qdev);
> + return 0;
> +}
> +
> +/* Utility function to allocate and initialize qda_dev */
> +static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
> +{
> + struct qda_dev *qdev;
> +
> + qdev = devm_kzalloc(&rpdev->dev, sizeof(*qdev), GFP_KERNEL);
> + if (!qdev)
> + return ERR_PTR(-ENOMEM);
> +
> + qdev->dev = &rpdev->dev;
> + qdev->rpdev = rpdev;
> +
> + qda_dbg(qdev, "Allocated and initialized qda_dev\n");
> + return qdev;
> +}
> +
> +static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, void *priv, u32 src)
> +{
> + /* Dummy function for rpmsg driver */
> + return 0;
> +}
> +
> +static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
> +{
> + struct qda_dev *qdev = dev_get_drvdata(&rpdev->dev);
> +
> + qda_info(qdev, "Removing RPMsg device\n");
> +
> + atomic_set(&qdev->removing, 1);
> +
> + mutex_lock(&qdev->lock);
> + qdev->rpdev = NULL;
> + mutex_unlock(&qdev->lock);
> +
> + qda_deinit_device(qdev);
> +
> + qda_info(qdev, "RPMsg device removed\n");
> +}
> +
> +static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
> +{
> + struct qda_dev *qdev;
> + int ret;
> + const char *label;
> +
> + qda_dbg(NULL, "QDA RPMsg probe starting\n");
> +
> + qdev = alloc_and_init_qdev(rpdev);
> + if (IS_ERR(qdev))
> + return PTR_ERR(qdev);
> +
> + ret = of_property_read_string(rpdev->dev.of_node, "label", &label);
> + if (!ret) {
> + strscpy(qdev->dsp_name, label, sizeof(qdev->dsp_name));
> + } else {
> + qda_info(qdev, "QDA DSP label not found in DT\n");
> + return ret;
> + }
> +
> + ret = qda_rpmsg_init(qdev);
> + if (ret) {
> + qda_err(qdev, "RPMsg init failed: %d\n", ret);
> + return ret;
> + }
> +
> + ret = qda_init_device(qdev);
> + if (ret)
> + return ret;
> +
> + qda_info(qdev, "QDA RPMsg probe completed successfully for %s\n", qdev->dsp_name);
> + return 0;
> +}
> +
> +static const struct of_device_id qda_rpmsg_id_table[] = {
> + { .compatible = "qcom,fastrpc" },
> + {},
> +};
> +MODULE_DEVICE_TABLE(of, qda_rpmsg_id_table);
> +
> +static struct rpmsg_driver qda_rpmsg_driver = {
> + .probe = qda_rpmsg_probe,
> + .remove = qda_rpmsg_remove,
> + .callback = qda_rpmsg_cb,
> + .drv = {
> + .name = "qcom,fastrpc",
> + .of_match_table = qda_rpmsg_id_table,
> + },
> +};
> +
> +int qda_rpmsg_register(void)
> +{
> + int ret = register_rpmsg_driver(&qda_rpmsg_driver);
> +
> + if (ret)
> + qda_err(NULL, "Failed to register RPMsg driver: %d\n", ret);
> +
> + return ret;
> +}
> +
> +void qda_rpmsg_unregister(void)
> +{
> + unregister_rpmsg_driver(&qda_rpmsg_driver);
> +}
> diff --git a/drivers/accel/qda/qda_rpmsg.h b/drivers/accel/qda/qda_rpmsg.h
> new file mode 100644
> index 000000000000..348827bff255
> --- /dev/null
> +++ b/drivers/accel/qda/qda_rpmsg.h
> @@ -0,0 +1,17 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +
> +#ifndef __QDA_RPMSG_H__
> +#define __QDA_RPMSG_H__
> +
> +#include "qda_drv.h"
> +
> +/*
> + * Transport layer registration
> + */
> +int qda_rpmsg_register(void);
> +void qda_rpmsg_unregister(void);
> +
> +#endif /* __QDA_RPMSG_H__ */
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
2026-02-23 21:23 ` Dmitry Baryshkov
@ 2026-02-23 21:50 ` Bjorn Andersson
2026-02-23 22:12 ` Dmitry Baryshkov
2026-02-25 17:16 ` Ekansh Gupta
1 sibling, 1 reply; 83+ messages in thread
From: Bjorn Andersson @ 2026-02-23 21:50 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König, dri-devel, linux-doc,
linux-kernel, linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Mon, Feb 23, 2026 at 11:23:13PM +0200, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:38:57AM +0530, Ekansh Gupta wrote:
[..]
> > diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
[..]
> > +/* Error logging - always logs and tracks errors */
> > +#define qda_err(qdev, fmt, ...) do { \
> > + struct device *__dev = qda_get_log_device(qdev); \
> > + if (__dev) \
> > + dev_err(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
> > + else \
> > + pr_err(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
>
> What /why? You are under drm, so you can use drm_* helpers instead.
>
In particular, rather than rolling our own wrappers around standard
functions, just use dev_err() whenever you have a struct device. And for
something like fastrpc - life starts at some probe() and ends at some
remove() so that should be always.
Regards,
Bjorn
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 02/18] accel/qda: Add Qualcomm DSP accelerator driver skeleton
2026-02-23 19:08 ` [PATCH RFC 02/18] accel/qda: Add Qualcomm DSP accelerator driver skeleton Ekansh Gupta
@ 2026-02-23 21:52 ` Bjorn Andersson
2026-02-25 14:20 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Bjorn Andersson @ 2026-02-23 21:52 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:38:56AM +0530, Ekansh Gupta wrote:
[..]
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> new file mode 100644
> index 000000000000..18b0d3fb1598
> --- /dev/null
> +++ b/drivers/accel/qda/qda_drv.c
> @@ -0,0 +1,22 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> +#include <linux/module.h>
> +#include <linux/kernel.h>
> +
> +static int __init qda_core_init(void)
> +{
> + pr_info("QDA: driver initialization complete\n");
This print is useless as soon as you make the driver do anything, please
don't include developmental debug logs.
In fact, this patch doesn't actually do anything, please squash things a
bit to give it some meat.
Regards,
Bjorn
> + return 0;
> +}
> +
> +static void __exit qda_core_exit(void)
> +{
> + pr_info("QDA: driver exit complete\n");
> +}
> +
> +module_init(qda_core_init);
> +module_exit(qda_core_exit);
> +
> +MODULE_AUTHOR("Qualcomm AI Infra Team");
> +MODULE_DESCRIPTION("Qualcomm DSP Accelerator Driver");
> +MODULE_LICENSE("GPL");
>
> --
> 2.34.1
>
>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (17 preceding siblings ...)
2026-02-23 19:09 ` [PATCH RFC 18/18] MAINTAINERS: Add MAINTAINERS entry for QDA driver Ekansh Gupta
@ 2026-02-23 22:03 ` Bjorn Andersson
2026-03-02 8:54 ` Ekansh Gupta
2026-02-24 3:37 ` Trilok Soni
` (4 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Bjorn Andersson @ 2026-02-23 22:03 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:38:54AM +0530, Ekansh Gupta wrote:
> This patch series introduces the Qualcomm DSP Accelerator (QDA) driver,
> a modern DRM-based accelerator implementation for Qualcomm Hexagon DSPs.
> The driver provides a standardized interface for offloading computational
> tasks to DSPs found on Qualcomm SoCs, supporting all DSP domains (ADSP,
> CDSP, SDSP, GDSP).
>
> The QDA driver is designed as an alternative for the FastRPC driver
> in drivers/misc/, offering improved resource management, better integration
> with standard kernel subsystems, and alignment with the Linux kernel's
> Compute Accelerators framework.
>
If I understand correctly, this is just the same FastRPC protocol but
in the accel framework, and hence with a new userspace ABI?
I don't fancy the name "QDA" as an acronym for "FastRPC Accel".
I would much prefer to see this living in drivers/accel/fastrpc and be
named some variation of "fastrpc" (e.g. fastrpc_accel). (Driver name can
be "fastrpc" as the other one apparently is named "qcom,fastrpc").
> User-space staging branch
> ============
> https://github.com/qualcomm/fastrpc/tree/accel/staging
>
> Key Features
> ============
>
> * Standard DRM accelerator interface via /dev/accel/accelN
> * GEM-based buffer management with DMA-BUF import/export support
> * IOMMU-based memory isolation using per-process context banks
> * FastRPC protocol implementation for DSP communication
> * RPMsg transport layer for reliable message passing
> * Support for all DSP domains (ADSP, CDSP, SDSP, GDSP)
> * Comprehensive IOCTL interface for DSP operations
>
> High-Level Architecture Differences with Existing FastRPC Driver
> =================================================================
>
> The QDA driver represents a significant architectural departure from the
> existing FastRPC driver (drivers/misc/fastrpc.c), addressing several key
> limitations while maintaining protocol compatibility:
>
> 1. DRM Accelerator Framework Integration
> - FastRPC: Custom character device (/dev/fastrpc-*)
> - QDA: Standard DRM accel device (/dev/accel/accelN)
> - Benefit: Leverages established DRM infrastructure for device
> management.
>
> 2. Memory Management
> - FastRPC: Custom memory allocator with ION/DMA-BUF integration
> - QDA: Native GEM objects with full PRIME support
> - Benefit: Seamless buffer sharing using standard DRM mechanisms
>
> 3. IOMMU Context Bank Management
> - FastRPC: Direct IOMMU domain manipulation, limited isolation
> - QDA: Custom compute bus (qda_cb_bus_type) with proper device model
> - Benefit: Each CB device is a proper struct device with IOMMU group
> support, enabling better isolation and resource tracking.
> - https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualcomm.com/
>
> 4. Memory Manager Architecture
> - FastRPC: Monolithic allocator
> - QDA: Pluggable memory manager with backend abstraction
> - Benefit: Currently uses DMA-coherent backend, easily extensible for
> future memory types (e.g., carveout, CMA)
>
> 5. Transport Layer
> - FastRPC: Direct RPMsg integration in core driver
> - QDA: Abstracted transport layer (qda_rpmsg.c)
> - Benefit: Clean separation of concerns, easier to add alternative
> transports if needed
>
> 8. Code Organization
> - FastRPC: ~3000 lines in single file
> - QDA: Modular design across multiple files (~4600 lines total)
"Now 50% more LOC and you need 6 tabs open in your IDE!"
Might be better, but in itself it provides no immediate value.
> * qda_drv.c: Core driver and DRM integration
> * qda_gem.c: GEM object management
> * qda_memory_manager.c: Memory and IOMMU management
> * qda_fastrpc.c: FastRPC protocol implementation
> * qda_rpmsg.c: Transport layer
> * qda_cb.c: Context bank device management
> - Benefit: Better maintainability, clearer separation of concerns
>
> 9. UAPI Design
> - FastRPC: Custom IOCTL interface
> - QDA: DRM-style IOCTLs with proper versioning support
> - Benefit: Follows DRM conventions, easier userspace integration
>
> 10. Documentation
> - FastRPC: Minimal in-tree documentation
> - QDA: Comprehensive documentation in Documentation/accel/qda/
> - Benefit: Better developer experience, clearer API contracts
>
> 11. Buffer Reference Mechanism
> - FastRPC: Uses buffer file descriptors (FDs) for all book-keeping
> in both kernel and DSP
> - QDA: Uses GEM handles for kernel-side management, providing better
> integration with DRM subsystem
> - Benefit: Leverages DRM GEM infrastructure for reference counting,
> lifetime management, and integration with other DRM components
>
This is all good, but what is the plan regarding /dev/fastrpc-*?
The idea here clearly is to provide an alternative implementation, and
they seem to bind to the same toplevel compatible - so you can only
compile one into your kernel at any point in time.
So if I understand correctly, at some point in time we need to say
CONFIG_DRM_ACCEL_QDA=m and CONFIG_QCOM_FASTRPC=n, which will break all
existing user space applications? That's not acceptable.
Would it be possible to have a final driver that is implemented as a
accel, but provides wrappers for the legacy misc and ioctl interface to
the applications?
Regards,
Bjorn
> Key Technical Improvements
> ===========================
>
> * Proper device model: CB devices are real struct device instances on a
> custom bus, enabling proper IOMMU group management and power management
> integration
>
> * Reference-counted IOMMU devices: Multiple file descriptors from the same
> process share a single IOMMU device, reducing overhead
>
> * GEM-based buffer lifecycle: Automatic cleanup via DRM GEM reference
> counting, eliminating many resource leak scenarios
>
> * Modular memory backends: The memory manager supports pluggable backends,
> currently implementing DMA-coherent allocations with SID-prefixed
> addresses for DSP firmware
>
> * Context-based invocation tracking: XArray-based context management with
> proper synchronization and cleanup
>
> Patch Series Organization
> ==========================
>
> Patches 1-2: Driver skeleton and documentation
> Patches 3-6: RPMsg transport and IOMMU/CB infrastructure
> Patches 7-9: DRM device registration and basic IOCTL
> Patches 10-12: GEM buffer management and PRIME support
> Patches 13-17: FastRPC protocol implementation (attach, invoke, create,
> map/unmap)
> Patch 18: MAINTAINERS entry
>
> Open Items
> ===========
>
> The following items are identified as open items:
>
> 1. Privilege Level Management
> - Currently, daemon processes and user processes have the same access
> level as both use the same accel device node. This needs to be
> addressed as daemons attach to privileged DSP PDs and require
> higher privilege levels for system-level operations
> - Seeking guidance on the best approach: separate device nodes,
> capability-based checks, or DRM master/authentication mechanisms
>
> 2. UAPI Compatibility Layer
> - Add UAPI compat layer to facilitate migration of client applications
> from existing FastRPC UAPI to the new QDA accel driver UAPI,
> ensuring smooth transition for existing userspace code
> - Seeking guidance on implementation approach: in-kernel translation
> layer, userspace wrapper library, or hybrid solution
>
> 3. Documentation Improvements
> - Add detailed IOCTL usage examples
> - Document DSP firmware interface requirements
> - Create migration guide from existing FastRPC
>
> 4. Per-Domain Memory Allocation
> - Develop new userspace API to support memory allocation on a per
> domain basis, enabling domain-specific memory management and
> optimization
>
> 5. Audio and Sensors PD Support
> - The current patch series does not handle Audio PD and Sensors PD
> functionalities. These specialized protection domains require
> additional support for real-time constraints and power management
>
> Interface Compatibility
> ========================
>
> The QDA driver maintains compatibility with existing FastRPC infrastructure:
>
> * Device Tree Bindings: The driver uses the same device tree bindings as
> the existing FastRPC driver, ensuring no changes are required to device
> tree sources. The "qcom,fastrpc" compatible string and child node
> structure remain unchanged.
>
> * Userspace Interface: While the driver provides a new DRM-based UAPI,
> the underlying FastRPC protocol and DSP firmware interface remain
> compatible. This ensures that DSP firmware and libraries continue to
> work without modification.
>
> * Migration Path: The modular design allows for gradual migration, where
> both drivers can coexist during the transition period. Applications can
> be migrated incrementally to the new UAPI with the help of the planned
> compatibility layer.
>
> References
> ==========
>
> Previous discussions on this migration:
> - https://lkml.org/lkml/2024/6/24/479
> - https://lkml.org/lkml/2024/6/21/1252
>
> Testing
> =======
>
> The driver has been tested on Qualcomm platforms with:
> - Basic FastRPC attach/release operations
> - DSP process creation and initialization
> - Memory mapping/unmapping operations
> - Dynamic invocation with various buffer types
> - GEM buffer allocation and mmap
> - PRIME buffer import from other subsystems
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> Ekansh Gupta (18):
> accel/qda: Add Qualcomm QDA DSP accelerator driver docs
> accel/qda: Add Qualcomm DSP accelerator driver skeleton
> accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
> accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
> accel/qda: Create compute CB devices on QDA compute bus
> accel/qda: Add memory manager for CB devices
> accel/qda: Add DRM accel device registration for QDA driver
> accel/qda: Add per-file DRM context and open/close handling
> accel/qda: Add QUERY IOCTL and basic QDA UAPI header
> accel/qda: Add DMA-backed GEM objects and memory manager integration
> accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
> accel/qda: Add PRIME dma-buf import support
> accel/qda: Add initial FastRPC attach and release support
> accel/qda: Add FastRPC dynamic invocation support
> accel/qda: Add FastRPC DSP process creation support
> accel/qda: Add FastRPC-based DSP memory mapping support
> accel/qda: Add FastRPC-based DSP memory unmapping support
> MAINTAINERS: Add MAINTAINERS entry for QDA driver
>
> Documentation/accel/index.rst | 1 +
> Documentation/accel/qda/index.rst | 14 +
> Documentation/accel/qda/qda.rst | 129 ++++
> MAINTAINERS | 9 +
> arch/arm64/configs/defconfig | 2 +
> drivers/accel/Kconfig | 1 +
> drivers/accel/Makefile | 2 +
> drivers/accel/qda/Kconfig | 35 ++
> drivers/accel/qda/Makefile | 19 +
> drivers/accel/qda/qda_cb.c | 182 ++++++
> drivers/accel/qda/qda_cb.h | 26 +
> drivers/accel/qda/qda_compute_bus.c | 23 +
> drivers/accel/qda/qda_drv.c | 375 ++++++++++++
> drivers/accel/qda/qda_drv.h | 171 ++++++
> drivers/accel/qda/qda_fastrpc.c | 1002 ++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_fastrpc.h | 433 ++++++++++++++
> drivers/accel/qda/qda_gem.c | 211 +++++++
> drivers/accel/qda/qda_gem.h | 103 ++++
> drivers/accel/qda/qda_ioctl.c | 271 +++++++++
> drivers/accel/qda/qda_ioctl.h | 118 ++++
> drivers/accel/qda/qda_memory_dma.c | 91 +++
> drivers/accel/qda/qda_memory_dma.h | 46 ++
> drivers/accel/qda/qda_memory_manager.c | 382 ++++++++++++
> drivers/accel/qda/qda_memory_manager.h | 148 +++++
> drivers/accel/qda/qda_prime.c | 194 +++++++
> drivers/accel/qda/qda_prime.h | 43 ++
> drivers/accel/qda/qda_rpmsg.c | 327 +++++++++++
> drivers/accel/qda/qda_rpmsg.h | 57 ++
> drivers/iommu/iommu.c | 4 +
> include/linux/qda_compute_bus.h | 22 +
> include/uapi/drm/qda_accel.h | 224 +++++++
> 31 files changed, 4665 insertions(+)
> ---
> base-commit: d4906ae14a5f136ceb671bb14cedbf13fa560da6
> change-id: 20260223-qda-firstpost-4ab05249e2cc
>
> Best regards,
> --
> Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>
>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
2026-02-23 21:50 ` Bjorn Andersson
@ 2026-02-23 22:12 ` Dmitry Baryshkov
2026-02-23 22:25 ` Bjorn Andersson
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:12 UTC (permalink / raw)
To: Bjorn Andersson
Cc: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König, dri-devel, linux-doc,
linux-kernel, linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Mon, Feb 23, 2026 at 03:50:32PM -0600, Bjorn Andersson wrote:
> On Mon, Feb 23, 2026 at 11:23:13PM +0200, Dmitry Baryshkov wrote:
> > On Tue, Feb 24, 2026 at 12:38:57AM +0530, Ekansh Gupta wrote:
> [..]
> > > diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> [..]
> > > +/* Error logging - always logs and tracks errors */
> > > +#define qda_err(qdev, fmt, ...) do { \
> > > + struct device *__dev = qda_get_log_device(qdev); \
> > > + if (__dev) \
> > > + dev_err(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
> > > + else \
> > > + pr_err(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
> >
> > What /why? You are under drm, so you can use drm_* helpers instead.
> >
>
> In particular, rather than rolling our own wrappers around standard
> functions, just use dev_err() whenever you have a struct device. And for
> something like fastrpc - life starts at some probe() and ends at some
> remove() so that should be always.
I'd say differently. For the DRM devices the life cycle is centered
around the DRM device (which can outlive platform device for multiple
reasons). So, please start by registering the DRM accel device and using
it for all the logging (and btw for private data management too).
>
> Regards,
> Bjorn
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 07/18] accel/qda: Add DRM accel device registration for QDA driver
2026-02-23 19:09 ` [PATCH RFC 07/18] accel/qda: Add DRM accel device registration for QDA driver Ekansh Gupta
@ 2026-02-23 22:16 ` Dmitry Baryshkov
2026-03-02 8:33 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:16 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:01AM +0530, Ekansh Gupta wrote:
> Add DRM accel integration for the QDA DSP accelerator driver. A new
> qda_drm_priv structure is introduced to hold per-device DRM state,
> including a pointer to the memory manager and the parent qda_dev
> instance. The driver now allocates a drm_device, initializes
> driver-private state, and registers the device via the DRM accel
> infrastructure.
>
> qda_register_device() performs allocation and registration of the DRM
> device, while qda_unregister_device() handles device teardown and
> releases references using drm_dev_unregister() and drm_dev_put().
> Initialization and teardown paths are updated so DRM resources are
> allocated after IOMMU/memory-manager setup and cleaned during RPMsg
> remove.
>
> This patch lays the foundation for adding GEM buffer support and IOCTL
> handling in later patches as part of the compute accelerator interface.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/qda_drv.c | 103 ++++++++++++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_drv.h | 33 +++++++++++++-
> drivers/accel/qda/qda_rpmsg.c | 8 ++++
> 3 files changed, 142 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> index 69132737f964..a9113ec78fa2 100644
> --- a/drivers/accel/qda/qda_drv.c
> +++ b/drivers/accel/qda/qda_drv.c
> @@ -4,9 +4,31 @@
> #include <linux/kernel.h>
> #include <linux/atomic.h>
> #include <linux/slab.h>
> +#include <drm/drm_accel.h>
> +#include <drm/drm_drv.h>
> +#include <drm/drm_file.h>
> +#include <drm/drm_gem.h>
> +#include <drm/drm_ioctl.h>
> #include "qda_drv.h"
> #include "qda_rpmsg.h"
>
> +DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
> +
> +static struct drm_driver qda_drm_driver = {
> + .driver_features = DRIVER_COMPUTE_ACCEL,
> + .fops = &qda_accel_fops,
Strange indentation in the middle. Please drop it.
> + .name = DRIVER_NAME,
> + .desc = "Qualcomm DSP Accelerator Driver",
> +};
> +
> +static void cleanup_drm_private(struct qda_dev *qdev)
> +{
> + if (qdev->drm_priv) {
> + qda_dbg(qdev, "Cleaning up DRM private data\n");
> + kfree(qdev->drm_priv);
> + }
> +}
> +
> static void cleanup_iommu_manager(struct qda_dev *qdev)
> {
> if (qdev->iommu_mgr) {
> @@ -24,6 +46,7 @@ static void cleanup_device_resources(struct qda_dev *qdev)
>
> void qda_deinit_device(struct qda_dev *qdev)
> {
> + cleanup_drm_private(qdev);
> cleanup_iommu_manager(qdev);
> cleanup_device_resources(qdev);
> }
> @@ -59,6 +82,18 @@ static int init_memory_manager(struct qda_dev *qdev)
> return 0;
> }
>
> +static int init_drm_private(struct qda_dev *qdev)
> +{
> + qda_dbg(qdev, "Initializing DRM private data\n");
> +
> + qdev->drm_priv = kzalloc_obj(*qdev->drm_priv, GFP_KERNEL);
> + if (!qdev->drm_priv)
> + return -ENOMEM;
> +
> + qda_dbg(qdev, "DRM private data initialized successfully\n");
> + return 0;
> +}
> +
> int qda_init_device(struct qda_dev *qdev)
> {
> int ret;
> @@ -71,14 +106,82 @@ int qda_init_device(struct qda_dev *qdev)
> goto err_cleanup_resources;
> }
>
> + ret = init_drm_private(qdev);
> + if (ret) {
> + qda_err(qdev, "DRM private data initialization failed: %d\n", ret);
> + goto err_cleanup_iommu;
> + }
> +
> qda_dbg(qdev, "QDA device initialized successfully\n");
> return 0;
>
> +err_cleanup_iommu:
> + cleanup_iommu_manager(qdev);
> err_cleanup_resources:
> cleanup_device_resources(qdev);
> return ret;
> }
>
> +static int setup_and_register_drm_device(struct qda_dev *qdev)
> +{
> + struct drm_device *ddev;
> + int ret;
> +
> + qda_dbg(qdev, "Setting up and registering DRM device\n");
> +
> + ddev = drm_dev_alloc(&qda_drm_driver, qdev->dev);
devm_drm_dev_alloc() please. Move this patch to the front of the series,
making everything else depend on the allocated data structure.
> + if (IS_ERR(ddev)) {
> + ret = PTR_ERR(ddev);
> + qda_err(qdev, "Failed to allocate DRM device: %d\n", ret);
> + return ret;
> + }
> +
> + qdev->drm_priv->drm_dev = ddev;
> + qdev->drm_priv->iommu_mgr = qdev->iommu_mgr;
> + qdev->drm_priv->qdev = qdev;
> +
> + ddev->dev_private = qdev->drm_priv;
> + qdev->drm_dev = ddev;
> +
> + ret = drm_dev_register(ddev, 0);
> + if (ret) {
> + qda_err(qdev, "Failed to register DRM device: %d\n", ret);
> + drm_dev_put(ddev);
> + return ret;
> + }
> +
> + qda_dbg(qdev, "DRM device registered successfully\n");
> + return 0;
> +}
> +
> +int qda_register_device(struct qda_dev *qdev)
> +{
> + int ret;
> +
> + ret = setup_and_register_drm_device(qdev);
> + if (ret) {
> + qda_err(qdev, "DRM device setup failed: %d\n", ret);
> + return ret;
> + }
> +
> + qda_dbg(qdev, "QDA device registered successfully\n");
> + return 0;
> +}
> +
> +void qda_unregister_device(struct qda_dev *qdev)
> +{
> + qda_info(qdev, "Unregistering QDA device\n");
> +
> + if (qdev->drm_dev) {
> + qda_dbg(qdev, "Unregistering DRM device\n");
> + drm_dev_unregister(qdev->drm_dev);
> + drm_dev_put(qdev->drm_dev);
> + qdev->drm_dev = NULL;
> + }
> +
> + qda_dbg(qdev, "QDA device unregistered successfully\n");
> +}
> +
> static int __init qda_core_init(void)
> {
> int ret;
> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> index 2cb97e4eafbf..2b80401a3741 100644
> --- a/drivers/accel/qda/qda_drv.h
> +++ b/drivers/accel/qda/qda_drv.h
> @@ -11,13 +11,35 @@
> #include <linux/mutex.h>
> #include <linux/rpmsg.h>
> #include <linux/xarray.h>
> +#include <drm/drm_drv.h>
> +#include <drm/drm_file.h>
> +#include <drm/drm_device.h>
> +#include <drm/drm_accel.h>
> #include "qda_memory_manager.h"
>
> /* Driver identification */
> #define DRIVER_NAME "qda"
>
> +/**
> + * struct qda_drm_priv - DRM device private data for QDA device
> + *
> + * This structure serves as the DRM device private data (stored in dev_private),
> + * bridging the DRM device context with the QDA device and providing access to
> + * shared resources like the memory manager during buffer operations.
> + */
> +struct qda_drm_priv {
Shared between what and what? Why do you need a separate structure
instead of using qda_dev?
> + /* DRM device structure */
> + struct drm_device *drm_dev;
> + /* Global memory/IOMMU manager */
> + struct qda_memory_manager *iommu_mgr;
> + /* Back-pointer to qda_dev */
> + struct qda_dev *qdev;
> +};
> +
> /* struct qda_dev - Main device structure for QDA driver */
> struct qda_dev {
> + /* DRM device for accelerator interface */
> + struct drm_device *drm_dev;
Drop the pointer here.
> /* RPMsg device for communication with remote processor */
> struct rpmsg_device *rpdev;
> /* Underlying device structure */
> @@ -26,6 +48,8 @@ struct qda_dev {
> struct mutex lock;
> /* IOMMU/memory manager */
> struct qda_memory_manager *iommu_mgr;
> + /* DRM device private data */
> + struct qda_drm_priv *drm_priv;
> /* Flag indicating device removal in progress */
> atomic_t removing;
> /* Name of the DSP (e.g., "cdsp", "adsp") */
> @@ -39,8 +63,8 @@ struct qda_dev {
> * @qdev: QDA device structure
> *
> * Returns the most appropriate device structure for logging messages.
> - * Prefers qdev->dev, or returns NULL if the device is being removed
> - * or invalid.
> + * Prefers qdev->dev, falls back to qdev->drm_dev->dev, or returns NULL
> + * if the device is being removed or invalid.
> */
> static inline struct device *qda_get_log_device(struct qda_dev *qdev)
> {
> @@ -50,6 +74,9 @@ static inline struct device *qda_get_log_device(struct qda_dev *qdev)
> if (qdev->dev)
> return qdev->dev;
>
> + if (qdev->drm_dev)
> + return qdev->drm_dev->dev;
> +
> return NULL;
> }
>
> @@ -93,5 +120,7 @@ static inline struct device *qda_get_log_device(struct qda_dev *qdev)
> */
> int qda_init_device(struct qda_dev *qdev);
> void qda_deinit_device(struct qda_dev *qdev);
> +int qda_register_device(struct qda_dev *qdev);
> +void qda_unregister_device(struct qda_dev *qdev);
>
> #endif /* __QDA_DRV_H__ */
> diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
> index 5a57384de6a2..b2b44b4d3ca8 100644
> --- a/drivers/accel/qda/qda_rpmsg.c
> +++ b/drivers/accel/qda/qda_rpmsg.c
> @@ -80,6 +80,7 @@ static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
> qdev->rpdev = NULL;
> mutex_unlock(&qdev->lock);
>
> + qda_unregister_device(qdev);
> qda_unpopulate_child_devices(qdev);
> qda_deinit_device(qdev);
>
> @@ -123,6 +124,13 @@ static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
> return ret;
> }
>
> + ret = qda_register_device(qdev);
> + if (ret) {
> + qda_deinit_device(qdev);
> + qda_unpopulate_child_devices(qdev);
> + return ret;
> + }
> +
> qda_info(qdev, "QDA RPMsg probe completed successfully for %s\n", qdev->dsp_name);
> return 0;
> }
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 08/18] accel/qda: Add per-file DRM context and open/close handling
2026-02-23 19:09 ` [PATCH RFC 08/18] accel/qda: Add per-file DRM context and open/close handling Ekansh Gupta
@ 2026-02-23 22:20 ` Dmitry Baryshkov
2026-03-02 8:36 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:20 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:02AM +0530, Ekansh Gupta wrote:
> Introduce per-file and per-user context for the QDA DRM accelerator
> driver. A new qda_file_priv structure is stored in file->driver_priv
> for each open file descriptor, and a qda_user object is allocated per
> client with a unique client_id generated from an atomic counter in
> qda_dev.
>
> The DRM driver now provides qda_open() and qda_postclose() callbacks.
> qda_open() resolves the qda_dev from the drm_device, allocates the
> qda_file_priv and qda_user structures, and attaches them to the DRM
> file. qda_postclose() tears down the per-file context and frees the
> qda_user object when the file is closed.
>
> This prepares the QDA driver to track per-process state for future
> features such as per-client memory mappings, job submission contexts,
> and access control over DSP compute resources.
Start by describing the problem instead of stuffing it to the end. Can
we use something better suited for this task, like IDR?
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/qda_drv.c | 117 ++++++++++++++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_drv.h | 30 ++++++++++++
> 2 files changed, 147 insertions(+)
>
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> index a9113ec78fa2..bf95fc782cf8 100644
> --- a/drivers/accel/qda/qda_drv.c
> +++ b/drivers/accel/qda/qda_drv.c
> @@ -12,11 +12,127 @@
> #include "qda_drv.h"
> #include "qda_rpmsg.h"
>
> +static struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
> +{
> + if (!dev)
> + return NULL;
> +
> + return (struct qda_drm_priv *)dev->dev_private;
> +}
> +
> +static struct qda_dev *get_qdev_from_drm_device(struct drm_device *dev)
> +{
> + struct qda_drm_priv *drm_priv;
> +
> + if (!dev) {
> + qda_dbg(NULL, "Invalid drm_device\n");
> + return NULL;
> + }
> +
> + drm_priv = get_drm_priv_from_device(dev);
> + if (!drm_priv) {
> + qda_dbg(NULL, "No drm_priv in dev_private\n");
> + return NULL;
> + }
> +
> + return drm_priv->qdev;
> +}
> +
> +static struct qda_user *alloc_qda_user(struct qda_dev *qdev)
> +{
> + struct qda_user *qda_user;
> +
> + qda_user = kzalloc_obj(*qda_user, GFP_KERNEL);
> + if (!qda_user)
> + return NULL;
> +
> + qda_user->client_id = atomic_inc_return(&qdev->client_id_counter);
> + qda_user->qda_dev = qdev;
> +
> + qda_dbg(qdev, "Allocated qda_user with client_id=%u\n", qda_user->client_id);
> + return qda_user;
> +}
> +
> +static void free_qda_user(struct qda_user *qda_user)
> +{
> + if (!qda_user)
> + return;
> +
> + qda_dbg(qda_user->qda_dev, "Freeing qda_user client_id=%u\n", qda_user->client_id);
> +
> + kfree(qda_user);
> +}
> +
> +static int qda_open(struct drm_device *dev, struct drm_file *file)
> +{
> + struct qda_user *qda_user;
> + struct qda_file_priv *qda_file_priv;
> + struct qda_dev *qdev;
> +
> + if (!file) {
> + qda_dbg(NULL, "Invalid file pointer\n");
> + return -EINVAL;
> + }
> +
> + qdev = get_qdev_from_drm_device(dev);
> + if (!qdev) {
> + qda_dbg(NULL, "Failed to get qdev from drm_device\n");
> + return -EINVAL;
> + }
> +
> + qda_file_priv = kzalloc(sizeof(*qda_file_priv), GFP_KERNEL);
> + if (!qda_file_priv)
> + return -ENOMEM;
> +
> + qda_file_priv->pid = current->pid;
> +
> + qda_user = alloc_qda_user(qdev);
> + if (!qda_user) {
> + qda_dbg(qdev, "Failed to allocate qda_user\n");
> + kfree(qda_file_priv);
> + return -ENOMEM;
> + }
> +
> + file->driver_priv = qda_file_priv;
> + qda_file_priv->qda_user = qda_user;
> +
> + qda_dbg(qdev, "Device opened successfully for PID %d\n", current->pid);
> +
> + return 0;
> +}
> +
> +static void qda_postclose(struct drm_device *dev, struct drm_file *file)
> +{
> + struct qda_dev *qdev;
> + struct qda_file_priv *qda_file_priv;
> + struct qda_user *qda_user;
> +
> + qdev = get_qdev_from_drm_device(dev);
> + if (!qdev || atomic_read(&qdev->removing)) {
> + qda_dbg(NULL, "Device unavailable or removing\n");
> + return;
Even if it is being removed, no need to free the memory?
> + }
> +
> + qda_file_priv = (struct qda_file_priv *)file->driver_priv;
> + if (qda_file_priv) {
> + qda_user = qda_file_priv->qda_user;
> + if (qda_user)
> + free_qda_user(qda_user);
> +
> + kfree(qda_file_priv);
> + file->driver_priv = NULL;
> + }
> +
> + qda_dbg(qdev, "Device closed for PID %d\n", current->pid);
> +}
> +
> DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
>
> static struct drm_driver qda_drm_driver = {
> .driver_features = DRIVER_COMPUTE_ACCEL,
> .fops = &qda_accel_fops,
> + .open = qda_open,
> + .postclose = qda_postclose,
> .name = DRIVER_NAME,
> .desc = "Qualcomm DSP Accelerator Driver",
> };
> @@ -58,6 +174,7 @@ static void init_device_resources(struct qda_dev *qdev)
>
> mutex_init(&qdev->lock);
> atomic_set(&qdev->removing, 0);
> + atomic_set(&qdev->client_id_counter, 0);
> }
>
> static int init_memory_manager(struct qda_dev *qdev)
> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> index 2b80401a3741..e0ba37702a86 100644
> --- a/drivers/accel/qda/qda_drv.h
> +++ b/drivers/accel/qda/qda_drv.h
> @@ -10,6 +10,7 @@
> #include <linux/list.h>
> #include <linux/mutex.h>
> #include <linux/rpmsg.h>
> +#include <linux/types.h>
> #include <linux/xarray.h>
> #include <drm/drm_drv.h>
> #include <drm/drm_file.h>
> @@ -20,6 +21,33 @@
> /* Driver identification */
> #define DRIVER_NAME "qda"
>
> +/**
> + * struct qda_file_priv - Per-process private data for DRM file
> + *
> + * This structure tracks per-process state for each open file descriptor.
> + * It maintains the IOMMU device assignment and links to the legacy qda_user
> + * structure for compatibility with existing code.
> + */
> +struct qda_file_priv {
> + /* Process ID for tracking */
> + pid_t pid;
> + /* Pointer to qda_user structure for backward compatibility */
> + struct qda_user *qda_user;
> +};
> +
> +/**
> + * struct qda_user - Per-user context for remote processor interaction
> + *
> + * This structure maintains per-user state for interactions with the
> + * remote processor, including memory mappings and pending operations.
> + */
> +struct qda_user {
> + /* Unique client identifier */
> + u32 client_id;
> + /* Back-pointer to device structure */
> + struct qda_dev *qda_dev;
> +};
> +
> /**
> * struct qda_drm_priv - DRM device private data for QDA device
> *
> @@ -52,6 +80,8 @@ struct qda_dev {
> struct qda_drm_priv *drm_priv;
> /* Flag indicating device removal in progress */
> atomic_t removing;
> + /* Atomic counter for generating unique client IDs */
> + atomic_t client_id_counter;
> /* Name of the DSP (e.g., "cdsp", "adsp") */
> char dsp_name[16];
> /* Compute context-bank (CB) child devices */
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 09/18] accel/qda: Add QUERY IOCTL and basic QDA UAPI header
2026-02-23 19:09 ` [PATCH RFC 09/18] accel/qda: Add QUERY IOCTL and basic QDA UAPI header Ekansh Gupta
@ 2026-02-23 22:24 ` Dmitry Baryshkov
2026-03-02 8:41 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:24 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:03AM +0530, Ekansh Gupta wrote:
> Introduce a basic UAPI for the QDA accelerator driver along with a
> DRM IOCTL handler to query DSP device identity. A new UAPI header
> include/uapi/drm/qda_accel.h defines DRM_QDA_QUERY, the corresponding
> DRM_IOCTL_QDA_QUERY command, and struct drm_qda_query, which contains
> a DSP name string.
>
> On the kernel side, qda_ioctl_query() validates the per-file context,
> resolves the qda_dev instance from dev->dev_private, and copies the
> DSP name from qdev->dsp_name into the query structure. The new
> qda_ioctls[] table wires this IOCTL into the QDA DRM driver so
> userspace can call it through the standard DRM command interface.
>
> This IOCTL provides a simple and stable way for userspace to discover
> which DSP a given QDA device node represents and serves as the first
> building block for a richer QDA UAPI in subsequent patches.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/Makefile | 1 +
> drivers/accel/qda/qda_drv.c | 9 +++++++++
> drivers/accel/qda/qda_ioctl.c | 45 +++++++++++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_ioctl.h | 26 ++++++++++++++++++++++++
> include/uapi/drm/qda_accel.h | 47 +++++++++++++++++++++++++++++++++++++++++++
> 5 files changed, 128 insertions(+)
>
> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
> index 7e96ddc40a24..f547398e1a72 100644
> --- a/drivers/accel/qda/Makefile
> +++ b/drivers/accel/qda/Makefile
> @@ -10,5 +10,6 @@ qda-y := \
> qda_rpmsg.o \
> qda_cb.o \
> qda_memory_manager.o \
> + qda_ioctl.o \
Keep the list sorted, please.
>
> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> index bf95fc782cf8..86758a9cd982 100644
> --- a/drivers/accel/qda/qda_drv.c
> +++ b/drivers/accel/qda/qda_drv.c
> @@ -9,7 +9,10 @@
> #include <drm/drm_file.h>
> #include <drm/drm_gem.h>
> #include <drm/drm_ioctl.h>
> +#include <drm/qda_accel.h>
> +
> #include "qda_drv.h"
> +#include "qda_ioctl.h"
> #include "qda_rpmsg.h"
>
> static struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
> @@ -128,11 +131,17 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
>
> DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
>
> +static const struct drm_ioctl_desc qda_ioctls[] = {
> + DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
> +};
> +
> static struct drm_driver qda_drm_driver = {
> .driver_features = DRIVER_COMPUTE_ACCEL,
> .fops = &qda_accel_fops,
> .open = qda_open,
> .postclose = qda_postclose,
> + .ioctls = qda_ioctls,
Please select one style. Either you indent all assignments or you don't.
> + .num_ioctls = ARRAY_SIZE(qda_ioctls),
> .name = DRIVER_NAME,
> .desc = "Qualcomm DSP Accelerator Driver",
> };
> diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
> new file mode 100644
> index 000000000000..9fa73ec2dfce
> --- /dev/null
> +++ b/drivers/accel/qda/qda_ioctl.c
> @@ -0,0 +1,45 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> +#include <drm/drm_ioctl.h>
> +#include <drm/drm_gem.h>
> +#include <drm/qda_accel.h>
> +#include "qda_drv.h"
> +#include "qda_ioctl.h"
> +
> +static int qda_validate_and_get_context(struct drm_device *dev, struct drm_file *file_priv,
> + struct qda_dev **qdev, struct qda_user **qda_user)
> +{
> + struct qda_drm_priv *drm_priv = dev->dev_private;
> + struct qda_file_priv *qda_file_priv;
> +
> + if (!drm_priv)
> + return -EINVAL;
> +
> + *qdev = drm_priv->qdev;
> + if (!*qdev)
> + return -EINVAL;
Can this actually happen or is it (un)wishful thinking?
> +
> + qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv;
> + if (!qda_file_priv || !qda_file_priv->qda_user)
> + return -EINVAL;
What are you protecting against?
> +
> + *qda_user = qda_file_priv->qda_user;
> +
> + return 0;
> +}
> +
> +int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv)
> +{
> + struct qda_dev *qdev;
> + struct qda_user *qda_user;
> + struct drm_qda_query *args = data;
> + int ret;
> +
> + ret = qda_validate_and_get_context(dev, file_priv, &qdev, &qda_user);
> + if (ret)
> + return ret;
> +
> + strscpy(args->dsp_name, qdev->dsp_name, sizeof(args->dsp_name));
> +
> + return 0;
> +}
> diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
> new file mode 100644
> index 000000000000..6bf3bcd28c0e
> --- /dev/null
> +++ b/drivers/accel/qda/qda_ioctl.h
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +
> +#ifndef _QDA_IOCTL_H
> +#define _QDA_IOCTL_H
> +
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <drm/drm_ioctl.h>
> +#include "qda_drv.h"
> +
> +/**
> + * qda_ioctl_query - Query DSP device information and capabilities
> + * @dev: DRM device structure
> + * @data: User-space data containing query parameters and results
> + * @file_priv: DRM file private data
> + *
> + * This IOCTL handler queries information about the DSP device.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
> +
> +#endif /* _QDA_IOCTL_H */
> diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
> new file mode 100644
> index 000000000000..0aad791c4832
> --- /dev/null
> +++ b/include/uapi/drm/qda_accel.h
> @@ -0,0 +1,47 @@
> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +
> +#ifndef __QDA_ACCEL_H__
> +#define __QDA_ACCEL_H__
> +
> +#include "drm.h"
> +
> +#if defined(__cplusplus)
> +extern "C" {
> +#endif
> +
> +/*
> + * QDA IOCTL command numbers
> + *
> + * These define the command numbers for QDA-specific IOCTLs.
> + * They are used with DRM_COMMAND_BASE to create the full IOCTL numbers.
> + */
> +#define DRM_QDA_QUERY 0x00
> +/*
> + * QDA IOCTL definitions
> + *
> + * These macros define the actual IOCTL numbers used by userspace applications.
> + * They combine the command numbers with DRM_COMMAND_BASE and specify the
> + * data structure and direction (read/write) for each IOCTL.
> + */
> +#define DRM_IOCTL_QDA_QUERY DRM_IOR(DRM_COMMAND_BASE + DRM_QDA_QUERY, struct drm_qda_query)
> +
> +/**
> + * struct drm_qda_query - Device information query structure
> + * @dsp_name: Name of DSP (e.g., "adsp", "cdsp", "cdsp1", "gdsp0", "gdsp1")
> + *
> + * This structure is used with DRM_IOCTL_QDA_QUERY to query device type,
> + * allowing userspace to identify which DSP a device node represents. The
> + * kernel provides the DSP name directly as a null-terminated string.
> + */
> +struct drm_qda_query {
> + __u8 dsp_name[16];
> +};
> +
> +#if defined(__cplusplus)
> +}
> +#endif
> +
> +#endif /* __QDA_ACCEL_H__ */
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
2026-02-23 22:12 ` Dmitry Baryshkov
@ 2026-02-23 22:25 ` Bjorn Andersson
2026-02-23 22:41 ` Dmitry Baryshkov
0 siblings, 1 reply; 83+ messages in thread
From: Bjorn Andersson @ 2026-02-23 22:25 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König, dri-devel, linux-doc,
linux-kernel, linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:12:32AM +0200, Dmitry Baryshkov wrote:
> On Mon, Feb 23, 2026 at 03:50:32PM -0600, Bjorn Andersson wrote:
> > On Mon, Feb 23, 2026 at 11:23:13PM +0200, Dmitry Baryshkov wrote:
> > > On Tue, Feb 24, 2026 at 12:38:57AM +0530, Ekansh Gupta wrote:
> > [..]
> > > > diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> > [..]
> > > > +/* Error logging - always logs and tracks errors */
> > > > +#define qda_err(qdev, fmt, ...) do { \
> > > > + struct device *__dev = qda_get_log_device(qdev); \
> > > > + if (__dev) \
> > > > + dev_err(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
> > > > + else \
> > > > + pr_err(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
> > >
> > > What /why? You are under drm, so you can use drm_* helpers instead.
> > >
> >
> > In particular, rather than rolling our own wrappers around standard
> > functions, just use dev_err() whenever you have a struct device. And for
> > something like fastrpc - life starts at some probe() and ends at some
> > remove() so that should be always.
>
> I'd say differently. For the DRM devices the life cycle is centered
> around the DRM device (which can outlive platform device for multiple
> reasons). So, please start by registering the DRM accel device and using
> it for all the logging (and btw for private data management too).
>
There are no platform_devices here, but tomato tomato... What defines
the life cycle of the DRM device then? Might it linger because clients
are holding open handles to it?
Note that the fastrpc service is coming and going, as the remoteproc
starts and stops.
Regards,
Bjorn
> >
> > Regards,
> > Bjorn
>
> --
> With best wishes
> Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 10/18] accel/qda: Add DMA-backed GEM objects and memory manager integration
2026-02-23 19:09 ` [PATCH RFC 10/18] accel/qda: Add DMA-backed GEM objects and memory manager integration Ekansh Gupta
@ 2026-02-23 22:36 ` Dmitry Baryshkov
2026-03-02 9:06 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:36 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:04AM +0530, Ekansh Gupta wrote:
> Introduce DMA-backed GEM buffer objects for the QDA accelerator
> driver and integrate them with the existing memory manager and IOMMU
> device abstraction.
>
> A new qda_gem_obj structure wraps drm_gem_object and tracks the
> kernel virtual address, DMA address, size and owning qda_iommu_device.
> qda_gem_create_object() allocates a GEM object, aligns the requested
> size, and uses qda_memory_manager_alloc() to obtain DMA-coherent
> memory from a per-process IOMMU device. The GEM object implements
> a .mmap callback that validates the VMA offset and calls into
> qda_dma_mmap(), which maps the DMA memory into userspace and sets
> appropriate VMA flags.
>
> The DMA backend is implemented in qda_memory_dma.c, which allocates
> and frees coherent memory via dma_alloc_coherent() and
> dma_free_coherent(), while storing a SID-prefixed DMA address in
> the GEM object for later use by DSP firmware. The memory manager
> is extended to maintain a mapping from processes to IOMMU devices
> using qda_file_priv and a process_assignment_lock, and provides
> qda_memory_manager_alloc() and qda_memory_manager_free() helpers
> for GEM allocations.
Why are you not using drm_gem_dma_helper?
>
> This patch lays the groundwork for GEM allocation and mmap IOCTLs
> as well as future PRIME and job submission support for QDA buffers.
Documentation/process/submitting-patches.rst, "This patch"
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/Makefile | 2 +
> drivers/accel/qda/qda_drv.c | 23 +++-
> drivers/accel/qda/qda_drv.h | 7 ++
> drivers/accel/qda/qda_gem.c | 187 +++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_gem.h | 63 +++++++++++
> drivers/accel/qda/qda_memory_dma.c | 91 ++++++++++++++++
> drivers/accel/qda/qda_memory_dma.h | 46 ++++++++
> drivers/accel/qda/qda_memory_manager.c | 194 +++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_memory_manager.h | 33 ++++++
> 9 files changed, 645 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
> index f547398e1a72..88c324fa382c 100644
> --- a/drivers/accel/qda/Makefile
> +++ b/drivers/accel/qda/Makefile
> @@ -11,5 +11,7 @@ qda-y := \
> qda_cb.o \
> qda_memory_manager.o \
> qda_ioctl.o \
> + qda_gem.o \
> + qda_memory_dma.o \
>
> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> index 86758a9cd982..19798359b14e 100644
> --- a/drivers/accel/qda/qda_drv.c
> +++ b/drivers/accel/qda/qda_drv.c
> @@ -15,7 +15,7 @@
> #include "qda_ioctl.h"
> #include "qda_rpmsg.h"
>
> -static struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
> +struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
And this is a namespace leak. Please name all your functions in a
selected style (qda_foo()).
> {
> if (!dev)
> return NULL;
> @@ -88,6 +88,7 @@ static int qda_open(struct drm_device *dev, struct drm_file *file)
> return -ENOMEM;
>
> qda_file_priv->pid = current->pid;
> + qda_file_priv->assigned_iommu_dev = NULL; /* Will be assigned on first allocation */
Why? Also, isn't qda_file_priv zero-filled?
>
> qda_user = alloc_qda_user(qdev);
> if (!qda_user) {
> @@ -118,6 +119,26 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
>
> qda_file_priv = (struct qda_file_priv *)file->driver_priv;
> if (qda_file_priv) {
Cant it be NULL? When?
> + if (qda_file_priv->assigned_iommu_dev) {
> + struct qda_iommu_device *iommu_dev = qda_file_priv->assigned_iommu_dev;
> + unsigned long flags;
> +
> + /* Decrement reference count - if it reaches 0, reset PID assignment */
> + if (refcount_dec_and_test(&iommu_dev->refcount)) {
> + /* Last reference released - reset PID assignment */
> + spin_lock_irqsave(&iommu_dev->lock, flags);
> + iommu_dev->assigned_pid = 0;
This is the part that needs to be discussed in the commit message
instead of a generic description of the patch. What is assigned_pid /
assigned_iommu_dev? Why do they need to be assigned?
> + iommu_dev->assigned_file_priv = NULL;
> + spin_unlock_irqrestore(&iommu_dev->lock, flags);
> +
> + qda_dbg(qdev, "Reset PID assignment for IOMMU device %u (process %d exited)\n",
> + iommu_dev->id, qda_file_priv->pid);
> + } else {
> + qda_dbg(qdev, "Decremented reference for IOMMU device %u from process %d\n",
> + iommu_dev->id, qda_file_priv->pid);
> + }
> + }
> +
> qda_user = qda_file_priv->qda_user;
> if (qda_user)
> free_qda_user(qda_user);
> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> index e0ba37702a86..8a2cd474958b 100644
> --- a/drivers/accel/qda/qda_drv.h
> +++ b/drivers/accel/qda/qda_drv.h
> @@ -33,6 +33,8 @@ struct qda_file_priv {
> pid_t pid;
> /* Pointer to qda_user structure for backward compatibility */
> struct qda_user *qda_user;
> + /* IOMMU device assigned to this process */
> + struct qda_iommu_device *assigned_iommu_dev;
> };
>
> /**
> @@ -153,4 +155,9 @@ void qda_deinit_device(struct qda_dev *qdev);
> int qda_register_device(struct qda_dev *qdev);
> void qda_unregister_device(struct qda_dev *qdev);
>
> +/*
> + * Utility function to get DRM private data from DRM device
> + */
> +struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev);
> +
> #endif /* __QDA_DRV_H__ */
> diff --git a/drivers/accel/qda/qda_gem.c b/drivers/accel/qda/qda_gem.c
> new file mode 100644
> index 000000000000..bbd54e2502d3
> --- /dev/null
> +++ b/drivers/accel/qda/qda_gem.c
> @@ -0,0 +1,187 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> +#include <drm/drm_gem.h>
> +#include <drm/drm_prime.h>
> +#include <linux/slab.h>
> +#include <linux/dma-mapping.h>
> +#include "qda_drv.h"
> +#include "qda_gem.h"
> +#include "qda_memory_manager.h"
> +#include "qda_memory_dma.h"
> +
> +static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
> +{
> + if (qda_gem_obj->size == 0) {
> + qda_err(NULL, "Invalid GEM object size\n");
> + return -EINVAL;
> + }
> + if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
> + qda_err(NULL, "Allocated buffer missing IOMMU device\n");
> + return -EINVAL;
> + }
> + if (!qda_gem_obj->iommu_dev->dev) {
> + qda_err(NULL, "Allocated buffer missing IOMMU device\n");
> + return -EINVAL;
> + }
> + if (!qda_gem_obj->virt) {
> + qda_err(NULL, "Allocated buffer missing virtual address\n");
> + return -EINVAL;
> + }
> + if (qda_gem_obj->dma_addr == 0) {
> + qda_err(NULL, "Allocated buffer missing DMA address\n");
> + return -EINVAL;
> + }
Is any of these conditions real?
> +
> + return 0;
> +}
> +
> +static int validate_vma_offset(struct drm_gem_object *drm_obj, struct vm_area_struct *vma)
> +{
> + u64 expected_offset = drm_vma_node_offset_addr(&drm_obj->vma_node);
> + u64 actual_offset = vma->vm_pgoff << PAGE_SHIFT;
> +
> + if (actual_offset != expected_offset) {
What??
> + qda_err(NULL, "VMA offset mismatch: expected=0x%llx, actual=0x%llx\n",
> + expected_offset, actual_offset);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +static void setup_vma_flags(struct vm_area_struct *vma)
> +{
> + vm_flags_set(vma, VM_DONTEXPAND);
> + vm_flags_set(vma, VM_DONTDUMP);
> +}
> +
> +void qda_gem_free_object(struct drm_gem_object *gem_obj)
> +{
> + struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(gem_obj);
> + struct qda_drm_priv *drm_priv = get_drm_priv_from_device(gem_obj->dev);
> +
> + if (qda_gem_obj->virt) {
> + if (drm_priv && drm_priv->iommu_mgr)
> + qda_memory_manager_free(drm_priv->iommu_mgr, qda_gem_obj);
> + }
> +
> + drm_gem_object_release(gem_obj);
> + kfree(qda_gem_obj);
> +}
> +
> +int qda_gem_mmap_obj(struct drm_gem_object *drm_obj, struct vm_area_struct *vma)
> +{
> + struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(drm_obj);
> + int ret;
> +
> + ret = validate_gem_obj_for_mmap(qda_gem_obj);
> + if (ret) {
> + qda_err(NULL, "GEM object validation failed: %d\n", ret);
> + return ret;
> + }
> +
> + ret = validate_vma_offset(drm_obj, vma);
> + if (ret) {
> + qda_err(NULL, "VMA offset validation failed: %d\n", ret);
> + return ret;
> + }
> +
> + /* Reset vm_pgoff for DMA mmap */
> + vma->vm_pgoff = 0;
> +
> + ret = qda_dma_mmap(qda_gem_obj, vma);
> +
> + if (ret == 0) {
> + setup_vma_flags(vma);
> + qda_dbg(NULL, "GEM object mapped successfully\n");
> + } else {
> + qda_err(NULL, "GEM object mmap failed: %d\n", ret);
> + }
> +
> + return ret;
> +}
> +
> +static const struct drm_gem_object_funcs qda_gem_object_funcs = {
> + .free = qda_gem_free_object,
> + .mmap = qda_gem_mmap_obj,
> +};
> +
> +struct qda_gem_obj *qda_gem_alloc_object(struct drm_device *drm_dev, size_t aligned_size)
> +{
> + struct qda_gem_obj *qda_gem_obj;
> + int ret;
> +
> + qda_gem_obj = kzalloc_obj(*qda_gem_obj, GFP_KERNEL);
> + if (!qda_gem_obj)
> + return ERR_PTR(-ENOMEM);
> +
> + ret = drm_gem_object_init(drm_dev, &qda_gem_obj->base, aligned_size);
> + if (ret) {
> + qda_err(NULL, "Failed to initialize GEM object: %d\n", ret);
> + kfree(qda_gem_obj);
> + return ERR_PTR(ret);
> + }
> +
> + qda_gem_obj->base.funcs = &qda_gem_object_funcs;
> + qda_gem_obj->size = aligned_size;
> +
> + qda_dbg(NULL, "Allocated GEM object size=%zu\n", aligned_size);
> + return qda_gem_obj;
> +}
> +
> +void qda_gem_cleanup_object(struct qda_gem_obj *qda_gem_obj)
> +{
> + drm_gem_object_release(&qda_gem_obj->base);
> + kfree(qda_gem_obj);
> +}
> +
> +struct drm_gem_object *qda_gem_lookup_object(struct drm_file *file_priv, u32 handle)
> +{
> + struct drm_gem_object *gem_obj;
> +
> + gem_obj = drm_gem_object_lookup(file_priv, handle);
> + if (!gem_obj)
> + return ERR_PTR(-ENOENT);
> +
> + return gem_obj;
> +}
> +
> +int qda_gem_create_handle(struct drm_file *file_priv, struct drm_gem_object *gem_obj, u32 *handle)
> +{
> + int ret;
> +
> + ret = drm_gem_handle_create(file_priv, gem_obj, handle);
> + drm_gem_object_put(gem_obj);
> +
> + return ret;
> +}
> +
> +struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
> + struct qda_memory_manager *iommu_mgr, size_t size,
> + struct drm_file *file_priv)
> +{
> + struct qda_gem_obj *qda_gem_obj;
> + size_t aligned_size;
> + int ret;
> +
> + if (size == 0) {
> + qda_err(NULL, "Invalid size for GEM object creation\n");
> + return ERR_PTR(-EINVAL);
> + }
> +
> + aligned_size = PAGE_ALIGN(size);
> +
> + qda_gem_obj = qda_gem_alloc_object(drm_dev, aligned_size);
> + if (IS_ERR(qda_gem_obj))
> + return (struct drm_gem_object *)qda_gem_obj;
> +
> + ret = qda_memory_manager_alloc(iommu_mgr, qda_gem_obj, file_priv);
> + if (ret) {
> + qda_err(NULL, "Memory manager allocation failed: %d\n", ret);
> + qda_gem_cleanup_object(qda_gem_obj);
> + return ERR_PTR(ret);
> + }
> +
> + qda_dbg(NULL, "GEM object created successfully size=%zu\n", aligned_size);
> + return &qda_gem_obj->base;
> +}
> diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h
> new file mode 100644
> index 000000000000..caae9cda5363
> --- /dev/null
> +++ b/drivers/accel/qda/qda_gem.h
> @@ -0,0 +1,63 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +#ifndef _QDA_GEM_H
> +#define _QDA_GEM_H
> +
> +#include <linux/xarray.h>
> +#include <drm/drm_device.h>
> +#include <drm/drm_gem.h>
> +#include <linux/dma-mapping.h>
> +
> +/* Forward declarations */
> +struct qda_memory_manager;
> +struct qda_iommu_device;
> +
> +/**
> + * struct qda_gem_obj - QDA GEM buffer object
> + *
> + * This structure represents a GEM buffer object that can be either
> + * allocated by the driver or imported from another driver via dma-buf.
> + */
> +struct qda_gem_obj {
> + /* DRM GEM object base structure */
> + struct drm_gem_object base;
> + /* Kernel virtual address of allocated memory */
> + void *virt;
> + /* DMA address for allocated buffers */
> + dma_addr_t dma_addr;
> + /* Size of the buffer in bytes */
> + size_t size;
> + /* IOMMU device that performed the allocation */
> + struct qda_iommu_device *iommu_dev;
> +};
> +
> +/*
> + * Helper macro to cast a drm_gem_object to qda_gem_obj
> + */
> +#define to_qda_gem_obj(gem_obj) container_of(gem_obj, struct qda_gem_obj, base)
> +
> +/*
> + * GEM object lifecycle management
> + */
> +struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
> + struct qda_memory_manager *iommu_mgr,
> + size_t size, struct drm_file *file_priv);
> +void qda_gem_free_object(struct drm_gem_object *gem_obj);
> +int qda_gem_mmap_obj(struct drm_gem_object *gem_obj, struct vm_area_struct *vma);
> +
> +/*
> + * Helper functions for GEM object allocation and cleanup
> + * These are used internally and by the PRIME import code
> + */
> +struct qda_gem_obj *qda_gem_alloc_object(struct drm_device *drm_dev, size_t aligned_size);
> +void qda_gem_cleanup_object(struct qda_gem_obj *qda_gem_obj);
> +
> +/*
> + * Utility functions for GEM operations
> + */
> +struct drm_gem_object *qda_gem_lookup_object(struct drm_file *file_priv, u32 handle);
> +int qda_gem_create_handle(struct drm_file *file_priv, struct drm_gem_object *gem_obj, u32 *handle);
> +
> +#endif /* _QDA_GEM_H */
> diff --git a/drivers/accel/qda/qda_memory_dma.c b/drivers/accel/qda/qda_memory_dma.c
> new file mode 100644
> index 000000000000..ffdd5423c88c
> --- /dev/null
> +++ b/drivers/accel/qda/qda_memory_dma.c
> @@ -0,0 +1,91 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> +#include <linux/slab.h>
> +#include <linux/dma-mapping.h>
> +#include "qda_drv.h"
> +#include "qda_memory_dma.h"
> +
> +static dma_addr_t get_actual_dma_addr(struct qda_gem_obj *gem_obj)
> +{
> + return gem_obj->dma_addr - ((u64)gem_obj->iommu_dev->sid << 32);
> +}
> +
> +static void setup_gem_object(struct qda_gem_obj *gem_obj, void *virt,
> + dma_addr_t dma_addr, struct qda_iommu_device *iommu_dev)
> +{
> + gem_obj->virt = virt;
> + gem_obj->dma_addr = dma_addr;
> + gem_obj->iommu_dev = iommu_dev;
> +}
> +
> +static void cleanup_gem_object_fields(struct qda_gem_obj *gem_obj)
> +{
> + gem_obj->virt = NULL;
> + gem_obj->dma_addr = 0;
> + gem_obj->iommu_dev = NULL;
> +}
> +
> +int qda_dma_alloc(struct qda_iommu_device *iommu_dev,
> + struct qda_gem_obj *gem_obj, size_t size)
> +{
> + void *virt;
> + dma_addr_t dma_addr;
> +
> + if (!iommu_dev || !iommu_dev->dev) {
> + qda_err(NULL, "Invalid iommu_dev or device for DMA allocation\n");
> + return -EINVAL;
> + }
> +
> + virt = dma_alloc_coherent(iommu_dev->dev, size, &dma_addr, GFP_KERNEL);
> + if (!virt)
> + return -ENOMEM;
> +
> + dma_addr += ((u64)iommu_dev->sid << 32);
> +
> + qda_dbg(NULL, "DMA address with SID prefix: 0x%llx (sid=%u)\n",
> + (u64)dma_addr, iommu_dev->sid);
> +
> + setup_gem_object(gem_obj, virt, dma_addr, iommu_dev);
> +
> + return 0;
> +}
> +
> +void qda_dma_free(struct qda_gem_obj *gem_obj)
> +{
> + if (!gem_obj || !gem_obj->iommu_dev) {
> + qda_dbg(NULL, "Invalid gem_obj or iommu_dev for DMA free\n");
> + return;
> + }
> +
> + qda_dbg(NULL, "DMA freeing: size=%zu, device_id=%u, dma_addr=0x%llx\n",
> + gem_obj->size, gem_obj->iommu_dev->id, gem_obj->dma_addr);
> +
> + dma_free_coherent(gem_obj->iommu_dev->dev, gem_obj->size,
> + gem_obj->virt, get_actual_dma_addr(gem_obj));
> +
> + cleanup_gem_object_fields(gem_obj);
> +}
> +
> +int qda_dma_mmap(struct qda_gem_obj *gem_obj, struct vm_area_struct *vma)
> +{
> + struct qda_iommu_device *iommu_dev;
> + int ret;
> +
> + if (!gem_obj || !gem_obj->virt || !gem_obj->iommu_dev || !gem_obj->iommu_dev->dev) {
> + qda_err(NULL, "Invalid parameters for DMA mmap\n");
> + return -EINVAL;
> + }
> +
> + iommu_dev = gem_obj->iommu_dev;
> +
> + ret = dma_mmap_coherent(iommu_dev->dev, vma, gem_obj->virt,
> + get_actual_dma_addr(gem_obj), gem_obj->size);
> +
> + if (ret)
> + qda_err(NULL, "DMA mmap failed: size=%zu, device_id=%u, ret=%d\n",
> + gem_obj->size, iommu_dev->id, ret);
if (ret) {
qda_err();
return ret;
// or goto err_foo;
}
return 0;
> + else
> + qda_dbg(NULL, "DMA mmap successful: size=%zu\n", gem_obj->size);
It feels like the driver is over-verbose if debugging is enabled.
> +
> + return ret;
> +}
> diff --git a/drivers/accel/qda/qda_memory_dma.h b/drivers/accel/qda/qda_memory_dma.h
> new file mode 100644
> index 000000000000..79b3c4053a82
> --- /dev/null
> +++ b/drivers/accel/qda/qda_memory_dma.h
> @@ -0,0 +1,46 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +
> +#ifndef _QDA_MEMORY_DMA_H
> +#define _QDA_MEMORY_DMA_H
> +
> +#include <linux/dma-mapping.h>
> +#include "qda_memory_manager.h"
> +
> +/**
> + * qda_dma_alloc() - Allocate DMA coherent memory for a GEM object
> + * @iommu_dev: Pointer to the QDA IOMMU device structure
> + * @gem_obj: Pointer to GEM object to allocate memory for
> + * @size: Size of memory to allocate in bytes
> + *
> + * Allocates DMA-coherent memory and sets up the GEM object with the
> + * allocated memory details including virtual and DMA addresses.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
Move the kerneldoc from the headers to the driver code, otherwise they
are mostly ignored by the automatic validators.
> +int qda_dma_alloc(struct qda_iommu_device *iommu_dev,
> + struct qda_gem_obj *gem_obj, size_t size);
> +
> +/**
> + * qda_dma_free() - Free DMA coherent memory for a GEM object
> + * @gem_obj: Pointer to GEM object to free memory for
> + *
> + * Frees DMA-coherent memory previously allocated for the GEM object
> + * and cleans up the GEM object fields.
> + */
> +void qda_dma_free(struct qda_gem_obj *gem_obj);
> +
> +/**
> + * qda_dma_mmap() - Map DMA memory into userspace
> + * @gem_obj: Pointer to GEM object containing DMA memory
> + * @vma: Virtual memory area to map into
> + *
> + * Maps DMA-coherent memory into userspace virtual address space.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int qda_dma_mmap(struct qda_gem_obj *gem_obj, struct vm_area_struct *vma);
> +
> +#endif /* _QDA_MEMORY_DMA_H */
> diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
> index b4c7047a89d4..e225667557ee 100644
> --- a/drivers/accel/qda/qda_memory_manager.c
> +++ b/drivers/accel/qda/qda_memory_manager.c
> @@ -6,8 +6,11 @@
> #include <linux/spinlock.h>
> #include <linux/workqueue.h>
> #include <linux/xarray.h>
> +#include <drm/drm_file.h>
> #include "qda_drv.h"
> +#include "qda_gem.h"
> #include "qda_memory_manager.h"
> +#include "qda_memory_dma.h"
>
> static void cleanup_all_memory_devices(struct qda_memory_manager *mem_mgr)
> {
> @@ -55,6 +58,8 @@ static void init_iommu_device_fields(struct qda_iommu_device *iommu_dev,
> spin_lock_init(&iommu_dev->lock);
> refcount_set(&iommu_dev->refcount, 0);
> INIT_WORK(&iommu_dev->remove_work, qda_memory_manager_remove_work);
> + iommu_dev->assigned_pid = 0;
> + iommu_dev->assigned_file_priv = NULL;
> }
>
> static int allocate_device_id(struct qda_memory_manager *mem_mgr,
> @@ -78,6 +83,194 @@ static int allocate_device_id(struct qda_memory_manager *mem_mgr,
> return ret;
> }
>
> +static struct qda_iommu_device *find_device_for_pid(struct qda_memory_manager *mem_mgr,
> + pid_t pid)
> +{
> + unsigned long index;
> + void *entry;
> + struct qda_iommu_device *found_dev = NULL;
> + unsigned long flags;
> +
> + xa_lock(&mem_mgr->device_xa);
> + xa_for_each(&mem_mgr->device_xa, index, entry) {
> + struct qda_iommu_device *iommu_dev = entry;
> +
> + spin_lock_irqsave(&iommu_dev->lock, flags);
> + if (iommu_dev->assigned_pid == pid) {
> + found_dev = iommu_dev;
> + refcount_inc(&found_dev->refcount);
> + qda_dbg(NULL, "Reusing device id=%u for PID=%d (refcount=%u)\n",
> + found_dev->id, pid, refcount_read(&found_dev->refcount));
And what if there are two different FastRPC sessions within the same
PID?
> + spin_unlock_irqrestore(&iommu_dev->lock, flags);
> + break;
> + }
> + spin_unlock_irqrestore(&iommu_dev->lock, flags);
> + }
> + xa_unlock(&mem_mgr->device_xa);
> +
> + return found_dev;
> +}
> +
> +static struct qda_iommu_device *assign_available_device_to_pid(struct qda_memory_manager *mem_mgr,
> + pid_t pid,
> + struct drm_file *file_priv)
> +{
> + unsigned long index;
> + void *entry;
> + struct qda_iommu_device *selected_dev = NULL;
> + unsigned long flags;
> +
> + xa_lock(&mem_mgr->device_xa);
> + xa_for_each(&mem_mgr->device_xa, index, entry) {
> + struct qda_iommu_device *iommu_dev = entry;
> +
> + spin_lock_irqsave(&iommu_dev->lock, flags);
> + if (iommu_dev->assigned_pid == 0) {
> + iommu_dev->assigned_pid = pid;
> + iommu_dev->assigned_file_priv = file_priv;
> + selected_dev = iommu_dev;
> + refcount_set(&selected_dev->refcount, 1);
> + qda_dbg(NULL, "Assigned device id=%u to PID=%d\n",
> + selected_dev->id, pid);
> + spin_unlock_irqrestore(&iommu_dev->lock, flags);
> + break;
> + }
> + spin_unlock_irqrestore(&iommu_dev->lock, flags);
> + }
> + xa_unlock(&mem_mgr->device_xa);
> +
> + return selected_dev;
> +}
> +
> +static struct qda_iommu_device *get_process_iommu_device(struct qda_memory_manager *mem_mgr,
> + struct drm_file *file_priv)
> +{
> + struct qda_file_priv *qda_priv;
> +
> + if (!file_priv || !file_priv->driver_priv)
> + return NULL;
> +
> + qda_priv = (struct qda_file_priv *)file_priv->driver_priv;
> + return qda_priv->assigned_iommu_dev;
> +}
> +
> +static int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
> + struct drm_file *file_priv)
> +{
> + struct qda_file_priv *qda_priv;
> + struct qda_iommu_device *selected_dev = NULL;
> + int ret = 0;
> + pid_t current_pid;
> +
> + if (!file_priv || !file_priv->driver_priv) {
> + qda_err(NULL, "Invalid file_priv or driver_priv\n");
> + return -EINVAL;
> + }
> +
> + qda_priv = (struct qda_file_priv *)file_priv->driver_priv;
> + current_pid = qda_priv->pid;
> +
> + mutex_lock(&mem_mgr->process_assignment_lock);
> +
> + if (qda_priv->assigned_iommu_dev) {
> + qda_dbg(NULL, "PID=%d already has device id=%u assigned\n",
> + current_pid, qda_priv->assigned_iommu_dev->id);
> + ret = 0;
> + goto unlock_and_return;
> + }
> +
> + selected_dev = find_device_for_pid(mem_mgr, current_pid);
> +
> + if (selected_dev) {
> + qda_priv->assigned_iommu_dev = selected_dev;
> + goto unlock_and_return;
> + }
> +
> + selected_dev = assign_available_device_to_pid(mem_mgr, current_pid, file_priv);
> +
> + if (!selected_dev) {
> + qda_err(NULL, "No available device for PID=%d\n", current_pid);
> + ret = -ENOMEM;
> + goto unlock_and_return;
> + }
> +
> + qda_priv->assigned_iommu_dev = selected_dev;
> +
> +unlock_and_return:
> + mutex_unlock(&mem_mgr->process_assignment_lock);
> + return ret;
> +}
> +
> +static struct qda_iommu_device *get_or_assign_iommu_device(struct qda_memory_manager *mem_mgr,
> + struct drm_file *file_priv,
> + size_t size)
> +{
> + struct qda_iommu_device *iommu_dev;
> + int ret;
> +
> + iommu_dev = get_process_iommu_device(mem_mgr, file_priv);
> + if (iommu_dev)
> + return iommu_dev;
> +
> + ret = qda_memory_manager_assign_device(mem_mgr, file_priv);
> + if (ret)
> + return NULL;
> +
> + iommu_dev = get_process_iommu_device(mem_mgr, file_priv);
> + if (iommu_dev)
> + return iommu_dev;
> +
> + return NULL;
> +}
> +
> +int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj,
> + struct drm_file *file_priv)
> +{
> + struct qda_iommu_device *selected_dev;
> + size_t size;
> + int ret;
> +
> + if (!mem_mgr || !gem_obj || !file_priv) {
> + qda_err(NULL, "Invalid parameters for memory allocation\n");
> + return -EINVAL;
> + }
> +
> + size = gem_obj->size;
> + if (size == 0) {
> + qda_err(NULL, "Invalid allocation size: 0\n");
> + return -EINVAL;
> + }
> +
> + selected_dev = get_or_assign_iommu_device(mem_mgr, file_priv, size);
> +
> + if (!selected_dev) {
> + qda_err(NULL, "Failed to get/assign device for allocation (size=%zu)\n", size);
> + return -ENOMEM;
> + }
> +
> + ret = qda_dma_alloc(selected_dev, gem_obj, size);
> +
> + if (ret) {
> + qda_err(NULL, "Allocation failed: size=%zu, device_id=%u, ret=%d\n",
> + size, selected_dev->id, ret);
> + return ret;
> + }
> +
> + qda_dbg(NULL, "Successfully allocated: size=%zu, device_id=%u, dma_addr=0x%llx\n",
> + size, selected_dev->id, gem_obj->dma_addr);
> + return 0;
> +}
> +
> +void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj)
> +{
> + if (!gem_obj || !gem_obj->iommu_dev) {
> + qda_dbg(NULL, "Invalid gem_obj or iommu_dev for free\n");
> + return;
> + }
> +
> + qda_dma_free(gem_obj);
> +}
> +
> int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
> struct qda_iommu_device *iommu_dev)
> {
> @@ -134,6 +327,7 @@ int qda_memory_manager_init(struct qda_memory_manager *mem_mgr)
>
> xa_init_flags(&mem_mgr->device_xa, XA_FLAGS_ALLOC);
> atomic_set(&mem_mgr->next_id, 0);
> + mutex_init(&mem_mgr->process_assignment_lock);
> mem_mgr->wq = create_workqueue("memory_manager_wq");
> if (!mem_mgr->wq) {
> qda_err(NULL, "Failed to create memory manager workqueue\n");
> diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h
> index 3bf4cd529909..bac44284ef98 100644
> --- a/drivers/accel/qda/qda_memory_manager.h
> +++ b/drivers/accel/qda/qda_memory_manager.h
> @@ -11,6 +11,8 @@
> #include <linux/spinlock.h>
> #include <linux/workqueue.h>
> #include <linux/xarray.h>
> +#include <drm/drm_file.h>
> +#include "qda_gem.h"
>
> /**
> * struct qda_iommu_device - IOMMU device instance for memory management
> @@ -35,6 +37,10 @@ struct qda_iommu_device {
> u32 sid;
> /* Pointer to parent memory manager */
> struct qda_memory_manager *manager;
> + /* Process ID of the process assigned to this device */
> + pid_t assigned_pid;
> + /* DRM file private data for the assigned process */
> + struct drm_file *assigned_file_priv;
> };
>
> /**
> @@ -51,6 +57,8 @@ struct qda_memory_manager {
> atomic_t next_id;
> /* Workqueue for asynchronous device operations */
> struct workqueue_struct *wq;
> + /* Mutex protecting process-to-device assignments */
> + struct mutex process_assignment_lock;
> };
>
> /**
> @@ -98,4 +106,29 @@ int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
> void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
> struct qda_iommu_device *iommu_dev);
>
> +/**
> + * qda_memory_manager_alloc() - Allocate memory for a GEM object
> + * @mem_mgr: Pointer to memory manager
> + * @gem_obj: Pointer to GEM object to allocate memory for
> + * @file_priv: DRM file private data for process association
> + *
> + * Allocates memory for the specified GEM object using an appropriate IOMMU
> + * device. The allocation is associated with the calling process via
> + * file_priv.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj,
> + struct drm_file *file_priv);
> +
> +/**
> + * qda_memory_manager_free() - Free memory for a GEM object
> + * @mem_mgr: Pointer to memory manager
> + * @gem_obj: Pointer to GEM object to free memory for
> + *
> + * Releases memory previously allocated for the specified GEM object and
> + * removes any associated IOMMU mappings.
> + */
> +void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj);
> +
> #endif /* _QDA_MEMORY_MANAGER_H */
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 11/18] accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
2026-02-23 19:09 ` [PATCH RFC 11/18] accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs Ekansh Gupta
@ 2026-02-23 22:39 ` Dmitry Baryshkov
2026-03-02 9:07 ` Ekansh Gupta
2026-02-24 9:05 ` Christian König
1 sibling, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:39 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:05AM +0530, Ekansh Gupta wrote:
> Add two GEM-related IOCTLs for the QDA accelerator driver and hook
> them into the DRM accel driver. DRM_IOCTL_QDA_GEM_CREATE allocates
> a DMA-backed GEM buffer object via qda_gem_create_object() and
> returns a GEM handle to userspace, while
> DRM_IOCTL_QDA_GEM_MMAP_OFFSET returns a valid mmap offset for a
> given GEM handle using drm_gem_create_mmap_offset() and the
> vma_node in the GEM object.
>
> The QDA driver is updated to advertise DRIVER_GEM in its
> driver_features, and the new IOCTLs are wired through the QDA
> GEM and memory-manager backend. These IOCTLs allow userspace to
> allocate buffers and map them into its address space as a first
> step toward full compute buffer management and integration with
> DSP workloads.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/qda_drv.c | 5 ++++-
> drivers/accel/qda/qda_gem.h | 30 ++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_ioctl.c | 35 +++++++++++++++++++++++++++++++++++
> include/uapi/drm/qda_accel.h | 36 ++++++++++++++++++++++++++++++++++++
> 4 files changed, 105 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> index 19798359b14e..0dd0e2bb2c0f 100644
> --- a/drivers/accel/qda/qda_drv.c
> +++ b/drivers/accel/qda/qda_drv.c
> @@ -12,6 +12,7 @@
> #include <drm/qda_accel.h>
>
> #include "qda_drv.h"
> +#include "qda_gem.h"
> #include "qda_ioctl.h"
> #include "qda_rpmsg.h"
>
> @@ -154,10 +155,12 @@ DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
>
> static const struct drm_ioctl_desc qda_ioctls[] = {
> DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
> + DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
> + DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
> };
>
> static struct drm_driver qda_drm_driver = {
> - .driver_features = DRIVER_COMPUTE_ACCEL,
> + .driver_features = DRIVER_GEM | DRIVER_COMPUTE_ACCEL,
> .fops = &qda_accel_fops,
> .open = qda_open,
> .postclose = qda_postclose,
> diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h
> index caae9cda5363..cbd5d0a58fa4 100644
> --- a/drivers/accel/qda/qda_gem.h
> +++ b/drivers/accel/qda/qda_gem.h
> @@ -47,6 +47,36 @@ struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
> void qda_gem_free_object(struct drm_gem_object *gem_obj);
> int qda_gem_mmap_obj(struct drm_gem_object *gem_obj, struct vm_area_struct *vma);
>
> +/*
> + * GEM IOCTL handlers
> + */
> +
> +/**
> + * qda_ioctl_gem_create - Create a GEM buffer object
> + * @dev: DRM device structure
> + * @data: User-space data containing buffer creation parameters
> + * @file_priv: DRM file private data
> + *
> + * This IOCTL handler creates a new GEM buffer object with the specified
> + * size and returns a handle to the created buffer.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
> +
> +/**
> + * qda_ioctl_gem_mmap_offset - Get mmap offset for a GEM buffer object
> + * @dev: DRM device structure
> + * @data: User-space data containing buffer handle and offset result
> + * @file_priv: DRM file private data
> + *
> + * This IOCTL handler retrieves the mmap offset for a GEM buffer object,
> + * which can be used to map the buffer into user-space memory.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv);
> +
> /*
> * Helper functions for GEM object allocation and cleanup
> * These are used internally and by the PRIME import code
> diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
> index 9fa73ec2dfce..ef3c9c691cb7 100644
> --- a/drivers/accel/qda/qda_ioctl.c
> +++ b/drivers/accel/qda/qda_ioctl.c
> @@ -43,3 +43,38 @@ int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_pr
>
> return 0;
> }
> +
> +int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv)
> +{
> + struct drm_qda_gem_create *args = data;
> + struct drm_gem_object *gem_obj;
> + struct qda_drm_priv *drm_priv;
> +
> + drm_priv = get_drm_priv_from_device(dev);
> + if (!drm_priv || !drm_priv->iommu_mgr)
> + return -EINVAL;
> +
> + gem_obj = qda_gem_create_object(dev, drm_priv->iommu_mgr, args->size, file_priv);
> + if (IS_ERR(gem_obj))
> + return PTR_ERR(gem_obj);
> +
> + return qda_gem_create_handle(file_priv, gem_obj, &args->handle);
> +}
> +
> +int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv)
> +{
> + struct drm_qda_gem_mmap_offset *args = data;
> + struct drm_gem_object *gem_obj;
> + int ret;
> +
> + gem_obj = qda_gem_lookup_object(file_priv, args->handle);
> + if (IS_ERR(gem_obj))
> + return PTR_ERR(gem_obj);
> +
> + ret = drm_gem_create_mmap_offset(gem_obj);
> + if (ret == 0)
> + args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node);
> +
> + drm_gem_object_put(gem_obj);
> + return ret;
> +}
> diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
> index 0aad791c4832..ed24a7f5637e 100644
> --- a/include/uapi/drm/qda_accel.h
> +++ b/include/uapi/drm/qda_accel.h
> @@ -19,6 +19,8 @@ extern "C" {
> * They are used with DRM_COMMAND_BASE to create the full IOCTL numbers.
> */
> #define DRM_QDA_QUERY 0x00
> +#define DRM_QDA_GEM_CREATE 0x01
> +#define DRM_QDA_GEM_MMAP_OFFSET 0x02
> /*
> * QDA IOCTL definitions
> *
> @@ -27,6 +29,10 @@ extern "C" {
> * data structure and direction (read/write) for each IOCTL.
> */
> #define DRM_IOCTL_QDA_QUERY DRM_IOR(DRM_COMMAND_BASE + DRM_QDA_QUERY, struct drm_qda_query)
> +#define DRM_IOCTL_QDA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_CREATE, \
> + struct drm_qda_gem_create)
> +#define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
> + struct drm_qda_gem_mmap_offset)
>
> /**
> * struct drm_qda_query - Device information query structure
> @@ -40,6 +46,36 @@ struct drm_qda_query {
> __u8 dsp_name[16];
> };
>
> +/**
> + * struct drm_qda_gem_create - GEM buffer object creation parameters
> + * @size: Size of the GEM object to create in bytes (input)
> + * @handle: Allocated GEM handle (output)
> + *
> + * This structure is used with DRM_IOCTL_QDA_GEM_CREATE to allocate
> + * a new GEM buffer object.
> + */
> +struct drm_qda_gem_create {
> + __u32 handle;
> + __u32 pad;
> + __u64 size;
If you put size before handle, you would not need padding.
> +};
> +
> +/**
> + * struct drm_qda_gem_mmap_offset - GEM object mmap offset query
> + * @handle: GEM handle (input)
> + * @pad: Padding for 64-bit alignment
> + * @offset: mmap offset for the GEM object (output)
> + *
> + * This structure is used with DRM_IOCTL_QDA_GEM_MMAP_OFFSET to retrieve
> + * the mmap offset that can be used with mmap() to map the GEM object into
> + * user space.
> + */
> +struct drm_qda_gem_mmap_offset {
> + __u32 handle;
> + __u32 pad;
> + __u64 offset;
I'm really not a fan of the pad field in the middle of the structure.
> +};
> +
> #if defined(__cplusplus)
> }
> #endif
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 18/18] MAINTAINERS: Add MAINTAINERS entry for QDA driver
2026-02-23 19:09 ` [PATCH RFC 18/18] MAINTAINERS: Add MAINTAINERS entry for QDA driver Ekansh Gupta
@ 2026-02-23 22:40 ` Dmitry Baryshkov
2026-03-02 8:41 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:40 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:12AM +0530, Ekansh Gupta wrote:
> Add a new MAINTAINERS entry for the Qualcomm DSP Accelerator (QDA)
> driver. The entry lists the primary maintainer, the linux-arm-msm and
> dri-devel mailing lists, and covers all source files under
> drivers/accel/qda, Documentation/accel/qda and the UAPI header
> include/uapi/drm/qda_accel.h.
>
> This ensures that patches to the QDA driver and its public API are
> tracked and routed to the appropriate reviewers as the driver is
> integrated into the DRM accel subsystem.
Please add it in the first patch.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> MAINTAINERS | 9 +++++++++
> 1 file changed, 9 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 71f76fddebbf..78b8b82a6370 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -21691,6 +21691,15 @@ S: Maintained
> F: Documentation/devicetree/bindings/crypto/qcom-qce.yaml
> F: drivers/crypto/qce/
>
> +QUALCOMM DSP ACCELERATOR (QDA) DRIVER
> +M: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> +L: linux-arm-msm@vger.kernel.org
> +L: dri-devel@lists.freedesktop.org
> +S: Supported
> +F: Documentation/accel/qda/
> +F: drivers/accel/qda/
> +F: include/uapi/drm/qda_accel.h
> +
> QUALCOMM EMAC GIGABIT ETHERNET DRIVER
> M: Timur Tabi <timur@kernel.org>
> L: netdev@vger.kernel.org
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
2026-02-23 22:25 ` Bjorn Andersson
@ 2026-02-23 22:41 ` Dmitry Baryshkov
0 siblings, 0 replies; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:41 UTC (permalink / raw)
To: Bjorn Andersson
Cc: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König, dri-devel, linux-doc,
linux-kernel, linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Mon, Feb 23, 2026 at 04:25:08PM -0600, Bjorn Andersson wrote:
> On Tue, Feb 24, 2026 at 12:12:32AM +0200, Dmitry Baryshkov wrote:
> > On Mon, Feb 23, 2026 at 03:50:32PM -0600, Bjorn Andersson wrote:
> > > On Mon, Feb 23, 2026 at 11:23:13PM +0200, Dmitry Baryshkov wrote:
> > > > On Tue, Feb 24, 2026 at 12:38:57AM +0530, Ekansh Gupta wrote:
> > > [..]
> > > > > diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> > > [..]
> > > > > +/* Error logging - always logs and tracks errors */
> > > > > +#define qda_err(qdev, fmt, ...) do { \
> > > > > + struct device *__dev = qda_get_log_device(qdev); \
> > > > > + if (__dev) \
> > > > > + dev_err(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
> > > > > + else \
> > > > > + pr_err(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
> > > >
> > > > What /why? You are under drm, so you can use drm_* helpers instead.
> > > >
> > >
> > > In particular, rather than rolling our own wrappers around standard
> > > functions, just use dev_err() whenever you have a struct device. And for
> > > something like fastrpc - life starts at some probe() and ends at some
> > > remove() so that should be always.
> >
> > I'd say differently. For the DRM devices the life cycle is centered
> > around the DRM device (which can outlive platform device for multiple
> > reasons). So, please start by registering the DRM accel device and using
> > it for all the logging (and btw for private data management too).
> >
>
> There are no platform_devices here, but tomato tomato... What defines
> the life cycle of the DRM device then? Might it linger because clients
> are holding open handles to it?
Yes.
>
> Note that the fastrpc service is coming and going, as the remoteproc
> starts and stops.
Even one more reason to use drm_device for life cycle management
instead of manually inventing the wheel.
>
> Regards,
> Bjorn
>
> > >
> > > Regards,
> > > Bjorn
> >
> > --
> > With best wishes
> > Dmitry
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
2026-02-23 19:08 ` [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU Ekansh Gupta
@ 2026-02-23 22:44 ` Dmitry Baryshkov
2026-02-25 17:56 ` Ekansh Gupta
2026-02-26 10:46 ` Krzysztof Kozlowski
1 sibling, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:44 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:38:58AM +0530, Ekansh Gupta wrote:
> Introduce a built-in compute context-bank (CB) bus used by the Qualcomm
> DSP accelerator (QDA) driver to represent DSP CB devices that require
> IOMMU configuration. This separates the CB bus from the QDA driver and
> allows QDA to remain a loadable module while the bus is always built-in.
Why? What is the actual problem that you are trying to solve?
>
> A new bool Kconfig symbol DRM_ACCEL_QDA_COMPUTE_BUS is added and is
Don't describe the patch contents. Please.
> selected by the main DRM_ACCEL_QDA driver. The parent accel Makefile is
> updated to descend into the QDA directory for both built-in and module
> builds so that the CB bus is compiled into vmlinux while the driver
> remains modular.
>
> The CB bus is registered at postcore_initcall() time and is exposed to
> the IOMMU core through iommu_buses[] in the same way as the Tegra
> host1x context-bus. This enables later patches to create CB devices on
> this bus and obtain IOMMU domains for them.
Note, there is nothing QDA-specific in this patch. Please explain, why
the bus is QDA-specific? Can we generalize it?
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/Makefile | 1 +
> drivers/accel/qda/Kconfig | 5 +++++
> drivers/accel/qda/Makefile | 2 ++
> drivers/accel/qda/qda_compute_bus.c | 23 +++++++++++++++++++++++
> drivers/iommu/iommu.c | 4 ++++
> include/linux/qda_compute_bus.h | 22 ++++++++++++++++++++++
> 6 files changed, 57 insertions(+)
>
> diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
> index 58c08dd5f389..9ed843cd293f 100644
> --- a/drivers/accel/Makefile
> +++ b/drivers/accel/Makefile
> @@ -6,4 +6,5 @@ obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/
> obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/
> obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/
> obj-$(CONFIG_DRM_ACCEL_QDA) += qda/
> +obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda/
> obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/
> \ No newline at end of file
> diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig
> index 484d21ff1b55..ef1fa384efbe 100644
> --- a/drivers/accel/qda/Kconfig
> +++ b/drivers/accel/qda/Kconfig
> @@ -3,11 +3,16 @@
> # Qualcomm DSP accelerator driver
> #
>
> +
> +config DRM_ACCEL_QDA_COMPUTE_BUS
> + bool
> +
> config DRM_ACCEL_QDA
> tristate "Qualcomm DSP accelerator"
> depends on DRM_ACCEL
> depends on ARCH_QCOM || COMPILE_TEST
> depends on RPMSG
> + select DRM_ACCEL_QDA_COMPUTE_BUS
> help
> Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs.
> This driver provides a standardized interface for offloading computational
> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
> index e7f23182589b..242684ef1af7 100644
> --- a/drivers/accel/qda/Makefile
> +++ b/drivers/accel/qda/Makefile
> @@ -8,3 +8,5 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
> qda-y := \
> qda_drv.o \
> qda_rpmsg.o \
> +
> +obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
> diff --git a/drivers/accel/qda/qda_compute_bus.c b/drivers/accel/qda/qda_compute_bus.c
> new file mode 100644
> index 000000000000..1d9c39948fb5
> --- /dev/null
> +++ b/drivers/accel/qda/qda_compute_bus.c
> @@ -0,0 +1,23 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> +#include <linux/device.h>
> +#include <linux/init.h>
> +
> +struct bus_type qda_cb_bus_type = {
> + .name = "qda-compute-cb",
> +};
> +EXPORT_SYMBOL_GPL(qda_cb_bus_type);
> +
> +static int __init qda_cb_bus_init(void)
> +{
> + int err;
> +
> + err = bus_register(&qda_cb_bus_type);
> + if (err < 0) {
> + pr_err("qda-compute-cb bus registration failed: %d\n", err);
> + return err;
> + }
> + return 0;
> +}
> +
> +postcore_initcall(qda_cb_bus_init);
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index 4926a43118e6..5dee912686ee 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -33,6 +33,7 @@
> #include <trace/events/iommu.h>
> #include <linux/sched/mm.h>
> #include <linux/msi.h>
> +#include <linux/qda_compute_bus.h>
> #include <uapi/linux/iommufd.h>
>
> #include "dma-iommu.h"
> @@ -178,6 +179,9 @@ static const struct bus_type * const iommu_buses[] = {
> #ifdef CONFIG_CDX_BUS
> &cdx_bus_type,
> #endif
> +#ifdef CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS
> + &qda_cb_bus_type,
> +#endif
> };
>
> /*
> diff --git a/include/linux/qda_compute_bus.h b/include/linux/qda_compute_bus.h
> new file mode 100644
> index 000000000000..807122d84e3f
> --- /dev/null
> +++ b/include/linux/qda_compute_bus.h
> @@ -0,0 +1,22 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +
> +#ifndef __QDA_COMPUTE_BUS_H__
> +#define __QDA_COMPUTE_BUS_H__
> +
> +#include <linux/device.h>
> +
> +/*
> + * Custom bus type for QDA compute context bank (CB) devices
> + *
> + * This bus type is used for manually created CB devices that represent
> + * IOMMU context banks. The custom bus allows proper IOMMU configuration
> + * and device management for these virtual devices.
> + */
> +#ifdef CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS
> +extern struct bus_type qda_cb_bus_type;
> +#endif
> +
> +#endif /* __QDA_COMPUTE_BUS_H__ */
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 05/18] accel/qda: Create compute CB devices on QDA compute bus
2026-02-23 19:08 ` [PATCH RFC 05/18] accel/qda: Create compute CB devices on QDA compute bus Ekansh Gupta
@ 2026-02-23 22:49 ` Dmitry Baryshkov
2026-02-26 8:38 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:49 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:38:59AM +0530, Ekansh Gupta wrote:
> Add support for creating compute context-bank (CB) devices under
> the QDA compute bus based on child nodes of the FastRPC RPMsg
> device tree node. Each DT child with compatible
> "qcom,fastrpc-compute-cb" is turned into a QDA-owned struct
> device on qda_cb_bus_type.
>
> A new qda_cb_dev structure and cb_devs list in qda_dev track these
> CB devices. qda_populate_child_devices() walks the DT children
> during QDA RPMsg probe, creates CB devices, configures their DMA
> and IOMMU settings using of_dma_configure(), and associates a SID
> from the "reg" property when present.
>
> On RPMsg remove, qda_unpopulate_child_devices() tears down all CB
> devices, removing them from their IOMMU groups if present and
> unregistering the devices. This prepares the ground for using CB
> devices as IOMMU endpoints for DSP compute workloads in later
> patches.
Are we loosing the nsessions support?
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/Makefile | 1 +
> drivers/accel/qda/qda_cb.c | 150 ++++++++++++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_cb.h | 26 ++++++++
> drivers/accel/qda/qda_drv.h | 3 +
> drivers/accel/qda/qda_rpmsg.c | 40 +++++++++++
> 5 files changed, 220 insertions(+)
>
> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
> index 242684ef1af7..4aded20b6bc2 100644
> --- a/drivers/accel/qda/Makefile
> +++ b/drivers/accel/qda/Makefile
> @@ -8,5 +8,6 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
> qda-y := \
> qda_drv.o \
> qda_rpmsg.o \
> + qda_cb.o \
>
> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
> diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
> new file mode 100644
> index 000000000000..77a2d8cae076
> --- /dev/null
> +++ b/drivers/accel/qda/qda_cb.c
> @@ -0,0 +1,150 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> +#include <linux/dma-mapping.h>
> +#include <linux/device.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include <linux/iommu.h>
> +#include <linux/slab.h>
> +#include "qda_drv.h"
> +#include "qda_cb.h"
> +
> +static void qda_cb_dev_release(struct device *dev)
> +{
> + kfree(dev);
Do you need to put the reference on the OF node?
> +}
> +
> +static int qda_configure_cb_iommu(struct device *cb_dev, struct device_node *cb_node)
> +{
> + int ret;
> +
> + qda_dbg(NULL, "Configuring DMA/IOMMU for CB device %s\n", dev_name(cb_dev));
> +
> + /* Use of_dma_configure which handles both DMA and IOMMU configuration */
> + ret = of_dma_configure(cb_dev, cb_node, true);
> + if (ret) {
> + qda_err(NULL, "of_dma_configure failed for %s: %d\n", dev_name(cb_dev), ret);
> + return ret;
> + }
> +
> + qda_dbg(NULL, "DMA/IOMMU configured successfully for CB device %s\n", dev_name(cb_dev));
> + return 0;
> +}
> +
> +static int qda_cb_setup_device(struct qda_dev *qdev, struct device *cb_dev)
> +{
> + int rc;
> + u32 sid, pa_bits = 32;
> +
> + qda_dbg(qdev, "Setting up CB device %s\n", dev_name(cb_dev));
> +
> + if (of_property_read_u32(cb_dev->of_node, "reg", &sid)) {
> + qda_dbg(qdev, "No 'reg' property found, defaulting SID to 0\n");
> + sid = 0;
Don't do the job of the schema validator. Are there nodes without reg?
No.
> + }
> +
> + rc = dma_set_mask(cb_dev, DMA_BIT_MASK(pa_bits));
> + if (rc) {
> + qda_err(qdev, "%d bit DMA enable failed: %d\n", pa_bits, rc);
> + return rc;
> + }
> +
> + qda_dbg(qdev, "CB device setup complete - SID: %u, PA bits: %u\n", sid, pa_bits);
> +
> + return 0;
> +}
> +
> +int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node)
> +{
> + struct device *cb_dev;
> + int ret;
> + u32 sid = 0;
> + struct qda_cb_dev *entry;
> +
> + qda_dbg(qdev, "Creating CB device for node: %s\n", cb_node->name);
> +
> + of_property_read_u32(cb_node, "reg", &sid);
> +
> + cb_dev = kzalloc_obj(*cb_dev, GFP_KERNEL);
> + if (!cb_dev)
> + return -ENOMEM;
> +
> + device_initialize(cb_dev);
> + cb_dev->parent = qdev->dev;
> + cb_dev->bus = &qda_cb_bus_type; /* Use our custom bus type for IOMMU handling */
> + cb_dev->release = qda_cb_dev_release;
> + dev_set_name(cb_dev, "qda-cb-%s-%u", qdev->dsp_name, sid);
> +
> + qda_dbg(qdev, "Initialized CB device: %s\n", dev_name(cb_dev));
> +
> + cb_dev->of_node = of_node_get(cb_node);
> +
> + cb_dev->dma_mask = &cb_dev->coherent_dma_mask;
> + cb_dev->coherent_dma_mask = DMA_BIT_MASK(32);
> +
> + dev_set_drvdata(cb_dev->parent, qdev);
> +
> + ret = device_add(cb_dev);
> + if (ret) {
> + qda_err(qdev, "Failed to add CB device for SID %u: %d\n", sid, ret);
> + goto cleanup_device_init;
> + }
> +
> + qda_dbg(qdev, "CB device added to system\n");
> +
> + ret = qda_configure_cb_iommu(cb_dev, cb_node);
> + if (ret) {
> + qda_err(qdev, "IOMMU configuration failed: %d\n", ret);
> + goto cleanup_device_add;
> + }
> +
> + ret = qda_cb_setup_device(qdev, cb_dev);
> + if (ret) {
> + qda_err(qdev, "CB device setup failed: %d\n", ret);
> + goto cleanup_device_add;
> + }
> +
> + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> + if (!entry) {
> + ret = -ENOMEM;
> + goto cleanup_device_add;
> + }
> +
> + entry->dev = cb_dev;
> + list_add_tail(&entry->node, &qdev->cb_devs);
> +
> + qda_dbg(qdev, "Successfully created CB device for SID %u\n", sid);
> + return 0;
> +
> +cleanup_device_add:
> + device_del(cb_dev);
> +cleanup_device_init:
> + of_node_put(cb_dev->of_node);
> + put_device(cb_dev);
> + return ret;
> +}
> +
> +void qda_destroy_cb_device(struct device *cb_dev)
> +{
> + struct iommu_group *group;
> +
> + if (!cb_dev) {
> + qda_dbg(NULL, "NULL CB device passed to destroy\n");
> + return;
> + }
> +
> + qda_dbg(NULL, "Destroying CB device %s\n", dev_name(cb_dev));
> +
> + group = iommu_group_get(cb_dev);
> + if (group) {
> + qda_dbg(NULL, "Removing %s from IOMMU group\n", dev_name(cb_dev));
> + iommu_group_remove_device(cb_dev);
> + iommu_group_put(group);
> + }
> +
> + of_node_put(cb_dev->of_node);
> + cb_dev->of_node = NULL;
> + device_unregister(cb_dev);
> +
> + qda_dbg(NULL, "CB device %s destroyed\n", dev_name(cb_dev));
> +}
> diff --git a/drivers/accel/qda/qda_cb.h b/drivers/accel/qda/qda_cb.h
> new file mode 100644
> index 000000000000..a4ae9fef142e
> --- /dev/null
> +++ b/drivers/accel/qda/qda_cb.h
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +
> +#ifndef __QDA_CB_H__
> +#define __QDA_CB_H__
> +
> +#include <linux/device.h>
> +#include <linux/of.h>
> +#include <linux/list.h>
> +#include <linux/qda_compute_bus.h>
> +#include "qda_drv.h"
> +
> +struct qda_cb_dev {
> + struct list_head node;
> + struct device *dev;
> +};
> +
> +/*
> + * Compute bus (CB) device management
> + */
> +int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node);
> +void qda_destroy_cb_device(struct device *cb_dev);
> +
> +#endif /* __QDA_CB_H__ */
> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> index bec2d31ca1bb..eb732b7d8091 100644
> --- a/drivers/accel/qda/qda_drv.h
> +++ b/drivers/accel/qda/qda_drv.h
> @@ -7,6 +7,7 @@
> #define __QDA_DRV_H__
>
> #include <linux/device.h>
> +#include <linux/list.h>
> #include <linux/mutex.h>
> #include <linux/rpmsg.h>
> #include <linux/xarray.h>
> @@ -26,6 +27,8 @@ struct qda_dev {
> atomic_t removing;
> /* Name of the DSP (e.g., "cdsp", "adsp") */
> char dsp_name[16];
> + /* Compute context-bank (CB) child devices */
> + struct list_head cb_devs;
> };
>
> /**
> diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
> index a8b24a99ca13..5a57384de6a2 100644
> --- a/drivers/accel/qda/qda_rpmsg.c
> +++ b/drivers/accel/qda/qda_rpmsg.c
> @@ -7,6 +7,7 @@
> #include <linux/of_device.h>
> #include "qda_drv.h"
> #include "qda_rpmsg.h"
> +#include "qda_cb.h"
>
> static int qda_rpmsg_init(struct qda_dev *qdev)
> {
> @@ -25,11 +26,42 @@ static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
>
> qdev->dev = &rpdev->dev;
> qdev->rpdev = rpdev;
> + INIT_LIST_HEAD(&qdev->cb_devs);
>
> qda_dbg(qdev, "Allocated and initialized qda_dev\n");
> return qdev;
> }
>
> +static void qda_unpopulate_child_devices(struct qda_dev *qdev)
> +{
> + struct qda_cb_dev *entry, *tmp;
> +
> + list_for_each_entry_safe(entry, tmp, &qdev->cb_devs, node) {
> + list_del(&entry->node);
> + qda_destroy_cb_device(entry->dev);
> + kfree(entry);
Why can't you embed struct device into a structure together with the
list_node (and possibly some other data?)?
> + }
> +}
> +
> +static int qda_populate_child_devices(struct qda_dev *qdev, struct device_node *parent_node)
> +{
> + struct device_node *child;
> + int count = 0, success = 0;
> +
> + for_each_child_of_node(parent_node, child) {
> + if (of_device_is_compatible(child, "qcom,fastrpc-compute-cb")) {
> + count++;
> + if (qda_create_cb_device(qdev, child) == 0) {
> + success++;
> + qda_dbg(qdev, "Created CB device for node: %s\n", child->name);
> + } else {
> + qda_err(qdev, "Failed to create CB device for: %s\n", child->name);
Don't loose the error code. Instead please return it to the caller.
> + }
> + }
> + }
> + return success > 0 ? 0 : (count > 0 ? -ENODEV : 0);
> +}
> +
> static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, void *priv, u32 src)
> {
> /* Dummy function for rpmsg driver */
> @@ -48,6 +80,7 @@ static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
> qdev->rpdev = NULL;
> mutex_unlock(&qdev->lock);
>
> + qda_unpopulate_child_devices(qdev);
> qda_deinit_device(qdev);
>
> qda_info(qdev, "RPMsg device removed\n");
> @@ -83,6 +116,13 @@ static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
> if (ret)
> return ret;
>
> + ret = qda_populate_child_devices(qdev, rpdev->dev.of_node);
> + if (ret) {
> + qda_err(qdev, "Failed to populate child devices: %d\n", ret);
> + qda_deinit_device(qdev);
> + return ret;
> + }
> +
> qda_info(qdev, "QDA RPMsg probe completed successfully for %s\n", qdev->dsp_name);
> return 0;
> }
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices
2026-02-23 19:09 ` [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices Ekansh Gupta
@ 2026-02-23 22:50 ` Dmitry Baryshkov
2026-03-02 8:15 ` Ekansh Gupta
2026-02-23 23:11 ` Bjorn Andersson
1 sibling, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 22:50 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:00AM +0530, Ekansh Gupta wrote:
> Introduce a per-device memory manager for the QDA driver that tracks
> IOMMU-capable compute context-bank (CB) devices. Each CB device is
> represented by a qda_iommu_device and registered with a central
> qda_memory_manager instance owned by qda_dev.
>
> The memory manager maintains an xarray of devices and assigns a
> unique ID to each CB. It also provides basic lifetime management
Sounds like IDR.
> and a workqueue for deferred device removal. qda_cb_setup_device()
What is deferred device removal? Why do you need it?
> now allocates a qda_iommu_device for each CB and registers it with
> the memory manager after DMA configuration succeeds.
>
> qda_init_device() is extended to allocate and initialize the memory
> manager, while qda_deinit_device() will tear it down in later
> patches. This prepares the QDA driver for fine-grained memory and
> IOMMU domain management tied to individual CB devices.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/Makefile | 1 +
> drivers/accel/qda/qda_cb.c | 32 +++++++
> drivers/accel/qda/qda_drv.c | 46 ++++++++++
> drivers/accel/qda/qda_drv.h | 3 +
> drivers/accel/qda/qda_memory_manager.c | 152 +++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_memory_manager.h | 101 ++++++++++++++++++++++
> 6 files changed, 335 insertions(+)
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 13/18] accel/qda: Add initial FastRPC attach and release support
2026-02-23 19:09 ` [PATCH RFC 13/18] accel/qda: Add initial FastRPC attach and release support Ekansh Gupta
@ 2026-02-23 23:07 ` Dmitry Baryshkov
2026-03-09 6:50 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 23:07 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:07AM +0530, Ekansh Gupta wrote:
> Add the initial FastRPC invocation plumbing to the QDA accelerator
> driver to support attaching to and releasing a DSP process. A new
> fastrpc_invoke_context structure tracks the state of a single remote
So, why does it embed kref?
> procedure call, including arguments, overlap handling, completion and
> GEM-based message buffers. Contexts are indexed through an xarray in
> qda_dev so that RPMsg callbacks can match responses back to the
> originating invocation.
Again, IDR? Or not?
>
> The new qda_fastrpc implementation provides helpers to prepare
> FastRPC scalars and arguments, pack them into a QDA message backed by
> a GEM buffer and unpack responses. The FastRPC INIT_ATTACH and
> INIT_RELEASE methods are wired up via a new QDA_INIT_ATTACH ioctl and
> a postclose hook that sends a release request when a client file
> descriptor is closed. On the transport side qda_rpmsg_send_msg()
> builds and sends a fastrpc_msg over RPMsg, while qda_rpmsg_cb()
> decodes qda_invoke_rsp messages, looks up the context by its id and
> completes the corresponding wait.
>
> This lays the foundation for QDA FastRPC method support on top of the
> existing GEM and RPMsg infrastructure, starting with the attach and
> release control flows for DSP sessions.
I think the FastRPC backing code should be a separate commit,
INIT_ATTACH another, separate commit.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/Makefile | 1 +
> drivers/accel/qda/qda_drv.c | 5 +
> drivers/accel/qda/qda_drv.h | 2 +
> drivers/accel/qda/qda_fastrpc.c | 548 ++++++++++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_fastrpc.h | 303 ++++++++++++++++++++++
> drivers/accel/qda/qda_ioctl.c | 107 ++++++++
> drivers/accel/qda/qda_ioctl.h | 25 ++
> drivers/accel/qda/qda_rpmsg.c | 164 +++++++++++-
> drivers/accel/qda/qda_rpmsg.h | 40 +++
> include/uapi/drm/qda_accel.h | 19 ++
> 10 files changed, 1212 insertions(+), 2 deletions(-)
>
> diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
> index ed24a7f5637e..4d3666c5b998 100644
> --- a/include/uapi/drm/qda_accel.h
> +++ b/include/uapi/drm/qda_accel.h
[moved this file to the beginning of the patch to ease reviewing]
> @@ -21,6 +21,7 @@ extern "C" {
> #define DRM_QDA_QUERY 0x00
> #define DRM_QDA_GEM_CREATE 0x01
> #define DRM_QDA_GEM_MMAP_OFFSET 0x02
> +#define DRM_QDA_INIT_ATTACH 0x03
> /*
> * QDA IOCTL definitions
> *
> @@ -33,6 +34,7 @@ extern "C" {
> struct drm_qda_gem_create)
> #define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
> struct drm_qda_gem_mmap_offset)
> +#define DRM_IOCTL_QDA_INIT_ATTACH DRM_IO(DRM_COMMAND_BASE + DRM_QDA_INIT_ATTACH)
>
> /**
> * struct drm_qda_query - Device information query structure
> @@ -76,6 +78,23 @@ struct drm_qda_gem_mmap_offset {
> __u64 offset;
> };
>
> +/**
> + * struct fastrpc_invoke_args - FastRPC invocation argument descriptor
> + * @ptr: Pointer to argument data (user virtual address)
> + * @length: Length of the argument data in bytes
And the data is defined... where?
> + * @fd: File descriptor for buffer arguments, -1 for scalar arguments
> + * @attr: Argument attributes and flags
Which attributes and flags?
> + *
> + * This structure describes a single argument passed to a FastRPC invocation.
> + * Arguments can be either scalar values or buffer references (via file descriptor).
Can't it just be GEM handle + offset inside the handle?
> + */
> +struct fastrpc_invoke_args {
> + __u64 ptr;
> + __u64 length;
> + __s32 fd;
> + __u32 attr;
> +};
> +
> #if defined(__cplusplus)
> }
> #endif
>
> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
> index 8286f5279748..82d40e452fa9 100644
> --- a/drivers/accel/qda/Makefile
> +++ b/drivers/accel/qda/Makefile
> @@ -14,5 +14,6 @@ qda-y := \
> qda_gem.o \
> qda_memory_dma.o \
> qda_prime.o \
> + qda_fastrpc.o \
>
> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> index 4adee00b1f2c..3034ea660924 100644
> --- a/drivers/accel/qda/qda_drv.c
> +++ b/drivers/accel/qda/qda_drv.c
> @@ -120,6 +120,8 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
> return;
> }
>
> + fastrpc_release_current_dsp_process(qdev, file);
No, this is not the fastrpc driver.
> +
> qda_file_priv = (struct qda_file_priv *)file->driver_priv;
> if (qda_file_priv) {
> if (qda_file_priv->assigned_iommu_dev) {
> @@ -159,6 +161,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
> DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
> DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
> DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
> + DRM_IOCTL_DEF_DRV(QDA_INIT_ATTACH, qda_ioctl_attach, 0),
> };
>
> static struct drm_driver qda_drm_driver = {
> @@ -195,6 +198,7 @@ static void cleanup_iommu_manager(struct qda_dev *qdev)
>
> static void cleanup_device_resources(struct qda_dev *qdev)
> {
> + xa_destroy(&qdev->ctx_xa);
I thought xarray was in some other patch. What is this ctx_xa?
> mutex_destroy(&qdev->lock);
> }
>
> @@ -213,6 +217,7 @@ static void init_device_resources(struct qda_dev *qdev)
> mutex_init(&qdev->lock);
> atomic_set(&qdev->removing, 0);
> atomic_set(&qdev->client_id_counter, 0);
> + xa_init_flags(&qdev->ctx_xa, XA_FLAGS_ALLOC1);
> }
>
> static int init_memory_manager(struct qda_dev *qdev)
> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> index bb0dd7e284c6..bb1d1e82036a 100644
> --- a/drivers/accel/qda/qda_drv.h
> +++ b/drivers/accel/qda/qda_drv.h
> @@ -92,6 +92,8 @@ struct qda_dev {
> char dsp_name[16];
> /* Compute context-bank (CB) child devices */
> struct list_head cb_devs;
> + /* XArray for context management */
> + struct xarray ctx_xa;
> };
>
> /**
> diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
> new file mode 100644
> index 000000000000..eda7c90070ee
> --- /dev/null
> +++ b/drivers/accel/qda/qda_fastrpc.c
> @@ -0,0 +1,548 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> +#include <linux/slab.h>
> +#include <linux/uaccess.h>
> +#include <linux/sort.h>
> +#include <linux/completion.h>
> +#include <linux/dma-buf.h>
> +#include <drm/drm_gem.h>
> +#include <drm/qda_accel.h>
> +#include "qda_fastrpc.h"
> +#include "qda_drv.h"
> +#include "qda_gem.h"
> +#include "qda_memory_manager.h"
> +
> +static int copy_to_user_or_kernel(void __user *dst, const void *src, size_t size)
> +{
> + if ((unsigned long)dst >= PAGE_OFFSET) {
> + memcpy(dst, src, size);
> + return 0;
> + } else {
> + return copy_to_user(dst, src, size) ? -EFAULT : 0;
Huh?
> + }
> +}
> +
> +static int get_gem_obj_from_handle(struct drm_file *file_priv, u32 handle,
> + struct drm_gem_object **gem_obj)
> +{
> + if (handle == 0)
> + return -EINVAL;
Let the system do its job.
> +
> + if (!file_priv)
> + return -EINVAL;
Can it be NULL?
> +
> + *gem_obj = drm_gem_object_lookup(file_priv, handle);
> + if (*gem_obj)
> + return 0;
> +
> + return -ENOENT;
> +}
> +
> +static void setup_pages_from_gem_obj(struct qda_gem_obj *qda_gem_obj,
> + struct fastrpc_phy_page *pages)
> +{
> + if (qda_gem_obj->is_imported)
> + pages->addr = qda_gem_obj->imported_dma_addr;
> + else
> + pages->addr = qda_gem_obj->dma_addr;
Why do you need tow kinds of addresses?
> +
> + pages->size = qda_gem_obj->size;
> +}
> +
> +static u64 calculate_vma_offset(u64 user_ptr)
> +{
> + struct vm_area_struct *vma;
> + u64 user_ptr_page_mask = user_ptr & PAGE_MASK;
> + u64 vma_offset = 0;
> +
> + mmap_read_lock(current->mm);
> + vma = find_vma(current->mm, user_ptr);
> + if (vma)
> + vma_offset = user_ptr_page_mask - vma->vm_start;
> + mmap_read_unlock(current->mm);
> +
> + return vma_offset;
> +}
> +
> +static u64 calculate_page_aligned_size(u64 ptr, u64 len)
> +{
> + u64 pg_start = (ptr & PAGE_MASK) >> PAGE_SHIFT;
> + u64 pg_end = ((ptr + len - 1) & PAGE_MASK) >> PAGE_SHIFT;
> + u64 aligned_size = (pg_end - pg_start + 1) * PAGE_SIZE;
> +
> + return aligned_size;
> +}
> +
> +static void setup_single_arg(struct fastrpc_invoke_args *args, void *ptr, size_t size)
> +{
> + args[0].ptr = (u64)(uintptr_t)ptr;
What kind of address is it? If ptr is on the DSP side, then it should
not be void* here.
> + args[0].length = size;
> + args[0].fd = -1;
> +}
> +
> +static struct fastrpc_invoke_buf *fastrpc_invoke_buf_start(union fastrpc_remote_arg *pra, int len)
> +{
> + struct fastrpc_invoke_buf *buf = (struct fastrpc_invoke_buf *)(&pra[len]);
> + return buf;
> +}
> +
> +static struct fastrpc_phy_page *fastrpc_phy_page_start(struct fastrpc_invoke_buf *buf, int len)
> +{
> + struct fastrpc_phy_page *pages = (struct fastrpc_phy_page *)(&buf[len]);
> + return pages;
> +}
> +
> +static int fastrpc_get_meta_size(struct fastrpc_invoke_context *ctx)
> +{
> + int size = 0;
> +
> + size = (sizeof(struct fastrpc_remote_buf) +
> + sizeof(struct fastrpc_invoke_buf) +
> + sizeof(struct fastrpc_phy_page)) * ctx->nscalars +
> + sizeof(u64) * FASTRPC_MAX_FDLIST +
> + sizeof(u32) * FASTRPC_MAX_CRCLIST;
> +
> + return size;
> +}
> +
> +static u64 fastrpc_get_payload_size(struct fastrpc_invoke_context *ctx, int metalen)
> +{
> + u64 size = 0;
> + int oix;
> +
> + size = ALIGN(metalen, FASTRPC_ALIGN);
> +
> + for (oix = 0; oix < ctx->nbufs; oix++) {
> + int i = ctx->olaps[oix].raix;
whts olaps?
Why do you need to specially track it?
> +
> + if (ctx->args[i].fd == 0 || ctx->args[i].fd == -1) {
> + if (ctx->olaps[oix].offset == 0)
> + size = ALIGN(size, FASTRPC_ALIGN);
> +
> + size += (ctx->olaps[oix].mend - ctx->olaps[oix].mstart);
> + }
> + }
> +
> + return size;
> +}
> +
> +void fastrpc_context_free(struct kref *ref)
> +{
> + struct fastrpc_invoke_context *ctx;
> + int i;
> +
> + ctx = container_of(ref, struct fastrpc_invoke_context, refcount);
> + if (ctx->gem_objs) {
> + for (i = 0; i < ctx->nscalars; ++i) {
> + if (ctx->gem_objs[i]) {
> + drm_gem_object_put(ctx->gem_objs[i]);
> + ctx->gem_objs[i] = NULL;
> + }
> + }
> + kfree(ctx->gem_objs);
> + ctx->gem_objs = NULL;
You are going to kfree ctx. Why do you need to zero the field?
> + }
> +
> + if (ctx->msg_gem_obj) {
> + drm_gem_object_put(&ctx->msg_gem_obj->base);
> + ctx->msg_gem_obj = NULL;
> + }
> +
> + kfree(ctx->olaps);
> + ctx->olaps = NULL;
> +
> + kfree(ctx->args);
> + kfree(ctx->req);
> + kfree(ctx->rsp);
> + kfree(ctx->input_pages);
> + kfree(ctx->inbuf);
Generally it feels like there are too many allocations and frees for a
single RPC call. Can all these buffers be embedded into the context
instead?
> +
> + kfree(ctx);
> +}
> +
> +#define CMP(aa, bb) ((aa) == (bb) ? 0 : (aa) < (bb) ? -1 : 1)
> +
> +static int olaps_cmp(const void *a, const void *b)
> +{
> + struct fastrpc_buf_overlap *pa = (struct fastrpc_buf_overlap *)a;
> + struct fastrpc_buf_overlap *pb = (struct fastrpc_buf_overlap *)b;
> + int st = CMP(pa->start, pb->start);
> + int ed = CMP(pb->end, pa->end);
> +
> + return st == 0 ? ed : st;
wist?
> +}
> +
> +static void fastrpc_get_buff_overlaps(struct fastrpc_invoke_context *ctx)
> +{
> + u64 max_end = 0;
> + int i;
> +
> + for (i = 0; i < ctx->nbufs; ++i) {
> + ctx->olaps[i].start = ctx->args[i].ptr;
> + ctx->olaps[i].end = ctx->olaps[i].start + ctx->args[i].length;
> + ctx->olaps[i].raix = i;
> + }
> +
> + sort(ctx->olaps, ctx->nbufs, sizeof(*ctx->olaps), olaps_cmp, NULL);
> +
> + for (i = 0; i < ctx->nbufs; ++i) {
> + if (ctx->olaps[i].start < max_end) {
> + ctx->olaps[i].mstart = max_end;
> + ctx->olaps[i].mend = ctx->olaps[i].end;
> + ctx->olaps[i].offset = max_end - ctx->olaps[i].start;
> +
> + if (ctx->olaps[i].end > max_end) {
> + max_end = ctx->olaps[i].end;
> + } else {
> + ctx->olaps[i].mend = 0;
> + ctx->olaps[i].mstart = 0;
> + }
> + } else {
> + ctx->olaps[i].mend = ctx->olaps[i].end;
> + ctx->olaps[i].mstart = ctx->olaps[i].start;
> + ctx->olaps[i].offset = 0;
> + max_end = ctx->olaps[i].end;
> + }
> + }
> +}
> +
> +struct fastrpc_invoke_context *fastrpc_context_alloc(void)
> +{
> + struct fastrpc_invoke_context *ctx = NULL;
> +
> + ctx = kzalloc_obj(*ctx, GFP_KERNEL);
> + if (!ctx)
> + return ERR_PTR(-ENOMEM);
> +
> + INIT_LIST_HEAD(&ctx->node);
> +
> + ctx->retval = -1;
> + ctx->pid = current->pid;
> + init_completion(&ctx->work);
> + ctx->msg_gem_obj = NULL;
> + kref_init(&ctx->refcount);
> +
> + return ctx;
> +}
> +
> +static int process_fd_buffer(struct fastrpc_invoke_context *ctx, int i,
> + union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages)
> +{
> + struct drm_gem_object *gem_obj;
> + struct qda_gem_obj *qda_gem_obj;
> + int err;
> + u64 len = ctx->args[i].length;
> + u64 vma_offset;
> +
> + err = get_gem_obj_from_handle(ctx->file_priv, ctx->args[i].fd, &gem_obj);
> + if (err)
> + return err;
> +
> + ctx->gem_objs[i] = gem_obj;
> + qda_gem_obj = to_qda_gem_obj(gem_obj);
> +
> + rpra[i].buf.pv = (u64)ctx->args[i].ptr;
> +
> + if (qda_gem_obj->is_imported)
> + pages[i].addr = qda_gem_obj->imported_dma_addr;
> + else
> + pages[i].addr = qda_gem_obj->dma_addr;
> +
> + vma_offset = calculate_vma_offset(ctx->args[i].ptr);
> + pages[i].addr += vma_offset;
> + pages[i].size = calculate_page_aligned_size(ctx->args[i].ptr, len);
> +
> + return 0;
> +}
> +
> +static int process_direct_buffer(struct fastrpc_invoke_context *ctx, int i, int oix,
> + union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages,
> + uintptr_t *args, u64 *rlen, u64 pkt_size)
What is direct buffer?
> +{
> + int mlen;
> + u64 len = ctx->args[i].length;
> + int inbufs = ctx->inbufs;
> +
> + if (ctx->olaps[oix].offset == 0) {
> + *rlen -= ALIGN(*args, FASTRPC_ALIGN) - *args;
> + *args = ALIGN(*args, FASTRPC_ALIGN);
> + }
> +
> + mlen = ctx->olaps[oix].mend - ctx->olaps[oix].mstart;
> +
> + if (*rlen < mlen)
> + return -ENOSPC;
> +
> + rpra[i].buf.pv = *args - ctx->olaps[oix].offset;
> +
> + pages[i].addr = ctx->msg->phys - ctx->olaps[oix].offset + (pkt_size - *rlen);
> + pages[i].addr = pages[i].addr & PAGE_MASK;
> + pages[i].size = calculate_page_aligned_size(rpra[i].buf.pv, len);
> +
> + *args = *args + mlen;
> + *rlen -= mlen;
> +
> + if (i < inbufs) {
> + void *dst = (void *)(uintptr_t)rpra[i].buf.pv;
> + void *src = (void *)(uintptr_t)ctx->args[i].ptr;
Huh?
> +
> + if ((unsigned long)src >= PAGE_OFFSET) {
> + memcpy(dst, src, len);
> + } else {
> + if (copy_from_user(dst, (void __user *)src, len))
> + return -EFAULT;
> + }
> + }
> +
> + return 0;
> +}
> +
> +static int process_dma_handle(struct fastrpc_invoke_context *ctx, int i,
> + union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages)
> +{
> + if (ctx->args[i].fd > 0) {
> + struct drm_gem_object *gem_obj;
> + struct qda_gem_obj *qda_gem_obj;
> + int err;
> +
> + err = get_gem_obj_from_handle(ctx->file_priv, ctx->args[i].fd, &gem_obj);
> + if (err)
> + return err;
> +
> + ctx->gem_objs[i] = gem_obj;
> + qda_gem_obj = to_qda_gem_obj(gem_obj);
> +
> + setup_pages_from_gem_obj(qda_gem_obj, &pages[i]);
> +
> + rpra[i].dma.fd = ctx->args[i].fd;
> + rpra[i].dma.len = ctx->args[i].length;
> + rpra[i].dma.offset = (u64)ctx->args[i].ptr;
> + } else {
> + rpra[i].buf.pv = ctx->args[i].ptr;
> + rpra[i].buf.len = ctx->args[i].length;
> + }
> +
> + return 0;
> +}
> +
> +int fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size)
> +{
> + ctx->inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
> + ctx->metalen = fastrpc_get_meta_size(ctx);
> + ctx->pkt_size = fastrpc_get_payload_size(ctx, ctx->metalen);
> +
> + ctx->aligned_pkt_size = PAGE_ALIGN(ctx->pkt_size);
> + if (ctx->aligned_pkt_size == 0)
> + return -EINVAL;
> +
> + *out_size = ctx->aligned_pkt_size;
> + return 0;
> +}
> +
> +static int fastrpc_get_args(struct fastrpc_invoke_context *ctx)
> +{
> + union fastrpc_remote_arg *rpra;
> + struct fastrpc_invoke_buf *list;
> + struct fastrpc_phy_page *pages;
> + int i, oix, err = 0;
> + u64 rlen;
> + uintptr_t args;
> + size_t hdr_size;
> +
> + ctx->inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
> + err = fastrpc_get_header_size(ctx, &hdr_size);
> + if (err)
> + return err;
> +
> + ctx->msg->buf = ctx->msg_gem_obj->virt;
> + ctx->msg->phys = ctx->msg_gem_obj->dma_addr;
> +
> + memset(ctx->msg->buf, 0, ctx->aligned_pkt_size);
> +
> + rpra = (union fastrpc_remote_arg *)ctx->msg->buf;
> + ctx->list = fastrpc_invoke_buf_start(rpra, ctx->nscalars);
> + ctx->pages = fastrpc_phy_page_start(ctx->list, ctx->nscalars);
> + list = ctx->list;
> + pages = ctx->pages;
> + args = (uintptr_t)ctx->msg->buf + ctx->metalen;
> + rlen = ctx->pkt_size - ctx->metalen;
> + ctx->rpra = rpra;
> +
> + for (oix = 0; oix < ctx->nbufs; ++oix) {
> + i = ctx->olaps[oix].raix;
> +
> + rpra[i].buf.pv = 0;
> + rpra[i].buf.len = ctx->args[i].length;
> + list[i].num = ctx->args[i].length ? 1 : 0;
> + list[i].pgidx = i;
> +
> + if (!ctx->args[i].length)
> + continue;
> +
> + if (ctx->args[i].fd > 0)
> + err = process_fd_buffer(ctx, i, rpra, pages);
> + else
> + err = process_direct_buffer(ctx, i, oix, rpra, pages, &args, &rlen,
> + ctx->pkt_size);
> +
> + if (err)
> + goto bail_gem;
> + }
> +
> + for (i = ctx->nbufs; i < ctx->nscalars; ++i) {
> + list[i].num = ctx->args[i].length ? 1 : 0;
> + list[i].pgidx = i;
> +
> + err = process_dma_handle(ctx, i, rpra, pages);
> + if (err)
> + goto bail_gem;
> + }
> +
> + return 0;
> +
> +bail_gem:
> + if (ctx->msg_gem_obj) {
> + drm_gem_object_put(&ctx->msg_gem_obj->base);
> + ctx->msg_gem_obj = NULL;
> + }
> +
> + return err;
> +}
> +
> +static int fastrpc_put_args(struct fastrpc_invoke_context *ctx, struct qda_msg *msg)
> +{
> + union fastrpc_remote_arg *rpra = ctx->rpra;
> + int i, err = 0;
> +
> + if (!ctx || !rpra)
> + return -EINVAL;
> +
> + for (i = ctx->inbufs; i < ctx->nbufs; ++i) {
> + if (ctx->args[i].fd <= 0) {
> + void *src = (void *)(uintptr_t)rpra[i].buf.pv;
> + void *dst = (void *)(uintptr_t)ctx->args[i].ptr;
> + u64 len = rpra[i].buf.len;
> +
> + err = copy_to_user_or_kernel(dst, src, len);
> + if (err)
> + break;
> + }
> + }
> +
> + return err;
> +}
> +
> +int fastrpc_internal_invoke_pack(struct fastrpc_invoke_context *ctx,
> + struct qda_msg *msg)
> +{
> + int err = 0;
> +
> + if (ctx->handle == FASTRPC_INIT_HANDLE)
> + msg->client_id = 0;
> + else
> + msg->client_id = ctx->client_id;
> +
> + ctx->msg = msg;
> +
> + err = fastrpc_get_args(ctx);
> + if (err)
> + return err;
> +
> + dma_wmb();
> +
> + msg->tid = ctx->pid;
> + msg->ctx = ctx->ctxid | ctx->pd;
> + msg->handle = ctx->handle;
> + msg->sc = ctx->sc;
> + msg->addr = ctx->msg->phys;
> + msg->size = roundup(ctx->pkt_size, PAGE_SIZE);
> + msg->fastrpc_ctx = ctx;
> + msg->file_priv = ctx->file_priv;
> +
> + return 0;
> +}
> +
> +int fastrpc_internal_invoke_unpack(struct fastrpc_invoke_context *ctx,
> + struct qda_msg *msg)
> +{
> + int err;
> +
> + dma_rmb();
> +
> + err = fastrpc_put_args(ctx, msg);
> + if (err)
> + return err;
> +
> + err = ctx->retval;
> + return err;
> +}
> +
> +static int fastrpc_prepare_args_init_attach(struct fastrpc_invoke_context *ctx)
> +{
> + struct fastrpc_invoke_args *args;
> +
> + args = kzalloc_obj(*args, GFP_KERNEL);
> + if (!args)
> + return -ENOMEM;
> +
> + setup_single_arg(args, &ctx->client_id, sizeof(ctx->client_id));
> + ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_ATTACH, 1, 0);
> + ctx->args = args;
> + ctx->handle = FASTRPC_INIT_HANDLE;
> +
> + return 0;
> +}
> +
> +static int fastrpc_prepare_args_release_process(struct fastrpc_invoke_context *ctx)
> +{
> + struct fastrpc_invoke_args *args;
> +
> + args = kzalloc_obj(*args, GFP_KERNEL);
> + if (!args)
> + return -ENOMEM;
> +
> + setup_single_arg(args, &ctx->client_id, sizeof(ctx->client_id));
> + ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_RELEASE, 1, 0);
> + ctx->args = args;
> + ctx->handle = FASTRPC_INIT_HANDLE;
> +
> + return 0;
> +}
> +
> +int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
> +{
> + int err;
> +
> + switch (ctx->type) {
> + case FASTRPC_RMID_INIT_ATTACH:
> + ctx->pd = ROOT_PD;
> + err = fastrpc_prepare_args_init_attach(ctx);
> + break;
> + case FASTRPC_RMID_INIT_RELEASE:
> + err = fastrpc_prepare_args_release_process(ctx);
> + break;
> + default:
> + return -EINVAL;
> + }
> +
> + if (err)
> + return err;
> +
> + ctx->nscalars = REMOTE_SCALARS_LENGTH(ctx->sc);
> + ctx->nbufs = REMOTE_SCALARS_INBUFS(ctx->sc) + REMOTE_SCALARS_OUTBUFS(ctx->sc);
> +
> + if (ctx->nscalars) {
> + ctx->gem_objs = kcalloc(ctx->nscalars, sizeof(*ctx->gem_objs), GFP_KERNEL);
> + if (!ctx->gem_objs)
> + return -ENOMEM;
> + ctx->olaps = kcalloc(ctx->nscalars, sizeof(*ctx->olaps), GFP_KERNEL);
> + if (!ctx->olaps) {
> + kfree(ctx->gem_objs);
> + ctx->gem_objs = NULL;
> + return -ENOMEM;
> + }
> + fastrpc_get_buff_overlaps(ctx);
> + }
> +
> + return err;
> +}
> diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
> new file mode 100644
> index 000000000000..744421382079
> --- /dev/null
> +++ b/drivers/accel/qda/qda_fastrpc.h
> @@ -0,0 +1,303 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +
> +#ifndef __QDA_FASTRPC_H__
> +#define __QDA_FASTRPC_H__
> +
> +#include <linux/completion.h>
> +#include <linux/list.h>
> +#include <linux/types.h>
> +#include <drm/drm_drv.h>
> +#include <drm/drm_file.h>
> +
> +/*
> + * FastRPC scalar extraction macros
> + *
> + * These macros extract different fields from the scalar value that describes
> + * the arguments passed in a FastRPC invocation.
> + */
> +#define REMOTE_SCALARS_INBUFS(sc) (((sc) >> 16) & 0x0ff)
> +#define REMOTE_SCALARS_OUTBUFS(sc) (((sc) >> 8) & 0x0ff)
> +#define REMOTE_SCALARS_INHANDLES(sc) (((sc) >> 4) & 0x0f)
> +#define REMOTE_SCALARS_OUTHANDLES(sc) ((sc) & 0x0f)
> +#define REMOTE_SCALARS_LENGTH(sc) (REMOTE_SCALARS_INBUFS(sc) + \
> + REMOTE_SCALARS_OUTBUFS(sc) + \
> + REMOTE_SCALARS_INHANDLES(sc) + \
> + REMOTE_SCALARS_OUTHANDLES(sc))
> +
> +/* FastRPC configuration constants */
> +#define FASTRPC_ALIGN 128 /* Alignment requirement */
> +#define FASTRPC_MAX_FDLIST 16 /* Maximum file descriptors */
> +#define FASTRPC_MAX_CRCLIST 64 /* Maximum CRC list entries */
> +
> +/*
> + * FastRPC scalar construction macros
> + *
> + * These macros build the scalar value that describes the arguments
> + * for a FastRPC invocation.
> + */
> +#define FASTRPC_BUILD_SCALARS(attr, method, in, out, oin, oout) \
> + (((attr & 0x07) << 29) | \
> + ((method & 0x1f) << 24) | \
> + ((in & 0xff) << 16) | \
> + ((out & 0xff) << 8) | \
> + ((oin & 0x0f) << 4) | \
> + (oout & 0x0f))
> +
> +#define FASTRPC_SCALARS(method, in, out) \
> + FASTRPC_BUILD_SCALARS(0, method, in, out, 0, 0)
> +
> +/**
> + * struct fastrpc_buf_overlap - Buffer overlap tracking structure
> + *
> + * This structure tracks overlapping buffer regions to optimize memory
> + * mapping and avoid redundant mappings of the same physical memory.
I think you are spending much more efforts on optimizing it than the
actual cost of mapping the same region twice. Or is there something more
than the optimization?
> + */
> +struct fastrpc_buf_overlap {
> + /* Start address of the buffer in user virtual address space */
> + u64 start;
> + /* End address of the buffer in user virtual address space */
> + u64 end;
> + /* Remote argument index associated with this overlap */
> + int raix;
> + /* Start address of the mapped region */
> + u64 mstart;
> + /* End address of the mapped region */
> + u64 mend;
> + /* Offset within the mapped region */
> + u64 offset;
> +};
> +
> +/**
> + * struct fastrpc_remote_dmahandle - Structure to represent a remote DMA handle
> + */
> +struct fastrpc_remote_dmahandle {
> + /* DMA handle file descriptor */
> + s32 fd;
> + /* DMA handle offset */
> + u32 offset;
> + /* DMA handle length */
> + u32 len;
> +};
> +
> +/**
> + * struct fastrpc_remote_buf - Structure to represent a remote buffer
> + */
> +struct fastrpc_remote_buf {
> + /* Buffer pointer */
> + u64 pv;
> + /* Length of buffer */
> + u64 len;
> +};
> +
> +/**
> + * union fastrpc_remote_arg - Union to represent remote arguments
> + */
> +union fastrpc_remote_arg {
> + /* Remote buffer */
> + struct fastrpc_remote_buf buf;
> + /* Remote DMA handle */
> + struct fastrpc_remote_dmahandle dma;
> +};
> +
> +/**
> + * struct fastrpc_phy_page - Structure to represent a physical page
> + */
> +struct fastrpc_phy_page {
> + /* Physical address */
> + u64 addr;
> + /* Size of contiguous region */
> + u64 size;
> +};
> +
> +/**
> + * struct fastrpc_invoke_buf - Structure to represent an invoke buffer
> + */
> +struct fastrpc_invoke_buf {
> + /* Number of contiguous regions */
> + u32 num;
> + /* Page index */
> + u32 pgidx;
> +};
> +
> +/**
> + * struct qda_msg - Message structure for FastRPC communication
> + *
> + * This structure represents a message sent to or received from the remote
> + * processor via FastRPC protocol.
> + */
> +struct qda_msg {
> + /* Process client ID */
> + int client_id;
> + /* Thread ID */
> + int tid;
> + /* Context identifier for matching responses */
> + u64 ctx;
> + /* Handle to invoke on remote processor */
> + u32 handle;
> + /* Scalars structure describing the data layout */
> + u32 sc;
> + /* Physical address of the message buffer */
> + u64 addr;
> + /* Size of contiguous region */
> + u64 size;
> + /* Kernel virtual address of the buffer */
> + void *buf;
> + /* Physical/DMA address of the buffer */
> + u64 phys;
> + /* Return value from remote processor */
> + int ret;
> + /* Pointer to qda_dev for context management */
> + struct qda_dev *qdev;
> + /* Back-pointer to FastRPC context */
> + struct fastrpc_invoke_context *fastrpc_ctx;
> + /* File private data for GEM object lookup */
> + struct drm_file *file_priv;
> +};
> +
> +/**
> + * struct fastrpc_invoke_context - Remote procedure call invocation context
> + *
> + * This structure maintains all state for a single remote procedure call,
> + * including buffer management, synchronization, and result handling.
> + */
> +struct fastrpc_invoke_context {
> + /* Unique context identifier for this invocation */
> + u64 ctxid;
> + /* Number of input buffers */
> + int inbufs;
> + /* Number of output buffers */
> + int outbufs;
> + /* Number of file descriptor handles */
> + int handles;
> + /* Number of scalar parameters */
> + int nscalars;
> + /* Total number of buffers (input + output) */
> + int nbufs;
> + /* Process ID of the calling process */
> + int pid;
> + /* Return value from the remote invocation */
> + int retval;
> + /* Length of metadata */
> + int metalen;
> + /* Client identifier for this session */
> + int client_id;
> + /* Protection domain identifier */
> + int pd;
> + /* Type of invocation request */
> + int type;
> + /* Scalars parameter encoding buffer information */
> + u32 sc;
> + /* Handle to the remote method being invoked */
> + u32 handle;
> + /* Pointer to CRC values for data integrity */
> + u32 *crc;
> + /* Pointer to array of file descriptors */
> + u64 *fdlist;
> + /* Size of the packet */
> + u64 pkt_size;
> + /* Aligned packet size for DMA transfers */
> + u64 aligned_pkt_size;
> + /* Array of invoke buffer descriptors */
> + struct fastrpc_invoke_buf *list;
> + /* Array of physical page descriptors for buffers */
> + struct fastrpc_phy_page *pages;
> + /* Array of physical page descriptors for input buffers */
> + struct fastrpc_phy_page *input_pages;
> + /* List node for linking contexts in a queue */
> + struct list_head node;
> + /* Completion object for synchronizing invocation */
> + struct completion work;
> + /* Pointer to the QDA message structure */
> + struct qda_msg *msg;
> + /* Array of remote procedure arguments */
> + union fastrpc_remote_arg *rpra;
> + /* Array of GEM objects for argument buffers */
> + struct drm_gem_object **gem_objs;
> + /* Pointer to user-space invoke arguments */
> + struct fastrpc_invoke_args *args;
> + /* Array of buffer overlap descriptors */
> + struct fastrpc_buf_overlap *olaps;
> + /* Reference counter for context lifetime management */
> + struct kref refcount;
> + /* GEM object for the main message buffer */
> + struct qda_gem_obj *msg_gem_obj;
> + /* DRM file private data */
> + struct drm_file *file_priv;
> + /* Pointer to request buffer */
> + void *req;
> + /* Pointer to response buffer */
> + void *rsp;
> + /* Pointer to input buffer */
> + void *inbuf;
> +};
> +
> +/* Remote Method ID table - identifies initialization and control operations */
> +#define FASTRPC_RMID_INIT_ATTACH 0 /* Attach to DSP session */
> +#define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP session */
> +
> +/* Common handle for initialization operations */
> +#define FASTRPC_INIT_HANDLE 0x1
> +
> +/* Protection Domain(PD) ids */
> +#define ROOT_PD (0)
> +
> +/**
> + * fastrpc_context_free - Free an invocation context
> + * @ref: Reference counter for the context
> + *
> + * This function is called when the reference count reaches zero,
> + * releasing all resources associated with the invocation context.
> + */
> +void fastrpc_context_free(struct kref *ref);
> +
> +/*
> + * FastRPC context and invocation management functions
> + */
> +
> +/**
> + * fastrpc_context_alloc - Allocate a new FastRPC invocation context
> + *
> + * Returns: Pointer to allocated context, or NULL on failure
> + */
> +struct fastrpc_invoke_context *fastrpc_context_alloc(void);
> +
> +/**
> + * fastrpc_prepare_args - Prepare arguments for FastRPC invocation
> + * @ctx: FastRPC invocation context
> + * @argp: User-space pointer to invocation arguments
> + *
> + * Returns: 0 on success, negative error code on failure
> + */
> +int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp);
> +
> +/**
> + * fastrpc_get_header_size - Get the size of the FastRPC message header
> + * @ctx: FastRPC invocation context
> + * @out_size: Pointer to store the header size in bytes
> + *
> + * Returns: 0 on success, negative error code on failure
> + */
> +int fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size);
> +
> +/**
> + * fastrpc_internal_invoke_pack - Pack invocation context into message
> + * @ctx: FastRPC invocation context
> + * @msg: QDA message structure to pack into
> + *
> + * Returns: 0 on success, negative error code on failure
> + */
> +int fastrpc_internal_invoke_pack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
> +
> +/**
> + * fastrpc_internal_invoke_unpack - Unpack response message into context
> + * @ctx: FastRPC invocation context
> + * @msg: QDA message structure to unpack from
> + *
> + * Returns: 0 on success, negative error code on failure
> + */
> +int fastrpc_internal_invoke_unpack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
> +
> +#endif /* __QDA_FASTRPC_H__ */
> diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
> index d91983048d6c..1066ab6ddc7b 100644
> --- a/drivers/accel/qda/qda_ioctl.c
> +++ b/drivers/accel/qda/qda_ioctl.c
> @@ -6,6 +6,8 @@
> #include "qda_drv.h"
> #include "qda_ioctl.h"
> #include "qda_prime.h"
> +#include "qda_fastrpc.h"
> +#include "qda_rpmsg.h"
>
> static int qda_validate_and_get_context(struct drm_device *dev, struct drm_file *file_priv,
> struct qda_dev **qdev, struct qda_user **qda_user)
> @@ -85,3 +87,108 @@ int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_p
> {
> return qda_prime_fd_to_handle(dev, file_priv, prime_fd, handle);
> }
> +
> +static int fastrpc_context_get_id(struct fastrpc_invoke_context *ctx, struct qda_dev *qdev)
> +{
> + int ret;
> + u32 id;
> +
> + if (!qdev)
> + return -EINVAL;
> +
> + if (atomic_read(&qdev->removing))
> + return -ENODEV;
> +
> + ret = xa_alloc(&qdev->ctx_xa, &id, ctx, xa_limit_32b, GFP_KERNEL);
> + if (ret)
> + return ret;
> +
> + ctx->ctxid = id << 4;
> + return 0;
> +}
> +
> +static void fastrpc_context_put_id(struct fastrpc_invoke_context *ctx, struct qda_dev *qdev)
> +{
> + if (qdev)
> + xa_erase(&qdev->ctx_xa, ctx->ctxid >> 4);
> +}
> +
> +static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
> + struct drm_file *file_priv)
> +{
> + struct qda_dev *qdev;
> + struct qda_user *qda_user;
> + struct qda_msg msg;
> + struct fastrpc_invoke_context *ctx;
> + struct drm_gem_object *gem_obj;
> + int err;
> + size_t hdr_size;
> +
> + err = qda_validate_and_get_context(dev, file_priv, &qdev, &qda_user);
> + if (err)
> + return err;
> +
> + ctx = fastrpc_context_alloc();
> + if (IS_ERR(ctx))
> + return PTR_ERR(ctx);
> +
> + err = fastrpc_context_get_id(ctx, qdev);
> + if (err) {
> + kref_put(&ctx->refcount, fastrpc_context_free);
> + return err;
> + }
> +
> + ctx->type = type;
> + ctx->file_priv = file_priv;
> + ctx->client_id = qda_user->client_id;
> +
> + err = fastrpc_prepare_args(ctx, (char __user *)data);
> + if (err)
> + goto err_context_free;
> +
> + err = fastrpc_get_header_size(ctx, &hdr_size);
> + if (err)
> + goto err_context_free;
> +
> + gem_obj = qda_gem_create_object(qdev->drm_dev,
> + qdev->drm_priv->iommu_mgr,
> + hdr_size, file_priv);
> + if (IS_ERR(gem_obj)) {
> + err = PTR_ERR(gem_obj);
> + goto err_context_free;
> + }
> +
> + ctx->msg_gem_obj = to_qda_gem_obj(gem_obj);
> +
> + err = fastrpc_internal_invoke_pack(ctx, &msg);
> + if (err)
> + goto err_context_free;
> +
> + err = qda_rpmsg_send_msg(qdev, &msg);
> + if (err)
> + goto err_context_free;
> +
> + err = qda_rpmsg_wait_for_rsp(ctx);
> + if (err)
> + goto err_context_free;
> +
> + err = fastrpc_internal_invoke_unpack(ctx, &msg);
> + if (err)
> + goto err_context_free;
> +
> +err_context_free:
> + fastrpc_context_put_id(ctx, qdev);
> + kref_put(&ctx->refcount, fastrpc_context_free);
> +
> + return err;
> +}
> +
> +int qda_ioctl_attach(struct drm_device *dev, void *data, struct drm_file *file_priv)
> +{
> + return fastrpc_invoke(FASTRPC_RMID_INIT_ATTACH, dev, data, file_priv);
> +}
> +
> +int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv)
> +{
> + return fastrpc_invoke(FASTRPC_RMID_INIT_RELEASE, qdev->drm_dev, NULL, file_priv);
> +}
> diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
> index d454256f5fc5..044c616a51c6 100644
> --- a/drivers/accel/qda/qda_ioctl.h
> +++ b/drivers/accel/qda/qda_ioctl.h
> @@ -38,4 +38,29 @@ int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_pr
> int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
> int prime_fd, u32 *handle);
>
> +/**
> + * qda_ioctl_attach - Attach to DSP root protection domain
> + * @dev: DRM device structure
> + * @data: User-space data for the attach operation
> + * @file_priv: DRM file private data
> + *
> + * This IOCTL handler attaches to the DSP root PD (Protection Domain)
> + * to enable communication between the host and DSP.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int qda_ioctl_attach(struct drm_device *dev, void *data, struct drm_file *file_priv);
> +
> +/**
> + * fastrpc_release_current_dsp_process - Release DSP process resources
> + * @qdev: QDA device structure
> + * @file_priv: DRM file private data
> + *
> + * This function releases all resources associated with a DSP process
> + * when a user-space client closes its file descriptor.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv);
> +
> #endif /* _QDA_IOCTL_H */
> diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
> index b2b44b4d3ca8..96a08d753271 100644
> --- a/drivers/accel/qda/qda_rpmsg.c
> +++ b/drivers/accel/qda/qda_rpmsg.c
> @@ -5,7 +5,11 @@
> #include <linux/of_platform.h>
> #include <linux/of.h>
> #include <linux/of_device.h>
> +#include <linux/completion.h>
> +#include <linux/wait.h>
> +#include <linux/sched.h>
> #include "qda_drv.h"
> +#include "qda_fastrpc.h"
> #include "qda_rpmsg.h"
> #include "qda_cb.h"
>
> @@ -15,7 +19,104 @@ static int qda_rpmsg_init(struct qda_dev *qdev)
> return 0;
> }
>
> -/* Utility function to allocate and initialize qda_dev */
> +static int validate_device_availability(struct qda_dev *qdev)
> +{
> + struct rpmsg_device *rpdev;
> +
> + if (!qdev)
> + return -ENODEV;
> +
> + if (atomic_read(&qdev->removing)) {
> + qda_dbg(qdev, "RPMsg device unavailable: removing\n");
> + return -ENODEV;
> + }
> +
> + mutex_lock(&qdev->lock);
> + rpdev = qdev->rpdev;
> + mutex_unlock(&qdev->lock);
> +
> + if (!rpdev) {
> + qda_dbg(qdev, "RPMsg device unavailable: rpdev is NULL\n");
> + return -ENODEV;
> + }
> +
> + return 0;
> +}
> +
> +static struct fastrpc_invoke_context *get_and_validate_context(struct qda_msg *msg,
> + struct qda_dev *qdev)
> +{
> + struct fastrpc_invoke_context *ctx = msg->fastrpc_ctx;
> +
> + if (!ctx) {
> + qda_dbg(qdev, "FastRPC context not found in message\n");
> + return ERR_PTR(-EINVAL);
> + }
> +
> + kref_get(&ctx->refcount);
> + return ctx;
> +}
> +
> +static void populate_fastrpc_msg(struct fastrpc_msg *dst, struct qda_msg *src)
> +{
> + dst->client_id = src->client_id;
> + dst->tid = src->tid;
> + dst->ctx = src->ctx;
> + dst->handle = src->handle;
> + dst->sc = src->sc;
> + dst->addr = src->addr;
> + dst->size = src->size;
> +}
> +
> +static int validate_callback_params(struct qda_dev *qdev, void *data, int len)
> +{
> + if (!qdev)
> + return -ENODEV;
> +
> + if (atomic_read(&qdev->removing))
> + return -ENODEV;
> +
> + if (len < sizeof(struct qda_invoke_rsp)) {
> + qda_dbg(qdev, "Invalid message size from remote: %d\n", len);
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
> +static unsigned long extract_context_id(struct qda_invoke_rsp *resp_msg)
> +{
> + return (resp_msg->ctx & 0xFF0) >> 4;
> +}
> +
> +static struct fastrpc_invoke_context *find_context_by_id(struct qda_dev *qdev,
> + unsigned long ctxid)
> +{
> + struct fastrpc_invoke_context *ctx;
> +
> + {
> + unsigned long flags;
> +
> + xa_lock_irqsave(&qdev->ctx_xa, flags);
> + ctx = xa_load(&qdev->ctx_xa, ctxid);
> + xa_unlock_irqrestore(&qdev->ctx_xa, flags);
> + }
> +
> + if (!ctx) {
> + qda_dbg(qdev, "FastRPC context not found for ctxid: %lu\n", ctxid);
> + return ERR_PTR(-ENOENT);
> + }
> +
> + return ctx;
> +}
> +
> +static void complete_context_processing(struct fastrpc_invoke_context *ctx, int retval)
> +{
> + ctx->retval = retval;
> + complete(&ctx->work);
> + kref_put(&ctx->refcount, fastrpc_context_free);
> +}
> +
> static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
> {
> struct qda_dev *qdev;
> @@ -62,9 +163,68 @@ static int qda_populate_child_devices(struct qda_dev *qdev, struct device_node *
> return success > 0 ? 0 : (count > 0 ? -ENODEV : 0);
> }
>
> +int qda_rpmsg_send_msg(struct qda_dev *qdev, struct qda_msg *msg)
> +{
> + int ret;
> + struct fastrpc_invoke_context *ctx;
> + struct fastrpc_msg msg1;
> + struct rpmsg_device *rpdev;
> +
> + ret = validate_device_availability(qdev);
> + if (ret)
> + return ret;
> +
> + ctx = get_and_validate_context(msg, qdev);
> + if (IS_ERR(ctx))
> + return PTR_ERR(ctx);
> +
> + populate_fastrpc_msg(&msg1, msg);
> +
> + mutex_lock(&qdev->lock);
> + rpdev = qdev->rpdev;
> + if (!rpdev) {
> + mutex_unlock(&qdev->lock);
> + kref_put(&ctx->refcount, fastrpc_context_free);
> + return -ENODEV;
> + }
> +
> + ret = rpmsg_send(rpdev->ept, (void *)&msg1, sizeof(msg1));
> + mutex_unlock(&qdev->lock);
> +
> + if (ret) {
> + qda_err(qdev, "rpmsg_send failed: %d\n", ret);
> + kref_put(&ctx->refcount, fastrpc_context_free);
> + return ret;
> + }
> +
> + return 0;
> +}
> +
> +int qda_rpmsg_wait_for_rsp(struct fastrpc_invoke_context *ctx)
> +{
> + return wait_for_completion_interruptible(&ctx->work);
> +}
> +
> static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, void *priv, u32 src)
> {
> - /* Dummy function for rpmsg driver */
> + struct qda_dev *qdev = dev_get_drvdata(&rpdev->dev);
> + struct qda_invoke_rsp *resp_msg = (struct qda_invoke_rsp *)data;
> + struct fastrpc_invoke_context *ctx;
> + unsigned long ctxid;
> + int ret;
> +
> + ret = validate_callback_params(qdev, data, len);
> + if (ret)
> + return ret;
> +
> + ctxid = extract_context_id(resp_msg);
> +
> + ctx = find_context_by_id(qdev, ctxid);
> + if (IS_ERR(ctx))
> + return PTR_ERR(ctx);
> +
> + complete_context_processing(ctx, resp_msg->retval);
> +
> return 0;
> }
>
> diff --git a/drivers/accel/qda/qda_rpmsg.h b/drivers/accel/qda/qda_rpmsg.h
> index 348827bff255..b3e76e44f4cd 100644
> --- a/drivers/accel/qda/qda_rpmsg.h
> +++ b/drivers/accel/qda/qda_rpmsg.h
> @@ -7,6 +7,46 @@
> #define __QDA_RPMSG_H__
>
> #include "qda_drv.h"
> +#include "qda_fastrpc.h"
> +
> +/**
> + * struct fastrpc_msg - FastRPC message structure for remote invocations
> + *
> + * This structure represents a FastRPC message sent to the remote processor
> + * via RPMsg transport layer.
> + */
> +struct fastrpc_msg {
> + /* Process client ID */
> + int client_id;
> + /* Thread ID */
> + int tid;
> + /* Context identifier for matching request/response */
> + u64 ctx;
> + /* Handle to invoke on remote processor */
> + u32 handle;
> + /* Scalars structure describing the data layout */
> + u32 sc;
> + /* Physical address of the message buffer */
> + u64 addr;
> + /* Size of contiguous region */
> + u64 size;
> +};
> +
> +/**
> + * struct qda_invoke_rsp - Response structure for FastRPC invocations
> + */
> +struct qda_invoke_rsp {
> + /* Invoke caller context for matching request/response */
> + u64 ctx;
> + /* Return value from the remote invocation */
> + int retval;
> +};
> +
> +/*
> + * RPMsg transport layer functions
> + */
> +int qda_rpmsg_send_msg(struct qda_dev *qdev, struct qda_msg *msg);
> +int qda_rpmsg_wait_for_rsp(struct fastrpc_invoke_context *ctx);
>
> /*
> * Transport layer registration
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 14/18] accel/qda: Add FastRPC dynamic invocation support
2026-02-23 19:09 ` [PATCH RFC 14/18] accel/qda: Add FastRPC dynamic invocation support Ekansh Gupta
@ 2026-02-23 23:10 ` Dmitry Baryshkov
2026-03-09 6:53 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-23 23:10 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:08AM +0530, Ekansh Gupta wrote:
> Extend the QDA FastRPC implementation to support dynamic remote
> procedure calls from userspace. A new DRM_QDA_INVOKE ioctl is added,
> which accepts a qda_invoke_args structure containing a remote handle,
> FastRPC scalars value and a pointer to an array of fastrpc_invoke_args
> describing the individual arguments. The driver copies the scalar and
> argument array into a fastrpc_invoke_context and reuses the existing
> buffer overlap and packing logic to build a GEM-backed message buffer
> for transport.
>
> The FastRPC core gains a FASTRPC_RMID_INVOKE_DYNAMIC method type and a
> fastrpc_prepare_args_invoke() helper that reads the qda_invoke_args
> header and argument descriptors from user or kernel memory using a
> copy_from_user_or_kernel() helper. The generic fastrpc_prepare_args()
> path is updated to handle the dynamic method alongside the existing
> INIT_ATTACH and INIT_RELEASE control calls, deriving the number of
> buffers and scalars from the provided FastRPC scalars encoding.
>
> On the transport side qda_ioctl_invoke() simply forwards the request
> to fastrpc_invoke() with the dynamic method id, allowing the RPMsg
> transport and context lookup to treat dynamic calls in the same way as
> the existing control methods. This patch establishes the basic FastRPC
> invoke mechanism on top of the QDA GEM and RPMsg infrastructure so
> that future patches can wire up more complex DSP APIs.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/qda_drv.c | 1 +
> drivers/accel/qda/qda_fastrpc.c | 48 +++++++++++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_fastrpc.h | 1 +
> drivers/accel/qda/qda_ioctl.c | 5 +++++
> drivers/accel/qda/qda_ioctl.h | 13 +++++++++++
> include/uapi/drm/qda_accel.h | 21 ++++++++++++++++++
> 6 files changed, 89 insertions(+)
>
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> index 3034ea660924..f94f780ea50a 100644
> --- a/drivers/accel/qda/qda_drv.c
> +++ b/drivers/accel/qda/qda_drv.c
> @@ -162,6 +162,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
> DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
> DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
> DRM_IOCTL_DEF_DRV(QDA_INIT_ATTACH, qda_ioctl_attach, 0),
> + DRM_IOCTL_DEF_DRV(QDA_INVOKE, qda_ioctl_invoke, 0),
> };
>
> static struct drm_driver qda_drm_driver = {
> diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
> index eda7c90070ee..a48b255ffb1b 100644
> --- a/drivers/accel/qda/qda_fastrpc.c
> +++ b/drivers/accel/qda/qda_fastrpc.c
> @@ -12,6 +12,16 @@
> #include "qda_gem.h"
> #include "qda_memory_manager.h"
>
> +static int copy_from_user_or_kernel(void *dst, const void __user *src, size_t size)
> +{
> + if ((unsigned long)src >= PAGE_OFFSET) {
> + memcpy(dst, src, size);
> + return 0;
> + } else {
> + return copy_from_user(dst, src, size) ? -EFAULT : 0;
> + }
Nah, it's a direct route to failure. __user is for user pointers, it
can't be a kernel data. Define separate functions and be 100% sure
whether the data is coming from the user (and thus needs to be
sanitized) or of it is coming from the kernel. Otherwise a funny user
can pass kernel pointer and get away with your code copying data from or
writing data to the kernel buffer.
> +}
> +
> static int copy_to_user_or_kernel(void __user *dst, const void *src, size_t size)
> {
> if ((unsigned long)dst >= PAGE_OFFSET) {
> @@ -509,6 +519,41 @@ static int fastrpc_prepare_args_release_process(struct fastrpc_invoke_context *c
> return 0;
> }
>
> +static int fastrpc_prepare_args_invoke(struct fastrpc_invoke_context *ctx, char __user *argp)
> +{
> + struct fastrpc_invoke_args *args = NULL;
> + struct qda_invoke_args inv;
> + int err = 0;
> + int nscalars;
> +
> + if (!argp)
> + return -EINVAL;
> +
> + err = copy_from_user_or_kernel(&inv, argp, sizeof(inv));
> + if (err)
> + return err;
> +
> + nscalars = REMOTE_SCALARS_LENGTH(inv.sc);
> +
> + if (nscalars) {
> + args = kcalloc(nscalars, sizeof(*args), GFP_KERNEL);
> + if (!args)
> + return -ENOMEM;
> +
> + err = copy_from_user_or_kernel(args, (const void __user *)(uintptr_t)inv.args,
> + nscalars * sizeof(*args));
So... You are allowing users to specify the address in the kernel
address space? Are you... sure?
> + if (err) {
> + kfree(args);
> + return err;
> + }
> + }
> + ctx->sc = inv.sc;
> + ctx->args = args;
> + ctx->handle = inv.handle;
> +
> + return 0;
> +}
> +
> int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
> {
> int err;
> @@ -521,6 +566,9 @@ int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
> case FASTRPC_RMID_INIT_RELEASE:
> err = fastrpc_prepare_args_release_process(ctx);
> break;
> + case FASTRPC_RMID_INVOKE_DYNAMIC:
> + err = fastrpc_prepare_args_invoke(ctx, argp);
> + break;
> default:
> return -EINVAL;
> }
> diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
> index 744421382079..bcadf9437a36 100644
> --- a/drivers/accel/qda/qda_fastrpc.h
> +++ b/drivers/accel/qda/qda_fastrpc.h
> @@ -237,6 +237,7 @@ struct fastrpc_invoke_context {
> /* Remote Method ID table - identifies initialization and control operations */
> #define FASTRPC_RMID_INIT_ATTACH 0 /* Attach to DSP session */
> #define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP session */
> +#define FASTRPC_RMID_INVOKE_DYNAMIC 0xFFFFFFFF /* Dynamic method invocation */
>
> /* Common handle for initialization operations */
> #define FASTRPC_INIT_HANDLE 0x1
> diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
> index 1066ab6ddc7b..e90aceabd30d 100644
> --- a/drivers/accel/qda/qda_ioctl.c
> +++ b/drivers/accel/qda/qda_ioctl.c
> @@ -192,3 +192,8 @@ int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *f
> {
> return fastrpc_invoke(FASTRPC_RMID_INIT_RELEASE, qdev->drm_dev, NULL, file_priv);
> }
> +
> +int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv)
> +{
> + return fastrpc_invoke(FASTRPC_RMID_INVOKE_DYNAMIC, dev, data, file_priv);
> +}
> diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
> index 044c616a51c6..e186c5183171 100644
> --- a/drivers/accel/qda/qda_ioctl.h
> +++ b/drivers/accel/qda/qda_ioctl.h
> @@ -63,4 +63,17 @@ int qda_ioctl_attach(struct drm_device *dev, void *data, struct drm_file *file_p
> */
> int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv);
>
> +/**
> + * qda_ioctl_invoke - Invoke a remote procedure on the DSP
> + * @dev: DRM device structure
> + * @data: User-space data containing invocation parameters
> + * @file_priv: DRM file private data
> + *
> + * This IOCTL handler initiates a remote procedure call on the DSP,
> + * marshalling arguments, executing the call, and returning results.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv);
> +
> #endif /* _QDA_IOCTL_H */
> diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
> index 4d3666c5b998..01072a9d0a91 100644
> --- a/include/uapi/drm/qda_accel.h
> +++ b/include/uapi/drm/qda_accel.h
> @@ -22,6 +22,9 @@ extern "C" {
> #define DRM_QDA_GEM_CREATE 0x01
> #define DRM_QDA_GEM_MMAP_OFFSET 0x02
> #define DRM_QDA_INIT_ATTACH 0x03
> +/* Indexes 0x04 to 0x06 are reserved for other requests */
> +#define DRM_QDA_INVOKE 0x07
> +
> /*
> * QDA IOCTL definitions
> *
> @@ -35,6 +38,8 @@ extern "C" {
> #define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
> struct drm_qda_gem_mmap_offset)
> #define DRM_IOCTL_QDA_INIT_ATTACH DRM_IO(DRM_COMMAND_BASE + DRM_QDA_INIT_ATTACH)
> +#define DRM_IOCTL_QDA_INVOKE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_INVOKE, \
> + struct qda_invoke_args)
>
> /**
> * struct drm_qda_query - Device information query structure
> @@ -95,6 +100,22 @@ struct fastrpc_invoke_args {
> __u32 attr;
> };
>
> +/**
> + * struct qda_invoke_args - User-space IOCTL arguments for invoking a function
> + * @handle: Handle identifying the remote function to invoke
> + * @sc: Scalars parameter encoding buffer counts and attributes
Encoding... how?
> + * @args: User-space pointer to the argument array
Which is defined at...?
Can you actually write the user code by looking at your uapi header?
> + *
> + * This structure is passed from user-space to invoke a remote function
> + * on the DSP. The scalars parameter encodes the number and types of
> + * input/output buffers.
> + */
> +struct qda_invoke_args {
> + __u32 handle;
> + __u32 sc;
> + __u64 args;
> +};
> +
> #if defined(__cplusplus)
> }
> #endif
>
> --
> 2.34.1
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices
2026-02-23 19:09 ` [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices Ekansh Gupta
2026-02-23 22:50 ` Dmitry Baryshkov
@ 2026-02-23 23:11 ` Bjorn Andersson
2026-03-02 8:30 ` Ekansh Gupta
1 sibling, 1 reply; 83+ messages in thread
From: Bjorn Andersson @ 2026-02-23 23:11 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:00AM +0530, Ekansh Gupta wrote:
> Introduce a per-device memory manager for the QDA driver that tracks
> IOMMU-capable compute context-bank (CB) devices. Each CB device is
> represented by a qda_iommu_device and registered with a central
> qda_memory_manager instance owned by qda_dev.
>
The name makes me expect that this manages memory, but it seems to
manage devices and context banks...
> The memory manager maintains an xarray of devices and assigns a
> unique ID to each CB. It also provides basic lifetime management
> and a workqueue for deferred device removal. qda_cb_setup_device()
> now allocates a qda_iommu_device for each CB and registers it with
> the memory manager after DMA configuration succeeds.
>
> qda_init_device() is extended to allocate and initialize the memory
> manager, while qda_deinit_device() will tear it down in later
> patches.
"in later patches" makes this extremely hard to review. I had to apply
the series to try to navigate the code...
> This prepares the QDA driver for fine-grained memory and
> IOMMU domain management tied to individual CB devices.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
[..]
> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
> diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
[..]
> @@ -46,6 +52,18 @@ static int qda_cb_setup_device(struct qda_dev *qdev, struct device *cb_dev)
> rc = dma_set_mask(cb_dev, DMA_BIT_MASK(pa_bits));
> if (rc) {
> qda_err(qdev, "%d bit DMA enable failed: %d\n", pa_bits, rc);
> + kfree(iommu_dev);
> + return rc;
> + }
> +
> + iommu_dev->dev = cb_dev;
> + iommu_dev->sid = sid;
> + snprintf(iommu_dev->name, sizeof(iommu_dev->name), "qda_iommu_dev_%u", sid);
It's not easy to follow, when you have scattered the code across so many
patches and so many files. But I don't think iommu_dev->name is ever
used.
> +
> + rc = qda_memory_manager_register_device(qdev->iommu_mgr, iommu_dev);
> + if (rc) {
> + qda_err(qdev, "Failed to register IOMMU device: %d\n", rc);
> + kfree(iommu_dev);
> return rc;
> }
>
> @@ -127,6 +145,8 @@ int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node)
> void qda_destroy_cb_device(struct device *cb_dev)
> {
> struct iommu_group *group;
> + struct qda_iommu_device *iommu_dev;
> + struct qda_dev *qdev;
>
> if (!cb_dev) {
> qda_dbg(NULL, "NULL CB device passed to destroy\n");
> @@ -135,6 +155,18 @@ void qda_destroy_cb_device(struct device *cb_dev)
>
> qda_dbg(NULL, "Destroying CB device %s\n", dev_name(cb_dev));
>
> + iommu_dev = dev_get_drvdata(cb_dev);
I'm not sure, but I think cb_dev is the struct device allocated in
qda_create_cb_device(), but I can not find a place where you set drvdata
for this device.
> + if (iommu_dev) {
> + if (cb_dev->parent) {
> + qdev = dev_get_drvdata(cb_dev->parent);
> + if (qdev && qdev->iommu_mgr) {
> + qda_dbg(NULL, "Unregistering IOMMU device for %s\n",
> + dev_name(cb_dev));
> + qda_memory_manager_unregister_device(qdev->iommu_mgr, iommu_dev);
> + }
> + }
> + }
> +
> group = iommu_group_get(cb_dev);
> if (group) {
> qda_dbg(NULL, "Removing %s from IOMMU group\n", dev_name(cb_dev));
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
[..]
> @@ -25,12 +37,46 @@ static void init_device_resources(struct qda_dev *qdev)
> atomic_set(&qdev->removing, 0);
> }
>
> +static int init_memory_manager(struct qda_dev *qdev)
> +{
> + int ret;
> +
> + qda_dbg(qdev, "Initializing IOMMU manager\n");
> +
> + qdev->iommu_mgr = kzalloc_obj(*qdev->iommu_mgr, GFP_KERNEL);
> + if (!qdev->iommu_mgr)
> + return -ENOMEM;
> +
> + ret = qda_memory_manager_init(qdev->iommu_mgr);
> + if (ret) {
> + qda_err(qdev, "Failed to initialize memory manager: %d\n", ret);
qda_memory_manager_init() already logged 1 error and 1 debug prints if
you get here.
> + kfree(qdev->iommu_mgr);
> + qdev->iommu_mgr = NULL;
We're going to fail probe, you shouldn't have to clear this.
> + return ret;
> + }
> +
> + qda_dbg(qdev, "IOMMU manager initialized successfully\n");
> + return 0;
> +}
> +
> int qda_init_device(struct qda_dev *qdev)
> {
> + int ret;
> +
> init_device_resources(qdev);
>
> + ret = init_memory_manager(qdev);
> + if (ret) {
> + qda_err(qdev, "IOMMU manager initialization failed: %d\n", ret);
And now we have 2 debug prints and two error prints in the log.
> + goto err_cleanup_resources;
> + }
> +
> qda_dbg(qdev, "QDA device initialized successfully\n");
Or, if we get here, you have 8 debug prints.
Please learn how to use kprobe/kretprobe instead of reimplementing it
using printk().
> return 0;
> +
> +err_cleanup_resources:
> + cleanup_device_resources(qdev);
> + return ret;
> }
>
> static int __init qda_core_init(void)
> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> index eb732b7d8091..2cb97e4eafbf 100644
> --- a/drivers/accel/qda/qda_drv.h
> +++ b/drivers/accel/qda/qda_drv.h
> @@ -11,6 +11,7 @@
> #include <linux/mutex.h>
> #include <linux/rpmsg.h>
> #include <linux/xarray.h>
> +#include "qda_memory_manager.h"
>
> /* Driver identification */
> #define DRIVER_NAME "qda"
> @@ -23,6 +24,8 @@ struct qda_dev {
> struct device *dev;
> /* Mutex protecting device state */
> struct mutex lock;
> + /* IOMMU/memory manager */
> + struct qda_memory_manager *iommu_mgr;
> /* Flag indicating device removal in progress */
> atomic_t removing;
> /* Name of the DSP (e.g., "cdsp", "adsp") */
> diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
[..]
> +int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
> + struct qda_iommu_device *iommu_dev)
> +{
> + int ret;
> + u32 id;
> +
> + if (!mem_mgr || !iommu_dev || !iommu_dev->dev) {
How could this happen? You call this function from one place, that looks
like this:
iommu_dev->dev = cb_dev;
iommu_dev->sid = sid;
rc = qda_memory_manager_register_device(qdev->iommu_mgr, iommu_dev);
You just allocated in filled out iommu_dev.
Looking up the callstack, we're coming from qda_rpmsg_probe() which just
did qda_init_device() which created the qsdev->iommu_mgr.
In other words, these can't possibly be NULL.
> + qda_err(NULL, "Invalid parameters for device registration\n");
> + return -EINVAL;
> + }
> +
> + init_iommu_device_fields(iommu_dev, mem_mgr);
> +
> + ret = allocate_device_id(mem_mgr, iommu_dev, &id);
> + if (ret) {
> + qda_err(NULL, "Failed to allocate device ID: %d (sid=%u)\n", ret, iommu_dev->sid);
> + return ret;
> + }
> +
> + iommu_dev->id = id;
> +
> + qda_dbg(NULL, "Registered device id=%u (sid=%u)\n", id, iommu_dev->sid);
> +
> + return 0;
> +}
> +
> +void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
> + struct qda_iommu_device *iommu_dev)
> +{
> + if (!mem_mgr || !iommu_dev) {
The one call to this function is wrapped in:
if (iommu_dev) {
if (qdev->iommu_mgr) {
qda_dbg(NULL, ...);
qda_memory_manager_unregister_device(qdev->iommu_mgr, iommu_dev);
}
}
> + qda_err(NULL, "Attempted to unregister invalid device/manager\n");
> + return;
> + }
> +
> + qda_dbg(NULL, "Unregistering device id=%u (refcount=%u)\n", iommu_dev->id,
> + refcount_read(&iommu_dev->refcount));
And just before the call to qda_memory_manager_unregister_device() you
print a debug log, saying you will call this function.
> +
> + if (refcount_read(&iommu_dev->refcount) == 0) {
> + xa_erase(&mem_mgr->device_xa, iommu_dev->id);
> + kfree(iommu_dev);
> + return;
> + }
> +
> + if (refcount_dec_and_test(&iommu_dev->refcount)) {
> + qda_info(NULL, "Device id=%u refcount reached zero, queuing removal\n",
> + iommu_dev->id);
> + queue_work(mem_mgr->wq, &iommu_dev->remove_work);
> + }
> +}
> +
[..]
> diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h
[..]
> +
> +/**
This says "kernel-doc"
> + * struct qda_iommu_device - IOMMU device instance for memory management
> + *
> + * This structure represents a single IOMMU-enabled device managed by the
> + * memory manager. Each device can be assigned to a specific process.
> + */
> +struct qda_iommu_device {
> + /* Unique identifier for this IOMMU device */
But this doesn't follow kernel-doc style.
At the end of the series,
./scripts/kernel-doc -none -vv -Wall drivers/accel/qda/
reports 270 warnings.
> + u32 id;
> + /* Pointer to the underlying device */
> + struct device *dev;
> + /* Name for the device */
> + char name[32];
> + /* Spinlock protecting concurrent access to device */
> + spinlock_t lock;
> + /* Reference counter for device */
> + refcount_t refcount;
> + /* Work structure for deferred device removal */
> + struct work_struct remove_work;
> + /* Stream ID for IOMMU transactions */
> + u32 sid;
> + /* Pointer to parent memory manager */
> + struct qda_memory_manager *manager;
> +};
Regards,
Bjorn
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
2026-02-23 19:08 ` [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs Ekansh Gupta
2026-02-23 21:17 ` Dmitry Baryshkov
@ 2026-02-24 3:33 ` Trilok Soni
2026-02-25 14:17 ` Ekansh Gupta
1 sibling, 1 reply; 83+ messages in thread
From: Trilok Soni @ 2026-02-24 3:33 UTC (permalink / raw)
To: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/23/2026 11:08 AM, Ekansh Gupta wrote:
> Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
> integrated in the DRM accel subsystem.
>
> The new docs introduce QDA as a DRM/accel-based implementation of
> Hexagon DSP offload that is intended as a modern alternative to the
> legacy FastRPC driver in drivers/misc. The text describes the driver
> motivation, high-level architecture and interaction with IOMMU context
> banks, GEM-based buffer management and the RPMsg transport.
>
> The user-space facing section documents the main QDA IOCTLs used to
> establish DSP sessions, manage GEM buffer objects and invoke remote
> procedures using the FastRPC protocol, along with a typical lifecycle
> example for applications.
>
> Finally, the driver is wired into the Compute Accelerators
> documentation index under Documentation/accel, and a brief debugging
> section shows how to enable dynamic debug for the QDA implementation.
So existing applications written over character device UAPI needs to be
rewritten over new UAPI and it will be broken once this driver gets
merged? Are we going to keep both the drivers in the Linux kernel
and not deprecate the /char device one?
Is Qualcomm going to provide the wrapper library in the userspace
so that existing applications by our customers and developers
keep working w/ the newer kernel if the char interface based
driver gets deprecated? It is not clear from your text above.
---Trilok Soni
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (18 preceding siblings ...)
2026-02-23 22:03 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Bjorn Andersson
@ 2026-02-24 3:37 ` Trilok Soni
2026-02-24 3:39 ` Trilok Soni
` (3 subsequent siblings)
23 siblings, 0 replies; 83+ messages in thread
From: Trilok Soni @ 2026-02-24 3:37 UTC (permalink / raw)
To: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/23/2026 11:08 AM, Ekansh Gupta wrote:
> This patch series introduces the Qualcomm DSP Accelerator (QDA) driver,
> a modern DRM-based accelerator implementation for Qualcomm Hexagon DSPs.
> The driver provides a standardized interface for offloading computational
> tasks to DSPs found on Qualcomm SoCs, supporting all DSP domains (ADSP,
> CDSP, SDSP, GDSP).
>
> The QDA driver is designed as an alternative for the FastRPC driver
alternative or replacement? are you going to keep both drivers?
> in drivers/misc/, offering improved resource management, better integration
> with standard kernel subsystems, and alignment with the Linux kernel's
> Compute Accelerators framework.
>
> User-space staging branch
> ============
> https://github.com/qualcomm/fastrpc/tree/accel/staging
>
> Key Features
> ============
>
> * Standard DRM accelerator interface via /dev/accel/accelN
> * GEM-based buffer management with DMA-BUF import/export support
> * IOMMU-based memory isolation using per-process context banks
> * FastRPC protocol implementation for DSP communication
> * RPMsg transport layer for reliable message passing
> * Support for all DSP domains (ADSP, CDSP, SDSP, GDSP)
> * Comprehensive IOCTL interface for DSP operations
>
> High-Level Architecture Differences with Existing FastRPC Driver
> =================================================================
>
> The QDA driver represents a significant architectural departure from the
> existing FastRPC driver (drivers/misc/fastrpc.c), addressing several key
> limitations while maintaining protocol compatibility:
>
> 1. DRM Accelerator Framework Integration
> - FastRPC: Custom character device (/dev/fastrpc-*)
> - QDA: Standard DRM accel device (/dev/accel/accelN)
> - Benefit: Leverages established DRM infrastructure for device
> management.
>
> 2. Memory Management
> - FastRPC: Custom memory allocator with ION/DMA-BUF integration
> - QDA: Native GEM objects with full PRIME support
> - Benefit: Seamless buffer sharing using standard DRM mechanisms
>
> 3. IOMMU Context Bank Management
> - FastRPC: Direct IOMMU domain manipulation, limited isolation
> - QDA: Custom compute bus (qda_cb_bus_type) with proper device model
> - Benefit: Each CB device is a proper struct device with IOMMU group
> support, enabling better isolation and resource tracking.
> - https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualcomm.com/
>
> 4. Memory Manager Architecture
> - FastRPC: Monolithic allocator
> - QDA: Pluggable memory manager with backend abstraction
> - Benefit: Currently uses DMA-coherent backend, easily extensible for
> future memory types (e.g., carveout, CMA)
>
> 5. Transport Layer
> - FastRPC: Direct RPMsg integration in core driver
> - QDA: Abstracted transport layer (qda_rpmsg.c)
> - Benefit: Clean separation of concerns, easier to add alternative
> transports if needed
>
> 8. Code Organization
> - FastRPC: ~3000 lines in single file
> - QDA: Modular design across multiple files (~4600 lines total)
> * qda_drv.c: Core driver and DRM integration
> * qda_gem.c: GEM object management
> * qda_memory_manager.c: Memory and IOMMU management
> * qda_fastrpc.c: FastRPC protocol implementation
> * qda_rpmsg.c: Transport layer
> * qda_cb.c: Context bank device management
> - Benefit: Better maintainability, clearer separation of concerns
>
> 9. UAPI Design
> - FastRPC: Custom IOCTL interface
> - QDA: DRM-style IOCTLs with proper versioning support
> - Benefit: Follows DRM conventions, easier userspace integration
>
> 10. Documentation
> - FastRPC: Minimal in-tree documentation
> - QDA: Comprehensive documentation in Documentation/accel/qda/
> - Benefit: Better developer experience, clearer API contracts
>
> 11. Buffer Reference Mechanism
> - FastRPC: Uses buffer file descriptors (FDs) for all book-keeping
> in both kernel and DSP
> - QDA: Uses GEM handles for kernel-side management, providing better
> integration with DRM subsystem
> - Benefit: Leverages DRM GEM infrastructure for reference counting,
> lifetime management, and integration with other DRM components
>
> Key Technical Improvements
> ===========================
>
> * Proper device model: CB devices are real struct device instances on a
> custom bus, enabling proper IOMMU group management and power management
> integration
>
> * Reference-counted IOMMU devices: Multiple file descriptors from the same
> process share a single IOMMU device, reducing overhead
>
> * GEM-based buffer lifecycle: Automatic cleanup via DRM GEM reference
> counting, eliminating many resource leak scenarios
>
> * Modular memory backends: The memory manager supports pluggable backends,
> currently implementing DMA-coherent allocations with SID-prefixed
> addresses for DSP firmware
>
> * Context-based invocation tracking: XArray-based context management with
> proper synchronization and cleanup
>
> Patch Series Organization
> ==========================
>
> Patches 1-2: Driver skeleton and documentation
> Patches 3-6: RPMsg transport and IOMMU/CB infrastructure
> Patches 7-9: DRM device registration and basic IOCTL
> Patches 10-12: GEM buffer management and PRIME support
> Patches 13-17: FastRPC protocol implementation (attach, invoke, create,
> map/unmap)
> Patch 18: MAINTAINERS entry
>
> Open Items
> ===========
>
> The following items are identified as open items:
>
> 1. Privilege Level Management
> - Currently, daemon processes and user processes have the same access
> level as both use the same accel device node. This needs to be
> addressed as daemons attach to privileged DSP PDs and require
> higher privilege levels for system-level operations
> - Seeking guidance on the best approach: separate device nodes,
> capability-based checks, or DRM master/authentication mechanisms
>
> 2. UAPI Compatibility Layer
> - Add UAPI compat layer to facilitate migration of client applications
> from existing FastRPC UAPI to the new QDA accel driver UAPI,
> ensuring smooth transition for existing userspace code
> - Seeking guidance on implementation approach: in-kernel translation
> layer, userspace wrapper library, or hybrid solution
>
> 3. Documentation Improvements
> - Add detailed IOCTL usage examples
> - Document DSP firmware interface requirements
> - Create migration guide from existing FastRPC
>
> 4. Per-Domain Memory Allocation
> - Develop new userspace API to support memory allocation on a per
> domain basis, enabling domain-specific memory management and
> optimization
>
> 5. Audio and Sensors PD Support
> - The current patch series does not handle Audio PD and Sensors PD
> functionalities. These specialized protection domains require
> additional support for real-time constraints and power management
>
> Interface Compatibility
> ========================
>
> The QDA driver maintains compatibility with existing FastRPC infrastructure:
>
> * Device Tree Bindings: The driver uses the same device tree bindings as
> the existing FastRPC driver, ensuring no changes are required to device
> tree sources. The "qcom,fastrpc" compatible string and child node
> structure remain unchanged.
>
> * Userspace Interface: While the driver provides a new DRM-based UAPI,
> the underlying FastRPC protocol and DSP firmware interface remain
> compatible. This ensures that DSP firmware and libraries continue to
> work without modification.
>
> * Migration Path: The modular design allows for gradual migration, where
> both drivers can coexist during the transition period. Applications can
> be migrated incrementally to the new UAPI with the help of the planned
> compatibility layer.
>
> References
> ==========
>
> Previous discussions on this migration:
> - https://lkml.org/lkml/2024/6/24/479
> - https://lkml.org/lkml/2024/6/21/1252
>
> Testing
> =======
>
> The driver has been tested on Qualcomm platforms with:
> - Basic FastRPC attach/release operations
> - DSP process creation and initialization
> - Memory mapping/unmapping operations
> - Dynamic invocation with various buffer types
> - GEM buffer allocation and mmap
> - PRIME buffer import from other subsystems
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> Ekansh Gupta (18):
> accel/qda: Add Qualcomm QDA DSP accelerator driver docs
> accel/qda: Add Qualcomm DSP accelerator driver skeleton
> accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
> accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
> accel/qda: Create compute CB devices on QDA compute bus
> accel/qda: Add memory manager for CB devices
> accel/qda: Add DRM accel device registration for QDA driver
> accel/qda: Add per-file DRM context and open/close handling
> accel/qda: Add QUERY IOCTL and basic QDA UAPI header
> accel/qda: Add DMA-backed GEM objects and memory manager integration
> accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
> accel/qda: Add PRIME dma-buf import support
> accel/qda: Add initial FastRPC attach and release support
> accel/qda: Add FastRPC dynamic invocation support
> accel/qda: Add FastRPC DSP process creation support
> accel/qda: Add FastRPC-based DSP memory mapping support
> accel/qda: Add FastRPC-based DSP memory unmapping support
> MAINTAINERS: Add MAINTAINERS entry for QDA driver
>
> Documentation/accel/index.rst | 1 +
> Documentation/accel/qda/index.rst | 14 +
> Documentation/accel/qda/qda.rst | 129 ++++
> MAINTAINERS | 9 +
> arch/arm64/configs/defconfig | 2 +
> drivers/accel/Kconfig | 1 +
> drivers/accel/Makefile | 2 +
> drivers/accel/qda/Kconfig | 35 ++
> drivers/accel/qda/Makefile | 19 +
> drivers/accel/qda/qda_cb.c | 182 ++++++
> drivers/accel/qda/qda_cb.h | 26 +
> drivers/accel/qda/qda_compute_bus.c | 23 +
> drivers/accel/qda/qda_drv.c | 375 ++++++++++++
> drivers/accel/qda/qda_drv.h | 171 ++++++
> drivers/accel/qda/qda_fastrpc.c | 1002 ++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_fastrpc.h | 433 ++++++++++++++
> drivers/accel/qda/qda_gem.c | 211 +++++++
> drivers/accel/qda/qda_gem.h | 103 ++++
> drivers/accel/qda/qda_ioctl.c | 271 +++++++++
> drivers/accel/qda/qda_ioctl.h | 118 ++++
> drivers/accel/qda/qda_memory_dma.c | 91 +++
> drivers/accel/qda/qda_memory_dma.h | 46 ++
> drivers/accel/qda/qda_memory_manager.c | 382 ++++++++++++
> drivers/accel/qda/qda_memory_manager.h | 148 +++++
> drivers/accel/qda/qda_prime.c | 194 +++++++
> drivers/accel/qda/qda_prime.h | 43 ++
> drivers/accel/qda/qda_rpmsg.c | 327 +++++++++++
> drivers/accel/qda/qda_rpmsg.h | 57 ++
> drivers/iommu/iommu.c | 4 +
> include/linux/qda_compute_bus.h | 22 +
> include/uapi/drm/qda_accel.h | 224 +++++++
> 31 files changed, 4665 insertions(+)
> ---
> base-commit: d4906ae14a5f136ceb671bb14cedbf13fa560da6
> change-id: 20260223-qda-firstpost-4ab05249e2cc
>
> Best regards,
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (19 preceding siblings ...)
2026-02-24 3:37 ` Trilok Soni
@ 2026-02-24 3:39 ` Trilok Soni
2026-03-02 8:43 ` Ekansh Gupta
2026-02-25 13:42 ` Bryan O'Donoghue
` (2 subsequent siblings)
23 siblings, 1 reply; 83+ messages in thread
From: Trilok Soni @ 2026-02-24 3:39 UTC (permalink / raw)
To: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/23/2026 11:08 AM, Ekansh Gupta wrote:
> * Userspace Interface: While the driver provides a new DRM-based UAPI,
> the underlying FastRPC protocol and DSP firmware interface remain
> compatible. This ensures that DSP firmware and libraries continue to
> work without modification.
This is not very clear and it is not explained properly in the 1st patch
where you document this driver. It doesn't talk about how older
UAPI based application will still work without any change
or recompilation. I prefer the same old binary to work w/ the new
DRM based interface without any changes (I don't know how that will be possible)
OR if recompilation + linking is needed then you need to provide the wrapper library.
---Trilok Soni
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 12/18] accel/qda: Add PRIME dma-buf import support
2026-02-23 19:09 ` [PATCH RFC 12/18] accel/qda: Add PRIME dma-buf import support Ekansh Gupta
@ 2026-02-24 8:52 ` Matthew Brost
2026-03-02 9:19 ` Ekansh Gupta
2026-02-24 9:12 ` Christian König
1 sibling, 1 reply; 83+ messages in thread
From: Matthew Brost @ 2026-02-24 8:52 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On Tue, Feb 24, 2026 at 12:39:06AM +0530, Ekansh Gupta wrote:
> Add PRIME dma-buf import support for QDA GEM buffer objects and integrate
> it with the existing per-process memory manager and IOMMU device model.
>
> The implementation extends qda_gem_obj to represent imported dma-bufs,
> including dma_buf references, attachment state, scatter-gather tables
> and an imported DMA address used for DSP-facing book-keeping. The
> qda_gem_prime_import() path handles reimports of buffers originally
> exported by QDA as well as imports of external dma-bufs, attaching them
> to the assigned IOMMU device and mapping them through the memory manager
> for DSP access. The GEM free path is updated to unmap and detach
> imported buffers while preserving the existing behaviour for locally
> allocated memory.
>
> The PRIME fd-to-handle path is implemented in qda_prime_fd_to_handle(),
> which records the calling drm_file in a driver-private import context
> before invoking the core DRM helpers. The GEM import callback retrieves
> this context to ensure that an IOMMU device is assigned to the process
> and that imported buffers follow the same per-process IOMMU selection
> rules as natively allocated GEM objects.
>
> This patch prepares the driver for interoperable buffer sharing between
> QDA and other dma-buf capable subsystems while keeping IOMMU mapping and
> lifetime handling consistent with the existing GEM allocation flow.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/qda/Makefile | 1 +
> drivers/accel/qda/qda_drv.c | 8 ++
> drivers/accel/qda/qda_drv.h | 4 +
> drivers/accel/qda/qda_gem.c | 60 +++++++---
> drivers/accel/qda/qda_gem.h | 10 ++
> drivers/accel/qda/qda_ioctl.c | 7 ++
> drivers/accel/qda/qda_ioctl.h | 15 +++
> drivers/accel/qda/qda_memory_manager.c | 42 ++++++-
> drivers/accel/qda/qda_memory_manager.h | 14 +++
> drivers/accel/qda/qda_prime.c | 194 +++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_prime.h | 43 ++++++++
> 11 files changed, 377 insertions(+), 21 deletions(-)
>
> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
> index 88c324fa382c..8286f5279748 100644
> --- a/drivers/accel/qda/Makefile
> +++ b/drivers/accel/qda/Makefile
> @@ -13,5 +13,6 @@ qda-y := \
> qda_ioctl.o \
> qda_gem.o \
> qda_memory_dma.o \
> + qda_prime.o \
>
> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> index 0dd0e2bb2c0f..4adee00b1f2c 100644
> --- a/drivers/accel/qda/qda_drv.c
> +++ b/drivers/accel/qda/qda_drv.c
> @@ -10,9 +10,11 @@
> #include <drm/drm_gem.h>
> #include <drm/drm_ioctl.h>
> #include <drm/qda_accel.h>
> +#include <drm/drm_prime.h>
>
> #include "qda_drv.h"
> #include "qda_gem.h"
> +#include "qda_prime.h"
> #include "qda_ioctl.h"
> #include "qda_rpmsg.h"
>
> @@ -166,6 +168,8 @@ static struct drm_driver qda_drm_driver = {
> .postclose = qda_postclose,
> .ioctls = qda_ioctls,
> .num_ioctls = ARRAY_SIZE(qda_ioctls),
> + .gem_prime_import = qda_gem_prime_import,
> + .prime_fd_to_handle = qda_ioctl_prime_fd_to_handle,
> .name = DRIVER_NAME,
> .desc = "Qualcomm DSP Accelerator Driver",
> };
> @@ -174,6 +178,7 @@ static void cleanup_drm_private(struct qda_dev *qdev)
> {
> if (qdev->drm_priv) {
> qda_dbg(qdev, "Cleaning up DRM private data\n");
> + mutex_destroy(&qdev->drm_priv->import_lock);
> kfree(qdev->drm_priv);
> }
> }
> @@ -240,6 +245,9 @@ static int init_drm_private(struct qda_dev *qdev)
> if (!qdev->drm_priv)
> return -ENOMEM;
>
> + mutex_init(&qdev->drm_priv->import_lock);
> + qdev->drm_priv->current_import_file_priv = NULL;
> +
> qda_dbg(qdev, "DRM private data initialized successfully\n");
> return 0;
> }
> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
> index 8a2cd474958b..bb0dd7e284c6 100644
> --- a/drivers/accel/qda/qda_drv.h
> +++ b/drivers/accel/qda/qda_drv.h
> @@ -64,6 +64,10 @@ struct qda_drm_priv {
> struct qda_memory_manager *iommu_mgr;
> /* Back-pointer to qda_dev */
> struct qda_dev *qdev;
> + /* Lock protecting import context */
> + struct mutex import_lock;
> + /* Current file_priv during prime import */
> + struct drm_file *current_import_file_priv;
> };
>
> /* struct qda_dev - Main device structure for QDA driver */
> diff --git a/drivers/accel/qda/qda_gem.c b/drivers/accel/qda/qda_gem.c
> index bbd54e2502d3..37279e8b46fe 100644
> --- a/drivers/accel/qda/qda_gem.c
> +++ b/drivers/accel/qda/qda_gem.c
> @@ -8,6 +8,7 @@
> #include "qda_gem.h"
> #include "qda_memory_manager.h"
> #include "qda_memory_dma.h"
> +#include "qda_prime.h"
>
> static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
> {
> @@ -15,23 +16,29 @@ static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
> qda_err(NULL, "Invalid GEM object size\n");
> return -EINVAL;
> }
> - if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
> - qda_err(NULL, "Allocated buffer missing IOMMU device\n");
> - return -EINVAL;
> - }
> - if (!qda_gem_obj->iommu_dev->dev) {
> - qda_err(NULL, "Allocated buffer missing IOMMU device\n");
> - return -EINVAL;
> - }
> - if (!qda_gem_obj->virt) {
> - qda_err(NULL, "Allocated buffer missing virtual address\n");
> - return -EINVAL;
> - }
> - if (qda_gem_obj->dma_addr == 0) {
> - qda_err(NULL, "Allocated buffer missing DMA address\n");
> - return -EINVAL;
> + if (qda_gem_obj->is_imported) {
> + if (!qda_gem_obj->sgt) {
> + qda_err(NULL, "Imported buffer missing sgt\n");
> + return -EINVAL;
> + }
> + if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
> + qda_err(NULL, "Imported buffer missing IOMMU device\n");
> + return -EINVAL;
> + }
> + } else {
> + if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
> + qda_err(NULL, "Allocated buffer missing IOMMU device\n");
> + return -EINVAL;
> + }
> + if (!qda_gem_obj->virt) {
> + qda_err(NULL, "Allocated buffer missing virtual address\n");
> + return -EINVAL;
> + }
> + if (qda_gem_obj->dma_addr == 0) {
> + qda_err(NULL, "Allocated buffer missing DMA address\n");
> + return -EINVAL;
> + }
> }
> -
> return 0;
> }
>
> @@ -60,9 +67,21 @@ void qda_gem_free_object(struct drm_gem_object *gem_obj)
> struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(gem_obj);
> struct qda_drm_priv *drm_priv = get_drm_priv_from_device(gem_obj->dev);
>
> - if (qda_gem_obj->virt) {
> - if (drm_priv && drm_priv->iommu_mgr)
> + if (qda_gem_obj->is_imported) {
> + if (qda_gem_obj->attachment && qda_gem_obj->sgt)
> + dma_buf_unmap_attachment_unlocked(qda_gem_obj->attachment,
> + qda_gem_obj->sgt, DMA_BIDIRECTIONAL);
> + if (qda_gem_obj->attachment)
> + dma_buf_detach(qda_gem_obj->dma_buf, qda_gem_obj->attachment);
> + if (qda_gem_obj->dma_buf)
> + dma_buf_put(qda_gem_obj->dma_buf);
> + if (qda_gem_obj->iommu_dev && drm_priv && drm_priv->iommu_mgr)
> qda_memory_manager_free(drm_priv->iommu_mgr, qda_gem_obj);
> + } else {
> + if (qda_gem_obj->virt) {
> + if (drm_priv && drm_priv->iommu_mgr)
> + qda_memory_manager_free(drm_priv->iommu_mgr, qda_gem_obj);
> + }
> }
>
> drm_gem_object_release(gem_obj);
> @@ -174,6 +193,11 @@ struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
> qda_gem_obj = qda_gem_alloc_object(drm_dev, aligned_size);
> if (IS_ERR(qda_gem_obj))
> return (struct drm_gem_object *)qda_gem_obj;
> + qda_gem_obj->is_imported = false;
> + qda_gem_obj->dma_buf = NULL;
> + qda_gem_obj->attachment = NULL;
> + qda_gem_obj->sgt = NULL;
> + qda_gem_obj->imported_dma_addr = 0;
>
> ret = qda_memory_manager_alloc(iommu_mgr, qda_gem_obj, file_priv);
> if (ret) {
> diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h
> index cbd5d0a58fa4..3566c5b2ad88 100644
> --- a/drivers/accel/qda/qda_gem.h
> +++ b/drivers/accel/qda/qda_gem.h
> @@ -31,6 +31,16 @@ struct qda_gem_obj {
> size_t size;
> /* IOMMU device that performed the allocation */
> struct qda_iommu_device *iommu_dev;
> + /* True if buffer is imported, false if allocated */
> + bool is_imported;
> + /* Reference to imported dma_buf */
> + struct dma_buf *dma_buf;
> + /* DMA buf attachment */
> + struct dma_buf_attachment *attachment;
> + /* Scatter-gather table */
> + struct sg_table *sgt;
> + /* DMA address of imported buffer */
> + dma_addr_t imported_dma_addr;
> };
>
> /*
> diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
> index ef3c9c691cb7..d91983048d6c 100644
> --- a/drivers/accel/qda/qda_ioctl.c
> +++ b/drivers/accel/qda/qda_ioctl.c
> @@ -5,6 +5,7 @@
> #include <drm/qda_accel.h>
> #include "qda_drv.h"
> #include "qda_ioctl.h"
> +#include "qda_prime.h"
>
> static int qda_validate_and_get_context(struct drm_device *dev, struct drm_file *file_priv,
> struct qda_dev **qdev, struct qda_user **qda_user)
> @@ -78,3 +79,9 @@ int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_fil
> drm_gem_object_put(gem_obj);
> return ret;
> }
> +
> +int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv, int prime_fd,
> + u32 *handle)
> +{
> + return qda_prime_fd_to_handle(dev, file_priv, prime_fd, handle);
> +}
> diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
> index 6bf3bcd28c0e..d454256f5fc5 100644
> --- a/drivers/accel/qda/qda_ioctl.h
> +++ b/drivers/accel/qda/qda_ioctl.h
> @@ -23,4 +23,19 @@
> */
> int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
>
> +/**
> + * qda_ioctl_prime_fd_to_handle - IOCTL handler for PRIME FD to handle conversion
> + * @dev: DRM device structure
> + * @file_priv: DRM file private data
> + * @prime_fd: File descriptor of the PRIME buffer
> + * @handle: Output parameter for the GEM handle
> + *
> + * This IOCTL handler converts a PRIME file descriptor to a GEM handle.
> + * It serves as both the DRM driver callback and can be used directly.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
> + int prime_fd, u32 *handle);
> +
> #endif /* _QDA_IOCTL_H */
> diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
> index e225667557ee..3fd20f17c57b 100644
> --- a/drivers/accel/qda/qda_memory_manager.c
> +++ b/drivers/accel/qda/qda_memory_manager.c
> @@ -154,8 +154,8 @@ static struct qda_iommu_device *get_process_iommu_device(struct qda_memory_manag
> return qda_priv->assigned_iommu_dev;
> }
>
> -static int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
> - struct drm_file *file_priv)
> +int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
> + struct drm_file *file_priv)
> {
> struct qda_file_priv *qda_priv;
> struct qda_iommu_device *selected_dev = NULL;
> @@ -223,6 +223,35 @@ static struct qda_iommu_device *get_or_assign_iommu_device(struct qda_memory_man
> return NULL;
> }
>
> +static int qda_memory_manager_map_imported(struct qda_memory_manager *mem_mgr,
> + struct qda_gem_obj *gem_obj,
> + struct qda_iommu_device *iommu_dev)
> +{
> + struct scatterlist *sg;
> + dma_addr_t dma_addr;
> + int ret = 0;
> +
> + if (!gem_obj->is_imported || !gem_obj->sgt || !iommu_dev) {
> + qda_err(NULL, "Invalid parameters for imported buffer mapping\n");
> + return -EINVAL;
> + }
> +
> + gem_obj->iommu_dev = iommu_dev;
> +
> + sg = gem_obj->sgt->sgl;
> + if (sg) {
> + dma_addr = sg_dma_address(sg);
> + dma_addr += ((u64)iommu_dev->sid << 32);
> +
> + gem_obj->imported_dma_addr = dma_addr;
> + } else {
> + qda_err(NULL, "Invalid scatter-gather list for imported buffer\n");
> + ret = -EINVAL;
> + }
> +
> + return ret;
> +}
> +
> int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj,
> struct drm_file *file_priv)
> {
> @@ -248,7 +277,10 @@ int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_
> return -ENOMEM;
> }
>
> - ret = qda_dma_alloc(selected_dev, gem_obj, size);
> + if (gem_obj->is_imported)
> + ret = qda_memory_manager_map_imported(mem_mgr, gem_obj, selected_dev);
> + else
> + ret = qda_dma_alloc(selected_dev, gem_obj, size);
>
> if (ret) {
> qda_err(NULL, "Allocation failed: size=%zu, device_id=%u, ret=%d\n",
> @@ -268,6 +300,10 @@ void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_
> return;
> }
>
> + if (gem_obj->is_imported) {
> + qda_dbg(NULL, "Freed imported buffer tracking (no DMA free needed)\n");
> + return;
> + }
> qda_dma_free(gem_obj);
> }
>
> diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h
> index bac44284ef98..f6c7963cec42 100644
> --- a/drivers/accel/qda/qda_memory_manager.h
> +++ b/drivers/accel/qda/qda_memory_manager.h
> @@ -106,6 +106,20 @@ int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
> void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
> struct qda_iommu_device *iommu_dev);
>
> +/**
> + * qda_memory_manager_assign_device() - Assign an IOMMU device to a process
> + * @mem_mgr: Pointer to memory manager
> + * @file_priv: DRM file private data for process association
> + *
> + * Assigns an IOMMU device to the calling process. If the process already has
> + * a device assigned, returns success. If another file descriptor from the same
> + * PID has a device, reuses it. Otherwise, finds an available device and assigns it.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
> + struct drm_file *file_priv);
> +
> /**
> * qda_memory_manager_alloc() - Allocate memory for a GEM object
> * @mem_mgr: Pointer to memory manager
> diff --git a/drivers/accel/qda/qda_prime.c b/drivers/accel/qda/qda_prime.c
> new file mode 100644
> index 000000000000..3d23842e48bb
> --- /dev/null
> +++ b/drivers/accel/qda/qda_prime.c
> @@ -0,0 +1,194 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> +#include <drm/drm_gem.h>
> +#include <drm/drm_prime.h>
> +#include <linux/slab.h>
> +#include <linux/dma-mapping.h>
> +#include "qda_drv.h"
> +#include "qda_gem.h"
> +#include "qda_prime.h"
> +#include "qda_memory_manager.h"
> +
> +static struct drm_gem_object *check_own_buffer(struct drm_device *dev, struct dma_buf *dma_buf)
> +{
> + if (dma_buf->priv) {
> + struct drm_gem_object *existing_gem = dma_buf->priv;
Randomly looking at your driver — you’ve broken the dma-buf cross-driver
contract here. How do you know dma_buf->priv is a struct drm_gem_object?
You don’t, because that is assigned by the exporter, and userspace could
pass in a dma-buf from another device and blow up your driver.
I think you just want to call drm_gem_is_prime_exported_dma_buf() here
before doing anything.
The rest of this dma-buf code also looks highly questionable. I’d study
how other drivers implement their dma-buf paths and use those as a
reference to improve yours.
Matt
> +
> + if (existing_gem->dev == dev) {
> + struct qda_gem_obj *existing_qda_gem = to_qda_gem_obj(existing_gem);
> +
> + if (!existing_qda_gem->is_imported) {
> + drm_gem_object_get(existing_gem);
> + return existing_gem;
> + }
> + }
> + }
> + return NULL;
> +}
> +
> +static struct qda_iommu_device *get_iommu_device_for_import(struct qda_drm_priv *drm_priv,
> + struct drm_file **file_priv_out,
> + struct qda_dev *qdev)
> +{
> + struct drm_file *file_priv;
> + struct qda_file_priv *qda_file_priv;
> + struct qda_iommu_device *iommu_dev = NULL;
> + int ret;
> +
> + file_priv = drm_priv->current_import_file_priv;
> + *file_priv_out = file_priv;
> +
> + if (!file_priv || !file_priv->driver_priv)
> + return NULL;
> +
> + qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv;
> + iommu_dev = qda_file_priv->assigned_iommu_dev;
> +
> + if (!iommu_dev) {
> + ret = qda_memory_manager_assign_device(drm_priv->iommu_mgr, file_priv);
> + if (ret) {
> + qda_err(qdev, "Failed to assign IOMMU device: %d\n", ret);
> + return NULL;
> + }
> +
> + iommu_dev = qda_file_priv->assigned_iommu_dev;
> + }
> +
> + return iommu_dev;
> +}
> +
> +static int setup_dma_buf_mapping(struct qda_gem_obj *qda_gem_obj, struct dma_buf *dma_buf,
> + struct device *attach_dev, struct qda_dev *qdev)
> +{
> + struct dma_buf_attachment *attachment;
> + struct sg_table *sgt;
> + int ret;
> +
> + attachment = dma_buf_attach(dma_buf, attach_dev);
> + if (IS_ERR(attachment)) {
> + ret = PTR_ERR(attachment);
> + qda_err(qdev, "Failed to attach dma_buf: %d\n", ret);
> + return ret;
> + }
> + qda_gem_obj->attachment = attachment;
> +
> + sgt = dma_buf_map_attachment_unlocked(attachment, DMA_BIDIRECTIONAL);
> + if (IS_ERR(sgt)) {
> + ret = PTR_ERR(sgt);
> + qda_err(qdev, "Failed to map dma_buf attachment: %d\n", ret);
> + dma_buf_detach(dma_buf, attachment);
> + return ret;
> + }
> + qda_gem_obj->sgt = sgt;
> +
> + return 0;
> +}
> +
> +struct drm_gem_object *qda_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf)
> +{
> + struct qda_drm_priv *drm_priv;
> + struct qda_gem_obj *qda_gem_obj;
> + struct drm_file *file_priv;
> + struct qda_iommu_device *iommu_dev;
> + struct qda_dev *qdev;
> + struct drm_gem_object *existing_gem;
> + size_t aligned_size;
> + int ret;
> +
> + drm_priv = get_drm_priv_from_device(dev);
> + if (!drm_priv || !drm_priv->iommu_mgr) {
> + qda_err(NULL, "Invalid drm_priv or iommu_mgr\n");
> + return ERR_PTR(-EINVAL);
> + }
> +
> + qdev = drm_priv->qdev;
> +
> + existing_gem = check_own_buffer(dev, dma_buf);
> + if (existing_gem)
> + return existing_gem;
> +
> + iommu_dev = get_iommu_device_for_import(drm_priv, &file_priv, qdev);
> + if (!iommu_dev || !iommu_dev->dev) {
> + qda_err(qdev, "No IOMMU device assigned for prime import\n");
> + return ERR_PTR(-ENODEV);
> + }
> +
> + qda_dbg(qdev, "Using IOMMU device %u for prime import\n", iommu_dev->id);
> +
> + aligned_size = PAGE_ALIGN(dma_buf->size);
> + qda_gem_obj = qda_gem_alloc_object(dev, aligned_size);
> + if (IS_ERR(qda_gem_obj))
> + return (struct drm_gem_object *)qda_gem_obj;
> +
> + qda_gem_obj->is_imported = true;
> + qda_gem_obj->dma_buf = dma_buf;
> + qda_gem_obj->virt = NULL;
> + qda_gem_obj->dma_addr = 0;
> + qda_gem_obj->imported_dma_addr = 0;
> + qda_gem_obj->iommu_dev = iommu_dev;
> +
> + get_dma_buf(dma_buf);
> +
> + ret = setup_dma_buf_mapping(qda_gem_obj, dma_buf, iommu_dev->dev, qdev);
> + if (ret)
> + goto err_put_dma_buf;
> +
> + ret = qda_memory_manager_alloc(drm_priv->iommu_mgr, qda_gem_obj, file_priv);
> + if (ret) {
> + qda_err(qdev, "Failed to allocate IOMMU mapping: %d\n", ret);
> + goto err_unmap;
> + }
> +
> + qda_dbg(qdev, "Prime import completed successfully size=%zu\n", aligned_size);
> + return &qda_gem_obj->base;
> +
> +err_unmap:
> + dma_buf_unmap_attachment_unlocked(qda_gem_obj->attachment,
> + qda_gem_obj->sgt, DMA_BIDIRECTIONAL);
> + dma_buf_detach(dma_buf, qda_gem_obj->attachment);
> +err_put_dma_buf:
> + dma_buf_put(dma_buf);
> + qda_gem_cleanup_object(qda_gem_obj);
> + return ERR_PTR(ret);
> +}
> +
> +int qda_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
> + int prime_fd, u32 *handle)
> +{
> + struct qda_drm_priv *drm_priv;
> + struct qda_dev *qdev;
> + int ret;
> +
> + drm_priv = get_drm_priv_from_device(dev);
> + if (!drm_priv) {
> + qda_dbg(NULL, "Failed to get drm_priv from device\n");
> + return -EINVAL;
> + }
> +
> + qdev = drm_priv->qdev;
> +
> + if (file_priv && file_priv->driver_priv) {
> + struct qda_file_priv *qda_file_priv;
> +
> + qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv;
> + } else {
> + qda_dbg(qdev, "Called with NULL file_priv or driver_priv\n");
> + }
> +
> + mutex_lock(&drm_priv->import_lock);
> + drm_priv->current_import_file_priv = file_priv;
> +
> + ret = drm_gem_prime_fd_to_handle(dev, file_priv, prime_fd, handle);
> +
> + drm_priv->current_import_file_priv = NULL;
> + mutex_unlock(&drm_priv->import_lock);
> +
> + if (!ret)
> + qda_dbg(qdev, "Completed with ret=%d, handle=%u\n", ret, *handle);
> + else
> + qda_dbg(qdev, "Completed with ret=%d\n", ret);
> +
> + return ret;
> +}
> +
> +MODULE_IMPORT_NS("DMA_BUF");
> diff --git a/drivers/accel/qda/qda_prime.h b/drivers/accel/qda/qda_prime.h
> new file mode 100644
> index 000000000000..939902454dcd
> --- /dev/null
> +++ b/drivers/accel/qda/qda_prime.h
> @@ -0,0 +1,43 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> + */
> +
> +#ifndef _QDA_PRIME_H
> +#define _QDA_PRIME_H
> +
> +#include <drm/drm_device.h>
> +#include <drm/drm_file.h>
> +#include <drm/drm_gem.h>
> +#include <linux/dma-buf.h>
> +
> +/**
> + * qda_gem_prime_import - Import a DMA-BUF as a GEM object
> + * @dev: DRM device structure
> + * @dma_buf: DMA-BUF to import
> + *
> + * This function imports an external DMA-BUF into the QDA driver as a GEM
> + * object. It handles both re-imports of buffers originally from this driver
> + * and imports of external buffers from other drivers.
> + *
> + * Return: Pointer to the imported GEM object on success, ERR_PTR on failure
> + */
> +struct drm_gem_object *qda_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf);
> +
> +/**
> + * qda_prime_fd_to_handle - Core implementation for PRIME FD to GEM handle conversion
> + * @dev: DRM device structure
> + * @file_priv: DRM file private data
> + * @prime_fd: File descriptor of the PRIME buffer
> + * @handle: Output parameter for the GEM handle
> + *
> + * This core function sets up the necessary context before calling the
> + * DRM framework's prime FD to handle conversion. It ensures proper IOMMU
> + * device assignment and tracking for the import operation.
> + *
> + * Return: 0 on success, negative error code on failure
> + */
> +int qda_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
> + int prime_fd, u32 *handle);
> +
> +#endif /* _QDA_PRIME_H */
>
> --
> 2.34.1
>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 11/18] accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
2026-02-23 19:09 ` [PATCH RFC 11/18] accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs Ekansh Gupta
2026-02-23 22:39 ` Dmitry Baryshkov
@ 2026-02-24 9:05 ` Christian König
2026-03-02 9:08 ` Ekansh Gupta
1 sibling, 1 reply; 83+ messages in thread
From: Christian König @ 2026-02-24 9:05 UTC (permalink / raw)
To: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/23/26 20:09, Ekansh Gupta wrote:
...
> +int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv)
> +{
> + struct drm_qda_gem_mmap_offset *args = data;
> + struct drm_gem_object *gem_obj;
> + int ret;
> +
> + gem_obj = qda_gem_lookup_object(file_priv, args->handle);
> + if (IS_ERR(gem_obj))
> + return PTR_ERR(gem_obj);
> +
> + ret = drm_gem_create_mmap_offset(gem_obj);
> + if (ret == 0)
> + args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node);
> +
> + drm_gem_object_put(gem_obj);
> + return ret;
You should probably use drm_gem_dumb_map_offset() instead of open coding this.
Otherwise you allow mmap() of imported objects which is not allowed at all.
Regards,
Christian.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 12/18] accel/qda: Add PRIME dma-buf import support
2026-02-23 19:09 ` [PATCH RFC 12/18] accel/qda: Add PRIME dma-buf import support Ekansh Gupta
2026-02-24 8:52 ` Matthew Brost
@ 2026-02-24 9:12 ` Christian König
2026-03-09 6:59 ` Ekansh Gupta
1 sibling, 1 reply; 83+ messages in thread
From: Christian König @ 2026-02-24 9:12 UTC (permalink / raw)
To: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/23/26 20:09, Ekansh Gupta wrote:
> [Sie erhalten nicht häufig E-Mails von ekansh.gupta@oss.qualcomm.com. Weitere Informationen, warum dies wichtig ist, finden Sie unter https://aka.ms/LearnAboutSenderIdentification ]
>
> Add PRIME dma-buf import support for QDA GEM buffer objects and integrate
> it with the existing per-process memory manager and IOMMU device model.
>
> The implementation extends qda_gem_obj to represent imported dma-bufs,
> including dma_buf references, attachment state, scatter-gather tables
> and an imported DMA address used for DSP-facing book-keeping. The
> qda_gem_prime_import() path handles reimports of buffers originally
> exported by QDA as well as imports of external dma-bufs, attaching them
> to the assigned IOMMU device
That is usually an absolutely clear NO-GO for DMA-bufs. Where exactly in the code is that?
> and mapping them through the memory manager
> for DSP access. The GEM free path is updated to unmap and detach
> imported buffers while preserving the existing behaviour for locally
> allocated memory.
>
> The PRIME fd-to-handle path is implemented in qda_prime_fd_to_handle(),
> which records the calling drm_file in a driver-private import context
> before invoking the core DRM helpers. The GEM import callback retrieves
> this context to ensure that an IOMMU device is assigned to the process
> and that imported buffers follow the same per-process IOMMU selection
> rules as natively allocated GEM objects.
>
> This patch prepares the driver for interoperable buffer sharing between
> QDA and other dma-buf capable subsystems while keeping IOMMU mapping and
> lifetime handling consistent with the existing GEM allocation flow.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
...
> @@ -15,23 +16,29 @@ static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
> qda_err(NULL, "Invalid GEM object size\n");
> return -EINVAL;
> }
> - if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
> - qda_err(NULL, "Allocated buffer missing IOMMU device\n");
> - return -EINVAL;
> - }
> - if (!qda_gem_obj->iommu_dev->dev) {
> - qda_err(NULL, "Allocated buffer missing IOMMU device\n");
> - return -EINVAL;
> - }
> - if (!qda_gem_obj->virt) {
> - qda_err(NULL, "Allocated buffer missing virtual address\n");
> - return -EINVAL;
> - }
> - if (qda_gem_obj->dma_addr == 0) {
> - qda_err(NULL, "Allocated buffer missing DMA address\n");
> - return -EINVAL;
> + if (qda_gem_obj->is_imported) {
Absolutely clear NAK to that. Imported buffers *can't* be mmaped through the importer!
Userspace needs to mmap() them through the exporter.
If you absolutely have to map them through the importer for uAPI backward compatibility then there is dma_buf_mmap() for that, but this is clearly not the case here.
...
> +static int qda_memory_manager_map_imported(struct qda_memory_manager *mem_mgr,
> + struct qda_gem_obj *gem_obj,
> + struct qda_iommu_device *iommu_dev)
> +{
> + struct scatterlist *sg;
> + dma_addr_t dma_addr;
> + int ret = 0;
> +
> + if (!gem_obj->is_imported || !gem_obj->sgt || !iommu_dev) {
> + qda_err(NULL, "Invalid parameters for imported buffer mapping\n");
> + return -EINVAL;
> + }
> +
> + gem_obj->iommu_dev = iommu_dev;
> +
> + sg = gem_obj->sgt->sgl;
> + if (sg) {
> + dma_addr = sg_dma_address(sg);
> + dma_addr += ((u64)iommu_dev->sid << 32);
> +
> + gem_obj->imported_dma_addr = dma_addr;
Well that looks like you are only using the first DMA address from the imported sgt. What about the others?
Regards,
Christian.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (20 preceding siblings ...)
2026-02-24 3:39 ` Trilok Soni
@ 2026-02-25 13:42 ` Bryan O'Donoghue
2026-02-25 19:12 ` Dmitry Baryshkov
2026-03-02 15:57 ` Srinivas Kandagatla
2026-03-09 8:07 ` Ekansh Gupta
23 siblings, 1 reply; 83+ messages in thread
From: Bryan O'Donoghue @ 2026-02-25 13:42 UTC (permalink / raw)
To: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 23/02/2026 19:08, Ekansh Gupta wrote:
> User-space staging branch
> ============
> https://github.com/qualcomm/fastrpc/tree/accel/staging
What would be really nice to see would be mesa integration allowing
convergence of the xDSP/xPU accelerator space around something like a
standard.
See:
https://blog.tomeuvizoso.net/2025/07/rockchip-npu-update-6-we-are-in-mainline.html
---
bod
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
2026-02-23 21:17 ` Dmitry Baryshkov
@ 2026-02-25 13:57 ` Ekansh Gupta
2026-02-25 17:17 ` Dmitry Baryshkov
0 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-25 13:57 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 2:47 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:38:55AM +0530, Ekansh Gupta wrote:
>> Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
>> integrated in the DRM accel subsystem.
>>
>> The new docs introduce QDA as a DRM/accel-based implementation of
>> Hexagon DSP offload that is intended as a modern alternative to the
>> legacy FastRPC driver in drivers/misc. The text describes the driver
>> motivation, high-level architecture and interaction with IOMMU context
>> banks, GEM-based buffer management and the RPMsg transport.
>>
>> The user-space facing section documents the main QDA IOCTLs used to
>> establish DSP sessions, manage GEM buffer objects and invoke remote
>> procedures using the FastRPC protocol, along with a typical lifecycle
>> example for applications.
>>
>> Finally, the driver is wired into the Compute Accelerators
>> documentation index under Documentation/accel, and a brief debugging
>> section shows how to enable dynamic debug for the QDA implementation.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> Documentation/accel/index.rst | 1 +
>> Documentation/accel/qda/index.rst | 14 +++++
>> Documentation/accel/qda/qda.rst | 129 ++++++++++++++++++++++++++++++++++++++
>> 3 files changed, 144 insertions(+)
>>
>> diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
>> index cbc7d4c3876a..5901ea7f784c 100644
>> --- a/Documentation/accel/index.rst
>> +++ b/Documentation/accel/index.rst
>> @@ -10,4 +10,5 @@ Compute Accelerators
>> introduction
>> amdxdna/index
>> qaic/index
>> + qda/index
>> rocket/index
>> diff --git a/Documentation/accel/qda/index.rst b/Documentation/accel/qda/index.rst
>> new file mode 100644
>> index 000000000000..bce188f21117
>> --- /dev/null
>> +++ b/Documentation/accel/qda/index.rst
>> @@ -0,0 +1,14 @@
>> +.. SPDX-License-Identifier: GPL-2.0-only
>> +
>> +==============================
>> + accel/qda Qualcomm DSP Driver
>> +==============================
>> +
>> +The **accel/qda** driver provides support for Qualcomm Hexagon DSPs (Digital
>> +Signal Processors) within the DRM accelerator framework. It serves as a modern
>> +replacement for the legacy FastRPC driver, offering improved resource management
>> +and standard subsystem integration.
>> +
>> +.. toctree::
>> +
>> + qda
>> diff --git a/Documentation/accel/qda/qda.rst b/Documentation/accel/qda/qda.rst
>> new file mode 100644
>> index 000000000000..742159841b95
>> --- /dev/null
>> +++ b/Documentation/accel/qda/qda.rst
>> @@ -0,0 +1,129 @@
>> +.. SPDX-License-Identifier: GPL-2.0-only
>> +
>> +==================================
>> +Qualcomm Hexagon DSP (QDA) Driver
>> +==================================
>> +
>> +Introduction
>> +============
>> +
>> +The **QDA** (Qualcomm DSP Accelerator) driver is a new DRM-based
>> +accelerator driver for Qualcomm's Hexagon DSPs. It provides a standardized
>> +interface for user-space applications to offload computational tasks ranging
>> +from audio processing and sensor offload to computer vision and AI
>> +inference to the Hexagon DSPs found on Qualcomm SoCs.
>> +
>> +This driver is designed to align with the Linux kernel's modern **Compute
>> +Accelerators** subsystem (`drivers/accel/`), providing a robust and modular
>> +alternative to the legacy FastRPC driver in `drivers/misc/`, offering
>> +improved resource management and better integration with standard kernel
>> +subsystems.
>> +
>> +Motivation
>> +==========
>> +
>> +The existing FastRPC implementation in the kernel utilizes a custom character
>> +device and lacks integration with modern kernel memory management frameworks.
>> +The QDA driver addresses these limitations by:
>> +
>> +1. **Adopting the DRM accel Framework**: Leveraging standard uAPIs for device
>> + management, job submission, and synchronization.
>> +2. **Utilizing GEM for Memory**: Providing proper buffer object management,
>> + including DMA-BUF import/export capabilities.
>> +3. **Improving Isolation**: Using IOMMU context banks to enforce memory
>> + isolation between different DSP user sessions.
>> +
>> +Key Features
>> +============
>> +
>> +* **Standard Accelerator Interface**: Exposes a standard character device
>> + node (e.g., `/dev/accel/accel0`) via the DRM subsystem.
>> +* **Unified Offload Support**: Supports all DSP domains (ADSP, CDSP, SDSP,
>> + GDSP) via a single driver architecture.
>> +* **FastRPC Protocol**: Implements the reliable Remote Procedure Call
>> + (FastRPC) protocol for communication between the application processor
>> + and DSP.
>> +* **DMA-BUF Interop**: Seamless sharing of memory buffers between the DSP
>> + and other multimedia subsystems (GPU, Camera, Video) via standard DMA-BUFs.
>> +* **Modular Design**: Clean separation between the core DRM logic, the memory
>> + manager, and the RPMsg-based transport layer.
>> +
>> +Architecture
>> +============
>> +
>> +The QDA driver is composed of several modular components:
>> +
>> +1. **Core Driver (`qda_drv`)**: Manages device registration, file operations,
>> + and bridges the driver with the DRM accelerator subsystem.
>> +2. **Memory Manager (`qda_memory_manager`)**: A flexible memory management
>> + layer that handles IOMMU context banks. It supports pluggable backends
>> + (such as DMA-coherent) to adapt to different SoC memory architectures.
>> +3. **GEM Subsystem**: Implements the DRM GEM interface for buffer management:
>> +
>> + * **`qda_gem`**: Core GEM object management, including allocation, mmap
>> + operations, and buffer lifecycle management.
>> + * **`qda_prime`**: PRIME import functionality for DMA-BUF interoperability,
>> + enabling seamless buffer sharing with other kernel subsystems.
>> +
>> +4. **Transport Layer (`qda_rpmsg`)**: Abstraction over the RPMsg framework
>> + to handle low-level message passing with the DSP firmware.
>> +5. **Compute Bus (`qda_compute_bus`)**: A custom virtual bus used to
>> + enumerate and manage the specific compute context banks defined in the
>> + device tree.
> I'm really not sure if it's a bonus or not. I'm waiting for iommu-map
> improvements to land to send patches reworking FastRPC CB from using
> probe into being created by the main driver: it would remove some of the
> possible race conditions between main driver finishing probe and the CB
> devices probing in the background.
>
> What's the actual benefit of the CB bus?
I tried following the Tegra host1x logic here as was discussed here[1]. My understanding is that
with this the CB will become more manageable reducing the scope of races that exists in the
current fastrpc driver.
That said, I'm not completely aware about the iommu-map improvements. Is it the one
being discussed for this patch[2]? If it helps in main driver to create CB devices directly, then I
would be happy to adapt the design.
[1] https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualcomm.com/
[2] https://lore.kernel.org/all/20260126-kaanapali-iris-v1-3-e2646246bfc1@oss.qualcomm.com/
>
>> +6. **FastRPC Core (`qda_fastrpc`)**: Implements the protocol logic for
>> + marshalling arguments and handling remote invocations.
>> +
>> +User-Space API
>> +==============
>> +
>> +The driver exposes a set of DRM-compliant IOCTLs. Note that these are designed
>> +to be familiar to existing FastRPC users while adhering to DRM standards.
>> +
>> +* `DRM_IOCTL_QDA_QUERY`: Query DSP type (e.g., "cdsp", "adsp")
>> + and capabilities.
>> +* `DRM_IOCTL_QDA_INIT_ATTACH`: Attach a user session to the DSP's protection
>> + domain.
>> +* `DRM_IOCTL_QDA_INIT_CREATE`: Initialize a new process context on the DSP.
> You need to explain the difference between these two.
Ack.
>
>> +* `DRM_IOCTL_QDA_INVOKE`: Submit a remote method invocation (the primary
>> + execution unit).
>> +* `DRM_IOCTL_QDA_GEM_CREATE`: Allocate a GEM buffer object for DSP usage.
>> +* `DRM_IOCTL_QDA_GEM_MMAP_OFFSET`: Retrieve mmap offsets for memory mapping.
>> +* `DRM_IOCTL_QDA_MAP` / `DRM_IOCTL_QDA_MUNMAP`: Map or unmap buffers into the
>> + DSP's virtual address space.
> Do we need to make this separate? Can we map/unmap buffers on their
> usage? Or when they are created? I'm thinking about that the
> virtualization.
The lib provides ways(fastrpc_mmap/remote_mmap64) for users to map/unmap the
buffers on DSP as per processes requirement. The ioctls are added to support the same.
> An alternative approach would be to merge
> GET_MMAP_OFFSET with _MAP: once you map it to the DSP memory, you will
> get the offset.
_MAP is not need for all the buffers. Most of the remote call buffers that are passed to DSP
are automatically mapped by DSP before invoking the DSP implementation so the user-space
does not need to call _MAP for these.
Some buffers(e.g., shared persistent buffers) do require explicit mapping, which is why
MAP/MUNMAP exists in FastRPC.
Because of this behavioral difference, merging GET_MMAP_OFFSET with MAP is not accurate.
GET_MMAP_OFFSET is for CPU‑side mmap via GEM, whereas MAP is specifically for DSP
virtual address assignment.
>
>> +
>> +Usage Example
>> +=============
>> +
>> +A typical lifecycle for a user-space application:
>> +
>> +1. **Discovery**: Open `/dev/accel/accel*` and check
>> + `DRM_IOCTL_QDA_QUERY` to find the desired DSP (e.g., CDSP for
>> + compute workloads).
>> +2. **Initialization**: Call `DRM_IOCTL_QDA_INIT_ATTACH` and
>> + `DRM_IOCTL_QDA_INIT_CREATE` to establish a session.
>> +3. **Memory**: Allocate buffers via `DRM_IOCTL_QDA_GEM_CREATE` or import
>> + DMA-BUFs (PRIME fd) from other drivers using `DRM_IOCTL_PRIME_FD_TO_HANDLE`.
>> +4. **Execution**: Use `DRM_IOCTL_QDA_INVOKE` to pass arguments and execute
>> + functions on the DSP.
>> +5. **Cleanup**: Close file descriptors to automatically release resources and
>> + detach the session.
>> +
>> +Internal Implementation
>> +=======================
>> +
>> +Memory Management
>> +-----------------
>> +The driver's memory manager creates virtual "IOMMU devices" that map to
>> +hardware context banks. This allows the driver to manage multiple isolated
>> +address spaces. The implementation currently uses a **DMA-coherent backend**
>> +to ensure data consistency between the CPU and DSP without manual cache
>> +maintenance in most cases.
>> +
>> +Debugging
>> +=========
>> +The driver includes extensive dynamic debug support. Enable it via the
>> +kernel's dynamic debug control:
>> +
>> +.. code-block:: bash
>> +
>> + echo "file drivers/accel/qda/* +p" > /sys/kernel/debug/dynamic_debug/control
> Please add documentation on how to build the test apps and how to load
> them to the DSP.
Ack.
>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
2026-02-24 3:33 ` Trilok Soni
@ 2026-02-25 14:17 ` Ekansh Gupta
2026-02-25 15:12 ` Bjorn Andersson
0 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-25 14:17 UTC (permalink / raw)
To: Trilok Soni, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 9:03 AM, Trilok Soni wrote:
> On 2/23/2026 11:08 AM, Ekansh Gupta wrote:
>> Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
>> integrated in the DRM accel subsystem.
>>
>> The new docs introduce QDA as a DRM/accel-based implementation of
>> Hexagon DSP offload that is intended as a modern alternative to the
>> legacy FastRPC driver in drivers/misc. The text describes the driver
>> motivation, high-level architecture and interaction with IOMMU context
>> banks, GEM-based buffer management and the RPMsg transport.
>>
>> The user-space facing section documents the main QDA IOCTLs used to
>> establish DSP sessions, manage GEM buffer objects and invoke remote
>> procedures using the FastRPC protocol, along with a typical lifecycle
>> example for applications.
>>
>> Finally, the driver is wired into the Compute Accelerators
>> documentation index under Documentation/accel, and a brief debugging
>> section shows how to enable dynamic debug for the QDA implementation.
> So existing applications written over character device UAPI needs to be
> rewritten over new UAPI and it will be broken once this driver gets
> merged? Are we going to keep both the drivers in the Linux kernel
> and not deprecate the /char device one?
>
> Is Qualcomm going to provide the wrapper library in the userspace
> so that existing applications by our customers and developers
> keep working w/ the newer kernel if the char interface based
> driver gets deprecated? It is not clear from your text above.
Thanks for raising this, Trilok.
This is one of the open items that I have. I'm not exactly sure what would be the
acceptable way for this.
As you mentioned, applications that rely on /dev/fastrpc* might not work on QDA
without modification.
I was thinking in the same lines as you have mentioned and having some shim/compat
driver to translate FastRPC UAPI to QDA. The compat driver would expose the existing
character devices and route the calls to QDA. The compat driver could be built via Kconfig.
However, I haven’t encountered an example of such a UAPI‑translation driver in the kernel
before, so I would want guidance from maintainers on whether this is an acceptable
model or not.
Regarding your question about library, all the APIs exposed by github/fastrpc library are kept
unchanged in terms of definitions and expectation. The same project can be build for both
FastRPC and QDA based on configure options. So, the applications using github/fastrpc should
not face any problem if the libs is built with proper configure options.
I have noted your point regarding the doc not providing clear details, I have added interface
compatibility information in cover letter and will try pulling the same to Doc.
>
> ---Trilok Soni
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 02/18] accel/qda: Add Qualcomm DSP accelerator driver skeleton
2026-02-23 21:52 ` Bjorn Andersson
@ 2026-02-25 14:20 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-25 14:20 UTC (permalink / raw)
To: Bjorn Andersson
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On 2/24/2026 3:22 AM, Bjorn Andersson wrote:
> On Tue, Feb 24, 2026 at 12:38:56AM +0530, Ekansh Gupta wrote:
> [..]
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
>> new file mode 100644
>> index 000000000000..18b0d3fb1598
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_drv.c
>> @@ -0,0 +1,22 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> +#include <linux/module.h>
>> +#include <linux/kernel.h>
>> +
>> +static int __init qda_core_init(void)
>> +{
>> + pr_info("QDA: driver initialization complete\n");
> This print is useless as soon as you make the driver do anything, please
> don't include developmental debug logs.
>
>
> In fact, this patch doesn't actually do anything, please squash things a
> bit to give it some meat.
>
> Regards,
> Bjorn
Ack, will squash the next commit with this one.
>
>> + return 0;
>> +}
>> +
>> +static void __exit qda_core_exit(void)
>> +{
>> + pr_info("QDA: driver exit complete\n");
>> +}
>> +
>> +module_init(qda_core_init);
>> +module_exit(qda_core_exit);
>> +
>> +MODULE_AUTHOR("Qualcomm AI Infra Team");
>> +MODULE_DESCRIPTION("Qualcomm DSP Accelerator Driver");
>> +MODULE_LICENSE("GPL");
>>
>> --
>> 2.34.1
>>
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
2026-02-25 14:17 ` Ekansh Gupta
@ 2026-02-25 15:12 ` Bjorn Andersson
2026-02-25 19:16 ` Trilok Soni
0 siblings, 1 reply; 83+ messages in thread
From: Bjorn Andersson @ 2026-02-25 15:12 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Trilok Soni, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König, dri-devel, linux-doc,
linux-kernel, linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On Wed, Feb 25, 2026 at 07:47:08PM +0530, Ekansh Gupta wrote:
>
>
> On 2/24/2026 9:03 AM, Trilok Soni wrote:
> > On 2/23/2026 11:08 AM, Ekansh Gupta wrote:
> >> Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
> >> integrated in the DRM accel subsystem.
> >>
> >> The new docs introduce QDA as a DRM/accel-based implementation of
> >> Hexagon DSP offload that is intended as a modern alternative to the
> >> legacy FastRPC driver in drivers/misc. The text describes the driver
> >> motivation, high-level architecture and interaction with IOMMU context
> >> banks, GEM-based buffer management and the RPMsg transport.
> >>
> >> The user-space facing section documents the main QDA IOCTLs used to
> >> establish DSP sessions, manage GEM buffer objects and invoke remote
> >> procedures using the FastRPC protocol, along with a typical lifecycle
> >> example for applications.
> >>
> >> Finally, the driver is wired into the Compute Accelerators
> >> documentation index under Documentation/accel, and a brief debugging
> >> section shows how to enable dynamic debug for the QDA implementation.
> > So existing applications written over character device UAPI needs to be
> > rewritten over new UAPI and it will be broken once this driver gets
> > merged? Are we going to keep both the drivers in the Linux kernel
> > and not deprecate the /char device one?
> >
> > Is Qualcomm going to provide the wrapper library in the userspace
> > so that existing applications by our customers and developers
> > keep working w/ the newer kernel if the char interface based
> > driver gets deprecated? It is not clear from your text above.
> Thanks for raising this, Trilok.
>
> This is one of the open items that I have. I'm not exactly sure what would be the
> acceptable way for this.
>
> As you mentioned, applications that rely on /dev/fastrpc* might not work on QDA
> without modification.
>
> I was thinking in the same lines as you have mentioned and having some shim/compat
> driver to translate FastRPC UAPI to QDA. The compat driver would expose the existing
> character devices and route the calls to QDA. The compat driver could be built via Kconfig.
>
This is a fundamental requirement, you need to address this in order for
this to move forward.
Which makes me wonder if it would be possible to reach an accel driver
through incremental transition of the current driver, instead of just
dropping in a few thousand lines of new code/design.
> However, I haven’t encountered an example of such a UAPI‑translation driver in the kernel
> before, so I would want guidance from maintainers on whether this is an acceptable
> model or not.
>
> Regarding your question about library, all the APIs exposed by github/fastrpc library are kept
> unchanged in terms of definitions and expectation. The same project can be build for both
> FastRPC and QDA based on configure options. So, the applications using github/fastrpc should
> not face any problem if the libs is built with proper configure options.
>
You're assuming that the kernel and userspace are a unified piece of
software, they are not. It must be possible for me to install a new
kernel package without having to replace the userspace libraries.
Regards,
Bjorn
> I have noted your point regarding the doc not providing clear details, I have added interface
> compatibility information in cover letter and will try pulling the same to Doc.
> >
> > ---Trilok Soni
>
>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
2026-02-23 21:23 ` Dmitry Baryshkov
2026-02-23 21:50 ` Bjorn Andersson
@ 2026-02-25 17:16 ` Ekansh Gupta
1 sibling, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-25 17:16 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 2:53 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:38:57AM +0530, Ekansh Gupta wrote:
>> Extend the Qualcomm DSP accelerator (QDA) driver with an RPMsg-based
>> transport used to discover and manage DSP instances.
>>
>> This patch introduces:
>>
>> - A core qda_dev structure with basic device state (rpmsg device,
>> device pointer, lock, removal flag, DSP name).
>> - Logging helpers that integrate with dev_* when a device is available
>> and fall back to pr_* otherwise.
>> - An RPMsg client driver that binds to the Qualcomm FastRPC service and
>> allocates a qda_dev instance using devm_kzalloc().
>> - Basic device initialization and teardown paths wired into the module
>> init/exit.
>>
>> The RPMsg driver currently sets the DSP name from a "label" property in
>> the device tree, which will be used by subsequent patches to distinguish
>> between different DSP domains (e.g. ADSP, CDSP).
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/Kconfig | 1 +
>> drivers/accel/qda/Makefile | 4 +-
>> drivers/accel/qda/qda_drv.c | 41 ++++++++++++++-
>> drivers/accel/qda/qda_drv.h | 91 ++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_rpmsg.c | 119 ++++++++++++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_rpmsg.h | 17 ++++++
>> 6 files changed, 270 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig
>> index 3c78ff6189e0..484d21ff1b55 100644
>> --- a/drivers/accel/qda/Kconfig
>> +++ b/drivers/accel/qda/Kconfig
>> @@ -7,6 +7,7 @@ config DRM_ACCEL_QDA
>> tristate "Qualcomm DSP accelerator"
>> depends on DRM_ACCEL
>> depends on ARCH_QCOM || COMPILE_TEST
>> + depends on RPMSG
>> help
>> Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs.
>> This driver provides a standardized interface for offloading computational
>> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
>> index 573711af1d28..e7f23182589b 100644
>> --- a/drivers/accel/qda/Makefile
>> +++ b/drivers/accel/qda/Makefile
>> @@ -5,4 +5,6 @@
>>
>> obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
>>
>> -qda-y := qda_drv.o
>> +qda-y := \
>> + qda_drv.o \
> Squash these parts into the previous patch.
Ack.
>
>> + qda_rpmsg.o \
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
>> index 18b0d3fb1598..389c66a9ad4f 100644
>> --- a/drivers/accel/qda/qda_drv.c
>> +++ b/drivers/accel/qda/qda_drv.c
>> @@ -2,16 +2,53 @@
>> // Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> #include <linux/module.h>
>> #include <linux/kernel.h>
>> +#include <linux/atomic.h>
>> +#include "qda_drv.h"
>> +#include "qda_rpmsg.h"
>> +
>> +static void cleanup_device_resources(struct qda_dev *qdev)
>> +{
>> + mutex_destroy(&qdev->lock);
>> +}
>> +
>> +void qda_deinit_device(struct qda_dev *qdev)
>> +{
>> + cleanup_device_resources(qdev);
>> +}
>> +
>> +/* Initialize device resources */
>> +static void init_device_resources(struct qda_dev *qdev)
>> +{
>> + qda_dbg(qdev, "Initializing device resources\n");
>> +
>> + mutex_init(&qdev->lock);
>> + atomic_set(&qdev->removing, 0);
>> +}
>> +
>> +int qda_init_device(struct qda_dev *qdev)
>> +{
>> + init_device_resources(qdev);
>> +
>> + qda_dbg(qdev, "QDA device initialized successfully\n");
>> + return 0;
>> +}
>>
>> static int __init qda_core_init(void)
>> {
>> - pr_info("QDA: driver initialization complete\n");
>> + int ret;
>> +
>> + ret = qda_rpmsg_register();
>> + if (ret)
>> + return ret;
>> +
>> + qda_info(NULL, "QDA driver initialization complete\n");
>> return 0;
>> }
>>
>> static void __exit qda_core_exit(void)
>> {
>> - pr_info("QDA: driver exit complete\n");
>> + qda_rpmsg_unregister();
>> + qda_info(NULL, "QDA driver exit complete\n");
>> }
>>
>> module_init(qda_core_init);
>> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
>> new file mode 100644
>> index 000000000000..bec2d31ca1bb
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_drv.h
>> @@ -0,0 +1,91 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> + */
>> +
>> +#ifndef __QDA_DRV_H__
>> +#define __QDA_DRV_H__
>> +
>> +#include <linux/device.h>
>> +#include <linux/mutex.h>
>> +#include <linux/rpmsg.h>
>> +#include <linux/xarray.h>
>> +
>> +/* Driver identification */
>> +#define DRIVER_NAME "qda"
>> +
>> +/* struct qda_dev - Main device structure for QDA driver */
>> +struct qda_dev {
>> + /* RPMsg device for communication with remote processor */
>> + struct rpmsg_device *rpdev;
>> + /* Underlying device structure */
>> + struct device *dev;
>> + /* Mutex protecting device state */
>> + struct mutex lock;
> Which parts of the state?
This lock is added to ensure proper rpdev before sending message to DSP, now I think
it might not be needed if I can ensure proper checks based on any existing helpers, I'll
check this and remove if it's not needed.
>
>> + /* Flag indicating device removal in progress */
>> + atomic_t removing;
> Why do you need it if we have dev->unplugged and drm_dev_enter() /
> drm_dev_exit()?
I'll check the helpers and replace wherever necessary.
>
>> + /* Name of the DSP (e.g., "cdsp", "adsp") */
>> + char dsp_name[16];
> Please replace with the pointers to the static array.
ack.
>
>> +};
>> +
>> +/**
>> + * qda_get_log_device - Get appropriate device for logging
>> + * @qdev: QDA device structure
>> + *
>> + * Returns the most appropriate device structure for logging messages.
>> + * Prefers qdev->dev, or returns NULL if the device is being removed
>> + * or invalid.
>> + */
>> +static inline struct device *qda_get_log_device(struct qda_dev *qdev)
>> +{
>> + if (!qdev || atomic_read(&qdev->removing))
>> + return NULL;
>> +
>> + if (qdev->dev)
>> + return qdev->dev;
>> +
>> + return NULL;
>> +}
>> +
>> +/*
>> + * Logging macros
>> + *
>> + * These macros provide consistent logging across the driver with automatic
>> + * function name inclusion. They use dev_* functions when a device is available,
>> + * falling back to pr_* functions otherwise.
>> + */
>> +
>> +/* Error logging - always logs and tracks errors */
>> +#define qda_err(qdev, fmt, ...) do { \
>> + struct device *__dev = qda_get_log_device(qdev); \
>> + if (__dev) \
>> + dev_err(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
>> + else \
>> + pr_err(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
> What /why? You are under drm, so you can use drm_* helpers instead.
ack.
>
>> +} while (0)
>> +
>> +/* Info logging - always logs, can be filtered via loglevel */
>> +#define qda_info(qdev, fmt, ...) do { \
>> + struct device *__dev = qda_get_log_device(qdev); \
>> + if (__dev) \
>> + dev_info(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
>> + else \
>> + pr_info(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
>> +} while (0)
>> +
>> +/* Debug logging - controlled via dynamic debug (CONFIG_DYNAMIC_DEBUG) */
>> +#define qda_dbg(qdev, fmt, ...) do { \
>> + struct device *__dev = qda_get_log_device(qdev); \
>> + if (__dev) \
>> + dev_dbg(__dev, "[%s] " fmt, __func__, ##__VA_ARGS__); \
>> + else \
>> + pr_debug(DRIVER_NAME ": [%s] " fmt, __func__, ##__VA_ARGS__); \
>> +} while (0)
>> +
>> +/*
>> + * Core device management functions
>> + */
>> +int qda_init_device(struct qda_dev *qdev);
>> +void qda_deinit_device(struct qda_dev *qdev);
>> +
>> +#endif /* __QDA_DRV_H__ */
>> diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
>> new file mode 100644
>> index 000000000000..a8b24a99ca13
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_rpmsg.c
>> @@ -0,0 +1,119 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> +#include <linux/module.h>
>> +#include <linux/rpmsg.h>
>> +#include <linux/of_platform.h>
>> +#include <linux/of.h>
>> +#include <linux/of_device.h>
>> +#include "qda_drv.h"
>> +#include "qda_rpmsg.h"
>> +
>> +static int qda_rpmsg_init(struct qda_dev *qdev)
>> +{
>> + dev_set_drvdata(&qdev->rpdev->dev, qdev);
>> + return 0;
>> +}
>> +
>> +/* Utility function to allocate and initialize qda_dev */
>> +static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
>> +{
>> + struct qda_dev *qdev;
>> +
>> + qdev = devm_kzalloc(&rpdev->dev, sizeof(*qdev), GFP_KERNEL);
>> + if (!qdev)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + qdev->dev = &rpdev->dev;
>> + qdev->rpdev = rpdev;
>> +
>> + qda_dbg(qdev, "Allocated and initialized qda_dev\n");
>> + return qdev;
>> +}
>> +
>> +static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, void *priv, u32 src)
>> +{
>> + /* Dummy function for rpmsg driver */
>> + return 0;
>> +}
>> +
>> +static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
>> +{
>> + struct qda_dev *qdev = dev_get_drvdata(&rpdev->dev);
>> +
>> + qda_info(qdev, "Removing RPMsg device\n");
>> +
>> + atomic_set(&qdev->removing, 1);
>> +
>> + mutex_lock(&qdev->lock);
>> + qdev->rpdev = NULL;
>> + mutex_unlock(&qdev->lock);
>> +
>> + qda_deinit_device(qdev);
>> +
>> + qda_info(qdev, "RPMsg device removed\n");
>> +}
>> +
>> +static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
>> +{
>> + struct qda_dev *qdev;
>> + int ret;
>> + const char *label;
>> +
>> + qda_dbg(NULL, "QDA RPMsg probe starting\n");
>> +
>> + qdev = alloc_and_init_qdev(rpdev);
>> + if (IS_ERR(qdev))
>> + return PTR_ERR(qdev);
>> +
>> + ret = of_property_read_string(rpdev->dev.of_node, "label", &label);
>> + if (!ret) {
>> + strscpy(qdev->dsp_name, label, sizeof(qdev->dsp_name));
>> + } else {
>> + qda_info(qdev, "QDA DSP label not found in DT\n");
>> + return ret;
>> + }
>> +
>> + ret = qda_rpmsg_init(qdev);
>> + if (ret) {
>> + qda_err(qdev, "RPMsg init failed: %d\n", ret);
>> + return ret;
>> + }
>> +
>> + ret = qda_init_device(qdev);
>> + if (ret)
>> + return ret;
>> +
>> + qda_info(qdev, "QDA RPMsg probe completed successfully for %s\n", qdev->dsp_name);
>> + return 0;
>> +}
>> +
>> +static const struct of_device_id qda_rpmsg_id_table[] = {
>> + { .compatible = "qcom,fastrpc" },
>> + {},
>> +};
>> +MODULE_DEVICE_TABLE(of, qda_rpmsg_id_table);
>> +
>> +static struct rpmsg_driver qda_rpmsg_driver = {
>> + .probe = qda_rpmsg_probe,
>> + .remove = qda_rpmsg_remove,
>> + .callback = qda_rpmsg_cb,
>> + .drv = {
>> + .name = "qcom,fastrpc",
>> + .of_match_table = qda_rpmsg_id_table,
>> + },
>> +};
>> +
>> +int qda_rpmsg_register(void)
>> +{
>> + int ret = register_rpmsg_driver(&qda_rpmsg_driver);
>> +
>> + if (ret)
>> + qda_err(NULL, "Failed to register RPMsg driver: %d\n", ret);
>> +
>> + return ret;
>> +}
>> +
>> +void qda_rpmsg_unregister(void)
>> +{
>> + unregister_rpmsg_driver(&qda_rpmsg_driver);
>> +}
>> diff --git a/drivers/accel/qda/qda_rpmsg.h b/drivers/accel/qda/qda_rpmsg.h
>> new file mode 100644
>> index 000000000000..348827bff255
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_rpmsg.h
>> @@ -0,0 +1,17 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> + */
>> +
>> +#ifndef __QDA_RPMSG_H__
>> +#define __QDA_RPMSG_H__
>> +
>> +#include "qda_drv.h"
>> +
>> +/*
>> + * Transport layer registration
>> + */
>> +int qda_rpmsg_register(void);
>> +void qda_rpmsg_unregister(void);
>> +
>> +#endif /* __QDA_RPMSG_H__ */
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
2026-02-25 13:57 ` Ekansh Gupta
@ 2026-02-25 17:17 ` Dmitry Baryshkov
0 siblings, 0 replies; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-25 17:17 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Wed, Feb 25, 2026 at 07:27:47PM +0530, Ekansh Gupta wrote:
>
>
> On 2/24/2026 2:47 AM, Dmitry Baryshkov wrote:
> > On Tue, Feb 24, 2026 at 12:38:55AM +0530, Ekansh Gupta wrote:
> >> Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
> >> integrated in the DRM accel subsystem.
> >>
> >> The new docs introduce QDA as a DRM/accel-based implementation of
> >> Hexagon DSP offload that is intended as a modern alternative to the
> >> legacy FastRPC driver in drivers/misc. The text describes the driver
> >> motivation, high-level architecture and interaction with IOMMU context
> >> banks, GEM-based buffer management and the RPMsg transport.
> >>
> >> The user-space facing section documents the main QDA IOCTLs used to
> >> establish DSP sessions, manage GEM buffer objects and invoke remote
> >> procedures using the FastRPC protocol, along with a typical lifecycle
> >> example for applications.
> >>
> >> Finally, the driver is wired into the Compute Accelerators
> >> documentation index under Documentation/accel, and a brief debugging
> >> section shows how to enable dynamic debug for the QDA implementation.
> >>
> >> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> >> ---
> >> Documentation/accel/index.rst | 1 +
> >> Documentation/accel/qda/index.rst | 14 +++++
> >> Documentation/accel/qda/qda.rst | 129 ++++++++++++++++++++++++++++++++++++++
> >> 3 files changed, 144 insertions(+)
> >>
> >> diff --git a/Documentation/accel/index.rst b/Documentation/accel/index.rst
> >> index cbc7d4c3876a..5901ea7f784c 100644
> >> --- a/Documentation/accel/index.rst
> >> +++ b/Documentation/accel/index.rst
> >> @@ -10,4 +10,5 @@ Compute Accelerators
> >> introduction
> >> amdxdna/index
> >> qaic/index
> >> + qda/index
> >> rocket/index
> >> diff --git a/Documentation/accel/qda/index.rst b/Documentation/accel/qda/index.rst
> >> new file mode 100644
> >> index 000000000000..bce188f21117
> >> --- /dev/null
> >> +++ b/Documentation/accel/qda/index.rst
> >> @@ -0,0 +1,14 @@
> >> +.. SPDX-License-Identifier: GPL-2.0-only
> >> +
> >> +==============================
> >> + accel/qda Qualcomm DSP Driver
> >> +==============================
> >> +
> >> +The **accel/qda** driver provides support for Qualcomm Hexagon DSPs (Digital
> >> +Signal Processors) within the DRM accelerator framework. It serves as a modern
> >> +replacement for the legacy FastRPC driver, offering improved resource management
> >> +and standard subsystem integration.
> >> +
> >> +.. toctree::
> >> +
> >> + qda
> >> diff --git a/Documentation/accel/qda/qda.rst b/Documentation/accel/qda/qda.rst
> >> new file mode 100644
> >> index 000000000000..742159841b95
> >> --- /dev/null
> >> +++ b/Documentation/accel/qda/qda.rst
> >> @@ -0,0 +1,129 @@
> >> +.. SPDX-License-Identifier: GPL-2.0-only
> >> +
> >> +==================================
> >> +Qualcomm Hexagon DSP (QDA) Driver
> >> +==================================
> >> +
> >> +Introduction
> >> +============
> >> +
> >> +The **QDA** (Qualcomm DSP Accelerator) driver is a new DRM-based
> >> +accelerator driver for Qualcomm's Hexagon DSPs. It provides a standardized
> >> +interface for user-space applications to offload computational tasks ranging
> >> +from audio processing and sensor offload to computer vision and AI
> >> +inference to the Hexagon DSPs found on Qualcomm SoCs.
> >> +
> >> +This driver is designed to align with the Linux kernel's modern **Compute
> >> +Accelerators** subsystem (`drivers/accel/`), providing a robust and modular
> >> +alternative to the legacy FastRPC driver in `drivers/misc/`, offering
> >> +improved resource management and better integration with standard kernel
> >> +subsystems.
> >> +
> >> +Motivation
> >> +==========
> >> +
> >> +The existing FastRPC implementation in the kernel utilizes a custom character
> >> +device and lacks integration with modern kernel memory management frameworks.
> >> +The QDA driver addresses these limitations by:
> >> +
> >> +1. **Adopting the DRM accel Framework**: Leveraging standard uAPIs for device
> >> + management, job submission, and synchronization.
> >> +2. **Utilizing GEM for Memory**: Providing proper buffer object management,
> >> + including DMA-BUF import/export capabilities.
> >> +3. **Improving Isolation**: Using IOMMU context banks to enforce memory
> >> + isolation between different DSP user sessions.
> >> +
> >> +Key Features
> >> +============
> >> +
> >> +* **Standard Accelerator Interface**: Exposes a standard character device
> >> + node (e.g., `/dev/accel/accel0`) via the DRM subsystem.
> >> +* **Unified Offload Support**: Supports all DSP domains (ADSP, CDSP, SDSP,
> >> + GDSP) via a single driver architecture.
> >> +* **FastRPC Protocol**: Implements the reliable Remote Procedure Call
> >> + (FastRPC) protocol for communication between the application processor
> >> + and DSP.
> >> +* **DMA-BUF Interop**: Seamless sharing of memory buffers between the DSP
> >> + and other multimedia subsystems (GPU, Camera, Video) via standard DMA-BUFs.
> >> +* **Modular Design**: Clean separation between the core DRM logic, the memory
> >> + manager, and the RPMsg-based transport layer.
> >> +
> >> +Architecture
> >> +============
> >> +
> >> +The QDA driver is composed of several modular components:
> >> +
> >> +1. **Core Driver (`qda_drv`)**: Manages device registration, file operations,
> >> + and bridges the driver with the DRM accelerator subsystem.
> >> +2. **Memory Manager (`qda_memory_manager`)**: A flexible memory management
> >> + layer that handles IOMMU context banks. It supports pluggable backends
> >> + (such as DMA-coherent) to adapt to different SoC memory architectures.
> >> +3. **GEM Subsystem**: Implements the DRM GEM interface for buffer management:
> >> +
> >> + * **`qda_gem`**: Core GEM object management, including allocation, mmap
> >> + operations, and buffer lifecycle management.
> >> + * **`qda_prime`**: PRIME import functionality for DMA-BUF interoperability,
> >> + enabling seamless buffer sharing with other kernel subsystems.
> >> +
> >> +4. **Transport Layer (`qda_rpmsg`)**: Abstraction over the RPMsg framework
> >> + to handle low-level message passing with the DSP firmware.
> >> +5. **Compute Bus (`qda_compute_bus`)**: A custom virtual bus used to
> >> + enumerate and manage the specific compute context banks defined in the
> >> + device tree.
> > I'm really not sure if it's a bonus or not. I'm waiting for iommu-map
> > improvements to land to send patches reworking FastRPC CB from using
> > probe into being created by the main driver: it would remove some of the
> > possible race conditions between main driver finishing probe and the CB
> > devices probing in the background.
> >
> > What's the actual benefit of the CB bus?
> I tried following the Tegra host1x logic here as was discussed here[1]. My understanding is that
> with this the CB will become more manageable reducing the scope of races that exists in the
> current fastrpc driver.
It's nice, but then it can also be used by the existing fastrpc driver.
Would you mind splitting it to a separate changeset and submitting it?
>
> That said, I'm not completely aware about the iommu-map improvements. Is it the one
> being discussed for this patch[2]? If it helps in main driver to create CB devices directly, then I
> would be happy to adapt the design.
That would mostly mean a change to the way we describe CBs (using the
property instead of the in-tree subdevices). Anyway, as I wrote, please
submit it separately.
>
> [1] https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualcomm.com/
> [2] https://lore.kernel.org/all/20260126-kaanapali-iris-v1-3-e2646246bfc1@oss.qualcomm.com/
> >
> >> +6. **FastRPC Core (`qda_fastrpc`)**: Implements the protocol logic for
> >> + marshalling arguments and handling remote invocations.
> >> +
> >> +User-Space API
> >> +==============
> >> +
> >> +The driver exposes a set of DRM-compliant IOCTLs. Note that these are designed
> >> +to be familiar to existing FastRPC users while adhering to DRM standards.
> >> +
> >> +* `DRM_IOCTL_QDA_QUERY`: Query DSP type (e.g., "cdsp", "adsp")
> >> + and capabilities.
> >> +* `DRM_IOCTL_QDA_INIT_ATTACH`: Attach a user session to the DSP's protection
> >> + domain.
> >> +* `DRM_IOCTL_QDA_INIT_CREATE`: Initialize a new process context on the DSP.
> > You need to explain the difference between these two.
> Ack.
> >
> >> +* `DRM_IOCTL_QDA_INVOKE`: Submit a remote method invocation (the primary
> >> + execution unit).
> >> +* `DRM_IOCTL_QDA_GEM_CREATE`: Allocate a GEM buffer object for DSP usage.
> >> +* `DRM_IOCTL_QDA_GEM_MMAP_OFFSET`: Retrieve mmap offsets for memory mapping.
> >> +* `DRM_IOCTL_QDA_MAP` / `DRM_IOCTL_QDA_MUNMAP`: Map or unmap buffers into the
> >> + DSP's virtual address space.
> > Do we need to make this separate? Can we map/unmap buffers on their
> > usage? Or when they are created? I'm thinking about that the
> > virtualization.
> The lib provides ways(fastrpc_mmap/remote_mmap64) for users to map/unmap the
> buffers on DSP as per processes requirement. The ioctls are added to support the same.
If the buffers are mapped, then library calls become empty calls. Let's
focus on the API first and adapt to the library later on.
> > An alternative approach would be to merge
> > GET_MMAP_OFFSET with _MAP: once you map it to the DSP memory, you will
> > get the offset.
> _MAP is not need for all the buffers. Most of the remote call buffers that are passed to DSP
> are automatically mapped by DSP before invoking the DSP implementation so the user-space
> does not need to call _MAP for these.
Is there a reason for that? I'd really prefer if we change it, making it
more effective and more controllable.
>
> Some buffers(e.g., shared persistent buffers) do require explicit mapping, which is why
> MAP/MUNMAP exists in FastRPC.
>
> Because of this behavioral difference, merging GET_MMAP_OFFSET with MAP is not accurate.
> GET_MMAP_OFFSET is for CPU‑side mmap via GEM, whereas MAP is specifically for DSP
> virtual address assignment.
> >
> >> +
> >> +Usage Example
> >> +=============
> >> +
> >> +A typical lifecycle for a user-space application:
> >> +
> >> +1. **Discovery**: Open `/dev/accel/accel*` and check
> >> + `DRM_IOCTL_QDA_QUERY` to find the desired DSP (e.g., CDSP for
> >> + compute workloads).
> >> +2. **Initialization**: Call `DRM_IOCTL_QDA_INIT_ATTACH` and
> >> + `DRM_IOCTL_QDA_INIT_CREATE` to establish a session.
> >> +3. **Memory**: Allocate buffers via `DRM_IOCTL_QDA_GEM_CREATE` or import
> >> + DMA-BUFs (PRIME fd) from other drivers using `DRM_IOCTL_PRIME_FD_TO_HANDLE`.
> >> +4. **Execution**: Use `DRM_IOCTL_QDA_INVOKE` to pass arguments and execute
> >> + functions on the DSP.
> >> +5. **Cleanup**: Close file descriptors to automatically release resources and
> >> + detach the session.
> >> +
> >> +Internal Implementation
> >> +=======================
> >> +
> >> +Memory Management
> >> +-----------------
> >> +The driver's memory manager creates virtual "IOMMU devices" that map to
> >> +hardware context banks. This allows the driver to manage multiple isolated
> >> +address spaces. The implementation currently uses a **DMA-coherent backend**
> >> +to ensure data consistency between the CPU and DSP without manual cache
> >> +maintenance in most cases.
> >> +
> >> +Debugging
> >> +=========
> >> +The driver includes extensive dynamic debug support. Enable it via the
> >> +kernel's dynamic debug control:
> >> +
> >> +.. code-block:: bash
> >> +
> >> + echo "file drivers/accel/qda/* +p" > /sys/kernel/debug/dynamic_debug/control
> > Please add documentation on how to build the test apps and how to load
> > them to the DSP.
> Ack.
> >
> >> --
> >> 2.34.1
> >>
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
2026-02-23 22:44 ` Dmitry Baryshkov
@ 2026-02-25 17:56 ` Ekansh Gupta
2026-02-25 19:09 ` Dmitry Baryshkov
0 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-25 17:56 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 4:14 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:38:58AM +0530, Ekansh Gupta wrote:
>> Introduce a built-in compute context-bank (CB) bus used by the Qualcomm
>> DSP accelerator (QDA) driver to represent DSP CB devices that require
>> IOMMU configuration. This separates the CB bus from the QDA driver and
>> allows QDA to remain a loadable module while the bus is always built-in.
> Why? What is the actual problem that you are trying to solve?
Bus needs to be built-in as it is being used by iommu driver. I'll add more details here.
>
>> A new bool Kconfig symbol DRM_ACCEL_QDA_COMPUTE_BUS is added and is
> Don't describe the patch contents. Please.
Ack.
>
>> selected by the main DRM_ACCEL_QDA driver. The parent accel Makefile is
>> updated to descend into the QDA directory for both built-in and module
>> builds so that the CB bus is compiled into vmlinux while the driver
>> remains modular.
>>
>> The CB bus is registered at postcore_initcall() time and is exposed to
>> the IOMMU core through iommu_buses[] in the same way as the Tegra
>> host1x context-bus. This enables later patches to create CB devices on
>> this bus and obtain IOMMU domains for them.
> Note, there is nothing QDA-specific in this patch. Please explain, why
> the bus is QDA-specific? Can we generalize it?
I needed a custom bus here to use for the compute cb devices for iommu
configurations, I don't see any reason to keep it QDA-specific. The only requirement
is that this should be enabled built in whenever QDA is enabled.
But if I keep it generic, where should this be placed? Should it be accel(or drm?) specific?
>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/Makefile | 1 +
>> drivers/accel/qda/Kconfig | 5 +++++
>> drivers/accel/qda/Makefile | 2 ++
>> drivers/accel/qda/qda_compute_bus.c | 23 +++++++++++++++++++++++
>> drivers/iommu/iommu.c | 4 ++++
>> include/linux/qda_compute_bus.h | 22 ++++++++++++++++++++++
>> 6 files changed, 57 insertions(+)
>>
>> diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
>> index 58c08dd5f389..9ed843cd293f 100644
>> --- a/drivers/accel/Makefile
>> +++ b/drivers/accel/Makefile
>> @@ -6,4 +6,5 @@ obj-$(CONFIG_DRM_ACCEL_HABANALABS) += habanalabs/
>> obj-$(CONFIG_DRM_ACCEL_IVPU) += ivpu/
>> obj-$(CONFIG_DRM_ACCEL_QAIC) += qaic/
>> obj-$(CONFIG_DRM_ACCEL_QDA) += qda/
>> +obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda/
>> obj-$(CONFIG_DRM_ACCEL_ROCKET) += rocket/
>> \ No newline at end of file
>> diff --git a/drivers/accel/qda/Kconfig b/drivers/accel/qda/Kconfig
>> index 484d21ff1b55..ef1fa384efbe 100644
>> --- a/drivers/accel/qda/Kconfig
>> +++ b/drivers/accel/qda/Kconfig
>> @@ -3,11 +3,16 @@
>> # Qualcomm DSP accelerator driver
>> #
>>
>> +
>> +config DRM_ACCEL_QDA_COMPUTE_BUS
>> + bool
>> +
>> config DRM_ACCEL_QDA
>> tristate "Qualcomm DSP accelerator"
>> depends on DRM_ACCEL
>> depends on ARCH_QCOM || COMPILE_TEST
>> depends on RPMSG
>> + select DRM_ACCEL_QDA_COMPUTE_BUS
>> help
>> Enables the DRM-based accelerator driver for Qualcomm's Hexagon DSPs.
>> This driver provides a standardized interface for offloading computational
>> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
>> index e7f23182589b..242684ef1af7 100644
>> --- a/drivers/accel/qda/Makefile
>> +++ b/drivers/accel/qda/Makefile
>> @@ -8,3 +8,5 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
>> qda-y := \
>> qda_drv.o \
>> qda_rpmsg.o \
>> +
>> +obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
>> diff --git a/drivers/accel/qda/qda_compute_bus.c b/drivers/accel/qda/qda_compute_bus.c
>> new file mode 100644
>> index 000000000000..1d9c39948fb5
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_compute_bus.c
>> @@ -0,0 +1,23 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> +#include <linux/device.h>
>> +#include <linux/init.h>
>> +
>> +struct bus_type qda_cb_bus_type = {
>> + .name = "qda-compute-cb",
>> +};
>> +EXPORT_SYMBOL_GPL(qda_cb_bus_type);
>> +
>> +static int __init qda_cb_bus_init(void)
>> +{
>> + int err;
>> +
>> + err = bus_register(&qda_cb_bus_type);
>> + if (err < 0) {
>> + pr_err("qda-compute-cb bus registration failed: %d\n", err);
>> + return err;
>> + }
>> + return 0;
>> +}
>> +
>> +postcore_initcall(qda_cb_bus_init);
>> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
>> index 4926a43118e6..5dee912686ee 100644
>> --- a/drivers/iommu/iommu.c
>> +++ b/drivers/iommu/iommu.c
>> @@ -33,6 +33,7 @@
>> #include <trace/events/iommu.h>
>> #include <linux/sched/mm.h>
>> #include <linux/msi.h>
>> +#include <linux/qda_compute_bus.h>
>> #include <uapi/linux/iommufd.h>
>>
>> #include "dma-iommu.h"
>> @@ -178,6 +179,9 @@ static const struct bus_type * const iommu_buses[] = {
>> #ifdef CONFIG_CDX_BUS
>> &cdx_bus_type,
>> #endif
>> +#ifdef CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS
>> + &qda_cb_bus_type,
>> +#endif
>> };
>>
>> /*
>> diff --git a/include/linux/qda_compute_bus.h b/include/linux/qda_compute_bus.h
>> new file mode 100644
>> index 000000000000..807122d84e3f
>> --- /dev/null
>> +++ b/include/linux/qda_compute_bus.h
>> @@ -0,0 +1,22 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> + */
>> +
>> +#ifndef __QDA_COMPUTE_BUS_H__
>> +#define __QDA_COMPUTE_BUS_H__
>> +
>> +#include <linux/device.h>
>> +
>> +/*
>> + * Custom bus type for QDA compute context bank (CB) devices
>> + *
>> + * This bus type is used for manually created CB devices that represent
>> + * IOMMU context banks. The custom bus allows proper IOMMU configuration
>> + * and device management for these virtual devices.
>> + */
>> +#ifdef CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS
>> +extern struct bus_type qda_cb_bus_type;
>> +#endif
>> +
>> +#endif /* __QDA_COMPUTE_BUS_H__ */
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
2026-02-25 17:56 ` Ekansh Gupta
@ 2026-02-25 19:09 ` Dmitry Baryshkov
2026-03-02 8:12 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-25 19:09 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Wed, Feb 25, 2026 at 11:26:52PM +0530, Ekansh Gupta wrote:
>
>
> On 2/24/2026 4:14 AM, Dmitry Baryshkov wrote:
> > On Tue, Feb 24, 2026 at 12:38:58AM +0530, Ekansh Gupta wrote:
> >> Introduce a built-in compute context-bank (CB) bus used by the Qualcomm
> >> DSP accelerator (QDA) driver to represent DSP CB devices that require
> >> IOMMU configuration. This separates the CB bus from the QDA driver and
> >> allows QDA to remain a loadable module while the bus is always built-in.
> > Why? What is the actual problem that you are trying to solve?
> Bus needs to be built-in as it is being used by iommu driver. I'll add more details here.
It's an implementation detail. Start your commit message with the
description of the issue or a problem that you are solving.
> >
> >> A new bool Kconfig symbol DRM_ACCEL_QDA_COMPUTE_BUS is added and is
> > Don't describe the patch contents. Please.
> Ack.
> >
> >> selected by the main DRM_ACCEL_QDA driver. The parent accel Makefile is
> >> updated to descend into the QDA directory for both built-in and module
> >> builds so that the CB bus is compiled into vmlinux while the driver
> >> remains modular.
> >>
> >> The CB bus is registered at postcore_initcall() time and is exposed to
> >> the IOMMU core through iommu_buses[] in the same way as the Tegra
> >> host1x context-bus. This enables later patches to create CB devices on
> >> this bus and obtain IOMMU domains for them.
> > Note, there is nothing QDA-specific in this patch. Please explain, why
> > the bus is QDA-specific? Can we generalize it?
> I needed a custom bus here to use for the compute cb devices for iommu
> configurations, I don't see any reason to keep it QDA-specific. The only requirement
> is that this should be enabled built in whenever QDA is enabled.
Why? FastRPC uses platform_bus. You need to explain, why it's not
correct.
>
> But if I keep it generic, where should this be placed? Should it be accel(or drm?) specific?
drivers/base? Or drivers/iommu? That would totally depend on the issue
description. E.g. can we use the same code for host1x?
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver
2026-02-25 13:42 ` Bryan O'Donoghue
@ 2026-02-25 19:12 ` Dmitry Baryshkov
0 siblings, 0 replies; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-25 19:12 UTC (permalink / raw)
To: Bryan O'Donoghue
Cc: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König, dri-devel, linux-doc,
linux-kernel, linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Wed, Feb 25, 2026 at 01:42:19PM +0000, Bryan O'Donoghue wrote:
> On 23/02/2026 19:08, Ekansh Gupta wrote:
> > User-space staging branch
> > ============
> > https://github.com/qualcomm/fastrpc/tree/accel/staging
>
> What would be really nice to see would be mesa integration allowing
> convergence of the xDSP/xPU accelerator space around something like a
> standard.
I'd say, writing Mesa compiler to build Hexagon code for Teflon frontend
would be a nice item. It would probably also allow us to use DSPs for
OpenCL acceleration. But, I'd say, it's a separate topic.
>
> See: https://blog.tomeuvizoso.net/2025/07/rockchip-npu-update-6-we-are-in-mainline.html
>
> ---
> bod
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
2026-02-25 15:12 ` Bjorn Andersson
@ 2026-02-25 19:16 ` Trilok Soni
2026-02-25 19:40 ` Dmitry Baryshkov
0 siblings, 1 reply; 83+ messages in thread
From: Trilok Soni @ 2026-02-25 19:16 UTC (permalink / raw)
To: Bjorn Andersson, Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On 2/25/2026 7:12 AM, Bjorn Andersson wrote:
> On Wed, Feb 25, 2026 at 07:47:08PM +0530, Ekansh Gupta wrote:
>>
>>
>> On 2/24/2026 9:03 AM, Trilok Soni wrote:
>>> On 2/23/2026 11:08 AM, Ekansh Gupta wrote:
>>>> Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
>>>> integrated in the DRM accel subsystem.
>>>>
>>>> The new docs introduce QDA as a DRM/accel-based implementation of
>>>> Hexagon DSP offload that is intended as a modern alternative to the
>>>> legacy FastRPC driver in drivers/misc. The text describes the driver
>>>> motivation, high-level architecture and interaction with IOMMU context
>>>> banks, GEM-based buffer management and the RPMsg transport.
>>>>
>>>> The user-space facing section documents the main QDA IOCTLs used to
>>>> establish DSP sessions, manage GEM buffer objects and invoke remote
>>>> procedures using the FastRPC protocol, along with a typical lifecycle
>>>> example for applications.
>>>>
>>>> Finally, the driver is wired into the Compute Accelerators
>>>> documentation index under Documentation/accel, and a brief debugging
>>>> section shows how to enable dynamic debug for the QDA implementation.
>>> So existing applications written over character device UAPI needs to be
>>> rewritten over new UAPI and it will be broken once this driver gets
>>> merged? Are we going to keep both the drivers in the Linux kernel
>>> and not deprecate the /char device one?
>>>
>>> Is Qualcomm going to provide the wrapper library in the userspace
>>> so that existing applications by our customers and developers
>>> keep working w/ the newer kernel if the char interface based
>>> driver gets deprecated? It is not clear from your text above.
>> Thanks for raising this, Trilok.
>>
>> This is one of the open items that I have. I'm not exactly sure what would be the
>> acceptable way for this.
>>
>> As you mentioned, applications that rely on /dev/fastrpc* might not work on QDA
>> without modification.
>>
>> I was thinking in the same lines as you have mentioned and having some shim/compat
>> driver to translate FastRPC UAPI to QDA. The compat driver would expose the existing
>> character devices and route the calls to QDA. The compat driver could be built via Kconfig.
>>
>
> This is a fundamental requirement, you need to address this in order for
> this to move forward.
>
> Which makes me wonder if it would be possible to reach an accel driver
> through incremental transition of the current driver, instead of just
> dropping in a few thousand lines of new code/design.
>
>> However, I haven’t encountered an example of such a UAPI‑translation driver in the kernel
>> before, so I would want guidance from maintainers on whether this is an acceptable
>> model or not.
>>
>> Regarding your question about library, all the APIs exposed by github/fastrpc library are kept
>> unchanged in terms of definitions and expectation. The same project can be build for both
>> FastRPC and QDA based on configure options. So, the applications using github/fastrpc should
>> not face any problem if the libs is built with proper configure options.
>>
>
> You're assuming that the kernel and userspace are a unified piece of
> software, they are not. It must be possible for me to install a new
> kernel package without having to replace the userspace libraries.
Thank you Bjorn for providing the inputs.
I also foresee that we will be stop adding (or already happened) new features
into the existing fastrpc driver, so calling the new driver as an alternative
is in oversold category.
You are pretty much began the deprecating the existing fastrpc driver, so let's
just mention it if that is the case and provide migration/shim path so that
existing binaries doesn't break.
---Trilok Soni
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
2026-02-25 19:16 ` Trilok Soni
@ 2026-02-25 19:40 ` Dmitry Baryshkov
2026-02-25 23:18 ` Trilok Soni
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-25 19:40 UTC (permalink / raw)
To: Trilok Soni
Cc: Bjorn Andersson, Ekansh Gupta, Oded Gabbay, Jonathan Corbet,
Shuah Khan, Joerg Roedel, Will Deacon, Robin Murphy,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Sumit Semwal, Christian König, dri-devel,
linux-doc, linux-kernel, linux-arm-msm, iommu, linux-media,
linaro-mm-sig, Srinivas Kandagatla, Bharath Kumar,
Chenna Kesava Raju
On Wed, Feb 25, 2026 at 11:16:26AM -0800, Trilok Soni wrote:
> On 2/25/2026 7:12 AM, Bjorn Andersson wrote:
> > On Wed, Feb 25, 2026 at 07:47:08PM +0530, Ekansh Gupta wrote:
> >>
> >>
> >> On 2/24/2026 9:03 AM, Trilok Soni wrote:
> >>> On 2/23/2026 11:08 AM, Ekansh Gupta wrote:
> >>>> Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
> >>>> integrated in the DRM accel subsystem.
> >>>>
> >>>> The new docs introduce QDA as a DRM/accel-based implementation of
> >>>> Hexagon DSP offload that is intended as a modern alternative to the
> >>>> legacy FastRPC driver in drivers/misc. The text describes the driver
> >>>> motivation, high-level architecture and interaction with IOMMU context
> >>>> banks, GEM-based buffer management and the RPMsg transport.
> >>>>
> >>>> The user-space facing section documents the main QDA IOCTLs used to
> >>>> establish DSP sessions, manage GEM buffer objects and invoke remote
> >>>> procedures using the FastRPC protocol, along with a typical lifecycle
> >>>> example for applications.
> >>>>
> >>>> Finally, the driver is wired into the Compute Accelerators
> >>>> documentation index under Documentation/accel, and a brief debugging
> >>>> section shows how to enable dynamic debug for the QDA implementation.
> >>> So existing applications written over character device UAPI needs to be
> >>> rewritten over new UAPI and it will be broken once this driver gets
> >>> merged? Are we going to keep both the drivers in the Linux kernel
> >>> and not deprecate the /char device one?
> >>>
> >>> Is Qualcomm going to provide the wrapper library in the userspace
> >>> so that existing applications by our customers and developers
> >>> keep working w/ the newer kernel if the char interface based
> >>> driver gets deprecated? It is not clear from your text above.
> >> Thanks for raising this, Trilok.
> >>
> >> This is one of the open items that I have. I'm not exactly sure what would be the
> >> acceptable way for this.
> >>
> >> As you mentioned, applications that rely on /dev/fastrpc* might not work on QDA
> >> without modification.
> >>
> >> I was thinking in the same lines as you have mentioned and having some shim/compat
> >> driver to translate FastRPC UAPI to QDA. The compat driver would expose the existing
> >> character devices and route the calls to QDA. The compat driver could be built via Kconfig.
> >>
> >
> > This is a fundamental requirement, you need to address this in order for
> > this to move forward.
> >
> > Which makes me wonder if it would be possible to reach an accel driver
> > through incremental transition of the current driver, instead of just
> > dropping in a few thousand lines of new code/design.
> >
> >> However, I haven’t encountered an example of such a UAPI‑translation driver in the kernel
> >> before, so I would want guidance from maintainers on whether this is an acceptable
> >> model or not.
> >>
> >> Regarding your question about library, all the APIs exposed by github/fastrpc library are kept
> >> unchanged in terms of definitions and expectation. The same project can be build for both
> >> FastRPC and QDA based on configure options. So, the applications using github/fastrpc should
> >> not face any problem if the libs is built with proper configure options.
> >>
> >
> > You're assuming that the kernel and userspace are a unified piece of
> > software, they are not. It must be possible for me to install a new
> > kernel package without having to replace the userspace libraries.
>
> Thank you Bjorn for providing the inputs.
>
> I also foresee that we will be stop adding (or already happened) new features
> into the existing fastrpc driver, so calling the new driver as an alternative
> is in oversold category.
>
> You are pretty much began the deprecating the existing fastrpc driver, so let's
> just mention it if that is the case and provide migration/shim path so that
> existing binaries doesn't break.
I agree that we need a migration path, but I'd really focus on it after
getting at least basic parts of the QDA reviewed and agreed upon.
Otherwise the shim layer will be reworked again and again with no
immediate added benefit.
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs
2026-02-25 19:40 ` Dmitry Baryshkov
@ 2026-02-25 23:18 ` Trilok Soni
0 siblings, 0 replies; 83+ messages in thread
From: Trilok Soni @ 2026-02-25 23:18 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Bjorn Andersson, Ekansh Gupta, Oded Gabbay, Jonathan Corbet,
Shuah Khan, Joerg Roedel, Will Deacon, Robin Murphy,
Maarten Lankhorst, Maxime Ripard, Thomas Zimmermann, David Airlie,
Simona Vetter, Sumit Semwal, Christian König, dri-devel,
linux-doc, linux-kernel, linux-arm-msm, iommu, linux-media,
linaro-mm-sig, Srinivas Kandagatla, Bharath Kumar,
Chenna Kesava Raju
On 2/25/2026 11:40 AM, Dmitry Baryshkov wrote:
> On Wed, Feb 25, 2026 at 11:16:26AM -0800, Trilok Soni wrote:
>> On 2/25/2026 7:12 AM, Bjorn Andersson wrote:
>>> On Wed, Feb 25, 2026 at 07:47:08PM +0530, Ekansh Gupta wrote:
>>>>
>>>>
>>>> On 2/24/2026 9:03 AM, Trilok Soni wrote:
>>>>> On 2/23/2026 11:08 AM, Ekansh Gupta wrote:
>>>>>> Add initial documentation for the Qualcomm DSP Accelerator (QDA) driver
>>>>>> integrated in the DRM accel subsystem.
>>>>>>
>>>>>> The new docs introduce QDA as a DRM/accel-based implementation of
>>>>>> Hexagon DSP offload that is intended as a modern alternative to the
>>>>>> legacy FastRPC driver in drivers/misc. The text describes the driver
>>>>>> motivation, high-level architecture and interaction with IOMMU context
>>>>>> banks, GEM-based buffer management and the RPMsg transport.
>>>>>>
>>>>>> The user-space facing section documents the main QDA IOCTLs used to
>>>>>> establish DSP sessions, manage GEM buffer objects and invoke remote
>>>>>> procedures using the FastRPC protocol, along with a typical lifecycle
>>>>>> example for applications.
>>>>>>
>>>>>> Finally, the driver is wired into the Compute Accelerators
>>>>>> documentation index under Documentation/accel, and a brief debugging
>>>>>> section shows how to enable dynamic debug for the QDA implementation.
>>>>> So existing applications written over character device UAPI needs to be
>>>>> rewritten over new UAPI and it will be broken once this driver gets
>>>>> merged? Are we going to keep both the drivers in the Linux kernel
>>>>> and not deprecate the /char device one?
>>>>>
>>>>> Is Qualcomm going to provide the wrapper library in the userspace
>>>>> so that existing applications by our customers and developers
>>>>> keep working w/ the newer kernel if the char interface based
>>>>> driver gets deprecated? It is not clear from your text above.
>>>> Thanks for raising this, Trilok.
>>>>
>>>> This is one of the open items that I have. I'm not exactly sure what would be the
>>>> acceptable way for this.
>>>>
>>>> As you mentioned, applications that rely on /dev/fastrpc* might not work on QDA
>>>> without modification.
>>>>
>>>> I was thinking in the same lines as you have mentioned and having some shim/compat
>>>> driver to translate FastRPC UAPI to QDA. The compat driver would expose the existing
>>>> character devices and route the calls to QDA. The compat driver could be built via Kconfig.
>>>>
>>>
>>> This is a fundamental requirement, you need to address this in order for
>>> this to move forward.
>>>
>>> Which makes me wonder if it would be possible to reach an accel driver
>>> through incremental transition of the current driver, instead of just
>>> dropping in a few thousand lines of new code/design.
>>>
>>>> However, I haven’t encountered an example of such a UAPI‑translation driver in the kernel
>>>> before, so I would want guidance from maintainers on whether this is an acceptable
>>>> model or not.
>>>>
>>>> Regarding your question about library, all the APIs exposed by github/fastrpc library are kept
>>>> unchanged in terms of definitions and expectation. The same project can be build for both
>>>> FastRPC and QDA based on configure options. So, the applications using github/fastrpc should
>>>> not face any problem if the libs is built with proper configure options.
>>>>
>>>
>>> You're assuming that the kernel and userspace are a unified piece of
>>> software, they are not. It must be possible for me to install a new
>>> kernel package without having to replace the userspace libraries.
>>
>> Thank you Bjorn for providing the inputs.
>>
>> I also foresee that we will be stop adding (or already happened) new features
>> into the existing fastrpc driver, so calling the new driver as an alternative
>> is in oversold category.
>>
>> You are pretty much began the deprecating the existing fastrpc driver, so let's
>> just mention it if that is the case and provide migration/shim path so that
>> existing binaries doesn't break.
>
> I agree that we need a migration path, but I'd really focus on it after
> getting at least basic parts of the QDA reviewed and agreed upon.
> Otherwise the shim layer will be reworked again and again with no
> immediate added benefit.
>
I am fine with the review to be continued, this is RFC series anyway. We should also decide
the design of the shim layer here as well. I prefer to not have multiple
RFC revisions here if we don't agree on the basic requirements which
leads to acceptance of this new driver.
---Trilok Soni
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 05/18] accel/qda: Create compute CB devices on QDA compute bus
2026-02-23 22:49 ` Dmitry Baryshkov
@ 2026-02-26 8:38 ` Ekansh Gupta
2026-02-26 10:46 ` Dmitry Baryshkov
0 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-02-26 8:38 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 4:19 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:38:59AM +0530, Ekansh Gupta wrote:
>> Add support for creating compute context-bank (CB) devices under
>> the QDA compute bus based on child nodes of the FastRPC RPMsg
>> device tree node. Each DT child with compatible
>> "qcom,fastrpc-compute-cb" is turned into a QDA-owned struct
>> device on qda_cb_bus_type.
>>
>> A new qda_cb_dev structure and cb_devs list in qda_dev track these
>> CB devices. qda_populate_child_devices() walks the DT children
>> during QDA RPMsg probe, creates CB devices, configures their DMA
>> and IOMMU settings using of_dma_configure(), and associates a SID
>> from the "reg" property when present.
>>
>> On RPMsg remove, qda_unpopulate_child_devices() tears down all CB
>> devices, removing them from their IOMMU groups if present and
>> unregistering the devices. This prepares the ground for using CB
>> devices as IOMMU endpoints for DSP compute workloads in later
>> patches.
> Are we loosing the nsessions support?
Yes, it's not part of this series. I'll try bringing that as well.
>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/Makefile | 1 +
>> drivers/accel/qda/qda_cb.c | 150 ++++++++++++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_cb.h | 26 ++++++++
>> drivers/accel/qda/qda_drv.h | 3 +
>> drivers/accel/qda/qda_rpmsg.c | 40 +++++++++++
>> 5 files changed, 220 insertions(+)
>>
>> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
>> index 242684ef1af7..4aded20b6bc2 100644
>> --- a/drivers/accel/qda/Makefile
>> +++ b/drivers/accel/qda/Makefile
>> @@ -8,5 +8,6 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
>> qda-y := \
>> qda_drv.o \
>> qda_rpmsg.o \
>> + qda_cb.o \
>>
>> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
>> diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
>> new file mode 100644
>> index 000000000000..77a2d8cae076
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_cb.c
>> @@ -0,0 +1,150 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> +#include <linux/dma-mapping.h>
>> +#include <linux/device.h>
>> +#include <linux/of.h>
>> +#include <linux/of_device.h>
>> +#include <linux/iommu.h>
>> +#include <linux/slab.h>
>> +#include "qda_drv.h"
>> +#include "qda_cb.h"
>> +
>> +static void qda_cb_dev_release(struct device *dev)
>> +{
>> + kfree(dev);
> Do you need to put the reference on the OF node?
Reference put is happening as part of qda_destroy_cb_device.
>
>> +}
>> +
>> +static int qda_configure_cb_iommu(struct device *cb_dev, struct device_node *cb_node)
>> +{
>> + int ret;
>> +
>> + qda_dbg(NULL, "Configuring DMA/IOMMU for CB device %s\n", dev_name(cb_dev));
>> +
>> + /* Use of_dma_configure which handles both DMA and IOMMU configuration */
>> + ret = of_dma_configure(cb_dev, cb_node, true);
>> + if (ret) {
>> + qda_err(NULL, "of_dma_configure failed for %s: %d\n", dev_name(cb_dev), ret);
>> + return ret;
>> + }
>> +
>> + qda_dbg(NULL, "DMA/IOMMU configured successfully for CB device %s\n", dev_name(cb_dev));
>> + return 0;
>> +}
>> +
>> +static int qda_cb_setup_device(struct qda_dev *qdev, struct device *cb_dev)
>> +{
>> + int rc;
>> + u32 sid, pa_bits = 32;
>> +
>> + qda_dbg(qdev, "Setting up CB device %s\n", dev_name(cb_dev));
>> +
>> + if (of_property_read_u32(cb_dev->of_node, "reg", &sid)) {
>> + qda_dbg(qdev, "No 'reg' property found, defaulting SID to 0\n");
>> + sid = 0;
> Don't do the job of the schema validator. Are there nodes without reg?
> No.
Ack.
>
>> + }
>> +
>> + rc = dma_set_mask(cb_dev, DMA_BIT_MASK(pa_bits));
>> + if (rc) {
>> + qda_err(qdev, "%d bit DMA enable failed: %d\n", pa_bits, rc);
>> + return rc;
>> + }
>> +
>> + qda_dbg(qdev, "CB device setup complete - SID: %u, PA bits: %u\n", sid, pa_bits);
>> +
>> + return 0;
>> +}
>> +
>> +int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node)
>> +{
>> + struct device *cb_dev;
>> + int ret;
>> + u32 sid = 0;
>> + struct qda_cb_dev *entry;
>> +
>> + qda_dbg(qdev, "Creating CB device for node: %s\n", cb_node->name);
>> +
>> + of_property_read_u32(cb_node, "reg", &sid);
>> +
>> + cb_dev = kzalloc_obj(*cb_dev, GFP_KERNEL);
>> + if (!cb_dev)
>> + return -ENOMEM;
>> +
>> + device_initialize(cb_dev);
>> + cb_dev->parent = qdev->dev;
>> + cb_dev->bus = &qda_cb_bus_type; /* Use our custom bus type for IOMMU handling */
>> + cb_dev->release = qda_cb_dev_release;
>> + dev_set_name(cb_dev, "qda-cb-%s-%u", qdev->dsp_name, sid);
>> +
>> + qda_dbg(qdev, "Initialized CB device: %s\n", dev_name(cb_dev));
>> +
>> + cb_dev->of_node = of_node_get(cb_node);
>> +
>> + cb_dev->dma_mask = &cb_dev->coherent_dma_mask;
>> + cb_dev->coherent_dma_mask = DMA_BIT_MASK(32);
>> +
>> + dev_set_drvdata(cb_dev->parent, qdev);
>> +
>> + ret = device_add(cb_dev);
>> + if (ret) {
>> + qda_err(qdev, "Failed to add CB device for SID %u: %d\n", sid, ret);
>> + goto cleanup_device_init;
>> + }
>> +
>> + qda_dbg(qdev, "CB device added to system\n");
>> +
>> + ret = qda_configure_cb_iommu(cb_dev, cb_node);
>> + if (ret) {
>> + qda_err(qdev, "IOMMU configuration failed: %d\n", ret);
>> + goto cleanup_device_add;
>> + }
>> +
>> + ret = qda_cb_setup_device(qdev, cb_dev);
>> + if (ret) {
>> + qda_err(qdev, "CB device setup failed: %d\n", ret);
>> + goto cleanup_device_add;
>> + }
>> +
>> + entry = kzalloc(sizeof(*entry), GFP_KERNEL);
>> + if (!entry) {
>> + ret = -ENOMEM;
>> + goto cleanup_device_add;
>> + }
>> +
>> + entry->dev = cb_dev;
>> + list_add_tail(&entry->node, &qdev->cb_devs);
>> +
>> + qda_dbg(qdev, "Successfully created CB device for SID %u\n", sid);
>> + return 0;
>> +
>> +cleanup_device_add:
>> + device_del(cb_dev);
>> +cleanup_device_init:
>> + of_node_put(cb_dev->of_node);
>> + put_device(cb_dev);
>> + return ret;
>> +}
>> +
>> +void qda_destroy_cb_device(struct device *cb_dev)
>> +{
>> + struct iommu_group *group;
>> +
>> + if (!cb_dev) {
>> + qda_dbg(NULL, "NULL CB device passed to destroy\n");
>> + return;
>> + }
>> +
>> + qda_dbg(NULL, "Destroying CB device %s\n", dev_name(cb_dev));
>> +
>> + group = iommu_group_get(cb_dev);
>> + if (group) {
>> + qda_dbg(NULL, "Removing %s from IOMMU group\n", dev_name(cb_dev));
>> + iommu_group_remove_device(cb_dev);
>> + iommu_group_put(group);
>> + }
>> +
>> + of_node_put(cb_dev->of_node);
>> + cb_dev->of_node = NULL;
>> + device_unregister(cb_dev);
>> +
>> + qda_dbg(NULL, "CB device %s destroyed\n", dev_name(cb_dev));
>> +}
>> diff --git a/drivers/accel/qda/qda_cb.h b/drivers/accel/qda/qda_cb.h
>> new file mode 100644
>> index 000000000000..a4ae9fef142e
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_cb.h
>> @@ -0,0 +1,26 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> + */
>> +
>> +#ifndef __QDA_CB_H__
>> +#define __QDA_CB_H__
>> +
>> +#include <linux/device.h>
>> +#include <linux/of.h>
>> +#include <linux/list.h>
>> +#include <linux/qda_compute_bus.h>
>> +#include "qda_drv.h"
>> +
>> +struct qda_cb_dev {
>> + struct list_head node;
>> + struct device *dev;
>> +};
>> +
>> +/*
>> + * Compute bus (CB) device management
>> + */
>> +int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node);
>> +void qda_destroy_cb_device(struct device *cb_dev);
>> +
>> +#endif /* __QDA_CB_H__ */
>> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
>> index bec2d31ca1bb..eb732b7d8091 100644
>> --- a/drivers/accel/qda/qda_drv.h
>> +++ b/drivers/accel/qda/qda_drv.h
>> @@ -7,6 +7,7 @@
>> #define __QDA_DRV_H__
>>
>> #include <linux/device.h>
>> +#include <linux/list.h>
>> #include <linux/mutex.h>
>> #include <linux/rpmsg.h>
>> #include <linux/xarray.h>
>> @@ -26,6 +27,8 @@ struct qda_dev {
>> atomic_t removing;
>> /* Name of the DSP (e.g., "cdsp", "adsp") */
>> char dsp_name[16];
>> + /* Compute context-bank (CB) child devices */
>> + struct list_head cb_devs;
>> };
>>
>> /**
>> diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
>> index a8b24a99ca13..5a57384de6a2 100644
>> --- a/drivers/accel/qda/qda_rpmsg.c
>> +++ b/drivers/accel/qda/qda_rpmsg.c
>> @@ -7,6 +7,7 @@
>> #include <linux/of_device.h>
>> #include "qda_drv.h"
>> #include "qda_rpmsg.h"
>> +#include "qda_cb.h"
>>
>> static int qda_rpmsg_init(struct qda_dev *qdev)
>> {
>> @@ -25,11 +26,42 @@ static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
>>
>> qdev->dev = &rpdev->dev;
>> qdev->rpdev = rpdev;
>> + INIT_LIST_HEAD(&qdev->cb_devs);
>>
>> qda_dbg(qdev, "Allocated and initialized qda_dev\n");
>> return qdev;
>> }
>>
>> +static void qda_unpopulate_child_devices(struct qda_dev *qdev)
>> +{
>> + struct qda_cb_dev *entry, *tmp;
>> +
>> + list_for_each_entry_safe(entry, tmp, &qdev->cb_devs, node) {
>> + list_del(&entry->node);
>> + qda_destroy_cb_device(entry->dev);
>> + kfree(entry);
> Why can't you embed struct device into a structure together with the
> list_node (and possibly some other data?)?
I'll check this.
>
>> + }
>> +}
>> +
>> +static int qda_populate_child_devices(struct qda_dev *qdev, struct device_node *parent_node)
>> +{
>> + struct device_node *child;
>> + int count = 0, success = 0;
>> +
>> + for_each_child_of_node(parent_node, child) {
>> + if (of_device_is_compatible(child, "qcom,fastrpc-compute-cb")) {
>> + count++;
>> + if (qda_create_cb_device(qdev, child) == 0) {
>> + success++;
>> + qda_dbg(qdev, "Created CB device for node: %s\n", child->name);
>> + } else {
>> + qda_err(qdev, "Failed to create CB device for: %s\n", child->name);
> Don't loose the error code. Instead please return it to the caller.
Ack.
>
>> + }
>> + }
>> + }
>> + return success > 0 ? 0 : (count > 0 ? -ENODEV : 0);
>> +}
>> +
>> static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, void *priv, u32 src)
>> {
>> /* Dummy function for rpmsg driver */
>> @@ -48,6 +80,7 @@ static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
>> qdev->rpdev = NULL;
>> mutex_unlock(&qdev->lock);
>>
>> + qda_unpopulate_child_devices(qdev);
>> qda_deinit_device(qdev);
>>
>> qda_info(qdev, "RPMsg device removed\n");
>> @@ -83,6 +116,13 @@ static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
>> if (ret)
>> return ret;
>>
>> + ret = qda_populate_child_devices(qdev, rpdev->dev.of_node);
>> + if (ret) {
>> + qda_err(qdev, "Failed to populate child devices: %d\n", ret);
>> + qda_deinit_device(qdev);
>> + return ret;
>> + }
>> +
>> qda_info(qdev, "QDA RPMsg probe completed successfully for %s\n", qdev->dsp_name);
>> return 0;
>> }
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
2026-02-23 19:08 ` [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU Ekansh Gupta
2026-02-23 22:44 ` Dmitry Baryshkov
@ 2026-02-26 10:46 ` Krzysztof Kozlowski
1 sibling, 0 replies; 83+ messages in thread
From: Krzysztof Kozlowski @ 2026-02-26 10:46 UTC (permalink / raw)
To: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 23/02/2026 20:08, Ekansh Gupta wrote:
> Introduce a built-in compute context-bank (CB) bus used by the Qualcomm
> DSP accelerator (QDA) driver to represent DSP CB devices that require
> IOMMU configuration. This separates the CB bus from the QDA driver and
> allows QDA to remain a loadable module while the bus is always built-in.
>
> A new bool Kconfig symbol DRM_ACCEL_QDA_COMPUTE_BUS is added and is
> selected by the main DRM_ACCEL_QDA driver. The parent accel Makefile is
> updated to descend into the QDA directory for both built-in and module
> builds so that the CB bus is compiled into vmlinux while the driver
> remains modular.
>
> The CB bus is registered at postcore_initcall() time and is exposed to
> the IOMMU core through iommu_buses[] in the same way as the Tegra
> host1x context-bus. This enables later patches to create CB devices on
> this bus and obtain IOMMU domains for them.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> drivers/accel/Makefile | 1 +
> drivers/accel/qda/Kconfig | 5 +++++
> drivers/accel/qda/Makefile | 2 ++
> drivers/accel/qda/qda_compute_bus.c | 23 +++++++++++++++++++++++
> drivers/iommu/iommu.c | 4 ++++
> include/linux/qda_compute_bus.h | 22 ++++++++++++++++++++++
Do not combine independent work into one patch.
Also, your patch has clear patch warnings, so please review it BEFORE
you send.
Best regards,
Krzysztof
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 05/18] accel/qda: Create compute CB devices on QDA compute bus
2026-02-26 8:38 ` Ekansh Gupta
@ 2026-02-26 10:46 ` Dmitry Baryshkov
2026-03-02 8:10 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-02-26 10:46 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Thu, Feb 26, 2026 at 02:08:57PM +0530, Ekansh Gupta wrote:
>
>
> On 2/24/2026 4:19 AM, Dmitry Baryshkov wrote:
> > On Tue, Feb 24, 2026 at 12:38:59AM +0530, Ekansh Gupta wrote:
> >> Add support for creating compute context-bank (CB) devices under
> >> the QDA compute bus based on child nodes of the FastRPC RPMsg
> >> device tree node. Each DT child with compatible
> >> "qcom,fastrpc-compute-cb" is turned into a QDA-owned struct
> >> device on qda_cb_bus_type.
> >>
> >> A new qda_cb_dev structure and cb_devs list in qda_dev track these
> >> CB devices. qda_populate_child_devices() walks the DT children
> >> during QDA RPMsg probe, creates CB devices, configures their DMA
> >> and IOMMU settings using of_dma_configure(), and associates a SID
> >> from the "reg" property when present.
> >>
> >> On RPMsg remove, qda_unpopulate_child_devices() tears down all CB
> >> devices, removing them from their IOMMU groups if present and
> >> unregistering the devices. This prepares the ground for using CB
> >> devices as IOMMU endpoints for DSP compute workloads in later
> >> patches.
> > Are we loosing the nsessions support?
> Yes, it's not part of this series. I'll try bringing that as well.
> >
> >> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> >> ---
> >> drivers/accel/qda/Makefile | 1 +
> >> drivers/accel/qda/qda_cb.c | 150 ++++++++++++++++++++++++++++++++++++++++++
> >> drivers/accel/qda/qda_cb.h | 26 ++++++++
> >> drivers/accel/qda/qda_drv.h | 3 +
> >> drivers/accel/qda/qda_rpmsg.c | 40 +++++++++++
> >> 5 files changed, 220 insertions(+)
> >>
> >> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
> >> index 242684ef1af7..4aded20b6bc2 100644
> >> --- a/drivers/accel/qda/Makefile
> >> +++ b/drivers/accel/qda/Makefile
> >> @@ -8,5 +8,6 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
> >> qda-y := \
> >> qda_drv.o \
> >> qda_rpmsg.o \
> >> + qda_cb.o \
> >>
> >> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
> >> diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
> >> new file mode 100644
> >> index 000000000000..77a2d8cae076
> >> --- /dev/null
> >> +++ b/drivers/accel/qda/qda_cb.c
> >> @@ -0,0 +1,150 @@
> >> +// SPDX-License-Identifier: GPL-2.0-only
> >> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
> >> +#include <linux/dma-mapping.h>
> >> +#include <linux/device.h>
> >> +#include <linux/of.h>
> >> +#include <linux/of_device.h>
> >> +#include <linux/iommu.h>
> >> +#include <linux/slab.h>
> >> +#include "qda_drv.h"
> >> +#include "qda_cb.h"
> >> +
> >> +static void qda_cb_dev_release(struct device *dev)
> >> +{
> >> + kfree(dev);
> > Do you need to put the reference on the OF node?
> Reference put is happening as part of qda_destroy_cb_device.
This way: you have a (small) window where of_node is already put (and
might be gone), but the pointer is not NULL. The of_node should be put
only when device is no longer accessible from the rest of the system, in
release function.
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 16/18] accel/qda: Add FastRPC-based DSP memory mapping support
2026-02-23 19:09 ` [PATCH RFC 16/18] accel/qda: Add FastRPC-based DSP memory mapping support Ekansh Gupta
@ 2026-02-26 10:48 ` Krzysztof Kozlowski
2026-03-02 9:12 ` Ekansh Gupta
0 siblings, 1 reply; 83+ messages in thread
From: Krzysztof Kozlowski @ 2026-02-26 10:48 UTC (permalink / raw)
To: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 23/02/2026 20:09, Ekansh Gupta wrote:
> Add a DRM_QDA_MAP ioctl and supporting FastRPC plumbing to map GEM
> backed buffers into the DSP virtual address space. The new
> qda_mem_map UAPI structure allows userspace to request legacy MMAP
> style mappings or handle-based MEM_MAP mappings with attributes, and
> encodes flags, offsets and optional virtual address hints that are
> forwarded to the DSP.
>
> On the FastRPC side new method identifiers FASTRPC_RMID_INIT_MMAP
> and FASTRPC_RMID_INIT_MEM_MAP are introduced together with message
> structures for map requests and responses. The fastrpc_prepare_args
> path is extended to build the appropriate request headers, serialize
> the physical page information derived from a GEM object into a
> fastrpc_phy_page array and pack the arguments into the shared message
> buffer used by the existing invoke infrastructure.
>
> The qda_ioctl_mmap() handler dispatches mapping requests based on the
> qda_mem_map request type, reusing the generic fastrpc_invoke()
> machinery and the RPMsg transport to communicate with the DSP. This
> provides the foundation for explicit buffer mapping into the DSP
> address space for subsequent FastRPC calls, aligned with the
> traditional FastRPC user space model.
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> arch/arm64/configs/defconfig | 2 +
Not relevan there. Don't stuff other subsystem code into your patches.
Especially without any reasons (your commit msg must explain WHY you are
doing things).
> drivers/accel/qda/qda_drv.c | 1 +
> drivers/accel/qda/qda_fastrpc.c | 217 ++++++++++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_fastrpc.h | 64 ++++++++++++
> drivers/accel/qda/qda_ioctl.c | 24 +++++
> drivers/accel/qda/qda_ioctl.h | 13 +++
> include/uapi/drm/qda_accel.h | 44 +++++++-
> 7 files changed, 364 insertions(+), 1 deletion(-)
>
Best regards,
Krzysztof
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 05/18] accel/qda: Create compute CB devices on QDA compute bus
2026-02-26 10:46 ` Dmitry Baryshkov
@ 2026-03-02 8:10 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 8:10 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/26/2026 4:16 PM, Dmitry Baryshkov wrote:
> On Thu, Feb 26, 2026 at 02:08:57PM +0530, Ekansh Gupta wrote:
>>
>> On 2/24/2026 4:19 AM, Dmitry Baryshkov wrote:
>>> On Tue, Feb 24, 2026 at 12:38:59AM +0530, Ekansh Gupta wrote:
>>>> Add support for creating compute context-bank (CB) devices under
>>>> the QDA compute bus based on child nodes of the FastRPC RPMsg
>>>> device tree node. Each DT child with compatible
>>>> "qcom,fastrpc-compute-cb" is turned into a QDA-owned struct
>>>> device on qda_cb_bus_type.
>>>>
>>>> A new qda_cb_dev structure and cb_devs list in qda_dev track these
>>>> CB devices. qda_populate_child_devices() walks the DT children
>>>> during QDA RPMsg probe, creates CB devices, configures their DMA
>>>> and IOMMU settings using of_dma_configure(), and associates a SID
>>>> from the "reg" property when present.
>>>>
>>>> On RPMsg remove, qda_unpopulate_child_devices() tears down all CB
>>>> devices, removing them from their IOMMU groups if present and
>>>> unregistering the devices. This prepares the ground for using CB
>>>> devices as IOMMU endpoints for DSP compute workloads in later
>>>> patches.
>>> Are we loosing the nsessions support?
>> Yes, it's not part of this series. I'll try bringing that as well.
>>>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>>>> ---
>>>> drivers/accel/qda/Makefile | 1 +
>>>> drivers/accel/qda/qda_cb.c | 150 ++++++++++++++++++++++++++++++++++++++++++
>>>> drivers/accel/qda/qda_cb.h | 26 ++++++++
>>>> drivers/accel/qda/qda_drv.h | 3 +
>>>> drivers/accel/qda/qda_rpmsg.c | 40 +++++++++++
>>>> 5 files changed, 220 insertions(+)
>>>>
>>>> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
>>>> index 242684ef1af7..4aded20b6bc2 100644
>>>> --- a/drivers/accel/qda/Makefile
>>>> +++ b/drivers/accel/qda/Makefile
>>>> @@ -8,5 +8,6 @@ obj-$(CONFIG_DRM_ACCEL_QDA) := qda.o
>>>> qda-y := \
>>>> qda_drv.o \
>>>> qda_rpmsg.o \
>>>> + qda_cb.o \
>>>>
>>>> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
>>>> diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
>>>> new file mode 100644
>>>> index 000000000000..77a2d8cae076
>>>> --- /dev/null
>>>> +++ b/drivers/accel/qda/qda_cb.c
>>>> @@ -0,0 +1,150 @@
>>>> +// SPDX-License-Identifier: GPL-2.0-only
>>>> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>>>> +#include <linux/dma-mapping.h>
>>>> +#include <linux/device.h>
>>>> +#include <linux/of.h>
>>>> +#include <linux/of_device.h>
>>>> +#include <linux/iommu.h>
>>>> +#include <linux/slab.h>
>>>> +#include "qda_drv.h"
>>>> +#include "qda_cb.h"
>>>> +
>>>> +static void qda_cb_dev_release(struct device *dev)
>>>> +{
>>>> + kfree(dev);
>>> Do you need to put the reference on the OF node?
>> Reference put is happening as part of qda_destroy_cb_device.
> This way: you have a (small) window where of_node is already put (and
> might be gone), but the pointer is not NULL. The of_node should be put
> only when device is no longer accessible from the rest of the system, in
> release function.
I'll move put to release function to avoid suggested scenario. Thanks.
>
>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
2026-02-25 19:09 ` Dmitry Baryshkov
@ 2026-03-02 8:12 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 8:12 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/26/2026 12:39 AM, Dmitry Baryshkov wrote:
> On Wed, Feb 25, 2026 at 11:26:52PM +0530, Ekansh Gupta wrote:
>>
>> On 2/24/2026 4:14 AM, Dmitry Baryshkov wrote:
>>> On Tue, Feb 24, 2026 at 12:38:58AM +0530, Ekansh Gupta wrote:
>>>> Introduce a built-in compute context-bank (CB) bus used by the Qualcomm
>>>> DSP accelerator (QDA) driver to represent DSP CB devices that require
>>>> IOMMU configuration. This separates the CB bus from the QDA driver and
>>>> allows QDA to remain a loadable module while the bus is always built-in.
>>> Why? What is the actual problem that you are trying to solve?
>> Bus needs to be built-in as it is being used by iommu driver. I'll add more details here.
> It's an implementation detail. Start your commit message with the
> description of the issue or a problem that you are solving.
Ack.
>
>>>> A new bool Kconfig symbol DRM_ACCEL_QDA_COMPUTE_BUS is added and is
>>> Don't describe the patch contents. Please.
>> Ack.
>>>> selected by the main DRM_ACCEL_QDA driver. The parent accel Makefile is
>>>> updated to descend into the QDA directory for both built-in and module
>>>> builds so that the CB bus is compiled into vmlinux while the driver
>>>> remains modular.
>>>>
>>>> The CB bus is registered at postcore_initcall() time and is exposed to
>>>> the IOMMU core through iommu_buses[] in the same way as the Tegra
>>>> host1x context-bus. This enables later patches to create CB devices on
>>>> this bus and obtain IOMMU domains for them.
>>> Note, there is nothing QDA-specific in this patch. Please explain, why
>>> the bus is QDA-specific? Can we generalize it?
>> I needed a custom bus here to use for the compute cb devices for iommu
>> configurations, I don't see any reason to keep it QDA-specific. The only requirement
>> is that this should be enabled built in whenever QDA is enabled.
> Why? FastRPC uses platform_bus. You need to explain, why it's not
> correct.
Ack.
>
>> But if I keep it generic, where should this be placed? Should it be accel(or drm?) specific?
> drivers/base? Or drivers/iommu? That would totally depend on the issue
> description. E.g. can we use the same code for host1x?
I'll evaluate and bring this change separately for fastrpc and host1x.
>
>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices
2026-02-23 22:50 ` Dmitry Baryshkov
@ 2026-03-02 8:15 ` Ekansh Gupta
2026-03-04 4:22 ` Dmitry Baryshkov
0 siblings, 1 reply; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 8:15 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 4:20 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:39:00AM +0530, Ekansh Gupta wrote:
>> Introduce a per-device memory manager for the QDA driver that tracks
>> IOMMU-capable compute context-bank (CB) devices. Each CB device is
>> represented by a qda_iommu_device and registered with a central
>> qda_memory_manager instance owned by qda_dev.
>>
>> The memory manager maintains an xarray of devices and assigns a
>> unique ID to each CB. It also provides basic lifetime management
> Sounds like IDR.
I was planning to stick with xarray accross QDA as IDR gives checkpatch warnings.
>
>> and a workqueue for deferred device removal. qda_cb_setup_device()
> What is deferred device removal? Why do you need it?
This is not needed, I was trying some experiment in my initial design(CB aggregation),
but it's not needed now, I'll remove this.
>
>> now allocates a qda_iommu_device for each CB and registers it with
>> the memory manager after DMA configuration succeeds.
>>
>> qda_init_device() is extended to allocate and initialize the memory
>> manager, while qda_deinit_device() will tear it down in later
>> patches. This prepares the QDA driver for fine-grained memory and
>> IOMMU domain management tied to individual CB devices.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/Makefile | 1 +
>> drivers/accel/qda/qda_cb.c | 32 +++++++
>> drivers/accel/qda/qda_drv.c | 46 ++++++++++
>> drivers/accel/qda/qda_drv.h | 3 +
>> drivers/accel/qda/qda_memory_manager.c | 152 +++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_memory_manager.h | 101 ++++++++++++++++++++++
>> 6 files changed, 335 insertions(+)
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices
2026-02-23 23:11 ` Bjorn Andersson
@ 2026-03-02 8:30 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 8:30 UTC (permalink / raw)
To: Bjorn Andersson
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On 2/24/2026 4:41 AM, Bjorn Andersson wrote:
> On Tue, Feb 24, 2026 at 12:39:00AM +0530, Ekansh Gupta wrote:
>> Introduce a per-device memory manager for the QDA driver that tracks
>> IOMMU-capable compute context-bank (CB) devices. Each CB device is
>> represented by a qda_iommu_device and registered with a central
>> qda_memory_manager instance owned by qda_dev.
>>
> The name makes me expect that this manages memory, but it seems to
> manage devices and context banks...
>
>> The memory manager maintains an xarray of devices and assigns a
>> unique ID to each CB. It also provides basic lifetime management
>> and a workqueue for deferred device removal. qda_cb_setup_device()
>> now allocates a qda_iommu_device for each CB and registers it with
>> the memory manager after DMA configuration succeeds.
>>
>> qda_init_device() is extended to allocate and initialize the memory
>> manager, while qda_deinit_device() will tear it down in later
>> patches.
> "in later patches" makes this extremely hard to review. I had to apply
> the series to try to navigate the code...
Thanks for highlighting. I'll update this.
>
>> This prepares the QDA driver for fine-grained memory and
>> IOMMU domain management tied to individual CB devices.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> [..]
>> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
>> diff --git a/drivers/accel/qda/qda_cb.c b/drivers/accel/qda/qda_cb.c
> [..]
>> @@ -46,6 +52,18 @@ static int qda_cb_setup_device(struct qda_dev *qdev, struct device *cb_dev)
>> rc = dma_set_mask(cb_dev, DMA_BIT_MASK(pa_bits));
>> if (rc) {
>> qda_err(qdev, "%d bit DMA enable failed: %d\n", pa_bits, rc);
>> + kfree(iommu_dev);
>> + return rc;
>> + }
>> +
>> + iommu_dev->dev = cb_dev;
>> + iommu_dev->sid = sid;
>> + snprintf(iommu_dev->name, sizeof(iommu_dev->name), "qda_iommu_dev_%u", sid);
> It's not easy to follow, when you have scattered the code across so many
> patches and so many files. But I don't think iommu_dev->name is ever
> used.
I'll remove this.
>
>> +
>> + rc = qda_memory_manager_register_device(qdev->iommu_mgr, iommu_dev);
>> + if (rc) {
>> + qda_err(qdev, "Failed to register IOMMU device: %d\n", rc);
>> + kfree(iommu_dev);
>> return rc;
>> }
>>
>> @@ -127,6 +145,8 @@ int qda_create_cb_device(struct qda_dev *qdev, struct device_node *cb_node)
>> void qda_destroy_cb_device(struct device *cb_dev)
>> {
>> struct iommu_group *group;
>> + struct qda_iommu_device *iommu_dev;
>> + struct qda_dev *qdev;
>>
>> if (!cb_dev) {
>> qda_dbg(NULL, "NULL CB device passed to destroy\n");
>> @@ -135,6 +155,18 @@ void qda_destroy_cb_device(struct device *cb_dev)
>>
>> qda_dbg(NULL, "Destroying CB device %s\n", dev_name(cb_dev));
>>
>> + iommu_dev = dev_get_drvdata(cb_dev);
> I'm not sure, but I think cb_dev is the struct device allocated in
> qda_create_cb_device(), but I can not find a place where you set drvdata
> for this device.
It should be updated with iommu_dev in qda_cb_setup_device. I believe I missed
adding this and it didn't give me any functional failure. Thanks for highlighting this,
I'll fix this in the next spin.
>
>> + if (iommu_dev) {
>> + if (cb_dev->parent) {
>> + qdev = dev_get_drvdata(cb_dev->parent);
>> + if (qdev && qdev->iommu_mgr) {
>> + qda_dbg(NULL, "Unregistering IOMMU device for %s\n",
>> + dev_name(cb_dev));
>> + qda_memory_manager_unregister_device(qdev->iommu_mgr, iommu_dev);
>> + }
>> + }
>> + }
>> +
>> group = iommu_group_get(cb_dev);
>> if (group) {
>> qda_dbg(NULL, "Removing %s from IOMMU group\n", dev_name(cb_dev));
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
> [..]
>> @@ -25,12 +37,46 @@ static void init_device_resources(struct qda_dev *qdev)
>> atomic_set(&qdev->removing, 0);
>> }
>>
>> +static int init_memory_manager(struct qda_dev *qdev)
>> +{
>> + int ret;
>> +
>> + qda_dbg(qdev, "Initializing IOMMU manager\n");
>> +
>> + qdev->iommu_mgr = kzalloc_obj(*qdev->iommu_mgr, GFP_KERNEL);
>> + if (!qdev->iommu_mgr)
>> + return -ENOMEM;
>> +
>> + ret = qda_memory_manager_init(qdev->iommu_mgr);
>> + if (ret) {
>> + qda_err(qdev, "Failed to initialize memory manager: %d\n", ret);
> qda_memory_manager_init() already logged 1 error and 1 debug prints if
> you get here.
ack.
>
>> + kfree(qdev->iommu_mgr);
>> + qdev->iommu_mgr = NULL;
> We're going to fail probe, you shouldn't have to clear this.
>
>> + return ret;
>> + }
>> +
>> + qda_dbg(qdev, "IOMMU manager initialized successfully\n");
>> + return 0;
>> +}
>> +
>> int qda_init_device(struct qda_dev *qdev)
>> {
>> + int ret;
>> +
>> init_device_resources(qdev);
>>
>> + ret = init_memory_manager(qdev);
>> + if (ret) {
>> + qda_err(qdev, "IOMMU manager initialization failed: %d\n", ret);
> And now we have 2 debug prints and two error prints in the log.
I'll clean the duplicate/unnecessary logs at at places
>
>> + goto err_cleanup_resources;
>> + }
>> +
>> qda_dbg(qdev, "QDA device initialized successfully\n");
> Or, if we get here, you have 8 debug prints.
>
> Please learn how to use kprobe/kretprobe instead of reimplementing it
> using printk().
ack.
>
>> return 0;
>> +
>> +err_cleanup_resources:
>> + cleanup_device_resources(qdev);
>> + return ret;
>> }
>>
>> static int __init qda_core_init(void)
>> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
>> index eb732b7d8091..2cb97e4eafbf 100644
>> --- a/drivers/accel/qda/qda_drv.h
>> +++ b/drivers/accel/qda/qda_drv.h
>> @@ -11,6 +11,7 @@
>> #include <linux/mutex.h>
>> #include <linux/rpmsg.h>
>> #include <linux/xarray.h>
>> +#include "qda_memory_manager.h"
>>
>> /* Driver identification */
>> #define DRIVER_NAME "qda"
>> @@ -23,6 +24,8 @@ struct qda_dev {
>> struct device *dev;
>> /* Mutex protecting device state */
>> struct mutex lock;
>> + /* IOMMU/memory manager */
>> + struct qda_memory_manager *iommu_mgr;
>> /* Flag indicating device removal in progress */
>> atomic_t removing;
>> /* Name of the DSP (e.g., "cdsp", "adsp") */
>> diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
> [..]
>> +int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
>> + struct qda_iommu_device *iommu_dev)
>> +{
>> + int ret;
>> + u32 id;
>> +
>> + if (!mem_mgr || !iommu_dev || !iommu_dev->dev) {
> How could this happen? You call this function from one place, that looks
> like this:
>
> iommu_dev->dev = cb_dev;
> iommu_dev->sid = sid;
> rc = qda_memory_manager_register_device(qdev->iommu_mgr, iommu_dev);
>
> You just allocated in filled out iommu_dev.
>
> Looking up the callstack, we're coming from qda_rpmsg_probe() which just
> did qda_init_device() which created the qsdev->iommu_mgr.
>
> In other words, these can't possibly be NULL.
I'll recheck this and remove redundant checks.
>
>> + qda_err(NULL, "Invalid parameters for device registration\n");
>> + return -EINVAL;
>> + }
>> +
>> + init_iommu_device_fields(iommu_dev, mem_mgr);
>> +
>> + ret = allocate_device_id(mem_mgr, iommu_dev, &id);
>> + if (ret) {
>> + qda_err(NULL, "Failed to allocate device ID: %d (sid=%u)\n", ret, iommu_dev->sid);
>> + return ret;
>> + }
>> +
>> + iommu_dev->id = id;
>> +
>> + qda_dbg(NULL, "Registered device id=%u (sid=%u)\n", id, iommu_dev->sid);
>> +
>> + return 0;
>> +}
>> +
>> +void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
>> + struct qda_iommu_device *iommu_dev)
>> +{
>> + if (!mem_mgr || !iommu_dev) {
> The one call to this function is wrapped in:
>
> if (iommu_dev) {
> if (qdev->iommu_mgr) {
> qda_dbg(NULL, ...);
> qda_memory_manager_unregister_device(qdev->iommu_mgr, iommu_dev);
> }
> }
>
>> + qda_err(NULL, "Attempted to unregister invalid device/manager\n");
>> + return;
>> + }
>> +
>> + qda_dbg(NULL, "Unregistering device id=%u (refcount=%u)\n", iommu_dev->id,
>> + refcount_read(&iommu_dev->refcount));
> And just before the call to qda_memory_manager_unregister_device() you
> print a debug log, saying you will call this function.
>
>> +
>> + if (refcount_read(&iommu_dev->refcount) == 0) {
>> + xa_erase(&mem_mgr->device_xa, iommu_dev->id);
>> + kfree(iommu_dev);
>> + return;
>> + }
>> +
>> + if (refcount_dec_and_test(&iommu_dev->refcount)) {
>> + qda_info(NULL, "Device id=%u refcount reached zero, queuing removal\n",
>> + iommu_dev->id);
>> + queue_work(mem_mgr->wq, &iommu_dev->remove_work);
>> + }
>> +}
>> +
> [..]
>> diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h
> [..]
>> +
>> +/**
> This says "kernel-doc"
>
>> + * struct qda_iommu_device - IOMMU device instance for memory management
>> + *
>> + * This structure represents a single IOMMU-enabled device managed by the
>> + * memory manager. Each device can be assigned to a specific process.
>> + */
>> +struct qda_iommu_device {
>> + /* Unique identifier for this IOMMU device */
> But this doesn't follow kernel-doc style.
>
> At the end of the series,
>
> ./scripts/kernel-doc -none -vv -Wall drivers/accel/qda/
>
> reports 270 warnings.
I'll resolve the warnings in next version.
>
>> + u32 id;
>> + /* Pointer to the underlying device */
>> + struct device *dev;
>> + /* Name for the device */
>> + char name[32];
>> + /* Spinlock protecting concurrent access to device */
>> + spinlock_t lock;
>> + /* Reference counter for device */
>> + refcount_t refcount;
>> + /* Work structure for deferred device removal */
>> + struct work_struct remove_work;
>> + /* Stream ID for IOMMU transactions */
>> + u32 sid;
>> + /* Pointer to parent memory manager */
>> + struct qda_memory_manager *manager;
>> +};
> Regards,
> Bjorn
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 07/18] accel/qda: Add DRM accel device registration for QDA driver
2026-02-23 22:16 ` Dmitry Baryshkov
@ 2026-03-02 8:33 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 8:33 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 3:46 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:39:01AM +0530, Ekansh Gupta wrote:
>> Add DRM accel integration for the QDA DSP accelerator driver. A new
>> qda_drm_priv structure is introduced to hold per-device DRM state,
>> including a pointer to the memory manager and the parent qda_dev
>> instance. The driver now allocates a drm_device, initializes
>> driver-private state, and registers the device via the DRM accel
>> infrastructure.
>>
>> qda_register_device() performs allocation and registration of the DRM
>> device, while qda_unregister_device() handles device teardown and
>> releases references using drm_dev_unregister() and drm_dev_put().
>> Initialization and teardown paths are updated so DRM resources are
>> allocated after IOMMU/memory-manager setup and cleaned during RPMsg
>> remove.
>>
>> This patch lays the foundation for adding GEM buffer support and IOCTL
>> handling in later patches as part of the compute accelerator interface.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/qda_drv.c | 103 ++++++++++++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_drv.h | 33 +++++++++++++-
>> drivers/accel/qda/qda_rpmsg.c | 8 ++++
>> 3 files changed, 142 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
>> index 69132737f964..a9113ec78fa2 100644
>> --- a/drivers/accel/qda/qda_drv.c
>> +++ b/drivers/accel/qda/qda_drv.c
>> @@ -4,9 +4,31 @@
>> #include <linux/kernel.h>
>> #include <linux/atomic.h>
>> #include <linux/slab.h>
>> +#include <drm/drm_accel.h>
>> +#include <drm/drm_drv.h>
>> +#include <drm/drm_file.h>
>> +#include <drm/drm_gem.h>
>> +#include <drm/drm_ioctl.h>
>> #include "qda_drv.h"
>> #include "qda_rpmsg.h"
>>
>> +DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
>> +
>> +static struct drm_driver qda_drm_driver = {
>> + .driver_features = DRIVER_COMPUTE_ACCEL,
>> + .fops = &qda_accel_fops,
> Strange indentation in the middle. Please drop it.
ack.
>
>> + .name = DRIVER_NAME,
>> + .desc = "Qualcomm DSP Accelerator Driver",
>> +};
>> +
>> +static void cleanup_drm_private(struct qda_dev *qdev)
>> +{
>> + if (qdev->drm_priv) {
>> + qda_dbg(qdev, "Cleaning up DRM private data\n");
>> + kfree(qdev->drm_priv);
>> + }
>> +}
>> +
>> static void cleanup_iommu_manager(struct qda_dev *qdev)
>> {
>> if (qdev->iommu_mgr) {
>> @@ -24,6 +46,7 @@ static void cleanup_device_resources(struct qda_dev *qdev)
>>
>> void qda_deinit_device(struct qda_dev *qdev)
>> {
>> + cleanup_drm_private(qdev);
>> cleanup_iommu_manager(qdev);
>> cleanup_device_resources(qdev);
>> }
>> @@ -59,6 +82,18 @@ static int init_memory_manager(struct qda_dev *qdev)
>> return 0;
>> }
>>
>> +static int init_drm_private(struct qda_dev *qdev)
>> +{
>> + qda_dbg(qdev, "Initializing DRM private data\n");
>> +
>> + qdev->drm_priv = kzalloc_obj(*qdev->drm_priv, GFP_KERNEL);
>> + if (!qdev->drm_priv)
>> + return -ENOMEM;
>> +
>> + qda_dbg(qdev, "DRM private data initialized successfully\n");
>> + return 0;
>> +}
>> +
>> int qda_init_device(struct qda_dev *qdev)
>> {
>> int ret;
>> @@ -71,14 +106,82 @@ int qda_init_device(struct qda_dev *qdev)
>> goto err_cleanup_resources;
>> }
>>
>> + ret = init_drm_private(qdev);
>> + if (ret) {
>> + qda_err(qdev, "DRM private data initialization failed: %d\n", ret);
>> + goto err_cleanup_iommu;
>> + }
>> +
>> qda_dbg(qdev, "QDA device initialized successfully\n");
>> return 0;
>>
>> +err_cleanup_iommu:
>> + cleanup_iommu_manager(qdev);
>> err_cleanup_resources:
>> cleanup_device_resources(qdev);
>> return ret;
>> }
>>
>> +static int setup_and_register_drm_device(struct qda_dev *qdev)
>> +{
>> + struct drm_device *ddev;
>> + int ret;
>> +
>> + qda_dbg(qdev, "Setting up and registering DRM device\n");
>> +
>> + ddev = drm_dev_alloc(&qda_drm_driver, qdev->dev);
> devm_drm_dev_alloc() please. Move this patch to the front of the series,
> making everything else depend on the allocated data structure.
ack.
>
>> + if (IS_ERR(ddev)) {
>> + ret = PTR_ERR(ddev);
>> + qda_err(qdev, "Failed to allocate DRM device: %d\n", ret);
>> + return ret;
>> + }
>> +
>> + qdev->drm_priv->drm_dev = ddev;
>> + qdev->drm_priv->iommu_mgr = qdev->iommu_mgr;
>> + qdev->drm_priv->qdev = qdev;
>> +
>> + ddev->dev_private = qdev->drm_priv;
>> + qdev->drm_dev = ddev;
>> +
>> + ret = drm_dev_register(ddev, 0);
>> + if (ret) {
>> + qda_err(qdev, "Failed to register DRM device: %d\n", ret);
>> + drm_dev_put(ddev);
>> + return ret;
>> + }
>> +
>> + qda_dbg(qdev, "DRM device registered successfully\n");
>> + return 0;
>> +}
>> +
>> +int qda_register_device(struct qda_dev *qdev)
>> +{
>> + int ret;
>> +
>> + ret = setup_and_register_drm_device(qdev);
>> + if (ret) {
>> + qda_err(qdev, "DRM device setup failed: %d\n", ret);
>> + return ret;
>> + }
>> +
>> + qda_dbg(qdev, "QDA device registered successfully\n");
>> + return 0;
>> +}
>> +
>> +void qda_unregister_device(struct qda_dev *qdev)
>> +{
>> + qda_info(qdev, "Unregistering QDA device\n");
>> +
>> + if (qdev->drm_dev) {
>> + qda_dbg(qdev, "Unregistering DRM device\n");
>> + drm_dev_unregister(qdev->drm_dev);
>> + drm_dev_put(qdev->drm_dev);
>> + qdev->drm_dev = NULL;
>> + }
>> +
>> + qda_dbg(qdev, "QDA device unregistered successfully\n");
>> +}
>> +
>> static int __init qda_core_init(void)
>> {
>> int ret;
>> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
>> index 2cb97e4eafbf..2b80401a3741 100644
>> --- a/drivers/accel/qda/qda_drv.h
>> +++ b/drivers/accel/qda/qda_drv.h
>> @@ -11,13 +11,35 @@
>> #include <linux/mutex.h>
>> #include <linux/rpmsg.h>
>> #include <linux/xarray.h>
>> +#include <drm/drm_drv.h>
>> +#include <drm/drm_file.h>
>> +#include <drm/drm_device.h>
>> +#include <drm/drm_accel.h>
>> #include "qda_memory_manager.h"
>>
>> /* Driver identification */
>> #define DRIVER_NAME "qda"
>>
>> +/**
>> + * struct qda_drm_priv - DRM device private data for QDA device
>> + *
>> + * This structure serves as the DRM device private data (stored in dev_private),
>> + * bridging the DRM device context with the QDA device and providing access to
>> + * shared resources like the memory manager during buffer operations.
>> + */
>> +struct qda_drm_priv {
> Shared between what and what? Why do you need a separate structure
> instead of using qda_dev?
This is for channel specific resources which will be used by all processes using the channel. It
should be possible to use qda_dev, I'll try it out and fix this in next version.
>
>> + /* DRM device structure */
>> + struct drm_device *drm_dev;
>> + /* Global memory/IOMMU manager */
>> + struct qda_memory_manager *iommu_mgr;
>> + /* Back-pointer to qda_dev */
>> + struct qda_dev *qdev;
>> +};
>> +
>> /* struct qda_dev - Main device structure for QDA driver */
>> struct qda_dev {
>> + /* DRM device for accelerator interface */
>> + struct drm_device *drm_dev;
> Drop the pointer here.
I'll modify this based on qda_drm_priv replacement.
>
>> /* RPMsg device for communication with remote processor */
>> struct rpmsg_device *rpdev;
>> /* Underlying device structure */
>> @@ -26,6 +48,8 @@ struct qda_dev {
>> struct mutex lock;
>> /* IOMMU/memory manager */
>> struct qda_memory_manager *iommu_mgr;
>> + /* DRM device private data */
>> + struct qda_drm_priv *drm_priv;
>> /* Flag indicating device removal in progress */
>> atomic_t removing;
>> /* Name of the DSP (e.g., "cdsp", "adsp") */
>> @@ -39,8 +63,8 @@ struct qda_dev {
>> * @qdev: QDA device structure
>> *
>> * Returns the most appropriate device structure for logging messages.
>> - * Prefers qdev->dev, or returns NULL if the device is being removed
>> - * or invalid.
>> + * Prefers qdev->dev, falls back to qdev->drm_dev->dev, or returns NULL
>> + * if the device is being removed or invalid.
>> */
>> static inline struct device *qda_get_log_device(struct qda_dev *qdev)
>> {
>> @@ -50,6 +74,9 @@ static inline struct device *qda_get_log_device(struct qda_dev *qdev)
>> if (qdev->dev)
>> return qdev->dev;
>>
>> + if (qdev->drm_dev)
>> + return qdev->drm_dev->dev;
>> +
>> return NULL;
>> }
>>
>> @@ -93,5 +120,7 @@ static inline struct device *qda_get_log_device(struct qda_dev *qdev)
>> */
>> int qda_init_device(struct qda_dev *qdev);
>> void qda_deinit_device(struct qda_dev *qdev);
>> +int qda_register_device(struct qda_dev *qdev);
>> +void qda_unregister_device(struct qda_dev *qdev);
>>
>> #endif /* __QDA_DRV_H__ */
>> diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
>> index 5a57384de6a2..b2b44b4d3ca8 100644
>> --- a/drivers/accel/qda/qda_rpmsg.c
>> +++ b/drivers/accel/qda/qda_rpmsg.c
>> @@ -80,6 +80,7 @@ static void qda_rpmsg_remove(struct rpmsg_device *rpdev)
>> qdev->rpdev = NULL;
>> mutex_unlock(&qdev->lock);
>>
>> + qda_unregister_device(qdev);
>> qda_unpopulate_child_devices(qdev);
>> qda_deinit_device(qdev);
>>
>> @@ -123,6 +124,13 @@ static int qda_rpmsg_probe(struct rpmsg_device *rpdev)
>> return ret;
>> }
>>
>> + ret = qda_register_device(qdev);
>> + if (ret) {
>> + qda_deinit_device(qdev);
>> + qda_unpopulate_child_devices(qdev);
>> + return ret;
>> + }
>> +
>> qda_info(qdev, "QDA RPMsg probe completed successfully for %s\n", qdev->dsp_name);
>> return 0;
>> }
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 08/18] accel/qda: Add per-file DRM context and open/close handling
2026-02-23 22:20 ` Dmitry Baryshkov
@ 2026-03-02 8:36 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 8:36 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 3:50 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:39:02AM +0530, Ekansh Gupta wrote:
>> Introduce per-file and per-user context for the QDA DRM accelerator
>> driver. A new qda_file_priv structure is stored in file->driver_priv
>> for each open file descriptor, and a qda_user object is allocated per
>> client with a unique client_id generated from an atomic counter in
>> qda_dev.
>>
>> The DRM driver now provides qda_open() and qda_postclose() callbacks.
>> qda_open() resolves the qda_dev from the drm_device, allocates the
>> qda_file_priv and qda_user structures, and attaches them to the DRM
>> file. qda_postclose() tears down the per-file context and frees the
>> qda_user object when the file is closed.
>>
>> This prepares the QDA driver to track per-process state for future
>> features such as per-client memory mappings, job submission contexts,
>> and access control over DSP compute resources.
> Start by describing the problem instead of stuffing it to the end. Can
> we use something better suited for this task, like IDR?
ack, same comment for IDR here also, sticking with xarray everywhere for QDA for
uniformity and to avoid checkpatch warnings.
>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/qda_drv.c | 117 ++++++++++++++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_drv.h | 30 ++++++++++++
>> 2 files changed, 147 insertions(+)
>>
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
>> index a9113ec78fa2..bf95fc782cf8 100644
>> --- a/drivers/accel/qda/qda_drv.c
>> +++ b/drivers/accel/qda/qda_drv.c
>> @@ -12,11 +12,127 @@
>> #include "qda_drv.h"
>> #include "qda_rpmsg.h"
>>
>> +static struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
>> +{
>> + if (!dev)
>> + return NULL;
>> +
>> + return (struct qda_drm_priv *)dev->dev_private;
>> +}
>> +
>> +static struct qda_dev *get_qdev_from_drm_device(struct drm_device *dev)
>> +{
>> + struct qda_drm_priv *drm_priv;
>> +
>> + if (!dev) {
>> + qda_dbg(NULL, "Invalid drm_device\n");
>> + return NULL;
>> + }
>> +
>> + drm_priv = get_drm_priv_from_device(dev);
>> + if (!drm_priv) {
>> + qda_dbg(NULL, "No drm_priv in dev_private\n");
>> + return NULL;
>> + }
>> +
>> + return drm_priv->qdev;
>> +}
>> +
>> +static struct qda_user *alloc_qda_user(struct qda_dev *qdev)
>> +{
>> + struct qda_user *qda_user;
>> +
>> + qda_user = kzalloc_obj(*qda_user, GFP_KERNEL);
>> + if (!qda_user)
>> + return NULL;
>> +
>> + qda_user->client_id = atomic_inc_return(&qdev->client_id_counter);
>> + qda_user->qda_dev = qdev;
>> +
>> + qda_dbg(qdev, "Allocated qda_user with client_id=%u\n", qda_user->client_id);
>> + return qda_user;
>> +}
>> +
>> +static void free_qda_user(struct qda_user *qda_user)
>> +{
>> + if (!qda_user)
>> + return;
>> +
>> + qda_dbg(qda_user->qda_dev, "Freeing qda_user client_id=%u\n", qda_user->client_id);
>> +
>> + kfree(qda_user);
>> +}
>> +
>> +static int qda_open(struct drm_device *dev, struct drm_file *file)
>> +{
>> + struct qda_user *qda_user;
>> + struct qda_file_priv *qda_file_priv;
>> + struct qda_dev *qdev;
>> +
>> + if (!file) {
>> + qda_dbg(NULL, "Invalid file pointer\n");
>> + return -EINVAL;
>> + }
>> +
>> + qdev = get_qdev_from_drm_device(dev);
>> + if (!qdev) {
>> + qda_dbg(NULL, "Failed to get qdev from drm_device\n");
>> + return -EINVAL;
>> + }
>> +
>> + qda_file_priv = kzalloc(sizeof(*qda_file_priv), GFP_KERNEL);
>> + if (!qda_file_priv)
>> + return -ENOMEM;
>> +
>> + qda_file_priv->pid = current->pid;
>> +
>> + qda_user = alloc_qda_user(qdev);
>> + if (!qda_user) {
>> + qda_dbg(qdev, "Failed to allocate qda_user\n");
>> + kfree(qda_file_priv);
>> + return -ENOMEM;
>> + }
>> +
>> + file->driver_priv = qda_file_priv;
>> + qda_file_priv->qda_user = qda_user;
>> +
>> + qda_dbg(qdev, "Device opened successfully for PID %d\n", current->pid);
>> +
>> + return 0;
>> +}
>> +
>> +static void qda_postclose(struct drm_device *dev, struct drm_file *file)
>> +{
>> + struct qda_dev *qdev;
>> + struct qda_file_priv *qda_file_priv;
>> + struct qda_user *qda_user;
>> +
>> + qdev = get_qdev_from_drm_device(dev);
>> + if (!qdev || atomic_read(&qdev->removing)) {
>> + qda_dbg(NULL, "Device unavailable or removing\n");
>> + return;
> Even if it is being removed, no need to free the memory?
Right, It should still be freed.
>
>> + }
>> +
>> + qda_file_priv = (struct qda_file_priv *)file->driver_priv;
>> + if (qda_file_priv) {
>> + qda_user = qda_file_priv->qda_user;
>> + if (qda_user)
>> + free_qda_user(qda_user);
>> +
>> + kfree(qda_file_priv);
>> + file->driver_priv = NULL;
>> + }
>> +
>> + qda_dbg(qdev, "Device closed for PID %d\n", current->pid);
>> +}
>> +
>> DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
>>
>> static struct drm_driver qda_drm_driver = {
>> .driver_features = DRIVER_COMPUTE_ACCEL,
>> .fops = &qda_accel_fops,
>> + .open = qda_open,
>> + .postclose = qda_postclose,
>> .name = DRIVER_NAME,
>> .desc = "Qualcomm DSP Accelerator Driver",
>> };
>> @@ -58,6 +174,7 @@ static void init_device_resources(struct qda_dev *qdev)
>>
>> mutex_init(&qdev->lock);
>> atomic_set(&qdev->removing, 0);
>> + atomic_set(&qdev->client_id_counter, 0);
>> }
>>
>> static int init_memory_manager(struct qda_dev *qdev)
>> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
>> index 2b80401a3741..e0ba37702a86 100644
>> --- a/drivers/accel/qda/qda_drv.h
>> +++ b/drivers/accel/qda/qda_drv.h
>> @@ -10,6 +10,7 @@
>> #include <linux/list.h>
>> #include <linux/mutex.h>
>> #include <linux/rpmsg.h>
>> +#include <linux/types.h>
>> #include <linux/xarray.h>
>> #include <drm/drm_drv.h>
>> #include <drm/drm_file.h>
>> @@ -20,6 +21,33 @@
>> /* Driver identification */
>> #define DRIVER_NAME "qda"
>>
>> +/**
>> + * struct qda_file_priv - Per-process private data for DRM file
>> + *
>> + * This structure tracks per-process state for each open file descriptor.
>> + * It maintains the IOMMU device assignment and links to the legacy qda_user
>> + * structure for compatibility with existing code.
>> + */
>> +struct qda_file_priv {
>> + /* Process ID for tracking */
>> + pid_t pid;
>> + /* Pointer to qda_user structure for backward compatibility */
>> + struct qda_user *qda_user;
>> +};
>> +
>> +/**
>> + * struct qda_user - Per-user context for remote processor interaction
>> + *
>> + * This structure maintains per-user state for interactions with the
>> + * remote processor, including memory mappings and pending operations.
>> + */
>> +struct qda_user {
>> + /* Unique client identifier */
>> + u32 client_id;
>> + /* Back-pointer to device structure */
>> + struct qda_dev *qda_dev;
>> +};
>> +
>> /**
>> * struct qda_drm_priv - DRM device private data for QDA device
>> *
>> @@ -52,6 +80,8 @@ struct qda_dev {
>> struct qda_drm_priv *drm_priv;
>> /* Flag indicating device removal in progress */
>> atomic_t removing;
>> + /* Atomic counter for generating unique client IDs */
>> + atomic_t client_id_counter;
>> /* Name of the DSP (e.g., "cdsp", "adsp") */
>> char dsp_name[16];
>> /* Compute context-bank (CB) child devices */
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 09/18] accel/qda: Add QUERY IOCTL and basic QDA UAPI header
2026-02-23 22:24 ` Dmitry Baryshkov
@ 2026-03-02 8:41 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 8:41 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 3:54 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:39:03AM +0530, Ekansh Gupta wrote:
>> Introduce a basic UAPI for the QDA accelerator driver along with a
>> DRM IOCTL handler to query DSP device identity. A new UAPI header
>> include/uapi/drm/qda_accel.h defines DRM_QDA_QUERY, the corresponding
>> DRM_IOCTL_QDA_QUERY command, and struct drm_qda_query, which contains
>> a DSP name string.
>>
>> On the kernel side, qda_ioctl_query() validates the per-file context,
>> resolves the qda_dev instance from dev->dev_private, and copies the
>> DSP name from qdev->dsp_name into the query structure. The new
>> qda_ioctls[] table wires this IOCTL into the QDA DRM driver so
>> userspace can call it through the standard DRM command interface.
>>
>> This IOCTL provides a simple and stable way for userspace to discover
>> which DSP a given QDA device node represents and serves as the first
>> building block for a richer QDA UAPI in subsequent patches.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/Makefile | 1 +
>> drivers/accel/qda/qda_drv.c | 9 +++++++++
>> drivers/accel/qda/qda_ioctl.c | 45 +++++++++++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_ioctl.h | 26 ++++++++++++++++++++++++
>> include/uapi/drm/qda_accel.h | 47 +++++++++++++++++++++++++++++++++++++++++++
>> 5 files changed, 128 insertions(+)
>>
>> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
>> index 7e96ddc40a24..f547398e1a72 100644
>> --- a/drivers/accel/qda/Makefile
>> +++ b/drivers/accel/qda/Makefile
>> @@ -10,5 +10,6 @@ qda-y := \
>> qda_rpmsg.o \
>> qda_cb.o \
>> qda_memory_manager.o \
>> + qda_ioctl.o \
> Keep the list sorted, please.
ack
>
>>
>> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
>> index bf95fc782cf8..86758a9cd982 100644
>> --- a/drivers/accel/qda/qda_drv.c
>> +++ b/drivers/accel/qda/qda_drv.c
>> @@ -9,7 +9,10 @@
>> #include <drm/drm_file.h>
>> #include <drm/drm_gem.h>
>> #include <drm/drm_ioctl.h>
>> +#include <drm/qda_accel.h>
>> +
>> #include "qda_drv.h"
>> +#include "qda_ioctl.h"
>> #include "qda_rpmsg.h"
>>
>> static struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
>> @@ -128,11 +131,17 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
>>
>> DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
>>
>> +static const struct drm_ioctl_desc qda_ioctls[] = {
>> + DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
>> +};
>> +
>> static struct drm_driver qda_drm_driver = {
>> .driver_features = DRIVER_COMPUTE_ACCEL,
>> .fops = &qda_accel_fops,
>> .open = qda_open,
>> .postclose = qda_postclose,
>> + .ioctls = qda_ioctls,
> Please select one style. Either you indent all assignments or you don't.
ack
>
>> + .num_ioctls = ARRAY_SIZE(qda_ioctls),
>> .name = DRIVER_NAME,
>> .desc = "Qualcomm DSP Accelerator Driver",
>> };
>> diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
>> new file mode 100644
>> index 000000000000..9fa73ec2dfce
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_ioctl.c
>> @@ -0,0 +1,45 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> +#include <drm/drm_ioctl.h>
>> +#include <drm/drm_gem.h>
>> +#include <drm/qda_accel.h>
>> +#include "qda_drv.h"
>> +#include "qda_ioctl.h"
>> +
>> +static int qda_validate_and_get_context(struct drm_device *dev, struct drm_file *file_priv,
>> + struct qda_dev **qdev, struct qda_user **qda_user)
>> +{
>> + struct qda_drm_priv *drm_priv = dev->dev_private;
>> + struct qda_file_priv *qda_file_priv;
>> +
>> + if (!drm_priv)
>> + return -EINVAL;
>> +
>> + *qdev = drm_priv->qdev;
>> + if (!*qdev)
>> + return -EINVAL;
> Can this actually happen or is it (un)wishful thinking?
>
>> +
>> + qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv;
>> + if (!qda_file_priv || !qda_file_priv->qda_user)
>> + return -EINVAL;
> What are you protecting against?
The intention for all these checks are to ensure channel is properly initialized before any
request is queued for any specific channel, I'll update the checks based on the current
initialization ordering.
>
>> +
>> + *qda_user = qda_file_priv->qda_user;
>> +
>> + return 0;
>> +}
>> +
>> +int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv)
>> +{
>> + struct qda_dev *qdev;
>> + struct qda_user *qda_user;
>> + struct drm_qda_query *args = data;
>> + int ret;
>> +
>> + ret = qda_validate_and_get_context(dev, file_priv, &qdev, &qda_user);
>> + if (ret)
>> + return ret;
>> +
>> + strscpy(args->dsp_name, qdev->dsp_name, sizeof(args->dsp_name));
>> +
>> + return 0;
>> +}
>> diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
>> new file mode 100644
>> index 000000000000..6bf3bcd28c0e
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_ioctl.h
>> @@ -0,0 +1,26 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> + */
>> +
>> +#ifndef _QDA_IOCTL_H
>> +#define _QDA_IOCTL_H
>> +
>> +#include <linux/types.h>
>> +#include <linux/kernel.h>
>> +#include <drm/drm_ioctl.h>
>> +#include "qda_drv.h"
>> +
>> +/**
>> + * qda_ioctl_query - Query DSP device information and capabilities
>> + * @dev: DRM device structure
>> + * @data: User-space data containing query parameters and results
>> + * @file_priv: DRM file private data
>> + *
>> + * This IOCTL handler queries information about the DSP device.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
>> +
>> +#endif /* _QDA_IOCTL_H */
>> diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
>> new file mode 100644
>> index 000000000000..0aad791c4832
>> --- /dev/null
>> +++ b/include/uapi/drm/qda_accel.h
>> @@ -0,0 +1,47 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
>> +/*
>> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> + */
>> +
>> +#ifndef __QDA_ACCEL_H__
>> +#define __QDA_ACCEL_H__
>> +
>> +#include "drm.h"
>> +
>> +#if defined(__cplusplus)
>> +extern "C" {
>> +#endif
>> +
>> +/*
>> + * QDA IOCTL command numbers
>> + *
>> + * These define the command numbers for QDA-specific IOCTLs.
>> + * They are used with DRM_COMMAND_BASE to create the full IOCTL numbers.
>> + */
>> +#define DRM_QDA_QUERY 0x00
>> +/*
>> + * QDA IOCTL definitions
>> + *
>> + * These macros define the actual IOCTL numbers used by userspace applications.
>> + * They combine the command numbers with DRM_COMMAND_BASE and specify the
>> + * data structure and direction (read/write) for each IOCTL.
>> + */
>> +#define DRM_IOCTL_QDA_QUERY DRM_IOR(DRM_COMMAND_BASE + DRM_QDA_QUERY, struct drm_qda_query)
>> +
>> +/**
>> + * struct drm_qda_query - Device information query structure
>> + * @dsp_name: Name of DSP (e.g., "adsp", "cdsp", "cdsp1", "gdsp0", "gdsp1")
>> + *
>> + * This structure is used with DRM_IOCTL_QDA_QUERY to query device type,
>> + * allowing userspace to identify which DSP a device node represents. The
>> + * kernel provides the DSP name directly as a null-terminated string.
>> + */
>> +struct drm_qda_query {
>> + __u8 dsp_name[16];
>> +};
>> +
>> +#if defined(__cplusplus)
>> +}
>> +#endif
>> +
>> +#endif /* __QDA_ACCEL_H__ */
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 18/18] MAINTAINERS: Add MAINTAINERS entry for QDA driver
2026-02-23 22:40 ` Dmitry Baryshkov
@ 2026-03-02 8:41 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 8:41 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 4:10 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:39:12AM +0530, Ekansh Gupta wrote:
>> Add a new MAINTAINERS entry for the Qualcomm DSP Accelerator (QDA)
>> driver. The entry lists the primary maintainer, the linux-arm-msm and
>> dri-devel mailing lists, and covers all source files under
>> drivers/accel/qda, Documentation/accel/qda and the UAPI header
>> include/uapi/drm/qda_accel.h.
>>
>> This ensures that patches to the QDA driver and its public API are
>> tracked and routed to the appropriate reviewers as the driver is
>> integrated into the DRM accel subsystem.
> Please add it in the first patch.
ack
>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> MAINTAINERS | 9 +++++++++
>> 1 file changed, 9 insertions(+)
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 71f76fddebbf..78b8b82a6370 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -21691,6 +21691,15 @@ S: Maintained
>> F: Documentation/devicetree/bindings/crypto/qcom-qce.yaml
>> F: drivers/crypto/qce/
>>
>> +QUALCOMM DSP ACCELERATOR (QDA) DRIVER
>> +M: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> +L: linux-arm-msm@vger.kernel.org
>> +L: dri-devel@lists.freedesktop.org
>> +S: Supported
>> +F: Documentation/accel/qda/
>> +F: drivers/accel/qda/
>> +F: include/uapi/drm/qda_accel.h
>> +
>> QUALCOMM EMAC GIGABIT ETHERNET DRIVER
>> M: Timur Tabi <timur@kernel.org>
>> L: netdev@vger.kernel.org
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver
2026-02-24 3:39 ` Trilok Soni
@ 2026-03-02 8:43 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 8:43 UTC (permalink / raw)
To: Trilok Soni, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 9:09 AM, Trilok Soni wrote:
> On 2/23/2026 11:08 AM, Ekansh Gupta wrote:
>> * Userspace Interface: While the driver provides a new DRM-based UAPI,
>> the underlying FastRPC protocol and DSP firmware interface remain
>> compatible. This ensures that DSP firmware and libraries continue to
>> work without modification.
>
> This is not very clear and it is not explained properly in the 1st patch
> where you document this driver. It doesn't talk about how older
> UAPI based application will still work without any change
> or recompilation. I prefer the same old binary to work w/ the new
> DRM based interface without any changes (I don't know how that will be possible)
> OR if recompilation + linking is needed then you need to provide the wrapper library.
I'll add more details for this based on the discussion for compat driver.
>
> ---Trilok Soni
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver
2026-02-23 22:03 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Bjorn Andersson
@ 2026-03-02 8:54 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 8:54 UTC (permalink / raw)
To: Bjorn Andersson
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On 2/24/2026 3:33 AM, Bjorn Andersson wrote:
> On Tue, Feb 24, 2026 at 12:38:54AM +0530, Ekansh Gupta wrote:
>> This patch series introduces the Qualcomm DSP Accelerator (QDA) driver,
>> a modern DRM-based accelerator implementation for Qualcomm Hexagon DSPs.
>> The driver provides a standardized interface for offloading computational
>> tasks to DSPs found on Qualcomm SoCs, supporting all DSP domains (ADSP,
>> CDSP, SDSP, GDSP).
>>
>> The QDA driver is designed as an alternative for the FastRPC driver
>> in drivers/misc/, offering improved resource management, better integration
>> with standard kernel subsystems, and alignment with the Linux kernel's
>> Compute Accelerators framework.
>>
> If I understand correctly, this is just the same FastRPC protocol but
> in the accel framework, and hence with a new userspace ABI?
>
> I don't fancy the name "QDA" as an acronym for "FastRPC Accel".
>
> I would much prefer to see this living in drivers/accel/fastrpc and be
> named some variation of "fastrpc" (e.g. fastrpc_accel). (Driver name can
> be "fastrpc" as the other one apparently is named "qcom,fastrpc").
Planning to stick with QDA as per the future plans where the driver might use some
other mechanism than fastrpc(signalling).
>
>> User-space staging branch
>> ============
>> https://github.com/qualcomm/fastrpc/tree/accel/staging
>>
>> Key Features
>> ============
>>
>> * Standard DRM accelerator interface via /dev/accel/accelN
>> * GEM-based buffer management with DMA-BUF import/export support
>> * IOMMU-based memory isolation using per-process context banks
>> * FastRPC protocol implementation for DSP communication
>> * RPMsg transport layer for reliable message passing
>> * Support for all DSP domains (ADSP, CDSP, SDSP, GDSP)
>> * Comprehensive IOCTL interface for DSP operations
>>
>> High-Level Architecture Differences with Existing FastRPC Driver
>> =================================================================
>>
>> The QDA driver represents a significant architectural departure from the
>> existing FastRPC driver (drivers/misc/fastrpc.c), addressing several key
>> limitations while maintaining protocol compatibility:
>>
>> 1. DRM Accelerator Framework Integration
>> - FastRPC: Custom character device (/dev/fastrpc-*)
>> - QDA: Standard DRM accel device (/dev/accel/accelN)
>> - Benefit: Leverages established DRM infrastructure for device
>> management.
>>
>> 2. Memory Management
>> - FastRPC: Custom memory allocator with ION/DMA-BUF integration
>> - QDA: Native GEM objects with full PRIME support
>> - Benefit: Seamless buffer sharing using standard DRM mechanisms
>>
>> 3. IOMMU Context Bank Management
>> - FastRPC: Direct IOMMU domain manipulation, limited isolation
>> - QDA: Custom compute bus (qda_cb_bus_type) with proper device model
>> - Benefit: Each CB device is a proper struct device with IOMMU group
>> support, enabling better isolation and resource tracking.
>> - https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualcomm.com/
>>
>> 4. Memory Manager Architecture
>> - FastRPC: Monolithic allocator
>> - QDA: Pluggable memory manager with backend abstraction
>> - Benefit: Currently uses DMA-coherent backend, easily extensible for
>> future memory types (e.g., carveout, CMA)
>>
>> 5. Transport Layer
>> - FastRPC: Direct RPMsg integration in core driver
>> - QDA: Abstracted transport layer (qda_rpmsg.c)
>> - Benefit: Clean separation of concerns, easier to add alternative
>> transports if needed
>>
>> 8. Code Organization
>> - FastRPC: ~3000 lines in single file
>> - QDA: Modular design across multiple files (~4600 lines total)
> "Now 50% more LOC and you need 6 tabs open in your IDE!"
>
> Might be better, but in itself it provides no immediate value.
I added this as a point because I think separating/abstracting sensible parts to different files
might improve readability and maintainability. But if that doesn't make sense, then I can
remove this point.
https://lore.kernel.org/all/c007308b-4641-44a5-9e64-fb085cced2b0@linaro.org/
>
>> * qda_drv.c: Core driver and DRM integration
>> * qda_gem.c: GEM object management
>> * qda_memory_manager.c: Memory and IOMMU management
>> * qda_fastrpc.c: FastRPC protocol implementation
>> * qda_rpmsg.c: Transport layer
>> * qda_cb.c: Context bank device management
>> - Benefit: Better maintainability, clearer separation of concerns
>>
>> 9. UAPI Design
>> - FastRPC: Custom IOCTL interface
>> - QDA: DRM-style IOCTLs with proper versioning support
>> - Benefit: Follows DRM conventions, easier userspace integration
>>
>> 10. Documentation
>> - FastRPC: Minimal in-tree documentation
>> - QDA: Comprehensive documentation in Documentation/accel/qda/
>> - Benefit: Better developer experience, clearer API contracts
>>
>> 11. Buffer Reference Mechanism
>> - FastRPC: Uses buffer file descriptors (FDs) for all book-keeping
>> in both kernel and DSP
>> - QDA: Uses GEM handles for kernel-side management, providing better
>> integration with DRM subsystem
>> - Benefit: Leverages DRM GEM infrastructure for reference counting,
>> lifetime management, and integration with other DRM components
>>
> This is all good, but what is the plan regarding /dev/fastrpc-*?
>
> The idea here clearly is to provide an alternative implementation, and
> they seem to bind to the same toplevel compatible - so you can only
> compile one into your kernel at any point in time.
>
> So if I understand correctly, at some point in time we need to say
> CONFIG_DRM_ACCEL_QDA=m and CONFIG_QCOM_FASTRPC=n, which will break all
> existing user space applications? That's not acceptable.
>
>
> Would it be possible to have a final driver that is implemented as a
> accel, but provides wrappers for the legacy misc and ioctl interface to
> the applications?
As per the discussions on other thread, I believe compat driver would be the way to
go for this. When I send the actual driver changes, I can include compat driver as well
to the patches.
I'm assuming a compat driver will live in the same QDA directory and will translate misc/fastrpc
calls to accel/qda calls if QDA is enabled.
>
> Regards,
> Bjorn
>
>> Key Technical Improvements
>> ===========================
>>
>> * Proper device model: CB devices are real struct device instances on a
>> custom bus, enabling proper IOMMU group management and power management
>> integration
>>
>> * Reference-counted IOMMU devices: Multiple file descriptors from the same
>> process share a single IOMMU device, reducing overhead
>>
>> * GEM-based buffer lifecycle: Automatic cleanup via DRM GEM reference
>> counting, eliminating many resource leak scenarios
>>
>> * Modular memory backends: The memory manager supports pluggable backends,
>> currently implementing DMA-coherent allocations with SID-prefixed
>> addresses for DSP firmware
>>
>> * Context-based invocation tracking: XArray-based context management with
>> proper synchronization and cleanup
>>
>> Patch Series Organization
>> ==========================
>>
>> Patches 1-2: Driver skeleton and documentation
>> Patches 3-6: RPMsg transport and IOMMU/CB infrastructure
>> Patches 7-9: DRM device registration and basic IOCTL
>> Patches 10-12: GEM buffer management and PRIME support
>> Patches 13-17: FastRPC protocol implementation (attach, invoke, create,
>> map/unmap)
>> Patch 18: MAINTAINERS entry
>>
>> Open Items
>> ===========
>>
>> The following items are identified as open items:
>>
>> 1. Privilege Level Management
>> - Currently, daemon processes and user processes have the same access
>> level as both use the same accel device node. This needs to be
>> addressed as daemons attach to privileged DSP PDs and require
>> higher privilege levels for system-level operations
>> - Seeking guidance on the best approach: separate device nodes,
>> capability-based checks, or DRM master/authentication mechanisms
>>
>> 2. UAPI Compatibility Layer
>> - Add UAPI compat layer to facilitate migration of client applications
>> from existing FastRPC UAPI to the new QDA accel driver UAPI,
>> ensuring smooth transition for existing userspace code
>> - Seeking guidance on implementation approach: in-kernel translation
>> layer, userspace wrapper library, or hybrid solution
>>
>> 3. Documentation Improvements
>> - Add detailed IOCTL usage examples
>> - Document DSP firmware interface requirements
>> - Create migration guide from existing FastRPC
>>
>> 4. Per-Domain Memory Allocation
>> - Develop new userspace API to support memory allocation on a per
>> domain basis, enabling domain-specific memory management and
>> optimization
>>
>> 5. Audio and Sensors PD Support
>> - The current patch series does not handle Audio PD and Sensors PD
>> functionalities. These specialized protection domains require
>> additional support for real-time constraints and power management
>>
>> Interface Compatibility
>> ========================
>>
>> The QDA driver maintains compatibility with existing FastRPC infrastructure:
>>
>> * Device Tree Bindings: The driver uses the same device tree bindings as
>> the existing FastRPC driver, ensuring no changes are required to device
>> tree sources. The "qcom,fastrpc" compatible string and child node
>> structure remain unchanged.
>>
>> * Userspace Interface: While the driver provides a new DRM-based UAPI,
>> the underlying FastRPC protocol and DSP firmware interface remain
>> compatible. This ensures that DSP firmware and libraries continue to
>> work without modification.
>>
>> * Migration Path: The modular design allows for gradual migration, where
>> both drivers can coexist during the transition period. Applications can
>> be migrated incrementally to the new UAPI with the help of the planned
>> compatibility layer.
>>
>> References
>> ==========
>>
>> Previous discussions on this migration:
>> - https://lkml.org/lkml/2024/6/24/479
>> - https://lkml.org/lkml/2024/6/21/1252
>>
>> Testing
>> =======
>>
>> The driver has been tested on Qualcomm platforms with:
>> - Basic FastRPC attach/release operations
>> - DSP process creation and initialization
>> - Memory mapping/unmapping operations
>> - Dynamic invocation with various buffer types
>> - GEM buffer allocation and mmap
>> - PRIME buffer import from other subsystems
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> Ekansh Gupta (18):
>> accel/qda: Add Qualcomm QDA DSP accelerator driver docs
>> accel/qda: Add Qualcomm DSP accelerator driver skeleton
>> accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
>> accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
>> accel/qda: Create compute CB devices on QDA compute bus
>> accel/qda: Add memory manager for CB devices
>> accel/qda: Add DRM accel device registration for QDA driver
>> accel/qda: Add per-file DRM context and open/close handling
>> accel/qda: Add QUERY IOCTL and basic QDA UAPI header
>> accel/qda: Add DMA-backed GEM objects and memory manager integration
>> accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
>> accel/qda: Add PRIME dma-buf import support
>> accel/qda: Add initial FastRPC attach and release support
>> accel/qda: Add FastRPC dynamic invocation support
>> accel/qda: Add FastRPC DSP process creation support
>> accel/qda: Add FastRPC-based DSP memory mapping support
>> accel/qda: Add FastRPC-based DSP memory unmapping support
>> MAINTAINERS: Add MAINTAINERS entry for QDA driver
>>
>> Documentation/accel/index.rst | 1 +
>> Documentation/accel/qda/index.rst | 14 +
>> Documentation/accel/qda/qda.rst | 129 ++++
>> MAINTAINERS | 9 +
>> arch/arm64/configs/defconfig | 2 +
>> drivers/accel/Kconfig | 1 +
>> drivers/accel/Makefile | 2 +
>> drivers/accel/qda/Kconfig | 35 ++
>> drivers/accel/qda/Makefile | 19 +
>> drivers/accel/qda/qda_cb.c | 182 ++++++
>> drivers/accel/qda/qda_cb.h | 26 +
>> drivers/accel/qda/qda_compute_bus.c | 23 +
>> drivers/accel/qda/qda_drv.c | 375 ++++++++++++
>> drivers/accel/qda/qda_drv.h | 171 ++++++
>> drivers/accel/qda/qda_fastrpc.c | 1002 ++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_fastrpc.h | 433 ++++++++++++++
>> drivers/accel/qda/qda_gem.c | 211 +++++++
>> drivers/accel/qda/qda_gem.h | 103 ++++
>> drivers/accel/qda/qda_ioctl.c | 271 +++++++++
>> drivers/accel/qda/qda_ioctl.h | 118 ++++
>> drivers/accel/qda/qda_memory_dma.c | 91 +++
>> drivers/accel/qda/qda_memory_dma.h | 46 ++
>> drivers/accel/qda/qda_memory_manager.c | 382 ++++++++++++
>> drivers/accel/qda/qda_memory_manager.h | 148 +++++
>> drivers/accel/qda/qda_prime.c | 194 +++++++
>> drivers/accel/qda/qda_prime.h | 43 ++
>> drivers/accel/qda/qda_rpmsg.c | 327 +++++++++++
>> drivers/accel/qda/qda_rpmsg.h | 57 ++
>> drivers/iommu/iommu.c | 4 +
>> include/linux/qda_compute_bus.h | 22 +
>> include/uapi/drm/qda_accel.h | 224 +++++++
>> 31 files changed, 4665 insertions(+)
>> ---
>> base-commit: d4906ae14a5f136ceb671bb14cedbf13fa560da6
>> change-id: 20260223-qda-firstpost-4ab05249e2cc
>>
>> Best regards,
>> --
>> Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>>
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 10/18] accel/qda: Add DMA-backed GEM objects and memory manager integration
2026-02-23 22:36 ` Dmitry Baryshkov
@ 2026-03-02 9:06 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 9:06 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 4:06 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:39:04AM +0530, Ekansh Gupta wrote:
>> Introduce DMA-backed GEM buffer objects for the QDA accelerator
>> driver and integrate them with the existing memory manager and IOMMU
>> device abstraction.
>>
>> A new qda_gem_obj structure wraps drm_gem_object and tracks the
>> kernel virtual address, DMA address, size and owning qda_iommu_device.
>> qda_gem_create_object() allocates a GEM object, aligns the requested
>> size, and uses qda_memory_manager_alloc() to obtain DMA-coherent
>> memory from a per-process IOMMU device. The GEM object implements
>> a .mmap callback that validates the VMA offset and calls into
>> qda_dma_mmap(), which maps the DMA memory into userspace and sets
>> appropriate VMA flags.
>>
>> The DMA backend is implemented in qda_memory_dma.c, which allocates
>> and frees coherent memory via dma_alloc_coherent() and
>> dma_free_coherent(), while storing a SID-prefixed DMA address in
>> the GEM object for later use by DSP firmware. The memory manager
>> is extended to maintain a mapping from processes to IOMMU devices
>> using qda_file_priv and a process_assignment_lock, and provides
>> qda_memory_manager_alloc() and qda_memory_manager_free() helpers
>> for GEM allocations.
> Why are you not using drm_gem_dma_helper?
These helpers are for the underlying memory allocation using IOMMU devices. I'm not
sure if drm_gem_dma_helper would work here.
>
>> This patch lays the groundwork for GEM allocation and mmap IOCTLs
>> as well as future PRIME and job submission support for QDA buffers.
> Documentation/process/submitting-patches.rst, "This patch"
ack
>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/Makefile | 2 +
>> drivers/accel/qda/qda_drv.c | 23 +++-
>> drivers/accel/qda/qda_drv.h | 7 ++
>> drivers/accel/qda/qda_gem.c | 187 +++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_gem.h | 63 +++++++++++
>> drivers/accel/qda/qda_memory_dma.c | 91 ++++++++++++++++
>> drivers/accel/qda/qda_memory_dma.h | 46 ++++++++
>> drivers/accel/qda/qda_memory_manager.c | 194 +++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_memory_manager.h | 33 ++++++
>> 9 files changed, 645 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
>> index f547398e1a72..88c324fa382c 100644
>> --- a/drivers/accel/qda/Makefile
>> +++ b/drivers/accel/qda/Makefile
>> @@ -11,5 +11,7 @@ qda-y := \
>> qda_cb.o \
>> qda_memory_manager.o \
>> qda_ioctl.o \
>> + qda_gem.o \
>> + qda_memory_dma.o \
>>
>> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
>> index 86758a9cd982..19798359b14e 100644
>> --- a/drivers/accel/qda/qda_drv.c
>> +++ b/drivers/accel/qda/qda_drv.c
>> @@ -15,7 +15,7 @@
>> #include "qda_ioctl.h"
>> #include "qda_rpmsg.h"
>>
>> -static struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
>> +struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev)
> And this is a namespace leak. Please name all your functions in a
> selected style (qda_foo()).
>
>> {
>> if (!dev)
>> return NULL;
>> @@ -88,6 +88,7 @@ static int qda_open(struct drm_device *dev, struct drm_file *file)
>> return -ENOMEM;
>>
>> qda_file_priv->pid = current->pid;
>> + qda_file_priv->assigned_iommu_dev = NULL; /* Will be assigned on first allocation */
> Why? Also, isn't qda_file_priv zero-filled?
ack
>
>>
>> qda_user = alloc_qda_user(qdev);
>> if (!qda_user) {
>> @@ -118,6 +119,26 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
>>
>> qda_file_priv = (struct qda_file_priv *)file->driver_priv;
>> if (qda_file_priv) {
> Cant it be NULL? When?
>
>> + if (qda_file_priv->assigned_iommu_dev) {
>> + struct qda_iommu_device *iommu_dev = qda_file_priv->assigned_iommu_dev;
>> + unsigned long flags;
>> +
>> + /* Decrement reference count - if it reaches 0, reset PID assignment */
>> + if (refcount_dec_and_test(&iommu_dev->refcount)) {
>> + /* Last reference released - reset PID assignment */
>> + spin_lock_irqsave(&iommu_dev->lock, flags);
>> + iommu_dev->assigned_pid = 0;
> This is the part that needs to be discussed in the commit message
> instead of a generic description of the patch. What is assigned_pid /
> assigned_iommu_dev? Why do they need to be assigned?
I'll update more details for this.
>
>> + iommu_dev->assigned_file_priv = NULL;
>> + spin_unlock_irqrestore(&iommu_dev->lock, flags);
>> +
>> + qda_dbg(qdev, "Reset PID assignment for IOMMU device %u (process %d exited)\n",
>> + iommu_dev->id, qda_file_priv->pid);
>> + } else {
>> + qda_dbg(qdev, "Decremented reference for IOMMU device %u from process %d\n",
>> + iommu_dev->id, qda_file_priv->pid);
>> + }
>> + }
>> +
>> qda_user = qda_file_priv->qda_user;
>> if (qda_user)
>> free_qda_user(qda_user);
>> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
>> index e0ba37702a86..8a2cd474958b 100644
>> --- a/drivers/accel/qda/qda_drv.h
>> +++ b/drivers/accel/qda/qda_drv.h
>> @@ -33,6 +33,8 @@ struct qda_file_priv {
>> pid_t pid;
>> /* Pointer to qda_user structure for backward compatibility */
>> struct qda_user *qda_user;
>> + /* IOMMU device assigned to this process */
>> + struct qda_iommu_device *assigned_iommu_dev;
>> };
>>
>> /**
>> @@ -153,4 +155,9 @@ void qda_deinit_device(struct qda_dev *qdev);
>> int qda_register_device(struct qda_dev *qdev);
>> void qda_unregister_device(struct qda_dev *qdev);
>>
>> +/*
>> + * Utility function to get DRM private data from DRM device
>> + */
>> +struct qda_drm_priv *get_drm_priv_from_device(struct drm_device *dev);
>> +
>> #endif /* __QDA_DRV_H__ */
>> diff --git a/drivers/accel/qda/qda_gem.c b/drivers/accel/qda/qda_gem.c
>> new file mode 100644
>> index 000000000000..bbd54e2502d3
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_gem.c
>> @@ -0,0 +1,187 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> +#include <drm/drm_gem.h>
>> +#include <drm/drm_prime.h>
>> +#include <linux/slab.h>
>> +#include <linux/dma-mapping.h>
>> +#include "qda_drv.h"
>> +#include "qda_gem.h"
>> +#include "qda_memory_manager.h"
>> +#include "qda_memory_dma.h"
>> +
>> +static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
>> +{
>> + if (qda_gem_obj->size == 0) {
>> + qda_err(NULL, "Invalid GEM object size\n");
>> + return -EINVAL;
>> + }
>> + if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
>> + qda_err(NULL, "Allocated buffer missing IOMMU device\n");
>> + return -EINVAL;
>> + }
>> + if (!qda_gem_obj->iommu_dev->dev) {
>> + qda_err(NULL, "Allocated buffer missing IOMMU device\n");
>> + return -EINVAL;
>> + }
>> + if (!qda_gem_obj->virt) {
>> + qda_err(NULL, "Allocated buffer missing virtual address\n");
>> + return -EINVAL;
>> + }
>> + if (qda_gem_obj->dma_addr == 0) {
>> + qda_err(NULL, "Allocated buffer missing DMA address\n");
>> + return -EINVAL;
>> + }
> Is any of these conditions real?
>
>> +
>> + return 0;
>> +}
>> +
>> +static int validate_vma_offset(struct drm_gem_object *drm_obj, struct vm_area_struct *vma)
>> +{
>> + u64 expected_offset = drm_vma_node_offset_addr(&drm_obj->vma_node);
>> + u64 actual_offset = vma->vm_pgoff << PAGE_SHIFT;
>> +
>> + if (actual_offset != expected_offset) {
> What??
I'll remove all unnecessary checks.
>
>> + qda_err(NULL, "VMA offset mismatch: expected=0x%llx, actual=0x%llx\n",
>> + expected_offset, actual_offset);
>> + return -EINVAL;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static void setup_vma_flags(struct vm_area_struct *vma)
>> +{
>> + vm_flags_set(vma, VM_DONTEXPAND);
>> + vm_flags_set(vma, VM_DONTDUMP);
>> +}
>> +
>> +void qda_gem_free_object(struct drm_gem_object *gem_obj)
>> +{
>> + struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(gem_obj);
>> + struct qda_drm_priv *drm_priv = get_drm_priv_from_device(gem_obj->dev);
>> +
>> + if (qda_gem_obj->virt) {
>> + if (drm_priv && drm_priv->iommu_mgr)
>> + qda_memory_manager_free(drm_priv->iommu_mgr, qda_gem_obj);
>> + }
>> +
>> + drm_gem_object_release(gem_obj);
>> + kfree(qda_gem_obj);
>> +}
>> +
>> +int qda_gem_mmap_obj(struct drm_gem_object *drm_obj, struct vm_area_struct *vma)
>> +{
>> + struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(drm_obj);
>> + int ret;
>> +
>> + ret = validate_gem_obj_for_mmap(qda_gem_obj);
>> + if (ret) {
>> + qda_err(NULL, "GEM object validation failed: %d\n", ret);
>> + return ret;
>> + }
>> +
>> + ret = validate_vma_offset(drm_obj, vma);
>> + if (ret) {
>> + qda_err(NULL, "VMA offset validation failed: %d\n", ret);
>> + return ret;
>> + }
>> +
>> + /* Reset vm_pgoff for DMA mmap */
>> + vma->vm_pgoff = 0;
>> +
>> + ret = qda_dma_mmap(qda_gem_obj, vma);
>> +
>> + if (ret == 0) {
>> + setup_vma_flags(vma);
>> + qda_dbg(NULL, "GEM object mapped successfully\n");
>> + } else {
>> + qda_err(NULL, "GEM object mmap failed: %d\n", ret);
>> + }
>> +
>> + return ret;
>> +}
>> +
>> +static const struct drm_gem_object_funcs qda_gem_object_funcs = {
>> + .free = qda_gem_free_object,
>> + .mmap = qda_gem_mmap_obj,
>> +};
>> +
>> +struct qda_gem_obj *qda_gem_alloc_object(struct drm_device *drm_dev, size_t aligned_size)
>> +{
>> + struct qda_gem_obj *qda_gem_obj;
>> + int ret;
>> +
>> + qda_gem_obj = kzalloc_obj(*qda_gem_obj, GFP_KERNEL);
>> + if (!qda_gem_obj)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + ret = drm_gem_object_init(drm_dev, &qda_gem_obj->base, aligned_size);
>> + if (ret) {
>> + qda_err(NULL, "Failed to initialize GEM object: %d\n", ret);
>> + kfree(qda_gem_obj);
>> + return ERR_PTR(ret);
>> + }
>> +
>> + qda_gem_obj->base.funcs = &qda_gem_object_funcs;
>> + qda_gem_obj->size = aligned_size;
>> +
>> + qda_dbg(NULL, "Allocated GEM object size=%zu\n", aligned_size);
>> + return qda_gem_obj;
>> +}
>> +
>> +void qda_gem_cleanup_object(struct qda_gem_obj *qda_gem_obj)
>> +{
>> + drm_gem_object_release(&qda_gem_obj->base);
>> + kfree(qda_gem_obj);
>> +}
>> +
>> +struct drm_gem_object *qda_gem_lookup_object(struct drm_file *file_priv, u32 handle)
>> +{
>> + struct drm_gem_object *gem_obj;
>> +
>> + gem_obj = drm_gem_object_lookup(file_priv, handle);
>> + if (!gem_obj)
>> + return ERR_PTR(-ENOENT);
>> +
>> + return gem_obj;
>> +}
>> +
>> +int qda_gem_create_handle(struct drm_file *file_priv, struct drm_gem_object *gem_obj, u32 *handle)
>> +{
>> + int ret;
>> +
>> + ret = drm_gem_handle_create(file_priv, gem_obj, handle);
>> + drm_gem_object_put(gem_obj);
>> +
>> + return ret;
>> +}
>> +
>> +struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
>> + struct qda_memory_manager *iommu_mgr, size_t size,
>> + struct drm_file *file_priv)
>> +{
>> + struct qda_gem_obj *qda_gem_obj;
>> + size_t aligned_size;
>> + int ret;
>> +
>> + if (size == 0) {
>> + qda_err(NULL, "Invalid size for GEM object creation\n");
>> + return ERR_PTR(-EINVAL);
>> + }
>> +
>> + aligned_size = PAGE_ALIGN(size);
>> +
>> + qda_gem_obj = qda_gem_alloc_object(drm_dev, aligned_size);
>> + if (IS_ERR(qda_gem_obj))
>> + return (struct drm_gem_object *)qda_gem_obj;
>> +
>> + ret = qda_memory_manager_alloc(iommu_mgr, qda_gem_obj, file_priv);
>> + if (ret) {
>> + qda_err(NULL, "Memory manager allocation failed: %d\n", ret);
>> + qda_gem_cleanup_object(qda_gem_obj);
>> + return ERR_PTR(ret);
>> + }
>> +
>> + qda_dbg(NULL, "GEM object created successfully size=%zu\n", aligned_size);
>> + return &qda_gem_obj->base;
>> +}
>> diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h
>> new file mode 100644
>> index 000000000000..caae9cda5363
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_gem.h
>> @@ -0,0 +1,63 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> + */
>> +#ifndef _QDA_GEM_H
>> +#define _QDA_GEM_H
>> +
>> +#include <linux/xarray.h>
>> +#include <drm/drm_device.h>
>> +#include <drm/drm_gem.h>
>> +#include <linux/dma-mapping.h>
>> +
>> +/* Forward declarations */
>> +struct qda_memory_manager;
>> +struct qda_iommu_device;
>> +
>> +/**
>> + * struct qda_gem_obj - QDA GEM buffer object
>> + *
>> + * This structure represents a GEM buffer object that can be either
>> + * allocated by the driver or imported from another driver via dma-buf.
>> + */
>> +struct qda_gem_obj {
>> + /* DRM GEM object base structure */
>> + struct drm_gem_object base;
>> + /* Kernel virtual address of allocated memory */
>> + void *virt;
>> + /* DMA address for allocated buffers */
>> + dma_addr_t dma_addr;
>> + /* Size of the buffer in bytes */
>> + size_t size;
>> + /* IOMMU device that performed the allocation */
>> + struct qda_iommu_device *iommu_dev;
>> +};
>> +
>> +/*
>> + * Helper macro to cast a drm_gem_object to qda_gem_obj
>> + */
>> +#define to_qda_gem_obj(gem_obj) container_of(gem_obj, struct qda_gem_obj, base)
>> +
>> +/*
>> + * GEM object lifecycle management
>> + */
>> +struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
>> + struct qda_memory_manager *iommu_mgr,
>> + size_t size, struct drm_file *file_priv);
>> +void qda_gem_free_object(struct drm_gem_object *gem_obj);
>> +int qda_gem_mmap_obj(struct drm_gem_object *gem_obj, struct vm_area_struct *vma);
>> +
>> +/*
>> + * Helper functions for GEM object allocation and cleanup
>> + * These are used internally and by the PRIME import code
>> + */
>> +struct qda_gem_obj *qda_gem_alloc_object(struct drm_device *drm_dev, size_t aligned_size);
>> +void qda_gem_cleanup_object(struct qda_gem_obj *qda_gem_obj);
>> +
>> +/*
>> + * Utility functions for GEM operations
>> + */
>> +struct drm_gem_object *qda_gem_lookup_object(struct drm_file *file_priv, u32 handle);
>> +int qda_gem_create_handle(struct drm_file *file_priv, struct drm_gem_object *gem_obj, u32 *handle);
>> +
>> +#endif /* _QDA_GEM_H */
>> diff --git a/drivers/accel/qda/qda_memory_dma.c b/drivers/accel/qda/qda_memory_dma.c
>> new file mode 100644
>> index 000000000000..ffdd5423c88c
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_memory_dma.c
>> @@ -0,0 +1,91 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> +#include <linux/slab.h>
>> +#include <linux/dma-mapping.h>
>> +#include "qda_drv.h"
>> +#include "qda_memory_dma.h"
>> +
>> +static dma_addr_t get_actual_dma_addr(struct qda_gem_obj *gem_obj)
>> +{
>> + return gem_obj->dma_addr - ((u64)gem_obj->iommu_dev->sid << 32);
>> +}
>> +
>> +static void setup_gem_object(struct qda_gem_obj *gem_obj, void *virt,
>> + dma_addr_t dma_addr, struct qda_iommu_device *iommu_dev)
>> +{
>> + gem_obj->virt = virt;
>> + gem_obj->dma_addr = dma_addr;
>> + gem_obj->iommu_dev = iommu_dev;
>> +}
>> +
>> +static void cleanup_gem_object_fields(struct qda_gem_obj *gem_obj)
>> +{
>> + gem_obj->virt = NULL;
>> + gem_obj->dma_addr = 0;
>> + gem_obj->iommu_dev = NULL;
>> +}
>> +
>> +int qda_dma_alloc(struct qda_iommu_device *iommu_dev,
>> + struct qda_gem_obj *gem_obj, size_t size)
>> +{
>> + void *virt;
>> + dma_addr_t dma_addr;
>> +
>> + if (!iommu_dev || !iommu_dev->dev) {
>> + qda_err(NULL, "Invalid iommu_dev or device for DMA allocation\n");
>> + return -EINVAL;
>> + }
>> +
>> + virt = dma_alloc_coherent(iommu_dev->dev, size, &dma_addr, GFP_KERNEL);
>> + if (!virt)
>> + return -ENOMEM;
>> +
>> + dma_addr += ((u64)iommu_dev->sid << 32);
>> +
>> + qda_dbg(NULL, "DMA address with SID prefix: 0x%llx (sid=%u)\n",
>> + (u64)dma_addr, iommu_dev->sid);
>> +
>> + setup_gem_object(gem_obj, virt, dma_addr, iommu_dev);
>> +
>> + return 0;
>> +}
>> +
>> +void qda_dma_free(struct qda_gem_obj *gem_obj)
>> +{
>> + if (!gem_obj || !gem_obj->iommu_dev) {
>> + qda_dbg(NULL, "Invalid gem_obj or iommu_dev for DMA free\n");
>> + return;
>> + }
>> +
>> + qda_dbg(NULL, "DMA freeing: size=%zu, device_id=%u, dma_addr=0x%llx\n",
>> + gem_obj->size, gem_obj->iommu_dev->id, gem_obj->dma_addr);
>> +
>> + dma_free_coherent(gem_obj->iommu_dev->dev, gem_obj->size,
>> + gem_obj->virt, get_actual_dma_addr(gem_obj));
>> +
>> + cleanup_gem_object_fields(gem_obj);
>> +}
>> +
>> +int qda_dma_mmap(struct qda_gem_obj *gem_obj, struct vm_area_struct *vma)
>> +{
>> + struct qda_iommu_device *iommu_dev;
>> + int ret;
>> +
>> + if (!gem_obj || !gem_obj->virt || !gem_obj->iommu_dev || !gem_obj->iommu_dev->dev) {
>> + qda_err(NULL, "Invalid parameters for DMA mmap\n");
>> + return -EINVAL;
>> + }
>> +
>> + iommu_dev = gem_obj->iommu_dev;
>> +
>> + ret = dma_mmap_coherent(iommu_dev->dev, vma, gem_obj->virt,
>> + get_actual_dma_addr(gem_obj), gem_obj->size);
>> +
>> + if (ret)
>> + qda_err(NULL, "DMA mmap failed: size=%zu, device_id=%u, ret=%d\n",
>> + gem_obj->size, iommu_dev->id, ret);
> if (ret) {
> qda_err();
> return ret;
> // or goto err_foo;
> }
>
> return 0;
ack
>
>
>> + else
>> + qda_dbg(NULL, "DMA mmap successful: size=%zu\n", gem_obj->size);
> It feels like the driver is over-verbose if debugging is enabled.
I'll remove all unnecessary logs
>
>> +
>> + return ret;
>> +}
>> diff --git a/drivers/accel/qda/qda_memory_dma.h b/drivers/accel/qda/qda_memory_dma.h
>> new file mode 100644
>> index 000000000000..79b3c4053a82
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_memory_dma.h
>> @@ -0,0 +1,46 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> + */
>> +
>> +#ifndef _QDA_MEMORY_DMA_H
>> +#define _QDA_MEMORY_DMA_H
>> +
>> +#include <linux/dma-mapping.h>
>> +#include "qda_memory_manager.h"
>> +
>> +/**
>> + * qda_dma_alloc() - Allocate DMA coherent memory for a GEM object
>> + * @iommu_dev: Pointer to the QDA IOMMU device structure
>> + * @gem_obj: Pointer to GEM object to allocate memory for
>> + * @size: Size of memory to allocate in bytes
>> + *
>> + * Allocates DMA-coherent memory and sets up the GEM object with the
>> + * allocated memory details including virtual and DMA addresses.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
> Move the kerneldoc from the headers to the driver code, otherwise they
> are mostly ignored by the automatic validators.
ack. I'll run the kernel-doc checker for the next iteration.
>
>> +int qda_dma_alloc(struct qda_iommu_device *iommu_dev,
>> + struct qda_gem_obj *gem_obj, size_t size);
>> +
>> +/**
>> + * qda_dma_free() - Free DMA coherent memory for a GEM object
>> + * @gem_obj: Pointer to GEM object to free memory for
>> + *
>> + * Frees DMA-coherent memory previously allocated for the GEM object
>> + * and cleans up the GEM object fields.
>> + */
>> +void qda_dma_free(struct qda_gem_obj *gem_obj);
>> +
>> +/**
>> + * qda_dma_mmap() - Map DMA memory into userspace
>> + * @gem_obj: Pointer to GEM object containing DMA memory
>> + * @vma: Virtual memory area to map into
>> + *
>> + * Maps DMA-coherent memory into userspace virtual address space.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int qda_dma_mmap(struct qda_gem_obj *gem_obj, struct vm_area_struct *vma);
>> +
>> +#endif /* _QDA_MEMORY_DMA_H */
>> diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
>> index b4c7047a89d4..e225667557ee 100644
>> --- a/drivers/accel/qda/qda_memory_manager.c
>> +++ b/drivers/accel/qda/qda_memory_manager.c
>> @@ -6,8 +6,11 @@
>> #include <linux/spinlock.h>
>> #include <linux/workqueue.h>
>> #include <linux/xarray.h>
>> +#include <drm/drm_file.h>
>> #include "qda_drv.h"
>> +#include "qda_gem.h"
>> #include "qda_memory_manager.h"
>> +#include "qda_memory_dma.h"
>>
>> static void cleanup_all_memory_devices(struct qda_memory_manager *mem_mgr)
>> {
>> @@ -55,6 +58,8 @@ static void init_iommu_device_fields(struct qda_iommu_device *iommu_dev,
>> spin_lock_init(&iommu_dev->lock);
>> refcount_set(&iommu_dev->refcount, 0);
>> INIT_WORK(&iommu_dev->remove_work, qda_memory_manager_remove_work);
>> + iommu_dev->assigned_pid = 0;
>> + iommu_dev->assigned_file_priv = NULL;
>> }
>>
>> static int allocate_device_id(struct qda_memory_manager *mem_mgr,
>> @@ -78,6 +83,194 @@ static int allocate_device_id(struct qda_memory_manager *mem_mgr,
>> return ret;
>> }
>>
>> +static struct qda_iommu_device *find_device_for_pid(struct qda_memory_manager *mem_mgr,
>> + pid_t pid)
>> +{
>> + unsigned long index;
>> + void *entry;
>> + struct qda_iommu_device *found_dev = NULL;
>> + unsigned long flags;
>> +
>> + xa_lock(&mem_mgr->device_xa);
>> + xa_for_each(&mem_mgr->device_xa, index, entry) {
>> + struct qda_iommu_device *iommu_dev = entry;
>> +
>> + spin_lock_irqsave(&iommu_dev->lock, flags);
>> + if (iommu_dev->assigned_pid == pid) {
>> + found_dev = iommu_dev;
>> + refcount_inc(&found_dev->refcount);
>> + qda_dbg(NULL, "Reusing device id=%u for PID=%d (refcount=%u)\n",
>> + found_dev->id, pid, refcount_read(&found_dev->refcount));
> And what if there are two different FastRPC sessions within the same
> PID?
As for this patch, multi session/multi PD might not work. I'll add changes for Multi-PD support also.
>
>> + spin_unlock_irqrestore(&iommu_dev->lock, flags);
>> + break;
>> + }
>> + spin_unlock_irqrestore(&iommu_dev->lock, flags);
>> + }
>> + xa_unlock(&mem_mgr->device_xa);
>> +
>> + return found_dev;
>> +}
>> +
>> +static struct qda_iommu_device *assign_available_device_to_pid(struct qda_memory_manager *mem_mgr,
>> + pid_t pid,
>> + struct drm_file *file_priv)
>> +{
>> + unsigned long index;
>> + void *entry;
>> + struct qda_iommu_device *selected_dev = NULL;
>> + unsigned long flags;
>> +
>> + xa_lock(&mem_mgr->device_xa);
>> + xa_for_each(&mem_mgr->device_xa, index, entry) {
>> + struct qda_iommu_device *iommu_dev = entry;
>> +
>> + spin_lock_irqsave(&iommu_dev->lock, flags);
>> + if (iommu_dev->assigned_pid == 0) {
>> + iommu_dev->assigned_pid = pid;
>> + iommu_dev->assigned_file_priv = file_priv;
>> + selected_dev = iommu_dev;
>> + refcount_set(&selected_dev->refcount, 1);
>> + qda_dbg(NULL, "Assigned device id=%u to PID=%d\n",
>> + selected_dev->id, pid);
>> + spin_unlock_irqrestore(&iommu_dev->lock, flags);
>> + break;
>> + }
>> + spin_unlock_irqrestore(&iommu_dev->lock, flags);
>> + }
>> + xa_unlock(&mem_mgr->device_xa);
>> +
>> + return selected_dev;
>> +}
>> +
>> +static struct qda_iommu_device *get_process_iommu_device(struct qda_memory_manager *mem_mgr,
>> + struct drm_file *file_priv)
>> +{
>> + struct qda_file_priv *qda_priv;
>> +
>> + if (!file_priv || !file_priv->driver_priv)
>> + return NULL;
>> +
>> + qda_priv = (struct qda_file_priv *)file_priv->driver_priv;
>> + return qda_priv->assigned_iommu_dev;
>> +}
>> +
>> +static int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
>> + struct drm_file *file_priv)
>> +{
>> + struct qda_file_priv *qda_priv;
>> + struct qda_iommu_device *selected_dev = NULL;
>> + int ret = 0;
>> + pid_t current_pid;
>> +
>> + if (!file_priv || !file_priv->driver_priv) {
>> + qda_err(NULL, "Invalid file_priv or driver_priv\n");
>> + return -EINVAL;
>> + }
>> +
>> + qda_priv = (struct qda_file_priv *)file_priv->driver_priv;
>> + current_pid = qda_priv->pid;
>> +
>> + mutex_lock(&mem_mgr->process_assignment_lock);
>> +
>> + if (qda_priv->assigned_iommu_dev) {
>> + qda_dbg(NULL, "PID=%d already has device id=%u assigned\n",
>> + current_pid, qda_priv->assigned_iommu_dev->id);
>> + ret = 0;
>> + goto unlock_and_return;
>> + }
>> +
>> + selected_dev = find_device_for_pid(mem_mgr, current_pid);
>> +
>> + if (selected_dev) {
>> + qda_priv->assigned_iommu_dev = selected_dev;
>> + goto unlock_and_return;
>> + }
>> +
>> + selected_dev = assign_available_device_to_pid(mem_mgr, current_pid, file_priv);
>> +
>> + if (!selected_dev) {
>> + qda_err(NULL, "No available device for PID=%d\n", current_pid);
>> + ret = -ENOMEM;
>> + goto unlock_and_return;
>> + }
>> +
>> + qda_priv->assigned_iommu_dev = selected_dev;
>> +
>> +unlock_and_return:
>> + mutex_unlock(&mem_mgr->process_assignment_lock);
>> + return ret;
>> +}
>> +
>> +static struct qda_iommu_device *get_or_assign_iommu_device(struct qda_memory_manager *mem_mgr,
>> + struct drm_file *file_priv,
>> + size_t size)
>> +{
>> + struct qda_iommu_device *iommu_dev;
>> + int ret;
>> +
>> + iommu_dev = get_process_iommu_device(mem_mgr, file_priv);
>> + if (iommu_dev)
>> + return iommu_dev;
>> +
>> + ret = qda_memory_manager_assign_device(mem_mgr, file_priv);
>> + if (ret)
>> + return NULL;
>> +
>> + iommu_dev = get_process_iommu_device(mem_mgr, file_priv);
>> + if (iommu_dev)
>> + return iommu_dev;
>> +
>> + return NULL;
>> +}
>> +
>> +int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj,
>> + struct drm_file *file_priv)
>> +{
>> + struct qda_iommu_device *selected_dev;
>> + size_t size;
>> + int ret;
>> +
>> + if (!mem_mgr || !gem_obj || !file_priv) {
>> + qda_err(NULL, "Invalid parameters for memory allocation\n");
>> + return -EINVAL;
>> + }
>> +
>> + size = gem_obj->size;
>> + if (size == 0) {
>> + qda_err(NULL, "Invalid allocation size: 0\n");
>> + return -EINVAL;
>> + }
>> +
>> + selected_dev = get_or_assign_iommu_device(mem_mgr, file_priv, size);
>> +
>> + if (!selected_dev) {
>> + qda_err(NULL, "Failed to get/assign device for allocation (size=%zu)\n", size);
>> + return -ENOMEM;
>> + }
>> +
>> + ret = qda_dma_alloc(selected_dev, gem_obj, size);
>> +
>> + if (ret) {
>> + qda_err(NULL, "Allocation failed: size=%zu, device_id=%u, ret=%d\n",
>> + size, selected_dev->id, ret);
>> + return ret;
>> + }
>> +
>> + qda_dbg(NULL, "Successfully allocated: size=%zu, device_id=%u, dma_addr=0x%llx\n",
>> + size, selected_dev->id, gem_obj->dma_addr);
>> + return 0;
>> +}
>> +
>> +void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj)
>> +{
>> + if (!gem_obj || !gem_obj->iommu_dev) {
>> + qda_dbg(NULL, "Invalid gem_obj or iommu_dev for free\n");
>> + return;
>> + }
>> +
>> + qda_dma_free(gem_obj);
>> +}
>> +
>> int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
>> struct qda_iommu_device *iommu_dev)
>> {
>> @@ -134,6 +327,7 @@ int qda_memory_manager_init(struct qda_memory_manager *mem_mgr)
>>
>> xa_init_flags(&mem_mgr->device_xa, XA_FLAGS_ALLOC);
>> atomic_set(&mem_mgr->next_id, 0);
>> + mutex_init(&mem_mgr->process_assignment_lock);
>> mem_mgr->wq = create_workqueue("memory_manager_wq");
>> if (!mem_mgr->wq) {
>> qda_err(NULL, "Failed to create memory manager workqueue\n");
>> diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h
>> index 3bf4cd529909..bac44284ef98 100644
>> --- a/drivers/accel/qda/qda_memory_manager.h
>> +++ b/drivers/accel/qda/qda_memory_manager.h
>> @@ -11,6 +11,8 @@
>> #include <linux/spinlock.h>
>> #include <linux/workqueue.h>
>> #include <linux/xarray.h>
>> +#include <drm/drm_file.h>
>> +#include "qda_gem.h"
>>
>> /**
>> * struct qda_iommu_device - IOMMU device instance for memory management
>> @@ -35,6 +37,10 @@ struct qda_iommu_device {
>> u32 sid;
>> /* Pointer to parent memory manager */
>> struct qda_memory_manager *manager;
>> + /* Process ID of the process assigned to this device */
>> + pid_t assigned_pid;
>> + /* DRM file private data for the assigned process */
>> + struct drm_file *assigned_file_priv;
>> };
>>
>> /**
>> @@ -51,6 +57,8 @@ struct qda_memory_manager {
>> atomic_t next_id;
>> /* Workqueue for asynchronous device operations */
>> struct workqueue_struct *wq;
>> + /* Mutex protecting process-to-device assignments */
>> + struct mutex process_assignment_lock;
>> };
>>
>> /**
>> @@ -98,4 +106,29 @@ int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
>> void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
>> struct qda_iommu_device *iommu_dev);
>>
>> +/**
>> + * qda_memory_manager_alloc() - Allocate memory for a GEM object
>> + * @mem_mgr: Pointer to memory manager
>> + * @gem_obj: Pointer to GEM object to allocate memory for
>> + * @file_priv: DRM file private data for process association
>> + *
>> + * Allocates memory for the specified GEM object using an appropriate IOMMU
>> + * device. The allocation is associated with the calling process via
>> + * file_priv.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj,
>> + struct drm_file *file_priv);
>> +
>> +/**
>> + * qda_memory_manager_free() - Free memory for a GEM object
>> + * @mem_mgr: Pointer to memory manager
>> + * @gem_obj: Pointer to GEM object to free memory for
>> + *
>> + * Releases memory previously allocated for the specified GEM object and
>> + * removes any associated IOMMU mappings.
>> + */
>> +void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj);
>> +
>> #endif /* _QDA_MEMORY_MANAGER_H */
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 11/18] accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
2026-02-23 22:39 ` Dmitry Baryshkov
@ 2026-03-02 9:07 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 9:07 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 4:09 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:39:05AM +0530, Ekansh Gupta wrote:
>> Add two GEM-related IOCTLs for the QDA accelerator driver and hook
>> them into the DRM accel driver. DRM_IOCTL_QDA_GEM_CREATE allocates
>> a DMA-backed GEM buffer object via qda_gem_create_object() and
>> returns a GEM handle to userspace, while
>> DRM_IOCTL_QDA_GEM_MMAP_OFFSET returns a valid mmap offset for a
>> given GEM handle using drm_gem_create_mmap_offset() and the
>> vma_node in the GEM object.
>>
>> The QDA driver is updated to advertise DRIVER_GEM in its
>> driver_features, and the new IOCTLs are wired through the QDA
>> GEM and memory-manager backend. These IOCTLs allow userspace to
>> allocate buffers and map them into its address space as a first
>> step toward full compute buffer management and integration with
>> DSP workloads.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/qda_drv.c | 5 ++++-
>> drivers/accel/qda/qda_gem.h | 30 ++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_ioctl.c | 35 +++++++++++++++++++++++++++++++++++
>> include/uapi/drm/qda_accel.h | 36 ++++++++++++++++++++++++++++++++++++
>> 4 files changed, 105 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
>> index 19798359b14e..0dd0e2bb2c0f 100644
>> --- a/drivers/accel/qda/qda_drv.c
>> +++ b/drivers/accel/qda/qda_drv.c
>> @@ -12,6 +12,7 @@
>> #include <drm/qda_accel.h>
>>
>> #include "qda_drv.h"
>> +#include "qda_gem.h"
>> #include "qda_ioctl.h"
>> #include "qda_rpmsg.h"
>>
>> @@ -154,10 +155,12 @@ DEFINE_DRM_ACCEL_FOPS(qda_accel_fops);
>>
>> static const struct drm_ioctl_desc qda_ioctls[] = {
>> DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
>> + DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
>> + DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
>> };
>>
>> static struct drm_driver qda_drm_driver = {
>> - .driver_features = DRIVER_COMPUTE_ACCEL,
>> + .driver_features = DRIVER_GEM | DRIVER_COMPUTE_ACCEL,
>> .fops = &qda_accel_fops,
>> .open = qda_open,
>> .postclose = qda_postclose,
>> diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h
>> index caae9cda5363..cbd5d0a58fa4 100644
>> --- a/drivers/accel/qda/qda_gem.h
>> +++ b/drivers/accel/qda/qda_gem.h
>> @@ -47,6 +47,36 @@ struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
>> void qda_gem_free_object(struct drm_gem_object *gem_obj);
>> int qda_gem_mmap_obj(struct drm_gem_object *gem_obj, struct vm_area_struct *vma);
>>
>> +/*
>> + * GEM IOCTL handlers
>> + */
>> +
>> +/**
>> + * qda_ioctl_gem_create - Create a GEM buffer object
>> + * @dev: DRM device structure
>> + * @data: User-space data containing buffer creation parameters
>> + * @file_priv: DRM file private data
>> + *
>> + * This IOCTL handler creates a new GEM buffer object with the specified
>> + * size and returns a handle to the created buffer.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv);
>> +
>> +/**
>> + * qda_ioctl_gem_mmap_offset - Get mmap offset for a GEM buffer object
>> + * @dev: DRM device structure
>> + * @data: User-space data containing buffer handle and offset result
>> + * @file_priv: DRM file private data
>> + *
>> + * This IOCTL handler retrieves the mmap offset for a GEM buffer object,
>> + * which can be used to map the buffer into user-space memory.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv);
>> +
>> /*
>> * Helper functions for GEM object allocation and cleanup
>> * These are used internally and by the PRIME import code
>> diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
>> index 9fa73ec2dfce..ef3c9c691cb7 100644
>> --- a/drivers/accel/qda/qda_ioctl.c
>> +++ b/drivers/accel/qda/qda_ioctl.c
>> @@ -43,3 +43,38 @@ int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_pr
>>
>> return 0;
>> }
>> +
>> +int qda_ioctl_gem_create(struct drm_device *dev, void *data, struct drm_file *file_priv)
>> +{
>> + struct drm_qda_gem_create *args = data;
>> + struct drm_gem_object *gem_obj;
>> + struct qda_drm_priv *drm_priv;
>> +
>> + drm_priv = get_drm_priv_from_device(dev);
>> + if (!drm_priv || !drm_priv->iommu_mgr)
>> + return -EINVAL;
>> +
>> + gem_obj = qda_gem_create_object(dev, drm_priv->iommu_mgr, args->size, file_priv);
>> + if (IS_ERR(gem_obj))
>> + return PTR_ERR(gem_obj);
>> +
>> + return qda_gem_create_handle(file_priv, gem_obj, &args->handle);
>> +}
>> +
>> +int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv)
>> +{
>> + struct drm_qda_gem_mmap_offset *args = data;
>> + struct drm_gem_object *gem_obj;
>> + int ret;
>> +
>> + gem_obj = qda_gem_lookup_object(file_priv, args->handle);
>> + if (IS_ERR(gem_obj))
>> + return PTR_ERR(gem_obj);
>> +
>> + ret = drm_gem_create_mmap_offset(gem_obj);
>> + if (ret == 0)
>> + args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node);
>> +
>> + drm_gem_object_put(gem_obj);
>> + return ret;
>> +}
>> diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
>> index 0aad791c4832..ed24a7f5637e 100644
>> --- a/include/uapi/drm/qda_accel.h
>> +++ b/include/uapi/drm/qda_accel.h
>> @@ -19,6 +19,8 @@ extern "C" {
>> * They are used with DRM_COMMAND_BASE to create the full IOCTL numbers.
>> */
>> #define DRM_QDA_QUERY 0x00
>> +#define DRM_QDA_GEM_CREATE 0x01
>> +#define DRM_QDA_GEM_MMAP_OFFSET 0x02
>> /*
>> * QDA IOCTL definitions
>> *
>> @@ -27,6 +29,10 @@ extern "C" {
>> * data structure and direction (read/write) for each IOCTL.
>> */
>> #define DRM_IOCTL_QDA_QUERY DRM_IOR(DRM_COMMAND_BASE + DRM_QDA_QUERY, struct drm_qda_query)
>> +#define DRM_IOCTL_QDA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_CREATE, \
>> + struct drm_qda_gem_create)
>> +#define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
>> + struct drm_qda_gem_mmap_offset)
>>
>> /**
>> * struct drm_qda_query - Device information query structure
>> @@ -40,6 +46,36 @@ struct drm_qda_query {
>> __u8 dsp_name[16];
>> };
>>
>> +/**
>> + * struct drm_qda_gem_create - GEM buffer object creation parameters
>> + * @size: Size of the GEM object to create in bytes (input)
>> + * @handle: Allocated GEM handle (output)
>> + *
>> + * This structure is used with DRM_IOCTL_QDA_GEM_CREATE to allocate
>> + * a new GEM buffer object.
>> + */
>> +struct drm_qda_gem_create {
>> + __u32 handle;
>> + __u32 pad;
>> + __u64 size;
> If you put size before handle, you would not need padding.
ack
>
>> +};
>> +
>> +/**
>> + * struct drm_qda_gem_mmap_offset - GEM object mmap offset query
>> + * @handle: GEM handle (input)
>> + * @pad: Padding for 64-bit alignment
>> + * @offset: mmap offset for the GEM object (output)
>> + *
>> + * This structure is used with DRM_IOCTL_QDA_GEM_MMAP_OFFSET to retrieve
>> + * the mmap offset that can be used with mmap() to map the GEM object into
>> + * user space.
>> + */
>> +struct drm_qda_gem_mmap_offset {
>> + __u32 handle;
>> + __u32 pad;
>> + __u64 offset;
> I'm really not a fan of the pad field in the middle of the structure.
ack
>
>> +};
>> +
>> #if defined(__cplusplus)
>> }
>> #endif
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 11/18] accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
2026-02-24 9:05 ` Christian König
@ 2026-03-02 9:08 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 9:08 UTC (permalink / raw)
To: Christian König, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 2:35 PM, Christian König wrote:
> On 2/23/26 20:09, Ekansh Gupta wrote:
> ...
>> +int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_file *file_priv)
>> +{
>> + struct drm_qda_gem_mmap_offset *args = data;
>> + struct drm_gem_object *gem_obj;
>> + int ret;
>> +
>> + gem_obj = qda_gem_lookup_object(file_priv, args->handle);
>> + if (IS_ERR(gem_obj))
>> + return PTR_ERR(gem_obj);
>> +
>> + ret = drm_gem_create_mmap_offset(gem_obj);
>> + if (ret == 0)
>> + args->offset = drm_vma_node_offset_addr(&gem_obj->vma_node);
>> +
>> + drm_gem_object_put(gem_obj);
>> + return ret;
> You should probably use drm_gem_dumb_map_offset() instead of open coding this.
>
> Otherwise you allow mmap() of imported objects which is not allowed at all.
Thanks for pointing this, Christian. I'll read more about this and fix as per your suggestion.
>
> Regards,
> Christian.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 16/18] accel/qda: Add FastRPC-based DSP memory mapping support
2026-02-26 10:48 ` Krzysztof Kozlowski
@ 2026-03-02 9:12 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 9:12 UTC (permalink / raw)
To: Krzysztof Kozlowski, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/26/2026 4:18 PM, Krzysztof Kozlowski wrote:
> On 23/02/2026 20:09, Ekansh Gupta wrote:
>> Add a DRM_QDA_MAP ioctl and supporting FastRPC plumbing to map GEM
>> backed buffers into the DSP virtual address space. The new
>> qda_mem_map UAPI structure allows userspace to request legacy MMAP
>> style mappings or handle-based MEM_MAP mappings with attributes, and
>> encodes flags, offsets and optional virtual address hints that are
>> forwarded to the DSP.
>>
>> On the FastRPC side new method identifiers FASTRPC_RMID_INIT_MMAP
>> and FASTRPC_RMID_INIT_MEM_MAP are introduced together with message
>> structures for map requests and responses. The fastrpc_prepare_args
>> path is extended to build the appropriate request headers, serialize
>> the physical page information derived from a GEM object into a
>> fastrpc_phy_page array and pack the arguments into the shared message
>> buffer used by the existing invoke infrastructure.
>>
>> The qda_ioctl_mmap() handler dispatches mapping requests based on the
>> qda_mem_map request type, reusing the generic fastrpc_invoke()
>> machinery and the RPMsg transport to communicate with the DSP. This
>> provides the foundation for explicit buffer mapping into the DSP
>> address space for subsequent FastRPC calls, aligned with the
>> traditional FastRPC user space model.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> arch/arm64/configs/defconfig | 2 +
> Not relevan there. Don't stuff other subsystem code into your patches.
> Especially without any reasons (your commit msg must explain WHY you are
> doing things).
Please ignore this, it's a mistake pulled from my local test branch. I'm not going to add any
defconfig changes as part of this patch series.
Thanks for pointing this out.
>
>> drivers/accel/qda/qda_drv.c | 1 +
>> drivers/accel/qda/qda_fastrpc.c | 217 ++++++++++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_fastrpc.h | 64 ++++++++++++
>> drivers/accel/qda/qda_ioctl.c | 24 +++++
>> drivers/accel/qda/qda_ioctl.h | 13 +++
>> include/uapi/drm/qda_accel.h | 44 +++++++-
>> 7 files changed, 364 insertions(+), 1 deletion(-)
>>
>
>
> Best regards,
> Krzysztof
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 12/18] accel/qda: Add PRIME dma-buf import support
2026-02-24 8:52 ` Matthew Brost
@ 2026-03-02 9:19 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-02 9:19 UTC (permalink / raw)
To: Matthew Brost
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On 2/24/2026 2:22 PM, Matthew Brost wrote:
> On Tue, Feb 24, 2026 at 12:39:06AM +0530, Ekansh Gupta wrote:
>> Add PRIME dma-buf import support for QDA GEM buffer objects and integrate
>> it with the existing per-process memory manager and IOMMU device model.
>>
>> The implementation extends qda_gem_obj to represent imported dma-bufs,
>> including dma_buf references, attachment state, scatter-gather tables
>> and an imported DMA address used for DSP-facing book-keeping. The
>> qda_gem_prime_import() path handles reimports of buffers originally
>> exported by QDA as well as imports of external dma-bufs, attaching them
>> to the assigned IOMMU device and mapping them through the memory manager
>> for DSP access. The GEM free path is updated to unmap and detach
>> imported buffers while preserving the existing behaviour for locally
>> allocated memory.
>>
>> The PRIME fd-to-handle path is implemented in qda_prime_fd_to_handle(),
>> which records the calling drm_file in a driver-private import context
>> before invoking the core DRM helpers. The GEM import callback retrieves
>> this context to ensure that an IOMMU device is assigned to the process
>> and that imported buffers follow the same per-process IOMMU selection
>> rules as natively allocated GEM objects.
>>
>> This patch prepares the driver for interoperable buffer sharing between
>> QDA and other dma-buf capable subsystems while keeping IOMMU mapping and
>> lifetime handling consistent with the existing GEM allocation flow.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/Makefile | 1 +
>> drivers/accel/qda/qda_drv.c | 8 ++
>> drivers/accel/qda/qda_drv.h | 4 +
>> drivers/accel/qda/qda_gem.c | 60 +++++++---
>> drivers/accel/qda/qda_gem.h | 10 ++
>> drivers/accel/qda/qda_ioctl.c | 7 ++
>> drivers/accel/qda/qda_ioctl.h | 15 +++
>> drivers/accel/qda/qda_memory_manager.c | 42 ++++++-
>> drivers/accel/qda/qda_memory_manager.h | 14 +++
>> drivers/accel/qda/qda_prime.c | 194 +++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_prime.h | 43 ++++++++
>> 11 files changed, 377 insertions(+), 21 deletions(-)
>>
>> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
>> index 88c324fa382c..8286f5279748 100644
>> --- a/drivers/accel/qda/Makefile
>> +++ b/drivers/accel/qda/Makefile
>> @@ -13,5 +13,6 @@ qda-y := \
>> qda_ioctl.o \
>> qda_gem.o \
>> qda_memory_dma.o \
>> + qda_prime.o \
>>
>> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
>> index 0dd0e2bb2c0f..4adee00b1f2c 100644
>> --- a/drivers/accel/qda/qda_drv.c
>> +++ b/drivers/accel/qda/qda_drv.c
>> @@ -10,9 +10,11 @@
>> #include <drm/drm_gem.h>
>> #include <drm/drm_ioctl.h>
>> #include <drm/qda_accel.h>
>> +#include <drm/drm_prime.h>
>>
>> #include "qda_drv.h"
>> #include "qda_gem.h"
>> +#include "qda_prime.h"
>> #include "qda_ioctl.h"
>> #include "qda_rpmsg.h"
>>
>> @@ -166,6 +168,8 @@ static struct drm_driver qda_drm_driver = {
>> .postclose = qda_postclose,
>> .ioctls = qda_ioctls,
>> .num_ioctls = ARRAY_SIZE(qda_ioctls),
>> + .gem_prime_import = qda_gem_prime_import,
>> + .prime_fd_to_handle = qda_ioctl_prime_fd_to_handle,
>> .name = DRIVER_NAME,
>> .desc = "Qualcomm DSP Accelerator Driver",
>> };
>> @@ -174,6 +178,7 @@ static void cleanup_drm_private(struct qda_dev *qdev)
>> {
>> if (qdev->drm_priv) {
>> qda_dbg(qdev, "Cleaning up DRM private data\n");
>> + mutex_destroy(&qdev->drm_priv->import_lock);
>> kfree(qdev->drm_priv);
>> }
>> }
>> @@ -240,6 +245,9 @@ static int init_drm_private(struct qda_dev *qdev)
>> if (!qdev->drm_priv)
>> return -ENOMEM;
>>
>> + mutex_init(&qdev->drm_priv->import_lock);
>> + qdev->drm_priv->current_import_file_priv = NULL;
>> +
>> qda_dbg(qdev, "DRM private data initialized successfully\n");
>> return 0;
>> }
>> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
>> index 8a2cd474958b..bb0dd7e284c6 100644
>> --- a/drivers/accel/qda/qda_drv.h
>> +++ b/drivers/accel/qda/qda_drv.h
>> @@ -64,6 +64,10 @@ struct qda_drm_priv {
>> struct qda_memory_manager *iommu_mgr;
>> /* Back-pointer to qda_dev */
>> struct qda_dev *qdev;
>> + /* Lock protecting import context */
>> + struct mutex import_lock;
>> + /* Current file_priv during prime import */
>> + struct drm_file *current_import_file_priv;
>> };
>>
>> /* struct qda_dev - Main device structure for QDA driver */
>> diff --git a/drivers/accel/qda/qda_gem.c b/drivers/accel/qda/qda_gem.c
>> index bbd54e2502d3..37279e8b46fe 100644
>> --- a/drivers/accel/qda/qda_gem.c
>> +++ b/drivers/accel/qda/qda_gem.c
>> @@ -8,6 +8,7 @@
>> #include "qda_gem.h"
>> #include "qda_memory_manager.h"
>> #include "qda_memory_dma.h"
>> +#include "qda_prime.h"
>>
>> static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
>> {
>> @@ -15,23 +16,29 @@ static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
>> qda_err(NULL, "Invalid GEM object size\n");
>> return -EINVAL;
>> }
>> - if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
>> - qda_err(NULL, "Allocated buffer missing IOMMU device\n");
>> - return -EINVAL;
>> - }
>> - if (!qda_gem_obj->iommu_dev->dev) {
>> - qda_err(NULL, "Allocated buffer missing IOMMU device\n");
>> - return -EINVAL;
>> - }
>> - if (!qda_gem_obj->virt) {
>> - qda_err(NULL, "Allocated buffer missing virtual address\n");
>> - return -EINVAL;
>> - }
>> - if (qda_gem_obj->dma_addr == 0) {
>> - qda_err(NULL, "Allocated buffer missing DMA address\n");
>> - return -EINVAL;
>> + if (qda_gem_obj->is_imported) {
>> + if (!qda_gem_obj->sgt) {
>> + qda_err(NULL, "Imported buffer missing sgt\n");
>> + return -EINVAL;
>> + }
>> + if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
>> + qda_err(NULL, "Imported buffer missing IOMMU device\n");
>> + return -EINVAL;
>> + }
>> + } else {
>> + if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
>> + qda_err(NULL, "Allocated buffer missing IOMMU device\n");
>> + return -EINVAL;
>> + }
>> + if (!qda_gem_obj->virt) {
>> + qda_err(NULL, "Allocated buffer missing virtual address\n");
>> + return -EINVAL;
>> + }
>> + if (qda_gem_obj->dma_addr == 0) {
>> + qda_err(NULL, "Allocated buffer missing DMA address\n");
>> + return -EINVAL;
>> + }
>> }
>> -
>> return 0;
>> }
>>
>> @@ -60,9 +67,21 @@ void qda_gem_free_object(struct drm_gem_object *gem_obj)
>> struct qda_gem_obj *qda_gem_obj = to_qda_gem_obj(gem_obj);
>> struct qda_drm_priv *drm_priv = get_drm_priv_from_device(gem_obj->dev);
>>
>> - if (qda_gem_obj->virt) {
>> - if (drm_priv && drm_priv->iommu_mgr)
>> + if (qda_gem_obj->is_imported) {
>> + if (qda_gem_obj->attachment && qda_gem_obj->sgt)
>> + dma_buf_unmap_attachment_unlocked(qda_gem_obj->attachment,
>> + qda_gem_obj->sgt, DMA_BIDIRECTIONAL);
>> + if (qda_gem_obj->attachment)
>> + dma_buf_detach(qda_gem_obj->dma_buf, qda_gem_obj->attachment);
>> + if (qda_gem_obj->dma_buf)
>> + dma_buf_put(qda_gem_obj->dma_buf);
>> + if (qda_gem_obj->iommu_dev && drm_priv && drm_priv->iommu_mgr)
>> qda_memory_manager_free(drm_priv->iommu_mgr, qda_gem_obj);
>> + } else {
>> + if (qda_gem_obj->virt) {
>> + if (drm_priv && drm_priv->iommu_mgr)
>> + qda_memory_manager_free(drm_priv->iommu_mgr, qda_gem_obj);
>> + }
>> }
>>
>> drm_gem_object_release(gem_obj);
>> @@ -174,6 +193,11 @@ struct drm_gem_object *qda_gem_create_object(struct drm_device *drm_dev,
>> qda_gem_obj = qda_gem_alloc_object(drm_dev, aligned_size);
>> if (IS_ERR(qda_gem_obj))
>> return (struct drm_gem_object *)qda_gem_obj;
>> + qda_gem_obj->is_imported = false;
>> + qda_gem_obj->dma_buf = NULL;
>> + qda_gem_obj->attachment = NULL;
>> + qda_gem_obj->sgt = NULL;
>> + qda_gem_obj->imported_dma_addr = 0;
>>
>> ret = qda_memory_manager_alloc(iommu_mgr, qda_gem_obj, file_priv);
>> if (ret) {
>> diff --git a/drivers/accel/qda/qda_gem.h b/drivers/accel/qda/qda_gem.h
>> index cbd5d0a58fa4..3566c5b2ad88 100644
>> --- a/drivers/accel/qda/qda_gem.h
>> +++ b/drivers/accel/qda/qda_gem.h
>> @@ -31,6 +31,16 @@ struct qda_gem_obj {
>> size_t size;
>> /* IOMMU device that performed the allocation */
>> struct qda_iommu_device *iommu_dev;
>> + /* True if buffer is imported, false if allocated */
>> + bool is_imported;
>> + /* Reference to imported dma_buf */
>> + struct dma_buf *dma_buf;
>> + /* DMA buf attachment */
>> + struct dma_buf_attachment *attachment;
>> + /* Scatter-gather table */
>> + struct sg_table *sgt;
>> + /* DMA address of imported buffer */
>> + dma_addr_t imported_dma_addr;
>> };
>>
>> /*
>> diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
>> index ef3c9c691cb7..d91983048d6c 100644
>> --- a/drivers/accel/qda/qda_ioctl.c
>> +++ b/drivers/accel/qda/qda_ioctl.c
>> @@ -5,6 +5,7 @@
>> #include <drm/qda_accel.h>
>> #include "qda_drv.h"
>> #include "qda_ioctl.h"
>> +#include "qda_prime.h"
>>
>> static int qda_validate_and_get_context(struct drm_device *dev, struct drm_file *file_priv,
>> struct qda_dev **qdev, struct qda_user **qda_user)
>> @@ -78,3 +79,9 @@ int qda_ioctl_gem_mmap_offset(struct drm_device *dev, void *data, struct drm_fil
>> drm_gem_object_put(gem_obj);
>> return ret;
>> }
>> +
>> +int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv, int prime_fd,
>> + u32 *handle)
>> +{
>> + return qda_prime_fd_to_handle(dev, file_priv, prime_fd, handle);
>> +}
>> diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
>> index 6bf3bcd28c0e..d454256f5fc5 100644
>> --- a/drivers/accel/qda/qda_ioctl.h
>> +++ b/drivers/accel/qda/qda_ioctl.h
>> @@ -23,4 +23,19 @@
>> */
>> int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_priv);
>>
>> +/**
>> + * qda_ioctl_prime_fd_to_handle - IOCTL handler for PRIME FD to handle conversion
>> + * @dev: DRM device structure
>> + * @file_priv: DRM file private data
>> + * @prime_fd: File descriptor of the PRIME buffer
>> + * @handle: Output parameter for the GEM handle
>> + *
>> + * This IOCTL handler converts a PRIME file descriptor to a GEM handle.
>> + * It serves as both the DRM driver callback and can be used directly.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
>> + int prime_fd, u32 *handle);
>> +
>> #endif /* _QDA_IOCTL_H */
>> diff --git a/drivers/accel/qda/qda_memory_manager.c b/drivers/accel/qda/qda_memory_manager.c
>> index e225667557ee..3fd20f17c57b 100644
>> --- a/drivers/accel/qda/qda_memory_manager.c
>> +++ b/drivers/accel/qda/qda_memory_manager.c
>> @@ -154,8 +154,8 @@ static struct qda_iommu_device *get_process_iommu_device(struct qda_memory_manag
>> return qda_priv->assigned_iommu_dev;
>> }
>>
>> -static int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
>> - struct drm_file *file_priv)
>> +int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
>> + struct drm_file *file_priv)
>> {
>> struct qda_file_priv *qda_priv;
>> struct qda_iommu_device *selected_dev = NULL;
>> @@ -223,6 +223,35 @@ static struct qda_iommu_device *get_or_assign_iommu_device(struct qda_memory_man
>> return NULL;
>> }
>>
>> +static int qda_memory_manager_map_imported(struct qda_memory_manager *mem_mgr,
>> + struct qda_gem_obj *gem_obj,
>> + struct qda_iommu_device *iommu_dev)
>> +{
>> + struct scatterlist *sg;
>> + dma_addr_t dma_addr;
>> + int ret = 0;
>> +
>> + if (!gem_obj->is_imported || !gem_obj->sgt || !iommu_dev) {
>> + qda_err(NULL, "Invalid parameters for imported buffer mapping\n");
>> + return -EINVAL;
>> + }
>> +
>> + gem_obj->iommu_dev = iommu_dev;
>> +
>> + sg = gem_obj->sgt->sgl;
>> + if (sg) {
>> + dma_addr = sg_dma_address(sg);
>> + dma_addr += ((u64)iommu_dev->sid << 32);
>> +
>> + gem_obj->imported_dma_addr = dma_addr;
>> + } else {
>> + qda_err(NULL, "Invalid scatter-gather list for imported buffer\n");
>> + ret = -EINVAL;
>> + }
>> +
>> + return ret;
>> +}
>> +
>> int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_obj *gem_obj,
>> struct drm_file *file_priv)
>> {
>> @@ -248,7 +277,10 @@ int qda_memory_manager_alloc(struct qda_memory_manager *mem_mgr, struct qda_gem_
>> return -ENOMEM;
>> }
>>
>> - ret = qda_dma_alloc(selected_dev, gem_obj, size);
>> + if (gem_obj->is_imported)
>> + ret = qda_memory_manager_map_imported(mem_mgr, gem_obj, selected_dev);
>> + else
>> + ret = qda_dma_alloc(selected_dev, gem_obj, size);
>>
>> if (ret) {
>> qda_err(NULL, "Allocation failed: size=%zu, device_id=%u, ret=%d\n",
>> @@ -268,6 +300,10 @@ void qda_memory_manager_free(struct qda_memory_manager *mem_mgr, struct qda_gem_
>> return;
>> }
>>
>> + if (gem_obj->is_imported) {
>> + qda_dbg(NULL, "Freed imported buffer tracking (no DMA free needed)\n");
>> + return;
>> + }
>> qda_dma_free(gem_obj);
>> }
>>
>> diff --git a/drivers/accel/qda/qda_memory_manager.h b/drivers/accel/qda/qda_memory_manager.h
>> index bac44284ef98..f6c7963cec42 100644
>> --- a/drivers/accel/qda/qda_memory_manager.h
>> +++ b/drivers/accel/qda/qda_memory_manager.h
>> @@ -106,6 +106,20 @@ int qda_memory_manager_register_device(struct qda_memory_manager *mem_mgr,
>> void qda_memory_manager_unregister_device(struct qda_memory_manager *mem_mgr,
>> struct qda_iommu_device *iommu_dev);
>>
>> +/**
>> + * qda_memory_manager_assign_device() - Assign an IOMMU device to a process
>> + * @mem_mgr: Pointer to memory manager
>> + * @file_priv: DRM file private data for process association
>> + *
>> + * Assigns an IOMMU device to the calling process. If the process already has
>> + * a device assigned, returns success. If another file descriptor from the same
>> + * PID has a device, reuses it. Otherwise, finds an available device and assigns it.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int qda_memory_manager_assign_device(struct qda_memory_manager *mem_mgr,
>> + struct drm_file *file_priv);
>> +
>> /**
>> * qda_memory_manager_alloc() - Allocate memory for a GEM object
>> * @mem_mgr: Pointer to memory manager
>> diff --git a/drivers/accel/qda/qda_prime.c b/drivers/accel/qda/qda_prime.c
>> new file mode 100644
>> index 000000000000..3d23842e48bb
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_prime.c
>> @@ -0,0 +1,194 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> +#include <drm/drm_gem.h>
>> +#include <drm/drm_prime.h>
>> +#include <linux/slab.h>
>> +#include <linux/dma-mapping.h>
>> +#include "qda_drv.h"
>> +#include "qda_gem.h"
>> +#include "qda_prime.h"
>> +#include "qda_memory_manager.h"
>> +
>> +static struct drm_gem_object *check_own_buffer(struct drm_device *dev, struct dma_buf *dma_buf)
>> +{
>> + if (dma_buf->priv) {
>> + struct drm_gem_object *existing_gem = dma_buf->priv;
> Randomly looking at your driver — you’ve broken the dma-buf cross-driver
> contract here. How do you know dma_buf->priv is a struct drm_gem_object?
> You don’t, because that is assigned by the exporter, and userspace could
> pass in a dma-buf from another device and blow up your driver.
>
> I think you just want to call drm_gem_is_prime_exported_dma_buf() here
> before doing anything.
>
> The rest of this dma-buf code also looks highly questionable. I’d study
> how other drivers implement their dma-buf paths and use those as a
> reference to improve yours.
>
> Matt
I had this concern while developing this patch but I was not able to find the right way
to handle this. I'll look into drm_gem_is_prime_exported_dma_buf() and see if it fits
here. For the rest of the dma-buf, the mapping part is something where I could not see my
requirements on any other driver(mapping to iommu device), so I might be trying to
implement something new here. That being said, I'll go through some more drivers and
check if my dma-buf could be improved.
Thanks for the review and your suggestion, Matt.
>
>> +
>> + if (existing_gem->dev == dev) {
>> + struct qda_gem_obj *existing_qda_gem = to_qda_gem_obj(existing_gem);
>> +
>> + if (!existing_qda_gem->is_imported) {
>> + drm_gem_object_get(existing_gem);
>> + return existing_gem;
>> + }
>> + }
>> + }
>> + return NULL;
>> +}
>> +
>> +static struct qda_iommu_device *get_iommu_device_for_import(struct qda_drm_priv *drm_priv,
>> + struct drm_file **file_priv_out,
>> + struct qda_dev *qdev)
>> +{
>> + struct drm_file *file_priv;
>> + struct qda_file_priv *qda_file_priv;
>> + struct qda_iommu_device *iommu_dev = NULL;
>> + int ret;
>> +
>> + file_priv = drm_priv->current_import_file_priv;
>> + *file_priv_out = file_priv;
>> +
>> + if (!file_priv || !file_priv->driver_priv)
>> + return NULL;
>> +
>> + qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv;
>> + iommu_dev = qda_file_priv->assigned_iommu_dev;
>> +
>> + if (!iommu_dev) {
>> + ret = qda_memory_manager_assign_device(drm_priv->iommu_mgr, file_priv);
>> + if (ret) {
>> + qda_err(qdev, "Failed to assign IOMMU device: %d\n", ret);
>> + return NULL;
>> + }
>> +
>> + iommu_dev = qda_file_priv->assigned_iommu_dev;
>> + }
>> +
>> + return iommu_dev;
>> +}
>> +
>> +static int setup_dma_buf_mapping(struct qda_gem_obj *qda_gem_obj, struct dma_buf *dma_buf,
>> + struct device *attach_dev, struct qda_dev *qdev)
>> +{
>> + struct dma_buf_attachment *attachment;
>> + struct sg_table *sgt;
>> + int ret;
>> +
>> + attachment = dma_buf_attach(dma_buf, attach_dev);
>> + if (IS_ERR(attachment)) {
>> + ret = PTR_ERR(attachment);
>> + qda_err(qdev, "Failed to attach dma_buf: %d\n", ret);
>> + return ret;
>> + }
>> + qda_gem_obj->attachment = attachment;
>> +
>> + sgt = dma_buf_map_attachment_unlocked(attachment, DMA_BIDIRECTIONAL);
>> + if (IS_ERR(sgt)) {
>> + ret = PTR_ERR(sgt);
>> + qda_err(qdev, "Failed to map dma_buf attachment: %d\n", ret);
>> + dma_buf_detach(dma_buf, attachment);
>> + return ret;
>> + }
>> + qda_gem_obj->sgt = sgt;
>> +
>> + return 0;
>> +}
>> +
>> +struct drm_gem_object *qda_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf)
>> +{
>> + struct qda_drm_priv *drm_priv;
>> + struct qda_gem_obj *qda_gem_obj;
>> + struct drm_file *file_priv;
>> + struct qda_iommu_device *iommu_dev;
>> + struct qda_dev *qdev;
>> + struct drm_gem_object *existing_gem;
>> + size_t aligned_size;
>> + int ret;
>> +
>> + drm_priv = get_drm_priv_from_device(dev);
>> + if (!drm_priv || !drm_priv->iommu_mgr) {
>> + qda_err(NULL, "Invalid drm_priv or iommu_mgr\n");
>> + return ERR_PTR(-EINVAL);
>> + }
>> +
>> + qdev = drm_priv->qdev;
>> +
>> + existing_gem = check_own_buffer(dev, dma_buf);
>> + if (existing_gem)
>> + return existing_gem;
>> +
>> + iommu_dev = get_iommu_device_for_import(drm_priv, &file_priv, qdev);
>> + if (!iommu_dev || !iommu_dev->dev) {
>> + qda_err(qdev, "No IOMMU device assigned for prime import\n");
>> + return ERR_PTR(-ENODEV);
>> + }
>> +
>> + qda_dbg(qdev, "Using IOMMU device %u for prime import\n", iommu_dev->id);
>> +
>> + aligned_size = PAGE_ALIGN(dma_buf->size);
>> + qda_gem_obj = qda_gem_alloc_object(dev, aligned_size);
>> + if (IS_ERR(qda_gem_obj))
>> + return (struct drm_gem_object *)qda_gem_obj;
>> +
>> + qda_gem_obj->is_imported = true;
>> + qda_gem_obj->dma_buf = dma_buf;
>> + qda_gem_obj->virt = NULL;
>> + qda_gem_obj->dma_addr = 0;
>> + qda_gem_obj->imported_dma_addr = 0;
>> + qda_gem_obj->iommu_dev = iommu_dev;
>> +
>> + get_dma_buf(dma_buf);
>> +
>> + ret = setup_dma_buf_mapping(qda_gem_obj, dma_buf, iommu_dev->dev, qdev);
>> + if (ret)
>> + goto err_put_dma_buf;
>> +
>> + ret = qda_memory_manager_alloc(drm_priv->iommu_mgr, qda_gem_obj, file_priv);
>> + if (ret) {
>> + qda_err(qdev, "Failed to allocate IOMMU mapping: %d\n", ret);
>> + goto err_unmap;
>> + }
>> +
>> + qda_dbg(qdev, "Prime import completed successfully size=%zu\n", aligned_size);
>> + return &qda_gem_obj->base;
>> +
>> +err_unmap:
>> + dma_buf_unmap_attachment_unlocked(qda_gem_obj->attachment,
>> + qda_gem_obj->sgt, DMA_BIDIRECTIONAL);
>> + dma_buf_detach(dma_buf, qda_gem_obj->attachment);
>> +err_put_dma_buf:
>> + dma_buf_put(dma_buf);
>> + qda_gem_cleanup_object(qda_gem_obj);
>> + return ERR_PTR(ret);
>> +}
>> +
>> +int qda_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
>> + int prime_fd, u32 *handle)
>> +{
>> + struct qda_drm_priv *drm_priv;
>> + struct qda_dev *qdev;
>> + int ret;
>> +
>> + drm_priv = get_drm_priv_from_device(dev);
>> + if (!drm_priv) {
>> + qda_dbg(NULL, "Failed to get drm_priv from device\n");
>> + return -EINVAL;
>> + }
>> +
>> + qdev = drm_priv->qdev;
>> +
>> + if (file_priv && file_priv->driver_priv) {
>> + struct qda_file_priv *qda_file_priv;
>> +
>> + qda_file_priv = (struct qda_file_priv *)file_priv->driver_priv;
>> + } else {
>> + qda_dbg(qdev, "Called with NULL file_priv or driver_priv\n");
>> + }
>> +
>> + mutex_lock(&drm_priv->import_lock);
>> + drm_priv->current_import_file_priv = file_priv;
>> +
>> + ret = drm_gem_prime_fd_to_handle(dev, file_priv, prime_fd, handle);
>> +
>> + drm_priv->current_import_file_priv = NULL;
>> + mutex_unlock(&drm_priv->import_lock);
>> +
>> + if (!ret)
>> + qda_dbg(qdev, "Completed with ret=%d, handle=%u\n", ret, *handle);
>> + else
>> + qda_dbg(qdev, "Completed with ret=%d\n", ret);
>> +
>> + return ret;
>> +}
>> +
>> +MODULE_IMPORT_NS("DMA_BUF");
>> diff --git a/drivers/accel/qda/qda_prime.h b/drivers/accel/qda/qda_prime.h
>> new file mode 100644
>> index 000000000000..939902454dcd
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_prime.h
>> @@ -0,0 +1,43 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> + */
>> +
>> +#ifndef _QDA_PRIME_H
>> +#define _QDA_PRIME_H
>> +
>> +#include <drm/drm_device.h>
>> +#include <drm/drm_file.h>
>> +#include <drm/drm_gem.h>
>> +#include <linux/dma-buf.h>
>> +
>> +/**
>> + * qda_gem_prime_import - Import a DMA-BUF as a GEM object
>> + * @dev: DRM device structure
>> + * @dma_buf: DMA-BUF to import
>> + *
>> + * This function imports an external DMA-BUF into the QDA driver as a GEM
>> + * object. It handles both re-imports of buffers originally from this driver
>> + * and imports of external buffers from other drivers.
>> + *
>> + * Return: Pointer to the imported GEM object on success, ERR_PTR on failure
>> + */
>> +struct drm_gem_object *qda_gem_prime_import(struct drm_device *dev, struct dma_buf *dma_buf);
>> +
>> +/**
>> + * qda_prime_fd_to_handle - Core implementation for PRIME FD to GEM handle conversion
>> + * @dev: DRM device structure
>> + * @file_priv: DRM file private data
>> + * @prime_fd: File descriptor of the PRIME buffer
>> + * @handle: Output parameter for the GEM handle
>> + *
>> + * This core function sets up the necessary context before calling the
>> + * DRM framework's prime FD to handle conversion. It ensures proper IOMMU
>> + * device assignment and tracking for the import operation.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int qda_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
>> + int prime_fd, u32 *handle);
>> +
>> +#endif /* _QDA_PRIME_H */
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (21 preceding siblings ...)
2026-02-25 13:42 ` Bryan O'Donoghue
@ 2026-03-02 15:57 ` Srinivas Kandagatla
2026-03-09 8:07 ` Ekansh Gupta
23 siblings, 0 replies; 83+ messages in thread
From: Srinivas Kandagatla @ 2026-03-02 15:57 UTC (permalink / raw)
To: Ekansh Gupta, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal, Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Dmitry Baryshkov, Bharath Kumar,
Chenna Kesava Raju
On 2/23/26 7:08 PM, Ekansh Gupta wrote:
Thanks Ekansh for this this one out.
> Key Features
> ============
>
> * Standard DRM accelerator interface via /dev/accel/accelN> * GEM-based buffer management with DMA-BUF import/export support
> * IOMMU-based memory isolation using per-process context banks
> * FastRPC protocol implementation for DSP communication
> * RPMsg transport layer for reliable message passing
> * Support for all DSP domains (ADSP, CDSP, SDSP, GDSP)
To what extent is this support expected ?
> * Comprehensive IOCTL interface for DSP operations
>
> High-Level Architecture Differences with Existing FastRPC Driver
> =================================================================
>
> The QDA driver represents a significant architectural departure from the
> existing FastRPC driver (drivers/misc/fastrpc.c), addressing several key
> limitations while maintaining protocol compatibility:
>
> 3. IOMMU Context Bank Management
>
>
> 9. UAPI Design
> - FastRPC: Custom IOCTL interface
> - QDA: DRM-style IOCTLs with proper versioning support
> - Benefit: Follows DRM conventions, easier userspace integration
Can you elaborate this.
Are we really getting leverage from any of the standard libraries that
are available for drm accel?
In general I would like to understand how standardization of this kernel
driver is helping userspace side of things.
Does this mean that there will be no libfastrpc requirements in future?
If that is not the case then I see no point.
>
> Open Items
> ===========
>
> The following items are identified as open items:
>
> 1. Privilege Level Management
> - Currently, daemon processes and user processes have the same access
> level as both use the same accel device node. This needs to be
> addressed as daemons attach to privileged DSP PDs and require
> higher privilege levels for system-level operations
> - Seeking guidance on the best approach: separate device nodes,
> capability-based checks, or DRM master/authentication mechanisms
>
> 2. UAPI Compatibility Layer
Simple rule! you can not break anything that is already working with
existing UAPI.
> - Add UAPI compat layer to facilitate migration of client applications
> from existing FastRPC UAPI to the new QDA accel driver UAPI,
> ensuring smooth transition for existing userspace code
What will happen to long term supported devices?
> - Seeking guidance on implementation approach: in-kernel translation
> layer, userspace wrapper library, or hybrid solution
>
> 3. Documentation Improvements
> - Add detailed IOCTL usage examples
> - Document DSP firmware interface requirements
> - Create migration guide from existing FastRPC
>
> 4. Per-Domain Memory Allocation
> - Develop new userspace API to support memory allocation on a per
> domain basis, enabling domain-specific memory management and
> optimization
>
> 5. Audio and Sensors PD Support
> - The current patch series does not handle Audio PD and Sensors PD
> functionalities. These specialized protection domains require
> additional support for real-time constraints and power management
Please elaborate, fastrpc support is incomplete without audiopd support.
--srini
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices
2026-03-02 8:15 ` Ekansh Gupta
@ 2026-03-04 4:22 ` Dmitry Baryshkov
0 siblings, 0 replies; 83+ messages in thread
From: Dmitry Baryshkov @ 2026-03-04 4:22 UTC (permalink / raw)
To: Ekansh Gupta
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On Mon, Mar 02, 2026 at 01:45:09PM +0530, Ekansh Gupta wrote:
>
>
> On 2/24/2026 4:20 AM, Dmitry Baryshkov wrote:
> > On Tue, Feb 24, 2026 at 12:39:00AM +0530, Ekansh Gupta wrote:
> >> Introduce a per-device memory manager for the QDA driver that tracks
> >> IOMMU-capable compute context-bank (CB) devices. Each CB device is
> >> represented by a qda_iommu_device and registered with a central
> >> qda_memory_manager instance owned by qda_dev.
> >>
> >> The memory manager maintains an xarray of devices and assigns a
> >> unique ID to each CB. It also provides basic lifetime management
> > Sounds like IDR.
> I was planning to stick with xarray accross QDA as IDR gives checkpatch warnings.
Ack.
> >
> >> and a workqueue for deferred device removal. qda_cb_setup_device()
> > What is deferred device removal? Why do you need it?
> This is not needed, I was trying some experiment in my initial design(CB aggregation),
> but it's not needed now, I'll remove this.
Ack
> >
> >> now allocates a qda_iommu_device for each CB and registers it with
> >> the memory manager after DMA configuration succeeds.
> >>
> >> qda_init_device() is extended to allocate and initialize the memory
> >> manager, while qda_deinit_device() will tear it down in later
> >> patches. This prepares the QDA driver for fine-grained memory and
> >> IOMMU domain management tied to individual CB devices.
> >>
> >> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> >> ---
> >> drivers/accel/qda/Makefile | 1 +
> >> drivers/accel/qda/qda_cb.c | 32 +++++++
> >> drivers/accel/qda/qda_drv.c | 46 ++++++++++
> >> drivers/accel/qda/qda_drv.h | 3 +
> >> drivers/accel/qda/qda_memory_manager.c | 152 +++++++++++++++++++++++++++++++++
> >> drivers/accel/qda/qda_memory_manager.h | 101 ++++++++++++++++++++++
> >> 6 files changed, 335 insertions(+)
> >>
>
--
With best wishes
Dmitry
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 13/18] accel/qda: Add initial FastRPC attach and release support
2026-02-23 23:07 ` Dmitry Baryshkov
@ 2026-03-09 6:50 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-09 6:50 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 4:37 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:39:07AM +0530, Ekansh Gupta wrote:
>> Add the initial FastRPC invocation plumbing to the QDA accelerator
>> driver to support attaching to and releasing a DSP process. A new
>> fastrpc_invoke_context structure tracks the state of a single remote
> So, why does it embed kref?
I'll remove kref from ctx.
>
>> procedure call, including arguments, overlap handling, completion and
>> GEM-based message buffers. Contexts are indexed through an xarray in
>> qda_dev so that RPMsg callbacks can match responses back to the
>> originating invocation.
> Again, IDR? Or not?
Same comment as other patches
>
>> The new qda_fastrpc implementation provides helpers to prepare
>> FastRPC scalars and arguments, pack them into a QDA message backed by
>> a GEM buffer and unpack responses. The FastRPC INIT_ATTACH and
>> INIT_RELEASE methods are wired up via a new QDA_INIT_ATTACH ioctl and
>> a postclose hook that sends a release request when a client file
>> descriptor is closed. On the transport side qda_rpmsg_send_msg()
>> builds and sends a fastrpc_msg over RPMsg, while qda_rpmsg_cb()
>> decodes qda_invoke_rsp messages, looks up the context by its id and
>> completes the corresponding wait.
>>
>> This lays the foundation for QDA FastRPC method support on top of the
>> existing GEM and RPMsg infrastructure, starting with the attach and
>> release control flows for DSP sessions.
> I think the FastRPC backing code should be a separate commit,
> INIT_ATTACH another, separate commit.
ack
>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/Makefile | 1 +
>> drivers/accel/qda/qda_drv.c | 5 +
>> drivers/accel/qda/qda_drv.h | 2 +
>> drivers/accel/qda/qda_fastrpc.c | 548 ++++++++++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_fastrpc.h | 303 ++++++++++++++++++++++
>> drivers/accel/qda/qda_ioctl.c | 107 ++++++++
>> drivers/accel/qda/qda_ioctl.h | 25 ++
>> drivers/accel/qda/qda_rpmsg.c | 164 +++++++++++-
>> drivers/accel/qda/qda_rpmsg.h | 40 +++
>> include/uapi/drm/qda_accel.h | 19 ++
>> 10 files changed, 1212 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
>> index ed24a7f5637e..4d3666c5b998 100644
>> --- a/include/uapi/drm/qda_accel.h
>> +++ b/include/uapi/drm/qda_accel.h
> [moved this file to the beginning of the patch to ease reviewing]
ack.
>
>> @@ -21,6 +21,7 @@ extern "C" {
>> #define DRM_QDA_QUERY 0x00
>> #define DRM_QDA_GEM_CREATE 0x01
>> #define DRM_QDA_GEM_MMAP_OFFSET 0x02
>> +#define DRM_QDA_INIT_ATTACH 0x03
>> /*
>> * QDA IOCTL definitions
>> *
>> @@ -33,6 +34,7 @@ extern "C" {
>> struct drm_qda_gem_create)
>> #define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
>> struct drm_qda_gem_mmap_offset)
>> +#define DRM_IOCTL_QDA_INIT_ATTACH DRM_IO(DRM_COMMAND_BASE + DRM_QDA_INIT_ATTACH)
>>
>> /**
>> * struct drm_qda_query - Device information query structure
>> @@ -76,6 +78,23 @@ struct drm_qda_gem_mmap_offset {
>> __u64 offset;
>> };
>>
>> +/**
>> + * struct fastrpc_invoke_args - FastRPC invocation argument descriptor
>> + * @ptr: Pointer to argument data (user virtual address)
>> + * @length: Length of the argument data in bytes
> And the data is defined... where?
>
>> + * @fd: File descriptor for buffer arguments, -1 for scalar arguments
>> + * @attr: Argument attributes and flags
> Which attributes and flags?
this struct is taken similar to the existing fastrpc uAPI. I'll add more details.
>
>> + *
>> + * This structure describes a single argument passed to a FastRPC invocation.
>> + * Arguments can be either scalar values or buffer references (via file descriptor).
> Can't it just be GEM handle + offset inside the handle?
Yes, fd is actually getting replaced with GEM handle.
>
>> + */
>> +struct fastrpc_invoke_args {
>> + __u64 ptr;
>> + __u64 length;
>> + __s32 fd;
>> + __u32 attr;
>> +};
>> +
>> #if defined(__cplusplus)
>> }
>> #endif
>>
>> diff --git a/drivers/accel/qda/Makefile b/drivers/accel/qda/Makefile
>> index 8286f5279748..82d40e452fa9 100644
>> --- a/drivers/accel/qda/Makefile
>> +++ b/drivers/accel/qda/Makefile
>> @@ -14,5 +14,6 @@ qda-y := \
>> qda_gem.o \
>> qda_memory_dma.o \
>> qda_prime.o \
>> + qda_fastrpc.o \
>>
>> obj-$(CONFIG_DRM_ACCEL_QDA_COMPUTE_BUS) += qda_compute_bus.o
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
>> index 4adee00b1f2c..3034ea660924 100644
>> --- a/drivers/accel/qda/qda_drv.c
>> +++ b/drivers/accel/qda/qda_drv.c
>> @@ -120,6 +120,8 @@ static void qda_postclose(struct drm_device *dev, struct drm_file *file)
>> return;
>> }
>>
>> + fastrpc_release_current_dsp_process(qdev, file);
> No, this is not the fastrpc driver.
ack.
>
>> +
>> qda_file_priv = (struct qda_file_priv *)file->driver_priv;
>> if (qda_file_priv) {
>> if (qda_file_priv->assigned_iommu_dev) {
>> @@ -159,6 +161,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
>> DRM_IOCTL_DEF_DRV(QDA_QUERY, qda_ioctl_query, 0),
>> DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
>> DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
>> + DRM_IOCTL_DEF_DRV(QDA_INIT_ATTACH, qda_ioctl_attach, 0),
>> };
>>
>> static struct drm_driver qda_drm_driver = {
>> @@ -195,6 +198,7 @@ static void cleanup_iommu_manager(struct qda_dev *qdev)
>>
>> static void cleanup_device_resources(struct qda_dev *qdev)
>> {
>> + xa_destroy(&qdev->ctx_xa);
> I thought xarray was in some other patch. What is this ctx_xa?
ctx_xa is for ctxid allocations.
>
>> mutex_destroy(&qdev->lock);
>> }
>>
>> @@ -213,6 +217,7 @@ static void init_device_resources(struct qda_dev *qdev)
>> mutex_init(&qdev->lock);
>> atomic_set(&qdev->removing, 0);
>> atomic_set(&qdev->client_id_counter, 0);
>> + xa_init_flags(&qdev->ctx_xa, XA_FLAGS_ALLOC1);
>> }
>>
>> static int init_memory_manager(struct qda_dev *qdev)
>> diff --git a/drivers/accel/qda/qda_drv.h b/drivers/accel/qda/qda_drv.h
>> index bb0dd7e284c6..bb1d1e82036a 100644
>> --- a/drivers/accel/qda/qda_drv.h
>> +++ b/drivers/accel/qda/qda_drv.h
>> @@ -92,6 +92,8 @@ struct qda_dev {
>> char dsp_name[16];
>> /* Compute context-bank (CB) child devices */
>> struct list_head cb_devs;
>> + /* XArray for context management */
>> + struct xarray ctx_xa;
>> };
>>
>> /**
>> diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
>> new file mode 100644
>> index 000000000000..eda7c90070ee
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_fastrpc.c
>> @@ -0,0 +1,548 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +// Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> +#include <linux/slab.h>
>> +#include <linux/uaccess.h>
>> +#include <linux/sort.h>
>> +#include <linux/completion.h>
>> +#include <linux/dma-buf.h>
>> +#include <drm/drm_gem.h>
>> +#include <drm/qda_accel.h>
>> +#include "qda_fastrpc.h"
>> +#include "qda_drv.h"
>> +#include "qda_gem.h"
>> +#include "qda_memory_manager.h"
>> +
>> +static int copy_to_user_or_kernel(void __user *dst, const void *src, size_t size)
>> +{
>> + if ((unsigned long)dst >= PAGE_OFFSET) {
>> + memcpy(dst, src, size);
>> + return 0;
>> + } else {
>> + return copy_to_user(dst, src, size) ? -EFAULT : 0;
> Huh?
Can you please tell me what is wrong here? Should I drop else case completely if
DRM is ensuring kernel pointer?
>
>> + }
>> +}
>> +
>> +static int get_gem_obj_from_handle(struct drm_file *file_priv, u32 handle,
>> + struct drm_gem_object **gem_obj)
>> +{
>> + if (handle == 0)
>> + return -EINVAL;
> Let the system do its job.
ack
>
>> +
>> + if (!file_priv)
>> + return -EINVAL;
> Can it be NULL?
I'll re-evaluate and remove this check.
>
>> +
>> + *gem_obj = drm_gem_object_lookup(file_priv, handle);
>> + if (*gem_obj)
>> + return 0;
>> +
>> + return -ENOENT;
>> +}
>> +
>> +static void setup_pages_from_gem_obj(struct qda_gem_obj *qda_gem_obj,
>> + struct fastrpc_phy_page *pages)
>> +{
>> + if (qda_gem_obj->is_imported)
>> + pages->addr = qda_gem_obj->imported_dma_addr;
>> + else
>> + pages->addr = qda_gem_obj->dma_addr;
> Why do you need tow kinds of addresses?
not really needed as I have already added a flag suggesting imported buffer, I'll fix this
>
>> +
>> + pages->size = qda_gem_obj->size;
>> +}
>> +
>> +static u64 calculate_vma_offset(u64 user_ptr)
>> +{
>> + struct vm_area_struct *vma;
>> + u64 user_ptr_page_mask = user_ptr & PAGE_MASK;
>> + u64 vma_offset = 0;
>> +
>> + mmap_read_lock(current->mm);
>> + vma = find_vma(current->mm, user_ptr);
>> + if (vma)
>> + vma_offset = user_ptr_page_mask - vma->vm_start;
>> + mmap_read_unlock(current->mm);
>> +
>> + return vma_offset;
>> +}
>> +
>> +static u64 calculate_page_aligned_size(u64 ptr, u64 len)
>> +{
>> + u64 pg_start = (ptr & PAGE_MASK) >> PAGE_SHIFT;
>> + u64 pg_end = ((ptr + len - 1) & PAGE_MASK) >> PAGE_SHIFT;
>> + u64 aligned_size = (pg_end - pg_start + 1) * PAGE_SIZE;
>> +
>> + return aligned_size;
>> +}
>> +
>> +static void setup_single_arg(struct fastrpc_invoke_args *args, void *ptr, size_t size)
>> +{
>> + args[0].ptr = (u64)(uintptr_t)ptr;
> What kind of address is it? If ptr is on the DSP side, then it should
> not be void* here.
Not a DSP side ptr. It's pointer to different arguments that are being passed to DSP.
>
>> + args[0].length = size;
>> + args[0].fd = -1;
>> +}
>> +
>> +static struct fastrpc_invoke_buf *fastrpc_invoke_buf_start(union fastrpc_remote_arg *pra, int len)
>> +{
>> + struct fastrpc_invoke_buf *buf = (struct fastrpc_invoke_buf *)(&pra[len]);
>> + return buf;
>> +}
>> +
>> +static struct fastrpc_phy_page *fastrpc_phy_page_start(struct fastrpc_invoke_buf *buf, int len)
>> +{
>> + struct fastrpc_phy_page *pages = (struct fastrpc_phy_page *)(&buf[len]);
>> + return pages;
>> +}
>> +
>> +static int fastrpc_get_meta_size(struct fastrpc_invoke_context *ctx)
>> +{
>> + int size = 0;
>> +
>> + size = (sizeof(struct fastrpc_remote_buf) +
>> + sizeof(struct fastrpc_invoke_buf) +
>> + sizeof(struct fastrpc_phy_page)) * ctx->nscalars +
>> + sizeof(u64) * FASTRPC_MAX_FDLIST +
>> + sizeof(u32) * FASTRPC_MAX_CRCLIST;
>> +
>> + return size;
>> +}
>> +
>> +static u64 fastrpc_get_payload_size(struct fastrpc_invoke_context *ctx, int metalen)
>> +{
>> + u64 size = 0;
>> + int oix;
>> +
>> + size = ALIGN(metalen, FASTRPC_ALIGN);
>> +
>> + for (oix = 0; oix < ctx->nbufs; oix++) {
>> + int i = ctx->olaps[oix].raix;
> whts olaps?
>
> Why do you need to specially track it?
olaps are the buffer overlapping details
this is for copy buffers(non-GEM buffers), which gets copied to the meta-data
>
>
>> +
>> + if (ctx->args[i].fd == 0 || ctx->args[i].fd == -1) {
>> + if (ctx->olaps[oix].offset == 0)
>> + size = ALIGN(size, FASTRPC_ALIGN);
>> +
>> + size += (ctx->olaps[oix].mend - ctx->olaps[oix].mstart);
>> + }
>> + }
>> +
>> + return size;
>> +}
>> +
>> +void fastrpc_context_free(struct kref *ref)
>> +{
>> + struct fastrpc_invoke_context *ctx;
>> + int i;
>> +
>> + ctx = container_of(ref, struct fastrpc_invoke_context, refcount);
>> + if (ctx->gem_objs) {
>> + for (i = 0; i < ctx->nscalars; ++i) {
>> + if (ctx->gem_objs[i]) {
>> + drm_gem_object_put(ctx->gem_objs[i]);
>> + ctx->gem_objs[i] = NULL;
>> + }
>> + }
>> + kfree(ctx->gem_objs);
>> + ctx->gem_objs = NULL;
> You are going to kfree ctx. Why do you need to zero the field?
ack
>
>> + }
>> +
>> + if (ctx->msg_gem_obj) {
>> + drm_gem_object_put(&ctx->msg_gem_obj->base);
>> + ctx->msg_gem_obj = NULL;
>> + }
>> +
>> + kfree(ctx->olaps);
>> + ctx->olaps = NULL;
>> +
>> + kfree(ctx->args);
>> + kfree(ctx->req);
>> + kfree(ctx->rsp);
>> + kfree(ctx->input_pages);
>> + kfree(ctx->inbuf);
> Generally it feels like there are too many allocations and frees for a
> single RPC call. Can all these buffers be embedded into the context
> instead?
I'll check this.
>
>> +
>> + kfree(ctx);
>> +}
>> +
>> +#define CMP(aa, bb) ((aa) == (bb) ? 0 : (aa) < (bb) ? -1 : 1)
>
>
>> +
>> +static int olaps_cmp(const void *a, const void *b)
>> +{
>> + struct fastrpc_buf_overlap *pa = (struct fastrpc_buf_overlap *)a;
>> + struct fastrpc_buf_overlap *pb = (struct fastrpc_buf_overlap *)b;
>> + int st = CMP(pa->start, pb->start);
>> + int ed = CMP(pb->end, pa->end);
>> +
>> + return st == 0 ? ed : st;
> wist?
the sorting logic is taken from fastrpc.
>
>> +}
>> +
>> +static void fastrpc_get_buff_overlaps(struct fastrpc_invoke_context *ctx)
>> +{
>> + u64 max_end = 0;
>> + int i;
>> +
>> + for (i = 0; i < ctx->nbufs; ++i) {
>> + ctx->olaps[i].start = ctx->args[i].ptr;
>> + ctx->olaps[i].end = ctx->olaps[i].start + ctx->args[i].length;
>> + ctx->olaps[i].raix = i;
>> + }
>> +
>> + sort(ctx->olaps, ctx->nbufs, sizeof(*ctx->olaps), olaps_cmp, NULL);
>> +
>> + for (i = 0; i < ctx->nbufs; ++i) {
>> + if (ctx->olaps[i].start < max_end) {
>> + ctx->olaps[i].mstart = max_end;
>> + ctx->olaps[i].mend = ctx->olaps[i].end;
>> + ctx->olaps[i].offset = max_end - ctx->olaps[i].start;
>> +
>> + if (ctx->olaps[i].end > max_end) {
>> + max_end = ctx->olaps[i].end;
>> + } else {
>> + ctx->olaps[i].mend = 0;
>> + ctx->olaps[i].mstart = 0;
>> + }
>> + } else {
>> + ctx->olaps[i].mend = ctx->olaps[i].end;
>> + ctx->olaps[i].mstart = ctx->olaps[i].start;
>> + ctx->olaps[i].offset = 0;
>> + max_end = ctx->olaps[i].end;
>> + }
>> + }
>> +}
>> +
>> +struct fastrpc_invoke_context *fastrpc_context_alloc(void)
>> +{
>> + struct fastrpc_invoke_context *ctx = NULL;
>> +
>> + ctx = kzalloc_obj(*ctx, GFP_KERNEL);
>> + if (!ctx)
>> + return ERR_PTR(-ENOMEM);
>> +
>> + INIT_LIST_HEAD(&ctx->node);
>> +
>> + ctx->retval = -1;
>> + ctx->pid = current->pid;
>> + init_completion(&ctx->work);
>> + ctx->msg_gem_obj = NULL;
>> + kref_init(&ctx->refcount);
>> +
>> + return ctx;
>> +}
>> +
>> +static int process_fd_buffer(struct fastrpc_invoke_context *ctx, int i,
>> + union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages)
>> +{
>> + struct drm_gem_object *gem_obj;
>> + struct qda_gem_obj *qda_gem_obj;
>> + int err;
>> + u64 len = ctx->args[i].length;
>> + u64 vma_offset;
>> +
>> + err = get_gem_obj_from_handle(ctx->file_priv, ctx->args[i].fd, &gem_obj);
>> + if (err)
>> + return err;
>> +
>> + ctx->gem_objs[i] = gem_obj;
>> + qda_gem_obj = to_qda_gem_obj(gem_obj);
>> +
>> + rpra[i].buf.pv = (u64)ctx->args[i].ptr;
>> +
>> + if (qda_gem_obj->is_imported)
>> + pages[i].addr = qda_gem_obj->imported_dma_addr;
>> + else
>> + pages[i].addr = qda_gem_obj->dma_addr;
>> +
>> + vma_offset = calculate_vma_offset(ctx->args[i].ptr);
>> + pages[i].addr += vma_offset;
>> + pages[i].size = calculate_page_aligned_size(ctx->args[i].ptr, len);
>> +
>> + return 0;
>> +}
>> +
>> +static int process_direct_buffer(struct fastrpc_invoke_context *ctx, int i, int oix,
>> + union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages,
>> + uintptr_t *args, u64 *rlen, u64 pkt_size)
> What is direct buffer?
it's for GEM buffers.
>
>> +{
>> + int mlen;
>> + u64 len = ctx->args[i].length;
>> + int inbufs = ctx->inbufs;
>> +
>> + if (ctx->olaps[oix].offset == 0) {
>> + *rlen -= ALIGN(*args, FASTRPC_ALIGN) - *args;
>> + *args = ALIGN(*args, FASTRPC_ALIGN);
>> + }
>> +
>> + mlen = ctx->olaps[oix].mend - ctx->olaps[oix].mstart;
>> +
>> + if (*rlen < mlen)
>> + return -ENOSPC;
>> +
>> + rpra[i].buf.pv = *args - ctx->olaps[oix].offset;
>> +
>> + pages[i].addr = ctx->msg->phys - ctx->olaps[oix].offset + (pkt_size - *rlen);
>> + pages[i].addr = pages[i].addr & PAGE_MASK;
>> + pages[i].size = calculate_page_aligned_size(rpra[i].buf.pv, len);
>> +
>> + *args = *args + mlen;
>> + *rlen -= mlen;
>> +
>> + if (i < inbufs) {
>> + void *dst = (void *)(uintptr_t)rpra[i].buf.pv;
>> + void *src = (void *)(uintptr_t)ctx->args[i].ptr;
> Huh?
do you see any problem here, I've copied this from existing fastrpc.
>
>> +
>> + if ((unsigned long)src >= PAGE_OFFSET) {
>> + memcpy(dst, src, len);
>> + } else {
>> + if (copy_from_user(dst, (void __user *)src, len))
>> + return -EFAULT;
>> + }
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static int process_dma_handle(struct fastrpc_invoke_context *ctx, int i,
>> + union fastrpc_remote_arg *rpra, struct fastrpc_phy_page *pages)
>> +{
>> + if (ctx->args[i].fd > 0) {
>> + struct drm_gem_object *gem_obj;
>> + struct qda_gem_obj *qda_gem_obj;
>> + int err;
>> +
>> + err = get_gem_obj_from_handle(ctx->file_priv, ctx->args[i].fd, &gem_obj);
>> + if (err)
>> + return err;
>> +
>> + ctx->gem_objs[i] = gem_obj;
>> + qda_gem_obj = to_qda_gem_obj(gem_obj);
>> +
>> + setup_pages_from_gem_obj(qda_gem_obj, &pages[i]);
>> +
>> + rpra[i].dma.fd = ctx->args[i].fd;
>> + rpra[i].dma.len = ctx->args[i].length;
>> + rpra[i].dma.offset = (u64)ctx->args[i].ptr;
>> + } else {
>> + rpra[i].buf.pv = ctx->args[i].ptr;
>> + rpra[i].buf.len = ctx->args[i].length;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +int fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size)
>> +{
>> + ctx->inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
>> + ctx->metalen = fastrpc_get_meta_size(ctx);
>> + ctx->pkt_size = fastrpc_get_payload_size(ctx, ctx->metalen);
>> +
>> + ctx->aligned_pkt_size = PAGE_ALIGN(ctx->pkt_size);
>> + if (ctx->aligned_pkt_size == 0)
>> + return -EINVAL;
>> +
>> + *out_size = ctx->aligned_pkt_size;
>> + return 0;
>> +}
>> +
>> +static int fastrpc_get_args(struct fastrpc_invoke_context *ctx)
>> +{
>> + union fastrpc_remote_arg *rpra;
>> + struct fastrpc_invoke_buf *list;
>> + struct fastrpc_phy_page *pages;
>> + int i, oix, err = 0;
>> + u64 rlen;
>> + uintptr_t args;
>> + size_t hdr_size;
>> +
>> + ctx->inbufs = REMOTE_SCALARS_INBUFS(ctx->sc);
>> + err = fastrpc_get_header_size(ctx, &hdr_size);
>> + if (err)
>> + return err;
>> +
>> + ctx->msg->buf = ctx->msg_gem_obj->virt;
>> + ctx->msg->phys = ctx->msg_gem_obj->dma_addr;
>> +
>> + memset(ctx->msg->buf, 0, ctx->aligned_pkt_size);
>> +
>> + rpra = (union fastrpc_remote_arg *)ctx->msg->buf;
>> + ctx->list = fastrpc_invoke_buf_start(rpra, ctx->nscalars);
>> + ctx->pages = fastrpc_phy_page_start(ctx->list, ctx->nscalars);
>> + list = ctx->list;
>> + pages = ctx->pages;
>> + args = (uintptr_t)ctx->msg->buf + ctx->metalen;
>> + rlen = ctx->pkt_size - ctx->metalen;
>> + ctx->rpra = rpra;
>> +
>> + for (oix = 0; oix < ctx->nbufs; ++oix) {
>> + i = ctx->olaps[oix].raix;
>> +
>> + rpra[i].buf.pv = 0;
>> + rpra[i].buf.len = ctx->args[i].length;
>> + list[i].num = ctx->args[i].length ? 1 : 0;
>> + list[i].pgidx = i;
>> +
>> + if (!ctx->args[i].length)
>> + continue;
>> +
>> + if (ctx->args[i].fd > 0)
>> + err = process_fd_buffer(ctx, i, rpra, pages);
>> + else
>> + err = process_direct_buffer(ctx, i, oix, rpra, pages, &args, &rlen,
>> + ctx->pkt_size);
>> +
>> + if (err)
>> + goto bail_gem;
>> + }
>> +
>> + for (i = ctx->nbufs; i < ctx->nscalars; ++i) {
>> + list[i].num = ctx->args[i].length ? 1 : 0;
>> + list[i].pgidx = i;
>> +
>> + err = process_dma_handle(ctx, i, rpra, pages);
>> + if (err)
>> + goto bail_gem;
>> + }
>> +
>> + return 0;
>> +
>> +bail_gem:
>> + if (ctx->msg_gem_obj) {
>> + drm_gem_object_put(&ctx->msg_gem_obj->base);
>> + ctx->msg_gem_obj = NULL;
>> + }
>> +
>> + return err;
>> +}
>> +
>> +static int fastrpc_put_args(struct fastrpc_invoke_context *ctx, struct qda_msg *msg)
>> +{
>> + union fastrpc_remote_arg *rpra = ctx->rpra;
>> + int i, err = 0;
>> +
>> + if (!ctx || !rpra)
>> + return -EINVAL;
>> +
>> + for (i = ctx->inbufs; i < ctx->nbufs; ++i) {
>> + if (ctx->args[i].fd <= 0) {
>> + void *src = (void *)(uintptr_t)rpra[i].buf.pv;
>> + void *dst = (void *)(uintptr_t)ctx->args[i].ptr;
>> + u64 len = rpra[i].buf.len;
>> +
>> + err = copy_to_user_or_kernel(dst, src, len);
>> + if (err)
>> + break;
>> + }
>> + }
>> +
>> + return err;
>> +}
>> +
>> +int fastrpc_internal_invoke_pack(struct fastrpc_invoke_context *ctx,
>> + struct qda_msg *msg)
>> +{
>> + int err = 0;
>> +
>> + if (ctx->handle == FASTRPC_INIT_HANDLE)
>> + msg->client_id = 0;
>> + else
>> + msg->client_id = ctx->client_id;
>> +
>> + ctx->msg = msg;
>> +
>> + err = fastrpc_get_args(ctx);
>> + if (err)
>> + return err;
>> +
>> + dma_wmb();
>> +
>> + msg->tid = ctx->pid;
>> + msg->ctx = ctx->ctxid | ctx->pd;
>> + msg->handle = ctx->handle;
>> + msg->sc = ctx->sc;
>> + msg->addr = ctx->msg->phys;
>> + msg->size = roundup(ctx->pkt_size, PAGE_SIZE);
>> + msg->fastrpc_ctx = ctx;
>> + msg->file_priv = ctx->file_priv;
>> +
>> + return 0;
>> +}
>> +
>> +int fastrpc_internal_invoke_unpack(struct fastrpc_invoke_context *ctx,
>> + struct qda_msg *msg)
>> +{
>> + int err;
>> +
>> + dma_rmb();
>> +
>> + err = fastrpc_put_args(ctx, msg);
>> + if (err)
>> + return err;
>> +
>> + err = ctx->retval;
>> + return err;
>> +}
>> +
>> +static int fastrpc_prepare_args_init_attach(struct fastrpc_invoke_context *ctx)
>> +{
>> + struct fastrpc_invoke_args *args;
>> +
>> + args = kzalloc_obj(*args, GFP_KERNEL);
>> + if (!args)
>> + return -ENOMEM;
>> +
>> + setup_single_arg(args, &ctx->client_id, sizeof(ctx->client_id));
>> + ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_ATTACH, 1, 0);
>> + ctx->args = args;
>> + ctx->handle = FASTRPC_INIT_HANDLE;
>> +
>> + return 0;
>> +}
>> +
>> +static int fastrpc_prepare_args_release_process(struct fastrpc_invoke_context *ctx)
>> +{
>> + struct fastrpc_invoke_args *args;
>> +
>> + args = kzalloc_obj(*args, GFP_KERNEL);
>> + if (!args)
>> + return -ENOMEM;
>> +
>> + setup_single_arg(args, &ctx->client_id, sizeof(ctx->client_id));
>> + ctx->sc = FASTRPC_SCALARS(FASTRPC_RMID_INIT_RELEASE, 1, 0);
>> + ctx->args = args;
>> + ctx->handle = FASTRPC_INIT_HANDLE;
>> +
>> + return 0;
>> +}
>> +
>> +int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
>> +{
>> + int err;
>> +
>> + switch (ctx->type) {
>> + case FASTRPC_RMID_INIT_ATTACH:
>> + ctx->pd = ROOT_PD;
>> + err = fastrpc_prepare_args_init_attach(ctx);
>> + break;
>> + case FASTRPC_RMID_INIT_RELEASE:
>> + err = fastrpc_prepare_args_release_process(ctx);
>> + break;
>> + default:
>> + return -EINVAL;
>> + }
>> +
>> + if (err)
>> + return err;
>> +
>> + ctx->nscalars = REMOTE_SCALARS_LENGTH(ctx->sc);
>> + ctx->nbufs = REMOTE_SCALARS_INBUFS(ctx->sc) + REMOTE_SCALARS_OUTBUFS(ctx->sc);
>> +
>> + if (ctx->nscalars) {
>> + ctx->gem_objs = kcalloc(ctx->nscalars, sizeof(*ctx->gem_objs), GFP_KERNEL);
>> + if (!ctx->gem_objs)
>> + return -ENOMEM;
>> + ctx->olaps = kcalloc(ctx->nscalars, sizeof(*ctx->olaps), GFP_KERNEL);
>> + if (!ctx->olaps) {
>> + kfree(ctx->gem_objs);
>> + ctx->gem_objs = NULL;
>> + return -ENOMEM;
>> + }
>> + fastrpc_get_buff_overlaps(ctx);
>> + }
>> +
>> + return err;
>> +}
>> diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
>> new file mode 100644
>> index 000000000000..744421382079
>> --- /dev/null
>> +++ b/drivers/accel/qda/qda_fastrpc.h
>> @@ -0,0 +1,303 @@
>> +/* SPDX-License-Identifier: GPL-2.0-only */
>> +/*
>> + * Copyright (c) Qualcomm Technologies, Inc. and/or its subsidiaries.
>> + */
>> +
>> +#ifndef __QDA_FASTRPC_H__
>> +#define __QDA_FASTRPC_H__
>> +
>> +#include <linux/completion.h>
>> +#include <linux/list.h>
>> +#include <linux/types.h>
>> +#include <drm/drm_drv.h>
>> +#include <drm/drm_file.h>
>> +
>> +/*
>> + * FastRPC scalar extraction macros
>> + *
>> + * These macros extract different fields from the scalar value that describes
>> + * the arguments passed in a FastRPC invocation.
>> + */
>> +#define REMOTE_SCALARS_INBUFS(sc) (((sc) >> 16) & 0x0ff)
>> +#define REMOTE_SCALARS_OUTBUFS(sc) (((sc) >> 8) & 0x0ff)
>> +#define REMOTE_SCALARS_INHANDLES(sc) (((sc) >> 4) & 0x0f)
>> +#define REMOTE_SCALARS_OUTHANDLES(sc) ((sc) & 0x0f)
>> +#define REMOTE_SCALARS_LENGTH(sc) (REMOTE_SCALARS_INBUFS(sc) + \
>> + REMOTE_SCALARS_OUTBUFS(sc) + \
>> + REMOTE_SCALARS_INHANDLES(sc) + \
>> + REMOTE_SCALARS_OUTHANDLES(sc))
>> +
>> +/* FastRPC configuration constants */
>> +#define FASTRPC_ALIGN 128 /* Alignment requirement */
>> +#define FASTRPC_MAX_FDLIST 16 /* Maximum file descriptors */
>> +#define FASTRPC_MAX_CRCLIST 64 /* Maximum CRC list entries */
>> +
>> +/*
>> + * FastRPC scalar construction macros
>> + *
>> + * These macros build the scalar value that describes the arguments
>> + * for a FastRPC invocation.
>> + */
>> +#define FASTRPC_BUILD_SCALARS(attr, method, in, out, oin, oout) \
>> + (((attr & 0x07) << 29) | \
>> + ((method & 0x1f) << 24) | \
>> + ((in & 0xff) << 16) | \
>> + ((out & 0xff) << 8) | \
>> + ((oin & 0x0f) << 4) | \
>> + (oout & 0x0f))
>> +
>> +#define FASTRPC_SCALARS(method, in, out) \
>> + FASTRPC_BUILD_SCALARS(0, method, in, out, 0, 0)
>> +
>> +/**
>> + * struct fastrpc_buf_overlap - Buffer overlap tracking structure
>> + *
>> + * This structure tracks overlapping buffer regions to optimize memory
>> + * mapping and avoid redundant mappings of the same physical memory.
> I think you are spending much more efforts on optimizing it than the
> actual cost of mapping the same region twice. Or is there something more
> than the optimization?
The current design is to take both DMABUF and non-DMABUF buffers over RPC calls.
I'm checking if the copy approach can be completely moved to userspace. If that is possible
this kernel logic might not be needed.
>
>> + */
>> +struct fastrpc_buf_overlap {
>> + /* Start address of the buffer in user virtual address space */
>> + u64 start;
>> + /* End address of the buffer in user virtual address space */
>> + u64 end;
>> + /* Remote argument index associated with this overlap */
>> + int raix;
>> + /* Start address of the mapped region */
>> + u64 mstart;
>> + /* End address of the mapped region */
>> + u64 mend;
>> + /* Offset within the mapped region */
>> + u64 offset;
>> +};
>> +
>> +/**
>> + * struct fastrpc_remote_dmahandle - Structure to represent a remote DMA handle
>> + */
>> +struct fastrpc_remote_dmahandle {
>> + /* DMA handle file descriptor */
>> + s32 fd;
>> + /* DMA handle offset */
>> + u32 offset;
>> + /* DMA handle length */
>> + u32 len;
>> +};
>> +
>> +/**
>> + * struct fastrpc_remote_buf - Structure to represent a remote buffer
>> + */
>> +struct fastrpc_remote_buf {
>> + /* Buffer pointer */
>> + u64 pv;
>> + /* Length of buffer */
>> + u64 len;
>> +};
>> +
>> +/**
>> + * union fastrpc_remote_arg - Union to represent remote arguments
>> + */
>> +union fastrpc_remote_arg {
>> + /* Remote buffer */
>> + struct fastrpc_remote_buf buf;
>> + /* Remote DMA handle */
>> + struct fastrpc_remote_dmahandle dma;
>> +};
>> +
>> +/**
>> + * struct fastrpc_phy_page - Structure to represent a physical page
>> + */
>> +struct fastrpc_phy_page {
>> + /* Physical address */
>> + u64 addr;
>> + /* Size of contiguous region */
>> + u64 size;
>> +};
>> +
>> +/**
>> + * struct fastrpc_invoke_buf - Structure to represent an invoke buffer
>> + */
>> +struct fastrpc_invoke_buf {
>> + /* Number of contiguous regions */
>> + u32 num;
>> + /* Page index */
>> + u32 pgidx;
>> +};
>> +
>> +/**
>> + * struct qda_msg - Message structure for FastRPC communication
>> + *
>> + * This structure represents a message sent to or received from the remote
>> + * processor via FastRPC protocol.
>> + */
>> +struct qda_msg {
>> + /* Process client ID */
>> + int client_id;
>> + /* Thread ID */
>> + int tid;
>> + /* Context identifier for matching responses */
>> + u64 ctx;
>> + /* Handle to invoke on remote processor */
>> + u32 handle;
>> + /* Scalars structure describing the data layout */
>> + u32 sc;
>> + /* Physical address of the message buffer */
>> + u64 addr;
>> + /* Size of contiguous region */
>> + u64 size;
>> + /* Kernel virtual address of the buffer */
>> + void *buf;
>> + /* Physical/DMA address of the buffer */
>> + u64 phys;
>> + /* Return value from remote processor */
>> + int ret;
>> + /* Pointer to qda_dev for context management */
>> + struct qda_dev *qdev;
>> + /* Back-pointer to FastRPC context */
>> + struct fastrpc_invoke_context *fastrpc_ctx;
>> + /* File private data for GEM object lookup */
>> + struct drm_file *file_priv;
>> +};
>> +
>> +/**
>> + * struct fastrpc_invoke_context - Remote procedure call invocation context
>> + *
>> + * This structure maintains all state for a single remote procedure call,
>> + * including buffer management, synchronization, and result handling.
>> + */
>> +struct fastrpc_invoke_context {
>> + /* Unique context identifier for this invocation */
>> + u64 ctxid;
>> + /* Number of input buffers */
>> + int inbufs;
>> + /* Number of output buffers */
>> + int outbufs;
>> + /* Number of file descriptor handles */
>> + int handles;
>> + /* Number of scalar parameters */
>> + int nscalars;
>> + /* Total number of buffers (input + output) */
>> + int nbufs;
>> + /* Process ID of the calling process */
>> + int pid;
>> + /* Return value from the remote invocation */
>> + int retval;
>> + /* Length of metadata */
>> + int metalen;
>> + /* Client identifier for this session */
>> + int client_id;
>> + /* Protection domain identifier */
>> + int pd;
>> + /* Type of invocation request */
>> + int type;
>> + /* Scalars parameter encoding buffer information */
>> + u32 sc;
>> + /* Handle to the remote method being invoked */
>> + u32 handle;
>> + /* Pointer to CRC values for data integrity */
>> + u32 *crc;
>> + /* Pointer to array of file descriptors */
>> + u64 *fdlist;
>> + /* Size of the packet */
>> + u64 pkt_size;
>> + /* Aligned packet size for DMA transfers */
>> + u64 aligned_pkt_size;
>> + /* Array of invoke buffer descriptors */
>> + struct fastrpc_invoke_buf *list;
>> + /* Array of physical page descriptors for buffers */
>> + struct fastrpc_phy_page *pages;
>> + /* Array of physical page descriptors for input buffers */
>> + struct fastrpc_phy_page *input_pages;
>> + /* List node for linking contexts in a queue */
>> + struct list_head node;
>> + /* Completion object for synchronizing invocation */
>> + struct completion work;
>> + /* Pointer to the QDA message structure */
>> + struct qda_msg *msg;
>> + /* Array of remote procedure arguments */
>> + union fastrpc_remote_arg *rpra;
>> + /* Array of GEM objects for argument buffers */
>> + struct drm_gem_object **gem_objs;
>> + /* Pointer to user-space invoke arguments */
>> + struct fastrpc_invoke_args *args;
>> + /* Array of buffer overlap descriptors */
>> + struct fastrpc_buf_overlap *olaps;
>> + /* Reference counter for context lifetime management */
>> + struct kref refcount;
>> + /* GEM object for the main message buffer */
>> + struct qda_gem_obj *msg_gem_obj;
>> + /* DRM file private data */
>> + struct drm_file *file_priv;
>> + /* Pointer to request buffer */
>> + void *req;
>> + /* Pointer to response buffer */
>> + void *rsp;
>> + /* Pointer to input buffer */
>> + void *inbuf;
>> +};
>> +
>> +/* Remote Method ID table - identifies initialization and control operations */
>> +#define FASTRPC_RMID_INIT_ATTACH 0 /* Attach to DSP session */
>> +#define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP session */
>> +
>> +/* Common handle for initialization operations */
>> +#define FASTRPC_INIT_HANDLE 0x1
>> +
>> +/* Protection Domain(PD) ids */
>> +#define ROOT_PD (0)
>> +
>> +/**
>> + * fastrpc_context_free - Free an invocation context
>> + * @ref: Reference counter for the context
>> + *
>> + * This function is called when the reference count reaches zero,
>> + * releasing all resources associated with the invocation context.
>> + */
>> +void fastrpc_context_free(struct kref *ref);
>> +
>> +/*
>> + * FastRPC context and invocation management functions
>> + */
>> +
>> +/**
>> + * fastrpc_context_alloc - Allocate a new FastRPC invocation context
>> + *
>> + * Returns: Pointer to allocated context, or NULL on failure
>> + */
>> +struct fastrpc_invoke_context *fastrpc_context_alloc(void);
>> +
>> +/**
>> + * fastrpc_prepare_args - Prepare arguments for FastRPC invocation
>> + * @ctx: FastRPC invocation context
>> + * @argp: User-space pointer to invocation arguments
>> + *
>> + * Returns: 0 on success, negative error code on failure
>> + */
>> +int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp);
>> +
>> +/**
>> + * fastrpc_get_header_size - Get the size of the FastRPC message header
>> + * @ctx: FastRPC invocation context
>> + * @out_size: Pointer to store the header size in bytes
>> + *
>> + * Returns: 0 on success, negative error code on failure
>> + */
>> +int fastrpc_get_header_size(struct fastrpc_invoke_context *ctx, size_t *out_size);
>> +
>> +/**
>> + * fastrpc_internal_invoke_pack - Pack invocation context into message
>> + * @ctx: FastRPC invocation context
>> + * @msg: QDA message structure to pack into
>> + *
>> + * Returns: 0 on success, negative error code on failure
>> + */
>> +int fastrpc_internal_invoke_pack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
>> +
>> +/**
>> + * fastrpc_internal_invoke_unpack - Unpack response message into context
>> + * @ctx: FastRPC invocation context
>> + * @msg: QDA message structure to unpack from
>> + *
>> + * Returns: 0 on success, negative error code on failure
>> + */
>> +int fastrpc_internal_invoke_unpack(struct fastrpc_invoke_context *ctx, struct qda_msg *msg);
>> +
>> +#endif /* __QDA_FASTRPC_H__ */
>> diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
>> index d91983048d6c..1066ab6ddc7b 100644
>> --- a/drivers/accel/qda/qda_ioctl.c
>> +++ b/drivers/accel/qda/qda_ioctl.c
>> @@ -6,6 +6,8 @@
>> #include "qda_drv.h"
>> #include "qda_ioctl.h"
>> #include "qda_prime.h"
>> +#include "qda_fastrpc.h"
>> +#include "qda_rpmsg.h"
>>
>> static int qda_validate_and_get_context(struct drm_device *dev, struct drm_file *file_priv,
>> struct qda_dev **qdev, struct qda_user **qda_user)
>> @@ -85,3 +87,108 @@ int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_p
>> {
>> return qda_prime_fd_to_handle(dev, file_priv, prime_fd, handle);
>> }
>> +
>> +static int fastrpc_context_get_id(struct fastrpc_invoke_context *ctx, struct qda_dev *qdev)
>> +{
>> + int ret;
>> + u32 id;
>> +
>> + if (!qdev)
>> + return -EINVAL;
>> +
>> + if (atomic_read(&qdev->removing))
>> + return -ENODEV;
>> +
>> + ret = xa_alloc(&qdev->ctx_xa, &id, ctx, xa_limit_32b, GFP_KERNEL);
>> + if (ret)
>> + return ret;
>> +
>> + ctx->ctxid = id << 4;
>> + return 0;
>> +}
>> +
>> +static void fastrpc_context_put_id(struct fastrpc_invoke_context *ctx, struct qda_dev *qdev)
>> +{
>> + if (qdev)
>> + xa_erase(&qdev->ctx_xa, ctx->ctxid >> 4);
>> +}
>> +
>> +static int fastrpc_invoke(int type, struct drm_device *dev, void *data,
>> + struct drm_file *file_priv)
>> +{
>> + struct qda_dev *qdev;
>> + struct qda_user *qda_user;
>> + struct qda_msg msg;
>> + struct fastrpc_invoke_context *ctx;
>> + struct drm_gem_object *gem_obj;
>> + int err;
>> + size_t hdr_size;
>> +
>> + err = qda_validate_and_get_context(dev, file_priv, &qdev, &qda_user);
>> + if (err)
>> + return err;
>> +
>> + ctx = fastrpc_context_alloc();
>> + if (IS_ERR(ctx))
>> + return PTR_ERR(ctx);
>> +
>> + err = fastrpc_context_get_id(ctx, qdev);
>> + if (err) {
>> + kref_put(&ctx->refcount, fastrpc_context_free);
>> + return err;
>> + }
>> +
>> + ctx->type = type;
>> + ctx->file_priv = file_priv;
>> + ctx->client_id = qda_user->client_id;
>> +
>> + err = fastrpc_prepare_args(ctx, (char __user *)data);
>> + if (err)
>> + goto err_context_free;
>> +
>> + err = fastrpc_get_header_size(ctx, &hdr_size);
>> + if (err)
>> + goto err_context_free;
>> +
>> + gem_obj = qda_gem_create_object(qdev->drm_dev,
>> + qdev->drm_priv->iommu_mgr,
>> + hdr_size, file_priv);
>> + if (IS_ERR(gem_obj)) {
>> + err = PTR_ERR(gem_obj);
>> + goto err_context_free;
>> + }
>> +
>> + ctx->msg_gem_obj = to_qda_gem_obj(gem_obj);
>> +
>> + err = fastrpc_internal_invoke_pack(ctx, &msg);
>> + if (err)
>> + goto err_context_free;
>> +
>> + err = qda_rpmsg_send_msg(qdev, &msg);
>> + if (err)
>> + goto err_context_free;
>> +
>> + err = qda_rpmsg_wait_for_rsp(ctx);
>> + if (err)
>> + goto err_context_free;
>> +
>> + err = fastrpc_internal_invoke_unpack(ctx, &msg);
>> + if (err)
>> + goto err_context_free;
>> +
>> +err_context_free:
>> + fastrpc_context_put_id(ctx, qdev);
>> + kref_put(&ctx->refcount, fastrpc_context_free);
>> +
>> + return err;
>> +}
>> +
>> +int qda_ioctl_attach(struct drm_device *dev, void *data, struct drm_file *file_priv)
>> +{
>> + return fastrpc_invoke(FASTRPC_RMID_INIT_ATTACH, dev, data, file_priv);
>> +}
>> +
>> +int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv)
>> +{
>> + return fastrpc_invoke(FASTRPC_RMID_INIT_RELEASE, qdev->drm_dev, NULL, file_priv);
>> +}
>> diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
>> index d454256f5fc5..044c616a51c6 100644
>> --- a/drivers/accel/qda/qda_ioctl.h
>> +++ b/drivers/accel/qda/qda_ioctl.h
>> @@ -38,4 +38,29 @@ int qda_ioctl_query(struct drm_device *dev, void *data, struct drm_file *file_pr
>> int qda_ioctl_prime_fd_to_handle(struct drm_device *dev, struct drm_file *file_priv,
>> int prime_fd, u32 *handle);
>>
>> +/**
>> + * qda_ioctl_attach - Attach to DSP root protection domain
>> + * @dev: DRM device structure
>> + * @data: User-space data for the attach operation
>> + * @file_priv: DRM file private data
>> + *
>> + * This IOCTL handler attaches to the DSP root PD (Protection Domain)
>> + * to enable communication between the host and DSP.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int qda_ioctl_attach(struct drm_device *dev, void *data, struct drm_file *file_priv);
>> +
>> +/**
>> + * fastrpc_release_current_dsp_process - Release DSP process resources
>> + * @qdev: QDA device structure
>> + * @file_priv: DRM file private data
>> + *
>> + * This function releases all resources associated with a DSP process
>> + * when a user-space client closes its file descriptor.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv);
>> +
>> #endif /* _QDA_IOCTL_H */
>> diff --git a/drivers/accel/qda/qda_rpmsg.c b/drivers/accel/qda/qda_rpmsg.c
>> index b2b44b4d3ca8..96a08d753271 100644
>> --- a/drivers/accel/qda/qda_rpmsg.c
>> +++ b/drivers/accel/qda/qda_rpmsg.c
>> @@ -5,7 +5,11 @@
>> #include <linux/of_platform.h>
>> #include <linux/of.h>
>> #include <linux/of_device.h>
>> +#include <linux/completion.h>
>> +#include <linux/wait.h>
>> +#include <linux/sched.h>
>> #include "qda_drv.h"
>> +#include "qda_fastrpc.h"
>> #include "qda_rpmsg.h"
>> #include "qda_cb.h"
>>
>> @@ -15,7 +19,104 @@ static int qda_rpmsg_init(struct qda_dev *qdev)
>> return 0;
>> }
>>
>> -/* Utility function to allocate and initialize qda_dev */
>> +static int validate_device_availability(struct qda_dev *qdev)
>> +{
>> + struct rpmsg_device *rpdev;
>> +
>> + if (!qdev)
>> + return -ENODEV;
>> +
>> + if (atomic_read(&qdev->removing)) {
>> + qda_dbg(qdev, "RPMsg device unavailable: removing\n");
>> + return -ENODEV;
>> + }
>> +
>> + mutex_lock(&qdev->lock);
>> + rpdev = qdev->rpdev;
>> + mutex_unlock(&qdev->lock);
>> +
>> + if (!rpdev) {
>> + qda_dbg(qdev, "RPMsg device unavailable: rpdev is NULL\n");
>> + return -ENODEV;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static struct fastrpc_invoke_context *get_and_validate_context(struct qda_msg *msg,
>> + struct qda_dev *qdev)
>> +{
>> + struct fastrpc_invoke_context *ctx = msg->fastrpc_ctx;
>> +
>> + if (!ctx) {
>> + qda_dbg(qdev, "FastRPC context not found in message\n");
>> + return ERR_PTR(-EINVAL);
>> + }
>> +
>> + kref_get(&ctx->refcount);
>> + return ctx;
>> +}
>> +
>> +static void populate_fastrpc_msg(struct fastrpc_msg *dst, struct qda_msg *src)
>> +{
>> + dst->client_id = src->client_id;
>> + dst->tid = src->tid;
>> + dst->ctx = src->ctx;
>> + dst->handle = src->handle;
>> + dst->sc = src->sc;
>> + dst->addr = src->addr;
>> + dst->size = src->size;
>> +}
>> +
>> +static int validate_callback_params(struct qda_dev *qdev, void *data, int len)
>> +{
>> + if (!qdev)
>> + return -ENODEV;
>> +
>> + if (atomic_read(&qdev->removing))
>> + return -ENODEV;
>> +
>> + if (len < sizeof(struct qda_invoke_rsp)) {
>> + qda_dbg(qdev, "Invalid message size from remote: %d\n", len);
>> + return -EINVAL;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +static unsigned long extract_context_id(struct qda_invoke_rsp *resp_msg)
>> +{
>> + return (resp_msg->ctx & 0xFF0) >> 4;
>> +}
>> +
>> +static struct fastrpc_invoke_context *find_context_by_id(struct qda_dev *qdev,
>> + unsigned long ctxid)
>> +{
>> + struct fastrpc_invoke_context *ctx;
>> +
>> + {
>> + unsigned long flags;
>> +
>> + xa_lock_irqsave(&qdev->ctx_xa, flags);
>> + ctx = xa_load(&qdev->ctx_xa, ctxid);
>> + xa_unlock_irqrestore(&qdev->ctx_xa, flags);
>> + }
>> +
>> + if (!ctx) {
>> + qda_dbg(qdev, "FastRPC context not found for ctxid: %lu\n", ctxid);
>> + return ERR_PTR(-ENOENT);
>> + }
>> +
>> + return ctx;
>> +}
>> +
>> +static void complete_context_processing(struct fastrpc_invoke_context *ctx, int retval)
>> +{
>> + ctx->retval = retval;
>> + complete(&ctx->work);
>> + kref_put(&ctx->refcount, fastrpc_context_free);
>> +}
>> +
>> static struct qda_dev *alloc_and_init_qdev(struct rpmsg_device *rpdev)
>> {
>> struct qda_dev *qdev;
>> @@ -62,9 +163,68 @@ static int qda_populate_child_devices(struct qda_dev *qdev, struct device_node *
>> return success > 0 ? 0 : (count > 0 ? -ENODEV : 0);
>> }
>>
>> +int qda_rpmsg_send_msg(struct qda_dev *qdev, struct qda_msg *msg)
>> +{
>> + int ret;
>> + struct fastrpc_invoke_context *ctx;
>> + struct fastrpc_msg msg1;
>> + struct rpmsg_device *rpdev;
>> +
>> + ret = validate_device_availability(qdev);
>> + if (ret)
>> + return ret;
>> +
>> + ctx = get_and_validate_context(msg, qdev);
>> + if (IS_ERR(ctx))
>> + return PTR_ERR(ctx);
>> +
>> + populate_fastrpc_msg(&msg1, msg);
>> +
>> + mutex_lock(&qdev->lock);
>> + rpdev = qdev->rpdev;
>> + if (!rpdev) {
>> + mutex_unlock(&qdev->lock);
>> + kref_put(&ctx->refcount, fastrpc_context_free);
>> + return -ENODEV;
>> + }
>> +
>> + ret = rpmsg_send(rpdev->ept, (void *)&msg1, sizeof(msg1));
>> + mutex_unlock(&qdev->lock);
>> +
>> + if (ret) {
>> + qda_err(qdev, "rpmsg_send failed: %d\n", ret);
>> + kref_put(&ctx->refcount, fastrpc_context_free);
>> + return ret;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> +int qda_rpmsg_wait_for_rsp(struct fastrpc_invoke_context *ctx)
>> +{
>> + return wait_for_completion_interruptible(&ctx->work);
>> +}
>> +
>> static int qda_rpmsg_cb(struct rpmsg_device *rpdev, void *data, int len, void *priv, u32 src)
>> {
>> - /* Dummy function for rpmsg driver */
>> + struct qda_dev *qdev = dev_get_drvdata(&rpdev->dev);
>> + struct qda_invoke_rsp *resp_msg = (struct qda_invoke_rsp *)data;
>> + struct fastrpc_invoke_context *ctx;
>> + unsigned long ctxid;
>> + int ret;
>> +
>> + ret = validate_callback_params(qdev, data, len);
>> + if (ret)
>> + return ret;
>> +
>> + ctxid = extract_context_id(resp_msg);
>> +
>> + ctx = find_context_by_id(qdev, ctxid);
>> + if (IS_ERR(ctx))
>> + return PTR_ERR(ctx);
>> +
>> + complete_context_processing(ctx, resp_msg->retval);
>> +
>> return 0;
>> }
>>
>> diff --git a/drivers/accel/qda/qda_rpmsg.h b/drivers/accel/qda/qda_rpmsg.h
>> index 348827bff255..b3e76e44f4cd 100644
>> --- a/drivers/accel/qda/qda_rpmsg.h
>> +++ b/drivers/accel/qda/qda_rpmsg.h
>> @@ -7,6 +7,46 @@
>> #define __QDA_RPMSG_H__
>>
>> #include "qda_drv.h"
>> +#include "qda_fastrpc.h"
>> +
>> +/**
>> + * struct fastrpc_msg - FastRPC message structure for remote invocations
>> + *
>> + * This structure represents a FastRPC message sent to the remote processor
>> + * via RPMsg transport layer.
>> + */
>> +struct fastrpc_msg {
>> + /* Process client ID */
>> + int client_id;
>> + /* Thread ID */
>> + int tid;
>> + /* Context identifier for matching request/response */
>> + u64 ctx;
>> + /* Handle to invoke on remote processor */
>> + u32 handle;
>> + /* Scalars structure describing the data layout */
>> + u32 sc;
>> + /* Physical address of the message buffer */
>> + u64 addr;
>> + /* Size of contiguous region */
>> + u64 size;
>> +};
>> +
>> +/**
>> + * struct qda_invoke_rsp - Response structure for FastRPC invocations
>> + */
>> +struct qda_invoke_rsp {
>> + /* Invoke caller context for matching request/response */
>> + u64 ctx;
>> + /* Return value from the remote invocation */
>> + int retval;
>> +};
>> +
>> +/*
>> + * RPMsg transport layer functions
>> + */
>> +int qda_rpmsg_send_msg(struct qda_dev *qdev, struct qda_msg *msg);
>> +int qda_rpmsg_wait_for_rsp(struct fastrpc_invoke_context *ctx);
>>
>> /*
>> * Transport layer registration
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 14/18] accel/qda: Add FastRPC dynamic invocation support
2026-02-23 23:10 ` Dmitry Baryshkov
@ 2026-03-09 6:53 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-09 6:53 UTC (permalink / raw)
To: Dmitry Baryshkov
Cc: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König, dri-devel, linux-doc, linux-kernel,
linux-arm-msm, iommu, linux-media, linaro-mm-sig,
Srinivas Kandagatla, Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 4:40 AM, Dmitry Baryshkov wrote:
> On Tue, Feb 24, 2026 at 12:39:08AM +0530, Ekansh Gupta wrote:
>> Extend the QDA FastRPC implementation to support dynamic remote
>> procedure calls from userspace. A new DRM_QDA_INVOKE ioctl is added,
>> which accepts a qda_invoke_args structure containing a remote handle,
>> FastRPC scalars value and a pointer to an array of fastrpc_invoke_args
>> describing the individual arguments. The driver copies the scalar and
>> argument array into a fastrpc_invoke_context and reuses the existing
>> buffer overlap and packing logic to build a GEM-backed message buffer
>> for transport.
>>
>> The FastRPC core gains a FASTRPC_RMID_INVOKE_DYNAMIC method type and a
>> fastrpc_prepare_args_invoke() helper that reads the qda_invoke_args
>> header and argument descriptors from user or kernel memory using a
>> copy_from_user_or_kernel() helper. The generic fastrpc_prepare_args()
>> path is updated to handle the dynamic method alongside the existing
>> INIT_ATTACH and INIT_RELEASE control calls, deriving the number of
>> buffers and scalars from the provided FastRPC scalars encoding.
>>
>> On the transport side qda_ioctl_invoke() simply forwards the request
>> to fastrpc_invoke() with the dynamic method id, allowing the RPMsg
>> transport and context lookup to treat dynamic calls in the same way as
>> the existing control methods. This patch establishes the basic FastRPC
>> invoke mechanism on top of the QDA GEM and RPMsg infrastructure so
>> that future patches can wire up more complex DSP APIs.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
>> ---
>> drivers/accel/qda/qda_drv.c | 1 +
>> drivers/accel/qda/qda_fastrpc.c | 48 +++++++++++++++++++++++++++++++++++++++++
>> drivers/accel/qda/qda_fastrpc.h | 1 +
>> drivers/accel/qda/qda_ioctl.c | 5 +++++
>> drivers/accel/qda/qda_ioctl.h | 13 +++++++++++
>> include/uapi/drm/qda_accel.h | 21 ++++++++++++++++++
>> 6 files changed, 89 insertions(+)
>>
>> diff --git a/drivers/accel/qda/qda_drv.c b/drivers/accel/qda/qda_drv.c
>> index 3034ea660924..f94f780ea50a 100644
>> --- a/drivers/accel/qda/qda_drv.c
>> +++ b/drivers/accel/qda/qda_drv.c
>> @@ -162,6 +162,7 @@ static const struct drm_ioctl_desc qda_ioctls[] = {
>> DRM_IOCTL_DEF_DRV(QDA_GEM_CREATE, qda_ioctl_gem_create, 0),
>> DRM_IOCTL_DEF_DRV(QDA_GEM_MMAP_OFFSET, qda_ioctl_gem_mmap_offset, 0),
>> DRM_IOCTL_DEF_DRV(QDA_INIT_ATTACH, qda_ioctl_attach, 0),
>> + DRM_IOCTL_DEF_DRV(QDA_INVOKE, qda_ioctl_invoke, 0),
>> };
>>
>> static struct drm_driver qda_drm_driver = {
>> diff --git a/drivers/accel/qda/qda_fastrpc.c b/drivers/accel/qda/qda_fastrpc.c
>> index eda7c90070ee..a48b255ffb1b 100644
>> --- a/drivers/accel/qda/qda_fastrpc.c
>> +++ b/drivers/accel/qda/qda_fastrpc.c
>> @@ -12,6 +12,16 @@
>> #include "qda_gem.h"
>> #include "qda_memory_manager.h"
>>
>> +static int copy_from_user_or_kernel(void *dst, const void __user *src, size_t size)
>> +{
>> + if ((unsigned long)src >= PAGE_OFFSET) {
>> + memcpy(dst, src, size);
>> + return 0;
>> + } else {
>> + return copy_from_user(dst, src, size) ? -EFAULT : 0;
>> + }
> Nah, it's a direct route to failure. __user is for user pointers, it
> can't be a kernel data. Define separate functions and be 100% sure
> whether the data is coming from the user (and thus needs to be
> sanitized) or of it is coming from the kernel. Otherwise a funny user
> can pass kernel pointer and get away with your code copying data from or
> writing data to the kernel buffer.
I see, I get your comment on the other patch also, I'll fix this.
>
>> +}
>> +
>> static int copy_to_user_or_kernel(void __user *dst, const void *src, size_t size)
>> {
>> if ((unsigned long)dst >= PAGE_OFFSET) {
>> @@ -509,6 +519,41 @@ static int fastrpc_prepare_args_release_process(struct fastrpc_invoke_context *c
>> return 0;
>> }
>>
>> +static int fastrpc_prepare_args_invoke(struct fastrpc_invoke_context *ctx, char __user *argp)
>> +{
>> + struct fastrpc_invoke_args *args = NULL;
>> + struct qda_invoke_args inv;
>> + int err = 0;
>> + int nscalars;
>> +
>> + if (!argp)
>> + return -EINVAL;
>> +
>> + err = copy_from_user_or_kernel(&inv, argp, sizeof(inv));
>> + if (err)
>> + return err;
>> +
>> + nscalars = REMOTE_SCALARS_LENGTH(inv.sc);
>> +
>> + if (nscalars) {
>> + args = kcalloc(nscalars, sizeof(*args), GFP_KERNEL);
>> + if (!args)
>> + return -ENOMEM;
>> +
>> + err = copy_from_user_or_kernel(args, (const void __user *)(uintptr_t)inv.args,
>> + nscalars * sizeof(*args));
> So... You are allowing users to specify the address in the kernel
> address space? Are you... sure?
ack, I'll fix this
>
>> + if (err) {
>> + kfree(args);
>> + return err;
>> + }
>> + }
>> + ctx->sc = inv.sc;
>> + ctx->args = args;
>> + ctx->handle = inv.handle;
>> +
>> + return 0;
>> +}
>> +
>> int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
>> {
>> int err;
>> @@ -521,6 +566,9 @@ int fastrpc_prepare_args(struct fastrpc_invoke_context *ctx, char __user *argp)
>> case FASTRPC_RMID_INIT_RELEASE:
>> err = fastrpc_prepare_args_release_process(ctx);
>> break;
>> + case FASTRPC_RMID_INVOKE_DYNAMIC:
>> + err = fastrpc_prepare_args_invoke(ctx, argp);
>> + break;
>> default:
>> return -EINVAL;
>> }
>> diff --git a/drivers/accel/qda/qda_fastrpc.h b/drivers/accel/qda/qda_fastrpc.h
>> index 744421382079..bcadf9437a36 100644
>> --- a/drivers/accel/qda/qda_fastrpc.h
>> +++ b/drivers/accel/qda/qda_fastrpc.h
>> @@ -237,6 +237,7 @@ struct fastrpc_invoke_context {
>> /* Remote Method ID table - identifies initialization and control operations */
>> #define FASTRPC_RMID_INIT_ATTACH 0 /* Attach to DSP session */
>> #define FASTRPC_RMID_INIT_RELEASE 1 /* Release DSP session */
>> +#define FASTRPC_RMID_INVOKE_DYNAMIC 0xFFFFFFFF /* Dynamic method invocation */
>>
>> /* Common handle for initialization operations */
>> #define FASTRPC_INIT_HANDLE 0x1
>> diff --git a/drivers/accel/qda/qda_ioctl.c b/drivers/accel/qda/qda_ioctl.c
>> index 1066ab6ddc7b..e90aceabd30d 100644
>> --- a/drivers/accel/qda/qda_ioctl.c
>> +++ b/drivers/accel/qda/qda_ioctl.c
>> @@ -192,3 +192,8 @@ int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *f
>> {
>> return fastrpc_invoke(FASTRPC_RMID_INIT_RELEASE, qdev->drm_dev, NULL, file_priv);
>> }
>> +
>> +int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv)
>> +{
>> + return fastrpc_invoke(FASTRPC_RMID_INVOKE_DYNAMIC, dev, data, file_priv);
>> +}
>> diff --git a/drivers/accel/qda/qda_ioctl.h b/drivers/accel/qda/qda_ioctl.h
>> index 044c616a51c6..e186c5183171 100644
>> --- a/drivers/accel/qda/qda_ioctl.h
>> +++ b/drivers/accel/qda/qda_ioctl.h
>> @@ -63,4 +63,17 @@ int qda_ioctl_attach(struct drm_device *dev, void *data, struct drm_file *file_p
>> */
>> int fastrpc_release_current_dsp_process(struct qda_dev *qdev, struct drm_file *file_priv);
>>
>> +/**
>> + * qda_ioctl_invoke - Invoke a remote procedure on the DSP
>> + * @dev: DRM device structure
>> + * @data: User-space data containing invocation parameters
>> + * @file_priv: DRM file private data
>> + *
>> + * This IOCTL handler initiates a remote procedure call on the DSP,
>> + * marshalling arguments, executing the call, and returning results.
>> + *
>> + * Return: 0 on success, negative error code on failure
>> + */
>> +int qda_ioctl_invoke(struct drm_device *dev, void *data, struct drm_file *file_priv);
>> +
>> #endif /* _QDA_IOCTL_H */
>> diff --git a/include/uapi/drm/qda_accel.h b/include/uapi/drm/qda_accel.h
>> index 4d3666c5b998..01072a9d0a91 100644
>> --- a/include/uapi/drm/qda_accel.h
>> +++ b/include/uapi/drm/qda_accel.h
>> @@ -22,6 +22,9 @@ extern "C" {
>> #define DRM_QDA_GEM_CREATE 0x01
>> #define DRM_QDA_GEM_MMAP_OFFSET 0x02
>> #define DRM_QDA_INIT_ATTACH 0x03
>> +/* Indexes 0x04 to 0x06 are reserved for other requests */
>> +#define DRM_QDA_INVOKE 0x07
>> +
>> /*
>> * QDA IOCTL definitions
>> *
>> @@ -35,6 +38,8 @@ extern "C" {
>> #define DRM_IOCTL_QDA_GEM_MMAP_OFFSET DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_GEM_MMAP_OFFSET, \
>> struct drm_qda_gem_mmap_offset)
>> #define DRM_IOCTL_QDA_INIT_ATTACH DRM_IO(DRM_COMMAND_BASE + DRM_QDA_INIT_ATTACH)
>> +#define DRM_IOCTL_QDA_INVOKE DRM_IOWR(DRM_COMMAND_BASE + DRM_QDA_INVOKE, \
>> + struct qda_invoke_args)
>>
>> /**
>> * struct drm_qda_query - Device information query structure
>> @@ -95,6 +100,22 @@ struct fastrpc_invoke_args {
>> __u32 attr;
>> };
>>
>> +/**
>> + * struct qda_invoke_args - User-space IOCTL arguments for invoking a function
>> + * @handle: Handle identifying the remote function to invoke
>> + * @sc: Scalars parameter encoding buffer counts and attributes
> Encoding... how?
I can add more details for this over here or over FASTRPC_BUILD_SCALARS definition.
>
>> + * @args: User-space pointer to the argument array
> Which is defined at...?
>
>
> Can you actually write the user code by looking at your uapi header?
will add more details for this.
>
>> + *
>> + * This structure is passed from user-space to invoke a remote function
>> + * on the DSP. The scalars parameter encodes the number and types of
>> + * input/output buffers.
>> + */
>> +struct qda_invoke_args {
>> + __u32 handle;
>> + __u32 sc;
>> + __u64 args;
>> +};
>> +
>> #if defined(__cplusplus)
>> }
>> #endif
>>
>> --
>> 2.34.1
>>
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 12/18] accel/qda: Add PRIME dma-buf import support
2026-02-24 9:12 ` Christian König
@ 2026-03-09 6:59 ` Ekansh Gupta
0 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-09 6:59 UTC (permalink / raw)
To: Christian König, Oded Gabbay, Jonathan Corbet, Shuah Khan,
Joerg Roedel, Will Deacon, Robin Murphy, Maarten Lankhorst,
Maxime Ripard, Thomas Zimmermann, David Airlie, Simona Vetter,
Sumit Semwal
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 2:42 PM, Christian König wrote:
> On 2/23/26 20:09, Ekansh Gupta wrote:
>> [Sie erhalten nicht häufig E-Mails von ekansh.gupta@oss.qualcomm.com. Weitere Informationen, warum dies wichtig ist, finden Sie unter https://aka.ms/LearnAboutSenderIdentification ]
>>
>> Add PRIME dma-buf import support for QDA GEM buffer objects and integrate
>> it with the existing per-process memory manager and IOMMU device model.
>>
>> The implementation extends qda_gem_obj to represent imported dma-bufs,
>> including dma_buf references, attachment state, scatter-gather tables
>> and an imported DMA address used for DSP-facing book-keeping. The
>> qda_gem_prime_import() path handles reimports of buffers originally
>> exported by QDA as well as imports of external dma-bufs, attaching them
>> to the assigned IOMMU device
> That is usually an absolutely clear NO-GO for DMA-bufs. Where exactly in the code is that?
dma_buf_attach* to comute-cb iommu devices are critical for DSPs to access the buffer.
This is needed if the buffer is exported by anyone other than QDA(say system heap). If this is not
the correct way, what should be the right way here? On the current fastrpc driver also,
the DMABUF is getting attached with iommu device[1] due to the same requirement.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/misc/fastrpc.c#n779
>
>> and mapping them through the memory manager
>> for DSP access. The GEM free path is updated to unmap and detach
>> imported buffers while preserving the existing behaviour for locally
>> allocated memory.
>>
>> The PRIME fd-to-handle path is implemented in qda_prime_fd_to_handle(),
>> which records the calling drm_file in a driver-private import context
>> before invoking the core DRM helpers. The GEM import callback retrieves
>> this context to ensure that an IOMMU device is assigned to the process
>> and that imported buffers follow the same per-process IOMMU selection
>> rules as natively allocated GEM objects.
>>
>> This patch prepares the driver for interoperable buffer sharing between
>> QDA and other dma-buf capable subsystems while keeping IOMMU mapping and
>> lifetime handling consistent with the existing GEM allocation flow.
>>
>> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ...
>
>> @@ -15,23 +16,29 @@ static int validate_gem_obj_for_mmap(struct qda_gem_obj *qda_gem_obj)
>> qda_err(NULL, "Invalid GEM object size\n");
>> return -EINVAL;
>> }
>> - if (!qda_gem_obj->iommu_dev || !qda_gem_obj->iommu_dev->dev) {
>> - qda_err(NULL, "Allocated buffer missing IOMMU device\n");
>> - return -EINVAL;
>> - }
>> - if (!qda_gem_obj->iommu_dev->dev) {
>> - qda_err(NULL, "Allocated buffer missing IOMMU device\n");
>> - return -EINVAL;
>> - }
>> - if (!qda_gem_obj->virt) {
>> - qda_err(NULL, "Allocated buffer missing virtual address\n");
>> - return -EINVAL;
>> - }
>> - if (qda_gem_obj->dma_addr == 0) {
>> - qda_err(NULL, "Allocated buffer missing DMA address\n");
>> - return -EINVAL;
>> + if (qda_gem_obj->is_imported) {
> Absolutely clear NAK to that. Imported buffers *can't* be mmaped through the importer!
>
> Userspace needs to mmap() them through the exporter.
>
> If you absolutely have to map them through the importer for uAPI backward compatibility then there is dma_buf_mmap() for that, but this is clearly not the case here.
>
> ...
Okay, the requirement is slightly different here. Any buffer which is not allocated using the
QDA GEM interface needs to be attached to the iommu device for that particular process to
enable DSP for the access. I should not call it `mmap` instead it should be called importing the
buffer to a particular iommu context bank. With this definition, is it fine to keep it this way? Or
should the dma_buf_attach* calls be moved to some other place?
>> +static int qda_memory_manager_map_imported(struct qda_memory_manager *mem_mgr,
>> + struct qda_gem_obj *gem_obj,
>> + struct qda_iommu_device *iommu_dev)
>> +{
>> + struct scatterlist *sg;
>> + dma_addr_t dma_addr;
>> + int ret = 0;
>> +
>> + if (!gem_obj->is_imported || !gem_obj->sgt || !iommu_dev) {
>> + qda_err(NULL, "Invalid parameters for imported buffer mapping\n");
>> + return -EINVAL;
>> + }
>> +
>> + gem_obj->iommu_dev = iommu_dev;
>> +
>> + sg = gem_obj->sgt->sgl;
>> + if (sg) {
>> + dma_addr = sg_dma_address(sg);
>> + dma_addr += ((u64)iommu_dev->sid << 32);
>> +
>> + gem_obj->imported_dma_addr = dma_addr;
> Well that looks like you are only using the first DMA address from the imported sgt. What about the others?
I might have a proper appach for this now, will update in the next spin.
>
> Regards,
> Christian.
^ permalink raw reply [flat|nested] 83+ messages in thread
* Re: [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
` (22 preceding siblings ...)
2026-03-02 15:57 ` Srinivas Kandagatla
@ 2026-03-09 8:07 ` Ekansh Gupta
23 siblings, 0 replies; 83+ messages in thread
From: Ekansh Gupta @ 2026-03-09 8:07 UTC (permalink / raw)
To: Oded Gabbay, Jonathan Corbet, Shuah Khan, Joerg Roedel,
Will Deacon, Robin Murphy, Maarten Lankhorst, Maxime Ripard,
Thomas Zimmermann, David Airlie, Simona Vetter, Sumit Semwal,
Christian König
Cc: dri-devel, linux-doc, linux-kernel, linux-arm-msm, iommu,
linux-media, linaro-mm-sig, Srinivas Kandagatla, Dmitry Baryshkov,
Bharath Kumar, Chenna Kesava Raju
On 2/24/2026 12:38 AM, Ekansh Gupta wrote:
> This patch series introduces the Qualcomm DSP Accelerator (QDA) driver,
> a modern DRM-based accelerator implementation for Qualcomm Hexagon DSPs.
> The driver provides a standardized interface for offloading computational
> tasks to DSPs found on Qualcomm SoCs, supporting all DSP domains (ADSP,
> CDSP, SDSP, GDSP).
>
> The QDA driver is designed as an alternative for the FastRPC driver
> in drivers/misc/, offering improved resource management, better integration
> with standard kernel subsystems, and alignment with the Linux kernel's
> Compute Accelerators framework.
>
> User-space staging branch
> ============
> https://github.com/qualcomm/fastrpc/tree/accel/staging
>
> Key Features
> ============
>
> * Standard DRM accelerator interface via /dev/accel/accelN
> * GEM-based buffer management with DMA-BUF import/export support
> * IOMMU-based memory isolation using per-process context banks
> * FastRPC protocol implementation for DSP communication
> * RPMsg transport layer for reliable message passing
> * Support for all DSP domains (ADSP, CDSP, SDSP, GDSP)
> * Comprehensive IOCTL interface for DSP operations
>
> High-Level Architecture Differences with Existing FastRPC Driver
> =================================================================
>
> The QDA driver represents a significant architectural departure from the
> existing FastRPC driver (drivers/misc/fastrpc.c), addressing several key
> limitations while maintaining protocol compatibility:
>
> 1. DRM Accelerator Framework Integration
> - FastRPC: Custom character device (/dev/fastrpc-*)
> - QDA: Standard DRM accel device (/dev/accel/accelN)
> - Benefit: Leverages established DRM infrastructure for device
> management.
>
> 2. Memory Management
> - FastRPC: Custom memory allocator with ION/DMA-BUF integration
> - QDA: Native GEM objects with full PRIME support
> - Benefit: Seamless buffer sharing using standard DRM mechanisms
>
> 3. IOMMU Context Bank Management
> - FastRPC: Direct IOMMU domain manipulation, limited isolation
> - QDA: Custom compute bus (qda_cb_bus_type) with proper device model
> - Benefit: Each CB device is a proper struct device with IOMMU group
> support, enabling better isolation and resource tracking.
> - https://lore.kernel.org/all/245d602f-3037-4ae3-9af9-d98f37258aae@oss.qualcomm.com/
>
> 4. Memory Manager Architecture
> - FastRPC: Monolithic allocator
> - QDA: Pluggable memory manager with backend abstraction
> - Benefit: Currently uses DMA-coherent backend, easily extensible for
> future memory types (e.g., carveout, CMA)
>
> 5. Transport Layer
> - FastRPC: Direct RPMsg integration in core driver
> - QDA: Abstracted transport layer (qda_rpmsg.c)
> - Benefit: Clean separation of concerns, easier to add alternative
> transports if needed
>
> 8. Code Organization
> - FastRPC: ~3000 lines in single file
> - QDA: Modular design across multiple files (~4600 lines total)
> * qda_drv.c: Core driver and DRM integration
> * qda_gem.c: GEM object management
> * qda_memory_manager.c: Memory and IOMMU management
> * qda_fastrpc.c: FastRPC protocol implementation
> * qda_rpmsg.c: Transport layer
> * qda_cb.c: Context bank device management
> - Benefit: Better maintainability, clearer separation of concerns
>
> 9. UAPI Design
> - FastRPC: Custom IOCTL interface
> - QDA: DRM-style IOCTLs with proper versioning support
> - Benefit: Follows DRM conventions, easier userspace integration
>
> 10. Documentation
> - FastRPC: Minimal in-tree documentation
> - QDA: Comprehensive documentation in Documentation/accel/qda/
> - Benefit: Better developer experience, clearer API contracts
>
> 11. Buffer Reference Mechanism
> - FastRPC: Uses buffer file descriptors (FDs) for all book-keeping
> in both kernel and DSP
> - QDA: Uses GEM handles for kernel-side management, providing better
> integration with DRM subsystem
> - Benefit: Leverages DRM GEM infrastructure for reference counting,
> lifetime management, and integration with other DRM components
>
> Key Technical Improvements
> ===========================
>
> * Proper device model: CB devices are real struct device instances on a
> custom bus, enabling proper IOMMU group management and power management
> integration
>
> * Reference-counted IOMMU devices: Multiple file descriptors from the same
> process share a single IOMMU device, reducing overhead
>
> * GEM-based buffer lifecycle: Automatic cleanup via DRM GEM reference
> counting, eliminating many resource leak scenarios
>
> * Modular memory backends: The memory manager supports pluggable backends,
> currently implementing DMA-coherent allocations with SID-prefixed
> addresses for DSP firmware
>
> * Context-based invocation tracking: XArray-based context management with
> proper synchronization and cleanup
>
> Patch Series Organization
> ==========================
>
> Patches 1-2: Driver skeleton and documentation
> Patches 3-6: RPMsg transport and IOMMU/CB infrastructure
> Patches 7-9: DRM device registration and basic IOCTL
> Patches 10-12: GEM buffer management and PRIME support
> Patches 13-17: FastRPC protocol implementation (attach, invoke, create,
> map/unmap)
> Patch 18: MAINTAINERS entry
>
> Open Items
> ===========
>
> The following items are identified as open items:
>
> 1. Privilege Level Management
> - Currently, daemon processes and user processes have the same access
> level as both use the same accel device node. This needs to be
> addressed as daemons attach to privileged DSP PDs and require
> higher privilege levels for system-level operations
> - Seeking guidance on the best approach: separate device nodes,
> capability-based checks, or DRM master/authentication mechanisms
Hi all, I'm seeking guidance for this open item. I wanted some conclusion on this before
I send out the next version. This requirement is because any malicious application should
not attach to privileged DSP PDs and it's might impact the functionality of the PD by not
providing proper file-operation framework.
>
> 2. UAPI Compatibility Layer
> - Add UAPI compat layer to facilitate migration of client applications
> from existing FastRPC UAPI to the new QDA accel driver UAPI,
> ensuring smooth transition for existing userspace code
> - Seeking guidance on implementation approach: in-kernel translation
> layer, userspace wrapper library, or hybrid solution
>
> 3. Documentation Improvements
> - Add detailed IOCTL usage examples
> - Document DSP firmware interface requirements
> - Create migration guide from existing FastRPC
>
> 4. Per-Domain Memory Allocation
> - Develop new userspace API to support memory allocation on a per
> domain basis, enabling domain-specific memory management and
> optimization
>
> 5. Audio and Sensors PD Support
> - The current patch series does not handle Audio PD and Sensors PD
> functionalities. These specialized protection domains require
> additional support for real-time constraints and power management
>
> Interface Compatibility
> ========================
>
> The QDA driver maintains compatibility with existing FastRPC infrastructure:
>
> * Device Tree Bindings: The driver uses the same device tree bindings as
> the existing FastRPC driver, ensuring no changes are required to device
> tree sources. The "qcom,fastrpc" compatible string and child node
> structure remain unchanged.
>
> * Userspace Interface: While the driver provides a new DRM-based UAPI,
> the underlying FastRPC protocol and DSP firmware interface remain
> compatible. This ensures that DSP firmware and libraries continue to
> work without modification.
>
> * Migration Path: The modular design allows for gradual migration, where
> both drivers can coexist during the transition period. Applications can
> be migrated incrementally to the new UAPI with the help of the planned
> compatibility layer.
>
> References
> ==========
>
> Previous discussions on this migration:
> - https://lkml.org/lkml/2024/6/24/479
> - https://lkml.org/lkml/2024/6/21/1252
>
> Testing
> =======
>
> The driver has been tested on Qualcomm platforms with:
> - Basic FastRPC attach/release operations
> - DSP process creation and initialization
> - Memory mapping/unmapping operations
> - Dynamic invocation with various buffer types
> - GEM buffer allocation and mmap
> - PRIME buffer import from other subsystems
>
> Signed-off-by: Ekansh Gupta <ekansh.gupta@oss.qualcomm.com>
> ---
> Ekansh Gupta (18):
> accel/qda: Add Qualcomm QDA DSP accelerator driver docs
> accel/qda: Add Qualcomm DSP accelerator driver skeleton
> accel/qda: Add RPMsg transport for Qualcomm DSP accelerator
> accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU
> accel/qda: Create compute CB devices on QDA compute bus
> accel/qda: Add memory manager for CB devices
> accel/qda: Add DRM accel device registration for QDA driver
> accel/qda: Add per-file DRM context and open/close handling
> accel/qda: Add QUERY IOCTL and basic QDA UAPI header
> accel/qda: Add DMA-backed GEM objects and memory manager integration
> accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs
> accel/qda: Add PRIME dma-buf import support
> accel/qda: Add initial FastRPC attach and release support
> accel/qda: Add FastRPC dynamic invocation support
> accel/qda: Add FastRPC DSP process creation support
> accel/qda: Add FastRPC-based DSP memory mapping support
> accel/qda: Add FastRPC-based DSP memory unmapping support
> MAINTAINERS: Add MAINTAINERS entry for QDA driver
>
> Documentation/accel/index.rst | 1 +
> Documentation/accel/qda/index.rst | 14 +
> Documentation/accel/qda/qda.rst | 129 ++++
> MAINTAINERS | 9 +
> arch/arm64/configs/defconfig | 2 +
> drivers/accel/Kconfig | 1 +
> drivers/accel/Makefile | 2 +
> drivers/accel/qda/Kconfig | 35 ++
> drivers/accel/qda/Makefile | 19 +
> drivers/accel/qda/qda_cb.c | 182 ++++++
> drivers/accel/qda/qda_cb.h | 26 +
> drivers/accel/qda/qda_compute_bus.c | 23 +
> drivers/accel/qda/qda_drv.c | 375 ++++++++++++
> drivers/accel/qda/qda_drv.h | 171 ++++++
> drivers/accel/qda/qda_fastrpc.c | 1002 ++++++++++++++++++++++++++++++++
> drivers/accel/qda/qda_fastrpc.h | 433 ++++++++++++++
> drivers/accel/qda/qda_gem.c | 211 +++++++
> drivers/accel/qda/qda_gem.h | 103 ++++
> drivers/accel/qda/qda_ioctl.c | 271 +++++++++
> drivers/accel/qda/qda_ioctl.h | 118 ++++
> drivers/accel/qda/qda_memory_dma.c | 91 +++
> drivers/accel/qda/qda_memory_dma.h | 46 ++
> drivers/accel/qda/qda_memory_manager.c | 382 ++++++++++++
> drivers/accel/qda/qda_memory_manager.h | 148 +++++
> drivers/accel/qda/qda_prime.c | 194 +++++++
> drivers/accel/qda/qda_prime.h | 43 ++
> drivers/accel/qda/qda_rpmsg.c | 327 +++++++++++
> drivers/accel/qda/qda_rpmsg.h | 57 ++
> drivers/iommu/iommu.c | 4 +
> include/linux/qda_compute_bus.h | 22 +
> include/uapi/drm/qda_accel.h | 224 +++++++
> 31 files changed, 4665 insertions(+)
> ---
> base-commit: d4906ae14a5f136ceb671bb14cedbf13fa560da6
> change-id: 20260223-qda-firstpost-4ab05249e2cc
>
> Best regards,
^ permalink raw reply [flat|nested] 83+ messages in thread
end of thread, other threads:[~2026-03-09 8:07 UTC | newest]
Thread overview: 83+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <vU2QyEVqOu-D3eGp7BZFICUeauxL32bwWzeidOAijoeVaJTk8KcRVsaQQD4MdFQEcaQTZ5RkzRsz9-Lhl1qsqg==@protonmail.internalid>
2026-02-23 19:08 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Ekansh Gupta
2026-02-23 19:08 ` [PATCH RFC 01/18] accel/qda: Add Qualcomm QDA DSP accelerator driver docs Ekansh Gupta
2026-02-23 21:17 ` Dmitry Baryshkov
2026-02-25 13:57 ` Ekansh Gupta
2026-02-25 17:17 ` Dmitry Baryshkov
2026-02-24 3:33 ` Trilok Soni
2026-02-25 14:17 ` Ekansh Gupta
2026-02-25 15:12 ` Bjorn Andersson
2026-02-25 19:16 ` Trilok Soni
2026-02-25 19:40 ` Dmitry Baryshkov
2026-02-25 23:18 ` Trilok Soni
2026-02-23 19:08 ` [PATCH RFC 02/18] accel/qda: Add Qualcomm DSP accelerator driver skeleton Ekansh Gupta
2026-02-23 21:52 ` Bjorn Andersson
2026-02-25 14:20 ` Ekansh Gupta
2026-02-23 19:08 ` [PATCH RFC 03/18] accel/qda: Add RPMsg transport for Qualcomm DSP accelerator Ekansh Gupta
2026-02-23 21:23 ` Dmitry Baryshkov
2026-02-23 21:50 ` Bjorn Andersson
2026-02-23 22:12 ` Dmitry Baryshkov
2026-02-23 22:25 ` Bjorn Andersson
2026-02-23 22:41 ` Dmitry Baryshkov
2026-02-25 17:16 ` Ekansh Gupta
2026-02-23 19:08 ` [PATCH RFC 04/18] accel/qda: Add built-in compute CB bus for QDA and integrate with IOMMU Ekansh Gupta
2026-02-23 22:44 ` Dmitry Baryshkov
2026-02-25 17:56 ` Ekansh Gupta
2026-02-25 19:09 ` Dmitry Baryshkov
2026-03-02 8:12 ` Ekansh Gupta
2026-02-26 10:46 ` Krzysztof Kozlowski
2026-02-23 19:08 ` [PATCH RFC 05/18] accel/qda: Create compute CB devices on QDA compute bus Ekansh Gupta
2026-02-23 22:49 ` Dmitry Baryshkov
2026-02-26 8:38 ` Ekansh Gupta
2026-02-26 10:46 ` Dmitry Baryshkov
2026-03-02 8:10 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 06/18] accel/qda: Add memory manager for CB devices Ekansh Gupta
2026-02-23 22:50 ` Dmitry Baryshkov
2026-03-02 8:15 ` Ekansh Gupta
2026-03-04 4:22 ` Dmitry Baryshkov
2026-02-23 23:11 ` Bjorn Andersson
2026-03-02 8:30 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 07/18] accel/qda: Add DRM accel device registration for QDA driver Ekansh Gupta
2026-02-23 22:16 ` Dmitry Baryshkov
2026-03-02 8:33 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 08/18] accel/qda: Add per-file DRM context and open/close handling Ekansh Gupta
2026-02-23 22:20 ` Dmitry Baryshkov
2026-03-02 8:36 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 09/18] accel/qda: Add QUERY IOCTL and basic QDA UAPI header Ekansh Gupta
2026-02-23 22:24 ` Dmitry Baryshkov
2026-03-02 8:41 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 10/18] accel/qda: Add DMA-backed GEM objects and memory manager integration Ekansh Gupta
2026-02-23 22:36 ` Dmitry Baryshkov
2026-03-02 9:06 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 11/18] accel/qda: Add GEM_CREATE and GEM_MMAP_OFFSET IOCTLs Ekansh Gupta
2026-02-23 22:39 ` Dmitry Baryshkov
2026-03-02 9:07 ` Ekansh Gupta
2026-02-24 9:05 ` Christian König
2026-03-02 9:08 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 12/18] accel/qda: Add PRIME dma-buf import support Ekansh Gupta
2026-02-24 8:52 ` Matthew Brost
2026-03-02 9:19 ` Ekansh Gupta
2026-02-24 9:12 ` Christian König
2026-03-09 6:59 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 13/18] accel/qda: Add initial FastRPC attach and release support Ekansh Gupta
2026-02-23 23:07 ` Dmitry Baryshkov
2026-03-09 6:50 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 14/18] accel/qda: Add FastRPC dynamic invocation support Ekansh Gupta
2026-02-23 23:10 ` Dmitry Baryshkov
2026-03-09 6:53 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 15/18] accel/qda: Add FastRPC DSP process creation support Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 16/18] accel/qda: Add FastRPC-based DSP memory mapping support Ekansh Gupta
2026-02-26 10:48 ` Krzysztof Kozlowski
2026-03-02 9:12 ` Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 17/18] accel/qda: Add FastRPC-based DSP memory unmapping support Ekansh Gupta
2026-02-23 19:09 ` [PATCH RFC 18/18] MAINTAINERS: Add MAINTAINERS entry for QDA driver Ekansh Gupta
2026-02-23 22:40 ` Dmitry Baryshkov
2026-03-02 8:41 ` Ekansh Gupta
2026-02-23 22:03 ` [PATCH RFC 00/18] accel/qda: Introduce Qualcomm DSP Accelerator driver Bjorn Andersson
2026-03-02 8:54 ` Ekansh Gupta
2026-02-24 3:37 ` Trilok Soni
2026-02-24 3:39 ` Trilok Soni
2026-03-02 8:43 ` Ekansh Gupta
2026-02-25 13:42 ` Bryan O'Donoghue
2026-02-25 19:12 ` Dmitry Baryshkov
2026-03-02 15:57 ` Srinivas Kandagatla
2026-03-09 8:07 ` Ekansh Gupta
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox