public inbox for dmaengine@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/7] Add support of AMD AE4DMA DMA Engine
@ 2024-07-08 14:44 Basavaraj Natikar
  2024-07-08 14:44 ` [PATCH v5 1/7] dmaengine: Move AMD DMA driver to separate directory Basavaraj Natikar
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Basavaraj Natikar @ 2024-07-08 14:44 UTC (permalink / raw)
  To: vkoul, dmaengine
  Cc: Raju.Rangoju, Frank.li, helgaas, pstanner, Basavaraj Natikar

AMD AE4DMA Controller is a multi-queue DMA controller. Its design differs
significantly from PTDMA controller, although some functionalities
overlap. All functionalities similar to PTDMA are extended and merged
within PTDMA code to support both PTDMA and AE4DMA for code reuse. A new
AE4DMA driver directory is created to house unique AE4DMA code, ensuring
efficient handling of AE4DMA functionalities.

Changes in v5:
	- Initialized the buffer to zero before using snprintf.
	- Changed the function name to avoid a duplicate symbol error,
	  as pointed out in the link below.
Link: https://lore.kernel.org/all/202407040547.MfCINdp6-lkp@intel.com/

Changes in v4:
	- Removed blank lines and grouped the mutexes into a single block. 
	- Replaced all magic numbers with #define macros.
	- Added short comment for error codes.
	- Replaced the while loop with a for loop.

Changes in v3:
	- Added COMPILE_TEST to the Kconfig.
	- Considered using pci_alloc_irq_vectors and pci_free_irq_vectors.
	- Removed mb() and atomic variables where applicable. 
Link: https://lore.kernel.org/oe-kbuild-all/202406240021.ytiS3jV6-lkp@intel.com/

Changes in v2:
	- Changed all variants of "mb" to "dma_mb.
	- Changed cancel_delayed_work to cancel_delayed_work_sync.
	- Removed 32bit dma_set_mask_and_coherent.
	- Rearrange the order of the #include directives alphabetically.

Basavaraj Natikar (7):
  dmaengine: Move AMD DMA driver to separate directory
  dmaengine: ae4dma: Add AMD ae4dma controller driver
  dmaengine: ptdma: Move common functions to common code
  dmaengine: ptdma: Extend ptdma to support multi-channel and version
  dmaengine: ae4dma: Register AE4DMA using pt_dmaengine_register
  dmaengine: ptdma: Extend ptdma-debugfs to support multi-queue
  dmaengine: ae4dma: Register debugfs using ptdma_debugfs_setup

 MAINTAINERS                                   |   9 +-
 drivers/dma/Kconfig                           |   4 +-
 drivers/dma/Makefile                          |   2 +-
 drivers/dma/amd/Kconfig                       |   6 +
 drivers/dma/amd/Makefile                      |   7 +
 drivers/dma/amd/ae4dma/Kconfig                |  14 +
 drivers/dma/amd/ae4dma/Makefile               |  10 +
 drivers/dma/amd/ae4dma/ae4dma-dev.c           | 261 ++++++++++++++++++
 drivers/dma/amd/ae4dma/ae4dma-pci.c           | 158 +++++++++++
 drivers/dma/amd/ae4dma/ae4dma.h               |  89 ++++++
 drivers/dma/amd/common/amd_dma.h              |  30 ++
 drivers/dma/{ => amd}/ptdma/Kconfig           |   0
 drivers/dma/{ => amd}/ptdma/Makefile          |   0
 drivers/dma/{ => amd}/ptdma/ptdma-debugfs.c   |  80 ++++--
 drivers/dma/{ => amd}/ptdma/ptdma-dev.c       |   2 +-
 drivers/dma/{ => amd}/ptdma/ptdma-dmaengine.c | 120 ++++++--
 drivers/dma/{ => amd}/ptdma/ptdma-pci.c       |   0
 drivers/dma/{ => amd}/ptdma/ptdma.h           |   4 +-
 18 files changed, 743 insertions(+), 53 deletions(-)
 create mode 100644 drivers/dma/amd/Kconfig
 create mode 100644 drivers/dma/amd/Makefile
 create mode 100644 drivers/dma/amd/ae4dma/Kconfig
 create mode 100644 drivers/dma/amd/ae4dma/Makefile
 create mode 100644 drivers/dma/amd/ae4dma/ae4dma-dev.c
 create mode 100644 drivers/dma/amd/ae4dma/ae4dma-pci.c
 create mode 100644 drivers/dma/amd/ae4dma/ae4dma.h
 create mode 100644 drivers/dma/amd/common/amd_dma.h
 rename drivers/dma/{ => amd}/ptdma/Kconfig (100%)
 rename drivers/dma/{ => amd}/ptdma/Makefile (100%)
 rename drivers/dma/{ => amd}/ptdma/ptdma-debugfs.c (52%)
 rename drivers/dma/{ => amd}/ptdma/ptdma-dev.c (99%)
 rename drivers/dma/{ => amd}/ptdma/ptdma-dmaengine.c (76%)
 rename drivers/dma/{ => amd}/ptdma/ptdma-pci.c (100%)
 rename drivers/dma/{ => amd}/ptdma/ptdma.h (99%)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v5 1/7] dmaengine: Move AMD DMA driver to separate directory
  2024-07-08 14:44 [PATCH v5 0/7] Add support of AMD AE4DMA DMA Engine Basavaraj Natikar
@ 2024-07-08 14:44 ` Basavaraj Natikar
  2024-07-08 14:44 ` [PATCH v5 2/7] dmaengine: ae4dma: Add AMD ae4dma controller driver Basavaraj Natikar
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Basavaraj Natikar @ 2024-07-08 14:44 UTC (permalink / raw)
  To: vkoul, dmaengine
  Cc: Raju.Rangoju, Frank.li, helgaas, pstanner, Basavaraj Natikar

Currently, AMD PTDMA driver is single DMA driver supported and newer AMD
platforms supports newer DMA engine. Hence move the current drivers to
separate directory. This would also mean the newer driver submissions to
AMD DMA driver in the future will also land in AMD specific directory.

Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
---
 MAINTAINERS                                   | 2 +-
 drivers/dma/Kconfig                           | 4 ++--
 drivers/dma/Makefile                          | 2 +-
 drivers/dma/amd/Kconfig                       | 5 +++++
 drivers/dma/amd/Makefile                      | 6 ++++++
 drivers/dma/{ => amd}/ptdma/Kconfig           | 0
 drivers/dma/{ => amd}/ptdma/Makefile          | 0
 drivers/dma/{ => amd}/ptdma/ptdma-debugfs.c   | 0
 drivers/dma/{ => amd}/ptdma/ptdma-dev.c       | 0
 drivers/dma/{ => amd}/ptdma/ptdma-dmaengine.c | 3 +--
 drivers/dma/{ => amd}/ptdma/ptdma-pci.c       | 0
 drivers/dma/{ => amd}/ptdma/ptdma.h           | 2 +-
 12 files changed, 17 insertions(+), 7 deletions(-)
 create mode 100644 drivers/dma/amd/Kconfig
 create mode 100644 drivers/dma/amd/Makefile
 rename drivers/dma/{ => amd}/ptdma/Kconfig (100%)
 rename drivers/dma/{ => amd}/ptdma/Makefile (100%)
 rename drivers/dma/{ => amd}/ptdma/ptdma-debugfs.c (100%)
 rename drivers/dma/{ => amd}/ptdma/ptdma-dev.c (100%)
 rename drivers/dma/{ => amd}/ptdma/ptdma-dmaengine.c (99%)
 rename drivers/dma/{ => amd}/ptdma/ptdma-pci.c (100%)
 rename drivers/dma/{ => amd}/ptdma/ptdma.h (99%)

diff --git a/MAINTAINERS b/MAINTAINERS
index da5352dbd4f3..33a1049fd38b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1113,7 +1113,7 @@ AMD PTDMA DRIVER
 M:	Basavaraj Natikar <Basavaraj.Natikar@amd.com>
 L:	dmaengine@vger.kernel.org
 S:	Maintained
-F:	drivers/dma/ptdma/
+F:	drivers/dma/amd/ptdma/
 
 AMD SEATTLE DEVICE TREE SUPPORT
 M:	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 9fc99cfbef08..e4e424ebdac4 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -752,8 +752,6 @@ source "drivers/dma/bestcomm/Kconfig"
 
 source "drivers/dma/mediatek/Kconfig"
 
-source "drivers/dma/ptdma/Kconfig"
-
 source "drivers/dma/qcom/Kconfig"
 
 source "drivers/dma/dw/Kconfig"
@@ -772,6 +770,8 @@ source "drivers/dma/fsl-dpaa2-qdma/Kconfig"
 
 source "drivers/dma/lgm/Kconfig"
 
+source "drivers/dma/amd/Kconfig"
+
 # clients
 comment "DMA Clients"
 	depends on DMA_ENGINE
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index 802ca916f05f..c59948a0dc7c 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -16,7 +16,7 @@ obj-$(CONFIG_DMATEST) += dmatest.o
 obj-$(CONFIG_ALTERA_MSGDMA) += altera-msgdma.o
 obj-$(CONFIG_AMBA_PL08X) += amba-pl08x.o
 obj-$(CONFIG_AMCC_PPC440SPE_ADMA) += ppc4xx/
-obj-$(CONFIG_AMD_PTDMA) += ptdma/
+obj-y += amd/
 obj-$(CONFIG_APPLE_ADMAC) += apple-admac.o
 obj-$(CONFIG_AT_HDMAC) += at_hdmac.o
 obj-$(CONFIG_AT_XDMAC) += at_xdmac.o
diff --git a/drivers/dma/amd/Kconfig b/drivers/dma/amd/Kconfig
new file mode 100644
index 000000000000..8246b463bcf7
--- /dev/null
+++ b/drivers/dma/amd/Kconfig
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# AMD DMA Drivers
+
+source "drivers/dma/amd/ptdma/Kconfig"
diff --git a/drivers/dma/amd/Makefile b/drivers/dma/amd/Makefile
new file mode 100644
index 000000000000..dd7257ba7e06
--- /dev/null
+++ b/drivers/dma/amd/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# AMD DMA drivers
+#
+
+obj-$(CONFIG_AMD_PTDMA) += ptdma/
diff --git a/drivers/dma/ptdma/Kconfig b/drivers/dma/amd/ptdma/Kconfig
similarity index 100%
rename from drivers/dma/ptdma/Kconfig
rename to drivers/dma/amd/ptdma/Kconfig
diff --git a/drivers/dma/ptdma/Makefile b/drivers/dma/amd/ptdma/Makefile
similarity index 100%
rename from drivers/dma/ptdma/Makefile
rename to drivers/dma/amd/ptdma/Makefile
diff --git a/drivers/dma/ptdma/ptdma-debugfs.c b/drivers/dma/amd/ptdma/ptdma-debugfs.c
similarity index 100%
rename from drivers/dma/ptdma/ptdma-debugfs.c
rename to drivers/dma/amd/ptdma/ptdma-debugfs.c
diff --git a/drivers/dma/ptdma/ptdma-dev.c b/drivers/dma/amd/ptdma/ptdma-dev.c
similarity index 100%
rename from drivers/dma/ptdma/ptdma-dev.c
rename to drivers/dma/amd/ptdma/ptdma-dev.c
diff --git a/drivers/dma/ptdma/ptdma-dmaengine.c b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
similarity index 99%
rename from drivers/dma/ptdma/ptdma-dmaengine.c
rename to drivers/dma/amd/ptdma/ptdma-dmaengine.c
index f79240734807..a2e7c2cec15e 100644
--- a/drivers/dma/ptdma/ptdma-dmaengine.c
+++ b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
@@ -10,8 +10,7 @@
  */
 
 #include "ptdma.h"
-#include "../dmaengine.h"
-#include "../virt-dma.h"
+#include "../../dmaengine.h"
 
 static inline struct pt_dma_chan *to_pt_chan(struct dma_chan *dma_chan)
 {
diff --git a/drivers/dma/ptdma/ptdma-pci.c b/drivers/dma/amd/ptdma/ptdma-pci.c
similarity index 100%
rename from drivers/dma/ptdma/ptdma-pci.c
rename to drivers/dma/amd/ptdma/ptdma-pci.c
diff --git a/drivers/dma/ptdma/ptdma.h b/drivers/dma/amd/ptdma/ptdma.h
similarity index 99%
rename from drivers/dma/ptdma/ptdma.h
rename to drivers/dma/amd/ptdma/ptdma.h
index 21b4bf895200..2690a32fc7cb 100644
--- a/drivers/dma/ptdma/ptdma.h
+++ b/drivers/dma/amd/ptdma/ptdma.h
@@ -22,7 +22,7 @@
 #include <linux/wait.h>
 #include <linux/dmapool.h>
 
-#include "../virt-dma.h"
+#include "../../virt-dma.h"
 
 #define MAX_PT_NAME_LEN			16
 #define MAX_DMAPOOL_NAME_LEN		32
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 2/7] dmaengine: ae4dma: Add AMD ae4dma controller driver
  2024-07-08 14:44 [PATCH v5 0/7] Add support of AMD AE4DMA DMA Engine Basavaraj Natikar
  2024-07-08 14:44 ` [PATCH v5 1/7] dmaengine: Move AMD DMA driver to separate directory Basavaraj Natikar
@ 2024-07-08 14:44 ` Basavaraj Natikar
  2024-08-28 13:23   ` Vinod Koul
  2024-07-08 14:44 ` [PATCH v5 3/7] dmaengine: ptdma: Move common functions to common code Basavaraj Natikar
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Basavaraj Natikar @ 2024-07-08 14:44 UTC (permalink / raw)
  To: vkoul, dmaengine
  Cc: Raju.Rangoju, Frank.li, helgaas, pstanner, Basavaraj Natikar

Add support for AMD AE4DMA controller. It performs high-bandwidth
memory to memory and IO copy operation. Device commands are managed
via a circular queue of 'descriptors', each of which specifies source
and destination addresses for copying a single buffer of data.

Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
---
 MAINTAINERS                         |   6 +
 drivers/dma/amd/Kconfig             |   1 +
 drivers/dma/amd/Makefile            |   1 +
 drivers/dma/amd/ae4dma/Kconfig      |  14 ++
 drivers/dma/amd/ae4dma/Makefile     |  10 ++
 drivers/dma/amd/ae4dma/ae4dma-dev.c | 198 ++++++++++++++++++++++++++++
 drivers/dma/amd/ae4dma/ae4dma-pci.c | 157 ++++++++++++++++++++++
 drivers/dma/amd/ae4dma/ae4dma.h     |  85 ++++++++++++
 drivers/dma/amd/common/amd_dma.h    |  26 ++++
 9 files changed, 498 insertions(+)
 create mode 100644 drivers/dma/amd/ae4dma/Kconfig
 create mode 100644 drivers/dma/amd/ae4dma/Makefile
 create mode 100644 drivers/dma/amd/ae4dma/ae4dma-dev.c
 create mode 100644 drivers/dma/amd/ae4dma/ae4dma-pci.c
 create mode 100644 drivers/dma/amd/ae4dma/ae4dma.h
 create mode 100644 drivers/dma/amd/common/amd_dma.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 33a1049fd38b..539bf52410de 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -947,6 +947,12 @@ L:	linux-edac@vger.kernel.org
 S:	Supported
 F:	drivers/ras/amd/atl/*
 
+AMD AE4DMA DRIVER
+M:	Basavaraj Natikar <Basavaraj.Natikar@amd.com>
+L:	dmaengine@vger.kernel.org
+S:	Maintained
+F:	drivers/dma/amd/ae4dma/
+
 AMD AXI W1 DRIVER
 M:	Kris Chaplin <kris.chaplin@amd.com>
 R:	Thomas Delev <thomas.delev@amd.com>
diff --git a/drivers/dma/amd/Kconfig b/drivers/dma/amd/Kconfig
index 8246b463bcf7..8c25a3ed6b94 100644
--- a/drivers/dma/amd/Kconfig
+++ b/drivers/dma/amd/Kconfig
@@ -3,3 +3,4 @@
 # AMD DMA Drivers
 
 source "drivers/dma/amd/ptdma/Kconfig"
+source "drivers/dma/amd/ae4dma/Kconfig"
diff --git a/drivers/dma/amd/Makefile b/drivers/dma/amd/Makefile
index dd7257ba7e06..8049b06a9ff5 100644
--- a/drivers/dma/amd/Makefile
+++ b/drivers/dma/amd/Makefile
@@ -4,3 +4,4 @@
 #
 
 obj-$(CONFIG_AMD_PTDMA) += ptdma/
+obj-$(CONFIG_AMD_AE4DMA) += ae4dma/
diff --git a/drivers/dma/amd/ae4dma/Kconfig b/drivers/dma/amd/ae4dma/Kconfig
new file mode 100644
index 000000000000..ea8a7fe68df5
--- /dev/null
+++ b/drivers/dma/amd/ae4dma/Kconfig
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+config AMD_AE4DMA
+	tristate  "AMD AE4DMA Engine"
+	depends on (X86_64 || COMPILE_TEST) && PCI
+	depends on AMD_PTDMA
+	select DMA_ENGINE
+	select DMA_VIRTUAL_CHANNELS
+	help
+	  Enable support for the AMD AE4DMA controller. This controller
+	  provides DMA capabilities to perform high bandwidth memory to
+	  memory and IO copy operations. It performs DMA transfer through
+	  queue-based descriptor management. This DMA controller is intended
+	  to be used with AMD Non-Transparent Bridge devices and not for
+	  general purpose peripheral DMA.
diff --git a/drivers/dma/amd/ae4dma/Makefile b/drivers/dma/amd/ae4dma/Makefile
new file mode 100644
index 000000000000..e918f85a80ec
--- /dev/null
+++ b/drivers/dma/amd/ae4dma/Makefile
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+# AMD AE4DMA driver
+#
+
+obj-$(CONFIG_AMD_AE4DMA) += ae4dma.o
+
+ae4dma-objs := ae4dma-dev.o
+
+ae4dma-$(CONFIG_PCI) += ae4dma-pci.o
diff --git a/drivers/dma/amd/ae4dma/ae4dma-dev.c b/drivers/dma/amd/ae4dma/ae4dma-dev.c
new file mode 100644
index 000000000000..c38464b96fc6
--- /dev/null
+++ b/drivers/dma/amd/ae4dma/ae4dma-dev.c
@@ -0,0 +1,198 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * AMD AE4DMA driver
+ *
+ * Copyright (c) 2024, Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
+ */
+
+#include "ae4dma.h"
+
+static unsigned int max_hw_q = 1;
+module_param(max_hw_q, uint, 0444);
+MODULE_PARM_DESC(max_hw_q, "max hw queues supported by engine (any non-zero value, default: 1)");
+
+static char *ae4_error_codes[] = {
+	"",
+	"ERR 01: INVALID HEADER DW0",
+	"ERR 02: INVALID STATUS",
+	"ERR 03: INVALID LENGTH - 4 BYTE ALIGNMENT",
+	"ERR 04: INVALID SRC ADDR - 4 BYTE ALIGNMENT",
+	"ERR 05: INVALID DST ADDR - 4 BYTE ALIGNMENT",
+	"ERR 06: INVALID ALIGNMENT",
+	"ERR 07: INVALID DESCRIPTOR",
+};
+
+static void ae4_log_error(struct pt_device *d, int e)
+{
+	/* ERR 01 - 07 represents Invalid AE4 errors */
+	if (e <= 7)
+		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", ae4_error_codes[e], e);
+	/* ERR 08 - 15 represents Invalid Descriptor errors */
+	else if (e > 7 && e <= 15)
+		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "INVALID DESCRIPTOR", e);
+	/* ERR 16 - 31 represents Firmware errors */
+	else if (e > 15 && e <= 31)
+		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "FIRMWARE ERROR", e);
+	/* ERR 32 - 63 represents Fatal errors */
+	else if (e > 31 && e <= 63)
+		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "FATAL ERROR", e);
+	/* ERR 64 - 255 represents PTE errors */
+	else if (e > 63 && e <= 255)
+		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "PTE ERROR", e);
+	else
+		dev_info(d->dev, "Unknown AE4DMA error");
+}
+
+static void ae4_check_status_error(struct ae4_cmd_queue *ae4cmd_q, int idx)
+{
+	struct pt_cmd_queue *cmd_q = &ae4cmd_q->cmd_q;
+	struct ae4dma_desc desc;
+	u8 status;
+
+	memcpy(&desc, &cmd_q->qbase[idx], sizeof(struct ae4dma_desc));
+	status = desc.dw1.status;
+	if (status && status != AE4_DESC_COMPLETED) {
+		cmd_q->cmd_error = desc.dw1.err_code;
+		if (cmd_q->cmd_error)
+			ae4_log_error(cmd_q->pt, cmd_q->cmd_error);
+	}
+}
+
+static void ae4_pending_work(struct work_struct *work)
+{
+	struct ae4_cmd_queue *ae4cmd_q = container_of(work, struct ae4_cmd_queue, p_work.work);
+	struct pt_cmd_queue *cmd_q = &ae4cmd_q->cmd_q;
+	struct pt_cmd *cmd;
+	u32 cridx;
+
+	for (;;) {
+		wait_event_interruptible(ae4cmd_q->q_w,
+					 ((atomic64_read(&ae4cmd_q->done_cnt)) <
+					   atomic64_read(&ae4cmd_q->intr_cnt)));
+
+		atomic64_inc(&ae4cmd_q->done_cnt);
+
+		mutex_lock(&ae4cmd_q->cmd_lock);
+		cridx = readl(cmd_q->reg_control + AE4_RD_IDX_OFF);
+		while ((ae4cmd_q->dridx != cridx) && !list_empty(&ae4cmd_q->cmd)) {
+			cmd = list_first_entry(&ae4cmd_q->cmd, struct pt_cmd, entry);
+			list_del(&cmd->entry);
+
+			ae4_check_status_error(ae4cmd_q, ae4cmd_q->dridx);
+			cmd->pt_cmd_callback(cmd->data, cmd->ret);
+
+			ae4cmd_q->q_cmd_count--;
+			ae4cmd_q->dridx = (ae4cmd_q->dridx + 1) % CMD_Q_LEN;
+
+			complete_all(&ae4cmd_q->cmp);
+		}
+		mutex_unlock(&ae4cmd_q->cmd_lock);
+	}
+}
+
+static irqreturn_t ae4_core_irq_handler(int irq, void *data)
+{
+	struct ae4_cmd_queue *ae4cmd_q = data;
+	struct pt_cmd_queue *cmd_q;
+	struct pt_device *pt;
+	u32 status;
+
+	cmd_q = &ae4cmd_q->cmd_q;
+	pt = cmd_q->pt;
+
+	pt->total_interrupts++;
+	atomic64_inc(&ae4cmd_q->intr_cnt);
+
+	wake_up(&ae4cmd_q->q_w);
+
+	status = readl(cmd_q->reg_control + AE4_INTR_STS_OFF);
+	if (status & BIT(0)) {
+		status &= GENMASK(31, 1);
+		writel(status, cmd_q->reg_control + AE4_INTR_STS_OFF);
+	}
+
+	return IRQ_HANDLED;
+}
+
+void ae4_destroy_work(struct ae4_device *ae4)
+{
+	struct ae4_cmd_queue *ae4cmd_q;
+	int i;
+
+	for (i = 0; i < ae4->cmd_q_count; i++) {
+		ae4cmd_q = &ae4->ae4cmd_q[i];
+
+		if (!ae4cmd_q->pws)
+			break;
+
+		cancel_delayed_work_sync(&ae4cmd_q->p_work);
+		destroy_workqueue(ae4cmd_q->pws);
+	}
+}
+
+int ae4_core_init(struct ae4_device *ae4)
+{
+	struct pt_device *pt = &ae4->pt;
+	struct ae4_cmd_queue *ae4cmd_q;
+	struct device *dev = pt->dev;
+	struct pt_cmd_queue *cmd_q;
+	int i, ret = 0;
+
+	writel(max_hw_q, pt->io_regs);
+
+	for (i = 0; i < max_hw_q; i++) {
+		ae4cmd_q = &ae4->ae4cmd_q[i];
+		ae4cmd_q->id = ae4->cmd_q_count;
+		ae4->cmd_q_count++;
+
+		cmd_q = &ae4cmd_q->cmd_q;
+		cmd_q->pt = pt;
+
+		cmd_q->reg_control = pt->io_regs + ((i + 1) * AE4_Q_SZ);
+
+		ret = devm_request_irq(dev, ae4->ae4_irq[i], ae4_core_irq_handler, 0,
+				       dev_name(pt->dev), ae4cmd_q);
+		if (ret)
+			return ret;
+
+		cmd_q->qsize = Q_SIZE(sizeof(struct ae4dma_desc));
+
+		cmd_q->qbase = dmam_alloc_coherent(dev, cmd_q->qsize, &cmd_q->qbase_dma,
+						   GFP_KERNEL);
+		if (!cmd_q->qbase)
+			return -ENOMEM;
+	}
+
+	for (i = 0; i < ae4->cmd_q_count; i++) {
+		ae4cmd_q = &ae4->ae4cmd_q[i];
+
+		cmd_q = &ae4cmd_q->cmd_q;
+
+		cmd_q->reg_control = pt->io_regs + ((i + 1) * AE4_Q_SZ);
+
+		/* Update the device registers with queue information. */
+		writel(CMD_Q_LEN, cmd_q->reg_control + AE4_MAX_IDX_OFF);
+
+		cmd_q->qdma_tail = cmd_q->qbase_dma;
+		writel(lower_32_bits(cmd_q->qdma_tail), cmd_q->reg_control + AE4_Q_BASE_L_OFF);
+		writel(upper_32_bits(cmd_q->qdma_tail), cmd_q->reg_control + AE4_Q_BASE_H_OFF);
+
+		INIT_LIST_HEAD(&ae4cmd_q->cmd);
+		init_waitqueue_head(&ae4cmd_q->q_w);
+
+		ae4cmd_q->pws = alloc_ordered_workqueue("ae4dma_%d", WQ_MEM_RECLAIM, ae4cmd_q->id);
+		if (!ae4cmd_q->pws) {
+			ae4_destroy_work(ae4);
+			return -ENOMEM;
+		}
+		INIT_DELAYED_WORK(&ae4cmd_q->p_work, ae4_pending_work);
+		queue_delayed_work(ae4cmd_q->pws, &ae4cmd_q->p_work,  usecs_to_jiffies(100));
+
+		init_completion(&ae4cmd_q->cmp);
+	}
+
+	return ret;
+}
diff --git a/drivers/dma/amd/ae4dma/ae4dma-pci.c b/drivers/dma/amd/ae4dma/ae4dma-pci.c
new file mode 100644
index 000000000000..43d36e9d1efb
--- /dev/null
+++ b/drivers/dma/amd/ae4dma/ae4dma-pci.c
@@ -0,0 +1,157 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * AMD AE4DMA driver
+ *
+ * Copyright (c) 2024, Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
+ */
+
+#include "ae4dma.h"
+
+static int ae4_get_irqs(struct ae4_device *ae4)
+{
+	struct ae4_msix *ae4_msix = ae4->ae4_msix;
+	struct pt_device *pt = &ae4->pt;
+	struct device *dev = pt->dev;
+	struct pci_dev *pdev;
+	int i, v, ret;
+
+	pdev = to_pci_dev(dev);
+
+	for (v = 0; v < ARRAY_SIZE(ae4_msix->msix_entry); v++)
+		ae4_msix->msix_entry[v].entry = v;
+
+	ret = pci_alloc_irq_vectors(pdev, v, v, PCI_IRQ_MSIX);
+	if (ret != v) {
+		if (ret > 0)
+			pci_free_irq_vectors(pdev);
+
+		dev_err(dev, "could not enable MSI-X (%d), trying MSI\n", ret);
+		ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
+		if (ret < 0) {
+			dev_err(dev, "could not enable MSI (%d)\n", ret);
+			return ret;
+		}
+
+		ret = pci_irq_vector(pdev, 0);
+		if (ret < 0) {
+			pci_free_irq_vectors(pdev);
+			return ret;
+		}
+
+		for (i = 0; i < MAX_AE4_HW_QUEUES; i++)
+			ae4->ae4_irq[i] = ret;
+
+	} else {
+		ae4_msix->msix_count = ret;
+		for (i = 0; i < MAX_AE4_HW_QUEUES; i++)
+			ae4->ae4_irq[i] = ae4_msix->msix_entry[i].vector;
+	}
+
+	return ret;
+}
+
+static void ae4_free_irqs(struct ae4_device *ae4)
+{
+	struct ae4_msix *ae4_msix = ae4->ae4_msix;
+	struct pt_device *pt = &ae4->pt;
+	struct device *dev = pt->dev;
+	struct pci_dev *pdev;
+
+	pdev = to_pci_dev(dev);
+
+	if (ae4_msix && (ae4_msix->msix_count || ae4->ae4_irq[MAX_AE4_HW_QUEUES - 1]))
+		pci_free_irq_vectors(pdev);
+}
+
+static void ae4_deinit(struct ae4_device *ae4)
+{
+	ae4_free_irqs(ae4);
+}
+
+static int ae4_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	struct device *dev = &pdev->dev;
+	struct ae4_device *ae4;
+	struct pt_device *pt;
+	int bar_mask;
+	int ret = 0;
+
+	ae4 = devm_kzalloc(dev, sizeof(*ae4), GFP_KERNEL);
+	if (!ae4)
+		return -ENOMEM;
+
+	ae4->ae4_msix = devm_kzalloc(dev, sizeof(struct ae4_msix), GFP_KERNEL);
+	if (!ae4->ae4_msix)
+		return -ENOMEM;
+
+	ret = pcim_enable_device(pdev);
+	if (ret)
+		goto ae4_error;
+
+	bar_mask = pci_select_bars(pdev, IORESOURCE_MEM);
+	ret = pcim_iomap_regions(pdev, bar_mask, "ae4dma");
+	if (ret)
+		goto ae4_error;
+
+	pt = &ae4->pt;
+	pt->dev = dev;
+
+	pt->io_regs = pcim_iomap_table(pdev)[0];
+	if (!pt->io_regs) {
+		ret = -ENOMEM;
+		goto ae4_error;
+	}
+
+	ret = ae4_get_irqs(ae4);
+	if (ret < 0)
+		goto ae4_error;
+
+	pci_set_master(pdev);
+
+	dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
+
+	dev_set_drvdata(dev, ae4);
+
+	ret = ae4_core_init(ae4);
+	if (ret)
+		goto ae4_error;
+
+	return 0;
+
+ae4_error:
+	ae4_deinit(ae4);
+
+	return ret;
+}
+
+static void ae4_pci_remove(struct pci_dev *pdev)
+{
+	struct ae4_device *ae4 = dev_get_drvdata(&pdev->dev);
+
+	ae4_destroy_work(ae4);
+	ae4_deinit(ae4);
+}
+
+static const struct pci_device_id ae4_pci_table[] = {
+	{ PCI_VDEVICE(AMD, 0x14C8), },
+	{ PCI_VDEVICE(AMD, 0x14DC), },
+	{ PCI_VDEVICE(AMD, 0x149B), },
+	/* Last entry must be zero */
+	{ 0, }
+};
+MODULE_DEVICE_TABLE(pci, ae4_pci_table);
+
+static struct pci_driver ae4_pci_driver = {
+	.name = "ae4dma",
+	.id_table = ae4_pci_table,
+	.probe = ae4_pci_probe,
+	.remove = ae4_pci_remove,
+};
+
+module_pci_driver(ae4_pci_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("AMD AE4DMA driver");
diff --git a/drivers/dma/amd/ae4dma/ae4dma.h b/drivers/dma/amd/ae4dma/ae4dma.h
new file mode 100644
index 000000000000..a63525792080
--- /dev/null
+++ b/drivers/dma/amd/ae4dma/ae4dma.h
@@ -0,0 +1,85 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * AMD AE4DMA driver
+ *
+ * Copyright (c) 2024, Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
+ */
+#ifndef __AE4DMA_H__
+#define __AE4DMA_H__
+
+#include "../common/amd_dma.h"
+
+#define MAX_AE4_HW_QUEUES		16
+
+#define AE4_DESC_COMPLETED		0x03
+
+#define AE4_MAX_IDX_OFF			0x08
+#define AE4_RD_IDX_OFF			0x0C
+#define AE4_WR_IDX_OFF			0x10
+#define AE4_INTR_STS_OFF		0x14
+#define AE4_Q_BASE_L_OFF		0x18
+#define AE4_Q_BASE_H_OFF		0x1C
+#define AE4_Q_SZ			0x20
+
+struct ae4_msix {
+	int msix_count;
+	struct msix_entry msix_entry[MAX_AE4_HW_QUEUES];
+};
+
+struct ae4_cmd_queue {
+	struct ae4_device *ae4;
+	struct pt_cmd_queue cmd_q;
+	struct list_head cmd;
+	/* protect command operations */
+	struct mutex cmd_lock;
+	struct delayed_work p_work;
+	struct workqueue_struct *pws;
+	struct completion cmp;
+	wait_queue_head_t q_w;
+	atomic64_t intr_cnt;
+	atomic64_t done_cnt;
+	u64 q_cmd_count;
+	u32 dridx;
+	u32 id;
+};
+
+union dwou {
+	u32 dw0;
+	struct dword0 {
+	u8	byte0;
+	u8	byte1;
+	u16	timestamp;
+	} dws;
+};
+
+struct dword1 {
+	u8	status;
+	u8	err_code;
+	u16	desc_id;
+};
+
+struct ae4dma_desc {
+	union dwou dwouv;
+	struct dword1 dw1;
+	u32 length;
+	u32 rsvd;
+	u32 src_hi;
+	u32 src_lo;
+	u32 dst_hi;
+	u32 dst_lo;
+};
+
+struct ae4_device {
+	struct pt_device pt;
+	struct ae4_msix *ae4_msix;
+	struct ae4_cmd_queue ae4cmd_q[MAX_AE4_HW_QUEUES];
+	unsigned int ae4_irq[MAX_AE4_HW_QUEUES];
+	unsigned int cmd_q_count;
+};
+
+int ae4_core_init(struct ae4_device *ae4);
+void ae4_destroy_work(struct ae4_device *ae4);
+#endif
diff --git a/drivers/dma/amd/common/amd_dma.h b/drivers/dma/amd/common/amd_dma.h
new file mode 100644
index 000000000000..f9f396cd4371
--- /dev/null
+++ b/drivers/dma/amd/common/amd_dma.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * AMD DMA Driver common
+ *
+ * Copyright (c) 2024, Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
+ */
+
+#ifndef AMD_DMA_H
+#define AMD_DMA_H
+
+#include <linux/device.h>
+#include <linux/dmaengine.h>
+#include <linux/dmapool.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/pci.h>
+#include <linux/spinlock.h>
+#include <linux/wait.h>
+
+#include "../ptdma/ptdma.h"
+#include "../../virt-dma.h"
+
+#endif
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 3/7] dmaengine: ptdma: Move common functions to common code
  2024-07-08 14:44 [PATCH v5 0/7] Add support of AMD AE4DMA DMA Engine Basavaraj Natikar
  2024-07-08 14:44 ` [PATCH v5 1/7] dmaengine: Move AMD DMA driver to separate directory Basavaraj Natikar
  2024-07-08 14:44 ` [PATCH v5 2/7] dmaengine: ae4dma: Add AMD ae4dma controller driver Basavaraj Natikar
@ 2024-07-08 14:44 ` Basavaraj Natikar
  2024-08-28 13:18   ` Vinod Koul
  2024-07-08 14:44 ` [PATCH v5 4/7] dmaengine: ptdma: Extend ptdma to support multi-channel and version Basavaraj Natikar
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Basavaraj Natikar @ 2024-07-08 14:44 UTC (permalink / raw)
  To: vkoul, dmaengine
  Cc: Raju.Rangoju, Frank.li, helgaas, pstanner, Basavaraj Natikar

To focus on reusability of ptdma code across modules extract common
functions into reusable modules.

Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
---
 MAINTAINERS                             | 1 +
 drivers/dma/amd/ptdma/ptdma-dev.c       | 2 +-
 drivers/dma/amd/ptdma/ptdma-dmaengine.c | 3 +--
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 539bf52410de..97d97ddf26f5 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -952,6 +952,7 @@ M:	Basavaraj Natikar <Basavaraj.Natikar@amd.com>
 L:	dmaengine@vger.kernel.org
 S:	Maintained
 F:	drivers/dma/amd/ae4dma/
+F:	drivers/dma/amd/common/
 
 AMD AXI W1 DRIVER
 M:	Kris Chaplin <kris.chaplin@amd.com>
diff --git a/drivers/dma/amd/ptdma/ptdma-dev.c b/drivers/dma/amd/ptdma/ptdma-dev.c
index a2bf13ff18b6..2bdf418fe556 100644
--- a/drivers/dma/amd/ptdma/ptdma-dev.c
+++ b/drivers/dma/amd/ptdma/ptdma-dev.c
@@ -17,7 +17,7 @@
 #include <linux/module.h>
 #include <linux/pci.h>
 
-#include "ptdma.h"
+#include "../common/amd_dma.h"
 
 /* Human-readable error strings */
 static char *pt_error_codes[] = {
diff --git a/drivers/dma/amd/ptdma/ptdma-dmaengine.c b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
index a2e7c2cec15e..66ea10499643 100644
--- a/drivers/dma/amd/ptdma/ptdma-dmaengine.c
+++ b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
@@ -9,8 +9,7 @@
  * Author: Gary R Hook <gary.hook@amd.com>
  */
 
-#include "ptdma.h"
-#include "../../dmaengine.h"
+#include "../common/amd_dma.h"
 
 static inline struct pt_dma_chan *to_pt_chan(struct dma_chan *dma_chan)
 {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 4/7] dmaengine: ptdma: Extend ptdma to support multi-channel and version
  2024-07-08 14:44 [PATCH v5 0/7] Add support of AMD AE4DMA DMA Engine Basavaraj Natikar
                   ` (2 preceding siblings ...)
  2024-07-08 14:44 ` [PATCH v5 3/7] dmaengine: ptdma: Move common functions to common code Basavaraj Natikar
@ 2024-07-08 14:44 ` Basavaraj Natikar
  2024-08-28 17:46   ` Vinod Koul
  2024-07-08 14:44 ` [PATCH v5 5/7] dmaengine: ae4dma: Register AE4DMA using pt_dmaengine_register Basavaraj Natikar
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Basavaraj Natikar @ 2024-07-08 14:44 UTC (permalink / raw)
  To: vkoul, dmaengine
  Cc: Raju.Rangoju, Frank.li, helgaas, pstanner, Basavaraj Natikar

To support multi-channel functionality with AE4DMA engine, extend the
PTDMA code with reusable components.

Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
---
 drivers/dma/amd/ae4dma/ae4dma.h         |   2 +
 drivers/dma/amd/common/amd_dma.h        |   1 +
 drivers/dma/amd/ptdma/ptdma-dmaengine.c | 105 +++++++++++++++++++-----
 drivers/dma/amd/ptdma/ptdma.h           |   2 +
 4 files changed, 90 insertions(+), 20 deletions(-)

diff --git a/drivers/dma/amd/ae4dma/ae4dma.h b/drivers/dma/amd/ae4dma/ae4dma.h
index a63525792080..5f9dab5f05f4 100644
--- a/drivers/dma/amd/ae4dma/ae4dma.h
+++ b/drivers/dma/amd/ae4dma/ae4dma.h
@@ -24,6 +24,8 @@
 #define AE4_Q_BASE_H_OFF		0x1C
 #define AE4_Q_SZ			0x20
 
+#define AE4_DMA_VERSION			4
+
 struct ae4_msix {
 	int msix_count;
 	struct msix_entry msix_entry[MAX_AE4_HW_QUEUES];
diff --git a/drivers/dma/amd/common/amd_dma.h b/drivers/dma/amd/common/amd_dma.h
index f9f396cd4371..396667e81e1a 100644
--- a/drivers/dma/amd/common/amd_dma.h
+++ b/drivers/dma/amd/common/amd_dma.h
@@ -21,6 +21,7 @@
 #include <linux/wait.h>
 
 #include "../ptdma/ptdma.h"
+#include "../ae4dma/ae4dma.h"
 #include "../../virt-dma.h"
 
 #endif
diff --git a/drivers/dma/amd/ptdma/ptdma-dmaengine.c b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
index 66ea10499643..90ca02fd5f8f 100644
--- a/drivers/dma/amd/ptdma/ptdma-dmaengine.c
+++ b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
@@ -43,7 +43,24 @@ static void pt_do_cleanup(struct virt_dma_desc *vd)
 	kmem_cache_free(pt->dma_desc_cache, desc);
 }
 
-static int pt_dma_start_desc(struct pt_dma_desc *desc)
+static struct pt_cmd_queue *pt_get_cmd_queue(struct pt_device *pt, struct pt_dma_chan *chan)
+{
+	struct ae4_cmd_queue *ae4cmd_q;
+	struct pt_cmd_queue *cmd_q;
+	struct ae4_device *ae4;
+
+	if (pt->ver == AE4_DMA_VERSION) {
+		ae4 = container_of(pt, struct ae4_device, pt);
+		ae4cmd_q = &ae4->ae4cmd_q[chan->id];
+		cmd_q = &ae4cmd_q->cmd_q;
+	} else {
+		cmd_q = &pt->cmd_q;
+	}
+
+	return cmd_q;
+}
+
+static int pt_dma_start_desc(struct pt_dma_desc *desc, struct pt_dma_chan *chan)
 {
 	struct pt_passthru_engine *pt_engine;
 	struct pt_device *pt;
@@ -54,7 +71,9 @@ static int pt_dma_start_desc(struct pt_dma_desc *desc)
 
 	pt_cmd = &desc->pt_cmd;
 	pt = pt_cmd->pt;
-	cmd_q = &pt->cmd_q;
+
+	cmd_q = pt_get_cmd_queue(pt, chan);
+
 	pt_engine = &pt_cmd->passthru;
 
 	pt->tdata.cmd = pt_cmd;
@@ -149,7 +168,7 @@ static void pt_cmd_callback(void *data, int err)
 		if (!desc)
 			break;
 
-		ret = pt_dma_start_desc(desc);
+		ret = pt_dma_start_desc(desc, chan);
 		if (!ret)
 			break;
 
@@ -184,7 +203,10 @@ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
 {
 	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
 	struct pt_passthru_engine *pt_engine;
+	struct pt_device *pt = chan->pt;
+	struct ae4_cmd_queue *ae4cmd_q;
 	struct pt_dma_desc *desc;
+	struct ae4_device *ae4;
 	struct pt_cmd *pt_cmd;
 
 	desc = pt_alloc_dma_desc(chan, flags);
@@ -192,7 +214,7 @@ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
 		return NULL;
 
 	pt_cmd = &desc->pt_cmd;
-	pt_cmd->pt = chan->pt;
+	pt_cmd->pt = pt;
 	pt_engine = &pt_cmd->passthru;
 	pt_cmd->engine = PT_ENGINE_PASSTHRU;
 	pt_engine->src_dma = src;
@@ -203,6 +225,14 @@ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
 
 	desc->len = len;
 
+	if (pt->ver == AE4_DMA_VERSION) {
+		ae4 = container_of(pt, struct ae4_device, pt);
+		ae4cmd_q = &ae4->ae4cmd_q[chan->id];
+		mutex_lock(&ae4cmd_q->cmd_lock);
+		list_add_tail(&pt_cmd->entry, &ae4cmd_q->cmd);
+		mutex_unlock(&ae4cmd_q->cmd_lock);
+	}
+
 	return desc;
 }
 
@@ -260,8 +290,11 @@ static enum dma_status
 pt_tx_status(struct dma_chan *c, dma_cookie_t cookie,
 		struct dma_tx_state *txstate)
 {
-	struct pt_device *pt = to_pt_chan(c)->pt;
-	struct pt_cmd_queue *cmd_q = &pt->cmd_q;
+	struct pt_dma_chan *chan = to_pt_chan(c);
+	struct pt_device *pt = chan->pt;
+	struct pt_cmd_queue *cmd_q;
+
+	cmd_q = pt_get_cmd_queue(pt, chan);
 
 	pt_check_status_trans(pt, cmd_q);
 	return dma_cookie_status(c, cookie, txstate);
@@ -270,10 +303,13 @@ pt_tx_status(struct dma_chan *c, dma_cookie_t cookie,
 static int pt_pause(struct dma_chan *dma_chan)
 {
 	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
+	struct pt_device *pt = chan->pt;
+	struct pt_cmd_queue *cmd_q;
 	unsigned long flags;
 
 	spin_lock_irqsave(&chan->vc.lock, flags);
-	pt_stop_queue(&chan->pt->cmd_q);
+	cmd_q = pt_get_cmd_queue(pt, chan);
+	pt_stop_queue(cmd_q);
 	spin_unlock_irqrestore(&chan->vc.lock, flags);
 
 	return 0;
@@ -283,10 +319,13 @@ static int pt_resume(struct dma_chan *dma_chan)
 {
 	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
 	struct pt_dma_desc *desc = NULL;
+	struct pt_device *pt = chan->pt;
+	struct pt_cmd_queue *cmd_q;
 	unsigned long flags;
 
 	spin_lock_irqsave(&chan->vc.lock, flags);
-	pt_start_queue(&chan->pt->cmd_q);
+	cmd_q = pt_get_cmd_queue(pt, chan);
+	pt_start_queue(cmd_q);
 	desc = pt_next_dma_desc(chan);
 	spin_unlock_irqrestore(&chan->vc.lock, flags);
 
@@ -300,11 +339,17 @@ static int pt_resume(struct dma_chan *dma_chan)
 static int pt_terminate_all(struct dma_chan *dma_chan)
 {
 	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
+	struct pt_device *pt = chan->pt;
+	struct pt_cmd_queue *cmd_q;
 	unsigned long flags;
-	struct pt_cmd_queue *cmd_q = &chan->pt->cmd_q;
 	LIST_HEAD(head);
 
-	iowrite32(SUPPORTED_INTERRUPTS, cmd_q->reg_control + 0x0010);
+	cmd_q = pt_get_cmd_queue(pt, chan);
+	if (pt->ver == AE4_DMA_VERSION)
+		pt_stop_queue(cmd_q);
+	else
+		iowrite32(SUPPORTED_INTERRUPTS, cmd_q->reg_control + 0x0010);
+
 	spin_lock_irqsave(&chan->vc.lock, flags);
 	vchan_get_all_descriptors(&chan->vc, &head);
 	spin_unlock_irqrestore(&chan->vc.lock, flags);
@@ -317,14 +362,24 @@ static int pt_terminate_all(struct dma_chan *dma_chan)
 
 int pt_dmaengine_register(struct pt_device *pt)
 {
-	struct pt_dma_chan *chan;
 	struct dma_device *dma_dev = &pt->dma_dev;
-	char *cmd_cache_name;
+	struct ae4_cmd_queue *ae4cmd_q = NULL;
+	struct ae4_device *ae4 = NULL;
+	struct pt_dma_chan *chan;
 	char *desc_cache_name;
-	int ret;
+	char *cmd_cache_name;
+	int ret, i;
+
+	if (pt->ver == AE4_DMA_VERSION)
+		ae4 = container_of(pt, struct ae4_device, pt);
+
+	if (ae4)
+		pt->pt_dma_chan = devm_kcalloc(pt->dev, ae4->cmd_q_count,
+					       sizeof(*pt->pt_dma_chan), GFP_KERNEL);
+	else
+		pt->pt_dma_chan = devm_kzalloc(pt->dev, sizeof(*pt->pt_dma_chan),
+					       GFP_KERNEL);
 
-	pt->pt_dma_chan = devm_kzalloc(pt->dev, sizeof(*pt->pt_dma_chan),
-				       GFP_KERNEL);
 	if (!pt->pt_dma_chan)
 		return -ENOMEM;
 
@@ -366,9 +421,6 @@ int pt_dmaengine_register(struct pt_device *pt)
 
 	INIT_LIST_HEAD(&dma_dev->channels);
 
-	chan = pt->pt_dma_chan;
-	chan->pt = pt;
-
 	/* Set base and prep routines */
 	dma_dev->device_free_chan_resources = pt_free_chan_resources;
 	dma_dev->device_prep_dma_memcpy = pt_prep_dma_memcpy;
@@ -380,8 +432,21 @@ int pt_dmaengine_register(struct pt_device *pt)
 	dma_dev->device_terminate_all = pt_terminate_all;
 	dma_dev->device_synchronize = pt_synchronize;
 
-	chan->vc.desc_free = pt_do_cleanup;
-	vchan_init(&chan->vc, dma_dev);
+	if (ae4) {
+		for (i = 0; i < ae4->cmd_q_count; i++) {
+			chan = pt->pt_dma_chan + i;
+			ae4cmd_q = &ae4->ae4cmd_q[i];
+			chan->id = ae4cmd_q->id;
+			chan->pt = pt;
+			chan->vc.desc_free = pt_do_cleanup;
+			vchan_init(&chan->vc, dma_dev);
+		}
+	} else {
+		chan = pt->pt_dma_chan;
+		chan->pt = pt;
+		chan->vc.desc_free = pt_do_cleanup;
+		vchan_init(&chan->vc, dma_dev);
+	}
 
 	ret = dma_async_device_register(dma_dev);
 	if (ret)
diff --git a/drivers/dma/amd/ptdma/ptdma.h b/drivers/dma/amd/ptdma/ptdma.h
index 2690a32fc7cb..a6990021fe2b 100644
--- a/drivers/dma/amd/ptdma/ptdma.h
+++ b/drivers/dma/amd/ptdma/ptdma.h
@@ -184,6 +184,7 @@ struct pt_dma_desc {
 struct pt_dma_chan {
 	struct virt_dma_chan vc;
 	struct pt_device *pt;
+	u32 id;
 };
 
 struct pt_cmd_queue {
@@ -262,6 +263,7 @@ struct pt_device {
 	unsigned long total_interrupts;
 
 	struct pt_tasklet_data tdata;
+	int ver;
 };
 
 /*
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 5/7] dmaengine: ae4dma: Register AE4DMA using pt_dmaengine_register
  2024-07-08 14:44 [PATCH v5 0/7] Add support of AMD AE4DMA DMA Engine Basavaraj Natikar
                   ` (3 preceding siblings ...)
  2024-07-08 14:44 ` [PATCH v5 4/7] dmaengine: ptdma: Extend ptdma to support multi-channel and version Basavaraj Natikar
@ 2024-07-08 14:44 ` Basavaraj Natikar
  2024-07-08 14:44 ` [PATCH v5 6/7] dmaengine: ptdma: Extend ptdma-debugfs to support multi-queue Basavaraj Natikar
  2024-07-08 14:45 ` [PATCH v5 7/7] dmaengine: ae4dma: Register debugfs using ptdma_debugfs_setup Basavaraj Natikar
  6 siblings, 0 replies; 14+ messages in thread
From: Basavaraj Natikar @ 2024-07-08 14:44 UTC (permalink / raw)
  To: vkoul, dmaengine
  Cc: Raju.Rangoju, Frank.li, helgaas, pstanner, Basavaraj Natikar

Use the pt_dmaengine_register function to register a AE4DMA DMA engine.

Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
Reviewed-by: Philipp Stanner <pstanner@redhat.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
---
 drivers/dma/amd/ae4dma/ae4dma-dev.c     | 61 +++++++++++++++++++++++++
 drivers/dma/amd/ae4dma/ae4dma-pci.c     |  1 +
 drivers/dma/amd/ae4dma/ae4dma.h         |  2 +
 drivers/dma/amd/common/amd_dma.h        |  3 ++
 drivers/dma/amd/ptdma/ptdma-dmaengine.c | 11 ++++-
 5 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/drivers/dma/amd/ae4dma/ae4dma-dev.c b/drivers/dma/amd/ae4dma/ae4dma-dev.c
index c38464b96fc6..248abf794aff 100644
--- a/drivers/dma/amd/ae4dma/ae4dma-dev.c
+++ b/drivers/dma/amd/ae4dma/ae4dma-dev.c
@@ -61,6 +61,16 @@ static void ae4_check_status_error(struct ae4_cmd_queue *ae4cmd_q, int idx)
 	}
 }
 
+void pt_check_status_trans_ae4(struct pt_device *pt, struct pt_cmd_queue *cmd_q)
+{
+	struct ae4_cmd_queue *ae4cmd_q = container_of(cmd_q, struct ae4_cmd_queue, cmd_q);
+	int i;
+
+	for (i = 0; i < CMD_Q_LEN; i++)
+		ae4_check_status_error(ae4cmd_q, i);
+}
+EXPORT_SYMBOL_GPL(pt_check_status_trans_ae4);
+
 static void ae4_pending_work(struct work_struct *work)
 {
 	struct ae4_cmd_queue *ae4cmd_q = container_of(work, struct ae4_cmd_queue, p_work.work);
@@ -117,6 +127,53 @@ static irqreturn_t ae4_core_irq_handler(int irq, void *data)
 	return IRQ_HANDLED;
 }
 
+static int ae4_core_execute_cmd(struct ae4dma_desc *desc, struct ae4_cmd_queue *ae4cmd_q)
+{
+	bool soc = FIELD_GET(DWORD0_SOC, desc->dwouv.dw0);
+	struct pt_cmd_queue *cmd_q = &ae4cmd_q->cmd_q;
+
+	if (soc) {
+		desc->dwouv.dw0 |= FIELD_PREP(DWORD0_IOC, desc->dwouv.dw0);
+		desc->dwouv.dw0 &= ~DWORD0_SOC;
+	}
+
+	mutex_lock(&ae4cmd_q->cmd_lock);
+	memcpy(&cmd_q->qbase[ae4cmd_q->tail_wi], desc, sizeof(struct ae4dma_desc));
+	ae4cmd_q->q_cmd_count++;
+	ae4cmd_q->tail_wi = (ae4cmd_q->tail_wi + 1) % CMD_Q_LEN;
+	writel(ae4cmd_q->tail_wi, cmd_q->reg_control + AE4_WR_IDX_OFF);
+	mutex_unlock(&ae4cmd_q->cmd_lock);
+
+	wake_up(&ae4cmd_q->q_w);
+
+	return 0;
+}
+
+int pt_core_perform_passthru_ae4(struct pt_cmd_queue *cmd_q, struct pt_passthru_engine *pt_engine)
+{
+	struct ae4_cmd_queue *ae4cmd_q = container_of(cmd_q, struct ae4_cmd_queue, cmd_q);
+	struct ae4dma_desc desc;
+
+	cmd_q->cmd_error = 0;
+	cmd_q->total_pt_ops++;
+	memset(&desc, 0, sizeof(desc));
+	desc.dwouv.dws.byte0 = CMD_AE4_DESC_DW0_VAL;
+
+	desc.dw1.status = 0;
+	desc.dw1.err_code = 0;
+	desc.dw1.desc_id = 0;
+
+	desc.length = pt_engine->src_len;
+
+	desc.src_lo = upper_32_bits(pt_engine->src_dma);
+	desc.src_hi = lower_32_bits(pt_engine->src_dma);
+	desc.dst_lo = upper_32_bits(pt_engine->dst_dma);
+	desc.dst_hi = lower_32_bits(pt_engine->dst_dma);
+
+	return ae4_core_execute_cmd(&desc, ae4cmd_q);
+}
+EXPORT_SYMBOL_GPL(pt_core_perform_passthru_ae4);
+
 void ae4_destroy_work(struct ae4_device *ae4)
 {
 	struct ae4_cmd_queue *ae4cmd_q;
@@ -194,5 +251,9 @@ int ae4_core_init(struct ae4_device *ae4)
 		init_completion(&ae4cmd_q->cmp);
 	}
 
+	ret = pt_dmaengine_register(pt);
+	if (ret)
+		ae4_destroy_work(ae4);
+
 	return ret;
 }
diff --git a/drivers/dma/amd/ae4dma/ae4dma-pci.c b/drivers/dma/amd/ae4dma/ae4dma-pci.c
index 43d36e9d1efb..aad0dc4294a3 100644
--- a/drivers/dma/amd/ae4dma/ae4dma-pci.c
+++ b/drivers/dma/amd/ae4dma/ae4dma-pci.c
@@ -98,6 +98,7 @@ static int ae4_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 
 	pt = &ae4->pt;
 	pt->dev = dev;
+	pt->ver = AE4_DMA_VERSION;
 
 	pt->io_regs = pcim_iomap_table(pdev)[0];
 	if (!pt->io_regs) {
diff --git a/drivers/dma/amd/ae4dma/ae4dma.h b/drivers/dma/amd/ae4dma/ae4dma.h
index 5f9dab5f05f4..1bc8abcd1d44 100644
--- a/drivers/dma/amd/ae4dma/ae4dma.h
+++ b/drivers/dma/amd/ae4dma/ae4dma.h
@@ -25,6 +25,7 @@
 #define AE4_Q_SZ			0x20
 
 #define AE4_DMA_VERSION			4
+#define CMD_AE4_DESC_DW0_VAL		2
 
 struct ae4_msix {
 	int msix_count;
@@ -45,6 +46,7 @@ struct ae4_cmd_queue {
 	atomic64_t done_cnt;
 	u64 q_cmd_count;
 	u32 dridx;
+	u32 tail_wi;
 	u32 id;
 };
 
diff --git a/drivers/dma/amd/common/amd_dma.h b/drivers/dma/amd/common/amd_dma.h
index 396667e81e1a..896318ed2495 100644
--- a/drivers/dma/amd/common/amd_dma.h
+++ b/drivers/dma/amd/common/amd_dma.h
@@ -24,4 +24,7 @@
 #include "../ae4dma/ae4dma.h"
 #include "../../virt-dma.h"
 
+void pt_check_status_trans_ae4(struct pt_device *pt, struct pt_cmd_queue *cmd_q);
+int pt_core_perform_passthru_ae4(struct pt_cmd_queue *cmd_q, struct pt_passthru_engine *pt_engine);
+
 #endif
diff --git a/drivers/dma/amd/ptdma/ptdma-dmaengine.c b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
index 90ca02fd5f8f..412b40903e57 100644
--- a/drivers/dma/amd/ptdma/ptdma-dmaengine.c
+++ b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
@@ -79,7 +79,10 @@ static int pt_dma_start_desc(struct pt_dma_desc *desc, struct pt_dma_chan *chan)
 	pt->tdata.cmd = pt_cmd;
 
 	/* Execute the command */
-	pt_cmd->ret = pt_core_perform_passthru(cmd_q, pt_engine);
+	if (pt->ver == AE4_DMA_VERSION)
+		pt_cmd->ret = pt_core_perform_passthru_ae4(cmd_q, pt_engine);
+	else
+		pt_cmd->ret = pt_core_perform_passthru(cmd_q, pt_engine);
 
 	return 0;
 }
@@ -296,7 +299,10 @@ pt_tx_status(struct dma_chan *c, dma_cookie_t cookie,
 
 	cmd_q = pt_get_cmd_queue(pt, chan);
 
-	pt_check_status_trans(pt, cmd_q);
+	if (pt->ver == AE4_DMA_VERSION)
+		pt_check_status_trans_ae4(pt, cmd_q);
+	else
+		pt_check_status_trans(pt, cmd_q);
 	return dma_cookie_status(c, cookie, txstate);
 }
 
@@ -462,6 +468,7 @@ int pt_dmaengine_register(struct pt_device *pt)
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(pt_dmaengine_register);
 
 void pt_dmaengine_unregister(struct pt_device *pt)
 {
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 6/7] dmaengine: ptdma: Extend ptdma-debugfs to support multi-queue
  2024-07-08 14:44 [PATCH v5 0/7] Add support of AMD AE4DMA DMA Engine Basavaraj Natikar
                   ` (4 preceding siblings ...)
  2024-07-08 14:44 ` [PATCH v5 5/7] dmaengine: ae4dma: Register AE4DMA using pt_dmaengine_register Basavaraj Natikar
@ 2024-07-08 14:44 ` Basavaraj Natikar
  2024-07-08 14:45 ` [PATCH v5 7/7] dmaengine: ae4dma: Register debugfs using ptdma_debugfs_setup Basavaraj Natikar
  6 siblings, 0 replies; 14+ messages in thread
From: Basavaraj Natikar @ 2024-07-08 14:44 UTC (permalink / raw)
  To: vkoul, dmaengine
  Cc: Raju.Rangoju, Frank.li, helgaas, pstanner, Basavaraj Natikar

To support multi-channel functionality with AE4DMA engine, extend the
ptdma-debugfs with reusable components.

Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
---
 drivers/dma/amd/ptdma/ptdma-debugfs.c | 79 +++++++++++++++++++--------
 1 file changed, 57 insertions(+), 22 deletions(-)

diff --git a/drivers/dma/amd/ptdma/ptdma-debugfs.c b/drivers/dma/amd/ptdma/ptdma-debugfs.c
index c8307d3044a3..25654799077b 100644
--- a/drivers/dma/amd/ptdma/ptdma-debugfs.c
+++ b/drivers/dma/amd/ptdma/ptdma-debugfs.c
@@ -12,7 +12,7 @@
 #include <linux/debugfs.h>
 #include <linux/seq_file.h>
 
-#include "ptdma.h"
+#include "../common/amd_dma.h"
 
 /* DebugFS helpers */
 #define	RI_VERSION_NUM	0x0000003F
@@ -23,11 +23,19 @@
 static int pt_debugfs_info_show(struct seq_file *s, void *p)
 {
 	struct pt_device *pt = s->private;
+	struct ae4_device *ae4;
 	unsigned int regval;
 
 	seq_printf(s, "Device name: %s\n", dev_name(pt->dev));
-	seq_printf(s, "   # Queues: %d\n", 1);
-	seq_printf(s, "     # Cmds: %d\n", pt->cmd_count);
+
+	if (pt->ver == AE4_DMA_VERSION) {
+		ae4 = container_of(pt, struct ae4_device, pt);
+		seq_printf(s, "   # Queues: %d\n", ae4->cmd_q_count);
+		seq_printf(s, "     # Cmds per queue: %d\n", CMD_Q_LEN);
+	} else {
+		seq_printf(s, "   # Queues: %d\n", 1);
+		seq_printf(s, "     # Cmds: %d\n", pt->cmd_count);
+	}
 
 	regval = ioread32(pt->io_regs + CMD_PT_VERSION);
 
@@ -55,6 +63,7 @@ static int pt_debugfs_stats_show(struct seq_file *s, void *p)
 static int pt_debugfs_queue_show(struct seq_file *s, void *p)
 {
 	struct pt_cmd_queue *cmd_q = s->private;
+	struct pt_device *pt;
 	unsigned int regval;
 
 	if (!cmd_q)
@@ -62,18 +71,24 @@ static int pt_debugfs_queue_show(struct seq_file *s, void *p)
 
 	seq_printf(s, "               Pass-Thru: %ld\n", cmd_q->total_pt_ops);
 
-	regval = ioread32(cmd_q->reg_control + 0x000C);
-
-	seq_puts(s, "      Enabled Interrupts:");
-	if (regval & INT_EMPTY_QUEUE)
-		seq_puts(s, " EMPTY");
-	if (regval & INT_QUEUE_STOPPED)
-		seq_puts(s, " STOPPED");
-	if (regval & INT_ERROR)
-		seq_puts(s, " ERROR");
-	if (regval & INT_COMPLETION)
-		seq_puts(s, " COMPLETION");
-	seq_puts(s, "\n");
+	pt = cmd_q->pt;
+	if (pt->ver == AE4_DMA_VERSION) {
+		regval = readl(cmd_q->reg_control + 0x4);
+		seq_printf(s, "     Enabled Interrupts:: status 0x%x\n", regval);
+	} else {
+		regval = ioread32(cmd_q->reg_control + 0x000C);
+
+		seq_puts(s, "      Enabled Interrupts:");
+		if (regval & INT_EMPTY_QUEUE)
+			seq_puts(s, " EMPTY");
+		if (regval & INT_QUEUE_STOPPED)
+			seq_puts(s, " STOPPED");
+		if (regval & INT_ERROR)
+			seq_puts(s, " ERROR");
+		if (regval & INT_COMPLETION)
+			seq_puts(s, " COMPLETION");
+		seq_puts(s, "\n");
+	}
 
 	return 0;
 }
@@ -84,8 +99,12 @@ DEFINE_SHOW_ATTRIBUTE(pt_debugfs_stats);
 
 void ptdma_debugfs_setup(struct pt_device *pt)
 {
-	struct pt_cmd_queue *cmd_q;
 	struct dentry *debugfs_q_instance;
+	struct ae4_cmd_queue *ae4cmd_q;
+	struct pt_cmd_queue *cmd_q;
+	struct ae4_device *ae4;
+	char name[30];
+	int i;
 
 	if (!debugfs_initialized())
 		return;
@@ -96,11 +115,27 @@ void ptdma_debugfs_setup(struct pt_device *pt)
 	debugfs_create_file("stats", 0400, pt->dma_dev.dbg_dev_root, pt,
 			    &pt_debugfs_stats_fops);
 
-	cmd_q = &pt->cmd_q;
-
-	debugfs_q_instance =
-		debugfs_create_dir("q", pt->dma_dev.dbg_dev_root);
 
-	debugfs_create_file("stats", 0400, debugfs_q_instance, cmd_q,
-			    &pt_debugfs_queue_fops);
+	if (pt->ver == AE4_DMA_VERSION) {
+		ae4 = container_of(pt, struct ae4_device, pt);
+		for (i = 0; i < ae4->cmd_q_count; i++) {
+			ae4cmd_q = &ae4->ae4cmd_q[i];
+			cmd_q = &ae4cmd_q->cmd_q;
+
+			memset(name, 0, sizeof(name));
+			snprintf(name, 29, "q%d", ae4cmd_q->id);
+
+			debugfs_q_instance =
+				debugfs_create_dir(name, pt->dma_dev.dbg_dev_root);
+
+			debugfs_create_file("stats", 0400, debugfs_q_instance, cmd_q,
+					    &pt_debugfs_queue_fops);
+		}
+	} else {
+		debugfs_q_instance =
+			debugfs_create_dir("q", pt->dma_dev.dbg_dev_root);
+		cmd_q = &pt->cmd_q;
+		debugfs_create_file("stats", 0400, debugfs_q_instance, cmd_q,
+				    &pt_debugfs_queue_fops);
+	}
 }
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v5 7/7] dmaengine: ae4dma: Register debugfs using ptdma_debugfs_setup
  2024-07-08 14:44 [PATCH v5 0/7] Add support of AMD AE4DMA DMA Engine Basavaraj Natikar
                   ` (5 preceding siblings ...)
  2024-07-08 14:44 ` [PATCH v5 6/7] dmaengine: ptdma: Extend ptdma-debugfs to support multi-queue Basavaraj Natikar
@ 2024-07-08 14:45 ` Basavaraj Natikar
  6 siblings, 0 replies; 14+ messages in thread
From: Basavaraj Natikar @ 2024-07-08 14:45 UTC (permalink / raw)
  To: vkoul, dmaengine
  Cc: Raju.Rangoju, Frank.li, helgaas, pstanner, Basavaraj Natikar

Use the ptdma_debugfs_setup function to register debugfs for AE4DMA DMA
engine.

Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
---
 drivers/dma/amd/ae4dma/ae4dma-dev.c   | 2 ++
 drivers/dma/amd/ptdma/ptdma-debugfs.c | 1 +
 2 files changed, 3 insertions(+)

diff --git a/drivers/dma/amd/ae4dma/ae4dma-dev.c b/drivers/dma/amd/ae4dma/ae4dma-dev.c
index 248abf794aff..3ff3573fd2b7 100644
--- a/drivers/dma/amd/ae4dma/ae4dma-dev.c
+++ b/drivers/dma/amd/ae4dma/ae4dma-dev.c
@@ -254,6 +254,8 @@ int ae4_core_init(struct ae4_device *ae4)
 	ret = pt_dmaengine_register(pt);
 	if (ret)
 		ae4_destroy_work(ae4);
+	else
+		ptdma_debugfs_setup(pt);
 
 	return ret;
 }
diff --git a/drivers/dma/amd/ptdma/ptdma-debugfs.c b/drivers/dma/amd/ptdma/ptdma-debugfs.c
index 25654799077b..0d1133ec1919 100644
--- a/drivers/dma/amd/ptdma/ptdma-debugfs.c
+++ b/drivers/dma/amd/ptdma/ptdma-debugfs.c
@@ -139,3 +139,4 @@ void ptdma_debugfs_setup(struct pt_device *pt)
 				    &pt_debugfs_queue_fops);
 	}
 }
+EXPORT_SYMBOL_GPL(ptdma_debugfs_setup);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 3/7] dmaengine: ptdma: Move common functions to common code
  2024-07-08 14:44 ` [PATCH v5 3/7] dmaengine: ptdma: Move common functions to common code Basavaraj Natikar
@ 2024-08-28 13:18   ` Vinod Koul
  2024-09-02  7:07     ` Basavaraj Natikar
  0 siblings, 1 reply; 14+ messages in thread
From: Vinod Koul @ 2024-08-28 13:18 UTC (permalink / raw)
  To: Basavaraj Natikar; +Cc: dmaengine, Raju.Rangoju, Frank.li, helgaas, pstanner

On 08-07-24, 20:14, Basavaraj Natikar wrote:
> To focus on reusability of ptdma code across modules extract common
> functions into reusable modules.
> 
> Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
> Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
> ---
>  MAINTAINERS                             | 1 +
>  drivers/dma/amd/ptdma/ptdma-dev.c       | 2 +-
>  drivers/dma/amd/ptdma/ptdma-dmaengine.c | 3 +--
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 539bf52410de..97d97ddf26f5 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -952,6 +952,7 @@ M:	Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>  L:	dmaengine@vger.kernel.org
>  S:	Maintained
>  F:	drivers/dma/amd/ae4dma/
> +F:	drivers/dma/amd/common/

I think this should be made amd/ to avoid churn when a file is
added/dropped

>  
>  AMD AXI W1 DRIVER
>  M:	Kris Chaplin <kris.chaplin@amd.com>
> diff --git a/drivers/dma/amd/ptdma/ptdma-dev.c b/drivers/dma/amd/ptdma/ptdma-dev.c
> index a2bf13ff18b6..2bdf418fe556 100644
> --- a/drivers/dma/amd/ptdma/ptdma-dev.c
> +++ b/drivers/dma/amd/ptdma/ptdma-dev.c
> @@ -17,7 +17,7 @@
>  #include <linux/module.h>
>  #include <linux/pci.h>
>  
> -#include "ptdma.h"
> +#include "../common/amd_dma.h"
>  
>  /* Human-readable error strings */
>  static char *pt_error_codes[] = {
> diff --git a/drivers/dma/amd/ptdma/ptdma-dmaengine.c b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
> index a2e7c2cec15e..66ea10499643 100644
> --- a/drivers/dma/amd/ptdma/ptdma-dmaengine.c
> +++ b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
> @@ -9,8 +9,7 @@
>   * Author: Gary R Hook <gary.hook@amd.com>
>   */
>  
> -#include "ptdma.h"
> -#include "../../dmaengine.h"
> +#include "../common/amd_dma.h"

So the driver was including old headers and now new, but I dont see any
functions being moved? Does each patch compile...?

-- 
~Vinod

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 2/7] dmaengine: ae4dma: Add AMD ae4dma controller driver
  2024-07-08 14:44 ` [PATCH v5 2/7] dmaengine: ae4dma: Add AMD ae4dma controller driver Basavaraj Natikar
@ 2024-08-28 13:23   ` Vinod Koul
  2024-09-02  6:47     ` Basavaraj Natikar
  0 siblings, 1 reply; 14+ messages in thread
From: Vinod Koul @ 2024-08-28 13:23 UTC (permalink / raw)
  To: Basavaraj Natikar; +Cc: dmaengine, Raju.Rangoju, Frank.li, helgaas, pstanner

On 08-07-24, 20:14, Basavaraj Natikar wrote:
> Add support for AMD AE4DMA controller. It performs high-bandwidth
> memory to memory and IO copy operation. Device commands are managed
> via a circular queue of 'descriptors', each of which specifies source
> and destination addresses for copying a single buffer of data.
> 
> Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
> Reviewed-by: Philipp Stanner <pstanner@redhat.com>
> Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
> ---
>  MAINTAINERS                         |   6 +
>  drivers/dma/amd/Kconfig             |   1 +
>  drivers/dma/amd/Makefile            |   1 +
>  drivers/dma/amd/ae4dma/Kconfig      |  14 ++
>  drivers/dma/amd/ae4dma/Makefile     |  10 ++
>  drivers/dma/amd/ae4dma/ae4dma-dev.c | 198 ++++++++++++++++++++++++++++
>  drivers/dma/amd/ae4dma/ae4dma-pci.c | 157 ++++++++++++++++++++++
>  drivers/dma/amd/ae4dma/ae4dma.h     |  85 ++++++++++++
>  drivers/dma/amd/common/amd_dma.h    |  26 ++++
>  9 files changed, 498 insertions(+)
>  create mode 100644 drivers/dma/amd/ae4dma/Kconfig
>  create mode 100644 drivers/dma/amd/ae4dma/Makefile
>  create mode 100644 drivers/dma/amd/ae4dma/ae4dma-dev.c
>  create mode 100644 drivers/dma/amd/ae4dma/ae4dma-pci.c
>  create mode 100644 drivers/dma/amd/ae4dma/ae4dma.h
>  create mode 100644 drivers/dma/amd/common/amd_dma.h
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 33a1049fd38b..539bf52410de 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -947,6 +947,12 @@ L:	linux-edac@vger.kernel.org
>  S:	Supported
>  F:	drivers/ras/amd/atl/*
>  
> +AMD AE4DMA DRIVER
> +M:	Basavaraj Natikar <Basavaraj.Natikar@amd.com>
> +L:	dmaengine@vger.kernel.org
> +S:	Maintained
> +F:	drivers/dma/amd/ae4dma/
> +
>  AMD AXI W1 DRIVER
>  M:	Kris Chaplin <kris.chaplin@amd.com>
>  R:	Thomas Delev <thomas.delev@amd.com>
> diff --git a/drivers/dma/amd/Kconfig b/drivers/dma/amd/Kconfig
> index 8246b463bcf7..8c25a3ed6b94 100644
> --- a/drivers/dma/amd/Kconfig
> +++ b/drivers/dma/amd/Kconfig
> @@ -3,3 +3,4 @@
>  # AMD DMA Drivers
>  
>  source "drivers/dma/amd/ptdma/Kconfig"
> +source "drivers/dma/amd/ae4dma/Kconfig"
> diff --git a/drivers/dma/amd/Makefile b/drivers/dma/amd/Makefile
> index dd7257ba7e06..8049b06a9ff5 100644
> --- a/drivers/dma/amd/Makefile
> +++ b/drivers/dma/amd/Makefile
> @@ -4,3 +4,4 @@
>  #
>  
>  obj-$(CONFIG_AMD_PTDMA) += ptdma/
> +obj-$(CONFIG_AMD_AE4DMA) += ae4dma/
> diff --git a/drivers/dma/amd/ae4dma/Kconfig b/drivers/dma/amd/ae4dma/Kconfig
> new file mode 100644
> index 000000000000..ea8a7fe68df5
> --- /dev/null
> +++ b/drivers/dma/amd/ae4dma/Kconfig
> @@ -0,0 +1,14 @@
> +# SPDX-License-Identifier: GPL-2.0
> +config AMD_AE4DMA
> +	tristate  "AMD AE4DMA Engine"
> +	depends on (X86_64 || COMPILE_TEST) && PCI
> +	depends on AMD_PTDMA
> +	select DMA_ENGINE
> +	select DMA_VIRTUAL_CHANNELS
> +	help
> +	  Enable support for the AMD AE4DMA controller. This controller
> +	  provides DMA capabilities to perform high bandwidth memory to
> +	  memory and IO copy operations. It performs DMA transfer through
> +	  queue-based descriptor management. This DMA controller is intended
> +	  to be used with AMD Non-Transparent Bridge devices and not for
> +	  general purpose peripheral DMA.
> diff --git a/drivers/dma/amd/ae4dma/Makefile b/drivers/dma/amd/ae4dma/Makefile
> new file mode 100644
> index 000000000000..e918f85a80ec
> --- /dev/null
> +++ b/drivers/dma/amd/ae4dma/Makefile
> @@ -0,0 +1,10 @@
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# AMD AE4DMA driver
> +#
> +
> +obj-$(CONFIG_AMD_AE4DMA) += ae4dma.o
> +
> +ae4dma-objs := ae4dma-dev.o
> +
> +ae4dma-$(CONFIG_PCI) += ae4dma-pci.o
> diff --git a/drivers/dma/amd/ae4dma/ae4dma-dev.c b/drivers/dma/amd/ae4dma/ae4dma-dev.c
> new file mode 100644
> index 000000000000..c38464b96fc6
> --- /dev/null
> +++ b/drivers/dma/amd/ae4dma/ae4dma-dev.c
> @@ -0,0 +1,198 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * AMD AE4DMA driver
> + *
> + * Copyright (c) 2024, Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
> + */
> +
> +#include "ae4dma.h"
> +
> +static unsigned int max_hw_q = 1;
> +module_param(max_hw_q, uint, 0444);
> +MODULE_PARM_DESC(max_hw_q, "max hw queues supported by engine (any non-zero value, default: 1)");
> +
> +static char *ae4_error_codes[] = {
> +	"",
> +	"ERR 01: INVALID HEADER DW0",
> +	"ERR 02: INVALID STATUS",
> +	"ERR 03: INVALID LENGTH - 4 BYTE ALIGNMENT",
> +	"ERR 04: INVALID SRC ADDR - 4 BYTE ALIGNMENT",
> +	"ERR 05: INVALID DST ADDR - 4 BYTE ALIGNMENT",
> +	"ERR 06: INVALID ALIGNMENT",
> +	"ERR 07: INVALID DESCRIPTOR",
> +};
> +
> +static void ae4_log_error(struct pt_device *d, int e)
> +{
> +	/* ERR 01 - 07 represents Invalid AE4 errors */
> +	if (e <= 7)
> +		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", ae4_error_codes[e], e);
> +	/* ERR 08 - 15 represents Invalid Descriptor errors */
> +	else if (e > 7 && e <= 15)
> +		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "INVALID DESCRIPTOR", e);
> +	/* ERR 16 - 31 represents Firmware errors */
> +	else if (e > 15 && e <= 31)
> +		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "FIRMWARE ERROR", e);
> +	/* ERR 32 - 63 represents Fatal errors */
> +	else if (e > 31 && e <= 63)
> +		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "FATAL ERROR", e);
> +	/* ERR 64 - 255 represents PTE errors */
> +	else if (e > 63 && e <= 255)
> +		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "PTE ERROR", e);
> +	else
> +		dev_info(d->dev, "Unknown AE4DMA error");
> +}
> +
> +static void ae4_check_status_error(struct ae4_cmd_queue *ae4cmd_q, int idx)
> +{
> +	struct pt_cmd_queue *cmd_q = &ae4cmd_q->cmd_q;
> +	struct ae4dma_desc desc;
> +	u8 status;
> +
> +	memcpy(&desc, &cmd_q->qbase[idx], sizeof(struct ae4dma_desc));
> +	status = desc.dw1.status;
> +	if (status && status != AE4_DESC_COMPLETED) {
> +		cmd_q->cmd_error = desc.dw1.err_code;
> +		if (cmd_q->cmd_error)
> +			ae4_log_error(cmd_q->pt, cmd_q->cmd_error);
> +	}
> +}
> +
> +static void ae4_pending_work(struct work_struct *work)
> +{
> +	struct ae4_cmd_queue *ae4cmd_q = container_of(work, struct ae4_cmd_queue, p_work.work);
> +	struct pt_cmd_queue *cmd_q = &ae4cmd_q->cmd_q;
> +	struct pt_cmd *cmd;
> +	u32 cridx;
> +
> +	for (;;) {
> +		wait_event_interruptible(ae4cmd_q->q_w,
> +					 ((atomic64_read(&ae4cmd_q->done_cnt)) <
> +					   atomic64_read(&ae4cmd_q->intr_cnt)));
> +
> +		atomic64_inc(&ae4cmd_q->done_cnt);
> +
> +		mutex_lock(&ae4cmd_q->cmd_lock);
> +		cridx = readl(cmd_q->reg_control + AE4_RD_IDX_OFF);
> +		while ((ae4cmd_q->dridx != cridx) && !list_empty(&ae4cmd_q->cmd)) {
> +			cmd = list_first_entry(&ae4cmd_q->cmd, struct pt_cmd, entry);
> +			list_del(&cmd->entry);
> +
> +			ae4_check_status_error(ae4cmd_q, ae4cmd_q->dridx);
> +			cmd->pt_cmd_callback(cmd->data, cmd->ret);
> +
> +			ae4cmd_q->q_cmd_count--;
> +			ae4cmd_q->dridx = (ae4cmd_q->dridx + 1) % CMD_Q_LEN;
> +
> +			complete_all(&ae4cmd_q->cmp);
> +		}
> +		mutex_unlock(&ae4cmd_q->cmd_lock);
> +	}
> +}
> +
> +static irqreturn_t ae4_core_irq_handler(int irq, void *data)
> +{
> +	struct ae4_cmd_queue *ae4cmd_q = data;
> +	struct pt_cmd_queue *cmd_q;
> +	struct pt_device *pt;
> +	u32 status;
> +
> +	cmd_q = &ae4cmd_q->cmd_q;
> +	pt = cmd_q->pt;
> +
> +	pt->total_interrupts++;
> +	atomic64_inc(&ae4cmd_q->intr_cnt);
> +
> +	wake_up(&ae4cmd_q->q_w);

shouldnt this be last thing before returning

> +
> +	status = readl(cmd_q->reg_control + AE4_INTR_STS_OFF);
> +	if (status & BIT(0)) {
> +		status &= GENMASK(31, 1);
> +		writel(status, cmd_q->reg_control + AE4_INTR_STS_OFF);
> +	}
> +
> +	return IRQ_HANDLED;
> +}
> +
> +void ae4_destroy_work(struct ae4_device *ae4)
> +{
> +	struct ae4_cmd_queue *ae4cmd_q;
> +	int i;
> +
> +	for (i = 0; i < ae4->cmd_q_count; i++) {
> +		ae4cmd_q = &ae4->ae4cmd_q[i];
> +
> +		if (!ae4cmd_q->pws)
> +			break;
> +
> +		cancel_delayed_work_sync(&ae4cmd_q->p_work);
> +		destroy_workqueue(ae4cmd_q->pws);
> +	}
> +}
> +
> +int ae4_core_init(struct ae4_device *ae4)
> +{
> +	struct pt_device *pt = &ae4->pt;
> +	struct ae4_cmd_queue *ae4cmd_q;
> +	struct device *dev = pt->dev;
> +	struct pt_cmd_queue *cmd_q;
> +	int i, ret = 0;
> +
> +	writel(max_hw_q, pt->io_regs);
> +
> +	for (i = 0; i < max_hw_q; i++) {
> +		ae4cmd_q = &ae4->ae4cmd_q[i];
> +		ae4cmd_q->id = ae4->cmd_q_count;
> +		ae4->cmd_q_count++;
> +
> +		cmd_q = &ae4cmd_q->cmd_q;
> +		cmd_q->pt = pt;
> +
> +		cmd_q->reg_control = pt->io_regs + ((i + 1) * AE4_Q_SZ);
> +
> +		ret = devm_request_irq(dev, ae4->ae4_irq[i], ae4_core_irq_handler, 0,
> +				       dev_name(pt->dev), ae4cmd_q);
> +		if (ret)
> +			return ret;
> +
> +		cmd_q->qsize = Q_SIZE(sizeof(struct ae4dma_desc));
> +
> +		cmd_q->qbase = dmam_alloc_coherent(dev, cmd_q->qsize, &cmd_q->qbase_dma,
> +						   GFP_KERNEL);
> +		if (!cmd_q->qbase)
> +			return -ENOMEM;
> +	}
> +
> +	for (i = 0; i < ae4->cmd_q_count; i++) {
> +		ae4cmd_q = &ae4->ae4cmd_q[i];
> +
> +		cmd_q = &ae4cmd_q->cmd_q;
> +
> +		cmd_q->reg_control = pt->io_regs + ((i + 1) * AE4_Q_SZ);
> +
> +		/* Update the device registers with queue information. */
> +		writel(CMD_Q_LEN, cmd_q->reg_control + AE4_MAX_IDX_OFF);
> +
> +		cmd_q->qdma_tail = cmd_q->qbase_dma;
> +		writel(lower_32_bits(cmd_q->qdma_tail), cmd_q->reg_control + AE4_Q_BASE_L_OFF);
> +		writel(upper_32_bits(cmd_q->qdma_tail), cmd_q->reg_control + AE4_Q_BASE_H_OFF);
> +
> +		INIT_LIST_HEAD(&ae4cmd_q->cmd);
> +		init_waitqueue_head(&ae4cmd_q->q_w);
> +
> +		ae4cmd_q->pws = alloc_ordered_workqueue("ae4dma_%d", WQ_MEM_RECLAIM, ae4cmd_q->id);
> +		if (!ae4cmd_q->pws) {
> +			ae4_destroy_work(ae4);
> +			return -ENOMEM;
> +		}
> +		INIT_DELAYED_WORK(&ae4cmd_q->p_work, ae4_pending_work);
> +		queue_delayed_work(ae4cmd_q->pws, &ae4cmd_q->p_work,  usecs_to_jiffies(100));
> +
> +		init_completion(&ae4cmd_q->cmp);
> +	}
> +
> +	return ret;
> +}
> diff --git a/drivers/dma/amd/ae4dma/ae4dma-pci.c b/drivers/dma/amd/ae4dma/ae4dma-pci.c
> new file mode 100644
> index 000000000000..43d36e9d1efb
> --- /dev/null
> +++ b/drivers/dma/amd/ae4dma/ae4dma-pci.c
> @@ -0,0 +1,157 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * AMD AE4DMA driver
> + *
> + * Copyright (c) 2024, Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
> + */
> +
> +#include "ae4dma.h"
> +
> +static int ae4_get_irqs(struct ae4_device *ae4)
> +{
> +	struct ae4_msix *ae4_msix = ae4->ae4_msix;
> +	struct pt_device *pt = &ae4->pt;
> +	struct device *dev = pt->dev;
> +	struct pci_dev *pdev;
> +	int i, v, ret;
> +
> +	pdev = to_pci_dev(dev);
> +
> +	for (v = 0; v < ARRAY_SIZE(ae4_msix->msix_entry); v++)
> +		ae4_msix->msix_entry[v].entry = v;
> +
> +	ret = pci_alloc_irq_vectors(pdev, v, v, PCI_IRQ_MSIX);
> +	if (ret != v) {
> +		if (ret > 0)
> +			pci_free_irq_vectors(pdev);
> +
> +		dev_err(dev, "could not enable MSI-X (%d), trying MSI\n", ret);
> +		ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
> +		if (ret < 0) {
> +			dev_err(dev, "could not enable MSI (%d)\n", ret);
> +			return ret;
> +		}
> +
> +		ret = pci_irq_vector(pdev, 0);
> +		if (ret < 0) {
> +			pci_free_irq_vectors(pdev);
> +			return ret;
> +		}
> +
> +		for (i = 0; i < MAX_AE4_HW_QUEUES; i++)
> +			ae4->ae4_irq[i] = ret;
> +
> +	} else {
> +		ae4_msix->msix_count = ret;
> +		for (i = 0; i < MAX_AE4_HW_QUEUES; i++)
> +			ae4->ae4_irq[i] = ae4_msix->msix_entry[i].vector;
> +	}
> +
> +	return ret;
> +}
> +
> +static void ae4_free_irqs(struct ae4_device *ae4)
> +{
> +	struct ae4_msix *ae4_msix = ae4->ae4_msix;
> +	struct pt_device *pt = &ae4->pt;
> +	struct device *dev = pt->dev;
> +	struct pci_dev *pdev;
> +
> +	pdev = to_pci_dev(dev);
> +
> +	if (ae4_msix && (ae4_msix->msix_count || ae4->ae4_irq[MAX_AE4_HW_QUEUES - 1]))
> +		pci_free_irq_vectors(pdev);
> +}
> +
> +static void ae4_deinit(struct ae4_device *ae4)
> +{
> +	ae4_free_irqs(ae4);
> +}
> +
> +static int ae4_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct ae4_device *ae4;
> +	struct pt_device *pt;
> +	int bar_mask;
> +	int ret = 0;
> +
> +	ae4 = devm_kzalloc(dev, sizeof(*ae4), GFP_KERNEL);
> +	if (!ae4)
> +		return -ENOMEM;
> +
> +	ae4->ae4_msix = devm_kzalloc(dev, sizeof(struct ae4_msix), GFP_KERNEL);
> +	if (!ae4->ae4_msix)
> +		return -ENOMEM;
> +
> +	ret = pcim_enable_device(pdev);
> +	if (ret)
> +		goto ae4_error;
> +
> +	bar_mask = pci_select_bars(pdev, IORESOURCE_MEM);
> +	ret = pcim_iomap_regions(pdev, bar_mask, "ae4dma");
> +	if (ret)
> +		goto ae4_error;
> +
> +	pt = &ae4->pt;
> +	pt->dev = dev;
> +
> +	pt->io_regs = pcim_iomap_table(pdev)[0];
> +	if (!pt->io_regs) {
> +		ret = -ENOMEM;
> +		goto ae4_error;
> +	}
> +
> +	ret = ae4_get_irqs(ae4);
> +	if (ret < 0)
> +		goto ae4_error;
> +
> +	pci_set_master(pdev);
> +
> +	dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
> +
> +	dev_set_drvdata(dev, ae4);
> +
> +	ret = ae4_core_init(ae4);
> +	if (ret)
> +		goto ae4_error;
> +
> +	return 0;
> +
> +ae4_error:
> +	ae4_deinit(ae4);
> +
> +	return ret;
> +}
> +
> +static void ae4_pci_remove(struct pci_dev *pdev)
> +{
> +	struct ae4_device *ae4 = dev_get_drvdata(&pdev->dev);
> +
> +	ae4_destroy_work(ae4);
> +	ae4_deinit(ae4);
> +}
> +
> +static const struct pci_device_id ae4_pci_table[] = {
> +	{ PCI_VDEVICE(AMD, 0x14C8), },
> +	{ PCI_VDEVICE(AMD, 0x14DC), },
> +	{ PCI_VDEVICE(AMD, 0x149B), },
> +	/* Last entry must be zero */
> +	{ 0, }
> +};
> +MODULE_DEVICE_TABLE(pci, ae4_pci_table);
> +
> +static struct pci_driver ae4_pci_driver = {
> +	.name = "ae4dma",
> +	.id_table = ae4_pci_table,
> +	.probe = ae4_pci_probe,
> +	.remove = ae4_pci_remove,
> +};
> +
> +module_pci_driver(ae4_pci_driver);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_DESCRIPTION("AMD AE4DMA driver");
> diff --git a/drivers/dma/amd/ae4dma/ae4dma.h b/drivers/dma/amd/ae4dma/ae4dma.h
> new file mode 100644
> index 000000000000..a63525792080
> --- /dev/null
> +++ b/drivers/dma/amd/ae4dma/ae4dma.h
> @@ -0,0 +1,85 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * AMD AE4DMA driver
> + *
> + * Copyright (c) 2024, Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
> + */
> +#ifndef __AE4DMA_H__
> +#define __AE4DMA_H__
> +
> +#include "../common/amd_dma.h"
> +
> +#define MAX_AE4_HW_QUEUES		16
> +
> +#define AE4_DESC_COMPLETED		0x03
> +
> +#define AE4_MAX_IDX_OFF			0x08
> +#define AE4_RD_IDX_OFF			0x0C
> +#define AE4_WR_IDX_OFF			0x10
> +#define AE4_INTR_STS_OFF		0x14
> +#define AE4_Q_BASE_L_OFF		0x18
> +#define AE4_Q_BASE_H_OFF		0x1C
> +#define AE4_Q_SZ			0x20

lower case for hex values please

> +
> +struct ae4_msix {
> +	int msix_count;
> +	struct msix_entry msix_entry[MAX_AE4_HW_QUEUES];
> +};
> +
> +struct ae4_cmd_queue {
> +	struct ae4_device *ae4;
> +	struct pt_cmd_queue cmd_q;
> +	struct list_head cmd;
> +	/* protect command operations */
> +	struct mutex cmd_lock;
> +	struct delayed_work p_work;
> +	struct workqueue_struct *pws;
> +	struct completion cmp;
> +	wait_queue_head_t q_w;
> +	atomic64_t intr_cnt;
> +	atomic64_t done_cnt;
> +	u64 q_cmd_count;
> +	u32 dridx;
> +	u32 id;
> +};
> +
> +union dwou {
> +	u32 dw0;
> +	struct dword0 {
nice naming dw0 dword0!

> +	u8	byte0;
> +	u8	byte1;
> +	u16	timestamp;
> +	} dws;
> +};
> +
> +struct dword1 {
> +	u8	status;
> +	u8	err_code;
> +	u16	desc_id;
> +};
> +
> +struct ae4dma_desc {
> +	union dwou dwouv;
> +	struct dword1 dw1;
> +	u32 length;
> +	u32 rsvd;
> +	u32 src_hi;
> +	u32 src_lo;
> +	u32 dst_hi;
> +	u32 dst_lo;
> +};
> +
> +struct ae4_device {
> +	struct pt_device pt;
> +	struct ae4_msix *ae4_msix;
> +	struct ae4_cmd_queue ae4cmd_q[MAX_AE4_HW_QUEUES];
> +	unsigned int ae4_irq[MAX_AE4_HW_QUEUES];
> +	unsigned int cmd_q_count;
> +};
> +
> +int ae4_core_init(struct ae4_device *ae4);
> +void ae4_destroy_work(struct ae4_device *ae4);
> +#endif
> diff --git a/drivers/dma/amd/common/amd_dma.h b/drivers/dma/amd/common/amd_dma.h
> new file mode 100644
> index 000000000000..f9f396cd4371
> --- /dev/null
> +++ b/drivers/dma/amd/common/amd_dma.h
> @@ -0,0 +1,26 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * AMD DMA Driver common
> + *
> + * Copyright (c) 2024, Advanced Micro Devices, Inc.
> + * All Rights Reserved.
> + *
> + * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
> + */
> +
> +#ifndef AMD_DMA_H
> +#define AMD_DMA_H
> +
> +#include <linux/device.h>
> +#include <linux/dmaengine.h>
> +#include <linux/dmapool.h>
> +#include <linux/list.h>
> +#include <linux/mutex.h>
> +#include <linux/pci.h>
> +#include <linux/spinlock.h>
> +#include <linux/wait.h>
> +
> +#include "../ptdma/ptdma.h"
> +#include "../../virt-dma.h"

Header which adds other headers? That does not look good?
Why not add these in C files directly


-- 
~Vinod

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 4/7] dmaengine: ptdma: Extend ptdma to support multi-channel and version
  2024-07-08 14:44 ` [PATCH v5 4/7] dmaengine: ptdma: Extend ptdma to support multi-channel and version Basavaraj Natikar
@ 2024-08-28 17:46   ` Vinod Koul
  2024-09-02  7:01     ` Basavaraj Natikar
  0 siblings, 1 reply; 14+ messages in thread
From: Vinod Koul @ 2024-08-28 17:46 UTC (permalink / raw)
  To: Basavaraj Natikar; +Cc: dmaengine, Raju.Rangoju, Frank.li, helgaas, pstanner

On 08-07-24, 20:14, Basavaraj Natikar wrote:
> To support multi-channel functionality with AE4DMA engine, extend the
> PTDMA code with reusable components.
> 
> Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
> Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
> ---
>  drivers/dma/amd/ae4dma/ae4dma.h         |   2 +
>  drivers/dma/amd/common/amd_dma.h        |   1 +
>  drivers/dma/amd/ptdma/ptdma-dmaengine.c | 105 +++++++++++++++++++-----
>  drivers/dma/amd/ptdma/ptdma.h           |   2 +
>  4 files changed, 90 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/dma/amd/ae4dma/ae4dma.h b/drivers/dma/amd/ae4dma/ae4dma.h
> index a63525792080..5f9dab5f05f4 100644
> --- a/drivers/dma/amd/ae4dma/ae4dma.h
> +++ b/drivers/dma/amd/ae4dma/ae4dma.h
> @@ -24,6 +24,8 @@
>  #define AE4_Q_BASE_H_OFF		0x1C
>  #define AE4_Q_SZ			0x20
>  
> +#define AE4_DMA_VERSION			4

Can it be driver data for the device?
I am not seeing who sets this here?

> +
>  struct ae4_msix {
>  	int msix_count;
>  	struct msix_entry msix_entry[MAX_AE4_HW_QUEUES];
> diff --git a/drivers/dma/amd/common/amd_dma.h b/drivers/dma/amd/common/amd_dma.h
> index f9f396cd4371..396667e81e1a 100644
> --- a/drivers/dma/amd/common/amd_dma.h
> +++ b/drivers/dma/amd/common/amd_dma.h
> @@ -21,6 +21,7 @@
>  #include <linux/wait.h>
>  
>  #include "../ptdma/ptdma.h"
> +#include "../ae4dma/ae4dma.h"
>  #include "../../virt-dma.h"
>  
>  #endif
> diff --git a/drivers/dma/amd/ptdma/ptdma-dmaengine.c b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
> index 66ea10499643..90ca02fd5f8f 100644
> --- a/drivers/dma/amd/ptdma/ptdma-dmaengine.c
> +++ b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
> @@ -43,7 +43,24 @@ static void pt_do_cleanup(struct virt_dma_desc *vd)
>  	kmem_cache_free(pt->dma_desc_cache, desc);
>  }
>  
> -static int pt_dma_start_desc(struct pt_dma_desc *desc)
> +static struct pt_cmd_queue *pt_get_cmd_queue(struct pt_device *pt, struct pt_dma_chan *chan)
> +{
> +	struct ae4_cmd_queue *ae4cmd_q;
> +	struct pt_cmd_queue *cmd_q;
> +	struct ae4_device *ae4;
> +
> +	if (pt->ver == AE4_DMA_VERSION) {
> +		ae4 = container_of(pt, struct ae4_device, pt);
> +		ae4cmd_q = &ae4->ae4cmd_q[chan->id];
> +		cmd_q = &ae4cmd_q->cmd_q;
> +	} else {
> +		cmd_q = &pt->cmd_q;
> +	}
> +
> +	return cmd_q;
> +}
> +
> +static int pt_dma_start_desc(struct pt_dma_desc *desc, struct pt_dma_chan *chan)
>  {
>  	struct pt_passthru_engine *pt_engine;
>  	struct pt_device *pt;
> @@ -54,7 +71,9 @@ static int pt_dma_start_desc(struct pt_dma_desc *desc)
>  
>  	pt_cmd = &desc->pt_cmd;
>  	pt = pt_cmd->pt;
> -	cmd_q = &pt->cmd_q;
> +
> +	cmd_q = pt_get_cmd_queue(pt, chan);
> +
>  	pt_engine = &pt_cmd->passthru;
>  
>  	pt->tdata.cmd = pt_cmd;
> @@ -149,7 +168,7 @@ static void pt_cmd_callback(void *data, int err)
>  		if (!desc)
>  			break;
>  
> -		ret = pt_dma_start_desc(desc);
> +		ret = pt_dma_start_desc(desc, chan);
>  		if (!ret)
>  			break;
>  
> @@ -184,7 +203,10 @@ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
>  {
>  	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
>  	struct pt_passthru_engine *pt_engine;
> +	struct pt_device *pt = chan->pt;
> +	struct ae4_cmd_queue *ae4cmd_q;
>  	struct pt_dma_desc *desc;
> +	struct ae4_device *ae4;
>  	struct pt_cmd *pt_cmd;
>  
>  	desc = pt_alloc_dma_desc(chan, flags);
> @@ -192,7 +214,7 @@ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
>  		return NULL;
>  
>  	pt_cmd = &desc->pt_cmd;
> -	pt_cmd->pt = chan->pt;
> +	pt_cmd->pt = pt;
>  	pt_engine = &pt_cmd->passthru;
>  	pt_cmd->engine = PT_ENGINE_PASSTHRU;
>  	pt_engine->src_dma = src;
> @@ -203,6 +225,14 @@ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
>  
>  	desc->len = len;
>  
> +	if (pt->ver == AE4_DMA_VERSION) {
> +		ae4 = container_of(pt, struct ae4_device, pt);
> +		ae4cmd_q = &ae4->ae4cmd_q[chan->id];
> +		mutex_lock(&ae4cmd_q->cmd_lock);
> +		list_add_tail(&pt_cmd->entry, &ae4cmd_q->cmd);
> +		mutex_unlock(&ae4cmd_q->cmd_lock);
> +	}
> +
>  	return desc;
>  }
>  
> @@ -260,8 +290,11 @@ static enum dma_status
>  pt_tx_status(struct dma_chan *c, dma_cookie_t cookie,
>  		struct dma_tx_state *txstate)
>  {
> -	struct pt_device *pt = to_pt_chan(c)->pt;
> -	struct pt_cmd_queue *cmd_q = &pt->cmd_q;
> +	struct pt_dma_chan *chan = to_pt_chan(c);
> +	struct pt_device *pt = chan->pt;
> +	struct pt_cmd_queue *cmd_q;
> +
> +	cmd_q = pt_get_cmd_queue(pt, chan);
>  
>  	pt_check_status_trans(pt, cmd_q);
>  	return dma_cookie_status(c, cookie, txstate);
> @@ -270,10 +303,13 @@ pt_tx_status(struct dma_chan *c, dma_cookie_t cookie,
>  static int pt_pause(struct dma_chan *dma_chan)
>  {
>  	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
> +	struct pt_device *pt = chan->pt;
> +	struct pt_cmd_queue *cmd_q;
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&chan->vc.lock, flags);
> -	pt_stop_queue(&chan->pt->cmd_q);
> +	cmd_q = pt_get_cmd_queue(pt, chan);
> +	pt_stop_queue(cmd_q);
>  	spin_unlock_irqrestore(&chan->vc.lock, flags);
>  
>  	return 0;
> @@ -283,10 +319,13 @@ static int pt_resume(struct dma_chan *dma_chan)
>  {
>  	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
>  	struct pt_dma_desc *desc = NULL;
> +	struct pt_device *pt = chan->pt;
> +	struct pt_cmd_queue *cmd_q;
>  	unsigned long flags;
>  
>  	spin_lock_irqsave(&chan->vc.lock, flags);
> -	pt_start_queue(&chan->pt->cmd_q);
> +	cmd_q = pt_get_cmd_queue(pt, chan);
> +	pt_start_queue(cmd_q);
>  	desc = pt_next_dma_desc(chan);
>  	spin_unlock_irqrestore(&chan->vc.lock, flags);
>  
> @@ -300,11 +339,17 @@ static int pt_resume(struct dma_chan *dma_chan)
>  static int pt_terminate_all(struct dma_chan *dma_chan)
>  {
>  	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
> +	struct pt_device *pt = chan->pt;
> +	struct pt_cmd_queue *cmd_q;
>  	unsigned long flags;
> -	struct pt_cmd_queue *cmd_q = &chan->pt->cmd_q;
>  	LIST_HEAD(head);
>  
> -	iowrite32(SUPPORTED_INTERRUPTS, cmd_q->reg_control + 0x0010);
> +	cmd_q = pt_get_cmd_queue(pt, chan);
> +	if (pt->ver == AE4_DMA_VERSION)
> +		pt_stop_queue(cmd_q);
> +	else
> +		iowrite32(SUPPORTED_INTERRUPTS, cmd_q->reg_control + 0x0010);
> +
>  	spin_lock_irqsave(&chan->vc.lock, flags);
>  	vchan_get_all_descriptors(&chan->vc, &head);
>  	spin_unlock_irqrestore(&chan->vc.lock, flags);
> @@ -317,14 +362,24 @@ static int pt_terminate_all(struct dma_chan *dma_chan)
>  
>  int pt_dmaengine_register(struct pt_device *pt)
>  {
> -	struct pt_dma_chan *chan;
>  	struct dma_device *dma_dev = &pt->dma_dev;
> -	char *cmd_cache_name;
> +	struct ae4_cmd_queue *ae4cmd_q = NULL;
> +	struct ae4_device *ae4 = NULL;
> +	struct pt_dma_chan *chan;
>  	char *desc_cache_name;
> -	int ret;
> +	char *cmd_cache_name;
> +	int ret, i;
> +
> +	if (pt->ver == AE4_DMA_VERSION)
> +		ae4 = container_of(pt, struct ae4_device, pt);
> +
> +	if (ae4)

in which case would ae4 be null but you are on version 4?

> +		pt->pt_dma_chan = devm_kcalloc(pt->dev, ae4->cmd_q_count,
> +					       sizeof(*pt->pt_dma_chan), GFP_KERNEL);
> +	else
> +		pt->pt_dma_chan = devm_kzalloc(pt->dev, sizeof(*pt->pt_dma_chan),
> +					       GFP_KERNEL);
>  
> -	pt->pt_dma_chan = devm_kzalloc(pt->dev, sizeof(*pt->pt_dma_chan),
> -				       GFP_KERNEL);
>  	if (!pt->pt_dma_chan)
>  		return -ENOMEM;
>  
> @@ -366,9 +421,6 @@ int pt_dmaengine_register(struct pt_device *pt)
>  
>  	INIT_LIST_HEAD(&dma_dev->channels);
>  
> -	chan = pt->pt_dma_chan;
> -	chan->pt = pt;
> -
>  	/* Set base and prep routines */
>  	dma_dev->device_free_chan_resources = pt_free_chan_resources;
>  	dma_dev->device_prep_dma_memcpy = pt_prep_dma_memcpy;
> @@ -380,8 +432,21 @@ int pt_dmaengine_register(struct pt_device *pt)
>  	dma_dev->device_terminate_all = pt_terminate_all;
>  	dma_dev->device_synchronize = pt_synchronize;
>  
> -	chan->vc.desc_free = pt_do_cleanup;
> -	vchan_init(&chan->vc, dma_dev);
> +	if (ae4) {
> +		for (i = 0; i < ae4->cmd_q_count; i++) {
> +			chan = pt->pt_dma_chan + i;
> +			ae4cmd_q = &ae4->ae4cmd_q[i];
> +			chan->id = ae4cmd_q->id;
> +			chan->pt = pt;
> +			chan->vc.desc_free = pt_do_cleanup;
> +			vchan_init(&chan->vc, dma_dev);
> +		}
> +	} else {
> +		chan = pt->pt_dma_chan;
> +		chan->pt = pt;
> +		chan->vc.desc_free = pt_do_cleanup;
> +		vchan_init(&chan->vc, dma_dev);
> +	}
>  
>  	ret = dma_async_device_register(dma_dev);
>  	if (ret)
> diff --git a/drivers/dma/amd/ptdma/ptdma.h b/drivers/dma/amd/ptdma/ptdma.h
> index 2690a32fc7cb..a6990021fe2b 100644
> --- a/drivers/dma/amd/ptdma/ptdma.h
> +++ b/drivers/dma/amd/ptdma/ptdma.h
> @@ -184,6 +184,7 @@ struct pt_dma_desc {
>  struct pt_dma_chan {
>  	struct virt_dma_chan vc;
>  	struct pt_device *pt;
> +	u32 id;
>  };
>  
>  struct pt_cmd_queue {
> @@ -262,6 +263,7 @@ struct pt_device {
>  	unsigned long total_interrupts;
>  
>  	struct pt_tasklet_data tdata;
> +	int ver;
>  };
>  
>  /*
> -- 
> 2.25.1

-- 
~Vinod

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 2/7] dmaengine: ae4dma: Add AMD ae4dma controller driver
  2024-08-28 13:23   ` Vinod Koul
@ 2024-09-02  6:47     ` Basavaraj Natikar
  0 siblings, 0 replies; 14+ messages in thread
From: Basavaraj Natikar @ 2024-09-02  6:47 UTC (permalink / raw)
  To: Vinod Koul, Basavaraj Natikar
  Cc: dmaengine, Raju.Rangoju, Frank.li, helgaas, pstanner


On 8/28/2024 6:53 PM, Vinod Koul wrote:
> On 08-07-24, 20:14, Basavaraj Natikar wrote:
>> Add support for AMD AE4DMA controller. It performs high-bandwidth
>> memory to memory and IO copy operation. Device commands are managed
>> via a circular queue of 'descriptors', each of which specifies source
>> and destination addresses for copying a single buffer of data.
>>
>> Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
>> Reviewed-by: Philipp Stanner <pstanner@redhat.com>
>> Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>> ---
>>  MAINTAINERS                         |   6 +
>>  drivers/dma/amd/Kconfig             |   1 +
>>  drivers/dma/amd/Makefile            |   1 +
>>  drivers/dma/amd/ae4dma/Kconfig      |  14 ++
>>  drivers/dma/amd/ae4dma/Makefile     |  10 ++
>>  drivers/dma/amd/ae4dma/ae4dma-dev.c | 198 ++++++++++++++++++++++++++++
>>  drivers/dma/amd/ae4dma/ae4dma-pci.c | 157 ++++++++++++++++++++++
>>  drivers/dma/amd/ae4dma/ae4dma.h     |  85 ++++++++++++
>>  drivers/dma/amd/common/amd_dma.h    |  26 ++++
>>  9 files changed, 498 insertions(+)
>>  create mode 100644 drivers/dma/amd/ae4dma/Kconfig
>>  create mode 100644 drivers/dma/amd/ae4dma/Makefile
>>  create mode 100644 drivers/dma/amd/ae4dma/ae4dma-dev.c
>>  create mode 100644 drivers/dma/amd/ae4dma/ae4dma-pci.c
>>  create mode 100644 drivers/dma/amd/ae4dma/ae4dma.h
>>  create mode 100644 drivers/dma/amd/common/amd_dma.h
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 33a1049fd38b..539bf52410de 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -947,6 +947,12 @@ L:	linux-edac@vger.kernel.org
>>  S:	Supported
>>  F:	drivers/ras/amd/atl/*
>>  
>> +AMD AE4DMA DRIVER
>> +M:	Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>> +L:	dmaengine@vger.kernel.org
>> +S:	Maintained
>> +F:	drivers/dma/amd/ae4dma/
>> +
>>  AMD AXI W1 DRIVER
>>  M:	Kris Chaplin <kris.chaplin@amd.com>
>>  R:	Thomas Delev <thomas.delev@amd.com>
>> diff --git a/drivers/dma/amd/Kconfig b/drivers/dma/amd/Kconfig
>> index 8246b463bcf7..8c25a3ed6b94 100644
>> --- a/drivers/dma/amd/Kconfig
>> +++ b/drivers/dma/amd/Kconfig
>> @@ -3,3 +3,4 @@
>>  # AMD DMA Drivers
>>  
>>  source "drivers/dma/amd/ptdma/Kconfig"
>> +source "drivers/dma/amd/ae4dma/Kconfig"
>> diff --git a/drivers/dma/amd/Makefile b/drivers/dma/amd/Makefile
>> index dd7257ba7e06..8049b06a9ff5 100644
>> --- a/drivers/dma/amd/Makefile
>> +++ b/drivers/dma/amd/Makefile
>> @@ -4,3 +4,4 @@
>>  #
>>  
>>  obj-$(CONFIG_AMD_PTDMA) += ptdma/
>> +obj-$(CONFIG_AMD_AE4DMA) += ae4dma/
>> diff --git a/drivers/dma/amd/ae4dma/Kconfig b/drivers/dma/amd/ae4dma/Kconfig
>> new file mode 100644
>> index 000000000000..ea8a7fe68df5
>> --- /dev/null
>> +++ b/drivers/dma/amd/ae4dma/Kconfig
>> @@ -0,0 +1,14 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +config AMD_AE4DMA
>> +	tristate  "AMD AE4DMA Engine"
>> +	depends on (X86_64 || COMPILE_TEST) && PCI
>> +	depends on AMD_PTDMA
>> +	select DMA_ENGINE
>> +	select DMA_VIRTUAL_CHANNELS
>> +	help
>> +	  Enable support for the AMD AE4DMA controller. This controller
>> +	  provides DMA capabilities to perform high bandwidth memory to
>> +	  memory and IO copy operations. It performs DMA transfer through
>> +	  queue-based descriptor management. This DMA controller is intended
>> +	  to be used with AMD Non-Transparent Bridge devices and not for
>> +	  general purpose peripheral DMA.
>> diff --git a/drivers/dma/amd/ae4dma/Makefile b/drivers/dma/amd/ae4dma/Makefile
>> new file mode 100644
>> index 000000000000..e918f85a80ec
>> --- /dev/null
>> +++ b/drivers/dma/amd/ae4dma/Makefile
>> @@ -0,0 +1,10 @@
>> +# SPDX-License-Identifier: GPL-2.0
>> +#
>> +# AMD AE4DMA driver
>> +#
>> +
>> +obj-$(CONFIG_AMD_AE4DMA) += ae4dma.o
>> +
>> +ae4dma-objs := ae4dma-dev.o
>> +
>> +ae4dma-$(CONFIG_PCI) += ae4dma-pci.o
>> diff --git a/drivers/dma/amd/ae4dma/ae4dma-dev.c b/drivers/dma/amd/ae4dma/ae4dma-dev.c
>> new file mode 100644
>> index 000000000000..c38464b96fc6
>> --- /dev/null
>> +++ b/drivers/dma/amd/ae4dma/ae4dma-dev.c
>> @@ -0,0 +1,198 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * AMD AE4DMA driver
>> + *
>> + * Copyright (c) 2024, Advanced Micro Devices, Inc.
>> + * All Rights Reserved.
>> + *
>> + * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>> + */
>> +
>> +#include "ae4dma.h"
>> +
>> +static unsigned int max_hw_q = 1;
>> +module_param(max_hw_q, uint, 0444);
>> +MODULE_PARM_DESC(max_hw_q, "max hw queues supported by engine (any non-zero value, default: 1)");
>> +
>> +static char *ae4_error_codes[] = {
>> +	"",
>> +	"ERR 01: INVALID HEADER DW0",
>> +	"ERR 02: INVALID STATUS",
>> +	"ERR 03: INVALID LENGTH - 4 BYTE ALIGNMENT",
>> +	"ERR 04: INVALID SRC ADDR - 4 BYTE ALIGNMENT",
>> +	"ERR 05: INVALID DST ADDR - 4 BYTE ALIGNMENT",
>> +	"ERR 06: INVALID ALIGNMENT",
>> +	"ERR 07: INVALID DESCRIPTOR",
>> +};
>> +
>> +static void ae4_log_error(struct pt_device *d, int e)
>> +{
>> +	/* ERR 01 - 07 represents Invalid AE4 errors */
>> +	if (e <= 7)
>> +		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", ae4_error_codes[e], e);
>> +	/* ERR 08 - 15 represents Invalid Descriptor errors */
>> +	else if (e > 7 && e <= 15)
>> +		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "INVALID DESCRIPTOR", e);
>> +	/* ERR 16 - 31 represents Firmware errors */
>> +	else if (e > 15 && e <= 31)
>> +		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "FIRMWARE ERROR", e);
>> +	/* ERR 32 - 63 represents Fatal errors */
>> +	else if (e > 31 && e <= 63)
>> +		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "FATAL ERROR", e);
>> +	/* ERR 64 - 255 represents PTE errors */
>> +	else if (e > 63 && e <= 255)
>> +		dev_info(d->dev, "AE4DMA error: %s (0x%x)\n", "PTE ERROR", e);
>> +	else
>> +		dev_info(d->dev, "Unknown AE4DMA error");
>> +}
>> +
>> +static void ae4_check_status_error(struct ae4_cmd_queue *ae4cmd_q, int idx)
>> +{
>> +	struct pt_cmd_queue *cmd_q = &ae4cmd_q->cmd_q;
>> +	struct ae4dma_desc desc;
>> +	u8 status;
>> +
>> +	memcpy(&desc, &cmd_q->qbase[idx], sizeof(struct ae4dma_desc));
>> +	status = desc.dw1.status;
>> +	if (status && status != AE4_DESC_COMPLETED) {
>> +		cmd_q->cmd_error = desc.dw1.err_code;
>> +		if (cmd_q->cmd_error)
>> +			ae4_log_error(cmd_q->pt, cmd_q->cmd_error);
>> +	}
>> +}
>> +
>> +static void ae4_pending_work(struct work_struct *work)
>> +{
>> +	struct ae4_cmd_queue *ae4cmd_q = container_of(work, struct ae4_cmd_queue, p_work.work);
>> +	struct pt_cmd_queue *cmd_q = &ae4cmd_q->cmd_q;
>> +	struct pt_cmd *cmd;
>> +	u32 cridx;
>> +
>> +	for (;;) {
>> +		wait_event_interruptible(ae4cmd_q->q_w,
>> +					 ((atomic64_read(&ae4cmd_q->done_cnt)) <
>> +					   atomic64_read(&ae4cmd_q->intr_cnt)));
>> +
>> +		atomic64_inc(&ae4cmd_q->done_cnt);
>> +
>> +		mutex_lock(&ae4cmd_q->cmd_lock);
>> +		cridx = readl(cmd_q->reg_control + AE4_RD_IDX_OFF);
>> +		while ((ae4cmd_q->dridx != cridx) && !list_empty(&ae4cmd_q->cmd)) {
>> +			cmd = list_first_entry(&ae4cmd_q->cmd, struct pt_cmd, entry);
>> +			list_del(&cmd->entry);
>> +
>> +			ae4_check_status_error(ae4cmd_q, ae4cmd_q->dridx);
>> +			cmd->pt_cmd_callback(cmd->data, cmd->ret);
>> +
>> +			ae4cmd_q->q_cmd_count--;
>> +			ae4cmd_q->dridx = (ae4cmd_q->dridx + 1) % CMD_Q_LEN;
>> +
>> +			complete_all(&ae4cmd_q->cmp);
>> +		}
>> +		mutex_unlock(&ae4cmd_q->cmd_lock);
>> +	}
>> +}
>> +
>> +static irqreturn_t ae4_core_irq_handler(int irq, void *data)
>> +{
>> +	struct ae4_cmd_queue *ae4cmd_q = data;
>> +	struct pt_cmd_queue *cmd_q;
>> +	struct pt_device *pt;
>> +	u32 status;
>> +
>> +	cmd_q = &ae4cmd_q->cmd_q;
>> +	pt = cmd_q->pt;
>> +
>> +	pt->total_interrupts++;
>> +	atomic64_inc(&ae4cmd_q->intr_cnt);
>> +
>> +	wake_up(&ae4cmd_q->q_w);
> shouldnt this be last thing before returning

Sure, I will move this before returning.

>
>> +
>> +	status = readl(cmd_q->reg_control + AE4_INTR_STS_OFF);
>> +	if (status & BIT(0)) {
>> +		status &= GENMASK(31, 1);
>> +		writel(status, cmd_q->reg_control + AE4_INTR_STS_OFF);
>> +	}
>> +
>> +	return IRQ_HANDLED;
>> +}
>> +
>> +void ae4_destroy_work(struct ae4_device *ae4)
>> +{
>> +	struct ae4_cmd_queue *ae4cmd_q;
>> +	int i;
>> +
>> +	for (i = 0; i < ae4->cmd_q_count; i++) {
>> +		ae4cmd_q = &ae4->ae4cmd_q[i];
>> +
>> +		if (!ae4cmd_q->pws)
>> +			break;
>> +
>> +		cancel_delayed_work_sync(&ae4cmd_q->p_work);
>> +		destroy_workqueue(ae4cmd_q->pws);
>> +	}
>> +}
>> +
>> +int ae4_core_init(struct ae4_device *ae4)
>> +{
>> +	struct pt_device *pt = &ae4->pt;
>> +	struct ae4_cmd_queue *ae4cmd_q;
>> +	struct device *dev = pt->dev;
>> +	struct pt_cmd_queue *cmd_q;
>> +	int i, ret = 0;
>> +
>> +	writel(max_hw_q, pt->io_regs);
>> +
>> +	for (i = 0; i < max_hw_q; i++) {
>> +		ae4cmd_q = &ae4->ae4cmd_q[i];
>> +		ae4cmd_q->id = ae4->cmd_q_count;
>> +		ae4->cmd_q_count++;
>> +
>> +		cmd_q = &ae4cmd_q->cmd_q;
>> +		cmd_q->pt = pt;
>> +
>> +		cmd_q->reg_control = pt->io_regs + ((i + 1) * AE4_Q_SZ);
>> +
>> +		ret = devm_request_irq(dev, ae4->ae4_irq[i], ae4_core_irq_handler, 0,
>> +				       dev_name(pt->dev), ae4cmd_q);
>> +		if (ret)
>> +			return ret;
>> +
>> +		cmd_q->qsize = Q_SIZE(sizeof(struct ae4dma_desc));
>> +
>> +		cmd_q->qbase = dmam_alloc_coherent(dev, cmd_q->qsize, &cmd_q->qbase_dma,
>> +						   GFP_KERNEL);
>> +		if (!cmd_q->qbase)
>> +			return -ENOMEM;
>> +	}
>> +
>> +	for (i = 0; i < ae4->cmd_q_count; i++) {
>> +		ae4cmd_q = &ae4->ae4cmd_q[i];
>> +
>> +		cmd_q = &ae4cmd_q->cmd_q;
>> +
>> +		cmd_q->reg_control = pt->io_regs + ((i + 1) * AE4_Q_SZ);
>> +
>> +		/* Update the device registers with queue information. */
>> +		writel(CMD_Q_LEN, cmd_q->reg_control + AE4_MAX_IDX_OFF);
>> +
>> +		cmd_q->qdma_tail = cmd_q->qbase_dma;
>> +		writel(lower_32_bits(cmd_q->qdma_tail), cmd_q->reg_control + AE4_Q_BASE_L_OFF);
>> +		writel(upper_32_bits(cmd_q->qdma_tail), cmd_q->reg_control + AE4_Q_BASE_H_OFF);
>> +
>> +		INIT_LIST_HEAD(&ae4cmd_q->cmd);
>> +		init_waitqueue_head(&ae4cmd_q->q_w);
>> +
>> +		ae4cmd_q->pws = alloc_ordered_workqueue("ae4dma_%d", WQ_MEM_RECLAIM, ae4cmd_q->id);
>> +		if (!ae4cmd_q->pws) {
>> +			ae4_destroy_work(ae4);
>> +			return -ENOMEM;
>> +		}
>> +		INIT_DELAYED_WORK(&ae4cmd_q->p_work, ae4_pending_work);
>> +		queue_delayed_work(ae4cmd_q->pws, &ae4cmd_q->p_work,  usecs_to_jiffies(100));
>> +
>> +		init_completion(&ae4cmd_q->cmp);
>> +	}
>> +
>> +	return ret;
>> +}
>> diff --git a/drivers/dma/amd/ae4dma/ae4dma-pci.c b/drivers/dma/amd/ae4dma/ae4dma-pci.c
>> new file mode 100644
>> index 000000000000..43d36e9d1efb
>> --- /dev/null
>> +++ b/drivers/dma/amd/ae4dma/ae4dma-pci.c
>> @@ -0,0 +1,157 @@
>> +// SPDX-License-Identifier: GPL-2.0
>> +/*
>> + * AMD AE4DMA driver
>> + *
>> + * Copyright (c) 2024, Advanced Micro Devices, Inc.
>> + * All Rights Reserved.
>> + *
>> + * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>> + */
>> +
>> +#include "ae4dma.h"
>> +
>> +static int ae4_get_irqs(struct ae4_device *ae4)
>> +{
>> +	struct ae4_msix *ae4_msix = ae4->ae4_msix;
>> +	struct pt_device *pt = &ae4->pt;
>> +	struct device *dev = pt->dev;
>> +	struct pci_dev *pdev;
>> +	int i, v, ret;
>> +
>> +	pdev = to_pci_dev(dev);
>> +
>> +	for (v = 0; v < ARRAY_SIZE(ae4_msix->msix_entry); v++)
>> +		ae4_msix->msix_entry[v].entry = v;
>> +
>> +	ret = pci_alloc_irq_vectors(pdev, v, v, PCI_IRQ_MSIX);
>> +	if (ret != v) {
>> +		if (ret > 0)
>> +			pci_free_irq_vectors(pdev);
>> +
>> +		dev_err(dev, "could not enable MSI-X (%d), trying MSI\n", ret);
>> +		ret = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSI);
>> +		if (ret < 0) {
>> +			dev_err(dev, "could not enable MSI (%d)\n", ret);
>> +			return ret;
>> +		}
>> +
>> +		ret = pci_irq_vector(pdev, 0);
>> +		if (ret < 0) {
>> +			pci_free_irq_vectors(pdev);
>> +			return ret;
>> +		}
>> +
>> +		for (i = 0; i < MAX_AE4_HW_QUEUES; i++)
>> +			ae4->ae4_irq[i] = ret;
>> +
>> +	} else {
>> +		ae4_msix->msix_count = ret;
>> +		for (i = 0; i < MAX_AE4_HW_QUEUES; i++)
>> +			ae4->ae4_irq[i] = ae4_msix->msix_entry[i].vector;
>> +	}
>> +
>> +	return ret;
>> +}
>> +
>> +static void ae4_free_irqs(struct ae4_device *ae4)
>> +{
>> +	struct ae4_msix *ae4_msix = ae4->ae4_msix;
>> +	struct pt_device *pt = &ae4->pt;
>> +	struct device *dev = pt->dev;
>> +	struct pci_dev *pdev;
>> +
>> +	pdev = to_pci_dev(dev);
>> +
>> +	if (ae4_msix && (ae4_msix->msix_count || ae4->ae4_irq[MAX_AE4_HW_QUEUES - 1]))
>> +		pci_free_irq_vectors(pdev);
>> +}
>> +
>> +static void ae4_deinit(struct ae4_device *ae4)
>> +{
>> +	ae4_free_irqs(ae4);
>> +}
>> +
>> +static int ae4_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>> +{
>> +	struct device *dev = &pdev->dev;
>> +	struct ae4_device *ae4;
>> +	struct pt_device *pt;
>> +	int bar_mask;
>> +	int ret = 0;
>> +
>> +	ae4 = devm_kzalloc(dev, sizeof(*ae4), GFP_KERNEL);
>> +	if (!ae4)
>> +		return -ENOMEM;
>> +
>> +	ae4->ae4_msix = devm_kzalloc(dev, sizeof(struct ae4_msix), GFP_KERNEL);
>> +	if (!ae4->ae4_msix)
>> +		return -ENOMEM;
>> +
>> +	ret = pcim_enable_device(pdev);
>> +	if (ret)
>> +		goto ae4_error;
>> +
>> +	bar_mask = pci_select_bars(pdev, IORESOURCE_MEM);
>> +	ret = pcim_iomap_regions(pdev, bar_mask, "ae4dma");
>> +	if (ret)
>> +		goto ae4_error;
>> +
>> +	pt = &ae4->pt;
>> +	pt->dev = dev;
>> +
>> +	pt->io_regs = pcim_iomap_table(pdev)[0];
>> +	if (!pt->io_regs) {
>> +		ret = -ENOMEM;
>> +		goto ae4_error;
>> +	}
>> +
>> +	ret = ae4_get_irqs(ae4);
>> +	if (ret < 0)
>> +		goto ae4_error;
>> +
>> +	pci_set_master(pdev);
>> +
>> +	dma_set_mask_and_coherent(dev, DMA_BIT_MASK(48));
>> +
>> +	dev_set_drvdata(dev, ae4);
>> +
>> +	ret = ae4_core_init(ae4);
>> +	if (ret)
>> +		goto ae4_error;
>> +
>> +	return 0;
>> +
>> +ae4_error:
>> +	ae4_deinit(ae4);
>> +
>> +	return ret;
>> +}
>> +
>> +static void ae4_pci_remove(struct pci_dev *pdev)
>> +{
>> +	struct ae4_device *ae4 = dev_get_drvdata(&pdev->dev);
>> +
>> +	ae4_destroy_work(ae4);
>> +	ae4_deinit(ae4);
>> +}
>> +
>> +static const struct pci_device_id ae4_pci_table[] = {
>> +	{ PCI_VDEVICE(AMD, 0x14C8), },
>> +	{ PCI_VDEVICE(AMD, 0x14DC), },
>> +	{ PCI_VDEVICE(AMD, 0x149B), },
>> +	/* Last entry must be zero */
>> +	{ 0, }
>> +};
>> +MODULE_DEVICE_TABLE(pci, ae4_pci_table);
>> +
>> +static struct pci_driver ae4_pci_driver = {
>> +	.name = "ae4dma",
>> +	.id_table = ae4_pci_table,
>> +	.probe = ae4_pci_probe,
>> +	.remove = ae4_pci_remove,
>> +};
>> +
>> +module_pci_driver(ae4_pci_driver);
>> +
>> +MODULE_LICENSE("GPL");
>> +MODULE_DESCRIPTION("AMD AE4DMA driver");
>> diff --git a/drivers/dma/amd/ae4dma/ae4dma.h b/drivers/dma/amd/ae4dma/ae4dma.h
>> new file mode 100644
>> index 000000000000..a63525792080
>> --- /dev/null
>> +++ b/drivers/dma/amd/ae4dma/ae4dma.h
>> @@ -0,0 +1,85 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * AMD AE4DMA driver
>> + *
>> + * Copyright (c) 2024, Advanced Micro Devices, Inc.
>> + * All Rights Reserved.
>> + *
>> + * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>> + */
>> +#ifndef __AE4DMA_H__
>> +#define __AE4DMA_H__
>> +
>> +#include "../common/amd_dma.h"
>> +
>> +#define MAX_AE4_HW_QUEUES		16
>> +
>> +#define AE4_DESC_COMPLETED		0x03
>> +
>> +#define AE4_MAX_IDX_OFF			0x08
>> +#define AE4_RD_IDX_OFF			0x0C
>> +#define AE4_WR_IDX_OFF			0x10
>> +#define AE4_INTR_STS_OFF		0x14
>> +#define AE4_Q_BASE_L_OFF		0x18
>> +#define AE4_Q_BASE_H_OFF		0x1C
>> +#define AE4_Q_SZ			0x20
> lower case for hex values please

Sure, I will change it to lowercase.

>
>> +
>> +struct ae4_msix {
>> +	int msix_count;
>> +	struct msix_entry msix_entry[MAX_AE4_HW_QUEUES];
>> +};
>> +
>> +struct ae4_cmd_queue {
>> +	struct ae4_device *ae4;
>> +	struct pt_cmd_queue cmd_q;
>> +	struct list_head cmd;
>> +	/* protect command operations */
>> +	struct mutex cmd_lock;
>> +	struct delayed_work p_work;
>> +	struct workqueue_struct *pws;
>> +	struct completion cmp;
>> +	wait_queue_head_t q_w;
>> +	atomic64_t intr_cnt;
>> +	atomic64_t done_cnt;
>> +	u64 q_cmd_count;
>> +	u32 dridx;
>> +	u32 id;
>> +};
>> +
>> +union dwou {
>> +	u32 dw0;
>> +	struct dword0 {
> nice naming dw0 dword0!
>
>> +	u8	byte0;
>> +	u8	byte1;
>> +	u16	timestamp;
>> +	} dws;
>> +};
>> +
>> +struct dword1 {
>> +	u8	status;
>> +	u8	err_code;
>> +	u16	desc_id;
>> +};
>> +
>> +struct ae4dma_desc {
>> +	union dwou dwouv;
>> +	struct dword1 dw1;
>> +	u32 length;
>> +	u32 rsvd;
>> +	u32 src_hi;
>> +	u32 src_lo;
>> +	u32 dst_hi;
>> +	u32 dst_lo;
>> +};
>> +
>> +struct ae4_device {
>> +	struct pt_device pt;
>> +	struct ae4_msix *ae4_msix;
>> +	struct ae4_cmd_queue ae4cmd_q[MAX_AE4_HW_QUEUES];
>> +	unsigned int ae4_irq[MAX_AE4_HW_QUEUES];
>> +	unsigned int cmd_q_count;
>> +};
>> +
>> +int ae4_core_init(struct ae4_device *ae4);
>> +void ae4_destroy_work(struct ae4_device *ae4);
>> +#endif
>> diff --git a/drivers/dma/amd/common/amd_dma.h b/drivers/dma/amd/common/amd_dma.h
>> new file mode 100644
>> index 000000000000..f9f396cd4371
>> --- /dev/null
>> +++ b/drivers/dma/amd/common/amd_dma.h
>> @@ -0,0 +1,26 @@
>> +/* SPDX-License-Identifier: GPL-2.0 */
>> +/*
>> + * AMD DMA Driver common
>> + *
>> + * Copyright (c) 2024, Advanced Micro Devices, Inc.
>> + * All Rights Reserved.
>> + *
>> + * Author: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>> + */
>> +
>> +#ifndef AMD_DMA_H
>> +#define AMD_DMA_H
>> +
>> +#include <linux/device.h>
>> +#include <linux/dmaengine.h>
>> +#include <linux/dmapool.h>
>> +#include <linux/list.h>
>> +#include <linux/mutex.h>
>> +#include <linux/pci.h>
>> +#include <linux/spinlock.h>
>> +#include <linux/wait.h>
>> +
>> +#include "../ptdma/ptdma.h"
>> +#include "../../virt-dma.h"
> Header which adds other headers? That does not look good?
> Why not add these in C files directly

Sure, I will remove the common directory and include
everything in the C files.

Thanks,
--
Basavaraj

>
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 4/7] dmaengine: ptdma: Extend ptdma to support multi-channel and version
  2024-08-28 17:46   ` Vinod Koul
@ 2024-09-02  7:01     ` Basavaraj Natikar
  0 siblings, 0 replies; 14+ messages in thread
From: Basavaraj Natikar @ 2024-09-02  7:01 UTC (permalink / raw)
  To: Vinod Koul, Basavaraj Natikar
  Cc: dmaengine, Raju.Rangoju, Frank.li, helgaas, pstanner


On 8/28/2024 11:16 PM, Vinod Koul wrote:
> On 08-07-24, 20:14, Basavaraj Natikar wrote:
>> To support multi-channel functionality with AE4DMA engine, extend the
>> PTDMA code with reusable components.
>>
>> Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
>> Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>> ---
>>  drivers/dma/amd/ae4dma/ae4dma.h         |   2 +
>>  drivers/dma/amd/common/amd_dma.h        |   1 +
>>  drivers/dma/amd/ptdma/ptdma-dmaengine.c | 105 +++++++++++++++++++-----
>>  drivers/dma/amd/ptdma/ptdma.h           |   2 +
>>  4 files changed, 90 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/dma/amd/ae4dma/ae4dma.h b/drivers/dma/amd/ae4dma/ae4dma.h
>> index a63525792080..5f9dab5f05f4 100644
>> --- a/drivers/dma/amd/ae4dma/ae4dma.h
>> +++ b/drivers/dma/amd/ae4dma/ae4dma.h
>> @@ -24,6 +24,8 @@
>>  #define AE4_Q_BASE_H_OFF		0x1C
>>  #define AE4_Q_SZ			0x20
>>  
>> +#define AE4_DMA_VERSION			4
> Can it be driver data for the device?
> I am not seeing who sets this here?

This will be used in the functions below to
differentiate between PTDMA and AE4DMA functionality.
By default, without a version, it is PTDMA, and this
will be set in ae4_pci_probe in the next patch: 
https://lore.kernel.org/all/20240708144500.1523651-6-Basavaraj.Natikar@amd.com/#r.

>
>> +
>>  struct ae4_msix {
>>  	int msix_count;
>>  	struct msix_entry msix_entry[MAX_AE4_HW_QUEUES];
>> diff --git a/drivers/dma/amd/common/amd_dma.h b/drivers/dma/amd/common/amd_dma.h
>> index f9f396cd4371..396667e81e1a 100644
>> --- a/drivers/dma/amd/common/amd_dma.h
>> +++ b/drivers/dma/amd/common/amd_dma.h
>> @@ -21,6 +21,7 @@
>>  #include <linux/wait.h>
>>  
>>  #include "../ptdma/ptdma.h"
>> +#include "../ae4dma/ae4dma.h"
>>  #include "../../virt-dma.h"
>>  
>>  #endif
>> diff --git a/drivers/dma/amd/ptdma/ptdma-dmaengine.c b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
>> index 66ea10499643..90ca02fd5f8f 100644
>> --- a/drivers/dma/amd/ptdma/ptdma-dmaengine.c
>> +++ b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
>> @@ -43,7 +43,24 @@ static void pt_do_cleanup(struct virt_dma_desc *vd)
>>  	kmem_cache_free(pt->dma_desc_cache, desc);
>>  }
>>  
>> -static int pt_dma_start_desc(struct pt_dma_desc *desc)
>> +static struct pt_cmd_queue *pt_get_cmd_queue(struct pt_device *pt, struct pt_dma_chan *chan)
>> +{
>> +	struct ae4_cmd_queue *ae4cmd_q;
>> +	struct pt_cmd_queue *cmd_q;
>> +	struct ae4_device *ae4;
>> +
>> +	if (pt->ver == AE4_DMA_VERSION) {
>> +		ae4 = container_of(pt, struct ae4_device, pt);
>> +		ae4cmd_q = &ae4->ae4cmd_q[chan->id];
>> +		cmd_q = &ae4cmd_q->cmd_q;
>> +	} else {
>> +		cmd_q = &pt->cmd_q;
>> +	}
>> +
>> +	return cmd_q;
>> +}
>> +
>> +static int pt_dma_start_desc(struct pt_dma_desc *desc, struct pt_dma_chan *chan)
>>  {
>>  	struct pt_passthru_engine *pt_engine;
>>  	struct pt_device *pt;
>> @@ -54,7 +71,9 @@ static int pt_dma_start_desc(struct pt_dma_desc *desc)
>>  
>>  	pt_cmd = &desc->pt_cmd;
>>  	pt = pt_cmd->pt;
>> -	cmd_q = &pt->cmd_q;
>> +
>> +	cmd_q = pt_get_cmd_queue(pt, chan);
>> +
>>  	pt_engine = &pt_cmd->passthru;
>>  
>>  	pt->tdata.cmd = pt_cmd;
>> @@ -149,7 +168,7 @@ static void pt_cmd_callback(void *data, int err)
>>  		if (!desc)
>>  			break;
>>  
>> -		ret = pt_dma_start_desc(desc);
>> +		ret = pt_dma_start_desc(desc, chan);
>>  		if (!ret)
>>  			break;
>>  
>> @@ -184,7 +203,10 @@ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
>>  {
>>  	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
>>  	struct pt_passthru_engine *pt_engine;
>> +	struct pt_device *pt = chan->pt;
>> +	struct ae4_cmd_queue *ae4cmd_q;
>>  	struct pt_dma_desc *desc;
>> +	struct ae4_device *ae4;
>>  	struct pt_cmd *pt_cmd;
>>  
>>  	desc = pt_alloc_dma_desc(chan, flags);
>> @@ -192,7 +214,7 @@ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
>>  		return NULL;
>>  
>>  	pt_cmd = &desc->pt_cmd;
>> -	pt_cmd->pt = chan->pt;
>> +	pt_cmd->pt = pt;
>>  	pt_engine = &pt_cmd->passthru;
>>  	pt_cmd->engine = PT_ENGINE_PASSTHRU;
>>  	pt_engine->src_dma = src;
>> @@ -203,6 +225,14 @@ static struct pt_dma_desc *pt_create_desc(struct dma_chan *dma_chan,
>>  
>>  	desc->len = len;
>>  
>> +	if (pt->ver == AE4_DMA_VERSION) {
>> +		ae4 = container_of(pt, struct ae4_device, pt);
>> +		ae4cmd_q = &ae4->ae4cmd_q[chan->id];
>> +		mutex_lock(&ae4cmd_q->cmd_lock);
>> +		list_add_tail(&pt_cmd->entry, &ae4cmd_q->cmd);
>> +		mutex_unlock(&ae4cmd_q->cmd_lock);
>> +	}
>> +
>>  	return desc;
>>  }
>>  
>> @@ -260,8 +290,11 @@ static enum dma_status
>>  pt_tx_status(struct dma_chan *c, dma_cookie_t cookie,
>>  		struct dma_tx_state *txstate)
>>  {
>> -	struct pt_device *pt = to_pt_chan(c)->pt;
>> -	struct pt_cmd_queue *cmd_q = &pt->cmd_q;
>> +	struct pt_dma_chan *chan = to_pt_chan(c);
>> +	struct pt_device *pt = chan->pt;
>> +	struct pt_cmd_queue *cmd_q;
>> +
>> +	cmd_q = pt_get_cmd_queue(pt, chan);
>>  
>>  	pt_check_status_trans(pt, cmd_q);
>>  	return dma_cookie_status(c, cookie, txstate);
>> @@ -270,10 +303,13 @@ pt_tx_status(struct dma_chan *c, dma_cookie_t cookie,
>>  static int pt_pause(struct dma_chan *dma_chan)
>>  {
>>  	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
>> +	struct pt_device *pt = chan->pt;
>> +	struct pt_cmd_queue *cmd_q;
>>  	unsigned long flags;
>>  
>>  	spin_lock_irqsave(&chan->vc.lock, flags);
>> -	pt_stop_queue(&chan->pt->cmd_q);
>> +	cmd_q = pt_get_cmd_queue(pt, chan);
>> +	pt_stop_queue(cmd_q);
>>  	spin_unlock_irqrestore(&chan->vc.lock, flags);
>>  
>>  	return 0;
>> @@ -283,10 +319,13 @@ static int pt_resume(struct dma_chan *dma_chan)
>>  {
>>  	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
>>  	struct pt_dma_desc *desc = NULL;
>> +	struct pt_device *pt = chan->pt;
>> +	struct pt_cmd_queue *cmd_q;
>>  	unsigned long flags;
>>  
>>  	spin_lock_irqsave(&chan->vc.lock, flags);
>> -	pt_start_queue(&chan->pt->cmd_q);
>> +	cmd_q = pt_get_cmd_queue(pt, chan);
>> +	pt_start_queue(cmd_q);
>>  	desc = pt_next_dma_desc(chan);
>>  	spin_unlock_irqrestore(&chan->vc.lock, flags);
>>  
>> @@ -300,11 +339,17 @@ static int pt_resume(struct dma_chan *dma_chan)
>>  static int pt_terminate_all(struct dma_chan *dma_chan)
>>  {
>>  	struct pt_dma_chan *chan = to_pt_chan(dma_chan);
>> +	struct pt_device *pt = chan->pt;
>> +	struct pt_cmd_queue *cmd_q;
>>  	unsigned long flags;
>> -	struct pt_cmd_queue *cmd_q = &chan->pt->cmd_q;
>>  	LIST_HEAD(head);
>>  
>> -	iowrite32(SUPPORTED_INTERRUPTS, cmd_q->reg_control + 0x0010);
>> +	cmd_q = pt_get_cmd_queue(pt, chan);
>> +	if (pt->ver == AE4_DMA_VERSION)
>> +		pt_stop_queue(cmd_q);
>> +	else
>> +		iowrite32(SUPPORTED_INTERRUPTS, cmd_q->reg_control + 0x0010);
>> +
>>  	spin_lock_irqsave(&chan->vc.lock, flags);
>>  	vchan_get_all_descriptors(&chan->vc, &head);
>>  	spin_unlock_irqrestore(&chan->vc.lock, flags);
>> @@ -317,14 +362,24 @@ static int pt_terminate_all(struct dma_chan *dma_chan)
>>  
>>  int pt_dmaengine_register(struct pt_device *pt)
>>  {
>> -	struct pt_dma_chan *chan;
>>  	struct dma_device *dma_dev = &pt->dma_dev;
>> -	char *cmd_cache_name;
>> +	struct ae4_cmd_queue *ae4cmd_q = NULL;
>> +	struct ae4_device *ae4 = NULL;
>> +	struct pt_dma_chan *chan;
>>  	char *desc_cache_name;
>> -	int ret;
>> +	char *cmd_cache_name;
>> +	int ret, i;
>> +
>> +	if (pt->ver == AE4_DMA_VERSION)
>> +		ae4 = container_of(pt, struct ae4_device, pt);
>> +
>> +	if (ae4)
> in which case would ae4 be null but you are on version 4?

Yes correct, this will be used in the functions to differentiate
between PTDMA and AE4DMA functionality. By default, without a
version (i.e., null for PTDMA and 4 for AE4DMA).

Thanks,
--
Basavaraj

>
>> +		pt->pt_dma_chan = devm_kcalloc(pt->dev, ae4->cmd_q_count,
>> +					       sizeof(*pt->pt_dma_chan), GFP_KERNEL);
>> +	else
>> +		pt->pt_dma_chan = devm_kzalloc(pt->dev, sizeof(*pt->pt_dma_chan),
>> +					       GFP_KERNEL);
>>  
>> -	pt->pt_dma_chan = devm_kzalloc(pt->dev, sizeof(*pt->pt_dma_chan),
>> -				       GFP_KERNEL);
>>  	if (!pt->pt_dma_chan)
>>  		return -ENOMEM;
>>  
>> @@ -366,9 +421,6 @@ int pt_dmaengine_register(struct pt_device *pt)
>>  
>>  	INIT_LIST_HEAD(&dma_dev->channels);
>>  
>> -	chan = pt->pt_dma_chan;
>> -	chan->pt = pt;
>> -
>>  	/* Set base and prep routines */
>>  	dma_dev->device_free_chan_resources = pt_free_chan_resources;
>>  	dma_dev->device_prep_dma_memcpy = pt_prep_dma_memcpy;
>> @@ -380,8 +432,21 @@ int pt_dmaengine_register(struct pt_device *pt)
>>  	dma_dev->device_terminate_all = pt_terminate_all;
>>  	dma_dev->device_synchronize = pt_synchronize;
>>  
>> -	chan->vc.desc_free = pt_do_cleanup;
>> -	vchan_init(&chan->vc, dma_dev);
>> +	if (ae4) {
>> +		for (i = 0; i < ae4->cmd_q_count; i++) {
>> +			chan = pt->pt_dma_chan + i;
>> +			ae4cmd_q = &ae4->ae4cmd_q[i];
>> +			chan->id = ae4cmd_q->id;
>> +			chan->pt = pt;
>> +			chan->vc.desc_free = pt_do_cleanup;
>> +			vchan_init(&chan->vc, dma_dev);
>> +		}
>> +	} else {
>> +		chan = pt->pt_dma_chan;
>> +		chan->pt = pt;
>> +		chan->vc.desc_free = pt_do_cleanup;
>> +		vchan_init(&chan->vc, dma_dev);
>> +	}
>>  
>>  	ret = dma_async_device_register(dma_dev);
>>  	if (ret)
>> diff --git a/drivers/dma/amd/ptdma/ptdma.h b/drivers/dma/amd/ptdma/ptdma.h
>> index 2690a32fc7cb..a6990021fe2b 100644
>> --- a/drivers/dma/amd/ptdma/ptdma.h
>> +++ b/drivers/dma/amd/ptdma/ptdma.h
>> @@ -184,6 +184,7 @@ struct pt_dma_desc {
>>  struct pt_dma_chan {
>>  	struct virt_dma_chan vc;
>>  	struct pt_device *pt;
>> +	u32 id;
>>  };
>>  
>>  struct pt_cmd_queue {
>> @@ -262,6 +263,7 @@ struct pt_device {
>>  	unsigned long total_interrupts;
>>  
>>  	struct pt_tasklet_data tdata;
>> +	int ver;
>>  };
>>  
>>  /*
>> -- 
>> 2.25.1


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v5 3/7] dmaengine: ptdma: Move common functions to common code
  2024-08-28 13:18   ` Vinod Koul
@ 2024-09-02  7:07     ` Basavaraj Natikar
  0 siblings, 0 replies; 14+ messages in thread
From: Basavaraj Natikar @ 2024-09-02  7:07 UTC (permalink / raw)
  To: Vinod Koul, Basavaraj Natikar
  Cc: dmaengine, Raju.Rangoju, Frank.li, helgaas, pstanner


On 8/28/2024 6:48 PM, Vinod Koul wrote:
> On 08-07-24, 20:14, Basavaraj Natikar wrote:
>> To focus on reusability of ptdma code across modules extract common
>> functions into reusable modules.
>>
>> Reviewed-by: Raju Rangoju <Raju.Rangoju@amd.com>
>> Signed-off-by: Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>> ---
>>  MAINTAINERS                             | 1 +
>>  drivers/dma/amd/ptdma/ptdma-dev.c       | 2 +-
>>  drivers/dma/amd/ptdma/ptdma-dmaengine.c | 3 +--
>>  3 files changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/MAINTAINERS b/MAINTAINERS
>> index 539bf52410de..97d97ddf26f5 100644
>> --- a/MAINTAINERS
>> +++ b/MAINTAINERS
>> @@ -952,6 +952,7 @@ M:	Basavaraj Natikar <Basavaraj.Natikar@amd.com>
>>  L:	dmaengine@vger.kernel.org
>>  S:	Maintained
>>  F:	drivers/dma/amd/ae4dma/
>> +F:	drivers/dma/amd/common/
> I think this should be made amd/ to avoid churn when a file is
> added/dropped

Sure, I will remove the common directory and include everything
in the C files.

>
>>  
>>  AMD AXI W1 DRIVER
>>  M:	Kris Chaplin <kris.chaplin@amd.com>
>> diff --git a/drivers/dma/amd/ptdma/ptdma-dev.c b/drivers/dma/amd/ptdma/ptdma-dev.c
>> index a2bf13ff18b6..2bdf418fe556 100644
>> --- a/drivers/dma/amd/ptdma/ptdma-dev.c
>> +++ b/drivers/dma/amd/ptdma/ptdma-dev.c
>> @@ -17,7 +17,7 @@
>>  #include <linux/module.h>
>>  #include <linux/pci.h>
>>  
>> -#include "ptdma.h"
>> +#include "../common/amd_dma.h"
>>  
>>  /* Human-readable error strings */
>>  static char *pt_error_codes[] = {
>> diff --git a/drivers/dma/amd/ptdma/ptdma-dmaengine.c b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
>> index a2e7c2cec15e..66ea10499643 100644
>> --- a/drivers/dma/amd/ptdma/ptdma-dmaengine.c
>> +++ b/drivers/dma/amd/ptdma/ptdma-dmaengine.c
>> @@ -9,8 +9,7 @@
>>   * Author: Gary R Hook <gary.hook@amd.com>
>>   */
>>  
>> -#include "ptdma.h"
>> -#include "../../dmaengine.h"
>> +#include "../common/amd_dma.h"
> So the driver was including old headers and now new, but I dont see any
> functions being moved? Does each patch compile...?

Yes, this compiles. As mentioned in the previous comment, I will remove
the common directory and include everything in the C files.


Thanks,
--
Basavaraj

>


^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-09-02  7:07 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-08 14:44 [PATCH v5 0/7] Add support of AMD AE4DMA DMA Engine Basavaraj Natikar
2024-07-08 14:44 ` [PATCH v5 1/7] dmaengine: Move AMD DMA driver to separate directory Basavaraj Natikar
2024-07-08 14:44 ` [PATCH v5 2/7] dmaengine: ae4dma: Add AMD ae4dma controller driver Basavaraj Natikar
2024-08-28 13:23   ` Vinod Koul
2024-09-02  6:47     ` Basavaraj Natikar
2024-07-08 14:44 ` [PATCH v5 3/7] dmaengine: ptdma: Move common functions to common code Basavaraj Natikar
2024-08-28 13:18   ` Vinod Koul
2024-09-02  7:07     ` Basavaraj Natikar
2024-07-08 14:44 ` [PATCH v5 4/7] dmaengine: ptdma: Extend ptdma to support multi-channel and version Basavaraj Natikar
2024-08-28 17:46   ` Vinod Koul
2024-09-02  7:01     ` Basavaraj Natikar
2024-07-08 14:44 ` [PATCH v5 5/7] dmaengine: ae4dma: Register AE4DMA using pt_dmaengine_register Basavaraj Natikar
2024-07-08 14:44 ` [PATCH v5 6/7] dmaengine: ptdma: Extend ptdma-debugfs to support multi-queue Basavaraj Natikar
2024-07-08 14:45 ` [PATCH v5 7/7] dmaengine: ae4dma: Register debugfs using ptdma_debugfs_setup Basavaraj Natikar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox