linux-pci.vger.kernel.org archive mirror
* [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
@ 2025-10-23  7:18 Koichiro Den
  2025-10-23  7:18 ` [RFC PATCH 01/25] PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access Koichiro Den
                   ` (26 more replies)
  0 siblings, 27 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:18 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Hi all,

Motivation
==========

On Renesas R-Car S4, the PCIe Endpoint is DesignWare-based and the platform
does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
result, MSI writes cannot be forwarded from the Root Complex (RC) to the
Endpoint (EP), even if we added an implementation that creates an MSI
domain for the vNTB device so that the existing drivers/ntb/msi.c could be
used. NTB traffic must therefore fall back to doorbells (polling). In
addition, BAR resources are scarce, which makes it difficult to dedicate a
BAR solely to an NTB/MSI window.

This RFC introduces a generic interrupt backend for NTB. The existing MSI
path is converted to a backend, and a new DW eDMA test-interrupt backend
provides an RC-to-EP interrupt fallback when MSI cannot be used. In
parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
The vNTB EPF and ntb_transport are taught about offsets.

Backend selection is automatic: if MSI is available we use the MSI backend.
Otherwise, if enabled, the DW eDMA backend is used. If neither is
available, we continue to use doorbells. Existing systems remain unaffected
unless use_intr=1 is set.
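
The selection order described above can be sketched in plain userspace C as
follows. This is only an illustration: enum intr_kind, struct intr_caps and
pick_backend() are hypothetical names, not the identifiers used in the
series, and the three capability flags stand in for whatever checks the real
backends perform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the series' real types. */
enum intr_kind { INTR_DOORBELL, INTR_MSI, INTR_DW_EDMA };

struct intr_caps {
	bool use_intr;        /* module parameter, off by default */
	bool msi_usable;      /* MSI writes can reach the endpoint */
	bool dw_edma_enabled; /* DW eDMA backend built and usable */
};

/* Pick a backend in the order stated in the cover letter. */
static enum intr_kind pick_backend(const struct intr_caps *caps)
{
	assert(caps != NULL);

	if (!caps->use_intr)
		return INTR_DOORBELL;	/* existing systems stay unchanged */
	if (caps->msi_usable)
		return INTR_MSI;
	if (caps->dw_edma_enabled)
		return INTR_DW_EDMA;
	return INTR_DOORBELL;		/* nothing else available */
}
```

On a platform like the one described in the motivation (MSI unusable, eDMA
backend enabled, use_intr=1), this sketch would select the DW eDMA backend.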

Example layout (R-Car S4):

  BAR0: Config/Spad
  BAR2 [0x00000-0xF0000]: MW1 (data)
  BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
  BAR4: Doorbell

  # The corresponding configfs settings (see Patch #25):
  echo 0xF0000 > ./mw1
  echo 0x8000  > ./mw2
  echo 0xF0000 > ./mw2_offset
  echo 2       > ./mw1_bar
  echo 2       > ./mw2_bar
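
The BAR size implied by such a layout can be sketched as below. The sketch
mirrors the sizing rule in patch 02 (take the largest offset + size among the
windows sharing a BAR, use at least 4 KiB, round up to a power of two).
struct mw_cfg, roundup_pow2() and bar_size_for() are hypothetical userspace
stand-ins, not kernel helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K 0x1000ULL

/* Hypothetical per-window settings, as written via mwN / mwN_offset. */
struct mw_cfg {
	uint64_t offset;
	uint64_t size;
};

/* Smallest power of two >= v; v must be non-zero and at most 2^63. */
static uint64_t roundup_pow2(uint64_t v)
{
	uint64_t p = 1;

	assert(v != 0 && v <= (1ULL << 63));
	while (p < v)
		p <<= 1;
	return p;
}

/* Size needed for one BAR that hosts all n windows in mws[]. */
static uint64_t bar_size_for(const struct mw_cfg *mws, size_t n)
{
	uint64_t end = 0;

	for (size_t i = 0; i < n; i++)
		if (mws[i].offset + mws[i].size > end)
			end = mws[i].offset + mws[i].size;

	return end < SZ_4K ? SZ_4K : roundup_pow2(end);
}
```

For the R-Car S4 layout above, the two windows end at 0xF8000, so the sketch
yields a BAR2 size of 0x100000 (1 MiB).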

Summary of changes
==================

* NTB core/transport
  - Introduce struct ntb_intr_backend and convert MSI to the new backend.
  - Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
  - Rename module parameter to use_intr (keep use_msi as deprecated alias).
  - Support offsetted partial MWs in ntb_transport.
  - Hardening for peer-reported interrupt values and minor cleanups.

* PCI Endpoint core and DWC EP controller
  - Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
  - Implement inbound mapping for DesignWare EP (Address Match mode), with
    tracking of multiple inbound iATU entries per BAR and proper teardown.

* EPF vNTB
  - Add mwN_offset configfs attributes and propagate offsets to inbound maps.
  - Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
    set_bar().
  - Provide .get_pci_epc() so backends can locate the common eDMA instance.

* DW eDMA
  - Add self-interrupt registration and expose test-IRQ register offsets.
  - Provide dw_edma_find_by_child().

* Renesas R-Car
  - Place MW2 in BAR2 to host the interrupt window alongside the data MW.

* Documentation

Patch layout
============

* Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
* Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
* Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
* Patches 18-19 : NTB/EPF glue (.get_pci_epc())
* Patch 20      : Module param name change (use_msi->use_intr, alias preserved)
* Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
* Patch 24      : R-Car: add MW2 in BAR2 for interrupts
* Patch 25      : Documentation updates

Tested on
=========

* Renesas R-Car S4 Spider
* Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)

Performance measurement
=======================

Even without the DMA acceleration patches for R-Car S4 (which I keep
separate from this RFC patch series), enabling RC-to-EP interrupts reduces
the average sockperf ping-pong latency on R-Car S4 from about 5996 usec to
about 128 usec (roughly 47x lower):

* Before this patch series (NB. use_msi doesn't work on R-Car S4)

  # Server: sockperf server -i 0.0.0.0
  # Client: sockperf ping-pong -i $SERVER_IP
  ========= Printing statistics for Server No: 0
  [Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
  ====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
        siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
  # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
  Summary: Latency is 5995.680 usec
  Total 45 observations; each percentile contains 0.45 observations
  ---> <MAX> observation = 6121.137
  ---> percentile 99.999 = 6121.137
  ---> percentile 99.990 = 6121.137
  ---> percentile 99.900 = 6121.137
  ---> percentile 99.000 = 6121.137
  ---> percentile 90.000 = 6099.178
  ---> percentile 75.000 = 6054.418
  ---> percentile 50.000 = 5993.040
  ---> percentile 25.000 = 5935.021
  ---> <MIN> observation = 5883.362

* With this series (use_intr=1)

  # Server: sockperf server -i 0.0.0.0
  # Client: sockperf ping-pong -i $SERVER_IP
  ========= Printing statistics for Server No: 0
  [Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
  ====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
        siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
  # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
  Summary: Latency is 127.677 usec
  Total 2145 observations; each percentile contains 21.45 observations
  ---> <MAX> observation =  446.691
  ---> percentile 99.999 =  446.691
  ---> percentile 99.990 =  446.691
  ---> percentile 99.900 =  291.234
  ---> percentile 99.000 =  221.515
  ---> percentile 90.000 =  149.277
  ---> percentile 75.000 =  124.497
  ---> percentile 50.000 =  121.137
  ---> percentile 25.000 =  119.037
  ---> <MIN> observation =  113.637

Feedback is welcome on both the approach and on how the series should be
split and routed.

(The series spans NTB, PCI EP/DWC and dmaengine/dw-edma. I'm happy to split
later if preferred.)

Thanks for reviewing.


Koichiro Den (25):
  PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
    access
  PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
  NTB: epf: Handle mwN_offset for inbound MW regions
  PCI: endpoint: Add inbound mapping ops to EPC core
  PCI: dwc: ep: Implement EPC inbound mapping support
  PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
  NTB: Add offset parameter to MW translation APIs
  PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
    present
  NTB: ntb_transport: Support offsetted partial memory windows
  NTB/msi: Support offsetted partial memory window for MSI
  NTB/msi: Do not force MW to its maximum possible size
  NTB: ntb_transport: Stricter checks for peer-reported interrupt values
  NTB/msi: Skip mw_set_trans() if already configured
  NTB/msi: Add a inner loop for PCI-MSI cases
  dmaengine: dw-edma: Add self-interrupt registration API
  dmaengine: dw-edma: Expose self-IRQ register offsets
  dmaengine: dw-edma: Add dw_edma_find_by_child() helper
  NTB: core: Add .get_pci_epc() to ntb_dev_ops
  NTB: epf: vntb: Implement .get_pci_epc() callback
  NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
  NTB: Introduce generic interrupt backend abstraction and convert MSI
  NTB: ntb_transport: Rename MSI symbols to generic interrupt form
  NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
  NTB: epf: Add MW2 for interrupt use on Renesas R-Car
  Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
    usage

 Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-
 drivers/dma/dw-edma/dw-edma-core.c            | 109 ++++++++
 drivers/dma/dw-edma/dw-edma-core.h            |  18 ++
 drivers/dma/dw-edma/dw-edma-v0-core.c         |  15 ++
 drivers/ntb/Kconfig                           |  15 ++
 drivers/ntb/Makefile                          |   6 +-
 drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-
 drivers/ntb/hw/epf/ntb_hw_epf.c               |  46 ++--
 drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-
 drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-
 drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-
 drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-
 drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-
 drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-
 drivers/ntb/intr_common.c                     |  61 +++++
 drivers/ntb/intr_dw_edma.c                    | 253 ++++++++++++++++++
 drivers/ntb/msi.c                             | 186 +++++++------
 drivers/ntb/ntb_transport.c                   | 155 ++++++-----
 drivers/ntb/test/ntb_msi_test.c               |  26 +-
 drivers/ntb/test/ntb_perf.c                   |   4 +-
 drivers/ntb/test/ntb_tool.c                   |   6 +-
 .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++--
 drivers/pci/controller/dwc/pcie-designware.c  |   1 +
 drivers/pci/controller/dwc/pcie-designware.h  |   2 +
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
 drivers/pci/endpoint/pci-epc-core.c           |  44 +++
 include/linux/dma/edma.h                      |  31 +++
 include/linux/ntb.h                           | 134 +++++++---
 include/linux/pci-epc.h                       |  11 +
 29 files changed, 1310 insertions(+), 300 deletions(-)
 create mode 100644 drivers/ntb/intr_common.c
 create mode 100644 drivers/ntb/intr_dw_edma.c

-- 
2.48.1



* [RFC PATCH 01/25] PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
@ 2025-10-23  7:18 ` Koichiro Den
  2025-10-24  0:06   ` Frank Li
  2025-10-23  7:18 ` [RFC PATCH 02/25] PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes Koichiro Den
                   ` (25 subsequent siblings)
  26 siblings, 1 reply; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:18 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Follow common kernel idioms for indices derived from configfs attributes
and suppress Smatch warnings:

  epf_ntb_mw1_show() warn: potential spectre issue 'ntb->mws_size' [r]
  epf_ntb_mw1_store() warn: potential spectre issue 'ntb->mws_size' [w]

No functional changes.
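
For readers unfamiliar with the idiom, the arithmetic behind a
bounds-clamping index mask can be sketched in userspace C as below. This
follows the shape of the kernel's generic mask fallback, but it is only an
illustration of the arithmetic: index_mask_nospec() and clamp_index_nospec()
are hypothetical names, the sketch assumes an arithmetic right shift on
signed long (as GCC and Clang provide), and it is not by itself a Spectre
mitigation.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG (sizeof(long) * CHAR_BIT)

/*
 * All ones when index < size, zero otherwise, computed without a branch.
 * Assumes size is non-zero and at most LONG_MAX.
 */
static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
{
	long v = ~(long)(index | (size - 1UL - index));

	return (unsigned long)(v >> (BITS_PER_LONG - 1));
}

/* In-range indices pass through; out-of-range indices collapse to 0. */
static unsigned long clamp_index_nospec(unsigned long index, unsigned long size)
{
	assert(size > 0 && size <= (unsigned long)LONG_MAX);
	return index & index_mask_nospec(index, size);
}
```

The explicit range check in the patch still decides whether to return
-EINVAL; the clamp only constrains what a mispredicted path could index.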

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 23 +++++++++++--------
 1 file changed, 14 insertions(+), 9 deletions(-)

diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
index 83e9ab10f9c4..55307cd613c9 100644
--- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
+++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
@@ -876,17 +876,19 @@ static ssize_t epf_ntb_##_name##_show(struct config_item *item,		\
 	struct config_group *group = to_config_group(item);		\
 	struct epf_ntb *ntb = to_epf_ntb(group);			\
 	struct device *dev = &ntb->epf->dev;				\
-	int win_no;							\
+	int win_no, idx;						\
 									\
 	if (sscanf(#_name, "mw%d", &win_no) != 1)			\
 		return -EINVAL;						\
 									\
-	if (win_no <= 0 || win_no > ntb->num_mws) {			\
-		dev_err(dev, "Invalid num_nws: %d value\n", ntb->num_mws); \
+	idx = win_no - 1;						\
+	if (idx < 0 || idx >= ntb->num_mws) {				\
+		dev_err(dev, "MW%d out of range (num_mws=%d)\n",	\
+			win_no, ntb->num_mws);				\
 		return -EINVAL;						\
 	}								\
-									\
-	return sprintf(page, "%lld\n", ntb->mws_size[win_no - 1]);	\
+	idx = array_index_nospec(idx, ntb->num_mws);			\
+	return sprintf(page, "%lld\n", ntb->mws_size[idx]);		\
 }
 
 #define EPF_NTB_MW_W(_name)						\
@@ -896,7 +898,7 @@ static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
 	struct config_group *group = to_config_group(item);		\
 	struct epf_ntb *ntb = to_epf_ntb(group);			\
 	struct device *dev = &ntb->epf->dev;				\
-	int win_no;							\
+	int win_no, idx;						\
 	u64 val;							\
 	int ret;							\
 									\
@@ -907,12 +909,15 @@ static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
 	if (sscanf(#_name, "mw%d", &win_no) != 1)			\
 		return -EINVAL;						\
 									\
-	if (win_no <= 0 || win_no > ntb->num_mws) {			\
-		dev_err(dev, "Invalid num_nws: %d value\n", ntb->num_mws); \
+	idx = win_no - 1;						\
+	if (idx < 0 || idx >= ntb->num_mws) {				\
+		dev_err(dev, "MW%d out of range (num_mws=%d)\n",	\
+			win_no, ntb->num_mws);				\
 		return -EINVAL;						\
 	}								\
 									\
-	ntb->mws_size[win_no - 1] = val;				\
+	idx = array_index_nospec(idx, ntb->num_mws);			\
+	ntb->mws_size[idx] = val;					\
 									\
 	return len;							\
 }
-- 
2.48.1



* [RFC PATCH 02/25] PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
  2025-10-23  7:18 ` [RFC PATCH 01/25] PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access Koichiro Den
@ 2025-10-23  7:18 ` Koichiro Den
  2025-10-23  7:18 ` [RFC PATCH 03/25] NTB: epf: Handle mwN_offset for inbound MW regions Koichiro Den
                   ` (24 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:18 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Introduce new mwN_offset configfs attributes to specify memory window
offsets. This enables mapping multiple windows into a single BAR at
arbitrary offsets, improving layout flexibility.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 133 ++++++++++++++++--
 1 file changed, 120 insertions(+), 13 deletions(-)

diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
index 55307cd613c9..6953abb2987d 100644
--- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
+++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
@@ -38,6 +38,7 @@
 
 #include <linux/delay.h>
 #include <linux/io.h>
+#include <linux/log2.h>
 #include <linux/module.h>
 #include <linux/slab.h>
 
@@ -109,7 +110,8 @@ struct epf_ntb_ctrl {
 	u64 addr;
 	u64 size;
 	u32 num_mws;
-	u32 reserved;
+	u32 mw_offset[MAX_MW];
+	u32 mw_size[MAX_MW];
 	u32 spad_offset;
 	u32 spad_count;
 	u32 db_entry_size;
@@ -126,6 +128,7 @@ struct epf_ntb {
 	u32 db_count;
 	u32 spad_count;
 	u64 mws_size[MAX_MW];
+	u64 mws_offset[MAX_MW];
 	u64 db;
 	u32 vbus_number;
 	u16 vntb_pid;
@@ -441,6 +444,8 @@ static int epf_ntb_config_spad_bar_alloc(struct epf_ntb *ntb)
 
 	ctrl->spad_count = spad_count;
 	ctrl->num_mws = ntb->num_mws;
+	memset(ctrl->mw_offset, 0, sizeof(ctrl->mw_offset));
+	memset(ctrl->mw_size, 0, sizeof(ctrl->mw_size));
 	ntb->spad_size = spad_size;
 
 	ctrl->db_entry_size = sizeof(u32);
@@ -570,15 +575,31 @@ static void epf_ntb_db_bar_clear(struct epf_ntb *ntb)
  */
 static int epf_ntb_mw_bar_init(struct epf_ntb *ntb)
 {
+	struct device *dev = &ntb->epf->dev;
+	u64 bar_ends[BAR_5 + 1] = { 0 };
+	unsigned long bars_used = 0;
+	enum pci_barno barno;
+	u64 off, size, end;
 	int ret = 0;
 	int i;
-	u64 size;
-	enum pci_barno barno;
-	struct device *dev = &ntb->epf->dev;
 
 	for (i = 0; i < ntb->num_mws; i++) {
-		size = ntb->mws_size[i];
 		barno = ntb->epf_ntb_bar[BAR_MW1 + i];
+		off = ntb->mws_offset[i];
+		size = ntb->mws_size[i];
+		end = off + size;
+		if (end > bar_ends[barno])
+			bar_ends[barno] = end;
+		bars_used |= BIT(barno);
+	}
+
+	for (barno = BAR_0; barno <= BAR_5; barno++) {
+		if (!(bars_used & BIT(barno)))
+			continue;
+		if (bar_ends[barno] < SZ_4K)
+			size = SZ_4K;
+		else
+			size = roundup_pow_of_two(bar_ends[barno]);
 
 		ntb->epf->bar[barno].barno = barno;
 		ntb->epf->bar[barno].size = size;
@@ -594,8 +615,12 @@ static int epf_ntb_mw_bar_init(struct epf_ntb *ntb)
 				      &ntb->epf->bar[barno]);
 		if (ret) {
 			dev_err(dev, "MW set failed\n");
-			goto err_alloc_mem;
+			goto err_set_bar;
 		}
+	}
+
+	for (i = 0; i < ntb->num_mws; i++) {
+		size = ntb->mws_size[i];
 
 		/* Allocate EPC outbound memory windows to vpci vntb device */
 		ntb->vpci_mw_addr[i] = pci_epc_mem_alloc_addr(ntb->epf->epc,
@@ -604,19 +629,31 @@ static int epf_ntb_mw_bar_init(struct epf_ntb *ntb)
 		if (!ntb->vpci_mw_addr[i]) {
 			ret = -ENOMEM;
 			dev_err(dev, "Failed to allocate source address\n");
-			goto err_set_bar;
+			goto err_alloc_mem;
 		}
 	}
 
+	for (i = 0; i < ntb->num_mws; i++) {
+		ntb->reg->mw_offset[i] = (u32)ntb->mws_offset[i];
+		ntb->reg->mw_size[i] = (u32)ntb->mws_size[i];
+	}
+
 	return ret;
 
-err_set_bar:
-	pci_epc_clear_bar(ntb->epf->epc,
-			  ntb->epf->func_no,
-			  ntb->epf->vfunc_no,
-			  &ntb->epf->bar[barno]);
 err_alloc_mem:
-	epf_ntb_mw_bar_clear(ntb, i);
+	while (--i >= 0)
+		pci_epc_mem_free_addr(ntb->epf->epc,
+				      ntb->vpci_mw_phy[i],
+				      ntb->vpci_mw_addr[i],
+				      ntb->mws_size[i]);
+err_set_bar:
+	while (--barno >= BAR_0)
+		if (bars_used & BIT(barno))
+			pci_epc_clear_bar(ntb->epf->epc,
+					  ntb->epf->func_no,
+					  ntb->epf->vfunc_no,
+					  &ntb->epf->bar[barno]);
+
 	return ret;
 }
 
@@ -922,6 +959,60 @@ static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
 	return len;							\
 }
 
+#define EPF_NTB_MW_OFF_R(_name)						\
+static ssize_t epf_ntb_##_name##_show(struct config_item *item,		\
+				      char *page)			\
+{									\
+	struct config_group *group = to_config_group(item);		\
+	struct epf_ntb *ntb = to_epf_ntb(group);			\
+	struct device *dev = &ntb->epf->dev;				\
+	int win_no, idx;						\
+									\
+	if (sscanf(#_name, "mw%d_offset", &win_no) != 1)		\
+		return -EINVAL;						\
+									\
+	idx = win_no - 1;						\
+	if (idx < 0 || idx >= ntb->num_mws) {				\
+		dev_err(dev, "MW%d out of range (num_mws=%d)\n",	\
+			win_no, ntb->num_mws);				\
+		return -EINVAL;						\
+	}								\
+									\
+	idx = array_index_nospec(idx, ntb->num_mws);			\
+	return sprintf(page, "%lld\n", ntb->mws_offset[idx]);		\
+}
+
+#define EPF_NTB_MW_OFF_W(_name)						\
+static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
+				       const char *page, size_t len)	\
+{									\
+	struct config_group *group = to_config_group(item);		\
+	struct epf_ntb *ntb = to_epf_ntb(group);			\
+	struct device *dev = &ntb->epf->dev;				\
+	int win_no, idx;						\
+	u64 val;							\
+	int ret;							\
+									\
+	ret = kstrtou64(page, 0, &val);					\
+	if (ret)							\
+		return ret;						\
+									\
+	if (sscanf(#_name, "mw%d_offset", &win_no) != 1)		\
+		return -EINVAL;						\
+									\
+	idx = win_no - 1;						\
+	if (idx < 0 || idx >= ntb->num_mws) {				\
+		dev_err(dev, "MW%d out of range (num_mws=%d)\n",	\
+			win_no, ntb->num_mws);				\
+		return -EINVAL;						\
+	}								\
+									\
+	idx = array_index_nospec(idx, ntb->num_mws);			\
+	ntb->mws_offset[idx] = val;					\
+									\
+	return len;							\
+}
+
 #define EPF_NTB_BAR_R(_name, _id)					\
 	static ssize_t epf_ntb_##_name##_show(struct config_item *item,	\
 					      char *page)		\
@@ -992,6 +1083,14 @@ EPF_NTB_MW_R(mw3)
 EPF_NTB_MW_W(mw3)
 EPF_NTB_MW_R(mw4)
 EPF_NTB_MW_W(mw4)
+EPF_NTB_MW_OFF_R(mw1_offset)
+EPF_NTB_MW_OFF_W(mw1_offset)
+EPF_NTB_MW_OFF_R(mw2_offset)
+EPF_NTB_MW_OFF_W(mw2_offset)
+EPF_NTB_MW_OFF_R(mw3_offset)
+EPF_NTB_MW_OFF_W(mw3_offset)
+EPF_NTB_MW_OFF_R(mw4_offset)
+EPF_NTB_MW_OFF_W(mw4_offset)
 EPF_NTB_BAR_R(ctrl_bar, BAR_CONFIG)
 EPF_NTB_BAR_W(ctrl_bar, BAR_CONFIG)
 EPF_NTB_BAR_R(db_bar, BAR_DB)
@@ -1012,6 +1111,10 @@ CONFIGFS_ATTR(epf_ntb_, mw1);
 CONFIGFS_ATTR(epf_ntb_, mw2);
 CONFIGFS_ATTR(epf_ntb_, mw3);
 CONFIGFS_ATTR(epf_ntb_, mw4);
+CONFIGFS_ATTR(epf_ntb_, mw1_offset);
+CONFIGFS_ATTR(epf_ntb_, mw2_offset);
+CONFIGFS_ATTR(epf_ntb_, mw3_offset);
+CONFIGFS_ATTR(epf_ntb_, mw4_offset);
 CONFIGFS_ATTR(epf_ntb_, vbus_number);
 CONFIGFS_ATTR(epf_ntb_, vntb_pid);
 CONFIGFS_ATTR(epf_ntb_, vntb_vid);
@@ -1030,6 +1133,10 @@ static struct configfs_attribute *epf_ntb_attrs[] = {
 	&epf_ntb_attr_mw2,
 	&epf_ntb_attr_mw3,
 	&epf_ntb_attr_mw4,
+	&epf_ntb_attr_mw1_offset,
+	&epf_ntb_attr_mw2_offset,
+	&epf_ntb_attr_mw3_offset,
+	&epf_ntb_attr_mw4_offset,
 	&epf_ntb_attr_vbus_number,
 	&epf_ntb_attr_vntb_pid,
 	&epf_ntb_attr_vntb_vid,
-- 
2.48.1



* [RFC PATCH 03/25] NTB: epf: Handle mwN_offset for inbound MW regions
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
  2025-10-23  7:18 ` [RFC PATCH 01/25] PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access Koichiro Den
  2025-10-23  7:18 ` [RFC PATCH 02/25] PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes Koichiro Den
@ 2025-10-23  7:18 ` Koichiro Den
  2025-10-23  7:18 ` [RFC PATCH 04/25] PCI: endpoint: Add inbound mapping ops to EPC core Koichiro Den
                   ` (23 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:18 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Add new fields to the common control register region that convey both the
offset and the size of each memory window (MW), and use them so that the
driver can correctly handle flexible MW layouts and partial BAR mappings.
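
The register layout and the size computation can be sketched in userspace C
as below. The offsets are taken from the diff; MAX_MW = 4 is an assumption
based on the four mwN attributes, and mw_visible_size() is a hypothetical
helper. Unlike the patch, the sketch also clamps to zero when a non-zero
reported size is paired with an offset beyond the end of the BAR.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_MW			4	/* assumption: four mwN attributes */
#define NTB_EPF_MW_OFFSET(n)	(0x24 + (n) * 4)
#define NTB_EPF_MW_SIZE(n)	(0x34 + (n) * 4)
#define NTB_EPF_SPAD_OFFSET	0x44

/* The two per-MW arrays are contiguous and end where SPAD_OFFSET begins. */
static_assert(NTB_EPF_MW_OFFSET(MAX_MW) == NTB_EPF_MW_SIZE(0),
	      "MW size array must follow the MW offset array");
static_assert(NTB_EPF_MW_SIZE(MAX_MW) == NTB_EPF_SPAD_OFFSET,
	      "SPAD offset must follow the MW size array");

/* Usable size of a window that starts at 'offset' inside a BAR. */
static uint64_t mw_visible_size(uint64_t bar_sz, uint32_t offset, uint32_t sz)
{
	uint64_t room = bar_sz > offset ? bar_sz - offset : 0;

	if (sz == 0)		/* no size reported: use the rest of the BAR */
		return room;
	return sz < room ? sz : room;
}
```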

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/hw/epf/ntb_hw_epf.c | 27 ++++++++++++++++-----------
 1 file changed, 16 insertions(+), 11 deletions(-)

diff --git a/drivers/ntb/hw/epf/ntb_hw_epf.c b/drivers/ntb/hw/epf/ntb_hw_epf.c
index d3ecf25a5162..91d3f8e05807 100644
--- a/drivers/ntb/hw/epf/ntb_hw_epf.c
+++ b/drivers/ntb/hw/epf/ntb_hw_epf.c
@@ -36,12 +36,13 @@
 #define NTB_EPF_LOWER_SIZE	0x18
 #define NTB_EPF_UPPER_SIZE	0x1C
 #define NTB_EPF_MW_COUNT	0x20
-#define NTB_EPF_MW1_OFFSET	0x24
-#define NTB_EPF_SPAD_OFFSET	0x28
-#define NTB_EPF_SPAD_COUNT	0x2C
-#define NTB_EPF_DB_ENTRY_SIZE	0x30
-#define NTB_EPF_DB_DATA(n)	(0x34 + (n) * 4)
-#define NTB_EPF_DB_OFFSET(n)	(0xB4 + (n) * 4)
+#define NTB_EPF_MW_OFFSET(n)	(0x24 + (n) * 4)
+#define NTB_EPF_MW_SIZE(n)	(0x34 + (n) * 4)
+#define NTB_EPF_SPAD_OFFSET	0x44
+#define NTB_EPF_SPAD_COUNT	0x48
+#define NTB_EPF_DB_ENTRY_SIZE	0x4C
+#define NTB_EPF_DB_DATA(n)	(0x50 + (n) * 4)
+#define NTB_EPF_DB_OFFSET(n)	(0xD0 + (n) * 4)
 
 #define NTB_EPF_MIN_DB_COUNT	3
 #define NTB_EPF_MAX_DB_COUNT	31
@@ -451,11 +452,12 @@ static int ntb_epf_peer_mw_get_addr(struct ntb_dev *ntb, int idx,
 				    phys_addr_t *base, resource_size_t *size)
 {
 	struct ntb_epf_dev *ndev = ntb_ndev(ntb);
-	u32 offset = 0;
+	resource_size_t bar_sz;
+	u32 offset, sz;
 	int bar;
 
-	if (idx == 0)
-		offset = readl(ndev->ctrl_reg + NTB_EPF_MW1_OFFSET);
+	offset = readl(ndev->ctrl_reg + NTB_EPF_MW_OFFSET(idx));
+	sz = readl(ndev->ctrl_reg + NTB_EPF_MW_SIZE(idx));
 
 	bar = ntb_epf_mw_to_bar(ndev, idx);
 	if (bar < 0)
@@ -464,8 +466,11 @@ static int ntb_epf_peer_mw_get_addr(struct ntb_dev *ntb, int idx,
 	if (base)
 		*base = pci_resource_start(ndev->ntb.pdev, bar) + offset;
 
-	if (size)
-		*size = pci_resource_len(ndev->ntb.pdev, bar) - offset;
+	if (size) {
+		bar_sz = pci_resource_len(ndev->ntb.pdev, bar);
+		*size = sz ? min_t(resource_size_t, sz, bar_sz - offset)
+			   : (bar_sz > offset ? bar_sz - offset : 0);
+	}
 
 	return 0;
 }
-- 
2.48.1



* [RFC PATCH 04/25] PCI: endpoint: Add inbound mapping ops to EPC core
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (2 preceding siblings ...)
  2025-10-23  7:18 ` [RFC PATCH 03/25] NTB: epf: Handle mwN_offset for inbound MW regions Koichiro Den
@ 2025-10-23  7:18 ` Koichiro Den
  2025-10-23  7:18 ` [RFC PATCH 05/25] PCI: dwc: ep: Implement EPC inbound mapping support Koichiro Den
                   ` (22 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:18 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Add new EPC ops map_inbound() and unmap_inbound() for mapping a subrange
of a BAR into CPU space. These will be implemented by controller drivers
such as DesignWare.
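
The optional-op pattern can be sketched in userspace C as below: the core
wrapper reports -EOPNOTSUPP when a controller does not provide the op, and a
caller may then fall back to whole-BAR mapping. struct epc, struct epc_ops,
map_window() and the stub callbacks are hypothetical, trimmed-down stand-ins
for the real EPC types, and the signatures are simplified.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct epc;

/* Hypothetical, trimmed-down ops table (stand-in for struct pci_epc_ops). */
struct epc_ops {
	int (*set_bar)(struct epc *epc, int barno);
	int (*map_inbound)(struct epc *epc, int barno, uint64_t offset);
};

struct epc {
	const struct epc_ops *ops;
};

/* Core wrapper: report "unsupported" when the controller lacks the op. */
static int epc_map_inbound(struct epc *epc, int barno, uint64_t offset)
{
	if (!epc || !epc->ops || !epc->ops->map_inbound)
		return -EOPNOTSUPP;

	return epc->ops->map_inbound(epc, barno, offset);
}

/* Caller side: prefer the subrange op, otherwise map the whole BAR. */
static int map_window(struct epc *epc, int barno, uint64_t offset)
{
	int ret;

	assert(epc && epc->ops && epc->ops->set_bar);

	ret = epc_map_inbound(epc, barno, offset);
	if (ret == -EOPNOTSUPP)
		ret = epc->ops->set_bar(epc, barno);
	return ret;
}

/* Stub callbacks so the sketch can be exercised without hardware. */
static int stub_set_bar(struct epc *epc, int barno)
{
	(void)epc;
	return (barno >= 0 && barno <= 5) ? 0 : -EINVAL;
}

static int stub_map_inbound(struct epc *epc, int barno, uint64_t offset)
{
	(void)epc;
	(void)offset;
	return (barno >= 0 && barno <= 5) ? 0 : -EINVAL;
}
```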

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/pci/endpoint/pci-epc-core.c | 44 +++++++++++++++++++++++++++++
 include/linux/pci-epc.h             | 11 ++++++++
 2 files changed, 55 insertions(+)

diff --git a/drivers/pci/endpoint/pci-epc-core.c b/drivers/pci/endpoint/pci-epc-core.c
index ca7f19cc973a..825109e54ba9 100644
--- a/drivers/pci/endpoint/pci-epc-core.c
+++ b/drivers/pci/endpoint/pci-epc-core.c
@@ -444,6 +444,50 @@ int pci_epc_map_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 }
 EXPORT_SYMBOL_GPL(pci_epc_map_addr);
 
+/**
+ * pci_epc_map_inbound() - map a BAR subrange to the local CPU address
+ * @epc: the EPC device on which BAR has to be configured
+ * @func_no: the physical endpoint function number in the EPC device
+ * @vfunc_no: the virtual endpoint function number in the physical function
+ * @epf_bar: the struct epf_bar that contains the BAR information
+ * @offset: byte offset from the BAR base selected by the host
+ *
+ * Invoke to configure the BAR of the endpoint device and map a subrange
+ * selected by @offset to a CPU address.
+ *
+ * Returns 0 on success, -EOPNOTSUPP if unsupported, or a negative errno.
+ */
+int pci_epc_map_inbound(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+			struct pci_epf_bar *epf_bar, u64 offset)
+{
+	if (!epc || !epc->ops || !epc->ops->map_inbound)
+		return -EOPNOTSUPP;
+
+	return epc->ops->map_inbound(epc, func_no, vfunc_no, epf_bar, offset);
+}
+EXPORT_SYMBOL_GPL(pci_epc_map_inbound);
+
+/**
+ * pci_epc_unmap_inbound() - unmap a previously mapped BAR subrange
+ * @epc: the EPC device on which the inbound mapping was programmed
+ * @func_no: the physical endpoint function number in the EPC device
+ * @vfunc_no: the virtual endpoint function number in the physical function
+ * @epf_bar: the struct epf_bar used when the mapping was created
+ * @offset: byte offset from the BAR base that was mapped
+ *
+ * Invoke to remove a BAR subrange mapping created by pci_epc_map_inbound().
+ * If the controller has no support, this call is a no-op.
+ */
+void pci_epc_unmap_inbound(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+			   struct pci_epf_bar *epf_bar, u64 offset)
+{
+	if (!epc || !epc->ops || !epc->ops->unmap_inbound)
+		return;
+
+	epc->ops->unmap_inbound(epc, func_no, vfunc_no, epf_bar, offset);
+}
+EXPORT_SYMBOL_GPL(pci_epc_unmap_inbound);
+
 /**
  * pci_epc_mem_map() - allocate and map a PCI address to a CPU address
  * @epc: the EPC device on which the CPU address is to be allocated and mapped
diff --git a/include/linux/pci-epc.h b/include/linux/pci-epc.h
index 4286bfdbfdfa..a5fb91cc2982 100644
--- a/include/linux/pci-epc.h
+++ b/include/linux/pci-epc.h
@@ -71,6 +71,8 @@ struct pci_epc_map {
  *		region
  * @map_addr: ops to map CPU address to PCI address
  * @unmap_addr: ops to unmap CPU address and PCI address
+ * @map_inbound: ops to map a subrange inside a BAR to CPU address.
+ * @unmap_inbound: ops to unmap a subrange inside a BAR and CPU address.
  * @set_msi: ops to set the requested number of MSI interrupts in the MSI
  *	     capability register
  * @get_msi: ops to get the number of MSI interrupts allocated by the RC from
@@ -99,6 +101,10 @@ struct pci_epc_ops {
 			    phys_addr_t addr, u64 pci_addr, size_t size);
 	void	(*unmap_addr)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 			      phys_addr_t addr);
+	int	(*map_inbound)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+			       struct pci_epf_bar *epf_bar, u64 offset);
+	void	(*unmap_inbound)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+				 struct pci_epf_bar *epf_bar, u64 offset);
 	int	(*set_msi)(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 			   u8 nr_irqs);
 	int	(*get_msi)(struct pci_epc *epc, u8 func_no, u8 vfunc_no);
@@ -286,6 +292,11 @@ int pci_epc_map_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 		     u64 pci_addr, size_t size);
 void pci_epc_unmap_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 			phys_addr_t phys_addr);
+
+int pci_epc_map_inbound(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+			struct pci_epf_bar *epf_bar, u64 offset);
+void pci_epc_unmap_inbound(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+			   struct pci_epf_bar *epf_bar, u64 offset);
 int pci_epc_set_msi(struct pci_epc *epc, u8 func_no, u8 vfunc_no, u8 nr_irqs);
 int pci_epc_get_msi(struct pci_epc *epc, u8 func_no, u8 vfunc_no);
 int pci_epc_set_msix(struct pci_epc *epc, u8 func_no, u8 vfunc_no, u16 nr_irqs,
-- 
2.48.1



* [RFC PATCH 05/25] PCI: dwc: ep: Implement EPC inbound mapping support
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (3 preceding siblings ...)
  2025-10-23  7:18 ` [RFC PATCH 04/25] PCI: endpoint: Add inbound mapping ops to EPC core Koichiro Den
@ 2025-10-23  7:18 ` Koichiro Den
  2025-10-23  7:18 ` [RFC PATCH 06/25] PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping Koichiro Den
                   ` (21 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:18 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Implement map_inbound() and unmap_inbound() for DesignWare endpoint
controllers (Address Match mode). This allows subrange mappings within a
BAR, enabling advanced endpoint functions such as NTB with offset-based
windows.
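
The bookkeeping this requires (several inbound windows per BAR, all released
when the BAR is cleared) can be sketched in userspace C as below. This is an
illustration only: struct ib_map, struct ep_state, NUM_IB_WINDOWS and the two
functions are hypothetical, there is no locking or iATU programming, and a
singly linked list stands in for the kernel list used in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_IB_WINDOWS 8	/* hypothetical number of inbound windows */

struct ib_map {
	struct ib_map *next;
	int bar;
	uint64_t offset;
	uint64_t size;
	unsigned int index;	/* inbound window index */
};

struct ep_state {
	struct ib_map *maps;	/* all active subrange mappings */
	uint32_t window_bitmap;	/* bit n set: window n in use */
};

/* Claim a free window for a BAR subrange; returns its index or -errno. */
static int ep_map_inbound(struct ep_state *ep, int bar,
			  uint64_t offset, uint64_t size)
{
	struct ib_map *m;
	unsigned int idx;

	assert(ep != NULL);

	for (idx = 0; idx < NUM_IB_WINDOWS; idx++)
		if (!(ep->window_bitmap & (1u << idx)))
			break;
	if (idx == NUM_IB_WINDOWS)
		return -ENOSPC;

	m = malloc(sizeof(*m));
	if (!m)
		return -ENOMEM;

	m->bar = bar;
	m->offset = offset;
	m->size = size;
	m->index = idx;
	m->next = ep->maps;
	ep->maps = m;
	ep->window_bitmap |= 1u << idx;
	return (int)idx;
}

/* Release every mapping that belongs to 'bar'; returns how many were freed. */
static unsigned int ep_clear_bar(struct ep_state *ep, int bar)
{
	struct ib_map **pp;
	unsigned int freed = 0;

	assert(ep != NULL);

	pp = &ep->maps;
	while (*pp) {
		struct ib_map *m = *pp;

		if (m->bar != bar) {
			pp = &m->next;
			continue;
		}
		*pp = m->next;		/* unlink before freeing */
		ep->window_bitmap &= ~(1u << m->index);
		free(m);
		freed++;
	}
	return freed;
}
```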

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++---
 drivers/pci/controller/dwc/pcie-designware.h  |   2 +
 2 files changed, 215 insertions(+), 29 deletions(-)

diff --git a/drivers/pci/controller/dwc/pcie-designware-ep.c b/drivers/pci/controller/dwc/pcie-designware-ep.c
index 0ae54a94809b..d7093958a916 100644
--- a/drivers/pci/controller/dwc/pcie-designware-ep.c
+++ b/drivers/pci/controller/dwc/pcie-designware-ep.c
@@ -8,13 +8,25 @@
 
 #include <linux/align.h>
 #include <linux/bitfield.h>
+#include <linux/list.h>
 #include <linux/of.h>
+#include <linux/pci_regs.h>
 #include <linux/platform_device.h>
+#include <linux/spinlock.h>
 
 #include "pcie-designware.h"
 #include <linux/pci-epc.h>
 #include <linux/pci-epf.h>
 
+struct dw_pcie_ib_map {
+	struct list_head node;
+	enum pci_barno bar;
+	u64 pci_addr;           /* BAR base + offset at map time */
+	phys_addr_t cpu_addr;   /* EP local phys */
+	u64 size;
+	u32 index;              /* iATU inbound window index */
+};
+
 /**
  * dw_pcie_ep_get_func_from_ep - Get the struct dw_pcie_ep_func corresponding to
  *				 the endpoint function
@@ -232,6 +244,7 @@ static void dw_pcie_ep_clear_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 	enum pci_barno bar = epf_bar->barno;
 	u32 atu_index = ep->bar_to_atu[bar] - 1;
+	struct dw_pcie_ib_map *m, *tmp;
 
 	if (!ep->bar_to_atu[bar])
 		return;
@@ -242,6 +255,16 @@ static void dw_pcie_ep_clear_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 	clear_bit(atu_index, ep->ib_window_map);
 	ep->epf_bar[bar] = NULL;
 	ep->bar_to_atu[bar] = 0;
+
+	guard(spinlock_irqsave)(&ep->ib_map_lock);
+	list_for_each_entry_safe(m, tmp, &ep->ib_map_list, node) {
+		if (m->bar != bar)
+			continue;
+		dw_pcie_disable_atu(pci, PCIE_ATU_REGION_DIR_IB, m->index);
+		clear_bit(m->index, ep->ib_window_map);
+		list_del(&m->node);
+		kfree(m);
+	}
 }
 
 static unsigned int dw_pcie_ep_get_rebar_offset(struct dw_pcie *pci,
@@ -363,14 +386,46 @@ static enum pci_epc_bar_type dw_pcie_ep_get_bar_type(struct dw_pcie_ep *ep,
 	return epc_features->bar[bar].type;
 }
 
+static int dw_pcie_ep_set_bar_init(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+				   struct pci_epf_bar *epf_bar)
+{
+	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
+	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	enum pci_barno bar = epf_bar->barno;
+	enum pci_epc_bar_type bar_type;
+	int ret;
+
+	bar_type = dw_pcie_ep_get_bar_type(ep, bar);
+	switch (bar_type) {
+	case BAR_FIXED:
+		/*
+		 * There is no need to write a BAR mask for a fixed BAR (except
+		 * to write 1 to the LSB of the BAR mask register, to enable the
+		 * BAR). Write the BAR mask regardless. (The fixed bits in the
+		 * BAR mask register will be read-only anyway.)
+		 */
+		fallthrough;
+	case BAR_PROGRAMMABLE:
+		ret = dw_pcie_ep_set_bar_programmable(ep, func_no, epf_bar);
+		break;
+	case BAR_RESIZABLE:
+		ret = dw_pcie_ep_set_bar_resizable(ep, func_no, epf_bar);
+		break;
+	default:
+		ret = -EINVAL;
+		dev_err(pci->dev, "Invalid BAR type\n");
+		break;
+	}
+
+	return ret;
+}
+
 static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 			      struct pci_epf_bar *epf_bar)
 {
 	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
-	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
 	enum pci_barno bar = epf_bar->barno;
 	size_t size = epf_bar->size;
-	enum pci_epc_bar_type bar_type;
 	int flags = epf_bar->flags;
 	int ret, type;
 
@@ -401,35 +456,12 @@ static int dw_pcie_ep_set_bar(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 		 * When dynamically changing a BAR, skip writing the BAR reg, as
 		 * that would clear the BAR's PCI address assigned by the host.
 		 */
-		goto config_atu;
-	}
-
-	bar_type = dw_pcie_ep_get_bar_type(ep, bar);
-	switch (bar_type) {
-	case BAR_FIXED:
-		/*
-		 * There is no need to write a BAR mask for a fixed BAR (except
-		 * to write 1 to the LSB of the BAR mask register, to enable the
-		 * BAR). Write the BAR mask regardless. (The fixed bits in the
-		 * BAR mask register will be read-only anyway.)
-		 */
-		fallthrough;
-	case BAR_PROGRAMMABLE:
-		ret = dw_pcie_ep_set_bar_programmable(ep, func_no, epf_bar);
-		break;
-	case BAR_RESIZABLE:
-		ret = dw_pcie_ep_set_bar_resizable(ep, func_no, epf_bar);
-		break;
-	default:
-		ret = -EINVAL;
-		dev_err(pci->dev, "Invalid BAR type\n");
-		break;
+	} else {
+		ret = dw_pcie_ep_set_bar_init(epc, func_no, vfunc_no, epf_bar);
+		if (ret)
+			return ret;
 	}
 
-	if (ret)
-		return ret;
-
-config_atu:
 	if (!(flags & PCI_BASE_ADDRESS_SPACE))
 		type = PCIE_ATU_TYPE_MEM;
 	else
@@ -515,6 +547,154 @@ static int dw_pcie_ep_map_addr(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
 	return 0;
 }
 
+static inline u64 dw_pcie_ep_read_bar_assigned(struct dw_pcie_ep *ep, u8 func_no,
+					       enum pci_barno bar, bool is_io,
+					       bool is_64)
+{
+	u32 reg = PCI_BASE_ADDRESS_0 + 4 * bar;
+	u32 lo, hi = 0;
+	u64 base;
+
+	lo = dw_pcie_ep_readl_dbi(ep, func_no, reg);
+	if (is_io)
+		base = lo & PCI_BASE_ADDRESS_IO_MASK;
+	else {
+		base = lo & PCI_BASE_ADDRESS_MEM_MASK;
+		if (is_64) {
+			hi = dw_pcie_ep_readl_dbi(ep, func_no, reg + 4);
+			base |= ((u64)hi) << 32;
+		}
+	}
+	return base;
+}
+
+static int dw_pcie_ep_map_inbound(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+				  struct pci_epf_bar *epf_bar, u64 offset)
+{
+	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
+	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	enum pci_barno bar = epf_bar->barno;
+	size_t size = epf_bar->size;
+	int flags = epf_bar->flags;
+	struct dw_pcie_ib_map *m;
+	u64 base, pci_addr;
+	int ret, type, win;
+
+	/*
+	 * DWC does not allow BAR pairs to overlap, e.g. you cannot combine BARs
+	 * 1 and 2 to form a 64-bit BAR.
+	 */
+	if ((flags & PCI_BASE_ADDRESS_MEM_TYPE_64) && (bar & 1))
+		return -EINVAL;
+
+	/*
+	 * Certain EPF drivers dynamically change the physical address of a BAR
+	 * (i.e. they call set_bar() twice, without ever calling clear_bar(), as
+	 * calling clear_bar() would clear the BAR's PCI address assigned by the
+	 * host).
+	 */
+	if (ep->epf_bar[bar]) {
+		/*
+		 * We can only dynamically add a whole or partial mapping if the
+		 * BAR flags do not differ from the existing configuration.
+		 */
+		if (ep->epf_bar[bar]->barno != bar ||
+		    ep->epf_bar[bar]->flags != flags)
+			return -EINVAL;
+
+		/*
+		 * When dynamically changing a BAR, skip writing the BAR reg, as
+		 * that would clear the BAR's PCI address assigned by the host.
+		 */
+	} else {
+		ret = dw_pcie_ep_set_bar_init(epc, func_no, vfunc_no, epf_bar);
+		if (ret)
+			return ret;
+	}
+
+	ep->epf_bar[bar] = epf_bar;
+
+	/*
+	 * Skip programming the inbound translation if phys_addr is 0.
+	 * In this case, the caller only intends to initialize the BAR.
+	 */
+	if (!epf_bar->phys_addr)
+		return 0;
+
+	base = dw_pcie_ep_read_bar_assigned(ep, func_no, bar,
+					    flags & PCI_BASE_ADDRESS_SPACE,
+					    flags & PCI_BASE_ADDRESS_MEM_TYPE_64);
+	if (!(flags & PCI_BASE_ADDRESS_SPACE))
+		type = PCIE_ATU_TYPE_MEM;
+	else
+		type = PCIE_ATU_TYPE_IO;
+	pci_addr = base + offset;
+
+	/* Allocate an inbound iATU window */
+	win = find_first_zero_bit(ep->ib_window_map, pci->num_ib_windows);
+	if (win >= pci->num_ib_windows)
+		return -ENOSPC;
+
+	/* Program address-match inbound iATU */
+	ret = dw_pcie_prog_inbound_atu(pci, win, type,
+				       epf_bar->phys_addr - pci->parent_bus_offset,
+				       pci_addr, size);
+	if (ret)
+		return ret;
+
+	m = kzalloc(sizeof(*m), GFP_KERNEL);
+	if (!m) {
+		dw_pcie_disable_atu(pci, PCIE_ATU_REGION_DIR_IB, win);
+		return -ENOMEM;
+	}
+	m->bar = bar;
+	m->pci_addr = pci_addr;
+	m->cpu_addr = epf_bar->phys_addr;
+	m->size = size;
+	m->index = win;
+
+	guard(spinlock_irqsave)(&ep->ib_map_lock);
+	set_bit(win, ep->ib_window_map);
+	list_add(&m->node, &ep->ib_map_list);
+
+	return 0;
+}
+
+static void dw_pcie_ep_unmap_inbound(struct pci_epc *epc, u8 func_no, u8 vfunc_no,
+				     struct pci_epf_bar *epf_bar, u64 offset)
+{
+	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
+	struct dw_pcie *pci = to_dw_pcie_from_ep(ep);
+	enum pci_barno bar = epf_bar->barno;
+	struct dw_pcie_ib_map *m, *tmp;
+	size_t size = epf_bar->size;
+	int flags = epf_bar->flags;
+	u64 match_pci = 0;
+	u64 base;
+
+	/* If BAR base isn't assigned, there can't be any programmed sub-window */
+	base = dw_pcie_ep_read_bar_assigned(ep, func_no, bar,
+					    flags & PCI_BASE_ADDRESS_SPACE,
+					    flags & PCI_BASE_ADDRESS_MEM_TYPE_64);
+	if (base)
+		match_pci = base + offset;
+
+	guard(spinlock_irqsave)(&ep->ib_map_lock);
+	list_for_each_entry_safe(m, tmp, &ep->ib_map_list, node) {
+		if (m->bar != bar)
+			continue;
+		if (match_pci && m->pci_addr != match_pci)
+			continue;
+		if (size && m->size != size)
+			/* Partial unmap is unsupported for now */
+			continue;
+		dw_pcie_disable_atu(pci, PCIE_ATU_REGION_DIR_IB, m->index);
+		clear_bit(m->index, ep->ib_window_map);
+		list_del(&m->node);
+		kfree(m);
+	}
+}
+
 static int dw_pcie_ep_get_msi(struct pci_epc *epc, u8 func_no, u8 vfunc_no)
 {
 	struct dw_pcie_ep *ep = epc_get_drvdata(epc);
@@ -657,6 +837,8 @@ static const struct pci_epc_ops epc_ops = {
 	.align_addr		= dw_pcie_ep_align_addr,
 	.map_addr		= dw_pcie_ep_map_addr,
 	.unmap_addr		= dw_pcie_ep_unmap_addr,
+	.map_inbound		= dw_pcie_ep_map_inbound,
+	.unmap_inbound		= dw_pcie_ep_unmap_inbound,
 	.set_msi		= dw_pcie_ep_set_msi,
 	.get_msi		= dw_pcie_ep_get_msi,
 	.set_msix		= dw_pcie_ep_set_msix,
@@ -1113,6 +1295,8 @@ int dw_pcie_ep_init(struct dw_pcie_ep *ep)
 	struct device *dev = pci->dev;
 
 	INIT_LIST_HEAD(&ep->func_list);
+	INIT_LIST_HEAD(&ep->ib_map_list);
+	spin_lock_init(&ep->ib_map_lock);
 
 	epc = devm_pci_epc_create(dev, &epc_ops);
 	if (IS_ERR(epc)) {
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 00f52d472dcd..455170e53d7e 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -462,6 +462,8 @@ struct dw_pcie_ep {
 	phys_addr_t		*outbound_addr;
 	unsigned long		*ib_window_map;
 	unsigned long		*ob_window_map;
+	struct list_head	ib_map_list;
+	spinlock_t		ib_map_lock;
 	void __iomem		*msi_mem;
 	phys_addr_t		msi_mem_phys;
 	struct pci_epf_bar	*epf_bar[PCI_STD_NUM_BARS];
-- 
2.48.1
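
Editor's sketch for the patch above: the address-match window is programmed
with "BAR base + offset", where the base is read back from the BAR register
the host has assigned. The userspace model below mirrors that arithmetic
only. The helper names bar_assigned_base() and subwindow_pci_addr() are
hypothetical and do not exist in the kernel; the two masks are assumed to
match PCI_BASE_ADDRESS_IO_MASK and PCI_BASE_ADDRESS_MEM_MASK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits at the bottom of a BAR register (assumed to match pci_regs.h). */
#define BAR_IO_MASK	(~UINT32_C(0x03))	/* I/O BAR: bits [1:0] are flags */
#define BAR_MEM_MASK	(~UINT32_C(0x0f))	/* memory BAR: bits [3:0] are flags */

/* Hypothetical helper: rebuild the host-assigned base from raw BAR values. */
static uint64_t bar_assigned_base(uint32_t lo, uint32_t hi, bool is_io,
				  bool is_64)
{
	uint64_t base;

	if (is_io)
		return lo & BAR_IO_MASK;

	base = lo & BAR_MEM_MASK;
	if (is_64)
		base |= (uint64_t)hi << 32;	/* upper half lives in the next BAR */

	return base;
}

/* Hypothetical helper: PCI address of a sub-window inside the BAR. */
static uint64_t subwindow_pci_addr(uint64_t bar_base, uint64_t offset)
{
	return bar_base + offset;
}
```

With a 64-bit prefetchable memory BAR whose registers read 0xe000000c (low)
and 0x1 (high), the base is 0x1e0000000, and a window at offset 0x100000
would be matched at 0x1e0100000.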



* [RFC PATCH 06/25] PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (4 preceding siblings ...)
  2025-10-23  7:18 ` [RFC PATCH 05/25] PCI: dwc: ep: Implement EPC inbound mapping support Koichiro Den
@ 2025-10-23  7:18 ` Koichiro Den
  2025-10-23  7:18 ` [RFC PATCH 07/25] NTB: Add offset parameter to MW translation APIs Koichiro Den
                   ` (20 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:18 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Switch MW setup to use pci_epc_map_inbound() when supported. This allows
mapping portions of a BAR rather than the entire region, supporting
partial BAR usage on capable controllers.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 23 +++++++++++++++----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
index 6953abb2987d..5b3aa1abeb70 100644
--- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
+++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
@@ -609,10 +609,16 @@ static int epf_ntb_mw_bar_init(struct epf_ntb *ntb)
 				PCI_BASE_ADDRESS_MEM_TYPE_64 :
 				PCI_BASE_ADDRESS_MEM_TYPE_32;
 
-		ret = pci_epc_set_bar(ntb->epf->epc,
-				      ntb->epf->func_no,
-				      ntb->epf->vfunc_no,
-				      &ntb->epf->bar[barno]);
+		if (ntb->epf->epc->ops->map_inbound)
+			ret = pci_epc_map_inbound(ntb->epf->epc,
+						  ntb->epf->func_no,
+						  ntb->epf->vfunc_no,
+						  &ntb->epf->bar[barno], 0);
+		else
+			ret = pci_epc_set_bar(ntb->epf->epc,
+					      ntb->epf->func_no,
+					      ntb->epf->vfunc_no,
+					      &ntb->epf->bar[barno]);
 		if (ret) {
 			dev_err(dev, "MW set failed\n");
 			goto err_set_bar;
@@ -1268,17 +1274,24 @@ static int vntb_epf_mw_set_trans(struct ntb_dev *ndev, int pidx, int idx,
 	struct epf_ntb *ntb = ntb_ndev(ndev);
 	struct pci_epf_bar *epf_bar;
 	enum pci_barno barno;
+	struct pci_epc *epc;
 	int ret;
 	struct device *dev;
 
+	epc = ntb->epf->epc;
 	dev = &ntb->ntb.dev;
 	barno = ntb->epf_ntb_bar[BAR_MW1 + idx];
+
 	epf_bar = &ntb->epf->bar[barno];
 	epf_bar->phys_addr = addr;
 	epf_bar->barno = barno;
 	epf_bar->size = size;
 
-	ret = pci_epc_set_bar(ntb->epf->epc, 0, 0, epf_bar);
+	if (epc->ops->map_inbound)
+		ret = pci_epc_map_inbound(epc, 0, 0, epf_bar, 0);
+	else
+		ret = pci_epc_set_bar(epc, 0, 0, epf_bar);
+
 	if (ret) {
 		dev_err(dev, "failure set mw trans\n");
 		return ret;
-- 
2.48.1
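
Editor's sketch for the patch above: the vNTB function driver prefers the
optional map_inbound() op and falls back to set_bar() when the controller
does not provide it. The userspace model below shows that dispatch pattern.
All toy_* names are hypothetical. Rejecting a non-zero offset in the
fallback path is an assumption of this sketch, not something the patch
does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

struct toy_bar {
	int barno;
	uint64_t phys_addr;
	uint64_t size;
};

struct toy_epc_ops {
	int (*set_bar)(const struct toy_bar *bar);
	/* Optional: left NULL by controllers without subrange support. */
	int (*map_inbound)(const struct toy_bar *bar, uint64_t offset);
};

/* Records which op ran last: 1 = set_bar, 2 = map_inbound. */
static int toy_last_call;

static int toy_set_bar(const struct toy_bar *bar)
{
	(void)bar;
	toy_last_call = 1;
	return 0;
}

static int toy_map_inbound(const struct toy_bar *bar, uint64_t offset)
{
	(void)bar;
	(void)offset;
	toy_last_call = 2;
	return 0;
}

static const struct toy_epc_ops toy_legacy_ops = {
	.set_bar = toy_set_bar,
};

static const struct toy_epc_ops toy_subrange_ops = {
	.set_bar = toy_set_bar,
	.map_inbound = toy_map_inbound,
};

/* Prefer the subrange op; otherwise map the whole BAR. */
static int toy_map_mw(const struct toy_epc_ops *ops,
		      const struct toy_bar *bar, uint64_t offset)
{
	if (ops->map_inbound)
		return ops->map_inbound(bar, offset);

	/* Sketch-only policy: a whole-BAR mapping cannot honour an offset. */
	if (offset)
		return -EINVAL;

	return ops->set_bar(bar);
}
```

Calling toy_map_mw() with toy_legacy_ops and offset 0 ends up in
toy_set_bar(), while toy_subrange_ops routes the same request through
toy_map_inbound().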



* [RFC PATCH 07/25] NTB: Add offset parameter to MW translation APIs
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (5 preceding siblings ...)
  2025-10-23  7:18 ` [RFC PATCH 06/25] PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping Koichiro Den
@ 2025-10-23  7:18 ` Koichiro Den
  2025-10-23  7:18 ` [RFC PATCH 08/25] PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when present Koichiro Den
                   ` (19 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:18 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Extend ntb_mw_set_trans() and ntb_mw_get_align() with an offset
argument. This supports subrange mapping inside a BAR for platforms that
require offset-based translations.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/hw/amd/ntb_hw_amd.c               |  6 ++++--
 drivers/ntb/hw/epf/ntb_hw_epf.c               |  6 ++++--
 drivers/ntb/hw/idt/ntb_hw_idt.c               |  3 ++-
 drivers/ntb/hw/intel/ntb_hw_gen1.c            |  6 ++++--
 drivers/ntb/hw/intel/ntb_hw_gen1.h            |  2 +-
 drivers/ntb/hw/intel/ntb_hw_gen3.c            |  3 ++-
 drivers/ntb/hw/intel/ntb_hw_gen4.c            |  6 ++++--
 drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |  6 ++++--
 drivers/ntb/msi.c                             |  6 +++---
 drivers/ntb/ntb_transport.c                   |  4 ++--
 drivers/ntb/test/ntb_perf.c                   |  4 ++--
 drivers/ntb/test/ntb_tool.c                   |  6 +++---
 drivers/pci/endpoint/functions/pci-epf-vntb.c |  7 ++++---
 include/linux/ntb.h                           | 18 +++++++++++-------
 14 files changed, 50 insertions(+), 33 deletions(-)

diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
index 1a163596ddf5..c0137df413c4 100644
--- a/drivers/ntb/hw/amd/ntb_hw_amd.c
+++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
@@ -92,7 +92,8 @@ static int amd_ntb_mw_count(struct ntb_dev *ntb, int pidx)
 static int amd_ntb_mw_get_align(struct ntb_dev *ntb, int pidx, int idx,
 				resource_size_t *addr_align,
 				resource_size_t *size_align,
-				resource_size_t *size_max)
+				resource_size_t *size_max,
+				resource_size_t *offset)
 {
 	struct amd_ntb_dev *ndev = ntb_ndev(ntb);
 	int bar;
@@ -117,7 +118,8 @@ static int amd_ntb_mw_get_align(struct ntb_dev *ntb, int pidx, int idx,
 }
 
 static int amd_ntb_mw_set_trans(struct ntb_dev *ntb, int pidx, int idx,
-				dma_addr_t addr, resource_size_t size)
+				dma_addr_t addr, resource_size_t size,
+				resource_size_t offset)
 {
 	struct amd_ntb_dev *ndev = ntb_ndev(ntb);
 	unsigned long xlat_reg, limit_reg = 0;
diff --git a/drivers/ntb/hw/epf/ntb_hw_epf.c b/drivers/ntb/hw/epf/ntb_hw_epf.c
index 91d3f8e05807..a3ec411bfe49 100644
--- a/drivers/ntb/hw/epf/ntb_hw_epf.c
+++ b/drivers/ntb/hw/epf/ntb_hw_epf.c
@@ -164,7 +164,8 @@ static int ntb_epf_mw_count(struct ntb_dev *ntb, int pidx)
 static int ntb_epf_mw_get_align(struct ntb_dev *ntb, int pidx, int idx,
 				resource_size_t *addr_align,
 				resource_size_t *size_align,
-				resource_size_t *size_max)
+				resource_size_t *size_max,
+				resource_size_t *offset)
 {
 	struct ntb_epf_dev *ndev = ntb_ndev(ntb);
 	struct device *dev = ndev->dev;
@@ -402,7 +403,8 @@ static int ntb_epf_db_set_mask(struct ntb_dev *ntb, u64 db_bits)
 }
 
 static int ntb_epf_mw_set_trans(struct ntb_dev *ntb, int pidx, int idx,
-				dma_addr_t addr, resource_size_t size)
+				dma_addr_t addr, resource_size_t size,
+				resource_size_t offset)
 {
 	struct ntb_epf_dev *ndev = ntb_ndev(ntb);
 	struct device *dev = ndev->dev;
diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c b/drivers/ntb/hw/idt/ntb_hw_idt.c
index f27df8d7f3b9..8c2cf149b99b 100644
--- a/drivers/ntb/hw/idt/ntb_hw_idt.c
+++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
@@ -1190,7 +1190,8 @@ static int idt_ntb_mw_count(struct ntb_dev *ntb, int pidx)
 static int idt_ntb_mw_get_align(struct ntb_dev *ntb, int pidx, int widx,
 				resource_size_t *addr_align,
 				resource_size_t *size_align,
-				resource_size_t *size_max)
+				resource_size_t *size_max,
+				resource_size_t *offset)
 {
 	struct idt_ntb_dev *ndev = to_ndev_ntb(ntb);
 	struct idt_ntb_peer *peer;
diff --git a/drivers/ntb/hw/intel/ntb_hw_gen1.c b/drivers/ntb/hw/intel/ntb_hw_gen1.c
index 079b8cd79785..6cbbd6cdf4c0 100644
--- a/drivers/ntb/hw/intel/ntb_hw_gen1.c
+++ b/drivers/ntb/hw/intel/ntb_hw_gen1.c
@@ -804,7 +804,8 @@ int intel_ntb_mw_count(struct ntb_dev *ntb, int pidx)
 int intel_ntb_mw_get_align(struct ntb_dev *ntb, int pidx, int idx,
 			   resource_size_t *addr_align,
 			   resource_size_t *size_align,
-			   resource_size_t *size_max)
+			   resource_size_t *size_max,
+			   resource_size_t *offset)
 {
 	struct intel_ntb_dev *ndev = ntb_ndev(ntb);
 	resource_size_t bar_size, mw_size;
@@ -840,7 +841,8 @@ int intel_ntb_mw_get_align(struct ntb_dev *ntb, int pidx, int idx,
 }
 
 static int intel_ntb_mw_set_trans(struct ntb_dev *ntb, int pidx, int idx,
-				  dma_addr_t addr, resource_size_t size)
+				  dma_addr_t addr, resource_size_t size,
+				  resource_size_t offset)
 {
 	struct intel_ntb_dev *ndev = ntb_ndev(ntb);
 	unsigned long base_reg, xlat_reg, limit_reg;
diff --git a/drivers/ntb/hw/intel/ntb_hw_gen1.h b/drivers/ntb/hw/intel/ntb_hw_gen1.h
index 344249fc18d1..f9ebd2780b7f 100644
--- a/drivers/ntb/hw/intel/ntb_hw_gen1.h
+++ b/drivers/ntb/hw/intel/ntb_hw_gen1.h
@@ -159,7 +159,7 @@ int ndev_mw_to_bar(struct intel_ntb_dev *ndev, int idx);
 int intel_ntb_mw_count(struct ntb_dev *ntb, int pidx);
 int intel_ntb_mw_get_align(struct ntb_dev *ntb, int pidx, int idx,
 		resource_size_t *addr_align, resource_size_t *size_align,
-		resource_size_t *size_max);
+		resource_size_t *size_max, resource_size_t *offset);
 int intel_ntb_peer_mw_count(struct ntb_dev *ntb);
 int intel_ntb_peer_mw_get_addr(struct ntb_dev *ntb, int idx,
 		phys_addr_t *base, resource_size_t *size);
diff --git a/drivers/ntb/hw/intel/ntb_hw_gen3.c b/drivers/ntb/hw/intel/ntb_hw_gen3.c
index a5aa96a31f4a..98722032ca5d 100644
--- a/drivers/ntb/hw/intel/ntb_hw_gen3.c
+++ b/drivers/ntb/hw/intel/ntb_hw_gen3.c
@@ -444,7 +444,8 @@ int intel_ntb3_link_enable(struct ntb_dev *ntb, enum ntb_speed max_speed,
 	return 0;
 }
 static int intel_ntb3_mw_set_trans(struct ntb_dev *ntb, int pidx, int idx,
-				   dma_addr_t addr, resource_size_t size)
+				   dma_addr_t addr, resource_size_t size,
+				   resource_size_t offset)
 {
 	struct intel_ntb_dev *ndev = ntb_ndev(ntb);
 	unsigned long xlat_reg, limit_reg;
diff --git a/drivers/ntb/hw/intel/ntb_hw_gen4.c b/drivers/ntb/hw/intel/ntb_hw_gen4.c
index 22cac7975b3c..8df90ea04c7c 100644
--- a/drivers/ntb/hw/intel/ntb_hw_gen4.c
+++ b/drivers/ntb/hw/intel/ntb_hw_gen4.c
@@ -335,7 +335,8 @@ ssize_t ndev_ntb4_debugfs_read(struct file *filp, char __user *ubuf,
 }
 
 static int intel_ntb4_mw_set_trans(struct ntb_dev *ntb, int pidx, int idx,
-				   dma_addr_t addr, resource_size_t size)
+				   dma_addr_t addr, resource_size_t size,
+				   resource_size_t offset)
 {
 	struct intel_ntb_dev *ndev = ntb_ndev(ntb);
 	unsigned long xlat_reg, limit_reg, idx_reg;
@@ -524,7 +525,8 @@ static int intel_ntb4_link_disable(struct ntb_dev *ntb)
 static int intel_ntb4_mw_get_align(struct ntb_dev *ntb, int pidx, int idx,
 				   resource_size_t *addr_align,
 				   resource_size_t *size_align,
-				   resource_size_t *size_max)
+				   resource_size_t *size_max,
+				   resource_size_t *offset)
 {
 	struct intel_ntb_dev *ndev = ntb_ndev(ntb);
 	resource_size_t bar_size, mw_size;
diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
index e38540b92716..5d8bace78d4f 100644
--- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
+++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
@@ -191,7 +191,8 @@ static int peer_lut_index(struct switchtec_ntb *sndev, int mw_idx)
 static int switchtec_ntb_mw_get_align(struct ntb_dev *ntb, int pidx,
 				      int widx, resource_size_t *addr_align,
 				      resource_size_t *size_align,
-				      resource_size_t *size_max)
+				      resource_size_t *size_max,
+				      resource_size_t *offset)
 {
 	struct switchtec_ntb *sndev = ntb_sndev(ntb);
 	int lut;
@@ -268,7 +269,8 @@ static void switchtec_ntb_mw_set_lut(struct switchtec_ntb *sndev, int idx,
 }
 
 static int switchtec_ntb_mw_set_trans(struct ntb_dev *ntb, int pidx, int widx,
-				      dma_addr_t addr, resource_size_t size)
+				      dma_addr_t addr, resource_size_t size,
+				      resource_size_t offset)
 {
 	struct switchtec_ntb *sndev = ntb_sndev(ntb);
 	struct ntb_ctrl_regs __iomem *ctl = sndev->mmio_peer_ctrl;
diff --git a/drivers/ntb/msi.c b/drivers/ntb/msi.c
index 6817d504c12a..8875bcbf2ea4 100644
--- a/drivers/ntb/msi.c
+++ b/drivers/ntb/msi.c
@@ -117,7 +117,7 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 			return peer_widx;
 
 		ret = ntb_mw_get_align(ntb, peer, peer_widx, &addr_align,
-				       NULL, NULL);
+				       NULL, NULL, NULL);
 		if (ret)
 			return ret;
 
@@ -132,7 +132,7 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 		}
 
 		ret = ntb_mw_get_align(ntb, peer, peer_widx, NULL,
-				       &size_align, &size_max);
+				       &size_align, &size_max, NULL);
 		if (ret)
 			goto error_out;
 
@@ -142,7 +142,7 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 			mw_min_size = mw_size;
 
 		ret = ntb_mw_set_trans(ntb, peer, peer_widx,
-				       addr, mw_size);
+				       addr, mw_size, 0);
 		if (ret)
 			goto error_out;
 	}
diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
index eb875e3db2e3..4bb1a64c1090 100644
--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -883,7 +883,7 @@ static int ntb_set_mw(struct ntb_transport_ctx *nt, int num_mw,
 		return -EINVAL;
 
 	rc = ntb_mw_get_align(nt->ndev, PIDX, num_mw, &xlat_align,
-			      &xlat_align_size, NULL);
+			      &xlat_align_size, NULL, NULL);
 	if (rc)
 		return rc;
 
@@ -918,7 +918,7 @@ static int ntb_set_mw(struct ntb_transport_ctx *nt, int num_mw,
 
 	/* Notify HW the memory location of the receive buffer */
 	rc = ntb_mw_set_trans(nt->ndev, PIDX, num_mw, mw->dma_addr,
-			      mw->xlat_size);
+			      mw->xlat_size, 0);
 	if (rc) {
 		dev_err(&pdev->dev, "Unable to set mw%d translation", num_mw);
 		ntb_free_mw(nt, num_mw);
diff --git a/drivers/ntb/test/ntb_perf.c b/drivers/ntb/test/ntb_perf.c
index dfd175f79e8f..b842b69e4242 100644
--- a/drivers/ntb/test/ntb_perf.c
+++ b/drivers/ntb/test/ntb_perf.c
@@ -573,7 +573,7 @@ static int perf_setup_inbuf(struct perf_peer *peer)
 
 	/* Get inbound MW parameters */
 	ret = ntb_mw_get_align(perf->ntb, peer->pidx, perf->gidx,
-			       &xlat_align, &size_align, &size_max);
+			       &xlat_align, &size_align, &size_max, NULL);
 	if (ret) {
 		dev_err(&perf->ntb->dev, "Couldn't get inbuf restrictions\n");
 		return ret;
@@ -604,7 +604,7 @@ static int perf_setup_inbuf(struct perf_peer *peer)
 	}
 
 	ret = ntb_mw_set_trans(perf->ntb, peer->pidx, peer->gidx,
-			       peer->inbuf_xlat, peer->inbuf_size);
+			       peer->inbuf_xlat, peer->inbuf_size, 0);
 	if (ret) {
 		dev_err(&perf->ntb->dev, "Failed to set inbuf translation\n");
 		goto err_free_inbuf;
diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
index 641cb7e05a47..7a7ba486bba7 100644
--- a/drivers/ntb/test/ntb_tool.c
+++ b/drivers/ntb/test/ntb_tool.c
@@ -578,7 +578,7 @@ static int tool_setup_mw(struct tool_ctx *tc, int pidx, int widx,
 		return 0;
 
 	ret = ntb_mw_get_align(tc->ntb, pidx, widx, &addr_align,
-				&size_align, &size);
+				&size_align, &size, NULL);
 	if (ret)
 		return ret;
 
@@ -595,7 +595,7 @@ static int tool_setup_mw(struct tool_ctx *tc, int pidx, int widx,
 		goto err_free_dma;
 	}
 
-	ret = ntb_mw_set_trans(tc->ntb, pidx, widx, inmw->dma_base, inmw->size);
+	ret = ntb_mw_set_trans(tc->ntb, pidx, widx, inmw->dma_base, inmw->size, 0);
 	if (ret)
 		goto err_free_dma;
 
@@ -652,7 +652,7 @@ static ssize_t tool_mw_trans_read(struct file *filep, char __user *ubuf,
 		return -ENOMEM;
 
 	ret = ntb_mw_get_align(inmw->tc->ntb, inmw->pidx, inmw->widx,
-			       &addr_align, &size_align, &size_max);
+			       &addr_align, &size_align, &size_max, NULL);
 	if (ret)
 		goto err;
 
diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
index 5b3aa1abeb70..becfad483643 100644
--- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
+++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
@@ -1269,7 +1269,7 @@ static int vntb_epf_db_set_mask(struct ntb_dev *ntb, u64 db_bits)
 }
 
 static int vntb_epf_mw_set_trans(struct ntb_dev *ndev, int pidx, int idx,
-		dma_addr_t addr, resource_size_t size)
+		dma_addr_t addr, resource_size_t size, resource_size_t offset)
 {
 	struct epf_ntb *ntb = ntb_ndev(ndev);
 	struct pci_epf_bar *epf_bar;
@@ -1288,7 +1288,7 @@ static int vntb_epf_mw_set_trans(struct ntb_dev *ndev, int pidx, int idx,
 	epf_bar->size = size;
 
 	if (epc->ops->map_inbound)
-		ret = pci_epc_map_inbound(epc, 0, 0, epf_bar, 0);
+		ret = pci_epc_map_inbound(epc, 0, 0, epf_bar, offset);
 	else
 		ret = pci_epc_set_bar(epc, 0, 0, epf_bar);
 
@@ -1399,7 +1399,8 @@ static u64 vntb_epf_db_read(struct ntb_dev *ndev)
 static int vntb_epf_mw_get_align(struct ntb_dev *ndev, int pidx, int idx,
 			resource_size_t *addr_align,
 			resource_size_t *size_align,
-			resource_size_t *size_max)
+			resource_size_t *size_max,
+			resource_size_t *offset)
 {
 	struct epf_ntb *ntb = ntb_ndev(ndev);
 
diff --git a/include/linux/ntb.h b/include/linux/ntb.h
index 8ff9d663096b..d7ce5d2e60d0 100644
--- a/include/linux/ntb.h
+++ b/include/linux/ntb.h
@@ -273,9 +273,11 @@ struct ntb_dev_ops {
 	int (*mw_get_align)(struct ntb_dev *ntb, int pidx, int widx,
 			    resource_size_t *addr_align,
 			    resource_size_t *size_align,
-			    resource_size_t *size_max);
+			    resource_size_t *size_max,
+			    resource_size_t *offset);
 	int (*mw_set_trans)(struct ntb_dev *ntb, int pidx, int widx,
-			    dma_addr_t addr, resource_size_t size);
+			    dma_addr_t addr, resource_size_t size,
+			    resource_size_t offset);
 	int (*mw_clear_trans)(struct ntb_dev *ntb, int pidx, int widx);
 	int (*peer_mw_count)(struct ntb_dev *ntb);
 	int (*peer_mw_get_addr)(struct ntb_dev *ntb, int widx,
@@ -823,13 +825,14 @@ static inline int ntb_mw_count(struct ntb_dev *ntb, int pidx)
 static inline int ntb_mw_get_align(struct ntb_dev *ntb, int pidx, int widx,
 				   resource_size_t *addr_align,
 				   resource_size_t *size_align,
-				   resource_size_t *size_max)
+				   resource_size_t *size_max,
+				   resource_size_t *offset)
 {
 	if (!(ntb_link_is_up(ntb, NULL, NULL) & BIT_ULL(pidx)))
 		return -ENOTCONN;
 
 	return ntb->ops->mw_get_align(ntb, pidx, widx, addr_align, size_align,
-				      size_max);
+				      size_max, offset);
 }
 
 /**
@@ -852,12 +855,13 @@ static inline int ntb_mw_get_align(struct ntb_dev *ntb, int pidx, int widx,
  * Return: Zero on success, otherwise an error number.
  */
 static inline int ntb_mw_set_trans(struct ntb_dev *ntb, int pidx, int widx,
-				   dma_addr_t addr, resource_size_t size)
+				   dma_addr_t addr, resource_size_t size,
+				   resource_size_t offset)
 {
 	if (!ntb->ops->mw_set_trans)
 		return 0;
 
-	return ntb->ops->mw_set_trans(ntb, pidx, widx, addr, size);
+	return ntb->ops->mw_set_trans(ntb, pidx, widx, addr, size, offset);
 }
 
 /**
@@ -875,7 +879,7 @@ static inline int ntb_mw_set_trans(struct ntb_dev *ntb, int pidx, int widx,
 static inline int ntb_mw_clear_trans(struct ntb_dev *ntb, int pidx, int widx)
 {
 	if (!ntb->ops->mw_clear_trans)
-		return ntb_mw_set_trans(ntb, pidx, widx, 0, 0);
+		return ntb_mw_set_trans(ntb, pidx, widx, 0, 0, 0);
 
 	return ntb->ops->mw_clear_trans(ntb, pidx, widx);
 }
-- 
2.48.1



* [RFC PATCH 08/25] PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when present
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (6 preceding siblings ...)
  2025-10-23  7:18 ` [RFC PATCH 07/25] NTB: Add offset parameter to MW translation APIs Koichiro Den
@ 2025-10-23  7:18 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 09/25] NTB: ntb_transport: Support offsetted partial memory windows Koichiro Den
                   ` (18 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:18 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

The NTB API functions ntb_mw_set_trans() and ntb_mw_get_align() now
support non-zero MW offsets. Update pci-epf-vntb to return
mws_offset[idx] through the offset argument of ntb_mw_get_align() when
the caller supplies one. Users can then pass the returned offset to
ntb_mw_set_trans().

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
index becfad483643..6495f99ffd4f 100644
--- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
+++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
@@ -1413,6 +1413,9 @@ static int vntb_epf_mw_get_align(struct ntb_dev *ndev, int pidx, int idx,
 	if (size_max)
 		*size_max = ntb->mws_size[idx];
 
+	if (offset)
+		*offset = ntb->mws_offset[idx];
+
 	return 0;
 }
 
-- 
2.48.1



* [RFC PATCH 09/25] NTB: ntb_transport: Support offsetted partial memory windows
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (7 preceding siblings ...)
  2025-10-23  7:18 ` [RFC PATCH 08/25] PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when present Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 10/25] NTB/msi: Support offsetted partial memory window for MSI Koichiro Den
                   ` (17 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

The NTB API functions ntb_mw_set_trans() and ntb_mw_get_align() now
support non-zero MW offsets. Update ntb_transport to make use of this
capability by propagating the offset when setting up MW translations.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/ntb_transport.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
index 4bb1a64c1090..3f3bc991e667 100644
--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -877,13 +877,14 @@ static int ntb_set_mw(struct ntb_transport_ctx *nt, int num_mw,
 	size_t xlat_size, buff_size;
 	resource_size_t xlat_align;
 	resource_size_t xlat_align_size;
+	resource_size_t offset;
 	int rc;
 
 	if (!size)
 		return -EINVAL;
 
 	rc = ntb_mw_get_align(nt->ndev, PIDX, num_mw, &xlat_align,
-			      &xlat_align_size, NULL, NULL);
+			      &xlat_align_size, NULL, &offset);
 	if (rc)
 		return rc;
 
@@ -918,7 +919,7 @@ static int ntb_set_mw(struct ntb_transport_ctx *nt, int num_mw,
 
 	/* Notify HW the memory location of the receive buffer */
 	rc = ntb_mw_set_trans(nt->ndev, PIDX, num_mw, mw->dma_addr,
-			      mw->xlat_size, 0);
+			      mw->xlat_size, offset);
 	if (rc) {
 		dev_err(&pdev->dev, "Unable to set mw%d translation", num_mw);
 		ntb_free_mw(nt, num_mw);
-- 
2.48.1


* [RFC PATCH 10/25] NTB/msi: Support offsetted partial memory window for MSI
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (8 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 09/25] NTB: ntb_transport: Support offsetted partial memory windows Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 11/25] NTB/msi: Do not force MW to its maximum possible size Koichiro Den
                   ` (16 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

The NTB API functions ntb_mw_set_trans() and ntb_mw_get_align() now
support non-zero MW offsets. Update ntb/msi to make use of this
capability by propagating the alignment offset when setting up a MW
translation for MSI.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/msi.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/ntb/msi.c b/drivers/ntb/msi.c
index 8875bcbf2ea4..4dc134cf404f 100644
--- a/drivers/ntb/msi.c
+++ b/drivers/ntb/msi.c
@@ -97,7 +97,7 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 	struct msi_desc *desc;
 	u64 addr;
 	int peer, peer_widx;
-	resource_size_t addr_align, size_align, size_max;
+	resource_size_t addr_align, size_align, size_max, offset;
 	resource_size_t mw_size = SZ_32K;
 	resource_size_t mw_min_size = mw_size;
 	int i;
@@ -132,7 +132,7 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 		}
 
 		ret = ntb_mw_get_align(ntb, peer, peer_widx, NULL,
-				       &size_align, &size_max, NULL);
+				       &size_align, &size_max, &offset);
 		if (ret)
 			goto error_out;
 
@@ -142,7 +142,7 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 			mw_min_size = mw_size;
 
 		ret = ntb_mw_set_trans(ntb, peer, peer_widx,
-				       addr, mw_size, 0);
+				       addr, mw_size, offset);
 		if (ret)
 			goto error_out;
 	}
-- 
2.48.1


* [RFC PATCH 11/25] NTB/msi: Do not force MW to its maximum possible size
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (9 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 10/25] NTB/msi: Support offsetted partial memory window for MSI Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 12/25] NTB: ntb_transport: Stricter checks for peer-reported interrupt values Koichiro Den
                   ` (15 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

As partial BAR usage is now supported, stop rounding memory windows up
to the maximum possible size.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
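The size arithmetic can be sketched in plain C as follows. This is only an
illustration under assumed values: demo_round_up(), demo_mw_size_old() and
demo_mw_size_new() are hypothetical helpers, and the 4 KiB alignment and
1 MiB maximum in the usage note are made-up numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of a power-of-two alignment. */
static uint64_t demo_round_up(uint64_t x, uint64_t align)
{
	assert(align && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}

/* Previous behaviour: the window is stretched to at least size_max. */
static uint64_t demo_mw_size_old(uint64_t want, uint64_t size_align,
				 uint64_t size_max)
{
	uint64_t sz = demo_round_up(want, size_align);

	return sz > size_max ? sz : size_max;
}

/* New behaviour: only the alignment constraint is applied. */
static uint64_t demo_mw_size_new(uint64_t want, uint64_t size_align)
{
	return demo_round_up(want, size_align);
}
```

With a 32 KiB request, a 4 KiB size alignment and a 1 MiB maximum, the old
helper returns 1 MiB while the new one returns 32 KiB, which leaves the
rest of the BAR free for other windows.
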
 drivers/ntb/msi.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/ntb/msi.c b/drivers/ntb/msi.c
index 4dc134cf404f..00218cfa6fd5 100644
--- a/drivers/ntb/msi.c
+++ b/drivers/ntb/msi.c
@@ -97,7 +97,7 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 	struct msi_desc *desc;
 	u64 addr;
 	int peer, peer_widx;
-	resource_size_t addr_align, size_align, size_max, offset;
+	resource_size_t addr_align, size_align, offset;
 	resource_size_t mw_size = SZ_32K;
 	resource_size_t mw_min_size = mw_size;
 	int i;
@@ -132,12 +132,11 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 		}
 
 		ret = ntb_mw_get_align(ntb, peer, peer_widx, NULL,
-				       &size_align, &size_max, &offset);
+				       &size_align, NULL, &offset);
 		if (ret)
 			goto error_out;
 
 		mw_size = round_up(mw_size, size_align);
-		mw_size = max(mw_size, size_max);
 		if (mw_size < mw_min_size)
 			mw_min_size = mw_size;
 
-- 
2.48.1


* [RFC PATCH 12/25] NTB: ntb_transport: Stricter checks for peer-reported interrupt values
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (10 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 11/25] NTB/msi: Do not force MW to its maximum possible size Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 13/25] NTB/msi: Skip mw_set_trans() if already configured Koichiro Den
                   ` (14 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

addr_offset and/or data may legitimately be zero, depending on alignment
constraints, so zero cannot double as the "not configured" marker. Use
U32_MAX as the invalid default for both values and check peer-reported
values against it, so that a legitimate zero is no longer rejected.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
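The difference between the two validation rules can be sketched in
userspace C. The names below (struct demo_peer_desc, demo_usable_old(),
demo_usable_new() and the DEMO_* macros) are hypothetical and only mirror
the idea of the patch, not its code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_INVALID_ADDR_OFFSET UINT32_MAX
#define DEMO_INVALID_DATA        UINT32_MAX

/* Hypothetical copy of the values a peer publishes via scratchpads. */
struct demo_peer_desc {
	uint32_t addr_offset;
	uint32_t data;
};

/* Previous rule: an addr_offset of zero was read as "not configured". */
static bool demo_usable_old(const struct demo_peer_desc *d)
{
	assert(d);
	return d->addr_offset != 0;
}

/* New rule: only the explicit sentinel values mean "not configured". */
static bool demo_usable_new(const struct demo_peer_desc *d)
{
	assert(d);
	return d->addr_offset != DEMO_INVALID_ADDR_OFFSET &&
	       d->data != DEMO_INVALID_DATA;
}
```

A descriptor of { 0, 0 } is rejected by the old rule and accepted by the
new one, while a descriptor still holding the sentinel in either field is
rejected.
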
 drivers/ntb/ntb_transport.c | 24 +++++++++++++++++-------
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
index 3f3bc991e667..d9fc450ef497 100644
--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -69,6 +69,9 @@
 #define NTB_TRANSPORT_DESC	"Software Queue-Pair Transport over NTB"
 #define NTB_TRANSPORT_MIN_SPADS (MW0_SZ_HIGH + 2)
 
+#define INTR_INVALID_ADDR_OFFSET	U32_MAX
+#define INTR_INVALID_DATA		U32_MAX
+
 MODULE_DESCRIPTION(NTB_TRANSPORT_DESC);
 MODULE_VERSION(NTB_TRANSPORT_VER);
 MODULE_LICENSE("Dual BSD/GPL");
@@ -715,7 +718,11 @@ static void ntb_transport_setup_qp_peer_msi(struct ntb_transport_ctx *nt,
 	dev_dbg(&qp->ndev->pdev->dev, "QP%d Peer MSI addr=%x data=%x\n",
 		qp_num, qp->peer_msi_desc.addr_offset, qp->peer_msi_desc.data);
 
-	if (qp->peer_msi_desc.addr_offset) {
+	if (qp->peer_msi_desc.addr_offset == INTR_INVALID_ADDR_OFFSET ||
+	    qp->peer_msi_desc.data == INTR_INVALID_DATA)
+		dev_info(&qp->ndev->pdev->dev,
+			 "Invalid addr_offset or data, falling back to doorbell\n");
+	else {
 		qp->use_msi = true;
 		dev_info(&qp->ndev->pdev->dev,
 			 "Using MSI interrupts for QP%d\n", qp_num);
@@ -723,12 +730,18 @@ static void ntb_transport_setup_qp_peer_msi(struct ntb_transport_ctx *nt,
 }
 
 static void ntb_transport_setup_qp_msi(struct ntb_transport_ctx *nt,
-				       unsigned int qp_num)
+				       unsigned int qp_num, bool changed)
 {
 	struct ntb_transport_qp *qp = &nt->qp_vec[qp_num];
 	int spad = qp_num * 2 + nt->msi_spad_offset;
 	int rc;
 
+	if (!changed && qp->msi_irq)
+		return;
+
+	ntb_spad_write(qp->ndev, spad, INTR_INVALID_ADDR_OFFSET);
+	ntb_spad_write(qp->ndev, spad + 1, INTR_INVALID_DATA);
+
 	if (!nt->use_msi)
 		return;
 
@@ -738,9 +751,6 @@ static void ntb_transport_setup_qp_msi(struct ntb_transport_ctx *nt,
 		return;
 	}
 
-	ntb_spad_write(qp->ndev, spad, 0);
-	ntb_spad_write(qp->ndev, spad + 1, 0);
-
 	if (!qp->msi_irq) {
 		qp->msi_irq = ntbm_msi_request_irq(qp->ndev, ntb_transport_isr,
 						   KBUILD_MODNAME, qp,
@@ -789,7 +799,7 @@ static void ntb_transport_msi_desc_changed(void *data)
 	dev_dbg(&nt->ndev->pdev->dev, "MSI descriptors changed");
 
 	for (i = 0; i < nt->qp_count; i++)
-		ntb_transport_setup_qp_msi(nt, i);
+		ntb_transport_setup_qp_msi(nt, i, true);
 
 	ntb_peer_db_set(nt->ndev, nt->msi_db_mask);
 }
@@ -1068,7 +1078,7 @@ static void ntb_transport_link_work(struct work_struct *work)
 	}
 
 	for (i = 0; i < nt->qp_count; i++)
-		ntb_transport_setup_qp_msi(nt, i);
+		ntb_transport_setup_qp_msi(nt, i, false);
 
 	for (i = 0; i < nt->mw_count; i++) {
 		size = nt->mw_vec[i].phys_size;
-- 
2.48.1


* [RFC PATCH 13/25] NTB/msi: Skip mw_set_trans() if already configured
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (11 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 12/25] NTB: ntb_transport: Stricter checks for peer-reported interrupt values Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 14/25] NTB/msi: Add a inner loop for PCI-MSI cases Koichiro Den
                   ` (13 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Return early if msi->base_addr is already set, avoiding redundant
mw_set_trans() calls and unnecessary reprogramming of address
translations.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/msi.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/ntb/msi.c b/drivers/ntb/msi.c
index 00218cfa6fd5..6d48418aa756 100644
--- a/drivers/ntb/msi.c
+++ b/drivers/ntb/msi.c
@@ -106,6 +106,9 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 	if (!ntb->msi)
 		return -EINVAL;
 
+	if (ntb->msi->base_addr)
+		return 0;
+
 	scoped_guard (msi_descs_lock, &ntb->pdev->dev) {
 		desc = msi_first_desc(&ntb->pdev->dev, MSI_DESC_ASSOCIATED);
 		addr = desc->msg.address_lo + ((uint64_t)desc->msg.address_hi << 32);
-- 
2.48.1


* [RFC PATCH 14/25] NTB/msi: Add a inner loop for PCI-MSI cases
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (12 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 13/25] NTB/msi: Skip mw_set_trans() if already configured Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 15/25] dmaengine: dw-edma: Add self-interrupt registration API Koichiro Den
                   ` (12 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Add an inner loop to handle MSI descriptors with nvec_used > 1, allowing
multiple interrupt vectors per MSI descriptor, as with PCI-MSI.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
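A simplified userspace model of the per-vector walk is sketched below. It
assumes, as the patch does, that vector i of a descriptor uses IRQ number
irq + i and MSI data data + i. The types and demo_claim_vector() are
hypothetical, and the busy[] array stands in for irq_has_action().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced view of an MSI descriptor. */
struct demo_msi_desc {
	unsigned int irq;    /* Linux IRQ number of the first vector */
	uint16_t nvec_used;  /* number of vectors behind this descriptor */
	uint32_t data;       /* MSI data value of the first vector */
};

struct demo_vector {
	unsigned int virq;
	uint32_t data;
	uint16_t vector_offset;
};

/*
 * Claim the first vector that has no handler yet. busy[] holds one flag
 * per vector of the descriptor. Returns 0 on success, -1 if all vectors
 * are already taken.
 */
static int demo_claim_vector(const struct demo_msi_desc *desc, bool *busy,
			     struct demo_vector *out)
{
	uint16_t i;

	assert(desc && busy && out);
	for (i = 0; i < desc->nvec_used; i++) {
		if (busy[i])
			continue;
		busy[i] = true;
		out->virq = desc->irq + i;
		out->data = desc->data + i;
		out->vector_offset = i;
		return 0;
	}
	return -1;
}
```

For a descriptor with irq 40, four vectors and data 0x20 whose first two
vectors are busy, the first call yields virq 42 with data 0x22, the second
yields virq 43 with data 0x23, and a third call fails.
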
 drivers/ntb/msi.c   | 51 ++++++++++++++++++++++++++-------------------
 include/linux/ntb.h |  1 +
 2 files changed, 30 insertions(+), 22 deletions(-)

diff --git a/drivers/ntb/msi.c b/drivers/ntb/msi.c
index 6d48418aa756..983725d4eb13 100644
--- a/drivers/ntb/msi.c
+++ b/drivers/ntb/msi.c
@@ -195,7 +195,7 @@ struct ntb_msi_devres {
 };
 
 static int ntb_msi_set_desc(struct ntb_dev *ntb, struct msi_desc *entry,
-			    struct ntb_msi_desc *msi_desc)
+			    struct ntb_msi_desc *msi_desc, u16 vector_offset)
 {
 	u64 addr;
 
@@ -211,7 +211,8 @@ static int ntb_msi_set_desc(struct ntb_dev *ntb, struct msi_desc *entry,
 	}
 
 	msi_desc->addr_offset = addr - ntb->msi->base_addr;
-	msi_desc->data = entry->msg.data;
+	msi_desc->data = entry->msg.data + vector_offset;
+	msi_desc->vector_offset = vector_offset;
 
 	return 0;
 }
@@ -220,7 +221,8 @@ static void ntb_msi_write_msg(struct msi_desc *entry, void *data)
 {
 	struct ntb_msi_devres *dr = data;
 
-	WARN_ON(ntb_msi_set_desc(dr->ntb, entry, dr->msi_desc));
+	WARN_ON(ntb_msi_set_desc(dr->ntb, entry, dr->msi_desc,
+				 dr->msi_desc->vector_offset));
 
 	if (dr->ntb->msi->desc_changed)
 		dr->ntb->msi->desc_changed(dr->ntb->ctx);
@@ -286,32 +288,37 @@ int ntbm_msi_request_threaded_irq(struct ntb_dev *ntb, irq_handler_t handler,
 {
 	struct device *dev = &ntb->pdev->dev;
 	struct msi_desc *entry;
-	int ret;
+	unsigned int virq;
+	int ret, i;
 
 	if (!ntb->msi)
 		return -EINVAL;
 
 	guard(msi_descs_lock)(dev);
 	msi_for_each_desc(entry, dev, MSI_DESC_ASSOCIATED) {
-		if (irq_has_action(entry->irq))
-			continue;
-
-		ret = devm_request_threaded_irq(&ntb->dev, entry->irq, handler,
-						thread_fn, 0, name, dev_id);
-		if (ret)
-			continue;
-
-		if (ntb_msi_set_desc(ntb, entry, msi_desc)) {
-			devm_free_irq(&ntb->dev, entry->irq, dev_id);
-			continue;
-		}
-
-		ret = ntbm_msi_setup_callback(ntb, entry, msi_desc);
-		if (ret) {
-			devm_free_irq(&ntb->dev, entry->irq, dev_id);
-			return ret;
+		for (i = 0; i < entry->nvec_used; i++) {
+			virq = entry->irq + i;
+			if (irq_has_action(virq))
+				continue;
+
+			ret = devm_request_threaded_irq(
+					&ntb->dev, virq, handler,
+					thread_fn, 0, name, dev_id);
+			if (ret)
+				continue;
+
+			if (ntb_msi_set_desc(ntb, entry, msi_desc, i)) {
+				devm_free_irq(&ntb->dev, virq, dev_id);
+				continue;
+			}
+
+			ret = ntbm_msi_setup_callback(ntb, entry, msi_desc);
+			if (ret) {
+				devm_free_irq(&ntb->dev, virq, dev_id);
+				return ret;
+			}
+			return virq;
 		}
-		return entry->irq;
 	}
 	return -ENODEV;
 }
diff --git a/include/linux/ntb.h b/include/linux/ntb.h
index d7ce5d2e60d0..dc5aab43abc2 100644
--- a/include/linux/ntb.h
+++ b/include/linux/ntb.h
@@ -1640,6 +1640,7 @@ static inline int ntb_peer_highest_mw_idx(struct ntb_dev *ntb, int pidx)
 struct ntb_msi_desc {
 	u32 addr_offset;
 	u32 data;
+	u16 vector_offset;
 };
 
 #ifdef CONFIG_NTB_MSI
-- 
2.48.1


* [RFC PATCH 15/25] dmaengine: dw-edma: Add self-interrupt registration API
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (13 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 14/25] NTB/msi: Add a inner loop for PCI-MSI cases Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 16/25] dmaengine: dw-edma: Expose self-IRQ register offsets Koichiro Den
                   ` (11 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Introduce dw_edma_register_selfirq() and dw_edma_unregister_selfirq() to
register IRQ callbacks for emulated interrupts. These can be used for
testing purposes, or even as interrupt callbacks triggered by inbound
address translations where the IB iATU target is set to e.g.
DMA_READ_INT_STATUS_OFF. The latter case provides a practical workaround
for endpoint controllers that cannot directly access GIC ITS registers
due to security restrictions, e.g. on R-Car S4 Spider.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
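The registration and dispatch scheme can be modelled in userspace roughly
as follows. This is a sketch under simplifying assumptions: it uses a
singly linked list and no locking, whereas the patch uses a list_head
protected by a spinlock, and all demo_* names are hypothetical. The
channel_event flag stands in for "the read or write interrupt handler
found work to do".

```c
#include <assert.h>
#include <stdlib.h>

typedef void (*demo_selfirq_fn)(void *data);

struct demo_handler {
	struct demo_handler *next;
	demo_selfirq_fn fn;
	void *data;
};

struct demo_edma {
	struct demo_handler *handlers;
};

/* Append a handler; returns 0 on success, -1 on bad input or no memory. */
static int demo_register_selfirq(struct demo_edma *dw, demo_selfirq_fn fn,
				 void *data)
{
	struct demo_handler *h, **tail;

	if (!dw || !fn)
		return -1;
	h = calloc(1, sizeof(*h));
	if (!h)
		return -1;
	h->fn = fn;
	h->data = data;
	for (tail = &dw->handlers; *tail; tail = &(*tail)->next)
		;
	*tail = h;
	return 0;
}

/* Remove the first handler matching both fn and data. */
static void demo_unregister_selfirq(struct demo_edma *dw, demo_selfirq_fn fn,
				    void *data)
{
	struct demo_handler **link, *h;

	if (!dw || !fn)
		return;
	for (link = &dw->handlers; *link; link = &(*link)->next) {
		h = *link;
		if (h->fn == fn && h->data == data) {
			*link = h->next;
			free(h);
			return;
		}
	}
}

/*
 * Interrupt entry: if no channel event was pending, treat the interrupt
 * as a self-IRQ and run every handler. Returns the number of calls made.
 */
static int demo_interrupt(struct demo_edma *dw, int channel_event)
{
	struct demo_handler *h;
	int called = 0;

	assert(dw);
	if (channel_event)
		return 0;
	for (h = dw->handlers; h; h = h->next) {
		h->fn(h->data);
		called++;
	}
	return called;
}

/* Example handler: bump the counter passed as data. */
static void demo_count_hit(void *data)
{
	++*(int *)data;
}
```

Matching on the (fn, data) pair at unregister time lets the same callback
be registered more than once with different contexts.
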
 drivers/dma/dw-edma/dw-edma-core.c    | 60 +++++++++++++++++++++++++++
 drivers/dma/dw-edma/dw-edma-core.h    | 16 +++++++
 drivers/dma/dw-edma/dw-edma-v0-core.c | 15 +++++++
 include/linux/dma/edma.h              | 16 +++++++
 4 files changed, 107 insertions(+)

diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
index b43255f914f3..7cf9e5e74a89 100644
--- a/drivers/dma/dw-edma/dw-edma-core.c
+++ b/drivers/dma/dw-edma/dw-edma-core.c
@@ -661,11 +661,22 @@ static inline irqreturn_t dw_edma_interrupt_read(int irq, void *data)
 
 static irqreturn_t dw_edma_interrupt_common(int irq, void *data)
 {
+	struct dw_edma_irq *dw_irq = data;
+	struct dw_edma *dw = dw_irq->dw;
 	irqreturn_t ret = IRQ_NONE;
+	struct dw_edma_selfirq *h;
 
 	ret |= dw_edma_interrupt_write(irq, data);
 	ret |= dw_edma_interrupt_read(irq, data);
 
+	if (ret == IRQ_NONE) {
+		dw_edma_core_ack_test(dw);
+		scoped_guard(spinlock_irqsave, &dw->selfirq_lock) {
+			list_for_each_entry(h, &dw->selfirq_handlers, node)
+				h->fn(dw, h->data);
+		}
+		ret = IRQ_HANDLED;
+	}
 	return ret;
 }
 
@@ -892,6 +903,44 @@ static int dw_edma_irq_request(struct dw_edma *dw,
 	return err;
 }
 
+int dw_edma_register_selfirq(struct dw_edma *dw,
+			     dw_edma_selfirq_fn fn, void *data)
+{
+	struct dw_edma_selfirq *h;
+
+	if (!dw || !fn)
+		return -EINVAL;
+
+	h = kzalloc(sizeof(*h), GFP_KERNEL);
+	if (!h)
+		return -ENOMEM;
+	h->fn = fn;
+	h->data = data;
+	guard(spinlock_irqsave)(&dw->selfirq_lock);
+	list_add_tail(&h->node, &dw->selfirq_handlers);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dw_edma_register_selfirq);
+
+void dw_edma_unregister_selfirq(struct dw_edma *dw,
+				dw_edma_selfirq_fn fn, void *data)
+{
+	struct dw_edma_selfirq *h, *tmp;
+
+	if (!dw || !fn)
+		return;
+
+	guard(spinlock_irqsave)(&dw->selfirq_lock);
+	list_for_each_entry_safe(h, tmp, &dw->selfirq_handlers, node) {
+		if (h->fn == fn && h->data == data) {
+			list_del(&h->node);
+			kfree(h);
+			break;
+		}
+	}
+}
+EXPORT_SYMBOL_GPL(dw_edma_unregister_selfirq);
+
 int dw_edma_probe(struct dw_edma_chip *chip)
 {
 	struct device *dev;
@@ -912,6 +961,8 @@ int dw_edma_probe(struct dw_edma_chip *chip)
 		return -ENOMEM;
 
 	dw->chip = chip;
+	INIT_LIST_HEAD(&dw->selfirq_handlers);
+	spin_lock_init(&dw->selfirq_lock);
 
 	if (dw->chip->mf == EDMA_MF_HDMA_NATIVE)
 		dw_hdma_v0_core_register(dw);
@@ -974,6 +1025,7 @@ EXPORT_SYMBOL_GPL(dw_edma_probe);
 int dw_edma_remove(struct dw_edma_chip *chip)
 {
 	struct dw_edma_chan *chan, *_chan;
+	struct dw_edma_selfirq *h, *tmp;
 	struct device *dev = chip->dev;
 	struct dw_edma *dw = chip->dw;
 	int i;
@@ -985,6 +1037,14 @@ int dw_edma_remove(struct dw_edma_chip *chip)
 	/* Disable eDMA */
 	dw_edma_core_off(dw);
 
+	/* Free self-irq handlers */
+	scoped_guard(spinlock_irqsave, &dw->selfirq_lock) {
+		list_for_each_entry_safe(h, tmp, &dw->selfirq_handlers, node) {
+			list_del(&h->node);
+			kfree(h);
+		}
+	}
+
 	/* Free irqs */
 	for (i = (dw->nr_irqs - 1); i >= 0; i--)
 		free_irq(chip->ops->irq_vector(dev, i), &dw->irq[i]);
diff --git a/drivers/dma/dw-edma/dw-edma-core.h b/drivers/dma/dw-edma/dw-edma-core.h
index 71894b9e0b15..7d7dd9f13863 100644
--- a/drivers/dma/dw-edma/dw-edma-core.h
+++ b/drivers/dma/dw-edma/dw-edma-core.h
@@ -95,6 +95,12 @@ struct dw_edma_irq {
 	struct dw_edma			*dw;
 };
 
+struct dw_edma_selfirq {
+	struct list_head		node;
+	dw_edma_selfirq_fn		fn;
+	void				*data;
+};
+
 struct dw_edma {
 	char				name[32];
 
@@ -113,6 +119,9 @@ struct dw_edma {
 	struct dw_edma_chip             *chip;
 
 	const struct dw_edma_core_ops	*core;
+
+	struct list_head selfirq_handlers;
+	spinlock_t selfirq_lock;
 };
 
 typedef void (*dw_edma_handler_t)(struct dw_edma_chan *);
@@ -126,6 +135,7 @@ struct dw_edma_core_ops {
 	void (*start)(struct dw_edma_chunk *chunk, bool first);
 	void (*ch_config)(struct dw_edma_chan *chan);
 	void (*debugfs_on)(struct dw_edma *dw);
+	void (*ack_test)(struct dw_edma *dw);
 };
 
 struct dw_edma_sg {
@@ -206,4 +216,10 @@ void dw_edma_core_debugfs_on(struct dw_edma *dw)
 	dw->core->debugfs_on(dw);
 }
 
+static inline void dw_edma_core_ack_test(struct dw_edma *dw)
+{
+	if (dw->core->ack_test)
+		dw->core->ack_test(dw);
+}
+
 #endif /* _DW_EDMA_CORE_H */
diff --git a/drivers/dma/dw-edma/dw-edma-v0-core.c b/drivers/dma/dw-edma/dw-edma-v0-core.c
index b75fdaffad9a..67b0541f38c3 100644
--- a/drivers/dma/dw-edma/dw-edma-v0-core.c
+++ b/drivers/dma/dw-edma/dw-edma-v0-core.c
@@ -509,6 +509,20 @@ static void dw_edma_v0_core_debugfs_on(struct dw_edma *dw)
 	dw_edma_v0_debugfs_on(dw);
 }
 
+static void dw_edma_v0_core_ack_test(struct dw_edma *dw)
+{
+	u32 wr_mask_all = (dw->wr_ch_cnt >= 32) ? ~0U : (BIT(dw->wr_ch_cnt) - 1);
+	u32 rd_mask_all = (dw->rd_ch_cnt >= 32) ? ~0U : (BIT(dw->rd_ch_cnt) - 1);
+
+	u32 wr_val = FIELD_PREP(EDMA_V0_DONE_INT_MASK, wr_mask_all) |
+		     FIELD_PREP(EDMA_V0_ABORT_INT_MASK, wr_mask_all);
+	u32 rd_val = FIELD_PREP(EDMA_V0_DONE_INT_MASK, rd_mask_all) |
+		     FIELD_PREP(EDMA_V0_ABORT_INT_MASK, rd_mask_all);
+
+	SET_32(dw, wr_int_clear, wr_val);
+	SET_32(dw, rd_int_clear, rd_val);
+}
+
 static const struct dw_edma_core_ops dw_edma_v0_core = {
 	.off = dw_edma_v0_core_off,
 	.ch_count = dw_edma_v0_core_ch_count,
@@ -517,6 +531,7 @@ static const struct dw_edma_core_ops dw_edma_v0_core = {
 	.start = dw_edma_v0_core_start,
 	.ch_config = dw_edma_v0_core_ch_config,
 	.debugfs_on = dw_edma_v0_core_debugfs_on,
+	.ack_test = dw_edma_v0_core_ack_test,
 };
 
 void dw_edma_v0_core_register(struct dw_edma *dw)
diff --git a/include/linux/dma/edma.h b/include/linux/dma/edma.h
index 3080747689f6..42daf9a76b56 100644
--- a/include/linux/dma/edma.h
+++ b/include/linux/dma/edma.h
@@ -101,10 +101,16 @@ struct dw_edma_chip {
 	struct dw_edma		*dw;
 };
 
+typedef void (*dw_edma_selfirq_fn)(struct dw_edma *dw, void *data);
+
 /* Export to the platform drivers */
 #if IS_REACHABLE(CONFIG_DW_EDMA)
 int dw_edma_probe(struct dw_edma_chip *chip);
 int dw_edma_remove(struct dw_edma_chip *chip);
+int dw_edma_register_selfirq(struct dw_edma *dw,
+			     dw_edma_selfirq_fn fn, void *data);
+void dw_edma_unregister_selfirq(struct dw_edma *dw,
+				dw_edma_selfirq_fn fn, void *data);
 #else
 static inline int dw_edma_probe(struct dw_edma_chip *chip)
 {
@@ -115,6 +121,16 @@ static inline int dw_edma_remove(struct dw_edma_chip *chip)
 {
 	return 0;
 }
+static inline int dw_edma_register_selfirq(struct dw_edma *dw,
+					   dw_edma_selfirq_fn fn, void *data)
+{
+	return -EOPNOTSUPP;
+}
+
+static inline void dw_edma_unregister_selfirq(struct dw_edma *dw,
+					      dw_edma_selfirq_fn fn, void *data)
+{
+}
 #endif /* CONFIG_DW_EDMA */
 
 #endif /* _DW_EDMA_H */
-- 
2.48.1


* [RFC PATCH 16/25] dmaengine: dw-edma: Expose self-IRQ register offsets
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (14 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 15/25] dmaengine: dw-edma: Add self-interrupt registration API Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 17/25] dmaengine: dw-edma: Add dw_edma_find_by_child() helper Koichiro Den
                   ` (10 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Add helper dw_edma_selfirq_offsets() to query the physical addresses of
the DMA read interrupt status and clear registers (rd_int_status and
rd_int_clear) used for software-triggered IRQs.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
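The address computation amounts to "register block base plus offsetof()".
A minimal userspace sketch follows. The struct demo_regs layout is
invented for illustration and does not match struct dw_edma_v0_regs, and
demo_selfirq_addrs() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invented register layout; the real dw_edma_v0_regs is different. */
struct demo_regs {
	uint32_t ctrl;
	uint32_t wr_int_status;
	uint32_t wr_int_clear;
	uint32_t rd_int_status;
	uint32_t rd_int_clear;
};

/*
 * Derive the physical addresses of the two registers from the physical
 * base of the register block. Both out-parameters are optional.
 */
static int demo_selfirq_addrs(uint64_t reg_phys, uint64_t *rd_status,
			      uint64_t *rd_clear)
{
	/* 32-bit registers are expected to sit on a 32-bit boundary. */
	assert(reg_phys % sizeof(uint32_t) == 0);

	if (!reg_phys)
		return -1;
	if (rd_status)
		*rd_status = reg_phys +
			     offsetof(struct demo_regs, rd_int_status);
	if (rd_clear)
		*rd_clear = reg_phys +
			    offsetof(struct demo_regs, rd_int_clear);
	return 0;
}
```

With this invented layout and a base of 0x1000, the status register lands
at 0x100c and the clear register at 0x1010.
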
 drivers/dma/dw-edma/dw-edma-core.c           | 23 ++++++++++++++++++++
 drivers/pci/controller/dwc/pcie-designware.c |  1 +
 include/linux/dma/edma.h                     | 10 +++++++++
 3 files changed, 34 insertions(+)

diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
index 7cf9e5e74a89..28cc319e224d 100644
--- a/drivers/dma/dw-edma/dw-edma-core.c
+++ b/drivers/dma/dw-edma/dw-edma-core.c
@@ -20,6 +20,7 @@
 #include "dw-edma-core.h"
 #include "dw-edma-v0-core.h"
 #include "dw-hdma-v0-core.h"
+#include "dw-edma-v0-regs.h"
 #include "../dmaengine.h"
 #include "../virt-dma.h"
 
@@ -903,6 +904,28 @@ static int dw_edma_irq_request(struct dw_edma *dw,
 	return err;
 }
 
+int dw_edma_selfirq_offsets(struct dw_edma *dw,
+			    resource_size_t *rd_status_off,
+			    resource_size_t *rd_clear_off)
+{
+	struct dw_edma_chip *chip;
+
+	if (!dw)
+		return -ENODEV;
+
+	chip = dw->chip;
+	if (dw->chip->mf == EDMA_MF_EDMA_LEGACY || dw->chip->mf == EDMA_MF_HDMA_NATIVE)
+		return -EOPNOTSUPP;
+	if (rd_status_off)
+		*rd_status_off = (uintptr_t)chip->reg_phys_addr +
+				 offsetof(struct dw_edma_v0_regs, rd_int_status);
+	if (rd_clear_off)
+		*rd_clear_off = (uintptr_t)chip->reg_phys_addr +
+				offsetof(struct dw_edma_v0_regs, rd_int_clear);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(dw_edma_selfirq_offsets);
+
 int dw_edma_register_selfirq(struct dw_edma *dw,
 			     dw_edma_selfirq_fn fn, void *data)
 {
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 89aad5a08928..8233cc26249f 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -162,6 +162,7 @@ int dw_pcie_get_resources(struct dw_pcie *pci)
 			pci->edma.reg_base = devm_ioremap_resource(pci->dev, res);
 			if (IS_ERR(pci->edma.reg_base))
 				return PTR_ERR(pci->edma.reg_base);
+			pci->edma.reg_phys_addr = res->start;
 		} else if (pci->atu_size >= 2 * DEFAULT_DBI_DMA_OFFSET) {
 			pci->edma.reg_base = pci->atu_base + DEFAULT_DBI_DMA_OFFSET;
 		}
diff --git a/include/linux/dma/edma.h b/include/linux/dma/edma.h
index 42daf9a76b56..1f11b70e1b1a 100644
--- a/include/linux/dma/edma.h
+++ b/include/linux/dma/edma.h
@@ -85,6 +85,7 @@ struct dw_edma_chip {
 	u32			flags;
 
 	void __iomem		*reg_base;
+	resource_size_t		reg_phys_addr;
 
 	u16			ll_wr_cnt;
 	u16			ll_rd_cnt;
@@ -107,6 +108,9 @@ typedef void (*dw_edma_selfirq_fn)(struct dw_edma *dw, void *data);
 #if IS_REACHABLE(CONFIG_DW_EDMA)
 int dw_edma_probe(struct dw_edma_chip *chip);
 int dw_edma_remove(struct dw_edma_chip *chip);
+int dw_edma_selfirq_offsets(struct dw_edma *dw,
+			    resource_size_t *rd_status_off,
+			    resource_size_t *rd_clear_off);
 int dw_edma_register_selfirq(struct dw_edma *dw,
 			     dw_edma_selfirq_fn fn, void *data);
 void dw_edma_unregister_selfirq(struct dw_edma *dw,
@@ -121,6 +125,12 @@ static inline int dw_edma_remove(struct dw_edma_chip *chip)
 {
 	return 0;
 }
+static inline int dw_edma_selfirq_offsets(struct dw_edma *dw,
+					  resource_size_t *rd_status_off,
+					  resource_size_t *rd_clear_off)
+{
+	return -EOPNOTSUPP;
+}
 static inline int dw_edma_register_selfirq(struct dw_edma *dw,
 					   dw_edma_selfirq_fn fn, void *data)
 {
-- 
2.48.1


* [RFC PATCH 17/25] dmaengine: dw-edma: Add dw_edma_find_by_child() helper
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (15 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 16/25] dmaengine: dw-edma: Expose self-IRQ register offsets Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 18/25] NTB: core: Add .get_pci_epc() to ntb_dev_ops Koichiro Den
                   ` (9 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Add a helper to locate a dw_edma instance by its child device pointer.
It is used by PCI endpoint functions to locate the shared eDMA controller.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
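The lookup rule, "the controller whose device is the parent of the given
child", can be sketched in userspace as below. The types and functions
are hypothetical reductions, with no locking and a singly linked list in
place of the mutex-protected list_head used by the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced device with only a parent link. */
struct demo_device {
	struct demo_device *parent;
};

/* Hypothetical controller instance kept on a global-style list. */
struct demo_edma {
	struct demo_device *dev;
	struct demo_edma *next;
};

/* Push a controller onto the front of the list. */
static void demo_edma_add(struct demo_edma **list, struct demo_edma *dw)
{
	assert(list && dw && dw->dev);
	dw->next = *list;
	*list = dw;
}

/* Return the controller whose device is child's parent, or NULL. */
static struct demo_edma *demo_find_by_child(struct demo_edma *list,
					    const struct demo_device *child)
{
	if (!child)
		return NULL;
	for (; list; list = list->next)
		if (child->parent == list->dev)
			return list;
	return NULL;
}
```

A child with no parent never matches, because demo_edma_add() refuses
controllers without a device.
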
 drivers/dma/dw-edma/dw-edma-core.c | 26 ++++++++++++++++++++++++++
 drivers/dma/dw-edma/dw-edma-core.h |  2 ++
 include/linux/dma/edma.h           |  5 +++++
 3 files changed, 33 insertions(+)

diff --git a/drivers/dma/dw-edma/dw-edma-core.c b/drivers/dma/dw-edma/dw-edma-core.c
index 28cc319e224d..6c7495504456 100644
--- a/drivers/dma/dw-edma/dw-edma-core.c
+++ b/drivers/dma/dw-edma/dw-edma-core.c
@@ -24,6 +24,9 @@
 #include "../dmaengine.h"
 #include "../virt-dma.h"
 
+static DEFINE_MUTEX(dw_edma_list_lock);
+static LIST_HEAD(dw_edma_list);
+
 static inline
 struct dw_edma_desc *vd2dw_edma_desc(struct virt_dma_desc *vd)
 {
@@ -964,6 +967,22 @@ void dw_edma_unregister_selfirq(struct dw_edma *dw,
 }
 EXPORT_SYMBOL_GPL(dw_edma_unregister_selfirq);
 
+struct dw_edma *dw_edma_find_by_child(struct device *child)
+{
+	struct dw_edma *dw;
+
+	if (!child)
+		return NULL;
+
+	guard(mutex)(&dw_edma_list_lock);
+	list_for_each_entry(dw, &dw_edma_list, node)
+		if (child->parent == dw->dma.dev)
+			return dw;
+
+	return NULL;
+}
+EXPORT_SYMBOL_GPL(dw_edma_find_by_child);
+
 int dw_edma_probe(struct dw_edma_chip *chip)
 {
 	struct device *dev;
@@ -1035,6 +1054,10 @@ int dw_edma_probe(struct dw_edma_chip *chip)
 
 	chip->dw = dw;
 
+	INIT_LIST_HEAD(&dw->node);
+	guard(mutex)(&dw_edma_list_lock);
+	list_add_tail(&dw->node, &dw_edma_list);
+
 	return 0;
 
 err_irq_free:
@@ -1080,6 +1103,9 @@ int dw_edma_remove(struct dw_edma_chip *chip)
 		list_del(&chan->vc.chan.device_node);
 	}
 
+	guard(mutex)(&dw_edma_list_lock);
+	list_del(&dw->node);
+
 	return 0;
 }
 EXPORT_SYMBOL_GPL(dw_edma_remove);
diff --git a/drivers/dma/dw-edma/dw-edma-core.h b/drivers/dma/dw-edma/dw-edma-core.h
index 7d7dd9f13863..249d7e153cbf 100644
--- a/drivers/dma/dw-edma/dw-edma-core.h
+++ b/drivers/dma/dw-edma/dw-edma-core.h
@@ -122,6 +122,8 @@ struct dw_edma {
 
 	struct list_head selfirq_handlers;
 	spinlock_t selfirq_lock;
+
+	struct list_head		node;
 };
 
 typedef void (*dw_edma_handler_t)(struct dw_edma_chan *);
diff --git a/include/linux/dma/edma.h b/include/linux/dma/edma.h
index 1f11b70e1b1a..abc59ffde62c 100644
--- a/include/linux/dma/edma.h
+++ b/include/linux/dma/edma.h
@@ -115,6 +115,7 @@ int dw_edma_register_selfirq(struct dw_edma *dw,
 			     dw_edma_selfirq_fn fn, void *data);
 void dw_edma_unregister_selfirq(struct dw_edma *dw,
 				dw_edma_selfirq_fn fn, void *data);
+struct dw_edma *dw_edma_find_by_child(struct device *child);
 #else
 static inline int dw_edma_probe(struct dw_edma_chip *chip)
 {
@@ -141,6 +142,10 @@ static inline void dw_edma_unregister_selfirq(struct dw_edma *dw,
 					      dw_edma_selfirq_fn fn, void *data)
 {
 }
+static inline struct dw_edma *dw_edma_find_by_child(struct device *child)
+{
+	return NULL;
+}
 #endif /* CONFIG_DW_EDMA */
 
 #endif /* _DW_EDMA_H */
-- 
2.48.1


* [RFC PATCH 18/25] NTB: core: Add .get_pci_epc() to ntb_dev_ops
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (16 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 17/25] dmaengine: dw-edma: Add dw_edma_find_by_child() helper Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 19/25] NTB: epf: vntb: Implement .get_pci_epc() callback Koichiro Den
                   ` (8 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

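Add an optional get_pci_epc() callback to retrieve the underlying
pci_epc device associated with the NTB implementation. Drivers that
have no endpoint controller behind them can leave the callback unset,
in which case ntb_get_pci_epc() returns NULL.

Since <linux/ntb.h> now includes <linux/pci-epc.h>, drop the local
enum pci_barno from ntb_hw_epf.c and use the shared definition from
<linux/pci-epf.h> instead.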

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/hw/epf/ntb_hw_epf.c | 11 +----------
 include/linux/ntb.h             | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+), 10 deletions(-)
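
For reviewers who want to poke at the optional-callback pattern outside
the kernel, here is a minimal userspace sketch. It is not the kernel
code: the struct layouts, the mock_* names and the epc field in
struct ntb_dev are assumptions made up for illustration. Only the
NULL-check in the wrapper mirrors what this patch adds.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the shape matters. */
struct pci_epc { int id; };
struct ntb_dev;

struct ntb_dev_ops {
	/* Optional: drivers with no EPC behind them leave this NULL. */
	struct pci_epc *(*get_pci_epc)(struct ntb_dev *ntb);
};

struct ntb_dev {
	const struct ntb_dev_ops *ops;
	struct pci_epc *epc;	/* mock-only field, not in the real struct */
};

/* Same logic as the wrapper in the patch: no callback means "no EPC". */
static inline struct pci_epc *ntb_get_pci_epc(struct ntb_dev *ntb)
{
	if (!ntb->ops->get_pci_epc)
		return NULL;
	return ntb->ops->get_pci_epc(ntb);
}

/* Mock driver callback, loosely modelled on an EPF-backed NTB. */
static struct pci_epc *mock_get_pci_epc(struct ntb_dev *ntb)
{
	return ntb->epc;
}

/* One ops table that implements the callback, one that omits it. */
static const struct ntb_dev_ops mock_epf_ops = {
	.get_pci_epc = mock_get_pci_epc,
};
static const struct ntb_dev_ops mock_plain_ops = {
	.get_pci_epc = NULL,
};
```

Callers only need a single NULL test on the result, whether the driver
lacks the callback or implements it and has nothing to return.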

diff --git a/drivers/ntb/hw/epf/ntb_hw_epf.c b/drivers/ntb/hw/epf/ntb_hw_epf.c
index a3ec411bfe49..d55ce6b0fad4 100644
--- a/drivers/ntb/hw/epf/ntb_hw_epf.c
+++ b/drivers/ntb/hw/epf/ntb_hw_epf.c
@@ -9,6 +9,7 @@
 #include <linux/delay.h>
 #include <linux/module.h>
 #include <linux/pci.h>
+#include <linux/pci-epf.h>
 #include <linux/slab.h>
 #include <linux/ntb.h>
 
@@ -49,16 +50,6 @@
 
 #define NTB_EPF_COMMAND_TIMEOUT	1000 /* 1 Sec */
 
-enum pci_barno {
-	NO_BAR = -1,
-	BAR_0,
-	BAR_1,
-	BAR_2,
-	BAR_3,
-	BAR_4,
-	BAR_5,
-};
-
 enum epf_ntb_bar {
 	BAR_CONFIG,
 	BAR_PEER_SPAD,
diff --git a/include/linux/ntb.h b/include/linux/ntb.h
index dc5aab43abc2..9f819c7383a3 100644
--- a/include/linux/ntb.h
+++ b/include/linux/ntb.h
@@ -59,11 +59,13 @@
 #include <linux/completion.h>
 #include <linux/device.h>
 #include <linux/interrupt.h>
+#include <linux/pci-epc.h>
 
 struct ntb_client;
 struct ntb_dev;
 struct ntb_msi;
 struct pci_dev;
+struct pci_epc;
 
 /**
  * enum ntb_topo - NTB connection topology
@@ -256,6 +258,7 @@ static inline int ntb_ctx_ops_is_valid(const struct ntb_ctx_ops *ops)
  * @msg_clear_mask:	See ntb_msg_clear_mask().
  * @msg_read:		See ntb_msg_read().
  * @peer_msg_write:	See ntb_peer_msg_write().
+ * @get_pci_epc:	See ntb_get_pci_epc().
  */
 struct ntb_dev_ops {
 	int (*port_number)(struct ntb_dev *ntb);
@@ -331,6 +334,7 @@ struct ntb_dev_ops {
 	int (*msg_clear_mask)(struct ntb_dev *ntb, u64 mask_bits);
 	u32 (*msg_read)(struct ntb_dev *ntb, int *pidx, int midx);
 	int (*peer_msg_write)(struct ntb_dev *ntb, int pidx, int midx, u32 msg);
+	struct pci_epc *(*get_pci_epc)(struct ntb_dev *ntb);
 };
 
 static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
@@ -393,6 +397,9 @@ static inline int ntb_dev_ops_is_valid(const struct ntb_dev_ops *ops)
 		/* !ops->msg_clear_mask == !ops->msg_count	&& */
 		!ops->msg_read == !ops->msg_count		&&
 		!ops->peer_msg_write == !ops->msg_count		&&
+
+		/* Miscellaneous optional callbacks */
+		/* ops->get_pci_epc			&& */
 		1;
 }
 
@@ -1567,6 +1574,21 @@ static inline int ntb_peer_msg_write(struct ntb_dev *ntb, int pidx, int midx,
 	return ntb->ops->peer_msg_write(ntb, pidx, midx, msg);
 }
 
+/**
+ * ntb_get_pci_epc() - get backing PCI endpoint controller if possible.
+ * @ntb:	NTB device context.
+ *
+ * Get the backing PCI endpoint controller representation.
+ *
+ * Return: Pointer to the pci_epc instance if available, or %NULL if not.
+ */
+static inline struct pci_epc __maybe_unused *ntb_get_pci_epc(struct ntb_dev *ntb)
+{
+	if (!ntb->ops->get_pci_epc)
+		return NULL;
+	return ntb->ops->get_pci_epc(ntb);
+}
+
 /**
  * ntb_peer_resource_idx() - get a resource index for a given peer idx
  * @ntb:	NTB device context.
-- 
2.48.1


* [RFC PATCH 19/25] NTB: epf: vntb: Implement .get_pci_epc() callback
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (17 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 18/25] NTB: core: Add .get_pci_epc() to ntb_dev_ops Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 20/25] NTB: ntb_transport: Rename use_msi to use_intr (keep alias) Koichiro Den
                   ` (7 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Implement the new get_pci_epc() operation for the EPF vNTB driver to
expose its associated EPC device to NTB subsystems.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
index 6495f99ffd4f..e3acea19f473 100644
--- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
+++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
@@ -1446,6 +1446,15 @@ static int vntb_epf_link_disable(struct ntb_dev *ntb)
 	return 0;
 }
 
+static struct pci_epc *vntb_epf_get_pci_epc(struct ntb_dev *ntb)
+{
+	struct epf_ntb *ndev = ntb_ndev(ntb);
+
+	if (!ndev || !ndev->epf)
+		return NULL;
+	return ndev->epf->epc;
+}
+
 static const struct ntb_dev_ops vntb_epf_ops = {
 	.mw_count		= vntb_epf_mw_count,
 	.spad_count		= vntb_epf_spad_count,
@@ -1467,6 +1476,7 @@ static const struct ntb_dev_ops vntb_epf_ops = {
 	.db_clear_mask		= vntb_epf_db_clear_mask,
 	.db_clear		= vntb_epf_db_clear,
 	.link_disable		= vntb_epf_link_disable,
+	.get_pci_epc		= vntb_epf_get_pci_epc,
 };
 
 static int pci_vntb_probe(struct pci_dev *pdev, const struct pci_device_id *id)
-- 
2.48.1


* [RFC PATCH 20/25] NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (18 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 19/25] NTB: epf: vntb: Implement .get_pci_epc() callback Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 21/25] NTB: Introduce generic interrupt backend abstraction and convert MSI Koichiro Den
                   ` (6 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Replace the module parameter use_msi with use_intr as a more generic
interrupt selector, while keeping use_msi as a deprecated alias for
compatibility.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/ntb_transport.c | 26 +++++++++++++++-----------
 1 file changed, 15 insertions(+), 11 deletions(-)
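
To illustrate what the alias means in practice, here is a small
userspace model. It does not use the real module parameter machinery:
struct mock_param, mock_params[] and mock_param_set() are hypothetical
names invented for this sketch. The point is only that both names
resolve to the same storage, so a write through either one is visible
through the other. In the patch itself the use_msi alias exists only
when CONFIG_NTB_MSI is set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of a module parameter table. */
struct mock_param {
	const char *name;
	bool *value;
};

static bool use_intr;

/* Both names point at the same variable, as module_param_named() arranges. */
static const struct mock_param mock_params[] = {
	{ "use_intr", &use_intr },
	{ "use_msi",  &use_intr },	/* deprecated alias */
};

/* Set a parameter by name; returns 0 on success, -1 if the name is unknown. */
static int mock_param_set(const char *name, bool val)
{
	size_t i;

	for (i = 0; i < sizeof(mock_params) / sizeof(mock_params[0]); i++) {
		if (strcmp(mock_params[i].name, name) == 0) {
			*mock_params[i].value = val;
			return 0;
		}
	}
	return -1;
}
```

Existing setups that pass use_msi=1 therefore keep working and end up
on the same code path as use_intr=1.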

diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
index d9fc450ef497..4695eb5e6831 100644
--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -97,10 +97,14 @@ static bool use_dma;
 module_param(use_dma, bool, 0644);
 MODULE_PARM_DESC(use_dma, "Use DMA engine to perform large data copy");
 
-static bool use_msi;
+static bool use_intr;
+module_param(use_intr, bool, 0644);
+MODULE_PARM_DESC(use_intr, "Use peer-triggerable interrupts (MSI if available, otherwise provider fallback)");
+
+/* Backward-compat: keep 'use_msi' as an alias to 'use_intr'. Marked deprecated */
 #ifdef CONFIG_NTB_MSI
-module_param(use_msi, bool, 0644);
-MODULE_PARM_DESC(use_msi, "Use MSI interrupts instead of doorbells");
+module_param_named(use_msi, use_intr, bool, 0644);
+MODULE_PARM_DESC(use_msi, "DEPRECATED: same as use_intr (will be removed after grace period)");
 #endif
 
 static struct dentry *nt_debugfs_dir;
@@ -236,7 +240,7 @@ struct ntb_transport_ctx {
 	u64 qp_bitmap;
 	u64 qp_bitmap_free;
 
-	bool use_msi;
+	bool use_intr;
 	unsigned int msi_spad_offset;
 	u64 msi_db_mask;
 
@@ -704,7 +708,7 @@ static void ntb_transport_setup_qp_peer_msi(struct ntb_transport_ctx *nt,
 	struct ntb_transport_qp *qp = &nt->qp_vec[qp_num];
 	int spad = qp_num * 2 + nt->msi_spad_offset;
 
-	if (!nt->use_msi)
+	if (!nt->use_intr)
 		return;
 
 	if (spad >= ntb_spad_count(nt->ndev))
@@ -742,7 +746,7 @@ static void ntb_transport_setup_qp_msi(struct ntb_transport_ctx *nt,
 	ntb_spad_write(qp->ndev, spad, INTR_INVALID_ADDR_OFFSET);
 	ntb_spad_write(qp->ndev, spad + 1, INTR_INVALID_DATA);
 
-	if (!nt->use_msi)
+	if (!nt->use_intr)
 		return;
 
 	if (spad >= ntb_spad_count(nt->ndev)) {
@@ -1067,13 +1071,13 @@ static void ntb_transport_link_work(struct work_struct *work)
 
 	/* send the local info, in the opposite order of the way we read it */
 
-	if (nt->use_msi) {
+	if (nt->use_intr) {
 		rc = ntb_msi_setup_mws(ndev);
 		if (rc) {
 			dev_warn(&pdev->dev,
 				 "Failed to register MSI memory window: %d\n",
 				 rc);
-			nt->use_msi = false;
+			nt->use_intr = false;
 		}
 	}
 
@@ -1316,11 +1320,11 @@ static int ntb_transport_probe(struct ntb_client *self, struct ntb_dev *ndev)
 	 * If we are using MSI, and have at least one extra memory window,
 	 * we will reserve the last MW for the MSI window.
 	 */
-	if (use_msi && mw_count > 1) {
+	if (use_intr && mw_count > 1) {
 		rc = ntb_msi_init(ndev, ntb_transport_msi_desc_changed);
 		if (!rc) {
 			mw_count -= 1;
-			nt->use_msi = true;
+			nt->use_intr = true;
 		}
 	}
 
@@ -1369,7 +1373,7 @@ static int ntb_transport_probe(struct ntb_client *self, struct ntb_dev *ndev)
 	qp_bitmap = ntb_db_valid_mask(ndev);
 
 	qp_count = ilog2(qp_bitmap);
-	if (nt->use_msi) {
+	if (nt->use_intr) {
 		qp_count -= 1;
 		nt->msi_db_mask = BIT_ULL(qp_count);
 		ntb_db_clear_mask(ndev, nt->msi_db_mask);
-- 
2.48.1


* [RFC PATCH 21/25] NTB: Introduce generic interrupt backend abstraction and convert MSI
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (19 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 20/25] NTB: ntb_transport: Rename use_msi to use_intr (keep alias) Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 22/25] NTB: ntb_transport: Rename MSI symbols to generic interrupt form Koichiro Den
                   ` (5 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC (permalink / raw)
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Refactor interrupt handling into a new ntb_intr_backend abstraction
layer, and migrate the MSI implementation to use it as a backend. This
enables alternate backends such as DW eDMA test interrupts.

No functional changes intended.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/Kconfig             |   5 ++
 drivers/ntb/Makefile            |   5 +-
 drivers/ntb/intr_common.c       |  55 ++++++++++++
 drivers/ntb/msi.c               | 145 ++++++++++++++++++--------------
 drivers/ntb/ntb_transport.c     |  36 ++++----
 drivers/ntb/test/ntb_msi_test.c |  26 +++---
 include/linux/ntb.h             |  85 ++++++++++++-------
 7 files changed, 231 insertions(+), 126 deletions(-)
 create mode 100644 drivers/ntb/intr_common.c
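
As a reading aid, the select-then-dispatch flow of intr_common.c can be
modelled in userspace roughly as below. This is a sketch under stated
assumptions, not the kernel implementation: every mock_* name is
hypothetical, the ops table is cut down to two callbacks, and the
msi_available flag merely stands in for the pdev->dev.msi.data check.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct mock_ntb;

/* Reduced, hypothetical version of struct ntb_intr_backend. */
struct mock_intr_backend {
	const char *name;
	int (*init)(struct mock_ntb *ntb);
	int (*peer_trigger)(struct mock_ntb *ntb, int peer);
};

struct mock_ntb {
	bool msi_available;	/* stands in for pdev->dev.msi.data */
	const struct mock_intr_backend *intr_backend;
	int triggers;		/* counts mock peer interrupts */
};

static int mock_msi_init(struct mock_ntb *ntb)
{
	ntb->triggers = 0;
	return 0;
}

static int mock_msi_peer_trigger(struct mock_ntb *ntb, int peer)
{
	(void)peer;		/* a real backend would pick the peer window */
	ntb->triggers++;
	return 0;
}

static const struct mock_intr_backend mock_msi_backend = {
	.name = "msi",
	.init = mock_msi_init,
	.peer_trigger = mock_msi_peer_trigger,
};

/* Select a backend, then delegate; -ENODEV when nothing is usable. */
static int mock_intr_init(struct mock_ntb *ntb)
{
	if (ntb->msi_available)
		ntb->intr_backend = &mock_msi_backend;
	if (!ntb->intr_backend)
		return -ENODEV;
	return ntb->intr_backend->init(ntb);
}

/* Thin dispatcher; only valid after mock_intr_init() has succeeded. */
static int mock_intr_peer_trigger(struct mock_ntb *ntb, int peer)
{
	return ntb->intr_backend->peer_trigger(ntb, peer);
}
```

As in the patch, the dispatchers do not re-check intr_backend, so
clients are expected to call them only after a successful init and to
fall back to doorbells otherwise.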

diff --git a/drivers/ntb/Kconfig b/drivers/ntb/Kconfig
index df16c755b4da..2f22f44245b3 100644
--- a/drivers/ntb/Kconfig
+++ b/drivers/ntb/Kconfig
@@ -13,9 +13,13 @@ menuconfig NTB
 
 if NTB
 
+config NTB_INTR_COMMON
+	bool
+
 config NTB_MSI
 	bool "MSI Interrupt Support"
 	depends on PCI_MSI
+	select NTB_INTR_COMMON
 	help
 	 Support using MSI interrupt forwarding instead of (or in addition to)
 	 hardware doorbells. MSI interrupts typically offer lower latency
@@ -24,6 +28,7 @@ config NTB_MSI
 	 in the hardware driver for creating the MSI interrupts.
 
 	 If unsure, say N.
+
 source "drivers/ntb/hw/Kconfig"
 
 source "drivers/ntb/test/Kconfig"
diff --git a/drivers/ntb/Makefile b/drivers/ntb/Makefile
index 3a6fa181ff99..feaa2a77cbf6 100644
--- a/drivers/ntb/Makefile
+++ b/drivers/ntb/Makefile
@@ -2,5 +2,6 @@
 obj-$(CONFIG_NTB) += ntb.o hw/ test/
 obj-$(CONFIG_NTB_TRANSPORT) += ntb_transport.o
 
-ntb-y			:= core.o
-ntb-$(CONFIG_NTB_MSI)	+= msi.o
+ntb-y				:= core.o
+ntb-$(CONFIG_NTB_INTR_COMMON)	+= intr_common.o
+ntb-$(CONFIG_NTB_MSI)		+= msi.o
diff --git a/drivers/ntb/intr_common.c b/drivers/ntb/intr_common.c
new file mode 100644
index 000000000000..e0e296fd3e3c
--- /dev/null
+++ b/drivers/ntb/intr_common.c
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+
+#include <linux/ntb.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/slab.h>
+
+int ntb_intr_init(struct ntb_dev *ntb,
+		  void (*desc_changed)(void *ctx))
+{
+#ifdef CONFIG_NTB_MSI
+	if (ntb->pdev->dev.msi.data) {
+		ntb->intr_backend = ntb_intr_msi_backend();
+		dev_info(&ntb->dev, "NTB interrupt MSI backend selected.\n");
+	}
+#endif
+	if (!ntb->intr_backend)
+		return -ENODEV;
+	return ntb->intr_backend->init(ntb, desc_changed);
+}
+EXPORT_SYMBOL_GPL(ntb_intr_init);
+
+int ntb_intr_setup_mws(struct ntb_dev *ntb)
+{
+	return ntb->intr_backend->setup_mws(ntb);
+}
+EXPORT_SYMBOL_GPL(ntb_intr_setup_mws);
+
+void ntb_intr_clear_mws(struct ntb_dev *ntb)
+{
+	ntb->intr_backend->clear_mws(ntb);
+}
+EXPORT_SYMBOL_GPL(ntb_intr_clear_mws);
+
+int ntb_intr_request_irq(struct ntb_dev *ntb, irq_handler_t h,
+			 const char *name, void *dev_id,
+			 struct ntb_intr_desc *d)
+{
+	return ntb->intr_backend->request_irq(ntb, h, name, dev_id, d);
+}
+EXPORT_SYMBOL_GPL(ntb_intr_request_irq);
+
+void ntb_intr_free_irq(struct ntb_dev *ntb, int irq, void *dev_id,
+		       struct ntb_intr_desc *d)
+{
+	ntb->intr_backend->free_irq(ntb, irq, dev_id, d);
+}
+EXPORT_SYMBOL_GPL(ntb_intr_free_irq);
+
+int ntb_intr_peer_trigger(struct ntb_dev *ntb, int peer,
+			  struct ntb_intr_desc *d)
+{
+	return ntb->intr_backend->peer_trigger(ntb, peer, d);
+}
+EXPORT_SYMBOL_GPL(ntb_intr_peer_trigger);
diff --git a/drivers/ntb/msi.c b/drivers/ntb/msi.c
index 983725d4eb13..cdc3ff6040c8 100644
--- a/drivers/ntb/msi.c
+++ b/drivers/ntb/msi.c
@@ -28,11 +28,12 @@ struct ntb_msi {
  *
  * Return: Zero on success, otherwise a negative error number.
  */
-int ntb_msi_init(struct ntb_dev *ntb,
-		 void (*desc_changed)(void *ctx))
+static int ntb_msi_init(struct ntb_dev *ntb,
+			void (*desc_changed)(void *ctx))
 {
 	phys_addr_t mw_phys_addr;
 	resource_size_t mw_size;
+	struct ntb_msi *msi;
 	int peer_widx;
 	int peers;
 	int ret;
@@ -42,12 +43,12 @@ int ntb_msi_init(struct ntb_dev *ntb,
 	if (peers <= 0)
 		return -EINVAL;
 
-	ntb->msi = devm_kzalloc(&ntb->dev, struct_size(ntb->msi, peer_mws, peers),
+	msi = devm_kzalloc(&ntb->dev, struct_size(msi, peer_mws, peers),
 				GFP_KERNEL);
-	if (!ntb->msi)
+	if (!msi)
 		return -ENOMEM;
 
-	ntb->msi->desc_changed = desc_changed;
+	msi->desc_changed = desc_changed;
 
 	for (i = 0; i < peers; i++) {
 		peer_widx = ntb_peer_mw_count(ntb) - 1 - i;
@@ -57,26 +58,26 @@ int ntb_msi_init(struct ntb_dev *ntb,
 		if (ret)
 			goto unroll;
 
-		ntb->msi->peer_mws[i] = devm_ioremap(&ntb->dev, mw_phys_addr,
+		msi->peer_mws[i] = devm_ioremap(&ntb->dev, mw_phys_addr,
 						     mw_size);
-		if (!ntb->msi->peer_mws[i]) {
+		if (!msi->peer_mws[i]) {
 			ret = -EFAULT;
 			goto unroll;
 		}
 	}
 
+	ntb->intr_priv = msi;
+
 	return 0;
 
 unroll:
 	for (i = 0; i < peers; i++)
-		if (ntb->msi->peer_mws[i])
-			devm_iounmap(&ntb->dev, ntb->msi->peer_mws[i]);
+		if (msi->peer_mws[i])
+			devm_iounmap(&ntb->dev, msi->peer_mws[i]);
 
-	devm_kfree(&ntb->dev, ntb->msi);
-	ntb->msi = NULL;
+	devm_kfree(&ntb->dev, msi);
 	return ret;
 }
-EXPORT_SYMBOL(ntb_msi_init);
 
 /**
  * ntb_msi_setup_mws() - Initialize the MSI inbound memory windows
@@ -92,7 +93,7 @@ EXPORT_SYMBOL(ntb_msi_init);
  *
  * Return: Zero on success, otherwise a negative error number.
  */
-int ntb_msi_setup_mws(struct ntb_dev *ntb)
+static int ntb_msi_setup_mws(struct ntb_dev *ntb)
 {
 	struct msi_desc *desc;
 	u64 addr;
@@ -100,13 +101,14 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 	resource_size_t addr_align, size_align, offset;
 	resource_size_t mw_size = SZ_32K;
 	resource_size_t mw_min_size = mw_size;
+	struct ntb_msi *msi = ntb->intr_priv;
 	int i;
 	int ret;
 
-	if (!ntb->msi)
+	if (!msi)
 		return -EINVAL;
 
-	if (ntb->msi->base_addr)
+	if (msi->base_addr)
 		return 0;
 
 	scoped_guard (msi_descs_lock, &ntb->pdev->dev) {
@@ -149,8 +151,8 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 			goto error_out;
 	}
 
-	ntb->msi->base_addr = addr;
-	ntb->msi->end_addr = addr + mw_min_size;
+	msi->base_addr = addr;
+	msi->end_addr = addr + mw_min_size;
 
 	return 0;
 
@@ -165,7 +167,6 @@ int ntb_msi_setup_mws(struct ntb_dev *ntb)
 
 	return ret;
 }
-EXPORT_SYMBOL(ntb_msi_setup_mws);
 
 /**
  * ntb_msi_clear_mws() - Clear all inbound memory windows
@@ -173,7 +174,7 @@ EXPORT_SYMBOL(ntb_msi_setup_mws);
  *
  * This function tears down the resources used by ntb_msi_setup_mws().
  */
-void ntb_msi_clear_mws(struct ntb_dev *ntb)
+static void ntb_msi_clear_mws(struct ntb_dev *ntb)
 {
 	int peer;
 	int peer_widx;
@@ -186,33 +187,33 @@ void ntb_msi_clear_mws(struct ntb_dev *ntb)
 		ntb_mw_clear_trans(ntb, peer, peer_widx);
 	}
 }
-EXPORT_SYMBOL(ntb_msi_clear_mws);
 
 struct ntb_msi_devres {
 	struct ntb_dev *ntb;
 	struct msi_desc *entry;
-	struct ntb_msi_desc *msi_desc;
+	struct ntb_intr_desc *intr_desc;
 };
 
 static int ntb_msi_set_desc(struct ntb_dev *ntb, struct msi_desc *entry,
-			    struct ntb_msi_desc *msi_desc, u16 vector_offset)
+			    struct ntb_intr_desc *intr_desc, u16 vector_offset)
 {
+	struct ntb_msi *msi = ntb->intr_priv;
 	u64 addr;
 
 	addr = entry->msg.address_lo +
 		((uint64_t)entry->msg.address_hi << 32);
 
-	if (addr < ntb->msi->base_addr || addr >= ntb->msi->end_addr) {
+	if (addr < msi->base_addr || addr >= msi->end_addr) {
 		dev_warn_once(&ntb->dev,
 			      "IRQ %d: MSI Address not within the memory window (%llx, [%llx %llx])\n",
-			      entry->irq, addr, ntb->msi->base_addr,
-			      ntb->msi->end_addr);
+			      entry->irq, addr, msi->base_addr,
+			      msi->end_addr);
 		return -EFAULT;
 	}
 
-	msi_desc->addr_offset = addr - ntb->msi->base_addr;
-	msi_desc->data = entry->msg.data + vector_offset;
-	msi_desc->vector_offset = vector_offset;
+	intr_desc->addr_offset = addr - msi->base_addr;
+	intr_desc->data = entry->msg.data + vector_offset;
+	intr_desc->vector_offset = vector_offset;
 
 	return 0;
 }
@@ -220,12 +221,13 @@ static int ntb_msi_set_desc(struct ntb_dev *ntb, struct msi_desc *entry,
 static void ntb_msi_write_msg(struct msi_desc *entry, void *data)
 {
 	struct ntb_msi_devres *dr = data;
+	struct ntb_msi *msi = dr->ntb->intr_priv;
 
-	WARN_ON(ntb_msi_set_desc(dr->ntb, entry, dr->msi_desc,
-				 dr->msi_desc->vector_offset));
+	WARN_ON(ntb_msi_set_desc(dr->ntb, entry, dr->intr_desc,
+				 dr->intr_desc->vector_offset));
 
-	if (dr->ntb->msi->desc_changed)
-		dr->ntb->msi->desc_changed(dr->ntb->ctx);
+	if (msi->desc_changed)
+		msi->desc_changed(dr->ntb->ctx);
 }
 
 static void ntbm_msi_callback_release(struct device *dev, void *res)
@@ -237,7 +239,7 @@ static void ntbm_msi_callback_release(struct device *dev, void *res)
 }
 
 static int ntbm_msi_setup_callback(struct ntb_dev *ntb, struct msi_desc *entry,
-				   struct ntb_msi_desc *msi_desc)
+				   struct ntb_intr_desc *intr_desc)
 {
 	struct ntb_msi_devres *dr;
 
@@ -248,7 +250,7 @@ static int ntbm_msi_setup_callback(struct ntb_dev *ntb, struct msi_desc *entry,
 
 	dr->ntb = ntb;
 	dr->entry = entry;
-	dr->msi_desc = msi_desc;
+	dr->intr_desc = intr_desc;
 
 	devres_add(&ntb->dev, dr);
 
@@ -259,14 +261,12 @@ static int ntbm_msi_setup_callback(struct ntb_dev *ntb, struct msi_desc *entry,
 }
 
 /**
- * ntbm_msi_request_threaded_irq() - allocate an MSI interrupt
+ * ntb_msi_request_irq() - allocate an MSI interrupt
  * @ntb:	NTB device context
  * @handler:	Function to be called when the IRQ occurs
- * @thread_fn:  Function to be called in a threaded interrupt context. NULL
- *              for clients which handle everything in @handler
- * @name:    An ascii name for the claiming device, dev_name(dev) if NULL
- * @dev_id:     A cookie passed back to the handler function
- * @msi_desc:	MSI descriptor data which triggers the interrupt
+ * @name:	An ascii name for the claiming device, dev_name(dev) if NULL
+ * @dev_id:	A cookie passed back to the handler function
+ * @intr_desc:	Generic interrupt descriptor
  *
  * This function assigns an interrupt handler to an unused
  * MSI interrupt and returns the descriptor used to trigger
@@ -281,19 +281,15 @@ static int ntbm_msi_setup_callback(struct ntb_dev *ntb, struct msi_desc *entry,
  *
  * Return: IRQ number assigned on success, otherwise a negative error number.
  */
-int ntbm_msi_request_threaded_irq(struct ntb_dev *ntb, irq_handler_t handler,
-				  irq_handler_t thread_fn,
-				  const char *name, void *dev_id,
-				  struct ntb_msi_desc *msi_desc)
+static int ntb_msi_request_irq(struct ntb_dev *ntb, irq_handler_t handler,
+			       const char *name, void *dev_id,
+			       struct ntb_intr_desc *intr_desc)
 {
 	struct device *dev = &ntb->pdev->dev;
 	struct msi_desc *entry;
 	unsigned int virq;
 	int ret, i;
 
-	if (!ntb->msi)
-		return -EINVAL;
-
 	guard(msi_descs_lock)(dev);
 	msi_for_each_desc(entry, dev, MSI_DESC_ASSOCIATED) {
 		for (i = 0; i < entry->nvec_used; i++) {
@@ -301,18 +297,17 @@ int ntbm_msi_request_threaded_irq(struct ntb_dev *ntb, irq_handler_t handler,
 			if (irq_has_action(virq))
 				continue;
 
-			ret = devm_request_threaded_irq(
-					&ntb->dev, virq, handler,
-					thread_fn, 0, name, dev_id);
+			ret = devm_request_irq(&ntb->dev, virq, handler,
+					       0, name, dev_id);
 			if (ret)
 				continue;
 
-			if (ntb_msi_set_desc(ntb, entry, msi_desc, i)) {
+			if (ntb_msi_set_desc(ntb, entry, intr_desc, i)) {
 				devm_free_irq(&ntb->dev, virq, dev_id);
 				continue;
 			}
 
-			ret = ntbm_msi_setup_callback(ntb, entry, msi_desc);
+			ret = ntbm_msi_setup_callback(ntb, entry, intr_desc);
 			if (ret) {
 				devm_free_irq(&ntb->dev, virq, dev_id);
 				return ret;
@@ -322,7 +317,23 @@ int ntbm_msi_request_threaded_irq(struct ntb_dev *ntb, irq_handler_t handler,
 	}
 	return -ENODEV;
 }
-EXPORT_SYMBOL(ntbm_msi_request_threaded_irq);
+
+/**
+ * ntb_msi_free_irq() - free an MSI interrupt
+ * @ntb:	NTB device context
+ * @irq:	IRQ number assigned
+ * @dev_id:	A cookie passed back to the handler function
+ * @desc:	Generic interrupt descriptor
+ *
+ * Free an IRQ assigned by ntb_msi_request_irq().
+ *
+ * Return: void
+ */
+static void ntb_msi_free_irq(struct ntb_dev *ntb, int irq, void *dev_id,
+			     struct ntb_intr_desc *desc)
+{
+	devm_free_irq(&ntb->dev, irq, dev_id);
+}
 
 /**
  * ntb_msi_peer_trigger() - Trigger an interrupt handler on a peer
@@ -336,18 +347,30 @@ EXPORT_SYMBOL(ntbm_msi_request_threaded_irq);
  *
  * Return: Zero on success, otherwise a negative error number.
  */
-int ntb_msi_peer_trigger(struct ntb_dev *ntb, int peer,
-			 struct ntb_msi_desc *desc)
+static int ntb_msi_peer_trigger(struct ntb_dev *ntb, int peer,
+				struct ntb_intr_desc *desc)
 {
+	struct ntb_msi *msi = ntb->intr_priv;
 	int idx;
 
-	if (!ntb->msi)
-		return -EINVAL;
+	idx = desc->addr_offset / sizeof(*msi->peer_mws[peer]);
 
-	idx = desc->addr_offset / sizeof(*ntb->msi->peer_mws[peer]);
-
-	iowrite32(desc->data, &ntb->msi->peer_mws[peer][idx]);
+	iowrite32(desc->data, &msi->peer_mws[peer][idx]);
 
 	return 0;
 }
-EXPORT_SYMBOL(ntb_msi_peer_trigger);
+
+static const struct ntb_intr_backend ntb_intr_backend_msi = {
+	.name = "msi",
+	.init = ntb_msi_init,
+	.setup_mws = ntb_msi_setup_mws,
+	.clear_mws = ntb_msi_clear_mws,
+	.request_irq = ntb_msi_request_irq,
+	.free_irq = ntb_msi_free_irq,
+	.peer_trigger = ntb_msi_peer_trigger,
+};
+
+const struct ntb_intr_backend *ntb_intr_msi_backend(void)
+{
+	return &ntb_intr_backend_msi;
+}
diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
index 4695eb5e6831..ff4a149680c5 100644
--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -205,8 +205,8 @@ struct ntb_transport_qp {
 
 	bool use_msi;
 	int msi_irq;
-	struct ntb_msi_desc msi_desc;
-	struct ntb_msi_desc peer_msi_desc;
+	struct ntb_intr_desc intr_desc;
+	struct ntb_intr_desc peer_intr_desc;
 };
 
 struct ntb_transport_mw {
@@ -714,16 +714,16 @@ static void ntb_transport_setup_qp_peer_msi(struct ntb_transport_ctx *nt,
 	if (spad >= ntb_spad_count(nt->ndev))
 		return;
 
-	qp->peer_msi_desc.addr_offset =
+	qp->peer_intr_desc.addr_offset =
 		ntb_peer_spad_read(qp->ndev, PIDX, spad);
-	qp->peer_msi_desc.data =
+	qp->peer_intr_desc.data =
 		ntb_peer_spad_read(qp->ndev, PIDX, spad + 1);
 
 	dev_dbg(&qp->ndev->pdev->dev, "QP%d Peer MSI addr=%x data=%x\n",
-		qp_num, qp->peer_msi_desc.addr_offset, qp->peer_msi_desc.data);
+		qp_num, qp->peer_intr_desc.addr_offset, qp->peer_intr_desc.data);
 
-	if (qp->peer_msi_desc.addr_offset == INTR_INVALID_ADDR_OFFSET ||
-	    qp->peer_msi_desc.data == INTR_INVALID_DATA)
+	if (qp->peer_intr_desc.addr_offset == INTR_INVALID_ADDR_OFFSET ||
+	    qp->peer_intr_desc.data == INTR_INVALID_DATA)
 		dev_info(&qp->ndev->pdev->dev,
 			 "Invalid addr_offset or data, falling back to doorbell\n");
 	else {
@@ -756,9 +756,9 @@ static void ntb_transport_setup_qp_msi(struct ntb_transport_ctx *nt,
 	}
 
 	if (!qp->msi_irq) {
-		qp->msi_irq = ntbm_msi_request_irq(qp->ndev, ntb_transport_isr,
+		qp->msi_irq = ntb_intr_request_irq(qp->ndev, ntb_transport_isr,
 						   KBUILD_MODNAME, qp,
-						   &qp->msi_desc);
+						   &qp->intr_desc);
 		if (qp->msi_irq < 0) {
 			dev_warn(&qp->ndev->pdev->dev,
 				 "Unable to allocate MSI interrupt for qp%d\n",
@@ -767,22 +767,22 @@ static void ntb_transport_setup_qp_msi(struct ntb_transport_ctx *nt,
 		}
 	}
 
-	rc = ntb_spad_write(qp->ndev, spad, qp->msi_desc.addr_offset);
+	rc = ntb_spad_write(qp->ndev, spad, qp->intr_desc.addr_offset);
 	if (rc)
 		goto err_free_interrupt;
 
-	rc = ntb_spad_write(qp->ndev, spad + 1, qp->msi_desc.data);
+	rc = ntb_spad_write(qp->ndev, spad + 1, qp->intr_desc.data);
 	if (rc)
 		goto err_free_interrupt;
 
 	dev_dbg(&qp->ndev->pdev->dev, "QP%d MSI %d addr=%x data=%x\n",
-		qp_num, qp->msi_irq, qp->msi_desc.addr_offset,
-		qp->msi_desc.data);
+		qp_num, qp->msi_irq, qp->intr_desc.addr_offset,
+		qp->intr_desc.data);
 
 	return;
 
 err_free_interrupt:
-	devm_free_irq(&nt->ndev->dev, qp->msi_irq, qp);
+	ntb_intr_free_irq(qp->ndev, qp->msi_irq, qp, &qp->intr_desc);
 }
 
 static void ntb_transport_msi_peer_desc_changed(struct ntb_transport_ctx *nt)
@@ -795,7 +795,7 @@ static void ntb_transport_msi_peer_desc_changed(struct ntb_transport_ctx *nt)
 		ntb_transport_setup_qp_peer_msi(nt, i);
 }
 
-static void ntb_transport_msi_desc_changed(void *data)
+static void ntb_transport_intr_desc_changed(void *data)
 {
 	struct ntb_transport_ctx *nt = data;
 	int i;
@@ -1072,7 +1072,7 @@ static void ntb_transport_link_work(struct work_struct *work)
 	/* send the local info, in the opposite order of the way we read it */
 
 	if (nt->use_intr) {
-		rc = ntb_msi_setup_mws(ndev);
+		rc = ntb_intr_setup_mws(ndev);
 		if (rc) {
 			dev_warn(&pdev->dev,
 				 "Failed to register MSI memory window: %d\n",
@@ -1321,7 +1321,7 @@ static int ntb_transport_probe(struct ntb_client *self, struct ntb_dev *ndev)
 	 * we will reserve the last MW for the MSI window.
 	 */
 	if (use_intr && mw_count > 1) {
-		rc = ntb_msi_init(ndev, ntb_transport_msi_desc_changed);
+		rc = ntb_intr_init(ndev, ntb_transport_intr_desc_changed);
 		if (!rc) {
 			mw_count -= 1;
 			nt->use_intr = true;
@@ -1803,7 +1803,7 @@ static void ntb_tx_copy_callback(void *data,
 	iowrite32(entry->flags | DESC_DONE_FLAG, &hdr->flags);
 
 	if (qp->use_msi)
-		ntb_msi_peer_trigger(qp->ndev, PIDX, &qp->peer_msi_desc);
+		ntb_intr_peer_trigger(qp->ndev, PIDX, &qp->peer_intr_desc);
 	else
 		ntb_peer_db_set(qp->ndev, BIT_ULL(qp->qp_num));
 
diff --git a/drivers/ntb/test/ntb_msi_test.c b/drivers/ntb/test/ntb_msi_test.c
index 4e18e08776c9..d037892e752e 100644
--- a/drivers/ntb/test/ntb_msi_test.c
+++ b/drivers/ntb/test/ntb_msi_test.c
@@ -26,7 +26,7 @@ struct ntb_msit_ctx {
 		int irq_num;
 		int occurrences;
 		struct ntb_msit_ctx *nm;
-		struct ntb_msi_desc desc;
+		struct ntb_intr_desc desc;
 	} *isr_ctx;
 
 	struct ntb_msit_peer {
@@ -34,7 +34,7 @@ struct ntb_msit_ctx {
 		int pidx;
 		int num_irqs;
 		struct completion init_comp;
-		struct ntb_msi_desc *msi_desc;
+		struct ntb_intr_desc *intr_desc;
 	} peers[];
 };
 
@@ -62,7 +62,7 @@ static void ntb_msit_setup_work(struct work_struct *work)
 	int ret;
 	uintptr_t i;
 
-	ret = ntb_msi_setup_mws(nm->ntb);
+	ret = ntb_intr_setup_mws(nm->ntb);
 	if (ret) {
 		dev_err(&nm->ntb->dev, "Unable to setup MSI windows: %d\n",
 			ret);
@@ -74,7 +74,7 @@ static void ntb_msit_setup_work(struct work_struct *work)
 		nm->isr_ctx[i].nm = nm;
 
 		if (!nm->isr_ctx[i].irq_num) {
-			irq = ntbm_msi_request_irq(nm->ntb, ntb_msit_isr,
+			irq = ntb_intr_request_irq(nm->ntb, ntb_msit_isr,
 						   KBUILD_MODNAME,
 						   &nm->isr_ctx[i],
 						   &nm->isr_ctx[i].desc);
@@ -131,7 +131,7 @@ static void ntb_msit_link_event(void *ctx)
 static void ntb_msit_copy_peer_desc(struct ntb_msit_ctx *nm, int peer)
 {
 	int i;
-	struct ntb_msi_desc *desc = nm->peers[peer].msi_desc;
+	struct ntb_intr_desc *desc = nm->peers[peer].intr_desc;
 	int irq_count = nm->peers[peer].num_irqs;
 
 	for (i = 0; i < irq_count; i++) {
@@ -149,7 +149,7 @@ static void ntb_msit_copy_peer_desc(struct ntb_msit_ctx *nm, int peer)
 static void ntb_msit_db_event(void *ctx, int vec)
 {
 	struct ntb_msit_ctx *nm = ctx;
-	struct ntb_msi_desc *desc;
+	struct ntb_intr_desc *desc;
 	u64 peer_mask = ntb_db_read(nm->ntb);
 	u32 irq_count;
 	int peer;
@@ -168,8 +168,8 @@ static void ntb_msit_db_event(void *ctx, int vec)
 		if (!desc)
 			continue;
 
-		kfree(nm->peers[peer].msi_desc);
-		nm->peers[peer].msi_desc = desc;
+		kfree(nm->peers[peer].intr_desc);
+		nm->peers[peer].intr_desc = desc;
 		nm->peers[peer].num_irqs = irq_count;
 
 		ntb_msit_copy_peer_desc(nm, peer);
@@ -191,8 +191,8 @@ static int ntb_msit_dbgfs_trigger(void *data, u64 idx)
 	dev_dbg(&peer->nm->ntb->dev, "trigger irq %llu on peer %u\n",
 		idx, peer->pidx);
 
-	return ntb_msi_peer_trigger(peer->nm->ntb, peer->pidx,
-				    &peer->msi_desc[idx]);
+	return ntb_intr_peer_trigger(peer->nm->ntb, peer->pidx,
+				     &peer->intr_desc[idx]);
 }
 
 DEFINE_DEBUGFS_ATTRIBUTE(ntb_msit_trigger_fops, NULL,
@@ -344,7 +344,7 @@ static int ntb_msit_probe(struct ntb_client *client, struct ntb_dev *ntb)
 		return ret;
 	}
 
-	ret = ntb_msi_init(ntb, ntb_msit_desc_changed);
+	ret = ntb_intr_init(ntb, ntb_msit_desc_changed);
 	if (ret) {
 		dev_err(&ntb->dev, "Unable to initialize MSI library: %d\n",
 			ret);
@@ -392,10 +392,10 @@ static void ntb_msit_remove(struct ntb_client *client, struct ntb_dev *ntb)
 
 	ntb_link_disable(ntb);
 	ntb_db_set_mask(ntb, ntb_db_valid_mask(ntb));
-	ntb_msi_clear_mws(ntb);
+	ntb_intr_clear_mws(ntb);
 
 	for (i = 0; i < ntb_peer_port_count(ntb); i++)
-		kfree(nm->peers[i].msi_desc);
+		kfree(nm->peers[i].intr_desc);
 
 	ntb_clear_ctx(ntb);
 	ntb_msit_remove_dbgfs(nm);
diff --git a/include/linux/ntb.h b/include/linux/ntb.h
index 9f819c7383a3..1a88fe45471e 100644
--- a/include/linux/ntb.h
+++ b/include/linux/ntb.h
@@ -63,7 +63,7 @@
 
 struct ntb_client;
 struct ntb_dev;
-struct ntb_msi;
+struct ntb_intr_backend;
 struct pci_dev;
 struct pci_epc;
 
@@ -438,8 +438,9 @@ struct ntb_dev {
 	/* block unregister until device is fully released */
 	struct completion		released;
 
-#ifdef CONFIG_NTB_MSI
-	struct ntb_msi *msi;
+#ifdef CONFIG_NTB_INTR_COMMON
+	void				*intr_priv;
+	const struct ntb_intr_backend	*intr_backend;
 #endif
 };
 #define dev_ntb(__dev) container_of((__dev), struct ntb_dev, dev)
@@ -1659,58 +1660,78 @@ static inline int ntb_peer_highest_mw_idx(struct ntb_dev *ntb, int pidx)
 	return ntb_mw_count(ntb, pidx) - ret - 1;
 }
 
-struct ntb_msi_desc {
+struct ntb_intr_desc {
 	u32 addr_offset;
 	u32 data;
 	u16 vector_offset;
 };
 
-#ifdef CONFIG_NTB_MSI
+struct ntb_intr_backend {
+	const char *name;
+	int (*init)(struct ntb_dev *ntb, void (*desc_changed)(void *ctx));
+	int (*setup_mws)(struct ntb_dev *ntb);
+	void (*clear_mws)(struct ntb_dev *ntb);
+	int (*request_irq)(struct ntb_dev *ntb, irq_handler_t handler,
+			   const char *name, void *dev_id,
+			   struct ntb_intr_desc *desc);
+	void (*free_irq)(struct ntb_dev *ntb, int irq, void *dev_id,
+			 struct ntb_intr_desc *desc);
+	int (*peer_trigger)(struct ntb_dev *ntb, int pidx,
+			    struct ntb_intr_desc *desc);
+	int (*peer_addr)(struct ntb_dev *ntb, int pidx,
+			 const struct ntb_intr_desc *local, phys_addr_t *addr);
+};
+
+#ifdef CONFIG_NTB_INTR_COMMON
 
-int ntb_msi_init(struct ntb_dev *ntb, void (*desc_changed)(void *ctx));
-int ntb_msi_setup_mws(struct ntb_dev *ntb);
-void ntb_msi_clear_mws(struct ntb_dev *ntb);
-int ntbm_msi_request_threaded_irq(struct ntb_dev *ntb, irq_handler_t handler,
-				  irq_handler_t thread_fn,
-				  const char *name, void *dev_id,
-				  struct ntb_msi_desc *msi_desc);
-int ntb_msi_peer_trigger(struct ntb_dev *ntb, int peer,
-			 struct ntb_msi_desc *desc);
+int ntb_intr_init(struct ntb_dev *ntb, void (*desc_changed)(void *ctx));
+int ntb_intr_setup_mws(struct ntb_dev *ntb);
+void ntb_intr_clear_mws(struct ntb_dev *ntb);
+int ntb_intr_request_irq(struct ntb_dev *ntb, irq_handler_t handler,
+			 const char *name, void *dev_id,
+			 struct ntb_intr_desc *intr_desc);
+void ntb_intr_free_irq(struct ntb_dev *ntb, int irq, void *dev_id,
+		       struct ntb_intr_desc *intr_desc);
+int ntb_intr_peer_trigger(struct ntb_dev *ntb, int peer,
+			  struct ntb_intr_desc *desc);
 
-#else /* not CONFIG_NTB_MSI */
+#else /* not CONFIG_NTB_INTR_COMMON */
 
-static inline int ntb_msi_init(struct ntb_dev *ntb,
+static inline int ntb_intr_init(struct ntb_dev *ntb,
 			       void (*desc_changed)(void *ctx))
 {
 	return -EOPNOTSUPP;
 }
-static inline int ntb_msi_setup_mws(struct ntb_dev *ntb)
+static inline int ntb_intr_setup_mws(struct ntb_dev *ntb)
 {
 	return -EOPNOTSUPP;
 }
-static inline void ntb_msi_clear_mws(struct ntb_dev *ntb) {}
-static inline int ntbm_msi_request_threaded_irq(struct ntb_dev *ntb,
-						irq_handler_t handler,
-						irq_handler_t thread_fn,
-						const char *name, void *dev_id,
-						struct ntb_msi_desc *msi_desc)
+static inline void ntb_intr_clear_mws(struct ntb_dev *ntb) {}
+static inline int ntb_intr_request_irq(struct ntb_dev *ntb,
+				       irq_handler_t handler,
+				       const char *name, void *dev_id,
+				       struct ntb_intr_desc *intr_desc)
 {
 	return -EOPNOTSUPP;
 }
-static inline int ntb_msi_peer_trigger(struct ntb_dev *ntb, int peer,
-				       struct ntb_msi_desc *desc)
+static inline void ntb_intr_free_irq(struct ntb_dev *ntb, int irq, void *dev_id,
+				     struct ntb_intr_desc *desc)
+{
+}
+static inline int ntb_intr_peer_trigger(struct ntb_dev *ntb, int peer,
+					struct ntb_intr_desc *desc)
 {
 	return -EOPNOTSUPP;
 }
-#endif /* CONFIG_NTB_MSI */
+#endif /* CONFIG_NTB_INTR_COMMON */
 
-static inline int ntbm_msi_request_irq(struct ntb_dev *ntb,
-				       irq_handler_t handler,
-				       const char *name, void *dev_id,
-				       struct ntb_msi_desc *msi_desc)
+#ifdef CONFIG_NTB_MSI
+extern const struct ntb_intr_backend *ntb_intr_msi_backend(void);
+#else
+static inline const struct ntb_intr_backend *ntb_intr_msi_backend(void)
 {
-	return ntbm_msi_request_threaded_irq(ntb, handler, NULL, name,
-					     dev_id, msi_desc);
+	return NULL;
 }
+#endif /* CONFIG_NTB_MSI */
 
 #endif
-- 
2.48.1
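
For readers following along, the ops-table idea in this patch can be sketched in plain userspace C. This is only an illustration: `struct intr_backend`, `struct fake_ntb`, `intr_init()` and the dummy triggers are hypothetical stand-ins, not the kernel definitions. The sketch assumes a missing backend is modeled as a NULL pointer (as the `ntb_intr_msi_backend()` stub does when CONFIG_NTB_MSI is off), and the -EOPNOTSUPP return for a missing op is an assumption as well.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for struct ntb_intr_backend. */
struct intr_backend {
	const char *name;
	int (*peer_trigger)(int pidx, unsigned int data);
};

/* Hypothetical stand-in for struct ntb_dev; only the backend pointer. */
struct fake_ntb {
	const struct intr_backend *intr_backend;
};

/* Dummy triggers: return a distinct id so a caller can tell them apart. */
static int msi_trigger(int pidx, unsigned int data)
{
	(void)pidx;
	(void)data;
	return 1;
}

static int edma_trigger(int pidx, unsigned int data)
{
	(void)pidx;
	(void)data;
	return 2;
}

static const struct intr_backend msi_backend = {
	.name = "msi",
	.peer_trigger = msi_trigger,
};

static const struct intr_backend edma_backend = {
	.name = "dw-edma-testirq",
	.peer_trigger = edma_trigger,
};

/* Prefer MSI, then the fallback; a NULL argument means "not built in". */
static int intr_init(struct fake_ntb *ntb, const struct intr_backend *msi,
		     const struct intr_backend *fallback)
{
	ntb->intr_backend = msi;
	if (!ntb->intr_backend)
		ntb->intr_backend = fallback;
	if (!ntb->intr_backend)
		return -ENODEV;
	return 0;
}

/* Generic wrapper: dispatch through whichever backend was selected. */
static int intr_peer_trigger(struct fake_ntb *ntb, int pidx, unsigned int data)
{
	if (!ntb->intr_backend || !ntb->intr_backend->peer_trigger)
		return -EOPNOTSUPP;
	return ntb->intr_backend->peer_trigger(pidx, data);
}
```

Callers such as ntb_transport only ever see the generic wrapper, which is why the follow-up rename patch can drop "msi" from the client-side names without changing behaviour.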



* [RFC PATCH 22/25] NTB: ntb_transport: Rename MSI symbols to generic interrupt form
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (20 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 21/25] NTB: Introduce generic interrupt backend abstraction and convert MSI Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 23/25] NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend Koichiro Den
                   ` (4 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Rename the remaining MSI-specific symbols and functions in ntb_transport
to reflect their new generic interrupt backend usage. This unifies
naming for both MSI-based and non-MSI backends.

No functional changes.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/ntb_transport.c | 84 ++++++++++++++++++-------------------
 1 file changed, 42 insertions(+), 42 deletions(-)

diff --git a/drivers/ntb/ntb_transport.c b/drivers/ntb/ntb_transport.c
index ff4a149680c5..a4f51c9f18b7 100644
--- a/drivers/ntb/ntb_transport.c
+++ b/drivers/ntb/ntb_transport.c
@@ -203,8 +203,8 @@ struct ntb_transport_qp {
 	u64 tx_memcpy;
 	u64 tx_async;
 
-	bool use_msi;
-	int msi_irq;
+	bool use_intr;
+	int irq;
 	struct ntb_intr_desc intr_desc;
 	struct ntb_intr_desc peer_intr_desc;
 };
@@ -241,8 +241,8 @@ struct ntb_transport_ctx {
 	u64 qp_bitmap_free;
 
 	bool use_intr;
-	unsigned int msi_spad_offset;
-	u64 msi_db_mask;
+	unsigned int intr_spad_offset;
+	u64 intr_db_mask;
 
 	bool link_is_up;
 	struct delayed_work link_work;
@@ -702,11 +702,11 @@ static irqreturn_t ntb_transport_isr(int irq, void *dev)
 	return IRQ_HANDLED;
 }
 
-static void ntb_transport_setup_qp_peer_msi(struct ntb_transport_ctx *nt,
-					    unsigned int qp_num)
+static void ntb_transport_setup_qp_peer_intr(struct ntb_transport_ctx *nt,
+					     unsigned int qp_num)
 {
 	struct ntb_transport_qp *qp = &nt->qp_vec[qp_num];
-	int spad = qp_num * 2 + nt->msi_spad_offset;
+	int spad = qp_num * 2 + nt->intr_spad_offset;
 
 	if (!nt->use_intr)
 		return;
@@ -719,7 +719,7 @@ static void ntb_transport_setup_qp_peer_msi(struct ntb_transport_ctx *nt,
 	qp->peer_intr_desc.data =
 		ntb_peer_spad_read(qp->ndev, PIDX, spad + 1);
 
-	dev_dbg(&qp->ndev->pdev->dev, "QP%d Peer MSI addr=%x data=%x\n",
+	dev_dbg(&qp->ndev->pdev->dev, "QP%d Peer interrupt addr=%x data=%x\n",
 		qp_num, qp->peer_intr_desc.addr_offset, qp->peer_intr_desc.data);
 
 	if (qp->peer_intr_desc.addr_offset == INTR_INVALID_ADDR_OFFSET ||
@@ -727,20 +727,20 @@ static void ntb_transport_setup_qp_peer_msi(struct ntb_transport_ctx *nt,
 		dev_info(&qp->ndev->pdev->dev,
 			 "Invalid addr_offset or data, falling back to doorbell\n");
 	else {
-		qp->use_msi = true;
+		qp->use_intr = true;
 		dev_info(&qp->ndev->pdev->dev,
-			 "Using MSI interrupts for QP%d\n", qp_num);
+			 "Using interrupts for QP%d\n", qp_num);
 	}
 }
 
-static void ntb_transport_setup_qp_msi(struct ntb_transport_ctx *nt,
-				       unsigned int qp_num, bool changed)
+static void ntb_transport_setup_qp_intr(struct ntb_transport_ctx *nt,
+					unsigned int qp_num, bool changed)
 {
 	struct ntb_transport_qp *qp = &nt->qp_vec[qp_num];
-	int spad = qp_num * 2 + nt->msi_spad_offset;
+	int spad = qp_num * 2 + nt->intr_spad_offset;
 	int rc;
 
-	if (!changed && qp->msi_irq)
+	if (!changed && qp->irq)
 		return;
 
 	ntb_spad_write(qp->ndev, spad, INTR_INVALID_ADDR_OFFSET);
@@ -751,17 +751,17 @@ static void ntb_transport_setup_qp_msi(struct ntb_transport_ctx *nt,
 
 	if (spad >= ntb_spad_count(nt->ndev)) {
 		dev_warn_once(&qp->ndev->pdev->dev,
-			      "Not enough SPADS to use MSI interrupts\n");
+			      "Not enough SPADS to use interrupts\n");
 		return;
 	}
 
-	if (!qp->msi_irq) {
-		qp->msi_irq = ntb_intr_request_irq(qp->ndev, ntb_transport_isr,
-						   KBUILD_MODNAME, qp,
-						   &qp->intr_desc);
-		if (qp->msi_irq < 0) {
+	if (!qp->irq) {
+		qp->irq = ntb_intr_request_irq(qp->ndev, ntb_transport_isr,
+					       KBUILD_MODNAME, qp,
+					       &qp->intr_desc);
+		if (qp->irq < 0) {
 			dev_warn(&qp->ndev->pdev->dev,
-				 "Unable to allocate MSI interrupt for qp%d\n",
+				 "Unable to allocate an interrupt for qp%d\n",
 				 qp_num);
 			return;
 		}
@@ -775,24 +775,24 @@ static void ntb_transport_setup_qp_msi(struct ntb_transport_ctx *nt,
 	if (rc)
 		goto err_free_interrupt;
 
-	dev_dbg(&qp->ndev->pdev->dev, "QP%d MSI %d addr=%x data=%x\n",
-		qp_num, qp->msi_irq, qp->intr_desc.addr_offset,
+	dev_dbg(&qp->ndev->pdev->dev, "QP%d Interrupt %d addr=%x data=%x\n",
+		qp_num, qp->irq, qp->intr_desc.addr_offset,
 		qp->intr_desc.data);
 
 	return;
 
 err_free_interrupt:
-	ntb_intr_free_irq(qp->ndev, qp->msi_irq, qp, &qp->intr_desc);
+	ntb_intr_free_irq(qp->ndev, qp->irq, qp, &qp->intr_desc);
 }
 
-static void ntb_transport_msi_peer_desc_changed(struct ntb_transport_ctx *nt)
+static void ntb_transport_intr_peer_desc_changed(struct ntb_transport_ctx *nt)
 {
 	int i;
 
-	dev_dbg(&nt->ndev->pdev->dev, "Peer MSI descriptors changed");
+	dev_dbg(&nt->ndev->pdev->dev, "Peer Interrupt descriptors changed");
 
 	for (i = 0; i < nt->qp_count; i++)
-		ntb_transport_setup_qp_peer_msi(nt, i);
+		ntb_transport_setup_qp_peer_intr(nt, i);
 }
 
 static void ntb_transport_intr_desc_changed(void *data)
@@ -800,12 +800,12 @@ static void ntb_transport_intr_desc_changed(void *data)
 	struct ntb_transport_ctx *nt = data;
 	int i;
 
-	dev_dbg(&nt->ndev->pdev->dev, "MSI descriptors changed");
+	dev_dbg(&nt->ndev->pdev->dev, "Interrupt descriptors changed");
 
 	for (i = 0; i < nt->qp_count; i++)
-		ntb_transport_setup_qp_msi(nt, i, true);
+		ntb_transport_setup_qp_intr(nt, i, true);
 
-	ntb_peer_db_set(nt->ndev, nt->msi_db_mask);
+	ntb_peer_db_set(nt->ndev, nt->intr_db_mask);
 }
 
 static void ntb_free_mw(struct ntb_transport_ctx *nt, int num_mw)
@@ -1075,14 +1075,14 @@ static void ntb_transport_link_work(struct work_struct *work)
 		rc = ntb_intr_setup_mws(ndev);
 		if (rc) {
 			dev_warn(&pdev->dev,
-				 "Failed to register MSI memory window: %d\n",
+				 "Failed to register Interrupt memory window: %d\n",
 				 rc);
 			nt->use_intr = false;
 		}
 	}
 
 	for (i = 0; i < nt->qp_count; i++)
-		ntb_transport_setup_qp_msi(nt, i, false);
+		ntb_transport_setup_qp_intr(nt, i, false);
 
 	for (i = 0; i < nt->mw_count; i++) {
 		size = nt->mw_vec[i].phys_size;
@@ -1141,7 +1141,7 @@ static void ntb_transport_link_work(struct work_struct *work)
 		struct ntb_transport_qp *qp = &nt->qp_vec[i];
 
 		ntb_transport_setup_qp_mw(nt, i);
-		ntb_transport_setup_qp_peer_msi(nt, i);
+		ntb_transport_setup_qp_peer_intr(nt, i);
 
 		if (qp->client_ready)
 			schedule_delayed_work(&qp->link_work, 0);
@@ -1317,8 +1317,8 @@ static int ntb_transport_probe(struct ntb_client *self, struct ntb_dev *ndev)
 	nt->ndev = ndev;
 
 	/*
-	 * If we are using MSI, and have at least one extra memory window,
-	 * we will reserve the last MW for the MSI window.
+	 * If we are using interrupts and have at least one extra memory window,
+	 * we will reserve the last MW for the interrupt window.
 	 */
 	if (use_intr && mw_count > 1) {
 		rc = ntb_intr_init(ndev, ntb_transport_intr_desc_changed);
@@ -1341,7 +1341,7 @@ static int ntb_transport_probe(struct ntb_client *self, struct ntb_dev *ndev)
 	max_mw_count_for_spads = (spad_count - MW0_SZ_HIGH) / 2;
 	nt->mw_count = min(mw_count, max_mw_count_for_spads);
 
-	nt->msi_spad_offset = nt->mw_count * 2 + MW0_SZ_HIGH;
+	nt->intr_spad_offset = nt->mw_count * 2 + MW0_SZ_HIGH;
 
 	nt->mw_vec = kcalloc_node(mw_count, sizeof(*nt->mw_vec),
 				  GFP_KERNEL, node);
@@ -1375,8 +1375,8 @@ static int ntb_transport_probe(struct ntb_client *self, struct ntb_dev *ndev)
 	qp_count = ilog2(qp_bitmap);
 	if (nt->use_intr) {
 		qp_count -= 1;
-		nt->msi_db_mask = BIT_ULL(qp_count);
-		ntb_db_clear_mask(ndev, nt->msi_db_mask);
+		nt->intr_db_mask = BIT_ULL(qp_count);
+		ntb_db_clear_mask(ndev, nt->intr_db_mask);
 	}
 
 	if (max_num_clients && max_num_clients < qp_count)
@@ -1802,7 +1802,7 @@ static void ntb_tx_copy_callback(void *data,
 
 	iowrite32(entry->flags | DESC_DONE_FLAG, &hdr->flags);
 
-	if (qp->use_msi)
+	if (qp->use_intr)
 		ntb_intr_peer_trigger(qp->ndev, PIDX, &qp->peer_intr_desc);
 	else
 		ntb_peer_db_set(qp->ndev, BIT_ULL(qp->qp_num));
@@ -2477,9 +2477,9 @@ static void ntb_transport_doorbell_callback(void *data, int vector)
 	u64 db_bits;
 	unsigned int qp_num;
 
-	if (ntb_db_read(nt->ndev) & nt->msi_db_mask) {
-		ntb_transport_msi_peer_desc_changed(nt);
-		ntb_db_clear(nt->ndev, nt->msi_db_mask);
+	if (ntb_db_read(nt->ndev) & nt->intr_db_mask) {
+		ntb_transport_intr_peer_desc_changed(nt);
+		ntb_db_clear(nt->ndev, nt->intr_db_mask);
 	}
 
 	db_bits = (nt->qp_bitmap & ~nt->qp_bitmap_free &
-- 
2.48.1
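
As a reading aid, the scratchpad indexing that the renamed `intr_spad_offset` field drives can be sketched in userspace C. The helper names are hypothetical, and `MW0_SZ_HIGH = 4` is an assumption about the scratchpad enum in ntb_transport.c that is not shown in this diff. The bounds check mirrors the `spad >= ntb_spad_count()` test in the patch.

```c
#include <stdbool.h>

/* Assumed value of MW0_SZ_HIGH in ntb_transport's scratchpad enum. */
#define MW0_SZ_HIGH 4

/* First scratchpad index after the per-MW size pairs. */
static unsigned int intr_spad_offset(unsigned int mw_count)
{
	return mw_count * 2 + MW0_SZ_HIGH;
}

/*
 * Each QP owns two consecutive scratchpads: addr_offset at the returned
 * index and data at the index after it.
 */
static unsigned int qp_intr_spad(unsigned int qp_num, unsigned int mw_count)
{
	return qp_num * 2 + intr_spad_offset(mw_count);
}

/* Mirrors the patch's check: usable only if spad < spad_count. */
static bool qp_intr_spad_fits(unsigned int qp_num, unsigned int mw_count,
			      unsigned int spad_count)
{
	return qp_intr_spad(qp_num, mw_count) < spad_count;
}
```

With one memory window, QP0 would use scratchpads 6 and 7 and QP3 would use 12 and 13 under these assumptions; a QP whose first index is out of range stays on doorbells.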



* [RFC PATCH 23/25] NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (21 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 22/25] NTB: ntb_transport: Rename MSI symbols to generic interrupt form Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 24/25] NTB: epf: Add MW2 for interrupt use on Renesas R-Car Koichiro Den
                   ` (3 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Introduce a new NTB interrupt backend using DW eDMA's emulated interrupt
mechanism. This enables interrupt-based signaling from RC to EP where MSI is
impossible due to security restrictions on the platform.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/Kconfig        |  10 ++
 drivers/ntb/Makefile       |   1 +
 drivers/ntb/intr_common.c  |   8 +-
 drivers/ntb/intr_dw_edma.c | 253 +++++++++++++++++++++++++++++++++++++
 include/linux/ntb.h        |  10 ++
 5 files changed, 281 insertions(+), 1 deletion(-)
 create mode 100644 drivers/ntb/intr_dw_edma.c

diff --git a/drivers/ntb/Kconfig b/drivers/ntb/Kconfig
index 2f22f44245b3..5b7e1563e639 100644
--- a/drivers/ntb/Kconfig
+++ b/drivers/ntb/Kconfig
@@ -29,6 +29,16 @@ config NTB_MSI
 
 	 If unsure, say N.
 
+config NTB_DW_EDMA
+	bool "DW eDMA test-interrupt backend"
+	depends on PCI_ENDPOINT && PCIE_DW_EP && DW_EDMA
+	select NTB_INTR_COMMON
+	help
+	 Use the DW eDMA v0 test interrupt as a doorbell-like backend
+	 for NTB transports when MSI is not available on the EPF side.
+
+	 If unsure, say N.
+
 source "drivers/ntb/hw/Kconfig"
 
 source "drivers/ntb/test/Kconfig"
diff --git a/drivers/ntb/Makefile b/drivers/ntb/Makefile
index feaa2a77cbf6..cae84d132b78 100644
--- a/drivers/ntb/Makefile
+++ b/drivers/ntb/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_NTB_TRANSPORT) += ntb_transport.o
 ntb-y				:= core.o
 ntb-$(CONFIG_NTB_INTR_COMMON)	+= intr_common.o
 ntb-$(CONFIG_NTB_MSI)		+= msi.o
+ntb-$(CONFIG_NTB_DW_EDMA)	+= intr_dw_edma.o
diff --git a/drivers/ntb/intr_common.c b/drivers/ntb/intr_common.c
index e0e296fd3e3c..41b2752c6d03 100644
--- a/drivers/ntb/intr_common.c
+++ b/drivers/ntb/intr_common.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
 
-#include <linux/ntb.h>
 #include <linux/module.h>
+#include <linux/ntb.h>
 #include <linux/pci.h>
 #include <linux/slab.h>
 
@@ -13,6 +13,12 @@ int ntb_intr_init(struct ntb_dev *ntb,
 		ntb->intr_backend = ntb_intr_msi_backend();
 		dev_info(&ntb->dev, "NTB interrupt MSI backend selected.\n");
 	}
+#endif
+#ifdef CONFIG_NTB_DW_EDMA
+	if (!ntb->intr_backend) {
+		ntb->intr_backend = ntb_intr_dw_edma_backend();
+		dev_info(&ntb->dev, "NTB interrupt DW eDMA backend selected.\n");
+	}
 #endif
 	if (!ntb->intr_backend)
 		return -ENODEV;
diff --git a/drivers/ntb/intr_dw_edma.c b/drivers/ntb/intr_dw_edma.c
new file mode 100644
index 000000000000..0e408ecfaf61
--- /dev/null
+++ b/drivers/ntb/intr_dw_edma.c
@@ -0,0 +1,253 @@
+// SPDX-License-Identifier: (GPL-2.0 OR BSD-3-Clause)
+
+#include <linux/dma/edma.h>
+#include <linux/ntb.h>
+#include <linux/pci-epc.h>
+
+struct ntb_intr_dw {
+	u64 base_addr;
+	u64 end_addr;
+
+	struct dw_edma *edma;
+	resource_size_t rd_status_off;
+	resource_size_t rd_clear_off;
+
+	u32 __iomem *peer_mws[];
+};
+
+struct ntb_intr_dw_ctx {
+	irq_handler_t handler;
+	void *dev;
+	struct dw_edma *edma;
+};
+
+static void dw_edma_selfirq_handler(struct dw_edma *dw, void *data)
+{
+	struct ntb_intr_dw_ctx *ctx = data;
+
+	ctx->handler(0, ctx->dev);
+}
+
+static int dw_edma_find_backend_for_ntb(struct ntb_dev *ntb, struct ntb_intr_dw *intr_dw)
+{
+	struct pci_epc *epc = NULL;
+
+	epc = ntb_get_pci_epc(ntb);
+	if (!epc)
+		return -ENODEV;
+	intr_dw->edma = dw_edma_find_by_child(&epc->dev);
+	if (!intr_dw->edma)
+		return -ENODEV;
+	dw_edma_selfirq_offsets(intr_dw->edma, &intr_dw->rd_status_off, &intr_dw->rd_clear_off);
+	return 0;
+}
+
+static int dw_intr_init(struct ntb_dev *ntb, void (*desc_changed)(void *ctx))
+{
+	struct ntb_intr_dw *intr_dw;
+	phys_addr_t mw_phys_addr;
+	resource_size_t mw_size;
+	int peer_widx;
+	int peers;
+	int ret;
+	int i;
+
+	peers = ntb_peer_port_count(ntb);
+	if (peers <= 0)
+		return -EINVAL;
+
+	intr_dw = devm_kzalloc(&ntb->dev, struct_size(intr_dw, peer_mws, peers),
+			       GFP_KERNEL);
+	if (!intr_dw)
+		return -ENOMEM;
+
+	ret = dw_edma_find_backend_for_ntb(ntb, intr_dw);
+	if (ret) {
+		devm_kfree(&ntb->dev, intr_dw);
+		return ret;
+	}
+
+	for (i = 0; i < peers; i++) {
+		peer_widx = ntb_peer_mw_count(ntb) - 1 - i;
+
+		ret = ntb_peer_mw_get_addr(ntb, peer_widx, &mw_phys_addr,
+					   &mw_size);
+		if (ret)
+			goto unroll;
+
+		intr_dw->peer_mws[i] = devm_ioremap(&ntb->dev, mw_phys_addr,
+						    mw_size);
+		if (!intr_dw->peer_mws[i]) {
+			ret = -EFAULT;
+			goto unroll;
+		}
+	}
+
+	ntb->intr_priv = intr_dw;
+
+	return 0;
+
+unroll:
+	for (i = 0; i < peers; i++)
+		if (intr_dw->peer_mws[i])
+			devm_iounmap(&ntb->dev, intr_dw->peer_mws[i]);
+
+	devm_kfree(&ntb->dev, intr_dw);
+	return ret;
+}
+
+static int dw_intr_setup_mws(struct ntb_dev *ntb)
+{
+	struct ntb_intr_dw *dwc = ntb->intr_priv;
+	resource_size_t addr_align, size_align, offset;
+	resource_size_t mw_size = SZ_32K;
+	resource_size_t mw_min_size = mw_size;
+	u64 addr = dwc->rd_status_off;
+	int peer, peer_widx, ret;
+	int i;
+
+	for (peer = 0; peer < ntb_peer_port_count(ntb); peer++) {
+		peer_widx = ntb_peer_highest_mw_idx(ntb, peer);
+		if (peer_widx < 0)
+			return peer_widx;
+
+		ret = ntb_mw_get_align(ntb, peer, peer_widx, &addr_align,
+				       NULL, NULL, NULL);
+		if (ret)
+			return ret;
+
+		addr &= ~(addr_align - 1);
+	}
+
+	for (peer = 0; peer < ntb_peer_port_count(ntb); peer++) {
+		peer_widx = ntb_peer_highest_mw_idx(ntb, peer);
+		if (peer_widx < 0) {
+			ret = peer_widx;
+			goto error_out;
+		}
+
+		ret = ntb_mw_get_align(ntb, peer, peer_widx, NULL,
+				       &size_align, NULL, &offset);
+		if (ret)
+			goto error_out;
+
+		mw_size = round_up(mw_size, size_align);
+		if (mw_size < mw_min_size)
+			mw_min_size = mw_size;
+
+		ret = ntb_mw_set_trans(ntb, peer, peer_widx,
+				       addr, mw_size, offset);
+		if (ret)
+			goto error_out;
+	}
+
+	dwc->base_addr = addr;
+	dwc->end_addr = addr + mw_min_size;
+
+	return 0;
+
+error_out:
+	for (i = 0; i < peer; i++) {
+		peer_widx = ntb_peer_highest_mw_idx(ntb, i);
+		if (peer_widx < 0)
+			continue;
+
+		ntb_mw_clear_trans(ntb, i, peer_widx);
+	}
+
+	return ret;
+}
+
+static void dw_intr_clear_mws(struct ntb_dev *ntb)
+{
+	int peer, peer_widx;
+
+	for (peer = 0; peer < ntb_peer_port_count(ntb); peer++) {
+		peer_widx = ntb_peer_highest_mw_idx(ntb, peer);
+		if (peer_widx < 0)
+			continue;
+
+		ntb_mw_clear_trans(ntb, peer, peer_widx);
+	}
+}
+
+static void dw_intr_release_irq(void *data)
+{
+	struct ntb_intr_dw_ctx *ctx = data;
+
+	dw_edma_unregister_selfirq(ctx->edma, dw_edma_selfirq_handler, ctx);
+	kfree(ctx);
+}
+
+static int dw_intr_request_irq(struct ntb_dev *ntb, irq_handler_t h,
+			       const char *name, void *dev_id,
+			       struct ntb_intr_desc *intr_desc)
+{
+	struct ntb_intr_dw *dwc = ntb->intr_priv;
+	struct dw_edma *edma = dwc->edma;
+	int ret;
+
+	if (intr_desc->ctx)
+		return 1;
+
+	struct ntb_intr_dw_ctx *ctx __free(kfree) = kzalloc(
+						sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+	ctx->handler = h;
+	ctx->dev = dev_id;
+	ctx->edma = edma;
+
+	ret = dw_edma_register_selfirq(edma, dw_edma_selfirq_handler, ctx);
+	if (ret)
+		return ret;
+
+	ret = devm_add_action_or_reset(&ntb->dev, dw_intr_release_irq, ctx);
+	if (ret)
+		return ret;
+
+	intr_desc->addr_offset = dwc->rd_status_off - dwc->base_addr;
+	intr_desc->data = 0x0;
+	intr_desc->ctx = no_free_ptr(ctx);
+	return 1;
+}
+
+static void dw_intr_free_irq(struct ntb_dev *ntb, int irq, void *dev_id,
+			     struct ntb_intr_desc *intr_desc)
+{
+	struct ntb_intr_dw *dwc = ntb->intr_priv;
+	struct dw_edma *edma = dwc->edma;
+	struct ntb_intr_dw_ctx *ctx;
+
+	ctx = intr_desc->ctx;
+	dw_edma_unregister_selfirq(edma, dw_edma_selfirq_handler, ctx);
+	devm_remove_action(&ntb->dev, dw_intr_release_irq, ctx);
+	kfree(ctx);
+}
+
+static int dw_intr_peer_trigger(struct ntb_dev *ntb, int peer, struct ntb_intr_desc *desc)
+{
+	struct ntb_intr_dw *intr_dw = ntb->intr_priv;
+	int idx;
+
+	idx = desc->addr_offset / sizeof(*intr_dw->peer_mws[peer]);
+
+	iowrite32(desc->data, &intr_dw->peer_mws[peer][idx]);
+
+	return 0;
+}
+
+static const struct ntb_intr_backend ntb_intr_backend_dw_edma = {
+	.name = "dw-edma-testirq",
+	.init = dw_intr_init,
+	.setup_mws = dw_intr_setup_mws,
+	.clear_mws = dw_intr_clear_mws,
+	.request_irq = dw_intr_request_irq,
+	.free_irq = dw_intr_free_irq,
+	.peer_trigger = dw_intr_peer_trigger,
+};
+
+const struct ntb_intr_backend *ntb_intr_dw_edma_backend(void)
+{
+	return &ntb_intr_backend_dw_edma;
+}
diff --git a/include/linux/ntb.h b/include/linux/ntb.h
index 1a88fe45471e..7daba67928e9 100644
--- a/include/linux/ntb.h
+++ b/include/linux/ntb.h
@@ -1664,6 +1664,7 @@ struct ntb_intr_desc {
 	u32 addr_offset;
 	u32 data;
 	u16 vector_offset;
+	void *ctx;
 };
 
 struct ntb_intr_backend {
@@ -1734,4 +1735,13 @@ static inline const struct ntb_intr_backend *ntb_intr_msi_backend(void)
 }
 #endif /* CONFIG_NTB_MSI */
 
+#ifdef CONFIG_NTB_DW_EDMA
+extern const struct ntb_intr_backend *ntb_intr_dw_edma_backend(void);
+#else
+static inline const struct ntb_intr_backend *ntb_intr_dw_edma_backend(void)
+{
+	return NULL;
+}
+#endif /* CONFIG_NTB_DW_EDMA */
+
 #endif
-- 
2.48.1
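
The address arithmetic in `dw_intr_setup_mws()`, `dw_intr_request_irq()` and `dw_intr_peer_trigger()` can be followed with a small userspace sketch. The helper names are hypothetical and the register address in the note below is made up for illustration. The sketch assumes power-of-two alignments, which is what the `addr &= ~(addr_align - 1)` idiom in the patch requires.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr down to a power-of-two alignment (addr &= ~(align - 1)). */
static uint64_t align_down(uint64_t addr, uint64_t align)
{
	return addr & ~(align - 1);
}

/* Byte offset of the status register inside the window starting at base. */
static uint32_t intr_addr_offset(uint64_t rd_status_off, uint64_t base)
{
	return (uint32_t)(rd_status_off - base);
}

/* Index into a 32-bit register array for a given byte offset. */
static size_t intr_word_index(uint32_t addr_offset)
{
	return addr_offset / sizeof(uint32_t);
}

/* True if a 32-bit write at addr_offset stays inside the window. */
static bool intr_offset_in_window(uint32_t addr_offset, uint64_t mw_size)
{
	return (uint64_t)addr_offset + sizeof(uint32_t) <= mw_size;
}
```

For a made-up register address of 0xE65D50A0 and a 4 KiB alignment, the window base becomes 0xE65D5000, the advertised `addr_offset` is 0xA0, and the peer writes word 40 of its mapped window, well inside the 32 KiB (SZ_32K) window the patch sets up.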



* [RFC PATCH 24/25] NTB: epf: Add MW2 for interrupt use on Renesas R-Car
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (22 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 23/25] NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:19 ` [RFC PATCH 25/25] Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset usage Koichiro Den
                   ` (2 subsequent siblings)
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

To enable interrupt support, one additional memory window is required.
Since a single BAR can now be split into multiple memory windows, add
MW2 to BAR2 on R-Car.

The pci_epf_vntb configfs settings may look like this:
  $ echo 2       > functions/pci_epf_vntb/func1/pci_epf_vntb.0/num_mws
  $ echo 0xF0000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1
  $ echo 0x8000  > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2
  $ echo 0xF0000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2_offset
  $ echo 2       > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1_bar
  $ echo 2       > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2_bar

Note that users who do not use interrupts can keep their existing
configfs settings, and these changes will not affect them.

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 drivers/ntb/hw/epf/ntb_hw_epf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ntb/hw/epf/ntb_hw_epf.c b/drivers/ntb/hw/epf/ntb_hw_epf.c
index d55ce6b0fad4..85165662a03b 100644
--- a/drivers/ntb/hw/epf/ntb_hw_epf.c
+++ b/drivers/ntb/hw/epf/ntb_hw_epf.c
@@ -750,7 +750,7 @@ static const enum pci_barno rcar_barno[NTB_BAR_NUM] = {
 	[BAR_PEER_SPAD]	= BAR_0,
 	[BAR_DB]	= BAR_4,
 	[BAR_MW1]	= BAR_2,
-	[BAR_MW2]	= NO_BAR,
+	[BAR_MW2]	= BAR_2,
 	[BAR_MW3]	= NO_BAR,
 	[BAR_MW4]	= NO_BAR,
 };
-- 
2.48.1
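
A quick way to sanity-check a shared-BAR layout like the one in the commit message is sketched below in userspace C. `struct mw_cfg` and both helpers are hypothetical; this is not how pci-epf-vntb validates its configfs input, only a check that two windows placed in the same BAR do not overlap and how many bytes of the BAR they span together.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one memory window placed inside a BAR. */
struct mw_cfg {
	int bar;		/* BAR number (mwN_bar) */
	uint64_t offset;	/* byte offset inside the BAR (mwN_offset) */
	uint64_t size;		/* window size in bytes (mwN) */
};

/* Two windows conflict only if they share a BAR and their ranges intersect. */
static bool mw_overlap(const struct mw_cfg *a, const struct mw_cfg *b)
{
	if (a->bar != b->bar)
		return false;
	return a->offset < b->offset + b->size &&
	       b->offset < a->offset + a->size;
}

/* Highest end offset of any window mapped into the given BAR. */
static uint64_t bar_span(const struct mw_cfg *mws, size_t n, int bar)
{
	uint64_t end = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (mws[i].bar != bar)
			continue;
		if (mws[i].offset + mws[i].size > end)
			end = mws[i].offset + mws[i].size;
	}
	return end;
}
```

With the values from the commit message (MW1: 0xF0000 bytes at offset 0, MW2: 0x8000 bytes at offset 0xF0000, both in BAR2), the two windows do not overlap and together span 0xF8000 bytes of the BAR.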



* [RFC PATCH 25/25] Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset usage
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (23 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 24/25] NTB: epf: Add MW2 for interrupt use on Renesas R-Car Koichiro Den
@ 2025-10-23  7:19 ` Koichiro Den
  2025-10-23  7:55 ` [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Jerome Brunet
  2025-10-24  3:27 ` Frank Li
  26 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-23  7:19 UTC
  To: ntb, linux-pci, dmaengine, linux-kernel
  Cc: mani, kwilczynski, kishon, bhelgaas, corbet, vkoul, jdmason,
	dave.jiang, allenbh, Basavaraj.Natikar, Shyam-sundar.S-k,
	kurt.schwemmer, logang, jingoohan1, lpieralisi, robh, jbrunet,
	Frank.Li, fancer.lancer, arnd, pstanner, elfring

Add a concrete example showing how to place two memory windows in the
same BAR (one for data, one for interrupts) by using 'mwN_offset' and
'mwN_bar'. This is useful when BAR resources are scarce and is aligned
with recent endpoint-side inbound mapping support.

Note that part of the `ls` delta covers a missing doc update for
attributes that were introduced in commit e7cd58d2fdf8 ("PCI:
endpoint: pci-epf-vntb: Allow BAR assignment via configfs").

Signed-off-by: Koichiro Den <den@valinux.co.jp>
---
 Documentation/PCI/endpoint/pci-vntb-howto.rst | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/Documentation/PCI/endpoint/pci-vntb-howto.rst b/Documentation/PCI/endpoint/pci-vntb-howto.rst
index 70d3bc90893f..142cf9244cc6 100644
--- a/Documentation/PCI/endpoint/pci-vntb-howto.rst
+++ b/Documentation/PCI/endpoint/pci-vntb-howto.rst
@@ -90,8 +90,10 @@ of the function device and is populated with the following NTB specific
 attributes that can be configured by the user::
 
 	# ls functions/pci_epf_vntb/func1/pci_epf_vntb.0/
-	db_count    mw1         mw2         mw3         mw4         num_mws
-	spad_count
+	ctrl_bar  mw1_bar     mw2_offset  mw4         spad_count
+	db_bar    mw1_offset  mw3         mw4_bar     vbus_number
+	db_count  mw2         mw3_bar     mw4_offset  vntb_pid
+	mw1       mw2_bar     mw3_offset  num_mws     vntb_vid
 
 A sample configuration for NTB function is given below::
 
@@ -106,6 +108,16 @@ A sample configuration for virtual NTB driver for virtual PCI bus::
 	# echo 0x080A > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_pid
 	# echo 0x10 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vbus_number
 
+When BAR resources are tight but you still want to enable interrupts (which
+require a dedicated MW in addition to the data MW), map both MWs into a
+single BAR via 'mwN_offset' and 'mwN_bar' as shown below::
+
+	# echo 0xF0000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1
+	# echo 0x8000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2
+	# echo 0xF0000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2_offset
+	# echo 2 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1_bar
+	# echo 2 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2_bar
+
 Binding pci-epf-ntb Device to EP Controller
 --------------------------------------------
 
-- 
2.48.1



* Re: [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (24 preceding siblings ...)
  2025-10-23  7:19 ` [RFC PATCH 25/25] Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset usage Koichiro Den
@ 2025-10-23  7:55 ` Jerome Brunet
  2025-10-24 16:11   ` Koichiro Den
  2025-10-24  3:27 ` Frank Li
  26 siblings, 1 reply; 37+ messages in thread
From: Jerome Brunet @ 2025-10-23  7:55 UTC
  To: Koichiro Den
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, Frank.Li, fancer.lancer, arnd,
	pstanner, elfring

On Thu 23 Oct 2025 at 16:18, Koichiro Den <den@valinux.co.jp> wrote:

> Hi all,
>
> Motivation
> ==========
>
> On Renesas R-Car S4 the PCIe Endpoint is DesignWare-based and the platform
> does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
> result, forwarding MSI writes from the Root Complex (RC) to the Endpoint
> (EP) is not possible even if we would add implementation to create a MSI
> domain for the vNTB device to use existing drivers/ntb/msi.c, and NTB
> traffic must fall back to doorbells (polling). In addition, BAR resources
> are scarce, which makes it difficult to dedicate a BAR solely to an
> NTB/msi window.
>
> This RFC introduces a generic interrupt backend for NTB. The existing MSI
> path is converted to a backend, and a new DW eDMA test-interrupt backend
> provides an RC-to-EP interrupt fallback when MSI cannot be used. In
> parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
> windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
> The vNTB EPF and ntb_transport are taught about offsets.
>
> Backend selection is automatic: if MSI is available we use the MSI backend.
> Otherwise, if enabled, the DW eDMA backend is used. If neither is
> available, we continue to use doorbells. Existing systems remain unaffected
> unless use_intr=1 is set.
>
> Example layout (R-Car S4):
>
>   BAR0: Config/Spad
>   BAR2 [0x00000-0xF0000]: MW1 (data)
>   BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
>   BAR4: Doorbell

Have you considered putting the doorbell in BAR0 alongside Config/SPAD
instead? Doorbells already have an offset in the config, and it would
allow the following setup:

BAR0 : Config/Spad/Doorbell
BAR2 : MW1
BAR4 : MW2

If MW2 handles the IRQs, I suppose the size requirement is rather
limited, so it should fit?

The modification to allow this setup is minimal and you would not need
all the offset related changes below ... This is something I
was experimenting on. I can share that if you are interested.

>
>   # The corresponding configfs settings (see Patch #25):
>   echo 0xF0000 > ./mw1
>   echo 0x8000  > ./mw2
>   echo 0xF0000 > ./mw2_offset
>   echo 2       > ./mw1_bar
>   echo 2       > ./mw2_bar
>
> Summary of changes
> ==================
>
> * NTB core/transport
>   - Introduce struct ntb_intr_backend and convert MSI to the new backend.
>   - Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
>   - Rename module parameter to use_intr (keep use_msi as deprecated alias).
>   - Support offsetted partial MWs in ntb_transport.
>   - Hardening for peer-reported interrupt values and minor cleanups.
>
> * PCI Endpoint core and DWC EP controller
>   - Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
>   - Implement inbound mapping for DesignWare EP (Address Match mode), with
>     tracking of multiple inbound iATU entries per BAR and proper teardown.
>
> * EPF vNTB
>   - Add mwN_offset configfs attributes and propagate offsets to inbound maps.

... then you would not need this, and it would remove a significant
part of the necessary changes below.

>   - Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
>     set_bar().
>   - Provide .get_pci_epc() so backends can locate the common eDMA instance.
>
> * DW eDMA
>   - Add self-interrupt registration and expose test-IRQ register offsets.
>   - Provide dw_edma_find_by_child().
>
> * Renesas R-Car
>   - Place MW2 in BAR2 to host the interrupt window alongside the data MW.
>
> * Documentation
>
> Patch layout
> ============
>
> * Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
> * Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
> * Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
> * Patches 18-19 : NTB/EPF glue (.get_pci_epc())
> * Patch 20      : Module param name change (use_msi->use_intr, alias preserved)
> * Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
> * Patch 24      : R-Car: add MW2 in BAR2 for interrupts
> * Patch 25      : Documentation updates
>
> Tested on
> =========
>
> * Renesas R-Car S4 Spider
> * Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)
>
> Performance measurement
> =======================
>
> Even without the DMA acceleration patches for R-Car S4 (which I keep
> separate from this RFC patch series), enabling RC-to-EP interrupts
> dramatically improves NTB latency on R-Car S4:
>
> * Before this patch series (NB. use_msi doesn't work on R-Car S4)
>
>   # Server: sockperf server -i 0.0.0.0
>   # Client: sockperf ping-pong -i $SERVER_IP
>   ========= Printing statistics for Server No: 0
>   [Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
>   ====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
>         siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
>   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
>   Summary: Latency is 5995.680 usec
>   Total 45 observations; each percentile contains 0.45 observations
>   ---> <MAX> observation = 6121.137
>   ---> percentile 99.999 = 6121.137
>   ---> percentile 99.990 = 6121.137
>   ---> percentile 99.900 = 6121.137
>   ---> percentile 99.000 = 6121.137
>   ---> percentile 90.000 = 6099.178
>   ---> percentile 75.000 = 6054.418
>   ---> percentile 50.000 = 5993.040
>   ---> percentile 25.000 = 5935.021
>   ---> <MIN> observation = 5883.362
>
> * With this series (use_intr=1)
>
>   # Server: sockperf server -i 0.0.0.0
>   # Client: sockperf ping-pong -i $SERVER_IP
>   ========= Printing statistics for Server No: 0
>   [Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
>   ====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
>         siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
>   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
>   Summary: Latency is 127.677 usec
>   Total 2145 observations; each percentile contains 21.45 observations
>   ---> <MAX> observation =  446.691
>   ---> percentile 99.999 =  446.691
>   ---> percentile 99.990 =  446.691
>   ---> percentile 99.900 =  291.234
>   ---> percentile 99.000 =  221.515
>   ---> percentile 90.000 =  149.277
>   ---> percentile 75.000 =  124.497
>   ---> percentile 50.000 =  121.137
>   ---> percentile 25.000 =  119.037
>   ---> <MIN> observation =  113.637
>
> Feedback welcome on both the approach and the splitting/routing preference.
>
> (The series spans NTB, PCI EP/DWC and dmaengine/dw-edma. I'm happy to split
> later if preferred.)
>
> Thanks for reviewing.
>
>
> Koichiro Den (25):
>   PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
>     access
>   PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
>   NTB: epf: Handle mwN_offset for inbound MW regions
>   PCI: endpoint: Add inbound mapping ops to EPC core
>   PCI: dwc: ep: Implement EPC inbound mapping support
>   PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
>   NTB: Add offset parameter to MW translation APIs
>   PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
>     present
>   NTB: ntb_transport: Support offsetted partial memory windows
>   NTB/msi: Support offsetted partial memory window for MSI
>   NTB/msi: Do not force MW to its maximum possible size
>   NTB: ntb_transport: Stricter checks for peer-reported interrupt values
>   NTB/msi: Skip mw_set_trans() if already configured
>   NTB/msi: Add a inner loop for PCI-MSI cases
>   dmaengine: dw-edma: Add self-interrupt registration API
>   dmaengine: dw-edma: Expose self-IRQ register offsets
>   dmaengine: dw-edma: Add dw_edma_find_by_child() helper
>   NTB: core: Add .get_pci_epc() to ntb_dev_ops
>   NTB: epf: vntb: Implement .get_pci_epc() callback
>   NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
>   NTB: Introduce generic interrupt backend abstraction and convert MSI
>   NTB: ntb_transport: Rename MSI symbols to generic interrupt form
>   NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
>   NTB: epf: Add MW2 for interrupt use on Renesas R-Car
>   Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
>     usage
>
>  Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-
>  drivers/dma/dw-edma/dw-edma-core.c            | 109 ++++++++
>  drivers/dma/dw-edma/dw-edma-core.h            |  18 ++
>  drivers/dma/dw-edma/dw-edma-v0-core.c         |  15 ++
>  drivers/ntb/Kconfig                           |  15 ++
>  drivers/ntb/Makefile                          |   6 +-
>  drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-
>  drivers/ntb/hw/epf/ntb_hw_epf.c               |  46 ++--
>  drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-
>  drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-
>  drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-
>  drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-
>  drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-
>  drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-
>  drivers/ntb/intr_common.c                     |  61 +++++
>  drivers/ntb/intr_dw_edma.c                    | 253 ++++++++++++++++++
>  drivers/ntb/msi.c                             | 186 +++++++------
>  drivers/ntb/ntb_transport.c                   | 155 ++++++-----
>  drivers/ntb/test/ntb_msi_test.c               |  26 +-
>  drivers/ntb/test/ntb_perf.c                   |   4 +-
>  drivers/ntb/test/ntb_tool.c                   |   6 +-
>  .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++--
>  drivers/pci/controller/dwc/pcie-designware.c  |   1 +
>  drivers/pci/controller/dwc/pcie-designware.h  |   2 +
>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
>  drivers/pci/endpoint/pci-epc-core.c           |  44 +++
>  include/linux/dma/edma.h                      |  31 +++
>  include/linux/ntb.h                           | 134 +++++++---
>  include/linux/pci-epc.h                       |  11 +
>  29 files changed, 1310 insertions(+), 300 deletions(-)
>  create mode 100644 drivers/ntb/intr_common.c
>  create mode 100644 drivers/ntb/intr_dw_edma.c

-- 
Jerome


* Re: [RFC PATCH 01/25] PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access
  2025-10-23  7:18 ` [RFC PATCH 01/25] PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access Koichiro Den
@ 2025-10-24  0:06   ` Frank Li
  2025-10-24 16:24     ` Koichiro Den
  0 siblings, 1 reply; 37+ messages in thread
From: Frank Li @ 2025-10-24  0:06 UTC (permalink / raw)
  To: Koichiro Den
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, jbrunet, fancer.lancer, arnd,
	pstanner, elfring

On Thu, Oct 23, 2025 at 04:18:52PM +0900, Koichiro Den wrote:
> Follow common kernel idioms for indices derived from configfs attributes
> and suppress Smatch warnings:
>
>   epf_ntb_mw1_show() warn: potential spectre issue 'ntb->mws_size' [r]
>   epf_ntb_mw1_store() warn: potential spectre issue 'ntb->mws_size' [w]
>
> No functional changes.
>
> Signed-off-by: Koichiro Den <den@valinux.co.jp>
> ---
>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 23 +++++++++++--------
>  1 file changed, 14 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> index 83e9ab10f9c4..55307cd613c9 100644
> --- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> +++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> @@ -876,17 +876,19 @@ static ssize_t epf_ntb_##_name##_show(struct config_item *item,		\
>  	struct config_group *group = to_config_group(item);		\
>  	struct epf_ntb *ntb = to_epf_ntb(group);			\
>  	struct device *dev = &ntb->epf->dev;				\
> -	int win_no;							\
> +	int win_no, idx;						\
>  									\
>  	if (sscanf(#_name, "mw%d", &win_no) != 1)			\
>  		return -EINVAL;						\
>  									\
> -	if (win_no <= 0 || win_no > ntb->num_mws) {			\
> -		dev_err(dev, "Invalid num_nws: %d value\n", ntb->num_mws); \
> +	idx = win_no - 1;						\
> +	if (idx < 0 || idx >= ntb->num_mws) {				\
> +		dev_err(dev, "MW%d out of range (num_mws=%d)\n",	\
> +			win_no, ntb->num_mws);				\
>  		return -EINVAL;						\
>  	}								\
> -									\
> -	return sprintf(page, "%lld\n", ntb->mws_size[win_no - 1]);	\
> +	idx = array_index_nospec(idx, ntb->num_mws);			\
> +	return sprintf(page, "%lld\n", ntb->mws_size[idx]);		\

Keep the original check, if (win_no <= 0 || win_no > ntb->num_mws), and
then just do:

	idx = array_index_nospec(win_no - 1, ntb->num_mws);
	return sprintf(page, "%lld\n", ntb->mws_size[idx]);

That would be simpler.
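
A minimal userspace sketch of the pattern suggested above: keep the range
check on win_no, then clamp the derived index before using it. The names
index_nospec(), struct fake_ntb and mw_size_show() are hypothetical
stand-ins, not the kernel's array_index_nospec() or struct epf_ntb, and the
mask trick assumes an arithmetic right shift on signed long (as gcc and
clang provide).

```c
#include <limits.h>

#define MAX_MW 4

/* Hypothetical stand-in for the relevant fields of struct epf_ntb. */
struct fake_ntb {
	int num_mws;
	unsigned long long mws_size[MAX_MW];
};

/*
 * Userspace stand-in for array_index_nospec(): yields index when
 * index < size and 0 otherwise, computed with a mask instead of a branch.
 */
static unsigned long index_nospec(unsigned long index, unsigned long size)
{
	long t = ~(long)(index | (size - 1UL - index));
	unsigned long mask = (unsigned long)(t >> (sizeof(long) * CHAR_BIT - 1));

	return index & mask;
}

/* Returns 0 and stores the MW size in *out, or -1 if win_no is invalid. */
static int mw_size_show(const struct fake_ntb *ntb, int win_no,
			unsigned long long *out)
{
	unsigned long idx;

	/* Original-style check on the 1-based window number. */
	if (win_no <= 0 || win_no > ntb->num_mws)
		return -1;

	/* Clamp the 0-based index before the array access. */
	idx = index_nospec((unsigned long)(win_no - 1),
			   (unsigned long)ntb->num_mws);
	*out = ntb->mws_size[idx];
	return 0;
}
```

With num_mws = 2, mw_size_show() accepts win_no 1 and 2 and rejects 0 and 3;
index_nospec() on its own maps any out-of-range index to 0.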

Frank
>  }
>
>  #define EPF_NTB_MW_W(_name)						\
> @@ -896,7 +898,7 @@ static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
>  	struct config_group *group = to_config_group(item);		\
>  	struct epf_ntb *ntb = to_epf_ntb(group);			\
>  	struct device *dev = &ntb->epf->dev;				\
> -	int win_no;							\
> +	int win_no, idx;						\
>  	u64 val;							\
>  	int ret;							\
>  									\
> @@ -907,12 +909,15 @@ static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
>  	if (sscanf(#_name, "mw%d", &win_no) != 1)			\
>  		return -EINVAL;						\
>  									\
> -	if (win_no <= 0 || win_no > ntb->num_mws) {			\
> -		dev_err(dev, "Invalid num_nws: %d value\n", ntb->num_mws); \
> +	idx = win_no - 1;						\
> +	if (idx < 0 || idx >= ntb->num_mws) {				\
> +		dev_err(dev, "MW%d out of range (num_mws=%d)\n",	\
> +			win_no, ntb->num_mws);				\
>  		return -EINVAL;						\
>  	}								\
>  									\
> -	ntb->mws_size[win_no - 1] = val;				\
> +	idx = array_index_nospec(idx, ntb->num_mws);			\
> +	ntb->mws_size[idx] = val;					\
>  									\
>  	return len;							\
>  }
> --
> 2.48.1
>


* Re: [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
  2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
                   ` (25 preceding siblings ...)
  2025-10-23  7:55 ` [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Jerome Brunet
@ 2025-10-24  3:27 ` Frank Li
  2025-10-24 16:04   ` Koichiro Den
  26 siblings, 1 reply; 37+ messages in thread
From: Frank Li @ 2025-10-24  3:27 UTC (permalink / raw)
  To: Koichiro Den
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, jbrunet, fancer.lancer, arnd,
	pstanner, elfring

On Thu, Oct 23, 2025 at 04:18:51PM +0900, Koichiro Den wrote:
> Hi all,
>
> Motivation
> ==========
>
> On Renesas R-Car S4 the PCIe Endpoint is DesignWare-based and the platform
> does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
> result, forwarding MSI writes from the Root Complex (RC) to the Endpoint
> (EP) is not possible even if we would add implementation to create a MSI
> domain for the vNTB device to use existing drivers/ntb/msi.c, and NTB
> traffic must fall back to doorbells (polling). In addition, BAR resources
> are scarce, which makes it difficult to dedicate a BAR solely to an
> NTB/msi window.
>
> This RFC introduces a generic interrupt backend for NTB. The existing MSI
> path is converted to a backend, and a new DW eDMA test-interrupt backend
> provides an RC-to-EP interrupt fallback when MSI cannot be used. In
> parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
> windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
> The vNTB EPF and ntb_transport are taught about offsets.

Mapping multiple addresses into one BAR is quite valuable, so we can start
with that as the first step.

But I have a question about the DWC iATU address map mode. For example,
BAR0 maps to CPU address 0x8000000 (local CPU), but the BAR0 address on the
RC side varies between RC systems. It may be 0xa000_0000 (RC side) or
0xc000_0000 (RC side).

The BAR0 mapping is set up before link-up.

How do you know whether the PCI bus address is 0xa0000000 or 0xc0000000?

Frank

>
> Backend selection is automatic: if MSI is available we use the MSI backend.
> Otherwise, if enabled, the DW eDMA backend is used. If neither is
> available, we continue to use doorbells. Existing systems remain unaffected
> unless use_intr=1 is set.
>
> Example layout (R-Car S4):
>
>   BAR0: Config/Spad
>   BAR2 [0x00000-0xF0000]: MW1 (data)
>   BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
>   BAR4: Doorbell
>
>   # The corresponding configfs settings (see Patch #25):
>   echo 0xF0000 > ./mw1
>   echo 0x8000  > ./mw2
>   echo 0xF0000 > ./mw2_offset
>   echo 2       > ./mw1_bar
>   echo 2       > ./mw2_bar
>
> Summary of changes
> ==================
>
> * NTB core/transport
>   - Introduce struct ntb_intr_backend and convert MSI to the new backend.
>   - Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
>   - Rename module parameter to use_intr (keep use_msi as deprecated alias).
>   - Support offsetted partial MWs in ntb_transport.
>   - Hardening for peer-reported interrupt values and minor cleanups.
>
> * PCI Endpoint core and DWC EP controller
>   - Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
>   - Implement inbound mapping for DesignWare EP (Address Match mode), with
>     tracking of multiple inbound iATU entries per BAR and proper teardown.
>
> * EPF vNTB
>   - Add mwN_offset configfs attributes and propagate offsets to inbound maps.
>   - Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
>     set_bar().
>   - Provide .get_pci_epc() so backends can locate the common eDMA instance.
>
> * DW eDMA
>   - Add self-interrupt registration and expose test-IRQ register offsets.
>   - Provide dw_edma_find_by_child().
>
> * Renesas R-Car
>   - Place MW2 in BAR2 to host the interrupt window alongside the data MW.
>
> * Documentation
>
> Patch layout
> ============
>
> * Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
> * Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
> * Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
> * Patches 18-19 : NTB/EPF glue (.get_pci_epc())
> * Patch 20      : Module param name change (use_msi->use_intr, alias preserved)
> * Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
> * Patch 24      : R-Car: add MW2 in BAR2 for interrupts
> * Patch 25      : Documentation updates
>
> Tested on
> =========
>
> * Renesas R-Car S4 Spider
> * Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)
>
> Performance measurement
> =======================
>
> Even without the DMA acceleration patches for R-Car S4 (which I keep
> separate from this RFC patch series), enabling RC-to-EP interrupts
> dramatically improves NTB latency on R-Car S4:
>
> * Before this patch series (NB. use_msi doesn't work on R-Car S4)
>
>   # Server: sockperf server -i 0.0.0.0
>   # Client: sockperf ping-pong -i $SERVER_IP
>   ========= Printing statistics for Server No: 0
>   [Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
>   ====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
>         siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
>   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
>   Summary: Latency is 5995.680 usec
>   Total 45 observations; each percentile contains 0.45 observations
>   ---> <MAX> observation = 6121.137
>   ---> percentile 99.999 = 6121.137
>   ---> percentile 99.990 = 6121.137
>   ---> percentile 99.900 = 6121.137
>   ---> percentile 99.000 = 6121.137
>   ---> percentile 90.000 = 6099.178
>   ---> percentile 75.000 = 6054.418
>   ---> percentile 50.000 = 5993.040
>   ---> percentile 25.000 = 5935.021
>   ---> <MIN> observation = 5883.362
>
> * With this series (use_intr=1)
>
>   # Server: sockperf server -i 0.0.0.0
>   # Client: sockperf ping-pong -i $SERVER_IP
>   ========= Printing statistics for Server No: 0
>   [Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
>   ====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
>         siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
>   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
>   Summary: Latency is 127.677 usec
>   Total 2145 observations; each percentile contains 21.45 observations
>   ---> <MAX> observation =  446.691
>   ---> percentile 99.999 =  446.691
>   ---> percentile 99.990 =  446.691
>   ---> percentile 99.900 =  291.234
>   ---> percentile 99.000 =  221.515
>   ---> percentile 90.000 =  149.277
>   ---> percentile 75.000 =  124.497
>   ---> percentile 50.000 =  121.137
>   ---> percentile 25.000 =  119.037
>   ---> <MIN> observation =  113.637
>
> Feedback welcome on both the approach and the splitting/routing preference.
>
> (The series spans NTB, PCI EP/DWC and dmaengine/dw-edma. I'm happy to split
> later if preferred.)
>
> Thanks for reviewing.
>
>
> Koichiro Den (25):
>   PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
>     access
>   PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
>   NTB: epf: Handle mwN_offset for inbound MW regions
>   PCI: endpoint: Add inbound mapping ops to EPC core
>   PCI: dwc: ep: Implement EPC inbound mapping support
>   PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
>   NTB: Add offset parameter to MW translation APIs
>   PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
>     present
>   NTB: ntb_transport: Support offsetted partial memory windows
>   NTB/msi: Support offsetted partial memory window for MSI
>   NTB/msi: Do not force MW to its maximum possible size
>   NTB: ntb_transport: Stricter checks for peer-reported interrupt values
>   NTB/msi: Skip mw_set_trans() if already configured
>   NTB/msi: Add a inner loop for PCI-MSI cases
>   dmaengine: dw-edma: Add self-interrupt registration API
>   dmaengine: dw-edma: Expose self-IRQ register offsets
>   dmaengine: dw-edma: Add dw_edma_find_by_child() helper
>   NTB: core: Add .get_pci_epc() to ntb_dev_ops
>   NTB: epf: vntb: Implement .get_pci_epc() callback
>   NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
>   NTB: Introduce generic interrupt backend abstraction and convert MSI
>   NTB: ntb_transport: Rename MSI symbols to generic interrupt form
>   NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
>   NTB: epf: Add MW2 for interrupt use on Renesas R-Car
>   Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
>     usage
>
>  Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-
>  drivers/dma/dw-edma/dw-edma-core.c            | 109 ++++++++
>  drivers/dma/dw-edma/dw-edma-core.h            |  18 ++
>  drivers/dma/dw-edma/dw-edma-v0-core.c         |  15 ++
>  drivers/ntb/Kconfig                           |  15 ++
>  drivers/ntb/Makefile                          |   6 +-
>  drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-
>  drivers/ntb/hw/epf/ntb_hw_epf.c               |  46 ++--
>  drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-
>  drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-
>  drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-
>  drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-
>  drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-
>  drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-
>  drivers/ntb/intr_common.c                     |  61 +++++
>  drivers/ntb/intr_dw_edma.c                    | 253 ++++++++++++++++++
>  drivers/ntb/msi.c                             | 186 +++++++------
>  drivers/ntb/ntb_transport.c                   | 155 ++++++-----
>  drivers/ntb/test/ntb_msi_test.c               |  26 +-
>  drivers/ntb/test/ntb_perf.c                   |   4 +-
>  drivers/ntb/test/ntb_tool.c                   |   6 +-
>  .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++--
>  drivers/pci/controller/dwc/pcie-designware.c  |   1 +
>  drivers/pci/controller/dwc/pcie-designware.h  |   2 +
>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
>  drivers/pci/endpoint/pci-epc-core.c           |  44 +++
>  include/linux/dma/edma.h                      |  31 +++
>  include/linux/ntb.h                           | 134 +++++++---
>  include/linux/pci-epc.h                       |  11 +
>  29 files changed, 1310 insertions(+), 300 deletions(-)
>  create mode 100644 drivers/ntb/intr_common.c
>  create mode 100644 drivers/ntb/intr_dw_edma.c
>
> --
> 2.48.1
>


* Re: [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
  2025-10-24  3:27 ` Frank Li
@ 2025-10-24 16:04   ` Koichiro Den
  2025-10-24 16:43     ` Frank Li
  0 siblings, 1 reply; 37+ messages in thread
From: Koichiro Den @ 2025-10-24 16:04 UTC (permalink / raw)
  To: Frank Li
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, jbrunet, fancer.lancer, arnd,
	pstanner, elfring

On Thu, Oct 23, 2025 at 11:27:09PM -0400, Frank Li wrote:
> On Thu, Oct 23, 2025 at 04:18:51PM +0900, Koichiro Den wrote:
> > Hi all,
> >
> > Motivation
> > ==========
> >
> > On Renesas R-Car S4 the PCIe Endpoint is DesignWare-based and the platform
> > does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
> > result, forwarding MSI writes from the Root Complex (RC) to the Endpoint
> > (EP) is not possible even if we would add implementation to create a MSI
> > domain for the vNTB device to use existing drivers/ntb/msi.c, and NTB
> > traffic must fall back to doorbells (polling). In addition, BAR resources
> > are scarce, which makes it difficult to dedicate a BAR solely to an
> > NTB/msi window.
> >
> > This RFC introduces a generic interrupt backend for NTB. The existing MSI
> > path is converted to a backend, and a new DW eDMA test-interrupt backend
> > provides an RC-to-EP interrupt fallback when MSI cannot be used. In
> > parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
> > windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
> > The vNTB EPF and ntb_transport are taught about offsets.
> 
> Map multi address to one bar is quite valuable, so we can start it as the
> first steps.
> 
> But I have a problem about DWC iATU address map mode. for example, bar0
> to cpu address 0x8000000 (Local CPU).  but difference RC system, at RC side
> bar0 address is variable. May be 0xa000_0000 (RC side), maybe 0xc000_0000
> (RC side).
> 
> Set bar0 mapping before linkup.
> 
> How do you know PCI bus address is 0xa0000000 or 0xc0000000.

Thanks for the comment.

For vNTB this is done in two steps:

1. In the epf_ntb_bind() path we call pci_epc_map_inbound() with
   epf_bar->phys_addr == 0. On the DWC side this only triggers
   dw_pcie_ep_set_bar_init() and does not program an inbound iATU yet
   (please see Patch #5).
2. Later, when ntb_transport's link work runs and we actually need to
   set up the Address Match inbound window(s), pci_epc_map_inbound() is
   called again with epf_bar->phys_addr != 0 (and an offset for the
   sub-range). At that point the RC has already enumerated the device and
   assigned the BAR, so dw_pcie_ep_map_inbound() reads back the assigned
   BAR value via dw_pcie_ep_read_bar_assigned(), computes
   pci_addr = base + offset, and programs the inbound iATU in Address
   Match mode (again, see Patch #5).

Because we do not program the inbound iATU before enumeration, we don't
need to know upfront whether the RC will place BAR0 at 0xa000_0000 or
0xc000_0000. We read the assigned address right before the actual
programming (again, see Patch #5). Am I missing something?
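
The two-step flow described above can be modeled roughly as follows. This
is only an illustrative sketch: struct model_bar, struct model_iatu and
model_map_inbound() are hypothetical names, not the types or functions from
Patch #5, and treating an assigned base of 0 as "not yet enumerated" is an
assumption of the model.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one endpoint BAR. */
struct model_bar {
	bool     initialized;    /* step 1 done: BAR set up, no iATU yet */
	uint64_t assigned_base;  /* bus address written by the RC, 0 if none */
};

/* Hypothetical model of one inbound Address Match window. */
struct model_iatu {
	bool     programmed;
	uint64_t pci_addr;       /* bus address the window matches */
	uint64_t cpu_addr;       /* local address it translates to */
	uint64_t size;
};

/*
 * phys_addr == 0: only initialize the BAR (step 1).
 * phys_addr != 0: read back the RC-assigned base, compute
 *                 pci_addr = base + offset, and program the window (step 2).
 * Returns 0 on success, -1 if the BAR has not been assigned yet.
 */
static int model_map_inbound(struct model_bar *bar, struct model_iatu *win,
			     uint64_t phys_addr, uint64_t offset,
			     uint64_t size)
{
	if (phys_addr == 0) {
		bar->initialized = true;
		return 0;
	}

	if (!bar->initialized || bar->assigned_base == 0)
		return -1;

	win->pci_addr = bar->assigned_base + offset;
	win->cpu_addr = phys_addr;
	win->size = size;
	win->programmed = true;
	return 0;
}
```

In this model, whether the RC assigns 0xa000_0000 or 0xc000_0000 only
changes the value read back in step 2; nothing programmed in step 1 depends
on it.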

-Koichiro

> 
> Frank
> 
> >
> > Backend selection is automatic: if MSI is available we use the MSI backend.
> > Otherwise, if enabled, the DW eDMA backend is used. If neither is
> > available, we continue to use doorbells. Existing systems remain unaffected
> > unless use_intr=1 is set.
> >
> > Example layout (R-Car S4):
> >
> >   BAR0: Config/Spad
> >   BAR2 [0x00000-0xF0000]: MW1 (data)
> >   BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
> >   BAR4: Doorbell
> >
> >   # The corresponding configfs settings (see Patch #25):
> >   echo 0xF0000 > ./mw1
> >   echo 0x8000  > ./mw2
> >   echo 0xF0000 > ./mw2_offset
> >   echo 2       > ./mw1_bar
> >   echo 2       > ./mw2_bar
> >
> > Summary of changes
> > ==================
> >
> > * NTB core/transport
> >   - Introduce struct ntb_intr_backend and convert MSI to the new backend.
> >   - Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
> >   - Rename module parameter to use_intr (keep use_msi as deprecated alias).
> >   - Support offsetted partial MWs in ntb_transport.
> >   - Hardening for peer-reported interrupt values and minor cleanups.
> >
> > * PCI Endpoint core and DWC EP controller
> >   - Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
> >   - Implement inbound mapping for DesignWare EP (Address Match mode), with
> >     tracking of multiple inbound iATU entries per BAR and proper teardown.
> >
> > * EPF vNTB
> >   - Add mwN_offset configfs attributes and propagate offsets to inbound maps.
> >   - Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
> >     set_bar().
> >   - Provide .get_pci_epc() so backends can locate the common eDMA instance.
> >
> > * DW eDMA
> >   - Add self-interrupt registration and expose test-IRQ register offsets.
> >   - Provide dw_edma_find_by_child().
> >
> > * Renesas R-Car
> >   - Place MW2 in BAR2 to host the interrupt window alongside the data MW.
> >
> > * Documentation
> >
> > Patch layout
> > ============
> >
> > * Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
> > * Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
> > * Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
> > * Patches 18-19 : NTB/EPF glue (.get_pci_epc())
> > * Patch 20      : Module param name change (use_msi->use_intr, alias preserved)
> > * Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
> > * Patch 24      : R-Car: add MW2 in BAR2 for interrupts
> > * Patch 25      : Documentation updates
> >
> > Tested on
> > =========
> >
> > * Renesas R-Car S4 Spider
> > * Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)
> >
> > Performance measurement
> > =======================
> >
> > Even without the DMA acceleration patches for R-Car S4 (which I keep
> > separate from this RFC patch series), enabling RC-to-EP interrupts
> > dramatically improves NTB latency on R-Car S4:
> >
> > * Before this patch series (NB. use_msi doesn't work on R-Car S4)
> >
> >   # Server: sockperf server -i 0.0.0.0
> >   # Client: sockperf ping-pong -i $SERVER_IP
> >   ========= Printing statistics for Server No: 0
> >   [Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
> >   ====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
> >         siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
> >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> >   Summary: Latency is 5995.680 usec
> >   Total 45 observations; each percentile contains 0.45 observations
> >   ---> <MAX> observation = 6121.137
> >   ---> percentile 99.999 = 6121.137
> >   ---> percentile 99.990 = 6121.137
> >   ---> percentile 99.900 = 6121.137
> >   ---> percentile 99.000 = 6121.137
> >   ---> percentile 90.000 = 6099.178
> >   ---> percentile 75.000 = 6054.418
> >   ---> percentile 50.000 = 5993.040
> >   ---> percentile 25.000 = 5935.021
> >   ---> <MIN> observation = 5883.362
> >
> > * With this series (use_intr=1)
> >
> >   # Server: sockperf server -i 0.0.0.0
> >   # Client: sockperf ping-pong -i $SERVER_IP
> >   ========= Printing statistics for Server No: 0
> >   [Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
> >   ====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
> >         siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
> >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> >   Summary: Latency is 127.677 usec
> >   Total 2145 observations; each percentile contains 21.45 observations
> >   ---> <MAX> observation =  446.691
> >   ---> percentile 99.999 =  446.691
> >   ---> percentile 99.990 =  446.691
> >   ---> percentile 99.900 =  291.234
> >   ---> percentile 99.000 =  221.515
> >   ---> percentile 90.000 =  149.277
> >   ---> percentile 75.000 =  124.497
> >   ---> percentile 50.000 =  121.137
> >   ---> percentile 25.000 =  119.037
> >   ---> <MIN> observation =  113.637
> >
> > Feedback welcome on both the approach and the splitting/routing preference.
> >
> > (The series spans NTB, PCI EP/DWC and dmaengine/dw-edma. I'm happy to split
> > later if preferred.)
> >
> > Thanks for reviewing.
> >
> >
> > Koichiro Den (25):
> >   PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
> >     access
> >   PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
> >   NTB: epf: Handle mwN_offset for inbound MW regions
> >   PCI: endpoint: Add inbound mapping ops to EPC core
> >   PCI: dwc: ep: Implement EPC inbound mapping support
> >   PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
> >   NTB: Add offset parameter to MW translation APIs
> >   PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
> >     present
> >   NTB: ntb_transport: Support offsetted partial memory windows
> >   NTB/msi: Support offsetted partial memory window for MSI
> >   NTB/msi: Do not force MW to its maximum possible size
> >   NTB: ntb_transport: Stricter checks for peer-reported interrupt values
> >   NTB/msi: Skip mw_set_trans() if already configured
> >   NTB/msi: Add a inner loop for PCI-MSI cases
> >   dmaengine: dw-edma: Add self-interrupt registration API
> >   dmaengine: dw-edma: Expose self-IRQ register offsets
> >   dmaengine: dw-edma: Add dw_edma_find_by_child() helper
> >   NTB: core: Add .get_pci_epc() to ntb_dev_ops
> >   NTB: epf: vntb: Implement .get_pci_epc() callback
> >   NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
> >   NTB: Introduce generic interrupt backend abstraction and convert MSI
> >   NTB: ntb_transport: Rename MSI symbols to generic interrupt form
> >   NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
> >   NTB: epf: Add MW2 for interrupt use on Renesas R-Car
> >   Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
> >     usage
> >
> >  Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-
> >  drivers/dma/dw-edma/dw-edma-core.c            | 109 ++++++++
> >  drivers/dma/dw-edma/dw-edma-core.h            |  18 ++
> >  drivers/dma/dw-edma/dw-edma-v0-core.c         |  15 ++
> >  drivers/ntb/Kconfig                           |  15 ++
> >  drivers/ntb/Makefile                          |   6 +-
> >  drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-
> >  drivers/ntb/hw/epf/ntb_hw_epf.c               |  46 ++--
> >  drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-
> >  drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-
> >  drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-
> >  drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-
> >  drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-
> >  drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-
> >  drivers/ntb/intr_common.c                     |  61 +++++
> >  drivers/ntb/intr_dw_edma.c                    | 253 ++++++++++++++++++
> >  drivers/ntb/msi.c                             | 186 +++++++------
> >  drivers/ntb/ntb_transport.c                   | 155 ++++++-----
> >  drivers/ntb/test/ntb_msi_test.c               |  26 +-
> >  drivers/ntb/test/ntb_perf.c                   |   4 +-
> >  drivers/ntb/test/ntb_tool.c                   |   6 +-
> >  .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++--
> >  drivers/pci/controller/dwc/pcie-designware.c  |   1 +
> >  drivers/pci/controller/dwc/pcie-designware.h  |   2 +
> >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
> >  drivers/pci/endpoint/pci-epc-core.c           |  44 +++
> >  include/linux/dma/edma.h                      |  31 +++
> >  include/linux/ntb.h                           | 134 +++++++---
> >  include/linux/pci-epc.h                       |  11 +
> >  29 files changed, 1310 insertions(+), 300 deletions(-)
> >  create mode 100644 drivers/ntb/intr_common.c
> >  create mode 100644 drivers/ntb/intr_dw_edma.c
> >
> > --
> > 2.48.1
> >

* Re: [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
  2025-10-23  7:55 ` [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Jerome Brunet
@ 2025-10-24 16:11   ` Koichiro Den
  0 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-24 16:11 UTC (permalink / raw)
  To: Jerome Brunet
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, Frank.Li, fancer.lancer, arnd,
	pstanner, elfring

On Thu, Oct 23, 2025 at 09:55:42AM +0200, Jerome Brunet wrote:
> On Thu 23 Oct 2025 at 16:18, Koichiro Den <den@valinux.co.jp> wrote:
> 
> > Hi all,
> >
> > Motivation
> > ==========
> >
> > On Renesas R-Car S4 the PCIe Endpoint is DesignWare-based and the platform
> > does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
> > result, forwarding MSI writes from the Root Complex (RC) to the Endpoint
> > (EP) is not possible, even if we added an implementation that creates an
> > MSI domain for the vNTB device so that it can use the existing
> > drivers/ntb/msi.c, and NTB traffic must fall back to doorbells (polling).
> > In addition, BAR resources are scarce, which makes it difficult to
> > dedicate a BAR solely to an NTB/msi window.
> >
> > This RFC introduces a generic interrupt backend for NTB. The existing MSI
> > path is converted to a backend, and a new DW eDMA test-interrupt backend
> > provides an RC-to-EP interrupt fallback when MSI cannot be used. In
> > parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
> > windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
> > The vNTB EPF and ntb_transport are taught about offsets.
> >
> > Backend selection is automatic: if MSI is available we use the MSI backend.
> > Otherwise, if enabled, the DW eDMA backend is used. If neither is
> > available, we continue to use doorbells. Existing systems remain unaffected
> > unless use_intr=1 is set.
> >
> > Example layout (R-Car S4):
> >
> >   BAR0: Config/Spad
> >   BAR2 [0x00000-0xF0000]: MW1 (data)
> >   BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
> >   BAR4: Doorbell
> 
> Have you considered putting the doorbell in BAR0 alongside Config/SPAD
> instead? Doorbells already have an offset in the config, and it would
> allow the following setup:
> 
> BAR0 : Config/Spad/Doorbell
> BAR2 : MW1
> BAR4 : MW2
> 
> If MW2 handles the IRQs, I suppose the size requirement is rather
> limited, so it should fit?
> 
> The modification to allow this setup is minimal, and you would not need
> all the offset-related changes below. This is something I was
> experimenting with. I can share it if you are interested.

Thank you for the info. Somehow I overlooked NTB_EPF_DB_OFFSET / db_offset
when preparing the patch set. The modification should be minimal, so I can
cook it up if/when needed, thanks!

To be honest, since NTB_EPF_MW1_OFFSET / reserved exists but is actually
unused, I assumed someone would complete the MW*_offset implementation
once it became relevant, and I thought this could be a good time to do so.

-Koichiro
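
For context, the shared-BAR layout from the cover letter (MW1 and MW2 both
placed in BAR2 via mw2_offset) relies on the windows not overlapping. A
minimal C sketch of such a check follows. struct mw_layout and
mw_layout_check() are hypothetical names used only for illustration; they
are not part of the series, and the series may validate offsets differently.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one memory window placed inside a BAR. */
struct mw_layout {
	int bar;		/* BAR number hosting the window (mwN_bar) */
	uint64_t offset;	/* start of the window within the BAR (mwN_offset) */
	uint64_t size;		/* window size in bytes (mwN) */
};

/* Return 0 if no two windows that share a BAR overlap, -EINVAL otherwise. */
static int mw_layout_check(const struct mw_layout *mw, size_t n)
{
	size_t i, j;

	/* Reject empty windows and windows whose end wraps around. */
	for (i = 0; i < n; i++) {
		if (mw[i].size == 0 || mw[i].offset + mw[i].size < mw[i].offset)
			return -EINVAL;
	}

	for (i = 0; i < n; i++) {
		for (j = i + 1; j < n; j++) {
			if (mw[i].bar != mw[j].bar)
				continue;
			/* Overlap test on half-open ranges [offset, offset + size). */
			if (mw[i].offset < mw[j].offset + mw[j].size &&
			    mw[j].offset < mw[i].offset + mw[i].size)
				return -EINVAL;
		}
	}
	return 0;
}
```

With the example layout (MW1 at BAR2 offset 0x0 with size 0xF0000, MW2 at
BAR2 offset 0xF0000 with size 0x8000) the check passes, because the two
ranges only touch at 0xF0000.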

> 
> >
> >   # The corresponding configfs settings (see Patch #25):
> >   echo 0xF0000 > ./mw1
> >   echo 0x8000  > ./mw2
> >   echo 0xF0000 > ./mw2_offset
> >   echo 2       > ./mw1_bar
> >   echo 2       > ./mw2_bar
> >
> > Summary of changes
> > ==================
> >
> > * NTB core/transport
> >   - Introduce struct ntb_intr_backend and convert MSI to the new backend.
> >   - Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
> >   - Rename module parameter to use_intr (keep use_msi as deprecated alias).
> >   - Support offsetted partial MWs in ntb_transport.
> >   - Hardening for peer-reported interrupt values and minor cleanups.
> >
> > * PCI Endpoint core and DWC EP controller
> >   - Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
> >   - Implement inbound mapping for DesignWare EP (Address Match mode), with
> >     tracking of multiple inbound iATU entries per BAR and proper teardown.
> >
> > * EPF vNTB
> >   - Add mwN_offset configfs attributes and propagate offsets to inbound maps.
> 
> ... then you would not need this, and it would remove a significant
> part of the necessary changes below.
> 
> >   - Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
> >     set_bar().
> >   - Provide .get_pci_epc() so backends can locate the common eDMA instance.
> >
> > * DW eDMA
> >   - Add self-interrupt registration and expose test-IRQ register offsets.
> >   - Provide dw_edma_find_by_child().
> >
> > * Renesas R-Car
> >   - Place MW2 in BAR2 to host the interrupt window alongside the data MW.
> >
> > * Documentation
> >
> > Patch layout
> > ============
> >
> > * Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
> > * Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
> > * Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
> > * Patches 18-19 : NTB/EPF glue (.get_pci_epc())
> > * Patch 20      : Module param name change (use_msi->use_intr, alias preserved)
> > * Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
> > * Patch 24      : R-Car: add MW2 in BAR2 for interrupts
> > * Patch 25      : Documentation updates
> >
> > Tested on
> > =========
> >
> > * Renesas R-Car S4 Spider
> > * Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)
> >
> > Performance measurement
> > =======================
> >
> > Even without the DMA acceleration patches for R-Car S4 (which I keep
> > separate from this RFC patch series), enabling RC-to-EP interrupts
> > dramatically improves NTB latency on R-Car S4:
> >
> > * Before this patch series (NB. use_msi doesn't work on R-Car S4)
> >
> >   # Server: sockperf server -i 0.0.0.0
> >   # Client: sockperf ping-pong -i $SERVER_IP
> >   ========= Printing statistics for Server No: 0
> >   [Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
> >   ====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
> >         siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
> >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> >   Summary: Latency is 5995.680 usec
> >   Total 45 observations; each percentile contains 0.45 observations
> >   ---> <MAX> observation = 6121.137
> >   ---> percentile 99.999 = 6121.137
> >   ---> percentile 99.990 = 6121.137
> >   ---> percentile 99.900 = 6121.137
> >   ---> percentile 99.000 = 6121.137
> >   ---> percentile 90.000 = 6099.178
> >   ---> percentile 75.000 = 6054.418
> >   ---> percentile 50.000 = 5993.040
> >   ---> percentile 25.000 = 5935.021
> >   ---> <MIN> observation = 5883.362
> >
> > * With this series (use_intr=1)
> >
> >   # Server: sockperf server -i 0.0.0.0
> >   # Client: sockperf ping-pong -i $SERVER_IP
> >   ========= Printing statistics for Server No: 0
> >   [Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
> >   ====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
> >         siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
> >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> >   Summary: Latency is 127.677 usec
> >   Total 2145 observations; each percentile contains 21.45 observations
> >   ---> <MAX> observation =  446.691
> >   ---> percentile 99.999 =  446.691
> >   ---> percentile 99.990 =  446.691
> >   ---> percentile 99.900 =  291.234
> >   ---> percentile 99.000 =  221.515
> >   ---> percentile 90.000 =  149.277
> >   ---> percentile 75.000 =  124.497
> >   ---> percentile 50.000 =  121.137
> >   ---> percentile 25.000 =  119.037
> >   ---> <MIN> observation =  113.637
> >
> > Feedback welcome on both the approach and the splitting/routing preference.
> >
> > (The series spans NTB, PCI EP/DWC and dmaengine/dw-edma. I'm happy to split
> > later if preferred.)
> >
> > Thanks for reviewing.
> >
> >
> > Koichiro Den (25):
> >   PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
> >     access
> >   PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
> >   NTB: epf: Handle mwN_offset for inbound MW regions
> >   PCI: endpoint: Add inbound mapping ops to EPC core
> >   PCI: dwc: ep: Implement EPC inbound mapping support
> >   PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
> >   NTB: Add offset parameter to MW translation APIs
> >   PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
> >     present
> >   NTB: ntb_transport: Support offsetted partial memory windows
> >   NTB/msi: Support offsetted partial memory window for MSI
> >   NTB/msi: Do not force MW to its maximum possible size
> >   NTB: ntb_transport: Stricter checks for peer-reported interrupt values
> >   NTB/msi: Skip mw_set_trans() if already configured
> >   NTB/msi: Add a inner loop for PCI-MSI cases
> >   dmaengine: dw-edma: Add self-interrupt registration API
> >   dmaengine: dw-edma: Expose self-IRQ register offsets
> >   dmaengine: dw-edma: Add dw_edma_find_by_child() helper
> >   NTB: core: Add .get_pci_epc() to ntb_dev_ops
> >   NTB: epf: vntb: Implement .get_pci_epc() callback
> >   NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
> >   NTB: Introduce generic interrupt backend abstraction and convert MSI
> >   NTB: ntb_transport: Rename MSI symbols to generic interrupt form
> >   NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
> >   NTB: epf: Add MW2 for interrupt use on Renesas R-Car
> >   Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
> >     usage
> >
> >  Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-
> >  drivers/dma/dw-edma/dw-edma-core.c            | 109 ++++++++
> >  drivers/dma/dw-edma/dw-edma-core.h            |  18 ++
> >  drivers/dma/dw-edma/dw-edma-v0-core.c         |  15 ++
> >  drivers/ntb/Kconfig                           |  15 ++
> >  drivers/ntb/Makefile                          |   6 +-
> >  drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-
> >  drivers/ntb/hw/epf/ntb_hw_epf.c               |  46 ++--
> >  drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-
> >  drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-
> >  drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-
> >  drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-
> >  drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-
> >  drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-
> >  drivers/ntb/intr_common.c                     |  61 +++++
> >  drivers/ntb/intr_dw_edma.c                    | 253 ++++++++++++++++++
> >  drivers/ntb/msi.c                             | 186 +++++++------
> >  drivers/ntb/ntb_transport.c                   | 155 ++++++-----
> >  drivers/ntb/test/ntb_msi_test.c               |  26 +-
> >  drivers/ntb/test/ntb_perf.c                   |   4 +-
> >  drivers/ntb/test/ntb_tool.c                   |   6 +-
> >  .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++--
> >  drivers/pci/controller/dwc/pcie-designware.c  |   1 +
> >  drivers/pci/controller/dwc/pcie-designware.h  |   2 +
> >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
> >  drivers/pci/endpoint/pci-epc-core.c           |  44 +++
> >  include/linux/dma/edma.h                      |  31 +++
> >  include/linux/ntb.h                           | 134 +++++++---
> >  include/linux/pci-epc.h                       |  11 +
> >  29 files changed, 1310 insertions(+), 300 deletions(-)
> >  create mode 100644 drivers/ntb/intr_common.c
> >  create mode 100644 drivers/ntb/intr_dw_edma.c
> 
> -- 
> Jerome

* Re: [RFC PATCH 01/25] PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access
  2025-10-24  0:06   ` Frank Li
@ 2025-10-24 16:24     ` Koichiro Den
  2025-10-24 18:40       ` Frank Li
  0 siblings, 1 reply; 37+ messages in thread
From: Koichiro Den @ 2025-10-24 16:24 UTC (permalink / raw)
  To: Frank Li
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, jbrunet, fancer.lancer, arnd,
	pstanner, elfring

On Thu, Oct 23, 2025 at 08:06:40PM -0400, Frank Li wrote:
> On Thu, Oct 23, 2025 at 04:18:52PM +0900, Koichiro Den wrote:
> > Follow common kernel idioms for indices derived from configfs attributes
> > and suppress Smatch warnings:
> >
> >   epf_ntb_mw1_show() warn: potential spectre issue 'ntb->mws_size' [r]
> >   epf_ntb_mw1_store() warn: potential spectre issue 'ntb->mws_size' [w]
> >
> > No functional changes.
> >
> > Signed-off-by: Koichiro Den <den@valinux.co.jp>
> > ---
> >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 23 +++++++++++--------
> >  1 file changed, 14 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > index 83e9ab10f9c4..55307cd613c9 100644
> > --- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > +++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > @@ -876,17 +876,19 @@ static ssize_t epf_ntb_##_name##_show(struct config_item *item,		\
> >  	struct config_group *group = to_config_group(item);		\
> >  	struct epf_ntb *ntb = to_epf_ntb(group);			\
> >  	struct device *dev = &ntb->epf->dev;				\
> > -	int win_no;							\
> > +	int win_no, idx;						\
> >  									\
> >  	if (sscanf(#_name, "mw%d", &win_no) != 1)			\
> >  		return -EINVAL;						\
> >  									\
> > -	if (win_no <= 0 || win_no > ntb->num_mws) {			\
> > -		dev_err(dev, "Invalid num_nws: %d value\n", ntb->num_mws); \
> > +	idx = win_no - 1;						\
> > +	if (idx < 0 || idx >= ntb->num_mws) {				\
> > +		dev_err(dev, "MW%d out of range (num_mws=%d)\n",	\
> > +			win_no, ntb->num_mws);				\
> >  		return -EINVAL;						\
> >  	}								\
> > -									\
> > -	return sprintf(page, "%lld\n", ntb->mws_size[win_no - 1]);	\
> > +	idx = array_index_nospec(idx, ntb->num_mws);			\
> > +	return sprintf(page, "%lld\n", ntb->mws_size[idx]);		\
> 
> Keep the original check, if (win_no <= 0 || win_no > ntb->num_mws), and
> then just do:
> 
> 	idx = array_index_nospec(win_no - 1, ntb->num_mws);
> 	return sprintf(page, "%lld\n", ntb->mws_size[idx]);
> 
> That would be simpler.

Thanks for the review.

For minimal changes, that makes sense. I'd also like to update the dev_err
message (the "num_nws" typo, and I think what's invalid is win_no, not
num_mws). So how about combining your suggestion with the log message
update?

-Koichiro
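
A minimal userspace sketch of the combined form discussed above: the
original range check, a reworded error message, and a clamped index. This
is only an illustration under stated assumptions. index_mask_sketch() is a
hypothetical stand-in modeled on the kernel's generic
array_index_mask_nospec() arithmetic, and struct mw_table and
mw_size_show() are made-up names. In the kernel, array_index_nospec() from
<linux/nospec.h> would be used instead, and this userspace version is not
a real speculation barrier.

```c
#include <errno.h>
#include <limits.h>
#include <stdio.h>

#define NUM_MWS_MAX 4

/* Hypothetical stand-in for the relevant fields of struct epf_ntb. */
struct mw_table {
	int num_mws;
	long long mws_size[NUM_MWS_MAX];
};

/*
 * All ones if index < size, zero otherwise, computed without a branch.
 * Relies on arithmetic right shift of a negative long (true for GCC and
 * Clang, but implementation-defined in ISO C).
 */
static unsigned long index_mask_sketch(unsigned long index, unsigned long size)
{
	return (unsigned long)(~(long)(index | (size - 1UL - index)) >>
			       (sizeof(long) * CHAR_BIT - 1));
}

/* Look up the size of memory window win_no (1-based). */
static int mw_size_show(const struct mw_table *t, int win_no, long long *out)
{
	unsigned long idx;

	/* Original-style check, with the message naming the bad value. */
	if (win_no <= 0 || win_no > t->num_mws) {
		fprintf(stderr, "MW%d out of range (num_mws=%d)\n",
			win_no, t->num_mws);
		return -EINVAL;
	}

	/* Clamp the index so an out-of-range value collapses to 0. */
	idx = (unsigned long)(win_no - 1);
	idx &= index_mask_sketch(idx, (unsigned long)t->num_mws);

	*out = t->mws_size[idx];
	return 0;
}
```

The range check stays as the only functional gate; the mask merely ensures
that the index used for the array access can never exceed num_mws - 1.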

> 
> Frank
> >  }
> >
> >  #define EPF_NTB_MW_W(_name)						\
> > @@ -896,7 +898,7 @@ static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
> >  	struct config_group *group = to_config_group(item);		\
> >  	struct epf_ntb *ntb = to_epf_ntb(group);			\
> >  	struct device *dev = &ntb->epf->dev;				\
> > -	int win_no;							\
> > +	int win_no, idx;						\
> >  	u64 val;							\
> >  	int ret;							\
> >  									\
> > @@ -907,12 +909,15 @@ static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
> >  	if (sscanf(#_name, "mw%d", &win_no) != 1)			\
> >  		return -EINVAL;						\
> >  									\
> > -	if (win_no <= 0 || win_no > ntb->num_mws) {			\
> > -		dev_err(dev, "Invalid num_nws: %d value\n", ntb->num_mws); \
> > +	idx = win_no - 1;						\
> > +	if (idx < 0 || idx >= ntb->num_mws) {				\
> > +		dev_err(dev, "MW%d out of range (num_mws=%d)\n",	\
> > +			win_no, ntb->num_mws);				\
> >  		return -EINVAL;						\
> >  	}								\
> >  									\
> > -	ntb->mws_size[win_no - 1] = val;				\
> > +	idx = array_index_nospec(idx, ntb->num_mws);			\
> > +	ntb->mws_size[idx] = val;					\
> >  									\
> >  	return len;							\
> >  }
> > --
> > 2.48.1
> >

* Re: [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
  2025-10-24 16:04   ` Koichiro Den
@ 2025-10-24 16:43     ` Frank Li
  2025-10-27  5:29       ` Koichiro Den
  0 siblings, 1 reply; 37+ messages in thread
From: Frank Li @ 2025-10-24 16:43 UTC (permalink / raw)
  To: Koichiro Den
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, jbrunet, fancer.lancer, arnd,
	pstanner, elfring

On Sat, Oct 25, 2025 at 01:04:01AM +0900, Koichiro Den wrote:
> On Thu, Oct 23, 2025 at 11:27:09PM -0400, Frank Li wrote:
> > On Thu, Oct 23, 2025 at 04:18:51PM +0900, Koichiro Den wrote:
> > > Hi all,
> > >
> > > Motivation
> > > ==========
> > >
> > > On Renesas R-Car S4 the PCIe Endpoint is DesignWare-based and the platform
> > > does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
> > > result, forwarding MSI writes from the Root Complex (RC) to the Endpoint
> > > (EP) is not possible, even if we added an implementation that creates an
> > > MSI domain for the vNTB device so that it can use the existing
> > > drivers/ntb/msi.c, and NTB traffic must fall back to doorbells (polling).
> > > In addition, BAR resources are scarce, which makes it difficult to
> > > dedicate a BAR solely to an NTB/msi window.
> > >
> > > This RFC introduces a generic interrupt backend for NTB. The existing MSI
> > > path is converted to a backend, and a new DW eDMA test-interrupt backend
> > > provides an RC-to-EP interrupt fallback when MSI cannot be used. In
> > > parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
> > > windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
> > > The vNTB EPF and ntb_transport are taught about offsets.
> >
> > Mapping multiple addresses into one BAR is quite valuable, so we can
> > start with that as the first step.
> >
> > But I have a question about the DWC iATU address match mode. For example,
> > BAR0 maps to CPU address 0x8000000 (local CPU), but on different RC
> > systems the BAR0 address seen at the RC side varies. It may be
> > 0xa000_0000 (RC side), or it may be 0xc000_0000 (RC side).
> >
> > The BAR0 mapping is set before link-up.
> >
> > How do you know whether the PCI bus address is 0xa0000000 or 0xc0000000?
>
> Thanks for the comment.
>
> For vNTB this is done in two steps:
>
> 1). In the epf_ntb_bind() path we call pci_epc_map_inbound() with
>     epf_bar->phys_addr == 0. On the DWC side this only triggers
>     dw_pcie_ep_set_bar_init() and does not program an inbound iATU yet.
>     (Please see Patch #5.)
> 2). Later, when ntb_transport's link work runs and we actually need to
>     set up Address Match inbound window(s), pci_epc_map_inbound() is called
>     again with epf_bar->phys_addr != 0 (and an offset for the sub-range). At
>     that point the RC has already enumerated the device and assigned the BAR,
>     so dw_pcie_ep_map_inbound() reads back the assigned BAR value via
>     dw_pcie_ep_read_bar_assigned(), computes pci_addr = base + offset, and
>     programs the inbound iATU in Address Match mode (again, Patch #5 is
>     relevant).
>
> Because we do not program the inbound iATU before enumeration, we don't
> need to know upfront whether the RC will place BAR0 at 0xa000_0000 or
> 0xc000_0000. We read the assigned address right before the actual
> programming (again, see the Patch #5). Am I missing something?

This should work for the vNTB use case. It needs to be generalized for
other usage modes, for example to combine multiple regions into one BAR.

Please add a case to the pci-ep-test function driver so that more people
can review it.

Frank
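
The second step of the flow quoted below (read back the RC-assigned BAR
address, then compute pci_addr = base + offset) can be sketched in plain C
as follows. This is a hedged illustration only: struct inbound_win and
inbound_win_compute() are hypothetical names, and treating an assigned
address of 0 as "not yet enumerated" is an assumption of this sketch, not
something taken from Patch #5.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical description of one Address Match inbound window. */
struct inbound_win {
	uint64_t pci_base;	/* first PCI bus address matched */
	uint64_t pci_limit;	/* last PCI bus address matched (inclusive) */
	uint64_t cpu_addr;	/* local CPU address the window targets */
};

/*
 * Compute the window for a BAR sub-range once the RC has assigned the BAR.
 * bar_assigned is the bus address read back from the BAR register.
 */
static int inbound_win_compute(uint64_t bar_assigned, uint64_t bar_size,
			       uint64_t offset, uint64_t size,
			       uint64_t cpu_addr, struct inbound_win *w)
{
	/* Assumption: 0 means the RC has not enumerated the device yet. */
	if (bar_assigned == 0)
		return -EAGAIN;

	/* The sub-range must be non-empty and lie entirely inside the BAR. */
	if (size == 0 || offset > bar_size || size > bar_size - offset)
		return -EINVAL;

	w->pci_base = bar_assigned + offset;
	w->pci_limit = w->pci_base + size - 1;
	w->cpu_addr = cpu_addr;
	return 0;
}
```

Because the base is read only at programming time, the same code yields a
window at 0xa00f0000 or at 0xc00f0000 for MW2 (offset 0xF0000) depending on
where the RC happened to place the BAR.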

>
> -Koichiro
>
> >
> > Frank
> >
> > >
> > > Backend selection is automatic: if MSI is available we use the MSI backend.
> > > Otherwise, if enabled, the DW eDMA backend is used. If neither is
> > > available, we continue to use doorbells. Existing systems remain unaffected
> > > unless use_intr=1 is set.
> > >
> > > Example layout (R-Car S4):
> > >
> > >   BAR0: Config/Spad
> > >   BAR2 [0x00000-0xF0000]: MW1 (data)
> > >   BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
> > >   BAR4: Doorbell
> > >
> > >   # The corresponding configfs settings (see Patch #25):
> > >   echo 0xF0000 > ./mw1
> > >   echo 0x8000  > ./mw2
> > >   echo 0xF0000 > ./mw2_offset
> > >   echo 2       > ./mw1_bar
> > >   echo 2       > ./mw2_bar
> > >
> > > Summary of changes
> > > ==================
> > >
> > > * NTB core/transport
> > >   - Introduce struct ntb_intr_backend and convert MSI to the new backend.
> > >   - Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
> > >   - Rename module parameter to use_intr (keep use_msi as deprecated alias).
> > >   - Support offsetted partial MWs in ntb_transport.
> > >   - Hardening for peer-reported interrupt values and minor cleanups.
> > >
> > > * PCI Endpoint core and DWC EP controller
> > >   - Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
> > >   - Implement inbound mapping for DesignWare EP (Address Match mode), with
> > >     tracking of multiple inbound iATU entries per BAR and proper teardown.
> > >
> > > * EPF vNTB
> > >   - Add mwN_offset configfs attributes and propagate offsets to inbound maps.
> > >   - Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
> > >     set_bar().
> > >   - Provide .get_pci_epc() so backends can locate the common eDMA instance.
> > >
> > > * DW eDMA
> > >   - Add self-interrupt registration and expose test-IRQ register offsets.
> > >   - Provide dw_edma_find_by_child().
> > >
> > > * Renesas R-Car
> > >   - Place MW2 in BAR2 to host the interrupt window alongside the data MW.
> > >
> > > * Documentation
> > >
> > > Patch layout
> > > ============
> > >
> > > * Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
> > > * Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
> > > * Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
> > > * Patches 18-19 : NTB/EPF glue (.get_pci_epc())
> > > * Patch 20      : Module param name change (use_msi->use_intr, alias preserved)
> > > * Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
> > > * Patch 24      : R-Car: add MW2 in BAR2 for interrupts
> > > * Patch 25      : Documentation updates
> > >
> > > Tested on
> > > =========
> > >
> > > * Renesas R-Car S4 Spider
> > > * Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)
> > >
> > > Performance measurement
> > > =======================
> > >
> > > Even without the DMA acceleration patches for R-Car S4 (which I keep
> > > separate from this RFC patch series), enabling RC-to-EP interrupts
> > > dramatically improves NTB latency on R-Car S4:
> > >
> > > * Before this patch series (NB. use_msi doesn't work on R-Car S4)
> > >
> > >   # Server: sockperf server -i 0.0.0.0
> > >   # Client: sockperf ping-pong -i $SERVER_IP
> > >   ========= Printing statistics for Server No: 0
> > >   [Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
> > >   ====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
> > >         siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
> > >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> > >   Summary: Latency is 5995.680 usec
> > >   Total 45 observations; each percentile contains 0.45 observations
> > >   ---> <MAX> observation = 6121.137
> > >   ---> percentile 99.999 = 6121.137
> > >   ---> percentile 99.990 = 6121.137
> > >   ---> percentile 99.900 = 6121.137
> > >   ---> percentile 99.000 = 6121.137
> > >   ---> percentile 90.000 = 6099.178
> > >   ---> percentile 75.000 = 6054.418
> > >   ---> percentile 50.000 = 5993.040
> > >   ---> percentile 25.000 = 5935.021
> > >   ---> <MIN> observation = 5883.362
> > >
> > > * With this series (use_intr=1)
> > >
> > >   # Server: sockperf server -i 0.0.0.0
> > >   # Client: sockperf ping-pong -i $SERVER_IP
> > >   ========= Printing statistics for Server No: 0
> > >   [Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
> > >   ====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
> > >         siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
> > >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> > >   Summary: Latency is 127.677 usec
> > >   Total 2145 observations; each percentile contains 21.45 observations
> > >   ---> <MAX> observation =  446.691
> > >   ---> percentile 99.999 =  446.691
> > >   ---> percentile 99.990 =  446.691
> > >   ---> percentile 99.900 =  291.234
> > >   ---> percentile 99.000 =  221.515
> > >   ---> percentile 90.000 =  149.277
> > >   ---> percentile 75.000 =  124.497
> > >   ---> percentile 50.000 =  121.137
> > >   ---> percentile 25.000 =  119.037
> > >   ---> <MIN> observation =  113.637
> > >
> > > Feedback welcome on both the approach and the splitting/routing preference.
> > >
> > > (The series spans NTB, PCI EP/DWC and dmaengine/dw-edma. I'm happy to split
> > > later if preferred.)
> > >
> > > Thanks for reviewing.
> > >
> > >
> > > Koichiro Den (25):
> > >   PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
> > >     access
> > >   PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
> > >   NTB: epf: Handle mwN_offset for inbound MW regions
> > >   PCI: endpoint: Add inbound mapping ops to EPC core
> > >   PCI: dwc: ep: Implement EPC inbound mapping support
> > >   PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
> > >   NTB: Add offset parameter to MW translation APIs
> > >   PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
> > >     present
> > >   NTB: ntb_transport: Support offsetted partial memory windows
> > >   NTB/msi: Support offsetted partial memory window for MSI
> > >   NTB/msi: Do not force MW to its maximum possible size
> > >   NTB: ntb_transport: Stricter checks for peer-reported interrupt values
> > >   NTB/msi: Skip mw_set_trans() if already configured
> > >   NTB/msi: Add a inner loop for PCI-MSI cases
> > >   dmaengine: dw-edma: Add self-interrupt registration API
> > >   dmaengine: dw-edma: Expose self-IRQ register offsets
> > >   dmaengine: dw-edma: Add dw_edma_find_by_child() helper
> > >   NTB: core: Add .get_pci_epc() to ntb_dev_ops
> > >   NTB: epf: vntb: Implement .get_pci_epc() callback
> > >   NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
> > >   NTB: Introduce generic interrupt backend abstraction and convert MSI
> > >   NTB: ntb_transport: Rename MSI symbols to generic interrupt form
> > >   NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
> > >   NTB: epf: Add MW2 for interrupt use on Renesas R-Car
> > >   Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
> > >     usage
> > >
> > >  Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-
> > >  drivers/dma/dw-edma/dw-edma-core.c            | 109 ++++++++
> > >  drivers/dma/dw-edma/dw-edma-core.h            |  18 ++
> > >  drivers/dma/dw-edma/dw-edma-v0-core.c         |  15 ++
> > >  drivers/ntb/Kconfig                           |  15 ++
> > >  drivers/ntb/Makefile                          |   6 +-
> > >  drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-
> > >  drivers/ntb/hw/epf/ntb_hw_epf.c               |  46 ++--
> > >  drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-
> > >  drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-
> > >  drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-
> > >  drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-
> > >  drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-
> > >  drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-
> > >  drivers/ntb/intr_common.c                     |  61 +++++
> > >  drivers/ntb/intr_dw_edma.c                    | 253 ++++++++++++++++++
> > >  drivers/ntb/msi.c                             | 186 +++++++------
> > >  drivers/ntb/ntb_transport.c                   | 155 ++++++-----
> > >  drivers/ntb/test/ntb_msi_test.c               |  26 +-
> > >  drivers/ntb/test/ntb_perf.c                   |   4 +-
> > >  drivers/ntb/test/ntb_tool.c                   |   6 +-
> > >  .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++--
> > >  drivers/pci/controller/dwc/pcie-designware.c  |   1 +
> > >  drivers/pci/controller/dwc/pcie-designware.h  |   2 +
> > >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
> > >  drivers/pci/endpoint/pci-epc-core.c           |  44 +++
> > >  include/linux/dma/edma.h                      |  31 +++
> > >  include/linux/ntb.h                           | 134 +++++++---
> > >  include/linux/pci-epc.h                       |  11 +
> > >  29 files changed, 1310 insertions(+), 300 deletions(-)
> > >  create mode 100644 drivers/ntb/intr_common.c
> > >  create mode 100644 drivers/ntb/intr_dw_edma.c
> > >
> > > --
> > > 2.48.1
> > >


* Re: [RFC PATCH 01/25] PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access
  2025-10-24 16:24     ` Koichiro Den
@ 2025-10-24 18:40       ` Frank Li
  0 siblings, 0 replies; 37+ messages in thread
From: Frank Li @ 2025-10-24 18:40 UTC (permalink / raw)
  To: Koichiro Den
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, jbrunet, fancer.lancer, arnd,
	pstanner, elfring

On Sat, Oct 25, 2025 at 01:24:29AM +0900, Koichiro Den wrote:
> On Thu, Oct 23, 2025 at 08:06:40PM -0400, Frank Li wrote:
> > On Thu, Oct 23, 2025 at 04:18:52PM +0900, Koichiro Den wrote:
> > > Follow common kernel idioms for indices derived from configfs attributes
> > > and suppress Smatch warnings:
> > >
> > >   epf_ntb_mw1_show() warn: potential spectre issue 'ntb->mws_size' [r]
> > >   epf_ntb_mw1_store() warn: potential spectre issue 'ntb->mws_size' [w]
> > >
> > > No functional changes.
> > >
> > > Signed-off-by: Koichiro Den <den@valinux.co.jp>
> > > ---
> > >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 23 +++++++++++--------
> > >  1 file changed, 14 insertions(+), 9 deletions(-)
> > >
> > > diff --git a/drivers/pci/endpoint/functions/pci-epf-vntb.c b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > > index 83e9ab10f9c4..55307cd613c9 100644
> > > --- a/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > > +++ b/drivers/pci/endpoint/functions/pci-epf-vntb.c
> > > @@ -876,17 +876,19 @@ static ssize_t epf_ntb_##_name##_show(struct config_item *item,		\
> > >  	struct config_group *group = to_config_group(item);		\
> > >  	struct epf_ntb *ntb = to_epf_ntb(group);			\
> > >  	struct device *dev = &ntb->epf->dev;				\
> > > -	int win_no;							\
> > > +	int win_no, idx;						\
> > >  									\
> > >  	if (sscanf(#_name, "mw%d", &win_no) != 1)			\
> > >  		return -EINVAL;						\
> > >  									\
> > > -	if (win_no <= 0 || win_no > ntb->num_mws) {			\
> > > -		dev_err(dev, "Invalid num_nws: %d value\n", ntb->num_mws); \
> > > +	idx = win_no - 1;						\
> > > +	if (idx < 0 || idx >= ntb->num_mws) {				\
> > > +		dev_err(dev, "MW%d out of range (num_mws=%d)\n",	\
> > > +			win_no, ntb->num_mws);				\
> > >  		return -EINVAL;						\
> > >  	}								\
> > > -									\
> > > -	return sprintf(page, "%lld\n", ntb->mws_size[win_no - 1]);	\
> > > +	idx = array_index_nospec(idx, ntb->num_mws);			\
> > > +	return sprintf(page, "%lld\n", ntb->mws_size[idx]);		\
> >
> > keep original check if (win_no <= 0 || win_no > ntb->num_mws)
> >
> > just
> > 	idx = array_index_nospec(win_no - 1, ntb->num_mws);
> > 	return sprintf(page, "%lld\n", ntb->mws_size[idx]);
> >
> > It should be more simple.
>
> Thanks for the review.
>
> For minimal changes, that makes sense. I'd also like to update the dev_err
> message (the "num_nws" typo, and I think what's invalid is win_no, not
> num_mws). So how about combining your suggestion with the log message
> update?

Okay!
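
For reference, the combined form agreed on above (original range check,
corrected message, array_index_nospec() on win_no - 1) might look roughly like
the userspace model below. This is only a sketch and not the respun patch:
struct epf_ntb_model, index_nospec_model() and mw_show_model() are hypothetical
stand-ins, and the stand-in only reproduces the architectural result of
array_index_nospec() (index if in bounds, 0 otherwise), not its branchless
masking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_MW 4

/* Hypothetical, cut-down stand-in for struct epf_ntb. */
struct epf_ntb_model {
	int num_mws;
	unsigned long long mws_size[MAX_MW];
};

/*
 * Userspace stand-in for the kernel's array_index_nospec(): returns index
 * when index < size and 0 otherwise. The real macro uses a branchless mask
 * so the clamp also holds under speculation; this model does not.
 */
static size_t index_nospec_model(size_t index, size_t size)
{
	assert(size > 0);
	return index < size ? index : 0;
}

/* Models the body of the mwN show handler; win_no is 1-based. */
static int mw_show_model(const struct epf_ntb_model *ntb, int win_no,
			 char *page, size_t len)
{
	size_t idx;

	/* Original range check, kept as suggested in the review. */
	if (win_no <= 0 || win_no > ntb->num_mws) {
		/* Updated message: report the invalid window, not num_mws. */
		fprintf(stderr, "MW%d out of range (num_mws=%d)\n",
			win_no, ntb->num_mws);
		return -EINVAL;
	}

	idx = index_nospec_model((size_t)(win_no - 1), (size_t)ntb->num_mws);
	return snprintf(page, len, "%llu\n", ntb->mws_size[idx]);
}
```

With num_mws = 2, window numbers 0 and 3 are rejected with -EINVAL, while
window 2 prints the second entry of mws_size[].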

Frank

>
> -Koichiro
>
> >
> > Frank
> > >  }
> > >
> > >  #define EPF_NTB_MW_W(_name)						\
> > > @@ -896,7 +898,7 @@ static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
> > >  	struct config_group *group = to_config_group(item);		\
> > >  	struct epf_ntb *ntb = to_epf_ntb(group);			\
> > >  	struct device *dev = &ntb->epf->dev;				\
> > > -	int win_no;							\
> > > +	int win_no, idx;						\
> > >  	u64 val;							\
> > >  	int ret;							\
> > >  									\
> > > @@ -907,12 +909,15 @@ static ssize_t epf_ntb_##_name##_store(struct config_item *item,	\
> > >  	if (sscanf(#_name, "mw%d", &win_no) != 1)			\
> > >  		return -EINVAL;						\
> > >  									\
> > > -	if (win_no <= 0 || win_no > ntb->num_mws) {			\
> > > -		dev_err(dev, "Invalid num_nws: %d value\n", ntb->num_mws); \
> > > +	idx = win_no - 1;						\
> > > +	if (idx < 0 || idx >= ntb->num_mws) {				\
> > > +		dev_err(dev, "MW%d out of range (num_mws=%d)\n",	\
> > > +			win_no, ntb->num_mws);				\
> > >  		return -EINVAL;						\
> > >  	}								\
> > >  									\
> > > -	ntb->mws_size[win_no - 1] = val;				\
> > > +	idx = array_index_nospec(idx, ntb->num_mws);			\
> > > +	ntb->mws_size[idx] = val;					\
> > >  									\
> > >  	return len;							\
> > >  }
> > > --
> > > 2.48.1
> > >


* Re: [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
  2025-10-24 16:43     ` Frank Li
@ 2025-10-27  5:29       ` Koichiro Den
  2025-10-28 20:50         ` Frank Li
  0 siblings, 1 reply; 37+ messages in thread
From: Koichiro Den @ 2025-10-27  5:29 UTC (permalink / raw)
  To: Frank Li
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, jbrunet, fancer.lancer, arnd,
	pstanner, elfring

On Fri, Oct 24, 2025 at 12:43:47PM -0400, Frank Li wrote:
> On Sat, Oct 25, 2025 at 01:04:01AM +0900, Koichiro Den wrote:
> > On Thu, Oct 23, 2025 at 11:27:09PM -0400, Frank Li wrote:
> > > On Thu, Oct 23, 2025 at 04:18:51PM +0900, Koichiro Den wrote:
> > > > Hi all,
> > > >
> > > > Motivation
> > > > ==========
> > > >
> > > > On Renesas R-Car S4 the PCIe Endpoint is DesignWare-based and the platform
> > > > does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
> > > > result, forwarding MSI writes from the Root Complex (RC) to the Endpoint
> > > > (EP) is not possible even if we would add implementation to create a MSI
> > > > domain for the vNTB device to use existing drivers/ntb/msi.c, and NTB
> > > > traffic must fall back to doorbells (polling). In addition, BAR resources
> > > > are scarce, which makes it difficult to dedicate a BAR solely to an
> > > > NTB/msi window.
> > > >
> > > > This RFC introduces a generic interrupt backend for NTB. The existing MSI
> > > > path is converted to a backend, and a new DW eDMA test-interrupt backend
> > > > provides an RC-to-EP interrupt fallback when MSI cannot be used. In
> > > > parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
> > > > windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
> > > > The vNTB EPF and ntb_transport are taught about offsets.
> > >
> > > Mapping multiple address ranges into one BAR is quite valuable, so we can
> > > start with that as the first step.
> > >
> > > But I have a question about the DWC iATU address mapping mode. For example,
> > > BAR0 maps to CPU address 0x8000000 (local CPU), but the BAR0 address on the
> > > RC side varies between RC systems. It may be 0xa000_0000 (RC side), or it
> > > may be 0xc000_0000 (RC side).
> > >
> > > The BAR0 mapping is set before link-up.
> > >
> > > How do you know whether the PCI bus address is 0xa0000000 or 0xc0000000?
> >
> > Thanks for the comment.
> >
> > For vNTB this is done in two steps:
> >
> > 1). In the epf_ntb_bind() path we call pci_epc_map_inbound() with
> >     epf_bar->phys_addr == 0. On the DWC side this only triggers
> >     dw_pcie_ep_set_bar_init() and does not program an inbound iATU yet.
> >     (pls see Patch #5).
> > 2). Later, when ntb_transport's link work runs and we actually need to
> >     set up Address Match inbound window(s), pci_epc_map_inbound() is called
> >     again with epf_bar->phys_addr != 0 (and an offset for the sub‑range). At
> >     that point the RC has already enumerated the device and assigned the BAR,
> >     so dw_pcie_ep_map_inbound() reads back the assigned BAR value via
> >     dw_pcie_ep_read_bar_assigned(), computes pci_addr = base + offset, and
> >     programs the inbound iATU in Address Match mode (again, Patch #5 is
> >     relevant).
> >
> > Because we do not program the inbound iATU before enumeration, we don't
> > need to know upfront whether the RC will place BAR0 at 0xa000_0000 or
> > 0xc000_0000. We read the assigned address right before the actual
> > programming (again, see the Patch #5). Am I missing something?
> 
> This should work for the vNTB use case. It needs to be generalized for other
> usage modes, perhaps to combine multiple regions into one BAR.

IMO it's already generalized infrastructure. I'm not sure if we need to
retrofit other EPFs (pci_epc_set_bar() callers) in this series. We can do
that when there's really a concrete need.
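
To illustrate, here is a rough userspace model of the step-2 address
computation quoted above (pci_addr = base + offset). It is not the code from
Patch #5: struct mw_subrange and the three helpers are made up for
illustration, and the sanity checks are only what one would plausibly want
when several windows share a BAR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one sub-range (memory window) of a BAR. */
struct mw_subrange {
	uint64_t offset;	/* offset of the window inside the BAR */
	uint64_t size;		/* window size in bytes */
};

/* True if the sub-range lies entirely inside a BAR of bar_size bytes. */
static bool subrange_fits(const struct mw_subrange *mw, uint64_t bar_size)
{
	return mw->size != 0 && mw->offset <= bar_size &&
	       mw->size <= bar_size - mw->offset;
}

/* True if two sub-ranges of the same BAR overlap. */
static bool subranges_overlap(const struct mw_subrange *a,
			      const struct mw_subrange *b)
{
	return a->offset < b->offset + b->size &&
	       b->offset < a->offset + a->size;
}

/*
 * PCI bus address to match on: the BAR base assigned by the RC (read back
 * after enumeration) plus the window's offset inside that BAR.
 */
static uint64_t inbound_match_addr(uint64_t bar_assigned_base,
				   const struct mw_subrange *mw)
{
	return bar_assigned_base + mw->offset;
}
```

With the R-Car S4 example layout (MW1 = {0x0, 0xF0000}, MW2 = {0xF0000,
0x8000}) and assuming a 1 MiB BAR2, both windows fit and do not overlap. MW2
would be matched at 0xa00f0000 if the RC places BAR2 at 0xa0000000, and at
0xc00f0000 if it places it at 0xc0000000. Nothing has to be known before
enumeration.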

> 
> Add a test case to the pci-epf-test function driver so that more people can
> review it.

This sounds reasonable, though it seems to involve a bit of duplicate work,
i.e. adding similar configfs knobs on the pci-epf-test side, expanding the
control register fields, making pci_endpoint_test aware of them, and making
sure that the selftests still pass. Please correct me if I'm off here. I'll
take some time to prepare that.

Thanks for the review.

-Koichiro

> 
> Frank
> 
> >
> > -Koichiro
> >
> > >
> > > Frank
> > >
> > > >
> > > > Backend selection is automatic: if MSI is available we use the MSI backend.
> > > > Otherwise, if enabled, the DW eDMA backend is used. If neither is
> > > > available, we continue to use doorbells. Existing systems remain unaffected
> > > > unless use_intr=1 is set.
> > > >
> > > > Example layout (R-Car S4):
> > > >
> > > >   BAR0: Config/Spad
> > > >   BAR2 [0x00000-0xF0000]: MW1 (data)
> > > >   BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
> > > >   BAR4: Doorbell
> > > >
> > > >   # The corresponding configfs settings (see Patch #25):
> > > >   echo 0xF0000 > ./mw1
> > > >   echo 0x8000  > ./mw2
> > > >   echo 0xF0000 > ./mw2_offset
> > > >   echo 2       > ./mw1_bar
> > > >   echo 2       > ./mw2_bar
> > > >
> > > > Summary of changes
> > > > ==================
> > > >
> > > > * NTB core/transport
> > > >   - Introduce struct ntb_intr_backend and convert MSI to the new backend.
> > > >   - Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
> > > >   - Rename module parameter to use_intr (keep use_msi as deprecated alias).
> > > >   - Support offsetted partial MWs in ntb_transport.
> > > >   - Hardening for peer-reported interrupt values and minor cleanups.
> > > >
> > > > * PCI Endpoint core and DWC EP controller
> > > >   - Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
> > > >   - Implement inbound mapping for DesignWare EP (Address Match mode), with
> > > >     tracking of multiple inbound iATU entries per BAR and proper teardown.
> > > >
> > > > * EPF vNTB
> > > >   - Add mwN_offset configfs attributes and propagate offsets to inbound maps.
> > > >   - Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
> > > >     set_bar().
> > > >   - Provide .get_pci_epc() so backends can locate the common eDMA instance.
> > > >
> > > > * DW eDMA
> > > >   - Add self-interrupt registration and expose test-IRQ register offsets.
> > > >   - Provide dw_edma_find_by_child().
> > > >
> > > > * Renesas R-Car
> > > >   - Place MW2 in BAR2 to host the interrupt window alongside the data MW.
> > > >
> > > > * Documentation
> > > >
> > > > Patch layout
> > > > ============
> > > >
> > > > * Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
> > > > * Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
> > > > * Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
> > > > * Patches 18-19 : NTB/EPF glue (.get_pci_epc())
> > > > * Patch 20      : Module param name change (use_msi->use_intr, alias preserved)
> > > > * Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
> > > > * Patch 24      : R-Car: add MW2 in BAR2 for interrupts
> > > > * Patch 25      : Documentation updates
> > > >
> > > > Tested on
> > > > =========
> > > >
> > > > * Renesas R-Car S4 Spider
> > > > * Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)
> > > >
> > > > Performance measurement
> > > > =======================
> > > >
> > > > Even without the DMA acceleration patches for R-Car S4 (which I keep
> > > > separate from this RFC patch series), enabling RC-to-EP interrupts
> > > > dramatically improves NTB latency on R-Car S4:
> > > >
> > > > * Before this patch series (NB. use_msi doesn't work on R-Car S4)
> > > >
> > > >   # Server: sockperf server -i 0.0.0.0
> > > >   # Client: sockperf ping-pong -i $SERVER_IP
> > > >   ========= Printing statistics for Server No: 0
> > > >   [Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
> > > >   ====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
> > > >         siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
> > > >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> > > >   Summary: Latency is 5995.680 usec
> > > >   Total 45 observations; each percentile contains 0.45 observations
> > > >   ---> <MAX> observation = 6121.137
> > > >   ---> percentile 99.999 = 6121.137
> > > >   ---> percentile 99.990 = 6121.137
> > > >   ---> percentile 99.900 = 6121.137
> > > >   ---> percentile 99.000 = 6121.137
> > > >   ---> percentile 90.000 = 6099.178
> > > >   ---> percentile 75.000 = 6054.418
> > > >   ---> percentile 50.000 = 5993.040
> > > >   ---> percentile 25.000 = 5935.021
> > > >   ---> <MIN> observation = 5883.362
> > > >
> > > > * With this series (use_intr=1)
> > > >
> > > >   # Server: sockperf server -i 0.0.0.0
> > > >   # Client: sockperf ping-pong -i $SERVER_IP
> > > >   ========= Printing statistics for Server No: 0
> > > >   [Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
> > > >   ====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
> > > >         siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
> > > >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> > > >   Summary: Latency is 127.677 usec
> > > >   Total 2145 observations; each percentile contains 21.45 observations
> > > >   ---> <MAX> observation =  446.691
> > > >   ---> percentile 99.999 =  446.691
> > > >   ---> percentile 99.990 =  446.691
> > > >   ---> percentile 99.900 =  291.234
> > > >   ---> percentile 99.000 =  221.515
> > > >   ---> percentile 90.000 =  149.277
> > > >   ---> percentile 75.000 =  124.497
> > > >   ---> percentile 50.000 =  121.137
> > > >   ---> percentile 25.000 =  119.037
> > > >   ---> <MIN> observation =  113.637
> > > >
> > > > Feedback welcome on both the approach and the splitting/routing preference.
> > > >
> > > > (The series spans NTB, PCI EP/DWC and dmaengine/dw-edma. I'm happy to split
> > > > later if preferred.)
> > > >
> > > > Thanks for reviewing.
> > > >
> > > >
> > > > Koichiro Den (25):
> > > >   PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
> > > >     access
> > > >   PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
> > > >   NTB: epf: Handle mwN_offset for inbound MW regions
> > > >   PCI: endpoint: Add inbound mapping ops to EPC core
> > > >   PCI: dwc: ep: Implement EPC inbound mapping support
> > > >   PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
> > > >   NTB: Add offset parameter to MW translation APIs
> > > >   PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
> > > >     present
> > > >   NTB: ntb_transport: Support offsetted partial memory windows
> > > >   NTB/msi: Support offsetted partial memory window for MSI
> > > >   NTB/msi: Do not force MW to its maximum possible size
> > > >   NTB: ntb_transport: Stricter checks for peer-reported interrupt values
> > > >   NTB/msi: Skip mw_set_trans() if already configured
> > > >   NTB/msi: Add a inner loop for PCI-MSI cases
> > > >   dmaengine: dw-edma: Add self-interrupt registration API
> > > >   dmaengine: dw-edma: Expose self-IRQ register offsets
> > > >   dmaengine: dw-edma: Add dw_edma_find_by_child() helper
> > > >   NTB: core: Add .get_pci_epc() to ntb_dev_ops
> > > >   NTB: epf: vntb: Implement .get_pci_epc() callback
> > > >   NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
> > > >   NTB: Introduce generic interrupt backend abstraction and convert MSI
> > > >   NTB: ntb_transport: Rename MSI symbols to generic interrupt form
> > > >   NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
> > > >   NTB: epf: Add MW2 for interrupt use on Renesas R-Car
> > > >   Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
> > > >     usage
> > > >
> > > >  Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-
> > > >  drivers/dma/dw-edma/dw-edma-core.c            | 109 ++++++++
> > > >  drivers/dma/dw-edma/dw-edma-core.h            |  18 ++
> > > >  drivers/dma/dw-edma/dw-edma-v0-core.c         |  15 ++
> > > >  drivers/ntb/Kconfig                           |  15 ++
> > > >  drivers/ntb/Makefile                          |   6 +-
> > > >  drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-
> > > >  drivers/ntb/hw/epf/ntb_hw_epf.c               |  46 ++--
> > > >  drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-
> > > >  drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-
> > > >  drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-
> > > >  drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-
> > > >  drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-
> > > >  drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-
> > > >  drivers/ntb/intr_common.c                     |  61 +++++
> > > >  drivers/ntb/intr_dw_edma.c                    | 253 ++++++++++++++++++
> > > >  drivers/ntb/msi.c                             | 186 +++++++------
> > > >  drivers/ntb/ntb_transport.c                   | 155 ++++++-----
> > > >  drivers/ntb/test/ntb_msi_test.c               |  26 +-
> > > >  drivers/ntb/test/ntb_perf.c                   |   4 +-
> > > >  drivers/ntb/test/ntb_tool.c                   |   6 +-
> > > >  .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++--
> > > >  drivers/pci/controller/dwc/pcie-designware.c  |   1 +
> > > >  drivers/pci/controller/dwc/pcie-designware.h  |   2 +
> > > >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
> > > >  drivers/pci/endpoint/pci-epc-core.c           |  44 +++
> > > >  include/linux/dma/edma.h                      |  31 +++
> > > >  include/linux/ntb.h                           | 134 +++++++---
> > > >  include/linux/pci-epc.h                       |  11 +
> > > >  29 files changed, 1310 insertions(+), 300 deletions(-)
> > > >  create mode 100644 drivers/ntb/intr_common.c
> > > >  create mode 100644 drivers/ntb/intr_dw_edma.c
> > > >
> > > > --
> > > > 2.48.1
> > > >


* Re: [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
  2025-10-27  5:29       ` Koichiro Den
@ 2025-10-28 20:50         ` Frank Li
  2025-10-29  7:13           ` Koichiro Den
  0 siblings, 1 reply; 37+ messages in thread
From: Frank Li @ 2025-10-28 20:50 UTC (permalink / raw)
  To: Koichiro Den
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, jbrunet, fancer.lancer, arnd,
	pstanner, elfring

On Mon, Oct 27, 2025 at 02:29:30PM +0900, Koichiro Den wrote:
> On Fri, Oct 24, 2025 at 12:43:47PM -0400, Frank Li wrote:
> > On Sat, Oct 25, 2025 at 01:04:01AM +0900, Koichiro Den wrote:
> > > On Thu, Oct 23, 2025 at 11:27:09PM -0400, Frank Li wrote:
> > > > On Thu, Oct 23, 2025 at 04:18:51PM +0900, Koichiro Den wrote:
> > > > > Hi all,
> > > > >
> > > > > Motivation
> > > > > ==========
> > > > >
> > > > > On Renesas R-Car S4 the PCIe Endpoint is DesignWare-based and the platform
> > > > > does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
> > > > > result, forwarding MSI writes from the Root Complex (RC) to the Endpoint
> > > > > (EP) is not possible even if we would add implementation to create a MSI
> > > > > domain for the vNTB device to use existing drivers/ntb/msi.c, and NTB
> > > > > traffic must fall back to doorbells (polling). In addition, BAR resources
> > > > > are scarce, which makes it difficult to dedicate a BAR solely to an
> > > > > NTB/msi window.
> > > > >
> > > > > This RFC introduces a generic interrupt backend for NTB. The existing MSI
> > > > > path is converted to a backend, and a new DW eDMA test-interrupt backend
> > > > > provides an RC-to-EP interrupt fallback when MSI cannot be used. In
> > > > > parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
> > > > > windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
> > > > > The vNTB EPF and ntb_transport are taught about offsets.
> > > >
> > > > > Mapping multiple address ranges into one BAR is quite valuable, so we can
> > > > > start with that as the first step.
> > > > >
> > > > > But I have a question about the DWC iATU address mapping mode. For example,
> > > > > BAR0 maps to CPU address 0x8000000 (local CPU), but the BAR0 address on the
> > > > > RC side varies between RC systems. It may be 0xa000_0000 (RC side), or it
> > > > > may be 0xc000_0000 (RC side).
> > > > >
> > > > > The BAR0 mapping is set before link-up.
> > > > >
> > > > > How do you know whether the PCI bus address is 0xa0000000 or 0xc0000000?
> > >
> > > Thanks for the comment.
> > >
> > > For vNTB this is done in two steps:
> > >
> > > 1). In the epf_ntb_bind() path we call pci_epc_map_inbound() with
> > >     epf_bar->phys_addr == 0. On the DWC side this only triggers
> > >     dw_pcie_ep_set_bar_init() and does not program an inbound iATU yet.
> > >     (pls see Patch #5).
> > > 2). Later, when ntb_transport's link work runs and we actually need to
> > >     set up Address Match inbound window(s), pci_epc_map_inbound() is called
> > >     again with epf_bar->phys_addr != 0 (and an offset for the sub‑range). At
> > >     that point the RC has already enumerated the device and assigned the BAR,
> > >     so dw_pcie_ep_map_inbound() reads back the assigned BAR value via
> > >     dw_pcie_ep_read_bar_assigned(), computes pci_addr = base + offset, and
> > >     programs the inbound iATU in Address Match mode (again, Patch #5 is
> > >     relevant).
> > >
> > > Because we do not program the inbound iATU before enumeration, we don't
> > > need to know upfront whether the RC will place BAR0 at 0xa000_0000 or
> > > 0xc000_0000. We read the assigned address right before the actual
> > > programming (again, see the Patch #5). Am I missing something?
> >
> > > This should work for the vNTB use case. It needs to be generalized for other
> > > usage modes, perhaps to combine multiple regions into one BAR.
>
> IMO it's already generalized infrastructure. I'm not sure if we need to
> retrofit other EPFs (pci_epc_set_bar() callers) in this series. We can do
> that when there's really a concrete need.
>
> >
> > > Add a test case to the pci-epf-test function driver so that more people can
> > > review it.
>
> This sounds reasonable, though it seems to involve a bit of duplicate work,
> i.e. adding similar configfs knobs on the pci-epf-test side, expanding the
> control register fields, making pci_endpoint_test aware of them, and making
> sure that the selftests still pass. Please correct me if I'm off here. I'll
> take some time to prepare that.
>
> Thanks for the review.

I like the idea of combining the eDMA address range into one BAR, so that
the RC-side NTB EPF driver can use the dw-edma driver (presumably just by
following drivers/dma/dw-edma/dw-edma-pcie.c) to register a host-side DMA
engine. The NTB transport can then use this DMA engine for data transfer
(with a small modification to support peripheral mode).

That way data transfer speed can improve a lot. Of course, eDMA can also
serve as the doorbell if the DWC controller has enough DMA channels.

Frank

>
> -Koichiro
>
> >
> > Frank
> >
> > >
> > > -Koichiro
> > >
> > > >
> > > > Frank
> > > >
> > > > >
> > > > > Backend selection is automatic: if MSI is available we use the MSI backend.
> > > > > Otherwise, if enabled, the DW eDMA backend is used. If neither is
> > > > > available, we continue to use doorbells. Existing systems remain unaffected
> > > > > unless use_intr=1 is set.
> > > > >
> > > > > Example layout (R-Car S4):
> > > > >
> > > > >   BAR0: Config/Spad
> > > > >   BAR2 [0x00000-0xF0000]: MW1 (data)
> > > > >   BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
> > > > >   BAR4: Doorbell
> > > > >
> > > > >   # The corresponding configfs settings (see Patch #25):
> > > > >   echo 0xF0000 > ./mw1
> > > > >   echo 0x8000  > ./mw2
> > > > >   echo 0xF0000 > ./mw2_offset
> > > > >   echo 2       > ./mw1_bar
> > > > >   echo 2       > ./mw2_bar
> > > > >
> > > > > Summary of changes
> > > > > ==================
> > > > >
> > > > > * NTB core/transport
> > > > >   - Introduce struct ntb_intr_backend and convert MSI to the new backend.
> > > > >   - Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
> > > > >   - Rename module parameter to use_intr (keep use_msi as deprecated alias).
> > > > >   - Support offsetted partial MWs in ntb_transport.
> > > > >   - Hardening for peer-reported interrupt values and minor cleanups.
> > > > >
> > > > > * PCI Endpoint core and DWC EP controller
> > > > >   - Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
> > > > >   - Implement inbound mapping for DesignWare EP (Address Match mode), with
> > > > >     tracking of multiple inbound iATU entries per BAR and proper teardown.
> > > > >
> > > > > * EPF vNTB
> > > > >   - Add mwN_offset configfs attributes and propagate offsets to inbound maps.
> > > > >   - Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
> > > > >     set_bar().
> > > > >   - Provide .get_pci_epc() so backends can locate the common eDMA instance.
> > > > >
> > > > > * DW eDMA
> > > > >   - Add self-interrupt registration and expose test-IRQ register offsets.
> > > > >   - Provide dw_edma_find_by_child().
> > > > >
> > > > > * Renesas R-Car
> > > > >   - Place MW2 in BAR2 to host the interrupt window alongside the data MW.
> > > > >
> > > > > * Documentation
> > > > >
> > > > > Patch layout
> > > > > ============
> > > > >
> > > > > * Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
> > > > > * Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
> > > > > * Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
> > > > > * Patches 18-19 : NTB/EPF glue (.get_pci_epc())
> > > > > * Patch 20      : Module param name change (use_msi->use_intr, alias preserved)
> > > > > * Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
> > > > > * Patch 24      : R-Car: add MW2 in BAR2 for interrupts
> > > > > * Patch 25      : Documentation updates
> > > > >
> > > > > Tested on
> > > > > =========
> > > > >
> > > > > * Renesas R-Car S4 Spider
> > > > > * Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)
> > > > >
> > > > > Performance measurement
> > > > > =======================
> > > > >
> > > > > Even without the DMA acceleration patches for R-Car S4 (which I keep
> > > > > separate from this RFC patch series), enabling RC-to-EP interrupts
> > > > > dramatically improves NTB latency on R-Car S4:
> > > > >
> > > > > * Before this patch series (NB. use_msi doesn't work on R-Car S4)
> > > > >
> > > > >   # Server: sockperf server -i 0.0.0.0
> > > > >   # Client: sockperf ping-pong -i $SERVER_IP
> > > > >   ========= Printing statistics for Server No: 0
> > > > >   [Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
> > > > >   ====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
> > > > >         siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
> > > > >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> > > > >   Summary: Latency is 5995.680 usec
> > > > >   Total 45 observations; each percentile contains 0.45 observations
> > > > >   ---> <MAX> observation = 6121.137
> > > > >   ---> percentile 99.999 = 6121.137
> > > > >   ---> percentile 99.990 = 6121.137
> > > > >   ---> percentile 99.900 = 6121.137
> > > > >   ---> percentile 99.000 = 6121.137
> > > > >   ---> percentile 90.000 = 6099.178
> > > > >   ---> percentile 75.000 = 6054.418
> > > > >   ---> percentile 50.000 = 5993.040
> > > > >   ---> percentile 25.000 = 5935.021
> > > > >   ---> <MIN> observation = 5883.362
> > > > >
> > > > > * With this series (use_intr=1)
> > > > >
> > > > >   # Server: sockperf server -i 0.0.0.0
> > > > >   # Client: sockperf ping-pong -i $SERVER_IP
> > > > >   ========= Printing statistics for Server No: 0
> > > > >   [Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
> > > > >   ====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
> > > > >         siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
> > > > >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> > > > >   Summary: Latency is 127.677 usec
> > > > >   Total 2145 observations; each percentile contains 21.45 observations
> > > > >   ---> <MAX> observation =  446.691
> > > > >   ---> percentile 99.999 =  446.691
> > > > >   ---> percentile 99.990 =  446.691
> > > > >   ---> percentile 99.900 =  291.234
> > > > >   ---> percentile 99.000 =  221.515
> > > > >   ---> percentile 90.000 =  149.277
> > > > >   ---> percentile 75.000 =  124.497
> > > > >   ---> percentile 50.000 =  121.137
> > > > >   ---> percentile 25.000 =  119.037
> > > > >   ---> <MIN> observation =  113.637
> > > > >
> > > > > Feedback welcome on both the approach and the splitting/routing preference.
> > > > >
> > > > > (The series spans NTB, PCI EP/DWC and dmaengine/dw-edma. I'm happy to split
> > > > > later if preferred.)
> > > > >
> > > > > Thanks for reviewing.
> > > > >
> > > > >
> > > > > Koichiro Den (25):
> > > > >   PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
> > > > >     access
> > > > >   PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
> > > > >   NTB: epf: Handle mwN_offset for inbound MW regions
> > > > >   PCI: endpoint: Add inbound mapping ops to EPC core
> > > > >   PCI: dwc: ep: Implement EPC inbound mapping support
> > > > >   PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
> > > > >   NTB: Add offset parameter to MW translation APIs
> > > > >   PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
> > > > >     present
> > > > >   NTB: ntb_transport: Support offsetted partial memory windows
> > > > >   NTB/msi: Support offsetted partial memory window for MSI
> > > > >   NTB/msi: Do not force MW to its maximum possible size
> > > > >   NTB: ntb_transport: Stricter checks for peer-reported interrupt values
> > > > >   NTB/msi: Skip mw_set_trans() if already configured
> > > > >   NTB/msi: Add a inner loop for PCI-MSI cases
> > > > >   dmaengine: dw-edma: Add self-interrupt registration API
> > > > >   dmaengine: dw-edma: Expose self-IRQ register offsets
> > > > >   dmaengine: dw-edma: Add dw_edma_find_by_child() helper
> > > > >   NTB: core: Add .get_pci_epc() to ntb_dev_ops
> > > > >   NTB: epf: vntb: Implement .get_pci_epc() callback
> > > > >   NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
> > > > >   NTB: Introduce generic interrupt backend abstraction and convert MSI
> > > > >   NTB: ntb_transport: Rename MSI symbols to generic interrupt form
> > > > >   NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
> > > > >   NTB: epf: Add MW2 for interrupt use on Renesas R-Car
> > > > >   Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
> > > > >     usage
> > > > >
> > > > >  Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-
> > > > >  drivers/dma/dw-edma/dw-edma-core.c            | 109 ++++++++
> > > > >  drivers/dma/dw-edma/dw-edma-core.h            |  18 ++
> > > > >  drivers/dma/dw-edma/dw-edma-v0-core.c         |  15 ++
> > > > >  drivers/ntb/Kconfig                           |  15 ++
> > > > >  drivers/ntb/Makefile                          |   6 +-
> > > > >  drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-
> > > > >  drivers/ntb/hw/epf/ntb_hw_epf.c               |  46 ++--
> > > > >  drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-
> > > > >  drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-
> > > > >  drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-
> > > > >  drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-
> > > > >  drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-
> > > > >  drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-
> > > > >  drivers/ntb/intr_common.c                     |  61 +++++
> > > > >  drivers/ntb/intr_dw_edma.c                    | 253 ++++++++++++++++++
> > > > >  drivers/ntb/msi.c                             | 186 +++++++------
> > > > >  drivers/ntb/ntb_transport.c                   | 155 ++++++-----
> > > > >  drivers/ntb/test/ntb_msi_test.c               |  26 +-
> > > > >  drivers/ntb/test/ntb_perf.c                   |   4 +-
> > > > >  drivers/ntb/test/ntb_tool.c                   |   6 +-
> > > > >  .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++--
> > > > >  drivers/pci/controller/dwc/pcie-designware.c  |   1 +
> > > > >  drivers/pci/controller/dwc/pcie-designware.h  |   2 +
> > > > >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
> > > > >  drivers/pci/endpoint/pci-epc-core.c           |  44 +++
> > > > >  include/linux/dma/edma.h                      |  31 +++
> > > > >  include/linux/ntb.h                           | 134 +++++++---
> > > > >  include/linux/pci-epc.h                       |  11 +
> > > > >  29 files changed, 1310 insertions(+), 300 deletions(-)
> > > > >  create mode 100644 drivers/ntb/intr_common.c
> > > > >  create mode 100644 drivers/ntb/intr_dw_edma.c
> > > > >
> > > > > --
> > > > > 2.48.1
> > > > >


* Re: [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
  2025-10-28 20:50         ` Frank Li
@ 2025-10-29  7:13           ` Koichiro Den
  0 siblings, 0 replies; 37+ messages in thread
From: Koichiro Den @ 2025-10-29  7:13 UTC (permalink / raw)
  To: Frank Li
  Cc: ntb, linux-pci, dmaengine, linux-kernel, mani, kwilczynski,
	kishon, bhelgaas, corbet, vkoul, jdmason, dave.jiang, allenbh,
	Basavaraj.Natikar, Shyam-sundar.S-k, kurt.schwemmer, logang,
	jingoohan1, lpieralisi, robh, jbrunet, fancer.lancer, arnd,
	pstanner, elfring

On Tue, Oct 28, 2025 at 04:50:18PM -0400, Frank Li wrote:
> On Mon, Oct 27, 2025 at 02:29:30PM +0900, Koichiro Den wrote:
> > On Fri, Oct 24, 2025 at 12:43:47PM -0400, Frank Li wrote:
> > > On Sat, Oct 25, 2025 at 01:04:01AM +0900, Koichiro Den wrote:
> > > > On Thu, Oct 23, 2025 at 11:27:09PM -0400, Frank Li wrote:
> > > > > On Thu, Oct 23, 2025 at 04:18:51PM +0900, Koichiro Den wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > Motivation
> > > > > > ==========
> > > > > >
> > > > > > On Renesas R-Car S4 the PCIe Endpoint is DesignWare-based and the platform
> > > > > > does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
> > > > > > result, forwarding MSI writes from the Root Complex (RC) to the Endpoint
> > > > > > (EP) is not possible even if we would add implementation to create a MSI
> > > > > > domain for the vNTB device to use existing drivers/ntb/msi.c, and NTB
> > > > > > traffic must fall back to doorbells (polling). In addition, BAR resources
> > > > > > are scarce, which makes it difficult to dedicate a BAR solely to an
> > > > > > NTB/msi window.
> > > > > >
> > > > > > This RFC introduces a generic interrupt backend for NTB. The existing MSI
> > > > > > path is converted to a backend, and a new DW eDMA test-interrupt backend
> > > > > > provides an RC-to-EP interrupt fallback when MSI cannot be used. In
> > > > > > parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
> > > > > > windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
> > > > > > The vNTB EPF and ntb_transport are taught about offsets.
> > > > >
> > > > > > Mapping multiple addresses to one BAR is quite valuable, so we can start
> > > > > > with that as the first step.
> > > > > >
> > > > > > But I have a question about the DWC iATU address mapping mode. For
> > > > > > example, BAR0 maps to CPU address 0x8000000 (local CPU), but the BAR0
> > > > > > address on the RC side varies between RC systems. It may be 0xa000_0000
> > > > > > (RC side) or 0xc000_0000 (RC side).
> > > > > >
> > > > > > The BAR0 mapping is set before link-up.
> > > > > >
> > > > > > How do you know whether the PCI bus address is 0xa0000000 or 0xc0000000?
> > > >
> > > > Thanks for the comment.
> > > >
> > > > For vNTB this is done in two steps:
> > > >
> > > > 1). In the epf_ntb_bind() path we call pci_epc_map_inbound() with
> > > >     epf_bar->phys_addr == 0. On the DWC side this only triggers
> > > >     dw_pcie_ep_set_bar_init() and does not program an inbound iATU yet.
> > > >     (pls see Patch #5).
> > > > 2). Later, when ntb_transport's link work runs and we actually need to
> > > >     set up Address Match inbound window(s), pci_epc_map_inbound() is called
> > > >     again with epf_bar->phys_addr != 0 (and an offset for the sub‑range). At
> > > >     that point the RC has already enumerated the device and assigned the BAR,
> > > >     so dw_pcie_ep_map_inbound() reads back the assigned BAR value via
> > > >     dw_pcie_ep_read_bar_assigned(), computes pci_addr = base + offset, and
> > > >     programs the inbound iATU in Address Match mode (again, Patch #5 is
> > > >     relevant).
> > > >
> > > > Because we do not program the inbound iATU before enumeration, we don't
> > > > need to know upfront whether the RC will place BAR0 at 0xa000_0000 or
> > > > 0xc000_0000. We read the assigned address right before the actual
> > > > programming (again, see the Patch #5). Am I missing something?
> > >
> > > > This should work for the vNTB use case. It needs to be generalized for
> > > > other usage modes, maybe combining multiple regions into one BAR.
> >
> > > IMO it's already generalized infrastructure. I'm not sure if we need to
> > retrofit other EPFs (pci_epc_set_bar callers) in this series. We can do
> > that when there's really a concrete need.
> >
> > >
> > > > Add a case in the pci-ep-test function driver so that more people can
> > > > review it.
> >
> > > This sounds reasonable, though it may involve a bit of seemingly duplicate
> > > work, i.e. adding similar configfs knobs on the pci-epf-test side,
> > > expanding the control register fields, making pci_endpoint_test aware of
> > > them, and making sure that the selftests still pass. Please correct me if
> > > I'm off here. I'll take some time to prepare that.
> >
> > Thanks for the review.
> 
> I like combining the eDMA address into one BAR, so the RC-side NTB EPF
> driver can use the dw-edma driver (presumably just following
> drivers/dma/dw-edma/dw-edma-pcie.c) to register a host-side DMA engine.
> NTB transfers can then use this DMA engine for data transfer (with a small
> modification to support peripheral mode).

Thanks for the review.

Sounds like the cleanest path to me as well. I think we should modify
ntb_transport further to really benefit from this. As an example, for
ntb_netdev, by allocating RX buffers from a pre-DMA-mapped page pool (which
is no longer constrained by the MW/BAR) and exchanging those DMA addresses
over the control path for the other end to set as a DMA DAR, we can
eliminate the current double hopping model, i.e. local DMAC -> MW(BAR) ->
iATU -> TLP.
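
A rough user-space sketch of the address exchange described above, not
actual ntb_transport code. All names here (rx_ring, rx_slot, rx_ring_post,
rx_ring_take, RX_RING_SIZE) are hypothetical. It models both ends in one
process and leaves out memory barriers, the real DMA mapping and the control
path transport itself:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define RX_RING_SIZE 8u	/* hypothetical depth; must be a power of two */

static_assert((RX_RING_SIZE & (RX_RING_SIZE - 1u)) == 0,
	      "ring size must be a power of two");

/* One RX buffer the receiver advertises to its peer. */
struct rx_slot {
	uint64_t dma_addr;	/* bus address of a pre-DMA-mapped RX buffer */
	uint32_t len;		/* usable length of that buffer in bytes */
};

/* Stand-in for the shared control-path area visible to both ends. */
struct rx_ring {
	struct rx_slot slots[RX_RING_SIZE];
	uint32_t head;		/* advanced by the receiver when posting */
	uint32_t tail;		/* advanced by the sender when consuming */
};

/* Receiver side: advertise one pre-mapped buffer. Fails if the ring is full. */
static bool rx_ring_post(struct rx_ring *r, uint64_t dma_addr, uint32_t len)
{
	if (r->head - r->tail == RX_RING_SIZE)
		return false;
	r->slots[r->head % RX_RING_SIZE] = (struct rx_slot){ dma_addr, len };
	r->head++;
	return true;
}

/*
 * Sender side: take the next advertised buffer and return its address for
 * use as the DMA destination (DAR). Fails if nothing is posted or if the
 * next buffer is smaller than the frame to send.
 */
static bool rx_ring_take(struct rx_ring *r, uint32_t need, uint64_t *dar)
{
	const struct rx_slot *s;

	if (r->head == r->tail)
		return false;
	s = &r->slots[r->tail % RX_RING_SIZE];
	if (s->len < need)
		return false;
	*dar = s->dma_addr;
	r->tail++;
	return true;
}
```

The point of the sketch is only that the sender's DMA writes straight to an
address taken from the ring, so the payload itself never goes through the
MW/BAR.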

> 
> So data transfer speed can improve a lot. Of course, eDMA can also serve
> as the doorbell if there are enough DMA channels in the DWC controller.

As I understand it, the eDMA test interrupt mechanism used in the RFC (v1)
can be eliminated in this model. Instead, we can rely on real DMA
completion on the EP side for the RC->EP direction, since the RC kicks the
EP's eDMA to initiate its TX.

I've been wondering whether the test interrupt mechanism (Patch #23)
could serve as an intermediate small step/practical workaround, but now I'm
inclined to aim for the cleaner way from the beginning.

-Koichiro

> 
> Frank
> 
> >
> > -Koichiro
> >
> > >
> > > Frank
> > >
> > > >
> > > > -Koichiro
> > > >
> > > > >
> > > > > Frank
> > > > >
> > > > > >
> > > > > > Backend selection is automatic: if MSI is available we use the MSI backend.
> > > > > > Otherwise, if enabled, the DW eDMA backend is used. If neither is
> > > > > > available, we continue to use doorbells. Existing systems remain unaffected
> > > > > > unless use_intr=1 is set.
> > > > > >
> > > > > > Example layout (R-Car S4):
> > > > > >
> > > > > >   BAR0: Config/Spad
> > > > > >   BAR2 [0x00000-0xF0000]: MW1 (data)
> > > > > >   BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
> > > > > >   BAR4: Doorbell
> > > > > >
> > > > > >   # The corresponding configfs settings (see Patch #25):
> > > > > >   echo 0xF0000 > ./mw1
> > > > > >   echo 0x8000  > ./mw2
> > > > > >   echo 0xF0000 > ./mw2_offset
> > > > > >   echo 2       > ./mw1_bar
> > > > > >   echo 2       > ./mw2_bar
> > > > > >
> > > > > > Summary of changes
> > > > > > ==================
> > > > > >
> > > > > > * NTB core/transport
> > > > > >   - Introduce struct ntb_intr_backend and convert MSI to the new backend.
> > > > > >   - Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
> > > > > >   - Rename module parameter to use_intr (keep use_msi as deprecated alias).
> > > > > >   - Support offsetted partial MWs in ntb_transport.
> > > > > >   - Hardening for peer-reported interrupt values and minor cleanups.
> > > > > >
> > > > > > * PCI Endpoint core and DWC EP controller
> > > > > >   - Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
> > > > > >   - Implement inbound mapping for DesignWare EP (Address Match mode), with
> > > > > >     tracking of multiple inbound iATU entries per BAR and proper teardown.
> > > > > >
> > > > > > * EPF vNTB
> > > > > >   - Add mwN_offset configfs attributes and propagate offsets to inbound maps.
> > > > > >   - Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
> > > > > >     set_bar().
> > > > > >   - Provide .get_pci_epc() so backends can locate the common eDMA instance.
> > > > > >
> > > > > > * DW eDMA
> > > > > >   - Add self-interrupt registration and expose test-IRQ register offsets.
> > > > > >   - Provide dw_edma_find_by_child().
> > > > > >
> > > > > > * Renesas R-Car
> > > > > >   - Place MW2 in BAR2 to host the interrupt window alongside the data MW.
> > > > > >
> > > > > > * Documentation
> > > > > >
> > > > > > Patch layout
> > > > > > ============
> > > > > >
> > > > > > * Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
> > > > > > * Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
> > > > > > * Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
> > > > > > * Patches 18-19 : NTB/EPF glue (.get_pci_epc())
> > > > > > * Patch 20      : Module param name change (use_msi->use_intr, alias preserved)
> > > > > > * Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
> > > > > > * Patch 24      : R-Car: add MW2 in BAR2 for interrupts
> > > > > > * Patch 25      : Documentation updates
> > > > > >
> > > > > > Tested on
> > > > > > =========
> > > > > >
> > > > > > * Renesas R-Car S4 Spider
> > > > > > * Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)
> > > > > >
> > > > > > Performance measurement
> > > > > > =======================
> > > > > >
> > > > > > Even without the DMA acceleration patches for R-Car S4 (which I keep
> > > > > > separate from this RFC patch series), enabling RC-to-EP interrupts
> > > > > > dramatically improves NTB latency on R-Car S4:
> > > > > >
> > > > > > * Before this patch series (NB. use_msi doesn't work on R-Car S4)
> > > > > >
> > > > > >   # Server: sockperf server -i 0.0.0.0
> > > > > >   # Client: sockperf ping-pong -i $SERVER_IP
> > > > > >   ========= Printing statistics for Server No: 0
> > > > > >   [Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
> > > > > >   ====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
> > > > > >         siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
> > > > > >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> > > > > >   Summary: Latency is 5995.680 usec
> > > > > >   Total 45 observations; each percentile contains 0.45 observations
> > > > > >   ---> <MAX> observation = 6121.137
> > > > > >   ---> percentile 99.999 = 6121.137
> > > > > >   ---> percentile 99.990 = 6121.137
> > > > > >   ---> percentile 99.900 = 6121.137
> > > > > >   ---> percentile 99.000 = 6121.137
> > > > > >   ---> percentile 90.000 = 6099.178
> > > > > >   ---> percentile 75.000 = 6054.418
> > > > > >   ---> percentile 50.000 = 5993.040
> > > > > >   ---> percentile 25.000 = 5935.021
> > > > > >   ---> <MIN> observation = 5883.362
> > > > > >
> > > > > > * With this series (use_intr=1)
> > > > > >
> > > > > >   # Server: sockperf server -i 0.0.0.0
> > > > > >   # Client: sockperf ping-pong -i $SERVER_IP
> > > > > >   ========= Printing statistics for Server No: 0
> > > > > >   [Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
> > > > > >   ====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
> > > > > >         siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
> > > > > >   # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
> > > > > >   Summary: Latency is 127.677 usec
> > > > > >   Total 2145 observations; each percentile contains 21.45 observations
> > > > > >   ---> <MAX> observation =  446.691
> > > > > >   ---> percentile 99.999 =  446.691
> > > > > >   ---> percentile 99.990 =  446.691
> > > > > >   ---> percentile 99.900 =  291.234
> > > > > >   ---> percentile 99.000 =  221.515
> > > > > >   ---> percentile 90.000 =  149.277
> > > > > >   ---> percentile 75.000 =  124.497
> > > > > >   ---> percentile 50.000 =  121.137
> > > > > >   ---> percentile 25.000 =  119.037
> > > > > >   ---> <MIN> observation =  113.637
> > > > > >
> > > > > > Feedback welcome on both the approach and the splitting/routing preference.
> > > > > >
> > > > > > (The series spans NTB, PCI EP/DWC and dmaengine/dw-edma. I'm happy to split
> > > > > > later if preferred.)
> > > > > >
> > > > > > Thanks for reviewing.
> > > > > >
> > > > > >
> > > > > > Koichiro Den (25):
> > > > > >   PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
> > > > > >     access
> > > > > >   PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
> > > > > >   NTB: epf: Handle mwN_offset for inbound MW regions
> > > > > >   PCI: endpoint: Add inbound mapping ops to EPC core
> > > > > >   PCI: dwc: ep: Implement EPC inbound mapping support
> > > > > >   PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
> > > > > >   NTB: Add offset parameter to MW translation APIs
> > > > > >   PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
> > > > > >     present
> > > > > >   NTB: ntb_transport: Support offsetted partial memory windows
> > > > > >   NTB/msi: Support offsetted partial memory window for MSI
> > > > > >   NTB/msi: Do not force MW to its maximum possible size
> > > > > >   NTB: ntb_transport: Stricter checks for peer-reported interrupt values
> > > > > >   NTB/msi: Skip mw_set_trans() if already configured
> > > > > >   NTB/msi: Add a inner loop for PCI-MSI cases
> > > > > >   dmaengine: dw-edma: Add self-interrupt registration API
> > > > > >   dmaengine: dw-edma: Expose self-IRQ register offsets
> > > > > >   dmaengine: dw-edma: Add dw_edma_find_by_child() helper
> > > > > >   NTB: core: Add .get_pci_epc() to ntb_dev_ops
> > > > > >   NTB: epf: vntb: Implement .get_pci_epc() callback
> > > > > >   NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
> > > > > >   NTB: Introduce generic interrupt backend abstraction and convert MSI
> > > > > >   NTB: ntb_transport: Rename MSI symbols to generic interrupt form
> > > > > >   NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
> > > > > >   NTB: epf: Add MW2 for interrupt use on Renesas R-Car
> > > > > >   Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
> > > > > >     usage
> > > > > >
> > > > > >  Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-
> > > > > >  drivers/dma/dw-edma/dw-edma-core.c            | 109 ++++++++
> > > > > >  drivers/dma/dw-edma/dw-edma-core.h            |  18 ++
> > > > > >  drivers/dma/dw-edma/dw-edma-v0-core.c         |  15 ++
> > > > > >  drivers/ntb/Kconfig                           |  15 ++
> > > > > >  drivers/ntb/Makefile                          |   6 +-
> > > > > >  drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-
> > > > > >  drivers/ntb/hw/epf/ntb_hw_epf.c               |  46 ++--
> > > > > >  drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-
> > > > > >  drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-
> > > > > >  drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-
> > > > > >  drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-
> > > > > >  drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-
> > > > > >  drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-
> > > > > >  drivers/ntb/intr_common.c                     |  61 +++++
> > > > > >  drivers/ntb/intr_dw_edma.c                    | 253 ++++++++++++++++++
> > > > > >  drivers/ntb/msi.c                             | 186 +++++++------
> > > > > >  drivers/ntb/ntb_transport.c                   | 155 ++++++-----
> > > > > >  drivers/ntb/test/ntb_msi_test.c               |  26 +-
> > > > > >  drivers/ntb/test/ntb_perf.c                   |   4 +-
> > > > > >  drivers/ntb/test/ntb_tool.c                   |   6 +-
> > > > > >  .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++--
> > > > > >  drivers/pci/controller/dwc/pcie-designware.c  |   1 +
> > > > > >  drivers/pci/controller/dwc/pcie-designware.h  |   2 +
> > > > > >  drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
> > > > > >  drivers/pci/endpoint/pci-epc-core.c           |  44 +++
> > > > > >  include/linux/dma/edma.h                      |  31 +++
> > > > > >  include/linux/ntb.h                           | 134 +++++++---
> > > > > >  include/linux/pci-epc.h                       |  11 +
> > > > > >  29 files changed, 1310 insertions(+), 300 deletions(-)
> > > > > >  create mode 100644 drivers/ntb/intr_common.c
> > > > > >  create mode 100644 drivers/ntb/intr_dw_edma.c
> > > > > >
> > > > > > --
> > > > > > 2.48.1
> > > > > >



Thread overview: 37+ messages
2025-10-23  7:18 [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Koichiro Den
2025-10-23  7:18 ` [RFC PATCH 01/25] PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access Koichiro Den
2025-10-24  0:06   ` Frank Li
2025-10-24 16:24     ` Koichiro Den
2025-10-24 18:40       ` Frank Li
2025-10-23  7:18 ` [RFC PATCH 02/25] PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes Koichiro Den
2025-10-23  7:18 ` [RFC PATCH 03/25] NTB: epf: Handle mwN_offset for inbound MW regions Koichiro Den
2025-10-23  7:18 ` [RFC PATCH 04/25] PCI: endpoint: Add inbound mapping ops to EPC core Koichiro Den
2025-10-23  7:18 ` [RFC PATCH 05/25] PCI: dwc: ep: Implement EPC inbound mapping support Koichiro Den
2025-10-23  7:18 ` [RFC PATCH 06/25] PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping Koichiro Den
2025-10-23  7:18 ` [RFC PATCH 07/25] NTB: Add offset parameter to MW translation APIs Koichiro Den
2025-10-23  7:18 ` [RFC PATCH 08/25] PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when present Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 09/25] NTB: ntb_transport: Support offsetted partial memory windows Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 10/25] NTB/msi: Support offsetted partial memory window for MSI Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 11/25] NTB/msi: Do not force MW to its maximum possible size Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 12/25] NTB: ntb_transport: Stricter checks for peer-reported interrupt values Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 13/25] NTB/msi: Skip mw_set_trans() if already configured Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 14/25] NTB/msi: Add a inner loop for PCI-MSI cases Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 15/25] dmaengine: dw-edma: Add self-interrupt registration API Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 16/25] dmaengine: dw-edma: Expose self-IRQ register offsets Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 17/25] dmaengine: dw-edma: Add dw_edma_find_by_child() helper Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 18/25] NTB: core: Add .get_pci_epc() to ntb_dev_ops Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 19/25] NTB: epf: vntb: Implement .get_pci_epc() callback Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 20/25] NTB: ntb_transport: Rename use_msi to use_intr (keep alias) Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 21/25] NTB: Introduce generic interrupt backend abstraction and convert MSI Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 22/25] NTB: ntb_transport: Rename MSI symbols to generic interrupt form Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 23/25] NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 24/25] NTB: epf: Add MW2 for interrupt use on Renesas R-Car Koichiro Den
2025-10-23  7:19 ` [RFC PATCH 25/25] Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset usage Koichiro Den
2025-10-23  7:55 ` [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets Jerome Brunet
2025-10-24 16:11   ` Koichiro Den
2025-10-24  3:27 ` Frank Li
2025-10-24 16:04   ` Koichiro Den
2025-10-24 16:43     ` Frank Li
2025-10-27  5:29       ` Koichiro Den
2025-10-28 20:50         ` Frank Li
2025-10-29  7:13           ` Koichiro Den
