Linux PCI subsystem development
From: Frank Li <Frank.li@nxp.com>
To: Koichiro Den <den@valinux.co.jp>
Cc: ntb@lists.linux.dev, linux-pci@vger.kernel.org,
	dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org,
	mani@kernel.org, kwilczynski@kernel.org, kishon@kernel.org,
	bhelgaas@google.com, corbet@lwn.net, vkoul@kernel.org,
	jdmason@kudzu.us, dave.jiang@intel.com, allenbh@gmail.com,
	Basavaraj.Natikar@amd.com, Shyam-sundar.S-k@amd.com,
	kurt.schwemmer@microsemi.com, logang@deltatee.com,
	jingoohan1@gmail.com, lpieralisi@kernel.org, robh@kernel.org,
	jbrunet@baylibre.com, fancer.lancer@gmail.com, arnd@arndb.de,
	pstanner@redhat.com, elfring@users.sourceforge.net
Subject: Re: [RFC PATCH v2 00/27] NTB transport backed by remote DW eDMA
Date: Mon, 1 Dec 2025 17:02:57 -0500
Message-ID: <aS4QkYn+aKphlRFm@lizhi-Precision-Tower-5810>
In-Reply-To: <20251129160405.2568284-1-den@valinux.co.jp>

On Sun, Nov 30, 2025 at 01:03:38AM +0900, Koichiro Den wrote:
> Hi,
>
> This is RFC v2 of the NTB/PCI series for Renesas R-Car S4. The ultimate
> goal is unchanged, i.e. to improve performance between RC and EP
> (with vNTB) over ntb_transport, but the approach has changed drastically.
> Based on the feedback from Frank Li in the v1 thread, in particular:
> https://lore.kernel.org/all/aQEsip3TsPn4LJY9@lizhi-Precision-Tower-5810/
> this RFC v2 instead builds an NTB transport backed by remote eDMA
> architecture and reshapes the series around it. The RC->EP interruption
> is now achieved using a dedicated eDMA read channel, so the somewhat
> "hack"-ish approach in RFC v1 is no longer needed.
>
> Compared to RFC v1, this v2 series enables NTB transport backed by
> remote DW eDMA, so the current ntb_transport handling of Memory Window
> is no longer needed, and direct DMA transfers between EP and RC are
> used.
>
> I realize this is quite a large series. Sorry for the volume, but for
> the RFC stage I believe presenting the full picture in a single set
> helps with reviewing the overall architecture. Once the direction is
> agreed, I will respin it split by subsystem and topic.
>
>
...
>
> - Before this change:
>
>   * ping
>     64 bytes from 10.0.0.11: icmp_seq=1 ttl=64 time=12.3 ms
>     64 bytes from 10.0.0.11: icmp_seq=2 ttl=64 time=6.58 ms
>     64 bytes from 10.0.0.11: icmp_seq=3 ttl=64 time=1.26 ms
>     64 bytes from 10.0.0.11: icmp_seq=4 ttl=64 time=7.43 ms
>     64 bytes from 10.0.0.11: icmp_seq=5 ttl=64 time=1.39 ms
>     64 bytes from 10.0.0.11: icmp_seq=6 ttl=64 time=7.38 ms
>     64 bytes from 10.0.0.11: icmp_seq=7 ttl=64 time=1.42 ms
>     64 bytes from 10.0.0.11: icmp_seq=8 ttl=64 time=7.41 ms
>
>   * RC->EP (`sudo iperf3 -ub0 -l 65480 -P 2`)
>     [ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
>     [  5]   0.00-10.01  sec   344 MBytes   288 Mbits/sec  3.483 ms  51/5555 (0.92%)  receiver
>     [  6]   0.00-10.01  sec   342 MBytes   287 Mbits/sec  3.814 ms  38/5517 (0.69%)  receiver
>     [SUM]   0.00-10.01  sec   686 MBytes   575 Mbits/sec  3.648 ms  89/11072 (0.8%)  receiver
>
>   * EP->RC (`sudo iperf3 -ub0 -l 65480 -P 2`)
>     [  5]   0.00-10.03  sec   334 MBytes   279 Mbits/sec  3.164 ms  390/5731 (6.8%)  receiver
>     [  6]   0.00-10.03  sec   334 MBytes   279 Mbits/sec  2.416 ms  396/5741 (6.9%)  receiver
>     [SUM]   0.00-10.03  sec   667 MBytes   558 Mbits/sec  2.790 ms  786/11472 (6.9%)  receiver
>
>     Note: with `-P 2`, the best total bitrate (receiver side) was achieved.
>
> - After this change (use_remote_edma=1) [1]:
>
>   * ping
>     64 bytes from 10.0.0.11: icmp_seq=1 ttl=64 time=1.48 ms
>     64 bytes from 10.0.0.11: icmp_seq=2 ttl=64 time=1.03 ms
>     64 bytes from 10.0.0.11: icmp_seq=3 ttl=64 time=0.931 ms
>     64 bytes from 10.0.0.11: icmp_seq=4 ttl=64 time=0.910 ms
>     64 bytes from 10.0.0.11: icmp_seq=5 ttl=64 time=1.07 ms
>     64 bytes from 10.0.0.11: icmp_seq=6 ttl=64 time=0.986 ms
>     64 bytes from 10.0.0.11: icmp_seq=7 ttl=64 time=0.910 ms
>     64 bytes from 10.0.0.11: icmp_seq=8 ttl=64 time=0.883 ms
>
>   * RC->EP (`sudo iperf3 -ub0 -l 65480 -P 4`)
>     [  5]   0.00-10.01  sec  3.54 GBytes  3.04 Gbits/sec  0.030 ms  0/58007 (0%)  receiver
>     [  6]   0.00-10.01  sec  3.71 GBytes  3.19 Gbits/sec  0.453 ms  0/60909 (0%)  receiver
>     [  9]   0.00-10.01  sec  3.85 GBytes  3.30 Gbits/sec  0.027 ms  0/63072 (0%)  receiver
>     [ 11]   0.00-10.01  sec  3.26 GBytes  2.80 Gbits/sec  0.070 ms  1/53512 (0.0019%)  receiver
>     [SUM]   0.00-10.01  sec  14.4 GBytes  12.3 Gbits/sec  0.145 ms  1/235500 (0.00042%)  receiver
>
>   * EP->RC (`sudo iperf3 -ub0 -l 65480 -P 4`)
>     [  5]   0.00-10.03  sec  3.40 GBytes  2.91 Gbits/sec  0.104 ms  15467/71208 (22%)  receiver
>     [  6]   0.00-10.03  sec  3.08 GBytes  2.64 Gbits/sec  0.176 ms  12097/62609 (19%)  receiver
>     [  9]   0.00-10.03  sec  3.38 GBytes  2.90 Gbits/sec  0.270 ms  17212/72710 (24%)  receiver
>     [ 11]   0.00-10.03  sec  2.56 GBytes  2.19 Gbits/sec  0.200 ms  11193/53090 (21%)  receiver

Almost 10x faster per stream (2.9 Gbit/s vs. 279 Mbit/s)? Highlighting this
result will get more people interested in this topic.
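
For reference, the ratio can be checked directly from the receiver-side
figures quoted in this mail. This is only a sketch: `ratio` is a
hypothetical helper, and the aggregate comparison mixes `-P 2` (before)
with `-P 4` (after), so it is not a like-for-like measurement.

```shell
# ratio A B: print A/B to one decimal place (hypothetical helper).
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Stream [5], EP->RC, in Mbit/s: 2.91 Gbit/s after vs. 279 Mbit/s before.
ratio 2910 279     # prints 10.4

# [SUM], EP->RC, in Mbit/s: 10.6 Gbit/s (-P 4) vs. 558 Mbit/s (-P 2).
ratio 10600 558    # prints 19.0
```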

>     [SUM]   0.00-10.03  sec  12.4 GBytes  10.6 Gbits/sec  0.188 ms  55969/259617 (22%)  receiver
>
>   [1] configfs settings:
>       # modprobe pci_epf_vntb dyndbg=+pmf
>       # cd /sys/kernel/config/pci_ep/
>       # mkdir functions/pci_epf_vntb/func1
>       # echo 0x1912 >   functions/pci_epf_vntb/func1/vendorid
>       # echo 0x0030 >   functions/pci_epf_vntb/func1/deviceid
>       # echo 32 >       functions/pci_epf_vntb/func1/msi_interrupts
>       # echo 16 >       functions/pci_epf_vntb/func1/pci_epf_vntb.0/db_count
>       # echo 128 >      functions/pci_epf_vntb/func1/pci_epf_vntb.0/spad_count
>       # echo 2 >        functions/pci_epf_vntb/func1/pci_epf_vntb.0/num_mws
>       # echo 0xe0000 >  functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1
>       # echo 0x20000 >  functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2
>       # echo 0xe0000 >  functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2_offset

It looks like you are trying to create smaller sub-windows inside one
memory window.

Would this be cleaner?

echo 0xe0000 >  functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1.0
echo 0x20000 >  functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1.1

so that mw1.1 naturally continues from the end of the previous one.
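
To illustrate the idea, each sub-window's offset would simply be the sum
of the sizes before it, so no separate mwN_offset attribute is needed.
A minimal bash sketch, assuming the proposed `mw1.N` naming (these
attributes do not exist today) and a hypothetical helper `mw_offsets`:

```shell
# mw_offsets SIZE...: print the implied offset of each sub-window,
# where each offset is the running sum of the preceding sizes.
# (bash; "mw1.N" is the proposed naming, not an existing attribute.)
mw_offsets() {
    local off=0 i=0 size
    for size in "$@"; do
        printf 'mw1.%d offset=0x%x size=0x%x\n' "$i" "$off" "$size"
        off=$((off + size))
        i=$((i + 1))
    done
}

mw_offsets 0xe0000 0x20000
# mw1.0 offset=0x0 size=0xe0000
# mw1.1 offset=0xe0000 size=0x20000
```

The second offset matches the 0xe0000 written to mw2_offset in the
quoted configfs settings.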

Frank

>       # echo 0x1912 >   functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_vid
>       # echo 0x0030 >   functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_pid
>       # echo 0x10 >     functions/pci_epf_vntb/func1/pci_epf_vntb.0/vbus_number
>       # echo 0 >        functions/pci_epf_vntb/func1/pci_epf_vntb.0/ctrl_bar
>       # echo 4 >        functions/pci_epf_vntb/func1/pci_epf_vntb.0/db_bar
>       # echo 2 >        functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1_bar
>       # echo 2 >        functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2_bar
>       # ln -s controllers/e65d0000.pcie-ep functions/pci_epf_vntb/func1/primary/
>       # echo 1 > controllers/e65d0000.pcie-ep/start
>
>
> Thanks for taking a look.
>
>
> Koichiro Den (27):
>   PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
>     access
>   PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
>   NTB: epf: Handle mwN_offset for inbound MW regions
>   PCI: endpoint: Add inbound mapping ops to EPC core
>   PCI: dwc: ep: Implement EPC inbound mapping support
>   PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
>   NTB: Add offset parameter to MW translation APIs
>   PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
>     present
>   NTB: ntb_transport: Support offsetted partial memory windows
>   NTB: core: Add .get_pci_epc() to ntb_dev_ops
>   NTB: epf: vntb: Implement .get_pci_epc() callback
>   damengine: dw-edma: Fix MSI data values for multi-vector IMWr
>     interrupts
>   NTB: ntb_transport: Use seq_file for QP stats debugfs
>   NTB: ntb_transport: Move TX memory window setup into setup_qp_mw()
>   NTB: ntb_transport: Dynamically determine qp count
>   NTB: ntb_transport: Introduce get_dma_dev() helper
>   NTB: epf: Reserve a subset of MSI vectors for non-NTB users
>   NTB: ntb_transport: Introduce ntb_transport_backend_ops
>   PCI: dwc: ep: Cache MSI outbound iATU mapping
>   NTB: ntb_transport: Introduce remote eDMA backed transport mode
>   NTB: epf: Provide db_vector_count/db_vector_mask callbacks
>   ntb_netdev: Multi-queue support
>   NTB: epf: Add per-SoC quirk to cap MRRS for DWC eDMA (128B for R-Car)
>   iommu: ipmmu-vmsa: Add PCIe ch0 to devices_allowlist
>   iommu: ipmmu-vmsa: Add support for reserved regions
>   arm64: dts: renesas: Add Spider RC/EP DTs for NTB with remote DW PCIe
>     eDMA
>   NTB: epf: Add an additional memory window (MW2) barno mapping on
>     Renesas R-Car
>
>  arch/arm64/boot/dts/renesas/Makefile          |    2 +
>  .../boot/dts/renesas/r8a779f0-spider-ep.dts   |   46 +
>  .../boot/dts/renesas/r8a779f0-spider-rc.dts   |   52 +
>  drivers/dma/dw-edma/dw-edma-core.c            |   28 +-
>  drivers/iommu/ipmmu-vmsa.c                    |    7 +-
>  drivers/net/ntb_netdev.c                      |  341 ++-
>  drivers/ntb/Kconfig                           |   11 +
>  drivers/ntb/Makefile                          |    3 +
>  drivers/ntb/hw/amd/ntb_hw_amd.c               |    6 +-
>  drivers/ntb/hw/epf/ntb_hw_epf.c               |  177 +-
>  drivers/ntb/hw/idt/ntb_hw_idt.c               |    3 +-
>  drivers/ntb/hw/intel/ntb_hw_gen1.c            |    6 +-
>  drivers/ntb/hw/intel/ntb_hw_gen1.h            |    2 +-
>  drivers/ntb/hw/intel/ntb_hw_gen3.c            |    3 +-
>  drivers/ntb/hw/intel/ntb_hw_gen4.c            |    6 +-
>  drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |    6 +-
>  drivers/ntb/msi.c                             |    6 +-
>  drivers/ntb/ntb_edma.c                        |  628 ++++++
>  drivers/ntb/ntb_edma.h                        |  128 ++
>  .../{ntb_transport.c => ntb_transport_core.c} | 1829 ++++++++++++++---
>  drivers/ntb/test/ntb_perf.c                   |    4 +-
>  drivers/ntb/test/ntb_tool.c                   |    6 +-
>  .../pci/controller/dwc/pcie-designware-ep.c   |  287 ++-
>  drivers/pci/controller/dwc/pcie-designware.h  |    7 +
>  drivers/pci/endpoint/functions/pci-epf-vntb.c |  229 ++-
>  drivers/pci/endpoint/pci-epc-core.c           |   44 +
>  include/linux/ntb.h                           |   39 +-
>  include/linux/ntb_transport.h                 |   21 +
>  include/linux/pci-epc.h                       |   11 +
>  29 files changed, 3415 insertions(+), 523 deletions(-)
>  create mode 100644 arch/arm64/boot/dts/renesas/r8a779f0-spider-ep.dts
>  create mode 100644 arch/arm64/boot/dts/renesas/r8a779f0-spider-rc.dts
>  create mode 100644 drivers/ntb/ntb_edma.c
>  create mode 100644 drivers/ntb/ntb_edma.h
>  rename drivers/ntb/{ntb_transport.c => ntb_transport_core.c} (59%)
>
> --
> 2.48.1
>

Thread overview: 97+ messages
2025-11-29 16:03 [RFC PATCH v2 00/27] NTB transport backed by remote DW eDMA Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 01/27] PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[] access Koichiro Den
2025-12-01 18:59   ` Frank Li
2025-11-29 16:03 ` [RFC PATCH v2 02/27] PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes Koichiro Den
2025-12-01 19:11   ` Frank Li
2025-12-02  6:23     ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 03/27] NTB: epf: Handle mwN_offset for inbound MW regions Koichiro Den
2025-12-01 19:14   ` Frank Li
2025-12-02  6:23     ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 04/27] PCI: endpoint: Add inbound mapping ops to EPC core Koichiro Den
2025-12-01 19:19   ` Frank Li
2025-12-02  6:25     ` Koichiro Den
2025-12-02 15:58       ` Frank Li
2025-12-03 14:12         ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 05/27] PCI: dwc: ep: Implement EPC inbound mapping support Koichiro Den
2025-12-01 19:32   ` Frank Li
2025-12-02  6:26     ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 06/27] PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping Koichiro Den
2025-12-01 19:34   ` Frank Li
2025-12-02  6:26     ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 07/27] NTB: Add offset parameter to MW translation APIs Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 08/27] PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when present Koichiro Den
2025-12-01 19:35   ` Frank Li
2025-11-29 16:03 ` [RFC PATCH v2 09/27] NTB: ntb_transport: Support offsetted partial memory windows Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 10/27] NTB: core: Add .get_pci_epc() to ntb_dev_ops Koichiro Den
2025-12-01 19:39   ` Frank Li
2025-12-02  6:31     ` Koichiro Den
2025-12-01 21:08   ` Dave Jiang
2025-12-02  6:32     ` Koichiro Den
2025-12-02 14:49       ` Dave Jiang
2025-12-03 15:02         ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 11/27] NTB: epf: vntb: Implement .get_pci_epc() callback Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 12/27] damengine: dw-edma: Fix MSI data values for multi-vector IMWr interrupts Koichiro Den
2025-12-01 19:46   ` Frank Li
2025-12-02  6:32     ` Koichiro Den
2025-12-18  6:52       ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 13/27] NTB: ntb_transport: Use seq_file for QP stats debugfs Koichiro Den
2025-12-01 19:50   ` Frank Li
2025-12-02  6:33     ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 14/27] NTB: ntb_transport: Move TX memory window setup into setup_qp_mw() Koichiro Den
2025-12-01 20:02   ` Frank Li
2025-12-02  6:33     ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 15/27] NTB: ntb_transport: Dynamically determine qp count Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 16/27] NTB: ntb_transport: Introduce get_dma_dev() helper Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 17/27] NTB: epf: Reserve a subset of MSI vectors for non-NTB users Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 18/27] NTB: ntb_transport: Introduce ntb_transport_backend_ops Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 19/27] PCI: dwc: ep: Cache MSI outbound iATU mapping Koichiro Den
2025-12-01 20:41   ` Frank Li
2025-12-02  6:35     ` Koichiro Den
2025-12-02  9:32       ` Niklas Cassel
2025-12-02 15:20         ` Frank Li
2025-12-03  8:40         ` Koichiro Den
2025-12-03 10:39           ` Niklas Cassel
2025-12-03 14:36             ` Koichiro Den
2025-12-03 14:40               ` Koichiro Den
2025-12-04 17:10             ` Frank Li
2025-12-05 16:28             ` Frank Li
2025-12-02  6:32   ` Niklas Cassel
2025-12-03  8:30     ` Koichiro Den
2025-12-03 10:19       ` Niklas Cassel
2025-12-03 14:56         ` Koichiro Den
2025-12-08  7:57   ` Niklas Cassel
2025-12-09  8:15     ` Niklas Cassel
2025-12-12  3:56       ` Koichiro Den
2025-12-22  5:10     ` Krishna Chaitanya Chundru
2025-12-22  7:50       ` Niklas Cassel
2025-12-22  8:14         ` Krishna Chaitanya Chundru
2025-12-22 10:21           ` Manivannan Sadhasivam
2025-12-12  3:38   ` Manivannan Sadhasivam
2025-12-18  8:28     ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 20/27] NTB: ntb_transport: Introduce remote eDMA backed transport mode Koichiro Den
2025-12-01 21:41   ` Frank Li
2025-12-02  6:43     ` Koichiro Den
2025-12-02 15:42       ` Frank Li
2025-12-03  8:53         ` Koichiro Den
2025-12-03 16:14           ` Frank Li
2025-12-04 15:42             ` Koichiro Den
2025-12-04 20:16               ` Frank Li
2025-12-05  3:04                 ` Koichiro Den
2025-12-05 15:06                   ` Frank Li
2025-12-18  4:34                     ` Koichiro Den
2025-12-01 21:46   ` Dave Jiang
2025-12-02  6:59     ` Koichiro Den
2025-12-02 14:53       ` Dave Jiang
2025-12-03 14:19         ` Koichiro Den
2025-11-29 16:03 ` [RFC PATCH v2 21/27] NTB: epf: Provide db_vector_count/db_vector_mask callbacks Koichiro Den
2025-11-29 16:04 ` [RFC PATCH v2 22/27] ntb_netdev: Multi-queue support Koichiro Den
2025-11-29 16:04 ` [RFC PATCH v2 23/27] NTB: epf: Add per-SoC quirk to cap MRRS for DWC eDMA (128B for R-Car) Koichiro Den
2025-12-01 20:47   ` Frank Li
2025-11-29 16:04 ` [RFC PATCH v2 24/27] iommu: ipmmu-vmsa: Add PCIe ch0 to devices_allowlist Koichiro Den
2025-11-29 16:04 ` [RFC PATCH v2 25/27] iommu: ipmmu-vmsa: Add support for reserved regions Koichiro Den
2025-11-29 16:04 ` [RFC PATCH v2 26/27] arm64: dts: renesas: Add Spider RC/EP DTs for NTB with remote DW PCIe eDMA Koichiro Den
2025-11-29 16:04 ` [RFC PATCH v2 27/27] NTB: epf: Add an additional memory window (MW2) barno mapping on Renesas R-Car Koichiro Den
2025-12-01 22:02 ` Frank Li [this message]
2025-12-02  6:20   ` [RFC PATCH v2 00/27] NTB transport backed by remote DW eDMA Koichiro Den
2025-12-02 16:07     ` Frank Li
2025-12-03  8:43       ` Koichiro Den
