Linux kernel -stable discussions
* [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes
@ 2026-04-25 23:38 Marek Vasut
  2026-04-28  7:00 ` Koichiro Den
  2026-05-11 14:34 ` Manivannan Sadhasivam
  0 siblings, 2 replies; 8+ messages in thread
From: Marek Vasut @ 2026-04-25 23:38 UTC (permalink / raw)
  To: linux-pci
  Cc: Marek Vasut, stable, Krzysztof Wilczyński, Bjorn Helgaas,
	Geert Uytterhoeven, Koichiro Den, Lorenzo Pieralisi, Magnus Damm,
	Manivannan Sadhasivam, Rob Herring, Yoshihiro Shimoda,
	linux-kernel, linux-renesas-soc

The R-Car Gen4 PCIe controller has a hardware limitation of 256 Bytes
maximum payload size. The PCIe DMA generates requests of size up to
minimum(Max_Read_Request_Size, Max_Payload_Size). Force-limit both
Max_Read_Request_Size and Max_Payload_Size to 256 Bytes and propagate
this limit to all downstream devices.

This limitation can be triggered, for example, by an NVMe SSD which
does not use a host memory buffer; the Samsung 980 PRO is such an
SSD. An affected SSD reports the 'hmpre' field as 0:
"
$ nvme id-ctrl /dev/nvme0 | grep hmpre
hmpre     : 0
"

The symptom is a read from the SSD which wraps around at a 256 Byte
boundary. A test for this symptom can be implemented by writing
512 Bytes of random data into the SSD and reading the data back. If
the read-back data repeat after 256 Bytes, the device is affected.
"
$ dd if=/dev/urandom of=/tmp/data.bin bs=256 count=2
$ dd if=/tmp/data.bin of=/dev/nvme0n1 bs=256 count=2
$ dd if=/dev/nvme0n1 bs=256 count=2 of=/tmp/readback.bin
"

Expected data:
"
$ hexdump -vC /tmp/data.bin
00000000  97 81 b7 3b 0e 38 2b 4d  a7 d3 e0 47 ff c2 4b ca
00000010  c1 85 98 f0 4a ac 03 a0  3b ab f3 19 44 dd 06 8b
...
00000100  7a ce 3c b2 e1 d5 d9 11  88 63 10 59 76 3c dc 32 <-- random
00000110  72 32 2a 7d a3 e1 aa 13  7c da 58 a1 7b 21 11 50 <-- data
"

Faulty readback, collected without this change in place:
"
$ hexdump -vC /tmp/readback.bin
00000000  97 81 b7 3b 0e 38 2b 4d  a7 d3 e0 47 ff c2 4b ca <---.
00000010  c1 85 98 f0 4a ac 03 a0  3b ab f3 19 44 dd 06 8b <-. |
...                                                          | |
00000100  97 81 b7 3b 0e 38 2b 4d  a7 d3 e0 47 ff c2 4b ca <-:-+- repeated
00000110  c1 85 98 f0 4a ac 03 a0  3b ab f3 19 44 dd 06 8b <-+--- data
     ^^^
      |
      '--- Repeat starts at offset 0x100 = 256 Bytes
"

Fixes: 0d0c551011df ("PCI: rcar-gen4: Add R-Car Gen4 PCIe controller support for host mode")
Cc: stable@vger.kernel.org
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
---
Cc: "Krzysztof Wilczyński" <kwilczynski@kernel.org>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Geert Uytterhoeven <geert+renesas@glider.be>
Cc: Koichiro Den <den@valinux.co.jp>
Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
Cc: Magnus Damm <magnus.damm@gmail.com>
Cc: Manivannan Sadhasivam <mani@kernel.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pci@vger.kernel.org
Cc: linux-renesas-soc@vger.kernel.org
---
 drivers/pci/controller/dwc/pcie-rcar-gen4.c | 56 +++++++++++++++++++++
 1 file changed, 56 insertions(+)

diff --git a/drivers/pci/controller/dwc/pcie-rcar-gen4.c b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
index 8b03c42f8c84c..82f0a074a71da 100644
--- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c
+++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
@@ -576,6 +576,7 @@ static int r8a779f0_pcie_ltssm_control(struct rcar_gen4_pcie *rcar, bool enable)
 static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
 {
 	struct dw_pcie *dw = &rcar->dw;
+	u16 offset = dw_pcie_find_capability(dw, PCI_CAP_ID_EXP);
 	u32 val;
 
 	val = dw_pcie_readl_dbi(dw, PCIE_PORT_LANE_SKEW);
@@ -584,11 +585,66 @@ static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
 		val |= BIT(6);
 	dw_pcie_writel_dbi(dw, PCIE_PORT_LANE_SKEW, val);
 
+	val = dw_pcie_readl_dbi(dw, offset + PCI_EXP_DEVCTL);
+	val &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ);
+	val |= PCI_EXP_DEVCTL_PAYLOAD_256B | PCI_EXP_DEVCTL_READRQ_256B;
+	dw_pcie_writel_dbi(dw, offset + PCI_EXP_DEVCTL, val);
+
 	val = readl(rcar->base + PCIEPWRMNGCTRL);
 	val |= APP_CLK_REQ_N | APP_CLK_PM_EN;
 	writel(val, rcar->base + PCIEPWRMNGCTRL);
 }
 
+static void rcar_gen4_rc_pcie_quirk(struct pci_dev *dev)
+{
+	static const struct pci_device_id rcar_gen4_pcie_rc_devid = {
+		PCI_DEVICE(PCI_VENDOR_ID_RENESAS, 0x0030),
+		.class = PCI_CLASS_BRIDGE_PCI_NORMAL, .class_mask = ~0
+	};
+	struct pci_bus *bus = dev->bus;
+	struct pci_dev *bridge;
+
+	if (pci_is_root_bus(bus))
+		bridge = dev;
+
+	/* Look for the host bridge */
+	while (!pci_is_root_bus(bus)) {
+		bridge = bus->self;
+		bus = bus->parent;
+	}
+
+	if (!bridge)
+		return;
+
+	if (!pci_match_one_device(&rcar_gen4_pcie_rc_devid, bridge))
+		return;
+
+	/*
+	 * R-Car Gen4 PCIe controller has a hardware limitation of 256 Bytes
+	 * maximum payload size. The PCIe DMA generates requests of size up
+	 * to minimum(Max_Read_Request_Size, Max_Payload_Size). Force limit
+	 * both Max_Read_Request_Size and Max_Payload_Size to 256 Bytes and
+	 * propagate this limit to all downstream devices.
+	 *
+	 * For details, refer to:
+	 * R-Car S4 R19UH0161EJ0130 Rev.1.30 Jun. 16, 2025 or
+	 * R-Car V4H R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025 or
+	 * R-Car V4M R19UH0217EJ0100 Rev.1.00 Dec. 12, 2025,
+	 * chapters 104.1.1 Features and 104.3.9 DMA Transfer
+	 * section DMA Read Transfer.
+	 */
+	if (pcie_get_readrq(dev) > 256) {
+		dev_info(&dev->dev, "Limiting MRRS to 256 bytes\n");
+		pcie_set_readrq(dev, 256);
+	}
+
+	if (pcie_get_mps(dev) > 256) {
+		dev_info(&dev->dev, "Limiting MPS to 256 bytes\n");
+		pcie_set_mps(dev, 256);
+	}
+}
+DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, rcar_gen4_rc_pcie_quirk);
+
 static void rcar_gen4_pcie_phy_reg_update_bits(struct rcar_gen4_pcie *rcar,
 					       u32 offset, u32 mask, u32 val)
 {
-- 
2.53.0



* Re: [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes
  2026-04-25 23:38 [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes Marek Vasut
@ 2026-04-28  7:00 ` Koichiro Den
  2026-05-03 23:54   ` Marek Vasut
  2026-05-11 14:34 ` Manivannan Sadhasivam
  1 sibling, 1 reply; 8+ messages in thread
From: Koichiro Den @ 2026-04-28  7:00 UTC (permalink / raw)
  To: Marek Vasut
  Cc: linux-pci, stable, Krzysztof Wilczyński, Bjorn Helgaas,
	Geert Uytterhoeven, Lorenzo Pieralisi, Magnus Damm,
	Manivannan Sadhasivam, Rob Herring, Yoshihiro Shimoda,
	linux-kernel, linux-renesas-soc

On Sun, Apr 26, 2026 at 01:38:28AM +0200, Marek Vasut wrote:
> The R-Car Gen4 PCIe controller has a hardware limitation of 256 Bytes
> maximum payload size. The PCIe DMA generates requests of size up to
> minimum(Max_Read_Request_Size, Max_Payload_Size). Force-limit both
> Max_Read_Request_Size and Max_Payload_Size to 256 Bytes and propagate
> this limit to all downstream devices.
> 
> This limitation can be triggered, for example, by an NVMe SSD which
> does not use a host memory buffer; the Samsung 980 PRO is such an
> SSD. An affected SSD reports the 'hmpre' field as 0:
> "
> $ nvme id-ctrl /dev/nvme0 | grep hmpre
> hmpre     : 0
> "
> 
> The symptom is a read from the SSD which wraps around at a 256 Byte
> boundary. A test for this symptom can be implemented by writing
> 512 Bytes of random data into the SSD and reading the data back. If
> the read-back data repeat after 256 Bytes, the device is affected.
> "
> $ dd if=/dev/urandom of=/tmp/data.bin bs=256 count=2
> $ dd if=/tmp/data.bin of=/dev/nvme0n1 bs=256 count=2
> $ dd if=/dev/nvme0n1 bs=256 count=2 of=/tmp/readback.bin
> "
> 
> Expected data:
> "
> $ hexdump -vC /tmp/data.bin
> 00000000  97 81 b7 3b 0e 38 2b 4d  a7 d3 e0 47 ff c2 4b ca
> 00000010  c1 85 98 f0 4a ac 03 a0  3b ab f3 19 44 dd 06 8b
> ...
> 00000100  7a ce 3c b2 e1 d5 d9 11  88 63 10 59 76 3c dc 32 <-- random
> 00000110  72 32 2a 7d a3 e1 aa 13  7c da 58 a1 7b 21 11 50 <-- data
> "
> 
> Faulty readback, collected without this change in place:
> "
> $ hexdump -vC /tmp/readback.bin
> 00000000  97 81 b7 3b 0e 38 2b 4d  a7 d3 e0 47 ff c2 4b ca <---.
> 00000010  c1 85 98 f0 4a ac 03 a0  3b ab f3 19 44 dd 06 8b <-. |
> ...                                                          | |
> 00000100  97 81 b7 3b 0e 38 2b 4d  a7 d3 e0 47 ff c2 4b ca <-:-+- repeated
> 00000110  c1 85 98 f0 4a ac 03 a0  3b ab f3 19 44 dd 06 8b <-+--- data
>      ^^^
>       |
>       '--- Repeat starts at offset 0x100 = 256 Bytes
> "
> 
> Fixes: 0d0c551011df ("PCI: rcar-gen4: Add R-Car Gen4 PCIe controller support for host mode")
> Cc: stable@vger.kernel.org
> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
> ---
> Cc: "Krzysztof Wilczyński" <kwilczynski@kernel.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> Cc: Koichiro Den <den@valinux.co.jp>
> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
> Cc: Magnus Damm <magnus.damm@gmail.com>
> Cc: Manivannan Sadhasivam <mani@kernel.org>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-pci@vger.kernel.org
> Cc: linux-renesas-soc@vger.kernel.org
> ---
>  drivers/pci/controller/dwc/pcie-rcar-gen4.c | 56 +++++++++++++++++++++
>  1 file changed, 56 insertions(+)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-rcar-gen4.c b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
> index 8b03c42f8c84c..82f0a074a71da 100644
> --- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c
> +++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
> @@ -576,6 +576,7 @@ static int r8a779f0_pcie_ltssm_control(struct rcar_gen4_pcie *rcar, bool enable)
>  static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
>  {
>  	struct dw_pcie *dw = &rcar->dw;
> +	u16 offset = dw_pcie_find_capability(dw, PCI_CAP_ID_EXP);
>  	u32 val;
>  
>  	val = dw_pcie_readl_dbi(dw, PCIE_PORT_LANE_SKEW);
> @@ -584,11 +585,66 @@ static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
>  		val |= BIT(6);
>  	dw_pcie_writel_dbi(dw, PCIE_PORT_LANE_SKEW, val);
>  
> +	val = dw_pcie_readl_dbi(dw, offset + PCI_EXP_DEVCTL);
> +	val &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ);
> +	val |= PCI_EXP_DEVCTL_PAYLOAD_256B | PCI_EXP_DEVCTL_READRQ_256B;
> +	dw_pcie_writel_dbi(dw, offset + PCI_EXP_DEVCTL, val);
> +
>  	val = readl(rcar->base + PCIEPWRMNGCTRL);
>  	val |= APP_CLK_REQ_N | APP_CLK_PM_EN;
>  	writel(val, rcar->base + PCIEPWRMNGCTRL);
>  }

Hello Marek,

The patch makes sense to me. Let me ask two questions:

1. Could r8a779f0 (R-Car S4-8) be handled as well, perhaps by adding a separate
   .additional_common_init() implementation for it?

   As far as I can see, the r8a779f0 match data currently does not use
   rcar_gen4_pcie_additional_common_init().

2. Did you also happen to test V4H/V4M in endpoint (EP) mode, with the local
   eDMA engine issuing MRd requests toward host memory? Your commit message
   describes an NVMe device as the requester, but I'm wondering whether the same
   256B limit was also verified for the R-Car EP DMA requester path.



(*) The background for my question 2:

   I only have access to S4 Spider boards. In my RC <-> EP setup, where the EP
   side uses the local eDMA engine to issue MRd requests toward the RC, 256-byte
   MRd requests still appear to corrupt the transferred data. With the following
   change on top of your patch, my DMA-read tests become stable:

   ---8<-----8<---

   diff --git a/drivers/pci/controller/dwc/pcie-rcar-gen4.c b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
   index 82f0a074a71d..6910b9cd9d7b 100644
   --- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c
   +++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
   @@ -595,6 +595,18 @@ static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
           writel(val, rcar->base + PCIEPWRMNGCTRL);
    }
   
   +static void r8a779f0_additional_common_init(struct rcar_gen4_pcie *rcar)
   +{
   +       struct dw_pcie *dw = &rcar->dw;
   +       u16 offset = dw_pcie_find_capability(dw, PCI_CAP_ID_EXP);
   +       u32 val;
   +
   +       val = dw_pcie_readl_dbi(dw, offset + PCI_EXP_DEVCTL);
   +       val &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ);
   +       val |= PCI_EXP_DEVCTL_PAYLOAD_128B | PCI_EXP_DEVCTL_READRQ_128B;
   +       dw_pcie_writel_dbi(dw, offset + PCI_EXP_DEVCTL, val);
   +}
   +
    static void rcar_gen4_rc_pcie_quirk(struct pci_dev *dev)
    {
           static const struct pci_device_id rcar_gen4_pcie_rc_devid = {
   @@ -796,11 +808,13 @@ static int rcar_gen4_pcie_ltssm_control(struct rcar_gen4_pcie *rcar, bool enable
    }
   
    static struct rcar_gen4_pcie_drvdata drvdata_r8a779f0_pcie = {
   +       .additional_common_init = r8a779f0_additional_common_init,
           .ltssm_control = r8a779f0_pcie_ltssm_control,
           .mode = DW_PCIE_RC_TYPE,
    };
   
    static struct rcar_gen4_pcie_drvdata drvdata_r8a779f0_pcie_ep = {
   +       .additional_common_init = r8a779f0_additional_common_init,
           .ltssm_control = r8a779f0_pcie_ltssm_control,
           .mode = DW_PCIE_EP_TYPE,
    };

   ---8<-----8<---

   One detail which might be important is that limiting only MPS does not appear
   to be sufficient in my setup. MPS=128B with MRRS=256B still seems broken,
   while MPS=128B with MRRS=128B works fine. I wonder whether this is because
   the "MPS" term in the min(MRRS, MPS) limit for DMA read transfers may
   effectively be tied to the DMA read buffer segment size / MPSS rather than
   only to DevCtl.MPS. I'm not sure about this yet though.

   One more thing I noticed in the manuals:

     R-Car S4 R19UH0161EJ0130 Rev.1.30 Jun. 16, 2025:
       Type00 MPSS initial = 256B, PCI R, Internal R/W
       Type01 MPSS initial = 128B, PCI R, Internal R

     R-Car V4H R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025
       Type00 MPSS initial = 256B, PCI R, Internal R
       Type01 MPSS initial = 128B, PCI R, Internal R/W

   I'm still unsure, but this difference might be relevant. In particular, in
   V4H/V4M RC mode your patch programs DevCtl.MPS to 256B, but does not change
   Type01 MPSS. I wonder if the Type01 MPSS should also be updated to 256B first
   on SoCs where the manual says it is writable from the internal bus, or if I'm
   missing something here.



Best regards,
Koichiro

>  
> +static void rcar_gen4_rc_pcie_quirk(struct pci_dev *dev)
> +{
> +	static const struct pci_device_id rcar_gen4_pcie_rc_devid = {
> +		PCI_DEVICE(PCI_VENDOR_ID_RENESAS, 0x0030),
> +		.class = PCI_CLASS_BRIDGE_PCI_NORMAL, .class_mask = ~0
> +	};
> +	struct pci_bus *bus = dev->bus;
> +	struct pci_dev *bridge;
> +
> +	if (pci_is_root_bus(bus))
> +		bridge = dev;
> +
> +	/* Look for the host bridge */
> +	while (!pci_is_root_bus(bus)) {
> +		bridge = bus->self;
> +		bus = bus->parent;
> +	}
> +
> +	if (!bridge)
> +		return;
> +
> +	if (!pci_match_one_device(&rcar_gen4_pcie_rc_devid, bridge))
> +		return;
> +
> +	/*
> +	 * R-Car Gen4 PCIe controller has a hardware limitation of 256 Bytes
> +	 * maximum payload size. The PCIe DMA generates requests of size up
> +	 * to minimum(Max_Read_Request_Size, Max_Payload_Size). Force limit
> +	 * both Max_Read_Request_Size and Max_Payload_Size to 256 Bytes and
> +	 * propagate this limit to all downstream devices.
> +	 *
> +	 * For details, refer to:
> +	 * R-Car S4 R19UH0161EJ0130 Rev.1.30 Jun. 16, 2025 or
> +	 * R-Car V4H R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025 or
> +	 * R-Car V4M R19UH0217EJ0100 Rev.1.00 Dec. 12, 2025,
> +	 * chapters 104.1.1 Features and 104.3.9 DMA Transfer
> +	 * section DMA Read Transfer.
> +	 */
> +	if (pcie_get_readrq(dev) > 256) {
> +		dev_info(&dev->dev, "Limiting MRRS to 256 bytes\n");
> +		pcie_set_readrq(dev, 256);
> +	}
> +
> +	if (pcie_get_mps(dev) > 256) {
> +		dev_info(&dev->dev, "Limiting MPS to 256 bytes\n");
> +		pcie_set_mps(dev, 256);
> +	}
> +}
> +DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, rcar_gen4_rc_pcie_quirk);
> +
>  static void rcar_gen4_pcie_phy_reg_update_bits(struct rcar_gen4_pcie *rcar,
>  					       u32 offset, u32 mask, u32 val)
>  {
> -- 
> 2.53.0
> 


* Re: [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes
  2026-04-28  7:00 ` Koichiro Den
@ 2026-05-03 23:54   ` Marek Vasut
  2026-05-11 14:20     ` Koichiro Den
  0 siblings, 1 reply; 8+ messages in thread
From: Marek Vasut @ 2026-05-03 23:54 UTC (permalink / raw)
  To: Koichiro Den
  Cc: linux-pci, stable, Krzysztof Wilczyński, Bjorn Helgaas,
	Geert Uytterhoeven, Lorenzo Pieralisi, Magnus Damm,
	Manivannan Sadhasivam, Rob Herring, Yoshihiro Shimoda,
	linux-kernel, linux-renesas-soc

On 4/28/26 9:00 AM, Koichiro Den wrote:

Hello Den-san,

> The patch makes sense to me. Let me ask two questions:
> 
> 1. Could r8a779f0 (R-Car S4-8) be handled as well, perhaps by adding a separate
>     .additional_common_init() implementation for it?
> 
>     As far as I can see, the r8a779f0 match data currently does not use
>     rcar_gen4_pcie_additional_common_init().

I will address this one in V2, thank you for pointing that out.

> 2. Did you also happen to test V4H/V4M in endpoint (EP) mode, with the local
>     eDMA engine issuing MRd requests toward host memory?

I was not able to test this configuration.

Is it possible to perform this test with a single device, by having the 
eDMA do local-memory-read-to-local-memory-write transfers, maybe using 
PIPE_LOOPBACK/LOOPBACK_ENABLE bits, or do I need two devices with an NTB 
connection between them?

In case it is the latter, could you please briefly describe the S4 NTB 
setup you use, so I could try to replicate it locally?

> Your commit message
>     describes an NVMe device as the requester, but I'm wondering whether the same
>     256B limit was also verified for the R-Car EP DMA requester path.

This part I currently can not answer, I'm sorry.

...

I made the following two observations in the meantime.

First, I wrote two SSDs, a Crucial P5 Plus SSD without HMPRE (without host 
memory buffer) and an XPG GAMMIX P55 with HMPRE (with host memory buffer), 
with 4 GiB of random data on another system (iMX8M Plus, ARM64 with a DWC 
PCIe controller too), then I did a read back and compared the data: the 
written and read-back data matched.

Then I plugged both SSDs into V4H Sparrow Hawk _without_ this patch, and 
I did read back of data:

- Crucial P5 Plus SSD without HMPRE (without host memory buffer)
   -> Data read back match data written on iMX8M Plus, OK
- XPG GAMMIX P55 with HMPRE (with host memory buffer)
   -> Data read back match data written on iMX8M Plus, OK

Then I wrote 512 Bytes of data into the Crucial P5 Plus SSD without HMPRE 
on V4H Sparrow Hawk and did the read back again.
   -> Data read back does NOT match data written, NG

That would indicate that:
- WRITE transfers from SSD to DRAM are OK
- READ transfers from DRAM to SSD are corrupted at 256 Bytes boundary

That would indicate that we need _at_least_ the 256 Bytes limit, likely 
on both MPS and MRRS.

Second, I got a report of another SSD for which this patch is not 
sufficient. I currently do not have access to that SSD, but I will ask 
for access and investigate. That may shed some light on the 128 Byte 
limit below.

> (*) The background for my question 2:
> 
>     I only have access to S4 Spider boards. In my RC <-> EP setup, where the EP
>     side uses the local eDMA engine to issue MRd requests toward the RC, 256-byte
>     MRd requests still appear to corrupt the transferred data.

Is the corruption deterministic in some way, i.e. are the same bytes of 
the transferred data corrupted every time, or is the corruption "random"?

Does the corruption happen even on singular MRd transfer, or does it 
happen only when a lot of traffic is sent across the NTB link? I wonder 
if this corruption might be DRAM bandwidth related, i.e. whether the DMA 
does possibly saturate the DRAM controller with write requests and make 
the system run out of DRAM bandwidth.

> With the following
>     change on top of your patch, my DMA-read tests become stable:

[...]

>     One detail which might be important is that limiting only MPS does not appear
>     to be sufficient in my setup. MPS=128B with MRRS=256B still seems broken,
>     while MPS=128B with MRRS=128B works fine. I wonder whether this is because
>     the "MPS" term in the min(MRRS, MPS) limit for DMA read transfers may
>     effectively be tied to the DMA read buffer segment size / MPSS rather than
>     only to DevCtl.MPS. I'm not sure about this yet though.

I think setting MPS=128B MRRS=256B only leads to the transfer being 
split into 2 x 128B TLPs sent across the PCIe link, but in the end, 2 x 
128 Bytes of data are received (in some order) into the read segment 
buffer and reordered, and 1 x 256 Bytes are written from read segment 
buffer into the memory as a single write.

In case of MPS=256B MRRS=256B, only one 256B TLP is sent across the 
link, 1 x 256 Bytes of data are received into the read segment buffer 
with no reordering necessary, and 1 x 256 Bytes are still written from 
read segment buffer into the memory as a single write.

=> For MPS=128B/MPS=256B and MRRS=256B, there is a difference in the
    transfer format between PCIe and DMA, but there is no difference
    between DMA and DRAM.

But in case of MRRS=128B and a transfer of 256 Bytes, 2 x 128 Bytes of 
data are received into (separate? (*)) entries in the read segment buffer, 
and 2 x 128 Bytes are written from (separate?) entries in the read segment 
buffer into the memory as two separate writes. Could this different 
memory write pattern be responsible for the (lack of) corruption?

Do you know whether the data are corrupted on the PCIe-to-DMA side (when 
the data are received from the PCIe side and written into the read 
segment buffer) or on the DMA-to-DRAM side (on read from the read segment 
buffer or on write into DRAM)?

(*) Since the read segment buffer has 16 x 256 Byte segments, with 16 
DMA tags and never more than 16 MRd requests in flight, I think it is 
likely that each MRd's data lands in a separate read segment buffer 
segment. But this information comes from another datasheet, not the V4H one.
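
To make the splitting arithmetic above concrete, here is a small
host-side model (illustrative only; the function and names are
invented, the per-TLP limits follow the PCIe rules discussed above --
MRd requests are bounded by MRRS, CplD payloads by MPS):

  #include <stdio.h>

  #define MIN(a, b) ((a) < (b) ? (a) : (b))

  /* Model: an engine reading 'len' bytes issues MRd requests of at
   * most MRRS bytes each, and each request is completed with CplD
   * TLPs carrying at most MPS bytes of payload.
   */
  static void model(unsigned int len, unsigned int mrrs, unsigned int mps)
  {
      unsigned int reqs = (len + mrrs - 1) / mrrs;
      unsigned int cpls_per_req = (MIN(len, mrrs) + mps - 1) / mps;

      printf("len=%u MRRS=%u MPS=%u -> %u MRd, up to %u CplD each\n",
             len, mrrs, mps, reqs, cpls_per_req);
  }

  int main(void)
  {
      model(256, 256, 256);   /* 1 MRd, 1 x 256B CplD             */
      model(256, 256, 128);   /* 1 MRd, 2 x 128B CplD (reordered) */
      model(256, 128, 128);   /* 2 MRd, 1 x 128B CplD each        */
      return 0;
  }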

>     One more thing I noticed in the manuals:
> 
>       R-Car S4 R19UH0161EJ0130 Rev.1.30 Jun. 16, 2025:
>         Type00 MPSS initial = 256B, PCI R, Internal R/W
>         Type01 MPSS initial = 128B, PCI R, Internal R
> 
>       R-Car V4H R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025
>         Type00 MPSS initial = 256B, PCI R, Internal R
>         Type01 MPSS initial = 128B, PCI R, Internal R/W
> 
>     I'm still unsure, but this difference might be relevant. In particular, in
>     V4H/V4M RC mode your patch programs DevCtl.MPS to 256B, but does not change
>     Type01 MPSS. I wonder if the Type01 MPSS should also be updated to 256B first
>     on SoCs where the manual says it is writable from the internal bus, or if I'm
>     missing something here.

This is a very good point.

The R-Car S4 RM Rev.1.20 lists Type00 MPSS as Internal R and Type01 MPSS 
as Internal R/W. This was updated in RM Rev.1.30 to Type 00 Internal R/W 
and Type 01 Internal R. It is possible this change is going to be added 
into the V4H RM in the future too. That would likely imply that Type01 
MPSS is not programmable.

I don't think Type1 affects RC operation, but does it affect NTB?

[...]

Thank you for your help!

-- 
Best regards,
Marek Vasut


* Re: [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes
  2026-05-03 23:54   ` Marek Vasut
@ 2026-05-11 14:20     ` Koichiro Den
  2026-05-13  3:01       ` Marek Vasut
  0 siblings, 1 reply; 8+ messages in thread
From: Koichiro Den @ 2026-05-11 14:20 UTC (permalink / raw)
  To: Marek Vasut
  Cc: linux-pci, stable, Krzysztof Wilczyński, Bjorn Helgaas,
	Geert Uytterhoeven, Lorenzo Pieralisi, Magnus Damm,
	Manivannan Sadhasivam, Rob Herring, Yoshihiro Shimoda,
	linux-kernel, linux-renesas-soc

On Mon, May 04, 2026 at 01:54:01AM +0200, Marek Vasut wrote:
> On 4/28/26 9:00 AM, Koichiro Den wrote:
> 
> Hello Den-san,
> 
> > The patch makes sense to me. Let me ask two questions:
> > 
> > 1. Could r8a779f0 (R-Car S4-8) be handled as well, perhaps by adding a separate
> >     .additional_common_init() implementation for it?
> > 
> >     As far as I can see, the r8a779f0 match data currently does not use
> >     rcar_gen4_pcie_additional_common_init().

Hi Marek,

Thank you for the detailed analysis. I'm sorry for the late reply.

> 
> I will address this one in V2, thank you for pointing that out.

Thanks!

> 
> > 2. Did you also happen to test V4H/V4M in endpoint (EP) mode, with the local
> >     eDMA engine issuing MRd requests toward host memory?
> 
> I was not able to test this configuration.
> 
> Is it possible to perform this test with a single device, by having the eDMA
> do local-memory-read-to-local-memory-write transfers, maybe using
> PIPE_LOOPBACK/LOOPBACK_ENABLE bits, or do I need two devices with an NTB
> connection between them?
> 
> In case it is the latter, could you please briefly describe the S4 NTB setup
> you use, so I could try to replicate it locally?

My setup was a two-board setup:

  S4 Spider as RC <-> S4 Spider as EP, connected with OCuLink.

It is unfortunately not a small standalone reproducer. The setup was based on
the following RFC v4 series:

  [RFC PATCH v4 00/38] NTB transport backed by PCI EP embedded DMA
  https://lore.kernel.org/all/20260118135440.1958279-1-den@valinux.co.jp/

In particular, the workaround patch I used in the RFC series was:

  [RFC PATCH v4 31/38] NTB: epf: Add per-SoC quirk to cap MRRS for DWC eDMA (128B for R-Car)
  https://lore.kernel.org/all/20260118135440.1958279-32-den@valinux.co.jp/

Note that in that workaround I only capped MRRS (i.e. I did not add an MPS cap).
At least in that setup, avoiding 256B MRd requests was enough to make the
visible corruption disappear.

At a high level, the EP side exposes the vNTB endpoint function, and the RC side
uses the NTB data path which is backed by the EP-local eDMA through that vNTB
function. For the RC-to-EP data path, the EP-local eDMA acts as the requester:
it issues MRd requests toward remote RC memory, receives the CplD payloads, and
writes the data into EP-side memory. In other words, this is a DMA read transfer
from the point of view of the EP-local eDMA.

I have not tried PIPE_LOOPBACK/LOOPBACK_ENABLE. Given how heavy the setup
described above is, I am not asking you to reproduce the whole thing just for
this patch. Also, I do not want this NTB/eDMA observation to block your v2. For
now, please treat it as a separate observation from the RC/NVMe issue. I will
continue the investigation on my side and let you know if I can narrow down
where the corruption occurs.

> 
> > Your commit message
> >     describes an NVMe device as the requester, but I'm wondering whether the same
> >     256B limit was also verified for the R-Car EP DMA requester path.
> 
> This part I currently can not answer, I'm sorry.
> 
> ...
> 
> I made the following two observations in the meantime.
> 
> First, I wrote two SSDs, a Crucial P5 Plus SSD without HMPRE (without host
> memory buffer) and an XPG GAMMIX P55 with HMPRE (with host memory buffer),
> with 4 GiB of random data on another system (iMX8M Plus, ARM64 with a DWC
> PCIe controller too), then I did a read back and compared the data: the
> written and read-back data matched.
> 
> Then I plugged both SSDs into V4H Sparrow Hawk _without_ this patch, and I
> did read back of data:
> 
> - Crucial P5 Plus SSD without HMPRE (without host memory buffer)
>   -> Data read back match data written on iMX8M Plus, OK
> - XPG GAMMIX P55 with HMPRE (with host memory buffer)
>   -> Data read back match data written on iMX8M Plus, OK
> 
> Then I wrote 512 Bytes of data into the Crucial P5 Plus SSD without HMPRE on
> V4H Sparrow Hawk and did the read back again.
>   -> Data read back does NOT match data written, NG
> 
> That would indicate that:
> - WRITE transfers from SSD to DRAM are OK
> - READ transfers from DRAM to SSD are corrupted at 256 Bytes boundary
> 
> That would indicate that we need _at_least_ the 256 Bytes limit, likely on
> both MPS and MRRS.
> 
> Second, I got a report of another SSD for which this patch is not
> sufficient. I currently do not have access to that SSD, but I will ask for
> access and investigate. That may shed some light on the 128 Byte limit
> below.

Thank you for sharing these observations.
Interesting, that second point may indeed help determine whether my 128B
observation in the past is related to the same underlying issue, or is a purely
eDMA/NTB-specific one.

> 
> > (*) The background for my question 2:
> > 
> >     I only have access to S4 Spider boards. In my RC <-> EP setup, where the EP
> >     side uses the local eDMA engine to issue MRd requests toward the RC, 256-byte
> >     MRd requests still appear to corrupt the transferred data.
> 
> Is the corruption deterministic in some way, i.e. are the same bytes of the
> transferred data corrupted every time, or is the corruption "random" ?

The exact corrupted values were not deterministic, but the offsets where the
corruption occurred were fairly consistent.

Let me quote from my earlier RFC patch:
(https://lore.kernel.org/all/20260118135440.1958279-32-den@valinux.co.jp/)

  [...]
  * On some R-Car platforms using the Synopsys DWC PCIe + eDMA we
  * observe data corruption on RC->EP Remote DMA Read paths whenever
  * the EP issues large MRd requests. The corruption consistently
  * hits the tail of each 256-byte segment (e.g. offsets
  * 0x00E0..0x00FF within a 256B block, and again at 0x01E0..0x01FF
  * for larger transfers).
  [...]

> 
> Does the corruption happen even on singular MRd transfer, or does it happen
> only when a lot of traffic is sent across the NTB link? I wonder if this
> corruption might be DRAM bandwidth related, i.e. whether the DMA does
> possibly saturate the DRAM controller with write requests and make the
> system run out of DRAM bandwidth.

It occurred even with a single eDMA read transfer. It was not a symptom only
observable under high load.

> 
> > With the following
> >     change on top of your patch, my DMA-read tests become stable:
> 
> [...]
> 
> >     One detail which might be important is that limiting only MPS does not appear
> >     to be sufficient in my setup. MPS=128B with MRRS=256B still seems broken,
> >     while MPS=128B with MRRS=128B works fine. I wonder whether this is because
> >     the "MPS" term in the min(MRRS, MPS) limit for DMA read transfers may
> >     effectively be tied to the DMA read buffer segment size / MPSS rather than
> >     only to DevCtl.MPS. I'm not sure about this yet though.
> 
> I think setting MPS=128B MRRS=256B only leads to the transfer being split
> into 2 x 128B TLPs sent across the PCIe link, but in the end, 2 x 128 Bytes
> of data are received (in some order) into the read segment buffer and
> reordered, and 1 x 256 Bytes are written from read segment buffer into the
> memory as a single write.
> 
> In case of MPS=256B MRRS=256B, only one 256B TLP is sent across the link, 1
> x 256 Bytes of data are received into the read segment buffer with no
> reordering necessary, and 1 x 256 Bytes are still written from read segment
> buffer into the memory as a single write.
> 
> => For MPS=128B/MPS=256B and MRRS=256B, there is a difference in the
>    transfer format between PCIe and DMA, but there is no difference
>    between DMA and DRAM.
> 
> But in case of MRRS=128B and a transfer of 256 Bytes, 2 x 128 Bytes of data
> are received into (separate? (*)) entries in the read segment buffer, and 2 x
> 128 Bytes are written from (separate?) entries in the read segment buffer into
> the memory as two separate writes. Could this different memory write
> pattern be responsible for the (lack of) corruption?
> 
> Do you know whether the data are corrupted on the PCIe-to-DMA side (when the
> data are received from the PCIe side and written into the read segment
> buffer) or on the DMA-to-DRAM side (on read from the read segment buffer or on
> write into DRAM)?

Unfortunately I cannot distinguish these from software alone. I only observed
the final destination buffer contents after the eDMA read transfer completed.

> 
> (*) Since the read segment buffer has 16 x 256 Byte segments, with 16 DMA
> tags and never more than 16 MRd requests in flight, I think it is likely
> that each MRd's data lands in a separate read segment buffer segment. But this
> information comes from another datasheet, not the V4H one.
> 
> >     One more thing I noticed in the manuals:
> > 
> >       R-Car S4 R19UH0161EJ0130 Rev.1.30 Jun. 16, 2025:
> >         Type00 MPSS initial = 256B, PCI R, Internal R/W
> >         Type01 MPSS initial = 128B, PCI R, Internal R
> > 
> >       R-Car V4H R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025
> >         Type00 MPSS initial = 256B, PCI R, Internal R
> >         Type01 MPSS initial = 128B, PCI R, Internal R/W
> > 
> >     I'm still unsure, but this difference might be relevant. In particular, in
> >     V4H/V4M RC mode your patch programs DevCtl.MPS to 256B, but does not change
> >     Type01 MPSS. I wonder if the Type01 MPSS should also be updated to 256B first
> >     on SoCs where the manual says it is writable from the internal bus, or if I'm
> >     missing something here.
> 
> This is a very good point.
> 
> The R-Car S4 RM Rev.1.20 lists Type00 MPSS as Internal R and Type01 MPSS as
> Internal R/W. This was updated in RM Rev.1.30 to Type 00 Internal R/W and
> Type 01 Internal R. It is possible this change is going to be added into the
> V4H RM in the future too. That would likely imply, that Type01 MPSS is not
> programmable.
> 
> I don't think Type1 affects RC operation, but does it affect NTB ?

I have no evidence that Type1 affects NTB either. It was just a speculative idea
based on the difference I saw in the manuals.

Your inference, i.e. that the S4 RM Rev.1.30 may reflect the intended access
attributes and the V4H RM may later get a similar correction, sounds reasonable
to me.

I had not checked the S4 RM Rev.1.20, so I missed that change. Thanks for
pointing it out.

> 
> [...]
> 
> Thank you for your help!

Thank you for investigating this and for the very helpful analysis.
I will let you know if I find anything more.

Best regards,
Koichiro

> 
> -- 
> Best regards,
> Marek Vasut


* Re: [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes
  2026-04-25 23:38 [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes Marek Vasut
  2026-04-28  7:00 ` Koichiro Den
@ 2026-05-11 14:34 ` Manivannan Sadhasivam
  2026-05-12 22:57   ` Marek Vasut
  1 sibling, 1 reply; 8+ messages in thread
From: Manivannan Sadhasivam @ 2026-05-11 14:34 UTC (permalink / raw)
  To: Marek Vasut
  Cc: linux-pci, stable, Krzysztof Wilczyński, Bjorn Helgaas,
	Geert Uytterhoeven, Koichiro Den, Lorenzo Pieralisi, Magnus Damm,
	Rob Herring, Yoshihiro Shimoda, linux-kernel, linux-renesas-soc

On Sun, Apr 26, 2026 at 01:38:28AM +0200, Marek Vasut wrote:
> The R-Car Gen4 PCIe controller has a hardware limitation of 256 Bytes
> maximum payload size. The PCIe DMA generates requests of size up to
> minimum(Max_Read_Request_Size, Max_Payload_Size). Force-limit both
> Max_Read_Request_Size and Max_Payload_Size to 256 Bytes and propagate
> this limit to all downstream devices.
> 
> This limitation can be triggered, for example, by an NVMe SSD which
> does not use a host memory buffer; the Samsung 980 PRO is such an
> SSD. An affected SSD reports the 'hmpre' field as 0:
> "
> $ nvme id-ctrl /dev/nvme0 | grep hmpre
> hmpre     : 0
> "
> 
> The symptom is a read from the SSD which wraps around at a 256 Byte
> boundary. A test for this symptom can be implemented by writing
> 512 Bytes of random data into the SSD and reading the data back. If
> the read-back data repeat after 256 Bytes, the device is affected.
> "
> $ dd if=/dev/urandom of=/tmp/data.bin bs=256 count=2
> $ dd if=/tmp/data.bin of=/dev/nvme0n1 bs=256 count=2
> $ dd if=/dev/nvme0n1 bs=256 count=2 of=/tmp/readback.bin
> "
> 
> Expected data:
> "
> $ hexdump -vC /tmp/data.bin
> 00000000  97 81 b7 3b 0e 38 2b 4d  a7 d3 e0 47 ff c2 4b ca
> 00000010  c1 85 98 f0 4a ac 03 a0  3b ab f3 19 44 dd 06 8b
> ...
> 00000100  7a ce 3c b2 e1 d5 d9 11  88 63 10 59 76 3c dc 32 <-- random
> 00000110  72 32 2a 7d a3 e1 aa 13  7c da 58 a1 7b 21 11 50 <-- data
> "
> 
> Faulty readback, collected without this change in place:
> "
> $ hexdump -vC /tmp/readback.bin
> 00000000  97 81 b7 3b 0e 38 2b 4d  a7 d3 e0 47 ff c2 4b ca <---.
> 00000010  c1 85 98 f0 4a ac 03 a0  3b ab f3 19 44 dd 06 8b <-. |
> ...                                                          | |
> 00000100  97 81 b7 3b 0e 38 2b 4d  a7 d3 e0 47 ff c2 4b ca <-:-+- repeated
> 00000110  c1 85 98 f0 4a ac 03 a0  3b ab f3 19 44 dd 06 8b <-+--- data
>      ^^^
>       |
>       '--- Repeat starts at offset 0x100 = 256 Bytes
> "
> 
> Fixes: 0d0c551011df ("PCI: rcar-gen4: Add R-Car Gen4 PCIe controller support for host mode")
> Cc: stable@vger.kernel.org
> Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
> ---
> Cc: "Krzysztof Wilczyński" <kwilczynski@kernel.org>
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: Geert Uytterhoeven <geert+renesas@glider.be>
> Cc: Koichiro Den <den@valinux.co.jp>
> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>
> Cc: Magnus Damm <magnus.damm@gmail.com>
> Cc: Manivannan Sadhasivam <mani@kernel.org>
> Cc: Rob Herring <robh@kernel.org>
> Cc: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-pci@vger.kernel.org
> Cc: linux-renesas-soc@vger.kernel.org
> ---
>  drivers/pci/controller/dwc/pcie-rcar-gen4.c | 56 +++++++++++++++++++++
>  1 file changed, 56 insertions(+)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-rcar-gen4.c b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
> index 8b03c42f8c84c..82f0a074a71da 100644
> --- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c
> +++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
> @@ -576,6 +576,7 @@ static int r8a779f0_pcie_ltssm_control(struct rcar_gen4_pcie *rcar, bool enable)
>  static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
>  {
>  	struct dw_pcie *dw = &rcar->dw;
> +	u16 offset = dw_pcie_find_capability(dw, PCI_CAP_ID_EXP);
>  	u32 val;
>  
>  	val = dw_pcie_readl_dbi(dw, PCIE_PORT_LANE_SKEW);
> @@ -584,11 +585,66 @@ static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
>  		val |= BIT(6);
>  	dw_pcie_writel_dbi(dw, PCIE_PORT_LANE_SKEW, val);
>  
> +	val = dw_pcie_readl_dbi(dw, offset + PCI_EXP_DEVCTL);
> +	val &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ);
> +	val |= PCI_EXP_DEVCTL_PAYLOAD_256B | PCI_EXP_DEVCTL_READRQ_256B;
> +	dw_pcie_writel_dbi(dw, offset + PCI_EXP_DEVCTL, val);

Instead of limiting the MRRS/MPS values for all devices through quirks, why
can't you just limit the Root Port's MPSS value in PCI_EXP_DEVCAP?

- Mani
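
For reference, capping the advertised MPSS as suggested could look
roughly like the sketch below (a sketch only -- whether DEVCAP is
internally writable on this hardware is exactly what the follow-ups
examine; dw_pcie_dbi_ro_wr_en()/dw_pcie_dbi_ro_wr_dis() are the usual
DWC helpers for writing otherwise read-only DBI registers):

  u16 offset = dw_pcie_find_capability(dw, PCI_CAP_ID_EXP);
  u32 val;

  /* Advertise MPSS = 256B (3'b001) in the Root Port's DEVCAP */
  dw_pcie_dbi_ro_wr_en(dw);
  val = dw_pcie_readl_dbi(dw, offset + PCI_EXP_DEVCAP);
  val &= ~PCI_EXP_DEVCAP_PAYLOAD;
  val |= 0x1;
  dw_pcie_writel_dbi(dw, offset + PCI_EXP_DEVCAP, val);
  dw_pcie_dbi_ro_wr_dis(dw);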

> +
>  	val = readl(rcar->base + PCIEPWRMNGCTRL);
>  	val |= APP_CLK_REQ_N | APP_CLK_PM_EN;
>  	writel(val, rcar->base + PCIEPWRMNGCTRL);
>  }
>  
> +static void rcar_gen4_rc_pcie_quirk(struct pci_dev *dev)
> +{
> +	static const struct pci_device_id rcar_gen4_pcie_rc_devid = {
> +		PCI_DEVICE(PCI_VENDOR_ID_RENESAS, 0x0030),
> +		.class = PCI_CLASS_BRIDGE_PCI_NORMAL, .class_mask = ~0
> +	};
> +	struct pci_bus *bus = dev->bus;
> +	struct pci_dev *bridge;
> +
> +	if (pci_is_root_bus(bus))
> +		bridge = dev;
> +
> +	/* Look for the host bridge */
> +	while (!pci_is_root_bus(bus)) {
> +		bridge = bus->self;
> +		bus = bus->parent;
> +	}
> +
> +	if (!bridge)
> +		return;
> +
> +	if (!pci_match_one_device(&rcar_gen4_pcie_rc_devid, bridge))
> +		return;
> +
> +	/*
> +	 * R-Car Gen4 PCIe controller has a hardware limitation of 256 Bytes
> +	 * maximum payload size. The PCIe DMA generates requests of size up
> +	 * to minimum(Max_Read_Request_Size, Max_Payload_Size). Force limit
> +	 * both Max_Read_Request_Size and Max_Payload_Size to 256 Bytes and
> +	 * propagate this limit to all downstream devices.
> +	 *
> +	 * For details, refer to:
> +	 * R-Car S4 R19UH0161EJ0130 Rev.1.30 Jun. 16, 2025 or
> +	 * R-Car V4H R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025 or
> +	 * R-Car V4M R19UH0217EJ0100 Rev.1.00 Dec. 12, 2025,
> +	 * chapters 104.1.1 Features and 104.3.9 DMA Transfer
> +	 * section DMA Read Transfer.
> +	 */
> +	if (pcie_get_readrq(dev) > 256) {
> +		dev_info(&dev->dev, "Limiting MRRS to 256 bytes\n");
> +		pcie_set_readrq(dev, 256);
> +	}
> +
> +	if (pcie_get_mps(dev) > 256) {
> +		dev_info(&dev->dev, "Limiting MPS to 256 bytes\n");
> +		pcie_set_mps(dev, 256);
> +	}
> +}
> +DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, rcar_gen4_rc_pcie_quirk);
> +
>  static void rcar_gen4_pcie_phy_reg_update_bits(struct rcar_gen4_pcie *rcar,
>  					       u32 offset, u32 mask, u32 val)
>  {
> -- 
> 2.53.0
> 
> 

-- 
மணிவண்ணன் சதாசிவம்


* Re: [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes
  2026-05-11 14:34 ` Manivannan Sadhasivam
@ 2026-05-12 22:57   ` Marek Vasut
  2026-05-12 23:08     ` Marek Vasut
  0 siblings, 1 reply; 8+ messages in thread
From: Marek Vasut @ 2026-05-12 22:57 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: linux-pci, stable, Krzysztof Wilczyński, Bjorn Helgaas,
	Geert Uytterhoeven, Koichiro Den, Lorenzo Pieralisi, Magnus Damm,
	Rob Herring, Yoshihiro Shimoda, linux-kernel, linux-renesas-soc

On 5/11/26 4:34 PM, Manivannan Sadhasivam wrote:

Hello Manivannan,

>>   drivers/pci/controller/dwc/pcie-rcar-gen4.c | 56 +++++++++++++++++++++
>>   1 file changed, 56 insertions(+)
>>
>> diff --git a/drivers/pci/controller/dwc/pcie-rcar-gen4.c b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
>> index 8b03c42f8c84c..82f0a074a71da 100644
>> --- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c
>> +++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
>> @@ -576,6 +576,7 @@ static int r8a779f0_pcie_ltssm_control(struct rcar_gen4_pcie *rcar, bool enable)
>>   static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
>>   {
>>   	struct dw_pcie *dw = &rcar->dw;
>> +	u16 offset = dw_pcie_find_capability(dw, PCI_CAP_ID_EXP);
>>   	u32 val;
>>   
>>   	val = dw_pcie_readl_dbi(dw, PCIE_PORT_LANE_SKEW);
>> @@ -584,11 +585,66 @@ static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
>>   		val |= BIT(6);
>>   	dw_pcie_writel_dbi(dw, PCIE_PORT_LANE_SKEW, val);
>>   
>> +	val = dw_pcie_readl_dbi(dw, offset + PCI_EXP_DEVCTL);
>> +	val &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ);
>> +	val |= PCI_EXP_DEVCTL_PAYLOAD_256B | PCI_EXP_DEVCTL_READRQ_256B;
>> +	dw_pcie_writel_dbi(dw, offset + PCI_EXP_DEVCTL, val);
> 
> Instead of limiting the MRRS/MPS values for all devices through quirks, why
> can't you just limit the Root Port's MPSS value in PCI_EXP_DEVCAP?

The root port MPSS is already 3'b001 = 256 Bytes and is read-only for 
EXPCAP1F0 (PCI_EXP_DEVCAP).

The controller is limited to MPS 256 Bytes according to V4H rev.1.30 
documentation. There is no explicitly spelled out MRRS limitation in the 
documentation to my knowledge, except for the DMA hint, but please read on.

The root port EXPCAP2F0 MPS is 128 Bytes and MRRS is 512 Bytes.

I now noticed that in the V4H rev.1.30 documentation, the EXPCAP2F0 MRRS 
field defaults to 3'b010 = 512 Bytes, but that value is "Reserved" 
and the only two non-reserved values are 3'b000 and 3'b001, which are MRRS 
128 Bytes and 256 Bytes respectively. That means MRRS has to be trimmed 
to a maximum of 256 Bytes in software to avoid "Reserved" settings. I will 
also ask the hardware and documentation team about this.

As a result, I adjust EXPCAP2F0:

- I raise MPS from 128 Bytes to 256 Bytes
- I reduce MRRS from 512 Bytes to 256 Bytes (this is important to 
prevent data corruption)

However, the downstream devices (in my case, PCIe SSD) can still be 
configured with MRRS > 256 (in my case, Crucial P5 Plus 1 TiB has 
MRRS=512 and MPS=128), which is where the quirk kicks in and 
reconfigures MRRS for those downstream devices.

The pci_configure_mps() does propagate MPS from the root port EXPCAP2F0 to 
downstream devices, but there is no equivalent for MRRS as far as I can 
find?
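
One conceivable shape for such an MRRS propagation, using only existing
PCI core helpers, would be a bus walk below the affected Root Port (a
sketch only, untested; rcar_gen4_clamp_mrrs is an invented name):

  static int rcar_gen4_clamp_mrrs(struct pci_dev *dev, void *data)
  {
  	if (pcie_get_readrq(dev) > 256)
  		pcie_set_readrq(dev, 256);
  	return 0;
  }

  /* e.g. after enumeration, from the host driver: */
  pci_walk_bus(bridge->subordinate, rcar_gen4_clamp_mrrs, NULL);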

Thank you for your help!


* Re: [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes
  2026-05-12 22:57   ` Marek Vasut
@ 2026-05-12 23:08     ` Marek Vasut
  0 siblings, 0 replies; 8+ messages in thread
From: Marek Vasut @ 2026-05-12 23:08 UTC (permalink / raw)
  To: Manivannan Sadhasivam
  Cc: linux-pci, stable, Krzysztof Wilczyński, Bjorn Helgaas,
	Geert Uytterhoeven, Koichiro Den, Lorenzo Pieralisi, Magnus Damm,
	Rob Herring, Yoshihiro Shimoda, linux-kernel, linux-renesas-soc

On 5/13/26 12:57 AM, Marek Vasut wrote:
> On 5/11/26 4:34 PM, Manivannan Sadhasivam wrote:
> 
> Hello Manivannan,
> 
>>>   drivers/pci/controller/dwc/pcie-rcar-gen4.c | 56 +++++++++++++++++++++
>>>   1 file changed, 56 insertions(+)
>>>
>>> diff --git a/drivers/pci/controller/dwc/pcie-rcar-gen4.c b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
>>> index 8b03c42f8c84c..82f0a074a71da 100644
>>> --- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c
>>> +++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c
>>> @@ -576,6 +576,7 @@ static int r8a779f0_pcie_ltssm_control(struct rcar_gen4_pcie *rcar, bool enable)
>>>   static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
>>>   {
>>>       struct dw_pcie *dw = &rcar->dw;
>>> +    u16 offset = dw_pcie_find_capability(dw, PCI_CAP_ID_EXP);
>>>       u32 val;
>>>       val = dw_pcie_readl_dbi(dw, PCIE_PORT_LANE_SKEW);
>>> @@ -584,11 +585,66 @@ static void rcar_gen4_pcie_additional_common_init(struct rcar_gen4_pcie *rcar)
>>>           val |= BIT(6);
>>>       dw_pcie_writel_dbi(dw, PCIE_PORT_LANE_SKEW, val);
>>> +    val = dw_pcie_readl_dbi(dw, offset + PCI_EXP_DEVCTL);
>>> +    val &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ);
>>> +    val |= PCI_EXP_DEVCTL_PAYLOAD_256B | PCI_EXP_DEVCTL_READRQ_256B;
>>> +    dw_pcie_writel_dbi(dw, offset + PCI_EXP_DEVCTL, val);
>>
>> Instead of limiting the MRRS/MPS values for all devices through 
>> quirks, why
>> can't you just limit the Root Port's MPSS value in PCI_EXP_DEVCAP?
> The root port MPSS is already 3'b001 = 256 Bytes and is read-only for 
> EXPCAP1F0 (PCI_EXP_DEVCAP) .

I have to correct myself here -- EXPCAP1F0 Type 0 MPSS is 256 Bytes and 
Read-Only, Type 1 MPSS is 128 Bytes and Read-Write. I will now try to 
increase the latter, but the MRRS topic below remains.

> The controller is limited to MPS 256 Bytes according to V4H rev.1.30 
> documentation. There is no explicitly spelled out MRRS limitation in the 
> documentation to my knowledge, except for the DMA hint, but please read on.
> 
> The root port EXPCAP2F0 MPS is 128 Bytes and MRRS is 512 Bytes .
> 
> I now noticed that in V4H rev.1.30 documentation, the EXPCAP2F0 MRRS 
> field is default set to 3'b010 = 512 Bytes, but that value is "Reserved" 
> and only two non-reserved values are 3'b000 and 3'b001 which are MRRS 
> 128 Bytes and 256 Bytes respectively. That means MRRS has to be trimmed 
> to maximum 256 Bytes in software to avoid "Reserved" settings. I will 
> also ask the hardware and documentation team about this.
> 
> As a result, I adjust EXPCAP2F0:
> 
> - I raise MPS from 128 Bytes to 256 Bytes
> - I reduce MRRS from 512 Bytes to 256 Bytes (this is important to 
> prevent data corruption)
> 
> However, the downstream devices (in my case, PCIe SSD) can still be 
> configured with MRRS > 256 (in my case, Crucial P5 Plus 1 TiB has 
> MRRS=512 and MPS=128), which is where the quirk kicks in and 
> reconfigures MRRS for those downstream devices.
> 
> The pci_configure_mps() does propagate MPS from root port EXPCAP2F0 to 
> downstream devices, but there is no equivalent for MRRS as far as I can 
> find ?
> 
> Thank you for your help!



* Re: [PATCH] PCI: rcar-gen4: Limit Max_Read_Request_Size and Max_Payload_Size to 256 Bytes
  2026-05-11 14:20     ` Koichiro Den
@ 2026-05-13  3:01       ` Marek Vasut
  0 siblings, 0 replies; 8+ messages in thread
From: Marek Vasut @ 2026-05-13  3:01 UTC (permalink / raw)
  To: Koichiro Den
  Cc: linux-pci, stable, Krzysztof Wilczyński, Bjorn Helgaas,
	Geert Uytterhoeven, Lorenzo Pieralisi, Magnus Damm,
	Manivannan Sadhasivam, Rob Herring, Yoshihiro Shimoda,
	linux-kernel, linux-renesas-soc

On 5/11/26 4:20 PM, Koichiro Den wrote:

Hello Den-san,

>>> 2. Did you also happen to test V4H/V4M in endpoint (EP) mode, with the local
>>>      eDMA engine issuing MRd requests toward host memory?
>>
>> I was not able to test this configuration.
>>
>> Is it possible to perform this test with a single device, by having the eDMA
>> do local-memory-read-to-local-memory-write transfers, maybe using
>> PIPE_LOOPBACK/LOOPBACK_ENABLE bits, or do I need two devices with NTB
>> connection between them ?
>>
>> In case it is the later, could you please briefly describe the S4 NTB setup
>> you use, so I could try to replicate it locally ?
> 
> My setup was a two-board setup:
> 
>    S4 Spider as RC <-> S4 Spider as EP, connected with OCuLink.
> 
> It is unfortunately not a small standalone reproducer. The setup was based on
> the following RFC v4 series:
> 
>    [RFC PATCH v4 00/38] NTB transport backed by PCI EP embedded DMA
>    https://lore.kernel.org/all/20260118135440.1958279-1-den@valinux.co.jp/
> 
> In particular, the workaround patch I used in the RFC series was:
> 
>    [RFC PATCH v4 31/38] NTB: epf: Add per-SoC quirk to cap MRRS for DWC eDMA (128B for R-Car)
>    https://lore.kernel.org/all/20260118135440.1958279-32-den@valinux.co.jp/
> 
> Note that in that workaround I only capped MRRS (i.e. I did not add an MPS cap).
> At least in that setup, avoiding 256B MRd requests was enough to make the
> visible corruption disappear.

I have been investigating the MPSS/MPS/MRRS a bit deeper. Until today, I 
did not make the connection between your last observation in your 
previous email -- which already hinted at what the issue might be: the 
MPSS bitfield and the TYPE00/TYPE01 accesses -- and one more nudge from 
Manivannan in the MPSS direction. Thank you both for those two items.

It seems that for S4, the latest documentation rev.1.30 indicates 
EXPCAP1F MPSS as read-write and configurable between 128B and 256B for 
TYPE00 (EP) access, but read-only and fixed at 128B for TYPE01 (RC) 
access.

If the S4 PCIe in RC mode is only capable of 128B long TLPs and in EP 
mode is capable of 128B or 256B long TLPs, this might explain why you 
observe corruption with 256B long TLPs between two S4 Spiders. The S4 
Spider in EP mode might work just fine with another RC which can do 256B 
long TLPs.

I still do not understand one more observation -- if I configure V4H 
PCIe as RC, and read out EXPCAP1F register MPSS field, it reads as 256B 
(value 3'b001). I would expect EXPCAP1F register MPSS field to read out 
as (default) 128B in this RC case. The V4H documentation indicates 
EXPCAP1F MPSS as read-ONLY and set to fixed 256B for TYPE00 (EP) access, 
but read-WRITE and set to 128B for TYPE01 (RC) access, which I think 
might be a documentation issue. I also do not rewrite the EXPCAP1F MPSS 
in any way.

If the V4H is also capable of only 128B TLPs in RC mode, then this patch 
would require additional adjustment and would have to limit TLP length 
based on configuration -- 128B for RC, 256B for EP.
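
One rough shape for such a mode-dependent cap, sketched on top of the
existing init code (assuming the driver's drvdata->mode field; the
128B limit in RC mode is an unconfirmed assumption pending the
documentation clarification):

  /* Pick the TLP cap from the controller mode; 128B in RC mode is
   * an assumption, not yet confirmed by the documentation.
   */
  u32 cap = (rcar->drvdata->mode == DW_PCIE_RC_TYPE) ?
  	  (PCI_EXP_DEVCTL_PAYLOAD_128B | PCI_EXP_DEVCTL_READRQ_128B) :
  	  (PCI_EXP_DEVCTL_PAYLOAD_256B | PCI_EXP_DEVCTL_READRQ_256B);

  val = dw_pcie_readl_dbi(dw, offset + PCI_EXP_DEVCTL);
  val &= ~(PCI_EXP_DEVCTL_PAYLOAD | PCI_EXP_DEVCTL_READRQ);
  val |= cap;
  dw_pcie_writel_dbi(dw, offset + PCI_EXP_DEVCTL, val);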

I will now ask for documentation clarification.

> At a high level, the EP side exposes the vNTB endpoint function, and the RC side
> uses the NTB data path which is backed by the EP-local eDMA through that vNTB
> function. For the RC-to-EP data path, the EP-local eDMA acts as the requester:
> it issues MRd requests toward remote RC memory, receives the CplD payloads, and
> writes the data into EP-side memory. In other words, this is a DMA read transfer
> from the point of view of the EP-local eDMA.

I understand. If the S4 EP has MPSS set to 256 Bytes (and possibly also 
MPS), but the S4 RC may (*) be limited to MPSS and MPS 128 Bytes, I 
wonder if the MRd from the EP-local DMA sent to RC might be causing 
malfunction on the RC side.

(*) to be determined, I will ask.

> I have not tried PIPE_LOOPBACK/LOOPBACK_ENABLE. Given how heavy the setup
> described above is, I am not asking you to reproduce the whole thing just for
> this patch. Also, I do not want this NTB/eDMA observation to block your v2. For
> now, please treat it as a separate observation from the RC/NVMe issue. I will
> continue the investigation on my side and let you know if I can narrow down
> where the corruption occurs.

I very much appreciate your input, and in light of it, I believe this 
patch does need an update.

As for local OCuLink setup options, I already had a closer look as well.

>>> Your commit message
>>>      describes an NVMe device as the requester, but I'm wondering whether the same
>>>      256B limit was also verified for the R-Car EP DMA requester path.
>>
>> This part I currently can not answer, I'm sorry.
>>
>> ...
>>
>> I made the following two observations in the meantime.
>>
>> First, I wrote two SSDs, Crucial P5 Plus SSD without HMPRE (without host
>> memory buffer) and XPG GAMMIX P55 with HMPRE (with host memory buffer) with
>> 4 GiB of random data on another system (iMX8M Plus, ARM64 with DWC PCIe
>> controller too), then I did a read back and compared the data, the writen
>> and read-back data matched.
>>
>> Then I plugged both SSDs into V4H Sparrow Hawk _without_ this patch, and I
>> did read back of data:
>>
>> - Crucial P5 Plus SSD without HMPRE (without host memory buffer)
>>    -> Data read back match data written on iMX8M Plus, OK
>> - XPG GAMMIX P55 with HMPRE (with host memory buffer)
>>    -> Data read back match data written on iMX8M Plus, OK
>>
>> Then I wrote 512 Byte of data into the Crucial P5 Plus SSD without HMPRE on
>> V4H Sparrow Hawk and did read back again.
>>    -> Data read back does NOT match data written, NG
>>
>> That would indicate that:
>> - WRITE transfers from SSD to DRAM are OK
>> - READ transfers from DRAM to SSD are corrupted at 256 Bytes boundary
>>
>> That would indicate that we need _at_least_ the 256 Bytes limit, likely on
>> both MPS and MRRS.
>>
>> Second, I got a report of another SSD for which this patch is not
>> sufficient. I currently do not have access to that SSD, but I will ask for
>> access and investigate. That may shed some light on the 128 Byte limit
>> below.
> 
> Thank you for sharing these observations.
> Interesting, that second point may indeed help determine whether my 128B
> observation in the past is related to the same underlying issue, or is a purely
> eDMA/NTB-specific one.

Could you please have a look at the beginning of this email too? I 
wonder if the TYPE00/TYPE01 accesses might have different TLP size 
limitations.

>>> (*) The background for my question 2:
>>>
>>>      I only have access to S4 Spider boards. In my RC <-> EP setup, where the EP
>>>      side uses the local eDMA engine to issue MRd requests toward the RC, 256-byte
>>>      MRd requests still appear to corrupt the transferred data.
>>
>> Is the corruption deterministic in some way, i.e. are the same bytes of the
>> transferred data corrupted every time, or is the corruption "random" ?
> 
> The exact corrupted values were not deterministic, but the offsets where the
> corruption occurred were fairly consistent.
> 
> Let me quote from my earlier RFC patch:
> (https://lore.kernel.org/all/20260118135440.1958279-32-den@valinux.co.jp/)
> 
>    [...]
>    * On some R-Car platforms using the Synopsys DWC PCIe + eDMA we
>    * observe data corruption on RC->EP Remote DMA Read paths whenever
>    * the EP issues large MRd requests. The corruption consistently
>    * hits the tail of each 256-byte segment (e.g. offsets
>    * 0x00E0..0x00FF within a 256B block, and again at 0x01E0..0x01FF
>    * for larger transfers).
>    [...]

I see.

>> Does the corruption happen even on singular MRd transfer, or does it happen
>> only when a lot of traffic is sent across the NTB link? I wonder if this
>> corruption might be DRAM bandwidth related, i.e. whether the DMA does
>> possibly saturate the DRAM controller with write requests and make the
>> system run out of DRAM bandwidth.
> 
> It occurred even with a single eDMA read transfer. It was not a symptom only
> observable under high load.

That rules out my hypothesis that this might be link stability related, 
or memory or interconnect pressure related. Thank you for this input.

>>> With the following
>>>      change on top of your patch, my DMA-read tests become stable:
>>
>> [...]
>>
>>>      One detail which might be important is that limiting only MPS does not appear
>>>      to be sufficient in my setup. MPS=128B with MRRS=256B still seems broken,
>>>      while MPS=128B with MRRS=128B works fine. I wonder whether this is because
>>>      the "MPS" term in the min(MRRS, MPS) limit for DMA read transfers may
>>>      effectively be tied to the DMA read buffer segment size / MPSS rather than
>>>      only to DevCtl.MPS. I'm not sure about this yet though.
>>
>> I think setting MPS=128B MRRS=256B only leads to the transfer being split
>> into 2 x 128B TLPs sent across the PCIe link, but in the end, 2 x 128 Bytes
>> of data are received (in some order) into the read segment buffer and
>> reordered, and 1 x 256 Bytes are written from read segment buffer into the
>> memory as a single write.
>>
>> In case of MPS=256B MRRS=256B, only one 256B TLP is sent across the link, 1
>> x 256 Bytes of data are received into the read segment buffer with no
>> reordering necessary, and 1 x 256 Bytes are still written from read segment
>> buffer into the memory as a single write.
>>
>> => For MPS=128B/MPS=256B and MRRS=256B, there is difference in the
>>     transfer format between PCIe and DMA, but there is no difference
>>     between DMA and DRAM .
>>
>> But in case of MRRS=128B and transfer of 256 Bytes, 2 x 128 Bytes of data
>> are received into (separate? (*)) entries in read segment buffer, and 2 x
>> 128 Bytes are written from (separate?) entries in read segment buffer into
>> the memory as two separate writes . Could this different memory write
>> pattern be responsible for the (lack of) corruption ?
>>
>> Do you know whether the data are corrupted on the PCIe-to-DMA side (when the
>> data are received from the PCIe side and written into the read buffer
>> segment) or on the DMA-to-DRAM side (on read from read segment buffer or on
>> write into DRAM) ?
> 
> Unfortunately I cannot distinguish these from software alone. I only observed
> the final destination buffer contents after the eDMA read transfer completed.

I understand.

>> (*) Since the read segment buffer has 16 x 256 Byte segments, with 16 DMA
>> tags and never more than 16 MRd requests in flight, I think it is likely
>> that each MRd data land in separate read segment buffer segment. But this
>> information comes from another datasheet, not V4H one.
>>
>>>      One more thing I noticed in the manuals:
>>>
>>>        R-Car S4 R19UH0161EJ0130 Rev.1.30 Jun. 16, 2025:
>>>          Type00 MPSS initial = 256B, PCI R, Internal R/W
>>>          Type01 MPSS initial = 128B, PCI R, Internal R
>>>
>>>        R-Car V4H R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025
>>>          Type00 MPSS initial = 256B, PCI R, Internal R
>>>          Type01 MPSS initial = 128B, PCI R, Internal R/W
>>>
>>>      I'm still unsure, but this difference might be relevant. In particular, in
>>>      V4H/V4M RC mode your patch programs DevCtl.MPS to 256B, but does not change
>>>      Type01 MPSS. I wonder if the Type01 MPSS should also be updated to 256B first
>>>      on SoCs where the manual says it is writable from the internal bus, or if I'm
>>>      missing something here.
>>
>> This is a very good point.
>>
>> The R-Car S4 RM Rev.1.20 lists Type00 MPSS as Internal R and Type01 MPSS as
>> Internal R/W. This was updated in RM Rev.1.30 to Type 00 Internal R/W and
>> Type 01 Internal R. It is possible this change is going to be added into the
>> V4H RM in the future too. That would likely imply, that Type01 MPSS is not
>> programmable.
>>
>> I don't think Type1 affects RC operation, but does it affect NTB ?
> 
> I have no evidence that Type1 affects NTB either. It was just a speculative idea
> based on the difference I saw in the manuals.
> 
> Your inference, i.e. that the S4 RM Rev.1.30 may reflect the intended access
> attributes and the V4H RM may later get a similar correction, sounds reasonable
> to me.
> 
> I had not checked the S4 RM Rev.1.20, so I missed that change. Thanks for
> pointing it out.

I have now checked 4 S4, 10 V4H, 6 V4M reference manual versions and 
there are subtle changes. I asked for clarification. If I learn 
anything, I will let you know.

I did not make the connection between your aforementioned observation, 
MPSS, and the TYPE00 and TYPE01 accesses until today; now I realize there 
might be different TLP size limits for RC and EP modes.

>> [...]
>>
>> Thank you for your help!
> 
> Thank you for investigating this and for the very helpful analysis.
> I will let you know if I find anything more.

Likewise, thank you for your help!

-- 
Best regards,
Marek Vasut

