From: Krishna Chaitanya Chundru <krishna.chundru@oss.qualcomm.com>
To: Niklas Cassel <cassel@kernel.org>, Koichiro Den <den@valinux.co.jp>
Cc: ntb@lists.linux.dev, linux-pci@vger.kernel.org,
dmaengine@vger.kernel.org, linux-kernel@vger.kernel.org,
Frank.Li@nxp.com, mani@kernel.org, kwilczynski@kernel.org,
kishon@kernel.org, bhelgaas@google.com, corbet@lwn.net,
vkoul@kernel.org, jdmason@kudzu.us, dave.jiang@intel.com,
allenbh@gmail.com, Basavaraj.Natikar@amd.com,
Shyam-sundar.S-k@amd.com, kurt.schwemmer@microsemi.com,
logang@deltatee.com, jingoohan1@gmail.com, lpieralisi@kernel.org,
robh@kernel.org, jbrunet@baylibre.com, fancer.lancer@gmail.com,
arnd@arndb.de, pstanner@redhat.com,
elfring@users.sourceforge.net
Subject: Re: [RFC PATCH v2 19/27] PCI: dwc: ep: Cache MSI outbound iATU mapping
Date: Mon, 22 Dec 2025 10:40:12 +0530
Message-ID: <4909f70a-2f65-4cac-96ac-5cd4371bc867@oss.qualcomm.com>
In-Reply-To: <aTaE3yB7tQ-Homju@ryzen>
On 12/8/2025 1:27 PM, Niklas Cassel wrote:
> On Sun, Nov 30, 2025 at 01:03:57AM +0900, Koichiro Den wrote:
>> dw_pcie_ep_raise_msi_irq() currently programs an outbound iATU window
>> for the MSI target address on every interrupt and tears it down again
>> via dw_pcie_ep_unmap_addr().
>>
>> On systems that heavily use the AXI bridge interface (for example when
>> the integrated eDMA engine is active), this means the outbound iATU
>> registers are updated while traffic is in flight. The DesignWare
>> endpoint spec warns that updating iATU registers in this situation is
>> not supported, and the behavior is undefined.
>>
>> Under high MSI and eDMA load this pattern results in occasional bogus
>> outbound transactions and IOMMU faults such as:
>>
>> ipmmu-vmsa eed40000.iommu: Unhandled fault: status 0x00001502 iova 0xfe000000
>>
>> followed by the system becoming unresponsive. This is the actual output
>> observed on Renesas R-Car S4, with its ipmmu_hc used with PCIe ch0.
>>
>> There is no need to reprogram the iATU region used for MSI on every
>> interrupt. The host-provided MSI address is stable while MSI is enabled,
>> and the endpoint driver already dedicates a scratch buffer for MSI
>> generation.
>>
>> Cache the aligned MSI address and map size, program the outbound iATU
>> once, and keep the window enabled. Subsequent interrupts only perform a
>> write to the MSI scratch buffer, avoiding dynamic iATU reprogramming in
>> the hot path and fixing the lockups seen under load.
>>
>> Signed-off-by: Koichiro Den <den@valinux.co.jp>
>> ---
>> .../pci/controller/dwc/pcie-designware-ep.c | 48 ++++++++++++++++---
>> drivers/pci/controller/dwc/pcie-designware.h | 5 ++
>> 2 files changed, 47 insertions(+), 6 deletions(-)
>>
> I don't like that this patch modifies dw_pcie_ep_raise_msi_irq() but does
> not modify dw_pcie_ep_raise_msix_irq()
>
> both functions call dw_pcie_ep_map_addr() before doing the writel(),
> so I think they should be treated the same.
>
>
> I do however understand that it is a bit wasteful to dedicate one
> outbound iATU for MSI and one outbound iATU for MSI-X, as the PCI
> spec does not allow both of them to be enabled at the same time,
> see:
>
> 6.1.4 MSI and MSI-X Operation § in PCIe 6.0 spec:
> "A Function is permitted to implement both MSI and MSI-X,
> but system software is prohibited from enabling both at the
> same time. If system software enables both at the same time,
> the behavior is undefined."
>
>
> I guess the problem is that some EPF drivers, even if only
> one capability can be enabled (MSI/MSI-X), call both
> pci_epc_set_msi() and pci_epc_set_msix(), e.g.:
> https://github.com/torvalds/linux/blob/v6.18/drivers/pci/endpoint/functions/pci-epf-test.c#L969-L987
>
> To fill in the number of MSI/MSI-X irqs.
>
> While other EPF drivers only call either pci_epc_set_msi() or
> pci_epc_set_msix(), depending on the IRQ type that will actually
> be used:
> https://github.com/torvalds/linux/blob/v6.18/drivers/nvme/target/pci-epf.c#L2247-L2262
>
> I think both versions are okay: even though the number of IRQs
> is filled in for both MSI/MSI-X, AFAICT, only one of them will
> get enabled.
>
>
> I guess it might be hard for an EPC driver to know which capability
> that is currently enabled, as to enable a capability is only a config
> space write by the host side.
Since the host is the one that enables MSI/MSI-X, it would be better for
the controller driver to make this decision, with the EPF driver only
calling raise_irq. Technically, the host can also disable MSI and enable
MSI-X at runtime. The controller driver can check which one is enabled
and choose between MSI-X, MSI and legacy interrupts.
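
To illustrate, here is a rough userspace sketch (not actual DWC code) of
the controller side choosing the interrupt type from the enable bits the
host has written. The register offsets and bit masks are the ones from
pci_regs.h. struct mock_ep, cfg_read16(), cfg_write16() and
ep_pick_irq_type() are made-up names for this example, and the config
space is just a byte array.

```c
#include <stdint.h>

/* Values as defined in include/uapi/linux/pci_regs.h */
#define PCI_COMMAND               0x04
#define PCI_COMMAND_INTX_DISABLE  0x400
#define PCI_MSI_FLAGS             0x02   /* offset inside the MSI capability */
#define PCI_MSI_FLAGS_ENABLE      0x0001
#define PCI_MSIX_FLAGS            0x02   /* offset inside the MSI-X capability */
#define PCI_MSIX_FLAGS_ENABLE     0x8000

/* Everything below is hypothetical and only for illustration. */
enum ep_irq_type { EP_IRQ_NONE, EP_IRQ_INTX, EP_IRQ_MSI, EP_IRQ_MSIX };

struct mock_ep {
	uint8_t cfg[256];   /* stand-in for the function's config space */
	uint8_t msi_cap;    /* offset of the MSI capability, 0 if absent */
	uint8_t msix_cap;   /* offset of the MSI-X capability, 0 if absent */
};

/* Little-endian 16-bit config space accessors for the mock. */
static uint16_t cfg_read16(const struct mock_ep *ep, unsigned int off)
{
	return (uint16_t)(ep->cfg[off] | (ep->cfg[off + 1] << 8));
}

static void cfg_write16(struct mock_ep *ep, unsigned int off, uint16_t val)
{
	ep->cfg[off] = (uint8_t)(val & 0xff);
	ep->cfg[off + 1] = (uint8_t)(val >> 8);
}

/*
 * Decide at raise time which interrupt type the host has enabled.
 * MSI-X is checked first, then MSI, then INTx (legacy).
 */
static enum ep_irq_type ep_pick_irq_type(const struct mock_ep *ep)
{
	if (ep->msix_cap &&
	    (cfg_read16(ep, ep->msix_cap + PCI_MSIX_FLAGS) & PCI_MSIX_FLAGS_ENABLE))
		return EP_IRQ_MSIX;

	if (ep->msi_cap &&
	    (cfg_read16(ep, ep->msi_cap + PCI_MSI_FLAGS) & PCI_MSI_FLAGS_ENABLE))
		return EP_IRQ_MSI;

	if (!(cfg_read16(ep, PCI_COMMAND) & PCI_COMMAND_INTX_DISABLE))
		return EP_IRQ_INTX;

	return EP_IRQ_NONE;
}
```

Because the check happens on every raise, a host that disables MSI and
enables MSI-X at runtime is picked up without the EPF driver having to
know about it.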
- Krishna Chaitanya.
> I guess in most real hardware, e.g. a NIC device, you do an
> "enable engine"/"stop engine" type of write to a BAR.
>
> Perhaps we should have similar callbacks in struct pci_epc_ops ?
>
> My thinking is that after "start engine", an EPC driver could read
> the MSI and MSI-X capabilities, to see which is enabled.
> As it should not be allowed to change between MSI and MSI-X without
> doing a "stop engine" first.
>
>
> Kind regards,
> Niklas
>
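
For comparison, the "start engine"/"stop engine" idea quoted above could
look roughly like the sketch below. It is a userspace mock, not a
proposal for the real struct pci_epc_ops: struct mock_engine,
engine_start(), engine_stop() and engine_irq_mode() are hypothetical
names, and the two bool fields stand in for the MSI and MSI-X enable
bits the host would write in config space.

```c
#include <stdbool.h>

/* All names here are hypothetical and only for illustration. */
enum ep_irq_mode { EP_MODE_NONE, EP_MODE_MSI, EP_MODE_MSIX };

struct mock_engine {
	bool msi_enabled;          /* stand-in for the MSI Enable bit */
	bool msix_enabled;         /* stand-in for the MSI-X Enable bit */
	bool running;              /* set between start and stop */
	enum ep_irq_mode latched;  /* mode sampled at start time */
};

/*
 * Sample the capabilities once when the engine is started and latch
 * the result. Returns 0 on success, -1 if the engine is already
 * running or if both capabilities are enabled (undefined per spec).
 */
static int engine_start(struct mock_engine *e)
{
	if (e->running)
		return -1;
	if (e->msi_enabled && e->msix_enabled)
		return -1;

	if (e->msix_enabled)
		e->latched = EP_MODE_MSIX;
	else if (e->msi_enabled)
		e->latched = EP_MODE_MSI;
	else
		e->latched = EP_MODE_NONE;

	e->running = true;
	return 0;
}

/* Drop the latched mode; a new start is needed to pick up changes. */
static void engine_stop(struct mock_engine *e)
{
	e->running = false;
	e->latched = EP_MODE_NONE;
}

/* Mode used for raising interrupts; ignores changes made while running. */
static enum ep_irq_mode engine_irq_mode(const struct mock_engine *e)
{
	return e->running ? e->latched : EP_MODE_NONE;
}
```

With this approach a host-side switch from MSI to MSI-X while the engine
is running is not noticed until the next stop/start cycle, which is the
trade-off against checking the enable bits on every raise.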