Linux PCI Non-Transparent Bridge framework and drivers
 help / color / mirror / Atom feed
From: Dave Jiang <dave.jiang@intel.com>
To: Koichiro Den <den@valinux.co.jp>
Cc: "Jon Mason" <jdmason@kudzu.us>, "Allen Hubbe" <allenbh@gmail.com>,
	"Manivannan Sadhasivam" <mani@kernel.org>,
	"Krzysztof Wilczyński" <kwilczynski@kernel.org>,
	"Kishon Vijay Abraham I" <kishon@kernel.org>,
	"Bjorn Helgaas" <bhelgaas@google.com>,
	"Frank Li" <Frank.Li@nxp.com>,
	"Jerome Brunet" <jbrunet@baylibre.com>,
	"Lorenzo Pieralisi" <lpieralisi@kernel.org>,
	"Niklas Cassel" <cassel@kernel.org>,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	ntb@lists.linux.dev
Subject: Re: [PATCH v4 00/12] PCI: endpoint: pci-epf-vntb / NTB: epf: Enable per-doorbell bit handling
Date: Tue, 26 May 2026 09:27:28 -0700	[thread overview]
Message-ID: <63c7ea8d-8a23-4570-a476-76112ddf5db2@intel.com> (raw)
In-Reply-To: <hl4pg76tbwashcacn2c7gbhvvqdfz4jr4z5vqldvt2iy53uc4n@6s3nspwjtuto>



On 5/26/26 8:56 AM, Koichiro Den wrote:
> On Tue, May 26, 2026 at 07:45:11AM -0700, Dave Jiang wrote:
>>
>>
>> On 5/20/26 11:29 PM, Koichiro Den wrote:
>>> On Tue, May 19, 2026 at 02:29:51PM -0700, Dave Jiang wrote:
>>>>
>>>>
>>>> On 5/12/26 7:49 PM, Koichiro Den wrote:
>>>>> This series fixes doorbell bit/vector handling for the EPF-based NTB
>>>>> pair (ntb_hw_epf <-> pci-epf-*ntb). Its primary goal is to enable safe
>>>>> per-db-vector handling in the NTB core and clients (e.g. ntb_transport),
>>>>> without changing the on-the-wire doorbell mapping.
>>>>>
>>>>>
>>>>> Background / problem
>>>>> ====================
>>>>>
>>>>> ntb_hw_epf historically applies an extra offset when ringing peer
>>>>> doorbells: the link event uses the first interrupt slot, and doorbells
>>>>> start from the third slot (i.e. a second slot is effectively unused).
>>>>> pci-epf-vntb carries the matching offset on the EP side as well.
>>>>>
>>>>> As long as db_vector_count()/db_vector_mask() are not implemented, this
>>>>> mismatch is mostly masked. Doorbell events are effectively treated as
>>>>> "can hit any QP" and the off-by-one vector numbering does not surface
>>>>> clearly.
>>>>>
>>>>> However, once per-vector handling is enabled, the current state becomes
>>>>> problematic:
>>>>>
>>>>>   - db_valid_mask exposes bits that do not correspond to real doorbells
>>>>>     (link/unused slots leak into the mask).
>>>>>   - ntb_db_event() is fed with 1-based/shifted vectors, while NTB core
>>>>>     expects a 0-based db_vector for doorbells.
>>>>>   - On pci-epf-vntb, .peer_db_set() may be called in atomic context, but
>>>>>     it directly calls pci_epc_raise_irq(), which can sleep.
>>>>>
>>>>>
>>>>> Why NOT fix the root offset?
>>>>> ============================
>>>>>
>>>>> The natural "root" fix would be to remove the historical extra offset in
>>>>> the peer_db_set() doorbell paths for ntb_hw_epf and pci-epf-vntb.
>>>>> Unfortunately this would lead to interoperability issues when mixing old
>>>>> and new kernel versions (old/new peers). A new side would ring a
>>>>> different interrupt slot than what an old peer expects, leading to
>>>>> missed or misrouted doorbells, once db_vector_count()/db_vector_mask()
>>>>> are implemented.
>>>>>
>>>>> Therefore this series intentionally keeps the legacy offset, and instead
>>>>> fixes the surrounding pieces so the mapping is documented and handled
>>>>> consistently in masks, vector numbering, and per-vector reporting.
>>>>>
>>>>>
>>>>> Dependencies
>>>>> ============
>>>>>
>>>>> This v4 is prepared on top of the current pci/endpoint:
>>>>> commit f5aa97125491 ("misc: pci_endpoint_test: remove dead BAR read before
>>>>>                       doorbell trigger")
>>>>
>>>> I can't seem to apply the patch set via 'b4 shazam' on top of f5aa97125491. Trying to set it up for review.
>>>
>>> Hi Dave,
>>>
>>>>
>>>> DJ
>>>>
>>>>>
>>>>> plus these patches as contextual prerequisites:
>>>>>
>>>>>   - [PATCH 0/2] NTB: epf: Fix ntb_hw_epf ISR issues
>>>>>     https://lore.kernel.org/ntb/20260304083028.1391068-1-den@valinux.co.jp/
>>>>>
>>>>>   - [PATCH 0/2] PCI: endpoint: pci-epf-{v}ntb: A couple of fixes
>>>>>     https://lore.kernel.org/linux-pci/20260407124421.282766-1-mani@kernel.org/
>>>>>
>>>>> (These touch the same code part as this series and would otherwise conflict
>>>>> with it. They should land upstream first in my opinion.)
>>>
>>> This series depends on the two prerequisite series listed above.
>>
>> They don't apply cleanly either. Do you have a public repo somewhere that has everything applied? That may be the easiest way to do things if a series has complicated dependencies.
> 
> I rechecked this with a fresh worktree based off of f5aa97125491:
> 
>   git worktree add /tmp/f5aa97125491 f5aa97125491
>   cd /tmp/f5aa97125491
> 
>   # 1st dependency
>   b4 am 20260304083028.1391068-1-den@valinux.co.jp
>   git am ./20260304_den_ntb_epf_fix_ntb_hw_epf_isr_issues.mbx
> 
>   # 2nd dependency
>   b4 am 20260407124421.282766-1-mani@kernel.org
>   git am ./20260407_manivannan_sadhasivam_pci_endpoint_pci_epf_v_ntb_a_couple_of_fixes.mbx
> 
>   # this v4 series
>   b4 am 20260513024923.451765-1-den@valinux.co.jp
>   git am ./v4_20260513_den_pci_endpoint_pci_epf_vntb_ntb_epf_enable_per_doorbell_bit_handling.mbx
> 
> And this gives me the following commit log:
> 
>   git log --reverse --oneline f5aa97125491..
>   # 1d1465542cc0 NTB: epf: Fix request_irq() unwind in ntb_epf_init_isr()
>   # 94bd43a9e435 NTB: epf: Avoid pci_irq_vector() from hardirq context
>   # 3cab4db0f647 PCI: endpoint: pci-epf-vntb: Add check to detect 'db_count' value of 0
>   # 5449f10512ae PCI: endpoint: pci-epf-ntb: Add check to detect 'db_count' value of 0
>   # e3dbdcceb2fd PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
>   # e1c167a58888 PCI: endpoint: pci-epf-vntb: Defer pci_epc_raise_irq() out of atomic context
>   # 6c54fd0c9c9c PCI: endpoint: pci-epf-vntb: Report 0-based doorbell vector via ntb_db_event()
>   # 28a30b79ff03 PCI: endpoint: pci-epf-vntb: Reject unusable doorbell counts
>   # cb5fbd37bdc6 PCI: endpoint: pci-epf-vntb: Guard configfs writes after EPC attach
>   # 9ed0f4db85b4 PCI: endpoint: pci-epf-vntb: Exclude reserved slots from db_valid_mask
>   # 575bd3aa530b PCI: endpoint: pci-epf-vntb: Implement db_vector_count/mask for doorbells
>   # 41f26aa279af NTB: epf: Document legacy doorbell slot offset in ntb_epf_peer_db_set()
>   # 6b2e2f80a61a NTB: epf: Make db_valid_mask cover only real doorbell bits
>   # 44d1bb070521 NTB: epf: Report 0-based doorbell vector via ntb_db_event()
>   # 4e4d37b4c2a8 NTB: epf: Fix doorbell bitmask and IRQ vector handling
>   # 1e5829682361 (HEAD) NTB: epf: Implement db_vector_count/mask for doorbells
> 
> I don't have a public repo at the moment, sorry. Could you try the above? If it
> still fails, please let me know.

I'm able to apply with the above steps. Thanks!

> 
> Best regards,
> Koichiro
> 
>>
>> DJ
>>
>>>
>>> Best regards,
>>> Koichiro
>>>
>>>>>
>>>>>
>>>>> Practical benefit
>>>>> =================
>>>>>
>>>>> This also enables ntb_netdev's multi-queue performance scale:
>>>>>
>>>>>   [PATCH v3 0/4] net: ntb_netdev: Add Multi-queue support
>>>>>   https://lore.kernel.org/netdev/20260305155639.1885517-1-den@valinux.co.jp/
>>>>>
>>>>> Each netdev queue maps to an ntb_transport queue pair. With correct
>>>>> db_vector_count() and db_vector_mask() callbacks, ntb_transport can wake
>>>>> only the queue pairs covered by the doorbell vector it received. Without
>>>>> those callbacks, the NTB core falls back to the full valid doorbell mask,
>>>>> so each doorbell interrupt can still wake all queue pairs. See "Testing /
>>>>> Results" section in the above link.
>>>>>
>>>>>
>>>>> What this series does
>>>>> =====================
>>>>>
>>>>> - pci-epf-vntb:
>>>>>
>>>>>   - Document the legacy offset.
>>>>>   - Defer MSI doorbell raises to process context to avoid sleeping in
>>>>>     atomic context. This becomes relevant once multiple doorbells are
>>>>>     raised concurrently at a high rate.
>>>>>   - Report doorbell vectors as 0-based to ntb_db_event().
>>>>>   - Reject unusable doorbell counts and guard configfs writes after EPC
>>>>>     attach with -EOPNOTSUPP.
>>>>>   - Fix db_valid_mask and implement db_vector_count()/db_vector_mask().
>>>>>
>>>>> - ntb_hw_epf:
>>>>>
>>>>>   - Document the legacy offset in ntb_epf_peer_db_set().
>>>>>   - Fix db_valid_mask to cover only real doorbell bits.
>>>>>   - Report 0-based db_vector to ntb_db_event() (accounting for the
>>>>>     unused slot).
>>>>>   - Pass per-vector context as request_irq() dev_id instead of deriving the
>>>>>     device vector from Linux IRQ numbers.
>>>>>   - Keep db_val as a bitmask and fix db_read/db_clear semantics
>>>>>     accordingly.
>>>>>   - Implement db_vector_count()/db_vector_mask().
>>>>>
>>>>>
>>>>> Compatibility
>>>>> =============
>>>>>
>>>>> By keeping the legacy offset intact, this series aims to remain
>>>>> compatible across mixed kernel versions. The observable changes are
>>>>> limited to correct mask/vector reporting and safer execution context
>>>>> handling.
>>>>>
>>>>> Patches 1-7 (PCI Endpoint) and 8-12 (NTB) are independent and can be
>>>>> applied separately through the respective trees. They are sent together so
>>>>> the two sides of the same legacy layout do not drift apart, and the
>>>>> resulting code stays easier to follow.
>>>>>
>>>>> The plan is still to take the whole series through the PCI EP tree once the
>>>>> remaining NTB acks are collected. See:
>>>>> https://lore.kernel.org/linux-pci/rnzsnp5de4qf5w7smebkmqekpuaqckltx73rj6ha3q2nrby5yp@7hsgvdzvjkp6/
>>>>>
>>>>>
>>>>> Note
>>>>> ====
>>>>>
>>>>> Most v3->v4 changes are responses to Sashiko feedback:
>>>>> https://sashiko.dev/#/patchset/20260323031544.2598111-1-den%40valinux.co.jp
>>>>>
>>>>> Sashiko also flagged a broader driver-unbind lifetime concern around the
>>>>> vNTB PCI driver. That is real, but it predates this series and needs a
>>>>> proper .remove() path, including NTB unregistration and work cancellation.
>>>>> I am keeping it as follow-up lifecycle work:
>>>>>
>>>>> https://lore.kernel.org/linux-pci/20260226084142.2226875-1-den@valinux.co.jp/
>>>>>
>>>>> This v4 only clears the deferred doorbell state added by this series around
>>>>> EPC init/cleanup.
>>>>>
>>>>>
>>>>> ---
>>>>> Changelog
>>>>> =========
>>>>>
>>>>> Changes since v3:
>>>>>   - Rebased on the dependencies listed above.
>>>>>   - For Sashiko feedback on state added by this series: clear
>>>>>     peer_db_pending around work enable/disable, mask peer_db_set() with
>>>>>     db_valid_mask, and use __ffs64().
>>>>>   - For pre-existing issues made more relevant by per-vector semantics:
>>>>>     stop deriving ntb_hw_epf vectors from Linux IRQ numbers, pass
>>>>>     per-vector request_irq() context, ignore the reserved slot, and
>>>>>     validate vectors before BIT_ULL()/ntb_db_event().
>>>>>   - Added adjacent hardening: pci-epf-vntb db_count range/configfs guards,
>>>>>     bounded db_vector masks, ntb_hw_epf mw_count precheck, and helper guard
>>>>>     cleanups.
>>>>>   - Dropped Reviewed-by tags from patches that changed to non-trivial extent.
>>>>>
>>>>> Changes since v2:
>>>>>   - No functional changes.
>>>>>   - Rebased onto current pci/endpoint
>>>>>     e022f0c72c7f ("selftests: pci_endpoint: Skip reserved BARs").
>>>>>     * Patch 2 needed a trivial context-only adjustment while rebasing, due
>>>>>       to commit d799984233a5 ("PCI: endpoint: pci-epf-vntb: Stop
>>>>>       cmd_handler work in epf_ntb_epc_cleanup").
>>>>>   - Picked up additional Reviewed-by tags from Frank.
>>>>>   - Fixed the incorrect v2 series title.
>>>>>
>>>>> Changes since v1:
>>>>>   - Addressed feedback from Dave (add a source code comment, introduce
>>>>>     enum to eliminate magic numbers)
>>>>>   - Updated source code comment in Patch 2.
>>>>>   - No functional changes, so retained Reviewed-by tags by Frank and Dave.
>>>>>     Thank you both for the review.
>>>>>
>>>>> v3: https://lore.kernel.org/linux-pci/20260323031544.2598111-1-den@valinux.co.jp/
>>>>> v2: https://lore.kernel.org/linux-pci/20260227084955.3184017-1-den@valinux.co.jp/
>>>>> v1: https://lore.kernel.org/linux-pci/20260224133459.1741537-1-den@valinux.co.jp/
>>>>>
>>>>>
>>>>> Best regards,
>>>>> Koichiro
>>>>>
>>>>>
>>>>> Koichiro Den (12):
>>>>>   PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset
>>>>>   PCI: endpoint: pci-epf-vntb: Defer pci_epc_raise_irq() out of atomic
>>>>>     context
>>>>>   PCI: endpoint: pci-epf-vntb: Report 0-based doorbell vector via
>>>>>     ntb_db_event()
>>>>>   PCI: endpoint: pci-epf-vntb: Reject unusable doorbell counts
>>>>>   PCI: endpoint: pci-epf-vntb: Guard configfs writes after EPC attach
>>>>>   PCI: endpoint: pci-epf-vntb: Exclude reserved slots from db_valid_mask
>>>>>   PCI: endpoint: pci-epf-vntb: Implement db_vector_count/mask for
>>>>>     doorbells
>>>>>   NTB: epf: Document legacy doorbell slot offset in
>>>>>     ntb_epf_peer_db_set()
>>>>>   NTB: epf: Make db_valid_mask cover only real doorbell bits
>>>>>   NTB: epf: Report 0-based doorbell vector via ntb_db_event()
>>>>>   NTB: epf: Fix doorbell bitmask and IRQ vector handling
>>>>>   NTB: epf: Implement db_vector_count/mask for doorbells
>>>>>
>>>>>  drivers/ntb/hw/epf/ntb_hw_epf.c               | 133 ++++++++---
>>>>>  drivers/pci/endpoint/functions/pci-epf-vntb.c | 210 ++++++++++++++++--
>>>>>  2 files changed, 296 insertions(+), 47 deletions(-)
>>>>>
>>>>
>>>
>>


  reply	other threads:[~2026-05-26 16:27 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-13  2:49 [PATCH v4 00/12] PCI: endpoint: pci-epf-vntb / NTB: epf: Enable per-doorbell bit handling Koichiro Den
2026-05-13  2:49 ` [PATCH v4 01/12] PCI: endpoint: pci-epf-vntb: Document legacy MSI doorbell offset Koichiro Den
2026-05-13  2:49 ` [PATCH v4 02/12] PCI: endpoint: pci-epf-vntb: Defer pci_epc_raise_irq() out of atomic context Koichiro Den
2026-05-14 18:53   ` Frank Li
2026-05-13  2:49 ` [PATCH v4 03/12] PCI: endpoint: pci-epf-vntb: Report 0-based doorbell vector via ntb_db_event() Koichiro Den
2026-05-13  2:49 ` [PATCH v4 04/12] PCI: endpoint: pci-epf-vntb: Reject unusable doorbell counts Koichiro Den
2026-05-14 18:55   ` Frank Li
2026-05-13  2:49 ` [PATCH v4 05/12] PCI: endpoint: pci-epf-vntb: Guard configfs writes after EPC attach Koichiro Den
2026-05-14 18:57   ` Frank Li
2026-05-13  2:49 ` [PATCH v4 06/12] PCI: endpoint: pci-epf-vntb: Exclude reserved slots from db_valid_mask Koichiro Den
2026-05-13  2:49 ` [PATCH v4 07/12] PCI: endpoint: pci-epf-vntb: Implement db_vector_count/mask for doorbells Koichiro Den
2026-05-14 19:02   ` Frank Li
2026-05-13  2:49 ` [PATCH v4 08/12] NTB: epf: Document legacy doorbell slot offset in ntb_epf_peer_db_set() Koichiro Den
2026-05-13  2:49 ` [PATCH v4 09/12] NTB: epf: Make db_valid_mask cover only real doorbell bits Koichiro Den
2026-05-26 22:21   ` Dave Jiang
2026-05-13  2:49 ` [PATCH v4 10/12] NTB: epf: Report 0-based doorbell vector via ntb_db_event() Koichiro Den
2026-05-14 19:03   ` Frank Li
2026-05-26 22:28   ` Dave Jiang
2026-05-13  2:49 ` [PATCH v4 11/12] NTB: epf: Fix doorbell bitmask and IRQ vector handling Koichiro Den
2026-05-14 19:06   ` Frank Li
2026-05-26 22:36   ` Dave Jiang
2026-05-13  2:49 ` [PATCH v4 12/12] NTB: epf: Implement db_vector_count/mask for doorbells Koichiro Den
2026-05-26 22:47   ` Dave Jiang
2026-05-19 14:42 ` [PATCH v4 00/12] PCI: endpoint: pci-epf-vntb / NTB: epf: Enable per-doorbell bit handling Manivannan Sadhasivam
2026-05-19 21:29 ` Dave Jiang
2026-05-21  6:29   ` Koichiro Den
2026-05-26 14:45     ` Dave Jiang
2026-05-26 15:56       ` Koichiro Den
2026-05-26 16:27         ` Dave Jiang [this message]
2026-05-27  9:09 ` Manivannan Sadhasivam

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=63c7ea8d-8a23-4570-a476-76112ddf5db2@intel.com \
    --to=dave.jiang@intel.com \
    --cc=Frank.Li@nxp.com \
    --cc=allenbh@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=cassel@kernel.org \
    --cc=den@valinux.co.jp \
    --cc=jbrunet@baylibre.com \
    --cc=jdmason@kudzu.us \
    --cc=kishon@kernel.org \
    --cc=kwilczynski@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=mani@kernel.org \
    --cc=ntb@lists.linux.dev \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox