public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Niklas Cassel <cassel@kernel.org>
To: Max Boone <contact@maxboone.com>
Cc: den@valinux.co.jp, mani@kernel.org, frank.li@nxp.com,
	allenbh@gmail.com, bhanuseshukumar@gmail.com,
	bhelgaas@google.com, dave.jiang@intel.com, jdmason@kudzu.us,
	jingoohan1@gmail.com, kishon@kernel.org, kwilczynski@kernel.org,
	linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org,
	lpieralisi@kernel.org, marco.crivellari@suse.com,
	mmaddireddy@nvidia.com, ntb@lists.linux.dev, robh@kernel.org,
	shinichiro.kawasaki@wdc.com, mboone@akamai.com
Subject: Re: [PATCH v14 4/7] PCI: endpoint: pci-ep-msi: Refactor doorbell allocation for new backends
Date: Wed, 29 Apr 2026 11:33:13 +0200	[thread overview]
Message-ID: <afHQWdn2rmG4mAA4@ryzen> (raw)
In-Reply-To: <2F694A6C-23BA-4025-ACD7-2751595982CB@maxboone.com>

On Tue, Apr 28, 2026 at 10:36:15PM +0200, Max Boone wrote:
> 
> I’m not very fond of keeping this implementation in the pci-ep-msi file,
> as the platform MSI and this implementation are both iiuc specific to
> the designware ep driver. Even more so because the MSI implementation
> is enabled by config rather than through device tree.

Why do you think that the current code with DOMAIN_BUS_PLATFORM_MSI is
designware EPC specific?

I don't see anything that is designware EPC specific.

Sure, it relies on GIC ITS, but I don't see why non-designware EPCs can't
use GIC ITS.


> 
> Wouldn’t we want end-users to specify what kind of doorbell they want,
> as it seems to be that a more specific doorbell BAR layout can be 
> programmed with eDMA, allowing native support for nvmet’s doorbell
> BAR for example.

I also wanted to use my designware based EPC for doorbells in nvmet-pci-epf,
specifically the support that Koichiro added for inbound subrange mapping.

However, most designware EPC have a strict alignment requirement
(CX_ATU_MIN_REGION_SIZE), which is often 4k.

This alignment requirement is there both on the PCI address (address within
the BAR, and for the physical memory address (target address)).

I thought that we could use the inbound subrange mapping and put the doorbells
in a separate inbound iATU, so we could remove polling in the nvmet-pci-epf
driver, just like they have done in the vNTB driver. However, it works in vNTB
because they have a register telling exactly which BAR and offset in that BAR
where the doorbells are.
In the NVMe PCIe Transport specification, the offset for the start of the
doorbells is fixed, at offset 0x1000 (4k) and the only thing you can change is
the stride between the doorbells.

Currently, a doorbell is a single 32-bit data, sure we could call
pci_epf_alloc_doorbell() with ("number of I/O queues" + 1 (admin queue)) * 2
(submission queue and completion queue).

However, the address which we get from pci_epf_alloc_doorbell() might not be
4k aligned.

We have the function pci_epf_align_inbound_addr() which can split this
non-aligned address to a 4k aligned base + offset from that base.

However, that would also require the host side driver to write to this offset
from the start address. (See e.g. doorbell_offset in pci-epf-test.c).

So, basically, with the current limitation that the doorbells must start at
0x1000, together with the fact that the doorbells returned from
pci_epf_alloc_doorbell() might have an arbitrary alignment, I don't see how
we could add support for doorbells in nvmet-pci-epf.

If we could supply an alignment requirement to pci_epf_alloc_doorbell(),
e.g. 4k, and the API is guaranteed to return an address that satisfies this
alignment requirement, then we would be good.

However, right now, we don't have such an API. We simple get an address
somewhere within the GIC ITS MMIO region.


> 
> Originally in a patchset by Frank Li the API that was proposed was more
> generic, and the pci-epc-msi implementation was chosen because there
> was only one implementation:
> - https://lore.kernel.org/imx/20231019150441.GA7254@thinkpad/
> - https://lore.kernel.org/imx/20231019172347.GC7254@thinkpad/
> 
> I’d personally prefer to see an abstraction that is weaved into pci-epc-core
> and pci-epf-core that can be implemented by drivers as they wish. While 
> still keeping the enum for different types.
> 
> That also gives room to pull a poll-mode doorbell into the pci-epc-core,
> which deduplicates that code from the nvmet and vntb epfs, and allows
> other functions to use RC->EP doorbells without needing to bother with
> writing the polling mechanism.

Sounds like a good idea.


> 
> P.S. I’ve been working on a vfio-user based epc for development purposes
> personally, and the last hurdle before I want to send it in for comments is 
> support for doorbells, and came across this patchset checking if there’s
> any other activity in the space. Having an implementation-agnostic
> doorbell API in the EPF/EPC core would be very helpful to me.

I have looked at adding doorbell support to nvmet-pci-epf, but got stuck on
pci_epf_alloc_doorbell() returning an address that is not 4k aligned.

(Since the NVMe PCIe transport specification has the doorbells at a fixed
location, we can't change that.)

But if we could provide an "alignment" parameter to pci_epf_alloc_doorbell(),
then I think it is possible.

Sure, the GIC ITS MMIO area might be quite small, so it might not be able
satisfy such a request. E.g. on rk3588, the its1 MMIO region is 0x20000 (128k):
https://github.com/torvalds/linux/blob/v7.1-rc1/arch/arm64/boot/dts/rockchip/rk3588-base.dtsi#L2414

However, I have not idea of how much of this region the GIC driver uses for
actual registers, and how much of that region it can actually dedicate to
doorbells.


Kind regards,
Niklas

  reply	other threads:[~2026-04-29  9:33 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-28 20:36 [PATCH v14 4/7] PCI: endpoint: pci-ep-msi: Refactor doorbell allocation for new backends Max Boone
2026-04-29  9:33 ` Niklas Cassel [this message]
2026-04-29 11:11   ` Max Boone
2026-04-29 14:52     ` Niklas Cassel
  -- strict thread matches above, loose matches on Subject: below --
2026-04-14 14:15 [PATCH v14 0/7] PCI: endpoint: pci-ep-msi: Add embedded doorbell fallback Koichiro Den
2026-04-14 14:15 ` [PATCH v14 4/7] PCI: endpoint: pci-ep-msi: Refactor doorbell allocation for new backends Koichiro Den
2026-04-29  8:58   ` Max Boone

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=afHQWdn2rmG4mAA4@ryzen \
    --to=cassel@kernel.org \
    --cc=allenbh@gmail.com \
    --cc=bhanuseshukumar@gmail.com \
    --cc=bhelgaas@google.com \
    --cc=contact@maxboone.com \
    --cc=dave.jiang@intel.com \
    --cc=den@valinux.co.jp \
    --cc=frank.li@nxp.com \
    --cc=jdmason@kudzu.us \
    --cc=jingoohan1@gmail.com \
    --cc=kishon@kernel.org \
    --cc=kwilczynski@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=lpieralisi@kernel.org \
    --cc=mani@kernel.org \
    --cc=marco.crivellari@suse.com \
    --cc=mboone@akamai.com \
    --cc=mmaddireddy@nvidia.com \
    --cc=ntb@lists.linux.dev \
    --cc=robh@kernel.org \
    --cc=shinichiro.kawasaki@wdc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox