From: Alex Williamson <alex@shazbot.org>
To: fengchengwen <fengchengwen@huawei.com>
Cc: <jgg@ziepe.ca>, <wathsala.vithanage@arm.com>,
<helgaas@kernel.org>, <wei.huang2@amd.com>, <zhipingz@meta.com>,
<wangzhou1@hisilicon.com>, <wangyushan12@huawei.com>,
<liuyonglong@huawei.com>, <kvm@vger.kernel.org>,
<linux-pci@vger.kernel.org>,
alex@shazbot.org
Subject: Re: [PATCH v14 0/8] vfio/pci: Add PCIe TPH support
Date: Wed, 3 Jun 2026 11:58:28 -0600 [thread overview]
Message-ID: <20260603115828.21216c0f@shazbot.org> (raw)
In-Reply-To: <3f8d1722-6749-4f3a-b630-c96303cec284@huawei.com>
On Wed, 3 Jun 2026 08:34:53 +0800
fengchengwen <fengchengwen@huawei.com> wrote:
> On 6/3/2026 7:08 AM, Alex Williamson wrote:
> > On Tue, 2 Jun 2026 22:46:13 +0800
> > fengchengwen <fengchengwen@huawei.com> wrote:
> >> On 6/1/2026 11:58 PM, Alex Williamson wrote:
> >>>
> >>>
> >>> - Patch 2/ sets tph_cap to zero if TPH cannot be enabled, but this is
> >>> never propagated to vfio-pci to drop the TPH capability from the
> >>> capability chain.
> >>>
> >>> - Patch 3/ exposing the enum to callers is a bit ugly, maybe we should
> >>> retain pci_enable_tph() as the "auto" wrapper, resulting in no
> >>> change to existing users, and add _ext and _std wrappers.
> >>>
> >>> - Patch 4/ same, multiple wrappers keeps the "auto" api cleaner. Also
> >>> note that auto is broken before TPH is enabled when tph_req_type is
> >>> zero.
> >>>
> >>> - Patch 6/:
> >>> - mutex_init/destroy belong in vfio_pci_core_init_dev/release_dev()
> >>> - Re-order to avoid forward declarations.
> >>> - vfio_check_feature() does not require declaring PROBE as a
> >>> supported op.
> >>> - pcie_tph_set_st_entry() can write to MMIO space without testing
> >>> for or enabling device memory or power state.
> >>> - tph_init allocates shadow table regardless of opt-in for tph
> >>> support.
> >>> - Redundant tests for count/index in feature function.
> >>> - Duplicate memcpy/kfree in feature function.
> >>>
> >>> - Patch 7/:
> >>> - Unclear why it's vfio-pci's responsibility to provide this
> >>> translation interface.
> >>> - PROBE is unnecessary for vfio_check_feature().
> >>> - Array size conversion is strange.
> >>>
> >>> - Patch 8/:
> >>> - default config read handles emulated bits. Clear the bit if
> >>> necessary in vconfig, use p_setd() to mark PCI_TPH_CAP_EXT_TPH
> >>> virtualized, read-only.
> >>> - Follow examples like vfio_pm_config_write, vfio_vpd_config_write,
> >>> vfio_exp_config_write, etc. Set the virtualized and writable
> >>> bits, perform default config write, evaluate if anything was
> >>> enabled, effect the change in hardware via kernel APIs.
> >>> - pcie_disable_tph() happens after the restore state is recorded.
> >>> - TPH capability becomes read/write, an ABI change to the user, with
> >>> only a host-based opt-in. Seems like the userspace driver also
> >>> needs to opt-in to the ABI change.
> >>> - Memory enable not assured for pcie_tph_set_st_entry().
> >>> - Sashiko notes there are existing hazards in the tph_enabled
> >>> storage unit relative to multiple bitfields that can be updated
> >>> concurrently. The problem is worse with this exposed through
> >>> vfio-pci.
> >>
> >> Thanks for review, all your comments are resolved in v15:
> >
> > Not everything. At a glance I can see that v15 is still pushing for
> > vfio-pci to provide the interface to translate CPU IDs to steering
> > tags, despite my push back above that it doesn't seem within vfio-pci's
> > scope to provide a CPU ID -> ST interface.
>
> This translation helper is required to handle varied TPH configurations,
> including cases where ST entries live inside device private registers
> instead of MSI-X or TPH capability space.
>
> Prior discussion with Jason (
> ref: https://lore.kernel.org/all/20260414151125.GF2577880@ziepe.ca/)
> indicated VFIO-PCI is the correct component for per-device TPH
> implementation.
>
> Production high-speed NICs from multiple vendors rely on this CPU-to-ST
> mapping logic for DPDK optimized polling I/O; removing this helper would
> render the whole TPH userspace control feature unusable for existing
> target workloads.
>
> Therefore, I believe it is appropriate to support it.
The citation is only describing that vfio should provide a way to
program the steering tags. That's not the issue I'm raising.
The proposed uAPI has morphed from one where the user provides CPU IDs,
which are translated to STs and written to the device or returned to
support device specific implementations, to one where raw STs are used.
I think this was a result of my question of how this interface
integrates with the concurrently proposed interface where drivers can
expose TPH values for a dmabuf. Suddenly the user is managing raw TPH
values and what used to be an integrated feature of performing CPU ID
to ST value, with necessary ugliness of exposing that raw ST value for
device specific support, becomes largely a standalone translation
interface, which might be more generically implemented elsewhere in the
kernel. It largely loses any direct association to the vfio aspect of
the device.
The interface now looks more like one where raw STs are provided to
vfio-pci through various sources. It then becomes unclear why vfio-pci
itself is necessarily one of those sources.
> > The host-only opt-in issue above also doesn't appear to be addressed or
> > rebutted.
>
> enable_unsafe_tph host opt-in already fully gates all new TPH ABI:
> TPH cap + feature ioctl are completely hidden from userspace when off,
> no userspace opt-in needed.
I'd argue that opt-in on the host doesn't mean we should ignore the ABI
change to userspace. There's an entire virtualization stack that needs
to be built on this change, intercepting writes to the TPH control
register, MSI-X vectors, and capability ST entries. The host opt-in is
a global switch, it might enable your userspace driver, but it might
also break QEMU, where writes to the TPH capability had been dropped
previously. A per instance opt-in, gated by the host opt-in, allows
seamless integration of the new feature.
> > Typically v15 would be an indication that a series is coming to
> > maturity, but with the quick cadence and lack of discussion throughout
> > this development, I'm not even sure we've arrived at a proper uAPI for
> > this feature. Thanks,
>
> Sorry for sending frequent quick updates without leaving adequate window
> for list review.
>
> This feature originates from an earlier ARM prototype posted last year for
> DPDK TPH acceleration scenarios. Based on that foundational exploration,
> this series has refined the interface through continuous upstream review
> feedback.
>
> All recent rapid iterations are code improvements and review fixes.
The issue is that feedback like above is represented as if it's been
addressed and changelogs are scant on details, requiring significant,
repeated effort to understand the actual changes between iterations.
The issues raised and interface design require discussion, not simply
assertions that it's correct and addressed. Thanks,
Alex
next prev parent reply other threads:[~2026-06-03 17:58 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-28 12:46 [PATCH v14 0/8] vfio/pci: Add PCIe TPH support Chengwen Feng
2026-05-28 12:46 ` [PATCH v14 1/8] PCI/TPH: Fix pcie_tph_get_st_table_loc() field extraction Chengwen Feng
2026-05-28 12:46 ` [PATCH v14 2/8] PCI/TPH: Cache TPH requester capability at probe time Chengwen Feng
2026-05-28 12:46 ` [PATCH v14 3/8] PCI/TPH: Add requester selection policy to pcie_enable_tph() Chengwen Feng
2026-05-28 12:46 ` [PATCH v14 4/8] PCI/TPH: Add requester policy to pcie_tph_get_cpu_st() Chengwen Feng
2026-05-28 12:46 ` [PATCH v14 5/8] PCI/TPH: expose the enabled TPH requester type Chengwen Feng
2026-05-28 12:46 ` [PATCH v14 6/8] vfio/pci: Add VFIO_DEVICE_FEATURE_TPH_ST_CONFIG for PCIe TPH ST configuration Chengwen Feng
2026-05-28 15:17 ` sashiko-bot
2026-05-28 12:46 ` [PATCH v14 7/8] vfio/pci: Add VFIO_DEVICE_FEATURE_TPH_CPU_ST to query TPH steering tag Chengwen Feng
2026-05-28 12:46 ` [PATCH v14 8/8] vfio/pci: Virtualize PCIe TPH capability registers Chengwen Feng
2026-05-28 16:42 ` sashiko-bot
2026-06-01 15:58 ` [PATCH v14 0/8] vfio/pci: Add PCIe TPH support Alex Williamson
2026-06-02 14:46 ` fengchengwen
2026-06-02 23:08 ` Alex Williamson
2026-06-03 0:34 ` fengchengwen
2026-06-03 0:45 ` Jason Gunthorpe
2026-06-03 1:25 ` fengchengwen
2026-06-03 18:53 ` Alex Williamson
2026-06-04 18:33 ` Jason Gunthorpe
2026-06-04 20:46 ` Alex Williamson
2026-06-03 17:58 ` Alex Williamson [this message]
2026-06-04 1:58 ` fengchengwen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260603115828.21216c0f@shazbot.org \
--to=alex@shazbot.org \
--cc=fengchengwen@huawei.com \
--cc=helgaas@kernel.org \
--cc=jgg@ziepe.ca \
--cc=kvm@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=liuyonglong@huawei.com \
--cc=wangyushan12@huawei.com \
--cc=wangzhou1@hisilicon.com \
--cc=wathsala.vithanage@arm.com \
--cc=wei.huang2@amd.com \
--cc=zhipingz@meta.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox