Kernel KVM virtualization development
From: Alex Williamson <alex@shazbot.org>
To: fengchengwen <fengchengwen@huawei.com>
Cc: <helgaas@kernel.org>, <wathsala.vithanage@arm.com>,
	<wei.huang2@amd.com>, <zhipingz@meta.com>,
	<wangzhou1@hisilicon.com>, <wangyushan12@huawei.com>,
	<liuyonglong@huawei.com>, <kvm@vger.kernel.org>,
	<linux-pci@vger.kernel.org>, Jason Gunthorpe <jgg@ziepe.ca>,
	alex@shazbot.org
Subject: Re: [PATCH v17 08/12] PCI/TPH: Add sysfs binary file to export CPU to steering-tag mapping
Date: Mon, 29 Jun 2026 09:43:26 -0600
Message-ID: <20260629094326.779ab0ff@shazbot.org>
In-Reply-To: <5b389fb6-7d0e-8fea-1fd9-b873efc028c3@huawei.com>

On Sun, 28 Jun 2026 19:58:09 +0800
fengchengwen <fengchengwen@huawei.com> wrote:
> 
> Thanks a lot for your extremely detailed and professional design breakdown
> for the unified VFIO TPH uAPI framework. I’ve fully gone through all your
> design points and aligned my implementation plan accordingly. I have several
> key implementation questions to confirm with you as below:
> 
> 1. Plan for dma-buf TPH metadata storage
>    I plan to add the following TPH-related fields into
>    struct vfio_pci_dma_buf in my preparatory patch series, which can be
>    fully reused after Zhiping’s dma-buf TPH patches land upstream:
>        u16 tph_st_ext;
>        u8  tph_st;
>        u8  revoked:1;
>        u8  tph_st_valid:1;
>        u8  tph_st_ext_valid:1;
>        u8  tph_ph:2;
>        u8  tph_ph_valid:1;
>    The tph_ph_valid bit is newly added to track whether a valid PH value is
>    bound to the dma-buf. Is this field layout and validity flag design
>    acceptable?

In Zhiping's design, the PH completer validity is bound to the ST
validity.  In my proposal, the user makes a request relative to a
namespace: EXTENDED set selects 16-bit, clear selects 8-bit.  Internally
we do a .get_tph on the dmabuf based on the requested namespace and get
back success or failure.  On success, the full PH + ST is provided to
the user when running with DS or LITERAL capability available; otherwise
the ST is withheld and only the PH is provided.  I don't see a need to
track the PH validity separately.
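
As a rough illustration of the lookup described above, the following
user-space C model shows why a separate PH validity bit may be
unnecessary: a successful lookup implies a valid PH, and the ST is simply
withheld when the caller lacks the DS or LITERAL capability.  Every name
here (struct peer_tph, model_get_tph, tph_query, st_capable) is
hypothetical and only stands in for the .get_tph callback and the uAPI
path; none of it is taken from the patches.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of what a dma-buf exporter knows about its TPH tags. */
struct peer_tph {
	bool     has_st8;   /* an 8-bit ST is available */
	bool     has_st16;  /* a 16-bit (extended) ST is available */
	uint8_t  ph;        /* processing hint */
	uint8_t  st8;
	uint16_t st16;
};

/* Hypothetical shape of what is handed back to the user. */
struct tph_result {
	uint8_t  ph;
	uint16_t st;
	bool     st_provided;
};

/* Stand-in for .get_tph: succeed only if the requested namespace exists. */
static int model_get_tph(const struct peer_tph *p, bool extended,
			 uint8_t *ph, uint16_t *st)
{
	if (extended ? !p->has_st16 : !p->has_st8)
		return -1;

	*ph = p->ph;
	*st = extended ? p->st16 : p->st8;
	return 0;
}

/*
 * extended:   the user set the EXTENDED flag (16-bit namespace).
 * st_capable: the user runs with DS or LITERAL capability (assumed to be
 *             known to the caller; how it is determined is not modeled).
 */
static int tph_query(const struct peer_tph *p, bool extended,
		     bool st_capable, struct tph_result *res)
{
	uint8_t ph;
	uint16_t st;

	if (model_get_tph(p, extended, &ph, &st))
		return -1;

	/* Success implies a valid PH, so no separate PH validity flag. */
	res->ph = ph;
	res->st_provided = st_capable;
	res->st = st_capable ? st : 0;	/* ST withheld otherwise */
	return 0;
}
```

In this model the only state the exporter carries per namespace is whether
an ST exists; PH validity falls out of the return value.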

> 2. Validation rule for VFIO_DEVICE_TPH_EXTENDED flag mismatches
>    The VFIO_DEVICE_TPH_EXTENDED modifier you defined is an excellent design,
>    letting users select either 8-bit base ST or 16-bit extended ST when
>    hardware supports both variants.
>    But a mismatch risk exists: users may set ST entries via TPH_ST with
>    EXTENDED, then later enable TPH requester in pure 8-bit mode only,
>    causing inconsistency between shadow config and active hardware mode.
> 
>    My proposed solution: maintain two separate shadow ST tables inside VFIO,
>    one for base 8-bit ST and one for extended 16-bit ST. When enabling TPH
>    requester mode, activate the shadow table matching the selected ST width.
>    For devices only supporting 8-bit ST, directly reject EXTENDED flag in
>    all TPH_ST ioctl calls.
> 
>    Should we enforce strict cross-check between EXTENDED flag used during ST
>    programming and the final active requester ST width during enablement?
>    If yes, is the dual shadow table approach reasonable?

We've abandoned the apply-at-enable-time approach in this proposal; TPH
must first be enabled in device config space.  There is also no
buffering of user values; they're written straight through to hardware.
If the user has enabled only 8-bit mode, then a TPH_ST with the
EXTENDED flag set should generate an error.  Likewise, if the user
calls TPH_ST while Requester Enable is 00b, this generates an error
regardless of the namespace.
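
A minimal sketch of that check, assuming the Requester Enable field has
already been read from config space.  The enum names and tph_st_check()
are local to this example, and -EINVAL is an assumption: the text above
only says that an error is generated, not which one.

```c
#include <errno.h>
#include <stdbool.h>

/* TPH Requester Enable encodings (names are local to this sketch). */
enum {
	TPH_REQ_DISABLED = 0x0,	/* 00b: requester may not issue TPH */
	TPH_REQ_TPH_ONLY = 0x1,	/* 01b: 8-bit ST namespace only */
	TPH_REQ_EXT_TPH  = 0x3,	/* 11b: 8-bit and 16-bit namespaces */
};

/*
 * Decide whether a TPH_ST call may be written through to hardware.
 * req_en:   current Requester Enable field value.
 * extended: the call carries the EXTENDED flag.
 * Returns 0 if allowed, or a negative errno (assumed to be -EINVAL).
 */
static int tph_st_check(unsigned int req_en, bool extended)
{
	switch (req_en) {
	case TPH_REQ_TPH_ONLY:
		/* Only the 8-bit namespace is enabled. */
		return extended ? -EINVAL : 0;
	case TPH_REQ_EXT_TPH:
		/* Both namespaces may be used simultaneously. */
		return 0;
	default:
		/* 00b (disabled) or the reserved 10b encoding. */
		return -EINVAL;
	}
}
```

Because nothing is buffered, this single check at TPH_ST time is the only
place where the namespace and the enable state have to agree.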
 
> 3. Virtualization logic for TPH requester enable bits with heterogeneous
>    completer capabilities
>    Two complex real hardware topologies need proper handling:
>    - Case 1: Single device with multiple queues routing TLPs to host memory
>      and P2P peer memory via dma-buf flow; root port and P2P TPH completer
>      capabilities may differ.
>    - Case 2: Root port has no TPH completer support, while endpoint and P2P
>      peers fully support TPH completer.
> 
>    I’m confused about how to virtualize the device’s TPH requester control
>    bits.
>    My tentative idea: take the minimum supported capability between endpoint
>    and host root port. If root port lacks TPH completer, block TPH requester
>    enable entirely.
> 
>    Is this the correct approach to handle heterogeneous completer capability
>    across different traffic paths?

Case 1 is why it doesn't work to let the user buffer per-namespace STs
to be applied based on the value written to Requester Enable.  Register
value 11b allows the requester to operate in both namespaces
simultaneously.  The only governance we can provide is to disallow
EXTENDED STs from being written when Requester Enable is 01b.

The peer completer's capability is provided through the dmabuf.  The
user can ask for the requester's preferred namespace, use the alternate
if available, or fail if there's no compatible namespace available,
which includes no .get_tph support.
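
The selection order in the previous paragraph could look roughly like the
following from the user's side.  This is only a sketch: struct peer_caps,
pick_namespace() and the alternate_usable parameter are hypothetical, and
how the peer's capability is actually reported through the dmabuf is not
modeled here.

```c
#include <stdbool.h>

enum tph_ns { TPH_NS_NONE, TPH_NS_8BIT, TPH_NS_16BIT };

/* Hypothetical view of what the peer's dmabuf can provide. */
struct peer_caps {
	bool has_get_tph;	/* exporter implements .get_tph at all */
	bool ns8;		/* 8-bit namespace available */
	bool ns16;		/* 16-bit namespace available */
};

static bool peer_has_ns(const struct peer_caps *peer, enum tph_ns ns)
{
	if (!peer->has_get_tph)
		return false;

	switch (ns) {
	case TPH_NS_8BIT:
		return peer->ns8;
	case TPH_NS_16BIT:
		return peer->ns16;
	default:
		return false;
	}
}

/*
 * preferred must be TPH_NS_8BIT or TPH_NS_16BIT.  alternate_usable says
 * whether the requester is able to use the other namespace as well.
 * Returns TPH_NS_NONE when no compatible namespace exists, which covers
 * an exporter without .get_tph support.
 */
static enum tph_ns pick_namespace(const struct peer_caps *peer,
				  enum tph_ns preferred,
				  bool alternate_usable)
{
	enum tph_ns alternate = (preferred == TPH_NS_16BIT) ?
				TPH_NS_8BIT : TPH_NS_16BIT;

	if (peer_has_ns(peer, preferred))
		return preferred;
	if (alternate_usable && peer_has_ns(peer, alternate))
		return alternate;
	return TPH_NS_NONE;
}
```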

In case 2, we're gated by the Linux TPH implementation and carry it
through to the uAPI.  The overall TPH feature opt-in needs to depend on
both TPH support in the requester (the user's device) AND TPH completer
support at the root port (unless the requester itself is an RCiEP).  I
had missed elaborating on this requirement in my write-up.
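
Expressed as a predicate, the opt-in condition might be sketched as below.
The function and its three boolean inputs are hypothetical; in the kernel
these facts would come from the PCI core rather than from caller-supplied
flags.

```c
#include <stdbool.h>

/*
 * Gate the overall TPH feature opt-in.
 * requester_has_tph:   the user's device has a TPH requester capability.
 * is_rciep:            the device is a Root Complex Integrated Endpoint,
 *                      so there is no root port above it to consult.
 * root_port_completer: the root port above the device supports acting
 *                      as a TPH completer.
 */
static bool tph_opt_in_allowed(bool requester_has_tph, bool is_rciep,
			       bool root_port_completer)
{
	if (!requester_has_tph)
		return false;
	if (is_rciep)
		return true;
	return root_port_completer;
}
```

Under this rule, case 2 above (root port without completer support, peers
with it) would not be offered TPH at all unless the requester is an RCiEP.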

I'm glad you're on board with the design; please let me know if any
further clarifications are needed.  Thanks,

Alex
