Linux PCI subsystem development
 help / color / mirror / Atom feed
From: Alex Williamson <alex@shazbot.org>
To: Chengwen Feng <fengchengwen@huawei.com>
Cc: <jgg@ziepe.ca>, <wathsala.vithanage@arm.com>,
	<helgaas@kernel.org>, <wei.huang2@amd.com>,
	<wangzhou1@hisilicon.com>, <wangyushan12@huawei.com>,
	<liuyonglong@huawei.com>, <kvm@vger.kernel.org>,
	<linux-pci@vger.kernel.org>,
	alex@shazbot.org
Subject: Re: [PATCH v8 4/7] vfio/pci: Add PCIe TPH interface with capability query
Date: Fri, 8 May 2026 16:40:03 -0600	[thread overview]
Message-ID: <20260508164003.70918c0c@shazbot.org> (raw)
In-Reply-To: <20260508064053.37529-5-fengchengwen@huawei.com>

On Fri, 8 May 2026 14:40:50 +0800
Chengwen Feng <fengchengwen@huawei.com> wrote:

> Add VFIO_DEVICE_PCI_TPH IOCTL to allow userspace to query device TPH
> capabilities, supported modes, and steering tag table information.
> 
> Add module parameter 'enable_unsafe_tph_ds_mode' to restrict unsafe
> device-specific TPH mode to trusted userspace only.
> 
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> ---
>  drivers/vfio/pci/vfio_pci.c      |  13 ++-
>  drivers/vfio/pci/vfio_pci_core.c |  56 ++++++++++++-
>  include/linux/vfio_pci_core.h    |   3 +-
>  include/uapi/linux/vfio.h        | 133 +++++++++++++++++++++++++++++++
>  4 files changed, 202 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
> index 0c771064c0b8..40bf5aa9fd0b 100644
> --- a/drivers/vfio/pci/vfio_pci.c
> +++ b/drivers/vfio/pci/vfio_pci.c
> @@ -60,6 +60,12 @@ static bool disable_denylist;
>  module_param(disable_denylist, bool, 0444);
>  MODULE_PARM_DESC(disable_denylist, "Disable use of device denylist. Disabling the denylist allows binding to devices with known errata that may lead to exploitable stability or security issues when accessed by untrusted users.");
>  
> +#ifdef CONFIG_PCIE_TPH
> +static bool enable_unsafe_tph_ds_mode;
> +module_param(enable_unsafe_tph_ds_mode, bool, 0444);
> +MODULE_PARM_DESC(enable_unsafe_tph_ds_mode, "Enable UNSAFE TPH device-specific (DS) mode. This mode provides weak isolation, cannot be safely used for virtual machines. If you do not know what this is for, step away. (default: false)");
> +#endif
> +

Why is the "unsafe" aspect of this keyed on mode rather than storage
location?

Currently the user cannot enable TPH, the capability is read-only, but
the user does have direct access to the MSI-X table.  We rely on an
agreement that the user needs to use SET_IRQS to allocate host vectors
and we use interrupt remapping as protection against abuse, but there's
no mediation of STs written directly to the MSI-X table.  If the device
supports IV mode with ST in the MSI-X table, nothing prevents the user
from writing those ST entries directly to the MSI-X table.  Therefore
doesn't it have the same security concern as DS mode?

Further, config space lives in the device and various devices are known
to have alternate means for accessing their config space.
Virtualization of config space is more to present the device in the VM
address space and bridge features between guest and host.  It's not
great as a security barrier.

Maybe it's really neither the mode nor storage location, and we need to
decide if TPH as a whole introduces any new security considerations.
It seems arguable whether we can actually prevent a device from
including arbitrary STs on TLPs in any case and maybe we're really only
exposing a curated programming interface.

...
> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> index 5de618a3a5ee..81da2bd0c21b 100644
> --- a/include/uapi/linux/vfio.h
> +++ b/include/uapi/linux/vfio.h
> @@ -1321,6 +1321,139 @@ struct vfio_precopy_info {
>  
>  #define VFIO_MIG_GET_PRECOPY_INFO _IO(VFIO_TYPE, VFIO_BASE + 21)
>  
> +/**
> + * struct vfio_pci_tph_cap - PCIe TPH capability information
> + * @supported_modes: Supported TPH operating modes
> + * @st_table_sz: Number of entries in ST table; 0 means no ST table
> + * @reserved: Must be zero
> + *
> + * Used with VFIO_PCI_TPH_GET_CAP operation to return device
> + * TLP Processing Hints (TPH) capabilities to userspace.
> + */
> +struct vfio_pci_tph_cap {
> +	__u8  supported_modes;
> +#define VFIO_PCI_TPH_MODE_IV	(1u << 0) /* Interrupt vector */
> +#define VFIO_PCI_TPH_MODE_DS	(1u << 1) /* Device specific */
> +	__u8  reserved0;
> +	__u16 st_table_sz;
> +	__u32 reserved;
> +};
> +
> +/**
> + * struct vfio_pci_tph_ctrl - TPH enable control structure
> + * @mode: Selected TPH operating mode (VFIO_PCI_TPH_MODE_*)
> + * @reserved: Must be zero
> + *
> + * Used with VFIO_PCI_TPH_ENABLE operation to specify the
> + * operating mode when enabling TPH on the device.
> + */
> +struct vfio_pci_tph_ctrl {
> +	__u8 mode;
> +	__u8 reserved[7];
> +};
> +
> +/**
> + * struct vfio_pci_tph_entry - Single TPH steering tag entry
> + * @cpu: CPU identifier for steering tag calculation
> + * @mem_type: Memory type (VFIO_PCI_TPH_MEM_TYPE_*)
> + * @reserved0: Must be zero
> + * @index: ST table index for programming
> + * @st: Unused for SET_ST
> + * @reserved1: Must be zero
> + *
> + * For VFIO_PCI_TPH_GET_ST:
> + *   Userspace sets @cpu and @mem_type; kernel returns @st.
> + *
> + * For VFIO_PCI_TPH_SET_ST:
> + *   Userspace sets @index, @cpu, and @mem_type.
> + *   Kernel internally computes the steering tag and programs
> + *   it into the specified @index.
> + *
> + *   If @cpu == U32_MAX, kernel clears the steering tag at
> + *   the specified @index.
> + */
> +struct vfio_pci_tph_entry {
> +	__u32 cpu;
> +	__u8  mem_type;
> +#define VFIO_PCI_TPH_MEM_TYPE_VM	0
> +#define VFIO_PCI_TPH_MEM_TYPE_PM	1
> +	__u8  reserved0;
> +	__u16 index;
> +	__u16 st;
> +	__u16 reserved1;
> +};
> +
> +/**
> + * struct vfio_pci_tph_st - Batch steering tag request
> + * @count: Number of entries in the array
> + * @reserved: Must be zero
> + * @ents: Flexible array of steering tag entries
> + *
> + * Container structure for batch get/set operations.
> + * Used with both VFIO_PCI_TPH_GET_ST and VFIO_PCI_TPH_SET_ST.
> + */
> +struct vfio_pci_tph_st {
> +	__u32 count;
> +	__u32 reserved;
> +	struct vfio_pci_tph_entry ents[];
> +#define VFIO_PCI_TPH_MAX_ENTRIES    2048
> +};
> +
> +/**
> + * struct vfio_device_pci_tph_op - Argument for VFIO_DEVICE_PCI_TPH
> + * @argsz: User allocated size of this structure
> + * @op: TPH operation (VFIO_PCI_TPH_*)
> + * @cap: Capability data for GET_CAP
> + * @ctrl: Control data for ENABLE
> + * @st: Batch entry data for GET_ST/SET_ST
> + *
> + * @argsz must be set by the user to the size of the structure
> + * being executed. Kernel validates input and returns data
> + * only within the specified size.
> + *
> + * Operations:
> + * - VFIO_PCI_TPH_GET_CAP: Query device TPH capabilities.
> + * - VFIO_PCI_TPH_ENABLE:  Enable TPH using mode from &ctrl.
> + * - VFIO_PCI_TPH_DISABLE: Disable TPH on the device.
> + * - VFIO_PCI_TPH_GET_ST:  Retrieve CPU steering tags for Device-Specific (DS)
> + *                         mode. Used when device requires SW to obtain ST
> + *                         values for programming.
> + * - VFIO_PCI_TPH_SET_ST:  Program steering tag entries into device ST table.
> + *                         Valid when ST table resides in TPH Requester
> + *                         Capability or MSI-X Table.
> + *                         If any entry fails, all programmed entries are rolled
> + *                         back to 0 before returning error.
> + */
> +struct vfio_device_pci_tph_op {
> +	__u32 argsz;
> +	__u32 op;
> +#define VFIO_PCI_TPH_GET_CAP	0
> +#define VFIO_PCI_TPH_ENABLE	1
> +#define VFIO_PCI_TPH_DISABLE	2
> +#define VFIO_PCI_TPH_GET_ST	3
> +#define VFIO_PCI_TPH_SET_ST	4
> +	union {
> +		struct vfio_pci_tph_cap cap;
> +		struct vfio_pci_tph_ctrl ctrl;
> +		struct vfio_pci_tph_st st;
> +	};
> +};
> +
> +/**
> + * VFIO_DEVICE_PCI_TPH - _IO(VFIO_TYPE, VFIO_BASE + 22)
> + *
> + * IOCTL for managing PCIe TLP Processing Hints (TPH) on
> + * a VFIO-assigned PCI device. Provides operations to query
> + * device capabilities, enable/disable TPH, retrieve CPU's
> + * steering tags, and program steering tag tables.
> + *
> + * Return: 0 on success, negative errno on failure.
> + *         -EOPNOTSUPP: Operation not supported
> + *         -ENODEV: Device or required functionality not present
> + *         -EINVAL: Invalid argument or TPH not supported
> + */
> +#define VFIO_DEVICE_PCI_TPH	_IO(VFIO_TYPE, VFIO_BASE + 22)
> +

This seems like the wrong shape to me and introduces yet another ioctl
multiplexer.  We already have that via the device feature interface.
I'd propose this only needs one new DEVICE_FEATURE ioctl, TPH_ST.  The
uAPI would look like:

struct vfio_device_feature_tph_st {
	__u32 flags;
#define VFIO_TPH_ST_MEM_TYPE_PM	(1 << 0)
	__u16 index;
	__u16 count;
	__u32 data[]; /* host CPU# on SET, ST value on GET */
}

The user can SET multiple STs at once that have the same mem_type
(assuming that's a reasonable limitation).  On SET, each {cpu#,
mem_type} is translated to a host value and stored internally.  A GET
returns that translated ST value for DS use cases.

The user can use PROBE to determine if this feature is available.

We already provide the TPH capability read-only in config space, we can
use that rather than an explicit INFO/GET_CAP interface.

When the feature is available, the TPH control register is virtualized.
On enabling TPH via config space, vfio will store the translated ST
values to the appropriate location, or none, and enable TPH.  On SET
while already enabled, vfio will update both the internal table and the
device location (or none).  Thanks,

Alex

  parent reply	other threads:[~2026-05-08 22:40 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-08  6:40 [PATCH v8 0/7] vfio/pci: Add PCIe TPH support Chengwen Feng
2026-05-08  6:40 ` [PATCH v8 1/7] PCI/TPH: Fix pcie_tph_get_st_table_loc() field extraction Chengwen Feng
2026-05-08  6:40 ` [PATCH v8 2/7] PCI/TPH: Export pcie_tph_get_st_modes() for external use Chengwen Feng
2026-05-08 19:02   ` sashiko-bot
2026-05-08  6:40 ` [PATCH v8 3/7] PCI/TPH: Fix pcie_tph_get_st_table_size() for MSI-X table location Chengwen Feng
2026-05-08 19:31   ` sashiko-bot
2026-05-08  6:40 ` [PATCH v8 4/7] vfio/pci: Add PCIe TPH interface with capability query Chengwen Feng
2026-05-08 20:03   ` sashiko-bot
2026-05-08 22:40   ` Alex Williamson [this message]
2026-05-09  3:28     ` fengchengwen
2026-05-11  4:36       ` Alex Williamson
2026-05-08  6:40 ` [PATCH v8 5/7] vfio/pci: Add PCIe TPH enable/disable support Chengwen Feng
2026-05-08 20:46   ` sashiko-bot
2026-05-08  6:40 ` [PATCH v8 6/7] vfio/pci: Add PCIe TPH GET_ST interface Chengwen Feng
2026-05-08  6:40 ` [PATCH v8 7/7] vfio/pci: Add PCIe TPH SET_ST interface Chengwen Feng
2026-05-08 21:49   ` sashiko-bot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260508164003.70918c0c@shazbot.org \
    --to=alex@shazbot.org \
    --cc=fengchengwen@huawei.com \
    --cc=helgaas@kernel.org \
    --cc=jgg@ziepe.ca \
    --cc=kvm@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=liuyonglong@huawei.com \
    --cc=wangyushan12@huawei.com \
    --cc=wangzhou1@hisilicon.com \
    --cc=wathsala.vithanage@arm.com \
    --cc=wei.huang2@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox