From: fengchengwen <fengchengwen@huawei.com>
To: Alex Williamson <alex@shazbot.org>
Cc: <jgg@ziepe.ca>, <wathsala.vithanage@arm.com>,
<helgaas@kernel.org>, <wei.huang2@amd.com>,
<wangzhou1@hisilicon.com>, <wangyushan12@huawei.com>,
<liuyonglong@huawei.com>, <kvm@vger.kernel.org>,
<linux-pci@vger.kernel.org>
Subject: Re: [PATCH v8 4/7] vfio/pci: Add PCIe TPH interface with capability query
Date: Tue, 12 May 2026 17:41:04 +0800 [thread overview]
Message-ID: <d515105c-d206-41ff-8e78-63d42ae38af3@huawei.com> (raw)
In-Reply-To: <20260510223626.0ba9dc0b@shazbot.org>
Hi Alex,
Thank you for the detailed and insightful review. I agree with your points and have addressed them in the v9 patchset I just sent.
In v9:
- We use only VFIO_DEVICE_FEATURE for TPH ST management, with no custom ioctl.
- The interface supports contiguous index/count batches as you suggested.
- No redundant op field; set/get is handled by the feature flags.
- The TPH security model is simplified to a single module parameter.
- All your UAPI suggestions have been fully implemented.
For userspace reference, the DPDK counterpart patch that uses this new interface is available here:
https://patchwork.dpdk.org/project/dpdk/patch/20260512092302.23735-2-fengchengwen@huawei.com/
Please review the v9 when you have a chance.
Thanks again for your guidance!
Thanks,
Chengwen
On 5/11/2026 12:36 PM, Alex Williamson wrote:
> On Sat, 9 May 2026 11:28:03 +0800
> fengchengwen <fengchengwen@huawei.com> wrote:
>> On 5/9/2026 6:40 AM, Alex Williamson wrote:
>>> On Fri, 8 May 2026 14:40:50 +0800
>>> Chengwen Feng <fengchengwen@huawei.com> wrote:
>>>> diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
>>>> index 0c771064c0b8..40bf5aa9fd0b 100644
>>>> --- a/drivers/vfio/pci/vfio_pci.c
>>>> +++ b/drivers/vfio/pci/vfio_pci.c
>>>> @@ -60,6 +60,12 @@ static bool disable_denylist;
>>>> module_param(disable_denylist, bool, 0444);
>>>> MODULE_PARM_DESC(disable_denylist, "Disable use of device denylist. Disabling the denylist allows binding to devices with known errata that may lead to exploitable stability or security issues when accessed by untrusted users.");
>>>>
>>>> +#ifdef CONFIG_PCIE_TPH
>>>> +static bool enable_unsafe_tph_ds_mode;
>>>> +module_param(enable_unsafe_tph_ds_mode, bool, 0444);
>>>> +MODULE_PARM_DESC(enable_unsafe_tph_ds_mode, "Enable UNSAFE TPH device-specific (DS) mode. This mode provides weak isolation, cannot be safely used for virtual machines. If you do not know what this is for, step away. (default: false)");
>>>> +#endif
>>>> +
>>>
>>> Why is the "unsafe" aspect of this keyed on mode rather than storage
>>> location?
>>>
>>> Currently the user cannot enable TPH, the capability is read-only, but
>>> the user does have direct access to the MSI-X table. We rely on an
>>> agreement that the user needs to use SET_IRQS to allocate host vectors
>>> and we use interrupt remapping as protection against abuse, but there's
>>> no mediation of STs written directly to the MSI-X table. If the device
>>> supports IV mode with ST in the MSI-X table, nothing prevents the user
>>> from writing those ST entries directly to the MSI-X table. Therefore
>>> doesn't it have the same security concern as DS mode?
>>
>>
>> Agree, from this perspective, even if it is in MSI-X table, it is still unsafe.
>> So TPH is unsafe as a whole, not just DS mode.
>>
>>>
>>> Further, config space lives in the device and various devices are known
>>> to have alternate means for accessing their config space.
>>> Virtualization of config space is more to present the device in the VM
>>> address space and bridge features between guest and host. It's not
>>> great as a security barrier.
>>>
>>> Maybe it's really neither the mode nor storage location, and we need to
>>> decide if TPH as a whole introduces any new security considerations.
>>
>> I will adjust the module parameter to control TPH globally instead of
>> only DS mode.
>
> I'm not convinced that's the right solution either. It's a usage
> barrier if the TPH capability isn't exposed R/W, but does it guarantee
> the device won't make use of such TLPs anyway? If the device has
> config space backdoors or can otherwise be manipulated to send these
> hints, a vfio-pci module option is just security theater. It's also a
> burden for users and for each variant driver for devices supporting TPH.
>
> We do however need to consider how changing the behavior of the
> capability affects existing users, like QEMU. We may need to consider
> two device features, one that only supports SET with no payload to
> enable virtualized access to the TPH capability and another that
> provides the ST handling interface.
>
>>> It seems arguable whether we can actually prevent a device from
>>> including arbitrary STs on TLPs in any case and maybe we're really
>>> only exposing a curated programming interface.
>>>
>>> ...
>>>> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
>>>> index 5de618a3a5ee..81da2bd0c21b 100644
>>>> --- a/include/uapi/linux/vfio.h
>>>> +++ b/include/uapi/linux/vfio.h
>>
>> ...
>>
>>>> +#define VFIO_DEVICE_PCI_TPH _IO(VFIO_TYPE, VFIO_BASE + 22)
>>>> +
>>>
>>> This seems like the wrong shape to me and introduces yet another
>>> ioctl multiplexer. We already have that via the device feature
>>> interface. I'd propose this only needs one new DEVICE_FEATURE
>>> ioctl, TPH_ST. The uAPI would look like:
>>>
>>> struct vfio_device_feature_tph_st {
>>> __u32 flags;
>>> #define VFIO_TPH_ST_MEM_TYPE_PM (1 << 0)
>>> __u16 index;
>>> __u16 count;
>>> __u32 data[]; /* host CPU# on SET, ST value on GET */
>>> }
>>>
>>> The user can SET multiple STs at once that have the same mem_type
>>> (assuming that's a reasonable limitation). On SET, each {cpu#,
>>
>> Agree, using the same mem_type for a batch is a good idea.
>>
>> Because it could set multiple index, so how about:
>>
>> struct vfio_pci_tph_entry {
>> __u32 cpu;
>> __u16 val; /* ST index on SET, ST value on GET */
>> __u16 reserved;
>> }
>
> In the structure I proposed the user can set/get contiguous index
> ranges according to index and count, where the data field can then just
> be a u32 array. Why does the user need to be able to set/get
> arbitrary, non-contiguous indexes?
>
> In a VM use case we'd likely be trapping individual writes, therefore
> we'd be intercepting one index at a time.
>
>> struct vfio_device_feature_tph_st {
>> __u32 op;
>> #define VFIO_TPH_OP_GET_ST 0
>> #define VFIO_TPH_OP_SET_ST 1
>
> The vfio device feature interface already handles set/get, we don't
> need this.
>
>> __u32 flags;
>> #define VFIO_TPH_ST_MEM_TYPE_PM (1 << 0)
>> __u16 count;
>> __u16 reserved1;
>> struct vfio_pci_tph_entry ents[];
>> }
>>
>>> mem_type} is translated to a host value and stored internally. A
>>> GET
>>
>> Should we store internally? How about writing directly to the device?
>
> Perhaps we should, but I think we need a flag to indicate whether we're
> virtualizing a write to hardware (capability or MSI-X table) or SET'ing
> an index that userspace will later retrieve for DS mode via GET.
> Otherwise we don't know until the mode bits are written which location,
> if any, the capability is actually using. For example the device can
> support either MSI-X or capability locations, but enable DS mode. For
> consistency though, it might make sense to write to an internal table
> regardless. Thanks,
>
> Alex
next prev parent reply other threads:[~2026-05-12 9:41 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-08 6:40 [PATCH v8 0/7] vfio/pci: Add PCIe TPH support Chengwen Feng
2026-05-08 6:40 ` [PATCH v8 1/7] PCI/TPH: Fix pcie_tph_get_st_table_loc() field extraction Chengwen Feng
2026-05-08 6:40 ` [PATCH v8 2/7] PCI/TPH: Export pcie_tph_get_st_modes() for external use Chengwen Feng
2026-05-08 6:40 ` [PATCH v8 3/7] PCI/TPH: Fix pcie_tph_get_st_table_size() for MSI-X table location Chengwen Feng
2026-05-08 6:40 ` [PATCH v8 4/7] vfio/pci: Add PCIe TPH interface with capability query Chengwen Feng
2026-05-08 22:40 ` Alex Williamson
2026-05-09 3:28 ` fengchengwen
2026-05-11 4:36 ` Alex Williamson
2026-05-12 9:41 ` fengchengwen [this message]
2026-05-08 6:40 ` [PATCH v8 5/7] vfio/pci: Add PCIe TPH enable/disable support Chengwen Feng
2026-05-08 6:40 ` [PATCH v8 6/7] vfio/pci: Add PCIe TPH GET_ST interface Chengwen Feng
2026-05-08 6:40 ` [PATCH v8 7/7] vfio/pci: Add PCIe TPH SET_ST interface Chengwen Feng
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=d515105c-d206-41ff-8e78-63d42ae38af3@huawei.com \
--to=fengchengwen@huawei.com \
--cc=alex@shazbot.org \
--cc=helgaas@kernel.org \
--cc=jgg@ziepe.ca \
--cc=kvm@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=liuyonglong@huawei.com \
--cc=wangyushan12@huawei.com \
--cc=wangzhou1@hisilicon.com \
--cc=wathsala.vithanage@arm.com \
--cc=wei.huang2@amd.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox