public inbox for kvm@vger.kernel.org
From: fengchengwen <fengchengwen@huawei.com>
To: Alex Williamson <alex@shazbot.org>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Wathsala Vithanage <wathsala.vithanage@arm.com>
Cc: <kvm@vger.kernel.org>, <linux-pci@vger.kernel.org>
Subject: Re: [PATCH 3/4] vfio/pci: Add PCIe TPH GET_ST interface
Date: Thu, 23 Apr 2026 09:26:34 +0800
Message-ID: <7f64f1fc-8d37-4990-aa78-7d5873ebfb46@huawei.com>
In-Reply-To: <f28a790f-d762-41f1-a2a1-0e3cc4cdb4f2@huawei.com>

Hi Alex, Jason and Wathsala

Gentle reminder that the corresponding protection mechanisms have been implemented in v3:

- Introduced the module parameter *enable_unsafe_tph_ds* to control unsafe TPH usage
- TPH Device-Specific mode without a standard ST table is now disabled by default
- Enabling that mode, and the GET_ST operation, is permitted only when the parameter is
  explicitly set by a trusted user
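
To make the intended policy easier to review, here is a minimal userspace sketch of the gating logic described above. It is not the v3 patch code: the variable only stands in for the enable_unsafe_tph_ds module parameter, and struct tph_dev, enum tph_mode, tph_enable_allowed() and tph_get_st_allowed() are hypothetical names used for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the enable_unsafe_tph_ds module parameter (default: off). */
static bool enable_unsafe_tph_ds = false;

/* ST modes; the enum names are illustrative, not taken from the patch. */
enum tph_mode { TPH_MODE_NO_ST, TPH_MODE_INT_VEC, TPH_MODE_DEV_SPEC };

/* Hypothetical per-device state: does the device expose a standard ST table? */
struct tph_dev {
    bool has_st_table;
};

/* Return 0 if the requested mode may be enabled, -EPERM otherwise. */
static int tph_enable_allowed(const struct tph_dev *dev, enum tph_mode mode)
{
    /* DS mode without a standard ST table is refused unless opted in. */
    if (mode == TPH_MODE_DEV_SPEC && !dev->has_st_table &&
        !enable_unsafe_tph_ds)
        return -EPERM;
    return 0;
}

/* GET_ST is only reachable when the unsafe parameter is set. */
static int tph_get_st_allowed(void)
{
    return enable_unsafe_tph_ds ? 0 : -EPERM;
}
```

With the parameter left at its default, a device without a standard ST table cannot enter Device-Specific mode and GET_ST is refused; devices with a standard ST table are unaffected.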

Please help review when available.

Thanks

On 4/17/2026 10:06 AM, fengchengwen wrote:
> Sorry for the self-reply.
> 
> Hi Alex & Wathsala,
> 
> Based on the VM assignment scenario and the cross-VM attack concern
> raised by Wathsala in her review of "[PATCH v2 RESEND 4/5] vfio/pci: Add PCIe TPH GET_ST interface":
> 
>   This is unsafe. A user space driver can obtain STs for arbitrary CPUs and program them
>   into device-specific registers (e.g., E810), with no isolation guarantees.
>   For example, consider two VMs on the same host. A driver in VM1 could program STs that
>   target CPUs primarily used by VM2. This can steer traffic processing onto VM2's CPUs,
>   creating contention and degrading VM2's performance.
>   This breaks CPU isolation at the host level and can be used to disrupt workloads and violate
>   SLAs across tenants.
> 
> After fully re-evaluating the security architecture with your feedback,
> I agree with your concerns and conclusions. I hereby revoke my earlier
> proposal to restrict GET_ST to only the current CPU.
> 
> For devices that implement standard ST tables (via config space or MSI-X caps),
> hypervisors such as QEMU/kvmtool can trap and filter guest writes, preventing
> malicious steering tag abuse. This is the safe and supported model for TPH in
> virtualized environments.
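
As a rough illustration of the trap-and-filter idea for standard ST tables, a hypervisor could check each guest-written steering tag against a per-VM allow list before forwarding it to the device. This is only a sketch under assumptions: struct vm_st_policy and filter_guest_st() are hypothetical names, not QEMU or kvmtool APIs, and it assumes that an ST value of 0 means "no steering preference", which should be checked against the PCIe specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-VM policy: the steering tags the host permits this VM to use. */
struct vm_st_policy {
    const uint16_t *allowed;
    size_t n_allowed;
};

/*
 * Return the value to program into the real ST table entry for a trapped
 * guest write. Tags outside the allow list are replaced with 0, which is
 * assumed here to mean "no steering preference".
 */
static uint16_t filter_guest_st(const struct vm_st_policy *p, uint16_t guest_st)
{
    for (size_t i = 0; i < p->n_allowed; i++) {
        if (p->allowed[i] == guest_st)
            return guest_st;
    }
    return 0;
}
```
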
> 
> However, for devices that only support Device-Specific mode with no standard
> ST table, there is no existing hypervisor interception mechanism to prevent
> a guest from programming arbitrary steering tags to attack other CPUs.
> This is a fundamental security risk that cannot be safely mitigated in software.
> 
> Therefore, the correct security posture for virtualization is:
> - TPH should be enabled *only* for devices with standard ST tables
> - Devices without standard ST tables should NOT enable TPH in virtualization
> 
> On the other hand, in non-virtualization (bare-metal) scenarios, there is
> strong legitimate demand for devices that lack a standard ST table —
> many real-world devices are designed this way. For this reason, I would
> like to retain the GET_ST interface.
> 
> For virtualization scenarios, the hypervisor is responsible for avoiding
> this risk by **disabling TPH Requester Enable in the PCIe config space**,
> which is fully interceptable and under the hypervisor’s control.
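
A minimal sketch of what such interception could look like when a guest writes the TPH Requester Control register. The mask values reflect my reading of the register layout (they match the PCI_TPH_CTRL_* definitions in Linux as far as I know, but please verify against the spec), and filter_tph_ctrl_write() is a hypothetical hypervisor hook, not an existing QEMU or kvmtool function.

```c
#include <stdbool.h>
#include <stdint.h>

/* TPH Requester Control register fields (assumed layout, verify against the PCIe spec). */
#define TPH_CTRL_MODE_SEL_MASK 0x00000007u /* ST Mode Select */
#define TPH_CTRL_REQ_EN_MASK   0x00000300u /* TPH Requester Enable */

/*
 * Hypothetical hook: given the value a guest wrote to the TPH control
 * register, return the value the hypervisor forwards to hardware.
 * For devices without a standard ST table, TPH Requester Enable is
 * forced to 0 so the guest cannot turn TPH on.
 */
static uint32_t filter_tph_ctrl_write(uint32_t guest_val, bool has_st_table)
{
    if (has_st_table)
        return guest_val;
    return guest_val & ~TPH_CTRL_REQ_EN_MASK;
}
```

The ST Mode Select bits are passed through unchanged; only the enable field is cleared, so TPH stays off for the device regardless of what the guest programs elsewhere.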
> 
> Thanks
> 
> On 4/17/2026 8:48 AM, fengchengwen wrote:
>> Hi Alex,
>>
>> Thank you very much for your clear and detailed security explanation.
>> I fully understand and agree with your security concerns about allowing
>> userspace to query steering tags for arbitrary CPUs.
>>
>> To completely resolve this security issue while retaining the mandatory
>> functionality for DS-mode devices without ST table, I will revise the
>> GET_ST interface with a strict security constraint in v3:
>>
>>     The CPU number provided by userspace will be VALIDATED TO EQUAL
>>     THE CURRENT CALLING CPU of the ioctl().
>>
>> In other words:
>> - Userspace can ONLY query the steering tag for the CPU it is currently
>>   running on.
>> - Userspace CANNOT query any other CPU.
>> - No cross-CPU probing, no side-channel, no attack surface.
>> - No ability to influence or target other CPUs.
>>
>> This completely eliminates the security exposure you mentioned, while
>> still fully supporting the Device-Specific mode requirement for devices
>> without ST tables.
>>
>> Thanks
>>
>> On 4/16/2026 9:40 PM, Alex Williamson wrote:
>>> On Thu, 16 Apr 2026 09:09:50 +0800
>>> fengchengwen <fengchengwen@huawei.com> wrote:
>>>
>>>> On 4/15/2026 9:55 PM, Wathsala Vithanage wrote:
>>>>> Hi Feng,
>>>>>
>>>>> The get_st feature is unsafe. It allows a rogue userspace driver in device-specific
>>>>> mode to obtain steering tags for arbitrary CPUs, including ones unrelated
>>>>> to the device or its workload, enabling it to direct traffic into those CPUs’
>>>>> caches and potentially interfere with other workloads, opening the door to
>>>>> further exploits depending on other vulnerabilities.
>>>>
>>>> Thank you for the follow-up and for referencing the prior RFC
>>>> discussion on this topic. I appreciate you clarifying the
>>>> historical context of the safety concerns.
>>>>
>>>> I acknowledge the risks you’ve highlighted, but I believe the
>>>> risk profile in this VFIO interface is different and already
>>>> well bounded by existing design and practice:
>>>>
>>>> 1. VFIO device access requires elevated privileges
>>>>    A userspace process can only open a VFIO device node if it
>>>>    has sufficient privileges (typically root). This is not an
>>>>    interface for unprivileged users.
>>>
>>> This argument is NOT helping your cause.  This is not the usage model
>>> we design for.  VFIO usage requires that privileges be granted to a
>>> user, in the form of device ACL access and locked memory, but does not
>>> generally require elevated privileges beyond that, or otherwise grant
>>> the user authority beyond the scope of the device.  The root use case
>>> may be typical for you, but is not required for many other typical use
>>> cases, such as device assignment to VMs.
>>>  
>>>> 2. In the thread "[RFC v2 0/2] Retrieve tph from dmabuf for PCIe
>>>>    P2P memory access", applications can configure the steertag
>>>>    of exported dmabufs from userspace to the kernel. Kernel PCIe
>>>>    drivers (e.g., mlx5 NIC) then use these steertags and set them
>>>>    to their ST tables. Even here, userspace could set invalid
>>>>    steertags that impact GPU performance—but this model is
>>>>    basically accepted, I think (see the mailing list discussion).
>>>
>>> It's an RFC.  It's bold to claim that it's nearly accepted.
>>>
>>>> 3. Malicious resource consumption is not unique to TPH
>>>>    A malicious thread can be created, bound to a specific CPU, and made
>>>>    to forcibly consume CPU resources, affecting other CPUs.
>>>>    This is a general system security concern, not one specific
>>>>    to TPH GET_ST, and is addressed by existing system hardening
>>>>    and access control mechanisms—not by removing useful features.
>>>
>>> You're conflating process abuse of a CPU with a potential side-channel
>>> DMA attack from a device.  What *existing* hardening protects against
>>> the latter?
>>>
>>>> 4. GET_ST is strictly necessary for Device-Specific (DS) mode
>>>>    when no ST table is present on the device.
>>>>    For devices that do not have a dedicated ST table (a common
>>>>    scenario in many PCIe endpoints), DS mode requires userspace
>>>>    to retrieve per-CPU steering tags first, then program them
>>>>    into the device’s steering logic via other registers. Without
>>>>    GET_ST, userspace cannot obtain the required steertags to
>>>>    enable TPH DS mode at all—rendering TPH support useless for
>>>>    these devices. This is not an optional feature but a
>>>>    fundamental requirement to unlock TPH functionality for a
>>>>    large class of hardware.
>>>
>>> Unlocking a hardware feature does not give you authority to ignore the
>>> security implications of that feature.  Thanks,
>>>
>>> Alex
>>
> 
> 



Thread overview: 13+ messages
2026-04-15  9:09 [PATCH 0/4] vfio/pci: Add PCIe TPH support Chengwen Feng
2026-04-15  9:09 ` [PATCH 1/4] vfio/pci: Add PCIe TPH interface with capability query Chengwen Feng
2026-04-15  9:09 ` [PATCH 2/4] vfio/pci: Add PCIe TPH enable/disable support Chengwen Feng
2026-04-15  9:09 ` [PATCH 3/4] vfio/pci: Add PCIe TPH GET_ST interface Chengwen Feng
2026-04-15 13:55   ` Wathsala Vithanage
2026-04-16  1:09     ` fengchengwen
2026-04-16 13:40       ` Alex Williamson
2026-04-16 16:12         ` Wathsala Vithanage
2026-04-17  0:48         ` fengchengwen
2026-04-17  2:06           ` fengchengwen
2026-04-23  1:26             ` fengchengwen [this message]
2026-04-15  9:09 ` [PATCH 4/4] vfio/pci: Add PCIe TPH SET_ST interface Chengwen Feng
     [not found]   ` <e6dbfdd5-5117-4c3e-bb84-ee1e489aa38f@arm.com>
2026-04-16  1:16     ` fengchengwen
