Message-ID: <7f64f1fc-8d37-4990-aa78-7d5873ebfb46@huawei.com>
Date: Thu, 23 Apr 2026 09:26:34 +0800
X-Mailing-List: kvm@vger.kernel.org
Subject: Re: [PATCH 3/4] vfio/pci: Add PCIe TPH GET_ST interface
From: fengchengwen
To: Alex Williamson, Jason Gunthorpe, Wathsala Vithanage
References: <20260415090959.53672-1-fengchengwen@huawei.com>
 <20260415090959.53672-4-fengchengwen@huawei.com>
 <518e5e0a-d0b2-4775-a32a-e2dc87c8ba4b@arm.com>
 <41d5b3ea-abe2-4583-be88-2addf6b4d394@huawei.com>
 <20260416074020.57e4ed72@shazbot.org>
 <5ed17a05-1dee-49c2-8d40-a1db8f67ef13@huawei.com>

Hi Alex, Jason and Wathsala

Gentle reminder that the corresponding protection mechanisms have been
implemented in v3:
- Introduced module parameter *enable_unsafe_tph_ds* to control unsafe TPH usage
- Restrict TPH Device-Specific mode without standard ST table to be disabled by default
- Only permit enabling such mode and the GET_ST operation when the parameter is
  explicitly set by trusted users

Please help
review when available.

Thanks

On 4/17/2026 10:06 AM, fengchengwen wrote:
> Sorry for the self-reply.
>
> Hi Alex & Wathsala,
>
> Based on the VM assignment scenario and the cross-VM attack concern
> raised by Wathsala in her review of "[PATCH v2 RESEND 4/5] vfio/pci: Add PCIe TPH GET_ST interface":
>
>     This is unsafe. A user space driver can obtain STs for arbitrary CPUs and program them
>     into device-specific registers (e.g., E810), with no isolation guarantees.
>     For example, consider two VMs on the same host. A driver in VM1 could program STs that
>     target CPUs primarily used by VM2. This can steer traffic processing onto VM2's CPUs,
>     creating contention and degrading VM2's performance.
>     This breaks CPU isolation at the host level and can be used to disrupt workloads and violate
>     SLAs across tenants.
>
> After fully re-evaluating the security architecture with your feedback,
> I agree with your concerns and conclusions. I hereby revoke my earlier
> proposal to restrict GET_ST to only the current CPU.
>
> For devices that implement standard ST tables (via config space or MSI-X caps),
> hypervisors such as QEMU/kvmtool can trap and filter guest writes, preventing
> malicious steering tag abuse. This is the safe and supported model for TPH in
> virtualized environments.
>
> However, for devices that only support Device-Specific mode with no standard
> ST table, there is no existing hypervisor interception mechanism to prevent
> a guest from programming arbitrary steering tags to attack other CPUs.
> This is a fundamental security risk that cannot be safely mitigated in software.
>
> Therefore, the correct security posture for virtualization is:
> - TPH should be enabled *only* for devices with standard ST tables
> - Devices without standard ST tables should NOT enable TPH in virtualization
>
> On the other hand, in non-virtualization (bare-metal) scenarios, there is
> strong legitimate demand for devices that lack a standard ST table;
> many real-world devices are designed this way. For this reason, I would
> like to retain the GET_ST interface.
>
> For virtualization scenarios, the hypervisor is responsible for avoiding
> this risk by **disabling TPH Requester Enable in the PCIe config space**,
> which is fully interceptable and under the hypervisor’s control.
>
> Thanks
>
> On 4/17/2026 8:48 AM, fengchengwen wrote:
>> Hi Alex,
>>
>> Thank you very much for your clear and detailed security explanation.
>> I fully understand and agree with your security concerns about allowing
>> userspace to query steering tags for arbitrary CPUs.
>>
>> To completely resolve this security issue while retaining the mandatory
>> functionality for DS-mode devices without ST table, I will revise the
>> GET_ST interface with a strict security constraint in v3:
>>
>> The CPU number provided by userspace will be VALIDATED TO EQUAL
>> THE CURRENT CALLING CPU of the ioctl().
>>
>> In other words:
>> - Userspace can ONLY query the steering tag for the CPU it is currently
>>   running on.
>> - Userspace CANNOT query any other CPU.
>> - No cross-CPU probing, no side-channel, no attack surface.
>> - No ability to influence or target other CPUs.
>>
>> This completely eliminates the security exposure you mentioned, while
>> still fully supporting the Device-Specific mode requirement for devices
>> without ST tables.
>>
>> Thanks
>>
>> On 4/16/2026 9:40 PM, Alex Williamson wrote:
>>> On Thu, 16 Apr 2026 09:09:50 +0800
>>> fengchengwen wrote:
>>>
>>>> On 4/15/2026 9:55 PM, Wathsala Vithanage wrote:
>>>>> Hi Feng,
>>>>>
>>>>> get_st feature is unsafe. It allows a rogue userspace driver in device-specific
>>>>> mode to obtain steering tags for arbitrary CPUs, including ones unrelated
>>>>> to the device or its workload, enabling it to direct traffic into those CPUs’
>>>>> caches and potentially interfere with other workloads, opening doors to
>>>>> further exploits depending on other vulnerabilities.
>>>>
>>>> Thank you for the follow-up and for referencing the prior RFC
>>>> discussion on this topic. I appreciate you clarifying the
>>>> historical context of the safety concerns.
>>>>
>>>> I acknowledge the risks you’ve highlighted, but I believe the
>>>> risk profile in this VFIO interface is different and already
>>>> well bounded by existing design and practice:
>>>>
>>>> 1. VFIO device access requires elevated privileges
>>>>    A userspace process can only open a VFIO device node if it
>>>>    has sufficient privileges (typically root). This is not an
>>>>    interface for unprivileged users.
>>>
>>> This argument is NOT helping your cause. This is not the usage model
>>> we design for. VFIO usage requires that privileges be granted to a
>>> user, in the form of device ACL access and locked memory, but does not
>>> generally require elevated privileges beyond that, or otherwise grant
>>> the user authority beyond the scope of the device. The root use case
>>> may be typical for you, but is not required for many other typical use
>>> cases, such as device assignment to VMs.
>>>
>>>> 2. In the thread "[RFC v2 0/2] Retrieve tph from dmabuf for PCIe
>>>>    P2P memory access", applications can configure the steertag
>>>>    of exported dmabufs from userspace to the kernel. Kernel PCIe
>>>>    drivers (e.g., mlx5 NIC) then use these steertags and set them
>>>>    to their ST tables. Even here, userspace could set invalid
>>>>    steertags that impact GPU performance, but this model is
>>>>    basically accepted I think (see the mailing list discussion).
>>>
>>> It's an RFC. It's bold to claim that it's nearly accepted.
>>>
>>>> 3. Malicious resource consumption is not unique to TPH
>>>>    A malicious thread can be created to forcibly consume CPU
>>>>    resources and bound to a specific CPU, affecting other CPUs.
>>>>    This is a general system security concern, not one specific
>>>>    to TPH GET_ST, and is addressed by existing system hardening
>>>>    and access control mechanisms, not by removing useful features.
>>>
>>> You're conflating process abuse of a CPU with a potential side-channel
>>> DMA attack from a device. What *existing* hardening protects against
>>> the latter?
>>>
>>>> 4. GET_ST is strictly necessary for Device-Specific (DS) mode
>>>>    when no ST table is present on the device.
>>>>    For devices that do not have a dedicated ST table (a common
>>>>    scenario in many PCIe endpoints), DS mode requires userspace
>>>>    to retrieve per-CPU steering tags first, then program them
>>>>    into the device’s steering logic via other registers. Without
>>>>    GET_ST, userspace cannot obtain the required steertags to
>>>>    enable TPH DS mode at all, rendering TPH support useless for
>>>>    these devices. This is not an optional feature but a
>>>>    fundamental requirement to unlock TPH functionality for a
>>>>    large class of hardware.
>>>
>>> Unlocking a hardware feature does not give you authority to ignore the
>>> security implications of that feature. Thanks,
>>>
>>> Alex
>>
>
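
For readers skimming the thread, the v3 policy summarized in the top message
can be modelled roughly as follows. This is only an illustrative userspace
sketch, not code from the patch: struct tph_dev_model and both helper
functions are hypothetical names, and only enable_unsafe_tph_ds comes from the
thread. It assumes DS mode stays available when a standard ST table exists,
and that GET_ST is gated on the parameter for every device.

```c
#include <stdbool.h>

/* Hypothetical model of a device's TPH capabilities; not a kernel struct. */
struct tph_dev_model {
	bool has_std_st_table;	/* standard ST table (config space or MSI-X) */
};

/*
 * DS mode on a device with a standard ST table is treated as allowed
 * (assumption). DS mode without one is disabled by default and only
 * permitted when the administrator has set enable_unsafe_tph_ds.
 */
static bool tph_ds_mode_allowed(const struct tph_dev_model *dev,
				bool enable_unsafe_tph_ds)
{
	if (dev->has_std_st_table)
		return true;
	return enable_unsafe_tph_ds;
}

/* Assumption: GET_ST is refused unless enable_unsafe_tph_ds is set. */
static bool tph_get_st_allowed(bool enable_unsafe_tph_ds)
{
	return enable_unsafe_tph_ds;
}
```

With the parameter left at its default (false), a device without a standard ST
table gets neither DS mode nor GET_ST in this model, which matches the
"disabled by default" behaviour described for v3.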