linux-coco.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: Alexey Kardashevskiy <aik@amd.com>,
	Dan Williams <dan.j.williams@intel.com>, <kvm@vger.kernel.org>
Cc: <iommu@lists.linux.dev>, <linux-coco@lists.linux.dev>,
	<linux-pci@vger.kernel.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	<pratikrajesh.sampat@amd.com>, <michael.day@amd.com>,
	<david.kaplan@amd.com>, <dhaval.giani@amd.com>,
	Santosh Shukla <santosh.shukla@amd.com>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Michael Roth <michael.roth@amd.com>,
	"Alexander Graf" <agraf@suse.de>,
	Nikunj A Dadhania <nikunj@amd.com>,
	Vasant Hegde <vasant.hegde@amd.com>,
	Lukas Wunner <lukas@wunner.de>
Subject: Re: [RFC PATCH 00/21] Secure VFIO, TDISP, SEV TIO
Date: Thu, 29 Aug 2024 16:41:34 -0700	[thread overview]
Message-ID: <66d1072ea0590_31daf294e8@dwillia2-xfh.jf.intel.com.notmuch> (raw)
In-Reply-To: <6cd62b80-9a8d-4f01-a458-4466dac6d27f@amd.com>

Alexey Kardashevskiy wrote:
[..]
> >> - skipping various enforcements of non-SME or
> >> SWIOTLB in the guest;
> > 
> > Is this based on some concept of private vs shared mode devices?
> > 
> >> No mixed share+private DMA supported within the
> >> same IOMMU.
> > 
> > What does this mean? A device may not have mixed mappings (makes sense),
> 
> Currently devices do not have an idea about private host memory (but it 
> is being worked on afaik).

Worked on where? You mean the PCI core indicating that a device is
private or not? Is that not indicated by guest-side TSM connection
state?

> > or an IOMMU can not host devices that do not all agree on whether DMA is
> > private or shared?
> 
> The hardware allows that via hardware-assisted vIOMMU and I/O page 
> tables in the guest with C-bit takes into accound by the IOMMU but the 
> software support is missing right now. So for this initial drop, vTOM is 
> used for DMA - this thing says "everything below <addr> is private, 
> above <addr> - shared" so nothing needs to bother with the C-bit, and in 
> my exercise I set the <addr> to the allowed maximum.
> 
> So each IOMMUFD instance in the VM is either "all private mappings" or 
> "all shared". Could be half/half by moving that <addr> :)

I thought existing use cases assume that the CC-VM can trigger page
conversions at will without regard to a vTOM concept? It would be nice
to have that address-map separation arrangement, has not that ship
already sailed?

[..]
> > Would the device not just launch in "shared" mode until it is later
> > converted to private? I am missing the detail of why passing the device
> > on the command line requires that private memory be mapped early.
> 
> A sequencing problem.
> 
> QEMU "realizes" a VFIO device, it creates an iommufd instance which 
> creates a domain and writes to a DTE (a IOMMU descriptor for PCI BDFn). 
> And DTE is not updated after than. For secure stuff, DTE needs to be 
> slightly different. So right then I tell IOMMUFD that it will handle 
> private memory.
> 
> Then, the same VFIO "realize" handler maps the guest memory in iommufd. 
> I use the same flag (well, pointer to kvm) in the iommufd pinning code, 
> private memory is pinned and mapped (and related page state change 
> happens as the guest memory is made guest-owned in RMP).
> 
> QEMU goes to machine_reset() and calls "SNP LAUNCH UPDATE" (the actual 
> place changed recenly, huh) and the latter will measure the guest and 
> try making all guest memory private but it already happened => error.
> 
> I think I have to decouple the pinning and the IOMMU/DTE setting.
> 
> > That said, the implication that private device assignment requires
> > hotplug events is a useful property. This matches nicely with initial
> > thoughts that device conversion events are violent and might as well be
> > unplug/replug events to match all the assumptions around what needs to
> > be updated.
> 
> For the initial drop, I tell QEMU via "-device vfio-pci,x-tio=true" that 
> it is going to be private so there should be no massive conversion.

That's a SEV-TIO RFC-specific hack, or a proposal?

An approach that aligns more closely with the VFIO operational model,
where it maps and waits for guest faults / usages, is that QEMU would be
told that the device is "bind capable", because the host is not in a
position to assume how the guest will use the device. A "bind capable"
device operates in shared mode unless and until the guest triggers
private conversion.

> >> This requires the BME hack as MMIO and
> > 
> > Not sure what the "BME hack" is, I guess this is foreshadowing for later
> > in this story.
>  >
> >> BusMaster enable bits cannot be 0 after MMIO
> >> validation is done
> > 
> > It would be useful to call out what is a TDISP requirement, vs
> > device-specific DSM vs host-specific TSM requirement. In this case I
> > assume you are referring to PCI 6.2 11.2.6 where it notes that TDIs must
> 
> Oh there is 6.2 already.
> 
> > enter the TDISP ERROR state if BME is cleared after the device is
> > locked?
> > 
> > ...but this begs the question of whether it needs to be avoided outright
> 
> Well, besides a couple of avoidable places (like testing INTx support 
> which we know is not going to work on VFs anyway), a standard driver 
> enables MSE first (and the value for the command register does not have 
> 1 for BME) and only then BME. TBH I do not think writing BME=0 when 
> BME=0 already is "clearing" but my test device disagrees.

...but we should not be creating kernel policy around test devices. What
matters is real devices. Now, if it is likely that real / production
devices will go into the TDISP ERROR state by not coalescing MSE + BME
updates then we need a solution.

Given it is unlikely that TDISP support will be widespread any time soon
it is likely tenable to assume TDISP compatible drivers call a new:

   pci_enable(pdev, PCI_ENABLE_TARGET | PCI_ENABLE_INITIATOR);

...or something like that to coalesce command register writes.

Otherwise if that retrofit ends up being too much work or confusion then
the ROI of teaching the PCI core to recover this scenario needs to be
evaluated.

> > or handled as an error recovery case dependending on policy.
> 
> Avoding seems more straight forward unless we actually want enlightened 
> device drivers which want to examine the interface report before 
> enabling the device. Not sure.

If TDISP capable devices trends towards a handful of devices in the near
term then some driver fixups seems reasonable. Otherwise if every PCI
device driver Linux has ever seens needs to be ready for that device to
have a TDISP capable flavor then mitigating this in the PCI core makes
more sense than playing driver whack-a-mole.

> >> the guest OS booting process when this
> >> appens.
> >>
> >> SVSM could help addressing these (not
> >> implemented at the moment).
> > 
> > At first though avoiding SVSM entanglements where the kernel can be
> > enlightened shoud be the policy. I would only expect SVSM hacks to cover
> > for legacy OSes that will never be TDISP enlightened, but in that case
> > we are likely talking about fully unaware L2. Lets assume fully
> > enlightened L1 for now.
> 
> Well, I could also tweak OVMF to make necessary calls to the PSP and 
> hack QEMU to postpone the command register updates to get this going, 
> just a matter of ugliness.

Per above, the tradeoff should be in ROI, not ugliness. I don't see how
OVMF helps when devices might be being virtually hotplugged or reset.

  reply	other threads:[~2024-08-29 23:41 UTC|newest]

Thread overview: 128+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-08-23 13:21 [RFC PATCH 00/21] Secure VFIO, TDISP, SEV TIO Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 01/21] tsm-report: Rename module to reflect what it does Alexey Kardashevskiy
2024-08-23 22:17   ` Bjorn Helgaas
2024-08-28 13:49   ` Jonathan Cameron
2024-08-30  0:13   ` Dan Williams
2024-09-02  1:29     ` Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 02/21] pci/doe: Define protocol types and make those public Alexey Kardashevskiy
2024-08-23 22:18   ` Bjorn Helgaas
2024-08-30  2:15   ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 03/21] pci: Define TEE-IO bit in PCIe device capabilities Alexey Kardashevskiy
2024-08-23 22:19   ` Bjorn Helgaas
2024-08-28 13:54   ` Jonathan Cameron
2024-08-30  2:21   ` Dan Williams
2024-08-30  4:04     ` Alexey Kardashevskiy
2024-08-30 21:37       ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 04/21] PCI/IDE: Define Integrity and Data Encryption (IDE) extended capability Alexey Kardashevskiy
2024-08-23 22:28   ` Bjorn Helgaas
2024-08-28 14:24   ` Jonathan Cameron
2024-08-30  2:41   ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 05/21] crypto/ccp: Make some SEV helpers public Alexey Kardashevskiy
2024-08-30  2:45   ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 06/21] crypto: ccp: Enable SEV-TIO feature in the PSP when supported Alexey Kardashevskiy
2024-08-28 14:32   ` Jonathan Cameron
2024-09-03 21:27   ` Dan Williams
2024-09-05  2:29     ` Alexey Kardashevskiy
2024-09-05 17:40       ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 07/21] pci/tdisp: Introduce tsm module Alexey Kardashevskiy
2024-08-27 12:32   ` Jason Gunthorpe
2024-08-28  3:00     ` Alexey Kardashevskiy
2024-08-28 23:42       ` Jason Gunthorpe
2024-08-29  0:00         ` Dan Williams
2024-08-29  0:09           ` Jason Gunthorpe
2024-08-29  0:20             ` Dan Williams
2024-08-29 12:03               ` Jason Gunthorpe
2024-08-29  4:57         ` Alexey Kardashevskiy
2024-08-29 12:07           ` Jason Gunthorpe
2024-09-02  0:52             ` Alexey Kardashevskiy
2024-08-28 15:04   ` Jonathan Cameron
2024-09-02  6:50   ` Aneesh Kumar K.V
2024-09-02  7:26     ` Alexey Kardashevskiy
2024-09-03 23:51   ` Dan Williams
2024-09-04 11:13     ` Alexey Kardashevskiy
2024-09-04 23:28       ` Dan Williams
2024-08-23 13:21 ` [RFC PATCH 08/21] crypto/ccp: Implement SEV TIO firmware interface Alexey Kardashevskiy
2024-08-28 15:39   ` Jonathan Cameron
2024-08-23 13:21 ` [RFC PATCH 09/21] kvm: Export kvm_vm_set_mem_attributes Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 10/21] vfio: Export helper to get vfio_device from fd Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 11/21] KVM: SEV: Add TIO VMGEXIT and bind TDI Alexey Kardashevskiy
2024-08-29 10:08   ` Xu Yilun
2024-08-30  4:00     ` Alexey Kardashevskiy
2024-08-30  7:02       ` Xu Yilun
2024-09-02  1:24         ` Alexey Kardashevskiy
2024-09-13 13:50   ` Zhi Wang
2024-09-13 22:08     ` Dan Williams
2024-09-14  2:47       ` Tian, Kevin
2024-09-14  5:19         ` Zhi Wang
2024-09-18 10:45           ` Xu Yilun
2024-09-20  3:41             ` Tian, Kevin
2024-08-23 13:21 ` [RFC PATCH 12/21] KVM: IOMMUFD: MEMFD: Map private pages Alexey Kardashevskiy
2024-08-26  8:39   ` Tian, Kevin
2024-08-26 12:30     ` Jason Gunthorpe
2024-08-29  9:34       ` Xu Yilun
2024-08-29 12:15         ` Jason Gunthorpe
2024-08-30  3:47           ` Alexey Kardashevskiy
2024-08-30 12:35             ` Jason Gunthorpe
2024-09-02  1:09               ` Alexey Kardashevskiy
2024-09-02 23:52                 ` Jason Gunthorpe
2024-09-03  0:03                   ` Alexey Kardashevskiy
2024-09-03  0:37                     ` Jason Gunthorpe
2024-08-30  5:20           ` Xu Yilun
2024-08-30 12:36             ` Jason Gunthorpe
2024-09-03 20:34               ` Dan Williams
2024-09-04  0:02                 ` Jason Gunthorpe
2024-09-04  0:59                   ` Dan Williams
2024-09-05  8:29                     ` Tian, Kevin
2024-09-05 12:02                       ` Jason Gunthorpe
2024-09-05 12:07                         ` Tian, Kevin
2024-09-05 12:00                     ` Jason Gunthorpe
2024-09-05 12:17                       ` Tian, Kevin
2024-09-05 12:23                         ` Jason Gunthorpe
2024-09-05 20:53                           ` Dan Williams
2024-09-05 23:06                             ` Jason Gunthorpe
2024-09-06  2:46                               ` Tian, Kevin
2024-09-06 13:54                                 ` Jason Gunthorpe
2024-09-06  2:41                             ` Tian, Kevin
2024-08-27  2:27     ` Alexey Kardashevskiy
2024-08-27  2:31       ` Tian, Kevin
2024-09-15 21:07   ` Jason Gunthorpe
2024-09-20 21:10     ` Vishal Annapurve
2024-09-23  5:35       ` Tian, Kevin
2024-09-23  6:34         ` Vishal Annapurve
2024-09-23  8:24           ` Tian, Kevin
2024-09-23 16:02             ` Jason Gunthorpe
2024-09-23 23:52               ` Tian, Kevin
2024-09-24 12:07                 ` Jason Gunthorpe
2024-09-25  8:44                   ` Vishal Annapurve
2024-09-25 15:41                     ` Jason Gunthorpe
2024-09-23 20:53             ` Vishal Annapurve
2024-09-23 23:55               ` Tian, Kevin
2024-08-23 13:21 ` [RFC PATCH 13/21] KVM: X86: Handle private MMIO as shared Alexey Kardashevskiy
2024-08-30 16:57   ` Xu Yilun
2024-09-02  2:22     ` Alexey Kardashevskiy
2024-09-03  5:13       ` Xu Yilun
2024-09-06  3:31         ` Alexey Kardashevskiy
2024-09-09 10:07           ` Xu Yilun
2024-09-10  1:28             ` Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 14/21] RFC: iommu/iommufd/amd: Add IOMMU_HWPT_TRUSTED flag, tweak DTE's DomainID, IOTLB Alexey Kardashevskiy
2024-08-27 12:17   ` Jason Gunthorpe
2024-08-23 13:21 ` [RFC PATCH 15/21] coco/sev-guest: Allow multiple source files in the driver Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 16/21] coco/sev-guest: Make SEV-to-PSP request helpers public Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 17/21] coco/sev-guest: Implement the guest side of things Alexey Kardashevskiy
2024-08-28 15:54   ` Jonathan Cameron
2024-09-14  7:19   ` Zhi Wang
2024-09-16  1:18     ` Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 18/21] RFC: pci: Add BUS_NOTIFY_PCI_BUS_MASTER event Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 19/21] sev-guest: Stop changing encrypted page state for TDISP devices Alexey Kardashevskiy
2024-08-23 13:21 ` [RFC PATCH 20/21] pci: Allow encrypted MMIO mapping via sysfs Alexey Kardashevskiy
2024-08-23 22:37   ` Bjorn Helgaas
2024-09-02  8:22     ` Alexey Kardashevskiy
2024-09-03 21:46       ` Bjorn Helgaas
2024-08-23 13:21 ` [RFC PATCH 21/21] pci: Define pci_iomap_range_encrypted Alexey Kardashevskiy
2024-08-28 20:43 ` [RFC PATCH 00/21] Secure VFIO, TDISP, SEV TIO Dan Williams
2024-08-29 14:13   ` Alexey Kardashevskiy
2024-08-29 23:41     ` Dan Williams [this message]
2024-08-30  4:38       ` Alexey Kardashevskiy
2024-08-30 21:57         ` Dan Williams
2024-09-05  8:21     ` Tian, Kevin
2024-09-03 15:56 ` Sean Christopherson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=66d1072ea0590_31daf294e8@dwillia2-xfh.jf.intel.com.notmuch \
    --to=dan.j.williams@intel.com \
    --cc=agraf@suse.de \
    --cc=aik@amd.com \
    --cc=alex.williamson@redhat.com \
    --cc=david.kaplan@amd.com \
    --cc=dhaval.giani@amd.com \
    --cc=iommu@lists.linux.dev \
    --cc=kvm@vger.kernel.org \
    --cc=linux-coco@lists.linux.dev \
    --cc=linux-pci@vger.kernel.org \
    --cc=lukas@wunner.de \
    --cc=michael.day@amd.com \
    --cc=michael.roth@amd.com \
    --cc=nikunj@amd.com \
    --cc=pratikrajesh.sampat@amd.com \
    --cc=santosh.shukla@amd.com \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=thomas.lendacky@amd.com \
    --cc=vasant.hegde@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).