From: <dan.j.williams@intel.com>
To: Jason Gunthorpe <jgg@nvidia.com>, <dan.j.williams@intel.com>
Cc: <linux-coco@lists.linux.dev>, <linux-pci@vger.kernel.org>,
<gregkh@linuxfoundation.org>, <bhelgaas@google.com>,
<yilun.xu@linux.intel.com>, <aneesh.kumar@kernel.org>,
<aik@amd.com>
Subject: Re: [PATCH 6/7] samples/devsec: Introduce a "Device Security TSM" sample driver
Date: Fri, 29 Aug 2025 13:00:09 -0700 [thread overview]
Message-ID: <68b206c9e3f43_75db100e4@dwillia2-mobl4.notmuch> (raw)
In-Reply-To: <20250829160217.GL7333@nvidia.com>
Jason Gunthorpe wrote:
> On Thu, Aug 28, 2025 at 02:38:14PM -0700, dan.j.williams@intel.com wrote:
> > > device_cc_probe() doesn't save anything, doesn't this just get into an
> > > endless loop of EPROBE_DEFER? Usually the kernel will retry these
> > > things during booting?
> >
> > Hmm, no, deferred probing retriggers after a one-time boot timeout
> > (extended by driver registration events) and after any device
> > successfully completes probe.
>
> So it is not "endless" but it is also not "single probe then wait till
> accept". I'm not keen on using this mechanism, I think the things
> people want to do in the T=0 mode are going to be time consuming and
> repeatedly doing that time consuming step is not a good idea.
It would only ever run multiple times if the driver is built-in or
loaded early, which is also mitigated by disabling autoprobe like you
have below. So that problem is manageable.
Now, I do think it is worth exploring a convention for "cc_prepare"
drivers, but as long as userspace is prepared to rebind post accept it
does not really matter if it got to that point by cc_prepare driver,
probe deferral of enlightened driver, or plain probe error plus retry.
> How about this instead:
>
> 1) Drivers compiled into the kernel are "safe" and can freely bind
> at will during early boot
Agree.
In the past this idea has been met with "but but typical distro kernels
have lots of built-in drivers that *may* be unsafe", and the answer is
"yes, a VM image with a CC aware / specific kernel config is a
requirement".
> 2) The first thing the initrd does is set
> /sys/bus/*/drivers_autoprobe to false.
> This stops all kernel driver autobinding.
Makes sense for the buses userspace is prepared to manage.
> Maybe we also need a small kernel change to allow userspace to make
> drivers_autoprobe false for all future busses too.
I do think we need a mechanism to say, "no more dynamic device
enumeration", but a coarse and future promise "no autoprobe of any bus"
I fear is going to have a long tail of problems especially with design
patterns like "faux_device" and "auxiliary_device".
As far as I understand, these CC environments do not immediately have
secrets to protect at launch. Also, not sure how many are ready to
validate the launch state of the TVM that early. I think it is more a
case of allow everything by default to start (whatever is in ACPI, and
T=0 PCI devices). Later the relying party either says "no, you have
enumerated devices that should not be there", or "yes, launch state
looks good, lock device topology, proceed with the performance
enhancement of converting some PCI TDISP devices to T=1 operation, here
are your secrets".
That post validate model saves us from a long tail of fixes for
subsystems that may be surprised by a new userspace acceptance loop. It
is userspace responsibility to validate device topology relative to
relying party expectations, and likely for the device topology to be
static for the duration of secrets deployment.
> 3) Userspace then evaluates what devices are present, checks its
> policy, loads modules and issues /sys/.../bind operations.
>
> We need to close the general security gap I gave earlier, userspace
> policy should be able to implement the statement:
The gap is present when secrets are deployed and if secrets are deployed
pre-accept the TCB is already broken.
> mlx5 is allowed to bind to a RUN device after measuring and
> verifying it, and never otherwise.
...and if userspace binds mlx5 pre-RUN that is not the kernel's problem.
I state that explicitly not for you, but because of the rejection of the
"device filter" in-kernel mechanism previously.
> a) For non-TDISP devices userspace checks if a driver is "trusted"
> before binding it, a fancier CC only deny list stored in the
> initrd.
Yes, necessary.
> b) For TDISP devices userspace runs through the
> prepare/measure/lock/run sequence then binds the final
> driver.
> c) Something something RAS driver restart
>
> Basically userspace policy is entirely in control if a device is
> "accepted" by the ccVM or not. The kernel won't auto bind
> a driver to a physical device. It would be driven off of
> uevents, I guess through new CC focused features in udev.
Yes, the only quibble is whether that "kernel won't bind" is more a
"userspace shall lock and validate device topology" at a certain point
in the boot flow. Userspace may need to be prepared for some unaccepted
devices to bind before that point.
> I think the needed kernel support is already here, the main gap I
> see is that the modules.alias does not include the driver names, it
> just has the module names. We ran into this with vfio (see below)
> so it would be nice to fix, though it can be worked around like
> VFIO did by making the driver name == module name.
Yes, I think that is reasonable. Multi-driver modules are not the norm.
The kernel problems to solve are "accepted" flag and maybe documenting
to driver writers / udev developers strategies to handle the "prepare"
problem.
> 4) Userspace sequences the special "prepare" pre-T=0 drivers, perhaps
> discovered through modinfo matching similar to VFIO:
>
> $ grep vfio /lib/modules/`uname -r`/modules.alias
> alias vfio_pci:v000015B3d0000101Esv*sd*bc*sc*i* mlx5_vfio_pci
> ^^^^^^
> PCI driver but special for VFIO usage. So I imagine a
> ccprepare_pci:... driver variation.
Novel!
> Userspace can inspect the modules.alias, find if the device's
> modalias has a ccprepare_pci: match and if so it will bind/unbind
> that driver before going to locked/run. When it reaches run it will
> find the pci: match and bind that driver which is the operating
> driver.
>
> Policy if the ccprepare device should even be permitted is also
> controlled by userspace.
>
> Userspace sequences all of this based on its policy to accept a
> device and push it to RUN, not the kernel, again probably
> through some new CC features in udev.
>
> The kernel side of this is a commit like cc6711b0bf36 ("PCI / VFIO:
> Add 'override_only' support for VFIO PCI sub system")
Yeah, that looks like a viable option for these complicated drivers.
For RAS I do still like the property of a driver that will field errors
also having everything it needs to take a device from reset back to the
ready-to-accept state. That can be solved later, and maybe the outcome
is "cc_prepare" is incompatible with "recovery".
> This is much less kernel change and gives the big thing CC needs -
> driver binding policy decisions in userspace.
>
> > > As we discussed in the prior chain we need to have a policy decision
> > > before auto-binding drivers at all in a CC environment, I don't see
> > > that in the code though the cover letter talked about it??
> >
> > The aim was for the "'struct device' has an acceptance flag" discussion
> > to settle before starting a "device-core policy for unaccepted devices"
> > discussion. I am ok to put more logs on the fire if there is an appetite
> > for that.
>
> Sure, I think you shold drop this patch from this series and have this
> series focus only on creating an accepted struct device environment
> that a driver can bind to and operate.
You mean drop the device_cc_probe() piece. The rest of patch is starting
the work of a "accepted struct device environment" with a single flag
that MMIO and DMA infrastructure can reference.
> This is a long journey already, once this basic support is landed we
> still need to do all the arch support to enable DMA/IOMMU/etc as many
> followup series.
>
> The questions about when and what drivers are probed can be left to a
> different series, at this point it will be usable for development but
> not secure like it should be.
Agree. Especially because attestation interfaces are not part of this
series yet.
> The device_cc_probe() type issue should be solved in yet another
> series, IMHO, and that should come with a really strong justification
> why the kernel needs to do anything at all, vs just rely on userspace
> as I outline above.
It is trivial for a driver to open code EPROBE_DEFER so
device_cc_probe() is not putting any burden on the kernel besides
documentation, but I will drop it for now.
> > I was hoping to put the onus of that on the vendors that think they need
> > this Enlightened driver path. The path of least resistance for device
> > vendors is design the hardware so that it can be locked without needing
> > a driver to take any configuration action ahead of time. Otherwise,
> > explain to users that they need to adjust/replace the eventual udev
> > sysfs script that does:
>
> So if we already imagine changing udev, lets imagine the above
> instead?
That's the whole discussion, what are the udev requirements relative to
the secrets deployment event.
next prev parent reply other threads:[~2025-08-29 20:00 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-08-27 3:52 [PATCH 0/7] PCI/TSM: TEE I/O infrastructure Dan Williams
2025-08-27 3:52 ` [PATCH 1/7] PCI/TSM: Add pci_tsm_{bind,unbind}() methods for instantiating TDIs Dan Williams
2025-09-02 0:12 ` Alexey Kardashevskiy
2025-09-02 15:04 ` Aneesh Kumar K.V
2025-09-02 15:05 ` Aneesh Kumar K.V
2025-09-03 15:17 ` Aneesh Kumar K.V
2025-09-04 10:38 ` Alexey Kardashevskiy
2025-09-04 12:56 ` Aneesh Kumar K.V
2025-09-05 2:32 ` Alexey Kardashevskiy
2025-08-27 3:52 ` [PATCH 2/7] PCI/TSM: Add pci_tsm_guest_req() for managing TDIs Dan Williams
2025-08-28 9:53 ` Alexey Kardashevskiy
2025-08-28 22:07 ` dan.j.williams
2025-08-29 2:21 ` Alexey Kardashevskiy
2025-08-30 2:37 ` dan.j.williams
2025-09-01 23:49 ` Alexey Kardashevskiy
2025-08-28 13:02 ` Aneesh Kumar K.V
2025-08-28 22:14 ` dan.j.williams
2025-08-27 3:52 ` [PATCH 3/7] device core: Introduce confidential device acceptance Dan Williams
2025-08-27 6:14 ` Greg KH
2025-08-28 20:07 ` dan.j.williams
2025-08-27 3:52 ` [PATCH 4/7] x86/ioremap, resource: Introduce IORES_DESC_ENCRYPTED for encrypted PCI MMIO Dan Williams
2025-08-27 3:52 ` [PATCH 5/7] PCI/TSM: Add Device Security (TVM Guest) operations support Dan Williams
2025-09-03 15:22 ` Aneesh Kumar K.V
2025-09-04 15:02 ` Aneesh Kumar K.V
2025-08-27 3:52 ` [PATCH 6/7] samples/devsec: Introduce a "Device Security TSM" sample driver Dan Williams
2025-08-27 12:39 ` Jason Gunthorpe
2025-08-27 23:47 ` Alexey Kardashevskiy
2025-08-28 21:38 ` dan.j.williams
2025-08-29 16:02 ` Jason Gunthorpe
2025-08-29 20:00 ` dan.j.williams [this message]
2025-08-29 23:34 ` Jason Gunthorpe
2025-08-27 3:52 ` [PATCH 7/7] tools/testing/devsec: Add a script to exercise samples/devsec/ Dan Williams
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=68b206c9e3f43_75db100e4@dwillia2-mobl4.notmuch \
--to=dan.j.williams@intel.com \
--cc=aik@amd.com \
--cc=aneesh.kumar@kernel.org \
--cc=bhelgaas@google.com \
--cc=gregkh@linuxfoundation.org \
--cc=jgg@nvidia.com \
--cc=linux-coco@lists.linux.dev \
--cc=linux-pci@vger.kernel.org \
--cc=yilun.xu@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).